Value splitting
In a shell, when you write
$ A='foo bar' ; echo $A
the echo command is given two arguments, foo
and bar. The $A value has been split,
and the space between foo and bar acted as a
delimiter.
If you want to avoid splitting, you must write something like
$ A='foo bar' ; echo "$A"
The double quotes "protect" the spaces. Unfortunately, it is easy
to forget them and perform unwanted splits during script execution:
countless bugs happen because of the shell's splitting behaviour.
execline provides a splitting facility, with
several advantages over the shell's:
How it works
- Splitting always occurs together with a
substitution. A substitution
command can request that the substitution value be split.
- The splitting function
parses the value, looking for delimiters. It fills in a
structure that records the split points and the number n
of words the value is to be split into.
- The substitution rewrites the argv. A non-split value will
be written as one word in the argv; a split value will be written
as n separate words.
- The empty word, when split, always evaluates to one word:
itself.
- Substitution of split values is
performed recursively.
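The steps above can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not execline's actual code: the names mark_words and rewrite_argv are hypothetical, only whole words equal to $NAME are substituted, recursion is left out, and the handling of leading delimiters (silently skipped here) is an assumption that the text does not specify.

```python
DEFAULT_DELIMS = " \n\r\t"  # default delimiter string quoted in this document


def mark_words(value, delims=DEFAULT_DELIMS):
    """Return the (start, end) span of each word; n is the length of the list."""
    if value == "":
        return [(0, 0)]  # the empty word splits into one word: itself
    spans, start = [], None
    for i, ch in enumerate(value):
        if ch in delims:
            if start is not None:  # a delimiter closes the current word
                spans.append((start, i))
                start = None
        elif start is None:
            start = i  # first character of a new word
    if start is not None:
        spans.append((start, len(value)))
    return spans


def rewrite_argv(argv, name, value, split=False):
    """Replace every argv word equal to $name by the value, split or not."""
    out = []
    for word in argv:
        if word != "$" + name:
            out.append(word)  # untouched word
        elif split:
            # a split value is written as n separate words
            out.extend(value[a:b] for a, b in mark_words(value))
        else:
            out.append(value)  # a non-split value stays one word
    return out


print(rewrite_argv(["echo", "$A"], "A", "foo bar"))
print(rewrite_argv(["echo", "$A"], "A", "foo bar", split=True))
```

Without split, the rewritten argv has two words (echo and "foo bar"); with split=True it has three.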
Delimiters
Delimiters are characters that mark the end of a word and the start
of another. They are never included in the final words.
You can use any character (except the null character, which you cannot
use in execline scripts anyway) as a delimiter, by giving a string
consisting of all the delimiters you want as the argument to the
-d option used by substitution commands. By default, the
string " \n\r\t" is used, which means that the commands will
split a value if they encounter a space, newline, carriage return, or tab.
Crunching
What should the program do when it finds a run of consecutive
delimiters? By default, it crunches the delimiters:
it behaves as if there were only one delimiter. The string
"foo\n bar" will be split into 2 words: foo and
bar.
Up to version 1.03, one or more delimiters at the end of the string
caused an empty word to be appended to the list of words. As of
execline-1.04, trailing delimiters are simply removed.
Sometimes the crunching behaviour is not desirable: two consecutive delimiters
can mean that an empty word is present in between. In that case,
giving the -c option to the substitution command requests
a non-crunching split, where every delimiter counts.
The string "foo\n bar" will be split into 3 words:
foo, the empty word, and bar.
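The difference between the two modes can be illustrated with a small Python model. It is a sketch under assumptions, not execline's implementation: split_value is a hypothetical name, its delims and crunch parameters merely stand in for the -d and -c options, and the treatment of leading delimiters is a guess that the text does not cover.

```python
def split_value(value, delims=" \n\r\t", crunch=True):
    """Split value on any character of delims, crunching runs by default."""
    if value == "":
        return [""]  # the empty word splits into one word: itself
    words, current = [], ""
    for ch in value:
        if ch in delims:
            # Crunching: a delimiter only ends a non-empty word.
            # Non-crunching: every delimiter ends a word, even an empty one.
            if current or not crunch:
                words.append(current)
            current = ""
        else:
            current += ch
    if current:
        words.append(current)  # last word, when no delimiter follows it
    return words


print(split_value("foo\n bar"))                # crunching: 2 words
print(split_value("foo\n bar", crunch=False))  # non-crunching: 3 words
print(split_value("a::b", delims=":", crunch=False))
```

In crunching mode a trailing delimiter adds nothing to the list, which matches the behaviour described for execline-1.04.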
Decoding netstrings
Netstrings are
a way to reliably encode strings containing arbitrary characters.
execline takes advantage of this to offer a completely safe
splitting mechanism. If a substitution command is given an empty
delimiter string (by use of the -d "" option), the
splitting function will try to interpret the value as a sequence
of netstrings, every netstring representing a word. For instance,
in the following command line:
$ define -s -d "" A '1:a,2:bb,0:,7:xyz 123,1: ,' echo '$A'
the echo command will be given five arguments:
- the "a" string
- the "bb" string
- the empty string
- the "xyz 123" string
- the " " string (a single space)
However, if the value is not a valid sequence of netstrings, the
substitution command will die with an error message.
The dollarat command, for instance,
can produce a sequence of netstrings (encoding all the arguments
given to an execline script), meant to be decoded by a substitution
command with the -d "" option.
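Decoding a sequence of netstrings can be sketched in Python as follows. This is an illustrative model, not execline's code: decode_netstrings is a hypothetical name, a ValueError stands in for the substitution command dying with an error message, and returning a single empty word for an empty value is an assumption taken from the rule that the empty word always splits into itself. The value is handled as bytes because netstring lengths count bytes.

```python
def decode_netstrings(value):
    """Decode a bytes value made of netstrings (<length>:<data>,) into words."""
    if value == b"":
        return [b""]  # assumed: the empty word splits into one word, itself
    words, pos = [], 0
    while pos < len(value):
        colon = value.find(b":", pos)
        if colon == -1 or not value[pos:colon].isdigit():
            raise ValueError("invalid netstring length at offset %d" % pos)
        length = int(value[pos:colon])
        end = colon + 1 + length
        # The data must be followed by a comma; slicing also covers truncation.
        if value[end:end + 1] != b",":
            raise ValueError("missing comma at offset %d" % end)
        words.append(value[colon + 1:end])
        pos = end + 1
    return words


print(decode_netstrings(b"1:a,2:bb,0:,7:xyz 123,1: ,"))
```

Applied to the value from the define example above, the model yields the same five words, including the empty string and the single space.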