

2.2 Syntax

Onyx's syntax is very simple in comparison to most languages. The scanner and parser are implemented as a human-understandable finite state machine (nested C switch statements with a couple of auxiliary variables), which should give the reader an idea of the simplicity of the language syntax.

CRNL (carriage return, newline) pairs are, in all important cases, converted to newlines during scanning.

The characters ``#'', ``!'', ``,'', ``;'', ``:'', ``$'', ``~'', ``['', ``]'', ``{'', ``}'', ``('', ``)'', `` ` '', `` ' '', ``<'', and ``>'' are special. In most cases, any of the special characters and whitespace (space, tab, newline, formfeed, null) terminate any preceding token. All other characters including non-printing characters are considered regular characters.
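The termination rule can be modeled in a few lines of Python. This is only an illustrative sketch, not the Onyx scanner: the names `SPECIAL`, `WHITESPACE`, and `leading_regular_run` are hypothetical, and the sketch ignores the exceptions implied by ``in most cases''.

```python
# Illustrative model only; not taken from the Onyx scanner source.
SPECIAL = set("#!,;:$~[]{}()`'<>")      # the 17 special characters listed above
WHITESPACE = set(" \t\n\f\0")           # space, tab, newline, formfeed, null


def leading_regular_run(text):
    """Return the leading run of regular characters in text.

    The run ends at the first special or whitespace character, which is
    how a preceding token is terminated in the common case.
    """
    for i, ch in enumerate(text):
        if ch in SPECIAL or ch in WHITESPACE:
            return text[:i]
    return text
```

For instance, `leading_regular_run("add}")` stops at the ``}'' and yields `add`, while a token such as `-10@42` is returned whole because ``-'' and ``@'' are regular characters.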

A comment starts with a ``#'' character outside of a string context and extends to the next newline or formfeed.
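As a rough illustration of what ``outside of a string context'' means, the following Python sketch removes comments while leaving ``#'' characters inside strings alone. It is a simplified model rather than the actual scanner: `strip_comments` is a hypothetical name, and keeping the terminating newline or formfeed in the output is a choice made here, not something the text specifies.

```python
def strip_comments(source):
    """Remove #-comments that begin outside of a ` ... ' string."""
    out = []
    depth = 0            # nesting depth of balanced ` ... ' ticks
    in_comment = False
    i = 0
    while i < len(source):
        ch = source[i]
        if in_comment:
            if ch in "\n\f":         # a comment ends at newline or formfeed
                in_comment = False
                out.append(ch)       # assumption: keep the terminator
            i += 1
            continue
        if depth > 0:
            # Inside a string: an escaped tick or backslash does not nest.
            if ch == "\\" and i + 1 < len(source) and source[i + 1] in "`'\\":
                out.append(source[i:i + 2])
                i += 2
                continue
            if ch == "`":
                depth += 1
            elif ch == "'":
                depth -= 1
        elif ch == "`":
            depth = 1
        elif ch == "#":
            in_comment = True
            i += 1
            continue
        out.append(ch)
        i += 1
    return "".join(out)
```

With this model, `strip_comments("1 2 add # sum\n3")` keeps `1 2 add ` and the newline, and the ``#'' in `` `a # b' `` survives because it sits inside a string.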

Procedures are actually executable arrays, but Onyx provides special syntax for declaring them. Procedures are delimited by ``{'' and ``}'', and can be nested. Normally, the interpreter executes code as it is scanned, but inside procedure declarations, execution is deferred: the tokens of the procedure body are pushed onto the operand stack until the closing ``}'' is encountered, at which time an executable array is constructed from those tokens and pushed onto the operand stack.
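Deferred execution can be sketched with a toy interpreter in Python. Everything here is a hypothetical simplification: the `Proc` class stands in for an executable array, `_OPEN` is an invented sentinel marking an unmatched ``{'', deferred tokens are kept as plain strings, and the only executable operation modeled is `add`.

```python
class Proc(list):
    """Stand-in for an executable array (hypothetical)."""


_OPEN = object()   # hypothetical sentinel for an unmatched "{"


def run(tokens):
    """Execute tokens immediately, except inside { ... } where they are deferred."""
    ostack = []
    depth = 0
    for tok in tokens:
        if tok == "{":
            depth += 1
            ostack.append(_OPEN)
        elif tok == "}":
            if depth == 0:
                raise SyntaxError("unmatched }")
            body = []
            while ostack[-1] is not _OPEN:
                body.append(ostack.pop())
            ostack.pop()                  # discard the sentinel
            body.reverse()
            ostack.append(Proc(body))     # the constructed executable array
            depth -= 1
        elif depth > 0:
            ostack.append(tok)            # deferred: push the token unexecuted
        elif tok == "add":
            b = ostack.pop()
            a = ostack.pop()
            ostack.append(a + b)
        else:
            ostack.append(int(tok))
    return ostack
```

Running `run(["1", "2", "add"])` executes the addition as it is scanned, whereas `run(["{", "1", "2", "add", "}"])` leaves a single `Proc` holding the three unexecuted tokens. Nested braces produce a `Proc` inside a `Proc`.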

A partial grammar specification, using BNF notation where convenient, is as follows:

<program> ::=
<statement>

<statement> ::=
<procedure> <statement> | <object> <statement> | $\epsilon$

<procedure> ::=
{<statement>}

<object> ::=
<integer> | <real> | <name> | <string>

<integer> ::=
<dec_integer> | <radix_integer>

<real> ::=
<dec_real> | <exp_real>

<name> :
Any token that cannot be interpreted as a number or a string is interpreted as an executable name. There are seven syntaxes for names: executable, evaluable, callable, invokable, fetchable, literal, and immediately evaluated. Executable and evaluable names are looked up in the dictionary stack and executed (unless execution is deferred). Evaluable names behave the same as executable names, except when being processed by the bind operator. Callable, invokable, fetchable, and literal names are handled the same as for all other types; the special syntax for names with these attributes is merely a programming convenience. Immediately evaluated names are replaced by their values as defined in the dictionary stack, even if execution is deferred. Examples include:
foo     # executable
4noth3r # executable
!bar    # evaluable
:method # callable
;method # invokable
,data   # fetchable
$biz    # literal
~baz    # immediately evaluated

If the result of an immediately evaluated name is an executable array, the evaluable attribute is set for the array so that when the array is interpreted, it is executed. This allows immediate evaluation to be indiscriminately used without concern for whether the result is an executable array or, say, an executable operator.
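The seven name syntaxes differ only in their leading character, which the following Python sketch makes explicit. The function `classify_name` and the table `NAME_PREFIXES` are hypothetical names used for illustration. The sketch only classifies a token already known to be a name, and it does not model dictionary stack lookup or the bind operator.

```python
# Prefix character -> attribute, as listed in the examples above.
NAME_PREFIXES = {
    "!": "evaluable",
    ":": "callable",
    ";": "invokable",
    ",": "fetchable",
    "$": "literal",
    "~": "immediately evaluated",
}


def classify_name(token):
    """Return (attribute, bare_name) for a name token."""
    kind = NAME_PREFIXES.get(token[:1])
    if kind is None:
        return ("executable", token)   # no prefix: plain executable name
    return (kind, token[1:])
```

For example, `classify_name("$biz")` gives `("literal", "biz")`, and `classify_name("4noth3r")` gives `("executable", "4noth3r")`.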

<string> ::=
A string delimited by `` ` '' and `` ' ''. Ticks may be embedded in the string without escaping them, as long as the unescaped ticks are balanced. The following sequences have special meaning when escaped by a ``\'' character:
`                                 ` character.
'                                 ' character.
\                                 \ character.
0                                 Nul.
n                                 Newline.
r                                 Carriage return.
t                                 Tab.
b                                 Backspace.
f                                 Formfeed.
a                                 Alarm.
e                                 Escape.
x[0-9a-fA-F][0-9a-fA-F]           Hex encoding for a byte.
c[a-zA-Z]                         Control character.
\n (newline)                      Ignore.
\r\n (carriage return, newline)   Ignore.

``\'' has no special meaning unless followed by a character in the above list. This is especially convenient when specifying regular expressions.
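The escape rules can be approximated with a short Python decoder. This is a sketch under stated assumptions, not the Onyx implementation: `decode_escapes` is a hypothetical name, the string body is modeled as Python text rather than bytes, and the mapping used for `c[a-zA-Z]` (control-A is 1, control-B is 2, and so on) is an assumption, since the text does not define it.

```python
import re

# Single-character escapes and the two "ignored" line-continuation forms.
_SIMPLE = {
    "`": "`", "'": "'", "\\": "\\",
    "0": "\0", "n": "\n", "r": "\r", "t": "\t",
    "b": "\b", "f": "\f", "a": "\a", "e": "\x1b",
    "\n": "", "\r\n": "",
}

_ESCAPE = re.compile(r"\\(x[0-9a-fA-F]{2}|c[a-zA-Z]|\r\n|[`'\\0nrtbfae\n])")


def decode_escapes(body):
    """Decode the escapes listed above; any other backslash is left as-is."""
    def repl(match):
        seq = match.group(1)
        if seq[0] == "x":
            return chr(int(seq[1:], 16))           # hex encoding for a byte
        if seq[0] == "c":
            return chr(ord(seq[1].upper()) - 64)   # ASSUMPTION: \cA -> 0x01
        return _SIMPLE[seq]
    return _ESCAPE.sub(repl, body)
```

Because unlisted sequences are untouched, a regular expression such as `\d+\.\d*` passes through `decode_escapes` unchanged, which is the convenience the paragraph above describes.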

Examples include:

`'
`A string.'
`An embedded \n newline.'
`Another embedded
newline.'
`An ignored \
newline.'
`Balanced ` and ' are allowed.'
`Manually escaped \` tick.'
`Manually escaped \` tick and `balanced unescaped ticks'.'
`An actual \\ backslash.'
`Another actual \ backslash.'
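The balanced-tick rule shown in these examples can be sketched in Python as a depth counter. The function `scan_string` is a hypothetical helper, not part of Onyx. It returns the raw string body, with escapes still undecoded, together with the index just past the closing tick.

```python
def scan_string(text, start=0):
    """Scan a ` ... ' string beginning at text[start].

    Returns (raw_body, end_index). Unescaped ticks nest; escaped ticks
    and escaped backslashes do not affect the nesting depth.
    """
    if text[start] != "`":
        raise ValueError("string must start with a ` character")
    depth = 1
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in "`'\\":
            i += 2                    # skip the escaped character
            continue
        if ch == "`":
            depth += 1
        elif ch == "'":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated string")
```

Applied to `` `Balanced ` and ' are allowed.' ``, the inner pair raises the depth to 2 and back to 1, so only the final tick closes the string.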

<dec_integer> :
Signed decimal integer in the range $-2^{63}$ to $2^{63} - 1$. The sign is optional. Examples include:
0
42
-365
+17

<radix_integer> :
Signed integer with explicit base between 2 and 36, inclusive, in the range $-2^{63}$ to $2^{63} - 1$. Integer digits are composed of decimal numbers and lower or upper case letters. The sign is optional. Examples include:
2@101
16@ff
16@Ff
16@FF
-10@42
10@42
+10@42
9@18
35@7r3x
35@7R3x
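A minimal Python sketch of how such tokens could be interpreted follows. The name `parse_radix_integer` is hypothetical. Treating the base as a decimal number before the ``@'' is inferred from the examples, and returning `None` for a token that does not fit the description is a choice made for this sketch rather than documented Onyx behavior.

```python
import re
import string

_DIGITS = string.digits + string.ascii_lowercase    # digit values 0..35
_RADIX = re.compile(r"([+-]?)([0-9]+)@([0-9a-zA-Z]+)")


def parse_radix_integer(token):
    """Return the value of a base@digits token, or None if it is not one."""
    match = _RADIX.fullmatch(token)
    if match is None:
        return None
    sign, base_text, digits = match.groups()
    base = int(base_text)
    if not 2 <= base <= 36:
        return None
    value = 0
    for ch in digits.lower():          # letter case is not significant
        digit = _DIGITS.index(ch)
        if digit >= base:
            return None                # digit not valid in this base
        value = value * base + digit
    if sign == "-":
        value = -value
    if not -2**63 <= value <= 2**63 - 1:
        return None                    # outside the signed 64-bit range
    return value
```

Under this model `16@ff`, `16@Ff`, and `16@FF` all evaluate to 255, and `9@18` evaluates to 17.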

<dec_real> :
Double precision floating point number in decimal notation. At least one decimal digit and a decimal point are required. Examples include:
0.
.0
3.
.141
3.141
42.75
+3.50
-5.0

<exp_real> :
Floating point number in exponential notation. The format is the same as for <dec_real>, except that an exponent is appended. The exponent is composed of an ``e'' or ``E'', an optional sign, and a base 10 integer that is limited by the precision of the floating point format (approximately $-308$ to $307$). Examples include:
6.022e23
60.22e22
6.022e+23
1.661e-24
1.661E-24
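The decimal number formats described above can be expressed as regular expressions. The following Python sketch is one literal reading of the text, not the scanner's actual logic: `classify_number` is a hypothetical name, radix integers are left out, and a token such as `6e23`, which has no decimal point, is treated as not matching <exp_real> because the text says the format is the same as for <dec_real>.

```python
import re

_DEC_INTEGER = re.compile(r"[+-]?[0-9]+")
# At least one digit and a decimal point; the digit may be on either side.
_DEC_REAL = re.compile(r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+)")
_EXP_REAL = re.compile(_DEC_REAL.pattern + r"[eE][+-]?[0-9]+")


def classify_number(token):
    """Return 'dec_integer', 'dec_real', 'exp_real', or None."""
    if _DEC_INTEGER.fullmatch(token):
        return "dec_integer"
    if _DEC_REAL.fullmatch(token):
        return "dec_real"
    if _EXP_REAL.fullmatch(token):
        return "exp_real"
    return None
```

A token for which `classify_number` returns `None`, such as `4noth3r`, would fall through to the other rules, for example interpretation as a name.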

Arrays do not have explicit syntactic support, but the [ and ] operators support their construction. Examples of array construction include:

[]
[0 `A string' `Another string.' true]
[5
42
false]

Dictionaries do not have explicit syntactic support, but the < and > operators support their construction. Examples of dictionary construction include:

<>
<$answer 42 $question `Who knows' $translate {babelfish} >
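One way operators can build containers without dedicated syntax is the PostScript-style approach of collecting everything above a sentinel on the operand stack. The Python sketch below illustrates that idea for the array and dictionary examples. It is an assumption made for illustration, since the text does not say how the Onyx operators work internally, and `MARK`, `collect_to_mark`, `close_array`, and `close_dict` are all hypothetical names.

```python
MARK = object()   # ASSUMPTION: a sentinel pushed by the opening operator


def collect_to_mark(ostack):
    """Pop objects down to the nearest MARK and return them in push order."""
    items = []
    while ostack:
        top = ostack.pop()
        if top is MARK:
            items.reverse()
            return items
        items.append(top)
    raise ValueError("no mark on the stack")


def close_array(ostack):
    """Model of a closing array operator: replace mark..top with one list."""
    ostack.append(collect_to_mark(ostack))


def close_dict(ostack):
    """Model of a closing dictionary operator: pair up keys and values."""
    items = collect_to_mark(ostack)
    if len(items) % 2:
        raise ValueError("dictionary needs key/value pairs")
    ostack.append(dict(zip(items[0::2], items[1::2])))
```

In this model, a stack holding `MARK, 5, 42, False` becomes a stack holding the single list `[5, 42, False]` after `close_array`, and alternating keys and values become one dictionary after `close_dict`.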

Stacks do not have explicit syntactic support, but the ( and ) operators support their construction. Examples of stack construction include:

()
(1 2 mark `a')


Jason Evans 2005-03-16