by David Leonard, 2006
for SEE version 2.0
The impatient may want to jump straight to the §4.1 code example.
The Simple ECMAScript Engine ('SEE') is a parser and runtime library for the popular ECMAScript language. ECMAScript is the official name for what most people call JavaScript:
[ECMAScript] is based on several originating technologies, the most well known being JavaScript (Netscape) and JScript (Microsoft). The language was invented by Brendan Eich at Netscape and first appeared in that company's Navigator 2.0 browser. It has appeared in all subsequent browsers from Netscape and in all browsers from Microsoft starting with Internet Explorer 3.0. (ECMA-262 standard, 1999)
SEE fully complies with ECMAScript Edition 3 and with JavaScript 1.5. It has compatibility modes that allow it to run scripts developed under earlier versions of JavaScript, Microsoft's JScript, and LiveScript.
This documentation is intended for developers wishing to incorporate SEE into their applications. It explains how you can use SEE to:
This documentation does not explain the ECMAScript language, nor discuss how to build the library on your system.
SEE includes an example application, called see-shell, which allows interactive use of the interpreter and demonstrates how to write host function objects.
I will use the phrase host application to mean your application, or any application that uses the SEE runtime environment auxiliary to some primary purpose. Examples of host applications are web browsers and scriptable XML processors.
Throughout this documentation, references are made to the C functions and macros provided by the SEE library. To avoid definitional redundancy and to improve precision, the reader is encouraged to examine the SEE header files to find the precise definitions and arguments of each function or macro. Signatures for C macros are given, but you should understand that the compiler cannot normally typecheck your use of those macros.
Where literal C code is used, it is typeset in a monospace font, like this:
if (failed) { abort(); } /* comment */
Similarly, ECMAScript code is typeset in a sans serif font, like this:
window.location = "about:blank";
Important parts of example code are highlighted, and elided code is indicated with an ellipsis, like this: ...
Function definitions listed in the name index appear in green, like this: SEE_example()
The term ASCII refers to character codes in the decimal range 0 through 127, inclusive.
Compiling SEE requires an ANSI C compiler. Although the SEE library is essentially self-contained, it does depend on you (the host application developer) providing the following:
SEE uses scripts from GNU autoconf to determine if these
are available, and also to determine other system-dependent
properties.
Host applications should #include <see/see.h> to access all the macros and function prototypes.
(As a developer you may find the need to edit header files and configure scripts to make SEE compile on your system. I would be interested in hearing what changes were needed so that future releases can supply this automatically for other users. Please send mail to leonard@users.sourceforge.net.nospam.)
The first step in executing ECMAScript program text with SEE is to create an interpreter instance. Each interpreter instance represents an execution context and global variable space. When created, an interpreter is initialised with all the standard ECMAScript objects, such as Math and String. Modules may add other objects (see §7).
First, the host application should allocate storage for a struct SEE_interpreter and then call SEE_interpreter_init() to initialise that structure.
void SEE_interpreter_init(struct SEE_interpreter *interp);
A pointer to the initialised SEE_interpreter is required for almost every function that SEE provides. The pointer is conventionally named interp.
Here is an example where storage has been allocated on the stack, and consequently the interpreter exists only until the function returns.
void example(void)
{
    struct SEE_interpreter interp_storage;

    SEE_interpreter_init(&interp_storage);
    /* now the interpreter is ready */
}
There is no mechanism for explicitly destroying an initialised interpreter; instead, SEE relies on the garbage collector to reclaim all unreferenced storage (see §3).
SEE supports multiple, simultaneous, independent interpreter instances. This is useful, for example, in an HTML web browser application, where each window may need its own interpreter instance because the variables and bindings to built-in objects must be different and separate in each one.
SEE's functions are not thread-safe within the same interpreter, but multiple different interpreters can be safely used by different threads without collision. This is because global data structures used by SEE are marked immutable when the first interpreter is initialised. Interpreters remain completely independent of each other only if the application:
If SEE encounters an internal error (such as memory exhaustion, memory corruption, or a bug), it calls the global function pointer SEE_system.abort, passing it a pointer to the interpreter in context (or NULL), and a short descriptive message. The SEE_system.abort hook initially points to a wrapper function that simply calls the C library function abort(). You can set the hook early if you want to handle errors more gracefully. Its signature is:
extern struct {
    ...
    void (*abort)(struct SEE_interpreter *interp, const char *msg) _SEE_dead;
    ...
} SEE_system;
A convenience macro, SEE_ABORT(), is provided for applications to call. It calls the hook function.
extern void (*SEE_ABORT)(struct SEE_interpreter *interp, const char *msg);
SEE uses a garbage collecting memory allocator. SEE has global function pointers for memory allocation that the host application can configure. These hooks must be set up before any interpreter instances are created.
SEE manages memory by calling through the following function pointers stored in the global structure SEE_system. Your host application can replace these pointers before it creates any interpreters.
extern struct {
    ...
    void *(*malloc)(struct SEE_interpreter *interp, SEE_size_t size);
    void *(*malloc_finalize)(struct SEE_interpreter *interp, SEE_size_t size,
                             void (*)(struct SEE_interpreter *, void *, void *),
                             void *);
    void *(*malloc_string)(struct SEE_interpreter *interp, SEE_size_t size);
    void (*free)(struct SEE_interpreter *interp, void *ptr);
    void (*mem_exhausted)(struct SEE_interpreter *interp);
    void (*gcollect)(struct SEE_interpreter *interp);
    ...
} SEE_system;
These hooks are invoked through the following functions:
SEE_malloc() - allocates storage that is scanned during garbage collection
SEE_malloc_finalize() - allocates storage with a function that is called when it is unreachable
SEE_malloc_string() - allocates storage for a string that will not contain pointers
SEE_free() - releases storage that you can guarantee is unreferenced from anywhere
SEE_gcollect() - detects and releases all unreachable objects
void *SEE_malloc(struct SEE_interpreter *interp, SEE_size_t size);
void *SEE_malloc_finalize(struct SEE_interpreter *interp, SEE_size_t size,
                          void (*finalizefn)(struct SEE_interpreter *i, void *p, void *closure),
                          void *closure);
void *SEE_malloc_string(struct SEE_interpreter *interp, SEE_size_t size);
void SEE_free(struct SEE_interpreter *interp, void **datap);
void SEE_gcollect(struct SEE_interpreter *interp);
Notice that SEE_free() takes a pointer-to-a-pointer argument, unlike its counterpart in the SEE_system structure. The pointer will be set to NULL after freeing. Freeing an already-NULL pointer with SEE_free() has no effect.
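The pointer-clearing behaviour described above can be illustrated without SEE. The following is a minimal sketch, not SEE's implementation: free_and_clear() is a hypothetical helper built on the C library's free(), and it only mirrors the documented contract of SEE_free().

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of SEE): release *datap and then clear it,
 * so the caller is not left holding a dangling pointer.
 */
void free_and_clear(void **datap)
{
    assert(datap != NULL);      /* the address of the pointer is required */
    if (*datap != NULL) {
        free(*datap);           /* release the storage ... */
        *datap = NULL;          /* ... and make the stale pointer harmless */
    }
}
```

Because the helper receives the address of the caller's pointer, it can clear that pointer itself, and a second call on the same variable is a harmless no-op.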
If SEE was compiled with Boehm-gc support, SEE_system.malloc is initialised to point to a wrapper around the GC_malloc() function, SEE_system.malloc_string is initialised to point to a wrapper around GC_malloc_atomic(), SEE_system.free is initialised to point to a wrapper around GC_free(), and SEE_system.malloc_finalize is initialised to point to a wrapper around GC_malloc() and GC_register_finalizer(). Otherwise, the initial functions print a warning message and use the system malloc() without releasing any memory.
If you intend to hook in your own memory allocator, be aware that any of these hooks may be called with a NULL interpreter argument, which indicates an unknown context. In case of errors or resource exhaustion, the malloc hooks must not throw an exception, but should return NULL on failure. SEE will detect this situation and act accordingly.
Instead of calling the SEE_malloc functions directly, application code should use these convenient type-cast macros to allocate storage:
SEE_NEW() - allocate structure storage in the context of an interpreter, returning a pointer of given type
SEE_NEW_ARRAY() - allocate storage for an array of elements of the given type
SEE_NEW_STRING_ARRAY() - a gc-efficient form of SEE_NEW_ARRAY(), where you guarantee that the elements will not contain pointers
SEE_NEW_FINALIZE() - same as SEE_NEW(), but associates a finalizer function
SEE_ALLOCA() - allocate storage for an array on the stack (see alloca())
SEE_STRING_ALLOCA() - like SEE_ALLOCA(), except used to hint that the allocated elements will not contain pointers
T *SEE_NEW(struct SEE_interpreter *interp, type T);
T *SEE_NEW_ARRAY(struct SEE_interpreter *interp, type T, int length);
T *SEE_NEW_STRING_ARRAY(struct SEE_interpreter *interp, type T, int length);
T *SEE_NEW_FINALIZE(struct SEE_interpreter *interp, type T,
                    void (*finalizefn)(struct SEE_interpreter *, void *, void *),
                    void *closure);
T *SEE_ALLOCA(struct SEE_interpreter *interp, type T, int length);
T *SEE_STRING_ALLOCA(struct SEE_interpreter *interp, type T, int length);
A usage example is:
char *buffer = SEE_NEW_STRING_ARRAY(interp, char, 30);
The allocator macros and functions check for memory allocation failures, and in that instance will automatically call SEE_system.mem_exhausted(). This hook defaults to a function that simply calls SEE_ABORT() with a short message. Your application may prefer to change the mem_exhausted hook to handle this situation more gracefully.
It is worth familiarizing yourself with the macro definitions in <see/mem.h> to see what they do.
Why is SEE so dependent on a garbage collector? Why doesn't it use reference counting?
This subsection is a short diversion on answering this good question. I have asked myself the same thing about other applications that use garbage collectors. I'll justify SEE's reliance on a garbage collector with the following reasons:
malloc()
and free()
)
would have significantly increased the
complexity, development time, run-time performance and code size of the library.
This would in turn affect those properties of the host application.
There are various convincing documents that explain why a garbage collector
is a better general software engineering choice than (say) explicit
reference counting.
(See Advantages and Disadvantages of Conservative Garbage Collection.)
longjmp()
, because references on the stack would
become memory leaks or introduce too much fragile 'finally' code.
If you will be embedding SEE in a host application that uses non-GC malloc(), then any struct SEE_... pointers that you keep inside storage obtained by malloc() are likely to be unreliable. This is because garbage collectors do not normally look inside malloc'd memory. To make the GC aware of your reference to the SEE object, you will need either to arrange for the GC to scan your malloc'd memory (e.g. by adding it to its root set) or to use a level of pointer indirection through uncollectable GC memory.
For example, you might create an indirect reference by allocating storage with Boehm's GC_MALLOC_UNCOLLECTABLE() function for a structure like this:
struct myobjref {
    /* Always allocate this as GC uncollectable */
    struct SEE_interpreter *interp;
    struct SEE_object *object;
};
Then, you can safely keep a pointer to the myobjref in storage allocated by malloc(). It would still be your problem to eventually release the myobjref with a call to GC_FREE().
You may also find it convenient for SEE to manage the lifetime of your malloc'd host data. This normally happens when you create wrapped SEE objects (see §6.3) and include pointers into storage allocated with malloc(). To achieve this, use SEE_NEW_FINALIZE() to allocate the wrapped object structure, and write a finalizer function that calls free() on the right members.
Similar approaches can be used for external allocators that use reference counting.
Host objects that acquire operating system resources should release those resources when their finalizer is called by the garbage collector. However, a major criticism of garbage collectors is that finalizers are not invoked early or often enough. It is highly desirable to release system resources at the earliest possible time (that is, as soon as the referring object becomes unreachable), but garbage collectors deliberately don't do this, because reachability analysis is expensive and is often deferred until memory is low.
To address this problem, I recommend developers follow these guidelines when designing their host object finalizers:
dispose()
)
that immediately releases all resources acquired by the object.
The dispose method should cope with being called multiple times without error.
The object will need to maintain state indicating whether it is valid or
disposed, and its methods should check this state and generally throw
an exception if it is invalid.
You may also wish to provide the user with an accessor for this state,
e.g. isValid()
.
The state can usually be combined with other normally occurring error
states of the object.
hook the dispose method by having the finalizer call the method indirectly with SEE_OBJECT_CALL, and then once again, directly, in case the user's hook failed.
SEE_gcollect()
, and then
try just once more to acquire the resource before failing.
An example of this technique is shown in the mod_File.c example module
that comes with SEE.
The principal effects of these guidelines are to first shift the burden of optional, optimal reachability analysis onto the user, and secondly to couple memory exhaustion with resource exhaustion to exploit the benefits of late reachability analysis and avoid false resource loss when memory is plentiful.
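As a rough illustration of the dispose-method guideline and of a finalizer that falls back on it, here is a sketch using stand-in types. The names handle_wrapper, wrapper_acquire(), wrapper_dispose(), wrapper_is_valid(), wrapper_finalize() and open_handles are all hypothetical, and the finalizer takes a void * in place of struct SEE_interpreter * so that the example compiles without SEE. Hooking through SEE_OBJECT_CALL and the retry after SEE_gcollect() are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Number of currently acquired resources; stands in for a scarce OS resource. */
int open_handles = 0;

/* Hypothetical host object that owns one such resource. */
struct handle_wrapper {
    int open;                   /* 1 while valid, 0 once disposed */
};

void wrapper_acquire(struct handle_wrapper *w)
{
    assert(w != NULL);
    w->open = 1;
    open_handles++;
}

/* Release the resource immediately; safe to call any number of times. */
void wrapper_dispose(struct handle_wrapper *w)
{
    assert(w != NULL);
    if (w->open) {
        w->open = 0;
        open_handles--;
    }
}

/* State accessor, in the spirit of the isValid() suggestion. */
int wrapper_is_valid(const struct handle_wrapper *w)
{
    return w->open;
}

/*
 * Finalizer with the three-argument shape used by the finalizefn callbacks.
 * It simply disposes; if the user already did so, this is a no-op.
 */
void wrapper_finalize(void *interp, void *p, void *closure)
{
    (void)interp;
    (void)closure;
    wrapper_dispose((struct handle_wrapper *)p);
}
```

Keeping the valid/disposed state inside the object is what makes both the explicit dispose call and the late finalizer call safe in either order.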
SEE's ultimate purpose is to execute user scripts. A full script, or a self-contained fragment of a script is referred to as program text. You should execute program text using the following general strategy:
SEE_interpreter
(§2);
SEE_input
unicode stream reader
(§4.2)
to transport the
ECMAScript program text to SEE's parser;
SEE_Global_eval()
to parse and
evaluate the stream;
The SEE_Global_eval() function executes program text and then stores the value associated with the last executed statement in a location given by a value pointer. In a non-interactive environment, this last statement's value is usually meaningless, and the result pointer ('res') given to SEE_Global_eval() may safely be given as NULL.
void SEE_Global_eval(struct SEE_interpreter *interp, struct SEE_input *input, struct SEE_value *res);
The program text is first parsed and then immediately executed with this function. If the evaluated text contains function definitions, the function-objects created inside the interpreter will contain a 'precompiled' copy of the function text. This means it is safe to destroy the input immediately after it has been passed to SEE_Global_eval().
Although the rest of this document explains the library API in detail, a complete, but simple example of using the SEE interpreter follows:
#include <stdio.h>
#include <stdlib.h>
#include <see/see.h>

/* Simple example of using the interpreter */
int
main(void)
{
    struct SEE_interpreter interp_storage, *interp;
    struct SEE_input *input;
    SEE_try_context_t try_ctxt;
    struct SEE_value result;
    char *program_text = "Math.sqrt(3 + 4 * 7) + 9\n";

    /* Initialise an interpreter */
    SEE_interpreter_init(&interp_storage);
    interp = &interp_storage;

    /* Create an input stream that provides program text */
    input = SEE_input_utf8(interp, program_text);

    /* Establish an exception context */
    SEE_TRY(interp, try_ctxt) {
        /* Call the program evaluator */
        SEE_Global_eval(interp, input, &result);

        /* Print the result */
        if (SEE_VALUE_GET_TYPE(&result) == SEE_NUMBER)
            printf("The answer is %f\n", result.u.number);
        else
            printf("Unexpected answer\n");
    }

    /* Finally: */
    SEE_INPUT_CLOSE(input);

    /* Catch any exceptions */
    if (SEE_CAUGHT(try_ctxt))
        printf("Unexpected exception\n");

    exit(0);
}
When this program is compiled, linked against the SEE library and the garbage collector library, and run, it should respond with:
The answer is 14.567764
This works because the value of the last executed statement in the program_text is stored in result. Calling SEE_Global_eval() is essentially the same as using ECMAScript's built-in eval() function.
If you are interested in developing a provider module for SEE, then you should look at the example module file mod_File.c. See also §7.
SEE uses Unicode character stream sources known as 'inputs' to consume (scan and parse) ECMAScript program text. An input is a stream of 32-bit Unicode UCS-4 characters. The stream is read, one character at a time, through its 'get next character' callback function.
The SEE library provides some useful stream constructors. Each constructor creates a new SEE_input structure, initialised for reading the source it is supplied.
SEE_input_file() - streams from a stdio FILE pointer, and understands Unicode byte-order marks in that file
SEE_input_utf8() - streams the contents of a null-terminated char array, and assumes 7-bit ASCII or UTF-8 encoding
SEE_input_string() - streams the contents of a SEE_string value structure (which uses UTF-16 encoding, see §5.3)
struct SEE_input *SEE_input_file(struct SEE_interpreter *interp, FILE *f,
                                 const char *filename, const char *encoding);
struct SEE_input *SEE_input_utf8(struct SEE_interpreter *interp, const char *s);
struct SEE_input *SEE_input_string(struct SEE_interpreter *interp, struct SEE_string *s);
If these constructors do not adequately meet your needs, you are encouraged to develop your own. They're quite easy to do, if a bit fiddly. I recommend you find the source to one of the above and modify it to do what you want.
The rest of this section describes the input API in detail, with a view towards custom input streams.
Why streams instead of strings? SEE uses a stream API for inputs rather than (say) a simple UCS-4 or UTF-8 string API, because Unicode-compliant applications will usually have a much better understanding of the encodings they are using than will SEE. With only a small amount of effort, streams provide this flexibility while avoiding unnecessary duplication or text storage.
Inputs are described by SEE_input structures. These are functionally similar to stdio's FILE type, or Java's ByteReader classes, except that they stream fully-decoded Unicode characters. The SEE_input structure is the focus of the API: it maintains the input's stream state and provides a pointer to its access (callback) methods.
struct SEE_input {
    struct SEE_inputclass *inputclass;
    SEE_boolean_t eof;
    SEE_unicode_t lookahead;
    ...
};
struct SEE_inputclass {
    SEE_unicode_t (*next)(struct SEE_input *input);
    void (*close)(struct SEE_input *input);
};
The inputclass member indicates the access methods. It is a pointer to a SEE_inputclass structure. This class structure contains function pointers to the two methods next() and close().
The next() method should advance the input pointer, update the eof and lookahead members of the SEE_input structure, and return the old value of lookahead. SEE's scanner calls next() repeatedly, until the eof member becomes true. When eof is true, the value of lookahead becomes meaningless (but should be set to -1). Generally, the stream's constructor will internally call its next() function once initially, to 'prime' the lookahead field.
If the next() method encounters an encoding error, it should set lookahead to SEE_INPUT_BADCHAR and try to recover. It can throw an exception if it wants to, but SEE does not attempt to handle that: the application or user program will receive it.
If you don't particularly care about Unicode, it is helpful to know that 7-bit ASCII is a direct subset of Unicode, so you can just pass each of your ASCII chars as a 32-bit SEE_unicode_t masked with 0x7f. (See the Unicode standards.)
The close() method should deallocate any operating system resources acquired during the input stream's construction. By convention, SEE will not call the close() method of any application-supplied input. The onus is on the caller to close the inputs supplied to SEE library functions. For this reason, you should use the 'finally' behaviour described in §4.3 to clean up a possibly failed stream.
The SEE_input structure represents the current state of the input stream. Most importantly, the lookahead field must always reflect the next character that a call to next() would return. Once initialised, the filename, first_lineno and interpreter members of the SEE_input structure should not be changed. The lookahead and eof members should also be initialised before the structure is given to SEE.
You are encouraged to read the source code to the three constructors listed at the beginning of this section.
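The next()/lookahead/eof contract described in this section can be sketched with stand-in types. Everything named my_* or ascii_* below is hypothetical and merely mirrors the shape of SEE_input and SEE_inputclass so that the example compiles without SEE; a real custom input would use the SEE structures and also initialise the remaining members.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int my_unicode_t;          /* stand-in for SEE_unicode_t */

struct my_input;
struct my_inputclass {
    my_unicode_t (*next)(struct my_input *input);
    void (*close)(struct my_input *input);
};

struct my_input {
    const struct my_inputclass *inputclass;
    int eof;
    my_unicode_t lookahead;
    const char *cursor;                     /* private: next unread byte */
};

/* Return the old lookahead, then refill lookahead or raise eof. */
static my_unicode_t ascii_next(struct my_input *input)
{
    my_unicode_t old = input->lookahead;

    if (*input->cursor == '\0') {
        input->eof = 1;
        input->lookahead = (my_unicode_t)-1;    /* meaningless once eof is set */
    } else {
        input->lookahead = (my_unicode_t)(*input->cursor++ & 0x7f);
    }
    return old;
}

static void ascii_close(struct my_input *input)
{
    (void)input;                            /* no OS resources were acquired */
}

static const struct my_inputclass ascii_class = { ascii_next, ascii_close };

/* Constructor: set up the state, then call next() once to prime lookahead. */
void ascii_input_init(struct my_input *input, const char *s)
{
    assert(input != NULL && s != NULL);
    input->inputclass = &ascii_class;
    input->eof = 0;
    input->lookahead = 0;
    input->cursor = s;
    ascii_next(input);
}
```

A consumer would loop while eof is false, calling input->inputclass->next(input) for each character, and the code that created the input remains responsible for calling close().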
Consumers, like SEE's lexical analyser, will use these convenience macros to call input methods on a constructed input stream, rather than calling through the class structure directly:
SEE_INPUT_NEXT() - Consumes and returns the next Unicode character from the stream
SEE_INPUT_CLOSE() - Releases any resources obtained by the stream
SEE_unicode_t SEE_INPUT_NEXT(struct SEE_input *input);
void SEE_INPUT_CLOSE(struct SEE_input *input);
SEE's exceptions are implemented using C's setjmp()/longjmp() mechanism. SEE provides macros that establish a try-catch context, and test later if a try block terminated abnormally (i.e. due to a thrown exception). Typical code that uses try-catch looks like this:
struct SEE_interpreter *interp;
struct SEE_value *e;
SEE_try_context_t c;    /* storage for the try-catch context */
...
SEE_TRY(interp, c) {
    /*
     * Now inside a protected "try block".
     * The following calls may throw exceptions if they want,
     * causing the try block to exit immediately.
     */
    do_something();
    do_something_else();
    /*
     * Because the SEE_TRY macro expands into a 'for' loop,
     * avoid using 'break', or 'return' statements.
     * If you must leave the try block, use 'continue;',
     * or throw an exception.
     */
}
/* Code placed here always runs. */
do_cleanup();
if ((e = SEE_CAUGHT(c))) {
    /* Handle the thrown exception 'e', somehow. */
    handle_exception(e);
    /* or you can throw it up to the next try-catch like so: */
    SEE_THROW(interp, e);
}
...
Do not return, goto or break out of a try block; the macro does not check for this, and the try-catch context may not be restored properly, causing all sorts of havoc.
Exceptions thrown outside of any try-catch context will cause the interpreter to abort.
If you are not interested in catching exceptions, and only want the 'finally' behaviour, use the following idiom:
SEE_TRY(interp, c) {
    do_something();
}
do_finally();    /* optional */
SEE_DEFAULT_CATCH(interp, c);
The signatures of these macros are:
SEE_TRY(struct SEE_interpreter *interp, SEE_try_context_t ctxt) { stmt... }
struct SEE_object *SEE_CAUGHT(SEE_try_context_t ctxt);
void SEE_THROW(struct SEE_interpreter *interp, struct SEE_object *exception);
void SEE_DEFAULT_CATCH(struct SEE_interpreter *interp, SEE_try_context_t ctxt);
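To show why leaving a try block with break or return is dangerous, here is a small self-contained model of a for-loop-based try macro built on setjmp()/longjmp(). It sketches the general technique only: MY_TRY, MY_CAUGHT, my_throw(), try_demo() and struct try_ctx are hypothetical names, exceptions are plain strings, and SEE's real macros differ in detail.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

struct try_ctx {
    jmp_buf env;                /* where my_throw() jumps back to */
    struct try_ctx *prev;       /* enclosing try context, if any */
    const char *caught;         /* exception that ended the block, or NULL */
    int active;                 /* 1 until the block has been left */
};

struct try_ctx *try_top;        /* innermost active context */
const char *try_pending;        /* exception in flight during the longjmp */

void try_begin(struct try_ctx *c)
{
    c->prev = try_top;
    c->caught = NULL;
    c->active = 1;
    try_top = c;
}

/* Runs as the loop's increment step; pops the context at most once. */
void try_end(struct try_ctx *c)
{
    if (c->active) {
        c->active = 0;
        try_top = c->prev;
    }
}

void try_caught(struct try_ctx *c)
{
    c->caught = try_pending;
    try_end(c);
}

void my_throw(const char *exception)
{
    assert(try_top != NULL);    /* throwing outside any try block is fatal */
    try_pending = exception;
    longjmp(try_top->env, 1);
}

/* The body becomes the 'else' branch; 'continue' reaches try_end(), 'break' does not. */
#define MY_TRY(c)                                       \
    for (try_begin(&(c)); (c).active; try_end(&(c)))    \
        if (setjmp((c).env) != 0) try_caught(&(c)); else

#define MY_CAUGHT(c) ((c).caught)

/* Example: run a block that may throw, then always run the cleanup step. */
const char *try_demo(int should_throw, int *cleanup_ran)
{
    struct try_ctx c;

    MY_TRY(c) {
        if (should_throw)
            my_throw("boom");
    }
    *cleanup_ran = 1;           /* 'finally' code */
    return MY_CAUGHT(c);
}
```

In this model the context is popped only by the loop's increment step or by the catch path, so a break or return inside the body would leave try_top pointing at a dead stack frame, which is the kind of havoc the warning above refers to.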
An application can use SEE's periodic callback mechanism to check for timeouts, interrupts or GUI events. The periodic field in the SEE_system structure, if set to something other than NULL, is called in the following situations:
extern struct {
    /* ... */
    void (*periodic)(struct SEE_interpreter *);
    /* ... */
} SEE_system;
It is possible for a cfunction to block the current thread and prevent the periodic hook from being called by SEE. Alternatives to the periodic hook are:
⚠ Note: The periodic hook appeared in API 2.0.
Eventually, your host application will want to pass numbers, strings and complex value objects about, through the SEE interpreter, to and from the user code. This section describes the C interface to ECMAScript values.
The ECMAScript language has exactly six types of value. They are:
Undefined - the single value undefined
Null - the single value null
Boolean - the values true and false
Number
String
Object
The SEE_value structure can represent values of all of these types.
struct SEE_value {
    enum { ... } _type;
    union {
        SEE_boolean_t boolean;
        SEE_number_t number;
        struct SEE_string *string;
        struct SEE_object *object;
        ...
    } u;
};
The first member, _type, is the discriminator, and must be one of the enumerated values SEE_UNDEFINED, SEE_NULL, SEE_BOOLEAN, SEE_NUMBER, SEE_STRING or SEE_OBJECT. You should access the _type member using the SEE_VALUE_GET_TYPE() macro.
enum { ... } SEE_VALUE_GET_TYPE(struct SEE_value *value);
Depending on the type, you can directly access the corresponding value of a SEE_value. If the value variable is declared as:
struct SEE_value v;
then the value that it holds is directly accessed through its union member, v.u. The following table shows when the union fields of v.u are valid:
SEE_VALUE_GET_TYPE(&v) | Valid member | Member's type |
---|---|---|
SEE_UNDEFINED | n/a | n/a |
SEE_NULL | n/a | n/a |
SEE_BOOLEAN | v.u.boolean | SEE_boolean_t |
SEE_NUMBER | v.u.number | SEE_number_t |
SEE_STRING | v.u.string | struct SEE_string * |
SEE_OBJECT | v.u.object | struct SEE_object * |
Two other types (SEE_COMPLETION and SEE_REFERENCE) are only used internally to SEE and are not documented here.
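The rule in the table, to read only the union member that matches the discriminator, is the usual tagged-union idiom. The sketch below uses a reduced stand-in type so that it compiles without SEE; struct tagged_value, the V_* constants and value_format() are hypothetical, and with the real API the switch would be on SEE_VALUE_GET_TYPE(&v) and the SEE_* constants.

```c
#include <assert.h>
#include <stdio.h>

enum vtype { V_UNDEFINED, V_NULL, V_BOOLEAN, V_NUMBER };

/* Reduced stand-in for a discriminated value: a type tag plus a union. */
struct tagged_value {
    enum vtype type;
    union {
        int boolean;
        double number;
    } u;
};

/*
 * Format a value into buf, touching only the union member that is valid
 * for its type. Returns the number of characters written (as snprintf does).
 */
int value_format(const struct tagged_value *v, char *buf, size_t buflen)
{
    assert(v != NULL && buf != NULL);
    switch (v->type) {
    case V_UNDEFINED:
        return snprintf(buf, buflen, "undefined");  /* no member is valid */
    case V_NULL:
        return snprintf(buf, buflen, "null");       /* no member is valid */
    case V_BOOLEAN:
        return snprintf(buf, buflen, "%s", v->u.boolean ? "true" : "false");
    case V_NUMBER:
        return snprintf(buf, buflen, "%g", v->u.number);
    }
    return -1;                                      /* unknown type tag */
}
```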
To convert/coerce values into values of a different type, use the utility functions described in §5.1.
To create new values in struct SEE_value structures, use the following initialisation macros. They first set the _type field and then copy the second parameter into the appropriate union field. It is fine to use a local variable for a struct SEE_value, because the garbage collector can see what is being used from the stack.
void SEE_SET_UNDEFINED(struct SEE_value *val);
void SEE_SET_NULL(struct SEE_value *val);
void SEE_SET_OBJECT(struct SEE_value *val, struct SEE_object *obj);
void SEE_SET_STRING(struct SEE_value *val, struct SEE_string *str);
void SEE_SET_NUMBER(struct SEE_value *val, SEE_number_t num);
void SEE_SET_BOOLEAN(struct SEE_value *val, SEE_boolean_t bool);
Most SEE_values are passed about the SEE library functions using pointers. This is because the general contract is that the caller supplies storage for the return value, while other pointer arguments are treated as read-only. Conventionally, the result value pointer is provided as the last argument to these functions and is named res.
Avoid storing a struct SEE_value as a pointer. Instead, extract and copy values into storage using the following macro:
void SEE_VALUE_COPY(struct SEE_value *dst, struct SEE_value *src);
⚠ Note: The SEE_VALUE_COPY() macro breaks the convention of putting the result pointer last by instead following the better-known idiom of memcpy(), which places the destination first.
A simple pitfall to avoid when passing values to SEE functions is to use a single value as both a parameter to the function and as the return result storage. Do not do this. It is possible that the function will initialise its return storage before it accesses its parameters.
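The pitfall can be reproduced with a toy function that follows the same result-last convention. This is a sketch with stand-in names (struct num_value, to_doubled() and double_in_place() are hypothetical), showing one way a function might initialise its result before reading its argument, and how copying the argument first avoids the problem.

```c
#include <assert.h>
#include <stddef.h>

struct num_value {              /* simplified stand-in for a value structure */
    double number;
};

/* Result-last function that happens to initialise *res before reading *val. */
void to_doubled(const struct num_value *val, struct num_value *res)
{
    assert(val != NULL && res != NULL);
    res->number = 0.0;                  /* result storage is initialised early ... */
    res->number = val->number * 2.0;    /* ... so an aliased val now reads as 0 */
}

/* Safe way to update a value in place: copy the argument to a temporary first. */
void double_in_place(struct num_value *v)
{
    struct num_value tmp = *v;          /* plays the role of SEE_VALUE_COPY() */

    to_doubled(&tmp, v);
}
```

Calling to_doubled(&a, &a) with a.number equal to 21 leaves 0 in a rather than 42, whereas double_in_place(&a) gives 42.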
The ECMAScript language specification provides for conversion functions that the host application developer may find useful. They convert arbitrary values into values of a known type:
SEE_ToPrimitive() - Returns a non-object value. It calls the object's DefaultValue() method (see §6.3)
SEE_ToBoolean() - Returns a value of type SEE_BOOLEAN
SEE_ToNumber() - Returns a value of type SEE_NUMBER
SEE_ToInteger() - Returns a value of type SEE_NUMBER that is also a finite integer
SEE_ToString() - Returns a value of type SEE_STRING
SEE_ToObject() - Returns a value of type SEE_OBJECT using the String, Number and Boolean constructors
void SEE_ToPrimitive(struct SEE_interpreter *interp, struct SEE_value *val,
                     struct SEE_value *hint, struct SEE_value *res);
void SEE_ToBoolean(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToNumber(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToInteger(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToString(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
void SEE_ToObject(struct SEE_interpreter *interp, struct SEE_value *val, struct SEE_value *res);
See also the SEE_parse_args() function for a convenient way to extract C types from SEE values.
The undefined and null types have exactly one implied value each, namely undefined and null.
⚠ Note: ECMAScript's null is not an object type, and is not related to C's NULL constant.
Boolean types (SEE_boolean_t) have values of either true (non-zero) or false (zero).
Number values (SEE_number_t) are IEEE 754 signed floating point numbers, normally corresponding to the C compiler's built-in double type. The following macros may be used to find information about a number value. (They assume that the type is SEE_NUMBER):
SEE_NUMBER_ISNAN() - return true if the value represents an error condition (not a number)
SEE_NUMBER_ISPINF() - return true if the value is +∞
SEE_NUMBER_ISNINF() - return true if the value is -∞
SEE_NUMBER_ISINF() - return true if the value is ±∞
SEE_NUMBER_ISFINITE() - return true if the value is a finite number
int SEE_NUMBER_ISNAN(struct SEE_value *val);
int SEE_NUMBER_ISPINF(struct SEE_value *val);
int SEE_NUMBER_ISNINF(struct SEE_value *val);
int SEE_NUMBER_ISINF(struct SEE_value *val);
int SEE_NUMBER_ISFINITE(struct SEE_value *val);
SEE also provides constants SEE_Infinity and SEE_NaN which may be stored in number values, but should not be used to compare number values with C's == operator. Use the macros mentioned previously, instead.
const SEE_number_t SEE_Infinity;
const SEE_number_t SEE_NaN;
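The reason == is unsuitable is easiest to see for NaN, which under IEEE 754 compares unequal to every value, including itself. The sketch below uses only the standard <math.h> classification macros on plain doubles; the function names are hypothetical and are not SEE's macros.

```c
#include <assert.h>
#include <math.h>

/* Classification helpers built on the standard C99 macros. */
int is_nan(double d)  { return isnan(d) != 0; }
int is_pinf(double d) { return isinf(d) && d > 0; }
int is_ninf(double d) { return isinf(d) && d < 0; }

/* The tempting but wrong test: this is false for every input, NaN included. */
int equals_nan_naively(double d) { return d == NAN; }

/* The claims above, stated as assertions. */
void number_checks(void)
{
    assert(is_nan(NAN));
    assert(!equals_nan_naively(NAN));
    assert(is_pinf(INFINITY) && !is_ninf(INFINITY));
    assert(is_ninf(-INFINITY) && !is_pinf(-INFINITY));
    assert(!is_nan(1.0) && !is_pinf(1.0) && !is_ninf(1.0));
}
```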
Numbers (and other values) may be converted to integers using the functions SEE_ToInt32(), SEE_ToUint32() or SEE_ToUint16().
SEE_int32_t SEE_ToInt32(struct SEE_interpreter *interp, struct SEE_value *val);
SEE_uint32_t SEE_ToUint32(struct SEE_interpreter *interp, struct SEE_value *val);
SEE_uint16_t SEE_ToUint16(struct SEE_interpreter *interp, struct SEE_value *val);
SEE provides three data types for integers:
SEE_uint16_t - 16 bit unsigned integer
SEE_uint32_t - 32 bit unsigned integer
SEE_int32_t - 32 bit signed integer
String values are pointers to SEE_string structures that hold UTF-16 strings. The structure is defined something like this:
struct SEE_string {
    unsigned int length;
    SEE_char_t *data;
    ...
};
The useful members are:
length - Length of string content
data - Read-only storage for the string content (UTF-16 characters)
Be aware that other strings may come to share the string's data, such as by forming substrings. A string's content must not be modified after construction because of this risk. However, the length field of a string may be changed to a smaller value at any time without concern.
The SEE_char_t type represents a UTF-16 character in the string. It is equivalent to a 16-bit unsigned integer.
To manipulate a string, first create a new string using one of the following:
SEE_string_new() - create a new, empty string
SEE_string_dup() - create a new string with duplicate content
SEE_string_sprintf() - create a new string using printf-like arguments
SEE_string_vsprintf() - create a new string using vprintf-like arguments
struct SEE_string *SEE_string_new(struct SEE_interpreter *interp, unsigned int space);
struct SEE_string *SEE_string_dup(struct SEE_interpreter *interp, struct SEE_string *s);
struct SEE_string *SEE_string_sprintf(struct SEE_interpreter *interp, const char *fmt, ...);
struct SEE_string *SEE_string_vsprintf(struct SEE_interpreter *interp, const char *fmt, va_list ap);
And then, before passing your new string to any other function, append characters to it using the following:
SEE_string_addch()
- append a UTF-16 character
SEE_string_append()
- append contents of another string
SEE_string_append_ascii()
- convert 7bit ASCII to Unicode and append to another string
SEE_string_append_int()
- append a signed integer's
representation in base 10
void SEE_string_addch(struct SEE_string *s, SEE_char_t ch);
void SEE_string_append(struct SEE_string *s, const struct SEE_string *sffx);
void SEE_string_append_ascii(struct SEE_string *s, const char *ascii);
void SEE_string_append_int(struct SEE_string *s, int i);
Once a new string has been passed to any other SEE function, it should not
have its contents modified in any way.
Strings should not be shared
between different interpreters, unless internalised with
SEE_intern_global()
(see §5.3.1).
All strings in SEE use UTF-16 encoding, meaning that in some cases
you may need to be aware of Unicode 'surrogate' characters. If the host
application really needs UCS-4 strings (which are subtly different to UTF-16),
you will need to write your own converter function. Use the implementation of
SEE_input_string()
(§4.2) as
the basis for such a converter because it understands UTF-16 combiner codes.
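As a rough illustration of what such a converter involves, the sketch below decodes UTF-16 code units into UCS-4 code points using plain C types rather than SEE's headers. The names utf16_to_ucs4, u16 and u32 are hypothetical stand-ins (u16 plays the role of SEE_char_t), and copying unpaired surrogates through unchanged is an assumed policy, not a description of how SEE_input_string() behaves.

```c
#include <stddef.h>

/* Hypothetical stand-ins: u16 plays the role of SEE_char_t, u32 holds UCS-4 */
typedef unsigned short u16;
typedef unsigned long  u32;

/*
 * Convert len UTF-16 code units from src into UCS-4 code points in dst.
 * dst must have room for at least len elements (output is never longer).
 * Returns the number of code points written.
 * Assumed policy: an unpaired surrogate is copied through as-is.
 */
static size_t
utf16_to_ucs4(const u16 *src, size_t len, u32 *dst)
{
    size_t i = 0, n = 0;

    while (i < len) {
        u32 c = src[i++];

        /* A high surrogate followed by a low surrogate forms one code point */
        if (c >= 0xD800 && c <= 0xDBFF && i < len &&
            src[i] >= 0xDC00 && src[i] <= 0xDFFF)
        {
            c = 0x10000 + ((c - 0xD800) << 10) + (src[i] - 0xDC00);
            i++;
        }
        dst[n++] = c;
    }
    return n;
}
```

With a real SEE string, the data and length members would supply src and len.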
The functions
SEE_string_sprintf()
and SEE_string_vsprintf()
do not exactly have the same formats
as the standard printf()
function, although they are
substantially similar.
The following table lists the formats that are understood:
Format | Type | Comment |
---|---|---|
%[+][-][0][#]d | signed int | decimal |
%[+][-][0][#]u | unsigned int | decimal |
%[+][-][0][#]x | unsigned int | hexadecimal (base 16) |
%c | char | ASCII only |
%C | SEE_char_t | |
%[-][#][.#]s | const char * | ASCII only |
%[-][#][.#]S | struct SEE_string * | |
%[-][#]p | void * | |
%% | single % | |
% other | literal % other | |
⚠ Note:
Prior to API 2.0, SEE_string_sprintf() and SEE_string_vsprintf() used the system snprintf, forcing their output to 7-bit ASCII.
Where a hash (#) appears in the format column above, it means that either a positive integer in base 10 may be supplied to indicate padding or precision (e.g. %4d), or an asterisk (*) can be used instead to indicate that the next int argument provides the padding or precision value.
This follows the behaviour of printf().
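Since the padding rules follow printf(), the asterisk form can be tried out with the C library alone. The sketch below uses the standard snprintf(), not SEE_string_sprintf(); the helper name format_padded is hypothetical.

```c
#include <stdio.h>

/*
 * Right-align value in a field `width` characters wide.
 * The '*' in the format takes the width from the next int argument,
 * so "%*d" with width 4 behaves like "%4d".
 * Returns what snprintf() returns (the formatted length).
 */
static int
format_padded(char *buf, size_t size, int width, int value)
{
    return snprintf(buf, size, "%*d", width, value);
}
```

The same argument order (width first, then the value) would apply when passing '*' to the SEE formatting functions, assuming they mirror printf() here as the text states.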
Other string functions provided are:
SEE_string_substr()
- create a read-only substring
SEE_string_literal()
- create a copy of the string, escaping chars and
enclosing it in double quotes ("
)
SEE_string_fputs()
- write the string to the stdio file using UTF-8 encoding,
returns EOF
on error
SEE_string_toutf8()
- copy the string into a C string buffer, using UTF-8 encoding,
throwing an error if the buffer is too small.
Space is required for the trailing null character.
SEE_string_utf8_size()
- returns the size in bytes, excluding the trailing null
character, of a UTF-8 conversion of the given string.
SEE_string_concat()
- efficiently concatenate two strings, creating a new string
SEE_string_cmp()
- compares two strings, like strcmp()
struct SEE_string *SEE_string_substr(struct SEE_interpreter *interp, struct SEE_string *s, int index, int length);
struct SEE_string *SEE_string_literal(struct SEE_interpreter *interp, const struct SEE_string *s);
int SEE_string_fputs(const struct SEE_string *s, FILE *file);
void SEE_string_toutf8(struct SEE_interpreter *interp, char *buffer, SEE_size_t buffer_size, const struct SEE_string *s);
SEE_size_t SEE_string_utf8_size(struct SEE_interpreter *interp, const struct SEE_string *s);
struct SEE_string *SEE_string_concat(struct SEE_interpreter *interp, struct SEE_string *s1, struct SEE_string *s2);
int SEE_string_cmp(const struct SEE_string *s1, const struct SEE_string *s2);
⚠ Note:
The SEE_string_toutf8()
function does not check for
null characters in the output.
Consider using SEE_string_literal()
to make the
string well-formed before converting to UTF-8.
If you find yourself comparing strings a lot, you may find it easier to
compare internalised strings.
These are strings that are kept in a fast
hash table and may be compared equal using pointer equality.
The SEE_intern()
function returns an 'internalized' copy of the
given string and is very fast on already-interned strings.
It is worth using in lieu of SEE_string_cmp()
if the strings
are likely to be intern'ed already. (For example, all property names in
the standard library are.)
The function SEE_intern_ascii()
is a convenience function
that first converts the C string into a SEE_string
before
intern'ing. The C string must be an ASCII string terminated by a null
character.
struct SEE_string *SEE_intern(struct SEE_interpreter *interp, struct SEE_string *s);
struct SEE_string *SEE_intern_ascii(struct SEE_interpreter *interp, const char *s);
SEE supports statically initialised strings. If you have a large number of strings to create and use (e.g. properties and method names) over many interpreter instances, statically initialised strings can save space, and improve performance.
A statically initialised string, 'Hello, world
',
would look like this:
/* Example of a statically-initialised UTF-16 string */
static SEE_char_t hello_world_chars[12] = {
    'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd'
};
static struct SEE_string hello_world = {
    12,                 /* length */
    hello_world_chars   /* data */
};
The main problem with static strings is finding an elegant way to initialise the strings' content. There is no simple way in ANSI C to have the compiler convert common ASCII strings into UTF-16 arrays. The internal approach taken by SEE in supporting all the standard ECMAScript object property names, is to generate C program text from a file of ASCII strings during the build process.
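One way such a build step could work is sketched below. This is an illustration only, not SEE's actual generator: the function name emit_static_string is hypothetical, and it assumes the input is non-empty printable ASCII containing no quote or backslash characters.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical build-time helper: write C source text to `out` that
 * declares a static SEE_string called `name` holding the ASCII text.
 * Assumes `ascii` is non-empty and has no quote or backslash characters.
 * Returns 0 on success, -1 on a write error.
 */
static int
emit_static_string(FILE *out, const char *name, const char *ascii)
{
    size_t i, len = strlen(ascii);

    if (fprintf(out, "static SEE_char_t %s_chars[%lu] = {",
                name, (unsigned long)len) < 0)
        return -1;
    /* One character literal per UTF-16 element */
    for (i = 0; i < len; i++)
        if (fprintf(out, "%s'%c'", i ? ", " : " ", ascii[i]) < 0)
            return -1;
    if (fprintf(out, " };\nstatic struct SEE_string %s = { %lu, %s_chars };\n",
                name, (unsigned long)len, name) < 0)
        return -1;
    return 0;
}
```

A small generator program would call this once per line of its input file and write the results to a C file that is compiled into the host application.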
If an application wishes to internalise strings across interpreters,
it can add all its global strings into the global
intern table before creating any interpreters.
This is done by calling
SEE_intern_global()
for each string.
Doing this can save a moderate amount of overhead, and can
improve performance if the intern'ed string needs to be used often.
struct SEE_string * SEE_intern_global(const char *str);
⚠ Note:
Prior to API 2.0, SEE_intern_global()
had a
very different signature.
See §8.3.
ECMAScript uses a prototype-inheritance object model with simple named properties. More information on the object model can be found in the ECMA-262 standard, and in other JavaScript references.
This section describes how in-memory objects can be accessed and manipulated (the 'client interface'), and also how host applications can expose their own application objects and methods (the 'implementation interface').
Object instances are implemented as in-memory structures, with an
objectclass
pointer to a table of operational methods.
Object references are held inside values with a type field
of SEE_OBJECT
(see §5).
If you want to create a plain object quickly from C,
the convenience function
SEE_Object_new()
is the same as evaluating
new Object()
.
struct SEE_object *SEE_Object_new(struct SEE_interpreter *interp);
All object values are pointers to object instances.
The pointers are of type struct SEE_object *
.
No object pointer in a SEE_value
should ever point to
NULL
.
I find it convenient to work with struct SEE_object * pointer types directly, instead of using struct SEE_value, when I know that I am dealing with objects.
To use an object instance, you should interact with it using
the following internal method
macros:
SEE_OBJECT_GET()
- retrieve a named property or return undefined
('o.prop
');
also known as the [[Get]]
internal method
SEE_OBJECT_PUT()
- create/update a named property
('o.prop = val
');
also known as the [[Put]]
internal method
SEE_OBJECT_CANPUT()
- returns true if the property can be changed;
also known as the [[CanPut]]
internal method;
assumes prop
is an internalised string
SEE_OBJECT_HASPROPERTY()
- tests for existence of a property
('"prop" in o
');
also known as the [[HasProperty]]
internal method;
assumes prop
is an internalised string
SEE_OBJECT_DELETE()
- delete a property; returns true on success
('delete o.prop
');
also known as the [[Delete]]
internal method
SEE_OBJECT_DEFAULTVALUE()
- returns the string or number value associated with the object;
also known as the [[DefaultValue]]
internal method
SEE_OBJECT_CONSTRUCT()
†
- call object as a constructor
('new o(...)
');
also known as the [[Construct]]
internal method
SEE_OBJECT_CALL()
†
- call object as a function ('o(...)
');
also known as the [[Call]]
internal method
SEE_OBJECT_HASINSTANCE()
†
- return true if the objects are related
('x instanceof o
');
also known as the [[HasInstance]]
internal method
SEE_OBJECT_ENUMERATOR()
†
- create a property enumerator
('for (i in o) ...
')
SEE_OBJECT_GET_SEC_DOMAIN()
†
- returns the security domain associated with callable objects only
(see §9)
void SEE_OBJECT_GET(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop, struct SEE_value *res);
void SEE_OBJECT_PUT(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop, struct SEE_value *val, int flags);
int SEE_OBJECT_CANPUT(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop);
int SEE_OBJECT_HASPROPERTY(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop);
int SEE_OBJECT_DELETE(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_string *prop);
void SEE_OBJECT_DEFAULTVALUE(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_value *hint, struct SEE_value *res);
void SEE_OBJECT_CONSTRUCT(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_object *thisobj, int argc, struct SEE_value **argv, struct SEE_value *res);
void SEE_OBJECT_CALL(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_object *thisobj, int argc, struct SEE_value **argv, struct SEE_value *res);
int SEE_OBJECT_HASINSTANCE(struct SEE_interpreter *interp, struct SEE_object *obj, struct SEE_value *instance);
struct SEE_enum *SEE_OBJECT_ENUMERATOR(struct SEE_interpreter *interp, struct SEE_object *obj);
void *SEE_OBJECT_GET_SEC_DOMAIN(struct SEE_interpreter *interp, struct SEE_object *obj);
† Five of the macros above (CONSTRUCT, CALL, HASINSTANCE, ENUMERATOR and GET_SEC_DOMAIN) call optional internal methods, and do not check whether the object class has provided them.
This means the macros may try to call through a NULL function pointer, which will cause an error.
You can determine whether the object's class provides the optional methods by using the following macros before you use one of the five marked above.
These check macros return true if the method they check for is valid (i.e. they check that the function pointer in the object class is non-NULL):
SEE_OBJECT_HAS_CALL()
- object can be called with SEE_OBJECT_CALL()
SEE_OBJECT_HAS_CONSTRUCT()
- object can be called with SEE_OBJECT_CONSTRUCT()
SEE_OBJECT_HAS_HASINSTANCE()
- object can be called with SEE_OBJECT_HASINSTANCE()
SEE_OBJECT_HAS_ENUMERATOR()
- object can be called with SEE_OBJECT_ENUMERATOR()
SEE_OBJECT_HAS_GET_SEC_DOMAIN()
- object can be called with SEE_OBJECT_GET_SEC_DOMAIN()
int SEE_OBJECT_HAS_CALL(struct SEE_object *obj);
int SEE_OBJECT_HAS_CONSTRUCT(struct SEE_object *obj);
int SEE_OBJECT_HAS_HASINSTANCE(struct SEE_object *obj);
int SEE_OBJECT_HAS_ENUMERATOR(struct SEE_object *obj);
int SEE_OBJECT_HAS_GET_SEC_DOMAIN(struct SEE_object *obj);
Because it is frequently convenient to provide property names using
ASCII C strings instead of with struct SEE_string
pointers,
the following convenience macros are provided.
They are functionally identical to their counterparts, except that they
convert their property name argument with SEE_intern_ascii()
:
void SEE_OBJECT_GETA(struct SEE_interpreter *interp, struct SEE_object *obj, const char *ascii_prop, struct SEE_value *res);
void SEE_OBJECT_PUTA(struct SEE_interpreter *interp, struct SEE_object *obj, const char *ascii_prop, struct SEE_value *val, int flags);
int SEE_OBJECT_CANPUTA(struct SEE_interpreter *interp, struct SEE_object *obj, const char *ascii_prop);
int SEE_OBJECT_HASPROPERTYA(struct SEE_interpreter *interp, struct SEE_object *obj, const char *ascii_prop);
int SEE_OBJECT_DELETEA(struct SEE_interpreter *interp, struct SEE_object *obj, const char *ascii_prop);
⚠ Note:
The convenience macros SEE_OBJECT_*A
were introduced in API 2.0.
When storing properties in an object with SEE_OBJECT_PUT()
,
a flags
parameter is required.
In normal operation, this flag should be supplied as zero, but when populating
an object with its properties for the first time, the following bit
flags can be used:
Flag | Meaning |
---|---|
SEE_ATTR_READONLY | Future assignments (puts) on this property will fail |
SEE_ATTR_DONTENUM | Enumerators will not list this property and will hide inherited prototype properties of the same name until this property is deleted. (see §6.2) |
SEE_ATTR_DONTDELETE | Future deletes on this property will fail |
A property enumerator is a mechanism for discovering the properties that
an object contains. The language exercises this with its
for (var v in ...)
construct.
The results of the enumeration need not be sorted, nor even be in the same order each time.
Calling SEE_OBJECT_ENUMERATOR()
returns a
newly created enumerator which is a pointer to a
struct SEE_enum
.
Once obtained, the following macro can be used to access the enumerator:
SEE_ENUM_NEXT()
- return a pointer to a property name string, or NULL
when the properties have been exhausted.
struct SEE_string *SEE_ENUM_NEXT(struct SEE_interpreter *interp, struct SEE_enum *e, int *flags_return);
Enumerators can assume that the underlying object does not change during enumeration. A suggested strategy for a caller that does need to remove or add an object's properties while enumerating them is to first create a private list of its property names, ensuring that it has exhausted the enumerator before attempting to modify the object.
/* An example of enumerating properties on an object from C */
void
print_properties(struct SEE_interpreter *interp, struct SEE_object *obj)
{
    struct SEE_enum *enumerator;
    struct SEE_string *prop;

    /* Ignore objects that don't provide an enumerator */
    if (!SEE_OBJECT_HAS_ENUMERATOR(obj))
        return;

    enumerator = SEE_OBJECT_ENUMERATOR(interp, obj);
    while ((prop = SEE_ENUM_NEXT(interp, enumerator, NULL)) != NULL) {
        SEE_PrintString(interp, prop, stdout);
        printf("\n");
    }
}
When a host application wishes to expose its own 'host objects' to ECMAScript programs, it must use the object implementation API described in this section.
All SEE objects are in-memory structures starting with a
struct SEE_object
:
struct SEE_object {
    struct SEE_objectclass *objectclass;
    struct SEE_object *Prototype;
};
Normally, this structure is part of a larger structure that maintains the
object's private state. For example, native Number
objects could be implemented with the following:
struct number_object {          /* example implementation of Number */
    struct SEE_object object;
    SEE_number_t number;
};
Keeping the object
part at the top of the
number_object
structure means that pointers of type
struct number_object *
can be cast to and from pointers of type
struct SEE_object *
. This is a general idiom: begin all
host object structures with a field member of type
struct SEE_object
named object
.
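The idiom can be tried in isolation with stand-in types. In the sketch below, object_standin and objectclass_standin are hypothetical simplifications of the SEE structures shown above, and counter_object is an invented host object. Checking the class pointer before casting back is an assumed defensive convention, not something SEE requires.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct SEE_objectclass and struct SEE_object */
struct objectclass_standin { const char *Class; };
struct object_standin {
    struct objectclass_standin *objectclass;
    struct object_standin *Prototype;
};

static struct objectclass_standin counter_class = { "Counter" };

/* A host object: the generic object part comes first */
struct counter_object {
    struct object_standin object;   /* must be the first member */
    int count;                      /* private host state */
};

/* Upcast: a pointer to a struct also points at its first member */
static struct object_standin *
counter_as_object(struct counter_object *c)
{
    return &c->object;
}

/* Downcast: only valid when o really is the start of a counter_object,
 * so check the class pointer first and return NULL otherwise */
static struct counter_object *
object_as_counter(struct object_standin *o)
{
    if (o == NULL || o->objectclass != &counter_class)
        return NULL;
    return (struct counter_object *)o;
}
```

The downcast check is the same kind of test a method function would apply to thisobj before touching host-specific fields.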
Although the ECMAScript language does not use classes per se,
SEE's internal object implementation does use a class 'abstraction'
to speed up execution and make implementation re-use easier.
Each object has a field, object.objectclass
, that must
be initialised to point to a struct SEE_objectclass
that
provides the object's behaviour. The class structure looks like this:
struct SEE_objectclass {
    const char *            Class;          /* mandatory */
    SEE_get_fn_t            Get;            /* mandatory */
    SEE_put_fn_t            Put;            /* mandatory */
    SEE_boolean_fn_t        CanPut;         /* mandatory */
    SEE_boolean_fn_t        HasProperty;    /* mandatory */
    SEE_boolean_fn_t        Delete;         /* mandatory */
    SEE_default_fn_t        DefaultValue;   /* mandatory */
    SEE_enumerator_fn_t     enumerator;     /* optional */
    SEE_call_fn_t           Construct;      /* optional */
    SEE_call_fn_t           Call;           /* optional */
    SEE_hasinstance_fn_t    HasInstance;    /* optional */
    SEE_get_sec_domain_fn_t get_sec_domain; /* optional (API 2.0) */
};
⚠ Note:
The type of the Class
field was a
struct SEE_string *
in API 1.0.
The application generally provides this structure in static storage, as
most of its members are function pointers or strings known at compile time.
A member marked optional should be set to NULL
if it is
meaningless.
The object methods marked mandatory (Get, Put, etc.) must never be NULL, and should provide the precise behaviours that SEE expects on native objects.
These behaviours are fully described in the
ECMA-262 standard, and are summarised in the following table:
Field | Behaviour |
---|---|
Class | name of the class as revealed by toString() |
Get | retrieve a named property (or return undefined) |
Put | create/update a named property |
Delete | delete a property or return 0 |
HasProperty | returns 0 if the property doesn't exist |
CanPut | returns 0 if the property cannot be changed |
DefaultValue | turns the object into a string or number value |
enumerator | allow enumeration of the properties (see above) |
Construct | constructs a new object; as per the new keyword |
Call | the object has been called as a function |
HasInstance | returns 0 if the objects are unrelated |
get_sec_domain | returns the security domain associated with functions |
It is up to the host application to provide storage for the properties, and
so forth. The simplest strategy is to ignore property calls to
Put
and Get
that are meaningless.
To this end, if the host object does not want to expend effort
supporting some of the mandatory operations, it can use the
corresponding 'do-nothing' function(s) from this list:
SEE_no_get()
SEE_no_put()
SEE_no_canput()
SEE_no_hasproperty()
SEE_no_delete()
SEE_no_defaultvalue()
SEE_no_enumerator()
The Prototype field of an object instance can be set either to Object_prototype or to NULL, meaning no prototype.
If it is set to NULL, it is recommended you provide a toString() method (to help with debugging).
Once the host application has constructed its own objects that conform to the API, they can be inserted into the 'Global object' as object-valued properties.
The 'Global object' is an unnamed, top-level object whose sole purpose
is to 'hold' all the built-in objects, such as Object
,
Function
, Math
,
etc., as well as all user-declared global variables. The host
application can access it through the Global
member of the
SEE_interpreter
structure.
SEE provides support for a special kind of object class called native
objects. Native objects maintain a hash table of properties, and
implement the mandatory methods (plus enumerator
), and
correctly observe the Prototype
field.
struct SEE_native {
    struct SEE_object object;
    struct SEE_property *properties[SEE_NATIVE_HASHLEN];
};
An application can create host objects based on native objects.
First, place a struct SEE_native
at the beginning of a
structure:
struct some_host_object {
    struct SEE_native native;
    int host_specific_info;
};
Then, use the following object methods, either directly in the
SEE_objectclass
structure, or by calling them indirectly
from method implementations:
SEE_native_get()
SEE_native_put()
SEE_native_canput()
SEE_native_hasproperty()
SEE_native_delete()
SEE_native_defaultvalue()
SEE_native_enumerator()
It is very important that you initialize the native
field when constructing your host object.
Do this using the SEE_native_init()
function.
void SEE_native_init(struct SEE_native *obj, struct SEE_interpreter *i, const struct SEE_objectclass *obj_class, struct SEE_object *prototype);
The host application will likely want a C function to be able to be called directly from a user script. SEE supports this by wrapping C function pointers in 'cfunction' objects.
The convenience function SEE_cfunction_make()
constructs
an object whose
Prototype
field points to
Function.prototype
,
and whose objectclass
's Call
method points to a
given C function that contains the desired code.
The SEE_cfunction_make() function takes a pointer to the C function, a name string, and an integer indicating the expected number of arguments.
The integer becomes the function object's length property, which is advisory only.
struct SEE_object *SEE_cfunction_make(struct SEE_interpreter *interp, SEE_call_fn_t func, struct SEE_string *name, int argc);
⚠ Note:
Objects returned by SEE_cfunction_make()
should really only
be used in the interpreter context in which they were created, but the
current version of SEE does not check for this. (Because cfunction objects
are essentially read-only after construction, and if memory allocation
operates independently of the interpreters, sharing cfunction objects
across interpreters will be OK, but it is not recommended for future
portability.)
When attaching cfunctions to an object, you may find
the SEE_CFUNCTION_PUTA()
macro useful.
It performs both the SEE_cfunction_make()
and
SEE_OBJECT_PUT()
operations in one step.
Its signature is:
void SEE_CFUNCTION_PUTA(struct SEE_interpreter *interp, struct SEE_object *obj, const char *name, SEE_call_fn_t func, int length, int attr);
The C function must conform to the SEE_call_fn_t
signature.
This is demonstrated below, with math_sqrt()
, which is
the actual code behind the Math.sqrt
object:
#include <math.h>   /* for sqrt() */

/* Implementation of Math.sqrt() method */
static void
math_sqrt(interp, self, thisobj, argc, argv, res)
    struct SEE_interpreter *interp;
    struct SEE_object *self, *thisobj;
    int argc;
    struct SEE_value **argv, *res;
{
    struct SEE_value v;

    if (argc == 0)
        /* A missing argument is treated as undefined, which converts to NaN */
        SEE_SET_NUMBER(res, SEE_NaN);
    else {
        SEE_ToNumber(interp, argv[0], &v);
        SEE_SET_NUMBER(res, sqrt(v.u.number));
    }
}
The arguments to this function are described in the following table:
Argument | Purpose |
---|---|
interp | the current interpreter context |
self | a pointer to the object called (Math.sqrt here) |
thisobj | the this object (the Math object here) |
argc | number of arguments |
argv | array of value pointers, of length argc |
res | uninitialised value location in which to store the result |
A common convention in all ECMAScript functions is that unspecified
arguments should be treated as undefined
, and
extraneous arguments should just be ignored.
If the function uses thisobj
,
it should check any assumptions made about it, especially if it is expected
to be a host object.
This is because method functions can easily be attached to
other objects by user code.
When writing cfunctions, you can use the SEE_parse_args()
convenience function to make argument processing easier.
This function takes a format string and converts arguments according to
the table below.
It can throw a TypeError
exception if a
conversion error occurs.
void SEE_parse_args(struct SEE_interpreter *interp, int argc, struct SEE_value **argv, const char *fmt, ...);
Format | Parameter type | Conversion applied | Result when undefined |
---|---|---|---|
a | char ** | SEE_ToString(), then into an ASCII C string | "undefined" |
A | char ** | NULL if undefined, otherwise the same as format 'a' | NULL |
b | int * | SEE_ToBoolean() | 0 |
h | SEE_uint16_t * | SEE_ToUint16() | 0 |
i | SEE_int32_t * | SEE_ToInt32() | 0 |
n | SEE_number_t * | SEE_ToNumber() | SEE_NaN |
o | struct SEE_object ** | SEE_ToObject() | TypeError |
O | struct SEE_object ** | NULL if undefined or null, otherwise the same as format 'o' | NULL |
p | struct SEE_value * | SEE_ToPrimitive() | undefined |
s | struct SEE_string ** | SEE_ToString() | "undefined" |
u | SEE_uint32_t * | SEE_ToUint32() | 0 |
v | struct SEE_value * | the argument is copied without conversion | undefined |
x | | the argument is ignored | |
z | char ** | SEE_ToString(), then into a UTF8 C string | "undefined" |
Z | char ** | NULL if undefined, otherwise the same as format 'z' | NULL |
\| | | optional argument marker | |
. | | throws a TypeError on further arguments | |
space | | space character is ignored | |
The optional argument marker ('|
') disables
storing a result when the argument is
undefined
or not provided by the caller.
This allows storage to be initialised to default values.
The 'a
' and 'A
' formats will
throw a TypeError
if the string contains
non-ASCII characters.
The 'a
', 'A
',
'z
' and 'Z
' formats will throw
an error if the resulting string would contain a null character.
An example of using SEE_parse_args()
:
/* Possible implementation of Math.sqrt() method */
static void
math_sqrt_possible(interp, self, thisobj, argc, argv, res)
    struct SEE_interpreter *interp;
    struct SEE_object *self, *thisobj;
    int argc;
    struct SEE_value **argv, *res;
{
    SEE_number_t n;

    SEE_parse_args(interp, argc, argv, "n", &n);
    SEE_SET_NUMBER(res, sqrt(n));
}
Occasionally, a host application will wish to take some user text and
create a callable function object from it. An example of this problem is
in attaching the JavaScript code from HTML attributes onto form
elements of a web page.
One way to achieve this is to invoke the Function
constructor object with the
SEE_OBJECT_CONSTRUCT()
macro, passing it the formal arguments
text and body text as arguments.
(See the ECMAScript standard for details on the
Function
constructor.)
Another way, that is more convenient if the user text is available as
an input stream, is to use the SEE_Function_new()
function:
struct SEE_object *SEE_Function_new(struct SEE_interpreter *interp, struct SEE_string *name, struct SEE_input *param_input, struct SEE_input *body_input);
where any of the name, param_input and body_input parameters may be NULL (meaning that the empty string is used).
The returned function object may be called with the
SEE_OBJECT_CALL()
macro.
Host applications sometimes need to convey errors to ECMAScript programs.
Errors in ECMAScript are typically indicated by throwing an exception
with an object value. The thrown objects conventionally have
Error.prototype
somewhere in their prototype chain,
and provide message and name properties, which Error.prototype reads to generate a human-readable error message.
Host applications can conveniently construct and throw error exceptions using the following macros:
void SEE_error_throw(struct SEE_interpreter *interp, struct SEE_object *error_constructor, const char *fmt, ...);
void SEE_error_throw_string(struct SEE_interpreter *interp, struct SEE_object *error_constructor, struct SEE_string *string);
void SEE_error_throw_sys(struct SEE_interpreter *interp, struct SEE_object *error_constructor, const char *fmt, ...);
These convenience macros construct a new error object, and throw it as an
exception using SEE_THROW()
.
The object thrown is given a message
string property that reflects the rest of the arguments provided
to the called macro.
The SEE_error_throw_sys()
macro works like
SEE_error_throw()
but appends a textual
description of errno
using strerror()
.
The error_constructor
argument should be one of the error
constructor objects found in the SEE_interpreter
structure:
Member | Meaning |
---|---|
Error | runtime error |
EvalError | error in eval() |
RangeError | numeric argument has exceeded allowable range |
ReferenceError | invalid reference was detected |
SyntaxError | parsing error |
TypeError | actual type of an operand different to that expected |
URIError | error in a global URI handling function |
A simple example:
if (something_is_wrong)
    SEE_error_throw(interp, interp->Error, "something is wrong!");
Although Error
is usually sufficient for most errors,
host applications can create their own error constructor object with the
SEE_Error_make()
convenience function. Only one constructor
of the same name should be created per interpreter.
struct SEE_object *SEE_Error_make(struct SEE_interpreter *interp, struct SEE_string *name);
SEE provides a module abstraction for host implementations that want a structured approach to adding their objects into a SEE interpreter.
⚠ Note: The module abstraction was introduced in API 2.0.
A struct SEE_module
is a collection of functions
that are automatically called by SEE at various stages of each interpreter's
initialisation. The module may initialise and insert its own objects into
each interpreter before user scripts can be run.
struct SEE_module {
    SEE_uint32_t magic;
    const char *name;
    const char *version;
    unsigned int index;                     /* Set by SEE_module_add() */
    int (*mod_init)(void);
    void (*alloc)(struct SEE_interpreter *);
    void (*init)(struct SEE_interpreter *);
};
The magic
field must be initialised to the constant value
SEE_MODULE_MAGIC
.
The name
field is currently unused, but should consist
of a short, unique identifier corresponding to the name of the module.
The version
field is currently unused, and should be
set to NULL
.
The index
field is set by the SEE_module_add()
function as the module is added to SEE.
Each added module is given a unique index. Do not change the index.
The mod_init
function pointer is called immediately the module
is loaded (by SEE_module_add()
).
This is an opportunity for
a module to obtain pointers to globally interned strings.
(See SEE_intern_global()
in §5.3.2.)
The mod_init
function is expected to return zero to indicate
a successful initialisation.
This pointer may be set to NULL
if unneeded.
The alloc
function is called after built-in objects have
been allocated but before any other modules or built-in objects have been
initialised. It is dangerous to make use of the interpreter at this stage.
The main use of the alloc
function pointer is to allow
circular dependencies between modules.
For most modules, the alloc
pointer can be left as
NULL
.
The init
function is called after all built-in objects and
modules have been allocated, and after all built-in objects have been
initialised.
It is safe to make use of the interpreter built-ins at this stage, but
not to make use of other modules.
For most modules, the init
function is the place to
insert newly-created host objects into a pristine
interp->Global
.
A pointer to your module structure must be passed to
SEE_module_add()
before any interpreters are created.
It is not possible to dynamically add modules once interpreters have
been created because of the way the global intern table is managed.
(If your module does not modify the global intern table, and your system
is single-threaded, then you may be able to add the module dynamically.)
Finally, per-interpreter private storage for each module is
provided through the SEE_MODULE_PRIVATE()
macro.
This macro evaluates to a void *
lvalue that may be
assigned dynamic storage during alloc
.
const SEE_uint32_t SEE_MODULE_MAGIC;
int SEE_module_add(struct SEE_module *module);
void *SEE_MODULE_PRIVATE(struct SEE_interpreter *, struct SEE_module *);
The SEE_module_add()
function adds a module to the
global list of modules initialised whenever a new interpreter
is constructed. This function returns zero if the module was added
successfully. It returns -1
if an internal error occurred.
Otherwise it returns the same non-zero value that the module's
mod_init
function hook returned.
Once added, a module cannot be removed.
The interested reader is referred to the mod_File.c module example under the shell directory of the SEE source code.
SEE provides backward-compatibility with earlier versions of JavaScript and JScript. These features ought never be used, since JavaScript program authors should be mindful of standards. Nevertheless, this section documents the compatibility modes that SEE supplies.
The behaviour of the SEE library is modified on a per-interpreter basis,
by passing special flags to a variant of the interpreter's initialisation
routine, SEE_interpreter_init_compat()
. This function otherwise
behaves just like SEE_interpreter_init()
(see §2).
void SEE_interpreter_init_compat(struct SEE_interpreter *interp, int flags);
The flags
parameter is a bitwise OR of the constants
described in the following table.
⚠ Note: API 2.0 removed the SEE_COMPAT_UNDEFDEF flag and introduced the SEE_COMPAT_JSxx flags.
Flag | Behaviour
---|---
SEE_COMPAT_STRICT | This is not really a flag. It is defined as zero, and can be used when no compatibility flags are wanted. SEE will operate in its default ECMA compliance mode.
SEE_COMPAT_UTF_UNSAFE | Treats overlong UTF-8 encodings as valid Unicode characters. You should never need this.
SEE_COMPAT_262_3B | Enables optional features from Appendix B of ECMA-262 ed3.
SEE_COMPAT_SGMLCOM | This flag makes the lexical analyser stage treat the 4-character sequence '<!--' as if it were the '//' comment introducer. This is useful when parsing HTML SCRIPT elements.
SEE_COMPAT_JS11 | Enables JavaScript 1.1 compatibility.
SEE_COMPAT_JS12 | Enables JavaScript 1.2 compatibility.
SEE_COMPAT_JS13 | Enables JavaScript 1.3 compatibility.
SEE_COMPAT_JS14 | Enables JavaScript 1.4 compatibility.
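To illustrate how the flags argument might be assembled, here is a small sketch. The numeric values given to the constants are placeholders that exist only so the fragment compiles without the SEE headers (only SEE_COMPAT_STRICT is documented as zero); struct host_options and compat_flags_for() are hypothetical names of my own.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in values so that this sketch compiles without the SEE headers.
 * They are placeholders, NOT the library's real values; real code should
 * include the SEE headers and use the constants defined there.
 */
#ifndef SEE_COMPAT_STRICT
#define SEE_COMPAT_STRICT   0       /* documented above as zero */
#define SEE_COMPAT_262_3B   0x02    /* placeholder value */
#define SEE_COMPAT_SGMLCOM  0x04    /* placeholder value */
#endif

/* Hypothetical host settings that decide which flags are wanted. */
struct host_options {
    int allow_appendix_b;   /* non-zero: enable ECMA-262 Appendix B extras */
    int script_from_html;   /* non-zero: source came from an HTML SCRIPT element */
};

/* Build a flags value suitable for SEE_interpreter_init_compat(). */
int
compat_flags_for(const struct host_options *opts)
{
    int flags = SEE_COMPAT_STRICT;

    assert(opts != NULL);
    if (opts->allow_appendix_b)
        flags |= SEE_COMPAT_262_3B;
    if (opts->script_from_html)
        flags |= SEE_COMPAT_SGMLCOM;
    return flags;
}
```

The result would then be passed as the second argument of SEE_interpreter_init_compat().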
SEE now always optimises empty function calls by skipping the expensive process of extending the scope chain, creating an arguments property, etc., and simply synthesizing undefined for the call instead.
This was an optional optimisation prior to SEE 2.0, but is now always in effect.
As distributed, SEE has two different version numbers:
The library version is available to programs to query through the SEE_version() function.
This function returns a pointer to a static C string containing identifiers separated by a space character (0x20).
The first identifier is the name of the library (e.g. "see") and the second identifier is the package version number (e.g. 2.0).
Further identifiers indicate the features used when compiling the library.
This string is useful for end users to determine what capabilities their library implementation has.
const char *SEE_version(void);
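The string layout described above can be picked apart with ordinary C string functions. The following is a sketch only: version_field() is a hypothetical helper, not part of SEE, and it operates on any string with that layout so that it can be tried without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the n-th (0-based) space-separated identifier of a version string
 * into buf. Returns 1 on success, or 0 if there is no such identifier or
 * it does not fit in buf. (Hypothetical helper, not part of SEE.)
 */
int
version_field(const char *version, unsigned int n, char *buf, size_t buflen)
{
    const char *start = version;
    const char *end;
    size_t len;

    assert(version != NULL && buf != NULL);
    while (n > 0) {
        start = strchr(start, ' ');
        if (start == NULL)
            return 0;           /* fewer identifiers than requested */
        start++;                /* step past the separator */
        n--;
    }
    end = strchr(start, ' ');
    len = (end != NULL) ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= buflen)
        return 0;               /* empty field, or no room for the NUL */
    memcpy(buf, start, len);
    buf[len] = '\0';
    return 1;
}
```

For example, version_field(SEE_version(), 1, buf, sizeof buf) would leave the package version number in buf.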
The major and minor API version numbers indicate backward-incompatible and backward-compatible changes, respectively, to the API, i.e. the interface described in this documentation and the header files. The API version number is independent of the package and library version number.
In practice, developers can use the following code to flag compilation against a future version of SEE whose API is not backward compatible with this document.
#define DESIRED_SEE_API_MAJOR 2
/* #warning is a widely supported compiler extension (standard since C23) */
#if SEE_VERSION_API_MAJOR > DESIRED_SEE_API_MAJOR
#warning "SEE API major version mismatch"
#endif
The rules I use for versioning future APIs are:
This document will indicate at what API version new API elements are added, defaulting to 1.0.
const int SEE_VERSION_API_MAJOR;
const int SEE_VERSION_API_MINOR;
Applications written using SEE-1.3.1 (API 1.0) can be changed to compile against SEE-2.0 (API 2.0).
If you want, both APIs can be supported by testing the macro SEE_VERSION_API_MAJOR.
The following list indicates the differences and steps to take when porting API 1.0 code to API 2.0:
- The Class member of struct SEE_objectclass has changed type from struct SEE_string *Class to const char *Class. You should use a simple C constant string to name the class.
- void SEE_intern_global(struct SEE_string *) has been replaced with struct SEE_string *SEE_intern_global(const char *s). You can remove the static arrays of SEE_char_t, and use a simple C constant string. For example: s_Foo = SEE_intern_global("Foo");
- The trace callback function is called far less frequently and now takes an extra parameter of type enum SEE_trace_event. You may need to change the signature of functions you assign to the trace hook, and make use of the event information.
- The SEE_COMPAT_UNDEFDEF and SEE_COMPAT_ARRAYJOIN1 flags have been removed. Replace them with zero.
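One way to support both APIs from a single source file is a small compatibility shim, sketched below. It assumes SEE_VERSION_API_MAJOR is visible to the preprocessor (a stand-in definition is supplied so the fragment compiles without the SEE headers), and host_class_name_t and legacy_compat_flags() are hypothetical names of my own.

```c
#include <assert.h>

/* Stand-in so the sketch compiles without the SEE headers (assumption). */
#ifndef SEE_VERSION_API_MAJOR
#define SEE_VERSION_API_MAJOR 2
#endif

/*
 * API 2.0 removed these two flags, and the porting notes say to replace
 * them with zero. Defining them as zero keeps API 1.0 call sites compiling.
 */
#if SEE_VERSION_API_MAJOR >= 2
#ifndef SEE_COMPAT_UNDEFDEF
#define SEE_COMPAT_UNDEFDEF 0
#endif
#ifndef SEE_COMPAT_ARRAYJOIN1
#define SEE_COMPAT_ARRAYJOIN1 0
#endif
#endif

/* The type of the Class member differs between the two APIs. */
#if SEE_VERSION_API_MAJOR >= 2
typedef const char *host_class_name_t;          /* API 2.0: C string */
#else
typedef struct SEE_string *host_class_name_t;   /* API 1.0: SEE string */
#endif

/* Flags an API 1.0 host might have passed to the init routine. */
int
legacy_compat_flags(void)
{
    int flags = SEE_COMPAT_UNDEFDEF | SEE_COMPAT_ARRAYJOIN1;

#if SEE_VERSION_API_MAJOR >= 2
    assert(flags == 0);     /* removed flags contribute nothing under API 2.0 */
#endif
    return flags;
}
```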
The SEE library provides a simple framework for the host application to manage security contexts for scripts and host functions.
⚠ Note: The SEE_interpreter.sec_domain, SEE_objectclass.get_sec_domain, and SEE_system.transit_sec_domain fields were introduced in API 2.0.
The host application manages the 'current' security domain by setting the interpreter's sec_domain field.
During execution, when callable objects are created, they inherit the value of the interpreter's sec_domain field.
When an object is called, the host application is given the opportunity of changing the current security domain.
Just before an object is called (either through the SEE_OBJECT_CALL() or SEE_OBJECT_CONSTRUCT() macro), SEE takes the following steps:
1. If SEE_system.transit_sec_domain is NULL, then no security domain modification occurs, and execution continues; otherwise
2. if the object has no get_sec_domain() method, then no modification occurs, and execution continues; otherwise
3. the SEE_OBJECT_GET_SEC_DOMAIN() macro is called to obtain the function's security domain.
4. That domain is passed to the SEE_system.transit_sec_domain() function, which is expected to change the sec_domain field of the interpreter if needed.
5. After the call, the previous sec_domain is restored to the interpreter before the return value or exception is propagated.
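The steps above can be modelled in a few lines of plain C. This is a toy model for illustration, not SEE's implementation: every toy_ name is a stand-in of my own, only the fields discussed above are represented, and exception propagation is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for SEE's structures; only the fields discussed above. */
struct toy_interp;
struct toy_object {
    void *(*get_sec_domain)(struct toy_interp *, struct toy_object *);
    void (*call)(struct toy_interp *, struct toy_object *);
    void *domain;               /* domain recorded when the object was made */
};
struct toy_interp {
    void *sec_domain;           /* the 'current' security domain */
    void *seen_during_call;     /* records what the callee observed */
};

/* Plays the role of SEE_system.transit_sec_domain; NULL disables transit. */
void (*toy_transit_sec_domain)(struct toy_interp *, void *) = NULL;

/* Models the steps taken around a call. */
void
toy_object_call(struct toy_interp *interp, struct toy_object *obj)
{
    void *saved = interp->sec_domain;

    assert(obj->call != NULL);
    if (toy_transit_sec_domain != NULL && obj->get_sec_domain != NULL) {
        void *callee_domain = obj->get_sec_domain(interp, obj);
        toy_transit_sec_domain(interp, callee_domain);
    }
    obj->call(interp, obj);
    interp->sec_domain = saved; /* restored before the result propagates */
}

/* Example hooks used to exercise the model. */
void *
toy_get_domain(struct toy_interp *interp, struct toy_object *obj)
{
    (void)interp;
    return obj->domain;
}

void
toy_switch_to_callee(struct toy_interp *interp, void *callee_domain)
{
    interp->sec_domain = callee_domain;
}

void
toy_record_domain(struct toy_interp *interp, struct toy_object *obj)
{
    (void)obj;
    interp->seen_during_call = interp->sec_domain;
}
```

With no transit hook installed, or with an object that has no get_sec_domain method, the callee runs in the caller's domain; in every case the caller's domain is back in place once the call returns.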
The interpreter's sec_domain field is initially set to NULL by SEE_interpreter_init().
Consequently, all built-in SEE function objects constructed during initialisation will have a security domain of NULL.
This also applies to modules, unless they explicitly change the current security context during their init() handler.
The following functions (amongst others) are sensitive to the interpreter's sec_domain field and will record it in any callable objects they produce: SEE_Global_eval(), SEE_Function_new(), SEE_cfunction_make().
Apart from the SEE_OBJECT_CALL() and SEE_OBJECT_CONSTRUCT() macros, which only restore the sec_domain, no other function in the SEE library changes the interpreter's sec_domain value.
⚠ Note: If the interpreter's sec_domain field is somehow changed without restoration during an inner function, it will eventually be restored to its original value as the function returns.
This is simply a consequence of a caller invoking SEE_OBJECT_CALL() or SEE_OBJECT_CONSTRUCT().
You should not rely on this side-effect.
If you plan to make use of SEE's security framework for foreign code, I recommend the principals model used by both Java and Mozilla, which you can apply by following these guidelines:
NULL
pointer.
all principals matching foo@bar.com are granted FileRead permission on resources matching /public/*
Provide a function checkPermission(interp, permission, resource) that searches the authorization statements and throws an exception if the current domain (the principal set stored in interp->sec_domain) does not have the required permission.
Ensure this function is called in the sensitive places of your code.
Implement SEE_system.transit_sec_domain() so that it efficiently computes the intersection of the current and the callee's security domain and sets it to be the current domain.
Note that SEE's built-in functions will have a NULL domain and can be treated as completely trusted; that is, NULL makes no change to the current security domain.
Before calling SEE_Global_eval(), SEE_Function_new(), or even SEE_cfunction_make(), first set the interpreter's sec_domain field to reflect exactly the principal that controls the source.
Be mindful to restore the security domain when you have finished, especially during exceptions, like so:
void
eval_in_domain(interp, input, input_sec_domain, result)
    struct SEE_interpreter *interp;
    struct SEE_input *input;
    void *input_sec_domain;
    struct SEE_value *result;
{
    SEE_try_context_t c;
    void *saved_sec_domain;

    /* Switch to the domain that controls the input */
    saved_sec_domain = interp->sec_domain;
    interp->sec_domain = input_sec_domain;
    SEE_TRY(interp, c) {
        SEE_Global_eval(interp, input, result);
    }
    /* Restore the domain whether or not an exception was caught */
    interp->sec_domain = saved_sec_domain;
    /* Propagate any caught exception now that the domain is restored */
    SEE_DEFAULT_CATCH(interp, c);
}
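The permission check and the intersection step from the guidelines above might be modelled as follows. This is a toy sketch under my own assumptions: a domain is reduced to a bitmask of principals, the grant table holds a single made-up statement, a NULL current domain is treated like a NULL callee domain (fully trusted), and the check returns 0 or 1 where a real checkPermission() would throw. All toy_ and PRINCIPAL_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A toy security domain: a bitmask of principals. NULL means fully trusted. */
struct toy_domain { unsigned int principals; };

#define PRINCIPAL_FOO  0x1u     /* hypothetical: foo@bar.com */
#define PRINCIPAL_BAZ  0x2u     /* hypothetical: some other principal */

/* One authorization statement: principal -> permission on a resource prefix. */
struct toy_grant {
    unsigned int principal;
    const char *permission;
    const char *resource_prefix;
};

static const struct toy_grant toy_grants[] = {
    { PRINCIPAL_FOO, "FileRead", "/public/" }
};

/*
 * Intersect the current and callee domains, using *out for the result.
 * Returns the domain to make current; a NULL operand leaves the other
 * operand unchanged.
 */
const struct toy_domain *
toy_intersect(const struct toy_domain *current,
    const struct toy_domain *callee, struct toy_domain *out)
{
    assert(out != NULL);
    if (callee == NULL)
        return current;
    if (current == NULL)
        return callee;
    out->principals = current->principals & callee->principals;
    return out;
}

/* Returns 1 if the domain holds the permission on the resource, else 0. */
int
toy_has_permission(const struct toy_domain *domain,
    const char *permission, const char *resource)
{
    size_t i;

    if (domain == NULL)
        return 1;               /* NULL is treated as completely trusted */
    for (i = 0; i < sizeof toy_grants / sizeof toy_grants[0]; i++) {
        const struct toy_grant *g = &toy_grants[i];

        if ((domain->principals & g->principal) != 0 &&
            strcmp(permission, g->permission) == 0 &&
            strncmp(resource, g->resource_prefix,
                strlen(g->resource_prefix)) == 0)
            return 1;
    }
    return 0;
}
```

The effect of intersection is that a call chain can only lose principals as it passes through less trusted code, so a permission granted to one caller does not leak to code controlled by another.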
The SEE library contains various debugging facilities, which are omitted if it is compiled with the NDEBUG preprocessor define.
These functions are intended for the developer to use while application debugging, and not for general use.
void SEE_PrintValue(struct SEE_interpreter *interp, struct SEE_value *val, FILE *file);
void SEE_PrintObject(struct SEE_interpreter *interp, struct SEE_object *obj, FILE *file);
void SEE_PrintString(struct SEE_interpreter *interp, struct SEE_string *str, FILE *file);
void SEE_PrintTraceback(struct SEE_interpreter *interp, FILE *file);
If debugging the library itself, it is worth reading the source code to find the debug flag variables that can be turned on by the host application to enable verbose traces during execution.
Defining the NDEBUG preprocessor symbol when building the library also disables (slow) internal assertions that would otherwise help show up application misuse of the API.
When using gdb on Unix, you can save a lot of heartache by using libtool to invoke it. Libtool knows what to do.
$ ./libtool --mode=execute gdb shell/see-shell
The SEE library does not contain a script debugger; however, it does provide an interpreter hook for external debuggers, and the see-shell example tool contains an example of using it.
The interpreter structure provides a trace callback field, which is called on certain events during execution (function call, return, throw or statement).
The callback is also passed a handle to the current execution context (a struct SEE_context *), and an external debugger may examine it directly, or indirectly via the SEE_context_eval() utility function, which is otherwise functionally identical to SEE_Global_eval().
SEE_context_eval() is intended only for use by external debuggers attached to the trace callback.
⚠ Note: During a debugger's execution, the trace callback should be disabled by setting it to NULL, otherwise re-entrant tracing can occur.
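The save, disable and restore pattern this note describes can be sketched with a toy model. The structure and functions below are stand-ins of my own and do not use SEE's real trace signature (which also carries location, context and event information); they only show why the hook is cleared while the debugger runs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the interpreter; only the trace hook is modelled. */
struct toy_interp {
    void (*trace)(struct toy_interp *);
    int debugger_entries;       /* how many times the debugger was entered */
};

/* Models the interpreter reaching a traceable event. */
void
toy_fire_event(struct toy_interp *interp)
{
    assert(interp != NULL);
    if (interp->trace != NULL)
        interp->trace(interp);
}

/* A debugger hook that disables tracing while it runs. */
void
toy_debugger_trace(struct toy_interp *interp)
{
    void (*saved)(struct toy_interp *) = interp->trace;

    interp->trace = NULL;       /* prevent re-entrant tracing */
    interp->debugger_entries++;
    toy_fire_event(interp);     /* e.g. an expression the debugger evaluates */
    interp->trace = saved;      /* re-enable tracing for the script */
}
```

Without the assignment to NULL, the event fired from inside the debugger would call the hook again and recurse without end.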
⚠ Note: The signature and frequency of calling the trace hook changed between API 1.0 and API 2.0.
void SEE_context_eval(struct SEE_context *context, struct SEE_string *expr, struct SEE_value *res);
If see-shell is invoked with the -g option, then immediately before it executes the first script, it will prompt the user for debugging operations.
Commands available at the '%' prompt include:
- break [filename:]lineno
- show
- delete number
- step
- cont
- where
- info
- eval expr
- throw expr
© David Leonard, 2004. This documentation may be entirely reproduced and freely distributed, as long as this copyright notice remains intact, and either the distributed reproduction or translation is a complete and bona fide copy, or the modified reproduction is substantially the same and includes a brief summary of the modifications made.
$Id: USAGE.html 1126 2006-08-05 12:48:25Z d $