Using SEE, the Simple ECMAScript Engine

by David Leonard, 2006
for SEE version 2.0

The impatient may want to jump straight to the §4.1 code example.

Introduction

The Simple ECMAScript Engine ('SEE') is a parser and runtime library for the popular ECMAScript language. ECMAScript is the official name for what most people call JavaScript:

[ECMAScript] is based on several originating technologies, the most well known being JavaScript (Netscape) and JScript (Microsoft). The language was invented by Brendan Eich at Netscape and first appeared in that company's Navigator 2.0 browser. It has appeared in all subsequent browsers from Netscape and in all browsers from Microsoft starting with Internet Explorer 3.0. (ECMA-262 standard, 1999)

SEE fully complies with ECMAScript Edition 3 and with JavaScript 1.5. It has compatibility modes that allow it to run scripts developed under earlier versions of JavaScript, Microsoft's JScript and LiveScript.

This documentation is intended for developers wishing to incorporate SEE into their applications. It explains how you can use SEE to:

This documentation does not explain the ECMAScript language, nor discuss how to build the library on your system.

SEE includes an example application, called see-shell, which allows interactive use of the interpreter and demonstrates how to write host function objects.

Document conventions

I will use the phrase host application to mean your application, or any application that uses the SEE runtime environment as an auxiliary to some primary purpose. Examples of host applications are web browsers and scriptable XML processors.

Throughout this documentation, references are made to the C functions and macros provided by the SEE library. To avoid definitional redundancy and to improve precision, the reader is encouraged to examine the SEE header files to find the precise definitions and arguments of each function or macro. Signatures for C macros are given, but you should understand that the compiler cannot normally typecheck your use of those macros.

Where literal C code is used, it is typeset in a monospace font, like this:

if (failed) { abort(); } /* comment */

Similarly, ECMAScript code is typeset in a sans serif font, like this:

window.location = "about:blank";

Important parts of example code are highlighted, and elided code is indicated with an ellipsis, like this: ...

Function definitions listed in the name index appear in green, like this: SEE_example()

The term ASCII refers to character codes in the decimal range 0 through 127, inclusive.

1 Requirements

Compiling SEE requires an ANSI C compiler. Although the SEE library is essentially self-contained, it does depend on you (the host application developer) providing the following:

an IEEE 754 floating point type and a math library
Most modern compilers have this, but if you are developing for some obscure architecture, you should check.
a garbage-collecting memory allocator
The free Boehm gc is highly recommended (See also §3.1).

SEE uses scripts from GNU autoconf to determine if these are available, and also to determine other system-dependent properties. Host applications should #include <see/see.h> to access all the macros and function prototypes.

(As a developer you may find the need to edit header files and configure scripts to make SEE compile on your system. I would be interested in hearing what changes were needed so that future releases can supply this automatically for other users. Please send mail to leonard@users.sourceforge.net.nospam.)

2 Creating interpreters

The first step in executing ECMAScript program text with SEE is to create an interpreter instance. Each interpreter instance represents an execution context and global variable space. When created, an interpreter is initialised with all the standard ECMAScript objects such as Math and String. Modules may add other objects (see §7).

First, the host application should allocate storage for a struct SEE_interpreter and then call SEE_interpreter_init() to initialise that structure.

void SEE_interpreter_init(struct SEE_interpreter *interp);

A pointer to the initialised SEE_interpreter is required for almost every function that SEE provides. The pointer is conventionally named interp.

Here is an example where storage has been allocated on the stack, and consequently the interpreter exists only until the function returns.

void
example()
{
    struct SEE_interpreter interp_storage;

    SEE_interpreter_init(&interp_storage);
    /* now the interpreter is ready */
}

There is no mechanism for explicitly destroying an initialised interpreter; instead, SEE relies on the garbage collector to reclaim all unreferenced storage (see §3).

2.1 Multiple simultaneous interpreters

SEE supports multiple, simultaneous, independent interpreter instances. This is useful, for example, in an HTML web browser application, where each window may need its own interpreter instance because the variables and bindings to built-in objects must be different and separate in each one.

SEE's functions are not thread-safe within the same interpreter, but multiple different interpreters can be safely used by different threads without collision. This is because global data structures used by SEE are marked immutable when the first interpreter is initialised. Interpreters remain completely independent of each other only if the application:

2.2 Fatal error handlers

If SEE encounters an internal error (such as memory exhaustion, memory corruption, or a bug), it calls the global function pointer SEE_system.abort, passing it a pointer to the interpreter in context (or NULL), and a short descriptive message. The SEE_system.abort hook initially points to a wrapper function that simply calls the C library function abort(). You can set the hook early if you want to handle errors more gracefully. Its signature is:

extern struct {
    ...
    void (*abort)(struct SEE_interpreter *interp, const char *msg) _SEE_dead;
    ...
} SEE_system;

A convenience macro, SEE_ABORT(), is provided for applications to call. It calls the hook function.

extern void (*SEE_ABORT)(struct SEE_interpreter *interp, const char *msg);

3 Memory management

SEE uses a garbage collecting memory allocator. SEE has global function pointers for memory allocation that the host application can configure. These hooks must be set up before any interpreter instances are created.

SEE manages memory by calling through the following function pointers stored in the global structure SEE_system. Your host application can replace these pointers before it creates any interpreters.

extern struct {
    ...
    void * (*malloc)(struct SEE_interpreter *interp, SEE_size_t size);
    void * (*malloc_finalize)(struct SEE_interpreter *interp, SEE_size_t size,
        void (*)(struct SEE_interpreter *, void *, void *), void *);
    void * (*malloc_string)(struct SEE_interpreter *interp, SEE_size_t size);

    void   (*free)(struct SEE_interpreter *interp, void *ptr);
    void   (*mem_exhausted)(struct SEE_interpreter *interp);
    void   (*gcollect)(struct SEE_interpreter *interp);
    ...
} SEE_system;

These hooks are invoked through the following functions:

void * SEE_malloc(struct SEE_interpreter *interp, SEE_size_t size);
void * SEE_malloc_finalize(struct SEE_interpreter *interp, SEE_size_t size,
        void (*finalizefn)(struct SEE_interpreter *i, void *p, void *closure),
        void *closure);
void * SEE_malloc_string(struct SEE_interpreter *interp, SEE_size_t size);
void SEE_free(struct SEE_interpreter *interp, void **datap);
void SEE_gcollect(struct SEE_interpreter *interp);

Notice that SEE_free() takes a pointer-to-a-pointer argument, unlike its counterpart in the SEE_system structure. The pointer will be set to NULL after freeing. Freeing an already-NULL pointer with SEE_free() has no effect.
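
The pointer-to-a-pointer contract can be illustrated with a minimal sketch. The function demo_free() below is a hypothetical stand-in built on the C library's free(); it is not SEE's implementation and only mirrors the behaviour described above.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for the SEE_free() contract (not SEE's code):
 * release the storage that *datap refers to, then overwrite the
 * caller's pointer with NULL. A pointer that is already NULL is
 * left alone, so a second call is harmless.
 */
static void
demo_free(void **datap)
{
    if (*datap != NULL) {
        free(*datap);
        *datap = NULL;
    }
}
```

Because the caller's own variable is cleared, a stale pointer cannot be freed twice by accident through the same variable.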

If SEE was compiled with Boehm-gc support, SEE_system.malloc is initialised to point to a wrapper around the GC_malloc() function, SEE_system.malloc_string is initialised to point to a wrapper around GC_malloc_atomic(), SEE_system.free is initialised to point to a wrapper around GC_free(), and SEE_system.malloc_finalize is initialised to point to a wrapper around GC_malloc() and GC_register_finalizer(). Otherwise, the initial functions print a warning message and use the system malloc() without releasing any memory.

If you intend to hook in your own memory allocator, be aware that any of these hooks may be called with a NULL interpreter argument which indicates an unknown context. In case of errors or resource exhaustion, the malloc hooks must not throw an exception, but should return NULL on failure. SEE will detect this situation and act accordingly.

Instead of calling the SEE_malloc functions directly, application code should use these convenient type-cast macros to allocate storage:

T * SEE_NEW(struct SEE_interpreter *interp, type T);
T * SEE_NEW_ARRAY(struct SEE_interpreter *interp, type T, int length);
T * SEE_NEW_STRING_ARRAY(struct SEE_interpreter *interp, type T, int length);
T * SEE_NEW_FINALIZE(struct SEE_interpreter *interp, type T,
    void (*finalizefn)(struct SEE_interpreter *, void *, void *),
    void *closure);
T * SEE_ALLOCA(struct SEE_interpreter *interp, type T, int length);
T * SEE_STRING_ALLOCA(struct SEE_interpreter *interp, type T, int length);

A usage example is:

char *buffer = SEE_NEW_STRING_ARRAY(interp, char, 30);

The allocator macros and functions check for memory allocation failures, and in that instance will automatically call SEE_system.mem_exhausted(). This hook defaults to a function that simply calls SEE_ABORT() with a short message. Your application may prefer to change the mem_exhausted hook to handle this situation more gracefully.

It is worth familiarizing yourself with the macro definitions in <see/mem.h> to see what they do.

3.1 On memory allocators

Why is SEE so dependent on a garbage collector? Why doesn't it use reference counting?

This subsection is a short diversion on answering this good question. I have asked myself the same thing about other applications that use garbage collectors. I'll justify SEE's reliance on a garbage collector with the following reasons:

3.2 Interacting with an external allocator

If you will be embedding SEE in a host application that uses non-GC malloc(), then any struct SEE_... pointers that you keep inside storage obtained by malloc() are likely to be unreliable. This is because garbage collectors do not normally look inside malloced memory.

To make the GC aware of your reference to the SEE object, you will need either to arrange for the GC to scan your malloc'd memory (e.g. by adding it to its root set) or to use a level of pointer indirection through uncollectable GC memory.

For example, you might create an indirect reference by allocating storage with Boehm's GC_MALLOC_UNCOLLECTABLE() function for a structure like this:

struct myobjref {   /* Always allocate this as GC uncollectable */
    struct SEE_interpreter *interp;
    struct SEE_object *object;
};

Then, you can safely keep a pointer to the myobjref in storage allocated by malloc(). It would still be your problem to eventually release the myobjref with a call to GC_FREE().

You may also find it convenient for SEE to manage the lifetime of your malloced host data. This normally happens when you create wrapped SEE objects (see §6.3) and include pointers into storage allocated with malloc(). To achieve this, use SEE_NEW_FINALIZE() to allocate the wrapped object structure, and write a finalizer function that calls free() on the right members.

Similar approaches can be used for external allocators that use reference counting.

3.3 Finalization

Host objects that acquire operating system resources should release those resources when their finalizer is called by the garbage collector. However, a major criticism of garbage collectors is that finalizers are not invoked early or often enough. What this means is that it is highly desirable to release system resources at the earliest time possible (i.e. immediately the referring object becomes unreachable), but garbage collectors deliberately don't do this because performing the reachability analysis on objects is expensive and often left only to when memory is low.

To address this problem, I recommend developers follow these guidelines when designing their host object finalizers:

  1. Provide a host object method (conventionally called dispose()) that immediately releases all resources acquired by the object. The dispose method should cope with being called multiple times without error. The object will need to maintain state indicating whether it is valid or disposed, and its methods should check this state and generally throw an exception if it is invalid. You may also wish to provide the user with an accessor for this state, e.g. isValid(). The state can usually be combined with other normally occurring error states of the object.
  2. Arrange to have the finalizer simply invoke the dispose method. You may even choose to allow users to hook the dispose method by having the finalizer call the method indirectly with SEE_OBJECT_CALL, and then once again, directly, in case the user's hook failed.
  3. Detect/predict resource exhaustion at the point when host objects acquire new resources. In this case, force an immediate garbage collection by calling SEE_gcollect(), and then try just once more to acquire the resource before failing. An example of this technique is shown in the mod_File.c example module that comes with SEE.

The principal effects of these guidelines are to first shift the burden of optional, optimal reachability analysis onto the user, and secondly to couple memory exhaustion with resource exhaustion to exploit the benefits of late reachability analysis and avoid false resource loss when memory is plentiful.
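
The three guidelines can be sketched in a few lines of C. Everything named demo_ below is hypothetical: the "operating system" is a counter holding one handle, and demo_collect() merely simulates a collector by finalizing one object that we pretend is unreachable. In a real host, the collect step is where SEE_gcollect() would be called, and the finalizer would be registered through SEE_NEW_FINALIZE().

```c
#include <stddef.h>

struct SEE_interpreter;          /* opaque here; only used as a pointer */

/* Hypothetical host object wrapping one scarce resource. */
struct demo_resource {
    int handle;                  /* -1 when no resource is held */
    int disposed;                /* non-zero once dispose has run */
};

/* Stand-in for the operating system: a pool of exactly one handle. */
static int demo_handles_free = 1;

static int
demo_os_acquire(void)
{
    if (demo_handles_free == 0)
        return -1;               /* resource exhaustion */
    demo_handles_free--;
    return 3;                    /* arbitrary handle number */
}

static void
demo_os_release(int handle)
{
    (void)handle;
    demo_handles_free++;
}

/* Guideline 1: release immediately; repeated calls are harmless. */
static void
demo_dispose(struct demo_resource *r)
{
    if (r->disposed)
        return;
    demo_os_release(r->handle);
    r->handle = -1;
    r->disposed = 1;
}

/* Accessor for the valid/disposed state, as suggested in guideline 1. */
static int
demo_is_valid(const struct demo_resource *r)
{
    return !r->disposed;
}

/* Guideline 2: the finalizer simply invokes dispose. */
static void
demo_finalize(struct SEE_interpreter *interp, void *p, void *closure)
{
    (void)interp;
    (void)closure;
    demo_dispose((struct demo_resource *)p);
}

/* Simulated collector: finalizes the one object we call unreachable. */
static struct demo_resource *demo_unreachable = NULL;

static void
demo_collect(void)
{
    if (demo_unreachable != NULL) {
        demo_finalize(NULL, demo_unreachable, NULL);
        demo_unreachable = NULL;
    }
}

/* Guideline 3: on exhaustion, collect once and retry once. */
static int
demo_open(struct demo_resource *r, void (*collect)(void))
{
    int h = demo_os_acquire();

    if (h < 0 && collect != NULL) {
        collect();               /* a host would call SEE_gcollect() */
        h = demo_os_acquire();   /* try just once more */
    }
    if (h < 0)
        return -1;
    r->handle = h;
    r->disposed = 0;
    return 0;
}
```

With one handle in the pool, a second demo_open() succeeds only because the forced collection finalizes the first, abandoned object and returns its handle.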

4 Running programs

SEE's ultimate purpose is to execute user scripts. A full script, or a self-contained fragment of a script, is referred to as program text. You should execute program text using the following general strategy:

  1. obtain a reference to an (initialised) SEE_interpreter (§2);
  2. construct a SEE_input unicode stream reader (§4.2) to transport the ECMAScript program text to SEE's parser;
  3. establish a try-catch context (§4.3);
  4. call the function SEE_Global_eval() to parse and evaluate the stream;
  5. handle any exceptions caught in the try-catch context (§4.3);
  6. examine the value result returned (§5) (optional)

The SEE_Global_eval() function is able to execute program text and then store the value associated with the last executed statement in a location given by a value pointer. In a non-interactive environment, this last statement's value is usually meaningless, and the value result return pointer ('res') given to SEE_Global_eval() may be safely given as NULL.

void SEE_Global_eval(struct SEE_interpreter *interp, 
                struct SEE_input *input, 
                struct SEE_value *res);

The program text is first parsed and then immediately executed with this function. If the evaluated text contains function definitions, the function-objects created inside the interpreter will contain a 'precompiled' copy of the function text. This means it is safe to destroy the input immediately after it has been passed to SEE_Global_eval().

4.1 Example

Although the rest of this document explains the library API in detail, a complete, but simple example of using the SEE interpreter follows:

#include <stdio.h>      /* printf() */
#include <stdlib.h>     /* exit() */
#include <see/see.h>

/* Simple example of using the interpreter */
int
main()
{
        struct SEE_interpreter interp_storage, *interp;
        struct SEE_input *input;
        SEE_try_context_t try_ctxt;
        struct SEE_value result;
        char *program_text = "Math.sqrt(3 + 4 * 7) + 9";

        /* Initialise an interpreter */
        SEE_interpreter_init(&interp_storage);
        interp = &interp_storage;

        /* Create an input stream that provides program text */
        input = SEE_input_utf8(interp, program_text);

        /* Establish an exception context */
        SEE_TRY(interp, try_ctxt) {
                /* Call the program evaluator */
                SEE_Global_eval(interp, input, &result);

                /* Print the result */
                if (SEE_VALUE_GET_TYPE(&result) == SEE_NUMBER)
                        printf("The answer is %f\n", result.u.number);
                else
                        printf("Unexpected answer\n");
        }

        /* Finally: */
        SEE_INPUT_CLOSE(input);

        /* Catch any exceptions */
        if (SEE_CAUGHT(try_ctxt))
                printf("Unexpected exception\n");

        exit(0);
}

When this program is compiled, linked against the SEE library and the garbage collector library, and run, it should respond with:

The answer is 14.567764

This works because the value of the last executed statement in the program_text is stored in result. Calling SEE_Global_eval() is essentially the same as using ECMAScript's built-in eval() function.

If you are interested in developing a provider module for SEE, then you should look at the example module file mod_File.c. See also §7.

4.2 Inputs

SEE uses Unicode character stream sources known as 'inputs' to consume (scan and parse) ECMAScript program text. An input is a stream of 32-bit Unicode UCS-4 characters. The stream is read, one character at a time, through its 'get next character' callback function.

The SEE library provides some useful stream constructors. Each constructor creates a new SEE_input structure, initialised for reading the source it is supplied.

struct SEE_input *SEE_input_file(struct SEE_interpreter *interp, 
                FILE *f, const char *filename, const char *encoding);
struct SEE_input *SEE_input_utf8(struct SEE_interpreter *interp,
                const char *s);
struct SEE_input *SEE_input_string(struct SEE_interpreter *interp,
                struct SEE_string *s);

If these constructors do not adequately meet your needs, you are encouraged to develop your own. They're quite easy to do, if a bit fiddly. I recommend you find the source to one of the above and modify it to do what you want.

The rest of this section describes the input API in detail, with a view towards custom input streams.

4.2.1 Input provider API

Why streams instead of strings? SEE uses a stream API for inputs rather than (say) a simple UCS-4 or UTF-8 string API, because Unicode-compliant applications will usually have a much better understanding of the encodings they are using than will SEE. With only a small amount of effort, streams provide this flexibility while avoiding unnecessary duplication or text storage.

Inputs are described by SEE_input structures. These are functionally similar to stdio's FILE type or Java's Reader classes, except that they stream fully decoded Unicode characters. The SEE_input structure is the focus of the API: it maintains the input's stream state and provides a pointer to its access (callback) methods.

struct SEE_input {
        struct SEE_inputclass *inputclass;
        SEE_boolean_t          eof;
        SEE_unicode_t          lookahead;
        ...
};

struct SEE_inputclass {
        SEE_unicode_t   (*next)(struct SEE_input *input);
        void            (*close)(struct SEE_input *input);
};

The inputclass member indicates the access methods. It is a pointer to a SEE_inputclass structure. This class structure contains function pointers to the two methods next() and close().

The next() method should advance the input pointer, update the eof and lookahead members of the SEE_input structure, and return the old value of lookahead. SEE's scanner calls next() repeatedly, until the eof member becomes true. When eof is true, the value of lookahead becomes meaningless (but should be set to -1). Generally, the stream's constructor will internally call its next() function once initially, to 'prime' the lookahead field.

If the next() method encounters an encoding error, it should set lookahead to SEE_INPUT_BADCHAR and try to recover. It can throw an exception if it wants to, but SEE does not attempt to handle that: the application or user program will receive it. If you don't particularly care about Unicode, it is helpful to know that 7-bit ASCII is a direct subset of Unicode, so you can just pass each of your ASCII chars as a 32-bit SEE_unicode_t masked with 0x7f. (See the Unicode standards.)

The close() method should deallocate any operating system resources acquired during the input stream's construction. By convention, SEE will not call the close() method of any application-supplied input. The onus is on the caller to close the inputs supplied to SEE library functions. For this reason, you should use the 'finally' behaviour described in §4.3 to clean up a possibly failed stream.

The SEE_input structure represents the current state of the input stream. Most importantly, the lookahead field must always reflect the next character that a call to next() would return. Once initialised, the filename, first_lineno and interpreter members of the SEE_input structure should not be changed. The lookahead and eof members should also be initialised before the structure is given to SEE.

You are encouraged to read the source code to the three constructors listed at the beginning of this section.
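
To make the next()/lookahead contract concrete, here is a minimal sketch of a custom input over a NUL-terminated ASCII string. So that it stands alone, it uses hypothetical demo_ types that mirror only the members shown above; a real input would embed struct SEE_input and also fill in the members elided from the listing (such as filename, first_lineno and interpreter).

```c
#include <stddef.h>

typedef unsigned int demo_unicode_t;    /* stand-in for SEE_unicode_t */

struct demo_input;

struct demo_inputclass {                /* mirrors struct SEE_inputclass */
    demo_unicode_t (*next)(struct demo_input *input);
    void           (*close)(struct demo_input *input);
};

struct demo_input {                     /* mirrors the visible members */
    struct demo_inputclass *inputclass;
    int                     eof;
    demo_unicode_t          lookahead;
};

/* The custom stream: base structure first, then private state. */
struct demo_ascii_input {
    struct demo_input base;             /* first, so pointers can be cast */
    const char       *pos;              /* next unread byte */
};

/* Advance the stream and return the OLD value of lookahead. */
static demo_unicode_t
demo_ascii_next(struct demo_input *input)
{
    struct demo_ascii_input *ai = (struct demo_ascii_input *)input;
    demo_unicode_t old = input->lookahead;

    if (*ai->pos == '\0') {
        input->eof = 1;
        input->lookahead = (demo_unicode_t)-1;  /* meaningless at eof */
    } else {
        /* 7-bit ASCII maps directly onto Unicode */
        input->lookahead = (demo_unicode_t)(*ai->pos++ & 0x7f);
    }
    return old;
}

/* Nothing was acquired from the operating system, so nothing to do. */
static void
demo_ascii_close(struct demo_input *input)
{
    (void)input;
}

static struct demo_inputclass demo_ascii_class = {
    demo_ascii_next,
    demo_ascii_close
};

/* Constructor: initialise the state, then prime the lookahead. */
static void
demo_ascii_init(struct demo_ascii_input *ai, const char *s)
{
    ai->base.inputclass = &demo_ascii_class;
    ai->base.eof = 0;
    ai->base.lookahead = 0;
    ai->pos = s;
    demo_ascii_next(&ai->base);         /* discard the dummy old value */
}
```

After demo_ascii_init(), lookahead already holds the first character, and an empty string reports eof immediately, which is the state a consumer expects before its first call to next().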

4.2.2 Input client API

Consumers, like SEE's lexical analyser, will use these convenience macros to call input methods on a constructed input stream, rather than calling through the class structure directly:

SEE_unicode_t SEE_INPUT_NEXT(struct SEE_input *input);
void SEE_INPUT_CLOSE(struct SEE_input *input);

4.3 Try-catch contexts

SEE's exceptions are implemented using C's setjmp()/longjmp() mechanism. SEE provides macros that establish a try-catch context, and test later if a try block terminated abnormally (i.e. due to a thrown exception). Typical code that uses try-catch looks like this:

struct SEE_interpreter *interp;
struct SEE_value *e;
SEE_try_context_t c; /* storage for the try-catch context */

...

SEE_TRY(interp, c) {

        /*
         * Now inside a protected "try block".
         * The following calls may throw exceptions if they want,
         * causing the try block to exit immediately.
         */
        do_something();
        do_something_else();

        /* 
         * Because the SEE_TRY macro expands into a 'for' loop,
         * avoid using 'break', or 'return' statements.
         * If you must leave the try block, use 'continue;',
         * or throw an exception.
         */
}

/* Code placed here always runs. */
do_cleanup();

if ((e = SEE_CAUGHT(c))) {
        /* Handle the thrown exception 'e', somehow. */
        handle_exception(e);

        /* or you can throw it up to the next try-catch like so: */
        SEE_THROW(interp, e);
}

...

Do not return, goto or break out of a try block; the macro does not check for this, and the try-catch context may not be restored properly, causing all sorts of havoc.

Exceptions thrown outside of any try-catch context will cause the interpreter to abort.
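
For readers unfamiliar with setjmp()/longjmp(), the following sketch shows the general technique of chaining try contexts. It is not SEE's implementation: all demo_ names are hypothetical, the protected code is passed as a function pointer rather than written as a macro block, and the thrown "exception" is just a C string.

```c
#include <setjmp.h>
#include <stdlib.h>

/* One link in the chain of active try contexts. */
struct demo_try_context {
    jmp_buf                  env;
    struct demo_try_context *previous;
};

static struct demo_try_context *demo_current = NULL;  /* innermost */
static const char              *demo_thrown  = NULL;  /* last thrown */

/* Throw: unlink the innermost context and jump back into it. */
static void
demo_throw(const char *msg)
{
    struct demo_try_context *c = demo_current;

    if (c == NULL)
        abort();                    /* thrown outside any context */
    demo_thrown = msg;
    demo_current = c->previous;
    longjmp(c->env, 1);
}

/*
 * Run fn() inside a new context. Returns 0 if fn() returned normally,
 * or 1 with *caught set if it threw.
 */
static int
demo_protect(void (*fn)(void), const char **caught)
{
    struct demo_try_context c;

    c.previous = demo_current;
    demo_current = &c;
    if (setjmp(c.env) == 0) {
        fn();
        demo_current = c.previous;  /* normal exit: unlink context */
        *caught = NULL;
        return 0;
    }
    /* Reached through longjmp(); demo_throw() already unlinked c. */
    *caught = demo_thrown;
    return 1;
}

/* Two example bodies: one completes, one throws. */
static void demo_ok(void)   { }
static void demo_fail(void) { demo_throw("boom"); }
```

The sketch also suggests why leaving a try block early is dangerous: if control escaped between linking the context and unlinking it, demo_current would keep pointing at a dead stack frame, and the next throw would jump into it.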

If you are not interested in catching exceptions, and only want the 'finally' behaviour, use the following idiom:

SEE_TRY(interp, c) {
        do_something();
}
do_finally();    /* optional */
SEE_DEFAULT_CATCH(interp, c);

The signatures of these macros are:

SEE_TRY(struct SEE_interpreter *interp, SEE_try_context_t ctxt) { stmt... }
struct SEE_value *SEE_CAUGHT(SEE_try_context_t ctxt);
void SEE_THROW(struct SEE_interpreter *interp, struct SEE_value *exception);
void SEE_DEFAULT_CATCH(struct SEE_interpreter *interp, SEE_try_context_t ctxt);

4.4 Periodic callbacks

An application can use SEE's periodic callback mechanism to check for timeouts, interrupts or GUI events. The periodic field in the SEE_system structure, if set to something other than NULL, is called in the following situations:

extern struct {
        /* ... */
        void (*periodic)(struct SEE_interpreter *);
        /* ... */
} SEE_system;

It is possible for a cfunction to block the current thread and prevent the periodic hook from being called by SEE. Alternatives to the periodic hook are:

⚠ Note: The periodic hook appeared in API 2.0

5 Values

Eventually, your host application will want to pass numbers, strings and complex value objects about, through the SEE interpreter, to and from the user code. This section describes the C interface to ECMAScript values.

The ECMAScript language has exactly six types of value. They are undefined, null, boolean, number, string and object.

The SEE_value structure can represent values of all of these types.

struct SEE_value {
    enum { ... }            _type;
    union {
        SEE_boolean_t       boolean;
        SEE_number_t        number;
        struct SEE_string * string;
        struct SEE_object * object;
        ...
    } u;
};

The first member, _type, is the discriminator, and must be one of the enumerated values SEE_UNDEFINED, SEE_NULL, SEE_BOOLEAN, SEE_NUMBER, SEE_STRING or SEE_OBJECT. You should access the _type member using the SEE_VALUE_GET_TYPE() macro.

enum { ... } SEE_VALUE_GET_TYPE(struct SEE_value *value);

Depending on the type, you can directly access the corresponding value of a SEE_value. If the value variable is declared as:

struct SEE_value v;

then the value that it holds is directly accessed through its union member, v.u. The following table shows when the union fields of v.u are valid:

SEE_VALUE_GET_TYPE(&v)   Valid member   Member's type
SEE_UNDEFINED            n/a            n/a
SEE_NULL                 n/a            n/a
SEE_BOOLEAN              v.u.boolean    SEE_boolean_t
SEE_NUMBER               v.u.number     SEE_number_t
SEE_STRING               v.u.string     struct SEE_string *
SEE_OBJECT               v.u.object     struct SEE_object *

Two other types (SEE_COMPLETION and SEE_REFERENCE) are only used internally to SEE and are not documented here.
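
The discriminated-union pattern is worth seeing once in code. The sketch below uses hypothetical demo_ types so that it stands alone; with SEE itself you would switch on SEE_VALUE_GET_TYPE(&v) and read the matching v.u member in exactly the same way. Plain C types stand in for struct SEE_string * and struct SEE_object *.

```c
#include <stdio.h>

enum demo_type {
    DEMO_UNDEFINED, DEMO_NULL, DEMO_BOOLEAN,
    DEMO_NUMBER, DEMO_STRING, DEMO_OBJECT
};

struct demo_value {
    enum demo_type type;            /* the discriminator */
    union {
        int         boolean;
        double      number;
        const char *string;         /* stand-in for struct SEE_string * */
        void       *object;         /* stand-in for struct SEE_object * */
    } u;
};

/* Format a short description of v into buf. Only the union member
 * selected by the discriminator is ever read. */
static const char *
demo_describe(const struct demo_value *v, char *buf, size_t buflen)
{
    switch (v->type) {
    case DEMO_UNDEFINED:
        snprintf(buf, buflen, "undefined");
        break;
    case DEMO_NULL:
        snprintf(buf, buflen, "null");
        break;
    case DEMO_BOOLEAN:
        snprintf(buf, buflen, "%s", v->u.boolean ? "true" : "false");
        break;
    case DEMO_NUMBER:
        snprintf(buf, buflen, "%g", v->u.number);
        break;
    case DEMO_STRING:
        snprintf(buf, buflen, "\"%s\"", v->u.string);
        break;
    case DEMO_OBJECT:
        snprintf(buf, buflen, "[object %p]", v->u.object);
        break;
    default:
        snprintf(buf, buflen, "?");
        break;
    }
    return buf;
}
```

Reading a member other than the one the discriminator selects yields garbage, which is why the type should always be tested first.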

To convert/coerce values into values of a different type, use the utility functions described in §5.1.

To create new values in struct SEE_value structures, use the following initialisation macros. They first set the _type field and then copy the second parameter into the appropriate union field. It is fine to use a local variable for a struct SEE_value, because the garbage collector can see what is being used from the stack.

void SEE_SET_UNDEFINED(struct SEE_value *val);
void SEE_SET_NULL(struct SEE_value *val);
void SEE_SET_OBJECT(struct SEE_value *val, struct SEE_object *obj);
void SEE_SET_STRING(struct SEE_value *val, struct SEE_string *str);
void SEE_SET_NUMBER(struct SEE_value *val, SEE_number_t num);
void SEE_SET_BOOLEAN(struct SEE_value *val, SEE_boolean_t bool);

Most SEE_values are passed about the SEE library functions using pointers. This is because the general contract is that the caller supplies storage for the return value, while other pointer arguments are treated as read-only. Conventionally, the result value pointer is provided as the last argument to these functions and is named res.

Avoid storing a struct SEE_value as a pointer. Instead, extract and copy values into storage using the following macro:

void SEE_VALUE_COPY(struct SEE_value *dst, struct SEE_value *src);

⚠ Note: The SEE_VALUE_COPY() macro breaks the convention of putting the result pointer last by instead following the better-known idiom of memcpy(), which places the destination first.

A simple pitfall to avoid when passing values to SEE functions is to use a single value as both a parameter to the function and as the return result storage. Do not do this. It is possible that the function will initialise its return storage before it accesses its parameters.

5.1 Value conversion

The ECMAScript language specification provides for conversion functions that the host application developer may find useful. They convert arbitrary values into values of a known type:

void SEE_ToPrimitive(struct SEE_interpreter *interp,
                struct SEE_value *val, struct SEE_value *hint,
                struct SEE_value *res);
void SEE_ToBoolean(struct SEE_interpreter *interp,
                struct SEE_value *val, struct SEE_value *res);
void SEE_ToNumber(struct SEE_interpreter *interp,
                struct SEE_value *val, struct SEE_value *res);
void SEE_ToInteger(struct SEE_interpreter *interp,
                struct SEE_value *val, struct SEE_value *res);
void SEE_ToString(struct SEE_interpreter *interp,
                struct SEE_value *val, struct SEE_value *res);
void SEE_ToObject(struct SEE_interpreter *interp,
                struct SEE_value *val, struct SEE_value *res);

See also the SEE_parse_args() function for a convenient way to extract C types from SEE values.

5.2 Undefined, null, boolean and number values

The undefined and null types have exactly one implied value each, namely undefined and null.

⚠ Note: ECMAScript's null is not an object type, and is not related to C's NULL constant.

Boolean types (SEE_boolean_t) have values of either true (non-zero) or false (zero).

Number values (SEE_number_t) are IEEE 754 signed floating point numbers, normally corresponding to the C compiler's built-in double type.

The following macros may be used to find information about a number value. (They assume that the type is SEE_NUMBER):

int SEE_NUMBER_ISNAN(struct SEE_value *val);
int SEE_NUMBER_ISPINF(struct SEE_value *val);
int SEE_NUMBER_ISNINF(struct SEE_value *val);
int SEE_NUMBER_ISINF(struct SEE_value *val);
int SEE_NUMBER_ISFINITE(struct SEE_value *val);

SEE also provides constants SEE_Infinity and SEE_NaN which may be stored in number values, but should not be used to compare number values with C's == operator. Use the macros mentioned previously, instead.

const SEE_number_t SEE_Infinity;
const SEE_number_t SEE_NaN;

Numbers (and other values) may be converted to integers using the functions SEE_ToInt32(), SEE_ToUint32() or SEE_ToUint16().

SEE_int32_t  SEE_ToInt32(struct SEE_interpreter *interp, struct SEE_value *val);
SEE_uint32_t SEE_ToUint32(struct SEE_interpreter *interp, struct SEE_value *val);
SEE_uint16_t SEE_ToUint16(struct SEE_interpreter *interp, struct SEE_value *val);

SEE provides three data types for integers: SEE_int32_t, SEE_uint32_t and SEE_uint16_t.

5.3 String values

String values are pointers to SEE_string structures that hold UTF-16 strings. The structure is defined something like this:

struct SEE_string {
        unsigned int     length;
        SEE_char_t      *data;
        ...
};

The useful members are:

Be aware that other strings may come to share the string's data, such as by forming substrings. A string's content must not be modified after construction because of this risk. However, the length field of a string may be changed to a smaller value at any time without concern.

The SEE_char_t type represents a UTF-16 character in the string. It is equivalent to a 16-bit unsigned integer.

To manipulate a string, first create a new string using one of the following:

struct SEE_string *SEE_string_new(struct SEE_interpreter *interp,
                unsigned int space);
struct SEE_string *SEE_string_dup(struct SEE_interpreter *interp,
                struct SEE_string *s);
struct SEE_string *SEE_string_sprintf(struct SEE_interpreter *interp,
                const char *fmt, ...);
struct SEE_string *SEE_string_vsprintf(struct SEE_interpreter *interp,
                const char *fmt, va_list ap);

And then, before passing your new string to any other function, append characters to it using the following:

void SEE_string_addch(struct SEE_string *s, SEE_char_t ch);
void SEE_string_append(struct SEE_string *s, const struct SEE_string *sffx);
void SEE_string_append_ascii(struct SEE_string *s, const char *ascii);
void SEE_string_append_int(struct SEE_string *s, int i);

Once a new string has been passed to any other SEE function, it should not have its contents modified in any way. Strings should not be shared between different interpreters, unless internalised with SEE_intern_global() (see §5.3.1).

All strings in SEE use UTF-16 encoding, meaning that in some cases you may need to be aware of Unicode 'surrogate' characters. If the host application really needs UCS-4 strings (which are subtly different to UTF-16), you will need to write your own converter function. Use the implementation of SEE_input_string() (§4.2) as the basis for such a converter because it understands UTF-16 combiner codes.
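
As a starting point for such a converter, here is a minimal sketch that decodes surrogate pairs. It works on plain arrays rather than on struct SEE_string, and every demo_ name is hypothetical. Substituting U+FFFD for an unpaired surrogate is my own assumption; SEE's inputs use SEE_INPUT_BADCHAR for encoding errors, and I have not assumed its value here.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BADCHAR 0xFFFDu    /* assumption: Unicode replacement char */

/*
 * Convert srclen UTF-16 code units into UCS-4 code points.
 * At most dstcap code points are stored in dst (dst may be NULL to
 * count only). Returns the number of code points the input encodes.
 */
static size_t
demo_utf16_to_ucs4(const uint16_t *src, size_t srclen,
                   uint32_t *dst, size_t dstcap)
{
    size_t i = 0, n = 0;

    while (i < srclen) {
        uint32_t c = src[i++];

        if (c >= 0xD800 && c <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i < srclen && src[i] >= 0xDC00 && src[i] <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (src[i] - 0xDC00);
                i++;
            } else {
                c = DEMO_BADCHAR;
            }
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            c = DEMO_BADCHAR;   /* low surrogate with no high before it */
        }
        if (dst != NULL && n < dstcap)
            dst[n] = c;
        n++;
    }
    return n;
}
```

Calling the function first with a NULL destination gives the number of code points, which can then be used to size the output buffer.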

The functions SEE_string_sprintf() and SEE_string_vsprintf() do not have exactly the same formats as the standard printf() function, although they are substantially similar. The following table lists the formats that are understood:

Format            Type                  Comment
%[+][-][0][#]d    signed int            decimal
%[+][-][0][#]u    unsigned int          decimal
%[+][-][0][#]x    unsigned int          hexadecimal (base 16)
%c                char                  ASCII only
%C                SEE_char_t
%[-][#][.#]s      const char *          ASCII only
%[-][#][.#]S      struct SEE_string *
%[-][#]p          void *
%%                                      single %
%other                                  literal %other

⚠ Note: Prior to API 2.0, SEE_string_sprintf() and SEE_string_vsprintf() used the system snprintf(), forcing their output to 7-bit ASCII.

Where a hash (#) appears in the format column above, either a positive integer in base 10 may be supplied to indicate padding or precision (e.g. %4d), or an asterisk (*) can be used instead to indicate that the next int argument provides the padding or precision value. This follows the behaviour of printf().
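Because the asterisk convention is borrowed from printf(), it can be tried out with the standard C library alone. The sketch below uses snprintf(), not SEE_string_sprintf(), and the helper name format_field is hypothetical. It shows a width and a precision each being taken from an int argument.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format value right-aligned in a field of width columns, then a '|',
 * then at most maxchars characters of label. Both numbers are passed
 * as int arguments and consumed by the '*' in the format string.
 * Returns whatever snprintf() returns: the length of the full output,
 * not counting the terminating null character.
 */
int
format_field(char *buf, size_t bufsize, int width, int value,
    int maxchars, const char *label)
{
        assert(buf != NULL && bufsize > 0 && label != NULL);
        return snprintf(buf, bufsize, "%*d|%.*s",
            width, value, maxchars, label);
}
```

With a width of 4, a value of 42, a precision of 3 and the label "abcdef", the buffer ends up holding two spaces, then 42, then |abc.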

Other string functions provided are:

struct SEE_string *SEE_string_substr(struct SEE_interpreter *interp,
                struct SEE_string *s, int index, int length);
struct SEE_string *SEE_string_literal(struct SEE_interpreter *interp,
                const struct SEE_string *s);
int SEE_string_fputs(const struct SEE_string *s, FILE *file);
void SEE_string_toutf8(struct SEE_interpreter *interp, 
                char *buffer, SEE_size_t buffer_size,
                const struct SEE_string *s);
SEE_size_t SEE_string_utf8_size(struct SEE_interpreter *interp, 
                const struct SEE_string *s);
struct SEE_string *SEE_string_concat(struct SEE_interpreter *interp,
                struct SEE_string *s1, struct SEE_string *s2);
int SEE_string_cmp(const struct SEE_string *s1,
                const struct SEE_string *s2);

⚠ Note: The SEE_string_toutf8() function does not check for null characters in the output. Consider using SEE_string_literal() to make the string well-formed before converting to UTF-8.

5.3.1 Internalised strings

If you find yourself comparing strings a lot, you may find it easier to compare internalised strings. These are strings that are kept in a fast hash table and may be tested for equality by comparing pointers. The SEE_intern() function returns an internalised copy of the given string and is very fast on strings that are already interned. It is worth using in lieu of SEE_string_cmp() if the strings are likely to be interned already. (For example, all property names in the standard library are.)

The function SEE_intern_ascii() is a convenience function that first converts the C string into a SEE_string before intern'ing. The C string must be an ASCII string terminated by a null character.

struct SEE_string *SEE_intern(struct SEE_interpreter *interp,
                struct SEE_string *s);
struct SEE_string *SEE_intern_ascii(struct SEE_interpreter *interp,
                const char *s);
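To make the idea of interning concrete, here is a toy intern table for ordinary C strings. It is only a sketch of the general technique and makes no claim about how SEE implements SEE_intern(); every name in it (intern_table, intern_entry, intern, INTERN_BUCKETS) is hypothetical. The point to notice is that two equal strings always come back as the same pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_BUCKETS 64

struct intern_entry {
        struct intern_entry *next;
        char *text;                     /* the canonical copy */
};

struct intern_table {
        struct intern_entry *bucket[INTERN_BUCKETS];
};

/* Simple multiplicative string hash, reduced to a bucket index */
static unsigned int
hash_text(const char *s)
{
        unsigned int h = 5381;

        while (*s)
                h = h * 33 + (unsigned char)*s++;
        return h % INTERN_BUCKETS;
}

/*
 * Return the table's canonical copy of s, adding one if needed.
 * Returns NULL if memory runs out.
 */
const char *
intern(struct intern_table *t, const char *s)
{
        struct intern_entry *e;
        unsigned int h;
        size_t size;

        assert(t != NULL && s != NULL);
        h = hash_text(s);
        for (e = t->bucket[h]; e != NULL; e = e->next)
                if (strcmp(e->text, s) == 0)
                        return e->text;         /* already interned */

        size = strlen(s) + 1;
        e = malloc(sizeof *e);
        if (e == NULL)
                return NULL;
        e->text = malloc(size);
        if (e->text == NULL) {
                free(e);
                return NULL;
        }
        memcpy(e->text, s, size);
        e->next = t->bucket[h];
        t->bucket[h] = e;
        return e->text;
}
```

Once two strings have been interned through the same table, comparing them costs a single pointer comparison, which is the property the text above relies on.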

5.3.2 Statically initialised strings

SEE supports statically initialised strings. If you have a large number of strings to create and use (e.g. properties and method names) over many interpreter instances, statically initialised strings can save space, and improve performance.

A statically initialised string, 'Hello, world', would look like this:

/* Example of a statically-initialised UTF-16 string */
static SEE_char_t hello_world_chars[12] = {
    'H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd'
};
static struct SEE_string hello_world = {
    12,                                                /* length */
    hello_world_chars                                  /* data */
};

The main problem with static strings is finding an elegant way to initialise the strings' content. There is no simple way in ANSI C to have the compiler convert common ASCII strings into UTF-16 arrays. The approach SEE takes internally, to support all the standard ECMAScript object property names, is to generate C program text from a file of ASCII strings during the build process.
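A host application could take a similar build-time approach. The sketch below is not SEE's generator; it is a hypothetical helper (emit_static_string, with its own small appendf routine) that writes C source text for one static string into a buffer, following the two-member initialiser layout of the 'Hello, world' example above. Characters are emitted as hexadecimal numbers so that quotes and backslashes need no escaping.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text to buf at offset *used. Returns 0, or -1 if full. */
static int
appendf(char *buf, size_t bufsize, size_t *used, const char *fmt, ...)
{
        va_list ap;
        int n;

        va_start(ap, fmt);
        n = vsnprintf(buf + *used, bufsize - *used, fmt, ap);
        va_end(ap);
        if (n < 0 || (size_t)n >= bufsize - *used)
                return -1;              /* output would not fit */
        *used += (size_t)n;
        return 0;
}

/*
 * Write C source for a static string called ident with content ascii.
 * Returns 0 on success, or -1 if the text is empty, contains a
 * non-ASCII byte, or does not fit in buf.
 */
int
emit_static_string(char *buf, size_t bufsize, const char *ident,
    const char *ascii)
{
        size_t i, len, used = 0;

        assert(buf != NULL && ident != NULL && ascii != NULL);
        len = strlen(ascii);
        if (len == 0 || bufsize == 0)
                return -1;
        if (appendf(buf, bufsize, &used,
            "static SEE_char_t %s_chars[%lu] = {",
            ident, (unsigned long)len) != 0)
                return -1;
        for (i = 0; i < len; i++) {
                unsigned char c = (unsigned char)ascii[i];

                if (c > 0x7f)
                        return -1;      /* not ASCII */
                if (appendf(buf, bufsize, &used, "%s 0x%02x",
                    i ? "," : "", c) != 0)
                        return -1;
        }
        return appendf(buf, bufsize, &used,
            " };\nstatic struct SEE_string %s = { %lu, %s_chars };\n",
            ident, (unsigned long)len, ident);
}
```

A small program built around this helper could read a file of names at build time and write the generated declarations into a header or source file for the host application to compile.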

If an application wishes to internalise strings across interpreters, it can add all its global strings into the global intern table before creating any interpreters. This is done by calling SEE_intern_global() for each string. Doing this can save a moderate amount of overhead, and can improve performance if the intern'ed string needs to be used often.

struct SEE_string * SEE_intern_global(const char *str);

⚠ Note: Prior to API 2.0, SEE_intern_global() had a very different signature. See §8.3.

6 Objects

ECMAScript uses a prototype-inheritance object model with simple named properties. More information on the object model can be found in the ECMA-262 standard, and in other JavaScript references.

This section describes how in-memory objects can be accessed and manipulated (the 'client interface'), and also how host applications can expose their own application objects and methods (the 'implementation interface').

Object instances are implemented as in-memory structures, with an objectclass pointer to a table of operational methods. Object references are held inside values with a type field of SEE_OBJECT (see §5).

If you want to create a plain object quickly from C, the convenience function SEE_Object_new() is the same as evaluating new Object().

struct SEE_object *SEE_Object_new(struct SEE_interpreter *interp);

6.1 Object values, and the object client interface

All object values are pointers to object instances. The pointers are of type struct SEE_object *. No object pointer in a SEE_value should ever be NULL. When I know that I am dealing with objects, I find it convenient to work with struct SEE_object * pointers directly instead of using struct SEE_value.

To use an object instance, you should interact with it using the following internal method macros:

void SEE_OBJECT_GET(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_string *prop,
                struct SEE_value *res);
void SEE_OBJECT_PUT(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_string *prop,
                struct SEE_value *res, int flags);
int SEE_OBJECT_CANPUT(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_string *prop);
int SEE_OBJECT_HASPROPERTY(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_string *prop);
int SEE_OBJECT_DELETE(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_string *prop);
void SEE_OBJECT_DEFAULTVALUE(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_value *hint,
                struct SEE_value *res);
void SEE_OBJECT_CONSTRUCT(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_object *thisobj,
                int argc, struct SEE_value **argv,
                struct SEE_value *res);
void SEE_OBJECT_CALL(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_object *thisobj,
                int argc, struct SEE_value **argv,
                struct SEE_value *res);
int SEE_OBJECT_HASINSTANCE(struct SEE_interpreter *interp,
                struct SEE_object *obj, struct SEE_value *instance);
struct SEE_enum *SEE_OBJECT_ENUMERATOR(struct SEE_interpreter *interp,
                struct SEE_object *obj);
void *SEE_OBJECT_GET_SEC_DOMAIN(struct SEE_interpreter *interp,
                struct SEE_object *obj);

Five of the macros above (CONSTRUCT, CALL, HASINSTANCE, ENUMERATOR and GET_SEC_DOMAIN) call optional internal methods, and do not check whether the object class has provided them. This means the macros may try to call through a NULL function pointer, which will cause an error. You can determine if the object's class provides the optional methods by using the following macros before you use one of the five listed above. These check macros return true if the method they check for is valid (i.e. they check that the function pointer in the object class is non-NULL):

int SEE_OBJECT_HAS_CALL(struct SEE_object *obj);
int SEE_OBJECT_HAS_CONSTRUCT(struct SEE_object *obj);
int SEE_OBJECT_HAS_HASINSTANCE(struct SEE_object *obj);
int SEE_OBJECT_HAS_ENUMERATOR(struct SEE_object *obj);
int SEE_OBJECT_HAS_GET_SEC_DOMAIN(struct SEE_object *obj);

Because it is frequently convenient to provide property names using ASCII C strings instead of with struct SEE_string pointers, the following convenience macros are provided. They are functionally identical to their counterparts, except that they convert their property name argument with SEE_intern_ascii():

void SEE_OBJECT_GETA(struct SEE_interpreter *interp,
                struct SEE_object *obj, const char *ascii_prop,
                struct SEE_value *res);
void SEE_OBJECT_PUTA(struct SEE_interpreter *interp,
                struct SEE_object *obj, const char *ascii_prop,
                struct SEE_value *res, int flags);
int SEE_OBJECT_CANPUTA(struct SEE_interpreter *interp,
                struct SEE_object *obj, const char *ascii_prop);
int SEE_OBJECT_HASPROPERTYA(struct SEE_interpreter *interp,
                struct SEE_object *obj, const char *ascii_prop);
int SEE_OBJECT_DELETEA(struct SEE_interpreter *interp,
                struct SEE_object *obj, const char *ascii_prop);

⚠ Note: The convenience macros SEE_OBJECT_*A were introduced in API 2.0.

When storing properties in an object with SEE_OBJECT_PUT(), a flags parameter is required. In normal operation, this flag should be supplied as zero, but when populating an object with its properties for the first time, the following bit flags can be used:

Flag Meaning
SEE_ATTR_READONLY Future assignments (puts) on this property will fail
SEE_ATTR_DONTENUM Enumerators will not list this property and will hide inherited prototype properties of the same name until this property is deleted. (see §6.2)
SEE_ATTR_DONTDELETE Future deletes on this property will fail

6.2 Property enumerators

A property enumerator is a mechanism for discovering the properties that an object contains. The language exercises this with its for (var v in ...) construct. The results of the enumeration need not be sorted, nor even be in the same order each time.

Calling SEE_OBJECT_ENUMERATOR() returns a newly created enumerator, which is a pointer to a struct SEE_enum. Once obtained, the following macro can be used to step through the enumerator:

struct SEE_string *SEE_ENUM_NEXT(struct SEE_interpreter *interp,
                struct SEE_enum *e, int *flags_return);

Enumerators can assume that the underlying object does not change during enumeration. A suggested strategy for a caller that does need to remove or add an object's properties while enumerating them is to first create a private list of its property names, ensuring that it has exhausted the enumerator before attempting to modify the object.

/* An example of enumerating properties on an object from C */
void
print_properties(struct SEE_interpreter *interp, struct SEE_object *obj)
{
        struct SEE_enum *enumerator;
        struct SEE_string *prop;

        /* Ignore objects that don't provide an enumerator */
        if (!SEE_OBJECT_HAS_ENUMERATOR(obj))
                return;

        enumerator = SEE_OBJECT_ENUMERATOR(interp, obj);
        while ((prop = SEE_ENUM_NEXT(interp, enumerator, NULL)) != NULL) {
                SEE_PrintString(interp, prop, stdout);
                printf("\n");
        }
}

6.3 The object implementation interface

When a host application wishes to expose its own 'host objects' to ECMAScript programs, it must use the object implementation API described in this section.

All SEE objects are in-memory structures starting with a struct SEE_object:

struct SEE_object {
        struct SEE_objectclass *objectclass;
        struct SEE_object *     Prototype;
};

Normally, this structure is part of a larger structure that maintains the object's private state. For example, native Number objects could be implemented with the following:

struct number_object {             /* example implementation of Number */
        struct SEE_object object;
        SEE_number_t      number;
};

Keeping the object part at the top of the number_object structure means that pointers of type struct number_object * can be cast to and from pointers of type struct SEE_object *. This is a general idiom: begin all host object structures with a field member of type struct SEE_object named object.
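The idiom can be tried in isolation with stand-in types. In the sketch below, demo_class, demo_object, counter_object, counter_init and to_counter are all hypothetical names that only mimic the shape of struct SEE_objectclass and struct SEE_object; nothing here is part of the SEE API. It also shows one way to check an object's class before casting down, which is useful whenever a pointer of unknown origin arrives.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for struct SEE_objectclass and struct SEE_object */
struct demo_class {
        const char *name;
};

struct demo_object {
        const struct demo_class *objectclass;
        struct demo_object      *prototype;
};

/* A host object: the generic part comes first, private state after */
struct counter_object {
        struct demo_object object;      /* must be the first member */
        int                count;
};

static const struct demo_class counter_class = { "Counter" };

void
counter_init(struct counter_object *c, int start)
{
        assert(c != NULL);
        c->object.objectclass = &counter_class;
        c->object.prototype = NULL;
        c->count = start;
}

/*
 * Recover the host structure from a generic object pointer.
 * Returns NULL if the object does not belong to counter_class,
 * so callers never reinterpret a foreign object's memory.
 */
struct counter_object *
to_counter(struct demo_object *o)
{
        if (o == NULL || o->objectclass != &counter_class)
                return NULL;
        return (struct counter_object *)o;
}
```

The cast is valid because C guarantees that a pointer to a structure, suitably converted, points to its first member, and vice versa.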

Although the ECMAScript language does not use classes per se, SEE's internal object implementation does use a class 'abstraction' to speed up execution and make implementation re-use easier. Each object has a field, object.objectclass, that must be initialised to point to a struct SEE_objectclass that provides the object's behaviour. The class structure looks like this:

struct SEE_objectclass {
        const char *            Class;          /* mandatory */
        SEE_get_fn_t            Get;            /* mandatory */
        SEE_put_fn_t            Put;            /* mandatory */
        SEE_boolean_fn_t        CanPut;         /* mandatory */
        SEE_boolean_fn_t        HasProperty;    /* mandatory */
        SEE_boolean_fn_t        Delete;         /* mandatory */
        SEE_default_fn_t        DefaultValue;   /* mandatory */
        SEE_enumerator_fn_t     enumerator;     /* optional */
        SEE_call_fn_t           Construct;      /* optional */
        SEE_call_fn_t           Call;           /* optional */
        SEE_hasinstance_fn_t    HasInstance;    /* optional */
        SEE_get_sec_domain_fn_t get_sec_domain; /* optional (API 2.0) */
};

⚠ Note: The type of the Class field was a struct SEE_string * in API 1.0.

The application generally provides this structure in static storage, as most of its members are function pointers or strings known at compile time. A member marked optional should be set to NULL if it is meaningless.
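The pattern of a static class table with NULL for unsupported optional methods can be sketched without SEE itself. The toy_ types, TOY_HAS_CALL and call_or_default below are hypothetical stand-ins, cut down to one mandatory and one optional method. They mirror the way the SEE_OBJECT_HAS_CALL() check is meant to guard SEE_OBJECT_CALL(), but are not the real definitions.

```c
#include <assert.h>
#include <stddef.h>

struct toy_object;

typedef int (*toy_get_fn_t)(struct toy_object *obj, const char *prop);
typedef int (*toy_call_fn_t)(struct toy_object *obj, int arg);

struct toy_class {
        const char    *Class;   /* mandatory */
        toy_get_fn_t   Get;     /* mandatory */
        toy_call_fn_t  Call;    /* optional: NULL if not callable */
};

struct toy_object {
        const struct toy_class *objectclass;
};

/* True if the object's class provides the optional Call method */
#define TOY_HAS_CALL(obj)  ((obj)->objectclass->Call != NULL)

static int
no_get(struct toy_object *obj, const char *prop)
{
        (void)obj;
        (void)prop;
        return 0;               /* this toy has no properties */
}

static int
doubler_call(struct toy_object *obj, int arg)
{
        (void)obj;
        return 2 * arg;
}

/* Class tables live in static storage, as suggested above */
static const struct toy_class point_class   = { "Point",   no_get, NULL };
static const struct toy_class doubler_class = { "Doubler", no_get, doubler_call };

/* Call the object if it is callable; otherwise return fallback */
int
call_or_default(struct toy_object *obj, int arg, int fallback)
{
        assert(obj != NULL && obj->objectclass != NULL);
        if (!TOY_HAS_CALL(obj))
                return fallback;
        return obj->objectclass->Call(obj, arg);
}
```

An object of point_class is reported as not callable and yields the fallback, while an object of doubler_class is dispatched through its Call pointer.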

The object methods marked mandatory (Get, Put, etc.) are never NULL, and should provide the precise behaviours that SEE expects on native objects. These behaviours are fully described in the ECMA-262 standard, and are summarised in the following table:

Field Behaviour
Class name of the class as revealed by toString()
Get retrieve a named property (or return undefined)
Put create/update a named property
Delete delete a property or return 0
HasProperty returns 0 if the property doesn't exist
CanPut returns 0 if the property cannot be changed
DefaultValue turns the object into a string or number value
enumerator allow enumeration of the properties (see above)
Construct constructs a new object; as per the new keyword
Call the object has been called as a function
HasInstance returns 0 if the objects are unrelated
get_sec_domain returns the security domain associated with functions

It is up to the host application to provide storage for the properties, and so forth. The simplest strategy is to ignore property calls to Put and Get that are meaningless. To this end, if the host object does not want to expend effort supporting some of the mandatory operations, it can use the corresponding 'do-nothing' function(s) from this list:

The Prototype field of an object instance can either be set to:

If you choose to use NULL, it is recommended you provide a toString() method (to help with debugging).

Once the host application has constructed its own objects that conform to the API, they can be inserted into the 'Global object' as object-valued properties.

The 'Global object' is an unnamed, top-level object whose sole purpose is to 'hold' all the built-in objects, such as Object, Function, Math, etc., as well as all user-declared global variables. The host application can access it through the Global member of the SEE_interpreter structure.

6.4 Native objects

SEE provides support for a special kind of object class called native objects. Native objects maintain a hash table of properties, and implement the mandatory methods (plus enumerator), and correctly observe the Prototype field.

struct SEE_native {
        struct SEE_object       object;
        struct SEE_property *   properties[SEE_NATIVE_HASHLEN];
};

An application can create host objects based on native objects. First, place a struct SEE_native at the beginning of a structure:

struct some_host_object {
        struct SEE_native       native;
        int                     host_specific_info;
};

Then, use the following object methods, either directly in the SEE_objectclass structure, or by calling them indirectly from method implementations:

It is very important that you initialize the native field when constructing your host object. Do this using the SEE_native_init() function.

void SEE_native_init(struct SEE_native *obj, struct SEE_interpreter *i,
                const struct SEE_objectclass *obj_class, 
                struct SEE_object *prototype);

6.5 C function objects

The host application will likely want user scripts to be able to call a C function directly. SEE supports this by wrapping C function pointers in 'cfunction' objects.

The convenience function SEE_cfunction_make() constructs an object whose Prototype field points to Function.prototype, and whose objectclass's Call method points to a given C function that contains the desired code.

The SEE_cfunction_make() function takes a pointer to the C function, a name, and an integer indicating the expected number of arguments. The integer becomes the function object's length property, which is advisory only.

struct SEE_object *SEE_cfunction_make(struct SEE_interpreter *interp,
                SEE_call_fn_t func, struct SEE_string *name, int argc);

⚠ Note: Objects returned by SEE_cfunction_make() should really only be used in the interpreter context in which they were created, but the current version of SEE does not check for this. (Because cfunction objects are essentially read-only after construction, and if memory allocation operates independently of the interpreters, sharing cfunction objects across interpreters will be OK, but it is not recommended for future portability.)

When attaching cfunctions to an object, you may find the SEE_CFUNCTION_PUTA() macro useful. It performs both the SEE_cfunction_make() and SEE_OBJECT_PUT() operations in one step. Its signature is:

void SEE_CFUNCTION_PUTA(struct SEE_interpreter *interp, 
                struct SEE_object *obj, const char *name,
                SEE_call_fn_t func, int length, int attr);

The C function must conform to the SEE_call_fn_t signature. This is demonstrated below, with math_sqrt(), which is the actual code behind the Math.sqrt object:

/* Implementation of Math.sqrt() method */
static void
math_sqrt(interp, self, thisobj, argc, argv, res)
        struct SEE_interpreter *interp;
        struct SEE_object *self, *thisobj;
        int argc;
        struct SEE_value **argv, *res;
{
        struct SEE_value v;

        if (argc == 0)
                SEE_SET_UNDEFINED(res);
        else {
                SEE_ToNumber(interp, argv[0], &v);
                SEE_SET_NUMBER(res, sqrt(v.u.number));
        }
}

The arguments to this function are described in the following table:

Argument Purpose
interp the current interpreter context
self a pointer to the object called (Math.sqrt here)
thisobj the this object (the Math object here)
argc number of arguments
argv array of value pointers, of length argc
res uninitialised value location in which to store the result

A common convention in all ECMAScript functions is that unspecified arguments should be treated as undefined, and extraneous arguments should just be ignored. If the function uses thisobj, it should check any assumptions made about it, especially if it is expected to be a host object. This is because method functions can easily be attached to other objects by user code.

When writing cfunctions, you can use the SEE_parse_args() convenience function to make argument processing easier. This function takes a format string and converts arguments according to the table below. It can throw a TypeError exception if a conversion error occurs.

void SEE_parse_args(struct SEE_interpreter *interp,
        int argc, struct SEE_value **argv, const char *fmt, ...);

Format  Parameter type         Conversion applied                                         Result when undefined
a       char **                SEE_ToString(), then into an ASCII C string                "undefined"
A       char **                NULL if undefined, otherwise the same as format 'a'        NULL
b       int *                  SEE_ToBoolean()                                            0
h       SEE_uint16_t *         SEE_ToUint16()                                             0
i       SEE_int32_t *          SEE_ToInt32()                                              0
n       SEE_number_t *         SEE_ToNumber()                                             SEE_NaN
o       struct SEE_object **   SEE_ToObject()                                             TypeError
O       struct SEE_object **   NULL if undefined or null, otherwise the same as format 'o' NULL
p       struct SEE_value *     SEE_ToPrimitive()                                          undefined
s       struct SEE_string **   SEE_ToString()                                             "undefined"
u       SEE_uint32_t *         SEE_ToUint32()                                             0
v       struct SEE_value *     the argument is copied without conversion                  undefined
x                              the argument is ignored
z       char **                SEE_ToString(), then into a UTF8 C string                  "undefined"
Z       char **                NULL if undefined, otherwise the same as format 'z'        NULL
|                              optional argument marker
.                              throws a TypeError on further arguments
space                          space character is ignored

The optional argument marker ('|') disables storing a result when the argument is undefined or not provided by the caller. This allows storage to be initialised to default values.

The 'a' and 'A' formats will throw a TypeError if the string contains non-ASCII characters. The 'a', 'A', 'z' and 'Z' formats will throw an error if the resulting string would contain a null character.

An example of using SEE_parse_args():

/* Possible implementation of Math.sqrt() method */
static void
math_sqrt_possible(interp, self, thisobj, argc, argv, res)
        struct SEE_interpreter *interp;
        struct SEE_object *self, *thisobj;
        int argc;
        struct SEE_value **argv, *res;
{
        SEE_number_t n;

        SEE_parse_args(interp, argc, argv, "n", &n);
        SEE_SET_NUMBER(res, sqrt(n));
}

6.6 User function objects

Occasionally, a host application will wish to take some user text and create a callable function object from it. An example of this problem is in attaching the JavaScript code from HTML attributes onto form elements of a web page. One way to achieve this is to invoke the Function constructor object with the SEE_OBJECT_CONSTRUCT() macro, passing it the formal arguments text and body text as arguments. (See the ECMAScript standard for details on the Function constructor.)

Another way, that is more convenient if the user text is available as an input stream, is to use the SEE_Function_new() function:

struct SEE_object *SEE_Function_new(struct SEE_interpreter *interp, 
                struct SEE_string *name, struct SEE_input *param_input, 
                struct SEE_input *body_input);

where any of the name, param_input and body_input parameters may be NULL (indicating that the empty string should be used).

The returned function object may be called with the SEE_OBJECT_CALL() macro.

6.7 Errors and Error objects

Host applications sometimes need to convey errors to ECMAScript programs. Errors in ECMAScript are typically indicated by throwing an exception with an object value. The thrown objects conventionally have Error.prototype somewhere in their prototype chain, and provide message and name properties, which Error.prototype reads to generate a human-readable error message.

Host applications can conveniently construct and throw error exceptions using the following macros:

void SEE_error_throw(struct SEE_interpreter *interp,
                struct SEE_object *error_constructor,
                const char *fmt, ...);
void SEE_error_throw_string(struct SEE_interpreter *interp, 
                struct SEE_object *error_constructor,
                struct SEE_string *string);
void SEE_error_throw_sys(struct SEE_interpreter *interp,
                struct SEE_object *error_constructor,
                const char *fmt, ...);

These convenience macros construct a new error object, and throw it as an exception using SEE_THROW(). The object thrown is given a message string property that reflects the rest of the arguments provided to the called macro. The SEE_error_throw_sys() macro works like SEE_error_throw() but appends a textual description of errno using strerror().

The error_constructor argument should be one of the error constructor objects found in the SEE_interpreter structure:

Member Meaning
Error runtime error
EvalError error in eval()
RangeError numeric argument has exceeded allowable range
ReferenceError invalid reference was detected
SyntaxError parsing error
TypeError actual type of an operand different to that expected
URIError error in a global URI handling function

A simple example:

if (something_is_wrong)
        SEE_error_throw(interp, interp->Error, "something is wrong!");

Although Error is usually sufficient for most errors, host applications can create their own error constructor object with the SEE_Error_make() convenience function. Only one constructor of the same name should be created per interpreter.

struct SEE_object *SEE_Error_make(struct SEE_interpreter *interp,
                struct SEE_string *name);

7 Modules

SEE provides a module abstraction for host implementations that want a structured approach to adding their objects into a SEE interpreter.

⚠ Note: The module abstraction was introduced in API 2.0.

A struct SEE_module is a collection of functions that are automatically called by SEE at various stages of each interpreter's initialisation. The module may initialise and insert its own objects into each interpreter before user scripts can be run.

struct SEE_module {
        SEE_uint32_t      magic;
        const char       *name;
        const char       *version;
        unsigned int      index;        /* Set by SEE_module_add() */
        int             (*mod_init)(void);
        void            (*alloc)(struct SEE_interpreter *);
        void            (*init)(struct SEE_interpreter *);
};

The magic field must be initialised to the constant value SEE_MODULE_MAGIC. The name field is currently unused, but should consist of a short, unique identifier corresponding to the name of the module. The version field is currently unused, and should be set to NULL. The index field is set by the SEE_module_add() function as the module is added to SEE. Each added module is given a unique index. Do not change the index.

The mod_init function pointer is called as soon as the module is loaded (by SEE_module_add()). This is an opportunity for a module to obtain pointers to globally interned strings. (See SEE_intern_global() in §5.3.2.) The mod_init function is expected to return zero to indicate a successful initialisation. This pointer may be set to NULL if unneeded.

The alloc function is called after built-in objects have been allocated but before any other modules or built-in objects have been initialised. It is dangerous to make use of the interpreter at this stage. The main use of the alloc function pointer is to allow circular dependencies between modules. For most modules, the alloc pointer can be left as NULL.

The init function is called after all built-in objects and modules have been allocated, and after all built-in objects have been initialised. It is safe to make use of the interpreter built-ins at this stage, but not to make use of other modules. For most modules, the init function is the place to insert newly-created host objects into a pristine interp->Global.

A pointer to your module structure must be passed to SEE_module_add() before any interpreters are created. It is not possible to dynamically add modules once interpreters have been created because of the way the global intern table is managed. (If your module does not modify the global intern table, and your system is single-threaded, then you may be able to add the module dynamically.)

Finally, per-interpreter private storage for each module is provided through the SEE_MODULE_PRIVATE() macro. This macro evaluates to a void * lvalue that may be assigned dynamic storage during alloc.

const SEE_uint32_t SEE_MODULE_MAGIC;
int SEE_module_add(struct SEE_module *module);
void *SEE_MODULE_PRIVATE(struct SEE_interpreter *, struct SEE_module *);

The SEE_module_add() function adds a module to the global list of modules initialised whenever a new interpreter is constructed. This function returns zero if the module was added successfully. It returns -1 if an internal error occurred. Otherwise it returns the same non-zero value that the module's mod_init function hook returned. Once added, a module cannot be removed.

The interested reader is referred to the mod_File.c module example under the shell directory of the SEE source code.

8 Compatibility features

8.1 Compatibility with other JavaScript implementations

SEE provides backward-compatibility with earlier versions of JavaScript and JScript. These features ought never be used, since JavaScript program authors should be mindful of standards. Nevertheless, this section documents the compatibility modes that SEE supplies.

The behaviour of the SEE library is modified on a per-interpreter basis, by passing special flags to a variant of the interpreter's initialisation routine, SEE_interpreter_init_compat(). This function otherwise behaves just like SEE_interpreter_init() (see §2).

void SEE_interpreter_init_compat(struct SEE_interpreter *interp,
                int flags);

The flags parameter is a bitwise OR of the constants described in the following table.

⚠ Note: API 2.0 removed the SEE_COMPAT_UNDEFDEF flag and introduced the SEE_COMPAT_JSxx flags.

Flag Behaviour
SEE_COMPAT_STRICT This is not really a flag. It is defined as zero, and can be used when no compatibility flags are wanted. SEE will operate in its default ECMA compliance mode.
SEE_COMPAT_UTF_UNSAFE Treats overlong UTF-8 encodings as valid Unicode characters. You should never need this.
SEE_COMPAT_262_3B Enables optional features from Appendix B of ECMA-262 ed3, namely:
  • defines Date.prototype.toGMTString(), equivalent to toUTCString()
  • defines Date.prototype.getYear() and Date.prototype.setYear()
  • defines escape() and unescape() in the global object
  • defines String.prototype.substr()
SEE_COMPAT_SGMLCOM This flag makes the lexical analyser stage treat the 4-character sequence '<!--' as if it were the '//' comment introducer. This is useful when parsing HTML SCRIPT elements.
SEE_COMPAT_JS11 Enables JavaScript 1.1 compatibility:
  • The string representation of a bad date (e.g. String(new Date(NaN))) is returned as the string "Invalid Date", instead of "NaN".
  • Calling Date as a constructor or function will recognise Netscape-style date strings of which one form is '1/1/1999 12:30 AM'.
  • Conversions from a date to a string will include the timezone.
  • Native objects synthesize a property called __proto__ with the same value as the internal [[Prototype]] property (or null). Assignments to __proto__ are accepted if they don't cause a cycle. [EXT:7 + EXT:8]
  • Invalid \u or \x escapes will treat the leading \u or \x as a single-letter escape. This would be a lexical analyser error under ECMA.
  • Regular expression instances have a [[Call]] property, which is essentially equivalent to RegExp.prototype.exec(). This has the side-effect of changing what the typeof operator returns when applied to regular expression instances.
  • The String.prototype object is given the following methods that simply return the string, and ignore their arguments: anchor, big, blink, bold, fixed, fontcolor, fontsize, italics, link, small, strike, sub and sup. The substr method is also added.
  • Enumerating over properties is done in a sorted fashion. During sort, property names are ordered arithmetically if they are suitable as array indices, otherwise they are ordered lexicographically. [EXT:1]
  • SEE's lexical analyser will recognise octal integers (i.e. integers starting with '0') and will fall back to decimal if the token contains a non-octal digit. Same with parseInt(). [EXT:4 + EXT:18]
  • Coercing native values that do not have a [[DefaultValue]] internal property will return an object-unique string, instead of throwing a TypeError. [EXT:5 + EXT:6]
  • Function.prototype will not have a prototype property of its own. [EXT:9]
  • Function.prototype.toString() applied to built-in functions and constructors (which are not function instances) will return a bogus do-nothing FunctionDeclaration instead of throwing a TypeError. [EXT:13]
  • The global object has its [[Prototype]] property set to Object.prototype, effectively making all its properties available to the global scope, but having the good side effect of allowing toString() to work anywhere. [EXT:17]
  • Calling eval() with a this different to the global object executes its contents with the scope and variable object always set to this (instead of inheriting the caller's context as per s10.2.2 of the standard). [EXT:23]
  • Native functions assign themselves an arguments property when called, so that the old idiom of using f.arguments inside the function f will work. [EXT:2 + EXT:11 + EXT:12]
  • The system-generated arguments object created inside a function has a default-value (a comma-separated string representing the arguments), instead of raising a TypeError. The upshot of this is that arguments can be coerced into a string. [EXT:14]
  • Reserved words can be used as identifiers (with a warning message) [EXT:3]
  • Invalid quantifiers in regular expressions (e.g. /a{12x}/) are treated as literals instead of raising a SyntaxError. [EXT:24]
  • RegExp supports the [[HasInstance]] operator, meaning that /x*/ instanceof RegExp will work. [EXT:20]
  • Change the token grammar to recognise simple character classes in literal regular expressions. This means the expression /[/]/ will be parsed as if it were /[\/]/. [EXT:15]
  • In regex character classes, three digit octal escape sequences beginning with \0 are understood. [EXT:25]
  • Regex execution leaves results in the RegExp object, in the properties $1 ... $9, $_, $*, $+, $`, $', global, ignoreCase, input, lastIndex, lastMatch, lastParen, leftContext, multiline, rightContext, and source. [EXT:21]
  • The global function escape() returns uppercase hex digits, instead of lowercase. [EXT:19]
  • Array.join(undefined) uses the string 'undefined' as the join string instead of ','. However, when called without arguments, it will still use ','. Also, String.split(undefined) will work in a corresponding reverse way. [EXT:16]
SEE_COMPAT_JS12 Enables JavaScript 1.2 compatibility:
  • Includes all JavaScript 1.1 compatibility behaviour.
  • The constructor new Array() when given a single numeric argument will return an array consisting of just that argument. e.g. new Array(3) is the same as [3]. This differs from the ECMA standard where an array of length 3 would be created, viz [undefined,undefined,undefined]. (JavaScript 1.2 only.)
  • Boolean instance objects are converted to their logical value in expressions that are converted to boolean. This differs from the ECMA standard where all object instances are to be converted to true. For example, the condition in the statement if (new Boolean(false)) statement, will be evaluated as true in ECMA, and false by JavaScript 1.2. (JavaScript 1.2 only.)
  • Array.prototype.toString() and Object.prototype.toString() return string forms in literal notation, e.g. "[1,2,3]", and "{a:1, b:2}". (JavaScript 1.2 only.)
  • String.prototype.split() when operating on the empty string "" will return an empty array [] instead of an array consisting of the empty string, [""]. (JavaScript 1.2 only.)
  • String.prototype.split(), when given a delimiter of exactly one space character (" "), will strip the leading whitespace from the string, and then split on /\s+/. (JavaScript 1.2 only.)
  • The Number() function when applied to an array will return the array length, instead of NaN. (JavaScript 1.2 only.)
SEE_COMPAT_JS13 Enables JavaScript 1.3 compatibility:
  • Includes JavaScript 1.1—1.2 compatibility behaviour.
SEE_COMPAT_JS14 Enables JavaScript 1.4 compatibility:
  • Includes JavaScript 1.1—1.3 compatibility behaviour.
SEE_COMPAT_JS15 Enables JavaScript 1.5 compatibility:
  • Includes JavaScript 1.1—1.4 compatibility behaviour.
  • Permit conditional function declarations of the form if (1) function foo (args) { body; }. The statement becomes syntactically identical to if (1) foo = function foo (args) { body; }.
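
The octal fallback rule listed under SEE_COMPAT_JS11 can be illustrated outside the library. The following is a minimal sketch in plain C of that rule as worded above, not SEE's lexical analyser; the function name legacy_int_value is hypothetical, and it handles only unsigned strings of decimal digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of SEE: returns the value of a string
 * of decimal digits under the JavaScript 1.1 rule described above.
 * A leading '0' selects octal, unless a digit 8 or 9 appears, in which
 * case the whole token is read as decimal.
 */
static unsigned long
legacy_int_value(const char *digits)
{
	unsigned long value = 0;
	unsigned int base = 10;
	size_t i;

	assert(digits != NULL && digits[0] != '\0');
	if (digits[0] == '0') {
		base = 8;
		for (i = 0; digits[i] != '\0'; i++)
			if (digits[i] == '8' || digits[i] == '9')
				base = 10;	/* non-octal digit: fall back */
	}
	for (i = 0; digits[i] != '\0'; i++) {
		assert(isdigit((unsigned char)digits[i]));
		value = value * base + (unsigned long)(digits[i] - '0');
	}
	return value;
}
```

Under this rule legacy_int_value("017") gives 15, while legacy_int_value("019") gives 19.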

SEE now always optimises empty function calls by skipping the expensive process of extending the scope chain, creating an arguments property, etc. and just synthesizing undefined for the call, instead. This was an optional optimisation prior to SEE 2.0, but is now always in effect.

8.2 Compatibility with future versions of SEE

As distributed, SEE has two different version numbers:

  1. the package, release and shared library version number (e.g. 2.0)
  2. the API version number (e.g. 1.0)

The library version is available to programs to query through the SEE_version() function. This function returns a pointer to a static C string containing identifiers separated by a space character (0x20). The first identifier is the name of the library (e.g. "see") and the second identifier is the package version number (e.g. 2.0). Further identifiers indicate the features used when compiling the library. This string is useful for end users to determine what capabilities their library implementation has.

const char *SEE_version(void);
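
Since the string is a space-separated list, a host application can pick out individual identifiers with ordinary string handling. The sketch below is one way to do that; version_field is a hypothetical helper, not a SEE function, and it is written against a plain C string so that it can be tried without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of SEE: copies the index'th
 * space-separated identifier (counting from 0) of a version string
 * into buf. Returns 1 on success, or 0 if there is no such identifier
 * or it does not fit in buf.
 */
static int
version_field(const char *version, unsigned int index, char *buf, size_t buflen)
{
	const char *start = version;
	const char *end;
	size_t len;

	assert(version != NULL && buf != NULL);
	while (index-- > 0) {
		start = strchr(start, ' ');
		if (start == NULL)
			return 0;	/* ran out of identifiers */
		start++;		/* skip the separator */
	}
	end = strchr(start, ' ');
	len = end ? (size_t)(end - start) : strlen(start);
	if (len == 0 || len >= buflen)
		return 0;
	memcpy(buf, start, len);
	buf[len] = '\0';
	return 1;
}
```

With the library linked, version_field(SEE_version(), 1, buf, sizeof buf) would be expected to place the package version (e.g. 2.0) in buf.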

The major and minor API version numbers indicate backward-incompatible and backward-compatible changes, respectively, to the API, i.e. the interface described in this documentation and the header files. The API version number is independent of the package and library version number.

Practically, developers should use the following code to signal the case of compiling against a future version of SEE with an API that isn't backward compatible with this document.

#define DESIRED_SEE_API_MAJOR 2
#if SEE_VERSION_API_MAJOR > DESIRED_SEE_API_MAJOR
 #warning "SEE API major version is newer than DESIRED_SEE_API_MAJOR"
#endif

The rules I use for versioning future APIs are:

This document will indicate at what API version new API elements are added, defaulting to 1.0.

const int SEE_VERSION_API_MAJOR;
const int SEE_VERSION_API_MINOR;
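
Because the API version is visible to the preprocessor, a host application can also select code paths at compile time. The following sketch relies only on what this document states, namely that the sec_domain fields exist from API 2.0 onwards. The HOST_ and host_ names are hypothetical, and the fallback #define exists solely so that the fragment compiles without the SEE headers.

```c
/* Stand-in so this sketch compiles on its own; real code would get the
 * macro from the SEE headers and should delete this block. */
#ifndef SEE_VERSION_API_MAJOR
# define SEE_VERSION_API_MAJOR 2
#endif

/* Hypothetical host-side feature switch */
#if SEE_VERSION_API_MAJOR >= 2
# define HOST_USE_SEC_DOMAIN 1	/* sec_domain fields exist (API 2.0) */
#else
# define HOST_USE_SEC_DOMAIN 0	/* API 1.0: no security framework */
#endif

/* Hypothetical helper reporting which code path was compiled in */
static int
host_uses_sec_domain(void)
{
	return HOST_USE_SEC_DOMAIN;
}
```

Code that touches interp->sec_domain can then be wrapped in #if HOST_USE_SEC_DOMAIN so that the same source still builds against API 1.0.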

8.3 Porting from API 1.0 to API 2.0

Applications written using SEE-1.3.1 (API 1.0) can be changed to compile against SEE-2.0 (API 2.0). If you want, both APIs can be supported by testing the macro SEE_VERSION_API_MAJOR. The following list indicates the differences and steps to take when porting API 1.0 code to API 2.0:

9 Security

The SEE library provides a simple framework for the host application to manage security contexts for scripts and host functions.

⚠ Note: The SEE_interpreter.sec_domain, SEE_objectclass.get_sec_domain, and SEE_system.transit_sec_domain fields were introduced in API 2.0.

The host application manages the 'current' security domain by setting the interpreter's sec_domain field. During execution, when callable objects are created, they inherit the value of the interpreter's sec_domain field. When an object is called, the host application is given the opportunity of changing the current security domain.

Just before an object is called (either through the SEE_OBJECT_CALL() or SEE_OBJECT_CONSTRUCT() macro), SEE takes the following steps:

  1. If SEE_system.transit_sec_domain is NULL, then no security domain modification occurs, and execution continues; otherwise
  2. If the object's class does not provide a get_sec_domain() method, then no modification occurs, and execution continues; otherwise
  3. The SEE_OBJECT_GET_SEC_DOMAIN() macro is called to obtain the function's security domain.
  4. If the function's security domain is identical to the current security domain (pointer comparison), then no modification occurs, and execution continues; otherwise
  5. SEE calls the SEE_system.transit_sec_domain() function, which is expected to change the sec_domain field of the interpreter if needed.
  6. Execution continues. Whether the object's call completes successfully or an exception occurs during the call, the old value of sec_domain is restored to the interpreter before the return value or exception is propagated.
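
The steps above can be modelled in a few lines of C. The sketch below is not SEE's source code: the model_ and demo_ types and functions are hypothetical stand-ins for the interpreter, object class and system structures, and the callback signatures are assumptions made for illustration. It demonstrates only the order of the checks and the restore in step 6; the real library must also restore the field when an exception unwinds through the call.

```c
#include <assert.h>
#include <stddef.h>

struct model_interp { void *sec_domain; };
struct model_object;
struct model_class {
	void *(*get_sec_domain)(struct model_interp *, struct model_object *);
	void (*call)(struct model_interp *, struct model_object *);
};
struct model_object {
	const struct model_class *cls;
	void *sec_domain;	/* recorded when the object was created */
};

/* Stand-in for SEE_system.transit_sec_domain; NULL means "not in use" */
static void (*model_transit_sec_domain)(struct model_interp *, void *);

static void
model_object_call(struct model_interp *interp, struct model_object *obj)
{
	void *saved, *callee_domain;

	assert(interp != NULL && obj != NULL && obj->cls->call != NULL);
	saved = interp->sec_domain;
	if (model_transit_sec_domain != NULL &&		/* step 1 */
	    obj->cls->get_sec_domain != NULL) {		/* step 2 */
		callee_domain = obj->cls->get_sec_domain(interp, obj); /* 3 */
		if (callee_domain != interp->sec_domain)	/* step 4 */
			model_transit_sec_domain(interp, callee_domain); /* 5 */
	}
	obj->cls->call(interp, obj);
	interp->sec_domain = saved;			/* step 6 */
}

/* Example callbacks for the model */
static void *
demo_get_sec_domain(struct model_interp *interp, struct model_object *obj)
{
	(void)interp;
	return obj->sec_domain;
}

static void
demo_transit(struct model_interp *interp, void *callee_domain)
{
	interp->sec_domain = callee_domain;	/* simplest policy: adopt it */
}

static void *demo_seen_domain;	/* domain observed while the call runs */

static void
demo_call(struct model_interp *interp, struct model_object *obj)
{
	(void)obj;
	demo_seen_domain = interp->sec_domain;
}
```

In this model a call made while model_transit_sec_domain is NULL runs in the caller's domain, and in every case the caller's domain is back in place once model_object_call() returns.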

The interpreter's sec_domain field is initially set to NULL by SEE_interpreter_init(). Consequently, all built-in SEE function objects constructed during initialisation will have a security domain of NULL. This also applies to modules, unless they explicitly change the current security context during their init() handler.

The following functions (amongst others) are sensitive to the interpreter's sec_domain field and will record it in any callable objects they produce:

Apart from the SEE_OBJECT_CALL() and SEE_OBJECT_CONSTRUCT() macros, which only restore the sec_domain, no other function in the SEE library changes the interpreter's sec_domain value.

⚠ Note: If the interpreter's sec_domain field is somehow changed without restoration during an inner function, it will eventually be restored to its original value as the function returns. This is simply a consequence of a caller invoking SEE_OBJECT_CALL() or SEE_OBJECT_CONSTRUCT(). You should not rely on this side-effect.

9.1 Guidelines for using the security framework

If you plan to make use of SEE's security framework for foreign code, I recommend you adopt the principals model used by both Java and Mozilla, by following these guidelines:

  1. Design a domain structure that represents a set of principals. A principal is a named entity which the user would recognise; e.g. another user, or a web site. You should provide functions for checking if a principal is in a set, and also to compute the intersection of two sets. You should also provide an 'empty' set constant which is not the NULL pointer.
  2. Design a structure to hold authorization statements depending on the way your user will express them. These statements relate permissions to individual principals. These statements could be prompted for dynamically, or might be read from an authorization file. For example, the user may want to express the authorization statement
    all principals matching foo@bar.com are granted FileRead permission on resources matching /public/*
  3. Write a function checkPermission(interp, permission, resource) that searches the authorization statements and throws an exception if the current domain (the principal set stored in interp->sec_domain) does not have the required permission. Ensure this function is called in the sensitive places of your code.
  4. Implement the domain transition function SEE_system.transit_sec_domain() so that it efficiently computes the intersection of the current and the callee's security domain and sets it to be the current domain. Note that SEE's built-in functions will have a NULL domain and can be treated as completely trusted; that is, a NULL domain makes no change to the current security domain.
  5. When your application receives source text from a trusted or untrusted source, then before calling SEE_Global_eval(), SEE_Function_new(), or even SEE_cfunction_make(), first set the interpreter's sec_domain field to reflect exactly the principal that controls the source. Be mindful to restore the security domain when you have finished, especially during exceptions, like so:
    void
    eval_in_domain(interp, input, input_sec_domain, result)
    	struct SEE_interpreter *interp;
    	struct SEE_input *input;
    	void *input_sec_domain;
    	struct SEE_value *result;
    {
    	SEE_try_context_t c;
    	void *saved_sec_domain;
    
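    	/* Switch to the domain that controls the input text */
    	saved_sec_domain = interp->sec_domain;
    	interp->sec_domain = input_sec_domain;
    	SEE_TRY(interp, c) {
    		SEE_Global_eval(interp, input, result);
    	}
    	/* Reached whether or not an exception was caught */
    	interp->sec_domain = saved_sec_domain;
    	/* Propagate any exception caught by SEE_TRY */
    	SEE_DEFAULT_CATCH(interp, c);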
    }
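
Guidelines 1 and 4 could be sketched as follows. This is only one possible design, under several assumptions: principals are represented as bits in a mask (which limits the number of principals), a NULL current domain is treated as containing every principal, all host_ names are hypothetical, and host_transit takes a stand-in interpreter structure rather than the real struct SEE_interpreter. The signature actually expected for SEE_system.transit_sec_domain should be taken from the SEE header files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Guideline 1: a domain is a set of principals, one bit per principal */
struct host_domain { unsigned long principals; };

/* The 'empty' set is a real object, distinct from the NULL pointer */
static struct host_domain host_domain_empty = { 0UL };

/* Membership test: is the numbered principal in the set? */
static int
host_domain_has(const struct host_domain *d, unsigned int principal)
{
	assert(d != NULL);
	return (int)((d->principals >> principal) & 1UL);
}

struct host_interp { void *sec_domain; };  /* stand-in for the interpreter */

/* Guideline 4: make the current domain the intersection of both sets */
static void
host_transit(struct host_interp *interp, void *callee_domain)
{
	struct host_domain *cur = interp->sec_domain;
	struct host_domain *callee = callee_domain;
	struct host_domain *result;
	unsigned long both;

	if (callee == NULL)
		return;			/* trusted callee: no change */
	if (cur == NULL) {		/* assumption: NULL means "everything" */
		interp->sec_domain = callee;
		return;
	}
	both = cur->principals & callee->principals;
	if (both == cur->principals)
		return;			/* nothing lost: keep the current set */
	if (both == callee->principals)
		result = callee;	/* reuse instead of allocating */
	else if (both == 0UL)
		result = &host_domain_empty;
	else {
		/* A host using SEE might allocate with SEE_NEW instead */
		result = malloc(sizeof *result);
		if (result == NULL)
			result = &host_domain_empty;	/* fail closed */
		else
			result->principals = both;
	}
	interp->sec_domain = result;
}
```

The early returns avoid allocating a new set in the common cases where the intersection equals one of its operands, which is one way to meet the efficiency requirement in guideline 4. A checkPermission() function as in guideline 3 would then consult host_domain_has() for each principal named in the authorization statements.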

10 Debugging facilities

10.1 Debugging the SEE library (for developers)

The SEE library contains various debugging facilities that are omitted if it is compiled with the NDEBUG preprocessor define.

These functions are intended for the developer to use while debugging an application, and not for general use.

void SEE_PrintValue(struct SEE_interpreter *interp, 
                struct SEE_value *val, FILE *file);
void SEE_PrintObject(struct SEE_interpreter *interp, 
                struct SEE_object *obj, FILE *file);
void SEE_PrintString(struct SEE_interpreter *interp, 
                struct SEE_string *str, FILE *file);
void SEE_PrintTraceback(struct SEE_interpreter *interp, 
                FILE *file);

If debugging the library itself, it is worth reading the source code to find the debug flag variables that can be turned on by the host application to enable verbose traces during execution.

Defining the NDEBUG preprocessor symbol when building the library also disables (slow) internal assertions that would otherwise help show up application misuse of the API.

When using gdb on Unix, you can save a lot of heartache by using libtool to invoke it. Libtool knows what to do.

$ ./libtool --mode=execute gdb shell/see-shell

10.2 Debugging scripts (for users)

The SEE library does not contain a script debugger; however, it does provide an interpreter hook for external debuggers, and the see-shell example tool contains an example of using it.

The interpreter structure provides a trace callback field, which is called on certain events during execution (function call, return, throw or statement). The callback is also passed a handle to the current execution context (a struct SEE_context *), and an external debugger may examine it directly, or indirectly via the SEE_context_eval() utility function, which is otherwise functionally identical to SEE_Global_eval(). SEE_context_eval() is intended only for use by external debuggers attached to the trace callback.

⚠ Note: During a debugger's execution, the trace callback should be disabled by setting it to NULL, otherwise re-entrant tracing can occur.

⚠ Note: The signature and frequency of calling the trace hook changed between API 1.0 and API 2.0.

void SEE_context_eval(struct SEE_context *context,
                struct SEE_string *expr, struct SEE_value *res);
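
The advice about disabling the trace callback can be illustrated with a small model. The sketch below does not use the real hook signature, which should be taken from the SEE header files and which differs between API 1.0 and API 2.0; model_interp, its trace field and the debugger functions are hypothetical stand-ins. It shows only the save, clear and restore pattern around the debugger's own work.

```c
#include <assert.h>
#include <stddef.h>

struct model_interp;
typedef void (*model_trace_fn)(struct model_interp *);

struct model_interp {
	model_trace_fn trace;	/* stand-in for the interpreter's trace hook */
	int events;		/* number of trace events the debugger handled */
};

/* Stand-in for the engine reporting an event (call, return, ...) */
static void
model_fire_event(struct model_interp *interp)
{
	if (interp->trace != NULL)
		interp->trace(interp);
}

/* Hypothetical debugger callback */
static void
debugger_trace(struct model_interp *interp)
{
	model_trace_fn saved;

	assert(interp != NULL);
	saved = interp->trace;
	interp->trace = NULL;		/* disable tracing while we work */
	interp->events++;
	/* Debugger work that itself runs script code, such as evaluating
	 * an expression, would generate further events: */
	model_fire_event(interp);	/* ignored, because trace is NULL */
	interp->trace = saved;		/* re-enable on the way out */
}
```

Without the interp->trace = NULL line, the nested event would call debugger_trace() again from inside itself, and the recursion would never end.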

If see-shell is invoked with the -g option, then immediately before it executes the first script, it will prompt the user for debugging operations.

Commands available at the '%' prompt include:

break [filename:]lineno   add a new breakpoint
show                      show current breakpoints
delete number             delete existing breakpoint
step                      step to a new line
cont                      continue execution
where                     print traceback information
info                      print context information
eval expr                 evaluate and print an expression in the current context
throw expr                throw an exception

References

Name index

SEE_ABORT
SEE_ALLOCA
SEE_CFUNCTION_PUTA (2.0)
SEE_CAUGHT
SEE_cfunction_make
SEE_context_eval
SEE_DEFAULT_CATCH
SEE_ENUM_NEXT
SEE_Error_make
SEE_error_throw
SEE_error_throw_string
SEE_error_throw_sys
SEE_Function_new
SEE_gcollect (2.0)
SEE_Global_eval
SEE_Infinity
SEE_input struct
SEE_inputclass struct
SEE_INPUT_CLOSE
SEE_input_file
SEE_INPUT_NEXT
SEE_input_string
SEE_input_utf8
SEE_intern
SEE_intern_ascii (2.0)
SEE_intern_global (2.0*)
SEE_interpreter_init
SEE_interpreter_init_compat
SEE_malloc
SEE_malloc_string
SEE_malloc_finalize (2.0)
SEE_mem_exhausted_hook
SEE_mem_free_hook
SEE_mem_malloc_hook
SEE_mem_malloc_string_hook
SEE_module struct (2.0)
SEE_module_add (2.0)
SEE_MODULE_MAGIC (2.0)
SEE_MODULE_PRIVATE (2.0)
SEE_NaN
SEE_native struct
SEE_native_init
SEE_NEW
SEE_NEW_ARRAY
SEE_NEW_FINALIZE (2.0)
SEE_NEW_STRING_ARRAY
SEE_NUMBER_ISFINITE
SEE_NUMBER_ISINF
SEE_NUMBER_ISNAN
SEE_NUMBER_ISNINF
SEE_NUMBER_ISPINF
SEE_object struct
SEE_objectclass struct
SEE_OBJECT_CALL
SEE_OBJECT_CANPUT
SEE_OBJECT_CANPUTA (2.0)
SEE_OBJECT_CONSTRUCT
SEE_OBJECT_DEFAULTVALUE
SEE_OBJECT_DELETE
SEE_OBJECT_DELETEA (2.0)
SEE_OBJECT_ENUMERATOR
SEE_OBJECT_GET
SEE_OBJECT_GETA (2.0)
SEE_OBJECT_GET_SEC_DOMAIN (2.0)
SEE_OBJECT_HASINSTANCE
SEE_OBJECT_HASPROPERTY
SEE_OBJECT_HASPROPERTYA (2.0)
SEE_OBJECT_HAS_CALL
SEE_OBJECT_HAS_CONSTRUCT
SEE_OBJECT_HAS_ENUMERATOR
SEE_OBJECT_HAS_GET_SEC_DOMAIN (2.0)
SEE_OBJECT_HAS_HASINSTANCE
SEE_Object_new
SEE_OBJECT_PUT
SEE_OBJECT_PUTA (2.0)
SEE_parse_args (2.0)
SEE_PrintObject
SEE_PrintString
SEE_PrintTraceback
SEE_PrintValue
SEE_SET_BOOLEAN
SEE_SET_NULL
SEE_SET_NUMBER
SEE_SET_OBJECT
SEE_SET_STRING
SEE_SET_UNDEFINED
SEE_string struct
SEE_string_addch
SEE_STRING_ALLOCA
SEE_string_append
SEE_string_append_ascii (2.0)
SEE_string_append_int
SEE_string_cmp
SEE_string_concat
SEE_string_dup
SEE_string_fputs
SEE_string_literal
SEE_string_new
SEE_string_sprintf
SEE_string_substr
SEE_string_toutf8 (2.0)
SEE_string_utf8_size (2.0)
SEE_string_vsprintf
SEE_THROW
SEE_ToBoolean
SEE_ToInt32
SEE_ToInteger
SEE_ToNumber
SEE_ToObject
SEE_ToPrimitive
SEE_ToString
SEE_ToUint16
SEE_ToUint32
SEE_TRY
SEE_value struct
SEE_VALUE_COPY
SEE_VALUE_GET_TYPE
SEE_version
SEE_VERSION_API_MAJOR
SEE_VERSION_API_MINOR

© David Leonard, 2004. This documentation may be entirely reproduced and freely distributed, as long as this copyright notice remains intact, and either the distributed reproduction or translation is a complete and bona fide copy, or the modified reproduction is substantially the same and includes a brief summary of the modifications made.

$Id: USAGE.html 1126 2006-08-05 12:48:25Z d $