Chapter 7  Other extensions
Camlp4 provides a system of extensible functions defined by pattern
matching. The library module is Extfun and the syntax extension to
load is "pa_extfun.cmo". The empty function is Extfun.empty.
You can extend a function using the statement extfun, whose syntax is:
extfun expression with
[ pattern-1 -> expression-1
| pattern-2 -> expression-2
...
| pattern-n -> expression-n ]
The patterns are sorted in lexicographic order (for example, in a
tuple, the first elements are compared first). Variables are
inserted after constructors. In an extension, the patterns do not need
to be written in a ``good'' order, since they are sorted. Non-exhaustive
pattern matching does not generate a warning.
``Or'' patterns can be used only at the top level of a case. A case of
the form:
pat1 | pat2 -> expr
is split into two cases (expr is duplicated):
pat1 -> expr
| pat2 -> expr
``Or'' patterns nested inside other patterns are not accepted.
The statement extfun returns another extensible function. The
type of extensible functions is ('a, 'b) Extfun.t. To use an
extensible function, one must apply Extfun.apply, which transforms it
into a function of type 'a -> 'b. If matching fails, such a function
raises the exception Extfun.Failure.
The contents (patterns) of an extensible function can be displayed
using Extfun.print.
Remark: extensible functions are not efficient: when applied, all
patterns are tested, one by one, until one of them matches.
Extensible functions are used in Camlp4 extensible pretty printing.
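The linear search mentioned in the remark can be illustrated with a
small model in plain OCaml. This is only a sketch: it does not use
Camlp4 or the Extfun module, and the names No_match, extend and
describe are hypothetical. Unlike the extfun statement, it does not
sort patterns; new cases are simply placed in front of the old ones.

```ocaml
(* A plain-OCaml model of an extensible function.
   This is NOT the Extfun implementation, only an illustration. *)
exception No_match

(* Each case returns Some result when it matches, None otherwise. *)
type ('a, 'b) t = ('a -> 'b option) list

let empty : ('a, 'b) t = []

(* Hypothetical helper: new cases go before the existing ones.
   The real extfun statement sorts the patterns instead. *)
let extend (f : ('a, 'b) t) (cases : ('a -> 'b option) list) : ('a, 'b) t =
  cases @ f

(* Try every case in turn until one of them matches. *)
let apply (f : ('a, 'b) t) (x : 'a) : 'b =
  let rec loop = function
    | [] -> raise No_match
    | case :: rest ->
        (match case x with
         | Some y -> y
         | None -> loop rest)
  in
  loop f

let describe =
  extend empty
    [ (function 0 -> Some "zero" | _ -> None);
      (function 1 -> Some "one" | _ -> None) ]

(* A later extension adds a case for negative numbers. *)
let describe' =
  extend describe [ (fun n -> if n < 0 then Some "negative" else None) ]
```

In this model the cost of apply grows with the number of cases, which
is the inefficiency the remark points out.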
Functional streams are another implementation of streams. As with
normal streams, their contents can be accessed only one element at a
time, but the elements are not removed. A functional stream parser
returns a pair of a result and the remaining stream.
The library module is Fstream and the syntax extension to load is
"pa_fstream.cmo". The syntax of a functional stream is:
functional-stream ::= fstream [: list-of-components-separated-by-semicolon :]
component         ::= ` stream-element
                    | stream
The syntax of a functional parser, which applies to a functional
stream, is:
functional-parser ::= fparser
                        [ stream-pattern-1 -> expression-1
                        | stream-pattern-2 -> expression-2
                        ..
                        | stream-pattern-n -> expression-n ]
stream-pattern    ::= [: list-of-components-separated-by-semicolon :]
component         ::= ` stream-pattern-element
                    | pattern = expression
                    | stream-pattern
The syntax of functional stream pattern elements is actually the same
as in normal stream patterns.
A functional stream is of type 'a Fstream.t and a functional
stream parser is of type 'a Fstream.t -> ('b * 'a Fstream.t) option,
where 'b is the type of the result. When a parser fails, it returns
None; otherwise it returns Some of the result and the remaining
stream. The elements in the initial stream are not removed.
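The behaviour described above can be modelled in plain OCaml, without
Camlp4 or the Fstream module. The sketch below is an illustration
only: the type fstream and the functions of_list, next and digit are
assumptions made for this example, not the library's definitions.

```ocaml
(* A plain-OCaml model of a functional stream (not the Fstream module). *)
type 'a fstream = Nil | Cons of 'a * 'a fstream Lazy.t

let rec of_list = function
  | [] -> Nil
  | x :: rest -> Cons (x, lazy (of_list rest))

(* Return the first element together with the remaining stream.
   The stream given as argument is left unchanged. *)
let next = function
  | Nil -> None
  | Cons (x, rest) -> Some (x, Lazy.force rest)

(* A parser has the shape: stream -> (result * remaining stream) option.
   This one reads a single decimal digit and returns its value. *)
let digit s =
  match next s with
  | Some (('0' .. '9' as c), rest) -> Some (Char.code c - Char.code '0', rest)
  | _ -> None

let s = of_list ['4'; '2'; 'x']
```

Applying digit to s twice gives the same result both times, since
nothing is removed from s; to read the second digit, the parser must
be applied to the remaining stream returned by the first call.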
A functional parser uses limited backtracking. It is backtracking in
the sense that when a rule fails, the next rule is tested with the
initial stream. If no rule applies, the functional parser returns
None. There is no Error exception causing the parsing to be
abandoned.
The backtracking is limited in the sense that, if a rule is
[: p1 = e1; p2 = e2 :] and e2 fails, the rule is abandoned: there is
no attempt to try the next possible rule inside e1 (which would
require continuations).
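The limit on backtracking can be seen in a small plain-OCaml model.
It is a sketch under stated assumptions: streams are modelled as
ordinary lists (which are persistent, like functional streams), and
the combinators alt, seq and word are hypothetical names, not part of
Camlp4.

```ocaml
(* Streams are modelled as plain lists; this is not the Fstream module. *)
type ('a, 'b) fparser = 'a list -> ('b * 'a list) option

(* Try the rules in order, each one starting from the initial stream. *)
let alt (rules : ('a, 'b) fparser list) : ('a, 'b) fparser =
 fun s ->
  List.fold_left
    (fun acc rule -> match acc with Some _ -> acc | None -> rule s)
    None rules

(* Model of the rule [: p1 = e1; p2 = e2 :]. If e2 fails, the whole rule
   fails: the remaining alternatives inside e1 are never retried. *)
let seq (e1 : ('a, 'b) fparser) (e2 : ('a, 'c) fparser)
    : ('a, 'b * 'c) fparser =
 fun s ->
  match e1 s with
  | None -> None
  | Some (r1, s1) ->
      (match e2 s1 with
       | None -> None
       | Some (r2, s2) -> Some ((r1, r2), s2))

(* Parser that recognizes exactly the given prefix. *)
let word (w : char list) : (char, char list) fparser =
 fun s ->
  let rec go w s =
    match w, s with
    | [], rest -> Some rest
    | c :: w', d :: s' when c = d -> go w' s'
    | _ -> None
  in
  match go w s with Some rest -> Some (w, rest) | None -> None

(* e1 accepts "a" or "ab"; the rule is e1 followed by "c". *)
let a_or_ab = alt [ word ['a']; word ['a'; 'b'] ]
let rule = seq a_or_ab (word ['c'])
```

On the input a b c, a_or_ab succeeds with its first alternative, word
c then fails on the remaining b c, and rule returns None, even though
the second alternative of a_or_ab would have allowed the rule to
succeed. Listing the longer alternative first avoids the problem in
this model.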
The functions available in the module Fstream are like the ones
in Stream, except that there is no function ``Fstream.peek'', only
Fstream.next.
Functional parsers have the drawback that, in case of a syntax error,
one cannot know where it occurred, since parsing continues until all
rules have been tested. To work around this problem, the function
Fstream.count_unfrozen returns the number of unfrozen tokens in
the stream, which makes it possible to find the location of the error,
provided a location array has been used (which is normal usage in
stream parsing and grammars). This works only if the stream had not
been unfrozen before.
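The idea of counting unfrozen tokens can be sketched in plain OCaml
with lazy cells. This is a model only, not the Fstream implementation:
the types and the functions of_list, next, unfrozen_count, expect and
parse are hypothetical names chosen for this example.

```ocaml
(* A lazy stream whose cells are computed ("unfrozen") only when read. *)
type 'a fstream = { cell : 'a cell Lazy.t }
and 'a cell = Nil | Cons of 'a * 'a fstream

let rec of_list l =
  { cell = lazy (match l with [] -> Nil | x :: r -> Cons (x, of_list r)) }

let next s =
  match Lazy.force s.cell with
  | Nil -> None
  | Cons (x, rest) -> Some (x, rest)

(* Count the leading elements that have already been computed. *)
let rec unfrozen_count s =
  if Lazy.is_val s.cell then
    match Lazy.force s.cell with
    | Nil -> 0
    | Cons (_, rest) -> 1 + unfrozen_count rest
  else 0

(* Recognize the given sequence of tokens at the start of the stream. *)
let rec expect w s =
  match w with
  | [] -> Some s
  | c :: w' ->
      (match next s with
       | Some (d, rest) when c = d -> expect w' rest
       | _ -> None)

(* Two rules, each tried from the initial stream. *)
let parse s =
  match expect ['a'; 'b'; 'c'] s with
  | Some rest -> Some rest
  | None -> expect ['a'; 'd'] s

let input = of_list ['a'; 'b'; 'x'; 'y'; 'z']
```

Here parse input fails, and unfrozen_count input then reports how far
the longest attempt read into the stream, which is where the error can
be located. As noted above, this is meaningful only if the stream was
entirely frozen before parsing started.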
Camlp4 with the syntax extension pa_ocamllex.cma is an
alternative to the command ocamllex. This is Alain Frisch's
contribution.
You just have to add pa_ocamllex.cma to the list of modules
loaded by Camlp4. For instance:
ocamlc -c -pp 'camlp4o pa_ocamllex.cma' my_lexer.ml
You can also print the generated lexer with:
camlp4o pa_ocamllex.cma pr_o.cmo my_lexer.ml
Unlike ocamllex, pa_ocamllex can also be used with the revised
syntax.
There are two new kinds of structure items (phrases):
let_regexp id = re
and:
rule id pat1 .. patn = parse
re1 { ... }
| ...
and ....
The first one declares a regexp alias; the second one creates a lexer
and declares the corresponding entry points. The syntax of regular
expressions and the shape of lexer definitions are similar to those of
ocamllex.
Notes:
- named regexps are global (shared by all lexers in the source file, and
not subject to module scoping rules)
- each lexer has its own internal lexing table
- it is possible to give extra arguments to lexer entry points;
the arguments may be arbitrary patterns; they follow the implicit
'lexbuf' argument, so a call to a lexer looks like:
token lexbuf arg1 .. argn
7.3.2  Standalone mode (ocamllex emulation)
In standalone mode (-ocamllex switch on the camlp4 command line),
pa_ocamllex accepts standard ocamllex files (usually .mll). You
can simulate ocamllex with the following command:
camlp4o pa_ocamllex.cma pr_o.cmo -ocamllex -impl foo.mll > foo.ml
Or you can simplify your Makefile and avoid the creation of the foo.ml
file; for instance:
.mll.cmo:
$(CAMLC) $(COMPFLAGS) -c -pp 'camlp4o pa_ocamllex.cma -ocamllex -impl' \
-impl $<
Notes:
- extra arguments are also accepted in standalone mode.