Class Syntax::Tokenizer
In: lib/syntax/common.rb
Parent: Object

The base class of all tokenizers. It sets up the scanner and manages the looping until all tokens have been extracted. It also provides convenience methods to make sure adjacent tokens of identical groups are returned as a single token.

Methods

finish   option   set   setup   start   step   teardown   tokenize  

Constants

EOL = /(?=\r\n?|\n|$)/
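
The pattern is a zero-width lookahead: it matches the position just before a line break (CR, CRLF, or LF) or at the end of the text, without consuming anything. The sketch below shows that behavior with a plain StringScanner from Ruby's standard library. The variable names are illustrative only and are not part of the library.

```ruby
require "strscan"

# Same pattern as the EOL constant documented above.
eol = /(?=\r\n?|\n|$)/

scanner = StringScanner.new("first line\r\nsecond line")

# scan_until stops at the lookahead, so the line break itself is not consumed.
rest_of_line = scanner.scan_until(eol)   # => "first line"
line_break   = scanner.peek(2)           # => "\r\n"

# With no line break at all, the pattern matches at the end of the text.
whole_text = StringScanner.new("no newline").scan_until(eol)
```

Because the match is zero-width, a tokenizer can read "the rest of the line" and still see the line break as the next thing to scan.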

Attributes

chunk  [R]  The current chunk of text being accumulated
group  [R]  The current group being processed by the tokenizer

Public Instance methods

finish
Finish tokenizing. This flushes the buffer, yielding any remaining text to the client.

option
Get the value of the specified option.

set
Specify a set of tokenizer-specific options. A tokenizer is not required to publish any options, but if it does, those options may be used to select optional behavior.
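
A minimal sketch of how the two option methods might fit together, assuming the options are kept in a hash and an unset option reads back as nil. The class name OptionSketch and the option name :expressions are hypothetical; neither this page nor the sketch says which options a real tokenizer publishes.

```ruby
# Hypothetical stand-in, written from the descriptions above;
# it is not the library's implementation.
class OptionSketch
  # Merge the given options into whatever was set before.
  def set(opts = {})
    (@options ||= {}).update(opts)
  end

  # Look up a single option; nil when nothing has been set (an assumption).
  def option(name)
    @options ? @options[name] : nil
  end
end

sketch = OptionSketch.new
before = sketch.option(:expressions)   # nothing set yet
sketch.set(expressions: :highlight)
after = sketch.option(:expressions)
```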

setup
Subclasses may override this method to provide implementation-specific setup logic.

start
Start tokenizing. This sets up the state in preparation for tokenization, such as creating a new scanner for the text and saving the callback block. The block will be invoked for each token extracted.

step
Subclasses must implement this method, which is called for each iteration of the tokenization process. This method may extract multiple tokens.

teardown
Subclasses may override this method to provide implementation-specific teardown logic.

tokenize
Begins tokenizing the given text, calling step until the text has been exhausted.
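
The lifecycle described on this page (start, repeated step calls, then finish) can be sketched as a small self-contained program. This is an illustration written from the descriptions above, not the library's source: the class names SketchTokenizer and WordTokenizer, the private helpers append and flush, and the group names :word, :space and :other are all assumptions made for the example.

```ruby
require "strscan"

# Hypothetical stand-in for the base class.
class SketchTokenizer
  # Run the whole process: start, step until the text is exhausted, finish.
  def tokenize(text, &block)
    start(text, &block)
    step until @scanner.eos?
    finish
  end

  # Create the scanner, save the callback, reset the buffer.
  def start(text, &block)
    @scanner  = StringScanner.new(text)
    @callback = block
    @group    = nil
    @chunk    = +""
    setup
  end

  # Flush whatever is still buffered, then let the subclass clean up.
  def finish
    flush
    teardown
  end

  def setup; end
  def teardown; end

  def step
    raise NotImplementedError, "subclasses must implement #step"
  end

  private

  # Buffer text; adjacent text of the same group becomes one token.
  def append(group, text)
    flush if group != @group
    @group = group
    @chunk << text
  end

  def flush
    @callback.call(@group, @chunk) unless @chunk.empty?
    @chunk = +""
  end
end

# Hypothetical subclass: one step extracts one word, one run of
# whitespace, or one other character.
class WordTokenizer < SketchTokenizer
  def step
    if (word = @scanner.scan(/\w+/))
      append(:word, word)
    elsif (space = @scanner.scan(/\s+/))
      append(:space, space)
    else
      append(:other, @scanner.getch)
    end
  end
end

tokens = []
WordTokenizer.new.tokenize("foo bar!?") { |group, text| tokens << [group, text] }
# tokens => [[:word, "foo"], [:space, " "], [:word, "bar"], [:other, "!?"]]
```

Note that "!" and "?" are extracted by two separate step calls but reach the block as a single :other token, which is the adjacent-token merging the class description mentions.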
