org.biojava.bio.seq.io
Class WordTokenization
java.lang.Object
org.biojava.utils.Unchangeable
org.biojava.bio.seq.io.WordTokenization
- All Implemented Interfaces:
- Annotatable, Changeable, java.io.Serializable, SymbolTokenization
- Direct Known Subclasses:
- CrossProductTokenization, DoubleTokenization, IntegerTokenization, NameTokenization
- public abstract class WordTokenization
- extends Unchangeable
- implements SymbolTokenization, java.io.Serializable
Base class for tokenizations which accept whitespace-separated
"words". Input is split at whitespace, except where the whitespace is
enclosed in double quotes ("), parentheses (), or square brackets [].
- Since:
- 1.2
- Author:
- Thomas Down, Greg Cox, Keith James
- See Also:
- Serialized Form
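The splitting rule in the class description can be illustrated with a small standalone sketch. This is not the BioJava implementation: the class name WordSplitSketch and its split method are hypothetical, and the sketch assumes that enclosing quotes and brackets are kept as part of the word and that bracket types need not be matched against each other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of whitespace splitting that respects
// double quotes, parentheses, and square brackets. Not BioJava code.
class WordSplitSketch {

    static List<String> split(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;          // nesting depth of ( ) and [ ]
        boolean quoted = false; // true while inside double quotes

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);

            // Update the quoting state before deciding whether to split.
            if (c == '"') {
                quoted = !quoted;
            } else if (!quoted && (c == '(' || c == '[')) {
                depth++;
            } else if (!quoted && (c == ')' || c == ']') && depth > 0) {
                depth--;
            }

            // Whitespace ends a word only outside quotes and brackets.
            if (Character.isWhitespace(c) && !quoted && depth == 0) {
                if (current.length() > 0) {
                    words.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the final word, if any.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints: [(a b), [c d], "e f", g]
        System.out.println(split("(a b) [c d] \"e f\" g"));
    }
}
```

A real subclass would hand each resulting word to its own token parser. The sketch covers only the splitting step and does not report unbalanced quotes or brackets.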
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
WordTokenization
public WordTokenization(Alphabet fab)
getAlphabet
public Alphabet getAlphabet()
- Description copied from interface:
SymbolTokenization
- The alphabet to which this tokenization applies.
- Specified by:
getAlphabet
in interface SymbolTokenization
getTokenType
public SymbolTokenization.TokenType getTokenType()
- Description copied from interface:
SymbolTokenization
- Determine the style of tokenization represented by this object.
- Specified by:
getTokenType
in interface SymbolTokenization
getAnnotation
public Annotation getAnnotation()
- Description copied from interface:
Annotatable
- Should return the associated annotation object.
- Specified by:
getAnnotation
in interface Annotatable
- Returns:
- an Annotation object, never null
tokenizeSymbolList
public java.lang.String tokenizeSymbolList(SymbolList sl)
throws IllegalSymbolException,
IllegalAlphabetException
- Description copied from interface:
SymbolTokenization
- Return a string representation of a list of symbols.
- Specified by:
tokenizeSymbolList
in interface SymbolTokenization
- Throws:
IllegalAlphabetException
- if alphabets don't match
IllegalSymbolException
parseStream
public StreamParser parseStream(SeqIOListener siol)
- Description copied from interface:
SymbolTokenization
- Return an object which can parse an arbitrary character stream into
symbols.
- Specified by:
parseStream
in interface SymbolTokenization
- Parameters:
siol
- The listener which gets notified of parsed symbols.
splitString
protected java.util.List splitString(java.lang.String str)
throws IllegalSymbolException
- Throws:
IllegalSymbolException
parseString
protected Symbol[] parseString(java.lang.String s)
throws IllegalSymbolException
- Throws:
IllegalSymbolException