Package org.biojava.bio.symbol

Representation of the Symbols that make up a sequence, and locations within them.

Interface Summary
Alignment An alignment containing multiple SymbolLists.
Alphabet The set of AtomicSymbols which can be concatenated together to make a SymbolList.
AtomicSymbol A symbol that is indivisible.
CrossProductAlphabet Cross product of a list of alphabets.
CrossProductSymbol Symbol in a CrossProductAlphabet.
FiniteAlphabet An alphabet over a finite set of Symbols.
Location A biological location.
Symbol A single symbol.
SymbolList A sequence of symbols that belong to an alphabet.
SymbolParser These objects are responsible for converting strings into Symbols and SymbolLists.
TranslationTable Encapsulates the mapping from a source to a destination alphabet.
 

Class Summary
AbstractAlphabet An abstract implementation of Alphabet.
AbstractSymbolList Abstract helper implementation of the SymbolList core interface.
AllTokensAlphabet An implementation of FiniteAlphabet that grows the alphabet to accommodate all the characters seen while parsing a file.
Alphabet.EmptyAlphabet The class that implements Alphabet and is empty.
AlphabetManager The first port of call for retrieving standard alphabets.
AlphabetManager.ListWrapper Simple wrapper to assist in list-comparisons.
CompoundLocation A complex location.
CrossProductSymbolNameParser Allows Symbol objects to be created from Strings, assuming that they follow the guidelines laid down in CrossProductAlphabet for naming.
DoubleAlphabet An efficient implementation of an Alphabet over the infinite set of double values.
DoubleAlphabet.DoubleSymbol A single double value.
FixedWidthParser A parser that uses a fixed width window of characters to look up the associated symbol.
GappedSymbolList This implementation of SymbolList wraps another one, allowing you to insert gaps.
IntegerAlphabet An efficient implementation of an Alphabet over the infinite set of integer values.
IntegerAlphabet.IntegerSymbol A single int value.
Location.EmptyLocation The implementation of Location that contains no positions at all.
Location.LocationComparator  
NameParser This uses Symbol names to parse characters into symbols.
OrderNSymbolList An n-th order view of another SymbolList.
PointLocation A location that contains a single point.
RangeLocation A simple implementation of Location that contains all points between getMin and getMax inclusive.
ReverseSymbolList A reverse view of another SymbolList.
SimpleAlignment A simple implementation of an Alignment.
SimpleAlphabet A simple no-frills implementation of the FiniteAlphabet interface.
SimpleAtomicSymbol A no-frills implementation of AtomicSymbol.
SimpleSymbol A no-frills implementation of a symbol.
SimpleSymbolList Basic implementation of SymbolList.
SimpleTranslationTable A no-frills implementation of TranslationTable that uses a Map to map from symbols in a finite source alphabet into a target alphabet.
SingletonAlphabet An alphabet that contains a single atomic symbol.
SuffixTree Suffix tree implementation.
SuffixTree.SuffixNode A node in the suffix tree.
SymbolList.EmptySymbolList The empty immutable implementation.
TokenParser This uses symbol tokens to parse characters into Symbols.
TranslatedSymbolList Provides a 'translated' view of an underlying SymbolList.
WindowedSymbolList A view of windows onto another SymbolList.
 

Exception Summary
IllegalAlphabetException The exception to indicate that an invalid alphabet has been used.
IllegalSymbolException The exception to indicate that a symbol is not valid within a context.
 

Package org.biojava.bio.symbol Description

Representation of the Symbols that make up a sequence, and locations within them.

This package is not intended to have strong biological ties. It is here to make programming tasks such as dynamic programming much easier. It also handles serialization of well-known alphabets so that the applicable singleton properties of alphabets and Symbols are maintained.

All coordinates are in 'bio-coordinates': legal indexes start from 1 and a range is inclusive (4 to 7 includes 4, 5, 6 and 7).
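
The difference from Java's own 0-based, end-exclusive indexing is easiest to see in code. The following is a minimal sketch only: BioCoordinates and its methods are hypothetical helpers written for this illustration and are not part of this package.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of org.biojava.bio.symbol.
final class BioCoordinates {
    private BioCoordinates() {}

    /** Returns the items at positions min..max, both 1-based and inclusive. */
    static <T> List<T> subList(List<T> items, int min, int max) {
        if (min < 1 || max > items.size() || min > max) {
            throw new IndexOutOfBoundsException(
                "Range " + min + ".." + max + " is outside 1.." + items.size());
        }
        // List.subList is 0-based and end-exclusive, so only the start shifts.
        return items.subList(min - 1, max);
    }

    /** Number of positions covered by an inclusive range. */
    static int length(int min, int max) {
        return max - min + 1;
    }

    public static void main(String[] args) {
        List<Character> seq = Arrays.asList('g', 'a', 't', 't', 'a', 'c', 'a', 'g');
        System.out.println(subList(seq, 4, 7)); // prints [t, a, c, a]
        System.out.println(length(4, 7));       // prints 4
    }
}
```

Note that the length of an inclusive range is max - min + 1, not max - min.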

A Symbol is a single token. Each Symbol maintains a name, a token (a char), and an Annotation bundle. A set of Symbols is represented by an Alphabet instance. If an Alphabet can guarantee that it only ever contains a finite number of Symbols, then it must implement FiniteAlphabet. The Symbol objects within a FiniteAlphabet can be tested for equality by comparing their references directly. A SymbolList is a string over the Symbols from a single Alphabet instance. This allows you to represent a sequence of tokens, such as DNA nucleotides or stock-market prices.
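
Reference equality works when an alphabet hands out exactly one instance per symbol. The sketch below shows that idea with simplified stand-ins: Sym and TinyFiniteAlphabet are hypothetical classes invented for this example, they omit the Annotation bundle, and they do not reproduce the signatures of the interfaces listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a symbol: a name plus a char token.
final class Sym {
    final String name;
    final char token;

    Sym(String name, char token) {
        this.name = name;
        this.token = token;
    }
    // equals/hashCode are deliberately not overridden, so two symbols are
    // equal only when they are the same object.
}

// Hypothetical finite alphabet that keeps one canonical instance per token.
final class TinyFiniteAlphabet {
    private final Map<Character, Sym> byToken = new LinkedHashMap<>();

    /** Registers a symbol, or returns the existing instance for that token. */
    Sym add(String name, char token) {
        Sym existing = byToken.get(token);
        if (existing != null) {
            return existing;
        }
        Sym created = new Sym(name, token);
        byToken.put(token, created);
        return created;
    }

    /** Looks up the canonical symbol for a token. */
    Sym parse(char token) {
        Sym s = byToken.get(token);
        if (s == null) {
            throw new IllegalArgumentException("No symbol for token '" + token + "'");
        }
        return s;
    }

    /** True only if this exact instance belongs to the alphabet. */
    boolean contains(Sym s) {
        return byToken.get(s.token) == s;
    }

    int size() {
        return byToken.size();
    }
}
```

Because parse always returns the registered instance, `alphabet.parse('a') == alphabet.parse('a')` holds, whereas a freshly constructed `new Sym("adenine", 'a')` is not contained in the alphabet even though its fields match.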

CrossProductAlphabet and CrossProductSymbol represent alphabets and symbols that are the combination of two or more alphabets and symbols under cross-product. For example, the CrossProductAlphabet DNA x DNA would contain all di-nucleotides. DNA x DNA x DNA x Protein would contain all combinations of three nucleotides and a single amino-acid. Dice x Coin would contain every possible combination of die rolls (1..6) and coin flips (Heads, Tails) as the Symbol objects (1, Heads), (1, Tails), (2, Heads) ... (6, Tails). If any one of the Alphabets that make up the source of a CrossProductAlphabet is not finite, then the resulting CrossProductAlphabet will not be finite either.
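
The Dice x Coin example can be enumerated with a few nested loops. This is a sketch under simplifying assumptions: CrossProductSketch is a hypothetical class, symbols are plain Strings, and alphabets are finite Lists. It illustrates the enumeration only and is not how CrossProductAlphabet is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a cross product over finite alphabets.
final class CrossProductSketch {
    private CrossProductSketch() {}

    /** Returns every tuple that takes one symbol from each alphabet, in order. */
    static List<List<String>> crossProduct(List<List<String>> alphabets) {
        List<List<String>> result = new ArrayList<>();
        result.add(Collections.<String>emptyList()); // start with the empty tuple
        for (List<String> alphabet : alphabets) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String symbol : alphabet) {
                    List<String> tuple = new ArrayList<>(prefix);
                    tuple.add(symbol);
                    next.add(tuple);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dice = Arrays.asList("1", "2", "3", "4", "5", "6");
        List<String> coin = Arrays.asList("Heads", "Tails");
        List<List<String>> symbols = crossProduct(Arrays.asList(dice, coin));
        System.out.println(symbols.size());  // prints 12
        System.out.println(symbols.get(0));  // prints [1, Heads]
        System.out.println(symbols.get(11)); // prints [6, Tails]
    }
}
```

The size of the result is the product of the sizes of the source alphabets, which is why a single non-finite source makes the whole cross product non-finite.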

Locations within a SymbolList can be represented by a Location object. A Location defines the subset of points that it contains. It uses bio-coordinates, and defines all the operations that you are likely to need to build your own Locations (union, intersection and the like).
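
Treating a location as a set of 1-based points makes union and intersection straightforward. The following sketch assumes exactly that and nothing more: PointSetLocation is a hypothetical class that enumerates its points in a TreeSet for clarity, and it does not reproduce the Location interface. An implementation such as RangeLocation would more plausibly store only its minimum and maximum.

```java
import java.util.TreeSet;

// Hypothetical location modelled as an explicit, sorted set of 1-based points.
final class PointSetLocation {
    private final TreeSet<Integer> points = new TreeSet<>();

    /** All points from min to max, both inclusive (bio-coordinates). */
    static PointSetLocation range(int min, int max) {
        if (min < 1 || min > max) {
            throw new IllegalArgumentException("Bad range " + min + ".." + max);
        }
        PointSetLocation loc = new PointSetLocation();
        for (int p = min; p <= max; p++) {
            loc.points.add(p);
        }
        return loc;
    }

    boolean contains(int point) {
        return points.contains(point);
    }

    boolean isEmpty() {
        return points.isEmpty();
    }

    /** Smallest contained point; throws NoSuchElementException if empty. */
    int getMin() {
        return points.first();
    }

    /** Largest contained point; throws NoSuchElementException if empty. */
    int getMax() {
        return points.last();
    }

    /** Points contained in either location. */
    PointSetLocation union(PointSetLocation other) {
        PointSetLocation loc = new PointSetLocation();
        loc.points.addAll(points);
        loc.points.addAll(other.points);
        return loc;
    }

    /** Points contained in both locations; may be empty. */
    PointSetLocation intersection(PointSetLocation other) {
        PointSetLocation loc = new PointSetLocation();
        loc.points.addAll(points);
        loc.points.retainAll(other.points);
        return loc;
    }
}
```

For example, intersecting the ranges 4..7 and 6..10 yields a location covering 6..7, while their union covers 4..10. Two ranges that do not overlap, such as 1..3 and 5..6, have an empty intersection and a union with a gap at position 4.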