org.biojava.bio.seq.io
Class EmblLikeFormat

java.lang.Object
  extended by org.biojava.bio.seq.io.EmblLikeFormat
All Implemented Interfaces:
java.util.EventListener, ParseErrorListener, ParseErrorSource, SequenceFormat, java.io.Serializable

public class EmblLikeFormat
extends java.lang.Object
implements SequenceFormat, java.io.Serializable, ParseErrorSource, ParseErrorListener

Format processor for handling EMBL records and similar files. This takes a very simple approach: all 'normal' attribute lines are passed to the listener as a tag (the first two characters) and a value (the rest of the line, from the 6th character onwards). Any data between the special 'SQ' line and the "//" entry terminator is passed as a SymbolReader.
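
The tag/value split described above can be sketched in a few lines. This is a minimal, self-contained illustration and not BioJava's own code: the class name EmblLineSplitSketch and its split method are hypothetical, and the sketch assumes every attribute line is at least two characters long.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Illustrative only: a stand-alone sketch, not part of BioJava.
final class EmblLineSplitSketch {

    /** Splits one attribute line into a (tag, value) pair. */
    static Map.Entry<String, String> split(String line) {
        if (line.length() < 2) {
            throw new IllegalArgumentException("line too short for a tag: " + line);
        }
        // Tag: the first two characters of the line.
        String tag = line.substring(0, 2);
        // Value: everything from the 6th character (index 5) onwards.
        // Lines with no 6th character carry an empty value.
        String value = line.length() > 5 ? line.substring(5) : "";
        return new SimpleEntry<>(tag, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> e = split("AC   X56734; S46826;");
        System.out.println(e.getKey() + " -> " + e.getValue());
    }
}
```

In this sketch a separator line such as "XX" yields the tag "XX" and an empty value.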

This low-level format processor should normally be used in conjunction with one or more 'filter' objects, such as EmblProcessor.

Many ideas borrowed from the old EmblFormat processor by Thomas Down and Thad Welch.

Since:
1.1
Author:
Thomas Down, Greg Cox, Keith James
See Also:
Serialized Form

Field Summary
protected static java.lang.String ACCESSION_TAG
           
protected static java.lang.String AUTHORS_TAG
           
protected static java.lang.String CIRCULAR_TAG
           
protected static java.lang.String COMMENT_TAG
           
protected static java.lang.String COORDINATE_TAG
           
protected static java.lang.String DATE_TAG
           
static java.lang.String DEFAULT
           
protected static java.lang.String DEFINITION_TAG
           
protected static java.lang.String DIVISION_TAG
           
protected static java.lang.String END_SEQUENCE_TAG
           
protected static java.lang.String FEATURE_TABLE_TAG
           
protected static java.lang.String FEATURE_TAG
           
protected static java.lang.String ID_TAG
           
protected static java.lang.String JOURNAL_TAG
           
protected static java.lang.String KEYWORDS_TAG
           
protected static java.lang.String ORGANISM_TAG
           
protected static java.lang.String REF_ACCESSION_TAG
           
protected static java.lang.String REFERENCE_TAG
           
protected static java.lang.String SEPARATOR_TAG
           
protected static java.lang.String SIZE_TAG
           
protected static java.lang.String SOURCE_TAG
           
protected static java.lang.String START_SEQUENCE_TAG
           
protected static java.lang.String STRAND_NUMBER_TAG
           
protected static java.lang.String TITLE_TAG
           
protected static java.lang.String TYPE_TAG
           
protected static java.lang.String VERSION_TAG
           
 
Constructor Summary
EmblLikeFormat()
           
 
Method Summary
 void addParseErrorListener(ParseErrorListener theListener)
          Adds a parse error listener to the list of listeners if it isn't already included.
 void BadLineParsed(ParseErrorEvent theEvent)
           This method determines the behaviour when a bad line is processed.
 java.lang.String getDefaultFormat()
          Deprecated.  
 boolean getElideSymbols()
          Return a flag indicating if symbol data will be skipped when parsing streams.
protected  void notifyParseErrorEvent(ParseErrorEvent theEvent)
          Passes the event on to all the listeners registered for ParseErrorEvents.
protected  void processSequenceLine(java.lang.String line, StreamParser parser)
          Dispatch symbol data from SQ-block line of an EMBL-like file.
 boolean readSequence(java.io.BufferedReader reader, SymbolTokenization symParser, SeqIOListener listener)
          Read a sequence and pass data on to a SeqIOListener.
 void removeParseErrorListener(ParseErrorListener theListener)
          Removes a parse error listener from the list of listeners if it is included.
 void setElideSymbols(boolean b)
          Specifies whether the symbols (SQ) part of the entry should be ignored.
 void writeSequence(Sequence seq, java.io.PrintStream os)
          writeSequence writes a sequence to the specified PrintStream, using the default format.
 void writeSequence(Sequence seq, java.lang.String format, java.io.PrintStream os)
          Deprecated. Use writeSequence(Sequence seq, PrintStream os) instead.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT

public static final java.lang.String DEFAULT
See Also:
Constant Field Values

ID_TAG

protected static final java.lang.String ID_TAG
See Also:
Constant Field Values

SIZE_TAG

protected static final java.lang.String SIZE_TAG
See Also:
Constant Field Values

STRAND_NUMBER_TAG

protected static final java.lang.String STRAND_NUMBER_TAG
See Also:
Constant Field Values

TYPE_TAG

protected static final java.lang.String TYPE_TAG
See Also:
Constant Field Values

CIRCULAR_TAG

protected static final java.lang.String CIRCULAR_TAG
See Also:
Constant Field Values

DIVISION_TAG

protected static final java.lang.String DIVISION_TAG
See Also:
Constant Field Values

ACCESSION_TAG

protected static final java.lang.String ACCESSION_TAG
See Also:
Constant Field Values

VERSION_TAG

protected static final java.lang.String VERSION_TAG
See Also:
Constant Field Values

DATE_TAG

protected static final java.lang.String DATE_TAG
See Also:
Constant Field Values

DEFINITION_TAG

protected static final java.lang.String DEFINITION_TAG
See Also:
Constant Field Values

KEYWORDS_TAG

protected static final java.lang.String KEYWORDS_TAG
See Also:
Constant Field Values

SOURCE_TAG

protected static final java.lang.String SOURCE_TAG
See Also:
Constant Field Values

ORGANISM_TAG

protected static final java.lang.String ORGANISM_TAG
See Also:
Constant Field Values

REFERENCE_TAG

protected static final java.lang.String REFERENCE_TAG
See Also:
Constant Field Values

COORDINATE_TAG

protected static final java.lang.String COORDINATE_TAG
See Also:
Constant Field Values

REF_ACCESSION_TAG

protected static final java.lang.String REF_ACCESSION_TAG
See Also:
Constant Field Values

AUTHORS_TAG

protected static final java.lang.String AUTHORS_TAG
See Also:
Constant Field Values

TITLE_TAG

protected static final java.lang.String TITLE_TAG
See Also:
Constant Field Values

JOURNAL_TAG

protected static final java.lang.String JOURNAL_TAG
See Also:
Constant Field Values

COMMENT_TAG

protected static final java.lang.String COMMENT_TAG
See Also:
Constant Field Values

FEATURE_TAG

protected static final java.lang.String FEATURE_TAG
See Also:
Constant Field Values

SEPARATOR_TAG

protected static final java.lang.String SEPARATOR_TAG
See Also:
Constant Field Values

FEATURE_TABLE_TAG

protected static final java.lang.String FEATURE_TABLE_TAG
See Also:
Constant Field Values

START_SEQUENCE_TAG

protected static final java.lang.String START_SEQUENCE_TAG
See Also:
Constant Field Values

END_SEQUENCE_TAG

protected static final java.lang.String END_SEQUENCE_TAG
See Also:
Constant Field Values

Constructor Detail

EmblLikeFormat

public EmblLikeFormat()

Method Detail

setElideSymbols

public void setElideSymbols(boolean b)

Specifies whether the symbols (SQ) part of the entry should be ignored. If this property is set to true, the parser will never call addSymbols on the SeqIOListener, but parsing will be faster if you're only interested in header information.

This property also allows the header to be parsed for files which have invalid sequence data.
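
The effect of this property can be pictured with a small stand-alone sketch. It is not the BioJava implementation: ElideSymbolsSketch, its fields and its read method are hypothetical names, and the lines of an entry are assumed to be already available as a list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics the elide-symbols idea without using BioJava.
final class ElideSymbolsSketch {
    final List<String> headerLines = new ArrayList<>();
    final List<String> symbolLines = new ArrayList<>();

    /** Reads one entry; symbol lines are dropped when elideSymbols is true. */
    void read(List<String> lines, boolean elideSymbols) {
        boolean inSequence = false;
        for (String line : lines) {
            if (line.startsWith("//")) {
                break; // entry terminator: stop reading this entry
            } else if (inSequence) {
                // Between the SQ line and "//": keep only if not eliding.
                if (!elideSymbols) {
                    symbolLines.add(line);
                }
            } else {
                headerLines.add(line);
                if (line.startsWith("SQ")) {
                    inSequence = true; // symbol data starts on the next line
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> entry = Arrays.asList(
            "ID   EXAMPLE", "SQ   Sequence 4 BP;", "     acgt         4", "//");
        ElideSymbolsSketch sketch = new ElideSymbolsSketch();
        sketch.read(entry, true);
        System.out.println(sketch.headerLines.size() + " header lines, "
            + sketch.symbolLines.size() + " symbol lines");
    }
}
```

Because the symbol lines are never inspected when the flag is set, invalid sequence data in them cannot cause a failure in this sketch, which mirrors the behaviour described above.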


getElideSymbols

public boolean getElideSymbols()
Return a flag indicating if symbol data will be skipped when parsing streams.


readSequence

public boolean readSequence(java.io.BufferedReader reader,
                            SymbolTokenization symParser,
                            SeqIOListener listener)
                     throws IllegalSymbolException,
                            java.io.IOException,
                            ParseException
Description copied from interface: SequenceFormat
Read a sequence and pass data on to a SeqIOListener.

Specified by:
readSequence in interface SequenceFormat
Parameters:
reader - The stream of data to parse.
symParser - A SymbolTokenization defining a mapping from character data to Symbols.
listener - A listener to notify when data is extracted from the stream.
Returns:
a boolean indicating whether or not the stream contains any more sequences.
Throws:
IllegalSymbolException - if it is not possible to translate character data from the stream into valid BioJava symbols.
java.io.IOException - if an error occurs while reading from the stream.
ParseException
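
Because the return value reports whether more sequences remain, callers typically drive the method from a do/while loop. The sketch below shows that loop shape only. It does not call BioJava: readEntry is a hypothetical stand-in that consumes one "//"-terminated entry, and the input is assumed to hold at least one entry.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative only: the loop shape implied by a boolean "more to read" result.
final class ReadLoopSketch {

    /** Hypothetical stand-in: consumes one entry, returns true if more follow. */
    static boolean readEntry(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("//")) {
                break; // end of this entry
            }
        }
        // Skip trailing whitespace; any other character means another entry.
        while (true) {
            reader.mark(1);
            int c = reader.read();
            if (c == -1) {
                return false;
            }
            if (!Character.isWhitespace(c)) {
                reader.reset(); // put the character back for the next call
                return true;
            }
        }
    }

    /** Counts entries by calling readEntry until it reports no more data. */
    static int countEntries(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        int count = 0;
        try {
            boolean more;
            do {
                more = readEntry(reader);
                count++;
            } while (more);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countEntries("ID   A\n//\nID   B\n//\n"));
    }
}
```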

processSequenceLine

protected void processSequenceLine(java.lang.String line,
                                   StreamParser parser)
                            throws IllegalSymbolException,
                                   ParseException
Dispatch symbol data from SQ-block line of an EMBL-like file.

Throws:
IllegalSymbolException
ParseException

writeSequence

public void writeSequence(Sequence seq,
                          java.io.PrintStream os)
                   throws java.io.IOException
Description copied from interface: SequenceFormat
writeSequence writes a sequence to the specified PrintStream, using the default format.

Specified by:
writeSequence in interface SequenceFormat
Parameters:
seq - the sequence to write out.
os - the PrintStream to write to.
Throws:
java.io.IOException

writeSequence

public void writeSequence(Sequence seq,
                          java.lang.String format,
                          java.io.PrintStream os)
                   throws java.io.IOException
Deprecated. Use writeSequence(Sequence seq, PrintStream os) instead.

writeSequence writes a sequence to the specified PrintStream, using the specified format.

Specified by:
writeSequence in interface SequenceFormat
Parameters:
seq - a Sequence to write out.
format - a String indicating which sub-format of those available from a particular SequenceFormat implementation to use when writing.
os - a PrintStream object.
Throws:
java.io.IOException - if an error occurs.

getDefaultFormat

public java.lang.String getDefaultFormat()
Deprecated.  

getDefaultFormat returns the String identifier for the default format written by a SequenceFormat implementation.

Specified by:
getDefaultFormat in interface SequenceFormat
Returns:
a String.

BadLineParsed

public void BadLineParsed(ParseErrorEvent theEvent)

This method determines the behaviour when a bad line is processed. Some options are to log the error, throw an exception, ignore it completely, or pass the event through.

This method should be overridden when different behavior is desired.

Specified by:
BadLineParsed in interface ParseErrorListener
Parameters:
theEvent - The event that contains the bad line and token.

addParseErrorListener

public void addParseErrorListener(ParseErrorListener theListener)
Adds a parse error listener to the list of listeners if it isn't already included.

Specified by:
addParseErrorListener in interface ParseErrorSource
Parameters:
theListener - Listener to be added.

removeParseErrorListener

public void removeParseErrorListener(ParseErrorListener theListener)
Removes a parse error listener from the list of listeners if it is included.

Specified by:
removeParseErrorListener in interface ParseErrorSource
Parameters:
theListener - Listener to be removed.

notifyParseErrorEvent

protected void notifyParseErrorEvent(ParseErrorEvent theEvent)
Passes the event on to all the listeners registered for ParseErrorEvents.

Parameters:
theEvent - The event to be handed to the listeners.
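
The add, remove and notify methods above follow the usual listener-list pattern. A minimal sketch of that pattern, assuming nothing about BioJava's internals: ParseErrorSourceSketch and its nested Listener interface are hypothetical, and the event is reduced to a plain String.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a generic listener list, not BioJava's implementation.
final class ParseErrorSourceSketch {

    /** Hypothetical stand-in for ParseErrorListener. */
    interface Listener {
        void badLineParsed(String badLine);
    }

    private final List<Listener> listeners = new ArrayList<>();

    /** Adds the listener unless it is already registered. */
    synchronized void addListener(Listener listener) {
        if (!listeners.contains(listener)) {
            listeners.add(listener);
        }
    }

    /** Removes the listener if present; does nothing otherwise. */
    synchronized void removeListener(Listener listener) {
        listeners.remove(listener);
    }

    /** Hands the event to every registered listener. */
    void notifyListeners(String badLine) {
        List<Listener> snapshot;
        synchronized (this) {
            // Copy first so listeners may add or remove themselves while notified.
            snapshot = new ArrayList<>(listeners);
        }
        for (Listener listener : snapshot) {
            listener.badLineParsed(badLine);
        }
    }

    public static void main(String[] args) {
        ParseErrorSourceSketch source = new ParseErrorSourceSketch();
        Listener printer = line -> System.out.println("bad line: " + line);
        source.addListener(printer);
        source.addListener(printer); // second add is ignored
        source.notifyListeners("ZZ   unexpected");
    }
}
```

Registering the same listener twice has no effect here, so each event reaches a given listener once.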