org.biojava.bio.seq.db
Class IndexedSequenceDB

java.lang.Object
  |
  +--org.biojava.bio.seq.db.IndexedSequenceDB

public final class IndexedSequenceDB
extends java.lang.Object
implements SequenceDB, java.io.Serializable

This class reads in a file or a set of files containing sequence data. It contains methods for automatically indexing these sequences.

See Also:
Serialized Form

Method Summary
 void addFile(java.io.File seqFile)
          Add sequences from a file to the sequence database.
static IndexedSequenceDB createDB(java.lang.String name, java.io.File indexFile, SequenceFormat format, SequenceFactory sFact, SymbolParser symParser, IDMaker idMaker)
          Create a sequence database.
 java.util.Set getFiles()
          Retrieve an unmodifiable set of files.
 java.lang.String getName()
          Get the name of this sequence database.
 Sequence getSequence(java.lang.String id)
          Retrieve a single sequence by its id.
 java.util.Set ids()
          Get an immutable set of all of the IDs in the database.
static IndexedSequenceDB openDB(java.io.File indexFile)
          Open an index at indexFile.
 void removeFile(java.io.File seqFile)
          Remove a file from the database.
 SequenceIterator sequenceIterator()
          Returns a SequenceIterator over all sequences in the database.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

openDB

public static IndexedSequenceDB openDB(java.io.File indexFile)
                                throws java.io.IOException,
                                       BioException
Open an index at indexFile.

If indexFile exists, the indexes are loaded from it. If it does not exist, a new index file is created.

Parameters:
indexFile - the File to use for persistently storing the indexes
Throws:
java.io.IOException - if for any reason indexFile can't be used
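
The open-or-create behaviour described for openDB can be sketched with the standard library alone. The class below is a hypothetical stand-in, not the BioJava implementation: OpenOrCreateIndexSketch, openIndex and the use of java.util.Properties as the on-disk index format are all assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

// Hypothetical stand-in for an index store; not part of BioJava.
final class OpenOrCreateIndexSketch {
    private OpenOrCreateIndexSketch() {}

    /**
     * Loads the index stored in indexFile, or writes an empty index file
     * there if none exists yet. The Properties format is an assumption.
     */
    static Properties openIndex(File indexFile) throws IOException {
        Properties index = new Properties();
        if (indexFile.exists()) {
            // Existing index: load it and leave the file untouched.
            try (InputStream in = new FileInputStream(indexFile)) {
                index.load(in);
            }
        } else {
            // No index yet: persist an empty one so later calls find it.
            try (OutputStream out = new FileOutputStream(indexFile)) {
                index.store(out, "empty index");
            }
        }
        return index;
    }

    public static void main(String[] args) throws IOException {
        File indexFile = File.createTempFile("sketch", ".index");
        indexFile.delete();                          // start with no index file
        openIndex(indexFile);                        // first call creates the file
        Properties reopened = openIndex(indexFile);  // second call loads it
        System.out.println("entries after reopen: " + reopened.size());
        indexFile.delete();
    }
}
```

Any failure to read or write the file surfaces as an IOException, which mirrors the Throws clause above.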

createDB

public static IndexedSequenceDB createDB(java.lang.String name,
                                         java.io.File indexFile,
                                         SequenceFormat format,
                                         SequenceFactory sFact,
                                         SymbolParser symParser,
                                         IDMaker idMaker)
                                  throws java.io.IOException,
                                         BioException
Create a sequence database.
Parameters:
name - a name for the database
indexFile - the indexed file of sequences
format - the format being read in, e.g. EMBL or FASTA
sFact - the sequence factory object for generating sequence objects from the file.
symParser - the SymbolParser object for the sequences read in, e.g. a DNA or RNA parser.
idMaker - the IDMaker used to assign an id to each sequence encountered.

getFiles

public java.util.Set getFiles()
Retrieve an unmodifiable set of files.
Returns:
a Set of all files indexed by this indexer

addFile

public void addFile(java.io.File seqFile)
             throws java.io.IOException,
                    BioException
Add sequences from a file to the sequence database. This method works on an "all or nothing" principle: if it can successfully interpret the entire file, all the sequences will be read in; if it encounters any problem, it abandons the whole file. An IOException is thrown if the file cannot be read, and a BioException is thrown if the sequences cannot be understood.
Parameters:
seqFile - the file containing the sequence or set of sequences
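
The "all or nothing" principle can be illustrated with a small self-contained sketch. It is not the BioJava implementation: AllOrNothingIndexSketch and addAll are hypothetical names, the input is assumed to be FASTA-style text (header lines starting with ">"), and IllegalArgumentException stands in for BioException. Records are staged in a temporary map and committed only once the whole input has been parsed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in; not part of BioJava.
final class AllOrNothingIndexSketch {
    // Maps each sequence id to the line number of its header.
    private final Map<String, Integer> index = new LinkedHashMap<>();

    /** Indexes every record read from source, or none of them. */
    void addAll(Reader source) throws IOException {
        Map<String, Integer> staged = new LinkedHashMap<>();
        BufferedReader in = new BufferedReader(source);
        boolean seenHeader = false;
        int lineNo = 0;
        String line;
        while ((line = in.readLine()) != null) {
            lineNo++;
            if (line.startsWith(">")) {
                String id = line.substring(1).trim();
                if (id.isEmpty() || index.containsKey(id) || staged.containsKey(id)) {
                    // Stand-in for BioException: nothing has been committed yet.
                    throw new IllegalArgumentException("missing or duplicate id at line " + lineNo);
                }
                staged.put(id, lineNo);
                seenHeader = true;
            } else if (!seenHeader && !line.trim().isEmpty()) {
                throw new IllegalArgumentException("sequence data before first header at line " + lineNo);
            }
        }
        // Reached only when the whole input parsed cleanly: commit in one step.
        index.putAll(staged);
    }

    /** Read-only view of the indexed ids. */
    Set<String> ids() {
        return Collections.unmodifiableSet(index.keySet());
    }

    public static void main(String[] args) throws IOException {
        AllOrNothingIndexSketch db = new AllOrNothingIndexSketch();
        db.addAll(new StringReader(">a\nACGT\n>b\nGGCC\n"));
        try {
            // The second record repeats id "a", so "c" must not be indexed either.
            db.addAll(new StringReader(">c\nACGT\n>a\nTTTT\n"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("indexed ids: " + db.ids());
    }
}
```

Staging before committing means a failure part-way through the input leaves the existing index exactly as it was.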

removeFile

public void removeFile(java.io.File seqFile)
                throws java.io.IOException
Remove a file from the database.
Parameters:
seqFile - the file to remove

getName

public java.lang.String getName()
Description copied from interface: SequenceDB
Get the name of this sequence database.
Specified by:
getName in interface SequenceDB
Tags copied from interface: SequenceDB
Returns:
the name of the sequence database, which may be null.

getSequence

public Sequence getSequence(java.lang.String id)
                     throws BioException
Description copied from interface: SequenceDB
Retrieve a single sequence by its id.
Specified by:
getSequence in interface SequenceDB
Tags copied from interface: SequenceDB
Parameters:
id - the id to retrieve by
Returns:
the Sequence with that id
Throws:
BioException - if for any reason the sequence could not be retrieved

sequenceIterator

public SequenceIterator sequenceIterator()
Description copied from interface: SequenceDB
Returns a SequenceIterator over all sequences in the database. The order of retrieval is undefined.
Specified by:
sequenceIterator in interface SequenceDB
Tags copied from interface: SequenceDB
Returns:
a SequenceIterator over all sequences

ids

public java.util.Set ids()
Description copied from interface: SequenceDB
Get an immutable set of all of the IDs in the database. The ids are legal arguments to getSequence.
Specified by:
ids in interface SequenceDB
Tags copied from interface: SequenceDB
Returns:
a Set of ids - at the moment, strings
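
The contract between ids and getSequence (every returned id is a valid lookup key, and the returned set cannot be modified) can be sketched as follows. ReadOnlyIdsSketch is a hypothetical in-memory stand-in, not the BioJava class: it stores sequences as plain strings, uses generics where the documented API returns a raw java.util.Set, and throws IllegalArgumentException where the real method declares BioException.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in; not part of BioJava.
final class ReadOnlyIdsSketch {
    private final Map<String, String> sequencesById = new LinkedHashMap<>();

    /** Test helper for filling the store; returns this for chaining. */
    ReadOnlyIdsSketch put(String id, String residues) {
        sequencesById.put(id, residues);
        return this;
    }

    /** Read-only view of the ids; callers cannot add or remove entries. */
    Set<String> ids() {
        return Collections.unmodifiableSet(sequencesById.keySet());
    }

    /** Looks up one sequence and fails loudly for unknown ids. */
    String getSequence(String id) {
        String residues = sequencesById.get(id);
        if (residues == null) {
            throw new IllegalArgumentException("no sequence with id " + id);
        }
        return residues;
    }

    public static void main(String[] args) {
        ReadOnlyIdsSketch db = new ReadOnlyIdsSketch()
                .put("seq1", "ACGT")
                .put("seq2", "GGCC");
        // Every id handed out by ids() is a legal argument to getSequence.
        for (String id : db.ids()) {
            System.out.println(id + " -> " + db.getSequence(id));
        }
        try {
            db.ids().add("seq3");
        } catch (UnsupportedOperationException e) {
            System.out.println("ids() is read-only");
        }
    }
}
```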