Package org.biojava.bio.program.ssaha

SSAHA sequence searching API.


Interface Summary
DataStore A repository that can be searched with a sequence.
DataStoreFactory Builder for a data store.
SearchListener The interface used to inform interested parties that a sequence has been searched and a hit has been found.
SequenceStreamer  
 

Class Summary
CompactedDataStore An implementation of DataStore that will map onto a file using the NIO constructs.
CompactedDataStoreFactory Builder for a data store that is backed by a java.nio.MappedByteBuffer.
HitMerger A listener that merges overlapping hits and culls all hits under a given length.
MappedDataStoreFactory Builder for a data store that is backed by a java.nio.MappedByteBuffer.
NIODataStoreFactory Builder for a data store that has no practical file size limit.
SequenceStreamer.FileStreamer  
SequenceStreamer.SequenceDBStreamer  
 

Package org.biojava.bio.program.ssaha Description

SSAHA sequence searching API.

Overview

SSAHA stands for Sequence Search and Alignment by Hashing Algorithm. The idea is to take a sequence database, such as EMBL, and walk over all of its sequences using a fixed window size and step size. Each of these same-sized fragments is represented as a bit-string, which is used as an index into a hash table. The hash table stores the location (sequence and position) of every window. Search sequences are encoded as bit-patterns in the same manner, and each pattern is used as an index into the table to fetch all hits. Finally, these hits are sorted and potentially merged to produce HSPs (high-scoring segment pairs).
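
The indexing and lookup steps described above can be sketched in plain Java. This is a minimal illustration of the idea, not the implementation in this package: the class SsahaSketch, its Hit type, and the methods encode, buildIndex and search are hypothetical names, and none of them correspond to DataStore, DataStoreFactory or SearchListener. The sketch assumes DNA over the symbols A, C, G and T, packs two bits per base into an int, and uses a java.util.HashMap where a real data store would use a compact on-disk table. Hits are sorted by sequence and diagonal; merging into HSPs is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SsahaSketch {
    /** Hypothetical record of one indexed window: sequence number and offset. */
    static final class Hit {
        final int seqId;
        final int offset;
        Hit(int seqId, int offset) { this.seqId = seqId; this.offset = offset; }
    }

    /**
     * Packs wordSize bases starting at 'start' into an int, two bits per base.
     * Returns -1 if the window contains a symbol other than A, C, G or T.
     * Assumes wordSize <= 15 so the packed word stays non-negative.
     */
    static int encode(String seq, int start, int wordSize) {
        int word = 0;
        for (int i = start; i < start + wordSize; i++) {
            int bits;
            switch (seq.charAt(i)) {
                case 'A': bits = 0; break;
                case 'C': bits = 1; break;
                case 'G': bits = 2; break;
                case 'T': bits = 3; break;
                default: return -1;
            }
            word = (word << 2) | bits;
        }
        return word;
    }

    /** Walks every database sequence with the given window and step size. */
    static Map<Integer, List<Hit>> buildIndex(List<String> db, int wordSize, int stepSize) {
        Map<Integer, List<Hit>> index = new HashMap<>();
        for (int id = 0; id < db.size(); id++) {
            String seq = db.get(id);
            for (int pos = 0; pos + wordSize <= seq.length(); pos += stepSize) {
                int word = encode(seq, pos, wordSize);
                if (word >= 0) {
                    index.computeIfAbsent(word, k -> new ArrayList<>()).add(new Hit(id, pos));
                }
            }
        }
        return index;
    }

    /**
     * Looks up every window of the query (step size 1) in the index.
     * Each result is {seqId, dbOffset, queryOffset}, sorted by sequence,
     * then by diagonal (dbOffset - queryOffset), then by dbOffset.
     */
    static List<int[]> search(Map<Integer, List<Hit>> index, String query, int wordSize) {
        List<int[]> hits = new ArrayList<>();
        for (int q = 0; q + wordSize <= query.length(); q++) {
            int word = encode(query, q, wordSize);
            List<Hit> found = word < 0 ? null : index.get(word);
            if (found == null) {
                continue;
            }
            for (Hit h : found) {
                hits.add(new int[] {h.seqId, h.offset, q});
            }
        }
        hits.sort(Comparator.<int[]>comparingInt(h -> h[0])
                .thenComparingInt(h -> h[1] - h[2])
                .thenComparingInt(h -> h[1]));
        return hits;
    }

    public static void main(String[] args) {
        List<String> db = Arrays.asList("ACGTACGTAC", "TTGACGTA");
        Map<Integer, List<Hit>> index = buildIndex(db, 4, 4);
        for (int[] hit : search(index, "GACGTA", 4)) {
            System.out.println("seq=" + hit[0] + " dbOffset=" + hit[1] + " queryOffset=" + hit[2]);
        }
    }
}
```

Hits that share a sequence and a diagonal end up adjacent after the sort, which is what makes a later merging pass straightforward. Here the database is indexed with a step equal to the window size while the query is scanned with a step of one, so a match can be found regardless of how it is aligned to the indexed windows.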