org.biojava.bio.program.sax
Class BlastLikeSAXParser

java.lang.Object
  extended by org.biojava.bio.program.sax.AbstractNativeAppSAXParser
      extended by org.biojava.bio.program.sax.BlastLikeSAXParser
All Implemented Interfaces:
org.biojava.bio.program.sax.NamespaceConfigurationIF, org.xml.sax.XMLReader

public class BlastLikeSAXParser
extends org.biojava.bio.program.sax.AbstractNativeAppSAXParser

A facade class allowing for direct SAX2-like parsing of the native output from Blast-like bioinformatics software. Because the parser is SAX2 compliant, application writers can simply pass XML ContentHandlers to the parser in order to receive notification of SAX2 events.

The SAX2 events produced are as if the input to the parser was an XML file validating against the biojava BlastLikeDataSetCollection DTD. There is no requirement for an intermediate conversion of native output to XML format. An application of the parsing framework, however, is to create XML format files from native output files.
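The conversion use case mentioned above can be sketched with the JDK alone: an identity TransformerHandler is a ContentHandler that writes whatever events it receives as XML text. This is a minimal sketch, not the BlastLikeToXMLConverter implementation. The class name SaxEventsToXml is hypothetical, the element name Hit is an assumption about the DTD, and the JDK's own XMLReader stands in for BlastLikeSAXParser so that the example runs without BioJava.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper class, not part of BioJava.
class SaxEventsToXml {

    /** Runs the reader over the source and returns its SAX events serialised as XML text. */
    static String serialise(XMLReader reader, InputSource source) {
        try {
            // An identity TransformerHandler is a ContentHandler that writes events out as XML.
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            reader.setContentHandler(handler);
            reader.parse(source);
            return out.toString();
        } catch (TransformerConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not serialise SAX events", e);
        }
    }

    /** Stand-in run: the JDK's XMLReader is fed a tiny XML document instead of Blast output. */
    static String demo() {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            // "Hit" is an assumed element name used only for illustration.
            String xml = "<BlastLikeDataSetCollection><Hit/></BlastLikeDataSetCollection>";
            return serialise(reader, new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because BlastLikeSAXParser implements org.xml.sax.XMLReader, an instance of it could presumably be passed to serialise in place of the stand-in reader, with an InputSource that wraps the native output.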

The biojava Blast-like parsing framework is designed to use minimal memory, so that, in principle, extremely large native outputs can be parsed while XML ContentHandlers listen only for small amounts of information.
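A handler that listens for only a small amount of information might look like the following sketch, which keeps a single counter no matter how large the input is. The class name HitCounter and the element name Hit are assumptions for illustration, and the JDK's XMLReader again stands in for BlastLikeSAXParser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: ignores every event except the start of a "Hit" element.
class HitCounter extends DefaultHandler {
    private int hits;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Only a counter is retained, so memory use does not grow with the input.
        if ("Hit".equals(localName)) {
            hits++;
        }
    }

    int hits() {
        return hits;
    }

    /** Counts Hit elements in the given XML text using a namespace-aware stand-in reader. */
    static int countHits(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true); // needed so that localName is populated
            XMLReader reader = spf.newSAXParser().getXMLReader();
            HitCounter counter = new HitCounter();
            reader.setContentHandler(counter);
            reader.parse(new InputSource(new StringReader(xml)));
            return counter.hits();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```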

The framework currently supports parsing of native output from the following bioinformatics programs. Please note that if you are using versions of NCBI or WU Blast other than those listed below, it is worth setting the parsing mode to Lazy, which means parsing will be attempted if the program is recognised, regardless of version.

Planned additional support

Notes to SAX driver writers

The framework that this parser is built on is designed to be extensible with support for both different pieces of software (i.e. not just software that produces Blast-like output), and multiple versions of programs.

This class inherits from the org.biojava.bio.program.sax.AbstractNativeAppSAXParser abstract base class. The abstract base class is a good place to start looking if you want to write new native application SAX2 parsers. This and related classes have only package-level visibility. Typically, application writers are expected to provide a facade class in this package (similar to the current class) to allow users access to functionality.

NB Support for InputSource is not complete: URLs are not resolved and therefore cannot be used as an InputSource. System pathnames, ByteStreams and CharacterStreams, however, are all supported.

Copyright © 2000 Cambridge Antibody Technology. All Rights Reserved.

Primary author -

Other authors -

Version:
1.0
Author:
Cambridge Antibody Technology (CAT)
See Also:
BlastLikeToXMLConverter

Field Summary
protected  int iState
           
protected  java.lang.String oFullNamespacePrefix
           
protected  org.xml.sax.ContentHandler oHandler
           
protected  java.lang.String oNamespacePrefix
           
protected  boolean tNamespacePrefixes
           
protected  boolean tNamespaces
           
 
Constructor Summary
BlastLikeSAXParser()
          Initialises SAXParser, and sets default namespace prefix to "biojava".
 
Method Summary
 void addPrefixMapping(java.lang.String poPrefix, java.lang.String poURI)
          Adds a namespace prefix to URI mapping as (key,value) pairs.
protected  void changeState(int piState)
          Centralise changing of iState field to help with debugging.
protected  void characters(char[] ch, int start, int length)
          Utility method to centralize the sending of a SAX characters message to a document handler.
protected  void endElement(org.biojava.bio.program.sax.QName poQName)
          Utility method to centralize the sending of a SAX endElement message to a document handler.
 org.xml.sax.ContentHandler getContentHandler()
          Return the content handler.
protected  java.io.BufferedReader getContentStream(org.xml.sax.InputSource poSource)
          Create a stream from an InputSource, picking the correct stream according to order of precedence.
 org.xml.sax.DTDHandler getDTDHandler()
          Do-nothing implementation of interface method
 org.xml.sax.EntityResolver getEntityResolver()
          Do-nothing implementation of interface method
 org.xml.sax.ErrorHandler getErrorHandler()
          Do-nothing implementation of interface method
 boolean getFeature(java.lang.String poName)
          Do-nothing implementation of interface method
 java.lang.String getNamespacePrefix()
          Returns the current namespace prefix.
 boolean getNamespacePrefixes()
          Support SAX2 configuration of namespace support of parser.
 boolean getNamespaces()
          Support SAX2 configuration of namespace support of parser.
 java.lang.Object getProperty(java.lang.String name)
          Do-nothing implementation of interface method
 java.lang.String getURIFromPrefix(java.lang.String poPrefix)
          Gets the URI for a namespace prefix, given that prefix, or null if the prefix is not recognised.
 void parse(org.xml.sax.InputSource poSource)
          parse initiates the parsing operation.
 void parse(java.lang.String poSystemId)
          Full implementation of interface method.
 java.lang.String prefix(java.lang.String poElementName)
          Given an unprefixed element name, returns a new element name with a namespace prefix
 void setContentHandler(org.xml.sax.ContentHandler poHandler)
          Allow an application to register a content event handler.
 void setDTDHandler(org.xml.sax.DTDHandler handler)
          Do-nothing implementation of interface method
 void setEntityResolver(org.xml.sax.EntityResolver resolver)
          Do-nothing implementation of interface method
 void setErrorHandler(org.xml.sax.ErrorHandler handler)
          Do-nothing implementation of interface method
 void setFeature(java.lang.String poName, boolean value)
          Handles support for Namespaces and Namespace-prefixes
 void setModeLazy()
          Setting the mode to lazy means that, if the program is recognised, e.g. WU-TBlastX, then parsing will be attempted even if the particular version is not recognised.
 void setModeStrict()
          This is the default. Parsing will be attempted only if both the program, e.g. NCBI BlastP, and a particular version are recognised as being supported.
 void setNamespacePrefix(java.lang.String poPrefix)
           
 void setProperty(java.lang.String name, java.lang.Object value)
          Do-nothing implementation of interface method
protected  void startElement(org.biojava.bio.program.sax.QName poQName, org.xml.sax.Attributes atts)
          Utility method to centralize the sending of a SAX startElement message to a document handler.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

oHandler

protected org.xml.sax.ContentHandler oHandler

tNamespaces

protected boolean tNamespaces

tNamespacePrefixes

protected boolean tNamespacePrefixes

oNamespacePrefix

protected java.lang.String oNamespacePrefix

oFullNamespacePrefix

protected java.lang.String oFullNamespacePrefix

iState

protected int iState
Constructor Detail

BlastLikeSAXParser

public BlastLikeSAXParser()
Initialises SAXParser, and sets default namespace prefix to "biojava".

Method Detail

parse

public void parse(org.xml.sax.InputSource poSource)
           throws java.io.IOException,
                  org.xml.sax.SAXException
parse initiates the parsing operation.

Parameters:
poSource - an InputSource.
Throws:
java.io.IOException - if an error occurs.
org.xml.sax.SAXException - if an error occurs.

setModeStrict

public void setModeStrict()
This is the default. Parsing will be attempted only if both the program, e.g. NCBI BlastP, and a particular version are recognised as being supported.


setModeLazy

public void setModeLazy()
Setting the mode to lazy means that, if the program is recognised, e.g. WU-TBlastX, then parsing will be attempted even if the particular version is not recognised. Using this option is more likely to result in erroneous parsing than if the strict mode is used.
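
The difference between the two modes, as described above, can be modelled as a small decision function. This is only a sketch of the documented behaviour, not the parser's actual code; the class, enum and method names are hypothetical.

```java
// Hypothetical model of the strict/lazy decision described in the documentation.
class ParseModeSketch {

    enum Mode { STRICT, LAZY }

    /**
     * Returns true if parsing would be attempted:
     * the program must always be recognised, and in strict mode
     * the version must be recognised as well.
     */
    static boolean willAttemptParse(Mode mode, boolean programRecognised,
                                    boolean versionRecognised) {
        if (!programRecognised) {
            return false;
        }
        return mode == Mode.LAZY || versionRecognised;
    }
}
```

With the real class, the mode would be selected by calling setModeLazy() or setModeStrict() on the parser before calling parse.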


setContentHandler

public void setContentHandler(org.xml.sax.ContentHandler poHandler)
Allow an application to register a content event handler. If the application does not register a content handler, all content events reported by the SAX parser will be silently ignored.

Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.

Specified by:
setContentHandler in interface org.xml.sax.XMLReader
Parameters:
poHandler - the XML content handler
Throws:
java.lang.NullPointerException - If the handler argument is null

getContentHandler

public org.xml.sax.ContentHandler getContentHandler()
Return the content handler.

Specified by:
getContentHandler in interface org.xml.sax.XMLReader
Returns:
The current content handler, or null if none has been registered.

parse

public void parse(java.lang.String poSystemId)
           throws java.io.IOException,
                  org.xml.sax.SAXException
Full implementation of interface method.

Specified by:
parse in interface org.xml.sax.XMLReader
Throws:
java.io.IOException
org.xml.sax.SAXException

getFeature

public boolean getFeature(java.lang.String poName)
                   throws org.xml.sax.SAXNotRecognizedException,
                          org.xml.sax.SAXNotSupportedException
Do-nothing implementation of interface method

Specified by:
getFeature in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException

setFeature

public void setFeature(java.lang.String poName,
                       boolean value)
                throws org.xml.sax.SAXNotRecognizedException,
                       org.xml.sax.SAXNotSupportedException
Handles support for Namespaces and Namespace-prefixes

Specified by:
setFeature in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException
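
In SAX2, namespace handling is configured through two standard feature URIs, and these are presumably the names that setFeature handles here. The sketch below sets both on an arbitrary XMLReader. The class name is hypothetical, and the JDK's reader stands in for BlastLikeSAXParser. Note that getFeature on BlastLikeSAXParser is documented as a do-nothing implementation, so with the real class the settings would be read back through getNamespaces() and getNamespacePrefixes() instead.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper showing the standard SAX2 namespace feature names.
class NamespaceFeatureSketch {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    /** Applies both namespace-related features to the given reader. */
    static void configure(XMLReader reader, boolean namespaces, boolean namespacePrefixes)
            throws SAXException {
        reader.setFeature(NAMESPACES, namespaces);
        reader.setFeature(NAMESPACE_PREFIXES, namespacePrefixes);
    }

    /** Configures the JDK's reader (the stand-in) and reads the two settings back. */
    static boolean[] demo() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            configure(reader, true, true);
            return new boolean[] {
                reader.getFeature(NAMESPACES),
                reader.getFeature(NAMESPACE_PREFIXES)
            };
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```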

getProperty

public java.lang.Object getProperty(java.lang.String name)
                             throws org.xml.sax.SAXNotRecognizedException,
                                    org.xml.sax.SAXNotSupportedException
Do-nothing implementation of interface method

Specified by:
getProperty in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException

setProperty

public void setProperty(java.lang.String name,
                        java.lang.Object value)
                 throws org.xml.sax.SAXNotRecognizedException,
                        org.xml.sax.SAXNotSupportedException
Do-nothing implementation of interface method

Specified by:
setProperty in interface org.xml.sax.XMLReader
Throws:
org.xml.sax.SAXNotRecognizedException
org.xml.sax.SAXNotSupportedException

setEntityResolver

public void setEntityResolver(org.xml.sax.EntityResolver resolver)
Do-nothing implementation of interface method

Specified by:
setEntityResolver in interface org.xml.sax.XMLReader

getEntityResolver

public org.xml.sax.EntityResolver getEntityResolver()
Do-nothing implementation of interface method

Specified by:
getEntityResolver in interface org.xml.sax.XMLReader

setDTDHandler

public void setDTDHandler(org.xml.sax.DTDHandler handler)
Do-nothing implementation of interface method

Specified by:
setDTDHandler in interface org.xml.sax.XMLReader

getDTDHandler

public org.xml.sax.DTDHandler getDTDHandler()
Do-nothing implementation of interface method

Specified by:
getDTDHandler in interface org.xml.sax.XMLReader

setErrorHandler

public void setErrorHandler(org.xml.sax.ErrorHandler handler)
Do-nothing implementation of interface method

Specified by:
setErrorHandler in interface org.xml.sax.XMLReader

getErrorHandler

public org.xml.sax.ErrorHandler getErrorHandler()
Do-nothing implementation of interface method

Specified by:
getErrorHandler in interface org.xml.sax.XMLReader

startElement

protected void startElement(org.biojava.bio.program.sax.QName poQName,
                            org.xml.sax.Attributes atts)
                     throws org.xml.sax.SAXException
Utility method to centralize sending of a SAX startElement message to document handler

Parameters:
poQName - a QName value
atts - an Attributes value
Throws:
org.xml.sax.SAXException - if an error occurs

endElement

protected void endElement(org.biojava.bio.program.sax.QName poQName)
                   throws org.xml.sax.SAXException
Utility method to centralize the sending of a SAX endElement message to a document handler.

Throws:
org.xml.sax.SAXException

characters

protected void characters(char[] ch,
                          int start,
                          int length)
                   throws org.xml.sax.SAXException
Utility method to centralize the sending of a SAX characters message to a document handler.

Parameters:
ch - the character array
start - the start position in the array
length - the number of characters to use from the array
Throws:
org.xml.sax.SAXException

getNamespaces

public boolean getNamespaces()
Support SAX2 configuration of namespace support of parser.

Specified by:
getNamespaces in interface org.biojava.bio.program.sax.NamespaceConfigurationIF

getNamespacePrefixes

public boolean getNamespacePrefixes()
Support SAX2 configuration of namespace support of parser.

Specified by:
getNamespacePrefixes in interface org.biojava.bio.program.sax.NamespaceConfigurationIF

addPrefixMapping

public void addPrefixMapping(java.lang.String poPrefix,
                             java.lang.String poURI)
Adds a namespace prefix to URI mapping as (key,value) pairs. This mapping can be looked up later to get URIs on request using the getURIFromPrefix method.

Parameters:
poPrefix - a String representation of the namespace prefix
poURI - a String representation of the URI for the namespace prefix.

getURIFromPrefix

public java.lang.String getURIFromPrefix(java.lang.String poPrefix)
Gets the URI for a namespace prefix, given that prefix, or null if the prefix is not recognised.

Specified by:
getURIFromPrefix in interface org.biojava.bio.program.sax.NamespaceConfigurationIF
Parameters:
poPrefix - the namespace prefix
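
The relationship between addPrefixMapping and getURIFromPrefix can be illustrated with a plain map. This is a minimal sketch that assumes a HashMap-like store; the class name and the URI used in the usage note are placeholders, and the real implementation may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parser's prefix-to-URI bookkeeping.
class PrefixMapSketch {
    private final Map<String, String> uriByPrefix = new HashMap<>();

    /** Stores the prefix as key and the URI as value. */
    void addPrefixMapping(String prefix, String uri) {
        uriByPrefix.put(prefix, uri);
    }

    /** Returns the URI registered for the prefix, or null if the prefix is not recognised. */
    String getURIFromPrefix(String prefix) {
        return uriByPrefix.get(prefix);
    }
}
```

For example, after addPrefixMapping("biojava", "urn:example:biojava"), a lookup of "biojava" returns that URI, while a lookup of an unregistered prefix returns null.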

setNamespacePrefix

public void setNamespacePrefix(java.lang.String poPrefix)
Parameters:
poPrefix - a String value

getNamespacePrefix

public java.lang.String getNamespacePrefix()
Returns the current namespace prefix.

Returns:
a String value

prefix

public java.lang.String prefix(java.lang.String poElementName)
Given an unprefixed element name, returns a new element name with a namespace prefix

Returns:
a String value
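
A rough sketch of what prefixing an element name amounts to is shown below. It assumes the usual XML qualified-name form, prefix then colon then local name, and starts from the "biojava" default that the constructor sets. The class name ElementPrefixer is hypothetical.

```java
// Hypothetical illustration of building a qualified element name.
class ElementPrefixer {
    // "biojava" is the default prefix set by the BlastLikeSAXParser constructor.
    private String namespacePrefix = "biojava";

    void setNamespacePrefix(String prefix) {
        namespacePrefix = prefix;
    }

    String getNamespacePrefix() {
        return namespacePrefix;
    }

    /** Assumption: the prefixed name has the form prefix:elementName. */
    String prefix(String elementName) {
        return namespacePrefix + ":" + elementName;
    }
}
```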

getContentStream

protected java.io.BufferedReader getContentStream(org.xml.sax.InputSource poSource)
Create a stream from an InputSource, picking the correct stream according to order of precedence.

Returns:
a BufferedReader value
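
The order of precedence is not spelled out here. The sketch below assumes the usual SAX convention: character stream first, then byte stream, then system identifier. In line with the note that URLs are not resolved, the system identifier is treated as a plain file system pathname. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import org.xml.sax.InputSource;

// Hypothetical sketch of choosing a stream from an InputSource.
class ContentStreamSketch {

    /** Assumed precedence: character stream, then byte stream, then system ID as a path. */
    static BufferedReader open(InputSource source) throws IOException {
        if (source.getCharacterStream() != null) {
            return new BufferedReader(source.getCharacterStream());
        }
        if (source.getByteStream() != null) {
            // Assumption: bytes are decoded with the platform default charset.
            return new BufferedReader(new InputStreamReader(source.getByteStream()));
        }
        if (source.getSystemId() != null) {
            // Treated as a file system pathname; URLs are not resolved.
            return new BufferedReader(new FileReader(source.getSystemId()));
        }
        throw new IOException("InputSource supplies no character stream, byte stream or system ID");
    }

    /** Convenience for trying the sketch: returns the first line of the chosen stream. */
    static String firstLine(InputSource source) {
        try (BufferedReader reader = open(source)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```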

changeState

protected void changeState(int piState)
Centralise changing of iState field to help with debugging, e.g. by printing out the value. All changes to iState should be made through this method.

Parameters:
piState - an int value
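
The benefit of routing every state change through one method can be seen in a small sketch: a single place is enough to record or print each transition. The class name, the state constants and the transition log are hypothetical and only illustrate the pattern.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of centralised state changes.
class StateTrackingSketch {
    // Made-up state constants for the example.
    static final int STARTUP = 0;
    static final int IN_HEADER = 1;
    static final int IN_HITS = 2;

    int iState = STARTUP;
    private final List<String> transitions = new ArrayList<>();

    /** The only place where iState is assigned, so every transition can be traced here. */
    void changeState(int piState) {
        transitions.add(iState + " -> " + piState);
        iState = piState;
    }

    List<String> transitions() {
        return transitions;
    }
}
```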