org.opencyc.xml
Class GatherOpenDirectoryTitles

java.lang.Object
  |
  +--org.opencyc.xml.GatherOpenDirectoryTitles
All Implemented Interfaces:
com.hp.hpl.jena.rdf.arp.StatementHandler

public class GatherOpenDirectoryTitles
extends java.lang.Object
implements com.hp.hpl.jena.rdf.arp.StatementHandler

Gathers Open Directory Titles and constructs a dictionary associating topic resource IDs with their titles.

Another RDF Parser (ARP) is used to parse the input DAML document. This class implements the statement callbacks from ARP. Each triple in the input file causes a call to one of the statement methods. The same triple may occur more than once in a file, causing repeated calls to the method.
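
The callback pattern described above can be illustrated with a minimal sketch. It does not use the ARP types: plain strings stand in for the AResource and ALiteral arguments, and the class name TitleCallbackSketch and the predicate name "Title" are assumptions made only for this example. It shows why a repeated triple is harmless when titles are stored in a map keyed by topic resource ID.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a statement callback. The real class receives
// ARP AResource and ALiteral arguments rather than plain strings.
final class TitleCallbackSketch {
    // Assumed predicate name, used only for this illustration.
    static final String TITLE_PREDICATE = "Title";

    // Topic resource ID --> title, similar in spirit to odpTitles.
    final Map<String, String> titles = new HashMap<>();

    // Called once per triple. A repeated triple rewrites the same entry,
    // so duplicate calls leave the dictionary unchanged.
    void statement(String subject, String predicate, String literal) {
        if (TITLE_PREDICATE.equals(predicate)) {
            titles.put(subject, literal);
        }
    }

    public static void main(String[] args) {
        TitleCallbackSketch sketch = new TitleCallbackSketch();
        sketch.statement("Top/Arts", "Title", "Arts");
        sketch.statement("Top/Arts", "Title", "Arts");            // duplicate triple
        sketch.statement("Top/Arts", "narrow", "Top/Arts/Music"); // not a title, ignored
        System.out.println(sketch.titles);
    }
}
```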

Author:
Stephen L. Reed

Copyright 2001 Cycorp, Inc., license is open source GNU LGPL.

the license

www.opencyc.org

OpenCyc at SourceForge

THIS SOFTWARE AND KNOWLEDGE BASE CONTENT ARE PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OPENCYC ORGANIZATION OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE AND KNOWLEDGE BASE CONTENT, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Inner Class Summary
protected  class GatherOpenDirectoryTitles.DamlTermInfo
          Records the DAML term information for Cyc import.
 
Field Summary
protected  com.hp.hpl.jena.rdf.arp.ARP arp
          Another RDF Parser instance.
protected  CycFort damlOntologyDefiningURL
          URL which defines the imported DAML ontology
protected  java.lang.String damlOntologyDefiningURLString
          URL string which defines the imported DAML ontology
static int DEFAULT_VERBOSITY
          The default verbosity of this application.
 java.util.HashMap odpTitles
          Dictionary of category identifiers and Open Directory topic strings.
protected  java.util.HashMap ontologyNicknames
          Ontology library nicknames, which become namespace identifiers upon import into Cyc.
protected  GatherOpenDirectoryTitles.DamlTermInfo previousDamlTermInfo
          Previously imported term used to avoid redundant assertions.
protected  int verbosity
          Sets verbosity of this application.
 
Constructor Summary
GatherOpenDirectoryTitles(java.util.HashMap ontologyNicknames)
          Constructs a new GatherOpenDirectoryTitles object.
 
Method Summary
protected  void displayTriple(GatherOpenDirectoryTitles.DamlTermInfo subjectTermInfo, GatherOpenDirectoryTitles.DamlTermInfo predicateTermInfo, GatherOpenDirectoryTitles.DamlTermInfo objLitTermInfo)
          Displays the RDF triple.
protected  java.lang.String escaped(java.lang.String text)
          Returns the given string argument with embedded double quote characters escaped.
protected  void examineTriple(GatherOpenDirectoryTitles.DamlTermInfo subjectTermInfo, GatherOpenDirectoryTitles.DamlTermInfo predicateTermInfo, GatherOpenDirectoryTitles.DamlTermInfo objLitTermInfo)
          Examines the RDF triple and gathers the topic titles.
protected  void gatherTitles(java.lang.String damlOntologyDefiningURLString)
          Parses and imports the given DAML URL.
protected  java.lang.String getOntologyNickname(java.lang.String nameSpace, com.hp.hpl.mesa.rdf.jena.model.Resource resource)
          Returns the ontology nickname for the given XML namespace.
protected  boolean hasUriNamespaceSyntax(java.lang.String uri)
          Returns true if the given URI has embedded XML namespace separators.
protected  boolean isProbableUri(java.lang.String string)
          Returns true if the given string is likely to be a URI.
protected  GatherOpenDirectoryTitles.DamlTermInfo literal(com.hp.hpl.jena.rdf.arp.ALiteral literal)
          Returns the DamlTerm info of the given RDF literal.
protected  GatherOpenDirectoryTitles.DamlTermInfo resource(com.hp.hpl.jena.rdf.arp.AResource aResource, GatherOpenDirectoryTitles.DamlTermInfo predicateTermInfo)
          Returns the DamlTerm info of the given RDF resource.
 void setVerbosity(int verbosity)
          Sets the verbosity of this application's output.
 void statement(com.hp.hpl.jena.rdf.arp.AResource subject, com.hp.hpl.jena.rdf.arp.AResource predicate, com.hp.hpl.jena.rdf.arp.ALiteral literal)
          Provides the ARP statement handler for a triple having a literal.
 void statement(com.hp.hpl.jena.rdf.arp.AResource subject, com.hp.hpl.jena.rdf.arp.AResource predicate, com.hp.hpl.jena.rdf.arp.AResource object)
          Provides the ARP statement handler for a triple having an object.
protected  com.hp.hpl.mesa.rdf.jena.model.Resource translateResource(com.hp.hpl.jena.rdf.arp.AResource aResource)
          Converts an ARP resource into a Jena resource.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_VERBOSITY

public static final int DEFAULT_VERBOSITY
The default verbosity of this application. 0 --> quiet ... 9 --> maximum diagnostic output.

verbosity

protected int verbosity
The verbosity of this application. 0 --> quiet ... 9 --> maximum diagnostic output.

arp

protected com.hp.hpl.jena.rdf.arp.ARP arp
Another RDF Parser instance.

ontologyNicknames

protected java.util.HashMap ontologyNicknames
Ontology library nicknames, which become namespace identifiers upon import into Cyc. Maps namespace URI --> ontology nickname.

previousDamlTermInfo

protected GatherOpenDirectoryTitles.DamlTermInfo previousDamlTermInfo
Previously imported term used to avoid redundant assertions.

damlOntologyDefiningURLString

protected java.lang.String damlOntologyDefiningURLString
URL string which defines the imported DAML ontology

damlOntologyDefiningURL

protected CycFort damlOntologyDefiningURL
URL which defines the imported DAML ontology

odpTitles

public java.util.HashMap odpTitles
Dictionary of category identifiers and Open Directory topic strings. The topic strings are not valid XML names and will be imported into Cyc as functionally wrapped strings.
Constructor Detail

GatherOpenDirectoryTitles

public GatherOpenDirectoryTitles(java.util.HashMap ontologyNicknames)
Constructs a new GatherOpenDirectoryTitles object.
Parameters:
ontologyNicknames - the dictionary associating each ontology URI with the nickname used for the Cyc namespace qualifier
Method Detail

gatherTitles

protected void gatherTitles(java.lang.String damlOntologyDefiningURLString)
                     throws java.io.IOException
Parses and imports the given DAML URL.
Parameters:
damlOntologyDefiningURLString - the URL to import

statement

public void statement(com.hp.hpl.jena.rdf.arp.AResource subject,
                      com.hp.hpl.jena.rdf.arp.AResource predicate,
                      com.hp.hpl.jena.rdf.arp.AResource object)
Provides the ARP statement handler for a triple having an object.
Specified by:
statement in interface com.hp.hpl.jena.rdf.arp.StatementHandler
Parameters:
subject - the RDF Triple Subject
predicate - the RDF Triple Predicate
object - the RDF Triple Object

statement

public void statement(com.hp.hpl.jena.rdf.arp.AResource subject,
                      com.hp.hpl.jena.rdf.arp.AResource predicate,
                      com.hp.hpl.jena.rdf.arp.ALiteral literal)
Provides the ARP statement handler for a triple having a literal.
Specified by:
statement in interface com.hp.hpl.jena.rdf.arp.StatementHandler
Parameters:
subject - the RDF Triple Subject
predicate - the RDF Triple Predicate
literal - the RDF Triple Literal

examineTriple

protected void examineTriple(GatherOpenDirectoryTitles.DamlTermInfo subjectTermInfo,
                             GatherOpenDirectoryTitles.DamlTermInfo predicateTermInfo,
                             GatherOpenDirectoryTitles.DamlTermInfo objLitTermInfo)
                      throws java.io.IOException,
                             java.net.UnknownHostException,
                             CycApiException
Examines the RDF triple and gathers the topic titles.
Parameters:
subjectTermInfo - the subject DamlTermInfo object
predicateTermInfo - the predicate DamlTermInfo object
objLitTermInfo - the object or literal DamlTermInfo object

escaped

protected java.lang.String escaped(java.lang.String text)
Returns the given string argument with embedded double quote characters escaped.
Parameters:
text - the given string
Returns:
the given string argument with embedded double quote characters escaped
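
As an illustration of the escaping described above, here is a minimal sketch. It assumes the backslash is the escape character, which the source does not state, and the class name EscapeSketch is hypothetical.

```java
// A sketch of double-quote escaping, not the class's actual implementation.
final class EscapeSketch {
    // Returns the text with each embedded double quote preceded by a backslash.
    // Assumption: backslash is the escape character; the source does not say.
    static String escaped(String text) {
        StringBuilder result = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == '"') {
                result.append('\\');
            }
            result.append(ch);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(escaped("Movies \"A\" to \"Z\""));
    }
}
```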

displayTriple

protected void displayTriple(GatherOpenDirectoryTitles.DamlTermInfo subjectTermInfo,
                             GatherOpenDirectoryTitles.DamlTermInfo predicateTermInfo,
                             GatherOpenDirectoryTitles.DamlTermInfo objLitTermInfo)
Displays the RDF triple.
Parameters:
subjectTermInfo - the subject DamlTermInfo object
predicateTermInfo - the predicate DamlTermInfo object
objLitTermInfo - the object or literal DamlTermInfo object

resource

protected GatherOpenDirectoryTitles.DamlTermInfo resource(com.hp.hpl.jena.rdf.arp.AResource aResource,
                                                          GatherOpenDirectoryTitles.DamlTermInfo predicateTermInfo)
Returns the DamlTerm info of the given RDF resource.
Parameters:
aResource - the RDF resource
predicateTermInfo - the predicate term info when processing the RDF triple object, otherwise null
Returns:
the DamlTerm info of the given RDF resource

literal

protected GatherOpenDirectoryTitles.DamlTermInfo literal(com.hp.hpl.jena.rdf.arp.ALiteral literal)
Returns the DamlTerm info of the given RDF literal.
Parameters:
literal - the RDF literal
Returns:
the DamlTerm info of the given RDF literal

isProbableUri

protected boolean isProbableUri(java.lang.String string)
Returns true if the given string is likely to be a URI.
Parameters:
string - the given string
Returns:
true if the given string is likely to be a URI
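
The source does not say which heuristic is used. One possible approach, shown here purely as an assumption, is to treat a string as a probable URI when java.net.URI can parse it and it carries a scheme. The class name ProbableUriSketch is hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// A stand-in heuristic, not the class's actual test.
final class ProbableUriSketch {
    // Returns true when the string parses as a URI and has a scheme such as "http".
    static boolean isProbableUri(String string) {
        if (string == null) {
            return false;
        }
        try {
            return new URI(string).getScheme() != null;
        } catch (URISyntaxException e) {
            // Strings containing spaces or other illegal characters end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isProbableUri("http://www.opencyc.org/"));
        System.out.println(isProbableUri("Arts and Crafts"));
    }
}
```

A heuristic like this accepts any scheme-prefixed string, so short literals that happen to contain a colon may also pass.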

hasUriNamespaceSyntax

protected boolean hasUriNamespaceSyntax(java.lang.String uri)
Returns true if the given URI has embedded XML namespace separators.
Parameters:
uri - the URI
Returns:
true if the given URI has embedded XML namespace separators, otherwise false

getOntologyNickname

protected java.lang.String getOntologyNickname(java.lang.String nameSpace,
                                               com.hp.hpl.mesa.rdf.jena.model.Resource resource)
Returns the ontology nickname for the given XML namespace.
Parameters:
nameSpace - the XML namespace for which the nickname is sought
resource - the resource containing the namespace, used for error messages
Returns:
the ontology nickname for the given XML namespace
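
The lookup can be pictured as a dictionary access keyed by namespace URI. The sketch below is an assumption about the general shape only: it takes a plain string in place of the Jena Resource, and throwing an exception for an unknown namespace is a choice made for this example, since the source only says the resource is used for error messages.

```java
import java.util.HashMap;
import java.util.Map;

// A sketch of a namespace-to-nickname lookup, not the class's actual code.
final class NicknameLookupSketch {
    // Namespace URI --> ontology nickname.
    final Map<String, String> ontologyNicknames = new HashMap<>();

    // Returns the nickname for the namespace. The resource description is
    // used only to make the error message more helpful.
    String getOntologyNickname(String nameSpace, String resourceDescription) {
        String nickname = ontologyNicknames.get(nameSpace);
        if (nickname == null) {
            // Assumed behavior: fail loudly when no nickname is registered.
            throw new IllegalArgumentException(
                "No nickname for namespace " + nameSpace
                + " (resource " + resourceDescription + ")");
        }
        return nickname;
    }

    public static void main(String[] args) {
        NicknameLookupSketch sketch = new NicknameLookupSketch();
        sketch.ontologyNicknames.put("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "rdf");
        System.out.println(sketch.getOntologyNickname(
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "rdf:type"));
    }
}
```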

translateResource

protected com.hp.hpl.mesa.rdf.jena.model.Resource translateResource(com.hp.hpl.jena.rdf.arp.AResource aResource)
Converts an ARP resource into a Jena resource.
Parameters:
aResource - The ARP resource.
Returns:
The Jena resource.

setVerbosity

public void setVerbosity(int verbosity)
Sets the verbosity of this application's output. 0 --> quiet ... 9 --> maximum diagnostic output.
Parameters:
verbosity - 0 --> quiet ... 9 --> maximum diagnostic output
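
A verbosity level in the 0 to 9 range is typically used to gate diagnostic messages. The following sketch shows one way that could work. The class name VerbositySketch, the shouldReport helper, and the default value of 3 are all assumptions, since the source does not state the default or how messages are filtered.

```java
// A sketch of verbosity-gated diagnostics, not the class's actual code.
final class VerbositySketch {
    // Hypothetical default; the source does not give the real value.
    static final int DEFAULT_VERBOSITY = 3;

    private int verbosity = DEFAULT_VERBOSITY;

    // 0 --> quiet ... 9 --> maximum diagnostic output.
    void setVerbosity(int verbosity) {
        this.verbosity = verbosity;
    }

    // Hypothetical helper: a message tagged with the given level is shown
    // only when the current verbosity is at least that level.
    boolean shouldReport(int level) {
        return verbosity >= level;
    }

    public static void main(String[] args) {
        VerbositySketch sketch = new VerbositySketch();
        sketch.setVerbosity(0);
        if (sketch.shouldReport(1)) {
            System.out.println("diagnostic detail");
        }
        sketch.setVerbosity(9);
        if (sketch.shouldReport(1)) {
            System.out.println("diagnostic detail at full verbosity");
        }
    }
}
```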