Saxon provides an API for executing XPath expressions. The API is loosely modelled
on the proposed DOM Level 3 API for XPath. For full documentation, see the Javadoc description of package
net.sf.saxon.xpath. A sample application using this API is available: it is called XPathExample.java,
and can be found in the samples/java directory. To run this application, see the instructions
in Shakespeare XPath Sample Application.
This API is based on the class net.sf.saxon.xpath.XPathEvaluator. This class provides a few
simple configuration interfaces to set the source document, the static context, and the context node,
plus a number of methods for evaluating XPath expressions.
Here is a simple example of the use of this class:

    // Create an XPathEvaluator and set the source document
    InputSource is = new InputSource(new File(filename).toURL().toString());
    SAXSource ss = new SAXSource(is);
    XPathEvaluator xpe = new XPathEvaluator(ss);

    // Declare a variable for use in XPath expressions
    StandaloneContext sc = (StandaloneContext)xpe.getStaticContext();
    Variable wordVar = sc.declareVariable("word", "");

    // Compile the XPath expressions used by the application
    XPathExpression findLine =
        xpe.createExpression("//LINE[contains(., $word)]");
    XPathExpression findLocation =
        xpe.createExpression("concat(ancestor::ACT/TITLE, ' ', ancestor::SCENE/TITLE)");
    XPathExpression findSpeaker =
        xpe.createExpression("string(ancestor::SPEECH/SPEAKER[1])");

    // Create a reader for reading input from the console
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

    // Loop until the user enters "." to end the application
    while (true) {

        // Prompt for input
        System.out.println("\n>>>> Enter a word to search for, or '.' to quit:\n");

        // Read the input
        String word = in.readLine().trim();
        if (word.equals(".")) {
            break;
        }
        if (!word.equals("")) {

            // Set the value of the XPath variable
            wordVar.setValue(word);

            // Find the lines containing the requested word
            List matchedLines = findLine.evaluate();

            // Process these lines
            boolean found = false;
            for (Iterator iter = matchedLines.iterator(); iter.hasNext();) {

                // Note that we have found at least one line
                found = true;

                // Get the next matching line
                NodeInfo line = (NodeInfo)iter.next();

                // Find where it appears in the play
                findLocation.setContextNode(line);
                System.out.println("\n" + findLocation.evaluateSingle());

                // Find out who the speaker of this line is
                findSpeaker.setContextNode(line);

                // Output the name of the speaker and the content of the line
                System.out.println(findSpeaker.evaluateSingle() + ": " + line.getStringValue());
            }

            // If no lines were found, say so
            if (!found) {
                System.err.println("No lines were found containing the word '" + word + "'");
            }
        }
    }

    // Finish when the user enters "."
    System.out.println("Finished.");

The XPathEvaluator must be initialized with a source document, which can be supplied
as a JAXP Source object. Any kind of Source object recognized by Saxon is
allowed (including, for example, a JDOM source). This can be supplied either in the constructor for the
XPathEvaluator, or through the setSource method. The setSource method returns a
net.sf.saxon.om.DocumentInfo object representing the root of the
tree for the document: this is useful if you want to use some of the more advanced features of the
Saxon API, but you can ignore it if you don't need it.
There are two methods for direct evaluation of XPath expressions:
evaluate(), which returns a List containing the result of the expression (which in general is a sequence),
and evaluateSingle(), which returns the first item in the result (this is appropriate where it is known
that the result will be single-valued). The results are returned as NodeInfo objects in the case of nodes,
or as objects of the most appropriate Java class in the case of atomic values: for example, Boolean, Double,
or String in the case of the traditional XPath 1.0 data types.
XPath itself provides no sorting capability, so this API lets you specify the order in which
the results of an expression are returned. This is done by nominating another expression, via the setSortKey
method: this second expression is applied to each item in the result sequence, and its value determines
the position of that item in the sorted result order.
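To make the sort-key idea concrete, here is a minimal sketch in plain Java. It does not use Saxon and is not how Saxon implements setSortKey; it only illustrates the rule that a key is computed for each item and the items are then ordered by that key. The class name SortKeySketch and the method sortByKey are hypothetical, and the key function stands in for the second XPath expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Illustration only: plain Java, not Saxon code. All names here are hypothetical.
final class SortKeySketch {

    // Returns a new list ordered by the key computed for each item.
    // 'keyOf' plays the role of the sort-key expression applied to each result item.
    static <T, K extends Comparable<K>> List<T> sortByKey(List<T> items, Function<T, K> keyOf) {
        List<T> sorted = new ArrayList<>(items); // leave the original result untouched
        sorted.sort(Comparator.comparing(keyOf));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> speakers = Arrays.asList("HAMLET", "OPHELIA", "BERNARDO");
        // Here the key is the item's own string value; any function of the item would do.
        System.out.println(sortByKey(speakers, s -> s));
        // A different key (the length of the string) gives a different order.
        System.out.println(sortByKey(speakers, String::length));
    }
}
```

The point of the design is that the key is derived from each item rather than being a property of the result list, which is why a second expression, evaluated once per item, is enough to define the order.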
You can call methods directly on the NodeInfo object to get information about a node: for
example getDisplayName() gets the name of the node in a form suitable for display, and
getStringValue() gets the string value of the node, as defined in the XPath data model. You
can also use the node as the context node for evaluation of subsequent expressions, by calling the
method setContextNode on the XPathEvaluator object.
It is also possible to prepare an XPath expression for subsequent execution, using the
createExpression() method on the XPathEvaluator class. This is worthwhile where the same
expression is to be executed repeatedly.
The compiled expression is represented by an instance of the class
net.sf.saxon.xpath.XPathExpression,
and it can be executed repeatedly, with different context nodes. The compiled expression can
only be used with documents that were constructed using the same NamePool (which will be the
case if the default NamePool is used throughout).
A compiled expression can reference XPath variables; the values of these variables must be supplied
before the expression is evaluated, and can be different each time it is evaluated. To do this you will
need access to the StandaloneContext object used by the XPathEvaluator: you
can get this by calling getStaticContext and casting the result to a StandaloneContext.
Before compiling an expression that uses variables, the variables it uses must be declared using the
declareVariable() method on the StandaloneContext class. This method returns a Variable object, whose
setValue() method can be used to set a value for the variable before the expression is
evaluated.
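The same compile-once, rebind-each-time pattern can be tried with the JDK alone. The sketch below is an analogue written against the standard javax.xml.xpath API, not the Saxon API described here: an XPathVariableResolver takes the place of declareVariable() and setValue(). The class name VariableBindingSketch, its method, and the sample XML are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Analogue using the JDK's javax.xml.xpath API. This is NOT the Saxon API
// described in this section; the class and method names are hypothetical.
final class VariableBindingSketch {

    // Current variable values, looked up by the resolver when the expression runs.
    private final Map<String, Object> variables = new HashMap<>();
    private final Document doc;
    private final XPathExpression findLine;

    VariableBindingSketch(String xml) {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Register the resolver before compiling the expression that uses $word.
            xpath.setXPathVariableResolver(name -> variables.get(name.getLocalPart()));
            // Compile once; the compiled expression is reused for every search.
            findLine = xpath.compile("//LINE[contains(., $word)]");
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Binds $word to a new value, then evaluates the already-compiled expression.
    int countLinesContaining(String word) {
        variables.put("word", word);
        try {
            NodeList lines = (NodeList) findLine.evaluate(doc, XPathConstants.NODESET);
            return lines.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        VariableBindingSketch sketch = new VariableBindingSketch(
                "<PLAY><LINE>To be, or not to be</LINE><LINE>The rest is silence</LINE></PLAY>");
        System.out.println(sketch.countLinesContaining("silence"));
        System.out.println(sketch.countLinesContaining("be"));
    }
}
```

As in the Saxon example earlier in this section, the expression text is parsed only once, and only the variable's value changes between evaluations.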
The StandaloneContext object is also needed if the XPath expression uses namespaces (which
it will need to, if the source document itself uses namespaces). Before compiling or evaluating an
XPath expression that uses namespace prefixes, the namespace must be declared. You can do this explicitly
using the declareNamespace() method on the StandaloneContext object.
Alternatively, you can use the setNamespaces() method, which declares all the namespaces
that are in-scope for a given node in the source document.
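The requirement that a prefix be bound before an expression using it can be compiled is not specific to Saxon. As a hedged illustration, the sketch below shows the equivalent step in the JDK's standard javax.xml.xpath API, where a NamespaceContext plays the role that declareNamespace() plays here. It is not Saxon code; the class name NamespaceBindingSketch, the method evaluateWithPrefix, and the namespace URI in the usage example are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Analogue using the JDK's javax.xml.xpath API. This is NOT the Saxon API
// described in this section; the class and method names are hypothetical.
final class NamespaceBindingSketch {

    // Binds a single prefix to a namespace URI, then evaluates the expression
    // against the document and returns the result as a string.
    static String evaluateWithPrefix(String xml, final String prefix, final String uri,
                                     String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespace-qualified name tests
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String p) {
                    return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String u) {
                    return uri.equals(u) ? prefix : null;
                }

                @Override
                public Iterator<String> getPrefixes(String u) {
                    return uri.equals(u)
                            ? Collections.singletonList(prefix).iterator()
                            : Collections.<String>emptyList().iterator();
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<p:play xmlns:p='urn:example:play'><p:title>Hamlet</p:title></p:play>";
        // The prefix used in the expression (x) need not match the one in the document (p);
        // only the namespace URI has to agree.
        System.out.println(evaluateWithPrefix(xml, "x", "urn:example:play",
                "string(/x:play/x:title)"));
    }
}
```

Declaring prefixes explicitly, as here, keeps the expression independent of whatever prefixes the source document happens to use; copying the in-scope namespaces of a node, as setNamespaces() does, is the more convenient choice when the expression is written to match one particular document.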
Certain namespaces are predeclared with their conventional prefixes: the XSLT namespace (xsl),
the XML namespace (xml), the XML Schema namespace (xs), and the Saxon namespace (saxon).
All the core XPath functions are available, with the exception of the document function.
The XSLT-specific functions, such as key and generate-id, are not available.
You can call Java extension functions by binding a namespace to the Java class (for example,
java:java.lang.Double). You can also call Saxon and EXSLT extension functions using their
normal namespace, with the exception of a small number of Saxon extension functions, such as
saxon:evaluate and saxon:serialize, which work only in an XSLT context.
The design principle of this API is to minimize the number of Saxon classes that need to be used.
Apart from the NodeInfo interface, which is needed when manipulating Saxon trees, only the four classes
XPathEvaluator, XPathExpression, StandaloneContext, and XPathException are needed.
For convenience, these classes are all in the net.sf.saxon.xpath package.