Writing extension functions

Extension functions must be implemented in Java.

An extension function is invoked using a name such as prefix:localname(). The prefix must be the prefix associated with a namespace declaration that is in scope. The namespace URI is used to identify a Java class, and the local name is used to identify a method, field, or constructor within the class.

The command line option -TJ is useful for debugging the loading of Java extensions. It gives detailed information about the methods that are examined for a possible match.

There are various ways a mapping from URIs to Java classes can be established. The simplest is to use a URI that identifies the Java class explicitly. The namespace URI should be "java:" followed by the fully-qualified class name (for example xmlns:date="java:java.util.Date"). The class must be on the classpath.

For compatibility with other products and previous Saxon releases, Saxon also supports certain other formats of URI. The URI may be a string containing a "/", in which the fully-qualified class name appears after the final "/" (for example xmlns:date="http://www.jclark.com/xt/java/java.util.Date"). The part of the URI before the final "/" is immaterial. The format xmlns:date="java.util.Date" is also supported.

The Saxon namespace URI "http://saxon.sf.net/" is recognized as a special case, and causes the function to be loaded from the class net.sf.saxon.functions.Extensions. This class name can be specified explicitly if you prefer. The various EXSLT namespaces are also recognized specially.
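The URI formats described above can be summarised in a few lines of Java. This is only an illustrative sketch of the documented rules, not Saxon's own code: the class name ExtensionClassNames and the method classNameFor are hypothetical, and the special handling of the EXSLT namespaces is left out.

```java
// Hypothetical helper, for illustration only; not part of the Saxon API.
final class ExtensionClassNames {

    // The Saxon namespace, which maps to a fixed class.
    static final String SAXON_NS = "http://saxon.sf.net/";

    /** Derives the Java class name implied by a namespace URI. */
    static String classNameFor(String uri) {
        if (uri.equals(SAXON_NS)) {
            return "net.sf.saxon.functions.Extensions";
        }
        if (uri.startsWith("java:")) {
            // Preferred form: "java:" followed by the class name.
            return uri.substring("java:".length());
        }
        int slash = uri.lastIndexOf('/');
        if (slash >= 0) {
            // Compatibility form: class name follows the final "/".
            return uri.substring(slash + 1);
        }
        // Compatibility form: the URI is the class name itself.
        return uri;
    }
}
```

Under these assumptions, all three of the date namespaces shown above yield the class name java.util.Date.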

In XSLT it is also possible to set up a mapping from a URI to a Java class using a saxon:script declaration in the stylesheet. This declaration can also name a Java archive, which means the class does not have to be on the classpath.

The rest of this section considers how a Java method, field, or constructor is identified. This decision (called binding) is always made at the time the XPath expression is compiled. (In previous Saxon releases it was sometimes delayed until the actual argument values were known at run-time.)

There are three cases to consider: static methods, constructors, and instance-level methods. In addition, a public field in a class is treated as if it were a zero-argument method, so public static fields can be accessed in the same way as public static methods, and public instance-level fields in the same way as instance-level methods.
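As a concrete picture of these cases, here is a small Java class with one member of each kind. The class Counter and all of its members are hypothetical, invented for this illustration. It is placed in the default package, so under the "java:" rule the namespace would be xmlns:c="java:Counter"; a real class would normally live in a package and be referenced by its fully-qualified name. The XPath calls in the comments are what the rules in this section suggest, using the hypothetical prefix c.

```java
// Hypothetical example class; the names are assumptions, not part of Saxon.
public class Counter {

    // Public static field, treated like a zero-argument static method:
    //   c:START()
    public static final int START = 0;

    // Public instance-level field, treated like an instance-level method:
    //   c:value($counter)
    public int value;

    // Constructor, called with the function name new():
    //   c:new(5)
    public Counter(int initial) {
        this.value = initial;
    }

    // Static method, called directly:
    //   c:twice(3)
    public static int twice(int n) {
        return 2 * n;
    }

    // Instance-level method; the object is passed as an extra first argument:
    //   c:add-to($counter, 2)
    public int addTo(int n) {
        value += n;
        return value;
    }
}
```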

Static methods can be called directly. The localname of the function must match the name of a public static method in this class. The names match if they contain the same characters, excluding hyphens and forcing any character that follows a hyphen to upper-case. For example the XPath function call to-string() matches the Java method toString(); but the function call can also be written as toString() if you prefer.
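The name-matching rule can be expressed as a short Java sketch. The class FunctionNames and its methods are hypothetical names chosen for this illustration; the code only mirrors the rule as stated above (drop each hyphen and upper-case the character that follows it) and is not taken from Saxon.

```java
// Hypothetical helper, for illustration only; not part of the Saxon API.
final class FunctionNames {

    /** Converts an XPath local name such as "to-string" to "toString". */
    static String toJavaName(String localName) {
        StringBuilder sb = new StringBuilder(localName.length());
        boolean upperNext = false;
        for (int i = 0; i < localName.length(); i++) {
            char c = localName.charAt(i);
            if (c == '-') {
                // Hyphens are dropped; the next character is upper-cased.
                upperNext = true;
            } else if (upperNext) {
                sb.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** True if the XPath local name matches the Java member name. */
    static boolean matches(String localName, String javaName) {
        return toJavaName(localName).equals(javaName);
    }
}
```

A name without hyphens is returned unchanged, which is why toString() and to-string() both match the same Java method.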

If there are several methods in the class that match the localname, and that have the correct number of arguments, then the system attempts to find the one that is the best fit to the types of the supplied arguments: for example if the call is f(1,2) then a method with two int arguments will be preferred to one with two float arguments. The rules for deciding between methods are quite complex. Essentially, for each candidate method, Saxon calculates the "distance" between the type of each supplied argument and the Java class of the corresponding argument in the method's signature, using a set of tables given below. For example, the distance between the XPath data type xs:integer and the Java type long is very small, while the distance between an XPath xs:integer and a Java boolean is much larger. If there is one candidate method where the distances of all arguments are less than or equal to the distances computed for other candidate methods, and the distance of at least one argument is smaller, then that method is chosen. If there are several methods with the same name and the correct number of arguments, but none is preferable to the others under these rules, an error is reported: the message indicates that there is more than one method that matches the function call.
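The selection rule can be sketched in Java as follows. This is a simplified illustration rather than Saxon's implementation: the class BestFit is hypothetical, each candidate method is represented only by an array of per-argument distances, and the distance values used in the usage note are invented (the real values come from the conversion tables).

```java
import java.util.List;

// Hypothetical sketch of the "best fit" rule; not Saxon's actual code.
final class BestFit {

    /**
     * True if candidate a is at least as close as candidate b for every
     * argument, and strictly closer for at least one argument.
     */
    static boolean isPreferred(int[] a, int[] b) {
        boolean strictlyCloser = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) {
                return false;
            }
            if (a[i] < b[i]) {
                strictlyCloser = true;
            }
        }
        return strictlyCloser;
    }

    /**
     * Returns the index of the candidate that is preferred to every other
     * candidate, or -1 if no such candidate exists (the ambiguous case,
     * which would be reported as an error).
     */
    static int choose(List<int[]> candidates) {
        for (int i = 0; i < candidates.size(); i++) {
            boolean best = true;
            for (int j = 0; j < candidates.size(); j++) {
                if (i != j && !isPreferred(candidates.get(i), candidates.get(j))) {
                    best = false;
                    break;
                }
            }
            if (best) {
                return i;
            }
        }
        return -1;
    }
}
```

With invented distances of {1, 1} for a method taking (int, int) and {3, 3} for one taking (float, float), choose selects the first candidate. Two candidates with distances {1, 3} and {3, 1} give -1, because neither is preferable to the other.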

For example (in XSLT):

<xsl:value-of select="math:sqrt($arg)"
   xmlns:math="java:java.lang.Math"/>

This will invoke the static method java.lang.Math#sqrt(), applying it to the value of the variable $arg, and copying the value of the square root of $arg to the result tree.

Similarly (in XQuery):

<a xmlns:double="java:java.lang.Double">{double:MAX_VALUE()}</a>

This will output the value of the static field java.lang.Double#MAX_VALUE. (In practice, it is better to declare the namespace in the query prolog, because it will then not be copied to the result tree.)

Java constructors are called by using the function named new(). If there are several constructors, then again the system tries to find the one that is the best fit, according to the types of the supplied arguments. The result of calling new() is an XPath value whose type is denoted by a QName whose local name is the actual Java class (for example java.sql.Connection or java.util.List) and whose namespace URI is http://saxon.sf.net/java-type (conventional prefix class). Any '$' characters in the class name are replaced by '-' characters in the QName. The only things that can be done with a wrapped Java Object are to assign it to a variable, to pass it to an extension function, and to convert it to a string, number, or boolean, using the rules given below.
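The naming rule for the type of a wrapped object can be illustrated with a short Java sketch. The class JavaTypeNames is a hypothetical helper written for this example; it assumes that the "actual Java class" name is the one returned by Class.getName(), in which nested classes are separated by '$'.

```java
// Hypothetical helper, for illustration only; not part of the Saxon API.
final class JavaTypeNames {

    // Namespace URI of the type QName (conventional prefix: class).
    static final String JAVA_TYPE_NS = "http://saxon.sf.net/java-type";

    /** Local name of the type QName: the class name with '$' replaced by '-'. */
    static String localNameFor(Class<?> javaClass) {
        return javaClass.getName().replace('$', '-');
    }
}
```

For example, java.util.Date keeps its name unchanged, while the nested class whose binary name is java.util.Map$Entry gets the local name java.util.Map-Entry.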

Instance-level methods (that is, non-static methods) are called by supplying an extra first argument of type Java Object which is the object on which the method is to be invoked. A Java Object is usually created by calling an extension function (e.g. a constructor) that returns an object; it may also be passed to the stylesheet as the value of a global parameter. Matching of method names is done as for static methods. If there are several methods in the class that match the localname, the system again tries to find the one that is the best fit, according to the types of the supplied arguments.

For example, the following XSLT stylesheet prints the date and time. (In XSLT 2.0, of course, this no longer requires extension functions, but the example is still valid.)


<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:date="java:java.util.Date">

<xsl:template match="/">
  <html>
    <xsl:if test="function-available('date:to-string') and 
                          function-available('date:new')">
      <p><xsl:value-of select="date:to-string(date:new())"/></p>
    </xsl:if>
  </html>
</xsl:template>

</xsl:stylesheet>

The equivalent in XQuery is:


declare namespace date="java:java.util.Date";
<p>{date:to-string(date:new())}</p>

A Java method called as an extension function may have an extra first argument of class net.sf.saxon.expr.XPathContext. This argument is not supplied by the calling XPath or XQuery code, but by Saxon itself. The XPathContext object provides methods to access many internal Saxon resources, the most useful being getContextItem() which returns the context item from the dynamic context. The XPathContext object is not available with constructors.

If any exceptions are thrown by the method, or if a matching method cannot be found, processing of the stylesheet will be abandoned. If the tracing option has been set (-T) on the command line, a full stack trace will be output. The exception will be wrapped in a TransformerException and passed to any user-specified ErrorListener object, so the ErrorListener can also produce extra diagnostics.
