Collations used for comparing strings can be specified by means of a URI. A collation URI may be used as an argument to many of the standard functions, as an attribute of xsl:sort in XSLT, and in the order by clause of a FLWOR expression in XQuery.
The W3C specifications leave the details of collation URIs entirely implementation-defined. This section explains the collation URIs that can be used with Saxon.
In Saxon XSLT stylesheets, collations may be described using a saxon:collation element as a top-level declaration in the stylesheet. In this case the value of the name attribute of the saxon:collation element may be used as a collation URI. There is no constraint on the form this URI takes; indeed, there is no requirement that it be a legal URI. See saxon:collation for more details.
A collation URI may also be constructed directly. This enables collation URIs to be used in XPath and XQuery applications as well as in XSLT stylesheets. Such a collation URI takes the form
http://saxon.sf.net/collation?keyword=value;keyword=value;...
The query parameters in the URI can be separated either by ampersands or by semicolons, but semicolons are usually more convenient (an ampersand must be escaped as &amp; when the URI appears in a stylesheet or other XML document). The keywords available are as follows:
keyword | values | effect |
class | fully-qualified Java class name of a class that implements | This parameter should not be combined with any other parameter. An instance of the requested class is created, and is used to perform the comparisons. Note that if the collation is to be used in functions such as contains() and starts-with(), this class must also be a |
lang | any value allowed for xml:lang, for example | This is used to find the collation appropriate to a Java locale. The collation may be further tailored using the parameters below. |
strength | primary, secondary, tertiary, or identical | Indicates the differences that are considered significant when comparing two strings. A/B is a primary difference; A/a is a secondary difference; a/ä is a tertiary difference (though this varies by language). So if strength=primary then A=a is true; with strength=secondary then A=a is false but a=ä is true; with strength=tertiary then a=ä is false. |
decomposition | none, standard, full | Indicates how the collator handles Unicode composed characters. See the JDK documentation for details. |
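As an illustration of the URI form and keywords described above, the following Java sketch assembles a collation URI from keyword/value pairs and configures a java.text.Collator from the same values. The class CollationUri and its methods are hypothetical helpers, not part of Saxon, and the mapping of keyword values onto JDK constants (in particular standard onto CANONICAL_DECOMPOSITION) is an assumption made for illustration rather than a description of what Saxon does internally. Values are not URI-escaped, so the sketch is only suitable for simple values such as language codes.

```java
import java.text.Collator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Saxon.
final class CollationUri {
    static final String BASE = "http://saxon.sf.net/collation";

    // Joins keyword=value pairs with semicolons, in insertion order.
    static String build(Map<String, String> params) {
        StringJoiner query = new StringJoiner(";");
        params.forEach((keyword, value) -> query.add(keyword + "=" + value));
        return BASE + "?" + query;
    }

    // Assumed mapping of the lang, strength and decomposition keywords
    // onto a JDK collator; Saxon's own mapping may differ.
    static Collator toJdkCollator(Map<String, String> params) {
        String lang = params.get("lang");
        Locale locale = (lang == null) ? Locale.getDefault() : Locale.forLanguageTag(lang);
        Collator collator = Collator.getInstance(locale);

        String strength = params.get("strength");
        if (strength != null) {
            switch (strength) {
                case "primary":   collator.setStrength(Collator.PRIMARY);   break;
                case "secondary": collator.setStrength(Collator.SECONDARY); break;
                case "tertiary":  collator.setStrength(Collator.TERTIARY);  break;
                case "identical": collator.setStrength(Collator.IDENTICAL); break;
                default: throw new IllegalArgumentException("strength=" + strength);
            }
        }

        String decomposition = params.get("decomposition");
        if (decomposition != null) {
            switch (decomposition) {
                case "none":     collator.setDecomposition(Collator.NO_DECOMPOSITION);        break;
                case "standard": collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION); break;
                case "full":     collator.setDecomposition(Collator.FULL_DECOMPOSITION);      break;
                default: throw new IllegalArgumentException("decomposition=" + decomposition);
            }
        }
        return collator;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("lang", "de");
        params.put("strength", "primary");
        // prints http://saxon.sf.net/collation?lang=de;strength=primary
        System.out.println(build(params));
        System.out.println(toJdkCollator(params).getStrength() == Collator.PRIMARY);
    }
}
```

The resulting string can be supplied wherever a collation URI is accepted. The class keyword is deliberately not handled here, since it should not be combined with any other parameter.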
It is also possible to specify the Unicode Codepoint Collation defined in the W3C specifications, currently http://www.w3.org/2003/11/xpath-functions/collation/codepoint.
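To make concrete what comparison by Unicode codepoint means, here is a minimal Java sketch. It is not Saxon's implementation, and the class name CodepointCompare is hypothetical. It shows that ordering by codepoint can differ from Java's String.compareTo, which compares UTF-16 code units, once characters outside the Basic Multilingual Plane are involved.

```java
// Hypothetical illustration, not Saxon code.
final class CodepointCompare {

    // Compares two strings codepoint by codepoint; returns a negative,
    // zero or positive value in the manner of Comparator.compare.
    static int compare(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";          // U+FF5E, a single UTF-16 code unit
        String astral = "\uD835\uDC00"; // U+1D400, stored as a surrogate pair
        System.out.println(compare(bmp, astral) < 0);   // prints true
        System.out.println(bmp.compareTo(astral) > 0);  // prints true
    }
}
```

The two strings in main order one way by codepoint and the opposite way by code unit, because the surrogate values used to encode U+1D400 are numerically smaller than U+FF5E.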
In addition, the APIs provided for executing XPath and XQuery expressions allow named collations to be registered by the calling application, as part of the static context.