Controlling Validation from Java

The easiest way to control validation from a Java application is to run a JAXP identity transformation, having first set the option to perform schema validation. The following code (from the sample application QuickValidator.java) illustrates this:


  try {
      // Select Saxon's schema-aware factory as the JAXP implementation
      System.setProperty("javax.xml.transform.TransformerFactory",
                         "com.saxonica.SchemaAwareTransformerFactory");
      TransformerFactory factory = TransformerFactory.newInstance();
      // Request schema validation of source documents
      factory.setAttribute(FeatureKeys.SCHEMA_VALIDATION, Boolean.TRUE);
      // newTransformer() with no stylesheet gives an identity transformer
      Transformer trans = factory.newTransformer();
      StreamSource source = new StreamSource(new File(args[0]).toURI().toString());
      // The output is discarded; only the validation outcome matters
      SAXResult sink = new SAXResult(new DefaultHandler());
      trans.transform(source, sink);
  } catch (TransformerException err) {
      System.err.println("Validation failed");
  }

If you set an ErrorListener on the TransformerFactory, then you can control the way that error messages are output.
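As an illustration of the point above, here is a minimal sketch of an ErrorListener that collects messages instead of printing them. It uses only the standard JAXP interface javax.xml.transform.ErrorListener. The class name CollectingErrorListener and the helper installOn are hypothetical names introduced for this example; they are not part of Saxon or of QuickValidator.java. How the schema-aware factory routes individual validation errors to the listener is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper class (not part of Saxon): records every reported
// message so that the application decides how and where to display them.
class CollectingErrorListener implements ErrorListener {

    private final List<String> messages = new ArrayList<>();

    @Override
    public void warning(TransformerException e) {
        messages.add("warning: " + e.getMessageAndLocation());
    }

    @Override
    public void error(TransformerException e) {
        // Recoverable error: record it and let processing continue
        messages.add("error: " + e.getMessageAndLocation());
    }

    @Override
    public void fatalError(TransformerException e) throws TransformerException {
        // Unrecoverable error: record it, then rethrow to stop processing
        messages.add("fatal: " + e.getMessageAndLocation());
        throw e;
    }

    public List<String> getMessages() {
        return messages;
    }

    // Creates a listener and registers it on the given factory
    static CollectingErrorListener installOn(TransformerFactory factory) {
        CollectingErrorListener listener = new CollectingErrorListener();
        factory.setErrorListener(listener);
        return listener;
    }
}
```

In the validation code shown earlier, the call CollectingErrorListener.installOn(factory) would go immediately after TransformerFactory.newInstance(), and getMessages() could be inspected once transform() has returned or thrown.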

To validate against a schema without hard-coding the URI of the schema into the source document, preload the schema into the TransformerFactory. This extended example (again from the sample application QuickValidator.java) illustrates this:


  try {
      System.setProperty("javax.xml.transform.TransformerFactory",
                         "com.saxonica.SchemaAwareTransformerFactory");
      TransformerFactory factory = TransformerFactory.newInstance();
      factory.setAttribute(FeatureKeys.SCHEMA_VALIDATION, Boolean.TRUE);
      if (args.length > 1) {
          // Optional second argument: a schema to preload into the factory
          StreamSource schema = new StreamSource(new File(args[1]).toURI().toString());
          ((SchemaAwareTransformerFactory)factory).addSchema(schema);
      }
      Transformer trans = factory.newTransformer();
      StreamSource source = new StreamSource(new File(args[0]).toURI().toString());
      SAXResult sink = new SAXResult(new DefaultHandler());
      trans.transform(source, sink);
  } catch (TransformerException err) {
      System.err.println("Validation failed");
  }

You can preload as many schemas as you like using the addSchema method. Such schemas are parsed, validated, and compiled once, and can be used as often as you like for validating multiple source documents. You cannot unload a schema once it has been loaded. If you want to remove or replace a schema, start afresh with a new TransformerFactory.

Behind the scenes, the TransformerFactory uses a Configuration object to hold all the configuration information. The basic Saxon product uses the class net.sf.saxon.TransformerFactoryImpl for the TransformerFactory, and net.sf.saxon.Configuration for the underlying configuration information. The schema-aware product subclasses these with com.saxonica.SchemaAwareTransformerFactory and com.saxonica.SchemaAwareConfiguration respectively. You can get hold of the configuration object by casting the TransformerFactory to a Saxon TransformerFactoryImpl and calling the getConfiguration() method. This gives you more precise control: for example, it allows you to retrieve the Schema object containing the schema components for a given target namespace, and to inspect the compiled schema to establish its properties. See the JavaDoc documentation for further details.

Saxon currently implements its own API for access to the schema components. This API should be regarded as temporary. In the longer term, it is likely that Saxon will offer an API for schema access that has been proposed in a member submission to W3C.

The programming approach outlined above, of using an identity transformer, is suitable for a wide class of applications. For example, it enables you to insert a validation step into a SAX-based pipeline. However, for finer control, there are lower-level interfaces available in Saxon that you can also use. See for example the JavaDoc for the SchemaAwareConfiguration class, which includes methods such as getElementValidator. This method constructs a Receiver that acts as a validating XML event filter and can be inserted into a pipeline of Receivers. Saxon also provides classes to bridge between SAX events and Receiver events: ReceivingContentHandler accepts SAX events and passes them on to a Receiver, while ContentHandlerProxy accepts Receiver events and passes them on to a SAX ContentHandler.
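The SAX pipeline idea mentioned above can be sketched with standard JAXP classes alone. The example below places an identity TransformerHandler between a SAX parser and a final ContentHandler. The class names IdentityPipelineSketch and ElementCounter, and the method countThroughIdentityStage, are hypothetical and exist only for this illustration. With the JDK's default factory, as used here, the identity stage simply forwards events and performs no validation. The assumption, not verified in this sketch, is that a factory configured for schema validation as shown earlier would make the same stage validate.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: parser -> identity stage -> final handler
class IdentityPipelineSketch {

    // Final pipeline stage: counts the element start events it receives
    static class ElementCounter extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            count++;
        }
    }

    static int countThroughIdentityStage(String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Check that the factory can take part in a SAX pipeline
        if (!factory.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("Factory does not support SAX pipelines");
        }
        // A TransformerHandler created without a stylesheet is an identity stage
        TransformerHandler identity =
                ((SAXTransformerFactory) factory).newTransformerHandler();
        ElementCounter counter = new ElementCounter();
        identity.setResult(new SAXResult(counter));

        // The parser feeds the identity stage, which feeds the counter
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader reader = parserFactory.newSAXParser().getXMLReader();
        reader.setContentHandler(identity);
        reader.parse(new InputSource(new StringReader(xml)));
        return counter.count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countThroughIdentityStage("<a><b/><c/></a>"));
    }
}
```

The design point is that the identity stage is an ordinary ContentHandler, so it can be placed anywhere in an existing chain of SAX handlers without changing the stages on either side of it.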
