DtdToHaskell is a tool for translating any valid XML DTD into equivalent Haskell types; Text.XML.HaXml.Xml2Haskell provides the class framework that the generated code relies on. This allows you to generate, edit, and transform documents as normal typed values in programs, and to read and write them as human-readable XML documents.
Usage: DtdToHaskell [dtdfile [outfile]]
(Missing file arguments or dashes (-) indicate stdin
or stdout respectively.)
The program reads and parses a DTD from dtdfile (which may be either just a DTD, or a full XML document containing an internal DTD). It generates into outfile a Haskell module containing a collection of type definitions plus some class instance declarations for I/O.
In order to use the resulting module, you need to import it, and also to import Text.XML.HaXml.Xml2Haskell. To read and write XML files as values of the declared types, use the functions
Xml2Haskell.readXml  :: XmlContent a => FilePath -> IO a
Xml2Haskell.writeXml :: XmlContent a => FilePath -> a -> IO ()
not forgetting to resolve the overloading in one of the usual ways (e.g. by implicit context at the point of use, by an explicit type signature on the value, by using the value as an argument to a function with an explicit signature, by using `asTypeOf`, etc.).
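The ways of resolving the overloading mentioned above can be sketched without HaXml installed. The class Content and its method fromText below are hypothetical stand-ins for XmlContent and readXml (they are not part of the HaXml API), and the Title and Artist types are made up for illustration; only the type-resolution techniques carry over.

```haskell
module Main where

-- Hypothetical stand-in for the XmlContent class; NOT the HaXml API.
-- fromText plays the role of an overloaded reader such as readXml.
class Content a where
  fromText :: String -> a

-- Illustrative types, assumed for this sketch only.
newtype Title  = Title String  deriving (Eq, Show)
newtype Artist = Artist String deriving (Eq, Show)

instance Content Title  where fromText = Title
instance Content Artist where fromText = Artist

-- 1. Explicit type signature on the value.
title1 :: Title
title1 = fromText "Kind of Blue"

-- 2. Use the value as an argument to a function with an explicit signature.
credit :: Artist -> String
credit (Artist name) = "by " ++ name

creditLine :: String
creditLine = credit (fromText "Miles Davis")

main :: IO ()
main = do
  -- 3. asTypeOf borrows the type of a value whose type is already known.
  let title2 = fromText "Blue Train" `asTypeOf` title1
  print title1
  putStrLn creditLine
  print title2
```

Without one of these hints the compiler reports an ambiguous type variable, because nothing else determines which instance the overloaded reader should use.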
You will need to study the automatically-generated type declarations to write your own transformation scripts - most things are pretty obvious parallels to the DTD structure.
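As a rough illustration of that parallel, here is a hand-written approximation of the kind of declarations one might expect for a small DTD. The DTD in the comment is an assumption made up for this sketch, and the declarations are not actual DtdToHaskell output (which also includes class instances for I/O); always check the generated module itself.

```haskell
module Main where

{- Assumed input DTD (illustrative only):

   <!ELEMENT album  (title, artist, track*)>
   <!ELEMENT title  (#PCDATA)>
   <!ELEMENT artist (#PCDATA)>
   <!ELEMENT track  EMPTY>
   <!ATTLIST track  title  CDATA #REQUIRED
                    timing CDATA #IMPLIED>
-}

-- A sequence of child elements becomes a constructor with one argument
-- per child; a repeated child (track*) becomes a list.
data Album = Album Title Artist [Track] deriving (Eq, Show)

-- Text-only elements wrap a String.
newtype Title  = Title String  deriving (Eq, Show)
newtype Artist = Artist String deriving (Eq, Show)

-- Attributes become named fields; an optional (#IMPLIED) attribute
-- is modelled here as a Maybe.
data Track = Track
  { trackTitle  :: String
  , trackTiming :: Maybe String
  } deriving (Eq, Show)

-- A document is then an ordinary typed value.
sample :: Album
sample =
  Album (Title "Kind of Blue") (Artist "Miles Davis")
    [ Track { trackTitle = "So What",       trackTiming = Just "9:22" }
    , Track { trackTitle = "Blue in Green", trackTiming = Nothing }
    ]

-- A transformation is an ordinary function over those types.
trackTitles :: Album -> [String]
trackTitles (Album _ _ tracks) = map trackTitle tracks

main :: IO ()
main = mapM_ putStrLn (trackTitles sample)
```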
Limitations
We mangle tag names and attribute names to ensure that they have the
correct lexical form in Haskell, but this means that (for instance) we
can't distinguish Myname and myname, which are
different names in XML but translate to overlapping types in Haskell
(and hence probably won't compile).
Attribute names translate into named fields. Because Haskell does not allow different types in the same module to share a field name, an XML document that uses the same name for similar attributes on different tags would crash and burn: the generated module would not compile. We have fixed this by incorporating the tag name into the field name in addition to the attribute name, e.g. tagAttr instead of just attr. Uglier, but more portable.
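The two naming issues above can be sketched as follows. The functions sanitise, typeName, and fieldName are hypothetical and only approximate the idea; the exact mangling rules used by DtdToHaskell are an assumption here and may differ.

```haskell
module Main where

import Data.Char (isAlphaNum, toLower, toUpper)

-- Replace characters that cannot appear in a Haskell identifier.
-- (Assumed rule, for illustration only.)
sanitise :: String -> String
sanitise = map (\c -> if isAlphaNum c then c else '_')

-- Hypothetical type-name rule: sanitise, then force an upper-case
-- initial, since Haskell type names must start with a capital.
typeName :: String -> String
typeName tag = case sanitise tag of
  []     -> []
  (c:cs) -> toUpper c : cs

-- Hypothetical field-name rule: tag name with a lower-case initial,
-- followed by the capitalised attribute name.
fieldName :: String -> String -> String
fieldName tag attr = lowerFirst (sanitise tag) ++ typeName attr
  where
    lowerFirst []     = []
    lowerFirst (c:cs) = toLower c : cs

main :: IO ()
main = do
  -- Two distinct XML names that collapse to one Haskell type name.
  print (typeName "Myname", typeName "myname")
  -- The same attribute on two tags yields two distinct field names.
  print (fieldName "album" "title", fieldName "track" "title")
```

Under these assumed rules, Myname and myname both map to the type name Myname, which is the collision described above, while the title attribute on album and on track yields the separate fields albumTitle and trackTitle.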
XML namespaces. Currently, we just mangle the namespace identifier into any tag name which uses it. Probably the right way to do it is to regard the namespace as a separate imported module, and hence translate the namespace prefix into a module qualifier. Does this sound about right?
External subset. Since HaXml release 1.00, we support the XML DTD external subset. This means we can read and parse a whole bunch of files as part of the same DTD, and we respect INCLUDE and IGNORE conditional sections.
There are some fringe parts of the DTD we are not entirely sure about - Tokenised Types and Notation Types. In particular, there is no validity checking of these external references. If you find a problem, mail us: Malcolm.Wallace@cs.york.ac.uk