The Documentation Format

The document type definition (DTD) follows the HTML 4.0 draft, minus all the interactive features (forms, scripts, applets), plus the MathHTML draft 97-07-10 (e.g., for browsers supporting MathHTML: $(x_a^b + y + z \sum_{i=1}^{n} s_i)^{25} = \alpha$) and a number of extensions. Even though these drafts are evolving, the processing tools are modular and can easily be adapted. Moreover, with a well-structured format that describes content, it is relatively easy to migrate documents to a different expression of the structure (i.e., different tag names). The extensions are mostly accommodated within the existing tags and attributes. They are detailed in the following sections.

Title page information

The following elements must be present in root HTML files used to generate printed documentation. They are ignored in included documents.


<DIV CLASS=HEAD>
<SPAN CLASS="AUTHOR">John Doe<A HREF="#adr1" REL="AFFILIATION"></A></SPAN>
<SPAN CLASS="AUTHOR">Mary Doe<A HREF="#adr2" REL="AFFILIATION"></A></SPAN>
<ADDRESS ID="adr1">
Newsletter editor<BR>
8723 Buena Vista, Smallville, CT 01234<BR>
Tel: +1 (123) 456 7890<BR>
email: jd@sgml.com
</ADDRESS>
<ADDRESS ID="adr2">
Newsletter editor<BR>
8000 Buena Vista, Bigville, VT 01234<BR>
Tel: +1 (123) 456 7891<BR>
email: md@sgml.com
</ADDRESS>
<SPAN CLASS="COPYRIGHT">Copyright 1997 under the General Public License,
  see file COPYING.</SPAN>
<SPAN CLASS="DATE">23 Jan 1997 16:05:31 GMT</SPAN>
<SPAN CLASS="KEYWORD">beta</SPAN>
<SPAN CLASS="KEYWORD">text</SPAN>
<SPAN CLASS="KEYWORD">SGML</SPAN>
</DIV>
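As an illustration of how a processing tool might read this block, here is a minimal sketch using Python's standard `html.parser` module. The class name `HeadInfoParser` and the shape of the collected data are assumptions made for this example, not part of the format.

```python
from html.parser import HTMLParser


class HeadInfoParser(HTMLParser):
    """Collect authors, affiliation links, and keywords from a DIV CLASS=HEAD."""

    def __init__(self):
        super().__init__()
        self.authors = []        # list of {"name": str, "affiliation": str or None}
        self.keywords = []
        self._span_class = None  # class of the SPAN currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "span":
            self._span_class = (attrs.get("class") or "").upper()
            if self._span_class == "AUTHOR":
                self.authors.append({"name": "", "affiliation": None})
        elif tag == "a" and self._span_class == "AUTHOR":
            if (attrs.get("rel") or "").upper() == "AFFILIATION":
                # HREF="#adr1" points at the ADDRESS element with ID="adr1".
                href = attrs.get("href") or ""
                self.authors[-1]["affiliation"] = href.lstrip("#")

    def handle_endtag(self, tag):
        if tag == "span":
            self._span_class = None

    def handle_data(self, data):
        if self._span_class == "AUTHOR":
            self.authors[-1]["name"] += data.strip()
        elif self._span_class == "KEYWORD":
            self.keywords.append(data.strip())


sample = '''<DIV CLASS=HEAD>
<SPAN CLASS="AUTHOR">John Doe<A HREF="#adr1" REL="AFFILIATION"></A></SPAN>
<SPAN CLASS="KEYWORD">beta</SPAN>
</DIV>'''

parser = HeadInfoParser()
parser.feed(sample)
parser.close()
print(parser.authors, parser.keywords)
```

The sketch only follows the affiliation link as far as the target identifier; looking up the matching ADDRESS element is left out.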

Sectioning

The document structure is based on nested sections, each section starting with a title. In HTML, the nesting of sections is implicit and deduced from the heading level. Besides the core sections, there are the abstract and the appendices which need to be suitably identified. The abstract may be used as a summary for the document. The appendices may be grouped, numbered separately, and placed at the end when several HTML files are grouped during a linearization.

<HEAD><TITLE>This is the title but is redundant</TITLE>
</HEAD><BODY>
<H1>This is the title for real</H1>
  <P> First paragraph </P>
<DIV CLASS=ABSTRACT>
    <P>A real short abstract. </P>
</DIV>
<H2>Introduction</H2>
  <P>This is all about gnus and gnats as usual. </P>
<H2>Motivation</H2>
  <H3>History</H3>
    <P> Once upon a time... </P>
<DIV CLASS=APPENDIX>
  <H2>Biological characteristics of gnus</H2>
    ...
  <H2>Biological characteristics of gnats</H2>
    ...
</DIV>
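The way nesting can be deduced from heading levels may be sketched as follows. This is an illustrative Python sketch, not the actual processing tool; the function name `nest_sections` and the dictionary layout are assumptions.

```python
def nest_sections(headings):
    """Turn a flat list of (level, title) pairs into a nested tree.

    A heading becomes a child of the closest preceding heading with a
    smaller level, which is how nesting is deduced from H1..H6.
    """
    root = {"level": 0, "title": None, "children": []}
    stack = [root]  # path from the root to the most recent heading
    for level, title in headings:
        # Close every open section at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "title": title, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


tree = nest_sections([
    (1, "This is the title for real"),
    (2, "Introduction"),
    (2, "Motivation"),
    (3, "History"),
])
```

Here "History" ends up nested under "Motivation", which is itself nested under the H1 title alongside "Introduction".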

Bibliographical references

In a hypertext document, references may point to an HTML file on the Web, to a bibliographic entry for the referenced document, or to both. The following classes distinguish the three cases; the treatment of each is described with it.

BIB.REF
The URL is added to the text in parentheses. When no class is specified, BIB.REF is assumed.
BIB.ENTRY
The entry found at the URL is added to the table of references.
BIB.NOREF
Nothing is added to the table of references, usually because an accompanying BIB.ENTRY link already supplies the entry that belongs in the table of references.

The referenced bibliographic entries must use the following classes, with the specified fields, modeled on the BibTeX entry types used with LaTeX.

BIB.ARTICLE
Fields: author, title, journal, year, and optionally volume, number, pages, month, url, note.
BIB.BOOK
Fields: author or editor, title, publisher, year, and optionally volume/number, series, address, edition, month, url, note.
BIB.BOOKLET
Fields: title, and optionally author, howpublished, address, month, year, url, note.
BIB.INBOOK
Fields: author or editor, title, chapter and/or pages, publisher, year, and optionally volume or number, series, type, address, edition, month, url, note.
BIB.INCOLLECTION
Fields: author, title, booktitle, publisher, year, and optionally volume or number, series, type, chapter, pages, address, edition, month, url, note.
BIB.INPROCEEDINGS
Fields: author, title, booktitle, year, and optionally editor, volume or number, series, pages, address, month, organization, publisher, url, note.
BIB.MANUAL
Fields: title, and optionally author, organization, address, edition, month, year, url, note.
BIB.MSTHESIS
Fields: author, title, school, year, and optionally type, address, month, url, note.
BIB.PHDTHESIS
Fields: author, title, school, year, and optionally type, address, month, url, note.
BIB.MISCENTRY
Fields: optionally author, title, howpublished, month, year, url, note.
BIB.PROCEEDINGS
Fields: title, year, and optionally editor, volume or number, series, address, month, organization, publisher, url, note.
BIB.TECHREPORT
Fields: author, title, institution, year, and optionally type, number, address, month, url, note.
BIB.UNPUBLISHED
Fields: author, title, note, and optionally month, year, url.
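
A checking tool could verify that an entry carries its required fields. The following Python sketch covers only four of the classes above; the table `REQUIRED` and the function `missing_fields` are hypothetical names chosen for this illustration.

```python
# Required fields for a few entry classes, copied from the list above.
# Each inner tuple is a group of alternatives: one of them must be present.
REQUIRED = {
    "BIB.ARTICLE": [("author",), ("title",), ("journal",), ("year",)],
    "BIB.BOOK": [("author", "editor"), ("title",), ("publisher",), ("year",)],
    "BIB.TECHREPORT": [("author",), ("title",), ("institution",), ("year",)],
    "BIB.UNPUBLISHED": [("author",), ("title",), ("note",)],
}


def missing_fields(entry_class, fields):
    """Return the required field groups that `fields` does not satisfy."""
    present = {name.lower() for name in fields}
    return [
        " or ".join(group)
        for group in REQUIRED[entry_class]
        if not any(name in present for name in group)
    ]


# Field names are the CLASS values of the SPAN elements inside an entry.
print(missing_fields("BIB.ARTICLE", ["AUTHOR", "TITLE", "YEAR"]))
```

Field names are compared case-insensitively here, which is an assumption; the format itself writes them in upper case.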

The document then contains a number of references, some of which point to entries, usually in a separate bibliographic database.

Document.html:

The entries are based on the bibtex/LaTeX
  <A REL="BIB.NOREF" HREF="http://www.latex.org/manual.html">[LaTeX]</A>
  <A REL="BIB.ENTRY" HREF="../bib/BibEntries.html#lamport1985"></A>,
a popular typesetting system for gnus and gnats.

BibEntries.html:

<DIV ID="Doe1997" CLASS=BIB.ARTICLE>
<SPAN CLASS=AUTHOR>John Doe</SPAN>
<SPAN CLASS=AUTHOR>Mary Doe</SPAN>
<SPAN CLASS=TITLE>Gnus and Gnats</SPAN>
<SPAN CLASS=JOURNAL>Software Review</SPAN>
<SPAN CLASS=YEAR>1997</SPAN>
</DIV>
<DIV ID="lamport1985" CLASS=BIB.BOOK>
<SPAN CLASS=AUTHOR>Leslie Lamport</SPAN>
<SPAN CLASS=TITLE>LaTeX User's Guide and Reference Manual</SPAN>
<SPAN CLASS=PUBLISHER>Addison-Wesley</SPAN>
<SPAN CLASS=ADDRESS>Reading, Massachusetts</SPAN>
<SPAN CLASS=YEAR>1985</SPAN>
</DIV>

Internal References

Bibliographical references are used to access material outside of the current document. Internal references point the reader to a section, table, figure, or similar target through its number (the default, and when REL=REF.NUMBER) or its page (when REL=REF.PAGE). The corresponding target must be named with the ID or NAME attribute.

The evolution of gnu populations is shown in Table
  <A REL=REF.NUMBER HREF="#gnutable">[gnu table]</A>,
on page
  <A REL=REF.PAGE HREF="#gnutable">[]</A>,
...
<TABLE ID="gnutable">
  <CAPTION> Gnu populations in North America from 1800 to 1900</CAPTION>
  <TR>...
</TABLE>

Indexing

Index marks collect, for a document, the list of pages where an important topic is discussed. In some cases, begin and end marks delineate a section where the topic is discussed, and the corresponding page range appears in the index. The term printed in the index may not be the correct key for sorting: terms may start with a capital letter, or be emphasized to indicate that the term is first defined here.

A simple index mark is indicated by a SPAN element of CLASS INDEX.MARK. It contains a list of usually no more than three SPAN elements of class INDEX.KEY, each possibly followed by a SPAN element of CLASS INDEX.TEXT when the text to print differs from the sorting key. It may end with a list of SPAN elements of CLASS INDEX.SEE to refer to another index item.

When a text range is to be delineated for the index, two index marks are used, one with CLASS INDEX.MARK.BEGIN and the other with CLASS INDEX.MARK.END. These may not contain INDEX.SEE elements, and the contained INDEX.KEY and INDEX.TEXT elements must match those in the corresponding begin/end mark.

This section discusses the Gnu population size variations over time and
geographical area.
<SPAN CLASS=INDEX.MARK>
  <SPAN CLASS=INDEX.KEY>gnu</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnu</EM></SPAN> 
    <!-- Gnu is emphasized because first defined here -->
  <SPAN CLASS=INDEX.KEY>population</SPAN>
  <SPAN CLASS=INDEX.TEXT>Population</SPAN>
  <SPAN CLASS=INDEX.KEY>size</SPAN>
</SPAN>
<SPAN CLASS=INDEX.MARK>
  <SPAN CLASS=INDEX.KEY>gnu</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnu</EM></SPAN>
  <SPAN CLASS=INDEX.KEY>population</SPAN>
  <SPAN CLASS=INDEX.TEXT>Population</SPAN>
  <SPAN CLASS=INDEX.KEY>growth</SPAN>
  <SPAN CLASS=INDEX.SEE>gnu</SPAN>
  <SPAN CLASS=INDEX.SEE>population</SPAN>
  <SPAN CLASS=INDEX.SEE>size</SPAN>
</SPAN>
Various causes affect the Gnu population size.
<SPAN CLASS=INDEX.MARK.BEGIN>
  <SPAN CLASS=INDEX.KEY>gnat</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnat</EM></SPAN>
</SPAN>
Most significantly, the presence of gnats has a direct correlation with
gnu health problems.
<SPAN CLASS=INDEX.MARK.END>
  <SPAN CLASS=INDEX.KEY>gnat</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnat</EM></SPAN>
</SPAN>
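
The separation between sorting key and printed text can be sketched in Python as follows. The data layout (a list of key/text pairs per index mark), the function names, and the comma used to join levels are assumptions made for this illustration.

```python
def index_line(levels):
    """Printed form of one index entry: INDEX.TEXT if given, else INDEX.KEY."""
    return ", ".join(text if text is not None else key for key, text in levels)


def sort_index(entries):
    """Order entries by their INDEX.KEY values, ignoring the printed text."""
    return sorted(entries, key=lambda levels: [key for key, _ in levels])


# Each entry is a list of (INDEX.KEY, INDEX.TEXT or None) pairs, one per level.
entries = [
    [("gnu", "<EM>Gnu</EM>"), ("population", "Population"), ("size", None)],
    [("gnat", "<EM>Gnat</EM>")],
    [("gnu", "<EM>Gnu</EM>"), ("population", "Population"), ("growth", None)],
]
printed = [index_line(entry) for entry in sort_index(entries)]
```

Sorting on the keys places "gnat" before both "gnu" entries and "growth" before "size", regardless of the capitalization or emphasis in the printed text.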

Figures

The current HTML practice is to use bitmaps for anything graphical; in some cases, a vector format like PostScript or CGM is used. Representing diagrams as structured elements allows further editing and reuse of the diagrams, ensures that they can be rendered at the full resolution of the output device, and allows spell checking and cut/paste of any text within the diagram.

The following elements are introduced for representing figures. All positions and sizes are floating-point numbers expressed in points.

FIGURE
This element contains a CAPTION element as its first child, followed by all the other elements. It has the attributes WIDTH and HEIGHT.
RECTANGLE
This element has the attributes VPOS, HPOS, WIDTH, HEIGHT, PENCOLOR, FILLCOLOR, DEPTH, PENSTYLE, FILLSTYLE, PENWIDTH, JOIN, CORNERRADIUS.
CIRCLE
This element has the attributes VPOS, HPOS, RADIUS, PENCOLOR, FILLCOLOR, DEPTH, PENSTYLE, FILLSTYLE.
ELLIPSE
This element has the attributes VPOS, HPOS, WIDTH, HEIGHT, PENCOLOR, FILLCOLOR, DEPTH, PENSTYLE, FILLSTYLE, PENWIDTH.
POLYLINE
This element has the attributes CLOSED, PENCOLOR, FILLCOLOR, DEPTH, PENSTYLE, FILLSTYLE, PENWIDTH, CAP, JOIN, BARROWTYPE, BARROWSIZE, EARROWTYPE, EARROWSIZE, POINTS.
SPLINE
This element has the attributes CLOSED, INTERPOLATED, PENCOLOR, FILLCOLOR, DEPTH, PENSTYLE, FILLSTYLE, PENWIDTH, CAP, BARROWTYPE, BARROWSIZE, EARROWTYPE, EARROWSIZE, POINTS.
PICTURE
This element has the attributes VPOS, HPOS, WIDTH, HEIGHT, DEPTH, SRC, ALT.
ARC
This element has the attributes P1, P2, P3, PIE, PENCOLOR, FILLCOLOR, DEPTH, PENSTYLE, FILLSTYLE, CAP, BARROWTYPE, BARROWSIZE, EARROWTYPE, EARROWSIZE.
GTEXT
This element may contain any block-level element (lists, paragraphs...) and has the following attributes: VPOS, HPOS, WIDTH, HEIGHT, DEPTH.
GGROUP
This element may contain any element acceptable in FIGURE, except for CAPTION. It has the following attributes: VPOS, HPOS, WIDTH, HEIGHT, DEPTH, TRANSFORM.

Each attribute is detailed below.

ALT
String describing the associated PICTURE, for browsers without graphics capabilities.
BARROWSIZE
Three blank separated floating point numbers representing the arrow thickness, width and height, for the arrow at the beginning of a polyline, spline, or arc.
BARROWTYPE
One of TRIANGLE, FTRIANGLE (filled), DIAMOND, FDIAMOND, V, HOLLOW, FHOLLOW. It describes the shape of the arrow head, if any, at the beginning of a polyline, spline or arc.
CAP
One of BUTT, ROUND, or PROJECTING for the end of a polyline, spline, or arc.
CLOSED
When set to CLOSED, it indicates that the polyline or spline is a closed figure (i.e. an edge is added from the last point to the first).
CORNERRADIUS
A floating point number indicating the radius of the rounded corners for a rectangle. The default is 0, which corresponds to square corners for a rectangle.
DEPTH
The depth is a cardinal number determining the order in which objects are drawn. Objects are drawn in decreasing depth order. Thus, deeper objects are hidden behind the others. The depth for a GGROUP serves as offset for the whole group.
EARROWSIZE
Just like BARROWSIZE but for the end arrow.
EARROWTYPE
Just like BARROWTYPE but for the end arrow.
FILLCOLOR
Color used to fill the area within graphical objects. The color may be expressed as three floating point numbers between 0 and 1 representing the red, green and blue intensity, or as a color name.
FILLSTYLE
The fill style, described by a bitmap (URL), or the name of a builtin style.
HEIGHT
Height allowed for the graphical element. It determines the area available for text objects, twice the vertical radius for ellipse, the area into which to fit a picture, and the clipping boundary for group objects.
HPOS
Horizontal position of the anchor point for the graphical object (center for circle and ellipse, lower left corner for rectangle, picture, group, and text objects).
INTERPOLATED
When set to INTERPOLATED, the spline passes through all the control points.
JOIN
One of MITER, ROUND, BEVEL.
P1
Start point (x y) for an arc.
P2
Point on the arc, somewhere between P1 and P3.
P3
End point for an arc.
PENCOLOR
Color for the outline of graphics shapes. It may be expressed as red, green, and blue intensities (in the [0,1] interval), or as a color name.
PENSTYLE
Blank separated list of floating point numbers, the first being the offset within the dash pattern, and the following numbers representing the length of solid line and space, in alternation, within the pattern.
PENWIDTH
Width of the outline for graphics shapes.
PIE
When set to PIE, an arc is closed in a pie shape.
POINTS
List of blank separated x y floating point coordinates for polylines and splines.
RADIUS
Floating point number for the radius of circles.
SRC
URL for PICTURE elements.
TRANSFORM
Six blank separated floating point numbers representing the transformation matrix to apply to graphical objects contained within a group. The transformation of text objects may not be properly handled on all viewing systems, especially rotations.
VPOS
Vertical position of the anchor point for the graphical object (center for circle and ellipse, upper left corner for rectangle, picture, group, and text objects).
WIDTH
Width allowed for the graphical element. It determines the area available for text objects, twice the horizontal radius for ellipse, the area into which to fit a picture, and the clipping boundary for group objects.
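
As an example of reading one of these attribute values, the PENSTYLE description above can be sketched in Python. The function name `parse_penstyle` is hypothetical, and the requirement that solid and space lengths come in complete pairs is an assumption of this sketch rather than a stated rule of the format.

```python
def parse_penstyle(value):
    """Split a PENSTYLE string into (offset, [(solid, space), ...])."""
    numbers = [float(token) for token in value.split()]
    if not numbers:
        raise ValueError("PENSTYLE needs at least the offset")
    offset, lengths = numbers[0], numbers[1:]
    # Assumption: every solid length is followed by a space length.
    if len(lengths) % 2:
        raise ValueError("solid and space lengths must come in pairs")
    pairs = list(zip(lengths[0::2], lengths[1::2]))
    return offset, pairs


# Offset 1.5 into a pattern of a long dash, a gap, a short dash, a gap.
offset, pattern = parse_penstyle("1.5 6 3 1 3")
```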

The following example combines these elements in one figure.

<FIGURE WIDTH=400 HEIGHT=600>
<CAPTION>Schema of cultural exchanges between gnus and gnats
</CAPTION>
<RECTANGLE HPOS=100 VPOS=100 WIDTH=200 HEIGHT=150 FILLCOLOR="yellow"/>
<CIRCLE HPOS=200 VPOS=200 RADIUS=100 FILLCOLOR="pink"/>
<ELLIPSE HPOS=200 VPOS=400 WIDTH=200 HEIGHT=100 FILLCOLOR="red"/>
<POLYLINE CLOSED PENCOLOR="black" DEPTH=2 PENWIDTH=2.5 
 POINTS="100.0 100.0 300 100 300 200 100 200"/>
<SPLINE CLOSED FILLCOLOR="magenta" POINTS="10 10 20 10 20 20 10 20"/>
<PICTURE HPOS=200 VPOS=400 HEIGHT=100 WIDTH=100 SRC="gnu.ppm"
 ALT="A gnu in sunset"/>
<ARC P1="200 200" P2="200 300" P3="300 300" CAP="BUTT"
 BARROWTYPE=DIAMOND BARROWSIZE="0 30 60"/>
<GTEXT VPOS=300 HPOS=200 HEIGHT=300 WIDTH=200>
  <P>Gnus sometimes have difficult cultural exchanges with gnats, as
     demonstrated by the surrounding <EM>diagram.</EM></P>
</GTEXT>
<GGROUP VPOS=300 HPOS=200 TRANSFORM="0 .707 .707 0 0 0">
  <ELLIPSE FILLCOLOR="black" HPOS=0 VPOS=0 WIDTH=200 HEIGHT=100/>
</GGROUP>
</FIGURE>
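
The drawing order implied by DEPTH, including the offset contributed by a GGROUP, can be sketched in Python. The dictionary layout and the function name `draw_order` are hypothetical, and two points are assumptions: an object without a DEPTH attribute is treated as depth 0, and objects of equal depth keep their document order.

```python
def draw_order(objects, offset=0):
    """Return (depth, name) pairs in the order they should be drawn.

    `objects` is a list of dicts with "name", optional "depth", and, for
    groups, "children". A group's depth is added to each of its members.
    """
    flat = []

    def collect(items, base):
        for item in items:
            depth = base + item.get("depth", 0)  # assumed default: 0
            if "children" in item:
                collect(item["children"], depth)
            else:
                flat.append((depth, item["name"]))

    collect(objects, offset)
    # Deepest first, so shallower objects are painted over them.
    # sorted() is stable: equal depths keep document order (an assumption).
    return sorted(flat, key=lambda pair: -pair[0])


order = draw_order([
    {"name": "rectangle"},
    {"name": "polyline", "depth": 2},
    {"name": "group", "depth": 1, "children": [
        {"name": "ellipse", "depth": 3},
        {"name": "circle"},
    ]},
])
```

In this example the ellipse inside the group has an effective depth of 4 and is drawn first, while the rectangle at depth 0 is drawn last and therefore appears on top.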