The document type definition (DTD) follows the HTML 4.0 draft, minus all the interactive features (forms, scripts, applets), plus the MathHTML draft 97-07-10 (e.g. for browsers supporting MathHTML) and a number of extensions. Even though these drafts are evolving, the processing tools are modular and can easily be adapted. Moreover, with a well-structured format that describes content, it is relatively easy to migrate documents to a different expression of the structure (i.e. different tag names). The extensions are mostly accommodated within the existing tags and attributes. They are detailed in the following sections.
The following elements must be present in root HTML files used to generate printed documentation. They are ignored in included documents.
<DIV CLASS=HEAD>
  <SPAN CLASS="AUTHOR">John Doe<A HREF="#adr1" REL="AFFILIATION"></A></SPAN>
  <SPAN CLASS="AUTHOR">Mary Doe<A HREF="#adr2" REL="AFFILIATION"></A></SPAN>
  <ADDRESS ID="adr1">
    Newsletter editor<BR>
    8723 Buena Vista, Smallville, CT 01234<BR>
    Tel: +1 (123) 456 7890<BR>
    email: jd@sgml.com
  </ADDRESS>
  <ADDRESS ID="adr2">
    Newsletter editor<BR>
    8000 Buena Vista, Bigville, VT 01234<BR>
    Tel: +1 (123) 456 7891<BR>
    email: md@sgml.com
  </ADDRESS>
  <SPAN CLASS="COPYRIGHT">Copyright 1997 under the General Public License, see file COPYING.</SPAN>
  <SPAN CLASS="DATE">23 Jan 1997 16:05:31 GMT</SPAN>
  <SPAN CLASS="KEYWORD">beta</SPAN>
  <SPAN CLASS="KEYWORD">text</SPAN>
  <SPAN CLASS="KEYWORD">SGML</SPAN>
</DIV>
The document structure is based on nested sections, each section starting with a title. In HTML, the nesting of sections is implicit and deduced from the heading level. Besides the core sections, there are the abstract and the appendices, which need to be suitably identified. The abstract may be used as a summary for the document. The appendices may be grouped, numbered separately, and placed at the end when several HTML files are grouped during a linearization.
<HEAD><TITLE>This is the title but is redundant</TITLE>
</HEAD><BODY>
<H1>This is the title for real</H1>
<P> First paragraph </P>
<DIV CLASS=ABSTRACT>
  <P>A real short abstract. </P>
</DIV>
<H2>Introduction</H2>
<P>This is all about gnus and gnats as usual. </P>
<H2>Motivation</H2>
<H3>History</H3>
<P> Once upon a time... </P>
<DIV CLASS=APPENDIX>
  <H2>Biological characteristics of gnus</H2>
  ...
  <H2>Biological characteristics of gnats</H2>
  ...
</DIV>
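Since the nesting of sections is only implicit in the heading levels, a processing tool has to rebuild it. The following is a minimal sketch of one way to do so, assuming Python's standard html.parser module; the names OutlineParser and build_outline are hypothetical and not part of the tools described here. It ignores the ABSTRACT and APPENDIX classes and only nests H1 to H6 headings.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (level, title) pairs for H1..H6 headings, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, title)
        self._level = None   # level of the heading being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <H2> arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def build_outline(html):
    """Nest headings into a tree of {"title", "level", "children"} nodes."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for level, title in parser.headings:
        # Close every open section at the same or a deeper level.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"title": title, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root
```

Applied to the example above, the H3 "History" would end up as a child of the H2 "Motivation", which is itself a child of the H1 title.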
In a hypertext document, references may point to an HTML file on the Web, to a bibliographic entry for the referenced document, or to both. The following classes are used to distinguish the three cases, and their treatment is described.
The referenced bibliographic entries must use the following classes, with the specified fields, inspired by LaTeX BibTeX entries.
The document then contains a number of references, some of which point to entries, usually in a separate bibliographic database.
Document.html:

The entries are based on the bibtex/LaTeX
<A REL="BIB.NOREF" HREF="http://www.latex.org/manual.html">[LaTeX]</A>
<A REL="BIB.ENTRY" HREF="../bib/BibEntries.html#lamport1985"></A>,
a popular typesetting system for gnus and gnats.

BibEntries.html:

<DIV ID="Doe1997" CLASS=BIB.ARTICLE>
  <SPAN CLASS=AUTHOR>John Doe</SPAN>
  <SPAN CLASS=AUTHOR>Mary Doe</SPAN>
  <SPAN CLASS=TITLE>Gnus and Gnats</SPAN>
  <SPAN CLASS=JOURNAL>Software Review</SPAN>
  <SPAN CLASS=YEAR>1997</SPAN>
</DIV>
<DIV ID="lamport1985" CLASS=BIB.BOOK>
  <SPAN CLASS=AUTHOR>Leslie Lamport</SPAN>
  <SPAN CLASS=TITLE>LaTeX User's Guide and Reference Manual</SPAN>
  <SPAN CLASS=PUBLISHER>Addison-Wesley</SPAN>
  <SPAN CLASS=ADDRESS>Reading, Massachusetts</SPAN>
  <SPAN CLASS=YEAR>1985</SPAN>
</DIV>
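To illustrate how a reference could be matched with its entry, here is a minimal sketch that reads a bibliographic database such as BibEntries.html and looks up the fragment of an HREF. It assumes Python's standard html.parser module; BibParser and lookup are hypothetical names, and the sketch treats any DIV whose CLASS starts with BIB. as an entry, which is an assumption about the class list rather than a statement of it.

```python
from html.parser import HTMLParser


class BibParser(HTMLParser):
    """Map each entry ID to {"class": ..., "fields": [(field, text), ...]}."""

    def __init__(self):
        super().__init__()
        self.entries = {}
        self._entry = None   # entry currently being filled
        self._field = None   # field name of the SPAN being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = (attrs.get("class") or "").upper()
        if tag == "div" and cls.startswith("BIB.") and "id" in attrs:
            self._entry = {"class": cls, "fields": []}
            self.entries[attrs["id"]] = self._entry
        elif tag == "span" and self._entry is not None and cls:
            self._field = cls
            self._text = []

    def handle_data(self, data):
        if self._field is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            text = "".join(self._text).strip()
            self._entry["fields"].append((self._field, text))
            self._field = None
        elif tag == "div":
            self._entry = None


def lookup(entries, href):
    """Resolve an HREF such as '../bib/BibEntries.html#lamport1985'."""
    _, _, fragment = href.partition("#")
    return entries.get(fragment)
```

Fields are kept as an ordered list rather than a dictionary so that repeated fields, such as the two AUTHOR elements of the Doe1997 entry, are preserved in order.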
Bibliographical references are used to access material outside of the current document. Internal references point the reader to a section, table, figure... through its number (by default and when REL=REF.NUMBER), or page (when REL=REF.PAGE). The corresponding target (section, table, figure...) must be named with the ID or NAME attribute.
The evolution of gnu populations is shown in Table
<A REL=REF.NUMBER HREF="#gnutable">[gnu table]</A>, on page
<A REL=REF.PAGE HREF="#gnutable">[]</A>, ...

<TABLE ID="gnutable">
  <CAPTION> Gnu populations in North America from 1800 to 1900</CAPTION>
  <TR>...
</TABLE>
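Because a reference may appear before its target, numbers can only be filled in once the whole document has been read. The sketch below shows this two-step approach for tables only, assuming Python's standard html.parser module. TargetNumberer and resolve_numbers are hypothetical names, and numbering tables sequentially in document order is an assumption, not the documented numbering scheme. References with no REL attribute (the default case) and page references are left out.

```python
from html.parser import HTMLParser


class TargetNumberer(HTMLParser):
    """Number TABLE elements in document order and record internal references."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.numbers = {}   # target ID or NAME -> table number
        self.refs = []      # (REL value, target name) per internal reference

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            self.count += 1
            name = attrs.get("id") or attrs.get("name")
            if name:
                self.numbers[name] = self.count
        elif tag == "a":
            rel = (attrs.get("rel") or "").upper()
            href = attrs.get("href") or ""
            if rel.startswith("REF.") and href.startswith("#"):
                self.refs.append((rel, href[1:]))


def resolve_numbers(html):
    """Return the number to print for each REF.NUMBER reference (None if unresolved)."""
    parser = TargetNumberer()
    parser.feed(html)
    parser.close()
    return [parser.numbers.get(target)
            for rel, target in parser.refs if rel == "REF.NUMBER"]
```

A None in the result signals a reference whose target carries no matching ID or NAME, which a real tool would presumably report as an error.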
Index marks are used to collect, for a document, the list of pages where an important topic is discussed. In some cases, begin and end marks delineate a section where the topic is discussed, and the corresponding page range appears in the index. The term printed in the index may not be the correct key for sorting: terms may start with a capital letter, or be emphasized to indicate that the term is first defined here.
A simple index mark is indicated by a SPAN element of CLASS INDEX.MARK. It contains a list of usually no more than three SPAN elements of class INDEX.KEY, each possibly followed by a SPAN element of CLASS INDEX.TEXT when the text to print differs from the sorting key. It may end with a list of SPAN elements of CLASS INDEX.SEE to refer to another index item.
When a text range is to be delineated for the index, two index marks are used, one with CLASS INDEX.MARK.BEGIN and the other with CLASS INDEX.MARK.END. These must not contain INDEX.SEE elements, and the contained INDEX.KEY and INDEX.TEXT elements must match those in the corresponding begin/end mark.
This section discusses the Gnu population size variations over time
and geographical area.
<SPAN CLASS=INDEX.MARK>
  <SPAN CLASS=INDEX.KEY>gnu</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnu</EM></SPAN>
  <!-- Gnu is emphasized because first defined here -->
  <SPAN CLASS=INDEX.KEY>population</SPAN>
  <SPAN CLASS=INDEX.TEXT>Population</SPAN>
  <SPAN CLASS=INDEX.KEY>size</SPAN>
</SPAN>
<SPAN CLASS=INDEX.MARK>
  <SPAN CLASS=INDEX.KEY>gnu</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnu</EM></SPAN>
  <SPAN CLASS=INDEX.KEY>population</SPAN>
  <SPAN CLASS=INDEX.TEXT>Population</SPAN>
  <SPAN CLASS=INDEX.KEY>growth</SPAN>
  <SPAN CLASS=INDEX.SEE>gnu</SPAN>
  <SPAN CLASS=INDEX.SEE>population</SPAN>
  <SPAN CLASS=INDEX.SEE>size</SPAN>
</SPAN>
Various causes affect the Gnu population size.
<SPAN CLASS=INDEX.MARK.BEGIN>
  <SPAN CLASS=INDEX.KEY>gnat</SPAN>
  <SPAN CLASS=INDEX.TEXT><EM>Gnat</EM></SPAN>
</SPAN>
Most significantly, the presence of gnats has a direct correlation
with gnu health problems.
<SPAN CLASS=INDEX.MARK.END>
  <SPAN CLASS=INDEX.KEY>gnat</SPAN>
</SPAN>
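The distinction between the sorting key and the printed text can be sketched as follows, assuming Python's standard html.parser module. IndexParser and sorted_index are hypothetical names. The sketch handles simple INDEX.MARK elements only: it skips INDEX.SEE, the BEGIN and END marks, and page numbers, and it drops inline markup such as EM from the printed text.

```python
from html.parser import HTMLParser


class IndexParser(HTMLParser):
    """Collect simple INDEX.MARK entries as lists of (sort key, printed text)."""

    def __init__(self):
        super().__init__()
        self.marks = []    # one list of (key, text) pairs per index mark
        self._stack = []   # CLASS of each open SPAN
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        cls = (dict(attrs).get("class") or "").upper()
        self._stack.append(cls)
        if cls == "INDEX.MARK":
            self.marks.append([])
        self._text = []

    def handle_data(self, data):
        self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or not self._stack:
            return
        cls = self._stack.pop()
        inside_mark = bool(self._stack) and self._stack[-1] == "INDEX.MARK"
        if not inside_mark or not self.marks:
            return
        text = "".join(self._text).strip()
        levels = self.marks[-1]
        if cls == "INDEX.KEY":
            levels.append((text, text))          # printed text defaults to the key
        elif cls == "INDEX.TEXT" and levels:
            levels[-1] = (levels[-1][0], text)   # override the printed text


def sorted_index(html):
    """Return the index marks ordered by their keys, not by their printed text."""
    parser = IndexParser()
    parser.feed(html)
    parser.close()
    return sorted(parser.marks, key=lambda levels: [key for key, _ in levels])
```

Sorting on the keys alone means that an entry printed as "Gnu" still sorts under "gnu", which is the purpose of keeping INDEX.KEY and INDEX.TEXT separate.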
Current HTML practice is to use bitmaps for anything graphical; in some cases, a vector format like PostScript or CGM is used. Representing diagrams as structured elements allows further editing and reuse of the diagrams, ensures that the diagrams can be rendered at the full resolution of the output device, and allows spell checking and cut/paste of any text within the diagram.
The following elements are introduced for representing figures. All positions and sizes are floating point numbers in points.
Each attribute is detailed below.
<FIGURE WIDTH=400 HEIGHT=600>
  <CAPTION>Schema of cultural exchanges between gnus and gnats </CAPTION>
  <RECTANGLE HPOS=100 VPOS=100 WIDTH=200 HEIGHT=150 FILLCOLOR="yellow"/>
  <CIRCLE HPOS=200 VPOS=200 RADIUS=100 FILLCOLOR="pink"/>
  <ELLIPSE HPOS=200 VPOS=400 WIDTH=200 HEIGHT=100 FILLCOLOR="red"/>
  <POLYLINE CLOSED PENCOLOR="black" DEPTH=2 PENWIDTH=2.5
            POINTS="100.0 100.0 300 100 300 200 100 200"/>
  <SPLINE CLOSED FILLCOLOR="magenta" POINTS="10 10 20 10 20 20 10 20"/>
  <PICTURE HPOS=200 VPOS=400 HEIGHT=100 WIDTH=100 SRC="gnu.ppm"
           ALT="A gnu in sunset"/>
  <ARC P1="200 200" P2="200 300" P3="300 300" CAP="BUTT"
       BARROWTYPE=DIAMOND BARROWSIZE="0 30 60"/>
  <GTEXT VPOS=300 HPOS=200 HEIGHT=300 WIDTH=200>
    <P>Gnus sometimes have difficult cultural exchanges with gnats,
    as demonstrated by the surrounding <EM>diagram.</EM></P>
  </GTEXT>
  <GGROUP VPOS=300 HPOS=200 TRANSFORM="0 .707 .707 0 0 0">
    <ELLIPSE FILLCOLOR="black" HPOS=0 VPOS=0 WIDTH=200 HEIGHT=100/>
  </GGROUP>
</FIGURE>
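As an illustration of how a tool might read the POINTS attribute used by POLYLINE and SPLINE above, the following sketch splits the value into coordinate pairs of floating point numbers. It assumes that the numbers alternate between horizontal and vertical positions, which the example suggests but which is not confirmed here; parse_points and bounding_box are hypothetical helper names.

```python
def parse_points(value):
    """Split a POINTS attribute value into (horizontal, vertical) float pairs."""
    numbers = [float(token) for token in value.split()]
    if len(numbers) % 2:
        raise ValueError("POINTS needs an even number of coordinates")
    return list(zip(numbers[0::2], numbers[1::2]))


def bounding_box(points):
    """Return (min_h, min_v, max_h, max_v) for a non-empty list of points."""
    hs = [h for h, _ in points]
    vs = [v for _, v in points]
    return (min(hs), min(vs), max(hs), max(vs))
```

For a SPLINE the points are presumably control points, so this box bounds the control polygon and not necessarily the drawn curve.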