============================
The Docutils Document Tree
============================
A Guide to the Docutils DTD
***************************
:Author: David Goodger
:Contact: docutils-develop@lists.sourceforge.net
:Revision: $Revision: 7302 $
:Date: $Date: 2012-01-03 20:23:53 +0100 (Tue, 03 Jan 2012) $
:Copyright: This document has been placed in the public domain.
.. contents:: :depth: 1
This document describes the XML data structure of Docutils_ documents:
the relationships and semantics of elements and attributes. The
Docutils document structure is formally defined by the `Docutils
Generic DTD`_ XML document type definition, docutils.dtd_, which is
the definitive source for details of element structural relationships.
This document does not discuss implementation details. Those can be
found in internal documentation (docstrings) for the
``docutils.nodes`` module, where the document tree data structure is
implemented in a class library.
The reader is assumed to have some familiarity with XML or SGML, and
an understanding of the data structure meaning of "tree". For a list
of introductory articles, see `Introducing the Extensible Markup
Language (XML)`_.
The reStructuredText_ markup is used for illustrative examples
throughout this document. For a gentle introduction, see `A
ReStructuredText Primer`_. For complete technical details, see the
`reStructuredText Markup Specification`_.
.. _Docutils: http://docutils.sourceforge.net/
.. _Docutils Generic DTD:
.. _Docutils DTD:
.. _docutils.dtd: docutils.dtd
.. _Introducing the Extensible Markup Language (XML):
   http://xml.coverpages.org/xmlIntro.html
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _A ReStructuredText Primer: ../user/rst/quickstart.html
.. _reStructuredText Markup Specification: rst/restructuredtext.html
-------------------
Element Hierarchy
-------------------
.. contents:: :local:
Below is a simplified diagram of the hierarchy of elements in the
Docutils document tree structure. An element may contain any other
elements immediately below it in the diagram. Notes are written in
square brackets. Element types in parentheses indicate recursive or
one-to-many relationships; sections may contain (sub)sections, tables
contain further body elements, etc. ::

   +--------------------------------------------------------------------+
   | document  [may begin with a title, subtitle, decoration, docinfo]  |
   |                             +--------------------------------------+
   |                             | sections  [each begins with a title] |
   +-----------------------------+-------------------------+------------+
   | [body elements:]                                      | (sections) |
   |         | - literal | - lists  |       | - hyperlink  +------------+
   |         |   blocks  | - tables |       |   targets    |
   | para-   | - doctest | - block  | foot- | - sub. defs  |
   | graphs  |   blocks  |   quotes | notes | - comments   |
   +---------+-----------+----------+-------+--------------+
   | [text]+ | [text]    | (body elements)  | [text]       |
   | (inline +-----------+------------------+--------------+
   | markup) |
   +---------+

The Docutils document model uses a simple, recursive model for section
structure. A document_ node may contain body elements and section_
elements. Sections in turn may contain body elements and sections.
The level (depth) of a section element is determined from its physical
nesting level; unlike other document models (``<h1>`` in HTML_,
``<sect1>`` in DocBook_, ``<div1>`` in XMLSpec_) the level is not
incorporated into the element name.
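
The rule that a section's level comes only from its nesting can be
illustrated with a short Python sketch. It is not part of Docutils: it
assumes the document tree has been exported as real XML, and the
sample document and the helper name ``section_levels`` are invented
for illustration::

    import xml.etree.ElementTree as ET

    # Hypothetical sample tree; no level appears in any element name.
    SAMPLE = """
    <document>
        <section ids="a"><title>A</title>
            <section ids="a-1"><title>A.1</title>
                <section ids="a-1-1"><title>A.1.1</title></section>
            </section>
        </section>
        <section ids="b"><title>B</title></section>
    </document>
    """

    def section_levels(element, level=0):
        """Yield (level, title) pairs; level is the section nesting depth."""
        for child in element:
            if child.tag == "section":
                yield level + 1, child.findtext("title")
                # Subsections sit one level deeper than their parent.
                yield from section_levels(child, level + 1)

    levels = list(section_levels(ET.fromstring(SAMPLE)))
    for level, title in levels:
        print(level, title)

Moving a ``section`` element to a different place in the tree changes
its level without touching the element itself.
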
The Docutils document model uses strict element content models. Every
element has a unique structure and semantics, but elements may be
classified into general categories (below). Only elements which are
meant to directly contain text data have a mixed content model, where
text data and inline elements may be intermixed. This is unlike the
much looser HTML_ document model, where paragraphs and text data may
occur at the same level.
Structural Elements
===================
Structural elements may only contain child elements; they do not
directly contain text data. Structural elements may contain body
elements or further structural elements. Structural elements can only
be child elements of other structural elements.
Category members: document_, section_, topic_, sidebar_
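
The constraint that structural elements never directly contain text
data can be checked mechanically. The following Python sketch is not
part of Docutils: it assumes a document tree exported as real XML, and
the helper name ``stray_text`` and the sample strings are invented for
illustration::

    import xml.etree.ElementTree as ET

    # Category members listed above.
    STRUCTURAL = {"document", "section", "topic", "sidebar"}

    def stray_text(root):
        """Return the tags of structural elements that directly hold text."""
        offenders = []
        for element in root.iter():
            if element.tag not in STRUCTURAL:
                continue
            # Direct text is the element's own text plus the tails of
            # its children; whitespace-only runs are ignored.
            chunks = [element.text] + [child.tail for child in element]
            if any(chunk and chunk.strip() for chunk in chunks):
                offenders.append(element.tag)
        return offenders

    good = ("<document><section><title>T</title>"
            "<paragraph>P</paragraph></section></document>")
    bad = "<document><section>loose text<title>T</title></section></document>"
    print(stray_text(ET.fromstring(good)))
    print(stray_text(ET.fromstring(bad)))
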
Structural Subelements
----------------------
Structural subelements are child elements of structural elements.
Simple structural subelements (title_, subtitle_) contain text
data; the others are compound and do not directly contain text data.
Category members: title_, subtitle_, decoration_, docinfo_,
transition_
Bibliographic Elements
``````````````````````
The docinfo_ element is an optional child of document_. It groups
bibliographic elements together. All bibliographic elements except
authors_ and field_ contain text data. authors_ contains further
bibliographic elements (most notably author_). field_ contains
field_name_ and field_body_ body subelements.
Category members: address_, author_, authors_, contact_, copyright_,
date_, field_, organization_, revision_, status_, version_
Decorative Elements
```````````````````
The decoration_ element is also an optional child of document_. It
groups together elements used to generate page headers and footers.
Category members: footer_, header_
Body Elements
=============
Body elements are contained within structural elements and compound
body elements. There are two subcategories of body elements: simple
and compound.
Category members: admonition_, attention_, block_quote_, bullet_list_,
caution_, citation_, comment_, compound_, container_, danger_,
definition_list_, doctest_block_, enumerated_list_, error_,
field_list_, figure_, footnote_, hint_, image_, important_,
line_block_, literal_block_, math_block_, note_, option_list_,
paragraph_, pending_, raw_, rubric_, substitution_definition_,
system_message_, table_, target_, tip_, warning_
Simple Body Elements
--------------------
Simple body elements are empty or directly contain text data. Those
that contain text data may also contain inline elements. Such
elements therefore have a "mixed content model".
Category members: comment_, doctest_block_, image_, literal_block_,
math_block_, paragraph_, pending_, raw_, rubric_, substitution_definition_,
target_
Compound Body Elements
----------------------
Compound body elements contain local substructure (body subelements)
and further body elements. They do not directly contain text data.
Category members: admonition_, attention_, block_quote_, bullet_list_,
caution_, citation_, compound_, container_, danger_, definition_list_,
enumerated_list_, error_, field_list_, figure_, footnote_, hint_,
important_, line_block_, note_, option_list_, system_message_, table_,
tip_, warning_
Body Subelements
````````````````
Compound body elements contain specific subelements (e.g. bullet_list_
contains list_item_). Subelements may themselves be compound elements
(containing further child elements, like field_) or simple data
elements (containing text data, like field_name_). These subelements
always occur within specific parent elements, never at the body
element level (beside paragraphs, etc.).
Category members (simple): attribution_, caption_, classifier_,
colspec_, field_name_, label_, line_, option_argument_,
option_string_, term_
Category members (compound): definition_, definition_list_item_,
description_, entry_, field_, field_body_, legend_, list_item_,
option_, option_group_, option_list_item_, row_, tbody_, tgroup_,
thead_
Inline Elements
===============
Inline elements directly contain text data, and may also contain
further inline elements. Inline elements are contained within simple
body elements. Most inline elements have a "mixed content model".
Category members: abbreviation_, acronym_, citation_reference_,
emphasis_, footnote_reference_, generated_, image_, inline_, literal_,
math_, problematic_, reference_, strong_, subscript_,
substitution_reference_, superscript_, target_, title_reference_, raw_
.. _HTML: http://www.w3.org/MarkUp/
.. _DocBook: http://docbook.org/tdg/en/html/docbook.html
.. _XMLSpec: http://www.w3.org/XML/1998/06/xmlspec-report.htm
-------------------
Element Reference
-------------------
.. contents:: :local:
              :depth: 1
Each element in the DTD (document type definition) is described in its
own section below. Each section contains an introduction plus the
following subsections:
* Details (of element relationships and semantics):

  - Category: One or more references to the element categories in
    `Element Hierarchy`_ above. Some elements belong to more than one
    category.

  - Parents: A list of elements which may contain the element.

  - Children: A list of elements which may occur within the element.

  - Analogues: Describes analogous elements in well-known document
    models such as HTML_ or DocBook_. Lists similarities and
    differences.

  - Processing: Lists formatting or rendering recommendations for the
    element.

* Content Model:

  The formal XML content model from the `Docutils DTD`_, followed by:

  - Attributes: Describes (or refers to descriptions of) the possible
    values and semantics of each attribute.

  - Parameter Entities: Lists the parameter entities which directly or
    indirectly include the element.

* Examples: reStructuredText_ examples are shown along with
  fragments of the document trees resulting from parsing.
  _`Pseudo-XML` is used for the results of parsing and processing.
  Pseudo-XML is a representation of XML where nesting is indicated by
  indentation and end-tags are not shown. Some of the precision of
  real XML is given up in exchange for easier readability. For
  example, the following are equivalent:

  - Real XML::

        <section>
            <title>
                A Title
            </title>
            <paragraph>
                A paragraph.
            </paragraph>
        </section>

  - Pseudo-XML::

        <section>
            <title>
                A Title
            <paragraph>
                A paragraph.

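
The relationship between real XML and pseudo-XML can be sketched in a
few lines of Python. This is an illustration only, not the formatter
that Docutils itself uses; the function name ``pseudo_xml`` and the
four-space indent are assumptions made for the example::

    import xml.etree.ElementTree as ET

    def pseudo_xml(element, indent=0):
        """Return pseudo-XML lines: start-tags only, nesting by indentation."""
        pad = " " * indent
        attrs = "".join(f' {key}="{value}"'
                        for key, value in element.attrib.items())
        lines = [f"{pad}<{element.tag}{attrs}>"]
        if element.text and element.text.strip():
            lines.append(f"{pad}    {element.text.strip()}")
        for child in element:
            lines.extend(pseudo_xml(child, indent + 4))
            # Text following a child still belongs to this element.
            if child.tail and child.tail.strip():
                lines.append(f"{pad}    {child.tail.strip()}")
        return lines

    real_xml = ("<section><title>A Title</title>"
                "<paragraph>A paragraph.</paragraph></section>")
    print("\n".join(pseudo_xml(ET.fromstring(real_xml))))

Because end-tags are dropped, the output cannot be parsed back as XML;
it is meant for reading only.
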
--------------------
Many of the element reference sections below are marked "_`to be
completed`". Please help complete this document by contributing to
its writing.
``abbreviation``
================
`To be completed`_.
``acronym``
===========
`To be completed`_.
``address``
===========
The ``address`` element holds the surface mailing address information
for the author (individual or group) of the document, or a third-party
contact address. Its structure is identical to that of the
literal_block_ element: whitespace is significant, especially
newlines.
Details
-------
:Category:
`Bibliographic Elements`_
:Parents:
The following elements may contain ``address``: docinfo_, authors_
:Children:
``address`` elements contain text data plus `inline elements`_.
:Analogues:
``address`` is analogous to the DocBook "address" element.
:Processing:
As with the literal_block_ element, newlines and other whitespace
are significant and must be preserved. However, a monospaced
typeface need not be used.
See also docinfo_.
Content Model
-------------
.. parsed-literal::
`%text.model;`_
:Attributes:
The ``address`` element contains the `common attributes`_ (ids_,
names_, dupnames_, source_, and classes_), plus `xml:space`_.
:Parameter Entities:
The `%bibliographic.elements;`_ parameter entity directly includes
``address``.
Examples
--------
reStructuredText_ source::

    Document Title
    ==============

    :Address: 123 Example Ave.
              Example, EX

Complete pseudo-XML_ result after parsing and applying transforms::

    <document ids="document-title" names="document title">
        <title>
            Document Title
        <docinfo>
            <address>
                123 Example Ave.
                Example, EX

See docinfo_ for a more complete example, including processing
context.
``admonition``
==============
This element is a generic, titled admonition. Also see the specific
admonition elements Docutils offers (in alphabetical order): caution_,
danger_, error_, hint_, important_, note_, tip_, warning_.
Details
-------
:Category:
`Compound Body Elements`_
:Parents:
All elements employing the `%body.elements;`_ or
`%structure.model;`_ parameter entities in their content models
may contain ``admonition``.
:Children:
``admonition`` elements begin with a title_ and may contain one or
more `body elements`_.
:Analogues:
``admonition`` has no direct analogues in common DTDs. It can be
emulated with primitives and type effects.
:Processing:
Rendered distinctly (inset and/or in a box, etc.).
Content Model
-------------
.. parsed-literal::
(title_, (`%body.elements;`_)+)
:Attributes:
The ``admonition`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%body.elements;`_ parameter entity directly includes
``admonition``. The `%structure.model;`_ parameter entity
indirectly includes ``admonition``.
Examples
--------
reStructuredText source::

    .. admonition:: And, by the way...

       You can make up your own admonition too.

Pseudo-XML_ fragment from simple parsing::

    <admonition classes="admonition-and-by-the-way">
        <title>
            And, by the way...
        <paragraph>
            You can make up your own admonition too.

``attention``
=============
The ``attention`` element is an admonition, a distinctive and
self-contained notice. Also see the other admonition elements
Docutils offers (in alphabetical order): caution_, danger_, error_,
hint_, important_, note_, tip_, warning_, and the generic admonition_.
Details
-------
:Category:
`Compound Body Elements`_
:Parents:
All elements employing the `%body.elements;`_ or
`%structure.model;`_ parameter entities in their content models
may contain ``attention``.
:Children:
``attention`` elements contain one or more `body elements`_.
:Analogues:
``attention`` has no direct analogues in common DTDs. It can be
emulated with primitives and type effects.
:Processing:
Rendered distinctly (inset and/or in a box, etc.), with the
generated title "Attention!" (or similar).
Content Model
-------------
.. parsed-literal::
(`%body.elements;`_)+
:Attributes:
The ``attention`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%body.elements;`_ parameter entity directly includes
``attention``. The `%structure.model;`_ parameter entity
indirectly includes ``attention``.
Examples
--------
reStructuredText source::

    .. Attention:: All your base are belong to us.

Pseudo-XML_ fragment from simple parsing::

    <attention>
        <paragraph>
            All your base are belong to us.

``attribution``
===============
`To be completed`_.
``author``
==========
The ``author`` element holds the name of the author of the document.
Details
-------
:Category:
`Bibliographic Elements`_
:Parents:
The following elements may contain ``author``: docinfo_, authors_
:Children:
``author`` elements may contain text data plus `inline elements`_.
:Analogues:
``author`` is analogous to the DocBook "author" element.
:Processing:
See docinfo_.
Content Model
-------------
.. parsed-literal::
`%text.model;`_
:Attributes:
The ``author`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%bibliographic.elements;`_ parameter entity directly includes
``author``.
Examples
--------
reStructuredText_ source::

    Document Title
    ==============

    :Author: J. Random Hacker

Complete pseudo-XML_ result after parsing and applying transforms::

    <document ids="document-title" names="document title">
        <title>
            Document Title
        <docinfo>
            <author>
                J. Random Hacker

See docinfo_ for a more complete example, including processing
context.
``authors``
===========
The ``authors`` element is a container for author information for
documents with multiple authors.
Details
-------
:Category:
`Bibliographic Elements`_
:Parents:
Only the docinfo_ element contains ``authors``.
:Children:
``authors`` elements may contain the following elements: author_,
organization_, address_, contact_
:Analogues:
``authors`` is analogous to the DocBook "authors" element.
:Processing:
See docinfo_.
Content Model
-------------
.. parsed-literal::
((author_, organization_?, address_?, contact_?)+)
:Attributes:
The ``authors`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%bibliographic.elements;`_ parameter entity directly includes
``authors``.
Examples
--------
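reStructuredText_ source::

    Document Title
    ==============

    :Authors: J. Random Hacker; Jane Doe

Complete pseudo-XML_ result after parsing and applying transforms::

    <document ids="document-title" names="document title">
        <title>
            Document Title
        <docinfo>
            <authors>
                <author>
                    J. Random Hacker
                <author>
                    Jane Doe
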
In reStructuredText, multiple authors' names are separated with
semicolons (";") or commas (","); semicolons take precedence. There
is currently no way to represent an author's organization, address,
or contact in a reStructuredText "Authors" field.
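
The separator rule can be sketched in Python as follows. This is a
simplified illustration of the precedence described above, not the
code Docutils uses; the function name ``split_authors`` is
hypothetical, and only a single-line field value is handled::

    def split_authors(text):
        """Split an "Authors" value; semicolons take precedence over commas."""
        for separator in (";", ","):
            if separator in text:
                names = text.split(separator)
                break
        else:
            # No separator at all: a single author.
            names = [text]
        return [name.strip() for name in names if name.strip()]

    print(split_authors("J. Random Hacker; Jane Doe"))
    print(split_authors("Hacker, J. Random; Doe, Jane"))

Because semicolons are tried first, names written as "Surname,
Given" stay intact as long as the authors themselves are separated by
semicolons.
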
See docinfo_ for a more complete example, including processing
context.
``block_quote``
===============
The ``block_quote`` element is used for quotations set off from the
main text (standalone).
Details
-------
:Category:
`Compound Body Elements`_
:Parents:
All elements employing the `%body.elements;`_ or
`%structure.model;`_ parameter entities in their content models
may contain ``block_quote``.
:Children:
``block_quote`` elements contain `body elements`_ followed by an
optional attribution_ element.
:Analogues:
``block_quote`` is analogous to the "blockquote" element in both
HTML and DocBook.
:Processing:
``block_quote`` elements serve to set their contents off from the
main text, typically with indentation and/or other decoration.
Content Model
-------------
.. parsed-literal::
((`%body.elements;`_)+, attribution_?)
:Attributes:
The ``block_quote`` element contains only the `common
attributes`_: ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%body.elements;`_ parameter entity directly includes
``block_quote``. The `%structure.model;`_ parameter entity
indirectly includes ``block_quote``.
Examples
--------
reStructuredText source::

    As a great paleontologist once said,

        This theory, that is mine, is mine.

        -- Anne Elk (Miss)

Pseudo-XML_ fragment from simple parsing::

    <paragraph>
        As a great paleontologist once said,
    <block_quote>
        <paragraph>
            This theory, that is mine, is mine.
        <attribution>
            Anne Elk (Miss)

``bullet_list``
===============
The ``bullet_list`` element contains list_item_ elements which are
uniformly marked with bullets. Bullets are typically simple dingbats
(symbols) such as circles and squares.
Details
-------
:Category:
`Compound Body Elements`_
:Parents:
All elements employing the `%body.elements;`_ or
`%structure.model;`_ parameter entities in their content models
may contain ``bullet_list``.
:Children:
``bullet_list`` elements contain one or more list_item_ elements.
:Analogues:
``bullet_list`` is analogous to the HTML "ul" element and to the
DocBook "itemizedlist" element. HTML's "ul" is short for
"unordered list", which we consider to be a misnomer. "Unordered"
implies that the list items may be randomly rearranged without
affecting the meaning of the list. Bullet lists *are* often
ordered; the ordering is simply left implicit.
:Processing:
Each list item should begin a new vertical block, prefaced by a
bullet/dingbat.
Content Model
-------------
.. parsed-literal::
(list_item_ +)
:Attributes:
The ``bullet_list`` element contains the `common attributes`_
(ids_, names_, dupnames_, source_, and classes_), plus bullet_.
``bullet`` is used to record the style of bullet from the input
data. In documents processed from reStructuredText_, it contains
one of "-", "+", or "*". It may be ignored in processing.
:Parameter Entities:
The `%body.elements;`_ parameter entity directly includes
``bullet_list``. The `%structure.model;`_ parameter entity
indirectly includes ``bullet_list``.
Examples
--------
reStructuredText_ source::

    - Item 1, paragraph 1.

      Item 1, paragraph 2.

    - Item 2.

Pseudo-XML_ fragment from simple parsing::

    <bullet_list bullet="-">
        <list_item>
            <paragraph>
                Item 1, paragraph 1.
            <paragraph>
                Item 1, paragraph 2.
        <list_item>
            <paragraph>
                Item 2.

See list_item_ for another example.
``caption``
===========
`To be completed`_.
``caution``
===========
The ``caution`` element is an admonition, a distinctive and
self-contained notice. Also see the other admonition elements
Docutils offers (in alphabetical order): attention_, danger_, error_,
hint_, important_, note_, tip_, warning_, and the generic admonition_.
Details
-------
:Category:
`Compound Body Elements`_
:Parents:
All elements employing the `%body.elements;`_ or
`%structure.model;`_ parameter entities in their content models
may contain ``caution``.
:Children:
``caution`` elements contain one or more `body elements`_.
:Analogues:
``caution`` is analogous to the DocBook "caution" element.
:Processing:
Rendered distinctly (inset and/or in a box, etc.), with the
generated title "Caution" (or similar).
Content Model
-------------
.. parsed-literal::
(`%body.elements;`_)+
:Attributes:
The ``caution`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%body.elements;`_ parameter entity directly includes
``caution``. The `%structure.model;`_ parameter entity
indirectly includes ``caution``.
Examples
--------
reStructuredText source::

    .. Caution:: Don't take any wooden nickels.

Pseudo-XML_ fragment from simple parsing::

    <caution>
        <paragraph>
            Don't take any wooden nickels.

``citation``
============
`To be completed`_.
``citation_reference``
======================
`To be completed`_.
``classifier``
==============
The ``classifier`` element contains the classification or type of the
term_ being defined in a definition_list_. For example, it can be
used to indicate the type of a variable.
Details
-------
:Category:
`Body Subelements`_ (simple)
:Parents:
Only the definition_list_item_ element contains ``classifier``.
:Children:
``classifier`` elements may contain text data plus `inline elements`_.
:Analogues:
``classifier`` has no direct analogues in common DTDs. It can be
emulated with primitives or type effects.
:Processing:
See definition_list_item_.
Content Model
-------------
.. parsed-literal::
`%text.model;`_
:Attributes:
The ``classifier`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
Examples
--------
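Here is a hypothetical data dictionary. reStructuredText_ source::

    name : string
        Customer name.
    i : int
        Temporary index variable.

Pseudo-XML_ fragment from simple parsing::

    <definition_list>
        <definition_list_item>
            <term>
                name
            <classifier>
                string
            <definition>
                <paragraph>
                    Customer name.
        <definition_list_item>
            <term>
                i
            <classifier>
                int
            <definition>
                <paragraph>
                    Temporary index variable.
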
``colspec``
===========
`To be completed`_.
``comment``
===========
`To be completed`_.
``compound``
============
`To be completed`_.
``contact``
===========
The ``contact`` element holds contact information for the author
(individual or group) of the document, or a third-party contact. It
is typically used for an email or web address.
Details
-------
:Category:
`Bibliographic Elements`_
:Parents:
The following elements may contain ``contact``: docinfo_, authors_
:Children:
``contact`` elements may contain text data plus `inline
elements`_.
:Analogues:
``contact`` is analogous to the DocBook "email" element. The HTML
"address" element serves a similar purpose.
:Processing:
See docinfo_.
Content Model
-------------
.. parsed-literal::
`%text.model;`_
:Attributes:
The ``contact`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%bibliographic.elements;`_ parameter entity directly includes
``contact``.
Examples
--------
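reStructuredText_ source::

    Document Title
    ==============

    :Contact: jrh@example.com

Complete pseudo-XML_ result after parsing and applying transforms::

    <document ids="document-title" names="document title">
        <title>
            Document Title
        <docinfo>
            <contact>
                <reference refuri="mailto:jrh@example.com">
                    jrh@example.com
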
See docinfo_ for a more complete example, including processing
context.
``container``
=============
`To be completed`_.
``copyright``
=============
The ``copyright`` element contains the document's copyright statement.
Details
-------
:Category:
`Bibliographic Elements`_
:Parents:
Only the docinfo_ element contains ``copyright``.
:Children:
``copyright`` elements may contain text data plus `inline
elements`_.
:Analogues:
``copyright`` is analogous to the DocBook "copyright" element.
:Processing:
See docinfo_.
Content Model
-------------
.. parsed-literal::
`%text.model;`_
:Attributes:
The ``copyright`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%bibliographic.elements;`_ parameter entity directly includes
``copyright``.
Examples
--------
reStructuredText_ source::

    Document Title
    ==============

    :Copyright: This document has been placed in the public domain.

Complete pseudo-XML_ result after parsing and applying transforms::

    <document ids="document-title" names="document title">
        <title>
            Document Title
        <docinfo>
            <copyright>
                This document has been placed in the public domain.

See docinfo_ for a more complete example, including processing
context.
``danger``
==========
The ``danger`` element is an admonition, a distinctive and
self-contained notice. Also see the other admonition elements
Docutils offers (in alphabetical order): attention_, caution_, error_,
hint_, important_, note_, tip_, warning_, and the generic admonition_.
Details
-------
:Category:
`Compound Body Elements`_
:Parents:
All elements employing the `%body.elements;`_ or
`%structure.model;`_ parameter entities in their content models
may contain ``danger``.
:Children:
``danger`` elements contain one or more `body elements`_.
:Analogues:
``danger`` has no direct analogues in common DTDs. It can be
emulated with primitives and type effects.
:Processing:
Rendered distinctly (inset and/or in a box, etc.), with the
generated title "!DANGER!" (or similar).
Content Model
-------------
.. parsed-literal::
(`%body.elements;`_)+
:Attributes:
The ``danger`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%body.elements;`_ parameter entity directly includes
``danger``. The `%structure.model;`_ parameter entity
indirectly includes ``danger``.
Examples
--------
reStructuredText source::

    .. DANGER:: Mad scientist at work!

Pseudo-XML_ fragment from simple parsing::

    <danger>
        <paragraph>
            Mad scientist at work!

``date``
========
The ``date`` element contains the date of publication, release, or
last modification of the document.
Details
-------
:Category:
`Bibliographic Elements`_
:Parents:
Only the docinfo_ element contains ``date``.
:Children:
``date`` elements may contain text data plus `inline elements`_.
:Analogues:
``date`` is analogous to the DocBook "date" element.
:Processing:
Often used with the RCS/CVS keyword "Date". See docinfo_.
Content Model
-------------
.. parsed-literal::
`%text.model;`_
:Attributes:
The ``date`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
:Parameter Entities:
The `%bibliographic.elements;`_ parameter entity directly includes
``date``.
Examples
--------
reStructuredText_ source::

    Document Title
    ==============

    :Date: 2002-08-20

Complete pseudo-XML_ result after parsing and applying transforms::

    <document ids="document-title" names="document title">
        <title>
            Document Title
        <docinfo>
            <date>
                2002-08-20

See docinfo_ for a more complete example, including processing
context.
``decoration``
==============
The ``decoration`` element is a container for header_ and footer_
elements and potential future extensions. These elements are used for
notes, time/datestamp, processing information, etc.
Details
-------
:Category:
`Structural Subelements`_
:Parents:
Only the document_ element contains ``decoration``.
:Children:
``decoration`` elements may contain `decorative elements`_.
:Analogues:
There are no direct analogues to ``decoration`` in HTML or in
DocBook. Equivalents are typically constructed from primitives
and/or generated by the processing system.
:Processing:
See the individual `decorative elements`_.
Content Model
-------------
.. parsed-literal::
(header_?, footer_?)
Although the content model doesn't specifically require contents, no
empty ``decoration`` elements are ever created.
:Attributes:
The ``decoration`` element contains only the `common attributes`_:
ids_, names_, dupnames_, source_, and classes_.
Examples
--------
reStructuredText_ source::

    A paragraph.

Complete pseudo-XML_ result after parsing and applying transforms,
assuming that the datestamp command-line option or configuration
setting has been supplied::