Originally published at http://www.xmlfiles.com/dtd/
Minor revision made by Junghoo "John" Cho for the CS188 class at UCLA
The purpose of a DTD is to define the legal building blocks of an XML document. It defines the document structure with a list of legal elements. A DTD can be declared inline in your XML document, or as an external reference.
This is an XML document with a Document Type Definition:
    <?xml version="1.0"?>
    <!DOCTYPE note [
      <!ELEMENT note (to,from,heading,body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>
    ]>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend</body>
    </note>
As in the example above, a DTD that is included in your XML source file must be wrapped in a DOCTYPE declaration with the following syntax:
    <!DOCTYPE root-element [element-declarations]>
In the above example, the DTD is interpreted like this:
!ELEMENT note defines the "note" element as containing four child elements, in this order: "to,from,heading,body".
!ELEMENT to defines the "to" element to be of the type "#PCDATA".
!ELEMENT from defines the "from" element to be of the type "#PCDATA",
and so on.
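To see how a parser reads these declarations, here is a minimal sketch using Python's standard xml.parsers.expat module. The helper name declared_elements and the way it summarizes each declaration are illustrative assumptions, not part of the original tutorial; any content model that is neither a plain sequence nor #PCDATA is simply reported as "other".

```python
import xml.parsers.expat
from xml.parsers.expat import model as content_model

NOTE_XML = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>"""


def declared_elements(xml_text):
    """Return {element name: summary of its declared content} (hypothetical helper)."""
    declarations = {}

    def on_element_decl(name, model):
        # expat reports each content model as (type, quantifier, name, children).
        kind, _quantifier, _name, children = model
        if kind == content_model.XML_CTYPE_SEQ:
            # A comma-separated sequence: keep the child names in declared order.
            declarations[name] = [child[2] for child in children]
        elif kind == content_model.XML_CTYPE_MIXED:
            # Simplification: (#PCDATA) and mixed content are both shown as "#PCDATA".
            declarations[name] = "#PCDATA"
        else:
            declarations[name] = "other"

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return declarations


note_declarations = declared_elements(NOTE_XML)
for element_name, content in note_declarations.items():
    print(element_name, "->", content)
```

Note that expat only reads the declarations here; it does not check the document against them.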
This is the same XML document with an external DTD:
    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
This is a copy of the file "note.dtd" containing the Document Type Definition:
    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
XML documents (and HTML documents) are made up of the following building blocks:
Elements, Tags, Attributes, Entities, #PCDATA, and CDATA
We now briefly explain each of the building blocks:
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements could be "note" and "message". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img".
Tags are used to mark up elements.
A starting tag like <element_name> marks the beginning of an element, and an ending tag like </element_name> marks the end of an element.
Examples:
A body element: <body>body text in between</body>.
A message element: <message>some message in between</message>
Attributes provide extra information about elements.
Attributes are placed inside the start tag of an element. Attributes come in name/value pairs. The following "img" element carries additional information about a source file:
    <img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty it is closed by a " /".
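As a small illustration, the element name, attribute name, and attribute value described above can be read back with Python's standard xml.etree.ElementTree module. This is only a sketch; the variable name img is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

# Parse the empty "img" element from the text above.
img = ET.fromstring('<img src="computer.gif" />')

print(img.tag)         # element name
print(img.attrib)      # attribute name/value pairs
print(img.get("src"))  # value of the "src" attribute
print(len(img))        # number of child elements (the element is empty)
```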
#PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end tag of an XML element.
#PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.
CDATA also means character data.
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
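The difference can be observed with a short sketch using Python's standard xml.etree.ElementTree module. The element names doc, parsed, and literal are made up for this illustration. The same text is placed once as ordinary parsed character data and once inside a CDATA section.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<doc>"
    "<parsed>1 &lt; 2 &amp; <b>bold</b></parsed>"
    "<literal><![CDATA[1 &lt; 2 &amp; <b>bold</b>]]></literal>"
    "</doc>"
)

root = ET.fromstring(DOC)
parsed = root.find("parsed")
literal = root.find("literal")

# Parsed character data: entities are expanded and <b> becomes a child element.
print(repr(parsed.text), [child.tag for child in parsed])

# CDATA section: the text is kept exactly as written and nothing is treated as markup.
print(repr(literal.text), [child.tag for child in literal])
```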
Entities are variables used to define common text. Entity references are references to entities.
Most of you will know the HTML entity reference "&nbsp;", which is used to insert a non-breaking space in an HTML document. Entities are expanded when a document is parsed by an XML parser.
The following entities are predefined in XML:
Entity Reference | Character |
---|---|
&lt; | < |
&gt; | > |
&amp; | & |
&quot; | " |
&apos; | ' |
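The following sketch shows the five predefined entities in use, with Python's standard library. The escape function from xml.sax.saxutils replaces &, < and > by default; the extra mapping for the two quote characters is passed explicitly. The sample string and the element name t are arbitrary.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "a < b & \"c\" > 'd'"

# Replace the special characters with the predefined entity references.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# An XML parser expands the references back into the original characters.
round_trip = ET.fromstring("<t>" + escaped + "</t>").text
print(round_trip)
```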
In the DTD, XML elements are declared with an element declaration. An element declaration has the following syntax:
    <!ELEMENT element-name (element-content)>
Empty elements are declared with the keyword EMPTY, written without parentheses:
    <!ELEMENT element-name EMPTY>
example:
    <!ELEMENT img EMPTY>
Elements that contain only character data are declared with #PCDATA inside parentheses. Elements that may contain any combination of text and declared elements are declared with the keyword ANY, written without parentheses:
    <!ELEMENT element-name (#PCDATA)>
or
    <!ELEMENT element-name ANY>
example:
    <!ELEMENT note (#PCDATA)>
If an element declared with ANY contains other elements, these elements must also be declared.
Elements with one or more children are declared with the names of the child elements inside the parentheses:
    <!ELEMENT element-name (child-element-name)>
or
    <!ELEMENT element-name (child-element-name,child-element-name,.....)>
example:
    <!ELEMENT note (to,from,heading,body)>
Children declared in a comma-separated sequence must appear in the document in the same order. The child elements must also be declared themselves, so the full declaration of the note element is:
    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
    <!ELEMENT element-name (child-name)>
example:
    <!ELEMENT note (message)>
The example declaration above declares that the child element message must occur exactly once inside the note element.
    <!ELEMENT element-name (child-name+)>
example:
    <!ELEMENT note (message+)>
The + sign in the example above declares that the child element message must occur one or more times inside the note element.
    <!ELEMENT element-name (child-name*)>
example:
    <!ELEMENT note (message*)>
The * sign in the example above declares that the child element message can occur zero or more times inside the note element.
    <!ELEMENT element-name (child-name?)>
example:
    <!ELEMENT note (message?)>
The ? sign in the example above declares that the child element message can occur zero times or once inside the note element.
example:
    <!ELEMENT note (to+,from,header,message*)>
The example above declares that the note element must contain at least one to child element, followed by exactly one from child element, exactly one header, and zero or more message elements. Note that #PCDATA cannot be part of a comma-separated sequence: an element that mixes text with child elements must be declared in the form (#PCDATA|child-name|...)*.
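The occurrence indicators +, * and ? behave much like the corresponding regular-expression operators. As a rough sketch of that idea, and not how a real validating parser is implemented, the following Python code translates an element-only content model into a regular expression over the sequence of child names. The function names content_model_to_regex and children_match are hypothetical, and mixed content (#PCDATA) is deliberately not handled.

```python
import re


def content_model_to_regex(model):
    """Translate e.g. "(to+,from,header,message*)" into a compiled regex.

    Each child name is matched as "name " (name plus one space), so a list of
    children can be checked as a single space-terminated string.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[(),|?*+]", model):
        if token == ",":
            continue  # a sequence is plain concatenation in a regex
        if token == "(":
            parts.append("(?:")
        elif token in (")", "|", "?", "*", "+"):
            parts.append(token)  # same meaning in DTDs and regexes
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(model, children):
    """Return True if the list of child element names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None


NOTE_MODEL = "(to+,from,header,message*)"
print(children_match(NOTE_MODEL, ["to", "to", "from", "header"]))
print(children_match(NOTE_MODEL, ["from", "header"]))
print(children_match(NOTE_MODEL, ["to", "header", "from"]))
```

The first call satisfies the model; the second lacks the required to element and the third has from and header in the wrong order.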
In the DTD, XML element attributes are declared with an ATTLIST declaration. An attribute declaration has the following syntax:
    <!ATTLIST element-name attribute-name attribute-type default-value>
The attribute-type can have the following values:
Value | Explanation |
---|---|
CDATA | The value is character data |
(eval|eval|..) | The value must be one of the enumerated values |
ID | The value is a unique ID |
IDREF | The value is the ID of another element |
IDREFS | The value is a list of IDs of other elements |
NMTOKEN | The value is a valid XML name token |
NMTOKENS | The value is a list of valid XML name tokens |
ENTITY | The value is the name of an entity |
ENTITIES | The value is a list of entity names |
NOTATION | The value is the name of a notation |
xml: | The value is a predefined xml: value |
The default-value can have the following values:
Value | Explanation |
---|---|
value | The attribute has the default value "value" |
#REQUIRED | The attribute value must be included in the element |
#IMPLIED | The attribute is optional and does not have to be included |
#FIXED value | The attribute value is fixed to "value" |
DTD example:
    <!ELEMENT square EMPTY>
    <!ATTLIST square width CDATA "0">
XML example:
    <square width="100"></square>
Syntax:
    <!ATTLIST element-name attribute-name CDATA "default-value">
DTD example:
    <!ATTLIST payment type CDATA "check">
XML example:
    <payment type="check">

Syntax:
    <!ATTLIST element-name attribute-name attribute-type #IMPLIED>
DTD example:
    <!ATTLIST contact fax CDATA #IMPLIED>
XML example:
    <contact fax="555-667788">

Syntax:
    <!ATTLIST element-name attribute-name attribute-type #REQUIRED>
DTD example:
    <!ATTLIST person number CDATA #REQUIRED>
XML example:
    <person number="5677">

Syntax:
    <!ATTLIST element-name attribute-name attribute-type #FIXED "value">
DTD example:
    <!ATTLIST sender company CDATA #FIXED "Microsoft">
XML example:
    <sender company="Microsoft">

Syntax:
    <!ATTLIST element-name attribute-name (eval|eval|..) default-value>
DTD example:
    <!ATTLIST payment type (check|cash) "cash">
XML example:
    <payment type="check">
or
    <payment type="cash">
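To illustrate how default and fixed values reach an application, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which reads ATTLIST declarations from an inline DTD and fills in missing attributes. The payments wrapper element and the currency and memo attributes are invented for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE payments [
  <!ELEMENT payments (payment*)>
  <!ELEMENT payment EMPTY>
  <!ATTLIST payment type (check|cash) "cash">
  <!ATTLIST payment currency CDATA #FIXED "USD">
  <!ATTLIST payment memo CDATA #IMPLIED>
]>
<payments>
  <payment type="check" />
  <payment />
</payments>"""

payments = ET.fromstring(DOC)
payment_attributes = [dict(payment.attrib) for payment in payments]

# The second payment receives the declared default for "type"; both receive the
# fixed "currency" value; the optional "memo" attribute is simply absent.
for attributes in payment_attributes:
    print(attributes)
```

ElementTree is a non-validating parser, so it applies these defaults but would not reject a value outside the enumerated list or a missing #REQUIRED attribute.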
Syntax:
    <!ENTITY entity-name "entity-value">
DTD example:
    <!ENTITY writer "Jan Egil Refsnes.">
    <!ENTITY copyright "Copyright XML101.">
XML example:
    <author>&writer;&copyright;</author>
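As a quick check of how such internal entities are expanded, the following sketch parses a document with an inline DTD using Python's standard xml.etree.ElementTree module. Wrapping the declarations in a DOCTYPE for an author root element is an assumption made so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY writer "Jan Egil Refsnes.">
  <!ENTITY copyright "Copyright XML101.">
]>
<author>&writer;&copyright;</author>"""

author = ET.fromstring(DOC)

# Both entity references are replaced by their declared text during parsing.
print(author.text)
```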
Syntax:
    <!ENTITY entity-name SYSTEM "URI/URL">
DTD example:
    <!ENTITY writer SYSTEM "http://www.xml101.com/entities/entities.xml">
    <!ENTITY copyright SYSTEM "http://www.xml101.com/entities/entities.dtd">
XML example:
    <author>&writer;&copyright;</author>
Copyright © 1998-2006 Jupitermedia All rights reserved. Reprinted with permission from http://www.internet.com.