DTD and XML, Part Two

This is part 2 of the DTD and XML assignment. If you haven't read part 1 yet, you can do so here: DTD and XML, Part 1

The assignment description and my solution can be found at the bottom of this page.

Components of XML documents from a DTD perspective

From a DTD perspective, XML documents are built from these five components: elements, attributes, entities, PCDATA and CDATA.

Elements

Elements are the main components of XML documents as well as HTML and XHTML documents.

"message", "subject", "sender", "receiver" and "content" from the message example in DTD and XML, Part 1 are examples of XML elements.

Elements can be empty, have text, or other elements as their content.

Declaring an Element

XML elements are declared with a DTD element declaration inside the DTD.

This is the syntax for an element declaration:

<!ELEMENT element-name content-keyword> or <!ELEMENT element-name (element-content)>

Empty elements

Element types with empty content are declared using the content keyword EMPTY:

<!ELEMENT element-name EMPTY>

For example: <!ELEMENT break EMPTY>

In XML document: <break />

As the example shows, an empty element has no content between its start tag and its end tag. This is referred to as having empty content.

The "img" and "br" elements are examples of empty elements from HTML and XHTML.
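To see the declaration and the element side by side, here is a minimal sketch of a whole document with the DTD written inside it. Using "break" as the root element is only an assumption to keep the example short.

```xml
<?xml version="1.0"?>
<!-- The DTD sits inside the DOCTYPE declaration (an internal DTD). -->
<!DOCTYPE break [
  <!-- "break" may not contain text or child elements. -->
  <!ELEMENT break EMPTY>
]>
<!-- Valid: nothing between the start and the end of the element. -->
<break />
```

Writing <break></break> with nothing in between is also allowed, but <break>text</break> would not validate against this DTD.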

Elements with only pure text

Element types with text (character data) only are declared using the content keyword #PCDATA inside round brackets, like this:

<!ELEMENT element-name (#PCDATA)> example:

<!ELEMENT sender (#PCDATA)>

Element types with any content

Element types declared using the keyword ANY have no constraints on their content. Such an element may contain character data and any number of declared subelements of any type, in any order.

<!ELEMENT element-name ANY> example: <!ELEMENT message ANY>
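As a rough illustration, the sketch below declares "message" with ANY and then mixes text and child elements freely. The text and the element values are made up for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!-- ANY: text and any declared elements, in any order and number. -->
  <!ELEMENT message ANY>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
]>
<message>
  Some free text is fine here,
  <subject>Lunch</subject>
  <sender>Anna</sender>
  <subject>and a second subject is fine too</subject>
</message>
```

Note that the child elements still have to be declared somewhere in the DTD. ANY only removes the constraints on what "message" itself may contain.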

Element types with child elements

Element types with one or more child elements are declared in a sequence using the name of the child elements inside round brackets:

<!ELEMENT element-name (child-element-name)> or <!ELEMENT element-name (child-element-name,another-child-element-name,...)>

example: <!ELEMENT message (receiver,sender,subject,content)>

When child elements (subelements) are declared in a sequence separated by commas, the children must occur in the same sequence in the XML document. In a complete declaration, the children, and all of their children in turn, must be declared as well.

The complete declaration of the "message" element would be:

<!ELEMENT message (receiver,sender,subject,content)>
<!ELEMENT receiver (#PCDATA)>
<!ELEMENT sender (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT content (#PCDATA)>
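Put together with a document, the complete declaration could look like the sketch below. The names and the message text are placeholders, not the actual content from Part 1.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!-- The four children must appear in exactly this order. -->
  <!ELEMENT message (receiver,sender,subject,content)>
  <!ELEMENT receiver (#PCDATA)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT content (#PCDATA)>
]>
<message>
  <receiver>Bob</receiver>
  <sender>Anna</sender>
  <subject>Lunch</subject>
  <content>Are you free at noon?</content>
</message>
```

If "sender" were placed before "receiver", the document would still be well-formed, but it would no longer validate against the DTD.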

Element types that can occur only once

<!ELEMENT element-name (child-name)> example: <!ELEMENT message (content)>

In the example declaration above, the child element (i.e. the "content" element) is constrained to occur exactly once inside the "message" element.

Element types that must occur at least once

<!ELEMENT element-name (child-name+)> example: <!ELEMENT message (content+)>

The + sign in the example above declares that the child element (i.e. the "content" element) must occur at least once inside the "message" element (i.e. a one-to-many constraint).

Element types that don't have to occur, but could occur many times

<!ELEMENT element-name (child-name*)> example: <!ELEMENT message (content*)>

The * sign in the example above declares that the child element (i.e. the "content" element) doesn't have to occur, but can occur many times, within the "message" element (i.e. a zero-to-many constraint).

Element types that don't have to occur, but could occur one time

<!ELEMENT element-name (child-name?)> example: <!ELEMENT message (content?)>

The ? sign in the example above declares that the child element (i.e. the "content" element) can occur zero or one time within the "message" element.
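The three signs +, * and ? can be combined in one declaration. The sketch below is a made-up variant of the "message" element, so the content model is an assumption for illustration only.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!-- receiver: one or more, subject: zero or one, content: zero or more -->
  <!ELEMENT message (receiver+,subject?,content*)>
  <!ELEMENT receiver (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT content (#PCDATA)>
]>
<message>
  <receiver>Bob</receiver>
  <receiver>Carol</receiver>
  <!-- No subject here: allowed, because of the ? sign. -->
  <content>First paragraph.</content>
  <content>Second paragraph.</content>
</message>
```

Removing both "receiver" elements, or adding a second "subject", would make the document invalid.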

Element types with either this or that content

example: <!ELEMENT message (receiver,sender,subject,(content|announcement))>

The example above declares that the "message" element must contain a "receiver" element, a "sender" element, a "subject" element, and either a "content" element or an "announcement" element.
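A document that picks the "announcement" branch could look like this sketch. The declaration of "announcement" as text-only and all the values are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (receiver,sender,subject,(content|announcement))>
  <!ELEMENT receiver (#PCDATA)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT content (#PCDATA)>
  <!-- Assumed to be text-only, like the other children. -->
  <!ELEMENT announcement (#PCDATA)>
]>
<message>
  <receiver>Bob</receiver>
  <sender>Anna</sender>
  <subject>Office closed</subject>
  <!-- Exactly one of "content" or "announcement" goes here. -->
  <announcement>The office is closed on Friday.</announcement>
</message>
```

Having both a "content" and an "announcement" element, or neither of them, would not validate.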

Element types with mixed content

example: <!ELEMENT message (#PCDATA|receiver|sender|subject|content)*>

The example above declares that the "message" element can contain zero or more occurrences of text content (parsed character data) and of "receiver", "sender", "subject" or "content" elements, in any order.
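Here is a small sketch of what mixed content can look like in a document. The sentence itself is invented for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!-- In mixed content, #PCDATA comes first and the group ends with *. -->
  <!ELEMENT message (#PCDATA|receiver|sender|subject|content)*>
  <!ELEMENT receiver (#PCDATA)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT content (#PCDATA)>
]>
<message>
  A note to <receiver>Bob</receiver> from <sender>Anna</sender>:
  <content>see you at noon</content>
</message>
```

Text and child elements are freely interleaved here, and the "subject" element is simply left out.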

Attributes

An attribute is used to give extra information about an element.

Attributes are inserted within an element's start tag. An attribute has an attribute name and an attribute value. The img element in HTML and XHTML, for example, uses the src attribute to give extra information:

<img src="hacker.jpg" />

The element name is "img". The attribute name is "src", and the attribute value is "hacker.jpg". The element itself, however, is empty (has empty content). In XML and XHTML, an empty element is closed by a " /" at the end of its tag, so no separate end tag is needed.
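Attributes are declared in the DTD as well, using an ATTLIST declaration. The text above does not cover that syntax, so take the sketch below only as a first impression. Making "src" a required attribute is an assumption for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE img [
  <!ELEMENT img EMPTY>
  <!-- "src" holds character data (CDATA) and must always be given. -->
  <!ATTLIST img src CDATA #REQUIRED>
]>
<img src="hacker.jpg" />
```

With this DTD, an <img /> without the "src" attribute would not validate.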

Entities

Entities are variables for defining shortcuts/macros to text.

They can be declared as:

You probably know the HTML entity reference "&nbsp;" (non-breaking space), which is used in HTML to insert an extra space in a document. Entities like "&nbsp;" are expanded when a document is parsed by a parser.

You can define your own entities within the DTD, but some common entities are already defined in XML:

<!ENTITY lt "&#38;"> <!ENTITY gt "&#62;"> <!ENTITY amp "&#38;"> <!ENTITY apos "&#39;"> <!ENTITY quot "&#34;">

PCDATA

PCDATA stands for Parsed Character DATA.

Character data is the text between the start tag and the end tag of an XML element. This text will be parsed by a parser: tags inside it are treated as markup and entity references are expanded.

CDATA

CDATA is character data (text) that will NOT be parsed by a parser: tags inside it are not treated as markup and entity references are not expanded.
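One way to see the difference inside a document is a CDATA section, as in the sketch below. I am assuming here that a CDATA section is a fair illustration of unparsed text. The element and its text are invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE content [
  <!ELEMENT content (#PCDATA)>
]>
<!-- Outside the CDATA section, &lt; is expanded to "<".
     Inside it, the tags and the &amp; are kept as plain text. -->
<content>Parsed: 1 &lt; 2. <![CDATA[Not parsed: <b>1 < 2</b> &amp; so on]]></content>
```

Without the CDATA section, the <b> tag would be read as a child element and the document would not validate against this DTD.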

Assignment Description

Put constraints on the cv-template.xml file from XML Basics Assignment by using a DTD. The document should be well-formed (have correct XML syntax) and valid (follow the rules set up in the DTD).

My solution, Assignment Files
