XML documents can have a reference to a DTD or an XML Schema.

A Simple XML Document

Look at this simple XML document called "note.xml":

<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

A Simple DTD

This is a simple DTD file called "note.dtd" that defines the elements of the XML document above ("note.xml"):

<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

Line 1 defines the note element to have four child elements, in this order: "to, from, heading, body". Lines 2-5 define the to, from, heading, and body elements to be of the type "#PCDATA" (parsed character data).

A Simple XML Schema

This is a simple XML Schema file called "note.xsd" that defines the elements of the XML document above ("note.xml"):

<?xml version="1.0"?>
<xs:schema xmlns:xs="
targetNamespace="
xmlns="
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

The note element is said to be of a complex type because it contains other elements. The other elements (to, from, heading, body) are said to be simple types because they do not contain other elements. You will learn more about simple and complex types in the following chapters.

A Reference to a DTD

This XML document has a reference to a DTD:

<?xml version="1.0"?>
<!DOCTYPE note SYSTEM
"
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

A Reference to an XML Schema

This XML document has a reference to an XML Schema:

<?xml version="1.0"?>
<note
xmlns="
xmlns:xsi="
xsi:schemaLocation=
"
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The <schema> Element

The <schema> element is the root element of every XML Schema:

<?xml version="1.0"?>
<xs:schema>
...
...
</xs:schema>

The <schema> element may contain some attributes. A schema declaration often looks something like this:

<?xml version="1.0"?>
<xs:schema xmlns:xs="
targetNamespace="
xmlns="
elementFormDefault="qualified">
...
...
</xs:schema>

The following fragment:

xmlns:xs="

indicates that the elements and data types used in the schema (schema, element, complexType, sequence, string, boolean, etc.) come from the " namespace. It also specifies that the elements and data types that come from the " namespace should use the xs: prefix.

This fragment:

targetNamespace="

indicates that the elements defined by this schema (note, to, from, heading, body) come from the " namespace.

This fragment:

xmlns="

indicates that the default namespace is "

This fragment:

elementFormDefault="qualified"

indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified.
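
As a rough illustration of what "namespace qualified" means for an instance document, the Python sketch below parses a small document with the standard xml.etree.ElementTree module and lists the qualified element names. The namespace URI http://example.com/note is a hypothetical placeholder chosen for this sketch, not the namespace used by the schema above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
NS = "http://example.com/note"

doc = f"""<note xmlns="{NS}">
  <to>Tove</to>
  <from>Jani</from>
</note>"""

root = ET.fromstring(doc)

# ElementTree writes a qualified name as "{namespace}localname".
qualified_names = [root.tag] + [child.tag for child in root]
print(qualified_names)
```

Because of the default namespace declaration, every element name carries the namespace, which is what a schema with elementFormDefault="qualified" expects.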

Referencing a Schema in an XML Document

This XML document has a reference to an XML Schema:

<?xml version="1.0"?>
<note xmlns="
xmlns:xsi="
xsi:schemaLocation=" note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

The following fragment:

xmlns="

specifies the default namespace declaration. This declaration tells the schema-validator that all the elements used in this XML document are declared in the " namespace.

Once you have the XML Schema Instance namespace available:

xmlns:xsi="

you can use the schemaLocation attribute. This attribute takes two values, separated by a space. The first value is the namespace to use. The second value is the location of the XML schema to use for that namespace:

xsi:schemaLocation=" note.xsd"
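
The two-part structure of the schemaLocation value can be sketched in Python as follows. This is only an illustration: the function name parse_schema_location and the namespace URI http://example.com/note are assumptions made for this sketch. The value may also hold several namespace/location pairs, so the sketch returns a list.

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs."""
    tokens = value.split()  # values are separated by white space
    if len(tokens) % 2 != 0:
        raise ValueError("schemaLocation needs namespace/location pairs")
    return list(zip(tokens[0::2], tokens[1::2]))

# Hypothetical namespace URI, used only for this illustration.
pairs = parse_schema_location("http://example.com/note note.xsd")
print(pairs)
```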

XML Schemas define the elements of your XML files.

What is a Simple Element?

A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes.

However, the "only text" restriction is quite misleading. The text can be of many different types. It can be one of the types that are included in the XML Schema definition (boolean, string, date, etc.), or it can be a custom type that you can define yourself.

You can also add restrictions (facets) to a data type in order to limit its content, and you can require the data to match a defined pattern.

How to Define a Simple Element

The syntax for defining a simple element is:

<xs:element name="xxx" type="yyy"/>

where xxx is the name of the element and yyy is the data type of the element.

Here are some XML elements:

<lastname>Refsnes</lastname>
<age>34</age>
<dateborn>1968-03-27</dateborn>

And here are the corresponding simple element definitions:

<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
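
To get a feel for what these types accept, here is a simplified Python sketch of the checks a validator might perform on the element text. It is not a real schema validator: the function names are made up for this example, and the date check only covers the plain YYYY-MM-DD form (the real xs:date type also allows a time zone suffix).

```python
import re
from datetime import date

def is_xs_integer(text):
    # Optional sign followed by one or more digits.
    return re.fullmatch(r"[+-]?[0-9]+", text) is not None

def is_xs_date(text):
    # Simplification: plain YYYY-MM-DD only, no time zone suffix.
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text) is None:
        return False
    try:
        date.fromisoformat(text)  # rejects impossible dates such as 02-30
    except ValueError:
        return False
    return True

print(is_xs_integer("34"), is_xs_date("1968-03-27"))
```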

Common XML Schema Data Types

XML Schema has a lot of built-in data types. Here is a list of the most common types:

  • xs:string
  • xs:decimal
  • xs:integer
  • xs:boolean
  • xs:date
  • xs:time

Declare Default and Fixed Values for Simple Elements

Simple elements can have a default value OR a fixed value set.

A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":

<xs:element name="color" type="xs:string" default="red"/>

A fixed value is also automatically assigned to the element. You cannot specify another value. In the following example the fixed value is "red":

<xs:element name="color" type="xs:string" fixed="red"/>
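
The difference between default and fixed can be sketched in Python like this. The function effective_value is a hypothetical helper written for this illustration; it only mimics the behavior described above for an element's text and is not part of any XML library.

```python
def effective_value(text, default=None, fixed=None):
    """Return the value a validator would report for an element's text."""
    if fixed is not None:
        # A fixed value may be left out, but it can never be changed.
        if text not in ("", None) and text != fixed:
            raise ValueError(f"value must be {fixed!r}")
        return fixed
    if text in ("", None) and default is not None:
        # The default only applies when no other value is given.
        return default
    return text

print(effective_value("", default="red"))
print(effective_value("blue", default="red"))
```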

All attributes are declared as simple types.

Only complex elements can have attributes!

What is an Attribute?

Simple elements cannot have attributes. If an element has attributes, it is considered to be of complex type. But the attribute itself is always declared as a simple type. This means that an element with attributes always has a complex type definition.

How to Define an Attribute

The syntax for defining an attribute is:

<xs:attribute name="xxx" type="yyy"/>

where xxx is the name of the attribute and yyy is the data type of the attribute.

Here is an XML element with an attribute:

<lastname lang="EN">Smith</lastname>

And here is the corresponding attribute definition:

<xs:attribute name="lang" type="xs:string"/>

Common XML Schema Data Types

XML Schema has a lot of built-in data types. Here is a list of the most common types:

  • xs:string
  • xs:decimal
  • xs:integer
  • xs:boolean
  • xs:date
  • xs:time

Declare Default and Fixed Values for Attributes

Attributes can have a default value OR a fixed value specified.

A default value is automatically assigned to the attribute when no other value is specified. In the following example the default value is "EN":

<xs:attribute name="lang" type="xs:string" default="EN"/>

A fixed value is also automatically assigned to the attribute. You cannot specify another value. In the following example the fixed value is "EN":

<xs:attribute name="lang" type="xs:string" fixed="EN"/>

Creating Optional and Required Attributes

All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute:

<xs:attribute name="lang" type="xs:string" use="optional"/>

To make an attribute required:

<xs:attribute name="lang" type="xs:string" use="required"/>
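
As a small illustration of optional and required attributes, the Python sketch below looks up the lang attribute with the standard xml.etree.ElementTree module. The helper check_lang and its parameters are assumptions made for this example; a real schema validator does this work for you.

```python
import xml.etree.ElementTree as ET

def check_lang(elem, use="optional", default=None):
    """Return the effective lang value, or raise if a required one is missing."""
    value = elem.get("lang")
    if value is None:
        if use == "required":
            raise ValueError("attribute 'lang' is required")
        return default  # None when no default is declared
    return value

with_lang = ET.fromstring('<lastname lang="EN">Smith</lastname>')
without_lang = ET.fromstring('<lastname>Smith</lastname>')

print(check_lang(with_lang, use="required"))
print(check_lang(without_lang, default="EN"))
```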

Restrictions on Content

When an XML element or attribute has a type defined, the type restricts the element's or attribute's content. If an XML element is of type "xs:date" and contains a string like "Hello Mother", the element will not validate.

But there is more: with XML Schemas, you can add your own restrictions to your XML elements and attributes. These restrictions are called facets. You can read more about facets in the next chapter.

Restrictions are used to control acceptable values for XML elements or attributes. Restrictions on XML elements are called facets.

Restrictions on Values

This example defines an element called "age" with a restriction. The value of age can NOT be lower than 0 or greater than 100:

<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="100"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
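
The effect of the minInclusive and maxInclusive facets can be sketched in Python as a simple range check. The function is_valid_age is a hypothetical helper for this illustration, not a real validator.

```python
import re

def is_valid_age(text):
    # The base type is xs:integer: an optional sign followed by digits.
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        return False
    # minInclusive and maxInclusive: both bounds are allowed values.
    return 0 <= int(text) <= 100

for candidate in ["0", "34", "100", "101", "-1"]:
    print(candidate, is_valid_age(candidate))
```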

Restrictions on a Set of Values

To limit the content of an XML element to a set of acceptable values, we would use the enumeration constraint.

This example defines an element called "car":

<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "car" element is a simple type with a restriction. The acceptable values are: Audi, Golf, BMW.

The example above could also have been written like this:

<xs:element name="car" type="carType"/>
<xs:simpleType name="carType">
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>

Note: In this case the type "carType" can be used by other elements because it is not a part of the "car" element.
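
The idea of a named, reusable type can be sketched in Python as one check shared by several elements. This is only an analogy: the element name "rental" is hypothetical and was added here to show the reuse.

```python
# The set of values allowed by the enumeration facets of "carType".
CAR_TYPE = frozenset({"Audi", "Golf", "BMW"})

def is_valid_car_type(text):
    # Enumeration values are compared exactly, so "audi" is not "Audi".
    return text in CAR_TYPE

# Two elements that refer to the same named type.
ELEMENT_CHECKS = {"car": is_valid_car_type, "rental": is_valid_car_type}

print(ELEMENT_CHECKS["car"]("Audi"))
print(ELEMENT_CHECKS["rental"]("Volvo"))
```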

Restrictions on a Series of Values

To limit the content of an XML element to a defined series of numbers or letters, we would use the pattern constraint.

This example defines an element called "letter":

<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "letter" element is a simple type with a restriction. The only acceptable value is ONE of the LOWERCASE letters from a to z.

The next example defines an element called "initials":

<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "initials" element is a simple type with a restriction. The only acceptable value is THREE of the UPPERCASE letters from A to Z.

This example also defines an element called "initials":

<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "initials" element is a simple type with a restriction. The only acceptable value is THREE of the LOWERCASE OR UPPERCASE letters from a to z.

This example defines an element called "choice":

<xs:element name="choice">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[xyz]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "choice" element is a simple type with a restriction. The only acceptable value is ONE of the following letters: x, y, OR z.

The next example defines an element called "prodid":

<xs:element name="prodid">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "prodid" element is a simple type with a restriction. The only acceptable value is FIVE digits in a sequence, and each digit must be in a range from 0 to 9.
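
If you want to try these patterns outside a schema, the Python sketch below tests them with the standard re module. It assumes that re.fullmatch behaves like an XML Schema pattern for these simple expressions, since a pattern facet must match the whole value. The two regular expression dialects are not identical in general.

```python
import re

# Patterns copied from the examples above.
PATTERNS = {
    "letter": "[a-z]",
    "initials": "[A-Z][A-Z][A-Z]",
    "choice": "[xyz]",
    "prodid": "[0-9][0-9][0-9][0-9][0-9]",
}

def matches(name, text):
    # A pattern facet must match the entire value, hence fullmatch.
    return re.fullmatch(PATTERNS[name], text) is not None

print(matches("initials", "ABC"), matches("initials", "AB"))
print(matches("prodid", "12345"), matches("prodid", "1234"))
```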

Other Restrictions on a Series of Values

Some other restrictions that can be defined by the pattern constraint:

This example defines an element called "letter":

<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z])*"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "letter" element is a simple type with a restriction. The acceptable value is zero or more occurrences of lowercase letters from a to z.

This example also defines an element called "letter":

<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z][A-Z])+"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "letter" element is a simple type with a restriction. The acceptable value is one or more occurrences of a lowercase letter followed by an uppercase letter from a to z. For example, "sToP" is accepted by this pattern, but "Stop", "STOP", and "stop" are not.

This example defines an element called "gender":

<xs:element name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "gender" element is a simple type with a restriction. The only acceptable value is male OR female.

This example defines an element called "password":

<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z0-9]{8}"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "password" element is a simple type with a restriction. There must be exactly eight characters in a row and those characters must be lowercase or uppercase letters from a to z, or a number from 0 to 9.
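
The repetition, alternation, and quantity patterns above can be tried out with a short Python sketch. As an assumption, re.fullmatch from the standard re module stands in for the schema's pattern matching, which works for these simple expressions but is not a full substitute for a schema validator.

```python
import re

def accepts(pattern, text):
    # The whole value must match, not just a part of it.
    return re.fullmatch(pattern, text) is not None

print(accepts("([a-z])*", ""))               # zero occurrences are allowed
print(accepts("([a-z][A-Z])+", "sToP"))      # lowercase/uppercase pairs
print(accepts("male|female", "female"))      # one of two alternatives
print(accepts("[a-zA-Z0-9]{8}", "abc12345")) # exactly eight characters
```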

Restrictions on White Space Characters

To specify how white space characters should be handled, we would use the whiteSpace constraint.

This example defines an element called "address":

<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "address" element is a simple type with a restriction. The whiteSpace constraint is set to "preserve", which means that the XML processor WILL NOT remove any white space characters.

This example also defines an element called "address":

<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

This "address" element is a simple type with a restriction. The whiteSpace constraint is set to "replace", which means that the XML processor WILL REPLACE all white space characters (line feeds, tabs, spaces, and carriage returns) with spaces.

This example also defines an element called "address":

<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

This "address" element is a simple type with a restriction. The whiteSpace constraint is set to "collapse", which means that the XML processor WILL COLLAPSE white space characters: line feeds, tabs, and carriage returns are replaced with spaces, leading and trailing spaces are removed, and multiple spaces are reduced to a single space.
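
The three whiteSpace settings can be sketched in Python as follows. The function names replace_ws and collapse_ws are made up for this illustration, and the sample address is hypothetical. With "preserve" the value is simply left unchanged.

```python
import re

def replace_ws(text):
    # "replace": each tab, line feed, and carriage return becomes one space.
    return re.sub(r"[\t\n\r]", " ", text)

def collapse_ws(text):
    # "collapse": replace first, then squeeze runs of spaces and trim both ends.
    return re.sub(r" +", " ", replace_ws(text)).strip(" ")

raw = "  12 Main\tStreet\n  Oslo  "
print(repr(raw))               # preserve: unchanged
print(repr(replace_ws(raw)))
print(repr(collapse_ws(raw)))
```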

Restrictions on Length

To limit the length of a value in an element, we would use the length, maxLength, and minLength constraints.

This example defines an element called "password":

<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

The "password" element is a simple type with a restriction. The value must be exactly eight characters.

This example defines another element called "password":

<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>

This "password" element is a simple type with a restriction. The value must be at least five characters and at most eight characters long.
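
The length facets can be sketched in Python with a small helper. The function has_length and its parameter names are assumptions made for this example. They mirror the length, minLength, and maxLength constraints for string values.

```python
def has_length(text, length=None, min_length=None, max_length=None):
    """Check a string value against the length facets that are given."""
    n = len(text)  # number of characters in the value
    if length is not None and n != length:
        return False
    if min_length is not None and n < min_length:
        return False
    if max_length is not None and n > max_length:
        return False
    return True

print(has_length("secret12", length=8))
print(has_length("abc", min_length=5, max_length=8))
```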

Restrictions for Datatypes

Constraint / Description
enumeration / Defines a list of acceptable values
fractionDigits / Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
length / Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
maxExclusive / Specifies the upper bounds for numeric values (the value must be less than this value)
maxInclusive / Specifies the upper bounds for numeric values (the value must be less than or equal to this value)
maxLength / Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
minExclusive / Specifies the lower bounds for numeric values (the value must be greater than this value)
minInclusive / Specifies the lower bounds for numeric values (the value must be greater than or equal to this value)
minLength / Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
pattern / Defines the exact sequence of characters that are acceptable
totalDigits / Specifies the maximum number of digits allowed. Must be greater than zero
whiteSpace / Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled

A complex element contains other elements and/or attributes.

What is a Complex Element?

A complex element is an XML element that contains other elements and/or attributes.

There are four kinds of complex elements:

  • empty elements
  • elements that contain only other elements
  • elements that contain only text
  • elements that contain both other elements and text

Note: Each of these elements may contain attributes as well!

Examples of Complex XML Elements

A complex XML element, "product", which is empty:

<product pid="1345"/>

A complex XML element, "employee", which contains only other elements:

<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>

A complex XML element, "food", which contains only text:

<food type="dessert">Ice cream</food>

A complex XML element, "description", which contains both elements and text:

<description>
It happened on <date lang="norwegian">03.03.99</date> ....
</description>
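
To make the four kinds of content more concrete, the Python sketch below inspects the example elements with the standard xml.etree.ElementTree module and reports which kind each one is. The function content_kind and its labels are assumptions made for this illustration. In a schema, the kind of content is decided by the type definition, not by looking at one instance.

```python
import xml.etree.ElementTree as ET

def content_kind(elem):
    has_children = len(elem) > 0
    # Text can appear before the first child (text) or after a child (tail).
    pieces = [elem.text] + [child.tail for child in elem]
    has_text = any(piece and piece.strip() for piece in pieces)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element-only"
    if has_text:
        return "text-only"
    return "empty"

examples = {
    "product": '<product pid="1345"/>',
    "employee": "<employee>\n<firstname>John</firstname>\n"
                "<lastname>Smith</lastname>\n</employee>",
    "food": '<food type="dessert">Ice cream</food>',
    "description": '<description>\nIt happened on '
                   '<date lang="norwegian">03.03.99</date> ....\n</description>',
}

kinds = {name: content_kind(ET.fromstring(xml)) for name, xml in examples.items()}
print(kinds)
```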

How to Define a Complex Element

Look at this complex XML element, "employee", which contains only other elements:

<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>

We can define a complex element in an XML Schema in two different ways:

1. The "employee" element can be declared directly by naming the element, like this:

<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>

If you use the method described above, only the "employee" element can use the specified complex type. Notice that the child elements, "firstname" and "lastname", are surrounded by the <sequence> indicator. This means that the child elements must appear in the same order as they are declared; "firstname" first and "lastname" second. You will learn more about indicators in the XSD Indicators chapter.
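
The ordering rule of the <sequence> indicator can be sketched in Python like this. The helper follows_sequence is hypothetical and only covers the case shown above, where each child element appears exactly once. A real validator also handles occurrence counts and data types.

```python
import xml.etree.ElementTree as ET

# Child elements in the order they are declared inside <xs:sequence>.
EMPLOYEE_SEQUENCE = ["firstname", "lastname"]

def follows_sequence(elem, expected):
    # The children must match the declared names, in the declared order.
    return [child.tag for child in elem] == expected

good = ET.fromstring(
    "<employee><firstname>John</firstname><lastname>Smith</lastname></employee>"
)
swapped = ET.fromstring(
    "<employee><lastname>Smith</lastname><firstname>John</firstname></employee>"
)

print(follows_sequence(good, EMPLOYEE_SEQUENCE))
print(follows_sequence(swapped, EMPLOYEE_SEQUENCE))
```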

2. The "employee" element can have a type attribute that refers to the name of the complex type to use:

<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>

If you use the method described above, several elements can refer to the same complex type, like this: