Dr. Tom's Guide to XSD

Thomas D. Wason, Ph.D. (aka Dr. Tom)
http://www.tomwason.com
wason@mindspring.com

One of the Dr. Tom Guides

Purpose of Document

The purpose of this guide is to describe how IMS uses XML-Schemas as the primary control documents for IMS XML bindings, which commonly accompany IMS specifications. This guide does not describe all of the features of XML-Schema, only those implemented within the IMS XML-Schemas.

Document Information

Title	Dr. Tom's Guide to XSD
Author	Thomas D. Wason
Version Date	2006-05-30
Current version	0.22
Copyright	Copyright © 2001 IMS Global Learning Consortium, Inc. Used by permission

Purpose of Document
Document Information
1. Introduction
2. What is XML-Schema and Why is IMS Using it?
3. How is IMS Using XML-Schema?
  3.1 Schema Conventions
  3.2 Namespaces
  3.3 Instances
4. Basic XML-Schema Concepts
  4.1 The XML Pieces
    4.1.1 The Element
    4.1.2 The Attribute
    4.1.3 The Content Model
    4.1.4 The Data Type
  4.2 Separation of Declarations
  4.3 XSD Types
    4.3.1 Simple Type
    4.3.2 Complex Type
  4.4 XSD Content
    4.4.1 simpleContent
    4.4.2 complexContent
  4.5 Anonymous and Global Definitions
  4.6 Ref: The Ability to Point
  4.7 Predefined Attributes
  4.8 Namespaces
  4.9 Including and Importing
  4.10 Extension
5. Summary

1. Introduction

The IMS specifications define how data are structured for communication among systems. These data structures are "bound" into a standard method of expression called "XML" (eXtensible Markup Language). The mechanics of XML are outside the scope of this document. You can turn to such documents as Dr. Tom's Guide to XML (drtomxml.html) for more information on XML. This document assumes that you are familiar with the basic concepts of XML: elements, attributes, content models, and root elements. Although blocks of XML are the basis for communication among systems, the XML blocks are usually thought of as files with an extension of ".xml". Frequently, but not always, these files are stored at either the transmitting or receiving end of a communication. This guide refers to them as files. The general term for an XML file is an "instance".

XML-Schema is a method for managing XML instance creation and validation. In other words, if you want to make an XML file, such as a manifest, learner information package, or meta-data, that conforms to IMS specifications, you don't want to have to keep looking at the specification as you make the XML. Instead, you want a tool that can help you do it right. This tool needs to use some sort of control document to help you make your XML. In this case, that control document will be written in XML-Schema language, and will have an extension of .xsd. The XSD file contains the information the tool needs to tell you exactly how to make a good XML file. The XSD will also let the tool determine if a file is valid, so you can use it as a checker. The XSD also allows vendors (and real people, too) to author and validate a very large number of different XML structures. You can think of an XSD as a blueprint for making and checking XML instances.

This guide describes the features of XML-Schema used by IMS and how to use XML-Schemas in the context of implementing an IMS specification.

2. What is XML-Schema and Why is IMS Using it?

IMS specifications define how information is exchanged. Everyone in an information exchange has an interest in being able to understand the information in the files that are exchanged. If I send you a message, I want to do it in a way that you can understand. XML allows the creation of new data structures that contain information. You can see that there is a potential problem: if one party can create new data structures, then how is the other to understand the message? It's similar to creating a new language as you use it. You can send along a dictionary and grammar, but then the recipient spends a lot of time interpreting the reference materials. If fluency is the result of frequency and consistency of use, then we want fluency so that we spend our time acting on information, not trying to decipher it. XML supports fluency by allowing users to control consistency.

There are several methods for controlling and testing (i.e., validating) XML files such as Document Type Definition (DTD), XML-Schema (XSD), and Resource Description Framework (RDF) files. These controls are embodied as files with appropriate extensions (e.g., .dtd, .xsd, or .rdf). XML-Schema lies someplace between DTD and RDF in power and elegance. It is less abstract than RDF, but provides deeper control and growth than DTD. XML-Schema supports a modest approach to object-oriented design. For those of you who think about this sort of thing, DTD is early binding, XSD is mid-late binding, and RDF is late binding in the interpretation by application. With a little familiarity, reading an XSD file is not much more difficult than reading a DTD, but you should use tools. Tools exist for XSD and DTD, but as of this writing none are readily available for RDF.

IMS has adopted XML-Schema as its control system, as it has more power than DTD but is not as abstract as RDF. The IMS data structures need the power of XSD for adequate expression. XML-Schema is being developed by the W3C, and is currently in a draft status. IMS has frozen its adoption at the 24 October 2000 version (http://www.w3.org/XML/Schema.html). [The latest draft is 30 March 2001, which is up for vote. There are some small changes, most of which affect data type names.] By adopting a specific version, IMS helps implementers create reliable methods for interpreting the XSD controls.

DTDs can be derived from XSDs, usually at a loss of information or richness. Thus, IMS uses a rich language for its primary control documents, XSDs, and also provides the simpler DTDs derived from those XSD documents.
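
To make the loss concrete, here is a minimal sketch. The element name "credits" is hypothetical and not taken from any IMS binding, and the schema uses the final W3C Recommendation namespace so that current validators accept it (an assumption; the IMS schemas described in this guide use the October 2000 namespace). The XSD constrains the content to a non-negative integer, while the closest DTD declaration, shown in the comment, can only say that the element holds text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- XSD: the content of <credits> must be a non-negative integer. -->
  <xsd:element name="credits" type="xsd:nonNegativeInteger"/>

  <!-- Closest DTD declaration; the integer constraint is lost:
       <!ELEMENT credits (#PCDATA)>
  -->

</xsd:schema>
```

An instance such as <credits>three</credits> would pass the DTD check but fail the XSD check.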

Ideally, an XML instance that validates with a DTD will also validate with an XSD. IMS strives to make the XSDs such that a file that validates with an XSD will also validate with a DTD, but with much more limited data type checking. DTDs are also poorly suited for using namespaces; XSDs use namespaces well. When using DTDs, it is recommended that you not use namespaces, because it is impossible to predict what each application will do with them. As namespaces are important for the reuse of various data structures among the IMS specifications, XSDs are a natural choice for providing both control and extensibility. Namespacing (or using extensions) is complex enough to require a separate guide by itself and will not be discussed in further detail here.

Next, this guide will explain how IMS is using XML-Schema. Then, it'll discuss XML-Schema concepts in greater detail.

3. How is IMS Using XML-Schema?

For those who are familiar with XML-Schema, here is a listing of IMS's usage rules and conventions. If you are not familiar with XML-Schema, you may want to come back to this part later. If XML-Schema is familiar to you, this may be all you need.

3.1 Schema Conventions

All complexTypes are defined globally. None are defined anonymously.
No elements will have mixed contents of data and elements. The use of the "any" element form does allow mixed content as extensions, but mixed content is not used in the IMS XML binding.
Elements will contain either attributes (optional) and data, or attributes (optional) and elements.
All elements will each be defined separately. None will be defined within an element container such as sequence, choice, or all. Each element will be defined with a type.
simpleTypes are defined globally when there is a reasonable expectation that the type may be modified through extension or restriction.
Attributes that are used more than once are defined globally using an attributeGroup. The name of the attribute group has a prefix of "attr." (attr.identify).
An attributeGroup may contain one or more attributes.
Names of attributeGroups that have required usage should have the suffix ".req" (attr.href.req).
Groups of elements (sequence, choice, all) that are used more than once are usually defined in an external group.
External group names have a prefix of "grp." (grp.any).
Simple content models that may reasonably be expected to be redefined in derivations as complexTypes will be defined as complexTypes.
All element names will be lower case, following the XML binding.
Attribute names will be lower case, following the XML binding.
Types that are specific to elements will have names that are the element's name with "Type" appended (learnerinformationType).
Types that are used in more than one element will have lower case names with "Type" appended (datafieldType).
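
The conventions above can be sketched in one small schema. This is an illustration only, not an IMS file: the names "course", "title", "description", and "attr.identifier.req" and the example.org namespace are hypothetical, and the final W3C Recommendation namespace is used so that current validators accept it (the IMS schemas described here use the October 2000 namespace).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema illustrating the naming conventions; not an IMS file. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://example.org/xsd/example_rootv1p0"
            targetNamespace="http://example.org/xsd/example_rootv1p0"
            elementFormDefault="qualified">

  <!-- A reusable attribute wrapped in an attributeGroup:
       "attr." prefix, ".req" suffix because its use is required. -->
  <xsd:attributeGroup name="attr.identifier.req">
    <xsd:attribute name="identifier" type="xsd:ID" use="required"/>
  </xsd:attributeGroup>

  <!-- Each element is declared separately, in lower case, with a type. -->
  <xsd:element name="course" type="courseType"/>
  <xsd:element name="title" type="xsd:string"/>
  <xsd:element name="description" type="xsd:string"/>

  <!-- Global complexType named after its element plus "Type".
       Its content is elements and attributes, with no mixed data. -->
  <xsd:complexType name="courseType">
    <xsd:sequence>
      <xsd:element ref="title"/>
      <xsd:element ref="description" minOccurs="0"/>
    </xsd:sequence>
    <xsd:attributeGroup ref="attr.identifier.req"/>
  </xsd:complexType>

</xsd:schema>
```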

3.2 Namespaces

The IMS XML-Schema namespace directory is: xsd.
The root schema has a target namespace that is comprised of the IMS namespace directory and the filename without the extension: targetNamespace="xsd/imscp_rootv1p1".
Sub-schemas do not have target namespaces.
IMS defines namespace prefixes in a meaningful and consistent manner. For example, the IMS Meta-data prefix is: imsmd:.
XSD files are named incorporating the specification namespace prefix (imsmd_rootv1p2.xsd).
The main schema file for a binding is always the "root" schema (imscp_rootv1p1.xsd).
The specification version information is at the end of the file name, before the extension, starting with the letter "v".
All subschemas of a binding are named to incorporate the prefix and version of the specification (imslip_activityv1p0.xsd).
Sub-schemas have unqualified element forms: elementFormDefault="unqualified"
IMS will maintain all subschema files for a specification within the same directory as the root schema file.
Subschemas will be included, not imported, into the root schema:
```
<xsd:include schemaLocation="ims_lip_activityv1p0.xsd"/>
```
The namespace of the root schema will be the same as the target namespace.
Each schema will bind the prefix "xsd:" to the XML-Schema namespace (xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"). All XSD elements and data types will use this prefix. As attributes assume the namespace of the containing element, it is not normally necessary to provide the "xsd:" prefix for attributes within a schema.
IMS has provided a local XML namespace file: ims_xml.xsd. This is imported into the schema as follows to provide local values for the xml: namespaced attributes xml:lang, xml:base, xml:link (deprecated):
```
<xsd:import namespace="http://www.w3.org/XML/1998/namespace"
 schemaLocation="ims_xml.xsd"/>
```
There is a companion namespace designation in the root element "schema":
```
xmlns:xml="http://www.w3.org/XML/1998/namespace"
```
The specification version and the schema version are not necessarily the same.
If the version of an imported schema changes, it does not affect the version of the schema importing it.
If the version of an included schema changes, the version of the schema including it changes.
Imported IMS schemas should preserve the IMS designated prefix (imsmd:).
For versioning it is safer to import than to include.

The general form of the root element in the root schema is as follows:

<xsd:schema xmlns="xsd/imscp_rootv1p1"
targetNamespace="xsd/imscp_rootv1p1" 
xmlns:xml="http://www.w3.org/XML/1998/namespace" 
xmlns:imsmd="xsd/imsmd_rootv1p2" 
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" 
elementFormDefault="qualified" 
version="1.1:1.1 IMS CP 1.1 Schema 1.0">

Importing IMS bindings will declare both the namespace and the schemaLocation:
```
<import namespace="" schemaLocation=""/>
```
Schemas may provide specific points of extension using the "any" element. A namespace attribute value of "##other" will normally be used.
It is recommended that all elements directly under the root element contain an "any" element with a namespace of "##any" for uncontrolled extensions.
A grp.any group should be used to define the "any" element. This will allow a global adjustment should it become necessary to change the element's characteristics in any way. This group will be included by reference as needed in complex types.
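
A grp.any group of this kind might look like the following sketch. The processContents and occurrence settings, the "record" and "title" elements, and the example.org namespace are assumptions made for illustration; the schema uses the final W3C Recommendation namespace rather than the October 2000 one used by the IMS schemas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema; only the group name grp.any comes from the guide. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://example.org/xsd/example_rootv1p0"
            targetNamespace="http://example.org/xsd/example_rootv1p0"
            elementFormDefault="qualified">

  <!-- The single, global definition of the extension point. -->
  <xsd:group name="grp.any">
    <xsd:sequence>
      <!-- Accept any number of elements from other namespaces;
           validate them only if a schema for them is available. -->
      <xsd:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:group>

  <xsd:element name="record" type="recordType"/>
  <xsd:element name="title" type="xsd:string"/>

  <!-- A complex type that pulls in the extension point by reference. -->
  <xsd:complexType name="recordType">
    <xsd:sequence>
      <xsd:element ref="title"/>
      <xsd:group ref="grp.any"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
```

Because every complex type refers to the same group, changing the namespace or processContents setting in one place changes every extension point.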

3.3 Instances

The default (xmlns) namespace of the root element may conform to the namespace of the specification's root schema (xmlns="xsd/ims_lip_rootv1p0").
Declaration of the specification root schema namespace must match the target namespace declared in the root schema (xsi:schemaLocation="xsd/ims_lip_rootv1p0").
If the schema file name is given without a path name, it is assumed to be local to the parsing application (xsi:schemaLocation="xsd/ims_lip_rootv1p0 ims_lip_rootv1p0.xsd").
A schema filename may be provided with a complete path (xsi:schemaLocation="xsd/ims_lip_rootv1p0 xsd/ims_lip_rootv1p0.xsd"), which points to the schema at the IMS XSD site. Other file paths may be used.

A root element declaration may be as follows:

<learnerinformation xmlns="xsd/ims_lip_rootv1p0"
xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" 
xsi:schemaLocation="xsd/ims_lip_rootv1p0
ims_lip_rootv1p0.xsd">

An instance that uses a schema that imports other schemas must define a namespace and prefix for the indirectly imported schema, except for the IMS file ims_xml.xsd. This will be needed in the use of IMS Meta-data in other specifications, for example content packaging:
```
<manifest xmlns="xsd/imscp_rootv1p1"
xmlns:imsmd="xsd/imsmd_rootv1p2"
xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
xsi:schemaLocation="xsd/imscp_rootv1p1
imscp_rootv1p1.xsd xsd/imsmd_rootv1p2
imsmd_rootv1p2.xsd" 
identifier="Manifest01" version="MAN01.01">
```
XML instances shall not import the ims_xml.xsd file, nor shall they create its namespace:
```
xmlns:xml="http://www.w3.org/XML/1998/namespace"
```

4. Basic XML-Schema Concepts

An XSD file is written in XML; thus it is made up of a collection of elements and attributes. An XSD allows each element and attribute, as well as each content model, to be defined. XML-Schema also allows the definition of data types that represent the actual information.

This guide is not meant to provide in-depth commentary about XML-Schema, but is only meant to provide explanation of the parts that IMS is using. Let's start with the XML pieces, as that is what XML-Schema manages.

4.1 The XML Pieces

This section provides a quick XML refresher. The complete XML 1.0 Specification document is available at: http://www.w3.org/TR/REC-xml. Dr. Tom's Guide to XML is available at drtomxml.html.

XML is a system for communicating structured data. The structure of the data is called the "information model". XML is one method of representing or "binding" the information model in a commonly understood form. XML has two major components, the element and the attribute. An information model is mapped into a combination of elements and attributes. The selection of what becomes an element and what becomes an attribute is determined by an organization, such as IMS.

4.1.1 The Element

An element is a basic unit in XML. It is the component that holds information. That information may be data, attributes, and/or additional elements. Think of it as a little box. The name of the element is its "token". An element looks like this:

<file>index.html</file>

The data "index.html" is contained in the element with a name of "file". Here's an element that contains some other elements:

<file>
  <name>index.html</name>
  <location>http://www.imsglobal.org</location>
</file>

4.1.2 The Attribute

Think of the attribute as a label on the element box. Sometimes there is so much information on the label that the box itself is empty. An empty element with an attribute looks like this:

<file href="index.html"/>

An attribute may be required or optional. It may have a predefined set of values that can be used. One of the values may be the default value, so if you don't use the attribute, the default value is assumed. Attributes have data types.
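
In XML-Schema terms, those properties might be declared as in the sketch below. The "access" attribute and its values are hypothetical, the type is written inline for brevity (IMS would define it globally), and the final W3C Recommendation namespace is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="file">
    <!-- No content model, so the element itself must be empty. -->
    <xsd:complexType>
      <!-- Required attribute with a data type. -->
      <xsd:attribute name="href" type="xsd:anyURI" use="required"/>
      <!-- Optional attribute with a predefined set of values;
           "read" is assumed when the attribute is left out. -->
      <xsd:attribute name="access" default="read">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="read"/>
            <xsd:enumeration value="write"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:attribute>
    </xsd:complexType>
  </xsd:element>

</xsd:schema>
```

With this schema, <file href="index.html"/> is valid and is treated as if access="read" had been written.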

4.1.3 The Content Model

A content model of an element is a description of the elements it can contain. In addition to a simple listing of elements, each element's use is defined. Is it required? Is it optional? Can it be used only once or more than once? Can you choose it from among others? The element may contain text of some form. Is the order important? The content model is the definition of that information.

The content model within XML-Schema can have simpleContent and complexContent definitions, which can sometimes cause confusion. Also, naming can sometimes make things more complicated than they really are.

4.1.4 The Data Type

Elements and attributes can have data values. Data values may be of specific types, such as strings, datetime, integer, and so forth. XML 1.0 defines a small set of attribute types in its DTD syntax (CDATA, ID, NMTOKEN, and the like), and XML-Schema expands the list considerably: http://www.w3.org/TR/2000/CR-xmlschema-2-20001024/. In XML-Schema, the data type is usually defined explicitly; this will be illustrated later.

To continue our discussion about XML-Schema basics, there are two major concepts to remember in XSD: separation of declarations and types. These complement one another and are discussed below.

4.2 Separation of Declarations

A central concept of XML-Schema is containment. What can contain what? According to what rules? XML-Schema creates containers and allows the separation of components into pieces so that each aspect of a component can be contained and managed. In other words, you can separate the declaration of an element from what is in that element.

Figure 1.1 Separation of declaration of a component from declaration of its type.

In addition to separating the definition of a type from the declaration of a component, that type is now available for use "globally". You will see that this global nature of a type will bring us to the "content" definitions. But more about that later. Let us now turn to the types.

4.3 XSD Types

A type is a container describing an element or attribute. If you consider an element as a complete whole, with a content model and attributes, you have a complete package describing the element. This is called the element's type. To reiterate: a type defines what an element can contain (its element content model) and what its attributes are. It also defines the data types for the actual values held by the elements and attributes.

XML-Schema bundles all of the type information about an element into a "type". There are two kinds of type: simple type and complex type.

4.3.1 Simple Type

A simple type is defined with an element named "simpleType". The following is a fragment of XML that will help illustrate what sort of XML-Schema might define it. The example uses a camera description (this is an extrapolation of the camera example from the xfront.com guide http://www.xfront.com/HideVersusExpose.html).

<description>
  black metal
</description>

This <description> element contains the text "black metal". The text is of a simple data type, string. The string data type is defined in the XML Schema Definition namespace (xsd:). An XSD fragment that will provide a definition of the XML is:

<element name="description">
  <simpleType>
    <restriction base="xsd:string"/>
  </simpleType>
</element>

Attributes always have simple types. A simpleType is derived from some primitive data type. The only derivation permitted is a restriction of the primitive type. In this example, the restriction declares that the derivation is based on "xsd:string". As you will see later, we could restrict the string to contain only a term from an enumerated list. simpleType can do this. A simple type does not have any attributes and does not contain any other elements.
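
As a sketch of the enumerated-list restriction mentioned above, the following schema limits the description to one of two terms. The type name "finishType" and the value "chrome" are hypothetical, and the final W3C Recommendation namespace is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- A simpleType derived from xsd:string by restriction:
       only the listed terms are allowed. -->
  <xsd:simpleType name="finishType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="black metal"/>
      <xsd:enumeration value="chrome"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="description" type="finishType"/>

</xsd:schema>
```

Because xsd:string preserves whitespace, an instance would have to be written as <description>black metal</description>, with no line breaks or padding around the text, to match the enumeration.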

The example above looks a bit complex for describing a simple string data type. A simplified way of writing this is:

<element name="description" type="xsd:string"/>

There are no restrictions on the type. As you can see, the XML-Schema fragments are XML! The whole thing is a bootstrap operation. Once you have XML, why not use it to define how to make an XML instance? This is exactly what XML-Schema does. The file extension of .xsd is used to indicate that a file is an XML schema rather than an XML instance. You can read an XSD file with an XML editor. The XML-Schema standard defines a set of elements with attributes. The fact that these elements have names such as "element", "attribute", and "attributeGroup" may seem confusing at first. With some thought you can see that an XSD simply provides a method for defining (e.g., element name="...") an element or referring to one that already exists (e.g., element ref="...").

4.3.2 Complex Type

Not surprisingly, the complex type is defined by an element called "complexType". A complex type may define elements, attributes, and data that an element may contain. IMS does not mix data and elements in contents, so I'll not have any discussion of such a mix here. IMS complex types contain either elements and attributes or data and attributes. Never all three. If there is only data, then it can be a simple type. The general model of an XSD complex type is:

<complexType>
  <container>
    <element/>
    <element/>
  </container>
  <attribute/>
</complexType>

Figure 1.2 General form of a complexType.

XSD wraps up the list of elements into a sort of container defining how they are to be organized. The container types are: sequence, choice, all, and group. The type group is actually just an intermediate container, as it can only contain sequence, choice, or all containers. The purpose of group is to allow separation of declarations again, as groups can be reused. For now, don't be concerned about group; it's just a transparent bundling mechanism. Attributes can also be wrapped up into an "attributeGroup" for repeated use of a common set of attributes.

4.4 XSD Content

Now let us turn to the "content" elements in XML-Schema. The "content" elements are simpleContent and complexContent. They hold other things and fit the general model of XSD in creating various kinds of containers.

Figure 1.3 Relationship of content to Type.

The rules for what can go into simpleContent are different from what can go into complexContent. XML-Schema uses the device of these two elements to enforce those rules, so think of them as rule containers that can be interpreted by an XML Parser. Let's start with the simpleContent.

4.4.1 simpleContent

An element may contain data and have attributes. For example, we may want to say that our descriptive text is in a particular language. We can use the predefined attribute of "xml:lang" to define the language (in this case, US English):

<description xml:lang="en-US">
  black metal
</description>

XML-Schema has a simpleContent model that defines a complex type that has only data and attributes.

Figure 1.4 Relationship of simpleContent to complexType.

This is more than a simple type, as a simple type cannot have attributes. It is less than a fully complex type, as the element's content can only be data (i.e., no elements). Simple content is thus a special limited case of a complex type. It is derived from some simple type, the base. As it is filling a complexType, it may restrict or extend that base type:

<element name="description">
  <complexType>
    <simpleContent>
      <extension base="xsd:string">
        <attribute ref="xml:lang"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

The rules for what can occur within the restriction element within a simpleContent are that it can only contain one simpleType and that it can modify that type only by restricting it in some way, such as the maximum length of a string. The restriction may also add attributes.

An extension within a simpleContent can only add attributes to the declared base type.
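
The two derivations can be sketched side by side. This example follows the syntax of the final W3C Recommendation, where a restriction inside simpleContent names a complex type with simple content as its base; the October 2000 draft adopted by IMS may differ in detail. The type names and the "source" attribute are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Extension: start from a simple type and add an attribute. -->
  <xsd:complexType name="descriptionType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string">
        <xsd:attribute name="source" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- Restriction: keep the attribute, but limit the text
       to at most 40 characters. -->
  <xsd:complexType name="shortdescriptionType">
    <xsd:simpleContent>
      <xsd:restriction base="descriptionType">
        <xsd:maxLength value="40"/>
      </xsd:restriction>
    </xsd:simpleContent>
  </xsd:complexType>

  <xsd:element name="description" type="shortdescriptionType"/>

</xsd:schema>
```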

4.4.2 complexContent

If you can derive a complex type from a simple type via the simpleContent method, then shouldn't you be able to derive a complexType from some other complexType? Right-o, you can. Instead of using simpleContent you use complexContent.

Figure 1.5 Relationship of complexContent to complexType.

Let's start with a complexType that defines a content model of a body and a lens element.

<complexType name="simplecameraType">
  <sequence>
    <element ref="body"/>
    <element ref="lens"/>
  </sequence>
</complexType>

The element container is a sequence. We can extend the simplecameraType by adding a "manual_adapter" element:

<complexType name="extendedcameraType">
  <complexContent>
    <extension base="simplecameraType">
      <sequence>
        <element ref="manual_adapter"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

What we have done is derive a new complexType from the simplecameraType through extension. The result is the same as:

<complexType name="extendedcameraType">
  <sequence>
    <element ref="body"/>
    <element ref="lens"/>
    <element ref="manual_adapter"/>
  </sequence>
</complexType>

If we had derived by restricting the original type, we would have had to repeat every element in the sequence, defining the one that is no longer to be used with a maxOccurs="0" declaration. A key to complexContent is that it can be made from a complexType or a simpleType. simpleContent can be made from only a simpleType.
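
A derivation by restriction might look like the sketch below. It assumes that "manual_adapter" is optional (minOccurs="0") in the base type, since a restriction can only remove something the base already allows to be absent. The type name "fixedcameraType" and the string types given to the three elements are hypothetical, and the final W3C Recommendation namespace is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="body" type="xsd:string"/>
  <xsd:element name="lens" type="xsd:string"/>
  <xsd:element name="manual_adapter" type="xsd:string"/>

  <!-- Base type: the adapter is optional here. -->
  <xsd:complexType name="extendedcameraType">
    <xsd:sequence>
      <xsd:element ref="body"/>
      <xsd:element ref="lens"/>
      <xsd:element ref="manual_adapter" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Restricted type: every element is repeated, and the adapter
       is switched off with maxOccurs="0". -->
  <xsd:complexType name="fixedcameraType">
    <xsd:complexContent>
      <xsd:restriction base="extendedcameraType">
        <xsd:sequence>
          <xsd:element ref="body"/>
          <xsd:element ref="lens"/>
          <xsd:element ref="manual_adapter" minOccurs="0" maxOccurs="0"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
```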

4.5 Anonymous and Global Definitions

This subject merits its own section because it is important to how IMS uses XML-Schema. There are two methods for defining an element's type information: anonymous and global. To repeat the XSD fragment defining camera:

<element name="camera">
  <complexType>
    <sequence>
      <element ref="body"/>
      <element ref="lens"/>
      <element ref="manual_adapter"/>
    </sequence>
  </complexType>
</element>

Notice that for the "camera" element we have defined the content model using a type. The type shown is "anonymous"; literally, without a name. This means that the type cannot be referred to, it has no name.

An element's type can be declared external to the element itself. We have actually seen this in the type="xsd:string" statement. Someplace there is a definition of the type "string". That someplace is in XSD land. A type can be declared external to an element or attribute. This is a "global" type. An element can be created and named, and then it can have a type declared through reference to a defined type. The following is an example using the "camera" element:

<element name="camera" type="cameraType"/>
<complexType name="cameraType">
  <sequence>
    <element ref="body"/>
    <element ref="lens"/>
    <element ref="manual_adapter"/>
  </sequence>
</complexType>

That's pretty simple, isn't it? The complexType was simply moved out of the camera element and named "cameraType". The element named "camera" now declares itself to be of type "cameraType".

Note: IMS will always declare complexTypes globally. This will make them available for controlled extension or restriction by derivation.

Within an element definition, the type attribute and an inline simpleType or complexType are mutually exclusive. Using both would define the type twice.

4.6 Ref: The Ability to Point

In the example above you will note that within <sequence> there are a number of elements. These elements don't have names, but use a "ref" designation instead. This means that the element has been defined elsewhere, and the "ref" attribute is "referencing" the defined element. IMS will always define its elements separately and include them by reference. In order to be able to refer, or point, to something, it must have a name. If something does not have a name, it is anonymous.

An anonymous definition is "in line". Typically, an anonymous definition does not have a name, so it cannot be referred to. IMS uses global definitions instead of anonymous definitions. A global definition is accessible "globally". A globally defined element, type, or attribute can be referred to.

Sometimes it is useful to group attributes or elements together, especially when the same collection of components may be reused by different types. For elements, the group is called a "group". A group does not directly contain elements, but contains one of several different types of containers: sequence, choice, all. A group, when declared independently (as in IMS XSDs), has a name. Attribute groups also have names and define a set of one or more attributes.
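
As a sketch of a named group being reused, the schema below defines one group and references it from two complex types. The group name "grp.fileinfo", the "archive" element, and the type names are hypothetical ("file", "name", and "location" are borrowed from the earlier element example), and the final W3C Recommendation namespace is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="name" type="xsd:string"/>
  <xsd:element name="location" type="xsd:anyURI"/>

  <!-- The group holds a container (here a sequence), not bare elements. -->
  <xsd:group name="grp.fileinfo">
    <xsd:sequence>
      <xsd:element ref="name"/>
      <xsd:element ref="location" minOccurs="0"/>
    </xsd:sequence>
  </xsd:group>

  <!-- Two different types reuse the same collection of elements. -->
  <xsd:element name="file" type="fileType"/>
  <xsd:complexType name="fileType">
    <xsd:group ref="grp.fileinfo"/>
  </xsd:complexType>

  <xsd:element name="archive" type="archiveType"/>
  <xsd:complexType name="archiveType">
    <xsd:sequence>
      <xsd:group ref="grp.fileinfo"/>
      <xsd:element ref="file" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
```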

4.7 Predefined Attributes

Some attributes that IMS uses have been predefined in the XML standard. These are xml:lang and xml:base. xml:lang is an attribute that has a data type specified in the XML standard. Typically it is used to define the language of a string contained in an element:

<langstring xml:lang="en-US">Here's some text.</langstring>

Its value takes the form of an ISO language code, optionally followed by a country code, such as "en" or "en-US".

xml:base provides a base (offset) URI for addresses. For example:

<resource xml:base="http://imsglobal.org/xsd/">
  <file href="imscp_rootv1p1.xsd"/>
  <file href="imsmd_rootv1p2.xsd"/>
</resource>

This logically creates two file references:

http://imsglobal.org/xsd/imscp_rootv1p1.xsd
http://imsglobal.org/xsd/imsmd_rootv1p2.xsd

4.8 Namespaces

Within the IMS domain, a namespace is a means of defining a particular schema. The namespace is then associated with components within the schema and the XML instance, for example:

<manifest ... >
  <metadata>
    <imsmd:lom>
      <imsmd:general>
        <imsmd:title>
          <imsmd:langstring xml:lang="en">
            Sniffy the Virtual Rat
          </imsmd:langstring>
        </imsmd:title>
      </imsmd:general>
    </imsmd:lom>
  </metadata>
</manifest>

The components with a prefix (imsmd:, xml:) are imported from the IMS Meta-data and the XML namespaces respectively. The <manifest> root element in this example is shown without its attributes. Actually, the root element is one of the most common places to define namespaces. A root element of an instance that uses the namespace can be of the form:

<manifest 
  xmlns="xsd/imscp_rootv1p1" xmlns:imsmd="xsd/imsmd_rootv1p2"
  xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" 
  xsi:schemaLocation="xsd/imscp_rootv1p1
  imscp_rootv1p1.xsd 
  xsd/imsmd_rootv1p2
  imsmd_rootv1p2.xsd" 
  identifier="Manifest01" version="MAN01.01">

What does all of this mean? It is similar to a set of pointers. The prefix label points to the namespace label which points to the actual file and XML-Schema.

Figure 1.6 Prefix to namespace to schema.

Notice that the schemaLocation has two parts: the first part is the namespace and the second is the path to the file, including the filename. Why doesn't the prefix simply point to the schema? Think of it this way: the namespace is the label on the can of schema. If the schema were a can of Brand-X peas, you would go to the grocery store and pick the can of peas by its label. The namespace label can be anything. It is a way of referring to the schema that has the component. By convention (one that IMS adopts), the namespace is the equivalent of a brand or copyright. The XML-Schema file can actually declare the label by which it must be referred to. This is the "targetNamespace". It is effectively the label on the can of peas, and you can't scrape it off. In some ways, it can be used to demonstrate the source of the XSD. This is the normal convention; think of it as similar to a copyright. The targetNamespace is not required. IMS uses it on the top level of each of its specifications' XML-Schemas, the "root" schema. Sub-schemas don't have targetNamespaces and are placed in the same directory as the root schema. They "adopt" the namespace of the schema that includes them. An XML-Schema root element, called "schema", is of the form:

<xsd:schema xmlns="xsd/imscp_rootv1p1" 
targetNamespace="xsd/imscp_rootv1p1" 
xmlns:xml="http://www.w3.org/XML/1998/namespace" 
xmlns:imsmd="xsd/imsmd_rootv1p2" 
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" 
elementFormDefault="qualified" 
version="1.1:1.1 IMS CP 1.1 Schema 1.0">
<xsd:import namespace="xsd/imsmd_rootv1p2"
  schemaLocation="xsd/imsmd_rootv1p2.xsd"/>
<xsd:import namespace="http://www.w3.org/XML/1998/namespace" 
  schemaLocation="ims_xml.xsd"/>

The namespace prefix could be defined at the start of the block that contains all of the namespaced elements:

<manifest xmlns="xsd/imscp_rootv1p1"
xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" 
xsi:schemaLocation="xsd/imscp_rootv1p1 
ims_cp_rootv1p1.xsd xsd/imsmd_rootv1p2
imsmd_rootv1p2.xsd" 
identifier="Manifest01" version="MAN01.01">
  <metadata>
    <imsmd:lom xmlns:imsmd="xsd/imsmd_rootv1p2"
xsi:schemaLocation="xsd/imsmd_rootv1p2 imsmd_rootv1p2.xsd">
      <imsmd:general>
        <imsmd:title>
          <imsmd:langstring xml:lang="en">
            Sniffy the Virtual Rat
          </imsmd:langstring>
        </imsmd:title>
      </imsmd:general>
    </imsmd:lom>
  </metadata>
</manifest>

Actually, the namespace may be declared at any point in the hierarchy at or above the point where the namespaced components are used. If the namespace is declared as the default for a block (an xmlns attribute with no prefix), it applies to that element and to everything inside it, so the elements in the block need no prefix. An example is the best way of explaining this:

<manifest ... >
  <metadata>
    <lom xmlns="xsd/imsmd_rootv1p2"
xsi:schemaLocation="xsd/imsmd_rootv1p2
imsmd_rootv1p2.xsd">
      <general>
        <title>
          <langstring xml:lang="en">
            Sniffy the Virtual Rat
          </langstring>
        </title>
      </general>
    </lom>
  </metadata>
</manifest>
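
The prefix form and the block form name exactly the same elements. The following is a minimal sketch, not part of the IMS bindings, that uses Python's standard xml.etree.ElementTree module to check this. The namespace label is copied as it appears in this guide, the manifest wrapper and the schemaLocation attributes are left out for brevity, and the helper name qualified_names is hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix form: every meta-data element carries the imsmd: prefix.
prefix_form = """
<metadata>
  <imsmd:lom xmlns:imsmd="xsd/imsmd_rootv1p2">
    <imsmd:general>
      <imsmd:title>
        <imsmd:langstring xml:lang="en">Sniffy the Virtual Rat</imsmd:langstring>
      </imsmd:title>
    </imsmd:general>
  </imsmd:lom>
</metadata>
"""

# Block form: the namespace is declared once as the default for the lom block.
block_form = """
<metadata>
  <lom xmlns="xsd/imsmd_rootv1p2">
    <general>
      <title>
        <langstring xml:lang="en">Sniffy the Virtual Rat</langstring>
      </title>
    </general>
  </lom>
</metadata>
"""


def qualified_names(xml_text):
    """Return the namespace-qualified tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


prefix_names = qualified_names(prefix_form)
block_names = qualified_names(block_form)
```

Both lists are identical: the parser expands each name to the form {namespace}localname, so the choice between prefix and block is purely a matter of how the instance is written. The metadata element itself sits outside the declaration in this cut-down sample and therefore has no namespace.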

Use either of the forms, prefix or block. Typically, all of the schemaLocations are defined in the root element. This is good practice, as it brings together in one place the references to all of the external files to be accessed.

<manifest xmlns="xsd/imscp_rootv1p1"
xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
xsi:schemaLocation="xsd/imscp_rootv1p1 imscp_rootv1p1.xsd
xsd/imsmd_rootv1p2 imsmd_rootv1p2.xsd" 
identifier="Manifest01" version="MAN01.01">
  <metadata>
    <lom xmlns="xsd/imsmd_rootv1p2">
      <general>
        <title>
          <langstring xml:lang="en">
            Sniffy the Virtual Rat
          </langstring>
        </title>
      </general>
    </lom>
  </metadata>
</manifest>
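
The schemaLocation value in the root element above is a whitespace-separated list of pairs: a namespace label followed by the file that holds its schema. As a rough sketch of how a tool might read those pairs, the following Python uses the standard xml.etree.ElementTree module. The function name schema_locations is hypothetical, the manifest is reduced to its root element, and the namespace labels and the XMLSchema-instance URI are copied as they appear in this guide.

```python
import xml.etree.ElementTree as ET

# XMLSchema-instance namespace as it is written in this guide's examples.
XSI_NS = "http://www.w3.org/2000/10/XMLSchema-instance"

manifest_text = """
<manifest xmlns="xsd/imscp_rootv1p1"
    xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
    xsi:schemaLocation="xsd/imscp_rootv1p1 imscp_rootv1p1.xsd
    xsd/imsmd_rootv1p2 imsmd_rootv1p2.xsd"
    identifier="Manifest01" version="MAN01.01"/>
"""


def schema_locations(xml_text):
    """Map each namespace label in xsi:schemaLocation to its schema file."""
    root = ET.fromstring(xml_text)
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    # Tokens alternate: namespace, location, namespace, location, ...
    return dict(zip(tokens[0::2], tokens[1::2]))


locations = schema_locations(manifest_text)
```

Here locations maps "xsd/imscp_rootv1p1" to "imscp_rootv1p1.xsd" and "xsd/imsmd_rootv1p2" to "imsmd_rootv1p2.xsd", which is the "label on the can" lookup described earlier.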

Notice that the prefix on xml:lang is never defined. It is built into XML, so you don't need to define it in the XML instance. You do need to define something for it in the schema. IMS has provided a local xsd for that, ims_xml.xsd.
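
A quick way to see that the xml prefix is built in is to parse an instance that never declares it. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the single-element sample is an assumption made for brevity.

```python
import xml.etree.ElementTree as ET

# The namespace that the built-in xml prefix always stands for.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# No xmlns:xml declaration appears anywhere in this instance.
element = ET.fromstring(
    '<langstring xml:lang="en">Sniffy the Virtual Rat</langstring>'
)

# The parser still resolves xml:lang to the reserved XML namespace.
language = element.get("{%s}lang" % XML_NS)
```

The parser accepts the instance and language holds "en", even though the instance itself says nothing about where the xml prefix comes from.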

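elementFormDefault="qualified" determines whether locally declared elements must be namespace-qualified when they appear in instances. With the value "qualified", as in the IMS schema above, every element in an instance must be in a namespace, either through a prefix or through a default namespace declaration. Attributes are treated differently: an unprefixed attribute is not given a namespace of its own and is simply interpreted as belonging to the element that contains it, unless it is explicitly namespaced (as xml:lang is).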

So now we have namespaces, which let us bring together different pieces from different schemas. We can do this by including, importing, and extending. The main difference between importing and including is that, for an import, the instance must show where the imported components come from by declaring their namespace. Including can be considered a method of aggregating sub-schemas into one big schema.

4.9 Including and Importing

Including brings in an entire schema. It adds the included schema to the including schema. It can be a method for breaking a schema down into convenient sub-schemas. Including looks like this:

<xsd:include schemaLocation="ims_lip_commonv1p0.xsd"/>

The included schema is not reflected in the root element. The included schema is referred to by the URI of the schemaLocation. An included schema must have the same namespace as the including schema, or no targetNamespace, in which case, it will inherit the including schema's namespace. IMS will always put the included schemas in the same namespace (i.e., directory) as the including schema. The included schema is completely transparent to the instance.
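
The namespace rule for including can be written down as a small check. The sketch below is only an illustration of the rule as stated above, not a schema processor: the function name effective_namespace is hypothetical, the schemas are reduced to bare root elements, and the XML-Schema namespace URI is the one used in this guide's examples.

```python
import xml.etree.ElementTree as ET


def effective_namespace(including_xsd, included_xsd):
    """Return the namespace the included schema's components end up in.

    Raises ValueError when the include is not allowed, that is, when the
    included schema declares a different targetNamespace.
    """
    including_ns = ET.fromstring(including_xsd).get("targetNamespace")
    included_ns = ET.fromstring(included_xsd).get("targetNamespace")
    if included_ns is None:
        # No targetNamespace: inherit the including schema's namespace.
        return including_ns
    if included_ns != including_ns:
        raise ValueError(
            "cannot include a schema from namespace %r into %r; use import"
            % (included_ns, including_ns)
        )
    return included_ns


XSD = 'xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"'
root_schema = '<xsd:schema %s targetNamespace="xsd/imscp_rootv1p1"/>' % XSD
sub_schema = "<xsd:schema %s/>" % XSD  # no targetNamespace, as in IMS sub-schemas
other_schema = '<xsd:schema %s targetNamespace="xsd/imsmd_rootv1p2"/>' % XSD

inherited = effective_namespace(root_schema, sub_schema)
```

With these samples, inherited is "xsd/imscp_rootv1p1", while effective_namespace(root_schema, other_schema) raises ValueError because a schema with its own, different namespace has to be imported rather than included.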

Importing brings in specific elements from another schema for use in the importing schema. The integrity of the namespace of the imported schema is maintained. The imported schema may be located anyplace, and have its own namespace. For example, the IMS content packaging schema imports the IMS Meta-data schema:

<xsd:import namespace="xsd/imsmd_rootv1p2" 
schemaLocation="imsmd_rootv1p2.xsd"/>

<xsd:import namespace="xsd/imsmd_rootv1p2" 
schemaLocation="xsd/imsmd_rootv1p2.xsd"/>

The second example imports a meta-data schema that is located on the IMS XSD site.
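
Because each import pairs a namespace with a schemaLocation, a tool can list what a schema depends on by reading its import elements. The following is a rough sketch using Python's standard xml.etree.ElementTree module. The helper names local_name and list_imports are hypothetical, and the sample schema is a cut-down, closed version of the content packaging root element shown earlier, with the namespace labels copied as they appear in this guide.

```python
import xml.etree.ElementTree as ET

schema_text = """
<xsd:schema xmlns="xsd/imscp_rootv1p1"
    targetNamespace="xsd/imscp_rootv1p1"
    xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
    elementFormDefault="qualified">
  <xsd:import namespace="xsd/imsmd_rootv1p2"
      schemaLocation="imsmd_rootv1p2.xsd"/>
  <xsd:import namespace="http://www.w3.org/XML/1998/namespace"
      schemaLocation="ims_xml.xsd"/>
</xsd:schema>
"""


def local_name(tag):
    """Strip the {namespace} part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def list_imports(xsd_text):
    """Map each imported namespace to the schemaLocation given for it."""
    root = ET.fromstring(xsd_text)
    return {
        child.get("namespace"): child.get("schemaLocation")
        for child in root
        if local_name(child.tag) == "import"
    }


imports = list_imports(schema_text)
```

The result has one entry per imported namespace: the meta-data namespace pointing at imsmd_rootv1p2.xsd, and the reserved XML namespace pointing at the local ims_xml.xsd.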

4.10 Extension

There are two basic types of extensions to XML-Schema controlled documents: predefined extension points and redefined content models. A predefined extension point can be thought of as a "free extension". At a free extension point, you are free to add an element or an attribute from any (usually another) namespace. The new element's namespace must be declared as explained above. A redefined content model can be thought of as a "controlled extension". In a controlled extension, you override the containing element, extending its model to include the new element and/or attribute. The new element is defined to be equivalent to the one it replaces by declaring it to be in the substitution group of the old element. This is all a bit complicated, so look to the guide on extensions.

5. Summary

Now you can go back to the list of rules about how IMS implements XML-Schema and they should make sense. The subject of extensions will be covered in a new guide. You should now know enough about XML-Schema to start using IMS XSDs.

Many of the terms in this guide are defined in the Glossary.

Author:

Thomas D. Wason, Ph.D. (aka Dr. Tom)
http://www.tomwason.com
wason@mindspring.com
