How do you associate a schema with an XML document? It’s easy, but the details are also easily forgotten. Here is a summary and a how-to.
Why?
- You have an XML document and want it validated against one or more W3C XML schemas. How do you tell the validator which schemas to use?
- Your XML editor can display useful hints and prevent creating invalid documents when it knows which schemas to use.
How?
- Associations between XML documents and W3C XML Schemas are created by adding specific attributes in the http://www.w3.org/2001/XMLSchema-instance namespace.
- To be able to do this, you have to declare this namespace in your XML document, usually on the root element:
<RootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" …> … </RootElement>
- The usual namespace prefix is xsi. Better stick to this to avoid confusion.
- Use the xsi:noNamespaceSchemaLocation attribute to specify the location of a schema file for the part(s) of your document that have no namespace attached (or when you don’t use namespaces at all):
<RootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://bogus.com/schemas/schemaforthisdocument.xsd" … > … </RootElement>
- The xsi:noNamespaceSchemaLocation attribute is normally put on the root element. Only in pathological circumstances (non-namespaced XML inside namespaced XML) might you have a reason to put it elsewhere.
- For namespaces, use the xsi:schemaLocation attribute. The content of this attribute consists of pairs of (namespace-name, schema-location-for-this-namespace), with all names and locations separated by whitespace. For instance:
<RootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="namespace1 schema-location-for-namespace1 namespace2 schema-location-for-namespace2 …" … > … </RootElement>
- There are several strategies for where to put the xsi:schemaLocation attribute(s):
  - You can use a single xsi:schemaLocation attribute on the root element, specifying all namespaces and schema locations. Since namespaces and schema locations are usually long strings, this quickly becomes unreadable.
  - You can use an xsi:schemaLocation attribute on the first occurrence of an element/attribute in a specific namespace, specifying only the association for this particular namespace. This is cleaner but spreads the information on associated schemas all over your document.
  - You can make a mess of it by mixing both approaches or by specifying locations on elements that have nothing to do with the namespace involved. The tools won’t mind but readers will.
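Whichever strategy a document uses, the associations can be gathered in one pass over all elements. The following is a minimal Python sketch of that idea, using only the standard library. The function name collect_schema_hints, the sample document and the urn:example:… namespaces are made up for illustration; a real schema processor may treat the hints differently.

```python
import xml.etree.ElementTree as ET

# The XMLSchema-instance namespace; ElementTree exposes namespaced
# attributes under the key "{namespace}localname".
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def collect_schema_hints(xml_text):
    """Return (namespace -> location dict, list of no-namespace locations).

    Hypothetical helper: it only reads the hints, it does not validate.
    """
    root = ET.fromstring(xml_text)
    by_namespace = {}
    no_namespace = []
    for element in root.iter():
        # xsi:schemaLocation holds whitespace-separated namespace/location pairs.
        tokens = element.get(f"{{{XSI}}}schemaLocation", "").split()
        if len(tokens) % 2:
            raise ValueError("xsi:schemaLocation needs namespace/location pairs")
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            # Assumption of this sketch: the first hint seen for a namespace wins.
            by_namespace.setdefault(namespace, location)
        location = element.get(f"{{{XSI}}}noNamespaceSchemaLocation")
        if location is not None:
            no_namespace.append(location)
    return by_namespace, no_namespace


# Made-up sample that mixes both placement strategies.
SAMPLE = """\
<RootElement xmlns="urn:example:one"
             xmlns:two="urn:example:two"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="urn:example:one one.xsd">
  <two:Child xsi:schemaLocation="urn:example:two schemas/two.xsd"/>
</RootElement>
"""

hints, unqualified = collect_schema_hints(SAMPLE)
for namespace, location in hints.items():
    print(namespace, "->", location)
```

Listing the hints this way is also a quick check on a document where the attributes are scattered over many elements.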
Pitfalls and peculiarities?
- Values in the xsi:… attributes have the official status of hints to the schema processor. So they can be ignored, handled incorrectly or overridden by something else.
- Some tools have a tool-specific way to associate a document with its schemas. This usually overrides the values set in the xsi:… attributes, since these are only hints…
- If the schema locations contain relative path information, this is resolved against the location of the XML document. So be careful how you specify the file locations if you ever plan to move your documents elsewhere. Unless, of course, you always transfer your documents and schemas in one go.
- The contents of the xsi:schemaLocation attribute very soon become a hard-to-interpret visual mess. You can hand-edit it nicely (as in the examples) but an XML pretty printer will undo all your hard work very quickly. Not much we can do about it…
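To see why relative schema locations break when a document moves, here is a small Python sketch that resolves a location hint against the document’s own URL with the standard urljoin function. The helper name resolve_location and the document URLs are invented for the example, and the sketch assumes the document’s URL is its base URI (it ignores things like xml:base).

```python
from urllib.parse import urljoin


def resolve_location(document_url, schema_location):
    """Resolve a schema location hint against the URL of the XML document.

    Hypothetical helper; absolute locations are returned unchanged.
    """
    return urljoin(document_url, schema_location)


# The same relative hint, seen from two different document locations.
hint = "../schemas/doc.xsd"
before_move = resolve_location("http://bogus.com/docs/a/doc.xml", hint)
after_move = resolve_location("http://bogus.com/archive/doc.xml", hint)

print(before_move)  # http://bogus.com/docs/schemas/doc.xsd
print(after_move)   # http://bogus.com/schemas/doc.xsd
```

The hint itself never changed, yet it points to a different schema file once the document has moved. An absolute location avoids this, at the price of tying the document to one fixed schema address.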
More information?
- Have a look at the rather incomprehensible official specification: www.w3.org/TR/xmlschema-1
- For a simple tutorial: www.w3schools.com/schema/schema_howto.asp