Combining XML Documents with XInclude

 

Oleg Tkachenko

April 2004

Applies to:
   Extensible Markup Language (XML) 1.0
   Microsoft® .NET Framework

Summary: This article explores the problem of how to construct a single XML document from multiple documents. It focuses on XML Inclusions (XInclude), a general-purpose mechanism for facilitating modularity in XML. (15 printed pages)

Download the XInclude.NET-1.2.exe sample code.

Contents

Introduction
Why XInclude?
XInclude from 50,000 feet
XInclude Syntax
XInclude Processing Model
XPointer
Practical XInclude Usage
Conclusion
Acknowledgements

Introduction

The spirit of modular programming dictates that you break your tasks up into small, manageable chunks. This dictum is also applicable when producing XML documents. It often makes sense to build a large document from several smaller ones. Some situations that call for this chunking include composing a single book out of multiple chapters, building a Web page out of separately maintained documents, or adding a standard footer such as a corporate disclaimer to a document.

There are many ways to approach solving the problem of how to construct a single XML document from multiple documents. This article focuses on one, which looks destined to be the universal general-purpose mechanism for facilitating modularity in XML: XML Inclusions (XInclude).

Why XInclude?

The first question one may ask is "Why use XInclude instead of XML external entities?" The answer is that XML external entities have a number of well-known limitations and inconvenient implications, which effectively prevent them from being a general-purpose inclusion facility. Specifically:

  • An XML external entity cannot be a full-blown independent XML document: neither a standalone XML declaration nor a Doctype declaration is allowed. That effectively means an XML external entity itself cannot include other external entities.
  • An XML external entity must be well-formed XML (not so bad at first glance, but imagine you want to include sample C# code in your XML document).
  • Failure to load an external entity is a fatal error; any recovery is strictly forbidden.
  • Only the whole external entity may be included; there is no way to include only a portion of a document.
  • External entities must be declared in a DTD or an internal subset. This opens a Pandora's Box full of implications, such as the fact that the document element must be named in the Doctype declaration and that validating readers may require the full content model of the document to be defined in the DTD.

The deficiencies of using XML external entities as an inclusion mechanism have been known for some time and in fact spawned the submission of the XML Inclusion Proposal to the W3C in 1999 by Microsoft and IBM. The proposal defined a processing model and syntax for a general-purpose XML inclusion facility.

Four years later, version 1.0 of XML Inclusions, also known as XInclude, is a Candidate Recommendation, which means that the W3C believes it has been widely reviewed and solves the basic technical problems it set out to solve, but it is not yet a full Recommendation.

So does XInclude really solve the aforementioned problems? Absolutely. Let's find out how.

XInclude from 50,000 feet

XInclude defines a general-purpose inclusion mechanism that facilitates modularity in XML documents. The inclusion process is formally defined as merging a number of XML information sets into a single composite XML Infoset. Authors specify which documents are to be merged and control the merging process via inclusion instructions. The XInclude syntax for inclusion instructions is based on familiar XML constructs that are easy to produce and process: elements, attributes, and URI references.

XInclude supports inclusion of non-XML text documents and allows authors to control the recovery process. For instance, it's possible to provide default content or an alternative document, which is included if the remote resource cannot be loaded.

XInclude also supports partial XML inclusion: by providing an XPointer pointer, it is possible to define which part(s) of an XML document ought to be included.

Here is a basic XInclude "Hello World" sample to make the picture more concrete. Say you have some Web page definitions and want all of them to include a template footer with your company's copyright information:

page.xml:

<?xml version="1.0"?>
<webpage>
<body>Hello world!</body>
   <xi:include href="templates/footer.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
</webpage>

footer.xml:

<?xml version="1.0"?>
<footer>© Contoso Corp, 2003</footer>

page.xml after XML Inclusion processing:

<?xml version="1.0"?>
<webpage>
<body>Hello world!</body>
   <footer xml:base="templates/footer.xml">© Contoso Corp, 2003</footer>
</webpage>

Figure 1. "Hello World!" XInclude sample.

Here is another introductory sample, illustrating the inclusion of external non-XML text data in an XML document. Say you have an XML document stored on your server and you want it to include a counter that shows how many times the document has been accessed:

<?xml version="1.0"?>
<catalog xmlns:xi="http://www.w3.org/2003/XInclude">
  <p>This document has been accessed
  <xi:include href="https://www.contoso.com/Counter.aspx?pid=catalog" parse="text"/> times.</p>
</catalog>

This is the document after XML Inclusion processing:

<?xml version="1.0"?>
<catalog xmlns:xi="http://www.w3.org/2003/XInclude">
  <p>This document has been accessed
  45453 times.</p>
</catalog>

XInclude Syntax

XInclude syntax is extremely simple: just two elements in the http://www.w3.org/2003/XInclude namespace, namely include and fallback. The commonly used namespace prefix is "xi" (but you are free to use any prefix you wish). For those who are more comfortable with formal syntax definitions, the XInclude spec provides an XML Schema and a DTD for XInclude. For others, here is a summary:

xi:include element

The xi:include element serves as the inclusion instruction. It defines which document to include and how. Its attributes are:

  • href: A URI reference of the document to include.

  • parse: Can have the value "xml" or "text", defining how to include the specified document, either as XML or as plain text. The default value is "xml."

  • xpointer: An XPointer identifying a portion of the XML document to include. This attribute is ignored when including as text (parse="text").

  • encoding: When including as text, this attribute provides a hint about the encoding of the included document.

    Note  Recall that in general no one can say what an arbitrary text file's encoding is. Happily, XML is not affected by this illness (there are strict rules on how to detect the encoding of an XML document), so the encoding attribute is ignored when inclusion is done as XML (parse="xml").

  • accept, accept-charset, and accept-language: These attributes may be used to aid in HTTP content negotiation. When the XInclude processor fetches a resource via the HTTP protocol, it should use the values of these attributes to set the Accept, Accept-Charset, and Accept-Language headers in the HTTP request. This facility is aimed at situations in which a single URI may return different representations of the same resource (say, raw XML or XHTML, ISO8859-1 or UTF-8 encoded, an English or a Hebrew version), chosen based on analysis of the HTTP headers.
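To make the content negotiation attributes more concrete, here is a minimal sketch of how a processor might copy them into HTTP request headers. It is not taken from any XInclude implementation: the class name, the news URL, and the attribute values are assumptions made up for illustration, and the request is only built, never sent.

```csharp
using System;
using System.Net.Http;
using System.Xml;

public class ContentNegotiationDemo {
    public static void Main() {
        // A hypothetical inclusion instruction asking for the Hebrew XHTML version.
        var doc = new XmlDocument();
        doc.LoadXml("<xi:include xmlns:xi='http://www.w3.org/2003/XInclude' " +
                    "href='https://www.contoso.com/news' " +
                    "accept='application/xhtml+xml' accept-language='he'/>");
        XmlElement include = doc.DocumentElement;

        var request = new HttpRequestMessage(HttpMethod.Get, include.GetAttribute("href"));

        // GetAttribute returns an empty string when the attribute is absent,
        // in which case the corresponding header is simply not set.
        string accept = include.GetAttribute("accept");
        if (accept.Length > 0) request.Headers.Accept.ParseAdd(accept);

        string language = include.GetAttribute("accept-language");
        if (language.Length > 0) request.Headers.AcceptLanguage.ParseAdd(language);

        string charset = include.GetAttribute("accept-charset");
        if (charset.Length > 0) request.Headers.AcceptCharset.ParseAdd(charset);

        // Shows the headers that would accompany the request.
        Console.WriteLine(request.Headers);
    }
}
```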

The xi:include element may contain at most one xi:fallback element; any other content is ignored.

xi:fallback element

The xi:fallback element provides a mechanism to recover from missing resources. When the resource to be included is unavailable for any reason (connection problems, security restrictions, the resource does not exist, the URI scheme is unknown or is not a fetchable one such as mailto:, and so forth), the content of the xi:fallback element is included instead. The following example shows how the xi:fallback element is used:

<page xmlns:xi="http://www.w3.org/2003/XInclude">
   <header>New This Week from MSDN</header>
   <xi:include href="https://msdn.microsoft.com/rss.xml">
      <xi:fallback>Sorry, MSDN news are unavailable.</xi:fallback>
   </xi:include>
</page>

If the resource is missing and the xi:fallback element is empty, nothing is included. The xi:fallback element has no attributes, must be a direct child of the xi:include element, and has unrestricted content. Since there are no restrictions on the contents of an xi:fallback element, it can in turn contain another xi:include instruction, providing an alternative recovery resource to include:

<page xmlns:xi="http://www.w3.org/2003/XInclude">
   <header>New This Week from MSDN</header>
   <xi:include href="https://msdn.microsoft.com/rss.xml">
      <xi:fallback>
         <xi:include href="https://msdn.microsoft.com/xml/rss.xml">
            <xi:fallback>Sorry, MSDN news are unavailable.</xi:fallback>
         </xi:include>
      </xi:fallback>
   </xi:include>
</page>

Preserving Base URI

Attentive readers may have noticed the xml:base attribute appearing in the result documents. This important feature prevents relative URI references in the included document from breaking when the document is included into another one, possibly in a different location. Consider the following example.

The parts.xml document in the C:\Contoso\Inventory directory refers to the XML Schema parts.xsd in the same directory, using the relative URI reference:

<parts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="parts.xsd">
    <part SKU="10023" quantity="100"/>
</parts>

The report.xml document in the C:\Contoso\Reports directory includes parts.xml:

<report>
   <inventory>
      <xi:include href="../Inventory/parts.xml" 
         xmlns:xi="http://www.w3.org/2003/XInclude"/>
   </inventory>
</report>

After XML Inclusion processing, the report.xml document looks like this:

<report>
    <inventory>
        <parts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
               xsi:noNamespaceSchemaLocation="parts.xsd" 
               xml:base="../Inventory/parts.xml">
              <part SKU="10023" quantity="100" />
        </parts>
    </inventory>
</report>

Note how the xml:base attribute preserves the base URI of the included document, effectively keeping the relative URI reference in the xsi:noNamespaceSchemaLocation attribute pointing to the same schema file in the C:\Contoso\Inventory directory.
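The effect of xml:base can be checked with plain URI arithmetic. The following sketch uses only System.Uri; the file: URIs are my rendering of the directories in the example above, and the class name is hypothetical.

```csharp
using System;

public class BaseUriDemo {
    public static void Main() {
        // Base URI of the including document, report.xml.
        Uri reportUri = new Uri("file:///C:/Contoso/Reports/report.xml");

        // Without xml:base, "parts.xsd" would resolve against report.xml.
        Uri broken = new Uri(reportUri, "parts.xsd");
        Console.WriteLine(broken.AbsoluteUri);
        // file:///C:/Contoso/Reports/parts.xsd

        // With xml:base="../Inventory/parts.xml" on the included element,
        // the element's base URI is resolved first...
        Uri partsBase = new Uri(reportUri, "../Inventory/parts.xml");
        // ...and the schema reference is then resolved against it.
        Uri preserved = new Uri(partsBase, "parts.xsd");
        Console.WriteLine(preserved.AbsoluteUri);
        // file:///C:/Contoso/Inventory/parts.xsd
    }
}
```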

XInclude Processing Model

Technically speaking, XInclude defines inclusion as a special type of XML Infoset transformation. The source infoset is transformed to the result infoset in which each xi:include element is replaced with the infoset it refers to. This process can also be thought of as an infoset merge.

XML Inclusion is a recursive process, so each xi:include element in an included document is processed too. To keep this safe, circular inclusions are detected and treated as fatal errors. What is a fatal error? XInclude defines two types of errors that may occur during the inclusion process: a "fatal error," which, as you might guess, refers to the presence of factors that prevent normal processing from continuing; and a "resource error," which refers to a failed attempt to fetch a resource. Fatal errors are deadly, meaning that processing must be stopped, while resource errors are subject to fallback processing.
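The recursion, the loop check, and the two kinds of errors can be sketched in a few lines on top of XmlDocument. This is a deliberately simplified illustration, not how any real processor is written. The in-memory Files dictionary and all names are assumptions standing in for fetchable URIs; no xml:base attribute is added; and xi:include elements nested inside an xi:fallback are not expanded.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public class InclusionModelDemo {
    const string XiNs = "http://www.w3.org/2003/XInclude";

    // Hypothetical in-memory resources standing in for fetchable URIs.
    static readonly Dictionary<string, string> Files = new Dictionary<string, string> {
        { "a.xml", "<a xmlns:xi='" + XiNs + "'><xi:include href='b.xml'/>" +
                   "<xi:include href='missing.xml'><xi:fallback>n/a</xi:fallback>" +
                   "</xi:include></a>" },
        { "b.xml", "<b/>" },
        { "loop.xml", "<l xmlns:xi='" + XiNs + "'><xi:include href='loop.xml'/></l>" }
    };

    // 'active' holds the resources currently being included, outermost first.
    static XmlDocument Load(string href, Stack<string> active) {
        // Fatal error: the resource is already being included further up the chain.
        if (active.Contains(href))
            throw new InvalidOperationException("Circular inclusion: " + href);
        // Resource error: the resource cannot be fetched.
        string text;
        if (!Files.TryGetValue(href, out text))
            throw new FileNotFoundException(href);

        var doc = new XmlDocument();
        doc.LoadXml(text); // an XmlException here (not well-formed) is fatal too
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("xi", XiNs);

        // Snapshot the instructions first, because the tree is modified below.
        var includes = new List<XmlElement>();
        foreach (XmlElement e in doc.SelectNodes("//xi:include[not(ancestor::xi:fallback)]", ns))
            includes.Add(e);

        active.Push(href);
        foreach (XmlElement inc in includes) {
            XmlNode replacement;
            try {
                // Recursive step: the included document is processed as well.
                XmlDocument child = Load(inc.GetAttribute("href"), active);
                replacement = doc.ImportNode(child.DocumentElement, true);
            } catch (FileNotFoundException) {
                // Resource errors are subject to fallback processing.
                XmlNode fallback = inc.SelectSingleNode("xi:fallback", ns);
                if (fallback == null)
                    throw new InvalidOperationException("Resource error without fallback");
                replacement = doc.CreateDocumentFragment();
                while (fallback.HasChildNodes)
                    replacement.AppendChild(fallback.FirstChild);
            }
            inc.ParentNode.ReplaceChild(replacement, inc);
        }
        active.Pop();
        return doc;
    }

    public static void Main() {
        // Prints the merged document: b.xml included, fallback text used for missing.xml.
        Console.WriteLine(Load("a.xml", new Stack<string>()).OuterXml);
        try { Load("loop.xml", new Stack<string>()); }
        catch (InvalidOperationException e) { Console.WriteLine("Fatal error: " + e.Message); }
    }
}
```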

Now, how is the xi:include element processed? First of all, it depends on the parse attribute value. When parse="xml" (which is the default, by the way), the referenced document is fetched, parsed as XML, and substituted for the xi:include element (along with its descendants) in the source infoset. As you might guess, an attempt to include a document that is not well-formed XML results in a fatal error, as well-formedness is the true Holy Grail of XML.

When parse="text", the referenced document is fetched and treated as plain text, replacing the xi:include element in the same way. This is an extremely useful feature, which for instance allows including source code or XML documents as text. The following example shows the inclusion of an XML document in another as text:

<doc xmlns:xi="http://www.w3.org/2003/XInclude">
   For instance, consider the following SOAP message request to the Contoso Delivery Web Service: 
   <xi:include href="contoso-soap.xml" parse="text"/>
</doc>

After XML Inclusion, the resulting document looks like the following:

<doc xmlns:xi="http://www.w3.org/2003/XInclude">
   For instance, consider the following SOAP message request to the Contoso Delivery Web Service: 
&lt;?xml version="1.0" encoding="utf-8"?>
&lt;soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  &lt;soap:Body>
    &lt;Delivery xmlns="https://www.contoso.com">
      &lt;address>
        &lt;Street>One Microsoft Way&lt;/Street>
        &lt;City>Redmond&lt;/City>
        &lt;Zip>98052&lt;/Zip>
      &lt;/address>
    &lt;/Delivery>
  &lt;/soap:Body>
&lt;/soap:Envelope>
</doc>

Note that the left angle bracket characters get escaped, because the whole contoso-soap.xml document (despite being XML) is included as a single text node.
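The escaping is not specific to XInclude; it is simply what happens whenever markup characters end up in a text node. A small sketch with System.Xml (the class name is hypothetical) shows the same effect:

```csharp
using System;
using System.Xml;

public class TextNodeDemo {
    public static void Main() {
        var doc = new XmlDocument();
        doc.LoadXml("<doc/>");

        // The markup is added as character data, not parsed as elements.
        doc.DocumentElement.AppendChild(doc.CreateTextNode("<soap:Body>"));

        Console.WriteLine(doc.DocumentElement.ChildNodes.Count);  // 1 (a single text node)
        Console.WriteLine(doc.OuterXml);  // <doc>&lt;soap:Body&gt;</doc>
    }
}
```

The .NET serializer also escapes the right angle bracket, which is allowed but not required; the listing above leaves it as is.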

Text inclusion is also useful when you want to include an external text document during an XSL Transformation. While XSLT 1.0 has the document() function, that function only opens external XML documents, not raw text ones. So, for example, to include a CSS file in the generated HTML document, a single xi:include instruction is enough:

<xsl:template match="page">
   <html>
      <head>
         <style type="text/css">
            <xi:include href="main.css" parse="text"/>
         </style>
         ...

One still cannot include binary resources this way, for obvious reasons. An attempt to include characters not permitted by the XML 1.0 Recommendation results in a fatal error.

XPointer

So far, so good. The examples I've shown illustrate how to include an entire document, but what if you need to include only a part of a document, say all books of a particular author from a book catalog? Enter XPointer.

XInclude facilitates partial inclusion using XPointer pointers in the xpointer attribute:

<para>
My favorite books are:
<xi:include href="books.xml" xpointer="xpointer(//book[@author='Stephen King'])"/>
</para>

The instruction above includes all book elements whose author attribute value is "Stephen King" from the books.xml document. The expression in parentheses should look very familiar to you. Is it XPath? Not exactly, but very close. It's an XPointer of a particular type or, more exactly, a pointer conforming to a particular XPointer scheme, xpointer(), which is an extension of XPath. Now let's see, step by step, what manner of beast XPointer is.

XPointer stands for XML Pointer Language, an extensible system for XML addressing. The cornerstone of XPointer is the XPointer Framework, which defines semantics and basic syntax for fragment identifiers. The XPointer Framework defines two types of pointers: shorthand and scheme-based ones.

A shorthand pointer (formerly known as a barename) is a rough analog of a fragment identifier in HTML, which points to a named anchor. The XPointer shorthand pointer instead identifies at most one element in an XML document by its ID. The ID can be determined by a DTD, by an XML Schema, or in an application-specific way.

<xi:include href="books.xml" xpointer="bk101"/>

The bk101 value in the xpointer attribute is a shorthand XPointer pointer, which identifies the element in the books.xml document whose ID is "bk101."

A scheme-based pointer is a more sophisticated type of pointer, which consists of a sequence of pointer parts, optionally separated by white space. Each pointer part has a scheme name and some scheme-specific data in parentheses, such as element(bk101) or xpointer(//book):

<xi:include href="books.xml" xpointer="element(bk101)xpointer(//*[@title='Dreamcatcher'])"/>

In the previous inclusion instruction, the xpointer attribute value is a scheme-based XPointer pointer that identifies the element in the books.xml document whose ID is "bk101" or, if no such element can be found, the elements whose title attribute value is "Dreamcatcher."

Figure 2. Scheme-based XPointer pointer structure

Pointer parts are evaluated in left-to-right order until a part identifies some subresource. Unrecognized/unsupported parts are ignored. Note, however, that if no pointer part in the whole pointer identifies subresources, it is an error. The same goes for the shorthand pointer too.

As can be seen, scheme-based pointers make fragment identification more reliable by allowing one to specify several pointer parts, each of which addresses the desired subresource in a different way. If a pointer part fails to identify the subresource, the next one is evaluated in turn, in effect providing a kind of fallback addressing.
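The left-to-right evaluation of pointer parts can be sketched as follows. This is an illustration under loud assumptions, not a conformant XPointer processor: element() is handled only in its bare-ID form and IDs are looked up through an attribute named id (an application-specific choice), xpointer() data is treated as plain XPath, and every other scheme is skipped as unsupported. All class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public class PointerPartsDemo {
    // Splits "scheme(data)scheme(data)" into (scheme, data) pairs,
    // tracking nested parentheses inside the scheme data.
    static List<KeyValuePair<string, string>> ParseParts(string pointer) {
        var parts = new List<KeyValuePair<string, string>>();
        int i = 0;
        while (i < pointer.Length) {
            if (char.IsWhiteSpace(pointer[i])) { i++; continue; }
            int open = pointer.IndexOf('(', i);
            if (open < 0) throw new FormatException("Not a scheme-based pointer.");
            string scheme = pointer.Substring(i, open - i);
            int depth = 1, j = open + 1;
            while (j < pointer.Length && depth > 0) {
                if (pointer[j] == '(') depth++;
                else if (pointer[j] == ')') depth--;
                j++;
            }
            if (depth != 0) throw new FormatException("Unbalanced parentheses.");
            string data = pointer.Substring(open + 1, j - open - 2);
            parts.Add(new KeyValuePair<string, string>(scheme, data));
            i = j;
        }
        return parts;
    }

    // Returns the nodes identified by the first pointer part that selects anything.
    static XmlNodeList Evaluate(XmlDocument doc, string pointer) {
        foreach (KeyValuePair<string, string> part in ParseParts(pointer)) {
            string xpath;
            if (part.Key == "element") xpath = "//*[@id='" + part.Value + "']"; // assumption: bare ID only
            else if (part.Key == "xpointer") xpath = part.Value;                // XPath subset only
            else continue;                                                      // unsupported scheme: ignore
            XmlNodeList nodes = doc.SelectNodes(xpath);
            if (nodes.Count > 0) return nodes;
        }
        throw new InvalidOperationException("No pointer part identified a subresource.");
    }

    public static void Main() {
        var doc = new XmlDocument();
        doc.LoadXml("<books><book id='bk102' title='Dreamcatcher'/></books>");
        // There is no element with ID bk101, so the second part is evaluated.
        XmlNodeList hit = Evaluate(doc, "element(bk101)xpointer(//*[@title='Dreamcatcher'])");
        Console.WriteLine(hit[0].Attributes["id"].Value); // bk102
    }
}
```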

XPointer Schemes

There are three XPointer schemes defined by the W3C, namely element(), xmlns(), and xpointer(). Additionally, Simon St.Laurent has proposed the xpath1() scheme, which uses regular XPath 1.0 syntax, as in xpath1(//book).

The element() scheme facilitates basic addressing of XML elements by ID (just like a shorthand pointer) and by positional number among sibling elements. For instance, the element(/1/3) pointer part identifies the third child element of the root element, and element(bk101/2) identifies the second child element of the element with ID "bk101."

The xmlns() scheme represents an auxiliary pointer part, which never identifies any subresources on its own but defines the namespace binding context for the pointer parts that follow it (recall that you can combine as many pointer parts as you want). For instance, the following XPointer pointer identifies all book elements in the document that belong to the "https://www.contoso.com" namespace:

xmlns(co=https://www.contoso.com)xpointer(//co:book)
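In System.Xml terms, the binding established by an xmlns() part plays roughly the role of an XmlNamespaceManager passed to an XPath query. A minimal sketch (the sample catalog and class name are hypothetical):

```csharp
using System;
using System.Xml;

public class XmlnsSchemeDemo {
    public static void Main() {
        var doc = new XmlDocument();
        // The document itself uses the prefix "c" for the Contoso namespace.
        doc.LoadXml(
            "<catalog xmlns:c='https://www.contoso.com'>" +
            "<c:book/><book/><c:book/></catalog>");

        // Counterpart of the xmlns(co=https://www.contoso.com) pointer part.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("co", "https://www.contoso.com");

        // Counterpart of the xpointer(//co:book) pointer part.
        XmlNodeList books = doc.SelectNodes("//co:book", ns);
        Console.WriteLine(books.Count); // 2
    }
}
```

Note that the prefix in the pointer need not match the prefix used in the document; only the namespace name matters.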

The xpointer() scheme is the most powerful. It provides a high level of functionality for addressing portions of XML documents. It is based on XPath 1.0 and extends XPath to allow addressing of strings, points, and ranges. Note that the xpointer() scheme is still a work in progress and is most likely subject to further changes.

Now back to XInclude. The XInclude specification insists that any conformant XInclude processor must support the XPointer Framework and the element() scheme. That means in effect that you can safely use only shorthand and element() scheme-based pointers in your documents, but given that these are the early days of XInclude, it's always a good idea to consult your XInclude processor's documentation.

We are done with the theoretical part. Now let's talk about practical programming.

Practical XInclude Usage

The first thing we need is XInclude-aware plumbing. Don't expect your favorite XML parser to support XInclude automatically. For instance, neither MSXML nor System.Xml supports XInclude, unfortunately. The W3C lists available implementations of XInclude in the XInclude Implementations Report. Some of them are in fact experimental, but several are of production quality.

Now I am going to focus on my favorite XInclude implementation, XInclude.NET, probably because I am the lead developer of that project.

XInclude for .NET - XInclude.NET

The XInclude.NET project was inspired by Chris Lovett's article XInclude, Anyone?, which I discovered while working on a similar problem of composing website pages from documents developed separately by different teams. Having the power of the .NET Framework at my fingertips made it feasible to build a robust, highly performant XInclude processor on top of the System.Xml API. Version 1.2 of the XInclude.NET library was released recently, and now users can decide whether this goal has been attained.

The XInclude.NET project is an implementation of the XInclude 1.0 Last Call Working Draft of November 10, 2003 (latest version at the time of writing), written in C# for the .NET Framework. In addition it supports the XPointer Framework, element(), xmlns(), xpath1() and xpointer() (XPath subset only) schemes. The code sample attached to this article contains the XInclude.NET 1.2 release, which provides a precompiled XInclude.dll assembly, API documentation, samples, and the source code of the XInclude implementation.

The key class within XInclude.NET is the XIncludingReader, found in the GotDotNet.XInclude namespace. The primary design goal was to build a pluggable, streaming pipeline for XML processing. To meet that goal, XIncludingReader is implemented as an XmlReader, which can be wrapped around another XmlReader. This architecture allows easy plugging of an XInclude processing layer into a variety of applications without any major modifications. For instance, to enable XInclude processing of XML while it's being loaded into an XmlDocument, just wrap an XmlTextReader with the XIncludingReader:

using System.Xml;
using GotDotNet.XInclude;

public class Class1 {
    public static void Main(string[] args) {
        XmlTextReader r = new XmlTextReader("document.xml");
        XIncludingReader xir = new XIncludingReader(r);
        XmlDocument doc = new XmlDocument();
        doc.Load(xir);          
        //...
    }
}

XML Inclusion before performing an XSL Transformation is similarly straightforward:

using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;
using System.IO;
using GotDotNet.XInclude;

public class Class1 {
    public static void Main(string[] args) {        
        XIncludingReader xir = new XIncludingReader("document.xml");
        XPathDocument doc = new XPathDocument(xir);
        XslTransform xslt = new XslTransform();
        xslt.Load("stylesheet.xslt");
        StreamWriter sw = new StreamWriter("result.html");
        xslt.Transform(doc, null, sw);
        sw.Close();        
    }
}

The CSS inclusion sample shown earlier assumes that the XSLT stylesheet itself is processed by the XInclude processor. Here is how that can be done:

using System.Xml.XPath;
using System.Xml.Xsl;
using System.IO;
using GotDotNet.XInclude;

public class Class1 {
    public static void Main(string[] args) {                
        XPathDocument doc = new XPathDocument("document.xml");
        XslTransform xslt = new XslTransform();
        XIncludingReader xir = new XIncludingReader("stylesheet.xslt");
        xslt.Load(xir);
        StreamWriter sw = new StreamWriter("result.html");
        xslt.Transform(doc, null, sw);
        sw.Close();        
    }
}

The XML Inclusion process is orthogonal to XML parsing, validation, and transformation. That effectively means it's up to you when to let XML Inclusion happen: after parsing but before validation, after validation but before transformation, or even after transformation.

Note  Preserving the base URI during XML Inclusion implies the appearance of an xml:base attribute in the resulting document. In the "pre-validation inclusion" scenario, that means an XML schema author should expect the xml:base attribute on the top-level included element. In fact, the xml:base attribute is core standard XML, so in such a case it makes sense for the schema to allow the xml:base attribute on every element.

How XIncludingReader Works

Note  A familiarity with the XmlReader architecture is required. Otherwise you can skip this section.

The XIncludingReader implementation is based on the technique of XmlReader chaining, described in the Customized XML Reader Creation section of the .NET Framework Developer's Guide. This technique, quite similar to old-school SAX filtering, allows chaining of XmlReaders by delegating the calls, effectively providing a highly performant way to filter or modify XML "on the fly," while it is being read.

Figure 3. A chain of XmlReaders

XIncludingReader always works on top of another reader, the "parent" reader in SAX terms, which provides the input XML stream to the XIncludingReader. When the XIncludingReader's input is not an XmlReader (for example, a System.IO.Stream or System.IO.TextReader), an interim XmlTextReader object is instantiated to act as the parent reader. The resulting XML, with XML Inclusion done, can be read by the client application from the XIncludingReader itself; remember, it is just an XmlReader.

Most of the time XIncludingReader does nothing; it merely delegates method calls to the parent reader and transparently exposes the parent reader's properties, such as Name or Value, as its own. What it always does, though, is watch the names of the XML elements the parent reader is reporting. Whenever the parent reader gets positioned on the xi:include element, XIncludingReader wakes up and starts the inclusion process.

Instead of exposing the xi:include element to the client application, XIncludingReader pushes the parent reader onto a stack and instantiates a new temporary XmlReader to read the document that the xi:include instruction refers to. The actual type of this interim reader depends on the instruction: if the xpointer attribute is present, it's a specialized XPointerReader, which implements XPointer and reads only the nodes identified by the pointer; if the inclusion is done in text mode, it's a specialized TextIncludingReader, which reads the whole document as a single text node; otherwise, it acts just like an XmlTextReader.

Either way, the old parent reader sits on the stack and the new reader becomes the parent reader. Reading then continues as usual until no more input can be read from the parent reader. At that point the inclusion is finished, and the previous parent reader is popped off the stack.
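The push and pop mechanics can be imitated with stock System.Xml classes. The sketch below is not the XIncludingReader source and does not implement the XmlReader interface; it is a simple loop that copies nodes to an XmlWriter and switches readers when it meets xi:include. The in-memory files dictionary and all names are hypothetical, and fallback processing, xml:base, xpointer, text mode, and loop detection are all left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public class ReaderStackDemo {
    const string XiNs = "http://www.w3.org/2003/XInclude";

    public static void Main() {
        // Hypothetical in-memory documents standing in for real URIs.
        var files = new Dictionary<string, string>();
        files["page.xml"] =
            "<webpage xmlns:xi='" + XiNs + "'><body>Hello world!</body>" +
            "<xi:include href='footer.xml'/></webpage>";
        files["footer.xml"] = "<footer>Contoso Corp, 2003</footer>";

        var output = new StringBuilder();
        var settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        var parents = new Stack<XmlReader>();
        XmlReader current = XmlReader.Create(new StringReader(files["page.xml"]));
        using (XmlWriter writer = XmlWriter.Create(output, settings)) {
            while (true) {
                if (!current.Read()) {
                    // Included document exhausted: resume the previous parent reader.
                    if (parents.Count == 0) break;
                    current = parents.Pop();
                    continue;
                }
                switch (current.NodeType) {
                    case XmlNodeType.Element:
                        if (current.LocalName == "include" && current.NamespaceURI == XiNs) {
                            string href = current.GetAttribute("href");
                            // Skip any children of xi:include (fallback is not handled here).
                            if (!current.IsEmptyElement) {
                                int depth = current.Depth;
                                while (current.Read() &&
                                       !(current.NodeType == XmlNodeType.EndElement &&
                                         current.Depth == depth)) { }
                            }
                            // Park the parent and read from the included document instead.
                            parents.Push(current);
                            current = XmlReader.Create(new StringReader(files[href]));
                        } else {
                            writer.WriteStartElement(current.Prefix, current.LocalName,
                                                     current.NamespaceURI);
                            writer.WriteAttributes(current, false);
                            if (current.IsEmptyElement) writer.WriteEndElement();
                        }
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteFullEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(current.Value);
                        break;
                    // Other node types are ignored in this sketch.
                }
            }
        }
        // Prints page.xml with the footer element in place of xi:include.
        Console.WriteLine(output);
    }
}
```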

While that sounds quite simple, bear in mind that it's only the tip of the iceberg. Hidden parts, which are out of this article's scope, include the implementation of fallback processing and the quite nontrivial exposing of a synthetic xml:base attribute in the XmlReader API (Martin Gudgin has more about it in his blog). Just dive into the source code if you want to know more. Also, feel free to ask any XInclude.NET-related questions at the project's message board and file bugs to the bug tracker.

Such a streaming, forward-only, non-caching implementation results in an extremely low memory footprint and almost no performance penalty. All that XIncludingReader does most of the time is compare the names of elements, watching for the xi:include element. This is an extremely cheap operation, provided that the comparison is done against names atomized in the XmlNameTable (so in 99 percent of cases it is just a comparison of two object pointers).
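The name-table trick is a general System.Xml idiom, so it can be shown without XInclude.NET. In this sketch (class name and sample document are hypothetical) the local name is compared by object reference after being atomized once; the namespace check uses ordinary string equality, which itself returns immediately when both references are the same.

```csharp
using System;
using System.IO;
using System.Xml;

public class NameTableDemo {
    public static void Main() {
        const string XiNs = "http://www.w3.org/2003/XInclude";
        string xml = "<page xmlns:xi='" + XiNs + "'>" +
                     "<header/><xi:include href='a.xml'/></page>";
        XmlReader reader = XmlReader.Create(new StringReader(xml));

        // Atomize the name once; the reader reports names from the same table.
        object includeName = reader.NameTable.Add("include");

        int found = 0;
        while (reader.Read()) {
            if (reader.NodeType == XmlNodeType.Element &&
                ReferenceEquals(reader.LocalName, includeName) &&  // pointer comparison
                reader.NamespaceURI == XiNs) {
                found++;
            }
        }
        Console.WriteLine(found); // 1
    }
}
```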

Unfortunately, processing of XPointer pointers is not so cheap. In fact, it requires that the whole included document be loaded into memory just to select required parts by ID, by XPath selection path, or by xpointer() pointer. As a matter of interest, evaluation of XPointer pointers is internally implemented by translation of XPointer syntax to XPath expressions as follows:

  1. The shorthand pointer is translated to the id() function call: bk101 to id('bk101').
  2. The element() scheme-based pointer is translated to an XPath selection path containing the id() function call and/or positional predicates: element(bk101/3) to id('bk101')/*[3].
  3. The xpointer() scheme-based pointer is used directly as an XPath expression. That's why XInclude.NET doesn't support the xpointer() superset of XPath.
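The first two translations are mechanical enough to sketch. The code below is my reconstruction from the description above, not the XInclude.NET source; the class and method names are hypothetical, and no validation of the pointer syntax is attempted.

```csharp
using System;
using System.Text;

public class PointerToXPath {
    // A shorthand pointer is just an ID.
    static string ShorthandToXPath(string name) {
        return "id('" + name + "')";
    }

    // Translates the data of an element() pointer part, such as "bk101/3" or "/1/3".
    static string ElementSchemeToXPath(string data) {
        string[] steps = data.Split('/');
        var xpath = new StringBuilder();
        // A leading name is an ID; an empty first step means the path starts at the root.
        if (steps[0].Length > 0) xpath.Append("id('" + steps[0] + "')");
        // Each remaining step is the position of a child element.
        for (int i = 1; i < steps.Length; i++)
            xpath.Append("/*[" + int.Parse(steps[i]) + "]");
        return xpath.ToString();
    }

    public static void Main() {
        Console.WriteLine(ShorthandToXPath("bk101"));        // id('bk101')
        Console.WriteLine(ElementSchemeToXPath("bk101/3"));  // id('bk101')/*[3]
        Console.WriteLine(ElementSchemeToXPath("/1/3"));     // /*[1]/*[3]
    }
}
```

Keep in mind that the XPath id() function only finds elements whose ID attributes are declared as such, for example in a DTD.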

Future versions of XInclude.NET may provide a more complete and efficient XPointer implementation.

Conclusion

XInclude is an emerging W3C standard designed to facilitate modularity in XML. It defines a processing model and syntax for general-purpose inclusion or merging of XML documents. It is independent of XML parsing and validation, and its syntax is based on XML elements, attributes, and URI references.

Although XInclude is not yet a W3C Recommendation, the number of implementations, as well as its overall appreciation in the XML community, is constantly growing. In this article, we have explored an introduction to XInclude and XPointer. We also covered a specific implementation of XInclude for the .NET Framework.

Acknowledgements

I would like to thank Dare Obasanjo for all his help in preparing this article.