Resolving the Unknown: Building Custom XmlResolvers in the .NET Framework

 

Mark Fussell
Microsoft Corporation

September 2004

Applies to:
   XML
   Microsoft .NET Framework
   Microsoft SQL Server

Summary: This article by Mark Fussell gets under the hood of the XmlResolver class in System.Xml and describes how to create your own XmlResolver implementations that allow you to retrieve XML documents from alternative data sources, such as embedded assembly resources or a database, using custom-defined schemes. (25 printed pages)


Contents

Introduction
Under the Hood of the XmlResolver
A Custom XmlUrlResolver Implementation
A Dynamic XmlResolver Implementation
Conclusion

Introduction

There are three great pillars of abstraction in System.Xml in Version 1.1 of the Microsoft .NET Framework for working with XML. The XmlReader class provides an abstraction for streaming-based read, the XmlWriter class provides an abstraction for streaming-based write, and the XPathNavigator provides an abstraction for cursor-style, read-only random access. However, there is a lesser-used fourth abstract class called the XmlResolver, which, if you develop XML-aware applications for long enough in .NET, will eventually come to your rescue!

The XmlResolver class is described in the MSDN documentation as, "resolves external XML resources named by a URI." However, a more complete definition would be as follows:

XmlResolvers resolve external data sources that are identified by a URI and then retrieve the data identified by that URI, thereby providing an abstraction for accessing data sources such as files on a file system.

In V1.1 of the .NET Framework there are two XmlResolver implementations. The XmlUrlResolver resolves data sources using the file:// and http:// protocols or any other protocols supported by the System.Net.WebRequest class. The XmlUrlResolver is the default XmlResolver for all classes in the System.Xml namespace and is used to resolve external XML resources such as entities, DTDs, or schemas. It is also used to process include and import elements found in Extensible Stylesheet Language (XSL) stylesheets or XML Schema Definition language (XSD) schemas. Since the XmlUrlResolver is the default XmlResolver for components such as the XmlTextReader, you do not typically have to instantiate one in order to load XML documents.

The XmlSecureResolver was introduced in V1.1 of the .NET Framework and becomes useful once you realize that processing and parsing XML on any platform is open to several malicious, typically luring, attacks. The W3C XML 1.0 specification was written with little or no regard to security considerations back in the heady pre-security days of 1998. As its name implies, the XmlSecureResolver secures another implementation of an XmlResolver by wrapping the supplied XmlResolver and restricting the resources to which it has access. For instance, the XmlSecureResolver has the ability to prohibit access to particular internet sites or zones. When you construct an XmlSecureResolver, you provide an XmlResolver implementation along with either a URL, some System.Security.Policy.Evidence, or a System.Security.PermissionSet, which is used by the XmlSecureResolver to determine security access. Either a PermissionSet is generated or the supplied one is used, and PermitOnly is called on it to secure the access of the underlying XmlResolver.

The example below illustrates getting a FileIOPermission to the root of the C: drive, and then denying access to this permission even though the application could be fully trusted. By supplying this permission set to the XmlSecureResolver, this permission set is enforced whenever any data source is accessed via the XmlTextReader class. In this case an attempt to read the file c:\temp\tmp.xml results in a security exception being thrown.

using System;
using System.IO;
using System.Net;
using System.Xml;
using System.Security;
using System.Security.Permissions;
using System.Security.Policy;

public class Test
{
    public static void Main()
    {
        try
        {
            // Create file permission to the c: drive
            FileIOPermission fp = new FileIOPermission(FileIOPermissionAccess.Read, "c:\\");
            // and then deny the user this FileIO permission
            fp.Deny();
            // and add the file permission to a permission set
            PermissionSet ps = new PermissionSet(PermissionState.None);
            ps.AddPermission(fp);

            // now try to access a file tmp.xml on the c: drive
            XmlTextReader reader = new XmlTextReader(@"c:\temp\tmp.xml");
            // and secure access through the XmlSecureResolver.
            reader.XmlResolver = new XmlSecureResolver(new XmlUrlResolver(), ps);

            // Access the data source
            while (reader.Read()) ;
            Console.WriteLine("Successful Read Access");
        }
        catch (Exception e)
        {
            Console.WriteLine("Unsuccessful Read Access");
            Console.WriteLine(e);
        }
    }
}

When the above code is run the following exception occurs.

Unsuccessful Read Access
System.Security.SecurityException: Request for the permission of type
 'System.Security.Permissions.FileIOPermission, mscorlib, 
Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089' failed.

The action that failed was:
Demand

The type of the first permission that failed was:
System.Security.Permissions.FileIOPermission

The first permission that failed was:
<IPermission class="System.Security.Permissions.FileIOPermission,
 mscorlib, Version=2.0.3600.0, Culture=neutral, 
PublicKeyToken=b77a5c561934e089" version="1"
Read="c:\temp\tmp.xml"/>

The demand was for:
<IPermission class="System.Security.Permissions.FileIOPermission,
 mscorlib, Version=2.0.3600.0, Culture=neutral,
 PublicKeyToken=b77a5c561934e089" version="1"
Read="c:\temp\tmp.xml"/>

The granted set of the failing assembly was:
<PermissionSet class="System.Security.PermissionSet" version="1" Unrestricted="true">
<IPermission class="System.Security.Permissions.ZoneIdentityPermission,
 mscorlib, Version=2.0.3600.0, Culture=neutral, 
PublicKeyToken=b77a5c561934e089" version="1" Zone="MyComputer"/> 
</PermissionSet>

The refused set of the failing assembly was:
<PermissionSet class="System.Security.PermissionSet" version="1">
<IPermission class="System.Security.Permissions.FileIOPermission,
 mscorlib, Version=2.0.3600.0, Culture=neutral, 
PublicKeyToken=b77a5c561934e089" version="1"
Read="c:\"/>
</PermissionSet>

The method that caused the failure was:
Void OpenUrl()

Which is the long way of saying that you cannot access the file c:\temp\tmp.xml since you do not have FileIOPermission.

Under the Hood of the XmlResolver

The XmlResolver abstract class has a small API, consisting of two methods called ResolveUri() and GetEntity() along with a Credentials property. The design of all XmlResolvers is that the ResolveUri method is called to return an instance of the System.Uri class, and the GetEntity method returns a stream of data from the resolved URI that can then be parsed as XML. This two-stage process provides the abstraction through which a class such as the XmlReader can request XML from data sources other than, say, just the file system, and implement schemes other than those supported in the .NET Framework.

Let's look at the XmlUrlResolver as an example of how it operates; this is best illustrated through a Data Flow Diagram (DFD), which shows the interaction of method calls in a numerical sequence.


Figure 1. Data Flow Diagram for the XmlUrlResolver Class

The design pattern for a component using the XmlResolver is to first call the ResolveUri method and then, with this resolved URI, retrieve the XML data source as a stream through the GetEntity method. In this DFD the XmlUrlResolver differentiates between the file:// and http:// schemes based on the URI that is supplied to the GetEntity method, and returns either a file stream or a stream from the System.Net.WebResponse class. DFDs like this are useful for understanding the flow of data and the order of method calls in a system or component. To implement a custom XmlResolver, you simply decide on the scheme to use and then implement the ResolveUri and GetEntity methods to support the scheme's syntax.
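The two-stage pattern can be illustrated with a deliberately tiny resolver. This is only a sketch and not part of the article's sample code: the InMemoryResolver class and the mem:// scheme are hypothetical names chosen for illustration, and the resolver simply returns one fixed XML string for any mem:// URI.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

// Hypothetical resolver: serves one fixed XML string for any mem:// URI.
public class InMemoryResolver : XmlResolver
{
    private readonly string xml;

    public InMemoryResolver(string xml)
    {
        this.xml = xml;
    }

    // In-memory data needs no credentials, so the setter is ignored.
    public override ICredentials Credentials
    {
        set { }
    }

    // Stage 1: turn the (possibly relative) URI into an absolute System.Uri.
    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        return baseUri == null ? new Uri(relativeUri) : new Uri(baseUri, relativeUri);
    }

    // Stage 2: return a stream of data for the resolved URI.
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (absoluteUri.Scheme != "mem")
            throw new NotSupportedException("Unsupported scheme: " + absoluteUri.Scheme);
        return new MemoryStream(Encoding.UTF8.GetBytes(xml));
    }
}
```

A caller would first pass a string such as mem://doc to ResolveUri and then hand the resulting Uri to GetEntity, which is the same sequence of calls shown in the DFD.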

A Custom XmlUrlResolver Implementation

With our understanding of how XmlResolvers work we are going to implement a custom XmlResolver that understands two additional schemes. The first scheme is the res:// scheme, which allows you to retrieve embedded XML documents from a named CLR assembly. The second scheme is the db:// scheme, which allows you to retrieve XML documents from a named column in a named table from a SQL Server database.

Accessing XML Embedded in an Assembly

Adding XML documents such as XML schemas and XSL stylesheets as embedded resources in an assembly is a great approach when distributing a project, as there are fewer files to copy on installation and you don't need to have different directories to keep data files. Figure 2 shows the books.xml document as an embedded resource in the XmlResolvers project, which is achieved by selecting the Build Action property and choosing the Embedded Resource option in the drop-down list.

Figure 2. The books.xml document embedded as a resource in a project

The same is also done for the XML schema, books.xsd. Now we have to determine a scheme for retrieving these embedded assembly resources. The scheme we are going to create is the res:// scheme; it has the following syntax.

res://<assemblyName>?<resourceName>

For example, to retrieve the books.xml document from the XmlResolvers assembly we would specify the following URI.

res://XmlResolvers?books.xml

Or to retrieve the books.xsd document from the FileStore assembly, which may be a separate assembly from the project assembly, we would specify the following URI.

res://FileStore?books.xsd

Now that we have embedded resources and a scheme for retrieving them, we can design the XmlResolver for accessing them.
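As a quick illustration of the res:// syntax, the sketch below splits such a URI into its assembly name and resource name using plain string operations. The ResUriParser class is a hypothetical helper, not part of the article's sample, and it assumes the URI contains exactly the two parts shown in the syntax above.

```csharp
using System;

// Hypothetical helper that illustrates the res:// syntax.
public static class ResUriParser
{
    // Splits res://<assemblyName>?<resourceName> into its two parts.
    public static void Parse(string uri, out string assemblyName, out string resourceName)
    {
        const string prefix = "res://";
        if (uri == null || !uri.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Not a res:// URI: " + uri);

        // The '?' must follow a non-empty assembly name and precede a non-empty resource name.
        int query = uri.IndexOf('?');
        if (query < prefix.Length + 1 || query == uri.Length - 1)
            throw new ArgumentException("Expected res://<assemblyName>?<resourceName>: " + uri);

        assemblyName = uri.Substring(prefix.Length, query - prefix.Length);
        resourceName = uri.Substring(query + 1);
    }
}
```

For res://XmlResolvers?books.xml this yields the assembly name XmlResolvers and the resource name books.xml, which together form the manifest resource name XmlResolvers.books.xml.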

The XmlResourceResolver

We are going to create an XmlResourceResolver class that is able to resolve the res:// scheme. The XmlResourceResolver class is derived from the XmlUrlResolver class. This enables us to do the following:

  • Take advantage of the XmlUrlResolver.ResolveUri() implementation.
  • Provide default support for the file:// and http:// schemes.

The code below shows the implementation of the XmlResourceResolver.GetEntity method.

public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
{
   Stream stream = null;

   // OriginalString preserves the case of the URI as supplied by the caller
   string origString = absoluteUri.OriginalString;
   int schemestart = origString.LastIndexOf("//") + 2;
   int start = origString.LastIndexOf("/") + 1;
   int end = origString.IndexOf('?');

   Console.WriteLine("Attempting to retrieve: {0}", absoluteUri);

   switch (absoluteUri.Scheme)
   {
      case "res":
         // Handle res:// scheme requests against
         // a named assembly with embedded resources
         Assembly assembly = null;
         string assemblyfilename = origString.Substring(start, end - start);

         try
         {
            // Requested assembly is the entry (project) assembly, which is already loaded
            if (string.Compare(Assembly.GetEntryAssembly().GetName().Name, assemblyfilename, true) == 0)
            {
               assembly = Assembly.GetEntryAssembly();
            }
            // Requested assembly is not loaded, so load it
            else
            {
               assembly = Assembly.Load(AssemblyName.GetAssemblyName(assemblyfilename + ".exe"));
            }
            string resourceName = assemblyfilename + "." +
               absoluteUri.Query.Substring(absoluteUri.Query.IndexOf('?') + 1);
            stream = assembly.GetManifestResourceStream(resourceName);
         }
         catch (Exception e)
         {
            Console.Out.WriteLine(e.Message);
         }

         return stream;

      default:
         // Handle file:// and http://
         // requests from the XmlUrlResolver base class
         stream = (Stream)base.GetEntity(absoluteUri, role, ofObjectToReturn);
         try
         {
            // CacheStream reads the source stream to the end and closes it,
            // so return the stream it opens over the cached file instead
            if (CacheStreams)
               stream = CacheStream(stream, absoluteUri);
         }
         catch (Exception e)
         {
            Console.Out.WriteLine(e.Message);
         }

         return stream;
   }
}

First, the GetEntity method parses the supplied absoluteUri parameter into its component values. The Uri.OriginalString property is used (as opposed to the Uri.AbsoluteUri property) since the System.Uri class converts parts of the supplied URI, such as the host name that carries the assembly name here, to lower case by default, and we need to preserve the case in order to load an embedded resource.
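A small experiment makes the difference between these properties visible. The UriCaseDemo class below is a hypothetical helper, not part of the article's sample; it prints the original string next to the normalized host and absolute URI so that you can compare them on your own version of the Framework.

```csharp
using System;

// Hypothetical demo: compares the text as supplied with the normalized forms.
public static class UriCaseDemo
{
    public static void Show(string text)
    {
        Uri uri = new Uri(text);
        // OriginalString is the text exactly as it was passed to the constructor.
        Console.WriteLine("OriginalString: " + uri.OriginalString);
        // Host and AbsoluteUri reflect the normalization applied by System.Uri.
        Console.WriteLine("Host:           " + uri.Host);
        Console.WriteLine("AbsoluteUri:    " + uri.AbsoluteUri);
    }
}
```

Calling UriCaseDemo.Show("res://XmlResolvers?books.xml") shows why the resolver reads the assembly name from OriginalString rather than from the normalized properties.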

Since we will add further supported schemes to this XmlResourceResolver there is a switch on the scheme type, in this case the "res" string. In the body of the case statement we use the System.Reflection.Assembly class to determine whether the assembly is already loaded into memory (as this is the project assembly) by comparing this to the supplied assembly name.

if (string.Compare(Assembly.GetEntryAssembly().GetName().Name, assemblyfilename, true) == 0)
{
   assembly = Assembly.GetEntryAssembly();
}

If it is not already loaded you load the requested assembly, ensuring that you have specified the correct path to find it. In the code below the assembly to be loaded is assumed to be in the same directory as the project assembly; you may want to add a different path, instead.

else
{
   assembly = Assembly.Load(AssemblyName.GetAssemblyName(assemblyfilename + ".exe"));
}

Once the assembly is loaded, calling the GetManifestResourceStream method conveniently returns a stream to the named embedded resource, which is then returned from the case statement. If this stream is null, indicating that the embedded resource cannot be found, the calling component needs to throw an exception.
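One way for a caller to handle the null case is a thin wrapper that turns a missing resource into an exception. This is a sketch only: the EmbeddedResource class is a hypothetical helper, and the choice of FileNotFoundException is an assumption rather than something the article prescribes.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical wrapper around Assembly.GetManifestResourceStream.
public static class EmbeddedResource
{
    // Returns the resource stream, or throws if the resource does not exist.
    public static Stream Open(Assembly assembly, string resourceName)
    {
        // GetManifestResourceStream returns null when the resource is not found.
        Stream stream = assembly.GetManifestResourceStream(resourceName);
        if (stream == null)
        {
            throw new FileNotFoundException(
                "Embedded resource '" + resourceName + "' was not found in assembly '" +
                assembly.GetName().Name + "'.");
        }
        return stream;
    }
}
```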

Finally, the default in the switch statement calls the GetEntity method on the XmlUrlResolver base class so that we have a single XmlResolver class that can resolve three schemes: file://, http://, and res://.

Both XML documents and XML schemas can be embedded into an assembly. The following example validates the books.xml document against the books.xsd schema using the XmlResourceResolver to specify the URI for each.

public static void LoadXmlandSchemaFromAssemblyResource()
{
   XmlTextReader docReader = new XmlTextReader(@"res://XmlResolvers?books.xml");

   XmlResourceResolver resourceResolver = new XmlResourceResolver();
   docReader.XmlResolver = resourceResolver;

   XmlSchemaCollection schemas = new XmlSchemaCollection();
   schemas.ValidationEventHandler += new ValidationEventHandler(schemas_ValidationEventHandler);

   XmlTextReader schemaReader = new XmlTextReader(@"res://XmlResolvers?books.xsd");
   schemaReader.XmlResolver = resourceResolver;
   schemas.Add("bookstore.books.com", schemaReader);

   XmlValidatingReader validReader = new XmlValidatingReader(docReader);
   validReader.ValidationType = ValidationType.Schema;
   validReader.Schemas.Add(schemas);

   validReader.ValidationEventHandler += new ValidationEventHandler(validReader_ValidationEventHandler);
   while (validReader.Read())
   { }
}

Accessing XML Stored in a SQL Server Database

Whereas the res:// scheme is very useful in shipping a solution that contains a number of embedded XML documents (it is especially good for XSL stylesheets, for example), it is not very updateable and not ideal for large numbers of documents. As with most data, a better place to store documents is in a database. We can further extend the XmlResourceResolver to support the retrieval of XML documents that are stored in a SQL Server database. We are going to develop solutions for both SQL Server 2000 and SQL Server 2005. The advantage that SQL Server 2005 offers is that XML is a native data type that can be used to create columns of type XML.

From a developer perspective, using SQL Server 2005 Express Edition makes for a very convenient and dynamic application-building platform. SQL Server 2005 Express Edition (SQL Server Express) is an easy-to-use version of SQL Server 2005, and is designed for building simple, dynamic applications and, best of all, is free to use and redistribute. It is limited to using a single CPU and up to 1GB RAM, with a 4GB maximum database size. SQL Server Express does not include any of the advanced components of SQL Server, but it does support the XML data type, which is ideal for acting as a more scalable local XML document repository. SQL Server 2005 Express Beta 2 can be downloaded here.

A typical scenario is storing a large number of XML schemas in order to validate XML documents. New schemas can be added or existing ones updated simply by changing the entries in a database table.

As we did for the XmlResolvers assembly, we need to come up with a scheme to access the XML schemas in the database tables. Here we have to be more careful, since databases are a significant resource and need to be protected. For instance, we could create a scheme that enables you to pass a SQL statement to the database, like this.

query://<database>?query=<query string>

The following query selects the xsdschema column from the xmlschema table in the xmlschemarepository database, where the namespace value is bookstore.books.com.

query://xmlschemarepository?query=Select xsdschema from xmlschema where namespace=bookstore.books.com

However, in order to prevent SQL injection or just SQL attacks, we are going to implement a more constrained scheme called the db:// scheme with the following syntax,

db://<table>/<column>?<value>

where the operator used with the generated SQL WHERE clause is always the equals operator using the supplied value.

For example, to select the xsdschema column from the xmlschema table where the namespace is equal to bookstore.books.com, the following URI is used with the db:// scheme.

db://xmlschema/xsdschema?bookstore.books.com

By only allowing certain string values for the table and column names, you severely restrict the types of queries that can be generated and make the XmlResourceResolver implementation more resistant to potential SQL injection attacks.
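The allow-list idea can be sketched independently of any database access. The DbUriParser class below is a hypothetical helper, not the article's implementation: it splits a db:// URI with plain string operations, rejects table and column names that are not on its fixed lists, and returns parameterized command text in which the supplied value never becomes part of the SQL string.

```csharp
using System;
using System.Security;

// Hypothetical helper: validates a db:// URI against fixed allow-lists.
public static class DbUriParser
{
    private static readonly string[] AllowedTables = { "xmlschema", "xmlschemaxmlcol" };
    private static readonly string[] AllowedColumns = { "xsdschema" };

    // Parses db://<table>/<column>?<value> and returns parameterized SQL text.
    // The value is handed back separately so it can be bound to the @name parameter.
    public static string BuildCommandText(string uri, out string value)
    {
        const string prefix = "db://";
        if (uri == null || !uri.StartsWith(prefix, StringComparison.Ordinal))
            throw new ArgumentException("Not a db:// URI: " + uri);

        int slash = uri.IndexOf('/', prefix.Length);
        int query = uri.IndexOf('?');
        if (slash < 0 || query < slash)
            throw new ArgumentException("Expected db://<table>/<column>?<value>: " + uri);

        string table = uri.Substring(prefix.Length, slash - prefix.Length);
        string column = uri.Substring(slash + 1, query - slash - 1);
        value = uri.Substring(query + 1);

        // Only names on the allow-lists are ever concatenated into the SQL text.
        if (Array.IndexOf(AllowedTables, table) < 0)
            throw new SecurityException("Table name not allowed: " + table);
        if (Array.IndexOf(AllowedColumns, column) < 0)
            throw new SecurityException("Column name not allowed: " + column);

        return "SELECT " + column + " FROM " + table + " WHERE namespaces = @name";
    }
}
```

The namespaces column name in the WHERE clause is taken from the table definitions used in this article; a different repository layout would need its own allow-lists and key column.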

Adding XML Documents into SQL Server

XML documents can be stored in SQL Server 2000 as a column of type [n]varchar(max)/[n]varbinary(max). We will also look at the ability to store XML documents as the new XML data type that is introduced into SQL Server 2005, which is also supported in the SQL Server 2005 Express Edition. There are advantages and disadvantages to both these storage formats when working with XML documents.

We are going to store XML Schemas (XSDs) in a database called xmlschemarepository in a table called xmlschema. The T-SQL to create this table is shown below.

CREATE TABLE xmlschema(namespaces nvarchar(max) primary key, xsdschema nvarchar(max)) 

In SQL Server 2005 we can also create a table with an XML data type, instead using the following T-SQL statement.

CREATE TABLE xmlschemaxmlcol(namespaces nvarchar(max) primary key, xsdschema xml) 

We can then insert an XML schema document, with its corresponding namespace as the primary key, into each of these tables with the following T-SQL statement.

insert into xmlschema values (N'bookstore.books.com', '<xs:schema xmlns:tns="bookstore.books.com" xmlns="bookstore.books.com"
 attributeFormDefault="unqualified" elementFormDefault="qualified"
 targetNamespace="bookstore.books.com"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="listofsamplechapters">
  <xs:list itemType="xs:string"/>
  </xs:simpleType>

  <xs:element name="bookstore">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="book">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string" />
              <xs:element name="author">
                <xs:complexType>
                 <xs:sequence>
                 <xs:element minOccurs="0" name="name" type="xs:string" />
                 <xs:element minOccurs="0" name="first-name" type="xs:string" />
                 <xs:element minOccurs="0" name="last-name" type="xs:string" />
                 </xs:sequence>
                </xs:complexType>
              </xs:element>
              <xs:element minOccurs="0" name="date" type="xs:string" />
              <xs:element minOccurs="0" name="samplechapters" type="listofsamplechapters" />
              <xs:element maxOccurs="unbounded" name="price">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:decimal">
                      <xs:attribute name="alternative" type="xs:string" use="optional" />
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="genre" type="xs:string" use="required" />
            <xs:attribute name="publicationdate" type="xs:string" use="required" />
            <xs:attribute name="ISBN" type="xs:string" use="required" />
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>')

The following T-SQL checks that we can retrieve this XML schema by its namespace correctly.

Select xsdschema from xmlschema where namespaces = 'bookstore.books.com'

Or for the XML data type column,

Select xsdschema from xmlschemaxmlcol where namespaces = 'bookstore.books.com'

Now we are ready to retrieve XML schemas from the client by extending the XmlResourceResolver to handle our new db:// scheme.

Extending the XmlResourceResolver to support the db:// scheme

We are going to extend the XmlResourceResolver to be able to load an XML schema from the xmlschema table. The code below shows how to use the db:// scheme to load an XML schema with a given namespace.

public static void LoadXmlSchemaFromDatabase()
{
   XmlTextReader docReader = new XmlTextReader(@"db://xmlschema/xsdschema?bookstore.books.com");

   XmlResourceResolver resourceResolver = new XmlResourceResolver();
   resourceResolver.ConnectionInfo = connectionInfo;
   docReader.XmlResolver = resourceResolver;

   while (docReader.Read())
   {
      if (docReader.IsStartElement("xs:schema"))
      {
         docReader.MoveToAttribute("targetNamespace");
         Console.WriteLine("Retrieved schema with targetNamespace : {0}",
            docReader.Value);
      }
   }
}

In order to achieve this, the following code was added to the switch statement in the XmlResourceResolver class.

case "db":
   // Handled db:// scheme requests against a known database
   SqlDataReader dataReader = null;
   MemoryStream xmlstream = null;
   string cmdText = string.Empty;
   SqlCommand cmd;

   string tablename = origString.Substring(schemestart, start - (schemestart + 1));
   string columnname = origString.Substring(start, end - start);
   // Get the XML Schema namespace
   string name = absoluteUri.Query.Substring(absoluteUri.Query.IndexOf('?') + 1);

   using (SqlConnection connection = new SqlConnection(ConnectionString))
   {
      try
      {
         // Security check on the allowed table names 
         // in order to prevent any SQL injection attacks
         switch (tablename)
         {
            // Switch on XML documents stored in SQL Server 2000 
            case "xmlschema":
               // Security check on the allowed column names
               switch (columnname)
               {
                  case "xsdschema":
                     cmdText = @"SELECT " + columnname + " FROM " + tablename + " WHERE namespaces = @name";
                     break;
                  default:
                     throw new SecurityException("Security SQL Injection: Invalid column name used");
               }
               cmd = new SqlCommand(cmdText, connection);
               cmd.Parameters.Add(new SqlParameter("@name", name));
               Console.WriteLine(cmd.CommandText);

               connection.Open();
               dataReader = cmd.ExecuteReader();

               while (dataReader.Read())
               {
                  // return a memory stream
                  UTF8Encoding uniEncoding = new UTF8Encoding();
                  byte[] xmlbuffer = 
                  uniEncoding.GetBytes(dataReader[columnname].ToString());
                  xmlstream = new MemoryStream(xmlbuffer, 0, 
                  xmlbuffer.Length);
               }
               break;

            // Switch on XML documents stored in SQL Server 2005 
            case "xmlschemaxmlcol":
               // Security check on the allowed column names
               switch (columnname)
               {
                  case "xsdschema":
                     cmdText = @"SELECT " + columnname + " FROM " + tablename
                        + " WHERE namespaces = @name";
                     break;
                  default:
                     throw new SecurityException("Security SQL Injection: Invalid column name used");
               }

               cmd = new SqlCommand(cmdText, connection);
               cmd.Parameters.Add(new SqlParameter("@name", name));
               Console.WriteLine(cmd.CommandText);
               connection.Open();
               dataReader = cmd.ExecuteReader();
               while (dataReader.Read())
               {
                  // return a memory stream 
                  Console.WriteLine(dataReader[columnname].ToString());
                  SqlXml xml = dataReader.GetSqlXml(0);
                  xmlstream = new 
                  MemoryStream(dataReader[columnname].ToString().Length);
                  XmlTextWriter writer = new XmlTextWriter(xmlstream, 
                  Encoding.Unicode);
                  writer.WriteNode(xml.CreateReader(), false);
                  // Flush the XmlWriter to the stream and close it.
                  writer.Flush();
                  // Set the position to the beginning of the stream.
                  xmlstream.Seek(0, SeekOrigin.Begin);
               }
               break;

            default:
               throw new SecurityException("Security SQL Injection: Invalid table name used");
         }
      }
      catch (Exception e)
      {
         Console.Out.WriteLine(e.Message);
      }
      finally
      {
         // Close the data reader object. The database connection is
         // closed when it goes out of scope through the "using" statement.
         if (dataReader != null)
            dataReader.Close();
      }
   }
   return xmlstream;

The code above shows XML schema documents being loaded from both SQL Server 2000 and SQL Server 2005 Express, depending on the name of the table and column.

Some benefits to storing the XML schema as an [n]varchar(max) or an [n]varbinary(max) include the following:

  • This approach provides textual fidelity for XML documents when used to store XML. This is a requirement for applications that deal with legal documents such as insurance documents.
  • It does not depend on XML support that is offered by the database and can be extended to support multiple database servers.
  • It uses the processing power of the client system, thereby reducing the load on the server. By performing CPU-intensive XML processing on the middle-tier, the server is relieved of some load and is available for other important tasks.
  • It offers the best possible performance for insertion and retrieval at the XML document level, that is, where parts of documents do not need to be retrieved or updated.

Some benefits of storing the XML schema as an XML data type are:

  • You can offer significantly better querying capabilities using the W3C XQuery language, with the ability to perform fine-grained queries and modification operations on XML data. XML documents stored as [n]varchar(max) in the database do not offer fine-grained updates, inserts, or deletes on an XML document.
  • It is a simple and straightforward way of storing your XML data at the server while preserving document order and document structure.
  • It offers the ability to create indexes on XML data type columns for significantly faster query-processing.
  • It offers the ability to easily enforce XML schema validation and constraints on the XML data.

In order to load the XML schema from the XML data type xmlschemaxmlcol table, it is simply a matter of amending the URI as shown in the code below.

XmlTextReader docReader = new XmlTextReader(@"db://xmlschemaxmlcol/xsdschema?bookstore.books.com");

Using this XmlResourceResolver class, XML documents and XML schemas can be embedded either into an assembly or a database. The following example validates the books.xml document that is embedded in the XmlResolvers assembly against the bookstore.books.com XML schema that is retrieved from a SQL Server database.

public static void LoadXmlFromAssemblyResourceandSchemaFromDatabase()
{
   XmlTextReader docReader = new XmlTextReader(@"res://XmlResolvers?books.xml");
   XmlResourceResolver resourceResolver = new XmlResourceResolver();
   docReader.XmlResolver = resourceResolver;
   resourceResolver.ConnectionInfo = connectionInfo;

   XmlSchemaCollection schemas = new XmlSchemaCollection();
   schemas.ValidationEventHandler += new ValidationEventHandler(schemas_ValidationEventHandler);

   XmlTextReader schemaReader = new XmlTextReader(@"db://xmlschema/xsdschema?bookstore.books.com");
   schemaReader.XmlResolver = resourceResolver;
   schemas.Add("bookstore.books.com", schemaReader);

   XmlValidatingReader validReader = new XmlValidatingReader(docReader);
   validReader.ValidationType = ValidationType.Schema;
   validReader.Schemas.Add(schemas);
   validReader.ValidationEventHandler += new ValidationEventHandler(validReader_ValidationEventHandler);
   while (validReader.Read())
   { }
}

Adding Local File System Caching

A final feature we are going to add to this XmlResourceResolver is the ability to perform local file caching for http:// requests. This is achieved with a hashtable keyed on absolute URIs, which is used to retrieve a previously stored file, if there is one. This caching approach is straightforward and does not provide an aging mechanism for the files, which I will leave as an exercise for you, the reader, just like in your school days.

The code below shows the CacheStream function, which takes a stream and an absolute URI as input, creates a local file, and then stores the path to the local file with the absolute URI as the key. In this way remote http:// requests such as for DTD or XML schemas found on the internet can be automatically stored locally on the first access.

private FileStream CacheStream(Stream inputStream, Uri absoluteUri)
{
   // The path and filename that the cached stream will be stored to.
   string path = absoluteUri.AbsolutePath;
   int i = path.LastIndexOf("/");
   if (i >= 0) path = path.Substring(i + 1);
   path = cacheLocation + path;
   Console.WriteLine("Caching file: " + path);
   Console.WriteLine();

   // Create a new stream representing the file to be written to
   // (replacing any earlier copy), and copy the stream from the
   // external location to the file.
   FileStream fileStream = new FileStream(path, FileMode.Create,
      FileAccess.Write, FileShare.None);
   StreamReader sRead = new StreamReader(inputStream);
   StreamWriter sWrite = new StreamWriter(fileStream);
   sWrite.Write(sRead.ReadToEnd());

   // Add the information about the cached URI to the hashtable,
   // overwriting any existing entry for the same URI.
   uriTable[absoluteUri.AbsoluteUri] = path;

   // Close any open streams.
   sWrite.Close();
   sRead.Close();

   return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
}
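The lookup side of the cache is not shown in this article. A minimal sketch of what it might look like follows, assuming a hashtable that maps absolute URIs to local file paths in the same way as uriTable above; the UriFileCache class and its TryOpenCached method are hypothetical names.

```csharp
using System;
using System.Collections;
using System.IO;

// Hypothetical lookup helper for a URI-to-file cache.
public static class UriFileCache
{
    // Returns a read-only stream over the cached copy, or null on a cache miss.
    public static Stream TryOpenCached(Hashtable uriTable, Uri absoluteUri)
    {
        // The table maps the absolute URI string to the path of the local copy.
        string path = uriTable[absoluteUri.AbsoluteUri] as string;
        if (path == null || !File.Exists(path))
            return null;
        return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
    }
}
```

A resolver could call a method like this at the start of GetEntity and fall back to the base class, followed by CacheStream, only when it returns null.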

For example, the W3C XML Schema for Schemas contains two DTD references. Run the code shown below, and then look in the local directory C:\FileCache\ to see which files have been downloaded and stored. You will be surprised to find a couple of DTDs, each of which required the use of the XmlResolver to download. Stepping through the code in the debugger is a great way to see the XmlResolver class in action as it finds resource references that need further resolution.

public static void LoadXmlFromNetworkandCache()
{
   XmlTextReader docReader = new XmlTextReader(@"http://www.w3.org/2001/XMLSchema.xsd");
   XmlResourceResolver resourceResolver = new XmlResourceResolver();

   resourceResolver.CacheLocation = @"C:\FileCache\";
   resourceResolver.CanCacheStreams = true;
   docReader.XmlResolver = resourceResolver;

   while (docReader.Read())
   {
   }

   //Read the same XML document and this time the XML document is read
   // from the file cache.
   XmlTextReader docReader2 = new XmlTextReader(@"http://www.w3.org/2001/XMLSchema.xsd");

   docReader2.XmlResolver = resourceResolver;
   while (docReader2.Read())
   {
   }
}
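
The aging mechanism left as an exercise could start from a check like the one sketched below. This is only one possible approach, not part of the XmlResourceResolver: the CacheAging class, the IsFresh method, and the maxAge parameter are hypothetical names, and the sketch assumes that the last write time of the cached file is an acceptable measure of its age.

```csharp
using System;
using System.IO;

public class CacheAging
{
    // Returns true when the cached file exists and was last written
    // no longer than maxAge ago. A caller would download and cache
    // the resource again whenever this returns false.
    public static bool IsFresh(string cachedPath, TimeSpan maxAge)
    {
        if (!File.Exists(cachedPath))
            return false;

        DateTime written = File.GetLastWriteTimeUtc(cachedPath);
        return DateTime.UtcNow - written <= maxAge;
    }
}
```

A resolver could call such a check at the point where the hashtable lookup finds a cached path, and fall through to the normal download when the file is missing or too old.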

You can also wrap the XmlResourceResolver in an XmlSecureResolver in order to limit, for example, which internet sites can be accessed.

A Dynamic XmlResolver Implementation

Although the XmlResourceResolver provides support for several schemes, there may be times when you want to have a more dynamic behavior where, for example, a local configuration file determines which schemes are supported by which XmlResolver. In these cases it is useful to have a dynamic XmlResolver class that matches the scheme to the XmlResolver.

The code below shows an implementation of an XmlDynamicResolver that returns an instance of an XmlResolver depending on the requested scheme.

class XmlDynamicResolver : XmlResolver
{
   //Maps a scheme name, such as "http", to the XmlResolver that handles it.
   Hashtable resolverCollection = new Hashtable();

   public void Add(string scheme, XmlResolver resolver)
   {
      resolverCollection.Add(scheme, resolver);
   }

   public override System.Net.ICredentials Credentials
   {
      set { throw new System.NotSupportedException(); }
   }

   public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
   {
      Console.WriteLine("Attempting to retrieve: {0}", absoluteUri);
      //Hand the request to the XmlResolver registered for this scheme.
      XmlResolver resolver = (XmlResolver)resolverCollection[absoluteUri.Scheme];
      return resolver.GetEntity(absoluteUri, role, ofObjectToReturn);
   }

   public override Uri ResolveUri(Uri baseUri, string relativeUri)
   {
      //With no base URI, the scheme must come from the URI being resolved.
      Uri actualUri = baseUri;

      if (actualUri == null)
         actualUri = new Uri(relativeUri);

      XmlResolver resolver = (XmlResolver)resolverCollection[actualUri.Scheme];
      return resolver.ResolveUri(baseUri, relativeUri);
   }
}

To use this XmlDynamicResolver, you register an XmlResolver instance for each scheme that you want to support. The example below shows the addition of the XmlResourceResolver for the "db" and "res" schemes, and the XmlUrlResolver for the "http" and "file" schemes. Yes, the XmlResourceResolver can in fact handle all four of these schemes, so this is a somewhat contrived example; it does illustrate the class, however. Note that in this case the XmlDynamicResolver is derived from the XmlResolver abstract class.

public static void LoadXmlWithDynamicResolver()
{
   // Register schemes with the dynamic XmlResolver
   XmlDynamicResolver dynamicResolver = new XmlDynamicResolver();
   dynamicResolver.Add("db", new XmlResourceResolver());
   dynamicResolver.Add("res", new XmlResourceResolver());
   dynamicResolver.Add("file", new XmlUrlResolver());
   dynamicResolver.Add("http", new XmlUrlResolver());
   
   XmlTextReader docReader = new XmlTextReader(@"res://XmlResolvers?books.xml");
   docReader.XmlResolver = dynamicResolver;
   while (docReader.Read())
   {
      if (docReader.IsStartElement("price"))
         Console.WriteLine("Book price " + docReader.ReadInnerXml());
   }
}
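
The XmlDynamicResolver has no entry for a scheme that was never registered. One possible variation, sketched below, passes such requests to a fallback XmlResolver instead. The XmlFallbackResolver and FixedTextResolver classes are hypothetical and are not part of the article's sample code; FixedTextResolver is only an in-memory stand-in so that the sketch can be run without a network, a database, or embedded resources.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

// Hypothetical variation on XmlDynamicResolver: schemes without a
// registered resolver are passed to a fallback resolver.
public class XmlFallbackResolver : XmlResolver
{
    private readonly Hashtable resolvers = new Hashtable();
    private readonly XmlResolver fallback;

    public XmlFallbackResolver(XmlResolver fallback)
    {
        if (fallback == null) throw new ArgumentNullException("fallback");
        this.fallback = fallback;
    }

    public void Add(string scheme, XmlResolver resolver)
    {
        // Uri.Scheme is reported in lowercase, so keys are stored that way.
        resolvers[scheme.ToLowerInvariant()] = resolver;
    }

    private XmlResolver Find(Uri uri)
    {
        XmlResolver found = (XmlResolver)resolvers[uri.Scheme];
        return found != null ? found : fallback;
    }

    public override ICredentials Credentials
    {
        set { throw new NotSupportedException(); }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        return Find(absoluteUri).GetEntity(absoluteUri, role, ofObjectToReturn);
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        // With no base URI, the scheme comes from the URI being resolved.
        Uri probe = baseUri != null ? baseUri : new Uri(relativeUri);
        return Find(probe).ResolveUri(baseUri, relativeUri);
    }
}

// Stand-in resolver for demonstration only: always serves one XML string.
public class FixedTextResolver : XmlResolver
{
    private readonly string xml;

    public FixedTextResolver(string xml) { this.xml = xml; }

    public override ICredentials Credentials { set { } }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(xml));
    }

    public override Uri ResolveUri(Uri baseUri, string relativeUri)
    {
        return baseUri == null ? new Uri(relativeUri) : new Uri(baseUri, relativeUri);
    }
}
```

Passing a new XmlUrlResolver as the fallback would give unregistered schemes the default System.Xml behavior, while the registered schemes continue to go to their own resolvers.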

Conclusion

This article has provided an in-depth review of the versatile XmlResolver class. XmlResolvers resolve external data sources identified by a URI and then retrieve the data identified by that URI, and so provide an abstraction for accessing data sources such as files on a file system. System.Xml in the .NET Framework provides two implementations of the XmlResolver: the XmlUrlResolver for file:// and http:// requests, and the XmlSecureResolver for limiting access to specified resources.

The beauty of XmlResolvers is that you can build them to retrieve XML from anywhere; all you need to do is return a stream. In this article we walked through XmlResolver implementations that pull XML documents that are embedded in a CLR assembly or are stored in a SQL Server database. If you want to see two more examples of XmlResolvers, Chris Lovett describes an XmlAspResolver for resolving files stored in Microsoft ASP.NET virtual directories and an alternative implementation of an XmlCachingResolver that performs in-memory caching rather than local file caching. All interesting XML projects create some form of XmlResolver, and it would be good to hear about your implementations. Send an e-mail message describing them to mfussell@microsoft.com. ¡Viva la XmlResolver!