XML Files

Advanced Type Mappings

Aaron Skonnard

Code download available at: XMLFiles0306.exe (157 KB)

Q Can XmlSerializer deal with choice compositors?

A XmlSerializer can map XML Schema's sequence and all compositors quite logically to Microsoft® .NET Framework class definitions, but choice requires a bit of special attention. The xsd:choice compositor indicates that exactly one of several particles is allowed at a given location within a complex type. For example, consider the XML Schema complex type definition shown in Figure 1. This definition states that an instance may contain one of three elements: commission, hourly, or salary, as illustrated here:

    <paymentMethod xmlns="https://example.org/xmlserializer">
      <hourly>7.50</hourly>
    </paymentMethod>

Figure 1 XML Schema Complex Type Definition

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        targetNamespace="https://example.org/xmlserializer"
        elementFormDefault="qualified" >
      <xs:complexType name="PaymentMethod">
        <xs:choice>
          <xs:element name="commission" type="xs:double" />
          <xs:element name="hourly" type="xs:double" />
          <xs:element name="salary" type="xs:double" />
        </xs:choice>
      </xs:complexType>
      <xs:element name="paymentMethod" type="PaymentMethod"/>
      •••

Although this type of constraint is quite common in XML documents, few programming languages support it directly. It would be quite easy to map something like this to a language that supports C-style unions, since a union defines a structure that holds one of several data members, as illustrated by the following:

    union PaymentMethod {
        double commission;
        double hourly;
        double salary;
    };
    PaymentMethod paymentMethod;

However, since C# doesn't support unions, the mapping requires an out-of-band technique. If you run the aforementioned schema definition through xsd.exe (with the /classes switch), it generates a class like the one shown in Figure 2.

Figure 2 PaymentMethod Class

    using System.Xml.Serialization;
    using System.Xml.Schema;
    •••
    public class PaymentMethod
    {
        [XmlElement("commission", typeof(System.Double))]
        [XmlElement("hourly", typeof(System.Double))]
        [XmlElement("salary", typeof(System.Double))]
        [XmlChoiceIdentifier("ItemElementName")]
        public System.Double Item;

        [XmlIgnore]
        public ItemChoiceType ItemElementName;
    }

Notice that the PaymentMethod class contains only two members: Item and ItemElementName. Together, these two members represent an instance of the choice type. Since all three of the particle choices were declared as type xsd:double, the Item member is defined as type System.Double, which seems reasonable since that's the only possible type in this case.

The Item member is annotated with four attributes, three of type XmlElementAttribute and one of type XmlChoiceIdentifier. The XmlElementAttribute declarations specify the different name/type possibilities for the member when serialized. The XmlChoiceIdentifier attribute indicates that this particular member represents a choice compositor. However, simply looking at the value doesn't enable the serialization engine to unambiguously determine which element should be used. It needs a little bit more information. Hence, the XmlChoiceIdentifier specifies the name of another class member that contains the name of the actual element to use during serialization.

The ItemElementName member is of type ItemChoiceType, which is an enumeration type defined as follows:

    •••
    [XmlTypeAttribute(
        Namespace="https://example.org/xmlserializer",
        IncludeInSchema=false)]
    public enum ItemChoiceType
    {
        commission,
        hourly,
        salary,
    }
    •••

Notice that ItemChoiceType is marked with IncludeInSchema=false to keep it out of any schema that xsd.exe generates; it exists only to support the choice. So when you instantiate a new PaymentMethod object and assign the Item field a value, you also need to specify which member of the choice you're actually using. ItemElementName is annotated with XmlIgnoreAttribute, preventing it from being serialized into the instance by XmlSerializer. Again, it's simply there to help XmlSerializer figure out how to properly serialize the Item field. Check out the example that is shown in Figure 3.

Figure 3 Serialize

    using System;
    using System.IO;
    using System.Xml.Serialization;
    •••
    static void Serialize()
    {
        PaymentMethod p = new PaymentMethod();
        p.Item = 7.5;
        // specify which member we're actually using
        p.ItemElementName = ItemChoiceType.hourly;
        XmlSerializer s = new XmlSerializer(typeof(PaymentMethod));
        FileStream fs = new FileStream("p.xml", FileMode.Create);
        s.Serialize(fs, p);
        fs.Close();
    }

The code in Figure 3 generates an XML document named p.xml, which contains the following:

    <?xml version="1.0"?>
    <paymentMethod xmlns="https://example.org/xmlserializer">
      <hourly>7.5</hourly>
    </paymentMethod>

Then, of course, when you're deserializing PaymentMethod objects, you can inspect the ItemElementName field to figure out which choice particle was used in the instance document:

    static void Deserialize()
    {
        XmlSerializer s = new XmlSerializer(typeof(PaymentMethod));
        FileStream fs = new FileStream("p.xml", FileMode.Open);
        PaymentMethod p = (PaymentMethod)s.Deserialize(fs);
        fs.Close();
        Console.WriteLine("{0}: {1}", p.ItemElementName, p.Item);
    }

This turns out to be quite handy when working with choices of different types. In the earlier example I showed, each particle was of type xsd:double. Suppose, however, that you extend it to contain an additional particle of a different type:

    <xs:complexType name="PaymentMethod">
      <xs:choice>
        <xs:element name="commission" type="xs:double" />
        <xs:element name="hourly" type="xs:double" />
        <xs:element name="salary" type="xs:double" />
        <xs:element name="other" type="xs:string" />
      </xs:choice>
    </xs:complexType>
    <xs:element name="paymentMethod" type="PaymentMethod"/>

If you regenerate the class definition using xsd.exe, you'll notice that the Item field is now declared as type object instead of double, which makes sense since it can now contain either a string or a double. In situations like this, the ItemElementName field helps you figure out what type to expect in the Item field (see Figure 4).

Figure 4 Deserialize

    static void Deserialize()
    {
        •••
        PaymentMethod p = (PaymentMethod)s.Deserialize(fs);
        switch (p.ItemElementName)
        {
            case ItemChoiceType.other:
                string items = (string)p.Item;
                •••
                break;
            default:
                double itemd = (double)p.Item;
                •••
                break;
        }
        •••
    }

One last issue I'll touch on is how to deal with repeating choices. For example, let's consider the slightly modified PaymentMethod complex type definition code that contains a repeating choice, as illustrated in the following:

    <xs:complexType name="PaymentMethod">
      <xs:choice maxOccurs="unbounded">
        <xs:element name="commission" type="xs:double" />
        <xs:element name="hourly" type="xs:double" />
        •••

This means that a repeating list of any of the particles (listed within choice) may be used within an instance of this type. If you regenerate the class definition with this change in place, you'll notice that the only real difference is that the two fields, now named Items and ItemsElementName, are declared as arrays:

    public class PaymentMethod
    {
        •••
        [XmlChoiceIdentifier("ItemsElementName")]
        public object[] Items;

        [XmlIgnore]
        public ItemsChoiceType[] ItemsElementName;
    }

The fundamental method for dealing with choices remains the same. There is a separate object (in the Items array) for each instance of the choice and a corresponding ItemsChoiceType value (in the ItemsElementName array) that indicates which particle was actually used. The two arrays are matched up by index position.
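To make the parallel-array pairing concrete, here is a minimal sketch of a round trip through a repeating choice. It is an illustration under stated assumptions rather than the output of xsd.exe: the class is written by hand, the XML namespace is left out to keep the attributes short, and the RepeatingChoiceDemo class and its method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Names the particle used at each position; never serialized itself.
[XmlType(IncludeInSchema = false)]
public enum ItemsChoiceType { commission, hourly, salary, other }

// Hand-written stand-in for the generated class (namespace omitted for brevity).
[XmlRoot("paymentMethod")]
public class PaymentMethod
{
    [XmlElement("commission", typeof(double))]
    [XmlElement("hourly", typeof(double))]
    [XmlElement("salary", typeof(double))]
    [XmlElement("other", typeof(string))]
    [XmlChoiceIdentifier("ItemsElementName")]
    public object[] Items;

    [XmlIgnore]
    public ItemsChoiceType[] ItemsElementName;
}

// Hypothetical helper class, used only for this illustration.
public static class RepeatingChoiceDemo
{
    public static string Serialize(PaymentMethod p)
    {
        XmlSerializer s = new XmlSerializer(typeof(PaymentMethod));
        StringWriter sw = new StringWriter();
        s.Serialize(sw, p);
        return sw.ToString();
    }

    public static PaymentMethod Deserialize(string xml)
    {
        XmlSerializer s = new XmlSerializer(typeof(PaymentMethod));
        return (PaymentMethod)s.Deserialize(new StringReader(xml));
    }

    public static void Main()
    {
        PaymentMethod p = new PaymentMethod();
        // Items[i] is serialized under the element named by ItemsElementName[i].
        p.Items = new object[] { 7.5, "stock options" };
        p.ItemsElementName = new ItemsChoiceType[] {
            ItemsChoiceType.hourly, ItemsChoiceType.other };

        string xml = Serialize(p);
        Console.WriteLine(xml);

        // After deserialization the two arrays line up by index again.
        PaymentMethod q = Deserialize(xml);
        for (int i = 0; i < q.Items.Length; i++)
            Console.WriteLine("{0}: {1}", q.ItemsElementName[i], q.Items[i]);
    }
}
```

The two arrays must have the same length, and every element name listed in an XmlElement attribute needs a matching enumeration member.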

As you can see, working with choices is more involved than dealing with the more common compositors, but it's workable if you understand the mapping and are willing to do a little extra work. You can download the complete sample illustrated here from the MSDN® Magazine Web site code page at the link at the top of this article.

Q Can XmlSerializer deal with mixed content models?

A Yes, XmlSerializer can also handle mixed content models to a degree. A mixed content model allows a combination of text and elements. For example, consider the following complex type with a mixed content model:

    <xs:complexType name="Employee" mixed="true">
      <xs:sequence>
        <xs:element name="id" type="xs:string" />
        <xs:element name="name" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
    <xs:element name="employee" type="Employee"/>

The mixed="true" attribute states that an instance of this type may contain text nodes between the allowed elements. For example, the following document contains a valid instance of the Employee type defined in the previous XML snippet:

    <employee xmlns="https://example.org/xmlserializer">
      here is some text...
      <id>333-33-3333</id>
      here is some more...
      <name>Bob Smith</name>
      and here is even more...
    </employee>

As with choice compositors, this type of XML structure doesn't map naturally to .NET Framework class definitions, but XmlSerializer does its best to deal with it. Running xsd.exe against the schema file generates the following class definition that attempts to handle mixed content:

    public class Employee
    {
        public string id;
        public string name;
        [XmlText]
        public string[] Text;
    }

Notice that the element declarations map naturally to the id and name fields, but what about the extra text that may interleave with the elements? The only way to handle this is to provide a single bucket for holding all of the interleaving text nodes. That's the purpose of the Text field of type string[]. This field is annotated with XmlTextAttribute, indicating that it will contain the extra text nodes found within the document during deserialization. Note that you can only apply the XmlTextAttribute attribute to a single member of a class.

Interleaving text nodes encountered during deserialization are simply added to this array. For example, consider the deserialization code shown in this snippet:

    XmlSerializer s2 = new XmlSerializer(typeof(Employee));
    FileStream fs2 = new FileStream("employee.xml", FileMode.Open);
    Employee e = (Employee)s2.Deserialize(fs2);
    Console.WriteLine("id: {0}, name: {1}", e.id, e.name);
    foreach (string t in e.Text)
        Console.WriteLine(t);

Running the mixed content example through the previous code produces the following console output:

    id: 333-33-3333, name: Bob Smith
    here is some text...
    here is some more...
    and here is even more...

As you can see, XmlSerializer enables you to get your hands on the interleaving text nodes, but there's no way to know exactly where they appeared in the original XML document. In many scenarios this can be a major drawback since the position of the text plays an important role in applications that commonly leverage mixed content models, such as document-oriented content management systems.

There are many XML structures that are perfectly reasonable in XML Schema but don't make much sense in the world of the .NET Framework (like choices or mixed content models). So if you're designing a system that relies heavily on the use of mixed content models, you're better off avoiding XmlSerializer altogether. Instead, you should use something like XSLT, which is well suited for processing document-oriented structures. You can download the complete sample from the MSDN Magazine Web site.
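When the position of the text matters, any API that walks the nodes in document order will preserve it. The following is a minimal sketch that uses XmlDocument rather than XSLT, purely as an illustration; the MixedContentWalk class and its Describe method are hypothetical names, and the generic List type assumes a version of the .NET Framework newer than the one this column was written for.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: lists the children of the root element in document order.
public static class MixedContentWalk
{
    public static List<string> Describe(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> parts = new List<string>();
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            if (node.NodeType == XmlNodeType.Text)
                parts.Add("text: " + node.Value.Trim());
            else if (node.NodeType == XmlNodeType.Element)
                parts.Add("element " + node.LocalName + ": " + node.InnerText);
        }
        return parts;
    }

    public static void Main()
    {
        string xml =
            "<employee xmlns=\"https://example.org/xmlserializer\">" +
            "here is some text...<id>333-33-3333</id>here is some more..." +
            "<name>Bob Smith</name>and here is even more...</employee>";

        // Text and elements come back interleaved, exactly as they appear.
        foreach (string part in Describe(xml))
            Console.WriteLine(part);
    }
}
```

Unlike the Text array produced by XmlSerializer, the list returned here keeps each text node between the elements that surrounded it in the source document.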

Q When using XmlSerializer, is it possible to force the value of a field to show up as the content of an XML element?

A Yes, through the XmlTextAttribute technique described in the previous question. For example, assume you have the following C# class definition

    public class Employee
    {
        public string id;
        public string name;
    }

and you want to map it to the following XML document, where the id field maps to an XML attribute and the name field maps to the content of the element:

    <Employee id="333-33-3333"
        xmlns="https://example.org/xmlserializer">Bob Smith</Employee>

To accomplish this you need to annotate the name field with XmlTextAttribute and the id field with XmlAttributeAttribute, as shown here:

    public class Employee
    {
        [XmlAttribute]
        public string id;
        [XmlText]
        public string name;
    }

If you run the compiled assembly through xsd.exe to generate a schema definition for this class, it produces a complex type that derives by extension from xsd:string in order to attach some attributes to the value:

    <xs:complexType name="Employee">
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="id" type="xs:string" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>

This type definition basically states that the instance element will have an id attribute and a string value as its content. The infrastructure didn't define it as a mixed content model, like in the XmlTextAttribute examples from the previous question, because in this case the element's content is of the text-only variety. Had I included another field that wasn't mapped to an attribute, however, it would have had to create a complex type with a mixed content model like before.
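The annotated class can be checked with a short serialization run. The sketch below is self-contained; the XmlRoot attribute is an assumption added so that the element lands in the namespace used by the sample document, and TextContentDemo with its ToXml method is a hypothetical helper.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// XmlRoot is an assumption made here to place the element in the sample namespace.
[XmlRoot(Namespace = "https://example.org/xmlserializer")]
public class Employee
{
    [XmlAttribute]
    public string id;      // written as the id attribute

    [XmlText]
    public string name;    // written as the element's text content
}

// Hypothetical helper class, used only for this illustration.
public static class TextContentDemo
{
    public static string ToXml(Employee e)
    {
        XmlSerializer s = new XmlSerializer(typeof(Employee));
        StringWriter sw = new StringWriter();
        s.Serialize(sw, e);
        return sw.ToString();
    }

    public static void Main()
    {
        Employee e = new Employee();
        e.id = "333-33-3333";
        e.name = "Bob Smith";
        Console.WriteLine(ToXml(e));
    }
}
```

The serializer also emits an XML declaration and its usual xsi and xsd namespace declarations on the root element, so the output is somewhat longer than the target document shown earlier.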

Q I've noticed that XmlSerializer won't serialize objects that implement IDictionary by default. Is there any way around this?

A Unfortunately there's no way to make XmlSerializer function with IDictionary-derived objects since the infrastructure explicitly checks for IDictionary at run time and disables serialization. One way around this is to write a new class that wraps an IDictionary object and copies the values into an array of serializable objects. The XmlSerializer framework also has a hidden hook for writing custom serialization code. To take advantage of this hook you can implement an interface called IXmlSerializable for the object you want to serialize/deserialize.
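The first workaround, copying the dictionary into an array of serializable objects, might look something like the following minimal sketch. The Entry and DictionaryEntries classes and the ArrayWrapperDemo helper are hypothetical names chosen for illustration, and keys and values are flattened to strings as a simplifying assumption.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Xml.Serialization;

// Hypothetical serializable stand-in for one dictionary entry.
public class Entry
{
    public string Key;
    public string Value;
}

// Hypothetical wrapper: an array of entries that XmlSerializer can handle.
[XmlRoot("dictionary")]
public class DictionaryEntries
{
    [XmlElement("item")]
    public Entry[] Items;

    // Copy every key/value pair out of the dictionary as strings.
    public static DictionaryEntries FromDictionary(IDictionary d)
    {
        DictionaryEntries result = new DictionaryEntries();
        result.Items = new Entry[d.Count];
        int i = 0;
        foreach (object key in d.Keys)
        {
            Entry e = new Entry();
            e.Key = Convert.ToString(key);
            e.Value = Convert.ToString(d[key]);
            result.Items[i++] = e;
        }
        return result;
    }

    // Rebuild a Hashtable; the original key and value types are not restored.
    public Hashtable ToHashtable()
    {
        Hashtable ht = new Hashtable();
        if (Items != null)
            foreach (Entry e in Items)
                ht.Add(e.Key, e.Value);
        return ht;
    }
}

public static class ArrayWrapperDemo
{
    public static Hashtable RoundTrip(IDictionary d)
    {
        XmlSerializer xs = new XmlSerializer(typeof(DictionaryEntries));
        StringWriter sw = new StringWriter();
        xs.Serialize(sw, DictionaryEntries.FromDictionary(d));
        DictionaryEntries copy =
            (DictionaryEntries)xs.Deserialize(new StringReader(sw.ToString()));
        return copy.ToHashtable();
    }

    public static void Main()
    {
        Hashtable ht = new Hashtable();
        ht.Add(1, "Aaron");
        ht.Add(2, "Monica");
        Hashtable back = RoundTrip(ht);
        Console.WriteLine(back["1"]);
    }
}
```

This approach needs no special hooks, at the cost of an extra copy in each direction and the loss of the original key and value types.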

When you instantiate an XmlSerializer object, the constructor usually generates a temporary assembly containing XmlReader and XmlWriter code for moving between object instances and XML documents. Before doing this, however, it first checks to see if the supplied type implements IXmlSerializable and, if so, it generates code to call into the IXmlSerializable members instead. In other words, if you implement IXmlSerializable, you completely bypass the automatic serialization process and have the opportunity to provide your own.

IXmlSerializable is covered in the official documentation, but the documentation states it's not intended for public use and provides no information beyond that. This indicates that the development team wanted to reserve the right to modify, disable, or even completely remove this extensibility hook down the road. However, as long as you're willing to accept this uncertainty and deal with possible changes in the future, there's no reason whatsoever you can't take advantage of it.

Let's walk through a complete example of custom serialization to illustrate how it all fits together. To begin, here's what the interface looks like:

    interface IXmlSerializable
    {
        XmlSchema GetSchema();
        void ReadXml(XmlReader reader);
        void WriteXml(XmlWriter writer);
    }

ReadXml is called during deserialization, whereas WriteXml is called during serialization. The GetSchema method is called when the XML Schema is exported (either by using xsd.exe or via the .asmx WSDL generation process). To implement a class capable of serializing any IDictionary-derived object, you need to define a new class that implements IXmlSerializable and also wraps an IDictionary object, as illustrated in Figure 5.

Figure 5 IXmlSerializable Implementation

    using System.Collections;
    using System.Xml;
    using System.Xml.Schema;
    using System.Xml.Serialization;

    public class DictionarySerializer : IXmlSerializable
    {
        const string NS = "https://www.develop.com/xml/serialization";
        public IDictionary dictionary;

        public DictionarySerializer()
        {
            dictionary = new Hashtable();
        }
        public DictionarySerializer(IDictionary dictionary)
        {
            this.dictionary = dictionary;
        }
        public void WriteXml(XmlWriter w)
        {
            w.WriteStartElement("dictionary", NS);
            foreach (object key in dictionary.Keys)
            {
                object value = dictionary[key];
                w.WriteStartElement("item", NS);
                w.WriteElementString("key", NS, key.ToString());
                w.WriteElementString("value", NS, value.ToString());
                w.WriteEndElement();
            }
            w.WriteEndElement();
        }
        public void ReadXml(XmlReader r)
        {
            r.Read(); // move past container
            r.MoveToContent();
            // an empty dictionary is written as <dictionary />, with no end tag
            bool isEmpty = r.IsEmptyElement;
            r.ReadStartElement("dictionary", NS);
            if (!isEmpty)
            {
                r.MoveToContent();
                while (r.NodeType != XmlNodeType.EndElement)
                {
                    r.ReadStartElement("item", NS);
                    string key = r.ReadElementString("key", NS);
                    string value = r.ReadElementString("value", NS);
                    r.ReadEndElement();
                    r.MoveToContent();
                    dictionary.Add(key, value);
                }
                r.ReadEndElement(); // </dictionary>
            }
            // current .NET documentation requires ReadXml to consume
            // the container's end tag as well
            r.ReadEndElement();
        }
        public XmlSchema GetSchema()
        {
            return LoadSchema();
        }
        •••
    }

The WriteXml method generates an XML document capable of representing any dictionary object in the following form:

    <dictionary xmlns="https://www.develop.com/xml/serialization">
      <item>
        <key>5</key>
        <value>Nathan</value>
      </item>
      <item>
        <key>4</key>
        <value>Michael</value>
      </item>
      •••
    </dictionary>

The ReadXml method contains the necessary code for reading these documents and then converting the document back into a dictionary object form.

The following code illustrates how to serialize and deserialize a DictionarySerializer using XmlSerializer:

    void Serialize(IDictionary d, Stream s)
    {
        DictionarySerializer ds = new DictionarySerializer(d);
        XmlSerializer xs = new XmlSerializer(typeof(DictionarySerializer));
        xs.Serialize(s, ds);
    }
    IDictionary Deserialize(Stream s)
    {
        XmlSerializer xs = new XmlSerializer(typeof(DictionarySerializer));
        // Deserialize returns object, so cast and unwrap the dictionary
        DictionarySerializer ds = (DictionarySerializer)xs.Deserialize(s);
        return ds.dictionary;
    }

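With this in place, it's now possible to serialize anything that implements IDictionary, such as Hashtable, SortedList, ListDictionary, or HybridDictionary, as shown in Figure 6. Note that because WriteXml calls ToString on every key and value, and the default constructor creates a Hashtable, deserialization always produces a Hashtable of string keys and string values, whatever the original dictionary held.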

Figure 6 IDictionary Serialization

    // serialize a Hashtable
    FileStream fs = new FileStream("ht.xml", FileMode.Create);
    Hashtable ht = new Hashtable();
    ht.Add(1, "Aaron");
    // ... fill Hashtable
    Serialize(ht, fs);
    fs.Close();

    // serialize a SortedList
    FileStream fs2 = new FileStream("sl.xml", FileMode.Create);
    SortedList sl = new SortedList();
    sl.Add(1, "Aaron");
    // ... fill SortedList
    Serialize(sl, fs2);
    fs2.Close();
    •••
    // deserialize a Hashtable (Deserialize returns IDictionary, so cast)
    FileStream fs3 = new FileStream("ht.xml", FileMode.Open);
    Hashtable hash = (Hashtable)Deserialize(fs3);
    •••

Since the WebMethod framework also uses the XmlSerializer plumbing, it's now possible to return DictionarySerializer objects from WebMethods. For example, let's consider the WebMethod that returns a Hashtable through a DictionarySerializer wrapper as shown in the following code:

    [WebMethod]
    public DictionarySerializer GetHashTable()
    {
        Hashtable ht = new Hashtable();
        ht.Add(1, "Aaron");
        ht.Add(2, "Monica");
        ht.Add(3, "Michelle");
        return new DictionarySerializer(ht);
    }

If you execute this operation, the SOAP response looks something like the code shown here:

    <soap:Envelope
        xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <GetHashTableResponse
            xmlns="https://www.develop.com/xml/serialization/example">
          <GetHashTableResult>
            <dictionary xmlns="https://www.develop.com/xml/serialization">
              <item>
                <key>1</key>
                <value>Aaron</value>
              </item>
              •••

As you can see, I was able to customize the serialization process by simply implementing IXmlSerializable's ReadXml/WriteXml members. The last thing I need to do is customize the schema generation functionality so that clients know what to expect ahead of time. The framework typically generates the schema automatically, but since I've customized the serialization process, only I know what the schema should look like. Hence, the infrastructure will call GetSchema to retrieve the appropriate schema from my code at run time. As long as I implement this correctly, clients will have access to everything they need through the normal schema generation hooks.

In my implementation of GetSchema, I returned an XML Schema definition loaded into a System.Xml.Schema.XmlSchema object (see Figure 7). You'll see these definitions if you use xsd.exe to export the schema from the assembly or if you browse to the .asmx file and inspect the documentation. Notice that even the documentation page for GetHashTable now illustrates exactly what the client should expect to receive (see Figure 8).

Figure 7 GetSchema

    <s:schema elementFormDefault="qualified"
        targetNamespace="https://www.develop.com/xml/serialization"
        xmlns:tns="https://www.develop.com/xml/serialization"
        xmlns:s="http://www.w3.org/2001/XMLSchema" >
      <s:complexType name="DictionaryType">
        <s:sequence>
          <s:element maxOccurs="unbounded" name="item"
              type="tns:ItemType" />
        </s:sequence>
      </s:complexType>
      <s:complexType name="ItemType">
        <s:sequence>
          <s:element name="key" type="s:string" />
          <s:element name="value" type="s:string" />
        </s:sequence>
      </s:complexType>
      <s:element name="dictionary" type="tns:DictionaryType" />
    </s:schema>
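The body of the LoadSchema helper called from GetSchema is elided in Figure 5. One way it might be written, offered only as a sketch, is to parse the schema text with XmlSchema.Read. The SchemaLoader class name and the choice to embed the schema as a string constant are assumptions; the downloadable sample may well load it differently.

```csharp
using System;
using System.IO;
using System.Xml.Schema;

// Hypothetical helper: one possible shape for the elided LoadSchema method.
public static class SchemaLoader
{
    // The Figure 7 schema, written with single-quoted attributes
    // so that it fits in a C# verbatim string without escaping.
    const string SchemaText = @"
<s:schema elementFormDefault='qualified'
    targetNamespace='https://www.develop.com/xml/serialization'
    xmlns:tns='https://www.develop.com/xml/serialization'
    xmlns:s='http://www.w3.org/2001/XMLSchema'>
  <s:complexType name='DictionaryType'>
    <s:sequence>
      <s:element maxOccurs='unbounded' name='item' type='tns:ItemType' />
    </s:sequence>
  </s:complexType>
  <s:complexType name='ItemType'>
    <s:sequence>
      <s:element name='key' type='s:string' />
      <s:element name='value' type='s:string' />
    </s:sequence>
  </s:complexType>
  <s:element name='dictionary' type='tns:DictionaryType' />
</s:schema>";

    public static XmlSchema LoadSchema()
    {
        using (StringReader reader = new StringReader(SchemaText))
        {
            // A null handler means schema syntax errors surface as exceptions.
            return XmlSchema.Read(reader, null);
        }
    }

    public static void Main()
    {
        XmlSchema schema = LoadSchema();
        Console.WriteLine(schema.TargetNamespace);
    }
}
```

Keeping the schema text next to the ReadXml and WriteXml code makes it easier to keep the three in agreement, since nothing else checks that the schema matches what the class actually writes.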

Figure 8 GetHashtable

You can download a complete example from the MSDN Magazine Web page I cited earlier. Remember, this is an unsupported and undocumented area of the .NET Framework that may change in the future.

Send your questions and comments for Aaron to xmlfiles@microsoft.com.

Aaron Skonnard is an instructor/researcher at DevelopMentor, where he develops the XML and Web Service-related curriculum. Aaron coauthored Essential XML Quick Reference (Addison-Wesley, 2001) and Essential XML (Addison-Wesley, 2000).