Mapping the Object Hierarchy to XML Data

10/17/2014

When an XML document is in memory, the conceptual representation is a tree. For programming, you have an object hierarchy to access the nodes of the tree. The following example shows you how the XML content becomes nodes.

As the XML is read into the DOM, the pieces are translated into nodes, and each node retains additional metadata about itself, such as its node type and value. The node type corresponds to the object that represents the node, and it determines what actions can be performed and what properties can be set or retrieved.

If you have the following simple XML:

Input

<book>
    <title>The Handmaid's Tale</title>
</book>

The input is represented in memory as the following node tree with the assigned node type property:

[Figure: Book and title node tree representation]

The book element becomes an XmlElement object. The next element, title, also becomes an XmlElement, while the element content becomes an XmlText object. The methods and properties of XmlElement differ from those available on an XmlText object, so knowing what node type the XML markup becomes is vital: the node type determines the actions that can be performed.
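As a rough illustration of this mapping, the following sketch loads the same book XML into an XmlDocument and lists each node with its node type. The class name NodeTypeDump and the helper Describe are hypothetical names chosen for this illustration; LoadXml, ChildNodes, NodeType, Name, and Value are standard System.Xml members.

```csharp
using System;
using System.Text;
using System.Xml;

public class NodeTypeDump
{
    // Hypothetical helper: walks the tree depth-first and returns
    // one line per node, indented two spaces per level.
    public static string Describe(XmlNode node, int depth)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(new string(' ', depth * 2));
        sb.Append(node.NodeType);
        sb.Append(": ");
        // Text nodes carry their content in Value; other nodes are shown by Name.
        sb.Append(node.NodeType == XmlNodeType.Text ? node.Value : node.Name);
        sb.Append('\n');
        foreach (XmlNode child in node.ChildNodes)
        {
            sb.Append(Describe(child, depth + 1));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book><title>The Handmaid's Tale</title></book>");
        Console.Write(Describe(doc, 0));
    }
}
```

Run against the book input, the sketch lists a Document node, then the book and title Element nodes, and finally a Text node holding "The Handmaid's Tale".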

The following example reads in XML data and writes out different text, depending on the node type. It uses the following XML data file, items.xml, as input:

Input

<?xml version="1.0"?>
<!-- This is a sample XML document -->
<!DOCTYPE Items [<!ENTITY number "123">]>
<Items>
  <Item>Test with an entity: &number;</Item>
  <Item>test with a child element <more/> stuff</Item>
  <Item>test with a CDATA section <![CDATA[<456>]]> def</Item>
  <Item>Test with a char entity: &#65;</Item>
  <!-- Fourteen chars in this element.-->
  <Item>1234567890ABCD</Item>
</Items>

The following code example reads the items.xml file and displays information for each node type.

[Visual Basic]
Imports System
Imports System.IO
Imports System.Xml

Public Class Sample
   Private Const filename As String = "items.xml"
      
   Public Shared Sub Main()
      Dim txtreader As XmlTextReader = Nothing
      Dim reader As XmlValidatingReader = Nothing
      
      Try
         ' Load the reader with the data file and
         ' ignore all whitespace nodes.
         txtreader = New XmlTextReader(filename)
         txtreader.WhitespaceHandling = WhitespaceHandling.None
         
         ' Implement the validating reader over the text reader. 
         reader = New XmlValidatingReader(txtreader)
         reader.ValidationType = ValidationType.None
         
         ' Parse the file and display each of the nodes.
         While reader.Read()
            Select Case reader.NodeType
               Case XmlNodeType.Element
                  Console.Write("<{0}>", reader.Name)
               Case XmlNodeType.Text
                  Console.Write(reader.Value)
               Case XmlNodeType.CDATA
                  Console.Write("<![CDATA[{0}]]>", reader.Value)
               Case XmlNodeType.ProcessingInstruction
                  Console.Write("<?{0} {1}?>", reader.Name, reader.Value)
               Case XmlNodeType.Comment
                  Console.Write("<!--{0}-->", reader.Value)
               Case XmlNodeType.XmlDeclaration
                  Console.Write("<?xml version='1.0'?>")
               Case XmlNodeType.Document
               Case XmlNodeType.DocumentType
                  Console.Write("<!DOCTYPE {0} [{1}]", reader.Name, reader.Value)
               Case XmlNodeType.EntityReference
                  Console.Write(reader.Name)
               Case XmlNodeType.EndElement
                  Console.Write("</{0}>", reader.Name)
            End Select
         End While
      
      Finally
         If Not (reader Is Nothing) Then
            reader.Close()
         End If
      End Try
   End Sub 'Main
End Class 'Sample
[C#]
using System;
using System.IO;
using System.Xml;

public class Sample
{
  private const String filename = "items.xml";

  public static void Main()
  {
     XmlTextReader txtreader = null;
     XmlValidatingReader reader = null;

     try
     {  
        // Load the reader with the data file and ignore
        // all whitespace nodes.
        txtreader = new XmlTextReader(filename);
        txtreader.WhitespaceHandling = WhitespaceHandling.None;

        // Implement the validating reader over the text reader. 
        reader = new XmlValidatingReader(txtreader);
        reader.ValidationType = ValidationType.None;

        // Parse the file and display each of the nodes.
        while (reader.Read())
        {
           switch (reader.NodeType)
           {
             case XmlNodeType.Element:
               Console.Write("<{0}>", reader.Name);
               break;
             case XmlNodeType.Text:
               Console.Write(reader.Value);
               break;
             case XmlNodeType.CDATA:
               Console.Write("<![CDATA[{0}]]>", reader.Value);
               break;
             case XmlNodeType.ProcessingInstruction:
               Console.Write("<?{0} {1}?>", reader.Name, reader.Value);
               break;
             case XmlNodeType.Comment:
               Console.Write("<!--{0}-->", reader.Value);
               break;
             case XmlNodeType.XmlDeclaration:
               Console.Write("<?xml version='1.0'?>");
               break;
             case XmlNodeType.Document:
               break;
             case XmlNodeType.DocumentType:
               Console.Write("<!DOCTYPE {0} [{1}]", reader.Name, reader.Value);
               break;
             case XmlNodeType.EntityReference:
               Console.Write(reader.Name);
               break;
             case XmlNodeType.EndElement:
               Console.Write("</{0}>", reader.Name);
               break;
           }       
        }           
     }

     finally
     {
        if (reader!=null)
          reader.Close();
     }
  }
} // End class

The output from the example reveals the mapping of the data to the node types.

Output

<?xml version='1.0'?><!-- This is a sample XML document --><!DOCTYPE Items [<!ENTITY number "123">]<Items><Item>Test with an entity: 123</Item><Item>test with a child element <more> stuff</Item><Item>test with a CDATA section <![CDATA[<456>]]> def</Item><Item>Test with a char entity: A</Item><!-- Fourteen chars in this element.--><Item>1234567890ABCD</Item></Items>

Taking the input one line at a time and comparing it with the output generated by the code, you can use the following table to see which node type test generated each piece of output, and therefore which XML data became which node type.

Input	Output	Node Type Test
<?xml version="1.0"?>	<?xml version='1.0'?>	XmlNodeType.XmlDeclaration
<!-- This is a sample XML document -->	<!-- This is a sample XML document -->	XmlNodeType.Comment
<!DOCTYPE Items [<!ENTITY number "123">]>	<!DOCTYPE Items [<!ENTITY number "123">]	XmlNodeType.DocumentType
<Items>	<Items>	XmlNodeType.Element
<Item>	<Item>	XmlNodeType.Element
Test with an entity: &number;	Test with an entity: 123	XmlNodeType.Text
</Item>	</Item>	XmlNodeType.EndElement
<Item>	<Item>	XmlNodeType.Element
test with a child element	test with a child element	XmlNodeType.Text
<more/>	<more>	XmlNodeType.Element
stuff	stuff	XmlNodeType.Text
</Item>	</Item>	XmlNodeType.EndElement
<Item>	<Item>	XmlNodeType.Element
test with a CDATA section	test with a CDATA section	XmlNodeType.Text
<![CDATA[<456>]]>	<![CDATA[<456>]]>	XmlNodeType.CDATA
def	def	XmlNodeType.Text
</Item>	</Item>	XmlNodeType.EndElement
<Item>	<Item>	XmlNodeType.Element
Test with a char entity: &#65;	Test with a char entity: A	XmlNodeType.Text
</Item>	</Item>	XmlNodeType.EndElement
<!-- Fourteen chars in this element.-->	<!-- Fourteen chars in this element.-->	XmlNodeType.Comment
<Item>	<Item>	XmlNodeType.Element
1234567890ABCD	1234567890ABCD	XmlNodeType.Text
</Item>	</Item>	XmlNodeType.EndElement
</Items>	</Items>	XmlNodeType.EndElement

You must know what node type is assigned, as the node type controls what kinds of actions are valid, and what kind of properties you can set and retrieve.
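To make this concrete, the sketch below applies a different action to each node depending on its type: it sets an attribute on elements and appends a character to text nodes. The class name NodeActions, the helper Touch, and the attribute name visited are hypothetical and exist only for this illustration; SetAttribute is an XmlElement member, and AppendData is a member that XmlText inherits from XmlCharacterData.

```csharp
using System;
using System.Xml;

public class NodeActions
{
    // Hypothetical helper: performs an action that is valid only
    // for the concrete node type, then recurses into the children.
    public static void Touch(XmlNode node)
    {
        XmlElement element = node as XmlElement;
        if (element != null)
        {
            // Attributes can be set on elements, not on text nodes.
            element.SetAttribute("visited", "true");
        }

        XmlText text = node as XmlText;
        if (text != null)
        {
            // AppendData exists on character data nodes, not on elements.
            text.AppendData("!");
        }

        foreach (XmlNode child in node.ChildNodes)
        {
            Touch(child);
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book><title>The Handmaid's Tale</title></book>");
        Touch(doc);
        Console.WriteLine(doc.OuterXml);
    }
}
```

After the call, both elements carry the visited attribute and the title text ends with an exclamation mark. Neither action changes the number of child nodes, so it is safe to perform them while enumerating ChildNodes.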

Node creation for white space is controlled by the PreserveWhitespace flag when the data is loaded into the DOM. For more information, see White Space and Significant White Space Handling when Loading the DOM.
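As a minimal sketch of the effect of this flag, the following code loads the same indented XML twice and counts the child nodes of the root element. The class name WhitespaceDemo and the helper CountRootChildren are hypothetical names used only here; PreserveWhitespace, LoadXml, and DocumentElement are XmlDocument members.

```csharp
using System;
using System.Xml;

public class WhitespaceDemo
{
    // Hypothetical helper: loads the XML with the given setting and
    // returns the number of child nodes under the root element.
    public static int CountRootChildren(string xml, bool preserve)
    {
        XmlDocument doc = new XmlDocument();
        // The flag must be set before the data is loaded.
        doc.PreserveWhitespace = preserve;
        doc.LoadXml(xml);
        return doc.DocumentElement.ChildNodes.Count;
    }

    public static void Main()
    {
        string xml = "<Items>\n  <Item>one</Item>\n  <Item>two</Item>\n</Items>";
        // Only the two Item elements become nodes.
        Console.WriteLine(CountRootChildren(xml, false));
        // The line breaks and indentation also become white space nodes.
        Console.WriteLine(CountRootChildren(xml, true));
    }
}
```

With the flag off, the root has two children. With the flag on, the three runs of white space between the tags are kept as additional nodes, giving five children.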

To add new nodes to the DOM, see Inserting Nodes into an XML Document. To remove nodes from the DOM, see Removing Nodes, Content, and Values from an XML Document. To modify the content of nodes in the DOM, see Modifying Nodes, Content, and Values in an XML Document.
