Reading XML Data with XmlTextReader

The XmlTextReader class is an implementation of XmlReader that provides a fast parser. It enforces the rules that XML must be well-formed. It does not validate the document, because it does not apply DTD or schema information. It can read text in blocks, or read characters from a stream.

The XmlTextReader provides the following functionality:

  • Enforces the rules that XML must be well-formed.
  • Checks that the DTD is well-formed. However, it does not use the DTD for validation, for expanding entity references, or for adding default attributes.
  • Does not validate against DTDs or schemas.
  • Checks that DOCTYPE nodes are well-formed.
  • Checks that entities are well-formed. For node types of EntityReference, a single, empty EntityReference node is returned, that is, a node whose Value property is string.Empty. The node is empty because the XmlTextReader does not use a DTD or schema to expand the entity reference. The XmlTextReader does ensure that the whole DTD is well-formed, including the EntityReference nodes.
  • Parses quickly, because the XmlTextReader does not have the overhead involved with validation checking.
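The behavior listed above can be illustrated with a minimal sketch. The sample documents, the entity name author, and the helper methods DescribeNodes and IsWellFormed are hypothetical and exist only for this illustration. The first document declares an entity in an internal DTD subset, so the entity reference is expected to surface as an EntityReference node with an empty Value. The second document has a mismatched end tag and is expected to raise an XmlException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class WellFormednessDemo
{
    // Hypothetical sample: an internal DTD subset declares the entity "author".
    public const string WithEntity =
        "<!DOCTYPE book [<!ENTITY author \"Jane Doe\">]>" +
        "<book><by>&author;</by></book>";

    // Hypothetical sample: <by> is never closed, so the input is not well-formed.
    public const string Malformed = "<book><by>Jane Doe</book>";

    // Reads every node and returns one "NodeType:Name:Value" entry per node.
    public static List<string> DescribeNodes(string xml)
    {
        var nodes = new List<string>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                nodes.Add(reader.NodeType + ":" + reader.Name + ":" + reader.Value);
            }
        }
        return nodes;
    }

    // Returns true if the whole input can be read without an XmlException.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            DescribeNodes(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        foreach (string node in DescribeNodes(WithEntity))
        {
            Console.WriteLine(node);
        }
        Console.WriteLine("Malformed input is well-formed: " + IsWellFormed(Malformed));
    }
}
```

Note that the reader reports the well-formedness error only when it reaches the offending markup, so nodes that precede the error are still returned before the exception is thrown.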

The XmlTextReader can read data from different inputs, such as a stream object, a TextReader object, and a URL identifying a local file location or Web site.
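As a rough sketch of these input options, the example below builds an XmlTextReader from a TextReader, from a Stream, and from a file path passed as the URL. The sample XML, the helper names, and the temporary file are assumptions made for the illustration. Each reader is passed to the same counting routine, since the reading code does not depend on where the data comes from.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class InputSourcesDemo
{
    // Hypothetical sample document.
    public const string Sample = "<items><item/><item/></items>";

    // Counts element nodes; the logic is the same for every input type.
    public static int CountElements(XmlTextReader reader)
    {
        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                count++;
            }
        }
        return count;
    }

    // Input 1: a TextReader (here a StringReader over an in-memory string).
    public static int CountFromTextReader(string xml)
    {
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            return CountElements(reader);
        }
    }

    // Input 2: a Stream (here a MemoryStream holding UTF-8 bytes).
    public static int CountFromStream(string xml)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        using (var reader = new XmlTextReader(stream))
        {
            return CountElements(reader);
        }
    }

    // Input 3: a URL string (here the path of a temporary local file).
    public static int CountFromUrl(string xml)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, xml);
            using (var reader = new XmlTextReader(path))
            {
                return CountElements(reader);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountFromTextReader(Sample));
        Console.WriteLine(CountFromStream(Sample));
        Console.WriteLine(CountFromUrl(Sample));
    }
}
```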

The XmlTextReader uses an XmlResolver to locate external resources, such as DTDs, so it can check DTDs to see if they are well-formed. For more information on the XmlResolver, see Resolving Resources Using the XmlResolver.

The XML declaration, for example <?xml version="1.0" encoding="ISO-8859-5"?>, can contain an encoding attribute that sets the encoding for your document. The XmlTextReader has an Encoding property that returns the character encoding found in the encoding attribute in the XML declaration. If no encoding attribute is found, the default for the document is set to UTF-8.

If an external resource is read, such as a DTD or a schema file, the encoding is set to the encoding value found in the external reference. If no encoding is found in the external reference, the default is set to UTF-8. The XmlTextReader uses the System.Text.Encoding class, so all encodings supported by that class are also supported by the XmlTextReader. The only encodings not supported are ones that map the <?xml sequence to different byte values than UTF-8, such as UTF-7 and EBCDIC.
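The Encoding property described above can be inspected with a short sketch such as the following. The helper name ReportedEncoding and the sample documents are hypothetical. The example uses ISO-8859-1 rather than ISO-8859-5 as an assumption for portability, because some runtimes may need additional code-page registration before ISO-8859-5 is available. The input is supplied as a byte stream without a byte-order mark, so the reader has to rely on the XML declaration or fall back to the default.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingDemo
{
    // Hypothetical sample with an explicit encoding attribute.
    public const string Declared =
        "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><doc>caf\u00e9</doc>";

    // Hypothetical sample without an XML declaration.
    public const string Undeclared = "<doc/>";

    // Encodes the text to bytes (GetBytes writes no byte-order mark), parses the
    // bytes from a stream, and returns the name of the encoding the reader reports.
    public static string ReportedEncoding(string xml, Encoding bytesEncoding)
    {
        byte[] bytes = bytesEncoding.GetBytes(xml);
        using (var reader = new XmlTextReader(new MemoryStream(bytes)))
        {
            // The Encoding property is only meaningful once reading has started.
            reader.Read();
            return reader.Encoding.WebName;
        }
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Console.WriteLine(ReportedEncoding(Declared, latin1));
        Console.WriteLine(ReportedEncoding(Undeclared, Encoding.UTF8));
    }
}
```

Reading from a Stream rather than a string matters here: a string has already been decoded, so the encoding attribute no longer describes how the characters were stored.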

See Also

Full Content Reads using Character Streams | Document Type Declaration Information | Handling White Space with XmlTextReader | Attribute Value Normalization | Exception Handling using XmlException in XmlTextReader | XmlReader Class | XmlReader Members | XmlNodeReader Class | XmlNodeReader Members | XmlTextReader Class | XmlTextReader Members | XmlValidatingReader Class | XmlValidatingReader