Microsoft Office Word 2003 XML Object Model Overview
Paul Cornell
Microsoft Corporation
March 2003
Applies to:
Microsoft® Office Word 2003
Summary: Learn how to work with the Microsoft Office Word 2003 object model to programmatically perform XML-related actions such as managing XML schemas, namespaces, elements, attributes, and transforms. (18 pages)
Download odc_wdxmlom.exe.
Contents
Introduction
What You Can Do with the Object Model
New Objects and Collections
Enhancements to Existing Objects and Collections
Code Snippets
Code Example
Conclusion
Additional Resources
Introduction
Microsoft® Word 2000 and Word 2002 have extremely limited XML support, and what support exists is internal to Word. Documents saved in HTML format in Word 2000 and Word 2002 have some islands of XML data saved within them, but you cannot use either version to natively create or save XML documents. In the next version of Word, Microsoft Office Word 2003, you can open, modify, and save XML documents that conform to the Word XML document schema or to any customer-defined schema. You can also apply XSLT transformations to XML documents during file-open and file-save operations. This article explains how to use the Word 2003 object model to programmatically perform XML-related actions using Microsoft Visual Basic® for Applications (VBA) code.
Note The information in this article pertains to Word 2003 only and is expected to change with future releases of Word 2003.
What You Can Do with the Object Model
Additions to the Word 2003 object model allow you to programmatically perform XML-related actions in Word such as:
- Add schemas to the schema library.
- Attach schemas to XML documents.
- Manage elements and attributes in XML documents (add, change, delete, cut, copy, and so on).
- Save the document as a plain XML document in a selected customer-defined schema, without any Word-specific markup.
- Save the document as a rich Word XML document including all the information that is saved in the .doc format (such as custom property metadata and so on).
- Apply customer-defined XSLT transforms before saving the XML document.
- Trap events after an element from a customer-defined schema is inserted, before an element is deleted, and when the user moves among elements in an XML document.
- Validate selected nodes on an as-needed basis against schemas.
- Trap events after a violation of a customer-defined schema has occurred.
- Provide custom XML document validation error handling actions.
New Objects and Collections
The Word 2003 object model has five new objects and five new collections that programmatically support XML-related actions. For information on the members of these objects and collections, see Word 2003 online programming Help or the "Code Snippets" section later in this article.
The XMLChildNodeSuggestion object and the XMLChildNodeSuggestions collection represent one or more elements that are allowed as children of the currently selected element according to the schema attached to the XML document. You can navigate the XMLChildNodeSuggestions collection using an ordinal index number or an element name:
ActiveDocument.ChildNodeSuggestions.Item(Index:=1).Insert
ActiveDocument.ChildNodeSuggestions. _
    Item(Index:="sales_order").Insert
The XMLNamespace object and the XMLNamespaces collection represent one or more registered namespace URIs in the user's schema library. You can navigate the XMLNamespaces collection using an ordinal index number or a namespace URI:
Application.XMLNamespaces.Item(Index:=1).AttachToDocument Document:=ActiveDocument
Application.XMLNamespaces.Item _
    (Index:="myCompany.schemas.com.sales_order").AttachToDocument _
    Document:=ActiveDocument
The XMLNode object and the XMLNodes collection represent one or more elements in an XML document or range, or the child nodes of a containing element. You can navigate the XMLNodes collection by ordinal index number only, for example:
MsgBox Prompt:=ActiveDocument.XMLNodes.Item(Index:=1).BaseName
The XMLSchemaReference object and the XMLSchemaReferences collection represent one or more schemas for each unique namespace that is attached to a document. You can navigate the XMLSchemaReferences collection using an ordinal index number or a namespace URI, for example:
ActiveDocument.XMLSchemaReferences.Item(Index:=1).Reload
ActiveDocument.XMLSchemaReferences.Item _
    (Index:="myCompany.schemas.com.sales_order").Reload
The XSLTransform object and the XSLTransforms collection represent one or more transforms registered for Word only. You can navigate the XSLTransforms collection by using an ordinal index number or by using an alias. If no transform is found with that alias, Word attempts to match the value against the transform's ID. For example:
MsgBox Application.XMLNamespaces.Item _
    (Index:=1).XSLTransforms.Item(Index:=1).Location
MsgBox Application.XMLNamespaces.Item _
    (Index:="myCompany.schemas.com.sales_order") _
    .XSLTransforms.Item(Index:="Elegant Memo").Location
Note The Word 2003 XML object model was designed to resemble the existing Microsoft XML (MSXML) parser document object model. If you are familiar with the MSXML parser, you should find the Word XML object model familiar and intuitive. In fact, Word XML validation and parsing is done with MSXML in the background; Word 2003 uses Microsoft XML, v5.0, an incremental update to the existing Microsoft XML, v4.0 (Microsoft XML Core Services).
Enhancements to Existing Objects and Collections
The Word 2003 object model enhances the Application, Document, Options, Range, and Selection objects:
The Application object has one new property and two new events:
- The XMLNamespaces property accesses the application's XMLNamespaces collection.
- The XMLSelectionChange event fires whenever changes occur in the most immediate parent element of the current Selection object.
- The XMLValidationError event fires whenever a validation error occurs anywhere in the document.
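The two Application events can be trapped from a class module that declares the Application object WithEvents. The following is a minimal sketch, not a definitive implementation. The class name clsXMLAppEvents, the variable names, and the handler bodies are hypothetical, and the event parameter lists are assumptions based on the released Word 2003 type library; let the Visual Basic Editor generate the procedure stubs to confirm them.

```vba
' --- Class module named clsXMLAppEvents (hypothetical name) ---
Option Explicit

Public WithEvents App As Word.Application

' Assumed signature; confirm it in the Object Browser.
Private Sub App_XMLSelectionChange(ByVal Sel As Word.Selection, _
    ByVal OldXMLNode As Word.XMLNode, _
    ByVal NewXMLNode As Word.XMLNode, _
    Reason As Long)
    ' Guard against the case where no element contains the selection.
    If Not NewXMLNode Is Nothing Then
        App.StatusBar = "Current element: " & NewXMLNode.BaseName
    End If
End Sub

' Assumed signature; confirm it in the Object Browser.
Private Sub App_XMLValidationError(ByVal XMLNode As Word.XMLNode)
    Debug.Print "Validation error in " & XMLNode.BaseName & ": " & _
        XMLNode.ValidationErrorText
End Sub

' --- Standard module ---
Private objEvents As clsXMLAppEvents

Public Sub StartXMLEventTrapping()
    ' Keep the object in a module-level variable so it stays alive.
    Set objEvents = New clsXMLAppEvents
    Set objEvents.App = Word.Application
End Sub
```

Run StartXMLEventTrapping once, for example from an AutoExec macro, to begin receiving the events.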
The Document object has eight new properties, one new method, and two new events:
- The XMLHideNamespaces property hides namespace aliases in the XML Structure task pane.
- The XMLNodes property returns a collection of all of the document's XML elements, in linear order.
- The XMLSaveDataOnly property specifies whether the document is saved in the Word XML document schema format (False) or whether only the markup from the customer-defined XML schema is saved, without any Word schema elements (True).
- The XMLSaveThroughXSLT property specifies the path to the transform to apply when the document is saved as XML. Note that the XMLUseXSLTWhenSaving property must also be set to True.
- The XMLSchemaReferences property accesses the document's XMLSchemaReferences collection.
- The XMLSchemaViolations property returns a collection of elements that have validation errors associated with them.
- The XMLShowAdvancedErrors property displays more advanced validation error messages in the Word user interface when users rest their mouse pointer on visual validation error cues such as purple squiggles in the editing window or on strikeout symbols in the XML Structure task pane.
Note Advanced validation errors come directly from the MSXML parser 5.0.
- The XMLUseXSLTWhenSaving property specifies whether a transform should be applied to a document saved as XML (True).
- The TransformDocument method applies the specified transform to the document and reopens the document.
- The XMLAfterInsert event fires right after a new element has been inserted anywhere in the document.
- The XMLBeforeDelete event fires just before an element is deleted from anywhere in the document.
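The three save-related properties are typically set together before the document is saved as XML. The following sketch shows one way to combine them. The procedure name and both file paths are hypothetical, and the sketch assumes that the wdFormatXML file format constant is available, as it is in the released version of Word 2003.

```vba
Private Sub SaveDataOnlyThroughTransform()
    ' Hypothetical paths; replace them with files on your computer.
    Const XSLT_PATH As String = _
        "C:\temp\sales_order_to_purchase_order.xsl"
    Const OUTPUT_PATH As String = "C:\temp\purchase_order.xml"

    With ActiveDocument
        ' Save only the customer-defined markup.
        .XMLSaveDataOnly = True
        ' The flag and the transform path must both be set.
        .XMLUseXSLTWhenSaving = True
        .XMLSaveThroughXSLT = XSLT_PATH
        .SaveAs FileName:=OUTPUT_PATH, FileFormat:=wdFormatXML
    End With
End Sub
```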
The Options object has one new property:
- The PrintXMLTag property determines whether XML tags will appear when a document is printed (True).
The Range and Selection objects each have three new properties and one new method:
- The XML property returns the Word XML document schema representation of the Range or Selection object, including any customer-defined XML within it. The returned String value is a full, well-formed Word XML document with the w:WordDocument element as the root element, but its w:body element contains only the Word XML document schema representation of the contents of the Range or Selection object.
- The XMLNodes property returns a collection of elements representing all of the XML elements entirely within, or partially overlapping, the Range or Selection object.
- The XMLParentNode property returns the element representing the immediate parent element of the current Range or Selection object.
- The InsertXML method replaces the contents of the Range or Selection object with a String value marked up as well-formed XML data.
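The XML property and the InsertXML method can be used together to copy content as XML from one place to another. The following is a minimal sketch under the assumption that the active document has at least one paragraph; the procedure and variable names are hypothetical.

```vba
Private Sub CopyFirstParagraphAsXML()
    Dim strXML As String
    Dim objTarget As Word.Range

    ' Get the Word XML representation of the first paragraph.
    strXML = ActiveDocument.Paragraphs(1).Range.XML

    ' Collapse a range to the end of the document and insert there,
    ' so that no existing content is replaced.
    Set objTarget = ActiveDocument.Content
    objTarget.Collapse Direction:=wdCollapseEnd
    objTarget.InsertXML XML:=strXML
End Sub
```

Because InsertXML replaces the contents of its range, collapsing the range first is what turns the call into a pure insertion.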
Code Snippets
This section lists code snippets grouped by common XML programming tasks. Note that the code snippets do not cover all of the properties, methods, and events in the object model. In most cases, a code snippet that is not enclosed within Private Sub and End Sub statements can be run from the Immediate window or pasted into existing code. To find a desired code snippet, locate a general task area heading and then locate the specific task heading; the code snippet resides beneath the specific task heading. Each code snippet lists any conditions that are needed before the code can successfully run.
Working with XML Child Node Suggestions
Insert an XML child element
This code snippet inserts a sales_order child element into the active document. This code snippet assumes that an XML-formatted document is active and that an element named sales_order is valid for the document.
ActiveDocument.ChildNodeSuggestions.Item(Index:="sales_order").Insert
Insert all possible child XML elements for a given XML element
This code snippet inserts the root element and all of its child elements into the active XML document, each around a separate paragraph. This code snippet assumes you have already attached a schema to the active document and that you have not yet added the root element to the document. Note that this code snippet inserts the elements in the order that they are displayed in the XML Structure task pane.
Private Sub InsertAllAllowedChildElements()
Dim objCNS As Word.XMLChildNodeSuggestion
ActiveDocument.ChildNodeSuggestions(1).Insert
Selection.InsertParagraph
Selection.MoveRight
For Each objCNS In _
ActiveDocument.XMLNodes(1).ChildNodeSuggestions
objCNS.Insert
Selection.MoveRight
Selection.InsertParagraph
Selection.MoveRight
Next objCNS
End Sub
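Before inserting a suggested element, it can help to see which elements Word currently allows. The following sketch prints the suggestions to the Immediate window. It assumes a schema is attached to the active document and that the BaseName and NamespaceURI properties are available on the XMLChildNodeSuggestion object, as they are in the released object model; the procedure name is hypothetical.

```vba
Private Sub ListChildNodeSuggestions()
    Dim objCNS As Word.XMLChildNodeSuggestion

    If ActiveDocument.ChildNodeSuggestions.Count = 0 Then
        Debug.Print "Word suggests no elements for this document."
        Exit Sub
    End If

    ' Print each suggested element name with its namespace URI.
    For Each objCNS In ActiveDocument.ChildNodeSuggestions
        Debug.Print objCNS.BaseName & " (" & objCNS.NamespaceURI & ")"
    Next objCNS
End Sub
```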
Working with XML Namespaces
Add an XML namespace to the schema library
This code snippet demonstrates how to add a namespace and a schema to the schema library for all users on a computer. This code snippet assumes that a schema exists at C:\temp\sales_order.xsd.
Application.XMLNamespaces.Add _
Path:="C:\temp\sales_order.xsd", _
Alias:="mCsO", _
InstallForAllUsers:=True
Attach an XML namespace to the active document
This code snippet attaches a namespace (and the associated schema) to the active XML document. This code snippet assumes an XML schema with the namespace URI myCompany.schemas.com.sales_order already exists in the user's schema library.
Application.XMLNamespaces.Item _
(Index:="myCompany.schemas.com.sales_order"). _
AttachToDocument Document:=ActiveDocument
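Because the snippet above fails if the namespace is not in the schema library, you may want to check first. The following sketch shows one possible guard. The function and procedure names are hypothetical, and the sketch assumes that the XMLNamespace object exposes a URI property, as it does in the released object model.

```vba
Private Function NamespaceIsRegistered(ByVal strURI As String) As Boolean
    Dim objNamespace As Word.XMLNamespace

    ' Compare each namespace in the schema library with the URI.
    For Each objNamespace In Application.XMLNamespaces
        If objNamespace.URI = strURI Then
            NamespaceIsRegistered = True
            Exit Function
        End If
    Next objNamespace
End Function

Private Sub AttachIfRegistered()
    Const NS_URI As String = "myCompany.schemas.com.sales_order"

    If NamespaceIsRegistered(NS_URI) Then
        Application.XMLNamespaces.Item(Index:=NS_URI). _
            AttachToDocument Document:=ActiveDocument
    Else
        MsgBox Prompt:="The namespace " & NS_URI & _
            " is not in the schema library."
    End If
End Sub
```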
Working with XML Schema References
Add a schema reference to the schema library
This code snippet demonstrates how to add a schema reference, along with an alias, to the active document and to the schema library at the same time, with a single command. In other words, it combines the two previous examples into a single VBA statement. This code snippet assumes that an XML schema exists at C:\temp\sales_order.xsd.
ActiveDocument.XMLSchemaReferences.Add _
NamespaceURI:="myCompany.schemas.com.sales_order", _
Alias:="mCsO", _
FileName:="C:\temp\sales_order.xsd", _
InstallForAllUsers:=True
Reload an XML schema referenced by the active document
This code snippet reloads a schema referenced by the active XML document. This is especially useful while you are developing a schema and testing it in Word: once Word loads a schema, it caches the schema until Word shuts down, so the Reload method is how you pick up changes that you have made to the schema in the meantime. This code snippet assumes the active document references the namespace URI myCompany.schemas.com.sales_order.
ActiveDocument.XMLSchemaReferences.Item _
(Index:="myCompany.schemas.com.sales_order").Reload
Validate all XML schemas referenced by the active document
This code snippet validates the active document against all schemas referenced by the active document and then shows the number of XML nodes in the document that violate the schemas.
ActiveDocument.XMLSchemaReferences.Validate
MsgBox "Number of violations: " & _
ActiveDocument.XMLSchemaViolations.Count
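A count alone does not say which elements are invalid. The following sketch extends the snippet above by printing each violating element and its error text to the Immediate window. It assumes that the XMLSchemaViolations property returns XMLNode objects that expose a ValidationErrorText property, as in the released object model; the procedure name is hypothetical.

```vba
Private Sub ListSchemaViolations()
    Dim objNode As Word.XMLNode

    ActiveDocument.XMLSchemaReferences.Validate

    ' Print the name and error text of each violating element.
    For Each objNode In ActiveDocument.XMLSchemaViolations
        Debug.Print objNode.BaseName & ": " & _
            objNode.ValidationErrorText
    Next objNode
End Sub
```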
Working with Transforms
Associate a transform with a namespace in the schema library
This code snippet associates a transform with a namespace in the schema library for all users of the computer. This causes Word to automatically apply the specified XSL transformation any time an XML document is opened whose root element is in the specified namespace. This code snippet assumes the namespace URI myCompany.schemas.com.sales_order is available in the schema library and that a transform exists at C:\temp\sales_order_to_purchase_order.xsl.
Application.XMLNamespaces.Item _
(Index:="myCompany.schemas.com.sales_order").XSLTransforms.Add _
Location:="C:\temp\sales_order_to_purchase_order.xsl", _
Alias:="Sales Order to Purchase Order", _
InstallForAllUsers:=True
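To confirm that a transform was registered, you can list the transforms associated with the namespace. The following is a minimal sketch that assumes the same namespace URI as above and relies only on the Alias and Location properties of the XSLTransform object; the procedure name is hypothetical.

```vba
Private Sub ListTransformsForNamespace()
    Dim objTransform As Word.XSLTransform

    With Application.XMLNamespaces.Item _
        (Index:="myCompany.schemas.com.sales_order")
        ' Print the alias and file location of each transform.
        For Each objTransform In .XSLTransforms
            Debug.Print objTransform.Alias & " -> " & _
                objTransform.Location
        Next objTransform
    End With
End Sub
```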
Use a transform to transform an XML document
This code snippet transforms the active document from one customer-defined schema to another. This code snippet assumes that the active document contains XML from some customer-defined schema and that the transform exists at C:\temp\sales_order_to_purchase_order.xsl. It applies that transform only to the customer-defined XML and not the Word XML document schema representation of the document.
ActiveDocument.TransformDocument _
Path:="C:\temp\sales_order_to_purchase_order.xsl", _
DataOnly:=True
Working with XML Nodes
Select a single element and retrieve the element's value as text and XML
This code snippet prints the value of the description element belonging to the second item_sold element in the XML document. The value is printed as both text and XML. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI, uses the mCsO namespace alias, and that the XPath statement resolves to the description element in the document.
Private Sub ElementAsTextAndXML()
Dim objNode As Word.XMLNode
' A string literal cannot span a line continuation,
' so the XPath expression is concatenated.
Set objNode = _
    ActiveDocument.SelectSingleNode _
    (XPath:="//mCsO:sales_order/mCsO:items_sold/" & _
    "mCsO:item_sold[2]/mCsO:description", PrefixMapping:= _
    "xmlns:mCsO='myCompany.schemas.com.sales_order'")
MsgBox Prompt:=objNode.Text
MsgBox Prompt:=objNode.XML
End Sub
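The snippet above raises a run-time error if the XPath statement matches nothing, because objNode is then not set. The following sketch adds a guard, on the assumption that SelectSingleNode returns Nothing when no element matches; the procedure name and message text are hypothetical.

```vba
Private Sub ElementTextIfPresent()
    Dim objNode As Word.XMLNode

    Set objNode = ActiveDocument.SelectSingleNode( _
        XPath:="//mCsO:sales_order/mCsO:items_sold/" & _
        "mCsO:item_sold[2]/mCsO:description", _
        PrefixMapping:= _
        "xmlns:mCsO='myCompany.schemas.com.sales_order'")

    ' Test the result before using it.
    If objNode Is Nothing Then
        MsgBox Prompt:="No matching description element was found."
    Else
        MsgBox Prompt:=objNode.Text
    End If
End Sub
```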
Select multiple elements
This code snippet returns all of the items_sold element's child elements in the XML document and prints the number of child elements. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI, uses the mCsO namespace alias, and that the XPath statement resolves to the items_sold element in the document.
Private Sub SelectMultipleElements()
Dim objNodes As Word.XMLNodes
Dim objNode As Word.XMLNode
Set objNode = ActiveDocument.SelectSingleNode _
(XPath:="//mCsO:sales_order/mCsO:items_sold", _
PrefixMapping:="xmlns:mCsO='myCompany.schemas.com.sales_order'")
Set objNodes = objNode.ChildNodes
MsgBox Prompt:="There are " & objNodes.Count & _
" child nodes in the " & objNode.BaseName & " element."
End Sub
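When you want every element that matches an XPath statement rather than the children of one element, the released object model also provides a SelectNodes method on the Document object. This article does not otherwise cover that method, so treat the following as a sketch under that assumption; the procedure name is hypothetical.

```vba
Private Sub ListItemDescriptions()
    Dim objNodes As Word.XMLNodes
    Dim objNode As Word.XMLNode

    ' Select the description element of every item_sold element.
    Set objNodes = ActiveDocument.SelectNodes( _
        XPath:="//mCsO:item_sold/mCsO:description", _
        PrefixMapping:= _
        "xmlns:mCsO='myCompany.schemas.com.sales_order'")

    For Each objNode In objNodes
        Debug.Print objNode.Text
    Next objNode
End Sub
```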
Retrieve an element's contents as a Range object
This code snippet changes the value of the description element belonging to the second item_sold element in the XML document. To verify that the contents have been changed, the element's text is printed with the new value. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI, uses the mCsO namespace alias, and that the XPath statement resolves to the description element in the document.
Private Sub ElementAsRange()
Dim objNode As Word.XMLNode
Dim objRange As Word.Range
Set objNode = _
    ActiveDocument.SelectSingleNode _
    (XPath:="//mCsO:sales_order/mCsO:items_sold/" & _
    "mCsO:item_sold[2]/mCsO:description", PrefixMapping:= _
    "xmlns:mCsO='myCompany.schemas.com.sales_order'")
Set objRange = objNode.Range
objRange.Text = "The contents of this element have been replaced."
MsgBox Prompt:=objNode.Text
End Sub
Retrieve an element's related elements (first child element, last child element, next sibling element, previous sibling element, and parent element)
This code snippet prints the names of elements' first child elements, last child elements, next sibling elements, previous sibling elements, and parent elements. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI, uses the mCsO namespace alias, the XPath statements resolve to the specified elements in the document, and that the elements have first child elements, last child elements, next sibling elements, previous sibling elements, and parent elements in the document as specified.
Private Sub ElementRelatedElements()
Dim objNodes As Word.XMLNodes
Dim objNode As Word.XMLNode
Set objNode = ActiveDocument.SelectSingleNode _
(XPath:="//mCsO:sales_order/mCsO:items_sold", _
PrefixMapping:="xmlns:mCsO='myCompany.schemas.com.sales_order'")
With objNode
MsgBox Prompt:="First child node: " & .FirstChild.BaseName
MsgBox Prompt:="Last child node: " & .LastChild.BaseName
MsgBox Prompt:="Parent node: " & .ParentNode.BaseName
End With
Set objNode = ActiveDocument.SelectSingleNode _
    (XPath:="//mCsO:sales_order/mCsO:items_sold/" & _
    "mCsO:item_sold[2]", PrefixMapping:= _
    "xmlns:mCsO='myCompany.schemas.com.sales_order'")
With objNode
MsgBox Prompt:="Next sibling node: " & .NextSibling.BaseName
MsgBox Prompt:="Previous sibling node: " & _
    .PreviousSibling.BaseName
MsgBox Prompt:="Parent node: " & .ParentNode.BaseName
End With
End Sub
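The same navigation properties can be used to walk an entire element tree. The following sketch prints an indented outline of the elements to the Immediate window by recursing through the ChildNodes collection. It assumes that the first item in the document's XMLNodes collection is the root element; both procedure names are hypothetical.

```vba
Private Sub PrintElementTree()
    ' Assumes the first node in the document is the root element.
    If ActiveDocument.XMLNodes.Count = 0 Then Exit Sub
    PrintElement ActiveDocument.XMLNodes(1), 0
End Sub

Private Sub PrintElement(ByVal objNode As Word.XMLNode, _
    ByVal lngDepth As Long)
    Dim objChild As Word.XMLNode

    ' Indent two spaces for each level of nesting.
    Debug.Print Space$(lngDepth * 2) & objNode.BaseName

    ' Recurse into each child element.
    For Each objChild In objNode.ChildNodes
        PrintElement objChild, lngDepth + 1
    Next objChild
End Sub
```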
Add an element
This code snippet adds the root sales_order element to the active document. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI.
ActiveDocument.XMLNodes.Add _
Name:="sales_order", _
Namespace:="myCompany.schemas.com.sales_order"
Delete an element
This code snippet deletes the tax_each element from the third item_sold element of the items_sold element of the sales_order element in the active document. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI, uses the mCsO namespace alias, and the XPath statements resolve to the specified element in the document.
Private Sub DeleteTaxEachElement()
Dim objNode As Word.XMLNode
Set objNode = ActiveDocument.SelectSingleNode _
    (XPath:="//mCsO:sales_order/mCsO:items_sold/" & _
    "mCsO:item_sold[3]/mCsO:tax_each", PrefixMapping:= _
    "xmlns:mCsO='myCompany.schemas.com.sales_order'")
' Delete the node and its contents.
objNode.Cut
End Sub
Remove a child element from a parent element
This code snippet also removes the tax_each element from the third item_sold element of the items_sold element of the sales_order element in the active document. This code snippet assumes that the active document conforms to the myCompany.schemas.com.sales_order namespace URI, uses the mCsO namespace alias, and the XPath statements resolve to the specified element in the document.
Private Sub RemoveChildFromParentElement()
Dim objParentNode As Word.XMLNode
Dim objChildNode As Word.XMLNode
Dim strPrefixMapping As String
strPrefixMapping = _
    "xmlns:mCsO='myCompany.schemas.com.sales_order'"
Set objParentNode = ActiveDocument.SelectSingleNode _
    (XPath:="//mCsO:sales_order/mCsO:items_sold/" & _
    "mCsO:item_sold[3]", PrefixMapping:=strPrefixMapping)
Set objChildNode = objParentNode.SelectSingleNode _
(XPath:="mCsO:tax_each", PrefixMapping:=strPrefixMapping)
objParentNode.RemoveChild ChildElement:=objChildNode
End Sub
Displaying XML-Related User Interface Components
Display the XML Structure task pane
This code snippet displays the XML Structure task pane.
Application.TaskPanes(Index:=wdTaskPaneXMLStructure).Visible = True
Display the XML Transforms task pane
This code snippet displays the XML Transforms task pane.
Application.TaskPanes(Index:=wdTaskPaneXMLTransforms).Visible = True
Display the Templates and Add-Ins dialog box
This code snippet displays the Templates and Add-Ins dialog box, which allows access to the schema library.
Application.Dialogs.Item(Index:=wdDialogToolsTemplates).Show
Display the XML tags in the document
This code snippet makes the XML tags visible in the document.
ActiveWindow.View.ShowXMLMarkup = True
Code Example
Attach an XML Schema and Add XML Elements to a Document
This code example demonstrates how to add a schema reference to the schema library, attach the referenced schema to the active document, and add the root element and all allowed child elements to the active document. This code example assumes that the schema path is correct, that the active document conforms to the myCompany.schemas.com.sales_order namespace URI and uses the mCsO namespace alias, and the XPath statements resolve to the specified element in the document.
Private Sub AddSchemaAndElements()
Dim objNamespace As Word.XMLNamespace
Dim objSchemaRef As Word.XMLSchemaReference
Dim blnNamespaceExists As Boolean
Dim blnSchemaExists As Boolean
Dim objActiveNode As Word.XMLNode
Const SCHEMA_PATH As String = _
"C:\temp\sales_order.xsd"
Const mCsO As String = _
"myCompany.schemas.com.sales_order"
blnNamespaceExists = False
blnSchemaExists = False
' Add an XML schema reference to the
' schema library if it doesn't already exist.
For Each objNamespace In Application.XMLNamespaces
If objNamespace.Location = SCHEMA_PATH Then
blnNamespaceExists = True
Exit For
End If
Next objNamespace
If blnNamespaceExists = False Then
Application.XMLNamespaces.Add Path:=SCHEMA_PATH, _
NamespaceURI:=mCsO, Alias:="mCsO", _
InstallForAllUsers:=True
End If
' Attach XML schema to the active document if it is not
' already attached.
For Each objSchemaRef In ActiveDocument.XMLSchemaReferences
If objSchemaRef.Location = SCHEMA_PATH Then
blnSchemaExists = True
Exit For
End If
Next objSchemaRef
If blnSchemaExists = False Then
Application.XMLNamespaces.Item(Index:=mCsO).AttachToDocument _
Document:=ActiveDocument
End If
' Add the root element to the active document.
Set objActiveNode = ActiveDocument.XMLNodes.Add _
(Name:="sales_order", _
Namespace:=mCsO)
' Add child elements to the active document.
With objActiveNode.ChildNodes
.Add(Name:="order_ID", _
Namespace:=mCsO).Range.Text = "02012003000001"
.Add(Name:="customer_ID", _
Namespace:=mCsO).Range.Text = "000000001"
.Add(Name:="salesperson_ID", _
Namespace:=mCsO).Range.Text = "000001"
.Add(Name:="sale_date_time", _
Namespace:=mCsO).Range.Text = "02.01.2003 21:34:00"
.Add(Name:="payment_method", _
Namespace:=mCsO).Range.Text = "credit card"
' Add an element with an attribute at the same time.
.Add(Name:="currency_type", _
Namespace:=mCsO).Attributes.Add _
(Name:="type", Namespace:="").NodeValue = "US"
.Add Name:="items_sold", _
Namespace:=mCsO
End With
' Make the items_sold element active.
Set objActiveNode = ActiveDocument.SelectSingleNode _
(XPath:="/mCsO:sales_order/mCsO:items_sold", _
PrefixMapping:="xmlns:mCsO='" & mCsO & "'")
' Add an item_sold element.
objActiveNode.ChildNodes.Add Name:="item_sold", _
Namespace:=mCsO
' Make the item_sold element active.
Set objActiveNode = ActiveDocument.SelectSingleNode _
(XPath:="/mCsO:sales_order/mCsO:items_sold/mCsO:item_sold", _
PrefixMapping:="xmlns:mCsO='" & mCsO & "'")
' Add child elements.
With objActiveNode.ChildNodes
.Add(Name:="quantity_sold", _
Namespace:=mCsO).Range.Text = "4"
.Add(Name:="description", _
Namespace:=mCsO).Range.Text = "Standard Widgets"
.Add(Name:="unit_of_measure", _
Namespace:=mCsO).Range.Text = "dozen"
.Add(Name:="price_each", _
Namespace:=mCsO).Range.Text = "29.99"
.Add(Name:="discount", _
Namespace:=mCsO).Range.Text = "0"
.Add(Name:="taxable", _
Namespace:=mCsO).Range.Text = "yes"
.Add(Name:="tax_each", _
Namespace:=mCsO).Range.Text = "1.95"
End With
' Add another item_sold element...
' Make the items_sold element active again.
Set objActiveNode = ActiveDocument.SelectSingleNode _
(XPath:="/mCsO:sales_order/mCsO:items_sold", _
PrefixMapping:="xmlns:mCsO='" & mCsO & "'")
' Add another item_sold element.
objActiveNode.ChildNodes.Add Name:="item_sold", _
Namespace:=mCsO
' Make the second item_sold element active.
Set objActiveNode = ActiveDocument.SelectSingleNode _
    (XPath:="/mCsO:sales_order/mCsO:items_sold/" & _
    "mCsO:item_sold[2]", PrefixMapping:="xmlns:mCsO='" & _
    mCsO & "'")
' Add child elements.
With objActiveNode.ChildNodes
.Add(Name:="quantity_sold", _
Namespace:=mCsO).Range.Text = "2"
.Add(Name:="description", _
Namespace:=mCsO).Range.Text = "Deluxe Widgets"
.Add(Name:="unit_of_measure", _
Namespace:=mCsO).Range.Text = "each"
.Add(Name:="price_each", _
Namespace:=mCsO).Range.Text = "49.99"
.Add(Name:="discount", _
Namespace:=mCsO).Range.Text = "0"
.Add(Name:="taxable", _
Namespace:=mCsO).Range.Text = "yes"
.Add(Name:="tax_each", _
Namespace:=mCsO).Range.Text = "3.25"
End With
End Sub
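The code example repeats the same Add and Range.Text pattern for every child element. One possible way to shorten it is a small helper function, sketched below. The names AddChildWithText and AddOneItemSold are hypothetical and are not part of the Word object model. The sketch also uses the XMLNode object returned by the Add method directly, instead of selecting the new element again with an XPath statement.

```vba
' Hypothetical helper: adds a child element and sets its text.
Private Function AddChildWithText(ByVal objParent As Word.XMLNode, _
    ByVal strName As String, ByVal strNamespace As String, _
    ByVal strText As String) As Word.XMLNode
    Dim objChild As Word.XMLNode

    Set objChild = objParent.ChildNodes.Add(Name:=strName, _
        Namespace:=strNamespace)
    objChild.Range.Text = strText
    Set AddChildWithText = objChild
End Function

Private Sub AddOneItemSold()
    Const NS_URI As String = "myCompany.schemas.com.sales_order"
    Dim objItems As Word.XMLNode
    Dim objItem As Word.XMLNode

    Set objItems = ActiveDocument.SelectSingleNode( _
        XPath:="/mCsO:sales_order/mCsO:items_sold", _
        PrefixMapping:="xmlns:mCsO='" & NS_URI & "'")
    ' Stop if the items_sold element is not in the document.
    If objItems Is Nothing Then Exit Sub

    ' Keep the node returned by Add and fill in its children.
    Set objItem = objItems.ChildNodes.Add(Name:="item_sold", _
        Namespace:=NS_URI)
    AddChildWithText objItem, "quantity_sold", NS_URI, "4"
    AddChildWithText objItem, "description", NS_URI, "Standard Widgets"
    AddChildWithText objItem, "unit_of_measure", NS_URI, "dozen"
End Sub
```

AddOneItemSold assumes that the sales_order and items_sold elements already exist in the active document, for example because AddSchemaAndElements has been run.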
Conclusion
In this article, you learned how Word 2003 is uniquely designed to open, modify, transform, and save XML documents. You learned about additions to the Word 2003 object model that allow you to programmatically work with XML documents, schemas, namespaces, elements, and transforms. Finally, a number of code snippets and a code example demonstrated how to work with XML schemas, namespaces, elements, and transforms using VBA code.
Additional Resources
- Elementary XML
- Going from HTML to XML
- MSDN XML Developer Center
- A Guide to XML and Its Technologies
- Understanding XML Namespaces
- The XML Files: A Quick Guide to XML Schema
- XSL Transformations: XSLT Alleviates XML Schema Incompatibility Headaches
- Visual Studio .NET Walkthrough: Creating an XML File with an Associated XML Schema
- What's New for Microsoft Office 2003 Developers?