Editing XML Data with Microsoft Office Word 2003 and Microsoft Office Excel 2003

 

Mary Chipman
MCW Technologies

February 2004

Applies to:
    Microsoft® Office Word 2003
    Microsoft Office Excel 2003
    Microsoft Office System

Summary: Mary Chipman provides a beginner's walkthrough of new functionality for using XML allowing you to separate the data in a document or worksheet from its presentation in Microsoft Office Word 2003 and Microsoft Office Excel 2003. You can then use and modify the content with other tools and exchange data with other XML systems. (9 printed pages)

Important The information set out in this topic is presented exclusively for the benefit and use of individuals and organizations outside the United States and its territories or whose products were distributed by Microsoft before January 2010, when Microsoft removed an implementation of particular functionality related to custom XML from Word. This information may not be read or used by individuals or organizations in the United States or its territories whose products were licensed by Microsoft after January 10, 2010; those products will not behave the same as products licensed before that date or licenses for use outside the United States.

Contents

Editing XML Data in Microsoft Office Word 2003
Editing XML Data in Microsoft Office Excel 2003
Conclusion

Editing XML Data in Microsoft Office Word 2003

Users can create and edit documents containing information marked by XML tags in a completely intuitive fashion without needing to learn or understand XML. Users can search for specific information, aggregate information from numerous sources, and eliminate many of the time-consuming, error-prone tasks associated with XML document creation and updating. However, if they simply open and start to edit an XML file in Microsoft® Office Word 2003, they can inadvertently delete the XML tags. If the XML document is validated against a schema, you cannot save the document as XML if there are missing tags.

**Note   **An XML schema is a set of tags that describe the elements, attributes, data types, and structure of a class of XML documents.

If you want to create XML documents or templates that users can safely edit, you need to take steps to protect and lock down Word so that users cannot remove the tags. This process is described in detail in later sections of this article.

Open an Existing XML File

To open an existing XML file, from the File menu, click Open and go to the folder where the file is located. Type the name of file in the File Name list box, or select it from the Open dialog box.

The Files of type list box on the Open dialog box determines which types of files to show in the Open dialog. You do not see files with an XML extension listed in the Open dialog box unless you choose All Files, All Word Documents, or XML Files from the Files of type list box. When you open the file, the XML data and their surrounding XML tags appear.

To toggle the display of the XML tags press CTRL+SHIFT+X or click the Show XML Tags in the Document check box in the XML Structure task pane.

Opening an XML File with a Transform

XSLT style sheets are special XML documents used to transform other XML documents. For example, you can apply an XSLT transform to an XML data document to present the data as HTML, complete with graphical elements and styled text. Transforms can also filter, sort, and group data.

You can open an XML file in Word and transform it with an XSLT style sheet by selecting the Open with Transform option on the Open dialog box. This loads the Choose an XML Transform dialog box shown in Figure 1. The Files of type list box displays XML Transforms (*.xsl; *.xslt). When you open the XML file, Word transforms it for display using the selected style sheet.

Figure 1. Opening an XML file with a transform (XSLT style sheet)

Create a New XML Document

To use Word to create an XML document, on the File menu, click New. This displays the New Document task pane. Click the XML document link. This creates a document and displays the XML Structure task pane.

You must apply a schema to the new XML document. On the XML Structure task pane, click Templates and Add-ins. On the Templates and Add-Ins dialog box, click the XML Schema tab. In the Available XML Schemas list, click the checkbox next to the schema you wish to use as shown in Figure 2. If the schema you want to use does not appear in the list, click Add Schema to locate it.

Figure 2. Attaching a schema to a new document

Once you select the XML schema for your document, you can mark up the document by applying tags using the XML Structure task pane. To simplify the process of filling in data in a new XML document, you can supply placeholder text for each XML tag in the document.

Displaying Placeholder Text

Placeholder text in an XML document or template facilitates data entry by allowing users to select the placeholder text easily when inserting new data between XML tags. In the XML Structure task pane, click XML Options . Check the box next to the option labeled Show Placeholder Text for All Empty Elements. The tag names for all empty elements in square brackets when the user does not display the XML tags, as shown in Figure 3.

When a user clicks the text in the square brackets and starts to type, Word replaces the entire contents of the placeholder text.

Figure 3. A memo with placeholder text

Customizing Placeholder Text

To customize the text displayed in the brackets, right-click on each element in the tree view of the XML Structure pane. Choose Attributes and fill in the text you want displayed. When you specify placeholder text in the Attributes dialog box, that individual element always uses your custom placeholder text, regardless of your Show Placeholder Text for All Empty Elements settings.

Editing Data in an XML Document

When users open an XML document with a schema attached, they can mark up the data in the document by applying XML elements and specifying XML attributes using the XML Structure task pane. Users can edit the data between the tags as long as they do not inadvertently delete the tags. Selecting a word or a sentence that is delimited by tags and then overtyping or deleting the text frequently deletes the tags as well. Any missing tags in a document leads to problems when a user tries to save an XML document that is validated against a schema. If a tag is missing, the user cannot save the document and receives an error message.

Preventing Deletion of XML Elements

Using placeholder text for empty elements does not prevent users from inadvertently deleting the XML tags while editing data in an XML document. To prevent users from deleting XML elements, you need to protect the document or template. On the Tools menu, click Protect Document to load the Protect Document task pane. In the Editing Restrictions section, Check the Allow Only This Type of Editing in the Document box, and select No Changes (Read Only) from the drop-down list.

At this point, Word locks the entire document and you need to create an exception for the contents of each tag so that users can edit the XML data without deleting the tags. In the Protect Document task pane, in the Exceptions section, select the contents of each element, and click the Everyone check box as shown in Figure 4. You need to repeat this process for the contents of each editable element in the document.

Figure 4. Create an exception so that users can edit XML data without deleting the tags

Click Yes, Start Enforcing Protection in the Protect Document pane. Click the default option, Prevent Accidental Changes. You can also add an optional password to make any changes to your protection scheme.

The contents of the elements you have unprotected are now shaded in light yellow. This is the only area that users can edit; Word locks the rest of the document so that users cannot delete the XML tags.

If you need to make changes to your protection scheme, on the Tools menu, click Unprotect Document or click Stop Protection in the Protect Document task pane.

Saving XML Data

There are several options for saving an XML document: you can save it as a template file for Word (*.dot), an XML file (*.xml), or as a Word document (*.doc).

On the File menu, when you choose the Save As option, you can choose the XML Document (XML) option in the Save as type drop-down list. If you click Save, the document is saved in Word Markup Language (WordprocessingML) and preserves the Word document, including formatting, hyperlinks, paragraphs, and all other aspects of the document.

When you select the XML Document option in the Save As dialog box, you can also click the Save Data Only checkbox located next to Save and Cancel. In this case, only the XML data that you mark up is saved and all other text and formatting is lost. If you attach a separate schema, Word saves the data that is marked with elements from the attached schema.

Saving with a Transform

To save an XML document through a transform (XSLT style sheet), check the Apply Transform checkbox located next to Save and Cancel. This displays the Transform button, which you can click to select the XML transform (XSLT) you want to use.

Editing XML Data in Microsoft Office Excel 2003

Editing XML data in Excel is much simpler than editing XML data in Word. There are not any tasks to prevent users from deleting the XML tags in an XML document because the user interface does not display the tags.

Microsoft Excel 2002 and Microsoft Office Excel 2003 support XML in a format designed specifically for Excel, named XML Spreadsheet. You can make data contained in an XML Spreadsheet available easily to other programs, and still manipulate it in Excel. Excel 2003 adds additional XML features that allow you to add XML schemas to a workbook and to manipulate XML using visual tools to select, drag, and drop XML elements onto a worksheet. Once you add the schema to Excel 2003, it is known as an XML map. You can use these maps to import, export, or save XML data.

Another new feature in Excel 2003 is an XML list, automatically created when one or more elements repeat (an element is everything between and including the start and end tags in raw XML). Excel associates the columns in an XML list with an XML schema element by setting the XPath property of each column. Excel maps non-repeating elements to single cells called XML-mapped cells.

Open an XML Data File

To open an XML file, on the File menu, click Open. In the Look In list, go to the hard disk, folder, or Internet location that contains the XML file you want to open. The Open XML dialog box presents the following options to open the file:

As an XML list. Excel creates the XML file in a new workbook. If the XML data file does not refer to a schema, Excel infers the schema.

As a read-only workbook. Excel opens the XML file as a read-only workbook with a flattened structure.

Use the XML Source task pane. Excel displays the schema of the XML data in the XML Source task pane. You can then drag schema elements onto the worksheet to map elements to cells. If you choose this option, the data itself does not display, and you need to import the data as a separate step after you mapped the desired elements.

When the XML data does not have an associated schema, Excel infers a schema based on the data in the XML file.

Work with the XML Source Task Pane

The XML Source task pane is the tool you use to work with schemas and maps on your spreadsheet. It provides a hierarchical tree view of your XML schemas, as well as some user-configurable options. To load the XML Source task pane, on the Data menu, click XML, and then click XML Source.

Add XML Maps to a Workbook

When you add an XML schema to a workbook, Excel automatically creates an XML map that enables you to map elements directly into your worksheet. On the task pane, click Workbook Maps to load the XML Maps dialog box. Click Add and locate the XML schema file. The XML Source task pane displays the map in a tree view, as shown in Figure 5.

Figure 5. The XML Source task pane with a mapping to the Customers schema

Import XML Data into an Existing Mapping

To import data into an existing mapping, on the Data menu point to XML, and then click Import. Alternatively, on the List toolbar, click Import XML Data. Go to the disk, folder, or Internet location that contains the XML data file associated with the schema mapping, and click Import.

Note   To open the List toolbar, on the View menu, point to Toolbars, and then click List.

The default settings for Excel specify to overwrite any mapped data already in the worksheet.

Append XML Data

To append XML data in a mapped range without overwriting existing data, select the mapped range and on the List toolbar, click XML Map Properties. Click Append New Data to Existing XML Lists and then click OK.

Work with XML Lists

Excel automatically creates an XML list when you drag one or more of repeating elements onto the worksheet. Excel maps non-repeating elements to single cells.

Using Formulas in an XML List

You can use formulas in columns with mapped elements that have an XML Schema Definition (XSD) data type that corresponds to a data type of number, date, or time in Excel 2003.

Working with Denormalized Data and List of Lists

Any time you associate a single data point in an incoming XML data file with one or more repeating elements, and the single data point and repeating elements are in a single list, Excel renders the data in the list by denormalizing it. The data point only appears once in the XML file, but Excel associates it with multiple rows. Excel renders the data point multiple times, once on each related row.

Figure 6. A denormalized list where the last three columns are repeating elements (Click picture to see larger image)

You can't export data to an XML file from an XML map containing denormalized data because there is ambiguity about to which of the many data points in the worksheet Excel should write to the single data point in the XML file. Repeating element XML maps are not exportable. To remedy the situation, map the denormalized element to a single cell.

A list of lists occurs when a schema defines an element that repeats and that can itself be the parent of repeating elements—for example, multiple customer elements, each of which contain multiple orders elements, and so on. Because Excel associates a single parent element with a repeating child element, Excel denormalizes the data when it maps data in the same list.

Save XML Data to XML Spreadsheet File Format

To save your XML data with the workbook, on the toolbar, click the Save button or on the File menu, click Save. This saves your XML data in the XML Spreadsheet format, which retains many, but not all, Excel features. For a complete list of features and functionality available in XML Spreadsheet format, see the Microsoft Office Excel 2003 help documentation.

Save or Export a Mapped Range to an XML Data File

To save the XML data, on the File menu, click Save As and in the Save as type list, select XML Data. This writes the data to a file defined by the XML schema associated with the XML map in your workbook. On the List and XML toolbar, click the Export button or on the Data menu, click XML and Export to export data to an XML data file.

On the List and XML toolbar, click the XML Map Properties button to configure data validation against the schema when exporting data. If XML schema validation is on, Excel validates the XML data file and provides error messages as appropriate.

Conclusion

Editing XML data in Microsoft Office Word 2003 may require that you protect your document or template to prevent users from accidentally deleting XML tags when they edit data. You can save XML data in Word to the Word Markup Language (WordprocessingML) format, which preserves the document formatting and options, or you can save only the XML data.

Editing XML data in Microsoft Office Excel 2003 does not require any extra steps to protect XML tags. You can open an XML file in Excel and map schema elements to a worksheet. You can save XML data in the XML Spreadsheet format, retaining many of the features of Excel, or you can save the data to an XML data file.

© Microsoft Corporation. All rights reserved.