XML in Excel and the Spreadsheet Component

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.

 

Michael Stowe
Microsoft Corporation

August 2001

Applies to:
     Microsoft® Excel 2002

Summary: XML is becoming an important, new data interchange format for customers to communicate legacy data. Customers who want to import this data into Microsoft Excel 2002 and the Microsoft Office XP Spreadsheet component will have the option of loading either generic, well-formed XML or specially formatted XML Spreadsheet files (using the XML Spreadsheet schema) to communicate their data. Excel also offers the option of creating a high-fidelity XML Spreadsheet file for any Excel spreadsheet. (8 printed pages)

For a complete description of the elements and attributes that make up the XML Spreadsheet (XMLSS) schema, see the XML Spreadsheet Reference.

Contents

Introduction
Features
Copying and Pasting
Excel XML Schema Namespaces
XML Spreadsheet Tag Hierarchy
Additional Notes

Introduction

Hypertext Markup Language (HTML) is an ideal file format for displaying Microsoft® Excel data in Web browsers. However, in order to persist Excel-specific features and have Excel workbooks displayed well in a Web browser, Excel HTML is highly complex. This results in files that are only read by Excel. Extensible Markup Language (XML), however, helps separate data and view formatting, which results in a cleaner, more robust file format and, more importantly, makes Excel data more available to other applications.

Features

Some of the features of the XML implementation in Excel and the Microsoft® Office XP Spreadsheet component include:

  • "Round-tripping" information through Excel and the Spreadsheet component
  • Opening non-Excel, non-Spreadsheet component generated XML data files
  • Opening and "flattening" arbitrary, well-formed XML
  • Opening arbitrary, well-formed XML as HTML by using an associated Extensible Stylesheet Language (XSL) file
  • Opening hand-authored XML Spreadsheet documents
  • Publishing data from Excel to the Spreadsheet component
  • Exporting data from the Spreadsheet component to Excel
  • Copying and pasting data from Excel to the Spreadsheet component, and vice versa
  • Using XML-based QueryTable objects in Excel
  • Persisting Range objects as XML in Excel and the Spreadsheet component

Copying and Pasting

When copying and pasting data, Excel and the Spreadsheet component only support elements and attributes in the ss namespace. Excel can optionally add x namespace elements for Rich Text or conditional formatting.

Formulas are not adjusted when data is copied and pasted. If a reference exists that is outside of a selected range, an ss:Formula attribute is not added (but Excel and the Spreadsheet component will add an x:PasteFormula attribute with the correct, non-adjusted formula).

When copying a filtered range, filter-specific information is not copied. Instead, only the filtered rows are copied, and formulas based on hidden rows are adjusted.

The clipboard formats when copying and pasting data among Excel, the Spreadsheet component, and Notepad are as follows:

Copy from Paste to Clipboard format
Office 2000 Spreadsheet component Office XP Spreadsheet component HTML
Office XP Spreadsheet component Office 2000 Spreadsheet component HTML
Office XP Spreadsheet component Office XP Spreadsheet component XML Spreadsheet
Office XP Spreadsheet component Excel 2002 XML Spreadsheet
Office XP Spreadsheet component Excel 2000 HTML
Office XP Spreadsheet component Excel 97 or earlier Unicode text (HTML is not supported)
Excel 2000 Office XP Spreadsheet component HTML
Excel 2002 Office XP Spreadsheet component XML Spreadsheet
Office XP Spreadsheet component Notepad Unicode text
Notepad Office XP Spreadsheet component Unicode text

Items not persisted in the XML Spreadsheet format include:

  • OLE objects
  • Drawing shapes
  • Charts
  • Microsoft® Visual Basic® for Applications (VBA) code
  • Groups and outlines
  • Custom envelope information

Excel XML Schema Namespaces

The valid namespaces referenced in the Excel XML schema include:

Namespace Prefix
Office xmlns:o="urn:schemas-microsoft-com:office:office"
Excel xmlns:x="urn:schemas-microsoft-com:office:excel"
XML Spreadsheet xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
Spreadsheet component xmlns:c="urn:schemas-microsoft-com:office:component:spreadsheet"
XML schema xmlns:s="uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882"
XML data type xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
MS-persist recordset xmlns:rs="urn:schemas-microsoft-com:rowset"
Rowset xmlns:z="#RowsetSchema"

XML Spreadsheet Tag Hierarchy

The hierarchy of tags in the XML Spreadsheet model is as follows:

<ss:Workbook>
    <ss:Styles>`
        <ss:Style>
            <ss:Alignment/>
            <ss:Borders>
                <ss:Border/>
            </ss:Borders>
            <ss:Font/>
            <ss:Interior/>
            <ss:NumberFormat/>
            <ss:Protection/>
        </ss:Style>
    </ss:Styles>
    <ss:Names>
        <ss:NamedRange/>
    </ss:Names>
    <ss:Worksheet>
        <ss:Names>
            <ss:NamedRange/>
        </ss:Names>
        <ss:Table>
            <ss:Column/>
            <ss:Row>
                <ss:Cell>
                    <ss:NamedCell/>
                    <ss:Data>
                        <Font/>
                        <B/>
                        <I/>
                        <U/>
                        <S/>
                        <Sub/>
                        <Sup/>
                        <Span/>
                    </ss:Data>
                    <x:PhoneticText/>
                    <ss:Comment>
                        <ss:Data>
                            <Font/>
                            <B/>
                            <I/>
                            <U/>
                            <S/>
                            <Sub/>
                            <Sup/>
                            <Span/>
                        </ss:Data>
                    </ss:Comment>
                    <o:SmartTags>
                        <stN:SmartTag/>
                    </o:SmartTags>
                </ss:Cell>
            </ss:Row>
        </ss:Table>
        <c:WorksheetOptions>
            <c:DisplayCustomHeaders/>
        </c:WorksheetOptions>
        <x:WorksheetOptions>
            <x:PageSetup>
                <x:Layout/>
                <x:PageMargins/>
                <x:Header/>
                <x:Footer/>
            </x:PageSetup>
        </x:WorksheetOptions>
        <x:AutoFilter>
            <x:AutoFilterColumn>
                <x:AutoFilterCondition/>
                <x:AutoFilterAnd)>
                    <x:AutoFilterCondition/>
                </x:AutoFilterAnd>
                <x:AutoFilterOr>
                    <x:AutoFilterCondition/>
                </x:AutoFilterOr>
            </x:AutoFilterColumn>
        </x:AutoFilter>
    </ss:Worksheet>
    <c:ComponentOptions>
        <c:Toolbar>
            <c:HideOfficeLogo/>
        </c:Toolbar>
    </c:ComponentOptions>
    <o:SmartTagType/>
</ss:Workbook>

Additional Notes

Saving XML Programmatically

When saving a workbook from Excel, you still use the SaveAs method of the Workbook object, but you now use the XlFileFormat enumerated constant xlXMLSpreadsheet.

Styles

If you create a number of named styles in Excel but do not use them, they are not persisted in the XML file.

If the default style is unspecified, Excel creates one when the XML file is opened and assumes the default value for every attribute of every style child element.

Element and attribute inheritance works as follows. Suppose you have the following XML:

<ss:Styles>
    <ss:Style ss:ID="Default" ss:Name="Normal">
        <ss:Font ss:Color="red"/>
    </ss:Style>
    <ss:Style ss:ID="xl1">
        <ss:Font ss:Bold="1"/>
    </ss:Style>
</ss:Styles>
...
<ss:Cell ss:StyleID="xl1">
    ...
</ss:Cell>

The cell above will contain a bold font with automatic color, not a red, bold font. Style inheritance works on a per-element basis, not a per-attribute basis. This means if a Style child element is present (such as <Font>), all unspecified attributes assume their default values. They do not inherit from their parent style.

As another example:

<ss:Styles>
     <ss:Style ss:ID="Default">
          <ss:Font ss:Size="14" ss:FontName="Times"/>
     </ss:Style>
     <ss:Style ss:ID="A">
          <ss:Font ss:Color="Red"/>
     </ss:Style>
     <ss:Style ss:ID="B" ss:Parent="Default">
          <ss:Font ss:FontName="Tahoma"/>
     </ss:Style>
     <ss:Style ss:ID="C" ss:Parent="A">
          <ss:Font ss:FontName="Courier"/>
     </ss:Style>
</ss:Styles>

This will result in the following styles:

Default:   Times, 14pt

A:           Red, Times, 14pt

B:           Tahoma, 14pt

C:           Courier, 14pt

For table, row, column, and cell style IDs, the table format is always applied first, followed by column, then row, then cell formatting. Exception styles are applied to cells to override this order of inheritance. For example, suppose you want column C (formatted blue) to lie above row 3 (formatted green). To get this effect, the cell at position C3 must contain a style ID pointing to column C's blue format.

For date/time cells, when the user does a Web query to an XML Spreadsheet document and specifies no HTML formatting, date formatting (whatever is specified in the XML Spreadsheet document being queried) will still be applied to cells with ss:Type="DateTime".

Data Type Clarification

When a <Cell> element includes an <ss:Data ss:Type="Boolean"> child element, the only legitimate values are 0 and 1. True and False are illegal values and will result in the error message "The Microsoft Office Spreadsheet Component could not load the XML data. TRUE is not legal for the cell value when the data type is a Boolean."