Beyond Macros

Create Word and Excel Smart Documents with C++ and XML

Mike Kelly

Code download available at:SmartDocuments.exe(159 KB)

This article assumes you're familiar with C++ and XML

Level of Difficulty123

SUMMARY

One of the coolest new parts of Office 2003 is a programmability feature called Smart Documents, which allows developers to augment Word and Excel documents with programmable content and behavior. Typically, examples illustrating Office programmability use Visual Basic or Visual Basic .NET. In this article, the author develops a Smart Document for Excel using C++. He describes the new ISmartDocument interface and shows how to use it to manage a simple task list such as an Excel spreadsheet.

Contents

Smart Document Concepts
Creating the Smart Document
Let's Begin
ISmartDocument Interface Tour
Property Bags
Creating a Smart Document Action DLL
Installing Your Smart Document Solution
Accessing the Smart Document Content
Debugging Smart Document DLLs
Change Notifications
Security
Generating the Status Report
Conclusion

Smart Documents are a new programmability feature in Microsoft® Word and Excel 2003. Developers can create Word or Excel Smart Documents augmented with programmable pieces that help users create or modify them. Reusing boilerplate text, implementing business rules for how a document should be formatted or completed, and routing a document through an approval process are all procedures that Office Smart Documents can simplify.

For years it has been possible to use Microsoft Office applications in custom solutions using ActiveX® and Visual Basic® for Applications (VBA). These have some limitations, however, particularly in deployment, user interface, and security. Smart Documents address these with a tighter integration between the popular "task pane" interface introduced in Office XP and the custom solution. With Smart Documents, you use XML to define custom task pane content that automatically synchronizes help, instructions, and programmable controls with the user's position in the document. Code for the custom task pane controls can be developed in several languages: Visual Basic 6.0, Visual Basic .NET, Visual C++®, or Visual C#®.

Smart Documents and associated controls are installed when the user simply opens the document obtained from a trusted server or an attachment in an e-mail message. They can automatically update themselves from a trusted server, greatly simplifying update deployment. Smart Documents follow the Office security settings, requiring them to be downloaded from a trusted server and digitally signed by a trusted source. Users can be given control to decide whether to open Smart Documents that do not meet these security standards.

In this article, the business problem to be solved is one familiar to readers of MSDN® Magazine: scheduling and tracking progress during a software project. While there are many good solutions (including Microsoft Project) for tracking schedules, I'm going to treat the schedule as a database stored in Excel because I want a solution that you can play with simply by installing Office 2003 and downloading the code for my article, without having to install any additional products.

In the spreadsheet there's a row for each schedule item containing the usual information: who's doing the task, the estimated amount of time to complete the task, and whether the task has been completed. I'll assume that, on my project team, developers update the schedule each week to reflect the previous week's work. In addition, since I like to know what's going on with my team, I'll ask them to identify which tasks they'll be working on next week. This is also the time when changes to schedule estimates will be made. The Smart Document solution I'll build will wrap up all these schedule changes into a nicely formatted Word status document—just the type of thing that managers love. As a bonus, I'm going to use the new XML features of Word for the status document. In fact, my Excel Smart Document solution will generate an XML status summary, which I'll combine with XSLT in Word to get the nice formatting.

Smart Document Concepts

There are five main parts to an Office 2003 Smart Document solution. First, there is a Word or Excel document or template. This is the Smart Document that the user opens to perform a business function—in this case generating a nicely formatted weekly status report for a software project. Please note that in Office 2003, only Word and Excel support Smart Documents.

Second, there is an XML schema definition (XSD). The Word document or Excel spreadsheet is marked up with elements from the XML schema to overlay an XML data model on all or part of the document or workbook. For example, in the project schedule workbook, there is a Project Name cell and a Developer Name cell. These correspond to the ProjectName and DeveloperName elements in the XML schema. "Marking up" the spreadsheet is how the developer associates these specific cells with the XML schema elements.

Third, you have the action handler DLL, which can be developed in Visual Basic, Visual Basic .NET, Visual C++, or Visual C#. It is then deployed to a Web or network server. This DLL implements the OLE interface ISmartDocument.

Fourth, there is an XML expansion pack manifest file that specifies the installation instructions for the solution. It provides the server location of the solution DLL and the solution's COM CLSID. It also mentions any other associated files needed for the solution and the solution ID. The solution ID uniquely identifies this Smart Document solution. Although you can use any unique identifier string as the solution ID, it's best to make it a GUID that you generate.

Finally, there may be other files needed by the solution (such as GIFs or JPEGs, HTML, or document fragments) that you want to install on the local machine. These would be specified in the manifest and installed or updated automatically by Office when the Smart Document is opened. They may then be used by your Smart Document solution code.

To clarify the roles of each document in my example, note that the Smart Document is the document to which the XML schema and custom logic are attached—in my case, it's the Excel spreadsheet. The fact that I'm generating a Word document with this is incidental; I could just as easily be putting information into a database or generating a PowerPoint® presentation.

You can think of the XML elements, which factor the spreadsheet or document into logical structural components, as hooks to which your Smart Document solution can attach controls that assist the user in working with these various parts of the document. The behavior of these controls is defined by your action handler DLL at run time. The controls effectively become a sophisticated context-sensitive help system and are displayed in the new document actions task pane as the user moves into particular sections (corresponding to XML elements) of the document or spreadsheet. Because Smart Document controls can be much more than just help text, your context-sensitive help can actually act as a user assistant in completing the business task system for which the document is used.

Creating the Smart Document

Much of the work involved in generating a Smart Document solution consists of creating the Word or Excel document and associating XML schema elements with it. In fact, you can often use existing Word or Excel documents as a basis for creating a Smart Document. Since many business documents already have a structure, you're merely formalizing and naming it when you define the XML schema.

From the developer's point of view, the document or spreadsheet is the principal UI for the solution. The Smart Document action handler DLL defines an additional UI that appears side by side with the document when the user is editing. It's sensible when creating Smart Documents to use the underlying power of Word and Excel whenever possible rather than writing custom code in your Smart Document DLL. A formula in my Excel schedule workbook illustrates this point. I wanted the status report to be dated the previous Monday, but because people often fill out a status report for the previous week on Monday or Tuesday, the calculation to get the report date right is a bit tricky. On Monday and Tuesday, it's the previous week's Monday; after that, it's this week's Monday. Rather than write a bunch of C++ code to do this, I wrote an Excel formula to encapsulate the calculation and put it in a cell in the worksheet:

=IF(WEEKDAY(TODAY(),3)<2, TODAY()-(7+WEEKDAY(TODAY(),3)), TODAY()-WEEKDAY(TODAY(),3))

I simply load that calculated date value from the worksheet cell using a function I wrote to get values from an Excel workbook.

Let's Begin

Now, let's start by creating an Excel spreadsheet to track a software schedule. Note that I am assuming you've downloaded the sample code, which includes the necessary support files.

Start Microsoft Excel 2003 and open Project Schedule (Original).xls. This spreadsheet, like most Excel spreadsheets, already has a structure. Now associate an XML schema with that structure using an XSD document, in this case ScheduleSmartDocument.xsd. In Excel, point to XML on the Data menu and select XML Source. When the XML Source task pane appears, click the Workbook Maps button and add the ScheduleSmartDocument.xsd file. You then click OK to dismiss the Browse dialog, and OK again to dismiss the XML Maps dialog. Figure 1 shows the Excel XML Source task pane after attaching the schema.

Figure 1 XML Source

Figure 1** XML Source **

To create the mapping, you must drag elements from the XML schema onto the workbook. For the elements like Project Name and Developer, this is a simple matter of grabbing the XML element name in the XML Source task pane and dragging it onto the cell in the worksheet that contains the associated data (for example, dragging ProjectName onto cell A2). For the repeating schedule items, shown as a hierarchy, grab the group in the task pane by selecting Items and drag it onto the header row (cell A5). You can confirm that the mapping is correct by selecting a particular element (for example Description) in the XML Source task pane; the corresponding portion of the workbook will be highlighted. Save the resulting workbook as "Project Schedule (Mapped).xls." In this example, I'll create a single spreadsheet. Or, you could save the schedule as an Excel Workbook Template. There, you simplify the process of users creating new copies of the document, each of which will itself be an Excel Smart Document.

ISmartDocument Interface Tour

The next step is to create a Smart Document action DLL and install it. Before you start working on a Smart Document action handler DLL (which implements the ISmartDocument interface methods), it's good to learn about this interface.

Your action handler DLL is going to create and manage a set of controls which will appear in the document actions task pane as the user moves through the document, enabling you to provide a custom user interface based on the position within the document. The XML schema elements you mapped to your document define the user's position within the document. In turn, the XML elements specify which controls should be shown to the user. The control types you can create are listed in Figure 2. You'll create the controls using ISmartDocument interface methods that you'll implement in your action handler DLL. When these controls are activated, such as when text is entered or a button is clicked, Office invokes your custom action handler DLL code through ISmartDocument interface methods.

Figure 2 Smart Document Control Types

Symbolic Name Control Type
C_TYPE_LINK Hyperlink
C_TYPE_HELP Help text provided as HTML and displayed in the document actions task pane
C_TYPE_HELPURL URL-to-HTML help text to be displayed in the document actions task pane
C_TYPE_SEPARATOR Separator line for formatting use only
C_TYPE_BUTTON Command button for triggering actions
C_TYPE_LABEL Static text to be displayed
C_TYPE_IMAGE Image to be displayed, provided as path name to a GIF, BMP, TIFF, JPEG, WMF, or other image file
C_TYPE_CHECKBOX Checkbox
C_TYPE_TEXTBOX Edit box for entering text
C_TYPE_LISTBOX List of items as a scrollable listbox
C_TYPE_COMBO Combobox (an edit control and dropdown list of choices as a unit)
C_TYPE_ACTIVEX ActiveX control
C_TYPE_DOCUMENT- Document fragment, provided as Word
FRAGMENT XML text (Word only)
C_TYPE_DOCUMENT- Document fragment, provided as URL to
FRAGMENTURL a file (Word only)
C_TYPE_RADIOGROUP Radio group of choices

I've divided the methods of the ISmartDocument interface into several categories: the configuration methods, which describe the controls created in the document actions task pane to Office (see Figure 3); control draw-time methods, which are invoked as controls and are drawn in the task pane (see Figure 4); and change notification methods, which Office invokes as the document actions task pane controls are manipulated by the user (see Figure 5).

Figure 5 ISmartDocument Interface Notification Methods

Method Parameters Description
InvokeControl int ControlID, BSTR ApplicationName, LPDISPATCH Target, BSTR Text, BSTR Xml, int LocaleID Called for controls of type C_TYPE_LINK, C_TYPE_BUTTON, C_TYPE_DOCUMENTFRAGMENT, and C_TYPE_DOCUMENTFRAGMENTURL when the user presses the button or clicks on the link or document fragment.
OnCheckboxChange int ControlID, LPDISPATCH Target, VARIANT_BOOL Checked Called for controls of type C_TYPE_CHECKBOX when the checked state changes.
OnTextboxContentChange int ControlID, LPDISPATCH Target, BSTR Value Called for controls of type C_TYPE_TEXTBOX when the text content changes.
OnListOrComboSelectChange int ControlID, LPDISPATCH Target, int Selected, BSTR Value Called for controls of type C_TYPE_LISTBOX or C_TYPE_COMBO when the selected item changes.
ImageClick int ControlID, BSTR ApplicationName, LPDISPATCH Target, BSTR Text, BSTR Xml, int LocaleID, int XCoordinate, int YCoordinate Called for controls of type C_TYPE_IMAGE when the user clicks on the image. X and Y are zero-based pixel coordinates relative to the upper-left corner of the image.
OnRadioGroupSelectChange int ControlID, LPDISPATCH Target, int Selected, BSTR Value Called for controls of type C_TYPE_RADIOGROUP when the user changes the selected state of the radio buttons. The Value string is the text of the item that has been newly selected.

Figure 4 ISmartDocument Interface Control Drawing Methods

Method Parameters Description
get_ControlCaptionFromID int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, BSTR * Caption Provides the visible display name of the control (as the out BSTR) given the control ID.
PopulateHelpContent int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, BSTR * Content Called for controls of type C_TYPE_HELP or C_TYPE_HELPURL to provide content for the help text.
PopulateCheckbox int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, VARIANT_BOOL * Checked Called for controls of type C_TYPE_ CHECKBOX to provide the checked state of the checkbox (as a BOOLEAN variant in the Checked out parameter).
PopulateTextboxContent int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, BSTR * Value Called for controls of type C_TYPE_TEXTBOX to provide the content for the textbox (in the BSTR Value out parameter).
PopulateListOrComboContent int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, SAFEARRAY * * List, int * Count, int * InitialSelected Called for controls of type C_TYPE_LISTBOX or C_TYPE_COMBO to provide content for the listbox or combobox dropdown. The content is provided as a VARIANT array in the List out parameter; the number of items in the array is provided in the Count out parameter; and the (1-based) initially selected is provided in the InitialSelected out parameter (-1 to select nothing).
PopulateDocumentFragment int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, BSTR * DocumentFragment Used by Word only. Called for controls of type C_TYPE_DOCUMENTFRAGMENT to provide content for the document fragment (as HTML).
PopulateActiveXProps int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, ISmartDocProperties * ActiveXPropBag Called for controls of type C_TYPE_ACTIVEX to provide information needed to instantiate the ActiveX control in the task pane. This information is provided through an IPropertyBag.
PopulateImage int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, BSTR * ImageSrc Called for controls of type C_TYPE_IMAGE to provide the path (local file path, UNC, or URL) to the image. The image may be a GIF, JPEG, or WMF.
PopulateRadioGroup int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props, SAFEARRAY * * List, int * Count, int * InitialSelected Called for controls of type C_TYPE_RADIOGROUP to provide the contents of the radio group buttons. These are provided as a set of BSTR strings in the VARIANT array, the number of items in the array, and the (1-based) initially selected item.
PopulateOther int ControlID, BSTR ApplicationName, int LocaleID, BSTR Text, BSTR Xml, LPDISPATCH Target, struct ISmartDocProperties * Props Called for controls of type C_TYPE_LINK and C_TYPE_BUTTON to provide the contents of the link. These are provided as a set of BSTR strings in the VARIANT array, the number of items in the array, and the (1-based) initially selected item.
OnPaneUpdateComplete LPDISPATCH Target Called when all the controls have been drawn.

Figure 3 ISmartDocument Interface Configuration Methods

Method Parameters Description
SmartDocInitialize BSTR ApplicationName, LPDISPATCH Document, BSTR SolutionPath, BSTR SolutionRegKeyRoot Called by an Office application when the Smart Document is opened by a user. Can be used to do any per-session initialization. The IDispatch points to the Word or Excel document being opened. The SolutionPath is the local machine directory where the solution files are installed.
get_SmartDocXmlTypeCount int * Count Provides Office with the number of XML elements in the Smart Document which have content in the document actions pane. Each document element with controls is assigned a unique 1-based Smart Document type ID (called SmartDocID in the methods listed below).
get_SmartDocXmlTypeName int SmartDocID, BSTR * bstrElementName Provides Office with the XML element name corresponding to SmartDocID (1-based).
get_SmartDocXmlTypeCaption int SmartDocID, int LocaleID, BSTR * Caption Provides Office with the localized caption to appear in the document actions task pane for the element name corresponding to SmartDocID. You could load this from a localized string table in your DLL.
get_ControlCount BSTR bstrElementName, int * Count Given an XML element name from your Smart Document, tells Office how many controls will be displayed in the document actions task pane when the user is positioned within that element.
get_ControlID BSTR bstrElementName, int ControlIndex, int * ControlID Given an element name and a (1-based) control index, you assign a unique control ID to each control using this method. The rest of the methods use this control ID to refer to controls.
get_ControlNameFromID int ControlID, BSTR * Name Given a control ID, provides the (internal) name of the control.
get_ControlTypeFromID int ControlID, BSTR ApplicationName, int LocaleID, C_TYPE * Type Given a control ID, provides the type of the control (see Figure 2 for a list of types).

Note that for many action DLLs, you will not use all the control types or even implement (beyond stubs) all the ISmartDocument methods. For example, since I have no ActiveX or listbox controls in my task pane, I have only boilerplate implementations for those methods in my sample code. It is even possible to take advantage of some simple Smart Document features (such as providing hyperlinks or context-sensitive help text) without writing an action handler DLL at all. Instead, you simply provide an XML file that describes the actions you want to associate with various XML elements in the associated Word or Excel Smart Document.

Many of the ISmartDocument methods have common parameters, which are summarized in Figure 6.

Figure 6 ISmartDocument Common Method Params

Parameter Description
BSTR ApplicationName ProgID of the Office application hosting the Smart Document (for example "Excel.Application.11").
LPDISPATCH Document or Target An IDispatch pointer to the underlying Word or Excel document (for SmartDocInitialize and OnPaneUpdateComplete). For other methods, an IDispatch to the Word or Excel range of the XML element affected.
int LocaleID LCID of the user's language; for use by multilanguage Smart Documents to enable localized captions and other resources to be loaded from the DLL.
ISmartDocProperties * Props An IPropertyBag that is used to provide control properties.
BSTR Text Text of the current document element.
BSTR Xml XML representation of the document element; can be NULL.

When Office loads your action DLL, it will first invoke the configuration methods shown in Figure 3. Office first calls SmartDocInitialize (enabling you to do any special, one-time initialization). It will then query you for the number of XML elements defined in the document that have associated controls by calling your implementation of get_SmartDocXmlTypeCount. It is also possible to have XML elements in the document without associated controls. In that case, the document actions pane will be empty, so you may want to define some standard help content for those elements. Once you've told Office how many XML document elements you're interested in, it will ask for the name of each element, which it obtains through multiple calls to get_SmartDocXmlTypeName. You need to return the qualified name for the XML element (for example, schema#element). Each element will have a caption that appears in the title of the document actions pane when the user is positioned within that element. This is returned by a call to get_SmartDocXmlTypeCaption. The calls to get_SmartDocXmlTypeName and get_SmartDocXmlTypeCaption occur in pairs, one per XML element. Next, Office wants to know information about the controls to display in the task pane for each element. This information includes:

  • get_ControlCount—gives the number of controls for the element.
  • get_ControlID—allows you to assign a unique ID number to each control based on the (1-based) index of the control for this element.
  • get_ControlNameFromID—returns the (non-localized) name of the control after being given the control ID. This name is not displayed, but is used to refer to the control in the Word or Excel object model.
  • get_ControlTypeFromID—given the control ID, returns the type of the control (from Figure 2).

As the user moves through the document, Office updates the document actions task pane with the appropriate controls, then invokes the control drawing methods (see Figure 4) as it draws the controls in the task pane. First, it invokes get_ControlCaptionFromID to allow you to provide a caption for the control. Note that the caption is not provided at configuration time, but at display time, and so can change dynamically (based, perhaps, on document content). Controls with a blank or NULL caption string are not drawn by Office. This may be used to dynamically hide and show controls. In order to hide a control, you should simply return a NULL or empty string from get_ControlCaptionFromID for that control; Office will not invoke the corresponding Populate methods for that control.

Office then invokes a control type-specific Populate method (such as PopulateHelpContent) for each control. You use these methods to provide content for the controls, such as the location of an image or the text of a hyperlink. Note that the XML and Text content of the document element that is associated with this control is passed to all the Populate methods. You should also note that controls of type C_TYPE_LABEL and C_TYPE_SEPARATOR do not have associated Populate methods. After all the controls have been drawn and the task pane is complete, Office will call OnPaneUpdateComplete.

As the user interacts with the controls in the task pane, Office invokes one of the notification methods that is shown in the list in Figure 5. Labels and separators have no notification methods associated with them. For others, the notification methods indicate either a change in state or a click, which could invoke an action (such as for a hyperlink or image).

Note that Office seems to err on the side of too much notification rather than too little. For example, as the cell selection changes in Excel (even though the XML element remains the same), Office redraws the document actions workpane and invokes your methods so you can have truly dynamic content in the task pane. You could, for example, query the value of the selected cell through the Excel object model and update your controls based on that value. I take advantage of this in my sample code to update information about the selected task in the task pane without needing to manage Excel VBA events in my document action DLL.

Property Bags

Most of the control drawing methods provide an ISmartDocProperties parameter. This is a property bag (inherited from IPropertyBag) of name/value pairs that provide additional information on the controls. This is where you specify size, font, alignment, and so on for your controls. Figure 7 provides the key names that apply to all controls. The property values, including values for the integer properties, are all supplied as BSTRs to ISmartDocProperties::Write. For example, the following line in PopulateTextboxContent will set the width of the textbox control to 50 pixels:

Props->Write(CComBSTR("W"), CComBSTR("50"));

Figure 7 ISmartDocProperties for All Controls

Property Description
X A positive integer that specifies how far from the left position the control should be indented, in pixels.
Y A positive integer that specifies how far below the normal top position the control should be positioned, in pixels.
H A positive integer that specifies the height of the control, in pixels.
W A positive integer that specifies the width of the control, in pixels.
Align A string giving horizontal alignment of the control within the task pane. "left" (the default) aligns the control on the left of the pane; "center" centers the control within the pane; "right" aligns the control against the right edge of the task pane.

Controls that have labels associated with them (C_TYPE_TEXTBOX, C_TYPE_CHECKBOX) have the additional properties shown in Figure 8 to describe attributes of the control's label. Note that these only affect the control label, not the font in which the control itself is drawn. Finally, there are some miscellaneous properties which apply only to particular types of controls. These are shown in Figure 9.

Figure 9 Other ISmartDocProperties Name/Value Pairs

Property Applies to Controls of Type Description
IsMultiLine C_TYPE_TEXTBOX A setting of True allows the textbox to accept multiple lines of text, each separated by a carriage return/line feed character.
NumberOfLines C_TYPE_TEXTBOX, C_TYPE_COMBO, C_TYPE_LISTBOX The number of lines to display in a textbox. For a listbox or combobox, the number of lines visible without scrolling.
PasswordCharacter C_TYPE_TEXTBOX, C_TYPE_COMBO A single character that is used to obscure typing into the textbox so that it may be used to enter passwords.
ExpandHelp C_TYPE_HELP, C_TYPE_HELPURL A setting of True expands the help control when originally displayed.
Border C_TYPE_IMAGE A setting of True displays a border around the image when the mouse is positioned within the image.
IsEditable C_TYPE_COMBO A setting of True allows editing of the text portion of a combobox.
ControlOnSameLine C_TYPE_TEXTBOX, C_TYPE_COMBO, C_TYPE_LISTBOX A setting of True (the default value) displays the control label to the side of the control. A setting of False displays the control label on the line above the control.
Expanded C_TYPE_DOCUMENTFRAGMENT, C_TYPE_DOCUMENTFRAGMENTURL A setting of True expands the document fragment content in the task pane.
     

Figure 8 ISmartDocProperties for Control Labels

Property Description
FontFace The font in which to display the control label ("Arial," for example)
FontSize The font size, in points, in which to display the control label
FontStyle The value "none" (default) or one or more of "underline," "italic," "strikeout" (separated by the "|" character if there are multiple in the list)
FontWeight The value "normal" (default) or "bold"

Now that you've had a fairly quick tour of the various interface methods, I'll build a Smart Document action handler DLL in Visual Studio® .NET.

Creating a Smart Document Action DLL

I decided to create my Smart Document action DLL in Visual C++ using the Active Template Library (ATL). ATL is a good choice because it's got a smaller footprint than MFC, but still provides plenty of convenient COM functionality you'll need in working with Smart Documents. However, you can certainly choose to use MFC if you're more familiar with it.

After downloading the Office 2003 Smart Documents SDK from MSDN Online, I started Visual Studio .NET and created a new project of type ATL Project, then used Add | Class to add an ATL Simple Object to the project. This is CScheduleSmartDoc in my sample code. From the class context menu for that object (use View | Class to bring up the class explorer), I used Add | Implement Interface to have this class implement the ISmartDocument interface. If you don't have Office 2003 installed on your development machine, you'll need to copy the type library from a machine with Office 2003 to do this. You can usually find it in C:\Program Files\Common Files\Microsoft Shared\Smart Tag\mstag.tlb, but make sure you grab version 1.2 (the Office 2003 version). Then you should implement the methods you need, using my sample code and interface descriptions as a guide.

Installing Your Smart Document Solution

Office uses an XML file (called the XML expansion pack manifest file) to describe your Smart Document solution, including directions for updating and deploying the solution. Typically, this file is called manifest.xml. I've added a skeleton manifest.xml to my sample code (see Figure 10) and highlighted in red the things you'll need to change.

  • The solution ID. You should use GuidGen to generate a unique GUID for your Smart Document solution. It's imperative that all solutions have a unique solution ID and, while it's not strictly required to be a GUID, that's the best way to ensure uniqueness.
  • The updateFrequency, which is how often (in minutes) Office should check the server for updates to the manifest. Updates are controlled by the version tag of each file specified in the XML expansion pack manifest. If the versions are different, and the update time has expired, Office will download the new version of the corresponding file from the server to the local machine.
  • The file path to your Smart Document action DLL. This can be a URL or network path if the DLL is deployed on a network or a path or file name relative to the location of the manifest file.
  • The CLSID of your Smart Document action DLL. Because it implements the ISmartDocument COM interface, it has a COM CLSID that Office will use to load it. You'll find this in the header file of the ATL class you generated to implement the ISmartDocument interface. It's the uuid property of the coclass (CScheduleSmartDoc in my sample code).
  • If there are additional files you want to install along with your solution (such as GIFs or HTML files that contain additional text) you can list them here, as separate <file> elements under the <solution> element. I have one additional GIF for the graphic that appears in my task pane.

Figure 10 XML Manifest File

<?xml version="1.0" encoding="UTF-8" standalone="no"?> <manifest xmlns="https://schemas.microsoft.com/office/xmlexpansionpacks/2003"> <version>1.1</version> <updateFrequency><span xmlns="https://www.w3.org/1999/xhtml">20160</span></updateFrequency> <uri><span xmlns="https://www.w3.org/1999/xhtml">ScheduleSmartDocument</span></uri> <solution> <solutionID><span xmlns="https://www.w3.org/1999/xhtml">{15960625-1612-46AB-877C-BBCB59503FCE}</span></solutionID> <type>smartDocument</type> <alias lcid="*">Simple <span xmlns="https://www.w3.org/1999/xhtml">Schedule Smart Document</span></alias> <targetApplication><span xmlns="https://www.w3.org/1999/xhtml">Excel.Application.11</span></targetApplication> <file> <type>solutionActionHandler</type> <version><span xmlns="https://www.w3.org/1999/xhtml">1.0</span></version> <filePath><span xmlns="https://www.w3.org/1999/xhtml">ScheduleSmartDocument.dll</span></filePath> <CLSID><span xmlns="https://www.w3.org/1999/xhtml">{0A9D54DE-12F8-4711-AF10-7D4ACA1D1924}</span></CLSID> <regsvr32/> </file> <span xmlns="https://www.w3.org/1999/xhtml"><file> <type>Other</type> <version>1.0</version> <filePath>ScheduleGraphic.GIF</filePath> </file></span> </solution> <span xmlns="https://www.w3.org/1999/xhtml"><solution> <solutionID>{443E624C-AE85-4b20-8522-1CFD5B4E7CBC}</solutionID> <alias>Schedule Smart Document XSL Transform</alias> <type>Transform</type> <context>https://schemas.microsoft.com/office/word/2003/wordml </context> <file> <type>primaryTransform</type> <version>1.0</version> <filePath>ScheduleReport.xsl</filePath> </file> </solution></span> </manifest>

One other important thing which blocked me for a few hours is that the <uri> element in your manifest must have the same name as the XML schema you are using for mapping your document elements. In my case, this was ScheduleSmartDocument.

Once your Smart Document is marked up with an XML schema (see the earlier section on creating the Smart Document) you can attach an XML expansion pack to associate your action handler DLL with the document. To do this in Excel, go to the Data | XML | XML Expansion Packs dialog. You use this dialog to first add a new expansion pack (pointing to the manifest.xml file for the solution), then attach an expansion pack to the document. This process is similar to marking the document with XML; you only need to do this as the Smart Document developer—users won't need this dialog. In Word, you use the XML Expansion Pack tab of the Add-Ins dialog (Tools | Templates and Add-Ins) to attach an expansion pack to the document.

When users open a document with an associated expansion pack, Office will automatically install or update the expansion pack files on their local machine. Expansion packs are typically installed to the C:\Documents and Settings\All Users\Application Data\Microsoft\Schemas directory, but this is subject to change in future versions of Office. If the user doesn't have administrative rights, the expansion pack files are installed in the corresponding per-user application data directory instead (for example, C:\Documents and Settings\Mike\Local Settings\Application Data\Microsoft\Schemas). Files in the XML expansion pack marked <runFromServer>true</runFromServer> will not be downloaded and installed on the user's machine, but will instead be opened directly from the server location on each use.

If you've been developing a solution on your local machine, you may find that after copying it to the deployment server, Office still looks for the manifest file in the path you used on your development machine. I had this problem and eventually found that Office encodes the path to the solution in a custom document property called Solution URL. After deploying the files to the server, you need to either change this property to the server path or use the XML Expansion Packs dialog to attach the expansion pack from the deployment server (which will reset this property value). You then save your Excel workbook back to the server with the server path to the solution files.

To test this as a user, you should then close the workbook, go to Data | XML | XML Expansion Packs (under Tools | Templates and Add-Ins in Word) and remove the ScheduleSmartDocument expansion pack files from your machine. This makes your development machine look like a normal user machine, which knows nothing of your expansion pack. Next, open the spreadsheet from the server. A dialog will appear asking if you want to download the XML expansion pack, to which you should click Yes. Office will then download and install the files listed in the expansion pack manifest on the local machine.

XML expansion packs have more features than you need for this simple solution. They can be used to distribute virtually any type of file from a server to the client, and can even be chained together into "manifest collections" which will cause Word or Excel to reference several chained manifests as if they were one large manifest. A good summary of all the fields in the expansion pack manifest file is in the article by Paul Cornell that I referenced earlier.

Accessing the Smart Document Content

To be useful, your Smart Document action DLL is often going to need to access (and possibly modify) the underlying Office document. In my case, that is the Excel project schedule spreadsheet. Several of the ISmartDocument methods provide you with a pointer to the IDispatch COM interface for the underlying document for just this purpose. As anyone who's programmed COM against Office knows, IDispatch provides the tunnel into the Office object model. In Visual Basic, using IDispatch and type libraries happens fairly automatically, but it's a bit more involved in C++.

At first I tried to use the #import directive, which allows Visual C++ to generate ATL smart pointer wrappers for all the interfaces in a type library. However, it required quite a bit of gymnastics to get this to compile correctly, and that is often a hint that something is awry. Sure enough, I found Knowledge Base article, Q238987 "Office Application Remains in Memory After Program Finishes", which describes known problems using the #import directive with Office type libraries and advises against it.

In the end, I decided to develop my own class, CExcelWorkbook, to wrap the Excel methods I need. This class inherits from the ATL template CComDispatchDriver, making it relatively easy to call methods on the Excel object model through IDispatch. Using the CComDispatchDriver method GetIDOfName, you can obtain a DISPID for a given property or method (for example, Range or ActiveSheet in the Excel object model). Once you have the DISPID (which I cache in the class for greater efficiency), you can use one of the other CComDispatch methods (such as GetProperty or InvokeN, where N is the number of parameters) to access properties and methods in the object model. Notice that the Excel Range value is exposed as a VBA property (rather than a method), but it also requires a parameter (the range address). Since CComDispatchDriver doesn't provide a way to pass a parameter to a GetProperty call, I implemented a new method (GetProperty1) in my custom class (CExcelWorkbook) which takes care of this.

Debugging Smart Document DLLs

Nothing ever goes as planned and my first attempt at doing the Smart Document action DLL was no exception. I tried to attach the XML schema in Excel and received an error message: "XML Expansion Pack Logic is Missing or Invalid." To debug this, I first used ListDLLs (a very handy tool from https://www.sysinternals.com) to determine that Excel was in fact loading my library, so the manifest seemed at least good enough to let Excel find the DLL. I quit Excel and then modified the Debugging properties of my project (right-click the solution and select Properties) to tell Visual Studio to use Excel as the EXE container for the DLL. For the Command value, I browsed to Excel.exe and selected it.

I then set a breakpoint on DllMain by pressing Ctrl-B to bring up the New Breakpoints dialog, typing DllMain in the Function field, and when a "disambiguate symbol" window came up (apparently there are two overloaded DllMain functions), selecting both of them. My goal was to get control when Excel first loaded the DLL. I then pressed F5, and Excel started up. Visual Studio warned me that I didn't have symbols for Excel, but I already knew that. I opened my Smart Document and tried again to attach the XML schema using Data | XML | XML Expansion Packs. Sure enough, Visual Studio popped up with the breakpoint in my ATL-generated DllMain.

My goal at this point was to determine what other methods, if any, were being called in my DLL. I set breakpoints on all my methods in my ISmartDocument interface implementation to see if any were getting called. In fact, several were and stepping through I was able to find that I had a very common problem of the index being off by one. The control index passed to get_SmartDocXmlTypeName is one-based, but my C++ code was treating it as zero-based, so the call for the last element was returning E_INVALIDARG. I later added the ATLTRACE2 macro to the entry point to all my interface methods, making it easier to debug what was being called and when.

Change Notifications

For my solution, I needed to know when the user selected a row in the Items area of the spreadsheet, which meant he wanted to edit information about that task. I then grabbed information from the current row and populated my task pane, allowing him to enter information about the week's work and planned work for the upcoming week. When he changed something in the task pane, I would use the Excel object model to copy that new information back into the worksheet row. But how would I know when a new row is selected? There is a hard solution here, which involves subscribing to Excel events (easy in Visual Basic, not so easy in C++), but it turns out to be unnecessary. The ISmartDocument interface methods are called whenever the selection changes in Excel, so if anything, you get too much change notification. By updating my internal view of the schedule tasks (CScheduleTaskList) on each change, I was able to ignore row selection changes.

Security

As macro viruses and other malicious uses of scripting technology have become widespread, Office has been adapted in two ways: by increasing the default security settings to prevent running most unsigned, potentially malicious code, and by proliferating the number of security settings to allow more administrative control over which solutions can be run even if not digitally signed. This recognizes the reality that many Office solutions (macros embedded in Word and Excel documents) are already widely deployed in large organizations and making them fully secure will take time. If a new version of Office were suddenly to require all solutions to be digitally signed (the ideal situation), it would create a deployment barrier for many organizations. However, the proliferation of Office security settings makes security a complex topic that I'll merely touch upon in this article.

The same Office security settings that apply to macros and add-ins apply to Smart Documents. These include:

  • The user's macro security setting (High, Medium, or Low, set in Tools | Macro | Security; the default is High).
  • Whether the Smart Document is being loaded from a trusted location: a trusted file system directory (for example, the per-user or workgroup templates directory), a Web server on the corporate intranet, or a trusted Internet site.
  • Whether the Smart Document components are digitally signed and if so whether the publisher is set as a Trusted Publisher in Tools | Macro | Security | Trusted Publishers.
  • Whether "Trust All Installed Add-Ins and Templates" is selected in Tools | Macro | Security | Trusted Publishers.

By default, the security level in Tools | Macro | Security is set to High and Office prevents loading add-in DLLs (including Smart Document action DLLs) that are not digitally signed. Other combinations of the security settings just listed can allow Smart Documents to be loaded (possibly with a security prompt to the user), but the fact remains that the best way to guarantee security in your organization (and avoid potentially confusing security prompts to your users) is to digitally sign all your deployed solutions, including Smart Document components, with a digital certificate obtained from VeriSign or GTE CyberTrust. XMLSign.exe, a tool provided with the Microsoft Office 2003 Smart Document SDK, can be used to digitally sign the XML Smart Document manifest file.

In addition, if an Office Smart Document solution is being run from a Web server, both the Microsoft Internet Explorer and Office security settings will affect whether the solution will run. If an XML expansion pack manifest file is on a server that is neither on the Internet Explorer trusted sites list nor in the intranet zone, no attempt will be made to retrieve it, nor will users be prompted to add the site to the Internet Explorer trusted sites list. If an XML expansion pack manifest file is on a server that is trusted or on an intranet, whether the manifest will be loaded depends on whether it is signed. If it is signed and allowed to run, it is still subject to the user's Office security settings.

During Smart Document solution development, you can enable or disable XML expansion pack manifest file security checking by editing the REG_DWORD value "DisableManifestSecurityCheck" under the registry key:

HKEY_LOCAL_MACHINE\Software\Microsoft\Office\Common\Smart Tag

A value of 1 disables XML expansion pack manifest file security checking, while a value of 0 enables it. If the registry setting is set to 1 (security checking is disabled) when you try to reference an XML expansion pack manifest file, a dialog box appears warning the user that disabling security is dangerous. It also gives you the option to reenable the XML expansion pack security checking immediately by clicking Yes. You should only set this registry value to 1 on a development machine during development of the solution (where it may be inconvenient to digitally sign each component after every build), but be sure to reenable it for the final testing of signed components. It is highly inadvisable to disable XML expansion pack manifest file security checking on user machines.

While this article discusses the creation of Smart Document action DLLs in C++, it is possible to create managed action DLLs in Visual Basic .NET or C#. In that case, the .NET Framework security model applies, but it is beyond the scope of this article to describe it in detail.

Generating the Status Report

My sample code uses IDispatch against the Excel object model to access information from the Schedule Smart Document worksheet to generate the status report. I could have also developed some C++ code to drive the Word object model to build the status document, but I opted for another approach, which also has the advantage of demonstrating the new XML features in Word 2003. I generate the status report as an XML file and open it in Word using XSLT to provide the nice formatting. This approach has the advantage of making it easy to combine several XML files (say, status reports from an entire team) into one large report using XSLT, which would be more difficult to do if they all were persisted as Word documents.

After mulling a bit about how to implement XML persistence, I decided to define classes to manage the task list of items that would be on the status report. As the developer is updating information in the schedule, this task list is being created and updated by code in the Smart Document action DLL, responding to controls it has placed in the document actions work pane. Then, at the end, the developer clicks a button in the work pane to generate the XML status report. Each class is responsible for correctly persisting the information it stores into XML using the MSXML DOM. The classes that handle this are CScheduleTaskList and CScheduleTask. The Word XSL stylesheet I use to format this is ScheduleReport.xsl, available in the sample code. When you install the Smart Document solution, this XSLT file is installed and made known to Word, so that when you open the XML status report in Word, it automatically applies this stylesheet. You could also use the XML Data Views task pane in Word to select this stylesheet.

One XML processing directive worth mentioning is in the XML I generate (see CScheduleTaskList::SaveToXML for the code that adds this and generates the XML status report):

<?mso-application progid="Word.Document"?>

This directive tells Windows Explorer that the default application for this XML file is Word. You'll also notice that the files icon changes as a result of this.

Getting the Word stylesheet to display the status report nicely took a bit of time, so I won't go into the details here. I'll warn you, though, that if you look at the stylesheet I've provided, you may be astonished at how complex the Word XML schema appears. Much of this is boilerplate code, a fair amount of which I obtained by creating a sample status document in Word (setting the styles and fonts the way I wanted them) and then saving it in XML. I copied large parts of the generated XML into my XSLT stylesheet. However, I make no claim that this approach generates the minimal XML needed to correctly format the status report in Word.

I can offer a few suggestions for developing an XSLT based on my experience. Remember that XML is a very precise, case-sensitive language. It's not unlike C++ in that way, but the diagnostics you receive if you get something wrong (if in fact you receive any diagnostics) tend to be of the "can't apply stylesheet" variety rather than the detailed descriptions compilers provide. If you're making large-scale modifications to something that already works, do so incrementally and with good version control so you can roll back your changes. I encountered a problem where my stylesheet referenced the namespace ScheduleSmartDocument but due to an error in the action DLL, the namespace in the generated XML status report was ScheduleSmartDoc. As a result, Word simply displayed an empty document when I applied the XSLT to the XML status report because the XPATHs used in the stylesheet matched nothing in the XML.

Getting Word to automatically apply the correct stylesheet took some work. In the end, I had the manifest that installs the Smart Document action DLL also install the stylesheet (but as a different solution of type Transform which registers the XSLT in Word's library of known XSLTs). For Word to be aware of the stylesheet, the solution entry in the manifest must have a context attribute that references the Word XML namespace (https://schemas.microsoft.com/office/word/2003/wordml).

The final key step is to use the Smart Document URI (ScheduleSmartDocument in my example) as the XML namespace in the manifest, the stylesheet, and the generated XML status report document. In my case, I made up the name ScheduleSmartDocument for my namespace. You should probably use a namespace that is based on the unique URL for your organization.

Working backwards to the XML from a Word document that looks right is a good approach. However, be very careful about slicing and dicing the resulting Word output. For instance, I tried to reduce the complexity of the lists section (which describes list formats used in the document) because I needed just a simple, one-level bulleted list for my status report. After an hour of errors, I gave up and left a lot of the section intact as Word generated it, even though I'm sure there's a lot there I don't really need.

If you open a generated Word XML file in Visual Studio, you'll find that it's too complex for the Visual Studio .NET XML schema viewer. So I suggest that you first click the Data view when the XML file is open, forcing Visual Studio to parse the XML, then go back to the XML view to see a nicely formatted version (which, when opened in Notepad, is virtually unreadable).

My final suggestion is that you use a tool I found very useful while tuning the stylesheet. The MSXSL.EXE command-line tool allows you to apply a stylesheet to an XML file and see the generated XML output. You can download it from here.

Conclusion

I first set out to develop the action DLL in C++ because there are so many examples of Office coding in Visual Basic. However, doing even relatively simple tasks like referencing a cell value in Excel with C++ is significantly more complicated than it is using Visual Basic. See Figure 11 for my C++ equivalent of the ActiveSheet.Range("A1").Value statement in Visual Basic (note that the sample code has a different version of this method because I rewrote it to share code with the set value method). I've provided C++ code for doing a few of the things a Smart Document action handler might want to do with the Excel object model, but the example I've written really doesn't benefit at all from being written in C++. I'd encourage readers to seriously consider the costs of using C++ versus Visual Basic.

Figure 11 ActiveSheet.Range("A1").Value in C++

HRESULT CExcelWorkbook::HrCellValue( int iRow, int iColumn, CComVariant *pValue) { USES_CONVERSION; HRESULT hr; ATLASSERT(p); // IDispatch must be set prior to // calling this method. if (pValue == NULL) return E_INVALIDARG; pValue->Clear(); AssureDispidRange(); TCHAR tzRangeReference[20]; ATLASSERT(iColumn<26); // Not too hard to support > 26, // but not needed here // Handle special value for "CurrentRow" if (iRow == iCurrentRow && FAILED(hr=GetSelectedRow(iRow))) return hr; wsprintf(tzRangeReference, "%c%d", 'A'+(iColumn-1), iRow); CComVariant varRangeName(T2COLE(tzRangeReference)); CComVariant varRange; if (SUCCEEDED(hr=GetProperty1(m_dispidRange, &varRangeName, &varRange))) { ATLASSERT(varRange.vt == VT_DISPATCH); ATLASSERT(varRange.pdispVal); LPDISPATCH lpRange = varRange.pdispVal; CComDispatchDriver dispRange(lpRange); AssureDispidRangeValue(dispRange); hr = dispRange.GetProperty(m_dispidRangeValue, pValue); } return hr; }

Another option with Office 2003 is C#, which is attractive, but be sure you are aware of the performance issues with having unmanaged code (Office) calling managed code (your action handler assembly written in C# or Visual Basic .NET). It may just be that Visual Basic is the best way to go.

Smart Documents provide a remarkably powerful user interface paradigm for developers. You can harness all the power of Microsoft Office applications while providing custom UI and behavior seamlessly integrated into the Office task pane user interface. Combined with the ease of updating and deploying Smart Documents, I expect they will soon be as widely used in large organizations to simplify business processes as spreadsheets with macros are today.

For related articles see:
Smart Documents Development Overview
Smart Documents in Microsoft Office 2003
Microsoft Office Word 2003 XML: Memo Styles Sample
Visual Studio Tools for the Microsoft Office System

For background information see:
Office 2003 XML: Integrating Office with the Rest of the World by Simon St. Laurent (O'Reilly, 2003)
CodeNotes for XML by Gregory Brill (Random House, 2002)

Mike Kelly, freelance technical writer and consultant, has just returned from a year in France and is now based in Redmond, WA. He was a developer and development manager on the Microsoft Office team from 1993-2002. His article on Windows Installer was published in Microsoft Systems Journal (September 1998). He can be reached at mikekell@exmsft.com.