Microsoft Office Word 2003 Preview (Part 1 of 2)

 

Siew Moi Khor
Microsoft Corporation

March 2003

Applies to:
 Microsoft® Office Word 2003

Summary:  Preview the new features and enhancements of Microsoft Office Word 2003. Enhancements include Extensible Markup Language (XML) support, Smart Document solutions, and the Research Library. (22 printed pages)

Contents

Introduction
XML Support
Smart Document Solutions
Research Library
Conclusion

Introduction

Microsoft® Office Word 2003 includes many new features and improvements. This article gives a high-level preview of the following Word 2003 features:

  • Extensible Markup Language (XML) support
  • Smart Document solutions
  • Research library

Part 2 of the article covers:

  • Smart tag improvements and enhancements
  • Document workspace collaboration built on Microsoft SharePoint™ Products and Technologies
  • Range permissions and style lockdown
  • Reading mode
  • Miscellaneous features

This article assumes you are familiar with Word.

XML Support

Microsoft Word 2000 and 2002 have limited support for XML. Documents saved in HTML format in Word 2000 and 2002 have some embedded islands of XML data saved within them, but you cannot use either Word 2000 or Word 2002 to natively create or save XML documents.

Without this capability, it is difficult to associate meaning with document contents. Information is locked inside the .doc format, a binary file format that other applications cannot readily exchange. This hinders multi-device, cross-platform exchange. It also makes repurposing content outside Word difficult, hampers data mining, and complicates collaboration and document management. Nor does it lend itself to database and server-side processing.

XML removes the obstacles that stem from data being locked in binary files and adds semantic structure around content, which makes data retrieval much easier. XML support in Word 2003 is one of the most exciting breakthroughs for the Microsoft Office System. Through the use of XML, Word content becomes free-flowing, unlocked data that can be modified. Sharing information between documents, databases, and other applications is also simplified with XML.

Programmability

Word 2003 is an XML solution development platform allowing developers to build powerful structured Word documents and templates that leverage XML to capture information from end users' input. End users can continue to enjoy all the rich editing features that they expect, like auto correct, spell check, grammar check, change-tracking, and more. In short, users can continue using Word the way they always have. They do not need to know anything about XML to take advantage of its capabilities in Word 2003.

Word 2003 provides rich Visual Basic® for Applications (VBA) object model support for its XML functionality. There are also new object model events that allow developers to customize the Word 2003 editing environment. For example, you can hide XML from end users while taking full advantage of the power of XML in Word.

Additionally, a Word XML Content Development Kit (CDK) is available to The Microsoft Office System program participants to help developers quickly get up to speed on how to build XML solutions using Word 2003 as a development platform.

WordML

Word 2003 has a native XML file format called WordML that can be fully round-tripped without losing Word formatting. As a result, developers can easily detach presentation information from data. WordML can be transformed to separate the pure XML data from formatting as required, which also allows developers to reveal Word-specific content.

Using Word 2003 as an XML Editor for Customer-defined Schemas

Developers can use Word 2003 as an XML editor to apply XML elements to documents at design time (or even run time) based on the underlying, attached XML schemas (XSDs). This provides structure and validation enforcement between data and document. The document is effectively marked with visual tags that reveal the locations of the XML elements, as shown in Figure 1. The developer can toggle between the XML tagged view and normal view (without XML tags showing) at design time to compare the structure against the intended output. This can be done by selecting or clearing the Show XML tags in the document check box in the XML Structure task pane or by pressing CTRL+SHIFT+X.

Figure 1. Applying XML elements to a document from the XML Structure task pane

The structure of the XML can be displayed using the task pane. This is activated by clicking Task Pane from the View menu. The structure is derived from the schema applied to the document. The XML Structure pane displays the hierarchy of elements in a schema in the upper window. In the lower window, the user can view all elements or, to ensure validity of the document, can choose to view only those elements that are available at the current insertion point.

Any schema violations that occur during editing are indicated by red icons in real time, along with a description of the problem (as can be seen in Figure 2). For example, a red icon with a cross in the circle indicates that the content is invalid, and the icon with a slash means the location of that particular element is incorrect.

Figure 2. Schema violation red icon alerts

Figure 3 shows the schema validation options available in the XML Options dialog box. In the same dialog box, you can set the XML save and view options. The XML Options dialog box can be displayed by clicking XML Options at the bottom of the task pane as shown in Figure 1.

Figure 3. XML schema violation, save and view options dialog box

Attaching Schemas

A developer can specify the structure for a Word document by providing an XSD. Word 2003 supports the ability to attach an XSD to documents. A schema specifies which elements can be applied in a document, the order in which those elements can be applied, and the type of content the elements can contain (for example, a string or an integer).

Documents are validated by Word 2003 to ensure that they conform to the attached schema. If the document is not valid, Word 2003 can be set to prevent the user from saving it as XML. If the document is valid, Word 2003 saves the data and formatting using the Word XML document schema, or as raw data only, depending on user choices as shown in Figure 4.
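The three things a schema controls (which elements are allowed, their order, and their content type) can be illustrated with a minimal Python sketch. This is not how Word validates a document. It uses a hand-written rule table as a stand-in for a real XSD, and the element names (expenseReport, employee, amount) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an XSD: allowed child elements of the root,
# listed in their required order, each with the type its content must have.
RULES = {"employee": str, "amount": int}
ORDER = list(RULES)

def validate(xml_text):
    """Return a list of violations; an empty list means the document is valid."""
    root = ET.fromstring(xml_text)
    problems = []
    names = [child.tag for child in root]

    # 1. Which elements may appear.
    for name in names:
        if name not in RULES:
            problems.append(f"<{name}> is not allowed in <{root.tag}>")

    # 2. The order in which they may appear.
    positions = [ORDER.index(name) for name in names if name in RULES]
    if positions != sorted(positions):
        problems.append("elements are out of order")

    # 3. The type of content each element may contain.
    for child in root:
        convert = RULES.get(child.tag)
        if convert is None:
            continue
        try:
            convert((child.text or "").strip())
        except ValueError:
            problems.append(
                f"<{child.tag}> content is not a valid {convert.__name__}"
            )
    return problems
```

A host application could refuse to save as XML whenever the returned list is not empty, which mirrors the save-blocking option described above.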

Figure 4. Document save as type options including as an XML file

Word 2003 allows a selected Extensible Stylesheet Language Transformation (XSLT) to be applied to an existing XML file without any user or developer intervention. Word 2003 can track the associations and allow users to automatically apply the appropriate XSDs or XSLTs when it detects XML files that belong in certain categories (for example, news articles versus performance reviews). With solutions that leverage this functionality, a user opens an XML file, and Word 2003 automatically applies an XSLT to display it and attaches the XSD file to help enforce the validity of the XML file during editing.

Schema Library

In addition, Word 2003 allows developers to manually manage schemas through a new feature called the Schema Library (see Figure 5). This allows developers to apply friendly names or aliases to member schemas in order to make them easily identifiable.

Figure 5. The schema library dialog box

Schemas can be added to the Schema Library, which allows users to easily attach the schema to any of their documents. In addition, when Word 2003 opens an XML file in a namespace matching that of one of the schemas in the library, Word will automatically attach that schema to enable validation. XSLT files can also be added to the Schema Library, and then automatically applied when an XML file matching the namespace is opened in Word 2003. Therefore, XML files without the Word XML document schema can still be opened in Word 2003 in a rich way. Once users attach the schema to a document, the namespace is attached to that document. As long as the Schema Library has a pointer to the schema for that namespace, the schema will also be attached as the document is closed and reopened.
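The namespace matching described above can be pictured as a simple lookup. The following Python sketch is an illustration only, not Word's implementation: the library is modeled as a dictionary, and the namespace URIs, aliases, and file names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema library: namespace URI -> (friendly alias, schema file).
SCHEMA_LIBRARY = {
    "urn:example:expense": ("Expense Report", "expense.xsd"),
    "urn:example:review": ("Performance Review", "review.xsd"),
}

def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag  # ElementTree reports '{uri}localname'
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None

def find_schema(xml_text):
    """Return the library entry whose namespace matches the file, if any."""
    return SCHEMA_LIBRARY.get(root_namespace(xml_text))
```

A file whose root element is in urn:example:expense would get the "Expense Report" entry, while a file in an unknown namespace, or in no namespace, gets nothing attached.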

Schemas must adhere to the World Wide Web Consortium (W3C) XML Schema recommendation. You will not be able to attach an invalid schema to your Word documents.

XML Schema Solutions

You can package schemas with other related files (for example, a set of XSLTs) and add them to the Schema Library as whole solutions. You can also add Smart Document solutions to the Schema Library. For more information on Smart Document solutions, see the Smart Document Solutions section below.

Saving an XML Document

Word 2003 provides different options for saving an XML document.

  • Saving a document as XML in Word

    The XML format emphasizes document structure as well as a separation of data—the contents of a document—from the document's presentation. This enables developers to provide users with flexibility when saving and viewing documents.

    Word 2003 allows developers to save XML documents as data only or to use the default Microsoft Word XML document schema to preserve both data and formatting information, so that users can open the document multiple times in Word or can pass the document between applications without loss of formatting or data.

  • Saving with the Microsoft Word XML document schema

    The Microsoft Word XML document schema includes elements for a Word document's properties, formatting, and structure. When you save a file as an XML document, Word 2003 uses its own schema by default (the Microsoft Word XML document schema) to represent the document. If you have attached an additional schema, Word 2003 saves both the Microsoft Word XML document schema and the additional schema-specific markup. When saved in this manner, the document retains all markup, plus formatting, tracked revisions, comments, and other Word-related features.

    Documents that use a customer-defined schema can be saved as data-only XML, according to the schema's structure. Conversely, documents can be saved by using the full Word XML (WordML) schema, providing additional value to developers by maintaining formatting and other Word features such as document properties, revision history and tracked changes.

  • Saving as data only

    If you have attached an additional schema, you can choose to save the data only, without any Microsoft Word XML document schema markup. This removes all information that is specific to the Microsoft Word XML document schema, including formatting and other presentation instructions. Only the data that is within the root element or any of its children is retained.

    If you open the file in Word 2003 again, Word 2003 will check the Schema Library for an XSLT. If Word 2003 does not find an XSLT for that schema, it will apply a default XSLT so that it can display the document. If you specify a different XSLT when opening the data-only document, Word 2003 will display the XML document according to the XSLT instructions. You can also apply an XSLT when you save a document. If you have not attached a separate schema, when you save the document as data only, any data that has not been marked up will be lost. If you try to open the file again in Word, you will get an error.

  • Saving with a transform

    As shown in Figure 6, you can also apply a transform when using the Save As command to save a document as an .xml file. When you select the Apply transform check box, the Transform button becomes available so that you can browse to the transform you want to use.

    Figure 6. The apply transform option
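The data-only option described in the list above can be sketched in Python. This is a simplified illustration under stated assumptions, not Word's save logic: the two namespaces are hypothetical (urn:example:formatting stands in for presentation markup, urn:example:memo for a customer-defined schema), and the sketch assumes text lives only in the innermost custom elements.

```python
import xml.etree.ElementTree as ET

DATA_NS = "urn:example:memo"  # hypothetical customer schema namespace
DATA_PREFIX = "{" + DATA_NS + "}"

def nearest_data_descendants(node):
    """Find custom-schema elements below node, skipping presentation wrappers."""
    found = []
    for child in node:
        if child.tag.startswith(DATA_PREFIX):
            found.append(child)
        else:
            found.extend(nearest_data_descendants(child))
    return found

def data_only(elem):
    """Copy a custom-schema element without any presentation markup."""
    copy = ET.Element(elem.tag)
    children = nearest_data_descendants(elem)
    if children:
        for child in children:
            copy.append(data_only(child))
    else:
        # Innermost data element: keep its text, drop formatting wrappers.
        copy.text = "".join(elem.itertext()).strip()
    return copy

def save_data_only(xml_text):
    """Return XML containing only the custom-schema root and its children."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith(DATA_PREFIX):
        data_root = root
    else:
        candidates = nearest_data_descendants(root)
        if not candidates:
            raise ValueError("nothing is marked up, so no data would survive")
        data_root = candidates[0]
    return ET.tostring(data_only(data_root), encoding="unicode")

SAMPLE = """
<f:doc xmlns:f="urn:example:formatting" xmlns:m="urn:example:memo">
  <f:para>Unmarked heading</f:para>
  <m:memo>
    <f:para><m:to><f:bold>Ann</f:bold></m:to></f:para>
    <f:para><m:body>Lunch at <f:italic>noon</f:italic>.</m:body></f:para>
  </m:memo>
</f:doc>
"""
```

Calling save_data_only(SAMPLE) keeps the memo, to, and body elements with their text and discards both the formatting wrappers and the unmarked heading, which is the data loss the list warns about for content that has not been marked up.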

Opening an XML File

For Word 2003 to be able to open an XML file, the XML in the file must be well-formed. If an XML file is not well-formed, Word 2003 will not be able to open it. Word 2003 cannot perform any error correction to try to fix the file. Word 2003 also does not perform any validation while opening a file. Validation is done in the background only after the file is open.
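The difference between being well-formed and being valid can be shown with a short Python sketch that uses the standard library parser. It only illustrates the concept; it says nothing about how Word itself parses files.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, regardless of any schema."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A file such as `<a><b></a>` fails this check because the b element is never closed, so it could not be opened at all. A file such as `<a><b/></a>` passes, even though it might still violate an attached schema; that is a validation question, which is answered only after the file is open.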

In the Open dialog box, the Open button offers additional options in its drop-down list. One of the options is to open the file with a transform.

If a document is not in a format that Word 2003 understands, Word uses a default XSLT to create a basic view of the XML file. This process preserves all XML from the original file, including its namespace information. No special formatting is added. When Word opens a file by using the default transform, each text node is a separate paragraph. The XML tags are visible and the XML Structure task pane is open by default.
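As a rough picture of the "each text node is a separate paragraph" behavior, the following Python sketch flattens an arbitrary XML file into a list of paragraphs. It is an approximation for illustration only and does not reproduce Word's default transform; in particular, it drops whitespace-only text nodes, which is an assumption.

```python
import xml.etree.ElementTree as ET

def default_paragraphs(xml_text):
    """Return one paragraph per non-empty text node, in document order."""
    root = ET.fromstring(xml_text)
    return [text.strip() for text in root.itertext() if text.strip()]
```

For a hypothetical file `<order><item>Pen</item><qty>2</qty></order>`, this produces two paragraphs, one for each text node, with no formatting added.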

Users can then add data to the document. They can also apply styles and other formatting to the document. If they want to retain those formatting changes, they can save the document with the Microsoft Word XML document schema attached.

Smart Document Solutions

Smart Document technology in Word 2003 and Microsoft Office Excel 2003 enables the creation of XML-based applications that provide users with contextual content via the Office task pane. With Smart Documents, users can increase productivity because content is presented in the task pane as they navigate through a document, reducing the time spent searching for or filling in data, or looking for help. Users benefit from a Smart Document's ability to deliver relevant information and actions through the use of an intuitive task pane that synchronizes content based on the user's current location within the document. The task pane presents users with supporting information, such as data that corresponds to the document, relevant help content, calculation fields, hyperlinks, or any number of controls. An example is shown in Figures 7 and 8.

Figure 7. A Smart Document with the task pane displayed on the right

Figure 8. The same Smart Document as shown in Figure 7 but with data from the task pane inserted into the template

Word 2003 and Excel 2003 documents can be designed with an underlying XML structure that ensures that users are entering and viewing valid information. At the same time, the XML structure gives developers the ability to build the document with context-specific help and supporting information.
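Conceptually, context-specific content comes down to a lookup keyed on the XML element that contains the insertion point. The Python sketch below illustrates that idea only. It does not use the Smart Document object model, and the element names and task pane entries are hypothetical.

```python
# Hypothetical mapping from XML element name to task pane content.
TASK_PANE_CONTENT = {
    "expenseType": ["Help: choose a category", "List: Travel, Meals, Lodging"],
    "amount": ["Help: enter the amount in your local currency"],
}

def task_pane_for(element_path):
    """Return content for the innermost element in the path that defines any.

    element_path lists the element names from the document root down to the
    element that currently contains the insertion point.
    """
    for name in reversed(element_path):
        if name in TASK_PANE_CONTENT:
            return TASK_PANE_CONTENT[name]
    return []
```

With the insertion point inside expenseReport/item/amount, the lookup returns the amount help text; inside an element with no associated content, it returns an empty list and the task pane would show nothing specific.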

Smart Documents build on the smart tag feature that was introduced in Microsoft Office XP, and extend this concept using a document-based metaphor aimed at simplifying and enhancing the user experience. Developers can build on rich XML-based documents to create Smart Document solutions that can be deployed and subsequently updated from a server once the initial document or template has been opened on the client. This makes distribution a non-issue and allows easy maintenance.

Using Smart Document Solutions

Smart Documents enhance many business processes, ranging from those intended to ensure that users are working on the latest copy of a document, to more robust interactive processes that provide guidance, instruction, data, and formatted content delivered through the task pane and corresponding actions. Smart Document scenarios can include varying tasks such as enabling an expense report in Excel, producing a sales proposal generator and managing a legal compliance form process in Word.

Before Smart Documents, users who needed to fill out an expense report had to start with the latest spreadsheet file. If they did not already have it, they would have to navigate to an intranet page or file share, or use a copy that had been e-mailed to them or already existed on their machine. If users already had a copy, it was their responsibility to ensure it was the most recent copy, requiring that they check the usual places. Once users located the current document, they filled it out using the on-sheet instructions provided, perhaps in cells on the side or as in-cell comments. Next, users would submit the document for approval, which might entail saving the document back to a file share or e-mailing it directly to a supervisor.

Smart Documents can make processing such reports easier and more efficient while reducing mistakes and confusion. In a Smart Document solution, the user still has to get the expense template initially, but only once, because the solution includes information about how to update itself. This way, if the company ever changes the format or logic associated with the expense report solution, users need only open the existing version, and the Smart Document solution will maintain the latest functionality. Filling out the expense report is also simpler and more intuitive. As users enter information, the task pane is automatically updated to offer relevant content that may be instructional or functional. For example, if users enter expense information into a cell that requires it to be classified by type, the task pane can explain each expense category type, describe which expenses are applicable, and offer to insert a type automatically. As another example, when users enter an expense amount and specify that it was not in the local currency, the task pane displays current currency exchange rates via an XML Web service to ensure precise accounting and streamline calculation. Another feature might help users calculate driving expenses by including a mileage calculator that uses a point-to-point mapping XML Web service.

When users have completed their form, they click the Submit button in the task pane. The Smart Document first validates the report and then routes the complying document directly to the appropriate supervisor. After a report is submitted, the user's copy has the Submit button grayed out, and the report is marked as submitted. The supervisor's copy offers the choice of clicking the Approve Expense Report button or the Deny Expense Report button. If the supervisor clicks the Approve Expense Report button, a confirmation e-mail is sent automatically to the user, and the expense report data is submitted to the expense report processing service. If the report is denied, the expense report with the supervisor's comments is automatically routed back to the user. The entire process is performed and managed with logic contained within the Smart Document solution.

Building Smart Documents

Developers can leverage an existing document or start from scratch to build a Smart Document solution. The document must be attached to an underlying XML schema, which is then used as the basis for marking it with corresponding XML elements.

To develop a Smart Document DLL using the Smart Document object model, you first need to reference the Smart Document type library, Microsoft Smart Tags 2.0 Type Library. The Smart Document object model is similar to the smart tag programmatic object model, which is why it lives inside the smart tags type library. Once you set a reference to the Microsoft Smart Tags 2.0 Type Library, you write code to implement the Smart Document application programming interface (API), the ISmartDocument interface.

Smart Document code can be written in Microsoft Visual Basic® 6.0, Microsoft Visual Basic™ .NET, Microsoft C#™ .NET, or Microsoft Visual C++®. The code developers write will manipulate the document directly or interact with server-side processes, such as retrieving data or routing the document (or its contents) to a back-end system to complete the solution.

In Office XP, you could create simple smart tags without writing code by using XML-based Microsoft Office Smart Tag List (MOSTL) files. Smart tags created using MOSTL have smart tag actions that are limited to hyperlinks only. In The Microsoft Office System, you can also create simple Smart Document solutions by using extensions to the MOSTL file schema for The Microsoft Office System. These types of Smart Document solutions are limited to hyperlink, separator, button, label, image, and embedded help content only, and buttons are limited to hyperlink actions only.

An XML file called an XML expansion pack is one of the final building blocks of a Smart Document solution, and an important one. A typical Office solution deployment may require end users to run .msi files, run regsvr32.exe on COM add-in DLLs, run batch files, or perform other similar deployment steps. The Smart Document solution deployment mechanism takes a different approach. With The Microsoft Office System, users can interact with Word 2003 and Excel 2003 user interface components to quickly reference and load a Smart Document solution's component files, while at the same time respecting the user's Office security settings. The solution developer sets this up through the use of an XML expansion pack (see Figure 9).

Figure 9. The XML expansion pack dialog box

An XML expansion pack allows Office solution developers to reference one or more solution components that are installed as part of a complete Word 2003 or Excel 2003 solution. The XML expansion pack manifest file is defined using an XML schema. The XML expansion pack manifest file lists components that should be downloaded to make a Smart Document fully functional.

These solution components can include any type of file, including but not limited to, schemas, transforms, DLLs, image files, other XML files, text files, and so on. In addition to being able to specify files to download, the XML expansion pack manifest file also specifies setup information related to these files, including which registry keys to write to enable them, and whether or not they need to be installed in a particular place.

The XML expansion pack manifest file connects the Word or Excel document template, which users access initially, to all the supporting files (such as DLL, XSD, or XSL files) associated with a Smart Document solution. The developer simply needs to create the XML expansion pack manifest file and place it with the solution's supporting files on the server. When a user opens a Smart Document, Office checks the document's internal metadata (specified in the Custom Properties tab of the Properties dialog box when the solution is created) to locate and inspect the assigned XML expansion pack manifest file. This ensures that the entire solution is available and operational, and that any new files are downloaded as needed. This dramatically simplifies deployment and maintenance for the Smart Document solution developer.
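The "download new files as needed" step can be sketched as a comparison between what the manifest lists and what is already installed. The Python below is a hedged illustration only. The manifest layout, the element and attribute names, and the integer version numbers are all invented for this example; the real manifest format is defined by the XML expansion pack schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest; these element and attribute names are invented
# and do not follow the real XML expansion pack schema.
MANIFEST = """
<manifest>
  <file name="expense.xsd" version="2"/>
  <file name="expense.dll" version="5"/>
  <file name="help.xml" version="1"/>
</manifest>
"""

def files_to_download(manifest_text, installed):
    """Return the names of files that are missing or older than the manifest.

    installed maps a file name to the version number already on the client.
    """
    needed = []
    for entry in ET.fromstring(manifest_text).findall("file"):
        name = entry.get("name")
        version = int(entry.get("version"))
        if installed.get(name, 0) < version:
            needed.append(name)
    return needed
```

For a client that already has expense.xsd at version 2 and expense.dll at version 4, the function reports expense.dll (out of date) and help.xml (missing), and leaves the schema alone.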

Additionally, a Microsoft Office Smart Documents Software Development Kit (SDK) is available to The Microsoft Office System program participants to help developers quickly get up to speed on how to build Smart Document solutions.

Benefits of Smart Document Solutions

Smart Documents result in a new breed of smart-client solutions that developers can deliver to Office users. These solutions build on users' familiarity with Word and Excel documents, and combine distributed, Web-based computing with open-standard technologies such as XML and XML Web services. The following are some advantages that Smart Document solutions offer developers:

  • Better document-based solutions

    Smart Documents allow developers to provide a better overall user experience than that of traditional Office document-based applications. Developers can create solutions that offer richer user interfaces, provide content that is relevant to a situation as users work within a document, and extend documents to seamlessly integrate with other processes and systems.

  • Programmable task pane

    Smart Documents offer a programmability model for task panes in the Microsoft Office System, allowing developers to provide appropriate content that has relevance within a solution. The task pane can be programmed to contain any variation of data, help, controls—such as buttons, check boxes, option buttons, and list boxes—hyperlinks, images, and free text. Developers can also manage task pane events to perform actions on behalf of the user.

  • Simplified deployment and update mechanism

    Smart Document solutions are deployed by using the XML expansion pack, a new deployment mechanism that enables installation by opening a document that was obtained via e-mail or downloaded from a Web or file server. Smart Documents can automatically update themselves from any trusted server location, making upgrades a non-issue. Developers never have to install or manage the client-side code directly.

  • Higher security standards

    Smart Document solutions implement a high standard of security that requires that the document comes from a trusted source. Downloaded solutions are subject to Office security settings and can come only from trusted sites, and the XML expansion packs must be signed. Even if these requirements are met, the user is still prompted to decide whether or not to initiate the download and use a Smart Document solution. If the user declines, nothing is executed on the user's machine.

  • XML support

    The XML support in Word 2003 and Excel 2003 documents enables Smart Documents, which make use of XML. Developers can easily leverage existing XML tools, data sources and their own XML skills to build robust Smart Document solutions.

Research Library

The new Research Library feature in the Microsoft Office System makes it easier to search for relevant information and integrate that data into Office documents. The Research task pane (see Figure 10) is available in Word 2003, Excel 2003, Microsoft Office PowerPoint 2003, Microsoft Office Outlook 2003, Microsoft Office Publisher 2003, Microsoft Office Visio 2003, Internet Explorer, and Microsoft OneNote 2003. Because the Research Library works within these applications, users can access Research Library services while they work on Office documents.

Figure 10. The Research task pane (the task pane on the right) with search results returned to the pane and inserted into a document

Research sources that are built into the Microsoft Office System provide easier access to reference tools such as a dictionary, thesaurus, translation, encyclopedia, and some Web sites in multiple languages. Figure 11 shows the research services enabled by default in the Microsoft Office System, while Figure 12 shows the available built-in research services options that you can choose from.

Note: To display the Research task pane, on the Tools menu, click Research, or click the Research Library icon on the toolbar. To search for information, type keywords in the Search for field. Alternatively, hold down the ALT key and click the word in the document that you want to look up.

Figure 11. List of built-in research sources enabled by default

Figure 12. List of built-in research sources options

Additionally, the Research Library, which administrators can control at a corporate level, is extensible, so developers and third-party information providers can create their own research services. This means developers can build custom research sources that integrate information from a company's back-end database sources, thus making business-specific data available to users. The data sources can be local or remote, behind a corporate firewall or on the Internet, including SharePoint sites. Because the Research Library feature is supported in many Office applications, as mentioned earlier, a single custom research service can reach users across multiple applications in Office.

In addition, the Research Library integration of smart tag technology allows developers to create custom actions like transforming, inserting or grabbing data from live feeds as shown in Figure 13. Smart tag integration in the Research Library feature is supported in Word 2003, Excel 2003, PowerPoint 2003, Outlook 2003, and Visio 2003.

Figure 13. Custom smart tag actions integrated into a Research service

Actions available in the Research task pane can be customized to integrate smart tags, hyperlinks, and textual data. Unlike the results of typical Web searches, this data can be inserted into Office documents as structured XML.

The Research Library feature provides this functionality using formatted XML packets. Once a service is registered on a user's machine, the user can generate queries, which Office sends on to the service via Simple Object Access Protocol (SOAP). A research service receives a basic query packet, an XML string that adheres to the query protocol schema. Then the research service processes the query and returns the results of the query by using one of the authorized results schemas.

Users must opt to register a service from a research service provider on their machine as shown in Figure 14. Once registered, users can initiate searches by using the research service. When users initiate searches, Office sends query packets to your service. When Office receives a response from a service with the results of the search, the Office application from which the user initiated the search displays the results in the Research task pane as shown in Figure 10 above.

Figure 14. Users given the option to register a custom Research Library service

A custom Research Library service must respond to a set of predefined Research Library interfaces. The two primary interfaces are the Registration and Query interfaces:

Registration

Before a user can use a custom Research Library service, it must be registered with Office. Registering allows Office to get all the necessary service data and perform dynamic updates. Office also helps deploy the custom Research Library service by advertising the service to end users, performing automatic registration, and performing user-initiated installation. If the Registration interface is not implemented, providers will have to write the registration keys themselves.

The Registration interface returns an XML document based on the registration schema that defines the origin of the research XML Web service, additional licensing information, and a pointer to the Query interface. Office calls the Registration interface to install the information that it needs to use the custom Research Library service.

When you provide your end users with a URL to your Research Library service, they enter the URL into the Add Service dialog box Address field to search for your service as shown in Figure 15. If the search returns results, users will have the option to register your service on their machine as shown in Figure 14 above. For this to function properly, you must have a SOAP function called Registration located at the URL that you give your users.

Figure 15. A user adding a service by entering the URL sent by a service provider in the Address field

The Registration function takes a String and returns a String.

In C#, it looks as follows:

  [WebMethod] public string Registration(string registrationXml)

In Visual Basic .NET, it is:

  <WebMethod> Public Function Registration(reg as String) as String

Query

The second step in setting up a Research Library service is to develop a way to process the queries that you receive from your users. The Query interface is the primary interface that Office calls when the user interacts with the Research task pane. This includes any search on a term, or a request through a form that collects input from a user. The Query function allows Office to send queries to the server, and allows the server to respond with rich results. It is strictly a communications protocol, unrelated to the back end.

To process the queries that you receive from users of your service, you need to create a SOAP function named Query. This function, as shown in the syntax example below, takes a String and returns a String.

In C#, it looks as follows:

  [WebMethod] public string Query(string queryXml)

In Visual Basic .NET, it is:

  <WebMethod> Public Function Query(queryXml As String) As String

The Office application used to gain access to your service calls this Query function, and passes in a string comprised of XML data that adheres to the XML query packet schema. The Query function that you create needs to process this string and then return a string that adheres to the response schemas.

The Registration and Query interfaces are developed as XML Web services that return XML packets based on predefined schemas included with the Research task pane SDK.
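The string-in, string-out shape of the two functions can be sketched in Python. This is an illustration of the pattern only, not a working research service. It omits the SOAP layer entirely, and the packet element names (ServiceInfo, QueryUrl, Term, Response, Result), the URL, and the glossary data are all hypothetical. A real service must follow the registration, query, and response schemas in the SDK.

```python
import xml.etree.ElementTree as ET

QUERY_URL = "http://example.com/research/query"  # hypothetical endpoint
GLOSSARY = {  # hypothetical back-end data
    "xsd": "XML Schema definition",
    "xslt": "XSL Transformations",
}

def registration(registration_xml):
    """Return an XML string that names the service and points to Query.

    The incoming request string is ignored in this sketch.
    """
    info = ET.Element("ServiceInfo")
    ET.SubElement(info, "Name").text = "Example Glossary"
    ET.SubElement(info, "QueryUrl").text = QUERY_URL
    return ET.tostring(info, encoding="unicode")

def query(query_xml):
    """Parse the query packet, look up the term, and return a response packet."""
    request = ET.fromstring(query_xml)
    term = (request.findtext("Term") or "").strip().lower()
    response = ET.Element("Response")
    if term in GLOSSARY:
        ET.SubElement(response, "Result").text = GLOSSARY[term]
    return ET.tostring(response, encoding="unicode")
```

Because both functions exchange plain strings, the service logic stays independent of the back end: the lookup in GLOSSARY could be replaced by a database call without changing either signature.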

Updating Registration Settings

When you register your service by implementing the Registration function, the Research Library framework periodically asks providers to update their registration information. Information is updated at the provider level, and the update happens automatically. By contrast, a service that is registered through a custom installation requires a custom update application whenever its registration settings change.

Additionally, a Microsoft Office Research SDK is available. For more information, see the Office Developer Center.

Conclusion

Word 2003 has many new, innovative, and enhanced features. XML support in Word 2003 makes it easier for Word to integrate with other systems. Word content has now become free-flowing, unlocked data, which can be manipulated and easily repurposed. The innovative Smart Document technology enables the creation of XML-based applications that provide users with contextual content and relevant help. Smart Documents make users more productive because they reduce the time needed to search for or fill in data, or to look for help.

The new Research Library feature in the Microsoft Office System lets users search for information from within an Office application, and makes integrating that data into Office documents straightforward. By harnessing the Research Library's extensibility, developers can build custom research sources that integrate information from back-end database sources and make business-specific data readily available to users.

In Part 2 of the article, we will look at more Word 2003 features such as collaborating using Document Workspace sites built on SharePoint Products and Technologies, smart tag improvements and enhancements, range permissions, style lockdown, reading mode, and other features.

Acknowledgement

Thanks to Jeff Reynar, Octavian Timofte, Roberto Taboada, Martin Sawicki, Brian Jones from the Word 2003 team, and Charles Maxson, an independent consultant, for their contributions and help in writing this article.