Programming Microsoft Word 2002 and Excel 2002 with Microsoft Visual C#

Kerry Loynd and Siew-Moi Khor
Microsoft Corporation

October 2002

Applies to:
    Microsoft® Office XP
    Microsoft Visual C#™

Summary: Learn about COM interoperability between Microsoft C# and large, complex COM servers. This article shows how to prepare the Office XP COM objects and how to use them in a C# program, and offers some tips on understanding why things have to be done in certain ways. (16 printed pages)

Download or browse the odc_offcs.exe

Contents

Introduction
System Requirements
Some Quick .NET Basics
Using the Office XP Primary Interop Assemblies
Code Walkthroughs
   Example 1. Starting the Word Application Object
   Example 2. Creating a New Word Document
   Example 3. Opening an Existing Word Document
   Example 4. Using Events Exposed by Word
   Example 5. Animating the Office Assistant
   Example 6. Default Properties and Indexed Properties
Conclusion

Introduction

One of the most powerful features of Microsoft® Office XP is that its components, such as Microsoft Excel 2002 and Microsoft Word 2002, expose their functionality as Component Object Model (COM) interfaces. It is relatively easy to access these COM interfaces from Microsoft Visual Basic® 6.0, but more difficult if you want to use those interfaces and co-classes from C or C++. However, Microsoft .NET and Microsoft C#™ or Microsoft Visual C++® with Managed Extensions can use the COM objects exposed by Office XP almost as easily as with Visual Basic 6.0.

This article is written assuming you are programming for Office XP. To get the most out of this article, you should already be familiar with or have access to the Office XP programming documentation, although hyperlinks to the MSDN® documentation are used throughout this article.

The documentation tells you about the interfaces and co-classes offered by Office XP, and how to use them. The documentation is expressed in Visual Basic, so you will need to make mental translations of the signatures for methods and events. This article will show you how to do that, how to prepare the Office XP COM objects, and how to use them in a C# program. Finally, the article offers tips on understanding why things have to be done in certain ways. You should be able to use this information to help you take advantage of other COM servers using C# as well.

System Requirements

To run the sample, you will need the following software installed on your computer:

Some Quick .NET Basics

.NET technology introduces the concept of assemblies as the fundamental executable unit. Assemblies may be executables (.exe) or dynamic-link libraries (.dll), and may consist of multiple files. An assembly contains all the information about code, types and resources needed for a program to run.

In order to consume the COM objects exposed by Office XP, you need to use the primary interop assemblies (PIAs) so that the C# compiler can find out about the interfaces and co-classes exposed by Office XP.

This article will not go into details about interop assemblies or PIAs. For more information about PIAs, see Primary Interop Assemblies (PIAs). The Using the Office XP Primary Interop Assemblies section of this article provides the download location for the Office XP PIAs.

It is often instructive to look at the type information that has been exposed. Microsoft Visual Studio® .NET provides a tool called ILDASM to list the type information encapsulated in an assembly. Figure 1 shows a partial screen shot of the information that ILDASM shows for the Word 2002 primary interop assembly.

Note   To open the ILDASM tool, click Start, point to Programs, point to Microsoft Visual Studio .NET, point to Visual Studio .NET Tools and click Visual Studio .NET Command Prompt. In the Visual Studio .NET Command Prompt window, type ildasm. This will open the ILDASM window. To view the type information for a particular interop assembly or PIA, in the File menu, click Open. Browse to the location of the interop assembly or PIA, select the interop assembly or PIA you want to view and click Open.

Figure 1. Using the ILDASM tool for viewing interop assembly type information

As Figure 1 shows, the assembly is in Microsoft.Office.Interop.Word.dll, and the interfaces and co-classes are enclosed in the Microsoft.Office.Interop.Word namespace. The Application co-class has been expanded so you can see that it extends (is derived from, in C++ and C# parlance) _Application, and that it implements the ApplicationEvents2_Event interface in Word. All of this is discussed in detail later.

Using the Office XP Primary Interop Assemblies

Before you can run the examples included in this article, you need to install the Microsoft Office XP Primary Interop Assemblies (PIAs) on your computer. Once you have installed the PIAs, they must be kept where they are accessible to the compiler and to your finished programs. For details, see the Readme file included in the Office XP PIAs download and the "Assembly Location" article in the ".NET Framework Developer's Guide". (To view it, click Start, point to Programs, point to Microsoft .NET Framework SDK, and click Documentation.)

For demonstration purposes in this article, the Office XP PIAs are extracted to the following folder: C:\Office XP PIAs\. They are then installed into the global assembly cache (GAC) and registered (see the Microsoft Office XP Primary Interop Assemblies (PIAs) Readme file for details on how to do this).

The C# compiler can be invoked from the command line by typing the name of its executable file (csc.exe). Once the PIAs are installed and registered, they can be referenced on your csc command line just like any other assembly, using the /r option. If you put the PIAs in an inaccessible place, your program will fail at run time with an exception of type System.IO.FileNotFoundException or System.TypeInitializationException, telling you which assembly could not be loaded.

Later on, in the How to build and run example1.cs section, you'll see how to build a C# program and reference a PIA from the command line.

The samples included in this article use three Office XP PIAs:

  • Microsoft.Office.Interop.Word.dll
  • Office.dll
  • Microsoft.Office.Interop.Excel.dll

Code Walkthroughs

Before we walk through the code samples, first download the odc_offcs.exe file and extract the samples into a directory called C:\CSOfficeSamples or that of your choice. For easy referencing, in all examples below, we will assume the samples are located at C:\CSOfficeSamples.

The download contains five samples for Word 2002 (example1.cs, example2.cs, example3.cs, example4.cs and example5.cs) and one sample for Excel 2002 (excel1.cs). The corresponding sample builds (example1.exe, example2.exe and so forth) for the sample source files have also been included for you to try out.

All code samples are extensively commented.

Example 1. Starting the Word Application Object

The first example is very simple, and merely shows how to start Word 2002, leave it open for a few seconds, and then close it. Let's look at key lines of code in the example1.cs source file. The code snippet below allocates the Application object and its base class objects, but under the covers, CoCreateInstance gets called.

Application app = new Application();

The Quit method of the Application class takes three parameters: saveChanges, originalFormat, and routeDocument. These optional parameters can be omitted in Visual Basic code, but C# does not have optional parameters; all three must be passed to Quit at the point of call. We can get the same effect in C# by assigning Missing.Value (from the System.Reflection namespace) to each optional argument, which tells the Quit method to use its default behavior. In this case, the defaults mean: do not save the document, retain the document's original format, and do no routing.

object saveChanges = Missing.Value;
object originalFormat = Missing.Value;
object routeDocument = Missing.Value;
app.Quit(ref saveChanges, ref originalFormat, ref routeDocument);

Notice that all three parameters are marked with the ref keyword. These methods were originally written to be used from Visual Basic, which passes parameters by reference by default, so the parameters must be passed by reference here, too.
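
The mechanics of Missing.Value and ref can be tried without Word installed. The following is a minimal sketch, not code from the samples: Describe is a hypothetical stand-in for a COM method with one optional parameter, and the strings it returns are invented for illustration.

```csharp
using System;
using System.Reflection;

public class MissingDemo
{
    // Hypothetical stand-in for a COM method with an optional Variant
    // parameter; it is not part of the Word object model.
    public static string Describe(ref object saveChanges)
    {
        if (saveChanges is Missing)
        {
            return "use default behavior";
        }
        return "explicit value: " + saveChanges;
    }

    public static void Main()
    {
        object omitted = Missing.Value;   // plays the role of an omitted argument
        object explicitValue = false;     // a boxed Boolean

        Console.WriteLine(Describe(ref omitted));        // use default behavior
        Console.WriteLine(Describe(ref explicitValue));  // explicit value: False
    }
}
```

Each argument has to live in a variable of type object before the call, because ref needs a variable whose type matches the parameter type exactly.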

How to build and run example1.cs

Before you can run the sample, you first need to build the example1.cs sample. To do this in the Visual Studio .NET Command Prompt window:

  1. Go to the C:\CSOfficeSamples directory, or wherever you saved the sample. For example, type cd C:\CSOfficeSamples at the command prompt, as shown in Figure 2 below.

  2. Next build the example1.cs by typing csc /r:"C:\Office XP PIAs\Microsoft.Office.Interop.Word.dll" example1.cs after the command prompt as shown in Figure 2.
    (If you saved the Office XP PIAs somewhere else, replace the following drive and installation path with the correct values: csc /r:Drive:\<Installation Path>\Microsoft.Office.Interop.Word.dll example1.cs).

    Note   The csc command line compiles the example1.cs source file to produce the example1.exe executable file. The executable file is saved in the same folder as example1.cs. The /r command-line option references Microsoft.Office.Interop.Word.dll. If the path given to /r is wrong, the compiler reports that the referenced file could not be found. If the build succeeds but the PIA (or whichever assembly you reference) cannot be located when the program runs, the program fails at run time with an exception of type System.IO.FileNotFoundException or System.TypeInitializationException, telling you which assembly could not be loaded.

    Figure 2. Building a source file using the command line

  3. To run example1.exe, located in the same folder as the example1.cs source file, double-click it.

This sample is a very simple program and doesn't do anything particularly interesting; let's take a look at Example 2.

Example 2. Creating a New Word Document

In example2.cs, Word 2002 is started using the Application object as in Example 1, and then a new document is added to the collection of open documents that is encapsulated in the Application.Documents property. The first interesting bit of code is when the new document is created:

object template = Missing.Value;
object newTemplate = Missing.Value;
object documentType = Missing.Value;
object visible = true;
_Document doc = app.Documents.Add(ref template,
                                  ref newTemplate,
                                  ref documentType,
                                  ref visible);

All of the parameters for the Add method are optional, so we must give each one either a meaningful value or Missing.Value. In this case, because we are neither using nor creating a template, and because this is just a plain-text document, the first three arguments (template, newTemplate, and documentType) are set to Missing.Value. Because we want the document to be visible during this example, the value true is assigned to the visible argument.

You may be wondering how to determine whether a Boolean value should be assigned to the visible object. This is why it is important to have access to the Word 2002 programming documentation. If you look at the description in the Word 2002 object model documentation for the Documents.Add method, you will see the following:

Visible Optional Variant. True to open the document in a visible window. If this value is False, Microsoft Word opens the document but sets the Visible property of the document window to False. The default value is True.

Note   To view the Word 2002 Visual Basic documentation for the Documents.Add method, in the Tools menu of Word 2002, point to Macro, and click Visual Basic Editor. Once you are in the Visual Basic Editor, press F2 to activate the Object Browser or F1 for Help. Search for "Documents" or "Documents.Add". You can also find similar documentation on MSDN.

This raises the question: why does the PIA expect the parameters of the Add method to be typed as object, when the Documents.Add documentation shows type Variant? The answer is that type Variant is automatically marshaled as the .NET Object type, which maps to the C# object type. In this case, assigning true to visible boxes the Boolean value as an object, which is then passed to the Documents.Add() method.
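
Boxing can be observed in plain C#. The sketch below uses a hypothetical stand-in method, RuntimeTypeOf, which is not part of any PIA, to show that the callee still sees the original type inside the object.

```csharp
using System;

public class BoxingDemo
{
    // Hypothetical stand-in for a PIA method whose Variant parameter
    // is exposed to C# as "ref object".
    public static string RuntimeTypeOf(ref object arg)
    {
        return arg.GetType().FullName;
    }

    public static void Main()
    {
        object visible = true;   // boxes a System.Boolean
        object count = 42;       // boxes a System.Int32

        Console.WriteLine(RuntimeTypeOf(ref visible));  // System.Boolean
        Console.WriteLine(RuntimeTypeOf(ref count));    // System.Int32

        bool unboxed = (bool)visible;  // unboxing needs an explicit cast
        Console.WriteLine(unboxed);    // True
    }
}
```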

The next interesting line of code is:

doc.Words.First.InsertBefore

Using the document interface that was returned from the call to the app.Documents.Add() function, some text is added to the beginning of the document. There is nothing surprising or unusual here.

Let's look at the next interesting piece of code, which saves the document:

object fileName = Environment.CurrentDirectory+"\\example2_new";
object optional = Missing.Value;   // reused for every argument we leave at its default
#if OFFICEXP
doc.SaveAs2000( ref fileName,
#else
doc.SaveAs ( ref fileName,
#endif
       ref optional,
       ref optional,
       ref optional,
       ref optional,
       ref optional,
       ref optional,
       ref optional,
       ref optional,
       ref optional,
       ref optional);

The first thing to notice is that the string holding the file name is assigned to the object variable fileName. A string is already a reference type, so no boxing takes place, but the variable must be typed as object so that it can be passed by reference. Next, the code calls the SaveAs2000 method if OFFICEXP is defined, or SaveAs if OFFICEXP is not defined. As you have probably guessed, the signature of the SaveAs method was changed between Office 2000 and Office XP.
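
The effect of the OFFICEXP symbol can be checked in isolation. This is a minimal sketch with no Office dependency: the class ConditionalDemo and its method are hypothetical and only report which method name the conditional block selects.

```csharp
using System;

public class ConditionalDemo
{
    // Returns the name of the method the sample would call in this build.
    public static string SaveMethodName()
    {
#if OFFICEXP
        return "SaveAs2000";
#else
        return "SaveAs";
#endif
    }

    public static void Main()
    {
        Console.WriteLine(SaveMethodName());
    }
}
```

Compiled with the /d:OFFICEXP option, the program prints SaveAs2000; compiled without it, the program prints SaveAs. Only one branch is ever compiled, so a call that does not exist in the referenced assembly cannot cause a build error as long as it sits in the inactive branch.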

How to build and run example2.cs

To build example2.cs, in the Visual Studio .NET Command Prompt window:

  1. In the C:\CSOfficeSamples directory or wherever you have saved the example2.cs, type csc /r:"C:\Office XP PIAs\Microsoft.Office.Interop.Word.dll" /d:OFFICEXP example2.cs after the command prompt, as shown in Figure 3.
    (If you saved the Office XP PIAs somewhere else, replace the following drive and installation path with the correct values: csc /r:Drive:\<Installation Path>\Microsoft.Office.Interop.Word.dll /d:OFFICEXP example2.cs).

    Figure 3. Building example2.cs using the command line

  2. To run example2.exe, located in the same folder as the example2.cs source file, double-click it.

Example 3. Opening an Existing Word Document

The Documents.Open method, like the Document.SaveAs method, went through a signature change from Office 2000 to Office XP, so the call is wrapped in an #if directive. The Open method is just as simple to call as the SaveAs method, as shown below:

    object fileName = Environment.CurrentDirectory+"\\example3";
    object optional=Missing.Value;
#if OFFICEXP
    _Document doc = app.Documents.Open2000( ref fileName,
#else
    _Document doc = app.Documents.Open( ref fileName,
#endif
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional,
                         ref optional);

The optional parameters are documented in the description of the Documents.Open method in the Word 2002 Visual Basic reference in Help and on MSDN.

The interesting code in this example is where the text in the opened document is highlighted and then cut:

    object first=0;
    object last=doc.Characters.Count;
    Range r = doc.Range(ref first, ref last);
    r.Select();
    Thread.Sleep (2000);
    r.Cut();

The integer values of the first and last character positions are boxed into the first and last objects, then passed to the Document.Range() method, which returns the Range object on which Select() is called. This explicit boxing is necessary because the Range method expects references to its arguments, and any conversion, implicit or explicit, changes the arguments to rvalues, which cannot be passed by reference. The example leaves the text highlighted for two seconds and then cuts the text. The cut operation also could have been done with the following code:

    object first=0;
    object units = WdUnits.wdCharacter;
    object last=doc.Characters.Count;
    doc.Range(ref first, ref last).Delete(ref units, ref last);
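
The rule that a ref argument must be a variable can be seen without Word. In this sketch, Length is a hypothetical stand-in for a method that, like Document.Range, takes its positions as ref object; it is not part of the PIA.

```csharp
using System;

public class RefArgumentDemo
{
    // Hypothetical stand-in for a method that takes its positions
    // as "ref object".
    public static int Length(ref object start, ref object end)
    {
        return (int)end - (int)start;   // unbox the two positions
    }

    public static void Main()
    {
        int lastPosition = 12;

        object first = 0;             // explicit boxing into a variable
        object last = lastPosition;   // explicit boxing into a variable

        Console.WriteLine(Length(ref first, ref last));  // 12

        // The following line would not compile, because a converted value
        // is not a variable and so cannot be passed by reference:
        // Length(ref (object)0, ref (object)lastPosition);
    }
}
```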

How to build and run example3.cs

To build example3.cs, in the Visual Studio .NET Command Prompt window:

  1. Open the directory in which you saved the example3.cs source file (for example, in the C:\CSOfficeSamples directory), and type csc /r:"C:\Office XP PIAs\Microsoft.Office.Interop.Word.dll" /d:OFFICEXP example3.cs after the command prompt.
    (If you saved the Office XP PIAs somewhere else, replace the following drive and installation path with the correct values: csc /r:Drive:\<Installation Path>\Microsoft.Office.Interop.Word.dll /d:OFFICEXP example3.cs).
  2. To run example3.exe, located in the same folder as the example3.cs source file, double-click it.

Example 4. Using Events Exposed by Word

This example is more involved than the others, but not as complex as it may look. Most of the complexity is in the length of the names used to identify the events and their handler types. Consider the setup code for the Office XP version of the DocumentOpen and DocumentChange event handlers:

...
#if OFFICEXP
    ApplicationEvents3_DocumentOpenEventHandler myOpenDoc = new
        ApplicationEvents3_DocumentOpenEventHandler
            (MyOpenEventHandler);

    ApplicationEvents3_DocumentChangeEventHandler myChangeDoc = new
        ApplicationEvents3_DocumentChangeEventHandler(DocChange);
#else
...

These two statements simply create the delegate instances that will handle the events. A few lines later, these handlers are attached to the events in the Application object app:

    app.DocumentOpen += myOpenDoc;
    app.DocumentChange += myChangeDoc;

These two events are now ready for use. When the Open method is called, both of these events are fired. The DocumentOpen and DocumentChange events are described in the Word 2002 Visual Basic reference.

So how does one figure out what events are available and how their handlers are called? If you inspect the Word 2002 PIA (Microsoft.Office.Interop.Word.dll) using ILDASM, some members have green point-down triangles in front of them. This indicates that the member is an event. Figure 4 shows Help for the ILDASM tree-view icons.

Figure 4. ILDASM Help for the tree-view icons

Figure 5. Using ILDASM to view the events for the Application object

Figure 5 shows a partial screen shot of the events for the Application object. The leftmost identifier on each line is the name of the event. To the right of the colon is the fully qualified type name of the event handler. For example, the DocumentBeforeSave event requires a handler of type:

Microsoft.Office.Interop.Word
  .ApplicationEvents3_DocumentBeforeSaveEventHandler

Notice that the event tells us nothing about the signature of the event handler. For that, you need to look at the event handler declaration. In ILDASM, if you double-click the ApplicationEvents3_DocumentBeforeSaveEventHandler type, you will see something similar to Figure 6.

Figure 6. Viewing the event handler declaration in ILDASM

The Invoke method is what interests us. The function you write for the event handler must have this signature. But how do you know what the parameters mean and which values they should take? That's where the Word 2002 Visual Basic documentation is indispensable. For the DocumentBeforeSave event, the documentation says:

Private Sub object_DocumentBeforeSave(ByVal Doc As Document, SaveAsUI As
  Boolean, Cancel As Boolean)

It goes on to describe the meaning of each parameter. Remember, C# by default passes parameters by value, whereas Visual Basic by default passes them by reference. That is why the two Boolean parameters are followed by the ampersand (&) when displayed by ILDASM, and why they must be tagged with the ref keyword when used in C#. Also, Visual Basic Subs are seen as methods returning void in C#. So, the handler for the DocumentBeforeSave event looks like this:

public static void SaveHandler (Document doc, ref bool b1, ref bool b2) {
    MessageBox.Show ("Saving document", "DocumentSave event",
        MessageBoxButtons.OK, MessageBoxIcon.Information);
}

When the document is saved via the call to the SaveAs method, the DocumentBeforeSave event is fired before saving the document.
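
To see why the by-reference Booleans matter, here is a minimal sketch that needs no Office installation. Everything in it is hypothetical: BeforeSaveHandler, FakeApplication, and its Save method are stand-ins shaped like the Word types, and the sketch assumes that a handler uses the last ref parameter to pass a cancel decision back to the caller.

```csharp
using System;

// Hypothetical delegate shaped like the Word handler type: one by-value
// argument followed by two Booleans passed by reference.
public delegate void BeforeSaveHandler(string docName, ref bool saveAsUI, ref bool cancel);

public class FakeApplication
{
    public event BeforeSaveHandler DocumentBeforeSave;

    // Raises the event and reports whether the save went ahead.
    public bool Save(string docName)
    {
        bool saveAsUI = false;
        bool cancel = false;
        if (DocumentBeforeSave != null)
        {
            DocumentBeforeSave(docName, ref saveAsUI, ref cancel);
        }
        return !cancel;
    }
}

public class BeforeSaveDemo
{
    // Vetoes the save for any document named "readonly".
    public static void SaveHandler(string docName, ref bool saveAsUI, ref bool cancel)
    {
        if (docName == "readonly")
        {
            cancel = true;
        }
    }

    public static void Main()
    {
        FakeApplication app = new FakeApplication();
        app.DocumentBeforeSave += new BeforeSaveHandler(SaveHandler);

        Console.WriteLine(app.Save("report"));    // True
        Console.WriteLine(app.Save("readonly"));  // False
    }
}
```

Had the Booleans been passed by value, the assignment inside the handler would change only a local copy and the event source would never see it.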

A few lines after the call to the SaveAs method, you will see the following code snippet:

app.DocumentChange -= myChangeDoc;

This line unhooks the DocumentChange event so it is not fired during the call to Quit.
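
Hooking and unhooking follow the same pattern for any .NET event. The sketch below is an illustration with hypothetical types (ChangeHandler and ChangeSource are not part of the Word PIA); it counts how often the handler runs before and after it is removed.

```csharp
using System;

// Hypothetical delegate and event source; neither comes from the Word PIA.
public delegate void ChangeHandler();

public class ChangeSource
{
    public event ChangeHandler DocumentChange;

    public void RaiseChange()
    {
        if (DocumentChange != null)
        {
            DocumentChange();
        }
    }
}

public class UnhookDemo
{
    public static int Calls = 0;

    public static void DocChange()
    {
        Calls++;
    }

    public static void Main()
    {
        ChangeSource app = new ChangeSource();
        ChangeHandler myChangeDoc = new ChangeHandler(DocChange);

        app.DocumentChange += myChangeDoc;   // hook the handler
        app.RaiseChange();                   // the handler runs

        app.DocumentChange -= myChangeDoc;   // unhook the handler
        app.RaiseChange();                   // nothing is subscribed now

        Console.WriteLine(Calls);            // 1
    }
}
```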

How to build and run example4.cs

To build example4.cs, in the Visual Studio .NET Command Prompt window:

  1. Open the directory in which you saved the example4.cs source file (for example, in the C:\CSOfficeSamples directory), and type csc /r:"C:\Office XP PIAs\Microsoft.Office.Interop.Word.dll" /d:OFFICEXP example4.cs after the command prompt.
    (If you saved the Office XP PIAs somewhere else, replace the following drive and installation path with the correct values: csc /r:Drive:\<Installation Path>\Microsoft.Office.Interop.Word.dll /d:OFFICEXP example4.cs).
  2. To run example4.exe, located in the same folder as the example4.cs source file, double-click it.

Example 5. Animating the Office Assistant

Some people love Office Assistants and some people hate them. But whatever your opinion, example5.cs is here just for a bit of fun. This sample also uses the type information for the Assistant found in mso.dll. It uses two PIAs:

  • Microsoft.Office.Interop.Word.dll
  • Office.dll

Every non-trivial step in the example5.cs source file is extensively commented. We will not be going through the code as it is quite easy to understand.

How to build and run example5.cs

To build example5.cs, in the Visual Studio .NET Command Prompt window:

  1. Open the directory in which you saved the example5.cs source file (for example, in the C:\CSOfficeSamples directory), and type csc /r:"C:\Office XP PIAs\Microsoft.Office.Interop.Word.dll" /r:"C:\Office XP PIAs\Office.dll" example5.cs after the command prompt.
    (If you saved the Office XP PIAs somewhere else, replace the following drives and installation paths with the correct values: csc /r:Drive:\<Installation Path>\Microsoft.Office.Interop.Word.dll /r:Drive:\<Installation Path>\Office.dll example5.cs).
  2. To run example5.exe, located in the same folder as the example5.cs source file, double-click it.

Example 6. Default Properties and Indexed Properties

Word 2002 makes very little use of the default and indexed properties, but Excel 2002 uses them extensively, so this example (excel1.cs) takes advantage of that fact.

Like all Office XP interop code, this sample starts by instantiating an Application object. After creating a workbook and a worksheet, an array of strings is created to hold column headings. Once that array is created, you see the following code snippet:

wksRange = wks.get_Range("A2", "D2");

This gets the Range object for cells A2 to D2. But since the worksheet has a Range property, why is it necessary to call the accessor directly? And why doesn't that cause a syntax error like it normally would?

Unlike Visual Basic and Visual C++, C# does not have a syntactic construct for indexed properties. To use indexed properties in C#, you must call the accessor directly. A good example is the _Worksheet.Range property. To get the value of the Range property in Visual C++, your code would look something like this:

myRange = myWorksheet->Range["A2", "D2"];

To do the same thing in C# would look like this:

myRange = myWorksheet.get_Range("A2", "D2");

Setting the Range property is likewise a call to the set accessor rather than an assignment:

myWorksheet.set_Range("A2", "D2", myRange);

The Range.Value property in Microsoft Excel 2000 was a regular property, but in Excel 2002 it was changed to an indexed property. That's why its use in this sample is bracketed by an #if OFFICEXP directive.

_Workbook.Worksheets has what is called a default property. Default properties are seen in the interop assemblies as properties with the name Item. Ordinarily you must specify the Item member in order to use default properties from C#, but in the Excel library a little magic by TLBIMP creates accessors called get__Default or set__Default. If these exist, C# can use the indexer syntax rather than calling the accessors directly. These two lines from the example demonstrate that:

_Worksheet wks2 = (_Worksheet)wkb.Worksheets["Market Share!"];
((_Worksheet)wkb.Worksheets["Market Share!"]).Name = "Fred";
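
The two calling styles can be compared side by side in plain C#. The sketch below is an illustration under stated assumptions: FakeSheet and FakeSheets are hypothetical stand-ins, not Excel types. The get_Range method plays the role of an indexed-property accessor, and the indexer plays the role of a default property that returns object.

```csharp
using System;
using System.Collections;

public class FakeSheet
{
    public string Name;

    public FakeSheet(string name) { Name = name; }

    // Stand-in for the accessor that an interop assembly exposes for an
    // indexed property; C# callers invoke it as an ordinary method.
    public string get_Range(object cell1, object cell2)
    {
        return Name + "!" + cell1 + ":" + cell2;
    }
}

public class FakeSheets
{
    private Hashtable sheets = new Hashtable();

    public void Add(FakeSheet sheet) { sheets[sheet.Name] = sheet; }

    // Stand-in for a default property. It is typed as object, so callers
    // must cast the result, as the Excel sample does.
    public object this[object index]
    {
        get { return sheets[index]; }
    }
}

public class IndexerDemo
{
    public static void Main()
    {
        FakeSheets worksheets = new FakeSheets();
        worksheets.Add(new FakeSheet("Market Share"));

        FakeSheet wks = (FakeSheet)worksheets["Market Share"];
        Console.WriteLine(wks.get_Range("A2", "D2"));  // Market Share!A2:D2
    }
}
```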

How to build and run excel1.cs

To build excel1.cs, in the Visual Studio .NET Command Prompt window:

  1. Open the directory in which you saved the excel1.cs source file (for example, in the C:\CSOfficeSamples directory). Type csc /r:"C:\Office XP PIAs\Microsoft.Office.Interop.Excel.dll" /d:OFFICEXP excel1.cs after the command prompt.
    (If you saved the Office XP PIAs somewhere else, replace the following drive and installation path with the correct values: csc /r:Drive:\<Installation Path>\Microsoft.Office.Interop.Excel.dll /d:OFFICEXP excel1.cs).
  2. To run excel1.exe, located in the same folder as the excel1.cs source file, double-click it.

Conclusion

COM interop with C# can be an extremely useful tool because it allows you to use existing objects without requiring you to rewrite the code for those objects. This article should help you to leverage your existing investment in COM objects.