Handling Complex COM Objects with Interop Assemblies or "Why Has My Menu Button Stopped Working in Office?"

 

Peter Vogel
PH&V Information Services

August 2004

Applies to:
    Microsoft Office 2003 Editions

Summary: Can't get your menu button to work consistently? Are you wondering how .NET manages the memory of COM objects you instantiate by using the .NET Framework? Peter Vogel discusses some of the common issues that occur when handling COM objects within the .NET Framework and provides strategies for addressing them. (7 printed pages)

Contents

Introduction
My Button Stopped Working
COM and .NET-based Memory Management
Managing COM Memory
Managing COM Objects with the AppDomain Class
Managing the COM Object with ReleaseComObject
Conclusion
About the Author

Introduction

The .NET Framework provides several mechanisms for handling communication between the worlds of the .NET Framework and COM. However, to prevent memory leaks when working with COM objects, or to keep a COM object from suddenly ceasing to work, you must make intelligent use of features in the .NET Framework to manage COM objects.

While the .NET Framework provides a number of ways to work with COM objects, one of the most convenient is interop assemblies. When using interop assemblies, managed .NET-based code interacts with a Runtime Callable Wrapper (RCW) that manages the COM object for the .NET-based application. RCWs take care of some of the issues in working with COM objects, including component activation and parameter marshaling (such as converting string data types into COM BSTRs). Microsoft provides customized interop assemblies for working with the Microsoft Office System, known as the Primary Interop Assemblies (PIAs). However, these tools do not handle problems related to managing COM object lifetimes, and you must be able to recognize and manage those issues using other tools.

My Button Stopped Working

Some problems are easily solved but reveal key issues in working with COM objects from .NET-based applications. For example, one of the common problems that you discover working with Office applications is a "dead command button." Here is how it should work: From .NET-based code, you add a new menu button to a menu bar in Microsoft Office Word 2003 and tie a routine in a .NET-based application to the button's Click event. Initially, everything seems to work.

Suddenly, and for no apparent reason, when the user clicks the button the Click event routine doesn't run. The button's Click event appears to stop working abruptly.

This problem often occurs because of the way you declare the variable that refers to the menu button in the .NET-based code. If the variable is declared locally (such as inside a subroutine or function) then, when the routine finishes, the variable goes out of scope. Any routines tied to the object referred to by that variable stop working when the object is destroyed. However, how long the object survives after the variable referring to it goes out of scope differs between a .NET-based environment and a COM environment.

Most developers would quickly diagnose this problem if the Click event code stopped running as soon as the routine ended and the variable went out of scope. That's certainly when the problem appears in a COM application that manipulates Word. But, in the world of the .NET Framework, an object is not necessarily destroyed when it goes out of scope. Instead, the object may persist after the variable goes out of scope, until eventually, the .NET Framework destroys the object. During that time, the Click event code continues to run when the user clicks on the menu button and then suddenly stops working correctly when the .NET Framework finally destroys the object. This behavior reflects a fundamental difference between the way COM and the .NET Framework manage object lifetimes and, as a result, memory.

COM and .NET-based Memory Management

COM implements a deterministic model for freeing memory. In COM, each object has a reference counter that tracks how many clients use it. As clients refer to the object, that counter is incremented; as clients stop referring to the object, the counter is decremented. When the counter reaches zero, the object is immediately removed from memory. Developers using C++ manage the reference counter explicitly, but in Visual Basic, the reference counter is decremented indirectly. In Visual Basic, having a variable go out of scope or setting the variable to Nothing causes the reference counter on the object referred to by that variable to be decremented. Only if that variable was the single variable (or the last variable) referring to the object does COM immediately remove the object from memory.
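The counting rule described above can be illustrated with a small simulation written entirely in managed code. This is only a sketch of the idea, not COM itself: the RefCountedResource class and its AddRef, Release, and IsReleased members are hypothetical names chosen for illustration, and "freeing" the resource is represented by setting a flag.

```vbnet
Imports System

' Hypothetical stand-in for a COM object. It is NOT a real COM interface;
' it only mimics the reference-counting rule.
Public Class RefCountedResource
    Private _refCount As Integer = 0
    Private _released As Boolean = False

    Public ReadOnly Property IsReleased As Boolean
        Get
            Return _released
        End Get
    End Property

    ' A client takes a reference: the counter goes up.
    Public Function AddRef() As Integer
        _refCount += 1
        Return _refCount
    End Function

    ' A client drops its reference: the counter goes down, and the
    ' resource is freed at the exact moment the counter reaches zero.
    Public Function Release() As Integer
        If _refCount = 0 Then
            Throw New InvalidOperationException("No outstanding references.")
        End If
        _refCount -= 1
        If _refCount = 0 Then
            _released = True
        End If
        Return _refCount
    End Function
End Class

Module RefCountDemo
    Sub Main()
        Dim res As New RefCountedResource()
        res.AddRef()    ' first client
        res.AddRef()    ' second client
        res.Release()   ' first client finishes; one reference remains
        Console.WriteLine(res.IsReleased)   ' prints False
        res.Release()   ' last client finishes; freed immediately
        Console.WriteLine(res.IsReleased)   ' prints True
    End Sub
End Module
```

The point of the sketch is the timing: the resource is released inside the final call to Release, not at some later time chosen by the runtime.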

Conversely, the .NET Framework uses a non-deterministic approach to freeing memory. In the .NET Framework, objects are removed from memory as part of a garbage collection process that checks for objects that are no longer in use. Garbage collection is a lazy process. It runs only when necessary, typically when memory is constrained. In the .NET Framework, setting a variable to Nothing just makes the object that it refers to a candidate for removal at some time in the future. For more about this non-deterministic finalization process and a complete discussion of garbage collection in the .NET Framework, see the MSDN article, Programming for Garbage Collection, from the .NET Framework Developer's Guide.

The .NET-based approach to reclaiming memory creates the problem with the Click event. In the .NET-based application, the variable that refers to the RCW managing the menu button may go out of scope at the end of the routine, but the RCW managing the COM object is only made eligible for garbage collection. Eventually, the RCW is subject to garbage collection and, at that point, the COM object that the RCW managed is destroyed. Only then does the .NET-based application lose the connection to the Click event and the button stops working.

If you only want to ensure that your .NET-based application hangs onto the COM object for the lifetime of the application, the solution to this problem is simple. Just declare the variable that refers to the RCW at the module or class level so that it does not go out of scope until the application is complete. In Microsoft Visual Basic .NET, the code to declare the object at the Class level is as follows:

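Public Class WordManager
    Inherits System.Windows.Forms.Form

    ' Class-level reference: the RCW stays reachable for the life of the
    ' form, so the button's Click event handler stays connected.
    Private WithEvents mi As Microsoft.Office.Core.CommandBarButton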

Managing COM Memory

However, if you want to release the COM object before your application completes, you run into a different problem: memory not being reclaimed when you expect it to be. In your .NET-based application, you can set any references to the COM object's RCW to Nothing, but the COM object is not necessarily removed from memory. For example, an application that repeatedly loads and unloads Word to process a series of documents may run out of memory as formerly used instances of Word continue to reside in memory. It is your responsibility to manage COM objects to minimize the impact on memory.

Again, COM objects from the Office object model provide an example of the problems that can occur. Similar to many large COM servers, the various Office applications contain literally hundreds of objects and typical tasks require instantiating several objects. As an example, this code starts the Word application using its PIA and then uses the RCW that results to create a document:

Dim wrd As New Microsoft.Office.Interop.Word.Application
Dim doc As Microsoft.Office.Interop.Word.Document
doc = wrd.Documents.Add()

This code loads three objects into memory: the RCW that manages the Word Application object, its Documents collection, and a Document object. If the next step in the process loads a large image into the document, the amount of memory held by the COM object can become very large.

When you are done with the objects and are ready to release them, you can set the variables that refer to the Word objects to Nothing:

wrd = Nothing
doc = Nothing

Unfortunately, garbage collection may not reclaim the memory used by these COM objects even when memory shortages trigger garbage collection. Garbage collection tries to remove objects from memory intelligently: objects that were used for a short time are assigned a higher priority in garbage collection than objects that were used for a long time. The higher priority given to short-term objects means that releasing a COM object as soon as you are finished with it makes it more likely that the object is subject to garbage collection sooner.

Note   For more information about how garbage collection works and its performance characteristics, see the MSDN article Garbage Collector Basics and Performance Hints, from .NET Framework Developer's Guide.

When using a COM object from a .NET-based application, there are two objects involved: the RCW and the COM object (or objects). Garbage collection is only aware of the size of the RCW (which can be small), not of the COM object (which may be large). Therefore, while the .NET-based application might release the RCW, garbage collection may not reclaim the RCW even as memory runs out. As long as the RCW stays in memory, the COM object that it manages stays in memory also.

It may appear that forcing garbage collection on the RCW resolves the problem. However, forcing garbage collection is almost always a bad idea in the .NET Framework. It may also be a wasted effort because forcing garbage collection does not guarantee that the COM object is removed. Even when called explicitly, garbage collection is always non-deterministic in the .NET Framework.

There are two mechanisms that ensure that COM objects are released from memory: the AppDomain object and the ReleaseComObject method. Using an AppDomain provides the simplest solution to managing COM objects but has performance costs and can expose a security risk. Using ReleaseComObject avoids those costs but requires more careful planning and coding.

Managing COM Objects with the AppDomain Class

In the .NET Framework, an AppDomain object provides a separate environment for an application to run. From the point of view of managing COM objects, AppDomain objects are attractive because unloading an AppDomain unloads all the resources that the AppDomain was using. The strategy is simple: For each COM object that you create, create an AppDomain and load the COM object into it. Using an AppDomain simplifies working with COM objects because you can load multiple COM objects into a single domain and dispose of all of them at the same time.

However, creating an AppDomain can be expensive, so AppDomains shouldn't be used when performance is critical. In addition, making an AppDomain available remotely destroys code access security (when not being used remotely, an AppDomain is secure).

The following example assumes that there is a COM object whose type name is MyDLL.MyObject in a file called MyDLL.DLL. To add a reference to this COM DLL to your application you would, from the Project menu, choose Add Reference, then choose Browse and navigate to MyDLL.DLL and select it.

If a PIA exists for the server, the reference is added to your program (if no PIA exists, Visual Studio .NET generates an interop assembly for the DLL). The interop assembly includes an entry for each class in the server's type library with a name for each of the objects in the library created by appending "Class" to the original class name. For my class, MyDLL.MyObject, the generated name is MyDLL.MyObjectClass.

Once you add the reference to the COM object, you can load an object into an AppDomain by using the CreateInstanceFromAndUnwrap method. This method takes two parameters: the full path name to the interop assembly and the name of the type to create. You can determine the location of the interop assembly by using a Type object. The Type object's associated Assembly object has a Location property that specifies where to find the interop assembly. Initially, the RCW is wrapped to prevent type library information from being loaded unnecessarily, but the CreateInstanceFromAndUnwrap method also removes that wrapper.

The following code defines an AppDomain with the name MyDomain, and then loads my sample COM object into the AppDomain. The code then uses the object and, when done, unloads the AppDomain, releasing the COM object:

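Dim apd As AppDomain
Dim obj As MyDLL.MyObject
Dim objType As Type

' Locate the interop assembly through a type that it defines.
objType = GetType(MyDLL.MyObject)
apd = AppDomain.CreateDomain("MyDomain")
' CreateInstanceFromAndUnwrap returns Object, so cast to the COM type.
obj = CType(apd.CreateInstanceFromAndUnwrap( _
     objType.Assembly.Location, "MyDLL.MyObjectClass"), MyDLL.MyObject)

' . . .using the COM object. . .

' Unloading the domain releases everything that was loaded into it.
AppDomain.Unload(apd)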

For more information about AppDomains, see Application Domains, from the .NET Framework Developer's Guide. As noted before, you must be careful when you pass a reference to a domain. Making an AppDomain object available remotely destroys code access security for that domain.

Managing the COM Object with ReleaseComObject

An alternative to using AppDomain objects is to force the COM object to be released from memory by reducing its reference count to zero. Once the COM object is released, you can leave the RCW to be reclaimed by garbage collection later. While not as simple to implement as the AppDomain solution, using ReleaseComObject does avoid the performance costs of creating an AppDomain.

When a COM object is created in an RCW, the COM object's reference counter is set to one. No matter how many .NET-based clients refer to the RCW in a single process, the COM object's reference counter remains at one. However, you can increment the COM object's reference counter if you pass the RCW across a process boundary, so the reference counter is not guaranteed to be one always.

The .NET Framework provides a way to decrement a COM object's reference count directly by passing the RCW to the ReleaseComObject method of the System.Runtime.InteropServices.Marshal class. The method returns the value of the COM object's reference count after decrementing it. This example loads Word, decrements the reference count, and then releases the RCW:

Dim wrd As New Microsoft.Office.Interop.Word.Application
Dim intRefCount As Integer

' . . .working with Word. . .

intRefCount = _
     System.Runtime.InteropServices.Marshal.ReleaseComObject(wrd)
wrd = Nothing

To address those situations when the value returned by ReleaseComObject is greater than zero, you can call the method in a loop that executes ReleaseComObject until the returned value is zero:

Dim wrd As New Microsoft.Office.Interop.Word.Application
Dim intRefCount As Integer

' . . .working with Word. . .

' Keep releasing until the returned reference count reaches zero.
Do
    intRefCount = _
     System.Runtime.InteropServices.Marshal.ReleaseComObject(wrd)
Loop While intRefCount > 0
wrd = Nothing

Since a single RCW is shared among all the clients in a process, using the ReleaseComObject method can release a COM object while other clients still depend on it. Attempting to work with the RCW from the .NET-based application after the COM object is released generates an exception, System.Runtime.InteropServices.InvalidComObjectException, with the additional information "COM object that has been separated from its underlying RCW can not be used." It is important, then, to make sure that you know when your application is finished with the object so that the counter can be decremented. One way to assure clarity is to put calls to ReleaseComObject in the containing object's finalizer/dispose method. This ensures a one-to-one mapping between .NET-based finalization and the release of the corresponding COM object.
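One way to picture the dispose-based approach is a small owner class that releases its COM reference exactly once and refuses further access afterward. The following is a minimal sketch under stated assumptions: ComReleaser and its Target property are hypothetical names, not part of the .NET Framework or the Office PIAs, and the sketch releases only in Dispose rather than in a finalizer. The Marshal.IsComObject guard is there because ReleaseComObject accepts only RCWs.

```vbnet
Imports System
Imports System.Runtime.InteropServices

' Hypothetical helper: owns one COM reference and releases it exactly once.
Public Class ComReleaser
    Implements IDisposable

    Private _target As Object
    Private _disposed As Boolean = False

    Public Sub New(ByVal comObject As Object)
        If comObject Is Nothing Then
            Throw New ArgumentNullException("comObject")
        End If
        _target = comObject
    End Sub

    ' Fails fast with ObjectDisposedException instead of handing callers
    ' an RCW whose COM object has already been released.
    Public ReadOnly Property Target As Object
        Get
            If _disposed Then
                Throw New ObjectDisposedException("ComReleaser")
            End If
            Return _target
        End Get
    End Property

    ' Safe to call more than once; only the first call releases.
    Public Sub Dispose() Implements IDisposable.Dispose
        If _disposed Then
            Return
        End If
        _disposed = True
        ' ReleaseComObject accepts only RCWs, so check before calling it.
        If Marshal.IsComObject(_target) Then
            Marshal.ReleaseComObject(_target)
        End If
        _target = Nothing
    End Sub
End Class
```

In use, the application would pass the RCW (for example, the Word Application object) to the constructor, work with it through Target, and call Dispose in a Finally block when the work is complete. Because the same RCW may be shared by other clients in the process, this pattern is appropriate only when the wrapper is the sole owner of that reference.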

Conclusion

This article discusses common issues encountered when developing with COM objects within the .NET Framework and provides recommendations about how to modify your solution to fix them. Specifically, you want to avoid having any reference to a COM object go out of scope until your application is finished with it. When your application is ready to release your COM object, you want to make sure that the COM object is removed from memory. Of the two mechanisms discussed, the AppDomain class provides the simplest mechanism for managing COM object lifetimes while using ReleaseComObject gives you better performance.

About the Author

Peter Vogel (MBA, MCSD) is a principal in PH&V Information Services. PH&V specializes in .NET and XML development. Peter has designed, built, and installed intranet and component-based systems for Bayer AG, Exxon, Christie Digital, and the Canadian Imperial Bank of Commerce.

He is also the editor of Smart Access, wrote The Visual Basic Object and Component Handbook (Prentice Hall), and is currently working on a book on Web Parts for SharePoint Services and ASP.NET 2.0 (Wrox). Peter teaches for Learning Tree International. His articles have appeared in every major magazine devoted to Visual Basic-based development and in the Microsoft Developer Network (MSDN) Library. Peter also presents at conferences in North America, Australia, and Europe.

This article was developed in partnership with A23 Consulting.

© Microsoft Corporation. All rights reserved.