Beyond (COM) Add Reference: Has Anyone Seen the Bridge?

 

Sam Gentile
October 2003

Applies to:
   COM Interop
   Microsoft® .NET Framework
   C# language
   Microsoft Visual Studio® .NET

Summary: Sam Gentile explains the need for bridges between COM Interop and the Microsoft .NET Framework, and how bridges are implemented in the .NET Framework. (14 printed pages)

Prerequisites:

  • Basic knowledge of core Microsoft .NET concepts (assemblies, attributes, reflection, classes, properties, and events)
  • Ability to create Windows Forms applications in C# in Visual Studio .NET
  • Ability to use the language compiler to build and manage applications

Download the associated code sample (212 KB)

Contents

Introduction
Has Anyone Seen the Bridge?
The Role of the Bridges
The Runtime Callable Wrapper (RCW)
The COM Callable Wrapper (CCW)
The Type Library Importer (TLBIMP.EXE)
Let's Get Squared
Generating the Interop Assembly
Managed Square
Where Are We and Where Are We Going?

Introduction

Interop is a wonderfully useful and necessary technology in the .NET Framework. Why? Quite frankly, there are literally billions of lines of existing COM code in use today by many corporations. While these corporations are aware of the many benefits of managed code, they fully expect to leverage their sizable existing investments in COM without rewriting all their applications from scratch. The good news is the common language runtime (CLR) includes COM Interop to utilize this functionality from managed code using the .NET Framework. The not-so-good news is that it usually is not trivial.

COM Interop is, quite simply, a bridge between COM and .NET. Many developers are happy when they notice Visual Studio .NET includes the magical (COM) Add Reference wizard (Add Reference | COM Wizard). This wizard allows developers to choose COM components and perform some "magic" to work with their .NET applications. The problem is, quite often, developers have to go beyond the wizard in order to make actual COM Interop work for their applications. That is where it gets hard.

The System.Runtime.InteropServices namespace contains nearly 300 classes. There are also a variety of Interop command line tools in the .NET Framework SDK. If that's not scary enough, many issues arise in any non-trivial Interop project resulting from the nuances of COM, and the vast differences between COM and .NET. I have learned from personal experience that the developer will very frequently be required to go beyond the (COM) Add Reference wizard, and use these command line tools, as well as understand at a deeper level what is going on in order to have things work as expected (or just even work!).

This article starts where "Add Reference" leaves off. I am not going to spend time showing you Visual Studio .NET wizard screen shots. There is plenty of great MSDN information already for that (see Calling COM Components from .NET Clients). This article is the first in a series designed to delve deep into a variety of issues that you, as a working programmer, will encounter with COM Interop that are either not well understood or not well documented, but need to be in order for you to get your job done.

The code samples for this article and those that follow are in C# as a matter of personal preference. However, it is important for me to emphasize that the CLR operates at a greater level of abstraction than previous Microsoft technologies. Thus, there is one class library, one type system, and languages offer common services in different syntaxes. The .NET languages can be viewed as syntactic sugar over the base class library (BCL), common type system (CTS), and CLR, so the choice of syntax, and thus of language, is simply a matter of personal taste.

Ready to begin? Let's dive in.

Has Anyone Seen the Bridge?

Everyone is familiar with the role of a bridge. A bridge is something that allows you to go between two areas that are separated by something impassable, such as a river, bay, etc. COM Interop can be viewed similarly. There are two vastly different worlds, the world of COM, and the world of the CLR separated by a vast boundary. There is a need to "bridge" the differences, and allow the world of COM to work with the managed world of the CLR. If COM Interop is to be at all useful, it must be a good bridge, hiding the details of each world from each other. The expectation of .NET programmers is they should be able to treat the COM component the same way as any other .NET component they "new" up and work with. .NET programmers should not be dragged into the world of COM and call CoCreateInstance (Ex) to create the COM component and deal with reference counts and similar concepts. They should use operator new to create the object and call methods on it, just like they would for any other .NET component.

The same should happen on the other side. If you expose a .NET component to a COM client, it should look like any other COM component. Thus, from COM, the developers can QueryInterface and do all the fun stuff they have done for years. What am I getting at? The underlying components should not change, and the programming model should not change. We will see for the most part this is true, and Interop is quite successful at bridging the differences. However, the worlds are vastly different and cause difficult problems in some scenarios. We will look at these problems in future articles.

Why do we need bridges at all? The simple answer is that although both technologies share the common goal of interoperable components, the two worlds couldn't be more different. This should not come as a great surprise. The world of the CLR is one of managed execution with a garbage collector and a common programming model, among many other things. COM lives in an unmanaged, reference-counted world with very different programming models. Although COM has a binary standard, many different programming models exist that vary in levels of abstraction, such as Visual Basic®, Microsoft Foundation Class Library (MFC), and Active Template Library (ATL), to name but a few. With that in mind, we will look briefly at some of the differences that directly impact interoperation. These include identity, locating components, object lifetime, programming model, type information, versioning, and error handling. The next section briefly discusses these areas of difference, keeping in mind that a fully detailed exploration of each of these topics has been the subject of many books and articles, and is outside the scope of this article.

Identity

All COM programmers are intimately familiar with the Globally Unique Identifier (GUID), which is used to uniquely identify many things in COM. These 128-bit numbers are globally unique in time and space (unless you reuse somebody else's!). GUIDs are used all over the place in COM and manifest as CLSIDs (Class Identifiers), IIDs (Interface Identifiers), LIBIDs (Library Identifiers), and AppIDs (Application Identifiers), to name but a few. They share a common purpose: give something in COM a globally unique identity.

Most normal human beings cannot memorize 128-bit numbers so classes can have human friendly names that map to the 128-bit number.

The CLR does not use this system. A type is identified simply by its name, and further qualified by the namespace in which it lives. This is known as a simple name. However, the full type identity of any CLR type includes the type name, namespace, and its containing assembly. In other words, assemblies serve as type scoping units, as well as logical units of deployment and packaging.
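As a rough illustration of the two identity schemes, the following sketch prints the size of a GUID and the parts that make up a CLR type's identity. It uses only standard base class library calls; the class name IdentityDemo is my own and is not part of the article's sample code.

```csharp
using System;
using System.Text;

public static class IdentityDemo
{
    // A COM-style identity is 128 bits wide, that is, 16 bytes.
    public static int GuidSizeInBytes()
    {
        return Guid.NewGuid().ToByteArray().Length;
    }

    public static void Main()
    {
        Type t = typeof(StringBuilder);

        Console.WriteLine(GuidSizeInBytes() * 8);      // size of a GUID in bits
        Console.WriteLine(t.Name);                     // type name
        Console.WriteLine(t.Namespace);                // namespace
        Console.WriteLine(t.FullName);                 // namespace-qualified name
        Console.WriteLine(t.Assembly.GetName().Name);  // containing assembly
        Console.WriteLine(t.AssemblyQualifiedName);    // full type identity
    }
}
```

The last line is the interesting one: the assembly-qualified name carries the assembly name, version, culture, and public key token alongside the type name, which is the scoping role of assemblies described above.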

Locating Components

The ways COM and .NET locate components are quite different. COM components can be physically located anywhere, but the information about how to find and load them is kept in one central location: the registry. In contrast, CLR components do not use the registry at all. All managed assemblies bring this information stored within them, as metadata. In addition, .NET components can live either privately with their applications in the same directory, or globally shared in the Global Assembly Cache (GAC).

To instantiate a COM component with CoCreateInstance, COM looks in the registry for the CLSID key, and the values associated with it. These values tell COM the name and the location of the DLL or EXE that implements the COM co-class that you wish to load. One of the much-touted benefits of COM is location transparency. Simply stated, the COM client calls the object in the same way, whether the object is in-process with the client, out-of-process on the same local machine, or running on a different machine altogether; the registry tells COM where. This system is easy to break. If files change location without changing their registry setting, programs break completely. This contributes to the infamous problem known as "DLL Hell."

For that reason, and many others, .NET components take a completely different approach. The CLR looks in one of three places: the GAC, the local directory, or some other place specified by a configuration file. One of the goals of .NET is to radically simplify deployment. For most applications, components can be deployed to the local directory in which the application lives, and everything works. This is known as x-copy deployment. Shared assemblies can be placed in the GAC. For more details on this, see Applied Microsoft .NET Framework Programming by Jeffrey Richter.

Object Lifetime

Arguably one of the greatest areas of difference is how COM and .NET deal with the issue of how long an object should live in memory and how that is determined.

COM uses a reference-counted system to determine object lifetime. This puts the burden on the object, and the programmer, to maintain its own lifetime and determine when it should delete itself. The rules of this model are precisely spelled out in the COM specification. The key to the whole scheme is the IUnknown interface, which all COM components must implement, and from which all COM interfaces are derived. IUnknown includes two methods directly responsible for reference counting. The AddRef method increments the reference count and Release decrements it. When the count reaches zero the object may be destroyed. There are various nuances that arise from this scheme, as well as the possibility of creating object cycles. In addition, this scheme is very error-prone and has been the source of many woes that have plagued COM programmers for years.

The CLR frees programmers from this responsibility altogether. The CLR manages all object references through the use of garbage collection. The garbage collector will free an object when it determines the object is no longer being used. The key difference is that this is a non-deterministic process. Unlike COM, the object is not immediately freed when the last client is done with it, but rather when the garbage collector collects memory. This occurs at some unpredictable time in the future. For the most part, this is not a problem. But, as we will see in a future article, this can be a significant problem if your COM designs call for explicit teardown at some point in time. The .NET Framework does provide the Marshal.ReleaseComObject method in the System.Runtime.InteropServices namespace to force an earlier release, but this can lead to further issues we will examine later.
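To make the contrast concrete, here is a toy sketch of the reference-counting rule written as an ordinary managed class. It is not real COM: the RefCountedResource class and its members are invented for illustration, and a real COM object would update its count with interlocked operations, which this single-threaded sketch skips.

```csharp
using System;

// Illustrative only: a plain managed class that mimics the COM rule
// "destroy yourself when the reference count reaches zero".
public sealed class RefCountedResource
{
    private int _refCount = 1;          // the creator holds the first reference

    public bool IsDestroyed { get; private set; }

    public int AddRef()
    {
        if (IsDestroyed) throw new ObjectDisposedException(nameof(RefCountedResource));
        return ++_refCount;
    }

    public int Release()
    {
        if (IsDestroyed) throw new ObjectDisposedException(nameof(RefCountedResource));
        int remaining = --_refCount;
        if (remaining == 0)
        {
            IsDestroyed = true;         // deterministic teardown happens here
        }
        return remaining;
    }
}

public static class LifetimeDemo
{
    public static void Main()
    {
        var resource = new RefCountedResource();   // count is 1
        resource.AddRef();                         // a second client, count is 2
        resource.Release();                        // count is 1, still alive
        Console.WriteLine(resource.IsDestroyed);
        resource.Release();                        // count is 0, torn down now
        Console.WriteLine(resource.IsDestroyed);

        // A purely managed object has no such moment: once unreachable, it is
        // reclaimed whenever the garbage collector next decides to run.
    }
}
```

The point of the sketch is the timing. Under reference counting, teardown happens at the exact Release call that drops the count to zero; under garbage collection there is no equivalent call site.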

Programming Model

COM programming is far too labor intensive. Although there are programming model abstractions such as Microsoft Visual Basic that can greatly simplify COM programming, the fact is COM requires strict adherence to a set of rules and knowledge of too many low-level and arcane details to function effectively. Moreover, there are many programming tools for COM varying from Delphi to MFC to ATL to Visual Basic. Although each of these tools produces a working COM component that adheres to COM's binary v-table layout in memory, they differ widely in their programming model. Programmers who learn one tool and model have to face an entirely new programming model when they switch tools. For this and many other reasons, the .NET Framework greatly simplifies the programming model to one. The .NET Framework has one consistent object-oriented Base Class Library (BCL) Framework, irrespective of programming language and tool.

In COM programming, one never actually obtains a reference to the actual object. Instead COM clients obtain an interface reference and call the object's methods through it (yes, Visual Basic provides the illusion of programming to classes, but COM interfaces are used underneath). COM also does not have implementation inheritance.

In sharp contrast, the .NET Framework is a fully object-oriented platform in which programmers can fully use classes. Although the programmer can use interfaces, they are not required to do so by the model.

Type Information

In component-based systems, it is important to have some method of expressing the interface or contract of the component and how information is exchanged between the component and its clients or consumers. The COM specification did not mandate such an interchange format and it was therefore left outside of the specification. Not one but two different formats appeared.

The first of these, Microsoft Interface Definition Language (MIDL), actually had its origins in OSF DCE RPC, which used IDL to write descriptions of remote procedure calls (RPCs) in a language-neutral manner. Microsoft IDL provided extensions to DCE IDL in order to support COM interfaces, co-class definitions, and type definitions, among others. When IDL is compiled using the MIDL compiler, a set of C/C++ header files are generated that allows network code to make remote RPC calls over various network protocols.

The more common scenario was the use of type libraries (TLB files). Type libraries, although expressed in IDL terms, were compiled into a binary resource that type library browsers could then read and present. One of the problems is that type libraries were completely optional and not complete. Moreover, COM does not enforce the correctness of the information within. If that weren't enough, the format is not extensible nor does it make any attempt to describe component dependencies.

In order to create a full first-class component environment, type information pervades every level of the CLR in the form of metadata. CLR compilers are required to emit standard metadata in addition to MSIL. The type information is always present, complete, and accurate. This is what gives .NET components their "self-describing" nature.

From this section, you should begin to notice that some sort of conversion process is needed to morph COM type definitions into .NET Metadata.

Versioning

In component engineering, versioning is a vital, yet difficult problem to solve. Interfaces may evolve over time and this can cause clients to break. COM interfaces do not have a versioning story. A COM interface is said to be immutable; it cannot change once the interface has been defined and exposed. Any changes such as adding members or changing the order of arguments in a method will cause clients to break. Therefore, in COM we define a new interface entirely if there are any changes to be made to an existing interface. If I have defined and published a COM interface IFoo, and I wish to make changes or add members, I define a new interface IFoo2.

The binary object model of COM and its in-memory representation physically defines the COM interface. A COM interface is a v-table in memory and an interface pointer is a v-ptr to it. This very precise model is quite fragile. Any changes to the v-table ordering or field alignment will cause clients to break.

The .NET Framework was designed from the ground up to fully support component versioning. Each .NET assembly, when given a strong name, can contain a four-part version number in the form Major.Minor.Build.Revision that is stored in its manifest. The CLR fully supports multiple versions of the same assembly existing simultaneously in memory, isolated from each other. The CLR also supports a full versioning policy that can be applied by administrators in XML configuration files, which may be applied on a machine or application basis, binding a client to a specific version of an assembly.
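The four-part version number can be explored with the System.Version class, as in this minimal sketch. The version values are made up for the example; only the last line reads a real version, from the assembly that defines System.Object.

```csharp
using System;

public static class VersionDemo
{
    public static void Main()
    {
        // Major.Minor.Build.Revision, the same four-part shape stored in a manifest.
        Version v1 = new Version("1.0.1234.0");
        Version v2 = new Version(1, 1, 0, 0);

        Console.WriteLine("{0} {1} {2} {3}", v1.Major, v1.Minor, v1.Build, v1.Revision);
        Console.WriteLine(v1 < v2);   // versions compare part by part, left to right

        // The version recorded for an assembly that is already loaded.
        Console.WriteLine(typeof(object).Assembly.GetName().Version);
    }
}
```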

Error Handling

The ways in which COM and .NET handle error reporting differ greatly. COM has more than one way to handle errors, but the primary method is to have methods return an error code of type HRESULT. HRESULTs are 32-bit status codes that tell the caller what type of error has occurred. HRESULTs are made up of three parts: a facility code, an information code, and a severity bit. The severity bit, which is the most significant bit, indicates success or failure. The facility code indicates the source of the error. The information code, in the lowest 16 bits, identifies the specific error or warning.
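The three parts can be pulled out of an HRESULT with a few bit operations. The sketch below assumes the standard layout (severity in bit 31, facility in bits 16 to 26, code in the low 16 bits); the helper names are my own and are not Windows or .NET APIs.

```csharp
using System;

public static class HResultDemo
{
    public static bool Failed(int hr)   { return hr < 0; }              // severity bit (bit 31) set
    public static int  Facility(int hr) { return (hr >> 16) & 0x7FF; }  // bits 16 to 26
    public static int  Code(int hr)     { return hr & 0xFFFF; }         // low 16 bits

    public static void Main()
    {
        int sOk = 0;                                        // S_OK
        int eAccessDenied = unchecked((int)0x80070005);     // E_ACCESSDENIED

        Console.WriteLine(Failed(sOk));                     // False
        Console.WriteLine(Failed(eAccessDenied));           // True
        Console.WriteLine(Facility(eAccessDenied));         // 7 (FACILITY_WIN32)
        Console.WriteLine(Code(eAccessDenied));             // 5 (ERROR_ACCESS_DENIED)
    }
}
```

Because the severity bit is the sign bit of a 32-bit integer, "negative means failure" is all a caller needs in order to test for success.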

I don't want to spend any more time on HRESULTs except to note that there is nothing to force the client to check for them, nor is there anything in the COM runtime to enforce that methods return them. They can be ignored. In addition to HRESULTs, a COM co-class can support an additional interface, ISupportErrorInfo, which indicates that richer error information is available through an IErrorInfo object. Of course, since it is an interface, the client must specifically query for it, check if it is supported, and then process the error. Many clients do not do this.

The .NET Framework enforces one consistent way of reporting and dealing with errors: exceptions. Exceptions cannot be ignored. In addition, exceptions isolate the code that deals with the error from the code that implements the logic.
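The two error models are not unrelated: every managed exception carries an HRESULT, which is what allows an error to cross the bridge in either direction. The sketch below prints a few of these values and shows the handler sitting apart from the code that fails. It assumes a runtime where Exception.HResult is publicly readable (.NET Framework 4.5 or later); the class name is my own.

```csharp
using System;

public static class ExceptionHResultDemo
{
    // Formats the HRESULT carried by a managed exception as 0xXXXXXXXX.
    public static string HResultOf(Exception ex)
    {
        return "0x" + ex.HResult.ToString("X8");
    }

    public static void Main()
    {
        Console.WriteLine(HResultOf(new InvalidCastException()));    // E_NOINTERFACE
        Console.WriteLine(HResultOf(new NotImplementedException())); // E_NOTIMPL
        Console.WriteLine(HResultOf(new OutOfMemoryException()));    // E_OUTOFMEMORY

        try
        {
            throw new InvalidOperationException("logic failed");
        }
        catch (InvalidOperationException ex)
        {
            // Error handling lives here, apart from the code that raised it.
            Console.WriteLine(ex.Message);
        }
    }
}
```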

The Role of the Bridges

As we have seen, the two systems differ greatly. In order to have Interop between these two models, some sort of "bridge" or wrapper is needed. For COM Interop, there are two such bridges. One, the Runtime Callable Wrapper (RCW), takes a COM component, wraps it up, and allows .NET clients to consume it. The other, the COM Callable Wrapper (CCW), wraps a .NET object for consumption by a COM client.

You may have noticed the word "runtime" in the terms. These bridges or wrappers are created dynamically by the CLR at runtime. In the case of the RCW, the CLR generates it from the metadata contained in the Interop assembly that has been generated. In the case of the CCW, an Interop assembly is not needed; the generation of a COM type library is completely optional. What is needed, however, is for the assembly to be registered in the Windows registry so that COM may call it. (We will look at this in the next article in the series.)

These wrappers completely handle and mask the transition between COM and .NET. The wrappers handle all the differences I spoke of earlier: data marshaling, object lifetime issues, error handling, and many more. As you might expect of a bridge, it should just get you from one side to the other safely without dealing with the details. You create the wrapper using the object creation semantics (new in .NET, CoCreateInstance in COM) that you would use on either side and internally the real object gets created. Then you simply make calls on the wrapper, which calls through to the real object.

At a high level, it looks as shown in Figure 1:

Figure 1. Creating wrappers using object creation semantics and calling on them

The wrappers depend on type information. As I have stated, some sort of conversion process or tool is required to morph COM type data into CLR metadata and vice versa. The .NET Framework Software Development Kit (SDK) provides these tools, which we will look at shortly.

Now that we have talked about the overall idea of the bridges, let's look at the Runtime Callable Wrapper (RCW) in more detail.

The Runtime Callable Wrapper (RCW)

.NET clients never talk to a COM object directly. Instead, the managed code talks to a wrapper called the Runtime Callable Wrapper (RCW). The RCW is a proxy dynamically created at runtime by the CLR from the metadata information contained in the Interop assembly. To the .NET client, the RCW appears as any other CLR object. Meanwhile, the RCW acts as a proxy marshalling calls between a .NET client and a COM object. There is exactly one RCW per COM object, regardless of how many managed references are held on it. It is the job of the RCW to maintain COM object identity by calling IUnknown->QueryInterface() under the covers, caching the interface pointers internally, and calling AddRef and Release at the right times.

The RCW performs the following functions:

  • Marshal method calls
  • Proxy COM interfaces
  • Preserve object identity
  • Maintain COM object lifetime
  • Consume default COM interfaces such as IUnknown and IDispatch

The process looks similar to that shown in Figure 2.

Figure 2. The RCW process
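One way to observe an RCW without an Interop assembly is to create a COM object late-bound and ask the Marshal class about it. The sketch below is a hedged illustration only: it assumes a Windows machine on which the Scripting.Dictionary COM class is registered, and a runtime that provides RuntimeInformation (.NET Framework 4.7.1 or later, or .NET Core). On other platforms it stops after the first line of output.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RcwDemo
{
    public static void Main()
    {
        // An ordinary managed object is not backed by an RCW.
        Console.WriteLine(Marshal.IsComObject(new object()));

        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("COM is only available on Windows.");
            return;
        }

        // Assumption: the Scripting.Dictionary COM class is registered on this machine.
        Type comType = Type.GetTypeFromProgID("Scripting.Dictionary");
        if (comType == null)
        {
            Console.WriteLine("ProgID not registered.");
            return;
        }

        object dictionary = Activator.CreateInstance(comType);   // the CLR hands back an RCW
        Console.WriteLine(Marshal.IsComObject(dictionary));

        // Ask the RCW to release its COM reference now instead of waiting for the GC.
        Marshal.ReleaseComObject(dictionary);
    }
}
```

The managed code never calls AddRef, Release, or QueryInterface itself; the RCW does so on its behalf, and ReleaseComObject is the one place where the managed side reaches in to influence lifetime.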

The COM Callable Wrapper (CCW)

The COM Callable Wrapper (CCW) performs a similar role in the other direction. It also acts as a bridge or proxy, this time when a COM client wishes to talk to a .NET object. The main job of the CCW is to forward calls to the .NET object from COM clients who are under the illusion that they are talking to another COM object. In that regard, the CCW implements standard COM interfaces like IUnknown, IDispatch, and quite a few others. A single CCW is shared among multiple COM clients for a .NET type.

The CCW performs the following functions:

  • Transforms COM data types into CLR equivalents (marshaling)
  • Simulates COM reference counting
  • Provides canned implementations of standard COM interfaces
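For the other direction, a managed class is typically prepared for COM clients with a handful of attributes from System.Runtime.InteropServices. The following is a minimal sketch under stated assumptions: the ITemperatureConverter interface, the TemperatureConverter class, and both GUIDs are hypothetical and invented for this example, and the assembly would still have to be registered before a COM client could create the object.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical interface and GUIDs, invented for this sketch.
[ComVisible(true)]
[Guid("6F1E2A54-3B7C-4D19-9A0E-2C5B8D7F4A11")]
[InterfaceType(ComInterfaceType.InterfaceIsDual)]
public interface ITemperatureConverter
{
    [DispId(1)]
    double CelsiusToFahrenheit(double celsius);
}

[ComVisible(true)]
[Guid("B2C9D3E7-5A41-4F68-8E2D-7A1C0B9F6E33")]
[ClassInterface(ClassInterfaceType.None)]   // expose only the explicit interface
public class TemperatureConverter : ITemperatureConverter
{
    public double CelsiusToFahrenheit(double celsius)
    {
        return celsius * 9.0 / 5.0 + 32.0;
    }
}

public static class CcwDemo
{
    public static void Main()
    {
        // Managed callers use the class directly; no CCW is involved here.
        var converter = new TemperatureConverter();
        Console.WriteLine(converter.CelsiusToFahrenheit(100.0));
    }
}
```

Nothing in the class implements IUnknown or IDispatch. When a COM client asks for the object, the CCW supplies those interfaces and the reference counting, and presents the managed return value in the HRESULT-based shape that COM callers expect.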

The Type Library Importer (TLBIMP.EXE)

Before I get to an example (finally!), I need to say a little bit about the Type Library Importer tool. I will be spending a lot of time with the type library importer in my next article, but I'll give a brief introduction here.

As I have stated earlier, the CLR cannot do anything with COM type information. The CLR requires type information in the form of CLR metadata in an assembly. Clearly, we need some mechanism to read COM type information and convert it to CLR metadata in an assembly. These assemblies, termed Interop assemblies, can be created in three different ways.

The first of these ways is to use the Add COM Reference wizard in Visual Studio .NET. I find this option far too limiting for Interop work as it does not allow any flexibility in options. For this reason, as well as the fact that there is already plenty of MSDN documentation on how to use this wizard, I will not discuss it further in this series. The second option is to use the Type Library Importer tool (TLBIMP.EXE). The final option is to programmatically use the System.Runtime.InteropServices.TypeLibConverter class. The first two options actually call this class to perform their work. We will be focusing on the TLBIMP tool here.

TLBIMP is a command-line tool that is available in the .NET Framework SDK, as well as with Visual Studio .NET. It reads COM type information (usually in a *.tlb, *.dll, or *.exe file) and converts it to CLR types in an Interop assembly. This tool has a whole host of options that we will explore in depth in the next article. For now, I will simply mention one important one. You can certainly use TLBIMP in its simplest form by merely specifying the name of the COM file you wish to convert. Unfortunately, if you do this, TLBIMP will overwrite that particular file with the Interop assembly without warning. What you want to do instead is specify the /out option. This allows you to specify the output file name that you wish. Your company may have particular standards for this but a convention that I like to use is to precede the name of the file with "Interop." Thus "foo.dll" becomes "Interop.foo.dll." Given this, our simplest form of TLBIMP becomes:

TLBIMP foo.dll /out:Interop.foo.dll

Let's now proceed to a very simple example.

Let's Get Squared

For an initial example, I have chosen to implement a trivial COM component with one interface, IMSDNComServer, and one method, SquareIt. This amazing method takes a double as input and returns that number squared. I have implemented it using Visual C++® 6.0. You can download the sample code. Please note that, to keep the samples simple, the sample code does not perform any form of error checking. In the code that you develop, you will obviously want to do this. The relevant IDL looks like the following:

interface IMSDNComServer : IDispatch
{
   [id(1), helpstring("method SquareIt")] HRESULT SquareIt([in] double dNum, 
      [out] double *dTheSquare);
};

coclass MSDNComServer
{
      [default] interface IMSDNComServer;
};

In the file MSDNComServer.cpp, the SquareIt method looks like the following:

STDMETHODIMP CMSDNComServer::SquareIt(double dNum, double *dTheSquare)
{

   *dTheSquare = dNum * dNum;

   return S_OK;
}

Also included with the download is a Visual Basic 6.0 Test client that instantiates the COM server and calls the SquareIt method. The code is quite simple:

    Dim oSquare As MSDNComServer
    Set oSquare = New MSDNComServer
    Dim dIn As Double
    Dim dTheSquare As Double

    dIn = 3
    Call oSquare.SquareIt(dIn, dTheSquare)
    MsgBox Str(dTheSquare)

When we run our Visual Basic 6.0 test application, we get the expected result:

Figure 3. SquareIt test application

Generating the Interop Assembly

Our objective is to use this COM component from our .NET code. We could use the Visual Studio .NET (COM) Add Reference wizard (Add Reference | COM Wizard), and in this kind of very simple COM component, TLBIMP offers no particular advantages, but for the sake of illustration, we are going to use the TLBIMP command. To use this command, on the Programs menu, click Visual Studio .NET Tools | Visual Studio .NET 2003 Command Line Prompt. In this way, the correct path and environment variables will be set.

You can look at the many options that TLBIMP offers by typing:

C:\Code\MSDN\MSDNManagedSquare>tlbimp /?

There are quite a few options. We will be examining a lot of these options and the effects that they have on the generated Interop assembly in the second article of this series. For our purposes in this simple example, we are going to just choose to specify the name of the output file. Recall that if this option is not specified, TLBIMP will overwrite the specified file with the generated assembly. Our command line looks like:

C:\Code\MSDN\MSDNManagedSquare>tlbimp /out:Interop.MSDNCom.dll MSDNCom.dll

This particular permutation of TLBIMP takes our COM server that is in MSDNCom.dll and generates an Interop Assembly named Interop.MSDNCom.dll. It is very important to realize that the underlying COM component remains the same; it does not change in any way. What we have done is create another "view" into it, a view from the CLR perspective.

To take a look at the "managed view", we can use ILDASM.exe. This tool, which also ships with both the .NET Framework SDK and Visual Studio .NET, allows us to look at the metadata and IL that is contained within a managed assembly. You will find this tool indispensable in your Interop work as you customize the type library import and export process. Upon invoking ILDASM on our Interop assembly, our top-level view looks like the following:

Figure 4. Managed view provided by ILDASM

Our Interop assembly contains two things: a Manifest and a namespace called Interop.MSDNCom. Drilling into our namespace, we find that the type library import process has produced three things!

Figure 5. Results of Type Library Importer process

The Type Library Importer has generated an abstract interface, IMSDNComServer, and two classes, MSDNComServer and MSDNComServerClass. The reasons for this are a bit complex and are the subject of my next article where we will look at the importing process in great detail. In the meantime, it is sufficient to note that this is due to the wide differences in the programming models and how components are versioned.

One thing to note is that an Interop assembly contains mostly metadata. The methods are mostly forwarded calls to the underlying COM component, emphasizing the role of bridge or proxy. This may be demonstrated by looking at the disassembly for the SquareIt method.

.method public hidebysig newslot virtual 
        instance void  SquareIt([in] float64 dNum,
                                [out] float64& dTheSquare) runtime managed internalcall
{
  .custom instance void [mscorlib]System.Runtime.InteropServices.DispIdAttribute::.ctor(int32) = ( 01 00 01 00 00 00 00 00 ) 
  .override Interop.MSDNCom.IMSDNComServer::SquareIt
} // end of method MSDNComServerClass::SquareIt 
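Read back into C#, the disassembly corresponds roughly to the declaration sketched below. This is a hand-written stand-in, not the importer's actual output: the names IMSDNComServerView and ManagedSquareStandIn are hypothetical, the COM-specific attributes that a real Interop assembly carries (such as the interface GUID) are left out, and the implementation is purely managed so that the sketch runs without the COM server.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

// Hand-written stand-in for the imported interface (hypothetical name).
public interface IMSDNComServerView
{
    [DispId(1)]
    void SquareIt([In] double dNum, out double dTheSquare);
}

// Purely managed implementation, used only so the sketch can run without COM.
public sealed class ManagedSquareStandIn : IMSDNComServerView
{
    public void SquareIt(double dNum, out double dTheSquare)
    {
        dTheSquare = dNum * dNum;
    }
}

public static class SignatureDemo
{
    public static int DispIdOf(MethodInfo method)
    {
        var attribute = (DispIdAttribute)Attribute.GetCustomAttribute(method, typeof(DispIdAttribute));
        return attribute.Value;
    }

    public static void Main()
    {
        MethodInfo squareIt = typeof(IMSDNComServerView).GetMethod("SquareIt");
        Console.WriteLine(DispIdOf(squareIt));                  // the id(1) from the IDL
        Console.WriteLine(squareIt.ReturnType == typeof(void)); // the HRESULT is not part of the signature
        Console.WriteLine(squareIt.GetParameters()[1].IsOut);   // [out] double* became an out parameter

        double result;
        new ManagedSquareStandIn().SquareIt(3.0, out result);
        Console.WriteLine(result);
    }
}
```

Two things stand out when this is compared with the IDL. The HRESULT return value has disappeared from the managed signature, and the [out] pointer has become an out parameter. The byte sequence after DispIdAttribute in the disassembly is simply the encoded constructor argument, the value 1.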

Managed Square

Now that we have generated the Interop assembly, we can use it from a .NET client. For the sake of example, I have generated a Windows Forms application in C#. I am going to assume that you know how to use Visual Studio .NET to create such a project. The code is available as the MSDNManagedSquare project. The project has a reference to the generated Interop assembly, Interop.MSDNCom. Once that has been accomplished, we can make the metadata of the assembly available with the C# using statement:

using Interop.MSDNCom;

To call the COM server from managed code simply requires instantiating the class and calling the method.

private void button1_Click(object sender, System.EventArgs e)
{
   double numToSquare = System.Convert.ToDouble(textBox1.Text);
   double squaredNumber;

   MSDNComServerClass squareServer = new MSDNComServerClass();
   squareServer.SquareIt(numToSquare, out squaredNumber);

   textBox2.Text = squaredNumber.ToString();
}

Notice that the code looks like any other .NET code to instantiate an object and call its methods. We have not had to write any special code to work with COM, nor use GUIDs, CoCreateInstanceEx, and other COM programming constructs. When we run our application, it works as expected. Underneath the covers, the CLR dynamically creates an RCW through which the SquareIt method is called and results are returned. This is completely transparent to the executing application, however.
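The sample handler above skips error checking to stay short. In real code, the two likely failure points are the text conversion and the call across the bridge. The sketch below shows one possible shape for handling both. The SafeSquare class and its TrySquare method are hypothetical, and the delegate parameter stands in for the call to SquareIt so that the example runs without the COM server.

```csharp
using System;
using System.Globalization;
using System.Runtime.InteropServices;

public static class SafeSquare
{
    // 'square' stands in for the call into the COM server (hypothetical delegate).
    public static bool TrySquare(string input, Func<double, double> square, out string message)
    {
        double number;
        if (!double.TryParse(input, NumberStyles.Float, CultureInfo.InvariantCulture, out number))
        {
            message = "Please enter a number.";
            return false;
        }

        try
        {
            message = square(number).ToString(CultureInfo.InvariantCulture);
            return true;
        }
        catch (COMException ex)
        {
            // A failure HRESULT from the COM side can surface as a COMException.
            message = "COM call failed: 0x" + ex.ErrorCode.ToString("X8");
            return false;
        }
    }

    public static void Main()
    {
        string message;
        TrySquare("3", x => x * x, out message);
        Console.WriteLine(message);
        TrySquare("three", x => x * x, out message);
        Console.WriteLine(message);
    }
}
```

In the Windows Forms handler, the delegate body would create the MSDNComServerClass instance and call SquareIt, and the message would go to the second text box. Note that some failure HRESULTs map to more specific exception types than COMException, so a production handler may need additional catch blocks.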

Where Are We and Where Are We Going?

We looked at the reasons why a bridge is necessary due to the vast differences between COM and .NET in terms of identity, locating components, object lifetime, programming model, type information, versioning, and error handling. Two bridges to use with COM and .NET were discussed: the Runtime Callable Wrapper (RCW) and the COM Callable Wrapper (CCW). Because the two systems use incompatible type systems, type information is one of the most important differences, so we examined using the Type Library Importer (TLBIMP) to transform COM type information into CLR types in the form of metadata.

The first example was a COM component to square a number, and we generated an Interop assembly using TLBIMP and then used it from a C#-based Windows Forms application.

In the next article, Using the .NET Framework SDK Interoperability Tools, we will take a much closer look at TLBIMP and the Type Library Importer process, as well as a detailed look at the marshaling process and how to use the attributes and classes in the System.Runtime.InteropServices namespace to tailor the importing process.