From the February 2002 issue of MSDN Magazine

MSDN Magazine

Visual C++ .NET

Tips and Tricks to Bolster Your Managed C++ Code in Visual Studio .NET

Tomas Restrepo
This article assumes you're familiar with C++ and .NET
Download the code for this article: ManagedC.exe (36KB)
SUMMARY Developers using the Managed Extensions for C++ have more options than those using other languages because C++ is a lower-level language. However, this means an increase in code complexity.
      This article discusses a few of the more complex issues facing developers, such as operator overloading, managed types and unmanaged code, and boxing. Also covered are the is operator, the using statement, and string conversions. The author points out the flexibility of Managed Extensions for C++ and outlines the additional effort that is required for you to take advantage of its increased power and flexibility.

Like traditional C++, the Managed Extensions for C++ introduced with Visual C++® .NET allow you to go pretty much wherever you want to go within the confines of the underlying platform. This means that developers using Managed Extensions for C++ have a few more tools and options than programmers working in other Microsoft® .NET languages, but this power comes at a price. As with traditional C++ code, there is more complexity and a heavier syntactical burden for tasks that are often easy in other languages.
      In this article I'm going to show you a number of techniques to make your work easier when you use the Managed Extensions for C++ to write applications for .NET. This article assumes you are already familiar with the managed C++ syntax and most of the basic principles, although I'll explain some constructs that might be more complex or unusual.

Overloaded Implicit and Explicit Conversions

      The C# language allows developers to overload both implicit and explicit conversion operators via the implicit and explicit keywords. Consider this simple C# example:

  public class Convertible {
      private int _value;

      public Convertible ( int v ) { _value = v; }

      public static implicit operator int ( Convertible c ) {
          return c._value;
      }
      public static explicit operator double ( Convertible c ) {
          return (double)c._value;
      }
  }

The two overloaded operators shown in the previous code allow you to convert an instance of Convertible into a 32-bit integer value without an explicit cast, or convert it to a 64-bit floating-point value with an explicit cast.
      Now, consider how you'd invoke any of these conversion operators from a managed C++ client. The first thought that comes to mind is just doing the same thing a C# client would do:

  Convertible* c = new Convertible(15);
  int c1 = *c;
  double c2 = (double)(*c);

However, that doesn't work, and you'll see a couple of C2440 ("cannot convert") errors if you try to compile it. So, how do you do it?

Figure 1 The IL Disassembler

      The trick lies in figuring out what code the C# compiler generates behind the scenes for these two conversion operators. Fortunately, the .NET disassembler, ILDASM.EXE, can tell you (see Figure 1). This is how those two operators are defined in intermediate language (IL):

  // implicit operator int
  .method public hidebysig specialname static
      int32 op_Implicit(class Convertible c) cil managed
  {
  }
  // explicit operator double
  .method public hidebysig specialname static
      float64 op_Explicit(class Convertible c) cil managed
  {
  }

      As you can see, what you really have here are two methods, op_Implicit and op_Explicit, marked with special attributes so that they are usually hidden from the user. The C# compiler uses syntactic sugar so that these methods are invoked implicitly in a standard-defined conversion sequence, or explicitly in a cast expression. So actually you're not dealing with C++-style operators at all.
      What you can do, however, is call these methods directly from managed C++ to accomplish what you want. Here's what the new code looks like:

  Convertible* c = new Convertible(15);
  int c1 = Convertible::op_Implicit(c);
  double c2 = Convertible::op_Explicit(c);

You can use this same technique to write managed classes in C++ that overload conversion operators usable from C#. Just add static op_Implicit and op_Explicit methods to your __gc or __value classes, and they will turn up in C# as overloaded conversion operators.
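For readers who think in standard C++ terms, the relationship between the named methods and the conversion syntax can be sketched without any managed code. The following is only an analogy in plain C++11, not Managed Extensions code: the class mirrors the Convertible example, and its native conversion operators simply forward to static op_Implicit and op_Explicit functions, much as the C# compiler maps its conversion syntax onto those method names.

```cpp
// Plain standard C++ analogy (assumption: C++11 or later for
// "explicit operator"). This is not Managed Extensions code.
class Convertible {
public:
    explicit Convertible(int v) : value_(v) {}

    // Named static methods, playing the role of the IL-level methods.
    static int op_Implicit(const Convertible& c) { return c.value_; }
    static double op_Explicit(const Convertible& c) {
        return static_cast<double>(c.value_);
    }

    // Native conversion operators that only forward to the named methods.
    operator int() const { return op_Implicit(*this); }
    explicit operator double() const { return op_Explicit(*this); }

private:
    int value_;
};
```

With this sketch, `int a = c;` and `static_cast<double>(c)` end up in the same code as the direct calls `Convertible::op_Implicit(c)` and `Convertible::op_Explicit(c)`.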

Overloading Operators

      Most other operators can be overloaded using a technique similar to the one used for implicit/explicit conversion operators, from C# as well as from the Managed Extensions for C++. The operators can be split across two main categories: unary and binary. Both categories encompass arithmetic, logical, and comparison operators.
      Figure 2 lists the operators that can be overloaded and the corresponding function names that the overloaded operators should have in Managed Extensions for C++. Let's look at the basic signatures for each of the two overloaded operator types. For unary operators, the signatures are

  static MT*  op_<name>(MT* o);
  static bool op_<name>(MT* o);

and for binary operators, the signatures are the following:

  static MT*  op_<name>(MT* lhs, MT2* rhs);
  static bool op_<name>(MT* lhs, MT2* rhs);

Please note that I use MT and MT2 as placeholders (think of a template syntax) for managed types in these signatures, where MT2 and MT might be the same type.
      Most of these operators, except for op_True and op_False, will be recognized by C++ developers who are familiar with operator overloading. Basically, these two operators allow a class to be evaluated as a Boolean expression and thus participate in more complex Boolean expressions, like using the logical AND/OR/XOR operators. They can also be used as condition variables in conditional statements such as if or while. This is something developers can accomplish to a certain degree in traditional C++ by providing implicit conversions to bool. It is also worth noting that, in most contexts, the same holds true for C# code.
      So what do op_True and op_False really do? They become really useful when you have types that have to deal with poly-valued logic, and thus can be thought to be, logically speaking, in true and false states at the same time, or in neither state. The classic example in the framework where you'll see this is the SQL data types in System::Data::SqlTypes, which implement equality and inequality operators in terms of the SqlBoolean value type. SqlBoolean implements op_True and op_False instead of an implicit conversion to bool, which allows it to work correctly when null database values are involved. (If a SqlBoolean value is null, then both op_True and op_False will return false.)
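The "neither true nor false" state is easier to see in a small model. The sketch below is plain standard C++, not managed code, and TriBool is a hypothetical type invented for illustration; it only mimics the behavior described above for a null SqlBoolean, where both questions are answered with false.

```cpp
// Hypothetical three-valued boolean in plain standard C++.
// The names TriBool, kFalse, kTrue, and kNull are assumptions
// made for this illustration only.
class TriBool {
public:
    enum State { kFalse, kTrue, kNull };

    explicit TriBool(State s) : state_(s) {}

    // Two separate questions, so a null value can answer "no" to both.
    static bool op_True(const TriBool& b)  { return b.state_ == kTrue; }
    static bool op_False(const TriBool& b) { return b.state_ == kFalse; }

    // Three-valued AND: a definite false wins, otherwise null propagates.
    static TriBool op_BitwiseAnd(const TriBool& a, const TriBool& b) {
        if (op_False(a) || op_False(b)) return TriBool(kFalse);
        if (op_True(a) && op_True(b))   return TriBool(kTrue);
        return TriBool(kNull);
    }

    State state() const { return state_; }

private:
    State state_;
};
```

A single implicit conversion to bool could not express this, because it would have to map the null state onto either true or false.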
      Looking at it from an implementation standpoint, it's interesting to note that, as in C#, all operators are overloaded by implementing these functions as public static members of a managed class. This is unlike traditional C++ code, in which some have to be implemented as member functions while others are implemented as global functions.
      As a return type, almost all operators have the same type as the class they are defined in. So if you implement a class called MyManagedClass, then that's the type the operators should return. The only exceptions are those operators marked with an asterisk (*) in Figure 2, which usually return a Boolean value instead.
      Here's a short example of how you'd implement a unary .NET operator such as op_Increment:

  __gc class IntWrapper {
  private:
      int m_value;
  public:
      IntWrapper(int v) : m_value(v) { }
      __property int get_Value() { return m_value; }
      __property void set_Value(int v) { m_value = v; }

      static IntWrapper* op_Increment(IntWrapper* o) {
          return new IntWrapper(o->Value+1);
      }
  };

Implementing a binary operator for IntWrapper, such as op_Addition, would look like this:

  static IntWrapper* op_Addition(IntWrapper* lhs, IntWrapper* rhs) {
      return new IntWrapper(lhs->Value + rhs->Value);
  }

      Since overloaded operators are not in the Common Language Specification (CLS), you should always provide alternative methods in your classes that implement the same functionality as those operators. This allows them to be consumed by other languages such as Visual Basic® .NET, which don't support operator overloading. This can also be much more convenient in terms of the Managed Extensions for C++ since most .NET operators are not easily consumed from managed C++ code without calling the op_XXXX methods directly. In other words, this won't compile:

  IntWrapper* v = new IntWrapper(1);
  IntWrapper* z = new IntWrapper(1);
  IntWrapper* y = v + z;   // error: two pointers cannot be added

      This is a side effect of using explicit pointer notation for handling references to managed objects in managed C++ because arithmetic operators have a special meaning when they are combined with pointer semantics, even if those semantics are prohibited in managed code.
      However, if you're defining value types using the __value keyword, remember that your overloaded operators take and return complete value types rather than pointers. In that case, the .NET overloaded operators can be consumed from managed C++ directly, without going through the op_XXXX methods, since pointer notation doesn't get in the way.
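The same split exists in standard C++, which may make the rule easier to remember. The sketch below is plain C++ rather than managed code, and this IntWrapper is a hypothetical native counterpart of the class above: an overloaded operator needs at least one operand of class or enumeration type, so operator+ can be written for values but not for a pair of pointers, and pointer-based code has to call the named function.

```cpp
// Plain standard C++ analogy; a hypothetical native counterpart of IntWrapper.
struct IntWrapper {
    int value;
    explicit IntWrapper(int v) : value(v) {}

    // Named alternative, usable no matter how the operands are held.
    static IntWrapper op_Addition(const IntWrapper& lhs, const IntWrapper& rhs) {
        return IntWrapper(lhs.value + rhs.value);
    }
};

// Legal: both operands are of class type.
inline IntWrapper operator+(const IntWrapper& lhs, const IntWrapper& rhs) {
    return IntWrapper::op_Addition(lhs, rhs);
}

// Not legal in C++, because both operands would be pointers:
//   IntWrapper* operator+(IntWrapper* lhs, IntWrapper* rhs);

// Code that only holds pointers goes through the named function instead.
inline int add_through_pointers(const IntWrapper* lhs, const IntWrapper* rhs) {
    return IntWrapper::op_Addition(*lhs, *rhs).value;
}
```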

Handling Managed Types from Unmanaged Code

      One of the problems you're bound to get yourself into when mixing managed and unmanaged code is that unmanaged C++ cannot touch a managed type instance directly. The simplest workaround is to use a set of standalone managed functions that can be called from the unmanaged code. While this technique is simple and useful, it falls short in many cases, particularly those in which you need to hold the managed object and manipulate it over longer periods of time.
      What you need is a way to hold the managed object in some location that can be accessed from unmanaged code. The first thing that comes to mind is an unmanaged type with managed methods (that is, a __nogc class in a managed code section). However, the unmanaged type cannot hold the managed instance directly, since the .NET runtime is not completely aware of unmanaged types and the garbage collector (GC) has no way to keep track of the references to the managed types that way.
      Fortunately, there is a way to accomplish this with just a tad more work. The solution is to use the GCHandle structure found in the System.Runtime.InteropServices namespace of the .NET Framework. This structure gives you a means to hold a managed object reference in unmanaged memory. One of the nicest things about GCHandle is that it allows you to keep all types of references, including pinned and weak references.
      Using GCHandle is fairly simple. You use the GCHandle::Alloc method to create an opaque handle to a managed object and GCHandle::Free to release it. Also, the GCHandle::Target property allows you to obtain the object reference back from the handle in managed code. Figure 3 shows you how to create a simple wrapper class around the current AppDomain object. It's not hard, as you can see, but it's a little bit annoying, particularly when you have to cast constantly to get at the object reference. If you're wondering about GCHandle::op_Explicit, it's simply a way to trigger the explicit conversion operators defined in GCHandle that allow a GCHandle instance to be cast to and from an IntPtr, which is the .NET runtime version of a native-size pointer.
      The good news is that you don't have to do this if you use a little template tucked away in gcroot.h called, well, gcroot. This is basically a smart pointer around a GCHandle instance. I rewrote the code from Figure 3 to use gcroot instead of GCHandle directly, as you can see in Figure 4.
      Like any other smart pointer, gcroot overloads operator->, allowing you to use it directly without casting. It does a good job of abstracting away the details of calling IntPtr::ToInt32 or IntPtr::ToInt64, depending on which platform the executable was compiled for (Win32® or Win64™).
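To make the division of labor concrete, here is a rough model in plain standard C++. It is not the real gcroot or GCHandle: HandleTable, Root, and Domain are hypothetical names, and the table is an ordinary map rather than anything the runtime provides. The point is only the shape of the idea, in which the wrapper stores an opaque integer handle, resolves it on every access through operator->, and releases it in its destructor.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a runtime-owned handle table (not a real API).
class HandleTable {
public:
    static std::intptr_t Alloc(void* target) {
        std::intptr_t handle = next()++;
        slots()[handle] = target;
        return handle;
    }
    static void Free(std::intptr_t handle) { slots().erase(handle); }
    static void* Target(std::intptr_t handle) {
        auto it = slots().find(handle);
        return it == slots().end() ? nullptr : it->second;
    }
    static std::size_t LiveCount() { return slots().size(); }

private:
    static std::map<std::intptr_t, void*>& slots() {
        static std::map<std::intptr_t, void*> table;
        return table;
    }
    static std::intptr_t& next() {
        static std::intptr_t value = 1;
        return value;
    }
};

// Hypothetical wrapper: holds only the opaque handle, never the object itself.
// Copying is disabled here purely to keep the sketch short.
template <typename T>
class Root {
public:
    explicit Root(T* object) : handle_(HandleTable::Alloc(object)) {}
    ~Root() { HandleTable::Free(handle_); }
    Root(const Root&) = delete;
    Root& operator=(const Root&) = delete;

    // Resolve the handle on each access, so callers never cast by hand.
    T* operator->() const {
        return static_cast<T*>(HandleTable::Target(handle_));
    }

private:
    std::intptr_t handle_;
};

// Hypothetical payload type used only to exercise the wrapper.
struct Domain {
    std::string name;
};
```

Code that holds a `Root<Domain>` can write `root->name` directly, which is the convenience the smart pointer adds over carrying the raw handle around.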
      You should be using at least the release candidate version of Visual Studio® .NET by now, but be aware that although the compiler would allow you to simply hold a GCHandle instance inside AppDomainWrapper (and it would mostly work), you might run into unexpected crashes with the Beta 2 compiler, since it has a bug that causes it to get the size wrong.

Unboxing

      Unlike in C#, boxing operations are explicit in managed C++, so you have to use the __box keyword when boxing a value type. However, you won't find any __unbox keyword, so how do you get the value type back? You can do it with a dynamic_cast, like this:

  int a = 12432;
  System::Object* o = __box(a);
  std::cout << "original value: " << *(dynamic_cast<System::Int32*>(o));

      Ugly, isn't it? Besides all the trouble it involves, there's also the small problem that dynamic_cast won't accept an unmanaged type, like an int*. Instead you have to use the managed value type counterpart in the System namespace (Int32, in this case), unless you use the even less intuitive alternative:

  *(dynamic_cast<__box int*>(o));
  

 

Unboxing requires some understanding of boxed types. You must first cast the Object* to a pointer to the boxed version of the type you want (for example, __box int*), and then dereference that pointer, which yields an object of the unboxed type. If you're like me, and want consistency and simplicity, then you long for an unbox function that wraps this process and yields the value directly. It costs a little more, since the value is returned by value rather than accessed through the pointer to the box, but it is simpler. It turns out that it's pretty easy to simulate with the following template function:

  template <typename U>
  inline U unbox(System::Object* o) {
      return *(dynamic_cast<__box U*>(o));
  }

which can now be used like this:

  std::cout << "original value: " << unbox<int>(o);
  

 

      That's much more readable. However, a few matters still need to be addressed. The first one is that the method I've just presented won't work in Visual Studio® .NET Beta 2 if the value type you're unboxing is a managed Enum. This is a compiler bug, which means you'll get an internal compiler error when the compiler reaches the dynamic_cast in unbox.
      Unfortunately, there's no easy way around this problem short of changing unbox so that it uses static_cast instead. This requires checking that the type conversion is valid before making the cast, like this:

  template <typename U>
  inline U unbox(System::Object* o) {
      if ( __typeof(U)->Equals(o->GetType()) )
          return *(static_cast<U __box*>(o));
      throw new System::InvalidCastException();
  }

If you're dealing with boxed enum types, then this last attempt won't compile if you're using System::Enum __gc* variables. For example, if you try this

  __value enum TheEnum {
      val1,
      val2,
      val3,
  };
  System::Enum __gc* pa = __box(TheEnum::val1);

you'll get the following error:

  mcpptt.cpp(100) : error C2594: 'argument' : ambiguous conversions
      from 'System::Enum __gc *' to 'System::Object __gc *'

      What's ambiguous in this conversion? Well, nothing, really. You've just run into a bug in Beta 2 of the Visual C++ .NET compiler, which has problems dealing with the implicit conversion between System::Enum* and System::Object*. If you're still not using a later version of the compiler, the only workaround that can be used is to overload the unbox function with a version explicitly taking an Enum* argument, which would look like this:

  template <typename U>
  
inline U unbox(System::Enum* o);

 

This last definition of unbox would have the exact same implementation code as the one taking a System::Object* argument that I presented before and will allow the sample code to compile.
      On a related note, it's interesting to see that when you're dealing with value types, there's a big difference between the following two lines of code:

  TheEnum __box* pa = __box(TheEnum::val1);
  
TheEnum __gc* pa = __box(TheEnum::val1);

 

The first line of code declares a boxed value type. This is essentially a way to treat the boxed value as if you were dealing with a reference type without needing to unbox it every time you have to access any of its members. (This happens implicitly in some other .NET languages.)
      The second line of code declares an interior __gc pointer that points directly at the value stored inside the boxing object. Interior pointers have several important restrictions and special semantics and can be hard to deal with, so I suggest you avoid them unless absolutely necessary.
      Keep the first definition of unbox handy, though. That's the one you'll want to use once the bugs in the compiler are worked out.
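The difference between the two unbox variants in this section (let dynamic_cast do the checking, or compare types first and then use static_cast) can be tried out in plain standard C++. The sketch below is only an analogy: Object and Boxed are hypothetical native types standing in for System::Object and a boxed value, and std::bad_cast stands in for the managed exception.

```cpp
#include <typeinfo>

// Hypothetical native stand-ins for System::Object and a boxed value.
struct Object {
    virtual ~Object() {}
};

template <typename T>
struct Boxed : Object {
    T value;
    explicit Boxed(T v) : value(v) {}
};

// Variant 1: dynamic_cast performs the runtime type check.
template <typename U>
U unbox(Object* o) {
    Boxed<U>* box = dynamic_cast<Boxed<U>*>(o);
    if (!box) throw std::bad_cast();
    return box->value;  // the value is copied out of the box
}

// Variant 2: compare the exact runtime type first, then static_cast.
template <typename U>
U unbox_checked(Object* o) {
    if (o && typeid(*o) == typeid(Boxed<U>))
        return static_cast<Boxed<U>*>(o)->value;
    throw std::bad_cast();
}
```

Both variants return the stored value for a matching type and throw when the box holds something else, so callers can switch between them without changing call sites.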

Converting Managed Strings to Character Arrays

      Having to convert managed strings to character arrays is quite common when you need to call unmanaged functions that handle strings. As it turns out, there are several ways to accomplish this.
      The first option is to use the StringToXXXX methods of the System::Runtime::InteropServices::Marshal class. There are variations for each combination: ANSI strings, Unicode strings, BSTRs, and so on. Here's how you'd use it to get a const char* out of a String instance:

  using namespace System::Runtime::InteropServices;

  const char* str = (const char*)
      (Marshal::StringToHGlobalAnsi(managedString)).ToPointer();
  // use str as you wish or copy it elsewhere
  // free string
  Marshal::FreeHGlobal(IntPtr((void*)str));

The StringToXXXX methods return an IntPtr instance, so you need to convert that either to a pointer or an integer value (32-bit or 64-bit) before using it. Don't forget to use the appropriate Marshal::FreeXXXX method when you're finished in order to release the allocated memory.
      The second option is to use PtrToStringChars. PtrToStringChars allows you to get access to the internal memory representation of the managed string instance with a simple function call. The vcclr.h header file (which is available in the Visual C++ .NET installation) contains the definition and the source code implementation of PtrToStringChars so you can see how it is done. Here's a short example:

  const System::Char* str = PtrToStringChars(managedString);
  

 

The downside of this option is that the pointer returned is a __gc pointer, which you can see more clearly by using this alternative definition of str:

  const wchar_t __gc* str = PtrToStringChars(managedString); 
  

 

      While PtrToStringChars is certainly simpler than the Marshal methods, it suffers from two drawbacks. The first one is, of course, that it returns a managed pointer (technically an interior __gc pointer, so make sure you don't accidentally write through it, or you might corrupt the managed heap). The second drawback is that it returns a Unicode string, so you might need yet another conversion if you need an ANSI string.
      There is a way to convert the __gc pointer to a __nogc one, but that's only possible if you pin it in memory so that you can make sure that the string instance won't be moved during the garbage collection cycles, as shown here:

  const wchar_t __pin* str = PtrToStringChars(managedString); 
  

 

The downside of this is a potential performance hit, since pinned objects can cause sandbars in the managed heap, preventing a full-heap compaction from being performed. If you plan to use pinned objects, use them sparingly and never hold onto them for long periods of time.

Converting String to std::string

      The C++ standard library std::string and std::wstring are string management classes that many people (myself included) use extensively so it's nice to have a way to convert a System::String instance into either std::string or std::wstring. Figure 5 shows two simple functions to convert a System::String* into a std::string or std::wstring instance. If you prefer to have these functions return the unmanaged string instance instead of passing it as a reference, it is easy to do so. Note that overloading won't work in that case and you'll need to give different names to each function.
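The remark about overloading can be illustrated with the native half of such a conversion alone. The sketch below is plain standard C++ and is not the code from Figure 5: to_native is a hypothetical name, and a std::wstring stands in for the characters already copied out of the managed string. Because the result is an out-parameter, the two functions differ in their parameter lists and can share a name; if they returned the string instead, they would differ only by return type, which C++ does not allow.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Wide result: nothing to convert, just copy.
inline bool to_native(const std::wstring& in, std::wstring& out) {
    out = in;
    return true;
}

// Narrow result: convert using the current C locale.
// Returns false if a character cannot be represented in that locale.
// Conversion stops at the first embedded null character, if any.
inline bool to_native(const std::wstring& in, std::string& out) {
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = in.c_str();
    // First pass with a null destination only measures the required size.
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1)) return false;

    std::vector<char> buffer(needed + 1);
    state = std::mbstate_t();
    src = in.c_str();
    std::wcsrtombs(buffer.data(), &src, buffer.size(), &state);
    out.assign(buffer.data(), needed);
    return true;
}
```

The caller picks the conversion simply by the type of the variable it passes as the second argument.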

Mixing Templates and Managed Types

      The .NET platform doesn't support generic programming yet, but managed C++ programmers can still take advantage of the template mechanisms available in normal C++ to a certain extent. Of course, this means you can only use them as internal aids to your managed C++ implementation and cannot expose them to other .NET languages, but they are still useful for a few tricks.
      You've already seen two examples of mixing templates and managed types in the form of the gcroot template and the unbox function. Now I'll tell you about a few other tricks you can put in your bag.

An Is Operator for Managed C++

      The C# language has an is operator that allows you to easily find out if an object has a given managed type, implements a given interface, or is in the same inheritance hierarchy as another type. The is operator simplifies some expressions and is exception-free, so it's quite useful. I wanted a way to simulate it, at least partially, in managed C++.
      My first attempt was to create a templated function that compared types:

  template <typename T1, typename T2>
  inline bool istypeof ( T2* t )
  {
      return ( __typeof(T1)->Equals(t->GetType()) );
  }

Obviously, this only accomplishes the goal of identifying the type, so I needed to extend it a little bit more. I turned to dynamic_cast, which behaves pretty much in the way you'd want. Here's the second attempt:

  template <typename T1, typename T2>
  inline bool istypeof ( T2* t )
  {
      return (dynamic_cast<T1*>(t) != 0);
  }

This is much better and works almost like you'd expect. Here's how you'd typically use it:

  bool isaB = istypeof<B>(someObject);
  

 

      However, there's one shortcoming: you can't pass a native type as T1 and expect it to work correctly. In other words, if you try something like this

  bool isanInt = istypeof<int>(someObject);
  

 

you'll get a C2682 compiler error. Trying System::Int32 instead of int doesn't work either. The solution is to compare the object's type to the boxed value type, instead of a pointer to the raw type. However, this requires using the __box syntax like so:

  bool isanInt = istypeof<int __box>(someBoxedInt);
  

 

This is certainly awkward and unintuitive, but unfortunately cannot be worked around (at least I haven't found a suitable workaround; if you do, please tell me!).
      One thing to watch out for when using the __typeof operator in managed C++ is that __typeof(wchar_t) can return different values depending on the compilation options. With the default compiler flags, it will return __typeof(System::UInt16), since the compiler doesn't enable wchar_t as an intrinsic type by default and instead defines it as unsigned short. However, if you compile with the /Zc:wchar_t compliance switch, it will return __typeof(System::Char), which is what you'd expect.
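The gap between the two istypeof attempts in this section (exact type match versus "is this type or something derived from it") shows up in standard C++ as well. The following sketch is plain C++, not managed code; A, B, and C are hypothetical classes, and typeid plays the part that __typeof and GetType play in the managed version.

```cpp
#include <typeinfo>

// Hypothetical class hierarchy: C derives from B, which derives from A.
struct A { virtual ~A() {} };
struct B : A {};
struct C : B {};

// First attempt: true only when the runtime type is exactly T1.
template <typename T1, typename T2>
inline bool is_exact_type(T2* t) {
    return t != nullptr && typeid(*t) == typeid(T1);
}

// Second attempt: true when the object is a T1 or derives from T1.
template <typename T1, typename T2>
inline bool istypeof(T2* t) {
    return dynamic_cast<T1*>(t) != nullptr;
}
```

For an `A*` that actually points at a C object, `is_exact_type<B>` reports false while `istypeof<B>` reports true, which is the behavior the C# is operator has.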

Simulating C#'s using Statement for IDisposable-aware Objects

      C#'s using statement calls Dispose on an object automatically when control leaves a block. This technique is rather easy to simulate using a simple template class, similar in behavior to gcroot, which I've called auto_dispose (see Figure 6). Using auto_dispose is a little different from using gcroot in the sense that the auto_dispose definition is similar to that of std::auto_ptr in the Standard C++ Library. With auto_dispose, you don't specify pointer syntax when declaring the managed object type, as you do with gcroot. This is merely a matter of personal preference since I feel more comfortable with auto_ptr's syntax than with that of gcroot, but modifying the auto_dispose definition to turn it around is quite simple. Here's how you would use auto_dispose:

  auto_dispose<DisposableObject> obj = new DisposableObject;
  // use obj as any other managed object pointer
  •••
  // obj->Dispose() called automatically at end of scope

      There are a couple of points worth mentioning about the implementation of auto_dispose. The first is that the implementation of the destructor uses static_cast to ensure that the type T does indeed implement IDisposable. (I'm not interested in the general case where a random class implements a Dispose method different from IDisposable::Dispose.) This is necessary because standard C++ template syntax provides no clean way to specify static type requirements. For example, there's no precise way of specifying that a template argument should implement a given interface or be derived from a certain base class. Using static_cast here allows you to work around this problem (at least partially) to get a compile-time error if T doesn't implement IDisposable.
      Second, note that auto_dispose hides its copy constructors and assignment operators, which is done to avoid easy copying of the inner pointer into another auto_dispose instance. This is really important because if it were allowed, situations could arise in which IDisposable::Dispose could be called twice, at different times, on the same object, which is not what you want, of course.
      Third, there is no implicit conversion operator to T* defined in the template, because you generally don't want direct access to the inner pointer held by auto_dispose. The one case where you might need it is to pass the object as an argument to another function. In most cases this is a dangerous and error-prone thing to do, because it forces you to treat the function you are calling as a white box, thus breaking the encapsulation.
      To see why this occurs, consider the case in which the function you just called with the disposable object caches the object instance somewhere, and you call Dispose shortly thereafter. The function now has what essentially amounts to a reference to a dead object, possibly in an invalid state (even though the GC doesn't consider it dead yet), which might cause unexpected problems later on. This can make your code throw unexpected exceptions and presents a situation that is difficult to debug. Finally, notice that you could still get the inner pointer by calling auto_dispose::operator->() directly if you're persistent, although you should really try to refrain from doing this.
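The design points discussed in this section can be seen together in a small native model. The sketch below is plain standard C++ and is not the managed code from Figure 6: IDisposable here is a hypothetical abstract class, Resource is a made-up test type, and the wrapper only calls Dispose without freeing memory, since in the managed version that job belongs to the garbage collector.

```cpp
// Hypothetical native stand-in for the IDisposable interface.
struct IDisposable {
    virtual void Dispose() = 0;
    virtual ~IDisposable() {}
};

// Sketch of a scope guard that calls Dispose when it goes out of scope.
template <typename T>
class auto_dispose {
public:
    explicit auto_dispose(T* p) : ptr_(p) {}

    ~auto_dispose() {
        // The static_cast fails to compile if T does not derive from
        // IDisposable, which acts as a crude static type requirement.
        if (ptr_) static_cast<IDisposable*>(ptr_)->Dispose();
    }

    // No conversion to T*; access goes through operator-> only.
    T* operator->() const { return ptr_; }

    // Copying is disabled so Dispose cannot run twice on the same object.
    auto_dispose(const auto_dispose&) = delete;
    auto_dispose& operator=(const auto_dispose&) = delete;

private:
    T* ptr_;
};

// Made-up disposable type that counts how often Dispose is called.
struct Resource : IDisposable {
    int dispose_calls = 0;
    void Dispose() override { ++dispose_calls; }
};
```

Instantiating `auto_dispose` with a type that has no IDisposable base, and then destroying it, produces a compile-time error at the static_cast rather than a surprise at run time.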

Conclusion

      As you can see, managed C++ is quite flexible and can do as much as any other .NET language can do, although sometimes it takes more effort on your part to make things happen. Fortunately, with a few handy routines and some tips and tricks up your sleeve, you can be much more productive in Managed Extensions for C++ and make your code more readable and maintainable.


 

Tomas Restrepo is a software developer at InterGrupo S.A. and is interested in object-oriented programming, design patterns, and C++. You can reach him at tomasr@mvps.org.