A Baker's Dozen: Thirteen Things You Should Know Before Porting Your Visual C++ .NET Programs to Visual Studio 2005

 

Stanley B. Lippman
Microsoft Corporation

September 2004

Applies to:
   Microsoft Visual C++ .NET
   Microsoft Visual C++ 2005
   Microsoft Visual Studio 2005
   Microsoft Visual Studio .NET

Summary: Stan Lippman discusses issues that developers should be aware of when porting applications to Microsoft Visual Studio 2005. (23 printed pages)

Contents

Introduction
The New Syntactic Landscape
Changes in Semantic Meaning
Changes Without One-To-One Mapping
Conclusion

Introduction

Those of us in the C++ community feel a bit like the older child in a family when the new baby is brought home. Oohs and aahs circle around the new little tyke, who everybody wants to hold and make goo-goo faces at. If we're lucky, we get a pat on the head. Maybe our hair is ruffled. It's hard not to feel suddenly ignored and a bit hurt. Actually, it's a tad worse in technology, where the ground is constantly shifting and keeping abreast is a matter of survival.

Of course, today, .NET is the new technology, and quite a feast at that, and everyone is oohing and aahing about C#. It is hard not to wonder if maybe we as C++ programmers ought not to learn C#. After all, in all the .NET discussion, there is hardly a mention of C++, except to maybe contrast it with the new kid on the block. In technology, after all, missing the boat can mean losing your meal ticket as well.

Are C++ programmers obsolete? Absolutely not! In this article, I'll touch on what's new in the current release of Microsoft Visual Studio .NET, and then give you some idea of our future plans. All of us on the Visual C++ team think you'll be pleasantly surprised.

I wrote that for an MSDN article introducing our Visual C++ work for Microsoft Visual Studio .NET. With the beta release of Microsoft Visual Studio 2005, this seems a good time for a reality check on my earlier promises. It's really not an idle question for the Visual C++ programmer. And it's really not a question of whether we've given Visual C++ a future in the .NET environment, but rather, have we given it a present? That, for me, has been the measure of all our work in redesigning the CLI binding for Visual C++.

Not to put too fine a point on it: the original work to integrate Visual C++ into .NET fell short. Oh, it succeeded very well in its primary purpose: provide a bridge over which to integrate existing native code into a .NET application—the technology is called IJW (It Just Works). And that's very cool. Unfortunately, other aspects of the language were less successful. Personally, I found writing C# code closer in spirit to C++ than I did writing code in the Managed Extensions. That had to be fixed.

The good news is that we didn't just fix the language; we reinvented it. We added deterministic finalization, support for automatic memberwise copy and initialization, first class support for operator overloading—heck, we even threw in support for STL, as well as support for both the template and CLI generic parameterized type mechanisms. It is now my language of preference for programming under .NET. We hope it will be your preferred language, as well.

The bad news is that we didn't just fix the language; we reinvented it. Getting from point A to point B, from the old language binding to the new language binding, is a bit like getting from Kansas to Oz. It is not just a mechanical transposition, but requires a bit of incantation.

To ease the transition, we've moved forward in three primary areas:

  • The compiler continues to accept the original syntax with a special switch (/clr:oldSyntax).
  • We're working on an informal translation tool that can get you 80% of the way there. It will likely be made available for test-driving during the time-frame of the second beta.
  • We're providing this document and a companion translation guide, Moving Your Programs from Managed Extensions for C++ to C++/CLI, which lists each language change in detail, with original and revised code snippets and the motivation for each change.

This article is meant as a developer summary. It highlights a baker's dozen issues that you'll need to consider in order to have a safe and pleasant porting experience. (To help you get a feel for the differences, an Appendix lists the issues in tabular form.)

The original language binding to the CLI was called Managed Extensions to C++. The revised language binding is referred to as C++/CLI, and an ECMA standard of this binding is currently progressing well. For convenience, within this article I refer to the original binding as V1, and the revised binding as V2.

The language revisions fall into the following general categories:

  1. Syntax. This represents changes in the way we define and manipulate our CLI types to make programming more elegant and pleasant. This is particularly true in the specification of CLI arrays and in scalar and index properties. While these changes are extensive, they are also largely mechanical.
  2. CLI conformance. The CLI object model differs in some significant ways from that of C++. The V1 binding at times resisted that, such as in its treatment of string literals, its handling of the CLI enum, and in the definition of value types. These changes, we believe, improve the fidelity of the dynamic programming model. Unfortunately, they do not in all cases provide a mechanical translation from V1 to V2. For example, value types no longer support a default constructor—Item #10. A CLI enum can no longer be forwardly declared—Item #7.
  3. CLI enhancements. If you are a C++ programmer, the absence of support for copy construction and the automatic invocation of a destructor on a reference type is not just aggravating but error-prone. These represent beneficial patterns of object management that enhance the integrity of our programs. These and other design patterns from ISO-C++ have been integrated within the CLI binding in V2. That's the good news. The bad news, again, is that they do not in all cases provide a mechanical translation from V1 to V2. For example, support for deterministic finalization changes the meaning of a class destructor between V1 and V2 (Item #6). An explicit overriding of an interface member is now integrated into a virtual function override mechanism (Item #12).

The New Syntactic Landscape

People rarely call home about the syntax of a programming language, except if it's to complain. Mostly, the best we can do is hope that we do not confuse people and that we have not made it too painful for them to spend hours at a time over weeks and months employing our language to implement and deploy programs. The least excusable sin of language syntax is to leave the programmer unsure as to the meaning of her program. Not only can that compromise the quality of the software; it can also diminish the quality of a programmer's life. The new language binding is a more inhabitable environment for the zestful development of complex software. If you are a Visual C++ programmer working to develop under .NET, I believe you will agree that we have significantly improved the language experience.

1. Contextual Keywords Replace __

The double-underscores are gone. That's the first obvious difference. The reason for their introduction in V1 was two-fold: (a) conformance to the ISO policy of setting off language extensions, and (b) providing a non-invasive strategy for introducing new keywords. So, the motivation was well-mannered and reasonable. The reason for their removal in V2 was that the result was an ugly syntax that felt both complex and unsightly. The V2 solution to (b) is to introduce contextual keywords. (The solution to (a) is a bit more of a tap dance and is performed in the full translation guide.)

A contextual keyword has a special meaning within specific program contexts. Within the general program, for example, sealed is treated as an ordinary identifier. However, when it occurs within the declaration portion of a managed reference class type, it is treated as a keyword within the context of that class declaration. This minimizes the potential invasive impact of introducing a new keyword in the language, something that we feel is very important to users with an existing code base. At the same time, it allows users of the new functionality to have a first-class experience of the additional language feature—something we felt was missing from the original language design.
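The idea of a word that is a keyword only in certain positions also exists in standard C++. The following is a minimal sketch in ISO C++11, not C++/CLI, and it assumes only that `final` behaves there much as `sealed` is described above: an ordinary identifier in general code, and a specifier when it follows a class name. The names `Leaf` and `contextual_keyword_demo` are hypothetical and chosen purely for illustration.

```cpp
// ISO C++11 sketch (not C++/CLI): `final` is an identifier with special
// meaning only in specific positions, analogous to `sealed` in the text.
#include <cassert>

// Here `final` is just an ordinary variable name.
int final = 42;

// Here, directly after the class name, `final` acts as a specifier that
// forbids further derivation from Leaf.
struct Leaf final {
    // Inside the body, `final` again names the global variable above.
    int value() const { return final; }
};

// Reads the global through Leaf to show that both uses coexist.
int contextual_keyword_demo() {
    Leaf leaf;
    return leaf.value();
}
```

Calling `contextual_keyword_demo()` from a `main` function exercises both meanings in one translation unit. Existing code that already used the word as a name keeps compiling, which is the non-invasive property the paragraph above is after.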

Table 1.1 provides a listing of the changes in the syntax of declaring the CLI types.

Table 1.1 The CLI Type Syntax Changes

CLI Type          Managed Extensions             C++/CLI
reference class   __gc class R                   ref class R
value class       __value class V                value class V
abstract class    __gc __abstract class R        ref class R abstract
sealed class      __gc __sealed class R          ref class R sealed
interface class   __gc __interface IBar          interface class IBar
CLI enum          __value enum E                 enum class E
delegate type     __delegate void CallBack()     delegate void CallBack()

2. Tracking Handle (^) Replaces Pointer (*)

In V1, an object of a reference type is declared using pointer syntax. Under the revised language design, a reference class type object is declared using a new declarative token (^) referred to formally as a tracking handle and more informally as a hat. (The tracking adjective underscores the idea that a reference type sits within the CLI heap, and can therefore transparently move locations during garbage collection heap compaction. A tracking handle is transparently updated during runtime. Two analogous concepts are (a) the tracking reference (%), and (b) the interior pointer (interior_ptr<>)—see the main document for a discussion of these changes.) For example,

// V1 declaration of a CLI reference type
String * ps = S"a string literal";

// V2 declaration of a CLI reference type
String ^ ps = "a string literal";

(We have also cleaned up the handling of string literals that promote to the System::String Unicode representation. In this example, the programmer no longer needs to manually identify a string literal as being a System::String literal.)

This also allows us to provide a uniform syntax across both reference and value types that are located on the CLI heap. Unlike C# and Microsoft Visual Basic .NET, the C++ binding to the CLI allows the programmer to directly manipulate a boxed instance of a value type; this can be considerably more efficient. Here is how it is done in both V1 and V2,

double result = 3.14159;
// V1 Syntax
__box double * br = __box( result );

// V2 Syntax
double^ br = result;

The use of the pointer syntax was problematic in two primary areas (a more aggressive defense of the change can be found in the companion translation guide):

  1. The use of the pointer syntax did not allow overloaded operators to be directly applied to a reference object; rather, one had to call the operator through its internal name, such as r1->op_Addition(r2) rather than the more intuitive r1+r2.
  2. There are a number of pointer operations, such as casting and pointer arithmetic, that are disallowed for objects stored on a garbage collected heap. This led to confusion among users, and we believe a separate token better captures the notion of a CLI reference type.

There are two auxiliary changes that accompany the change from R* to R^—the replacement of operator new with a CLI specific heap operator, and the introduction of a special token to represent a null tracking handle.

  1. A new CLI heap allocation operator, gcnew. For example,
    // V1 Syntax
    StreamReader *ifile = new StreamReader( fileName );
    NativeClass * pnc = new NativeClass( args );

    // V2 Syntax
    StreamReader ^ifile = gcnew StreamReader( fileName );
    NativeClass * pnc = new NativeClass( args );

  2. In V1, we initialize a reference type to address no object as follows,
    // V1: OK ... we set obj to refer to no object
    Object * obj = 0;

    // V1: Error ... no implicit boxing ...
    Object * obj2 = 1;

In V2, any initialization or assignment of a value type to an Object results in an implicit boxing of that value type. In V2, therefore, both obj and obj2 are initialized to address boxed Int32 objects holding, respectively, the values 0 and 1. For example,

// V2: OK ... causes the implicit boxing of both 0 and 1
// but that is not what we intended for obj!
Object ^ obj = 0;
Object ^ obj2 = 1;

Therefore, in order to allow a tracking handle to be explicitly initialized to, assigned, or compared against the state of referring to no object, we introduced a new keyword, nullptr. It should replace each instance of 0 that was used for that purpose, and so the correct revision of the V1 example looks as follows:

// V2: OK ... we set obj to refer to no object
Object ^ obj = nullptr;

// V2: OK ... we initialize obj2 to an Int32^
Object ^ obj2 = 1;
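The ambiguity that nullptr resolves has a close cousin in standard C++, where a literal 0 is first of all an integer. The following is a minimal sketch in ISO C++11, not C++/CLI. It does not involve boxing at all; it assumes only that overload resolution is a fair stand-in for showing why 0 is a poor spelling of "no object." The function names `which`, `call_with_zero`, and `call_with_nullptr` are hypothetical.

```cpp
// ISO C++11 sketch (not C++/CLI): a literal 0 is an int first, so it
// cannot reliably say "refers to nothing"; nullptr can.
#include <cassert>
#include <string>

// Two overloads: one for an arithmetic value, one for a pointer.
std::string which(int)         { return "int"; }
std::string which(const char*) { return "pointer"; }

// 0 is an exact match for int, so the arithmetic overload is selected.
std::string call_with_zero()    { return which(0); }

// nullptr converts only to pointer types, so the pointer overload is selected.
std::string call_with_nullptr() { return which(nullptr); }
```

In this sketch `call_with_zero()` selects the arithmetic overload and `call_with_nullptr()` selects the pointer overload. That is the same separation of "the value zero" from "no object" that the V2 rule above enforces for tracking handles.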

3. CLI Array Syntax Is Simplified

The declaration of a CLI array object in V1 was a slightly non-intuitive extension of the standard array declaration in which a __gc keyword is placed between the name of the array object and its possibly comma-filled dimension. For example,

// V1 Syntax
void PrintValues( Object* myArr __gc[]);
void PrintValues( int myArr __gc[,,]);

This has been simplified in V2, in which we use a template-like declaration that suggests the STL vector declaration. The first parameter indicates the element type. The second parameter specifies the array dimension (with a default value of 1, so only multiple dimensions require a second argument). The array object itself is a tracking handle and therefore requires a hat. If the element type is also a reference type, that, too, must be given a hat. For example,

// V2 Syntax
void PrintValues( array<Object^>^ myArr );
void PrintValues( array<int,3>^ myArr );

4. Properties Are Unified

In V1, each set or get property accessor is specified as an independent member function. The declaration of each method is prefixed with the __property keyword. The method name begins with either set_ or get_ followed by the actual name of the property. For example,

// V1 Syntax
public __gc __sealed class Vector {
   float _x;
public:
   __property double get_x(){ return _x; }
   __property void set_x( double newx ){ _x = newx; }
};

This was found to be confusing, because it spreads out the functionality associated with a property and requires the user to lexically unify the associated sets and gets. Moreover, it is lexically verbose, and feels inelegant. In the revised language design, the property keyword is followed by the type of the property and its unadorned name. The set and get access methods are placed within a block following the property name. (Note that unlike C#, the signature of the access method is specified.) For example,

// V2 Syntax
public ref class Vector sealed{ 
   float _x;
public:
   property double x 
   {
      double get(){ return _x; }
      void set( double newx ){ _x = newx; }
   } // Note: no semi-colon ...
};

Index Properties

The primary V1 shortcoming of indexed properties is their inability to provide class-level subscripting. A second, less significant, shortcoming is that it is visually difficult to distinguish a property from an indexed property: the number of parameters is the only indication. The indexed properties in V1 also suffer from the same problems as those of scalar properties: the accessors are not treated as an atomic unit, but separated into individual methods. For example,

// V1 Syntax
public __gc class Vector; 
public __gc class Matrix
{
    float mat[,]; // V1 array syntax ...

public: 
   __property void set_Item( int r, int c, float value);
   __property float get_Item( int r, int c );

   __property void set_Row( int r, Vector* value );
   __property Vector* get_Row( int r );
};

In V2, the index properties are distinguished by the bracket ([,]) following the name of the indexer and indicating the number and type of each index. Here is the Matrix declaration recast into the new syntax. (Note that a forward CLI class declaration is no longer permitted to indicate its public or private access level—as illustrated in the forward declaration of the Vector class.)

// V2 Syntax
// now illegal to specify public here ...
ref class Vector; 

public ref class Matrix {
private:
   array<float, 2>^ mat; // V2 array syntax ...
public:
   property float Item[int,int]
   {
      float get( int r, int c );
      void set( int r, int c, float value );
   }

   property Vector^ Row[int]
   {
      Vector^ get( int r );
      void set( int r, Vector^ value );
   }
};

To indicate a class level indexer, the default keyword is reused to substitute for an explicit name. For example,

public ref class Matrix {
private:
   array<float, 2>^ mat;

public:
      // ok: class level indexer now
      //     Matrix mat ...
      //     mat[ 0, 0 ] = 1; 
      // invokes the set accessor of the default indexer ...

   property float default[int,int]
   {
      float get( int r, int c );
      void set( int r, int c, float value );
   }
};

5. Operators Are Integrated with ISO-C++

Perhaps the most striking aspect of V1 is its support for operator overloading—or rather, its effective absence. Within the declaration of a reference type, for example, rather than using the native operator+ syntax, one had to explicitly write out the underlying internal name of the operator—for example, op_Addition. More onerous, however, is the fact that the invocation of an operator had to be explicitly invoked through that name, thus precluding the two primary benefits of operator overloading: (a) the intuitive syntax, and (b) the ability to intermix new types with existing types. For example,

// V1 Syntax
public __gc __sealed class Vector {
public:
  Vector( double x, double y, double z );
  static bool    op_Equality( const Vector*, const Vector* );
  static Vector* op_Division( const Vector*, double );
  static Vector* op_Addition( const Vector*, const Vector* );
  static Vector* op_Subtraction( const Vector*, const Vector* );
};

int main()
{
  Vector *pa = new Vector( 0.231, 2.4745, 0.023 );
  Vector *pb = new Vector( 1.475, 4.8916, -1.23 ); 

  Vector *pc1 = Vector::op_Addition( pa, pb );
  Vector *pc2 = Vector::op_Subtraction( pa, pc1 );
  Vector *pc3 = Vector::op_Division( pc1, pc2->x() );

  if ( Vector::op_Equality( pc1, pc2 )) // ...
}

In V2, the usual expectations of a native C++ programmer are restored, both in the declaration and in the use of the static operators (the C++ instance operators are also supported in V2, but that is not a translation issue). For example,

// V2 Syntax
public ref class Vector sealed {
public:
   Vector( double x, double y, double z );
   static bool    operator ==( const Vector^, const Vector^ );
   static Vector^ operator /( const Vector^, double );
   static Vector^ operator +( const Vector^, const Vector^ );
   static Vector^ operator -( const Vector^, const Vector^ );
};

int main()
{
   Vector^ pa = gcnew Vector( 0.231, 2.4745, 0.023 );
   Vector^ pb = gcnew Vector( 1.475, 4.8916, -1.23 );

   Vector^ pc1 = pa + pb;
   Vector^ pc2 = pa - pc1;
   Vector^ pc3 = pc1 / pc2->x();

   if ( pc1 == pc2 ) // ...
}

Changes in Semantic Meaning

It can be a difficult and somewhat frustrating experience to change from a programming paradigm in which one has become an expert to a new paradigm in which one finds oneself making clumsy, new-kid-on-the-block sorts of errors. V2 has extended the CLI binding with a number of C++-specific extensions to make the transition to a dynamic programming paradigm more familiar. This includes support for memberwise copy semantics and the automatic invocation of a reference type destructor at the end of its lifetime.

6. Destructor Goes to IDisposable::Dispose

Before the memory associated with an object is reclaimed by the garbage collector, an associated Finalize() method, if present, is invoked. You can think of this method as a kind of super-destructor since it is not tied to the program lifetime of the object. We refer to this as finalization. The timing of just when or even whether a Finalize() method is invoked is undefined. This is what is meant when we say that garbage collection exhibits non-deterministic finalization.

Non-deterministic finalization works well with dynamic memory management. When available memory gets sufficiently scarce, the garbage collector kicks in and things pretty much just work. Under a garbage collected environment, destructors to free memory are unnecessary.

Non-deterministic finalization does not work well, however, when an object maintains a critical resource such as a database connection or a lock of some sort. In this case, we need to release that resource as soon as possible. In the native world, that is done through the pairing of a constructor and a destructor. As soon as the lifetime of the object ends, either through the completion of the local block within which it is declared or through the unraveling of the stack because of a thrown exception, the destructor kicks in and the resource is automatically released. It works very well, and its absence under the original language design was sorely missed.
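The native constructor/destructor pairing described here can be sketched in a few lines of ISO C++11. This is native code only, not C++/CLI, and the class `ScopedResource`, the global `log_lines`, and the two driver functions are hypothetical names introduced for illustration. The log stands in for a real resource such as a lock or a connection.

```cpp
// ISO C++11 sketch (native code, not C++/CLI): the destructor releases the
// resource as soon as the object's lifetime ends, on both exit paths.
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log used to observe the order of operations.
std::vector<std::string> log_lines;

class ScopedResource {
    std::string name_;
public:
    // Constructor acquires the (simulated) resource.
    explicit ScopedResource(const std::string& name) : name_(name) {
        log_lines.push_back("acquire " + name_);
    }
    // Destructor releases it; no explicit call by the user is needed.
    ~ScopedResource() { log_lines.push_back("release " + name_); }

    // A resource owner is typically not copyable.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;
};

// Path 1: the local block completes normally.
void normal_exit() {
    ScopedResource r("lock");
    log_lines.push_back("work");
}   // r's destructor runs here

// Path 2: the stack unwinds because of a thrown exception.
void exit_by_exception() {
    try {
        ScopedResource r("connection");
        throw std::runtime_error("failure");
    } catch (const std::runtime_error&) {
        // r has already been destroyed by the time the handler runs.
        log_lines.push_back("caught");
    }
}
```

In both functions the release entry is logged without any explicit call, and in the exception case it is logged before the handler body executes. This is the behavior that an explicit, user-invoked cleanup method cannot guarantee.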

The solution provided by the CLI is for a class to implement the Dispose() method of the IDisposable interface. The problem here is that Dispose() requires an explicit invocation by the user. This is error-prone and therefore a step backwards. The C# language provides a modest form of automation through a special using statement. V1 provided no special support.

In V1, the destructor of a reference class is implemented through the following two steps:

  1. The user-supplied destructor is renamed internally to Finalize(). If the class has a base class (remember, under the CLI Object Model, only single inheritance is supported), the compiler injects a call of its finalizer following execution of the user-supplied code. For example, given the following trivial hierarchy taken from the V1 language specification,
    __gc class A {
    public:
       ~A() { Console::WriteLine(S"in ~A"); }
    };

    __gc class B : public A {
    public:
       ~B() { Console::WriteLine(S"in ~B"); }
    };

    both destructors are renamed Finalize(). B's Finalize() method has an invocation of A's Finalize() method added following the invocation of WriteLine(). This is what the garbage collector will invoke by default during finalization. Here is what this internal transformation might look like,

    // internal transformation of destructor under V1
    __gc class A {
       // ...
       void Finalize() { Console::WriteLine(S"in ~A"); }
    };

    __gc class B : public A {
       // ...
       void Finalize() {
          Console::WriteLine(S"in ~B");
          A::Finalize();
       }
    };

  2. In the second step, the compiler synthesizes a virtual destructor. This destructor is what our V1 user-programs invoke, either directly or through an application of the delete expression. It is never invoked by the garbage collector.

    What is placed within this synthesized destructor? Two statements. One is a call to GC::SuppressFinalize() to make sure there are no further invocations of Finalize(). The second is the actual invocation of Finalize(). This, recall, represents the user-supplied destructor for that class. Here is what this might look like,

    __gc class A {
    public:
       virtual ~A() {
          System::GC::SuppressFinalize(this);
          A::Finalize();
       }
    };

    __gc class B : public A {
    public:
       virtual ~B() {
          System::GC::SuppressFinalize(this);
          B::Finalize();
       }
    };

While this implementation allows the user to explicitly invoke the class Finalize() method now rather than whenever, it does not really tie in with the Dispose() method solution. This is changed in the revised language design.

In V2, the destructor is renamed internally to the Dispose() method and the reference class is automatically extended to implement the IDisposable interface.

When either a destructor is invoked explicitly under V2, or when delete is applied to a tracking handle, the underlying Dispose() method is invoked automatically. If it is a derived class, a call of the Dispose() method of the base class is inserted at the close of the synthesized method.

But this doesn't get us all the way to deterministic finalization. In order to reach that, we need the additional support of local reference objects. (This has no analogous support within V1, and so it is not a translation issue—nor is it available in the Beta1 release. A description can be found in the full translation guide.)

In V2, as we've seen, the destructor is synthesized into the Dispose() method. This means that in cases where the destructor is not explicitly invoked, the garbage collector, during finalization, will not as before find an associated Finalize() method for the object. In order to support both destruction and finalization, V2 has introduced a special syntax for providing a finalizer. For example,

   // V2 Syntax
   public ref class R {
   protected:
      !R() { Console::WriteLine( "I am the R::finalizer()!" ); }
   };

The ! prefix is meant to suggest the analogous tilde (~) that introduces a class destructor—that is, both post-lifetime methods have a token prefixing the name of the class. If the synthesized Finalize() method occurs within a derived class, an invocation of the base class Finalize() method is inserted at its end. If the destructor is explicitly invoked, the finalizer is suppressed. (Note that a finalizer must be declared as a protected and not a public member.)

This means that the runtime behavior of a V1 program is silently changed when compiled under V2 whenever a reference class contains a non-trivial destructor. The required translation algorithm is to do the following:

  • If a destructor is present, rewrite that to be the class finalizer.

  • If a Dispose() method is present, rewrite that into the class destructor.

  • If the original code contained an explicit invocation of the class destructor or an application of the delete operator to an instance of the type, you will need to also provide a public method through which to invoke the finalizer in order to duplicate the V1 behavior. For example,

    public ref class R {
    public:
       void callFinalizer()
       {
          System::GC::SuppressFinalize( this );
          this->!R();
       }
    protected:
       !R() { Console::WriteLine( "I am the R::finalizer()!" ); }
    };
    

    For example, the following V1 code,

    void f( R* r )
    {
       r->Dispose(); // 1
       delete r;     // 2
    }

    would be transformed into the following V2 code,

    void f( R^ r )
    {
       delete r;            // now equivalent to 1
       r->callFinalizer();  // 2
    }

7. CLI Enums

The V1 CLI enum declaration is preceded by the __value keyword. The idea here is to distinguish the native enum from the CLI enum that is derived from System::ValueType, while suggesting an analogous functionality. For example,

// V1 Syntax
__value enum e1 { fail, pass };
public __value enum e2 : unsigned short  
     { not_ok = 1024, maybe, ok = 2048 };  

V2 solves the problem of distinguishing native and CLI enums by emphasizing the class nature of the latter rather than its value type roots. As such, the __value keyword is discarded, replaced with the spaced keyword pair of enum class. This provides a paired keyword symmetry to the declarations of the reference, value, and interface classes. The translation of the enumeration pair e1 and e2 in V2 looks as follows,

// V2 Syntax
enum class e1 { fail, pass };
public enum class e2 : unsigned short 
     { not_ok = 1024, maybe, ok = 2048 };

Apart from this small syntactic change, the behavior of the CLI enum type has been changed in a number of ways:

  • A forward declaration of a CLI enum is no longer supported in V2. There is no mapping. It is simply flagged as a compile-time error.

  • The overload resolution between the built-in arithmetic types and the Object class hierarchy has reversed between V1 and V2. As a side-effect, CLI enums are no longer implicitly converted to arithmetic types in V2 as they were in V1. For example, consider the following code fragment:

    // V1 Syntax
    __value enum status { fail, pass };

    void f( Object* ){ cout << "f(Object)\n"; }
    void f( int ){ cout << "f(int)\n"; }

    int main()
    {
       status rslt;
       f( rslt ); // which f is invoked?
    }

    For the native C++ programmer, the natural answer to the question of which instance of the overloaded f() is invoked is f(int). An enum is a symbolic integral constant, and it participates in the standard integral promotions that take precedence in this case. In V1, this is the instance to which the call resolves.

    However, this resolution caused a number of surprises, not when we used enums in a native C++ frame of mind, but when we needed them to interact with the existing Base Class Library framework, where an Enum is a class indirectly derived from Object. In V2, the instance of f() invoked is that of f(Object^).

    As a side-effect, the revised language does not support implicit conversions between a CLI enum type and the arithmetic types. Any code that used a CLI enum where an arithmetic type is expected now requires an explicit cast.
  • In V2, a managed enum maintains its own scope, in conformance with the CLI object model, but counter-intuitive to the native C++ programmer. In V1, an attempt was made to define weakly injected names for the enumerators of a CLI enum in order to simulate the absence of scope within the native enum. This did not prove successful. The problem is that this causes the enumerators to spill into the global namespace, resulting in difficult to manage name-collisions. In V2, therefore, we have conformed to the other CLI languages in supporting scopes within the managed enum.

    Under V1, that is, the enumerators of a CLI enum are visible within the containing scope of the enum. In V2, the enumerators are encapsulated within the scope of the enum. This means that any unqualified use of an enumerator of a CLI enum is not recognized under V2. For example,

    // V1 supporting weak injection
    __gc class XDCMake {
    public:
       __value enum _xdc {
          UNDEFINED, OPTION_USAGE, XDC4_XML_LDFAIL = 4
       };

       XDCMake() {
          // unqualified use of _xdc enumerators ...
          opList->Add( __box(UNDEFINED)); // (1)
          opList->Add( __box(OPTION_USAGE)); // (2)
          itagList->Add( __box(XDC4_XML_LDFAIL)); // (3)
       }
    };

    Each of the three unqualified uses of the enumerator names ((1), (2), and (3)) needs to be qualified in the translation to V2. For example,

    // V2 Syntax - CLI enum exhibits scope
    ref class XDCMake {
    public:
       enum class _xdc {
          UNDEFINED, OPTION_USAGE, XDC4_XML_LDFAIL = 4
       };

       XDCMake() {
          // explicit qualification required ...
          opList->Add( _xdc::UNDEFINED); // (1)
          opList->Add( _xdc::OPTION_USAGE); // (2)
          itagList->Add( _xdc::XDC4_XML_LDFAIL); // (3)
       }
    };
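Two of the behavioral changes listed for the CLI enum, the loss of implicit conversion to arithmetic types and the enumerators gaining their own scope, can be tried out with the scoped enum of ISO C++11, which happens to use the same enum class spelling. The following is a minimal sketch under that assumption. It is native ISO C++ rather than C++/CLI, so it cannot show the f(Object^) overload being chosen, and the names `legacy_status`, `classify_native`, `classify_scoped`, and `as_int` are hypothetical.

```cpp
// ISO C++11 sketch (not C++/CLI): a scoped enum keeps its enumerators in its
// own scope and does not convert implicitly to an arithmetic type.
#include <cassert>
#include <string>
#include <type_traits>

enum class status { fail, pass };        // scoped: status::fail, status::pass
enum legacy_status { l_fail, l_pass };   // unscoped native enum for contrast

// The native enum converts implicitly to int; the scoped enum does not.
static_assert(std::is_convertible<legacy_status, int>::value,
              "unscoped enum promotes to int");
static_assert(!std::is_convertible<status, int>::value,
              "scoped enum requires an explicit cast");

std::string f(int)    { return "f(int)"; }
std::string f(status) { return "f(status)"; }

// The native enum has no exact overload, so it promotes to int.
std::string classify_native() { return f(l_pass); }

// The enumerator must be qualified, and it matches only f(status).
std::string classify_scoped() { return f(status::pass); }

// Using the scoped enum as a number takes an explicit cast, as in the text.
int as_int(status s) { return static_cast<int>(s); }
```

Writing `f(pass)` without the `status::` qualifier, or `int n = status::pass;` without the cast, fails to compile in this sketch. Those are the same two kinds of edits that the translation from V1 to V2 asks for.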

8. Pinning Pointers

The garbage collector may optionally move objects that reside on the CLI heap to different locations within the heap during a compaction phase. This movement is not a problem for tracking handles, tracking references, and interior pointers, all of which the runtime updates transparently. This movement is a problem, however, if the user has passed the address outside of the runtime environment. In this case, the volatile movement of the object is likely to cause a runtime failure. To exempt objects such as these from being moved, we must locally pin them to their location for the extent of their outside use.

In V1, a pinning pointer is declared by qualifying a pointer declaration with the __pin keyword. In V2, a pinning pointer is declared with a pseudo-template syntax. The original constraints on a pinning pointer remain. For example, it cannot be used as a parameter or return type of a method; rather, it can only be declared on a local object. A number of additional constraints are added in V2:

  1. The default value of a pinning pointer is nullptr, not 0. A pin_ptr<> cannot be initialized or assigned 0. All assignments and explicit comparisons to 0 need to be changed to nullptr. (This is true of all reference types.)
  2. In V1, a pinning pointer is permitted to address a whole object. In V2, pinning a whole object is not supported. Rather, the address of an interior member needs to be pinned. For example,
    // V1 Syntax
    __gc struct H { int j; };
    __gc class G { ... };
    void f( G * g )
    {
       // V1: pinning a whole object
       H __pin * pH = new H;
       g->incr( &pH->j );
    }

The member that we really need to pin in this case is H::j. The revision of the program to compile under V2 is to retarget the source of the pinning to that member. For example,

// V2 Syntax
ref struct H { int j; };
ref class G{ ... };
void f( G^ g )
{
   H ^ph = gcnew H;
   // V2: pin interior member ...
   pin_ptr<int> pj = &ph->j;
   g->incr(  pj );
}

9. Static Const Members Go to Literal

Although static const integral members are still supported, their linkage attribute has changed between V1 and V2. Their V1 linkage attribute is now carried in a literal integral member under V2. For example, consider the following class declared under V1,

// V1 Syntax
public __gc class Constants {
public:
static const int LOG_DEBUG = 4;
// ...
};

This generates the following underlying CIL attributes for the field (note the literal attribute),

.field public static literal int32 
modopt([Microsoft.VisualC]Microsoft.VisualC.IsConstModifier) LOG_DEBUG = int32(0x00000004)

While this still compiles under V2,

// V2 Syntax
public ref class Constants {
public:
static const int LOG_DEBUG = 4;
// ...
};

it no longer emits the literal attribute, and therefore is not viewed as a constant by the CLI runtime,

.field public static int32 modopt([Microsoft.VisualC]Microsoft.VisualC.IsConstModifier) LOG_DEBUG = int32(0x00000004)

In order to have the same inter-language literal attribute, the declaration needs to be changed to the newly supported literal data member. For example,

// V2 Syntax
public ref class Constants {
public:
literal int LOG_DEBUG = 4;
// ...
};

This change should only be applied to static const members of integral type. All other types should remain as before.

Changes Without One-To-One Mapping

In designing C++, Bjarne Stroustrup, over the years, admittedly made a few mistakes. For example, in the initial version of the language, the invocation of a virtual function within a constructor resolved to the most derived instance. Over time, it became clear that, in general, this was the wrong behavior, and he reversed the behavior with Release 1.2 of cfront back in the 1980s. We also have made some mistakes—perhaps are still making some now. I don't believe we are the only language that has done so; still, we are the only language about which I am writing. Except for Item #13, which makes use of a new override facility, these items represent reversals that invalidate V1 usage.

10. Constructors Are Implicitly Explicit

Under V1, a single-argument constructor defines a conversion from its argument type into the class type. If an object of a class is expected, and the value supplied is of a type that matches a single-argument constructor of that class, the compiler silently invokes the constructor to create a temporary object of that class, which it then applies to the expression. This impacts initialization, assignment, and function and operator overload resolution. In V2, single-argument constructors behave as if they have been declared explicit. That is, the compiler never applies them silently to effect a necessary conversion. Moreover, in V2, there is a distinction made between creation and conversion casts. A creation cast, of the form,

// a creation cast ... constructor invoked ...
Buffer( 128 );

invokes the associated class constructor, the same as in V1. However, a conversion cast, of the following forms,

// conversion casts ... constructor is never invoked for these
( Buffer ) 128;
( Buffer )( 128 );
static_cast< Buffer >( 128 );

never results in the invocation of an associated constructor. The conversion cast only succeeds if the class defines an appropriate conversion operator.

The transformation of V1 code to exhibit V2 behavior requires not just the insertion of explicit casts, but is likely to require the definition of appropriate conversion operators as well.
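
The loss of silent conversions can be previewed in ISO C++ by marking a constructor explicit by hand. The sketch below is standard C++ rather than C++/CLI, and the Buffer class, capacityOf, and demoCreationCast are hypothetical names introduced only for illustration. One difference is worth keeping in mind: in ISO C++ a static_cast to Buffer would still invoke the explicit constructor, whereas the V2 conversion casts described above do not.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical Buffer type used only for this illustration.
class Buffer {
public:
    // 'explicit' states in ISO C++ what V2 assumes for every
    // single-argument constructor of a CLI class.
    explicit Buffer(int size) : size_(size) {}
    int size() const { return size_; }
private:
    int size_;
};

// Accepts only a Buffer; an int argument is not converted silently.
inline int capacityOf(const Buffer& b) { return b.size(); }

// int n = capacityOf(128);   // error: no implicit int -> Buffer conversion

static_assert(!std::is_convertible<int, Buffer>::value,
              "int must not convert to Buffer implicitly");
static_assert(std::is_constructible<Buffer, int>::value,
              "explicit construction from int is still available");

// Creation cast: the constructor is named, so it is invoked.
inline int demoCreationCast() {
    Buffer b = Buffer(128);
    assert(b.size() == 128);
    return capacityOf(b);
}
```

Every call site that relied on the silent conversion, such as the commented-out capacityOf(128), has to be rewritten to name the constructor or to use a conversion operator.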

11. No Value Class Default Constructor

In both V1 and V2, a value class does not support special class member functions (SMF) such as a copy constructor, a copy assignment operator, and a destructor. In V1, however, a value class did allow the definition of a default constructor—that is, a constructor taking no arguments. In V2, this permission has been rescinded; a value class can no longer provide a default constructor.

The problem, in this case, is that there are run-time occasions in which we could not guarantee the invocation of the associated default constructor. Given that absence of guarantee, it was felt to be better to drop the support altogether rather than have it be non-deterministic in its application.

In general, the absence of this special member function is not a problem if we use a value class in a constrained way; that is, if we only allow it to contain value members so that it supports bitwise copy. We do not need a copy constructor or a copy assignment operator when the aggregate type supports bitwise copy. Similarly, we do not need a destructor when the state of the aggregate type exhibits value semantics. Finally, since the runtime zeros out all state by default, we do not require a default constructor. (In C++, primitive data types are not automatically zeroed out, and so most of our default constructor use, though granted not all, serves to put the object in a zeroed-out initial state.)

The problem, of course, is when a V1 value class uses the default constructor to perform non-zeroing operations. In this case, the code within the constructor will likely need to be migrated into a named initialization function. This is potentially error-prone, of course, because this method must be explicitly invoked by the programmer.
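
The migration to a named initialization function can be sketched in ISO C++ as follows. This is an analogy under stated assumptions, not C++/CLI: Rect, Initialize, and makeRect are hypothetical names, and value-initialization with empty braces stands in for the zeroing that the runtime performs on a value class.

```cpp
#include <cassert>

// Hypothetical stand-in for a V2 value class: it has no default
// constructor, so a freshly created object starts out zeroed.
struct Rect {
    int width;
    int height;
    int scale;

    // Named initialization function carrying the non-zeroing work that
    // a V1 default constructor might have done. Callers must invoke it.
    void Initialize() { scale = 1; }
};

inline Rect makeRect() {
    Rect r{};               // empty braces zero every member
    assert(r.scale == 0);   // state before explicit initialization
    r.Initialize();         // explicit call replaces the implicit constructor
    return r;
}
```

Routing creation through a single helper such as makeRect is one way to reduce the risk of a forgotten Initialize() call.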

12. Override of Private Virtual Function

In V1, the access level of a virtual function does not constrain its ability to be overridden within a derived class. (The same is true in ISO-C++.) Under V2, a virtual function cannot override a base class virtual function that it cannot access. For example,

__gc class My {
private:
   virtual void g(); // inaccessible to a derived class ...
};
 
__gc class File : public My {
public:
   // in V1, ok: g() overrides My::g()
   // in V2, error: cannot override: My::g() inaccessible ...
   void g();
};

The most obvious solution under V2 is to make the private base class member non-private. The inherited methods do not have to bear the same access; they simply have to be accessible. In this example, the least invasive change is to make the My member protected. This way the general program's access to the method through My is still prohibited,

ref class My {
protected:
   virtual void g();
};
 
ref class File : My {
public:
   void g();
};

Note that the absence of the explicit virtual keyword in the derived class, under the revised language, generates a warning message. The compiler wants you to become a more responsible programmer by squawking until you make the function's virtual nature explicit.
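
The protected-access arrangement can also be tried in ISO C++, where the override specifier plays the role of making the virtual nature explicit. The sketch below is standard C++ rather than C++/CLI. To make the dispatch observable, g() returns an int instead of void, and the run() and checkDispatch() functions are hypothetical additions.

```cpp
#include <cassert>

// Hypothetical ISO C++ rendering of the My/File pair above.
class My {
public:
    virtual ~My() = default;
    // Public entry point that dispatches to the virtual hook.
    int run() { return g(); }
protected:
    // protected: reachable by derived classes, hidden from general callers
    virtual int g() { return 1; }
};

class File : public My {
public:
    // 'override' makes the virtual nature explicit; the compiler
    // rejects it if no accessible base virtual matches.
    int g() override { return 2; }
};

// Calls through the base class reach the derived override.
inline void checkDispatch() {
    File f;
    My& base = f;
    assert(base.run() == 2);
}
```

General callers still cannot invoke g() through a My object, because it remains protected there; only the File override is public.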

13. Explicit Interface Function Override

It is often desirable to provide two instances of an interface member within a class that implements the interface—one that is used when class objects are manipulated through an interface handle, and one that is used when class objects are used through the class interface. For example,

// V1 Syntax
public __gc class R : public ICloneable 
{
   // to be used through ICloneable ...
   Object* ICloneable::Clone();

   // to be used through an R ...
   R* Clone();
};

In V1, we do this by providing an explicit declaration of the interface method with the method's name qualified with the name of the interface. The class-specific instance is unqualified. This eliminates the need to downcast the return value of Clone(), in this example, when explicitly called through an instance of R.

In V2, a general overriding mechanism has been introduced that replaces the previous syntax. Our example needs to be rewritten as follows,

// V2 Syntax
public ref class R : public ICloneable 
{
   // to be used through ICloneable ...
   virtual Object^ InterfaceClone() = ICloneable::Clone;

   // to be used through an R ...
   virtual R^ Clone() new;
};

This revision requires that the interface member that is being explicitly overridden be given a unique name within the class. Here, I've provided the rather awkward name of InterfaceClone(). The behavior is still the same—an invocation through the ICloneable interface invokes the renamed InterfaceClone(), while a call through an object of type R invokes the second Clone() instance.

Conclusion

So there you have it—a baker's dozen of the most visible change points between the original and revised C++ binding to the CLI. We believe the radical redesign in the language, although invasive, is a necessary step in making C++ a first-class denizen of the .NET environment. I would like to believe that we have been successful, but we will not have succeeded unless you agree as well. It is my hope that this article, the companion full Translation Guide, and the V1-to-V2 source translation tool—mscfront, which should be downloadable in the Beta2 time-frame—will prove to be helpful resources in making your transition to Microsoft Visual C++ 2005 successful.

Acknowledgements

I would like to thank the members of the Visual C++ Team for their help and guidance in detailing these issues. Thanks go to Arjun Bijanki, Artur Laksberg, Brandon Bray, Jonathan Caves, Siva Challa, Tanveer Gani, Mark Hall, Mahesh Hariharan, Jeff Peil, Andy Rich, Alvin Chardon, and Herb Sutter. All have been of incredible help and responsiveness. This document is a tribute to all their expertise.

 

About the author

Stanley Lippman, Architect, Visual C++, Microsoft Corporation. He began working on C++ with its inventor Bjarne Stroustrup back in 1984 within Bell Laboratories. In between, he worked in Feature Animation at Disney and DreamWorks, and was a Software Technical Director on Fantasia 2000.

© Microsoft Corporation. All rights reserved.