Named Return Value Optimization in Visual C++ 2005

 

Ayman B. Shoukry
Visual C++ Compiler Team

October 2005

Summary: Shows how the Visual C++ compiler eliminates redundant copy constructor and destructor calls in various situations.

Contents

Optimization Description
Code Samples
Optimization Limitations
Optimization Side Effects

The Microsoft Visual C++ optimizing compiler keeps gaining new techniques and optimizations that give the programmer higher performance where possible. This article shows how the compiler eliminates redundant copy constructor and destructor calls in various situations.

Typically, when a function returns an object by value, a temporary object is created and copied to the target object via the copy constructor. The C++ standard allows the copy constructor call to be elided (even if this results in different program behavior), which lets the compiler treat both objects as one (see section 12.8, Copying class objects, paragraph 15, of the standard listed under Reference). The Visual C++ 8.0 compiler uses the flexibility that the standard provides and adds a new feature: Named Return Value Optimization (NRVO). NRVO eliminates the copy constructor and destructor calls for a stack-based return value, which improves overall performance. Note that this can lead to different behavior between optimized and non-optimized programs (see the Optimization Side Effects section).

There are some cases in which the optimization will not take place (see the Optimization Limitations section for samples). The more common ones are:

  • Different paths returning different named objects.
  • Multiple return paths (even if the same named object is returned on all paths) with exception handling (EH) states introduced.
  • The named object returned is referenced in an inline asm block.

Optimization Description

Here is a simple example in Figure 1 to illustrate the optimization and how it is implemented:

A MyMethod (B &var)
{
   A retVal;
   retVal.member = var.value + bar(var);
   return retVal;
}

Figure 1. Original code

The program that uses the above function may have a construct such as:

valA = MyMethod(valB);

The value that is returned from MyMethod is created in the memory space pointed to by valA through the use of a hidden argument. Here is what the function looks like when we expose the hidden argument and explicitly show the constructors and destructors:

A MyMethod (A &_hiddenArg, B &var)
{
   A retVal;
   retVal.A::A(); // constructor for retVal
   retVal.member = var.value + bar(var);
   _hiddenArg.A::A(retVal);  // the copy constructor for A
   retVal.A::~A();  // destructor for retVal, runs before the function returns
   return;
}

Figure 2. Hidden argument code without NRVO (pseudo code)

From the above code, it is noticeable that there are some optimization opportunities available. The basic idea is to eliminate the temporary stack-based value (retVal) and use the hidden argument. Consequently, this will eliminate the copy constructor and destructor of the stack-based value. Here is the NRVO-based optimized code:

A MyMethod(A &_hiddenArg, B &var)
{
   _hiddenArg.A::A(); // constructor runs directly on the hidden argument
   _hiddenArg.member = var.value + bar(var);
   return;
}

Figure 3. Hidden argument code with NRVO (pseudo code)
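
The pseudo code in Figures 2 and 3 cannot be compiled, because C++ does not allow a constructor to be called by name on an existing object. The difference between the two figures can still be imitated in legal C++ with placement new. The following is only a sketch of the idea, not what the compiler actually emits; it assumes C++11 or later, and Widget, makeWithCopy, makeInPlace, and copiesNeeded are hypothetical names introduced for this illustration.

```cpp
#include <new>   // placement new

// Hypothetical stand-in for class A in Figures 1-3; it counts its own
// constructor, copy constructor, and destructor calls.
struct Widget {
    static int ctors, copies, dtors;
    int member;
    Widget() : member(0) { ++ctors; }
    Widget(const Widget& other) : member(other.member) { ++copies; }
    ~Widget() { ++dtors; }
};
int Widget::ctors = 0;
int Widget::copies = 0;
int Widget::dtors = 0;

// Imitates Figure 2: build a local, copy it into the caller's storage,
// then let the local be destroyed.
Widget* makeWithCopy(void* hiddenArg, int value) {
    Widget retVal;                          // constructor for retVal
    retVal.member = value;
    return new (hiddenArg) Widget(retVal);  // copy constructor into caller storage
}                                           // destructor for retVal runs here

// Imitates Figure 3: construct directly in the caller's storage.
Widget* makeInPlace(void* hiddenArg, int value) {
    Widget* result = new (hiddenArg) Widget();  // the only constructor call
    result->member = value;
    return result;
}

// Runs one of the two functions against raw storage and reports how many
// copy constructor calls it needed.
int copiesNeeded(Widget* (*make)(void*, int)) {
    alignas(Widget) unsigned char storage[sizeof(Widget)];
    const int before = Widget::copies;
    Widget* made = make(storage, 42);
    made->~Widget();                        // the caller owns the result
    return Widget::copies - before;
}
```

Calling copiesNeeded(makeWithCopy) reports one copy and copiesNeeded(makeInPlace) reports none, which is the saving that NRVO aims for.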

Code Samples

Sample 1: Simple example

#include <stdio.h>
class RVO
{
public:
    RVO(){printf("I am in constructor\n");}
    RVO (const RVO& c_RVO) {printf ("I am in copy constructor\n");}
    ~RVO(){printf ("I am in destructor\n");}
    int mem_var;
};
RVO MyMethod (int i)
{
    RVO rvo;
    rvo.mem_var = i;
    return (rvo);
}
int main()
{
    RVO rvo;
    rvo=MyMethod(5);
}

Figure 4. Sample1.cpp

Compiling sample1.cpp with and without NRVO turned on will yield different behavior.

Without NRVO (cl /Od sample1.cpp), the expected output would be:

I am in constructor
I am in constructor
I am in copy constructor
I am in destructor
I am in destructor
I am in destructor

With NRVO (cl /O2 sample1.cpp), the expected output would be:

I am in constructor
I am in constructor
I am in destructor
I am in destructor
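
Because the standard permits this elision but does not require it, which of the two outputs appears depends on the compiler and its flags. A program can observe the difference without parsing printed text by counting the calls instead. The sketch below does that; CountedRVO, CountedMyMethod, Report, and runSample1 are hypothetical names, and unlike the class in Figure 4 the copy constructor here also copies mem_var.

```cpp
// Variant of the RVO class from Figure 4 that counts calls instead of printing.
struct CountedRVO {
    static int ctors, copies, dtors;
    int mem_var;
    CountedRVO() : mem_var(0) { ++ctors; }
    CountedRVO(const CountedRVO& other) : mem_var(other.mem_var) { ++copies; }
    ~CountedRVO() { ++dtors; }
};
int CountedRVO::ctors = 0;
int CountedRVO::copies = 0;
int CountedRVO::dtors = 0;

// Same shape as MyMethod in Figure 4: one named local, one return.
CountedRVO CountedMyMethod(int i) {
    CountedRVO rvo;
    rvo.mem_var = i;
    return rvo;
}

struct Report { int ctors, copies, dtors, value; };

// Repeats main() from Figure 4 and returns the counts once every object
// created by the sample has been destroyed.
Report runSample1(int value) {
    CountedRVO::ctors = CountedRVO::copies = CountedRVO::dtors = 0;
    int seen = 0;
    {
        CountedRVO rvo;
        rvo = CountedMyMethod(value);   // assignment from the returned temporary
        seen = rvo.mem_var;
    }
    Report report = { CountedRVO::ctors, CountedRVO::copies, CountedRVO::dtors, seen };
    return report;
}
```

A report with copies equal to 1 corresponds to the six-line output above, and copies equal to 0 corresponds to the four-line output. In both cases the number of destructor calls equals the number of constructor calls plus copy constructor calls.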

Sample 2: More complex sample

#include <stdio.h>
class A {
  public:
    A() {printf ("A: I am in constructor\n");i = 1;}
    ~A() { printf ("A: I am in destructor\n"); i = 0;}
    A(const A& a) {printf ("A: I am in copy constructor\n"); i = a.i;}
    int i, x, w;
};
class B {
  public:
    A a;
    B()  { printf ("B: I am in constructor\n");}
    ~B() { printf ("B: I am in destructor\n");}
    B(const B& b) { printf ("B: I am in copy constructor\n");}
};
A MyMethod()
{
    B* b = new B();
    A a = b->a;
    delete b;
    return (a);
}
int main()
{
    A a;
    a = MyMethod();
}

Figure 5. Sample2.cpp

The output without NRVO (cl /Od sample2.cpp) will look like this:

A: I am in constructor
A: I am in constructor
B: I am in constructor
A: I am in copy constructor
B: I am in destructor
A: I am in destructor
A: I am in copy constructor
A: I am in destructor
A: I am in destructor
A: I am in destructor

When NRVO kicks in (cl /O2 sample2.cpp), the output will be:

A: I am in constructor
A: I am in constructor
B: I am in constructor
A: I am in copy constructor
B: I am in destructor
A: I am in destructor
A: I am in destructor
A: I am in destructor

Optimization Limitations

There are some cases where the optimization won't kick in. Here are a few samples of such limitations.

Sample 3: Exception sample

In the face of exceptions, the hidden argument must be destructed within the scope of the temporary that it is replacing. To illustrate:

//RVO class is defined above in figure 4
#include <stdio.h>
RVO MyMethod (int i)
{
    RVO rvo;
    rvo.mem_var = i;
    throw "I am throwing an exception!";
    return (rvo);
}
int main()
{
    RVO rvo;
    try
    {
        rvo=MyMethod(5);
    }
    catch (const char* str) // a thrown string literal has type const char*
    {
        printf ("I caught the exception\n");
    }
}

Figure 6. Sample3.cpp

Without NRVO (cl /Od /EHsc sample3.cpp), the expected output would be:

I am in constructor
I am in constructor
I am in destructor
I caught the exception
I am in destructor

If the "throw" gets commented out, the output will be:

I am in constructor
I am in constructor
I am in copy constructor
I am in destructor
I am in destructor
I am in destructor

Now, if the "throw" is commented out and NRVO is triggered, the output will look like:

I am in constructor
I am in constructor
I am in destructor
I am in destructor

That is to say, sample3.cpp as it is in Figure 6 will behave the same with and without NRVO.
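
The reason is that the local object is destroyed during stack unwinding and the return statement is never reached, so there is no copy to eliminate. The sketch below checks this by counting live objects inside the handler. Tracked, mayThrow, and liveObjectsInHandler are hypothetical names; the sketch throws std::runtime_error rather than a string literal and initializes the target instead of assigning to it, so it is a simplified variant of Sample 3 rather than a copy of it.

```cpp
#include <stdexcept>

// Counts how many objects are alive and how many copies were made.
struct Tracked {
    static int live;     // objects currently alive
    static int copies;   // copy constructor calls so far
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; ++copies; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;
int Tracked::copies = 0;

// Throws before the return statement when fail is true.
Tracked mayThrow(bool fail) {
    Tracked local;
    if (fail)
        throw std::runtime_error("I am throwing an exception!");
    return local;
}

// Returns the number of Tracked objects still alive when the handler runs.
int liveObjectsInHandler() {
    int seen = -1;
    try {
        Tracked result = mayThrow(true);
        (void)result;                 // never reached on the throwing path
    } catch (const std::runtime_error&) {
        seen = Tracked::live;         // unwinding has already destroyed local
    }
    return seen;
}
```

On the throwing path no copy constructor call is made and no object outlives the unwinding, regardless of optimization settings.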

Sample 4: Different named object sample

To make use of the optimization all exit paths must return the same named object. To illustrate, consider sample4.cpp:

#include <stdio.h>
class RVO
{
public:
    RVO(){printf("I am in constructor\n");}
    RVO (const RVO& c_RVO) {printf ("I am in copy constructor\n");}
    int mem_var;
};
RVO MyMethod (int i)
{
    RVO rvo;
    rvo.mem_var = i;
    if (rvo.mem_var == 10)
        return (RVO());
    return (rvo);
}
int main()
{
    RVO rvo;
    rvo=MyMethod(5);
}

Figure 7. Sample4.cpp

The output with optimizations enabled (cl /O2 sample4.cpp) is the same as with optimizations disabled (cl /Od sample4.cpp). NRVO doesn't take place because not all return paths return the same named object.

I am in constructor
I am in constructor
I am in copy constructor

If you change the above sample to return rvo in all exit paths (as shown in Figure 8, Sample4_Modified.cpp), the optimization will eliminate the copy constructor:

#include <stdio.h>
class RVO
{
public:
    RVO(){printf("I am in constructor\n");}
    RVO (const RVO& c_RVO) {printf ("I am in copy constructor\n");}
    int mem_var;
};
RVO MyMethod (int i)
{
    RVO rvo;
    if (i==10)
        return (rvo);
    rvo.mem_var = i;
    return (rvo);
}
int main()
{
    RVO rvo;
    rvo=MyMethod(5);
}

Figure 8. Sample4_Modified.cpp modified to make use of NRVO

The output (cl /O2 Sample4_Modified.cpp) will look like:

I am in constructor
I am in constructor
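
Another way to see whether the named object was built directly in the caller's storage is to compare addresses. In the sketch below, each object records the address it had when it was default-constructed, and the caller checks whether that matches the address of the variable that receives the result. This is an illustration only: Probe, makeProbe, Outcome, and inspect are hypothetical names, and the caller initializes its variable rather than assigning to it as the samples do, so that no separate temporary is needed.

```cpp
#include <cstdint>

// Records where the object lived when its default constructor ran.
struct Probe {
    std::uintptr_t builtAt;
    int mem_var;
    Probe() : builtAt(reinterpret_cast<std::uintptr_t>(this)), mem_var(0) {}
    Probe(const Probe& other) : builtAt(other.builtAt), mem_var(other.mem_var) {}
};

// Same shape as MyMethod in Figure 8: every path returns the same named object.
Probe makeProbe(int i) {
    Probe rvo;
    if (i == 10)
        return rvo;
    rvo.mem_var = i;
    return rvo;
}

struct Outcome { bool builtInPlace; int value; };

// builtInPlace is true only when no copy separated the named local
// from the caller's variable.
Outcome inspect(int i) {
    Probe result = makeProbe(i);   // initialization rather than assignment
    Outcome outcome = {
        result.builtAt == reinterpret_cast<std::uintptr_t>(&result),
        result.mem_var
    };
    return outcome;
}
```

Whether builtInPlace comes back true depends on the compiler and its optimization settings; the value field is the same either way.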

Sample 5: EH Restriction Sample

Figure 9 below shows the same sample as Figure 8, except that the RVO class now has a destructor. Having multiple return paths together with such a destructor creates exception handling (EH) states in the function. Because tracking which objects need to be destructed becomes complex, the compiler avoids the return value optimization in this case. This is something that Visual C++ 2005 will need to improve in the future.

//RVO class is defined above in figure 4
#include <stdio.h>
RVO MyMethod (int i)
{
    RVO rvo;
    if (i==10)
        return (rvo);
    rvo.mem_var = i;
    return (rvo);
}
int main()
{
    RVO rvo;
    rvo=MyMethod(5);
}

Figure 9. Sample5.cpp

Compiling Sample5.cpp with and without optimization will yield the same result:

I am in constructor
I am in constructor
I am in copy constructor
I am in destructor
I am in destructor
I am in destructor

To make use of NRVO, try to eliminate the multiple return points in such cases by changing MyMethod to be something like:

RVO MyMethod (int i)
{
    RVO rvo;
    if (i!=10)
        rvo.mem_var = i;
    return(rvo);
}
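
The single-return form is intended to return the same value as the two-return form for every input, which can be checked directly. The following sketch does so under one assumption that differs from the samples: the default constructor initializes mem_var to 0, so the i == 10 path returns a defined value. Item, twoReturns, oneReturn, and sameResult are hypothetical names.

```cpp
// Class with a user-defined destructor, like RVO in Figure 4.
struct Item {
    int mem_var;
    Item() : mem_var(0) {}
    Item(const Item& other) : mem_var(other.mem_var) {}
    ~Item() {}
};

// Shape from Sample5.cpp: two return statements naming the same object.
Item twoReturns(int i) {
    Item item;
    if (i == 10)
        return item;
    item.mem_var = i;
    return item;
}

// Shape suggested above: a single return statement.
Item oneReturn(int i) {
    Item item;
    if (i != 10)
        item.mem_var = i;
    return item;
}

// True when both shapes produce the same value for the given input.
bool sameResult(int i) {
    return twoReturns(i).mem_var == oneReturn(i).mem_var;
}
```

The check only confirms that the rewrite preserves the returned value; whether the copy is actually eliminated still depends on the compiler.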

Sample 6: Inline asm restriction

Another case where the compiler avoids performing NRVO is when the named return object is referenced in an inline asm block. To illustrate, consider the sample below:

#include <stdio.h>
//RVO class is defined above in figure 4
RVO MyMethod (int i)
{
    RVO rvo;
    __asm {
        mov eax,rvo   //comment this line out for RVO to kick in
        mov rvo,eax   //comment this line out for RVO to kick in
    }
    return (rvo);
}
int main()
{
    RVO rvo;
    rvo=MyMethod(5);
}

Figure 10. Sample6.cpp

Compiling sample6.cpp with optimization turned on (cl /O2 sample6.cpp) will still not take advantage of NRVO. That is because the object returned was actually referenced in an inline asm block. Hence the output with and without optimizations will look like:

I am in constructor
I am in constructor
I am in copy constructor
I am in destructor
I am in destructor
I am in destructor

From the output, it is clear that the elimination of the copy constructor and destructor calls did not take place. If the asm block gets commented out, such calls will get eliminated.

Optimization Side Effects

The programmer should be aware that such optimization might affect the flow of the application. The following example illustrates such a side effect:

#include <stdio.h>
int NumConsCalls=0;
int NumCpyConsCalls=0;
class RVO
{
public:
    RVO(){NumConsCalls++;}
    RVO (const RVO& c_RVO) {NumCpyConsCalls++;}
};
RVO MyMethod ()
{
    RVO rvo;
    return (rvo);
}
int main()
{
    RVO rvo;
    rvo=MyMethod();
    int Division = NumConsCalls / NumCpyConsCalls; // divides by zero if the copy is elided
    printf ("Constructor calls / Copy constructor calls = %d\n",Division);
}

Figure 11. Sample7.cpp

Compiling sample7.cpp with no optimizations enabled (cl /Od sample7.cpp) will yield what most users expect. The "constructor" is called twice and the "copy constructor" is called once and hence the division (2/1) yields 2.

Constructor calls / Copy constructor calls = 2

On the other hand, if the above code is compiled with optimization enabled (cl /O2 sample7.cpp), NRVO will kick in and the "copy constructor" call will be eliminated. Consequently, NumCpyConsCalls will be zero, leading to a division-by-zero exception which, if not handled appropriately (as in sample7.cpp), might cause the application to crash.
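
One way to keep such code safe under both settings is to stop assuming that the copy constructor was called. The sketch below guards the division from sample7.cpp; safeRatio and reportRatio are hypothetical helpers introduced for this illustration, and the message printed when no copy was made is likewise an assumption.

```cpp
#include <cstdio>

int NumConsCalls = 0;
int NumCpyConsCalls = 0;

class RVO {
public:
    RVO() { NumConsCalls++; }
    RVO(const RVO&) { NumCpyConsCalls++; }
};

RVO MyMethod() {
    RVO rvo;
    return rvo;
}

// Writes the quotient to *ratio and returns true, or returns false
// without dividing when the divisor is zero.
bool safeRatio(int numerator, int denominator, int* ratio) {
    if (denominator == 0)
        return false;
    *ratio = numerator / denominator;
    return true;
}

// Body of main() from sample7.cpp with the division guarded.
// Returns true when a ratio was printed.
bool reportRatio() {
    RVO rvo;
    rvo = MyMethod();
    int division = 0;
    if (safeRatio(NumConsCalls, NumCpyConsCalls, &division)) {
        std::printf("Constructor calls / Copy constructor calls = %d\n", division);
        return true;
    }
    std::printf("No copy constructor calls were made\n");
    return false;
}
```

More generally, program logic should not depend on how many times a copy constructor runs, since the compiler is allowed to remove those calls.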

Reference

The C++ Standard Incorporating Technical Corrigendum 1 BS ISO/IEC 14882:2003 (Second Edition)

© Microsoft Corporation. All rights reserved.