Reflection

Dodge Common Performance Pitfalls to Craft Speedy Applications

Joel Pobar

Parts of this article are based on a prerelease version of the .NET Framework 2.0. Those sections are subject to change.

This article discusses:

  • How to get good reflection performance
  • Early-bound and late-bound invocation
  • Member caching and handles
  • Best practices for using reflection
This article uses the following technologies:
.NET Framework, C#

Contents

What's Slow and What's Not?
When Should You Use Reflection?
Invoking or Calling a Member
Early-Bound and Late-Bound Invocation
Hybrid Late-Bound to Early-Bound Invocation
I Want My MemberInfo
The Reflection MemberInfo Cache
Handles and Handle-Resolution APIs
Implementing Your Own Cache
Wrapping It Up

Using reflection efficiently is like haggling with an API. You have to pay to play and make some concessions. But it's worth it. Reflection in .NET is one of the most powerful features you can employ to achieve application extensibility. With reflection, you can load types, understand their members, make decisions about them, and execute, all within the safety of the managed runtime. But to use this power wisely, it's important to understand the associated costs and pitfalls.

In this article I'll show you which reflection tasks are costly and which ones are not. I'll guide you in determining the cost/benefit trade-offs and will dive deep into the framework to get a close look at the internals. Then I'll review what's new for reflection in the Microsoft® .NET Framework 2.0 and how it makes certain functionality even less costly.

What's Slow and What's Not?

Figure 1 outlines some commonly used reflection APIs. As you'll see, some are fast and light while others are heavyweights.

Figure 1 Common Reflection Tasks

Fast and Light Functions

  • typeof
  • Object.GetType
  • typeof == Object.GetType
  • Type equivalence APIs (including typehandle operator overloads)
  • get_Module
  • get_MemberType
  • Some of the IsXX predicate APIs
  • New token/handle resolution APIs in the .NET Framework 2.0

Costly Functions

  • GetXX APIs (MethodInfo, PropertyInfo, FieldInfo, and so on)
  • GetCustomAttributes
  • Type.InvokeMember
  • Invoke APIs (MethodInfo.Invoke, FieldInfo.GetValue, and so on)
  • get_Name (Name property)
  • Activator.CreateInstance

Generally, lightweight functions are fast for one of two reasons: either the reflection infrastructure has the information it needs already stored in super-fast runtime data structures, or the just-in-time (JIT) compiler detects these method calls and code patterns and generates special optimizations to speed things up a bit. Two good examples of a reflection JIT optimization are the C# typeof operator and the base class library's (BCL) Object.GetType method. Both are heavily used in the BCL for type equality, and as a result they had to be special cased to ensure optimal performance. In this article, however, I'll focus on the costly reflection APIs.

When Should You Use Reflection?

You should always think carefully about how you're using reflection. Using reflection occasionally without enforcing strict performance criteria is probably fine. If reflection APIs are only invoked when you're calling the part of your app that loads and invokes a third-party plug-in, then the cost should be reasonable. However, if your scenario involves a high-volume ASP.NET Web site that requires good throughput and response times, and if it makes significant use of the heavy reflection APIs in your "fast path" (the code in the application that must run very fast and is used repeatedly), you should really consider an architectural review to decide if you've made the right decisions on your use of reflection.

As a classic example, imagine you're writing an app that acts as a host for third-party plug-ins. Do calls into these plug-ins occur mostly in areas of your code that don't get much action, or will they be accessed frequently? And what if you require reflection but your application has to be fast?

The first thing to think about when considering reflection is whether the extensibility point of your application can be statically defined as an interface or as a base class. The cleanest and best-performing way to call members on an extensibility point is to statically define the contract for consumers to implement, and then use that contract to invoke safely. Usually this contract is defined by an interface or base class, and the consumer has access to the contract to implement against it. To illustrate, I've created a sample with an extensible way to plug in a logging infrastructure.

First, I define a contract for my application's logging requirement. My app should call on three different methods to log three different types of messages: security, application, and error messages, so I define the contract to be an interface with three methods:

    public interface LogInterface
    {
        void WriteErrorEvent(string errorMessage);
        void WriteApplicationEvent(string applicationMessage);
        void WriteSecurityEvent(string securityMessage);
    }

This interface could be compiled as a DLL and distributed to third parties that want to provide logging infrastructure plug-ins for my app. Third parties could then implement their logging infrastructure using the contract (the interface), making their plug-in libraries ready for my app to consume. This code shows a third-party logging plug-in that writes all log messages to the console:

    public class JoelsLogger : LogInterface
    {
        public void WriteErrorEvent(string errorMessage)
        {
            Console.WriteLine("error: " + errorMessage);
        }

        public void WriteApplicationEvent(string applicationMessage)
        {
            Console.WriteLine("application: " + applicationMessage);
        }

        public void WriteSecurityEvent(string securityMessage)
        {
            Console.WriteLine("security: " + securityMessage);
        }
    }

Now that I have a contract and a library that implements the contract, I need code for the discovery and loading of the plug-in, discovery of the interface, type instantiation, and invocation. Here is a trivial example of this application logic:

    Assembly asm = Assembly.LoadFrom(@"c:\myapp\plugins\joelslogger.dll");
    LogInterface logger = null;
    foreach (Type t in asm.GetTypes())
    {
        if (t.GetInterface("LogInterface") != null)
        {
            logger = (LogInterface)Activator.CreateInstance(t);
            break;
        }
    }
    if (logger != null)
        logger.WriteApplicationEvent("Initialized...");

Obviously this is not optimal. It's a clunky way to discover what type implements the interface since it iterates over each type and asks the loader to load it. A better approach is to declare that all plug-ins specify the type to locate during the library load stage. This can be done via an assembly-level custom attribute or through a configuration file. Unfortunately, there is no way to enforce this "lookup" contract via code (though exceptions can be thrown from my application at run time if a contract is not met), so my app will have to rely on good solid documentation for its extensibility point to make sure third parties implement this piece of code correctly. Here's how you can add an assembly-level custom attribute:

    [AttributeUsage(AttributeTargets.Assembly)]
    public class LogInterfaceTypeAttribute : System.Attribute
    {
        public readonly string TypeName;

        public LogInterfaceTypeAttribute(string typeName)
        {
            TypeName = typeName;
        }
    }

LogInterfaceTypeAttribute has only one field, the string name of the type that implements the "LogInterface" interface. In my application docs, I ask that all third parties add my LogInterfaceType attribute to their library at the assembly level to tell my app which type in that library implements my interface:

    [assembly: LogInterfaceType("JoelsLogger")]

    class JoelsLogger : LogInterface { ... }

Now I discover the attribute after I have loaded the assembly:

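    // The short name LogInterfaceType is only valid inside [ ... ];
    // ordinary code must use the full class name.
    LogInterfaceTypeAttribute logtype = (LogInterfaceTypeAttribute)
        asm.GetCustomAttributes(typeof(LogInterfaceTypeAttribute), false)[0];
    LogInterface log = (LogInterface)
        Activator.CreateInstance(asm.GetType(logtype.TypeName));
    log.WriteApplicationEvent("Initialized...");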

This approach avoids the inefficient loop-and-check previously used for contract implementation discovery. The performance gain is quite significant because it only loads the exact type that is needed to use the third-party logging library.
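Because the "lookup" contract cannot be enforced at compile time, the host has to check it at run time. The following is a minimal sketch of such a check, not a definitive loader. PluginLoader and its exception messages are hypothetical names introduced for illustration, and the plug-in types are declared in the same assembly only so the sample is self-contained.

```csharp
using System;
using System.Reflection;

// In a real plug-in this attribute lives in the third-party assembly.
[assembly: LogInterfaceType("JoelsLogger")]

public interface LogInterface
{
    void WriteErrorEvent(string errorMessage);
    void WriteApplicationEvent(string applicationMessage);
    void WriteSecurityEvent(string securityMessage);
}

[AttributeUsage(AttributeTargets.Assembly)]
public class LogInterfaceTypeAttribute : Attribute
{
    public readonly string TypeName;
    public LogInterfaceTypeAttribute(string typeName) { TypeName = typeName; }
}

public class JoelsLogger : LogInterface
{
    public void WriteErrorEvent(string m) { Console.WriteLine("error: " + m); }
    public void WriteApplicationEvent(string m) { Console.WriteLine("application: " + m); }
    public void WriteSecurityEvent(string m) { Console.WriteLine("security: " + m); }
}

// Hypothetical helper: validates the lookup contract before instantiating.
public static class PluginLoader
{
    public static LogInterface Load(Assembly asm)
    {
        object[] attrs =
            asm.GetCustomAttributes(typeof(LogInterfaceTypeAttribute), false);
        if (attrs.Length == 0)
            throw new InvalidOperationException(
                "Plug-in does not declare a LogInterfaceType attribute.");

        string typeName = ((LogInterfaceTypeAttribute)attrs[0]).TypeName;

        // Only the named type is loaded; no iteration over asm.GetTypes().
        Type t = asm.GetType(typeName);
        if (t == null || !typeof(LogInterface).IsAssignableFrom(t))
            throw new InvalidOperationException(
                "Type '" + typeName + "' is missing or does not implement LogInterface.");

        return (LogInterface)Activator.CreateInstance(t);
    }
}
```

A host would call PluginLoader.Load with the assembly returned by Assembly.LoadFrom and treat an InvalidOperationException as a malformed plug-in.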

Invoking or Calling a Member

The reflection APIs are divided by their basic goals: invoking a member and inspecting a member. Now I'll consider the implementation behind reflection to discover exactly where the costs lie.

Reflection invocation is generally performed through two APIs in the reflection namespace: MethodBase.Invoke and Type.InvokeMember. This invocation is sometimes referred to as "late-bound" because the call to the member is analyzed at run time instead of at compile time. In contrast, call instructions emitted by the compiler are called "early-bound" calls. Using the late-bound APIs to perform invocation on members is significantly slower than using the early-bound call instruction counterparts. To put the performance implications of these APIs in perspective, it's worthwhile to contrast the cost with the other mechanisms for member invocation used in the runtime. Figure 2 illustrates the spectrum of invocation mechanisms for methods and the relative performance overhead of each. For a closer look, read Eric Gunnerson's great article, "Calling Code Dynamically".

Figure 2 Relative Performance of Invocation Mechanisms


These metrics are based on an abstract scenario and should only be used as a starting point, since actual scenarios may have different performance characteristics. The best way to get accurate numbers for your scenario is to measure, measure, and measure again.

Early-Bound and Late-Bound Invocation

Invocation mechanisms can be divided into three categories: early-bound, late-bound, and a hybrid of early- and late-bound invocation. Early-bound invocation mechanisms result from direct intermediate language (IL)-level instructions (call, callvirt, and calli), interface calls, and delegate calls. These early-bound call mechanisms are typically emitted statically by compilers at compile time, where the compiler is supplied with all of the required information about the target to be invoked. These early-bound cases are significantly faster than their late-bound and hybrid counterparts as they map down to a few x86 instructions emitted by the JIT or native code generation (NGEN) compilers. Aggressive optimizations can also be made by the JIT because the call site and method are unchanging and are well known to the runtime.

The late-bound cases are MethodBase.Invoke, DynamicMethod via Invoke, Type.InvokeMember, and late-bound delegate calls (calls on delegates via Delegate.DynamicInvoke). All of these methods come with significantly more negative performance implications than the early-bound cases. Even in the best case, they're typically an order of magnitude slower than the slowest early-bound case. Type.InvokeMember is the slowest of the late-bound invocation mechanisms because there are two functions that InvokeMember needs to perform to properly invoke a member. First, it must figure out the exact member it's supposed to invoke, as determined by the user's string input, and then perform checks to be sure the invocation is safe. MethodBase.Invoke, on the other hand, doesn't need to figure out which method it must call because the MethodBase already holds the identity of the method.

To further explore how reflection sets up and invokes a member, consider a typical usage of MethodBase.Invoke:

    public class D
    {
        public void MyMethod(string arg) { ... }
    }
    ...
    MethodInfo mi = typeof(D).GetMethod("MyMethod");
    mi.Invoke(dobj, new object[] { "testing" });

Reflection needs to take the user's string representation of MyMethod and create a MethodInfo for the method that matches that name. Because the common language runtime (CLR) stores information about the method's name in metadata, reflection must look inside metadata to learn which method on type "D" has the specified name. This logic alone is expensive. Once you know which method has the string name MyMethod, a MethodInfo is created (MethodInfo derives from MethodBase).

On the call to Invoke, reflection must perform all the checks required to make the call safe and secure. Argument and parameter type matching is generally done first. In the code snippet just shown, you can see that MyMethod accepts a string as a parameter, and the call to Invoke has an object array that contains the corresponding string to be used as the method's argument. This is a match and is therefore type safe. If the types in the argument array and the types of the method parameters don't match exactly, the reflection framework then checks to see if it can perform type coercion to force the types of the arguments and parameters to match. After argument and parameter matching and coercion are complete, reflection checks any code access security (CAS) demands that are located on the method and invokes the security subsystem as needed. Finally, after the security subsystem has vouched for the safety of the call, reflection tells the runtime to start execution of the call.
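The argument checks described above can be observed directly. The sketch below is illustrative only: Calculator and InvokeChecks are hypothetical names, and the sample relies on the default binder widening a boxed int to the long parameter while rejecting a string.

```csharp
using System;
using System.Reflection;

// Hypothetical target type with a single long parameter.
public class Calculator
{
    public long Twice(long value) { return value * 2; }
}

public static class InvokeChecks
{
    // Late-bound call: reflection matches (and if needed coerces) the
    // argument against the parameter type on every invocation.
    public static object CallTwice(object argument)
    {
        MethodInfo mi = typeof(Calculator).GetMethod("Twice");
        return mi.Invoke(new Calculator(), new object[] { argument });
    }

    // Returns true when reflection rejects the argument as not coercible.
    public static bool IsRejected(object argument)
    {
        try
        {
            CallTwice(argument);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Passing a boxed int succeeds because a widening conversion to long exists, whereas passing a string is refused before the method ever runs. Both outcomes are decided at run time, which is part of what makes Invoke expensive.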

Hybrid Late-Bound to Early-Bound Invocation

If you think about it, late-bound really means doing all the work at run time that a compiler would do at compile time. The fundamental problem is that you may repeat the same work every time you invoke the method, while compilers do all that heavy lifting once. With the code-generation features of the runtime, however, you can bridge the gap between late-bound and early-bound invocation by acting like a compiler at run time. I call this the hybrid approach. Late-bound invocation can do the work to find the method, then emit code to statically patch up the call site to the method, much like a compiler would. In the .NET Framework 2.0, there's a new reflection feature built into the runtime called Lightweight Code Generation (LCG), which is optimal for enabling this hybrid scenario. The hybrid approach can still be accomplished in the .NET Framework 1.x, albeit with a slightly heavier implementation.
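A lighter-weight variation on the same idea, which does not use LCG, is to pay the late-bound lookup once and then bind the result to a delegate with Delegate.CreateDelegate, so that later calls are ordinary delegate calls. The sketch below assumes the method's signature is known at compile time even though the target type is not. HybridBinder, LogCall, and ConsoleSink are hypothetical names used only for illustration.

```csharp
using System;
using System.Reflection;

// The compile-time contract for the call site.
public delegate void LogCall(string message);

// Hypothetical late-discovered type; the host never references it statically.
public class ConsoleSink
{
    public int Count;

    public void Write(string message)
    {
        Count++;
        Console.WriteLine(message);
    }
}

public static class HybridBinder
{
    // Late-bound step (done once): find the method by name.
    // Early-bound step (every call afterwards): invoke through the delegate.
    public static LogCall Bind(object target, string methodName)
    {
        MethodInfo mi = target.GetType().GetMethod(
            methodName, new Type[] { typeof(string) });
        if (mi == null)
            throw new MissingMethodException(target.GetType().FullName, methodName);

        return (LogCall)Delegate.CreateDelegate(typeof(LogCall), target, mi);
    }
}
```

The delegate returned by Bind can be stored and reused. The name lookup and signature check happen once in Bind rather than on every call.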

I Want My MemberInfo

To create a MemberInfo, there are generally two places that reflection looks to get information about a member: metadata and runtime data structures. When the runtime spins up to run a program, it populates its own runtime data structures with some of the information from metadata found on disk in the PE file. If you recall, metadata is a set of data tables that describe your assemblies and the entities contained in them. Using methods as an example, you can find the method names and the attributes of a method, like its visibility, in metadata. The runtime data structures only store information from metadata that is constantly and consistently referenced by its runtime services (JIT compilation, security checks, and so on). The CLR will lazily populate its data structures from metadata to get the currently executing job done. You can think of the information in the runtime's data structures as a very small subset of the metadata.

To create a MemberInfo, reflection must retrieve metadata information. It finds this data by consulting the runtime data structures or by going to metadata and computing it. Making use of the runtime data structures is fast since the data structures are optimized and have great locality of reference. However, if the runtime data structures don't have the required information, reflection must look in the metadata, which is a significantly slower operation.

Touching any sort of metadata is unpredictable and painful because you may end up with a page fault while bringing in the metadata from disk. Accessing metadata significantly increases your working set and can involve a great number of temporary allocations on the garbage collector (GC) heap. Working set refers to the amount of memory your application consumes in its running state. Rico Mariani describes it in his posting "My mom doesn't care about space".

Working set can have a significant impact on your performance. What's even worse is that reflection doesn't have any invariants or policy for accessing metadata or for the impact on the working set that happens with each call. The reason for this undefined performance behavior is that reflection and the runtime have a number of caches in place to reduce the number of times the reflection runtime has to travel to metadata to pick up the bits. At a high level, the two basic caches that reflection may access when it is asked to build a MemberInfo are the metadata cache and the reflection MemberInfo cache. Unfortunately, the working set increase comes not only from pulling in metadata pages, but also from the fundamental design of the reflection MemberInfo cache.

The Reflection MemberInfo Cache

If you want a lesson in bad design, the reflection MemberInfo cache is it. In all releases, this caching solution attempts to reduce the cost of the repeated metadata access that happens when you ask for the same MemberInfo more than once. There are a few surprises associated with the cache design in the .NET Framework 1.x which may negatively impact the working set.

Figure 3 B and D


First, let's look a little closer at the way this cache works. In .NET Framework 1.x, each type has its own MemberInfo cache. The main entry point for creation of the cache comes from the GetXX APIs in the reflection namespace (I use GetXX as the name of the collective MemberInfo obtainers: GetMethod, GetProperty, GetEvent, and so on). There are two forms of these APIs, the non-plural, which return one MemberInfo (such as GetMethod), and the plural APIs (such as GetMethods), which return an array of members contained in the type. In the .NET Framework 1.x, reflection enforced a policy that MemberInfo objects would be created for all members of a specific kind on both the target type and the type's inheritance hierarchy on calls to either the plural or non-plural APIs. I call this eager caching; regardless of whether you asked for one member or all members, reflection would cache all members by default. For example, if you call GetProperty looking for a particular property on a type, all properties from that type will be cached.

Figure 3 helps illustrate the eager caching algorithm. Here you can see a simple class hierarchy, where type D derives from type B and where each type has a few methods on it. Consider now an application that invokes the following code:

MethodInfo mi = typeof(D).GetMethod("MyMethod");

When this code requesting the MethodInfo for MyMethod on type D is executed for the first time, the .NET Framework 1.x would eagerly create MethodInfos for all methods including non-visible methods on types D and B, and store them into the type's MemberInfo cache. Figure 4 shows the result. Once this cache is populated, subsequent calls to GetMethod and GetMethods on the same type would pull the MethodInfo from the cache, removing the requirement to requery the metadata.

Figure 4 .NET Framework 1.x MemberInfo Cache


Unfortunately, types that have a deep inheritance hierarchy and a lot of members are good examples of where this cache design can negatively affect performance. On a single call to GetMethod where the user is requesting one method only, reflection will create the cache and then eagerly spend a lot of time reading metadata and populating the cache for all the other methods, even if they won't be used immediately or ever.

Improving Reflection Performance

There are various reasons why an application can't define a static extensibility contract, but it's usually a result of the flexibility the architect built in. Here are some ways to reduce some of the cost of reflection and of the MemberInfo cache.

  • Avoid using Type.InvokeMember to call late-bound methods. Type.InvokeMember is the slowest way to invoke a member via reflection. If your scenario doesn't involve COM interop invocation (something which Type.InvokeMember makes easier), consider other alternatives for setting up calls to members obtained through late-bound methods.

  • Avoid case-insensitive member lookups. When calling both the plural and non-plural GetXX APIs, avoid specifying BindingFlags.IgnoreCase as a binding flag argument, as it has significant invocation and working set costs. Metadata is case sensitive, and to perform an "ignore case" operation, reflection needs to obtain the metadata string names for all members of a type and then do a case-insensitive comparison. In addition, due to the design of the reflection cache, a second MemberInfo cache is created for the case-insensitive members, which basically doubles an application's working set.

  • Use BindingFlags.ExactBinding whenever possible. Specify this BindingFlag when calling MethodBase.Invoke if you know that the types of the arguments you are passing exactly match the types of the method's parameters. It will allow reflection to go down an optimized, more efficient code path that does not perform argument type-coercion. If the argument types don't match the method's parameter types, an exception will be thrown.

  • Call the non-plural GetXX method if you know the name of the member you care about (.NET Framework 2.0 only). The MemberInfo cache is lazily populated in the .NET Framework 2.0, which means lower working set cost and less time to retrieve the method. If you know the name of the particular method you want to obtain, use the non-plural GetXX method.

  • Cache only the handle to a member (.NET Framework 2.0 only). In the .NET Framework 2.0, developers have the ability to define a cache policy themselves. The .NET Framework 2.0 introduces new APIs called the token/handle resolution APIs. This set of APIs exposes some of the fundamental member identity concepts of the runtime and allows a user to set up and execute a MemberInfo caching policy on their own.

Another problem with eager caching is the cache growth policy. The .NET Framework 1.x MemberInfo cache is allowed to continue to grow indefinitely, without any reclamation. The more members you reflect on, the bigger the cache grows, even if your application is finished using reflection. The resulting working set growth can cause problems for all sorts of long-running apps that use reflection heavily. For example, tools such as object browsers that continuously reflect on user-loaded types can be severely affected. The cache, and the working set, can get big fast.

A better cache policy was needed for the .NET Framework 2.0 reflection cache. Specifically, a consumer should be able to call a non-plural GetXX API and not have to worry that the API will create MemberInfos for the rest of the members on that type. The reflection cache growth problem also needed a solution.

The .NET Framework 2.0 delivers a better reflection MemberInfo back-end cache. As a result, there is a better policy around obtaining members as well as a much better plan for growth and reclamation. There are two main changes to the MemberInfo cache policy in the .NET Framework 2.0. First, the cache creation algorithm is lazy so reflection only retrieves and caches data it is asked for. Secondly, the MemberInfo cache for each type has a solution for growth and reclamation. The cache has been rebuilt in managed code (as opposed to unmanaged in the .NET Framework 1.x) and sits in the GC heap, so the user can control cache reclamation. Such user control can significantly reduce the working set of an application.

With the new lazy policy, calling a non-plural GetXX API now does what you would expect: it creates a cache (if a cache for the type's members doesn't already exist) and populates it with just one member. Of course, calling the plural GetXX APIs will result in the same eager algorithm described for the .NET Framework 1.x cache because the call is asking for all members of a specific kind on a type. Thus, executing the same code you executed earlier

MethodInfo mi = typeof(D).GetMethod("MyMethod");

will result in the situation depicted in Figure 5 rather than that shown in Figure 4.

Figure 5 .NET Framework 2.0 MemberInfo Cache


Figure 5 shows that, as before, a MethodInfo is allocated in the MemberInfo cache and a reference is created and then returned to the requester of the MethodInfo. This time, however, both are in the GC heap. Separate from the MemberInfo cache is the Type cache, another cache created by the loader and populated every time a type is loaded. This cache holds a weak reference to the relevant MemberInfo cache, so even though the Type cache is never reclaimed, it won't keep a MemberInfo cache alive. It has an insignificant impact on the working set, and you needn't worry about the implications of the Type cache; just remember that it exists.

Now if an application no longer references a MethodInfo and there are no existing references to any other MethodInfos in the relevant MemberInfo cache, the GC will be able to collect and reclaim the memory for that cache. So if you drop all your MemberInfo references, there's a good chance that you'll drop all MemberInfo caches, as well. Notice I said "good chance." There's always a caveat, of course, and here is the one for this situation: the .NET Framework 2.0 MemberInfo cache is process-wide, so another application in the same process (running in a different application domain) with a reference to a MemberInfo on the same type as your application will keep the relevant cache alive. The chance of that happening depends on your particular code base.

For more on tweaking reflection performance, see the sidebar "Improving Reflection Performance."

Handles and Handle-Resolution APIs

A handle is a small, lightweight structure that, in association with a type context, defines the identity of a member. You can think of handles as trimmed-down MemberInfos, forgoing most of the methods and data, and without the back-end cache. In the .NET Framework 2.0, reflection offers APIs that get the handle from a MemberInfo and resolve the handle back to the MemberInfo, as shown here:

    // Obtaining a handle from a MemberInfo
    RuntimeMethodHandle handle = typeof(D).GetMethod("MyMethod").MethodHandle;

    // Resolving the handle back to the MemberInfo
    MethodBase mb = MethodInfo.GetMethodFromHandle(handle);

Figure 6 Costs for Handle Resolution and GetXX Calls


Handles don't keep the members in the cache alive but they still represent the identity of a MemberInfo. A user has the ability to set up his own cache based on these two invariants. The cost of going from a handle to a MemberInfo is about the same as using one of the GetXX methods if the appropriate MemberInfo is already in the cache. If the MemberInfo is not in the cache, resolving a Handle to a MemberInfo is approximately twice as fast as using one of the GetXX methods. You'll get consistent results, you won't need to worry about whether the cache is alive, and you'll keep working set down. Figure 6 compares the differences between methods.

Implementing Your Own Cache

Given the efficiency of handles, you might consider implementing your own cache on top of the one that exists in the Framework. But good cache design is hard, as Rico Mariani points out in his blog "Caching Implies Policy." If you find it difficult to set the policy of your cache, consider avoiding the pain of implementing one. Rico's cache policy posting is a good read if you're thinking about developing your own caching strategy.

If you want to try anyway, a quick and easy exercise is to cache a call to GetMethod. For that API, you need a System.Type, the string-based method name, and an optional type array for the method's parameters (since overloading can introduce multiple methods with the same name on the same type). The obvious trick is to make the key to your cache unique. Here's a cache key implementation for all GetMethod call sites:

    class CacheKey
    {
        public CacheKey(RuntimeTypeHandle th, string methodName, Type[] args)
        {
            InstanceHandle = th;
            MethodName = methodName;
            TypeArguments = args;
        }
        ...
    }

This key can then be used to store the associated MethodHandle. Whenever a call to GetMethod is required, you can hit your cache with the RuntimeTypeHandle of the type, the string-based method name, and the array of parameter types, and then get back the appropriate MethodHandle. A call to MethodBase.GetMethodFromHandle will quickly retrieve a MethodInfo from that MethodHandle, ready for use.

If you get a cache miss on your own cache, you'll need to call GetMethod anyway to obtain the handle from the resulting MethodInfo. After making use of the MethodInfo, you'll end up throwing the MethodInfo away, keeping only the handle and hoping that the GC soon reclaims the MemberInfo cache that was temporarily created. A cache hit, however, never results in a call to GetMethod. You can go directly from the handle to the MethodInfo, perform whatever operations you like, and then drop the MethodInfo again.
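To make the hit and miss paths concrete, here is one possible sketch of such a cache. It is an illustration under stated assumptions rather than a recommended design. The Equals and GetHashCode implementations, the MethodHandleCache class, its lock-based synchronization, and the absence of any eviction policy are all choices made for this example and are not taken from the key shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// A key is unique per (type, method name, parameter types) combination.
public sealed class CacheKey : IEquatable<CacheKey>
{
    private readonly RuntimeTypeHandle instanceHandle;
    private readonly string methodName;
    private readonly Type[] typeArguments;

    public CacheKey(RuntimeTypeHandle th, string methodName, Type[] args)
    {
        instanceHandle = th;
        this.methodName = methodName;
        typeArguments = (Type[])args.Clone(); // defend against later mutation
    }

    public bool Equals(CacheKey other)
    {
        if (other == null) return false;
        if (!instanceHandle.Equals(other.instanceHandle)) return false;
        if (!string.Equals(methodName, other.methodName, StringComparison.Ordinal)) return false;
        if (typeArguments.Length != other.typeArguments.Length) return false;
        for (int i = 0; i < typeArguments.Length; i++)
            if (typeArguments[i] != other.typeArguments[i]) return false;
        return true;
    }

    public override bool Equals(object obj) { return Equals(obj as CacheKey); }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = instanceHandle.GetHashCode() * 31 + methodName.GetHashCode();
            foreach (Type t in typeArguments) hash = hash * 31 + t.GetHashCode();
            return hash;
        }
    }
}

// Hypothetical cache that stores only handles, never MethodInfo objects.
public static class MethodHandleCache
{
    private static readonly Dictionary<CacheKey, RuntimeMethodHandle> handles =
        new Dictionary<CacheKey, RuntimeMethodHandle>();
    private static readonly object sync = new object();

    public static int Count { get { lock (sync) { return handles.Count; } } }

    public static MethodInfo GetMethod(Type type, string name, Type[] parameterTypes)
    {
        CacheKey key = new CacheKey(type.TypeHandle, name, parameterTypes);
        RuntimeMethodHandle handle;
        lock (sync)
        {
            if (!handles.TryGetValue(key, out handle))
            {
                // Cache miss: pay for GetMethod once, keep only the handle.
                MethodInfo found = type.GetMethod(name, parameterTypes);
                if (found == null) return null;
                handle = found.MethodHandle;
                handles[key] = handle;
            }
        }
        // Cache hit (or freshly stored handle): resolve without GetMethod.
        return (MethodInfo)MethodBase.GetMethodFromHandle(handle, type.TypeHandle);
    }
}
```

Callers use the returned MethodInfo and then let it go. Only the small handle stays rooted in the dictionary, so the framework's own MemberInfo cache remains eligible for collection.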

After you've retrieved the appropriate MemberInfo, you'll need to be able to invoke it. If your scenario requires that you call Invoke on a MemberInfo multiple times and you can't define a static contract for some reason, you can consider code generation to remove the cost of the Invoke method call. To recap, a call to Invoke involves a number of security checks, type parameter checks, and walks along metadata, all of which can become expensive quickly. It doesn't make sense to repeat these checks if you are calling the same method multiple times. Enter the new Lightweight Code Generation (LCG) feature in the .NET Framework 2.0. This feature bridges the gap between purely dynamic invocations and early-bound calls. It hangs off of the System.Reflection.Emit namespace and provides the ability to generate new methods at run time. These dynamic methods are fully reclaimable by the GC, which is important for control over an application's working set. The LCG feature uses the existing Reflection.Emit.ILGenerator class and a new DynamicMethod class to emit IL for a new method, and it provides two ways to perform invocation, through delegates and through DynamicMethod.Invoke.

As an introductory example of using DynamicMethod, consider the following code, which generates a method that writes "Hello, world" to the console:

    // Look up Console.WriteLine(string) so the generated IL can call it.
    // ILGenerator.Emit takes a MethodInfo as the operand of a call instruction.
    MethodInfo writeLine = typeof(Console).GetMethod(
        "WriteLine", new Type[] { typeof(String) });

    DynamicMethod dm = new DynamicMethod(
        "HelloWorld",          // name of the method
        typeof(void),          // return type of the method
        new Type[] {},         // argument types for the method
        typeof(LCGHelloWorld), // type in module with which to associate
        false);                // skip JIT visibility checks

    ILGenerator il = dm.GetILGenerator();
    il.Emit(OpCodes.Ldstr, "Hello, world");
    il.Emit(OpCodes.Call, writeLine);
    il.Emit(OpCodes.Ret);

There are two ways to invoke this LCG method: using DynamicMethod.Invoke, or DynamicMethod.CreateDelegate. Delegate invocation is the faster, recommended approach:

    delegate void MyHelloWorldDelegate();
    ...
    MyHelloWorldDelegate del = (MyHelloWorldDelegate)
        dm.CreateDelegate(typeof(MyHelloWorldDelegate));
    ...
    del();

With that brief overview in hand, let's take a step back and consider the problem we're trying to solve with code generation.

When a method is not known at compile time, you cannot use the extremely fast call instructions because the compiler is unaware of how to set up the CLR for dynamic call sites. Here's an example where a compiler has all of the information it needs:

    public static int GetEmployeeIdNumber(string employeeName) { ... }

    // callsite
    int empNum = GetEmployeeIdNumber("Joel");

At compile time, the compiler will emit IL to set up the stack for this call site. Everything about this particular call site is well known to the compiler. It can check that the types of the arguments and the types of the parameters match, and it can make sure that the method is "visible" to the call site (in this case, it is because the method is public). After these checks, the compiler can safely emit IL that pushes the string "Joel" onto the stack and emits a "call method" instruction to invoke the method. The IL for this call site is illustrated here:

    ldstr "Joel"
    call int32 EmployeeData::GetEmployeeIdNumber(string)
    stloc.0

Contrary to this example, most uses of reflection invocation are purely dynamic. You can use the reflection APIs to invoke methods in a run-time world where you don't know or have compile-time access to the signature or location of the method that you're trying to invoke. Unfortunately, you pay the price at invocation time because reflection has to do a lot of these checks at run time while a compiler would do them at compile time.
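For contrast with the early-bound call site, a purely late-bound call to the same method might look like the following sketch. The body of GetEmployeeIdNumber is a stand-in (the original elides it), and LateBoundCallSketch and CallByName are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Reflection;

public static class EmployeeData
{
    // Stand-in body for illustration only; the real implementation is not shown.
    public static int GetEmployeeIdNumber(string employeeName)
    {
        return employeeName.Length;
    }
}

public static class LateBoundCallSketch
{
    // Looks up a public static method by name at run time and invokes it.
    public static object CallByName(Type type, string methodName, params object[] args)
    {
        // Collect the run-time types of the arguments to select an overload.
        Type[] argTypes = new Type[args.Length];
        for (int i = 0; i < args.Length; i++)
        {
            argTypes[i] = args[i].GetType();
        }

        MethodInfo method = type.GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Static,
            null, argTypes, null);
        if (method == null)
        {
            throw new MissingMethodException(type.FullName, methodName);
        }

        // Argument, visibility, and security checks happen here,
        // and they are repeated on every call.
        return method.Invoke(null, args);
    }
}
```

A call such as LateBoundCallSketch.CallByName(typeof(EmployeeData), "GetEmployeeIdNumber", "Joel") reaches the same method as the compiled call site, but every argument is boxed into an object array and checked at run time instead of at compile time.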

You may be asking why reflection doesn't simply perform these checks once. Reflection and its various Invoke methods repeat these checks every time you invoke, even if the arguments are the same. This is done for a variety of reasons, including security. LCG can be used to bridge the gap between the compiler and reflection for invocations over a particular method. Using LCG, you can bypass the checks that reflection performs each time you invoke if you know that the method you are calling is safe.

For LCG to bridge between a call site and a method, you'll first need to define a delegate that can be used statically in code to reference the LCG method. For the cases where the extensibility point's method is totally unknown, you'll need to make a delegate signature that is generic enough to apply to all method signatures. This is fairly easy since the delegate signature can return an object and take an object array:

delegate object CallSiteDelegate(object[] args);

Now you need to set up the LCG method to match this generic signature. The LCG MethodWrapper method returns an object, and takes an object array as its parameters:

DynamicMethod dm = new DynamicMethod("MethodWrapper", typeof(object), new Type[] { typeof(object[]) }, ..., ...);

The code for building the LCG method is complicated and is beyond the scope of this article. As a summary though, the method needs to perform several operations before finally calling the intended method. It must check that there are enough arguments in the object array that is supplied to the LCG wrapper method. It then must pull out the instance or "this" pointer if the method being called is an instance method. Next, it unpacks the argument array and lays out the arguments on the stack, after which it calls the method and returns its result.
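To give a feel for those steps, here is a deliberately simplified sketch of such a wrapper, not the full implementation. It assumes the target is a static method without ref or out parameters, so the "this" pointer step is skipped, and the names LcgWrapperSketch and Wrap are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public delegate object CallSiteDelegate(object[] args);

public class LcgWrapperSketch
{
    // Wraps a static method in an object Wrapper(object[]) dynamic method.
    public static CallSiteDelegate Wrap(MethodInfo target)
    {
        if (!target.IsStatic)
            throw new NotSupportedException("This sketch handles static methods only.");

        ParameterInfo[] parameters = target.GetParameters();
        DynamicMethod dm = new DynamicMethod(
            "MethodWrapper", typeof(object), new Type[] { typeof(object[]) },
            typeof(LcgWrapperSketch).Module, false);
        ILGenerator il = dm.GetILGenerator();

        // Step 1: check that the array holds enough arguments.
        Label enoughArgs = il.DefineLabel();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldlen);
        il.Emit(OpCodes.Conv_I4);
        il.Emit(OpCodes.Ldc_I4, parameters.Length);
        il.Emit(OpCodes.Bge, enoughArgs);
        il.Emit(OpCodes.Ldstr, "Too few arguments.");
        il.Emit(OpCodes.Newobj,
            typeof(ArgumentException).GetConstructor(new Type[] { typeof(string) }));
        il.Emit(OpCodes.Throw);
        il.MarkLabel(enoughArgs);

        // Step 2: unpack each array element and lay it out on the stack.
        for (int i = 0; i < parameters.Length; i++)
        {
            Type parameterType = parameters[i].ParameterType;
            if (parameterType.IsByRef)
                throw new NotSupportedException("ref/out parameters are not handled.");

            il.Emit(OpCodes.Ldarg_0);
            il.Emit(OpCodes.Ldc_I4, i);
            il.Emit(OpCodes.Ldelem_Ref);
            if (parameterType.IsValueType)
                il.Emit(OpCodes.Unbox_Any, parameterType); // unbox value types
            else
                il.Emit(OpCodes.Castclass, parameterType); // cast reference types
        }

        // Step 3: a straight call, then convert the result to object.
        il.Emit(OpCodes.Call, target);
        if (target.ReturnType == typeof(void))
            il.Emit(OpCodes.Ldnull);
        else if (target.ReturnType.IsValueType)
            il.Emit(OpCodes.Box, target.ReturnType);
        il.Emit(OpCodes.Ret);

        return (CallSiteDelegate)dm.CreateDelegate(typeof(CallSiteDelegate));
    }
}
```

The type checks now live in the generated IL (castclass and unbox.any), so they are paid once per call as plain instructions rather than through the reflection binder. A production wrapper would also need to handle instance methods, virtual dispatch through callvirt, and by-reference parameters.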

The most interesting part of the LCG method wrapper is the last few IL instructions to do the call once the arguments are laid out on the stack. The instructions end up being a straight call or callvirt. This is the basic building block for bridging late-bound and early-bound. Looking at Figure 2, you can see that these IL instruction calls will be super fast.

Finally, calling the dynamically generated method is done via the CallSiteDelegate delegate, with arguments supplied in the object array. For GetEmployeeIdNumber it looks like the following:

    DynamicMethod dm = ...;
    CallSiteDelegate getEmployeeIdMethod =
        (CallSiteDelegate) dm.CreateDelegate(typeof(CallSiteDelegate));
    getEmployeeIdMethod(new object[] { "Joel" });

Overall, invoking the method this way will be only slightly slower than a delegate call plus a call or callvirt instruction.

Wrapping It Up

Here I've tried to provide a clear explanation of how and when to use reflection and what it looks like under the covers, along with some tricks and tips on how to get the most out of it. I'm hoping that I've delivered enough information to get you thinking about your reflection scenarios, and enough depth for you to make some solid decisions about your designs.

Joel Pobar is a Program Manager on the common language runtime (CLR) team at Microsoft. He primarily works on the more dynamic features of the CLR. You can reach him at joelpob@microsoft.com.