Auto-Generating Wrapper Classes for Stored Procedures, Part 2: The Internals of the AutoSproc Tool

 

Steve J. Rodgers
Software Consultant

March 2004

Applies to:
   Microsoft® Visual Studio® .NET
   Microsoft® Visual Basic® .NET
   Microsoft® Visual C#® .NET

Summary: The AutoSproc tool uses reflection to auto-generate wrapper classes for calling stored procedures. In this companion article to Auto-Generating Wrapper Classes for Stored Procedures, Part 1: Using the AutoSproc Tool, we'll take a look at the internals of the tool and examine the techniques used to build it. (16 printed pages)

Download the Autosproc.exe sample code.

Contents

Introduction
Creating the Dynamic Assembly, Dynamic Module and Class
Implementing the Interface on the Generated Class
CIL Primer Interlude
Adding Two Numbers Together in CIL
Emitting Code Using ILGenerator Class
Emitting newobj and call Instructions: aka Creating Object Instances and Making Method Calls on Objects
What the Generated Code Really Looks Like
Wrap-Up

Introduction

These days, folks who are building scalable distributed applications tend to partition application code into discrete tiers conforming to the classic three tier model: Presentation, Business, and Data, respectively. When building an application of this kind with .NET technologies, the data tier usually consists of one or more classes with methods that exhibit a distinct programming pattern:

  1. Build a command object (SqlCommand/OleDbCommand).
  2. Add parameters to the command object (SqlParameter/OleDbParameter).
  3. Issue a call to a stored procedure in the database to carry out the request from the business tier.
  4. Return results (if any) to the business tier.

As an example, consider the following stored procedure up_insert_auth that we could define for use in the pubs database:

CREATE PROCEDURE up_insert_auth
@au_lname varchar(40)
as
insert into authors (au_id, au_lname, au_fname, phone, contract) values 
('123-56-1321', @au_lname, 'steve', '11111111', 1)
return

To actually call the stored procedure up_insert_auth from the data tier using ADO.NET, one is more than likely to define a wrapper function to build the call and execute it:

Public Class Sample1VB
    Public Shared Function Wrapper_up_insert_auth(ByVal conn As _
                        IDbConnection, _
         ByVal au_lname As String) As _
         Integer
        Dim p As IDataParameter = New SqlParameter("@au_lname", _
                                                au_lname)

        Dim cmd As IDbCommand = conn.CreateCommand()
        cmd.Parameters.Add(p)
        cmd.CommandType = CommandType.StoredProcedure
        cmd.CommandText = "up_insert_auth"
        Return cmd.ExecuteNonQuery()
    End Function

    Public Shared Sub Main()
        Dim conn As IDbConnection = New SqlConnection()
        conn.ConnectionString ="database=pubs;Integrated Security=SSPI"
        conn.Open()
        Dim rowsAffected As Integer
        rowsAffected = Wrapper_up_insert_auth(conn, "rodgers")
        conn.Close()
        Console.WriteLine("{0} rows affected", rowsAffected)
    End Sub
End Class

Writing code of this nature can be both tedious and potentially error prone; many developers use a cut and paste solution each time. This kind of repetition leads one to consider alternative approaches based on code generation techniques—in this case, code that automatically generates the wrappers for you based on type information.

This article will review the internals of the AutoSproc tool, which uses a code generation technique using classes exposed from the System.Reflection.Emit namespace. The code generation layer is approximately 1000 lines of Microsoft® Visual C#® and it would not be practical to give complete coverage in this article. That stated, being able to comprehend the implementation does not require a full analysis of the source code, merely a reasonably good understanding of CIL as well as both the System.Reflection and System.Reflection.Emit namespaces. It is assumed that the reader is fully conversant with the System.Reflection namespace, but not necessarily with the System.Reflection.Emit namespace or with CIL.

The job of the code generation assembly can be broken down into two areas of work:

  1. Creating the dynamic assembly, dynamic module and class
  2. Implementing the methods of the user defined interface on the class

Tracing class fabrication using the Microsoft® Visual Studio® .NET debugger is a good way to watch it all happen.

To whet your appetite, here is the same stored procedure illustrated above being called from Microsoft® Visual Basic® .NET code that is leveraging the code generation layer.

Note   The language neutrality of .NET means the code generation layer can be leveraged from most other managed languages such as C#, Managed C++ etc. Code samples in both C# and Visual Basic .NET can be located in the folder "CodeSamples" of the project download directory. All code samples in this article are written in C#.

'client code must reference assembly AutoSproc.dll 
'for it to compile successfully
Imports System
Imports System.Data
Imports System.Data.SqlClient
Imports AutoSproc 'code generation assembly namespace

Public Interface IPubs
    Inherits ISprocBase
    Function up_insert_auth(ByVal au_lname As String) As Integer
End Interface


Class Sample2VB
    Public Shared Sub Main()
        Dim conn As IDbConnection = New SqlConnection()
        conn.ConnectionString ="database=pubs;Integrated Security=SSPI"
        conn.Open()

        Dim imp As Object
        'Call to code generation layer
        imp = SprocFactory.CreateInstance(GetType(IPubs), _
                                            DBProvider.SQLServer)

        Dim pubs As IPubs = DirectCast(imp, IPubs)
        pubs.Connection = conn

        Dim rowsAffected As Integer
        'invoke stored procedure
        rowsAffected = pubs.up_insert_auth("rodgers")
        Console.WriteLine("{0} rows affected", rowsAffected)
    End Sub
End Class

Creating the Dynamic Assembly, Dynamic Module and Class

SprocFactory.CreateInstance() is the entry point to the code generation layer. There are in fact four overloads for this method as previously mentioned, but the one that actually performs the work is defined with the following signature:

public static object CreateInstance (Type itf, DBProvider prov, 
                     bool autoClose);

The other three methods simply provide additional logic around a call to this method.

At a high level, this method performs the following steps:

  1. Checks the cache of previously created types to see if this type has already been generated (hash table look up). If it has, then Activator.CreateInstance() is used to create a new instance of the type, AutoClose property is configured on the object, and then returns the instance to the caller without carrying out the following steps (2–10)
  2. Calls VerifyBaseInterface(). —This method ensures the interface type derives from ISprocBase, throwing an IllegalBaseInterfaceException if it does not
  3. Creates the dynamic assembly and dynamic module
  4. Creates the dynamic type
  5. Implements the interface and methods on the type
  6. Completes type creation
  7. Instantiates an instance of the type using Activator.CreateInstance()
  8. Caches the type information for the generated class in a hash table
  9. Configures AutoClose property on the instance
  10. Returns the new instance to the caller

The first thing this method does (after looking in the cache) is to carry out the work of creating the dynamic assembly and dynamic module. This is performed by means of a call to SprocFactory.ConstructAssemblyModuleHelper(). Stepping into this method in the debugger shows how various System.Reflection.Emit namespace classes are used.

Firstly, a dynamic assembly is generated by a call to:

AssemblyBuilder AppDomain.DefineDynamicAssembly(AssemblyName asmName,
                  AssemblyBuilderAccess access);

The return value is an AssemblyBuilder reference which is then used to construct the module by calling its DefineDynamicModule() method. This returns a reference of type ModuleBuilder. Stepping out of SprocFactory.ConstructAssemblyModuleHelper(), it can be seen that ModuleBuilder.DefineType() is called to define the generated type—notice the last parameter specifies the base class type as SprocBase. The base class SprocBase implements interface ISprocBase (Connection, Transaction and AutoClose properties).

Implementing the Interface on the Generated Class

The process of fleshing out the method implementations of the interface begins with a call to SprocFactory.ImplementDerivedInterfaceHelper(). This method makes a call to TypeBuilder.AddInterfaceImplementation() passing type information for the user defined interface as a parameter. This adds the interface to the list of interfaces the class implements. There may be many methods of the interface that will need to have CIL generated for their method bodies, so SprocFactory.ImplementMethodHelper() is called once for each method, passing type information for the method being generated.

At a high level, SprocFactory.ImplementMethodHelper() retrieves the ILGenerator class associated with the method, and uses it to emit the method body override.

The first task in overriding a method is to annotate the class implementation with a method that exactly matches the signature of the method to override. This is done by using a call to TypeBuilder.DefineMethod(), specifying the name of the method together with the MethodAttributes.Virtual flag.

To complete method overriding two more operations need to occur: First, the CIL for the method needs to be generated using the ILGenerator class. Second, a call to TypeBuilder.DefineMethodOverride() (the last operation carried out prior to returning) should be made to tie together the method implementation (CIL) and the method descriptor of the interface.

CIL Primer Interlude

To be able to understand what the emitted code is doing, this part of the article offers a short primer in both understanding and emitting CIL. If you are already familiar with CIL then you might like to skip this part and continue on to further discussion of the code generation layer.

It turns out that CIL is surprisingly easy to understand, although you will need to have a basic grasp of computer fundamentals. Most notably, the stack is a central concept with which you need to work when coming to grips with CIL.

The document "Partition III CIL.doc" (located in the directory: " %FrameworkSDKDir%\Tool Developers Guide\docs") contains a list of all the CIL instructions you can emit. Armed with this document and the Microsoft® .NET Framework SDK documentation, it is perfectly feasible to begin to understand how to go about emitting CIL.

Most CIL instructions expect one or more values to be pushed on the stack prior to execution. Executing the instruction results in the value(s) on the stack being popped off, operated on by the instruction, and the result of the instruction pushed onto the stack at the end.

Adding Two Numbers Together in CIL

As an example, consider the CIL instruction called add that adds two numbers together. The stack transition according to the documentation states the following for this instruction:

  • value1 is pushed onto the stack.
  • value2 is pushed onto the stack.
  • value2 and value1 are popped from the stack; value1 is added to value2.
  • The result is pushed onto the stack.

Starting out with CIL can seem a little daunting, but a useful trick to kick start your comprehension is to use ildasm.exe to disassemble methods to CIL that are originally compiled in higher level languages such as C# or VB.NET. When trying to figure out how to call the add instruction, consider writing a method in C# that looks like this:

public static void ShowAddInstruction () {
   int a;
   int b;
   int result;

   a = 0xBEEF;
   b = 0xF00D;
   
   result = a + b;      
}

Then disassemble it with ildasm.exe:

.method public hidebysig static void  ShowAddInstruction() cil managed
{
  // Code size       17 (0x11)
  .maxstack  2
  .locals init (int32 V_0,
           int32 V_1,
           int32 V_2)
  IL_0000:  ldc.i4     0xbeef
  IL_0005:  stloc.0
  IL_0006:  ldc.i4     0xf00d
  IL_000b:  stloc.1
  IL_000c:  ldloc.0
  IL_000d:  ldloc.1
  IL_000e:  add
  IL_000f:  stloc.2
  IL_0010:  ret
} // end of method App::ShowAddInstruction

It can be seen from the above disassembled code that the add instruction (at instruction address: IL_000e) is preceded by two ldloc instructions. Ldloc.0 performs the following operation according to the documentation:

"The local variable value at the index 0 (hence the 0 in ldloc.0) is pushed onto the stack."

In other words, local variable 'a' in the C# program is pushed onto the stack. It should not be too difficult to discern that the next instruction (at address: IL_000d) ldloc.1 will push the local variable 'b' in the C# program onto to the stack. With the stack now containing the two integers, the next instruction is the CIL add instruction (seen at IL_000e). Bear in mind the call to add pops both parameters off the stack, adds them together, and the result is pushed onto the stack at the end.

The next instruction (at address: IL_000f) stloc.2 takes the result left on the stack by the add instruction and performs the following action according to the documentation:

"A value is popped off of the stack and placed in the local variable indexed by 2."

In other words, the result is popped off the stack and placed in the local variable called 'result' in the C# program.

Emitting Code Using ILGenerator Class

ILGenerator exposes several methods needed to emit CIL instructions. The method most often used is ILGenerator.Emit(). This method is quite heavily overloaded (17 overloads!), but all share the same first parameter which allows you to specify the CIL instruction to emit using an OpCode structure. There is a class called OpCodes that defines the instruction set as fields, and this is used to produce the instruction codes. For example, emitting an add instruction by means of a C# wrapper method is done as follows:

public void EmitAddInstruction (ILGenerator ilGen) {
   ilGen.Emit (OpCodes.Add);
}

Of course, programs that add two numbers together will often need to fetch and retrieve results from local variables. Local variables can be declared in the CIL stream by calling ILGenerator.DeclareLocal(). Because the Common Language Runtime is a type safe execution engine, all variable declarations are strongly typed; hence, the caller must present type information for the variable being declared when calling ILGenerator.DeclareLocal().

So, for example, to declare a local variable using a C# wrapper function:

public void EmitLocalVariableDeclaration (ILGenerator ilGen, Type t) {
   ilGen.DeclareLocal (t);
}

The first call to ILGenerator.DeclareLocal() will create local.0, the second call will create local.1, and the nth call will create local.(n-1). Armed with this information, we are now in a position to use ILGenerator to flesh out the body of a method created previously with TypeBuilder.DefineMethod() .

To obtain the ILGenerator implementation, simply call MethodBuilder.GetILGenerator(). Here is the ShowAddInstruction method body implemented using ILGenerator:

public static void EmitCILOfShowAddInstruction (ILGenerator ilGen) {
      //first of all, declare the local variables
      ilGen.DeclareLocal (typeof (int)); //local.0 - 'a'
      ilGen.DeclareLocal (typeof (int)); //local.1 - 'b'
      ilGen.DeclareLocal (typeof (int)); //local.2 - 'result'

      //the following two operations perform the following 
      //equivalent in C#: a = 0xBEEF;
      //push 0xBEEF onto the stack
      ilGen.Emit (OpCodes.Ldc_I4, 0xBEEF); 
      //pop 0xBEEF off the stack into local.0
      ilGen.Emit (OpCodes. Stloc_0);

      //the following two operations perform the following 
      //equivalent line in C#: b = 0xF00D;
      //push 0xF00D onto the stack
      ilGen.Emit (OpCodes.Ldc_I4, 0xF00D); 
      //pop 0xF00D off the stack into local.1
      ilGen.Emit (OpCodes. Stloc_1);

      //the next four instructions collectively perform the
      //following equivalent line in C#: result = a + b;
      //push local.0 onto the stack 
      ilGen.Emit (OpCodes.Ldloc_0);
      //push local.1 onto the stack
      ilGen.Emit (OpCodes.Ldloc_1);
      //add the two operands
      ilGen.Emit (OpCodes.Add);
      //pop the result into local.2
      ilGen.Emit (OpCodes.Stloc_2);
   
      //emit the return for the method
      ilGen.Emit (OpCodes.Ret);
   }

Emitting newobj and call Instructions: aka Creating Object Instances and Making Method Calls on Objects

The code generation layer often needs to emit CIL to create objects and to invoke methods on objects. For example, the parameters need to be created and have the parameter directionality property set before being passed to the command object.

To create an object instance, the CIL newobj instruction is used. Again, as an example, consider the following simple C# program:

public class Person {
   public void SayHello () {
      Console.WriteLine ("hello, world");
   }
}

class App {
   static void Main () {
      Person p = new Person ();
   }
}

and here is the disassembled code produced for Main method with ildasm.exe

.method private hidebysig static void Main() cil managed
{
  .entrypoint
  // Code size       7 (0x7)
  .maxstack  1
  .locals init (class Person V_0)
  IL_0000:  newobj     instance void Person::.ctor()
  IL_0005:  stloc.0
  IL_0006:  ret
} // end of method App::Main

The line of interest is at address IL_0000, where the newobj instruction is called to create an instance of the Person type. When emitting a call to newobj using ILGenerator, it is necessary to specify type information for the constructor to invoke. Here is the equivalent code generated using ILGenerator:

public static void EmitNewObjPerson (ILGenerator ilGen) {
   ilGen.DeclareLocal (typeof (Person)); //declare local.0

   //get type information for Person
   Type pt = typeof (Person);
   //get type information for zero arg Person .ctor
   ConstructorInfo ci = pt.GetConstructor (new Type [] {});
   
   //for 1 or more arg .ctor, push parameters on
   //stack before newobj call

   //at the end of this instruction, the object reference is on the 
   //stack
   ilGen.Emit (OpCodes.Newobj, ci);
   
   //pop object reference off the stack and store in local.0
   ilGen.Emit (OpCodes.Stloc_0);
   
   ilGen.Emit (OpCodes.Ret);   
}

To make a method call on an object, you need to use the call instruction (or callvirt for virtual method invocations). The call instruction expects the stack to contain the object reference first, followed by the arguments (if any) to be passed to the method. If the method has a return value then it will be available at the end of the call on the stack. To expand upon the example already given, the newly created Person object will have its SayHello() method invoked immediately after creation in the following generated code:

public static void EmitNewObjPersonCallSayHello (ILGenerator ilGen) {
   ilGen.DeclareLocal (typeof (Person)); //declare local.0

   //get type information for Person
   Type pt = typeof (Person);
   //get type information for zero arg Person .ctor
   ConstructorInfo ci = pt.GetConstructor (new Type [] {});
   
   //for 1 or more arg .ctor, push parameters on
   //stack before newobj call

   //at the end of this instruction, the object reference is on the 
   //stack
   ilGen.Emit (OpCodes.Newobj, ci);
   
   //pop object reference off the stack and store in local.0
   ilGen.Emit (OpCodes.Stloc_0);
   
   //push object reference onto the stack
   ilGen.Emit (OpCodes.Ldloc_0);
   
   //SayHello takes no parameters so just make the call
   MethodInfo sh = typeof (Person).GetMethod ("SayHello");
   ilGen.Emit (OpCodes.Call, sh);
   
   //no return value, so no need to pop retval off stack   
   
   ilGen.Emit (OpCodes.Ret);   
}

.NET compilers such as the C# or Visual Basic .NET compiler will by default emit CIL that is always type safe. When emitting code using System.Reflection.Emit, the code emitted may not be type safe or composed of valid stack frame construction / tear down sequences. To assist the developer in determining whether an emitted assembly is well formed and type safe, the .NET Framework SDK ships with a very handy tool called PEVerify.exe that can be used to catch such errors.

What the Generated Code Really Looks Like

If the interface is defined as follows:

public interface IPubs : ISprocBase {
   [SprocName ("up_insert_auth")]
   int up_insert_authRowsAffected (string au_lname);
   
   [SprocName ("up_insert_auth")]
   DataSet up_insert_authDataSet (string au_lname);
}

Then here is the pseudo C# for both method implementations (remember this is an approximation of the generated CIL):

public int up_insert_authRowsAffected(string au_lname) 
{
   IDataParameter param;
   IDataParameter [] outputParams = null;
   IDataParameter [] inputParams = null;
 
   //look in hash table for parameters
   inputParams = this._htParams["455167124"];
   if (inputParams == null) 
   {

      param = new SqlParameter("@au_lname", au_lname);
      param.Direction = ParameterDirection.Input;
      inputParams = new IDataParameter [1];
      inputParams[0] = param;
      //store params in cache
      this._htParams["intup 455167124"] = inputParams;
   }
   else 
   {
      //cache hit!
      //for thread safety reasons, create 
      //a deep copy of the params
      
      inputParams = SprocFactory.DeepCopyParams(inputParams);
      inputParams[0].Value = au_lname;
   }

   return ExecuteStoredProcNonQuery(inputParams, 
      "up_insert_auth", this, 
      out outputParams, 0);   
}

public DataSet up_insert_authRowsDataSet 
   (string au_lname) 
{
   IDataParameter param;
   IDataParameter [] outputParams = null;
   IDataParameter [] inputParams = null;
 
   //look in hash table for parameters
   inputParams = this._htParams["455167084"];
   if (inputParams == null) 
   {
      param = new SqlParameter("@au_lname", au_lname);
      param.Direction = ParameterDirection.Input;
      inputParams = new IDataParameter [1];
      inputParams[0] = param;
      //store params in cache
      this._htParams["455167084"] = inputParams;
   }
   else 
   {
      //cache hit!
      //for thread safety reasons, create 
      //a deep copy of the params
      inputParams = SprocFactory.DeepCopyParams(inputParams);
      inputParams[0].Value = au_lname;
   }


   return ExecuteStoredProcDataAdapter(inputParams, 
      "up_insert_auth", this, out outputParams, 
      null, new DataSet(), 0);   
}

It can be seen from the two code samples that all generated method bodies begin by checking the parameter cache (hash table). A cache hit saves the time it would otherwise take to create each parameter. Of course, the first time the method is run the cache will be empty, and so the parameters are created and then cached for future calls. The hash table key is a unique function of the method signature and is calculated by calling MethodInfo.GetHashCode().ToString() at code generation time.

It can be seen that the parameters are packed into an array of IDataParameter which is then passed as a parameter to an emitted worker method call. There are actually four worker methods from which the code generation algorithm can choose to insert into the CIL stream. The one selected is based on the code generated method return type.

The four worker methods are:

SprocFactory.ExecuteStoredProcNonQuery()
SprocFactory.ExecuteStoredProcDataAdapter()
SprocFactory.ExecuteStoredProcGenerateCommand()
SprocFactory.ExecuteStoredProcIDataReader()

Inspection of these four methods reveals the fact that they share some common functionality: They all build a command object, seed it with the parameters and set the stored procedure name. However, beyond that, they carry out their own specific tasks described in Auto-Generating Wrapper Classes for Stored Procedures, Part 1: Using the AutoSproc Tool.

If the code generated method return type is int, then the worker method call injected into the CIL stream will be SprocFactory.ExecuteStoredProcNonQuery(). If the code generated method return type is DataSet (or derived from DataSet) then the worker method call injected into the stream will be SprocFactory.ExecuteStoredProcDataAdapter() and so on.

If output parameters are detected in the code generated method signature, then additional code has to be emitted at the end of the call to harvest them. This is achieved through several helper methods. First,any output parameters are identified by the worker function and are packed into the outbound IDataParameter array by the method SprocFactory.HarvestOutputParams(). The outbound IDataParameter array flows out of the worker function back into SprocFactory.ImplementMethodHelper(), where the output parameters are unmarshaled to the caller. The unmarshaling process begins with a call to SprocFactory.EmitHarvestOutputParamsHelper(). The type of each output parameter is determined through a switch statement, and the CIL required to harvest it is emitted by a call to SprocFactory.HarvestParameter()

Once output parameters are fixed up, the work of the generated CIL is over and control returns to the caller of the interface method.

Wrap-Up

Generating wrappers for calling stored procedures is not a new concept, and there are several different approaches one might take in solving the same problem—using System.Reflection.Emit is but one way. For example, it would be possible to use the System.CodeDom namespace to emit wrappers in C# or Visual Basic .NET,and there are tools available that use this technique (some are free, some are not). It is probably worthwhile spending time to evaluate the options, as they are likely to all have different features and performance overheads associated with them.

The Common Language Runtime exposes an incredibly powerful code generation mechanism with System.Reflection.Emit. Leveraging code generation techniques directly in mainstream application development is not recommended and should really be confined to areas of low level plumbing and tools, such as that showcased by this article.

Several individuals have contributed ideas and suggestions that have improved both the article and the code, and Steve would like to thank the following people for their input: Ian Griffiths, Bob Beauchemin, Ross Cockburn, and Dafydd Hughes.

Steve Rodgers is an independent software consultant living in Brisbane. He spends most of his time helping customers who are building .NET Enterprise applications.