Test Run

Testing SQL Stored Procedures Using LINQ

Dr. James McCaffrey

Code download available at:TestRun2008_04.exe(156 KB)

Contents

The System under Test
The Test Harness
Extending the Harness
Faster, More Compact, and Better Debugging

The need to test a program that accesses and manipulates a back-end SQL Server® database is very common. In many such cases, the application interacts with the back-end data through the use of SQL stored procedures. In this type of scenario, you can think of the stored procedures as auxiliary methods of the system under test; and they must therefore be tested just like any other module in the system.

Although there are several approaches you can take when testing SQL stored procedures, I have discovered that using LINQ dramatically simplifies test automation. In this month's column, I show you how to test SQL stored procedures using LINQ—specifically, using the LINQ to SQL provider. I'll assume you have intermediate level C# and SQL skills but no LINQ experience.

If you examine Figure 1, you'll see that I have written a small C# console application test harness that demonstrates the key activities you must perform when testing SQL stored procedures. I am testing a simple but representative stored procedure named usp_DeleteMovie, which resides in a database of movie-related data named dbMovies.

Figure 1 Testing a Stored Procedure Using LINQ

Figure 1** Testing a Stored Procedure Using LINQ **(Click the image for a larger view)

For each test case, I initialize a test bed database, call usp_DeleteMovie with a particular movie ID, and then check to see if the resulting state of dbMovies is correct. What is not apparent in Figure 1 is that I am using LINQ, which makes it easier to create my test harness than it is when I use ADO.NET and T-SQL. In the following sections, I will describe the dummy dbMovies database and usp_DeleteMovie stored procedure I am testing. I'll also show you the code and explain the LINQ techniques used, and then I'll explain how you can modify and extend my examples in order to meet your own needs.

The System under Test

Before digging into the harness, let's look at the sample database of the system under test so that you'll understand the test automation. In a typical development environment, developers, testers, and database architects usually work with their own local copies of the back-end database. So, let's assume that you have access to a SQL script that creates the current version of the database under test. A realistic database script would be long and complex, but the sample database creation script in Figure 2 is simple in order to keep the main concepts clear.

Figure 2 Database Creation Script

use master
go

if exists(select * from sys.sysdatabases where name='dbMovies')
  drop database dbMovies
go

create database dbMovies
go

use dbMovies
go

create table tblMain
(
  movID char(3) primary key,
  movTitle varchar(35) null,
  movRunTime int null
)
go

create table tblPrices
(
  movID char(3) primary key,
  movPrice money
)
go

create procedure usp_DeleteMovie
  @movID char(3)
as
  delete from tblMain where movID = @movID
  delete from tblPrices where movID = @movID
go

create procedure usp_GetMovieDataByPrice
  @movPrice money
as
  select m.movID, m.movTitle, p.movPrice, m.movRunTime
  from tblMain as m
  inner join tblPrices as p
    on m.movID = p.movID
  where p.movPrice = @movPrice
go

create procedure usp_GetMoviePrice
  @movID char(3),
  @price money out
as
  declare @ct int
  select @price = movPrice from tblPrices where movID = @movID
  select @ct = count(*) from tblPrices where movID = @movID
  return @ct
go

My database is named dbMovies and contains two tables, tblMain and tblPrices. The tblMain table has a movie ID column (which is the primary key), a title column, and a column that stores the duration of the movie (in minutes). The tblPrices table has a movie ID column and a column that stores the price of the movie.

My database has three stored procedures: usp_DeleteMovie, usp_GetMovieDataByPrice, and usp_GetMoviePrice. I precede the names of my stored procedures with usp_ to indicate that these are user-defined stored procedures as opposed to system stored procedures (sp_) or extended stored procedures (xp_). The points of note are that usp_DeleteMovie changes the state of the database but does not explicitly return a value; usp_GetMovieDataByPrice returns one or more SQL rowsets but does not change the state of dbMovies; and usp_GetMoviePrice returns two scalar values (one by a return value and one by an out-type parameter) but does not change database state.

These represent the three most common situations when testing SQL stored procedures: change state with no return value, return rowsets with no change of state, and return scalar values with no change of state. I will address all three scenarios shortly. Let's take a closer look at usp_DeleteMovie, the stored procedure being tested in Figure 1. The procedure accepts a movie ID as an input parameter and then deletes corresponding rows in tables tblMain and tblPrices, as you see here:

create procedure usp_DeleteMovie
  @movID char(3)
as
  delete from tblMain where movID = @movID
  delete from tblPrices where movID = @movID

From the perspective of test automation, you need to know how to prepare the dbMovies database associated with the stored procedure by setting it to a known state. Also, you'll want to know how to call the stored procedure and how to determine whether the stored procedure correctly modified the state of the database. As you'll see, performing these tasks with LINQ is much easier than with previous approaches.

The Test Harness

Let's walk through the code for the test harness that produced the output in Figure 1. The overall structure is shown in Figure 3, and the complete source code for all examples I discuss can be found in the code download that accompanies this column on the MSDN® Magazine Web site.

Figure 3 Test Harness Structure

using System;
using System.Linq;
using System.Text;
using System.Data.Linq;
using System.Security.Cryptography;

namespace Run
{
  class Program
  {
    static void Main(string[] args)
    {
      try
      {
        Console.WriteLine("\nBegin test automation using LINQ");
        string connString =
          "server=(local);database=dbMovies;Trusted_Connection=true";
        Console.WriteLine   
          ("\nStored procedure under test = usp_DeleteMovie");
        DbMovies db = new DbMovies(connString);

        string[] testCaseData = new string[] {
          "TC001:002:85-A0-D9-28-7B-AA-8D-5F-4C-43-B5-B6-A5-99-F4-63",
          "TC002:003:44-FA-D2-38-49-98-F0-BD-16-BD-9E-B6-0C-F1-73-35" };

        foreach (string testCase in testCaseData)
        {
          // parse inputs, expected state
          // call ResetDatabase()
          // call usp_DeleteMovie
          // call GetDatabaseState()
          // determine pass/fail
        } 
        Console.WriteLine("\n====================================");
        Console.WriteLine("\nEnd test run");
      }
      catch (Exception ex)
      {
        Console.WriteLine("Fatal: " + ex.Message);
      }
    } 

    static void ResetDatabase(string connString) { . . . }
    static string GetDatabaseState(string connString) { . . . }
  } 
}

LINQ is part of the Microsoft® .NET Framework 3.5, which is included with Visual Studio® 2008. Using Visual Studio 2008, I created a C# console application targeting the .NET Framework 3.5. I deleted the "using System.Collections.Generic" statement generated by Visual Studio because my harness does not use any methods from that namespace. Notice that Visual Studio added a using statement to the System.Linq namespace by default. I added a reference to the System.Data.Linq assembly and then added the corresponding using statement.

LINQ is actually a term used to describe a collection of technologies including LINQ to SQL and LINQ to XML. LINQ to SQL was formerly called DLINQ, and you will still find many online references to that older term. To test stored procedures using LINQ, I use a combination of base LINQ functionality contained in System.Linq and LINQ to SQL functionality that is specified in the System.Data.Linq namespace. I also add a using statement that points to the System.Security.Cryptography namespace because I will be using a crypto-hash method to verify the database state. After an introductory WriteLine statement, I set up a connection string to my test bed database:

string connString = 
  "server=(local);database=dbMovies;Trusted_Connection=true";

As you'll see shortly, it is useful to pass a connection string to auxiliary helper methods in the harness. Next comes a critical LINQ to SQL concept; I instantiate an object that represents my SQL database:

DbMovies db = new DbMovies(connString);

LINQ to SQL is not magic and does not automatically know about your database. Behind the scenes of my test harness I generated C# code that encapsulates all the information I need to access and manipulate the dbMovies database, including a DbMovies class. So where does this code come from?

There are three main approaches to get LINQ to SQL mapping code. You can use a command-line tool named sqlmetal.exe; you can use the Visual Studio 2008 Object Relational Designer (O/R Designer) GUI tool; or you can simply create the mapping code from scratch. I prefer to use the sqlmetal.exe tool because it is flexible and automation-friendly. To generate the C# wrapping code, I launched a Visual Studio 2008 command shell, navigated to the root directory of my test harness project, and entered this:

> sqlmetal.exe /server:(local) /database:dbMovies /sprocs 
   /code:mapping.cs

In this case, I am using the default Windows® Authentication mode to connect to the dbMovies database, but sqlmetal.exe can accept user and password information if necessary. I specify the /sprocs command-line switch in order to have sqlmetal.exe generate code that wraps the stored procedures in dbMovies, and I use the /code switch to indicate that I want source code as output rather than an intermediate .dbml (database markup language) file such as that which would be generated by the O/R Designer tool.

After generating the mapping.cs file, I add the file to my Visual Studio project. The mapping.cs file contains all the wrapping code I need to access my SQL database including a DbMovies class with a constructor that accepts a connection string:

public partial class DbMovies : System.Data.Linq.DataContext
{
  public DbMovies(string connection) : 
    base(connection, mappingSource)
  {
    //... 
  }
  // wrappers to tables, stored procedures, etc.
}

Figure 4 illustrates a simplified relation between my test harness and my test bed database. After instantiating a DbMovies object in my test harness, I set up some test case data to exercise the usp_DeleteMovie stored procedure:

Figure 4 Relation between Database and Test Harness

Figure 4** Relation between Database and Test Harness **(Click the image for a larger view)

string[] testCaseData = new string[] {
  "TC001:002:85-A0-D9-28-7B-AA-8D-5F-4C-43-B5-B6-A5-99-F4-63",
  "TC002:003:44-FA-D2-38-49-98-F0-BD-16-BD-9E-B6-0C-F1-73-35" };

Here I am only using two test cases; in a realistic testing scenario you might have hundreds or even thousands of test cases. For simplicity, I am embedding my test case data directly into my test harness. I could also have stored my data externally in the form of a flat text file, an XML file, a SQL table, or other data store. In general, external test case data is superior to internal test case data because external data is easier to modify and share.

My test case data has just three fields, delimited by colon characters. The first is a test case ID. The second represents a movie ID that will be used as an input to the usp_DeleteMovie stored procedure under test. The third is a string that represents a 16-byte MD5 crypto-hash value, which in turn represents the expected state of the database after the stored procedure under test has been called. I will explain this idea shortly. I use my test case data to control my main processing loop:

foreach (string testCase in testCaseData)
{
  // parse inputs, expected state
  // call ResetDatabase()
  // call usp_DeleteMovie
  // call GetDatabaseState()
  // determine pass/fail
}

Within the main processing loop, I use the String.Split method to parse my test case data string into individual string variables:

Console.WriteLine("\n====================================\n");
string[] data = testCase.Split(':');
string caseID = data[0];
string movieID = data[1];
string expectedState = data[2];
Console.WriteLine("Test case ID = " + caseID);
Console.WriteLine("\nID of movie to delete = " + movieID);

Now, before I can call the stored procedure under test I must set my database to a known state. This is a fundamental principle of testing SQL stored procedures (and databases in general). Many stored procedures change the state of their associated database by inserting, deleting, or updating data in one or more tables. In order to verify that a stored procedure operates correctly, you must examine the initial state of the database under test so that you can check for an expected state after the stored procedure has run. Setting a database to a known state is a task where LINQ really shines compared to previous technologies:

Console.WriteLine("Setting up test bed database");
ResetDatabase(connString);

To keep my Main method relatively clean, I farm out the work of setting up my test bed database to a helper method named ResetDatabase rather than place the code inline:

static void ResetDatabase(string connString)
{
  // delete all existing data in TblMain and TblPrices
  // populate TblMain and TblPrices with "rich" test data
}

I pass the connection string into my ResetDatabase method. First I delete all existing data in my tables, like so:

DbMovies dbLocal = new DbMovies(connString);
var allRowsInTblMain = from t in dbLocal.TblMain select t;
var allRowsInTblPrices = from t in dbLocal.TblPrices select t;
dbLocal.TblMain.DeleteAllOnSubmit(allRowsInTblMain);
dbLocal.TblPrices.DeleteAllOnSubmit(allRowsInTblPrices);
dbLocal.SubmitChanges();

I instantiate a DbMovies object. Next I use LINQ code to retrieve all existing data in both of my tables as LINQ to SQL entity collections. If you are new to LINQ, you may not have seen the implicit data type declaration mechanism using the var keyword. You can interpret var to mean "I want Visual Studio to determine the exact data type here based on the assignment code." In other words, as an alternative to using implicit typing such as this:

var allRowsInTblMain = from t in dbLocal.TblMain select t;

The following would have been an equally valid approach:

System.Linq.IQueryable<TblMain> allRowsInTblMain =
  from t in dbLocal.TblMain select t;

I really like implicit data typing in conjunction with LINQ. While writing the ResetDatabase code, I came across an issue that I want to alert you to. Originally I tried to use a single, static, global DbMovies object in my test harness rather than create a local object in my helper method ResetDatabase. However, using that approach I received the exception: "Cannot add an entity with a key that is already in use." It turns out that LINQ to SQL caches a lot of information for performance reasons. If you call a stored procedure, LINQ to SQL holds information about the entities involved, and then if you try to manipulate data related to those entities, you may get this error. A simple workaround is to avoid global objects when using LINQ to SQL.

After I have my entity collections, I pass them to the DeleteOnSubmit methods associated with each table. This effectively marks all rows in tblMain and tblPrices for deletion but does not actually perform any deletions. To execute the deletions, I issue a SubmitChanges statement, which will cause all pending changes, including my deletion requests, to be processed and propagated to the actual dbMovies database represented by the DbMovies object.

Now I am ready to populate my tables with rich data. When writing an application with a back-end database, you generally supply the database with normal data to use during the development process. But thorough stored procedure testing requires data specially designed to uncover potential problems. Here I hard-code some test bed data, which is not particularly rich in the testing sense but keeps the ideas clear:

string[] data = new string[] {
  "001:Active Assembly:101:11.11",
  "002:Bug Bashing Bit:102:22.22",
  "003:Computing in C#:103:33.33",
  "004:Data is Digital:104:22.22"
};

In a realistic scenario, I would add much more data and include null values, duplicate values, illegal values, boundary values, and so on. I could have read this rich test bed data from an external data store, but there is less reason to do so for this setup data than there is with the actual test case data. I parse my test bed data and then set up insert requests, as you see in Figure 5.

Figure 5 Create LINQ Insert Requests

foreach (string entry in data)
{
  string[] tokens = entry.Split(':');
  TblMain tm = new TblMain();
  tm.MovID = tokens[0];
  tm.MovTitle = tokens[1];
  tm.MovRunTime = int.Parse(tokens[2]);
  TblPrices tp = new TblPrices();
  tp.MovID = tokens[0];
  tp.MovPrice = decimal.Parse(tokens[3]);
  dbLocal.TblMain.InsertOnSubmit(tm);
  dbLocal.TblPrices.InsertOnSubmit(tp);
}
dbLocal.SubmitChanges();

I instantiate TblMain and TblPrices objects. These classes are defined in the code generated by the sqlmetal.exe tool. Because LINQ is simply an extension of the .NET Framework, I get all the advantages associated with the .NET environment including IntelliSense® code-completion and design-time syntax checking. But let me mention a test design pitfall you should avoid when populating a test bed database with rich data. In a realistic scenario, the database under test will contain stored procedures that insert data. You should definitely not use such procedures to populate your test bed—you would effectively be using part of the system under test to verify the system under test. Now, back in the main processing loop in my test harness, after my database has been set to a known state I am now ready to call the stored procedure under test. Once again, LINQ to SQL greatly simplifies this task:

Console.WriteLine("Calling usp_DeleteMovie");
db.Usp_DeleteMovie(movieID);

The Usp_DeleteMovie C# method is a friendly stand-in for the underlying usp_DeleteMovie stored procedure. Trust me, this is much easier than calling a stored procedure using ADO.NET—particularly in cases where there are many parameters. Now I am ready to determine if the stored procedure under test correctly modified the associated database. Determining database state, like resetting database state to a known configuration, is another fundamental task when testing stored procedures:

Console.WriteLine("Checking resulting state of test bed database"); 
string actualState = GetDatabaseState(connString);
Console.WriteLine("Expected state = " + expectedState);
Console.WriteLine("Actual state   = " + actualState);

In order to fetch the expected and actual database states, I use another helper method named GetDatabaseState:

static string GetDatabaseState(string connString)
{
  // combine data from tblMain and tblPrices into an aggregate string
  // convert aggregate to an MD5 byte array
  // return string image of byte array
}

First, I use LINQ to easily retrieve all data in my tables:

DbMovies db = new DbMovies(connString);
var allRowsInTblMain = from t in db.TblMain select t;
var allRowsInTblPrices = from t in db.TblPrices select t;

And then I build up an aggregate string by concatenating the string representations of each column value:

string s = "";
foreach (var record in allRowsInTblMain)
  s += record.MovID + record.MovTitle + record.MovRunTime;
foreach (var record in allRowsInTblPrices)
  s += record.MovID + record.MovPrice;

Now, I could use this aggregate string as my actual state and compare the string against an expected string. However, a string representation of the values in a test bed database could be very large, and some SQL data types, such as binary, don't map well to string representations. A better approach in most situations is to compute and use a hash of the values in the database:

MD5CryptoServiceProvider md5 =
  new MD5CryptoServiceProvider();
byte[] ba = md5.ComputeHash(Encoding.Unicode.GetBytes(s));
string cryptoHash = BitConverter.ToString(ba);
return cryptoHash;

I compute an MD5 (Message Digest version 5) 16-byte value of the aggregate string. I could have just as easily used a different hashing algorithm such as SHA-1 (Secure Hash Algorithm version 1). Crypto-hashes like MD5 accept a byte array of any size and produce a fingerprint-like identifying value that is always a fixed size. For readability, I convert the resulting byte array to a string using the static System.BitConverter class. Now I can determine if the stored procedure under test modified the state of dbMovies as expected or not:

if (expectedState == actualState)
  Console.WriteLine("\nTest case result = Pass");
else
  Console.WriteLine("\nTest case result = **FAIL**");

Here I just display a test result to the shell. In a production environment you can save your results to external storage. Recall that my test case data contained an expected database state such as 85-A0-D9-28-7B-AA-8D-5F-4C-43-B5-B6-A5-99-F4-63. So just where did this expected value come from? Determining expected state is the most time-consuming task when testing stored procedures. You must manually determine the expected state of your test bed database after a stored procedure has been called, given an initial state, and then use a utility program to compute an MD5 (or similar) crypto-hash of all or part of the values in expected state.

A common approach used to determine expected state, but one which is incorrect, is to run your test harness without supplying an expected value, and then fetch the hash value of the resulting state of the database. This approach creates test case data that merely verifies that the stored procedure under test produces the same, possibly incorrect, state.

Extending the Harness

Let's take a look at testing stored procedures that return a SQL rowset but that do not affect the data in the associated database. Recall that the usp_GetMoveDataByPrice accepts a movie price and returns movie ID, title, runtime, and price data for all movies that have the specified price. To call usp_GetMovieDataByPrice, you can write code along the lines of:

DbMovies db = new DbMovies(connString); 
decimal moviePrice = (decimal)22.22;
var results = dbc.Usp_GetMovieDataByPrice(moviePrice);

Very, very clean. Here I am using an implicit data type for object results. I could have also used an explicit data type for results and written:

System.Data.Linq.ISingleResult<Usp_GetMovieDataByPriceResult> results =
  db.Usp_GetMovieDataByPrice(moviePrice);

Now, to verify the correctness of the rowset returned into object results, I can first fetch an aggregate string composed of the column values of the returned rowset:

string s = "";
foreach (var r in results)
  s += r.MovID + r.MovTitle + r.MovRunTime + r.MovPrice;

And I can use this aggregate string as an actual return value, or I can compute a hash of the string as explained in the previous section. Notice that usp_GetMovieDataByPrice does not explicitly affect the state of dbMovies because it does not change the data in tblMain and tblPrice. This means that in your test harness you can omit the call to the ResetDatabase helper method if you wish.

Testing a stored procedure that returns a scalar value is generally the easiest scenario to test because there is a single expected result. For example, consider usp_GetMoviePrice, which is defined as:

create procedure usp_GetMoviePrice
  @movID char(3),
  @price money out
as
  declare @ct int
  select @price = movPrice from tblPrices where movID = @movID
  select @ct = count(*) from tblPrices where movID = @movID
  return @ct

This stored procedure effectively returns two scalar values; the price of a movie with a specified ID is stored into an out parameter and the number of movies found with the specified ID is an explicit return value. To call usp_GetMoviePrice using LINQ to SQL, you can write code like this:

DbMovies db = new DbMovies(connString); 
string movieID = "002";
decimal? actualPrice = null;
int actualCount = dbc.Usp_GetMoviePrice(movieID, ref actualPrice);

The question mark character after the decimal keyword may also be new to you. This is an example of a nullable type. Because all SQL data values can be null, the wrapping code generated by sqlmetal.exe and the O/R Designer tools generate most data members as nullable. Figure 6 shows how you would typically call uspGetMoviePrice using ADO.NET. I think you'll agree that the LINQ approach is significantly simpler.

Faster, More Compact, and Better Debugging

Based on my initial experience with LINQ and LINQ to SQL, I estimate that using LINQ reduces the time and effort to create test automation for stored procedures by at least 50 percent. Because LINQ is a part of the .NET Framework, you get the benefits of working directly with any .NET Framework-compliant language like C#. Instead of typing something like this

SqlCommand sc = new SqlCommand("INSERT INTO tblFoo VALUES('" + 
someValue + "', '" + otherValue + "')";

and then hoping you got all the quotes and commas in the right place, LINQ lets you write normal C#-style code and reports syntax errors immediately.

Because LINQ to SQL is essentially a well-designed wrapper around lower-level code (often classes like SqlDataReader), your test code is much shorter, much cleaner, and therefore easier to create, modify, and maintain. Because LINQ is a part of the .NET environment, you can use the Visual Studio debugger to track down errors in your test harness—something that is awkward or impossible with pre-LINQ approaches in many scenarios. Maybe most importantly, LINQ just feels right.

While researching this column, I found myself really enjoying the experimentation process. LINQ encourages investigatory activities because it makes your life as a developer/tester so easy. This characteristic of LINQ promotes more thorough testing, which will produce better software products.

Send your questions and comments for James to testrun@microsoft.com.

Dr. James McCaffrey works for Volt Information Sciences, Inc., where he manages technical training for software engineers working at Microsoft's Redmond, Washington, campus. He has worked on several Microsoft products including Internet Explorer and MSN Search. James is the author of .NET Test Automation Recipes, and he can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.