Asynchronous Command Execution in ADO.NET 2.0

 

Pablo Castro
Microsoft Corporation

April 2005
Updated June 2005

Applies to:
   ADO.NET 2.0

Summary: Get an overview of the new asynchronous execution functionality in ADO.NET 2.0, the scenarios it was developed to enable, plus some of the issues to keep in mind when using this feature. (13 printed pages)

Contents

Introduction
True Asynchronous I/O
New API Elements
Application Scenarios
Some Details to Keep in Mind
Conclusion

Introduction

In the 2.0 release of ADO.NET, we not only wanted to make existing scenarios easier, but also enable new scenarios that either were just not possible before, or were far from ideal.

Asynchronous command execution is a good example of that. In releases of ADO.NET before 2.0, it wasn't possible to execute a command and not wait for it to complete before continuing execution. The addition of an asynchronous API enables scenarios where it is important for the application to continue execution without waiting for the database to complete the operation.

In this article, I'll cover the basics of the asynchronous database API and a couple of scenarios where this API is useful. Although we designed the API to work with any data access provider, SqlClient—the .NET data access provider for SQL Server—is the only one of the four providers included with .NET that actually supports it. Because of that, I'll use SqlClient throughout the rest of the article in the samples and descriptions of methods and classes. Bear in mind that third-party provider writers can implement this asynchronous API as well, so we may see more databases that can be accessed asynchronously. Simply change the class and method names accordingly to use these samples with other databases.

True Asynchronous I/O

In previous versions of the .NET Framework it was possible to simulate non-blocking execution by using asynchronous delegates or the ThreadPool class; however, those solutions simply blocked another thread in the background, making them far from ideal for cases where it is important to avoid blocking threads, as we'll see in the Application Scenarios section below.

Also, there was at last one earlier database access API that exposed "non-blocking" statement execution: ADO. There is a fundamental difference between the ADO and the ADO.NET/SqlClient implementation, however: ADO spawns a background thread and blocks on that thread instead of the calling thread until the database operation finishes. Although this approach works for client-side applications, it's not useful for middle-tier and server-side scenarios such as the ones we'll discuss later in the article.

ADO.NET/SqlClient asynchronous command execution support is based on true asynchronous network I/O under the covers (or non-blocking signaling in the case of shared memory). If we have enough demand, perhaps I'll write about the internal implementation some day. For now, let me state that we do "true async," there are no blocked background threads waiting for a particular I/O operation to finish, and we use the overlapped I/O and the input/output completion ports facilities of the Windows 2000/XP/2003 operating systems to make it possible to use a single thread (or a few of them) to handle all the outstanding requests for a given process.

New API Elements

We modeled the new ADO.NET asynchronous API after the existing APIs in the .NET Framework, with similar functionality. Consistency is an important thing in large frameworks

Asynchronous Methods

All command execution APIs are in the Command object in ADO.NET, including ExecuteReader, ExecuteNonQuery, ExecuteXmlReader and ExecuteScalar. We decided to minimize the API surface that we added for this feature, so we added asynchronous versions only for the methods that couldn't be adapted from other methods: ExecuteReader, ExecuteNonQuery and ExecuteXmlReader. ExecuteScalar is simply a short form of ExecuteReader + fetch first row/first column + close the reader, so we didn't include an asynchronous version of it.

Following the asynchronous API pattern already in use in the .NET Framework, each existing synchronous method has now an asynchronous counterpart that's split into two methods; a begin part that starts the work, and an end part that completes it. The table below summarizes the new methods in the command object:

Table 1. New asynchronous methods available in ADO.NET 2.0

Synchronous MethodAsynchronous Method
"Begin" part"End" part
ExecuteNonQueryBeginExecuteNonQueryEndExecuteNonQuery
ExecuteReaderBeginExecuteReaderEndExecuteReader
ExecuteXmlReaderBeginExecuteXmlReaderEndExecuteXmlReader
The asynchronous pattern models methods, so the **begin** method takes all the input parameters, and the **end** method provides all the output parameters, as well as the return value. For example, here is what an asynchronous invocation of **ExecuteReader** looks like.
IAsyncResult ar = command.BeginExecuteReader();
// ...
// do other processing
// ...
SqlDataReader r = command.EndExecuteReader(ar);
// use the reader and then close the reader and the connection

In the sample above you can see that BeginExecuteReader doesn't take any parameters (mapped to the ExecuteReader overload that doesn't take any parameters), and that EndExecuteReader returns a SqlDataReader, just like ExecuteReader does.

In line with other asynchronous APIs in the base class library, the begin methods return an IAsyncResult reference that can be used to track the state of the operation. This will be covered in more depth in the "Completion Signaling" section below.

The "async" Connection String Keyword

In order to use asynchronous commands, the connections on which the commands will be executed must be initialized with async=true in the connection string. An exception will be thrown if any of the asynchronous methods are called on a command with a connection that doesn't have async=true in its connection string.

Note that if you know you'll use a given connection object with synchronous commands only, it's better not to include the async keyword in the connection string, or alternatively include it but set it to false. Execution of synchronous operations on connections that have asynchronous operations enabled will have noticeably increased resource utilization.

In cases where you need both synchronous and asynchronous APIs, you should use different connections if possible. If that's not an option, then you can still use the synchronous methods in connections opened with async=true, and they'll behave as usual; you'll see a small performance degradation, though.

Completion Signaling

One of the fundamental elements of asynchronous APIs is the completion signaling mechanism. In synchronous APIs, the method calls don't return until the operation is finished, so this is not an issue. In the asynchronous cases, the begin call returns immediately, so we need a way to detect when the operation is actually complete.

In ADO.NET—as in the rest of the .NET Framework asynchronous APIs—there are a number of options to detect when an asynchronous command has finished executing:

  • Callback: all the begin methods have overloads that take a delegate as a parameter, along with a user-defined state object. When this overload is used, ADO.NET will call that passed-in delegate, and make the state object available through the IAsyncResult object (passed as a parameter to the delegate). Note that the callback will be called in a thread-pool thread, which is likely to be different from the thread that initiated the operation. Proper synchronization may be needed, depending on the application.
  • Synchronization objects: the IAsyncResult objects returned by the begin methods have a WaitHandle property that contains an event object. The event object can be used in synchronization primitives such as WaitHandle.WaitAny and WaitHandle.WaitAll. This allows the calling code to wait for multiple pending operations, and be notified either when one finishes or when all of them finish. This also allows scenarios where the client code needs to wait both for the database operation and some other activity that uses events, or any other OS waitable handle for synchronization, too.
  • Polling: the IAsyncResult object also has an IsCompleted boolean property. This property will change to true when the operation completes, so it can be used by code that needs to perform some continuous activity; that code can periodically check the property and, if it changed, process the results.

In all three cases, once the operation is signaled as completed, the caller must call the corresponding end method for the begin method that initiated the asynchronous command. Failure to call the end method that matches the begin method may result in ADO.NET leaking system resources. Also, the call to the end method will make the results of the operation available to the caller; this is a SqlDataReader (for EndExecuteReader), the number of records affected (for EndExecuteNonQuery), or an XmlReader (for EndExecuteXmlReader).

Note that calling the end method without waiting for the operation to complete is perfectly legal. In that case, the method will block until the operation completes, turning it into a synchronous operation (functionally speaking; it remains asynchronous under the covers).

Application Scenarios

Designing features based on their "cool factor" is too easy and too tempting, so we try to avoid it by making sure that we have a relevant application scenario for each and every new feature that we ship. Here are a couple of the main scenarios that we had in mind when designing the asynchronous API. I also included one where asynchronous commands may sound like a good candidate, but may not be after further consideration—at least not always.

Executing Statements in Parallel

One interesting scenario for asynchronous command execution is the execution of multiple SQL statements in parallel, either against the same or different database servers.

Let's say you need to display information about a particular employee in your application, and some of that information is in the human resources database, while the salary-related information is in the accounting database. It would be nice to be able to send queries to both databases at the same time and have them execute in parallel, instead of waiting for the first statement to complete before starting the second one.

For example:

// obtain connection strings from configuration files or
// similar facility
// NOTE: these connection strings have to include "async=true", for
// example: 
// "server=myserver;database=mydb;integrated security=true;async=true"
string connstrAccouting = GetConnString("accounting");
string connstrHR = GetConnString("humanresources");
// define two connection objects, one for each database
using(SqlConnection connAcc = new SqlConnection(connstrAccounting))
using(SqlConnection connHumanRes  = new SqlConnection(connstrHR)) {
  // open the first connection
  connAcc.Open();
  // start the execution of the first query contained in the
  // "employee_info" stored-procedure
  SqlCommand cmdAcc = new SqlCommand("employee_info", connAcc);
  cmdAcc.CommandType = CommandType.StoredProcedure;
  cmdAcc.Parameters.AddWithValue("@empl_id", employee_id);
  IAsyncResult arAcc = cmdAcc.BeginExecuteReader();
  // at this point, the "employee_info" stored-proc is executing on
  // the server, and this thread is running at the same time
  // now open the second connection
  connHumanRes.Open();
  // start the execution of the second stored-proc against
  // the human-resources server
  SqlCommand cmdHumanRes = new SqlCommand("employee_hrinfo", 
                                          connHumanRes);
  cmdHumanRes.Parameters.AddWithValue("@empl_id", employee_id);
  IAsyncResult arHumanRes = cmdHumanRes.BeginExecuteReader();
  // now both queries are running at the same time
  // at this point; more work can be done from this thread, or we
  // can simply wait until both commands finish - in our case we'll
  // wait
  SqlDataReader drAcc = cmdAcc.EndExecuteReader(arAcc);
  SqlDataReader drHumanRes = cmdHumanRes.EndExecuteReader(arHumanRes);
  // now we can render the results, for example, bind the readers to an ASP.NET
  // web control, or scan the reader and draw the information in a 
  // WebForms form.
}

Note that here we simply call EndExecuteReader once; we don't have anything else to do until the database operation finishes. EndExecuteReader will block until the operation completes, and then return a SqlDataReader object.

For more sophisticated scenarios, where you need to wait for multiple activities and not all of them are asynchronous ADO.NET operations, you can use WaitHandle.WaitAll or WaitHandle.WaitAny; IAsyncResult.WaitHandle contains an event object that can be used for synchronization.

A particular application of this more sophisticated approach is out-of-order rendering. Let's say you have an ASP.NET page with multiple data sources. You can execute the multiple commands and then, as they finish, render the corresponding part of the page, while the other databases are still processing the other operations. This way we make progress whenever there is data available, no matter which one finishes first.

Here is the interesting part of an example using the Northwind database (the full example is included in the zip file that accompanies this article). Note that this is typically more useful when the operations go to different databases, or if your database server is powerful enough to process all the queries at the same time.

 // NOTE: connection strings denoted by "connstring" have to include 
 // "async=true", for example: 
 // "server=myserver;database=mydb;integrated security=true;async=true"
 // we'll use three connections for this
 using(SqlConnection c1 = new SqlConnection(connstring))
 using(SqlConnection c2 = new SqlConnection(connstring))
 using(SqlConnection c3 = new SqlConnection(connstring))
 {
  // get customer info
  c1.Open();
  SqlCommand cmd1 = new SqlCommand(
    "SELECT CustomerID, CompanyName, ContactName FROM Customers " +
    "WHERE CustomerID=@id", c1);
  cmd1.Parameters.Add("@id", SqlDbType.Char, 5).Value = custid;
  IAsyncResult arCustomer = cmd1.BeginExecuteReader();
  // get orders
  c2.Open();
  SqlCommand cmd2 = new SqlCommand(
    "SELECT * FROM Orders WHERE CustomerID=@id", c2);
  cmd2.Parameters.Add("@id", SqlDbType.Char, 5).Value = custid;
  IAsyncResult arOrders = cmd2.BeginExecuteReader();
  // get order detail if user picked an order
  IAsyncResult arDetails = null;
  SqlCommand cmd3 = null;
  if(null != orderid) {
   c3.Open();
   cmd3 = new SqlCommand(
      "SELECT * FROM [Order Details] WHERE OrderID=@id", c3);
   cmd3.Parameters.Add("@id", SqlDbType.Int).Value =         
                                                    int.Parse(orderid);
   arDetails = cmd3.BeginExecuteReader();
  }
  // build the wait handle array for WaitForMultipleObjects
  WaitHandle[] handles = new WaitHandle[null == arDetails ? 2 : 3];
  handles[0] = arCustomer.AsyncWaitHandle;
  handles[1] = arOrders.AsyncWaitHandle;
  if(null != arDetails)
   handles[2] = arDetails.AsyncWaitHandle;
  // wait for commands to complete and render page controls as we 
  // get data back
  SqlDataReader r;
  for(int results = (null==arDetails) ? 1 : 0; results < 3;results++) {
   // wait for any handle, then process results as they come
   int index = WaitHandle.WaitAny(handles, 5000, false); // 5 secs
   if(WaitHandle.WaitTimeout == index)
    throw new Exception("Timeout");
   switch(index) {
    case 0: // customer query is ready
     r = cmd1.EndExecuteReader(arCustomer);
     if (!r.Read())
      continue;
     lblCustomerID.Text = r.GetString(0);
     lblCompanyName.Text = r.GetString(1);
     lblContact.Text = r.GetString(2);
     r.Close();
     break;
    case 1: // orders query is ready
     r = cmd2.EndExecuteReader(arOrders);
     dgOrders.DataSource = r; // data-bind to the orders grid
     dgOrders.DataBind();
     r.Close();
     break;
    case 2: // details query is ready
     r = cmd3.EndExecuteReader(arDetails);
     dgDetails.DataSource = r; // data-bind to the details grid
     dgDetails.DataBind();
     r.Close();
     break;

    }
   }
  }
 }
}

Note that in the two examples shown above, a similar effect could have been achieved by using the classic synchronous ADO.NET and asynchronous delegates, thread-pool APIs such as QueueUserWorkItem, or user-created threads.In all cases, however, each command would have blocked a thread. While blocking threads is fine in some scenarios, such as client-side applications, it may compromise scalability in middle-tier and server-side applications. I'll go through some details about this in the next section.

Non-Blocking ASP.NET Handlers and Pages

Web servers typically use variants of thread-pooling to manage the threads that are used to process web-page requests (or any other kind of request, for that matter, such as Web service or HttpHandler invocations).

On database-driven websites with very high loads, the use of synchronous database APIs may result in the majority of the thread-pool threads being blocked while waiting for the database server to return results. In that case, the Web server will have almost no CPU usage and almost no network usage, yet it won't take new requests or it will have very few threads available to do so.

ASP.NET has a construct called "asynchronous HTTP handlers," which are classes that implement IHttpAsyncHandler and are associated with an ASP.NET file with extension "ashx". These classes can handle a request and produce a response in an asynchronous fashion. This feature integrates very well with asynchronous ADO.NET commands.

For more information on IHttpAsyncHandler see the IHttpAsyncHandler Interface topic on MSDN. Also, there's a good article on how to use it, Use Threads and Build Asynchronous Handlers in Your Server-Side Web Code, in MSDN Magazine (this one has cool drawings showing how the whole thing works under the covers; I wish I could come up with those drawings myself...).

The files asyncorders.cs and asyncorders.ashx included in the samples that go with this article are a simple but working sample of this technique. Below is the part in which we're interested.

public class AsyncOrders : IHttpAsyncHandler
{
 protected SqlCommand _cmd;
 protected HttpContext _context;
 // asynchronous execution support is split between 
 // BeginProcessRequest and EndProcessRequest

 public IAsyncResult BeginProcessRequest(HttpContext context,                  
                                         AsyncCallback cb, 
                                         object extraData) {
  // get the ID of the customers we need to list the orders for 
  // (it's in the query string)
  string customerId = context.Request["customerId"];
  if(null == customerId)
   throw new Exception("No customer ID specified");
  // obtain the connection string from the configuration file
  string connstring = 
                ConfigurationSettings.AppSettings["ConnectionString"];
  // connect to the database and kick-off the query
  SqlConnection conn = new SqlConnection(connstring);
  try {
   conn.Open();
   // we use an stored-procedure here, but this could be any statement
   _cmd = new SqlCommand("get_orders", conn);
   _cmd.CommandType = CommandType.StoredProcedure;
   _cmd.Parameters.AddWithValue("@ID", customerId);
   // begin execution of the command. This method will return post 
   // the query
   // to the database and return without waiting for the results
   // NOTE: we are passing to BeginExecuteReader the callback 
   // that ASP.NET passed to us; so ADO.NET will call cb directly 
   // once the first database results are ready. You can also use 
   // your own callback and invoke the ASP.NET one as appropiate
   IAsyncResult ar = _cmd.BeginExecuteReader(cb, extraData);
   // save the HttpContext to use it in EndProcessRequest
   _context = context;
   // we're returning ADO.NET's IAsyncResult directly. a more 
   // sophisticated application might need its own IAsyncResult 
   // implementation
   return ar;
  }

  catch {
   // only close the connection if we find a problem; otherwise, we'll
   // close it once we're done with the async handler
   conn.Close();
   throw;
  }
 }
 // ASP.NET will invoke this method when it detects that the async 
 // operation finished
 public void EndProcessRequest(IAsyncResult result) {
  try {
   // obtain the results from the database
   SqlDataReader reader = _cmd.EndExecuteReader(result);
   // render the page
   RenderResultsTable(_context, "Orders (async mode)", reader);
  }
  finally {
   // make sure we close the connection before returning from 
   // this method
   _cmd.Connection.Close();
   _cmd = null;
  }
 }
 // rest of AsyncOrders members
 // ...
}

In the sample above we can see how the HTTP handler process inputs parameters, kicks-off the database query, and then returns from the BeginProcessRequest method. By returning from the method we're giving the control back to ASP.NET, which is now free to reuse this thread to process another request while the database server handles our query. Once we're done, the signaling mechanism will cause EndProcessRequest to be called, and we'll complete the rendering of our page. Note that EndProcessRequest may be called on the same or a different thread.

In scenarios where you need multiple queries to render a page, you can send them together as a single batch—if they are all against the same database—or you can use multiple connections and begin multiple asynchronous command execution. In the latter case, additional code is required to coordinate completion of the commands, and to notify ASP.NET once all the commands have finished.

Keeping WinForms Applications Responsive

If you have long-running database operations that you execute from a WinForms application, you have probably already noticed that the application freezes when you perform those operations. This is because the event handler is blocked on the call to the database, and it can't handle other Windows messages in the meantime.

It might be tempting to use asynchronous command execution to work-around this problem. After all, it's exactly what we need: a way to execute a command and immediately return from the event handler. WinForms applications are trickier in this case, however: you cannot touch WinForms controls from a thread other than the thread on which they were created; this means that you cannot start an ADO.NET asynchronous operation, and in your callback you cannot refresh your controls with new data or perform data-binding operations. You would have to marshal the data back to the UI thread and do the updates there. To make things more complicated, in many cases you'll need many queries to fill up your UI, and some of them might depend on the results of a previous one. This means that you'd have to coordinate multiple asynchronous queries, then marshal all the results to the UI thread, and then refresh the UI controls.

A much simpler approach is to use the new BackgroundWorker class that will be included with .NET 2.0. Using this class, you can do traditional, synchronous database operations without blocking the UI threads. For more information on the BackgroundWorker class, check out the documentation for the .NET Framework 2.0 when that becomes available.

You can still use asynchronous commands if, after analyzing the options, you think it is still the way to go, and you have all the required considerations. We designed the feature more towards the scenarios where it was important to avoid blocking threads; in general, this is not a big deal for client-side applications, so the simplicity of BackgroundWorker makes it worth the shot.

Some Details to Keep in Mind

Here are a few considerations that you'll need to keep in mind when using asynchronous command execution.

Error Handling

Errors can happen at any point during the execution of a command. When ADO.NET can detect errors before initiating the actual database operation, it will throw an exception from the begin method; this is very similar to the synchronous case in which you get the exceptions from the call to ExecuteReader or similar methods directly. This includes invalid parameters, bad state of related objects ( no connection set for the command, for example), or some connectivity issues ( the server or the network is down, for example).

Now, once we send the operation to the server and return it to the user, we don't have a way to let you know if something goes wrong in the exact moment where we detect it. We cannot just throw an exception; there is no user code above us in the stack when we're doing intermediate processing, so you wouldn't be able to catch an exception if we threw one. What we do here is keep the error information around, and signal the operation as complete. Later on, when your code calls the end method, we detect that there was an error during processing and we throw an exception.

The bottom line is that you need to be prepared to handle errors both in the begin and the end methods.

Blocking Operations

Even when using asynchronous execution, there are a number of scenarios in which we might block for I/O. Here is a non-exhaustive list of possibly blocking calls that are specific to SqlClient:

  • Begin methods: if we have to send a large number of parameters or a very long SQL statement, we might block on network write.
  • SqlDataReader.Read: if the application reads data faster than the server can produce it and the network can transport it to the client, Read() might block on network read.
  • SqlDataReader.Get*: the datareader has Get methods for each data type, in its CLR and SQL type systems (for example, GetString and GetSqlString, respectively); if CommandBehavior.SequentialAccess is used on the begin call, then these Get methods might block on network read.
  • SqlDataReader.Close() and SqlDataReader.Dispose(): these methods might block if there are pending rows that haven't been consumed, or if the output parameters haven't yet been fetched from the network.

Canceling Pending Commands

The command object has a Cancel() method that can be used to cancel an executing command.

Versions

One of the things I really love about this feature is that it works on all the SQL Server versions supported by SqlClient; this includes SQL Server 7.0, SQL Server 2000, and the new SQL Server 2005.

There is one caveat, though: on pre-SQL Server 2005 servers, we don't support asynchronous execution over shared memory. So if you have the server and the client on the same computer and want to use asynchronous commands, use localhost as the server name, or add tcp: to the server name to force the provider to use TCP/IP instead of shared memory.

Additionally, this feature depends heavily on certain features that are only present in Windows NT-based operating systems, including Windows 2000, Windows XP, Windows 2003 Server, and future versions of Windows. Asynchronous execution is not supported in Windows 9x and Windows Me.

Note   In order to try out the examples, you'll need the .NET Framework 2.0.

Conclusion

Asynchronous command execution is a powerful extension to ADO.NET. It enables new high-scalability scenarios at the cost of some extra complexity.

Just like any other complex feature, I would recommend using asynchronous command execution only when it is really needed, as it will add extra code to your application that future owners will have to understand and maintain.

I hope you find this new feature useful, and if you have questions, comments, or feedback, please send them to the ADO.NET newsgroups. Several members of the ADO.NET team, including myself, check out the newsgroups almost every day. The newsgroup server is msnews.microsoft.com, and the ADO.NET newsgroup is microsoft.public.dotnet.framework.adonet.

© Microsoft Corporation. All rights reserved.