Why Does a Connection Pool Overflow?

11/03/2006

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

Features
The .NET Connection Pool Lifeguard
Prevent pool overflows that can drown your applications

William Vaughn

Most ADO.NET data providers use connection pooling to improve the performance of applications built around Microsoft's disconnected .NET architecture. An application opens a connection (or gets a connection handle from the pool), runs one or more queries, processes the rowset, and releases the connection back to the pool. Without connection pooling, these applications would spend a lot of additional time opening and closing connections.

When you use ADO.NET connection pooling to manage connections for your Web-based applications and client/server Web service applications, your customers will usually get faster connections and better overall performance. But what happens when your application or Web site is suddenly flooded with customers who all want to connect at the same time? Will your applications sink or swim? Like a lifeguard, you need to monitor your connection pools carefully to maintain good performance and to prevent your pools from overflowing. Let's explore the reasons a connection pool might overflow, then see how you can write code or use Windows Performance Monitor to monitor your pools.

As I discussed in "Swimming in the .NET Connection Pool," May 2003, InstantDoc ID 38356, you need to know about many scalability and performance details when you use connection pooling. Remember that you need to monitor and manage two essential factors: the number of connections each pool manages and the number of connection pools. In an efficient production system, typically the number of pools is low (1 to 10) and the total number of connections in use is also low (fewer than 12). An efficient query takes less than a second to complete and disconnect. So even if hundreds of customers are accessing your Web site simultaneously, relatively few connections can often handle the entire load. To make your applications run efficiently, you must keep connection resources under control and monitor your pools' status so that you'll have some warning before they overflow and your customers start to complain—or go elsewhere.

Email discussion group participants often complain about how applications seem to work in testing but fail in production. Sometimes they report that their applications stop or bog down when about 100 clients get connected. Remember that the default number of connections in a pool is 100. If you try to open more than 100 connections from a pool, ADO.NET queues your application's connection request until a connection is free. The application (and its users) sees this as a delay in getting to the Web page or as an application lock-up. Let's look at how this problem arises.

In ADO.NET, the SqlClient .NET Data Provider gives you two techniques for opening and managing connections. First, when you need to manage the connection manually, you can use the DataReader object. With this method, your code constructs a SqlConnection object, sets the ConnectionString property, and uses the Open method to open a connection. After the code is finished with the DataReader, you close the SqlConnection before the SqlConnection object falls out of scope. To process the rowset, you can pass the DataReader to another routine in your application, but you still need to make sure that the DataReader and its connection are closed. If you don't close the SqlConnection, your code "leaks" a connection with each operation, so the pool accumulates connections and eventually overflows. Unlike in ADO and Visual Basic (VB) 6.0, the .NET garbage collector won't close the SqlConnection and clean up for you. Listing 1, which I walk through later, shows how I opened a connection and generated a DataReader to return the rowset from a simple query to stress the connection pool.

You can also run into problems when you use the DataAdapter object. The DataAdapter Fill and Update methods automatically open the DataAdapter object's connection and close it after the data I/O operation is complete. However, if the connection is already open when the Fill or Update method is executed, ADO.NET doesn't close the SqlConnection after the method completes. This is another opportunity to leak a connection.

In addition, you can also use COM-based ADO to create a connection from a .NET application. ADO pools these connections in the same way that ADO.NET does but doesn't give you a way to monitor the pool from your application as you can when you use the SqlClient ADO.NET Data Provider.

Indicting the DataReader

Orphaned connections and overflowing pools are serious problems, and judging by the number of newsgroup discussions about them, they're fairly common. The most likely culprit is the DataReader. To test the behavior of the DataReader, I wrote a sample Windows Forms (WinForms) application concentrating on the CommandBehavior.CloseConnection option. (You can download this application by entering InstantDoc ID 39031 at http://www.sqlmag.com.) You can set this option when you use the SqlCommand object's ExecuteReader method to execute the query and return a DataReader. My test application shows that even when you use this option, if you don't explicitly close the DataReader (or SqlConnection), the pool overflows. The application then throws an exception when the code requests more connections than the pool will hold.

Some developers insist that if you set the CommandBehavior.CloseConnection option, the DataReader and its associated connection close automatically when the DataReader finishes reading the data. Those developers are partially right—but the option works this way only when you're using a complex bound control in an ASP.NET Web application. Looping through a DataReader result set to the end of its rowset (that is, when Dr.Read—the DataReader's Read method—returns false) isn't enough to trigger automatic connection closing. However, if you bind to a complex bound control such as the DataGrid, the control closes the DataReader and the connection—but only if you've set the CommandBehavior.CloseConnection option.

If you execute a query by using another Execute method (e.g., ExecuteScalar, ExecuteNonQuery, ExecuteXMLReader), you are responsible for opening the SqlConnection object and, more importantly, closing it when the query finishes. If you miss a close, orphaned connections quickly accumulate.

Monitoring the Number of Connections

To test for orphaned connections and overflowing connection pools, I wrote a sample Web-form application. This application uses the same techniques you would typically use to return data from a query. (You can download a WinForms version of this code at http://www.sqlmag.com.)

I used the code in Listing 1 to open and close connections to the Web-form application. The routine at callout A creates, opens, and executes queries against 110 new SqlConnection objects—10 more than the default pool size. You must close and discard all these connections before leaving the routine. If you don't, the SqlConnection objects are orphaned along with the associated pooled connections. The ADO.NET pooling mechanism (aka the pooler) closes the database connections, but it doesn't close the pooled connections. I set the connection pool size to 10 to make the program fail faster—if it's going to. Typically, 10 connections are plenty for a query that runs as quickly as this one does. Many developers run busy Web sites that use fewer than five connections to handle hundreds of thousands of hits a day.

The routine at callout A creates SqlConnection and SqlCommand objects, sets the CommandText, and opens the connection. Then, the code at callout B determines whether to use CommandBehavior.CloseConnection when executing the DataReader, depending on which CheckBox controls the user selected on the Web form.

In callout C's code, I specify whether to bind the DataReader rowset to a DataGrid or loop through the rowset. Callout C's code tests what happens when you reach the end of the rowset that was passed back from the data provider through the DataReader.

Now, I use the code at callout D to specify whether to manually close the connection or let some other operation such as data binding do it. Frankly, it's usually safest to close the connection manually so that you're certain the connection won't be orphaned.

If the code works up to this point, I successfully opened and closed 110 connections. However, if something goes wrong, the exception handlers in the code at callout E will trap the exception (typically a Timeout) as an InvalidOperationException, which is how ADO.NET responds when the connection pool is full.

Table 1 summarizes how various options permit the routine to work or fail. Notice that if you don't set the CommandBehavior.CloseConnection option, your operations will eventually fail—even when you're using a bound control. The process fails even when you use that option if you don't either use a complex bound control or close the SqlDataAdapter or SqlConnection manually.

When I finished working with these sample applications, I'd generated more than 1000 pooled connections—all orphaned. Although the SQL Server User Connections count was 0, about 40 connection pools were left behind. The orphaned pools didn't disappear until I rebooted the system.

The sample applications that I used for this test include routines that use the DataAdapter to return rows. Unless you manually manage connections, the DataAdapter properly opens and closes the SqlConnection object, so you're not likely to encounter orphaned pool connections. However, if your application uses both a DataReader and a DataAdapter, you might find that the DataAdapter can't run a query against a connection if the connection is associated with an unclosed DataReader.

Determining When a Pool Is Full

As I discussed in "Swimming in the .NET Connection Pool," when the connection pool reaches the maximum number of connections you specify with the Max Pool Size ConnectionString option, ADO.NET blocks any subsequent attempt to open an additional connection. If a connection becomes available before the time you specify in the ConnectionTimeout option, the .NET Data Provider passes your application a pointer to that connection, returning control to your application. However, if no connection frees up in time, the connection request trips an InvalidOperationException.

Now you have to decide what to do; I don't suggest telling your users that you're out of connections. Some applications tell users that the system is busy helping other customers and encourage the user to return later. Other applications entertain users with an animation that informs them that the system hasn't locked up but is busy working on their requests. In the meantime, your code retries the operation. In all cases, you should log these failures to help diagnose what's going wrong and document that you've run out of resources.

Monitoring the Pool

So you've opened and closed a connection, and now you want to know whether the connection is still in place. Several techniques let you determine how many connections are still open and what they're doing:

Run sp_who or sp_who2. These system stored procedures return information from the sysprocesses system table that shows the status of and information about all working processes. Generally, you'll see one server process ID (SPID) per connection. If you named your connection by using the Application Name argument in the connection string, your working connections will be easy to find.
Use SQL Server Profiler with the SQLProfiler TSQL_Replay template to trace open connections. If you're familiar with Profiler, this method is easier than polling by using sp_who.
Use the Performance Monitor to monitor the pools and connections. I discuss this method in a moment.
Monitor performance counters in code. You can monitor the health of your connection pool and the number of established connections by using routines to extract the counters or by using the new .NET PerformanceCounter controls. Both techniques are included in the sample applications that you can download at http://www.sqlmag.com.

Let's look at how to find connection-pool counters and how to use these monitoring methods.

Where are the connection-pool counters? To monitor the connection-pool counters, you have to watch the system where ADO.NET is creating and incrementing them. ADO.NET doesn't always create a pool on the Microsoft IIS server or SQL Server if you're connecting from a remote system; it creates pools on the system where the ADO.NET code runs. This system could be a remote Windows or middle-tier system running IIS, a Web application, or a Web service. In contrast, the SQL Server performance counters are on the SQL Server system—not on the client.

Using Performance Monitor to monitor pools. If you use the Microsoft Management Console (MMC) Windows 2000 System Monitor snap-in, you can graph the SqlClient counters by selecting .NET CLR Data from the Performance object drop-down list, as Figure 1 shows. Note that you can monitor all processes by selecting the _global_ counter instances, or you can look at a specific instance—each pool generates its own set of counters. Performance Monitor lists these counters and exposes them as instances of the selected performance object. But Performance Monitor doesn't expose these counters unless there are instances to monitor. For example, Figure 1 shows the .NET CLR Data performance object, but no specific instances are listed. This means you have to create at least one connection to get the _global_ instance to appear along with a specific instance for each process. This behavior is a problem for your code; you won't be able to use the PerformanceCounter control to return any of these counters until ADO.NET creates them when it opens the connection. So, it's a bit of a catch-22. The lack of a valid counter instance will cause exceptions when you use this method—just be ready to trap them.

You can also monitor the number of open connections by using the SQL Server performance counter User Connections. This counter is listed under SQL Server: General Statistics in the Performance object dropdown list. I like to monitor the User Connections value and some selected .NET CLR Data SqlClient counters (which I describe in a moment) because I can get the information I want without worrying about instances.

Using code to monitor performance counters. When you need to monitor the connection pool programmatically, you can write code to monitor the SqlClient-managed performance counters—the same counters that the MMC Windows NT Performance Monitor snap-in exposes. Writing the code that performs the monitoring is somewhat daunting. But I've included a snapshot of the routine that extracts these counters from the inner workings of the SqlClient provider as one of the downloadable programs included with this article.

You can write code that examines the five counters that Table 2 shows. These five counters let you monitor the connection pool in realtime. .NET expects you to provide a category—dubbed Performance Object in the Performance Monitor—and select the appropriate counters from those registered with the system. To access the SqlClient counters, set the category to .NET CLR Data.

Using PerformanceCounter controls. You might find adding a PerformanceCounter to your application form at design time easier than hand-coding access to the performance counters. To do so, select a PerformanceCounter from the Visual Studio .NET Toolbox Components menu, drag it to your application form, and set the properties as Figure 2 shows. These controls work in Web forms and in WinForms applications.

Because the PerformanceCounter control exposes convenient drop-down lists, you can see any of the performance-counter categories, counter names, and specific instances at design time—except the instance you're about to run. This means you'll have to use the technique that Figure 2 shows to capture the appropriate instance of the pool your application is using. To bypass this problem, I select the _global_ instance. Again, this technique assumes that some application has already created at least one pool, so you need to be prepared for ADO.NET to throw an exception when no counter instances exist, just as it does when no pooled connections exist.

Beware of inaccurate pool counts. Because of a bug in the SqlClient .NET Data Provider that wasn't fixed in .NET Framework 1.1, the performance counters incorrectly indicate that pools remain "alive" when in fact they've been deleted. I was able to verify that the pools no longer existed by ending the MMC Performance Monitor snap-in, then ending Visual Studio .NET. These steps showed that the .NET Data Provider properly deleted connection pools when the process that created them ended. Obviously, this inaccuracy reduces the usefulness of the performance counters for monitoring the pools, but I hope that Microsoft will resolve this problem in the future.

What the Counters Don't Show

One problem you might face is that the configuration of each pool isn't available from the counters or SqlClient properties. Each SqlConnection object's ConnectionString holds the key to these pool settings. Because you can't depend on the default settings, it's tough to determine whether the pool is almost full or hardly being used. This would be another handy feature of a future version of ADO.NET.

However, assuming you know the values for the various connection pool ConnectionString arguments, the code in Listing 1 makes it simple to set up a timer to check a specific pool you created and report when it's x percent full. Then, a monitoring application could alert you so that you could resolve the problem and prevent an overflow.

Finally, remember that ADO.NET performs differently from COM-based ADO. Visual Basic .NET radically changes the way objects are discarded and no longer guarantees that a Connection object will be closed when it falls out of scope. Be sure that the SqlConnection object (or any Connection object) is closed before it falls out of scope.

Connection pooling is a powerful functionality that can improve your applications' performance. But if you aren't a good lifeguard, your connection pools can become a detriment instead of a benefit. I hope the techniques in this article help you effectively monitor your pools and keep your users happy.

Bugs, comments, suggestions | Legal | Privacy | Advertising

Share via