Printer Friendly Version      Send     
Click to Rate and Give Feedback
MSDN
MSDN Library
.NET Development
Performance
 Chapter 12 - Improving ADO.NET Perf...
Prescriptive Architecture
Chapter 12 - Improving ADO.NET Performance
 

Patterns and Practices home

Improving .NET Application Performance and Scalability

J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman
Microsoft Corporation

May 2004

Related Links

Home Page for Improving .NET Application Performance and Scalability

Chapter 5, Improving Managed Code Performance

Chapter 14, Improving SQL Server Performance

Checklist: ADO.NET Performance

Send feedback to Scale@microsoft.com

patterns & practices Library

Summary: This chapter provides proven strategies to help you design and develop scalable data access solutions. Topics covered include different techniques to pass data across application layers, managing the database connection pool, optimizing stored procedure calls, reducing dataset serialization cost, techniques for paging through large result sets, managing transactions, handling BLOBS, and much more.

Contents

Objectives
Overview
How to Use This Chapter
Architecture
Performance and Scalability Issues
Design Considerations
Implementation Considerations
.NET Framework Data Providers
Connections
Commands
Stored Procedures
Parameters
DataSet vs. DataReader
DataReader
DataSet
XML and DataSet Objects
Typed DataSets
Types
Exception Management
Transactions
Binary Large Objects
Paging Records
Analyzing Performance and Scalability of Data Access
Summary
Additional Resources

Objectives

  • Optimize your data access design.
  • Choose between DataSets and DataReaders.
  • Run efficient database commands.
  • Pass data between layers efficiently.
  • Perform efficient transactions.
  • Optimize connection management.
  • Evaluate the cost of paging through records.
  • Evaluate criteria for analyzing data access performance.
  • Apply performance considerations to binary large object (BLOB) manipulation.

Overview

Well-designed data access code and data processing commands are essential elements for application performance and scalability. Typically, the database is a focal point for application load because the majority of application requests require data that comes from a database.

This chapter provides proven strategies for designing and implementing data access code for performance and scalability.

How to Use This Chapter

Use this chapter to improve the implementation of your data access code for performance and scalability. To get the most out of this chapter, consider the following:

  • Jump to topics or read beginning to end. The main headings in this chapter help you to quickly identify and then locate the topic that interests you. Alternatively, you can read the chapter beginning to end to gain a thorough appreciation of the issues that affect ADO.NET performance.
  • Use the checklist. Use "Checklist: ADO.NET Performance" in the "Checklists" section of this guide to quickly view and evaluate the guidelines presented in this chapter.
  • Use the "Architecture" section of this chapter to understand how ADO.NET works. By understanding the architecture, you can make better design and implementation choices. Understand core ADO.NET components, such as data provider objects and the DataSet object.
  • Use the "Design Considerations" section of this chapter to understand the high-level decisions that will affect implementation choices for ADO.NET code.
  • Read Chapter 13, "Code Review: .NET Application Performance" See the "Data Access" section for specific guidance.
  • Measure your application performance. Read the "ADO.NET/Data Access" and ".NET Framework Technologies" sections of Chapter 15, "Measuring .NET Application Performance" to learn about the key metrics that you can use to measure application performance. You have to measure application performance so that you can identify and resolve performance issues.
  • Test your application performance. Read Chapter 16, "Testing .NET Application Performance" to learn how to apply performance testing to your application. You have to apply a coherent testing process and analyze the results.
  • Tune your application performance. Read the "ADO.NET Tuning" section of Chapter 17, "Tuning .NET Application Performance" to learn how to resolve performance issues that you identify through the use of tuning metrics.
  • Tune SQL Server. Read Chapter 14, "Improving SQL Server Performance" to ensure that your Microsoft® SQL Server™ database is appropriately configured.

Architecture

ADO.NET relies on data providers to provide access to the underlying data source. Each data provider exposes a set of objects that you use to manage connections, retrieve data, and update data. The core objects are the following:

  • Connection
  • Command
  • DataReader
  • DataAdapter

In addition, ADO.NET provides the DataSet object, which provides a disconnected cache of data. The DataSet object does not require a specific type of data source and is not tied to the underlying data source that the data was obtained from.

The basic ADO.NET architecture is shown in Figure 12.1.

Figure 12.1: ADO.NET architecture

The following list outlines the purpose of each of the main ADO.NET objects:

  • Connection. This object represents a connection to a database.
  • Command. This object represents an SQL statement that is run while connected to a data source. This object can be a stored procedure or a direct SQL statement.
  • DataReader. This object retrieves a read-only, forward-only stream of data from a database. The DataReader object is designed for connected scenarios and offers better performance than reading data into a DataSet object at the expense of functionality. For more information about how to use DataReader objects and DataSet objects, see "DataSet vs. DataReader" later in this chapter.
  • DataAdapter. This object channels data to and from a DataSet object and the underlying data source. The DataAdapter object also provides enhanced batch update features that were previously associated with the ADO Recordset object.
  • DataSet. The DataSet object represents a disconnected, cached set of data. The DataSet is independent of the provider and is not tied to the underlying data source that might have been used to populate it. DataSet can easily be passed from component to component through the various layers of an application, and it can be serialized as XML.

    You should be aware of the way a DataSet is internally constructed because the DataSet contains a potentially large number of internal objects. This means that a large number of memory allocations are required to construct a typical DataSet.

    A DataSet consists of one or more DataTable objects together with DataRelation objects that maintain table relationship information. Each DataTable contains DataRow objects and DataColumn objects. Constraint objects are used to represent a constraint that can be enforced on one or more DataColumn objects.

    Note   You can also use typed datasets that derive from the basic DataSet class. Typed datasets provide benefits at build time and at run time. For more information, see "Typed DataSets" later in this chapter.
  • DataView. Although the DataView object is not shown in Figure 12.1, you can use a DataView to sort and filter data in a DataTable. This capability is often used for data binding.

Abstracting Data Access

ADO.NET is designed around a set of generic interfaces that abstract the underlying data processing functionality. You can use these interfaces directly to abstract your data access layer so that you can minimize the impact of changing the type of data source that you use. Abstracting data access is extremely helpful when you are designing systems where your customer chooses the database server.

The core interfaces provided by ADO.NET are found in the System.Data namespace:

  • IDbConnection. This is an interface for managing database connections.
  • IDbCommand. This is an interface for running SQL commands.
  • IDbTransaction. This is an interface for managing transactions.
  • IDataReader. This is an interface for reading data returned by a command.
  • IDataAdapter. This is an interface for channeling data to and from datasets.

The various provider objects, such as SqlConnection and OleDbConnection, implement these generic ADO.NET data access interfaces. If you decide to program against the generic interfaces, be aware of the following issues:

  • There is some small cost associated with a virtual call through an interface.
  • Certain expanded functionality is lost when you use the generic interfaces. For example, the ExecuteXmlReader method is implemented by the SqlCommand object but not by the IDbCommand interface.
  • There is no generic base exception type, so you must catch provider-specific exception types, such as SqlException, OleDbException, or OdbcException.
  • When you use the generic interfaces, you cannot take advantage of database-specific types that are defined for the managed providers; for example, you cannot take advantage of SqlDbType in SqlClient and Oracle-specific types in the Oracle provider. Using specific database types is helpful for type checking and parameter binding.

More Information

For more information about how to use the generic interfaces to abstract your data access, see the following resources:

  • Knowledge Base article 313304, "HOW TO: Use Base Classes to Reduce Code Forking with Managed Providers in Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;313304. This article includes sample code.
  • Use MSDN® to look at the interfaces that are described earlier in this section to identify the providers that implement each of the interfaces.

Performance and Scalability Issues

The following is a list of the main issues that can adversely affect the performance and scalability of data access in your application.

  • Inefficient queries. Queries that process and then return more columns or rows than necessary waste processing cycles that could best be used for servicing other requests. Queries that do not take advantage of indexes may also cause poor performance.
  • Retrieving too much data. Too much data in your results is usually the result of inefficient queries. The SELECT * query often causes this problem. You do not usually need to return all the columns in a row. Also, analyze the WHERE clause in your queries to ensure that you are not returning too many rows. Try to make the WHERE clause as specific as possible to ensure that the least number of rows are returned.
  • Inefficient or missing indexes. Query efficiency decreases when indexes are missing because a full table scan must be performed. Also, as your data grows, tables may become fragmented. Failure to periodically rebuild indexes may also result in poor query performance.
  • Unnecessary round trips. Round trips significantly affect performance. They are subject to network latency and to downstream server latency. Many data-driven Web sites heavily access the database for every user request. While connection pooling helps, the increased network traffic and processing load on the database server can adversely affect performance. Keep round trips to an absolute minimum.
  • Too many open connections. Connections are an expensive and scarce resource, which should be shared between callers by using connection pooling. Opening a connection for each caller limits scalability. To ensure the efficient use of connection pooling, avoid keeping connections open and avoid varying connection strings.
  • Failure to release resources. Failing to release resources can prevent them from being reused efficiently. If you fail to close connections before the connections fall out of scope, they are not reclaimed until garbage collection occurs for the connection. Failing to release resources can cause serious resource pressure and lead to shortages and timeouts.
  • Transaction misuse. If you select the wrong type of transaction management, you may add latency to each operation. Additionally, if you keep transactions active for long periods of time, the active transactions may cause resource pressure. Transactions are necessary to ensure the integrity of your data, but you need to ensure that you use the appropriate type of transaction for the shortest duration possible and only where necessary.
  • Overnormalized tables. Overnormalized tables may require excessive joins for simple operations. These additional steps may significantly affect the performance and scalability of your application, especially as the number of users and requests increases.

Subsequent sections in this chapter provide strategies and technical information to prevent or resolve each of these issues.

Design Considerations

To help ensure that data access in your application is optimized for performance, there are several issues that you must consider and a number of decisions that you must make at design time:

  • Design your data access layer based on how the data is used.
  • Cache data to avoid unnecessary work.
  • Connect by using service accounts.
  • Acquire late, release early.
  • Close disposable resources.
  • Reduce round trips.
  • Return only the data you need.
  • Use Windows authentication.
  • Choose the appropriate transaction type.
  • Use stored procedures.
  • Prioritize performance, maintainability, and productivity when you choose how to pass data across layers.
  • Consider how to handle exceptions.
  • Use appropriate normalization.

Design Your Data Access Layer Based on How the Data Is Used

If you choose to access tables directly from your application without an intermediate data access layer, you may improve the performance of your application at the expense of maintainability. The data access logic layer provides a level of abstraction from the underlying data store. A well-designed data access layer exposes data and functionality based on how the data is used and abstracts the underlying data store complexity.

Do not arbitrarily map objects to tables and columns, and avoid deep object hierarchies. For example, if you want to display a subset of data, and your design retrieves an entire object graph instead of the necessary portions, there is unnecessary object creation overhead. Evaluate the data you need and how you want to use the data against your underlying data store.

Cache Data to Avoid Unnecessary Work

Caching data can substantially reduce the load on your database server. By caching data, you avoid the overhead of connecting to your database, searching, processing, and transmitting data from your database server. By caching data, you directly improve performance and scalability in your application.

When you define your caching strategy, consider the following:

  • Is the data used application-wide and shared by all users, or is the data specific to each user? Data that is used across the application, such as a list of products, is a better caching candidate than data that is specific to each user.
  • How frequently do you need to update the cache? Even though the source data may change frequently, your application may not need to update the cache as often. If your data changes too frequently, it may not be a good caching candidate. You need to evaluate the expense of updating the cache compared to the cost of fetching the data as needed.
  • Where should you cache data? You can cache data throughout the application layers. By caching data as close as possible to the consumer of the data, you can reduce the impact of network latency.
  • What form of the data should you cache? The best form of data to cache is usually determined by the form that your clients require the data to be in. Try to reduce the number of times that you need to transform data.
  • How do you expire items in the cache? Consider the mechanism that you will use to expire old items in the cache and the best expiration time for your application.

Connect by Using Service Accounts

There are several ways to authenticate and to open a connection to the database. You can use SQL authentication that has an identity specified in the connection string. Or, you can use Windows authentication by using the process identity, by using a specific service identity, or by impersonating the original caller's identity.

From a security perspective, you should use Windows authentication. From a performance perspective, you should use a fixed service account and avoid impersonation. The fixed service account is typically the process account of the application. By using a fixed service account and a consistent connection string, you help ensure that database connections are pooled efficiently. You also help ensure that the database connections are shared by multiple clients. Using a fixed service account and a consistent connection string is a major factor in helping application scalability.

More Information

For more information, see "The Trusted Subsystem Model" in the "Authentication and Authorization" chapter of "Building Secure ASP.NET Applications: Authentication, Authorization and Secure Communication" on MSDN at http://msdn.microsoft.com/library/en-us/dnnetsec/html/SecNetch03.asp. This chapter explains how to use service accounts or process identity to connect to a database. You can also use the chapter to learn about the advantages and disadvantages of Windows and SQL authentication.

Acquire Late, Release Early

Your application should share expensive resources efficiently by acquiring the resources late, and then releasing them as early as possible. To do so:

  • Open database connections right when you need them. Close the database connections as soon as you are finished. Do not open them early, and do not hold them open across calls.
  • Acquire locks late, and release them early.

Close Disposable Resources

Usually, disposable resources are represented by objects that provide a Dispose method or a Close method. Make sure that you call one of these methods as soon as you are finished with the resource. For more information about closing Connection objects and DataReader objects, see "Connections" later in this chapter.

Reduce Round Trips

Network round trips are expensive and affect performance. Minimize round trips by using the following techniques:

  • If possible, batch SQL statements together. Failure to batch work creates additional and often unnecessary trips to the database. You can batch text SQL statements by separating them with a semicolon or by using a stored procedure. If you need to read multiple result sets, use the NextResult method of the DataReader object to access subsequent result sets.
  • Use connection pooling to help avoid extra round trips. By reusing connections from a connection pool, you avoid the round trips that are associated with connection establishment and authentication. For more information, see "Connections" later in this chapter.
  • Do not return results if you do not need them. If you only need to retrieve a single value, use the ExecuteScalar method to avoid the operations that are required to create a result set. You can also use the ExecuteNonQuery method when you perform data definition language (DDL) operations such as the create table operation. This also avoids the expense of creating a result set.
  • Use caching to bring nearly static data close to the consumer instead of performing round trips for each request.

Implicit Round Trips

Be aware that certain operations can cause implicit round trips. Typically, any operation that extracts metadata from the database causes an implicit round trip. For example, avoid calling DeriveParameters if you know the parameter information in advance. It is more efficient to fill the parameters collection by setting the information explicitly. The following code sample illustrates a call that causes an implicit round trip.

// This causes an implicit round trip to the database
SqlCommandBuilder.DeriveParameters(cmd);

Return Only the Data You Need

Evaluate the data that your application actually requires. Minimize the data that is sent over the network to minimize bandwidth consumption. The following approaches help reduce data over the network:

  • Return only the columns and rows that you need.
  • Cache data where possible.
  • Provide data paging for large results. For more information about paging, see "Paging Records" later in this chapter.

Use Windows Authentication

From a security perspective, you should use Windows authentication to connect to Microsoft SQL Server. There are several advantages to using Windows authentication. For example, credentials are not passed over the network, database connection strings do not contain credentials, and you can apply standard Windows security policies to accounts. For example, you can enforce use of strong passwords and apply password expiration periods.

From a performance perspective, SQL authentication is slightly faster than Windows authentication, but connection pooling helps minimize the difference. You also need to help protect the credentials in the connection string and in transit between your application and the database. Helping to protect the credentials adds to the overhead and minimizes the performance difference.

Note   Generally, local accounts are faster than domain accounts when you use Windows authentication. However, the performance saving needs to be balanced with the administration benefits of using domain accounts.

Choose the Appropriate Transaction Type

Proper transaction management minimizes locking and contention, and provides data integrity. The three transaction types and their usage considerations include the following:

  • Native database support. Native database support for transactions permits you to control the transaction from a stored procedure. In SQL Server, use BEGIN TRAN, COMMIT TRAN, and ROLLBACK to control the transaction's outcome. This type of transaction is limited to a single call from your code to the database, although the SQL query or stored procedure can make use of multiple stored procedures.
  • ADO.NET transactions. Manual transactions in ADO.NET enable you to span a transaction across multiple calls to a single data store. Both the SQL Server .NET Data Provider and the OLE DB .NET Data Provider implement the IDbTransaction interface and expose BeginTransaction on their respective connection object. This permits you to begin a transaction and to run multiple SQL calls using that connection instance and control the transaction outcome from your data access code.
  • Enterprise Services distributed transactions. Use declarative, distributed transactions when you need transactions to span multiple data stores or resource managers or where you need to flow transaction context between components. Also consider Enterprise Services transaction support for compensating transactions that permit you to enlist nontransactional resources in a transaction. For example, you can use the Compensating Resource Manager to combine a file system update and a database update into a single atomic transaction.

    Enterprise Services distributed transactions use the services of the Microsoft Distributed Transaction Coordinator (DTC). The DTC introduces additional performance overhead. The DTC requires several round trips to the server and performs complex interactions to complete a transaction.

Use Stored Procedures

Avoid embedded SQL statements. Generally, well-designed stored procedures outperform embedded SQL statements. However, performance is not the only consideration. When you choose whether to store your SQL commands on the server by using stored procedures or to embed commands in your application by using embedded SQL statements, consider the following issues:

  • Logic separation. When you design your data access strategy, separate business logic from data manipulation logic for performance, maintainability, and flexibility benefits. Validate business rules before you send the data to the database to help reduce network round trips. Separate your business logic from data manipulation logic to isolate the impact of database changes or business rule changes. Use stored procedures to clarify the separation by moving the data manipulation logic away from the business logic so that the two do not become intertwined.
  • SQL optimizations. Some databases provide optimizations to stored procedures that do not apply to dynamic SQL. For example, Microsoft SQL Server™ versions prior to SQL Server 2000 kept a cached execution plan for stored procedures. The cached execution plan for stored procedures reduced the need to compile each stored procedure request. SQL Server 2000 is optimized to cache query plans for both stored procedure and for dynamic SQL query plans.
  • Tuning/deployment. Stored procedure code is stored in the database and permits database administrators to review data access code. Database administrators can tune both the stored procedures and the database, independent of the deployed application. The application does not always need to be redeployed when stored procedures change.

    Embedded SQL is deployed as part of the application code and requires database administrators to profile the application to identify the SQL actually used. Profiling the application complicates tuning, because the application must be redeployed if any changes are made.

  • Network traffic sent to the server. Source code for stored procedures is stored on the server. Only the name and parameters are sent across the network to the server. Conversely, when you use embedded SQL, the full source of the commands must be transmitted each time the commands are run. When you use stored procedures, you can reduce the amount of data that is sent to the server when large SQL operations are frequently run.
  • Simplified batching of commands. Stored procedures make it easy to batch work and provide simpler maintenance.
  • Data security and integrity. With stored procedures, administrators can secure tables against direct access or manipulation, and they can only permit the execution of selected stored procedures. Both users and applications are granted access to the stored procedures that enforce data integrity rules. Embedded SQL usually requires advanced permissions on tables. Using advanced permissions on tables is a more complex security model to maintain.
  • SQL Injection. Avoid using dynamically generated SQL with user input. SQL injection occurs when input from a malicious user is used to perform unauthorized actions, such as retrieving too much data or destructively modifying data. Parameterized stored procedures and parameterized SQL statements can help reduce the likelihood of SQL injection. Parameter collections force parameters to be treated as literal values so that the parameters are not treated as executable code. You should also constrain all user input to reduce the likelihood that a malicious user could use SQL injection to perform unauthorized actions.

More Information

For more information about how to prevent SQL injection, see Chapter 14, "Building Secure Data Access," in Improving Web Application Security: Threats and Countermeasures on MSDN at http://msdn.microsoft.com/library/en-us/dnnetsec/html/THCMCh14.asp.

Prioritize Performance, Maintainability, and Productivity when You Choose How to Pass Data Across Layers

You should consider several factors when you choose an approach for passing data across layers:

  • Maintainability. Consider how hard it is to build and keep up with changes.
  • Productivity. Consider how hard it is to implement the solution.
  • Programmability. Consider how hard it is to code.
  • Performance. Consider how efficient it is for collections, browsing, and serializing.

This section summarizes the main approaches for passing data across application layers and the relative tradeoffs that exist:

  • DataSets. With this approach, you use a generic DataSet object. This approach offers great flexibility because of the extensive functionality of the DataSet. This includes serialization, XML support, ability to handle complex relationships, support for optimistic concurrency, and others. However, DataSet objects are expensive to create because of their internal object hierarchy, and clients must access them through collections.

    The DataSet contains collections of many subobjects, such as the DataTable, DataRow, DataColumn, DataRelation and Constraint objects. Most of these objects are passed with the DataSet between the layers. This is a lot of objects and a lot of data to be passed between the layers. It also takes time to fill a DataSet, because there are many objects that need to be instantiated and populated. All of this affects performance. Generally, the DataSet is most useful for caching when you want to create an in-memory representation of your database, when you want to work with relations between tables, and when you want to perform sorting and filtering operations.

  • Typed DataSets. Instantiation and marshaling performance of the typed DataSet is roughly equivalent to the DataSet. The main performance advantage of the typed DataSet is that clients can access methods and properties directly, without having to use collections.
  • DataReaders. This approach offers the optimum performance when you need to render data as quickly as possible. You should close DataReader objects as soon as possible and make sure that client applications cannot affect the amount of time the DataReader and, hence, the database connection is held open.

    The DataReader is very fast compared to a DataSet, but you should avoid passing DataReader objects between layers, because they require an open connection.

  • XML. This is a loosely coupled approach that natively supports serialization and collections of data. For example, an XML document can contain data for multiple business entities. It also supports a wide range of client types. Performance issues to consider include the fact that XML strings can require substantial parsing effort, and large and verbose strings can consume large amounts of memory.
  • Custom Classes. With this approach, you use private data members to maintain the object's state and provide public accessor methods. For simple types, you can use structures instead of classes, which means you avoid having to implement your own serialization. The main performance benefit of custom classes is that they enable you to create your own optimized serialization. You should avoid complex object hierarchies and optimize your class design to minimize memory consumption and reduce the amount of data that needs to be serialized when the object is passed between layers.

More Information

For more information about how to pass data across the layers, see "Designing Data Tier Components and Passing Data Through Tiers" at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/BOAGag.asp.

Consider How to Handle Exceptions

In data access code in particular, you can use try/finally blocks to ensure that connections and other resources are closed, regardless of whether exceptions are generated. However, be aware of the following considerations:

  • Exceptions are expensive. Do not catch exceptions and then throw them again if your data access logic cannot add any value. A less costly approach is to permit the exception to propagate from the database to the caller. Similarly, do not wrap transaction attempts with try/catch blocks unless you plan to implement retry mechanisms.
  • If you want to completely abstract your caller from the data-specific details, you have to catch the exception, you have to log detailed information to a log store, and then you have to return an enumerated value from a list of application-specific error codes. The log store could be a file or the Windows event log.

Use Appropriate Normalization

Overnormalization of a database schema can affect performance and scalability. For example, if you program against a fully normalized database, you are often forced to use cross-table joins, subqueries, and data views as data sources. Obtaining the right degree of normalization involves tradeoffs.

On one hand, you want a normalized database to minimize data duplication, to ensure that data is logically organized, and to help maintain data integrity. On the other hand, it may be harder to program against fully normalized databases, and performance can suffer. Consider the following techniques:

  • Start with a normalized model. Start with a normalized model and then de-normalize later.
  • Reduce the cost of joins by repeating certain columns. Deep joins may result in the creation of temporary tables and table scans. To reduce the cost of joining across multiple tables, consider repeating certain columns.
  • Store precomputed results. Consider storing precomputed results, such as subtotals, instead of computing them dynamically for each request.

Implementation Considerations

When you move from application design to application development, consider the implementation details of your ADO.NET code. You can improve resource management by acquiring connections late, by releasing them early, and by using connection pooling.

When you run code on the server, prefer stored procedures. Stored procedures are optimized by the database and use provider-specific types when they pass parameters, to reduce processing. Choose the best transaction management mechanism, and then choose an appropriate isolation level. Keep transactions as short as possible, and avoid code that can lead to deadlocks. You can improve responsiveness by employing the correct paging strategy. Employing the correct paging strategy can also reduce server load.

By following best practice implementation guidelines, you can increase the performance of your ADO.NET code. The following sections summarize performance considerations for ADO.NET features and scenarios.

.NET Framework Data Providers

Microsoft .NET Framework Data Providers are divided into two categories: bridge providers and native providers. Bridge providers permit you to use data access libraries, such as OLE DB and Open Database Connectivity (ODBC). The bridge provider wraps the underlying data access library. Native providers, such as those for SQL Server and Oracle, typically offer performance improvements due, in part, to the fact that there is less abstraction. It is important to choose the correct data provider for your specific data source as described below:

  • Use System.Data.SqlClient for SQL Server 7.0 and later.
  • Use System.Data.OleDb for SQL Server 6.5 or OLE DB providers.
  • Use System.Data.ODBC for ODBC data sources.
  • Use System.Data.OracleClient for Oracle.
  • Use SQLXML managed classes for XML data and for SQL Server 2000.

Use System.Data.SqlClient for SQL Server 7.0 and Later

For SQL Server 7.0 or later, use the .NET Framework Data Provider for SQL Server, System.Data.SqlClient. It is optimized for accessing SQL Server and communicates directly by using the Tabular Data Stream (TDS) protocol. TDS is the native data transfer protocol of SQL Server.

The commonly used classes in System.Data.SqlClient are SqlConnection, SqlCommand, SqlDataAdapter, and SqlDataReader.

Use System.Data.OleDb for SQL Server 6.5 or OLE DB Providers

For SQL Server 6.5 or OLE DB data sources, use the .NET Framework Data Provider for OLE DB, System.Data.OleDb. For example, in SQL Server 6.5 or earlier, you would use the OLE DB Provider for SQL Server (SQLOLEDB) with the .NET Framework Data Provider for OLE DB. Using the OLE DB provider is less efficient than using the .NET Framework Data Provider for SQL Server, because it calls through the OLE DB layer by using COM interop when communicating with the database.

The commonly used classes in System.Data.OleDb are OleDbConnection, OleDbCommand, OleDbDataAdapter, and OleDbDataReader.

Note   The .NET Framework Data Provider for OLE DB does not support the Microsoft OLE DB Provider for ODBC (MSDASQL). For ODBC data sources, use the .NET Framework Data Provider for ODBC instead.

Use System.Data.ODBC for ODBC Data Sources

For ODBC data sources, use the .NET Framework Data Provider for ODBC, System.Data.ODBC. This provider uses the native ODBC Driver Manager through COM interop.

If you are using .NET Framework 1.0, you must download the .NET Framework Data Provider for ODBC from the Microsoft .NET Framework Developer Center on MSDN at http://msdn.microsoft.com/netframework/downloads/updates/default.aspx. Note that the namespace is Microsoft.Data.Odbc. If you are using .NET Framework 1.1, it is included in the System.Data.Odbc namespace.

The commonly used classes in System.Data.Odbc are OdbcConnection, OdbcCommand, OdbcDataAdapter, and OdbcDataReader.

Use System.Data.OracleClient for Oracle

For Oracle data sources, use the .NET Framework Data Provider for Oracle, System.Data.OracleClient. This provider enables data access to Oracle data sources through Oracle client connectivity software. The data provider supports Oracle client software version 8.1.7 and later.

If you are using .NET Framework 1.0, you must download the .NET Framework Data provider for Oracle from the Microsoft .NET Framework Developer Center on MSDN at http://msdn.microsoft.com/netframework/downloads/updates/default.aspx. If you are using .NET Framework 1.1 or later, the .NET Framework Data provider for Oracle is included in the System.Data.OracleClient namespace in System.Data.OracleClient.dll.

The commonly used classes in System.Data.OracleClient are OracleConnection, OracleCommand, OracleDataAdapter, and OracleDataReader.

Use SQLXML Managed Classes for XML Data and SQL Server 2000

To manipulate data in a SQL Server database as XML, use SQLXML Managed Classes. You can download SQLXML 3.0 from the Data Access and Storage Developer Center on MSDN at http://msdn.microsoft.com/data/downloads/default.aspx.

More Information

For more information about .NET Framework Data Providers, see the following:

Connections

Database connections are an expensive and limited resource. Your approach to connection management can significantly affect the overall performance and scalability of your application. Issues to consider include acquiring and releasing connections, pooling, and authentication. To improve database connection performance and scalability, apply the following strategies to your connection management policy:

  • Open and close the connection in the method.
  • Explicitly close connections.
  • When using DataReaders, specify CommandBehavior.CloseConnection.
  • Do not explicitly open a connection if you use Fill or Update for a single operation.
  • Avoid checking the State property of OleDbConnection.
  • Pool connections.

Open and Close the Connection in the Method

Acquire connections late and release them early. Opening connections before they are needed reduces the number of connections that are available and increases resource pressure. Close connections quickly to ensure that they can be reused as soon as possible. Do not hold on to connections. Holding on to connections reduces the connections that are available to other code and increases resource pressure. The general pattern is to open and close connections on a per-method basis.

Explicitly Close Connections

Explicitly call the Close or Dispose methods on SqlConnection objects as soon as you finish using them to release the resources that they use. Do not wait for the connection to fall out of scope. The connection is not returned to the pool until garbage collection occurs. This delays the reuse of the connection and negatively affects performance and scalability. The following are guidelines to consider. These guidelines are specific to SqlConnection because of the way it is implemented. These guidelines are not universal for all classes that have Close and Dispose functionality.

  • Using either the Close method or the Dispose method is sufficient. You do not have to call one method after the other. There is no benefit to calling one method after the other.
  • Dispose internally calls Close. In addition, Dispose clears the connection string.
  • If you do not call Dispose or Close, and if you do not use the using statement, you are reliant upon the finalization of the inner object to free the physical connection.
  • Use the using statement, instead of Dispose or Close, when you are working with a single type, and you are coding in Visual C#®. Dispose is automatically called for you when you use the using statement, even when an exception occurs.
  • If you do not use the using statement, close connections inside a finally block. Code in the finally block always runs, regardless of whether an exception occurs.
  • You do not have to set the SqlConnection reference to null or Nothing because there is no complex object graph. Setting object references to null or to Nothing is usually done to make a graph of objects unreachable.
Note   Closing a connection automatically closes any active DataReader objects that are associated with the connection.

Closing Connections in Visual Basic .NET

The following Visual Basic® .NET code snippet shows how to explicitly close a connection as soon as the connection is no longer needed.

Try
  conn.Open()
  cmd.ExecuteNonQuery()
  customerCount = paramCustCount.Value
Catch ex As Exception
  ' … handle exception
Finally
  ' This is guaranteed to run regardless of whether an exception occurs
  ' in the Try block.
  If Not(conn is Nothing) Then
    conn.Close()
  End If
End Try

Closing Connections in C#

The following example shows how to close connections in C#.

public void DoSomeWork()
{
  SqlConnection conn = new SqlConnection(connectionString);
  …  try
  {
    conn.Open();
    // Do Work
  }
  catch (Exception e)
  {
    // Handle and log error
  }
  finally
  {
    if(null!=conn)
       conn.Close();
  }
}

Closing Connections with the Using Statement in C#

The using statement simplifies code for C# developers by automatically generating a try and finally block when the code is compiled. This ensures that the Dispose method is called even if an exception occurs. The following code fragment shows how to use the using statement.

using (SqlConnection conn = new SqlConnection(connString))
{
  conn.Open();
  . . .
} // Dispose is automatically called on the conn variable here

The C# compiler converts this code into the following equivalent code, which has a try and finally block to ensure that the Dispose method on the SqlConnection object is called, regardless of whether an exception occurs.

SqlConnection conn = new SqlConnection(connString);
try
{
  conn.Open();
}
finally
{
  conn.Dispose();
}

One limitation of the using statement is that you can only put a single type in the parentheses. If you want to ensure that Dispose is called on additional resources, you must nest the using statements as shown in the following example.

using (SqlConnection conn = new SqlConnection(connString))
{
  SqlCommand cmd = new SqlCommand("CustomerRead");

  conn.Open();
  using (SqlDataReader dr = cmd.ExecuteReader())
  {
    while (dr.Read())
      Console.WriteLine(dr.GetString(0));
  }
}
Note   Using a nested using statement on the DataReader object is useful only if you need to perform further operations with the same connection after the inner using block. If you close the connection right away, this approach is of limited value because any active DataReader objects are closed automatically when the connection closes.

When Using DataReaders, Specify CommandBehavior.CloseConnection

When you create a DataReader object, specify the CommandBehavior.CloseConnection enumeration in your call to ExecuteReader. This ensures that when you close the DataReader, the connection is also closed. The following code fragment shows how to use the CommandBehavior enumeration.

// Create connection and command. Open connection.
. . .
SqlDataReader myReader= myCommand.ExecuteReader(CommandBehavior.CloseConnection);
// read some data
. . .
myReader.Close(); // The connection and reader are closed.

The CommandBehavior.CloseConnection is especially helpful when you return a DataReader from a function, and you do not have control over the calling code. If the caller forgets to close the connection but closes the reader, both are closed when the DataReader is created by using CommandBehavior.CloseConnection. This is shown in the following code fragment.

public SqlDataReader CustomerRead(int CustomerID)
{
  //... create connection and command, open connection
  return myCommand.ExecuteReader(CommandBehavior.CloseConnection);
}

//... client code
SqlDataReader myReader = CustomerRead(10248);
//... read some data
myReader.Close(); // reader and connection are closed

Do Not Explicitly Open a Connection if You Use Fill or Update for a Single Operation

If you perform a single Fill or Update operation, do not open the connection before you call the Fill method, because the DataAdapter automatically opens and closes the connection for you. The following code fragment shows how to call Fill.

DataSet dSet = new DataSet("test");
SqlConnection conn = new SqlConnection(connString);
SqlCommand cmd = new SqlCommand(sqlQuery,conn);
SqlDataAdapter dAdapter = new SqlDataAdapter(cmd);
dAdapter.Fill(dSet); // The connection was not explicitly opened.
// The connection is opened and closed by the DataAdapter automatically.

The SqlDataAdapter automatically opens the connection, runs the selected command, and then closes the connection when it is finished. This enables the connection to be open for the shortest period of time.

Note that if you need to perform multiple file or update operations, you need to open the connection before the first Fill or Update method and close it after the last one. Alternatively, you could wrap multiple Fill or Update operations inside a C# using block to ensure that the connection is closed after the last use.

Avoid Checking the State Property of OleDbConnection

If you need to monitor or check connection status and you are using an OleDbConnection, consider handling the StateChange event, and avoid checking the State property. This approach helps to minimize round trips.

Using the State property increases application overhead, because each call results in a call to the OLE DB DBPROP_CONNECTIONSTATUS property (if the connection is an OleDbConnection) for an open connection.

Note   The .NET Framework 2.0 (code named "Whidbey"), at the time of writing, provides an updated OLE DB .NET Data Provider that resolves this problem.

The following code fragment shows how to implement the StateChange event. This event is raised when the state of the connection changes from open to closed or from closed to open.

OleDbConnection conn = new OleDbConnection(connStr);

// Set up a connection state change handler.
conn.StateChange  += new StateChangeEventHandler(OnStateChange);
. . .
// StateChange event handler.
protected static void OnStateChange(object sender, StateChangeEventArgs args)
{
  Console.WriteLine("The current Connection state has changed from {0} to {1}.",
                                  args.OriginalState, args.CurrentState);
}
Note   The ODBC provider also incurs similar overhead when using the State property.

Pool Connections

Creating database connections is expensive. You reduce overhead by pooling your database connections. Make sure you call Close or Dispose on a connection as soon as possible. When pooling is enabled, calling Close or Dispose returns the connection to the pool instead of closing the underlying database connection.

You must account for the following issues when pooling is part of your design:

  • Share connections. Use a per-application or per-group service account to connect to the database. This creates a single pool or a small number of pools, and it enables many client requests to share the same connections.
  • Avoid per-user logons to the database. Each logon creates a separate pooled connection. This means that you end up with a large number of small pools. If you need a different user for each connection, disable pooling or set a small maximum size for the pool.
  • Do not vary connection strings. Different connection strings generate different connection pools. For example, using different capitalization, extra spaces, or different ordering of attributes causes connections to go to different pools. The SQL Server .NET Data Provider performs a byte-by-byte comparison to determine whether connection strings match.
  • Release connections. Do not cache connections. For example, do not put them in session or application variables. Close connections as soon as you are finished with them. Busy connections are not pooled.
  • Passing connections. Do not pass connections between logical or physical application layers.
  • Consider tuning your pool size if needed. For example, in the case of the .NET Framework Data Provider for SQL Server, the default minimum pool size is zero and the maximum is 100. You might need to increase the minimum size to reduce warm-up time. You might need to increase the maximum size if your application needs more than 100 connections.
  • Connection pools are managed by the specific database provider. SqlClient, OleDB client, and third-party clients may provide different configuration and monitoring options.

The following list details the pooling mechanisms that are available, and it summarizes pooling behavior for the .NET Framework data providers:

  • The .NET Framework Data Provider for SQL Server pools connections by using a pooling mechanism implemented in managed code. You control pooling behaviors such as lifetime and pool size through connection string arguments.
  • The .NET Framework Data Provider for Oracle also pools connections by using a managed code solution.
  • The .NET Framework Data Provider for OLE DB automatically uses OLE DB session pooling to pool connections. You control pooling behavior through connection string arguments.
  • The .NET Framework Data Provider for ODBC uses ODBC connection pooling.

Monitoring Pooling

You can monitor connection pooling to determine that it is working as expected and to help you identify the best minimum and maximum pool sizes.

Monitoring Pooling on a Computer that is Running SQL Server

You can monitor the number of open connections to SQL Server by using the SQL Server SQLServer:General Statistics performance counter object. This object is available only on a computer that is running SQL Server.

The connections are not specific to one particular application. If there are multiple applications accessing the server, this object reflects the total number of open connections for every application. Figure 12.2 shows the SQLServer:General Statistics object in the Performance Monitor tool.

Figure 12.2: Performance monitor showing the SQLServer:General Statistics counter

When monitoring SQLServer:General Statistics, you should observe the following:

  • The number of logins per second increases during application startup when the connection pool is established. The number of logins per second should then drop to zero and stay there. Repeated logins and logouts per second indicate that the connection pool is not being used because a different security context is being used to establish the connection.
  • The User Connections value should stabilize and remain constant. If this value increases and you see a jagged pattern in the number of logins per second, you may be experiencing a connection leak in the connection pool.

Monitoring Pooling Using the .NET Framework

The .NET Framework Data Provider for SQL Server provides several counters. The following counters are of particular significance:

  • SqlClient: Current # connection pools
  • SqlClient: Current # pooled and nonpooled connections
  • SqlClient: Current # pooled connections
  • SqlClient: Peak # pooled connections

The SqlClient: Current # connection pools counter indicates the number of connection pools that are currently in use. A large number of pools indicates that a pool is not being shared across clients. Using different connection strings creates new pools.

The SqlClient: Peak # pooled connections counter indicates the maximum number of connections that are currently in use. If this value remains at its peak, consider measuring the performance impact of increasing the Max Pool Size attribute in your connection string. The default value is 100. If you see this value at its peak in conjunction with a high number of failed connections in the SqlClient: Total # failed connects counter, consider changing the value and monitoring performance.

Note   These SqlClient counters may not be reset in .NET Framework version 1.1 when you stop and then restart an application. To reset the counters, stop the application and exit System Monitor, and then start the application and System Monitor again.

More Information

For more information about pooling connections, see the following resources on MSDN:

For more information about pooling connections, see the following Knowledge Base articles:

For more information about how to reset the .NET counters, see Knowledge Base article 314429, "BUG: Performance Counters for SQL Server .NET Data Provider Are Not Reset," at http://support.microsoft.com/default.aspx?scid=kb;en-us;314429.

Commands

You use Command objects such as SqlCommand, OleDbCommand, or OdbcCommand to run SQL commands against the database. You can run dynamic SQL statements or stored procedures by setting the CommandType property. You can set the CommandText property to contain the name of a stored procedure or the SQL statement that you want to run.

You use command Parameter objects, such as SqlParameter, OleDbParameter, or OdbcParameter, to specify the input and output parameters required by the current command. The recommended approach for running commands is to call stored procedures by using Parameter objects that provide parameter type checking. This approach provides both performance and security benefits. When you run commands against a database, consider the following recommendations:

  • Validate SQL input and use Parameter objects.
  • Retrieve only the columns and rows you need.
  • Support paging over large result sets.
  • Batch SQL statements to reduce round trips.
  • Use ExecuteNonQuery for commands that do not return data.
  • Use ExecuteScalar to return single values.
  • Use CommandBehavior.SequentialAccess for very wide rows or for rows with BLOBs.
  • Do not use CommandBuilder at run time.

Validate SQL Input and Use Parameter Objects

Validate all the input data that you use in SQL commands. Do not permit the client to retrieve more data than it should. Also, do not trust user input, and do not permit the client to perform operations that it should not perform. Doing so helps to lower the risk of SQL injection. By rejecting invalid data early before you issue a command that has the invalid data, you can improve performance by eliminating unnecessary database requests.

Use Parameter objects when you build database commands. When you use Parameter objects, each parameter is automatically type checked. Checking the type is another effective countermeasure you can use to help prevent SQL injection. Ideally, use Parameter objects in conjunction with stored procedures to improve performance. For more information about using parameters, see "Parameters" later in this chapter.

Using Parameters with Stored Procedures

The following code sample illustrates how to use the Parameters collection.

SqlDataAdapter myCommand = new SqlDataAdapter("AuthorLogin", conn);
myCommand.SelectCommand.CommandType = CommandType.StoredProcedure;
SqlParameter parm = myCommand.SelectCommand.Parameters.Add(
                       "@au_id", SqlDbType.VarChar, 11);
parm.Value = Login.Text;

In the code sample, the @au_id parameter is treated as a literal value and not as code that can be run. Also, the parameter is checked for type and length. In the code fragment, the input value cannot be longer than 11 characters. If the data does not conform to the type or length that is defined by the parameter, an exception is generated.

Using stored procedures alone does not necessarily prevent SQL injection. The important thing to do is use parameters with stored procedures. If you do not use parameters, your stored procedures may be susceptible to SQL injection if the stored procedures use unfiltered input. For example, the following code fragment is susceptible to SQL injection.

SqlDataAdapter myCommand = new SqlDataAdapter("LoginStoredProcedure '" +
                               Login.Text + "'", conn);

Using Parameters with Dynamic SQL

If you cannot use stored procedures, you can still use parameters with dynamic SQL as shown in the following code fragment.

SqlDataAdapter myCommand = new SqlDataAdapter(
"SELECT au_lname, au_fname FROM Authors WHERE au_id = @au_id", conn);
SqlParameter parm = myCommand.SelectCommand.Parameters.Add("@au_id",
                        SqlDbType.VarChar, 11);
parm.Value = Login.Text;

Retrieve Only the Columns and Rows You Need

Reduce unnecessary processing and network traffic by retrieving only the columns and the rows you need. Do not use the SELECT * query. This is poor practice because you might not know the schema, or it might change. It is easy to retrieve more data than you expect. Consider a scenario where you want four columns, but you perform an operation by using the SELECT * query on a 400-column table. In that scenario, you receive many more results than you expect. Instead, use WHERE clauses to filter the rows.

Support Paging Over Large Result Sets

If you have a large result set that contains many rows of data, consider whether you can implement a paging technique to batch the retrieval of data. Batching the retrieval of data helps to reduce database server load, to reduce network traffic, and to put fewer memory requirements on the data access client. For more information, see "Paging Records" later in this chapter.

Batch SQL Statements to Reduce Round Trips

Batching is the process of grouping several SQL statements in one trip to the server. The syntax in the following code fragment calls a stored procedure (that groups several queries) to return multiple result sets. The code uses the NextResult method of the DataReader object to advance to the next result set. NextResult can be called multiple times, and it returns true when another result set exists. It returns false when there are no more result sets.

SqlCommand cmd = new SqlCommand();
cmd.CommandText = "ReadCustomerAndOrders";
// The stored procedure returns multiple result sets.
SqlDataReader myReader = cmd.ExecuteReader();

if (myReader.read())
//... read first result set

reader.NextResult();

if (myReader.read())
//... read

If you build complex SQL strings dynamically, you can use a StringBuilder object to reduce the performance cost of building the strings.

More Information

You can also use stored procedures to batch SQL operations. For more information, see Knowledge Base article 311274, "HOW TO: Handle Multiple Results by Using the DataReader in Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;311274.

Use ExecuteNonQuery for Commands That Do Not Return Data

If you want to run commands that do not retrieve data, use the ExecuteNonQuery method. For example, you would use ExecuteNonQuery for the following types of commands:

  • Data Definition Language commands such as CREATE and ALTER
  • Data Modification Language commands such as INSERT, UPDATE, and DELETE
  • Data Control Language commands such as GRANT and REVOKE.

The following code fragment shows an update to the customer table that uses ExecuteNonQuery.

SqlConnection conn = new SqlConnection(connString);
SqlCommand cmd = new SqlCommand(
    "UPDATE Customer SET Freight = 45.44 WHERE CustomerID = 10248", conn);
cmd.ExecuteNonQuery();

Use ExecuteScalar to Return Single Values

If you want to retrieve a single value from your query by using a function such as COUNT(*) or SUM(Price), you can use a stored procedure output parameter, and then use the Command.ExecuteNonQuery method. This eliminates the overhead that is associated with creating a result set.

The following stored procedure returns the number of rows in a Customers table.

CREATE PROCEDURE GetNumberOfCustomers(
@CustomerCount int OUTPUT)
AS
SELECT @CustomerCount = COUNT(*)
FROM Customers

To call the stored procedure, use the following code.

static int GetCustomerCount()
{
  int customerCount = 0;

  SqlConnection conn = new SqlConnection("server=(local);" +
    "Integrated Security=SSPI;database=northwind");
  SqlCommand cmd = new SqlCommand("GetNumberOfCustomers", conn );
  cmd.CommandType = CommandType.StoredProcedure;

  SqlParameter paramCustCount =
    cmd.Parameters.Add("@CustomerCount", SqlDbType.Int );
    paramCustCount.Direction = ParameterDirection.Output;

  try
  {
    conn.Open();
    cmd.ExecuteNonQuery();
    customerCount = (int)paramCustCount.Value;
  }
  finally
  {
    if(null!=conn)
      conn.Close();
  }
  return customerCount;
}

If you do not have control over the stored procedure, and if the stored procedure returns the number of rows as a return value, then you can use Command.ExecuteScalar as shown in the following code fragment. The ExecuteScalar method returns the value of the first column of the first row of the result set.

static int GetCustomerCountWithScalar()
{
  int customerCount = 0;

  SqlConnection conn = new SqlConnection(
     "server=(local);Integrated Security=SSPI;database=northwind");
  SqlCommand cmd = new SqlCommand("GetCustomerCountWithScalar", conn );
  cmd.CommandType = CommandType.StoredProcedure;

  try
  {
    conn.Open();
    customerCount = (int)cmd.ExecuteScalar();
  }
  finally
  {
    if(null!=conn)
       conn.Close();
  }
  return customerCount;
}

The previous code fragment requires the following stored procedure.

CREATE PROCEDURE GetCustomerCountWithScalar
AS
SELECT COUNT(*) FROM Customers

Use CommandBehavior.SequentialAccess for Very Wide Rows or for Rows with BLOBs

Use the CommandBehavior.SequentialAccess enumeration for very wide rows or for rows that contain binary large object (BLOB) data. This permits you to return specific bytes from the retrieved row instead of returning the entire row. Returning the entire row may consume large amounts of memory because of the BLOB data.

When you use CommandBehavior.SequentialAccess, the BLOB data is retrieved only when you reference it. For example, you can call the GetBytes method. The GetBytes method permits you to control the precise number of bytes that are read. The following code fragment shows how to use CommandBehavior.SequentialAccess.

SqlDataReader reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess)

Also, if you are performing optimistic locking against a table with very wide rows or against rows that contain BLOB data, use timestamps. Use timestamps instead of comparing all the fields in the table to the original versions. Using time stamps reduces the number of arguments by a value that is equal to n/2+1.

More Information

For a complete sample, see "Obtaining BLOB Values from a Database" in .NET Framework Developer's Guide on MSDN at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconobtainingblobvaluesfromdatabase.asp.

Do Not Use CommandBuilder at Run Time

CommandBuilder objects such as SqlCommandBuilder and OleDbCommandBuilder automatically generate the InsertCommand, UpdateCommand, and DeleteCommand properties of a DataAdapter. The CommandBuilder objects generate these properties based on the SelectCommand property of the DataAdapter. CommandBuilder objects are useful when you are designing and prototyping your application. However, you should not use them in production applications. The processing required to generate the commands affects performance. Manually create stored procedures for your commands, or use the Visual Studio® .NET design-time wizard and customize them later if necessary.

Stored Procedures

This section discusses how to write and to call stored procedures for maximum performance. You should generally prefer stored procedures over direct SQL statements, because stored procedures perform better. Stored procedures perform better because the database can optimize the data access plan used by the procedure and then cache it for subsequent reuse. In addition, stored procedures provide security and maintenance benefits.

  • Use stored procedures.
  • Use CommandType.Text with OleDbCommand.
  • Use CommandType.StoredProcedure with SqlCommand.
  • Consider using Command.Prepare.
  • Use output parameters where possible.
  • Consider SET NOCOUNT ON for SQL Server.

Use Stored Procedures

Stored procedures generally provide improved performance in comparison to SQL statements that are run directly. The following list explains the benefits of stored procedures compared to running data access logic directly from the middle tier of your application:

  • The database can prepare, optimize, and cache the execution plan so that the execution plan can be reused at a later time.
  • Stored procedures pass less information over the network on the initial request, because they only need to transmit the procedure name and the parameters. Everything else is already at the server.
  • Stored procedures abstract SQL statements from the client and business object developers and put responsibility for their maintenance in the hands of SQL experts.
  • Stored procedures also provide maintenance and security benefits.

More Information

For more information about the security benefits of stored procedures and about how you can use them as a countermeasure for SQL injection, see the following:

Use CommandType.Text with OleDbCommand

If you use an OleDbCommand object to call a stored procedure, use the CommandType.Text enumeration with the ODBC call syntax. If you use CommandType.StoredProcedure, ODBC call syntax is generated by the provider anyway. By using explicit call syntax, you reduce the work of the provider.

You should also set the type and length of any parameters that the stored procedure requires. Set the type and length of the parameters to prevent the provider from performing an additional round trip to obtain the parameter information from the database. The following code fragment demonstrates how to use ODBC call syntax and CommandType.Text, and how to explicitly set parameter information.

using (OleDbConnection conn = new OleDbConnection(connStr))
{
  OleDbCommand cmd = new OleDbCommand("call CustOrderHist(?)", conn);
  cmd.CommandType = CommandType.Text;
  OleDbParameter param = cmd.Parameters.Add("@CustomerID", OleDbType.Char, 5);
  param.Value = "ALFKI";
  conn.Open();
  OleDbDataReader reader = cmd.ExecuteReader();
  try
  {
    // List each product.
    while (reader.Read())
      Console.WriteLine(reader.GetString(0));
  }
  finally
  {
    reader.Close();
  }
} // Dispose is called on conn here

Use CommandType.StoredProcedure with SqlCommand

If you are using the SqlCommand object, use CommandType.StoredProcedure when you call stored procedures. Do not use CommandType.Text because it requires extra parsing. The following code fragment shows how to set the CommandType property to avoid extra parsing on the server.

SqlConnection conn = new SqlConnection(connString);
SqlCommand cmd = new SqlCommand("UpdateCustomerProcedure", conn);
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add(...

Consider Using Command.Prepare

If your application runs the same set of SQL queries multiple times, preparing those queries by using the Command.Prepare method may give you better performance. In ADO.NET, the SqlCommand.Prepare method calls the sp_prepare stored procedure for SQL Server 7. The SqlCommand.Prepare method calls sp_prepexec for SQL Server 2000 and later. SqlCommand.Prepare makes these calls instead of running a regular batch remote procedure call (RPC). The following code fragment shows how to use Command.Prepare.

cmd.CommandText =
   "insert into Region (RegionID, RegionDescription) values (@id, @desc)";

cmd.Parameters.Add ( "@id", SqlDbType.Int, 4, "RegionID") ;
cmd.Parameters.Add ( "@desc", SqlDbType.NChar, 50, "RegionDescription") ;

cmd.Parameters[0].Value = 5;
cmd.Parameters[1].Value = "North West";
cmd.Prepare();
cmd.ExecuteNonQuery();

cmd.Parameters[0].Value = 6;
cmd.Parameters[1].Value = "North East";
cmd.ExecuteNonQuery();

cmd.Parameters[0].Value = 7;
cmd.Parameters[1].Value = "South East";
cmd.ExecuteNonQuery();

cmd.Parameters[0].Value = 8;
cmd.Parameters[1].Value = "South West";
cmd.ExecuteNonQuery();

Using the Prepare method does not yield a benefit if you are only going to run the statement one or two times. The next version of SQL Server will better leverage how plans are cached, so using it would not make a difference. You should only use the Prepare method for those statements that you run multiple times.

Use Output Parameters Where Possible

Use output parameters and ExecuteNonQuery to return small amounts of data instead of returning a result set that contains a single row. When you use output parameters and ExecuteNonQuery to return small amounts of data, you avoid the performance overhead that is associated with creating the result set on the server.

The following code fragment uses a stored procedure to retrieve the product name and unit price for a specific product that is contained in the Products table in the Northwind database.

void GetProductDetails( int ProductID,
                        out string ProductName, out decimal UnitPrice )
{
  using( SqlConnection conn = new SqlConnection(
        "server=(local);Integrated Security=SSPI;database=Northwind") )
  {
    // Set up the command object used to run the stored procedure.
    SqlCommand cmd = new SqlCommand( "DATGetProductDetailsSPOutput", conn );
    cmd.CommandType = CommandType.StoredProcedure;
    // Establish stored procedure parameters.
    //  @ProductID int INPUT
    //  @ProductName nvarchar(40) OUTPUT
    //  @UnitPrice money OUTPUT

    // Must explicitly set the direction of the output parameters.
    SqlParameter paramProdID =
             cmd.Parameters.Add( "@ProductID", ProductID );
    paramProdID.Direction = ParameterDirection.Input;
    SqlParameter paramProdName =
             cmd.Parameters.Add( "@ProductName", SqlDbType.VarChar, 40 );
    paramProdName.Direction = ParameterDirection.Output;
    SqlParameter paramUnitPrice =
             cmd.Parameters.Add( "@UnitPrice", SqlDbType.Money );
    paramUnitPrice.Direction = ParameterDirection.Output;

    conn.Open();
    // Use ExecuteNonQuery to run the command.
    // Although no rows are returned, any mapped output parameters
    // (and potential return values) are populated
    cmd.ExecuteNonQuery( );
    // Return output parameters from stored procedure.
    ProductName = paramProdName.Value.ToString();
    UnitPrice = (decimal)paramUnitPrice.Value;
  }
}

Consider SET NOCOUNT ON for SQL Server

When you use SET NOCOUNT ON, the message that indicates the number of rows that are affected by the T-SQL statement is not returned as part of the results. When you use SET NOCOUNT OFF, the count is returned. Using SET NOCOUNT ON can improve performance because network traffic can be reduced. SET NOCOUNT ON prevents SQL Server from sending the DONE_IN_PROC message for each statement in a stored procedure or batch of SQL statements.

For example, if you have eight operations in a stored procedure, eight messages are returned to the caller. Each message contains the number of rows affected by the respective statement. When you use SET NOCOUNT ON, you reduce the processing that SQL Server performs and the size of the response that is sent across the network.

Note   In Query Analyzer, the DONE_IN_PROC message is intercepted and displayed as "N rows affected".

Parameters

Most SQL commands require input or output parameters, regardless of whether they are stored procedures or direct SQL statements. Each .NET Framework