An Evaluation of Stored Procedures for the .NET Developer
Christa Carpentiere
Microsoft Corporation
March 2004
Applies to:
Microsoft® SQL Server™
Microsoft Visual Studio® .NET 2003
Transact-SQL (T-SQL) Language
Summary: Get an introduction to Microsoft SQL Server stored procedures for developers who work primarily in the .NET programming languages. Discover the pros and cons of using stored procedures; get an overview of the tools that Microsoft Visual Studio .NET 2003 provides to help you work with stored procedures; and learn a handful of best practices to help you get started. (7 printed pages)
Introduction
Why Consider Stored Procedures?
Visual Studio .NET Tools
Tips for Getting Started
Conclusion
The aim of this article is to give you the key pros and cons of using stored procedures to encapsulate the Transact-SQL (T-SQL) needed by your applications, so you can make an informed decision about using them in your environment. For those who can take advantage of them, I will also cover tools and best practices for using them in .NET applications.
I know, I am entering the religious wars here where some think business logic should only be encoded in the middle tier, or only in the database; or in-line queries are the only way to go; or stored procedures should be used for absolutely everything. There are benefits and drawbacks to all of these approaches. The important thing is to consider what is important for your application and environment. So, let's look at what exactly stored procedures are, and consider some of the reasons why you might want to use them to encapsulate your T-SQL code.
Maybe you've written T-SQL to work with a SqlCommand object here and there, but never thought about if there was a better place for it than mixed into your data access code. Maybe you have applications that have some complicated T-SQL procedures coded in them because the application has added functionality over time. Stored procedures offer you an alternative place to encapsulate this code.
Most of you probably have some idea what a stored procedure is, but for those who don't, a stored procedure is a group of T-SQL statements that are saved together in the database as a single unit of code. You can pass in run-time information using input parameters, and get back data as a result set or an output parameter. The first time the stored procedure is run, it gets compiled. This produces an execution plan—essentially a record of the steps that Microsoft® SQL Server™ must take to get the results specified by the T-SQL in the stored procedure. The execution plan is then cached in memory for future use. This improves the performance of the stored procedure in that SQL Server does not need to analyze the code again to figure out what to do with it; it can simply refer to the cached plan. This cached plan is available until SQL Server is re-started, or until it is aged out of memory due to low usage.
The cached execution plan used to give stored procedures a performance advantage over queries. However, for the last couple of versions of SQL Server, execution plans are cached for all T-SQL batches, regardless of whether or not they are in a stored procedure. Therefore, performance based on this feature is no longer a selling point for stored procedures. Any T-SQL batch with static syntax that is submitted frequently enough to prevent its execution plan from aging out of memory will receive identical performance benefits. The "static" part is key; any change, even something as insignificant as the addition of a comment, will prevent matching with a cached plan and thereby prevent plan re-use.
However, stored procedures can still provide performance benefits where they can be used to reduce network traffic. You only have to send the EXECUTE stored_proc_name statement over the wire instead of a whole T-SQL routine, which can be pretty extensive for complex operations. A well-designed stored procedure can reduce many round trips between the client and the server to a single call.
Additionally, using stored procedures allows you to enhance execution plan re-use, and thereby improve performance, by using remote procedure calls (RPCs) to process the stored procedure on the server. When you use a SqlCommand.CommandType of StoredProcedure, the stored procedure is executed via RPC. The way RPC marshals parameters and calls the procedure on the server side makes it easier for the engine to find the matching execution plan and simply plug in the updated parameter values.
One last thing to think about when considering using stored procedures to enhance performance is whether you are leveraging T-SQL strengths. Think about what you want to do with the data.
- Are you using set-based operations, or doing other operations that are strongly supported in T-SQL? Then stored procedures are an option, although in-line queries would also work.
- Are you trying to do row-based operations, or complex string manipulation? Then you probably want to re-think doing this processing in T-SQL, which excludes using stored procedures, at least until SQL Server 2005 is released and Common Language Runtime (CLR) integration is available.
The second potential benefit to consider is maintainability. In a perfect world, your database schema would never change and your business rules would never get modified, but in the real world these things happen. That being the case, it may be easier for you if you can modify a stored procedure to include data from the new X, Y, and Z tables that have been added to support that new sales initiative, instead of changing that information somewhere in your application code. Changing it in the stored procedure makes the update transparent to the application—you still return the same sales information, even though the internal implementation of the stored procedure has changed. Updating the stored procedure will usually take less time and effort than changing, testing, and re-deploying your assembly.
Also, by abstracting the implementation and keeping this code in a stored procedure, any application that needs access to the data can get it in a uniform manner. You don't have to maintain the same code in multiple places, and your users get consistent information.
Another maintainability benefit of storing your T-SQL in stored procedures is better version control. You can version control the scripts that create and modify your stored procedures, just as you can any other source code module. By using Microsoft Visual SourceSafe® or some other source control tool, you can easily revert to or reference old versions of the stored procedures.
One caveat with using stored procedures to enhance maintainability is they cannot insulate you from all possible changes to your schemas and rules. If the changes are large enough to require a change in the parameters fed into the stored procedure, or in the data returned by it, then you are still going to have to go in and update the code in your assembly to add parameters, update GetValue()
calls, and so forth.
Another issue to consider is that using stored procedures to encapsulate business logic limits your application portability, in that it ties you to SQL Server. If application portability is critical in your environment, encapsulating business logic in a RDBMS-neutral middle tier may be a better choice.
A final reason you may want to consider using stored procedures is the way they can be used to enhance security.
In terms of regulating user access to information, they can provide access to specific data by allowing users permissions on the stored procedure, but not the underlying tables. You can think of stored procedures as similar to SQL Server views (if you are familiar with those), except the stored procedure accepts input from the user to dynamically change the data displayed.
Stored procedures can also help you out with code security. They can protect against some kinds of SQL injection attacks—primarily those that use an operator such as AND or OR to append commands onto a valid input parameter value. Stored procedures also hide the implementation of business rules in the chance that your application is compromised. That can be important for businesses where this type of information is considered intellectual property.
Additionally, using stored procedures lets you use the SqlParameter class available in ADO.NET to specify data types for stored procedure parameters. This gives you an easy way to validate the types of user-provided values as part of an in-depth defensive strategy. To be sure, parameters are just as useful in in-line queries as they are in stored procedures in narrowing the range of acceptable user input.
The caveats to using stored procedures to enhance security are that bad security or coding practices will still leave you open to attacks. Treating SQL Server role creation and assignment cavalierly will result in people getting access to data they shouldn't see. And assuming the use of stored procedures will de facto prevent all SQL injection attacks—like having data manipulation language (DML) appended to an input parameter—will have the same result.
Also, the use of parameters for data type validation is not foolproof, regardless of whether or not the T-SQL is in your code or in a stored procedure. All user-provided data, especially text data, should receive additional validation prior to being passed to the database.
Well, maybe. Let's recap their strong points:
- Improve performance by reducing network traffic
- Provide a single point of maintenance
- Abstract business rules for consistency and security
- Enhance security by minimizing some forms of attack
- Encourage execution plan re-use
If your environment allows you to exploit the benefits stored procedures provide, as outlined above, I'd highly recommend them. They provide a good tool for improving the way data is handled in your environment. On the other hand, if portability, working heavily with non-T-SQL-friendly processes, or an unstable database schema negates these benefits in your case, you may want to consider other alternatives.
One other thing to consider is the amount of T-SQL expertise you have in house. Do you have a fair amount of T-SQL knowledge, or are you willing to learn? Alternatively, do you have a DBA or likely stand-in who can be coaxed into writing stored procedures for you? The more T-SQL knowledge you have, the better your stored procedures are going to be, and the less hassle you'll have maintaining them. For example, T-SQL is focused on set-based operations, rather than row-based operations. Falling back on cursors because they remind you of datasets will result in poor performance. If you don't know much about T-SQL, consider this a learning opportunity. The knowledge will improve your code, regardless of where you keep it.
So, for those of you who think that stored procedures could add a certain special something to your applications, read on. We'll review some tools to make working with them easier, and take a look at some best practices for creating them.
Microsoft Visual Studio® .NET provides a few tools that let you view and manipulate SQL Server stored procedures (as well as other database objects). Let's take a quick look at what functionality you can expect to find.
You can use the Server Explorer to look at existing stored procedures, to see what parameters they require, or to view their internal implementation. If you have a connection to a server with SQL Server installed, you can expand ServerName, SQL Servers, SQLInstanceName, Northwind, Stored Procedures, and expand CustOrdersDetail. The explorer will show you any parameters that the stored procedure requires, and any columns that it returns. If you check out the properties for any of these, you'll note that the data types are expressed as ADO types. The framework docs provide a handy mapping between these types and .NET types. Of course, when you use parameters in your ADO.NET code, you may want to express their data types as members of the SqlDbType enumeration. You can see a mapping between .NET types and SqlDbType types.
If you double-click on a stored procedure, Visual Studio will open it up for you in SQL Editor, with handy color-coding and everything. Note that instead of showing CREATE PROCEDURE syntax—which is what actually exists in the database—it assumes you want to modify the stored procedure, and so gives you ALTER PROCEDURE syntax instead.
Do you have permissions to create stored procedures in the appropriate databases? If not, that's your first step toward creating and modifying stored procedures. Talk to your database administrator if you need assistance.
You can create a new stored procedure by right-clicking on the Stored Procedures node (or any stored procedure, for that matter) in Server Explorer and selecting New Stored Procedure. This will pop open a new window in the SQL Editor, with some skeletal CREATE PROCEDURE syntax to get you started. You can then type in the body of your stored procedure. Modifying a stored procedure starts out the same way as viewing one: locate it in Server Explorer, and then open it up.
If you need help creating queries for use in your stored procedure, right-click on the stored procedure window, and then select Insert SQL. Alternatively, you can also select a block of T-SQL, right-click, and then select Design SQL Block. Either of these actions will open a Query Builder window, which gives you a graphical interface to build or modify a T-SQL statement. When you are done, you can cut and paste the T-SQL into your stored procedure.
Unfortunately, there is no IntelliSense in this editor, so be prepared to have SQL Server Books Online open, as well, for frequent reference. Attempting to save the stored procedure will alert you to any syntax errors that need correction. Note that you cannot save the stored procedure until the syntax errors are fixed, so you probably want to make sure you have the time to work out and finish the procedure before you start coding it. You can always do initial coding and testing in SQL Server Query Analyzer, but that's another article.
Once you have your code the way you want it, you can test the stored procedure by right-clicking and choosing the Run Stored Procedure option.
If you want to start creating stored procedures to use with your applications, here are a handful of tips you can keep in mind to make them perform well and play nicely with others.
Stored procedures by default will return the count of rows affected by each statement in the procedure. If you don't need to use this information in your application (most applications don't), use the SET NOCOUNT ON statement in your stored procedure to stop this behavior. This removes one or more trips between the client and server, depending on how many statements that affect rows are included in the stored procedure. This doesn't sound like a huge issue, but it can adversely affect performance in high-traffic applications.
create procedure test_MyStoredProc @param1 int
as
set nocount on
The sp_ prefix is reserved for system stored procedures. The database engine will always look for stored procedures with this prefix first in the master database. This means that it will take just a bit longer to finish processing while the engine looks through the master database first, and then the database where the stored procedure actually resides. Also, if there happens to be a system stored procedure with an identical name, your procedure won't get processed at all.
Think carefully before using optional parameters extensively. You can easily impact performance by doing extra work that isn't required based on the set of parameters fed in for any given execution. You can get around this by using conditional coding for every possible combination of parameters, but this is time-consuming and increases your opportunities for error.
You can add a little speed and save a little processing power by using an OUTPUT parameter to return scalar data. Where your application needs a single value returned, try this approach instead of materializing a result set. You can also use an OUTPUT parameter to return a cursor when that is appropriate, but we'll leave the cursors vs. set-based processing ideological dispute for a future article.
Use the return value of a stored procedure to return processing status information to the calling application. Standardize on a set of return values and their meanings within your development group, and use them consistently. This will make it much easier to handle errors in your calling application and provide useful information about problems to your end users.
SQL Server will recompile a stored procedure when a DML statement is executed after a data definition language (DDL) statement, where the DML references any object modified by the DDL. This happens because SQL Server needs to take into account the changes made to the object by the DDL in order to create a plan for the DML. If you take care of all your DDL at the start of the stored procedure, then it only has to recompile once. If you mix DDL and DML statements, you will force the stored procedure to recompile multiple times, and that will adversely affect your performance.
You may not always be maintaining this code. In the future, other people might like to know what it does. 'Nuff said.
I hope you've gotten a feel for whether stored procedures can be useful to you. They are a tool that comes for free with SQL Server, so you should certainly exploit them if they can help you with your application development and maintenance. The introductory information I've provided will get you started. Consult SQL Server Books Online, the Visual Studio docs, and, of course, MSDN® for more in-depth information as you move forward.