Data Points
SQL Server 2005 XML Support, Exception Handling, and More
John Papa
Code download available at:DataPoints05.exe(116 KB)
SQL Server 2005 includes several important improvements to the Transact-SQL (T-SQL) language. One added feature is a new kind of trigger that fires when data definition language (DDL) statements run. DDL triggers are useful when tracking or securing the creation and modification of database objects or changes to the database server. Another of these new features involves exception handling, which has made a huge leap forward with the inclusion of TRY/CATCH blocks. Another set of new features surround the new XML datatype, which represents a huge step forward in managing XML data within enterprise applications that interact with SQL Server. Now XML documents or fragments can be stored within an XML column, bound to a schema, and queried using the XQuery syntax. This month I'll answer three interesting questions readers have sent me on these topics.
Q What are some of the advantages of both typed and untyped XML columns?
Q What are some of the advantages of both typed and untyped XML columns?
A Untyped XML columns store any well-formed XML fragment or document, and typed XML columns can be bound to an XML Schema. Untyped XML is useful when you aren't sure what schema the XML will follow. For example, if you must consume some XML from another app but you can't determine what XML schema the data will follow, untyped XML columns can do the job. Of course, untyped XML brings up other issues. For example, since you don't have a schema to program against, it is difficult to consume the XML effectively. But sometimes you can't avoid untyped XML. Creating an untyped XML column is as simple as this:
CREATE TABLE Foo(FooID INT, someXml XML)
A Untyped XML columns store any well-formed XML fragment or document, and typed XML columns can be bound to an XML Schema. Untyped XML is useful when you aren't sure what schema the XML will follow. For example, if you must consume some XML from another app but you can't determine what XML schema the data will follow, untyped XML columns can do the job. Of course, untyped XML brings up other issues. For example, since you don't have a schema to program against, it is difficult to consume the XML effectively. But sometimes you can't avoid untyped XML. Creating an untyped XML column is as simple as this:
CREATE TABLE Foo(FooID INT, someXml XML)
Another consequence of not binding a schema to an XML column is that SQL Server™ 2005 stores untyped XML as a string. Is this a good thing or a bad thing? Actually, both. Storing the XML as a string gives you greater flexibility to store any XML fragment or document. You might need this based on a business rule of your application. On the other hand, storing the XML as a string means that the XML data can be neither efficiently stored nor efficiently searched when compared to typed XML. SQL Server is not told what the XML will contain, nor is it told the hierarchical structure, or the XML nodes' datatypes. However, keep in mind that just because these are untyped XML columns does not mean that they can take any format you can invent. Both untyped and typed XML columns still only accept well-formed XML.
Typed XML columns are ideal if you want to store an XML document whose schema you know. A schema can define the elements, the attributes, their datatypes, which fields are required, and the overall hierarchical structure of the data. Because this detailed information is known about the XML column's data, the XML data is stored more efficiently internally by SQL Server 2005. If you try to store XML data that does not match the schema, SQL Server will detect this and stop you.
Creating a typed XML column is as simple as adding the name of the schema inside parentheses, as you see here:
CREATE TABLE Foo(FooID INT, someXml XML(CONTENT FooSchema))
This statement indicates that the someXml column must adhere to the XML Schema Collection named FooSchema. You can specify that the XML must be a document or that it can contain a fragment by including the appropriate keyword, DOCUMENT or CONTENT, respectively. If omitted, the default is CONTENT.
You can add an XML schema collection to a database using the T-SQL command as follows:
CREATE XML SCHEMA COLLECTION [FooSchema] AS N 'put your schema here'
You can also use SQL Server Management Studio (SSMS) to create an XML Schema Collection template. Open the Template Explorer from the View menu and navigate to the XML Schema Collections node and expand it. Then you can double-click on the CREATE template to open a template that gives you a head start on the syntax to create the XML Schema Collection (see Figure 1).
Figure 1** XML Schema Collection Template is SSMS **
The AdventureWorks database that ships with SQL Server 2005 has a table named HumanResources.JobCandidate. This table has a column named Resume which is an XML column bound to the schema collection HumanResources.HRResumeSchemaCollection. Any XML data that is stored in this column must adhere to this schema. Figure 2 shows a diagram of this schema.
Figure 2** Resume Schema **
Both typed and untyped XML columns can be indexed, but XML columns that are bound to a schema have some advantages over untyped XML columns when it comes to indexing. When an XML index is applied to an untyped XML column, much of the XML structure must be parsed to locate matching nodes. However, when an XML index is applied to a typed XML column, the specific nodes are identifiable and can be located based on the schema. So, the indexes can function more efficiently with typed XML since it knows where to look. Also, if a search for a range of a numeric value is required, the index would perform a datatype conversion with untyped XML since the datatypes are unknown. Typed XML defines its datatypes, so the expense of conversion is avoided.
Q I just migrated a SQL Server 2000 database to SQL Server 2005. I've heard that SQL Server 2005 introduces TRY/CATCH exception handling to T-SQL. How does the new exception handling work versus the existing IF @@ERROR <> 0 technique?
Q I just migrated a SQL Server 2000 database to SQL Server 2005. I've heard that SQL Server 2005 introduces TRY/CATCH exception handling to T-SQL. How does the new exception handling work versus the existing IF @@ERROR <> 0 technique?
A One of the common complaints about writing code in T-SQL has always been its lack of a robust exception handling construct. Up through SQL Server 2000 you could write a T-SQL batch of code that checked for and even raised errors when needed. But the techniques to do so are fairly rudimentary when compared to the TRY/CATCH techniques. First I will define the problems associated with exception handling using the @@ERROR function and then I will show how the new TRY/CATCH technique compares to this.
A One of the common complaints about writing code in T-SQL has always been its lack of a robust exception handling construct. Up through SQL Server 2000 you could write a T-SQL batch of code that checked for and even raised errors when needed. But the techniques to do so are fairly rudimentary when compared to the TRY/CATCH techniques. First I will define the problems associated with exception handling using the @@ERROR function and then I will show how the new TRY/CATCH technique compares to this.
SQL Server exposes the built-in @@ERROR function, which returns the error number for the last T-SQL command that was executed. The problem with this function is that it always returns the error returned from the statement that was executed immediately previously. This means that if you execute an INSERT statement that causes an error, then execute another SQL statement of any kind that does not cause an error, and then check to see what @@ERROR returns, the function will return a 0 because the previous statement did not return an error. You have to be very careful to make sure you check the value of @@ERROR after every single statement.
Figure 3 shows some sample code that will run inside a transaction to delete the customer's order details, its orders, and then the customer itself. Following each DELETE statement the code checks the value of the @@ERROR function to see if the DELETE statement causes an error. If it does, then I roll back the transaction, stop executing the code, and return from the batch.
Figure 3 Crude Exception Handling with @@ERROR
BEGIN TRANSACTION DELETE [Order Details] WHERE OrderID IN (SELECT OrderID FROM Orders WHERE CustomerID = 'ALFKI') IF @@ERROR <> 0 BEGIN ROLLBACK TRANSACTION RETURN END DELETE Orders WHERE CustomerID = 'ALFKI' IF @@ERROR <> 0 BEGIN ROLLBACK TRANSACTION RETURN END DELETE Customers WHERE CustomerID = 'ALFKI' IF @@ERROR <> 0 BEGIN ROLLBACK TRANSACTION RETURN END PRINT 'I got here' -- Normally do a COMMIT TRANSACTION here. -- But I do a ROLLBACK so I don't truly delete my test data. ROLLBACK TRANSACTION
This is standard transaction management that aborts a transaction and returns the state of the data to its initial state if something goes wrong. Notice that if all goes well and no errors occur, I print a success message and then roll back the transaction. Normally I would not roll it back on success but because this is a test transaction, I don't actually want to delete my data.
Notice that I had to follow every statement with the @@ERROR function and then follow it up with a ROLLBACK and a RETURN. This approach can get really ugly. Imagine writing a transaction that contains dozens of queries that all need to be checked. The process could be simplified somewhat by using GOTO statements, but I am certainly not an advocate of them and they still don't solve the problem of needing to check the @@ERROR function immediately following the action query.
SQL Server 2005 still supports the @@ERROR function but it also includes the widely known TRY/CATCH paradigm. The TRY/CATCH construct is similar to the C# try/catch construct in that errors are trapped within the try block and execution is then transferred to the catch block. (The T-SQL version of TRY/CATCH has no FINALLY clause option as does the C# version. You can simulate one by catching the exceptions, not returning, and then following the catch block with appropriate cleanup code.) Therefore you need to watch out for this in certain circumstances, like when you have a CURSOR allocated and open at the time an error is thrown in the T-SQL TRY block. In this case the cursor should be checked in the CATCH block to see if it is open; if it is open, it should be closed and deallocated.
Right away what likely strikes you about the TRY/CATCH construct in Figure 4 is how compact it is versus the @@ERROR technique. Notice that the repetitive checks to see if there is an error are not necessary with the TRY/CATCH block, thus reducing the number of lines of code, which also reduces the number of places you could make a coding mistake.
Figure 4 Exception Handling with TRY/CATCH
BEGIN TRY BEGIN TRANSACTION DELETE [Order Details] WHERE OrderID IN (SELECT OrderID FROM Orders WHERE CustomerID = 'ALFKI') DELETE Orders WHERE CustomerID = 'ALFKI' DELETE Customers WHERE CustomerID = 'ALFKI' PRINT 'committing deletes' COMMIT TRANSACTION END TRY BEGIN CATCH ROLLBACK TRANSACTION RETURN END CATCH
The code in Figure 4 attempts to begin a transaction, execute a series of DELETE statements, and then commit the transaction. (Again, notice that in my example I am not committing the transaction but instead I am rolling it back so I don't actually delete my data. In reality this would be a COMMIT TRAN statement.)
There are a few built-in functions that you can access from within a TRY/CATCH construct that can help you determine what caused the code to enter the CATCH block. For example, you could add the following statement inside of the CATCH block to return information about that error:
SELECT ERROR_LINE() AS ErrorLine, ERROR_MESSAGE() AS ErrorMessage, ERROR_NUMBER() AS ErrorNumber, ERROR_PROCEDURE() AS ErrorProcedure, ERROR_SEVERITY() AS ErrorSeverity, ERROR_STATE() AS ErrorState
These functions (where applicable) will return the line number in which the error occurred, the error message, the error number, the stored procedure, or trigger in which the error occurred, the severity level of the error, and the state level of the error.
The TRY/CATCH construct traps errors that have a severity of 11 through 19, inclusive. Severity levels below 11 are warnings and are not considered errors. Errors with a severity level of 20 and higher are considered severe errors. However, if these 20+ severity level errors do not cause the database engine to stop, they will be caught by the TRY/CATCH. Overall the TRY/CATCH statement is much easier to read, much easier to maintain, and less prone to copy and paste mistakes. Additionally, the @@ERROR technique is not always reliable. For example, some errors abort the plan or the entire batch.
Q I know that I can create triggers to audit changes to data, but how can I perform an action when a table's schema is changed?
Q I know that I can create triggers to audit changes to data, but how can I perform an action when a table's schema is changed?
A As I mentioned in the introduction, there's a new trigger in town: the DDL trigger in SQL Server 2005. While data manipulation language (DML) triggers fire based on data manipulation, DDL triggers fire when a change is made to the schema of a database or server. DDL triggers can only be created to fire AFTER the triggering event has occurred, as opposed to DML triggers, which can fire AFTER the event or INSTEAD OF the event. The following is the syntax for creating a DDL trigger, outlined by the SQL Server 2005 documentation:
CREATE TRIGGER trigger_name ON { ALL SERVER | DATABASE } [ WITH ENCRYPTION ] { FOR | AFTER } { event_type [ ,...n ] | DDL_DATABASE_LEVEL_EVENTS } [ WITH APPEND ] [ NOT FOR REPLICATION ] { AS { sql_statement [ ...n ] | EXTERNAL NAME < method specifier > } } < method_specifier > ::= assembly_name:class_name[::method_name]
A As I mentioned in the introduction, there's a new trigger in town: the DDL trigger in SQL Server 2005. While data manipulation language (DML) triggers fire based on data manipulation, DDL triggers fire when a change is made to the schema of a database or server. DDL triggers can only be created to fire AFTER the triggering event has occurred, as opposed to DML triggers, which can fire AFTER the event or INSTEAD OF the event. The following is the syntax for creating a DDL trigger, outlined by the SQL Server 2005 documentation:
CREATE TRIGGER trigger_name ON { ALL SERVER | DATABASE } [ WITH ENCRYPTION ] { FOR | AFTER } { event_type [ ,...n ] | DDL_DATABASE_LEVEL_EVENTS } [ WITH APPEND ] [ NOT FOR REPLICATION ] { AS { sql_statement [ ...n ] | EXTERNAL NAME < method specifier > } } < method_specifier > ::= assembly_name:class_name[::method_name]
Notice that you can create the trigger so that it fires when changes are made to the schema of a database or when changes are made to the database server. You can also define the events that will cause the trigger to fire by specifying the event types, which are all listed in the SQL Server 2005 documentation. Alternatively, you can have the DDL trigger fire after all events by specifying the DDL_DATABASE_LEVEL_EVENTS indicator.
With a DDL trigger you can log changes made to the schema of a database. Instead of preventing a change from being made, you can audit the change using a DDL trigger. Alternatively, you could log the attempt and then roll it back, all via the DDL trigger. You might also want to enforce naming conventions on your objects when they are created. For example, you might want all stored procedures prefixed with the characters pr. With a DDL trigger, you could enforce this naming convention.
DDL triggers are similar to DML triggers in that they both can roll back a transaction. However, DDL triggers have no inserted or deleted tables. DDL triggers have access to a built-in function called eventdata that returns XML within an XML datatype that contains information regarding the event that fired the trigger. Every event causes slightly different information to appear in the event data. However the eventdata function always returns the following information for all events:
- The time of the event
- The type of event
- The SPID of the connection that caused the trigger to fire
- The login name and user name of the user context in which the event occurred
Notice that the DDL trigger in Figure 5 is defined to fire whenever a stored procedure is created or altered on the current database. When the trigger fires, the event's data is captured and stored in a local XML variable. While this is not necessary, it can help make the trigger more efficient if you intend to access the eventdata function more than once. The trigger then checks to see if the name of the object begins with pr. If it does, I allow the transaction to complete. If it does not, then I roll back the transaction and the stored procedure would remain unchanged. In this case I know that the ObjectName element will contain the name of the stored procedure since the trigger is only being invoked when a stored procedure is created or altered. Figure 6 shows the event data for this trigger when I executed the following command:
create proc test1 as select getdate
Figure 6 Event Data for Creating a Procedure
<EVENT_INSTANCE> <EventType>CREATE_PROCEDURE</EventType> <PostTime>2005-10-20T00:52:16.160</PostTime> <SPID>51</SPID> <ServerName>MYDBSERVER</ServerName> <LoginName>CAMELOT\jpapa</LoginName> <UserName>dbo</UserName> <DatabaseName>AdventureWorks</DatabaseName> <SchemaName>dbo</SchemaName> <ObjectName>test1</ObjectName> <ObjectType>PROCEDURE</ObjectType> <TSQLCommand> <SetOptions ANSI_NULLS="ON" ANSI_NULL_DEFAULT="ON" ANSI_PADDING="ON" QUOTED_IDENTIFIER="ON" ENCRYPTED="FALSE" /> <CommandText>create proc test1 as select getdate()</CommandText> </TSQLCommand> </EVENT_INSTANCE>
Figure 5 DDL Trigger Fires Based on Stored Procedure
CREATE TRIGGER ddlTrigger_ProcNamingConvention ON DATABASE AFTER CREATE_PROCEDURE, ALTER_PROCEDURE AS DECLARE @eventDataXml XML SET @eventDataXml = EVENTDATA() IF SUBSTRING(@eventDataXml.value('(//ObjectName)[1]', 'VARCHAR(200)'), 1, 2) = 'pr' PRINT 'starts with pr' ELSE BEGIN PRINT 'does not start with pr' ROLLBACK TRANSACTION END GO
You can also take the event data and store it in its entirety in an auditing table's XML column. Another option would be to pull out the specific node's contents and store them individually. Obviously the implementation depends on the situation, but your options are numerous with DDL triggers. One thing to keep in mind when using DDL triggers is that they operate synchronously. Because of the synchronous execution, a trigger should not execute time-consuming operations if at all possible. Otherwise they can adversely affect performance of the database server. Just like DML triggers, DDL triggers should be defined to do what they need to do and end quickly. If you need asynchronous execution, you can use the Service Broker and Event Notification models.
Send your questions and comments for John to mmdata@microsoft.com.
John Papa is a Senior .NET Consultant with ASPSOFT (aspsoft.com) and a baseball fanatic who spends most of his summer nights rooting for the Yankees with his family and his faithful dog, Kadi. John, a C# MVP, has authored several books on ADO, XML, and SQL Server, and can often be found speaking at industry conferences such as VSLive.