Data Points

Handling Data Concurrency Using ADO.NET, Part 2

John Papa

Code download available at:DataPoints0410.exe(145 KB)

Contents

Planning for Concurrency
Save What You Can!
Using DataViewRowState
Turning the Other Cheek
Conclusion

Enterprise development has been moving towards a discon-nected model in recent years and ADO.NET development is no exception. While the disconnected model of the ADO.NET DataSet offers great flexibility, that adaptability also means looser control over data updates than you get with a connected data access model. One related issue is data concurrency. This month I'll focus on concurrency violations that occur when multiple rows could be saved to the database in a single batch.

When a concurrency violation occurs, the SqlDataAdapter's ContinueUpdateOnError property tells the SqlDataAdapter to either throw an exception or ignore the concurrency violation. I will demonstrate how to show exactly why a concurrency violation occurred by setting the ContinueUpdateOnError property to true and using a series of DataViews bound to ASP.NET DataGrid objects. These DataGrids will use the DataViewRowState enumerators to show what changes have been made to a DataSet.

The SqlDataAdapter also has some events, including RowUpdating and RowUpdated, which can come in handy. To illustrate their use, I will also show how to deal with concurrency by creating a handler for the RowUpdated event.

In last month's Data Points column, I discussed how to deal with concurrency violations and ran through an example of a violation that occurred when only a single row was updated at a time (see Data Points: Handling Data Concurrency Using ADO.NET). In that situation, setting the SqlDataAdapter's ContinueUpdateOnError property to false told the SqlDataAdapter to throw an exception as soon as a concurrency violation was encountered. This technique is ideal when only saving a single row or when you are attempting to save multiple rows and want them all to commit or fail. However, if you want to save the rows that do not encounter concurrency violations and display the rows that do cause concurrency violations, you must use another technique.

Planning for Concurrency

The examples in this column use optimistic concurrency; therefore, the data is only locked during the actual update. This leaves plenty of time for users to read that same data, try to update it, and collide with each other. As such, it is important to implement concurrency violation detection which catches attempts to modify records by a user that have already been modified since the last time the data was retrieved by that user.

Instead of dictating how to handle a concurrency violation, ADO.NET lets the developer decide how to deal with it. ADO.NET throws a data concurrency exception (DBConcurrencyException) when an UPDATE statement affects zero rows. If an UPDATE statement's WHERE clause contains only the primary key(s), it will implement the last-in wins approach, which overwrites whatever values exist in the row. The problem with this technique is that a user's changes can be overwritten and lost without any warning.

By adding another condition to an UPDATE statement's WHERE clause in addition to the primary key(s), a concurrency violation can be caught. There are a few ways to accomplish this, such as adding a timestamp column or a datetime column to the table and using it in the WHERE clause of the UPDATE statement. If you use a SQL Server TIMESTAMP column, it will automatically update itself with a new value every time a value in its row is modified. If you use a DATETIME column, you will have to update its value in every update either explicitly or by another means, such as an AFTER UPDATE trigger, as shown here:

CREATE TRIGGER trEmployees_IU ON Employees AFTER INSERT, UPDATE AS UPDATE Employees SET LastUpdateDateTime = getdate() FROM Employees INNER JOIN inserted ON Employees.EmployeeID = inserted.EmployeeID

For the examples here, I altered the Employee table by adding an additional versioning column called LastUpdateDateTime and had its value updated by the trigger trEmployee_IU. Using a TIMESTAMP or ROWVERSION column eliminates the need for the trigger, but the DATETIME column solution is easier to demonstrate as it shows an actual date and time to the user. So for the examples here, I will continue to use a DATETIME column, although in a production application I encourage you to consider using a TIMESTAMP column instead.

The next steps are to make sure that UPDATE statements use both the primary key columns(s) and the versioning column and to return the versioning column's value from the UPDATE statement (see Figure 1). The versioning column's value should be returned via the SqlDataAdapter and sent back to the DataSet. If the UPDATE works, this sets the DataSet to the updated versioning column's value, which it will need if it attempts to update the row again. To have the value of the versioning column sent back to the DataSet, a number of key steps must be executed:

  1. The value of the versioning column (LastUpdateDateTime) is set to the current date and time.
  2. The versioning column is returned from the stored procedure via an output parameter.
  3. The SqlDataAdapter maps the output parameter to the LastUpdateDateTime column in the DataSet using the following lines of code:
oUpdCmd.Parameters.Add(new SqlParameter("@NewLastUpdateDateTime", SqlDbType.DateTime, 8, ParameterDirection.Output, false, 0, 0, "LastUpdateDateTime", DataRowVersion.Current, null)); oUpdCmd.UpdatedRowSource = UpdateRowSource.OutputParameters;

Figure 1 Using a Versioning Column in an UPDATE

CREATE PROCEDURE prUpdate_Employee @EmployeeID INT, @LastUpdateDateTime DATETIME, @FirstName NVARCHAR(10), @LastName NVARCHAR(20), @Title NVARCHAR(30), @BirthDate DATETIME, @HireDate DATETIME, @Extension NVARCHAR(4), @NewLastUpdateDateTime DATETIME OUTPUT AS SET @NewLastUpdateDateTime = getdate() UPDATE Employees SET FirstName = @FirstName, LastName = @LastName, Title = @Title, BirthDate = @BirthDate, HireDate = @HireDate, Extension = @Extension, LastUpdateDateTime = @NewLastUpdateDateTime WHERE EmployeeID = @EmployeeID AND LastUpdateDateTime = @LastUpdateDateTime IF @@ROWCOUNT = 0 SET @NewLastUpdateDateTime = @LastUpdateDateTime RETURN

Once you have determined how you will detect data concurrency violations, you need to decide how to deal with them. ADO.NET doesn't automatically deal with concurrency violations, but gives you the flexibility to decide how to do so based on your own situation. You can ignore the violation and just update what you can, you can cancel all changes and give the user the choice of how to react (see Figure 2), or you can give the user no choice and predefine how to react. In the sample application you will see some of the options that can be presented to the user in the event of a data concurrency violation.

Figure 2 Options for Data Concurrency Violations

Figure 2** Options for Data Concurrency Violations **

As you saw in Figure 2, one option is to cancel changes and revert back to the state of the data from the last time it was retrieved from the database. A second option is to cancel the changes and reload the data from the database. The difference is that in the latter, the data will reflect any changes that another user might have made since the last time this user got the data. The final option is to save the changes. This option will only save rows that do not cause a concurrency violation. Because I don't want another user's data changes to be overwritten inadvertently, I don't want to provide the user with the last-in wins option.

Save What You Can!

Using this sample application I will demonstrate some ways that you can detect concurrency violations and notify the user. The application attempts to save all rows that were modified. When a row is being updated to the database, if an error occurs the app will note it and continue updating the rest of the rows. In other words, let's say a user modifies five rows and two of those cause errors. These two will not be saved while the other three will be saved properly. So let's start with a scenario in which two users browse to the screen that loads a list of employees that can be edited, as you saw in Figure 2. User 1 changes five rows (EmployeeID's 1 through 5), clicks the Save button, and is notified that all of the rows were saved successfully. Then User 2 changes five rows (EmployeeID's 3 through 7), clicks the Save button, and receives notification that a concurrency violation has occurred. For simplicity, I only change the FirstName field in each row, with User 1 appending a 1 and User 2 appending a 2 to the end of the first name values.

I've also added two radio buttons to the application to show two ways to deal with concurrency. Both techniques have the same net effect to the user, which is to save the rows that it can and to notify the user with as much information as possible about which rows could not be saved due to concurrency violations. Figure 3 shows the information displayed after User 2 clicks the Save button in the scenario I just described.

Figure 3 Different Kinds of Data Errors

Using DataViewRowState

In this example, the rows listed as EmployeeID 3, 4, and 5 were changed by User 1. So when User 2 also saved them, a concurrency violation was encountered. Notice in Figure 3 that there are five ASP.NET DataGrid objects displayed. Each DataGrid shows a different view of the data that errored out because of the concurrency violations. The ShowConcurrency method that is shown in Figure 4 contains the code that gathers the data that will be displayed in each of the DataGrids.

Figure 4 Showing Concurrency Violations

private void ShowConcurrency(dsEmployee.EmployeeDataTable oTblEmp) { if (oTblEmp.HasErrors) { // Store modified versions of the rows that saved with no problems, // then modified versions that had concurrency violations dsEmployee oDsModifiedOK = (dsEmployee)oDsAllEmployees.GetChanges(); dsEmployee oDsModified = new dsEmployee(); dsEmployee oDsInDB = new dsEmployee(); // Loop through rows with errors foreach (dsEmployee.EmployeeRow oDr in oTblEmp.GetErrors()) { // Remove the row from ModifiedOk and add it to Modified, then // get matching row from db and add it to the InDB DataSet oDsModifiedOK.Employee.Rows.Remove( oDsModifiedOK.Employee.Rows.Find(oDr.EmployeeID)); oDsModified.Employee.ImportRow(oDr); Employee oEmployee = new Employee(); oDsInDB.Employee.ImportRow(oEmployee.GetEmployee(oDr.EmployeeID)); } lblModifiedAll.Visible = lblModifiedOK.Visible = lblErroredModifiedCurrent.Visible = lblErroredInDB.Visible = lblErroredModifiedOriginal.Visible = true; // Bind grdModifiedAll DataGrid to all of the modified rows // Bind grdModifiedOK DataGrid to modified rows saved without error grdModifiedAll.DataSource = new DataView(oDsAllEmployees.Employee, null, null, DataViewRowState.ModifiedCurrent); grdModifiedOK.DataSource = new DataView(oDsModifiedOK.Employee, null, null, DataViewRowState.ModifiedCurrent); grdModifiedAll.DataBind(); grdModifiedOK.DataBind(); // Bind grdErroredModifiedOriginal to modified rows saved // with errors. Show only original version of row. Same for // grdErroredModifiedCurrent but show only current version. grdErroredModifiedOriginal.DataSource = new DataView( oDsModified.Employee, null, null, DataViewRowState.ModifiedOriginal); grdErroredModifiedCurrent.DataSource = new DataView( oDsModified.Employee, null, null, DataViewRowState.ModifiedCurrent); grdErroredModifiedOriginal.DataBind(); grdErroredModifiedCurrent.DataBind(); // Bind grdErroredInDB to rows currently in db that are the // counterparts for those with errors grdErroredInDB.DataSource = new DataView( oDsInDB.Employee, null, null, DataViewRowState.CurrentRows); grdErroredInDB.DataBind(); } }

The first DataGrid, "All Rows I Tried To Save," shows the five rows and their values that User 2 tried to save to the database. In order to display these rows and the values that are sent to the database, I created a DataView from the main employee DataSet (oDsAllEmployees) and filtered it so that only the current values of modified rows would be in the DataView. I set the DataGrid's Data source property to this new DataView and rebound the DataGrid using the following code:

// Bind the grdModifiedAll DataGrid to all of the modified rows grdModifiedAll.DataSource = new DataView(oDsAllEmployees.Employee, null, null, DataViewRowState.ModifiedCurrent); grdModifiedAll.DataBind();

The next DataGrid, "Rows that Saved OK," shows all of the rows that User 2 saved successfully, without errors. To show these rows, I created a DataSet called oDsModifiedOK and set it to contain all of the modified rows from the main DataSet using the DataSet's GetChanges method. Then, I looped through the Employee DataTable's rows that contain errors and picked out the EmployeeID for each row that had errors. The DataTable object has a method called GetErrors which returns an array of DataRow objects from the DataTable that contain errors. In this example, the array will contain the three rows that encountered concurrency violations. I then locate the same row for that employee by passing the EmployeeID to the Find method and then remove the row from the oDsModifiedOK DataSet. When the loop completes, only the rows that saved successfully will be left in the oDsModifiedOK DataSet.

The third, fourth, and fifth DataGrids show only the rows that encountered concurrency violations. Each DataGrid shows the same rows, with different values. The third DataGrid, "Rows in Error—Modified Original," shows the three rows and their values as they existed when User 2 last grabbed them from the database.

The fourth DataGrid, "Rows in Error—Modified Current," shows the three rows and their values as they existed after User 2 made changes to them. I created a DataSet called oDsModified and imported the rows that contained errors by looping through the GetErrors array and calling the ImportRow method of the Employee DataTable, as shown in Figure 4. The third DataGrid is bound to a DataView and filtered to show only the rows that match the DataViewRowState.ModifiedOriginal enumerator value.

The fourth DataGrid is bound to a DataView and filtered to show only the rows that match the DataViewRowState.ModifiedCurrent enumerator value. The DataViewRowState filter comes in handy here as it allows me to take a single DataTable and show both its original values and it current values in two different DataGrid objects via two DataView objects.

The final DataGrid, "Rows in Error—In Database," shows the same three rows that encountered concurrency violations as they exist in the database. This DataGrid is key as it shows how the values of the FirstName columns have changed since User 2 last retrieved them from the database. Thus, when User 2 tried to save the values shown in the "Rows in Error—Modified Current" DataGrid, the prUpdate_Employee stored procedure (shown in Figure 1) unsuccessfully tried to match the EmployeeID and LastUpdateDateTime fields between what is shown in the "Rows in Error—Modified Original" and "Rows in Error—In Database" DataGrids. Thus, no rows were updated and concurrency violations were raised for these rows. While I have shown much more information than the user would necessarily want to see in this example, it should help illustrate how concurrency violations occur and the flexibility that ADO.NET offers in dealing with them.

Turning the Other Cheek

In last month's column I demonstrated how to halt and roll back a transaction when any row in the batch update failed. The example here ignores changes to any rows that encounter an error during the update. There are two easy ways to implement this behavior in ADO.NET. One is to set the SqlDataAdapter's ContinueUpdateOnError property to true. This tells the SqlDataAdapter that if an error is encountered, it should not throw an exception. Rather, the SqlDataAdapter will skip the update of the offending DataRow, log the error condition in the offending DataRow's RowError property, and continue updating the rest of the rows in the batch update. A second way to implement this behavior is to use the SqlDataAdapter's RowUpdated event.

The RowUpdating event is raised just before a DataRow is about to be sent to the database to be updated. The RowUpdated event, on the other hand, is raised just after the row was updated in the database. Each of these events can be used to modify the updating behavior before and after it happens, to grab a reference to an updated row, or even to cancel the current update.

The RowUpdated event is less efficient than setting the ContinueUpdateOnError property to true. However, it does expose additional functionality such as exposing the type of action query being performed on the row, whether an error occurred, and the entire row that is being updated. Unless you need to take advantage of the added features, I suggest avoiding the RowUpdated event when managing concurrency.

When setting the ContinueUpdateOnError property to true you respond to errors after the update is finished. The RowUpdated event allows you to respond to errors right when the error is encountered. You can alter the behavior of the update by setting the RowUpdatingEventArgs argument's Status property to one of the settings listed in Figure 5.

Figure 5 Status Settings

Status Description
Continue Continue the update as planned
ErrorsOccurred Abort the update of all rows and throw an exception
SkipAllRemainingRows Abort the update of all rows but do not throw an exception
SkipCurrentRow Ignore the current row and continue the update of the remaining rows

In the sample application, if you select the radio button that specifies that the code should use the RowUpdated event handler, it will associate the event handler with the RowUpdated event of the SqlDataAdapter, as you can see here:

oDa.RowUpdated += new SqlRowUpdatedEventHandler(oDa_OnRowUpdated);

The event handler I created (oDa_OnRowUpdated) executes following each individual row that is updated. It checks the status of the update and if it detects that an error occurred, it sets the offending row's RowError property to the error message. Then, it changes the status from UpdateStatus.ErrorsOccurred to UpdateStatus.SkipCurrentRow. This tells the SqlDataAdapter not to throw an exception, but rather to continue trying to update the rest of the rows to the database (thus emulating the behavior of setting ContinueUpdateOnError to true):

private void oDa_OnRowUpdated(object sender, SqlRowUpdatedEventArgs e) { // If an error occurred during the update ... if (e.Status == UpdateStatus.ErrorsOccurred) { // Grab the error message and associate it with the // current row that was updated e.Row.RowError = e.Errors.Message; // ignore error; keep processing updates if there are more e.Status = UpdateStatus.SkipCurrentRow; } }

Conclusion

Handling data concurrency issues is especially important when using a disconnected data store like the ADO.NET DataSet object. If an application using the DataSet has the ability to save several rows at once to the database, then it must be prepared to handle the threat of concurrency violations, which can easily occur any time more than one user is accessing the data store.

One way to handle such concurrency violations is to save all rows that can be saved and show the user the rows that could not be saved due to the violation. Another way is to save none of the rows in the batch, basically implementing an atomic transaction. ADO.NET provides a variety of ways to handle concurrency, but choosing the right technique for your app is still up to you.

Send your questions and comments for John to  mmdata@microsoft.com.

John Papa is a baseball fanatic who spends most of his summer nights rooting for the Yankees with his two little girls, wife, and faithful dog, Kadi. He has authored several books on ADO, XML, and SQL Server and can often be found speaking at industry conferences such as VSLive.