Data Points

Exploring the ADO.NET DataRow

John Papa

Code download available at:DataPoints0310.exe(136 KB)

Contents

Creating and Importing Rows
Locating Rows
Summing It Up

ADO.NET provides a rich programming interface that allows developers to interact with rows through the DataTable, DataRowCollection, and the DataRow objects. Because I receive many questions regarding DataRow objects in ADO.NET, I have combined several of them and will present the answers in this month's installment of Data Points. I'll explore how you can manipulate data using the DataRow and focus on importing loading, traversing, locating, and examining data in ADO.NET and C#.

Creating and Importing Rows

The DataTable exposes several methods that you can use to manipulate DataRow objects. They include ImportRow, NewRow, Select, and LoadDataRow. I will use these methods in a sample Windows® Forms application to show you how to grab one or more rows from one DataSet and add them to a second DataSet. I'll also show you how to create and load a DataSet manually.

To begin, I load a DataSet directly from the SQL Server™ Northwind database Employee table (see Figure 1) and bind it to a DataGrid (see Figure 2). After loading the DataSet, I create a primary key constraint for the DataTable on the EmployeeID column. Since this column is an IDENTITY column in the Northwind database, I also set the AutoIncrement property to true and both the AutoIncrementSeed and AutoIncrementStep properties to -1. This tells the DataTable that whenever a row is added to this DataTable, the EmployeeID DataColumn's value should be generated by the DataTable, starting with a value of -1 and decreasing by 1 for each row. Thus, EmployeeID values in any added rows would have a value of -1, -2, -3 and so on. This is important because I can't know what EmployeeID value the database will assign when I send new rows to be inserted. Therefore, I assign a value that I know can't be assigned by the database (negative numbers). Then, when I send the data to the database to be inserted, I will ignore the EmployeeID value and will retrieve the newly created and assigned value from the database. (For more information please refer to the Data Points column in the July 2003 issue of MSDN® Magazine, "Managing Hierarchical Inserts in ASP.NET and ADO.NET").

Figure 2 Get Employees and Bind to DataGrid

m_oCn = new SqlConnection(m_sCn); //-- Set up the SELECT Command string sSQL = "SELECT EmployeeID, FirstName, LastName FROM Employees"; SqlCommand oSelCmd = new SqlCommand(sSQL, m_oCn); //-- Get the Employee records oSelCmd.CommandType = CommandType.Text; m_oDa.SelectCommand = oSelCmd; m_oDa.Fill(m_oDs1, "Employees"); //-- Set the primary key m_oDs1.Tables["Employees"].Constraints.Add("PK_Employees", m_oDs1.Tables["Employees"].Columns["EmployeeID"], true); m_oDs1.Tables["Employees"].Columns["EmployeeID"]. AutoIncrement = true; m_oDs1.Tables["Employees"].Columns["EmployeeID"]. AutoIncrementSeed = -1; m_oDs1.Tables["Employees"].Columns["EmployeeID"]. AutoIncrementStep = -1; //-- Bind grdEmployees1.DataSource = m_oDs1.Tables["Employees"];

Figure 1 Loading a DataSet

Next, I create a second DataSet manually using code, then add two rows of employees to it (see Figure 3). In this code sample I create a DataTable and add DataColumn objects to it that match the DataColumn objects from the first DataSet, m_oDs1. I also add a primary key constraint to this table to keep the two DataTables and DataSets in sync. However, notice that I set the AutoIncrementSeed to -1000 (see Figure 3). That way, when I take the rows from this manually created DataTable and copy them to the DataTable derived from the database, none of the rows will have duplicate EmployeeIDs. This is a simple, yet effective, solution for avoiding non-unique key values.

Figure 3 Creating a DataSet Manually

DataTable oDt = new DataTable("Employees"); DataRow oRow; //-- Create the table oDt.Columns.Add("EmployeeID", System.Type.GetType("System.Int32")); oDt.Columns.Add("FirstName", System.Type.GetType("System.String")); oDt.Columns.Add("LastName", System.Type.GetType("System.String")); oDt.Constraints.Add("PK_Employees", oDt.Columns["EmployeeID"], true); oDt.Columns["EmployeeID"].AutoIncrement = true; oDt.Columns["EmployeeID"].AutoIncrementSeed = -1000; oDt.Columns["EmployeeID"].AutoIncrementStep = -1; m_oDs2.Tables.Add(oDt); //-- Add the rows oRow = m_oDs2.Tables["Employees"].NewRow(); oRow["FirstName"] = "Lloyd"; oRow["LastName"] = "Llamas"; m_oDs2.Tables["Employees"].Rows.Add(oRow); oRow = m_oDs2.Tables["Employees"].NewRow(); oRow["FirstName"] = "Layla"; oRow["LastName"] = "Lewis"; m_oDs2.Tables["Employees"].Rows.Add(oRow); grdEmployees2.DataSource = m_oDs2.Tables["Employees"];

Next, I create a new row for the manually created DataTable using the NewRow method (again, see Figure 3). This may seem awkward because what is really happening here is that to add a new row to a DataTable I actually have to ask the DataTable for a new row, set its values, and then add the new row to the DataTable. Thus, when I call the NewRow method, the newly created row is actually detached from the DataTable. To attach it, it must then be passed to the Add method of the DataTable's Rows collection.

The following code can be executed to import the rows from the second DataSet (the manually created one) into the first DataSet (from the Northwind database).

//-- Import the rows foreach (DataRow oRow in m_oDs2.Tables["Employees"].Rows) { m_oDs1.Tables["Employees"].ImportRow(oRow); }

This code loops through the two rows I added to the manually created DataTable and imports them via the ImportRow method one at a time into the DataSet that is derived from the Northwind Employee table.

That's all you need to import data. However, there are some important aspects of imported data to keep in mind. When data is imported from a DataTable into another DataTable, the RowState is also imported. To demonstrate this, I added two additional DataGrid objects to the Windows Form (see Figure 1). I added a second DataTable to store the RowState for each row to the first DataSet, which already contained a DataTable of employees from Northwind. Thus there are two DataTable objects in the first DataSet m_oDs, one referring to the employees and another that has the same number of rows as the first one. The second DataTable, which I call State, is used simply to store the RowState of each row from the Employee DataTable.

The second DataSet already contains a DataTable to store the employees that I manually created with code. I added a second DataTable to it to store the RowState for each of its rows. The purpose here is to demonstrate how the RowState of a row is maintained by the ImportRow method. The RowStates of the rows retrieved from the database's Employees table and stored in the first DataSet m_oDs1 are all set to "unchanged" since I do nothing to them. The RowState values of the rows that I manually added to the second DataSet m_oDs2 are both "added," since I added them myself. Thus, when I import these two rows into the m_oDs1 DataSet, the RowState for the two new rows is still "added."

As a side note, if I had invoked the AcceptChanges method on the first DataSet after I added the two rows, their RowState values would have been set to "unchanged." Therefore, when I later imported them into the m_oDs2 DataSet, their RowStates would also be "unchanged." When invoked on a DataTable, the AcceptChanges method tells the DataTable that the original values of each DataRow should be set to whatever value is current, producing an "unchanged" RowState for each row. The full code for this example includes the code to keep track of the DataGrid RowState as well.

LoadDataRow is the other method worth noting while I'm on this topic. The ImportRow method essentially copies one DataRow from a DataTable into a second DataTable, assuming both DataTable objects have the same columns. The LoadDataRow method can also be used to copy rows. The following code shows how ImportRow was used in the previous code sample to copy a DataRow from one DataTable to another DataTable:

//-- Import Row m_oDs1.Tables["Employees"].ImportRow(oRow);

This can also be accomplished by using the LoadDataRow method, as shown here:

//-- Load Data Row object[] aRowValues = {oRow["EmployeeID"], oRow["FirstName"], oRow["LastName"]}; m_oDs1.Tables["Employees"].LoadDataRow(aRowValues, false);

Since rows are simply being copied, ImportRow is more practical when two DataTable objects exist with the same columns. However, LoadDataRow yields a little more flexibility. For example, if the DataTable objects do not have the same columns, LoadDataRow can be used whereas ImportRow cannot. This is because the LoadDataRow method accepts an array of values to insert, in this case, into the DataTable. Thus, if there were different columns in the two DataTable objects, LoadDataRow could simply grab the appropriate values in the proper order from the second DataTable to be inserted into the first DataTable.

However, the LoadDataRow method serves another purpose beyond inserting a row into a DataTable. In fact, LoadDataRow is most useful when it is necessary to update an existing row if one exists or to add a row if it does not yet exist. If the key value for the row is found, LoadDataRow will update that row's values instead of inserting a new row. This is just another way in which LoadDataRow shows greater flexibility over ImportRow, since in this scenario ImportRow would raise an exception if a row already existed with a duplicate key value.

Locating Rows

The next code example addresses several questions I received regarding how to find a row. Since there are several ways to find rows using ADO.NET, I'll walk through examples for three of the most prominent techniques: the Find, Contains, and Select methods. All three of these methods can be used to locate specific rows; however, they accomplish this task in very different ways.

The Find method looks for a row by the primary key value of the row. If a row is found, it is returned; if no row is found, a null value is returned. This Find method is very useful when you need to locate a row and then perform an action on it. I used the Find method in the first example in this column when I synchronized the employee DataGrid with its RowState DataGrid. In that example, I had a common value in both DataGrid objects—the EmployeeID. I created a primary key constraint on both of the DataGrid objects' underlying DataTable objects using their EmployeeID DataColumns. By doing this, I was able to loop through the rows of the employee DataTable and set the corresponding state DataTable's row to the appropriate value. Notice how the following code sample loops through each employee in the Employees DataTable and uses the Find method to retrieve the matching row from the State DataTable.

DataRow oStateRow; foreach (DataRow oRow in m_oDs1.Tables["Employees"].Rows) { oStateRow = m_oDs1.Tables["State"].Rows.Find(oRow["EmployeeID"]); oStateRow["RowState"] = oRow.RowState; }

In this example, for the Find method to work properly there must be a primary key. The primary key constraint must exist on the EmployeeID DataColumn; otherwise the Find method will raise an exception. The Find method relies on the constraint to locate the desired row. Since the Find method requires a primary key constraint, there is no need to worry about locating more than one row. If the DataTable had multiple DataColumn objects that made up its primary key resulting in a compound key, I could have simply passed in an array of values representing the key values to the Find method.

If I don't need to locate and perform an action on the row, but rather just need to know if a row exists in the DataTable that matches a primary key value, I could use the Contains method. It is very useful when you need to check for the existence of a row but not retrieve it. The Contains method works like the Find method in that it can be passed a single value or an array of values, depending on how many columns make up its primary key. Contains relies on the existence of a primary key just like Find, so it is important to make sure that one exists before going down this road. Contains yields a simple way to check for the existence of a row while Find not only checks for the row, it returns it.

The third technique is to use the Select method of the DataTable object. It returns an array of DataRow objects that meet the specified criteria. The Select method has many overloaded forms, so it can be filtered using a string expression, returned in a particular sort order, or it can be filtered to show only certain row states.

One key difference between the Select and Find methods is that Find searches for one row that matches the primary key value(s) passed in. This Find method may return zero rows or one row while the Select method may return zero or more rows. Select is more flexible in this respect since it is not dependent upon the existence of a primary key constraint. It searches for all rows that meet the criteria and it returns them in an array of DataRow objects. To compare the Find and Select methods, in the following code snippet I have shown the code that uses Find from the previous example application as well as what the replacement code would look like using Select:

//-- Using the Find method oStateRow = m_oDs1.Tables["State"].Rows.Find(oRow["EmployeeID"]); oStateRow["RowState"] = oRow.RowState; //-- Using the Select method DataRow[] aRow = m_oDs1.Tables["State"].Select("EmployeeID = " + oRow["EmployeeID"]); aRow[0]["RowState"] = oRow.RowState;

For this example, Find is more efficient since I am always looking for a single row. Select can be used; however, it does not have the advantage of using only the primary key value. Also, notice how in the sample code where I used the Select method, I hardcoded the row index to the first row (index of 0). I could have looped through the returned rows, but since I know that there is only going to be one row, I took the risk of hardcoding it. Obviously, this is not the best technique when looking for a single row.

The Select method can be very effective, however, when searching for an unknown number of rows. For example, if I wanted to find all employees from Seattle (assuming the City column was in my DataTable) I could easily do so by using the following code:

DataRow[] aRow = oDs.Tables["Employees"].Select("City = 'Seattle'");

I could also sort the retrieved DataRow objects in the array by specifying the sort order parameter of the Select method as follows:

DataRow[] aRow = m_oDs2.Tables["Employees"].Select("City = 'Seattle'", "LastName");

This would return the two employees who live in Seattle, sorted by their last names. The Select method can also filter rows using the DataViewRowState. The following code would return the employees from Seattle who were just added, sorted by last name:

DataRow[] aRow = m_oDs2.Tables["Employees"].Select("City = 'Seattle'", "LastName", DataViewRowState.Added);

Summing It Up

This column demonstrated how to retrieve rows, check to see if a row exists, locate a specific row based on key value, create tables manually, create a new row, and copy rows between DataTable objects. It also touched on DataRowState and how it is maintained when a row is imported using the ImportRow method. The DataRowState and the DataViewRowState are both instrumental in ADO.NET on a number of fronts, especially when dealing with concurrency, as I will discuss in a future Data Points column.

Send your questions and comments for John to  mmdata@microsoft.com.

John Papa is a baseball fanatic who spends most of his summer nights rooting for the Yankees with his two little girls, wife, and faithful dog, Kadi. He has authored several books on ADO, XML, and SQL Server and can often be found speaking at industry conferences such as VSLive. You can reach him at mmdata@microsoft.com.