Multi-User Considerations in Data Synchronization for SQL Server 2005 Mobile Edition 3.0

 

Gil Lapid Shafriri
Debra Dove

Microsoft Corporation

March 2005

Summary: Gain a better understanding of how to develop your SQL Server Mobile applications to handle multi-user access and data synchronization with Microsoft SQL Server 2005. (6 printed pages)

Contents

Introduction
Locking
Subscriber Conflict Detection
Column-Level Tracking
Referential Integrity Violations
Remote Data Access and Multi-User Concepts
Remote Data Access and Primary Key Update Caveat

Introduction

Microsoft SQL Server Mobile Edition allows multiple users to concurrently access the same database. Multi-user access means that multiple threads or processes may be accessing a database while one thread is synchronizing that database with SQL Server. This functionality enables background synchronization, so a user can continue to work on a database while the database is synchronizing with the Publisher. However, this creates interesting complications that you will need to address as you build your applications. The following sections highlight common problems that may occur as a result of using multi-user database access in your application.

Locking

During synchronization, locks are held in different stages of the synchronization. Synchronization may fail if a lock can’t be acquired when it is needed. The locks that are acquired during synchronization are as follows:

  • A database exclusive lock is held at the beginning of the synchronization for updating the internal global synchronization data to ensure that the correct changes will be collected in the next synchronization. The lock is released immediately after the data is updated.
  • A row exclusive lock is held on each row that is uploaded to the Publisher. The lock is released after the uploaded row has been retrieved from the subscription database.
  • A row exclusive lock is held on each row that is applied to the Subscriber from the Publisher. The lock is released immediately after applying the row to the Subscriber.
  • A table DDL lock is held if any schema change is applied to the Subscriber during synchronization. The lock is held until the synchronization ends.
  • A DDL lock is held on all replicated user tables during re-initialization. The lock is held until the synchronization ends.

The first three locks in this list allow other threads and processes to read data from the database, but prevent writing data. The last two locks prevent both reading and writing. To prevent synchronization failures when modifying data during synchronization, make your transactions as short-lived as possible.

The following example demonstrates a potential locking problem and a good solution for this type of problem:

The Subscriber opens a transaction on a replicated table and modifies a few rows. The transaction is held uncommitted. Background synchronization starts and an attempt is made to acquire a database lock (see the first lock in the list above). Because there is a pending transaction on the database, the lock cannot be acquired, and the synchronization fails. To resolve this, you must commit the transaction immediately after modifying the rows.

Subscriber Conflict Detection

If a Subscriber begins synchronization with the Publisher, and during that synchronization a row in the subscription database is changed or deleted, there may be Subscriber conflict. A Subscriber conflict only occurs if the Subscriber changes occur during synchronization and those changes conflict with Publisher changes that are being applied during the same synchronization. When this occurs, the SubscriberConflicts property is set to a non-zero value equaling the number of rows with a conflict. In all other cases, namely Subscriber changes which occur outside of synchronization, conflicts are detected and immediately resolved at the Publisher.

If your application allows users to make changes to the subscription database during synchronization, it must also detect non-zero values for the SubscriberConflicts property. If a non-zero value is detected, your application needs to resynchronize with the Publisher.

The following scenario shows how a Subscriber conflict can occur. In this scenario, you have a published table and a Subscriber that has synchronized previously with the publication.

  1. Row X in the published table is changed at the Publisher.
  2. The Subscriber application begins synchronization with the Publisher.
  3. During synchronization, another thread in the Subscriber application deletes row X from the subscription database.
  4. Also during synchronization, the Publisher sends the Subscriber the change for row X. This command is detected as a conflict on the Subscriber, because row X was deleted.
  5. The Publisher row is not applied at the Subscriber, and the SubscriberConflicts property is set to 1.
  6. The Subscriber resynchronizes with the Publisher and sends the delete command for row X. This delete is detected as a conflict with the Publisher’s version of row X. If the default conflict resolver is used, the Subscriber's version of the row (the delete) is logged in the conflict table and the Publisher sends its version of row X to the Subscriber.
  7. The Subscriber applies the Publisher's row.

Column-Level Tracking

In general, when a Subscriber changes specific columns in a column-tracked table, only those columns will be uploaded to the Publisher. However, if the row is updated during synchronization, then all modified columns, including those that were modified before the synchronization, will be uploaded to the Publisher on the next synchronization. The reason is that the column bit mask on the row, used for column tracking, cannot be cleared after the first synchronization because the row was modified during the synchronization. The effect of this is a potential conflict in what seems to be a "mergeable" situation.

For example, consider the following scenario:

A publication includes a column-tracked table T, which has two columns, named COL1 and COL2. The Subscriber has synchronized previously with the publication.

  1. The Subscriber changes COL1 on row X.
  2. The Subscriber synchronizes with the Publisher.
  3. During synchronization, COL1 for row X is uploaded to the Publisher and applied successfully.
  4. Also during synchronization, the Subscriber changes COL2 on row X.

At this point, there are no conflicts. The changes to COL1 that occurred before synchronization have been applied. However, during the next synchronization, the changes to both COL1 and COL2 will be synchronized. If a change has subsequently occurred on the Publisher to COL1 of row x, there will be a conflict:

  1. After the first synchronization, COL1 of row X is changed on the Publisher.
  2. The Subscriber synchronizes with the Publisher.
  3. The Subscriber uploads both COL1 and COL2 for row X to the Publisher.
  4. A conflict is detected and resolved on the Publisher, because COL1 was changed on both the Subscriber and the Publisher. Namely, if the default conflict resolver is being used, the Subscriber’s version of the entire row is logged in the conflict table and the Publisher’s version of the entire row is applied to the Subscriber. Row X at the Subscriber no longer contains the changes to COL1 or COL2 for row X.

Note that if the Subscriber changed COL2 after (not during) the first synchronization, then COL1 would not be uploaded to the server during the second synchronization, and the Subscriber change to COL2 would have successfully merged into the Publisher row, and the Publisher change to COL1 would have been merged into the Subscriber row.

Referential Integrity Violations

Changes made to the subscription database during synchronization can cause Publisher rows to fail to be applied on the Subscriber because of referential integrity (RI) violations. In that case, the Subscriber will automatically initiate an additional synchronization to try to fix the RI error. Referential integrity errors are not considered conflicts by the Publisher, but instead are treated as errors. The publication property @compensate_for_errors dictates how RI errors are handled.

If @compensate_for_errors=true, then a compensating row from the Publisher will be sent to the Subscriber in order to fix the error.

If @compensate_for_errors=false, then synchronization fails, and your application must fix the RI error, or the user must fix it manually, before subsequent synchronizations will succeed.

The following scenario demonstrates how to use @compensate_for_errors. In the scenario, you have a publication with table T. Table T has a primary key. The Subscriber has previously subscribed to this publication.

  1. The Publisher inserts a new row to table T with a primary key value of X.
  2. The Subscriber synchronizes with the Publisher.
  3. During synchronization, a new row is inserted at the Subscriber, also with a primary key value of X.
  4. The Publisher sends the row with a primary key value of X to the Subscriber. This row fails to be applied because a row with a primary key value of X already exists in the Subscriber table.
  5. The Subscriber initiates another synchronization to fix the error.
  6. The Subscriber row is uploaded to the Publisher.
  7. The Subscriber row fails to be applied on the Publisher because a row with a primary key value of X already exists in the Publisher table.

The behavior at this point depends on the @compensate_for_errors property:

If @compensate_for_errors is set to true, the Publisher sends a DELETE command for the Subscriber row, followed by an INSERT command for the Publisher row. The Subscriber applies both changes sent to it by the Publisher. The end result is that the row change that occurred on the Subscriber is lost, and the row from the Publisher is added.

If @compensate_for_errors is set to false, the synchronization fails. Your application must include logic to handle this failure, or the user must remove the row from either the subscription or the publication database, and then resynchronize. While this requires more programming logic, it protects user data and allows you the flexibility of determining which row is kept or removed. The synchronization exception error information includes the reason for the failure and the ROWGUID value for the rows that are causing the errors.

Remote Data Access and Multi-User Concepts

In general, the multi-user concepts for remote data access (RDA) are similar to those for replication. The following rules apply:

  • Changes that occur on the local table while an RDA push process is occurring will be uploaded during the next push process.
  • A database exclusive lock is held at the beginning of the RDA push process for updating internal global data. The lock is released immediately after the data is updated.
  • A row exclusive lock is held on each row that is uploaded. The lock is released after the uploaded row has been retrieved from the database.
  • When an RDA table is downloaded from the server using the Pull method, a DDL lock is held during the download. During this time the table can’t be read or written.

Remote Data Access and Primary Key Update Caveat

Replication and RDA track rows differently. Replication tracks a row based on an immutable row GUID, while RDA identifies and tracks rows based on a primary key. Therefore, complications can arise if the primary key is updated during an RDA push process. You can change a primary key in a row on an RDA-tracked table and upload this row to the server. However, once you have updated the primary key of a given row and have started an RDA push operation, you should not update the primary key for that row again until the push process completes. Changing the key while the push process is in progress may cause all subsequent calls to the Push method to fail. This failure is due to the way RDA handles tracking internally.

For example, consider the following scenario:

  1. The application updates the primary key on row X in an RDA-tracked table.
  2. The RDA Push method is called, and while the push process is occurring, the application updates the primary key on row X again.
  3. The push process completes successfully.
  4. The application calls the Push method again. The push fails because the RDA tracking layer cannot correctly track the primary key update that was performed in step 2.