How Change Data Capture Works

[This topic is pre-release documentation and is subject to change in future releases. Blank topics are included as placeholders.]

Change data capture records insert, update, and delete activity applied to SQL Server tables, making the details of the changes available in an easily consumed relational format. Column information that mirrors the column structure of a tracked source table is captured for the modified rows, along with the metadata needed to apply the changes to a target environment.

A good example of a data consumer targeted by this technology is an extraction, transformation, and loading (ETL) application that incrementally loads change data from SQL Server source tables to a data warehouse or data mart. Although the representation of the source tables within the data warehouse must reflect changes in those tables, an end-to-end technology that refreshes a replica of the source is not appropriate. What is needed is a reliable stream of change data, structured so that consumers can apply it to dissimilar target representations of the data. SQL Server change data capture provides this technology.

A member of the sysadmin fixed server role must enable a database for change data capture. Once the database is enabled, members of the db_owner fixed database role can enable tables whose change activity is to be captured.

When the first table in the database is enabled, a capture process automatically gathers change data from the transaction log and inserts the change information in the associated change table. Additional metadata about each transaction is inserted in a metadata table that allows the captured changes to be placed on a conventional timeline.

Data consumers access change data through table-valued functions rather than by querying the change tables directly. Data is requested for changes having commit log sequence numbers (LSNs) that lie within a specified range. A mapping table allows a conventional interval expressed as date-time values to be expressed as a comparable LSN range. Separate functions are available to request all of the changes occurring within an interval as well as the net changes applied to distinct rows during the interval.

The following illustration shows the change data capture process. Data modification operations made in the source tables are recorded in the transaction log of the database. The change capture process reads the log records and inserts modifications made to captured columns in the corresponding change tables. The illustration shows the one-to-one relationship between a source table and a change table, as indicated by the dashed lines. However, up to two change tables can be defined for a source table. The illustration also shows how a separate ETL process can be used to extract the change data from the change tables by using the change data capture query functions and update a data warehouse.

Change data capture process.

See Also

Concepts

Configuring Change Data Capture

Other Resources

Improving Incremental Loads with Change Data Capture

Help and Information

Getting SQL Server 2008 Assistance