Change Tracking Extract Staging Table

The change tracking pattern uses the change tracking facility of SQL Server to identify changes (insert, update, delete) in a source database. It only supports SQL Server table sources (not queries or views). To use the change tracking pattern, it’s necessary to turn change tracking on for the source SQL Server table. A very good overview of how change tracking works can be found on Tim Mitchell’s blog post “getting started with change tracking” (thanks Tim).

To complete this tutorial, you will need to enable change tracking on the Service database and on the TaskType table.

Enable Change Tracking on the Database and TaskType table

  1. Through SQL Server Management Studio (SSMS) Find the “Service” database in the Object Explorer.
  2. Right-click the “Service” database and click Properties.
  3. Select Change Tracking page and select “True” in the Change Tracking dropdown.
  4. Click Ok.

Your result should look like this:

  1. Find the “TaskType” table in the Object Explorer.
  2. Right-click the “TaskType” table and click Properties.
  3. Select Change Tracking page and select “True” in the Change Tracking dropdown.
  4. Click Ok.

Your result should look like this:

Create a new “TaskType” Persistent Staging table

  1. Expand the Staging folder, right-click Persistent Staging and click Add New.
  2. Select “Service” in the Source System
  3. In the Staging Table Name field type “TaskType“.
  4. Enter the “This is a Task Type persistent staging table” Description in the Description. This is optional.

Staging Columns

The quickest way to define the columns of the Staging table is to import the column definitions from a Source table. If there isn’t a matching Source table, then you can manually define the columns. To import a Source table do the following:

  1. Click on the Staging Columns tab.
  2. Click the Import Schema button.
  3. Select “Service” in the Connection dropdown. The Import Table field is populated with tables and views from the source.
  4. Select the “dbo.TaskType” in the Import Table field.
  5. Click Ok. The columns and data types of the source table are imported into the Staging table design.

Advanced Tab

On the Advanced tab, you will be presented with additional properties for the staging table. These properties are based on the target technology for the data warehouse. In this case, the target technology is Azure Data Warehouse, so you are presented with a DISTRIBUTION option with a default.

  1. Just keep the default options on the advanced tab.

Create a Change Tracking Extract

  1. To define an Extract select the Add button on the Extract tab. The Extract dialog appears.

Overview Tab

  1. Extract Name. Keep the default Extract Name.
  2. Extract Pattern. Select “Change Tracking” in the Extract Pattern field. The selected pattern is used to determine the type of code that will be generated for this Extract.
  3. Extract Overview. Optional. Enter a description of the extract for documentation purposes.

Source Tab

The Source tab defines the source table or query of the Extract.

  1. Click on the Source tab.
  2. In the Source Connection dropdown select the “Service” connection.
  3. In the ‘Table or View‘ dropdown select the “dbo.TaskType” table.

More Information

There are 3 use cases for using the Change Tracking pattern:

  1. Change Tracking existed on the table before any data was inserted into the table.
  • In this scenario just create the extract with the change tracking pattern and leave the DmlVers Change Identifier initial value as 0. The initial run will pick up all data in the source table.
  1. Change Tracking was added to the table after there was some data added to the table.
  • Change tracking only captures changes after change tracking is set “on” for the table, so all data inserted into the table prior to change tracking is set “on” will not be extracted using this pattern.
  • In this case, it necessary to first define the extract as a Full Extract, generate, deploy and run it as a Full extract, and then later change the pattern to ‘Change Tracking’.
  • Prior to changing the pattern to ‘Change Tracking’, but after you have run at least one full extract, execute the following SQL in the source database:
SELECT CHANGE_TRACKING_CURRENT_VERSION()

Note down the resulting value, you will use it as the initial value of the ‘DmlVers’ change identifier when you change to the ‘Change Tracking’ pattern.

  • Run a Full Extract. This will ensure the persistent staging database is up to date.
  • Edit the Extract for the Staging table and change the pattern to ‘change tracking’. Set the value of the DmlVers change identifier to the value return by the SQL above. Note, you should do this as close to deployment time as possible. This will reduce the number of ‘duplicate’ row versions in the persistent staging table. Note, only one is marked as latest. The duplicate effectively never existed due to how effective and end dates are set.
  • Generate, Deploy and Execute the Extract with the ‘Change Tracking’ pattern.
  • This second method will ensure you get all rows and their changes as required.
  1. The Source table is recreated, and/or Change Tracking reset – Extract from a given version

The Persistent staging process stores the maximum Change Tracking Version number extracted during a batch run to a zdm.ControlParameters table in the staging table. E.g.

You can manually update that value if, for some reason, it’s necessary to extract the changes since x version again. This could result in duplicates being added to the persistent staging table because the changed record has already been added to the persistent staging table in a previous batch. As noted before, if there are duplicates, then only one is marked as latest. Under normal circumstances, it’s not necessary to update these values.