Tags: change data capture, data replication, PowerShell
A major bank in Europe that we’ve been working with over the years wanted to get help in upgrading SQL Server 2008 to 2016. Since we performed the migration process originally from DB2 to SQL Server we had a deep knowledge of the customer’s data.
Because of the need to have a highly available system and they could not afford to be down, they needed a way to downgrade to the old database system in case somthing bad happens to the upgraded system. Also, our customer wanted to include the disaster recovery solution. We proposed a solution that would allow them to use data replication from the upgraded database to the old database. So, in case of an issue in production they could easily go back to the old version of SQL Server.
To see how we were able to accomplish this check out our video.
Here’s the Problem Our Customer Had
SQL Server has a variety of technologies used to do data replication from one database to another. The most popular solutions include Database Mirroring, Log Shipping, Database Sharding, Export — Import, Backup — Restore, and so on…
SQL Server provides the built-in utilities for data replication from one database to another. However, you can implement this technique only if the tables in the databases have the primary keys. Unfortunately, most of the customer’s tables didn’t include the primary keys. Besides, the customer refused to change the database architecture in order to add the primary keys to the tables. Thus, we had to start searching for another technology to deliver the data from the upgraded SQL Server database to the old one.
Most of the other replication technologies work only if both databases are running on the same version of the SQL Server. In this particular case, we needed to replicate data from SQL Server 2016 database to the old customer’s SQL Server 2008 database instance.
We have managed to find an alternative option that allows tracking changes in the source database. We mean using the Change Data Capture (CDC) technology.
Technical Solution
It turned out CDC was the only available solution for data replication of this particular customer’s system. Actually, we already used the similar approach when we migrated IBM DB2 to SQL Server for this customer.
Let’s discover how the Change Data Capture technology works. CDC scans the replication log and harvests changes in the tables of the original database. Then, for every replicated table CDC creates an auxiliary change table that stores all insert, update, and delete activity applied to the original SQL Server table.
Here’s the approach that we took:
- enabled Change Data Capture for the database.
- enabled CDC for the selected tables and
- specified the column that need to be replicated.
So, now we’ve got the changes tracked, yet we still had to deliver them to the replicated table in the SQL Server 2008 environment. To automate the process of data delivery, we created the PowerShell Script. This script withdraws changes from the auxiliary change table and performs exactly the same operations in the replicated database. Thus, now we have two identical copies of data in source and target databases.
Change Data Capture Setup
Let’s discover how one can perform the setup of the Change Data Capture.
Creating objects
First of all, we create the source and target databases.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | IF db_id('CDC_Source') IS NOT NULL BEGIN ALTER DATABASE CDC_Source SET single_user WITH ROLLBACK immediate DROP DATABASE CDC_Source END CREATE DATABASE CDC_Source ALTER DATABASE CDC_Source SET recovery FULL GO IF db_id('CDC_Target') IS NOT NULL BEGIN ALTER DATABASE CDC_Target SET single_user WITH ROLLBACK immediate DROP DATABASE CDC_Target END CREATE DATABASE CDC_Target GO |
Then we create tables without the primary or unique key in both databases.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | USE CDC_Source GO IF object_id('dbo.TestTable1') IS NOT NULL DROP TABLE dbo.TestTable1 CREATE TABLE dbo.TestTable1 ( id BIGINT ,val VARCHAR(500) ,dt datetime2 ) USE CDC_Target GO IF object_id('dbo.TestTable1') IS NOT NULL DROP TABLE dbo.TestTable1 CREATE TABLE dbo.TestTable1 ( id BIGINT ,val VARCHAR(500) ,dt datetime2 ) |
Creating an auxiliary table for CDC
On the next step, we create an auxiliary table for delivered data.
1 2 3 4 5 6 7 8 9 | IF object_id('dbo.cdc_lsns') IS NOT NULL DROP TABLE dbo.cdc_lsns CREATE TABLE dbo.cdc_lsns ( seqval BINARY(10) ,obj_name VARCHAR(200) ,ts datetime2 DEFAULT sysdatetime() ) INSERT INTO dbo.cdc_lsns DEFAULT VALUES |
Enabling CDC
Now we enable the Change Database Capture on the database level and create the CDC role.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | USE CDC_Source GO IF NOT EXISTS(SELECT name FROM sys.databases WHERE is_cdc_enabled = 1 AND database_id = db_id()) BEGIN EXEC sys.sp_cdc_enable_db EXEC sys.sp_cdc_add_job @job_type = N'capture' ,@start_job = 1 ,@maxtrans = 10000 ,@maxscans = 100 ,@continuous = 1 ,@pollinginterval = 2 EXEC sys.sp_cdc_add_job @job_type = N'cleanup' ,@start_job = 1 ,@retention = 52494800 ,@threshold = 10000 END IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name=N'cdc_admin' AND TYPE = N'R') CREATE ROLE cdc_admin AUTHORIZATION dbo |
After that, we enable CDC on the table level.
1 2 3 4 5 6 7 | EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'TestTable1', @role_name = N'cdc_admin', @index_name = NULL --2016 modification: no support for net changes required ,@supports_net_changes = 0 ,@filegroup_name = NULL ,@allow_partition_switch = 1 |
After tweaking the source table, we can check the captured changes in the auxiliary table using the following statement.
1 | SELECT * FROM CDC_Target.dbo.cdc_lsns |
In addition, you can find more code samples, including the replication operations on this page.
Observations
Here are some valuable observations that we’ve learned from this project.
- Before enabling Change Data Capture at the table level, make sure you have already enabled SQL Server Agent.
- Consider changing settings in order to remove the history data from the auxiliary change table. It can quickly grow in size and decrease the overall performance of the replication script.
- Definitely, you should use primary keys in the tables and therefore avoid these difficulties.
Benefits and Results
The customer got the upgraded SQL Server database system that now meets the strict security and high availability requirements and fits the highest industry standards. In addition, the customer was able to maintain sufficient database performance increase, real-time backup and the support of downgrade. As of this point in time, none of the databases had to be downgraded because of the superior performance of SQL Server 2016.