how to migrate cdc changes of multiple tables to azure sql db using adf

Shubhi Jain 21 Reputation points
2021-09-07T10:15:36.397+00:00

I have multiple tables in azure SQL DB with CDC enabled on all of them. I want to migrate the changes made on CDC enabled table to another azure SQL DB using ADF pipelines. I have made the pipeline for migrating changes of a single table but am stuck when changes are made to multiple tables how I can check for the changes made in the source and migrate them to the destination in the same pipeline.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,906 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
    2021-09-07T23:44:36.633+00:00

    Hello @Shubhi Jain ,
    Thanks for the ask and using the Microsoft Q&A platform .
    When you say that you have done the implementation for 1 table , did you used paramterized the dataset ? Basically you should be able to pass the the tablename as parameter and that table related CDC data should be copied to sink . If not let me know I am help you with that .

    If you have already done that then use lookup to get all the CDC enabled tablename and pass the same to the existing pipeline .

    *****Edited the posted on 9/29

    We can achieve the same by performing the below steps .

    1. Get all the table name on which the cdc is enabled .

    136561-image.png

    select capture_instance from cdc.change_tables

    The output will be in form of cdc.schemaname_tablename eg cdc.HR.Employee . We will use this in the Lookup activity .

    1. Use a for each activity with a explaression like

    @activity('Get all the table name on which CDC is enabled').output.value

    3.In SQL to query the CDC table on the source side they follow the query structure

    select * from cdc.SomtestSchema_Table1_ct so in our case it will

    select * from cdc.HR_Employee_ct

    So add a copy activity and in the source as below dynamic expression

    @markus.bohland@hotmail.de ('SELECT * FROM cdc.',item().capture_instance,'_ct')

    136562-image.png

    and since our intention is to add the cdc data on the destination in a table like schemaName.Table so we will create a array variable and derieve the schema name and table name from it .

    @split(replace(item().capture_instance,'cdc',''),'_')

    --- Let me explain this here , we got the schema name in step 1 as cdc.HR.Employee , the above expression will give us an array as ['HR','Employee'] , we will use this a parametertized values on the sink side .
    @variables('tempArray')[0]
    @variables('tempArray')1

    136514-image.png
    136428-image.png
    136497-image.png
    136534-image.png

    Please do let me know how it goes .
    Thanks
    Himanshu
    Please do consider clicking on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members

    1 person found this answer helpful.