Troubleshoot the Change data capture resource in Azure Data Factory
APPLIES TO: Azure Data Factory Azure Synapse Analytics
Tip
Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!
This article provides suggestions on how to troubleshoot common problems with the change data capture resource in Azure Data Factory.
Issue: Trouble enabling native CDC in my SQL source.
For sources in SQL, two sets of tables are available: tables with native SQL CDC enabled and tables with time-based incremental columns.
Follow these steps to configure native CDC for a specific source table in your SQL database.
Consider you have following table, with ID as the Primary Key. If a Primary Key is present in the schema, supports_net_changes is set to true by default. If not, configure it using the script in Query 3.
Query 1
CREATE TABLE Persons (
ID int,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
Last_login DATETIME,
PRIMARY KEY (ID));
Note
Currently the ADF CDC resource only loads net changes for insert, update and delete operations.
To enable CDC at the database level, execute the following query:
Query 2
EXEC sys.sp_cdc_enable_db
To enable CDC at the table level, execute the following query:
Query 3
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo'
, @source_name = N'Persons'
, @role_name = N'cdc_admin'
, @supports_net_changes = 1
, @captured_column_list = N'ID';
Issue: Tables are unavailable to select in the CDC resource configuration process.
If your SQL source doesn't have SQL Server CDC with net_changed enabled or doesn't have any time-based incremental columns, then the tables in your source will be unavailable for selection.
Issue: The debug cluster isn't available from a warm pool.
The debug cluster isn't available from a warm pool. There will be a wait time in the order of 1+ minutes.
Issue: Trouble in tracking delete operations.
Currently CDC resource supports delete operations for following sink types - Azure SQL Database & Delta. To achieve this in the column mapping page, select keys column that can be used to determine if a row from the source matches a row from the sink.
Issue: My CDC resource fails when target SQL table has identity columns.
Getting following error on running a CDC when your target sink table has identity columns,
Can't insert explicit value for identity column in table 'TableName' when IDENTITY_INSERT is set to OFF.
Run the following query to determine if you have an identity column in your SQL based target.
Query 4
SELECT *
FROM sys.identity_columns
WHERE OBJECT_NAME(object_id) = 'TableName'
To resolve this user can follow either of these steps:
- Set IDENTITY_INSERT to ON by running following query at database level and rerun the CDC Mapper
Query 5
SET IDENTITY_INSERT dbo.TableName ON;
(Or)
- User can remove the specific identity column from mapping while performing inserts.
Issue: Trouble using Self-hosted integration runtime.
Currently, Self-hosted integration runtime isn't supported in the CDC resource. If trying to connect to an on-premise source, use Azure integration runtime with managed virtual network.