Detecting change type in SQL source in a data flow using change detection capture

Peter Sharp 181 Reputation points
2022-08-17T05:09:05.853+00:00

I have a flow configured using a SQL source, and I thought I would take the 'Enable change data capture' option that I noticed recently for a test drive.

231881-2022-08-17-14-39-16-clipboard.png

On the first pass, it worked as I'd expected, given the settings. It got everything.

I cleared the sink and ran it again, returning a single row. One that had been modified since the last run. All good.

When I checked that row, though, I noticed it was not new. It happened to be an updated row.

This got me thinking, shouldn't it tell me what the nature of the change was? If I'm getting new rows and updated rows with no way of telling the difference, I need to either load the sink and do an 'exists comparison' or just switch to an upsert-only sink. The 'exists' checking means loading a potentially large dataset just to see how the row should be handled, and moving to upsert feels a bit hinky because you need to add an 'Alter Row' action set to short circuit all rows to upsert.

It's not applicable in this case, but what if I need to account for deleted rows as well?

I can't find a huge amount of information about this functionality, but can anyone tell me what usage is suggested?

Assuming that the date column I am specifying is used as a very basic filter with an auto-saved last run date, then I guess I am on my own in working this out. It only tells me the rows and doesn't know if they are new or updated. But what about the 'native change data capture'?

231777-screenshot-2022-08-17-150437.png

I'm about to do some testing, but is it going to try to turn on CDC in SQL server? And if so, it should have access to all change data, inserts, updates and deletes. Does anyone know if this is exposed in the flow at all? It seems like a wasted opportunity if this is the case.

Any clarification of relevant links would be awesome. Thanks.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
{count} votes

1 answer

Sort by: Most helpful
  1. ShaikMaheer-MSFT 38,556 Reputation points Microsoft Employee Moderator
    2022-08-18T11:14:40.417+00:00

    Thank you for posting this query in Microsoft Q&A platform.

    @MarkKromer-MSFT - Could you please share inputs here about change data capture functionality for better understanding.

    @Peter Sharp - Between, you can consider comparing source and sink data by calculating hash of them and then capture changed data there by to load to sink. Kindly check below video to understand this better.
    How to Capture Changed Data using Data flows in Azure Data Factory

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.