I have a flow configured using a SQL source, and I thought I would take the 'Enable change data capture' option that I noticed recently for a test drive.
On the first pass, it worked as I'd expected, given the settings. It got everything.
I cleared the sink and ran it again, returning a single row. One that had been modified since the last run. All good.
When I checked that row, though, I noticed it was not new. It happened to be an updated row.
This got me thinking, shouldn't it tell me what the nature of the change was? If I'm getting new rows and updated rows with no way of telling the difference, I need to either load the sink and do an 'exists comparison' or just switch to an upsert-only sink. The 'exists' checking means loading a potentially large dataset just to see how the row should be handled, and moving to upsert feels a bit hinky because you need to add an 'Alter Row' action set to short circuit all rows to upsert.
It's not applicable in this case, but what if I need to account for deleted rows as well?
I can't find a huge amount of information about this functionality, but can anyone tell me what usage is suggested?
Assuming that the date column I am specifying is used as a very basic filter with an auto-saved last run date, then I guess I am on my own in working this out. It only tells me the rows and doesn't know if they are new or updated. But what about the 'native change data capture'?
I'm about to do some testing, but is it going to try to turn on CDC in SQL server? And if so, it should have access to all change data, inserts, updates and deletes. Does anyone know if this is exposed in the flow at all? It seems like a wasted opportunity if this is the case.
Any clarification of relevant links would be awesome. Thanks.