Data Factory Copy Activity returns previous resultset to sink and not the current set

Marko Vainiokangas 1 Reputation point
2021-08-18T09:36:28.877+00:00

I have an Azure SQL database as the source and a REST sink (an Azure Function). I check the Azure Function logs and count the number of rows it is called with, and I consistently find that the ADF Copy Activity sends the result set of the previous run, not that of the currently running activity. At best I have to run the activity twice to be caught up; otherwise I have to run it multiple times, like a cat-and-mouse game, to reach parity between the two.

Run #1: ADF claims it has read and written 735459 rows, but in fact it sent the Azure Function the previous result set of 735380 rows.

Run #2: ADF claims it has read and written 735459 rows. No changes have occurred in the Azure SQL source in the meantime, and it now sends the Azure Function the result set of 735459 rows.

The sink is configured with a 10 ms interval and a write batch size of 100,000. The rows themselves are neither large nor complex; each row is actually rather small.
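
For reference, here is a minimal sketch of how the row count can be tallied on the function side. It assumes each request from the REST sink carries one batch as a JSON array of row objects; the helper names (`rows_in_request`, `expected_requests`, `tally`) are hypothetical and are not part of ADF or Azure Functions.

```python
import json
import math


def rows_in_request(body: bytes) -> int:
    """Count the rows in one request body (assumed to be a JSON array)."""
    payload = json.loads(body)
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array of row objects")
    return len(payload)


def expected_requests(total_rows: int, write_batch_size: int) -> int:
    """Number of requests needed if every batch is filled before sending."""
    return math.ceil(total_rows / write_batch_size)


def tally(bodies) -> int:
    """Sum the row counts over all request bodies received in one run."""
    return sum(rows_in_request(body) for body in bodies)
```

With the numbers above, `expected_requests(735459, 100000)` gives 8 requests per run, and the gap between what ADF reports and what the function received in run #1 is 735459 - 735380 = 79 rows.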

The issue gets worse with another table set up with similar settings: 100k batch size, 10000 ms interval, and a slightly wider table, though it is still under 20 columns. That table changes a lot more often and is much harder to catch up with.

Azure Data Factory