Why does a pipeline produce a different row order from the same data file?

Tyrone Jones 0 Reputation points
2023-02-27T21:18:16.68+00:00

I have updated one of our pipelines in order to incorporate row ordering as recommended here:
https://learn.microsoft.com/en-us/answers/questions/419209/additional-column-that-records-the-row-number-auto

The goal is to track the order of the data in the input file, because that order affects downstream processes.

The pipeline reads from a file, makes minor transformations, and inserts into an Azure SQL database.


All seemed well when tested, and the data looked fine in debug.

However, differences were reported when the pipeline was run in a different session.

I was able to reproduce the discrepancy using debug / data preview with two different integration runtimes in turn: first the auto-resolve one, then our "real" one. The real one yields different results.

In other words, the row order produced was different: the same row number was attached to a different data row from the input file.
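
To make the symptom concrete, here is a minimal Python sketch of the kind of check that exposes it. This is not the pipeline itself: the function name and the sample rows are hypothetical, and it assumes the output of each run can be exported as (row number, row data) pairs.

```python
def find_row_number_mismatches(run_a, run_b):
    """Return the row numbers whose attached data differs between two runs.

    Each run is a list of (row_number, row_data) pairs.
    """
    a = dict(run_a)
    b = dict(run_b)
    # Compare only the row numbers that appear in both runs.
    return sorted(n for n in a.keys() & b.keys() if a[n] != b[n])


# Hypothetical sample data: the same three input rows, numbered differently
# by the two integration runtimes.
auto_resolve_run = [(1, "alpha"), (2, "bravo"), (3, "charlie")]
real_runtime_run = [(1, "alpha"), (2, "charlie"), (3, "bravo")]

mismatches = find_row_number_mismatches(auto_resolve_run, real_runtime_run)
print(mismatches)
```

Any row number listed in the result is attached to different data in the two runs, which is exactly the discrepancy described above.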

Why does this happen? How can we ensure consistent output?
Thanks

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.