Why does the pipeline give varying row-order results with the same data file?

I have updated one of our pipelines in order to incorporate row ordering as recommended here:
https://learn.microsoft.com/en-us/answers/questions/419209/additional-column-that-records-the-row-number-auto
The goal is to track the order of the data in the input file, since that order affects downstream processes.
The pipeline reads from a file, makes minor transformations, and inserts into an Azure SQL database.
All seemed well when tested; the data looks fine in debug.
However, differences were reported when the pipeline was run in a different session.
I was able to reproduce the discrepancy by using debug / data preview with two different integration runtimes in turn (the auto-resolve one, and then our "real" one): the real one yields different results.
In other words, the row order produced was different, with the same row number attached to a different data row from the input file.
Why is this? How can we ensure consistent output?
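For context on where the inconsistency might come from: mapping data flows execute on Spark, and as far as I understand, generated row numbers are assigned per partition, so the row-to-number mapping can change whenever the partition layout changes (for example, with a different integration runtime). Here is a minimal PySpark sketch of that behavior; the file name and the "ClaimId" key column are placeholders, not our real schema:

```python
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.getOrCreate()

# Placeholder for the pipeline's source file; column names are hypothetical.
df = spark.read.option("header", True).csv("claimtest.txt")

# Unstable: ids are generated per partition, and the partition layout depends
# on the cluster (i.e. the integration runtime), so the same input row can
# receive a different number from one run to the next.
unstable = df.withColumn("FileRowNumber", F.monotonically_increasing_id())

# Stable: impose a total order on a uniquely identifying key before
# numbering; without a unique key, ties are still broken arbitrarily.
w = Window.orderBy("ClaimId")
stable = df.withColumn("FileRowNumber", F.row_number().over(w))
```

If that model is right, the fix would presumably be either to sort on a column that uniquely identifies each row before generating the number, or to force the numbering step onto a single partition.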
Thanks
Hello Tyrone Jones,
I am checking to see if you have had a chance to look into my above response. Please let me know if you have any further questions.
We are using a single sink: a single table in an Azure SQL database.
I can provide an example data format/file, but likely not the file itself, as it is about 7G.
Providing the follow-up info: attached is a 1,000-record sample file, along with the data flow script file, which should have all the information related to the transformations. There isn't much there, just column-to-column mapping after trimming whitespace and such.
Again, the problem is that the "FileRowNumber" created at the end is not consistent; e.g., the first data row from the file ends up with a different FileRowNumber in different runs. The only difference I can point to is the integration runtime, which is why I mentioned it. I'm not sure what is happening. I'm also including info on the "live" runtime in case that is helpful. It seems like some file splitting or partitioning is happening behind the scenes, if it is not just a bug/software issue; see the sketch after the attachments below. I'll also test with the large file next, in case it is a large-file issue.
Attachments: DataFlowscript.txt, claimtest.txt
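To illustrate the splitting/partitioning suspicion, here is a rough sketch against the Spark engine that data flows run on (again with placeholder file and column names, not our real schema):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.option("header", True).csv("claimtest.txt")

# The same rows receive different generated ids under different partition
# layouts, which is roughly what switching integration runtimes amounts to.
ids_a = df.repartition(2).withColumn("id_a", F.monotonically_increasing_id())
ids_b = df.repartition(8).withColumn("id_b", F.monotonically_increasing_id())

# Joining on a unique key ("ClaimId" is a placeholder) surfaces rows whose
# generated id differs between the two layouts.
diff = (
    ids_a.select("ClaimId", "id_a")
    .join(ids_b.select("ClaimId", "id_b"), "ClaimId")
    .filter(F.col("id_a") != F.col("id_b"))
)
print(diff.count())
```

If that is indeed what is happening, forcing the surrogate key transformation onto a single partition (via the Optimize tab), or sorting on a unique key immediately before it, should make the assignment repeatable, though I would want that confirmed.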
Hello Tyrone Jones,
Sorry, I didn't follow. Can you please provide more details about the "7G" you mentioned?
That is the size of the actual files we are processing: 7 gigabytes, or greater.
In case these help, here are the details for the integration runtime (screenshots attached):
Also, if the large version of the file is needed or useful, let me know how to send it. Compressed, it is 700 MB.
Hello Tyrone Jones,
Thank you for the details. I will look into this further and get back to you with more details.
Hello Tyrone Jones,
I tried to reproduce the issue on my end with the sample file you provided earlier, but the FileRowNumber is consistent when writing to my sink dataset.
I would suggest opening a support case for a deeper investigation. If you don't have a support plan, please let me know, and I can provide a one-time free support request.
I am looking forward to hearing from you.