Order of Merged file in Copy Activity - Azure Synapse

Keshav Prasad 1 Reputation point
2021-11-23T10:30:19.94+00:00

When I use the Copy activity with the "Merge files" copy behavior, the resulting file in the sink has a different order than expected (i.e., the order of the rows in the original files is not maintained). Is there some way to preserve the order when merging the files from source to sink in Synapse (not necessarily with the Copy activity)?

Azure Synapse Analytics

1 answer

  1. MartinJaffer-MSFT 26,236 Reputation points
    2021-11-24T01:07:45.1+00:00

    Hello @Keshav Prasad and welcome to Microsoft Q&A.

    For Copy Activity there are no controls over the order. However, since the out-of-order result must be caused by parallelism, reducing the parallelism to 1 (so there is only one writer) may reduce the shuffling. This could help with one part of the ordering.

    I have not had cause to explore the ordering behavior thoroughly before, but there are two parts to this ordering:

    1. order of files read
    2. order of rows within files

    These can combine into several types of disorder, which I explore below.

    Suppose we have files A, B, C, each with its own rows 1, 2, 3. Below I illustrate the possible kinds of disorder. Which do you see?

    File order wrong, but row order good (expecting files in order A, B, C but getting B, A, C):
    B1
    B2
    B3
    A1
    A2
    A3
    C1
    C2
    C3

    File order OK, but rows out of order within each file:
    A1
    A3
    A2
    B3
    B1
    B2
    C2
    C1
    C3

    Files interleaved, but in-file row order preserved:
    A1
    B1
    C1
    A2
    A3
    C2
    B2
    B3
    C3
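
To tell which of these patterns a merged file actually shows, a small script can help. The following is only a rough sketch: it assumes each merged row can be labeled with its source file and its original row number (like `A1`, `B2` above), and the names `parse` and `diagnose` are made up for illustration.

```python
from itertools import groupby


def parse(label):
    """Split a label such as 'A1' into (file_name, row_number)."""
    return label[0], int(label[1:])


def diagnose(merged, expected_file_order):
    """Report which kinds of disorder appear in a merged row sequence.

    merged: list of (file_name, row_number) tuples in output order.
    expected_file_order: file names in the order they should appear.
    """
    # Collapse consecutive rows from the same file into runs.
    runs = [name for name, _ in groupby(merged, key=lambda r: r[0])]
    # A file that appears in more than one run is interleaved with others.
    interleaved = len(runs) != len(set(runs))

    # Order in which each file first appears in the output.
    first_seen = list(dict.fromkeys(name for name, _ in merged))

    # Collect row numbers per file and check they are ascending.
    per_file = {}
    for name, row in merged:
        per_file.setdefault(name, []).append(row)
    rows_in_order = all(rows == sorted(rows) for rows in per_file.values())

    return {
        "file_order_ok": first_seen == list(expected_file_order),
        "rows_in_order": rows_in_order,
        "interleaved": interleaved,
    }


# The third pattern above: files interleaved, in-file row order preserved.
merged = [parse(s) for s in ["A1", "B1", "C1", "A2", "A3", "C2", "B2", "B3", "C3"]]
print(diagnose(merged, ["A", "B", "C"]))
# {'file_order_ok': True, 'rows_in_order': True, 'interleaved': True}
```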

    For Data Flow, there is more control if you have an index or ID column to sort by.
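
The sort-by-ID idea can be illustrated outside Data Flow with a rough Python sketch: if every row carries (or can be given) a column identifying its source file and its position within that file, the intended order can be rebuilt by sorting after the merge. The column names `source_file` and `row_id` and the function `restore_order` are hypothetical, and this is plain Python, not something Data Flow or Copy Activity provides.

```python
def restore_order(rows, file_order):
    """Sort merged rows by source file first, then by row ID within the file.

    rows: list of dicts with hypothetical 'source_file' and 'row_id' keys.
    file_order: file names in the order they should appear in the result.
    """
    # Map each file name to its desired position.
    rank = {name: position for position, name in enumerate(file_order)}
    return sorted(rows, key=lambda row: (rank[row["source_file"]], row["row_id"]))


# Rows as they might come out of a parallel merge.
shuffled = [
    {"source_file": "B", "row_id": 2},
    {"source_file": "A", "row_id": 1},
    {"source_file": "B", "row_id": 1},
    {"source_file": "A", "row_id": 2},
]

for row in restore_order(shuffled, ["A", "B"]):
    print(row["source_file"], row["row_id"])
# A 1
# A 2
# B 1
# B 2
```

Without such a column there is nothing to sort on, so it has to be present in the source files or added before the rows are merged.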
