When merging files in Data Factory, files are replaced instead of being aggregated into the sink dataset

Anna994 1 Reputation point
2022-09-29T13:18:16.88+00:00

I am trying to merge files from my source dataset into the sink, using a ForEach activity with a wildcard path to grab specific files from each folder.

When the files are merged into the sink, each run appears to replace/overwrite the existing output instead of aggregating onto the sink dataset.

My current pipeline:
246065-adf-pipeline.png

ForEach activity to aggregate tables from files:
246039-adf-foreach-condition.png 246112-adf-sink.png

This results in ~20,000 rows in my sink dataset, instead of the >1,000,000 rows that I need.

Tags: Azure Data Lake Storage, Azure Data Factory

1 answer

  1. AnnuKumari-MSFT 31,731 Reputation points Microsoft Employee
    2022-09-30T11:00:36.43+00:00

    Hi @Anna994 ,

    Welcome to the Microsoft Q&A platform, and thanks for posting your question here.

    As I understand it, you are trying to merge input files into a target file that already exists in the sink, using the copy activity in ADF. Please let me know if I have misunderstood.

    The merge files option only merges the files from the source folder into one file. It is not meant to append the input data to data that already exists in the sink dataset, so each copy run replaces the previous output.

    For more details, see the following resources:
    - File system as sink
    - Copy behaviour in ADF

    If you want to append data to an existing .csv file, you need to use the Union transformation in a mapping data flow. For more details, see: Union Transformation in mapping dataflow
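
    The append idea can be sketched in plain Python as follows. This is only an analogy for what a union of the existing sink data with the new input achieves, not a mapping data flow definition and not an ADF API; `read_rows`, `union_into_sink`, and `demo_union` are hypothetical names, and the sketch simply concatenates rows without removing duplicates.

```python
import csv
import tempfile
from pathlib import Path


def read_rows(path):
    """Return all CSV rows from path, or an empty list if it is missing."""
    if not Path(path).exists():
        return []
    with open(path, newline="") as f:
        return list(csv.reader(f))


def union_into_sink(new_rows, sink_file):
    """Combine the rows already in the sink with the new rows.

    The existing sink content is read first, the new rows are added
    after it, and the combined result is written back.
    """
    combined = read_rows(sink_file) + list(new_rows)
    with open(sink_file, "w", newline="") as out:
        csv.writer(out).writerows(combined)
    return len(combined)


def demo_union():
    """Append 3 batches of 10 rows each to the same sink file."""
    with tempfile.TemporaryDirectory() as tmp:
        sink = Path(tmp) / "sink.csv"
        total = 0
        for i in range(3):  # one batch per iteration
            batch = [[str(i), str(k)] for k in range(10)]
            total = union_into_sink(batch, sink)
        return total


if __name__ == "__main__":
    print(demo_union())
```

    Because the existing sink rows are carried into every write, the row count grows with each batch instead of resetting.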

    Additional resources: Appending Rows to the file already existing in Azure Data Factory - Mapping Data Flow

    Hope this helps. Please let us know if you have any further queries.
