ADF - metadata driven - copy multiple ADLS parquet files to multiple Snowflake tables

KarthikShanth 26 Reputation points
2022-05-04T21:42:05.347+00:00

Hello, I have 20 parquet files from 20 SQL Server tables stored in ADLS blob storage. I am using the metadata-driven ADF pipeline to load each file into one of 20 separate tables in Snowflake (to sync the SQL Server source to the Snowflake staging database). When I configure the metadata-driven ingestion pipeline, only a single file-to-table mapping is available, whereas I expect 20 such mappings to load 20 different files into 20 different Snowflake tables.

Is there any other way to do this in one pipeline?
[screenshot: adf-adls-snfk-maps.png]


Accepted answer
  1. AnnuKumari-MSFT 30,361 Reputation points Microsoft Employee
    2022-05-09T09:44:44.187+00:00

    Hi @KarthikShanth ,
    Thanks for sharing the details and screenshot. When using the metadata-driven copy task, if your source is a tabular datastore such as Azure SQL Database or an Oracle database, you can select multiple tables and perform a full or incremental load into another datastore such as ADLS. However, if your source is files, there is no option to select multiple files or folders and perform the copy activity dynamically.

    When the source is a tabular datastore, we can select multiple tables and leverage a parameterized pipeline to perform the copy dynamically.
    [screenshot: 200214-image.png]

    When the source is ADLS files, we can't select multiple files.
    [screenshot: 200243-image.png]

    Get more details on the limitations of the metadata-driven copy task here: https://learn.microsoft.com/en-us/azure/data-factory/copy-data-tool-metadata-driven#known-limitations

    To achieve your requirement, you can create a customized pipeline following the steps described in my earlier answer. Reiterating those steps here:

    1. Use the Get Metadata activity, point its dataset to the folder containing the 20 files, and use childItems in the fieldList to fetch all the file names.

    2. Use a ForEach activity to iterate through the file names and process them one by one. Use this expression for the Items property of the ForEach: @activity('Get Metadata1').output.childItems

    3. Inside the ForEach, use a Copy activity. In the source settings, enable the wildcard file path and provide @item().name as the wildcard file name. In the sink dataset, create a parameter to make the table name dynamic, and pass @item().name as its value. A minimal JSON sketch of steps 2 and 3 follows this list.
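
    For illustration, here is a minimal sketch (in ADF pipeline JSON) of what steps 2 and 3 could look like. The dataset names (AdlsParquetFile, SnowflakeTable), the tableName parameter, and the staging folder path are hypothetical placeholders, and stripping the .parquet extension to derive the table name is an assumption about how your file names map to table names. ADF pipeline JSON does not allow comments, so all caveats live in this paragraph.

    ```json
    {
      "name": "ForEachFile",
      "type": "ForEach",
      "dependsOn": [
        { "activity": "Get Metadata1", "dependencyConditions": [ "Succeeded" ] }
      ],
      "typeProperties": {
        "items": {
          "value": "@activity('Get Metadata1').output.childItems",
          "type": "Expression"
        },
        "activities": [
          {
            "name": "CopyFileToTable",
            "type": "Copy",
            "typeProperties": {
              "source": {
                "type": "ParquetSource",
                "storeSettings": {
                  "type": "AzureBlobFSReadSettings",
                  "wildcardFolderPath": "staging",
                  "wildcardFileName": {
                    "value": "@item().name",
                    "type": "Expression"
                  }
                }
              },
              "sink": { "type": "SnowflakeSink" }
            },
            "inputs": [
              { "referenceName": "AdlsParquetFile", "type": "DatasetReference" }
            ],
            "outputs": [
              {
                "referenceName": "SnowflakeTable",
                "type": "DatasetReference",
                "parameters": {
                  "tableName": {
                    "value": "@replace(item().name, '.parquet', '')",
                    "type": "Expression"
                  }
                }
              }
            ]
          }
        ]
      }
    }
    ```

    Depending on your setup, copying from ADLS Gen2 into Snowflake may additionally require staged copy to be enabled on the Copy activity; treat the above purely as a structural sketch.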

    --------------------------------------------------------------------------

    If the above answer helped, please consider clicking Accept Answer and Up-Vote, as accepted answers help the community as well. If you have any further queries, do let us know.

    1 person found this answer helpful.

1 additional answer

  1. AnnuKumari-MSFT 30,361 Reputation points Microsoft Employee
    2022-05-05T10:23:17.347+00:00

    Hi @KarthikShanth ,

    Welcome to the Microsoft Q&A platform, and thanks for posting your query.

    As I understand your query, you want to load data from 20 different files stored in ADLS into 20 Snowflake tables. Please let me know if my understanding is incorrect.

    These are the steps you can follow to achieve the same:

    1. Use the Get Metadata activity, point its dataset to the folder containing the 20 files, and use childItems in the fieldList to fetch all the file names.
    2. Use a ForEach activity to iterate through the file names and process them one by one. Use this expression for the Items property of the ForEach: @activity('Get Metadata1').output.childItems
    3. Inside the ForEach, use a Copy activity. In the source settings, enable the wildcard file path and provide @item().name as the wildcard file name. In the sink dataset, create a parameter to make the table name dynamic, and pass @item().name as its value; see the dataset sketch after this list.
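
    As a sketch of the sink-side parameterization in step 3, a Snowflake dataset could expose a hypothetical tableName parameter like this (the dataset name, linked service name, and STAGING schema below are assumptions, not values from your environment):

    ```json
    {
      "name": "SnowflakeTable",
      "properties": {
        "type": "SnowflakeTable",
        "linkedServiceName": {
          "referenceName": "SnowflakeLinkedService",
          "type": "LinkedServiceReference"
        },
        "parameters": {
          "tableName": { "type": "string" }
        },
        "typeProperties": {
          "schema": "STAGING",
          "table": {
            "value": "@dataset().tableName",
            "type": "Expression"
          }
        }
      }
    }
    ```

    The Copy activity's sink then passes @item().name (ideally with the file extension stripped) into tableName on each iteration, so each file lands in its own table.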

    Hope this helps. Please let us know if you have any further queries.
