question

karthikshanth asked:

ADF metadata-driven copy: ADLS parquet multiple files to Snowflake multiple tables

Hello, I have 20 parquet files from 20 SQL Server tables stored in an ADLS blob. I am using the metadata-driven ADF pipeline to load each file into a separate table in Snowflake, 20 tables in all (to sync the SQL Server source to the Snowflake staging database). When I configure the metadata-driven ingestion pipeline, only a single file-to-one-table mapping is available, whereas I am expecting 20 such mappings to load the 20 different files into 20 different tables in Snowflake.

Is there any other way to do this in one pipeline?
(screenshot: 199006-adf-adls-snfk-maps.png)


Tags: azure-data-factory, azure-data-lake-storage

AnnuKumari-MSFT answered:

Hi @karthikshanth ,
Thanks for sharing the details and screenshot. With the metadata-driven copy task, if your source is a tabular datastore such as Azure SQL Database or an Oracle database, you can select multiple tables and perform a full or incremental load into another datastore such as ADLS. However, if your source is files, there is no option to select multiple files or folders and perform the copy activity dynamically.

When the source is a tabular datastore, you can select multiple tables and leverage a parameterized pipeline to perform the copy dynamically.
(screenshot: selecting multiple source tables in the metadata-driven copy task)

When the source is ADLS files, you can't select multiple files.
(screenshot: the single-file source selection in the metadata-driven copy task)

You can get more details on the limitations of the metadata-driven copy task here: https://docs.microsoft.com/en-us/azure/data-factory/copy-data-tool-metadata-driven#known-limitations

To achieve your requirement, you can create a customized pipeline with the steps described in my other answer below. Reiterating the same steps here:

1. Use the Get Metadata activity, point its dataset to the folder holding the 20 files, and use childItems in the fieldList to fetch all the file names.

2. Use a ForEach activity to iterate through the file names and process them one by one. Use this expression for the Items property of the ForEach: @activity('Get Metadata1').output.childItems

3. Inside the ForEach, use a Copy activity; in the source settings use a wildcard path and provide @item().name as the file name. In the sink dataset, create a parameter to make the table name dynamic, and pass the file name as the table name for the sink table (a sketch of steps 2 and 3 follows this list).
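To make steps 2 and 3 concrete, here is a minimal sketch of what the ForEach and Copy activity could look like in the pipeline JSON. The names used here (ForEachFile, CopyFileToTable, the datasets SourceParquetFolder and SnowflakeStagingTable, and the tableName parameter) are illustrative placeholders, not part of any template. Note that item().name includes the file extension, so the sketch strips .parquet with replace() before passing the value as the table name:

```json
{
    "name": "ForEachFile",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@activity('Get Metadata1').output.childItems",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "CopyFileToTable",
                "type": "Copy",
                "inputs": [
                    { "referenceName": "SourceParquetFolder", "type": "DatasetReference" }
                ],
                "outputs": [
                    {
                        "referenceName": "SnowflakeStagingTable",
                        "type": "DatasetReference",
                        "parameters": {
                            "tableName": {
                                "value": "@replace(item().name, '.parquet', '')",
                                "type": "Expression"
                            }
                        }
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "ParquetSource",
                        "storeSettings": {
                            "type": "AzureBlobFSReadSettings",
                            "wildcardFileName": {
                                "value": "@item().name",
                                "type": "Expression"
                            }
                        }
                    },
                    "sink": { "type": "SnowflakeSink" }
                }
            }
        ]
    }
}
```

Keep in mind that copying into Snowflake with the Copy activity may additionally require enabling staging (or meeting the direct-copy prerequisites), so the sink shown here is deliberately minimal.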


If the above answer helped, please consider clicking Accept Answer and Up-Vote, as accepted answers help the community. If you have any further query, do let us know.



AnnuKumari-MSFT commented:
Hi @karthikshanth ,
Just checking in to see if you got a chance to see the previous response. If the suggested response helped you, please click Accept Answer and kindly upvote it. If you have further questions about this answer, please comment.


karthikshanth commented:
Thanks @AnnuKumari-MSFT . This approach worked, and I was able to successfully load the ADLS parquet files into Snowflake tables dynamically. I appreciate your help on this community.


AnnuKumari-MSFT commented:
@karthikshanth ,
Glad to know it worked! Thank you!

AnnuKumari-MSFT answered:

Hi @karthikshanth ,

Welcome to the Microsoft Q&A platform, and thanks for posting your query.

As I understand your query, you want to load data from 20 different files stored in ADLS into 20 Snowflake tables. Please let me know if my understanding is incorrect.

These are the steps you can follow to achieve the same:

  1. Use the Get Metadata activity, point its dataset to the folder holding the 20 files, and use childItems in the fieldList to fetch all the file names (a sketch of this activity follows the list).

  2. Use a ForEach activity to iterate through the file names and process them one by one. Use this expression for the Items property of the ForEach: @activity('Get Metadata1').output.childItems

  3. Inside the ForEach, use a Copy activity; in the source settings use a wildcard path and provide @item().name as the file name. In the sink dataset, create a parameter to make the table name dynamic, and pass the file name as the table name for the sink table.
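For step 1, here is a minimal sketch of the Get Metadata activity, assuming a dataset named SourceParquetFolder (a placeholder) that points at the ADLS folder holding the 20 files:

```json
{
    "name": "Get Metadata1",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "SourceParquetFolder",
            "type": "DatasetReference"
        },
        "fieldList": [ "childItems" ]
    }
}
```

Its output, activity('Get Metadata1').output.childItems, is an array of objects with name and type properties, one entry per file in the folder, which the ForEach in step 2 iterates over.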

Hope this helps. Please let us know if you have any further queries.





karthikshanth commented:
But I am already using ADF's metadata-driven ingestion template for this work. Why do you want me to recreate something very similar to it? Does that mean ADF's metadata-driven copy won't work for loading multiple ADLS blob files into other databases like Snowflake or SQL Server with multiple copy-mapping target tables?

PS: Earlier, I used another ADF metadata-driven pipeline to pull the SQL Server tables into the ADLS blob, and that worked fine, creating all 20 parquet files in the blob.


AnnuKumari-MSFT commented:
Hi @karthikshanth ,
If there is a template suitable for your scenario, you can definitely make use of it and change the datasets according to your source and sink. In your case, you can clone your previous pipeline (SQL to ADLS) and make the necessary changes. However, my point is that you need to parameterize your datasets to make them dynamic across multiple tables (a sketch of such a dataset is shown below). For us to reproduce your scenario, could you please share the name of the metadata-driven template you used? I cannot find any template with such a name:
(screenshot: searching the template gallery)
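For illustration, a parameterized Snowflake sink dataset could look like the sketch below. The dataset name, the linked service name (SnowflakeLS), the STAGING schema, and the tableName parameter are all placeholders for whatever your environment actually uses:

```json
{
    "name": "SnowflakeStagingTable",
    "properties": {
        "type": "SnowflakeTable",
        "linkedServiceName": {
            "referenceName": "SnowflakeLS",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "tableName": { "type": "string" }
        },
        "typeProperties": {
            "schema": "STAGING",
            "table": {
                "value": "@dataset().tableName",
                "type": "Expression"
            }
        }
    }
}
```

The Copy activity then supplies tableName at run time, so one dataset definition serves all 20 target tables.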



karthikshanth commented:
I was talking about this metadata-driven copy task, which is available on the ADF home page:
(screenshot: the metadata-driven copy task on the ADF home page)
