ForEach isn't working with Snowflake V2 connector

MarkDG 0 Reputation points
2024-11-27T21:54:25.39+00:00

We've got a classic pipeline set up that reads a list of .csv files in blob storage and uses the child items in the foreach loop as a parameter for the table name in Snowflake and the file to copy from blob.

However, every table ends up with the same row count, and columns from one CSV are loaded into tables that don't match its name but happen to share column names. There are 20 source files, and each Copy Data iteration runs for all 20 files.

I've added a Set Variable activity to output the item for the current iteration, and I've set the ForEach to sequential so I can step through each run and confirm it only receives one parameter.

We've got the same pipeline set up elsewhere with the legacy Snowflake connector, and it works fine. This pipeline also ran OK until we switched to the new Snowflake V2 connector; it is otherwise identical to the other pipelines still running on the legacy connector.

It's as if the new Snowflake connector somehow lifts the sink side of the Copy Data activity outside the ForEach and runs it for every item in the list, every time.

I've tried every which way with initial variables and with a straight @item().name expression.

Run one file through and it works fine.

Has anyone got any suggestions please?

Thanks

Mark

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

Sort by: Most helpful
  1. Ganesh Gurram 1,825 Reputation points Microsoft Vendor
    2024-11-28T20:35:55.62+00:00

    @MarkDG - Thanks for the question and using MS Q&A forum.

    It seems like you're facing an issue with the Snowflake V2 connector while using a ForEach loop in Azure Data Factory (ADF) for processing CSV files from blob storage and copying data into Snowflake tables. I understand that the legacy Snowflake connector works fine, but with the new V2 connector, it appears to be executing the copy operation multiple times in an unintended manner.

    Here are a few suggestions that could help resolve the issue you're facing:

    • Verify Sink Configuration: Make sure that the sink settings in the Copy Data activity are correctly set to use the dynamic table names. It seems like the sink configuration might be getting reset or not correctly picking up the parameters.
    • Check ForEach Configuration: Ensure that the ForEach activity is correctly configured to iterate over the list of files. Double-check the items being passed into the ForEach loop.
    • Sequential Execution: Since you’ve set the ForEach to sequential, it should process one file at a time. Verify that each iteration is correctly picking up the parameters by adding debug outputs or logging within the loop.
    • Consider Using a Lookup Activity: If the number of files is manageable, you could use a Lookup activity to fetch the file names and table names from a table or configuration file. This approach can provide more flexibility and control over the execution of the Copy Data activity.
    • Parameter Passing: Ensure that the parameters are being correctly passed to the Copy Data activity. Use expressions like @item().name to dynamically set the table names and file paths.
    • Review Sink Parallelism Settings: Check the settings of the Snowflake sink in the Copy Data activity to ensure that there are no unintended parallel executions happening within the loop. For example, review options like Write batch size or Max concurrent connections, which could cause issues if the connector tries to perform parallel operations when it shouldn't.
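
    The suggestions above mostly come down to one thing: the sink's table name must be driven by the current @item() of the ForEach, not by a pipeline-level variable that all iterations share. As a rough sketch (the activity, dataset, and parameter names here are illustrative, not taken from your actual pipeline), the ForEach definition in the pipeline JSON would look something like this:

    ```json
    {
      "name": "ForEachCsvFile",
      "type": "ForEach",
      "typeProperties": {
        "isSequential": true,
        "items": {
          "value": "@activity('GetFileList').output.childItems",
          "type": "Expression"
        },
        "activities": [
          {
            "name": "CopyCsvToSnowflake",
            "type": "Copy",
            "inputs": [
              {
                "referenceName": "BlobCsvDataset",
                "type": "DatasetReference",
                "parameters": {
                  "fileName": { "value": "@item().name", "type": "Expression" }
                }
              }
            ],
            "outputs": [
              {
                "referenceName": "SnowflakeTableDataset",
                "type": "DatasetReference",
                "parameters": {
                  "tableName": { "value": "@replace(item().name, '.csv', '')", "type": "Expression" }
                }
              }
            ]
          }
        ]
      }
    }
    ```

    The key point is that both datasets expose parameters (fileName and tableName in this sketch) and the Copy activity passes @item()-based expressions into them. If the Snowflake dataset's table name is hard-coded, or bound to a variable set outside the loop, every iteration will write to the same target, which matches the symptom you describe.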

    For more details, refer to: https://learn.microsoft.com/en-us/answers/questions/1858689/document-for-snowflake-v2-connector-issues
    https://learn.microsoft.com/en-us/azure/data-factory/connector-snowflake?tabs=data-factory#upgrade-the-snowflake-linked-service

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click `Accept Answer` and `Yes` for "was this answer helpful".

