Incomplete Files from Copy Data Command in Azure Data Factory pipeline when uploading data from Snowflake

Susan Rakers 20 Reputation points
2025-03-12T19:07:37.29+00:00

I am experiencing an issue where the file-sink of the Copy Data command (SnowflakeExportCopyCommand) is producing incomplete files when uploading data from Snowflake to Azure Blob Storage in our Azure Data Factory pipeline.

Observations:

  • The number of rows read from Snowflake matches the number of rows written to Azure Blob Storage, as indicated in the copy details.
  • However, when multiple files are generated using the COPY command, the resulting Parquet files in Azure storage have incorrect sizes and row counts.
  • I have explicitly set the following Snowflake copy options: SINGLE=TRUE and MAX_FILE_SIZE=900000000 but the issue persists.

Has anyone encountered similar behavior, and are there any known solutions or workarounds?

Would appreciate any insights into possible causes or additional configurations that might resolve this.

Azure Data Factory

Accepted answer
  1. phemanth 14,810 Reputation points Microsoft External Staff
    2025-03-19T06:02:34.3833333+00:00

    @Susan Rakers

    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Ask:

    I am experiencing an issue where the file-sink of the Copy Data command (SnowflakeExportCopyCommand) is producing incomplete files when uploading data from Snowflake to Azure Blob Storage in our Azure Data Factory pipeline.

    Observations:

    • The number of rows read from Snowflake matches the number of rows written to Azure Blob Storage, as indicated in the copy details.
    • However, when multiple files are generated using the COPY command, the resulting Parquet files in Azure storage have incorrect sizes and row counts.
    • I have explicitly set the following Snowflake copy options: SINGLE=TRUE and MAX_FILE_SIZE=900000000 but the issue persists.

    Has anyone encountered similar behavior, and are there any known solutions or workarounds?

    Would appreciate any insights into possible causes or additional configurations that might resolve this.

    Solution: I have found the solution to my issue, where the file sink of the Copy Data command (SnowflakeExportCopyCommand) was producing incomplete files when uploading data from Snowflake to Azure Blob Storage in our Azure Data Factory pipeline.

    1. Set schema mapping
    2. Snowflake Copy Options: OVERWRITE: False, MAX_FILE_SIZE: 300000000, SINGLE: True
    3. In Sink, set Copy Behavior to 'Merge Files'

    The 'Merge Files' option combines the multiple files produced by the copy into one file. The schema mapping must be set to avoid schema inconsistencies.
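
    For anyone configuring this in the pipeline JSON rather than the UI, the three steps above might look roughly like the following. This is a minimal sketch, not the poster's actual pipeline: the property names (exportSettings, additionalCopyOptions, copyBehavior, TabularTranslator) are my understanding of the Copy activity schema and should be checked against the connector documentation, the source type may be "SnowflakeSource" on the legacy connector, and the column names are hypothetical placeholders.

```python
import json

# Hypothetical column names; replace with the real Snowflake/Parquet columns.
COLUMNS = ["ID", "NAME", "UPDATED_AT"]


def build_copy_type_properties(columns):
    """Build the typeProperties fragment of a Copy activity as a plain dict."""
    return {
        "source": {
            # Assumed source type; the legacy connector uses "SnowflakeSource".
            "type": "SnowflakeV2Source",
            "exportSettings": {
                "type": "SnowflakeExportCopyCommand",
                # Step 2: the Snowflake copy options listed in the answer.
                "additionalCopyOptions": {
                    "OVERWRITE": False,
                    "MAX_FILE_SIZE": "300000000",
                    "SINGLE": True,
                },
            },
        },
        "sink": {
            "type": "ParquetSink",
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings",
                # Step 3: 'Merge Files' copy behavior on the sink.
                "copyBehavior": "MergeFiles",
            },
        },
        # Step 1: explicit schema mapping, one entry per column.
        "translator": {
            "type": "TabularTranslator",
            "mappings": [
                {"source": {"name": name}, "sink": {"name": name}}
                for name in columns
            ],
        },
    }


type_properties = build_copy_type_properties(COLUMNS)
print(json.dumps(type_properties, indent=2))
```

    The printed JSON can be compared against the "typeProperties" section shown in the pipeline's code view to confirm that all three settings are present.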

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.


    Please don’t forget to click Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you; this can be beneficial to other community members.


1 additional answer

  1. Susan Rakers 20 Reputation points
    2025-03-18T16:17:10.1666667+00:00

    I have found the solution to my issue, where the file sink of the Copy Data command (SnowflakeExportCopyCommand) was producing incomplete files when uploading data from Snowflake to Azure Blob Storage in our Azure Data Factory pipeline.

    1. Set schema mapping
    2. Snowflake Copy Options: OVERWRITE: False, MAX_FILE_SIZE: 300000000, SINGLE: True
    3. In Sink, set Copy Behavior to 'Merge Files'

    The 'Merge Files' option combines the multiple files produced by the copy into one file. The schema mapping must be set to avoid schema inconsistencies.

    Thanks for all of your help in finding the solution.

