Using Azure Data Factory to send TSV files to a Snowflake stage, and it is escaping certain characters incorrectly.

Leif Starbuck 1 Reputation point
2022-02-03T19:25:28.937+00:00

I am using a Azure Data Factory pipeline to move compressed TSV files from an on-prem file share to a ADLS share that acts as a Snowflake stage, so we can move it to a Snowflake table.

The problem is when the combination of a backslash and a newline character are found at the end of a line in the file, ADF will escape the backslash character correctly by adding a double-slash ('\'), but it will then 'forget' to escape the newline character. This is leading to many of these type of error coming out of Snowflake stage:

Found character '\\' instead of field delimiter '\t' File '<*file_name*>', line 6134, character 58 Row 6132 starts at line 6133, column "TRANSIENT_STAGE_TABLE"[321]  

My DelimitedText linked service for the Sink has escape character set to a blackslash (). Again, the problem this causes is when I have a backslash at the end of a string value, which exists in my data quite a bit.

This issue seems to be extremely similar to this one, listed below. One of our data engineers already worked on this previously and believe it to be a bug within Azure Data Factory.

https://learn.microsoft.com/en-us/answers/questions/115904/azure-data-factory-adf-escape-character-issue-when.html

171090-adf.png

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,426 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,196 questions
{count} votes

1 answer

Sort by: Most helpful
  1. ShaikMaheer-MSFT 38,401 Reputation points Microsoft Employee
    2022-02-07T06:36:04.17+00:00

    Hi @Leif Starbuck ,

    Thank you for posting query in Microsoft Q&A Platform.

    I reproduced your scenario. used tsv file from ADLS source folder(Data folder in my case) as source and copied it ADLS sink folder(output folder in my case). I don't see errors while performed copy using copy data activity.

    Please check below screenshots for detailed idea.

    Source File:
    171822-image.png

    Source dataset in ADF:
    171775-image.png

    Sink Dataset in ADF:
    171814-image.png

    Sink file after copy:
    171796-image.png

    Hope this will help. If not, Please share your sample file data to repro with your data too. Thank you.

    ----------

    Please consider hitting Accept Answer button. Accepted answers helps community as well.