Azure Data Factory/Azure Synapse Copy activity - Escape column delimiter character when its already present in data

Keshav Prasad 1 Reputation point
2022-02-01T08:49:22.993+00:00

I am converting some data from JSON format in Azure blob to pipe delimited text files which will be stored in ADLS. I am using pipe(|) as my column delimiter. There is some data which already has pipe (|) in it which is creating problem in further parsing of these text files. For example my data is-

{
"Key": "XYZ",
"Value" : "ABC|DEF|GHI"
},
{
"Key": "YYY",
"Value" : "QQQ"
}
So when I am using copy activity to convert these JSONs to text files in copy activity I am getting something like this-

XYZ|ABC|DEF|GHI
YYY|QQQ
This is creating an issue because for my further parsing of this text file only two columns are read. So, I want the pipes in the data to be escaped. The text files should look like this-

XYZ|ABC\|DEF\|GHI
YYY|QQQ
I have tried playing around with settings of copy activity - delimited text sink like declaring escape character or quote character but nothing seems to work. Does anyone have any idea on how to achieve this data manipulation from the copy activity?

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,696 questions
{count} votes

2 answers

Sort by: Most helpful
  1. HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
    2022-02-16T23:43:30.453+00:00

    Hello @Keshav Prasad ,
    Thanks for the ask and using Microsoft Q&A platform .
    As we understand the ask here is to have the output of the file in the above called out format and you are fine to mapping data flow . Please do let me know if that is not accurate,
    My sincere apoloziges for the delay in reply on my side .

    I did tried out this with a dummy JSON .

    {  
    	"employees": [{  
    			"Key": "XYZ",  
    			"Value": "ABC|DEF|GHI"  
    		},  
    		{  
    			"Key": "YYY",  
    			"Value": "QQQ"  
    		}  
    	]  
    }  
    

    I did used the mapping dataflow . I am calling out the steps below

    1. Create the source and pointed the JSON location and selected "Array of documents" under Source options
    2. Added a flatten the activity and setting as below .

    175166-image.png

    The idea is to have a output like this .

    175106-image.png

    1. Add a derive column and in my case i added a column named new values and the dynamic expression as

    split(employees.Value,'|')

    175154-image.png

    The output here will like

    175167-image.png

    Step 5 . Add one more flatten activity and setting as below .

    175143-image.png

    the output of this activity is something like

    175168-image.png

    When you write this out a csv file , it will look like this

    175169-image.png

    Please do let me if you have any queries .
    Thanks
    Himanshu

    -------------------------------------------------------------------------------------------------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    1 person found this answer helpful.
    0 comments No comments

  2. Keshav Prasad 1 Reputation point
    2022-02-23T08:00:49.42+00:00

    Hey thanks for the exhaustive answer @HimanshuSinha-msft ! Actually I don't want to achieve this, what I want is to add backslashes before the pipe in the data so it does not get parsed as different columns in pipe delimited text files and also in further processing. I achieved this through adding \ from the source in JSON before the pipe and so I got '\' this in my text files before the pipe character.

    New JSON-
    {
    "Key": "XYZ",
    "Value" : "ABC\|DEF\|GHI"
    },
    {
    "Key": "YYY",
    "Value" : "QQQ"
    }

    New Text File from this JSON using copy activity-

    XYZ|ABC\|DEF\|GHI
    YYY|QQQ