Azure Data Factory | JSON File to JSON File

Abhay Chandramouli 1,061 Reputation points
2025-03-21T18:08:59.0266667+00:00

Hi I have some data like this stored in a azure blob storage

[
  {
    "requests": [
      "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5BZU\"}",
      "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5AAB\"}"
    ]
  },
  {
    "requests": [
      "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5BZU\"}",
      "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5AAB\"}"
    ]
  }
]

If you look closely, the requests array is an array of strings, and each of those strings is itself a JSON string.

I want a way to read this file, parse the JSON strings into JSON objects, and write the resulting JSON to a new file in Azure Blob Storage.

The new file needs to look like this:

[
  {
    "requests": [
      {
        "action": "UPDATE",
        "endpoint": "v75.0/records/Product__c/b07XY0000098z5BZU"
      },
      {
        "action": "UPDATE",
        "endpoint": "v75.0/records/Product__c/b07XY0000098z5AAB"
      }
    ]
  },
  {
    "requests": [
      {
        "action": "UPDATE",
        "endpoint": "v75.0/records/Product__c/b07XY0000098z5BZU"
      },
      {
        "action": "UPDATE",
        "endpoint": "v75.0/records/Product__c/b07XY0000098z5AAB"
      }
    ]
  }
]
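
In Python terms, this is the transformation I'm after (illustrative only; the file names here are made up):

    import json

    # Read the source file, parse each JSON string in "requests" into an object,
    # and write the result to a new file.
    with open("source.json") as f:        # e.g. downloaded from Blob Storage
        data = json.load(f)

    for item in data:
        item["requests"] = [json.loads(s) for s in item["requests"]]

    with open("target.json", "w") as f:   # e.g. uploaded back to Blob Storage
        json.dump(data, f, indent=2)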


Answer accepted by question author
  Anonymous · 2025-03-24T10:19:39.69+00:00

    Hello @Abhay Chandramouli,

    You can use a combination of a Mapping Data Flow and a Copy activity to achieve this. Note that with this approach you need to know the schema of the JSON strings in advance.

    The data flow builds the JSON objects from the JSON strings, but it writes each outer element as a standalone object (one per line) rather than as an array. A Copy activity after the data flow activity is then used to wrap those objects in an array.

    I took your sample JSON data as the input file:

    
    [
      {
        "requests": [
          "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5BZU\"}",
          "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5AAB\"}"
        ]
      },
      {
        "requests": [
          "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5BZU\"}",
          "{\"action\":\"UPDATE\",\"endpoint\":\"v75.0/records/Product__c/b07XY0000098z5AAB\"}"
        ]
      }
    ]
    
    

    First, create a source JSON dataset for the file above and a target JSON dataset with the required file name in the desired location.

    Use the source dataset as the source of the data flow and, under Source options -> JSON settings, set the document form to Array of documents.

    Then add a Derived Column transformation and create a new column (named one here) with the expression below.

    
    replace(replace(replace(toString(requests),'"{','{'),'}"','}'),'\\"','"')
    
    


    This creates a new string column whose value is the array as a plain JSON string, with the embedded escaping removed.
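
    For illustration, here is the same replace chain in Python (a sketch; the literal mirrors what toString(requests) yields for one row):

    s = '["{\\"action\\":\\"UPDATE\\",\\"endpoint\\":\\"v75.0/records/Product__c/b07XY0000098z5BZU\\"}"]'
    # The same three replacements as the derived column expression:
    cleaned = s.replace('"{', '{').replace('}"', '}').replace('\\"', '"')
    print(cleaned)
    # [{"action":"UPDATE","endpoint":"v75.0/records/Product__c/b07XY0000098z5BZU"}]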


    Now apply a Parse transformation to this column, using the schema of the JSON strings as the output column type:

    
    (action as string,endpoint as string)[]
    
    


    This will create the required JSON array column.
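
    In Python terms, the Parse step corresponds to json.loads on the cleaned string (a self-contained sketch of the same idea):

    import json

    # The cleaned string from the previous step:
    cleaned = '[{"action":"UPDATE","endpoint":"v75.0/records/Product__c/b07XY0000098z5BZU"}]'
    # Parse it into objects matching (action as string, endpoint as string)[]
    requests = json.loads(cleaned)
    print(requests[0]["action"])    # UPDATE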


    Now use a Select transformation to remove the extra columns (the original requests and one) and rename the parsed column to requests.


    Add a sink to the data flow with the target JSON dataset. Give the file name in the dataset, and to get a single output file, set the same file name in the sink settings as well.


    When the data flow is executed from a pipeline, the result file jsontarget1.json is generated in JSON Lines form, one object per line:

    
    {"requests":[{"action":"UPDATE","endpoint":"v75.0/records/Product__c/b07XY0000098z5BZU"},{"action":"UPDATE","endpoint":"v75.0/records/Product__c/b07XY0000098z5AAB"}]}
    
    {"requests":[{"action":"UPDATE","endpoint":"v75.0/records/Product__c/b07XY0000098z5BZU"},{"action":"UPDATE","endpoint":"v75.0/records/Product__c/b07XY0000098z5AAB"}]}
    
    

    To wrap the outer objects in a JSON array, add a Copy activity after the data flow activity, using the same target JSON dataset as both source and sink. In the Copy activity sink, set the File pattern to Array of objects.
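
    Conceptually, this last step does the equivalent of the following (a sketch; the file name matches the example above):

    import json

    # Read one JSON object per line (the data flow output) ...
    with open("jsontarget1.json") as f:
        objects = [json.loads(line) for line in f if line.strip()]

    # ... and rewrite the same file as a single JSON array.
    with open("jsontarget1.json", "w") as f:
        json.dump(objects, f, indent=2)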


    Now, when the whole pipeline is executed, the target JSON file is generated with the desired structure.


    Hope this helps.

    If the answer is helpful, please click Accept Answer and kindly upvote it. If you have any further questions about this answer, please click Comment.
