How to copy complex multi-array JSON into a SQL table

Jancy John 86 Reputation points
2022-10-26T14:34:54.443+00:00

Hi All,

I'm trying to flatten the following JSON structure in Azure Data Factory so that I can get the data from rows[[]] into a tabular format to store in a SQL table. The JSON response has columns[] and rows[[]], where columns[] holds the field names and rows[[]] holds the corresponding data for those fields. I want the data from columns[].name as the column names and the data from rows[[]] as the rows in the result.

I have already raised this query in the forum and got a valuable solution. However, since this is a multi-array, I still couldn't copy the JSON into the SQL table.

Thanks in advance for your help!

The sample JSON file structure is below:
{
  "id": "XXXX",
  "name": "XYZ",
  "type": "cost",
  "location": null,
  "sku": null,
  "eTag": null,
  "properties": {
    "nextLink": null,
    "columns": [
      { "name": "Amount", "type": "Number" },
      { "name": "Date", "type": "Number" },
      { "name": "Group", "type": "String" },
      { "name": "Type", "type": "String" },
      { "name": "Location", "type": "String" },
      { "name": "Currency", "type": "String" }
    ],
    "rows": [
      [0.0012, 20220901, "RAM", "storage", "EU", "GBP"],
      [0.0015, 20220902, "VM", "Compute", "EU", "GBP"],
      [0.0017, 20220903, "Disk", "storage", "EU", "GBP"]
    ]
  }
}

The desired result is:
Amount   Date      Group  Type     Location  Currency
0.0012   20220901  RAM    storage  EU        GBP
0.0015   20220902  VM     Compute  EU        GBP
0.0017   20220903  Disk   storage  EU        GBP

(I will have 140 rows and 6 columns in the real source data.)
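
For clarity, here is a minimal Python sketch (outside ADF) of the flattening I'm after; cost_response.json is just a placeholder file name:

import json

# A minimal sketch of the intended flattening, outside ADF:
# pair each inner array of rows[[]] with the names from columns[].
# "cost_response.json" is a placeholder for the source file.
with open("cost_response.json") as f:
    doc = json.load(f)

names = [c["name"] for c in doc["properties"]["columns"]]
records = [dict(zip(names, row)) for row in doc["properties"]["rows"]]

for record in records:
    print(record)
# {'Amount': 0.0012, 'Date': 20220901, 'Group': 'RAM', 'Type': 'storage',
#  'Location': 'EU', 'Currency': 'GBP'} ... one dict per source row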


Accepted answer
AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
2022-10-27T09:44:39.72+00:00

Hi @Anonymous ,
Thank you for using the Microsoft Q&A platform and thanks for posting your question here. I tried creating a dataflow to transform the above JSON and was able to achieve it successfully. However, there are a lot of bits and pieces involved, which makes it complex. Kindly try the steps below:

1. Add two source transformations pointing to the same source JSON. Select 'Array of documents' in the JSON settings for both sources.
2. Add a flatten transformation to each source. In one, select rows[[]] to unroll by; in the other, select columns[] to unroll by.
3. Add another flatten transformation in the first branch to unroll rows[], so that we get the individual values out of the nested array.

4. Add a surrogate key transformation to both branches to create an incremental identity column RowNum.
5. In the first branch, add a derived column transformation and create a new column, say Ranking, with this expression: iif(RowNum%6==0,6,toInteger(RowNum%6)). This assigns a rank to each value based on the number of columns (i.e. 6), so the column holds 1,2,3,4,5,6, then 1,2,3,4,5,6 again, and so on up to the last row. This is what lets us join the two branches (the Python sketch after this list walks through the same logic).

6. Add a join transformation with join type 'inner join', matching the Ranking column against the RowNum column coming from the second branch.

7. Add a select transformation and deselect the unwanted columns, keeping only rows and name.
8. Add a surrogate key transformation to create an incremental identity column called NewId.
9. Add a derived column transformation to create a new dummy column with the value iif(NewId%3==0,3,toInteger(NewId%3)). Here 3 is the number of rows. In your case, you can either hardcode it as 140 instead of 3, or parameterize it so the value is passed in as a pipeline parameter at runtime; that way, when the number of rows changes, you don't have to update anything in the dataflow, you just pass the new row count when you run the pipeline.

10. Add a pivot transformation and group by the dummy column. In the pivot key tab, select the name column, and in pivoted columns provide this expression: first(rows).

11. In the mapping tab of the pivot transformation, click 'Map drifted'; it adds another transformation and imports the current schema after the pivot.
12. Add a select transformation and deselect the dummy column, which is no longer needed.
13. Add a sink transformation to load the data into the SQL table.
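
To make the modulo expressions in steps 5 and 9 easier to follow, here is a minimal Python sketch of the same logic, outside ADF. It assumes, as the NewId expression in step 9 does, that the inner join emits the values grouped by the join key (i.e. column by column); cost_response.json is a placeholder file name.

import json
from itertools import chain

NUM_COLS = 6   # number of entries in properties.columns
NUM_ROWS = 3   # 3 in the sample; 140 with the real source data

with open("cost_response.json") as f:  # placeholder file name
    doc = json.load(f)

names = [c["name"] for c in doc["properties"]["columns"]]

# Steps 2-3: flatten rows[[]] twice -> one value per row, in row-major order.
values = list(chain.from_iterable(doc["properties"]["rows"]))

# Steps 4-5: surrogate key RowNum, then Ranking = iif(RowNum%6==0,6,RowNum%6).
ranked = [(i % NUM_COLS or NUM_COLS, v) for i, v in enumerate(values, start=1)]

# Step 6: inner join with the columns branch, whose RowNum is the column
# position; grouping by the join key leaves the values ordered column by column.
joined = [(name, v)
          for col, name in enumerate(names, start=1)
          for rank, v in ranked if rank == col]

# Steps 8-10: NewId, then dummy = iif(NewId%3==0,3,NewId%3), then pivot
# (group by dummy, pivot key = name, pivoted value = first(rows)).
table = [{} for _ in range(NUM_ROWS)]
for new_id, (name, v) in enumerate(joined, start=1):
    dummy = new_id % NUM_ROWS or NUM_ROWS
    table[dummy - 1][name] = v

for record in table:
    print(record)
# {'Amount': 0.0012, 'Date': 20220901, 'Group': 'RAM', 'Type': 'storage',
#  'Location': 'EU', 'Currency': 'GBP'} ... one dict per original row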


Hope this helps. Please let us know if you have any further queries.

