How to Flatten an Array JSON Structure in ADF

Jancy John 86 Reputation points
2022-10-18T16:49:21.513+00:00

Hi All,

I'm trying to flatten the following JSON structure in Azure Data Factory so that I can get the data from 'rows[]' in a tabular format to store in a SQL table. The JSON response has columns[] and rows[], where columns[] holds the field names and rows[] holds the corresponding data for those fields. I want the data from columns[].name as the column names and the data from rows[] as the rows in the result.
I have tried using an ADF Data Flow and the flatten transformation but couldn't find a way to get the desired result. I am new to Azure ADF and JSON, so any help or suggestions would be much appreciated.

Thanks in advance for your help!

The sample JSON File Structure is below:

{
  "id": "XXXX",
  "name": "XYZ",
  "type": "cost",
  "location": null,
  "sku": null,
  "eTag": null,
  "properties": {
    "nextLink": null,
    "columns": [
      {
        "name": "Amount",
        "type": "Number"
      },
      {
        "name": "Date",
        "type": "Number"
      },
      {
        "name": "Group",
        "type": "String"
      },
      {
        "name": "Type",
        "type": "String"
      },
      {
        "name": "Location",
        "type": "String"
      },
      {
        "name": "Currency",
        "type": "String"
      }
    ],
    "rows": [
      [
        0.0012,
        20220901,
        "RAM",
        "storage",
        "EU",
        "GBP"
      ],
    ]
  }
}

The desired result is:
Amount    Date      Group   Type     Location   Currency
0.0012    20220901  RAM     storage  EU         GBP
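
For anyone who wants to sanity-check this mapping outside ADF, the short Python/pandas sketch below reproduces the desired table from the sample JSON. It assumes the stray trailing comma after the inner row array is removed so the file parses, and sample.json is just a placeholder path. It only illustrates the columns[]/rows[] relationship; it is not the ADF solution I'm after.

import json
import pandas as pd

# Placeholder path to the sample response shown above.
with open("sample.json") as f:
    payload = json.load(f)

props = payload["properties"]
names = [c["name"] for c in props["columns"]]   # Amount, Date, Group, ...

# Each inner array in rows[] becomes one row, labelled with the column names.
df = pd.DataFrame(props["rows"], columns=names)
print(df)
#    Amount      Date Group     Type Location Currency
# 0  0.0012  20220901   RAM  storage       EU      GBP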


Accepted answer
  1. AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
    2022-10-19T08:46:15.613+00:00

    Hi @Anonymous ,

    Welcome to the Microsoft Q&A platform, and thanks for posting your question here.
    As I understand your question, you are trying to copy JSON into a SQL table by flattening the complex data.

    First of all, rows[] has a single nested array inside it; however, at the end of it there is a ',' which is not required here. To process it, I am removing one level of the array: instead of rows[[]], I am providing rows[]. You can follow the same steps on the original source data:

    • Add two source transformations pointing to the same source JSON. Select 'Array of documents' in the JSON settings for both sources.
    • Add a flatten transformation to each of the sources. For one flatten transformation, select rows[] to unroll by, and for the other, select columns[] to unroll by.

    [screenshot: flattenjson.gif]

    • Add a Surrogate Key transformation to each of them to create a new incremental column called rownum.
      [screenshot: surrogatekey.gif]
    • Add a Join transformation to join the two streams on the id and rownum columns.
    • Use a Select transformation to remove the unwanted columns after the join.
      [screenshot: join.gif]
    • Use a Pivot transformation and group by the id column. In the pivot key tab, use the name column, and in the pivoted columns provide an aggregate function such as min(rows), max(rows) or avg(rows). (The pandas sketch after this list mirrors this join-and-pivot logic.)
    • Go to the Data preview tab of the Pivot transformation and select the 'Map drifted' option; it will automatically add a derived column transformation. Select the id column and remove it, and reorder the other columns as needed.

    [screenshot: pivot1.gif]

    • Add a sink transformation with the SQL dataset to load the data into the SQL table.
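
    For a plain-code cross-check of the same sequence (flatten the values and the column names separately, number them, join on the row number, then pivot the names into headers), here is a rough pandas sketch. It is only an illustration of the logic, not something ADF produces; sample.json is a placeholder path, and it assumes the extra ',' has been removed so the file parses.

    import json
    import pandas as pd

    # Placeholder path to the sample JSON from the question.
    with open("sample.json") as f:
        doc = json.load(f)

    props = doc["properties"]

    # "Source 1": one record per value; taking rows[0] here plays the role of
    # removing one array level from rows[[]].
    values = pd.DataFrame({"id": doc["id"], "rows": props["rows"][0]})
    values["rownum"] = range(1, len(values) + 1)      # surrogate key

    # "Source 2": one record per entry in columns[] (name, type).
    cols = pd.DataFrame(props["columns"])
    cols["id"] = doc["id"]
    cols["rownum"] = range(1, len(cols) + 1)          # surrogate key

    # Join on id + rownum, then pivot the names into column headers.
    joined = values.merge(cols, on=["id", "rownum"])
    pivoted = joined.pivot(index="id", columns="name", values="rows")
    pivoted = pivoted[[c["name"] for c in props["columns"]]]   # keep original order
    print(pivoted.reset_index(drop=True))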

    Hope this helps. Please let us know if you have any further queries.

    2 people found this answer helpful.

1 additional answer

  1. Jancy John 86 Reputation points
    2022-10-19T15:29:15.867+00:00

    Hi @AnnuKumari-MSFT

    Thank you so much for your valuable time in resolving my query. It really helped, and I was able to replicate the same steps.

    However, I still need your help with one more query on this. This is entirely my mistake while drafting the sample JSON response.

    rows[] is actually a nested array with more than one group of items; I have around 140 arrays inside rows[] in the original JSON response. I only gave a small sample due to the confidentiality of the data and forgot to mention that rows[] is a nested array, rows[[]].

    I have replicated your solution in ADF, added one more flatten transformation (after flatten1) to flatten the nested rows, and got the first row of data in the output. However, I would like to have multiple rows, as in the real JSON. Could you please help me with the necessary changes to the Surrogate Key and Join transformations?

    [screenshot: adf-dataflow.png]

    Thank you very much for your kind help on this.

    The sample JSON File Structure is below:
    {
      "id": "XXXX",
      "name": "XYZ",
      "type": "cost",
      "location": null,
      "sku": null,
      "eTag": null,
      "properties": {
        "nextLink": null,
        "columns": [
          {
            "name": "Amount",
            "type": "Number"
          },
          {
            "name": "Date",
            "type": "Number"
          },
          {
            "name": "Group",
            "type": "String"
          },
          {
            "name": "Type",
            "type": "String"
          },
          {
            "name": "Location",
            "type": "String"
          },
          {
            "name": "Currency",
            "type": "String"
          }
        ],
        "rows": [
          [
            0.0012,
            20220901,
            "RAM",
            "storage",
            "EU",
            "GBP"
          ],
          [
            0.0015,
            20220902,
            "VM",
            "Compute",
            "EU",
            "GBP"
          ],
          [
            0.0017,
            20220903,
            "Disk",
            "storage",
            "EU",
            "GBP"
          ]
        ]
      }
    }

    The desired result is below (with the real source data I should have 140 rows and 6 columns):
    Amount    Date      Group   Type     Location   Currency
    0.0012    20220901  RAM     storage  EU         GBP
    0.0015    20220902  VM      Compute  EU         GBP
    0.0017    20220903  Disk    storage  EU         GBP
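
    To spell out in plain code what I am after (just a rough pandas sketch showing why each inner array needs its own row key so the join and pivot give one output row per inner array; sample_nested.json is only a placeholder path, not the real response):

    import json
    import pandas as pd

    # Placeholder path; the real response has around 140 inner arrays in rows[][].
    with open("sample_nested.json") as f:
        doc = json.load(f)

    props = doc["properties"]

    # Column lookup keyed by position (the columns[] flatten + surrogate key).
    cols = pd.DataFrame(props["columns"])
    cols["colnum"] = range(1, len(cols) + 1)

    # Value table keyed by (outer row, position). The outer row number is the
    # extra key that the second flatten / per-row surrogate key has to carry.
    values = pd.DataFrame(
        [
            {"rownum": r, "colnum": c, "value": v}
            for r, inner in enumerate(props["rows"], start=1)
            for c, v in enumerate(inner, start=1)
        ]
    )

    # Join position -> column name, then pivot into one output row per rownum.
    joined = values.merge(cols, on="colnum")
    result = joined.pivot(index="rownum", columns="name", values="value")
    result = result[[c["name"] for c in props["columns"]]]   # restore column order
    print(result.reset_index(drop=True))
    #    Amount      Date Group     Type Location Currency
    # 0  0.0012  20220901   RAM  storage       EU      GBP
    # 1  0.0015  20220902    VM  Compute       EU      GBP
    # 2  0.0017  20220903  Disk  storage       EU      GBP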

    1 person found this answer helpful.
