Issues while flatten the complex JSON object in Azure Data Factory

Sachin 26 Reputation points
2021-11-10T22:16:29.227+00:00

148350-source-setting.pngHello,

I have JSON data which I retrieved from the API. I stored the JSON in the blob storage and now I am trying to flatten the JSON using data flow as the JSON object is very complex. Please refer the "sample-json.txt". Below are the steps which I did.

1) Import the source JSON data and selected the "Document Per Line" JSON setting.
2) Created Flatten1 activity and retrieved the 6 columns from "data" object and added the "include" object which has media array. Please refer the "flatten1-setting.png"
3) Created Flatten 2 activity and retrieved the "media key", "type" and "count" attribute from media array apart from 6 attributes from above step. Please refer the "flatten2-setting.png"
4) Preview the final data. Please refer the "flatten2-datapreview.png"

As per the flatten2 data preview my output rows are duplicating because of the media array object. Can you please help to know how we can ignore that.

I tried using Source Option -> JSON settings -> Single Document/Array of documents but I am getting "Malformed records are detected in schema inference. FAILFAST" error. The only settings which worked is "Document Per Line". I am not sure why I am getting the error.

Please help me to resolve the issues with the JSON flattening.

Thanks

148349-flatten1-setting.png148314-datapreview-flattenoutput.png148315-flatten2-setting.png148324-flatten2-datapreview.png
148336-sample-json.txt

148325-flatten1-datapreview.png

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,246 questions
{count} vote

1 answer

Sort by: Most helpful
  1. ShaikMaheer-MSFT 38,506 Reputation points Microsoft Employee
    2021-11-11T14:14:45.013+00:00

    Hi @Sachin ,

    Thank you for posting query in Microsoft Q&A Platform.

    I see flatten transformation is behaving correctly only in your case.

    Flatten transformation take array values inside hierarchical structures such as JSON and unroll them into individual rows.

    So, when array will get flatten automatically rows belonging to that array will get duplicated or flatten. That is what idea of Flatten transformation is.

    Kindly check below link for better idea about flatten transformation.
    https://www.youtube.com/watch?v=zrjYg2_2Y9I

    Please let us know if your expected output is different with sample output data. So that we can guide through steps and transformations to use to achieve same.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.