Transform items of an array into Columns in Azure Data Factory

Pankaj Dhami 130 Reputation points
2022-03-07T14:26:56.747+00:00

Consider the following JSON structure, where there are multiple arrays, e.g. questions and polls:

![180666-image.png][1]

The desired output is a flat, table-like structure where each item of the questions and polls arrays is represented as a separate column. For example, if the questions array of a particular user has 2 records, array[0] becomes the data for column Ques1 and array[1] becomes the data for column Ques2:

![180734-image.png][2]

Is there any way to achieve this in an Azure Data Factory data flow?

    {
      "email": "xxxxxxxx1@123.com",
      "questions": [
        {
          "questionid": 6268255,
          "createtimestamp": "2015-05-19T11:55:12-07:00",
          "content": "sdfsd password?",
          "answer": {
            "createtimestamp": "2015-05-19T11:57:12-07:00",
            "content": "asdfsd",
            "presenterid": 813717,
            "presentername": "a reerXXXXX",
            "privacy": "Private"
          }
        },
        {
          "questionid": 6268257,
          "createtimestamp": "2015-05-19T11:58:12-07:00",
          "content": "asfdfdftexts directly?",
          "answer": {
            "createtimestamp": "2015-05-19T11:59:59-07:00",
            "content": "sdfsdfsf for an event",
            "presenterid": 813706,
            "presentername": "aeee XXXXX",
            "privacy": "Private"
          }
        }
      ],
      "polls": [
        {
          "pollid": 16818601,
          "pollsubmittedtimestamp": "2015-05-19T11:45:12-07:00",
          "pollquestionid": 6268336,
          "pollquestion": "poll 1",
          "pollanswers": [ "poll 1 ans" ]
        },
        {
          "pollid": 16818630,
          "pollsubmittedtimestamp": "2015-05-19T11:47:12-07:00",
          "pollquestionid": 6268358,
          "pollquestion": "pol 2 solutions?",
          "pollanswers": [ "Unfamiliar" ]
        }
      ]
    }

[1]: /api/attachments/180666-image.png?platform=QnA
[2]: /api/attachments/180734-image.png?platform=QnA

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. AnnuKumari-MSFT 32,161 Reputation points Microsoft Employee
    2022-03-08T08:22:51.6+00:00

    Hi anonymous user,

    Welcome to the Microsoft Q&A platform, and thank you for posting your query.

    As I understand it, you have a JSON input that you want to load as table content into Azure SQL DB, and in the process you want to transform array items into separate columns of the table. Please correct me if my understanding is wrong.

    To achieve this, I would recommend using a data flow, which is the best solution for transforming data in ADF. Here are the steps to follow:

    1. Upload the JSON file to ADLS and create a linked service and a dataset pointing to that file. Import the schema during this process.
    2. Create a new data flow and, in the Source transformation, use the dataset created above. Go to Source options and, under JSON settings, select 'Array of documents' as the document form.

    180932-image.png
    3. Use a Surrogate key transformation to create an identity column 'Id'.

    180925-image.png

    4. Use a Select transformation to put the columns in the correct sequence.

    180941-image.png

    5. Use a Flatten transformation to flatten the 'questions' array.

    180829-image.png

    180819-image.png

    6. Use a Sink transformation with an Azure SQL DB dataset to load the data into the SQL table.

    180848-image.png
    180899-image.png
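
To make the data shapes concrete, here is a minimal Python sketch (not ADF code) of the two reshapes discussed in this thread. `flatten_rows` mimics what a Flatten step yields, one row per array element, while `pivot_to_columns` produces the layout the question asks for, with element 1 in `Ques1`, element 2 in `Ques2`, and so on. The function names, the `Poll` column prefix, and the choice of `content` and `pollquestion` as the values to keep are assumptions made for illustration; the sample document is a trimmed copy of the one in the question.

```python
# Trimmed copy of the sample document from the question.
doc = {
    "email": "xxxxxxxx1@123.com",
    "questions": [
        {"questionid": 6268255, "content": "sdfsd password?"},
        {"questionid": 6268257, "content": "asfdfdftexts directly?"},
    ],
    "polls": [
        {"pollid": 16818601, "pollquestion": "poll 1"},
        {"pollid": 16818630, "pollquestion": "pol 2 solutions?"},
    ],
}


def flatten_rows(document, array_key):
    """Return one row per array element (the shape a flatten/unroll gives)."""
    return [
        {"email": document["email"], **item}
        for item in document.get(array_key, [])
    ]


def pivot_to_columns(document, array_key, value_key, prefix):
    """Return one row per document; element i goes to column prefix + str(i)."""
    row = {"email": document["email"]}
    for i, item in enumerate(document.get(array_key, []), start=1):
        row[f"{prefix}{i}"] = item[value_key]
    return row


# Flattening 'questions' yields two rows for this user.
question_rows = flatten_rows(doc, "questions")

# Pivoting both arrays yields a single wide row per user.
# "Poll" is a hypothetical column prefix; the question only names Ques1/Ques2.
wide_row = {
    **pivot_to_columns(doc, "questions", "content", "Ques"),
    **pivot_to_columns(doc, "polls", "pollquestion", "Poll"),
}

if __name__ == "__main__":
    for r in question_rows:
        print(r)
    print(wide_row)
```

Note that flattening alone leaves one row per question, so getting `Ques1` and `Ques2` side by side needs an extra pivot-style step keyed on the element's position.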

    Hope this helps. Please let us know if you have any further queries.

    1 person found this answer helpful.

1 additional answer

  1. Nasreen Akter 10,791 Reputation points
    2022-03-07T16:17:38.323+00:00

    Hi anonymous user,

    Thank you for the question. You can achieve that either with a Copy activity in a pipeline or with a data flow.

    Hope this helps! Thanks :)