Flatten a column that has array values into several rows using Dataflow

Tzu-Ko Hung 21 Reputation points
2022-11-28T10:22:37.51+00:00

The raw data I have right now is looking like the following:
264763-image.png

I am thinking of using the flatten function in dataflow using Data Factory to convert the data into the following:
264699-image.png

And eventually I want my data to look like this:
264774-image.png

However, I am now stuck in the middle going from the first image to the second one. I did some research and I think I should be using the flatten function in DataFlow using Data Factory. Based on my research, I should select the column that I want to flatten in the red circle below (in the above case, I should select column3 in there), however, I see nothing and this is blocking me from moving forward.

264734-image.png

Anyone know why I am not seeing any column option? And what I should do to be able to select the column I need?

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,710 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,240 questions
{count} votes

Accepted answer
  1. AnnuKumari-MSFT 32,161 Reputation points Microsoft Employee
    2022-11-29T13:51:04.43+00:00

    Hi @Tzu-Ko Hung ,

    Welcome to Microsoft Q&A platform and thankyou for posting your question here.

    As I understand your issue, you are trying to flatten the data present in a column as array of JSON and convert it into multiple columns. Please let me know if my understanding is incorrect.

    Since the data in column3 is in string format, we need to convert it into array first in order to unroll it using flatten transformation . Also, another important point is since the source data is in tabular format, you need to convert it into json format first to apply flatten transformation on top of it.

    1. Use derived column transformation to create a new column with this expression: @(column1=column1, column2=column2, column3=split(replace(replace(replace(column3,'[',''),']',''),'},{','};{'), ';'))
    2. Use select transformation to remove all other columns coming from source.
    3. Use flatten transformation and unroll by using this new column and select all underlying columns to flatten it.
    4. Use Parse transformation to create 3 new columns : column3a: (a as string) , column3b: (b as string) , column3c: (c as string)
    5. Use select transformation to deselect column3 from source.

    265291-1.gif

    265244-2.gif

    For more details, kindly watch out : How to flatten and parse the json data of sql column using mapping dataflow

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png and take satisfaction survey whenever the information provided helps you.
      Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    2 people found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful