Split the column values in dataflow in Azure Data factory

amikm 11 Reputation points
2021-08-19T13:42:29.087+00:00

I have few records that have the following columns

124689-1.png

I have few records that has following columns

Id, Person, housenumber
1, Mark, 101,102,
103,104,
105,106,
2,Alice, 107
3,Bob, 108,109,110
enter image description here

Note: So, we have 3 Id, where the first Id has housenumbers present in this format

I have created a dataflow, where I took the source file has source and added a derived transformation to split the housenumber split(housenumber, ', \n') then again created another derived column to get the index position of each house number

house1 = split(housenumber, ', \n')1
house2 = split(housenumber, ', \n')2..etc
the problem is I am not able to split the data accordingly. Also, some person can have more than 100 houses as well, in that case, creating a derived column 100 times will cause the problem as we need to update this derived transformation all the time.

Please help me to fix this kind of issue using dataflow in the Azure data factory

Note: For simplicity purposes, I used Power BI query editor to show you the data. But we want the fix as per dataflow transformation that I can use further using pivoting then sink transformation

Expected output
124727-2.png

124690-3.png

124706-4.png

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,972 questions
{count} vote

1 answer

Sort by: Most helpful
  1. ShaikMaheer-MSFT 38,321 Reputation points Microsoft Employee
    2021-08-20T05:33:09.37+00:00

    Hi @amikm ,

    Thank you for posting query in Microsoft Q&A Platform.

    You can achieve this using split() function in Derived column transformation and Flatten transformation.

    Please check below detailed example to understand it better.

    Step1: Source Transformation, which has skills column with comma separated values.

    124830-source.gif

    Step 2: Derived Column Transformation, here I am using split() function to convert comma separated string values to array
    expression used: split(skills,',')

    124943-derviedcol.gif

    Step 3: Flatten Transformation, to flatten your skills array to multiple rows.

    124905-flatten.gif

    Step 4: Use Sink Transformation, to load your data in to desired target.

    Hope this will help. Thank you.

    ------------------------

    • Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification.
    3 people found this answer helpful.