I'm deduping IDs but want to write a second occurrence's new phone number in another column on the first occurrence's row; can I do that in Data Flows?

Jeff vG 86 Reputation points
2021-06-17T13:16:12.01+00:00

I have data like this

ID        phone1        phone2
a         123-4567      null
a         123-8910      null

And I don't necessarily want to dedupe that second ID and throw away the row because I want that second phone number in phone2, like

ID        phone1        phone2
a         123-4567      123-8910

So I can write both phone numbers to a SQL DB tables with a PK on ID.

How do I do this in Data Flows?

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,696 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,196 questions
0 comments No comments
{count} votes

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 85,746 Reputation points Microsoft Employee
    2021-06-18T08:25:59.3+00:00

    Hello @Jeff vG ,

    Thanks for question and using MS Q&A platform.

    Yes, you can do this in Azure Data Flows.

    Please do check the below steps to implement your scenario:

    Step1: Add the source dataset to the dataflow.

    106963-adf-source1.gif

    Step2: Add the Aggregate transformation in mapping data flow and do the below steps:

    Group by : Columns name ID
    Aggregates: Column name phone1 and use collect(phone1) expression to collect all the values present in phone1 column.

    106944-adf-aggregate1.gif

    Step3: Add the Derived column transformation in mapping data flow and do the below steps:

    Column phone1 and Expression phone1[1]
    Column phone2 and Expression phone1[2]

    106975-adf-derviedcolumn1.gif

    Step4: add the sink in mapping data flow and configuration the output configuration.

    106992-adf-sink1.gif

    Step5: Now create a pipeline and run the dataflow.

    106926-adf-pipeline.gif

    Hope this helps. Do let us know if you any further queries.

    ---------------------------------------------------------------------------

    Please "Accept the answer" if the information helped you. This will help us and others in the community as well.

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful