Azure Data Factory pipeline with one source and three destinations based on provider id

Peter M. Florenzano 1 Reputation point
2021-04-01T14:40:22.913+00:00

Hello everyone,

I'm in the process of planning a demo for one of our clients. They are a health care organization with many providers, but for the sake of the demo this will be 3-4 records per store type; the store types are Blob Storage, Azure Data Lake, and SFTP.

The source data resides in an Azure SQL Database. Based on the provider id and store type, each record will land in one of three destinations (Azure Data Lake, Blob Storage, or SFTP Gateway).

I'm not sure how I would go about developing this without separate pipelines.
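What I'm picturing is a single pipeline built from a Lookup over the routing table, a ForEach over its output, and a Switch on the store type that runs the matching Copy activity. Here is a rough sketch of that shape (all table, dataset, and column names below are placeholders, not our real ones):

    {
        "name": "RouteProviderData",
        "properties": {
            "activities": [
                {
                    "name": "LookupProviders",
                    "type": "Lookup",
                    "typeProperties": {
                        "source": {
                            "type": "AzureSqlSource",
                            "sqlReaderQuery": "SELECT ProviderID, StoreType FROM dbo.ProviderRouting"
                        },
                        "dataset": { "referenceName": "AzureSqlProviderRouting", "type": "DatasetReference" },
                        "firstRowOnly": false
                    }
                },
                {
                    "name": "ForEachProvider",
                    "type": "ForEach",
                    "dependsOn": [ { "activity": "LookupProviders", "dependencyConditions": [ "Succeeded" ] } ],
                    "typeProperties": {
                        "items": { "value": "@activity('LookupProviders').output.value", "type": "Expression" },
                        "activities": [
                            {
                                "name": "SwitchOnStoreType",
                                "type": "Switch",
                                "typeProperties": {
                                    "on": { "value": "@item().StoreType", "type": "Expression" },
                                    "cases": [
                                        {
                                            "value": "BLOB",
                                            "activities": [
                                                {
                                                    "name": "CopyProviderToBlob",
                                                    "type": "Copy",
                                                    "inputs": [ { "referenceName": "AzureSqlProviderData", "type": "DatasetReference" } ],
                                                    "outputs": [
                                                        {
                                                            "referenceName": "BlobSinkByProvider",
                                                            "type": "DatasetReference",
                                                            "parameters": { "ProviderId": "@item().ProviderID" }
                                                        }
                                                    ],
                                                    "typeProperties": {
                                                        "source": {
                                                            "type": "AzureSqlSource",
                                                            "sqlReaderQuery": {
                                                                "value": "@concat('SELECT * FROM dbo.ProviderData WHERE ProviderID = ''', string(item().ProviderID), '''')",
                                                                "type": "Expression"
                                                            }
                                                        },
                                                        "sink": { "type": "DelimitedTextSink" }
                                                    }
                                                }
                                            ]
                                        },
                                        { "value": "ADL", "activities": [] },
                                        { "value": "SFTP", "activities": [] }
                                    ]
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }

The ADL and SFTP cases would hold the same kind of Copy activity pointed at a parameterized ADLS Gen2 or SFTP dataset. Is something along these lines the right approach, or is there a better pattern?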

Any information would be greatly appreciated.

Thank you

Azure Data Factory

7 answers

  1. Peter M. Florenzano 1 Reputation point
    2021-04-02T13:52:54.147+00:00

    @Nandan Hegde That works, thank you very much! The only thing this process still needs to do is create a new container in the Blob destination for each ProviderID in the table whose store type is BLOB, and do the same for the ADL and SFTP destinations.

    What it's currently doing is creating a container for the first ProviderID only and copying into it all of the records whose store type equals BLOB, instead of creating a new container for each ProviderID.

    84083-adloutput-04022021.png

    I'm sure an adjustment needs to be made on the dataset side.
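    If the pipeline is the usual Lookup -> ForEach -> Copy pattern, my guess is that the sink dataset needs a ProviderID parameter that drives the container name, something along these lines (the dataset and linked service names here are just placeholders):

        {
            "name": "BlobSinkByProvider",
            "properties": {
                "type": "DelimitedText",
                "linkedServiceName": { "referenceName": "AzureBlobStorageLS", "type": "LinkedServiceReference" },
                "parameters": {
                    "ProviderId": { "type": "string" }
                },
                "typeProperties": {
                    "location": {
                        "type": "AzureBlobStorageLocation",
                        "container": { "value": "@dataset().ProviderId", "type": "Expression" },
                        "fileName": { "value": "@concat(dataset().ProviderId, '.csv')", "type": "Expression" }
                    },
                    "columnDelimiter": ",",
                    "firstRowAsHeader": true
                }
            }
        }

    The Copy activity inside the ForEach would then pass @item().ProviderID into that parameter so each iteration writes to its own container (the copy should create the container if it doesn't already exist; container names have to be lowercase). The same idea would apply to the ADLS and SFTP datasets, using a parameterized folder path instead of a container.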

    Here is a screenshot of how the containers should look, with one container per unique ProviderID:

    84111-sampledataset-2.png

    Any help would be greatly appreciated.

    Thanks again


  2. Kiran-MSFT 691 Reputation points Microsoft Employee
    2021-04-05T16:30:50.103+00:00

    Since this is a data processing problem, it is best handled by a data flow. Read from the source and use a conditional split transform to place the data into three different folders/files in a temporary staging area on the lake (SFTP is not yet available as a sink in data flows). Then use a Copy activity to write the staged data to the SFTP location.

    I also assume you want this solution to scale. Iterating over rows with a ForEach activity is not a scalable solution and will only work for small data loads.
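
    For the last hop, the Copy activity that moves the staged files from the lake to SFTP could look roughly like this (a sketch; the dataset names and staging path are placeholders):

        {
            "name": "CopyStagedFilesToSftp",
            "type": "Copy",
            "inputs": [ { "referenceName": "LakeStagingSftpFolder", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "SftpDestinationFolder", "type": "DatasetReference" } ],
            "typeProperties": {
                "source": {
                    "type": "DelimitedTextSource",
                    "storeSettings": {
                        "type": "AzureBlobFSReadSettings",
                        "recursive": true,
                        "wildcardFolderPath": "staging/sftp",
                        "wildcardFileName": "*.csv"
                    }
                },
                "sink": {
                    "type": "DelimitedTextSink",
                    "storeSettings": { "type": "SftpWriteSettings" }
                }
            }
        }

    Since Blob and ADLS Gen2 are supported data flow sinks, only the SFTP split really needs this second hop; the Blob and ADLS splits can be written directly from the data flow.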