Azure Data Factory pipeline with one source and three destinations based on provider id

Peter M. Florenzano 1 Reputation point
2021-04-01T14:40:22.913+00:00

Hello everyone,

I'm in process of planning out a demo for one of our clients. They are a health care provider with many providers, but for the demo sake, this will be 3 - 4 records per store type, which is BLOB storage, Azure Data Lake and SFTP.

Based on the source data, which resides in Azure SQL Database, provider id and store type, the data will land in one out of three destinations (ADL, BLOB Storage & SFTP Gateway).

I'm not sure how I would go about developing this without separate pipelines.

Any information would be greatly appreciated.

Thank you

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,526 questions
0 comments No comments
{count} votes

7 answers

Sort by: Most helpful
  1. Nandan Hegde 29,886 Reputation points MVP
    2021-04-01T16:21:30.62+00:00

    Hey,
    Can you please share some sample data and some example of your expectation which might help us to provide a better approach.

    Based on my understanding that there is a config table which indicates what data needs to be copied where:
    You can leverage a combination of below activities in a single pipeline:

    Lookup activity to get the provider id and destination
    The output of lookup activity can be added in if activity or switch activity with each scenario having a copy activity with SQL db as source and either of the destination

    0 comments No comments

  2. Peter M. Florenzano 1 Reputation point
    2021-04-01T16:29:58.917+00:00

    Hi Nandan,

    Yes of course, below is my sample data:

    83744-sampledataset.png

    The records that have a StoreType = BLOB need to be placed in Azure BLOB Storage. The records with a StoreType = 'ADF' need to be placed in Azure Data Lake and so on.


  3. Peter M. Florenzano 1 Reputation point
    2021-04-01T18:51:15.877+00:00

    Thank you,

    I have it setup like that already, the only issue I'm running across is the ProviderID folders need to be dynamically created during runtime.

    Should I use a lookup activity and a for-each loop or is there another way I can pass that ProviderID as an array?


  4. Peter M. Florenzano 1 Reputation point
    2021-04-02T11:14:52.49+00:00

    Thank you for your response Nandan,

    What I'm struggling with is how do I pass the list of distinct of ProviderID's from my lookup to the ForEach loop? Within the Lookup, I'm using the following query:

    SELECT DISTINCT ProviderID
    FROM [dbo].[PatientInformation]
    WHERE StoreType = 'BLOB'

    How do I pass the above values as an array to the ForEach loop? I did create a parameter as ProviderID with a parameter type as an Array for the pipeline which is as follows:

    @pipeline().parameters.ProviderID

    When I attempt to run the pipeline, it's asking me for a value. Shouldn't the value be passed automatically?

    Thanks again


  5. Peter M. Florenzano 1 Reputation point
    2021-04-02T13:04:22.653+00:00

    @Nandan Hegde

    Hi,

    This message comes up when I enter that string within the Items section of the ForEach section:

    "The output of activity 'Test' can't be referenced since it is either not an ancestor to the current activity or does not exist"