Flowlets question

Marc van der Wielen 46 Reputation points
2022-03-11T07:57:14.743+00:00

Hi,

I am exploring the usage of flowlets and wonder how I should implement the following scenario:

I have a dataflow which reads a source with 10 columns. One of these columns contains a customer number, on which I want a flowlet to execute some logic.
The flowlet will accept the customer number as input and return another column as output.

How can I use the output column in the dataflow in such a way that the 10 columns from the source stay available further downstream, together with the output column returned from the flowlet?

When I develop the dataflow as follows:
Source -> Flowlet -> Sink
I will only have one output column after the flowlet while I want it to be 11 columns.

Should I create an additional branch and perform a lookup on the flowlet to accomplish what I want or is there a better way?

Source -> Flowlet
---------|
---------|
---------|Source (branch) -> Lookup flowlet (customer id) -> Sink

Thanks for your thoughts!

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-03-14T15:22:43.617+00:00

    Hi @Marc van der Wielen ,

    Thank you for following up with details.

    For a flowlet in ADF to return all the source columns, all of those columns have to be supplied to it as input. If they are not supplied, the flowlet will always return only the one output column.

    For example, let's say Source1 has columns A, B, C and Source2 has columns B, M, N. If we want the flowlet to work for both sources, we should add A, B, C, M, N as its input columns.
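The column list in the example above is simply the union of both sources' columns, with the shared column B listed once. A minimal Python sketch of that bookkeeping follows. It is plain Python for illustration only, not an ADF API, and the variable names are assumptions.

```python
# Columns of the two sources from the example above.
source1_columns = ["A", "B", "C"]
source2_columns = ["B", "M", "N"]

# Union of both lists, keeping first-seen order and dropping the repeated "B".
flowlet_input_columns = list(dict.fromkeys(source1_columns + source2_columns))

print(flowlet_input_columns)  # ['A', 'B', 'C', 'M', 'N']
```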

    Below is a high-level idea for avoiding having to add all source columns to the flowlet. Kindly check whether it is feasible in your case.

    Step1: Create a flowlet with only the customerNumber column as input.
    Step2: Inside the dataflow, use a New Branch on the source transformation to clone the source as a separate branch, OR add another source transformation with the same source data.
    Step3: On one of the two paths, use a Select transformation to keep only the customerNumber column.
    Step4: Add the Flowlet transformation after the Select transformation.
    Step5: After the Flowlet transformation, join the two paths (from Step2 and Step3) on the common column.
    Please note: make sure there is a common column between the source and the new branch, so that they can easily be joined as mentioned in Step5.
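The steps above can be sketched in plain Python to show how the columns move through the two paths. This is only an illustration of the data shape, not data flow script or any ADF API. The sample rows, the `name` and `country` columns, the function names, and the zero-padding logic standing in for the flowlet are all assumptions made for the example.

```python
# Illustration only: plain Python standing in for data flow transformations.

# Hypothetical source rows (3 columns here instead of 10, for brevity).
source_rows = [
    {"customerNumber": "42", "name": "Alice", "country": "NL"},
    {"customerNumber": "7", "name": "Bob", "country": "BE"},
]


def select_customer_number(rows):
    """Step3: keep only the customerNumber column on the branched path."""
    return [{"customerNumber": row["customerNumber"]} for row in rows]


def flowlet(rows):
    """Step1/Step4: hypothetical flowlet logic.

    Takes customerNumber only and returns it together with one derived
    column, so that a common column exists for the join.
    """
    return [
        {
            "customerNumber": row["customerNumber"],
            "paddedCustomerNumber": row["customerNumber"].zfill(8),
        }
        for row in rows
    ]


def join_on_customer_number(left_rows, right_rows):
    """Step5: join the original path with the flowlet path on customerNumber."""
    lookup = {row["customerNumber"]: row for row in right_rows}
    return [{**row, **lookup[row["customerNumber"]]} for row in left_rows]


# Step2: the same source feeds both the main path and the flowlet path.
flowlet_output = flowlet(select_customer_number(source_rows))
result = join_on_customer_number(source_rows, flowlet_output)

for row in result:
    print(row)
```

Each source row keeps its original columns and gains the flowlet's output column. Because the lookup is keyed on customerNumber, repeated customer numbers on the flowlet path do not multiply rows in this sketch.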

    The screenshot below shows the suggested high-level implementation.
    182887-image.png

    Hope this helps. Please let us know if you have any further queries.

    ------------------

    Please consider hitting the Accept Answer button. Accepted answers help the community as well.


2 additional answers

  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-03-11T16:02:59.12+00:00

    Hi @Marc van der Wielen ,

    Thank you for posting your query on the Microsoft Q&A platform.

    To summarize the ask: you are looking for a way to include all source columns in the flowlet along with the customer number column. Please feel free to correct me if my understanding is wrong.

    While creating a flowlet, we can add all the columns under input, either manually or by using a source. Those columns will be available throughout the flowlet's logic and in the flowlet's output too.

    Please check the screenshot below, where I added columns to the flowlet input along with the custNum column.
    182342-image.png

    All of the above columns are output from the flowlet.
    182351-image.png

    Kindly check the video below, which gives a detailed explanation of flowlets. This will help.
    Flowlets in Mapping data flow in Azure Data Factory

    Hope this helps. Please let us know if you have any further queries.


  2. Marc van der Wielen 46 Reputation points
    2022-03-17T09:05:44.29+00:00

    @ShaikMaheer-MSFT sorry for the delay. Yes, your answer is what I already expected. Creating an additional branch and then using a join or a lookup to the flowlet will solve the scenario, but I do feel this is not optimal, as it requires an additional join or lookup and two data streams. I wondered whether the flowlet supports some column passthrough feature that dynamically accepts any number of input columns (besides the input columns needed for the logic in the flowlet), which you could then use further downstream. Apparently this is not possible, and therefore this alternative solution is how I should solve my scenario.

    Kind regards,

    Marc