Not able to use append variable in Use query of source (copy data) through For each loop

abhilash yadav 1 Reputation point
2021-01-23T22:47:24.143+00:00

Hi,

I am beginner and learning Data factory.

I am doing below requirement.

---a Table is having data like Bird name and Bird Type ( Mourning Dove , Dove).

--- Want to generate flat files as per Bird types dynamically.

1) used lookup on table to fetch distinct Bird Type and keeping in array variable

2) For each loop will read the array values in loop and this array value has to be passed to generate through query in "copy data" and with same above table as source.

3) Can some one suggest how can i use value getting into append variable during iteration in SQL query for generating flat file.

Not able to use append variable in Use query of source (copy data) through For each loop

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,196 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Nasreen Akter 10,791 Reputation points
    2021-01-24T01:28:22.563+00:00

    Hi @abhilash yadav ,

    Thank you for asking this question. In my opinion, the solution you want to achieve in the pipeline using CopyActivity would be too much expensive as too many Activities (i.e., CopyActivities and other Activities) will run just for a single pipeline run. You can achieve the same output in a way more simpler and cleaner way by using DataFlow.

    The steps you have to do are:

    1. create a DataFlow. To make folder path dynamic, you can add a parameter folderPath
    2. add required sources (in the source activity, you can pull all data from SQL table or you can use Query, it's up to you)
    3. do data manipulations (you can do a lot of things e.g., filter, aggregate, pivot, deriveColumns etc.)
    4. before Sink, derive column "filePath" e.g., out/Birds/birdtype1.parquet
    5. in the Sink --> Settings: select FilenameOption to "As data in column" and select column with filename to "filePath".
    6. in Sink-->Optimize: Use Current Partition

    That's all. All the data with the same filePath, will be written in that specific file. You will get all the files in the folderPath you mentioned in the pipeline.

    Please see screenshots attached and let me know if you have any questions. If the above response helped, please "accept as answer" and "up-vote" the same! Thanks!

    59891-dataflow-pipeline.jpg
    59901-dataflow-param.jpg
    59862-dataflow-1.jpg
    59863-dataflow-2.jpg
    59864-dataflow-3.jpg
    59865-dataflow-4.jpg
    59856-dataflow-5.jpg

    3 people found this answer helpful.