Creating a dependency pipeline to check that files are the latest in ADF

amikm 11 Reputation points
2021-07-26T16:59:23.597+00:00

I am trying to create a dependency pipeline for files. Before executing my model refresh (web activity), I want to make sure all the related files are present in their respective folders and that all files are the latest.

Suppose my model refresh uses the following files present in ADLS:

  1. myadls/raw/master/file1.csv
  2. myadls/raw/dim/file2.csv
  3. myadls/raw/dim/file3.csv
  4. myadls/master/reporting/file4.csv

We need to compare each file's last modified date with today's date. If both are equal, the file is the latest. If any of the files is not the latest, I need an email with the name of the file that is not the latest, and I shouldn't trigger my web activity, which usually does the model refresh.

I have created this pipeline using Get Metadata, ForEach, If Condition, Web, and Set Variable activities. The problem is that I am not able to get an email naming the file that is not the latest. Can anyone help me redesign the pipeline below so that I get an email for the file that is not the latest?

Please find my current design for the dependency pipeline. In the last If Condition, the true branch runs a web activity that does the model refresh, and the false branch runs another web activity that says one of the files is not the latest, so the model refresh can't be done. But it is not able to tell which file it is.

[Image: 117957-adf.png, current pipeline design]


2 answers

  1. HimanshuSinha-msft 19,486 Reputation points Microsoft Employee Moderator
    2021-07-27T00:20:28.447+00:00

    Hello @amikm,
    Thanks for the question and for using the Microsoft Q&A platform.

    I think the ask is how to send an email from ADF. ADF by itself does not have an activity that sends email. You can implement this using Logic Apps.

    https://www.mssqltips.com/sqlservertip/5718/azure-data-factory-pipeline-email-notification-part-1/
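As a rough sketch of what the web activity could post to a Logic App HTTP trigger, the Python below builds a JSON body carrying the stale file names. The function name `build_email_payload`, the field names `to`, `subject`, and `body`, and the recipient address are all hypothetical; the real schema is whatever the Logic App trigger is configured to accept.

```python
import json
from typing import List


def build_email_payload(stale_files: List[str], recipient: str) -> str:
    """Build a JSON body for a (hypothetical) Logic App HTTP trigger.

    The keys "to", "subject" and "body" are assumptions for illustration;
    they must match the schema defined on the Logic App side.
    """
    payload = {
        "to": recipient,
        "subject": "Model refresh skipped: %d file(s) not updated today"
        % len(stale_files),
        "body": "The following files are not the latest:\n"
        + "\n".join(stale_files),
    }
    return json.dumps(payload)


# Example with one of the file paths from the question.
example_payload = build_email_payload(
    ["myadls/raw/dim/file2.csv"], "someone@example.com"
)
```

In the pipeline itself, the same body would be assembled with an expression in the web activity rather than in Python.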

    One other, not so clean, way is to fail the pipeline itself, which should trigger a pipeline failure email. I think you can use a web activity with a non-existent URL, which will make the activity fail.

    Please do let me know how it goes.

    Thanks
    Himanshu


  2. Nandan Hegde 36,146 Reputation points MVP Volunteer Moderator
    2021-07-28T06:30:18.623+00:00

    Hey,
    As per your current architecture, you can create one variable per ForEach activity to store the file names.
    Within each ForEach activity, if a file is not the latest, use an Append Variable activity
    to save its name.
    Then, in the final validation, you can concat the variables from all the ForEach loops to get the final list of files that were not modified.

    But ideally I would suggest the below approach:

    1. Have the list of files created as a Lookup activity output.
    2. Provide that list to a single ForEach activity set to sequential execution.
    3. Within the ForEach, use a Get Metadata activity and an If Condition activity to check whether each file is the latest.
      If it is not, append the file name using an Append Variable activity.
    4. Once out of the ForEach, use an If Condition to check whether the file name variable is blank or has values.
      If it has values, send an email; the variable holds the names of all the files that were not updated. If it is blank, run the model refresh.
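The steps above can be sketched in Python to make the control flow concrete. This is only an illustration of the logic, not ADF code: the `files` dictionary stands in for the Lookup and Get Metadata outputs, the timestamps are made-up sample values, and the last modified values are assumed to be ISO 8601 UTC strings.

```python
from datetime import date, datetime, timezone
from typing import Dict, List


def is_latest(last_modified: str, today: date) -> bool:
    """Return True if the file was last modified on `today` (UTC)."""
    # Assumption: timestamps look like "2021-07-26T05:10:00Z".
    ts = datetime.fromisoformat(last_modified.replace("Z", "+00:00"))
    return ts.astimezone(timezone.utc).date() == today


def find_stale_files(files: Dict[str, str], today: date) -> List[str]:
    """Loop over the files one by one and collect those not modified today."""
    stale = []  # plays the role of the Append Variable array
    for path, last_modified in files.items():
        if not is_latest(last_modified, today):
            stale.append(path)
    return stale


# Made-up sample data: path -> last modified timestamp.
files = {
    "myadls/raw/master/file1.csv": "2021-07-26T05:10:00Z",
    "myadls/raw/dim/file2.csv": "2021-07-25T23:59:59Z",
    "myadls/raw/dim/file3.csv": "2021-07-26T00:00:00Z",
    "myadls/master/reporting/file4.csv": "2021-07-24T12:00:00Z",
}
today = date(2021, 7, 26)

stale = find_stale_files(files, today)
# Final If Condition: refresh only when nothing is stale, otherwise email.
refresh_allowed = not stale
```

With this sample data, `stale` holds file2.csv and file4.csv, so the refresh is skipped and those two names are what the email would list.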
