Reading startDate and endDate as a string from a filename to copy data based on that string dynamically

Tinashe Chinyati 156 Reputation points
2020-12-02T16:11:54.173+00:00

Greetings everyone

I am new to data factory and need some assistance. I am copying files from an on-prem storage to ADLSgen2. I have created a copy pipeline which has the following sequence:

  1. Get metadata activity (getting the childItems)
  2. Foreach activity
  3. Inside foreach I linked set variable activity (which is splitting and extracting the date) and copy activity

The files have this format D_OGY_20200916_094812_00.CSV. I am interested in extracting the date and use it as start and end date inorder to copy only files that correspond to that date.

So I am creating a date partition in my sink based on the set variable activity output. The pipeline is working well but now i want it to backdate and load historical data. I have created a parameter startDate and endDate so i can load the string set in my variable dynamically. Is it possible, if so your input will be greatly appreciated.44454-adf-forum1.png

44474-get-metadata.png
44445-variable1.png44482-copy22.png

I want to use a tumbling window trigger so i can be able to pass the startDate and endDate to load the data. Thanks in advance!

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
0 comments No comments
{count} votes

Answer accepted by question author
  1. HimanshuSinha 19,547 Reputation points Microsoft Employee Moderator
    2020-12-04T02:16:18.03+00:00

    Thanks for clarifying the ask .
    Look the date is substring which you are reading from the file name and so you cannot use that way .
    I did work on this and introducing an if activity will help you validate if the file name ( i mean the date part is in the range )
    is to be copied or not . I used the below expression in the IF activity

    @ANDO (lessOrEquals(int(variables('name')),int(pipeline().parameters.Enddate)),greaterOrEquals(int(variables('name')),int(pipeline().parameters.Startdate)))

    variables('name') : captures the file name eg "20200916"( you already have the expression for that )

    Inside the IF activity , you should place the COPY activity in "True" condition .

    Though the logic is simple , I am sharing the animation .
    Also you will have to convert the Startdate and EndDate paramters to yyyyMMdd format by using @formatDateTime(pipeline().parameters.startdate,'yyyyMMdd')
    I have hardcoded that in the solution to have a quick test .

    Note : I am not adding the copy activity as I think you have already figured out that .

    Thanks Himanshu
    Please do consider to click on "Accept Answer" and "Upvote" on the post that helps you, as it can be beneficial to other community members.
    45062-copyfile101.gif

    1 person found this answer helpful.

1 additional answer

Sort by: Most helpful
  1. HimanshuSinha 19,547 Reputation points Microsoft Employee Moderator
    2020-12-03T00:57:11.137+00:00

    Hello @Tinashe Chinyati ,

    Thanks for the post .

    Couple of things , you mentioned "The files have this format D_OGY_20200916_094812_00.CSV. I am interested in extracting the date and use it as start and end date inorder to copy only files that correspond to that date." , can you please elaborate this ?

    Also you have added and the start and end date as parameter in the dataset , at this point I am not sure as to how are you using the parameter .

    Anyways as it turns out I dont think this will work for you , the reason being the copy activity is inside the for each loop ( FE ) loop which is taken the input from the Getmetadata activity . If i where you I think I could have focused on the getdata activitvy to get all the files to be copied .

    Yes TW trigger shoulds help .

    Let me know if you have questions .

    Thanks
    Himanshu


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.