Comparison using greater function issue

pmscorca 812 Reputation points
2023-04-04T21:36:56.1533333+00:00

Hi, inside an ADF instance I've created a pipeline to get the latest CSV file from an ADLS Gen2 folder named "raw". The raw folder contains these three files: filecsv_2023.03.22.csv, filecsv_2023.03.24.csv and filecsv_2023.03.25.csv; each file name contains the related date. I need to get the latest file, that is, filecsv_2023.03.25.csv. In my pipeline, I've put a Get Metadata activity to list the files inside the raw folder, followed by a For Each activity that compares @item().name with a string variable by using the greater logical function, in this way: @greater(item().name, variables('currfilename')). When the comparison is true, I set the currfilename variable to @item().name. I've debugged the pipeline and I haven't obtained the latest file; moreover, I've noticed 3 unexpected iterations of the For Each activity. I've also tried to extract the date with some replace calls and an int conversion, but without success. How could I solve this issue, please? Many thanks

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Suba Balaji 11,186 Reputation points
    2023-04-05T13:38:34.5366667+00:00

    Hi @pmscorca, here is another implementation using greater. Expression in the If Condition activity:

    @greater(int(split(replace(replace(item().name,'.csv',''),'.',''),'_')[1]),int(variables('tmpfile')))
    
    

    tmpfile is a string variable with default value 20230101. If the If condition succeeds, we run two Set Variable activities. The first Set Variable assigns the new, greater value to the tmpfile variable:

    @split(replace(replace(item().name,'.csv',''),'.',''),'_')[1]
    

    The second Set Variable assigns the file name to another string variable, fullfilename:

    @item().name
    
    

    Make sure to run the ForEach sequentially, since the variables are shared across iterations! Please let us know how it goes. Thank you!
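
For readers who want to check the logic outside ADF, here is a minimal Python sketch of the sequential loop described above. It is not ADF code: the helper names `date_key` and `latest_file` are hypothetical, the empty default for `fullfilename` is an assumption, and the loop only mirrors the If Condition plus the two Set Variable activities under those assumptions.

```python
def date_key(file_name: str) -> int:
    """Mirror int(split(replace(replace(name,'.csv',''),'.',''),'_')[1])."""
    stem = file_name.replace(".csv", "").replace(".", "")
    return int(stem.split("_")[1])


def latest_file(file_names, default_key="20230101"):
    """Sequential running-max loop, like a ForEach with Sequential checked."""
    tmpfile = default_key   # string variable holding the greatest date seen so far
    fullfilename = ""       # assumed empty default for the result variable
    for name in file_names:
        # If Condition: greater(int(<date part>), int(tmpfile))
        if date_key(name) > int(tmpfile):
            tmpfile = str(date_key(name))   # first Set Variable
            fullfilename = name             # second Set Variable
    return fullfilename


files = [
    "filecsv_2023.03.24.csv",
    "filecsv_2023.03.25.csv",
    "filecsv_2023.03.22.csv",
]
print(latest_file(files))
```

Each iteration reads the value of `tmpfile` written by an earlier iteration, which is why the order of execution matters in this approach.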


1 additional answer

  1. Suba Balaji 11,186 Reputation points
    2023-04-05T04:42:41.3566667+00:00

    Hi @pmscorca, please try the steps below:

    1. Use a Get Metadata activity to get all file names.
    2. Use a ForEach activity (uncheck Sequential) and, inside it, an Append Variable activity (on an array variable; in my case it is allfilenames) with this expression:
    @int(split(replace(replace(item().name,'.csv',''),'.',''),'_')[1])
    
    

    The above expression builds an array variable containing just the numeric part of each file name, for example [20230225,20230226,20230322]. Then use a Set Variable activity to find the max of the array and re-insert the dots (.) as they appear in the file name:

    @concat(
        concat(substring(string(max(variables('allfilenames'))),0,4),'.'),
        concat(substring(string(max(variables('allfilenames'))),4,2),'.'),
        substring(string(max(variables('allfilenames'))),6,2)
    )
    

    The above expression gives you the latest (max) date in the same dotted form used in the file names; in my example, 2023.03.22. Store it in a string variable (latestfile in my case). Then you can use a Filter activity over the file list with the expression below:

    @contains(item().name, variables('latestfile'))
    
    

    The Filter activity output then contains the file with the max date in its name. Its name can be read with:

    @activity('Filter1').output.Value[0]['name']
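
As a rough check of this second approach outside ADF, the steps can be sketched in Python. This is only an illustration under assumptions: the helper names `date_key`, `dotted` and `pick_latest` are hypothetical, and the list comprehension stands in for the Filter activity with `contains`.

```python
def date_key(file_name: str) -> int:
    """Mirror int(split(replace(replace(name,'.csv',''),'.',''),'_')[1])."""
    stem = file_name.replace(".csv", "").replace(".", "")
    return int(stem.split("_")[1])


def dotted(key: int) -> str:
    """Mirror the concat/substring expression: 20230325 -> '2023.03.25'."""
    s = str(key)
    # ADF substring(s, start, length) corresponds to s[start:start + length]
    return s[0:4] + "." + s[4:6] + "." + s[6:8]


def pick_latest(file_names):
    # Append Variable inside the ForEach: collect the numeric date parts
    allfilenames = [date_key(name) for name in file_names]
    # Set Variable: max of the array, with the dots put back
    latestfile = dotted(max(allfilenames))
    # Filter activity: contains(item().name, variables('latestfile'))
    matches = [name for name in file_names if latestfile in name]
    return matches[0]


files = [
    "filecsv_2023.03.22.csv",
    "filecsv_2023.03.24.csv",
    "filecsv_2023.03.25.csv",
]
print(pick_latest(files))
```

Because `max` does not depend on the order of the collected values, this variant does not rely on the iterations running in any particular order.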
    
    

    Please let us know if you could replicate it. Thanks, Suba