Comparison using greater function issue

pmscorca 1,052 Reputation points
2023-04-04T21:36:56.1533333+00:00

Hi, inside an ADF instance I've created a pipeline to get the latest csv file from an ADLS Gen2 folder named "raw". The raw folder contains these three files: filecsv_2023.03.22.csv, filecsv_2023.03.24.csv and filecsv_2023.03.25.csv; each file name contains the related date. I need to get the latest file, so I need to get filecsv_2023.03.25.csv. In my pipeline, I've put a Get Metadata activity to get the files inside the raw folder, followed by a For Each activity that compares @item().name with a string variable by using the greater logical function, in this way: @greater(item().name, variables('currfilename')). When the comparison is true, I set the currfilename variable to @item().name. I've debugged the pipeline and I haven't obtained the latest file; moreover, I've noticed 3 unexpected iterations of the For Each activity. I've also tried to extract the date with some replace actions and an int conversion, but without success. How could I solve this issue, please? Many thanks

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Subashri Vasudevan 11,226 Reputation points
    2023-04-05T13:38:34.5366667+00:00

    Hi @pmscorca, here is another implementation using greater. Expression in the If Condition activity:

    @greater(int(split(replace(replace(item().name,'.csv',''),'.',''),'_')[1]),int(variables('tmpfile')))
    
    

    tmpfile is a string variable with default value 20230101.

    If the If Condition succeeds, we run two Set Variable activities. The first Set Variable assigns the new, greater value to the tmpfile variable:

    @split(replace(replace(item().name,'.csv',''),'.',''),'_')[1]
    

    The second Set Variable assigns the full file name to another string variable, fullfilename:

    @item().name
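
    The loop above can be sketched in Python to check the logic outside ADF. This is only an illustration, not ADF code: the helper names date_key and latest_file are hypothetical, the file list is taken from the question, and the for loop stands in for a ForEach that runs sequentially.

```python
from typing import List, Optional


def date_key(name: str) -> int:
    """Mirror int(split(replace(replace(name,'.csv',''),'.',''),'_')[1])."""
    stripped = name.replace(".csv", "").replace(".", "")
    return int(stripped.split("_")[1])


def latest_file(names: List[str], default: str = "20230101") -> Optional[str]:
    tmpfile = default      # string variable with the default value from the answer
    fullfilename = None    # stays None if no file is newer than the default
    for name in names:     # one item at a time, like a sequential ForEach
        if date_key(name) > int(tmpfile):   # the If Condition
            tmpfile = str(date_key(name))   # first Set Variable
            fullfilename = name             # second Set Variable
    return fullfilename


files = [
    "filecsv_2023.03.22.csv",
    "filecsv_2023.03.25.csv",
    "filecsv_2023.03.24.csv",
]
print(latest_file(files))  # prints filecsv_2023.03.25.csv
```

    The result does not depend on the order of the items, but it does depend on each iteration seeing the tmpfile value written by the previous one.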
    
    

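    Make sure to run the ForEach sequentially! Variables are scoped to the pipeline, so parallel iterations would read and write tmpfile at the same time.

    Please let us know how it goes. Thank you!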


1 additional answer

  1. Subashri Vasudevan 11,226 Reputation points
    2023-04-05T04:42:41.3566667+00:00

    Hi @pmscorca, please try the steps below:

    1. Use a Get Metadata activity to get all the file names.
    2. Use a ForEach activity (with Sequential unchecked) and, inside it, an Append Variable activity that appends the following value to an array variable (allfilenames in my case):
    @int(split(replace(replace(item().name,'.csv',''),'.',''),'_')[1])
    
    

    The above expression builds an array variable holding just the numeric part of each file name, for example [20230225,20230226,20230322].

    Then use a Set Variable activity to find the max of the array and to put the dots (.) back into the date:

    @concat(concat(substring(string(max(variables('allfilenames'))),0,4),'.'),concat(substring(string(max(variables('allfilenames'))),4,2),'.'),substring(string(max(variables('allfilenames'))),6,2))
    

    The above expression gives you the max (latest) date in the same dotted form used in the file names, for example 2023.03.22 with my test files. Store it in a string variable (latestfile in the expression below).

    Then you can use a Filter activity with the below expression:

    @contains(item().name, variables('latestfile'))
    
    

    The Filter activity output will contain the file whose name includes the max date. You can read its name with:

    @activity('Filter1').output.Value[0]['name']
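
    The same steps can be traced in Python as a quick sanity check. This is a sketch under assumptions, not ADF code: the helper names date_number, dotted and pick_latest are hypothetical, and the file list is the one from the question rather than my test files.

```python
from typing import List


def date_number(name: str) -> int:
    """Mirror int(split(replace(replace(name,'.csv',''),'.',''),'_')[1])."""
    return int(name.replace(".csv", "").replace(".", "").split("_")[1])


def dotted(number: int) -> str:
    """Mirror the concat/substring expression: 20230325 -> '2023.03.25'."""
    s = str(number)
    # substring(s, start, length) in ADF corresponds to s[start:start + length]
    return s[0:4] + "." + s[4:6] + "." + s[6:8]


def pick_latest(names: List[str]) -> str:
    allfilenames = [date_number(n) for n in names]   # ForEach + Append Variable
    latestfile = dotted(max(allfilenames))           # Set Variable
    matches = [n for n in names if latestfile in n]  # Filter with contains()
    return matches[0]                                # output.Value[0]['name']


files = [
    "filecsv_2023.03.22.csv",
    "filecsv_2023.03.24.csv",
    "filecsv_2023.03.25.csv",
]
print(pick_latest(files))  # prints filecsv_2023.03.25.csv
```

    Because max is taken over the whole array at the end, the order in which the items are appended does not matter, which is why the ForEach can run in parallel here.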
    
    

    Please let us know if you could replicate it. Thanks, Suba

