How to leverage CheckMD5 to ensure today's file looks different from yesterday's file

Question

I am working in Azure Synapse Analytics and want to ensure in my pipeline that today's file is different from yesterday's file. My naming convention for files is filename1YYYYMMDD.csv. I want to use MD5 to check that the two files are not the same, but I am having trouble comparing across directories and dynamically locating yesterday's file. Today's file will be in the 'Incoming' directory, whereas yesterday's file will be in the 'Archive' directory, which breaks down into subdirectories by Year, Month, and Day of data processing. My pipeline has a 'Get Metadata' activity on the 'Incoming' directory and a 'Get Metadata' activity on the 'Archive' directory, followed by an If activity to compare the two. Another concern is that I want to compare filename1 with filename1 and filename22 with filename22 (that is, excluding the YYYYMMDD appended to the file names). Any suggestions on my approach?

Answer

Hello @Moore, Payton E,

Thanks for the question and for using the Microsoft Q&A platform.
The md5 function is not supported in pipeline expressions, but it is supported in mapping data flows. Since you have a .csv file, you can apply it at the column level. Please read about the same here
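To illustrate what a column-level (row-by-row) MD5 comparison amounts to, here is a minimal Python sketch. It is not data flow code and not part of the original answer: the function names `row_hashes` and `files_differ` are hypothetical, and it assumes both files fit in memory as text. It hashes each non-empty row and compares the resulting sets, so row order and duplicate rows are ignored.

```python
import hashlib


def row_hashes(csv_text: str) -> set[str]:
    """Return the set of MD5 hex digests, one per non-empty row."""
    return {
        hashlib.md5(line.encode("utf-8")).hexdigest()
        for line in csv_text.splitlines()
        if line.strip()
    }


def files_differ(today_text: str, yesterday_text: str) -> bool:
    """True when the two files do not contain the same set of rows."""
    return row_hashes(today_text) != row_hashes(yesterday_text)
```

In a mapping data flow, the equivalent would be to derive a hash column on each source and compare the two streams; the sketch above only shows the logic of that comparison.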

On the second ask, there are many ways to do this, but I like the dynamic expression below. The logic is pretty straightforward: read the total length of the filename, subtract 12 (the length of "YYYYMMDD.csv") from it, and take that many characters from the start of the name.

@substring(pipeline().parameters.parameter1,0,sub(length(pipeline().parameters.parameter1),12))

Note: while testing, I passed the filename as a parameter.
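For anyone who wants to check the substring logic outside the pipeline, the same calculation can be sketched in Python. The function name `base_name` and the sample file names are made up for illustration; the only assumption carried over from the thread is that every name ends in the 12-character suffix "YYYYMMDD.csv".

```python
def base_name(file_name: str, suffix_len: int = 12) -> str:
    """Strip the trailing 'YYYYMMDD.csv' (12 characters) from a file name."""
    if len(file_name) <= suffix_len:
        # Guard against names too short to contain the date and extension.
        raise ValueError(f"file name too short: {file_name!r}")
    return file_name[: len(file_name) - suffix_len]


# Hypothetical file names following the filename1YYYYMMDD.csv convention.
print(base_name("filename120240115.csv"))   # filename1
print(base_name("filename2220240114.csv"))  # filename22
```

Two files belong to the same feed when their base names are equal, regardless of the date in the name.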

Please do let me know how it goes.
Thanks
Himanshu
