How can we get the latest subfolder from the main folder using ADF?

2024-03-22T09:01:55.3433333+00:00

How can we get the latest subfolder from a main folder using ADF?

Example: folder --> folder1, folder2

In the above example, I need the latest subfolder.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

4 answers

  1. Vinodh247-1375 11,396 Reputation points
    2024-03-23T11:10:13.2266667+00:00

    Hi Bommisetty, Rakesh (Hudson IT Consultant),

    Thanks for reaching out to Microsoft Q&A.

    Yes, you can do this using the timestamp in the folder name. To avoid duplicating the answer, I am linking to an existing implementation with steps:

    https://stackoverflow.com/questions/70055595/using-adf-get-the-latest-folder-based-on-timestamp-in-folder-name

    Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.

    1 person found this answer helpful.

  2. Suba Balaji 11,186 Reputation points
    2024-03-23T11:26:31.37+00:00

    Hi Rakesh,

    You can use a Get Metadata activity with a ForEach loop to find the latest subfolder.

    Sample Design:

    Screenshot 2024-03-23 at 4.35.46 PM

    1. The first Get Metadata activity returns all the subfolder and file names under the main folder.
    2. The Filter activity keeps only the subfolders and ignores the files.
    3. The ForEach activity loops through the subfolders sequentially and reads each one's last modified timestamp using Get Metadata2.
      1. The If Condition inside the ForEach compares the current folder's timestamp with a variable that is initialized to an older timestamp (in my case, a string variable holding a timestamp from 2022). Expression:
              @greater(activity('Get Metadata2').output.lastModified,variables('latestts'))
              
        
      2. Inside the If Condition, I have two Set Variable activities:
        1. If the current timestamp is greater than the timestamp in the variable, assign the greater timestamp to the variable "latestts".
        2. In a second Set Variable activity, assign the current folder name to the variable "foldername".
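
    The ForEach part of this design can be sketched as a pipeline JSON fragment. This is only a rough sketch under assumptions: "ForEach1", "If Condition1", "Filter1", the dataset "SubfolderDataset" and its "folderName" parameter are hypothetical names, while "Get Metadata2", "latestts" and "foldername" come from the steps above. The variable "latestts" is assumed to be a string initialized to an old timestamp in the same format that "lastModified" returns.

    ```json
    {
      "name": "ForEach1",
      "description": "Hypothetical names: ForEach1, Filter1, If Condition1, SubfolderDataset, folderName.",
      "type": "ForEach",
      "typeProperties": {
        "isSequential": true,
        "items": {
          "value": "@activity('Filter1').output.value",
          "type": "Expression"
        },
        "activities": [
          {
            "name": "Get Metadata2",
            "description": "Reads lastModified for the current subfolder via a parameterized dataset (assumed).",
            "type": "GetMetadata",
            "typeProperties": {
              "dataset": {
                "referenceName": "SubfolderDataset",
                "type": "DatasetReference",
                "parameters": {
                  "folderName": { "value": "@item().name", "type": "Expression" }
                }
              },
              "fieldList": ["lastModified"]
            }
          },
          {
            "name": "If Condition1",
            "type": "IfCondition",
            "dependsOn": [
              { "activity": "Get Metadata2", "dependencyConditions": ["Succeeded"] }
            ],
            "typeProperties": {
              "expression": {
                "value": "@greater(activity('Get Metadata2').output.lastModified, variables('latestts'))",
                "type": "Expression"
              },
              "ifTrueActivities": [
                {
                  "name": "Set latestts",
                  "type": "SetVariable",
                  "typeProperties": {
                    "variableName": "latestts",
                    "value": {
                      "value": "@activity('Get Metadata2').output.lastModified",
                      "type": "Expression"
                    }
                  }
                },
                {
                  "name": "Set foldername",
                  "type": "SetVariable",
                  "typeProperties": {
                    "variableName": "foldername",
                    "value": { "value": "@item().name", "type": "Expression" }
                  }
                }
              ]
            }
          }
        ]
      }
    }
    ```

    The loop is marked sequential because every iteration reads and may overwrite the same two variables; running iterations in parallel would make the comparison unreliable.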

    My Variables:

    Screenshot 2024-03-23 at 4.54.18 PM

    The final Set Variable activity just assigns foldername so that you can check the resulting folder name. Hope it helps.

    Please let me know if you need further help on this.


  3. 2024-03-25T04:56:12.61+00:00

    How can I read the latest timestamp in an S3 bucket, using the same example as above or the example below?

    folder1-->1234543(folder1),675432(folder2),765437(folder3).

    In the above example, how do I get the latest folder?


  4. Pinaki Ghatak 2,400 Reputation points Microsoft Employee
    2024-05-01T08:21:49.57+00:00

    Hello @Bommisetty, Rakesh

    To get the latest subfolder from a main folder using Azure Data Factory, you can use the Get Metadata activity. In the Get Metadata activity, you can specify the metadata type as "childItems" to get a list of subfolders and files in the given folder.

    Note that "childItems" returns only the name and type of each item, not its last modified datetime, so the "lastModified" metadata type has to be requested separately for each subfolder.

    Finally, you can loop over the subfolders with a ForEach activity, compare their last modified datetimes, and keep the name of the latest subfolder in a variable.

    Here is an example of how you can achieve this:

    1. Create a pipeline in Azure Data Factory.
    2. Add a Get Metadata activity to the pipeline and point its dataset at the main folder.
    3. In the field list of the Get Metadata activity, specify "childItems".
    4. Add a Filter activity to the pipeline and connect it to the Get Metadata activity.
    5. In the Filter activity, specify a condition that keeps only the subfolders. For example, you can use the expression @equals(item().type, 'Folder') to drop the files.
    6. Add a ForEach activity over the Filter output, for example @activity('Filter').output.value, and set it to run sequentially.
    7. Inside the ForEach activity, add a second Get Metadata activity that requests "lastModified" for the current subfolder, followed by an If Condition activity that compares this value with a variable holding the latest datetime seen so far. When the current value is greater, use Set Variable activities to store the new datetime and the current subfolder name (@item().name).
    8. Use the stored subfolder name in subsequent activities in the pipeline as needed.
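
    As a rough sketch of the Get Metadata and Filter steps, the pipeline JSON could look like the fragment below. The activity name "Get Metadata1" and the dataset name "MainFolderDataset" are hypothetical placeholders. Because each entry in "childItems" carries only a name and a type, this sketch filters on the type and leaves the timestamp lookup to a separate per-folder Get Metadata call.

    ```json
    {
      "activities": [
        {
          "name": "Get Metadata1",
          "description": "Lists the children of the main folder; MainFolderDataset is a hypothetical dataset name.",
          "type": "GetMetadata",
          "typeProperties": {
            "dataset": {
              "referenceName": "MainFolderDataset",
              "type": "DatasetReference"
            },
            "fieldList": ["childItems"]
          }
        },
        {
          "name": "Filter",
          "description": "Keeps only child items whose type is Folder.",
          "type": "Filter",
          "dependsOn": [
            { "activity": "Get Metadata1", "dependencyConditions": ["Succeeded"] }
          ],
          "typeProperties": {
            "items": {
              "value": "@activity('Get Metadata1').output.childItems",
              "type": "Expression"
            },
            "condition": {
              "value": "@equals(item().type, 'Folder')",
              "type": "Expression"
            }
          }
        }
      ]
    }
    ```

    The filtered list is then available to later activities as @activity('Filter').output.value.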

    I hope this helps! Let me know if you have any further questions.
