How to ingest only files from a new folder in AWS S3 to ADLS

reddy 41 Reputation points
2020-08-25T04:23:49.003+00:00

I have an AWS S3 bucket with 4 folders in it. Each folder has subfolders, and each subfolder has files in it, organized in date folders. The folder structure is shown below. Every day a new folder named with that day's date is created (for example, the next day a folder named 2020-01-03 will be created under the subfolders, with files in it).
I want to first load all the data into my ADLS Gen2 folder with the same structure, and then, on a daily basis, load only the newly created date folder.

How can I implement this?

-- S3 bucket
   -- Main Folder 1 (level 1 folder)
      -- Sub Folder 1 (level 2 folder)
         -- 2020-01-01 (level 3 folder)
            -- file1.parquet
            -- file2.parquet
         -- 2020-01-02
            -- file1.parquet
            -- file2.parquet
      -- Sub Folder 2
         -- 2020-01-01
            -- file1.parquet
            -- file2.parquet
         -- 2020-01-02
            -- file1.parquet
            -- file2.parquet

The next day there will be a folder with the date 2020-01-03. I want to load only this new folder going forward.
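
To make the daily requirement concrete, here is a minimal Python sketch of the selection logic only; it is not an Azure Data Factory pipeline. The function name `keys_for_date` and the sample keys are hypothetical, and it assumes every object key follows the main folder / sub folder / date folder / file layout described above.

```python
from datetime import date


def keys_for_date(keys, run_date):
    """Return only the keys whose level 3 folder equals run_date (YYYY-MM-DD)."""
    wanted = run_date.isoformat()
    selected = []
    for key in keys:
        parts = key.split("/")
        # Assumed layout: main folder / sub folder / date folder / file
        if len(parts) == 4 and parts[2] == wanted:
            selected.append(key)
    return selected


# Hypothetical sample keys mirroring the folder structure in the question
keys = [
    "Main Folder 1/Sub Folder 1/2020-01-02/file1.parquet",
    "Main Folder 1/Sub Folder 1/2020-01-03/file1.parquet",
    "Main Folder 1/Sub Folder 2/2020-01-03/file2.parquet",
]

# Only the files under the 2020-01-03 date folders are selected
new_keys = keys_for_date(keys, date(2020, 1, 3))
```

The initial full load would copy every key, while each daily run would copy only what a filter like this returns for that day's date.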

Azure Data Factory