How to ingest only files from a new folder in AWS S3 to ADLS
I have an AWS S3 bucket with 4 folders in it. Each folder has subfolders, and each subfolder has files in it. The folder structure is shown below. Each day, a new folder named with that day's date is created under each subfolder, with files in it (for example, the next day a folder named 2020-01-03 will be created).
I want to first load all the data into my ADLS Gen2 folder with the same structure, and then on a daily basis load only the newly created date folder.
How can I implement this?
-- S3 bucket
   -- Main Folder 1 (level 1 folder)
      -- Sub Folder 1 (level 2 folder)
         -- 2020-01-01 (level 3 folder)
            -- file1.parquet
            -- file2.parquet
         -- 2020-01-02
            -- file1.parquet
            -- file2.parquet
      -- Sub Folder 2
         -- 2020-01-01
            -- file1.parquet
            -- file2.parquet
         -- 2020-01-02
            -- file1.parquet
            -- file2.parquet
The next day there will be a folder with the date 2020-01-03. Going forward, I want to load only this new folder.
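
To make the requirement concrete, here is a rough sketch of the selection logic I have in mind for the daily run. It is plain Python with no AWS or Azure calls; the function name keys_for_date and the sample keys are hypothetical, and it assumes every object key follows the four-part layout main folder/sub folder/date folder/file shown above.

```python
from datetime import date


def keys_for_date(keys, run_date):
    """Return only the keys whose level 3 (date) folder matches run_date."""
    wanted = run_date.isoformat()  # e.g. "2020-01-03"
    selected = []
    for key in keys:
        parts = key.split("/")
        # Assumed layout: main_folder/sub_folder/date_folder/file
        if len(parts) == 4 and parts[2] == wanted:
            selected.append(key)
    return selected


# Hypothetical listing of the bucket after the 2020-01-03 folders arrive.
sample_keys = [
    "Main Folder 1/Sub Folder 1/2020-01-02/file1.parquet",
    "Main Folder 1/Sub Folder 1/2020-01-03/file1.parquet",
    "Main Folder 1/Sub Folder 2/2020-01-02/file2.parquet",
    "Main Folder 1/Sub Folder 2/2020-01-03/file2.parquet",
]

# Only the keys under the 2020-01-03 folders should be picked for the daily load.
new_keys = keys_for_date(sample_keys, date(2020, 1, 3))
```

The initial full load would copy every key as-is, and each daily run would copy only the keys returned for that day's date to the same relative path in ADLS Gen2.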