How can I read CSV files in nested folders and copy them while preserving the hierarchy?

Jay Remally 0 Reputation points
2024-05-24T16:33:37.1766667+00:00

I have csv.gz files that are partitioned this way:

/2024/01/01/xyz/x.csv

/2024/01/01/yza/y.csv

/2024/01/01/zab/z.csv

There are files for several years, and I want to copy all of them using ADF while maintaining the folder structure and hierarchy. Once the historical load is done, I need to run the pipeline daily to copy only the new files, without overwriting existing ones (new files could also be added for the last year).

I have tried different approaches but couldn't quite figure it out.

I would appreciate any suggestions from anyone who has done something like this already or knows a proper way to solve it.

Thanks

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,394 questions
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,875 questions

1 answer

  1. Vinodh247-1375 12,056 Reputation points
    2024-05-26T07:21:29.5433333+00:00

    Hi Jay Remally,

    Thanks for reaching out to Microsoft Q&A.

    To copy CSV files from nested folders in ADF while preserving the hierarchy, you can try the steps below:

    1. Source Dataset Configuration:
      • Create a source dataset that points to your root folder containing the nested subfolders.
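      • In the Copy activity's source settings, choose the wildcard file path option and set the wildcard file name to "*.csv" to read only CSV files (for the compressed files in the question, "*.csv.gz").
      • Set the recursive property to "true" to include files from all subfolders.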
    2. Copy Activity Configuration:
      • Create a Copy Data activity in your ADF pipeline.
      • Use the source dataset created in the step above.
      • Configure the sink dataset to write the data to the desired destination (e.g., Blob Storage, a SQL database, or a data lake).
      • For a file-based sink, set the copy behavior to "PreserveHierarchy" so that each file keeps its path relative to the source folder. This preserves the nested subfolder structure.
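
The steps above can be sketched as a Copy activity definition. The snippet below is only an illustration, not a tested pipeline: it builds the activity JSON as a Python dict, and it assumes Binary datasets on ADLS Gen2 (so the csv.gz files are copied as-is), hypothetical dataset names `SourceFiles` and `SinkFiles`, and a hypothetical helper `build_copy_activity`. The optional `modified_after` argument maps to the source's last-modified filter, which is one possible way to pick up only recent files in the daily run; adjust the property values to your own connector.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_copy_activity(
    source_dataset: str,
    sink_dataset: str,
    modified_after: Optional[datetime] = None,
) -> dict:
    """Build a Copy activity definition as a dict (hypothetical helper)."""
    source_store = {
        # Assumption: the source is ADLS Gen2; other stores use other setting types.
        "type": "AzureBlobFSReadSettings",
        "recursive": True,               # walk every nested subfolder
        "wildcardFileName": "*.csv.gz",  # only the compressed CSV files
    }
    if modified_after is not None:
        # Only files modified at or after this UTC timestamp are selected.
        source_store["modifiedDatetimeStart"] = modified_after.astimezone(
            timezone.utc
        ).strftime("%Y-%m-%dT%H:%M:%SZ")

    sink_store = {
        "type": "AzureBlobFSWriteSettings",
        # Keep each file's path relative to the source folder.
        "copyBehavior": "PreserveHierarchy",
    }

    return {
        "name": "CopyNestedCsvFiles",
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "BinarySource", "storeSettings": source_store},
            "sink": {"type": "BinarySink", "storeSettings": sink_store},
        },
    }


# Historical load: no date filter, every matching file is copied.
full_load = build_copy_activity("SourceFiles", "SinkFiles")

# Daily run: only files modified since the given cutoff (example value).
cutoff = datetime(2024, 5, 23, tzinfo=timezone.utc)
daily_load = build_copy_activity("SourceFiles", "SinkFiles", modified_after=cutoff)

print(json.dumps(daily_load, indent=2))
```

Filtering on the last-modified time means a file added late under an older year folder is still picked up, since the filter looks at the file's timestamp rather than its folder name. In a real pipeline the cutoff would normally come from a trigger or pipeline parameter instead of a hard-coded date.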

    https://stackoverflow.com/questions/67757798/copy-all-files-from-sub-folders-move-the-same-structure-to-archive-folder-and-d

    Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.
