How do I extract only specific files from a compressed file using ADF?

Alfi Khair 20 Reputation points
2023-09-06T07:37:24.6033333+00:00

I am working on a File Extraction pipeline.

I have one compressed file (CompressedA) stored in a container; it contains 5 separate files (File1-File5). If a user requires only File1, File2 and File5, how do I extract just those files without rerunning the pipeline multiple times?

The flow is as follows:

Container1: CompressedA >> decompressed to Container2: File1-File5 >> extract specific files to UserContainer.

Azure Data Lake Storage
Azure Data Factory

Accepted answer
  1. Subashri Vasudevan 11,231 Reputation points
    2023-09-06T13:41:57.4966667+00:00

    Hi Alfi Khair

    After decompressing the file, you can extract specific files by combining a Get Metadata activity, a ForEach loop, an If Condition, and a Copy activity.

    1. Use an array variable (called filenames) to hold the list of files that need to be copied to UserContainer, for example ["File1.txt","File2.txt","File5.txt"]. Alternatively, you can keep this list in a lookup file or table; the choice is yours. Here, let's use an array variable in the pipeline.
    2. Use a Get Metadata activity, requesting the Child items field, to get the list of decompressed files.
    3. Use a ForEach loop to iterate over the decompressed files.
      1. Inside the ForEach, use an If Condition to check whether item().name is in the array variable, using the expression @contains(variables('filenames'),item().name). This condition checks whether the current file name in the loop matches any value in the array. In the True branch, add a Copy activity that copies the file from the source to UserContainer. Leave the False branch empty.
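
    The steps above could be sketched as the pipeline JSON below. This is a minimal sketch rather than a tested definition: the pipeline name, the activity names, and the three datasets (Container2Folder, Container2File, UserContainerFile) are hypothetical, and it assumes the two file datasets are Binary datasets that expose a fileName parameter. Adjust the source and sink types to match your own datasets.

    ```json
    {
      "name": "CopySelectedFiles",
      "properties": {
        "description": "Sketch only. Dataset and activity names are hypothetical.",
        "variables": {
          "filenames": {
            "type": "Array",
            "defaultValue": ["File1.txt", "File2.txt", "File5.txt"]
          }
        },
        "activities": [
          {
            "name": "GetDecompressedFiles",
            "description": "Lists the files in Container2 (assumed folder dataset).",
            "type": "GetMetadata",
            "typeProperties": {
              "dataset": { "referenceName": "Container2Folder", "type": "DatasetReference" },
              "fieldList": ["childItems"]
            }
          },
          {
            "name": "ForEachFile",
            "type": "ForEach",
            "dependsOn": [
              { "activity": "GetDecompressedFiles", "dependencyConditions": ["Succeeded"] }
            ],
            "typeProperties": {
              "items": {
                "value": "@activity('GetDecompressedFiles').output.childItems",
                "type": "Expression"
              },
              "activities": [
                {
                  "name": "IfFileRequested",
                  "type": "IfCondition",
                  "typeProperties": {
                    "expression": {
                      "value": "@contains(variables('filenames'), item().name)",
                      "type": "Expression"
                    },
                    "ifTrueActivities": [
                      {
                        "name": "CopyToUserContainer",
                        "description": "Assumes Binary datasets with a fileName parameter.",
                        "type": "Copy",
                        "inputs": [
                          {
                            "referenceName": "Container2File",
                            "type": "DatasetReference",
                            "parameters": {
                              "fileName": { "value": "@item().name", "type": "Expression" }
                            }
                          }
                        ],
                        "outputs": [
                          {
                            "referenceName": "UserContainerFile",
                            "type": "DatasetReference",
                            "parameters": {
                              "fileName": { "value": "@item().name", "type": "Expression" }
                            }
                          }
                        ],
                        "typeProperties": {
                          "source": { "type": "BinarySource" },
                          "sink": { "type": "BinarySink" }
                        }
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
    }
    ```

    Because the filter runs inside a single ForEach, one pipeline run copies every requested file; changing the selection only means changing the filenames array (or the lookup source, if you use one).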
    Hope it helps. Please feel free to write back if you have any further questions.