How to extract files from .tar.xz files stored in ADLS Gen 2 using Azure Data Factory?

Shubham Pawade 46 Reputation points
2022-06-02T10:00:26.237+00:00

Is it possible to extract files from a .tar.xz archive using Azure Data Factory? The archive will be stored in Azure Data Lake Storage Gen2.

The dataset type is Binary. Screenshots are attached below.

207813-temps.png

As you can see, there is a .tar.gz option available but no .tar.xz option. Is there any way or workaround to extract a .tar.xz archive? Any help would be appreciated.

I tried the .tar.gz option, but it fails with an error, as expected.

207862-image-2022-06-02t09-53-23-408z.png

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Samy Abdul 3,376 Reputation points
    2022-06-02T12:16:03.99+00:00

    Hi @Shubham Pawade, I am not sure whether Data Factory supports decompression of .tar.xz files. However, you can use the Azure Function activity in a pipeline and have an Azure Function perform the extraction.
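    As a rough illustration of the extraction step such a function could perform, here is a minimal Python sketch that uses only the standard library `tarfile` module. The helper name `extract_tar_xz` is hypothetical, and the sketch assumes the archive has already been read into memory as bytes. Downloading from and uploading to ADLS Gen2, and the Azure Functions trigger itself, are deliberately left out. It is not taken from the linked sample.

```python
import io
import tarfile
from typing import Dict


def extract_tar_xz(archive_bytes: bytes) -> Dict[str, bytes]:
    """Return {member path: contents} for every regular file in a .tar.xz archive.

    Hypothetical helper: the caller is assumed to supply the raw archive bytes
    (for example, after downloading the blob) and to handle writing the results.
    """
    extracted: Dict[str, bytes] = {}
    # mode "r:xz" tells tarfile to read an LZMA (xz) compressed tar stream.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:xz") as archive:
        for member in archive.getmembers():
            # Skip directories, symlinks, and other non-regular entries.
            if not member.isfile():
                continue
            with archive.extractfile(member) as handle:
                extracted[member.name] = handle.read()
    return extracted
```

    Because everything is held in memory, this approach only suits archives that fit comfortably within the function's memory limit. Member names come from the archive itself, so they should be validated before being used as output paths.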

    https://github.com/Azure/Azure-DataFactory/tree/main/SamplesV2/UntarAzureFilesWithAzureFunction

    https://learn.microsoft.com/en-us/azure/data-factory/control-flow-azure-function-activity

    Thanks

