ADF COpy Activity is failing to unzip .zip file

Amey Kale 0 Reputation points
2023-08-04T11:09:49.2633333+00:00

Hi,

I am using ADF's copy activity to un-zip file in Azure blob storage. I have uploaded a valid zip file into blob storage and using copy activity to copy and unzip this file into new container. I think I have used all the right settings:

  1. Activity: Copy
  2. Source data set: Binary, Compression type: ZipDeflate(.zip), Compression level: Optimal
  3. Sink dataset: Binary, Compression type: None, Copy behavior: Preserve hierarchy

When i run my pipeline with above settings, I get following error:

Error message: ErrorCode=UserErrorUnzipInvalidFile,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The file 'Prod_20230730230042.zip' is not a valid Zip file with Deflate compression method.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.IO.InvalidDataException,Message=A local file header is corrupt.,Source=System.IO.Compression,'

Activity Id: 1937352d-b6aa-42e0-85d3-c46d45342847

Pipeline run Id: 1937352d-b6aa-42e0-85d3-c46d45342847

Integration runtime: AutoResolveIntegrationRuntime

Region: West US 2

I have validated that the zip file is correct and I am able to unzip the file on my Windows machine and also with My Python code.

I have read somewhere that ADF only supports deflate compression algorithm. By default zip uses deflate compression algorithm but there are others as well. Can you let me know if there is a way to identify the compression algorithm used for my file and reason why my activity could be failing?

Thanks Amey

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,627 questions
{count} votes

2 answers

Sort by: Most helpful
  1. QuantumCache 20,366 Reputation points Moderator
    2023-08-04T19:37:21.1966667+00:00

    Hello @Amey Kale Welcome to QnA forum,

    If you have confirmed that your zip file is compressed using the Deflate algorithm, then the issue could be with the file itself. The error message mentions that a local file header is corrupt, which could indicate that the file is corrupted or incomplete.

    I would try to create another Zip file manually and try the pipeline once again

    You may want to try re-uploading the file to Azure Blob Storage and then running the pipeline again?
    Meanwhile I will verify from my side.
    Supported file formats and compression codecs by copy activity in Azure Data Factory and Azure Synapse pipelinesUser's image


  2. Amey Kale 0 Reputation points
    2023-08-22T10:39:24.35+00:00

    Hi,

    This file is created by the third part tool and hence I don't have a control over it. I created my own zip file, and I am able to un-zip the file successfully with ADF code. This means my ADF code and the setting is correct. I am able to unzip this problematic file with all the other softwares like windows zip, 7 zip, Python and C# libraries. So there seems to be some issue internally with the ADF implementation because of which ADF is not able to unzip this problematic file.

    Thanks

    Amey Kale

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.