ErrorCode=UserErrorUnzipInvalidFile,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The file 'All Weekly 18.02.2024.zip' is not a valid Zip file with Deflate compression method.,Source=Microsoft.DataTransfer.ClientLibrary,''Type

Osadhis Nanda 0 Reputation points
2024-04-09T06:13:16.2066667+00:00

An ADF pipeline that uses a Copy activity to copy files from SFTP to ADLS fails when it unzips the file during the copy.

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. Amira Bedhiafi 15,676 Reputation points
    2024-04-09T14:52:26.1266667+00:00

    Since the question gives no further details, the most direct way to narrow this down is to manually download and unzip the file at each stage of its journey.

    This means first downloading the file from its SFTP source location and unzipping it, then repeating the same check on the copy after it has been transferred to ADLS.

    If the file fails to unzip or decompress at any point, it indicates that the file was not in the correct format when it reached that particular stage.

    If unzipping or decompressing is successful at both stages, but the issue arises with Data Factory, the problem may lie with Data Factory itself. If this is the case, please inform us.
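The manual check described above can also be scripted. The error message says the archive is "not a valid Zip file with Deflate compression method", so one thing worth inspecting on a downloaded copy is which compression method each entry uses. The following is a minimal sketch using Python's standard `zipfile` module; the function names `inspect_zip` and `non_deflate_entries` are hypothetical helpers for illustration, not part of ADF.

```python
import zipfile

# Human-readable names for the ZIP method codes the standard library defines.
METHOD_NAMES = {
    zipfile.ZIP_STORED: "Stored",
    zipfile.ZIP_DEFLATED: "Deflate",
    zipfile.ZIP_BZIP2: "BZip2",
    zipfile.ZIP_LZMA: "LZMA",
}


def inspect_zip(path):
    """Return (is_valid_zip, [(entry_name, method_name), ...]) for a local file."""
    if not zipfile.is_zipfile(path):
        return False, []
    with zipfile.ZipFile(path) as archive:
        # infolist() reads the central directory only, so entries that use a
        # method Python cannot decompress are still listed here.
        entries = [
            (info.filename,
             METHOD_NAMES.get(info.compress_type, f"method {info.compress_type}"))
            for info in archive.infolist()
        ]
    return True, entries


def non_deflate_entries(path):
    """List the entries whose compression method is anything other than Deflate."""
    _, entries = inspect_zip(path)
    return [(name, method) for name, method in entries if method != "Deflate"]
```

Running `non_deflate_entries` on the file downloaded from SFTP, and again on the copy in ADLS, shows whether any entry uses a method other than Deflate at either stage. An unnamed code such as `method 9` means the archive was written with a method outside the four listed in `METHOD_NAMES`.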


  2. KarishmaTiwari-MSFT 18,527 Reputation points Microsoft Employee
    2024-04-24T05:21:07.2066667+00:00

    ADF has certain limitations when handling large files. For files larger than 2 GB, consider the following:

    • ADF may split large files into smaller chunks during transfer. Ensure that the chunking process doesn’t interfere with the ZIP file’s integrity.
    • ADF processes data in memory. If the ZIP file is too large, it might exceed available memory during unzipping.
    • If using a self-hosted integration runtime, ensure it has sufficient resources (memory, CPU) to handle large files.
    • Large files take longer to transfer. Check if there’s any network latency causing timeouts during the transfer. Adjust timeout settings in ADF if necessary.

    Since manually downloading the file and unzipping it in ADLS works, consider one of the following workarounds.

    • Manually download the ZIP file from SFTP to a local machine. Unzip it locally. Upload the unzipped files directly to ADLS.
    • Automated Solution:
      • Create a custom script (e.g., PowerShell, Python) that performs the same steps as the manual process.
      • Use an ADF custom activity to execute this script after the initial SFTP-to-ADLS copy.
      • This way, you automate the process while ensuring successful unzipping.
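The custom script mentioned above could look roughly like the following. This is a minimal sketch, assuming the ZIP file has already been downloaded to local disk: it extracts the archive with Python's standard `zipfile` module and hands each extracted file to an `upload` callable. The names `extract_zip`, `unzip_and_upload`, and `upload` are hypothetical; in a real pipeline `upload` would wrap whatever storage client is used to write to ADLS, which is not shown here.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract every file entry under dest_dir and return the extracted paths."""
    dest = Path(dest_dir).resolve()
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            target = (dest / info.filename).resolve()
            # Refuse entries whose path would land outside the destination.
            if dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {info.filename}")
            archive.extract(info, dest)
            extracted.append(target)
    return extracted


def unzip_and_upload(zip_path, work_dir, upload):
    """Unzip locally, then pass each file to upload(local_path, remote_name).

    `upload` is a hypothetical callable supplied by the caller.
    Returns the remote names in archive order.
    """
    dest = Path(work_dir).resolve()
    uploaded = []
    for local_path in extract_zip(zip_path, dest):
        # Keep the folder layout from the archive in the remote name.
        remote_name = local_path.relative_to(dest).as_posix()
        upload(local_path, remote_name)
        uploaded.append(remote_name)
    return uploaded
```

Passing the uploader in as a parameter keeps the unzip logic testable on a local machine before it is wired into an ADF custom activity.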

    Azure File Sync (Alternative Approach):

    • Consider Azure File Sync only if the source files live on a Windows Server machine. It is not a general-purpose tool for copying from SFTP to ADLS.
    • Azure File Sync synchronizes files between a Windows file server running the sync agent and an Azure file share.
    • Set up Azure File Sync between the Windows server that hosts the SFTP files and an Azure file share. Azure Files is a separate service from ADLS, so a further copy step is still needed to land the data in ADLS.
    • Once synchronization is configured, new files reach the file share without manual intervention.

    Remember to monitor ADF pipeline execution logs for any additional error details.
