Azure Data Factory - Change .zip to .gz on the fly using ADF pipeline

Lokesh 211 Reputation points
2020-04-27T12:12:01.85+00:00

Hi Guys,

We have a requirement to load .zip files from Amazon S3 into our DWH. As a first step we want to convert them into .gz files so that they can be picked up by PolyBase and loaded into the DWH.

Source dataset is set to compression type = ZipDeflate
Sink dataset is set to compression type = gzip
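
For reference, both are plain Binary datasets; below is a rough sketch of what their JSON looks like. The dataset and linked service names, bucket, container and folder values are placeholders, not our actual ones.

Source dataset (Amazon S3, ZipDeflate):

{
  "name": "SourceZipBinary",
  "properties": {
    "type": "Binary",
    "linkedServiceName": {
      "referenceName": "<AmazonS3LinkedService>",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AmazonS3Location",
        "bucketName": "<bucket-name>",
        "folderPath": "<folder>",
        "fileName": "file1.zip"
      },
      "compression": {
        "type": "ZipDeflate"
      }
    }
  }
}

Sink dataset (Azure Blob Storage, gzip):

{
  "name": "SinkGzBinary",
  "properties": {
    "type": "Binary",
    "linkedServiceName": {
      "referenceName": "<AzureBlobStorageLinkedService>",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "<container-name>",
        "folderPath": "<folder>"
      },
      "compression": {
        "type": "gzip",
        "level": "Optimal"
      }
    }
  }
}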

When debugging the pipeline, we get the following error.

{ "errorCode": "2200", "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorUnzipInvalidFile,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The
file 'file1.zip' is not a valid Zip file with Deflate compression method.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.IO.InvalidDataException,Message=End of Central Directory record could not be found.,Source=Microsoft.DataTransfer.ClientLibrary,'",
"failureType": "UserError", "target": "S3toBlob", "details": [] }

We have tried some other compression types as well, with no luck.

Has anyone tried this before?

Regards

Lokesh


1 answer

  1. Vaibhav Chaudhari 38,921 Reputation points Volunteer Moderator
    2020-04-27T17:39:23.99+00:00
