HTTP Zip file to Azure blob giving error ErrorCode=UserErrorSourceNotSeekable

Syed Rashid Nizam 56 Reputation points
2022-01-04T05:37:13.873+00:00

Something strange is happening: a copy activity that downloads a zip file from HTTP to Azure Data Lake was working fine two weeks ago, but now I am getting the following error. The same question has been asked by another user.
Please see the details below.

"ErrorCode=UserErrorSourceNotSeekable,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Your HttpServer source can't support random read which is requied by current copy activity setting, please create two copy activities to work around it: the first copy activity binary copy your HttpServer source to a staging file store(like Azure Blob, Azure Data Lake, File, etc.), second copy activity copy from the staged file store to your destination with current settings.,Source=Microsoft.DataTransfer.ClientLibrary,"

Public source HTTP : https://data.gov.au/data/dataset/5bd7fcab-e315-42cb-8daf-50b7efc2027e/resource/0ae4d427-6fa8-4d40-8e76-c6909b5a071b/download/public_split_1_10.zip
The source is configured with compression type ZipDeflate and Binary format.
The sink is configured with no compression type, again in Binary format.
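Expressed as dataset JSON, that source configuration looks roughly like the sketch below (the dataset and linked-service names are placeholders, not from the actual pipeline):

```json
{
    "name": "HttpZipBinary",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "HttpDataGovAu",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "HttpServerLocation",
                "relativeUrl": "data/dataset/5bd7fcab-e315-42cb-8daf-50b7efc2027e/resource/0ae4d427-6fa8-4d40-8e76-c6909b5a071b/download/public_split_1_10.zip"
            },
            "compression": {
                "type": "ZipDeflate"
            }
        }
    }
}
```

Decompressing ZipDeflate needs seekable (random) reads, because the zip directory sits at the end of the archive; that appears to be the "random read" the error message refers to.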

[Screenshots: copy activity source and sink configuration]


Accepted answer
  1. svijay-MSFT 5,241 Reputation points Microsoft Employee
    2022-01-04T16:43:21.443+00:00

    Hello @Syed Rashid Nizam ,

    Thanks for the question and using MS Q&A platform.

    I see that you have mentioned that this was working two weeks ago. I'll check whether there was any recent change. In the meantime, you could try the two steps below at your end and let us know if that resolves your issue.

    As the error message suggests, you would need two copy activities to work around the issue.

    Step 1: The first Copy activity gets the file from the source and stores it as a ZIP file, as binary.

    Source : HTTP

    Sink: a staging store (Azure Blob, for instance), as binary, keeping the same compression type as the source. You will not be uncompressing it at this stage.

    Step 2: A second Copy activity copies the file staged in Step 1 to the destination sink, with no compression type on the sink.
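    A minimal sketch of that two-activity pipeline, assuming three Binary datasets named here purely for illustration: HttpZipRaw (the HTTP source, compression type None so the zip is moved byte-for-byte), StagedZipBinary (the staging store), and DestinationBinary (the final sink):

    ```json
    {
        "name": "CopyHttpZipViaStaging",
        "properties": {
            "activities": [
                {
                    "name": "StageZipAsBinary",
                    "type": "Copy",
                    "typeProperties": {
                        "source": {
                            "type": "BinarySource",
                            "storeSettings": { "type": "HttpReadSettings", "requestMethod": "GET" }
                        },
                        "sink": {
                            "type": "BinarySink",
                            "storeSettings": { "type": "AzureBlobStorageWriteSettings" }
                        }
                    },
                    "inputs": [ { "referenceName": "HttpZipRaw", "type": "DatasetReference" } ],
                    "outputs": [ { "referenceName": "StagedZipBinary", "type": "DatasetReference" } ]
                },
                {
                    "name": "UnzipToDestination",
                    "type": "Copy",
                    "dependsOn": [
                        { "activity": "StageZipAsBinary", "dependencyConditions": [ "Succeeded" ] }
                    ],
                    "typeProperties": {
                        "source": {
                            "type": "BinarySource",
                            "storeSettings": { "type": "AzureBlobStorageReadSettings" },
                            "formatSettings": {
                                "type": "BinaryReadSettings",
                                "compressionProperties": { "type": "ZipDeflateReadSettings" }
                            }
                        },
                        "sink": {
                            "type": "BinarySink",
                            "storeSettings": { "type": "AzureBlobFSWriteSettings" }
                        }
                    },
                    "inputs": [ { "referenceName": "StagedZipBinary", "type": "DatasetReference" } ],
                    "outputs": [ { "referenceName": "DestinationBinary", "type": "DatasetReference" } ]
                }
            ]
        }
    }
    ```

    The staging hop works because the first copy never opens the archive, and the second copy reads from blob storage, which does support random reads, so the ZipDeflate decompression can succeed there.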

    Hope this helps :)


2 additional answers

  1. Syed Rashid Nizam 56 Reputation points
    2022-01-05T01:22:16.827+00:00

    Thanks @svijay-MSFT for your reply. It was definitely working for us; the screenshots above are from a test copy activity, but I also developed a solution with a parameterized URL that does this in a single copy activity. As further evidence, please see this answer by another MSFT employee, "https://learn.microsoft.com/en-us/answers/questions/544541/how-to-copy-http-zip-file-to-azure-blob.html", which does exactly this in one copy activity; it links a YouTube presentation following the same steps I did, and it was working for everyone. It is important that I find out what has changed in the past 3 to 4 weeks so that I can share it with my team.

    With regard to your advised solution, since I was not clear on the compression type for HTTP, I tried the following:
    Step 1:
    Source: HTTP, as Binary, compression type "None"
    Sink: Azure Data Lake Gen2, as Binary, compression type "None"

    This creates a file with a random name and no extension, not the actual name "public_split_1_10.zip" you get when downloading the file normally. See the screenshot below; the file name is in red.

    [Screenshot: staged file with a random, extension-less name, highlighted in red]
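    One detail that may help with the random name: when the sink dataset's fileName is left empty, the copy activity generates a name for an HTTP source (which has no file name of its own). Setting fileName explicitly on the staging sink dataset keeps the real name; a sketch, with placeholder linked-service and path names:

    ```json
    {
        "name": "StagedZipBinary",
        "properties": {
            "type": "Binary",
            "linkedServiceName": {
                "referenceName": "AdlsGen2Staging",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "staging",
                    "folderPath": "zips",
                    "fileName": "public_split_1_10.zip"
                }
            }
        }
    }
    ```

    For the parameterized-URL version, fileName could be an expression built from the same dataset parameter.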

    Trying Step 1 another way:

    Step 1:
    Source: HTTP, as Binary, compression type "None"
    Sink: Azure Data Lake Gen2, as Binary, compression type "ZipDeflate"

    This creates a file with a more sensible name, but it appends .zip again because of the compression type. I need the proper name, not Public_Split_1_10.zip.zip; see the red part in the screenshot below. Otherwise it causes an issue in the next copy activity, where I would like to use the wildcard path *.zip.

    [Screenshot: staged file named with a double .zip.zip extension, highlighted in red]
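    For the *.zip wildcard in the next copy activity, the source side could look like the sketch below, assuming the staged zips sit in a "zips" folder (all dataset, container and folder names are placeholders):

    ```json
    {
        "name": "UnzipStagedZips",
        "type": "Copy",
        "typeProperties": {
            "source": {
                "type": "BinarySource",
                "storeSettings": {
                    "type": "AzureBlobFSReadSettings",
                    "recursive": true,
                    "wildcardFolderPath": "zips",
                    "wildcardFileName": "*.zip"
                },
                "formatSettings": {
                    "type": "BinaryReadSettings",
                    "compressionProperties": {
                        "type": "ZipDeflateReadSettings",
                        "preserveZipFileNameAsFolder": false
                    }
                }
            },
            "sink": {
                "type": "BinarySink",
                "storeSettings": { "type": "AzureBlobFSWriteSettings" }
            }
        },
        "inputs": [ { "referenceName": "StagedZipBinary", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "DestinationBinary", "type": "DatasetReference" } ]
    }
    ```

    Setting preserveZipFileNameAsFolder to false should also stop the copy from creating a folder named after each archive, so the extracted files land directly in the destination path.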

    The URL I have given is public, so you can try it at your end and let me know what works for you.
    HTTP Source URL "https://data.gov.au/data/dataset/5bd7fcab-e315-42cb-8daf-50b7efc2027e/resource/0ae4d427-6fa8-4d40-8e76-c6909b5a071b/download/public_split_1_10.zip"

    More importantly, I need to understand why there has been a change in Data Factory. If something that was working for me and others now has an issue, it is important for our organisation to know the answer.


  2. admin-sroy 0 Reputation points
    2024-02-29T09:59:07.27+00:00

    I am also suddenly facing the same issue, after the production job had been running correctly for the past 30 days. I am not sure what changed for it to suddenly behave like this. Kindly reply if you find out the reason rather than just an alternative solution. Thanks.

