How to get .zip data from the web into my DataFactory

Jonas Trumpfheller 21 Reputation points
2021-07-29T09:25:46.507+00:00

Hello everyone,

First of all, I am relatively new to the world of Azure.

I think it's best if I explain to you first what exactly I'm planning or what my project is.

I'm trying to save data from a certain website, which is in a zip format, into Azure, analyze it and then visualize it in the end and make it available for other people.

The problem I have now is that I can't manage to connect the website to Azure.
I have tried to connect to the website with the DataFactory and copy the data to a DataLake using the Data Copy module, but this did not work.

So now my question is, how do I manage to store the data from the Internet in Azure and process it afterwards?

Do I need the DataFactory at all? Am I forgetting a service completely? Will my project work at all?

I'm a bit desperate and unfortunately can't get any further without help, as Azure is simply too complicated and too extensive for me at some points.

Thanks a lot for your help!

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. MartinJaffer-MSFT 26,236 Reputation points
    2021-08-11T22:22:42.75+00:00

    Okay, I have a better understanding now.

    In this case, I agree with the error message's suggestion. Breaking this down into 2 parts will give you much better control. Not everything can be done in a single step.

    Instead, use two copy activities:

    HTTP (Binary) -> Blob (Binary), then Blob (Text, compressed) -> Blob (Text, uncompressed),
    where Blob (Binary) and Blob (Text, compressed) point to the same location.
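To make the two-step idea concrete outside of Data Factory, here is a minimal Python sketch using only the standard library. It is not how Data Factory itself works; the function names `copy_binary` and `decompress`, the staging file, and the target folder are all hypothetical stand-ins for the two copy activities and the shared Blob location.

```python
import zipfile
from pathlib import Path
from urllib.request import urlopen


def copy_binary(source_url: str, staging_path: Path) -> Path:
    """Step 1: copy the archive byte-for-byte, without interpreting its contents."""
    with urlopen(source_url) as response:
        staging_path.write_bytes(response.read())
    return staging_path


def decompress(staging_path: Path, target_dir: Path) -> list[str]:
    """Step 2: read the staged archive and write its members uncompressed."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(staging_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

The staged file plays the role of the shared location: step 1 writes it as opaque bytes, and step 2 reads the same file as a compressed source. Keeping the steps separate means a failed download and a failed decompression show up as different errors.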

    I think the Copy Data wizard only creates one copy activity at a time. However, you can use it to create each of the two activities, then put both into the same pipeline, linked by a green on-success dependency.
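As a rough illustration of what the linked pipeline might look like in its JSON form, the sketch below builds the definition as a Python dictionary and prints it. The pipeline, activity, and dataset names are hypothetical, the use of a delimited-text format for the "Text" datasets is an assumption, and the property names reflect my understanding of the pipeline JSON schema, so compare them with what the authoring UI actually generates before relying on them.

```python
import json


def copy_activity(name, source_dataset, sink_dataset, source_type, sink_type, depends_on=()):
    """Build a Copy activity that runs only after each upstream activity succeeds."""
    return {
        "name": name,
        "type": "Copy",
        # The green on-success link in the UI corresponds to a "Succeeded" condition.
        "dependsOn": [
            {"activity": upstream, "dependencyConditions": ["Succeeded"]}
            for upstream in depends_on
        ],
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {"source": {"type": source_type}, "sink": {"type": sink_type}},
    }


# All names below are hypothetical placeholders.
pipeline = {
    "name": "IngestZipFromWeb",
    "properties": {
        "activities": [
            copy_activity("CopyZipToBlob", "HttpZipBinary", "BlobZipBinary",
                          "BinarySource", "BinarySink"),
            copy_activity("UnzipToText", "BlobZipCompressedText", "BlobUncompressedText",
                          "DelimitedTextSource", "DelimitedTextSink",
                          depends_on=["CopyZipToBlob"]),
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```

Here `BlobZipBinary` and `BlobZipCompressedText` would be two datasets pointing at the same Blob path, one treating the file as binary and the other as compressed text.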

    Hmm, it may be worth doing the compressed -> uncompressed conversion in the first step, or adding it as a middle step, instead of leaving it as the last step.
