Get CSV data from a website using copy activity

arkiboys 9,706 Reputation points
2023-06-08T08:58:54.5033333+00:00

Hello,

There are CSV files on an HTTP website.

At present I can manually click on a CSV file on that site and it downloads to my machine, but I want to download it using ADF so that I can automate it later.

For example, the URL is: https:/webaddress/xyz

The file name for this month is:

filename-may2023

I created a linked service whose base URL is the HTTP address above.
Then I created a dataset of type delimited text that points to the linked service.
I then added a copy activity whose source points to the dataset. But in the source preview I do not see the data from the CSV file. It shows 10 lines of HTML tags instead.
Should I be able to see the CSV data in the preview tab? Am I doing this correctly?

Do I need to download the CSV file first and then extract the data, or does the copy activity extract data from the CSV file on the web directly?
Or do I have to download the file first, put it into a blob, and then extract the data?

Thank you

Azure Data Factory

Accepted answer
  1. Subashri Vasudevan 11,226 Reputation points
    2023-06-08T11:54:42.1033333+00:00

    Hi @arkiboys,

    Thanks for the question and for using the MS Q&A portal.

    Normally, when we use the HTTP connector, the copy activity shows the file's data in the data preview. There is no need to download the file before copying it to the sink. Seeing HTML tags in the preview is unexpected.
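To see what the URL actually returns outside ADF, a quick check like the following may help. It is only a sketch using the Python standard library: the function names (`looks_like_html`, `fetch`, `diagnose`) are hypothetical, and the HTML test is a simple heuristic, not a full parser. If the result is `"html"`, one possible explanation (an assumption, not something confirmed in this thread) is that the address resolves to a web page rather than to the CSV file itself.

```python
import urllib.request
from typing import Tuple


def looks_like_html(body: bytes) -> bool:
    """Heuristic: True if the first bytes of the payload resemble an HTML page."""
    # Skip a UTF-8 byte order mark and leading whitespace, then compare case-insensitively.
    head = body[:512].lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    return (
        head.startswith((b"<!doctype html", b"<html"))
        or b"<head" in head
        or b"<body" in head
    )


def fetch(url: str, timeout: float = 30.0) -> Tuple[str, bytes]:
    """Issue a plain HTTP GET and return (Content-Type header, response body)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.headers.get("Content-Type", ""), resp.read()


def diagnose(content_type: str, body: bytes) -> str:
    """Return "html" if the response looks like a web page, else "not-html"."""
    if "text/html" in content_type.lower() or looks_like_html(body):
        return "html"
    return "not-html"
```

Calling `diagnose(*fetch(url))` with the full address of the file would then indicate whether the server is returning a page or the raw file.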

    Please verify the file type you selected for the dataset after choosing the HTTP connector as the source. Choosing the wrong format can cause issues like the HTML tags you mention. Please check and let us know whether that works for you.
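For reference, a delimited text dataset over an HTTP linked service can be sketched as the JSON below, built here as a Python dictionary so it can be printed or compared against the dataset's JSON view in ADF. The dataset name `HttpCsvDataset`, the linked service name `HttpLinkedService`, and the helper `build_http_csv_dataset` are hypothetical. The relative URL is taken as-is from the question, and the comma delimiter and header row are assumptions about the file.

```python
import json


def build_http_csv_dataset(linked_service: str, relative_url: str) -> dict:
    """Build a DelimitedText dataset definition that reads from an HTTP linked service."""
    return {
        "name": "HttpCsvDataset",  # hypothetical dataset name
        "properties": {
            "type": "DelimitedText",  # the format chosen after selecting the HTTP connector
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {
                "location": {
                    "type": "HttpServerLocation",
                    # Path to the file, relative to the linked service's base URL.
                    "relativeUrl": relative_url,
                },
                "columnDelimiter": ",",     # assumption: comma-separated file
                "firstRowAsHeader": True,   # assumption: first row holds column names
            },
        },
    }


dataset = build_http_csv_dataset("HttpLinkedService", "filename-may2023")
print(json.dumps(dataset, indent=2))
```

The point to compare is that `type` is `DelimitedText` and that `relativeUrl`, combined with the base URL, addresses the file itself.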


