How to download excel file from Cloud and Save it as Excel in Azure Data Lake Store in Azure Synapse?

Chaiwa 56 Reputation points
2022-09-17T02:45:37.833+00:00

Anyone here to help me configure the Azure Synapse Copy Activity or any other type of activity to download excel files from a cloud storage such as Dropbox and save it as it is in ADLS? The only option I have seen and tried is to configure the Copy Activity using the http connector with the source as Binary and sink as Binary, but that doesn't make it available for processing in Data Flow. I want them saved as Excel just as they would be saved if manually uploaded. Data Flow doesn't have Binary as an option for source.The files are huge to download and upload manually so I can only use the http connector to download the files but I am stuck on the destination format since there is no option for saving it as Excel. The files are not .csv but .xlsx and .xls so I cannot go for Delimited Text.

Anyone knows how this works?

Thanks,
Chaiwa.

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,187 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Chaiwa 56 Reputation points
    2022-09-17T19:17:19.673+00:00

    I solved this by using Copy Activity with the http connector and configuring the source dataset as Excel and sink dataset as parquet. Now I can read parquet into the Data Flow and process the files.

    Thanks.


  2. AnnuKumari-MSFT 34,011 Reputation points Microsoft Employee
    2022-09-19T10:43:37.133+00:00

    Hi @Chaiwa ,

    Thankyou for using Microsoft Q&A platform and thanks for posting your question here.

    • You can consider using a copy activity with source dataset as HTTP connector with Binary format and Sink dataset as ADLS connector with binary format with filename as FileName.xls.
    • In the sink dataset , select compression type as zipdeflate and compression level as Optimal. This will unzip the file on the fly .
    • Once the data has been landed, create a ADLS connector with Excel format in the source transformation of the Dataflow .

    I am still checking on the error you are getting due to limitation of file size. However, please feel free to try the above approach and let us know if it helps.

    Check the below screenshot for sink dataset of copy activity:

    242459-image.png

    Hope this will help. Please let us know if any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you.
      Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.