Azure Data Factory: convert Excel file copied from SharePoint Online to Parquet

Cooldbguy 1 Reputation point
2020-07-01T04:22:19.69+00:00

Hi All,

I need to copy Excel files from SharePoint Online to Azure Data Lake Store (ADLS) Gen1 in Parquet format. Using the link below, I was able to copy the file over to ADLS, but I can't convert it to Parquet because ADF doesn't show me the sheet name of the Excel file when reading it.

Can anybody suggest?

http://learn.microsoft.com/en-us/azure/data-factory/connector-sharepoint-online-list

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. ChiragMishra-MSFT 956 Reputation points
    2020-07-01T06:12:03.277+00:00

    Hi @Cooldbguy ,

    You can now read Excel files as a source, which means you can also specify the sheet name while copying. However, the Excel format is supported only for the following connectors: Amazon S3, Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure File Storage, File System, FTP, Google Cloud Storage, HDFS, HTTP, and SFTP. The SharePoint Online List connector is not in that list, so the conversion needs a second step. Read more here: https://learn.microsoft.com/en-us/azure/data-factory/format-excel

    Hence, you can convert the copied Excel file to Parquet by adding a second copy activity that reads the Excel files already in your ADLS Gen1 and writes them to a Parquet dataset, which can be on the same ADLS Gen1.

    Hope this helps.
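
As a rough illustration of the second copy activity described in this answer, here is a minimal sketch that builds the three JSON definitions involved (an Excel source dataset, a Parquet sink dataset, and the copy activity) as Python dictionaries. The property names follow the Excel and Parquet format documentation as far as I know it, so check them against the current docs before use. The linked service name, folder paths, file name, and sheet name are all hypothetical placeholders, not values from the question.

```python
import json

# Hypothetical linked service pointing at the ADLS Gen1 account.
ADLS_LINKED_SERVICE = {
    "referenceName": "AdlsGen1LinkedService",
    "type": "LinkedServiceReference",
}


def excel_dataset(name, folder, file_name, sheet_name):
    """Excel dataset over a file that already sits in ADLS Gen1."""
    return {
        "name": name,
        "properties": {
            "type": "Excel",
            "linkedServiceName": ADLS_LINKED_SERVICE,
            "typeProperties": {
                "location": {
                    "type": "AzureDataLakeStoreLocation",
                    "folderPath": folder,
                    "fileName": file_name,
                },
                "sheetName": sheet_name,
                "firstRowAsHeader": True,
            },
        },
    }


def parquet_dataset(name, folder):
    """Parquet dataset on the same ADLS Gen1 account (the sink)."""
    return {
        "name": name,
        "properties": {
            "type": "Parquet",
            "linkedServiceName": ADLS_LINKED_SERVICE,
            "typeProperties": {
                "location": {
                    "type": "AzureDataLakeStoreLocation",
                    "folderPath": folder,
                },
                "compressionCodec": "snappy",
            },
        },
    }


def excel_to_parquet_copy(name, source_dataset, sink_dataset):
    """Copy activity that reads the Excel dataset and writes Parquet."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {
                "type": "ExcelSource",
                "storeSettings": {"type": "AzureDataLakeStoreReadSettings"},
            },
            "sink": {
                "type": "ParquetSink",
                "storeSettings": {"type": "AzureDataLakeStoreWriteSettings"},
            },
        },
    }


# Placeholder names and paths, for illustration only.
source = excel_dataset("ExcelOnAdls", "landing/sharepoint", "report.xlsx", "Sheet1")
sink = parquet_dataset("ParquetOnAdls", "curated/report")
activity = excel_to_parquet_copy("ExcelToParquet", source["name"], sink["name"])

print(json.dumps(activity, indent=2))
```

The printed JSON is only the activity; the two dataset dictionaries can be dumped the same way and pasted into the dataset code view.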


  2. ChiragMishra-MSFT 956 Reputation points
    2020-07-15T06:43:26.9+00:00

    Hi @Cooldbguy-8578,

    You can parameterize the sheet name to dynamically pass it during your copy activity as shown below.
    [Image: 12248-image.png]
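
Since the screenshot may not be visible, here is a minimal sketch of what a parameterized sheet name could look like in the dataset JSON, again built as Python dictionaries. It assumes a dataset parameter named `sheetName` (a hypothetical name) that is referenced through the `@dataset()` expression, with the value supplied by the copy activity's dataset reference. The dataset, linked service, folder, and file names are placeholders.

```python
import json


def parameterized_excel_dataset(name, linked_service, folder, file_name):
    """Excel dataset whose sheet name comes from a dataset parameter."""
    return {
        "name": name,
        "properties": {
            "type": "Excel",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # Dataset-level parameter; "sheetName" is a hypothetical name.
            "parameters": {"sheetName": {"type": "String"}},
            "typeProperties": {
                "location": {
                    "type": "AzureDataLakeStoreLocation",
                    "folderPath": folder,
                    "fileName": file_name,
                },
                # Resolved at run time from the parameter above.
                "sheetName": {
                    "value": "@dataset().sheetName",
                    "type": "Expression",
                },
                "firstRowAsHeader": True,
            },
        },
    }


def dataset_reference(dataset_name, sheet_name):
    """Input reference for a copy activity, passing the sheet name."""
    return {
        "referenceName": dataset_name,
        "type": "DatasetReference",
        "parameters": {"sheetName": sheet_name},
    }


# Placeholder values, for illustration only.
dataset = parameterized_excel_dataset(
    "ExcelOnAdlsParam", "AdlsGen1LinkedService", "landing/sharepoint", "report.xlsx"
)
copy_input = dataset_reference(dataset["name"], "Sheet1")

print(json.dumps(dataset, indent=2))
print(json.dumps(copy_input, indent=2))
```

To drive the sheet name from the pipeline instead of a literal, the value passed in the reference could itself be an expression such as `@pipeline().parameters.sheetName`, assuming a pipeline parameter of that name exists.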

