How to read excel document from sharepoint directly dump it in Azure SQL using Azure data factory ?

heta desai 252 Reputation points
2024-04-15T07:24:19.01+00:00

I want to access excel document from SharePoint and without downloading that to data lake or blob storage, use data factory to read the content of it and directly dump it in Azure SQL.

Azure SQL Database
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,882 questions
SharePoint
SharePoint
A group of Microsoft Products and technologies used for sharing and managing content, knowledge, and applications.
10,874 questions
{count} votes

3 answers

Sort by: Most helpful
  1. Amira Bedhiafi 26,656 Reputation points
    2024-04-15T09:44:51.7466667+00:00

    Based on the documentation :

    You can copy file from SharePoint Online by using Web activity to authenticate and grab access token from SPO, then passing to subsequent Copy activity to copy data with HTTP connector as source.

    sharepoint copy file flow

    Follow the Prerequisites section to create Microsoft Entra application and grant permission to SharePoint Online.

    1. Create a Web Activity to get the access token from SharePoint Online:

    URL: https://accounts.accesscontrol.windows.net/[Tenant-ID]/tokens/OAuth/2. Replace the tenant ID.

    Method: POST

    Headers:

       Content-Type: application/x-www-form-urlencoded
      
    **Body**: `grant_type=client_credentials&client_id=[Client-ID]@[Tenant-ID]&client_secret=[Client-Secret]&resource=00000003-0000-0ff1-ce00-000000000000/[Tenant-Name].sharepoint.com@[Tenant-ID]`. Replace the client ID (application ID), client secret (application key), tenant ID, and tenant name (of the SharePoint tenant).
    
    1. Chain with a Copy activity with HTTP connector as source to copy SharePoint Online file content: HTTP linked service: Base URL: https://[site-url]/_api/web/GetFileByServerRelativeUrl('[relative-path-to-file]')/$value. Replace the site URL and relative path to file. Make sure to include the SharePoint site URL along with the Domain name, such as https://[sharepoint-domain-name].sharepoint.com/sites/[sharepoint-site]/_api/web/GetFileByServerRelativeUrl('/sites/[sharepoint-site]/[relative-path-to-file]')/$value. Authentication type: Anonymous (to use the Bearer token configured in copy activity source later) Dataset: choose the format you want. To copy file as-is, select "Binary" type. Copy activity source: Request method: GET Additional header: use the following expression@{concat('Authorization: Bearer ', activity('<Web-activity-name>').output.access_token)}, which uses the Bearer token generated by the upstream Web activity as authorization header. Replace the Web activity name. Configure the copy activity sink as usual.

    More links :

    https://stackoverflow.com/questions/45085236/how-to-fetch-sharepoint-data-and-load-into-azure-blob-storage-using-azure-data-f

    https://github.com/MicrosoftDocs/azure-docs/issues/18486

    https://github.com/MicrosoftDocs/azure-docs/issues/22486

    0 comments No comments

  2. AnnuKumari-MSFT 33,476 Reputation points Microsoft Employee
    2024-04-18T09:31:00.4933333+00:00

    Hi heta desai ,

    Thankyou for using Microsoft Q&A platform and thanks for posting your query here.

    As per my understanding you are trying to copy data from sharepoint online to SQL directly.

    You can directly use copy activity to copy data from SharePoint Online List with the help of sharepoint online connector in ADF and in sink, point the dataset to SQL connector.

    As a pre-requisite , kindly make sure to have service principal authentication to connect to SharePoint and grant site permission to your registered application

    Here is the documentation which can help with the configuration to Copy data from SharePoint Online List by using Azure Data Factory or Azure Synapse Analytics

    Hope it helps. Kindly accept the answer . Thankyou


  3. Pinaki Ghatak 4,690 Reputation points Microsoft Employee
    2024-05-27T10:18:57.5066667+00:00

    Hello @heta desai

    To access an Excel document from SharePoint and directly dump its content into Azure SQL using Azure Data Factory, you can use the SharePoint Online List connector as a source and the Azure SQL Database connector as a sink.

    However, please note that the Excel document needs to be in a supported format for Azure Data Factory to read it.

    Here are the high-level steps to achieve this:

    1. Create a new data factory in the Azure portal.
    2. Create a new pipeline in the data factory.
    3. Add a SharePoint Online List dataset as the source for the pipeline. In the dataset, specify the URL of the SharePoint site and the name of the Excel document.
    4. Add an Azure SQL Database dataset as the sink for the pipeline. In the dataset, specify the connection string to the Azure SQL Database and the name of the table where you want to dump the data.
    5. Add a copy activity to the pipeline. In the copy activity, specify the SharePoint Online List dataset as the source and the Azure SQL Database dataset as the sink.
    6. Configure the mapping between the source and sink datasets. You can use the mapping options in Azure Data Factory to transform the data as needed. Once you have completed these steps, you can trigger the pipeline to read the content of the Excel document from SharePoint and directly dump it into Azure SQL.

    Please note that if the Excel document is not in a supported format, you may need to convert it to a supported format before reading it with Azure Data Factory. Additionally, you may need to configure authentication for the SharePoint Online List dataset if your SharePoint site requires authentication.


    I hope that this response has addressed your query and helped you overcome your challenges. If so, please mark this response as Answered. This will not only acknowledge our efforts, but also assist other community members who may be looking for similar solutions.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.