How to connect SharePoint with Azure OpenAI for automatic document processing

Yash Shukla 40 Reputation points
2024-01-04T15:55:25.9166667+00:00

How can I establish a connection between SharePoint and Azure OpenAI for automatic document processing? I'm using the official Azure repository for Azure OpenAI with my own data, following this link: https://github.com/Azure-Samples/azure-search-openai-demo

My goal is to automatically fetch documents from a SharePoint folder, process them, and store embeddings and vector indexes in Azure AI Search. I'm using the RAG Architecture Approach, as shown in the attached diagram:

Diagram that shows the embeddings approach.

As I'm new to SharePoint, I haven't been able to come up with a working approach with the official documents provided by Microsoft. Can someone help me out or provide further explanation if needed?

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
SharePoint
A group of Microsoft Products and technologies used for sharing and managing content, knowledge, and applications.
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Boris Von Dahle 3,216 Reputation points
    2024-01-04T22:24:08.3766667+00:00

    Hello,

    To connect SharePoint with Azure OpenAI for automatic document processing, you would need to create a bridge between SharePoint and Azure. Unfortunately, the Azure OpenAI demo repository does not provide direct support for SharePoint integration. However, you can achieve this by using SharePoint's APIs to fetch the documents and then process them using Azure OpenAI.

    Here are the general steps you might follow:

    SharePoint provides REST APIs that can be used to fetch documents from a SharePoint library. You can use these APIs to fetch the documents from your SharePoint site.

    https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/get-to-know-the-sharepoint-rest-service?tabs=csom
    https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins/working-with-folders-and-files-with-rest
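
As a rough illustration of the REST approach, here is a minimal Python sketch that builds the folder and file URLs described in the pages linked above and lists the files in a folder. The helper names, the site URL, and the folder path are made up for the example. The sketch also assumes you already have a valid OAuth access token for the SharePoint site (for instance from an Entra ID app registration), which is not covered here, and that the response uses the `odata=nometadata` JSON shape.

```python
import json
import urllib.request
from urllib.parse import quote


def folder_files_url(site_url: str, folder_path: str) -> str:
    """Build the REST URL that lists the files in a server-relative folder."""
    # Single quotes are doubled (OData string escaping) and then URL-encoded.
    encoded = quote(folder_path.replace("'", "''"))
    base = site_url.rstrip("/")
    return f"{base}/_api/web/GetFolderByServerRelativeUrl('{encoded}')/Files"


def file_content_url(site_url: str, file_path: str) -> str:
    """Build the REST URL that returns the raw bytes of one file."""
    encoded = quote(file_path.replace("'", "''"))
    base = site_url.rstrip("/")
    return f"{base}/_api/web/GetFileByServerRelativeUrl('{encoded}')/$value"


def get_json(url: str, access_token: str) -> dict:
    """GET a SharePoint REST URL and parse the JSON body (token is assumed valid)."""
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json;odata=nometadata",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_folder_files(site_url, folder_path, access_token, fetch=get_json):
    """Return the server-relative URLs of all files directly inside the folder."""
    payload = fetch(folder_files_url(site_url, folder_path), access_token)
    return [item["ServerRelativeUrl"] for item in payload["value"]]
```

Each path returned by `list_folder_files` can then be passed to `file_content_url` to download the file itself. The `fetch` parameter is only there so the listing logic can be tried out without a live SharePoint site.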

    You could also use Azure Data Factory to copy the files to Blob Storage.

    https://learn.microsoft.com/en-us/azure/data-factory/connector-sharepoint-online-list?tabs=data-factory
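
If you would rather do the copy step in code than in Azure Data Factory, a minimal sketch could look like the following. Note that this is not Data Factory: it uses the `azure-storage-blob` Python package instead. The helper names, the library root, and the container layout are assumptions for illustration, and the file bytes are assumed to have been downloaded from SharePoint already.

```python
from pathlib import PurePosixPath


def blob_name_for(server_relative_url: str, library_root: str) -> str:
    """Map a SharePoint server-relative path to a blob name, keeping subfolders."""
    # relative_to raises ValueError if the file is not under library_root.
    return str(PurePosixPath(server_relative_url).relative_to(library_root))


def upload_to_blob(connection_string: str, container: str,
                   blob_name: str, data: bytes) -> None:
    """Upload one document to Blob Storage, overwriting any existing copy."""
    # Imported lazily so the path helper above works without the Azure SDK.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container, blob=blob_name)
    blob.upload_blob(data, overwrite=True)
```

Keeping the SharePoint folder structure in the blob name makes it easier to trace an indexed chunk back to its source document later.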

    Once you have the documents, you can process them with Azure OpenAI, for example by splitting each document into chunks and generating an embedding for each chunk. The Azure OpenAI demo repository provides a good starting point for this.

    https://github.com/Azure-Samples/azure-search-openai-demo

    After processing the documents, you can then store the resulting embeddings and vector indexes in Azure AI Search.
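
To make the last two steps more concrete, here is a minimal sketch of chunking, embedding, and indexing. It is not how the demo repository does it. The index field names (`id`, `content`, `embedding`, `sourcefile`), the chunk sizes, and all helper names are assumptions for illustration. It uses the `openai` and `azure-search-documents` Python packages, and the search index is assumed to exist already with a matching schema.

```python
from typing import Callable, Sequence


def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def build_documents(doc_id: str, source_url: str, chunks: Sequence[str],
                    embed: Callable[[list[str]], list[list[float]]]) -> list[dict]:
    """Pair each chunk with its embedding in the (assumed) index schema."""
    # doc_id should only contain characters allowed in a search document key.
    vectors = embed(list(chunks))
    return [
        {"id": f"{doc_id}-{i}", "content": chunk,
         "embedding": vector, "sourcefile": source_url}
        for i, (chunk, vector) in enumerate(zip(chunks, vectors))
    ]


def azure_openai_embedder(endpoint: str, api_key: str,
                          deployment: str, api_version: str):
    """Return an embed(texts) function backed by an Azure OpenAI deployment."""
    from openai import AzureOpenAI  # lazy import: only needed for real calls

    client = AzureOpenAI(azure_endpoint=endpoint, api_key=api_key,
                         api_version=api_version)

    def embed(texts: list[str]) -> list[list[float]]:
        response = client.embeddings.create(model=deployment, input=texts)
        return [item.embedding for item in response.data]

    return embed


def upload_documents(search_endpoint: str, index_name: str,
                     search_key: str, documents: list[dict]) -> None:
    """Push the chunk documents into an existing Azure AI Search index."""
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(search_endpoint, index_name,
                          AzureKeyCredential(search_key))
    client.upload_documents(documents=documents)
```

Because `build_documents` takes the embedding function as a parameter, you can try the chunking and document shaping locally with a dummy embedder before wiring in the real Azure OpenAI deployment.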

    Hope this helps

    Regards

    If you found this answer useful, please consider marking it as 'Accepted.' This helps other users easily find and benefit from this information

    4 people found this answer helpful.

1 additional answer

  1. Tjerk van der Maten 10 Reputation points
    2024-02-07T08:27:31.7166667+00:00

    The first answer is indeed valuable. However, what I'm struggling with in direct SharePoint connections is document-level security. With an ADLS solution we can add searchable metadata, such as user IDs or group IDs, for example through Graph API solutions. Is it already possible to also take into account the existing folder- or document-level security, so that not everyone can ask questions about every document located in a SharePoint site? Thanks in advance!
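
For reference, the metadata approach mentioned above is typically applied at query time as an OData filter on a collection field holding the allowed group IDs. The sketch below only builds that filter string. The field name `group_ids` is an assumption, and it does not answer whether SharePoint's own permissions can be picked up automatically: the group IDs still have to be written into the index during ingestion and looked up for the signed-in user at query time.

```python
def security_filter(group_ids: list[str], field: str = "group_ids") -> str:
    """Build an Azure AI Search filter that keeps documents visible to any given group."""
    if not group_ids:
        raise ValueError("at least one group ID is required")
    for group_id in group_ids:
        # Reject characters that would break the quoted, comma-delimited list.
        if "'" in group_id or "," in group_id:
            raise ValueError(f"unsupported character in group ID: {group_id!r}")
    joined = ",".join(group_ids)
    return f"{field}/any(g: search.in(g, '{joined}', ','))"
```

The resulting string can be passed as the `filter` argument of `SearchClient.search` alongside the user's question.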

    1 person found this answer helpful.
