Document Indexing and Vector Embeddings for SharePoint files using Azure Cognitive Search

Sarva, Pavan 45 Reputation points
2023-10-13T20:04:44.44+00:00

Please let me know whether the SharePoint indexer for Azure Cognitive Search can both index documents from SharePoint libraries and generate vector embeddings for them. If not, what is the recommended approach to produce embeddings for documents in the SharePoint library and persist them in order to execute vector search using Azure Cognitive Search?

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
Microsoft 365 and Office SharePoint For business Windows
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. brtrach-MSFT 17,731 Reputation points Microsoft Employee Moderator
    2023-10-16T01:02:30.51+00:00

    @Sarva, Pavan Yes, the SharePoint indexer in Azure Cognitive Search can index documents from SharePoint libraries. However, it does not generate vector embeddings by default.

    To generate vector embeddings for documents in the SharePoint library and persist them for vector search, you can use Azure OpenAI Service: generate an embedding for each document, store it in a separate field in the search index, and then use Azure Cognitive Search to run vector search queries against that field.
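    As a rough illustration of generating embeddings and storing them in a separate field, the sketch below calls the Azure OpenAI embeddings REST endpoint and builds a `mergeOrUpload` batch for the search index. It is a minimal sketch, not a reference implementation: the field names `id`, `content`, and `contentVector`, the helper names, and both API versions are assumptions, and the document key you merge on must match the key the SharePoint indexer actually wrote to the index. Long documents would likely need to be split into chunks before embedding.

```python
import json
import urllib.request
from typing import Callable, Iterable

AOAI_API_VERSION = "2023-05-15"    # assumed Azure OpenAI REST API version
SEARCH_API_VERSION = "2023-11-01"  # assumed Azure AI Search REST API version
VECTOR_FIELD = "contentVector"     # hypothetical name of the vector field


def post_json(url: str, api_key: str, body: dict) -> dict:
    """POST a JSON body with an api-key header and return the parsed JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def azure_openai_embed(endpoint: str, deployment: str, api_key: str, text: str) -> list:
    """Return the embedding for `text` from an Azure OpenAI embeddings deployment."""
    url = (
        f"{endpoint}/openai/deployments/{deployment}/embeddings"
        f"?api-version={AOAI_API_VERSION}"
    )
    return post_json(url, api_key, {"input": text})["data"][0]["embedding"]


def build_merge_batch(docs: Iterable[dict], embed: Callable[[str], list]) -> dict:
    """Build an index batch that adds an embedding to each document by key.

    Each doc is assumed to carry the index key in "id" and its text in "content".
    """
    return {
        "value": [
            {
                "@search.action": "mergeOrUpload",
                "id": doc["id"],
                VECTOR_FIELD: embed(doc["content"]),
            }
            for doc in docs
        ]
    }


def upload_batch(search_endpoint: str, index_name: str, api_key: str, batch: dict) -> dict:
    """Send the batch to the index's documents endpoint."""
    url = (
        f"{search_endpoint}/indexes/{index_name}/docs/index"
        f"?api-version={SEARCH_API_VERSION}"
    )
    return post_json(url, api_key, batch)
```

    Passing the embedding function in as a parameter keeps the batch-building step independent of the network call, so it can be checked with a stub before any Azure resources are involved.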

    Here is a high-level overview of the steps you can follow:

    1. Set up a SharePoint indexer in Azure Cognitive Search to index documents from the SharePoint library.
    2. Use Azure OpenAI Service to generate embeddings for each document in the SharePoint library.
    3. Store the embeddings in a separate field in the search index.
    4. Use Azure Cognitive Search to execute vector search queries against the embeddings.
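    The last step, querying the stored embeddings, could look roughly like the following. This is a sketch under assumptions: it targets the `2023-11-01` REST API version, the vector field is assumed to be named `contentVector`, the `select` list assumes `id` and `content` fields exist, and the query vector must come from the same embedding model that was used for the documents.

```python
import json
import urllib.request

SEARCH_API_VERSION = "2023-11-01"  # assumed Azure AI Search REST API version


def build_vector_query(query_vector, k=5, vector_field="contentVector"):
    """Build a search request body that asks for the k nearest neighbours.

    `vector_field` is a hypothetical field name; use the one defined in your index.
    """
    return {
        "select": "id,content",  # assumed retrievable fields
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(query_vector),
                "fields": vector_field,
                "k": k,
            }
        ],
    }


def vector_search(search_endpoint, index_name, api_key, query_vector, k=5):
    """POST the vector query to the index and return the matching documents."""
    url = (
        f"{search_endpoint}/indexes/{index_name}/docs/search"
        f"?api-version={SEARCH_API_VERSION}"
    )
    request = urllib.request.Request(
        url,
        data=json.dumps(build_vector_query(query_vector, k)).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["value"]
```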
