Index Refresh for Persistent Vector Embedding of Enterprise Documents with Azure Cognitive Search.

Sarva, Pavan 40 Reputation points
2023-10-13T20:23:06.1166667+00:00

Please suggest an approach, and the considerations involved, for refreshing the index when content is updated, without requiring changes to the vector search application, while using Azure Cognitive Search for persistent vector embeddings.

The scenario is enterprise documentation that is updated periodically and must be re-indexed so that vector search results stay relevant.

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
SharePoint
A group of Microsoft Products and technologies used for sharing and managing content, knowledge, and applications.
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. brtrach-MSFT 15,356 Reputation points Microsoft Employee
    2023-10-16T01:06:56.2333333+00:00

    @Sarva, Pavan To refresh the index for content updates without changing the vector search application, you can use an Azure Cognitive Search indexer. An indexer extracts searchable content from a cloud data source and populates a search index using field-to-field mappings between the source data and the index. Because the index name and schema stay the same across refreshes, the application that queries the index does not need to change. Here are the steps to configure the indexer:

    First, create an Azure Cognitive Search index with the necessary fields for your vector search application.

    Next, create an Azure Data Factory pipeline to extract data from your data source (SharePoint in this case) and transform it into the format required by the Azure Cognitive Search index.

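    Then, create an Azure Cognitive Search indexer that reads the prepared data and populates the Azure Cognitive Search index. Note that an indexer connects to a data source definition rather than to a Data Factory pipeline directly, so the pipeline should write its output to a store the indexer supports (for example, Azure Blob Storage).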

    After that, configure the indexer to run on a recurring schedule that matches how often your content changes.

    Finally, test the indexer to ensure that it is working as expected.
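
As a rough illustration of the scheduling step above, here is a minimal Python sketch that builds the JSON body of an indexer definition with a recurring schedule. It only constructs the payload and does not call the service. The resource names (`enterprise-docs-indexer`, `enterprise-docs-datasource`, `enterprise-docs-index`) and the helper functions are hypothetical. The property names (`dataSourceName`, `targetIndexName`, `schedule.interval`) and the 5-minute to 1-day interval range are my understanding of the indexer REST API, so check them against the API version you use.

```python
import json
from datetime import timedelta

# Interval limits as documented for indexer schedules (assumption: verify for your API version).
MIN_INTERVAL = timedelta(minutes=5)
MAX_INTERVAL = timedelta(days=1)


def to_iso8601_duration(interval: timedelta) -> str:
    """Format a whole-minute timedelta as an ISO 8601 duration, e.g. 'PT2H'."""
    seconds = interval.total_seconds()
    if seconds <= 0 or seconds % 60 != 0:
        raise ValueError("interval must be a positive whole number of minutes")
    total_minutes = int(seconds // 60)
    if total_minutes == 24 * 60:
        return "P1D"
    hours, minutes = divmod(total_minutes, 60)
    text = "PT"
    if hours:
        text += f"{hours}H"
    if minutes:
        text += f"{minutes}M"
    return text


def build_indexer_definition(name, data_source_name, target_index_name, interval):
    """Build the JSON body for creating or updating a scheduled indexer."""
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError("interval must be between 5 minutes and 1 day")
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
        "schedule": {"interval": to_iso8601_duration(interval)},
    }


# Hypothetical names, for illustration only.
definition = build_indexer_definition(
    name="enterprise-docs-indexer",
    data_source_name="enterprise-docs-datasource",
    target_index_name="enterprise-docs-index",
    interval=timedelta(hours=2),
)
print(json.dumps(definition, indent=2))
```

The resulting body would be sent when creating or updating the indexer. Since the target index name does not change between runs, the querying application keeps pointing at the same index while its content is refreshed.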

    When using the indexer, there are several considerations to keep in mind. First, an indexer can only read from supported data sources, so confirm that yours is supported by Azure Cognitive Search. Second, the source data must be in a format whose fields can be mapped to the fields in your index. Finally, the indexer does not judge relevance: it indexes whatever the data source exposes, so make sure the source contains only the content you want returned by your vector search application.
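
To make the field-mapping consideration concrete, here is a small Python sketch that checks which source fields have a destination in the index and produces `fieldMappings` entries for the ones that need renaming. The field names (`DocumentId`, `Body`, `Author`, `id`, `content`, `contentVector`) and the helper `plan_field_mappings` are hypothetical. It assumes that identically named fields are matched implicitly and that `sourceFieldName` / `targetFieldName` are the mapping property names, which is how I understand indexer definitions to work.

```python
def plan_field_mappings(source_fields, index_fields, renames=None):
    """Return (field_mappings, unmapped_source_fields).

    A mapping entry is emitted only when the source and target names differ;
    source fields with no target in the index are reported as unmapped.
    """
    renames = renames or {}
    index_fields = set(index_fields)
    mappings = []
    unmapped = []
    for source in source_fields:
        target = renames.get(source, source)
        if target not in index_fields:
            unmapped.append(source)
        elif target != source:
            mappings.append(
                {"sourceFieldName": source, "targetFieldName": target}
            )
    return mappings, unmapped


# Hypothetical source and index schemas, for illustration only.
mappings, unmapped = plan_field_mappings(
    source_fields=["DocumentId", "Body", "Author"],
    index_fields=["id", "content", "contentVector"],
    renames={"DocumentId": "id", "Body": "content"},
)
print(mappings)
print(unmapped)
```

Running a check like this before creating the indexer makes it easier to spot source fields that would otherwise be left out of the index.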
