Azure OpenAI Studio data configuration is creating an extra index in Azure AI Search

lakshmi 631 Reputation points
2024-05-16T17:15:22.42+00:00

Hi team,

We are using Azure OpenAI Studio to configure private data stored in Azure Blob Storage. Once the configuration is done, it creates an extra index and indexer in Azure AI Search.

The index name I provided is 'openai', but one more index got created with a different name format.


While configuring, I selected a daily schedule. In parallel, an indexer was also created, and a duplicate of that indexer appeared as well.


How to identify the correct indexer from this?

Azure AI Search
Azure OpenAI Service

Accepted answer
Charlie Wei 3,305 Reputation points
    2024-05-17T07:02:59.2966667+00:00

    Hello lakshmi,

    Both indexers are required. The openai-indexer-chunk divides the data into small chunks with a maximum of 1,024 tokens, and the other openai-indexer writes the preprocessed data into Azure AI Search.
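To tell the studio-created assets apart programmatically, you can fetch the indexer names from your search service and pair each base indexer with its chunk companion. The sketch below is illustrative only: the helper function is hypothetical, and the assumption that the auxiliary indexer's name appends `-chunk` to the base name comes from the names seen in this thread, not from a documented naming contract.

```python
def group_ingestion_indexers(indexer_names):
    """Map each base indexer name to its companion '-chunk' indexer, if any.

    Assumption: the chunking indexer is named <base>-chunk, matching the
    'openai-indexer' / 'openai-indexer-chunk' pair in this thread.
    """
    names = set(indexer_names)
    groups = {}
    for name in sorted(names):
        if name.endswith("-chunk"):
            continue  # handled as the companion of its base indexer
        chunk = f"{name}-chunk"
        groups[name] = chunk if chunk in names else None
    return groups


# Example with the names from this thread:
print(group_ingestion_indexers(["openai-indexer", "openai-indexer-chunk"]))
# {'openai-indexer': 'openai-indexer-chunk'}
```

In practice you would feed this the output of `SearchIndexerClient.get_indexer_names()` from the `azure-search-documents` SDK; any indexer that ends up grouped with a `-chunk` sibling belongs to the same ingestion job and both should be kept.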

    This is mentioned in the Microsoft Learn document.

    Data is ingested into Azure AI search using the following process:

    1. Ingestion assets are created in the Azure AI Search resource and the Azure Storage account. Currently these assets are: indexers, indexes, data sources, a custom skill in the search resource, and a container (later called the chunks container) in the Azure Storage account. You can specify the input Azure Storage container using Azure OpenAI Studio or the ingestion API (preview).
    2. Data is read from the input container, contents are opened and chunked into small chunks with a maximum of 1,024 tokens each. If vector search is enabled, the service calculates the vector representing the embeddings on each chunk. The output of this step (called the "preprocessed" or "chunked" data) is stored in the chunks container created in the previous step.
    3. The preprocessed data is loaded from the chunks container, and indexed in the Azure AI Search index.
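    Step 2 above can be sketched in a few lines. This is a simplified stand-in, not the service's actual implementation: the real pipeline uses a proper tokenizer and its own chunking settings, while this sketch treats whitespace-delimited words as "tokens" purely to illustrate the 1,024-token cap.

    ```python
    MAX_TOKENS = 1024  # per-chunk cap described in the ingestion docs

    def chunk_document(text, max_tokens=MAX_TOKENS):
        """Greedily pack whitespace-delimited 'tokens' into chunks of at most max_tokens."""
        tokens = text.split()
        chunks = []
        for start in range(0, len(tokens), max_tokens):
            chunks.append(" ".join(tokens[start:start + max_tokens]))
        return chunks

    doc = "word " * 2500          # a toy document of 2,500 "tokens"
    chunks = chunk_document(doc)
    print(len(chunks))            # 3  (1024 + 1024 + 452)
    print(all(len(c.split()) <= 1024 for c in chunks))  # True
    ```

    Each resulting chunk is what lands in the chunks container; if vector search is enabled, an embedding is computed per chunk before the final index load in step 3.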

    Best regards,
    Charlie


    If you find my response helpful, please consider accepting this answer and voting yes to support the community. Thank you!

