Load data into a search index in Azure AI Search

This article explains how to import, refresh, and manage content in a predefined search index. In Azure AI Search, a search index is created first, with data import following as a second step. The exceptions are the Import data wizards and indexer pipelines, which create and load an index in one workflow.

A search service imports and indexes text and vectors as JSON documents for use in full text search, vector search, hybrid search, and knowledge mining scenarios. Text content can come from alphanumeric fields in the external data source, metadata that's useful in search scenarios, or enriched content created by a skillset (skills can extract or infer textual descriptions from images and unstructured content). Vector content is generated by an external embedding model or by integrated vectorization (preview).

Once data is indexed, the physical data structures of the index are locked in. For guidance on what can and can't be changed, see Drop and rebuild an index.

Indexing isn't a background process. A search service will balance indexing and query workloads, but if query latency is too high, you can either add capacity or identify periods of low query activity for loading an index.

Load documents

A search service accepts JSON documents that conform to the index schema.

You can prepare these documents yourself, but if content resides in a supported data source, running an indexer or the Import data wizard can automate document retrieval, JSON serialization, and indexing.
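If you prepare the documents yourself, the request body is a JSON envelope with a `value` array, where each entry carries an `@search.action` alongside the document fields. The following is a minimal sketch of building that envelope in Python. The helper name `build_index_batch` and the sample hotel fields are illustrative assumptions, and the field names in your own documents must match your index schema.

```python
import json

# Actions accepted in the "@search.action" property of each document.
ALLOWED_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}


def build_index_batch(documents, action="mergeOrUpload"):
    """Wrap plain dicts in the {"value": [...]} envelope (hypothetical helper)."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"Unsupported action: {action}")
    return {"value": [{"@search.action": action, **doc} for doc in documents]}


# Illustrative documents only; real field names come from your index schema.
hotels = [
    {"HotelId": "1111", "HotelName": "Example Hotel", "Rating": 4.2},
    {"HotelId": "2222", "HotelName": "Another Hotel", "Rating": 3.9},
]

# Serialized body, ready to POST to .../indexes/<index name>/docs/index
body = json.dumps(build_index_batch(hotels))
```

The sketch stops at serialization so that it runs without a search service. Sending `body` requires the same endpoint, `Content-Type`, and `api-key` headers shown in the delete request later in this article.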

In the Azure portal, use the Import data wizards to create and load indexes in a single workflow. The wizards create a new index, so if you want to load an existing index, choose an alternative approach.

  1. Sign in to the Azure portal with your Azure account.

  2. Find your search service and on the Overview page, select Import data or Import and vectorize data on the command bar to create and populate a search index. You can follow these links to review the workflow: Quickstart: Create an Azure AI Search index and Quickstart: Integrated vectorization (preview).

    Screenshot of the Import data command

If indexers are already defined, you can reset and run an indexer from the Azure portal, which is useful if you're adding fields incrementally. Reset forces the indexer to start over, picking up all fields from all source documents.
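Reset and run are also available outside the portal as two POST requests against the indexer. The following sketch builds those requests with Python's standard library. The function name `indexer_request` and the service and indexer names are placeholders, and the API version is assumed to match the one used elsewhere in this article.

```python
import urllib.request

API_VERSION = "2023-11-01"  # assumed to match the version used in this article


def indexer_request(service, indexer, operation, admin_key):
    """Build (but don't send) a reset or run request for an indexer."""
    if operation not in ("reset", "run"):
        raise ValueError("operation must be 'reset' or 'run'")
    url = (
        f"https://{service}.search.windows.net/indexers/{indexer}/"
        f"{operation}?api-version={API_VERSION}"
    )
    # Both operations are POST requests with an empty body.
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"api-key": admin_key}
    )


def reset_and_run(service, indexer, admin_key):
    """Reset the indexer, then start a new run. Makes network calls."""
    for operation in ("reset", "run"):
        request = indexer_request(service, indexer, operation, admin_key)
        with urllib.request.urlopen(request) as response:
            print(operation, response.status)
```

Calling `reset_and_run("my-service", "my-indexer", "<admin key>")` would send both requests in order. Reset comes first so that the following run starts over instead of picking up from the last high-water mark.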

Delete orphan documents

Azure AI Search supports document-level operations so that you can look up, update, and delete a specific document in isolation. The following example shows how to delete a document. In a search service, documents are unrelated, so deleting one has no impact on the rest of the index.

  1. Identify which field is the document key. In the portal, you can view the fields of each index. Document keys are string fields and are denoted with a key icon to make them easier to spot.

  2. Check the values of the document key field: search=*&$select=HotelId. A simple string is straightforward, but if the index uses a base-64 encoded field, or if search documents were generated from a parsingMode setting, you might be working with values that you aren't familiar with.

  3. Look up the document to verify the value of the document ID and to review its content before deleting it. Specify the key or document ID in the request. The following examples illustrate a simple string for the Hotels sample index and a base-64 encoded string for the metadata_storage_path key of the cog-search-demo index.

    GET https://[service name].search.windows.net/indexes/hotels-sample-index/docs/1111?api-version=2023-11-01
    
    GET https://[service name].search.windows.net/indexes/cog-search-demo/docs/aHR0cHM6Ly9oZWlkaWJsb2JzdG9yYWdlMi5ibG9iLmNvcmUud2luZG93cy5uZXQvY29nLXNlYXJjaC1kZW1vL2d1dGhyaWUuanBn0?api-version=2023-11-01
    
  4. Delete the document to remove it from the search index.

    POST https://[service name].search.windows.net/indexes/hotels-sample-index/docs/index?api-version=2023-11-01
    Content-Type: application/json
    api-key: [admin key]

    {
      "value": [
        {
          "@search.action": "delete",
          "HotelId": "1111"
        }
      ]
    }
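The lookup and delete steps above can be sketched in Python as follows. The helper names are hypothetical, and nothing here calls the service. `decode_url_token` assumes the key was produced by the indexer `base64Encode` mapping function with its URL-token variant, where the trailing digit records how many `=` padding characters were dropped. If your keys were encoded differently, the decoding step doesn't apply.

```python
import base64
from urllib.parse import quote

API_VERSION = "2023-11-01"  # assumed to match the requests above


def decode_url_token(key):
    """Decode a URL-token Base64 key (assumption: last char is the padding count)."""
    padding = int(key[-1])
    return base64.urlsafe_b64decode(key[:-1] + "=" * padding).decode("utf-8")


def lookup_url(service, index, key):
    """URL for retrieving one document by its key."""
    return (
        f"https://{service}.search.windows.net/indexes/{index}/docs/"
        f"{quote(key, safe='')}?api-version={API_VERSION}"
    )


def delete_payload(key_field, keys):
    """Request body that deletes the documents with the given key values."""
    return {"value": [{"@search.action": "delete", key_field: k} for k in keys]}


# The encoded key from the cog-search-demo lookup example.
encoded = (
    "aHR0cHM6Ly9oZWlkaWJsb2JzdG9yYWdlMi5ibG9iLmNvcmUud2luZG93cy5uZXQv"
    "Y29nLXNlYXJjaC1kZW1vL2d1dGhyaWUuanBn0"
)
print(decode_url_token(encoded))  # shows the blob URL behind the opaque key
print(lookup_url("my-service", "hotels-sample-index", "1111"))
print(delete_payload("HotelId", ["1111"]))
```

Decoding an opaque key before deleting is a quick way to confirm that the key points at the source document you expect. The delete payload names the key field explicitly because the property in the request body must be the index's own key field.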
    

See also