In Azure AI Search, a vector store has an index schema that defines vector and nonvector fields, a vector configuration for algorithms that create the embedding space, and settings on vector field definitions that are used in query requests. The Create or Update Index API creates the vector store.
Follow these steps to index vector data:
Define a schema with vector algorithms for indexing and search
This article explains the workflow and uses REST to illustrate each step. Each recent version of the REST API adds new functionality. Once you understand the basic workflow and what each API version provides, continue with the Azure SDK code samples in the azure-search-vector-samples repository for guidance on using these features in test and production code.
Azure AI Search, in any region and on any tier. Most existing services support vector search. For services created before January 2019, there's a small subset that can't create a vector index. In this situation, a new service must be created.
Pre-existing vector embeddings in your source documents if you're using the generally available version of the Azure SDKs and REST APIs. For more information, see Generate embeddings. An alternative is integrated vectorization (preview).
You should know the dimensions limit of the model used to create the embeddings and how similarity is computed. In Azure OpenAI, for text-embedding-ada-002, the length of the numerical vector is 1536. Similarity is computed using cosine. Valid values are 2 through 3072 dimensions.
You should be familiar with creating an index. The schema must include a field for the document key, other fields you want to search or filter, and other configurations for behaviors needed during indexing and queries.
Prepare documents for indexing
Prior to indexing, assemble a document payload that includes fields of vector and nonvector data. The document structure must conform to the index schema.
Make sure your documents:
Provide a field or a metadata property that uniquely identifies each document. All search indexes require a document key. To satisfy document key requirements, a source document must have one field or property that can uniquely identify it in the index. This source field must be mapped to an index field of type Edm.String and key=true in the search index.
Provide vector data (an array of single-precision floating point numbers) in source fields.
Vector fields contain numeric data generated by embedding models, one embedding per field. We recommend the embedding models in Azure OpenAI, such as text-embedding-ada-002 for text documents or the Image Retrieval REST API for images. Only top-level vector fields are supported in an index. Vector subfields aren't currently supported.
Provide other fields with human-readable alphanumeric content for the query response, and for hybrid query scenarios that include full text search or semantic ranking in the same request.
Your search index should include fields and content for all of the query scenarios you want to support. Suppose you want to search or filter over product names, versions, metadata, or addresses. In this case, similarity search isn't especially helpful. Keyword search, geo-search, or filters would be a better choice. A search index that includes a comprehensive field collection of vector and nonvector data provides maximum flexibility for query construction and response composition.
A short example of a documents payload that includes vector and nonvector fields is in the load vector data section of this article.
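The document requirements above can be checked before you send anything to the service. The following is a minimal sketch, not part of any Azure SDK: the function name, the `key_field` argument, and the `vector_dims` mapping are hypothetical and should mirror your own index schema.

```python
from numbers import Real


def validate_document(doc, key_field, vector_dims):
    """Return a list of problems found in one document; an empty list means it passed.

    key_field and vector_dims (field name -> expected dimensions) are
    hypothetical inputs that mirror your own index schema.
    """
    problems = []

    # The document key must be a unique string; uniqueness across documents
    # is not checked here, only the type.
    key = doc.get(key_field)
    if not isinstance(key, str) or not key:
        problems.append(f"'{key_field}' must be a non-empty string key")

    for field, dims in vector_dims.items():
        vector = doc.get(field)
        if not isinstance(vector, list):
            problems.append(f"'{field}' must be an array of numbers")
            continue
        if len(vector) != dims:
            problems.append(f"'{field}' has {len(vector)} dimensions, expected {dims}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(v, Real) and not isinstance(v, bool) for v in vector):
            problems.append(f"'{field}' contains non-numeric values")
    return problems


# Tiny 3-dimensional vectors keep the example readable; real embeddings
# from text-embedding-ada-002 have 1536 values.
sample = {"id": "1", "title": "Azure App Service", "titleVector": [0.1, -0.2, 0.3]}
print(validate_document(sample, "id", {"titleVector": 3}))  # prints []
```

A check like this catches dimension mismatches early, before the service rejects the document at indexing time.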
Add a vector search configuration
A vector configuration specifies the vector search algorithm and parameters used during indexing to create "nearest neighbor" information among the vector nodes:
Hierarchical Navigable Small World (HNSW)
Exhaustive KNN
If you choose HNSW on a field, you can opt in for exhaustive KNN at query time. But the other direction doesn’t work: if you choose exhaustive, you can’t later request HNSW search because the extra data structures that enable approximate search don’t exist.
Looking for preview-to-stable version migration guidance? See Upgrade REST APIs for steps.
Name of the configuration. The name must be unique within the index.
profiles add a layer of abstraction for accommodating richer definitions. A profile is defined in vectorSearch, and then referenced by name on each vector field.
"hnsw" and "exhaustiveKnn" are the nearest neighbor algorithms used to organize vector content during indexing. HNSW is an Approximate Nearest Neighbors (ANN) algorithm, while exhaustive KNN performs a brute-force search over all vectors.
"m" (bi-directional link count) default is 4. The range is 4 to 10. Lower values should return less noise in the results.
"efConstruction" default is 400. The range is 100 to 1,000. It's the number of nearest neighbors used during indexing.
"efSearch" default is 500. The range is 100 to 1,000. It's the number of nearest neighbors used during search.
"metric" should be "cosine" if you're using Azure OpenAI. Otherwise, use the similarity metric associated with the embedding model you're using. Supported values are cosine, dotProduct, and euclidean.
2024-05-01-Preview is the newest version. It adds more encoding options, but the vector search configuration (vectorSearch structure) is mostly identical to 2024-03-01-preview.
Adds hamming distance as a metric for nearest neighbor search over binary data. For more information, see Index binary data for vector search.
Expands integrated vectorization with more embedding model choices. To benefit from this capability, you must take a dependency on an indexer and skillset. See Load vector data and the Pull APIs section for a list of the new embedding skills.
Add a vectorSearch section in the index that specifies compression settings and the search algorithms used to create the embedding space. For more information, see Configure vector quantization and reduced storage.
vectorSearch.compressions.kind must be scalarQuantization.
rerankWithOriginalVectors uses the original, uncompressed vectors to recalculate similarity and rerank the top results returned by the initial search query. The uncompressed vectors exist in the search index even if stored is false. This property is optional. Default is true.
defaultOversampling considers a broader set of potential results to offset the reduction in information from quantization. The formula for potential results consists of the k in the query, with an oversampling multiplier. For example, if the query specifies a k of 5, and oversampling is 20, then the query effectively requests 100 documents for use in reranking, using the original uncompressed vector for that purpose. Only the top k reranked results are returned. This property is optional. Default is 4.
quantizedDataType must be set to int8. This is the only primitive data type supported at this time. This property is optional. Default is int8.
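The oversampling arithmetic described above can be sketched as follows. The compression name is hypothetical, and nesting quantizedDataType under `scalarQuantizationParameters` reflects the preview REST shape as understood here, so verify it against the API reference for your version. Truncating fractional products is an assumption of this sketch, not documented service behavior.

```python
def rerank_candidates(k, oversampling=4.0):
    """Candidates considered before reranking: k times the oversampling multiplier."""
    return int(k * oversampling)


# One entry of vectorSearch.compressions, written as a Python dict.
compression = {
    "name": "my-scalar-quantization",  # hypothetical name, unique within the index
    "kind": "scalarQuantization",
    "rerankWithOriginalVectors": True,
    "defaultOversampling": 20.0,
    "scalarQuantizationParameters": {"quantizedDataType": "int8"},
}

# k of 5 with an oversampling of 20 considers 100 candidates,
# and only the top 5 reranked results are returned.
print(rerank_candidates(5, compression["defaultOversampling"]))  # prints 100
```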
Name of the configuration. The name must be unique within the index.
profiles are new in this preview. They add a layer of abstraction for accommodating richer definitions. A profile is defined in vectorSearch, and then referenced by name as a property on each vector field.
hnsw and exhaustiveKnn are the nearest neighbor algorithms used to organize vector content during indexing. HNSW is an Approximate Nearest Neighbors (ANN) algorithm, while exhaustive KNN performs a brute-force search over all vectors.
m (bi-directional link count) default is 4. The range is 4 to 10. Lower values should return less noise in the results.
efConstruction default is 400. The range is 100 to 1,000. It's the number of nearest neighbors used during indexing.
efSearch default is 500. The range is 100 to 1,000. It's the number of nearest neighbors used during search.
metric should be "cosine" if you're using Azure OpenAI. Otherwise, use the similarity metric associated with the embedding model you're using. Supported values are cosine, dotProduct, and euclidean.
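As an illustration of how these properties fit together, the following sketch builds a vectorSearch section with one HNSW configuration, one exhaustive KNN configuration, and a profile, and enforces the ranges listed above. The configuration and profile names are hypothetical. The `algorithms`, `hnswParameters`, and `exhaustiveKnnParameters` property names follow the REST shape as understood here, so check them against the API version you target.

```python
import json

# Ranges and metrics quoted in the parameter list above.
HNSW_RANGES = {"m": (4, 10), "efConstruction": (100, 1000), "efSearch": (100, 1000)}
METRICS = {"cosine", "dotProduct", "euclidean"}


def hnsw_algorithm(name, m=4, ef_construction=400, ef_search=500, metric="cosine"):
    """Build one HNSW algorithm entry, rejecting out-of-range parameters."""
    params = {"m": m, "efConstruction": ef_construction, "efSearch": ef_search}
    for key, value in params.items():
        low, high = HNSW_RANGES[key]
        if not low <= value <= high:
            raise ValueError(f"{key}={value} is outside {low}..{high}")
    if metric not in METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    return {"name": name, "kind": "hnsw", "hnswParameters": {**params, "metric": metric}}


vector_search = {
    "algorithms": [
        hnsw_algorithm("my-hnsw-config"),
        {
            "name": "my-eknn-config",
            "kind": "exhaustiveKnn",
            "exhaustiveKnnParameters": {"metric": "cosine"},
        },
    ],
    # A profile points at an algorithm by name; vector fields then point at the profile.
    "profiles": [{"name": "my-vector-profile", "algorithm": "my-hnsw-config"}],
}

print(json.dumps(vector_search, indent=2))
```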
Important
2023-07-01-Preview was the first REST API version to support vectors. It uses outmoded structures that have been replaced in newer previews. We recommend migrating to a newer REST API.
This preview added:
vectorSearch.algorithmConfigurations for specifying the HNSW algorithm.
hnsw nearest neighbor algorithm for indexing vector content.
Name of the configuration. The name must be unique within the index.
hnsw is the Approximate Nearest Neighbors (ANN) algorithm used to create the proximity graph during indexing. Only Hierarchical Navigable Small World (HNSW) is supported in this API version.
m (bi-directional link count) default is 4. The range is 4 to 10. Lower values should return less noise in the results.
efConstruction default is 400. The range is 100 to 1,000. It's the number of nearest neighbors used during indexing.
efSearch default is 500. The range is 100 to 1,000. It's the number of nearest neighbors used during search.
metric should be "cosine" if you're using Azure OpenAI. Otherwise, use the similarity metric associated with the embedding model you're using. Supported values are cosine, dotProduct, and euclidean.
Add a vector field to the fields collection
The fields collection must include a field for the document key, vector fields, and any other fields that you need for hybrid search scenarios.
Vector fields are characterized by their data type, a dimensions property based on the embedding model used to output the vectors, and a vector profile.
Define a vector field with the following attributes. You can store one generated embedding per field. For each vector field:
type must be Collection(Edm.Single) in this API version.
dimensions is the number of dimensions generated by the embedding model. For text-embedding-ada-002, it's 1536.
vectorSearchProfile is the name of a profile defined elsewhere in the index.
searchable must be true.
retrievable can be true or false. True returns the raw vector (1,536 values for text-embedding-ada-002) as plain text and consumes storage space. Set to true if you're passing a vector result to a downstream app.
filterable, facetable, sortable must be false.
Add filterable nonvector fields to the collection, such as "title" with filterable set to true, if you want to invoke prefiltering or postfiltering on the vector query.
Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
You should also add fields that are useful in the query or in its response. The following example shows vector fields ("titleVector", "contentVector") that hold the embeddings for the title and content. It also provides fields for the equivalent textual content ("title", "content"), which are useful for sorting, filtering, and reading in a search result.
The following example shows the fields collection:
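A minimal sketch of such a fields collection, written as Python that serializes to the JSON you would place in the index definition. The helper name, the profile name "my-vector-profile", and the attribute choices on the nonvector fields are assumptions made for illustration.

```python
import json

EMBEDDING_DIMENSIONS = 1536  # text-embedding-ada-002


def vector_field(name, profile, dimensions=EMBEDDING_DIMENSIONS, retrievable=True):
    """Build one vector field definition.

    filterable, facetable, and sortable are left unset because they
    must not be true on vector fields.
    """
    return {
        "name": name,
        "type": "Collection(Edm.Single)",
        "searchable": True,
        "retrievable": retrievable,
        "dimensions": dimensions,
        "vectorSearchProfile": profile,
    }


fields = [
    # Document key: a string field with key set to true.
    {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
    # Human-readable fields for hybrid queries, filters, and results.
    {"name": "title", "type": "Edm.String", "searchable": True,
     "filterable": True, "sortable": True, "retrievable": True},
    {"name": "content", "type": "Edm.String", "searchable": True, "retrievable": True},
    vector_field("titleVector", "my-vector-profile"),
    vector_field("contentVector", "my-vector-profile"),
]

print(json.dumps(fields, indent=2))
```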
Vector field definitions are the same as 2024-03-01-preview, with the exception of a new binary data type. For more information, see Index binary data for vector search.
Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
type can be Collection(Edm.Single), Collection(Edm.Half), Collection(Edm.Int16), or Collection(Edm.SByte).
dimensions is the number of dimensions generated by the embedding model. For text-embedding-ada-002, it's 1536.
vectorSearchProfile is the name of a profile defined elsewhere in the index.
searchable must be true.
retrievable can be true or false. True returns the raw vector (1,536 values for text-embedding-ada-002) as plain text and consumes storage space. Set to true if you're passing a vector result to a downstream app. False is required if stored is false.
stored is a new boolean property that applies to vector fields only. True stores a copy of vectors returned in search results. False discards that copy during indexing. You can search on vectors, but can't return vectors in results.
filterable, facetable, sortable must be false.
The following example shows the fields collection:
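A hedged sketch of a vector field definition that uses the preview-only options described above (a narrow data type and stored set to false). The helper and the profile name are hypothetical, and the checks encode only the rules stated in this section.

```python
ALLOWED_TYPES = {
    "Collection(Edm.Single)",
    "Collection(Edm.Half)",
    "Collection(Edm.Int16)",
    "Collection(Edm.SByte)",
}


def preview_vector_field(name, profile, dimensions,
                         data_type="Collection(Edm.Single)",
                         stored=True, retrievable=False):
    """Build a vector field definition with the stored property."""
    if data_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported vector type: {data_type}")
    # Without the stored copy there is nothing to return in results.
    if retrievable and not stored:
        raise ValueError("retrievable must be false when stored is false")
    return {
        "name": name,
        "type": data_type,
        "searchable": True,
        "retrievable": retrievable,
        "stored": stored,
        "dimensions": dimensions,
        "vectorSearchProfile": profile,
    }


# Searchable but never returned: a half-precision field without the stored copy.
compact_field = preview_vector_field(
    "contentVector", "my-vector-profile", 1536,
    data_type="Collection(Edm.Half)", stored=False,
)
print(compact_field)
```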
In the following REST API example, "title" and "content" contain textual content used in full text search and semantic ranking, while "titleVector" and "contentVector" contain vector data. In this API version, you can use indexers and a skillset to populate vector fields using integrated vectorization. The index definition doesn't change, but you can add indexers and skills to your solution to populate the fields.
Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
type must be Collection(Edm.Single).
dimensions is the number of dimensions generated by the embedding model. For text-embedding-ada-002, it's 1536.
vectorSearchProfile is the name of a profile defined elsewhere in the index.
searchable must be true.
retrievable can be true or false. True returns the raw vector (1,536 values for text-embedding-ada-002) as plain text and consumes storage space. Set to true if you're passing a vector result to a downstream app.
filterable, facetable, sortable must be false.
Add filterable nonvector fields to the collection, such as "title" with filterable set to true, if you want to invoke prefiltering or postfiltering on the vector query.
Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
You should also add fields that are useful in the query or in its response. The following example shows vector fields ("titleVector", "contentVector") that hold the embeddings for the title and content. It also provides fields for the equivalent textual content ("title", "content"), which are useful for sorting, filtering, and reading in a search result.
The following example shows the fields collection:
The vector field definitions for this version are obsolete in later versions. We recommend migrating to a newer REST API.
2023-07-01-Preview was the first REST API version to support vector scenarios.
In the following REST API example, "title" and "content" contain textual content used in full text search and semantic ranking, while "titleVector" and "contentVector" contain vector data that was generated externally.
Add vector fields to the fields collection. You can store one generated embedding per document field. For each vector field:
Assign the Collection(Edm.Single) data type.
Provide the name of the vector search algorithm configuration.
Provide the number of dimensions generated by the embedding model.
Set attributes:
"searchable" must be "true".
"retrievable" set to "true" allows you to display the raw vectors (for example, as a verification step), but doing so increases storage. Set to "false" if you don't need to return raw vectors. You don't need to return vectors for a query, but if you're passing a vector result to a downstream app then set "retrievable" to "true".
"filterable", "facetable", "sortable" attributes must be "false". Don't set them to "true" because those behaviors don't apply within the context of vector fields and the request will fail.
Add other fields that define the substance and structure of the textual content you're indexing. At a minimum, you need a document key.
You should also add fields that are useful in the query or in its response. The following example shows vector fields ("titleVector", "contentVector") that hold the embeddings for the title and content. It also provides fields for the equivalent textual content ("title", "content"), which are useful for sorting, filtering, and reading in a search result.
An index definition with the described elements looks like this:
Content that you provide for indexing must conform to the index schema and include a unique string value for the document key. Prevectorized data is loaded into one or more vector fields, which can coexist with other fields containing alphanumeric content.
Use Documents - Index to load vector and nonvector data into an index. The push APIs for indexing are identical across all stable and preview versions. Use any of the following APIs to load documents:
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/index?api-version=2023-11-01
Content-Type: application/json
api-key: {{admin-api-key}}

{
    "value": [
        {
            "id": "1",
            "title": "Azure App Service",
            "content": "Azure App Service is a fully managed platform for building, deploying, and scaling web apps. You can host web apps, mobile app backends, and RESTful APIs. It supports a variety of programming languages and frameworks, such as .NET, Java, Node.js, Python, and PHP. The service offers built-in auto-scaling and load balancing capabilities. It also provides integration with other Azure services, such as Azure DevOps, GitHub, and Bitbucket.",
            "category": "Web",
            "titleVector": [
                -0.02250031754374504,
                . . .
            ],
            "contentVector": [
                -0.024740582332015038,
                . . .
            ],
            "@search.action": "upload"
        },
        {
            "id": "2",
            "title": "Azure Functions",
            "content": "Azure Functions is a serverless compute service that enables you to run code on-demand without having to manage infrastructure. It allows you to build and deploy event-driven applications that automatically scale with your workload. Functions support various languages, including C#, F#, Node.js, Python, and Java. It offers a variety of triggers and bindings to integrate with other Azure services and external services. You only pay for the compute time you consume.",
            "category": "Compute",
            "titleVector": [
                -0.020159931853413582,
                . . .
            ],
            "contentVector": [
                -0.02780858241021633,
                . . .
            ],
            "@search.action": "upload"
        }
        . . .
    ]
}
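The same push request can be assembled from code. The following sketch uses only the Python standard library and builds the request without sending it, so it runs offline. The service name, index name, key placeholder, and the helper itself are hypothetical, and the 3-value vector stands in for a real embedding.

```python
import json
import urllib.request


def build_upload_request(service_name, index_name, admin_key, documents,
                         api_version="2023-11-01"):
    """Build a Documents - Index request that uploads each document."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/index?api-version={api_version}"
    )
    # Each document carries its own action; "upload" inserts or replaces it.
    body = {"value": [{**doc, "@search.action": "upload"} for doc in documents]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": admin_key},
        method="POST",
    )


request = build_upload_request(
    "my-service", "my-index", "<admin-api-key>",
    [{"id": "1", "title": "Azure App Service", "titleVector": [0.1, -0.2, 0.3]}],
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```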
All of the newer preview releases use pull APIs (indexers and skillsets) for integrated vectorization during indexing and query time.
Indexers can retrieve and index vector fields in source documents, assuming an index schema that meets vector field requirements and the preview REST API. Data sources provide the vectors in whatever format the data source supports (such as strings in JSON). The indexer assumes that fields typed as Collection(Edm.Single) contain vectors and indexes that content as vector data.
There are no changes to field mapping behavior or change detection for vectors. The behaviors for text indexing also apply to vectors.
If vector data is sourced in files, we recommend a nondefault parsingMode such as json, jsonLines, or csv based on the shape of the data.
Azure SQL doesn't provide a way to store a collection natively as a single SQL column. A workaround hasn't been identified at this time.
The dimensions of all vectors from the data source must be the same and match their index definition for the field they're mapping to. The indexer throws an error on any documents that don’t match.
Skills and vectorizers are used to generate embeddings. For vectorization during indexing, choose from the following skills:
For validation purposes, you can query the index using Search Explorer in Azure portal or a REST API call. Because Azure AI Search can't convert a vector to human-readable text, try to return fields from the same document that provide evidence of the match. For example, if the vector query targets the "titleVector" field, you could select "title" for the search results.
Fields must be attributed as "retrievable" to be included in the results.
Use the default Query view for a quick confirmation that the index contains vectors. The query view is for full text search. Although you can't use it for vector queries, you can send an empty search (search=*) to check for content. The content of all fields, including vector fields, is returned as plain text.
The following REST API example is a vector query, but it returns only nonvector fields (title, content, category). Only fields marked as "retrievable" can be returned in search results.
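A minimal sketch of such a query body, assuming the 2023-11-01 `vectorQueries` request shape. The helper name is hypothetical, and the 3-value vector stands in for a real query embedding produced by the same model used at indexing time.

```python
import json


def build_vector_query(vector, field, k=5, select=("title", "content", "category")):
    """Build a search request body that targets one vector field.

    Only the nonvector fields named in select are returned, so the
    raw vectors stay out of the response.
    """
    return {
        "count": True,
        "select": ", ".join(select),
        "vectorQueries": [
            {"kind": "vector", "vector": list(vector), "fields": field, "k": k}
        ],
    }


query_body = build_vector_query([0.1, -0.2, 0.3], "contentVector")
print(json.dumps(query_body, indent=2))
```

The body would be sent as a POST to the index's docs/search endpoint with the same api-key header used for indexing.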
To update a vector store, modify the schema and, if necessary, reload documents to populate new fields. APIs for schema updates include Create or Update Index (REST), CreateOrUpdateIndex in the Azure SDK for .NET, create_or_update_index in the Azure SDK for Python, and similar methods in other Azure SDKs.
Drop and rebuild is often required for updates to and deletion of existing fields.
However, you can update an existing schema with the following modifications, with no rebuild required:
Add new fields to a fields collection.
Add new vector configurations, assigned to new fields but not existing fields that have already been vectorized.
Change "retrievable" (values are true or false) on an existing field. Vector fields must be searchable, and retrievable can be either value, so if you want to disable access to a vector field in situations where drop and rebuild isn't feasible, you can set retrievable to false.
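The no-rebuild rules above can be approximated with a small local check before you submit a schema update. This is a deliberately conservative sketch that compares only the fields collections, and the function names are hypothetical. The service remains the authority on whether an update is accepted.

```python
def _without_retrievable(field):
    """Copy of a field definition minus the one attribute that may change."""
    return {k: v for k, v in field.items() if k != "retrievable"}


def may_need_rebuild(old_fields, new_fields):
    """Return True if the change goes beyond the additive updates listed above."""
    old = {f["name"]: f for f in old_fields}
    new = {f["name"]: f for f in new_fields}

    # Deleting an existing field is not on the no-rebuild list.
    if set(old) - set(new):
        return True

    # Existing fields may differ only in 'retrievable'; new fields are allowed.
    return any(
        _without_retrievable(before) != _without_retrievable(new[name])
        for name, before in old.items()
    )


current = [
    {"name": "id", "type": "Edm.String", "key": True},
    {"name": "contentVector", "type": "Collection(Edm.Single)",
     "dimensions": 1536, "retrievable": True},
]
# Adding a field and turning off retrievable are both additive changes.
proposed = [
    current[0],
    {**current[1], "retrievable": False},
    {"name": "category", "type": "Edm.String", "filterable": True},
]
print(may_need_rebuild(current, proposed))  # prints False
```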
Code samples in the azure-search-vector-samples repository demonstrate end-to-end workflows that include schema definition, vectorization, indexing, and queries.