Note
This feature is currently in public preview. This preview is provided without a service-level agreement and isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
In Azure AI Search, agentic retrieval is a new parallel query architecture that uses a chat completion model for query planning, generating subqueries that broaden the scope of what's searchable and relevant.
Queries are created internally. Certain aspects of those generated queries are determined by your search index. This article explains which index elements affect agentic retrieval. None of the required elements are new or specific to agentic retrieval, which means you can use an existing index if it meets the criteria identified in this article, even if it was created using earlier API versions.
In summary, the search index specified in the targetIndexes of an agent definition must have these elements (a minimal agent definition that points to such an index is sketched after the lists below):
- String fields attributed as searchable and retrievable
- A semantic configuration with a defaultSemanticConfiguration
- A vectorizer if you want to include vector queries in the pipeline

Optionally, the following index elements increase your opportunities for optimization:
- A scoringProfile with a defaultScoringProfile, for boosting relevance
- synonymMaps for terminology or jargon
- analyzers for linguistics rules or patterns (like whitespace preservation or special characters)
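Here's that sketch: a minimal agent definition whose targetIndexes references an index that meets these requirements. The agent name, index name, reranker threshold, and chat model deployment are placeholders, and the exact agent schema depends on the preview API version you use, so treat this as illustrative rather than definitive.

{
  "name": "earth-search-agent",
  "targetIndexes": [
    {
      "indexName": "earth_at_night",
      "defaultRerankerThreshold": 2.5
    }
  ],
  "models": [
    {
      "kind": "azureOpenAI",
      "azureOpenAIParameters": {
        "resourceUri": "https://YOUR-AOAI-RESOURCE.openai.azure.com",
        "deploymentId": "gpt-4o-mini",
        "modelName": "gpt-4o-mini"
      }
    }
  ]
}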
Example index definition
Here's an example index that works for agentic retrieval. It meets the criteria for required elements.
{
  "name": "earth_at_night",
  "fields": [
    {
      "name": "id", "type": "Edm.String",
      "searchable": true, "retrievable": true, "filterable": true, "sortable": true, "facetable": true,
      "key": true,
      "stored": true,
      "synonymMaps": []
    },
    {
      "name": "page_chunk", "type": "Edm.String",
      "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false,
      "analyzer": "en.microsoft",
      "stored": true,
      "synonymMaps": []
    },
    {
      "name": "page_chunk_text_3_large", "type": "Collection(Edm.Single)",
      "searchable": true, "retrievable": false, "filterable": false, "sortable": false, "facetable": false,
      "dimensions": 3072,
      "vectorSearchProfile": "hnsw_text_3_large",
      "stored": false,
      "synonymMaps": []
    },
    {
      "name": "page_number", "type": "Edm.Int32",
      "searchable": false, "retrievable": true, "filterable": true, "sortable": true, "facetable": true,
      "stored": true,
      "synonymMaps": []
    },
    {
      "name": "chapter_number", "type": "Edm.Int32",
      "searchable": false, "retrievable": true, "filterable": true, "sortable": true, "facetable": true,
      "stored": true,
      "synonymMaps": []
    }
  ],
  "scoringProfiles": [],
  "suggesters": [],
  "analyzers": [],
  "normalizers": [],
  "tokenizers": [],
  "tokenFilters": [],
  "charFilters": [],
  "similarity": {
    "@odata.type": "#Microsoft.Azure.Search.BM25Similarity"
  },
  "semantic": {
    "defaultConfiguration": "semantic_config",
    "configurations": [
      {
        "name": "semantic_config",
        "flightingOptIn": false,
        "prioritizedFields": {
          "prioritizedContentFields": [
            {
              "fieldName": "page_chunk"
            }
          ],
          "prioritizedKeywordsFields": []
        }
      }
    ]
  },
  "vectorSearch": {
    "algorithms": [
      {
        "name": "alg",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,
          "efConstruction": 400,
          "efSearch": 500
        }
      }
    ],
    "profiles": [
      {
        "name": "hnsw_text_3_large",
        "algorithm": "alg",
        "vectorizer": "azure_openai_text_3_large"
      }
    ],
    "vectorizers": [
      {
        "name": "azure_openai_text_3_large",
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "https://YOUR-AOAI-RESOURCE.openai.azure.com",
          "deploymentId": "text-embedding-3-large",
          "apiKey": "<redacted>",
          "modelName": "text-embedding-3-large"
        }
      }
    ],
    "compressions": []
  }
}
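If you're calling the REST APIs directly, an index definition like this is created or updated with a PUT request against the indexes collection. The service name, admin key, and API version below are placeholders; any API version that supports these elements works, because none of them are new to agentic retrieval.

PUT https://YOUR-SEARCH-SERVICE.search.windows.net/indexes/earth_at_night?api-version=2025-05-01-preview
Content-Type: application/json
api-key: YOUR-ADMIN-API-KEY

{
  "name": "earth_at_night",
  "fields": [ ... ],
  "semantic": { ... },
  "vectorSearch": { ... }
}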
Key points:
- In agentic retrieval, a large language model (LLM) is used twice. First, it's used to create a query plan. After the query plan is executed and search results are generated, those results are passed to the LLM again, this time as grounding data. LLMs consume and emit tokenized strings of human readable plain text content. For this reason, you must have searchable fields that provide plain text strings and are retrievable in the response.
- This index includes a vector field that's used at query time. You don't need the vector in results because it isn't human or LLM readable, but it does need to be searchable. Since you don't need vectors in the response, both retrievable and stored are false.
- The vectorizer defined in the vector search configuration is critical. It determines whether your vector field is used during query execution. The vectorizer encodes subqueries into vectors at query time for similarity search over the vectors. The vectorizer must be the same embedding model used to create the vectors in the index.
- All searchable fields are included in query execution. There's no support for a select statement that explicitly states which fields to query.
Add a semantic configuration
The index must have at least one semantic configuration. The semantic configuration must have:
- A defaultSemanticConfiguration set to a named configuration.
- A prioritizedContentFields set to at least one string field that is both searchable and retrievable.

Within the configuration, prioritizedContentFields is required. Title and keywords are optional. For chunked content, you might not have either. However, if you add entity recognition or key phrase extraction, you might have some keywords associated with each chunk that can be useful in search scenarios, perhaps in a scoring profile.
Here's an example of a semantic configuration that works for agentic retrieval:
"semantic":{
"defaultConfiguration":"semantic_config",
"configurations":[
{
"name":"semantic_config",
"flightingOptIn":false,
"prioritizedFields":{
"prioritizedFields":{
"titleField":{
"fieldName":""
},
"prioritizedContentFields":[
{
"fieldName":"page_chunk"
}
],
"prioritizedKeywordsFields":[
{
"fieldName":"Category"
},
{
"fieldName":"Tags"
},
{
"fieldName":"Location"
}
]
}
}
}
]
}
Note
The response provides title, terms, and content, which map to the prioritized fields in this configuration.
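To make that mapping concrete, the grounding data passed back to the LLM can be pictured as an array of objects built from those prioritized fields. The values below are invented placeholders, and the exact response shape can vary by preview API version, so this is only an illustration of how the configuration surfaces in the output.

[
  {
    "title": "",
    "terms": "Category, Tags, and Location values for the matching chunk",
    "content": "Text from the page_chunk field of the matching document..."
  }
]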
Add a vectorizer
If you have vector fields in the index, the query plan includes them if they're searchable and have a vectorizer assignment.
A vectorizer specifies an embedding model that provides text-to-vector conversions at query time. It must point to the same embedding model used to encode the vector content in your index. You can use any embedding model supported by Azure AI Search. Vectorizers are specified on vector fields by way of a vector profile.
Recall the vector field definition in the index example. Attributes on a vector field include dimensions (the number of embeddings generated by the model) and the vector search profile.
{
  "name": "page_chunk_text_3_large", "type": "Collection(Edm.Single)",
  "searchable": true, "retrievable": false, "filterable": false, "sortable": false, "facetable": false,
  "dimensions": 3072,
  "vectorSearchProfile": "hnsw_text_3_large",
  "stored": false,
  "synonymMaps": []
}
Vector profiles are configurations of vectorizers, algorithms, and compression techniques. Each vector field can use only one profile, but your index can define multiple profiles if you want a unique profile for each vector field.
Querying vectors and calling a vectorizer add latency to the overall request, but if you want similarity search, the trade-off might be worth it.
Here's an example of a vectorizer that works for agentic retrieval, as it appears in a vectorSearch configuration. There's nothing in the vectorizer definition that needs to be changed to work with agentic retrieval.
"vectorSearch": {
"algorithms": [
{
"name": "alg",
"kind": "hnsw",
"hnswParameters": {
"metric": "cosine",
"m": 4,
"efConstruction": 400,
"efSearch": 500
}
}
],
"profiles": [
{
"name": "hnsw_text_3_large",
"algorithm": "alg",
"vectorizer": "azure_openai_text_3_large"
}
],
"vectorizers": [
{
"name": "azure_openai_text_3_large",
"kind": "azureOpenAI",
"azureOpenAIParameters": {
"resourceUri": "https://YOUR-AOAI-RESOURCE.openai.azure.com",
"deploymentId": "text-embedding-3-large",
"apiKey": "<redacted>",
"modelName": "text-embedding-3-large"
}
}
],
"compressions": []
}
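The query-time vectorizer is one half of a matched pair: the vectors stored in page_chunk_text_3_large must come from the same model. If you use an indexer-based pipeline, that typically means an Azure OpenAI Embedding skill pointed at the same deployment. The skill name, context path, and output mapping below are assumptions for illustration; if you embed content in your own code instead, the same rule applies: use text-embedding-3-large on both sides.

{
  "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
  "name": "embed_page_chunk",
  "context": "/document/pages/*",
  "resourceUri": "https://YOUR-AOAI-RESOURCE.openai.azure.com",
  "deploymentId": "text-embedding-3-large",
  "modelName": "text-embedding-3-large",
  "dimensions": 3072,
  "inputs": [
    { "name": "text", "source": "/document/pages/*" }
  ],
  "outputs": [
    { "name": "embedding", "targetName": "page_chunk_text_3_large" }
  ]
}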
Add a scoring profile
Scoring profiles are criteria for relevance boosting. They're applied to non-vector fields (text and numbers) and are evaluated during query execution, although the precise behavior depends on the API version used to create the index.
If you create the index using 2025-05-01-preview, the scoring profile executes last. If the index is created using an earlier API version, scoring profiles are evaluated before semantic reranking.
You can use any scoring profile that makes sense for your index. Here's an example of one that boosts the search score of a match when the match is found in specific fields. Fields are weighted with boosting multipliers. For example, if a match is found in the "Category" field, its contribution to the search score is multiplied by 5.
"scoringProfiles": [
{
"name": "boostSearchTerms",
"text": {
"weights": {
"Location": 2,
"Category": 5
}
}
}
]
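The summary at the start of this article also recommends pairing a scoring profile with defaultScoringProfile. Because agentic retrieval generates queries internally, setting the profile as the index default is the straightforward way to make sure it's applied; defaultScoringProfile is part of the standard index schema, but whether a per-request override exists depends on the preview API you use. A minimal sketch of the relevant index properties:

{
  "name": "YOUR-INDEX-NAME",
  "defaultScoringProfile": "boostSearchTerms",
  "scoringProfiles": [ ... ]
}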
Add analyzers
Analyzers apply to text fields and can be language analyzers or custom analyzers that control tokenization in the index, such as preserving special characters or whitespace.
Analyzers are defined within a search index and assigned to fields. The fields collection example includes an analyzer reference on the text chunks. In this example, the default analyzer (standard Lucene) is replaced with a Microsoft language analyzer.
{
  "name": "page_chunk", "type": "Edm.String",
  "searchable": true, "retrievable": true, "filterable": false, "sortable": false, "facetable": false,
  "analyzer": "en.microsoft",
  "stored": true,
  "synonymMaps": []
}
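If you need the behaviors mentioned earlier, such as preserving whitespace or special characters, define a custom analyzer in the index's analyzers collection and assign its name to the field instead of en.microsoft. The analyzer name and the choice of a whitespace tokenizer below are illustrative; pick the tokenizer and filters that match your content.

"analyzers": [
  {
    "name": "keep-special-terms",
    "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
    "tokenizer": "whitespace",
    "tokenFilters": [ "lowercase" ],
    "charFilters": []
  }
]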
Add a synonym map
Synonym maps expand queries by adding synonyms for named terms. For example, you might map scientific or medical terms to their common equivalents.
Synonym maps are defined as a top-level resource on your search service and assigned to fields. The fields collection example doesn't include a synonym map, but if you had variant spellings of country names in a synonym map, here's what the assignment might look like on a hypothetical "locations" field.
{
  "name": "locations",
  "type": "Edm.String",
  "searchable": true,
  "synonymMaps": [ "country-synonyms" ]
}