Note
This agentic retrieval feature is generally available in the 2026-04-01 REST API via programmatic access. The Azure portal and Microsoft Foundry portal continue to provide preview-only access to all agentic retrieval features. For migration guidance, see Migrate agentic retrieval code to the latest version.
If you choose to use a preview REST API, you can access capabilities that aren't yet generally available for this feature. Preview features are provided without a service-level agreement and aren't recommended for production workloads. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Important
These features and functionality are part of the 2026-05-01-preview REST API. The 2026-05-01-preview is licensed to you as part of your Azure subscription and is subject to the terms applicable to "Previews" in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum ("DPA"), and the Supplemental Terms of Use for Microsoft Azure Previews.
The 2026-05-01-preview supports connections to other Microsoft services and third-party services. Use of these services is subject to their respective terms and might result in data processing or storage outside of the Azure compliance boundary, as well as data flowing into the Azure compliance boundary.
The 2026-05-01-preview can't modify access permissions that were set outside of the 2026-05-01-preview. If you use the 2026-05-01-preview with access- or permission-restricted content, a timing lag will occur before the 2026-05-01-preview recognizes changes to those access or permission restrictions.
It's your responsibility to manage whether your data will flow outside of your organization's compliance and geographic boundaries and any related implications, and that appropriate permissions, boundaries, and approvals are provisioned.
You're responsible for carefully reviewing and testing applications you build in the context of your specific use cases and making all appropriate decisions and customizations. This includes implementing your own responsible AI mitigations, such as metaprompts, content filters, or other safety systems, and ensuring your applications meet appropriate quality, reliability, security, and trustworthiness standards. For more information, see the Azure AI Search Transparency Note.
An indexed OneLake knowledge source ingests Microsoft OneLake files into an agentic retrieval pipeline in Azure AI Search. Knowledge sources are created independently, referenced in a knowledge base, and used as grounding data when the knowledge base is queried at runtime.
When you create an indexed OneLake knowledge source, you specify an external data source, models, and properties to automatically generate the following Azure AI Search objects:
- A data source that represents a lakehouse.
- A skillset that chunks and optionally vectorizes multimodal content from the lakehouse.
- An index that stores enriched content and meets the criteria for agentic retrieval.
- An indexer that uses the previous objects to drive the indexing and enrichment pipeline.
The generated indexer conforms to the OneLake indexer, whose prerequisites, supported tasks, supported document formats, supported shortcuts, and limitations also apply to OneLake knowledge sources. For more information, see the OneLake indexer documentation.
Usage support
| Azure portal | Microsoft Foundry portal | .NET SDK | Python SDK | Java SDK | JavaScript SDK | REST API |
|---|---|---|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
Prerequisites
- An Azure AI Search service in any region that provides agentic retrieval.
- Completion of the OneLake indexer prerequisites.
- Completion of the OneLake indexer data preparation.
- Permissions to create knowledge sources. Configure keyless authentication with the Search Service Contributor role assigned to your user account (recommended) or use an API key.
- If the knowledge source specifies an Azure OpenAI model for embeddings or image verbalization, the search service must have a managed identity with Cognitive Services User permissions on the Microsoft Foundry resource.
Required Azure.Search.Documents package:
For 2026-05-01-preview features, the latest preview package:
dotnet add package Azure.Search.Documents --prerelease
For 2026-04-01 features, the latest stable package:
dotnet add package Azure.Search.Documents
Required azure-search-documents package:
For 2026-05-01-preview features, the latest preview package:
pip install --pre azure-search-documents
For 2026-04-01 features, the latest stable package:
pip install azure-search-documents
Required REST API version:
For preview features: Search Service 2026-05-01-preview
For generally available features: Search Service 2026-04-01
Check for existing knowledge sources
A knowledge source is a top-level, reusable object. Listing the existing knowledge sources helps you decide whether to reuse one and shows which names are already taken.
Run the following code to list knowledge sources by name and type.
// List knowledge sources by name and type
using Azure.Search.Documents.Indexes;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
var knowledgeSources = indexClient.GetKnowledgeSourcesAsync();
Console.WriteLine("Knowledge Sources:");
await foreach (var ks in knowledgeSources)
{
Console.WriteLine($" Name: {ks.Name}, Type: {ks.GetType().Name}");
}
Reference: SearchIndexClient
# List knowledge sources by name and type
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
for ks in index_client.list_knowledge_sources():
    print(f" - {ks.name} ({ks.kind})")
Reference: SearchIndexClient
### List knowledge sources by name and type
GET {{search-url}}/knowledgesources?api-version={{api-version}}&$select=name,kind
api-key: {{api-key}}
Reference: Knowledge Sources - List
You can also return a single knowledge source by name to review its JSON definition.
using Azure.Search.Documents.Indexes;
using System.Text.Json;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
// Specify the knowledge source name to retrieve
string ksNameToGet = "earth-knowledge-source";
// Get its definition
var knowledgeSourceResponse = await indexClient.GetKnowledgeSourceAsync(ksNameToGet);
var ks = knowledgeSourceResponse.Value;
// Serialize to JSON for display
var jsonOptions = new JsonSerializerOptions
{
WriteIndented = true,
DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.Never
};
Console.WriteLine(JsonSerializer.Serialize(ks, ks.GetType(), jsonOptions));
Reference: SearchIndexClient
# Get a knowledge source definition
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
import json
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
ks = index_client.get_knowledge_source("knowledge_source_name")
print(json.dumps(ks.as_dict(), indent = 2))
Reference: SearchIndexClient
### Get a knowledge source definition
GET {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version={{api-version}}
api-key: {{api-key}}
Reference: Knowledge Sources - Get
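As a rough sketch of how the two REST requests above are addressed, the following Python helpers assemble the list and get URLs. The function names and the use of `urllib.parse` are illustrative assumptions. Only the paths and query parameters come from the requests shown above.

```python
from urllib.parse import quote, urlencode


def list_knowledge_sources_url(search_url: str, api_version: str) -> str:
    """Build the URL that lists knowledge sources by name and kind."""
    # safe="$," leaves the "$select=name,kind" parameter unescaped.
    query = urlencode({"api-version": api_version, "$select": "name,kind"}, safe="$,")
    return f"{search_url.rstrip('/')}/knowledgesources?{query}"


def get_knowledge_source_url(search_url: str, name: str, api_version: str) -> str:
    """Build the URL that returns a single knowledge source definition."""
    query = urlencode({"api-version": api_version})
    # Percent-encode the name so it is always a single path segment.
    return f"{search_url.rstrip('/')}/knowledgesources/{quote(name, safe='')}?{query}"
```

Either URL would still need the `api-key` header, or a bearer token for keyless authentication, when the request is sent.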
The following JSON is an example response for an indexed OneLake knowledge source.
{
"name": "my-onelake-ks",
"kind": "indexedOneLake",
"description": "A sample indexed OneLake knowledge source.",
"encryptionKey": null,
"indexedOneLakeParameters": {
"fabricWorkspaceId": "<REDACTED>",
"lakehouseId": "<REDACTED>",
"targetPath": null,
"ingestionParameters": {
"disableImageVerbalization": false,
"ingestionPermissionOptions": [],
"contentExtractionMode": "standard",
"identity": null,
"embeddingModel": {
"kind": "azureOpenAI",
"azureOpenAIParameters": {
"resourceUri": "<REDACTED>",
"deploymentId": "text-embedding-3-large",
"apiKey": "<REDACTED>",
"modelName": "text-embedding-3-large"
}
},
"chatCompletionModel": {
"kind": "azureOpenAI",
"azureOpenAIParameters": {
"resourceUri": "<your-foundry-resource-endpoint>",
"deploymentId": "gpt-5-mini",
"apiKey": "<REDACTED>",
"modelName": "gpt-5-mini"
}
},
"ingestionSchedule": null,
"aiServices": {
"uri": "<your-foundry-resource-endpoint>",
"apiKey": "<REDACTED>"
}
},
"createdResources": {
"datasource": "my-onelake-ks-datasource",
"indexer": "my-onelake-ks-indexer",
"skillset": "my-onelake-ks-skillset",
"index": "my-onelake-ks-index"
}
}
}
Create a knowledge source
Run the following code to create an indexed OneLake knowledge source.
// Create an indexed OneLake knowledge source
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
using Azure.Search.Documents.Models;
using Azure;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
var chatCompletionParams = new AzureOpenAIVectorizerParameters
{
ResourceUri = new Uri(aoaiEndpoint),
DeploymentName = aoaiGptDeployment,
ModelName = aoaiGptModel
};
var embeddingParams = new AzureOpenAIVectorizerParameters
{
ResourceUri = new Uri(aoaiEndpoint),
DeploymentName = aoaiEmbeddingDeployment,
ModelName = aoaiEmbeddingModel
};
var ingestionParams = new KnowledgeSourceIngestionParameters
{
DisableImageVerbalization = false,
ChatCompletionModel = new KnowledgeBaseAzureOpenAIModel(azureOpenAIParameters: chatCompletionParams),
EmbeddingModel = new KnowledgeSourceAzureOpenAIVectorizer
{
AzureOpenAIParameters = embeddingParams
},
IngestionPermissionOptions = new List<KnowledgeSourceIngestionPermissionOption>
{
KnowledgeSourceIngestionPermissionOption.UserIds,
KnowledgeSourceIngestionPermissionOption.GroupIds
}
};
var oneLakeParams = new IndexedOneLakeKnowledgeSourceParameters(
fabricWorkspaceId: fabricWorkspaceId,
lakehouseId: lakehouseId)
{
IngestionParameters = ingestionParams
};
var knowledgeSource = new IndexedOneLakeKnowledgeSource(
name: "my-onelake-ks",
indexedOneLakeParameters: oneLakeParams)
{
Description = "This knowledge source pulls content from a lakehouse."
};
await indexClient.CreateOrUpdateKnowledgeSourceAsync(knowledgeSource);
Console.WriteLine($"Knowledge source '{knowledgeSource.Name}' created or updated successfully.");
Reference: SearchIndexClient, IndexedOneLakeKnowledgeSource
# Create an indexed OneLake knowledge source
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import IndexedOneLakeKnowledgeSource, IndexedOneLakeKnowledgeSourceParameters, KnowledgeBaseAzureOpenAIModel, AzureOpenAIVectorizerParameters, KnowledgeSourceAzureOpenAIVectorizer, KnowledgeSourceContentExtractionMode, KnowledgeSourceIngestionParameters
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
knowledge_source = IndexedOneLakeKnowledgeSource(
name = "my-onelake-ks",
description= "This knowledge source pulls content from a lakehouse.",
encryption_key = None,
indexed_one_lake_parameters = IndexedOneLakeKnowledgeSourceParameters(
fabric_workspace_id = "fabric_workspace_id",
lakehouse_id = "lakehouse_id",
target_path = None,
ingestion_parameters = KnowledgeSourceIngestionParameters(
identity = None,
disable_image_verbalization = False,
chat_completion_model = KnowledgeBaseAzureOpenAIModel(
azure_open_ai_parameters = AzureOpenAIVectorizerParameters(
resource_url = "aoai_endpoint",
deployment_name = "aoai_gpt_deployment",
model_name = "aoai_gpt_model",
api_key = "aoai_api_key"
)
),
embedding_model = KnowledgeSourceAzureOpenAIVectorizer(
azure_open_ai_parameters=AzureOpenAIVectorizerParameters(
resource_url = "aoai_endpoint",
deployment_name = "aoai_embedding_deployment",
model_name = "aoai_embedding_model",
api_key = "aoai_api_key"
)
),
content_extraction_mode = KnowledgeSourceContentExtractionMode.MINIMAL,
ingestion_schedule = None,
ingestion_permission_options = ["user_ids", "group_ids"]
)
)
)
index_client.create_or_update_knowledge_source(knowledge_source)
print(f"Knowledge source '{knowledge_source.name}' created or updated successfully.")
Reference: SearchIndexClient
### Create an indexed OneLake knowledge source
PUT {{search-url}}/knowledgesources/my-onelake-ks?api-version=2026-05-01-preview
api-key: {{api-key}}
Content-Type: application/json
{
"name": "my-onelake-ks",
"kind": "indexedOneLake",
"description": "This knowledge source pulls content from a lakehouse.",
"indexedOneLakeParameters": {
"fabricWorkspaceId": "<YOUR FABRIC WORKSPACE GUID>",
"lakehouseId": "<YOUR LAKEHOUSE GUID>",
"targetPath": null,
"ingestionParameters": {
"identity": null,
"disableImageVerbalization": null,
"chatCompletionModel": {
"kind": "azureOpenAI",
"azureOpenAIParameters": {
"resourceUri": "{{aoai-endpoint}}",
"deploymentId": "{{aoai-gpt-deployment}}",
"modelName": "{{aoai-gpt-model}}",
"apiKey": "{{aoai-key}}"
}
},
"embeddingModel": {
"kind": "azureOpenAI",
"azureOpenAIParameters": {
"resourceUri": "{{aoai-endpoint}}",
"deploymentId": "{{aoai-embedding-deployment}}",
"modelName": "{{aoai-embedding-model}}",
"apiKey": "{{aoai-key}}"
}
},
"contentExtractionMode": "minimal",
"ingestionSchedule": null,
"ingestionPermissionOptions": ["userIds", "groupIds"]
}
}
}
Reference: Knowledge Sources - Create or Update
Note
Document-level permissions enforcement using ingestionPermissionOptions requires the 2026-05-01-preview API version. 2026-04-01 doesn't support this feature.
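One way to act on this note is to pick the API version from the definition before you send it. The following minimal sketch assumes a REST-style definition held in a Python dictionary. The helper name and the treatment of null or empty values as "not used" are assumptions. The two preview-only properties are the ones this article marks as such (`ingestionPermissionOptions` and `assetStore`).

```python
PREVIEW_API_VERSION = "2026-05-01-preview"
STABLE_API_VERSION = "2026-04-01"

# Ingestion parameters that this article marks as preview-only.
PREVIEW_ONLY_INGESTION_KEYS = ("ingestionPermissionOptions", "assetStore")


def required_api_version(definition: dict) -> str:
    """Return the lowest API version that supports the given definition."""
    params = definition.get("indexedOneLakeParameters") or {}
    ingestion = params.get("ingestionParameters") or {}
    for key in PREVIEW_ONLY_INGESTION_KEYS:
        # Assumption: null and empty values mean the feature isn't used.
        if ingestion.get(key):
            return PREVIEW_API_VERSION
    return STABLE_API_VERSION
```

For the request body shown earlier, which sets `ingestionPermissionOptions` to `["userIds", "groupIds"]`, this helper returns the preview version.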
Source-specific properties
The following properties apply to indexed OneLake knowledge sources.
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| Name | The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. | String | No | Yes |
| Description | A description of the knowledge source. | String | Yes | No |
| EncryptionKey | A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. | Object | Yes | No |
| IndexedOneLakeKnowledgeSourceParameters | Parameters specific to OneLake knowledge sources: FabricWorkspaceId, LakehouseId, and TargetPath. | Object | Yes | |
| FabricWorkspaceId | The GUID of the workspace that contains the lakehouse. | String | No | Yes |
| LakehouseId | The GUID of the lakehouse. | String | No | Yes |
| TargetPath | A folder or shortcut within the lakehouse. When unspecified, the entire lakehouse is indexed. | String | No | No |
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| name | The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. | String | No | Yes |
| description | A description of the knowledge source. | String | Yes | No |
| encryption_key | A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. | Object | Yes | No |
| indexed_one_lake_parameters | Parameters specific to OneLake knowledge sources: fabric_workspace_id, lakehouse_id, and target_path. | Object | Yes | |
| fabric_workspace_id | The GUID of the workspace that contains the lakehouse. | String | No | Yes |
| lakehouse_id | The GUID of the lakehouse. | String | No | Yes |
| target_path | A folder or shortcut within the lakehouse. When unspecified, the entire lakehouse is indexed. | String | No | No |
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| name | The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. | String | No | Yes |
| kind | The kind of knowledge source, which is indexedOneLake in this case. | String | No | Yes |
| description | A description of the knowledge source. | String | Yes | No |
| encryptionKey | A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. | Object | Yes | No |
| indexedOneLakeParameters | Parameters specific to OneLake knowledge sources: fabricWorkspaceId, lakehouseId, and targetPath. | Object | Yes | |
| fabricWorkspaceId | The GUID of the workspace that contains the lakehouse. | String | No | Yes |
| lakehouseId | The GUID of the lakehouse. | String | No | Yes |
| targetPath | A folder or shortcut within the lakehouse. When unspecified, the entire lakehouse is indexed. | String | No | No |
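Because several of these properties aren't editable, it can help to compare an existing definition with a proposed update before you send it. The following minimal sketch works on REST-style definitions held as Python dictionaries. The function name and constants are hypothetical, and the lists of non-editable properties are copied from the Editable column of the REST table above.

```python
# Properties that the REST table above marks as not editable.
NON_EDITABLE_TOP_LEVEL = ("name", "kind")
NON_EDITABLE_ONELAKE = ("fabricWorkspaceId", "lakehouseId", "targetPath")


def non_editable_changes(existing: dict, proposed: dict) -> list:
    """List non-editable properties whose values differ between two definitions."""
    changed = [k for k in NON_EDITABLE_TOP_LEVEL if existing.get(k) != proposed.get(k)]
    old = existing.get("indexedOneLakeParameters") or {}
    new = proposed.get("indexedOneLakeParameters") or {}
    changed += [
        f"indexedOneLakeParameters.{k}"
        for k in NON_EDITABLE_ONELAKE
        if old.get(k) != new.get(k)
    ]
    return changed
```

An empty list suggests the update touches only editable properties such as `description`. A non-empty list suggests you would need a new knowledge source instead of an update.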
Ingestion parameters properties
For indexed knowledge sources only, you can pass the following ingestionParameters properties to control how content is ingested and processed.
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| Identity | A managed identity to use in the generated indexer. | Object | Yes | No |
| DisableImageVerbalization | Enables or disables the use of image verbalization. The default is False, which enables image verbalization. Set to True to disable image verbalization. | Boolean | No | No |
| ChatCompletionModel | A chat completion model that verbalizes images or extracts content. Supported models are gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5, gpt-5-mini, and gpt-5-nano. The GenAI Prompt skill is included in the generated skillset. Setting this parameter also requires that DisableImageVerbalization is set to False. When ContentExtractionMode is set to standard, ChatCompletionModel.AzureOpenAIParameters.ResourceUri must equal AiServices.Uri, and both parameters must point to the same Microsoft Foundry resource on services.ai.azure.com. | Object | Only ApiKey and DeploymentName are editable | No |
| EmbeddingModel | A text embedding model that vectorizes text and image content during indexing and at query time. Supported models are text-embedding-ada-002, text-embedding-3-small, and text-embedding-3-large. The Azure OpenAI Embedding skill is included in the generated skillset, and the Azure OpenAI vectorizer is included in the generated index. | Object | Only ApiKey and DeploymentName are editable | No |
| ContentExtractionMode | Controls how content is extracted from files. The default is minimal, which uses basic content extraction methods for text and images. Set to standard for advanced document cracking and chunking using the Azure Content Understanding skill, which is included in the generated skillset. For standard only, the AiServices parameter is specifiable, and ChatCompletionModel.AzureOpenAIParameters.ResourceUri must equal AiServices.Uri. For more information, see the ChatCompletionModel row. | String | No | No |
| AiServices | A Foundry resource to access Azure Content Understanding in Foundry Tools. Setting this parameter requires that ContentExtractionMode is set to standard. For more information, see the ChatCompletionModel row. | Object | Only ApiKey is editable | No |
| IngestionSchedule | Adds scheduling information to the generated indexer. You can also add a schedule later to automate data refresh. | Object | Yes | No |
| IngestionPermissionOptions | The document-level permissions to ingest alongside content. Specify UserIds, GroupIds, or RbacScope to store permission metadata in the index. You can also specify SensitivityLabel to ingest Microsoft Purview sensitivity label metadata for blob, indexed OneLake, and indexed SharePoint knowledge sources. For source-specific RBAC guidance, see Ingest RBAC permissions from blob storage and Ingest ACLs from ADLS Gen2. To enforce these permissions at query time, see Enforce permissions at query time. | Array | No | No |
| AssetStore | (2026-05-01-preview only) A blob container used to persist images extracted from source documents. Required to enable image serving (preview) for the knowledge base. Setting this parameter provisions a knowledge store alongside the knowledge source to store the image artifacts. You can inspect and manage this knowledge store like any other. The storage account must remain accessible to the search service for the lifetime of the knowledge base. | Object | No | No |
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| identity | A managed identity to use in the generated indexer. | Object | Yes | No |
| disable_image_verbalization | Enables or disables the use of image verbalization. The default is False, which enables image verbalization. Set to True to disable image verbalization. | Boolean | No | No |
| chat_completion_model | A chat completion model that verbalizes images or extracts content. Supported models are gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5, gpt-5-mini, and gpt-5-nano. The GenAI Prompt skill is included in the generated skillset. Setting this parameter also requires that disable_image_verbalization is set to False. When content_extraction_mode is set to standard, chat_completion_model.azure_open_ai_parameters.resource_url must equal ai_services.uri, and both parameters must point to the same Microsoft Foundry resource on services.ai.azure.com. | Object | Only api_key and deployment_name are editable | No |
| embedding_model | A text embedding model that vectorizes text and image content during indexing and at query time. Supported models are text-embedding-ada-002, text-embedding-3-small, and text-embedding-3-large. The Azure OpenAI Embedding skill is included in the generated skillset, and the Azure OpenAI vectorizer is included in the generated index. | Object | Only api_key and deployment_name are editable | No |
| content_extraction_mode | Controls how content is extracted from files. The default is minimal, which uses basic content extraction methods for text and images. Set to standard for advanced document cracking and chunking using the Azure Content Understanding skill, which is included in the generated skillset. For standard only, the ai_services parameter is specifiable, and chat_completion_model.azure_open_ai_parameters.resource_url must equal ai_services.uri. For more information, see the chat_completion_model row. | String | No | No |
| ai_services | A Foundry resource to access Azure Content Understanding in Foundry Tools. Setting this parameter requires that content_extraction_mode is set to standard. For more information, see the chat_completion_model row. | Object | Only api_key is editable | No |
| ingestion_schedule | Adds scheduling information to the generated indexer. You can also add a schedule later to automate data refresh. | Object | Yes | No |
| ingestion_permission_options | The document-level permissions to ingest alongside content. Specify user_ids, group_ids, or rbac_scope to store permission metadata in the index. You can also specify sensitivity_label to ingest Microsoft Purview sensitivity label metadata for blob, indexed OneLake, and indexed SharePoint knowledge sources. For source-specific RBAC guidance, see Ingest RBAC permissions from blob storage and Ingest ACLs from ADLS Gen2. To enforce these permissions at query time, see Enforce permissions at query time. | Array | No | No |
| asset_store | (2026-05-01-preview only) A blob container used to persist images extracted from source documents. Required to enable image serving (preview) for the knowledge base. Setting this parameter provisions a knowledge store alongside the knowledge source to store the image artifacts. You can inspect and manage this knowledge store like any other. The storage account must remain accessible to the search service for the lifetime of the knowledge base. | Object | No | No |
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| identity | A managed identity to use in the generated indexer. | Object | Yes | No |
| disableImageVerbalization | Enables or disables the use of image verbalization. The default is false, which enables image verbalization. Set to true to disable image verbalization. | Boolean | No | No |
| chatCompletionModel | A chat completion model that verbalizes images or extracts content. Supported models are gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5, gpt-5-mini, and gpt-5-nano. The GenAI Prompt skill is included in the generated skillset. Setting this parameter also requires that disableImageVerbalization is set to false. When contentExtractionMode is set to standard, chatCompletionModel.azureOpenAIParameters.resourceUri must equal aiServices.uri, and both parameters must point to the same Microsoft Foundry resource on services.ai.azure.com. | Object | Only apiKey and deploymentId are editable | No |
| embeddingModel | A text embedding model that vectorizes text and image content during indexing and at query time. Supported models are text-embedding-ada-002, text-embedding-3-small, and text-embedding-3-large. The Azure OpenAI Embedding skill is included in the generated skillset, and the Azure OpenAI vectorizer is included in the generated index. | Object | Only apiKey and deploymentId are editable | No |
| contentExtractionMode | Controls how content is extracted from files. The default is minimal, which uses basic content extraction methods for text and images. Set to standard for advanced document cracking and chunking using the Azure Content Understanding skill, which is included in the generated skillset. For standard only, the aiServices parameter is specifiable, and chatCompletionModel.azureOpenAIParameters.resourceUri must equal aiServices.uri. For more information, see the chatCompletionModel row. | String | No | No |
| aiServices | A Foundry resource to access Azure Content Understanding in Foundry Tools. Setting this parameter requires that contentExtractionMode is set to standard. For more information, see the chatCompletionModel row. | Object | Only apiKey is editable | No |
| ingestionSchedule | Adds scheduling information to the generated indexer. You can also add a schedule later to automate data refresh. | Object | Yes | No |
| ingestionPermissionOptions | The document-level permissions to ingest alongside content. Specify userIds, groupIds, or rbacScope to store permission metadata in the index. You can also specify sensitivityLabel to ingest Microsoft Purview sensitivity label metadata for blob, indexed OneLake, and indexed SharePoint knowledge sources. For source-specific RBAC guidance, see Ingest RBAC permissions from blob storage and Ingest ACLs from ADLS Gen2. To enforce these permissions at query time, see Enforce permissions at query time. | Array | No | No |
| assetStore | (2026-05-01-preview only) A blob container used to persist images extracted from source documents. Required to enable image serving (preview) for the knowledge base. Setting this parameter provisions a knowledge store alongside the knowledge source to store the image artifacts. You can inspect and manage this knowledge store like any other. The storage account must remain accessible to the search service for the lifetime of the knowledge base. | Object | No | No |
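The constraints in the tables above interact, so a client-side check before sending a request might catch mistakes early. The following minimal sketch validates a REST-style `ingestionParameters` dictionary. The function names and messages are hypothetical. It assumes that `modelName` carries one of the listed model names and that the two URIs are compared as plain strings, neither of which is confirmed as the service's own validation logic.

```python
SUPPORTED_CHAT_MODELS = {
    "gpt-4o", "gpt-4o-mini", "gpt-4.1", "gpt-4.1-mini",
    "gpt-4.1-nano", "gpt-5", "gpt-5-mini", "gpt-5-nano",
}
SUPPORTED_EMBEDDING_MODELS = {
    "text-embedding-ada-002", "text-embedding-3-small", "text-embedding-3-large",
}


def _aoai(model):
    """Return the azureOpenAIParameters of a model object, or {} if absent."""
    return (model or {}).get("azureOpenAIParameters") or {}


def validate_ingestion_parameters(ingestion: dict) -> list:
    """Return human-readable problems found in an ingestionParameters object."""
    problems = []
    chat = ingestion.get("chatCompletionModel")
    embedding = ingestion.get("embeddingModel")
    ai_services = ingestion.get("aiServices")
    # The documented default extraction mode is minimal.
    mode = ingestion.get("contentExtractionMode") or "minimal"

    if chat and ingestion.get("disableImageVerbalization"):
        problems.append("chatCompletionModel requires disableImageVerbalization to be false")
    if chat and _aoai(chat).get("modelName") not in SUPPORTED_CHAT_MODELS:
        problems.append("unsupported chat completion model")
    if embedding and _aoai(embedding).get("modelName") not in SUPPORTED_EMBEDDING_MODELS:
        problems.append("unsupported embedding model")
    if ai_services and mode != "standard":
        problems.append("aiServices requires contentExtractionMode to be standard")
    if mode == "standard" and chat and ai_services:
        # Assumption: exact string equality; the service might normalize URIs.
        if _aoai(chat).get("resourceUri") != ai_services.get("uri"):
            problems.append("chatCompletionModel resourceUri must equal aiServices.uri")
    return problems
```

An empty list means none of the rules sketched here were violated. It doesn't guarantee that the service accepts the request.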
Check ingestion status
Run the following code to monitor ingestion progress and health. For knowledge sources that generate an indexer pipeline and populate a search index, the response includes the knowledge source kind and detailed indexing errors.
using Azure; // Required for AzureKeyCredential
using Azure.Search.Documents.Indexes;
using System.Text.Json;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
// Get knowledge source ingestion status
var statusResponse = await indexClient.GetKnowledgeSourceStatusAsync(knowledgeSourceName);
var status = statusResponse.Value;
// Serialize to JSON for display
var json = JsonSerializer.Serialize(status, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
Reference: SearchIndexClient
# Check knowledge source ingestion status
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
import json
index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))
status = index_client.get_knowledge_source_status("knowledge_source_name")
print(json.dumps(status.as_dict(), indent=2))
Reference: SearchIndexClient
### Check knowledge source ingestion status
GET {{search-url}}/knowledgesources/{{knowledge-source-name}}/status?api-version={{api-version}}
api-key: {{api-key}}
Content-Type: application/json
Reference: Knowledge Sources - Get Status
For a knowledge source that includes ingestion parameters and is actively ingesting content, the response might look like the following example.
{
"kind": "azureBlob",
"synchronizationStatus": "active",
"synchronizationInterval": "1d",
"currentSynchronizationState": {
"startTime": "2026-04-10T19:30:00Z",
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
"errors": [
{
"key": "Item id 1",
"docURL": "https://contoso.blob.core.windows.net/contracts/2024/Q4/doc-00023.csv",
"statusCode": 400,
"componentName": "DocumentExtraction.AzureBlob.MyDataSource",
"errorMessage": "Could not read the value of column 'foo' at index '0'.",
"details": "The file could not be parsed.",
"documentationLink": "https://go.microsoft.com/fwlink/?linkid=2049388"
}
]
},
"lastSynchronizationState": {
"status": "partialSuccess",
"startTime": "2026-04-09T19:30:00Z",
"endTime": "2026-04-09T19:40:01Z",
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
"errors": null
},
"statistics": {
"totalSynchronizations": 25,
"averageSynchronizationDuration": "00:15:20",
"averageItemsProcessedPerSynchronization": 500
}
}
Note
The kind property and currentSynchronizationState.errors[] array with document-level error details are available starting with the 2026-04-01 API version. For earlier API versions, these fields aren't returned. The lastSynchronizationState.status field is also new in 2026-04-01.
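To turn a status response into something easier to scan, you might condense it as in the following minimal sketch. It reads only property names that appear in the example response above. The function name and the output keys are hypothetical. The ratio is a plain division of the two counters, because this article doesn't state whether failed items are counted among processed items.

```python
def summarize_sync_state(status: dict) -> dict:
    """Condense the in-progress synchronization from a status response."""
    current = status.get("currentSynchronizationState") or {}
    processed = current.get("itemUpdatesProcessed", 0)
    failed = current.get("itemsUpdatesFailed", 0)
    # errors can be null in the response, so fall back to an empty list.
    errors = current.get("errors") or []
    return {
        "kind": status.get("kind"),
        "status": status.get("synchronizationStatus"),
        "processed": processed,
        "failed": failed,
        # Failed count divided by processed count; 0.0 when nothing was processed.
        "failed_to_processed_ratio": failed / processed if processed else 0.0,
        "error_messages": [
            f'{e.get("statusCode")}: {e.get("errorMessage")}' for e in errors
        ],
    }
```

You could pass in the parsed JSON from the REST call or the dictionary produced by `status.as_dict()` in the Python example, provided the keys match the REST property names.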
Review the generated objects
When you create this knowledge source, Azure AI Search automatically generates a data source, skillset, indexer, and index. The creation response lists each object under createdResources.
These objects are generated according to a fixed template, and their names are based on the name of the knowledge source. You can't change the object names. Avoid editing these objects directly, as changes can introduce errors or incompatibilities that break the indexer pipeline.
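The following minimal sketch reads `createdResources` from a knowledge source definition and flags any generated object whose name departs from a `<knowledge source name>-<object type>` pattern. That pattern is inferred only from the example response earlier in this article, so treat it as an assumption and not a documented contract. The function and constant names are hypothetical.

```python
# Suffixes inferred from the example response in this article (an assumption).
GENERATED_SUFFIXES = {
    "datasource": "-datasource",
    "indexer": "-indexer",
    "skillset": "-skillset",
    "index": "-index",
}


def created_resources(definition: dict) -> dict:
    """Return the createdResources object nested under indexedOneLakeParameters."""
    params = definition.get("indexedOneLakeParameters") or {}
    return params.get("createdResources") or {}


def unexpected_names(definition: dict) -> dict:
    """Return generated objects whose names differ from '<name><suffix>'."""
    name = definition["name"]
    actual = created_resources(definition)
    return {
        kind: actual.get(kind)
        for kind, suffix in GENERATED_SUFFIXES.items()
        if actual.get(kind) != name + suffix
    }
```

An empty result for `unexpected_names` means all four generated objects were listed with the inferred names, which gives you the names to look up in the Azure portal.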
You can use the Azure portal to validate object creation. The workflow is:
Check the indexer for success or failure messages. Connection or quota errors appear here.
Check the data source to verify the connection to your data store. The connection uses either a connection string or a managed identity, depending on how you configured the knowledge source.
Check the skillset to see how your content is chunked and optionally vectorized.
Check the index to see how your content is indexed and exposed for retrieval, including which fields are searchable and filterable and which fields store vectors for similarity search. Use Search Explorer to run queries against the generated index.
Assign to a knowledge base
If you're satisfied with the knowledge source, add it to a knowledge base.
For any knowledge base that specifies an indexed OneLake knowledge source, be sure to set includeReferenceSourceData to true. This step is necessary for pulling the source document URL into the citation.
Query a knowledge base
After the knowledge base is configured, call the retrieve action or MCP endpoint to query the knowledge source. This knowledge source supports optional configurations for document-level permissions enforcement and document-embedded image surfacing.
Enforce document-level permissions
To enforce document-level permissions, set ingestionPermissionOptions when you create this knowledge source, and then include the user's access token in the retrieve request. For more information, see Enforce permissions at query time (preview).
Surface document-embedded images
To surface document-embedded images (such as diagrams or scans) in answer synthesis responses, configure assetStore on this knowledge source, and then enable image serving on the knowledge base. For more information, see Surface document-embedded images in agentic retrieval (preview).
Delete a knowledge source
Before you can delete a knowledge source, you must delete any knowledge base that references it or update the knowledge base definition to remove the reference. For knowledge sources that generate an index and indexer pipeline, all generated objects are also deleted. However, if you used an existing index to create a knowledge source, your index isn't deleted.
If you try to delete a knowledge source that's in use, the action fails and returns a list of affected knowledge bases.
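To avoid a failed delete, you can check for references up front. The following Python sketch is one way to do that, assuming you've already fetched each knowledge base definition as a dictionary with a `knowledgeSources` array (the shape shown in the example responses in this section). The function name and the sample definitions are hypothetical.

```python
def find_referencing_knowledge_bases(kb_definitions, knowledge_source_name):
    """Return the names of knowledge bases that reference the given knowledge source."""
    referencing = []
    for kb in kb_definitions:
        # "knowledgeSources" is a list of objects that each carry a "name".
        source_names = [ks.get("name") for ks in kb.get("knowledgeSources") or []]
        if knowledge_source_name in source_names:
            referencing.append(kb["name"])
    return referencing


# Hypothetical definitions, trimmed to the fields used here.
definitions = [
    {"name": "my-kb", "knowledgeSources": [{"name": "my-blob-ks"}]},
    {"name": "my-kb-2", "knowledgeSources": [{"name": "other-ks"}, {"name": "my-blob-ks"}]},
    {"name": "my-kb-3", "knowledgeSources": []},
]
print(find_referencing_knowledge_bases(definitions, "my-blob-ks"))
```

Any knowledge base returned by this check must be deleted or updated before the knowledge source can be removed.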
To delete a knowledge source:
Get a list of all knowledge bases on your search service.
    using Azure.Search.Documents.Indexes;

    var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);

    // List all knowledge bases on the search service
    var knowledgeBases = indexClient.GetKnowledgeBasesAsync();

    Console.WriteLine("Knowledge Bases:");

    await foreach (var kb in knowledgeBases)
    {
        Console.WriteLine($" - {kb.Name}");
    }
Reference: SearchIndexClient
An example response might look like the following:
    {
      "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
      "value": [
        { "name": "my-kb" },
        { "name": "my-kb-2" }
      ]
    }
Get an individual knowledge base definition to check for knowledge source references.
    using Azure.Search.Documents.Indexes;
    using System.Text.Json;

    var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);

    // Specify the knowledge base name to retrieve
    string kbNameToGet = "earth-knowledge-base";

    // Get a specific knowledge base definition
    var knowledgeBaseResponse = await indexClient.GetKnowledgeBaseAsync(kbNameToGet);
    var kb = knowledgeBaseResponse.Value;

    // Serialize to JSON for display
    string json = JsonSerializer.Serialize(kb, new JsonSerializerOptions { WriteIndented = true });
    Console.WriteLine(json);
Reference: SearchIndexClient
An example response might look like the following:
    {
      "Name": "earth-knowledge-base",
      "KnowledgeSources": [
        {
          "Name": "earth-knowledge-source"
        }
      ],
      "Models": [
        {}
      ],
      "RetrievalReasoningEffort": {},
      "OutputMode": {},
      "ETag": "\u00220x8DE278629D782B3\u0022",
      "EncryptionKey": null,
      "Description": null,
      "RetrievalInstructions": null,
      "AnswerInstructions": null
    }
Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.
    using Azure.Search.Documents.Indexes;

    var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);

    // Delete the knowledge base that references the knowledge source
    await indexClient.DeleteKnowledgeBaseAsync(knowledgeBaseName);
    System.Console.WriteLine($"Knowledge base '{knowledgeBaseName}' deleted successfully.");
Reference: SearchIndexClient
Delete the knowledge source.
    // Delete the knowledge source after no knowledge base references it
    await indexClient.DeleteKnowledgeSourceAsync(knowledgeSourceName);
    System.Console.WriteLine($"Knowledge source '{knowledgeSourceName}' deleted successfully.");
Reference: SearchIndexClient
Get a list of all knowledge bases on your search service.
    # Get knowledge bases
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient

    index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

    print("Knowledge Bases:")
    for kb in index_client.list_knowledge_bases():
        print(f" - {kb.name}")
Reference: SearchIndexClient
An example response might look like the following:
    {
      "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
      "value": [
        { "name": "my-kb" },
        { "name": "my-kb-2" }
      ]
    }
Get an individual knowledge base definition to check for knowledge source references.
    # Get a knowledge base definition
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient

    index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

    kb = index_client.get_knowledge_base("knowledge_base_name")
    print(kb)
Reference: SearchIndexClient
An example response might look like the following:
    {
      "name": "my-kb",
      "description": null,
      "retrievalInstructions": null,
      "answerInstructions": null,
      "outputMode": null,
      "knowledgeSources": [
        {
          "name": "my-blob-ks"
        }
      ],
      "models": [],
      "encryptionKey": null,
      "retrievalReasoningEffort": {
        "kind": "low"
      }
    }
Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.
    # Delete a knowledge base
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient

    index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

    index_client.delete_knowledge_base("knowledge_base_name")
    print("Knowledge base deleted successfully.")
Reference: SearchIndexClient
Delete the knowledge source.
    # Delete a knowledge source
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient

    index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

    index_client.delete_knowledge_source("knowledge_source_name")
    print("Knowledge source deleted successfully.")
Reference: SearchIndexClient
Get a list of all knowledge bases on your search service.
    ### Get knowledge bases
    GET {{search-url}}/knowledgebases?api-version={{api-version}}&$select=name
    api-key: {{api-key}}
Reference: Knowledge Bases - List
An example response might look like the following:
    {
      "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
      "value": [
        { "name": "my-kb" },
        { "name": "my-kb-2" }
      ]
    }
Get an individual knowledge base definition to check for knowledge source references.
    ### Get a knowledge base definition
    GET {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version={{api-version}}
    api-key: {{api-key}}
Reference: Knowledge Bases - Get
An example response might look like the following:
    {
      "name": "my-kb",
      "description": null,
      "retrievalInstructions": null,
      "answerInstructions": null,
      "outputMode": null,
      "knowledgeSources": [
        {
          "name": "my-blob-ks"
        }
      ],
      "models": [],
      "encryptionKey": null,
      "retrievalReasoningEffort": {
        "kind": "low"
      }
    }
Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.
    ### Delete a knowledge base
    DELETE {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version={{api-version}}
    api-key: {{api-key}}
Reference: Knowledge Bases - Delete
Delete the knowledge source.
    ### Delete a knowledge source
    DELETE {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version={{api-version}}
    api-key: {{api-key}}
Reference: Knowledge Sources - Delete
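The preceding examples delete the knowledge base. For the alternative, where a knowledge base has multiple knowledge sources and you only want to remove one reference, the following Python sketch shows how the definition might be edited before you send it back to the service. It assumes the definition is a dictionary shaped like the example responses in this section. The function name is hypothetical, and the update call itself is omitted because it depends on the SDK or REST API version you use.

```python
import copy


def remove_knowledge_source(kb_definition, knowledge_source_name):
    """Return a copy of the definition without the named knowledge source reference."""
    updated = copy.deepcopy(kb_definition)
    remaining = [
        ks for ks in updated.get("knowledgeSources") or []
        if ks.get("name") != knowledge_source_name
    ]
    if not remaining:
        # Mirrors the guidance above: with no other sources left, delete the knowledge base.
        raise ValueError("No knowledge sources would remain; delete the knowledge base instead.")
    updated["knowledgeSources"] = remaining
    return updated


# Hypothetical definition, trimmed to the fields used here.
kb = {"name": "my-kb", "knowledgeSources": [{"name": "my-blob-ks"}, {"name": "other-ks"}]}
updated = remove_knowledge_source(kb, "my-blob-ks")
print([ks["name"] for ks in updated["knowledgeSources"]])
```

The function works on a copy, so the original definition stays available if the update fails and you need to retry.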