Create an indexed OneLake knowledge source

Note

This agentic retrieval feature is generally available in the 2026-04-01 REST API via programmatic access. The Azure portal and Microsoft Foundry portal continue to provide preview-only access to all agentic retrieval features. For migration guidance, see Migrate agentic retrieval code to the latest version.

If you choose to use a preview REST API, you can access capabilities that aren't yet generally available for this feature. Preview features are provided without a service-level agreement and aren't recommended for production workloads. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Important

These features and functionality are part of the 2026-05-01-preview REST API. The 2026-05-01-preview is licensed to you as part of your Azure subscription and is subject to the terms applicable to "Previews" in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum ("DPA"), and the Supplemental Terms of Use for Microsoft Azure Previews.

The 2026-05-01-preview supports connections to other Microsoft services and third-party services. Use of these services is subject to their respective terms and might result in data processing or storage outside of the Azure compliance boundary, as well as data flowing into the Azure compliance boundary.

The 2026-05-01-preview can't modify access permissions that were set outside of the 2026-05-01-preview. If you use the 2026-05-01-preview with access- or permission-restricted content, a timing lag will occur before the 2026-05-01-preview recognizes changes to those access or permission restrictions.

It's your responsibility to manage whether your data will flow outside of your organization's compliance and geographic boundaries and any related implications, and that appropriate permissions, boundaries, and approvals are provisioned.

You're responsible for carefully reviewing and testing applications you build in the context of your specific use cases and making all appropriate decisions and customizations. This includes implementing your own responsible AI mitigations, such as metaprompts, content filters, or other safety systems, and ensuring your applications meet appropriate quality, reliability, security, and trustworthiness standards. For more information, see the Azure AI Search Transparency Note.

An indexed OneLake knowledge source ingests Microsoft OneLake files into an agentic retrieval pipeline in Azure AI Search. Knowledge sources are created independently, referenced in a knowledge base, and used as grounding data when the knowledge base is queried at runtime.

When you create an indexed OneLake knowledge source, you specify an external data source, models, and properties to automatically generate the following Azure AI Search objects:

A data source that represents a lakehouse.
A skillset that chunks and optionally vectorizes multimodal content from the lakehouse.
An index that stores enriched content and meets the criteria for agentic retrieval.
An indexer that uses the previous objects to drive the indexing and enrichment pipeline.

The generated indexer conforms to the OneLake indexer, whose prerequisites, supported tasks, supported document formats, supported shortcuts, and limitations also apply to OneLake knowledge sources. For more information, see the OneLake indexer documentation.

Usage support

Azure portal	Microsoft Foundry portal	.NET SDK	Python SDK	Java SDK	JavaScript SDK	REST API
✔️	✔️	✔️	✔️	✔️	✔️	✔️

Prerequisites

An Azure AI Search service in any region that provides agentic retrieval.
Completion of the OneLake indexer prerequisites.
Completion of the OneLake indexer data preparation.
Permissions to create knowledge sources. Configure keyless authentication with the Search Service Contributor role assigned to your user account (recommended) or use an API key.
If the knowledge source specifies an Azure OpenAI model for embeddings or image verbalization, the search service must have a managed identity with Cognitive Services User permissions on the Microsoft Foundry resource.

Required Azure.Search.Documents package:
- For 2026-05-01-preview features, the latest preview package: dotnet add package Azure.Search.Documents --prerelease
- For 2026-04-01 features, the latest stable package: dotnet add package Azure.Search.Documents

Required azure-search-documents package:
- For 2026-05-01-preview features, the latest preview package: pip install --pre azure-search-documents
- For 2026-04-01 features, the latest stable package: pip install azure-search-documents

Required REST API version:
- For preview features: Search Service 2026-05-01-preview
- For generally available features: Search Service 2026-04-01
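As a rough illustration of the version choice above, the following sketch picks an API version based on the properties you plan to set. The two version strings come from this article. The helper name and the `PREVIEW_ONLY_PROPERTIES` set are hypothetical, and the set lists only `AssetStore` because that's the only property this article marks as preview-only. Other capabilities might also require the preview version.

```python
# Version strings are taken from this article.
GA_API_VERSION = "2026-04-01"
PREVIEW_API_VERSION = "2026-05-01-preview"

# Assumption: only properties that this article labels as preview-only are listed.
PREVIEW_ONLY_PROPERTIES = {"AssetStore"}


def pick_api_version(properties_in_use):
    """Return the preview version only when a preview-only property is used."""
    if PREVIEW_ONLY_PROPERTIES & set(properties_in_use):
        return PREVIEW_API_VERSION
    return GA_API_VERSION
```

Defaulting to the generally available version keeps production workloads off preview APIs unless a preview-only property forces the choice.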

Check for existing knowledge sources

A knowledge source is a top-level, reusable object. Listing the existing knowledge sources helps you decide whether to reuse one or choose a unique name for a new one.

Run the following code to list knowledge sources by name and type.

// List knowledge sources by name and type
using Azure.Search.Documents.Indexes;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
var knowledgeSources = indexClient.GetKnowledgeSourcesAsync();

Console.WriteLine("Knowledge Sources:");

await foreach (var ks in knowledgeSources)
{
    Console.WriteLine($"  Name: {ks.Name}, Type: {ks.GetType().Name}");
}
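If you work against the REST API rather than the .NET SDK, a similar listing can be sketched with the Python standard library. This is a minimal sketch, not the SDK's method. It assumes the collection is exposed at `/knowledgesources` on the search endpoint, that API key authentication uses an `api-key` header, and that the response body has a `value` array whose items carry `name` and `kind`. Confirm each of these against the REST API reference before relying on them. The function names are hypothetical.

```python
import json
import urllib.request

API_VERSION = "2026-04-01"  # generally available version named in this article


def build_list_request(search_endpoint, api_key):
    """Build (but don't send) a GET request for the knowledge sources collection."""
    # Assumption: the collection route is /knowledgesources.
    url = f"{search_endpoint.rstrip('/')}/knowledgesources?api-version={API_VERSION}"
    return urllib.request.Request(url, headers={"api-key": api_key}, method="GET")


def summarize(response_body):
    """Return (name, kind) pairs from a JSON response body.

    Assumption: the body has the shape {"value": [{"name": ..., "kind": ...}]}.
    """
    payload = json.loads(response_body)
    return [(ks.get("name"), ks.get("kind")) for ks in payload.get("value", [])]
```

To send the request, pass the result of `build_list_request` to `urllib.request.urlopen` and hand the decoded body to `summarize`.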

Source-specific properties

Name	Description	Type	Editable	Required
`Name`	The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search.	String	No	Yes
`Description`	A description of the knowledge source.	String	Yes	No
`EncryptionKey`	A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects.	Object	Yes	No
`IndexedOneLakeKnowledgeSourceParameters`	Parameters specific to OneLake knowledge sources: `FabricWorkspaceId`, `LakehouseId`, and `TargetPath`.	Object		Yes
`FabricWorkspaceId`	The GUID of the workspace that contains the lakehouse.	String	No	Yes
`LakehouseId`	The GUID of the lakehouse.	String	No	Yes
`TargetPath`	A folder or shortcut within the lakehouse. When unspecified, the entire lakehouse is indexed.	String	No	No
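Before you send these values to the service, it can help to check them locally. The following sketch validates the two GUIDs and treats `TargetPath` as optional, as the table describes. The dictionary is a hypothetical stand-in keyed by the table's property names, not an SDK type, and `build_onelake_parameters` is an illustrative name.

```python
import uuid


def build_onelake_parameters(fabric_workspace_id, lakehouse_id, target_path=None):
    """Validate and return the source-specific parameters as a plain dict."""
    for label, value in (("FabricWorkspaceId", fabric_workspace_id),
                         ("LakehouseId", lakehouse_id)):
        try:
            # uuid.UUID accepts the usual hyphenated GUID form, among others.
            uuid.UUID(value)
        except (ValueError, AttributeError, TypeError) as exc:
            raise ValueError(f"{label} must be a GUID, got {value!r}") from exc

    params = {"FabricWorkspaceId": fabric_workspace_id, "LakehouseId": lakehouse_id}
    # TargetPath is optional. When it's omitted, the entire lakehouse is indexed.
    if target_path:
        params["TargetPath"] = target_path
    return params
```

Because `FabricWorkspaceId`, `LakehouseId`, and `TargetPath` aren't editable after creation, catching a malformed value early avoids having to delete and recreate the knowledge source.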

Ingestion parameters properties

Name	Description	Type	Editable	Required
`Identity`	A managed identity to use in the generated indexer.	Object	Yes	No
`DisableImageVerbalization`	Enables or disables the use of image verbalization. The default is `False`, which enables image verbalization. Set to `True` to disable image verbalization.	Boolean	No	No
`ChatCompletionModel`	A chat completion model that verbalizes images or extracts content. Supported models are `gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-5`, `gpt-5-mini`, and `gpt-5-nano`. The GenAI Prompt skill is included in the generated skillset. Setting this parameter also requires that `DisableImageVerbalization` is set to `False`. When `ContentExtractionMode` is set to `standard`, `ChatCompletionModel.AzureOpenAIParameters.ResourceUri` must equal `AiServices.Uri`, and both parameters must point to the same Microsoft Foundry resource on `services.ai.azure.com`.	Object	Only `ApiKey` and `DeploymentName` are editable	No
`EmbeddingModel`	A text embedding model that vectorizes text and image content during indexing and at query time. Supported models are `text-embedding-ada-002`, `text-embedding-3-small`, and `text-embedding-3-large`. The Azure OpenAI Embedding skill is included in the generated skillset, and the Azure OpenAI vectorizer is included in the generated index.	Object	Only `ApiKey` and `DeploymentName` are editable	No
`ContentExtractionMode`	Controls how content is extracted from files. The default is `minimal`, which uses basic content extraction methods for text and images. Set to `standard` for advanced document cracking and chunking using the Azure Content Understanding skill, which is included in the generated skillset. You can specify the `AiServices` parameter only when this property is `standard`, in which case `ChatCompletionModel.AzureOpenAIParameters.ResourceUri` must equal `AiServices.Uri`. For more information, see the `ChatCompletionModel` row.	String	No	No
`AiServices`	A Foundry resource to access Azure Content Understanding in Foundry Tools. Setting this parameter requires that `ContentExtractionMode` is set to `standard`. For more information, see the `ChatCompletionModel` row.	Object	Only `ApiKey` is editable	No
`IngestionSchedule`	Adds scheduling information to the generated indexer. You can also add a schedule later to automate data refresh.	Object	Yes	No
`IngestionPermissionOptions`	The document-level permissions to ingest alongside content. Specify `UserIds`, `GroupIds`, or `RbacScope` to store permission metadata in the index. You can also specify `SensitivityLabel` to ingest Microsoft Purview sensitivity label metadata for blob, indexed OneLake, and indexed SharePoint knowledge sources. For source-specific RBAC guidance, see Ingest RBAC permissions from blob storage and Ingest ACLs from ADLS Gen2. To enforce these permissions at query time, see Enforce permissions at query time.	Array	No	No
`AssetStore`	(2026-05-01-preview only) A blob container used to persist images extracted from source documents. Required to enable image serving (preview) for the knowledge base. Setting this parameter provisions a knowledge store alongside the knowledge source to store the image artifacts. You can inspect and manage this knowledge store like any other. The storage account must remain accessible to the search service for the lifetime of the knowledge base.	Object	No	No
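Several rows in the table depend on each other. The following sketch collects those cross-property rules into one local check. It operates on a plain dictionary keyed by the table's property names, which is a hypothetical stand-in rather than an SDK type. The `ModelName` key under `AzureOpenAIParameters` is an assumption, and the URI comparison ignores case and a trailing slash, which might be looser or stricter than the service's own check.

```python
from urllib.parse import urlparse

# Model lists are copied from the table above.
SUPPORTED_CHAT_MODELS = {
    "gpt-4o", "gpt-4o-mini", "gpt-4.1", "gpt-4.1-mini",
    "gpt-4.1-nano", "gpt-5", "gpt-5-mini", "gpt-5-nano",
}
SUPPORTED_EMBEDDING_MODELS = {
    "text-embedding-ada-002", "text-embedding-3-small", "text-embedding-3-large",
}
FOUNDRY_HOST_SUFFIX = ".services.ai.azure.com"


def _norm(uri):
    """Normalize a URI for comparison (assumption: case and trailing slash don't matter)."""
    return (uri or "").rstrip("/").lower()


def validate_ingestion_parameters(params):
    """Return a list of rule violations; an empty list means no rule was broken."""
    errors = []
    mode = params.get("ContentExtractionMode", "minimal")  # default per the table
    chat = (params.get("ChatCompletionModel") or {}).get("AzureOpenAIParameters")
    embedding = (params.get("EmbeddingModel") or {}).get("AzureOpenAIParameters")
    ai_services = params.get("AiServices")

    if mode not in ("minimal", "standard"):
        errors.append(f"Unknown ContentExtractionMode: {mode!r}")
    if chat is not None:
        if params.get("DisableImageVerbalization", False):
            errors.append("ChatCompletionModel requires DisableImageVerbalization to be False")
        if chat.get("ModelName") not in SUPPORTED_CHAT_MODELS:
            errors.append(f"Unsupported chat completion model: {chat.get('ModelName')!r}")
    if embedding is not None and embedding.get("ModelName") not in SUPPORTED_EMBEDDING_MODELS:
        errors.append(f"Unsupported embedding model: {embedding.get('ModelName')!r}")
    if ai_services is not None and mode != "standard":
        errors.append("AiServices requires ContentExtractionMode to be 'standard'")
    if mode == "standard" and chat is not None and ai_services is not None:
        if _norm(chat.get("ResourceUri")) != _norm(ai_services.get("Uri")):
            errors.append("ChatCompletionModel ResourceUri must equal AiServices.Uri")
        host = urlparse(ai_services.get("Uri") or "").hostname or ""
        if not host.endswith(FOUNDRY_HOST_SUFFIX):
            errors.append("AiServices.Uri must be on services.ai.azure.com")
    return errors
```

Returning a list of messages instead of raising on the first failure lets you report every conflicting setting in one pass.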
