Important
These features and functionality are part of the 2026-05-01-preview REST API. The 2026-05-01-preview is licensed to you as part of your Azure subscription and is subject to the terms applicable to "Previews" in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum ("DPA"), and the Supplemental Terms of Use for Microsoft Azure Previews.
The 2026-05-01-preview supports connections to other Microsoft services and third-party services. Use of these services is subject to their respective terms and might result in data processing or storage outside of the Azure compliance boundary, as well as data flowing into the Azure compliance boundary.
It's your responsibility to manage whether your data will flow outside of your organization's compliance and geographic boundaries and any related implications, and that appropriate permissions, boundaries, and approvals are provisioned.
You're responsible for carefully reviewing and testing applications you build in the context of your specific use cases and making all appropriate decisions and customizations. This includes implementing your own responsible AI mitigations, such as metaprompts, content filters, or other safety systems, and ensuring your applications meet appropriate quality, reliability, security, and trustworthiness standards. For more information, see the Azure AI Search Transparency Note.
A file knowledge source (preview) uploads small and medium file sets directly to Azure AI Search for agentic retrieval. Knowledge sources are created independently, referenced in a knowledge base, and used as grounding data when the knowledge base is queried at runtime.
File knowledge sources are useful when you want a managed upload experience instead of provisioning Azure Storage, configuring access, and creating an indexer pipeline over an external container. Azure AI Search processes uploaded files so their extracted content can be retrieved from a knowledge base.
If your content already lives in Azure Blob Storage or ADLS Gen2, or if you need large-scale ingestion or storage account capabilities, use a blob knowledge source instead.
Usage support
| Azure portal | Microsoft Foundry portal | .NET SDK | Python SDK | Java SDK | JavaScript SDK | REST API |
|---|---|---|---|---|---|---|
| ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
Prerequisites
A dedicated Azure AI Search service in any region that provides agentic retrieval. File knowledge sources aren't supported on serverless search services. For more information about dedicated tiers, see Choose a service tier. If you need paid usage beyond the monthly free allowance, set the knowledgeRetrieval service property to standard by using the Search Management REST API.
Files in a supported format.
Permissions to create knowledge sources. Configure keyless authentication with the Search Service Contributor role assigned to your user account (recommended) or use an API key.
If the knowledge source specifies an Azure OpenAI model for embeddings, the search service must have a managed identity with Cognitive Services User permissions on the Microsoft Foundry resource.
- The latest Azure.Search.Documents preview package: dotnet add package Azure.Search.Documents --prerelease
- The latest azure-search-documents preview package: pip install --pre azure-search-documents
- The 2026-05-01-preview version of the Search Service REST APIs.
Supported formats and limits
The following file types are supported.
| Category | Extensions |
|---|---|
| Text | .txt, .md, .html, .json, .csv |
| Code | .c, .cs, .cpp, .java, .py, .js, .ts, .php, .rb, .sh |
| Documents | .pdf, .docx, .pptx, .doc |
The following limits apply to file knowledge sources.
| Limit | Value |
|---|---|
| Maximum file size per upload | 50 MB |
| Maximum files per file knowledge source | 100 |
Note
Uploaded content is stored in the generated search index. For total storage limits by pricing tier, see Service limits.
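The limits above can be checked locally before an upload is attempted. The following is a minimal sketch, not part of any SDK: `check_upload` and its constants are hypothetical names, the extension list and limits are copied from the preceding tables, and it assumes that "50 MB" means 50 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Values copied from the tables above; they might change while the feature is in preview.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".html", ".json", ".csv",
    ".c", ".cs", ".cpp", ".java", ".py", ".js", ".ts", ".php", ".rb", ".sh",
    ".pdf", ".docx", ".pptx", ".doc",
}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_FILES_PER_SOURCE = 100


def check_upload(file_name: str, size_bytes: int, existing_file_count: int) -> list[str]:
    """Return a list of problems; an empty list means the file passes the local checks."""
    problems = []
    extension = Path(file_name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is larger than {MAX_FILE_SIZE_BYTES} bytes")
    if existing_file_count >= MAX_FILES_PER_SOURCE:
        problems.append(f"knowledge source already holds {MAX_FILES_PER_SOURCE} files")
    return problems


print(check_upload("installation-guide.pdf", 2_000_000, 3))
print(check_upload("diagram.png", 60 * 1024 * 1024, 100))
```

A local check like this only avoids obviously invalid requests. The service remains the authority on what it accepts.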
Check for existing knowledge sources
A knowledge source is a top-level, reusable object. Knowing about existing knowledge sources is helpful for either reuse or naming new objects.
Run the following code to list knowledge sources by name and type.
// List knowledge sources by name and type
using Azure.Search.Documents.Indexes;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
var knowledgeSources = indexClient.GetKnowledgeSourcesAsync();
Console.WriteLine("Knowledge Sources:");
await foreach (var ks in knowledgeSources)
{
    Console.WriteLine($" Name: {ks.Name}, Type: {ks.GetType().Name}");
}
Reference: SearchIndexClient
# List knowledge sources by name and type
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
for ks in index_client.list_knowledge_sources():
    print(f" - {ks.name} ({ks.kind})")
Reference: SearchIndexClient
### List knowledge sources by name and type
GET {{search-url}}/knowledgesources?api-version={{api-version}}&$select=name,kind
api-key: {{api-key}}
Reference: Knowledge Sources - List
You can also return a single knowledge source by name to review its JSON definition.
using Azure.Search.Documents.Indexes;
using System.Text.Json;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
// Specify the knowledge source name to retrieve
string ksNameToGet = "earth-knowledge-source";
// Get its definition
var knowledgeSourceResponse = await indexClient.GetKnowledgeSourceAsync(ksNameToGet);
var ks = knowledgeSourceResponse.Value;
// Serialize to JSON for display
var jsonOptions = new JsonSerializerOptions
{
    WriteIndented = true,
    DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.Never
};
Console.WriteLine(JsonSerializer.Serialize(ks, ks.GetType(), jsonOptions));
Reference: SearchIndexClient
# Get a knowledge source definition
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
import json
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
ks = index_client.get_knowledge_source("knowledge_source_name")
print(json.dumps(ks.as_dict(), indent = 2))
Reference: SearchIndexClient
### Get a knowledge source definition
GET {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version={{api-version}}
api-key: {{api-key}}
Reference: Knowledge Sources - Get
The following JSON is an example response for a file knowledge source.
{
  "name": "my-file-ks",
  "kind": "file",
  "description": "A sample file knowledge source.",
  "encryptionKey": null,
  "fileParameters": {
    "ingestionParameters": {
      "contentExtractionMode": "minimal",
      "embeddingModel": {
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "<REDACTED>",
          "deploymentId": "text-embedding-3-large",
          "modelName": "text-embedding-3-large"
        }
      }
    }
  }
}
Create a knowledge source
Create a file knowledge source that specifies the embedding model used to vectorize uploaded content.
using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
var embeddingParams = new AzureOpenAIVectorizerParameters
{
    ResourceUri = new Uri(aoaiEndpoint),
    DeploymentName = aoaiEmbeddingDeployment,
    ModelName = aoaiEmbeddingModel
};
var ingestionParams = new KnowledgeSourceIngestionParameters
{
    ContentExtractionMode = "minimal",
    EmbeddingModel = new KnowledgeSourceAzureOpenAIVectorizer
    {
        AzureOpenAIParameters = embeddingParams
    }
};
var fileParams = new FileKnowledgeSourceParameters
{
    IngestionParameters = ingestionParams
};
var knowledgeSource = new FileKnowledgeSource(
    name: "my-file-ks",
    fileParameters: fileParams
)
{
    Description = "This knowledge source uses directly uploaded product manuals."
};
await indexClient.CreateOrUpdateKnowledgeSourceAsync(knowledgeSource);
Console.WriteLine($"Knowledge source '{knowledgeSource.Name}' created or updated successfully.");
Reference: SearchIndexClient
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    AzureOpenAIVectorizerParameters,
    FileKnowledgeSource,
    FileKnowledgeSourceParameters,
)
from azure.search.documents.knowledgebases.models import (
    KnowledgeSourceAzureOpenAIVectorizer,
    KnowledgeSourceIngestionParameters,
)
index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))
embedding_params = AzureOpenAIVectorizerParameters(
    resource_url="aoai_endpoint",
    deployment_name="aoai_embedding_deployment",
    model_name="aoai_embedding_model",
)
ingestion_params = KnowledgeSourceIngestionParameters(
    content_extraction_mode="minimal",
    embedding_model=KnowledgeSourceAzureOpenAIVectorizer(
        azure_open_ai_parameters=embedding_params
    ),
)
knowledge_source = FileKnowledgeSource(
    name="my-file-ks",
    description="This knowledge source uses directly uploaded product manuals.",
    file_parameters=FileKnowledgeSourceParameters(ingestion_parameters=ingestion_params),
)
index_client.create_or_update_knowledge_source(knowledge_source=knowledge_source)
print(f"Knowledge source '{knowledge_source.name}' created or updated successfully.")
Reference: SearchIndexClient
PUT {{search-url}}/knowledgesources/my-file-ks?api-version=2026-05-01-preview
api-key: {{api-key}}
Content-Type: application/json
Prefer: return=representation
{
  "name": "my-file-ks",
  "kind": "file",
  "description": "This knowledge source uses directly uploaded product manuals.",
  "encryptionKey": null,
  "fileParameters": {
    "ingestionParameters": {
      "embeddingModel": {
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "{{aoai-endpoint}}",
          "deploymentId": "{{aoai-embedding-deployment}}",
          "modelName": "{{aoai-embedding-model}}"
        }
      },
      "contentExtractionMode": "minimal"
    }
  }
}
Reference: Knowledge Sources - Create or Update
Source-specific properties
The following properties apply to file knowledge sources.
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| Name | The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. | String | No | Yes |
| Description | A description of the knowledge source. | String | Yes | No |
| EncryptionKey | A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. | Object | Yes | No |
| FileParameters | Parameters specific to file knowledge sources: IngestionParameters. | Object | Only nested model credentials are editable | No |

| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| name | The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. | String | No | Yes |
| description | A description of the knowledge source. | String | Yes | No |
| encryption_key | A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. | Object | Yes | No |
| file_parameters | Parameters specific to file knowledge sources: ingestion_parameters. | Object | Only nested model credentials are editable | No |

| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| name | The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. | String | No | Yes |
| kind | The kind of knowledge source, which is file in this case. | String | No | Yes |
| description | A description of the knowledge source. | String | Yes | No |
| encryptionKey | A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. | Object | Yes | No |
| fileParameters | Parameters specific to file knowledge sources: ingestionParameters. | Object | Only nested model credentials are editable | No |
Ingestion parameters properties
The following ingestion parameter properties control how uploaded files are processed.
| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| ContentExtractionMode | Controls how content is extracted from files. File knowledge sources support only minimal. | String | No | No |
| EmbeddingModel | A vectorizer that generates embeddings for content during ingestion and for queries at retrieval time. Supported Kind values are azureOpenAI, customWebApi, aiServicesVision, and aml. | Object | Vectorizer credentials are editable | No |

| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| content_extraction_mode | Controls how content is extracted from files. File knowledge sources support only minimal. | String | No | No |
| embedding_model | A vectorizer that generates embeddings for content during ingestion and for queries at retrieval time. Supported kind values are azureOpenAI, customWebApi, aiServicesVision, and aml. | Object | Vectorizer credentials are editable | No |

| Name | Description | Type | Editable | Required |
|---|---|---|---|---|
| contentExtractionMode | Controls how content is extracted from files. File knowledge sources support only minimal. | String | No | No |
| embeddingModel | A vectorizer that generates embeddings for content during ingestion and for queries at retrieval time. Supported kind values are azureOpenAI, customWebApi, aiServicesVision, and aml. | Object | Vectorizer credentials are editable | No |
Upload files
After the knowledge source exists, upload files directly to it. Each upload is a synchronous call: Azure AI Search extracts content from the uploaded file, chunks the content, creates embeddings when needed, and prepares the extracted content for retrieval before the call returns. You don't have to configure or run a separate ingestion pipeline.
The request body contains the file content. The listed fileName is taken from the Content-Disposition: attachment; filename="..." header on the upload request. If the header isn't set, the service assigns an auto-generated fileName. SDKs can set the header through the upload method parameters shown in the following examples.
using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
string fileName = "installation-guide.pdf";
byte[] fileBytes = await File.ReadAllBytesAsync(fileName);
string contentDisposition = $"attachment; filename=\"{fileName}\"";
KnowledgeSourceFile uploadedFile = (await indexClient.UploadKnowledgeSourceFileAsync(
    "my-file-ks",
    contentDisposition,
    BinaryData.FromBytes(fileBytes))).Value;
Console.WriteLine($"Uploaded file ID: {uploadedFile.FileId}");
from pathlib import Path
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))
file_path = Path("installation-guide.pdf")
uploaded_file = index_client.upload_knowledge_source_file(
    "my-file-ks",
    file_path.read_bytes(),
    filename=file_path.name,
)
print(f"Uploaded file ID: {uploaded_file.file_id}")
POST {{search-url}}/knowledgesources/my-file-ks/files?api-version=2026-05-01-preview
api-key: {{api-key}}
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="installation-guide.pdf"
<binary file content>
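For callers that don't use an SDK, the REST request above can be assembled with the Python standard library. This is a minimal sketch under stated assumptions: `build_upload_request` is a hypothetical helper, the endpoint and key are placeholders, and the function only builds the request without sending it. It handles ASCII file names only; escaping backslashes and double quotes in the file name is a precaution, not documented service behavior.

```python
import urllib.request
from urllib.parse import quote


def build_upload_request(search_url, knowledge_source, api_key, file_name, content):
    """Build (but don't send) the POST request that uploads one file."""
    # Escape backslashes and double quotes so the quoted filename stays well formed.
    safe_name = file_name.replace("\\", "\\\\").replace('"', '\\"')
    url = (
        f"{search_url}/knowledgesources/{quote(knowledge_source, safe='')}"
        "/files?api-version=2026-05-01-preview"
    )
    return urllib.request.Request(
        url,
        data=content,  # the request body is the raw file content
        method="POST",
        headers={
            "api-key": api_key,
            "Content-Type": "application/octet-stream",
            "Content-Disposition": f'attachment; filename="{safe_name}"',
        },
    )


request = build_upload_request(
    "https://example.search.windows.net",  # hypothetical endpoint
    "my-file-ks",
    "<api-key>",  # placeholder, not a real key
    "installation-guide.pdf",
    b"%PDF-1.7 sample bytes",  # stand-in for real file bytes
)
print(request.get_method(), request.full_url)
print(request.get_header("Content-disposition"))
```

Passing the returned object to `urllib.request.urlopen` would send the upload. That step is left out so the sketch runs without a search service.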
Note
Uploading a file doesn't replace an existing file, even if you reuse the same fileName. Each upload creates a new file with its own fileId, so the list of uploaded files can contain multiple entries that share a fileName.
To replace content, delete the prior file by fileId before or after the new upload.
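Because repeated uploads accumulate entries with the same fileName, it can help to work out which fileId values are stale before deleting anything. The following sketch operates on entries shaped like the list response (fileId, fileName, createdAt). The helper name `superseded_file_ids` and the sample IDs are hypothetical, and the rule "keep only the newest entry per fileName" is an assumption about what replacing content means for your scenario.

```python
from collections import defaultdict
from datetime import datetime


def _parse_timestamp(value):
    # Assumption: timestamps are ISO 8601 UTC strings that end in "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def superseded_file_ids(files):
    """Return the fileId of every entry except the newest one for each fileName."""
    by_name = defaultdict(list)
    for entry in files:
        by_name[entry["fileName"]].append(entry)

    stale = []
    for entries in by_name.values():
        entries.sort(key=lambda e: _parse_timestamp(e["createdAt"]))
        stale.extend(e["fileId"] for e in entries[:-1])  # keep only the last (newest)
    return stale


# Hypothetical listing with two uploads that share a fileName.
listing = [
    {"fileId": "file-a", "fileName": "guide.pdf", "createdAt": "2026-05-07T18:10:00Z"},
    {"fileId": "file-b", "fileName": "guide.pdf", "createdAt": "2026-05-07T19:00:00Z"},
    {"fileId": "file-c", "fileName": "faq.md", "createdAt": "2026-05-07T18:30:00Z"},
]
print(superseded_file_ids(listing))
```

Each returned ID can then be passed to the delete operation for uploaded files.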
List uploaded files
List files on the knowledge source to inspect the uploaded file set.
using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
await foreach (KnowledgeSourceFile file in indexClient.GetKnowledgeSourceFilesAsync("my-file-ks"))
{
    Console.WriteLine($"{file.FileName} ({file.FileSizeBytes} bytes) error={file.ErrorMessage}");
}
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))
for file in index_client.list_knowledge_source_files("my-file-ks"):
    print(f"{file.file_name} ({file.file_size_bytes} bytes) error={file.error_message}")
GET {{search-url}}/knowledgesources/my-file-ks/files?api-version=2026-05-01-preview
api-key: {{api-key}}
A response includes metadata for each uploaded file. The errorMessage value is null when the upload is processed without an error.
{
  "value": [
    {
      "fileId": "file-abc123",
      "fileName": "installation-guide.txt",
      "fileSizeBytes": 89,
      "createdAt": "2026-05-07T18:10:00Z",
      "lastUpdatedAt": "2026-05-07T18:14:00.803Z",
      "errorMessage": null
    }
  ]
}
Because uploads are synchronous, a file is ready for retrieval as soon as its upload call succeeds. If processing fails, the upload response and any subsequent list entry include a non-null errorMessage. Review the value for unsupported file types, extraction failures, model access issues, or quota limits.
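The errorMessage check described above can be applied to a whole list response at once. This is a small sketch that works on already-parsed JSON. The helper name `failed_files` is hypothetical, and the sample error text is invented for illustration rather than taken from the service.

```python
def failed_files(list_response):
    """Return (fileName, errorMessage) pairs for entries whose processing failed."""
    return [
        (entry["fileName"], entry["errorMessage"])
        for entry in list_response.get("value", [])
        if entry.get("errorMessage")  # null (None) means no error was recorded
    ]


# Hypothetical response; the error text is a made-up example.
response = {
    "value": [
        {"fileId": "file-abc123", "fileName": "installation-guide.txt", "errorMessage": None},
        {"fileId": "file-def456", "fileName": "diagram.png", "errorMessage": "Unsupported file type."},
    ]
}
for name, message in failed_files(response):
    print(f"{name}: {message}")
```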
Delete uploaded files
Delete files from the knowledge source when you no longer want them available for retrieval.
using Azure;
using Azure.Search.Documents.Indexes;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
await indexClient.DeleteKnowledgeSourceFileAsync("my-file-ks", "file-abc123");
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))
index_client.delete_knowledge_source_file("my-file-ks", "file-abc123")
DELETE {{search-url}}/knowledgesources/my-file-ks/files/file-abc123?api-version=2026-05-01-preview
api-key: {{api-key}}
Assign to a knowledge base
If you're satisfied with the knowledge source, add it to a knowledge base.
Query a knowledge base
After the knowledge base is configured, call the retrieve action or MCP endpoint to query the knowledge source.
Delete a knowledge source
Before you can delete a knowledge source, you must delete any knowledge base that references it or update the knowledge base definition to remove the reference. For knowledge sources that generate an index and indexer pipeline, all generated objects are also deleted. However, if you used an existing index to create a knowledge source, your index isn't deleted.
If you try to delete a knowledge source that's in use, the action fails and returns a list of affected knowledge bases.
To delete a knowledge source:
Get a list of all knowledge bases on your search service.
using Azure.Search.Documents.Indexes;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
var knowledgeBases = indexClient.GetKnowledgeBasesAsync();
Console.WriteLine("Knowledge Bases:");
await foreach (var kb in knowledgeBases)
{
    Console.WriteLine($" - {kb.Name}");
}
Reference: SearchIndexClient
An example response might look like the following:
{
  "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
  "value": [
    { "name": "my-kb" },
    { "name": "my-kb-2" }
  ]
}
Get an individual knowledge base definition to check for knowledge source references.
using Azure.Search.Documents.Indexes;
using System.Text.Json;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
// Specify the knowledge base name to retrieve
string kbNameToGet = "earth-knowledge-base";
// Get a specific knowledge base definition
var knowledgeBaseResponse = await indexClient.GetKnowledgeBaseAsync(kbNameToGet);
var kb = knowledgeBaseResponse.Value;
// Serialize to JSON for display
string json = JsonSerializer.Serialize(kb, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
Reference: SearchIndexClient
An example response might look like the following:
{
  "Name": "earth-knowledge-base",
  "KnowledgeSources": [
    { "Name": "earth-knowledge-source" }
  ],
  "Models": [ {} ],
  "RetrievalReasoningEffort": {},
  "OutputMode": {},
  "ETag": "\u00220x8DE278629D782B3\u0022",
  "EncryptionKey": null,
  "Description": null,
  "RetrievalInstructions": null,
  "AnswerInstructions": null
}
Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.
using Azure.Search.Documents.Indexes;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
await indexClient.DeleteKnowledgeBaseAsync(knowledgeBaseName);
System.Console.WriteLine($"Knowledge base '{knowledgeBaseName}' deleted successfully.");
Reference: SearchIndexClient
Delete the knowledge source.
await indexClient.DeleteKnowledgeSourceAsync(knowledgeSourceName);
System.Console.WriteLine($"Knowledge source '{knowledgeSourceName}' deleted successfully.");
Reference: SearchIndexClient
Get a list of all knowledge bases on your search service.
# Get knowledge bases
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
print("Knowledge Bases:")
for kb in index_client.list_knowledge_bases():
    print(f" - {kb.name}")
Reference: SearchIndexClient
An example response might look like the following:
{
  "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
  "value": [
    { "name": "my-kb" },
    { "name": "my-kb-2" }
  ]
}
Get an individual knowledge base definition to check for knowledge source references.
# Get a knowledge base definition
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
kb = index_client.get_knowledge_base("knowledge_base_name")
print(kb)
Reference: SearchIndexClient
An example response might look like the following:
{
  "name": "my-kb",
  "description": null,
  "retrievalInstructions": null,
  "answerInstructions": null,
  "outputMode": null,
  "knowledgeSources": [
    { "name": "my-blob-ks" }
  ],
  "models": [],
  "encryptionKey": null,
  "retrievalReasoningEffort": { "kind": "low" }
}
Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.
# Delete a knowledge base
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
index_client.delete_knowledge_base("knowledge_base_name")
print("Knowledge base deleted successfully.")
Reference: SearchIndexClient
Delete the knowledge source.
# Delete a knowledge source
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
index_client.delete_knowledge_source("knowledge_source_name")
print("Knowledge source deleted successfully.")
Reference: SearchIndexClient
Get a list of all knowledge bases on your search service.
### Get knowledge bases
GET {{search-url}}/knowledgebases?api-version={{api-version}}&$select=name
api-key: {{api-key}}
Reference: Knowledge Bases - List
An example response might look like the following:
{
  "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
  "value": [
    { "name": "my-kb" },
    { "name": "my-kb-2" }
  ]
}
Get an individual knowledge base definition to check for knowledge source references.
### Get a knowledge base definition
GET {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version={{api-version}}
api-key: {{api-key}}
Reference: Knowledge Bases - Get
An example response might look like the following:
{
  "name": "my-kb",
  "description": null,
  "retrievalInstructions": null,
  "answerInstructions": null,
  "outputMode": null,
  "knowledgeSources": [
    { "name": "my-blob-ks" }
  ],
  "models": [],
  "encryptionKey": null,
  "retrievalReasoningEffort": { "kind": "low" }
}
Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.
### Delete a knowledge base
DELETE {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version={{api-version}}
api-key: {{api-key}}
Reference: Knowledge Bases - Delete
Delete the knowledge source.
### Delete a knowledge source
DELETE {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version={{api-version}}
api-key: {{api-key}}
Reference: Knowledge Sources - Delete
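The reference check in the steps above can be done in code once the knowledge base definitions are fetched. This sketch assumes each definition is a parsed JSON object in the REST shape shown in the example responses (name and knowledgeSources). The helper name `knowledge_bases_referencing` and the sample data are hypothetical.

```python
def knowledge_bases_referencing(knowledge_bases, knowledge_source_name):
    """Return the names of knowledge bases that reference the given knowledge source."""
    return [
        kb["name"]
        for kb in knowledge_bases
        if any(
            ref.get("name") == knowledge_source_name
            for ref in kb.get("knowledgeSources", [])
        )
    ]


# Hypothetical definitions, trimmed to the two fields the check needs.
definitions = [
    {"name": "my-kb", "knowledgeSources": [{"name": "my-file-ks"}, {"name": "my-blob-ks"}]},
    {"name": "my-kb-2", "knowledgeSources": [{"name": "my-blob-ks"}]},
]
print(knowledge_bases_referencing(definitions, "my-file-ks"))
```

An empty result suggests that no fetched knowledge base references the source, so the delete request is less likely to fail for that reason.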