Create a file knowledge source (preview)

Important

These features and functionality are part of the 2026-05-01-preview REST API. The 2026-05-01-preview is licensed to you as part of your Azure subscription and is subject to the terms applicable to "Previews" in the Microsoft Product Terms, the Microsoft Products and Services Data Protection Addendum ("DPA"), and the Supplemental Terms of Use for Microsoft Azure Previews.

The 2026-05-01-preview supports connections to other Microsoft services and third-party services. Use of these services is subject to their respective terms and might result in data processing or storage outside of the Azure compliance boundary, as well as data flowing into the Azure compliance boundary.

It's your responsibility to manage whether your data will flow outside of your organization's compliance and geographic boundaries and any related implications, and that appropriate permissions, boundaries, and approvals are provisioned.

You're responsible for carefully reviewing and testing applications you build in the context of your specific use cases and making all appropriate decisions and customizations. This includes implementing your own responsible AI mitigations, such as metaprompts, content filters, or other safety systems, and ensuring your applications meet appropriate quality, reliability, security, and trustworthiness standards. For more information, see the Azure AI Search Transparency Note.

A file knowledge source (preview) uploads small and medium file sets directly to Azure AI Search for agentic retrieval. Knowledge sources are created independently, referenced in a knowledge base, and used as grounding data when the knowledge base is queried at runtime.

File knowledge sources are useful when you want a managed upload experience instead of provisioning Azure Storage, configuring access, and creating an indexer pipeline over an external container. Azure AI Search processes uploaded files so their extracted content can be retrieved from a knowledge base.

If your content already lives in Azure Blob Storage or ADLS Gen2, or if you need large-scale ingestion or storage account capabilities, use a blob knowledge source instead.

Usage support

Azure portal Microsoft Foundry portal .NET SDK Python SDK Java SDK JavaScript SDK REST API
✔️ ✔️ ✔️ ✔️ ✔️ ✔️

Prerequisites

  • A dedicated Azure AI Search service in any region that provides agentic retrieval. File knowledge sources aren't supported on serverless search services. For more information about dedicated tiers, see Choose a service tier. If you need paid usage beyond the monthly free allowance, set the knowledgeRetrieval service property to standard by using the Search Management REST API.

  • Files in a supported format.

  • Permissions to create knowledge sources. Configure keyless authentication with the Search Service Contributor role assigned to your user account (recommended) or use an API key.

  • If the knowledge source specifies an Azure OpenAI model for embeddings, the search service must have a managed identity with Cognitive Services User permissions on the Microsoft Foundry resource.

Supported formats and limits

The following file types are supported.

Category Extensions
Text .txt, .md, .html, .json, .csv
Code .c, .cs, .cpp, .java, .py, .js, .ts, .php, .rb, .sh
Documents .pdf, .docx, .pptx, .doc

The following limits apply to file knowledge sources.

Limit Value
Maximum file size per upload 50 MB
Maximum files per file knowledge source 100

Note

Uploaded content is stored in the generated search index. For total storage limits by pricing tier, see Service limits.
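The service enforces these limits, but a local pre-check can catch obvious problems before you make an upload call. The following is a minimal sketch that uses only the Python standard library. `check_upload` and its constants are hypothetical names, not part of any SDK. The extension list is copied from the table above. The sketch assumes that 50 MB means 50 × 1024 × 1024 bytes and that extensions are matched case-insensitively; neither assumption is confirmed here.

```python
from pathlib import PurePath

# Extensions copied from the supported-formats table.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".html", ".json", ".csv",
    ".c", ".cs", ".cpp", ".java", ".py", ".js", ".ts", ".php", ".rb", ".sh",
    ".pdf", ".docx", ".pptx", ".doc",
}
# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024
MAX_FILES_PER_SOURCE = 100


def check_upload(file_name: str, size_bytes: int, existing_file_count: int) -> list:
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    # Assumption: extension matching is case-insensitive.
    extension = PurePath(file_name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_SIZE_BYTES}")
    if existing_file_count >= MAX_FILES_PER_SOURCE:
        problems.append(f"knowledge source already holds {existing_file_count} files")
    return problems


print(check_upload("installation-guide.pdf", 2_000_000, existing_file_count=12))
```

A passing check only means the file is worth sending; the service can still reject it for other reasons.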

Check for existing knowledge sources

A knowledge source is a top-level, reusable object. Listing existing knowledge sources helps you find one to reuse or choose a name that isn't already taken.

Run the following code to list knowledge sources by name and type.

// List knowledge sources by name and type
using Azure.Search.Documents.Indexes;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
var knowledgeSources = indexClient.GetKnowledgeSourcesAsync();

Console.WriteLine("Knowledge Sources:");

await foreach (var ks in knowledgeSources)
{
    Console.WriteLine($"  Name: {ks.Name}, Type: {ks.GetType().Name}");
}

Reference: SearchIndexClient

# List knowledge sources by name and type
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))

for ks in index_client.list_knowledge_sources():
    print(f"  - {ks.name} ({ks.kind})")

Reference: SearchIndexClient

### List knowledge sources by name and type
GET {{search-url}}/knowledgesources?api-version={{api-version}}&$select=name,kind
api-key: {{api-key}}

Reference: Knowledge Sources - List

You can also return a single knowledge source by name to review its JSON definition.

using Azure.Search.Documents.Indexes;
using System.Text.Json;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);

// Specify the knowledge source name to retrieve
string ksNameToGet = "earth-knowledge-source";

// Get its definition
var knowledgeSourceResponse = await indexClient.GetKnowledgeSourceAsync(ksNameToGet);
var ks = knowledgeSourceResponse.Value;

// Serialize to JSON for display
var jsonOptions = new JsonSerializerOptions 
{ 
    WriteIndented = true,
    DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.Never
};
Console.WriteLine(JsonSerializer.Serialize(ks, ks.GetType(), jsonOptions));

Reference: SearchIndexClient

# Get a knowledge source definition
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
import json

index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))

ks = index_client.get_knowledge_source("knowledge_source_name")
print(json.dumps(ks.as_dict(), indent = 2))

Reference: SearchIndexClient

### Get a knowledge source definition
GET {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version={{api-version}}
api-key: {{api-key}}

Reference: Knowledge Sources - Get

The following JSON is an example response for a file knowledge source.

{
  "name": "my-file-ks",
  "kind": "file",
  "description": "A sample file knowledge source.",
  "encryptionKey": null,
  "fileParameters": {
    "ingestionParameters": {
      "contentExtractionMode": "minimal",
      "embeddingModel": {
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "<REDACTED>",
          "deploymentId": "text-embedding-3-large",
          "modelName": "text-embedding-3-large"
        }
      }
    }
  }
}
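When you need only a few fields from a definition like this one, plain JSON handling is enough. The following sketch flattens the nested ingestion settings into a short summary. `summarize_file_knowledge_source` is a hypothetical helper, and the property names are taken from the example response above.

```python
import json

# Trimmed copy of the example response shown above.
definition_json = """
{
  "name": "my-file-ks",
  "kind": "file",
  "fileParameters": {
    "ingestionParameters": {
      "contentExtractionMode": "minimal",
      "embeddingModel": {
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "<REDACTED>",
          "deploymentId": "text-embedding-3-large",
          "modelName": "text-embedding-3-large"
        }
      }
    }
  }
}
"""


def summarize_file_knowledge_source(definition: dict) -> dict:
    """Pull the commonly inspected settings out of a file knowledge source definition."""
    ingestion = (definition.get("fileParameters") or {}).get("ingestionParameters") or {}
    embedding = ingestion.get("embeddingModel") or {}
    aoai = embedding.get("azureOpenAIParameters") or {}
    return {
        "name": definition.get("name"),
        "kind": definition.get("kind"),
        "contentExtractionMode": ingestion.get("contentExtractionMode"),
        "embeddingKind": embedding.get("kind"),
        "embeddingDeployment": aoai.get("deploymentId"),
    }


summary = summarize_file_knowledge_source(json.loads(definition_json))
print(json.dumps(summary, indent=2))
```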

Create a knowledge source

Create a file knowledge source that specifies the embedding model used to vectorize uploaded content.

using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));

var embeddingParams = new AzureOpenAIVectorizerParameters
{
    ResourceUri = new Uri(aoaiEndpoint),
    DeploymentName = aoaiEmbeddingDeployment,
    ModelName = aoaiEmbeddingModel
};

var ingestionParams = new KnowledgeSourceIngestionParameters
{
    ContentExtractionMode = "minimal",
    EmbeddingModel = new KnowledgeSourceAzureOpenAIVectorizer
    {
        AzureOpenAIParameters = embeddingParams
    }
};

var fileParams = new FileKnowledgeSourceParameters
{
    IngestionParameters = ingestionParams
};

var knowledgeSource = new FileKnowledgeSource(
    name: "my-file-ks",
    fileParameters: fileParams
)
{
    Description = "This knowledge source uses directly uploaded product manuals."
};

await indexClient.CreateOrUpdateKnowledgeSourceAsync(knowledgeSource);
Console.WriteLine($"Knowledge source '{knowledgeSource.Name}' created or updated successfully.");

Reference: SearchIndexClient

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    AzureOpenAIVectorizerParameters,
    FileKnowledgeSource,
    FileKnowledgeSourceParameters,
)
from azure.search.documents.knowledgebases.models import (
    KnowledgeSourceAzureOpenAIVectorizer,
    KnowledgeSourceIngestionParameters,
)

index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

embedding_params = AzureOpenAIVectorizerParameters(
    resource_url="aoai_endpoint",
    deployment_name="aoai_embedding_deployment",
    model_name="aoai_embedding_model",
)

ingestion_params = KnowledgeSourceIngestionParameters(
    content_extraction_mode="minimal",
    embedding_model=KnowledgeSourceAzureOpenAIVectorizer(
        azure_open_ai_parameters=embedding_params
    ),
)

knowledge_source = FileKnowledgeSource(
    name="my-file-ks",
    description="This knowledge source uses directly uploaded product manuals.",
    file_parameters=FileKnowledgeSourceParameters(ingestion_parameters=ingestion_params),
)

index_client.create_or_update_knowledge_source(knowledge_source=knowledge_source)
print(f"Knowledge source '{knowledge_source.name}' created or updated successfully.")

Reference: SearchIndexClient

PUT {{search-url}}/knowledgesources/my-file-ks?api-version=2026-05-01-preview
api-key: {{api-key}}
Content-Type: application/json
Prefer: return=representation

{
  "name": "my-file-ks",
  "kind": "file",
  "description": "This knowledge source uses directly uploaded product manuals.",
  "encryptionKey": null,
  "fileParameters": {
    "ingestionParameters": {
      "embeddingModel": {
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "{{aoai-endpoint}}",
          "deploymentId": "{{aoai-embedding-deployment}}",
          "modelName": "{{aoai-embedding-model}}"
        }
      },
      "contentExtractionMode": "minimal"
    }
  }
}

Reference: Knowledge Sources - Create or Update
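If you call the REST API from a script, you might prefer to assemble the request body in code rather than by hand. The following sketch builds the same JSON shape as the request above. `build_file_knowledge_source_body` is a hypothetical helper, and the endpoint, deployment, and model values passed to it are placeholders that you replace with your own.

```python
import json


def build_file_knowledge_source_body(
    name: str,
    description: str,
    resource_uri: str,
    deployment_id: str,
    model_name: str,
) -> dict:
    """Build a request body that mirrors the REST example for a file knowledge source."""
    return {
        "name": name,
        "kind": "file",
        "description": description,
        "encryptionKey": None,
        "fileParameters": {
            "ingestionParameters": {
                "embeddingModel": {
                    "kind": "azureOpenAI",
                    "azureOpenAIParameters": {
                        "resourceUri": resource_uri,
                        "deploymentId": deployment_id,
                        "modelName": model_name,
                    },
                },
                # The only extraction mode listed for file knowledge sources.
                "contentExtractionMode": "minimal",
            }
        },
    }


body = build_file_knowledge_source_body(
    name="my-file-ks",
    description="This knowledge source uses directly uploaded product manuals.",
    resource_uri="https://example-resource.openai.azure.com",  # placeholder
    deployment_id="text-embedding-3-large",                    # placeholder
    model_name="text-embedding-3-large",                       # placeholder
)
print(json.dumps(body, indent=2))
```

The resulting dictionary serializes to the JSON payload for the PUT request; sending it is left to whichever HTTP client you already use.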

Source-specific properties

The following properties apply to file knowledge sources.

Name Description Type Editable Required
Name The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. String No Yes
Description A description of the knowledge source. String Yes No
EncryptionKey A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. Object Yes No
FileParameters Parameters specific to file knowledge sources: IngestionParameters. Object Only nested model credentials are editable No

Name Description Type Editable Required
name The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. String No Yes
description A description of the knowledge source. String Yes No
encryption_key A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. Object Yes No
file_parameters Parameters specific to file knowledge sources: ingestion_parameters. Object Only nested model credentials are editable No

Name Description Type Editable Required
name The name of the knowledge source, which must be unique within the knowledge sources collection and follow the naming guidelines for objects in Azure AI Search. String No Yes
kind The kind of knowledge source, which is file in this case. String No Yes
description A description of the knowledge source. String Yes No
encryptionKey A customer-managed key to encrypt sensitive information in both the knowledge source and the generated objects. Object Yes No
fileParameters Parameters specific to file knowledge sources: ingestionParameters. Object Only nested model credentials are editable No

Ingestion parameters properties

The following ingestion parameter properties control how uploaded files are processed.

Name Description Type Editable Required
ContentExtractionMode Controls how content is extracted from files. File knowledge sources support only minimal. String No No
EmbeddingModel A vectorizer that generates embeddings for content during ingestion and for queries at retrieval time. Supported Kind values are azureOpenAI, customWebApi, aiServicesVision, and aml. Object Vectorizer credentials are editable No

Name Description Type Editable Required
content_extraction_mode Controls how content is extracted from files. File knowledge sources support only minimal. String No No
embedding_model A vectorizer that generates embeddings for content during ingestion and for queries at retrieval time. Supported kind values are azureOpenAI, customWebApi, aiServicesVision, and aml. Object Vectorizer credentials are editable No

Name Description Type Editable Required
contentExtractionMode Controls how content is extracted from files. File knowledge sources support only minimal. String No No
embeddingModel A vectorizer that generates embeddings for content during ingestion and for queries at retrieval time. Supported kind values are azureOpenAI, customWebApi, aiServicesVision, and aml. Object Vectorizer credentials are editable No
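The Editable column matters when you update an existing knowledge source. As a rough illustration, the following sketch compares an existing REST definition with a proposed one and reports changes to properties that the tables mark as not editable: name, kind, and contentExtractionMode. `find_non_editable_changes` is a hypothetical helper, and the check is purely local; the service decides what an update request can actually change.

```python
def _extraction_mode(definition: dict):
    """Read fileParameters.ingestionParameters.contentExtractionMode, or None if absent."""
    ingestion = (definition.get("fileParameters") or {}).get("ingestionParameters") or {}
    return ingestion.get("contentExtractionMode")


def find_non_editable_changes(existing: dict, proposed: dict) -> list:
    """List properties marked 'Editable: No' whose values differ between the two definitions."""
    changed = []
    for prop in ("name", "kind"):
        if existing.get(prop) != proposed.get(prop):
            changed.append(prop)
    if _extraction_mode(existing) != _extraction_mode(proposed):
        changed.append("fileParameters.ingestionParameters.contentExtractionMode")
    return changed


existing = {
    "name": "my-file-ks",
    "kind": "file",
    "description": "A sample file knowledge source.",
    "fileParameters": {"ingestionParameters": {"contentExtractionMode": "minimal"}},
}

# Changing only the description touches an editable property, so nothing is reported.
print(find_non_editable_changes(existing, dict(existing, description="Updated text.")))

# Changing the name is reported because the tables mark it as not editable.
print(find_non_editable_changes(existing, dict(existing, name="renamed-ks")))
```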

Upload files

After the knowledge source exists, upload files directly to it. Each upload is a synchronous call: Azure AI Search extracts content from the uploaded file, chunks the content, creates embeddings when needed, and prepares the extracted content for retrieval before the call returns. You don't have to configure or run a separate ingestion pipeline.

The request body contains the file content. The listed fileName is taken from the Content-Disposition: attachment; filename="..." header on the upload request. If the header isn't set, the service assigns an auto-generated fileName. SDKs can set the header through the upload method parameters shown in the following examples.
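As a sketch of the header format described above, the following helper builds the Content-Disposition value for a given file name. `content_disposition_for` is a hypothetical function, not an SDK method. It assumes that backslashes and double quotes are escaped as in an HTTP quoted-string, and it doesn't address non-ASCII file names, whose handling by the service isn't covered here.

```python
def content_disposition_for(file_name: str) -> str:
    """Build an 'attachment; filename="..."' header value for an upload request."""
    # Assumption: escape backslashes and double quotes so the quoted value stays well formed.
    escaped = file_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'attachment; filename="{escaped}"'


# Headers for a raw HTTP upload request.
headers = {
    "Content-Type": "application/octet-stream",
    "Content-Disposition": content_disposition_for("installation-guide.pdf"),
}
print(headers["Content-Disposition"])
```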

using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));

string fileName = "installation-guide.pdf";
byte[] fileBytes = await File.ReadAllBytesAsync(fileName);
string contentDisposition = $"attachment; filename=\"{fileName}\"";

KnowledgeSourceFile uploadedFile = (await indexClient.UploadKnowledgeSourceFileAsync(
    "my-file-ks",
    contentDisposition,
    BinaryData.FromBytes(fileBytes))).Value;

Console.WriteLine($"Uploaded file ID: {uploadedFile.FileId}");

from pathlib import Path

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

file_path = Path("installation-guide.pdf")
uploaded_file = index_client.upload_knowledge_source_file(
    "my-file-ks",
    file_path.read_bytes(),
    filename=file_path.name,
)
print(f"Uploaded file ID: {uploaded_file.file_id}")

POST {{search-url}}/knowledgesources/my-file-ks/files?api-version=2026-05-01-preview
api-key: {{api-key}}
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="installation-guide.pdf"

<binary file content>

Note

Uploading a file doesn't replace an existing file, even if you reuse the same fileName. Each upload creates a new file with its own fileId, so the list of uploaded files can contain multiple entries that share a fileName.

To replace content, delete the prior file by fileId before or after the new upload.
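Because repeated uploads accumulate, you need to know which entries are superseded before you delete anything. The following sketch takes file entries with `fileId`, `fileName`, and `createdAt` properties (the property names used by the list operation) and returns the IDs of all but the newest entry for each file name. `stale_file_ids` is a hypothetical helper, the sample entries are invented for illustration, and the sketch assumes `createdAt` is an ISO 8601 UTC timestamp.

```python
from collections import defaultdict
from datetime import datetime


def _parse_timestamp(value: str) -> datetime:
    # Assumption: timestamps are ISO 8601 with a trailing "Z" for UTC.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stale_file_ids(files: list) -> list:
    """For each fileName uploaded more than once, return the fileIds of all but the newest entry."""
    by_name = defaultdict(list)
    for entry in files:
        by_name[entry["fileName"]].append(entry)

    stale = []
    for entries in by_name.values():
        ordered = sorted(entries, key=lambda e: _parse_timestamp(e["createdAt"]))
        stale.extend(e["fileId"] for e in ordered[:-1])  # keep only the newest upload
    return stale


# Invented sample entries: the same file name was uploaded twice.
files = [
    {"fileId": "file-old", "fileName": "installation-guide.pdf", "createdAt": "2026-05-07T18:10:00Z"},
    {"fileId": "file-new", "fileName": "installation-guide.pdf", "createdAt": "2026-05-08T09:00:00.250Z"},
    {"fileId": "file-faq", "fileName": "faq.md", "createdAt": "2026-05-07T18:12:00Z"},
]
print(stale_file_ids(files))
```

Each returned ID can then be passed to the delete operation for uploaded files.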

List uploaded files

List files on the knowledge source to inspect the uploaded file set.

using Azure;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));

await foreach (KnowledgeSourceFile file in indexClient.GetKnowledgeSourceFilesAsync("my-file-ks"))
{
    Console.WriteLine($"{file.FileName} ({file.FileSizeBytes} bytes) error={file.ErrorMessage}");
}

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

for file in index_client.list_knowledge_source_files("my-file-ks"):
    print(f"{file.file_name} ({file.file_size_bytes} bytes) error={file.error_message}")

GET {{search-url}}/knowledgesources/my-file-ks/files?api-version=2026-05-01-preview
api-key: {{api-key}}

A response includes metadata for each uploaded file. The errorMessage value is null when the upload is processed without an error.

{
  "value": [
    {
      "fileId": "file-abc123",
      "fileName": "installation-guide.txt",
      "fileSizeBytes": 89,
      "createdAt": "2026-05-07T18:10:00Z",
      "lastUpdatedAt": "2026-05-07T18:14:00.803Z",
      "errorMessage": null
    }
  ]
}

Because uploads are synchronous, a file is ready for retrieval as soon as its upload call succeeds. If processing fails, the upload response and any subsequent list entry include a non-null errorMessage. Review the value for unsupported file types, extraction failures, model access issues, or quota limits.
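To act on that guidance in a script, you can separate files that processed cleanly from files that reported an error. The following sketch works on the JSON shape of the list response shown above. `split_by_status` is a hypothetical helper, and the second sample entry, including its error text, is invented for illustration.

```python
def split_by_status(list_response: dict) -> tuple:
    """Split listed files into (ready, failed) based on whether errorMessage is null."""
    ready, failed = [], []
    for entry in list_response.get("value", []):
        if entry.get("errorMessage") is not None:
            failed.append(entry)
        else:
            ready.append(entry)
    return ready, failed


list_response = {
    "value": [
        {
            "fileId": "file-abc123",
            "fileName": "installation-guide.txt",
            "fileSizeBytes": 89,
            "errorMessage": None,
        },
        {
            # Invented entry: the error text is illustrative, not a real service message.
            "fileId": "file-def456",
            "fileName": "diagram.bin",
            "fileSizeBytes": 2048,
            "errorMessage": "Illustrative error text.",
        },
    ]
}

ready, failed = split_by_status(list_response)
for entry in failed:
    print(f"{entry['fileName']} ({entry['fileId']}): {entry['errorMessage']}")
```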

Delete uploaded files

Delete files from the knowledge source when you no longer want them available for retrieval.

using Azure;
using Azure.Search.Documents.Indexes;

var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));

await indexClient.DeleteKnowledgeSourceFileAsync("my-file-ks", "file-abc123");

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

index_client = SearchIndexClient(endpoint="search_url", credential=AzureKeyCredential("api_key"))

index_client.delete_knowledge_source_file("my-file-ks", "file-abc123")

DELETE {{search-url}}/knowledgesources/my-file-ks/files/file-abc123?api-version=2026-05-01-preview
api-key: {{api-key}}

Assign to a knowledge base

If you're satisfied with the knowledge source, add it to a knowledge base.

Query a knowledge base

After the knowledge base is configured, call the retrieve action or MCP endpoint to query the knowledge source.

Delete a knowledge source

Before you can delete a knowledge source, you must delete any knowledge base that references it or update the knowledge base definition to remove the reference. For knowledge sources that generate an index and indexer pipeline, all generated objects are also deleted. However, if you used an existing index to create a knowledge source, your index isn't deleted.

If you try to delete a knowledge source that's in use, the action fails and returns a list of affected knowledge bases.

To delete a knowledge source:

  1. Get a list of all knowledge bases on your search service.

    using Azure.Search.Documents.Indexes;
    
    var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
    var knowledgeBases = indexClient.GetKnowledgeBasesAsync();
    
    Console.WriteLine("Knowledge Bases:");
    
    await foreach (var kb in knowledgeBases)
    {
        Console.WriteLine($"  - {kb.Name}");
    }
    

    Reference: SearchIndexClient

    An example response might look like the following:

     {
         "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
         "value": [
         {
             "name": "my-kb"
         },
         {
             "name": "my-kb-2"
         }
         ]
     }
    
  2. Get an individual knowledge base definition to check for knowledge source references.

    using Azure.Search.Documents.Indexes;
    using System.Text.Json;
    
    var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
    
    // Specify the knowledge base name to retrieve
    string kbNameToGet = "earth-knowledge-base";
    
    // Get a specific knowledge base definition
    var knowledgeBaseResponse = await indexClient.GetKnowledgeBaseAsync(kbNameToGet);
    var kb = knowledgeBaseResponse.Value;
    
    // Serialize to JSON for display
    string json = JsonSerializer.Serialize(kb, new JsonSerializerOptions { WriteIndented = true });
    Console.WriteLine(json);
    

    Reference: SearchIndexClient

    An example response might look like the following:

     {
       "Name": "earth-knowledge-base",
       "KnowledgeSources": [
         {
           "Name": "earth-knowledge-source"
         }
       ],
       "Models": [
         {}
       ],
       "RetrievalReasoningEffort": {},
       "OutputMode": {},
       "ETag": "\u00220x8DE278629D782B3\u0022",
       "EncryptionKey": null,
       "Description": null,
       "RetrievalInstructions": null,
       "AnswerInstructions": null
     }
    
  3. Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.

    using Azure.Search.Documents.Indexes;
    var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
    
    await indexClient.DeleteKnowledgeBaseAsync(knowledgeBaseName);
    System.Console.WriteLine($"Knowledge base '{knowledgeBaseName}' deleted successfully.");
    

    Reference: SearchIndexClient

  4. Delete the knowledge source.

    await indexClient.DeleteKnowledgeSourceAsync(knowledgeSourceName);
    System.Console.WriteLine($"Knowledge source '{knowledgeSourceName}' deleted successfully.");
    

    Reference: SearchIndexClient

  1. Get a list of all knowledge bases on your search service.

    # Get knowledge bases
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient
    
    index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
    
    print("Knowledge Bases:")
    for kb in index_client.list_knowledge_bases():
        print(f"  - {kb.name}")
    

    Reference: SearchIndexClient

    An example response might look like the following:

     {
         "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
         "value": [
         {
             "name": "my-kb"
         },
         {
             "name": "my-kb-2"
         }
         ]
     }
    
  2. Get an individual knowledge base definition to check for knowledge source references.

    # Get a knowledge base definition
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents.indexes import SearchIndexClient
    
    index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
    kb = index_client.get_knowledge_base("knowledge_base_name")
    print(kb)
    

    Reference: SearchIndexClient

    An example response might look like the following:

     {
       "name": "my-kb",
       "description": null,
       "retrievalInstructions": null,
       "answerInstructions": null,
       "outputMode": null,
       "knowledgeSources": [
         {
           "name": "my-blob-ks"
         }
       ],
       "models": [],
       "encryptionKey": null,
       "retrievalReasoningEffort": {
         "kind": "low"
       }
     }
    
  3. Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.

    # Delete a knowledge base
    from azure.core.credentials import AzureKeyCredential 
    from azure.search.documents.indexes import SearchIndexClient
    
    index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
    index_client.delete_knowledge_base("knowledge_base_name")
    print("Knowledge base deleted successfully.")
    

    Reference: SearchIndexClient

  4. Delete the knowledge source.

    # Delete a knowledge source
    from azure.core.credentials import AzureKeyCredential 
    from azure.search.documents.indexes import SearchIndexClient
    
    index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key"))
    index_client.delete_knowledge_source("knowledge_source_name")
    print("Knowledge source deleted successfully.")
    

    Reference: SearchIndexClient

  1. Get a list of all knowledge bases on your search service.

    ### Get knowledge bases
    GET {{search-url}}/knowledgebases?api-version={{api-version}}&$select=name
    api-key: {{api-key}}
    

    Reference: Knowledge Bases - List

    An example response might look like the following:

     {
         "@odata.context": "https://my-search-service.search.windows.net/$metadata#knowledgebases(name)",
         "value": [
         {
             "name": "my-kb"
         },
         {
             "name": "my-kb-2"
         }
         ]
     }
    
  2. Get an individual knowledge base definition to check for knowledge source references.

    ### Get a knowledge base definition
    GET {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version={{api-version}}
    api-key: {{api-key}}
    

    Reference: Knowledge Bases - Get

    An example response might look like the following:

     {
       "name": "my-kb",
       "description": null,
       "retrievalInstructions": null,
       "answerInstructions": null,
       "outputMode": null,
       "knowledgeSources": [
         {
           "name": "my-blob-ks"
         }
       ],
       "models": [],
       "encryptionKey": null,
       "retrievalReasoningEffort": {
         "kind": "low"
       }
     }
    
  3. Either delete the knowledge base or, if you have multiple knowledge sources, update the knowledge base to remove the source. This example shows deletion.

    ### Delete a knowledge base
    DELETE {{search-url}}/knowledgebases/{{knowledge-base-name}}?api-version={{api-version}}
    api-key: {{api-key}}
    

    Reference: Knowledge Bases - Delete

  4. Delete the knowledge source.

    ### Delete a knowledge source
    DELETE {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version={{api-version}}
    api-key: {{api-key}}
    

    Reference: Knowledge Sources - Delete
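Steps 1 and 2 amount to a reference check, which you can script once you have the knowledge base definitions as JSON. The following sketch assumes each definition has a `knowledgeSources` array of objects with a `name` property, as in the REST example responses. `knowledge_bases_referencing` is a hypothetical helper, and the sample definitions are invented for illustration.

```python
def knowledge_bases_referencing(source_name: str, knowledge_bases: list) -> list:
    """Return the names of knowledge bases whose knowledgeSources list includes source_name."""
    return [
        kb["name"]
        for kb in knowledge_bases
        if any(
            ks.get("name") == source_name
            for ks in (kb.get("knowledgeSources") or [])
        )
    ]


# Invented definitions, trimmed to the properties the check needs.
knowledge_bases = [
    {"name": "my-kb", "knowledgeSources": [{"name": "my-file-ks"}, {"name": "my-blob-ks"}]},
    {"name": "my-kb-2", "knowledgeSources": [{"name": "my-blob-ks"}]},
]

blocking = knowledge_bases_referencing("my-file-ks", knowledge_bases)
print(blocking)
```

An empty result suggests that no knowledge base references the knowledge source; otherwise, update or delete the listed knowledge bases first.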