The Foundry Local SDK enables you to ship AI features in your applications that use local AI models through a simple, intuitive API. The SDK abstracts away the complexities of managing AI models and provides a seamless experience for integrating local AI capabilities into your applications. This reference documents the SDK implementations for C#, JavaScript, Python, and Rust.
The SDK doesn't require the Foundry Local CLI to be installed on the end user's machine, so you can ship your applications without extra setup steps for your users - your application is self-contained. Other benefits of the Foundry Local SDK include:
- Hardware detection and optimization: Automatic capability assessment for GPU, NPU, and CPU.
- Execution provider management (Windows): Automatic download and registration of appropriate ONNX Runtime execution providers (CUDA, Vitis, QNN, OpenVINO, TensorRT) based on device capabilities.
- Metal support via WebGPU (macOS): Native support for running models on Apple Silicon with optimized performance.
- Model acquisition: Seamless download from Foundry Model Catalog with versioning, updates, and automatically hardware-optimized model selection with fallback support.
- Efficient runtime: Adds approximately 20 MB to app size, runs on devices from mobile phones to desktops.
- OpenAI API compatibility: Easy integration with OpenAI models and tools.
- Optional REST server: Run Foundry Local as a local service accessible by other applications.
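For example, the optional REST server exposes an OpenAI-compatible HTTP endpoint that other applications can call. The following sketch is illustrative only: the port matches the web URL used in the Python configuration example later in this reference, and the `/v1/chat/completions` route is an assumption based on OpenAI-compatible server conventions, not a confirmed Foundry Local default.

```python
import json
import urllib.request

# Assumed base URL; matches the web URL shown in the Python Configuration
# example in this reference. Adjust to wherever your service listens.
BASE_URL = "http://127.0.0.1:55588"

def chat_completion(model: str, prompt: str) -> dict:
    """Send a single-turn chat request to the local OpenAI-compatible server.

    The /v1/chat/completions route is an assumption based on OpenAI API
    conventions; this is not a confirmed Foundry Local route.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Any OpenAI-compatible client library can target the same endpoint instead of hand-rolling requests like this.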
C# SDK Reference
Install packages
If you're developing or shipping on Windows, select the Windows tab. The Windows package integrates with the Windows ML runtime — it provides the same API surface area with a wider breadth of hardware acceleration.
dotnet add package Microsoft.AI.Foundry.Local.WinML
dotnet add package OpenAI
The C# samples in the GitHub repository are preconfigured projects. If you're building from scratch, see the project configuration below for details on how to set up your C# project with Foundry Local.
Project configuration
The sample repositories include a .csproj file that handles platform detection automatically. If you're building a project from scratch, use this configuration as a reference:
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <!-- Windows: target Windows SDK for WinML hardware acceleration -->
  <PropertyGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))">
    <TargetFramework>net9.0-windows10.0.26100</TargetFramework>
    <WindowsAppSDKSelfContained>false</WindowsAppSDKSelfContained>
    <Platforms>ARM64;x64</Platforms>
    <WindowsPackageType>None</WindowsPackageType>
    <EnableCoreMrtTooling>false</EnableCoreMrtTooling>
  </PropertyGroup>

  <!-- Non-Windows: standard .NET -->
  <PropertyGroup Condition="!$([MSBuild]::IsOSPlatform('Windows'))">
    <TargetFramework>net9.0</TargetFramework>
  </PropertyGroup>

  <PropertyGroup Condition="'$(RuntimeIdentifier)'==''">
    <RuntimeIdentifier>$(NETCoreSdkRuntimeIdentifier)</RuntimeIdentifier>
  </PropertyGroup>

  <!-- Windows: WinML for hardware acceleration -->
  <ItemGroup Condition="$([MSBuild]::IsOSPlatform('Windows'))">
    <PackageReference Include="Microsoft.AI.Foundry.Local.WinML" />
  </ItemGroup>

  <!-- Non-Windows: standard SDK -->
  <ItemGroup Condition="!$([MSBuild]::IsOSPlatform('Windows'))">
    <PackageReference Include="Microsoft.AI.Foundry.Local" />
  </ItemGroup>

  <!-- Linux GPU support -->
  <ItemGroup Condition="'$(RuntimeIdentifier)' == 'linux-x64'">
    <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" />
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" />
  </ItemGroup>

  <!-- Shared utilities -->
  <ItemGroup>
    <Compile Include="../Shared/*.cs" />
  </ItemGroup>
</Project>
The following table explains the key project settings:
| Setting | Description |
|---|---|
| `TargetFramework` | On Windows, targets `net9.0-windows10.0.26100` for WinML hardware acceleration. On other platforms, targets `net9.0`. |
| `WindowsAppSDKSelfContained` | Set to `false` to use the system-installed Windows App SDK rather than bundling it. |
| `WindowsPackageType` | Set to `None` to build as an unpackaged desktop app (no MSIX packaging). |
| `EnableCoreMrtTooling` | Set to `false` to disable MRT Core resource tooling, which isn't needed for console apps. |
| `RuntimeIdentifier` | Defaults to the current SDK's runtime identifier, ensuring the correct platform binaries are selected. |
| `Microsoft.AI.Foundry.Local.WinML` | Windows-only package that uses WinML for hardware acceleration and automatic execution provider management. |
| `Microsoft.AI.Foundry.Local` | Cross-platform package for macOS, Linux, and Windows without WinML. |
| `Microsoft.ML.OnnxRuntime.Gpu` / `Microsoft.ML.OnnxRuntimeGenAI.Cuda` | Linux GPU support packages for CUDA-enabled hardware. |
Quickstart
Use this snippet to verify that the SDK can initialize and access the local model catalog.
using Microsoft.AI.Foundry.Local;
using Microsoft.Extensions.Logging;
using System.Linq;
var config = new Configuration
{
    AppName = "app-name",
    LogLevel = Microsoft.AI.Foundry.Local.LogLevel.Information,
};

using var loggerFactory = LoggerFactory.Create(builder =>
{
    builder.SetMinimumLevel(Microsoft.Extensions.Logging.LogLevel.Information);
});
var logger = loggerFactory.CreateLogger<Program>();
await FoundryLocalManager.CreateAsync(config, logger);
var manager = FoundryLocalManager.Instance;
var catalog = await manager.GetCatalogAsync();
var models = await catalog.ListModelsAsync();
Console.WriteLine($"Models available: {models.Count()}");
This example prints the number of models available for your hardware.
Samples
- For sample applications that demonstrate how to use the Foundry Local C# SDK, see the Foundry Local C# SDK Samples GitHub repository.
API reference
- For more details on the Foundry Local C# SDK, see the Foundry Local C# SDK API Reference.
Native Audio Transcription API
The C# SDK includes a native audio client for transcribing audio files on-device using Whisper models. This runs inference in-process without needing the REST web server.
Get an audio client
After loading a Whisper model, get an audio client:
var audioClient = await model.GetAudioClientAsync();
Audio transcription methods
| Method | Signature | Description |
|---|---|---|
| `TranscribeAudioStreamingAsync()` | `(string audioFilePath, CancellationToken ct) => IAsyncEnumerable<TranscriptionChunk>` | Streams transcription results chunk by chunk. Each chunk has a `Text` property. |
AudioClient settings
| Property | Type | Description |
|---|---|---|
| `Language` | `string` | ISO 639-1 language code (for example, `"en"`). Improves accuracy. |
| `Temperature` | `float` | Sampling temperature (0.0–1.0). Lower values are more deterministic. |
Example
var audioClient = await model.GetAudioClientAsync();
audioClient.Settings.Language = "en";
audioClient.Settings.Temperature = 0.0f;
await foreach (var chunk in audioClient.TranscribeAudioStreamingAsync(
    "recording.mp3", CancellationToken.None))
{
    Console.Write(chunk.Text);
}
JavaScript SDK Reference
Install packages
If you're developing or shipping on Windows, select the Windows tab. The Windows package integrates with the Windows ML runtime — it provides the same API surface area with a wider breadth of hardware acceleration.
npm install foundry-local-sdk-winml openai
Quickstart
Use this snippet to verify that the SDK can initialize and access the local model catalog.
import { FoundryLocalManager } from 'foundry-local-sdk';
console.log('Initializing Foundry Local SDK...');
const manager = FoundryLocalManager.create({
  appName: 'foundry_local_samples',
  logLevel: 'info'
});
console.log('✓ SDK initialized successfully');
// Explore available models
console.log('\nFetching available models...');
const catalog = manager.catalog;
const models = await catalog.getModels();
console.log(`Found ${models.length} models:`);
for (const model of models) {
  console.log(`  - ${model.alias}`);
}
This example outputs the list of available models for your hardware.
Samples
- For sample applications that demonstrate how to use the Foundry Local JavaScript SDK, see the Foundry Local JavaScript SDK Samples GitHub repository.
API reference
- For more details on the Foundry Local JavaScript SDK, see the Foundry Local JavaScript SDK API Reference.
Python SDK Reference
Install packages
If you're developing or shipping on Windows, select the Windows tab. The Windows package integrates with the Windows ML runtime — it provides the same API surface area with a wider breadth of hardware acceleration.
pip install foundry-local-sdk-winml openai
Quickstart
Use this snippet to verify that the SDK can initialize and access the local model catalog.
import asyncio

from foundry_local_sdk import Configuration, FoundryLocalManager

async def main():
    config = Configuration(app_name="app-name")
    FoundryLocalManager.initialize(config)
    manager = FoundryLocalManager.instance
    models = manager.catalog.list_models()
    print(f"Models available: {len(models)}")

if __name__ == "__main__":
    asyncio.run(main())
This example prints the number of models available for your hardware.
Samples
- For sample applications that demonstrate how to use the Foundry Local Python SDK, see the Foundry Local SDK Samples GitHub repository.
Configuration
The Configuration class allows you to customize the SDK behavior:
from foundry_local_sdk import Configuration
config = Configuration(
    app_name="app-name",
    log_level="info",
    model_cache_dir="./foundry_local_data/model_cache",
    web={"urls": "http://127.0.0.1:55588"},
)
| Parameter | Type | Description |
|---|---|---|
| `app_name` | `str` | Name of your application. |
| `log_level` | `str` | Logging level (for example, `"info"`, `"debug"`). |
| `model_cache_dir` | `str` | Directory for cached models. |
| `web` | `dict` | Web service configuration with a `urls` key. |
Core API
| Method | Description |
|---|---|
| `FoundryLocalManager.initialize(config)` | Initialize the singleton manager with a `Configuration`. |
| `FoundryLocalManager.instance` | Access the initialized manager instance. |
| `manager.catalog.list_models()` | List all available models in the catalog. |
| `manager.catalog.get_model(alias)` | Get a model by alias. |
| `manager.catalog.get_cached_models()` | List models in the local cache. |
| `manager.catalog.get_loaded_models()` | List models currently loaded. |
| `model.download(progress_callback)` | Download the model (skips if cached). |
| `model.load()` | Load the model for inference. |
| `model.unload()` | Unload the model. |
| `model.is_cached` | Check whether the model is cached locally. |
| `model.is_loaded` | Check whether the model is loaded. |
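These methods compose into a simple acquire-and-load flow. The sketch below is illustrative: the `ensure_loaded` helper name is hypothetical, and the assumption that the progress callback receives a 0–100 percentage is not confirmed by this reference.

```python
# Illustrative lifecycle helper built from the Core API table above.
# `ensure_loaded` is a hypothetical name, not part of the SDK, and the
# progress callback is assumed to receive a 0-100 percentage.
def ensure_loaded(manager, alias: str):
    model = manager.catalog.get_model(alias)
    if not model.is_cached:
        # download() skips if cached, but checking first avoids the call
        model.download(lambda pct: print(f"Downloading {alias}: {pct:.0f}%"))
    if not model.is_loaded:
        model.load()
    return model
```

Call it with the initialized manager from the quickstart, for example `model = ensure_loaded(manager, alias)` with an alias taken from `manager.catalog.list_models()`.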
Native Chat Completions API
After loading a model, get a chat client:
client = model.get_chat_client()
| Method | Description |
|---|---|
| `client.complete_chat(messages)` | Generate a complete chat response. |
| `client.complete_streaming_chat(messages)` | Stream chat response chunks. |
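A chat request takes a list of messages. The sketch below assumes OpenAI-style role/content dictionaries for the message shape (an assumption, since this reference doesn't specify the format) and a loaded `model` as in the quickstart; the method calls follow the table above.

```python
# Assumed message shape: OpenAI-style role/content dictionaries.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize Foundry Local in one sentence."},
]

def run_chat(model):
    # per the table above: get a client, then request a full (blocking) response
    client = model.get_chat_client()
    return client.complete_chat(messages)
```

For incremental output, swap `complete_chat` for `complete_streaming_chat` and iterate over the returned chunks.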
Native Audio Transcription API
After loading a Whisper model, get an audio client:
audio_client = model.get_audio_client()
| Method | Description |
|---|---|
| `audio_client.transcribe(file_path)` | Transcribe an audio file. Returns an object with a `text` property. |
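Putting it together, a minimal transcription flow might look like this. The `transcribe_file` wrapper is hypothetical; only `get_audio_client()`, `transcribe()`, and the `text` property come from the table above.

```python
# Hypothetical convenience wrapper; assumes a Whisper model already
# downloaded and loaded as `model`, per the sections above.
def transcribe_file(model, path: str) -> str:
    audio_client = model.get_audio_client()
    result = audio_client.transcribe(path)  # returns an object with .text
    return result.text
```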
Rust SDK Reference
Install packages
If you're developing or shipping on Windows, select the Windows tab. The Windows package integrates with the Windows ML runtime — it provides the same API surface area with a wider breadth of hardware acceleration.
cargo add foundry-local-sdk --features winml
cargo add tokio --features full
cargo add tokio-stream anyhow
Quickstart
Use this snippet to verify that the SDK can initialize and access the local model catalog.
use foundry_local_sdk::{FoundryLocalConfig, FoundryLocalManager};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let manager = FoundryLocalManager::create(FoundryLocalConfig::new("app-name"))?;
    let models = manager.catalog().get_models().await?;
    println!("Models available: {}", models.len());
    Ok(())
}
This example prints the number of models available for your hardware.
Samples
- For sample applications that demonstrate how to use the Foundry Local Rust SDK, see the Foundry Local SDK Samples GitHub repository.
Configuration
The FoundryLocalConfig struct allows you to customize the SDK behavior:
use foundry_local_sdk::FoundryLocalConfig;
let config = FoundryLocalConfig::new("app-name")
    .with_log_level("info")
    .with_model_cache_dir("./foundry_local_data/model_cache")
    .with_web_urls("http://127.0.0.1:55588");
Core API
| Method | Description |
|---|---|
| `FoundryLocalManager::create(config)` | Create a new manager with a `FoundryLocalConfig`. |
| `manager.catalog().get_models().await` | List all available models. |
| `manager.catalog().get_model(alias).await` | Get a model by alias. |
| `manager.catalog().get_cached_models().await` | List models in the local cache. |
| `manager.catalog().get_loaded_models().await` | List models currently loaded. |
| `model.download(callback).await` | Download the model (skips if cached). |
| `model.load().await` | Load the model for inference. |
| `model.unload().await` | Unload the model. |
Native Chat Completions API
After loading a model, create a chat client with optional settings:
let client = model.create_chat_client()
    .temperature(0.7)
    .max_tokens(256);
| Method | Description |
|---|---|
| `client.complete_chat(&messages, tools).await` | Generate a complete chat response. |
| `client.complete_streaming_chat(&messages, tools).await` | Stream chat response chunks. |
Message types: ChatCompletionRequestSystemMessage, ChatCompletionRequestUserMessage, ChatCompletionRequestMessage.
Native Audio Transcription API
After loading a Whisper model, create an audio client:
let audio_client = model.create_audio_client();
| Method | Description |
|---|---|
| `audio_client.transcribe(file_path).await` | Transcribe an audio file. Returns an object with a `text` field. |