Important
- Foundry Local is available in preview. Public preview releases provide early access to features that are in active deployment.
- Features, approaches, and processes can change or have limited capabilities before General Availability (GA).
This article provides a comprehensive reference for the Foundry Local command-line interface (CLI). The CLI organizes commands into logical categories to help you manage models, control the service, and maintain your local cache.
Overview
View all available commands with the help option:
foundry --help
The CLI organizes commands into three main categories:
- Model: Commands for managing and running AI models
- Service: Commands for controlling the Foundry Local service
- Cache: Commands for managing your local model storage
Model commands
The following table summarizes the commands related to managing and running models:
Note
You can specify the model argument by its alias or model ID. Using an alias:
- Selects the best model for your available hardware automatically. For example, if you have an NVIDIA GPU available, Foundry Local selects the best GPU model. If you have a supported NPU available, Foundry Local selects the NPU model.
- Lets you use a shorter name without needing to remember the model ID.
If you want to run a specific model, use the model ID. For example, to run qwen2.5-0.5b on the CPU, irrespective of your available hardware, use: foundry model run qwen2.5-0.5b-instruct-generic-cpu.
If you have an Intel NPU on Windows, ensure you install the Intel NPU driver for optimal NPU acceleration.
| Command | Description |
|---|---|
| foundry model --help | Displays all available model-related commands and their usage. |
| foundry model run <model> | Runs a specified model, downloads it if it isn't cached, and starts an interaction. |
| foundry model list | Lists all available models for local use. On first run, it downloads execution providers (EPs) for your hardware. |
| foundry model list --filter <key>=<value> | Lists models filtered by the specified criteria (device, task, alias, provider). |
| foundry model info <model> | Displays detailed information about a specific model. |
| foundry model info <model> --license | Displays the license information for a specific model. |
| foundry model download <model> | Downloads a model to the local cache without running it. |
| foundry model load <model> | Loads a model into the service. |
| foundry model unload <model> | Unloads a model from the service. |
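As a rough sketch of how the commands in the table fit together, the script below caches a model, loads it into the service, lists the loaded models, and unloads it again. The run and model_lifecycle helpers and the DRY_RUN variable are hypothetical names introduced for this example; only the foundry subcommands come from the table. It assumes the foundry CLI is on your PATH and that the service is available.

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: cache, load, inspect, and unload one model.
# Each step runs only if the previous one succeeded.
model_lifecycle() {
  model=$1
  run foundry model download "$model" &&  # fetch into the local cache without running
    run foundry model load "$model" &&    # load into the service
    run foundry service ps &&             # list the models currently loaded
    run foundry model unload "$model"     # unload from the service
}
```

For example, DRY_RUN=1 model_lifecycle qwen2.5-0.5b-instruct-generic-cpu prints the four commands without running them; leave DRY_RUN unset to execute them.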
Model list ordering
When there are multiple model ID variants available for an alias, the model list is presented in priority order. The first model in the list is the model that is run when you specify the model by alias.
Model list filtering
The foundry model list command supports filtering models using the --filter option. You can filter models based on a single attribute using key-value pairs.
foundry model list --filter <key>=<value>
Note
When you run foundry model list for the first time after installation, Foundry Local automatically downloads the relevant execution providers (EPs) for your machine's hardware configuration. You see a progress bar indicating the download completion before the model list appears.
Supported filter keys:
device - Hardware Device Type
Filters models by the hardware device they run on.
Possible values:
- CPU: Central Processing Unit models
- GPU: Graphics Processing Unit models
- NPU: Neural Processing Unit models
provider - Execution Provider
Filters models by their execution provider/runtime.
Possible values:
- CPUExecutionProvider: CPU-based execution
- CUDAExecutionProvider: NVIDIA CUDA GPU execution
- WebGpuExecutionProvider: WebGPU execution
- QNNExecutionProvider: Qualcomm Neural Network execution (NPU)
- OpenVINOExecutionProvider: Intel OpenVINO execution
- NvTensorRTRTXExecutionProvider: NVIDIA TensorRT execution
- VitisAIExecutionProvider: AMD Vitis AI execution
task - Model Task Type
Filters models by their intended use case/task.
Common values:
- chat-completion: Conversational AI models
- text-generation: Text generation models
alias - Model Alias
Filters models by their alias identifier. Supports wildcard matching with * suffix.
Sample values:
- phi4-cpu
- qwen2.5-coder-0.5b-instruct-generic-cpu
- deepseek-r1-distill-qwen-1.5b-generic-cpu
- phi-4-mini-instruct-generic-cpu
Special filter features
Negation Support: Prefix any value with ! to exclude matching models.
foundry model list --filter device=!GPU
Wildcard Matching (alias only): Append * to match prefixes when filtering by alias.
foundry model list --filter alias=qwen*
Examples
foundry model list --filter device=GPU
foundry model list --filter task=chat-completion
foundry model list --filter provider=CUDAExecutionProvider
Note
- All comparisons are case-insensitive.
- Only one filter can be used per command.
- Unrecognized filter keys result in an error.
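The matching rules above can be modeled in a few lines of shell. This is an illustration of the documented behavior, not Foundry Local's implementation. The filter_matches helper is a hypothetical name, and the sketch assumes two things the article does not state: that a value without a wildcard must match exactly, and that negation and the alias wildcard can be combined.

```shell
# Hypothetical model of the documented filter rules (not Foundry Local's code).
# Usage: filter_matches <key> <pattern> <candidate>
# Returns 0 if the candidate is kept, 1 if it is filtered out, 2 for an unknown key.
filter_matches() {
  # All comparisons are case-insensitive, so lowercase everything first.
  key=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  candidate=$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')

  # Unrecognized filter keys result in an error.
  case $key in
    device|task|alias|provider) ;;
    *) echo "unrecognized filter key: $1" >&2; return 2 ;;
  esac

  # A leading ! excludes matching models.
  negate=no
  case $pattern in
    '!'*) negate=yes; pattern=${pattern#?} ;;
  esac

  # A trailing * is a prefix wildcard, honored for the alias key only.
  # Assumption: every other value must match exactly.
  matched=no
  prefix=${pattern%\*}
  if [ "$key" = alias ] && [ "$prefix" != "$pattern" ]; then
    case $candidate in
      "$prefix"*) matched=yes ;;
    esac
  elif [ "$candidate" = "$pattern" ]; then
    matched=yes
  fi

  if [ "$negate" = yes ]; then
    [ "$matched" = no ]
  else
    [ "$matched" = yes ]
  fi
}
```

Under these assumptions, filter_matches device '!GPU' CPU succeeds, while filter_matches alias 'qwen*' phi4-cpu fails with status 1. Quoting values that contain ! or * keeps the shell from interpreting them before the command sees them.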
Service commands
The following table summarizes the commands related to managing and running the Foundry Local service:
| Command | Description |
|---|---|
| foundry service --help | Displays all available service-related commands and their usage. |
| foundry service start | Starts the Foundry Local service. |
| foundry service stop | Stops the Foundry Local service. |
| foundry service restart | Restarts the Foundry Local service. |
| foundry service status | Displays the current status of the Foundry Local service. |
| foundry service ps | Lists all models currently loaded in the Foundry Local service. |
| foundry service diag | Displays the logs of the Foundry Local service. |
| foundry service set <options> | Sets the configuration of the Foundry Local service. |
Cache commands
The following table summarizes the commands for managing the local cache where models are stored:
| Command | Description |
|---|---|
| foundry cache --help | Shows all available cache-related commands and their usage. |
| foundry cache location | Shows the current cache directory. |
| foundry cache list | Lists all models stored in the local cache. |
| foundry cache cd <path> | Changes the cache directory to the specified path. |
| foundry cache remove <model> | Removes a model from the local cache. |
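One way to combine the cache commands is sketched below: show the current cache directory, point the cache at a new directory, and list what the cache contains afterward. The move_cache helper is a hypothetical name, and the check that the target directory already exists is a precaution added for this sketch, not a documented requirement of foundry cache cd. The sketch makes no claim about whether previously downloaded models are moved to the new location.

```shell
# Hypothetical helper: switch the Foundry Local cache to an existing directory.
move_cache() {
  target=$1
  # Precaution added for this sketch: refuse a path that is not a directory.
  if [ ! -d "$target" ]; then
    echo "not a directory: $target" >&2
    return 1
  fi
  foundry cache location &&        # show the current cache directory
    foundry cache cd "$target" &&  # change the cache directory
    foundry cache list             # list the models in the cache after the change
}
```

Call it with the directory you want to use, for example move_cache "$HOME/foundry-models", where the path is a placeholder for a directory that already exists.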
Execution providers
Execution providers are hardware-specific acceleration libraries that run models as efficiently as possible on device.
Built-in execution providers
Foundry Local includes the CPU execution provider, the WebGPU execution provider, and the CUDA execution provider.
The CPU execution provider uses Microsoft Linear Algebra Subroutines (MLAS) to run on any CPU and is the CPU fallback for Foundry Local.
The WebGPU execution provider uses Dawn, the native implementation of the web-based API, for acceleration on any GPU, and is the GPU fallback for Foundry Local.
The CUDA execution provider uses NVIDIA CUDA for acceleration on NVIDIA GPUs. It supports NVIDIA GeForce RTX 30XX and above, with a minimum recommended driver version of 32.0.15.5585 and CUDA version 12.5, and is subject to the following license terms: License Agreement for NVIDIA Software Development Kits—EULA.
Plugin execution providers
The execution providers listed in the table are available (depending on device and driver compatibility) for dynamic download and registration on Windows and are subject to the license terms specified.
Foundry Local automatically downloads these execution providers on first run. The plugin execution providers are automatically updated when new versions are available.
| Name (Vendor) | Requirements | License Terms |
|---|---|---|
| "NvTensorRtRtxExecutionProvider" (NVIDIA) | NVIDIA GeForce RTX 30XX and above with minimum recommended driver version 32.0.15.5585 + CUDA version 12.5 | License Agreement for NVIDIA Software Development Kits—EULA |
| "OpenVINOExecutionProvider" (Intel) | CPU: Intel TigerLake (11th Gen) and later with min recommended driver 32.0.100.9565; GPU: Intel AlderLake (12th Gen) and later with min recommended driver 32.0.101.1029; NPU: Intel ArrowLake (15th Gen) and above with min recommended driver 32.0.100.4239 | Intel OBL Distribution Commercial Use License Agreement v2025.02.12 |
| "QNNExecutionProvider" (Qualcomm) | Snapdragon(R) X Elite - X1Exxxxx - Qualcomm(R) Hexagon(TM) NPU with minimum driver version 30.0.140.0 and above; Snapdragon(R) X Plus - X1Pxxxxx - Qualcomm(R) Hexagon(TM) NPU with minimum driver version 30.0.140.0 and above | To view the QNN License, download the Qualcomm® Neural Processing SDK, extract the ZIP, and open the LICENSE.pdf file. |
| "VitisAIExecutionProvider" (AMD) | Min: Adrenalin Edition 25.6.3 with NPU driver 32.00.0203.280; Max: Adrenalin Edition 25.9.1 with NPU driver 32.00.0203.297 | No additional license required |