AI Runtime CLI quickstart

Important

The AI Runtime CLI is in Beta.

This page walks through submitting your first training job with the AI Runtime CLI. Before starting, install the CLI and configure authentication.

Step 1: Write a YAML config

Create train.yaml describing the workload. The minimal config requires an experiment name, a compute spec, and a command. The command below runs without any local code, so you can submit your first run right away:

experiment_name: my-first-air-run
compute:
  num_accelerators: 1
  accelerator_type: GPU_1xA10
command: echo "hello AIR!"
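
The three required fields can be checked in a script before you submit. The following is a minimal sketch, not the CLI's own validation: `missing_required_fields` is a hypothetical helper, and it assumes the config has already been parsed into a Python dictionary (for example, by a YAML library).

```python
# Hypothetical pre-submission check; the CLI performs its own validation.
REQUIRED_FIELDS = ("experiment_name", "compute", "command")


def missing_required_fields(config: dict) -> list[str]:
    """Return the required top-level fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not config.get(name)]


# The minimal config from this step, written as a dictionary.
minimal_config = {
    "experiment_name": "my-first-air-run",
    "compute": {"num_accelerators": 1, "accelerator_type": "GPU_1xA10"},
    "command": 'echo "hello AIR!"',
}

print(missing_required_fields(minimal_config))  # []
print(missing_required_fields({"experiment_name": "my-first-air-run"}))  # ['compute', 'command']
```

A check like this only catches missing top-level fields. For authoritative validation, use the CLI itself.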

Run your own code

To run a local training script, add an environment block that lists your Python dependencies and a code_source block that uploads your local code. Place your script alongside train.yaml:

my-project/
├── train.yaml
└── train.py

Then extend train.yaml with the environment and code_source blocks:

experiment_name: my-first-air-run
environment:
  version: '4'
  dependencies:
    - torch
    - transformers
compute:
  num_accelerators: 1
  accelerator_type: GPU_1xA10
code_source:
  type: snapshot
  snapshot:
    root_path: .
command: python $CODE_SOURCE_PATH/train.py

This config installs the listed dependencies, uploads the current directory (root_path: .), and runs train.py on a single A10 GPU.

$CODE_SOURCE_PATH resolves to the location of the uploaded code on the remote node. Databricks recommends using this variable rather than hardcoding a path.

environment.version selects the serverless GPU environment version. It is optional and defaults to '4'. For all available versions, see Serverless environment versions.

For the full field reference, see Workload YAML reference.
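
If you do not have a script yet, a placeholder train.py might look like the sketch below. It is illustrative only: the script body is hypothetical, and reading CODE_SOURCE_PATH from the process environment is an assumption based on the $CODE_SOURCE_PATH expansion used in command. The fallback to the current working directory is there so the script also runs locally.

```python
import os
from pathlib import Path


def code_root() -> Path:
    """Locate the uploaded code directory without hardcoding a path.

    Assumption: CODE_SOURCE_PATH is set in the environment on the remote
    node. When it is unset (for example, on a laptop), fall back to the
    current working directory.
    """
    env_value = os.environ.get("CODE_SOURCE_PATH")
    if env_value:
        return Path(env_value)
    return Path.cwd()


def main() -> None:
    root = code_root()
    print(f"code root: {root}")
    # Placeholder for real training logic (model setup, training loop, ...).
    print("training step placeholder")


if __name__ == "__main__":
    main()
```

Resolving data and config files relative to `code_root()` keeps the script independent of where the code lands on the remote node.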

Step 2: Submit the run

Submit the workload:

air run --file train.yaml

The CLI uploads your local code (if you configured a code_source), submits the job, and prints a run ID. Use that ID to inspect, watch, and cancel the run in later commands.

The submission creates a run in the MLflow experiment named in experiment_name (an experiment can hold many runs). That run captures the workload's metrics, parameters, artifacts, and logs, all viewable in the workspace MLflow UI. Logs are also available outside MLflow: stream them to your terminal or a file, or download them later with air logs (see Step 3).

To stream logs to your terminal until the run completes, add --watch:

air run --file train.yaml --watch
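
To submit from a script instead of a shell, you can build the same command as an argument list. This is a rough sketch using only the flags shown above. `air_run_args` and `submit` are hypothetical helper names, and nothing here parses the CLI's output, because this page does not describe its format.

```python
import subprocess


def air_run_args(config_file: str, watch: bool = False) -> list[str]:
    """Build the argument list for `air run` from flags shown on this page."""
    args = ["air", "run", "--file", config_file]
    if watch:
        args.append("--watch")
    return args


def submit(config_file: str, watch: bool = False) -> int:
    """Run the command and return the process exit code.

    Requires the CLI to be installed and authenticated; not called here.
    """
    return subprocess.run(air_run_args(config_file, watch)).returncode


print(air_run_args("train.yaml"))              # ['air', 'run', '--file', 'train.yaml']
print(air_run_args("train.yaml", watch=True))  # ['air', 'run', '--file', 'train.yaml', '--watch']
```

Passing a list rather than a single string avoids shell quoting issues when the file path contains spaces.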

Step 3: Inspect the run

Check status:

air get run <run-id>

The output includes clickable links to the run's MLflow experiment and MLflow run in the workspace UI.

Stream or download logs:

air logs <run-id>
air logs <run-id> --node 2
air logs <run-id> --download-to ./logs/

Distributed workloads run across multiple nodes. By default, air logs streams from node 0. To view logs from a different node, pass --node with the node index (for example, --node 2). Use --download-to to write logs to a local directory instead of streaming them.
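
For a multi-node run, you may want one logs command per node. The sketch below only builds the argument lists. It assumes nodes are numbered contiguously from 0 and that you already know the node count; `log_commands` is a hypothetical helper, not part of the CLI.

```python
def log_commands(run_id: str, num_nodes: int) -> list[list[str]]:
    """Return one `air logs` argument list per node, numbered from 0."""
    return [
        ["air", "logs", run_id, "--node", str(node)]
        for node in range(num_nodes)
    ]


for args in log_commands("<run-id>", 3):
    print(" ".join(args))
# air logs <run-id> --node 0
# air logs <run-id> --node 1
# air logs <run-id> --node 2
```

Each list can be passed to `subprocess.run` once the CLI is installed and authenticated.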

List recent runs:

air list runs --limit 10
air list runs --active

Cancel a run:

air cancel <run-id>

Common patterns

Override YAML fields from the command line:

air run --file train.yaml --override compute.num_accelerators=32 timeout_minutes=120
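
Each override is a dotted path into the YAML followed by a value, so compute.num_accelerators=32 targets the num_accelerators field inside compute. The sketch below illustrates that mapping on a plain dictionary. It is not the CLI's implementation: `apply_override` is a hypothetical helper, and converting integer-looking values to int is an assumption, since this page does not describe how the CLI types override values.

```python
import copy


def apply_override(config: dict, assignment: str) -> dict:
    """Return a copy of config with one `dotted.path=value` assignment applied."""
    path, _, raw_value = assignment.partition("=")
    # Assumption: integer-looking values become ints; others stay strings.
    try:
        value = int(raw_value)
    except ValueError:
        value = raw_value

    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = path.split(".")
    for key in parents:
        # Walk down the nested dictionaries, creating levels that are missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


config = {"compute": {"num_accelerators": 1, "accelerator_type": "GPU_1xA10"}}
for assignment in ["compute.num_accelerators=32", "timeout_minutes=120"]:
    config = apply_override(config, assignment)

print(config)
# {'compute': {'num_accelerators': 32, 'accelerator_type': 'GPU_1xA10'}, 'timeout_minutes': 120}
```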

Validate the config without submitting:

air run --file train.yaml --dry-run

Make a submission safely retryable:

air run --file train.yaml --idempotency-key my-unique-key

If the same key has been used before, the existing run is returned instead of creating a new one.
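
One way to produce a stable key is to derive it from the config contents, so that retrying the same submission reuses the same key. This is a sketch under assumptions: `idempotency_key` is a hypothetical helper, and the `air-` prefix and 16-character digest are arbitrary choices, because this page does not state any length or character constraints on keys.

```python
import hashlib


def idempotency_key(config_text: str, label: str = "") -> str:
    """Derive a stable key: the same config text and label give the same key."""
    payload = f"{label}\n{config_text}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return f"air-{digest[:16]}"


config_text = 'experiment_name: my-first-air-run\ncommand: echo "hello AIR!"\n'

# Same inputs produce the same key, so a retry maps to the same run.
print(idempotency_key(config_text) == idempotency_key(config_text))  # True
# Changing the label yields a different key for an intentional new run.
print(idempotency_key(config_text) == idempotency_key(config_text, "attempt-2"))  # False
```

Because a reused key returns the existing run, a purely content-derived key prevents resubmitting an unchanged config. Vary the label when you want a new run on purpose.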

Last updated on 2026-06-12