AI Runtime CLI examples

The following examples are complete, end-to-end workloads that you submit from the `air` CLI with `air run -f train.yaml`. Each one demonstrates a real multi-GPU pattern on H100 GPUs and includes the workload YAML, bootstrap commands, and code. If you haven't submitted a run before, start with the quickstart.

| Example | Description |
| --- | --- |
| Multi-node LLM fine-tuning with FSDP | Supervised fine-tuning of Llama-3.1-8B across 16 H100 GPUs (2 nodes) using `torchrun` and PyTorch Fully Sharded Data Parallel (FSDP). Logs to MLflow and checkpoints to a Unity Catalog volume. |
| Distributed training with Ray Train | Distributed data-parallel fine-tuning with Ray Train's `TorchTrainer` across 8 H100 GPUs on a single node, with one worker per GPU. |
| Batch inference with Ray Data and vLLM | Offline LLM batch inference with Ray Data and vLLM across 8 H100 GPUs on a single node, running one vLLM replica per GPU and writing results to a Unity Catalog volume as Parquet. |
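
To make the GPU layouts in the table concrete, here is a minimal sketch of how the worker counts work out. It is not taken from the examples themselves. It assumes homogeneous nodes with 8 GPUs each (implied by 16 GPUs across 2 nodes), one worker process per GPU, and the common convention that a worker's global rank is its node rank times the per-node process count plus its local rank. The `Layout` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical helper describing a multi-GPU job layout."""

    nodes: int
    gpus_per_node: int  # assumption: every node has the same GPU count

    @property
    def world_size(self) -> int:
        # One worker process (or replica) per GPU.
        return self.nodes * self.gpus_per_node

    def global_rank(self, node_rank: int, local_rank: int) -> int:
        # Conventional rank numbering for homogeneous nodes.
        if not 0 <= node_rank < self.nodes:
            raise ValueError("node_rank out of range")
        if not 0 <= local_rank < self.gpus_per_node:
            raise ValueError("local_rank out of range")
        return node_rank * self.gpus_per_node + local_rank


fsdp = Layout(nodes=2, gpus_per_node=8)    # multi-node FSDP example
single = Layout(nodes=1, gpus_per_node=8)  # Ray Train and Ray Data + vLLM examples

print(fsdp.world_size)         # 16
print(single.world_size)       # 8
print(fsdp.global_rank(1, 0))  # 8: first GPU on the second node
```

The same arithmetic applies to the single-node examples: with one node, the global rank equals the local rank, so the 8 workers or vLLM replicas map directly onto GPUs 0 through 7.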

Last updated on 2026-07-16
