Important
AI Runtime for single-node tasks is in Public Preview. The distributed training API for multi-GPU workloads remains in Beta.
This page provides notebook examples for fine-tuning large language models (LLMs) using AI Runtime. These examples demonstrate various approaches to fine-tuning including parameter-efficient methods like Low-Rank Adaptation (LoRA) and full supervised fine-tuning.
| Tutorial | Description |
|---|---|
| Fine-tune Qwen2-0.5B model | Efficiently fine-tune the Qwen2-0.5B model using Transformer Reinforcement Learning (TRL), Liger Kernels for memory-efficient training, and LoRA for parameter-efficient fine-tuning. |
| Fine-tune Llama-3.2-3B with Unsloth | Fine-tune Llama-3.2-3B using the Unsloth library. |
| Supervised fine-tuning using DeepSpeed and TRL | Use the Serverless GPU Python API to run supervised fine-tuning (SFT) using the Transformer Reinforcement Learning (TRL) library with DeepSpeed ZeRO Stage 3 optimization. |
| LoRA fine-tuning using Axolotl | Use the Serverless GPU Python API to fine-tune an Olmo3 7B model with LoRA using the Axolotl library. |
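
Several of the tutorials above rely on LoRA, which freezes a pretrained weight matrix and trains two small low-rank factors in its place. As a rough illustration of why this counts as parameter-efficient, the sketch below compares trainable-weight counts for a single linear layer. The layer size (4096 x 4096) and the rank (8) are hypothetical values chosen for the arithmetic, not settings taken from any of the notebooks, and `lora_param_counts` is an illustrative helper rather than part of any library.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-weight counts for one linear layer.

    Full fine-tuning updates the whole d_out x d_in weight matrix.
    LoRA keeps that matrix frozen and trains two factors instead:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=8)
print(f"Full fine-tuning: {full:,} trainable weights")
print(f"LoRA (rank 8):    {lora:,} trainable weights ({lora / full:.2%} of full)")
```

With these assumed values, the LoRA factors hold 65,536 trainable weights against 16,777,216 for the full matrix, which is under half a percent. The actual savings in a given notebook depend on which layers are adapted and on the rank configured there.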
Video demo
This video walks through the Fine-tune Llama-3.2-3B with Unsloth example notebook in detail (12 minutes).