Large language models (LLMs)

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

This page provides notebook examples for fine-tuning large language models (LLMs) using Serverless GPU compute. These examples demonstrate various approaches to fine-tuning, including parameter-efficient methods like Low-Rank Adaptation (LoRA) and full supervised fine-tuning.

Fine-tune Qwen2-0.5B model

The following notebook provides an example of how to efficiently fine-tune the Qwen2-0.5B model using:

  • Transformer Reinforcement Learning (TRL) for supervised fine-tuning.
  • Liger Kernels for memory-efficient training with optimized Triton kernels.
  • LoRA for parameter-efficient fine-tuning.

Qwen2-0.5B fine-tuning

Get notebook
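
The sketch below is a minimal, illustrative version of this setup, assuming the Hugging Face trl, peft, datasets, and liger-kernel packages are installed. The dataset name, output path, and LoRA hyperparameters are placeholders rather than the values used in the notebook.

```python
# Minimal sketch: TRL supervised fine-tuning of Qwen2-0.5B with LoRA
# and Liger Kernels. Dataset and hyperparameters are illustrative.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder dataset

# LoRA: train small low-rank adapter matrices instead of all model weights.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="/tmp/qwen2-0.5b-sft",   # placeholder output path
    per_device_train_batch_size=4,
    num_train_epochs=1,
    bf16=True,
    use_liger_kernel=True,  # patch in Liger's optimized Triton kernels
)

trainer = SFTTrainer(
    model="Qwen/Qwen2-0.5B",  # TRL loads the model from the Hub ID
    train_dataset=dataset,
    args=training_args,
    peft_config=peft_config,
)
trainer.train()
```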

Fine-tune Llama-3.2-3B with Unsloth

This notebook demonstrates how to fine-tune Llama-3.2-3B using the Unsloth library.

Unsloth Llama

Get notebook
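
For orientation before opening the notebook, here is a minimal sketch of the Unsloth workflow, assuming the unsloth, trl, and datasets packages are installed. The model ID, dataset, and hyperparameters are illustrative, and exact trainer argument names vary across trl versions.

```python
# Minimal sketch: LoRA fine-tuning of Llama-3.2-3B with Unsloth.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

# Load the model through Unsloth's optimized path; 4-bit loading keeps
# memory usage low on a single GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",  # illustrative model ID
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder dataset

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(output_dir="/tmp/llama-3.2-3b-sft", max_seq_length=2048),
)
trainer.train()
```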

Video demo

This video walks through the notebook in detail (12 minutes).

Supervised fine-tuning using DeepSpeed and TRL

This notebook demonstrates how to use the Serverless GPU Python API to run supervised fine-tuning (SFT) using the Transformer Reinforcement Learning (TRL) library with DeepSpeed ZeRO Stage 3 optimization.

TRL DeepSpeed

Get notebook
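
A condensed sketch of the TRL-plus-DeepSpeed portion of this setup is below. It shows only the trainer side: passing a ZeRO Stage 3 configuration to TRL through the standard Hugging Face deepspeed integration. The Serverless GPU Python API call that launches this training function across GPUs is omitted; the model ID, dataset, and hyperparameters are placeholders.

```python
# Minimal sketch: TRL supervised fine-tuning with DeepSpeed ZeRO Stage 3.
# In the notebook, a function like this is launched across GPUs with the
# Serverless GPU Python API; only the trainer setup is shown here.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# ZeRO Stage 3 partitions optimizer state, gradients, and model parameters
# across workers, so each GPU holds only a shard of the full training state.
ds_config = {
    "zero_optimization": {"stage": 3},
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

training_args = SFTConfig(
    output_dir="/tmp/sft-zero3",        # placeholder output path
    per_device_train_batch_size=2,
    bf16=True,
    deepspeed=ds_config,  # TrainingArguments accepts a dict or a JSON path
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-3B",  # illustrative model ID
    train_dataset=load_dataset("trl-lib/Capybara", split="train"),
    args=training_args,
)
trainer.train()
```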