Distillation in Azure AI Foundry portal (preview)

Important

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

In Azure AI Foundry portal, you can use distillation to efficiently train a student model.

What is distillation?

In machine learning, distillation is a technique for transferring knowledge from a large, complex model (often called the teacher model) to a smaller, simpler model (the student model). This process helps the smaller model achieve similar performance to the larger one while being more efficient in terms of computation and memory usage.
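To make the idea concrete, here's a minimal PyTorch sketch of one common formulation of a distillation loss, in which the student is trained against the teacher's softened output distribution as well as the ground-truth labels. This is an illustrative sketch of distillation in general, not the procedure the sample notebook follows (which trains on teacher-generated labels, as described next); the function name, temperature, and weighting are assumptions.

```python
# A minimal sketch of a classic soft-label distillation loss, assuming you
# already have teacher and student logits for a batch. Hyperparameter
# defaults are illustrative, not prescribed by Azure AI Foundry.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: push the student's softened distribution toward the
    # teacher's. Scaling by T*T keeps gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```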

Distillation steps

The main steps in knowledge distillation are:

  1. Use the teacher model to generate predictions for the dataset.

  2. Train the student model by using these predictions, along with the original dataset, to mimic the teacher model's behavior (see the sketch after this list).
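The following sketch shows both steps, assuming the teacher model is served behind an OpenAI-compatible chat completions endpoint. The endpoint URL, API key, model name, and file name are placeholders, not real Azure AI Foundry values.

```python
# A minimal sketch of the two distillation steps. All endpoint and model
# values below are placeholders.
import json

from openai import OpenAI

teacher = OpenAI(base_url="https://<your-teacher-endpoint>/v1", api_key="<key>")

prompts = [
    "Classify the sentiment of: 'The battery life is outstanding.'",
    "Classify the sentiment of: 'The screen cracked after a week.'",
]

# Step 1: Use the teacher model to generate predictions for the dataset.
synthetic_data = []
for prompt in prompts:
    response = teacher.chat.completions.create(
        model="teacher-model",  # placeholder deployment name
        messages=[{"role": "user", "content": prompt}],
    )
    synthetic_data.append(
        {"prompt": prompt, "completion": response.choices[0].message.content}
    )

# Step 2: Persist the teacher's predictions in the JSONL format that
# fine-tuning jobs typically expect, then submit a fine-tuning job that
# trains the student model on this file through your training stack.
with open("distillation_train.jsonl", "w", encoding="utf-8") as f:
    for row in synthetic_data:
        f.write(json.dumps(row) + "\n")
```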

Sample notebook

Distillation in Azure AI Foundry portal is currently only available through a notebook experience. You can use the sample notebook to see how to perform distillation. Model distillation is supported for Microsoft models and a selection of open-source software (OSS) models in the model catalog. In this sample notebook, the teacher model is the Meta Llama 3.1 405B Instruct model, and the student model is the Meta Llama 3.1 8B Instruct model.

We used an advanced prompt during synthetic data generation. The advanced prompt incorporates chain-of-thought (CoT) reasoning, which yields more accurate labels in the synthetic data and, in turn, a more accurate distilled model.
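For illustration, a CoT-style labeling prompt might look like the following. The wording and the placeholder are hypothetical; this is not the exact prompt used in the sample notebook.

```python
# A hypothetical chain-of-thought labeling prompt. The teacher's step-by-step
# reasoning tends to produce a more reliable final label, which becomes the
# training target for the student model.
COT_LABELING_PROMPT = """You are labeling training data for a student model.
For the question below, reason through the problem step by step, and then
give your final answer on the last line in the form 'Answer: <label>'.

Question: {question}
"""
```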