Evaluate language models with Azure Databricks

Module
8 Units

Intermediate

Data Engineer

Azure Databricks

In this module, you explore large language model (LLM) evaluation using various metrics and approaches, learn about evaluation challenges and best practices, and discover automated evaluation techniques, including LLM-as-a-judge methods.

Learning objectives

In this module, you learn how to:

Evaluate LLMs using a range of metrics and approaches
Describe the relationship between LLM evaluation and AI system evaluation
Describe standard LLM evaluation metrics like accuracy, perplexity, and toxicity
Describe LLM-as-a-judge for evaluation

Prerequisites

Before starting this module, you should be familiar with Azure Databricks. Consider completing Explore Azure Databricks before starting this module.

Introduction
Explore LLM evaluation
Evaluate LLMs and AI systems
Evaluate LLMs with standard metrics
Describe LLM-as-a-judge for evaluation
Exercise - Evaluate an Azure OpenAI model
Module assessment
Summary
