Evaluate language models with Azure Databricks

Intermediate
Data Engineer
Azure Databricks

In this module, you explore how to evaluate large language models (LLMs) using a range of metrics and approaches, learn about common evaluation challenges and best practices, and discover automated evaluation techniques, including LLM-as-a-judge methods.

Learning objectives

In this module, you learn how to:

  • Evaluate LLMs
  • Describe the relationship between LLM evaluation and AI system evaluation
  • Describe standard LLM evaluation metrics like accuracy, perplexity, and toxicity
  • Describe LLM-as-a-judge for evaluation
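
To make two of the metrics named above concrete, here is a minimal sketch in plain Python. It is not tied to any Azure Databricks API. The function names and sample inputs are hypothetical, chosen only for illustration. It assumes accuracy means exact-match accuracy and that perplexity is computed from per-token natural-log probabilities that a model has already produced. Toxicity is left out because it usually depends on a separate classifier.

```python
import math


def accuracy(predictions, references):
    """Exact-match accuracy: the fraction of predictions equal to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def perplexity(token_log_probs):
    """Perplexity: exp of the average negative log-probability per token.

    token_log_probs holds natural-log probabilities the model assigned to
    each observed token (hypothetical input for this sketch).
    """
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    avg_negative_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_negative_log_likelihood)


# Hypothetical sample data: three of four predictions match their reference.
sample_accuracy = accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"])

# Hypothetical sample data: the model gave each of four tokens probability 0.25.
sample_perplexity = perplexity([math.log(0.25)] * 4)
```

Lower perplexity means the model was less surprised by the text. A model that assigns every token a probability of 0.25 is, on average, as uncertain as a uniform choice among four options.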

Prerequisites

Before starting this module, you should be familiar with Azure Databricks. If you're new to it, consider completing Explore Azure Databricks first.