Evaluate language models with Azure Databricks
Intermediate
Data Engineer
Azure Databricks
In this module, you explore large language model (LLM) evaluation using various metrics and approaches, learn about evaluation challenges and best practices, and discover automated evaluation techniques, including LLM-as-a-judge methods.
Learning objectives
In this module, you learn how to:
- Evaluate large language models (LLMs)
- Describe the relationship between LLM evaluation and AI system evaluation
- Describe standard LLM evaluation metrics like accuracy, perplexity, and toxicity
- Describe LLM-as-a-judge for evaluation
Prerequisites
Before starting this module, you should be familiar with Azure Databricks. If you aren't, consider completing Explore Azure Databricks first.