Evaluate language models with Azure Databricks

Intermediate
Data Engineer
Azure Databricks

Learn how evaluating large language models (LLMs) differs from evaluating traditional machine learning (ML) models, how LLM evaluation relates to the evaluation of an entire AI system, and which general and task-specific metrics are used to evaluate LLMs.

Learning objectives

In this module, you learn how to:

  • Compare LLM and traditional ML evaluations.
  • Describe the relationship between LLM evaluation and evaluation of entire AI systems.
  • Describe general-purpose LLM evaluation metrics such as accuracy, perplexity, and toxicity.
  • Describe LLM-as-a-judge for evaluation.

Prerequisites

Before starting this module, you should be familiar with Azure Databricks. If you aren't, consider completing Explore Azure Databricks first.