Automated evaluation of LLM apps with the azure-ai-generative SDK | Python Data Science Day

with Pamela Fox

Automated evaluation of LLM apps with the azure-ai-generative SDK.

Chapters

  • 00:00 - Automated evaluation of LLM apps with the azure-ai-generative SDK
  • 00:55 - Types of LLM apps
  • 01:09 - Prompt-only LLM app
  • 03:14 - Retrieval Augmented Generation (RAG) LLM app
  • 07:05 - RAG flow
  • 08:10 - Are the answers high quality?
  • 11:21 - LLM Ops for LLM Apps
  • 12:46 - Experimenting with quality factors
  • 14:55 - AI RAG Chat Evaluator (https://aka.ms/rag/eval)
  • 16:07 - Ground truth data
  • 18:32 - Evaluation
  • 25:29 - Evaluation approach
  • 25:57 - Improving ground truth data sets
  • 26:17 - Next steps

Developer
Python