Automated evaluation of LLM apps with the azure-ai-generative SDK | Python Data Science Day
with Pamela Fox
Automated evaluation of LLM apps with the azure-ai-generative SDK.
Chapters
- 00:00 - Automated evaluation of LLM apps with the azure-ai-generative SDK
- 00:55 - Types of LLM apps
- 01:09 - Prompt-only LLM app
- 03:14 - Retrieval Augmented Generation (RAG) LLM app
- 07:05 - RAG flow
- 08:10 - Are the answers high quality?
- 11:21 - LLM Ops for LLM Apps
- 12:46 - Experimenting with quality factors
- 14:55 - AI RAG Chat Evaluator https://aka.ms/rag/eval
- 16:07 - Ground truth data
- 18:32 - Evaluation
- 25:29 - Evaluation approach
- 25:57 - Improving ground truth data sets
- 26:17 - Next steps
Connect
- Pamela Fox | Twitter/X: @pamelafox