
Optimize AI-driven outcomes with prompt evaluations

Important

Some of the functionality described in this release plan has not been released. Delivery timelines may change and projected functionality may not be released (see Microsoft policy). Learn more: What's new and planned

Enabled for: Admins, makers, marketers, or analysts, automatically
Public preview: May 15, 2025
General availability: Sep 2025

Business value

The prompt accuracy scoring feature in AI Builder’s prompt builder gives you empirical evidence of prompt effectiveness by making prompts testable and, more importantly, by evaluating prompt outcomes. This helps you identify areas for improvement, so you can refine your prompts and align AI-driven outcomes with business goals.

Feature details

The prompt accuracy scoring feature in AI Builder’s prompt builder lets you build a test suite and validate prompt performance across iterations of prompt development. These detailed assessments help you make informed decisions about using prompts in agents, apps, and flows, moving capabilities to production, and prioritizing prompt improvements.

As you create or refine prompts, the feature analyzes the prompt’s structure, language, and relevance to the intended task, and assigns a confidence score to each test case prediction that reflects the expected performance of the prompt. The score is based on factors such as specificity, complexity, alignment, and custom assertions, giving you actionable insights for improving prompt phrasing and reducing ambiguity.

By providing a clear, quantifiable measure of prompt quality, accuracy scoring streamlines the prompt engineering process, enhances model outcomes, and reduces iteration time, enabling more efficient and reliable AI interactions across use cases.
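To illustrate the general idea of a test suite with custom assertions, here is a minimal sketch in Python. AI Builder’s internal scoring model is not public, so the `TestCase` structure, the `score_prompt` helper, and the stubbed model call are all hypothetical; the sketch only shows how per-test-case assertions can be aggregated into a single quantifiable score.

```python
# Hypothetical sketch: scoring prompt outputs against a test suite of assertions.
# This does not use any real AI Builder API; it only illustrates the concept.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TestCase:
    input_text: str
    # Custom assertions: each check inspects the model output and passes/fails.
    assertions: List[Callable[[str], bool]]


def score_prompt(run_prompt: Callable[[str], str], suite: List[TestCase]) -> float:
    """Return the fraction of assertions that pass across the whole suite."""
    passed = total = 0
    for case in suite:
        output = run_prompt(case.input_text)
        for check in case.assertions:
            total += 1
            passed += check(output)
    return passed / total if total else 0.0


# Example with a stubbed model call standing in for the real prompt execution:
suite = [
    TestCase(
        "Summarize: The cat sat on the mat.",
        [lambda out: "cat" in out, lambda out: len(out) < 100],
    ),
]
fake_model = lambda text: "A cat sat on a mat."
print(score_prompt(fake_model, suite))  # 1.0 — both assertions pass
```

Rerunning the same suite after each prompt revision gives a comparable score per iteration, which is the core of the validation loop described above.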