Conversational evaluation allows you to assess your agent's general behavior over a longer interaction. It reflects how real users interact with agents, where each response depends on previous context within an ongoing conversation. You can use these evaluations to determine whether an agent can maintain context, ask for clarifications, and complete multi‑step tasks.
You can also run single response evaluations, which are useful when you want to test how your agent answers specific questions, which capabilities it calls, and the exact wording it uses in its answers.
Evaluations use test sets. A test set for conversational evaluations consists of a group of up to 20 test cases. When you run an agent evaluation, you select a test set and Copilot Studio runs every test case in that set against your agent.
You can create test cases within a test set by importing them from a spreadsheet or by using AI to generate messages based on your agent's design and resources. You can then choose how you want to measure the quality of your agent's responses for each test case within a test set.
For more information about how agent evaluation works, see About agent evaluation.
To learn how to edit an existing test set, see Change the details of a test set.
Important
Test results are available in Copilot Studio for 89 days. To save your test results for a longer period, export the results to a CSV file.
Create a conversation test set
Go to your agent's Evaluation page.
Select New evaluation, then select Conversation.
You can create multi-turn test cases using any of the following methods:
Quick conversation set: Automatically generate 10 short conversations based on your agent’s description, instructions, and capabilities.
Full conversation set: Generate conversations using your agent’s knowledge or defined topics. With this option, you can choose to create short or long conversations.
Use your test chat: Convert the latest test chat into a test case.
Note
Conversation test sets support up to 20 test cases. Each test case supports up to 12 total messages, which is 6 pairs of questions and answers.
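If you draft conversations outside Copilot Studio before creating a test set, a quick check against the limits in the note above can catch oversized sets early. The following is a minimal sketch, not part of Copilot Studio: the `ConversationCase` class and the `validate_test_set` function are hypothetical names, and the in-memory shape (a flat list of message strings per test case) is an assumption made only for illustration. Only the two limits come from this article.

```python
from dataclasses import dataclass, field

# Limits stated in this article.
MAX_TEST_CASES = 20          # test cases per conversation test set
MAX_MESSAGES_PER_CASE = 12   # 6 question-and-answer pairs


@dataclass
class ConversationCase:
    # Hypothetical shape: user and agent messages stored as plain strings.
    name: str
    messages: list = field(default_factory=list)


def validate_test_set(cases):
    """Return a list of problems; an empty list means the set is within limits."""
    problems = []
    if len(cases) > MAX_TEST_CASES:
        problems.append(
            f"{len(cases)} test cases exceeds the limit of {MAX_TEST_CASES}"
        )
    for case in cases:
        if len(case.messages) > MAX_MESSAGES_PER_CASE:
            problems.append(
                f"'{case.name}' has {len(case.messages)} messages; "
                f"the limit is {MAX_MESSAGES_PER_CASE}"
            )
    return problems


if __name__ == "__main__":
    draft = [ConversationCase("refund flow", ["Where is my refund?", "Let me check."])]
    print(validate_test_set(draft) or "Within limits")
```

The check returns messages rather than raising an error so that every problem in a draft set is reported in one pass.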
Under Name, type a name for your test set.
Change or add the test methods you want to use. For conversation test sets, you can add the General quality, Keyword match, Capabilities match or the Classification custom test methods.
- Add a new method:
- Select Add test method.
- Select all the methods you want to test with, then select OK. You can add multiple methods.
- For some methods, set a pass score, then select OK. The pass score determines what score results in a pass or a failure.
- Some methods require adding expected responses or keywords for each of your test cases. For more information, see Choose evaluation methods.
- Select an existing test method to edit or delete.
| Test method | Measures | Test set type | Scoring | Configurations |
| --- | --- | --- | --- | --- |
| General quality | How good a test case's responses are, based on specific qualities | Single response or conversation | Scored out of 100% | None |
| Compare meaning | How well the meaning of the test case's answer matches the expected answer | Single response | Scored out of 100% | Pass score, expected answer |
| Capability use | Whether the test case used all or any of the expected resources | Single response | Pass/fail | Expected capabilities |
| Keyword match | Whether the test case used all or any of the expected keywords or phrases | Single response or conversation | Pass/fail | Expected keywords or phrases |
| Text similarity | How well the text of the test case's answer matches the expected answer | Single response | Scored out of 100% | Pass score, expected answer |
| Exact match | Whether the test case's answer matches the expected answer exactly | Single response | Pass/fail | Expected answer |
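To make the difference between pass/fail methods and scored methods concrete, here is a rough sketch of the two ideas. It doesn't show how Copilot Studio implements its test methods. The function names `keyword_match` and `passes` are hypothetical, the case-insensitive substring check is an assumption, and so is the rule that a score equal to the pass score counts as a pass.

```python
def keyword_match(response, expected, mode="all"):
    """Pass/fail check: does the response contain all (or any) expected phrases?

    Assumption: matching is a case-insensitive substring test.
    """
    text = response.casefold()
    hits = [phrase.casefold() in text for phrase in expected]
    if mode == "all":
        return all(hits)
    if mode == "any":
        return any(hits)
    raise ValueError("mode must be 'all' or 'any'")


def passes(score, pass_score):
    """Scored check: compare a 0-100 score against the configured pass score.

    Assumption: a score equal to the pass score counts as a pass.
    """
    return score >= pass_score


if __name__ == "__main__":
    answer = "Your refund is on its way."
    print(keyword_match(answer, ["refund", "5 days"], mode="all"))
    print(keyword_match(answer, ["refund", "5 days"], mode="any"))
    print(passes(score=80, pass_score=75))
```

In this sketch, an "all" match fails as soon as one expected phrase is missing, while an "any" match needs only one phrase to appear, which is why the choice between the two affects how strict a keyword test is.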
Edit the details of the test cases. All test methods, except general quality, require expected responses or keywords. For more information on editing test cases, see Modify a test set.
Select User profile, then select or add the account that you want to use for this test set, or continue without authentication. The evaluation uses this account to connect to knowledge sources and tools during testing. For information on adding and managing user profiles, see Manage user profiles and connections.
Note
Automated testing uses the authentication of the selected test account. If your agent has knowledge sources or connections that require specific authentication, select the appropriate account for your testing.
Edit or create more test cases. Learn more in Edit test cases within a test set.
Select Save to update the test set without running the test cases, or select Evaluate to run the test set immediately.