Create a test set for an agent (preview)

[This article is prerelease documentation and is subject to change.]

A test set is a collection of conversations that you use to evaluate your agent's performance. Each conversation represents a scenario you want your agent to handle.

Note

This article reflects the new agent experience in Microsoft Copilot Studio, which is currently available as a production-ready preview. Learn about the two experiences in Classic vs. new agent experience.

  • Production-ready previews are subject to supplemental terms of use.
  • Some capabilities available in the classic experience aren't yet available in the new experience.
  • Agents created in the new experience can't be converted to the classic experience.

Prerequisites

An agent created and saved in the new experience. See Create an agent (preview).

Start a new evaluation

  1. Open your agent in Copilot Studio.
  2. Select the Evaluate tab.
  3. Select New evaluation.

Then choose one of the following options:

Option 1: Upload conversations from a CSV file

  1. On the Evaluate tab, in the Data source section, drag a CSV file onto the upload area, or select the upload area to browse for a file.

    Tip

    Select the CSV link to download a template that shows the correct file format. The maximum file size is 5 MB.

  2. Review the imported conversations and make any needed adjustments.

  3. In the Configure test set panel on the right, enter a Name for the evaluation.

  4. Select Evaluate to run the evaluation, or select Save to save the test set without running it.
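Before uploading, it can help to check a CSV file locally against the constraints mentioned above. The following Python sketch is illustrative only and isn't part of Copilot Studio. The function name `check_csv_for_upload` is hypothetical, the 5 MB limit is read conservatively as 5,000,000 bytes (an assumption), and the sketch doesn't validate column names, because the downloadable template defines the actual format.

```python
import csv
from pathlib import Path

# Assumption: "5 MB" is interpreted conservatively as 5,000,000 bytes.
MAX_BYTES = 5_000_000


def check_csv_for_upload(path: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    file = Path(path)
    if not file.is_file():
        return [f"{path} does not exist or is not a file"]

    problems = []
    if file.suffix.lower() != ".csv":
        problems.append("file extension is not .csv")

    size = file.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, over the {MAX_BYTES}-byte limit")

    # Parse the file and keep only rows that contain at least one non-blank cell.
    # Column names are not checked here; compare them with the template yourself.
    with file.open(newline="", encoding="utf-8-sig") as handle:
        rows = [row for row in csv.reader(handle) if any(cell.strip() for cell in row)]
    if len(rows) < 2:
        problems.append("expected a header row plus at least one conversation row")

    return problems
```

For example, `check_csv_for_upload("conversations.csv")` returns an empty list when the file exists, has a `.csv` extension, is within the assumed size limit, and contains a header row followed by at least one non-empty row.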

Option 2: Generate conversations with AI

  1. On the Evaluate tab, in the More ways to start section, select Quick conversation set.
  2. The system generates 10 conversations based on the agent's description, instructions, and topics.
  3. Review the generated conversations in the Review your test cases list.
  4. In the Configure test set panel, enter a Name and select Evaluate or Save.

Alternatively, in the evaluation view, select Add conversations > Generate 10 conversations to add more AI-generated conversations. After you run an initial evaluation, you can also generate larger batches by selecting Add conversations > Generate 25 conversations or Generate 50 conversations.

Option 3: Write conversations manually

  1. On the Evaluate tab, in the More ways to start section, select Or, write some questions yourself.
  2. In the Review your test cases list, select Add conversations > Write to manually create a conversation (test case).
  3. Add a user question for each conversation. You can also add an expected agent response.
  4. In the Configure test set panel, enter a Name and select Evaluate or Save.
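If you prefer to draft test cases outside Copilot Studio before entering them, the shape described above (a required user question plus an optional expected agent response) can be modeled in a few lines. This Python sketch is an illustration under stated assumptions: the `DraftConversation` class, the `to_csv` helper, and the default header names `question` and `expected_response` are all hypothetical. If you plan to upload the result as a CSV file, replace the headers with the ones from the downloadable template.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftConversation:
    """A drafted test case: a user question and an optional expected response."""
    question: str
    expected_response: Optional[str] = None


def to_csv(cases, headers=("question", "expected_response")) -> str:
    """Serialize drafted test cases to CSV text.

    The default headers are placeholders (an assumption), not the real template.
    """
    # Every test case needs a user question; the expected response may be empty.
    for index, case in enumerate(cases, start=1):
        if not case.question.strip():
            raise ValueError(f"test case {index} has no user question")

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    for case in cases:
        writer.writerow([case.question, case.expected_response or ""])
    return buffer.getvalue()
```

Rejecting blank questions up front mirrors the rule that each conversation needs a user question, while a missing expected response is written as an empty cell.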

Configure the test set

When you create or edit an evaluation, the Configure test set panel on the right side includes:

  • Name: A required name for the evaluation.
  • Test method: The method used to score responses. General quality evaluates whether responses meet quality standards such as relevance and completeness.
  • User profile: The authenticated profile used to run the evaluation. Select Manage to configure which profile runs the tests.

Edit conversations in a test set

  1. On the Evaluate tab, open an existing test set from the Evaluation dropdown.
  2. Select a conversation to edit its messages.
  3. Use the Add conversations dropdown to add more conversations.
  4. Select Save when you're done.