
Partial status in Automatic Evaluation of Agent in AI Foundry

Parul Paul 65 Reputation points
2026-03-19T07:16:02.83+00:00

According to the official documentation, the evaluation run statuses are limited to Pending, In Progress (Running), Completed, and Failed. However, during our recent evaluations, we are observing an additional undocumented status: “Partial.”

From the evaluation details:

Status: Partial

Completion tokens: 319

Prompt tokens: 2,452

Fluency: 100% (1/1)

Based on our analysis, this “Partial” status appears when the run completes but not all rows or evaluators succeed.

We’ve identified that this behavior is occurring due to rate limiting from too many requests to the GPT-4.1 model, rather than any issue with the Agent configuration or execution.

Additionally, we understand that:

• Billing is based on token usage

• It is independent of whether the Agent is published or not


Questions:

  1. Is “Partial” an officially supported but undocumented status for evaluation runs?
  2. What is the recommended way to handle or avoid this status during automatic evaluation, especially when caused by too many requests?
  3. Is there any configuration available in Azure AI Foundry to automatically retry failed rows during evaluation?
Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


Answer accepted by question author
  1. SRILAKSHMI C 17,545 Reputation points Microsoft External Staff Moderator
    2026-03-20T05:52:00.78+00:00

    Hello Parul Paul,

    Welcome to Microsoft Q&A,

    Thanks for the detailed analysis; your observations are accurate, and this is a known behavior.

    1. About the “Partial” status

    Partial is not currently listed in the official documentation (which mentions Pending, In Progress, Completed, and Failed), but it is a valid system-generated state.

    It indicates that:

    • The evaluation run completed overall, but

    • One or more rows or evaluators did not complete successfully

    In simple terms, it means “Completed with some errors” rather than a full failure.

    2. Why this is happening

    This behavior is most commonly seen when the system is under load, especially with models like GPT-4.1.

    Typical causes include:

    • Rate limiting (TPM/RPM constraints)

    • High concurrency (too many evaluation requests at once)

    • Backend throttling or transient timeouts

    In such cases, some requests succeed while others fail, resulting in a “Partial” status instead of “Completed.”

    3. How to handle or avoid “Partial” runs

    To reduce the chances of encountering this:

    • Lower concurrency in your evaluation runs

    • Break your dataset into smaller batches instead of running large evaluations in one go (smaller chunks rather than hundreds of rows at once)

    • Implement retry logic with exponential backoff for failed requests

    • Ensure your deployment has sufficient quota (TPM capacity)

    • If possible, test in a region with better availability/capacity, or use a lighter model for evaluation scenarios

    These steps help reduce throttling and improve overall success rates.
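    The batching idea above can be sketched in plain Python. This is a generic helper, not a Foundry API; the `chunk_rows` name and batch size are illustrative assumptions:

```python
def chunk_rows(rows, batch_size):
    """Split a dataset into fixed-size batches so each evaluation
    submission stays well under TPM/RPM limits."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch

# Example: evaluate 250 rows in five runs of 50 instead of one large run.
rows = [{"id": i} for i in range(250)]
batches = list(chunk_rows(rows, 50))
```

    Each batch can then be submitted as its own evaluation run, with a pause between submissions if throttling persists.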

    4. Retry behavior

    Currently, there is no built-in option in Azure AI Foundry or the Evaluation SDK to automatically retry failed rows within a run.

    Recommended approach:

    1. Run the evaluation.
    2. Review the run details to identify failed rows.
    3. Create a smaller dataset containing only the failed rows.
    4. Re-run the evaluation on that subset.

    This is the most effective way to handle partial failures today.
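    The failed-row re-run workflow can be sketched in plain Python. The per-row `status` field and the `succeeded` value here are assumptions for illustration; check the actual schema of your run's per-row output:

```python
import json
import os
import tempfile

def collect_failed_rows(results_path, subset_path, status_key="status"):
    """Read per-row evaluation results (JSON Lines), keep only rows
    that did not succeed, and write them to a new dataset file that
    can be fed into a follow-up evaluation run."""
    failed = []
    with open(results_path) as src:
        for line in src:
            record = json.loads(line)
            if record.get(status_key) != "succeeded":
                failed.append(record)
    with open(subset_path, "w") as dst:
        for record in failed:
            dst.write(json.dumps(record) + "\n")
    return len(failed)

# Demo with a tiny hand-made results file (field names are assumptions).
workdir = tempfile.mkdtemp()
results = os.path.join(workdir, "results.jsonl")
subset = os.path.join(workdir, "retry_subset.jsonl")
with open(results, "w") as f:
    f.write(json.dumps({"id": 1, "status": "succeeded"}) + "\n")
    f.write(json.dumps({"id": 2, "status": "failed"}) + "\n")
    f.write(json.dumps({"id": 3, "status": "succeeded"}) + "\n")

n_failed = collect_failed_rows(results, subset)
```

    The resulting subset file can be uploaded as a new dataset and re-evaluated on its own.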

    5. Billing clarification

    • Billing is based on token usage
    • Charges apply regardless of run status, including Partial
    • Only successfully processed tokens are billed, but failed attempts may still consume tokens depending on where the failure occurred.

    Please refer to these resources:

    • Troubleshooting evaluation runs (common fail reasons, throttling tips): https://learn.microsoft.com/azure/ai-foundry/how-to/develop/cloud-evaluation?view=foundry-classic

    • Evaluation submission wizard errors & limits (batch sizing, region tips): https://learn.microsoft.com/azure/ai-foundry/how-to/evaluations-storage-account

    • General evaluation best practices (batch size, token limits, retries): https://learn.microsoft.com/azure/ai-foundry/concepts/observability?view=foundry-classic

    I hope this helps. Do let me know if you have any further queries.


    If this answers your query, please do click Accept Answer and Yes for was this answer helpful.

    Thank you!


Answer accepted by question author
  1. Debashmita Saha 235 Reputation points
    2026-03-20T04:18:25.83+00:00

    Hi Parul,

    Thanks for the question.

    When you see a “Partial” status in Automatic Evaluation of Agent in AI Foundry, it usually means that the evaluation didn’t fully complete or that only part of the criteria were met. Think of it like a progress report—your agent passed some checks but not all of them. This can happen if:

    • Certain test cases ran successfully, while others didn’t.

    • The evaluation was interrupted or timed out.

    • Some metrics were available, but others couldn’t be calculated.

    In practice, “Partial” is a signal to double‑check logs or outputs to see which parts worked and which didn’t. It’s not necessarily a failure—it’s more like a “work in progress” flag that nudges you to investigate further.

    To answer the second question, Azure AI Foundry doesn’t currently provide a native configuration to automatically retry failed rows during evaluation. The only recommended approach is to handle retries in your own workflow—either by re‑running failed rows or adding retry logic in your evaluation pipeline.

    I can think of the following steps to try out:

    1. Look at which rows failed. Often, you’ll see timeouts or rate‑limit errors.
    2. If you’re sending too many requests at once, scale back the batch size to reduce concurrency.
    3. Script retries manually (for example, re‑running only the failed rows).
    4. Keep an eye on token counts and request volume to avoid hitting limits.
    5. Add a delay or exponential backoff in your retry script so you don’t overwhelm the service again.
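    Steps 3 and 5 above can be combined into a small retry helper. This is a generic sketch with a placeholder flaky call standing in for an evaluation request; it is not a Foundry SDK API:

```python
import random
import time

def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Retry a flaky call with exponential backoff plus jitter, so
    retries do not hammer an already-throttled endpoint in lockstep."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))

# Demo: a placeholder call that fails twice, then succeeds.
attempts = {"n": 0}

def flaky_evaluation_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = retry_with_backoff(flaky_evaluation_request, base_delay=0.01)
```

    In a real pipeline, `flaky_evaluation_request` would be replaced by whatever submits a row or batch to the model, and the caught exception narrowed to throttling-related errors.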

    Hope this helps! Please feel free to reach out in the comments if you have further queries.

