Hello Parul Paul,
Welcome to Microsoft Q&A.
Thanks for the detailed analysis. Your observations are accurate, and this is known behavior.
1. About the “Partial” status
Partial is not currently listed in the official documentation (which mentions Pending, In Progress, Completed, and Failed), but it is a valid system-generated state.
It indicates that:
- The evaluation run completed overall, but
- One or more rows or evaluators did not complete successfully.
In simple terms, it means “Completed with some errors”, rather than a full failure.
2. Why this is happening
This behavior is most commonly seen when the system is under load, especially with models like GPT-4.1.
Typical causes include:
- Rate limiting (TPM/RPM constraints)
- High concurrency (too many evaluation requests at once)
- Backend throttling or transient timeouts
In such cases, some requests succeed while others fail, resulting in a “Partial” status instead of “Completed.”
3. How to handle or avoid “Partial” runs
To reduce the chances of encountering this:
- Lower the concurrency of your evaluation runs.
- Break your dataset into smaller batches rather than evaluating hundreds of rows in one run.
- Implement retry logic with exponential backoff for failed requests.
- Ensure your deployment has sufficient quota (TPM capacity).
- If possible, test in a region with better availability/capacity, or use a lighter model for evaluation scenarios.
These steps help reduce throttling and improve overall success rates.
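As a rough illustration of the batching and backoff suggestions above, here is a minimal Python sketch using only the standard library. It is not part of the Evaluation SDK: `chunked`, `with_backoff`, and `evaluate_batch` are hypothetical names, the batch size of 25 and the delay values are arbitrary assumptions, and `evaluate_batch` is a placeholder you would replace with your own evaluation call.

```python
import random
import time
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(rows: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def with_backoff(
    call: Callable[[], R],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> R:
    """Run `call`, retrying on any exception with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")


def evaluate_batch(batch):
    """Hypothetical stand-in: replace with your own evaluation call."""
    return len(batch)


if __name__ == "__main__":
    rows = [{"id": i} for i in range(60)]
    for batch in chunked(rows, 25):  # 25 rows per batch is an arbitrary choice
        with_backoff(lambda b=batch: evaluate_batch(b))
```

In practice you would likely narrow the `except Exception` clause to the throttling or timeout errors your client actually raises, so that genuine bugs are not retried.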
4. Retry behavior
Currently, there is no built-in option in Azure AI Foundry or the Evaluation SDK to automatically retry failed rows within a run.
Recommended approach:
- Run the evaluation.
- Review the run details to identify failed rows.
- Create a smaller dataset containing only the failed rows.
- Re-run the evaluation on that subset.
This is the most effective way to handle partial failures today.
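The "create a smaller dataset" step above could be scripted along these lines. This is only a sketch under several assumptions: the dataset is a JSONL file, every row carries a string identifier in a field named `id`, and you have already collected the identifiers of the failed rows from the run details yourself. The function name `write_failed_subset` and the file layout are hypothetical, not something the portal or SDK provides.

```python
import json
from pathlib import Path
from typing import Iterable


def write_failed_subset(
    source: Path,
    target: Path,
    failed_ids: Iterable[str],
    id_field: str = "id",  # assumed name of the row identifier field
) -> int:
    """Copy rows whose id is in `failed_ids` from one JSONL file to another.

    Returns the number of rows written to `target`.
    """
    wanted = set(failed_ids)
    kept = 0
    with source.open(encoding="utf-8") as src, target.open("w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            row = json.loads(line)
            if row.get(id_field) in wanted:
                dst.write(json.dumps(row, ensure_ascii=False) + "\n")
                kept += 1
    return kept
```

For example, `write_failed_subset(Path("eval.jsonl"), Path("eval_retry.jsonl"), ["row-17", "row-42"])` would write only those two rows (if present) to `eval_retry.jsonl`, which you can then submit as a new, smaller evaluation run. The file names and row ids here are placeholders.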
5. Billing clarification
- Billing is based on token usage
- Charges apply regardless of run status, including Partial
- You are billed for the tokens the model actually processed. A failed attempt may still have consumed tokens, depending on where the failure occurred.
Please refer to the following:
• Troubleshooting evaluation runs (common fail reasons, throttling tips): https://learn.microsoft.com/azure/ai-foundry/how-to/develop/cloud-evaluation?view=foundry-classic
• Evaluation submission wizard errors & limits (batch sizing, region tips): https://learn.microsoft.com/azure/ai-foundry/how-to/evaluations-storage-account
• General evaluation best practices (batch size, token limits, retries): https://learn.microsoft.com/azure/ai-foundry/concepts/observability?view=foundry-classic
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".
Thank you!