Hi Shariq Hashmi,
Thank you for sharing the details.
Based on the behavior you’re seeing, where the agent starts responding and then consistently fails mid-generation, this does not appear to be a prompt issue. This pattern typically points to a service-side or configuration-related issue in the AI Foundry Agent Playground.
Below are the most relevant checks and explanations to help narrow this down.
What the error indicates
The generic error returned with a request ID usually means the request reached the backend, but an internal failure occurred during response generation. Since this is 100% reproducible, retries alone will not resolve it.
This can be caused by either:
- A backend service issue, or
- A misconfiguration in the AI Foundry project or agent setup
Things to verify on your side
- Permissions and role assignments
Ensure the identity used by the agent (user, service principal, or managed identity) has the correct roles on the AI Foundry project:
  - Azure AI User (minimum)
  - Azure AI Project Manager (recommended)
You can verify this under Access Control (IAM) for the AI project.
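If you prefer to script the role check, here is a minimal sketch. It assumes you have saved the JSON printed by `az role assignment list --assignee <principal-id> --scope <project-resource-id> --output json`. The function name `missing_roles` and the sample data are hypothetical. The only field the code relies on is `roleDefinitionName`, which the CLI includes in each assignment entry.

```python
import json

# Role name taken from the checklist above; extend the set if you also
# want to require the recommended role.
REQUIRED_ROLES = {"Azure AI User"}


def missing_roles(assignments_json, required=REQUIRED_ROLES):
    """Return the required role names that are absent from the CLI output."""
    assignments = json.loads(assignments_json)
    granted = {entry.get("roleDefinitionName") for entry in assignments}
    return set(required) - granted


# Illustrative sample only, shaped like the CLI's JSON output.
sample = json.dumps([
    {"roleDefinitionName": "Reader", "scope": "/subscriptions/<sub-id>"},
])
print(missing_roles(sample))  # {'Azure AI User'}
```

An empty set means every required role was found at the scope you queried. Roles inherited from a parent scope only appear if you ask the CLI to include them.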
- Networking configuration
If your project uses any of the following:
  - VNET integration
  - Private endpoints
  - Restricted network access
Please confirm that:
  - All related resources (AI project, networking components) are in the same region
  - Networking was configured at project creation time
Adding or changing VNET / private endpoint settings after agent creation is not supported and can lead to runtime failures like the one you’re seeing.
- API key and connection settings
Confirm that:
  - The API key or managed identity used has access to the Azure OpenAI / AI Foundry resource
  - Required connections (for example, model connections) exist and are active
If agents were deployed via ARM/Bicep, ensure the template includes the required aiServicesConnections configuration.
Missing this can cause agents to fail at runtime even though creation succeeds.
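One quick way to confirm the property is present is to search the compiled ARM JSON for it (a Bicep file can be compiled with `az bicep build --file <your-file>.bicep`). The sketch below is a generic key search, not an official validation. The helper name `find_key_paths` and the trimmed template are hypothetical, and the sample does not claim to show where the property belongs in a real template.

```python
import json


def find_key_paths(node, key, path="$"):
    """Recursively collect the JSON paths at which `key` appears."""
    hits = []
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}.{name}"
            if name == key:
                hits.append(child)
            hits.extend(find_key_paths(value, key, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_key_paths(item, key, f"{path}[{index}]"))
    return hits


# Hypothetical, heavily trimmed template used only to exercise the function.
template = json.loads("""
{"resources": [{"name": "example",
                "properties": {"aiServicesConnections": ["<connection-name>"]}}]}
""")
print(find_key_paths(template, "aiServicesConnections"))
# ['$.resources[0].properties.aiServicesConnections']
```

An empty list means the key does not appear anywhere in the template, which is worth fixing before redeploying.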
- Quotas and limits
Although the error message is generic, quota exhaustion can sometimes surface this way.
Please check:
  - Agent limits per project/region
  - Model quota availability
  - Token or request limits
You can verify this in the AI Foundry quota and limits blade.
- Model and region compatibility
Ensure that:
  - The model selected for the agent is supported in your region
  - The model is currently available and healthy
Using an unsupported or constrained model can result in mid-response failures.
When this is likely a service-side issue
The following observations apply in your case:
  - The response begins streaming
  - The failure is consistent
  - Multiple attempts return different request IDs
Taken together, these strongly suggest a backend execution issue in the Foundry Agent Playground.
Recommended steps
Open an Azure Support Request. Include:
- Request IDs:
  - 68ed3d2a-ffb0-4d1e-b608-b8cbf058df32
  - 9e2f7fba-b4a5-45a5-8d6a-1efc077e31ac
- Subscription ID
- Tenant ID
- AI Foundry project name and region
- Confirmation that the error occurs mid-response in the Playground
Test via API. If the agent is callable via REST/SDK, test outside the Playground to determine whether the issue is UI-specific or service-wide.
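When testing outside the Playground, it helps to record how much of the response arrived before the failure. The sketch below is a small diagnostic wrapper that works with any iterator of text chunks. `consume_stream` and `fake_failing_stream` are hypothetical names, and the fake stream only simulates the symptom. In a real test you would pass the iterator returned by whichever streaming call your SDK or REST client provides.

```python
def consume_stream(chunks):
    """Read chunks until the stream ends or raises; report what arrived."""
    received = []
    try:
        for chunk in chunks:
            received.append(chunk)
    except Exception as exc:  # broad on purpose: this is a diagnostic probe
        return {"completed": False, "chunks": len(received),
                "text": "".join(received), "error": repr(exc)}
    return {"completed": True, "chunks": len(received),
            "text": "".join(received), "error": None}


def fake_failing_stream():
    """Stand-in for a real agent call; fails after two chunks."""
    yield "Hello"
    yield ", wor"
    raise RuntimeError("simulated mid-response failure")


result = consume_stream(fake_failing_stream())
print(result["completed"], result["chunks"])  # False 2
```

If the API call also stops partway through, the problem is unlikely to be specific to the Playground UI, and the chunk count and error text are useful additions to the support request.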
Thank you!