Hello Mukul Munjal,
Welcome to Microsoft Q&A, and thank you for sharing the detailed description and event payload.
From what you’ve described, the Realtime session is established successfully and response generation begins, but then fails mid‑stream with a server_error and the response status changes to failed. This is not expected behavior.
What This Indicates
Session connection is successful
Response generation starts
Failure occurs during processing
This pattern typically points to a transient backend or dependency issue within the Realtime service pipeline, rather than a problem with your request format.
Recommended Checks & Mitigations
While we investigate this from the backend side, here are steps you can take to stabilize and isolate the issue:
Add Retry Logic
- server_error (HTTP 5xx) is often transient.
- Implement retry with exponential backoff.
- If using an SDK, increase retry attempts (default is usually low).
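The retry advice above could be sketched roughly as follows. This is a minimal illustration, not an SDK feature: `TransientServerError` and `retry_with_backoff` are hypothetical names, and the delay values are assumptions you would tune for your workload. Your own code would raise `TransientServerError` (or catch the equivalent SDK exception) when a response fails with server_error.

```python
import random
import time


class TransientServerError(Exception):
    """Hypothetical stand-in for a server_error / HTTP 5xx failure."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=8.0, sleep=time.sleep):
    """Call operation() and retry on TransientServerError.

    The upper bound of the wait doubles after each failed attempt
    (base_delay, 2*base_delay, 4*base_delay, ...) and is capped at
    max_delay. The actual wait is a random value below that bound
    ("full jitter") so that many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientServerError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            bound = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, bound))
```

Only errors you consider transient should be retried this way; a malformed request will fail the same way on every attempt.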
Verify Endpoint, Deployment & Parameters
Ensure your endpoint format is correct:
https://<resource>.openai.azure.com/openai/deployments/<deployment>/...
Double-check the deployment name, the API version, and parameters such as stream: true, max_completion_tokens (avoid very large values), temperature, and top_p.
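As a quick sanity check, the endpoint prefix shown above can be validated before sending any request. The sketch below is only illustrative: `parse_endpoint` is a hypothetical helper, the allowed characters for the resource name are an assumption, and everything after the deployment segment is deliberately left unchecked because it depends on the API you are calling.

```python
import re

# Matches only the prefix given in this answer:
#   https://<resource>.openai.azure.com/openai/deployments/<deployment>/
# The character class for <resource> is an assumption for illustration.
ENDPOINT_PREFIX = re.compile(
    r"^https://(?P<resource>[A-Za-z0-9-]+)\.openai\.azure\.com"
    r"/openai/deployments/(?P<deployment>[^/?#]+)/"
)


def parse_endpoint(url):
    """Return (resource, deployment) if the URL has the expected prefix, else None."""
    match = ENDPOINT_PREFIX.match(url)
    if match is None:
        return None
    return match.group("resource"), match.group("deployment")
```

A `None` result usually means a typo in the host name or a missing `/openai/deployments/` segment; a returned deployment name can then be compared against the one shown in the portal.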
Test with a Minimal Request
Try a simple text‑only streaming request:
{
"model": "<your-deployment>",
"messages": [
{ "role": "user", "content": "Hello" }
],
"stream": true,
"max_completion_tokens": 1000
}
If this succeeds, gradually add audio or larger prompts to identify if complexity is triggering the failure.
Check Input / Streaming Behavior
Large prompts, long audio streams, or rapid event bursts can cause mid‑stream failures.
For audio: the format should be PCM16 mono at 24 kHz, chunks should be small (~100 ms), and payloads must be properly base64-encoded.
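The audio guidance above can be sketched as follows. At 24 kHz, 16-bit mono, 100 ms of audio is 24000 × 2 × 0.1 = 4800 bytes. The helper names `chunk_pcm16` and `to_append_event` are hypothetical, and the `input_audio_buffer.append` event shape is assumed here; please confirm it against the Realtime API reference for your API version.

```python
import base64
import json

SAMPLE_RATE_HZ = 24_000   # 24 kHz
BYTES_PER_SAMPLE = 2      # PCM16 mono: one 16-bit sample per frame
CHUNK_MS = 100            # ~100 ms per chunk


def chunk_pcm16(pcm_bytes, chunk_ms=CHUNK_MS):
    """Yield consecutive slices of raw PCM16 mono audio, about chunk_ms long each."""
    chunk_size = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm_bytes), chunk_size):
        yield pcm_bytes[start:start + chunk_size]


def to_append_event(chunk):
    """Wrap one chunk in a JSON event with a base64-encoded payload.

    The event type and field name are assumptions for illustration.
    """
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(chunk).decode("ascii"),
    })
```

Each string returned by `to_append_event` would be sent as one message on the session; pacing the sends (rather than bursting all chunks at once) helps avoid the rapid event bursts mentioned above.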
Use Azure Monitor to check request volume, throttling, or failures.
If the region is under load, consider scaling your resource or testing in another region.
Ensure you are not hitting realtime session limits or rate limits, which can surface as generic failures.
Check Azure Service Health / Resource Health for any ongoing issues in your region.
Please refer to this:
GPT Realtime API troubleshooting (speech & audio): https://learn.microsoft.com/azure/ai-foundry/openai/how-to/realtime-audio#troubleshooting
I hope this helps. Do let me know if you have any further queries.
Thank you!