Azure OpenAI availability rate down to 65% with 503 errors
Today, chat completion requests with high token counts (~10K) were frequently rejected with a 503 error:
openai.InternalServerError: Error code: 503 - {'error': {'code': 'InternalServerError', 'message': 'The service is temporarily unable to process your request. Please try again later.'}}
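Since the message says to try again later, a retry wrapper with capped exponential backoff is one way to ride out short 503 bursts. The following is only a minimal sketch: `call_with_backoff`, its parameters, and `TransientServiceError` are hypothetical names that are not part of the `openai` package, and the stand-in exception is used so the snippet runs without any dependencies.

```python
import random
import time


class TransientServiceError(Exception):
    """Hypothetical stand-in for openai.InternalServerError (HTTP 503)."""


def call_with_backoff(request, retry_on=(TransientServiceError,), max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call request() and retry on the given exceptions with capped exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

In real code, `retry_on` would presumably be `(openai.InternalServerError,)` and `request` a lambda wrapping `client.chat.completions.create(...)`. Retries only mask brief outages; they will not restore availability if the service keeps returning 503.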
The availability chart from monitoring is included.
Deployment info:
Model: gpt-4o
Deployment type: Global Standard
Rate limit (Tokens per minute): 13,565,000
Rate limit (Requests per minute): 81,390
Model version: 2024-11-20
Region: eastus
Troubleshooting in the Portal did not help.
I need to explain to my customers that the latency and unavailability are caused by Azure OpenAI, not by my production code.
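One way to back that up with data would be to tally the HTTP status codes my client receives and report how many failures are 5xx (service side). This is a rough sketch only: `summarize_outcomes` and the plain list of status codes are hypothetical, and how the codes get recorded depends on the application's own logging.

```python
from collections import Counter


def summarize_outcomes(status_codes):
    """Summarize HTTP status codes recorded by the caller (hypothetical log format)."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    # 5xx responses are produced by the service, not by the calling code.
    server_errors = sum(n for code, n in counts.items() if 500 <= code <= 599)
    succeeded = sum(n for code, n in counts.items() if 200 <= code <= 299)
    return {
        "total": total,
        "availability_pct": 100.0 * succeeded / total if total else 0.0,
        "server_error_pct": 100.0 * server_errors / total if total else 0.0,
        "by_status": dict(counts),
    }
```

A high `server_error_pct` with few or no 4xx entries in `by_status` would point at the service rather than at malformed requests from my code.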
I need urgent support.