
[Azure OpenAI] Recent poor server side availability of GPT-5

Fan Zhao 20 Reputation points
2026-04-02T19:23:06.57+00:00

Hello. We recently noticed that the GPT-5 model's availability has been poor (5xx HTTP error codes) since April 1, 11:52 PM UTC, in at least US East and US West. Some of our production workloads depend on this model. Other GPT models seem to be fine.

[Screenshot: 2026-04-02 at 3.15.14 PM]

Has anyone else noticed the same pattern, and does anyone know when it will be fixed? TIA

Azure OpenAI Service

An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.


Answer accepted by question author
  1. Anshika Varshney 9,655 Reputation points Microsoft External Staff Moderator
    2026-04-03T17:34:47.38+00:00

    Hi Fan Zhao,

    Thanks for raising this. When you see bursts of 5xx errors or intermittent availability with Azure OpenAI, it usually points to a temporary service‑side issue rather than a problem in your application code. From community reports, this kind of behavior has been observed before with specific models or regions while other models continue to work normally.

    The first thing to check is whether the issue is model‑specific or region‑specific. If other models in the same region are working fine, it strongly suggests the problem is isolated to that model backend. As a quick validation step, try running the same request against another deployment or region if you have one available. This helps confirm that the failures are not caused by request format or client configuration.

    From an application design point of view, make sure your client has basic retry and resiliency logic. For short‑lived 5xx responses, retrying the request with a short delay can help absorb transient backend issues. Many teams also spread production workloads across multiple deployments or regions so that a temporary degradation in one location does not take everything down.

    If you are running critical workloads on a single model, a practical workaround is to temporarily route traffic to an alternate model that offers similar capability while the affected model stabilizes. Community feedback shows that in several incidents, other GPT models remained healthy while one specific model had degraded availability.
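    A sketch of that fallback routing, again with simulated backends: the deployment names `gpt-5-prod` and `gpt-4o-fallback` are illustrative only (not from this thread), and in real code each callable would be a client bound to a specific Azure OpenAI deployment or region.

```python
class BackendDown(Exception):
    """Simulated 5xx-style failure from one deployment."""

def complete_with_fallback(prompt, deployments):
    """Try each deployment in order; return (name, result) from the first success.

    `deployments` maps a deployment name to a callable that performs the request.
    """
    last_err = None
    for name, call in deployments.items():
        try:
            return name, call(prompt)
        except BackendDown as err:
            last_err = err  # this backend is degraded; try the next one
    raise RuntimeError("all deployments failed") from last_err

# Demo: the primary model is degraded, the fallback answers.
def primary(prompt):
    raise BackendDown("HTTP 503 from gpt-5 backend")

def fallback(prompt):
    return f"answer to: {prompt}"

used, result = complete_with_fallback(
    "ping", {"gpt-5-prod": primary, "gpt-4o-fallback": fallback}
)
print(used, result)  # gpt-4o-fallback answer to: ping
```

    Combining this with the retry logic above (retry the primary a few times, then fall back) gives reasonable resilience against a single degraded model backend.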

    For reference, Microsoft’s general guidance on Azure OpenAI reliability and deployment practices is documented here: https://learn.microsoft.com/azure/ai-services/openai/overview

    In short, this pattern usually means a backend availability issue that resolves from the service side. Validating region and model scope, checking Azure status updates, and having simple retry and fallback logic in place are the most effective ways to handle it until the service fully recovers.

    I hope this helps. Do let me know if you have any further queries.
    Thank you!


1 additional answer

  1. Q&A Assist
    2026-04-02T19:23:22.3266667+00:00
    AI-generated content. This answer was created with AI from Q&A Assist.

    Service-side availability incidents and their timelines are not documented in the provided material, and there is no information about a known GPT‑5 outage or its resolution.

    For production-impacting 5xx errors with Azure OpenAI models such as GPT‑5 in specific regions (East US, West US), the supported path is:

    1. Check Azure Service Health and Azure Status for your subscription and the affected regions to see if there is an active incident for Azure OpenAI.
    2. If nothing is listed or the issue is more severe for your subscription than the public status indicates, open a support case from the Azure portal so the product group can investigate using your subscription ID, region, model name, and example correlation IDs from failed calls.
    3. As a mitigation while the issue is investigated, consider:
      • Deploying GPT‑5 in an alternative region where it is available and routing traffic there.
      • Using another available model (for example GPT‑5.1 or GPT‑4o) as a temporary fallback, depending on what is deployed and tested in your environment.

    The documentation also notes that model availability and capacity are managed centrally and can vary by region; customers cannot request custom regional provisioning, but can request additional quota in regions where the model is already offered.


