Azure OpenAI Service Internal Error 500

Max 0 Reputation points
2024-07-13T01:56:34.9066667+00:00

Hi,

We encountered a lot of 500 errors when using the Azure OpenAI service today.

Region: North Central US

Internal server error | Apim-request-id: 8441184c-a5f5-4ccc-be01-28fec6e41783

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. YutongTie-MSFT 48,156 Reputation points
    2024-07-13T04:56:48.2466667+00:00

    Hello everyone,

    Thanks for reaching out and reporting this issue. It has been escalated to and acknowledged by the product team, which is working on a fix. I will post an update here later.

    Thanks for your understanding.

    Regards,

    Yutong


  2. YutongTie-MSFT 48,156 Reputation points
    2024-07-13T09:34:57.68+00:00

    Hello everyone,

    Here is an update on this issue with more details:

    Impact Statement: Starting at approximately 00:01 UTC on 13 July 2024, a subset of customers across multiple regions using the Azure OpenAI service began experiencing errors when calling the Azure OpenAI endpoints and may have issues accessing their resources for the duration of this impact. 

    Current Status: During a routine cleanup operation, we believe that some dependent backend components became unavailable, which led to the issues described above. This was initially believed to be transient because it affected only a small percentage of customers; however, the impact soon spread to multiple regions. We have stopped the cleanup operation to avoid further impact while we continue our recovery efforts.

    These efforts are ongoing across all impacted regions, which are seeing partial service restoration. Customers in these regions may start noticing improvements. 

    Retries may be successful as the mitigation progresses across the different affected regions. 

    Initially, this issue was communicated via the public Azure Status page because the full impact had not yet been determined. Now that the root cause has been determined and the impact is no longer increasing, all further updates will be sent to impacted customers through our standard communications in Azure Service Health.

    This issue should be fixed soon.

    I hope this helps!

    Regards,

    Yutong

    Please accept the answer if you found it helpful, to support the community. Thanks a lot.
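
Regarding the note above that retries may succeed as mitigation progresses: below is a minimal client-side sketch of retrying on 5xx responses with exponential backoff and jitter. It is only an illustration, not an official Azure recommendation. The names `call_with_retries`, `send`, and `RETRYABLE_STATUS`, as well as the attempt count and delay values, are assumptions made for this example; `send` stands in for whatever function issues your request and returns a `(status_code, body)` pair.

```python
import random
import time

# Assumption: treat these server-side status codes as transient and worth retrying.
RETRYABLE_STATUS = {500, 502, 503, 504}


def call_with_retries(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    The last response is returned even if it is still an error, so the caller
    can log details such as the request ID.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUS:
            return status, body
        if attempt < max_attempts:
            # Exponential backoff capped at max_delay, with full jitter so that
            # many clients do not retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
    return status, body


if __name__ == "__main__":
    # Simulated endpoint: fails twice with 500, then succeeds.
    responses = iter([(500, "Internal server error"),
                      (500, "Internal server error"),
                      (200, "ok")])
    print(call_with_retries(lambda: next(responses), sleep=lambda s: None))
```

Capping the number of attempts and the delay keeps a prolonged outage from turning into an unbounded retry loop, and the jitter helps avoid adding synchronized load to a service that is still recovering.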