Azure OpenAI: GPT-4o deployment has a 2 minute hard timeout via API call

Adriaan 25 Reputation points

Hi, we have a deployment of GPT-4o in Azure that's acting strangely in comparison to GPT-4 Turbo.

We access this deployment via either Semantic Kernel or via the Azure AI SDK using dotnet, depending on the use case, but both frameworks obviously invoke the same API calls for chat completion.
Regardless, when prompting these models via the chat completion API there is a default response timeout of 2 minutes.
This timeout can be overridden via the OpenAI client options, and has always worked with GPT 3.5, 3.5 Turbo, 4 and 4 Turbo.

Since moving to GPT-4o, overriding this timeout no longer appears to have any effect, and should your completion generation take over 2 minutes, the connection from the host will be cut at exactly 2 minutes.

This behaviour occurs regardless of the framework being used and can be replicated by calling the chat completion API directly (all API versions appear to have this issue) or via the Chat Playground in Azure OpenAI Studio.

Is this a bug / issue with Azure's implementation of GPT-4o? Or is this a bug / issue on OpenAI's side?

Kindly assist.

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
2,465 questions
{count} vote

Accepted answer
  1. navba-MSFT 18,900 Reputation points Microsoft Employee

    @Adriaan Apologies for the late reply. I appreciate your patience on this.


    The product Owners were involved in the background, to look into this timeout issue.


    There was a known issue for the GPT4o causing timeout. The cause has been identified.


    The fix will be deployed to all regions by end of this week.


    Post that you can test again and let me know how it goes.


    Please do not forget to "Accept the answer” and “up-vote” wherever the information provided helps you, this can be beneficial to other community members.

    1 person found this answer helpful.

0 additional answers

Sort by: Most helpful