Azure OpenAI service responding very slow and the performance is not consistent

Nayak, Rajesh 65

Hi,

We are experimenting with the Azure OpenAI service. We are using GPT-3 (text-davinci-003) and ChatGPT for our use case, but the performance of the endpoint is not consistent. For a prompt with few-shot examples, the GPT-3 endpoint sometimes takes 10 seconds to respond, and sometimes the same call takes 2.5 seconds. Could you advise on how to improve the performance of the model endpoints?
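
To put numbers on the variation described above, a small timing harness can help. This is a minimal sketch, not a definitive benchmark: `measure_latencies`, `summarize`, and `fake_completion` are hypothetical names, and `fake_completion` is only a stand-in for whatever client call is actually being made.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(call: Callable[[], object], runs: int = 20) -> List[float]:
    """Time `runs` sequential invocations of `call`; returns seconds per call."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Reduce raw samples to min, median, 95th percentile, and max."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile; crude, but adequate for small sample counts.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def fake_completion() -> str:
    # Hypothetical stand-in for the real request (assumption, not from the thread).
    time.sleep(0.01)
    return "ok"


if __name__ == "__main__":
    print(summarize(measure_latencies(fake_completion, runs=10)))
```

Reporting the median next to the 95th percentile and the maximum makes it easier to tell whether the endpoint is slow in general or only occasionally.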

YutongTie-MSFT 48,586 Reputation points

2023-05-03T18:43:20.1033333+00:00

Hello @Nayak, Rajesh, may I know which region you are in? Thanks.
Nithin Anil 15 Reputation points

2023-05-08T13:45:35.72+00:00

I am also facing the same issue. Performance is not consistent. Are there any ways to optimize it?
Nayak, Rajesh 65 Reputation points

2023-05-09T09:15:31.2933333+00:00

@YutongTie-MSFT I am using the GPT-3 model of Azure OpenAI in the West Europe region.
Wouter van Gils 50 Reputation points

2023-06-08T11:28:44.9333333+00:00

I have had the same issues for the last few days in West Europe. Sometimes the API responds within 2 seconds, but every two or three prompts it takes more like 60 to 120 seconds to generate output. @YutongTie-MSFT, what can I do to speed this up? Currently this is terrible for a demo to clients.
Benoit Letournel 30 Reputation points

2023-07-18T07:23:32.45+00:00

We are facing similar issues building our product with ChatGPT:

Response time is not consistent across regions. We noticed decent performance in East US but the worst performance in UK South and France Central.

Response time is not consistent within the same region. East US performance degrades and improves over the span of a week.

What's more puzzling is that performance with Azure OpenAI is much worse than when using OpenAI directly.
João Silva 0 Reputation points

2023-07-20T15:38:45.3466667+00:00

Facing the same problem in the West Europe region.
mi_oversea 0 Reputation points

2023-07-28T06:53:03.7833333+00:00

Facing the same problem in the Japan East region.
Mrityunjay Upadhyay 0 Reputation points

2023-08-10T07:37:06.8733333+00:00

Is there any resolution to this issue? I am using East US, and response times are very unpredictable and bad. After the first couple of requests, it times out.
YutongTie-MSFT 48,586 Reputation points

2023-08-12T04:10:16.8866667+00:00

Hello team,

I have already brought your feedback to the product team for further investigation. Thanks again for your feedback.

In the meantime, could you please provide more details about how long responses take when you say they are slow?

According to the guidance provided by the product team, some requests taking longer is within the normal operation of the service. You are free to abort the call and make another one if it exceeds whatever timeframe you require.
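
The "abort and make another call" approach can be sketched as a per-attempt deadline plus a bounded number of retries. This is an illustration under stated assumptions, not the product team's recommended implementation: `call_with_retries` is a hypothetical helper, and `send` is a hypothetical callable that issues the request with the given timeout and raises `TimeoutError` when the deadline passes (in practice, the HTTP client's own timeout setting would enforce that).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    send: Callable[[float], T],
    timeout_s: float = 10.0,
    attempts: int = 3,
    backoff_s: float = 0.5,
) -> T:
    """Call send(timeout_s); on TimeoutError, abandon the attempt and try again."""
    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return send(timeout_s)
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff between attempts: 0.5 s, 1 s, 2 s, ...
                time.sleep(backoff_s * (2 ** attempt))
    raise TimeoutError(f"no response after {attempts} attempts") from last_error
```

Every retry is a new request, so a small `attempts` value is the cautious choice; retrying aggressively can add load and make a rate limit easier to hit.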

Regards,

Yutong
mr. yu 0 Reputation points

2023-11-29T03:40:58.1833333+00:00

Hey, I have been facing the same issue lately. It has bothered me a lot.
In the Azure OpenAI Studio website, I edited the deployment and, in the advanced settings, maxed out Tokens per Minute Rate Limit (thousands) and turned off Enable Dynamic Quota.

It seems to work for me right now, though I don't know why.

2 answers

Fabien Snauwaert 0 Reputation points

2023-07-19T15:52:32.04+00:00

Switched to Azure OpenAI from OpenAI's own API, using region francecentral. The measured response time went up from an average of 2407 ms with OpenAI to 3032 ms with Azure, an extra latency of about 625 ms.

This is with model gpt-35-turbo version 0613 for both.

Has anyone tested this across regions and models?

I'm wondering:

Would some regions be faster?

Would 0301 be faster than 0613?
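
One way to test the questions above is to time the same prompt against several deployments side by side. The sketch below is only an illustration: `compare_endpoints` is a hypothetical helper, and each entry in `targets` is assumed to be a zero-argument callable that sends one request to a given region or model version. Calls are interleaved so that a slow period affects every target roughly equally.

```python
import statistics
import time
from typing import Callable, Dict, List


def compare_endpoints(
    targets: Dict[str, Callable[[], object]], rounds: int = 10
) -> Dict[str, float]:
    """Return the median latency in milliseconds for each named target."""
    samples: Dict[str, List[float]] = {name: [] for name in targets}
    for _ in range(rounds):
        # Visit every target once per round instead of finishing one first.
        for name, call in targets.items():
            start = time.perf_counter()
            call()
            samples[name].append((time.perf_counter() - start) * 1000.0)
    return {name: statistics.median(values) for name, values in samples.items()}


if __name__ == "__main__":
    # Hypothetical stand-ins; replace with real calls to each deployment.
    fakes = {
        "francecentral-0613": lambda: time.sleep(0.03),
        "eastus-0613": lambda: time.sleep(0.02),
    }
    print(compare_endpoints(fakes, rounds=5))
```

Medians are used rather than means so that a single very slow response does not dominate the comparison.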
Deaglan Mullan 0 Reputation points

2024-01-21T00:26:47.74+00:00

Still seems to be an issue. When performing well, it is much faster than OpenAI, but it is very inconsistent. Sometimes it takes 1 s; sometimes it takes up to 30 s (West Europe). EDIT: My bad. I realize it is reaching the quota limit of 1k tokens per minute, which is causing it to hang.
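
A client-side budget check can make this kind of quota stall visible before the request is sent. The following is a rough sketch under stated assumptions: `TokenBudget` is a hypothetical class, the one-minute sliding window is a simplification that may not match how the service accounts for tokens, and the token count per request is an estimate supplied by the caller.

```python
import collections
import time
from typing import Callable, Deque, Tuple


class TokenBudget:
    """Tracks tokens spent in the last 60 seconds against a per-minute limit."""

    def __init__(
        self, tokens_per_minute: int, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.limit = tokens_per_minute
        self.clock = clock
        self.spent: Deque[Tuple[float, int]] = collections.deque()

    def _expire(self, now: float) -> None:
        # Drop records that are at least one minute old.
        while self.spent and now - self.spent[0][0] >= 60.0:
            self.spent.popleft()

    def wait_needed(self, tokens: int) -> float:
        """Seconds to wait until `tokens` fit in the window (0.0 if they fit now)."""
        if tokens > self.limit:
            raise ValueError("request is larger than the whole per-minute budget")
        now = self.clock()
        self._expire(now)
        used = sum(count for _, count in self.spent)
        wait = 0.0
        # Walk records oldest first, waiting for each to expire until there is room.
        for stamp, count in self.spent:
            if used + tokens <= self.limit:
                break
            used -= count
            wait = stamp + 60.0 - now
        return max(wait, 0.0)

    def record(self, tokens: int) -> None:
        """Note that `tokens` were just spent."""
        self.spent.append((self.clock(), tokens))
```

A caller would sleep for `wait_needed(estimate)` seconds before each request and call `record(estimate)` after sending it, which turns an unexplained hang into an explicit, measurable wait.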