why do I hit openai rate limit

Sergey Markosyan 10

The documented rate limit for the ChatGPT model is 300 requests per minute. However, our requests hit the rate limit at much lower rates. The "Metrics" report of the Azure OpenAI service shows at most 200 requests per 5-minute interval, e.g. 188 total requests, of which 76 are blocked (we get a rate limit error). How is that possible?

YutongTie-MSFT 53,966 Reputation points Moderator

2023-03-26T20:21:20.4066667+00:00

Hello @Sergey Markosyan, are you working with the ChatGPT-3.5-turbo model? Could you please share a screenshot of your metrics report so that we can check with the product team? Thanks a lot.
Sergey Markosyan 10 Reputation points

2023-03-29T10:37:56.5533333+00:00

@YutongTie-MSFT Yes, this is the ChatGPT-3.5-turbo model. The screenshot shows the rate limit chart. The rate limit in the chart is lower than the 300 RPM mentioned in the documentation. The rate limit also changes dynamically. What does that mean? I expected to see a flat 300 here.
Sergey Markosyan 10 Reputation points

2023-03-29T10:38:20.6166667+00:00
David Dudáš 25 Reputation points

2023-04-18T12:07:12.6+00:00

I encounter the same issue. I am trying to use an Azure OpenAI embedding model and call it in parallel. The limit should be 300 requests/minute. I tested it and I cannot get more than 50 requests through at the same time.

Based on error msg I got: openai.error.RateLimitError: Requests to the Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. Operation under Azure OpenAI API version 2023-03-15-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 10 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

I assume the 300 requests/minute limit is enforced over a floating 10-second window, so the real limit is 50 requests per 10 seconds (300 × 10 / 60 = 50).
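If that assumption holds, one workaround is to throttle on the client so that parallel calls never exceed the per-window budget. The following is only a minimal sketch: the 10-second window is a guess based on the error message, not a confirmed detail of the service, and `SlidingWindowLimiter` and `per_window_budget` are hypothetical helper names, not part of any SDK.

```python
import time
from collections import deque


def per_window_budget(requests_per_minute, window_s):
    """Scale a per-minute limit down to a shorter window."""
    return requests_per_minute * window_s // 60


class SlidingWindowLimiter:
    """Client-side throttle: at most max_calls in any window_s-second span."""

    def __init__(self, max_calls, window_s, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until another call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have already left the window.
            while self._sent and now - self._sent[0] >= self.window_s:
                self._sent.popleft()
            if len(self._sent) < self.max_calls:
                self._sent.append(now)
                return
            # Wait until the oldest recorded call falls out of the window.
            self._sleep(self.window_s - (now - self._sent[0]))


# Assumed numbers from this thread: 300 requests/minute, 10-second window.
limiter = SlidingWindowLimiter(per_window_budget(300, 10), window_s=10)
```

Calling `limiter.acquire()` before each request keeps the client at or below 50 requests in any 10-second span. The class is not thread-safe as written; with a thread pool, the `acquire` call would need a lock around it.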
Shawn Mittal 35 Reputation points Microsoft Employee

2023-05-03T15:42:39.3433333+00:00

I'm hitting this same error now. I wasn't just a few days ago. Not sure what the issue is.
Guillaume Berthier 36 Reputation points

2023-05-23T13:41:55.57+00:00

Same for me. For the past couple of days I have been getting this "[warning] rate limit" on the gpt-4 Azure OpenAI API (East US Azure region) every single day, even though I'm way below the API quota.
Cristian Leandro Campagna 15 Reputation points

2023-10-05T16:26:14.8766667+00:00

Hello everyone, I had this same error as well. I solved it by editing the deployment and setting the TPM (tokens-per-minute) rate limit to its maximum value. Hope it helps you!
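A rough sketch of why this can help, assuming the requests-per-minute limit is derived from the deployment's tokens-per-minute (TPM) setting at 6 RPM per 1,000 TPM. That ratio is my understanding of Azure's quota documentation and should be treated as an assumption; the function names below are made up for illustration.

```python
RPM_PER_1000_TPM = 6  # assumed ratio; verify against the current Azure docs


def implied_rpm(tpm):
    """Requests-per-minute limit implied by a deployment's TPM setting."""
    return tpm * RPM_PER_1000_TPM // 1000


def tpm_needed(target_rpm):
    """Smallest TPM setting (in steps of 1000) that yields target_rpm."""
    thousands = -(-target_rpm // RPM_PER_1000_TPM)  # ceiling division
    return thousands * 1000


print(implied_rpm(10_000))  # a deployment left at 10K TPM
print(tpm_needed(300))      # TPM required for 300 requests/minute
```

Under that assumption, a deployment left at a low TPM value gets far fewer than 300 requests per minute, which would match the errors reported in this thread.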

Cheers

Cristian
Umar Aalam 0 Reputation points

2023-10-12T07:04:30.02+00:00

Issue:

Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 3 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

What I have done:

I requested a quota increase and it was approved for the Canada region.

However, even with the increased quota for the gpt-4 model in that region, I am still encountering the rate limit error.

Please suggest a workaround.
Lee, Steven Hong Sing 1 Reputation point

2024-05-06T03:53:06.7933333+00:00

I also faced the same issue. I'm using the text-embedding-3-small embedding model and running fewer than 100 parallel requests.

Operation under Azure OpenAI API version 2023-03-15-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 29 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
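Since the error text itself says how long to wait ("Please retry after 29 seconds"), one possible stopgap is to read that hint and retry. This is a sketch only: `retry_delay` and `call_with_retry` are hypothetical helpers, and the broad `except Exception` is a placeholder that real code would narrow to the SDK's rate-limit exception (for example the `openai.error.RateLimitError` quoted earlier in this thread).

```python
import re
import time

_RETRY_HINT = re.compile(r"retry after (\d+) seconds?", re.IGNORECASE)


def retry_delay(message, default=1.0):
    """Seconds to wait, read from the 'Please retry after N seconds' hint."""
    match = _RETRY_HINT.search(message)
    return float(match.group(1)) if match else default


def call_with_retry(fn, max_attempts=5, sleep=time.sleep):
    """Call fn(); on a rate-limit style error, wait as hinted and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # placeholder: narrow to the SDK's rate-limit error
            # Re-raise on the last attempt or for errors unrelated to rate limits.
            if attempt == max_attempts or "rate limit" not in str(exc).lower():
                raise
            sleep(retry_delay(str(exc)))
```

Wrapping each request as `call_with_retry(lambda: make_request(...))`, where `make_request` stands in for your own API call, makes parallel workers back off for the suggested interval instead of failing outright. It does not raise the underlying limit.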
