OpenAI RateLimitError persists even after increasing request limit for gpt-35-turbo-16k model on Azure

Anonymous
2023-10-17T11:20:57.07+00:00

I'm currently working with the gpt-35-turbo-16k model from OpenAI, deployed on Azure. Initially, I encountered a RateLimitError due to hitting the rate limit of 6 requests per minute. I then changed the limit to 60 requests per minute and retried, expecting the issue to be resolved.

However, I'm still facing the same error:

"openai.error.RateLimitError: Requests to the Creates a completion for the chat message Operation under Azure OpenAI API version 2023-05-15 have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 36 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit."

I have confirmed that I updated the rate limit correctly, and I waited for a reasonable amount of time for the change to take effect.

Is there something I might be missing, or does it take more time for rate limit changes to propagate? Has anyone else encountered a similar issue, and if so, how did you resolve it?

Thanks in advance for any insights or suggestions!

Region = Switzerland North

api_type = "azure"
api_version = "2023-05-15"
engine = "lund-gpt-35-turbo-16k"
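
For now I am working around it with a simple retry-with-backoff wrapper. This is a minimal sketch using the pre-1.0 openai Python SDK (which the openai.error traceback above implies); the resource URL and the key environment variable are placeholders:

import os
import time

import openai

openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"  # placeholder
openai.api_version = "2023-05-15"
openai.api_key = os.environ["AZURE_OPENAI_KEY"]  # placeholder env var

def chat_with_backoff(messages, max_retries=5):
    # Retry with exponentially growing waits whenever the service
    # throttles us (the error above suggests up to ~36 s).
    delay = 2.0
    for attempt in range(max_retries):
        try:
            return openai.ChatCompletion.create(
                engine="lund-gpt-35-turbo-16k",  # deployment name from above
                messages=messages,
            )
        except openai.error.RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay *= 2

response = chat_with_backoff([{"role": "user", "content": "Hello"}])
print(response["choices"][0]["message"]["content"])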

2 answers

  1. Anonymous
    2023-10-17T12:51:29.0466667+00:00

    Delete and redeploy the model


  2. Pramod Valavala 20,636 Reputation points Microsoft Employee
    2023-10-17T14:30:07.32+00:00

    @Ensar Kaya While you have control over the rate limits at the deployment level, there are hard limits at the regional level as well, which you could have hit. Note that these are shared limits at the subscription level across all Azure OpenAI Service resources.

    Also, even if you did not make that many requests, it is possible that you hit the token rate limit first.
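
    To see whether tokens rather than requests are the binding constraint, you can estimate prompt tokens with tiktoken before each call. A rough sketch (assuming gpt-35-turbo-16k uses the cl100k_base encoding, and Azure's usual default of 6 RPM per 1,000 TPM, so a 60 RPM deployment has roughly 10,000 tokens per minute to spend; verify the exact ratio for your subscription):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    def estimate_prompt_tokens(messages):
        # Rough estimate; ignores the few extra tokens of chat framing.
        return sum(len(enc.encode(m["content"])) for m in messages)

    messages = [{"role": "user", "content": "some long prompt"}]
    print(estimate_prompt_tokens(messages))
    # On a ~10,000 TPM budget, a single request that uses most of the
    # 16K context window can exhaust more than a minute's worth of tokens.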

