How to fix RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-12-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier.

Question

Felix Boe Tangen 0 Microsoft Employee

I am getting this error message, but when I check the metrics in the portal, they show I am nowhere near using 150K TPM, which is the limit for this model (GPT-4o).

Saideep Anchuri 9,500 Reputation points Moderator

2025-03-04T03:56:13.49+00:00

Hi Felix Boe Tangen

Following up to see if the above answer was helpful.

Thank You.

Saideep Anchuri 9,500 Reputation points Moderator

2025-03-05T03:57:14.1933333+00:00

Hi Felix Boe Tangen

We haven’t heard from you since the last response and are just checking back to see if you have a resolution yet.

Thank You.

1 answer

Answer 1

Hi Felix Boe Tangen

The RateLimitError: Error code: 429 indicates that your requests have exceeded the token rate limit for your current pricing tier. This can happen even when the portal metrics show usage well below the limit, for two reasons. First, the token count used for rate limiting is an estimate based on the character count of the API request, so it may differ from the actual token count used for billing. Second, the rate limit expects requests to be evenly distributed over a one-minute period, so a short burst of requests can trigger a 429 even if the total for the minute stays under the limit. Make sure your requests are distributed evenly over time. Additionally, review the settings for max_tokens and best_of to minimize the token count for each request. You can also limit the response size by explicitly instructing the model in the system message to keep its answer under 100 or 200 words.
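To illustrate the even-distribution point, here is a minimal client-side pacing sketch in Python. Everything in it is an assumption made for illustration, not the service's actual accounting: the 4-characters-per-token ratio, the formula that adds max_tokens times best_of to the prompt estimate, and the names estimate_request_tokens, min_interval_seconds and Pacer are all hypothetical. It only shows how a rough per-request token estimate can be turned into a minimum gap between calls.

```python
import math
import time

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_request_tokens(prompt: str, max_tokens: int, best_of: int = 1) -> int:
    """Rough, pessimistic estimate of the tokens one request may count as.

    Assumption: prompt tokens are approximated from the character count,
    and the requested completion budget (max_tokens * best_of) is added.
    """
    prompt_tokens = math.ceil(len(prompt) / CHARS_PER_TOKEN)
    return prompt_tokens + max_tokens * best_of


def min_interval_seconds(tokens_per_request: int, tpm_limit: int) -> float:
    """Gap between requests that spreads usage evenly over one minute."""
    if tokens_per_request <= 0 or tpm_limit <= 0:
        raise ValueError("tokens_per_request and tpm_limit must be positive")
    return 60.0 * tokens_per_request / tpm_limit


class Pacer:
    """Blocks so that consecutive calls to wait() are at least `interval` apart."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep):
        self._interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()  # the first call is allowed immediately

    def wait(self) -> None:
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
        # Schedule the next slot relative to whichever is later:
        # the current time or the slot that was just used.
        self._next_allowed = max(now, self._next_allowed) + self._interval


if __name__ == "__main__":
    tokens = estimate_request_tokens("x" * 400, max_tokens=200)
    pacer = Pacer(min_interval_seconds(tokens, tpm_limit=150_000))
    for _ in range(3):
        pacer.wait()
        # Send one chat completion request here.
```

With a 150K TPM limit and an estimated 300 tokens per request, this spaces calls about 0.12 seconds apart. Lowering max_tokens shrinks the estimate and therefore the required gap.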

Kindly refer to the link below: rate-limits

Thank You.
