How to fix RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2023-12-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier.

Felix Boe Tangen 0 Reputation points Microsoft Employee
2025-03-03T13:54:02.1166667+00:00

I am getting this error message, but when I check the metrics in the portal, my usage is nowhere near 150K TPM, which is the limit for this model (GPT-4o).

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. Saideep Anchuri 9,425 Reputation points Microsoft External Staff Moderator
    2025-03-03T14:18:42.32+00:00

    Hi Felix Boe Tangen

    A RateLimitError with error code 429 means the service judged your requests to exceed the token rate limit for your current pricing tier, even if your metrics do not show usage close to that limit. There are two common reasons for the mismatch. First, the token count used for rate limiting is an estimate based on the character count of the API request, so it can differ from the actual token count used for billing. Second, the rate limit expects requests to be evenly distributed over a one-minute period, so a short burst can trigger a 429 even when the total for the minute is under the limit.

    To reduce these errors, make sure your requests are distributed evenly over time rather than sent in bursts. Additionally, review the max_tokens and best_of settings and keep them as low as your scenario allows, to minimize the token count for each request. You can also limit the response size by stating explicitly in the system message that answers should stay under, for example, 100 or 200 words.

    Please refer to the link below: rate-limits

    Thank You.
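
To make the advice above concrete, here is a minimal client-side sketch in Python of spreading requests evenly across the minute. It makes no Azure calls. The helper names (`estimate_request_tokens`, `min_interval_seconds`, `run_paced`) are hypothetical, and the estimate of roughly 4 characters per token plus `max_tokens * best_of` is only an assumption for illustration; the exact formula the service uses is not stated in this thread.

```python
import math
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def estimate_request_tokens(prompt: str, max_tokens: int, best_of: int = 1,
                            chars_per_token: float = 4.0) -> int:
    """Rough guess at how many tokens one request counts toward the rate limit.

    ASSUMPTION: ~4 characters per token for the prompt, plus the worst-case
    completion size (max_tokens * best_of). This is an illustration, not the
    service's actual formula.
    """
    prompt_tokens = math.ceil(len(prompt) / chars_per_token)
    return prompt_tokens + max_tokens * best_of


def min_interval_seconds(tokens_per_request: int, tpm_limit: int) -> float:
    """Spacing between request starts that keeps the estimated rate under tpm_limit."""
    return 60.0 * tokens_per_request / tpm_limit


def run_paced(calls: Iterable[Callable[[], T]], interval: float,
              sleep: Callable[[float], None] = time.sleep,
              clock: Callable[[], float] = time.monotonic) -> List[T]:
    """Run each call in order, starting them at least `interval` seconds apart."""
    results: List[T] = []
    next_start = clock()
    for call in calls:
        wait = next_start - clock()
        if wait > 0:
            sleep(wait)  # hold back so requests do not arrive as a burst
        next_start = clock() + interval
        results.append(call())
    return results
```

With the 150K TPM limit from the question, a 400-character prompt and `max_tokens=200` give an estimate of 300 tokens per request, so `min_interval_seconds(300, 150_000)` suggests starting requests about 0.12 seconds apart. Each item passed to `run_paced` would be a zero-argument function wrapping your own chat completion call. Lowering `max_tokens` shrinks the estimate and therefore the required spacing.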

