Azure OpenAI Error 429 - Request Below Rate Limit

Pedro Daniel Scheeffer Pinheiro 50 Reputation points
2024-06-10T19:21:29.57+00:00

I am receiving an Error 429 while using Azure OpenAI, despite the request being below the rate limit.

Region: Sweden Central

The error message reads: "Error code: 429 - {'error': {'code': '429', 'message': 'Rate limit is exceeded. Try again in 86400 seconds.'}}".

My input is two prompts with around 900 tokens, and the max token limit is set to 4000. PTU utilization is at 0%. The error started occurring recently. Can someone help me troubleshoot this issue?
Also, do I really have to wait a full day (86400 seconds) before trying again?

I tried increasing the delay between requests, but with no luck.

Azure OpenAI Service

3 answers

  1. Chris Hoder - MSFT 101 Reputation points Microsoft Employee
    2024-06-11T00:24:29.87+00:00

    Hi, could you open a support request? That will let us debug the behavior with your specific requests.

    Thanks!


  2. Jessie Chen 50 Reputation points
    2024-07-29T10:37:52.9833333+00:00

    I encountered the same problem and resolved it by increasing the Tokens per Minute rate limit.

    For your reference: https://learn.microsoft.com/en-us/answers/questions/1845382/azure-openai-chatbot-server-responded-with-status?orderby=helpful
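
One way to see why a higher Tokens per Minute limit can clear this error is to estimate what a request counts for against that limit. The sketch below assumes each request is charged its prompt tokens plus the full max_tokens value, whether or not the completion ends up that long. The thread does not confirm the exact accounting Azure uses, and `estimated_tokens_per_minute` is a hypothetical helper, not part of any SDK.

```python
def estimated_tokens_per_minute(prompt_tokens: int, max_tokens: int,
                                requests_per_minute: int) -> int:
    """Rough upper bound on tokens counted per minute.

    Assumption (not confirmed in this thread): every request is charged
    prompt_tokens + max_tokens against the deployment's token rate limit.
    """
    if min(prompt_tokens, max_tokens, requests_per_minute) < 0:
        raise ValueError("all arguments must be non-negative")
    return (prompt_tokens + max_tokens) * requests_per_minute


# Numbers taken from the question: about 900 prompt tokens, max_tokens=4000.
one_request = estimated_tokens_per_minute(900, 4000, 1)
two_requests = estimated_tokens_per_minute(900, 4000, 2)
print(one_request, two_requests)
```

Under that assumption, a deployment whose Tokens per Minute limit is below the estimate could return 429 even though the prompts themselves are small.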

  3. Rafal Kaluzny 0 Reputation points
    2024-10-15T09:21:56.21+00:00

    Hey, what helped me was aligning the two limits: the token rate limit on my Azure deployment and the max_tokens value in my Python code.

    AZURE DEPLOYMENT (Free Tier, S0, gpt-35-turbo-16k)
    (Screenshot of the deployment; image not preserved.)

    CODE BEFORE (giving the 429 error)

        response = client.chat.completions.create(
            model=azure_oai_deployment,
            temperature=0.7,
            max_tokens=1200,  # this value triggered the 429 below
            messages=messages_array
        )
    

    Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-06-01 have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}

    CODE AFTER (working fine)

        response = client.chat.completions.create(
            model=azure_oai_deployment,
            temperature=0.7,
            max_tokens=1000,  # lowered to fit the deployment's rate limit
            messages=messages_array
        )
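
The same adjustment can be made in code rather than by hand. This is a minimal sketch, assuming the prompt tokens plus max_tokens must fit within the deployment's token rate limit; `clamp_max_tokens` and the numbers used are hypothetical and are not taken from the deployment shown above.

```python
def clamp_max_tokens(requested_max_tokens: int, prompt_tokens: int,
                     tokens_per_minute_limit: int) -> int:
    """Return a max_tokens value that fits within the rate limit.

    Assumption: prompt_tokens + max_tokens must not exceed the limit.
    """
    remaining = tokens_per_minute_limit - prompt_tokens
    if remaining <= 0:
        raise ValueError("the prompt alone meets or exceeds the rate limit")
    return min(requested_max_tokens, remaining)


# Hypothetical values: a 2000-token limit and a 300-token prompt.
print(clamp_max_tokens(1200, 300, 2000))  # request already fits
print(clamp_max_tokens(2500, 300, 2000))  # request is reduced
```

The returned value could then be passed as max_tokens to client.chat.completions.create in place of a hard-coded number.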
    
