Exceeded token rate limit on Azure OpenAI model without using it?

Abhyuday Luthra 25 Reputation points
2024-10-26T11:23:30.2366667+00:00

Hi, I've just created my first Azure OpenAI model deployment using 4o-mini, but I get a rate-limit-exceeded error on every API call I make to it. This is my first model deployment and the first time I've made any calls, so I don't see how I could have hit the rate limit so quickly.

I'm using the default code to test:

import os

from openai import AzureOpenAI

# Endpoint and key are read from environment variables.
client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-09-01-preview",
)
response = client.chat.completions.create(
    model="o1-preview-new",  # replace with the model deployment name of your o1-preview, or o1-mini model
    messages=[
        {"role": "user", "content": "What steps should I think about when writing my first Python API?"},
    ],
    max_completion_tokens=5000,
)

and my error is this:

Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-08-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}
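
For readers hitting the same 429: the message carries its own retry hint, and 86400 seconds is a full day, which is far longer than a typical per-minute throttle. The following is a minimal sketch for pulling that number out of the message text. The helper name `retry_after_seconds` is hypothetical, and the regular expression assumes the wording shown in the error above; other API versions may phrase it differently.

```python
import re
from datetime import timedelta
from typing import Optional


def retry_after_seconds(message: str) -> Optional[int]:
    """Return the wait hinted at by 'retry after N seconds', or None if absent."""
    match = re.search(r"retry after (\d+) seconds", message, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


# Excerpt of the error message quoted above.
error_message = (
    "Requests to the ChatCompletions_Create Operation under Azure OpenAI API "
    "version 2024-08-01-preview have exceeded token rate limit of your current "
    "OpenAI S0 pricing tier. Please retry after 86400 seconds."
)

wait = retry_after_seconds(error_message)
if wait is not None:
    # Convert to a readable duration before deciding whether waiting is practical.
    print(f"Suggested wait: {timedelta(seconds=wait)}")
```

A hint this long suggests that sleeping and retrying in a loop is not practical here, so checking the deployment's quota is likely the better next step.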
Azure OpenAI Service

  1. Gowtham CP 6,020 Reputation points Volunteer Moderator
    2024-10-26T13:18:12.41+00:00

    Hello Abhyuday Luthra,

    Thanks for reaching out on Microsoft Q&A.

    This error is common on Azure's S0 pricing tier, which has strict token rate limits, so it is easy to hit them even on your first call. Your request sets max_completion_tokens to 5000; try lowering it to 500–1000. New deployments can also take some time to finish setting up, so waiting and testing again later might help. If the limit is still too tight, you can request a quota increase through the link in the error message (https://aka.ms/oai/quotaincrease). Azure's documentation on Azure OpenAI quotas and limits has more detail. If the error persists, I also recommend reporting it to the Azure support team.
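
To illustrate why lowering max_completion_tokens can matter, here is a rough sketch. It assumes, as I understand the Azure OpenAI quota documentation, that the requested completion cap counts toward the token rate estimate along with the prompt. The 1000 tokens-per-minute quota and the 4-characters-per-token approximation are made-up values for illustration, and the function names are hypothetical; check your deployment's actual quota in the portal.

```python
import math


def estimated_request_tokens(prompt: str, max_completion_tokens: int,
                             chars_per_token: float = 4.0) -> int:
    """Approximate prompt tokens from character count, then add the completion cap."""
    prompt_tokens = math.ceil(len(prompt) / chars_per_token)
    return prompt_tokens + max_completion_tokens


def fits_in_quota(prompt: str, max_completion_tokens: int,
                  tokens_per_minute: int) -> bool:
    """True if one request's estimated tokens fit within the per-minute quota."""
    return estimated_request_tokens(prompt, max_completion_tokens) <= tokens_per_minute


PROMPT = "What steps should I think about when writing my first Python API?"
HYPOTHETICAL_TPM = 1000  # assumed quota, not taken from the thread

for cap in (5000, 1000, 500):
    verdict = "fits" if fits_in_quota(PROMPT, cap, HYPOTHETICAL_TPM) else "exceeds quota"
    print(f"max_completion_tokens={cap}: {verdict}")
```

Under these assumed numbers, a single request with a cap of 5000 would exceed the quota on its own, while a cap of 500 would fit.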

    Hope this helps!

    If you found this answer helpful, please upvote and mark it as accepted to close the thread. Thanks!

