Azure OpenAI Rate-Limiting Error

Khawar Habib 0 Reputation points
2024-05-02T12:03:25.7233333+00:00

I have deployed an Azure OpenAI resource with gpt-35-turbo (0301) and set the tokens-per-minute (TPM) limit to 1K. The portal shows a corresponding limit of approximately 6 requests per minute.
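
For context, the 6 requests per minute figure appears to follow directly from the TPM setting: the Azure OpenAI quota documentation describes requests per minute (RPM) as allocated at 6 RPM per 1,000 TPM. The sketch below only reproduces that arithmetic. The constant and both helper names are illustrative assumptions, not part of any Azure SDK.

```python
# Assumption: RPM is allocated in proportion to TPM, at 6 RPM per 1,000 TPM,
# as described in the Azure OpenAI quota documentation.
RPM_PER_1K_TPM = 6


def estimated_rpm(tpm_limit: int) -> int:
    """Return the RPM quota implied by a TPM limit (hypothetical helper)."""
    return tpm_limit * RPM_PER_1K_TPM // 1000


def seconds_per_request(tpm_limit: int) -> float:
    """Average spacing between requests if they are spread evenly over a minute."""
    return 60 / estimated_rpm(tpm_limit)


print(estimated_rpm(1000))        # 6
print(seconds_per_request(1000))  # 10.0
```

If the limit is enforced over short windows rather than a full minute, which the same documentation suggests, then two requests sent a few seconds apart could trip the RPM limit even when the token budget is far from exhausted.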

My first request used only 223 tokens in total. The usage block from the response is below.

"usage": {
    "completion_tokens": 193,
    "prompt_tokens": 30,
    "total_tokens": 223
}
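
As a quick check of the token arithmetic, the sketch below parses the usage block (wrapped in braces so it is valid JSON) and compares it with the 1K TPM limit. It counts only the tokens reported in the response. The Azure OpenAI documentation indicates that the limiter may instead count an estimate based on the prompt and the request's `max_tokens` value, which is not given here and so is not modeled.

```python
import json

TPM_LIMIT = 1000  # the 1K tokens-per-minute limit set on the deployment

# The usage block from the first response, wrapped in braces to form valid JSON.
response_fragment = (
    '{"usage": {"completion_tokens": 193, "prompt_tokens": 30, "total_tokens": 223}}'
)
usage = json.loads(response_fragment)["usage"]

# Sanity check: prompt + completion tokens should equal the reported total.
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]

# Tokens left in the per-minute budget, going by reported usage alone.
remaining = TPM_LIMIT - usage["total_tokens"]
print(remaining)  # 777
```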

When I sent a second request from Postman to verify this, I received the following error. Could someone please explain how this exceeds the rate limit?

Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 6 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.
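
One way to cope with this error on the client side is to wait for the interval the message suggests and then retry. The sketch below is a minimal illustration under stated assumptions: the error is taken to arrive with HTTP status 429, `send` is a hypothetical callable returning `(status_code, body_text)`, and the wait time is parsed from the "retry after N seconds" wording shown above, with a 10-second fallback chosen arbitrarily.

```python
import re
import time


def retry_after_seconds(message: str, default: float = 10.0) -> float:
    """Extract the wait hint from text like 'Please retry after 6 seconds.'"""
    match = re.search(r"retry after (\d+) seconds?", message, re.IGNORECASE)
    return float(match.group(1)) if match else default


def call_with_retry(send, max_attempts: int = 3, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning (status_code, body_text).
    Assumption: rate-limit errors are reported with HTTP status 429.
    """
    for attempt in range(max_attempts):
        status, body = send()
        # Return on success, on any non-429 error, or on the final attempt.
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        sleep(retry_after_seconds(body))
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays. In practice `send` would wrap whatever HTTP client issues the chat completion request.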


Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.