Quota Limit / Usage

Đặng Hoàn Mỹ 0 Reputation points
2024-06-18T15:28:39.31+00:00


My account is a school account, and I'm experimenting with Azure OpenAI Service.
I'm about to use the GPT models, but when I checked, the Tokens Per Minute limit showed 100%.
I have created only one deployment and cannot do anything else, since it says no quota/tokens are available.

I understand the limit is per minute, so usage is counted anew each minute. Am I correct?

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. Adharsh Santhanam 2,320 Reputation points
    2024-06-19T05:05:12.8433333+00:00

    Hello Đặng Hoàn Mỹ, you may want to check whether you have permission to increase the quota on your subscription/service. To answer your question more specifically: TPM rate limits are based on the maximum number of tokens a request is estimated to process, and that estimate is made when the request is received. It differs from the token count used for billing, which is computed after all processing is completed. Azure OpenAI calculates a maximum processed-token count per request using:

    1. Prompt text and count
    2. The max_tokens setting
    3. The best_of setting
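As a rough illustration of how those three inputs could combine, here is a minimal sketch. The exact formula the service uses is not stated in this answer, so the arithmetic below (prompt tokens plus a worst-case completion of `max_tokens` for each of `best_of` candidates per prompt) is an assumption, and `estimate_request_tokens` is a hypothetical helper, not an Azure API.

```python
def estimate_request_tokens(prompt_token_counts, max_tokens, best_of=1):
    """Illustrative upper bound on the tokens one request might be charged
    against the TPM limit. ASSUMPTION: the real service formula may differ.

    prompt_token_counts: token count of each prompt in the request
    max_tokens: the max_tokens setting (completion cap per candidate)
    best_of: the best_of setting (candidates generated per prompt)
    """
    if max_tokens < 0 or best_of < 1:
        raise ValueError("max_tokens must be >= 0 and best_of must be >= 1")

    # Tokens already known at request time: the prompt text itself.
    prompt_total = sum(prompt_token_counts)

    # Worst case for generation: every candidate of every prompt
    # runs to the max_tokens cap.
    completion_upper_bound = len(prompt_token_counts) * max_tokens * best_of

    return prompt_total + completion_upper_bound
```

Under this assumed model, a single 50-token prompt sent with `max_tokens=1000` would count as 1,050 tokens toward the limit even if the reply turns out to be short, which is why lowering `max_tokens` can help when the limit is tight.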

    This estimated count is added to a running token count across all requests, and that running count resets every minute. Once the TPM rate limit is reached within a given minute, further requests receive a 429 response code until the count resets. You may find this article a useful reference: https://techcommunity.microsoft.com/t5/fasttrack-for-azure/optimizing-azure-openai-a-guide-to-limits-quotas-and-best/ba-p/4076268
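The running count can be pictured with a small local model. This is only a sketch: it assumes a fixed 60-second window that starts at the first request, which may not match how the service actually tracks time, and `MinuteTokenBudget` is a hypothetical class, not part of any Azure SDK.

```python
import time


class MinuteTokenBudget:
    """Toy model of a tokens-per-minute limit (ASSUMED fixed 60 s window)."""

    def __init__(self, tpm_limit, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.clock = clock            # injectable so the model can be tested
        self.window_start = clock()
        self.used = 0                 # running estimated-token count

    def try_consume(self, estimated_tokens):
        """Return True if the request fits in the current minute.

        False stands in for the 429 response the service would send.
        """
        now = self.clock()
        if now - self.window_start >= 60:
            # A new minute has begun: the running count resets.
            self.window_start = now
            self.used = 0

        if self.used + estimated_tokens > self.tpm_limit:
            return False

        self.used += estimated_tokens
        return True
```

With a limit of 1,000 TPM in this model, a request estimated at 600 tokens is accepted, a second 600-token request in the same minute is rejected, and the same request is accepted again once the minute has rolled over.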

    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.
