GPT-4o-mini Global Standard quota is only 30K tokens per minute in East US: how to increase?

Question

Muhammed Kashif 0

Why can't I increase the max tokens per minute in Azure OpenAI? My region is East US, but it doesn't let me go past 30K, even though the docs say the limit can be a lot higher than that. Why is this so? Can anyone please explain?

santoshkc 15,435 Reputation points Microsoft External Staff Moderator

2024-12-13T12:04:41.64+00:00

Hi @Muhammed Kashif,

Following up to see if the below answer was helpful. If this answers your query, do click Accept Answer and Yes for was this answer helpful. Thank you.
santoshkc 15,435 Reputation points Microsoft External Staff Moderator

2024-12-16T05:49:58.1366667+00:00

Hi @Muhammed Kashif,

We haven’t heard from you on the last response and were just checking back to see if the below answer was helpful. If this answers your query, do click Accept Answer and Yes for was this answer helpful.

Thank you.

Answer 1

Hi @Muhammed Kashif,

Thank you for reaching out to Microsoft Q&A forum.

The 30K token-per-minute limit you're encountering for GPT-4o-mini in East US is likely due to the rate limits imposed by your current Azure OpenAI pricing tier. This is a typical measure to prevent abuse and ensure fair usage. If you're on the free tier, upgrading to a higher-tier plan (like the Standard tier) can help increase your quota.

Azure OpenAI Service applies rate limits (Tokens-per-Minute, or TPM) based on your region and model. You can increase your limit by selecting the Edit option on your model deployment, then adjusting the Tokens per Minute Rate Limit.

For further adjustments, visit the Quota section under Shared Resources in Azure OpenAI Studio, where you can request a quota increase. Make sure to check the Azure OpenAI quota management documentation for more details on rate limits.

If requests aren’t distributed evenly over a minute, you might encounter a 429 error, even if your usage is within the average rate limit. Upgrading your subscription and adjusting your quota should resolve this issue and increase your token limits.
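To illustrate the point about uneven traffic, here is a rough Python sketch. It is not an official client or the service's actual algorithm. It assumes, purely for illustration, that the limit is checked over a 10-second window, and it uses a hypothetical RateLimitError class as a stand-in for whatever exception your own client raises on an HTTP 429. The names window_budget and call_with_backoff are made up for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def window_budget(tpm_limit: int, window_seconds: float) -> float:
    """Tokens available in one short window if the per-minute limit is spread evenly."""
    return tpm_limit * window_seconds / 60.0


def call_with_backoff(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send(); on RateLimitError wait with exponential backoff, then retry.

    The wait before retry number `attempt` is base_delay * 2**attempt seconds
    plus a small random jitter so that parallel clients do not retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return send()
        except RateLimitError:
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
    # Final attempt: let any remaining error propagate to the caller.
    return send()


if __name__ == "__main__":
    # With a 30K TPM limit and an assumed 10-second window, a burst larger than
    # this budget could be throttled even if the full minute stays under 30K.
    print(window_budget(30_000, 10))
```

With a 30K TPM limit, an evenly spread 10-second window works out to about 5,000 tokens, so a single burst above that could be rejected even though the total for the minute is under 30K. Wrapping your request function in something like call_with_backoff lets the client pause and retry instead of failing outright.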

Hope this helps. Do let us know if you have any further queries.

If this answers your query, do click Accept Answer and Yes for was this answer helpful.
