How to deal with OpenAI Azure quotas

Mantas Urnieza 0 Reputation points
2024-11-26T07:25:26.3933333+00:00

Hello,

I am migrating from direct OpenAI to Azure OpenAI service.

The main blocker is quotas on tokens per minute and requests per minute. I can't switch because when I start making requests I get quota errors.

I already requested an increase of quota but Azure is not able to raise it to the needed level.

What steps should I take to fully migrate to Azure?

So far only 40% of my OpenAI requests are being routed to Azure.

I spend ~$20k per month.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. santoshkc 10,635 Reputation points Microsoft Vendor
    2024-11-26T14:55:46.82+00:00

    Hi @Mantas Urnieza,

    Thank you for reaching out to Microsoft Q&A forum!

    Azure OpenAI's quota feature lets you manage rate limits for model deployments by assigning Tokens-per-Minute (TPM) per region and model. Each deployment has a TPM and Requests-per-Minute (RPM) limit. Once your quota is reached, you can reduce TPM on existing deployments or request an increase. TPM allocation is flexible, allowing you to create multiple deployments as long as the total TPM does not exceed your quota. You can adjust TPM from Azure AI Studio and request quota increases as needed through the portal.
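A minimal sketch of the allocation rule described above: the TPM assigned to all deployments of a model in a region must stay within that region's quota. The deployment names and numbers are hypothetical and only illustrate the arithmetic; this does not call any Azure API.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str  # hypothetical deployment name
    tpm: int   # tokens-per-minute assigned to this deployment


def remaining_quota(regional_quota_tpm: int, deployments: list[Deployment]) -> int:
    """Return the unassigned TPM; a negative value means the quota is exceeded."""
    return regional_quota_tpm - sum(d.tpm for d in deployments)


def can_add(regional_quota_tpm: int, deployments: list[Deployment], new_tpm: int) -> bool:
    """Check whether a new deployment with new_tpm fits in the remaining quota."""
    return new_tpm <= remaining_quota(regional_quota_tpm, deployments)


# Illustrative numbers only, not real quota values.
quota = 240_000
deployments = [Deployment("chat-prod", 150_000), Deployment("chat-batch", 60_000)]

print(remaining_quota(quota, deployments))   # 30000
print(can_add(quota, deployments, 50_000))   # False
```

If the check fails, the options are the ones listed above: lower the TPM on an existing deployment to free capacity, or request a quota increase.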

    For more info you can refer to: Manage Azure OpenAI Service quota.

    To view your quota allocations across deployments in a given region, select Shared Resources > Quota in Azure OpenAI Studio, then click the link there to request a quota increase.

    To migrate from OpenAI to Azure OpenAI, refer to this documentation.

    I hope this helps! Thank you.

  2. Mantas Urnieza 0 Reputation points
    2024-11-26T17:38:43.3266667+00:00

    Thank you, santoshkc, for the answer, but this does not help.

    I have already requested quota increases, and my deployment is at the maximum available. Azure is rejecting any further quota increase requests. I need some other solution here.
