Inquiry about Azure OpenAI API Quota Increase Limits

服部 隼也 0 Reputation points
2025-07-01T09:12:51.8933333+00:00

Dear Microsoft Support Team,

I am currently using the Azure OpenAI API and understand that quota increases are handled on a request basis.

I would appreciate it if you could clarify the following points:

  1. To what extent can the usage quota be increased through requests?
  2. Is there a maximum limit to how much the quota can be increased? If so, could you please provide details on the criteria and the specific maximum values, if available?

Thank you very much for your assistance.

Best regards,

Shunya Hattori

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. Prashanth Veeragoni 5,655 Reputation points Microsoft External Staff Moderator
    2025-07-01T14:53:42.4+00:00

    Hi 服部 隼也,

    Yes, Azure OpenAI usage quotas (such as tokens-per-minute, requests-per-minute, and deployment limits) are initially provisioned at default limits. These can be increased upon request, but the final limits are subject to Microsoft’s internal review, which considers your use case, regional availability, and responsible AI requirements.

    Answers to Your Questions:

    1. To what extent can the usage quota be increased through requests?

    ·   There is no publicly fixed upper bound for quota increases. However, Microsoft typically reviews the scale based on your:

    o   Intended use case

    o   Model size (e.g., GPT-4 Turbo vs. GPT-3.5)

    o   Token consumption requirements (tokens-per-minute)

    o   Region availability

    o   Responsible AI use and business justification

    ·   Token-per-minute (TPM) and Requests-per-minute (RPM) are the main metrics considered.
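
    Since TPM and RPM are the figures a request is sized in, it can help to estimate them from your expected workload before filing. The following is only a minimal sketch: the function name, its parameters, and the 25% headroom are illustrative assumptions, not part of any Azure API, and the service may count tokens differently from this simple average.

```python
import math


def estimate_required_tpm(requests_per_minute: float,
                          avg_prompt_tokens: float,
                          avg_completion_tokens: float,
                          headroom: float = 0.25) -> int:
    """Rough TPM estimate for a quota request (hypothetical helper).

    Multiplies the expected request rate by the average tokens per
    request, adds a safety margin, and rounds up to the next 1,000.
    """
    if min(requests_per_minute, avg_prompt_tokens,
           avg_completion_tokens, headroom) < 0:
        raise ValueError("all inputs must be non-negative")

    tokens_per_request = avg_prompt_tokens + avg_completion_tokens
    raw_tpm = requests_per_minute * tokens_per_request * (1 + headroom)

    # Round up to a multiple of 1,000 so the figure is easy to state in a request.
    return math.ceil(raw_tpm / 1000) * 1000


if __name__ == "__main__":
    # Assumed workload: 100 requests/min, 1,500 prompt + 500 completion tokens each.
    print(estimate_required_tpm(100, 1500, 500))
```

    The resulting number is a starting point for the "Required TPM" field in a request; the RPM figure is simply the peak request rate you expect.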

    2. Is there a maximum limit to how much the quota can be increased?

    ·    Yes, practical ceilings exist, but they are not explicitly documented.

    ·     Examples of high-end limits seen in practice:

    o   GPT-4 Turbo: Up to 300,000 TPM

    o   GPT-3.5-Turbo: Up to 600,000–1,000,000 TPM

    o   These are not guaranteed and depend on request review.

    ·    Microsoft’s internal team will assess:

    o   Your current usage trends

    o   Business justification and project scale

    o   Enterprise agreement or subscription type

    o   Compliance with the Responsible AI Standard

    How to Request a Quota Increase

    You can follow these steps:

    Option 1: Via Azure Portal

    1. Go to the Azure Portal → Your OpenAI Resource.
    2. Select "Limits" or "Usage + quotas" under the resource settings.
    3. Click “Request increase” or open a support ticket.
    4. Provide:
      • Subscription details
      • Region
      • Model (e.g., gpt-4, gpt-35-turbo)
      • Required TPM / RPM
      • Description of use case and justification
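
    Before opening the request, it may be useful to collect these details in one place and confirm nothing is missing. The sketch below is purely illustrative: `QuotaIncreaseRequest` and its field names are hypothetical and do not correspond to any Azure SDK type or portal form; it only mirrors the items listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class QuotaIncreaseRequest:
    """Hypothetical checklist of the details listed above (not an Azure API)."""
    subscription_id: str
    region: str
    model: str
    required_tpm: int
    required_rpm: int
    justification: str

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are blank or non-positive."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, str) and not value.strip():
                missing.append(f.name)
            elif isinstance(value, int) and value <= 0:
                missing.append(f.name)
        return missing

    def summary(self) -> str:
        """Format the details as plain text to paste into a request."""
        return "\n".join(f"{f.name}: {getattr(self, f.name)}" for f in fields(self))


if __name__ == "__main__":
    # All values here are placeholders for illustration.
    request = QuotaIncreaseRequest(
        subscription_id="<your-subscription-id>",
        region="japaneast",
        model="gpt-35-turbo",
        required_tpm=250000,
        required_rpm=100,
        justification="",
    )
    print(request.missing_fields())
    print(request.summary())
```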

    Option 2: Using Azure Support

    ·   Navigate to: https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade

    ·   Create a new support request:

    o   Issue Type: Service and subscription limits (quotas)

    o   Service: Azure OpenAI

    o   Region and Deployment ID

    o   Provide justification and urgency

    Please refer to the documentation below:

    ·    Azure OpenAI Quotas & Limits

    ·    Requesting quota increases

    I hope this helps. If it resolves your query, please up-vote and accept the answer so that it can help others in the community who have similar questions.

    Thank You!


  2. 服部 隼也 0 Reputation points
    2025-07-03T00:59:16.63+00:00

    Dear Prashanth Veeragoni,

    Thank you very much for your kind and thorough response regarding the Azure OpenAI API quota increase.

    The information you provided has clarified the following points for me:

    There is no explicit upper limit to quota increases.

    The feasibility and extent of an increase will be individually reviewed based on the use case, model size, region availability, and responsible AI considerations.

    You provided specific numerical examples (e.g., up to 300,000 TPM for GPT-4 Turbo and 600,000-1,000,000 TPM for GPT-3.5-Turbo), which allowed me to grasp an approximate ceiling for the quota.

    This information will greatly assist in our future planning.

    Your detailed explanation of the specific steps for requesting a quota increase was also very helpful. I will proceed with the request via the Azure Portal or Azure Support as needed.

    I sincerely appreciate your prompt and comprehensive response.

    Thank you again for your continued support.

    Best regards,

    Shunya Hattori
