
Request to Increase RPM / Concurrency Limit for gpt-image-2 on Azure OpenAI

Emmanuella Onosemuode 0 Reputation points
2026-05-10T05:42:19.32+00:00

Hello Azure Support Team,

We are currently using Azure OpenAI Service and would like to request an increase in the RPM and concurrency limits for the gpt-image-2 model.

Our business requires high-volume image generation and image editing workloads. The current rate limit is not sufficient for our production demand, and we expect usage to continue increasing.

Request details:

Model: gpt-image-2

Region: eastus2

Current RPM limit: 10 RPM

Requested RPM limit: 100 RPM

Use case: Commercial image generation and image editing through API integration

We have real production demand and are willing to scale our usage on Azure if the quota can support our workload. Please help review and increase the RPM and concurrency limits for gpt-image-2 as soon as possible.

Thank you

Azure OpenAI in Foundry Models

2 answers

  1. SRILAKSHMI C 18,225 Reputation points Microsoft External Staff Moderator
    2026-05-12T08:57:15.0266667+00:00

    Hello @Emmanuella Onosemuode

    Thank you for reaching out and for providing the detailed information regarding your production workload requirements for the gpt-image-2 model.

    We understand that you are currently encountering the default RPM/concurrency limitation for gpt-image-2 in the East US 2 region and would like to increase the limit from 10 RPM to approximately 100 RPM to support high-volume commercial image generation and image editing workloads.

    Based on your request:

    • Model: gpt-image-2

    • Region: East US 2

    • Current limit: 10 RPM

    • Requested limit: 100 RPM

    • Use case: Production-scale image generation and image editing via API integration

    Please note that RPM and concurrency limits for Azure OpenAI models are managed through Azure OpenAI quota allocation and regional backend capacity. Increases beyond the default allocation require review and approval by the Azure OpenAI engineering/quota management team.

    Recommended next steps:

    1. Review your current usage metrics
    We recommend collecting usage evidence from:
    • Azure Monitor metrics
    • application logs
    • HTTP 429/rate-limit responses
    • latency/concurrency trends

    Showing that your workload is consistently approaching or exceeding the current 10 RPM limit can help support the quota increase request.
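As a rough illustration, a small script can tally HTTP 429 responses out of your application logs to evidence rate-limit pressure. The log format below is an assumption; adapt the pattern to whatever your application actually emits.

```python
import re
from collections import Counter

# Assumed log format (adjust to your own logs), e.g.:
# "2026-05-09T12:00:01Z POST /openai/deployments/gpt-image-2/images/generations 429"
LOG_LINE = re.compile(r"^(?P<ts>\S+)\s+\S+\s+\S+\s+(?P<status>\d{3})$")

def count_statuses(lines):
    """Tally HTTP status codes found in request log lines."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m:
            counts[m.group("status")] += 1
    return counts

sample = [
    "2026-05-09T12:00:01Z POST /images/generations 200",
    "2026-05-09T12:00:02Z POST /images/generations 429",
    "2026-05-09T12:00:03Z POST /images/generations 429",
]
counts = count_statuses(sample)
print(f"429s: {counts['429']} of {sum(counts.values())} requests")
```

A sustained 429 ratio over time, alongside Azure Monitor metrics, makes a stronger case than a single spike.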

    2. Gather deployment details
    Please ensure you have:
    • Azure OpenAI resource name/resource ID
    • deployment name for gpt-image-2
    • target region (East US 2)
    • estimated production RPM/TPM requirements

    3. Submit a quota increase request
    You can submit the request from:

    Azure AI Foundry Portal → Management → Quota

    Portal link: Azure AI Foundry Portal

    Within the Quota blade:

    • Filter by:

    Subscription

    Model = gpt-image-2

    Region = eastus2

    • Select the quota row and choose: “Request quota”

    In the request form, include:

    • Current RPM limit: 10 RPM
    • Requested RPM limit: 100 RPM
    • Requested concurrency increase
    • Business justification: “High-volume commercial image generation and image editing workloads for production API integration”

    You may also use the direct quota request form: Azure OpenAI Quota Request Form

    Official quota documentation: Azure OpenAI Quotas and Limits Documentation

    Approval depends on:

    regional GPU/model capacity availability,

    subscription history,

    production usage patterns,

    and responsible AI/compliance review.

    If East US 2 is under temporary capacity pressure, the engineering team may recommend:

    phased quota increases,

    alternative regions,

    or additional deployments for workload distribution.
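The workload-distribution option can be sketched as a simple round-robin selector over multiple deployments. The deployment names and endpoints below are purely illustrative placeholders, not real resources.

```python
import itertools

# Hypothetical deployments -- names, regions, and endpoints are illustrative only.
DEPLOYMENTS = [
    {"name": "gpt-image-2-eastus2", "endpoint": "https://res-eastus2.openai.azure.com"},
    {"name": "gpt-image-2-swedencentral", "endpoint": "https://res-sweden.openai.azure.com"},
]

_cycle = itertools.cycle(DEPLOYMENTS)

def next_deployment():
    """Pick the next deployment in round-robin order to spread load."""
    return next(_cycle)

# Two successive picks alternate between the deployments.
first = next_deployment()
second = next_deployment()
```

Each deployment carries its own RPM allocation, so spreading requests this way raises effective throughput even before a quota increase is approved.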

    Interim optimization recommendations:

    While the quota request is under review, you may also consider:

    • batching smaller image requests when possible
    • implementing client-side throttling with exponential backoff and jitter
    • distributing workload across multiple deployments or regions if supported
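The backoff-and-jitter recommendation can be sketched as a retry wrapper; `send_request` here is a hypothetical callable standing in for your actual API call, returning a `(status_code, body)` pair.

```python
import random
import time

def call_with_backoff(send_request, max_retries=5, base_delay=1.0, cap=30.0):
    """Retry a callable on HTTP 429, sleeping with exponential backoff and full jitter.

    `send_request` is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(max_retries + 1):
        status, body = send_request()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Full jitter: sleep a random amount up to the exponential cap.
        delay = random.uniform(0, min(cap, base_delay * 2 ** attempt))
        time.sleep(delay)
    return status, body
```

Full jitter (a random delay up to the exponential bound) avoids synchronized retry bursts when many clients are throttled at once.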

    Please note that quota increase reviews typically take several business days depending on regional capacity and request volume. You can monitor the request status directly from the Quota blade in the portal after submission.

    Increase Token/RPM Limits in Azure OpenAI Service: https://learn.microsoft.com/azure/foundry/openai/quotas-limits#can-i-request-more-quota

    Common throttling solutions & best practices: https://learn.microsoft.com/azure/ai-services/openai/how-to/quota

    I hope this helps. Do let me know if you have any further queries.


    If this answers your query, please click 'Accept Answer' and 'Yes' if the answer was helpful.

    Thank you!



  2. Nathan Roberts (SN) 11,281 Reputation points Volunteer Moderator
    2026-05-10T09:53:18+00:00

    Hey there, Emmanuella Onosemuode

    From reading your question, it looks like you have already raised the RPM and concurrency limits to the maximum available through the portal. If this is the case, you will need to create a support request through the Azure portal here: https://portal.azure.com/#view/Microsoft_Azure_ProjectOxford/CognitiveServicesHub/~/SupportAndTroubleshooting

    In the support request, please enter 'Increase RPM and Concurrency limits' and select your service, subscription and resource. Then click OK and follow the form. Scroll down to the bottom, click 'Contact support', then click 'Create a support request'.


    You will be able to explain your situation to the agent who will be able to explore what options are available to them.

    Hope this helps,
    Nathan


