Hello @Emmanuella Onosemuode
Thank you for reaching out and for providing the detailed information regarding your production workload requirements for the gpt-image-2 model.
We understand that you are currently encountering the default RPM/concurrency limitation for gpt-image-2 in the East US 2 region and would like to increase the limit from 10 RPM to approximately 100 RPM to support high-volume commercial image generation and image editing workloads.
Based on your request:
• Model: gpt-image-2
• Region: East US 2
• Current limit: 10 RPM
• Requested limit: 100 RPM
• Use case: Production-scale image generation and image editing via API integration
Please note that RPM and concurrency limits for Azure OpenAI models are managed through Azure OpenAI quota allocation and regional backend capacity. Increases beyond the default allocation require review and approval by the Azure OpenAI engineering/quota management team.
Recommended next steps:
- Review your current usage metrics. We recommend collecting usage evidence from:
  • Azure Monitor metrics
  • application logs
  • HTTP 429/rate-limit responses
  • latency/concurrency trends
  Showing that your workload is consistently approaching or exceeding the current 10 RPM limit can help support the quota increase request.
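As a starting point for gathering that evidence, here is a minimal, illustrative Python sketch (the class name and thresholds are hypothetical, not part of any Azure SDK) that counts requests and HTTP 429 responses over a sliding one-minute window, so you can log how often you hit the 10 RPM ceiling:

```python
# Hypothetical client-side telemetry sketch: record each request's status
# code and report requests and 429s seen in the last minute. Names are
# illustrative; wire tracker.record(...) into your own API call path.
from collections import deque
import time

class RateLimitTracker:
    """Tracks request timestamps and HTTP 429s over a sliding window."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.requests = deque()   # timestamps of all requests
        self.throttled = deque()  # timestamps of 429 responses

    def record(self, status_code, now=None):
        now = time.monotonic() if now is None else now
        self.requests.append(now)
        if status_code == 429:
            self.throttled.append(now)
        self._trim(now)

    def _trim(self, now):
        # Drop entries older than the window so counts stay per-minute.
        for q in (self.requests, self.throttled):
            while q and now - q[0] > self.window:
                q.popleft()

    def snapshot(self):
        return {"rpm": len(self.requests),
                "throttled_per_min": len(self.throttled)}

# Example: simulate 12 requests over one minute, the last 2 throttled.
tracker = RateLimitTracker()
for i in range(12):
    tracker.record(429 if i >= 10 else 200, now=i * 5.0)
print(tracker.snapshot())  # {'rpm': 12, 'throttled_per_min': 2}
```

Periodically logging `snapshot()` alongside your application logs gives concrete numbers ("N requests/minute, M throttled") to cite in the quota request.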
- Gather deployment details. Please ensure you have:
  • Azure OpenAI resource name/resource ID
  • deployment name for gpt-image-2
  • target region (East US 2)
  • estimated production RPM/TPM requirements
- Submit a quota increase request. You can submit the request from:
Azure AI Foundry Portal → Management → Quota
Portal link: Azure AI Foundry Portal
Within the Quota blade:
• Filter by:
  - Subscription
  - Model = gpt-image-2
  - Region = eastus2
• Select the quota row and choose “Request quota”
In the request form, include:
• Current RPM limit: 10 RPM
• Requested RPM limit: 100 RPM
• Requested concurrency increase
• Business justification: “High-volume commercial image generation and image editing workloads for production API integration”
You may also use the direct quota request form: Azure OpenAI Quota Request Form
Official quota documentation: Azure OpenAI Quotas and Limits Documentation
Approval depends on:
• regional GPU/model capacity availability
• subscription history
• production usage patterns
• responsible AI/compliance review
If East US 2 is under temporary capacity pressure, the engineering team may recommend:
• phased quota increases
• alternative regions
• additional deployments for workload distribution
Interim optimization recommendations: While the quota request is under review, you may also consider:
• batching smaller image requests when possible
• implementing client-side throttling with exponential backoff and jitter
• distributing workload across multiple deployments or regions if supported
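The exponential-backoff-with-jitter pattern above can be sketched as follows. This is an illustrative stand-alone example, not Azure SDK code: `call_model` is a hypothetical placeholder for your actual image-generation API call, and `fake_call` below simulates an endpoint that throttles the first two requests.

```python
# Illustrative retry wrapper: on HTTP 429, wait base_delay * 2**attempt
# plus random "full jitter", capped at max_delay, before retrying.
import random
import time

def with_backoff(call_model, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry call_model() while it returns status 429, with jittered
    exponential backoff; give up after max_retries attempts."""
    for attempt in range(max_retries):
        status, body = call_model()
        if status != 429:
            return status, body
        delay = min(max_delay, base_delay * (2 ** attempt))
        time.sleep(delay + random.uniform(0, delay))  # full jitter
    return status, body

# Example: a fake endpoint that throttles the first two calls.
calls = {"n": 0}
def fake_call():
    calls["n"] += 1
    return (429, None) if calls["n"] <= 2 else (200, "image-bytes")

status, body = with_backoff(fake_call, base_delay=0.01)
print(status)  # 200, succeeded on the third attempt
```

Jitter spreads retries from concurrent clients apart in time, which avoids synchronized retry bursts that would otherwise re-trigger the rate limit the moment it clears.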
Please note that quota increase reviews typically take several business days depending on regional capacity and request volume. You can monitor the request status directly from the Quota blade in the portal after submission.
Increase Token/RPM Limits in Azure OpenAI Service: https://learn.microsoft.com/azure/foundry/openai/quotas-limits#can-i-request-more-quota
Common throttling solutions & best practices: https://learn.microsoft.com/azure/ai-services/openai/how-to/quota
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click Accept Answer and select Yes for “Was this answer helpful?”.
Thank you!