Unable to provision more PTU for gpt4.1

Jeffrey Lau 0 Reputation points
2025-05-19T16:07:47.5233333+00:00

I am getting this error in my project:
{ "error": { "code": "InvalidCapacity", "message": "There's no available capacity to scale out by 50 PTU for the current request." } }

However, when I look at the quota, the available PTU shows 1750. I'm not sure where the issue is.

Azure OpenAI Service

1 answer

  1. Jerald Felix 2,180 Reputation points
    2025-05-19T16:15:53.35+00:00

    Hello Jeffrey Lau,

    The error is not about your quota (what you're allowed to use). It's about the regional availability of actual compute capacity at the moment you make the request.

    Even though your quota shows 1750 PTU available, the region (e.g., East US or Switzerland Central) currently doesn’t have enough free physical capacity to allocate an additional 50 PTUs for GPT-4.1.

    Capacity issues are often temporary. Retry after a few minutes or during off-peak hours. If your workload allows, try deploying in another region (e.g., West US, France Central, or East Asia) that may have more free capacity.
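The retry advice above could be sketched roughly as follows. This is only an illustration, not an official client: `request_scale` is a hypothetical callable standing in for whatever call you use to scale the deployment, and it is assumed to return the parsed JSON body in the same shape as the error shown in the question. The delay values are arbitrary.

```python
import time


def is_capacity_error(payload: dict) -> bool:
    """Return True if the payload carries the InvalidCapacity error code."""
    error = payload.get("error") or {}
    return error.get("code") == "InvalidCapacity"


def retry_on_capacity_error(request_scale, attempts=5, base_delay=60.0,
                            sleep=time.sleep):
    """Call request_scale() again while it keeps returning InvalidCapacity.

    request_scale is a hypothetical callable (an assumption, not an Azure
    SDK function) that returns the parsed JSON response of a scale request.
    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    response = {}
    for attempt in range(attempts):
        response = request_scale()
        if not is_capacity_error(response):
            return response
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still out of capacity after all attempts: hand back the last error.
    return response
```

If every attempt fails, the last error payload is returned, which is a reasonable point to fall back to another region.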

    Please check the model summary table to see which other regions offer the model.

    Tips to Investigate Further

    Go to Azure Portal > Azure OpenAI > Your Resource > Usage + Quotas.

    Check both:

    1. Quota Limit (what you're allowed)
    2. Current Usage & Regional Capacity

    Also, ensure you're not:

    1. Requesting a non-existent SKU or region combination.
    2. Trying to scale multiple deployments at once without available capacity.

    You can check available regional capacity with the Capacity API, as described in the section linked below:

    https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits?tabs=REST#regional-quota-capacity-limits
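Once you have a response from the Capacity API, picking a region with enough room might look like the sketch below. The response shape used here (a `value` list whose items carry `location`, `properties.skuName`, and `properties.availableCapacity`) and the default SKU name `ProvisionedManaged` are assumptions for illustration; verify both against the linked documentation before relying on them.

```python
def regions_with_capacity(capacity_response: dict, needed_ptu: int,
                          sku_name: str = "ProvisionedManaged"):
    """List (location, available) pairs that can fit needed_ptu.

    capacity_response is assumed to look like:
      {"value": [{"location": "...",
                  "properties": {"skuName": "...",
                                 "availableCapacity": 0}}]}
    Results are sorted with the largest available capacity first.
    """
    matches = []
    for item in capacity_response.get("value", []):
        props = item.get("properties", {})
        if props.get("skuName") != sku_name:
            continue
        available = props.get("availableCapacity", 0)
        if available >= needed_ptu:
            matches.append((item.get("location"), available))
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Made-up sample data, only to show the call.
sample = {"value": [
    {"location": "eastus",
     "properties": {"skuName": "ProvisionedManaged", "availableCapacity": 0}},
    {"location": "westus",
     "properties": {"skuName": "ProvisionedManaged", "availableCapacity": 200}},
]}
candidates = regions_with_capacity(sample, needed_ptu=50)
```

An empty result would mean no region in the response can currently fit the request, which matches the situation behind the `InvalidCapacity` error.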

    Best Regards,

    Jerald Felix
