Hello Jeffrey Lau,
The error is not about your quota (i.e., what you're allowed to use). It is about the regional availability of actual compute capacity at the moment you make the request.
Even though your quota shows 1750 PTU available, the region (e.g., East US or Switzerland Central) currently doesn’t have enough free physical capacity to allocate an additional 50 PTUs for GPT-4.1.
Capacity issues are often temporary. Retry after a few minutes or during non-peak hours. If your workload allows, try deploying in another region (e.g., West US, France Central, or East Asia) that may have more free capacity.
Please check the model summary table to see which other regions you can select for this model.
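If you script your deployments, the retry-then-switch-region advice above could look roughly like the sketch below. Everything in it is illustrative: `deploy` is a hypothetical callable you would supply yourself (for example a wrapper around your CLI or SDK call), `CapacityUnavailable` is a made-up exception that your wrapper would raise when the service reports insufficient capacity, and the delay values are arbitrary starting points rather than recommended settings.

```python
import time
from typing import Callable, Iterable, Optional


class CapacityUnavailable(Exception):
    """Hypothetical error: raise this from `deploy` when a region has no free capacity."""


def deploy_with_fallback(
    deploy: Callable[[str], None],
    regions: Iterable[str],
    attempts_per_region: int = 3,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try each region in order, retrying with exponential backoff.

    Returns the region where `deploy` succeeded, or None if every
    attempt in every region failed for lack of capacity.
    """
    for region in regions:
        for attempt in range(attempts_per_region):
            try:
                deploy(region)
                return region
            except CapacityUnavailable:
                # Wait before retrying the same region; move on to the
                # next region immediately after the final attempt.
                if attempt < attempts_per_region - 1:
                    sleep(base_delay * 2 ** attempt)
    return None
```

You would call it with your own function and your preferred region order, e.g. `deploy_with_fallback(my_deploy, ["eastus", "westus", "francecentral"])`. The `sleep` parameter is there only so the wait can be replaced in tests.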
Tips to Investigate Further
Go to Azure Portal > Azure OpenAI > Your Resource > Usage + Quotas.
Check both:
- Quota Limit (what you're allowed)
- Current Usage & Regional Capacity
Also, ensure you're not:
- Requesting a non-existent SKU or region combination.
- Trying to scale multiple deployments at once without available capacity.
You can find the available regional capacity using the Capacity API, as described in the section below.
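As a rough illustration of how a capacity lookup could be scripted, here is a minimal sketch. Please treat the details as assumptions to verify against the current REST reference: the `modelCapacities` path, the `2024-10-01` API version, the query parameter names, and the response fields (`location`, `properties.skuName`, `properties.availableCapacity`) are written from my understanding of the management API and may differ in your API version. The sketch only builds the request URL and filters an already-parsed JSON response; obtaining a bearer token and sending the GET request are left out.

```python
from urllib.parse import urlencode


def capacity_url(subscription_id, model_name, model_version,
                 api_version="2024-10-01"):
    """Build the (assumed) Model Capacities list URL for one model."""
    query = urlencode({
        "api-version": api_version,   # assumption: check the current version
        "modelFormat": "OpenAI",
        "modelName": model_name,
        "modelVersion": model_version,
    })
    return (
        "https://management.azure.com/subscriptions/"
        f"{subscription_id}/providers/Microsoft.CognitiveServices/"
        f"modelCapacities?{query}"
    )


def regions_with_capacity(response, sku_name, needed):
    """Return (region, available) pairs that can fit `needed` units.

    `response` is the parsed JSON body; field names are assumptions.
    Results are sorted with the most available capacity first.
    """
    found = {}
    for item in response.get("value", []):
        props = item.get("properties", {})
        if props.get("skuName") != sku_name:
            continue
        available = props.get("availableCapacity", 0)
        if available >= needed:
            found[item.get("location")] = available
    return sorted(found.items(), key=lambda kv: kv[1], reverse=True)
```

For the case in this thread you would pass the provisioned SKU name used by your deployment (for example `"ProvisionedManaged"`, if that is your deployment type) and `needed=50`, then try the regions in the order returned.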
Best Regards,
Jerald Felix