Error code 429 - 'TooManyRequests'. Azure OpenAI - AI model deployed via AI Foundry.

Stephen 85 Reputation points
2025-02-12T15:48:51.32+00:00

In Azure AI Foundry, I have the gpt-4o model deployed. In the UI, it is grouped under the Azure AI service “ai-sig6-azure-ai-services_aoai”. In the Azure Portal, I have an Azure AI Service called ai-sig6-azure-ai-services. The gpt-4o deployment has a TPM (tokens per minute) limit of 30K and an RPM (requests per minute) limit of 180. When I send several requests in a row, 1 or 2 succeed and then I get the HTTP status code ‘TooManyRequests’. I should not be anywhere close to those limits. I think I must be hitting another limit, but I cannot find it in the Azure Portal or Azure AI Foundry.

The response headers when I get ‘TooManyRequests’ are:

Retry-After: 49

x-ratelimit-reset-tokens: 49

apim-request-id: 8ef18262-d6c3-4b3b-a2bf-7cf1ccdddfee

Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

X-Content-Type-Options: nosniff

policy-id: DeploymentRatelimit-Token

x-ms-region: East US 2

x-ratelimit-remaining-requests: 24

Date: Wed, 12 Feb 2025 14:14:46 GMT

Request failed with status code: TooManyRequests

What do I need to change so I don’t get this error?

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. Stephen 85 Reputation points
    2025-02-14T14:00:24.9766667+00:00

    I created a support ticket and spoke to a Microsoft employee. Apparently, the region I deployed to was experiencing heavy use, so the default TPM and RPM limits had been lowered. She asked me to deploy to another region and suggested some regions with more capacity. She also said to request a quota increase (the same advice as in the previous comment). I did that and the request just went through, so the problem is resolved.
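
For anyone who hits the same error before a quota increase or region change takes effect: the headers in the question show `policy-id: DeploymentRatelimit-Token` while `x-ratelimit-remaining-requests` is still 24, which suggests the token limit was triggered rather than the request limit. Below is a minimal Python sketch of a client-side retry that waits for the time given in `Retry-After`. It is an illustration, not an official client: `send` is a hypothetical callable standing in for whatever HTTP call you make, the helper names are made up, and treating `x-ratelimit-reset-tokens` as a number of seconds is an assumption based on it matching `Retry-After` in the headers above.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_delay_seconds(headers: Mapping[str, str], attempt: int, cap: float = 60.0) -> float:
    """Prefer the wait time the server sent; otherwise use capped exponential backoff."""
    lowered = {key.lower(): value for key, value in headers.items()}
    # Assumption: both headers carry a number of seconds, as in the question.
    for name in ("retry-after", "x-ratelimit-reset-tokens"):
        value = lowered.get(name)
        if value is None:
            continue
        try:
            return max(0.0, float(value))
        except ValueError:
            continue  # Retry-After can also be an HTTP date; not handled here.
    # No usable hint from the server: 1s, 2s, 4s, ... up to `cap`.
    return min(cap, 2.0 ** attempt)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        sleep(retry_delay_seconds(headers, attempt))
        response = send()
    return response
```

With the headers from the question, `retry_delay_seconds` returns 49 seconds, so a burst of requests would pause for that long instead of failing. This only smooths over throttling; it does not raise the limit itself, which still needs the quota increase or a region with more capacity.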
