I created a support ticket and spoke with a Microsoft employee. Apparently, the region I deployed to was experiencing heavy use, so the default TPM and RPM had been lowered. She asked me to deploy to another region and listed some regions with more capacity. She also said to request a quota increase (the same info as in the previous comment); I did that and the request went right through. So, problem resolved.
Error code 429 - 'TooManyRequests'. Azure OpenAI - AI model deployed via AI Foundry.
In Azure AI Foundry, I have the gpt-4o model deployed. In the UI it is grouped under the Azure AI service “ai-sig6-azure-ai-services_aoai”, and in the Azure Portal I have an Azure AI Services resource called ai-sig6-azure-ai-services. The gpt-4o deployment has a TPM limit of 30K and an RPM limit of 180. When I send several requests in a row, one or two succeed and then the rest fail with HTTP status ‘TooManyRequests’. I should not be anywhere near those limits, so I think there must be another limit I am hitting, but I cannot find it in the Azure Portal or Azure AI Foundry.
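For illustration, here is a minimal sketch of the kind of request burst that triggers the error, using the openai Python SDK (the endpoint, API key, and API version below are placeholders, not the exact values from my setup):

# Minimal sketch: a short burst of chat-completion calls against the gpt-4o
# deployment. Endpoint, API key, and api_version are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://ai-sig6-azure-ai-services.openai.azure.com/",  # placeholder
    api_key="<your-api-key>",                                              # placeholder
    api_version="2024-10-21",                                              # placeholder
    max_retries=0,  # disable the SDK's built-in retries so the raw 429 surfaces
)

for i in range(5):
    # In my case, after 1-2 successful calls the remaining ones raise a 429
    # ('TooManyRequests') even though the per-minute quota is not exhausted.
    response = client.chat.completions.create(
        model="gpt-4o",  # deployment name
        messages=[{"role": "user", "content": f"Test request {i}"}],
    )
    print(i, response.choices[0].message.content)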
Here are the response headers when I get the ‘TooManyRequests’ error:
Retry-After: 49
x-ratelimit-reset-tokens: 49
apim-request-id: 8ef18262-d6c3-4b3b-a2bf-7cf1ccdddfee
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Content-Type-Options: nosniff
policy-id: DeploymentRatelimit-Token
x-ms-region: East US 2
x-ratelimit-remaining-requests: 24
Date: Wed, 12 Feb 2025 14:14:46 GMT
Request failed with status code: TooManyRequests
What do I need to change so I don’t get this error?
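For reference, the Retry-After and x-ratelimit-reset-tokens headers above say how long to wait before retrying. Below is a minimal sketch of a retry wrapper that honors Retry-After, assuming the openai Python SDK with placeholder endpoint, key, and API version; it only works around the symptom, while the actual fix was the region change and quota increase described in the first comment. (The SDK can also retry 429s on its own via the client's max_retries setting.)

# Minimal sketch: retry on 429 by honoring the Retry-After header.
# Assumes the openai Python SDK (>= 1.x); endpoint, key, and api_version are placeholders.
import time
from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI(
    azure_endpoint="https://ai-sig6-azure-ai-services.openai.azure.com/",  # placeholder
    api_key="<your-api-key>",                                              # placeholder
    api_version="2024-10-21",                                              # placeholder
)

def chat_with_retry(messages, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(model="gpt-4o", messages=messages)
        except RateLimitError as e:
            # Wait the number of seconds the service asks for (49 in the headers
            # above), falling back to a short exponential backoff if absent.
            retry_after = e.response.headers.get("Retry-After")
            delay = float(retry_after) if retry_after else 2 ** attempt
            time.sleep(delay)
    raise RuntimeError("Still rate limited after retries")

print(chat_with_retry([{"role": "user", "content": "Hello"}]).choices[0].message.content)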