Azure OpenAI's quota feature lets you assign rate limits to your deployments, up to a global limit called your "quota". Quota is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minute (TPM). Your subscription is onboarded with a default quota for most models.
Refer to this document for default TPM values. You can allocate TPM across deployments until your quota is exhausted. If you exceed a model's TPM limit in a region, you can reassign quota among deployments or request a quota increase. Alternatively, if viable, consider creating a deployment in a new Azure region within the same geography as the existing one.
TPM rate limits are based on the maximum number of tokens estimated to be processed at the time the request is received. This differs from the token count used for billing, which is computed after all processing is completed. Azure OpenAI estimates a max processed-token count for each request using:
- Prompt text and count
- The max_tokens setting
- The best_of setting
This estimated count is added to a running token count across all requests, which resets every minute. Once the TPM rate limit is reached within the minute, subsequent requests receive a 429 response code until the counter resets.
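To make the estimate concrete, here is a minimal sketch of how you might approximate it yourself. The service's exact formula is not public, so this assumes the estimate is roughly the counted prompt tokens plus `max_tokens` multiplied by `best_of`, and it uses the `tiktoken` package for counting; the function name and encoding choice are illustrative.

```python
# Approximate the per-request token estimate counted against your TPM limit.
# NOTE: an illustration of the inputs listed above, not the service's exact formula.
import tiktoken

def estimate_rate_limit_tokens(prompt: str, max_tokens: int, best_of: int = 1,
                               encoding_name: str = "cl100k_base") -> int:
    """Rough estimate: counted prompt tokens + max_tokens for each candidate."""
    encoding = tiktoken.get_encoding(encoding_name)
    prompt_tokens = len(encoding.encode(prompt))
    return prompt_tokens + max_tokens * best_of

# Example: a short prompt with max_tokens=500 and best_of=2 reserves far more
# of your TPM budget than the prompt text alone would suggest.
print(estimate_rate_limit_tokens("Summarize the quarterly report.",
                                 max_tokens=500, best_of=2))
```

One practical takeaway: keeping `max_tokens` and `best_of` as low as your scenario allows reduces the estimated count per call, so more requests fit within the same TPM quota.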
To minimize rate-limit issues, use the following techniques:
- Implement retry logic in your application (see the sketch after this list).
- Avoid sharp changes in the workload. Increase the workload gradually.
- Test different load increase patterns.
- Increase the quota assigned to your deployment. Move quota from another deployment, if necessary.
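For the first item, here is a minimal retry sketch using the `openai` Python package (v1.x) against an Azure OpenAI deployment. The endpoint, key, API version, and deployment name are placeholders, and the backoff parameters are illustrative rather than recommended values.

```python
import random
import time

from openai import AzureOpenAI, RateLimitError

# Placeholder credentials and deployment name - substitute your own.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

def chat_with_retry(messages, deployment="<your-deployment>", max_retries=5):
    """Retry on 429 responses with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=deployment,
                                                  messages=messages)
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: 1s, 2s, 4s, ... plus up to 1s of noise.
            time.sleep(2 ** attempt + random.random())

response = chat_with_retry([{"role": "user", "content": "Hello!"}])
print(response.choices[0].message.content)
```

When the 429 response includes a Retry-After header, honoring it is generally preferable to a fixed backoff schedule.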
Remember to optimize these settings based on your specific needs.
Resources:
- Optimizing Azure OpenAI: A Guide to Limits, Quotas, and Best Practices
- Azure OpenAI Service quotas and limits
- Azure OpenAI Insights: Monitoring AI with Confidence