Rajeev Bhat, welcome to the Microsoft Q&A forum!
Could you share the details of the deployed model, the region it is deployed in, and where exactly you are seeing this error?
I understand that you have only tried a few requests so far and are already seeing this issue.
To give more context: as each request is received, Azure OpenAI computes an estimated max processed-token count that includes the following:
- Prompt text and count
- The max_tokens parameter setting
- The best_of parameter setting
As requests come into the deployment endpoint, this estimated max processed-token count is added to a running token count across all requests, which resets each minute. If the TPM rate limit is reached at any point during that minute, further requests will receive a 429 response code until the counter resets. For more details, see Understanding rate limits.
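To make the accounting above concrete, here is a minimal client-side sketch of how you could approximate that per-request estimate yourself. This is not the service's exact internal formula; the use of tiktoken and the cl100k_base encoding are assumptions for illustration, and the arithmetic simply combines the three inputs listed above (prompt tokens, max_tokens, best_of):

```python
import tiktoken  # assumption: tokenizer library used for a rough client-side count

def estimated_max_processed_tokens(prompt: str, max_tokens: int, best_of: int = 1) -> int:
    """Rough approximation of the per-request estimate: prompt tokens
    plus the maximum completion tokens across all best_of candidates."""
    encoding = tiktoken.get_encoding("cl100k_base")  # assumption: encoding matches your model
    prompt_tokens = len(encoding.encode(prompt))
    return prompt_tokens + max_tokens * best_of

# Example: even with a short prompt, max_tokens=800 and best_of=2 counts
# roughly 1,600+ tokens against the per-minute TPM budget.
print(estimated_max_processed_tokens("Summarize this ticket...", max_tokens=800, best_of=2))
```

This is why a handful of requests can trigger 429s: the estimate is charged against the per-minute budget up front, regardless of how many tokens the response actually uses.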
To minimize issues related to rate limits, it's a good idea to use the following techniques:
- Set max_tokens and best_of to the minimum values that serve the needs of your scenario. For example, don't set a large max_tokens value if you expect your responses to be small.
- Use quota management to increase TPM on deployments with high traffic, and to reduce TPM on deployments with limited needs.
- Implement retry logic in your application (see the sketch after this list).
- Avoid sharp changes in the workload. Increase the workload gradually.
- Test different load increase patterns.
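As a starting point for the retry advice above, here is a minimal sketch using the openai Python SDK (v1 style) against an Azure deployment. The endpoint, API version, and deployment name are placeholders you would replace with your own values, and the backoff parameters are illustrative, not prescriptive:

```python
import random
import time

from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI(
    api_key="<your-api-key>",                               # placeholder
    api_version="2024-02-01",                               # assumption: use the version you target
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
)

def complete_with_retry(messages, max_retries=5):
    """Retries on 429 responses, preferring the service's Retry-After
    hint and falling back to exponential backoff with jitter."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="<your-deployment-name>",  # the deployment name, not the model family
                messages=messages,
                max_tokens=256,  # keep this as small as your scenario allows
            )
        except RateLimitError as e:
            retry_after = e.response.headers.get("retry-after")
            delay = float(retry_after) if retry_after else (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)
    raise RuntimeError("Still rate limited after all retries")
```

Honoring the Retry-After header when the service sends it, rather than retrying immediately, keeps your estimated token count from piling up against the same one-minute window.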
Also, see Optimizing Azure OpenAI: A Guide to Limits, Quotas, and Best Practices for more information.
Hope this helps. Do let me know if you have any further queries.