I'm getting rate_limit_exceeded error for almost all my assistant calls

Question

Rodrigo Juarez 15

I'm trying to use assistants in the Azure AI Studio playground. I created a new hub and deployed a chat-gpt-4-mini model. I also created a new assistant with only a custom prompt. Almost every call returns a rate_limit_exceeded error, even though I'm the only one calling the endpoint and I wait more than 1 minute between calls (even the first call usually fails). Is there anything I should configure on my end?

dupammi 8,615 Reputation points Microsoft External Staff

2024-08-21T08:15:59.6533333+00:00

Hi @Rodrigo Juarez

Thank you for your question.

The "rate_limit_exceeded" error suggests that you've exceeded the allowed request threshold. Ensure your subscription's quota supports the usage level, and consider adjusting rate limits or scaling your deployment. Check for unintended calls that might be hitting the endpoint and review logs for patterns. If the issue persists, reach out to Azure support for further assistance, or try a simpler deployment to isolate the problem.
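
For transient rate limiting, a common client-side mitigation is to retry with exponential backoff. The following is only a minimal sketch of that idea: `RateLimitError` and `call_with_backoff` are hypothetical names, not part of any Azure SDK, and `call` stands for whatever function sends your assistant request. It would not help if the deployment's quota itself is too low, which is what later replies in this thread point to.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error your client raises on rate_limit_exceeded."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Run call(); on RateLimitError, wait and retry with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

In use, you would wrap the request in a function that raises `RateLimitError` when the service reports rate_limit_exceeded and pass that function as `call`.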
dupammi 8,615 Reputation points Microsoft External Staff

2024-08-23T01:17:12.85+00:00

@Rodrigo Juarez

We haven’t heard back from you on the last response and are just checking to see whether you had a chance to try the suggestions above.

Thank you.

4 answers

Answer 1

Rodrigo Juarez 15

I ended up setting a really high value (225) for Tokens per Minute Rate Limit (thousands), and then it started working. I didn't try to find the lowest limit that would work.
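
For anyone weighing a value for that setting, here is a rough sketch of the arithmetic. It assumes the limit is entered in thousands of tokens per minute (as the field label says) and that each call is counted as some estimated number of tokens. The function name and the per-call figure in the usage note are made-up placeholders; I don't know how many tokens the Assistants service actually counts per run.

```python
def calls_per_minute(tpm_limit_thousands, est_tokens_per_call):
    """Return how many calls of the estimated size fit into one minute of quota."""
    if est_tokens_per_call <= 0:
        raise ValueError("est_tokens_per_call must be positive")
    tokens_per_minute = tpm_limit_thousands * 1000  # the setting is in thousands
    return tokens_per_minute // est_tokens_per_call
```

With a hypothetical 15,000 tokens counted per call, `calls_per_minute(225, 15000)` gives 15 calls per minute, while `calls_per_minute(10, 15000)` gives 0, meaning a single call would already exceed the limit under that assumption.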

Answer 2

61056333 0

Same problem here. What is going on? I've used only 10 tokens so far, while the limit is 10K per minute.

61056333 0 Reputation points

2024-08-23T23:44:33.71+00:00

It just replies to the first "Hi" message and then stops with the crazy rate limit message.
61056333 0 Reputation points

2024-08-24T00:09:26.3366667+00:00

It only replies to the first "Hi" message and then returns this crazy rate limit message for the next input.

Answer 3

61056333 0

Screenshot 2024-08-23 at 4.46.29 PM

Answer 4

Rodrigo Juarez 15

I ended up setting a really high value (225) for Tokens per Minute Rate Limit (thousands), and then it started working. I didn't try to find the lowest limit that would work.

Kuroodo 5 Reputation points

2024-12-04T17:00:47.5633333+00:00

I'm having the same problem. I'm evaluating Azure and seem to be able to make only 1 request per minute, despite a 30K TPM limit, which supposedly corresponds to 300 RPM.

How did you set your limit to 225? Did you have to request an increase, or did you upgrade your account? If you upgraded your account, can you explain where I could do that? I've looked everywhere.
