I'm getting a rate_limit_exceeded error for almost all my assistant calls

Rodrigo Juarez 15 Reputation points
2024-08-20T23:31:40.19+00:00

I'm trying to use assistants in the Azure AI Studio playground. I created a new hub and deployed a chat-gpt-4-mini model. I also created a new assistant with only a custom prompt. Almost every call returns a rate_limit_exceeded error message, even though I'm the only one calling the endpoint and I wait more than 1 minute between calls (and even the first call usually fails). Is there anything I should configure on my end?


Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

4 answers

  1. Rodrigo Juarez 15 Reputation points
    2024-08-24T00:15:28.74+00:00

    I ended up setting a really high value (225) for Tokens per Minute Rate Limit (thousands), and then it started working. I didn't try to find the lowest limit that would work.

    1 person found this answer helpful.

  2. 61056333 0 Reputation points
    2024-08-23T23:43:09.25+00:00

    Same problem here. What is causing it? I've used only 10 tokens so far, while the limit is 10K per minute.


  3. 61056333 0 Reputation points
    2024-08-23T23:47:23.74+00:00

    Screenshot 2024-08-23 at 4.46.29 PM


  4. Rodrigo Juarez 15 Reputation points
    2024-08-24T00:26:16.41+00:00

    I ended up setting a really high value (225) for Tokens per Minute Rate Limit (thousands), and then it started working. I didn't try to find the lowest limit that would work.
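
For anyone hitting this from code rather than the playground, here is a minimal sketch of checking whether a failed run was actually rate limited before changing the quota. It assumes the run object exposes `status` and a `last_error` with `code` and `message`, as Assistants API runs do. `Run`, `LastError`, and `describe_run` are hypothetical stand-ins for illustration, not SDK types, so adapt the field access to whatever your client returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastError:
    """Stand-in for a run's last_error (assumed fields: code, message)."""
    code: str
    message: str


@dataclass
class Run:
    """Stand-in for an assistant run; a real client returns its own type."""
    status: str
    last_error: Optional[LastError] = None


def describe_run(run: Run) -> str:
    """Summarize a run, calling out rate limiting explicitly."""
    # Anything that did not fail (or failed without details) is just reported.
    if run.status != "failed" or run.last_error is None:
        return f"run status: {run.status}"
    # The error code from this thread; the hint mirrors the fix reported above.
    if run.last_error.code == "rate_limit_exceeded":
        return (
            "rate limited: " + run.last_error.message
            + " (consider raising the deployment's Tokens per Minute Rate Limit)"
        )
    return f"failed ({run.last_error.code}): {run.last_error.message}"
```

Calling `describe_run(run)` after a run finishes shows whether the failure is the rate limit discussed in this thread or some other error code.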

