I'm getting rate_limit_exceeded error for almost all my assistant calls

Rodrigo Juarez 5 Reputation points
2024-08-20T23:31:40.19+00:00

I'm trying to use assistants in the Azure AI Studio playground. I created a new hub and deployed a chat-gpt-4-mini model, then created a new assistant with only a custom prompt. Almost every call returns a rate_limit_exceeded error, even though I'm the only one calling the endpoint and I wait more than a minute between calls (even the first call usually fails). Is there anything I should configure on my end?

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

4 answers

  1. 61056333 0 Reputation points
    2024-08-23T23:43:09.25+00:00

    Same problem here. What causes it? I've used only 10 tokens so far, while the limit is 10K per minute.


  2. 61056333 0 Reputation points
    2024-08-23T23:47:23.74+00:00

    Screenshot 2024-08-23 at 4.46.29 PM


  3. Rodrigo Juarez 5 Reputation points
    2024-08-24T00:15:28.74+00:00

    I ended up setting a really high value (225) for Tokens per Minute Rate Limit (thousands), and then it started working. I didn't try to find the lowest limit that would work.


  4. Rodrigo Juarez 5 Reputation points
    2024-08-24T00:26:16.41+00:00

    I ended up setting a really high value (225) for Tokens per Minute Rate Limit (thousands), and then it started working. I didn't try to find the lowest limit that would work.

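For anyone hitting this from code rather than the playground, here is a minimal sketch of how the failure and the workaround reported in the answers might be checked. It assumes a failed run is available as a plain dict with `status` and `last_error.code` fields, which is an assumption about the response shape and not something confirmed in this thread. The helper names and the sample data are hypothetical.

```python
from typing import Optional


def tpm_from_setting(thousands: int) -> int:
    """Convert a 'Tokens per Minute Rate Limit (thousands)' value to tokens per minute."""
    return thousands * 1000


def rate_limit_hint(run: dict) -> Optional[str]:
    """Return a hint if the run failed with rate_limit_exceeded, otherwise None.

    `run` is assumed to be a dict with 'status' and 'last_error' keys
    (hypothetical shape, used here only for illustration).
    """
    if run.get("status") != "failed":
        return None
    error = run.get("last_error") or {}
    if error.get("code") != "rate_limit_exceeded":
        return None
    return (
        "Run failed with rate_limit_exceeded; consider raising the "
        "deployment's tokens-per-minute limit."
    )


# Made-up sample data, not output from a real call.
sample_run = {
    "status": "failed",
    "last_error": {"code": "rate_limit_exceeded", "message": "..."},
}

print(rate_limit_hint(sample_run))
# The value reported as working in this thread was 225 (thousands).
print(tpm_from_setting(225))
```

This only detects the error and converts the setting's units. It does not explain why a low limit fails on the first call, which remains unanswered in this thread.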
