Same problem here. What causes it? I've used only 10 tokens so far, while the limit is 10K per minute.
I'm getting a rate_limit_exceeded error for almost all my assistant calls
I'm trying to use assistants in the Azure AI Studio playground. I created a new hub and deployed a chat-gpt-4-mini model. I also created a new assistant with only a custom prompt. Almost every call returns a rate_limit_exceeded error message, even though I'm the only one calling the endpoint and I wait more than 1 minute between calls (even the first call usually fails). Is there anything I should configure on my end?
4 answers
Rodrigo Juarez 5 Reputation points
2024-08-24T00:15:28.74+00:00 I ended up setting a really high value (225, i.e. 225K tokens per minute) for the Tokens per Minute Rate Limit (thousands) setting on the deployment, and then it started working. I didn't test what the lowest value that still works is.
Rodrigo Juarez 5 Reputation points
2024-08-24T00:26:16.41+00:00 I ended up setting a really high value (225, i.e. 225K tokens per minute) for the Tokens per Minute Rate Limit (thousands) setting on the deployment, and then it started working. I didn't test what the lowest value that still works is.
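If raising the limit is not an option, or calls still fail occasionally after raising it, retrying with exponential backoff may help. The following is only a minimal sketch, not a confirmed fix for this issue: it assumes the run object exposes `status` and `last_error.code` as the Assistants API describes, and `start_run` is a hypothetical callable (not part of any SDK) that creates a run, waits for it to finish, and returns it. Since the question reports that even the first call fails, this is a complement to a higher Tokens per Minute setting rather than a replacement for it.

```python
import time
from types import SimpleNamespace


def run_with_backoff(start_run, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Retry start_run() while the returned run failed with rate_limit_exceeded.

    start_run is a hypothetical callable supplied by the caller. It must
    return an object with `status` and `last_error` attributes.
    """
    run = None
    for attempt in range(max_attempts):
        run = start_run()
        error = getattr(run, "last_error", None)
        rate_limited = (
            run.status == "failed"
            and error is not None
            and getattr(error, "code", None) == "rate_limit_exceeded"
        )
        if not rate_limited:
            return run  # success, or a failure that retrying will not fix
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return run  # still rate limited after max_attempts


# Usage with stand-in run objects (no network calls are made here).
fake_runs = iter([
    SimpleNamespace(status="failed",
                    last_error=SimpleNamespace(code="rate_limit_exceeded")),
    SimpleNamespace(status="completed", last_error=None),
])
result = run_with_backoff(lambda: next(fake_runs), sleep=lambda seconds: None)
print(result.status)
```

In real use, `start_run` would wrap whatever client call creates and polls the run. The `sleep` parameter is injectable only so the helper can be exercised without actually waiting.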