How to Resolve 'Rate Limit Exceeded' Error in Azure OpenAI When Using Vector Stores and Documents in the Assistants Module

Fabrício França 0 Reputation points
2024-11-25T17:00:51.65+00:00

I am using my OpenAI model on Azure OpenAI, and it is working perfectly in the chat and assistant modules. However, when I use vector stores or documents in the thread within the assistant module, I encounter the following error:

Error

ratelimitexceeded: Rate limit is exceeded. Try again in 52 seconds. RunId: run_oxJenN1MKs0VQSJAjpW4uWY1

Model details:

  • Rate limit (Tokens per minute): 20,000
  • Rate limit (Requests per minute): 120
  • Model name: gpt-4o
  • Model version: 2024-05-13
Azure OpenAI Service

1 answer

  1. VasaviLankipalle-MSFT 18,296 Reputation points
    2024-11-25T21:10:45.15+00:00

    Hello @Fabrício França ,

    Azure OpenAI’s quota feature lets you assign rate limits to your deployments, up to a global limit called your “quota.” Quota is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minute (TPM).

    To view your quota allocations across deployments in a given region, select Management > Quota in Azure AI Studio.

    TPM rate limits are based on the maximum number of tokens that are estimated to be processed by a request at the time the request is received.

    RPM rate limits are based on the number of requests received over time. The rate limit expects that requests be evenly distributed over a one-minute period. If this average flow isn't maintained, then requests may receive a 429 response even though the limit isn't met when measured over the course of a minute.

    Please see Manage Azure OpenAI Service quota for more details.

    In Azure OpenAI Studio, quota allocations are under Shared Resources > Quota, where you can also click the link to request a quota increase.


    You can also monitor usage metrics in the Azure portal by following this guide: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/monitor-openai

    If you still face the issue, please raise a support ticket in the Azure portal.

    Regards,

    Vasavi


    -Please accept the answer and vote 'yes' if you found it helpful, to support the community. Thanks.

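While waiting on a quota increase, one client-side workaround is to honor the wait time that the error message itself reports ("Try again in 52 seconds") and to space requests evenly, since the answer notes that RPM limits expect an even flow over the minute. The following is only a minimal sketch under stated assumptions: `parse_retry_seconds`, `min_request_interval`, and `run_with_retry` are illustrative names, `start_run` is a hypothetical callable standing in for whatever code starts the assistant run and reports failure, and the one-second safety margin is arbitrary. It makes no Azure SDK calls.

```python
import re
import time

# Matches the hint format seen in the error above: "Try again in 52 seconds."
RETRY_HINT = re.compile(r"Try again in (\d+) seconds?", re.IGNORECASE)


def parse_retry_seconds(message):
    """Return the wait time in seconds named in the error text, or None."""
    match = RETRY_HINT.search(message)
    return int(match.group(1)) if match else None


def min_request_interval(requests_per_minute):
    """Seconds between requests if they are spread evenly over one minute."""
    return 60.0 / requests_per_minute


def run_with_retry(start_run, max_attempts=4, margin=1, sleep=time.sleep):
    """Call start_run until it succeeds or max_attempts is reached.

    start_run is a hypothetical callable returning (succeeded, error_message).
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        succeeded, error_message = start_run()
        if succeeded:
            return attempt
        wait = parse_retry_seconds(error_message or "")
        if wait is None:
            # No retry hint in the message, so treat it as a different failure.
            raise RuntimeError(error_message)
        if attempt < max_attempts:
            # Wait for the suggested time plus a small (assumed) margin.
            sleep(wait + margin)
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")


message = "ratelimitexceeded: Rate limit is exceeded. Try again in 52 seconds."
print(parse_retry_seconds(message))   # 52
print(min_request_interval(120))      # 0.5
```

With the limits from the question, 120 requests per minute works out to one request every 0.5 seconds when spread evenly. Pacing and retrying only smooth out bursts; if the runs with vector stores or documents consistently need more than 20,000 tokens per minute, the quota increase described in the answer is still required.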
