Is there a daily or monthly quota cap for text-embedding-ada-002?

Anjan Shakya 100 Reputation points
2025-04-03T22:35:04.8466667+00:00

I am getting the error below.

Requests to the Embeddings_Create Operation under Azure OpenAI API version 2024-08-01-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 1 second.
Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit. For Free Account customers, upgrade to Pay as you Go here: https://aka.ms/429TrialUpgrade.

Is there a daily or monthly quota cap for text-embedding-ada-002 besides TPM and RPM?

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Marcin Policht 43,785 Reputation points MVP
    2025-04-04T16:24:31.9233333+00:00

    Embedding model limits are applied per deployment; you can see the limits for each deployment in the Azure OpenAI Service/Azure AI Foundry interface. The bottom line is that all of them are per-minute limits; none apply at a daily or monthly level.


    If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.

    hth

    Marcin


1 additional answer

  1. Manas Mohanty 2,930 Reputation points Microsoft External Staff
    2025-04-04T15:17:06.8566667+00:00

    Hi Anjan Shakya

    I found one more limit in the quotas and limits documentation besides TPM and RPM:

    Maximum number of inputs in the array with /embeddings: 2048
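
To stay under the 2048-input array limit mentioned above, a large list of texts can be split into batches before each /embeddings call. This is only a minimal sketch: `batch_inputs` and `MAX_INPUTS_PER_REQUEST` are hypothetical names made up for illustration, and the actual embeddings call is left out because it depends on your client library.

```python
from typing import Iterator, List

# Array-size limit for /embeddings quoted in this answer.
MAX_INPUTS_PER_REQUEST = 2048


def batch_inputs(
    texts: List[str], batch_size: int = MAX_INPUTS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]
```

Each yielded batch can then be sent as the input array of one request. Note that batching only addresses the array-size limit; every batch still counts against the deployment's TPM and RPM.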

    Note that the TPM assigned to a deployment also determines how many requests you can send per minute (RPM). A rate limit error is expected whenever traffic exceeds the per-second limit, even if the per-minute total looks fine. This applies to all Azure OpenAI models; it constrains regional demand and keeps the servers healthy.

    1. Increase the TPM on the embedding model from the deployment section to get higher rate limits.
    2. Adopt exponential backoff retry logic with a dynamic sleep time.
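
The retry suggestion in item 2 could look roughly like the sketch below. It is not tied to any particular SDK: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on HTTP 429, `retry_after` is an assumed attribute carrying the "Please retry after N second" hint from the error, and `call_with_backoff` is an illustrative name.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the 429 error raised by your client library."""

    def __init__(self, message: str = "rate limited",
                 retry_after: Optional[float] = None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds suggested by the service, if known


def call_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # The delay doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Never wait less than the service asked for.
            if err.retry_after is not None:
                delay = max(delay, err.retry_after)
            # Up to 10% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

You would wrap each embeddings request in a zero-argument callable and pass it as `operation`. The `sleep` parameter exists only so the wait can be replaced in tests.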

    I hope this addresses your query.

    Please don’t forget to click Accept Answer and Yes for "Was this answer helpful" wherever the information provided helps you; this can benefit other community members.

    Thank You.

