Why do I get a 429 saying I should retry in 24h in the OpenAI S0 pricing tier?

Sorin Costea 5 Reputation points
2024-12-06T14:43:45.7133333+00:00

All documents talk about quotas per MINUTE, yet the error I get says "Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-10-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds."

That is, the NEXT DAY. However, NO DOCUMENT mentions any DAILY limit, and all quotas are set per model and per MINUTE, both in the documentation and in the quotas tab in AI Studio.

So I don't know what to do about that error.

Azure OpenAI Service

1 answer

  1. Max Lacy 345 Reputation points
    2024-12-06T15:36:44.0466667+00:00

    I understand you are hitting a rate limit when calling the ChatCompletions_Create operation under Azure OpenAI API version 2024-10-01-preview.

    When a deployment is created, the assigned TPM will directly map to the tokens-per-minute rate limit enforced on its inferencing requests. A Requests-Per-Minute (RPM) rate limit will also be enforced whose value is set proportionally to the TPM assignment using the following ratio:

    6 RPM per 1000 TPM.

    Depending on how your deployment is configured, your TPM may be set too low. To address the problem, try increasing the tokens-per-minute setting in the Azure AI portal under Deployments | <select deployment> | Edit. Raising the TPM also raises the allowed RPM, so you should hit fewer rate limits.
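To make the arithmetic concrete, here is a minimal Python sketch of the 6 RPM per 1000 TPM ratio, plus a helper that pulls the retry delay out of an error message shaped like the one quoted in the question. The function names `rpm_for_tpm` and `retry_after_seconds` are hypothetical and not part of any Azure SDK, and the regular expression only assumes the "retry after N seconds" wording shown in the question.

```python
import re
from typing import Optional

# Ratio quoted in the answer: 6 requests per minute for every 1000 tokens per minute.
RPM_PER_1000_TPM = 6


def rpm_for_tpm(tpm: int) -> int:
    """Return the RPM limit implied by a TPM assignment (hypothetical helper)."""
    return tpm * RPM_PER_1000_TPM // 1000


def retry_after_seconds(message: str) -> Optional[int]:
    """Extract N from a 'retry after N seconds' phrase, or None if it is absent.

    Assumes the wording of the error text quoted in the question.
    """
    match = re.search(r"retry after (\d+) seconds", message, re.IGNORECASE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # A deployment assigned 10,000 TPM would get a 60 RPM limit at this ratio.
    print(rpm_for_tpm(10_000))

    error_text = "Please retry after 86400 seconds."
    delay = retry_after_seconds(error_text)
    if delay is not None:
        # 86400 seconds is 24 hours, which is why the error reads as "next day".
        print(delay / 3600, "hours")
```

If the delay reported by the service is much longer than a minute, as it is here, a client should surface the error rather than retry in a loop.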
