GPT-4o/4o mini request via openai-azure-proxy with short text get 429 error but GPT-3.5 not

Jiajun Tan 0 Reputation points
2024-08-26T12:56:06.2133333+00:00

I'm using https://github.com/haibbo/cf-openai-azure-proxy/tree/main and https://github.com/stulzq/azure-openai-proxy to transfer my azure OpenAI endpoint to the OpenAI official API form to use some third-party services.

When choosing GPT-4o/4o-mini models, when I send only a small piece of text, it will get a 429 error, meaning that the request has reached TPM limit. However, the same text will not trigger that error when switching to GPT-3.5-Turbo. The response generated by GPT-3.5 is absolutely far from the TPM limit (1k tokens per minute).

Also, if I use Python SDK to send the same text for chat completion, all models get normal responses.

API Version: 2024-07-01-preview

region: eastus

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
4,081 questions
{count} votes

1 answer

Sort by: Most helpful
  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2024-08-27T05:58:02.7566667+00:00

    Jiajun Tan Greetings and Welcome to Microsoft Q&A forum!

    When choosing GPT-4o/4o-mini models, when I send only a small piece of text, it will get a 429 error, meaning that the request has reached TPM limit. However, the same text will not trigger that error when switching to GPT-3.5-Turbo. The response generated by GPT-3.5 is absolutely far from the TPM limit (1k tokens per minute).

    Could you please double check and confirm if the rate limit of these models are not reached the actual limit in Azure OpenAI studio?

    User's image

    If the usage limit is over you need to increase the quota.

    Please note that different model deployments have unique max TPM values. This represents the maximum amount of TPM that can be allocated to that type of model deployment in a given region. See Manage Azure OpenAI Service quota for more details.

    Also, could you check in any other region to isolate the issue?

    As you mentioned, you do not have any issues while using SDK, this could also be due to the third party services which you are using.

    Do let me know if you have any further queries.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.