OpenAI GPT-4 TPM cannot be set to higher than 8K

Morten Johnsen 5 Reputation points
2024-07-30T14:02:43.0766667+00:00

A customer of ours needs an OpenAI GPT-4 large language model (LLM) for an application they purchased from us. They started with a Pay-As-You-Go (0003P) subscription which, as stated in quotas-limits, has a maximum quota of 8K tokens per minute (TPM). That is too little, so they obtained a new subscription of type Azure Plan (0017G). Even after switching to the new subscription, however, the 8K TPM limit remains in place. Because not all regions have the same offerings, they tried several regions, including the "resource rich" Sweden Central, but in no region could the configurable TPM be raised above 8K.

What does the customer have to do to increase their GPT-4 TPM quota to, let's say, 150K TPM, which is what other customers of ours have with the very same 0017G subscription type?
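
To put the gap between 8K and 150K TPM in concrete terms, here is a rough sketch of how many requests per minute fit under each quota. The figure of 2,000 tokens per request (prompt plus completion) is an assumed value for illustration only, not a measurement from the customer's application, and the helper name is hypothetical. The sketch considers only the token budget and ignores any separate per-request rate limit.

```python
def max_requests_per_minute(tpm_quota: int, tokens_per_request: int) -> int:
    """Upper bound on requests per minute that fit within a TPM quota."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return tpm_quota // tokens_per_request


# Assumption: average prompt + completion size for one call (hypothetical).
ASSUMED_TOKENS_PER_REQUEST = 2_000

if __name__ == "__main__":
    # The two quotas discussed above: the current limit and the target.
    for quota in (8_000, 150_000):
        rpm = max_requests_per_minute(quota, ASSUMED_TOKENS_PER_REQUEST)
        print(f"{quota} TPM allows at most {rpm} requests per minute")
```

Under that assumption, 8K TPM allows at most 4 requests per minute, while 150K TPM allows at most 75.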

Azure OpenAI Service

1 answer

  1. YutongTie-MSFT 53,966 Reputation points Moderator
    2024-07-30T20:56:11.25+00:00

    Hello @Morten

    Thanks for reaching out. You can certainly request more quota if your business needs it. The gating team will review your usage and try to meet your needs as far as regional capacity allows.

    You can do this from Azure OpenAI Studio, as shown in the screenshot below; you will then need to fill in a short form.

    User's image

    Users initially start with a reasonable default quota and can request more as their business needs grow.

    I hope this helps.

    Regards,

    Yutong

    -Please accept the answer if you found it helpful, as this supports the community. Thanks a lot.

