How do I deploy a GPT-4 based model in the North Central US region using TPM quota (Azure OpenAI)?

ParisaTabassum-7447 20 Reputation points Microsoft Employee
2024-09-20T00:06:25.02+00:00

Hello, we are trying to deploy a GPT-4 based model under our subscription. The Azure OpenAI resource was created in North Central US, and we need both the resource and the model to be deployed in this region.

Our usage of the model is, and will remain, minimal, so we want to use TPM quota instead of PTU quota.

Our partner team has already deployed and trained the model using TPM quota under their subscription. That model is deployed in North Central US and was deployed a few months ago.
We are now trying to deploy the same model in our subscription, but when I try to deploy it I get the error: Not enough Quota available. The only quota option shown is PTU quota (picture attached).

I have seen the answer to a related question on this page: https://learn.microsoft.com/en-us/answers/questions/1494185/cant-deploy-any-gpt-4-model-no-quota-is-available

It says the gpt-4 model is not available in North Central US and that I have to choose one of the 5 listed regions. My questions are:

  1. How could our partner team deploy and train a gpt-4 model in the NCUS region when we can't? Was the region available before and has it since been decommissioned?
  2. Can we use TPM quota for gpt-4 instead of PTU quota?
  3. If we try to transfer the trained model from their subscription to our subscription, will it work?
Azure OpenAI Service

Accepted answer
  1. AshokPeddakotla-MSFT 35,971 Reputation points Moderator
    2024-09-20T08:12:29.25+00:00

    ParisaTabassum-7447, greetings and welcome to the Microsoft Q&A forum!

    Please see the answers to your questions below.

    how could our partner team deploy and train a gpt-4 model in NCUS region and we can't? Was the region available before and has been decommissioned?

    Please note that model availability depends on several factors, such as capacity and usage.

    GPT-4 models are available in Standard deployment and Provisioned deployment types.

    The availability of this model varies with the deployment type in a particular region.

    In your case, please double-check the deployment type with your partner team.

    It is also possible that the model was available in that region at the time the partner team deployed it.

    Can we use TPM quota for gpt-4 instead of PTU quota?

    To give more context, quota is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minute (TPM). When you onboard a subscription to Azure OpenAI, you receive default quota for most available models. You then assign TPM to each deployment as it is created, and the available quota for that model is reduced by that amount. You can continue to create deployments and assign them TPM until you reach your quota limit. Once that happens, you can only create new deployments of that model by reducing the TPM assigned to other deployments of the same model (thus freeing TPM for use), or by requesting and being approved for a model quota increase in the desired region.
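
    The accounting described above can be sketched as a small bookkeeping model. This is only an illustration of the arithmetic, not an Azure API: the `ModelQuota` class, its method names, and the TPM figures are hypothetical and chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


class QuotaExceededError(Exception):
    """Raised when a deployment asks for more TPM than is still available."""


@dataclass
class ModelQuota:
    """Hypothetical TPM ledger for one (subscription, region, model) combination."""

    limit_tpm: int
    deployments: Dict[str, int] = field(default_factory=dict)

    @property
    def available_tpm(self) -> int:
        # Quota left after subtracting the TPM assigned to every deployment.
        return self.limit_tpm - sum(self.deployments.values())

    def assign(self, deployment: str, tpm: int) -> None:
        """Create a deployment, or change the TPM of an existing one."""
        if tpm < 0:
            raise ValueError("tpm must be non-negative")
        current = self.deployments.get(deployment, 0)
        # Only the increase over the current assignment consumes new quota.
        if tpm - current > self.available_tpm:
            raise QuotaExceededError(
                f"requested {tpm} TPM for '{deployment}', "
                f"but only {self.available_tpm + current} TPM can be assigned"
            )
        self.deployments[deployment] = tpm


# Illustrative numbers only: a 40,000 TPM quota for one model in one region.
quota = ModelQuota(limit_tpm=40_000)
quota.assign("deployment-a", 30_000)      # 10,000 TPM remain
try:
    quota.assign("deployment-b", 20_000)  # exceeds the remaining quota
except QuotaExceededError as err:
    print(err)
quota.assign("deployment-a", 15_000)      # shrink the first deployment to free TPM
quota.assign("deployment-b", 20_000)      # now fits
print(quota.available_tpm)
```

    In this model, a "Not enough Quota available" situation corresponds to `available_tpm` being smaller than the requested TPM for that model in that region. The two ways out match the ones listed above: lower the TPM of another deployment of the same model, or have `limit_tpm` raised through a quota increase request.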

    Unlike the Tokens-per-Minute (TPM) quota used by other Azure OpenAI offerings, PTUs are model-independent. PTUs can be used to deploy any supported model/version in the region.

    See "What is provisioned throughput?" and "Manage Azure OpenAI Service quota" for more information.

    If we try to transfer the trained model from their subscription to our subscription, will it work?

    Yes, fine-tuning supports deploying a fine-tuned model to a different region than where the model was originally fine-tuned. You can also deploy to a different subscription/region.

    The only limitations are that the new region must also support fine-tuning and that, when deploying cross-subscription, the account generating the authorization token for the deployment must have access to both the source and destination subscriptions.

    Cross subscription/region deployment can be accomplished via Python or REST.
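
    As a rough sketch of the REST route, the snippet below only builds the request for a cross-subscription deployment and does not send it. It assumes the Azure Resource Manager deployments endpoint for `Microsoft.CognitiveServices` accounts and a `source` property that points at the resource holding the fine-tuned model. The `api-version`, SKU name, capacity, model version, and all angle-bracket placeholders are assumptions to verify against the current fine-tuning documentation, and the helper names are hypothetical.

```python
import json

ARM_ENDPOINT = "https://management.azure.com"


def account_resource_id(subscription_id, resource_group, account_name):
    """ARM resource ID of an Azure OpenAI (Cognitive Services) account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
    )


def build_deployment_request(token, source, destination, fine_tuned_model,
                             deployment_name, api_version="2023-05-01",
                             capacity=1):
    """Describe the PUT request that deploys a fine-tuned model held in
    `source` into the `destination` account. Nothing is sent here."""
    url = (
        ARM_ENDPOINT
        + account_resource_id(**destination)
        + f"/deployments/{deployment_name}"
    )
    body = {
        # Assumed SKU for a standard (TPM-based) deployment.
        "sku": {"name": "standard", "capacity": capacity},
        "properties": {
            "model": {
                "format": "OpenAI",
                "name": fine_tuned_model,
                "version": "1",  # assumed version of the fine-tuned model
                # The account where the model was originally fine-tuned.
                "source": account_resource_id(**source),
            }
        },
    }
    return {
        "method": "PUT",
        "url": url,
        "params": {"api-version": api_version},
        "headers": {
            # The token must come from an identity with access to BOTH
            # the source and the destination subscription.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "data": json.dumps(body),
    }


request = build_deployment_request(
    token="<TOKEN>",
    source={"subscription_id": "<SOURCE_SUBSCRIPTION_ID>",
            "resource_group": "<SOURCE_RESOURCE_GROUP>",
            "account_name": "<SOURCE_RESOURCE>"},
    destination={"subscription_id": "<DESTINATION_SUBSCRIPTION_ID>",
                 "resource_group": "<DESTINATION_RESOURCE_GROUP>",
                 "account_name": "<DESTINATION_RESOURCE>"},
    fine_tuned_model="<FINE_TUNED_MODEL_NAME>",
    deployment_name="<DEPLOYMENT_NAME>",
)
print(request["url"])
```

    The resulting dictionary can be passed to any HTTP client as a PUT request. Keeping the request construction separate from the network call makes it easy to inspect the source and destination IDs before anything is deployed.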

    Let me know if that helps or if you have any other questions.


    If the response helped, please click Accept Answer and Yes for "Was this answer helpful".

    Doing so helps other community members with a similar issue identify the solution. I highly appreciate your contribution to the community.
