What is PTU for GPT-4 model deployment?

Neha Chopade 0 Reputation points
2024-01-18T13:23:31.0266667+00:00

What are PTUs for a GPT-4 model deployment?

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

2 answers

  1. Azar 26,185 Reputation points MVP
    2024-01-18T14:22:32.55+00:00

    Hey Neha Chopade

    Provisioned throughput lets you specify the amount of throughput you require in a deployment. The service then allocates the necessary model processing capacity and ensures it's ready for you. Throughput is defined in terms of provisioned throughput units (PTU), a normalized way of representing the throughput for your deployment. Each model-version pair requires a different number of PTUs to deploy and provides a different amount of throughput per PTU.

    For more detail, please see the link below:

    https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput

    If this helps, kindly accept the answer. Thanks much.


  2. AshokPeddakotla-MSFT 35,931 Reputation points
    2024-01-18T14:35:46.9133333+00:00

    Neha Chopade, greetings and welcome to the Microsoft Q&A forum!

    Provisioned throughput units (PTU) are units of model processing capacity that you can reserve and deploy for processing prompts and generating completions. The minimum PTU deployment, the increments, and the processing capacity associated with each unit vary by model type and version.
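
To make the minimum-and-increment rule concrete, here is a minimal Python sketch. The function name `round_up_to_valid_ptu` and the sample numbers (a minimum of 100 and an increment of 50) are hypothetical and chosen only for illustration. The real values depend on the model and version, so take them from the Azure documentation or the capacity planner. The sketch also assumes that valid deployment sizes are the minimum plus a whole number of increments.

```python
def round_up_to_valid_ptu(requested: int, minimum: int, increment: int) -> int:
    """Return the smallest deployable PTU count that is >= requested.

    Assumption (not from Azure docs): valid sizes are
    minimum, minimum + increment, minimum + 2 * increment, ...
    """
    if requested <= 0 or minimum <= 0 or increment <= 0:
        raise ValueError("requested, minimum and increment must be positive")
    if requested <= minimum:
        return minimum
    # Ceiling division: number of increments needed above the minimum.
    steps = -(-(requested - minimum) // increment)
    return minimum + steps * increment


# Hypothetical numbers for illustration only.
print(round_up_to_valid_ptu(120, minimum=100, increment=50))  # prints 150
```

With these made-up values, a workload estimated at 120 PTUs would need a 150-PTU deployment, because 120 is not a valid size under the assumed rule.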

    Note that quota is specific to a (deployment type, model, region) triplet and isn't interchangeable, meaning you can't use quota for GPT-4 to deploy GPT-35-turbo.
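
One way to picture the triplet rule is as a lookup table keyed by all three values. The sketch below is purely illustrative: `QuotaKey`, `available_quota`, the deployment-type string, and the quota figure are hypothetical and do not come from any Azure SDK or from real quota data.

```python
from typing import Dict, NamedTuple


class QuotaKey(NamedTuple):
    """Hypothetical key: quota is tracked per (deployment type, model, region)."""
    deployment_type: str
    model: str
    region: str


def available_quota(quota: Dict[QuotaKey, int], deployment_type: str,
                    model: str, region: str) -> int:
    """Look up quota for the exact triplet; other triplets never contribute."""
    return quota.get(QuotaKey(deployment_type, model, region), 0)


# Made-up quota table for illustration only.
quota = {QuotaKey("provisioned", "gpt-4", "swedencentral"): 300}

print(available_quota(quota, "provisioned", "gpt-4", "swedencentral"))         # prints 300
print(available_quota(quota, "provisioned", "gpt-35-turbo", "swedencentral"))  # prints 0
```

The second lookup returns 0 even though GPT-4 quota exists in the same region, which mirrors the point that quota for one model cannot be spent on another.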

    What are PTUs for a GPT-4 model deployment?

    As mentioned, each model-version pair requires a different number of PTUs to deploy and provides a different amount of throughput per PTU.

    To get a quick estimate for your workload, open the capacity planner in Azure OpenAI Studio, under Management > Quotas > Provisioned. Within the Quota pane, the Provisioned option and the capacity planner are only available in certain regions. If you don't see the option, set the quota region to Sweden Central to make it available.


    Please see the documentation below to learn more about PTU.

    What is provisioned throughput?

    Get started using Provisioned Deployments on the Azure OpenAI Service

    Provisioned throughput units onboarding

    Do let me know if that helps or if you have any other queries.

    If this response helped, please click Accept Answer and Yes for "Was this answer helpful". Doing so helps other community members with similar issues find the solution. I highly appreciate your contribution to the community.

