need some details of PTU in Azure OpenAI Service

Question

Lianne Goodwin 20

Hey. I heard people talking about a quota method in Azure OpenAI Service called PTU. I'm wondering what the difference is between PTU and the normal deployment quota. Thanks.

Accepted answer

Answer 1

Hi Lianne

Provisioned throughput units (PTU) are generic units of model processing capacity that you can use to size provisioned deployments to achieve the required throughput for processing prompts and generating completions. Provisioned throughput units are granted to a subscription as quota on a regional basis, which defines the maximum number of PTUs that can be assigned to deployments in that subscription and region.

Unlike the Tokens Per Minute (TPM) quota used by other Azure OpenAI offerings, PTUs are model-independent: they can be used to deploy any supported model and version in the region. You can find more details at the link below.

https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput
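
To make the quota idea concrete, here is a minimal sketch of how a regional PTU quota caps the total PTUs assigned to deployments, whatever model each deployment uses. This is only an illustration, not the Azure API: the class `RegionalPtuQuota`, its methods, the region and model names, and the numbers are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RegionalPtuQuota:
    """Hypothetical model of the PTU quota for one subscription and region."""
    region: str
    quota_ptus: int
    # deployment name -> (model name, PTUs assigned to that deployment)
    deployments: dict = field(default_factory=dict)

    def used_ptus(self) -> int:
        # PTUs are model-independent, so usage is a plain sum over deployments.
        return sum(ptus for _, ptus in self.deployments.values())

    def remaining_ptus(self) -> int:
        return self.quota_ptus - self.used_ptus()

    def deploy(self, name: str, model: str, ptus: int) -> None:
        """Assign PTUs to a new deployment if the regional quota allows it."""
        if ptus <= 0:
            raise ValueError("ptus must be a positive integer")
        if name in self.deployments:
            raise ValueError(f"deployment {name!r} already exists")
        if ptus > self.remaining_ptus():
            raise ValueError(
                f"requested {ptus} PTUs, only {self.remaining_ptus()} left in {self.region}"
            )
        self.deployments[name] = (model, ptus)


if __name__ == "__main__":
    # Illustrative values only; real quotas and model names will differ.
    quota = RegionalPtuQuota(region="example-region", quota_ptus=300)
    quota.deploy("chat", model="model-x", ptus=100)
    quota.deploy("summaries", model="model-y", ptus=150)
    print("PTUs left:", quota.remaining_ptus())
```

The point of the sketch is that the same pool of PTUs is shared by all provisioned deployments in the region, whereas TPM quota is tracked per model.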

A provisioned deployment can further be of type ProvisionedManaged or GlobalProvisionedManaged. PTU pricing (in USD) is listed on the pricing page below.

https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/
