Neha Chopade, greetings and welcome to the Microsoft Q&A forum!
Provisioned throughput units (PTU) are units of model processing capacity that you can reserve and deploy for processing prompts and generating completions. The minimum PTU deployment, the increments, and the processing capacity associated with each unit vary by model type and version.
Note that quota is specific to a (deployment type, model, region) triplet and isn't interchangeable, meaning you can't use quota for GPT-4 to deploy GPT-35-turbo.
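As a rough illustration of why quota isn't interchangeable, here is a minimal sketch that treats quota as a lookup keyed by the full (deployment type, model, region) triplet. The `quota` table, the `can_deploy` helper, the key labels, and the PTU numbers are all hypothetical placeholders for illustration, not values or calls from the Azure API.

```python
from typing import Dict, Tuple

# Key is the full triplet: (deployment type, model, region).
QuotaKey = Tuple[str, str, str]

# Hypothetical quota balances in PTU. The numbers are placeholders,
# not real limits for any subscription.
quota: Dict[QuotaKey, int] = {
    ("ProvisionedManaged", "gpt-4", "swedencentral"): 300,
    ("ProvisionedManaged", "gpt-35-turbo", "swedencentral"): 0,
}


def can_deploy(deployment_type: str, model: str, region: str, ptus: int) -> bool:
    """Return True only if the exact triplet has enough quota."""
    available = quota.get((deployment_type, model, region), 0)
    return ptus <= available


# GPT-4 quota covers a GPT-4 deployment...
print(can_deploy("ProvisionedManaged", "gpt-4", "swedencentral", 100))
# ...but it cannot be borrowed for GPT-35-turbo in the same region.
print(can_deploy("ProvisionedManaged", "gpt-35-turbo", "swedencentral", 100))
```

Because the lookup uses the whole triplet as the key, changing any one of the three parts (deployment type, model, or region) points at a different quota balance.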
what are PTU for gpt-4 model deployment?
As mentioned, each model-version pair requires a different amount of PTU to deploy and provides a different amount of throughput per PTU.
To get a quick estimate for your workload, open the capacity planner in Azure OpenAI Studio, under Management > Quotas > Provisioned. The Provisioned option and the capacity planner are only available in certain regions within the Quota pane. If you don't see this option, setting the quota region to Sweden Central will make it available.
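To make the "minimum deployment and increments" idea concrete, here is a small sketch that rounds an estimated PTU requirement up to a deployable size. It assumes that valid sizes are the minimum plus a whole number of increments. The function name `round_up_to_valid_ptus` and the example values (minimum 100, increment 50) are hypothetical, so take the real minimum and increment for your model and version from the documentation or the capacity planner.

```python
import math


def round_up_to_valid_ptus(estimated: float, minimum: int, increment: int) -> int:
    """Round an estimate up to the nearest size of the form minimum + k * increment.

    Assumption: deployable sizes start at `minimum` and grow in steps of
    `increment`. Both values depend on the model and version.
    """
    if minimum <= 0 or increment <= 0:
        raise ValueError("minimum and increment must be positive")
    if estimated <= minimum:
        return minimum
    steps = math.ceil((estimated - minimum) / increment)
    return minimum + steps * increment


# Hypothetical model with a minimum of 100 PTU and increments of 50 PTU.
print(round_up_to_valid_ptus(130, minimum=100, increment=50))  # 150
print(round_up_to_valid_ptus(40, minimum=100, increment=50))   # 100
```

The estimate itself should still come from the capacity planner, since throughput per PTU differs per model-version pair.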

Please see the documentation below to learn more about PTU:
What is provisioned throughput?
Get started using Provisioned Deployments on the Azure OpenAI Service
Provisioned throughput units onboarding
Do let me know if that helps or if you have any other queries.
---
If the response helped, please click Accept Answer and Yes for "Was this answer helpful". Doing so would help other community members with a similar issue identify the solution. I highly appreciate your contribution to the community.