How to get information about a privately deployed Azure OpenAI chat model

Jens Madsen 1 Reputation point
2023-09-21T13:42:40.32+00:00

We have privately deployed an OpenAI LLM on Azure. To avoid having to keep track of, e.g., the token limit of a given deployment (endpoint), it would be very helpful to be able to retrieve information about the model through an API call. Something along the lines of:

https://

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. dupammi 8,465 Reputation points Microsoft Vendor
    2023-09-22T08:35:20.66+00:00

    Hi @Jens Madsen,

    Thanks for using the Microsoft Q&A platform.

    I understand that you want to retrieve the token limit of a privately deployed OpenAI model through an API call. I will be happy to assist you with this.

    The following documentation describes the API call for querying quota usage in a given region for a specific subscription:

    Manage Azure OpenAI Service quota - Azure AI services | Microsoft Learn

    • Query quota usage:

    Provide your subscription ID and location in the respective places in the command below:

    • GET command:

    GET https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.CognitiveServices/locations/{location}/usages?api-version=2023-05-01

    The path parameters are subscriptionId and location; api-version is passed as a query parameter.

    Below is an example request:

    • curl command:

    curl -X GET "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/providers/Microsoft.CognitiveServices/locations/eastus/usages?api-version=2023-05-01" -H "Content-Type: application/json" -H "Authorization: Bearer YOUR_AUTH_TOKEN"

    Make sure to replace the subscription ID, the location, and YOUR_AUTH_TOKEN with your actual subscription ID, your location, and the access token you obtained through Azure AD or the Azure portal, respectively.
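
For anyone who prefers to make this call from code, here is a minimal Python sketch of the same request, using only the standard library. The function names `build_usages_url` and `fetch_usages` are hypothetical helpers made up for this illustration. The sketch assumes you already hold a valid bearer token for the Azure management endpoint.

```python
import json
import urllib.parse
import urllib.request

MANAGEMENT_HOST = "https://management.azure.com"
API_VERSION = "2023-05-01"  # the API version named in this answer


def build_usages_url(subscription_id: str, location: str,
                     api_version: str = API_VERSION) -> str:
    """Build the quota-usage URL used by the GET command in this answer."""
    path = (
        f"/subscriptions/{urllib.parse.quote(subscription_id, safe='')}"
        "/providers/Microsoft.CognitiveServices"
        f"/locations/{urllib.parse.quote(location, safe='')}/usages"
    )
    query = urllib.parse.urlencode({"api-version": api_version})
    return f"{MANAGEMENT_HOST}{path}?{query}"


def fetch_usages(subscription_id: str, location: str, token: str) -> dict:
    """Send the GET request with a bearer token and return the decoded JSON body.

    Hypothetical helper: it performs a real network call, so it needs a valid
    access token and network access to succeed.
    """
    request = urllib.request.Request(
        build_usages_url(subscription_id, location),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_usages(subscription_id, "eastus", token)` with real values should return the parsed JSON body of the response. `build_usages_url` can be checked on its own without any network access.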

    The quota documentation linked above also helps you get started with programmatically creating deployments that use quota to set tokens-per-minute (TPM) rate limits.

    With the introduction of quota, you must use API version 2023-05-01 for resource management related activities.

    For more detail on quota usage, see the reference document:

    Usages - List - REST API (Azure Cognitive Services) | Microsoft Learn
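
Once the response comes back, the quota figures can be pulled out with a few lines of Python. This is only a sketch. It assumes the response body has a top-level `value` list whose items carry `name.value`, `currentValue`, and `limit` fields, so check the reference document above for the exact schema. `summarize_usages` is a hypothetical helper, and the sample payload is made up for illustration.

```python
from typing import Any


def summarize_usages(payload: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map each quota name to its current value and limit.

    Assumes the payload shape described in the text above; items without a
    name are skipped.
    """
    summary: dict[str, dict[str, Any]] = {}
    for item in payload.get("value", []):
        name = item.get("name", {}).get("value")
        if name is None:
            continue
        summary[name] = {
            "current": item.get("currentValue"),
            "limit": item.get("limit"),
        }
    return summary


# Made-up sample shaped like the assumed response; the numbers are illustrative only.
sample = {
    "value": [
        {
            "name": {
                "value": "OpenAI.Standard.gpt-35-turbo",
                "localizedValue": "Tokens Per Minute (thousands) - GPT-35-Turbo",
            },
            "currentValue": 120,
            "limit": 240,
            "unit": "Count",
        }
    ]
}

for quota_name, numbers in summarize_usages(sample).items():
    print(f"{quota_name}: {numbers['current']} of {numbers['limit']} used")
# prints "OpenAI.Standard.gpt-35-turbo: 120 of 240 used"
```

Note that the limit reported here is the quota limit this answer describes, which is what the TPM rate limits of your deployments are drawn from.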

    I hope this information helps! Let me know if you have any further questions.
