How to Enable Dynamic Quota for Deployments Using Python SDK

Anonymous
2024-07-11T12:04:58.82+00:00

Hi,

We have a use case that requires enabling dynamic quota on Azure OpenAI deployments through the Azure Python SDK. However, the relevant attribute appears to be read-only in the Python SDK. Dynamic quota is referred to as dynamic_throttling in the API docs and is included in the headers as the enableDynamicThrottling bool. In the Python SDK, it seems it can only be set here, which does not cover our use case or offer the flexibility of the API. We also tried including the setting in the deployment request, following the API, but dynamicThrottlingEnabled is simply ignored. The deployment succeeds, but the quota is not shown as dynamic in the edit deployment view. How would you suggest enabling this with the Python SDK at the model deployment step?

For example, the following request, sent with the correct headers and access, does not work.
It returns a success response and the deployment is created, but dynamicThrottlingEnabled is ignored:

{
    "sku": 
    {
        "name": "Standard", 
        "capacity": 20
    }, 
    "properties": 
    {
        "model": {
            "format": "OpenAI", 
            "name": "gpt-35-turbo", 
            "version": "0613"
        }, 
        "raiPolicyName": "Microsoft.Default", 
        "versionUpgradeOption": "OnceNewDefaultVersionAvailable", 
        "dynamicThrottlingEnabled": true
    }
}
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. YutongTie-MSFT 51,766 Reputation points
    2024-07-11T21:51:34.35+00:00

    Hello,

    Thanks for reaching out. You can set it directly in the Azure OpenAI Studio deployment settings, as shown in the screenshot below:

    Screenshot of advanced configuration UI for deployments.

    Alternatively, you can enable it programmatically with the Azure CLI's az rest command.

    Replace {subscriptionId}, {resourceGroupName}, {accountName}, and {deploymentName} with the relevant values for your resource. Here, accountName is the name of your Azure OpenAI resource.

    az rest --method patch --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/deployments/{deploymentName}?api-version=2023-10-01-preview" --body '{"properties": {"dynamicThrottlingEnabled": true} }'
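Since the question asks about Python, the same PATCH call can be sketched with only the standard library. This is a minimal sketch, not an SDK method: the function names are hypothetical, the version is passed as the api-version query parameter that Azure Resource Manager expects, and the bearer token is assumed to be obtained separately.

```python
import json
import urllib.request

# API version taken from the az rest command above.
API_VERSION = "2023-10-01-preview"


def build_deployment_url(subscription_id, resource_group, account_name,
                         deployment_name, api_version=API_VERSION):
    """Build the management URL for one deployment (hypothetical helper)."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.CognitiveServices"
        f"/accounts/{account_name}"
        f"/deployments/{deployment_name}"
        f"?api-version={api_version}"
    )


def build_patch_request(url, token, enabled=True):
    """Prepare the PATCH request that toggles dynamicThrottlingEnabled."""
    body = json.dumps({"properties": {"dynamicThrottlingEnabled": enabled}})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def set_dynamic_quota(url, token, enabled=True):
    """Send the request and return the parsed JSON response."""
    request = build_patch_request(url, token, enabled)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, build the URL with your own subscription, resource group, account, and deployment names, then call set_dynamic_quota(url, token). One way to get a token for testing is az account get-access-token. Checking the returned properties for dynamicThrottlingEnabled shows whether the setting was applied.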

    For more details, see https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/dynamic-quota

    I hope this helps. Please let us know how it works.

    Regards,

    Yutong

    -Please kindly accept the answer if you find it helpful, to support the community. Thanks a lot.
