自動調整 Azure Machine Learning 中的線上端點

發行項
10/16/2024

適用於：Azure CLI ml 延伸模組 v2 (目前)Python SDK azure-ai-ml v2 (目前)

在本文中，您將了解如何設定以計量和排程為基礎的自動調整，以管理部署中的資源使用狀況。自動調整程序可讓您自動執行正確的資源量，以處理應用程式的負載。 Azure Machine Learning 中的線上端點透過與 Azure 監視器的自動調整功能整合，進而支援自動調整作業。

Azure 監視器的自動調整可讓您設定規則，以在符合規則條件時觸發一或多個自動調整動作。您可設定以計量為基礎的調整 (例如 CPU 使用率大於 70%)、以排程為基礎的調整 (例如尖峰上班時間的調整規則)，或上述兩者的組合。如需詳細資訊，請參閱Microsoft Azure 自動調整概觀。

此圖表顯示自動調整如何視需求新增和移除執行個體。

您目前可以使用 Azure CLI、REST API、Azure Resource Manager、Python SDK 或瀏覽器型 Azure 入口網站來管理自動調整。

必要條件

已部署的端點。如需詳細資訊，請參閱使用線上端點部署和評分機器學習模型 (部分機器翻譯)。
若要使用自動調整，必須將角色 microsoft.insights/autoscalesettings/write 指派給管理自動調整的身分識別。您可以使用任何允許此動作的內建或自訂角色。如需管理 Azure Machine Learning 角色的一般指引，請參閱管理使用者和角色。如需 Azure 監視器自動調整設定的詳細資訊，請參閱 Microsoft.Insights 自動調整設定。
若要使用 Python SDK 來管理 Azure 監視器服務，請使用下列命令安裝 azure-mgmt-monitor 套件：
```
pip install azure-mgmt-monitor
```

定義自動調整設定檔

若要啟用線上端點的自動調整，請先定義自動調整設定檔。此設定檔會指定預設、最小和最大擴展集容量。下列範例示範如何設定虛擬機器 (VM) 執行個體數目，以取得預設、最小和最大調整容量。

適用於：Azure CLI ml 延伸模組 v2 (目前)

如果您尚未設定 Azure CLI 的預設值，請儲存您的預設設定。若要避免多次傳遞訂用帳戶、工作區和資源群組的值，請執行下列程式碼：

az account set --subscription <subscription ID>
az configure --defaults workspace=<Azure Machine Learning workspace name> group=<resource group>

設定端點和部署名稱：

# set your existing endpoint name
ENDPOINT_NAME=your-endpoint-name
DEPLOYMENT_NAME=blue

取得部署和端點的 Azure Resource Manager 識別碼：

# ARM id of the deployment
DEPLOYMENT_RESOURCE_ID=$(az ml online-deployment show -e $ENDPOINT_NAME -n $DEPLOYMENT_NAME -o tsv --query "id")
# ARM id of the deployment. todo: change to --query "id"
ENDPOINT_RESOURCE_ID=$(az ml online-endpoint show -n $ENDPOINT_NAME -o tsv --query "properties.\"azureml.onlineendpointid\"")
# set a unique name for autoscale settings for this deployment. The below will append a random number to make the name unique.
AUTOSCALE_SETTINGS_NAME=autoscale-$ENDPOINT_NAME-$DEPLOYMENT_NAME-`echo $RANDOM`

建立自動調整設定檔：

az monitor autoscale create \
  --name $AUTOSCALE_SETTINGS_NAME \
  --resource $DEPLOYMENT_RESOURCE_ID \
  --min-count 2 --max-count 5 --count 2

注意

如需詳細資訊，請參閱 az monitor autoscale 參考。

適用於：Python SDK azure-ai-ml v2 (目前)

匯入必要的模組：

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import AutoscaleProfile, ScaleRule, MetricTrigger, ScaleAction, Recurrence, RecurrentSchedule
import random 
import datetime

定義工作區、端點和部署的變數：

subscription_id = "<YOUR-SUBSCRIPTION-ID>"
resource_group = "<YOUR-RESOURCE-GROUP>"
workspace = "<YOUR-WORKSPACE>"

endpoint_name = "<YOUR-ENDPOINT-NAME>"
deployment_name = "blue"

取得 Azure Machine Learning 和 Azure 監視器用戶端：

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential, subscription_id, resource_group, workspace
)

mon_client = MonitorManagementClient(
    credential, subscription_id
)

取得端點和部署物件：

deployment = ml_client.online_deployments.get(
    deployment_name, endpoint_name
)

endpoint = ml_client.online_endpoints.get(
    endpoint_name
)

建立自動調整設定檔：

# Set a unique name for autoscale settings for this deployment. The following code appends a random number to create a unique name.
autoscale_settings_name = f"autoscale-{endpoint_name}-{deployment_name}-{random.randint(0,1000)}"

mon_client.autoscale_settings.create_or_update(
    resource_group, 
    autoscale_settings_name, 
    parameters = {
        "location" : endpoint.location,
        "target_resource_uri" : deployment.id,
        "profiles" : [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={
                    "minimum" : 2, 
                    "maximum" : 5,
                    "default" : 2
                },
                rules = []
            )
        ]
    }
)

建立以部署計量為基礎的擴增規則

常見的擴增規則是在平均 CPU 負載很高時，增加 VM 執行個體數目。下列範例示範如何在 CPU 平均負載大於 70% 達 5 分鐘時，多配置兩個節點 (不超過最大值)：

適用於：Azure CLI ml 延伸模組 v2 (目前)

az monitor autoscale rule create \
  --autoscale-name $AUTOSCALE_SETTINGS_NAME \
  --condition "CpuUtilizationPercentage > 70 avg 5m" \
  --scale out 2

規則是 my-scale-settings 設定檔的一部分，其中 autoscale-name 符合設定檔的 name 部分。規則的 condition 引數值指出「VM 執行個體之間的平均 CPU 使用量超過 70% 達 5 分鐘」時，應觸發此規則。滿足條件時，系統會多配置兩個 VM 執行個體。

注意

如需詳細資訊，請參閱 az monitor autoscale Azure CLI 語法參考。

適用於：Python SDK azure-ai-ml v2 (目前)

建立規則定義：

rule_scale_out = ScaleRule(
    metric_trigger = MetricTrigger(
        metric_name="CpuUtilizationPercentage",
        metric_resource_uri = deployment.id, 
        time_grain = datetime.timedelta(minutes = 1),
        statistic = "Average",
        operator = "GreaterThan", 
        time_aggregation = "Last",
        time_window = datetime.timedelta(minutes = 5), 
        threshold = 70
    ), 
    scale_action = ScaleAction(
        direction = "Increase", 
        type = "ChangeCount", 
        value = 2, 
        cooldown = datetime.timedelta(hours = 1)
    )
)

此規則是指引數 metric_name、time_window 和 time_aggregation 中的 CPUUtilizationpercentage 值最後 5 分鐘的平均值。當計量的值大於 threshold 70 時，部署會多配置兩個 VM 執行個體。

更新 my-scale-settings 設定檔以納入此規則：

mon_client.autoscale_settings.create_or_update(
    resource_group, 
    autoscale_settings_name, 
    parameters = {
        "location" : endpoint.location,
        "target_resource_uri" : deployment.id,
        "profiles" : [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={
                    "minimum" : 2, 
                    "maximum" : 5,
                    "default" : 2
                },
                rules = [
                    rule_scale_out
                ]
            )
        ]
    }
)

建立以部署計量為基礎的縮減規則

當平均 CPU 負載很輕時，縮減規則可以減少 VM 執行個體數目。下列範例示範如何在 CPU 負載小於 30% 達 5 分鐘時，釋放一個節點 (下限為兩個)。

適用於：Azure CLI ml 延伸模組 v2 (目前)

az monitor autoscale rule create \
  --autoscale-name $AUTOSCALE_SETTINGS_NAME \
  --condition "CpuUtilizationPercentage < 25 avg 5m" \
  --scale in 1

適用於：Python SDK azure-ai-ml v2 (目前)

建立規則定義：

rule_scale_in = ScaleRule(
    metric_trigger = MetricTrigger(
        metric_name="CpuUtilizationPercentage",
        metric_resource_uri = deployment.id, 
        time_grain = datetime.timedelta(minutes = 1),
        statistic = "Average",
        operator = "LessThan", 
        time_aggregation = "Last",
        time_window = datetime.timedelta(minutes = 5), 
        threshold = 30
    ), 
    scale_action = ScaleAction(
        direction = "Increase", 
        type = "ChangeCount", 
        value = 1, 
        cooldown = datetime.timedelta(hours = 1)
    )
)

更新 my-scale-settings 設定檔以納入此規則：

mon_client.autoscale_settings.create_or_update(
    resource_group, 
    autoscale_settings_name, 
    parameters = {
        "location" : endpoint.location,
        "target_resource_uri" : deployment.id,
        "profiles" : [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={
                    "minimum" : 2, 
                    "maximum" : 5,
                    "default" : 2
                },
                rules = [
                    rule_scale_out, 
                    rule_scale_in
                ]
            )
        ]
    }
)

建立以端點計量為基礎的調整規則

在上一節中，您已建立規則，根據部署計量進行縮減或擴增。您也可以建立套用至部署端點的規則。在本節中，您將了解如何在要求延遲大於平均值 70 毫秒達 5 分鐘時，配置另一個節點。

適用於：Azure CLI ml 延伸模組 v2 (目前)

az monitor autoscale rule create \
 --autoscale-name $AUTOSCALE_SETTINGS_NAME \
 --condition "RequestLatency > 70 avg 5m" \
 --scale out 1 \
 --resource $ENDPOINT_RESOURCE_ID

適用於：Python SDK azure-ai-ml v2 (目前)

建立規則定義：

rule_scale_out_endpoint = ScaleRule(
    metric_trigger = MetricTrigger(
        metric_name="RequestLatency",
        metric_resource_uri = endpoint.id, 
        time_grain = datetime.timedelta(minutes = 1),
        statistic = "Average",
        operator = "GreaterThan", 
        time_aggregation = "Last",
        time_window = datetime.timedelta(minutes = 5), 
        threshold = 70
    ), 
    scale_action = ScaleAction(
        direction = "Increase", 
        type = "ChangeCount", 
        value = 1, 
        cooldown = datetime.timedelta(hours = 1)
    )
)

此規則的 metric_resource_uri 欄位現在是指端點，而不是部署。

更新 my-scale-settings 設定檔以納入此規則：

mon_client.autoscale_settings.create_or_update(
    resource_group, 
    autoscale_settings_name, 
    parameters = {
        "location" : endpoint.location,
        "target_resource_uri" : deployment.id,
        "profiles" : [
            AutoscaleProfile(
                name="my-scale-settings",
                capacity={
                    "minimum" : 2, 
                    "maximum" : 5,
                    "default" : 2
                },
                rules = [
                    rule_scale_out, 
                    rule_scale_in,
                    rule_scale_out_endpoint
                ]
            )
        ]
    }
)

尋找支援的計量識別碼

如果您想要使用 Azure CLI 或 the SDK，在程式碼中使用其他計量來設定自動調整規則，請參閱可用的計量中的表格。

建立以排程為基礎的調整規則

您也可以建立只在特定日期或時間套用的規則。在本節中，您會建立規則，將週末的節點計數設定為 2。

適用於：Azure CLI ml 延伸模組 v2 (目前)

az monitor autoscale profile create \
  --name weekend-profile \
  --autoscale-name $AUTOSCALE_SETTINGS_NAME \
  --min-count 2 --count 2 --max-count 2 \
  --recurrence week sat sun --timezone "Pacific Standard Time"

適用於：Python SDK azure-ai-ml v2 (目前)

mon_client.autoscale_settings.create_or_update(
    resource_group, 
    autoscale_settings_name, 
    parameters = {
        "location" : endpoint.location,
        "target_resource_uri" : deployment.id,
        "profiles" : [
            AutoscaleProfile(
                name="Default",
                capacity={
                    "minimum" : 2, 
                    "maximum" : 2,
                    "default" : 2
                },
                recurrence = Recurrence(
                    frequency = "Week", 
                    schedule = RecurrentSchedule(
                        time_zone = "Pacific Standard Time", 
                        days = ["Saturday", "Sunday"], 
                        hours = [], 
                        minutes = []
                    )
                )
            )
        ]
    }
)

啟用或停用自動調整

您可以啟用或停用特定的自動調整設定檔。

適用於：Azure CLI ml 延伸模組 v2 (目前)

az monitor autoscale update \
  --autoscale-name $AUTOSCALE_SETTINGS_NAME \
  --enabled false

適用於：Python SDK azure-ai-ml v2 (目前)

mon_client.autoscale_settings.create_or_update(
    resource_group, 
    autoscale_settings_name, 
    parameters = {
        "location" : endpoint.location,
        "target_resource_uri" : deployment.id,
        "enabled" : False
    }
)

刪除資源

如果您不打算使用部署，請執行下列步驟來刪除資源。

適用於：Azure CLI ml 延伸模組 v2 (目前)

# delete the autoscaling profile
az monitor autoscale delete -n "$AUTOSCALE_SETTINGS_NAME"

# delete the endpoint
az ml online-endpoint delete --name $ENDPOINT_NAME --yes --no-wait

適用於：Python SDK azure-ai-ml v2 (目前)

mon_client.autoscale_settings.delete(
    resource_group, 
    autoscale_settings_name
)

ml_client.online_endpoints.begin_delete(endpoint_name)

共用方式為

自動調整 Azure Machine Learning 中的線上端點

必要條件

定義自動調整設定檔

建立以部署計量為基礎的擴增規則

建立以部署計量為基礎的縮減規則

建立以端點計量為基礎的調整規則

尋找支援的計量識別碼

建立以排程為基礎的調整規則

啟用或停用自動調整

刪除資源

意見反應

其他資源

共用方式為

自動調整 Azure Machine Learning 中的線上端點

必要條件

定義自動調整設定檔

建立以部署計量為基礎的擴增規則

建立以部署計量為基礎的縮減規則

建立以端點計量為基礎的調整規則

尋找支援的計量識別碼

建立以排程為基礎的調整規則

啟用或停用自動調整

刪除資源

相關內容

意見反應

其他資源