How to deploy and inference a managed compute deployment with code

Article
01/11/2025

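The Azure AI Foundry portal model catalog offers over 1,600 models. The most common way to deploy these models is the managed compute deployment option, which is also sometimes referred to as a managed online deployment.

Deploying a large language model (LLM) makes it available for use in a website, an application, or another production environment. Deployment typically involves hosting the model on a server or in the cloud and creating an API or other interface so that users can interact with the model. You can invoke the deployment for real-time inference in generative AI applications such as chat and copilot.

In this article, you learn how to deploy models by using the Azure Machine Learning SDK. The article also covers how to perform inference on the deployed model.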

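Get the model ID

You can deploy managed compute models by using the Azure Machine Learning SDK, but first, browse the model catalog and get the model ID that you need for deployment.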

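Sign in to Azure AI Foundry and go to the Home page.
Select Model catalog from the left sidebar.
In the Deployment options filter, select Managed compute.
Select a model.
Copy the model ID from the details page of the model you selected. It looks something like this: azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16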
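Before pasting a model ID into deployment code, it can help to check that it has the expected shape. The following is a minimal sketch that splits an ID into registry, model name, and version. The URI layout is inferred only from the single example above, and the names parse_model_id and ModelId are hypothetical helpers, not part of the Azure Machine Learning SDK.

```python
import re
from typing import NamedTuple


class ModelId(NamedTuple):
    """Parts of a registry model ID (hypothetical helper type)."""
    registry: str
    name: str
    version: str


# Layout assumed from the example ID shown in this article.
_MODEL_ID_PATTERN = re.compile(
    r"^azureml://registries/(?P<registry>[^/]+)"
    r"/models/(?P<name>[^/]+)"
    r"/versions/(?P<version>[^/]+)$"
)


def parse_model_id(model_id: str) -> ModelId:
    """Split a registry model ID into its parts, or raise ValueError."""
    match = _MODEL_ID_PATTERN.match(model_id.strip())
    if match is None:
        raise ValueError(f"Not a registry model ID: {model_id!r}")
    return ModelId(**match.groupdict())


parsed = parse_model_id(
    "azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16"
)
print(parsed.registry, parsed.name, parsed.version)
```

A check like this catches a truncated or mistyped ID locally, before a deployment request is sent.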

Deploy the model

Let's deploy the model.

First, you need to install the Azure Machine Learning SDK.

pip install azure-ai-ml
pip install azure-identity

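Use this code to authenticate with Azure Machine Learning and create a client object. Replace the placeholders with your subscription ID, resource group name, and Azure AI Foundry project name.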

from azure.ai.ml import MLClient
from azure.identity import InteractiveBrowserCredential

# The later snippets refer to this client as workspace_ml_client
workspace_ml_client = MLClient(
    credential=InteractiveBrowserCredential(),  # instantiate the credential
    subscription_id="your subscription ID goes here",
    resource_group_name="your resource group name goes here",
    workspace_name="your project name goes here",
)
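If you prefer not to hard-code the three identifiers, one option is to read them from environment variables. The following is a minimal sketch under that assumption. The variable names (AZURE_SUBSCRIPTION_ID, AZURE_RESOURCE_GROUP, AZURE_AI_PROJECT_NAME) and the helper load_client_settings are hypothetical choices for illustration, not names that the SDK reads on its own.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names chosen for this sketch.
REQUIRED_VARS = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group_name": "AZURE_RESOURCE_GROUP",
    "workspace_name": "AZURE_AI_PROJECT_NAME",
}


def load_client_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return keyword arguments for the client, read from the environment."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in REQUIRED_VARS.items()}
```

The returned dictionary can then be unpacked into the client constructor, for example MLClient(credential=InteractiveBrowserCredential(), **load_client_settings()).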

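For the managed compute deployment option, you need to create an endpoint before you create a model deployment. Think of the endpoint as a container that can house multiple model deployments. Endpoint names must be unique within a region, so this example uses a timestamp to create a unique endpoint name.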

import time
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    ProbeSettings,
)

# Make the endpoint name unique by appending a timestamp.
# Customize the prefix; avoid spaces, which are not valid in endpoint names.
timestamp = int(time.time())
online_endpoint_name = "my-endpoint-" + str(timestamp)

# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()
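The timestamp approach above can be wrapped in a small helper that also checks the name before it is sent to the service. The following is a sketch only. The naming rule it enforces (starts with a letter, contains only letters, digits, and hyphens, 3 to 32 characters) is an assumption you should confirm against the current service limits, and make_endpoint_name is a hypothetical helper.

```python
import re
import time
from typing import Optional

# Assumed rule: a letter first, then letters, digits, or hyphens,
# for a total length of 3 to 32 characters.
_NAME_RULE = re.compile(r"^[A-Za-z][A-Za-z0-9-]{2,31}$")


def make_endpoint_name(prefix: str, timestamp: Optional[int] = None) -> str:
    """Build '<prefix>-<timestamp>' and validate it against the assumed rule."""
    ts = int(time.time()) if timestamp is None else timestamp
    name = f"{prefix}-{ts}"
    if not _NAME_RULE.match(name):
        raise ValueError(f"Invalid endpoint name: {name!r}")
    return name


print(make_endpoint_name("demo"))
```

Passing an explicit timestamp makes the helper deterministic, which is convenient in tests.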

Create a deployment. You can find the model ID in the model catalog.

model_name = "azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16" 

demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=model_name,
    instance_type="Standard_DS3_v2",
    instance_count=2,
    liveness_probe=ProbeSettings(
        failure_threshold=30,
        success_threshold=1,
        timeout=2,
        period=10,
        initial_delay=1000,
    ),
    readiness_probe=ProbeSettings(
        failure_threshold=10,
        success_threshold=1,
        timeout=10,
        period=10,
        initial_delay=1000,
    ),
)
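# Create the deployment and wait for provisioning to finish
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
# Route all endpoint traffic to the "demo" deployment
endpoint.traffic = {"demo": 100}
workspace_ml_client.begin_create_or_update(endpoint).result()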
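Because an endpoint can hold several deployments, the traffic dictionary may list more than one entry. The following sketch checks a traffic split before it is assigned to the endpoint. It assumes that the percentages must total either 0 or 100, and validate_traffic is a hypothetical helper rather than an SDK function.

```python
from typing import Dict


def validate_traffic(traffic: Dict[str, int]) -> Dict[str, int]:
    """Return the traffic split unchanged if it looks valid, else raise."""
    for deployment, percent in traffic.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"{deployment}: {percent} is outside 0-100")
    total = sum(traffic.values())
    # Assumption: the service accepts only splits that total 0 or 100.
    if total not in (0, 100):
        raise ValueError(f"Traffic must total 0 or 100, got {total}")
    return traffic


print(validate_traffic({"demo": 100}))
```

For example, endpoint.traffic = validate_traffic({"demo": 90, "demo-v2": 10}) would split traffic between the deployment above and a second, hypothetical deployment named demo-v2.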

Inference the deployment

You need sample JSON data to test inferencing. Create sample_score.json with the following content.

{
  "inputs": {
    "question": [
      "Where do I live?",
      "Where do I live?",
      "What's my name?",
      "Which name is also used to describe the Amazon rainforest in English?"
    ],
    "context": [
      "My name is Wolfgang and I live in Berlin",
      "My name is Sarah and I live in London",
      "My name is Clara and I live in Berkeley.",
      "The Amazon rainforest (Portuguese: Floresta Amaz\u00f4nica or Amaz\u00f4nia; Spanish: Selva Amaz\u00f3nica, Amazon\u00eda or usually Amazonia; French: For\u00eat amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia or the Amazon Jungle, is a moist broadleaf forest that covers most of the Amazon basin of South America. This basin encompasses 7,000,000 square kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres (2,100,000 sq mi) are covered by the rainforest. This region includes territory belonging to nine nations. The majority of the forest is contained within Brazil, with 60% of the rainforest, followed by Peru with 13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname and French Guiana. States or departments in four nations contain \"Amazonas\" in their names. The Amazon represents over half of the planet's remaining rainforests, and comprises the largest and most biodiverse tract of tropical rainforest in the world, with an estimated 390 billion individual trees divided into 16,000 species."
    ]
  }
}
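The sample file can also be generated from Python, which makes it easier to swap in your own questions. This sketch assumes that each question is paired with the context at the same position, as the sample above suggests. The helper write_scoring_file is hypothetical.

```python
import json
from pathlib import Path
from typing import Sequence, Union


def write_scoring_file(
    path: Union[str, Path],
    questions: Sequence[str],
    contexts: Sequence[str],
) -> Path:
    """Write a question-answering payload in the shape shown above."""
    # Assumption: question i is answered from context i.
    if len(questions) != len(contexts):
        raise ValueError("questions and contexts must have the same length")
    payload = {"inputs": {"question": list(questions), "context": list(contexts)}}
    target = Path(path)
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return target
```

Calling write_scoring_file("./sample_score.json", ["Where do I live?"], ["My name is Sarah and I live in London"]) produces a one-question version of the file.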

Let's inference with sample_score.json. Change the path based on where you saved your sample JSON file.

import json

scoring_file = "./sample_score.json"
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file=scoring_file,
)
response_json = json.loads(response)
print(json.dumps(response_json, indent=2))
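Because the endpoint uses key authentication, an application that doesn't use the SDK can call it over HTTPS. The following sketch only builds the request, using the standard library. It assumes that the key is sent as a bearer token and that the optional azureml-model-deployment header selects a specific deployment; confirm both against the endpoint's consumption details. The function build_scoring_request and the example URL are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_scoring_request(
    scoring_uri: str,
    api_key: str,
    payload: Dict[str, Any],
    deployment_name: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request for a key-authenticated online endpoint."""
    headers = {
        "Content-Type": "application/json",
        # Assumption: the endpoint key is passed as a bearer token.
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Assumption: this header routes the call to one deployment.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder scoring URI
    "your endpoint key goes here",
    {"inputs": {"question": ["Where do I live?"],
                "context": ["My name is Sarah and I live in London"]}},
    deployment_name="demo",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen. With the SDK client, the scoring URI should be available from online_endpoints.get(name=...).scoring_uri and the key from online_endpoints.get_keys(name=...).primary_key.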

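Configure autoscaling

To configure autoscaling for deployments, go to the Azure portal, locate the Azure resource of type Machine learning online deployment in the resource group of the AI project, and use the Scaling menu under Settings. For more information on autoscaling, see Autoscale online endpoints in the Azure Machine Learning documentation.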

Delete the deployment endpoint

To delete deployments in the Azure AI Foundry portal, select the Delete button on the top panel of the deployment details page.
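The endpoint can also be removed from code. The following is a minimal sketch that wraps the SDK's online_endpoints.begin_delete call and waits for it to finish. The wrapper delete_endpoint is a hypothetical helper. Note that deleting an endpoint also removes the deployments it hosts.

```python
from typing import Any


def delete_endpoint(ml_client: Any, endpoint_name: str) -> None:
    """Delete an online endpoint and block until the operation completes."""
    # begin_delete returns a poller for the long-running delete operation.
    poller = ml_client.online_endpoints.begin_delete(name=endpoint_name)
    poller.wait()
```

With the client created earlier in this article, the call would be delete_endpoint(workspace_ml_client, online_endpoint_name).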

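Quota considerations

To deploy and perform inferencing with real-time endpoints, you consume virtual machine (VM) core quota that is assigned to your subscription on a per-region basis. When you sign up for Azure AI Foundry, you receive a default VM quota for several VM families available in the region. You can continue to create deployments until you reach your quota limit. When that happens, you can request a quota increase.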
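As a rough illustration of how a deployment draws on core quota, the following sketch multiplies the instance count by the vCPU count of the VM size. It lists only the size used in this article (Standard_DS3_v2, which has 4 vCPUs). The helper cores_needed is hypothetical, and the service may reserve extra capacity on top of this figure, so treat the result as a lower bound.

```python
# vCPU counts per VM size; extend this table for the sizes you use.
VCPUS_PER_INSTANCE = {"Standard_DS3_v2": 4}


def cores_needed(instance_type: str, instance_count: int) -> int:
    """Estimate the VM cores consumed by one deployment."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    try:
        vcpus = VCPUS_PER_INSTANCE[instance_type]
    except KeyError:
        raise ValueError(f"Unknown VM size: {instance_type!r}") from None
    return vcpus * instance_count


# The deployment in this article: 2 instances of Standard_DS3_v2.
print(cores_needed("Standard_DS3_v2", 2))  # 8
```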

Next steps

Learn more about what you can do in Azure AI Foundry
Get answers to frequently asked questions in the Azure AI FAQ article