Progressive rollout of MLflow models to Online Endpoints

[Article]
04/04/2023

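In this article, you learn how to progressively update and deploy MLflow models to Online Endpoints without causing service disruption. You use blue-green deployment, also known as a safe rollout strategy, to introduce a new version of a web service to production. This strategy lets you roll out the new version to a small subset of users or requests before rolling it out completely.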

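About this example

Online Endpoints have the concepts of Endpoint and Deployment. An endpoint represents the API that customers use to consume the model, while a deployment is a specific implementation of that API. This distinction lets you decouple the API from the implementation and change the underlying implementation without affecting consumers. This example uses these concepts to update the model deployed in an endpoint without introducing service disruption.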
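
To make the endpoint/deployment split concrete, here is a minimal sketch in plain Python. It does not use the Azure Machine Learning SDK: the Endpoint class, its route method, and the deployment names "blue" and "green" are hypothetical and exist only for illustration. The sketch models a stable entry point that forwards each request to one deployment according to traffic percentages.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical stand-in for an online endpoint: a stable name plus traffic rules."""

    name: str
    traffic: dict = field(default_factory=dict)  # deployment name -> percentage

    def route(self, rng: random.Random) -> str:
        """Pick the deployment that serves one request, weighted by traffic share."""
        names = [n for n, pct in self.traffic.items() if pct > 0]
        if not names:
            raise ValueError("No deployment receives traffic")
        weights = [self.traffic[n] for n in names]
        return rng.choices(names, weights=weights, k=1)[0]


# Consumers only ever address the endpoint; the split behind it can change freely.
endpoint = Endpoint(name="heart-classifier", traffic={"blue": 90, "green": 10})
rng = random.Random(0)  # seeded so the simulation is repeatable
served = [endpoint.route(rng) for _ in range(10_000)]
green_share = served.count("green") / len(served)
print(f"green served {green_share:.1%} of requests")
```

Changing the traffic dictionary is the only step needed to shift load, which is the property the rest of this article relies on.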

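The model deployed here is based on the UCI Heart Disease Data Set. The database contains 76 attributes, but this example uses a subset of 14 of them. The model tries to predict the presence of heart disease in a patient as an integer value, 0 (no presence) or 1 (presence). It was trained with an XGBoost classifier, and all the required preprocessing is packaged as a scikit-learn pipeline, which makes this model an end-to-end pipeline that goes from raw data to predictions.

The information in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy and paste files, clone the repository and change directories to sdk/using-mlflow/deploy.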

Follow along in Jupyter Notebooks

You can follow along with this sample in the following notebook. In the cloned repository, open the notebook mlflow_sdk_online_endpoints_progresive.ipynb.

Prerequisites

Before following the steps in this article, make sure you have the following prerequisites:

An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning.
Azure role-based access control (Azure RBAC) is used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the Owner or Contributor role for the Azure Machine Learning workspace, or a custom role that allows Microsoft.MachineLearningServices/workspaces/onlineEndpoints/*. For more information, see "Manage access to an Azure Machine Learning workspace".

Additionally, you need to:

Install the Azure CLI and the ml extension for the Azure CLI. For more information, see the page on installing, setting up, and using the CLI (v2).

Install the MLflow SDK package mlflow and the Azure Machine Learning plug-in for MLflow, azureml-mlflow.
```
pip install mlflow azureml-mlflow
```
If you aren't running in Azure Machine Learning compute, configure the MLflow tracking URI or the MLflow registry URI to point to the workspace you're working in. See "Configure MLflow for Azure Machine Learning" to learn how.

Connect to your workspace

First, connect to the Azure Machine Learning workspace you're going to work in.

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

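The workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, you connect to the workspace in which you perform the deployment tasks.

Import the required libraries: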

from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment, Model
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

Configure the workspace details and get a handle to the workspace:

subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

Import the required libraries:

import json
import mlflow
import requests
import pandas as pd
from mlflow.deployments import get_deploy_client

Configure the MLflow client and the deployment client:

mlflow_client = mlflow.MlflowClient()
deployment_client = get_deploy_client(mlflow.get_tracking_uri())

Register the model in the registry

Ensure your model is registered in the Azure Machine Learning registry. Deploying unregistered models isn't supported in Azure Machine Learning. You can register a new model using the MLflow SDK.

MODEL_NAME='heart-classifier'
az ml model create --name $MODEL_NAME --type "mlflow_model" --path "model"

model_name = 'heart-classifier'
model_local_path = "model"

model = ml_client.models.create_or_update(
     Model(name=model_name, path=model_local_path, type=AssetTypes.MLFLOW_MODEL)
)

model_name = 'heart-classifier'
model_local_path = "model"

registered_model = mlflow_client.create_model_version(
    name=model_name, source=f"file://{model_local_path}"
)
version = registered_model.version

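Create an online endpoint

Online endpoints are endpoints used for online (real-time) inferencing. They contain deployments that are ready to receive data from clients and can send responses back in real time.

This example uses that capability by deploying multiple versions of the same model under the same endpoint. At first, however, the new deployment receives no traffic. Once you're sure the new model works correctly, you progressively move traffic from one deployment to the other.

Endpoints require a name, which must be unique within the same region. Make sure to create one that doesn't already exist: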

ENDPOINT_SUFIX=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w ${1:-5} | head -n 1)
ENDPOINT_NAME="heart-classifier-$ENDPOINT_SUFIX"

import random
import string

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "heart-classifier-" + endpoint_suffix

print(f"Endpoint name: {endpoint_name}")

Configure the endpoint

endpoint.yml

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: heart-classifier-edp
auth_mode: key

endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="An endpoint to serve predictions of the UCI heart disease problem",
    auth_mode="key",
)

You can configure the properties of this endpoint using a configuration file. The following example configures the authentication mode of the endpoint to be "key":

endpoint_config = {
    "auth_mode": "key",
    "identity": {
        "type": "system_assigned"
    }
}

Write this configuration to a JSON file:

endpoint_config_path = "endpoint_config.json"
with open(endpoint_config_path, "w") as outfile:
    outfile.write(json.dumps(endpoint_config))

Create the endpoint:

az ml online-endpoint create -n $ENDPOINT_NAME -f endpoint.yml

ml_client.online_endpoints.begin_create_or_update(endpoint).result()

endpoint = deployment_client.create_endpoint(
    name=endpoint_name,
    config={"endpoint-config-file": endpoint_config_path},
)

Get the authentication secret for the endpoint:
```
# With auth_mode "key", the credentials contain primaryKey and secondaryKey
ENDPOINT_SECRET_KEY=$(az ml online-endpoint get-credentials -n $ENDPOINT_NAME | jq -r ".primaryKey")
```
```
# With auth_mode "key", the returned credentials expose primary_key and secondary_key
endpoint_secret_key = ml_client.online_endpoints.get_keys(
    name=endpoint_name
).primary_key
```
This functionality isn't available in the MLflow SDK yet. Go to the endpoint in Azure Machine Learning studio and retrieve the secret key from there.
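
Once you have the key, requests to the endpoint carry it in the Authorization header as a bearer token. The following is a minimal sketch that uses only the Python standard library. The scoring URI and key values are placeholders, and build_scoring_request is a hypothetical helper, not part of any SDK. The request is constructed but not sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri: str, key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that authenticates with the endpoint key."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder values: replace them with your endpoint's scoring URI and secret key.
request = build_scoring_request(
    scoring_uri="https://my-endpoint.example.com/score",
    key="<endpoint-secret-key>",
    payload={"input_data": {"columns": ["age"], "data": [[48]]}},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The payload here is deliberately truncated to one column to keep the sketch short; a real request needs every column the model expects.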

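Create a blue deployment

So far, the endpoint is empty. There are no deployments on it. Let's create the first one by deploying the same model we were working on before. We call this deployment "default", and it represents our "blue deployment".

Configure the deployment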

blue-deployment.yml

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: default
endpoint_name: heart-classifier-edp
model: azureml:heart-classifier@latest
instance_type: Standard_DS2_v2
instance_count: 1

blue_deployment_name = "default"

Configure the hardware requirements of your deployment:

blue_deployment = ManagedOnlineDeployment(
    name=blue_deployment_name,
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the argument with_package=True:

blue_deployment = ManagedOnlineDeployment(
    name=blue_deployment_name,
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_DS2_v2",
    instance_count=1,
    with_package=True,
)

blue_deployment_name = "default"

To configure the hardware requirements of your deployment, create a JSON file with the desired configuration:

deploy_config = {
    "instance_type": "Standard_DS2_v2",
    "instance_count": 1,
}

Note

The full specification of this configuration can be found on the managed online deployment schema (v2) page.

Write the configuration to a file:

deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
    outfile.write(json.dumps(deploy_config))

Create the deployment

az ml online-deployment create --endpoint-name $ENDPOINT_NAME -f blue-deployment.yml --all-traffic

If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag --with-package:

az ml online-deployment create --with-package --endpoint-name $ENDPOINT_NAME -f blue-deployment.yml --all-traffic

Tip

The create command sets the flag --all-traffic, which assigns all the traffic to the new deployment.

ml_client.online_deployments.begin_create_or_update(blue_deployment).result()

blue_deployment = deployment_client.create_deployment(
    name=blue_deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)

Assign all the traffic to the deployment

So far, the endpoint has one deployment, but none of its traffic is assigned to it. Let's assign it.
This step isn't required in the Azure CLI, since --all-traffic was used during creation.
```
endpoint.traffic = { blue_deployment_name: 100 }
```
```
traffic_config = {"traffic": {blue_deployment_name: 100}}
```
Write the configuration to a file:
```
traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))
```
Update the endpoint configuration:
This step isn't required in the Azure CLI, since --all-traffic was used during creation.
```
ml_client.begin_create_or_update(endpoint).result()
```
```
deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)
```

Create a sample input to test the deployment

sample.json

{
    "input_data": {
        "columns": [
            "age",
            "sex",
            "cp",
            "trestbps",
            "chol",
            "fbs",
            "restecg",
            "thalach",
            "exang",
            "oldpeak",
            "slope",
            "ca",
            "thal"
        ],
        "data": [
            [ 48, 0, 3, 130, 275, 0, 0, 139, 0, 0.2, 1, 0, "normal" ]
        ]
    }
}

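The following code samples 5 observations from the training dataset, removes the target column (as the model will predict it), and creates a request in the file sample.json that can be used with the model deployment.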

samples = (
    pd.read_csv("data/heart.csv")
    .sample(n=5)
    .drop(columns=["target"])
    .reset_index(drop=True)
)

with open("sample.json", "w") as f:
    f.write(
        json.dumps(
            {"input_data": json.loads(samples.to_json(orient="split", index=False))}
        )
    )
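
For reference, the split orientation used above produces a columns list and a data list of rows, wrapped in an input_data key. The following sketch builds the same request shape with only the standard library. The helper to_input_data is hypothetical, and the single record reuses the values from the sample request shown earlier.

```python
import json


def to_input_data(records: list) -> dict:
    """Convert a list of row dicts into the {"input_data": {"columns", "data"}} shape."""
    if not records:
        raise ValueError("At least one record is required")
    columns = list(records[0].keys())
    data = [[row[col] for col in columns] for row in records]
    return {"input_data": {"columns": columns, "data": data}}


# Same values as the row in the sample request shown earlier.
record = {
    "age": 48, "sex": 0, "cp": 3, "trestbps": 130, "chol": 275, "fbs": 0,
    "restecg": 0, "thalach": 139, "exang": 0, "oldpeak": 0.2, "slope": 1,
    "ca": 0, "thal": "normal",
}
payload = to_input_data([record])
print(json.dumps(payload, indent=2))
```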

The following code samples 5 observations from the training dataset, removes the target column (as the model will predict it), and creates a request.

samples = (
    pd.read_csv("data/heart.csv")
    .sample(n=5)
    .drop(columns=["target"])
    .reset_index(drop=True)
)

Test the deployment

az ml online-endpoint invoke --name $ENDPOINT_NAME --request-file sample.json

ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    request_file="sample.json",
)

deployment_client.predict(
    endpoint=endpoint_name, 
    df=samples
)

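Create a green deployment under the endpoint

Imagine that the development team has created a new version of the model and it's ready to go to production. You can first try the model out, and once you're confident in it, update the endpoint to route traffic to it.

Register a new model version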

MODEL_NAME='heart-classifier'
az ml model create --name $MODEL_NAME --type "mlflow_model" --path "model"

Get the version number of the new model:

VERSION=$(az ml model show -n heart-classifier --label latest | jq -r ".version")

model_name = 'heart-classifier'
model_local_path = "model"

model = ml_client.models.create_or_update(
     Model(name=model_name, path=model_local_path, type=AssetTypes.MLFLOW_MODEL)
)
version = model.version

model_name = 'heart-classifier'
model_local_path = "model"

registered_model = mlflow_client.create_model_version(
    name=model_name, source=f"file://{model_local_path}"
)
version = registered_model.version

Configure a new deployment

green-deployment.yml

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: xgboost-model
endpoint_name: heart-classifier-edp
model: azureml:heart-classifier@latest
instance_type: Standard_DS2_v2
instance_count: 1

Name the deployment as follows:

GREEN_DEPLOYMENT_NAME="xgboost-model-$VERSION"

green_deployment_name = f"xgboost-model-{version}"

Configure the hardware requirements of your deployment:

green_deployment = ManagedOnlineDeployment(
    name=green_deployment_name,
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the argument with_package=True:

green_deployment = ManagedOnlineDeployment(
    name=green_deployment_name,
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_DS2_v2",
    instance_count=1,
    with_package=True,
)

green_deployment_name = f"xgboost-model-{version}"

To configure the hardware requirements of your deployment, create a JSON file with the desired configuration:

deploy_config = {
    "instance_type": "Standard_DS2_v2",
    "instance_count": 1,
}

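Tip

Here we use the same hardware configuration that was passed as the deploy-config-file for the blue deployment. However, there's no requirement to use the same configuration. You can configure different hardware for different models depending on their requirements.

Write the configuration to a file: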

deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
    outfile.write(json.dumps(deploy_config))

Create a new deployment

az ml online-deployment create -n $GREEN_DEPLOYMENT_NAME --endpoint-name $ENDPOINT_NAME -f green-deployment.yml

If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag --with-package:

az ml online-deployment create --with-package -n $GREEN_DEPLOYMENT_NAME --endpoint-name $ENDPOINT_NAME -f green-deployment.yml

ml_client.online_deployments.begin_create_or_update(green_deployment).result()

new_deployment = deployment_client.create_deployment(
    name=green_deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)

Test the deployment without changing traffic

az ml online-endpoint invoke --name $ENDPOINT_NAME --deployment-name $GREEN_DEPLOYMENT_NAME --request-file sample.json

ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    deployment_name=green_deployment_name,
    request_file="sample.json",
)

deployment_client.predict(
    endpoint=endpoint_name, 
    deployment_name=green_deployment_name, 
    df=samples
)

Tip

Notice how the name of the deployment to invoke is now indicated explicitly.
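
One way to build confidence before shifting traffic is to send the same samples to both deployments and compare the outputs. The sketch below assumes you have already collected the two prediction lists, for example from the calls above with and without a deployment name. The helper agreement_rate is hypothetical, and the prediction values are placeholders rather than real model output.

```python
def agreement_rate(blue_predictions: list, green_predictions: list) -> float:
    """Return the fraction of samples on which both deployments predict the same class."""
    if len(blue_predictions) != len(green_predictions):
        raise ValueError("Both deployments must be scored on the same samples")
    if not blue_predictions:
        raise ValueError("No predictions to compare")
    matches = sum(b == g for b, g in zip(blue_predictions, green_predictions))
    return matches / len(blue_predictions)


# Placeholder predictions for five samples; substitute the responses you received.
blue_predictions = [0, 1, 0, 0, 1]
green_predictions = [0, 1, 1, 0, 1]
rate = agreement_rate(blue_predictions, green_predictions)
print(f"Deployments agree on {rate:.0%} of samples")
```

Disagreement isn't necessarily a defect, since the new model is expected to differ; the rate is only a quick signal for where to look more closely.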

Progressively update the traffic

Once you're confident in the new deployment, update the traffic to route some of it to the new deployment. Traffic is configured at the endpoint level.
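
A progressive rollout is a sequence of such traffic updates. As a rough sketch, the helper below generates the traffic dictionary for each stage. The function rollout_schedule, the chosen percentages, and the deployment name "xgboost-model-2" are assumptions for illustration; the article itself only uses a 90/10 split followed by a full switch.

```python
def rollout_schedule(blue: str, green: str, green_percentages: list) -> list:
    """Build one traffic dict per stage, shifting load from blue to green."""
    steps = []
    previous = 0
    for pct in green_percentages:
        if not 0 <= pct <= 100:
            raise ValueError(f"Percentage out of range: {pct}")
        if pct < previous:
            raise ValueError("Green traffic should not decrease during a rollout")
        # The two shares always add up to 100.
        steps.append({blue: 100 - pct, green: pct})
        previous = pct
    return steps


# "xgboost-model-2" is a hypothetical green deployment name.
schedule = rollout_schedule("default", "xgboost-model-2", [10, 50, 100])
for traffic in schedule:
    print(traffic)
```

Each dictionary can then be assigned as the endpoint's traffic before the update call, with a pause between stages to watch metrics.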

Configure the traffic:

This step isn't required in the Azure CLI.

endpoint.traffic = {blue_deployment_name: 90, green_deployment_name: 10}

traffic_config = {"traffic": {blue_deployment_name: 90, green_deployment_name: 10}}

Write the configuration to a file:

traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))

Update the endpoint

az ml online-endpoint update --name $ENDPOINT_NAME --traffic "default=90 $GREEN_DEPLOYMENT_NAME=10"

ml_client.begin_create_or_update(endpoint).result()

deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)

If you decide to switch all the traffic to the new deployment, update all the traffic:

This step isn't required in the Azure CLI.

endpoint.traffic = {blue_deployment_name: 0, green_deployment_name: 100}

traffic_config = {"traffic": {blue_deployment_name: 0, green_deployment_name: 100}}

Write the configuration to a file:

traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))

Update the endpoint

az ml online-endpoint update --name $ENDPOINT_NAME --traffic "default=0 $GREEN_DEPLOYMENT_NAME=100"

ml_client.begin_create_or_update(endpoint).result()

deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)

Since the old deployment no longer receives any traffic, you can safely delete it:
```
az ml online-deployment delete --endpoint-name $ENDPOINT_NAME --name default
```
```
ml_client.online_deployments.begin_delete(
    name=blue_deployment_name, 
    endpoint_name=endpoint_name
)
```
```
deployment_client.delete_deployment(
    blue_deployment_name, 
    endpoint=endpoint_name
)
```
Tip

Notice that at this point, the former "blue deployment" has been deleted and the new "green deployment" has taken its place.
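
As a guard for the deletion step above, it can be worth confirming that a deployment really receives no traffic before removing it. The sketch below works on a plain traffic dictionary such as the one assigned to the endpoint earlier. The helper deployments_safe_to_delete and the name "xgboost-model-2" are hypothetical.

```python
def deployments_safe_to_delete(traffic: dict) -> list:
    """Return the deployments that currently receive no traffic, sorted by name."""
    return sorted(name for name, pct in traffic.items() if pct == 0)


# Traffic split after the full switch; "xgboost-model-2" is a hypothetical name.
traffic = {"default": 0, "xgboost-model-2": 100}
print(deployments_safe_to_delete(traffic))
```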

Clean up resources

az ml online-endpoint delete --name $ENDPOINT_NAME --yes

ml_client.online_endpoints.begin_delete(name=endpoint_name)

deployment_client.delete_endpoint(endpoint_name)

Important

Deleting the endpoint also deletes all the deployments under it.
