Tutorial 4: Enable online materialization and run online inference

2024-11-21

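With the Azure Machine Learning managed feature store, you can discover, create, and operationalize features. Features serve as the connective tissue of the machine learning lifecycle, starting from the prototyping phase, where you experiment with various features. The lifecycle continues to the operationalization phase, where you deploy your models and the inference steps look up the feature data. For more information about feature stores, see the feature store concepts resource.

Part 1 of this tutorial series showed how to create a feature set specification with custom transformations, and how to use that feature set to generate training data. Part 2 showed how to enable materialization and perform a backfill. Part 2 also showed how to experiment with features as a way to improve model performance. Part 3 showed how a feature store increases agility in the experimentation and training flows, and how to run batch inference.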

In this tutorial, you:

Set up an Azure Cache for Redis
Attach the cache to a feature store as the online materialization store, and grant the necessary permissions
Materialize a feature set to the online store
Test an online deployment with mock data

Prerequisites

Note

This tutorial uses an Azure Machine Learning notebook with serverless Spark compute.

Complete parts 1 through 3 of this tutorial series. This tutorial reuses the feature store and the other resources created in those earlier tutorials.

Set up

This tutorial uses the Python feature store core SDK (azureml-featurestore). The Python SDK is used for create, read, update, and delete (CRUD) operations on feature stores, feature sets, and feature store entities.

You don't need to install these resources explicitly for this tutorial, because the online.yml file used in the setup steps shown here includes them.

Configure the Azure Machine Learning Spark notebook.

You can create a new notebook and run the instructions in this tutorial step by step. You can also open and run the existing notebook featurestore_sample/notebooks/sdk_only/4.Enable-online-store-run-inference.ipynb. Keep this tutorial open, and refer to it for documentation links and more explanation.
1. In the [Compute] dropdown list in the top navigation, select [Serverless Spark Compute].
2. Configure the session:
  1. Download the azureml-examples/sdk/python/featurestore-sample/project/env/online.yml file to your local machine
  2. In [Configure session] in the top navigation, select [Python packages]
  3. Select [Upload conda file]
  4. Upload the online.yml file from your local machine, with the same steps as described for uploading the conda.yml file in the first tutorial
  5. Optionally, increase the session time-out (idle time) to avoid frequent reruns of the prerequisites
This code cell starts the Spark session. Installing all the dependencies and starting the Spark session takes about 10 minutes.
```
# Run this cell to start the Spark session (any code block will start the session). This can take approximately 10 minutes.
print("start spark session")
```

Set up the root directory for the samples

import os

# Please update the dir to ./Users/<your_user_alias> (or any custom directory you uploaded the samples to).
# You can find the name from the directory structure in the left navigation panel.
root_dir = "./Users/<your_user_alias>/featurestore_sample"

if os.path.isdir(root_dir):
    print("The folder exists.")
else:
    print("The folder does not exist. Please create or fix the path")

Initialize the MLClient for the project workspace where the tutorial notebook runs. The MLClient is used for create, read, update, and delete (CRUD) operations.

import os
from azure.ai.ml import MLClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential

project_ws_sub_id = os.environ["AZUREML_ARM_SUBSCRIPTION"]
project_ws_rg = os.environ["AZUREML_ARM_RESOURCEGROUP"]
project_ws_name = os.environ["AZUREML_ARM_WORKSPACE_NAME"]

# Connect to the project workspace
ws_client = MLClient(
    AzureMLOnBehalfOfCredential(), project_ws_sub_id, project_ws_rg, project_ws_name
)
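
The cell above reads three `AZUREML_ARM_*` environment variables and raises a bare `KeyError` if one of them is absent. As a minimal sketch, assuming you want a clearer failure message when the notebook runs somewhere those variables aren't set, a small helper could check them up front. The helper name `read_workspace_env` is hypothetical and isn't part of any SDK.

```python
import os

# The three variable names come from the cell above; the helper itself is hypothetical.
REQUIRED_VARS = (
    "AZUREML_ARM_SUBSCRIPTION",
    "AZUREML_ARM_RESOURCEGROUP",
    "AZUREML_ARM_WORKSPACE_NAME",
)


def read_workspace_env(env=None):
    """Return (subscription_id, resource_group, workspace_name) or raise a clear error."""
    env = os.environ if env is None else env
    # Treat empty strings the same as missing values.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)
```

The returned tuple could then be unpacked into `project_ws_sub_id`, `project_ws_rg`, and `project_ws_name` before the `MLClient` is constructed.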

Initialize the MLClient for the feature store workspace, for create, read, update, and delete (CRUD) operations on that workspace.
```
from azure.ai.ml import MLClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential

# Feature store
featurestore_name = (
    "<FEATURESTORE_NAME>"  # use the same name from part #1 of the tutorial
)
featurestore_subscription_id = os.environ["AZUREML_ARM_SUBSCRIPTION"]
featurestore_resource_group_name = os.environ["AZUREML_ARM_RESOURCEGROUP"]

# Feature store MLClient
fs_client = MLClient(
    AzureMLOnBehalfOfCredential(),
    featurestore_subscription_id,
    featurestore_resource_group_name,
    featurestore_name,
)
```
Note

A feature store workspace supports feature reuse across projects. The project workspace currently in use draws on the features of a specific feature store to train models and run inference. Many project workspaces can share and reuse the same feature store workspace.

As mentioned earlier, this tutorial uses the Python feature store core SDK (azureml-featurestore). This initialized SDK client is used for create, read, update, and delete (CRUD) operations on feature stores, feature sets, and feature store entities.

from azureml.featurestore import FeatureStoreClient
from azure.ai.ml.identity import AzureMLOnBehalfOfCredential

featurestore = FeatureStoreClient(
    credential=AzureMLOnBehalfOfCredential(),
    subscription_id=featurestore_subscription_id,
    resource_group_name=featurestore_resource_group_name,
    name=featurestore_name,
)

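Prepare Azure Cache for Redis

This tutorial uses Azure Cache for Redis as the online materialization store. You can create a new Redis instance, or reuse an existing instance.

Set the values for the Azure Cache for Redis resource to use as the online materialization store. In this code cell, define the name of the Azure Cache for Redis resource to create or reuse. You can also override the other default settings.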
```
ws_location = ws_client.workspaces.get(ws_client.workspace_name).location

redis_subscription_id = os.environ["AZUREML_ARM_SUBSCRIPTION"]
redis_resource_group_name = os.environ["AZUREML_ARM_RESOURCEGROUP"]
redis_name = "<REDIS_NAME>"
redis_location = ws_location
```

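You can create a new Redis instance. Select the appropriate Redis cache tier (Basic, Standard, Premium, or Enterprise), and then choose an SKU family that is available for that tier. For more information about tiers and cache performance, and about SKU tiers and Azure cache families, see the Azure Cache for Redis documentation.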
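
To make the tier and family pairing concrete, here is a minimal sketch of a lookup that rejects unknown tiers before any resource is created. The Premium/P and Standard/C pairs appear in this tutorial's own code comments; the Basic/C pair is an added assumption, and Enterprise is deliberately left out. The function name `sku_settings` is hypothetical.

```python
# Tier-to-family pairs: Premium/P and Standard/C come from this tutorial's code comments.
# Basic/C is an added assumption. Enterprise is intentionally not covered by this sketch.
TIER_TO_FAMILY = {"Basic": "C", "Standard": "C", "Premium": "P"}


def sku_settings(tier, capacity):
    """Return name, family, and capacity for a SKU after validating the tier."""
    if tier not in TIER_TO_FAMILY:
        raise ValueError(f"Unsupported tier in this sketch: {tier!r}")
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    return {"name": tier, "family": TIER_TO_FAMILY[tier], "capacity": capacity}


print(sku_settings("Premium", 2))
```

The resulting dictionary mirrors the `name`, `family`, and `capacity` arguments that the `Sku(...)` call in the creation cell takes.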

Run this code cell to create an Azure Cache for Redis with the Premium tier, SKU family P, and cache capacity 2. Provisioning the Redis instance might take 5 to 10 minutes.

from azure.mgmt.redis import RedisManagementClient
from azure.mgmt.redis.models import RedisCreateParameters, Sku, SkuFamily, SkuName

management_client = RedisManagementClient(
    AzureMLOnBehalfOfCredential(), redis_subscription_id
)

# It usually takes about 5 - 10 min to finish the provision of the Redis instance.
# If the following begin_create() call still hangs for longer than that,
# please check the status of the Redis instance on the Azure portal and cancel the cell if the provision has completed.
# This sample uses a PREMIUM tier Redis SKU from family P, which may cost more than a STANDARD tier SKU from family C.
# Please choose the SKU tier and family according to your performance and pricing requirements.

redis_arm_id = (
    management_client.redis.begin_create(
        resource_group_name=redis_resource_group_name,
        name=redis_name,
        parameters=RedisCreateParameters(
            location=redis_location,
            sku=Sku(name=SkuName.PREMIUM, family=SkuFamily.P, capacity=2),
        ),
    )
    .result()
    .id
)

print(redis_arm_id)

Optionally, use this code cell instead to reuse an existing Redis instance that has the name defined earlier.

redis_arm_id = "/subscriptions/{sub_id}/resourceGroups/{rg}/providers/Microsoft.Cache/Redis/{name}".format(
    sub_id=redis_subscription_id,
    rg=redis_resource_group_name,
    name=redis_name,
)
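
Because the reuse path builds the resource ID by hand, a typo or an unreplaced placeholder only surfaces later, when the store is attached. A minimal sketch, assuming you want to build and sanity-check the ID in one place: the template string is the one from the cell above, while the function names and the strict, case-sensitive pattern are hypothetical choices for illustration.

```python
import re

# Same ID layout as the cell above, split over two lines for readability.
REDIS_ID_TEMPLATE = (
    "/subscriptions/{sub_id}/resourceGroups/{rg}"
    "/providers/Microsoft.Cache/Redis/{name}"
)
REDIS_ID_PATTERN = re.compile(
    r"^/subscriptions/(?P<sub_id>[^/<>]+)/resourceGroups/(?P<rg>[^/<>]+)"
    r"/providers/Microsoft\.Cache/Redis/(?P<name>[^/<>]+)$"
)


def build_redis_arm_id(sub_id, rg, name):
    """Format the Redis resource ID from its three parts."""
    return REDIS_ID_TEMPLATE.format(sub_id=sub_id, rg=rg, name=name)


def parse_redis_arm_id(arm_id):
    """Split a Redis resource ID into its parts, rejecting malformed input."""
    match = REDIS_ID_PATTERN.match(arm_id)
    if match is None:
        raise ValueError(f"Not a well-formed Redis resource ID: {arm_id!r}")
    return match.groupdict()
```

The pattern excludes angle brackets, so an ID built while `redis_name` is still `"<REDIS_NAME>"` fails the parse instead of reaching the attach step.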

Attach the online materialization store to the feature store

The feature store needs the Azure Cache for Redis as an attached resource, to use as the online materialization store. This code cell handles that step.

from azure.ai.ml.entities import (
    ManagedIdentityConfiguration,
    FeatureStore,
    MaterializationStore,
)

online_store = MaterializationStore(type="redis", target=redis_arm_id)

ml_client = MLClient(
    AzureMLOnBehalfOfCredential(),
    subscription_id=featurestore_subscription_id,
    resource_group_name=featurestore_resource_group_name,
)

fs = FeatureStore(
    name=featurestore_name,
    online_store=online_store,
)

fs_poller = ml_client.feature_stores.begin_create(fs)
print(fs_poller.result())

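Note

During a feature store update, setting grant_materialization_permissions=True alone doesn't grant the required RBAC permissions to the user-assigned managed identity (UAI). The role assignments to the UAI happen only when one of the following is updated:

Materialization identity
Online store target
Offline store target

For an example of how to do this with the SDK, see the resource Tutorial: Different approaches for provisioning a managed feature store.

Materialize the `accounts` feature set data to the online store

Enable materialization on the `accounts` feature set

Earlier in this tutorial series, you didn't materialize the accounts feature set, because it had precomputed features and only batch inference scenarios used it. This code cell enables online materialization, so that the features become available in the online store with low-latency access. For consistency, it also enables offline materialization. Enabling offline materialization is optional.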

from azure.ai.ml.entities import (
    MaterializationSettings,
    MaterializationComputeResource,
)

# Turn on both offline and online materialization on the "accounts" featureset.

accounts_fset_config = fs_client.feature_sets.get(name="accounts", version="1")

accounts_fset_config.materialization_settings = MaterializationSettings(
    offline_enabled=True,
    online_enabled=True,
    resource=MaterializationComputeResource(instance_type="standard_e8s_v3"),
    spark_configuration={
        "spark.driver.cores": 4,
        "spark.driver.memory": "36g",
        "spark.executor.cores": 4,
        "spark.executor.memory": "36g",
        "spark.executor.instances": 2,
    },
    schedule=None,
)

fs_poller = fs_client.feature_sets.begin_create_or_update(accounts_fset_config)
print(fs_poller.result())

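Backfill the `accounts` feature set

The begin_backfill function backfills data to all the materialization stores that are enabled for this feature set. Here, both offline and online materialization are enabled, so this code cell backfills data to both the online and offline materialization stores.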

from datetime import datetime, timedelta

# Trigger backfill on the "accounts" feature set.
# Backfill from 01/01/2020 to all the way to 3 hours ago.

st = datetime(2020, 1, 1, 0, 0, 0, 0)
et = datetime.now() - timedelta(hours=3)

poller = fs_client.feature_sets.begin_backfill(
    name="accounts",
    version="1",
    feature_window_start_time=st,
    feature_window_end_time=et,
    data_status=["None"],
)
print(poller.result().job_ids)

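Tip

The granularity of feature_window_start_time and feature_window_end_time is limited to seconds. Any milliseconds value provided in the datetime object is ignored.
A materialization job is submitted only if the feature window contains data that matches the data_status defined when the backfill job was submitted.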
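
Since sub-second values are ignored, it can be less surprising to drop them yourself, so that the window you log is the window that is actually used. A minimal sketch, assuming truncation (rather than rounding) matches "ignored"; the helper name `to_whole_seconds` is hypothetical.

```python
from datetime import datetime, timedelta


def to_whole_seconds(value):
    """Drop the sub-second part so the value matches the documented granularity."""
    return value.replace(microsecond=0)


# Same window as the backfill cell, with the end time truncated to whole seconds.
st = datetime(2020, 1, 1, 0, 0, 0, 0)
et = to_whole_seconds(datetime.now() - timedelta(hours=3))
print(st.isoformat(), et.isoformat())
```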

This code cell tracks completion of the backfill job. With the Azure Cache for Redis Premium tier provisioned earlier, this step might need approximately 10 minutes to complete.

# Get the job URL, and stream the job logs.
# With PREMIUM Redis SKU, SKU family "P", and cache capacity 2,
# it takes approximately 10 minutes to complete.
fs_client.jobs.stream(poller.result().job_ids[0])

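Materialize the `transactions` feature set data to the online store

Earlier in this tutorial series, you materialized the transactions feature set data to the offline materialization store.

This code cell enables online materialization for the transactions feature set.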

# Enable materialization to online store for the "transactions" feature set.

transactions_fset_config = fs_client.feature_sets.get(name="transactions", version="1")
transactions_fset_config.materialization_settings.online_enabled = True

fs_poller = fs_client.feature_sets.begin_create_or_update(transactions_fset_config)
print(fs_poller.result())

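This code cell backfills the data to both the online and offline materialization stores, to make sure that both stores have the latest data. The recurring materialization job that you set up in tutorial 3 of this series now materializes data to both the online and offline materialization stores.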

# Trigger backfill on the "transactions" feature set to fill in the online/offline store.
# Backfill from 01/01/2020 to all the way to 3 hours ago.

from datetime import datetime, timedelta
from azure.ai.ml.entities import DataAvailabilityStatus

st = datetime(2020, 1, 1, 0, 0, 0, 0)
et = datetime.now() - timedelta(hours=3)


poller = fs_client.feature_sets.begin_backfill(
    name="transactions",
    version="1",
    feature_window_start_time=st,
    feature_window_end_time=et,
    data_status=[DataAvailabilityStatus.NONE],
)
print(poller.result().job_ids)

This code cell tracks completion of the backfill job. With the Azure Cache for Redis Premium tier provisioned earlier, this step might need approximately 5 minutes to complete.

# Get the job URL, and stream the job logs.
# With PREMIUM Redis SKU, SKU family "P", and cache capacity 2,
# it takes approximately 5 minutes to complete.
fs_client.jobs.stream(poller.result().job_ids[0])

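Further explore offline feature materialization

You can explore the feature materialization status of a feature set from the materialization jobs UI.

1. Open the Azure Machine Learning global landing page
2. In the left pane, select [Feature stores]
3. From the list of accessible feature stores, select the feature store for which you performed the backfill
4. Select the [Materialization jobs] tab

The data materialization status can be:
- Complete (green)
- Incomplete (red)
- Pending (blue)
- None (gray)
A "data interval" represents a contiguous portion of data with the same data materialization status. For example, the earlier snapshot has 16 "data intervals" in the offline materialization store.
Your data can have a maximum of 2,000 "data intervals". If your data contains more than 2,000 "data intervals", create a new feature set version.
You can provide a list of more than one data status (for example, ["None", "Incomplete"]) in a single backfill job.
During backfill, a new materialization job is submitted for each "data interval" that falls within the defined feature window.
A new job isn't submitted for a "data interval" that hasn't been backfilled yet if a materialization job is already pending or running for that "data interval".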
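
The rules above can be illustrated with a toy model. This is a minimal sketch that only mimics the description in this section (contiguous same-status data forms one interval, and a backfill selects intervals by status and feature window); it isn't the service's implementation, and every name in it is hypothetical.

```python
from datetime import datetime


def merge_into_intervals(windows):
    """Merge (start, end, status) windows into contiguous same-status data intervals."""
    intervals = []
    for start, end, status in sorted(windows):
        touches_previous = intervals and intervals[-1][1] == start
        if touches_previous and intervals[-1][2] == status:
            intervals[-1] = (intervals[-1][0], end, status)  # extend the last interval
        else:
            intervals.append((start, end, status))
    return intervals


def select_for_backfill(intervals, window_start, window_end, data_status):
    """Keep intervals whose status is requested and that overlap the feature window."""
    return [
        (start, end, status)
        for start, end, status in intervals
        if status in data_status and start < window_end and end > window_start
    ]


def day(n):
    return datetime(2024, 1, n)


windows = [
    (day(1), day(2), "Complete"),
    (day(2), day(3), "Complete"),
    (day(3), day(4), "None"),
    (day(4), day(5), "None"),
    (day(5), day(6), "Incomplete"),
]
intervals = merge_into_intervals(windows)
jobs = select_for_backfill(intervals, day(1), day(6), ["None", "Incomplete"])
print(len(intervals), "intervals,", len(jobs), "would get a job")
```

In this toy data, five daily windows collapse into three data intervals, and a backfill that asks for `["None", "Incomplete"]` would target two of them.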

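Update the online materialization store

To update the online materialization store at the feature store level, online materialization must be disabled on all feature sets in the feature store.
If online materialization is disabled on a feature set, the materialization status of the data already materialized in the online materialization store is reset. The already-materialized data then becomes unusable. You must resubmit the materialization jobs after you enable online materialization again.
If only offline materialization was initially enabled for a feature set, and you enable online materialization later:
- The default data materialization status of the data in the online store is None.
- When the first online materialization job is submitted, the data already materialized in the offline store, if available, is used to calculate the online features.
- If the "data interval" for online materialization partially overlaps the "data interval" of already-materialized data in the offline store, separate materialization jobs are submitted for the overlapping and nonoverlapping parts of the "data interval".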
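
The last point is easier to see with a worked example. A minimal sketch, assuming half-open `(start, end)` intervals and a single offline interval; it only illustrates how one requested interval divides into overlapping and nonoverlapping parts, not how the service schedules jobs. The function name `split_by_overlap` is hypothetical.

```python
from datetime import datetime


def split_by_overlap(online, offline):
    """Split the half-open interval `online` into parts inside and outside `offline`."""
    on_start, on_end = online
    off_start, off_end = offline
    lo, hi = max(on_start, off_start), min(on_end, off_end)
    if lo >= hi:  # the two intervals don't overlap at all
        return {"overlapping": [], "non_overlapping": [online]}
    non_overlapping = []
    if on_start < lo:
        non_overlapping.append((on_start, lo))
    if hi < on_end:
        non_overlapping.append((hi, on_end))
    return {"overlapping": [(lo, hi)], "non_overlapping": non_overlapping}


online = (datetime(2024, 1, 1), datetime(2024, 1, 10))
offline = (datetime(2024, 1, 5), datetime(2024, 1, 20))
parts = split_by_overlap(online, offline)
print(parts)
```

Here the online request for January 1 to 10 divides into one overlapping part (January 5 to 10) and one nonoverlapping part (January 1 to 5), which corresponds to two separate jobs in the description above.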

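Test locally

Now, use your development environment to look up features from the online materialization store. The tutorial notebook, attached to serverless Spark compute, serves as the development environment.

This code cell parses the list of features from the existing feature retrieval specification.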

# Parse the list of features from the existing feature retrieval specification.
feature_retrieval_spec_folder = root_dir + "/project/fraud_model/feature_retrieval_spec"

features = featurestore.resolve_feature_retrieval_spec(feature_retrieval_spec_folder)

features

This code initializes the online lookup client, which retrieves feature values from the online materialization store.

from azureml.featurestore import init_online_lookup
import time

# Initialize the online store client.
init_online_lookup(features, AzureMLOnBehalfOfCredential())

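Prepare some observation data for testing, and use that data to look up features from the online materialization store. During the online lookup, the keys (accountID) defined in the observation sample data might not exist in Redis (because of TTL). In that case:

1. Open the Azure portal
2. Navigate to the Redis instance
3. Open the console of the Redis instance, and check for existing keys with the KEYS * command
4. Replace the accountID values in the sample observation data with the existing keys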

import pyarrow
from azureml.featurestore import get_online_features

# Prepare test observation data
obs = pyarrow.Table.from_pydict(
    {"accountID": ["A985156952816816", "A1055521248929430", "A914800935560176"]}
)

# Online lookup:
# It may happen that the keys defined in the observation sample data above do not exist in Redis (due to TTL).
# If this happens, go to Azure portal and navigate to the Redis instance, open its console and check for existing keys using command "KEYS *"
# and replace the sample observation data with the existing keys.
df = get_online_features(features, obs)
df

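These steps looked up features from the online store. In the next step, you test online features with an Azure Machine Learning managed online endpoint.

Test online features from an Azure Machine Learning managed online endpoint

A managed online endpoint deploys and scores models for online/real-time inference. You can also use any other available inference technology, such as Kubernetes.

This step involves these actions:

1. Create an Azure Machine Learning managed online endpoint.
2. Grant the required role-based access control (RBAC) permissions.
3. Deploy the model that you trained in tutorial 3 of this tutorial series. The scoring script used in this step has the code to look up online features.
4. Score the model with sample data.

Create an Azure Machine Learning managed online endpoint

For more information about managed online endpoints, see the managed online endpoint documentation. With the managed feature store API, you can also look up online features from other inference platforms.

This code cell defines the fraud-model managed online endpoint.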

from azure.ai.ml.entities import (
    ManagedOnlineDeployment,
    ManagedOnlineEndpoint,
    Model,
    CodeConfiguration,
    Environment,
)


endpoint_name = "<ENDPOINT_NAME>"

endpoint = ManagedOnlineEndpoint(name=endpoint_name, auth_mode="key")

This code cell creates the managed online endpoint defined in the previous code cell.

ws_client.online_endpoints.begin_create_or_update(endpoint).result()

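Grant the required RBAC permissions

Here, you grant the managed online endpoint the RBAC permissions it needs on the Redis instance and on the feature store. The scoring code in the model deployment needs these RBAC permissions to look up features from the online store with the managed feature store API.

Get the managed identity of the managed online endpoint

This code cell retrieves the managed identity of the managed online endpoint.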

# Get managed identity of the managed online endpoint.
endpoint = ws_client.online_endpoints.get(endpoint_name)

model_endpoint_msi_principal_id = endpoint.identity.principal_id
model_endpoint_msi_principal_id

Grant the `Contributor` role to the online endpoint managed identity on the Azure Cache for Redis

This code cell grants the Contributor role to the online endpoint managed identity on the Redis instance. This RBAC permission is needed to materialize data into the Redis online store.

from azure.core.exceptions import ResourceExistsError
from azure.mgmt.msi import ManagedServiceIdentityClient
from azure.mgmt.msi.models import Identity
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters
from uuid import uuid4

auth_client = AuthorizationManagementClient(
    AzureMLOnBehalfOfCredential(), redis_subscription_id
)

scope = f"/subscriptions/{redis_subscription_id}/resourceGroups/{redis_resource_group_name}/providers/Microsoft.Cache/Redis/{redis_name}"


# The role definition ID for the "contributor" role on the redis cache
# You can find other built-in role definition IDs in the Azure documentation
role_definition_id = f"/subscriptions/{redis_subscription_id}/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"

# Generate a random UUID for the role assignment name
role_assignment_name = str(uuid4())

# Set up the role assignment creation parameters
role_assignment_params = RoleAssignmentCreateParameters(
    principal_id=model_endpoint_msi_principal_id,
    role_definition_id=role_definition_id,
    principal_type="ServicePrincipal",
)

# Create the role assignment
try:
    # Create the role assignment
    result = auth_client.role_assignments.create(
        scope, role_assignment_name, role_assignment_params
    )
    print(
        f"Redis RBAC granted to managed identity '{model_endpoint_msi_principal_id}'."
    )
except ResourceExistsError:
    print(
        f"Redis RBAC already exists for managed identity '{model_endpoint_msi_principal_id}'."
    )

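Grant the `AzureML Data Scientist` role to the online endpoint managed identity on the feature store

This code cell grants the AzureML Data Scientist role to the online endpoint managed identity on the feature store. This RBAC permission is required to deploy the model to the online endpoint successfully.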

auth_client = AuthorizationManagementClient(
    AzureMLOnBehalfOfCredential(), featurestore_subscription_id
)

scope = f"/subscriptions/{featurestore_subscription_id}/resourceGroups/{featurestore_resource_group_name}/providers/Microsoft.MachineLearningServices/workspaces/{featurestore_name}"

# The role definition ID for the "AzureML Data Scientist" role.
# You can find other built-in role definition IDs in the Azure documentation.
role_definition_id = f"/subscriptions/{featurestore_subscription_id}/providers/Microsoft.Authorization/roleDefinitions/f6c7c914-8db3-469d-8ca1-694a8f32e121"

# Generate a random UUID for the role assignment name.
role_assignment_name = str(uuid4())

# Set up the role assignment creation parameters.
role_assignment_params = RoleAssignmentCreateParameters(
    principal_id=model_endpoint_msi_principal_id,
    role_definition_id=role_definition_id,
    principal_type="ServicePrincipal",
)

# Create the role assignment
try:
    # Create the role assignment
    result = auth_client.role_assignments.create(
        scope, role_assignment_name, role_assignment_params
    )
    print(
        f"Feature store RBAC granted to managed identity '{model_endpoint_msi_principal_id}'."
    )
except ResourceExistsError:
    print(
        f"Feature store RBAC already exists for managed identity '{model_endpoint_msi_principal_id}'."
    )
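
The two role assignment cells differ only in the scope and the role GUID. As a minimal sketch, assuming you prefer to compute those strings in one place before calling the authorization client, the following helpers reproduce the IDs used above. The GUIDs are copied from the two cells; the function and dictionary names are hypothetical.

```python
# Built-in role GUIDs copied from the two role assignment cells above.
BUILT_IN_ROLES = {
    "Contributor": "b24988ac-6180-42a0-ab88-20f7382dd24c",
    "AzureML Data Scientist": "f6c7c914-8db3-469d-8ca1-694a8f32e121",
}


def role_definition_id(subscription_id, role_name):
    """Build the full role definition ID for a built-in role in a subscription."""
    guid = BUILT_IN_ROLES[role_name]  # raises KeyError for roles not listed above
    return (
        f"/subscriptions/{subscription_id}"
        f"/providers/Microsoft.Authorization/roleDefinitions/{guid}"
    )


def planned_assignments(redis_scope, redis_sub, featurestore_scope, featurestore_sub):
    """List the (scope, role definition ID) pairs that this section assigns."""
    return [
        (redis_scope, role_definition_id(redis_sub, "Contributor")),
        (featurestore_scope, role_definition_id(featurestore_sub, "AzureML Data Scientist")),
    ]
```

Each pair could then feed one `RoleAssignmentCreateParameters` object and one `role_assignments.create` call, as in the cells above.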

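Deploy the model to the online endpoint

Review the scoring script project/fraud_model/online_inference/src/scoring.py. The scoring script:

1. Loads the feature metadata from the feature retrieval specification that was packaged with the model during model training. Tutorial 3 of this tutorial series covered this task. The specification has features from both the transactions and accounts feature sets.
2. Looks up the online features with the index keys from the request when it receives an input inference request. In this case, the index column for both feature sets is accountID.
3. Passes the features to the model to perform the inference, and returns the response. The response is a boolean value that represents the variable is_fraud.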

Next, run this code cell to create the managed online deployment definition for the model deployment.

deployment = ManagedOnlineDeployment(
    name="green",
    endpoint_name=endpoint_name,
    model="azureml:fraud_model:1",
    code_configuration=CodeConfiguration(
        code=root_dir + "/project/fraud_model/online_inference/src/",
        scoring_script="scoring.py",
    ),
    environment=Environment(
        conda_file=root_dir + "/project/fraud_model/online_inference/conda.yml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

Deploy the model to the online endpoint with this code cell. The deployment might need four to five minutes.

# Model deployment to online endpoint may take 4-5 minutes.
ws_client.online_deployments.begin_create_or_update(deployment).result()

Test the online deployment with mock data

Run this code cell to test the online deployment with the mock data. The cell should show 0 or 1 as its output.

# Test the online deployment using the mock data.
sample_data = root_dir + "/project/fraud_model/online_inference/test.json"
ws_client.online_endpoints.invoke(
    endpoint_name=endpoint_name, request_file=sample_data, deployment_name="green"
)
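
If you want to act on the result in code rather than read it from the cell output, the response text needs to be decoded first. A minimal sketch, assuming the response is JSON that decodes to a 0/1 value, a list of them, or a JSON-encoded string containing either; the exact response shape depends on the scoring script, and the function name `parse_is_fraud` is hypothetical.

```python
import json


def parse_is_fraud(response_text):
    """Decode a response body into a list of booleans (assumed 0/1 payload)."""
    payload = json.loads(response_text)
    # A scoring script that returns a JSON string gets encoded twice; decode once more.
    if isinstance(payload, str):
        payload = json.loads(payload)
    values = payload if isinstance(payload, list) else [payload]
    flags = []
    for value in values:
        if value not in (0, 1):
            raise ValueError(f"Unexpected prediction value: {value!r}")
        flags.append(bool(value))
    return flags
```

For example, you could assign the return value of the `invoke` call above to a variable and pass it to `parse_is_fraud`.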

Clean up

The fifth tutorial in this series describes how to delete the resources.
