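Monitor the performance of models deployed to production

APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

In Azure Machine Learning, you can use model monitoring to continuously track the performance of machine learning models in production. Model monitoring gives you a broad view of monitoring signals and alerts you to potential issues. By monitoring the signals and performance metrics of models in production, you can critically evaluate the risks inherent in your models and identify blind spots that might adversely affect your business.

In this article, you learn how to perform the following tasks:

Set up out-of-box and advanced monitoring for models that are deployed to Azure Machine Learning online endpoints
Monitor performance metrics for models in production
Monitor models that are deployed outside Azure Machine Learning or deployed to Azure Machine Learning batch endpoints
Set up custom signals and metrics to use in model monitoring
Interpret monitoring results
Integrate Azure Machine Learning model monitoring with Azure Event Grid

Prerequisites

The Azure CLI and the ml extension to the Azure CLI, installed and configured. For more information, see Install and set up the CLI (v2).
A Bash shell or a compatible shell, for example, a shell on a Linux system or Windows Subsystem for Linux. The Azure CLI examples in this article assume that you use this type of shell.
An Azure Machine Learning workspace. For instructions to create a workspace, see Set up.

An Azure Machine Learning workspace. For instructions to create a workspace, see Create the workspace.
The Azure Machine Learning SDK for Python v2. To install the SDK, use the following command: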
```
pip install azure-ai-ml azure-identity
```
To update an existing installation of the SDK to the latest version, use the following command:
```
pip install --upgrade azure-ai-ml azure-identity
```
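For more information, see Azure Machine Learning Package client library for Python.

A user account that has at least one of the following Azure role-based access control (Azure RBAC) roles:
- An Owner role for the Azure Machine Learning workspace
- A Contributor role for the Azure Machine Learning workspace
- A custom role that has Microsoft.MachineLearningServices/workspaces/onlineEndpoints/* permissions
For more information, see Manage access to Azure Machine Learning workspaces.
For monitoring an Azure Machine Learning managed online endpoint or Kubernetes online endpoint:
- A model that's deployed to an Azure Machine Learning online endpoint. Managed online endpoints and Kubernetes online endpoints are supported. For instructions for deploying a model to an Azure Machine Learning online endpoint, see Deploy and score a machine learning model by using an online endpoint.
- Data collection enabled for your model deployment. You can enable data collection during the deployment step for Azure Machine Learning online endpoints. For more information, see Collect production data from models deployed for real-time inferencing.
For monitoring a model that's deployed to an Azure Machine Learning batch endpoint or deployed outside Azure Machine Learning:
- A means to collect production data and register it as an Azure Machine Learning data asset
- A means to continuously update the registered data asset for model monitoring
- (Recommended) Registration of the model in an Azure Machine Learning workspace, for lineage tracking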

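Configure a serverless Spark compute pool

Model monitoring jobs are scheduled to run on serverless Spark compute pools. The following Azure Virtual Machines instance types are supported:

Standard_E4s_v3
Standard_E8s_v3
Standard_E16s_v3
Standard_E32s_v3
Standard_E64s_v3

To specify a virtual machine instance type when you follow the procedures in this article, take the following step:

When you use the Azure CLI to create a monitor, you use a YAML configuration file. In that file, set the create_monitor.compute.instance_type value to the type that you want to use.

Set up out-of-box model monitoring

Consider a scenario in which you deploy your model to production in an Azure Machine Learning online endpoint and enable data collection at deployment time. In this case, Azure Machine Learning collects production inference data and automatically stores it in Azure Blob Storage. You can use Azure Machine Learning model monitoring to continuously monitor this production inference data.

You can use the Azure CLI, the Python SDK, or the studio for an out-of-box setup of model monitoring. The out-of-box model monitoring configuration provides the following monitoring capabilities:

Azure Machine Learning automatically detects the production inference data asset that's associated with an Azure Machine Learning online deployment and uses that data asset for model monitoring.
The comparison reference data asset is set as the recent, past production inference data asset.
The monitoring setup automatically includes and tracks the following built-in monitoring signals: data drift, prediction drift, and data quality. For each monitoring signal, Azure Machine Learning uses:
- The recent, past production inference data asset as the comparison reference data asset.
- Smart default values for metrics and thresholds.
A monitoring job is configured to run on a regular schedule. That job acquires the monitoring signals and evaluates each metric result against its corresponding threshold. By default, when any threshold is exceeded, Azure Machine Learning sends an alert email to the user who set up the monitor.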
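
The threshold check that a monitoring job performs can be pictured with a minimal sketch. The function name, metric names, and numeric values below are illustrative assumptions and aren't part of the Azure Machine Learning API. The sketch only shows the idea of comparing each computed metric with its threshold and collecting the breaches that would trigger an alert.

```python
def find_breached_metrics(metric_values, thresholds):
    """Return the names of metrics whose value exceeds its configured threshold.

    Hypothetical helper for illustration only. Azure Machine Learning performs
    this kind of comparison internally when a monitoring job runs.
    """
    breached = []
    for name, value in metric_values.items():
        threshold = thresholds.get(name)
        # Metrics that have no configured threshold are skipped.
        if threshold is not None and value > threshold:
            breached.append(name)
    return sorted(breached)


# Illustrative values, not real monitoring output.
computed = {"jensen_shannon_distance": 0.03, "null_value_rate": 0.01}
limits = {"jensen_shannon_distance": 0.01, "null_value_rate": 0.05}

for metric in find_breached_metrics(computed, limits):
    print(f"Alert: {metric} exceeded its threshold")
```

This sketch treats a larger value as worse, which fits drift and data quality metrics. For a metric such as accuracy, where a lower value is worse, the comparison would presumably be reversed.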

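To set up out-of-box model monitoring, take the following steps.

In the Azure CLI, you use az ml schedule to schedule a monitoring job.

Create a monitoring definition in a YAML file. For a sample out-of-box definition, see the following YAML code, which is also available in the azureml-examples repository.

Before you use this definition, adjust the values to fit your environment. For endpoint_deployment_id, use a value in the format azureml:<endpoint-name>:<deployment-name>.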

```
# out-of-box-monitoring.yaml
$schema:  http://azureml/sdk-2-0/Schedule.json
name: credit_default_model_monitoring
display_name: Credit default model monitoring
description: Credit default model monitoring setup with minimal configurations

trigger:
  # perform model monitoring activity daily at 3:15am
  type: recurrence
  frequency: day # can be minute, hour, day, week, month
  interval: 1 # every day
  schedule:
    hours: 3 # at 3am
    minutes: 15 # at 15 mins after 3am

create_monitor:

  compute: # specify a spark compute for monitoring job
    instance_type: standard_e4s_v3
    runtime_version: "3.4"

  monitoring_target:
    ml_task: classification # model task type: [classification, regression, question_answering]
    endpoint_deployment_id: azureml:credit-default:main # azureml endpoint deployment id

  alert_notification: # emails to get alerts
    emails:
      - abc@example.com
      - def@example.com
```

Run the following command to create the model monitor:
```
az ml schedule create -f ./out-of-box-monitoring.yaml
```
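
The endpoint_deployment_id value follows the format azureml:<endpoint-name>:<deployment-name>. As a minimal sketch, a small helper can assemble that string and catch empty or malformed parts before they reach a monitoring definition. The function name is hypothetical and isn't part of the Azure CLI or the Python SDK.

```python
def build_endpoint_deployment_id(endpoint_name, deployment_name):
    """Build a value in the format azureml:<endpoint-name>:<deployment-name>.

    Hypothetical convenience helper for illustration only.
    """
    for part in (endpoint_name, deployment_name):
        # A colon inside a name would make the three-part format ambiguous.
        if not part or ":" in part:
            raise ValueError(f"Invalid name: {part!r}")
    return f"azureml:{endpoint_name}:{deployment_name}"


print(build_endpoint_deployment_id("credit-default", "main"))
```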

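Use code that's similar to the following example. Replace the following placeholders with appropriate values:

| Placeholder | Description | Example |
| --- | --- | --- |
| `<subscription-ID>` | The ID of your subscription | aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e |
| `<resource-group-name>` | The name of the resource group that contains your workspace | my-resource-group |
| `<workspace-name>` | The name of your workspace | my-workspace |
| `<endpoint-name>` | The name of the endpoint to monitor | credit-default |
| `<deployment-name>` | The name of the deployment to monitor | main |
| `<email-address-1>` and `<email-address-2>` | Email addresses to use for notifications | `abc@example.com` |
| `<frequency-unit>` | The monitoring frequency unit | day |
| `<interval>` | The interval between jobs, expressed in frequency units | 1 |
| `<start-hour>` | The hour at which to start monitoring, on a 24-hour clock | 3 |
| `<start-minutes>` | The minutes after the specified hour at which to start monitoring | 15 |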

```
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    AlertNotification,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute
)

# Get a handle to the workspace.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-ID>",
    resource_group_name="<resource-group-name>",
    workspace_name="<workspace-name>",
)

# Create the compute instance.
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3",
    runtime_version="3.3"
)

# Specify your online endpoint deployment.
monitoring_target = MonitoringTarget(
    ml_task="classification",
    endpoint_deployment_id="azureml:<endpoint-name>:<deployment-name>"
)

# Create an alert notification object.
alert_notification = AlertNotification(
    emails=['<email-address-1>', '<email-address-2>']
)

# Create the monitor definition.
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_target=monitoring_target,
    alert_notification=alert_notification
)

# Specify the schedule frequency.
recurrence_trigger = RecurrenceTrigger(
    frequency="<frequency-unit>",
    interval=<interval>,
    schedule=RecurrencePattern(hours=<start-hour>, minutes=<start-minutes>)
)

# Create the monitoring schedule.
model_monitor = MonitorSchedule(
    name="credit_default_monitor_basic",
    trigger=recurrence_trigger,
    create_monitor=monitor_definition
)

# Schedule the monitoring job.
poller = ml_client.schedules.begin_create_or_update(model_monitor)
created_monitor = poller.result()
```

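Set up advanced model monitoring

Azure Machine Learning provides many capabilities for continuous model monitoring. For a comprehensive list of these capabilities, see Capabilities of model monitoring. In many cases, you need to set up model monitoring that supports advanced monitoring tasks. The following sections provide a few examples of advanced monitoring:

The use of multiple monitoring signals for a broad view
The use of historical model training data or validation data as the comparison reference data asset
Monitoring of the N most important features and of individual features

Configure feature importance

Feature importance represents the relative importance of each input feature to a model's output. For example, temperature might be more important to a model's prediction than elevation. When you turn on feature importance, you gain visibility into which features you don't want drifting or having data quality issues in production.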
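
To make the idea of monitoring only the N most important features concrete, here is a minimal sketch that ranks features by an importance score and keeps the top N. The function name and the scores are made up for illustration. Azure Machine Learning computes feature importance itself, and this sketch doesn't reproduce that computation.

```python
def top_n_features(importances, n):
    """Return the names of the n features with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Made-up importance scores for illustration.
scores = {"temperature": 0.62, "elevation": 0.08, "humidity": 0.30}
print(top_n_features(scores, 2))  # ['temperature', 'humidity']
```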

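To turn on feature importance with any of your signals, such as data drift or data quality, you need to provide:

Your training data asset as the reference_data data asset.
The reference_data.data_column_names.target_column property, which is the name of your model's output column, or prediction column.

After you turn on feature importance, Azure Machine Learning studio shows a feature importance value for each feature that you monitor.

You can turn alerts on or off for each signal by setting the alert_enabled property when you use the Python SDK or the Azure CLI.

You can use the Azure CLI, the Python SDK, or the studio to set up advanced model monitoring.

Create a monitoring definition in a YAML file. For a sample advanced definition, see the following YAML code, which is also available in the azureml-examples repository.

Before you use this definition, adjust the following settings and any others to meet the needs of your environment:

For endpoint_deployment_id, use a value in the format azureml:<endpoint-name>:<deployment-name>.
For path in reference input data sections, use a value in the format azureml:<reference-data-asset-name>:<version>.
For target_column, use the name of the output column that contains the values that the model predicts, such as DEFAULT_NEXT_MONTH.
For features, list the features, such as SEX, EDUCATION, and AGE, that you want to use in an advanced data quality signal.
Under emails, list the email addresses that you want to use for notifications.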

```
# advanced-model-monitoring.yaml
$schema:  http://azureml/sdk-2-0/Schedule.json
name: fraud_detection_model_monitoring
display_name: Fraud detection model monitoring
description: Fraud detection model monitoring with advanced configurations

trigger:
  # perform model monitoring activity daily at 3:15am
  type: recurrence
  frequency: day # can be minute, hour, day, week, month
  interval: 1 # every day
  schedule:
    hours: 3 # at 3am
    minutes: 15 # at 15 mins after 3am

create_monitor:

  compute:
    instance_type: standard_e4s_v3
    runtime_version: "3.4"

  monitoring_target:
    ml_task: classification
    endpoint_deployment_id: azureml:credit-default:main

  monitoring_signals:
    advanced_data_drift: # monitoring signal name, any user defined name works
      type: data_drift
      # reference_data is optional. By default, the reference data is the production inference data associated with the Azure Machine Learning online endpoint
      reference_data:
        input_data:
          path: azureml:credit-reference:1 # use training data as comparison reference dataset
          type: mltable
        data_context: training
        data_column_names:
          target_column: DEFAULT_NEXT_MONTH
      features:
        top_n_feature_importance: 10 # monitor drift for top 10 features
      alert_enabled: true
      metric_thresholds:
        numerical:
          jensen_shannon_distance: 0.01
        categorical:
          pearsons_chi_squared_test: 0.02
    advanced_data_quality:
      type: data_quality
      # reference_data is optional. By default, the reference data is the production inference data associated with the Azure Machine Learning online endpoint
      reference_data:
        input_data:
          path: azureml:credit-reference:1
          type: mltable
        data_context: training
      features: # monitor data quality for 3 individual features only
        - SEX
        - EDUCATION
        - AGE
      alert_enabled: true
      metric_thresholds:
        numerical:
          null_value_rate: 0.05
        categorical:
          out_of_bounds_rate: 0.03

    feature_attribution_drift_signal:
      type: feature_attribution_drift
      # production_data is not a required input here
      # Ensure that the Azure Machine Learning online endpoint is enabled to collect both model_inputs and model_outputs data
      # Azure Machine Learning model monitoring automatically joins the model_inputs and model_outputs data and uses the result for computation
      reference_data:
        input_data:
          path: azureml:credit-reference:1
          type: mltable
        data_context: training
        data_column_names:
          target_column: DEFAULT_NEXT_MONTH
      alert_enabled: true
      metric_thresholds:
        normalized_discounted_cumulative_gain: 0.9

  alert_notification:
    emails:
      - abc@example.com
      - def@example.com
```

Run the following command to create the model monitor:
```
az ml schedule create -f ./advanced-model-monitoring.yaml
```

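To set up advanced model monitoring, use code that's similar to the following sample. Replace the following placeholders with appropriate values:

| Placeholder | Description | Example |
| --- | --- | --- |
| `<subscription-ID>` | The ID of your subscription | aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e |
| `<resource-group-name>` | The name of the resource group that contains your workspace | my-resource-group |
| `<workspace-name>` | The name of your workspace | my-workspace |
| `<endpoint-name>` | The name of the endpoint to monitor | credit-default |
| `<deployment-name>` | The name of the deployment to monitor | main |
| `<production-data-asset-name>` | The name of the data asset that contains production data | credit-default-main-model_inputs |
| `<reference-data-asset-name>` | The name of the data asset that contains reference data | credit-default-reference |
| `<target-column>` | The name of the output column that contains the values that the model predicts | DEFAULT_NEXT_MONTH |
| `<feature-1>`, `<feature-2>`, `<feature-3>` | Features to use in an advanced data quality signal | AGE |
| `<email-address-1>` and `<email-address-2>` | Email addresses to use for notifications | `abc@example.com` |
| `<frequency-unit>` | The monitoring frequency unit | day |
| `<interval>` | The interval between jobs, expressed in frequency units | 1 |
| `<start-hour>` | The hour at which to start monitoring, on a 24-hour clock | 3 |
| `<start-minutes>` | The minutes after the specified hour at which to start monitoring | 15 |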

```
from azure.identity import DefaultAzureCredential
from azure.ai.ml import Input, MLClient
from azure.ai.ml.constants import (
    MonitorDatasetContext,
)
from azure.ai.ml.entities import (
    AlertNotification,
    BaselineDataRange,
    DataDriftSignal,
    DataQualitySignal,
    PredictionDriftSignal,
    DataDriftMetricThreshold,
    DataQualityMetricThreshold,
    FeatureAttributionDriftMetricThreshold,
    FeatureAttributionDriftSignal,
    PredictionDriftMetricThreshold,
    NumericalDriftMetrics,
    CategoricalDriftMetrics,
    DataQualityMetricsNumerical,
    DataQualityMetricsCategorical,
    MonitorFeatureFilter,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
    ReferenceData,
    ProductionData
)

# Get a handle to the workspace.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-ID>",
    resource_group_name="<resource-group-name>",
    workspace_name="<workspace-name>",
)

# Create a compute instance.
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3",
    runtime_version="3.3"
)

# Specify the online deployment if you have one.
monitoring_target = MonitoringTarget(
    ml_task="classification",
    endpoint_deployment_id="azureml:<endpoint-name>:<deployment-name>"
)

# Specify a look-back window size and offset to use. Omit this line to use the default values, which are listed in the documentation.
data_window = BaselineDataRange(lookback_window_size="P1D", lookback_window_offset="P0D")

# Set up the production data.
production_data = ProductionData(
    input_data=Input(
        type="uri_folder",
        path="azureml:<production-data-asset-name>:1"
    ),
    data_window=data_window,
    data_context=MonitorDatasetContext.MODEL_INPUTS,
)

# Set up the training data to use as a reference data asset.
reference_data_training = ReferenceData(
    input_data=Input(
        type="mltable",
        path="azureml:<reference-data-asset-name>:1"
    ),
    data_column_names={
        "target_column":"<target-column>"
    },
    data_context=MonitorDatasetContext.TRAINING,
)

# Create an advanced data drift signal.
features = MonitorFeatureFilter(top_n_feature_importance=10)

metric_thresholds = DataDriftMetricThreshold(
    numerical=NumericalDriftMetrics(
        jensen_shannon_distance=0.01
    ),
    categorical=CategoricalDriftMetrics(
        pearsons_chi_squared_test=0.02
    )
)

advanced_data_drift = DataDriftSignal(
    reference_data=reference_data_training,
    features=features,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Create an advanced prediction drift signal.
metric_thresholds = PredictionDriftMetricThreshold(
    categorical=CategoricalDriftMetrics(
        jensen_shannon_distance=0.01
    )
)

advanced_prediction_drift = PredictionDriftSignal(
    reference_data=reference_data_training,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Create an advanced data quality signal.
features = ['<feature-1>', '<feature-2>', '<feature-3>']

metric_thresholds = DataQualityMetricThreshold(
    numerical=DataQualityMetricsNumerical(
        null_value_rate=0.01
    ),
    categorical=DataQualityMetricsCategorical(
        out_of_bounds_rate=0.02
    )
)

advanced_data_quality = DataQualitySignal(
    reference_data=reference_data_training,
    features=features,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Create a feature attribution drift signal.
metric_thresholds = FeatureAttributionDriftMetricThreshold(normalized_discounted_cumulative_gain=0.9)

feature_attribution_drift = FeatureAttributionDriftSignal(
    reference_data=reference_data_training,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Put all monitoring signals in a dictionary.
monitoring_signals = {
    'data_drift_advanced':advanced_data_drift,
    'data_quality_advanced':advanced_data_quality,
    'feature_attribution_drift':feature_attribution_drift,
}

# Create an alert notification object.
alert_notification = AlertNotification(
    emails=['<email-address-1>', '<email-address-2>']
)

# Create the monitor definition.
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_target=monitoring_target,
    monitoring_signals=monitoring_signals,
    alert_notification=alert_notification
)

# Specify the schedule frequency.
recurrence_trigger = RecurrenceTrigger(
    frequency="<frequency-unit>",
    interval=<interval>,
    schedule=RecurrencePattern(hours=<start-hour>, minutes=<start-minutes>)
)

# Create the monitoring schedule.
model_monitor = MonitorSchedule(
    name="credit_default_monitor_advanced",
    trigger=recurrence_trigger,
    create_monitor=monitor_definition
)

# Schedule the monitoring job.
poller = ml_client.schedules.begin_create_or_update(model_monitor)
created_monitor = poller.result()
```

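Set up model performance monitoring

When you use Azure Machine Learning model monitoring, you can track the performance of your models in production by calculating their performance metrics. The following model performance metrics are currently supported:

For classification models:
- Precision
- Accuracy
- Recall
For regression models:
- Mean absolute error (MAE)
- Mean squared error (MSE)
- Root mean squared error (RMSE)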
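
For readers who want a reminder of how these metrics are defined, the following minimal sketch computes them from paired actual and predicted values by using the standard textbook formulas. The function names and sample values are illustrative assumptions. The sketch isn't the implementation that Azure Machine Learning uses, and it treats the label 1 as the positive class.

```python
import math


def classification_metrics(actual, predicted, positive=1):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == positive and p == positive)
    false_pos = sum(1 for a, p in pairs if a != positive and p == positive)
    false_neg = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "accuracy": correct / len(pairs),
        # Fall back to 0.0 when a denominator would be zero.
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }


def regression_metrics(actual, predicted):
    """Compute MAE, MSE, and RMSE for numeric predictions."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Made-up sample data for illustration.
print(classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
print(regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```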

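Prerequisites for model performance monitoring

Output data for the production model (the model's predictions) with a unique ID for each row. If you use the Azure Machine Learning data collector to collect production data, a correlation ID is provided for each inference request. The data collector also offers the option of logging your own unique ID from your application.

Note

For Azure Machine Learning model performance monitoring, we recommend that you use the Azure Machine Learning data collector to log your unique ID in its own column.
Ground truth data (actuals) with a unique ID for each row. The unique ID for a given row must match the unique ID of the model output data for that particular inference request. This unique ID is used to join your ground truth data asset with the model output data.

If you don't have ground truth data, you can't perform model performance monitoring. Ground truth data is encountered at the application level, so it's your responsibility to collect it as it becomes available. You should also maintain a data asset in Azure Machine Learning that contains this ground truth data.
(Optional) A prejoined tabular data asset in which the model output data and the ground truth data are already joined.

Requirements for model performance monitoring when you use the data collector

Azure Machine Learning generates a correlation ID for you when you meet the following criteria:

You use the Azure Machine Learning data collector to collect production inference data.
You don't supply your own unique ID for each row as a separate column.

The generated correlation ID is included in the logged JSON object. However, the data collector batches rows that are sent within a short time interval of each other. Batched rows fall within the same JSON object. Within each object, all rows have the same correlation ID.

To differentiate between the rows in a JSON object, Azure Machine Learning model performance monitoring uses indexing to determine the order of the rows within the object. For example, if a batch contains three rows and the correlation ID is test, the first row has an ID of test_0, the second row has an ID of test_1, and the third row has an ID of test_2. To match the unique IDs in your ground truth data asset with the IDs of your collected production inference model output data, apply an index to each correlation ID appropriately. If your logged JSON object has only one row, use correlationid_0 as the correlationid value.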
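
The indexing scheme described above can be sketched in a few lines. The helper name is hypothetical. It only reproduces the suffix pattern from the example (test_0, test_1, test_2), so that ground truth rows can be given IDs that match the collected model output rows.

```python
def indexed_correlation_ids(correlation_id, row_count):
    """Return one ID per batched row by appending the row's zero-based index."""
    return [f"{correlation_id}_{index}" for index in range(row_count)]


print(indexed_correlation_ids("test", 3))  # ['test_0', 'test_1', 'test_2']
print(indexed_correlation_ids("correlationid", 1))  # ['correlationid_0']
```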

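To avoid using this indexing, we recommend that you log your unique ID in its own column. Put that column within the pandas data frame that the Azure Machine Learning data collector logs. In your model monitoring configuration, you can then specify the name of this column to join your model output data with your ground truth data. As long as the IDs for each row in both data assets are the same, Azure Machine Learning model monitoring can perform model performance monitoring.

Example workflow for monitoring model performance

To understand the concepts that are associated with model performance monitoring, consider the following example workflow. It applies to a scenario in which you deploy a model to predict whether credit card transactions are fraudulent:

Configure your deployment to use the data collector to collect the model's production inference data (input and output data). Store the output data in a column called is_fraud.
For each row of the collected inference data, log a unique ID. The unique ID can come from your application, or you can use the correlationid value that Azure Machine Learning uniquely generates for each logged JSON object.
When the ground truth (or actual) is_fraud data becomes available, log each row and map it to the same unique ID that was logged for the corresponding row in the model's output data.
Register a data asset in Azure Machine Learning, and use it to collect and maintain the ground truth is_fraud data.
Create a model performance monitoring signal that uses the unique ID columns to join the model's production inference data asset and the ground truth data asset.
Compute the model performance metrics.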
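
The join step in this workflow can be illustrated with a minimal sketch that matches model output rows to ground truth rows on a shared unique ID and then computes accuracy. The row dictionaries, the helper names, and the sample values are assumptions made for illustration. Azure Machine Learning performs the actual join on the registered data assets.

```python
def join_on_id(model_outputs, ground_truth, id_column="correlationid", value_column="is_fraud"):
    """Inner-join two lists of row dictionaries on a shared unique ID column.

    Returns (predicted, actual) pairs for the IDs that appear in both inputs.
    """
    truth_by_id = {row[id_column]: row[value_column] for row in ground_truth}
    return [
        (row[value_column], truth_by_id[row[id_column]])
        for row in model_outputs
        if row[id_column] in truth_by_id
    ]


def accuracy(pairs):
    """Return the share of pairs in which the prediction equals the actual value."""
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


# Made-up rows for illustration.
outputs = [
    {"correlationid": "a_0", "is_fraud": 1},
    {"correlationid": "a_1", "is_fraud": 0},
    {"correlationid": "b_0", "is_fraud": 1},
]
truth = [
    {"correlationid": "a_0", "is_fraud": 1},
    {"correlationid": "a_1", "is_fraud": 1},
    {"correlationid": "c_0", "is_fraud": 0},
]

joined = join_on_id(outputs, truth)
print(joined, accuracy(joined))
```

Rows whose ID appears in only one of the two inputs drop out of the join, which is why matching IDs in both data assets matter.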

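After you satisfy the prerequisites for model performance monitoring, take the following steps to set up model monitoring:

Create a monitoring definition in a YAML file. The following sample specification defines model monitoring with production inference data. Before you use this definition, adjust the following settings and any others to meet the needs of your environment:

For endpoint_deployment_id, use a value in the format azureml:<endpoint-name>:<deployment-name>.
For each path value in an input data section, use a value in the format azureml:<data-asset-name>:<version>.
For the prediction value, use the name of the output column that contains the values that the model predicts.
For the actual value, use the name of the ground truth column that contains the actual values that the model tries to predict.
For the correlation_id values, use the names of the columns that are used to join the output data and the ground truth data.
Under emails, list the email addresses that you want to use for notifications.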

```
# model-performance-monitoring.yaml
$schema:  http://azureml/sdk-2-0/Schedule.json
name: model_performance_monitoring
display_name: Credit card fraud model performance
description: Credit card fraud model performance

trigger:
  type: recurrence
  frequency: day
  interval: 7
  schedule:
    hours: 10
    minutes: 15

create_monitor:
  compute:
    instance_type: standard_e8s_v3
    runtime_version: "3.3"
  monitoring_target:
    ml_task: classification
    endpoint_deployment_id: azureml:loan-approval-endpoint:loan-approval-deployment

  monitoring_signals:
    fraud_detection_model_performance:
      type: model_performance
      production_data:
        input_data:
          path: azureml:credit-default-main-model_outputs:1
          type: mltable
        data_column_names:
          prediction: is_fraud
          correlation_id: correlation_id
      reference_data:
        input_data:
          path: azureml:my_model_ground_truth_data:1
          type: mltable
        data_column_names:
          actual: is_fraud
          correlation_id: correlation_id
        data_context: ground_truth
      alert_enabled: true
      metric_thresholds:
        tabular_classification:
          accuracy: 0.95
          precision: 0.8
  alert_notification:
      emails:
        - abc@example.com
```

Run the following command to create the model monitor:
```
az ml schedule create -f ./model-performance-monitoring.yaml
```

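After you satisfy the prerequisites for model performance monitoring, use the following Python code to set up model monitoring. First, replace the following placeholders with appropriate values:

| Placeholder | Description | Example |
| --- | --- | --- |
| `<subscription-ID>` | The ID of your subscription | aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e |
| `<resource-group-name>` | The name of the resource group that contains your workspace | my-resource-group |
| `<workspace-name>` | The name of your workspace | my-workspace |
| `<production-data-asset-name>` | The name of the data asset that contains production data | credit-default-main-model_inputs |
| `<production-target-column>` | The name of the production column that contains the values that the model predicts | DEFAULT_NEXT_MONTH |
| `<production-join-column>` | The name of the production column to use to join the production data and the ground truth data | correlationid |
| `<ground-truth-data-asset-name>` | The name of the data asset that contains ground truth data | credit-ground-truth |
| `<ground-truth-target-column>` | The name of the ground truth column that contains the actual values that the model tries to predict | ground_truth |
| `<ground-truth-join-column>` | The name of the ground truth column to use to join the production data and the ground truth data | correlationid |
| `<email-address-1>` and `<email-address-2>` | Email addresses to use for notifications | `abc@example.com` |
| `<frequency-unit>` | The monitoring frequency unit | day |
| `<interval>` | The interval between jobs, expressed in frequency units | 1 |
| `<start-hour>` | The hour at which to start monitoring, on a 24-hour clock | 3 |
| `<start-minutes>` | The minutes after the specified hour at which to start monitoring | 15 |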

```
from azure.identity import DefaultAzureCredential
from azure.ai.ml import Input, MLClient
from azure.ai.ml.constants import (
    MonitorDatasetContext,
)
from azure.ai.ml.entities import (
    AlertNotification,
    BaselineDataRange,
    ModelPerformanceMetricThreshold,
    ModelPerformanceSignal,
    ModelPerformanceClassificationThresholds,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
    ReferenceData,
    ProductionData
)

# Get a handle to the workspace.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-ID>",
    resource_group_name="<resource-group-name>",
    workspace_name="<workspace-name>",
)

# Create a compute instance.
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3",
    runtime_version="3.3"
)

# Specify the type of the model task.
monitoring_target = MonitoringTarget(
    ml_task="classification",
)

# Specify production data that the model data collector generates.
production_data = ProductionData(
    input_data=Input(
        type="uri_folder",
        path="azureml:<production-data-asset-name>:1"
    ),
    data_column_names={
        "target_column": "<production-target-column>",
        "join_column": "<production-join-column>"
    },
    data_window=BaselineDataRange(
        lookback_window_offset="P0D",
        lookback_window_size="P10D",
    )
)

# Specify the ground truth reference data.
reference_data_ground_truth = ReferenceData(
    input_data=Input(
        type="mltable",
        path="azureml:<ground-truth-data-asset-name>:1"
    ),
    data_column_names={
        "target_column": "<ground-truth-target-column>",
        "join_column": "<ground-truth-join-column>"
    },
    data_context=MonitorDatasetContext.GROUND_TRUTH_DATA,
)

# Create the model performance signal.
metric_thresholds = ModelPerformanceMetricThreshold(
    classification=ModelPerformanceClassificationThresholds(
        accuracy=0.50,
        precision=0.50,
        recall=0.50
    ),
)

model_performance = ModelPerformanceSignal(
    production_data=production_data,
    reference_data=reference_data_ground_truth,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Put all monitoring signals in a dictionary.
monitoring_signals = {
    'model_performance':model_performance,
}

# Create an alert notification object.
alert_notification = AlertNotification(
    emails=['<email-address-1>', '<email-address-2>']
)

# Set up the monitor definition.
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_target=monitoring_target,
    monitoring_signals=monitoring_signals,
    alert_notification=alert_notification
)

# Specify the schedule frequency.
recurrence_trigger = RecurrenceTrigger(
    frequency="<frequency-unit>",
    interval=<interval>,
    schedule=RecurrencePattern(hours=<start-hour>, minutes=<start-minutes>)
)

# Create the monitoring schedule.
model_monitor = MonitorSchedule(
    name="credit_default_model_performance",
    trigger=recurrence_trigger,
    create_monitor=monitor_definition
)

# Schedule the monitoring job.
poller = ml_client.schedules.begin_create_or_update(model_monitor)
created_monitor = poller.result()
```

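To set up model performance monitoring, take the steps in the following sections.

Configure basic settings

In Azure Machine Learning studio, go to your workspace.
Under Manage, select Monitoring, and then select Add.
On the Basic settings page, enter the information that's described earlier in Set up out-of-box model monitoring.

Add data assets

On the Basic settings page, select Next to open the Configure data asset page of the Advanced settings section.
Select Add, and then add the data asset that you want to use as the ground truth data asset. The ground truth data asset must have a unique ID column. The values in the unique ID column of the ground truth data asset must match the values in the unique ID column of the model output data asset, so that the two data assets can be joined before metrics are computed.
If the model output data asset doesn't appear in the list of added data assets, select Add to add it.

Add a performance monitoring signal

On the Configure data asset page, select Next. The Select monitoring signals page opens. If you use an Azure Machine Learning online deployment, a list of monitoring signals appears.
Delete all the monitoring signals that appear on the page. This section focuses on creating a model performance monitoring signal.
Select Add.
In the Edit Signal window, select Model performance (preview), and then take the following steps to configure the model performance signal:
1. In step 1:
  1. For the production data asset, select your model output data asset.
  2. Select the appropriate target column, for example, DEFAULT_NEXT_MONTH.
  3. Select the size and offset of the lookback window that you want to use.
2. In step 2:
  1. For the reference data asset, select the ground truth data asset.
  2. Select the target column, for example, ground_truth.
  3. Select the column to use for the join with the model output data asset, for example, correlationid. Both data assets must contain that column, and it must contain a unique ID for each row in the data asset.
3. In step 3, select the performance metrics that you want to use, and specify a threshold for each one.
Select Save. On the Select monitoring signals page, the model performance signal appears.

Finish the configuration

On the Select monitoring signals page, select Next.
On the Notifications page, turn on notifications for the model performance signal, and then select Next.
On the Review monitoring details page, review your settings.
Select Create to create your model performance monitor.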

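Set up model monitoring for production data

You can also monitor models that you deploy to Azure Machine Learning batch endpoints or that you deploy outside Azure Machine Learning. If you don't have a deployment but you have production data, you can use the data to perform continuous model monitoring. To monitor these models, you must be able to:

Collect production inference data from models deployed in production.
Register the production inference data as an Azure Machine Learning data asset, and ensure continuous updates of the data.
Provide a custom data preprocessing component and register it as an Azure Machine Learning component if you don't use the data collector to collect data. Without this custom data preprocessing component, the Azure Machine Learning model monitoring system can't process your data into a tabular form that supports time windowing.

Your custom preprocessing component must have the following input and output signatures:

| Input or output | Signature name | Type | Description | Example value |
| --- | --- | --- | --- | --- |
| Input | `data_window_start` | literal, string | The start time of the data window in ISO8601 format | 2023-05-01T04:31:57.012Z |
| Input | `data_window_end` | literal, string | The end time of the data window in ISO8601 format | 2023-05-01T04:31:57.012Z |
| Input | `input_data` | uri_folder | The collected production inference data, which is registered as an Azure Machine Learning data asset | azureml:myproduction_inference_data:1 |
| Output | `preprocessed_data` | mltable | A tabular data asset that matches a subset of the reference data schema | |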
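
The time windowing that the data_window_start and data_window_end inputs make possible can be sketched as follows. This is a simplified illustration that filters in-memory rows. A real preprocessing component reads the uri_folder input and writes an mltable output, which this sketch doesn't attempt. The function names, the timestamp column name, and the choice of a half-open window that includes the start and excludes the end are assumptions.

```python
from datetime import datetime


def parse_iso8601(value):
    """Parse an ISO8601 timestamp such as 2023-05-01T04:31:57.012Z."""
    # Older Python versions don't accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def rows_in_window(rows, data_window_start, data_window_end, time_column="timestamp"):
    """Keep the rows whose timestamp falls inside [start, end)."""
    start = parse_iso8601(data_window_start)
    end = parse_iso8601(data_window_end)
    return [row for row in rows if start <= parse_iso8601(row[time_column]) < end]


# Made-up rows for illustration.
sample_rows = [
    {"timestamp": "2023-05-01T03:59:59Z", "amount": 10},
    {"timestamp": "2023-05-01T04:31:57.012Z", "amount": 20},
    {"timestamp": "2023-05-02T04:31:57.012Z", "amount": 30},
]

print(rows_in_window(sample_rows, "2023-05-01T04:00:00Z", "2023-05-02T04:00:00Z"))
```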

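For an example of a custom data preprocessing component, see custom_preprocessing in the azureml-examples GitHub repo.

For instructions for registering an Azure Machine Learning component, see Register component in your workspace.

After you register your production data and preprocessing component, you can set up model monitoring.

Create a monitoring definition YAML file that's similar to the following one. Before you use this definition, adjust the following settings and any others to meet the needs of your environment:

For endpoint_deployment_id, use a value in the format azureml:<endpoint-name>:<deployment-name>.
For pre_processing_component, use a value in the format azureml:<component-name>:<component-version>. Specify the exact version, such as 1.0.0, not 1.
For each path, use a value in the format azureml:<data-asset-name>:<version>.
For the target_column value, use the name of the output column that contains the values that the model predicts.
Under emails, list the email addresses that you want to use for notifications.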

```
# model-monitoring-with-collected-data.yaml
$schema:  http://azureml/sdk-2-0/Schedule.json
name: fraud_detection_model_monitoring
display_name: Fraud detection model monitoring
description: Fraud detection model monitoring with your own production data

trigger:
  # perform model monitoring activity daily at 3:15am
  type: recurrence
  frequency: day # can be minute, hour, day, week, month
  interval: 1 # every day
  schedule:
    hours: 3 # at 3am
    minutes: 15 # at 15 mins after 3am

create_monitor:
  compute:
    instance_type: standard_e4s_v3
    runtime_version: "3.4"
  monitoring_target:
    ml_task: classification
    endpoint_deployment_id: azureml:fraud-detection-endpoint:fraud-detection-deployment

  monitoring_signals:

    advanced_data_drift: # monitoring signal name, any user defined name works
      type: data_drift
      # define production dataset with your collected data
      production_data:
        input_data:
          path: azureml:my_production_inference_data_model_inputs:1  # your collected data is registered as Azure Machine Learning asset
          type: uri_folder
        data_context: model_inputs
        pre_processing_component: azureml:production_data_preprocessing:1.0.0
      reference_data:
        input_data:
          path: azureml:my_model_training_data:1 # use training data as comparison baseline
          type: mltable
        data_context: training
        data_column_names:
          target_column: is_fraud
      features:
        top_n_feature_importance: 20 # monitor drift for top 20 features
      alert_enabled: true
      metric_thresholds:
        numerical:
          jensen_shannon_distance: 0.01
        categorical:
          pearsons_chi_squared_test: 0.02

    advanced_prediction_drift: # monitoring signal name, any user defined name works
      type: prediction_drift
      # define production dataset with your collected data
      production_data:
        input_data:
          path: azureml:my_production_inference_data_model_outputs:1  # your collected data is registered as Azure Machine Learning asset
          type: uri_folder
        data_context: model_outputs
        pre_processing_component: azureml:production_data_preprocessing:1.0.0
      reference_data:
        input_data:
          path: azureml:my_model_validation_data:1 # use validation data as comparison reference dataset
          type: mltable
        data_context: validation
      alert_enabled: true
      metric_thresholds:
        categorical:
          pearsons_chi_squared_test: 0.02

  alert_notification:
    emails:
      - abc@example.com
      - def@example.com
```

Run the following command to create the model monitor:
```
az ml schedule create -f ./model-monitoring-with-collected-data.yaml
```

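Use a script that's similar to the following Python code to set up model monitoring. First, replace the following placeholders with appropriate values:

| Placeholder | Description | Example |
| --- | --- | --- |
| `<subscription-ID>` | The ID of your subscription | aaaa0a0a-bb1b-cc2c-dd3d-eeeeee4e4e4e |
| `<resource-group-name>` | The name of the resource group that contains your workspace | my-resource-group |
| `<workspace-name>` | The name of your workspace | my-workspace |
| `<production-data-asset-name>` | The name of the data asset that contains production data | my_model_production_data |
| `<preprocessing-component-name>` | The name of the preprocessing component | production_data_preprocessing |
| `<training-data-asset-name>` | The name of the training data asset to use as the reference data asset | my_model_training_data |
| `<email-address-1>` and `<email-address-2>` | Email addresses to use for notifications | `abc@example.com` |
| `<frequency-unit>` | The monitoring frequency unit | day |
| `<interval>` | The interval between jobs, expressed in frequency units | 1 |
| `<start-hour>` | The hour at which to start monitoring, on a 24-hour clock | 3 |
| `<start-minutes>` | The minutes after the specified hour at which to start monitoring | 15 |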

from azure.identity import InteractiveBrowserCredential
from azure.ai.ml import Input, MLClient
from azure.ai.ml.constants import (
    MonitorFeatureType,
    MonitorMetricName,
    MonitorDatasetContext
)
from azure.ai.ml.entities import (
    AlertNotification,
    DataDriftSignal,
    DataQualitySignal,
    DataDriftMetricThreshold,
    DataQualityMetricThreshold,
    NumericalDriftMetrics,
    CategoricalDriftMetrics,
    DataQualityMetricsNumerical,
    DataQualityMetricsCategorical,
    MonitorFeatureFilter,
    MonitorInputData,
    MonitoringTarget,
    MonitorDefinition,
    MonitorSchedule,
    RecurrencePattern,
    RecurrenceTrigger,
    ServerlessSparkCompute,
    ReferenceData,
    ProductionData
)

# Get a handle to the workspace.
subscription_id = "<subscription-ID>"
resource_group = "<resource-group-name>"
workspace = "<workspace-name>"
ml_client = MLClient(
   InteractiveBrowserCredential(),
   subscription_id,
   resource_group,
   workspace
)

# Specify the serverless Spark compute that runs the monitoring jobs.
spark_compute = ServerlessSparkCompute(
    instance_type="standard_e4s_v3",
    runtime_version="3.3"
)

# Specify the target data asset (the production data asset).
production_data = ProductionData(
    input_data=Input(
        type="uri_folder",
        path="azureml:<production-data-asset-name>:1"
    ),
    data_context=MonitorDatasetContext.MODEL_INPUTS,
    pre_processing_component="azureml:<preprocessing-component-name>:1.0.0"
)

# Specify the training data to use as a reference data asset.
reference_data_training = ReferenceData(
    input_data=Input(
        type="mltable",
        path="azureml:<training-data-asset-name>:1"
    ),
    data_context=MonitorDatasetContext.TRAINING
)

# Create an advanced data drift signal.
features = MonitorFeatureFilter(top_n_feature_importance=20)
metric_thresholds = DataDriftMetricThreshold(
    numerical=NumericalDriftMetrics(
        jensen_shannon_distance=0.01
    ),
    categorical=CategoricalDriftMetrics(
        pearsons_chi_squared_test=0.02
    )
)

advanced_data_drift = DataDriftSignal(
    production_data=production_data,
    reference_data=reference_data_training,
    features=features,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Create an advanced data quality signal.
features = ['feature_A', 'feature_B', 'feature_C']
metric_thresholds = DataQualityMetricThreshold(
    numerical=DataQualityMetricsNumerical(
        null_value_rate=0.01
    ),
    categorical=DataQualityMetricsCategorical(
        out_of_bounds_rate=0.02
    )
)

advanced_data_quality = DataQualitySignal(
    production_data=production_data,
    reference_data=reference_data_training,
    features=features,
    metric_thresholds=metric_thresholds,
    alert_enabled=True
)

# Put all monitoring signals in a dictionary.
monitoring_signals = {
    'data_drift_advanced': advanced_data_drift,
    'data_quality_advanced': advanced_data_quality
}

# Create an alert notification object.
alert_notification = AlertNotification(
    emails=['<email-address-1>', '<email-address-2>']
)

# Set up the monitor definition.
monitor_definition = MonitorDefinition(
    compute=spark_compute,
    monitoring_signals=monitoring_signals,
    alert_notification=alert_notification
)

# Specify the schedule frequency.
recurrence_trigger = RecurrenceTrigger(
    frequency="<frequency-unit>",
    interval=<interval>,
    schedule=RecurrencePattern(hours=<start-hour>, minutes=<start-minutes>)
)

# Create the monitoring schedule.
model_monitor = MonitorSchedule(
    name="fraud_detection_model_monitoring_advanced",
    trigger=recurrence_trigger,
    create_monitor=monitor_definition
)

# Schedule the monitoring job.
poller = ml_client.schedules.begin_create_or_update(model_monitor)
created_monitor = poller.result()
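
The schedule placeholders are easy to mistype, and a bad value might only surface when the schedule is submitted. The following minimal sketch checks them up front. The helper `validate_recurrence` is hypothetical and isn't part of the SDK, and the set of frequency units is taken from the comments in the YAML examples in this article rather than from an authoritative list.

```python
# Hypothetical helper, not part of azure-ai-ml: sanity-check schedule values
# before passing them to RecurrenceTrigger and RecurrencePattern.
ALLOWED_FREQUENCIES = {"minute", "hour", "day", "week", "month"}


def validate_recurrence(frequency: str, interval: int, hours: int, minutes: int) -> dict:
    """Return the validated values as a dictionary, or raise ValueError."""
    if frequency not in ALLOWED_FREQUENCIES:
        raise ValueError(f"Unsupported frequency unit: {frequency!r}")
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    if not 0 <= hours <= 23:
        raise ValueError("hours must be between 0 and 23")
    if not 0 <= minutes <= 59:
        raise ValueError("minutes must be between 0 and 59")
    return {
        "frequency": frequency,
        "interval": interval,
        "hours": hours,
        "minutes": minutes,
    }


# Example values from the placeholder table: run every day at 03:15.
settings = validate_recurrence("day", 1, 3, 15)
print(settings)
```

The validated values can then be substituted for the `<frequency-unit>`, `<interval>`, `<start-hour>`, and `<start-minutes>` placeholders in the script.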

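Set up model monitoring by using custom signals and metrics

When you use Azure Machine Learning model monitoring, you can define a custom signal and implement any metric of your choice to monitor your model. You can register your custom signal as an Azure Machine Learning component. When your model monitoring job runs on its specified schedule, it computes the metrics that are defined within your custom signal, just as it does for prebuilt signals such as data drift and prediction drift.

To set up a custom signal to use for model monitoring, you must first define the custom signal and register it as an Azure Machine Learning component. The Azure Machine Learning component must have the following input and output signatures.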

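Component input signature

The component input data frame should contain the following items:

An mltable structure that contains the processed data from the preprocessing component.
Any number of literals, each representing an implemented metric as part of the custom signal component. For example, if you implement the std_deviation metric, you need an input for std_deviation_threshold. Generally, there should be one input with the name <metric-name>_threshold per metric.

Signature name	Type	Description	Example value
`production_data`	mltable	A tabular data asset that matches a subset of the reference data schema
`std_deviation_threshold`	Literal, string	The respective threshold for the implemented metric	2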

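Component output signature

The component output port should have the following signature:

Signature name	Type	Description
`signal_metrics`	mltable	The mltable structure that contains the computed metrics. For the schema of this signature, see the next section, "signal_metrics schema."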
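
One way to picture these signatures is as the command-line arguments of the component's entry script. The following sketch assumes that the component passes each input and output as a `--name value` argument, which is a common pattern but isn't confirmed by this article. The function `parse_component_args` and the sample paths are hypothetical.

```python
import argparse


def parse_component_args(argv=None):
    """Parse the inputs and output of a custom signal component (hypothetical entry point)."""
    parser = argparse.ArgumentParser()
    # Input: mltable folder that holds the preprocessed production data.
    parser.add_argument("--production_data", type=str, required=True)
    # Input: one <metric-name>_threshold literal per implemented metric, passed as a string.
    parser.add_argument("--std_deviation_threshold", type=str, required=True)
    # Output: location where the computed signal_metrics mltable is written.
    parser.add_argument("--signal_metrics", type=str, required=True)
    return parser.parse_args(argv)


# Sample invocation with made-up paths.
args = parse_component_args([
    "--production_data", "./production_data",
    "--std_deviation_threshold", "2",
    "--signal_metrics", "./signal_metrics",
])
# Convert the string literal to a number before using it as a threshold.
threshold = float(args.std_deviation_threshold)
print(args.production_data, threshold, args.signal_metrics)
```

Because the threshold arrives as a string literal, the sketch converts it to a number before any comparison.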

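signal_metrics schema

The component output data frame should contain four columns: group, metric_name, metric_value, and threshold_value.

Signature name	Type	Description	Example value
`group`	Literal, string	The top-level logical grouping to apply to the custom metric	TRANSACTIONAMOUNT
`metric_name`	Literal, string	The name of the custom metric	std_deviation
`metric_value`	Numerical	The value of the custom metric	44,896.082
`threshold_value`	Numerical	The threshold for the custom metric	2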

The following table shows example output from a custom signal component that computes the std_deviation metric:

group	metric_value	metric_name	threshold_value
TRANSACTIONAMOUNT	44,896.082	std_deviation	2
LOCALHOUR	3.983	std_deviation	2
TRANSACTIONAMOUNTUSD	54,004.902	std_deviation	2
DIGITALITEMCOUNT	7.238	std_deviation	2
PHYSICALITEMCOUNT	5.509	std_deviation	2
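
To make the schema concrete, the following sketch builds rows of this shape with the Python standard library. It isn't the component's actual implementation: the function `compute_signal_metrics` is hypothetical, the sample values are made up, and the use of the sample standard deviation is an assumption. A real component would read the production_data mltable and write its result to the signal_metrics output.

```python
import statistics


def compute_signal_metrics(columns: dict, threshold: float) -> list:
    """Build one signal_metrics row per numerical column."""
    rows = []
    for name, values in columns.items():
        rows.append({
            "group": name,                              # column the metric applies to
            "metric_name": "std_deviation",
            "metric_value": statistics.stdev(values),   # sample standard deviation
            "threshold_value": threshold,
        })
    return rows


# Made-up sample values standing in for columns of the production data.
sample = {
    "LOCALHOUR": [1.0, 5.0, 9.0],
    "DIGITALITEMCOUNT": [2.0, 2.0, 2.0],
}
for row in compute_signal_metrics(sample, threshold=2.0):
    print(row)
```

Each dictionary corresponds to one row of the output table, with one row per group.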

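For an example of a custom signal component definition and metric computation code, see custom_signal in the azureml-examples repo.

For instructions on registering an Azure Machine Learning component, see Register component in your workspace.

After you create and register your custom signal component in Azure Machine Learning, take the following steps to set up model monitoring: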

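Create a monitoring definition in a YAML file that's similar to the following one. Before you use this definition, adjust the following settings and any others to meet the needs of your environment:

For component_id, use a value in the format azureml:<custom-signal-name>:1.0.0.
For path in the input data section, use a value in the format azureml:<production-data-asset-name>:<version>.
For pre_processing_component:
- If you use the data collector to collect your data, you can omit the pre_processing_component property.
- If you don't use the data collector and want to use a component to preprocess production data, use a value in the format azureml:<custom-preprocessor-name>:<custom-preprocessor-version>.
Under emails, list the email addresses that you want to use for notifications.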

# custom-monitoring.yaml
$schema:  http://azureml/sdk-2-0/Schedule.json
name: my-custom-signal
trigger:
  type: recurrence
  frequency: day # Possible frequency values include "minute", "hour", "day", "week", and "month".
  interval: 7 # With a frequency of "day", 7 runs monitoring every seven days. Use 1 to run it every day.
create_monitor:
  compute:
    instance_type: "standard_e4s_v3"
    runtime_version: "3.3"
  monitoring_signals:
    customSignal:
      type: custom
      component_id: azureml:my_custom_signal:1.0.0
      input_data:
        production_data:
          input_data:
            type: uri_folder
            path: azureml:my_production_data:1
          data_context: test
          data_window:
            lookback_window_size: P30D
            lookback_window_offset: P7D
          pre_processing_component: azureml:custom_preprocessor:1.0.0
      metric_thresholds:
        - metric_name: std_deviation
          threshold: 2
  alert_notification:
    emails:
      - abc@example.com

Run the following command to create the model monitor:
```
az ml schedule create -f ./custom-monitoring.yaml
```
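
The lookback_window_size and lookback_window_offset values in the YAML definition are ISO 8601 durations. The following sketch shows one plausible reading of how they select production data for a run: the window ends one offset before the run time and extends back by the window size. That interpretation is an assumption based on the property names, so check it against the schedule YAML reference. The functions `parse_days` and `data_window` are hypothetical, and the parser handles only durations of the form PnD.

```python
import re
from datetime import datetime, timedelta


def parse_days(duration: str) -> timedelta:
    """Parse an ISO 8601 duration of the form PnD (days only, as in the YAML example)."""
    match = re.fullmatch(r"P(\d+)D", duration)
    if match is None:
        raise ValueError(f"Only PnD durations are handled in this sketch: {duration!r}")
    return timedelta(days=int(match.group(1)))


def data_window(run_time: datetime, size: str, offset: str):
    """Return the (start, end) of the production data window for a monitoring run."""
    # Assumption: the window ends `offset` before the run time...
    end = run_time - parse_days(offset)
    # ...and covers `size` worth of data before that point.
    start = end - parse_days(size)
    return start, end


# Values from the YAML definition, with an arbitrary run date.
start, end = data_window(datetime(2025, 5, 8), size="P30D", offset="P7D")
print(start.date(), end.date())
```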

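Interpret monitoring results

After you configure your model monitor and the first run finishes, you can view the results in Azure Machine Learning studio.

In the studio, under Manage, select Monitoring. On the Monitoring page, select the name of your model monitor to see its overview page. This page shows the monitored model, endpoint, and deployment. It also provides detailed information about the configured signals. The following image shows a monitoring overview page that includes data drift and data quality signals.
Look in the Notifications section of the overview page. In this section, you can see the features of each signal that breach the threshold configured for the respective metric.
In the Signals section, select data_drift to see detailed information about the data drift signal. On the details page, you can see the data drift metric value for each numerical and categorical feature that your monitoring configuration includes. If your monitor has more than one run, you see a trend line for each feature.
On the details page, select the name of an individual feature. A detailed view opens that shows the production distribution compared to the reference distribution. You can also use this view to track the feature's drift over time.
Go back to the monitoring overview page. In the Signals section, select data_quality to view detailed information about this signal. On this page, you can see the null value rates, out-of-bounds rates, and data type error rates for each feature that you monitor.

Model monitoring is a continuous process. When you use Azure Machine Learning model monitoring, you can configure multiple monitoring signals to get a broad view of how your models perform in production.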

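Integrate Azure Machine Learning model monitoring with Event Grid

When you use Event Grid, you can configure events that are generated by Azure Machine Learning model monitoring to trigger applications, processes, and CI/CD workflows. You can consume events through various event handlers, such as Azure Event Hubs, Azure Functions, and Azure Logic Apps. When your monitors detect drift, you can take action programmatically, for example by running a machine learning pipeline to retrain a model and redeploy it.

To integrate Azure Machine Learning model monitoring with Event Grid, take the steps in the following sections.

Create a system topic

If you don't have an Event Grid system topic to use for monitoring, create one. For instructions, see Create, view, and manage Event Grid system topics in the Azure portal.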

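Create an event subscription

In the Azure portal, go to your Azure Machine Learning workspace.
Select Events, and then select Event Subscription.
Next to Name, enter a name for your event subscription, such as MonitoringEvent.
Under Event Types, select only Run status changed.

Warning

Select only Run status changed for the event type. Don't select Dataset drift detected, which applies to data drift v1, not to Azure Machine Learning model monitoring.
Select the Filters tab. Under Advanced Filters, select Add new filter, and then enter the following values:
- Under Key, enter data.RunTags.azureml_modelmonitor_threshold_breached.
- Under Operator, select String contains.
- Under Value, enter `has failed due to one or more features violating metric thresholds`.
When you use this filter, events are generated when the run status changes for any monitor within your Azure Machine Learning workspace. The run status can change from completed to failed or from failed to completed.

To filter at the monitoring level, select Add new filter again, and then enter the following values:
- Under Key, enter data.RunTags.azureml_modelmonitor_threshold_breached.
- Under Operator, select String contains.
- Under Value, enter the name of the monitor signal that you want to filter events for, such as credit_card_fraud_monitor_data_drift. The name that you enter must match the name of your monitoring signal. Any signal that you use in filtering should have a name in the format <monitor-name>_<signal-description>, which includes the monitor name and a description of the signal.
Select the Basics tab. Configure the endpoint that you want to serve as your event handler, such as Event Hubs.
Select Create to create the event subscription.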
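
An event handler can apply the same checks in code before it starts a retraining pipeline. The following sketch mirrors the two String contains filters from the preceding steps. The function `should_trigger_retraining` is hypothetical, and the sample event is a made-up payload: it assumes that the run tags appear under data.RunTags, as the filter key suggests, and that the tag value contains both the signal name and the failure message.

```python
BREACH_TAG = "azureml_modelmonitor_threshold_breached"
BREACH_MESSAGE = "has failed due to one or more features violating metric thresholds"


def should_trigger_retraining(event: dict, signal_name: str) -> bool:
    """Mirror the two 'String contains' filters on the threshold-breached run tag."""
    tags = event.get("data", {}).get("RunTags", {})
    tag_value = tags.get(BREACH_TAG, "")
    return BREACH_MESSAGE in tag_value and signal_name in tag_value


# Made-up payload for illustration only; real events carry more fields.
sample_event = {
    "data": {
        "RunTags": {
            BREACH_TAG: "credit_card_fraud_monitor_data_drift " + BREACH_MESSAGE,
        }
    }
}
print(should_trigger_retraining(sample_event, "credit_card_fraud_monitor_data_drift"))
```

Keeping the check in the handler as well as in the subscription guards against events that reach the handler through a less restrictive subscription.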

View events

After you capture events, you can view them on the event handler endpoint page.

You can also view events on the Azure Monitor Metrics tab.

Last updated on 2025-05-08
