工作區模型登錄範例

發行項
03/01/2024

注意

本檔涵蓋工作區模型登錄。 Azure Databricks 建議在 Unity 目錄中使用模型。 Unity 目錄中的模型提供集中式模型控管、跨工作區存取、譜系和部署。未來將會淘汰工作區模型登錄。

此範例說明如何使用工作區模型登錄來建置機器學習應用程式，以預測風力發電廠的每日電力輸出。此範例示範如何：

使用 MLflow 追蹤和記錄模型
向模型登錄註冊模型
描述模型並製作模型版本階段轉換
整合已註冊的模型與生產應用程式
在模型登錄中搜尋和探索模型
封存和刪除模型

本文說明如何使用 MLflow 追蹤和 MLflow 模型登錄 UI 和 API 來執行這些步驟。

如需使用 MLflow 追蹤和登錄 API 執行所有這些步驟的筆記本，請參閱模型登錄範例筆記本。

使用 MLflow 追蹤載入資料集、定型模型和追蹤

在模型登錄中註冊模型之前，您必須先在實驗執行期間定型和記錄模型。本節說明如何載入風力發電廠資料集、定型模型，以及將定型執行記錄至 MLflow。

載入資料集

下列程式碼會載入資料集，其中包含美國中風力發電廠的天氣資料和電源輸出資訊。資料集包含、、和 air temperature 功能，每隔六小時取樣 (一次， 08:00 一次 16:0000:00 在) ，以及數年) 的每日匯總電源輸出 (power 。 wind speedwind direction

import pandas as pd
wind_farm_data = pd.read_csv("https://github.com/dbczumar/model-registry-demo-notebook/raw/master/dataset/windfarm_data.csv", index_col=0)

def get_training_data():
  training_data = pd.DataFrame(wind_farm_data["2014-01-01":"2018-01-01"])
  X = training_data.drop(columns="power")
  y = training_data["power"]
  return X, y

def get_validation_data():
  validation_data = pd.DataFrame(wind_farm_data["2018-01-01":"2019-01-01"])
  X = validation_data.drop(columns="power")
  y = validation_data["power"]
  return X, y

def get_weather_and_forecast():
  format_date = lambda pd_date : pd_date.date().strftime("%Y-%m-%d")
  today = pd.Timestamp('today').normalize()
  week_ago = today - pd.Timedelta(days=5)
  week_later = today + pd.Timedelta(days=5)

  past_power_output = pd.DataFrame(wind_farm_data)[format_date(week_ago):format_date(today)]
  weather_and_forecast = pd.DataFrame(wind_farm_data)[format_date(week_ago):format_date(week_later)]
  if len(weather_and_forecast) < 10:
    past_power_output = pd.DataFrame(wind_farm_data).iloc[-10:-5]
    weather_and_forecast = pd.DataFrame(wind_farm_data).iloc[-10:]

  return weather_and_forecast.drop(columns="power"), past_power_output["power"]

定型模型

下列程式碼會使用 TensorFlow Keras 來定型神經網路，根據資料集中的天氣特徵來預測電源輸出。 MLflow 可用來追蹤模型的超參數、效能計量、原始程式碼和成品。

def train_keras_model(X, y):
  import tensorflow.keras
  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Dense

  model = Sequential()
  model.add(Dense(100, input_shape=(X_train.shape[-1],), activation="relu", name="hidden_layer"))
  model.add(Dense(1))
  model.compile(loss="mse", optimizer="adam")

  model.fit(X_train, y_train, epochs=100, batch_size=64, validation_split=.2)
  return model

import mlflow

X_train, y_train = get_training_data()

with mlflow.start_run():
  # Automatically capture the model's parameters, metrics, artifacts,
  # and source code with the `autolog()` function
  mlflow.tensorflow.autolog()

  train_keras_model(X_train, y_train)
  run_id = mlflow.active_run().info.run_id

使用 MLflow UI 註冊和管理模型

建立新的已註冊模型

按一下 Azure Databricks 筆記本右側提要欄中的[實驗，流覽至 [MLflow 實驗執行] 提要欄位。
找出對應至 TensorFlow Keras 模型定型會話的 MLflow 執行，然後按一下 [ 檢視執行詳細 資料] 圖示，在 MLflow 執行 UI 中加以開啟。
在 MLflow UI 中，向下捲動至 [ 成品] 區段，然後按一下名為 model的目錄。按一下出現的 [ 註冊模型 ] 按鈕。
從下拉式功能表中選取 [ 建立新模型 ]，然後輸入下列模型名稱： power-forecasting-model 。
按一下 [註冊]。這會註冊名為 power-forecasting-model 的新模型，並建立新的模型版本： Version 1 。

幾分鐘後，MLflow UI 會顯示新註冊模型的連結。請遵循此連結，在 MLflow 模型登錄 UI 中開啟新的模型版本。

探索模型登錄 UI

MLflow 模型登錄 UI 中的模型版本頁面提供已註冊預測模型的相關資訊 Version 1 ，包括其作者、建立時間及其目前階段。

模型版本頁面

模型版本頁面也會提供 [來源執行 ] 連結，這會開啟用來在 MLflow Run UI 中建立模型的 MLflow 執行。您可以從 MLflow 執行 UI 存取 [來源 筆記本] 連結，以檢視用來定型模型的 Azure Databricks 筆記本快照集。

來源執行

來源筆記本

若要流覽回 MLflow 模型登錄，請按一下提要欄中的 [ 模型。

產生的 MLflow 模型登錄首頁會顯示 Azure Databricks 工作區中所有已註冊模型的清單，包括其版本和階段。

按一下 power-forecasting-model 連結以開啟已註冊的模型頁面，其中會顯示預測模型的所有版本。

新增模型描述

您可以將描述新增至已註冊的模型和模型版本。已註冊的模型描述適用于記錄適用于多個模型版本的資訊 (例如模型問題和資料集) 的一般概觀。模型版本描述適用于詳細說明特定模型版本的唯一屬性 (，例如用來開發模型的方法和演算法) 。

將高階描述新增至已註冊的電源預測模型。按一下 [ ，然後輸入下列描述：

This model forecasts the power output of a wind farm based on weather data. The weather data consists of three features: wind speed, wind direction, and air temperature.

新增模型描述

按一下 [ 儲存]。
從已註冊的模型頁面按一下 [第 1 版 ] 連結，以巡覽回模型版本頁面。

按一下 [ ，然後輸入下列描述：

This model version was built using TensorFlow Keras. It is a feed-forward neural network with one hidden layer.

新增模型版本描述

按一下 [ 儲存]。

轉換模型版本

MLflow 模型登錄會定義數個模型階段：無、預備、生產及 Archived 。每個階段都有唯一的意義。例如，暫存適用于模型測試，而生產環境則適用于已完成測試或檢閱程式的模型，且已部署到應用程式。

按一下 [ 階段 ] 按鈕以顯示可用模型階段的清單和可用的階段轉換選項。
選取 [轉換至 - > 生產 ]，然後在階段轉換確認視窗中按 [確定 ]，將模型轉換為 生產環境。

模型版本轉換至生產環境之後，目前的階段會顯示在 UI 中，並將專案新增至活動記錄以反映轉換。

MLflow 模型登錄可讓多個模型版本共用相同的階段。依階段參考模型時，模型登錄會使用最新的模型版本， (具有最大版本識別碼的模型版本) 。已註冊的模型頁面會顯示特定模型的所有版本。

已註冊的模型頁面

使用 MLflow API 註冊和管理模型

在本節中：

以程式設計方式定義模型的名稱
註冊模型
使用 API 新增模型和模型版本描述
使用 API 轉換模型版本並擷取詳細資料

以程式設計方式定義模型的名稱

既然模型已註冊並轉換至生產環境，您可以使用 MLflow 程式設計 API 來參考模型。定義已註冊的模型名稱，如下所示：

model_name = "power-forecasting-model"

註冊模型

model_name = get_model_name()

import mlflow

# The default path where the MLflow autologging function stores the TensorFlow Keras model
artifact_path = "model"
model_uri = "runs:/{run_id}/{artifact_path}".format(run_id=run_id, artifact_path=artifact_path)

model_details = mlflow.register_model(model_uri=model_uri, name=model_name)

import time
from mlflow.tracking.client import MlflowClient
from mlflow.entities.model_registry.model_version_status import ModelVersionStatus

# Wait until the model is ready
def wait_until_ready(model_name, model_version):
  client = MlflowClient()
  for _ in range(10):
    model_version_details = client.get_model_version(
      name=model_name,
      version=model_version,
    )
    status = ModelVersionStatus.from_string(model_version_details.status)
    print("Model status: %s" % ModelVersionStatus.to_string(status))
    if status == ModelVersionStatus.READY:
      break
    time.sleep(1)

wait_until_ready(model_details.name, model_details.version)

使用 API 新增模型和模型版本描述

from mlflow.tracking.client import MlflowClient

client = MlflowClient()
client.update_registered_model(
  name=model_details.name,
  description="This model forecasts the power output of a wind farm based on weather data. The weather data consists of three features: wind speed, wind direction, and air temperature."
)

client.update_model_version(
  name=model_details.name,
  version=model_details.version,
  description="This model version was built using TensorFlow Keras. It is a feed-forward neural network with one hidden layer."
)

使用 API 轉換模型版本並擷取詳細資料

client.transition_model_version_stage(
  name=model_details.name,
  version=model_details.version,
  stage='production',
)
model_version_details = client.get_model_version(
  name=model_details.name,
  version=model_details.version,
)
print("The current model stage is: '{stage}'".format(stage=model_version_details.current_stage))

latest_version_info = client.get_latest_versions(model_name, stages=["production"])
latest_production_version = latest_version_info[0].version
print("The latest production version of the model '%s' is '%s'." % (model_name, latest_production_version))

使用 API 載入已註冊模型的版本

MLflow 模型元件會定義函式，以從數個機器學習架構載入模型。例如， mlflow.tensorflow.load_model() 用來載入以 MLflow 格式儲存的 TensorFlow 模型，並 mlflow.sklearn.load_model() 用來載入以 MLflow 格式儲存的 scikit-learn 模型。

這些函式可以從 MLflow 模型登錄載入模型。

import mlflow.pyfunc

model_version_uri = "models:/{model_name}/1".format(model_name=model_name)

print("Loading registered model version from URI: '{model_uri}'".format(model_uri=model_version_uri))
model_version_1 = mlflow.pyfunc.load_model(model_version_uri)

model_production_uri = "models:/{model_name}/production".format(model_name=model_name)

print("Loading registered model version from URI: '{model_uri}'".format(model_uri=model_production_uri))
model_production = mlflow.pyfunc.load_model(model_production_uri)

使用生產模型預測電源輸出

在本節中，生產模型是用來評估風力發電廠的氣象預測資料。應用程式 forecast_power() 會從指定的階段載入最新版的預測模型，並用它來預測接下來五天的電源生產。

def plot(model_name, model_stage, model_version, power_predictions, past_power_output):
  import pandas as pd
  import matplotlib.dates as mdates
  from matplotlib import pyplot as plt
  index = power_predictions.index
  fig = plt.figure(figsize=(11, 7))
  ax = fig.add_subplot(111)
  ax.set_xlabel("Date", size=20, labelpad=20)
  ax.set_ylabel("Power\noutput\n(MW)", size=20, labelpad=60, rotation=0)
  ax.tick_params(axis='both', which='major', labelsize=17)
  ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))
  ax.plot(index[:len(past_power_output)], past_power_output, label="True", color="red", alpha=0.5, linewidth=4)
  ax.plot(index, power_predictions.squeeze(), "--", label="Predicted by '%s'\nin stage '%s' (Version %d)" % (model_name, model_stage, model_version), color="blue", linewidth=3)
  ax.set_ylim(ymin=0, ymax=max(3500, int(max(power_predictions.values) * 1.3)))
  ax.legend(fontsize=14)
  plt.title("Wind farm power output and projections", size=24, pad=20)
  plt.tight_layout()
  display(plt.show())

def forecast_power(model_name, model_stage):
  from mlflow.tracking.client import MlflowClient
  client = MlflowClient()
  model_version = client.get_latest_versions(model_name, stages=[model_stage])[0].version
  model_uri = "models:/{model_name}/{model_stage}".format(model_name=model_name, model_stage=model_stage)
  model = mlflow.pyfunc.load_model(model_uri)
  weather_data, past_power_output = get_weather_and_forecast()
  power_predictions = pd.DataFrame(model.predict(weather_data))
  power_predictions.index = pd.to_datetime(weather_data.index)
  print(power_predictions)
  plot(model_name, model_stage, int(model_version), power_predictions, past_power_output)

建立新的模型版本

傳統機器學習技術也適用于電源預測。下列程式碼會使用 scikit-learn 定型隨機樹系模型，並透過 mlflow.sklearn.log_model() 函式向 MLflow 模型登錄進行註冊。

import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

with mlflow.start_run():
  n_estimators = 300
  mlflow.log_param("n_estimators", n_estimators)

  rand_forest = RandomForestRegressor(n_estimators=n_estimators)
  rand_forest.fit(X_train, y_train)

  val_x, val_y = get_validation_data()
  mse = mean_squared_error(rand_forest.predict(val_x), val_y)
  print("Validation MSE: %d" % mse)
  mlflow.log_metric("mse", mse)

  # Specify the `registered_model_name` parameter of the `mlflow.sklearn.log_model()`
  # function to register the model with the MLflow Model Registry. This automatically
  # creates a new model version
  mlflow.sklearn.log_model(
    sk_model=rand_forest,
    artifact_path="sklearn-model",
    registered_model_name=model_name,
  )

使用 MLflow 模型登錄搜尋擷取新的模型版本識別碼

from mlflow.tracking.client import MlflowClient
client = MlflowClient()

model_version_infos = client.search_model_versions("name = '%s'" % model_name)
new_model_version = max([model_version_info.version for model_version_info in model_version_infos])

wait_until_ready(model_name, new_model_version)

將描述新增至新的模型版本

client.update_model_version(
  name=model_name,
  version=new_model_version,
  description="This model version is a random forest containing 100 decision trees that was trained in scikit-learn."
)

將新的模型版本轉換為預備，並測試模型

將模型部署至生產應用程式之前，最好先在預備環境中進行測試。下列程式碼會將新的模型版本轉換為預備版本，並評估其效能。

client.transition_model_version_stage(
  name=model_name,
  version=new_model_version,
  stage="Staging",
)

forecast_power(model_name, "Staging")

將新的模型版本部署至生產環境

確認新模型版本在預備環境中執行良好之後，下列程式碼會將模型轉換為生產環境，並使用來自預測電源輸出與生產模型區段完全相同的應用程式程式碼來產生電源預測。

client.transition_model_version_stage(
  name=model_name,
  version=new_model_version,
  stage="production",
)

forecast_power(model_name, "production")

現在生產階段有兩個預測模型的模型版本：以 Keras 模型定型的模型版本，以及 scikit-learn 中定型的版本。

產品型號版本

注意

依階段參考模型時，MLflow 模型模型登錄會自動使用最新的生產版本。這可讓您在不變更任何應用程式程式碼的情況下更新生產模型。

封存和刪除模型

當不再使用模型版本時，您可以封存或刪除它。您也可以刪除整個已註冊的模型;這會移除其所有相關聯的模型版本。

`Version 1`電源預測模型的封存

電源預測模型的封存 Version 1 ，因為它不再使用。您可以在 MLflow 模型登錄 UI 中或透過 MLflow API 封存模型。

MLflow UI 中的封存 `Version 1`

若要封存 Version 1 電源預測模型：

在 MLflow 模型登錄 UI 中開啟其對應的模型版本頁面：
按一下 [階段] 按鈕，選取 [ 轉換至 - > 封存]：
在階段轉換確認視窗中按 [確定 ]。

使用 MLflow API 封存 `Version 1`

下列程式碼會 MlflowClient.update_model_version() 使用函式來封存 Version 1 電源預測模型。

from mlflow.tracking.client import MlflowClient

client = MlflowClient()
client.transition_model_version_stage(
  name=model_name,
  version=1,
  stage="Archived",
)

刪除 `Version 1` 電源預測模型

您也可以使用 MLflow UI 或 MLflow API 來刪除模型版本。

警告

模型版本刪除是永久的，無法復原。

`Version 1`在 MLflow UI 中刪除

若要刪除 Version 1 電源預測模型：

在 MLflow 模型登錄 UI 中開啟其對應的模型版本頁面。
選取版本識別碼旁邊的下拉式箭號，然後按一下 [ 刪除]。

使用 MLflow API 刪除 `Version 1`

client.delete_model_version(
   name=model_name,
   version=1,
)

使用 MLflow API 刪除模型

您必須先將所有剩餘的模型版本階段轉換為 None 或 Archived。

from mlflow.tracking.client import MlflowClient

client = MlflowClient()
client.transition_model_version_stage(
  name=model_name,
  version=2,
  stage="Archived",
)

client.delete_registered_model(name=model_name)

筆記本

MLflow 模型登錄範例筆記本

取得筆記本

共用方式為

工作區模型登錄範例

使用 MLflow 追蹤載入資料集、定型模型和追蹤

載入資料集

定型模型

使用 MLflow UI 註冊和管理模型

在本節中：

建立新的已註冊模型

探索模型登錄 UI

新增模型描述

轉換模型版本

使用 MLflow API 註冊和管理模型

在本節中：

以程式設計方式定義模型的名稱

註冊模型

使用 API 新增模型和模型版本描述

使用 API 轉換模型版本並擷取詳細資料

使用 API 載入已註冊模型的版本

使用生產模型預測電源輸出

建立新的模型版本

使用 MLflow 模型登錄搜尋擷取新的模型版本識別碼

將描述新增至新的模型版本

將新的模型版本轉換為預備，並測試模型

將新的模型版本部署至生產環境

封存和刪除模型

`Version 1`電源預測模型的封存

MLflow UI 中的封存 `Version 1`

使用 MLflow API 封存 `Version 1`

刪除 `Version 1` 電源預測模型

`Version 1`在 MLflow UI 中刪除

使用 MLflow API 刪除 `Version 1`

使用 MLflow API 刪除模型

筆記本

MLflow 模型登錄範例筆記本

其他資源

共用方式為

工作區模型登錄範例

使用 MLflow 追蹤載入資料集、定型模型和追蹤

載入資料集

定型模型

使用 MLflow UI 註冊和管理模型

在本節中：

建立新的已註冊模型

探索模型登錄 UI

新增模型描述

轉換模型版本

使用 MLflow API 註冊和管理模型

在本節中：

以程式設計方式定義模型的名稱

註冊模型

使用 API 新增模型和模型版本描述

使用 API 轉換模型版本並擷取詳細資料

使用 API 載入已註冊模型的版本

使用生產模型預測電源輸出

建立新的模型版本

使用 MLflow 模型登錄搜尋擷取新的模型版本識別碼

將描述新增至新的模型版本

將新的模型版本轉換為預備，並測試模型

將新的模型版本部署至生產環境

封存和刪除模型

Version 1電源預測模型的封存

MLflow UI 中的封存 Version 1

使用 MLflow API 封存 Version 1

刪除 Version 1 電源預測模型

Version 1在 MLflow UI 中刪除

使用 MLflow API 刪除 Version 1

使用 MLflow API 刪除模型

筆記本

MLflow 模型登錄範例筆記本

其他資源

`Version 1`電源預測模型的封存

MLflow UI 中的封存 `Version 1`

使用 MLflow API 封存 `Version 1`

刪除 `Version 1` 電源預測模型

`Version 1`在 MLflow UI 中刪除

使用 MLflow API 刪除 `Version 1`