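Adapt your R script to run in production

Article
10/16/2024

This article explains how to take an existing R script and make the appropriate changes to run it as a job in Azure Machine Learning.

You'll have to make most, if not all, of the changes described in detail in this article.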

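Remove user interaction

Your R script must be designed to run unattended and will be executed via the Rscript command within the container. Make sure you remove any interactive inputs or outputs from the script.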
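As a rough illustration of what removing interaction can look like, the sketch below replaces a prompt and an on-screen viewer with a fixed value and a file write. The variable names and the preview file are hypothetical and not part of the original script; in a real job the value would come from a parsed command-line argument.

```r
# Interactive version (would hang or fail under Rscript in a container):
#   brand <- as.numeric(readline("Brand: "))
#   View(df)

# Unattended version: take the value from a parsed argument instead of a
# prompt. It is hard-coded here only to keep the sketch self-contained.
brand <- 1

# Write results to a file instead of opening a viewer or on-screen device.
df <- data.frame(x = 1:3, y = c(2, 4, 6))
out_file <- file.path(tempdir(), "preview.csv")  # hypothetical output location
write.csv(df, out_file, row.names = FALSE)
message("Wrote preview to ", out_file)
```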

Add parsing

If your script requires any sort of input parameter (most scripts do), pass the inputs into the script via the Rscript call.

Rscript <name-of-r-script>.R
--data_file ${{inputs.<name-of-yaml-input-1>}} 
--brand ${{inputs.<name-of-yaml-input-2>}}

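In your R script, parse the inputs and make the proper type conversions. We recommend that you use the optparse package.

The following snippet shows how to:

Initiate the parser
Add all your inputs as options
Parse the inputs with the appropriate data types

You can also add defaults, which are handy for testing. We recommend that you add an --output parameter with a default value of ./outputs so that any output of the script will be stored.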

library(optparse)

parser <- OptionParser()

parser <- add_option(
  parser,
  "--output",
  type = "character",
  action = "store",
  default = "./outputs"
)

parser <- add_option(
  parser,
  "--data_file",
  type = "character",
  action = "store",
  default = "data/myfile.csv"
)

parser <- add_option(
  parser,
  "--brand",
  type = "double",
  action = "store",
  default = 1
)
args <- parse_args(parser)

args is a named list. You can use any of these parameters later in your script.
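To check the parser without going through Rscript, you can hand parse_args an explicit argument vector. The following is a minimal sketch that assumes the same option names as above; the value 2.5 is made up for illustration.

```r
library(optparse)

parser <- OptionParser()
parser <- add_option(parser, "--data_file", type = "character",
                     action = "store", default = "data/myfile.csv")
parser <- add_option(parser, "--brand", type = "double",
                     action = "store", default = 1)

# Simulate `Rscript script.R --brand 2.5` by passing the arguments directly.
args <- parse_args(parser, args = c("--brand", "2.5"))

# Options are accessed by name; --brand was converted to a number,
# and --data_file fell back to its default.
message("brand = ", args$brand, ", data_file = ", args$data_file)
```

Because the type conversion happens in the parser, the rest of the script can use args$brand as a number without calling as.numeric.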

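Source the `azureml_utils.R` helper script

You must source a helper script called azureml_utils.R in the same working directory as the R script that will be run. The helper script is required for the running R script to be able to communicate with the MLflow server. The helper script provides a method to continuously retrieve the authentication token, since the token changes quickly in a running job. The helper script also allows you to use the logging functions provided in the R MLflow API to log models, parameters, tags, and general artifacts.

Create your file, azureml_utils.R, with this code: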

# Azure ML utility to enable usage of the MLflow R API for tracking with Azure Machine Learning (Azure ML). This utility does the following:
# 1. Understands Azure ML MLflow tracking url by extending OSS MLflow R client.
# 2. Manages Azure ML Token refresh for remote runs (runs that execute in Azure Machine Learning). It uses the tcltk2 R library to schedule token refresh.
#    Token refresh interval can be controlled by setting the environment variable MLFLOW_TOKEN_REFRESH_INTERVAL_SECONDS and defaults to 30 seconds.

library(mlflow)
library(httr)
library(later)
library(tcltk2)

new_mlflow_client.mlflow_azureml <- function(tracking_uri) {
  host <- paste("https", tracking_uri$path, sep = "://")
  get_host_creds <- function () {
    mlflow:::new_mlflow_host_creds(
      host = host,
      token = Sys.getenv("MLFLOW_TRACKING_TOKEN"),
      username = Sys.getenv("MLFLOW_TRACKING_USERNAME", NA),
      password = Sys.getenv("MLFLOW_TRACKING_PASSWORD", NA),
      insecure = Sys.getenv("MLFLOW_TRACKING_INSECURE", NA)
    )
  }
  cli_env <- function() {
    creds <- get_host_creds()
    res <- list(
      MLFLOW_TRACKING_USERNAME = creds$username,
      MLFLOW_TRACKING_PASSWORD = creds$password,
      MLFLOW_TRACKING_TOKEN = creds$token,
      MLFLOW_TRACKING_INSECURE = creds$insecure
    )
    res[!is.na(res)]
  }
  mlflow:::new_mlflow_client_impl(get_host_creds, cli_env, class = "mlflow_azureml_client")
}

get_auth_header <- function() {
    headers <- list()
    auth_token <- Sys.getenv("MLFLOW_TRACKING_TOKEN")
    auth_header <- paste("Bearer", auth_token, sep = " ")
    headers$Authorization <- auth_header
    headers
}

get_token <- function(host, exp_id, run_id) {
    req_headers <- do.call(httr::add_headers, get_auth_header())
    token_host <- gsub("mlflow/v1.0","history/v1.0", host)
    token_host <- gsub("azureml://","https://", token_host)
    api_url <- paste0(token_host, "/experimentids/", exp_id, "/runs/", run_id, "/token")
    GET( api_url, timeout(getOption("mlflow.rest.timeout", 30)), req_headers)
}


fetch_token_from_aml <- function() {
    message("Refreshing token")
    tracking_uri <- Sys.getenv("MLFLOW_TRACKING_URI")
    exp_id <- Sys.getenv("MLFLOW_EXPERIMENT_ID")
    run_id <- Sys.getenv("MLFLOW_RUN_ID")
    sleep_for <- 1
    time_left <- 30
    response <- get_token(tracking_uri, exp_id, run_id)
    while (response$status_code == 429 && time_left > 0) {
        time_left <- time_left - sleep_for
        warning(paste("Request returned with status code 429 (Rate limit exceeded). Retrying after ",
                    sleep_for, " seconds. Will continue to retry 429s for up to ", time_left,
                    " seconds.", sep = ""))
        Sys.sleep(sleep_for)
        sleep_for <- min(time_left, sleep_for * 2)
        # Retry with the same arguments as the first call, including run_id
        response <- get_token(tracking_uri, exp_id, run_id)
    }

    if (response$status_code != 200){
        # str() prints and returns NULL, so report the status code instead
        error_response <- paste("Error fetching token, will try again after some time. Status code:", response$status_code)
        warning(error_response)
    }

    if (response$status_code == 200){
        text <- content(response, "text", encoding = "UTF-8")
        json_resp <-jsonlite::fromJSON(text, simplifyVector = FALSE)
        json_resp$token
        Sys.setenv(MLFLOW_TRACKING_TOKEN = json_resp$token)
        message("Refreshing token done")
    }
}

clean_tracking_uri <- function() {
    tracking_uri <- httr::parse_url(Sys.getenv("MLFLOW_TRACKING_URI"))
    tracking_uri$query = ""
    tracking_uri <-httr::build_url(tracking_uri)
    Sys.setenv(MLFLOW_TRACKING_URI = tracking_uri)
}

clean_tracking_uri()
tcltk2::tclTaskSchedule(as.integer(Sys.getenv("MLFLOW_TOKEN_REFRESH_INTERVAL_SECONDS", 30))*1000, fetch_token_from_aml(), id = "fetch_token_from_aml", redo = TRUE)

# Set MLFlow related env vars
Sys.setenv(MLFLOW_BIN = system("which mlflow", intern = TRUE))
Sys.setenv(MLFLOW_PYTHON_BIN = system("which python", intern = TRUE))

Start your R script with the following line:

source("azureml_utils.R")

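Read data files as local files

When you run an R script as a job, Azure Machine Learning takes the data you specify in the job submission and mounts it on the running container. Therefore, you can read the data files as if they were local files on the running container.

Make sure your source data is registered as a data asset
Pass the data asset by name in the job submission parameters
Read the files as you normally would read a local file

Define your input parameter as shown in the Add parsing section. Use the --data_file parameter to specify the whole path, so that you can use read_csv(args$data_file) to read the data asset.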
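The sketch below shows the read step in isolation. It stands in for the mounted data asset with a temporary CSV file, and it builds args by hand rather than with optparse; both are assumptions made to keep the example self-contained. Base read.csv is used so the sketch has no package dependencies, and readr's read_csv takes the path the same way.

```r
# Stand-in for the parsed arguments; in a job, data_file would point to the
# mounted data asset passed on the Rscript command line.
args <- list(data_file = file.path(tempdir(), "myfile.csv"))

# Create a small file so the example can run anywhere (illustrative data only).
write.csv(data.frame(brand = c(1, 2), sales = c(10.5, 20)),
          args$data_file, row.names = FALSE)

# Fail early with a clear message if the path is wrong or the mount is missing.
if (!file.exists(args$data_file)) {
  stop("Data file not found: ", args$data_file)
}

df <- read.csv(args$data_file)
message("Read ", nrow(df), " rows from ", args$data_file)
```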

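Save job artifacts (images, data, etc.)

Important

This section doesn't apply to models. See the following two sections for model-specific saving and logging instructions.

You can store arbitrary script outputs, such as data files, images, and serialized R objects, that are generated by the R script in Azure Machine Learning. Create a ./outputs directory to store any generated artifacts (images, models, data, etc.). Any files saved to ./outputs are automatically included in the run and uploaded to the experiment at the end of the run. Since you added a default value for the --output parameter in the input parameters section, include the following code snippet in your R script to create the output directory.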

if (!dir.exists(args$output)) {
  dir.create(args$output)
}

After you create the directory, save your artifacts to that directory. For example:

# create and save a plot
library(ggplot2)

myplot <- ggplot(...)

ggsave(myplot, 
       filename = file.path(args$output,"forecast-plot.png"))


# save an rds serialized object
saveRDS(myobject, file = file.path(args$output,"myobject.rds"))

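`crate` your models with the `carrier` package

The R MLflow API documentation specifies that your R models need to be of the crate model flavor.

If your R script trains a model and produces a model object, you need to crate it to be able to deploy it later with Azure Machine Learning.
When you use the crate function, use explicit namespaces when calling any package function you need.

Suppose you have a time series model object called my_ts_model that was created with the fable package. To make this model callable when it's deployed, create a crate in which you pass in the model object and a forecasting horizon in number of periods: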

library(carrier)
crated_model <- crate(function(x)
{
  fabletools::forecast(!!my_ts_model, h = x)
})

The crated_model object is the one you'll log.
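The same pattern can be tried without fable. The following sketch swaps in a plain lm model on the built-in mtcars data purely for illustration; the names fit and crated_lm are hypothetical. It shows the two points made above: the model is unquoted into the crate with !!, and the package function is called with an explicit namespace.

```r
library(carrier)

# A small stand-in model (not the fable model from the text).
fit <- lm(mpg ~ wt, data = mtcars)

# !!fit copies the model object into the crate; stats::predict is namespaced
# because a crate does not see packages attached in the calling session.
crated_lm <- crate(function(new_data) {
  stats::predict(!!fit, newdata = new_data)
})

# The crate is an ordinary function that carries its own data.
pred <- crated_lm(data.frame(wt = 3))
message("Predicted mpg: ", round(pred, 2))
```

Calling the crate locally like this is a quick way to catch a missing namespace or a forgotten !! before the model is logged.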

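Log models, parameters, tags, or other artifacts with the R MLflow API

In addition to saving any generated artifacts, you can also log models, tags, and parameters for each run. Use the R MLflow API to do so.

When you log a model, you log the crated model that you created as described in the previous section.

Note

When you log a model, the model is also saved and added to the run artifacts. You don't need to explicitly save a model unless you didn't log it.

To log a model, a parameter, or both:

Start the run with mlflow_start_run()
Log artifacts with mlflow_log_model, mlflow_log_param, or mlflow_log_batch
Don't end the run with mlflow_end_run(). Skip this call, because it currently causes an error.

For example, to log the crated_model object created in the previous section, include the following code in your R script:

Tip

Use models as the value for artifact_path when you log a model. This is a best practice, even though you can name it something else.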

mlflow_start_run()

mlflow_log_model(
  model = crated_model, # the crate model object
  artifact_path = "models" # a path to save the model object to
  )

mlflow_log_param(<key-name>, <value>)

# mlflow_end_run() - causes an error, do not include mlflow_end_run()

Script structure and example

Use these code snippets as a guide to structure your R script, following all the changes outlined in this article.

# BEGIN R SCRIPT

# source the azureml_utils.R script which is needed to use the MLflow back end
# with R
source("azureml_utils.R")

# load your packages here. Make sure that they are installed in the container.
library(...)

# parse the command line arguments.
library(optparse)

parser <- OptionParser()

parser <- add_option(
  parser,
  "--output",
  type = "character",
  action = "store",
  default = "./outputs"
)

parser <- add_option(
  parser,
  "--data_file",
  type = "character",
  action = "store",
  default = "data/myfile.csv"
)

parser <- add_option(
  parser,
  "--brand",
  type = "double",
  action = "store",
  default = 1
)
args <- parse_args(parser)

# your own R code goes here
# - model building/training
# - visualizations
# - etc.

# create the ./outputs directory
if (!dir.exists(args$output)) {
  dir.create(args$output)
}

# log models and parameters to MLflow
mlflow_start_run()

mlflow_log_model(
  model = crated_model, # the crate model object
  artifact_path = "models" # a path to save the model object to
  )

mlflow_log_param(<key-name>, <value>)

# mlflow_end_run() - causes an error, do not include mlflow_end_run()
## END OF R SCRIPT

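Create an environment

To run your R script, you'll use the ml extension for the Azure CLI, also referred to as CLI v2. The ml command uses a YAML job definition file. For more information about submitting jobs with az ml, see Train models with Azure Machine Learning CLI.

The YAML job file specifies an environment. You need to create this environment in your workspace before you can run the job.

You can create the environment in Azure Machine Learning studio or with the Azure CLI.

Whichever method you use, you'll use a Dockerfile. All Docker context files for R environments must have the following specification to work on Azure Machine Learning: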

FROM rocker/tidyverse:latest

# Install python
RUN apt-get update -qq && \
 apt-get install -y python3-pip tcl tk libz-dev libpng-dev

RUN ln -f /usr/bin/python3 /usr/bin/python
RUN ln -f /usr/bin/pip3 /usr/bin/pip
RUN pip install -U pip

# Install azureml-MLflow
RUN pip install azureml-MLflow
RUN pip install MLflow

# Create link for python
RUN ln -f /usr/bin/python3 /usr/bin/python

# Install R packages required for logging with MLflow (these are necessary)
RUN R -e "install.packages('mlflow', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('carrier', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('optparse', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('tcltk2', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"

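The base image is rocker/tidyverse:latest, which has many R packages and their dependencies already installed.

Important

You must install, ahead of time, any R packages that your script needs to run. Add more lines to the Docker context file as needed.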

RUN R -e "install.packages('<package-to-install>', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"

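Additional suggestions

Some additional suggestions you might want to consider:

Use R's tryCatch function for exception and error handling
Add explicit logging for troubleshooting and debugging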
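Both suggestions can be combined in a small wrapper. The sketch below is one possible shape, not a prescribed pattern: safe_read is a hypothetical helper, and the missing file is used only to trigger the error path. Messages written with message() go to standard error, which typically ends up in the job's log output.

```r
# Hypothetical helper: read a CSV, log what happens, and return NULL on failure.
safe_read <- function(path) {
  tryCatch(
    {
      message("Reading ", path)
      read.csv(path)
    },
    error = function(e) {
      # Log the underlying error text so the failure is easy to find later.
      message("Failed to read ", path, ": ", conditionMessage(e))
      NULL
    }
  )
}

# The file does not exist, so the error handler runs and NULL is returned.
result <- safe_read(file.path(tempdir(), "does-not-exist.csv"))

if (is.null(result)) {
  message("No data loaded; a real job might call stop() here to fail the run.")
}
```

Returning NULL keeps the sketch runnable; in a production job it is usually better to log the problem and then stop, so that the run is marked as failed rather than continuing without data.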

Next steps

How to train R models in Azure Machine Learning
