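Adapt your R script to run in production

Article
02/03/2024

This article explains how to take an existing R script and make the appropriate changes to run it as a job in Azure Machine Learning.

You'll have to make most, if not all, of the changes described in detail in this article.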

Remove user interaction

Your R script must be designed to run unattended. It is executed via the Rscript command within a container. Remove any interactive inputs or outputs from the script.
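
As one way to handle code that only makes sense in an interactive session, the following minimal sketch guards it with base R's interactive(), which returns FALSE under Rscript. The helper name show_preview is hypothetical and not part of any Azure Machine Learning API.

```r
# Hypothetical helper: open a viewer only in an interactive session,
# and print a short summary to the log when running unattended.
show_preview <- function(df) {
  if (interactive()) {
    utils::View(df)  # skipped under Rscript
  } else {
    message("Rows: ", nrow(df), ", columns: ", ncol(df))
  }
  invisible(df)
}

show_preview(mtcars)
```

Calls such as readline() or file.choose() have no unattended equivalent and are better replaced with command-line parameters, as described in the next section.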

Add parsing

If your script requires any sort of input parameter (most scripts do), pass the inputs into the script via the Rscript call.

Rscript <name-of-r-script>.R
--data_file ${{inputs.<name-of-yaml-input-1>}} 
--brand ${{inputs.<name-of-yaml-input-2>}}

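In your R script, parse the inputs and make the proper type conversions. We recommend that you use the optparse package.

The following snippet shows how to:

Initiate the parser
Add all your inputs as options
Parse the inputs with the appropriate data types

You can also add defaults, which are handy for testing. We recommend that you add an --output parameter with a default value of ./outputs so that any output of the script is stored.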

library(optparse)

parser <- OptionParser()

parser <- add_option(
  parser,
  "--output",
  type = "character",
  action = "store",
  default = "./outputs"
)

parser <- add_option(
  parser,
  "--data_file",
  type = "character",
  action = "store",
  default = "data/myfile.csv"
)

parser <- add_option(
  parser,
  "--brand",
  type = "double",
  action = "store",
  default = 1
)
args <- parse_args(parser)

args is a named list. You can use any of these parameters later in your script.
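
To check the parser without submitting a job, you can pass an explicit character vector to parse_args() through its args argument instead of reading the real command line. The sketch below assumes the optparse package is installed; the value "3" is an arbitrary test input.

```r
library(optparse)

parser <- OptionParser()
parser <- add_option(parser, "--data_file", type = "character",
                     action = "store", default = "data/myfile.csv")
parser <- add_option(parser, "--brand", type = "double",
                     action = "store", default = 1)

# Simulate the call `Rscript script.R --brand 3`.
test_args <- parse_args(parser, args = c("--brand", "3"))

test_args$brand      # converted from the string "3" to a number
test_args$data_file  # not supplied, so the default is used
```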

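Source the `azureml_utils.R` helper script

You must source a helper script called azureml_utils.R, located in the same working directory as the R script that will be run. The running R script needs the helper script to communicate with the MLflow server. Because the authentication token changes quickly in a running job, the helper script provides a method to retrieve it continuously. The helper script also lets you use the logging functions provided in the R MLflow API to log models, parameters, tags, and general artifacts.

Create your file, azureml_utils.R, with this code: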

# Azure ML utility to enable usage of the MLflow R API for tracking with Azure Machine Learning (Azure ML). This utility does the following:
# 1. Understands Azure ML MLflow tracking url by extending OSS MLflow R client.
# 2. Manages Azure ML Token refresh for remote runs (runs that execute in Azure Machine Learning). It uses tcltk2 R library to schedule token refresh.
#    Token refresh interval can be controlled by setting the environment variable MLFLOW_TOKEN_REFRESH_INTERVAL_SECONDS and defaults to 30 seconds.

library(mlflow)
library(httr)
library(later)
library(tcltk2)

new_mlflow_client.mlflow_azureml <- function(tracking_uri) {
  host <- paste("https", tracking_uri$path, sep = "://")
  get_host_creds <- function () {
    mlflow:::new_mlflow_host_creds(
      host = host,
      token = Sys.getenv("MLFLOW_TRACKING_TOKEN"),
      username = Sys.getenv("MLFLOW_TRACKING_USERNAME", NA),
      password = Sys.getenv("MLFLOW_TRACKING_PASSWORD", NA),
      insecure = Sys.getenv("MLFLOW_TRACKING_INSECURE", NA)
    )
  }
  cli_env <- function() {
    creds <- get_host_creds()
    res <- list(
      MLFLOW_TRACKING_USERNAME = creds$username,
      MLFLOW_TRACKING_PASSWORD = creds$password,
      MLFLOW_TRACKING_TOKEN = creds$token,
      MLFLOW_TRACKING_INSECURE = creds$insecure
    )
    res[!is.na(res)]
  }
  mlflow:::new_mlflow_client_impl(get_host_creds, cli_env, class = "mlflow_azureml_client")
}

get_auth_header <- function() {
    headers <- list()
    auth_token <- Sys.getenv("MLFLOW_TRACKING_TOKEN")
    auth_header <- paste("Bearer", auth_token, sep = " ")
    headers$Authorization <- auth_header
    headers
}

get_token <- function(host, exp_id, run_id) {
    req_headers <- do.call(httr::add_headers, get_auth_header())
    token_host <- gsub("mlflow/v1.0","history/v1.0", host)
    token_host <- gsub("azureml://","https://", token_host)
    api_url <- paste0(token_host, "/experimentids/", exp_id, "/runs/", run_id, "/token")
    GET( api_url, timeout(getOption("mlflow.rest.timeout", 30)), req_headers)
}


fetch_token_from_aml <- function() {
    message("Refreshing token")
    tracking_uri <- Sys.getenv("MLFLOW_TRACKING_URI")
    exp_id <- Sys.getenv("MLFLOW_EXPERIMENT_ID")
    run_id <- Sys.getenv("MLFLOW_RUN_ID")
    sleep_for <- 1
    time_left <- 30
    response <- get_token(tracking_uri, exp_id, run_id)
    while (response$status_code == 429 && time_left > 0) {
        time_left <- time_left - sleep_for
        warning(paste("Request returned with status code 429 (Rate limit exceeded). Retrying after ",
                    sleep_for, " seconds. Will continue to retry 429s for up to ", time_left,
                    " second.", sep = ""))
        Sys.sleep(sleep_for)
        sleep_for <- min(time_left, sleep_for * 2)
        response <- get_token(tracking_uri, exp_id, run_id)
    }

    if (response$status_code != 200){
        # str() prints and returns NULL, so report the status code in the warning instead
        error_response <- paste("Error fetching token, will try again after some time. Status code:", response$status_code)
        warning(error_response)
    }

    if (response$status_code == 200){
        text <- content(response, "text", encoding = "UTF-8")
        json_resp <-jsonlite::fromJSON(text, simplifyVector = FALSE)
        json_resp$token
        Sys.setenv(MLFLOW_TRACKING_TOKEN = json_resp$token)
        message("Refreshing token done")
    }
}

clean_tracking_uri <- function() {
    tracking_uri <- httr::parse_url(Sys.getenv("MLFLOW_TRACKING_URI"))
    tracking_uri$query = ""
    tracking_uri <-httr::build_url(tracking_uri)
    Sys.setenv(MLFLOW_TRACKING_URI = tracking_uri)
}

clean_tracking_uri()
tcltk2::tclTaskSchedule(as.integer(Sys.getenv("MLFLOW_TOKEN_REFRESH_INTERVAL_SECONDS", 30))*1000, fetch_token_from_aml(), id = "fetch_token_from_aml", redo = TRUE)

# Set MLFlow related env vars
Sys.setenv(MLFLOW_BIN = system("which mlflow", intern = TRUE))
Sys.setenv(MLFLOW_PYTHON_BIN = system("which python", intern = TRUE))

Start your R script with the following line:

source("azureml_utils.R")
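
The retry loop in fetch_token_from_aml doubles its wait after each 429 response while keeping the total wait within a 30-second budget. The sketch below reproduces only that arithmetic, without sleeping or calling any service, under the assumption that every response is a 429. The function name backoff_schedule is hypothetical and is not part of the helper script.

```r
# Reproduce the sequence of waits (in seconds) used by the retry loop,
# assuming the service keeps answering with status code 429.
backoff_schedule <- function(time_left = 30, sleep_for = 1) {
  waits <- numeric(0)
  while (time_left > 0) {
    time_left <- time_left - sleep_for
    waits <- c(waits, sleep_for)
    # Double the wait, but never exceed the remaining budget.
    sleep_for <- min(time_left, sleep_for * 2)
  }
  waits
}

backoff_schedule()  # 1 2 4 8 15
```

With the defaults, the waits add up to exactly 30 seconds, after which the helper gives up until the next scheduled refresh.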

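Read data files as local files

When you run an R script as a job, Azure Machine Learning takes the data you specify in the job submission and mounts it on the running container. You can therefore read the data files as if they were local files on the running container.

Make sure your source data is registered as a data asset
Pass the data asset by name in the job submission parameters
Read the files as you would normally read a local file

Define the input parameter as shown in the Add parsing section. Use the data_file parameter to specify the whole path, so that you can call read_csv(args$data_file) to read the data asset.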
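
The following minimal sketch shows the read step with an early existence check. It uses base R's read.csv rather than readr's read_csv so that it runs without extra packages, and it builds a temporary CSV file and a stand-in args list purely for illustration. In a real job, args comes from parse_args() and the path points at the mounted data asset.

```r
# Illustration only: create a small CSV file and a stand-in for the parsed arguments.
sample_path <- tempfile(fileext = ".csv")
write.csv(data.frame(brand = c(1, 2), sales = c(10.5, 20.25)),
          sample_path, row.names = FALSE)
args <- list(data_file = sample_path)

# Fail early with a clear message if the mounted path is missing.
if (!file.exists(args$data_file)) {
  stop("Input file not found: ", args$data_file)
}

sales_data <- read.csv(args$data_file)
str(sales_data)
```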

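Save job artifacts (images, data, etc.)

Important

This section does not apply to models. See the following two sections for model-specific saving and logging instructions.

You can store arbitrary script outputs, such as data files, images, and serialized R objects, that are generated by the R script in Azure Machine Learning. Create a ./outputs directory to store any generated artifacts (images, models, data, etc.). Any files saved to ./outputs are automatically included in the run and uploaded to the experiment at the end of the run. Because you added a default value for the --output parameter in the Add parsing section, include the following code snippet in your R script to create the output directory.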

if (!dir.exists(args$output)) {
  dir.create(args$output)
}

After you create the directory, save your artifacts to that directory. For example:

# create and save a plot
library(ggplot2)

myplot <- ggplot(...)

ggsave(myplot, 
       filename = file.path(args$output,"forecast-plot.png"))


# save an rds serialized object
saveRDS(myobject, file = file.path(args$output,"myobject.rds"))

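`crate` your models with the `carrier` package

The R MLflow API documentation specifies that your R models need to be of the crate model flavor.

If your R script trains a model and produces a model object, you need to crate it so that you can deploy it later with Azure Machine Learning.
When you use the crate function, use explicit namespaces when calling any package function you need.

Suppose you have a time series model object called my_ts_model, created with the fable package. To make this model callable when it's deployed, create a crate that captures the model object and takes a forecasting horizon in number of periods: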

library(carrier)
crated_model <- crate(function(x)
{
  fabletools::forecast(!!my_ts_model, h = x)
})

The crated_model object is the one you'll log.
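
Because my_ts_model is not defined in this article, here is a self-contained sketch of the same pattern using a linear model fitted on the built-in mtcars data set. It assumes the carrier package is installed. The names fit and crated_lm are illustrative only.

```r
library(carrier)

# A small model to capture; any fitted model object works the same way.
fit <- lm(mpg ~ wt, data = mtcars)

# `!!fit` copies the model into the crate, and `stats::predict` is called
# with an explicit namespace so that the crate does not depend on the
# calling environment.
crated_lm <- crate(function(new_data) {
  stats::predict(!!fit, newdata = new_data)
})

# The crate is an ordinary function that can be called on new data.
crated_lm(data.frame(wt = 3))
```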

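Log models, parameters, tags, or other artifacts with the R MLflow API

In addition to saving any generated artifacts, you can also log models, tags, and parameters for each run. Use the R MLflow API to do so.

When you log a model, you log the crated model that you created as described in the previous section.

Note

When you log a model, the model is also saved and added to the run artifacts. You only need to save a model explicitly if you don't log it.

To log a model, a parameter, or both:

Start the run with mlflow_start_run()
Log artifacts with mlflow_log_model, mlflow_log_param, or mlflow_log_batch
Do not end the run with mlflow_end_run(). Skip this call, because it currently causes an error.

For example, to log the crated_model object created in the previous section, include the following code in your R script:

Tip

As a best practice, use models as the value for artifact_path when you log a model, even though you can give it another name.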

mlflow_start_run()

mlflow_log_model(
  model = crated_model, # the crate model object
  artifact_path = "models" # a path to save the model object to
  )

mlflow_log_param(<key-name>, <value>)

# mlflow_end_run() - causes an error, do not include mlflow_end_run()

Script structure and example

Use these code snippets as a guide to structure your R script, following all the changes outlined in this article.

# BEGIN R SCRIPT

# source the azureml_utils.R script which is needed to use the MLflow back end
# with R
source("azureml_utils.R")

# load your packages here. Make sure that they are installed in the container.
library(...)

# parse the command line arguments.
library(optparse)

parser <- OptionParser()

parser <- add_option(
  parser,
  "--output",
  type = "character",
  action = "store",
  default = "./outputs"
)

parser <- add_option(
  parser,
  "--data_file",
  type = "character",
  action = "store",
  default = "data/myfile.csv"
)

parser <- add_option(
  parser,
  "--brand",
  type = "double",
  action = "store",
  default = 1
)
args <- parse_args(parser)

# your own R code goes here
# - model building/training
# - visualizations
# - etc.

# create the ./outputs directory
if (!dir.exists(args$output)) {
  dir.create(args$output)
}

# log models and parameters to MLflow
mlflow_start_run()

mlflow_log_model(
  model = crated_model, # the crate model object
  artifact_path = "models" # a path to save the model object to
  )

mlflow_log_param(<key-name>, <value>)

# mlflow_end_run() - causes an error, do not include mlflow_end_run()
## END OF R SCRIPT

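Create an environment

To run your R script, you use the ml extension for the Azure CLI, also referred to as CLI v2. The ml command uses a YAML job definition file. For more information about submitting jobs with az ml, see Train models with Azure Machine Learning CLI.

The YAML job file specifies an environment. You need to create this environment in your workspace before you can run the job.

You can create the environment in Azure Machine Learning studio or with the Azure CLI.

Whichever method you use, you'll use a Dockerfile. All Docker context files for R environments must include the following specification to work on Azure Machine Learning: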

FROM rocker/tidyverse:latest

# Install python
RUN apt-get update -qq && \
 apt-get install -y python3-pip tcl tk libz-dev libpng-dev

RUN ln -f /usr/bin/python3 /usr/bin/python
RUN ln -f /usr/bin/pip3 /usr/bin/pip
RUN pip install -U pip

# Install azureml-mlflow
RUN pip install azureml-mlflow
RUN pip install mlflow

# Create link for python
RUN ln -f /usr/bin/python3 /usr/bin/python

# Install R packages required for logging with MLflow (these are necessary)
RUN R -e "install.packages('mlflow', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('carrier', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('optparse', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('tcltk2', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"

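The base image is rocker/tidyverse:latest, which has many R packages and their dependencies already installed.

Important

You must install in advance any R packages that your script needs to run. Add more lines to the Docker context file as needed.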

RUN R -e "install.packages('<package-to-install>', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"

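Additional suggestions

Some additional suggestions you may want to consider:

Use R's tryCatch function for exception and error handling
Add explicit logging for troubleshooting and debugging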
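
Both suggestions can be combined in a small wrapper, sketched below with base R only. The helper names log_msg and run_step are hypothetical. message() writes to stderr, so the lines appear in the job's log output, and the error is rethrown so that a failing step still stops the script.

```r
# Hypothetical helper: write a timestamped log line to stderr.
log_msg <- function(level, ...) {
  message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), " [", level, "] ", ...)
}

# Hypothetical wrapper: log the start, failure, and end of a step.
run_step <- function(step_name, expr) {
  log_msg("INFO", "Starting step: ", step_name)
  tryCatch(
    expr,  # evaluated lazily here, so errors reach the handler below
    error = function(e) {
      log_msg("ERROR", step_name, " failed: ", conditionMessage(e))
      stop(e)  # rethrow so the failure is not silently swallowed
    },
    finally = log_msg("INFO", "Finished step: ", step_name)
  )
}

total <- run_step("sum values", sum(1:10))
```

Rethrowing is a design choice: swallowing the error would let the job finish as if it had succeeded even though a step failed.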

Next steps

How to train R models in Azure Machine Learning
