Adapt your R script to run in production

This article explains how to take an existing R script and make the appropriate changes to run it as a job in Azure Machine Learning.

You'll have to make most, if not all, of the changes described in this article.

Remove user interaction

Your R script must be designed to run unattended and will be executed via the Rscript command within the container. Make sure you remove any interactive inputs or outputs from the script.
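As a rough illustration, the sketch below shows typical interactive calls and one possible unattended replacement for each. It assumes a script that used to prompt for a forecast horizon and draw a plot on screen; the variable names and the built-in cars dataset are placeholders, not part of the original script.

```r
# Before (interactive): horizon <- as.integer(readline("Forecast horizon: "))
# After: a plain value; in a job this would come from a command-line argument.
horizon <- 12L

# Before (interactive): View(cars) opened a data viewer.
# After: print a compact summary, which ends up in the job log.
print(summary(cars))

# Before (interactive): plot(cars) drew to the screen device.
# After: write the plot to a file so nothing depends on a display.
plot_file <- file.path(tempdir(), "cars.pdf")
pdf(plot_file)
plot(cars)
invisible(dev.off())
```

Calls such as readline(), View(), menu(), and file.choose() are the usual candidates to look for when you review a script.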

Add parsing

If your script requires any sort of input parameter (most scripts do), pass the inputs into the script via the Rscript call.

Rscript <name-of-r-script>.R
--data_file ${{inputs.<name-of-yaml-input-1>}} 
--brand ${{inputs.<name-of-yaml-input-2>}}

In your R script, parse the inputs and make the proper type conversions. We recommend that you use the optparse package.

The following snippet shows how to:

  • initiate the parser
  • add all your inputs as options
  • parse the inputs with the appropriate data types

You can also add defaults, which are handy for testing. We recommend that you add an --output parameter with a default value of ./outputs so that any output of the script will be stored.

library(optparse)

parser <- OptionParser()

parser <- add_option(
  parser,
  "--output",
  type = "character",
  action = "store",
  default = "./outputs"
)

parser <- add_option(
  parser,
  "--data_file",
  type = "character",
  action = "store",
  default = "data/myfile.csv"
)

parser <- add_option(
  parser,
  "--brand",
  type = "double",
  action = "store",
  default = 1
)
args <- parse_args(parser)

args is a named list. You can use any of these parameters later in your script.
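Because args is an ordinary named list, you can check it before the rest of the script runs. The following is a minimal sketch under stated assumptions: the list is built by hand to stand in for the result of parse_args(parser), and check_scalar and validate_args are hypothetical helper names.

```r
# Stand-in for the list returned by parse_args(parser); the values mirror the
# defaults defined above so that this sketch runs without optparse.
args <- list(output = "./outputs", data_file = "data/myfile.csv", brand = 1)

# Hypothetical helper: stop with a clear message if a value is missing, is not
# a single element, or fails the type check.
check_scalar <- function(value, name, predicate) {
  if (length(value) != 1 || is.na(value) || !predicate(value)) {
    stop("Invalid value for --", name, call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical helper: validate every argument the script relies on.
validate_args <- function(args) {
  check_scalar(args$output, "output", is.character)
  check_scalar(args$data_file, "data_file", is.character)
  check_scalar(args$brand, "brand", is.numeric)
  invisible(args)
}

validate_args(args)
message("brand = ", args$brand, "; reading from ", args$data_file)
```

Failing early with a readable message tends to be easier to diagnose in a job log than an error raised deep inside the modeling code.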

Source the azureml_utils.R helper script

You must source a helper script called azureml_utils.R, placed in the same working directory as the R script that will be run. The helper script is required for the running R script to communicate with the MLflow server. It provides a method to continuously refresh the authentication token, since the token changes quickly in a running job. It also allows you to use the logging functions provided in the R MLflow API to log models, parameters, tags, and general artifacts.

  1. Create your file, azureml_utils.R, with this code:

    # Azure ML utility to enable usage of the MLflow R API for tracking with Azure Machine Learning (Azure ML). This utility does the following:
    # 1. Understands the Azure ML MLflow tracking URL by extending the OSS MLflow R client.
    # 2. Manages Azure ML token refresh for remote runs (runs that execute in Azure Machine Learning). It uses the tcltk2 R library to schedule token refresh.
    #    The token refresh interval can be controlled by setting the environment variable MLFLOW_TOKEN_REFRESH_INTERVAL_SECONDS (the name read by the code below) and defaults to 30 seconds.
    
    library(mlflow)
    library(httr)
    library(later)
    library(tcltk2)
    
    new_mlflow_client.mlflow_azureml <- function(tracking_uri) {
      host <- paste("https", tracking_uri$path, sep = "://")
      get_host_creds <- function () {
        mlflow:::new_mlflow_host_creds(
          host = host,
          token = Sys.getenv("MLFLOW_TRACKING_TOKEN"),
          username = Sys.getenv("MLFLOW_TRACKING_USERNAME", NA),
          password = Sys.getenv("MLFLOW_TRACKING_PASSWORD", NA),
          insecure = Sys.getenv("MLFLOW_TRACKING_INSECURE", NA)
        )
      }
      cli_env <- function() {
        creds <- get_host_creds()
        res <- list(
          MLFLOW_TRACKING_USERNAME = creds$username,
          MLFLOW_TRACKING_PASSWORD = creds$password,
          MLFLOW_TRACKING_TOKEN = creds$token,
          MLFLOW_TRACKING_INSECURE = creds$insecure
        )
        res[!is.na(res)]
      }
      mlflow:::new_mlflow_client_impl(get_host_creds, cli_env, class = "mlflow_azureml_client")
    }
    
    get_auth_header <- function() {
        headers <- list()
        auth_token <- Sys.getenv("MLFLOW_TRACKING_TOKEN")
        auth_header <- paste("Bearer", auth_token, sep = " ")
        headers$Authorization <- auth_header
        headers
    }
    
    get_token <- function(host, exp_id, run_id) {
        req_headers <- do.call(httr::add_headers, get_auth_header())
        token_host <- gsub("mlflow/v1.0","history/v1.0", host)
        token_host <- gsub("azureml://","https://", token_host)
        api_url <- paste0(token_host, "/experimentids/", exp_id, "/runs/", run_id, "/token")
        GET( api_url, timeout(getOption("mlflow.rest.timeout", 30)), req_headers)
    }
    
    
    fetch_token_from_aml <- function() {
        message("Refreshing token")
        tracking_uri <- Sys.getenv("MLFLOW_TRACKING_URI")
        exp_id <- Sys.getenv("MLFLOW_EXPERIMENT_ID")
        run_id <- Sys.getenv("MLFLOW_RUN_ID")
        sleep_for <- 1
        time_left <- 30
        response <- get_token(tracking_uri, exp_id, run_id)
        while (response$status_code == 429 && time_left > 0) {
            time_left <- time_left - sleep_for
            warning(paste("Request returned with status code 429 (Rate limit exceeded). Retrying after ",
                        sleep_for, " seconds. Will continue to retry 429s for up to ", time_left,
                        " seconds.", sep = ""))
            Sys.sleep(sleep_for)
            sleep_for <- min(time_left, sleep_for * 2)
            response <- get_token(tracking_uri, exp_id, run_id)
        }
    
        if (response$status_code != 200){
            # Report the HTTP status code so the job log shows why the refresh failed
            error_response <- paste("Error fetching token, will try again later. Status code:", response$status_code)
            warning(error_response)
        }
    
        if (response$status_code == 200){
            text <- content(response, "text", encoding = "UTF-8")
            json_resp <- jsonlite::fromJSON(text, simplifyVector = FALSE)
            Sys.setenv(MLFLOW_TRACKING_TOKEN = json_resp$token)
            message("Refreshing token done")
        }
    }
    
    clean_tracking_uri <- function() {
        tracking_uri <- httr::parse_url(Sys.getenv("MLFLOW_TRACKING_URI"))
        tracking_uri$query <- ""
        tracking_uri <- httr::build_url(tracking_uri)
        Sys.setenv(MLFLOW_TRACKING_URI = tracking_uri)
    }
    
    clean_tracking_uri()
    tcltk2::tclTaskSchedule(as.integer(Sys.getenv("MLFLOW_TOKEN_REFRESH_INTERVAL_SECONDS", 30))*1000, fetch_token_from_aml(), id = "fetch_token_from_aml", redo = TRUE)
    
    # Set MLFlow related env vars
    Sys.setenv(MLFLOW_BIN = system("which mlflow", intern = TRUE))
    Sys.setenv(MLFLOW_PYTHON_BIN = system("which python", intern = TRUE))
    
  2. Start your R script with the following line:

source("azureml_utils.R")
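The retry loop in fetch_token_from_aml waits 1 second after the first 429 response and doubles the wait each time, capped by what remains of a 30-second budget. The sketch below reproduces only that arithmetic, with no network calls, so that the resulting schedule is visible. The function name backoff_schedule is hypothetical and is not part of the helper script.

```r
# Hypothetical helper: list the waits (in seconds) that the retry loop would
# use if every response kept returning status code 429.
backoff_schedule <- function(budget = 30, first_wait = 1) {
  waits <- numeric(0)
  sleep_for <- first_wait
  time_left <- budget
  while (time_left > 0) {
    time_left <- time_left - sleep_for          # spend part of the budget
    waits <- c(waits, sleep_for)                # the helper calls Sys.sleep() here
    sleep_for <- min(time_left, sleep_for * 2)  # double, but never exceed the budget
  }
  waits
}

print(backoff_schedule())  # waits of 1, 2, 4, 8, and 15 seconds (30 in total)
```

With the defaults, the helper gives up on a throttled refresh after about 30 seconds and tries again at the next scheduled refresh.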

Read data files as local files

When you run an R script as a job, Azure Machine Learning takes the data you specify in the job submission and mounts it on the running container. Therefore you'll be able to read the data file(s) as if they were local files on the running container.

  • Make sure your source data is registered as a data asset
  • Pass the data asset by name in the job submission parameters
  • Read the files as you normally would read a local file

Define the input parameter as shown in the Add parsing section. Use the data_file parameter to pass the full path to the file, so that you can read the data asset with read_csv(args$data_file).
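A minimal sketch of the read step follows, assuming the input is a CSV file. To keep the example self-contained, it writes a temporary CSV that stands in for the mounted path in args$data_file, and it uses base R's read.csv rather than read_csv. The helper name read_input is hypothetical.

```r
# Stand-in for args$data_file. In a job, this would be the mounted path that
# is passed in through --data_file.
data_file <- tempfile(fileext = ".csv")
write.csv(head(mtcars), data_file, row.names = FALSE)

# Hypothetical helper: stop with a clear message if the file is not where the
# script expects it, then read it like any local file.
read_input <- function(path) {
  if (!file.exists(path)) {
    stop("Input file not found: ", path, call. = FALSE)
  }
  utils::read.csv(path)
}

df <- read_input(data_file)
str(df)
```

Checking that the path exists first gives a clearer error in the job log if the data asset was not passed as expected.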

Save job artifacts (images, data, etc.)

Important

This section does not apply to models. See the following two sections for model-specific saving and logging instructions.

You can store arbitrary script outputs that the R script generates, such as data files, images, and serialized R objects, in Azure Machine Learning. Create a ./outputs directory to store any generated artifacts (images, models, data, etc.). Any files saved to ./outputs are automatically included in the run and uploaded to the experiment at the end of the run. Because you added a default value for the --output parameter in the Add parsing section, include the following code snippet in your R script to create the output directory.

if (!dir.exists(args$output)) {
  dir.create(args$output)
}

After you create the directory, save your artifacts to that directory. For example:

# create and save a plot
library(ggplot2)

myplot <- ggplot(...)

ggsave(myplot, 
       filename = file.path(args$output,"forecast-plot.png"))


# save an rds serialized object
saveRDS(myobject, file = file.path(args$output,"myobject.rds"))
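The same pattern works for plain data files. The sketch below uses only base R and a temporary directory standing in for args$output; the predictions data frame and the file names are made up for illustration.

```r
# Stand-in for args$output; a temporary directory keeps the sketch self-contained.
output_dir <- file.path(tempdir(), "outputs")

# recursive = TRUE also creates any missing parent directories.
if (!dir.exists(output_dir)) {
  dir.create(output_dir, recursive = TRUE)
}

# Save a data file (illustrative values only).
predictions <- data.frame(period = 1:3, forecast = c(10.2, 11.5, 12.1))
write.csv(predictions, file.path(output_dir, "predictions.csv"), row.names = FALSE)

# Save the R version and loaded packages, which can help with troubleshooting.
writeLines(capture.output(sessionInfo()), file.path(output_dir, "session-info.txt"))

list.files(output_dir)
```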

crate your models with the carrier package

The R MLflow API documentation specifies that your R models need to be of the crate model flavor.

  • If your R script trains a model and you produce a model object, you'll need to crate it to be able to deploy it at a later time with Azure Machine Learning.
  • When using the crate function, use explicit namespaces when calling any package function you need.

Let's say you have a time series model object called my_ts_model created with the fable package. To make this model callable when it's deployed, create a crate that captures the model object and takes a forecasting horizon, in number of periods, as its argument:

library(carrier)
crated_model <- crate(function(x)
{
  fabletools::forecast(!!my_ts_model, h = x)
})

The crated_model object is the one you'll log.

Log models, parameters, tags, or other artifacts with the R MLflow API

In addition to saving any generated artifacts, you can also log models, tags, and parameters for each run. Use the R MLflow API to do so.

When you log a model, you log the crated model you created as described in the previous section.

Note

When you log a model, the model is also saved and added to the run artifacts. There is no need to explicitly save a model unless you did not log it.

To log a model, a parameter, or both:

  1. Start the run with mlflow_start_run()
  2. Log artifacts with mlflow_log_model, mlflow_log_param, or mlflow_log_batch
  3. Do not end the run with mlflow_end_run(). Skip this call, as it currently causes an error.

For example, to log the crated_model object as created in the previous section, you would include the following code in your R script:

Tip

Use models as the value for artifact_path when logging a model. This is a best practice, even though you can name it something else.

mlflow_start_run()

mlflow_log_model(
  model = crated_model, # the crate model object
  artifact_path = "models" # a path to save the model object to
  )

mlflow_log_param(<key-name>, <value>)

# mlflow_end_run() - causes an error, do not include mlflow_end_run()

Script structure and example

Use these code snippets as a guide to structure your R script, following all the changes outlined in this article.

# BEGIN R SCRIPT

# source the azureml_utils.R script which is needed to use the MLflow back end
# with R
source("azureml_utils.R")

# load your packages here. Make sure that they are installed in the container.
library(...)

# parse the command line arguments.
library(optparse)

parser <- OptionParser()

parser <- add_option(
  parser,
  "--output",
  type = "character",
  action = "store",
  default = "./outputs"
)

parser <- add_option(
  parser,
  "--data_file",
  type = "character",
  action = "store",
  default = "data/myfile.csv"
)

parser <- add_option(
  parser,
  "--brand",
  type = "double",
  action = "store",
  default = 1
)
args <- parse_args(parser)

# your own R code goes here
# - model building/training
# - visualizations
# - etc.

# create the ./outputs directory
if (!dir.exists(args$output)) {
  dir.create(args$output)
}

# log models and parameters to MLflow
mlflow_start_run()

mlflow_log_model(
  model = crated_model, # the crate model object
  artifact_path = "models" # a path to save the model object to
  )

mlflow_log_param(<key-name>, <value>)

# mlflow_end_run() - causes an error, do not include mlflow_end_run()
## END OF R SCRIPT

Create an environment

To run your R script, you'll use the ml extension for Azure CLI, also referred to as CLI v2. The ml command uses a YAML job definition file. For more information about submitting jobs with az ml, see Train models with Azure Machine Learning CLI.

The YAML job file specifies an environment. You'll need to create this environment in your workspace before you can run the job.

You can create the environment in Azure Machine Learning studio or with the Azure CLI.

Whatever method you use, you'll use a Dockerfile. All Docker context files for R environments must have the following specification in order to work on Azure Machine Learning:

FROM rocker/tidyverse:latest

# Install python
RUN apt-get update -qq && \
 apt-get install -y python3-pip tcl tk libz-dev libpng-dev

RUN ln -f /usr/bin/python3 /usr/bin/python
RUN ln -f /usr/bin/pip3 /usr/bin/pip
RUN pip install -U pip

# Install azureml-mlflow and mlflow
RUN pip install azureml-mlflow
RUN pip install mlflow

# Install R packages required for logging with MLflow (these are necessary)
RUN R -e "install.packages('mlflow', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('carrier', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('optparse', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"
RUN R -e "install.packages('tcltk2', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"

The base image is rocker/tidyverse:latest, which has many R packages and their dependencies already installed.

Important

You must install, in advance, any R packages that your script needs to run. Add more lines to the Docker context file as needed.

RUN R -e "install.packages('<package-to-install>', dependencies = TRUE, repos = 'https://cloud.r-project.org/')"

Additional suggestions

Some additional suggestions you may want to consider:

  • Use R's tryCatch function for exception and error handling
  • Add explicit logging for troubleshooting and debugging
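Both suggestions can be combined in a small wrapper. This is a minimal sketch using base R only; log_msg and run_step are hypothetical helper names, and the two steps shown are placeholders for real work such as reading data or training a model.

```r
# Hypothetical helper: write a timestamped line with message(), which goes to
# stderr and is typically captured in the job log.
log_msg <- function(level, ...) {
  message(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), " [", level, "] ", ...)
}

# Hypothetical helper: run one step, log how it went, and return NULL on error
# so that the caller can decide whether to stop.
run_step <- function(name, expr) {
  log_msg("INFO", "Starting: ", name)
  tryCatch(
    {
      result <- expr  # the lazily evaluated argument is forced here, inside tryCatch
      log_msg("INFO", "Finished: ", name)
      result
    },
    error = function(e) {
      log_msg("ERROR", name, " failed: ", conditionMessage(e))
      NULL
    }
  )
}

ok <- run_step("sum", sum(1:10))
failed <- run_step("parse", stop("bad input"))

# In an unattended script, a non-zero exit status is the usual way to signal
# failure to the caller:
# if (is.null(failed)) quit(status = 1)
```

Returning NULL works here because neither step legitimately returns NULL. A script whose steps can return NULL would need a different failure marker.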
