Problem deploying a model on Azure ML as an online endpoint

Rishavraj Mandal 30 Reputation points

MLflow version

  • 'inference-schema[numpy-support]==1.5.0'
  • xlrd==2.0.1
  • mlflow==1.26.1
  • azureml-mlflow==1.42.0
  • tqdm==4.63.0
  • pytorch-transformers==1.2.0
  • pytorch-lightning==2.0.2
  • seqeval==1.2.2
  • azureml-inference-server-http==0.8.0
  • name: model-env

System information

  • python=3.9
  • pip=22.1.2
  • numpy=1.21.2
  • scikit-learn=0.24.2
  • scipy=1.7.1
  • 'pandas>=1.1,<1.2'
  • pytorch=1.10.0

Describe the problem

I trained the model and pipeline on GPU instances in Azure ML. When the scoring script tries to load the model with the code below, it fails with the error shown in the logs further down:

model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")
model = mlflow.pyfunc.load_model(model_path)

My questions:

  • How and where can I set map_location=torch.device('cpu')?
  • Why does the log say that run() in my custom score.py is not decorated, when I have clearly added some lines there?


import logging
import os
import json
import mlflow
from io import StringIO
from mlflow.pyfunc.scoring_server import infer_and_parse_json_input, predictions_to_json
import sys
from time import strftime, localtime
from collections import Counter
from pytorch_transformers import BertTokenizer
import random
import numpy as np 
import torch 
from tqdm import tqdm

def init():
    global model
    # "model" is the path of the mlflow artifacts when the model was registered. For automl
    # models, this is generally "mlflow-model".
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")
    model = mlflow.pyfunc.load_model(model_path)
    logging.info("Init complete")

def run(raw_data):
    data = json.loads(raw_data)
    title = json.dumps(data["title"])
    att = json.dumps(data["attributes"])

    output = model.predict([tensor_t,tensor_a])

    predict_list = output.tolist()[0]
    result = StringIO()
    return result.getvalue()

Other info / logs

Initializing logger
2023-05-25 09:36:19,602 I [66] azmlinfsrv - Starting up app insights client
2023-05-25 09:36:22,449 I [66] azmlinfsrv.user_script - Found user script at /var/azureml-app/dependencies/score.py
2023-05-25 09:36:22,449 I [66] azmlinfsrv.user_script - run() is not decorated. Server will invoke it with the input in JSON string.
2023-05-25 09:36:22,449 I [66] azmlinfsrv.user_script - Invoking user's init function
2023/05/25 09:36:22 WARNING mlflow.pyfunc: The version of Python that the model was saved in, `Python 3.8.16`, differs from the version of Python that is currently running, `Python 3.9.16`, and may be incompatible
2023-05-25 09:36:22,742 E [66] azmlinfsrv - User's init function failed
2023-05-25 09:36:22,744 E [66] azmlinfsrv - Encountered Exception Traceback (most recent call last):
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 117, in invoke_init
  File "/var/azureml-app/dependencies/score.py", line 21, in init
    model = mlflow.pyfunc.load_model(model_path)
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/mlflow/pyfunc/__init__.py", line 735, in load_model
    model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path)
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/mlflow/pytorch/__init__.py", line 735, in _load_pyfunc
    return _PyTorchWrapper(_load_model(path, **kwargs))
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/mlflow/pytorch/__init__.py", line 643, in _load_model
    return torch.load(model_path, **kwargs)
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 1172, in _load
    result = unpickler.load()
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 1116, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 217, in default_restore_location
    result = fn(storage, location)
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/torch/serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 111, in setup
  File "/azureml-envs/azureml_9a3b1e0a66d72d612aebc12b4a285f72/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 119, in invoke_init
    raise UserScriptException(ex) from ex
azureml_inference_server_http.server.user_script.UserScriptException: Caught an unhandled exception from the user script

Accepted answer
  1. Sedat SALMAN 13,345 Reputation points

    CUDA device problem:

    The error message "Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False" means you are loading a PyTorch model that was saved with CUDA tensors on a machine that has no CUDA device available.

    The error log itself hints at the fix: "please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU."

    mlflow.pyfunc.load_model does not let you pass map_location directly, so you may need to call mlflow.pytorch.load_model instead, which forwards extra keyword arguments such as map_location to torch.load:

    model = mlflow.pytorch.load_model(model_path, map_location=torch.device('cpu'))
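
As a rough illustration of where that call could go, here is a minimal sketch of an init() for score.py. It assumes the model folder name from the question ("use-case1-model"). The helper resolve_model_path and the constant MODEL_ARTIFACT_DIR are hypothetical names introduced only for this example, and the fallback from CUDA to CPU is an assumption about what you want, not something Azure ML requires.

```python
import logging
import os

# Folder name taken from the question's score.py; adjust to your registered model.
MODEL_ARTIFACT_DIR = "use-case1-model"

model = None


def resolve_model_path():
    """Build the MLflow model path from the directory Azure ML mounts the model in."""
    model_dir = os.environ["AZUREML_MODEL_DIR"]
    return os.path.join(model_dir, MODEL_ARTIFACT_DIR)


def init():
    global model
    # Imported inside init() so this module can still be imported (for example in a
    # unit test) on a machine that has neither torch nor mlflow installed.
    import mlflow.pytorch
    import torch

    # Use the GPU when one exists, otherwise remap the saved CUDA tensors to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = mlflow.pytorch.load_model(resolve_model_path(), map_location=device)
    model.eval()  # switch to inference mode (disables dropout and similar layers)
    logging.info("Init complete, model loaded on %s", device)
```

Keep in mind that mlflow.pytorch.load_model returns the native PyTorch module rather than the pyfunc wrapper, so run() would call the model directly, for example model(tensor_t, tensor_a), instead of model.predict(...). The exact call depends on your model's forward() signature.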

    run() is not decorated:

    The message "run() is not decorated. Server will invoke it with the input in JSON string." is not an error but a note about how the server will call your script's run() function. "Decorated" here refers to Python decorators, such as the input_schema and output_schema decorators from the inference-schema package, not to the code inside the function. Without them, the Azure ML inference server passes the request body to run() as a JSON string, and it is your responsibility to parse that string into a format your model understands. If your run() function is written to handle input this way, you can ignore the message.
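
To make the "parse it yourself" part concrete, here is a small sketch of an undecorated run() that accepts the raw JSON string. The field names "title" and "attributes" come from the question's score.py. The helper parse_request is a hypothetical name, and the tokenization and model call are deliberately left out, so the function only echoes the parsed input.

```python
import json


def parse_request(raw_data):
    """Parse the JSON string passed to run() and extract the two expected fields."""
    data = json.loads(raw_data)
    missing = [key for key in ("title", "attributes") if key not in data]
    if missing:
        raise ValueError(f"Missing required field(s): {', '.join(missing)}")
    return data["title"], data["attributes"]


def run(raw_data):
    title, attributes = parse_request(raw_data)
    # Tokenization and the model call would go here; they are omitted in this sketch,
    # so the response simply returns the parsed fields as a JSON string.
    return json.dumps({"title": title, "attributes": attributes})
```

Validating the payload up front means a malformed request produces a clear message instead of a KeyError from deep inside the scoring logic.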
