How to access an encoder pickle file in a scoring script in Azure ML Studio

Question

Vigneshwar K 0

I have been using an Azure ML Studio notebook to train and register a regression model. I used a one-hot encoder to encode a categorical variable and saved it as a pickle file in Azure ML Studio. I have registered my model (e.g. prediction.pkl) in Azure, and I am in the process of creating an online endpoint for it. While developing the scoring script, I realized I do not know how to access the one-hot encoder there so that I can call .transform on the categorical column once the model is deployed.

Vigneshwar K 0 Reputation points

2023-12-11T15:31:43.98+00:00

Hi. Thank you for the reply. As I understand it, I have to register the encoder, just like I registered the model. Correct me if I am wrong.

As additional information, I will be deploying the model as an endpoint.

Ramr-msft 17,826 Reputation points

2023-12-27T01:43:09.68+00:00

Thanks for the details. Yes, register the encoder just as you registered the model.

Answer 1

Thanks for the question. In Azure ML, you can use the one-hot encoder in your scoring script by loading it alongside your model. Here’s a general approach:

Save your one-hot encoder: After fitting the one-hot encoder on your training data, save it as a pickle file, just as you did with your model.
Register your one-hot encoder in Azure ML: Register the encoder pickle file as a model in Azure ML Studio, the same way you registered your model, so that the scoring script can look it up by name.
Load your one-hot encoder in the scoring script: In your scoring script, load the one-hot encoder with the pickle module. You can then call its .transform method on incoming data.

Here’s a sample code snippet for the init() and run() methods of your scoring script:

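import json
import pickle

import pandas as pd
from azureml.core.model import Model

def init():
    # Called once when the service starts; keep the loaded objects global
    # so that run() can reuse them for every request.
    global model
    global encoder

    # load model (the argument is the name the model was registered under)
    model_path = Model.get_model_path('prediction.pkl')
    with open(model_path, 'rb') as f:
        model = pickle.load(f)

    # load one-hot encoder (assumes it was registered as 'encoder.pkl')
    encoder_path = Model.get_model_path('encoder.pkl')
    with open(encoder_path, 'rb') as f:
        encoder = pickle.load(f)

def run(raw_data):
    try:
        # expects a JSON body of the form {"data": ...}
        data = json.loads(raw_data)['data']
        data = pd.DataFrame.from_dict(data)

        # transform the data using the loaded encoder; this assumes the
        # encoder was fitted on exactly these columns (if it was fitted on
        # the categorical columns only, select those columns first)
        data = encoder.transform(data)

        # make prediction
        result = model.predict(data)

        return result.tolist()
    except Exception as ex:
        # return the error message to the caller instead of raising
        error = str(ex)
        return error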
