How to access an encoder pickle file in the scoring script in Azure ML Studio

Vigneshwar K 0 Reputation points
2023-12-10T23:06:48.44+00:00

I have been using an Azure ML Studio notebook to train and register a regression model. I used a one-hot encoder to encode a categorical variable and saved it as a pickle file in Azure ML Studio. I have registered my model (e.g. prediction.pkl) in Azure and am now creating an online endpoint for it. While developing the scoring script, I realized I don't know how to load the one-hot encoder so that I can call .transform on the categorical column once the model is deployed.

Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. Ramr-msft 17,826 Reputation points
    2023-12-11T10:48:21.34+00:00

    Thanks for the question. In Azure ML, you can use the one-hot encoder in your scoring script by loading it alongside your model. Here's a general approach:

    1. Save your one-hot encoder: After fitting the one-hot encoder on your training data, you can save it as a pickle file, just like you did with your model.
    2. Upload your one-hot encoder to Azure: Upload the encoder's pickle file to Azure ML Studio and register it, the same way you registered your model. The sample below looks the encoder up by its registered name, so it assumes the encoder is registered as encoder.pkl.
    3. Load your one-hot encoder in the scoring script: In your scoring script, load the one-hot encoder with the pickle module so that you can call its .transform method on incoming data.

    Here's a sample code snippet for the init() and run() methods of your scoring script:
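    import json
    import pickle

    import pandas as pd
    from azureml.core.model import Model

    def init():
        # Called once when the endpoint container starts.
        global model
        global encoder

        # Load the model. The name passed to get_model_path must match
        # the name the model was registered under.
        model_path = Model.get_model_path('prediction.pkl')
        with open(model_path, 'rb') as f:
            model = pickle.load(f)

        # Load the one-hot encoder. This lookup only works if the encoder
        # pickle was also registered as a model under this name.
        encoder_path = Model.get_model_path('encoder.pkl')
        with open(encoder_path, 'rb') as f:
            encoder = pickle.load(f)

    def run(raw_data):
        # Called for every scoring request; raw_data is the JSON request body.
        try:
            data = json.loads(raw_data)['data']
            data = pd.DataFrame.from_dict(data)

            # Transform the data using the loaded encoder. The columns must
            # match the ones the encoder was fitted on.
            data = encoder.transform(data)

            # Make the prediction.
            result = model.predict(data)

            return result.tolist()
        except Exception as ex:
            # Return the error message so the caller can see what went wrong.
            error = str(ex)
            return error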
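
The snippet above passes the whole DataFrame to encoder.transform. If the encoder was fitted only on the categorical column, as the question suggests, you would likely need to encode just that column and append the result to the numeric features before predicting. Below is a minimal sketch of that flow that can be run locally without Azure. Everything in it is an assumption made for illustration: the column names color and size, the list-of-records payload shape, the score() helper, and the StubEncoder and StubModel classes, which stand in for the unpickled encoder and model.

```python
import json

CATEGORICAL_COLUMN = "color"   # hypothetical categorical column name
NUMERIC_COLUMNS = ["size"]     # hypothetical numeric column names


class StubEncoder:
    """Stand-in for the unpickled one-hot encoder (hypothetical)."""

    def __init__(self, categories):
        self.categories = categories

    def transform(self, values):
        # One row of 0/1 flags per input value; unknown values become all zeros.
        return [[1.0 if v == c else 0.0 for c in self.categories] for v in values]


class StubModel:
    """Stand-in for the unpickled regression model (hypothetical)."""

    def predict(self, rows):
        # Dummy prediction: the sum of each feature row.
        return [sum(row) for row in rows]


def score(raw_data, model, encoder):
    """Encode the categorical column, then predict on numeric + encoded features."""
    try:
        # Assumes a payload of the form {"data": [{"color": ..., "size": ...}, ...]}.
        records = json.loads(raw_data)["data"]
        encoded = encoder.transform([r[CATEGORICAL_COLUMN] for r in records])
        features = [
            [float(r[c]) for c in NUMERIC_COLUMNS] + flags
            for r, flags in zip(records, encoded)
        ]
        return {"predictions": list(model.predict(features))}
    except Exception as ex:
        # Return a structured error so callers can tell it apart from a prediction.
        return {"error": str(ex)}


payload = json.dumps({"data": [{"color": "red", "size": 2},
                               {"color": "blue", "size": 3}]})
result = score(payload, StubModel(), StubEncoder(["red", "blue"]))
```

In a real scoring script, the body of score() would go inside run(), using the model and encoder globals loaded in init() in place of the stubs.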
    
    
