Loading, Saving and Serving Models
Trainers, transforms and pipelines can be persisted in two ways: with Python’s built-in pickle persistence model, or with the load_model() and save_model() methods of nimbusml.Pipeline.
The advantage of pickle is that all attribute values of objects are preserved and can be inspected after deserialization. However, pickle cannot be used for models trained by external sources, such as an ML.NET C# application; the load_model() method must be used instead. Similarly, the save_model() method saves the model in a format that external applications can use.
Below is an example using pickle.
import pickle
from nimbusml import Pipeline, FileDataStream
from nimbusml.linear_model import AveragedPerceptronBinaryClassifier
from nimbusml.datasets import get_dataset
data_file = get_dataset('infert').as_filepath()
ds = FileDataStream.read_csv(data_file)
ds.schema.rename('case', 'case2') # column name case is not allowed in C#
# Train a model and score
pipeline = Pipeline([AveragedPerceptronBinaryClassifier(
    feature=['age', 'parity', 'spontaneous'], label='case2')])
metrics, scores = pipeline.fit(ds).test(ds, output_scores=True)
print(metrics)
# Round-trip the pipeline through pickle (in memory) and evaluate.
# Note that 'evaltype' must be specified explicitly
s = pickle.dumps(pipeline)
pipe2 = pickle.loads(s)
metrics2, scores2 = pipe2.test(ds, evaltype='binary', output_scores=True)
print(metrics2)
Output:
Automatically adding a MinMax normalization transform, use 'norm=Warn' or 'norm=No' to turn this behavior off.
Training calibrator.
Elapsed time: 00:00:00.5800875
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction Test-set entropy (prior Log-Loss/instance) F1 Score AUPRC
0 0.705038 0.71371 0.7 0.253012 0.715596 0.945455 0.814956 0.113826 0.919634 0.371681 0.572031
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction Test-set entropy (prior Log-Loss/instance) F1 Score AUPRC
0 0.705038 0.71371 0.7 0.253012 0.715596 0.945455 0.814956 0.113826 0.919634 0.371681 0.572031
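The pickle example above serializes to an in-memory byte string with pickle.dumps(). To keep a pickled object between sessions, the same bytes can be written to a file. The following is a minimal sketch using only the standard library. The SimpleNamespace object is a hypothetical stand-in for a fitted pipeline, and the helper names save_pickle and load_pickle and the file name toy_model.pkl are illustrative assumptions, not part of nimbusml.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace


def save_pickle(obj, path):
    # Write the pickled object to disk; pickle requires binary mode.
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_pickle(path):
    # Only unpickle files from a trusted source: unpickling can run arbitrary code.
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a fitted pipeline (not a nimbusml object).
model = SimpleNamespace(features=["age", "parity", "spontaneous"], label="case2")

path = os.path.join(tempfile.mkdtemp(), "toy_model.pkl")
save_pickle(model, path)
restored = load_pickle(path)

# Attribute values survive the round trip and can be inspected.
print(restored.features, restored.label)
```

Assuming a fitted pipeline pickles as in the example above, it could be passed to save_pickle in place of the stand-in object.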
Below is an example using load_model() and save_model(). The model can also originate from external tools such as an ML.NET C# application or the Maml.exe command-line tool. When a model is loaded this way, the ‘evaltype’ argument must be specified explicitly.
from nimbusml import Pipeline, FileDataStream
from nimbusml.linear_model import AveragedPerceptronBinaryClassifier
from nimbusml.datasets import get_dataset
data_file = get_dataset('infert').as_filepath()
ds = FileDataStream.read_csv(data_file)
ds.schema.rename('case', 'case2') # column name case is not allowed in C#
# Train a model and score
pipeline = Pipeline([AveragedPerceptronBinaryClassifier(
    feature=['age', 'parity', 'spontaneous'], label='case2')])
metrics, scores = pipeline.fit(ds).test(ds, output_scores=True)
pipeline.save_model("mymodeluci.zip")
print(metrics)
# Load model from file and evaluate. Note that 'evaltype'
# must be specified explicitly
pipeline2 = Pipeline()
pipeline2.load_model("mymodeluci.zip")
metrics2, scores2 = pipeline2.test(ds, y='case2', evaltype='binary')
print(metrics2)
Output:
Automatically adding a MinMax normalization transform, use 'norm=Warn' or 'norm=No' to turn this behavior off.
Training calibrator.
Elapsed time: 00:00:00.1367380
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction Test-set entropy (prior Log-Loss/instance) F1 Score AUPRC
0 0.705038 0.71371 0.7 0.253012 0.715596 0.945455 0.814956 0.113826 0.919634 0.371681 0.572031
AUC Accuracy Positive precision Positive recall Negative precision Negative recall Log-loss Log-loss reduction Test-set entropy (prior Log-Loss/instance) F1 Score AUPRC
0 0.705038 0.71371 0.7 0.253012 0.715596 0.945455 0.814956 0.113826 0.919634 0.371681 0.572031
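Before handing a saved model file to an external application, it can be useful to confirm that the file on disk is readable. The sketch below assumes the file written by save_model() is a standard zip archive, as the .zip extension suggests. The helper name list_model_entries is hypothetical, and the demo runs on a dummy archive rather than a real model, so the entry names it prints say nothing about the layout of an actual model file.

```python
import os
import tempfile
import zipfile


def list_model_entries(path):
    """Return the entry names of a zip-format file, or raise ValueError."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a readable zip archive")
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Demo on a dummy archive; a real call would pass "mymodeluci.zip".
demo_path = os.path.join(tempfile.mkdtemp(), "dummy_model.zip")
with zipfile.ZipFile(demo_path, "w") as archive:
    archive.writestr("placeholder.txt", "not a real model")

entries = list_model_entries(demo_path)
print(entries)
```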
The saved model (‘mymodeluci.zip’) can be used for scoring in ML.NET using the following code:
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

public static void Score()
{
    var modelPath = "mymodeluci.zip";
    var mlContext = new MLContext();
    // Load the model saved by save_model(); the input schema is returned as well
    var loadedModel = mlContext.Model.Load(modelPath, out DataViewSchema inputSchema);
    var example = new List<InfertData>()
    {
        new InfertData()
        {
            age = 26,
            parity = 6,
            spontaneous = 2
        }
    };
    // load data into IDataView
    var loadedData = mlContext.Data.LoadFromEnumerable(example);
    var predictionDataView = loadedModel.Transform(loadedData);
    // convert IDataView predictions to IEnumerable
    var prediction = mlContext.Data
        .CreateEnumerable<InfertPrediction>(predictionDataView,
            reuseRowObject: false).ToList();
    foreach (var p in prediction)
    {
        Console.WriteLine($"PredictedLabel: {p.PredictedLabel}, " +
            $"Probability: {p.Probability}, Score: {p.Score}");
    }
}

// Input columns; names match the features used in training
public class InfertData
{
    public int age { get; set; }
    public int parity { get; set; }
    public int spontaneous { get; set; }
}

// Output columns produced by the binary classifier
public class InfertPrediction
{
    public bool PredictedLabel { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}