Metrics from Pipeline.test()
The evaluation metrics for models are generated using the test() method of nimbusml.Pipeline.
The type of metrics to generate is inferred automatically by looking at the trainer type in the pipeline. If a model has been loaded using the load_model() method, then the evaltype must be specified explicitly.
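A minimal sketch of the workflow, assuming a pandas DataFrame and a binary trainer; the column names and data are illustrative, and the exact test() signature should be checked against your installed nimbusml version:

```python
import pandas as pd
from nimbusml import Pipeline
from nimbusml.linear_model import LogisticRegressionBinaryClassifier

# Illustrative data: one feature column and a binary label column.
train = pd.DataFrame({'x1': [0.1, 0.9, 0.4, 0.7, 0.3, 0.8],
                      'y':  [0,   1,   0,   1,   0,   1]})
test = pd.DataFrame({'x1': [0.2, 0.6], 'y': [0, 1]})

pipeline = Pipeline([LogisticRegressionBinaryClassifier(feature=['x1'],
                                                        label='y')])
pipeline.fit(train)

# evaltype is inferred from the trainer here; pass evaltype='binary'
# explicitly when the model was loaded with load_model().
metrics, scores = pipeline.test(test, output_scores=True)
print(metrics)  # AUC, Accuracy, Log-loss, ...
```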
Binary Classification Metrics
This corresponds to evaltype='binary'.
AUC - see Receiver Operating Characteristic
Accuracy - see Precision and Recall
Positive Precision - see Precision and Recall
Positive Recall - see Precision and Recall
Negative Precision - see Precision and Recall
Negative Recall - see Precision and Recall
Log-loss - see Log Loss
Log-loss reduction - RIG(Y|X) * 100 = (H(Y) - H(Y|X)) / H(Y) * 100, where H(Y) is the entropy of the labels and H(Y|X) is the average log-loss of the predictions. Ranges over (-inf, 100], where 100 is perfect prediction and 0 indicates predictions no better than the label prior.
Test-set Entropy - H(Y)
F1 Score - see Precision and Recall
AUPRC - see Area under Precision-Recall Curve
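For a concrete sense of the log-loss reduction formula, here is a small numpy sketch; the labels and probabilities are made up, and the log base cancels in the ratio:

```python
import numpy as np

y = np.array([1, 0, 1, 1, 0])             # labels
p = np.array([0.9, 0.2, 0.7, 0.6, 0.4])   # predicted P(y=1)

# H(Y): test-set entropy of the labels.
prior = y.mean()
h_y = -(prior * np.log(prior) + (1 - prior) * np.log(1 - prior))

# H(Y|X): average log-loss of the predictions.
h_y_given_x = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# RIG(Y|X) * 100: 100 is perfect, 0 is no better than the prior.
log_loss_reduction = (h_y - h_y_given_x) / h_y * 100
print(h_y, h_y_given_x, log_loss_reduction)
```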
Note about ROC: The computed AUC is defined as the probability that the score for a positive example is higher than the score for a negative one (see AucAggregator.cs in ML.NET). This expression is asymptotically equivalent to the area under the ROC curve, which is what scikit-learn computes (see auc). That explains discrepancies on small test sets.
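The pairwise definition can be checked against scikit-learn directly; a small sketch with made-up scores, counting ties as one half:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

# Pairwise definition: P(score(positive) > score(negative)),
# with ties contributing one half.
pos, neg = scores[y == 1], scores[y == 0]
diffs = pos[:, None] - neg[None, :]
auc_pairwise = (np.sum(diffs > 0) + 0.5 * np.sum(diffs == 0)) / diffs.size

# The two usually agree; small gaps come from tie handling.
print(auc_pairwise, roc_auc_score(y, scores))
```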
Multiclass Classification Metrics
This corresponds to evaltype='multiclass'.
Accuracy(micro-avg) - Every sample-class pair contributes equally to the accuracy metric.
Accuracy(macro-avg) - Every class contributes equally to the accuracy metric. Minority classes are given the same weight as the larger classes.
Log-loss - see Log Loss
Log-loss reduction - RIG(Y|X) * 100 = (H(Y) - H(Y|X)) / H(Y) * 100, where H(Y) is the entropy of the labels and H(Y|X) is the average log-loss of the predictions. Ranges over (-inf, 100], where 100 is perfect prediction and 0 indicates predictions no better than the label prior.
(class N) - Accuracy of class N
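A small numpy sketch contrasting the two averaging schemes, with macro accuracy computed as the average of per-class accuracy (data is illustrative):

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 0, 1, 1, 0, 2])

# Micro-average: every sample counts equally.
micro = np.mean(y_true == y_pred)

# Macro-average: per-class accuracy averaged over classes, so a
# minority class weighs as much as a large one.
classes = np.unique(y_true)
per_class = [np.mean(y_pred[y_true == c] == c) for c in classes]
macro = np.mean(per_class)

print(micro, per_class, macro)  # 0.714..., [0.75, 0.5, 1.0], 0.75
```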
Regression Metrics
This corresponds to evaltype='regression'.
L1(avg) - E(|y - y'|), the mean absolute error
L2(avg) - E((y - y')^2), the mean squared error
RMS(avg) - E((y - y')^2)^0.5, the root mean squared error
Loss-fn(avg) - Expected value of the loss function. With square loss, this equals L2(avg).
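These reduce to a few lines of numpy; a quick sketch with made-up values:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

err = y_true - y_pred
l1 = np.mean(np.abs(err))   # L1(avg): mean absolute error
l2 = np.mean(err ** 2)      # L2(avg): mean squared error
rms = np.sqrt(l2)           # RMS(avg): root mean squared error

print(l1, l2, rms)  # 0.75, 0.875, 0.935...
```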
Clustering Metrics
This corresponds to evaltype='cluster'.
NMI - Normalized mutual information, a measure of the mutual dependence between the cluster assignments and the true labels. See Normalized Variants. Range is [0, 1], where higher is better.
AvgMinScore - Mean distance of samples to their nearest centroid. Smaller is better.
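A short sketch of both metrics, using scikit-learn's NMI (one of several normalized variants) and a hand-rolled AvgMinScore under the definition above; the points and centroids are illustrative:

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

# NMI between true labels and cluster assignments.
labels_true = np.array([0, 0, 1, 1, 2, 2])
labels_pred = np.array([0, 0, 1, 2, 2, 2])
print(normalized_mutual_info_score(labels_true, labels_pred))

# AvgMinScore: mean distance from each sample to its nearest centroid.
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
centroids = np.array([[0.5, 0.0], [5.5, 5.0]])
dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
avg_min_score = dists.min(axis=1).mean()
print(avg_min_score)  # 0.5 for this toy data
```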
Ranking Metrics
This corresponds to evaltype='ranking'.
NDCG@N - Normalized Discounted Cumulative Gain @ Top N positions. See Discounted Cumulative Gain
DCG@N - Discounted Cumulative Gain @ Top N positions. See Discounted Cumulative Gain
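A sketch of DCG@N and NDCG@N using the common rel / log2(i + 1) gain-and-discount convention; evaluators differ on the exact convention (some use a gain of 2^rel - 1), so this may not match ML.NET's numbers exactly:

```python
import numpy as np

def dcg_at_n(relevances, n):
    """DCG@N: sum over the top N of rel_i / log2(i + 1), i = 1..N."""
    rel = np.asarray(relevances, dtype=float)[:n]
    return np.sum(rel / np.log2(np.arange(2, rel.size + 2)))

# Relevance labels of documents in predicted order (illustrative).
predicted = [3, 2, 3, 0, 1]
ideal = sorted(predicted, reverse=True)  # best possible ordering

n = 3
dcg = dcg_at_n(predicted, n)
ndcg = dcg / dcg_at_n(ideal, n)  # NDCG@N lies in [0, 1]
print(dcg, ndcg)
```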
Anomaly Detection Metrics
This corresponds to evaltype='anomaly'.
AUC - see Receiver Operating Characteristic
DR @K FP - Detection rate at k false positives. When the test examples are sorted by the output of the anomaly detector in descending order, denote by K the index of the k’th example whose label is 0. Detection rate at k false positives is the detection rate at K.
DR @K FPR - Detection rate at fraction p false positives. When the test examples are sorted by the output of the anomaly detector in descending order, denote by K the index such that a fraction p of the label 0 examples are above K. Detection rate at fraction p false positives is the detection rate at K.
DR @NumPos - Detection rate at number of anomalies. Denote by D the number of label 1 examples in the test set. Detection rate at number of anomalies is equal to the detection rate at D.
NumAnomalies - Total number of anomalies detected.
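A sketch of DR @K FP under the definition above; the labels and scores are made up:

```python
import numpy as np

def detection_rate_at_k_fp(labels, scores, k):
    """Sort by score descending, walk down until the k-th label-0
    example, and report the fraction of all label-1 examples seen
    above that point."""
    order = np.argsort(-np.asarray(scores))
    sorted_labels = np.asarray(labels)[order]
    fp = detected = 0
    total_pos = sorted_labels.sum()
    for lab in sorted_labels:
        if lab == 0:
            fp += 1
            if fp == k:
                break
        else:
            detected += 1
    return detected / total_pos

labels = [1, 0, 1, 0, 0, 1, 0, 1]   # 1 = anomaly (illustrative)
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.55, 0.3, 0.2]
print(detection_rate_at_k_fp(labels, scores, k=2))  # 0.5 here
```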