LinearSvmBinaryClassifier Class
Linear Support Vector Machine (SVM) Binary Classifier
- Inheritance
nimbusml.internal.core.linear_model._linearsvmbinaryclassifier.LinearSvmBinaryClassifier -> LinearSvmBinaryClassifier
nimbusml.base_predictor.BasePredictor -> LinearSvmBinaryClassifier
sklearn.base.ClassifierMixin -> LinearSvmBinaryClassifier
Constructor
LinearSvmBinaryClassifier(normalize='Auto', caching='Auto', l2_regularization=0.001, perform_projection=False, number_of_iterations=1, initial_weights_diameter=0.0, no_bias=False, initial_weights=None, shuffle=True, batch_size=1, feature=None, label=None, weight=None, **params)
Parameters
- feature
see Columns.
- label
see Columns.
- weight
see Columns.
- normalize
Specifies the type of automatic normalization used:
"Auto": if normalization is needed, it is performed automatically. This is the default choice.
"No": no normalization is performed.
"Yes": normalization is performed.
"Warn": if normalization is needed, a warning message is displayed, but normalization is not performed.
Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures the distances between data points are proportional and enables various optimization methods such as gradient descent to converge much faster. If normalization is performed, a MinMax normalizer is used. It normalizes values in an interval [a, b] where -1 <= a <= 0, 0 <= b <= 1, and b - a = 1. This normalizer preserves sparsity by mapping zero to zero.
- caching
Whether the trainer should cache the input training data.
- l2_regularization
L2 regularization weight. It also controls the learning rate, with the learning rate being inversely proportional to it.
- perform_projection
Whether to project the weights onto the unit ball. Typically used with batch_size > 1.
- number_of_iterations
Number of iterations.
- initial_weights_diameter
Sets the initial weights diameter that specifies the range from which values are drawn for the initial weights. These weights are initialized randomly from within this range. For example, if the diameter is specified to be d, then the weights are uniformly distributed between -d/2 and d/2. The default value is 0, which specifies that all the weights are set to zero.
- no_bias
If True, the model is trained without a bias term.
- initial_weights
Initial weights and bias, comma-separated.
- shuffle
Whether to shuffle for each training iteration.
- batch_size
Batch size.
- params
Additional arguments sent to compute engine.
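Two of the parameter descriptions above can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation: `zero_preserving_scale` and `draw_initial_weights` are hypothetical helper names, and dividing by the column's range is only one scaling that satisfies the stated constraints (-1 <= a <= 0, 0 <= b <= 1, b - a = 1, zero mapped to zero).

```python
import numpy as np


def zero_preserving_scale(column):
    """Scale a column by its range (extended to include 0) so that 0 maps to 0.

    Assumption: this is one scaling consistent with the documented
    constraints, not necessarily what the MinMax normalizer does internally.
    """
    column = np.asarray(column, dtype=float)
    lo = min(column.min(), 0.0)   # a * span, with a in [-1, 0]
    hi = max(column.max(), 0.0)   # b * span, with b in [0, 1]
    span = hi - lo
    # An all-zero column has no range to scale by; return it unchanged.
    return column / span if span > 0 else column.copy()


def draw_initial_weights(n_features, diameter=0.0, seed=0):
    """Draw weights uniformly from [-diameter/2, diameter/2]."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-diameter / 2.0, diameter / 2.0, size=n_features)


scaled = zero_preserving_scale([-2.0, 0.0, 6.0])   # range is 8
weights_zero = draw_initial_weights(3)              # default diameter of 0
weights_wide = draw_initial_weights(3, diameter=1.0)
print(scaled, weights_zero, weights_wide)
```

In the sketch, the zero entry stays zero after scaling, and a diameter of 0 produces all-zero weights, matching the default described for initial_weights_diameter.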
Examples
###############################################################################
# LinearSvmBinaryClassifier
from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.linear_model import LinearSvmBinaryClassifier
# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path)
print(data.head())
# age case education induced parity ... row_num spontaneous ...
# 0 26 1 0-5yrs 1 6 ... 1 2 ...
# 1 42 1 0-5yrs 1 1 ... 2 0 ...
# 2 39 1 0-5yrs 2 6 ... 3 0 ...
# 3 34 1 0-5yrs 2 4 ... 4 0 ...
# 4 35 1 6-11yrs 1 3 ... 5 1 ...
# define the training pipeline
pipeline = Pipeline([LinearSvmBinaryClassifier(
    feature=['age', 'parity', 'spontaneous'], label='case')])
# train, predict, and evaluate
metrics, predictions = pipeline.fit(data).test(data, output_scores=True)
# print predictions
print(predictions.head())
# PredictedLabel Score Probability
# 0 1 0.688481 0.607060
# 1 0 -2.514992 0.203312
# 2 0 -3.479344 0.129230
# 3 0 -3.016621 0.161422
# 4 0 -0.825512 0.397461
# print evaluation metrics
print(metrics)
# AUC Accuracy Positive precision Positive recall ...
# 0 0.705476 0.71371 0.666667 0.289157 ...
Remarks
Linear SVM finds a hyperplane in the feature space for binary classification by solving an SVM problem. For a given feature vector, the prediction is determined by which side of the hyperplane the point falls on. That is the same as the sign of the weighted sum of the features (the weights being computed by the algorithm) plus the bias computed by the algorithm.
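The decision rule described above can be sketched in a few lines of NumPy. The weights and bias below are made-up values for illustration, and `predict_label` is a hypothetical helper, not part of nimbusml. The mapping of a positive score to label 1 and a negative score to label 0 follows the sample output in the Examples section.

```python
import numpy as np


def predict_label(features, weights, bias=0.0):
    """Return (label, score): label is 1 when the score is positive, else 0."""
    score = float(np.dot(weights, features) + bias)
    return (1 if score > 0 else 0), score


# Hypothetical model parameters, chosen only to make the arithmetic easy.
weights = np.array([0.5, -1.0, 2.0])
bias = -0.25

label_a, score_a = predict_label(np.array([1.0, 0.5, 0.25]), weights, bias)
label_b, score_b = predict_label(np.array([0.0, 1.0, 0.0]), weights, bias)
print(label_a, score_a)  # 0.5 - 0.5 + 0.5 - 0.25 = 0.25, positive side
print(label_b, score_b)  # -1.0 - 0.25 = -1.25, negative side
```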
The algorithm implemented is the PEGASOS method, introduced by Shalev-Shwartz, Singer and Srebro, which alternates between stochastic gradient descent steps and projection steps.
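To make the alternation between gradient steps and projection steps concrete, here is a minimal single-example Pegasos sketch in NumPy. It is an illustration under stated assumptions, not nimbusml's implementation: labels are encoded as -1/+1, the bias term is omitted, the batch size is 1, and the projection radius 1/sqrt(lambda) is taken from the Pegasos paper rather than from this class. The function `pegasos_train` is hypothetical; its argument names only mirror the constructor parameters for readability. The step size 1/(lambda * t) shows why the learning rate is inversely proportional to l2_regularization.

```python
import numpy as np


def pegasos_train(X, y, l2_regularization=0.001, number_of_iterations=1,
                  perform_projection=False, seed=0):
    """Train a bias-free linear SVM with Pegasos; y must hold -1/+1 labels."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    t = 0
    for _ in range(number_of_iterations):
        # Visit the examples in a fresh random order on each pass.
        for i in rng.permutation(n_samples):
            t += 1
            eta = 1.0 / (l2_regularization * t)   # step size at step t
            margin = y[i] * X[i].dot(w)           # computed before the update
            # Sub-gradient step: shrink w (regularizer), then add the
            # example if it violates the margin (hinge loss is active).
            w *= 1.0 - eta * l2_regularization
            if margin < 1.0:
                w += eta * y[i] * X[i]
            if perform_projection:
                # Projection step: pull w back into the ball of radius
                # 1/sqrt(lambda), as in the Pegasos paper (assumption).
                radius = 1.0 / np.sqrt(l2_regularization)
                norm = np.linalg.norm(w)
                if norm > radius:
                    w *= radius / norm
    return w


# Tiny linearly separable toy data set (made up for illustration).
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = pegasos_train(X, y, l2_regularization=0.1, number_of_iterations=20)
w_proj = pegasos_train(X, y, l2_regularization=0.1, number_of_iterations=20,
                       perform_projection=True)
print(np.sign(X @ w), np.sign(X @ w_proj))
```

The projection only rescales the weight vector, so it changes the magnitude of the scores but not which side of the hyperplane a point falls on.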
Reference
Wikipedia entry for Support Vector Machine
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Methods
decision_function | Returns score values
get_params | Get the parameters for this operator.
predict_proba | Returns probabilities
decision_function
Returns score values
decision_function(X, **params)
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
- deep
predict_proba
Returns probabilities
predict_proba(X, **params)