LinearSvmBinaryClassifier Class

Linear Support Vector Machine (SVM) Binary Classifier

Constructor

LinearSvmBinaryClassifier(normalize='Auto', caching='Auto', l2_regularization=0.001, perform_projection=False, number_of_iterations=1, initial_weights_diameter=0.0, no_bias=False, initial_weights=None, shuffle=True, batch_size=1, feature=None, label=None, weight=None, **params)

Parameters

Name	Description
feature	see Columns.
label	see Columns.
weight	see Columns.
normalize	Specifies the type of automatic normalization used: `"Auto"`: if normalization is needed, it is performed automatically. This is the default choice. `"No"`: no normalization is performed. `"Yes"`: normalization is performed. `"Warn"`: if normalization is needed, a warning message is displayed, but normalization is not performed. Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures the distances between data points are proportional and enables various optimization methods such as gradient descent to converge much faster. If normalization is performed, a `MinMax` normalizer is used. It normalizes values to an interval [a, b] where `-1 <= a <= 0` and `0 <= b <= 1` and `b - a = 1`. This normalizer preserves sparsity by mapping zero to zero.
caching	Whether the trainer should cache the input training data.
l2_regularization	L2 regularization weight. It also controls the learning rate, with the learning rate being inversely proportional to it.
perform_projection	Whether to project the weights onto the unit ball. Typically used with a batch size greater than 1.
number_of_iterations	Number of training iterations.
initial_weights_diameter	Sets the initial weights diameter that specifies the range from which values are drawn for the initial weights. These weights are initialized randomly from within this range. For example, if the diameter is specified to be `d`, then the weights are uniformly distributed between `-d/2` and `d/2`. The default value is `0`, which specifies that all the weights are set to zero.
no_bias	If `True`, the model is trained without a bias term.
initial_weights	Initial weights and bias, comma-separated.
shuffle	Whether to shuffle the training data for each training iteration.
batch_size	Batch size.
params	Additional arguments sent to the compute engine.
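
The `initial_weights_diameter` description in the table above can be illustrated with a minimal sketch in plain Python. The helper `draw_initial_weights` is hypothetical and is not part of nimbusml; it only assumes the uniform range `[-d/2, d/2]` stated in the table.

```python
import random


def draw_initial_weights(n_weights, diameter=0.0, seed=None):
    """Hypothetical helper (not a nimbusml API).

    Draws each initial weight uniformly from [-diameter/2, diameter/2].
    A diameter of 0 therefore yields all-zero weights.
    """
    rng = random.Random(seed)
    half = diameter / 2.0
    return [rng.uniform(-half, half) for _ in range(n_weights)]


# Default diameter of 0: every weight starts at zero.
zero_weights = draw_initial_weights(3)

# Diameter of 0.5: every weight lies between -0.25 and 0.25.
random_weights = draw_initial_weights(3, diameter=0.5, seed=0)
```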

Examples


   ###############################################################################
   # LinearSvmBinaryClassifier
   from nimbusml import Pipeline, FileDataStream
   from nimbusml.datasets import get_dataset
   from nimbusml.linear_model import LinearSvmBinaryClassifier

   # data input (as a FileDataStream)
   path = get_dataset('infert').as_filepath()

   data = FileDataStream.read_csv(path)
   print(data.head())
   #   age  case education  induced  parity   ... row_num  spontaneous  ...
   # 0   26     1    0-5yrs        1       6  ...       1            2  ...
   # 1   42     1    0-5yrs        1       1  ...       2            0  ...
   # 2   39     1    0-5yrs        2       6  ...       3            0  ...
   # 3   34     1    0-5yrs        2       4  ...       4            0  ...
   # 4   35     1   6-11yrs        1       3  ...       5            1  ...

   # define the training pipeline
   pipeline = Pipeline([LinearSvmBinaryClassifier(
       feature=['age', 'parity', 'spontaneous'], label='case')])

   # train, predict, and evaluate
   metrics, predictions = pipeline.fit(data).test(data, output_scores=True)

   # print predictions
   print(predictions.head())
   #    PredictedLabel     Score  Probability
   # 0               1  0.688481     0.607060
   # 1               0 -2.514992     0.203312
   # 2               0 -3.479344     0.129230
   # 3               0 -3.016621     0.161422
   # 4               0 -0.825512     0.397461

   # print evaluation metrics
   print(metrics)
   #         AUC  Accuracy  Positive precision  Positive recall  ...
   # 0  0.705476   0.71371            0.666667         0.289157  ...

Remarks

Linear SVM implements an algorithm that finds a hyperplane in the feature space for binary classification by solving an SVM problem. For a given feature vector, the prediction is determined by which side of the hyperplane the point falls on. This is the same as the sign of the features' weighted sum (the weights being computed by the algorithm) plus the bias computed by the algorithm.
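
The prediction rule described above can be sketched in a few lines of plain Python. The function names, weights, and bias below are made up for illustration and are not nimbusml APIs or values from a trained model; mapping a score of exactly zero to class 0 is also an assumption of this sketch.

```python
def decision_score(weights, bias, features):
    """Weighted sum of the features plus the bias (the signed score)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict_label(weights, bias, features):
    """Return 1 if the point lies on the positive side of the hyperplane,
    else 0. A score of exactly 0 is mapped to 0 here (an assumption)."""
    return 1 if decision_score(weights, bias, features) > 0 else 0


# Illustrative values only, not learned from any dataset.
weights = [0.5, -1.0, 2.0]
bias = -0.25

positive_example = [1.0, 1.0, 1.0]   # score = 0.5 - 1.0 + 2.0 - 0.25 = 1.25
negative_example = [0.0, 1.0, 0.0]   # score = -1.0 - 0.25 = -1.25
```

This matches the pattern in the example output, where rows with a positive `Score` have `PredictedLabel` 1 and rows with a negative `Score` have `PredictedLabel` 0.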

The algorithm implemented is the PEGASOS method, introduced by Shalev-Shwartz, Singer and Srebro, which alternates between stochastic gradient descent steps and projection steps.
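
To make the alternation between gradient steps and projection steps concrete, here is a minimal pure-Python sketch that follows the formulation in the Pegasos paper. It is not the nimbusml implementation, which may differ in details. The assumptions are: labels are encoded as +1 and -1, one example is used per step, there is no bias term, the step size is `1 / (l2_regularization * t)`, and the optional projection uses the paper's ball of radius `1 / sqrt(l2_regularization)`. The function name `pegasos_train` and the toy data are hypothetical.

```python
import math
import random


def pegasos_train(examples, l2_regularization=0.01, number_of_steps=1000,
                  perform_projection=False, seed=0):
    """Sketch of Pegasos for a linear SVM without a bias term.

    `examples` is a list of (features, label) pairs with labels in {+1, -1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(examples[0][0])
    for t in range(1, number_of_steps + 1):
        x, y = rng.choice(examples)              # stochastic step: one example
        eta = 1.0 / (l2_regularization * t)      # step size shrinks with t
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Sub-gradient step on the regularizer: shrink the weights.
        shrink = 1.0 - eta * l2_regularization
        w = [shrink * wi for wi in w]
        # Sub-gradient step on the hinge loss, only if the margin is violated.
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
        # Optional projection step onto a ball of radius 1/sqrt(lambda).
        if perform_projection:
            norm = math.sqrt(sum(wi * wi for wi in w))
            radius = 1.0 / math.sqrt(l2_regularization)
            if norm > radius:
                w = [wi * radius / norm for wi in w]
    return w


def predict_sign(w, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1


# Toy, linearly separable data (made up for illustration).
toy_data = [([2.0, 2.0], 1), ([3.0, 1.0], 1), ([1.0, 3.0], 1),
            ([-2.0, -2.0], -1), ([-3.0, -1.0], -1), ([-1.0, -3.0], -1)]

w_plain = pegasos_train(toy_data)
w_projected = pegasos_train(toy_data, perform_projection=True)
```

The step size in this sketch is one way to read the `l2_regularization` description, which states that the learning rate is inversely proportional to the regularization weight.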

Reference

Wikipedia entry for Support Vector Machine

Pegasos: Primal Estimated sub-GrAdient SOlver for SVM

Methods

decision_function	Returns score values
get_params	Get the parameters for this operator.
predict_proba	Returns probabilities

decision_function

Returns score values

decision_function(X, **params)

get_params

Get the parameters for this operator.

get_params(deep=False)

Parameters

Name	Description
deep	Default value: False

predict_proba

Returns probabilities

predict_proba(X, **params)
