OnlineGradientDescentRegressor Class

Train a stochastic gradient descent model.

Constructor

OnlineGradientDescentRegressor(normalize='Auto', caching='Auto', loss='squared', learning_rate=0.1, decrease_learning_rate=True, l2_regularization=0.0, number_of_iterations=1, initial_weights_diameter=0.0, reset_weights_after_x_examples=None, lazy_update=True, recency_gain=0.0, recency_gain_multiplicative=False, averaged=True, averaged_tolerance=0.01, initial_weights=None, shuffle=True, feature=None, label=None, **params)

Parameters

Name	Description
feature	see Columns.
label	see Columns.
normalize	Specifies the type of automatic normalization used: `"Auto"`: if normalization is needed, it is performed automatically. This is the default choice. `"No"`: no normalization is performed. `"Yes"`: normalization is performed. `"Warn"`: if normalization is needed, a warning message is displayed, but normalization is not performed. Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures that the distances between data points are proportional and enables optimization methods such as gradient descent to converge much faster. If normalization is performed, a `MaxMin` normalizer is used. It normalizes values to an interval [a, b] where `-1 <= a <= 0` and `0 <= b <= 1` and `b - a = 1`. This normalizer preserves sparsity by mapping zero to zero.
caching	Whether trainer should cache input training data.
loss	The loss function to minimize. The default is `'squared'`, as shown in the constructor signature. For the other available choices, see the nimbusml documentation on loss functions.
learning_rate	Determines the size of the step taken in the direction of the gradient in each step of the learning process. This determines how fast or slow the learner converges on the optimal solution. If the step size is too big, you might overshoot the optimal solution. If the step size is too small, training takes longer to converge to the best solution.
decrease_learning_rate	Whether to decrease the learning rate as training progresses.
l2_regularization	L2 Regularization Weight.
number_of_iterations	Number of iterations.
initial_weights_diameter	Sets the initial weights diameter that specifies the range from which values are drawn for the initial weights. These weights are initialized randomly from within this range. For example, if the diameter is specified to be `d`, then the weights are uniformly distributed between `-d/2` and `d/2`. The default value is `0`, which specifies that all the weights are set to zero.
reset_weights_after_x_examples	Number of examples after which weights will be reset to the current average.
lazy_update	Instead of updating averaged weights on every example, only update when loss is nonzero.
recency_gain	Extra weight given to more recent updates (`lazy_update` must be False).
recency_gain_multiplicative	Whether Recency Gain is multiplicative (vs. additive).
averaged	Whether to average the weights over the course of training.
averaged_tolerance	The inexactness tolerance for averaging.
initial_weights	Initial Weights and bias, comma-separated.
shuffle	Whether to shuffle for each training iteration.
params	Additional arguments sent to compute engine.
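
To make the `learning_rate` and `initial_weights_diameter` descriptions above concrete, here is a minimal sketch in plain Python. It does not call nimbusml and is not how the library is implemented; the helper names `init_weights` and `gradient_descent_1d` and the toy objective `f(w) = w**2` are assumptions chosen purely for illustration.

```python
import random


def init_weights(n_features, diameter, seed=0):
    """Draw each weight uniformly from [-diameter/2, diameter/2].

    Hypothetical helper: with diameter 0 every weight is zero,
    mirroring the documented default.
    """
    rng = random.Random(seed)
    half = diameter / 2.0
    return [rng.uniform(-half, half) for _ in range(n_features)]


def gradient_descent_1d(w0, learning_rate, steps=20):
    """Minimize the toy objective f(w) = w**2 (gradient 2*w) with a fixed step size."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * 2.0 * w
    return w


print(init_weights(3, 0.0))  # diameter 0: all weights are zero
print(init_weights(3, 1.0))  # diameter 1: weights lie in [-0.5, 0.5]

# A small step converges slowly, a moderate step converges quickly,
# and a step that is too large overshoots and diverges.
for lr in (0.01, 0.1, 1.1):
    print(lr, gradient_descent_1d(1.0, lr))
```

On this toy problem each update multiplies `w` by `1 - 2 * learning_rate`, so any step size above 1.0 makes the iterate grow instead of shrink.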

Examples


   ###############################################################################
   # OnlineGradientDescentRegressor
   from nimbusml import Pipeline, FileDataStream
   from nimbusml.datasets import get_dataset
   from nimbusml.feature_extraction.categorical import OneHotVectorizer
   from nimbusml.linear_model import OnlineGradientDescentRegressor

   # data input (as a FileDataStream)
   path = get_dataset('infert').as_filepath()

   data = FileDataStream.read_csv(path)
   print(data.head())
   #    age  case education  induced  parity ... row_num  spontaneous  ...
   # 0   26     1    0-5yrs        1       6 ...       1            2  ...
   # 1   42     1    0-5yrs        1       1 ...       2            0  ...
   # 2   39     1    0-5yrs        2       6 ...       3            0  ...
   # 3   34     1    0-5yrs        2       4 ...       4            0  ...
   # 4   35     1   6-11yrs        1       3 ...       5            1  ...

   # define the training pipeline
   pipeline = Pipeline([
       OneHotVectorizer(columns={'edu': 'education'}),
       OnlineGradientDescentRegressor(feature=['parity', 'edu'], label='age')
   ])

   # train, predict, and evaluate
   metrics, predictions = pipeline.fit(data).test(data, output_scores=True)

   # print predictions
   print(predictions.head())
   #       Score
   # 0  28.103731
   # 1  21.805904
   # 2  28.103731
   # 3  25.584600
   # 4  33.743286

   # print evaluation metrics
   print(metrics)
   #    L1(avg)   L2(avg)  RMS(avg)  Loss-fn(avg)  R Squared
   # 0  4.452286  31.15933  5.582054      31.15933  -0.134398

Remarks

Stochastic gradient descent uses a simple yet efficient iterative technique to fit model coefficients using error gradients for convex loss functions (see Stochastic_gradient_descent).

The OnlineGradientDescentRegressor implements the standard (non-batch) SGD, with a choice of loss functions and an option to update the weight vector using the average of the vectors seen over time (the `averaged` argument is `True` by default).
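
The per-example update and the weight averaging described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not nimbusml's implementation: the function name `sgd_squared_loss`, the `1/sqrt(t)` decay used when `decrease` is true, and the simple running average of all weight vectors are choices made for this example only.

```python
import math


def sgd_squared_loss(X, y, learning_rate=0.1, n_iterations=1,
                     decrease=True, averaged=True):
    """Online (one example at a time) SGD for linear regression with squared loss.

    Hypothetical sketch: returns (weights, bias). When `averaged` is true,
    the result is the mean of the weight vectors seen after each update.
    """
    n_features = len(X[0])
    w = [0.0] * n_features          # weights start at zero
    b = 0.0
    w_sum = [0.0] * n_features      # running sums used for averaging
    b_sum = 0.0
    t = 0                           # number of updates so far
    for _ in range(n_iterations):
        for xi, yi in zip(X, y):
            t += 1
            # Assumed decay schedule: shrink the step as 1/sqrt(t).
            step = learning_rate / math.sqrt(t) if decrease else learning_rate
            pred = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = pred - yi         # gradient of 0.5 * (pred - y)**2 w.r.t. pred
            w = [wj - step * err * xj for wj, xj in zip(w, xi)]
            b -= step * err
            w_sum = [sj + wj for sj, wj in zip(w_sum, w)]
            b_sum += b
    if averaged:
        return [sj / t for sj in w_sum], b_sum / t
    return w, b


# Noise-free toy data generated from y = 2*x + 1.
X = [[0.0], [0.5], [1.0]]
y = [1.0, 2.0, 3.0]
weights, bias = sgd_squared_loss(X, y, n_iterations=2000,
                                 decrease=False, averaged=False)
print(weights, bias)
```

Averaging tends to smooth out the fluctuations of the individual iterates, which is why it is often paired with a single pass over the data.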

Reference

Stochastic_gradient_descent

Methods

get_params

Get the parameters for this operator.

get_params(deep=False)

Parameters

Name	Description
deep	Default value: False
