LbfgsMaximumEntropyMulticlassTrainer Class
The IEstimator<TTransformer> to predict a target using a maximum entropy multiclass classifier trained with the L-BFGS method.
public sealed class LbfgsMaximumEntropyMulticlassTrainer : Microsoft.ML.Trainers.LbfgsTrainerBase<Microsoft.ML.Trainers.LbfgsMaximumEntropyMulticlassTrainer.Options,Microsoft.ML.Data.MulticlassPredictionTransformer<Microsoft.ML.Trainers.MaximumEntropyModelParameters>,Microsoft.ML.Trainers.MaximumEntropyModelParameters>
type LbfgsMaximumEntropyMulticlassTrainer = class
inherit LbfgsTrainerBase<LbfgsMaximumEntropyMulticlassTrainer.Options, MulticlassPredictionTransformer<MaximumEntropyModelParameters>, MaximumEntropyModelParameters>
Public NotInheritable Class LbfgsMaximumEntropyMulticlassTrainer
Inherits LbfgsTrainerBase(Of LbfgsMaximumEntropyMulticlassTrainer.Options, MulticlassPredictionTransformer(Of MaximumEntropyModelParameters), MaximumEntropyModelParameters)
- Inheritance: Object → TrainerEstimatorBase<MulticlassPredictionTransformer<MaximumEntropyModelParameters>,MaximumEntropyModelParameters> → LbfgsTrainerBase<LbfgsMaximumEntropyMulticlassTrainer.Options,MulticlassPredictionTransformer<MaximumEntropyModelParameters>,MaximumEntropyModelParameters> → LbfgsMaximumEntropyMulticlassTrainer
To create this trainer, use LbfgsMaximumEntropy or LbfgsMaximumEntropy(Options).
The input label column data must be of key type, and the feature column must be a known-sized vector of Single.
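As an illustrative sketch (the input type, column names, and sample data below are invented for this example, not taken from this page), a typical pipeline maps a string label to a key type and then applies this trainer:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input record: the trainer requires a key-type label
// (produced below by MapValueToKey) and a known-sized vector of Single.
public class ModelInput
{
    public string Label { get; set; }

    [VectorType(3)]
    public float[] Features { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Tiny in-memory training set, purely for illustration.
        var samples = new List<ModelInput>
        {
            new ModelInput { Label = "a", Features = new float[] { 1f, 0f, 0f } },
            new ModelInput { Label = "b", Features = new float[] { 0f, 1f, 0f } },
            new ModelInput { Label = "c", Features = new float[] { 0f, 0f, 1f } },
        };
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(samples);

        // Convert the string label to a key type, then append the
        // L-BFGS maximum entropy trainer (simple overload).
        var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
            .Append(mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(
                labelColumnName: "Label", featureColumnName: "Features"));

        ITransformer model = pipeline.Fit(trainingData);
        Console.WriteLine("Training finished.");
    }
}
```

The LbfgsMaximumEntropy(Options) overload accepts a LbfgsMaximumEntropyMulticlassTrainer.Options instance instead of individual arguments.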
This trainer outputs the following columns:
| Output Column Name | Column Type | Description |
| --- | --- | --- |
| `Score` | Vector of Single | The scores of all classes. A higher value means a higher probability of falling into the associated class. If the i-th element has the largest value, the predicted label index is i. Note that i is a zero-based index. |
| `PredictedLabel` | key type | The predicted label's index. If its value is i, the actual label is the i-th category in the key-valued input label type. |
| Trainer characteristic | Value |
| --- | --- |
| Machine learning task | Multiclass classification |
| Is normalization required? | Yes |
| Is caching required? | No |
| Required NuGet in addition to Microsoft.ML | None |
| Exportable to ONNX | Yes |
The maximum entropy model is a generalization of linear logistic regression. The major difference between the two is the number of classes supported: logistic regression handles only binary classification, while the maximum entropy model handles multiple classes. See Section 1 in this paper for a detailed introduction.
Assume that the number of classes is $m$ and the number of features is $n$. The maximum entropy model assigns the $c$-th class a coefficient vector $\hat{\theta}_c \in \mathbb{R}^n$ and a bias $\hat{b}_c \in \mathbb{R}$, for $c = 1, \dots, m$. Given a feature vector $x \in \mathbb{R}^n$, the $c$-th class's score is $\hat{y}_c = \hat{\theta}_c^T x + \hat{b}_c$, and the probability of $x$ belonging to class $c$ is $P(c \mid x) = e^{\hat{y}_c} / \sum_{c'=1}^{m} e^{\hat{y}_{c'}}$. If $c$ is the true label of $x$, the training loss on that example is $-\log P(c \mid x)$. The model's output for $x$ is the probability vector $[P(1 \mid x), \dots, P(m \mid x)]$.
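As a concrete sanity check of how the softmax turns class scores into probabilities (a worked toy example, not taken from the original page): with $m = 3$ classes and scores $\hat{y} = (1, 0, -1)$,

```latex
P(1 \mid x) = \frac{e^{1}}{e^{1} + e^{0} + e^{-1}} \approx \frac{2.718}{4.086} \approx 0.665, \qquad
P(2 \mid x) = \frac{e^{0}}{4.086} \approx 0.245, \qquad
P(3 \mid x) = \frac{e^{-1}}{4.086} \approx 0.090
```

The three probabilities sum to 1, and the first class (index 0 in the zero-based Score vector) is the predicted label.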
The optimization technique implemented is based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno method (L-BFGS). L-BFGS is a quasi-Newton method that replaces the expensive computation of the Hessian matrix with an approximation, but still enjoys a fast convergence rate like Newton's method, where the full Hessian matrix is computed. Since the L-BFGS approximation uses only a limited number of historical states to compute the next step direction, it is especially suited to problems with high-dimensional feature vectors. The number of historical states is a user-specified parameter; a larger number may lead to a better approximation of the Hessian matrix, but also a higher computation cost per step.
This class uses empirical risk minimization (ERM) to formulate the optimization problem built upon collected data. Note that empirical risk is usually measured by applying a loss function to the model's predictions on collected data points.
If the training data does not contain enough data points (for example, to train a linear model in an $n$-dimensional feature space, at least $n$ data points are needed), overfitting may occur, so that the trained model is good at describing the training data but fails to predict correctly on unseen examples. Regularization is a common technique to alleviate this phenomenon by penalizing the magnitude (usually measured by a norm function) of the model parameters. This trainer supports elastic net regularization, which penalizes a linear combination of the L1-norm (LASSO), $\|\hat{\theta}\|_1$, and the L2-norm (ridge), $\|\hat{\theta}\|_2^2$. L2-norm regularization reduces overfitting without driving any weight to exactly zero, while L1-norm regularization, together with the implemented optimization algorithm, can increase the sparsity of the model weights, $\hat{\theta}$. For high-dimensional and sparse data sets, carefully selecting the L1-norm coefficient can yield good prediction quality with a model that has only a few non-zero weights, without significantly reducing predictive power.
An aggressive regularization (that is, assigning large coefficients to L1-norm or L2-norm regularization terms) can harm predictive capacity by excluding important variables from the model. For example, a very large L1-norm coefficient may force all parameters to be zeros and lead to a trivial model. Therefore, choosing the right regularization coefficients is important in practice.
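The regularization coefficients and the L-BFGS history size are exposed through LbfgsMaximumEntropyMulticlassTrainer.Options. A minimal sketch follows; the particular values are illustrative, not recommendations, and an existing MLContext named `mlContext` is assumed:

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

// Illustrative settings only; tune these on your own data.
var options = new LbfgsMaximumEntropyMulticlassTrainer.Options
{
    LabelColumnName = "Label",
    FeatureColumnName = "Features",
    L1Regularization = 0.1f,   // larger values push more weights to exactly zero
    L2Regularization = 0.1f,   // larger values shrink weight magnitudes
    HistorySize = 50           // past states kept by L-BFGS for its Hessian approximation
};

var trainer = mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(options);
```

Setting both coefficients to zero disables regularization entirely, which is rarely advisable on small or sparse data sets.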
Check the See Also section for links to usage examples.
| Field | Description |
| --- | --- |
| FeatureColumn | The feature column that the trainer expects. (Inherited from TrainerEstimatorBase<TTransformer,TModel>) |
| LabelColumn | The label column that the trainer expects. Can be `null`, which indicates that label is not used for training. (Inherited from TrainerEstimatorBase<TTransformer,TModel>) |
| WeightColumn | The weight column that the trainer expects. Can be `null`, which indicates that weight is not used for training. (Inherited from TrainerEstimatorBase<TTransformer,TModel>) |
| Property | Description |
| --- | --- |
| Info | (Inherited from LbfgsTrainerBase<TOptions,TTransformer,TModel>) |
| Method | Description |
| --- | --- |
| Fit(IDataView, MaximumEntropyModelParameters) | Continues the training of a LbfgsMaximumEntropyMulticlassTrainer using an already trained MaximumEntropyModelParameters and returns a MulticlassPredictionTransformer<MaximumEntropyModelParameters>. |
| Fit(IDataView) | Trains and returns a ITransformer. (Inherited from TrainerEstimatorBase<TTransformer,TModel>) |
| GetOutputSchema(SchemaShape) | (Inherited from TrainerEstimatorBase<TTransformer,TModel>) |
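The two-argument Fit overload enables continued (warm-start) training. A minimal sketch, assuming `mlContext` and two IDataView instances (`initialData`, `extraData`) already exist:

```csharp
// Train once, then continue training from the learned parameters.
var trainer = mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(
    labelColumnName: "Label", featureColumnName: "Features");

var initialModel = trainer.Fit(initialData);

// Reuse the trained MaximumEntropyModelParameters as the starting point
// for another round of optimization on new data.
var refinedModel = trainer.Fit(extraData, initialModel.Model);
```

Both calls return a MulticlassPredictionTransformer<MaximumEntropyModelParameters>, whose Model property exposes the learned parameters.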
| Extension Method | Description |
| --- | --- |
| AppendCacheCheckpoint<TTrans>(IEstimator<TTrans>, IHostEnvironment) | Append a 'caching checkpoint' to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes. |
| WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>) | Given an estimator, return a wrapping object that will call a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object, rather than just a general ITransformer. However, at the same time, IEstimator<TTransformer> instances are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer> where the estimator for which we want to get the transformer is buried somewhere in this chain. For that scenario, we can through this method attach a delegate that will be called once fit is called. |
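A sketch combining both extension methods; `mlContext` and an IDataView named `trainingData` are assumed to exist, and the column names are illustrative:

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

// Capture the fitted trainer transformer from inside the pipeline.
MulticlassPredictionTransformer<MaximumEntropyModelParameters> fitted = null;

var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
    // Cache before the trainer: L-BFGS makes multiple passes over the data,
    // so a caching checkpoint can speed up training even though caching
    // is not strictly required by this trainer.
    .AppendCacheCheckpoint(mlContext)
    .Append(mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(
            labelColumnName: "Label", featureColumnName: "Features")
        .WithOnFitDelegate(t => fitted = t));

ITransformer model = pipeline.Fit(trainingData);
// After Fit, `fitted` references the trained multiclass transformer,
// even though it is buried inside the estimator chain.
```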
Product | Versions |
---|---|
ML.NET | 1.0.0, 1.1.0, 1.2.0, 1.3.1, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 2.0.0, 3.0.0, Preview, 4.0.0 |
- LbfgsMaximumEntropy(MulticlassClassificationCatalog+MulticlassClassificationTrainers, LbfgsMaximumEntropyMulticlassTrainer+Options)
- LbfgsMaximumEntropyMulticlassTrainer.Options
- LbfgsMaximumEntropy(MulticlassClassificationCatalog+MulticlassClassificationTrainers, String, String, String, Single, Single, Single, Int32, Boolean)