FastForestBinaryTrainer Class
The IEstimator<TTransformer> for training a decision tree binary classification model using Fast Forest.
public sealed class FastForestBinaryTrainer : Microsoft.ML.Trainers.FastTree.RandomForestTrainerBase<Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer.Options,Microsoft.ML.Data.BinaryPredictionTransformer<Microsoft.ML.Trainers.FastTree.FastForestBinaryModelParameters>,Microsoft.ML.Trainers.FastTree.FastForestBinaryModelParameters>
type FastForestBinaryTrainer = class inherit RandomForestTrainerBase<FastForestBinaryTrainer.Options, BinaryPredictionTransformer<FastForestBinaryModelParameters>, FastForestBinaryModelParameters>
Public NotInheritable Class FastForestBinaryTrainer Inherits RandomForestTrainerBase(Of FastForestBinaryTrainer.Options, BinaryPredictionTransformer(Of FastForestBinaryModelParameters), FastForestBinaryModelParameters)
To create this trainer, use FastForest or FastForest(Options).
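The two creation paths can be sketched as follows. This is a minimal example, assuming the Microsoft.ML and Microsoft.ML.FastTree NuGet packages are referenced and the default column names "Label" and "Features"; the hyperparameter values are illustrative only, not tuned recommendations.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Trainers.FastTree;

var mlContext = new MLContext(seed: 0);

// Option 1: the FastForest overload that takes individual arguments.
// The values below are illustrative, not recommendations.
FastForestBinaryTrainer simple = mlContext.BinaryClassification.Trainers.FastForest(
    labelColumnName: "Label",
    featureColumnName: "Features",
    numberOfLeaves: 20,
    numberOfTrees: 100,
    minimumExampleCountPerLeaf: 10);

// Option 2: the FastForest(Options) overload, which exposes more settings.
var options = new FastForestBinaryTrainer.Options
{
    LabelColumnName = "Label",
    FeatureColumnName = "Features",
    NumberOfTrees = 50,
    NumberOfLeaves = 20,
    FeatureFraction = 0.8 // fraction of features considered per tree (illustrative)
};
FastForestBinaryTrainer advanced = mlContext.BinaryClassification.Trainers.FastForest(options);

Console.WriteLine(simple.GetType().Name);
Console.WriteLine(advanced.GetType().Name);
```

Neither call trains anything; both only construct an estimator, which is trained later by calling Fit.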
Input and Output Columns
The input label column data must be Boolean. The input features column data must be a known-sized vector of Single.
This trainer outputs the following columns:
|Output Column Name||Column Type||Description|
|Score||Single||The unbounded score that was calculated by the model.|
|PredictedLabel||Boolean||The predicted label, based on the sign of the score. A negative score maps to false and a positive score maps to true.|
|Probability||Single||The probability, calculated by calibrating the score, of having true as the label. The probability value is in the range [0, 1].|
|Machine learning task||Binary classification|
|Is normalization required?||No|
|Is caching required?||No|
|Required NuGet in addition to Microsoft.ML||Microsoft.ML.FastTree|
|Exportable to ONNX||Yes|
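As a rough illustration of the column requirements, the sketch below trains on a Boolean label and a fixed-size Single vector, then reads back two of the output columns. It assumes the default ML.NET column names ("Label", "Features", "Score", "PredictedLabel"); the DataPoint and Prediction classes and the synthetic data are hypothetical and exist only for this example.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Synthetic data (hypothetical): the label is true when the first feature is positive.
var random = new Random(0);
var rows = Enumerable.Range(0, 200).Select(_ =>
{
    var features = new[]
    {
        (float)random.NextDouble() - 0.5f,
        (float)random.NextDouble(),
        (float)random.NextDouble()
    };
    return new DataPoint { Label = features[0] > 0, Features = features };
}).ToList();

IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Train with default settings and score the same data.
var model = mlContext.BinaryClassification.Trainers.FastForest().Fit(data);
IDataView scored = model.Transform(data);

// Read back only the Score and PredictedLabel columns.
var predictions = mlContext.Data
    .CreateEnumerable<Prediction>(scored, reuseRowObject: false)
    .ToList();

foreach (var p in predictions.Take(5))
    Console.WriteLine($"Score={p.Score:F3} PredictedLabel={p.PredictedLabel}");

// Input schema: Boolean label plus a known-sized vector of Single.
public class DataPoint
{
    public bool Label { get; set; }

    [VectorType(3)]
    public float[] Features { get; set; }
}

// Output schema subset read by this sketch.
public class Prediction
{
    public float Score { get; set; }
    public bool PredictedLabel { get; set; }
}
```

The VectorType attribute is what makes the feature vector "known-sized"; without it the column would be a variable-length vector.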
Training Algorithm Details
Decision trees are non-parametric models that perform a sequence of simple tests on inputs. This decision procedure maps them to outputs found in the training dataset whose inputs were similar to the instance being processed. A decision is made at each node of the binary tree data structure based on a measure of similarity that maps each instance recursively through the branches of the tree until the appropriate leaf node is reached and the output decision returned.
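The traversal described above can be sketched in a few lines. This is a conceptual illustration only: TreeNode and TreeDemo are hypothetical types invented for this example, not part of ML.NET, and the "feature value at most threshold goes left" test is an assumed convention.

```csharp
using System;

// Hypothetical node type for illustration; not an ML.NET type.
public sealed class TreeNode
{
    public int FeatureIndex;   // which input feature this node tests
    public float Threshold;    // go left when feature <= threshold (assumed convention)
    public TreeNode Left;
    public TreeNode Right;
    public double LeafValue;   // output returned when this node is a leaf

    public bool IsLeaf => Left == null && Right == null;
}

public static class TreeDemo
{
    // Walk from the root to a leaf, applying one simple test per internal node.
    public static double Predict(TreeNode root, float[] features)
    {
        TreeNode node = root;
        while (!node.IsLeaf)
        {
            node = features[node.FeatureIndex] <= node.Threshold ? node.Left : node.Right;
        }
        return node.LeafValue;
    }

    // Build a one-split tree: feature 0 at most 0.5 gives -1.0, otherwise +1.0.
    public static TreeNode BuildStump()
    {
        return new TreeNode
        {
            FeatureIndex = 0,
            Threshold = 0.5f,
            Left = new TreeNode { LeafValue = -1.0 },
            Right = new TreeNode { LeafValue = 1.0 }
        };
    }
}
```

For example, TreeDemo.Predict(TreeDemo.BuildStump(), new[] { 0.2f }) follows the left branch and returns the left leaf's value.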
Decision trees have several advantages:
- They are efficient in both computation and memory usage during training and prediction.
- They can represent non-linear decision boundaries.
- They perform integrated feature selection and classification.
- They are resilient in the presence of noisy features.
Fast Forest is a random forest implementation. The model consists of an ensemble of decision trees, each of which outputs a Gaussian distribution as its prediction. An aggregation is performed over the ensemble to find the Gaussian distribution closest to the combined distribution for all trees in the model.
Generally, ensemble models provide better coverage and accuracy than single decision trees.
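One way to picture the aggregation step is moment matching: treat the per-tree Gaussians as an equally weighted mixture and pick the single Gaussian with the same mean and variance. The sketch below shows that technique under those assumptions. It is not taken from the Fast Forest source, the text here does not say which notion of "closest" the trainer uses, and ForestAggregation is a hypothetical helper.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not an ML.NET type.
public static class ForestAggregation
{
    // Collapse an equally weighted mixture of Gaussians into one Gaussian
    // by matching the mixture's first two moments.
    public static (double Mean, double Variance) Combine((double Mean, double Variance)[] trees)
    {
        if (trees == null || trees.Length == 0)
            throw new ArgumentException("At least one tree output is required.", nameof(trees));

        // Mixture mean: average of the per-tree means.
        double mean = trees.Average(t => t.Mean);

        // Mixture second moment: average of (variance + mean^2) per tree.
        double secondMoment = trees.Average(t => t.Variance + t.Mean * t.Mean);

        // Variance = E[x^2] - (E[x])^2.
        return (mean, secondMoment - mean * mean);
    }
}
```

Note that the combined variance grows when the trees disagree: two trees predicting N(0, 1) and N(2, 1) combine to mean 1 and variance 2, which is wider than either tree alone.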
See the See Also section for links to usage examples.
The feature column that the trainer expects. (Inherited from TrainerEstimatorBase<TTransformer,TModel>)
The optional groupID column that the ranking trainers expect. (Inherited from TrainerEstimatorBaseWithGroupId<TTransformer,TModel>)
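The label column that the trainer expects. Can be null, which indicates that the label is not used for training.
The weight column that the trainer expects. Can be null, which indicates that the weight is not used for training.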
|Info||(Inherited from FastTreeTrainerBase<TOptions,TTransformer,TModel>)|
Trains and returns an ITransformer. (Inherited from TrainerEstimatorBase<TTransformer,TModel>)
Trains a FastForestBinaryTrainer using both training and validation data, and returns a BinaryPredictionTransformer<TModel>.
|GetOutputSchema(SchemaShape)||(Inherited from TrainerEstimatorBase<TTransformer,TModel>)|
Append a 'caching checkpoint' to the estimator chain. This will ensure that the downstream estimators will be trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes.
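A minimal sketch of placing a caching checkpoint in front of this trainer is shown below. The NormalizeMinMax step is only a stand-in for "some upstream transform" and is an assumption of this example; this trainer does not require normalization.

```csharp
using System;
using Microsoft.ML;

var mlContext = new MLContext(seed: 0);

// Upstream transform (illustrative), then a caching checkpoint, then the trainer.
// Everything appended after the checkpoint is trained against cached data.
var pipeline = mlContext.Transforms.NormalizeMinMax("Features")
    .AppendCacheCheckpoint(mlContext)
    .Append(mlContext.BinaryClassification.Trainers.FastForest());

Console.WriteLine(pipeline.GetType().Name);
```

Nothing is cached at construction time; the checkpoint only takes effect when Fit is called on the resulting chain.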
Given an estimator, return a wrapping object that will call a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. However, IEstimator<TTransformer> instances are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer> in which the estimator whose transformer we want is buried somewhere in the chain. For that scenario, this method lets us attach a delegate that will be called once Fit is called.
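The scenario above can be sketched as follows: the trainer sits at the end of a chain, and a delegate captures its strongly typed transformer when the chain is fit. The DataPoint class, the synthetic data, and the NormalizeMinMax step are hypothetical and serve only to make the example self-contained.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers.FastTree;

var mlContext = new MLContext(seed: 0);

// Synthetic data (hypothetical): the label is true when the first feature is positive.
var random = new Random(0);
var rows = Enumerable.Range(0, 200).Select(_ =>
{
    var features = new[] { (float)random.NextDouble() - 0.5f, (float)random.NextDouble() };
    return new DataPoint { Label = features[0] > 0, Features = features };
}).ToList();
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Filled in by the delegate once the chain is fit.
BinaryPredictionTransformer<FastForestBinaryModelParameters>? captured = null;

var pipeline = mlContext.Transforms.NormalizeMinMax("Features")
    .Append(mlContext.BinaryClassification.Trainers.FastForest(numberOfTrees: 10)
        .WithOnFitDelegate(transformer => captured = transformer));

// Fitting the chain returns a general chain transformer,
// but the delegate has captured the typed one from the trainer.
var model = pipeline.Fit(data);
Console.WriteLine(captured != null ? "Captured the trainer's transformer." : "Nothing captured.");

public class DataPoint
{
    public bool Label { get; set; }

    [VectorType(2)]
    public float[] Features { get; set; }
}
```

The captured object gives access to the trainer's model parameters through its Model property without having to dig the last transformer out of the fitted chain.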