FastTreeRankingFeaturizationEstimator Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
An IEstimator<TTransformer> that transforms an input feature vector into tree-based features.
public sealed class FastTreeRankingFeaturizationEstimator : Microsoft.ML.Trainers.FastTree.TreeEnsembleFeaturizationEstimatorBase
type FastTreeRankingFeaturizationEstimator = class
    inherit TreeEnsembleFeaturizationEstimatorBase
Public NotInheritable Class FastTreeRankingFeaturizationEstimator
    Inherits TreeEnsembleFeaturizationEstimatorBase
Remarks
Input and Output Columns
The input label data type must be key type or Single. The value of the label determines relevance, where higher values indicate higher relevance. If the label is a key type, the key index is the relevance value, and the smallest index is the least relevant. If the label is a Single, larger values indicate higher relevance. The feature column must be a known-sized vector of Single, and the input row group column must be key type.
This estimator outputs the following columns:
Output Column Name | Column Type | Description |
---|---|---|
Trees | Known-sized vector of Single | The output values of all trees. Its size is identical to the total number of trees in the tree ensemble model. |
Leaves | Known-sized vector of Single | 0-1 vector representation of the IDs of all leaves that the input feature vector falls into. Its size is the total number of leaves in the tree ensemble model. |
Paths | Known-sized vector of Single | 0-1 vector representation of the paths the input feature vector passes through to reach the leaves. Its size is the number of non-leaf nodes in the tree ensemble model. |
These output columns are all optional, and users can change their names. Set the name of any column you want to skip to null so that it is not produced.
Prediction Details
This estimator produces several output columns from a tree ensemble model. Assume that the model contains only one decision tree:
                   Node 0
                   /    \
                 /        \
               /            \
             /                \
           Node 1            Node 2
           /    \            /    \
         /        \        /        \
       /            \     Leaf -3  Node 3
     Leaf -1      Leaf -2          /    \
                                 /        \
                            Leaf -4      Leaf -5
Assume that the input feature vector falls into Leaf -1. The output Trees is then a 1-element vector whose only value is the decision value carried by Leaf -1. The output Leaves is a 0-1 vector. If the reached leaf is the $i$-th leaf in the tree (leaves are indexed by $-(i+1)$, so the first leaf is Leaf -1), the $i$-th value in Leaves is 1 and all other values are 0. The output Paths is a 0-1 representation of the nodes passed through before reaching the leaf. The $i$-th element in Paths indicates whether the $i$-th node (indexed by $i$) is touched. For example, reaching Leaf -1 leads to $[1, 1, 0, 0]$ as the Paths. If there are multiple trees, this estimator simply concatenates the Trees, Leaves, and Paths vectors from all trees (the first tree's information comes first in the concatenated vectors).
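The encoding described above can be sketched in a few lines of Python. This is only an illustration of the Trees, Leaves, and Paths vectors for the example tree, not ML.NET code: the `Tree` class, the split features, the thresholds, the leaf values, and the "less than or equal goes left" rule are all hypothetical choices made so that the example can run.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    # For internal node i: the feature index it tests, its threshold, and its
    # two child ids. A non-negative child id is another internal node; a
    # negative id -(k+1) denotes the k-th leaf, so -1 is the first leaf.
    feature: list
    threshold: list
    left: list
    right: list
    leaf_value: list  # leaf_value[k] is the decision value of leaf -(k+1)


def featurize_tree(tree, x):
    """Return (trees, leaves, paths) for one tree and one feature vector."""
    paths = [0.0] * len(tree.feature)      # one slot per non-leaf node
    leaves = [0.0] * len(tree.leaf_value)  # one slot per leaf
    node = 0
    while node >= 0:
        paths[node] = 1.0  # this internal node is touched
        # Assumed split rule: values <= threshold go to the left child.
        go_left = x[tree.feature[node]] <= tree.threshold[node]
        node = tree.left[node] if go_left else tree.right[node]
    k = -node - 1  # leaf id -(k+1) maps to position k
    leaves[k] = 1.0
    return [tree.leaf_value[k]], leaves, paths


def featurize_ensemble(trees, x):
    """Concatenate per-tree outputs, first tree first."""
    all_trees, all_leaves, all_paths = [], [], []
    for tree in trees:
        values, leaves, paths = featurize_tree(tree, x)
        all_trees += values
        all_leaves += leaves
        all_paths += paths
    return all_trees, all_leaves, all_paths


# The example tree: Node 0 -> (Node 1, Node 2), Node 1 -> (Leaf -1, Leaf -2),
# Node 2 -> (Leaf -3, Node 3), Node 3 -> (Leaf -4, Leaf -5).
# Features, thresholds, and leaf values are made up for illustration.
EXAMPLE_TREE = Tree(
    feature=[0, 1, 0, 1],
    threshold=[0.5, 0.5, 1.5, 0.5],
    left=[1, -1, -3, -4],
    right=[2, -2, 3, -5],
    leaf_value=[0.1, 0.2, 0.3, 0.4, 0.5],
)

trees, leaves, paths = featurize_tree(EXAMPLE_TREE, [0.0, 0.0])
print(trees)   # [0.1]
print(leaves)  # [1.0, 0.0, 0.0, 0.0, 0.0]
print(paths)   # [1.0, 1.0, 0.0, 0.0]
```

With this hypothetical tree, the input `[0.0, 0.0]` lands in Leaf -1 and yields the Paths vector $[1, 1, 0, 0]$ from the example. Passing a list of two such trees to `featurize_ensemble` yields vectors of length 2, 10, and 8, matching the sizes stated in the output column table.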
Check the See Also section for links to usage examples.
Methods
Method | Description |
---|---|
Fit(IDataView) | Produce a TreeEnsembleModelParameters which maps the column called InputColumnName in |
GetOutputSchema(SchemaShape) | PretrainedTreeFeaturizationEstimator adds three float-vector columns into |
Extension Methods
Extension Method | Description |
---|---|
AppendCacheCheckpoint<TTrans>(IEstimator<TTrans>, IHostEnvironment) | Append a 'caching checkpoint' to the estimator chain. This ensures that the downstream estimators are trained against cached data. It is helpful to have a caching checkpoint before trainers that take multiple data passes. |
WithOnFitDelegate<TTransformer>(IEstimator<TTransformer>, Action<TTransformer>) | Given an estimator, return a wrapping object that will call a delegate once Fit(IDataView) is called. It is often important for an estimator to return information about what was fit, which is why the Fit(IDataView) method returns a specifically typed object rather than just a general ITransformer. However, IEstimator<TTransformer> objects are often formed into pipelines with many objects, so we may need to build a chain of estimators via EstimatorChain<TLastTransformer> where the estimator whose transformer we want is buried somewhere in the chain. For that scenario, this method lets us attach a delegate that will be called once Fit is called. |