SupportedTransformers Class
Defines customer-facing names for transformers supported by AutoML.
Transformers are classified for use with Categorical data (e.g., CatImputer), DateTime data (e.g., DateTimeTransformer), Text data (e.g., TfIdf), or Generic data types (e.g., Imputer).
Inheritance
builtins.object -> SupportedTransformers
Constructor
SupportedTransformers()
Remarks
The attributes defined in SupportedTransformers are used in featurization summaries when using automatic preprocessing in automated ML, and when customizing featurization with the FeaturizationConfig class, as in the following example.
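from azureml.automl.core.featurization import FeaturizationConfig
featurization_config = FeaturizationConfig()
# Impute missing values in column1 with the median.
featurization_config.add_transformer_params('Imputer', ['column1'], {"strategy": "median"})
# Set the number of hash bits for HashOneHotEncoder.
featurization_config.add_transformer_params('HashOneHotEncoder', [], {"number_of_bits": 3})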
For more information, see Configure automated ML experiments.
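Because the attributes are plain strings, it can help to check a transformer name before passing it to a customization call. The following is a minimal sketch only: check_customizable is a hypothetical helper, not part of the SDK, and the set is copied by hand from the CUSTOMIZABLE_TRANSFORMERS attribute listed on this page so that the snippet runs without the SDK installed.

```python
# Local copy of the set listed under CUSTOMIZABLE_TRANSFORMERS on this page.
# With the SDK installed, the class attribute itself would be the source of truth.
CUSTOMIZABLE_TRANSFORMERS = {"HashOneHotEncoder", "Imputer", "TfIdf"}


def check_customizable(name: str) -> str:
    """Return name unchanged if it is customizable, else raise ValueError.

    Hypothetical helper for illustration; not part of the SDK.
    """
    if name not in CUSTOMIZABLE_TRANSFORMERS:
        allowed = ", ".join(sorted(CUSTOMIZABLE_TRANSFORMERS))
        raise ValueError(f"{name!r} is not customizable; expected one of: {allowed}")
    return name


# Hypothetical customization requests: (transformer, columns, parameters).
requests = [
    ("Imputer", ["column1"], {"strategy": "median"}),
    ("HashOneHotEncoder", [], {"number_of_bits": 3}),
]

# Validate every transformer name up front; a misspelled or
# non-customizable name raises before any configuration is built.
validated = [(check_customizable(t), cols, params) for t, cols, params in requests]
```

Each validated tuple could then be unpacked into a call such as add_transformer_params on a FeaturizationConfig instance.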
Attributes
ImputationMarker
Add a boolean imputation marker for imputed values.
ImputationMarker = 'ImputationMarker'
Imputer
Complete missing values.
Imputer = 'Imputer'
MaxAbsScaler
Scale data by its maximum absolute value.
MaxAbsScaler = 'MaxAbsScaler'
CatImputer
Impute missing values for categorical features by the most frequent category.
CatImputer = 'CatImputer'
HashOneHotEncoder
Hash the input and encode it as a one-hot encoded vector.
HashOneHotEncoder = 'HashOneHotEncoder'
LabelEncoder
Encode categorical data into numbers.
LabelEncoder = 'LabelEncoder'
CatTargetEncoder
Map categorical data to the averaged target value for regression and to the class probability for classification.
CatTargetEncoder = 'CatTargetEncoder'
WoETargetEncoder
Calculate the Weight of Evidence of the correlation of categorical data to a target column.
WoETargetEncoder = 'WoETargetEncoder'
OneHotEncoder
Convert input to one-hot encoded vector.
OneHotEncoder = 'OneHotEncoder'
DateTimeTransformer
Expand datetime features into sub-features such as year, month, and day.
DateTimeTransformer = 'DateTimeTransformer'
CountVectorizer
Convert a collection of documents to a matrix of token counts.
CountVectorizer = 'CountVectorizer'
NaiveBayes
Transform textual data using sklearn Multinomial Naïve Bayes.
NaiveBayes = 'NaiveBayes'
StringCast
Cast input to string and convert it to lowercase.
StringCast = 'StringCast'
TextTargetEncoder
Apply target encoding to text data where a stacked linear model with bag-of-words generates the probability of each class.
TextTargetEncoder = 'TextTargetEncoder'
TfIdf
Transform a count matrix to a normalized TF or TF-IDF representation.
TfIdf = 'TfIdf'
TimeIndexFeaturizer
Create datetime-based features using the time_index_featurizer class.
TimeIndexFeaturizer = 'TimeIndexFeaturizer'
WordEmbedding
Convert vectors of text tokens into sentence vectors using a pre-trained model.
WordEmbedding = 'WordEmbedding'
CUSTOMIZABLE_TRANSFORMERS
Transformers that can be customized during featurization through the parameters of methods in the FeaturizationConfig class.
CUSTOMIZABLE_TRANSFORMERS = {'HashOneHotEncoder', 'Imputer', 'TfIdf'}
BLOCK_TRANSFORMERS
Transformers that can be blocked from use in featurization in the FeaturizationConfig class.
BLOCK_TRANSFORMERS = {'CatTargetEncoder', 'CountVectorizer', 'HashOneHotEncoder', 'LabelEncoder', 'NaiveBayes', 'OneHotEncoder', 'TextTargetEncoder', 'TfIdf', 'TimeIndexFeaturizer', 'WoETargetEncoder', 'WordEmbedding'}
FULL_SET
The full set of transformers.
FULL_SET = {'CatImputer', 'CatTargetEncoder', 'CountVectorizer', 'DateTimeTransformer', 'HashOneHotEncoder', 'ImputationMarker', 'Imputer', 'LabelEncoder', 'MaxAbsScaler', 'NaiveBayes', 'OneHotEncoder', 'StringCast', 'TextTargetEncoder', 'TfIdf', 'TimeIndexFeaturizer', 'WoETargetEncoder', 'WordEmbedding'}
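The relationship between the three sets can be checked with ordinary set operations. The sketch below copies the values listed above into local variables (an assumption made so the snippet runs without the SDK) and derives which transformers appear in FULL_SET but not in BLOCK_TRANSFORMERS.

```python
# Values copied from the attribute listings on this page.
CUSTOMIZABLE_TRANSFORMERS = {"HashOneHotEncoder", "Imputer", "TfIdf"}
BLOCK_TRANSFORMERS = {
    "CatTargetEncoder", "CountVectorizer", "HashOneHotEncoder", "LabelEncoder",
    "NaiveBayes", "OneHotEncoder", "TextTargetEncoder", "TfIdf",
    "TimeIndexFeaturizer", "WoETargetEncoder", "WordEmbedding",
}
FULL_SET = {
    "CatImputer", "CatTargetEncoder", "CountVectorizer", "DateTimeTransformer",
    "HashOneHotEncoder", "ImputationMarker", "Imputer", "LabelEncoder",
    "MaxAbsScaler", "NaiveBayes", "OneHotEncoder", "StringCast",
    "TextTargetEncoder", "TfIdf", "TimeIndexFeaturizer", "WoETargetEncoder",
    "WordEmbedding",
}

# Both smaller sets are subsets of the full set.
customizable_ok = CUSTOMIZABLE_TRANSFORMERS <= FULL_SET
blockable_ok = BLOCK_TRANSFORMERS <= FULL_SET

# Transformers listed in FULL_SET that are not listed as blockable.
not_blockable = sorted(FULL_SET - BLOCK_TRANSFORMERS)

# Transformers that are listed as both customizable and blockable.
customizable_and_blockable = sorted(CUSTOMIZABLE_TRANSFORMERS & BLOCK_TRANSFORMERS)

print(not_blockable)
print(customizable_and_blockable)
```

By these listings, Imputer is customizable but not blockable, while HashOneHotEncoder and TfIdf are both.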