SupportedTransformers Class

Defines customer-facing names for transformers supported by AutoML.

Transformers are classified for use with Categorical data (e.g., CatImputer), DateTime data (e.g., DateTimeTransformer), Text data (e.g., TfIdf), or for Generic data types (e.g., Imputer).

Inheritance
builtins.object
SupportedTransformers

Constructor

SupportedTransformers()

Remarks

The attributes defined in SupportedTransformers appear in featurization summaries when automated ML applies automatic preprocessing. They are also the transformer names used when customizing featurization with the FeaturizationConfig class, as shown in the example.


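   from azureml.automl.core.featurization import FeaturizationConfig

   featurization_config = FeaturizationConfig()
   # Impute missing values in 'column1' with the column median.
   featurization_config.add_transformer_params('Imputer', ['column1'], {"strategy": "median"})
   featurization_config.add_transformer_params('HashOneHotEncoder', [], {"number_of_bits": 3})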

For more information, see Configure automated ML experiments.

Attributes

ImputationMarker

Add a boolean imputation marker for imputed values.

ImputationMarker = 'ImputationMarker'

Imputer

Complete missing values.

Imputer = 'Imputer'

MaxAbsScaler

Scale data by its maximum absolute value.

MaxAbsScaler = 'MaxAbsScaler'

CatImputer

Impute missing values for categorical features by the most frequent category.

CatImputer = 'CatImputer'

HashOneHotEncoder

Hash the input and encode the result as a one-hot encoded vector.

HashOneHotEncoder = 'HashOneHotEncoder'

LabelEncoder

Encode categorical data into numbers.

LabelEncoder = 'LabelEncoder'

CatTargetEncoder

Map categorical data to the averaged target value for regression and to the class probability for classification.

CatTargetEncoder = 'CatTargetEncoder'

WoETargetEncoder

Calculate the Weight of Evidence, a measure of how categorical data correlates with a target column.

WoETargetEncoder = 'WoETargetEncoder'

OneHotEncoder

Convert input to a one-hot encoded vector.

OneHotEncoder = 'OneHotEncoder'

DateTimeTransformer

Expand datetime features into sub-features such as year, month, and day.

DateTimeTransformer = 'DateTimeTransformer'

CountVectorizer

Convert a collection of documents to a matrix of token counts.

CountVectorizer = 'CountVectorizer'

NaiveBayes

Transform textual data using sklearn Multinomial Naïve Bayes.

NaiveBayes = 'NaiveBayes'

StringCast

Cast input to string and convert it to lowercase.

StringCast = 'StringCast'

TextTargetEncoder

Apply target encoding to text data where a stacked linear model with bag-of-words generates the probability of each class.

TextTargetEncoder = 'TextTargetEncoder'

TfIdf

Transform a count matrix to a normalized TF or TF-IDF representation.

TfIdf = 'TfIdf'

TimeIndexFeaturizer

Create datetime-based features using the time_index_featurizer class.

TimeIndexFeaturizer = 'TimeIndexFeaturizer'

WordEmbedding

Convert vectors of text tokens into sentence vectors using a pre-trained model.

WordEmbedding = 'WordEmbedding'

CUSTOMIZABLE_TRANSFORMERS

Transformers whose parameters can be customized during featurization through methods of the FeaturizationConfig class.

CUSTOMIZABLE_TRANSFORMERS = {'HashOneHotEncoder', 'Imputer', 'TfIdf'}
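Because only these three transformers accept custom parameters, it can help to check a name before passing it to add_transformer_params. The following is a minimal sketch, not part of the SDK: check_customizable is a hypothetical helper, and the set is a local copy of the value documented above so that the snippet runs without the Azure ML SDK installed. Whether the SDK itself rejects other names at call time is not stated here.

```python
# Local copy of the documented CUSTOMIZABLE_TRANSFORMERS value (assumption:
# it matches the installed SDK version).
CUSTOMIZABLE_TRANSFORMERS = {"HashOneHotEncoder", "Imputer", "TfIdf"}


def check_customizable(transformer_name: str) -> str:
    """Hypothetical helper: return the name if customizable, else raise."""
    if transformer_name not in CUSTOMIZABLE_TRANSFORMERS:
        allowed = ", ".join(sorted(CUSTOMIZABLE_TRANSFORMERS))
        raise ValueError(
            f"{transformer_name!r} is not customizable; expected one of: {allowed}"
        )
    return transformer_name
```

A call such as check_customizable('Imputer') returns the name unchanged, while check_customizable('OneHotEncoder') raises ValueError.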

BLOCK_TRANSFORMERS

Transformers that can be blocked from use in featurization in the FeaturizationConfig class.

BLOCK_TRANSFORMERS = {'CatTargetEncoder', 'CountVectorizer', 'HashOneHotEncoder', 'LabelEncoder', 'NaiveBayes', 'OneHotEncoder', 'TextTargetEncoder', 'TfIdf', 'TimeIndexFeaturizer', 'WoETargetEncoder', 'WordEmbedding'}

FULL_SET

The full set of transformers.

FULL_SET = {'CatImputer', 'CatTargetEncoder', 'CountVectorizer', 'DateTimeTransformer', 'HashOneHotEncoder', 'ImputationMarker', 'Imputer', 'LabelEncoder', 'MaxAbsScaler', 'NaiveBayes', 'OneHotEncoder', 'StringCast', 'TextTargetEncoder', 'TfIdf', 'TimeIndexFeaturizer', 'WoETargetEncoder', 'WordEmbedding'}
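The three sets above are related: every customizable or blockable transformer is also a member of FULL_SET, and some transformers in FULL_SET cannot be blocked. The sketch below illustrates this with plain Python set operations. The sets are local copies of the documented values (an assumption about the installed SDK version), and NOT_BLOCKABLE is a name introduced here for illustration only.

```python
# Local copies of the documented values, so the sketch runs without the SDK.
CUSTOMIZABLE_TRANSFORMERS = {"HashOneHotEncoder", "Imputer", "TfIdf"}
BLOCK_TRANSFORMERS = {
    "CatTargetEncoder", "CountVectorizer", "HashOneHotEncoder", "LabelEncoder",
    "NaiveBayes", "OneHotEncoder", "TextTargetEncoder", "TfIdf",
    "TimeIndexFeaturizer", "WoETargetEncoder", "WordEmbedding",
}
FULL_SET = {
    "CatImputer", "CatTargetEncoder", "CountVectorizer", "DateTimeTransformer",
    "HashOneHotEncoder", "ImputationMarker", "Imputer", "LabelEncoder",
    "MaxAbsScaler", "NaiveBayes", "OneHotEncoder", "StringCast",
    "TextTargetEncoder", "TfIdf", "TimeIndexFeaturizer", "WoETargetEncoder",
    "WordEmbedding",
}

# Both smaller sets are subsets of the full set.
assert CUSTOMIZABLE_TRANSFORMERS <= FULL_SET
assert BLOCK_TRANSFORMERS <= FULL_SET

# Transformers listed in FULL_SET that do not appear in BLOCK_TRANSFORMERS.
NOT_BLOCKABLE = FULL_SET - BLOCK_TRANSFORMERS
print(sorted(NOT_BLOCKABLE))
```

With the values listed on this page, NOT_BLOCKABLE contains CatImputer, DateTimeTransformer, ImputationMarker, Imputer, MaxAbsScaler, and StringCast.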