Azure AutoML Featurisation Error

SoonJoo@Genting 221 Reputation points
2021-11-29T02:55:50.033+00:00

According to the following doc, I should be able to turn on FeaturizationConfig in the settings:
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-configure-auto-train

However, I'm getting the following error when I set featurization to 'FeaturizationConfig' while setting up the AutoML experiment:

ConfigException: ConfigException: Message: Invalid argument(s) 'featurizationconfig' specified. Supported value(s): 'off, auto'

These are my settings:

import logging

automl_settings = {
    "iteration_timeout_minutes": 15,
    "experiment_timeout_hours": 0.3,
    "enable_early_stopping": True,
    "primary_metric": 'spearman_correlation',
    "featurization": 'FeaturizationConfig',
    "verbosity": logging.INFO,
    "n_cross_validations": 5
}

Azure Machine Learning

Accepted answer
  1. Ramr-msft 17,596 Reputation points
    2021-11-29T12:46:23.847+00:00

    @SoonJoo@Genting Thanks. Previously, preprocessing was a black box controlled by the user's preprocess=True/False setting.
    The new change deprecates preprocess and introduces a new field, featurization, which accepts 'auto' (automatic featurization, comparable to preprocess=True), 'off' (no featurization, comparable to preprocess=False), or a FeaturizationConfig object (a customized featurization configuration). The error occurs because the string 'FeaturizationConfig' was passed rather than a FeaturizationConfig object.

    More information on custom featurization, including how to construct a FeaturizationConfig, is in this documentation.
    We also have a notebook with an example in our git repo.
    Usage example:

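    from azureml.train.automl import AutoMLConfig
    from azureml.automl.core.featurization import FeaturizationConfig

    # Build a FeaturizationConfig object instead of passing the string 'FeaturizationConfig'.
    featurization_config = FeaturizationConfig()
    featurization_config.add_column_purpose('Column2', 'Categorical')
    featurization_config.add_column_purpose('Column5', 'Categorical')

    # Remove the "featurization" key from automl_settings first; supplying it both in the
    # dict and as a keyword raises a TypeError for the duplicate argument.
    # compute_target and experiment are assumed to be defined earlier.
    automl_config = AutoMLConfig(task='classification', compute_target=compute_target, featurization=featurization_config, **automl_settings)
    remote_run = experiment.submit(automl_config, show_output=False)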
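
The automl_settings dict in the question already contains a "featurization" key, so unpacking it with ** next to an explicit featurization= keyword would fail with Python's duplicate keyword error. A minimal sketch of one way to separate the two is below; split_featurization is a hypothetical helper written for this thread, not part of the Azure ML SDK, and the settings dict is a trimmed copy of the one in the question.

```python
def split_featurization(settings):
    """Return (copy of settings without 'featurization', the removed value or None)."""
    remaining = dict(settings)  # shallow copy; the caller's dict is left untouched
    removed = remaining.pop("featurization", None)
    return remaining, removed


# Trimmed copy of the settings from the question.
automl_settings = {
    "iteration_timeout_minutes": 15,
    "featurization": "FeaturizationConfig",  # the string that triggered the ConfigException
    "n_cross_validations": 5,
}

clean_settings, removed_value = split_featurization(automl_settings)
```

clean_settings can then be unpacked as **clean_settings alongside featurization=featurization_config without the two colliding.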
    

    For classification & regression you do have the option to turn off automatic featurization.

    featurization (str or FeaturizationConfig)
    'auto' / 'off' / FeaturizationConfig. Indicator for whether the featurization step should be done automatically or not, or whether customized featurization should be used.

    Note: Timeseries features are handled separately when the task type is set to forecasting, independent of this parameter.
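
The parameter description above can be turned into a small pre-flight check that catches the mistake from the question before an experiment is configured. This is only a sketch of the documented rule ('auto', 'off', or a config object); check_featurization and ALLOWED_STRINGS are hypothetical names, this is not the SDK's own validation code, and it assumes that any non-string value is a FeaturizationConfig object.

```python
ALLOWED_STRINGS = ("auto", "off")


def check_featurization(value):
    """Reject strings other than 'auto'/'off'; pass any non-string value through.

    Assumption: a non-string value is a FeaturizationConfig object.
    """
    if isinstance(value, str) and value not in ALLOWED_STRINGS:
        raise ValueError(
            f"featurization={value!r} is a string but not one of {ALLOWED_STRINGS}; "
            "pass a FeaturizationConfig object instead of its name"
        )
    return value
```

Calling check_featurization('FeaturizationConfig') raises a ValueError, while 'auto', 'off', and object values are returned unchanged.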

    Source: AutoMLConfig Class
