AutoML automatic data preprocessing?

Question

Hyun Jae Cho

I used a dataset that contains missing values, and AutoML reached over 90% accuracy. How did AutoML handle the missing values, and is there a way to retrieve the preprocessed dataset that AutoML created? Or does it ignore rows with missing data?

Additionally, I selected "enable deep learning" when creating my AutoML instance, but when I look at the models tried after the process completes, I do not see that any deep learning models were tried. Why is that? I see random forest, LightGBM, and XGBoost, but no deep neural nets.

Thank you.

1 answer

Answer 1

@Hyun Jae Cho Thanks for the question. The AutoML documentation on data prep and feature engineering covers this.

Missing value imputation: mean, median, mode, and a one-hot imputation marker.
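To make the imputation options above concrete, here is a minimal sketch in plain Python. It is not AutoML's actual implementation; the function name `impute_with_marker` and its parameters are hypothetical and only illustrate the general idea of filling gaps with a mean, median, or mode while recording which entries were originally missing.

```python
from statistics import mean, median, mode


def impute_with_marker(values, strategy="mean"):
    """Fill None entries in a column and flag which entries were missing.

    Hypothetical helper for illustration only; not part of any AutoML API.
    Returns (filled_values, marker), where marker[i] is 1 if values[i]
    was missing and 0 otherwise.
    """
    strategies = {"mean": mean, "median": median, "mode": mode}
    if strategy not in strategies:
        raise ValueError(f"unknown strategy: {strategy}")

    # Only observed (non-missing) values contribute to the fill statistic.
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")

    fill = strategies[strategy](observed)
    filled = [fill if v is None else v for v in values]
    marker = [1 if v is None else 0 for v in values]
    return filled, marker


# Numeric column: fill with the mean of the observed values.
ages = [20, None, 40, 30, None]
filled_ages, ages_missing = impute_with_marker(ages, "mean")

# Categorical column: fill with the most frequent observed value.
colors = ["red", None, "blue", "red"]
filled_colors, colors_missing = impute_with_marker(colors, "mode")
```

Note that rows are kept rather than dropped in this approach: the marker column preserves the information that a value was missing, so a model can still learn from the missingness pattern.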

Are you using the SDK or the UI? We are working on a progress bar that will show up when training DNNs. Until that is released, there are a couple of ways to verify whether DNNs were used in the model:

• Logs in the portal: azureml_automl.log will contain a printed statement for the pretrained transformer or the BiLSTM transformer. Search the logs for the string “Added”, which shows which transformers were added during featurization. When BERT or BiLSTM is chosen, a “pretrained” or “bilstm” transformer appears in this log line.
• Downloaded model: after training, download the model. If all required dependencies are available locally, you can unpickle the downloaded model and inspect its featurization steps, which will list the pretrained or bilstm transformer when the model pipeline includes it.

• The ML interpretability dashboard is supported for understanding feature importance for all models except ForecastTCN.
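The log check described above can be scripted once azureml_automl.log has been downloaded. The sketch below is only an illustration: the function name `find_dnn_transformers` is hypothetical, and the sample log lines are invented stand-ins, since the exact log format is not shown in this thread. The only details taken from the answer are the search string “Added” and the keywords “pretrained” and “bilstm”.

```python
def find_dnn_transformers(log_lines, keywords=("pretrained", "bilstm")):
    """Return (matched_keywords, line) for each "Added" line naming a DNN transformer.

    Hypothetical helper; the real log layout may differ from the sample below.
    """
    hits = []
    for line in log_lines:
        # Only lines containing "Added" report featurization transformers.
        if "Added" not in line:
            continue
        lowered = line.lower()
        found = [k for k in keywords if k in lowered]
        if found:
            hits.append((found, line.rstrip("\n")))
    return hits


# Invented sample lines, for illustration only (not real AutoML output).
sample_log = [
    "INFO Added transformer: count vectorizer\n",
    "INFO Added transformer: pretrained text transformer\n",
    "INFO Starting model training\n",
]
matches = find_dnn_transformers(sample_log)
for found, line in matches:
    print(found, line)
```

To run it against a real file, pass an open file handle instead of the sample list, for example `find_dnn_transformers(open("azureml_automl.log", encoding="utf-8"))`. An empty result suggests that no BERT or BiLSTM featurizer was added for that run.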
