Clean Missing Data

MUJEEBUR RAHMAN 16 Reputation points
2020-09-15T12:35:35.01+00:00

While working on a machine learning experiment in Azure ML Studio, I used a dataset that contained some missing data, so I added the Clean Missing Data module. Among its settings are two fields named Minimum Missing Value Ratio and Maximum Missing Value Ratio. I referred to the documentation mentioned here, but I couldn't tell whether these fields are required. Is it necessary to provide values for these two fields?

Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. GiftA-MSFT 11,161 Reputation points
    2020-09-16T14:00:54.36+00:00

    Hi, thanks for reaching out. Together, the minimum and maximum missing value ratios define the conditions under which the cleaning operation is performed on the dataset. Each ratio is the proportion of missing values to all values in a column, expressed as a number between 0 and 1.

    Minimum missing value ratio specifies the smallest proportion of missing values required for the operation to be performed. By default it is set to 0, which means that missing values are cleaned even if there is only one missing value. If you set it to 0.2 (20%), missing values are cleaned only when more than 20% of the rows in a column are missing a value.

    Maximum missing value ratio specifies the largest proportion of missing values that can be present for the operation to be performed. By default it is set to 1, which means that missing values are cleaned even if every value in the column is missing. If you set it to 0.2 (20%), missing values are cleaned only when 20% or fewer of the rows in a column are missing a value.

    Both fields have defaults, so you can leave them unchanged. I would consider setting them necessary only if you want to restrict the conditions under which the cleaning operation applies. Hope this helps.
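
The condition described above can be illustrated with a minimal Python sketch. It is not the module's implementation: the function names `missing_ratio` and `should_clean` are hypothetical, missing values are represented as `None`, and treating both bounds as inclusive is an assumption made for the illustration.

```python
from typing import Optional, Sequence


def missing_ratio(column: Sequence[Optional[float]]) -> float:
    """Return the fraction of entries in a column that are missing (None)."""
    if not column:
        return 0.0
    return sum(value is None for value in column) / len(column)


def should_clean(
    column: Sequence[Optional[float]],
    min_ratio: float = 0.0,  # default described in the answer
    max_ratio: float = 1.0,  # default: clean even if everything is missing
) -> bool:
    """Decide whether a column qualifies for cleaning.

    Assumption: both bounds are inclusive; the real module may treat
    the boundary values differently.
    """
    return min_ratio <= missing_ratio(column) <= max_ratio


# Hypothetical column: 2 of 10 values are missing, so the ratio is 0.2.
column = [1.0, None, 3.0, None, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

print(should_clean(column))                 # defaults: the column qualifies
print(should_clean(column, min_ratio=0.5))  # too few missing values
print(should_clean(column, max_ratio=0.1))  # too many missing values
```

With the defaults every column qualifies, which is why the two fields only need changing when cleaning should be limited to columns within a particular range of missing values.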

    1 person found this answer helpful.