"azureml-dataprep" error with AutoML UI (no coding)

Ramesh Srinivasan 20 Reputation points
2024-11-28T16:40:07.8333333+00:00

Hello,

I'm new to Azure ML Studio. Under "Authoring" -> "Automated ML", I'm creating a new "Automated ML Job" with "Classification" Type.

The job executes flawlessly with 10 columns/features in the model.

However, I have a complex model with 2000 features. Within a few minutes the job fails with the message below.

Failed to load dataset definition with azureml-dataprep==5.1.6. (Engineless). Please install the latest version with "pip install -U azureml-dataprep"

This error occurs only when the CSV file has 2000+ columns. It doesn't happen with fewer columns.

Any suggestions or recommendations?

Thanks!


Azure Machine Learning
An Azure machine learning service for building and deploying models.

Accepted answer
  1. Saideep Anchuri 420 Reputation points Microsoft Vendor
    2024-11-29T06:12:03.8466667+00:00

    Hi Ramesh Srinivasan

    Welcome to Microsoft Q&A Forum, thank you for posting your query here!

    The error most likely occurs because the azureml-dataprep library struggles to load the high-dimensional dataset with 2000+ features. First, try updating azureml-dataprep to the latest version with "pip install -U azureml-dataprep". Next, increase the compute resources for your Automated ML job by selecting a cluster with more memory and more CPUs. You can also reduce the number of features before training, using feature selection to drop irrelevant or redundant columns. If the issue persists, consider splitting the dataset into smaller chunks before passing it to Automated ML. These steps should help the job handle the large dataset.
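
    As a rough illustration of the feature-reduction suggestion above, here is a minimal sketch that drops constant columns and exact duplicate columns from a CSV before it is uploaded. It uses only the Python standard library. The function name drop_redundant_columns and the sample data are hypothetical, and this is not something AutoML or azureml-dataprep does for you; it is one simple way to trim the column count, under the assumption that the file fits in memory.

```python
import csv
import io


def drop_redundant_columns(rows):
    """Return (kept_header, kept_rows) without constant or duplicate columns.

    `rows` is a list of lists whose first entry is the header row.
    Hypothetical helper for illustration only.
    """
    header, data = rows[0], rows[1:]
    columns = list(zip(*data))  # transpose: one tuple per column
    seen = set()
    keep = []
    for idx, col in enumerate(columns):
        if len(set(col)) <= 1:   # constant column carries no signal
            continue
        if col in seen:          # exact duplicate of an earlier column
            continue
        seen.add(col)
        keep.append(idx)
    kept_header = [header[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in data]
    return kept_header, kept_rows


# Tiny in-memory CSV standing in for the real 2000-column file.
sample = io.StringIO(
    "f1,f2,f3,f4,label\n"
    "1,0,1,7,yes\n"
    "2,0,2,8,no\n"
    "3,0,3,9,yes\n"
)
header, data = drop_redundant_columns(list(csv.reader(sample)))
print(header)
```

    In this sample, f2 is constant and f3 duplicates f1, so both are dropped. For a real file you would read it with csv.reader, write the result back with csv.writer, and register the smaller file as the dataset. Note that the target column is treated like any other column here, so check that it is still present in the output.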

    Hope this helps. Do let us know if you have any further queries.

    If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".

    Thank You.

    1 person found this answer helpful.
