How to set the environment for AutoML regression tasks to run on?

Nik Adlakha 0 Reputation points
2025-06-16T03:17:46.3+00:00

I am trying to set the environment for AutoML regression jobs. My entire codebase is set up on Python 3.10, and I have an environment set up for it as well. Every pipeline step runs on this environment except AutoML regression. The issue arises when I try to load the model in the ML pipeline step that evaluates it: because the AutoML job trains on Python 3.9, its dependencies differ significantly from my environment, and the model fails to load.

I have tried passing the environment to the regression step. It does not complain, but it does not set up the environment either.


train = regression(
    experiment_name=customer_definition.experiment_name,
    training_data=train_test_split_step.outputs.train_data,
    test_data=train_test_split_step.outputs.val_data,
    target_column_name=target_column_name,
    primary_metric=primary_metric,
    enable_model_explainability=True,
    n_cross_validations=n_cross_validations,
    compute=compute_name,
    environment_id=environment_id,
    outputs={"best_model": Output(type="mlflow_model")},
)


Can you help me fix this?

Here is the error:

To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
2025/06/16 02:17:56 WARNING mlflow.pyfunc: The version of Python that the model was saved in, `Python 3.9.22`, differs from the version of Python that is currently running, `Python 3.10.16`, and may be incompatible
ERROR:__main__:Failed to load model

I tried installing the model dependencies using the function indicated, but I still get the same error.

mlflow.pyfunc.get_model_dependencies(model_uri)
model = mlflow.pyfunc.load_model(model_uri)
Azure Machine Learning

1 answer

  1. Ravada Shivaprasad 535 Reputation points Microsoft External Staff Moderator
    2025-06-17T01:04:28.4533333+00:00

    Hi Nik Adlakha

    The issue you're experiencing stems from a fundamental mismatch between your Python environments. While your main pipeline runs on Python 3.10, the AutoML regression job defaults to Python 3.9, causing dependency conflicts when loading the model.

    First, create a dedicated environment for AutoML:

    name: automl_environment
    channels:
      - conda-forge
    dependencies:
      - python=3.9
      - mlflow
      - scikit-learn
      - cloudpickle
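
The spec above leaves package versions unpinned, so the resolved packages can drift from what the AutoML run actually used. Below is a minimal sketch, assuming you already have the model's pip requirements as text (for example, from the file whose path `mlflow.pyfunc.get_model_dependencies(model_uri)` returns). It renders a conda spec that pins Python and carries the model's requirements over unchanged. The function name `conda_spec_from_requirements` and the sample versions are hypothetical, for illustration only.

```python
from typing import Iterable


def conda_spec_from_requirements(
    requirement_lines: Iterable[str],
    python_version: str = "3.9",
    env_name: str = "automl_environment",
) -> str:
    """Render a conda environment spec that pins Python and reuses pip requirements."""
    # Keep only real requirement lines: skip blanks and comments.
    requirements = [
        line.strip()
        for line in requirement_lines
        if line.strip() and not line.strip().startswith("#")
    ]
    lines = [
        f"name: {env_name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - python={python_version}",
        "  - pip",
    ]
    # Add the nested pip section only when there is something to install.
    if requirements:
        lines.append("  - pip:")
        lines.extend(f"      - {req}" for req in requirements)
    return "\n".join(lines) + "\n"


# Hypothetical sample requirements; real ones come from the model itself.
sample = ["mlflow==2.9.2", "# a comment", "", "scikit-learn==1.5.1"]
spec = conda_spec_from_requirements(sample)
print(spec)
```

Writing `spec` to a file gives you a conda file that mirrors the model's own dependency list instead of whatever the solver picks on the day the environment is built.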
    

    Next, update the pipeline configuration:

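    from azure.ai.ml import Output
    from azure.ai.ml.automl import regression
    from azure.ai.ml.entities import Environment

    # Build the environment from the conda file above (file name and base image are assumptions).
    aml_env = Environment(
        name="automl_environment",
        conda_file="automl_environment.yml",
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    )

    train = regression(
        experiment_name=customer_definition.experiment_name,
        training_data=train_test_split_step.outputs.train_data,
        target_column_name=target_column_name,  # required by regression()
        compute=compute_name,
        environment=aml_env,  # assumption: AutoML support for this argument is not confirmed
        outputs={"best_model": Output(type="mlflow_model")},
    )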
    

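    Finally, install the model's recorded dependencies before loading it. Note that `mlflow.pyfunc.get_model_dependencies` only returns the path to the requirements file; it does not install anything by itself. If the pinned packages cannot be installed on Python 3.10, run the evaluation step in the Python 3.9 environment instead.

    import subprocess
    import sys

    import mlflow

    # Fetch the model's requirements file and install it into this interpreter.
    requirements_path = mlflow.pyfunc.get_model_dependencies(model_uri)
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", requirements_path])

    # Load the model only after its dependencies are in place.
    model = mlflow.pyfunc.load_model(model_uri)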
    
    

    This keeps the environments separate while ensuring compatibility between your main pipeline and the AutoML jobs. The key is to keep the AutoML environment isolated on Python 3.9 while your main pipeline continues to run on Python 3.10.
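
Before loading, it can help to fail fast when the interpreter does not match the Python version recorded with the model. Below is a minimal sketch, assuming the model's `MLmodel` file records `python_version` under the `python_function` flavor, as MLflow normally writes it. The helper names are hypothetical, and the regex parse is a simplification that avoids a YAML dependency.

```python
import re
import sys
from typing import Optional, Tuple


def saved_python_version(mlmodel_text: str) -> Optional[str]:
    """Return the first python_version value found in MLmodel text, or None."""
    match = re.search(r"^\s*python_version:\s*(\d+(?:\.\d+)*)", mlmodel_text, re.MULTILINE)
    return match.group(1) if match else None


def major_minor(version: str) -> Tuple[int, int]:
    """Reduce a version string such as '3.9.22' to (3, 9)."""
    parts = (version.split(".") + ["0"])[:2]  # pad so a bare '3' becomes (3, 0)
    return int(parts[0]), int(parts[1])


def matches_running_python(
    saved: str, running: Tuple[int, int] = tuple(sys.version_info[:2])
) -> bool:
    """True when the saved and running interpreters share major.minor."""
    return major_minor(saved) == tuple(running)


# Hypothetical MLmodel excerpt mirroring the versions in the warning above.
sample_mlmodel = """\
flavors:
  python_function:
    loader_module: mlflow.sklearn
    python_version: 3.9.22
"""
saved = saved_python_version(sample_mlmodel)
print(saved, matches_running_python(saved, running=(3, 10)))
```

In an evaluation step you could read the `MLmodel` file from the downloaded model folder, call `matches_running_python`, and raise a clear error instead of letting unpickling fail later.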

    Hope it helps!

    Thanks

