Downloading AzureML experiment metrics logged with MLflow

matsuo_basho 10 Reputation points
2023-12-03T17:29:25.8166667+00:00

I'm using this tutorial to query the results of a job where I've logged metrics with MLflow.

When I try the following, I get a None object returned:

mlflow.get_experiment_by_name(<exp_name>)

mlflow.search_experiments()

When I try this, I get a run not found error:

mlflow.get_run(<run_name>)

So MLflow is clearly not seeing my jobs, even though they are there in the studio. I have an active AzureML connection using the AzureML VSCode plug-in. Help me understand what the issue is and how I can download the metrics I've logged with MLflow.

Azure Machine Learning

1 answer

  1. dupammi 7,955 Reputation points Microsoft Vendor
    2023-12-06T04:44:42.6366667+00:00

    Hi @matsuo_basho ,

    I'm glad that the guidance was helpful.

    Regarding your next query: you can obtain the metrics for each epoch using the MLflow Python API. To do this, log the metrics once per epoch with mlflow.log_metric() in your training script. Call it from inside the for loop that iterates over your epochs, and pass the epoch number as step so that each value is recorded against its epoch.

    Here is a quick sample; please adjust it to your scenario:

    import mlflow
    
    # Start an MLflow run
    with mlflow.start_run():
    
        # num_epochs, train_one_epoch and validate are placeholders
        # for your own training code
        for epoch in range(num_epochs):
            # Train for one epoch, then evaluate on the validation set
            train_loss, train_acc = train_one_epoch(...)
            val_loss, val_acc = validate(...)
            
            # Log the metrics for this epoch
            mlflow.log_metric("train_loss", train_loss, step=epoch)
            mlflow.log_metric("train_acc", train_acc, step=epoch)
            mlflow.log_metric("val_loss", val_loss, step=epoch)
            mlflow.log_metric("val_acc", val_acc, step=epoch)
    

    Once you've logged the metrics for each epoch, you can retrieve them using the methods discussed in my previous responses.
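
As a rough sketch of that retrieval step (not necessarily the exact method from the earlier replies), the per-epoch values can be read back with MlflowClient.get_metric_history(). The helper names metric_history_rows and download_run_metrics are made up for illustration, and tracking_uri is assumed to be your workspace's MLflow tracking URI, which normally needs the azureml-mlflow package installed to resolve. Note that get_run() expects a run ID rather than a display name.

```python
def metric_history_rows(client, run_id, metric_keys):
    """Collect every logged (step, value) pair for the given metric keys.

    `client` is expected to behave like mlflow.tracking.MlflowClient:
    get_metric_history(run_id, key) returns objects exposing
    .key, .step and .value attributes.
    """
    rows = []
    for key in metric_keys:
        for point in client.get_metric_history(run_id, key):
            rows.append(
                {"metric": point.key, "step": point.step, "value": point.value}
            )
    # Sort so that each metric's values appear in epoch order.
    rows.sort(key=lambda row: (row["metric"], row["step"]))
    return rows


def download_run_metrics(tracking_uri, run_id):
    """Fetch the full history of every metric logged on one run.

    Hypothetical wrapper: tracking_uri and run_id must come from your
    own workspace and job.
    """
    # Imported lazily so metric_history_rows can be used on its own.
    import mlflow
    from mlflow.tracking import MlflowClient

    # Point MLflow at the workspace instead of the default local ./mlruns.
    mlflow.set_tracking_uri(tracking_uri)
    client = MlflowClient()

    # run.data.metrics only holds the latest value per metric, but its
    # keys tell us which metric histories to request.
    metric_keys = list(client.get_run(run_id).data.metrics)
    return metric_history_rows(client, run_id, metric_keys)
```

Each returned row is a plain dict, so the result can be handed to something like pandas.DataFrame(rows).to_csv("metrics.csv") if you want a file on disk.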

    Hope this helps.

