How to save or log pytorch model using MLflow?

Rishavraj Mandal 30 Reputation points
2023-05-24T17:10:11.7533333+00:00

I have main.py at the root directory, and from main.py I call the model script to train the model. The directory looks like this:

User's image

But when saving with code_paths, I am getting an error saying the directory is not found.


# Registering the model to the workspace
    mlflow.pytorch.log_model(
        pytorch_model=model,
        registered_model_name="use-case1-model",
        artifact_path="use-case1-model",
        input_example=df[['Title', 'Attributes']],
        conda_env=os.path.join("./dependencies", "conda.yaml"),
        code_paths="./models"
    )

    # Saving the model to a file
    mlflow.pytorch.save_model(
        pytorch_model= model,
        conda_env=os.path.join("./dependencies", "conda.yaml"),
        input_example=df[['Title', 'Attributes']],
        path=os.path.join(args.model, "use-case1-model"),
        code_paths="./models"
    )

Question 1: Is there a need to set the code_paths and extra_files parameters in my case?

Question 2: What is the right way to specify the directory for the code_paths and extra_files parameters?

Tags: Azure Machine Learning, Azure

1 answer

  1. YutongTie-MSFT 53,966 Reputation points Moderator
    2023-05-24T22:11:01.49+00:00

    Hello @Rishavraj Mandal

    Thanks for reaching out to us. I suggest you try the workflow below.

    To save or log a PyTorch model using MLflow, you can use the mlflow.pytorch.log_model or mlflow.pytorch.save_model functions.

    Regarding your first question, the code_paths and extra_files parameters are optional and are used to specify additional files or directories that should be included when logging or saving the model. If you don't have any additional files or directories that need to be included, you can omit these parameters.

    Regarding your second question, the code_paths parameter should be set to the path of the directory that contains the code used to train the model. This can be a local directory or a remote directory accessible via a URI. If you are running the training script locally, you can set the code_paths parameter to the path of the directory containing the training script and any other necessary files. For example, if your training script is located in the models directory, you can set code_paths to "./models". If you have multiple directories that contain code used to train the model, you can specify them as a list of strings.
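    One common cause of a "directory is not found" error here is that relative paths like "./models" are resolved against the current working directory, not against the script's location. A minimal sketch of one way to make the paths robust (the models and dependencies directory names are taken from the question; resolving against __file__ is an assumption about your setup, not something MLflow requires):

    ```python
    import os

    # Resolve paths relative to this script's own directory, so that
    # running "python main.py" from any working directory still finds
    # models/ and dependencies/ (names taken from the question's layout).
    script_dir = os.path.dirname(os.path.abspath(__file__))

    # code_paths expects a list of strings, not a bare string
    code_paths = [os.path.join(script_dir, "models")]
    conda_env = os.path.join(script_dir, "dependencies", "conda.yaml")
    ```

    These values can then be passed to mlflow.pytorch.log_model or mlflow.pytorch.save_model in place of the relative strings.
    
    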

    Here is an example of how to log and save a PyTorch model using MLflow for your reference; please make the changes needed to fit your scenario.

    import mlflow.pytorch
    import torch
    # Define your PyTorch model
    model = torch.nn.Sequential(
        torch.nn.Linear(2, 1),
        torch.nn.Sigmoid()
    )
    # Train your model and obtain the trained model object
    # Log the model to MLflow
    mlflow.pytorch.log_model(
        pytorch_model=model,
        artifact_path="my-model",
        conda_env="path/to/conda.yaml",
        code_paths=["path/to/training/script.py", "path/to/other/code"],
        registered_model_name="my-registered-model"
    )
    # Save the model to a file
    mlflow.pytorch.save_model(
        pytorch_model=model,
        path="my-model",
        conda_env="path/to/conda.yaml",
        code_paths=["path/to/training/script.py", "path/to/other/code"]
    )
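
    As a follow-up, a model logged this way can be loaded back by its run URI. A minimal sketch, assuming the artifact_path "my-model" from the example above (the run ID is a placeholder you must fill in from your own run, e.g. from mlflow.active_run().info.run_id or the MLflow UI):

    ```python
    # Build the model URI for a logged artifact; "my-model" matches the
    # artifact_path used in log_model above, and run_id is a placeholder.
    run_id = "<run-id>"
    model_uri = f"runs:/{run_id}/my-model"

    # Loading requires an MLflow tracking server and a real run,
    # so it is shown commented out here:
    # import mlflow.pytorch
    # loaded_model = mlflow.pytorch.load_model(model_uri)
    ```
    
    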

    Regards,

    Yutong

    1 person found this answer helpful.
