A machine learning model is a file trained to recognize certain types of patterns. You train a model over a set of data, and you provide it with an algorithm that it can use to reason over and learn from that data set. After you train the model, you can use it to reason over data that it has never seen before, and make predictions about that data.
In MLflow, a machine learning model can include multiple model versions. Here, each version can represent a model iteration. In this article, you learn how to interact with ML models to track and compare model versions.
In MLflow, machine learning models include a standard packaging format. This format allows use of those models in various downstream tools, including batch inferencing on Apache Spark. The format defines a convention to save a model in different "flavors" that different downstream tools can understand.
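As an illustration, a model saved in the MLflow format can be loaded back through the generic python_function (pyfunc) flavor, regardless of the framework that produced it. The following is a minimal sketch; the run ID placeholder, the sklearn-model artifact path, and the input values are illustrative and should be replaced with values from your own run.

import mlflow
import numpy as np

# Load a logged model through the generic pyfunc flavor.
# Replace <run_id> and the artifact path with values from your own run.
model = mlflow.pyfunc.load_model("runs:/<run_id>/sklearn-model")

# Score a small batch of unseen data
batch = np.array([[0.1, 0.2, 0.3, 0.4]])
print(model.predict(batch))

# For batch inferencing on Apache Spark, the same model can be wrapped as a UDF:
# predict_udf = mlflow.pyfunc.spark_udf(spark, "runs:/<run_id>/sklearn-model")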
You can create a machine learning model directly from the Fabric UI, or programmatically with the MLflow API.
To create a machine learning model from the UI, you can:
You can also create a machine learning model directly from your authoring experience with the mlflow.register_model() API. If a registered machine learning model with the given name doesn't exist, the API creates it automatically.
import mlflow

# Register the model logged under the current run. `run` is assumed to be an
# MLflow run handle, and "model-uri-name" is the artifact path used when the
# model was logged.
model_uri = "runs:/{}/model-uri-name".format(run.info.run_id)
mv = mlflow.register_model(model_uri, "model-name")

print("Name: {}".format(mv.name))
print("Version: {}".format(mv.version))
A machine learning model contains a collection of model versions for simplified tracking and comparison. Within a model, a data scientist can navigate across the versions to explore the underlying parameters and metrics. Data scientists can also compare model versions to identify whether newer models yield better results.
A machine learning model version represents an individual model that is registered for tracking.
Each model version includes the following information:
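This information can also be retrieved programmatically with the MLflow client. The following minimal sketch fetches a single version's metadata, such as its source run, creation time, and tags; the model name and version number are illustrative.

from mlflow.tracking import MlflowClient

client = MlflowClient()

# Fetch a single model version; the name and version number are illustrative.
version = client.get_model_version(name="RandomForestRegressionModel", version="1")

print(version.version)             # version number
print(version.run_id)              # ID of the run that produced this version
print(version.creation_timestamp)  # when the version was registered
print(version.tags)                # key-value tags attached to the version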
MLflow tagging for model versions enables users to attach custom metadata to specific versions of a registered model in the MLflow Model Registry. These tags, stored as key-value pairs, help organize, track, and differentiate between model versions, making it easier to manage model lifecycles. Tags can be used to denote the model's purpose, deployment environment, or any other relevant information, facilitating more efficient model management and decision-making within teams.
This code demonstrates how to train a RandomForestRegressor model using Scikit-learn, log the model and parameters with MLflow, and then register the model in the MLflow Model Registry with custom tags. These tags provide useful metadata, such as project name, department, team, and project quarter, making it easier to manage and track the model version.
import mlflow.sklearn
from mlflow.models import infer_signature
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Generate synthetic regression data
X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)

# Model parameters
params = {"n_estimators": 3, "random_state": 42}

# Model tags for MLflow
model_tags = {
    "project_name": "grocery-forecasting",
    "store_dept": "produce",
    "team": "stores-ml",
    "project_quarter": "Q3-2023"
}

# Log MLflow entities
with mlflow.start_run() as run:
    # Train the model
    model = RandomForestRegressor(**params).fit(X, y)

    # Infer the model signature
    signature = infer_signature(X, model.predict(X))

    # Log parameters and the model
    mlflow.log_params(params)
    mlflow.sklearn.log_model(model, artifact_path="sklearn-model", signature=signature)

    # Register the model with tags
    model_uri = f"runs:/{run.info.run_id}/sklearn-model"
    model_version = mlflow.register_model(model_uri, "RandomForestRegressionModel", tags=model_tags)

    # Output model registration details
    print(f"Model Name: {model_version.name}")
    print(f"Model Version: {model_version.version}")
After applying the tags, you can view them directly on the model version details page. Additionally, tags can be added, updated, or removed from this page at any time.
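You can also manage these tags programmatically with the MLflow client. The following minimal sketch reuses the model name from the example above; the version number and the validation_status tag key are illustrative.

from mlflow.tracking import MlflowClient

client = MlflowClient()

# Add or update a tag on a specific model version
client.set_model_version_tag(
    name="RandomForestRegressionModel",
    version="1",
    key="validation_status",  # illustrative tag key
    value="approved",
)

# Remove a tag from the same model version
client.delete_model_version_tag(
    name="RandomForestRegressionModel",
    version="1",
    key="project_quarter",
)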
To compare and evaluate the quality of machine learning model versions, you can compare the parameters, metrics, and metadata between selected versions.
You can visually compare runs within an existing model. Visual comparison makes it easy to navigate between and sort across multiple versions.
To compare runs, you can:
Data scientists can also use MLflow to search among multiple models saved within the workspace. Visit the MLflow documentation to explore other MLflow APIs for model interaction.
from pprint import pprint

from mlflow.tracking import MlflowClient

client = MlflowClient()
for rm in client.list_registered_models():
    pprint(dict(rm), indent=4)
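Building on the same client, you can also compare the versions of a single model by pulling the parameters and metrics logged on each version's source run. This is a minimal sketch; the model name reuses the earlier tagging example.

from mlflow.tracking import MlflowClient

client = MlflowClient()

# List the versions of one registered model and pull each version's source-run
# parameters and metrics for a side-by-side comparison.
for version in client.search_model_versions("name='RandomForestRegressionModel'"):
    run = client.get_run(version.run_id)
    print(f"Version {version.version}: params={run.data.params}, metrics={run.data.metrics}")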
Once you train a model on a data set, you can apply that model to data it has never seen to generate predictions. This technique is called scoring, or inferencing. For more information about Microsoft Fabric model scoring, see the next section.
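For a quick check outside of a full scoring pipeline, a registered model version can also be loaded and scored directly through MLflow. This is a minimal sketch; the model name and version reuse the earlier example, and the input values are illustrative.

import mlflow
import numpy as np

# Load a specific registered model version through the models:/ URI scheme
model = mlflow.pyfunc.load_model("models:/RandomForestRegressionModel/1")

# Score data the model has never seen
new_data = np.array([[0.5, -1.2, 0.7, 2.1]])
print(model.predict(new_data))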