A machine learning experiment is the primary unit of organization and control for all related machine learning runs. A run corresponds to a single execution of model code. In MLflow, tracking is based on experiments and runs.
Machine learning experiments allow data scientists to log parameters, code versions, metrics, and output files when running their machine learning code. Experiments also let you visualize, search for, and compare runs, as well as download run files and metadata for analysis in other tools.
In this article, you learn more about how data scientists can interact with and use machine learning experiments to organize their development process and to track multiple runs.
You can create a machine learning experiment directly from the Fabric user interface (UI) or by writing code that uses the MLflow API.
To create a machine learning experiment from the UI:
After creating the experiment, you can start adding runs to track run metrics and parameters.
You can also create a machine learning experiment directly from your authoring experience using the mlflow.create_experiment() or mlflow.set_experiment() APIs. In the following code, replace <EXPERIMENT_NAME> with your experiment's name.
import mlflow
# This will create a new experiment with the provided name.
mlflow.create_experiment("<EXPERIMENT_NAME>")
# This will set the given experiment as the active experiment.
# If an experiment with this name does not exist, a new experiment with this name is created.
mlflow.set_experiment("<EXPERIMENT_NAME>")
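In open-source MLflow, mlflow.create_experiment() raises an error when an experiment with the same name already exists, so notebooks that are rerun often look the experiment up first. The following is a minimal sketch of that pattern, not a Fabric-specific API. The helper name get_or_create_experiment and its tracking argument are hypothetical, added only so the logic can be exercised without a live tracking server.

```python
def get_or_create_experiment(name, tracking=None):
    """Return the ID of the experiment called `name`, creating it if needed.

    `tracking` (hypothetical parameter) is any object that exposes
    get_experiment_by_name() and create_experiment(). When it is omitted,
    the mlflow module itself is used.
    """
    if tracking is None:
        import mlflow  # imported lazily so the helper can be tried with a stand-in

        tracking = mlflow

    existing = tracking.get_experiment_by_name(name)  # None when no such experiment
    if existing is not None:
        return existing.experiment_id
    return tracking.create_experiment(name)  # returns the new experiment's ID
```

Calling get_or_create_experiment("<EXPERIMENT_NAME>") twice should return the same ID both times. If you only need an active experiment and not its ID, mlflow.set_experiment() already covers the create-if-missing case.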
A machine learning experiment contains a collection of runs for simplified tracking and comparison. Within an experiment, a data scientist can navigate across various runs and explore the underlying parameters and metrics. Data scientists can also compare runs within a machine learning experiment to identify which subset of parameters yields the desired model performance.
A machine learning run corresponds to a single execution of model code.
Each run includes the following information:
You can also view recent runs for an experiment by selecting Run list. This view allows you to keep track of recent activity, quickly jump to the related Spark application, and apply filters based on the run status.
To compare and evaluate the quality of your machine learning runs, you can compare the parameters, metrics, and metadata between selected runs within an experiment.
MLflow tagging lets you attach custom metadata to experiment runs as key-value pairs. Tags help you categorize, filter, and search runs by specific attributes, which makes experiments easier to manage and analyze. For example, you can label runs with the model type, parameters, or any other relevant identifier to improve organization and traceability.
This code snippet starts an MLflow run, logs some parameters and metrics, and adds tags to categorize and provide additional context for the run.
import mlflow
import mlflow.sklearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.datasets import fetch_california_housing
# Autologging
mlflow.autolog()
# Load the California housing dataset
data = fetch_california_housing(as_frame=True)
X = data.data
y = data.target
# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Start an MLflow run
with mlflow.start_run() as run:
    # Train the model
    model = LinearRegression()
    model.fit(X_train, y_train)
    # Predict and evaluate
    y_pred = model.predict(X_test)
    # Add tags
    mlflow.set_tag("model_type", "Linear Regression")
    mlflow.set_tag("dataset", "California Housing")
    mlflow.set_tag("developer", "Bob")
Once the tags are applied, you can then view the results directly from the inline MLflow widget or from the run details page.
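Because Fabric rejects some tag names (the warning in this section lists synapseml, mlflow, and trident as restricted), it can help to check tags before a run starts. The sketch below is illustrative only. The names RESTRICTED_PREFIXES and split_tags are hypothetical, and the code assumes the restriction is a case-insensitive prefix match, which Fabric may implement differently.

```python
# Assumption: these are treated as restricted prefixes of the tag name.
RESTRICTED_PREFIXES = ("synapseml", "mlflow", "trident")


def split_tags(tags):
    """Split `tags` into (allowed, rejected) based on the restricted prefixes.

    `allowed` is a dict that is safe to pass on; `rejected` is a list of the
    tag names that start with a restricted prefix (compared case-insensitively).
    """
    allowed = {}
    rejected = []
    for name, value in tags.items():
        if name.lower().startswith(RESTRICTED_PREFIXES):
            rejected.append(name)
        else:
            allowed[name] = value
    return allowed, rejected


# "mlflow_stage" is a made-up tag name used only to trigger the check.
allowed, rejected = split_tags(
    {"model_type": "Linear Regression", "mlflow_stage": "dev", "developer": "Bob"}
)
```

Inside an active run, the allowed dictionary could then be passed to mlflow.set_tags(allowed), while the rejected names are renamed or reported.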
Warning
Limitations on applying tags to MLflow experiment runs in Fabric: tag names that begin with synapseml, mlflow, or trident are restricted and will not be accepted.
You can visually compare and filter runs within an existing experiment. Visual comparison allows you to easily navigate between multiple runs and sort across them.
To compare runs:
Data scientists can also use MLflow to query and search among runs within an experiment. You can explore more MLflow APIs for searching, filtering, and comparing runs by visiting the MLflow documentation.
You can use the MLflow search API mlflow.search_runs() to get all runs in an experiment by replacing <EXPERIMENT_NAME> with your experiment name or <EXPERIMENT_ID> with your experiment ID in the following code:
import mlflow
# Get runs by experiment name:
mlflow.search_runs(experiment_names=["<EXPERIMENT_NAME>"])
# Get runs by experiment ID:
mlflow.search_runs(experiment_ids=["<EXPERIMENT_ID>"])
Tip
You can search across multiple experiments by providing a list of experiment IDs to the experiment_ids parameter. Similarly, providing a list of experiment names to the experiment_names parameter allows MLflow to search across multiple experiments. This can be useful if you want to compare runs from different experiments.
Use the max_results parameter of search_runs to limit the number of runs returned. The order_by parameter lists the columns to sort by, and each entry can include an optional DESC or ASC value. For instance, the following example returns the most recent run of an experiment.
mlflow.search_runs(experiment_ids=[ "1234-5678-90AB-CDEFG" ], max_results=1, order_by=["start_time DESC"])
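The same two parameters can rank runs by a logged metric instead of by start time. The sketch below is one possible way to do that. The helper names metric_order_clause and best_runs are hypothetical, and rmse in the usage note is a placeholder for whatever metric your runs actually log. It assumes a simple metric name that needs no quoting.

```python
def metric_order_clause(metric_name, higher_is_better):
    """Return an order_by entry that puts the best value of the metric first."""
    direction = "DESC" if higher_is_better else "ASC"
    return f"metrics.{metric_name} {direction}"


def best_runs(experiment_name, metric_name, higher_is_better=False, top_n=3):
    """Return up to `top_n` runs of an experiment, best metric value first."""
    import mlflow  # imported here so metric_order_clause works without MLflow installed

    return mlflow.search_runs(
        experiment_names=[experiment_name],
        max_results=top_n,
        order_by=[metric_order_clause(metric_name, higher_is_better)],
    )
```

For an error metric, best_runs("<EXPERIMENT_NAME>", "rmse") would sort ascending so that the lowest value comes first. For a score such as accuracy, pass higher_is_better=True.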
You can use the MLflow authoring widget within Fabric notebooks to track MLflow runs generated within each notebook cell. The widget allows you to track your runs, associated metrics, parameters, and properties right down to the individual cell level.
To obtain a visual comparison, you can also switch to the Run comparison view. This view presents the data graphically, aiding in the quick identification of patterns or deviations across different runs.
Once a run yields the desired result, you can save the run as a model for enhanced model tracking and for model deployment by selecting Save as a ML model.
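A roughly comparable step in code is to register the model that a run logged, using mlflow.register_model(). The sketch below is an illustration under assumptions, not a description of what the UI action does internally. The helper names run_model_uri and save_run_as_model are hypothetical, and the code assumes the run logged its model under the artifact path "model", which may differ in your runs.

```python
def run_model_uri(run_id, artifact_path="model"):
    """Build the runs:/ URI that points at a model logged by a run."""
    return f"runs:/{run_id}/{artifact_path}"


def save_run_as_model(run_id, model_name, artifact_path="model"):
    """Register the model logged by `run_id` under `model_name`."""
    import mlflow  # imported here so run_model_uri works without MLflow installed

    return mlflow.register_model(run_model_uri(run_id, artifact_path), model_name)
```

The run ID can be taken from the run object in the earlier tagging example (run.info.run_id) or from the run_id column that mlflow.search_runs() returns.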
ML experiments are integrated directly into Monitor. This functionality is designed to provide more insight into your Spark applications and the ML experiments they generate, making it easier to manage and debug these processes.
Users can track experiment runs directly from Monitor, which provides a unified view of all their activities. This integration includes filtering options, enabling users to focus on experiments or runs created within the last 30 days or other specified periods.
ML experiments are integrated directly into Monitor, where you can select a specific Spark application and access Item Snapshots. There, you can find a list of all the experiments and runs generated by that application.