Manage Azure Machine Learning environments with the CLI & SDK (v2)
APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)
Azure Machine Learning environments define the execution environments for your jobs or deployments and encapsulate the dependencies for your code. Azure Machine Learning uses the environment specification to create the Docker container that your training or scoring code runs in on the specified compute target. You can define an environment from a conda specification, Docker image, or Docker build context.
In this article, learn how to create and manage Azure Machine Learning environments using the SDK & CLI (v2).
Before following the steps in this article, make sure you have the following prerequisites:
An Azure Machine Learning workspace. If you don't have one, use the steps in the Quickstart: Create workspace resources article to create one.
The Azure CLI and the ml extension, or the Azure Machine Learning Python SDK v2. To install the Azure CLI and extension, see Install, set up, and use the CLI (v2).
Important
The CLI examples in this article assume that you're using the Bash (or compatible) shell, for example from a Linux system or Windows Subsystem for Linux.
To install the Python SDK v2, use the following command:
pip install azure-ai-ml azure-identity
To update an existing installation of the SDK to the latest version, use the following command:
pip install --upgrade azure-ai-ml azure-identity
For more information, see Install the Python SDK v2 for Azure Machine Learning.
Tip
For a full-featured development environment, use Visual Studio Code and the Azure Machine Learning extension to manage Azure Machine Learning resources and train machine learning models.
To run the training examples, first clone the examples repository. For the CLI examples, change into the cli directory. For the SDK examples, change into the sdk/python/assets/environment directory:
git clone --depth 1 https://github.com/Azure/azureml-examples
The --depth 1 option clones only the latest commit to the repository, which reduces the time needed to complete the operation.
To connect to the workspace, you need identifier parameters: a subscription ID, resource group, and workspace name. You use these details in the MLClient from the azure.ai.ml namespace to get a handle to the required Azure Machine Learning workspace. To authenticate, you use the default Azure credential. For more details on how to configure credentials and connect to a workspace, see the workspace configuration example.
# import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Environment, BuildContext
from azure.identity import DefaultAzureCredential
# Enter details of your AML workspace
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"
# get a handle to the workspace
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)
There are two types of environments in Azure Machine Learning: curated and custom environments. Curated environments are predefined environments containing popular ML frameworks and tooling. Custom environments are user-defined and can be created via az ml environment create.
Curated environments are provided by Azure Machine Learning and are available by default. Azure Machine Learning routinely updates these environments with the latest framework version releases and maintains them for bug fixes and security patches. They're backed by cached Docker images, which reduce job preparation cost and model deployment time.
You can use these curated environments out of the box for training or deployment by referencing a specific version or the latest version of the environment. Use the following syntax: azureml://registries/azureml/environments/<curated-environment-name>/versions/<version-number> or azureml://registries/azureml/environments/<curated-environment-name>/labels/latest. You can also use them as a reference for your own custom environments by modifying the Dockerfiles that back these curated environments.
You can see the set of available curated environments in the Azure Machine Learning studio UI, or by using the CLI (v2) via az ml environment list.
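To illustrate the reference syntax, the following sketch builds both URI forms for a hypothetical curated environment name (real names can be listed with az ml environment list or in the studio UI):

```python
# Build the two reference forms for a curated environment.
# "AzureML-sklearn-1.5" is a hypothetical name; list actual curated
# environment names with `az ml environment list` or in the studio UI.
env_name = "AzureML-sklearn-1.5"

# Pin an exact version for reproducibility:
pinned = f"azureml://registries/azureml/environments/{env_name}/versions/4"
# Or always track the latest version:
latest = f"azureml://registries/azureml/environments/{env_name}/labels/latest"

print(pinned)
print(latest)
```

Pinning a version gives reproducible runs; the latest label picks up framework updates and security patches automatically.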
Tip
When working with curated environments in the CLI or SDK, the environment name begins with AzureML- followed by the name of the curated environment. When using the Azure Machine Learning studio, curated environments don't have this prefix. The studio UI displays curated and custom environments on separate tabs, so the prefix isn't necessary there; the CLI and SDK don't have this separation, so the prefix differentiates curated environments from custom ones.
You can define an environment from a Docker image, a Docker build context, and a conda specification with Docker image.
To define an environment from a Docker image, provide the image URI of the image hosted in a registry such as Docker Hub or Azure Container Registry.
The following example creates an environment from a Docker image. An image from the official PyTorch repository on Docker Hub is specified via the image property.
env_docker_image = Environment(
image="pytorch/pytorch:latest",
name="docker-image-example",
description="Environment created from a Docker image.",
)
ml_client.environments.create_or_update(env_docker_image)
Tip
Azure Machine Learning maintains a set of CPU and GPU Ubuntu Linux-based base images with common system dependencies. For example, the GPU images contain Miniconda, OpenMPI, CUDA, cuDNN, and NCCL. You can use these images for your environments, or use their corresponding Dockerfiles as reference when building your own custom images.
For the set of base images and their corresponding Dockerfiles, see the AzureML-Containers repo.
Instead of defining an environment from a prebuilt image, you can also define one from a Docker build context. To do so, specify the directory that serves as the build context. This directory should contain a Dockerfile (not larger than 1MB) and any other files needed to build the image.
In the following example, the local path to the build context folder is specified in the path parameter. Azure Machine Learning looks for a Dockerfile named Dockerfile at the root of the build context.
env_docker_context = Environment(
build=BuildContext(path="docker-contexts/python-and-pip"),
name="docker-context-example",
description="Environment created from a Docker context.",
)
ml_client.environments.create_or_update(env_docker_context)
Azure Machine Learning starts building the image from the build context when the environment is created. You can monitor the status of the build and view the build logs in the studio UI.
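For illustration, a build context like the docker-contexts/python-and-pip folder referenced above might contain just a Dockerfile and a pip requirements file. The following sketch creates such a layout in a temporary directory; the base image and package choices are placeholders, not the actual contents of the examples repository:

```python
from pathlib import Path
import tempfile

# Create a hypothetical build context: a Dockerfile plus a pip
# requirements file. The base image and package pins are illustrative.
ctx = Path(tempfile.mkdtemp()) / "python-and-pip"
ctx.mkdir(parents=True)

(ctx / "Dockerfile").write_text(
    "FROM python:3.10-slim\n"
    "COPY requirements.txt .\n"
    "RUN pip install --no-cache-dir -r requirements.txt\n"
)
(ctx / "requirements.txt").write_text("numpy\npandas\n")

print(sorted(p.name for p in ctx.iterdir()))
```

Passing this directory as BuildContext(path=...) would have Azure Machine Learning build the image server-side, so you don't need Docker installed locally.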
You can define an environment using a standard conda YAML configuration file that includes the dependencies for the conda environment. See Creating an environment manually for information on this standard format.
You must also specify a base Docker image for this environment. Azure Machine Learning builds the conda environment on top of the Docker image you provide. If you install Python dependencies in the Docker image itself, those packages won't exist in the execution environment, causing runtime failures. By default, Azure Machine Learning builds a conda environment with the dependencies you specified and runs the job in that environment instead of using any Python libraries that you installed on the base image.
The relative path to the conda file is specified using the conda_file parameter.
env_docker_conda = Environment(
image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
conda_file="conda-yamls/pydata.yml",
name="docker-image-plus-conda-example",
description="Environment created from a Docker image plus Conda environment.",
)
ml_client.environments.create_or_update(env_docker_conda)
Azure Machine Learning builds the final Docker image from this environment specification when the environment is used in a job or deployment. You can also manually trigger a build of the environment in the studio UI.
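For reference, a conda specification like the conda-yamls/pydata.yml file used above might look as follows. The channel and package choices here are illustrative, not the actual file's contents:

```python
from pathlib import Path
import tempfile

# A hypothetical conda specification in the standard conda YAML format.
# Channels and packages are illustrative examples.
conda_yaml = """\
name: pydata-example
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy
  - pandas
  - pip
  - pip:
      - azureml-mlflow
"""

spec = Path(tempfile.mkdtemp()) / "pydata.yml"
spec.write_text(conda_yaml)
print(spec)
```

The path to a file like this is what you pass as the conda_file parameter of Environment.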
The SDK and CLI (v2) also allow you to manage the lifecycle of your Azure Machine Learning environment assets.
List all the environments in your workspace:
envs = ml_client.environments.list()
for env in envs:
    print(env.name)
List all the environment versions under a given name:
envs = ml_client.environments.list(name="docker-image-example")
for env in envs:
    print(env.version)
Get the details of a specific environment:
env = ml_client.environments.get(name="docker-image-example", version="1")
print(env)
Update mutable properties of a specific environment:
env.description = "This is an updated description."
ml_client.environments.create_or_update(environment=env)
Important
For environments, only description and tags can be updated. All other properties are immutable; if you need to change any of those properties, create a new version of the environment.
Archiving an environment hides it by default from list queries (az ml environment list). You can still reference and use an archived environment in your workflows. You can archive either all versions of an environment or only a specific version.
If you don't specify a version, all versions of the environment under that given name are archived. If you create a new environment version under an archived environment container, that new version is automatically set as archived as well.
Archive all versions of an environment:
ml_client.environments.archive(name="docker-image-example")
Archive a specific environment version:
ml_client.environments.archive(name="docker-image-example", version="1")
Important
Archiving an environment's version does not delete the cached image in the container registry. If you wish to delete the cached image associated with a specific environment, you can use the command az acr repository delete on the environment's associated repository.
To use an environment for a training job, specify the environment property of the command.
For examples of submitting jobs, see the examples at https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs.
When you submit a training job, building a new environment can take several minutes, depending on the size of the required dependencies. The service caches environments, so as long as the environment definition remains unchanged, you incur the full setup time only once.
For more information on how to use environments in jobs, see Train models.
You can also use environments for your model deployments. For more information, see Deploy and score a machine learning model.