What is an Azure Machine Learning workspace?

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. The workspace keeps a history of all training runs, including logs, metrics, output, and a snapshot of your scripts. You use this information to determine which training run produces the best model.

Once you have a model you like, you register it with the workspace. You then use the registered model and a scoring script to deploy it to an online endpoint, which exposes the model as a REST-based HTTP endpoint.

Taxonomy

A taxonomy of the workspace is illustrated in the following diagram:

[Diagram: Workspace taxonomy]

The diagram shows the following components of a workspace:

  • A workspace can contain Azure Machine Learning compute instances, which are cloud resources configured with the Python environment necessary to run Azure Machine Learning.

  • User roles enable you to share your workspace with other users, teams, or projects.

  • Compute targets are used to run your experiments.

  • When you create the workspace, associated resources are also created for you.

  • Jobs are training runs you use to build your models. You can organize your jobs into Experiments.

  • Pipelines are reusable workflows for training and retraining your model.

  • Data assets help you manage the data you use for model training and pipeline creation.

  • Once you have a model you want to deploy, you create a registered model.

  • Use the registered model and a scoring script to create an online endpoint.

Tools for workspace interaction

You can interact with your workspace through several tools, including Azure Machine Learning studio, the Python SDK, the Azure CLI, and the VS Code extension.

Machine learning with a workspace

Machine learning tasks read and/or write artifacts to your workspace.

  • Run an experiment to train a model - writes job run results to the workspace.
  • Use automated ML to train a model - writes training results to the workspace.
  • Register a model in the workspace.
  • Deploy a model - uses the registered model to create a deployment.
  • Create and run reusable workflows.
  • View machine learning artifacts such as jobs, pipelines, models, deployments.
  • Track and monitor models.
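
As a rough illustration of how such tasks read from and write to a workspace, the following is a minimal sketch that assumes the Python SDK v2 packages `azure-ai-ml` and `azure-identity` are installed. The helper names (`connect`, `register_model`, `list_job_names`) and every argument value are hypothetical and chosen only for this example.

```python
def connect(subscription_id, resource_group, workspace_name):
    """Return a client bound to an existing workspace (assumes SDK v2)."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        DefaultAzureCredential(), subscription_id, resource_group, workspace_name
    )


def register_model(ml_client, name, path):
    """Write a model artifact to the workspace by registering it."""
    from azure.ai.ml.entities import Model

    model = Model(name=name, path=path)
    return ml_client.models.create_or_update(model)


def list_job_names(ml_client):
    """Read job names back from the workspace's run history."""
    return [job.name for job in ml_client.jobs.list()]
```

A caller would typically create the client once with `connect(...)` and pass it to the other helpers. Registering a model writes to the workspace, while listing jobs only reads from it.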

Workspace management

You can also perform the following workspace management tasks. Depending on the task, you can use the Azure portal, Azure Machine Learning studio, the Python SDK, the Azure CLI, or VS Code:

  • Create a workspace
  • Manage workspace access
  • Create and manage compute resources
  • Create a compute instance

Warning

Moving your Azure Machine Learning workspace to a different subscription, or moving the owning subscription to a new tenant, is not supported. Doing so may cause errors.

Create a workspace

You can create a workspace in several ways, including the Azure portal, the Python SDK, the Azure CLI, and the VS Code extension.

Note

The workspace name is case-insensitive.
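
For the Python SDK route, workspace creation might look like the following minimal sketch. It assumes the SDK v2 packages `azure-ai-ml` and `azure-identity` are installed and that the caller already has a subscription and resource group. The helper names `workspace_settings` and `create_workspace`, and all argument values, are hypothetical.

```python
def workspace_settings(name, location, display_name=None, description=None):
    """Collect workspace settings in a plain dict. No Azure call is made."""
    settings = {"name": name, "location": location}
    if display_name is not None:
        settings["display_name"] = display_name
    if description is not None:
        settings["description"] = description
    return settings


def create_workspace(subscription_id, resource_group, settings):
    """Create the workspace and wait for the operation to finish."""
    # Imported lazily so the helper above works without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Workspace
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group)
    workspace = Workspace(**settings)
    # begin_create starts a long-running operation; result() blocks until done.
    return ml_client.workspaces.begin_create(workspace).result()
```

Separating the settings from the Azure call keeps the configuration easy to inspect before any resources are created.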

Sub resources

The following sub resources are the main resources created in the Azure Machine Learning workspace.

  • VMs: provide computing power for your Azure Machine Learning workspace and are integral to deploying and training models.
  • Load Balancer: a network load balancer is created for each compute instance and compute cluster to manage traffic even while the compute instance/cluster is stopped.
  • Virtual Network: helps Azure resources communicate with one another, the internet, and other on-premises networks.
  • Bandwidth: encapsulates all outbound data transfers across regions.

Associated resources

When you create a new workspace, it automatically creates several Azure resources that are used by the workspace:

  • Azure Storage account: Serves as the default datastore for the workspace. Jupyter notebooks that are used with your Azure Machine Learning compute instances are stored here as well.

    Important

    By default, the storage account is a general-purpose v1 account. You can upgrade this to general-purpose v2 after the workspace has been created. Do not enable hierarchical namespace on the storage account after upgrading to general-purpose v2.

    To use an existing Azure Storage account, it cannot be of type BlobStorage or a premium account (Premium_LRS and Premium_GRS). It also cannot have a hierarchical namespace (used with Azure Data Lake Storage Gen2). Neither premium storage nor hierarchical namespaces are supported with the default storage account of the workspace. You can use premium storage or hierarchical namespace with non-default storage accounts.

  • Azure Container Registry (ACR): Registers Docker containers that are used for components such as Azure Machine Learning environments when you train and deploy models.

    To minimize costs, ACR is lazy-loaded: the registry is not created until images are needed.

    Note

    If your subscription setting requires adding tags to resources under it, creation of the ACR by Azure Machine Learning fails, because tags cannot be set on the ACR.

  • Azure Application Insights: Stores monitoring and diagnostics information. For more information, see Monitor online endpoints.

    Note

    You can delete the Application Insights instance after cluster creation if you want. Deleting it limits the information gathered from the workspace, and may make it more difficult to troubleshoot problems. If you delete the Application Insights instance created by the workspace, you cannot re-create it without deleting and recreating the workspace.

  • Azure Key Vault: Stores secrets that are used by compute targets and other sensitive information that's needed by the workspace.

Note

Instead of having new resources created, you can use existing Azure resource instances when you create the workspace with the Python SDK, or with the Azure Machine Learning CLI using an ARM template.
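
When you bring an existing storage account, the constraints described for the default storage account can be checked before you create the workspace. The following is a minimal sketch in plain Python. The `StorageAccountInfo` class and the `default_storage_problems` function are hypothetical names introduced for illustration and are not part of any Azure SDK. The rules encoded are only the ones stated in this article.

```python
from dataclasses import dataclass

# Values this article rules out for the workspace's default storage account.
UNSUPPORTED_KINDS = {"BlobStorage"}
UNSUPPORTED_SKUS = {"Premium_LRS", "Premium_GRS"}


@dataclass
class StorageAccountInfo:
    """Hypothetical, minimal description of an existing storage account."""
    kind: str                      # for example "StorageV2" or "BlobStorage"
    sku: str                       # for example "Standard_LRS"
    hierarchical_namespace: bool   # True for Azure Data Lake Storage Gen2


def default_storage_problems(account):
    """Return the reasons the account cannot be the default storage."""
    problems = []
    if account.kind in UNSUPPORTED_KINDS:
        problems.append(f"account kind {account.kind} is not supported")
    if account.sku in UNSUPPORTED_SKUS:
        problems.append(f"premium SKU {account.sku} is not supported")
    if account.hierarchical_namespace:
        problems.append("hierarchical namespace is not supported")
    return problems


if __name__ == "__main__":
    candidate = StorageAccountInfo("StorageV2", "Standard_LRS", False)
    print(default_storage_problems(candidate) or "eligible as default storage")
```

An empty list means that none of the stated restrictions apply. An account that fails the check can still be attached as a non-default storage account.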

Next steps

To learn more about planning a workspace for your organization's requirements, see Organize and set up Azure Machine Learning.
