What is an Azure Machine Learning workspace?
The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. The workspace keeps a history of all training runs, including logs, metrics, output, and a snapshot of your scripts. You use this information to determine which training run produces the best model.
Once you have a model you like, you register it with the workspace. You then use the registered model and a scoring script to deploy the model to an online endpoint, which exposes it as a REST-based HTTP endpoint.
A taxonomy of the workspace is illustrated in the following diagram:
The diagram shows the following components of a workspace:
- A workspace can contain Azure Machine Learning compute instances, cloud resources configured with the Python environment necessary to run Azure Machine Learning.
- User roles enable you to share your workspace with other users, teams, or projects.
- Compute targets are used to run your experiments.
- When you create the workspace, associated resources are also created for you.
- Jobs are training runs you use to build your models. You can organize your jobs into Experiments.
- Pipelines are reusable workflows for training and retraining your model.
- Data assets aid in management of the data you use for model training and pipeline creation.
- Once you have a model you want to deploy, you create a registered model.
- Use the registered model and a scoring script to create an online endpoint.
Tools for workspace interaction
You can interact with your workspace in the following ways:
- On the web, using Azure Machine Learning studio.
- In any Python environment with the Azure Machine Learning SDK for Python.
- On the command line using the Azure Machine Learning CLI extension.
- In Visual Studio Code with the Azure Machine Learning VS Code extension.
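As a rough illustration of the Python SDK option, the sketch below connects to an existing workspace. It assumes the v2 SDK packages `azure-ai-ml` and `azure-identity` are installed and that `DefaultAzureCredential` can find a valid sign-in; the function name `connect_to_workspace` and its parameter names are hypothetical.

```python
def connect_to_workspace(subscription_id: str, resource_group: str, workspace_name: str):
    """Return a client bound to an existing workspace (a sketch, not the only approach)."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    return MLClient(
        credential=DefaultAzureCredential(),  # assumption: a sign-in is available
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
```

With the returned client, a call such as `client.workspaces.get(workspace_name)` should retrieve the workspace's details, provided your account has access to it.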
Machine learning with a workspace
Machine learning tasks read and/or write artifacts to your workspace.
- Run an experiment to train a model - writes job run results to the workspace.
- Use automated ML to train a model - writes training results to the workspace.
- Register a model in the workspace.
- Deploy a model - uses the registered model to create a deployment.
- Create and run reusable workflows.
- View machine learning artifacts such as jobs, pipelines, models, deployments.
- Track and monitor models.
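To make one of these tasks concrete, registering a model might look roughly like the following with the v2 Python SDK (`azure-ai-ml`). This is a minimal sketch: `ml_client` is assumed to be an `MLClient` already bound to your workspace, and the function name, `model_path`, and `model_name` are hypothetical.

```python
def register_model(ml_client, model_path: str, model_name: str):
    """Register a local model file or folder with the workspace (sketch)."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Model

    model = Model(
        path=model_path,               # assumption: a local file or folder
        name=model_name,
        type=AssetTypes.CUSTOM_MODEL,  # generic type for arbitrary model files
    )
    # Writes the model to the workspace and returns the registered entity.
    return ml_client.models.create_or_update(model)
```

The returned entity carries the version assigned by the workspace, which you can reference later when you create a deployment.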
You can also perform the following workspace management tasks:
|Workspace management task||Portal||Studio||Python SDK||Azure CLI||VS Code|
|Create a workspace||✓||✓||✓||✓||✓|
|Manage workspace access||✓||✓|
|Create and manage compute resources||✓||✓||✓||✓||✓|
|Create a compute instance||✓||✓||✓||✓|
Moving your Azure Machine Learning workspace to a different subscription, or moving the owning subscription to a new tenant, is not supported. Doing so may cause errors.
Create a workspace
There are multiple ways to create a workspace:
- Use Azure Machine Learning studio to quickly create a workspace with default settings.
- Use the Azure portal for a point-and-click interface with more options.
- Use the Azure Machine Learning SDK for Python to create a workspace on the fly from Python scripts or Jupyter notebooks.
- Use an Azure Resource Manager template or the Azure Machine Learning CLI when you need to automate creation or customize it to meet corporate security standards.
- If you work in Visual Studio Code, use the VS Code extension.
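The Python SDK option can be sketched as follows. This assumes the v2 SDK (`azure-ai-ml`) together with `azure-identity`, an existing resource group, and sufficient permissions in the subscription; the function name and its parameters are hypothetical, and only the name and location are set, so everything else falls back to defaults.

```python
def create_workspace(subscription_id: str, resource_group: str, name: str, location: str):
    """Create a workspace with default settings and wait for it to finish (sketch)."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Workspace
    from azure.identity import DefaultAzureCredential

    # The client is scoped to a resource group, not to a workspace,
    # because the workspace does not exist yet.
    client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
    )
    poller = client.workspaces.begin_create(Workspace(name=name, location=location))
    # Creation is a long-running operation; result() blocks until it completes.
    return poller.result()
```

Creation can take some time because the associated resources are provisioned along with the workspace.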
The workspace name is case-insensitive.
The following sub-resources are the main resources created in the Azure Machine Learning workspace:
- VMs: provide computing power for your Azure Machine Learning workspace and are integral to deploying and training models.
- Load Balancer: a network load balancer is created for each compute instance and compute cluster to manage traffic even while the compute instance/cluster is stopped.
- Virtual Network: helps Azure resources communicate with one another, the internet, and on-premises networks.
- Bandwidth: encapsulates all outbound data transfers across regions.
When you create a new workspace, it automatically creates several Azure resources that are used by the workspace:
Azure Storage account: Used as the default datastore for the workspace. Jupyter notebooks that are used with your Azure Machine Learning compute instances are also stored here.
By default, the storage account is a general-purpose v1 account. You can upgrade this to general-purpose v2 after the workspace has been created. Do not enable hierarchical namespace on the storage account after upgrading to general-purpose v2.
If you use an existing Azure Storage account, it cannot be of type BlobStorage or a premium account (Premium_LRS or Premium_GRS). It also cannot have a hierarchical namespace (used with Azure Data Lake Storage Gen2). Neither premium storage nor hierarchical namespaces are supported with the default storage account of the workspace. You can use premium storage or hierarchical namespace with non-default storage accounts.
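The constraints above can be expressed as a small pre-flight check. The sketch below is purely illustrative: `StorageAccountInfo` and `default_storage_problems` are hypothetical names, not part of any Azure SDK, and you would need to populate the fields yourself from your account's settings.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the constraints described in the text above.
UNSUPPORTED_KINDS = {"BlobStorage"}
UNSUPPORTED_SKUS = {"Premium_LRS", "Premium_GRS"}


@dataclass(frozen=True)
class StorageAccountInfo:
    """Hypothetical summary of an existing storage account's settings."""
    kind: str
    sku: str
    hierarchical_namespace: bool


def default_storage_problems(account: StorageAccountInfo) -> List[str]:
    """List the reasons the account cannot be the workspace's default storage."""
    problems = []
    if account.kind in UNSUPPORTED_KINDS:
        problems.append(f"account kind {account.kind} is not supported")
    if account.sku in UNSUPPORTED_SKUS:
        problems.append(f"premium SKU {account.sku} is not supported")
    if account.hierarchical_namespace:
        problems.append("hierarchical namespace is not supported")
    return problems
```

An empty list means the account passes these three checks. The same account can still be attached as a non-default datastore even when it fails them.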
Azure Container Registry: Registers Docker containers that are used for the following components:
- Azure Machine Learning environments when training and deploying models
- AutoML when deploying
- Data profiling
To minimize costs, the Azure Container Registry (ACR) instance is lazy-loaded: it is not created until images are needed.
If your subscription settings require tags on the resources under it, creation of the ACR instance by Azure Machine Learning fails, because Azure Machine Learning cannot set tags on ACR.
You can delete the Application Insights instance after cluster creation if you want. Deleting it limits the information gathered from the workspace, and may make it more difficult to troubleshoot problems. If you delete the Application Insights instance created by the workspace, you cannot re-create it without deleting and recreating the workspace.
Azure Key Vault: Stores secrets that are used by compute targets and other sensitive information that's needed by the workspace.
To learn more about planning a workspace for your organization's requirements, see Organize and set up Azure Machine Learning.