Workspaces are places to collaborate with colleagues to create machine learning artifacts and to group related work, such as experiments, jobs, datasets, models, components, and inference endpoints. This article describes workspaces, how to manage access to them, and how to use them to organize your work.
Datastores - Define how you and others can connect to data sources when using data assets.
Security settings - Networking, identity and access control, and encryption settings.
Organizing workspaces
For machine learning team leads and administrators, workspaces serve as containers for access management, cost management, and data isolation. Here are some tips for organizing workspaces:
Use user roles for permission management: Manage permissions in the workspace by assigning user roles, for example data scientist, machine learning engineer, or admin.
Assign access to user groups: By using Microsoft Entra user groups, you don't have to add individual users to each workspace or to the other resources that the same group of users requires access to.
Create a workspace per project: While a workspace can be used for multiple projects, limiting it to one project per workspace lets cost reporting accrue at the project level. It also lets you manage configurations like datastores in the scope of each project (see the sketch after this list).
Share Azure resources: Workspaces require you to create several associated resources. Share these resources between workspaces to save repetitive setup steps.
Enable self-serve: Precreate and secure associated resources as an IT admin, and use user roles to let data scientists create workspaces on their own.
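As a minimal sketch of creating a per-project workspace with the Python SDK v2 (azure-ai-ml), assuming you have permission to create workspaces in the resource group; the subscription ID, resource group, and workspace names below are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Workspace

# Connect at the subscription/resource-group scope (placeholder values).
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
)

# One workspace per project keeps cost reporting and configuration scoped to that project.
project_ws = Workspace(
    name="ws-project-fraud-detection",   # hypothetical project workspace name
    location="eastus",
    display_name="Fraud detection project",
    description="Workspace dedicated to the fraud detection project",
)

# Workspace creation is a long-running operation; wait for it to finish.
ml_client.workspaces.begin_create(project_ws).result()
```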
Your workspace keeps a history of all training runs, with logs, metrics, output, lineage metadata, and a snapshot of your scripts. As you perform tasks in Azure Machine Learning, artifacts are generated. Their metadata and data are stored in the workspace and on its associated resources.
Associated resources
When you create a new workspace, you're required to bring other Azure resources to store your data. If you don't provide them, these resources are automatically created by Azure Machine Learning.
Azure Storage account. Stores machine learning artifacts such as job logs. By default, this storage account is used when you upload data to the workspace. Jupyter notebooks that are used with your Azure Machine Learning compute instances are stored here as well.
Important
You can't use an existing Azure Storage account if it is:
An account of type BlobStorage
A premium account (Premium_LRS and Premium_GRS)
An account with hierarchical namespace (used with Azure Data Lake Storage Gen2).
You can use premium storage or hierarchical namespace as additional storage by creating a datastore (see the sketch after this note).
If you bring an existing general-purpose v1 storage account, you can upgrade it to general-purpose v2 after the workspace is created.
Do not enable hierarchical namespace on the storage account after upgrading to general-purpose v2.
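As a sketch of registering additional storage as a datastore with the Python SDK v2, the following adds a blob container on a separate (for example, premium) storage account. The account, container, and key values are placeholders, and ml_client is assumed to be an authenticated MLClient for the workspace.

```python
from azure.ai.ml.entities import AzureBlobDatastore, AccountKeyConfiguration

# Register a container on a separate storage account (for example, a premium account)
# as an additional datastore; placeholder names and credentials.
premium_store = AzureBlobDatastore(
    name="premium_blob_store",
    description="Additional datastore backed by a premium storage account",
    account_name="<premium-storage-account>",
    container_name="<container-name>",
    credentials=AccountKeyConfiguration(account_key="<account-key>"),
)

ml_client.create_or_update(premium_store)
```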
Azure Container Registry (ACR). Stores the Docker containers that are created when you build custom environments via Azure Machine Learning. Deploying AutoML models and data profiling also trigger creation of custom environments.
Workspaces can be created without ACR as a dependency if you don't need to build custom Docker containers. Azure Machine Learning can read from external container registries.
ACR is automatically provisioned when you build custom Docker images. Use Azure role-based access control (Azure RBAC) to prevent custom Docker containers from being built.
Important
If your subscription settings require adding tags to resources under it, the ACR created by Azure Machine Learning fails, because tags can't be set on it.
Azure Application Insights. Helps you monitor and collect diagnostic information from your inference endpoints.
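If you prefer to reuse existing associated resources instead of having them created for you, you can pass their Azure resource IDs when creating the workspace. The following is a minimal sketch with the Python SDK v2; all resource IDs are placeholders, and ml_client is assumed to be scoped to your subscription and resource group.

```python
from azure.ai.ml.entities import Workspace

# Reuse existing associated resources by referencing their ARM resource IDs (placeholders).
ws = Workspace(
    name="ws-shared-resources",
    location="eastus",
    storage_account="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<storage-name>",
    container_registry="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ContainerRegistry/registries/<acr-name>",
    application_insights="/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Insights/components/<app-insights-name>",
)

ml_client.workspaces.begin_create(ws).result()
```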
The following workspace management tasks are available in each interface.

| Workspace management task | Portal | Studio | Python SDK | Azure CLI | VS Code |
| --- | --- | --- | --- | --- | --- |
| Create a workspace | ✓ | ✓ | ✓ | ✓ | ✓ |
| Manage workspace access | ✓ | | | ✓ | |
| Create and manage compute resources | ✓ | ✓ | ✓ | ✓ | ✓ |
| Create a compute instance | | ✓ | ✓ | ✓ | ✓ |
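For example, basic workspace management tasks such as listing and retrieving workspaces can be done from the Python SDK v2. A minimal sketch, assuming an authenticated MLClient scoped to a subscription and resource group (placeholder values):

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
)

# List the workspaces in the resource group.
for ws in ml_client.workspaces.list():
    print(ws.name, ws.location)

# Retrieve a single workspace's details (placeholder name).
ws = ml_client.workspaces.get("<workspace-name>")
print(ws.description)
```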
Warning
Moving your Azure Machine Learning workspace to a different subscription, or moving the owning subscription to a new tenant, is not supported. Doing so may cause errors.
Sub resources
When you create compute clusters and compute instances in Azure Machine Learning, sub resources are created.
VMs: provide computing power for compute instances and compute clusters, which you use to run jobs.
Load Balancer: a network load balancer is created for each compute instance and compute cluster to manage traffic even while the compute instance/cluster is stopped.
Virtual Network: these help Azure resources communicate with one another, the internet, and other on-premises networks.
Bandwidth: encapsulates all outbound data transfers across regions.
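For example, creating a compute cluster with the Python SDK v2 provisions these sub resources on your behalf. A hedged sketch, assuming an authenticated MLClient for the workspace; the cluster name and VM size are placeholders:

```python
from azure.ai.ml.entities import AmlCompute

# Creating the cluster provisions the underlying VMs, load balancer, and networking.
cpu_cluster = AmlCompute(
    name="cpu-cluster",               # hypothetical cluster name
    size="Standard_DS3_v2",           # VM size for the cluster nodes
    min_instances=0,                  # scale to zero when idle to limit VM cost
    max_instances=4,
    idle_time_before_scale_down=120,  # seconds before idle nodes are released
)

ml_client.compute.begin_create_or_update(cpu_cluster).result()
```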
Explore and configure the Azure Machine Learning workspace, its resources and its assets. Explore which developer tools you can use to interact with the workspace, focusing on the CLI and Python SDK v2.
Manage data ingestion and preparation, model training and deployment, and machine learning solution monitoring with Python, Azure Machine Learning and MLflow.
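As an illustrative sketch of tracking training against a workspace with MLflow (this assumes the azureml-mlflow plugin is installed, ml_client is an authenticated MLClient, and the workspace, experiment, and metric names are placeholders):

```python
import mlflow

# Point MLflow at the workspace's tracking server.
tracking_uri = ml_client.workspaces.get("<workspace-name>").mlflow_tracking_uri
mlflow.set_tracking_uri(tracking_uri)
mlflow.set_experiment("demo-experiment")

# Log a metric in a run; the run appears in the workspace's run history.
with mlflow.start_run():
    mlflow.log_metric("accuracy", 0.91)
```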
Hubs provide a central way to govern security, connectivity, and compute resources for a team with multiple workspaces. Project workspaces that are created using a hub obtain the same security settings and shared resource access.