Use an Azure Resource Manager template to create a workspace for Azure Machine Learning

In this article, you learn several ways to create an Azure Machine Learning workspace by using Azure Resource Manager templates. A Resource Manager template makes it easy to create resources in a single, coordinated operation. A template is a JSON document that defines the resources that are needed for a deployment. It might also specify deployment parameters. Parameters are used to provide input values during deployment.

For more information, see Deploy an application with an Azure Resource Manager template.

Prerequisites

Limitations

  • When you create a new workspace, you can either automatically create services needed by the workspace or use existing services. If you want to use existing services from a different Azure subscription than the workspace, you must register the Azure Machine Learning namespace in the subscription that contains those services. For example, if you create a workspace in subscription A that uses a storage account in subscription B, the Azure Machine Learning namespace must be registered in subscription B before the workspace can use the storage account.

    The resource provider for Azure Machine Learning is Microsoft.MachineLearningServices. To check whether it's registered, or to register it, see Azure resource providers and types.

    Important

    This information applies only to resources provided during workspace creation: Azure Storage Accounts, Azure Container Registry, Azure Key Vault, and Application Insights.

  • The example template might not always use the latest API version for Azure Machine Learning. We recommend that you modify the template to use the latest API versions before you use it. For information on the latest API versions for Azure Machine Learning, see the pages for specific operation groups in the Azure Machine Learning REST API documentation.

    Tip

    Each Azure service has its own set of API versions. For information on the API for a specific service, check the service information in the Azure REST API reference.

    To update the API version, find the "apiVersion": "YYYY-MM-DD" entry for the resource type and update it to the latest version. The following example is an entry for Azure Machine Learning:

    "type": "Microsoft.MachineLearningServices/workspaces",
    "apiVersion": "2023-10-01",
    
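To see which API versions a downloaded copy of the template pins, a quick local scan can help. The following is a minimal sketch. The function name list_api_versions and the file name azuredeploy.json are assumptions for illustration, not part of the template.

```shell
# Hypothetical helper: print every "apiVersion" entry in a local template file,
# prefixed with its line number, so that outdated versions are easy to spot.
list_api_versions() {
    template_file="$1"
    grep -n '"apiVersion"' "$template_file"
}

# Example usage, assuming the template was saved locally as azuredeploy.json:
# list_api_versions azuredeploy.json
```

To compare the result with the versions that the resource provider currently offers, a query such as az provider show --namespace Microsoft.MachineLearningServices --query "resourceTypes[?resourceType=='workspaces'].apiVersions" should list them for the workspaces resource type.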

Multiple workspaces in the same virtual network

The template doesn't support deploying multiple Azure Machine Learning workspaces in the same virtual network. This is because the template creates new DNS zones during deployment.

If you want to create a template that deploys multiple workspaces in the same virtual network, set it up manually (by using the Azure portal or CLI). Then use the Azure portal to generate a template.

About the Resource Manager template

You can get the Resource Manager template used throughout this document from the microsoft.machinelearningservices/machine-learning-workspace-vnet directory of the Azure Quickstart Templates GitHub repository.

This template creates the following Azure services:

  • An Azure Storage account
  • Azure Key Vault
  • Application Insights
  • Azure Container Registry
  • An Azure Machine Learning workspace

The Azure Machine Learning workspace uses these services for functionality like logging and storing data, secrets, and Docker images. The services are created in the resource group that you deploy the template to.

The example template has two required parameters:

  • The location, which specifies where to create the resources.

    The template uses the location you select for most resources. The exception is Application Insights, which isn't available in all of the locations that the other services are. If you select a location where it isn't available, the service is created in the South Central US location.

  • The workspaceName, which is the friendly name of the Azure Machine Learning workspace.

    Note

    The workspace name is case-insensitive.

    The names of the other services are generated randomly.

Tip

Although the template associated with this document creates a container registry, you can also create a new workspace without creating a container registry. One is created when you perform an operation that requires a container registry, such as training or deploying a model.

You can also reference an existing container registry or storage account in the Azure Resource Manager template, instead of creating a new one. If you do, you must either use a managed identity or enable the admin account for the container registry.

Warning

After an Azure Container Registry is created for a workspace, don't delete it. Doing so makes your Azure Machine Learning workspace inoperative.

For more information on templates, see the following articles:

Deploy the template

To deploy your template, you need to create a resource group.

See the Azure portal section if you prefer to use the graphical user interface.

az group create --name "examplegroup" --location "eastus"

After your resource group is created, deploy the template by using the following command:

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" location="eastus"
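
Before you run the deployment, you might want to preview what it would change. The following is a minimal sketch that wraps az deployment group what-if, which reports the predicted changes without deploying anything. The function name preview_workspace_deployment and its argument order are assumptions for illustration.

```shell
# Hypothetical helper: preview the changes a deployment would make.
# Arguments: resource group, template URI, workspace name, location.
preview_workspace_deployment() {
    az deployment group what-if \
        --resource-group "$1" \
        --template-uri "$2" \
        --parameters workspaceName="$3" location="$4"
}

# Example usage (values are placeholders):
# preview_workspace_deployment "examplegroup" "<template URI from the preceding command>" \
#     "exampleworkspace" "eastus"
```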

By default, all resources created by the template are new. However, you can also use existing resources by including different parameters in the template. For example, if you want to use an existing storage account, set the storageAccountOption value to existing, and provide the name of your storage account in the storageAccountName parameter, as shown in the following command.

Important

If you want to use an existing Azure Storage account, it can't be a premium account (Premium_LRS or Premium_GRS). It also can't have a hierarchical namespace (which is used with Azure Data Lake Storage Gen2). Neither premium storage nor hierarchical namespaces are supported with the default storage account of the workspace. You can use premium storage or hierarchical namespace with non-default storage accounts.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      storageAccountOption="existing" \
      storageAccountName="existingstorageaccountname"
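
As the number of parameters grows, a parameters file can be easier to maintain than a long command line. The following is a minimal sketch that writes such a file for the existing-storage scenario. The function name write_parameters_file and the file name params.json are assumptions for illustration. The parameter names are the ones used in the preceding command.

```shell
# Hypothetical helper: write an ARM parameters file for the example template.
# Arguments: workspace name, location, existing storage account name, output path.
write_parameters_file() {
    cat > "$4" <<EOF
{
  "\$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workspaceName": { "value": "$1" },
    "location": { "value": "$2" },
    "storageAccountOption": { "value": "existing" },
    "storageAccountName": { "value": "$3" }
  }
}
EOF
}

# Example usage (values are placeholders):
# write_parameters_file "exampleworkspace" "eastus" "existingstorageaccountname" params.json
# az deployment group create \
#     --name "exampledeployment" \
#     --resource-group "examplegroup" \
#     --template-uri "<template URI from the preceding command>" \
#     --parameters @params.json
```

Keeping the file in source control gives you a record of the values used for each deployment.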

Deploy an encrypted workspace

The following example template demonstrates how to create a workspace that has three settings:

  • Enable high confidentiality settings for the workspace. This configuration creates a new Azure Cosmos DB instance.
  • Enable encryption for the workspace.
  • Use an existing Azure key vault to retrieve customer-managed keys. Customer-managed keys are used to create a new Azure Cosmos DB instance for the workspace.

Important

After a workspace is created, you can't change the settings for confidential data, encryption, key vault ID, or key identifiers. To change these values, you must create a new workspace that uses the new values.

For more information, see Customer-managed keys.

Important

Your subscription must meet these requirements before you use this template:

  • You must have an existing Azure key vault that contains an encryption key.
  • The key vault must be in the same region where you plan to create the Azure Machine Learning workspace.
  • You must specify the ID of the key vault and the URI of the encryption key.
  • The key vault must have both soft delete and purge protection enabled.

For information about creating the vault and key, see Configure customer-managed keys.

To get the values for the cmk_keyvault (the ID of the key vault) and the resource_cmk_uri (the key URI) parameters needed by this template, take the following steps:

  1. To get the key vault ID, use the following command:

    az keyvault show --name <keyvault-name> --query 'id' --output tsv    
    

    This command returns a value similar to /subscriptions/{subscription-guid}/resourceGroups/<resource-group-name>/providers/Microsoft.KeyVault/vaults/<keyvault-name>.

  2. To get the value for the URI for the customer-managed key, use the following command:

    az keyvault key show --vault-name <keyvault-name> --name <key-name> --query 'key.kid' --output tsv    
    

This command returns a value similar to https://mykeyvault.vault.azure.net/keys/mykey/{guid}.
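
Because the key vault must have both soft delete and purge protection enabled, it can be worth checking those settings before you capture the two values. The following is a minimal sketch. The function name keyvault_is_protected is an assumption for illustration, and the check reads the enableSoftDelete and enablePurgeProtection properties from the az keyvault show output.

```shell
# Hypothetical helper: succeed only if the key vault reports both soft delete
# and purge protection as enabled. Output casing is normalized before comparing.
keyvault_is_protected() {
    vault_name="$1"
    soft_delete="$(az keyvault show --name "$vault_name" \
        --query 'properties.enableSoftDelete' --output tsv | tr '[:upper:]' '[:lower:]')"
    purge_protection="$(az keyvault show --name "$vault_name" \
        --query 'properties.enablePurgeProtection' --output tsv | tr '[:upper:]' '[:lower:]')"
    [ "$soft_delete" = "true" ] && [ "$purge_protection" = "true" ]
}

# Example usage: capture the two template inputs only when the vault qualifies.
# if keyvault_is_protected "<keyvault-name>"; then
#     cmk_keyvault="$(az keyvault show --name "<keyvault-name>" --query 'id' --output tsv)"
#     resource_cmk_uri="$(az keyvault key show --vault-name "<keyvault-name>" \
#         --name "<key-name>" --query 'key.kid' --output tsv)"
# fi
```

A vault that doesn't report one of the properties at all is treated as not protected, which is the conservative choice for this sketch.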

To enable the use of customer-managed keys, set the following parameters when deploying the template:

  • Set encryption_status to Enabled.
  • Set cmk_keyvault to the key vault ID obtained in the preceding steps.
  • Set resource_cmk_uri to the key URI obtained in the preceding steps.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      encryption_status="Enabled" \
      cmk_keyvault="/subscriptions/{subscription-guid}/resourceGroups/<resource-group-name>/providers/Microsoft.KeyVault/vaults/<keyvault-name>" \
      resource_cmk_uri="https://mykeyvault.vault.azure.net/keys/mykey/{guid}"

When you use a customer-managed key, Azure Machine Learning creates a secondary resource group that contains the Azure Cosmos DB instance. For more information, see Encryption at rest in Azure Cosmos DB.

You can optionally set the confidential_data parameter to true. Doing so enables the following behavior:

  • Starts encrypting the local scratch disk for Azure Machine Learning compute clusters, if you haven't created any clusters in your subscription. If you have previously created a cluster in the subscription, open a support ticket to have encryption of the scratch disk enabled for your compute clusters.

  • Cleans up the local scratch disk between jobs.

  • Securely passes credentials for the storage account, container registry, and SSH account from the execution layer to your compute clusters by using Key Vault.

  • Enables IP filtering to ensure that no external services other than AzureMachineLearningService can call the underlying batch pools.

    Important

    After a workspace is created, you can't change the settings for confidential data, encryption, key vault ID, or key identifiers. To change these values, you must create a new workspace that uses the new values.

    For more information, see Encryption at rest.

Deploy a workspace behind a virtual network

By setting the vnetOption parameter value to either new or existing, you can create the resources used by a workspace behind a virtual network.

Important

For Container Registry, only the Premium SKU is supported.

Important

Application Insights doesn't support deployment behind a virtual network.

Only deploy the workspace behind a private endpoint

If your associated resources aren't behind a virtual network, you can set the privateEndpointType parameter to AutoApproval or ManualApproval to deploy the workspace behind a private endpoint. This setting can be used for both new and existing workspaces. When updating an existing workspace, configure the template parameters with the information from the existing workspace.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      privateEndpointType="AutoApproval"

Use a new virtual network

To deploy a resource behind a new virtual network, set the vnetOption to new and provide the virtual network settings for the resource. The following example shows how to deploy a workspace and deploy the storage account resource behind a new virtual network.

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="new" \
      vnetName="examplevnet" \
      storageAccountBehindVNet="true" \
      privateEndpointType="AutoApproval"

Alternatively, you can deploy multiple or all dependent resources behind a virtual network:

az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="new" \
      vnetName="examplevnet" \
      storageAccountBehindVNet="true" \
      keyVaultBehindVNet="true" \
      containerRegistryBehindVNet="true" \
      containerRegistryOption="new" \
      containerRegistrySku="Premium" \
      privateEndpointType="AutoApproval"

Use an existing virtual network and existing resources

To deploy a workspace with existing resources, set the vnetOption and subnetOption parameters to existing. Before deployment, you also need to create service endpoints in the virtual network for each of the resources. As with new virtual network deployments, you can place one or all of your resources behind a virtual network.

Important

Subnets should have a Microsoft.Storage service endpoint.

Important

Subnets don't support private endpoints. Disable private endpoints to enable subnets.

  1. Enable service endpoints for the resources:

    # Set all three service endpoints in one call (--service-endpoints accepts a space-separated list).
    az network vnet subnet update --resource-group "examplegroup" --vnet-name "examplevnet" --name "examplesubnet" --service-endpoints "Microsoft.Storage" "Microsoft.KeyVault" "Microsoft.ContainerRegistry"
    
  2. Deploy the workspace:

    az deployment group create \
    --name "exampledeployment" \
    --resource-group "examplegroup" \
    --template-uri "https://raw.githubusercontent.com/Azure/azure-quickstart-templates/master/quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet/azuredeploy.json" \
    --parameters workspaceName="exampleworkspace" \
      location="eastus" \
      vnetOption="existing" \
      vnetName="examplevnet" \
      vnetResourceGroupName="examplegroup" \
      storageAccountBehindVNet="true" \
      keyVaultBehindVNet="true" \
      containerRegistryBehindVNet="true" \
      containerRegistryOption="new" \
      containerRegistrySku="Premium" \
      subnetName="examplesubnet" \
      subnetOption="existing" \
      privateEndpointType="AutoApproval"
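
Between the two steps above, you might want to confirm that the subnet actually lists the service endpoints. The following is a minimal sketch. The function name subnet_has_required_endpoints is an assumption for illustration, and the list of required endpoints mirrors the ones enabled in step 1.

```shell
# Hypothetical helper: succeed only if the subnet lists every required
# service endpoint. Arguments: resource group, virtual network, subnet.
subnet_has_required_endpoints() {
    endpoints="$(az network vnet subnet show \
        --resource-group "$1" --vnet-name "$2" --name "$3" \
        --query 'serviceEndpoints[].service' --output tsv | tr '\t' '\n')"
    for required in Microsoft.Storage Microsoft.KeyVault Microsoft.ContainerRegistry; do
        # -x matches whole lines and -F treats the dots literally.
        printf '%s\n' "$endpoints" | grep -qxF "$required" || {
            echo "Missing service endpoint: $required" >&2
            return 1
        }
    done
}

# Example usage:
# subnet_has_required_endpoints "examplegroup" "examplevnet" "examplesubnet" \
#     && echo "Subnet is ready for deployment"
```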
    

Use the Azure portal

  1. Complete the steps in Deploy resources from custom template. When you get to the Custom deployment pane, select Quickstart template.

  2. In the Quickstart template list, select quickstarts/microsoft.machinelearningservices/machine-learning-workspace-vnet. Finally, select Select template.

  3. On the Custom deployment page, provide the following required information and any other parameters required by your deployment scenario.

    • Subscription: Select the Azure subscription to use for the resources.
    • Resource group: Select or create the resource group to contain the services.
    • Region: Select the Azure region to create the resources in.
    • Workspace name: Enter a name for the Azure Machine Learning workspace. The workspace name must be between 3 and 33 characters. It can contain only alphanumeric characters and the - character.
    • Location: Select the location for the deployment metadata. This location can be the same as the region location, or it can be different.
    • Vnet Name: Enter a virtual network name.
  4. Select Review + create.

  5. Select Create.
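
The workspace name rule from step 3 (3 to 33 characters, alphanumeric characters and the - character only) can also be checked in a script before you deploy. The following is a minimal sketch. The function name is_valid_workspace_name is an assumption for illustration, and the check covers only the rule stated in this article.

```shell
# Hypothetical helper: succeed only if the name is 3 to 33 characters long and
# contains nothing but letters, digits, and hyphens.
is_valid_workspace_name() {
    printf '%s' "$1" | grep -Eq '^[A-Za-z0-9-]{3,33}$'
}

# Example usage:
# is_valid_workspace_name "exampleworkspace" || echo "Invalid workspace name" >&2
```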

For more information, see Deploy resources from custom template.

Troubleshooting

Resource provider errors

When creating an Azure Machine Learning workspace, or a resource used by the workspace, you might get an error that's similar to one of these:

  • No registered resource provider found for location {location}
  • The subscription is not registered to use namespace {resource-provider-namespace}

Most resource providers are automatically registered, but not all of them. If you see this message, you need to register a provider.

The following table contains a list of resource providers required by Azure Machine Learning:

Resource provider                   Why it's needed
Microsoft.MachineLearningServices   Creating the Azure Machine Learning workspace.
Microsoft.Storage                   An Azure Storage account is used as the default storage for the workspace.
Microsoft.ContainerRegistry         Azure Container Registry is used by the workspace to build Docker images.
Microsoft.KeyVault                  Azure Key Vault is used by the workspace to store secrets.
Microsoft.Notebooks                 An Azure Machine Learning compute instance uses integrated notebooks.
Microsoft.ContainerService          You want to deploy trained models to Azure Kubernetes Service.

If you want to use a customer-managed key with Azure Machine Learning, you must register the following service providers:

Resource provider      Why it's needed
Microsoft.DocumentDB   An Azure Cosmos DB instance logs metadata for the workspace.
Microsoft.Search       Azure Search provides indexing capabilities for the workspace.

If you want to use a managed virtual network with Azure Machine Learning, you must register the Microsoft.Network resource provider. This resource provider is used by the workspace when private endpoints for the managed virtual network are created.

For information on registering resource providers, see Resolve errors for resource provider registration.
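
If you prefer to check and register providers from the command line, the following minimal sketch shows one way to do it with az provider show and az provider register. The function name ensure_provider_registered is an assumption for illustration, and the sketch acts on whichever subscription is currently selected in the Azure CLI.

```shell
# Hypothetical helper: register a resource provider only if the current
# subscription doesn't already report it as Registered.
ensure_provider_registered() {
    namespace="$1"
    state="$(az provider show --namespace "$namespace" \
        --query 'registrationState' --output tsv)"
    if [ "$state" = "Registered" ]; then
        echo "$namespace: already registered"
    else
        echo "$namespace: state is '${state}', registering"
        az provider register --namespace "$namespace"
    fi
}

# Example usage for some of the providers in the first table:
# for ns in Microsoft.MachineLearningServices Microsoft.Storage \
#           Microsoft.ContainerRegistry Microsoft.KeyVault; do
#     ensure_provider_registered "$ns"
# done
```

Registration can take a few minutes to complete, so it's reasonable to rerun the check before you retry the deployment.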

Key Vault access policy and Resource Manager templates

You might encounter failures if you use a Resource Manager template to create a workspace and associated resources (including Key Vault) multiple times. For example, using a template multiple times with the same parameters as part of a continuous integration and deployment pipeline can lead to failures.

Most resource creation operations that run via templates are idempotent, but Key Vault clears the access policies each time the template is used. Clearing the access policies creates problems with accessing the key vault for any workspace that's using it. For example, stop and create operations of Azure notebook VMs might fail.

To avoid this problem, we recommend one of the following approaches:

  • Don't deploy a template more than once with the same parameters. Or delete existing resources before using the template to re-create them.

  • Examine the Key Vault access policies and use these policies to set the accessPolicies property of the template. To view the access policies, use the following Azure CLI command:

    az keyvault show --name mykeyvault --resource-group myresourcegroup --query properties.accessPolicies
    

    For more information on using the accessPolicies section of the template, see the AccessPolicyEntry object reference.

  • Check whether the Key Vault resource already exists. If it does, don't re-create it by using the template. For example, to use the existing Key Vault instead of creating a new one, make the following changes in the template:

    • Add a parameter that accepts the ID of an existing Key Vault resource:

      "keyVaultId":{
        "type": "string",
        "metadata": {
          "description": "Specify the existing Key Vault ID."
        }
      }
      
    • Remove the section that creates a Key Vault resource:

      {
        "type": "Microsoft.KeyVault/vaults",
        "apiVersion": "2018-02-14",
        "name": "[variables('keyVaultName')]",
        "location": "[parameters('location')]",
        "properties": {
          "tenantId": "[variables('tenantId')]",
          "sku": {
            "name": "standard",
            "family": "A"
          },
          "accessPolicies": [
          ]
        }
      },
      
    • Remove the "[resourceId('Microsoft.KeyVault/vaults', variables('keyVaultName'))]", line from the dependsOn section of the workspace. Also change the keyVault entry in the properties section of the workspace to reference the keyVaultId parameter:

      {
        "type": "Microsoft.MachineLearningServices/workspaces",
        "apiVersion": "2019-11-01",
        "name": "[parameters('workspaceName')]",
        "location": "[parameters('location')]",
        "dependsOn": [
          "[resourceId('Microsoft.Storage/storageAccounts', variables('storageAccountName'))]",
          "[resourceId('Microsoft.Insights/components', variables('applicationInsightsName'))]"
        ],
        "identity": {
          "type": "systemAssigned"
        },
        "sku": {
          "tier": "[parameters('sku')]",
          "name": "[parameters('sku')]"
        },
        "properties": {
          "friendlyName": "[parameters('workspaceName')]",
          "keyVault": "[parameters('keyVaultId')]",
          "applicationInsights": "[resourceId('Microsoft.Insights/components',variables('applicationInsightsName'))]",
          "storageAccount": "[resourceId('Microsoft.Storage/storageAccounts/',variables('storageAccountName'))]"
        }
      }
      

    After you make these changes, you can specify the ID of the existing Key Vault resource when running the template. The template then reuses the key vault by setting the keyVault property of the workspace to its ID.

    To get the ID of the key vault, you can reference the output of the original template job or use the Azure CLI. The following command shows how to use the Azure CLI to get the key vault resource ID:

    az keyvault show --name mykeyvault --resource-group myresourcegroup --query id
    

    This command returns a value similar to this:

    /subscriptions/{subscription-guid}/resourceGroups/myresourcegroup/providers/Microsoft.KeyVault/vaults/mykeyvault
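
The lookup and the deployment can be combined so that the key vault ID is passed straight to the keyVaultId parameter. The following is a minimal sketch that assumes you saved the modified template locally. The function name deploy_with_existing_keyvault and its argument order are assumptions for illustration, and you might need to add other parameters that your copy of the template requires.

```shell
# Hypothetical helper: deploy a locally modified template, reusing an existing
# key vault. Arguments: resource group, key vault name, template file, workspace name.
deploy_with_existing_keyvault() {
    key_vault_id="$(az keyvault show --name "$2" --resource-group "$1" \
        --query 'id' --output tsv)"
    if [ -z "$key_vault_id" ]; then
        echo "Key vault '$2' not found in resource group '$1'" >&2
        return 1
    fi
    az deployment group create \
        --resource-group "$1" \
        --template-file "$3" \
        --parameters workspaceName="$4" keyVaultId="$key_vault_id"
}

# Example usage (values are placeholders):
# deploy_with_existing_keyvault "myresourcegroup" "mykeyvault" azuredeploy.json "exampleworkspace"
```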