Manage Azure Machine Learning workspaces by using Terraform

In this article, you learn how to create an Azure Machine Learning workspace by using Terraform configuration files. Terraform template-based configuration files enable you to define, create, and configure Azure resources in a repeatable and predictable manner. Terraform tracks resource state and can clean up and destroy resources.

A Terraform configuration file is a document that defines the resources needed for a deployment. The configuration can also declare variables that supply input values when you apply it.

Prerequisites

Limitations

  • When you create a new workspace, you can either automatically create services needed by the workspace or use existing services. If you want to use existing services from a different Azure subscription than the workspace, you must register the Azure Machine Learning namespace in the subscription that contains those services. For example, if you create a workspace in subscription A that uses a storage account in subscription B, the Azure Machine Learning namespace must be registered in subscription B before the workspace can use the storage account.

    The resource provider for Azure Machine Learning is Microsoft.MachineLearningServices. For information on checking whether it's registered, or on registering it, see Azure resource providers and types.

    Important

    This information applies only to resources provided during workspace creation: Azure Storage Accounts, Azure Container Registry, Azure Key Vault, and Application Insights.

  • The following limitation applies to the Application Insights instance created during workspace creation:

    Tip

    An Azure Application Insights instance is created when you create the workspace. You can delete the Application Insights instance after the workspace is created. Deleting it limits the information gathered from the workspace and might make problems harder to troubleshoot. If you delete the Application Insights instance created by the workspace, the only way to recreate it is to delete and recreate the workspace.

    For more information on using the Application Insights instance, see Monitor and collect data from Machine Learning web service endpoints.
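
As a minimal sketch of the cross-subscription case described in the first limitation, the following fragment looks up an existing storage account in a second subscription through an aliased provider. The alias name shared, the subscription ID placeholder, and the storage account and resource group names are hypothetical. The sketch assumes that Microsoft.MachineLearningServices is already registered in that subscription.

```hcl
# Second provider instance pointed at the subscription that owns the existing
# services ("subscription B"). The alias and the ID are placeholders.
provider "azurerm" {
  alias           = "shared"
  subscription_id = "00000000-0000-0000-0000-000000000000"
  features {}
}

# Look up the existing storage account instead of creating a new one.
# Both names below are hypothetical.
data "azurerm_storage_account" "existing" {
  provider            = azurerm.shared
  name                = "existingstorageacct"
  resource_group_name = "existing-rg"
}
```

The workspace could then set storage_account_id to data.azurerm_storage_account.existing.id rather than to a storage account declared in the same configuration.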

Create the workspace

Create a file named main.tf that has the following code.

data "azurerm_client_config" "current" {}

resource "azurerm_resource_group" "default" {
  name     = "${random_pet.prefix.id}-rg"
  location = var.location
}

resource "random_pet" "prefix" {
  prefix = var.prefix
  length = 2
}

resource "random_integer" "suffix" {
  min = 10000000
  max = 99999999
}

Declare the Azure provider in a file named providers.tf that has the following code.

terraform {
  required_version = ">= 1.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.0, < 4.0"
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.0"
    }
  }
}

provider "azurerm" {
  features {
    key_vault {
      recover_soft_deleted_key_vaults    = false
      purge_soft_delete_on_destroy       = false
      purge_soft_deleted_keys_on_destroy = false
    }
    resource_group {
      prevent_deletion_if_contains_resources = false
    }
  }
}

Configure the workspace

An Azure Machine Learning workspace depends on several other Azure services, and the Terraform configuration declares these associated resources alongside the workspace. Depending on your needs, you can configure the resources with either public or private network connectivity.

Note

Some resources in Azure require globally unique names. Before deploying your resources, make sure to set name variables to unique values.

The following configuration creates a workspace with public network connectivity.

Define the following variables in a file called variables.tf.

variable "environment" {
  type        = string
  description = "Name of the environment"
  default     = "dev"
}

variable "location" {
  type        = string
  description = "Location of the resources"
  default     = "eastus"
}

variable "prefix" {
  type        = string
  description = "Prefix of the resource name"
  default     = "ml"
}
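
The storage account, key vault, and container registry names in this article are built from var.prefix, and storage account names accept only lowercase letters and digits with a total length of 3 to 24 characters. As a hedged sketch, the prefix declaration could be replaced with a variant that rejects unsuitable values early. The eight-character cap is an assumption chosen to leave room for the environment value and the numeric suffix, not a documented limit.

```hcl
# Variant of the "prefix" variable with input validation.
# Use it in place of the declaration above, not in addition to it.
variable "prefix" {
  type        = string
  description = "Prefix of the resource name"
  default     = "ml"

  validation {
    # Must start with a lowercase letter and contain only lowercase letters
    # or digits. The maximum of 8 characters is an assumed budget.
    condition     = can(regex("^[a-z][a-z0-9]{0,7}$", var.prefix))
    error_message = "The prefix must start with a lowercase letter and contain at most 8 lowercase letters or digits."
  }
}
```

With this variant, terraform plan fails with the error message above when the prefix contains uppercase letters, hyphens, or too many characters, instead of failing later during resource creation.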

Define the following workspace configuration in a file called workspace.tf:

# Dependent resources for Azure Machine Learning
resource "azurerm_application_insights" "default" {
  name                = "${random_pet.prefix.id}-appi"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  application_type    = "web"
}

resource "azurerm_key_vault" "default" {
  name                     = "${var.prefix}${var.environment}${random_integer.suffix.result}kv"
  location                 = azurerm_resource_group.default.location
  resource_group_name      = azurerm_resource_group.default.name
  tenant_id                = data.azurerm_client_config.current.tenant_id
  sku_name                 = "premium"
  purge_protection_enabled = false
}

resource "azurerm_storage_account" "default" {
  name                            = "${var.prefix}${var.environment}${random_integer.suffix.result}st"
  location                        = azurerm_resource_group.default.location
  resource_group_name             = azurerm_resource_group.default.name
  account_tier                    = "Standard"
  account_replication_type        = "GRS"
  allow_nested_items_to_be_public = false
}

resource "azurerm_container_registry" "default" {
  name                = "${var.prefix}${var.environment}${random_integer.suffix.result}cr"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  sku                 = "Premium"
  admin_enabled       = true
}

# Machine Learning workspace
resource "azurerm_machine_learning_workspace" "default" {
  name                          = "${random_pet.prefix.id}-mlw"
  location                      = azurerm_resource_group.default.location
  resource_group_name           = azurerm_resource_group.default.name
  application_insights_id       = azurerm_application_insights.default.id
  key_vault_id                  = azurerm_key_vault.default.id
  storage_account_id            = azurerm_storage_account.default.id
  container_registry_id         = azurerm_container_registry.default.id
  public_network_access_enabled = true

  identity {
    type = "SystemAssigned"
  }
}
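
To make the created resources easier to find after deployment, the configuration could also expose a few output values. The following is an optional sketch that isn't part of the configuration above. The file name outputs.tf and the output names are assumptions.

```hcl
# outputs.tf (hypothetical file name): values printed after terraform apply.
output "resource_group_name" {
  description = "Name of the resource group that holds the workspace"
  value       = azurerm_resource_group.default.name
}

output "workspace_name" {
  description = "Name of the Azure Machine Learning workspace"
  value       = azurerm_machine_learning_workspace.default.name
}

output "workspace_id" {
  description = "Azure resource ID of the workspace"
  value       = azurerm_machine_learning_workspace.default.id
}
```

After the plan is applied, terraform output prints these values.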

Create and apply the plan

To create the workspace, run the following commands:

terraform init

# Optionally add -var "<name>=<value>" before -out for any variable declared in variables.tf.
terraform plan -out demo.tfplan

terraform apply "demo.tfplan"
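
Rather than passing -var flags on the command line, variable values can be kept in a terraform.tfvars file, which Terraform loads automatically from the working directory. The values below are hypothetical examples that override the defaults in variables.tf.

```hcl
# terraform.tfvars: loaded automatically by terraform plan and terraform apply.
# All three values are illustrative, not recommendations.
environment = "test"
location    = "westeurope"
prefix      = "mlteam"
```

Values passed with -var on the command line take precedence over values in this file.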

Troubleshoot resource provider errors

When you create an Azure Machine Learning workspace, or a resource used by the workspace, you might receive an error similar to one of the following messages:

  • No registered resource provider found for location {location}
  • The subscription is not registered to use namespace {resource-provider-namespace}

Most resource providers are automatically registered, but not all. If you receive one of these messages, register the resource provider that the message names.

The following table contains a list of the resource providers required by Azure Machine Learning:

Resource provider                    Why it's needed
Microsoft.MachineLearningServices    Creating the Azure Machine Learning workspace.
Microsoft.Storage                    Azure Storage Account is used as the default storage for the workspace.
Microsoft.ContainerRegistry          Azure Container Registry is used by the workspace to build Docker images.
Microsoft.KeyVault                   Azure Key Vault is used by the workspace to store secrets.
Microsoft.Notebooks                  Integrated notebooks on Azure Machine Learning compute instance.
Microsoft.ContainerService           If you plan on deploying trained models to Azure Kubernetes Service.

If you plan on using a customer-managed key with Azure Machine Learning, then the following resource providers must be registered:

Resource provider       Why it's needed
Microsoft.DocumentDB    Azure Cosmos DB instance that logs metadata for the workspace.
Microsoft.Search        Azure Search provides indexing capabilities for the workspace.

If you plan on using a managed virtual network with Azure Machine Learning, then the Microsoft.Network resource provider must be registered. This resource provider is used by the workspace when creating private endpoints for the managed virtual network.

For information on registering resource providers, see Resolve errors for resource provider registration.
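
Registration can also be expressed in Terraform. The following is a minimal sketch that uses the azurerm_resource_provider_registration resource to register Microsoft.Notebooks from the table above. The resource label notebooks is hypothetical. The azurerm provider registers a default set of resource providers on its own unless that behavior is turned off in the provider block, and declaring a provider that is already registered might fail until it's imported, so treat this as an option for providers that are missing from the subscription.

```hcl
# Registers one resource provider in the subscription used by the azurerm provider.
# Assumes Microsoft.Notebooks isn't registered yet and isn't managed elsewhere.
resource "azurerm_resource_provider_registration" "notebooks" {
  name = "Microsoft.Notebooks"
}
```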