Azure Machine Learning data exfiltration prevention

Azure Machine Learning has several inbound and outbound dependencies. Some of these dependencies can expose a data exfiltration risk by malicious agents within your organization. This document explains how to minimize data exfiltration risk by limiting inbound and outbound requirements.

  • Inbound: If your compute instance or cluster uses a public IP address, it requires inbound traffic from the AzureMachineLearning service tag on port 44224. You can control this inbound traffic by using a network security group (NSG) and service tags. It's difficult to disguise Azure service IPs, so the data exfiltration risk is low. You can also configure the compute to not use a public IP, which removes the inbound requirements.

  • Outbound: If malicious agents don't have write access to outbound destination resources, they can't use that outbound for data exfiltration. Azure Active Directory, Azure Resource Manager, Azure Machine Learning, and Microsoft Container Registry belong to this category. On the other hand, Storage and AzureFrontDoor.frontend can be used for data exfiltration.

    • Storage outbound: This requirement comes from the compute instance and compute cluster. A malicious agent can use this outbound rule to exfiltrate data by provisioning their own storage account and saving data to it. You can remove this data exfiltration risk by using an Azure Service Endpoint Policy and Azure Batch's simplified node communication architecture.

    • AzureFrontDoor.frontend outbound: Azure Front Door is used by the Azure Machine Learning studio UI and AutoML. Instead of allowing outbound traffic to the service tag (AzureFrontDoor.frontend), allow it only to the following fully qualified domain names (FQDNs). Switching to these FQDNs removes the unnecessary outbound traffic included in the service tag and allows only what is needed for the Azure Machine Learning studio UI and AutoML.

      • ml.azure.com
      • automlresources-prod.azureedge.net
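As a minimal sketch of the FQDN approach above, the following Python snippet checks a destination hostname against the two FQDNs listed. The function names are hypothetical, and exact, case-insensitive matching is an assumption made for illustration; a real firewall applies its own matching rules.

```python
# Hostnames taken from the list above. Exact matching is an assumption
# of this sketch, not a statement about how any firewall matches FQDNs.
ALLOWED_FQDNS = frozenset({
    "ml.azure.com",
    "automlresources-prod.azureedge.net",
})


def normalize_host(host: str) -> str:
    """Lowercase the hostname and drop a trailing root dot."""
    return host.strip().lower().rstrip(".")


def is_allowed_host(host: str) -> bool:
    """Return True only for an exact match against the allowlist."""
    return normalize_host(host) in ALLOWED_FQDNS


if __name__ == "__main__":
    # The second hostname is a made-up example of a destination that
    # is not on the allowlist.
    for candidate in ("ML.Azure.com.", "example.blob.core.windows.net"):
        print(candidate, is_allowed_host(candidate))
```

Because the sketch matches exactly, a subdomain such as a hypothetical `evil.ml.azure.com` is rejected.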

Prerequisites

  • An Azure subscription
  • An Azure Virtual Network (VNet)
  • An Azure Machine Learning workspace with a private endpoint that connects to the VNet.
    • The storage account used by the workspace must also connect to the VNet using a private endpoint.
  • Recreate your compute instance, or scale your compute cluster down to zero nodes.
    • Not required if you have joined the preview.
    • Not required if your compute instance and compute cluster were created after December 2022.

Why do I need to use the service endpoint policy?

Service endpoint policies let you filter egress virtual network traffic to Azure Storage accounts over a service endpoint, so that data can leave the virtual network only to specific Azure Storage accounts. Azure Machine Learning compute instances and compute clusters require access to Microsoft-managed storage accounts for provisioning. The Azure Machine Learning alias in service endpoint policies includes these Microsoft-managed storage accounts. Using service endpoint policies with the Azure Machine Learning alias prevents data exfiltration and lets you control the destination storage accounts. You can learn more in the Service Endpoint policy documentation.
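The effect of such a policy can be pictured with a toy model. The Python sketch below is purely conceptual and does not reflect how Azure evaluates policies internally. The class name and all account names are hypothetical, and the set of accounts behind the alias is assumed for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEndpointPolicyModel:
    """Toy model of which storage accounts a subnet may reach."""

    listed_accounts: frozenset  # accounts added to the policy as resources
    alias_accounts: frozenset   # Microsoft-managed accounts assumed to sit behind the alias

    def permits(self, account_name: str) -> bool:
        """Allow traffic only to listed accounts or alias-covered accounts."""
        name = account_name.lower()
        return name in self.listed_accounts or name in self.alias_accounts


# All account names below are made up for the example.
policy = ServiceEndpointPolicyModel(
    listed_accounts=frozenset({"workspacestorage"}),
    alias_accounts=frozenset({"managedprovisioning"}),
)

if __name__ == "__main__":
    for account in ("workspacestorage", "managedprovisioning", "attackerstorage"):
        print(account, policy.permits(account))
```

In this model, a storage account that a malicious agent provisions for themselves is in neither set, so traffic to it is denied.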

1. Create the service endpoint policy

  1. From the Azure portal, add a new Service Endpoint Policy. On the Basics tab, provide the required information and then select Next.

  2. On the Policy definitions tab, perform the following actions:

    1. Select + Add a resource, and then provide the following information:

      • Service: Microsoft.Storage
      • Scope: Select the scope as Single account to limit the network traffic to one storage account.
      • Subscription: The Azure subscription that contains the storage account.
      • Resource group: The resource group that contains the storage account.
      • Resource: The default storage account of your workspace.

      Select Add to add the resource information.

      A screenshot showing how to create a service endpoint policy.

    2. Select + Add an alias, and then select /services/Azure/MachineLearning as the Server Alias value. Select Add to add the alias.

      Note

      The Azure CLI and Azure PowerShell do not provide support for adding an alias to the policy.

  3. Select Review + Create, and then select Create.

Important

If your compute instance and compute cluster need access to additional storage accounts, include those storage accounts in the resources section of your service endpoint policy. This isn't required for storage accounts that you reach through Storage private endpoints. Service endpoint policies and private endpoints are independent of each other.
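When a policy has to cover several storage accounts, it can help to assemble their resource IDs in one place before entering them. The sketch below builds standard Azure resource IDs for the default account and any additional ones. The helper names, the subscription ID, the resource group, and the account names are all hypothetical, and the sketch assumes every account lives in the same resource group.

```python
def storage_account_id(subscription_id: str, resource_group: str, account: str) -> str:
    """Build the Azure resource ID of a storage account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )


def policy_resources(subscription_id, resource_group, default_account, extra_accounts=()):
    """Return resource IDs, default account first, skipping duplicates."""
    ordered = []
    for account in (default_account, *extra_accounts):
        resource_id = storage_account_id(subscription_id, resource_group, account)
        if resource_id not in ordered:
            ordered.append(resource_id)
    return ordered


if __name__ == "__main__":
    # Placeholder values, not real resources.
    for rid in policy_resources(
        "00000000-0000-0000-0000-000000000000",
        "ml-rg",
        "mlworkspacestore",
        ["datalakestore"],
    ):
        print(rid)
```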

2. Allow inbound and outbound network traffic

Inbound

Important

The following information modifies the guidance provided in the How to secure training environment article.

When using Azure Machine Learning compute instance with a public IP address, allow inbound traffic from Azure Batch management (service tag BatchNodeManagement.<region>). A compute instance with no public IP doesn't require this inbound communication.

Outbound

Important

The following information is in addition to the guidance provided in the Secure training environment with virtual networks and Configure inbound and outbound network traffic articles.

Allow outbound traffic over TCP port 443 to the following service tags. Replace <region> with the Azure region that contains your compute cluster or instance:

  • BatchNodeManagement.<region>
  • AzureMachineLearning
  • Storage.<region> - A Service Endpoint Policy will be applied in a later step to limit outbound traffic.

For more information, see How to secure training environments and Configure inbound and outbound network traffic.
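The outbound service tags listed above can be expanded for a given region with a short script. This is a sketch only: the dictionary keys loosely mirror NSG rule fields but are illustrative rather than an exact API schema, and the rule names, starting priority, and function name are assumptions.

```python
# Service tag templates from the list above.
OUTBOUND_TAG_TEMPLATES = (
    "BatchNodeManagement.<region>",
    "AzureMachineLearning",
    "Storage.<region>",
)


def outbound_rules(region: str, start_priority: int = 100) -> list:
    """Describe one allow rule per service tag, TCP port 443 outbound."""
    rules = []
    for offset, template in enumerate(OUTBOUND_TAG_TEMPLATES):
        tag = template.replace("<region>", region)
        rules.append({
            "name": "Allow-" + tag.replace(".", "-") + "-Outbound",  # hypothetical naming scheme
            "priority": start_priority + offset,                     # assumed priorities
            "direction": "Outbound",
            "access": "Allow",
            "protocol": "Tcp",
            "destination_address_prefix": tag,
            "destination_port_range": "443",
        })
    return rules


if __name__ == "__main__":
    for rule in outbound_rules("westus2"):
        print(rule["priority"], rule["name"], rule["destination_address_prefix"])
```

Generating the rules from one template list keeps the region substitution consistent across all tags.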

3. Enable storage endpoint for the subnet

  1. From the Azure portal, select the Azure Virtual Network for your Azure ML workspace.
  2. From the left of the page, select Subnets and then select the subnet that contains your compute cluster/instance resources.
  3. In the form that appears, expand the Services dropdown and then enable Microsoft.Storage. Select Save to save these changes.
  4. Apply the service endpoint policy to your workspace subnet.

A screenshot of the Azure portal showing how to enable storage endpoint for the subnet.
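After completing these steps, you may want to confirm that the subnet carries both settings. The sketch below inspects a subnet description shaped like an exported resource definition. The `serviceEndpoints` and `serviceEndpointPolicies` property names are assumed to follow the subnet resource schema, and the function name and sample IDs are hypothetical.

```python
def missing_subnet_settings(subnet: dict, policy_id: str) -> list:
    """List what the subnet still lacks; an empty list means both settings are present."""
    props = subnet.get("properties", {})
    services = {entry.get("service") for entry in props.get("serviceEndpoints", [])}
    # Compare IDs case-insensitively, since Azure resource IDs are not case-sensitive.
    policies = {str(entry.get("id", "")).lower() for entry in props.get("serviceEndpointPolicies", [])}

    problems = []
    if "Microsoft.Storage" not in services:
        problems.append("Microsoft.Storage service endpoint is not enabled")
    if policy_id.lower() not in policies:
        problems.append("service endpoint policy is not attached")
    return problems


if __name__ == "__main__":
    # Placeholder policy ID and subnet definition, not real resources.
    example_policy_id = (
        "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/ml-rg"
        "/providers/Microsoft.Network/serviceEndpointPolicies/ml-sep"
    )
    example_subnet = {
        "properties": {
            "serviceEndpoints": [{"service": "Microsoft.Storage"}],
            "serviceEndpointPolicies": [{"id": example_policy_id}],
        }
    }
    print(missing_subnet_settings(example_subnet, example_policy_id))
```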

4. Curated environments

When using Azure ML curated environments, make sure to use the latest environment version. The container registry for the environment must also be mcr.microsoft.com. To check the container registry, use the following steps:

  1. From Azure ML studio, select your workspace and then select Environments.

  2. Verify that the Azure container registry begins with a value of mcr.microsoft.com.

    Important

    If the container registry is viennaglobal.azurecr.io, you can't use the curated environment with data exfiltration prevention. Try upgrading to the latest version of the curated environment.

  3. When using mcr.microsoft.com, you must also allow outbound traffic to the following resources.

    Allow outbound traffic over TCP port 443 to the following service tags. Replace <region> with the Azure region that contains your compute cluster or instance.

    • MicrosoftContainerRegistry.<region>
    • AzureFrontDoor.FirstParty
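The registry check in step 2 can also be expressed as a small helper, which is useful if you keep a list of environment image references. In this sketch the function names and the sample image paths are hypothetical. It assumes the registry is the part of the image reference before the first slash.

```python
SUPPORTED_REGISTRY = "mcr.microsoft.com"
UNSUPPORTED_REGISTRY = "viennaglobal.azurecr.io"


def registry_of(image: str) -> str:
    """Return the registry host, taken as the text before the first slash."""
    return image.split("/", 1)[0].lower()


def curated_image_status(image: str) -> str:
    """Classify an image reference as 'ok', 'upgrade', or 'unknown'."""
    registry = registry_of(image)
    if registry == SUPPORTED_REGISTRY:
        return "ok"
    if registry == UNSUPPORTED_REGISTRY:
        return "upgrade"  # try the latest version of the curated environment
    return "unknown"


if __name__ == "__main__":
    # Made-up image paths used only to exercise the helper.
    for image in (
        "mcr.microsoft.com/azureml/example-image:1",
        "viennaglobal.azurecr.io/azureml/example-image:1",
    ):
        print(image, curated_image_status(image))
```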
