Azure Machine Learning data exfiltration prevention
Azure Machine Learning has several inbound and outbound dependencies. Some of these dependencies can expose a data exfiltration risk by malicious agents within your organization. This document explains how to minimize data exfiltration risk by limiting inbound and outbound requirements.
Inbound: If your compute instance or cluster uses a public IP address, you have an inbound requirement on the azuremachinelearning service tag (port 44224). You can control this inbound traffic by using a network security group (NSG) and service tags. It's difficult to disguise Azure service IPs, so there's low data exfiltration risk. You can also configure the compute to not use a public IP, which removes inbound requirements.
Outbound: If malicious agents don't have write access to outbound destination resources, they can't use that outbound for data exfiltration. Microsoft Entra ID, Azure Resource Manager, Azure Machine Learning, and Microsoft Container Registry belong to this category. On the other hand, Storage and AzureFrontDoor.frontend can be used for data exfiltration.
Storage Outbound: This requirement comes from compute instance and compute cluster. A malicious agent can use this outbound rule to exfiltrate data by provisioning and saving data in their own storage account. You can remove data exfiltration risk by using an Azure Service Endpoint Policy and Azure Batch's simplified node communication architecture.
AzureFrontDoor.frontend outbound: Azure Front Door is used by the Azure Machine Learning studio UI and AutoML. Instead of allowing outbound to the service tag (AzureFrontDoor.frontend), switch to the following fully qualified domain names (FQDN). Switching to these FQDNs removes unnecessary outbound traffic included in the service tag and allows only what is needed for Azure Machine Learning studio UI and AutoML.
ml.azure.com
automlresources-prod.azureedge.net
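As a rough illustration of why the FQDN list is narrower than the service tag, the following Python sketch models the two names above as an exact-match allowlist. The helper name `is_allowed_host` and the exact-match behavior are assumptions made for this example. They aren't part of any Azure SDK, and your firewall may evaluate FQDN rules differently.

```python
# Hypothetical helper: exact-match allowlist for the two FQDNs listed above.
ALLOWED_FQDNS = frozenset({
    "ml.azure.com",
    "automlresources-prod.azureedge.net",
})


def is_allowed_host(host: str) -> bool:
    """Return True only if host is exactly one of the listed FQDNs."""
    # DNS names are case-insensitive and may carry a trailing root dot.
    normalized = host.strip().rstrip(".").lower()
    return normalized in ALLOWED_FQDNS


if __name__ == "__main__":
    for candidate in ("ml.azure.com", "example.azureedge.net"):
        print(candidate, is_allowed_host(candidate))
```

Any other host that the AzureFrontDoor.frontend service tag would have covered is rejected by this check, which is the effect the switch to FQDNs is meant to achieve.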
Tip
The information in this article is primarily about using an Azure Virtual Network. Azure Machine Learning can also use a managed virtual network. With a managed virtual network, Azure Machine Learning handles the job of network isolation for your workspace and managed computes.
To address data exfiltration concerns, managed virtual networks allow you to restrict egress to only approved outbound traffic. For more information, see Workspace managed network isolation.
Prerequisites
- An Azure subscription
- An Azure Virtual Network (VNet)
- An Azure Machine Learning workspace with a private endpoint that connects to the VNet.
- The storage account used by the workspace must also connect to the VNet using a private endpoint.
- You need to recreate your compute instance or scale down your compute cluster to zero nodes.
  - Not required if you have joined the preview.
  - Not required if your compute instance and compute cluster were created after December 2022.
Why do I need to use the service endpoint policy?
Service endpoint policies allow you to filter egress virtual network traffic to Azure Storage accounts over service endpoints, so that data can leave the virtual network only to specific Azure Storage accounts. Azure Machine Learning compute instances and compute clusters require access to Microsoft-managed storage accounts for provisioning. The Azure Machine Learning alias in service endpoint policies includes these Microsoft-managed storage accounts. Using service endpoint policies with the Azure Machine Learning alias prevents data exfiltration and controls the destination storage accounts. You can learn more in the Service Endpoint policy documentation.
1. Create the service endpoint policy
From the Azure portal, add a new Service Endpoint Policy. On the Basics tab, provide the required information and then select Next.
On the Policy definitions tab, perform the following actions:
Select + Add a resource, and then provide the following information:
- Service: Microsoft.Storage
- Scope: Select the scope as Single account to limit the network traffic to one storage account.
- Subscription: The Azure subscription that contains the storage account.
- Resource group: The resource group that contains the storage account.
- Resource: The default storage account of your workspace.
Select Add to add the resource information.
Select + Add an alias, and then select /services/Azure/MachineLearning as the Server Alias value. Select Add to add the alias.
Note
The Azure CLI and Azure PowerShell do not provide support for adding an alias to the policy.
Select Review + Create, and then select Create.
Important
If your compute instance and compute cluster need access to additional storage accounts, your service endpoint policy should include the additional storage accounts in the resources section. Note that it is not required if you use Storage private endpoints. Service endpoint policy and private endpoint are independent.
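To make the shape of the policy concrete, here is a minimal Python sketch that models the portal form as plain data and checks whether every storage account your compute needs is listed. `PolicyDefinition`, `build_definition`, and `uncovered_accounts` are hypothetical names invented for this example. This is an in-memory model for reasoning about the policy, not an Azure Resource Manager schema or SDK call.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Alias value selected in the portal step above.
AML_ALIAS = "/services/Azure/MachineLearning"


@dataclass
class PolicyDefinition:
    """Hypothetical in-memory model of the portal form, not an Azure schema."""
    service: str = "Microsoft.Storage"
    resources: List[str] = field(default_factory=list)  # storage account resource IDs
    aliases: List[str] = field(default_factory=list)


def build_definition(workspace_storage_id: str,
                     extra_storage_ids: Iterable[str] = ()) -> PolicyDefinition:
    """List the workspace default storage account, any extras, and the alias."""
    resources = [workspace_storage_id, *extra_storage_ids]
    return PolicyDefinition(resources=resources, aliases=[AML_ALIAS])


def uncovered_accounts(definition: PolicyDefinition,
                       needed_ids: Iterable[str]) -> List[str]:
    """Return the needed storage account IDs that the policy does not list."""
    # Azure resource IDs are compared case-insensitively.
    listed = {rid.lower() for rid in definition.resources}
    return [rid for rid in needed_ids if rid.lower() not in listed]
```

Pass to `uncovered_accounts` only the accounts that are reached over the service endpoint. Accounts reached through a Storage private endpoint don't need to appear in the policy.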
2. Allow inbound and outbound network traffic
Inbound
Important
The following information modifies the guidance provided in the How to secure training environment article.
When using an Azure Machine Learning compute instance with a public IP address, allow inbound traffic from Azure Batch management (service tag BatchNodeManagement.<region>). A compute instance with no public IP doesn't require this inbound communication.
Outbound
Important
The following information is in addition to the guidance provided in the Secure training environment with virtual networks and Configure inbound and outbound network traffic articles.
Allow outbound traffic to the following service tags. Replace <region> with the Azure region that contains your compute cluster or instance:
Service tag | Protocol | Port |
---|---|---|
BatchNodeManagement.<region> | ANY | 443 |
AzureMachineLearning | TCP | 443 |
Storage.<region> | TCP | 443 |
Note
For the storage outbound, a Service Endpoint Policy will be applied in a later step to limit outbound traffic.
For more information, see How to secure training environments and Configure inbound and outbound network traffic.
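The outbound service tags in this section can also be kept as data and expanded per region, which helps when the same rules are reviewed or reapplied across several virtual networks. The following Python sketch assumes a hypothetical `outbound_rules` helper and a simple dictionary layout. It doesn't call any Azure API, and the key names are illustrative only.

```python
from typing import Dict, List, Union

# (service tag template, protocol, port), copied from the outbound table in this section.
OUTBOUND_TEMPLATE = [
    ("BatchNodeManagement.<region>", "ANY", 443),
    ("AzureMachineLearning", "TCP", 443),
    ("Storage.<region>", "TCP", 443),
]


def outbound_rules(region: str) -> List[Dict[str, Union[str, int]]]:
    """Expand the <region> placeholder for one Azure region."""
    region = region.strip()
    if not region or " " in region:
        raise ValueError("region must be a non-empty Azure region name without spaces")
    return [
        {
            "service_tag": tag.replace("<region>", region),
            "protocol": protocol,
            "port": port,
        }
        for tag, protocol, port in OUTBOUND_TEMPLATE
    ]


if __name__ == "__main__":
    for rule in outbound_rules("westus2"):
        print(rule)
```

The region name `westus2` is only an example. Use the region that contains your compute cluster or instance.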
3. Enable storage endpoint for the subnet
Use the following steps to enable a storage endpoint for the subnet that contains your Azure Machine Learning compute clusters and compute instances:
- From the Azure portal, select the Azure Virtual Network for your Azure Machine Learning workspace.
- From the left of the page, select Subnets and then select the subnet that contains your compute cluster and compute instance.
- In the form that appears, expand the Services dropdown and then enable Microsoft.Storage. Select Save to save these changes.
- Apply the service endpoint policy to your workspace subnet.
4. Curated environments
When using Azure Machine Learning curated environments, make sure to use the latest environment version. The container registry for the environment must also be mcr.microsoft.com. To check the container registry, use the following steps:
- From Azure Machine Learning studio, select your workspace and then select Environments.
- Verify that the Azure container registry begins with a value of mcr.microsoft.com.
Important
If the container registry is viennaglobal.azurecr.io, you cannot use the curated environment with data exfiltration prevention. Try upgrading to the latest version of the curated environment.
When using mcr.microsoft.com, you must also allow outbound traffic over TCP port 443 to the following service tags. Replace <region> with the Azure region that contains your compute cluster or instance.
- MicrosoftContainerRegistry.<region>
- AzureFrontDoor.FirstParty
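If you track environment images in scripts, the registry check described in this section can be approximated as follows. This is a minimal sketch: `registry_host` and `is_supported_curated_image` are hypothetical helpers, and the code assumes the image reference is a string whose registry host appears before the first slash.

```python
REQUIRED_REGISTRY = "mcr.microsoft.com"


def registry_host(image: str) -> str:
    """Return the text before the first '/' of an image reference, lowercased."""
    return image.strip().split("/", 1)[0].lower()


def is_supported_curated_image(image: str) -> bool:
    """True only when the image is served from mcr.microsoft.com."""
    # Compare the whole host so look-alike names such as
    # "mcr.microsoft.com.example" are not accepted.
    return registry_host(image) == REQUIRED_REGISTRY


if __name__ == "__main__":
    # Both image paths below are made-up examples.
    for image in (
        "mcr.microsoft.com/azureml/curated/example:1",
        "viennaglobal.azurecr.io/azureml/example:1",
    ):
        print(image, is_supported_curated_image(image))
```

An image that fails this check corresponds to the viennaglobal.azurecr.io case above, where upgrading to the latest version of the curated environment is the suggested fix.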
Next steps
For more information, see the following articles: