Use simplified compute node communication

An Azure Batch pool contains one or more compute nodes that execute user-specified workloads in the form of Batch tasks. To enable Batch functionality and Batch pool infrastructure management, compute nodes must communicate with the Azure Batch service.

Batch supports two types of node communication modes:

  • Classic, where the Batch service initiates communication with the compute nodes
  • Simplified, where the compute nodes initiate communication with the Batch service

This document describes the simplified compute node communication mode and the associated network configuration requirements.

Tip

The information in this document about networking resources and rules, such as network security groups (NSGs), doesn't apply to Batch pools with no public IP addresses that use the node management private endpoint without internet outbound access.

Supported regions

Simplified compute node communication in Azure Batch is currently available for the following regions:

  • Public: all public regions where Batch is present except for West India and France South.

  • Government: USGov Arizona, USGov Virginia, USGov Texas.

  • China: all China regions where Batch is present except for China North 1 and China East 1.

Compute node communication differences between Classic and Simplified

The simplified compute node communication mode streamlines the way Batch pool infrastructure is managed on behalf of users. This communication mode reduces the complexity and scope of inbound and outbound networking connections required in baseline operations.

Batch pools with the classic communication mode require the following networking rules in network security groups (NSGs), user-defined routes (UDRs), and firewalls when creating a pool in a virtual network:

  • Inbound:

    • Destination ports 29876, 29877 over TCP from BatchNodeManagement.region
  • Outbound:

    • Destination port 443 over TCP to Storage.region
    • Destination port 443 over TCP to BatchNodeManagement.region for certain workloads that require communication back to the Batch Service, such as Job Manager tasks

Batch pools with the simplified communication mode require the following networking rules in NSGs, UDRs, and firewalls:

  • Inbound:

    • None
  • Outbound:

    • Destination port 443 over ANY to BatchNodeManagement.region
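
As a minimal sketch, the single outbound rule for simplified mode could be expressed as an ARM-style NSG security rule. The rule name, priority, source prefix, and the EastUS region used here are illustrative assumptions, not values from this article; substitute the values for your own virtual network.

```python
def simplified_outbound_rule(region: str, priority: int = 100) -> dict:
    """Build an ARM-style NSG security rule for simplified mode.

    The rule allows outbound traffic on destination port 443, over any
    protocol, to the regional BatchNodeManagement service tag.
    """
    return {
        # Hypothetical rule name, chosen for illustration only.
        "name": "AllowBatchNodeManagementOutbound",
        "properties": {
            "direction": "Outbound",
            "access": "Allow",
            "protocol": "*",  # "ANY" in the rule list above
            "sourceAddressPrefix": "*",  # assumption: any source in the subnet
            "sourcePortRange": "*",
            "destinationAddressPrefix": f"BatchNodeManagement.{region}",
            "destinationPortRange": "443",
            "priority": priority,  # assumption: pick a free priority
        },
    }


rule = simplified_outbound_rule("EastUS")
print(rule["properties"]["destinationAddressPrefix"])
```

Generating the rule from the region name keeps the service tag consistent if you deploy the same pool layout in more than one region.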

You can discover the outbound requirements for a Batch account by using the List Outbound Network Dependencies Endpoints API. This API reports the base set of dependencies, which depends on the pool communication mode of the Batch account. User-specific workloads might need extra rules, such as rules that open traffic to other Azure resources (for example, Azure Storage for application packages or Azure Container Registry) or to endpoints like the Microsoft package repository for virtual file system mounting functionality.
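
The following sketch shows one way to flatten a response from the List Outbound Network Dependencies Endpoints API into rows that are easy to compare against firewall rules. The response shape (value, category, endpoints, domainName, endpointDetails, port) is an assumption about that API, and the sample payload is hypothetical; verify both against the API reference before relying on them.

```python
def flatten_endpoints(response: dict) -> list:
    """Return (category, domain name, port) tuples from a response dict."""
    rows = []
    for group in response.get("value", []):
        for endpoint in group.get("endpoints", []):
            for detail in endpoint.get("endpointDetails", []):
                rows.append(
                    (
                        group.get("category", ""),
                        endpoint.get("domainName", ""),
                        detail.get("port"),
                    )
                )
    return rows


# Hypothetical sample payload, not real output from a Batch account.
sample_response = {
    "value": [
        {
            "category": "Azure Batch",
            "endpoints": [
                {
                    "domainName": "example-account.eastus.batch.azure.com",
                    "endpointDetails": [{"port": 443}],
                }
            ],
        }
    ]
}

for category, domain, port in flatten_endpoints(sample_response):
    print(f"{category}: {domain}:{port}")
```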

Benefits of the simplified communication mode

The simplified mode reduces the number of networking connections and rules that you need to maintain. It also helps reduce security risks by removing the requirement to open ports for inbound communication from the internet. Only a single outbound rule to a well-known service tag is required for baseline operation.

The simplified mode also provides finer-grained data exfiltration control than the classic communication mode, because outbound communication to Storage.region is no longer required. You can explicitly lock down outbound communication to Azure Storage if your workflow requires it. For example, you can scope your outbound rules for Azure Storage so that they allow only your AppPackage storage accounts or other storage accounts used for resource files or output files.

Even if your workloads aren't currently affected by the changes (as described in the next section), we recommend that you move to the simplified mode. Doing so ensures that your Batch workloads are ready for any future improvements enabled by this mode, and for when this communication mode becomes the default.

Potential impact between classic and simplified communication modes

In many cases, the simplified communication mode doesn't directly affect your Batch workloads. However, it does have an impact in the following cases:

  • Users who specify a Virtual Network as part of creating a Batch pool and do one or both of the following actions:
    • Explicitly disable outbound network traffic rules that are incompatible with simplified compute node communication.
    • Use UDRs and firewall rules that are incompatible with simplified compute node communication.
  • Users who enable software firewalls on compute nodes and explicitly disable outbound traffic in software firewall rules that are incompatible with simplified compute node communication.

If either of these cases applies to you, follow the steps outlined in the next section to ensure that your Batch workloads can still function under the simplified mode. We strongly recommend that you test and verify all changes in a dev and test environment before pushing them into production.

Required network configuration changes for simplified communication mode

The following set of steps is required to migrate to the new communication mode:

  1. Ensure your networking configuration as applicable to Batch pools (NSGs, UDRs, firewalls, etc.) includes a union of the modes (that is, the combined network rules of both classic and simplified modes). At a minimum, these rules would be:
    • Inbound:
      • Destination ports 29876, 29877 over TCP from BatchNodeManagement.region
    • Outbound:
      • Destination port 443 over TCP to Storage.region
      • Destination port 443 over ANY to BatchNodeManagement.region
  2. If you have any other inbound or outbound scenarios required by your workflow, you'll need to ensure that your rules reflect these requirements.
  3. Use one of the following options to update your workloads to use the new communication mode.
    • Create new pools with the targetNodeCommunicationMode set to simplified and validate that the new pools are working correctly. Migrate your workload to the new pools and delete any earlier pools.
    • Update the targetNodeCommunicationMode property of existing pools to simplified, then resize each pool to zero nodes and scale back out.
  4. Use the Get Pool API, the List Pool API, or the Azure portal to confirm that currentNodeCommunicationMode is set to the desired communication mode of simplified.
  5. Change all applicable networking configuration to the simplified compute node communication rules. At a minimum, these rules are as follows (plus any extra rules that your workloads need, as discussed earlier):
    • Inbound:
      • None
    • Outbound:
      • Destination port 443 over ANY to BatchNodeManagement.region
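
The rule sets used in steps 1 and 5 can be sketched as plain data, which makes it easy to check a configuration for gaps before switching modes. This is an illustrative model only: the Rule type and the missing_rules helper are hypothetical names, and the service tags keep the generic "region" suffix used in this article.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    direction: str
    ports: str
    protocol: str
    service_tag: str


# Minimum rules for classic mode, as listed in step 1. Classic mode can also
# need TCP 443 to BatchNodeManagement.region for some workloads, which the
# simplified ANY rule already covers.
CLASSIC_RULES = frozenset({
    Rule("Inbound", "29876-29877", "TCP", "BatchNodeManagement.region"),
    Rule("Outbound", "443", "TCP", "Storage.region"),
})

# Minimum rules for simplified mode, as listed in step 5.
SIMPLIFIED_RULES = frozenset({
    Rule("Outbound", "443", "ANY", "BatchNodeManagement.region"),
})

# Step 1: during migration, the configuration must include both sets.
MIGRATION_RULES = CLASSIC_RULES | SIMPLIFIED_RULES


def missing_rules(configured, required) -> list:
    """Return the required rules that are absent from the configuration."""
    return sorted(set(required) - set(configured))


# A network that still has only the classic rules is missing one rule.
gaps = missing_rules(CLASSIC_RULES, MIGRATION_RULES)
for gap in gaps:
    print(f"Missing: {gap.direction} {gap.ports}/{gap.protocol} {gap.service_tag}")
```

Keeping the union in place until step 4 confirms the new mode means that pools continue to work regardless of which mode they're currently running in.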

If you follow these steps, but later want to switch back to classic compute node communication, you'll need to take the following actions:

  1. Create new pools with the targetNodeCommunicationMode property set to classic, or update that property on existing pools.
  2. Migrate your workload to these pools, or resize existing pools and scale back out (see step 3 above).
  3. See step 4 above to confirm that your pools are operating in classic communication mode.
  4. Optionally revert your networking configuration.
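
The confirmation step in both procedures compares the two mode properties on the pool. As a rough sketch, a helper like the following could classify a pool from the JSON returned by the Get Pool API. The function name and the status strings are hypothetical; only the targetNodeCommunicationMode and currentNodeCommunicationMode property names come from this article.

```python
def communication_mode_status(pool: dict, desired: str = "simplified") -> str:
    """Classify a pool by how far it has moved toward the desired mode."""
    target = pool.get("targetNodeCommunicationMode")
    current = pool.get("currentNodeCommunicationMode")
    if current == desired:
        # The pool is already running in the desired mode.
        return "active"
    if target == desired:
        # The mode was requested but isn't in effect yet, for example
        # because the pool hasn't been resized to zero and scaled back out.
        return "pending"
    return "not requested"


pool = {
    "id": "pool-simplified",
    "targetNodeCommunicationMode": "simplified",
    "currentNodeCommunicationMode": "classic",
}
print(communication_mode_status(pool))
```

Pass desired="classic" to run the same check when you switch back to classic mode.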

Specifying the node communication mode on a Batch pool

Below are examples of how to create a Batch pool with simplified compute node communication.

Tip

Specifying the target node communication mode indicates a preference to the Batch service; it doesn't guarantee that the mode will be honored. Certain pool configurations can prevent the Batch service from honoring the specified target node communication mode, such as the interaction between no public IP addresses, virtual networks, and the pool configuration type.

Azure portal

Navigate to the Pools blade of your Batch account and click the Add button. Under OPTIONAL SETTINGS, you can select Simplified as an option from the pull-down of Node communication mode as shown below.

Screenshot that shows creating a pool with simplified mode.

To update an existing pool to simplified communication mode, navigate to the Pools blade of your Batch account and click the pool that you want to update. On the left-side navigation, select Node communication mode. There you can select a new target node communication mode as shown below. After selecting the appropriate communication mode, click the Save button to update. For the change to take effect, scale the pool down to zero nodes and then back out. The new mode is applied only if the pool configuration allows it.

Screenshot that shows updating a pool to simplified mode.

To display the current node communication mode for a pool, navigate to the Pools blade of your Batch account, and click on the pool to view. Select Properties on the left-side navigation and the pool node communication mode will be shown under the General section.

Screenshot that shows properties with a pool with simplified mode.

REST API

This example shows how to use the Batch Service REST API to create a pool with simplified compute node communication.

POST {batchURL}/pools?api-version=2022-10-01.16.0
client-request-id: 00000000-0000-0000-0000-000000000000

Request body

"pool": {
     "id": "pool-simplified",
     "vmSize": "standard_d2s_v3",
     "virtualMachineConfiguration": {
          "imageReference": {
               "publisher": "Canonical",
               "offer": "0001-com-ubuntu-server-jammy",
               "sku": "22_04-lts"
          },
          "nodeAgentSKUId": "batch.node.ubuntu 22.04"
     },
     "resizeTimeout": "PT15M",
     "targetDedicatedNodes": 2,
     "targetLowPriorityNodes": 0,
     "taskSlotsPerNode": 1,
     "taskSchedulingPolicy": {
          "nodeFillType": "spread"
     },
     "enableAutoScale": false,
     "enableInterNodeCommunication": false,
     "targetNodeCommunicationMode": "simplified"
}
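
The same request could be prepared from Python with only the standard library, as in the following sketch. It builds the request without sending it. Several details are assumptions rather than statements from this article: the pool properties are sent as the top-level JSON object, the content type is application/json; odata=minimalmetadata, and the account URL and bearer token are placeholders.

```python
import json
import urllib.request

API_VERSION = "2022-10-01.16.0"


def build_create_pool_request(batch_url: str, pool: dict, auth_header: str):
    """Build (but don't send) a POST request that creates a Batch pool."""
    url = f"{batch_url.rstrip('/')}/pools?api-version={API_VERSION}"
    body = json.dumps(pool).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumed content type for the Batch service REST API.
            "Content-Type": "application/json; odata=minimalmetadata",
            "Authorization": auth_header,
        },
    )


# Abbreviated pool definition; see the request body above for more properties.
pool_definition = {
    "id": "pool-simplified",
    "vmSize": "standard_d2s_v3",
    "targetDedicatedNodes": 2,
    "targetNodeCommunicationMode": "simplified",
}

request = build_create_pool_request(
    "https://example-account.eastus.batch.azure.com",  # placeholder URL
    pool_definition,
    "Bearer <token>",  # placeholder credential
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen with a valid credential and a complete pool definition.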

Limitations

The following are known limitations of the simplified communication mode:

  • Limited migration support for previously created pools without public IP addresses (V1 preview). These pools can be migrated only if they were created in a virtual network; otherwise, they won't use simplified compute node communication, even if it's specified on the pool. For more information, see the migration guide.
  • Cloud Service Configuration pools are deprecated and currently aren't supported for simplified compute node communication. A communication mode specified for these types of pools isn't honored and always results in classic communication mode. We recommend using Virtual Machine Configuration for your Batch pools. For more information, see Migrate Batch pool configuration from Cloud Services to Virtual Machine.

Next steps