How to run tasks on Supercomputers using the REST API

This article walks you through the end-to-end process of preparing Azure Discovery Supercomputer infrastructure and running compute tasks on it by using the REST API. You learn how to create a Supercomputer, add GPU-accelerated node pools, monitor provisioning, scale your compute, and clean up resources when you're finished.

Azure Discovery Supercomputers provide dedicated, cloud-hosted high-performance computing (HPC) infrastructure for running workloads such as AI model training, scientific simulations, and large-scale data processing. Node pools are the compute building blocks that run your tasks — each pool defines the VM size, scaling limits, and priority of the underlying virtual machine scale set.

Prerequisites

Before you begin, make sure you have the following:

An Azure subscription. If you don't have one, create a free account.
A resource group in a supported region.
Azure CLI installed, or another tool to make authenticated REST calls (such as curl or Postman).
The following user-assigned managed identities created in your subscription:
- Cluster identity — used by the Supercomputer control plane.
- Kubelet identity — used at the node level to access Azure resources. Must have the ManagedIdentityOperator role on the cluster identity.
A virtual network with:
- A system subnet for the Supercomputer's managed system node pool.
- A management subnet delegated to Microsoft.ContainerService/managedClusters for the AKS API server.
- One or more node pool subnets for your compute node pools (these must have connectivity to the system subnet).
Sufficient GPU quota in your subscription for the VM sizes you plan to use (for example, Standard_NC24ads_A100_v4 requires NCads A100 v4-series quota).

Supported regions

Supercomputers are currently available in the following Azure regions:

East US
UK South
West Europe

API version

All examples in this article use API version 2026-02-01-preview.

Authentication

All requests require a Microsoft Entra ID bearer token with the user_impersonation scope. To acquire a token using Azure CLI:

az account get-access-token --resource https://management.azure.com

Include the token in the Authorization header of every request:

Authorization: Bearer <access-token>

Step 1: Create a Supercomputer

A Supercomputer is the top-level resource that provides the managed AKS-backed cluster. You must create it before you can add node pools for your tasks.

Request

PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}?api-version=2026-02-01-preview
Content-Type: application/json
Authorization: Bearer <access-token>

Request body

{
  "location": "eastus",
  "tags": {
    "environment": "production",
    "project": "molecular-simulation"
  },
  "properties": {
    "subnetId": "/subscriptions/{subscriptionId}/resourceGroups/{networkRG}/providers/Microsoft.Network/virtualNetworks/{vnetName}/subnets/{systemSubnet}",
    "managementSubnetId": "/subscriptions/{subscriptionId}/resourceGroups/{networkRG}/providers/Microsoft.Network/virtualNetworks/{vnetName}/subnets/{mgmtSubnet}",
    "outboundType": "LoadBalancer",
    "systemSku": "Standard_D4s_v6",
    "identities": {
      "clusterIdentity": {
        "id": "/subscriptions/{subscriptionId}/resourceGroups/{identityRG}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{clusterIdentityName}"
      },
      "kubeletIdentity": {
        "id": "/subscriptions/{subscriptionId}/resourceGroups/{identityRG}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{kubeletIdentityName}"
      }
    }
  }
}

Key properties

Property	Required	Description
`location`	Yes	The Azure region for the Supercomputer.
`properties.subnetId`	Yes	System subnet for the managed node pool.
`properties.managementSubnetId`	No	Subnet for the AKS API server, delegated to `Microsoft.ContainerService/managedClusters`.
`properties.outboundType`	No	Network egress: `LoadBalancer` (default) or `None`.
`properties.systemSku`	No	VM SKU for system nodes: `Standard_D4s_v6` (default), `Standard_D4s_v5`, or `Standard_D4s_v4`.
`properties.identities.clusterIdentity.id`	Yes	ARM resource ID of the cluster managed identity.
`properties.identities.kubeletIdentity.id`	Yes	ARM resource ID of the kubelet managed identity.

Response

201 Created — The Supercomputer is being provisioned. The response includes Azure-AsyncOperation and Retry-After headers.
200 OK — The Supercomputer already exists and was updated.

Azure CLI equivalent

az rest --method PUT \
  --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}?api-version=2026-02-01-preview" \
  --body @supercomputer-create.json

Step 2: Wait for Supercomputer provisioning

Supercomputer creation is a long-running operation. Poll the resource until provisioningState reaches a terminal state.

while true; do
  STATE=$(az rest --method GET \
    --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}?api-version=2026-02-01-preview" \
    --query "properties.provisioningState" -o tsv)
  echo "Provisioning state: ${STATE}"
  if [ "${STATE}" = "Succeeded" ] || [ "${STATE}" = "Failed" ] || [ "${STATE}" = "Canceled" ]; then
    break
  fi
  sleep 30
done

Provisioning state	Meaning
`Accepted`	The request has been accepted.
`Provisioning`	Infrastructure is being created.
`Succeeded`	The Supercomputer is ready.
`Failed`	Provisioning failed — check error details.
`Canceled`	The operation was canceled.

Important

Do not create node pools until the Supercomputer reaches the Succeeded state.

Step 3: Create a node pool for your tasks

Node pools define the compute capacity for running your tasks. Each node pool specifies the VM size (including GPU options), scaling limits, and priority.

Request

PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}?api-version=2026-02-01-preview
Content-Type: application/json
Authorization: Bearer <access-token>

URI parameters

Parameter	Type	Description
`subscriptionId`	string (UUID)	The ID of the target subscription.
`resourceGroupName`	string	The name of the resource group (1–90 characters, case-insensitive).
`supercomputerName`	string	Name of the parent Supercomputer. Must match `^[a-zA-Z0-9-]{3,24}$`.
`nodePoolName`	string	Name of the node pool. Must match `^[a-zA-Z0-9-]{3,24}$`.

Request body

{
  "location": "eastus",
  "tags": {
    "workload": "ai-training",
    "gpu": "A100"
  },
  "properties": {
    "subnetId": "/subscriptions/{subscriptionId}/resourceGroups/{networkRG}/providers/Microsoft.Network/virtualNetworks/{vnetName}/subnets/{nodePoolSubnet}",
    "vmSize": "Standard_NC24ads_A100_v4",
    "minNodeCount": 0,
    "maxNodeCount": 4,
    "scaleSetPriority": "Regular"
  }
}

Node pool properties

Property	Required	Default	Description
`location`	Yes	—	Must match the Supercomputer's region.
`properties.vmSize`	Yes	—	The VM size for compute nodes.
`properties.maxNodeCount`	Yes	—	Maximum number of nodes the pool can scale to (minimum: 1).
`properties.minNodeCount`	No	`0`	Minimum number of nodes. Set to `0` for scale-to-zero behavior.
`properties.subnetId`	No	—	The subnet for this node pool. Must have connectivity to the system subnet.
`properties.scaleSetPriority`	No	`Regular`	`Regular` for on-demand VMs or `Spot` for cost-optimized, interruptible VMs.

Response

201 Created — The node pool is being provisioned. Includes Azure-AsyncOperation and Retry-After headers.
200 OK — The node pool already exists and was updated.

Example response (201 Created)

{
  "id": "/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}",
  "name": "{nodePoolName}",
  "type": "Microsoft.Discovery/supercomputers/nodePools",
  "location": "eastus",
  "tags": {
    "workload": "ai-training",
    "gpu": "A100"
  },
  "systemData": {
    "createdBy": "user@contoso.com",
    "createdByType": "User",
    "createdAt": "2026-05-01T10:00:00Z",
    "lastModifiedBy": "user@contoso.com",
    "lastModifiedByType": "User",
    "lastModifiedAt": "2026-05-01T10:00:00Z"
  },
  "properties": {
    "provisioningState": "Accepted",
    "subnetId": "/subscriptions/.../subnets/nodePoolSubnet",
    "vmSize": "Standard_NC24ads_A100_v4",
    "maxNodeCount": 4,
    "minNodeCount": 0,
    "scaleSetPriority": "Regular"
  }
}

Azure CLI equivalent

az rest --method PUT \
  --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}?api-version=2026-02-01-preview" \
  --body @nodepool-create.json

Step 4: Wait for node pool provisioning

Poll the node pool resource until it reaches a terminal state, the same way you polled the Supercomputer.

while true; do
  STATE=$(az rest --method GET \
    --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}?api-version=2026-02-01-preview" \
    --query "properties.provisioningState" -o tsv)
  echo "Node pool state: ${STATE}"
  if [ "${STATE}" = "Succeeded" ] || [ "${STATE}" = "Failed" ] || [ "${STATE}" = "Canceled" ]; then
    break
  fi
  sleep 30
done

When the node pool reaches Succeeded, it's ready to accept tasks.

Step 5: Add workload identities (optional)

If your tasks need to access Azure resources (such as Storage accounts or Key Vaults), add workload identities to the Supercomputer. These identities are available to workloads running on the node pools as federated credentials.

Request

PATCH https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}?api-version=2026-02-01-preview
Content-Type: application/json
Authorization: Bearer <access-token>

Request body

{
  "properties": {
    "identities": {
      "workloadIdentities": {
        "/subscriptions/{subscriptionId}/resourceGroups/{identityRG}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{workloadIdentityName}": {}
      }
    }
  }
}

Response

200 OK — The update completed synchronously.
202 Accepted — The update is in progress. Poll using the Location header.

After the update completes, the workload identity's principalId and clientId are populated in the response:

{
  "properties": {
    "identities": {
      "workloadIdentities": {
        "/subscriptions/.../userAssignedIdentities/workloadIdentityName": {
          "principalId": "00000000-0000-0000-0000-000000000001",
          "clientId": "00000000-0000-0000-0000-000000000002"
        }
      }
    }
  }
}

Step 6: Scale your node pool

You can update a node pool to adjust scaling limits — for example, to increase capacity before a large job or scale to zero after tasks complete.

Request

PATCH https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}?api-version=2026-02-01-preview
Content-Type: application/json
Authorization: Bearer <access-token>

Scale up for a large task

{
  "properties": {
    "maxNodeCount": 16
  }
}

Scale to zero after tasks complete

{
  "properties": {
    "minNodeCount": 0,
    "maxNodeCount": 0
  }
}

Azure CLI equivalent

az rest --method PATCH \
  --url "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}?api-version=2026-02-01-preview" \
  --body '{"properties": {"maxNodeCount": 16}}'

Step 7: Monitor your resources

Get Supercomputer status

GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}?api-version=2026-02-01-preview
Authorization: Bearer <access-token>

List node pools on a Supercomputer

GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools?api-version=2026-02-01-preview
Authorization: Bearer <access-token>

List all Supercomputers in a resource group

GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers?api-version=2026-02-01-preview
Authorization: Bearer <access-token>

List all Supercomputers in a subscription

GET https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.Discovery/supercomputers?api-version=2026-02-01-preview
Authorization: Bearer <access-token>

Tip

List responses are paginated. If the response includes a nextLink property, make a GET request to that URL to retrieve the next page. Continue until nextLink is null.

Step 8: Clean up resources

When your tasks are finished, delete the node pools first and then the Supercomputer.

Delete a node pool

DELETE https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}/nodePools/{nodePoolName}?api-version=2026-02-01-preview
Authorization: Bearer <access-token>

Response: 202 Accepted (deletion in progress) or 204 No Content (already deleted).

Delete the Supercomputer

DELETE https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Discovery/supercomputers/{supercomputerName}?api-version=2026-02-01-preview
Authorization: Bearer <access-token>

Response: 202 Accepted (deletion in progress) or 204 No Content (already deleted).

Warning

Deleting a Supercomputer removes all associated managed resources, including node pools and the managed resource group. This action cannot be undone. Ensure no active tasks are running before deletion.

End-to-end example

This script walks through the full lifecycle: create a Supercomputer, add a GPU node pool, verify readiness, run a hypothetical task, then clean up.

Set variables

SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP="rg-discovery-prod"
SC_NAME="sc-ml-eastus"
NODEPOOL_NAME="gpu-a100"
API_VERSION="2026-02-01-preview"
SC_URL="https://management.azure.com/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Discovery/supercomputers/${SC_NAME}"
NP_URL="${SC_URL}/nodePools/${NODEPOOL_NAME}"

Create the Supercomputer

az rest --method PUT \
  --url "${SC_URL}?api-version=${API_VERSION}" \
  --body '{
    "location": "eastus",
    "properties": {
      "subnetId": "/subscriptions/'"${SUBSCRIPTION_ID}"'/resourceGroups/rg-network/providers/Microsoft.Network/virtualNetworks/vnet-discovery/subnets/snet-system",
      "managementSubnetId": "/subscriptions/'"${SUBSCRIPTION_ID}"'/resourceGroups/rg-network/providers/Microsoft.Network/virtualNetworks/vnet-discovery/subnets/snet-management",
      "systemSku": "Standard_D4s_v6",
      "identities": {
        "clusterIdentity": {
          "id": "/subscriptions/'"${SUBSCRIPTION_ID}"'/resourceGroups/rg-identity/providers/Microsoft.ManagedIdentity/userAssignedIdentities/id-sc-cluster"
        },
        "kubeletIdentity": {
          "id": "/subscriptions/'"${SUBSCRIPTION_ID}"'/resourceGroups/rg-identity/providers/Microsoft.ManagedIdentity/userAssignedIdentities/id-sc-kubelet"
        }
      }
    },
    "tags": { "environment": "production" }
  }'

Wait for Supercomputer to be ready

while true; do
  STATE=$(az rest --method GET --url "${SC_URL}?api-version=${API_VERSION}" \
    --query "properties.provisioningState" -o tsv)
  echo "Supercomputer state: ${STATE}"
  [ "${STATE}" = "Succeeded" ] || [ "${STATE}" = "Failed" ] || [ "${STATE}" = "Canceled" ] && break
  sleep 30
done

Create a GPU node pool

az rest --method PUT \
  --url "${NP_URL}?api-version=${API_VERSION}" \
  --body '{
    "location": "eastus",
    "properties": {
      "subnetId": "/subscriptions/'"${SUBSCRIPTION_ID}"'/resourceGroups/rg-network/providers/Microsoft.Network/virtualNetworks/vnet-discovery/subnets/snet-gpu",
      "vmSize": "Standard_NC24ads_A100_v4",
      "minNodeCount": 0,
      "maxNodeCount": 4,
      "scaleSetPriority": "Regular"
    },
    "tags": { "workload": "ai-training" }
  }'

Wait for node pool to be ready

while true; do
  STATE=$(az rest --method GET --url "${NP_URL}?api-version=${API_VERSION}" \
    --query "properties.provisioningState" -o tsv)
  echo "Node pool state: ${STATE}"
  [ "${STATE}" = "Succeeded" ] || [ "${STATE}" = "Failed" ] || [ "${STATE}" = "Canceled" ] && break
  sleep 30
done

Verify your infrastructure

# Check Supercomputer details
az rest --method GET --url "${SC_URL}?api-version=${API_VERSION}" \
  --query "{name:name, state:properties.provisioningState, sku:properties.systemSku}"

# List node pools
az rest --method GET --url "${SC_URL}/nodePools?api-version=${API_VERSION}" \
  --query "value[].{name:name, vmSize:properties.vmSize, min:properties.minNodeCount, max:properties.maxNodeCount, state:properties.provisioningState}"

Clean up when tasks are done

# Delete node pool first
az rest --method DELETE --url "${NP_URL}?api-version=${API_VERSION}"
sleep 60

# Then delete the Supercomputer
az rest --method DELETE --url "${SC_URL}?api-version=${API_VERSION}"

Error handling

All API operations return standard Azure Resource Manager error responses on failure:

{
  "error": {
    "code": "ResourceNotFound",
    "message": "The Resource 'Microsoft.Discovery/supercomputers/my-sc' under resource group 'my-rg' was not found.",
    "target": "supercomputerName",
    "details": [],
    "additionalInfo": []
  }
}

Common error codes

HTTP status	Error code	Description
400	`InvalidRequestContent`	The request body is malformed or missing required properties.
404	`ResourceNotFound`	The specified resource does not exist.
409	`Conflict`	The resource is in a state that conflicts with the request (for example, creating a node pool on a Supercomputer that is still provisioning).
429	`TooManyRequests`	Throttled. Retry after the interval in the `Retry-After` header.

Feedback

Was this page helpful?

Last updated on 2026-05-01

How to run tasks on Supercomputers using the REST API

Prerequisites

Supported regions

API version

Authentication

Step 1: Create a Supercomputer

Request

Request body

Key properties

Response

Azure CLI equivalent

Step 2: Wait for Supercomputer provisioning

Step 3: Create a node pool for your tasks

Request

URI parameters

Request body

Node pool properties

Response

Example response (201 Created)

Azure CLI equivalent

Step 4: Wait for node pool provisioning

Step 5: Add workload identities (optional)

Request

Request body

Response

Step 6: Scale your node pool

Request

Scale up for a large task

Scale to zero after tasks complete

Azure CLI equivalent

Step 7: Monitor your resources

Get Supercomputer status

List node pools on a Supercomputer

List all Supercomputers in a resource group

List all Supercomputers in a subscription

Step 8: Clean up resources

Delete a node pool

Delete the Supercomputer

End-to-end example

Set variables

Create the Supercomputer

Wait for Supercomputer to be ready

Create a GPU node pool

Wait for node pool to be ready

Verify your infrastructure

Clean up when tasks are done

Error handling

Common error codes

Related content

Feedback

Additional resources