Azure VMware Solution private cloud and cluster concepts

Azure VMware Solution delivers VMware-based private clouds in Azure. The private cloud hardware and software deployments are fully integrated and automated in Azure. You deploy and manage the private cloud through the Azure portal, CLI, or PowerShell.

A private cloud includes clusters with:

  • Dedicated bare-metal server hosts provisioned with VMware ESXi hypervisor
  • VMware vCenter Server for managing ESXi and vSAN
  • VMware NSX-T Data Center software-defined networking for vSphere workload VMs
  • VMware vSAN datastore for vSphere workload VMs
  • VMware HCX for workload mobility
  • Resources in the Azure underlay (required for connectivity and to operate the private cloud)

As with other resources, private clouds are installed and managed from within an Azure subscription. The number of private clouds within a subscription is scalable. Initially, there's a limit of one private cloud per subscription. There's a logical relationship between Azure subscriptions, Azure VMware Solution private clouds, vSAN clusters, and hosts.

The diagram shows a single Azure subscription with two private clouds that represent a development and production environment. In each of those private clouds are two clusters.

Hosts

Azure VMware Solution clusters are based on hyper-converged infrastructure. The following table shows the CPU, memory, disk, and network specifications of each host type.

| Host Type | CPU (GHz) | RAM (GB) | vSAN Cache Tier (TB, raw) | vSAN Capacity Tier (TB, raw) | Network Interface Cards | Regional availability |
|---|---|---|---|---|---|---|
| AV36 | Dual Intel Xeon Gold 6140 CPUs with 18 cores/CPU @ 2.3 GHz, Total 36 physical cores (72 logical cores with hyperthreading) | 576 | 3.2 (NVMe) | 15.20 (SSD) | 4x 25 Gb/s NICs (2 for management & control plane, 2 for customer traffic) | All product regions |
| AV36P | Dual Intel Xeon Gold 6240 CPUs with 18 cores/CPU @ 2.6 GHz / 3.9 GHz Turbo, Total 36 physical cores (72 logical cores with hyperthreading) | 768 | 1.5 (Intel Cache) | 19.20 (NVMe) | 4x 25 Gb/s NICs (2 for management & control plane, 2 for customer traffic) | Selected regions (*) |
| AV52 | Dual Intel Xeon Platinum 8270 CPUs with 26 cores/CPU @ 2.7 GHz / 4.0 GHz Turbo, Total 52 physical cores (104 logical cores with hyperthreading) | 1,536 | 1.5 (Intel Cache) | 38.40 (NVMe) | 4x 25 Gb/s NICs (2 for management & control plane, 2 for customer traffic) | Selected regions (*) |

An Azure VMware Solution cluster requires a minimum of three hosts. You can only use hosts of the same type in a single Azure VMware Solution private cloud. Hosts used to build or scale clusters come from an isolated pool of hosts. Those hosts have passed hardware tests and have had all data securely deleted before being added to a cluster.

(*) details available via the Azure pricing calculator.
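
As a rough illustration of how the host table feeds into sizing, the following sketch multiplies the per-host figures by a host count. The `HostType` class, the `HOST_TYPES` dictionary, and the `cluster_raw_totals` function are hypothetical names introduced for this example only. The totals are raw values: they don't account for hypervisor, vSAN, or management overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostType:
    """Per-host figures copied from the host table (illustrative structure)."""
    name: str
    physical_cores: int
    ram_gb: int
    capacity_tb_raw: float  # vSAN capacity tier, raw


# Values transcribed from the host table; the dictionary itself is an assumption.
HOST_TYPES = {
    "AV36": HostType("AV36", 36, 576, 15.20),
    "AV36P": HostType("AV36P", 36, 768, 19.20),
    "AV52": HostType("AV52", 52, 1536, 38.40),
}

MIN_HOSTS_PER_CLUSTER = 3


def cluster_raw_totals(host_type: str, host_count: int) -> dict:
    """Return raw (pre-overhead) totals for a cluster of identical hosts."""
    if host_count < MIN_HOSTS_PER_CLUSTER:
        raise ValueError("a cluster requires a minimum of three hosts")
    host = HOST_TYPES[host_type]
    return {
        "physical_cores": host.physical_cores * host_count,
        "ram_gb": host.ram_gb * host_count,
        "capacity_tb_raw": round(host.capacity_tb_raw * host_count, 2),
    }


# Smallest possible AV36 cluster.
print(cluster_raw_totals("AV36", 3))
```

Because all hosts in a private cloud must be of the same type, a single host type per call is enough for this kind of estimate.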

Clusters

Each private cloud is created with one vSAN cluster by default. You can add, delete, and scale clusters. The initial deployment, and every cluster added later, requires a minimum of three hosts.

You use vSphere and NSX-T Manager to manage most other aspects of cluster configuration or operation. All local storage of each host in a cluster is under the control of vSAN.

The Azure VMware Solution management and control plane has the following resource requirements that need to be accounted for during solution sizing.

| Area | Description | Provisioned vCPUs | Provisioned vRAM (GB) | Provisioned vDisk (GB) | Typical CPU Usage (GHz) | Typical vRAM Usage (GB) | Typical Raw vSAN Datastore Usage (GB) |
|---|---|---|---|---|---|---|---|
| VMware vSphere | vCenter Server | 8 | 24 | 499 | 1.1 | 3.6 | 1,020 |
| VMware vSphere | ESXi node 1 | N/A | N/A | N/A | 9.4 | 0.4 | N/A |
| VMware vSphere | ESXi node 2 | N/A | N/A | N/A | 9.4 | 0.4 | N/A |
| VMware vSphere | ESXi node 3 | N/A | N/A | N/A | 9.4 | 0.4 | N/A |
| VMware NSX-T Data Center | NSX-T Unified Appliance Node 1 | 6 | 24 | 300 | 5.5 | 8.5 | 613 |
| VMware NSX-T Data Center | NSX-T Unified Appliance Node 2 | 6 | 24 | 300 | 5.5 | 8.5 | 613 |
| VMware NSX-T Data Center | NSX-T Unified Appliance Node 3 | 6 | 24 | 300 | 5.5 | 8.5 | 613 |
| VMware NSX-T Data Center | NSX-T Edge VM 1 | 8 | 32 | 200 | 1.3 | 0.6 | 409 |
| VMware NSX-T Data Center | NSX-T Edge VM 2 | 8 | 32 | 200 | 1.3 | 0.6 | 409 |
| VMware HCX (Optional Add-On) | HCX Manager | 4 | 12 | 60 | 1 | 3.2 | 132 |
| VMware Site Recovery Manager (Optional Add-On) | SRM Appliance | 4 | 12 | 20 | 1 | 1 | 53 |
| VMware vSphere (Optional Add-On) | vSphere Replication Appliance | 4 | 8 | 26 | 4.3 | 2.2 | 62 |
| Total | | 54 vCPUs | 192 GB | 1,905 GB | 54.7 GHz | 37.9 GB | 3,924 GB |

These resource requirements only apply to the first cluster deployed in an Azure VMware Solution private cloud. Subsequent clusters only need to account for the ESXi resource requirements in solution sizing.
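
The following sketch shows one way to turn the table into a sizing input. It sums the typical-usage columns for the first cluster, for the first cluster without the optional add-ons, and for a subsequent three-host cluster where only the ESXi rows apply. The `Overhead` tuple, `ROWS` list, and `typical_overhead` function are hypothetical names for this example, and the N/A storage values for ESXi nodes are treated as 0.

```python
from typing import NamedTuple


class Overhead(NamedTuple):
    component: str
    cpu_ghz: float   # typical CPU usage
    vram_gb: float   # typical vRAM usage
    vsan_gb: int     # typical raw vSAN datastore usage (0 where the table says N/A)
    optional: bool = False
    esxi: bool = False


# Typical-usage columns transcribed from the table above.
ROWS = [
    Overhead("vCenter Server", 1.1, 3.6, 1020),
    Overhead("ESXi node 1", 9.4, 0.4, 0, esxi=True),
    Overhead("ESXi node 2", 9.4, 0.4, 0, esxi=True),
    Overhead("ESXi node 3", 9.4, 0.4, 0, esxi=True),
    Overhead("NSX-T Unified Appliance Node 1", 5.5, 8.5, 613),
    Overhead("NSX-T Unified Appliance Node 2", 5.5, 8.5, 613),
    Overhead("NSX-T Unified Appliance Node 3", 5.5, 8.5, 613),
    Overhead("NSX-T Edge VM 1", 1.3, 0.6, 409),
    Overhead("NSX-T Edge VM 2", 1.3, 0.6, 409),
    Overhead("HCX Manager", 1.0, 3.2, 132, optional=True),
    Overhead("SRM Appliance", 1.0, 1.0, 53, optional=True),
    Overhead("vSphere Replication Appliance", 4.3, 2.2, 62, optional=True),
]


def typical_overhead(first_cluster: bool = True, include_optional: bool = True):
    """Return (CPU GHz, vRAM GB, raw vSAN GB) to reserve for management."""
    rows = [
        r for r in ROWS
        if (first_cluster or r.esxi) and (include_optional or not r.optional)
    ]
    return (
        round(sum(r.cpu_ghz for r in rows), 1),
        round(sum(r.vram_gb for r in rows), 1),
        sum(r.vsan_gb for r in rows),
    )


print(typical_overhead())                        # first cluster, all add-ons
print(typical_overhead(include_optional=False))  # first cluster, no add-ons
print(typical_overhead(first_cluster=False))     # subsequent three-host cluster
```

The first call reproduces the Total row of the table (54.7 GHz, 37.9 GB, 3,924 GB), which is a quick check that the rows were transcribed correctly.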

The VMware ESXi node usage values account for the vSphere VMkernel hypervisor overhead, vSAN overhead, and NSX-T distributed router, firewall, and bridging overhead. These are estimates for a standard three-host cluster configuration. The storage requirements are listed as not applicable (N/A) because the boot volume is separate from the vSAN datastore.

VMware HCX and VMware Site Recovery Manager are optional add-ons to the Azure VMware Solution service. Exclude their resource requirements from the solution sizing if you aren't using them.

The VMware Site Recovery Manager Add-On has the option of configuring multiple VMware vSphere Replication Appliances. The table above assumes one vSphere Replication appliance is used.

Sizing an Azure VMware Solution private cloud is an estimate. Validate the sizing calculations from the design phase during the testing phase of a project to confirm that the private cloud is sized correctly for the application workload.

Tip

You can always extend the cluster and add more clusters later if you need to go beyond the initial deployment number.

The following table describes the limits for Azure VMware Solution.

| Resource | Limit |
|---|---|
| vSphere clusters per private cloud | 12 |
| Minimum number of ESXi hosts per cluster | 3 (hard-limit) |
| Maximum number of ESXi hosts per cluster | 16 (hard-limit) |
| Maximum number of ESXi hosts per private cloud | 96 |
| Maximum number of vCenter Servers per private cloud | 1 (hard-limit) |
| Maximum number of HCX site pairings | 25 (any edition) |
| Maximum number of Azure VMware Solution ExpressRoute linked private clouds | 4. The virtual network gateway used determines the actual maximum number of linked private clouds. For more details, see About ExpressRoute virtual network gateways. |
| Maximum Azure VMware Solution ExpressRoute port speed | 10 Gbps. The virtual network gateway used determines the actual bandwidth. For more details, see About ExpressRoute virtual network gateways. |
| Maximum number of Azure Public IPv4 addresses assigned to NSX-T Data Center | 2,000 |
| Maximum number of Azure VMware Solution Interconnects per private cloud | 10 |
| vSAN capacity limits | 75% of total usable (keep 25% available for SLA) |
| VMware Site Recovery Manager - Maximum number of protected Virtual Machines | 3,000 |
| VMware Site Recovery Manager - Maximum number of Virtual Machines per recovery plan | 2,000 |
| VMware Site Recovery Manager - Maximum number of protection groups per recovery plan | 250 |
| VMware Site Recovery Manager - RPO Values | 5 min or higher * (hard-limit) |
| VMware Site Recovery Manager - Maximum number of virtual machines per protection group | 500 |
| VMware Site Recovery Manager - Maximum number of recovery plans | 250 |

* For information about Recovery Point Objective (RPO) lower than 15 minutes, see How the 5 Minute Recovery Point Objective Works in the vSphere Replication Administration guide.

For other VMware-specific limits, use the VMware configuration maximum tool.
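
A planned layout can be checked against the cluster and host limits before deployment. The following is a minimal sketch, assuming the limit values from the table above; the constant names and the `check_private_cloud` and `vsan_consumable_tb` functions are hypothetical and aren't part of any Azure SDK.

```python
# Limits transcribed from the table above.
MAX_CLUSTERS_PER_PRIVATE_CLOUD = 12
MIN_HOSTS_PER_CLUSTER = 3
MAX_HOSTS_PER_CLUSTER = 16
MAX_HOSTS_PER_PRIVATE_CLOUD = 96
VSAN_MAX_USED_FRACTION = 0.75  # keep 25% of usable capacity available for SLA


def check_private_cloud(cluster_sizes: list) -> list:
    """Return a list of limit violations for the given hosts-per-cluster plan."""
    problems = []
    if len(cluster_sizes) > MAX_CLUSTERS_PER_PRIVATE_CLOUD:
        problems.append(f"{len(cluster_sizes)} clusters exceeds the limit of 12")
    for index, hosts in enumerate(cluster_sizes, start=1):
        if not MIN_HOSTS_PER_CLUSTER <= hosts <= MAX_HOSTS_PER_CLUSTER:
            problems.append(f"cluster {index} has {hosts} hosts (allowed: 3 to 16)")
    if sum(cluster_sizes) > MAX_HOSTS_PER_PRIVATE_CLOUD:
        problems.append(f"{sum(cluster_sizes)} hosts exceeds the limit of 96")
    return problems


def vsan_consumable_tb(usable_tb: float) -> float:
    """Usable vSAN capacity that can be consumed while staying within the 75% limit."""
    return usable_tb * VSAN_MAX_USED_FRACTION


print(check_private_cloud([3, 3]))     # a valid two-cluster plan
print(check_private_cloud([16] * 7))   # 112 hosts in total
print(vsan_consumable_tb(100.0))
```

Note that the per-private-cloud host limit (96) is lower than 12 clusters of 16 hosts, so both the per-cluster and the total checks are needed.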

VMware software versions

The VMware solution software versions used in new deployments of Azure VMware Solution private cloud clusters are:

| Software | Version |
|---|---|
| VMware vCenter Server | 7.0 U3c |
| ESXi | 7.0 U3c |
| vSAN | 7.0 U3c |
| vSAN on-disk format | 10 |
| HCX | 4.4.2 |
| VMware NSX-T Data Center | 3.1.2 |

NOTE: VMware NSX-T Data Center is the only supported version of NSX Data Center.

The current running software version is applied to new clusters added to an existing private cloud. For more information, see the VMware software version requirements for HCX and Understanding vSAN on-disk format versions and compatibility.

Host maintenance and lifecycle management

One benefit of Azure VMware Solution private clouds is that the platform is maintained for you. Microsoft is responsible for the lifecycle management of VMware software (ESXi, vCenter Server, and vSAN). Microsoft is also responsible for the lifecycle management of the NSX-T Data Center appliances and for bootstrapping the network configuration, such as creating the Tier-0 gateway and enabling North-South routing. You're responsible for the NSX-T Data Center SDN configuration: network segments, distributed firewall rules, Tier-1 gateways, and load balancers.

Microsoft is responsible for applying any patches, updates, or upgrades to ESXi, vCenter Server, vSAN, and NSX-T Data Center in your private cloud. The impact of patches, updates, and upgrades differs for ESXi, vCenter Server, and NSX-T Data Center.

  • ESXi - There's no impact to workloads running in your private cloud. Access to vCenter Server and NSX-T Data Center isn't blocked during this time. It's recommended that, during this time, you don't plan any other activities in your private cloud, such as scaling up the private cloud, scheduling or initiating active HCX migrations, or making HCX configuration changes.

  • vCenter Server - There's no impact to workloads running in your private cloud. During this time, vCenter Server is unavailable and you can't manage VMs (stop, start, create, or delete). It's recommended that, during this time, you don't plan any other activities in your private cloud, such as scaling up the private cloud or creating new networks. If you use the VMware Site Recovery Manager or vSphere Replication user interfaces, it's recommended that you don't configure vSphere Replication, and don't configure or execute site recovery plans, during the vCenter Server upgrade.

  • NSX-T Data Center - There's workload impact. When a particular host is being upgraded, the VMs on that host might lose connectivity for 2 seconds to a maximum of 1 minute, with any or all of the following symptoms:

    • Ping errors

    • Packet loss

    • Error messages (for example, Destination Host Unreachable and Net unreachable)

    During this upgrade window, all access to the NSX-T Data Center management plane will be blocked. You can't make configuration changes to the NSX-T Data Center environment for the duration. However, your workloads will continue to run as normal, subject to the upgrade impact detailed above.

    It's recommended that, during the upgrade time, you don't plan any other activities in your private cloud, such as scaling up the private cloud. Such activities can prevent the upgrade from starting or can have adverse impacts on the upgrade and the environment.
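
For automation that needs to pause during a maintenance window, the per-component impact described in the list above can be captured in a small lookup table. This is only a sketch: the `UPGRADE_IMPACT` dictionary and both helper functions are hypothetical, and the table records only what the list states explicitly.

```python
# Upgrade impact per component, transcribed from the list above.
# The dictionary layout and key names are illustrative assumptions.
UPGRADE_IMPACT = {
    "ESXi": {"workload_impact": False, "unavailable": set()},
    "vCenter Server": {"workload_impact": False, "unavailable": {"vCenter Server"}},
    "NSX-T Data Center": {
        "workload_impact": True,
        "unavailable": {"NSX-T Data Center management plane"},
    },
}


def can_manage_vms(upgrading: str) -> bool:
    """VM operations (stop, start, create, delete) need vCenter Server."""
    return "vCenter Server" not in UPGRADE_IMPACT[upgrading]["unavailable"]


def expect_connectivity_loss(upgrading: str) -> bool:
    """True when VMs might briefly lose connectivity during the upgrade."""
    return UPGRADE_IMPACT[upgrading]["workload_impact"]


for component in UPGRADE_IMPACT:
    print(component, can_manage_vms(component), expect_connectivity_loss(component))
```

Regardless of the component, the guidance above is to avoid other activities, such as scaling, for the whole upgrade window.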

You'll be notified before patches, updates, or upgrades are applied to your private clouds. We'll also work with you to schedule a maintenance window before applying updates or upgrades to your private cloud.

Software updates include:

  • Patches - Security patches or bug fixes released by VMware

  • Updates - Minor version change of a VMware stack component

  • Upgrades - Major version change of a VMware stack component

Note

Microsoft tests a critical security patch as soon as it becomes available from VMware.

Documented VMware workarounds are implemented in lieu of installing a corresponding patch until the next scheduled updates are deployed.

Host monitoring and remediation

Azure VMware Solution continuously monitors the health of both the underlay and the VMware components. When Azure VMware Solution detects a failure, it takes action to repair the failed components. When Azure VMware Solution detects a degradation or failure on an Azure VMware Solution node, it triggers the host remediation process.

Host remediation involves replacing the faulty node with a new healthy node in the cluster. Then, when possible, the faulty host is placed in VMware vSphere maintenance mode. VMware vMotion moves the VMs off the faulty host to other available servers in the cluster, potentially allowing zero downtime for live migration of workloads. If the faulty host can't be placed in maintenance mode, the host is removed from the cluster.

Azure VMware Solution monitors the following conditions on the host:

  • Processor status
  • Memory status
  • Connection and power state
  • Hardware fan status
  • Network connectivity loss
  • Hardware system board status
  • Errors on the disks of a vSAN host
  • Hardware voltage
  • Hardware temperature status
  • Hardware power status
  • Storage status
  • Connection failure

Note

Azure VMware Solution tenant admins must not edit or delete the VMware vCenter Server alarms defined above, because the Azure VMware Solution control plane manages them on vCenter Server. Azure VMware Solution monitoring uses these alarms to trigger the host remediation process.

Backup and restoration

Private cloud vCenter Server and NSX-T Data Center configurations are on an hourly backup schedule. Backups are kept for three days. If you need to restore from a backup, open a support request in the Azure portal to request restoration.
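
As a worked example of the schedule above, hourly backups kept for three days give 72 restore points at any time. The following sketch, with hypothetical function names, computes that figure and checks whether a requested restore time still falls inside the retention window. It assumes the window is measured back from the current time; the exact backup timestamps aren't specified here.

```python
from datetime import datetime, timedelta, timezone

BACKUP_INTERVAL = timedelta(hours=1)  # hourly backup schedule
RETENTION = timedelta(days=3)         # backups are kept for three days


def restore_points_retained() -> int:
    """Number of hourly backups that fit in the retention window."""
    return RETENTION // BACKUP_INTERVAL


def is_within_retention(requested: datetime, now: datetime) -> bool:
    """True if a backup taken at 'requested' would still be retained at 'now'."""
    return now - RETENTION <= requested <= now


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(restore_points_retained())
print(is_within_retention(now - timedelta(days=2), now))
print(is_within_retention(now - timedelta(days=4), now))
```

A check like this is useful before opening a support request, because a restore point older than three days no longer exists.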

Next steps

Now that you've covered Azure VMware Solution private cloud concepts, you may want to learn about: