Reliability in Virtual Machine Scale Sets
This article describes availability zone support for Virtual Machine Scale Sets.
Note
Virtual Machine Scale Sets can only be deployed into one region. If you want to deploy VMs across multiple regions, see Virtual Machines-Disaster recovery: cross-region failover.
Availability zone support
Azure availability zones are at least three physically separate groups of datacenters within each Azure region. Datacenters within each zone are equipped with independent power, cooling, and networking infrastructure. Availability zones are designed so that if one zone is affected by a local failure, the remaining two zones continue to support regional services, capacity, and high availability.
Failures can range from software and hardware failures to events such as earthquakes, floods, and fires. Tolerance to failures is achieved with redundancy and logical isolation of Azure services. For more detailed information on availability zones in Azure, see Regions and availability zones.
Azure availability zones-enabled services are designed to provide the right level of reliability and flexibility. They can be configured in two ways. They can be either zone redundant, with automatic replication across zones, or zonal, with instances pinned to a specific zone. You can also combine these approaches. For more information on zonal vs. zone-redundant architecture, see Recommendations for using availability zones and regions.
With Azure Virtual Machine Scale Sets, you can create and manage a group of load balanced VMs. The number of VMs can automatically increase or decrease in response to demand or a defined schedule. Scale sets provide high availability to your applications, and allow you to centrally manage, configure, and update many VMs. There's no cost for the scale set itself. You only pay for each VM instance that you create.
Virtual Machine Scale Sets supports both zonal and zone-redundant deployments within a region:
Zonal deployment. When you create a scale set in a single zone, you control which zone all the VMs of that set run in. The scale set is managed and autoscales only within that zone.
Zone-redundant deployment. A zone-redundant scale set lets you create a single scale set that spans multiple zones. By default, as VMs are created, they're evenly balanced across zones.
Prerequisites
To use availability zones, your scale set must be created in a supported Azure region.
All VMs - even single instance VMs - should be deployed into a scale set using flexible orchestration mode to future-proof your application for scaling and availability.
SLA
Because availability zones are physically separate and provide distinct power sources, network, and cooling, service-level agreements (SLAs) are increased. For more information, see the SLA for Microsoft Online Services.
Create a Virtual Machine Scale Set with availability zones enabled
You can create a scale set that uses availability zones with one of the following methods:
The process to create a scale set that uses a zonal deployment is the same as detailed in the getting started article. When you select a supported Azure region, you can create a scale set in one or more available zones, as shown in the following example:
The scale set and supporting resources, such as the Azure load balancer and public IP address, are created in the single zone that you specify.
Zonal failover support
In Azure regions with no zones, Virtual Machine Scale Sets are created with five fault domains by default. In regions that support availability zone deployment of Virtual Machine Scale Sets, when this option is selected, the default fault domain count is 1 for each zone. In this case, FD=1 implies that the VM instances belonging to the scale set are spread across many racks on a best-effort basis. For more information, see Choosing the right number of fault domains for Virtual Machine Scale Set.
Low-latency design
It's recommended that you configure Virtual Machine Scale Sets with zone redundancy. However, if your application has strict low-latency requirements, you may need to implement a zonal deployment for your scale set VMs. With a zonal deployment, it's recommended that you create multiple scale sets across more than one zone. For example, you can create one scale set that's pinned to zone 1 and another that's pinned to zone 2 or 3. You also need to use a load balancer or other application logic to direct traffic to the appropriate scale sets during a zone outage.
Important
If you opt out of zone-aware deployment, you forgo protection from isolation of underlying faults. Opting out of availability zone configuration forces reliance on resources that don't obey zone placement and separation (including underlying dependencies of these resources). These resources shouldn't be expected to survive zone-down scenarios. Solutions that use such resources should define a disaster recovery strategy and configure recovery of the solution in another region.
Safe deployment techniques
To have more control over where you deploy your VMs, you should deploy zonal, instead of regional, scale set VMs. However, zonal VMs only provide zone isolation and not zone redundancy. To achieve full zone-redundancy with zonal VMs, there should be two or more VMs across different zones.
It's also recommended that you use the max spreading deployment option for your zone-redundant VMs. For more information, see the spreading options.
Spreading options
When you deploy a scale set into one or more availability zones, you have the following spreading options (as of API version 2017-12-01):
Max spreading (platformFaultDomainCount = 1). Max spreading is the recommended deployment option, as it provides the best spreading in most cases. If you spread replicas across distinct hardware isolation units, it's recommended that you spread across availability zones and utilize max spreading within each zone.
With max spreading, the scale set spreads your VMs across as many fault domains as possible within each zone. This spreading could be across more or fewer than five fault domains per zone.
Note
With max spreading, regardless of how many fault domains the VMs are spread across, you can only see one fault domain in both the scale set VM instance view and the instance metadata. The spreading within each zone is implicit.
Static fixed spreading (platformFaultDomainCount = 5). With static fixed spreading, the scale set spreads your VMs exactly across five fault domains per zone. If the scale set can't find five distinct fault domains per zone to satisfy the allocation request, the request fails.
Spreading aligned with managed disks fault domains (platformFaultDomainCount = 2 or 3). You can consider aligning the number of scale set fault domains with the number of managed disks fault domains. This alignment can help prevent loss of quorum if an entire managed disks fault domain goes down. The fault domain count can be set to less than or equal to the number of managed disks fault domains available in each region. To learn about the number of managed disks fault domains by region, see [insert doc here](link here).
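As a rough illustration, the three spreading options above can be read as a lookup on platformFaultDomainCount. The option labels and the function name below are hypothetical, chosen only for this sketch; the property name and the values 1, 5, 2, and 3 are the ones listed above.

```python
# Hypothetical labels for the spreading options described in this article,
# mapped to the platformFaultDomainCount values that select them.
SPREADING_OPTIONS = {
    "max spreading": {1},
    "static fixed spreading": {5},
    "aligned with managed disks fault domains": {2, 3},
}


def spreading_option(platform_fault_domain_count: int) -> str:
    """Return the label of the spreading option for a fault domain count."""
    for label, counts in SPREADING_OPTIONS.items():
        if platform_fault_domain_count in counts:
            return label
    # Assumption of this sketch: any other value is simply "not described here".
    raise ValueError(
        f"platformFaultDomainCount={platform_fault_domain_count} "
        "is not one of the options described in this article"
    )
```

For example, `spreading_option(1)` returns the label for max spreading, which is the option this article recommends in most cases.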
Zone balancing
For scale sets deployed across multiple zones (zone-redundant), you can choose either best effort zone balance or strict zone balance. A scale set is considered "balanced" if each zone has the same number of VMs (plus or minus one VM) as all other zones in the scale set. For example:
Scale Set | VMs in Zone 1 | VMs in Zone 2 | VMs in Zone 3 | Zone Balancing |
---|---|---|---|---|
Balanced scale set | 2 | 3 | 3 | This scale set is considered balanced. There's only one zone with a different VM count and it's only 1 less than the other zones. |
Unbalanced scale set | 1 | 3 | 3 | This scale set is considered unbalanced. Zone 1 has 2 fewer VMs than zones 2 and 3. |
It's possible that VMs in the scale set are successfully created, but extensions on those VMs fail to deploy. The VMs with extension failures are still counted when determining if a scale set is balanced. For instance, a scale set with 3 VMs in zone 1, 3 VMs in zone 2, and 3 VMs in zone 3 is considered balanced even if all extensions failed in zone 1 and all extensions succeeded in zones 2 and 3.
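The balance rule above can be sketched as a small check. This is a minimal sketch of the definition as stated in this article (every zone within one VM of every other zone), not code taken from the platform; the function name is hypothetical.

```python
def is_balanced(vm_counts_per_zone):
    """Return True if all zones are within one VM of each other.

    Every VM that was created counts, including VMs whose extensions
    failed to deploy, as described above.
    """
    counts = list(vm_counts_per_zone)
    return max(counts) - min(counts) <= 1


# The two rows of the table above, plus the extension-failure example.
print(is_balanced([2, 3, 3]))  # balanced row
print(is_balanced([1, 3, 3]))  # unbalanced row
print(is_balanced([3, 3, 3]))  # balanced even if zone 1 extensions failed
```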
With best-effort zone balance, the scale set attempts to scale in and out while maintaining balance. However, if balancing isn't possible (for example, if one zone goes down and the scale set can't create a new VM in that zone), the scale set allows temporary imbalance to successfully scale in or out. On subsequent scale-out attempts, the scale set adds VMs to zones that need more VMs for the scale set to be balanced. Similarly, on subsequent scale-in attempts, the scale set removes VMs from zones that need fewer VMs for the scale set to be balanced. With strict zone balance, the scale set fails any attempt to scale in or out that would cause imbalance.
To use best-effort zone balance, set zoneBalance to false. This setting is the default in API version 2017-12-01. To use strict zone balance, set zoneBalance to true.
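To make the difference between the two modes concrete, here is a toy model of a single scale-out step. It is an assumption-laden sketch, not the platform's actual placement algorithm: the function name, the zone labels, and the "pick the emptiest available zone" rule are all hypothetical. Only the observable behavior (best effort tolerates temporary imbalance, strict fails the operation) follows the description above.

```python
def scale_out(vm_counts, available_zones, strict):
    """Add one VM and return the new per-zone counts.

    vm_counts: dict of zone label -> current VM count.
    available_zones: zones where a VM can currently be created.
    strict: models zoneBalance=true; False models best effort.
    """
    candidates = [zone for zone in vm_counts if zone in available_zones]
    if not candidates:
        raise RuntimeError("no zone is available for scale-out")
    # Toy rule: place the VM in the available zone with the fewest VMs.
    target = min(candidates, key=lambda zone: vm_counts[zone])
    new_counts = dict(vm_counts)
    new_counts[target] += 1
    unbalanced = max(new_counts.values()) - min(new_counts.values()) > 1
    if strict and unbalanced:
        raise RuntimeError("strict zone balance: scale-out would cause imbalance")
    return new_counts


# Zone "3" is down, so only zones "1" and "2" can take new VMs.
counts = {"1": 4, "2": 4, "3": 3}
best_effort = scale_out(counts, {"1", "2"}, strict=False)  # imbalance tolerated
recovered = scale_out(best_effort, {"1", "2", "3"}, strict=False)  # zone "3" catches up
```

In this model, the same first call with `strict=True` raises instead of creating the VM, which mirrors how strict zone balance fails the scale operation.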
Migrate to availability zone support
To learn how to redeploy a regional scale set to availability zone support, see Migrate Virtual Machines and Virtual Machine Scale Sets to availability zone support.
Additional guidance
Placement groups
Important
Placement groups only apply to Virtual Machine Scale Sets running in Uniform orchestration mode.
When you deploy a Virtual Machine Scale Set, you have the option to deploy with a single or multiple placement groups per availability zone. For regional scale sets, the choice is to have a single placement group in the region or to have multiple placement groups in the region. If the scale set property singlePlacementGroup is set to false, the scale set can be composed of multiple placement groups and has a range of 0-1000 VMs. When set to the default value of true, the scale set is composed of a single placement group and has a range of 0-100 VMs. For most workloads, we recommend multiple placement groups, which allows for greater scale. In API version 2017-12-01, scale sets default to multiple placement groups for single-zone and cross-zone scale sets, but they default to single placement group for regional scale sets.
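The relationship between singlePlacementGroup, capacity, and the API version 2017-12-01 defaults can be summarized in a short sketch. The function names are hypothetical and the code only restates the limits and defaults given above; it doesn't call any Azure API.

```python
def vm_capacity_range(single_placement_group: bool):
    """Return the (min, max) VM count for a scale set, per the limits above."""
    return (0, 100) if single_placement_group else (0, 1000)


def default_single_placement_group(zones) -> bool:
    """Default stated above for API version 2017-12-01.

    Regional scale sets (no zones) default to a single placement group;
    single-zone and cross-zone scale sets default to multiple groups.
    """
    return len(zones) == 0


# A cross-zone scale set defaults to multiple placement groups.
zones = ["1", "2", "3"]
print(vm_capacity_range(default_single_placement_group(zones)))
```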