Share an Azure managed disk

Applies to: ✔️ Linux VMs ✔️ Windows VMs ✔️ Flexible scale sets ✔️ Uniform scale sets

Azure shared disks is a feature for Azure managed disks that allows you to attach a managed disk to multiple virtual machines (VMs) simultaneously. Attaching a managed disk to multiple VMs allows you to either deploy new or migrate existing clustered applications to Azure.

How it works

VMs in the cluster can read or write to their attached disk based on the reservation chosen by the clustered application using SCSI Persistent Reservations (SCSI PR). SCSI PR is an industry standard used by applications running on Storage Area Network (SAN) on-premises. Enabling SCSI PR on a managed disk allows you to migrate these applications to Azure as-is.

Shared managed disks offer shared block storage that can be accessed from multiple VMs. These disks are exposed as logical unit numbers (LUNs), which are then presented to an initiator (VM) from a target (disk). To the VM, these LUNs look like direct-attached storage (DAS) or a local drive.

Shared managed disks don't natively offer a fully managed file system that can be accessed using SMB/NFS. You need to use a cluster manager, like Windows Server Failover Cluster (WSFC), or Pacemaker, that handles cluster node communication and write locking.

Limitations

General limitations

Shared disks can be enabled only on a subset of disk types. Currently, only ultra disks, premium SSD v2 (preview), premium SSDs, and standard SSDs can enable shared disks. Shared disks can be attached to individual VMSS instances but can't be defined in the VMSS models or automatically deployed.

Each managed disk that has shared disks enabled is also subject to the following limitations, organized by disk type:

Ultra disks

Ultra disks have their own separate list of limitations, unrelated to shared disks. For ultra disk limitations, refer to Using Azure ultra disks.

When sharing ultra disks, they have the following additional limitations:

Premium SSD v2 (preview)

Premium SSD v2 disks have their own separate list of limitations, unrelated to shared disks. For these limitations, see Premium SSD v2 limitations.

When sharing Premium SSD v2 disks, they have the following additional limitation:

Premium SSD

Standard SSDs

Operating system requirements

Shared disks support several operating systems. See the Windows or Linux sections for the supported operating systems.

Billing implications

When you share a disk, your billing could be impacted in two different ways, depending on the type of disk.

For shared premium SSD disks, in addition to the cost of the disk's tier, there's an extra charge that increases with each VM the SSD is mounted to. See managed disks pricing for details.

Ultra disks don't have an extra charge for each VM that they're mounted to. They're billed on the total IOPS and MBps that the disk is configured for. Normally, an ultra disk has two performance throttles that determine its total IOPS/MBps. However, when configured as a shared ultra disk, two more performance throttles are exposed, for a total of four. These two additional throttles allow for increased performance at an extra expense and each meter has a default value, which raises the performance and cost of the disk.

The four performance throttles a shared ultra disk has are diskIOPSReadWrite, diskIOPSReadOnly, diskMBpsReadWrite, and diskMBpsReadOnly. Each performance throttle can be configured to change the performance of your disk. The performance of a shared ultra disk is calculated as follows: total provisioned IOPS is diskIOPSReadWrite + diskIOPSReadOnly, and total provisioned throughput in MBps is diskMBpsReadWrite + diskMBpsReadOnly.

Once you've determined your total provisioned IOPS and total provisioned throughput, you can use them in the pricing calculator to determine the cost of an ultra shared disk.

Disk sizes

For now, only ultra disks, premium SSD v2 (preview), premium SSD, and standard SSDs can enable shared disks. Different disk sizes may have a different maxShares limit, which you can't exceed when setting the maxShares value.

For each disk, you can define a maxShares value that represents the maximum number of nodes that can simultaneously share the disk. For example, if you plan to set up a 2-node failover cluster, you would set maxShares=2. The maximum value is an upper bound. Nodes can join or leave the cluster (mount or unmount the disk) as long as the number of nodes doesn't exceed the specified maxShares value.

Note

The maxShares value can only be set or edited when the disk is detached from all nodes.
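
The rules above can be sketched as a small in-memory model. This is an illustration only: the `SharedDisk` class and its method names are hypothetical and aren't part of any Azure SDK. It assumes that a disk can be mounted by up to maxShares nodes at once and that maxShares can change only while no node is attached.

```python
class SharedDisk:
    """Toy model of the maxShares rules (hypothetical, not an Azure SDK class)."""

    def __init__(self, max_shares=1):
        if max_shares < 1:
            raise ValueError("maxShares must be at least 1")
        self.max_shares = max_shares
        self.attached = set()  # names of the VMs currently mounting the disk

    def attach(self, vm):
        # A new node can mount the disk only while a share is still free.
        if vm not in self.attached and len(self.attached) >= self.max_shares:
            raise RuntimeError(f"disk is already attached to {self.max_shares} VMs")
        self.attached.add(vm)

    def detach(self, vm):
        self.attached.discard(vm)

    def set_max_shares(self, value):
        # maxShares can only be edited when the disk is detached from all nodes.
        if self.attached:
            raise RuntimeError("detach the disk from all VMs before changing maxShares")
        if value < 1:
            raise ValueError("maxShares must be at least 1")
        self.max_shares = value
```

For a 2-node failover cluster, `SharedDisk(max_shares=2)` accepts two `attach` calls and rejects a third until one of the nodes detaches.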

Premium SSD ranges

The following table illustrates the allowed maximum values for maxShares by premium SSD sizes:

| Disk sizes | maxShares limit |
|---|---|
| P1, P2, P3, P4, P6, P10, P15, P20 | 3 |
| P30, P40, P50 | 5 |
| P60, P70, P80 | 10 |

The IOPS and bandwidth limits for a disk aren't affected by the maxShares value. For example, the max IOPS of a P15 disk is 1100 whether maxShares = 1 or maxShares > 1.

Standard SSD ranges

The following table illustrates the allowed maximum values for maxShares by standard SSD sizes:

| Disk sizes | maxShares limit |
|---|---|
| E1, E2, E3, E4, E6, E10, E15, E20 | 3 |
| E30, E40, E50 | 5 |
| E60, E70, E80 | 10 |

The IOPS and bandwidth limits for a disk aren't affected by the maxShares value. For example, the max IOPS of an E15 disk is 500 whether maxShares = 1 or maxShares > 1.

Ultra disk ranges

The minimum maxShares value is 1, while the maximum maxShares value is 15. There are no size restrictions on ultra disks: an ultra disk of any size can use any value for maxShares, up to and including the maximum value.

Premium SSD v2 ranges

The minimum maxShares value is 1, while the maximum maxShares value is 15. There are no size restrictions on Premium SSD v2: a Premium SSD v2 disk of any size can use any value for maxShares, up to and including the maximum value.
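
The maxShares limits listed in the disk size sections above can be collected into a single lookup. The following is a minimal sketch: the function names and the `"ultra"` and `"premiumv2"` keys are hypothetical labels chosen for this example, not Azure identifiers.

```python
# Size tiers shared by premium SSD (P) and standard SSD (E) disks.
_TIERS = [
    ((1, 2, 3, 4, 6, 10, 15, 20), 3),
    ((30, 40, 50), 5),
    ((60, 70, 80), 10),
]

# Maps names such as "P30" or "E60" to their maxShares limit.
SIZE_LIMITS = {
    f"{prefix}{number}": limit
    for prefix in ("P", "E")
    for numbers, limit in _TIERS
    for number in numbers
}

# Ultra disks and Premium SSD v2 disks have one limit regardless of size.
UNTIERED_LIMIT = 15


def max_shares_limit(disk):
    """Return the maxShares limit for a size name ("P30") or type label ("ultra")."""
    key = disk.strip().upper()
    if key in ("ULTRA", "PREMIUMV2"):  # hypothetical labels for this sketch
        return UNTIERED_LIMIT
    if key not in SIZE_LIMITS:
        raise ValueError(f"unknown disk size or type: {disk!r}")
    return SIZE_LIMITS[key]


def is_valid_max_shares(disk, requested):
    """True if the requested maxShares is between 1 and the disk's limit."""
    return 1 <= requested <= max_shares_limit(disk)
```

For example, `is_valid_max_shares("P20", 4)` is false because P20 disks allow at most three shares, while the same request against a P30 disk is accepted.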

Sample workloads

Windows

Azure shared disks are supported on Windows Server 2008 and newer. Most Windows-based clustering builds on WSFC, which handles all core infrastructure for cluster node communication, allowing your applications to take advantage of parallel access patterns. WSFC enables both CSV and non-CSV-based options depending on your version of Windows Server. For details, refer to Create a failover cluster.

Some popular applications running on WSFC include:

Linux

Azure shared disks are supported on:

Linux clusters can use cluster managers such as Pacemaker. Pacemaker builds on Corosync, enabling cluster communications for applications deployed in highly available environments. Some common clustered filesystems include ocfs2 and gfs2. You can use SCSI Persistent Reservation (SCSI PR) and/or STONITH Block Device (SBD) based clustering models for arbitrating access to the disk. When using SCSI PR, you can manipulate reservations and registrations using utilities such as fence_scsi and sg_persist.

Persistent reservation flow

The following diagram illustrates a sample 2-node clustered database application that uses SCSI PR to enable failover from one node to the other.

Two node cluster consisting of Azure VM1, VM2, and a disk shared between them. An application running on the cluster handles access to the disk.

The flow is as follows:

  1. The clustered application running on both Azure VM1 and VM2 registers its intent to read or write to the disk.
  2. The application instance on VM1 then takes exclusive reservation to write to the disk.
  3. This reservation is enforced on your Azure disk and the database can now exclusively write to the disk. Any writes from the application instance on VM2 won't succeed.
  4. If the application instance on VM1 goes down, the instance on VM2 can now initiate a database failover and take-over of the disk.
  5. This reservation is now enforced on the Azure disk and the disk will no longer accept writes from VM1. It will only accept writes from VM2.
  6. The clustered application can complete the database failover and serve requests from VM2.
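
The failover steps above can be walked through with a toy in-memory model. This is a sketch only: `ReservedDisk` and `run_failover` are hypothetical names, and the model doesn't issue real SCSI commands. It assumes a single exclusive-write reservation that a registered node can take over once the previous holder goes down.

```python
class ReservedDisk:
    """Toy stand-in for a shared disk with persistent reservations."""

    def __init__(self):
        self.registered = set()  # nodes that registered intent to read or write
        self.holder = None       # node holding the exclusive write reservation
        self.blocks = []         # writes the disk accepted, in order

    def register(self, node):
        self.registered.add(node)

    def reserve(self, node):
        if node not in self.registered:
            raise PermissionError(f"{node} is not registered")
        if self.holder not in (None, node):
            raise PermissionError(f"{self.holder} holds the reservation")
        self.holder = node

    def preempt(self, node):
        # A registered node takes over the reservation, for example after
        # the previous holder has gone down.
        if node not in self.registered:
            raise PermissionError(f"{node} is not registered")
        if self.holder != node:
            self.registered.discard(self.holder)
        self.holder = node

    def write(self, node, data):
        if node != self.holder:
            raise PermissionError(f"write from {node} rejected")
        self.blocks.append((node, data))


def run_failover():
    disk = ReservedDisk()
    disk.register("VM1")            # step 1: both nodes register
    disk.register("VM2")
    disk.reserve("VM1")             # step 2: VM1 takes the reservation
    disk.write("VM1", "txn-1")      # step 3: VM1 writes exclusively
    rejected = []
    try:
        disk.write("VM2", "txn-x")  # step 3: writes from VM2 don't succeed
    except PermissionError:
        rejected.append("VM2")
    disk.preempt("VM2")             # steps 4-5: VM2 takes over the disk
    try:
        disk.write("VM1", "txn-y")  # step 5: the disk no longer accepts VM1
    except PermissionError:
        rejected.append("VM1")
    disk.write("VM2", "txn-2")      # step 6: VM2 serves requests
    return disk.blocks, rejected
```

In a real cluster, the cluster manager issues these reservation operations on behalf of the application, so the application itself rarely calls them directly.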

The following diagram illustrates another common clustered workload consisting of multiple nodes reading data from the disk for running parallel processes, such as training of machine learning models.

Four node VM cluster, each node registers intent to write, application takes exclusive reservation to properly handle write results

The flow is as follows:

  1. The clustered application running on all VMs registers the intent to read or write to the disk.
  2. The application instance on VM1 takes an exclusive reservation to write to the disk while opening up reads to the disk from other VMs.
  3. This reservation is enforced on your Azure disk.
  4. All nodes in the cluster can now read from the disk. Only one node writes back results to the disk, on behalf of all nodes in the cluster.

Ultra disks reservation flow

Ultra disks offer two extra throttles, for a total of four throttles. Because of this, the ultra disk reservation flow can work as described in the earlier section, or it can throttle and distribute performance more granularly.

An image of a table that depicts the `ReadOnly` or `Read/Write` access for Reservation Holder, Registered, and Others.

Performance throttles

Premium SSD performance throttles

With premium SSD, the disk IOPS and throughput are fixed. For example, the IOPS of a P30 is 5000. This value remains the same whether the disk is shared across 2 VMs or 5 VMs. The disk limits can be reached from a single VM or divided across two or more VMs.

Ultra disk performance throttles

Ultra disks have the unique capability of letting you set your performance through modifiable attributes. By default, there are only two modifiable attributes, but shared ultra disks have two more.

| Attribute | Description |
|---|---|
| DiskIOPSReadWrite | The total number of IOPS allowed across all VMs mounting the shared disk with write access. |
| DiskMBpsReadWrite | The total throughput (MB/s) allowed across all VMs mounting the shared disk with write access. |
| DiskIOPSReadOnly* | The total number of IOPS allowed across all VMs mounting the shared disk as ReadOnly. |
| DiskMBpsReadOnly* | The total throughput (MB/s) allowed across all VMs mounting the shared disk as ReadOnly. |

* Applies to shared ultra disks only

The following formulas explain how the performance attributes can be set, since they're user modifiable:

  • DiskIOPSReadWrite/DiskIOPSReadOnly:
    • IOPS limits of 300 IOPS/GiB, up to a maximum of 160,000 IOPS per disk
    • Minimum of 100 IOPS
    • DiskIOPSReadWrite + DiskIOPSReadOnly is at least 2 IOPS/GiB
  • DiskMBpsReadWrite/DiskMBpsReadOnly:
    • The throughput limit of a single disk is 256 KiB/s for each provisioned IOPS, up to a maximum of 2000 MBps per disk
    • The minimum guaranteed throughput per disk is 4 KiB/s for each provisioned IOPS, with an overall baseline minimum of 1 MBps
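
The bounds above can be turned into a rough configuration check. This sketch rests on several assumptions that the text doesn't spell out: the per-GiB limit and the 100 IOPS minimum are applied to each IOPS attribute separately, each throughput attribute is bounded by its matching IOPS attribute, the minimum guaranteed throughput is treated as a lower bound for the setting, and 1 MBps is taken to mean 10^6 bytes per second. All function names are hypothetical.

```python
KIB = 1024
MB = 1_000_000  # assumption: MBps means 10**6 bytes per second


def iops_bounds(size_gib):
    """Lowest and highest IOPS value for one attribute on a disk of this size."""
    return 100, min(300 * size_gib, 160_000)


def mbps_bounds(iops):
    """Lowest and highest throughput (MBps) for the given provisioned IOPS."""
    low = max(4 * KIB * iops / MB, 1.0)
    high = min(256 * KIB * iops / MB, 2000.0)
    return low, high


def check_shared_ultra(size_gib, iops_rw, mbps_rw, iops_ro, mbps_ro):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    low, high = iops_bounds(size_gib)
    for name, iops in (("DiskIOPSReadWrite", iops_rw), ("DiskIOPSReadOnly", iops_ro)):
        if not low <= iops <= high:
            problems.append(f"{name}={iops} is outside {low}..{high}")
    if iops_rw + iops_ro < 2 * size_gib:
        problems.append("total IOPS is below 2 IOPS/GiB")
    for name, iops, mbps in (
        ("DiskMBpsReadWrite", iops_rw, mbps_rw),
        ("DiskMBpsReadOnly", iops_ro, mbps_ro),
    ):
        low, high = mbps_bounds(iops)
        if not low <= mbps <= high:
            problems.append(f"{name}={mbps} is outside {low:g}..{high:g} MBps")
    return problems
```

Under these assumptions, a 1,024 GiB disk configured with 10,000 read/write IOPS at 600 MBps and 100 read-only IOPS at 1 MBps passes every check. Treat the result as a sanity check rather than a guarantee that the service accepts the configuration.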

Examples

The following examples depict a few scenarios that show how the throttling can work with shared ultra disks, specifically.

Two-node cluster using cluster shared volumes

The following is an example of a 2-node WSFC using cluster shared volumes. With this configuration, both VMs have simultaneous write access to the disk, which results in the ReadWrite throttle being split across the two VMs and the ReadOnly throttle not being used.

CSV two node ultra example

Two-node cluster without cluster shared volumes

The following is an example of a 2-node WSFC that isn't using cluster shared volumes. With this configuration, only one VM has write access to the disk. This results in the ReadWrite throttle being used exclusively for the primary VM and the ReadOnly throttle only being used by the secondary.

CSV two nodes no csv ultra disk example

Four node Linux cluster

The following is an example of a 4-node Linux cluster with a single writer and three scale-out readers. With this configuration, only one VM has write-access to the disk. This results in the ReadWrite throttle being used exclusively for the primary VM and the ReadOnly throttle being split by the secondary VMs.

Four node ultra throttling example
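
The three scenarios above differ only in which throttle each VM draws from. The following sketch groups VMs by access mode to make that explicit. The `throttle_pools` function and the VM names are hypothetical, and the model doesn't say how a throttle is divided among the VMs that share it.

```python
def throttle_pools(vm_access):
    """Group VMs by the throttle they draw from.

    vm_access maps a VM name to "ReadWrite" or "ReadOnly".
    """
    pools = {"ReadWrite": [], "ReadOnly": []}
    for vm, access in vm_access.items():
        if access not in pools:
            raise ValueError(f"unknown access mode for {vm}: {access!r}")
        pools[access].append(vm)
    return pools


# 2-node WSFC with cluster shared volumes: both VMs write.
csv_cluster = throttle_pools({"VM1": "ReadWrite", "VM2": "ReadWrite"})

# 2-node WSFC without cluster shared volumes: one writer, one reader.
non_csv_cluster = throttle_pools({"VM1": "ReadWrite", "VM2": "ReadOnly"})

# 4-node Linux cluster: a single writer and three scale-out readers.
linux_cluster = throttle_pools(
    {"VM1": "ReadWrite", "VM2": "ReadOnly", "VM3": "ReadOnly", "VM4": "ReadOnly"}
)
```

An empty `ReadOnly` list, as in the first scenario, corresponds to the ReadOnly throttle going unused.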

Ultra pricing

Ultra shared disks are priced based on provisioned capacity, total provisioned IOPS (diskIOPSReadWrite + diskIOPSReadOnly) and total provisioned Throughput MBps (diskMBpsReadWrite + diskMBpsReadOnly). There's no extra charge for each additional VM mount. For example, an ultra shared disk with the following configuration (diskSizeGB: 1024, DiskIOPSReadWrite: 10000, DiskMBpsReadWrite: 600, DiskIOPSReadOnly: 100, DiskMBpsReadOnly: 1) is charged with 1024 GiB, 10100 IOPS, and 601 MBps regardless of whether it is mounted to two VMs or five VMs.
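
The billed quantities in this example are simple sums, which can be reproduced as follows. The `SharedUltraDisk` class and its field names are hypothetical and exist only for this illustration. The sketch computes the quantities that are charged, not a price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedUltraDisk:
    disk_size_gb: int
    iops_read_write: int
    mbps_read_write: int
    iops_read_only: int
    mbps_read_only: int

    def billed_quantities(self):
        """Return the capacity, total IOPS, and total MBps that are charged."""
        return {
            "capacity_gib": self.disk_size_gb,
            "iops": self.iops_read_write + self.iops_read_only,
            "mbps": self.mbps_read_write + self.mbps_read_only,
        }


disk = SharedUltraDisk(
    disk_size_gb=1024,
    iops_read_write=10000,
    mbps_read_write=600,
    iops_read_only=100,
    mbps_read_only=1,
)
billed = disk.billed_quantities()
```

The number of VMs mounting the disk doesn't appear anywhere in the calculation, which matches the statement that there's no extra charge per VM mount.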

Next steps

If you're interested in enabling and using shared disks for your managed disks, proceed to our article Enable shared disk.