Use Azure Container Storage with local NVMe
Azure Container Storage is a cloud-based volume management, deployment, and orchestration service built natively for containers. This article shows you how to configure Azure Container Storage to use Ephemeral Disk with local NVMe as back-end storage for your Kubernetes workloads. At the end, you'll have a pod that's using local NVMe as its storage.
When your application needs sub-millisecond storage latency and doesn't require data durability, you can use Ephemeral Disk with Azure Container Storage to meet your performance requirements. Ephemeral means that the disks are deployed on the local virtual machine (VM) hosting the AKS cluster and not saved to an Azure storage service. Data on these disks is lost if you stop or deallocate the VM.
There are two types of Ephemeral Disk available: local NVMe and temp SSD. Local NVMe is designed for high-speed data transfer between storage and CPU. Choose local NVMe when your application needs higher IOPS or throughput than temp SSD can provide, or when it requires more storage space. Be aware that Azure Container Storage supports only synchronous data replication for local NVMe.
Due to the ephemeral nature of these disks, Azure Container Storage uses generic ephemeral volumes by default with ephemeral disk. However, certain use cases might call for persistent volumes even though the data isn't durable; for example, if you want to use existing YAML files or deployment templates that are hard-coded to use persistent volumes, and your workload supports application-level replication for durability. In such cases, you can update your Azure Container Storage installation and add the annotation acstor.azure.com/accept-ephemeral-storage=true to your persistent volume claim definition to allow persistent volumes to be created from ephemeral disk storage pools.
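For reference, the annotation sits under metadata in the PVC definition. Here's a minimal fragment (the claim name is illustrative; a complete PVC example appears later in this article):
metadata:
  name: ephemeralpvc   # illustrative name
  annotations:
    acstor.azure.com/accept-ephemeral-storage: "true"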
If you don't have an Azure subscription, create a free account before you begin.
This article requires the latest version (2.35.0 or later) of the Azure CLI. See How to install the Azure CLI. If you're using the Bash environment in Azure Cloud Shell, the latest version is already installed. If you plan to run the commands locally instead of in Azure Cloud Shell, be sure to run them with administrative privileges. For more information, see Get started with Azure Cloud Shell.
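To check which Azure CLI version you have installed, and to upgrade it if needed, you can run the standard CLI commands:
az version     # show the installed Azure CLI and extension versions
az upgrade     # update the Azure CLI and its extensions to the latest version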
You'll need the Kubernetes command-line client, kubectl. It's already installed if you're using Azure Cloud Shell, or you can install it locally by running the az aks install-cli command.
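For example, to install kubectl locally through the Azure CLI:
az aks install-cli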
If you haven't already installed Azure Container Storage, follow the instructions in Use Azure Container Storage with Azure Kubernetes Service.
Check if your target region is supported in Azure Container Storage regions.
Local NVMe disk is available only on certain VM types, such as storage optimized VM SKUs or GPU accelerated VM SKUs. If you plan to use local NVMe capacity, choose one of these VM SKUs.
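To see which of these VM sizes are available in your region, you can query VM SKUs with the Azure CLI. This is a sketch that filters on the storage optimized L-series as an example; replace the region with your own:
az vm list-skus --location eastus --size Standard_L --resource-type virtualMachines --output table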
Run the following command to get the VM type that's used with your node pool. Replace <resource group> and <cluster name> with your own values. You don't need to supply values for PoolName or VmSize, so keep the query as shown here.
az aks nodepool list --resource-group <resource group> --cluster-name <cluster name> --query "[].{PoolName:name, VmSize:vmSize}" -o table
The following is an example of output.
PoolName VmSize
---------- ---------------
nodepool1 standard_l8s_v3
We recommend that each VM have a minimum of four virtual CPUs (vCPUs), and each node pool have at least three nodes.
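For example, a node pool that satisfies these recommendations might be created as follows. This is a sketch assuming the Standard_L8s_v3 SKU; nvmepool is an illustrative name, and the acstor.azure.com/io-engine=acstor label marks the nodes for Azure Container Storage, as in the expansion command later in this article:
az aks nodepool add --cluster-name <cluster name> --resource-group <resource group> --name nvmepool --node-vm-size Standard_L8s_v3 --node-count 3 --labels acstor.azure.com/io-engine=acstor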
Follow these steps to create and attach a generic ephemeral volume.
First, create a storage pool, which is a logical grouping of storage for your Kubernetes cluster, by defining it in a YAML manifest file.
If you enabled Azure Container Storage using the az aks create or az aks update commands, you might already have a storage pool. Use kubectl get sp -n acstor to get the list of storage pools. If you already have a storage pool that you want to use, you can skip this section and proceed to Display the available storage classes.
Follow these steps to create a storage pool using local NVMe.
Use your favorite text editor to create a YAML manifest file such as acstor-storagepool.yaml. Paste in the following code and save the file. The storage pool name value can be whatever you want.
apiVersion: containerstorage.azure.com/v1
kind: StoragePool
metadata:
  name: ephemeraldisk-nvme
  namespace: acstor
spec:
  poolType:
    ephemeralDisk:
      diskType: nvme
Apply the YAML manifest file to create the storage pool.
kubectl apply -f acstor-storagepool.yaml
When storage pool creation is complete, you'll see a message like:
storagepool.containerstorage.azure.com/ephemeraldisk-nvme created
You can also run this command to check the status of the storage pool. Replace <storage-pool-name> with your storage pool name value. For this example, the value would be ephemeraldisk-nvme.
kubectl describe sp <storage-pool-name> -n acstor
When the storage pool is created, Azure Container Storage will create a storage class on your behalf, using the naming convention acstor-<storage-pool-name>.
When the storage pool is ready to use, you must select a storage class, which defines how storage is dynamically created when you deploy volumes.
Run kubectl get sc to display the available storage classes. You should see a storage class called acstor-<storage-pool-name>.
$ kubectl get sc | grep "^acstor-"
acstor-azuredisk-internal disk.csi.azure.com Retain WaitForFirstConsumer true 65m
acstor-ephemeraldisk-nvme containerstorage.csi.azure.com Delete WaitForFirstConsumer true 2m27s
Important
Don't use the storage class that's marked internal. It's an internal storage class that's needed for Azure Container Storage to work.
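To inspect the parameters of the storage class that was created for your pool, you can describe it directly:
kubectl describe sc acstor-ephemeraldisk-nvme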
Create a pod that uses a generic ephemeral volume, running Fio (Flexible I/O Tester) for benchmarking and workload simulation.
Use your favorite text editor to create a YAML manifest file such as acstor-pod.yaml. Paste in the following code and save the file.
kind: Pod
apiVersion: v1
metadata:
  name: fiopod
spec:
  nodeSelector:
    acstor.azure.com/io-engine: acstor
  containers:
    - name: fio
      image: nixery.dev/shell/fio
      args:
        - sleep
        - "1000000"
      volumeMounts:
        - mountPath: "/volume"
          name: ephemeralvolume
  volumes:
    - name: ephemeralvolume
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels:
              type: my-ephemeral-volume
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: acstor-ephemeraldisk-nvme # replace with the name of your storage class if different
            resources:
              requests:
                storage: 1Gi
When you change the storage size of your volumes, make sure the size is less than the available capacity of a single node's ephemeral disk. See Check node ephemeral disk capacity.
Apply the YAML manifest file to deploy the pod.
kubectl apply -f acstor-pod.yaml
You should see output similar to the following:
pod/fiopod created
Check that the pod is running and that the ephemeral volume claim has been bound successfully to the pod:
kubectl describe pod fiopod
kubectl describe pvc fiopod-ephemeralvolume
Run fio inside the pod to benchmark the volume and check its output:
kubectl exec -it fiopod -- fio --name=benchtest --size=800m --filename=/volume/test --direct=1 --rw=randrw --ioengine=libaio --bs=4k --iodepth=16 --numjobs=8 --time_based --runtime=60
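The command above runs an 8-job random read/write test for 60 seconds and prints IOPS and latency statistics when it completes. You can vary the standard fio flags to test other access patterns; for example, a sequential-read throughput test might look like this (a sketch using common fio options):
kubectl exec -it fiopod -- fio --name=seqread --size=800m --filename=/volume/test --direct=1 --rw=read --ioengine=libaio --bs=128k --iodepth=16 --numjobs=1 --time_based --runtime=60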
You've now deployed a pod that's using local NVMe as its storage, and you can use it for your Kubernetes workloads.
To create a persistent volume from an ephemeral disk storage pool, you must include an annotation in your persistent volume claims (PVCs) as a safeguard to ensure that you intend to use persistent volumes even when the data is ephemeral. Additionally, you need to enable the --ephemeral-disk-volume-type flag with the PersistentVolumeWithAnnotation value on your cluster before creating your persistent volume claims.
Follow these steps to create and attach a persistent volume.
Run the following command to update your Azure Container Storage installation to allow the creation of persistent volumes from ephemeral disk storage pools.
az aks update -n <cluster-name> -g <resource-group> --enable-azure-container-storage ephemeralDisk --storage-pool-option NVMe --ephemeral-disk-volume-type PersistentVolumeWithAnnotation
Create a storage pool, which is a logical grouping of storage for your Kubernetes cluster, by defining it in a YAML manifest file.
If you enabled Azure Container Storage using the az aks create or az aks update commands, you might already have a storage pool. Use kubectl get sp -n acstor to get the list of storage pools. If you already have a storage pool that you want to use, you can skip this section and proceed to Display the available storage classes.
Follow these steps to create a storage pool using local NVMe.
Use your favorite text editor to create a YAML manifest file such as acstor-storagepool.yaml. Paste in the following code and save the file. The storage pool name value can be whatever you want.
apiVersion: containerstorage.azure.com/v1
kind: StoragePool
metadata:
  name: ephemeraldisk-nvme
  namespace: acstor
spec:
  poolType:
    ephemeralDisk:
      diskType: nvme
Apply the YAML manifest file to create the storage pool.
kubectl apply -f acstor-storagepool.yaml
When storage pool creation is complete, you'll see a message like:
storagepool.containerstorage.azure.com/ephemeraldisk-nvme created
You can also run this command to check the status of the storage pool. Replace <storage-pool-name> with your storage pool name value. For this example, the value would be ephemeraldisk-nvme.
kubectl describe sp <storage-pool-name> -n acstor
When the storage pool is created, Azure Container Storage will create a storage class on your behalf, using the naming convention acstor-<storage-pool-name>.
When the storage pool is ready to use, you must select a storage class, which defines how storage is dynamically created when you deploy volumes.
Run kubectl get sc to display the available storage classes. You should see a storage class called acstor-<storage-pool-name>.
$ kubectl get sc | grep "^acstor-"
acstor-azuredisk-internal disk.csi.azure.com Retain WaitForFirstConsumer true 65m
acstor-ephemeraldisk-nvme containerstorage.csi.azure.com Delete WaitForFirstConsumer true 2m27s
Important
Don't use the storage class that's marked internal. It's an internal storage class that's needed for Azure Container Storage to work.
A persistent volume claim is used to automatically provision storage based on a storage class. Follow these steps to create a PVC using the new storage class.
Use your favorite text editor to create a YAML manifest file such as acstor-pvc.yaml. Paste in the following code and save the file. The PVC name value can be whatever you want.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ephemeralpvc
  annotations:
    acstor.azure.com/accept-ephemeral-storage: "true"
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: acstor-ephemeraldisk-nvme # replace with the name of your storage class if different
  resources:
    requests:
      storage: 100Gi
When you change the storage size of your volumes, make sure the size is less than the available capacity of a single node's ephemeral disk. See Check node ephemeral disk capacity.
Apply the YAML manifest file to create the PVC.
kubectl apply -f acstor-pvc.yaml
You should see output similar to:
persistentvolumeclaim/ephemeralpvc created
You can verify the status of the PVC by running the following command:
kubectl describe pvc ephemeralpvc
Once the PVC is created, it's ready for use by a pod.
Create a pod using Fio (Flexible I/O Tester) for benchmarking and workload simulation, and specify a mount path for the persistent volume. For claimName, use the name value that you used when creating the persistent volume claim.
Use your favorite text editor to create a YAML manifest file such as acstor-pod.yaml. Paste in the following code and save the file.
kind: Pod
apiVersion: v1
metadata:
  name: fiopod
spec:
  nodeSelector:
    acstor.azure.com/io-engine: acstor
  volumes:
    - name: ephemeralpv
      persistentVolumeClaim:
        claimName: ephemeralpvc
  containers:
    - name: fio
      image: nixery.dev/shell/fio
      args:
        - sleep
        - "1000000"
      volumeMounts:
        - mountPath: "/volume"
          name: ephemeralpv
Apply the YAML manifest file to deploy the pod.
kubectl apply -f acstor-pod.yaml
You should see output similar to the following:
pod/fiopod created
Check that the pod is running and that the persistent volume claim has been bound successfully to the pod:
kubectl describe pod fiopod
kubectl describe pvc ephemeralpvc
Run fio inside the pod to benchmark the volume and check its output:
kubectl exec -it fiopod -- fio --name=benchtest --size=800m --filename=/volume/test --direct=1 --rw=randrw --ioengine=libaio --bs=4k --iodepth=16 --numjobs=8 --time_based --runtime=60
You've now deployed a pod that's using local NVMe and you can use it for your Kubernetes workloads.
In this section, you'll learn how to check the available capacity of ephemeral disk for a single node, how to expand or delete a storage pool, and how to optimize performance.
An ephemeral volume is allocated on a single node. When you configure the size of your ephemeral volumes, the size should be less than the available capacity of the single node's ephemeral disk.
Run the following command to check the available capacity of ephemeral disk for a single node.
$ kubectl get diskpool -n acstor
NAME                                CAPACITY      AVAILABLE     USED        RESERVED    READY   AGE
ephemeraldisk-nvme-diskpool-jaxwb   75660001280   75031990272   628011008   560902144   True    21h
ephemeraldisk-nvme-diskpool-wzixx   75660001280   75031990272   628011008   560902144   True    21h
ephemeraldisk-nvme-diskpool-xbtlj   75660001280   75031990272   628011008   560902144   True    21h
In this example, the available capacity of ephemeral disk for a single node is 75031990272 bytes, or 69 GiB.
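As a quick sanity check on that conversion (integer division by 1024 three times):
echo $(( 75031990272 / 1024 / 1024 / 1024 ))   # prints 69 (GiB, rounded down)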
You can expand storage pools backed by local NVMe to scale up quickly and without downtime. Shrinking storage pools isn't currently supported.
Because a storage pool backed by Ephemeral Disk uses local storage resources on the AKS cluster nodes (VMs), expanding the storage pool requires adding another node to the cluster. Follow these instructions to expand the storage pool.
Run the following command to add a node to the AKS cluster. Replace <cluster name>, <nodepool name>, and <resource group> with your own values. To get the name of your node pool, run kubectl get nodes.
az aks nodepool add --cluster-name <cluster name> --name <nodepool name> --resource-group <resource group> --node-vm-size Standard_L8s_v3 --node-count 1 --labels acstor.azure.com/io-engine=acstor
Run kubectl get nodes and you'll see that a node has been added to the cluster. Then run kubectl get sp -A and you should see that the capacity of the storage pool has increased.
If you want to delete a storage pool, run the following command. Replace <storage-pool-name> with the storage pool name.
kubectl delete sp -n acstor <storage-pool-name>
Depending on your workload’s performance requirements, you can choose from three different performance tiers: Basic, Standard, and Premium. Your selection will impact the number of vCPUs that Azure Container Storage components consume in the nodes where it's installed. Standard is the default configuration if you don't update the performance tier.
These three tiers offer a different range of IOPS. The following table contains guidance on what you could expect with each of these tiers. We used FIO, a popular benchmarking tool, to achieve these numbers with the following configuration:
- AKS: Node SKU - Standard_L16s_v3;
- FIO: Block size - 4KB; Queue depth - 32; Numjobs - number of cores assigned to container storage components; Access pattern - random; Working set size - 32G
| Tier | Number of vCPUs | 100% Read IOPS | 100% Write IOPS |
|---|---|---|---|
| Basic | 12.5% of total VM cores | Up to 120,000 | Up to 90,000 |
| Standard (default) | 25% of total VM cores | Up to 220,000 | Up to 180,000 |
| Premium | 50% of total VM cores | Up to 550,000 | Up to 360,000 |
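For example, on a Standard_L16s_v3 node (16 vCPUs), the Standard tier assigns 25% of the cores, or 4 vCPUs, to Azure Container Storage components, while Premium assigns 8.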
Note
RAM and hugepages consumption will stay consistent across all tiers: 1 GiB of RAM and 2 GiB of hugepages.
Once you've identified the performance tier that aligns best to your needs, you can run the following command to update the performance tier of your Azure Container Storage installation. Replace <performance-tier> with basic, standard, or premium, and <storage-pool-type> with the storage pool type you enabled, such as ephemeralDisk.
az aks update -n <cluster-name> -g <resource-group> --enable-azure-container-storage <storage-pool-type> --ephemeral-disk-nvme-perf-tier <performance-tier>
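For example, assuming a cluster named mycluster in resource group myrg (illustrative names) with ephemeral disk already enabled, moving to the Premium tier would look like this:
az aks update -n mycluster -g myrg --enable-azure-container-storage ephemeralDisk --ephemeral-disk-nvme-perf-tier premium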