AKS storage issues after patching

Karsten 1 Reputation point
2021-12-01T15:44:40.737+00:00

I found a weird storage issue after I did some OS patching and node reboots on an AKS cluster a couple of weeks ago. Some persistent volumes (PVs) in UK South availability zone 3 suddenly share the same underlying storage. It looks very much like a problem with the Azure storage layer to me. Pods/PVs in other availability zones are not affected.

Detailed description below. Has somebody else seen this issue before? Is there a way to resolve it? Restarting the pods doesn't help.

The PVs use the managed-retain storage class, which is defined as follows:

~ k describe storageclasses managed-retain  
Name:                  managed-retain  
IsDefaultClass:        No  
Annotations:           <none>  
Provisioner:           kubernetes.io/azure-disk  
Parameters:            kind=Managed,storageaccounttype=Standard_LRS  
AllowVolumeExpansion:  True  
MountOptions:          <none>  
ReclaimPolicy:         Retain  
VolumeBindingMode:     Immediate  
Events:                <none>  

This storage class is supposed to provide managed disks, which can't be shared.
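One quick way to rule out two PVs pointing at the same Azure disk is to list each PV's backing DiskURI and look for duplicates. A sketch, assuming kubectl access to the cluster (the sample URIs are the redacted ones from the describe output further down):

```shell
# A PV provisioned by kubernetes.io/azure-disk records its backing disk
# in .spec.azureDisk.diskURI, so duplicate URIs would explain the
# sharing at the Kubernetes layer. Against a live cluster (hypothetical
# invocation):
#   kubectl get pv -o jsonpath='{range .items[*]}{.spec.azureDisk.diskURI}{"\n"}{end}' | sort | uniq -d
# Empty output means every PV points at a distinct Azure disk.
#
# The same duplicate check, run on the two (redacted) DiskURIs from
# this cluster:
printf '%s\n' \
  '.../disks/kubernetes-dynamic-pvc-69707b58-30d9-4001-9004-76bbd0db1a36' \
  '.../disks/kubernetes-dynamic-pvc-c66de91e-602c-4934-b513-b930993ad8b7' |
  sort | uniq -d
# (prints nothing: the two URIs differ)
```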

The issue affects two Wordpress pods, which suddenly share file systems with a RabbitMQ pod and an Elasticsearch pod, respectively.

The Wordpress pod mounts a volume to /var/www/html and the RabbitMQ pod mounts a volume to /var/lib/rabbitmq. The volumes are supposed to be separate disks and should be mounted to only a single pod.

When I create a test file on one of the pods (keti is shorthand for kubectl exec -ti), like so:

keti -n rabbitmq rabbit-0 -- bash -c 'touch /var/lib/rabbitmq/testfile'  

I can see the file in the RabbitMQ pod:

 keti -n rabbitmq rabbit-0 -- bash -c 'ls -l /var/lib/rabbitmq/testfile'   
-rw-r--r-- 1 root root 0 Dec  1 14:24 /var/lib/rabbitmq/testfile  

Which is expected. But I also see exactly the same file in the Wordpress pod, which isn't expected:

~ keti -n wordpress wordpress-65d986d4f7-nz9s4 -- bash -c 'ls -l /var/www/html/testfile'   
-rw-r--r-- 1 root root 0 Dec  1 14:24 /var/www/html/testfile  

See pod_rabbit-0.txt and pod_wordpress-65d986d4f7-nz9s4.txt for the detailed configuration of both pods.
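A test file proves the point, but comparing device and inode numbers makes the identity check explicit without writing anything into the volumes. A sketch; the pod commands are hypothetical invocations using the paths from the outputs above, and the local part only illustrates the principle with a hard link:

```shell
# Inside each pod, an identical device:inode pair means the two paths
# resolve to the very same file on the same filesystem:
#   kubectl exec -ti -n rabbitmq rabbit-0 -- stat -c '%d:%i' /var/lib/rabbitmq/testfile
#   kubectl exec -ti -n wordpress wordpress-65d986d4f7-nz9s4 -- stat -c '%d:%i' /var/www/html/testfile
#
# Local illustration of the check, using a hard link to create two
# names for one underlying file:
tmp=$(mktemp -d)
touch "$tmp/a"
ln "$tmp/a" "$tmp/b"
[ "$(stat -c '%d:%i' "$tmp/a")" = "$(stat -c '%d:%i' "$tmp/b")" ] && echo "same underlying file"
rm -rf "$tmp"
# prints: same underlying file
```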

This is the persistent volume claim of the RabbitMQ pod:

~ k -n rabbitmq describe pvc rabbitmq-data-rabbit-0   
Name:          rabbitmq-data-rabbit-0  
Namespace:     rabbitmq  
StorageClass:  managed-retain  
Status:        Bound  
Volume:        pvc-69707b58-30d9-4001-9004-76bbd0db1a36  
Labels:        app=rabbitmq  
Annotations:   pv.kubernetes.io/bind-completed: yes  
               pv.kubernetes.io/bound-by-controller: yes  
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-disk  
Finalizers:    [kubernetes.io/pvc-protection]  
Capacity:      5Gi  
Access Modes:  RWO  
VolumeMode:    Filesystem  
Used By:       rabbit-0  
Events:        <none>  

This is the persistent volume claim of the Wordpress pod:

~ k -n wordpress describe pvc wordpress-data  
Name:          wordpress-data  
Namespace:     wordpress  
StorageClass:  managed-retain  
Status:        Bound  
Volume:        pvc-c66de91e-602c-4934-b513-b930993ad8b7  
Labels:        <none>  
Annotations:   pv.kubernetes.io/bind-completed: yes  
               pv.kubernetes.io/bound-by-controller: yes  
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-disk  
Finalizers:    [kubernetes.io/pvc-protection]  
Capacity:      5Gi  
Access Modes:  RWO  
VolumeMode:    Filesystem  
Used By:       wordpress-65d986d4f7-nz9s4  
Events:        <none>  

As can be seen in the above outputs, the PVCs reference different persistent volumes (PVs) (pvc-69707b58-30d9-4001-9004-76bbd0db1a36 and pvc-c66de91e-602c-4934-b513-b930993ad8b7), so it's clearly not a configuration issue. Furthermore, both PVCs were created a long time ago:

~ k -n rabbitmq get pvc   
NAME                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE  
rabbitmq-data-rabbit-0   Bound    pvc-69707b58-30d9-4001-9004-76bbd0db1a36   5Gi        RWO            managed-retain   537d  

~ k get pvc                                   
NAME                               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE  
wordpress-data                     Bound    pvc-c66de91e-602c-4934-b513-b930993ad8b7   5Gi        RWO            managed-retain     666d  

The PVs are defined as follows.

RabbitMQ:

k describe pv pvc-69707b58-30d9-4001-9004-76bbd0db1a36  
Name:              pvc-69707b58-30d9-4001-9004-76bbd0db1a36  
Labels:            failure-domain.beta.kubernetes.io/region=uksouth  
                   failure-domain.beta.kubernetes.io/zone=uksouth-3  
Annotations:       pv.kubernetes.io/bound-by-controller: yes  
                   pv.kubernetes.io/provisioned-by: kubernetes.io/azure-disk  
                   volumehelper.VolumeDynamicallyCreatedByKey: azure-disk-dynamic-provisioner  
Finalizers:        [kubernetes.io/pv-protection]  
StorageClass:      managed-retain  
Status:            Bound  
Claim:             rabbitmq/rabbitmq-data-rabbit-0  
Reclaim Policy:    Retain  
Access Modes:      RWO  
VolumeMode:        Filesystem  
Capacity:          5Gi  
Node Affinity:       
  Required Terms:    
    Term 0:        failure-domain.beta.kubernetes.io/region in [uksouth]  
                   failure-domain.beta.kubernetes.io/zone in [uksouth-3]  
Message:             
Source:  
    Type:         AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)  
    DiskName:     kubernetes-dynamic-pvc-69707b58-30d9-4001-9004-76bbd0db1a36  
    DiskURI:      /subscriptions/**redacted**/resourceGroups/mc_credresgpproduction_**redacted**_uksouth/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-69707b58-30d9-4001-9004-76bbd0db1a36  
    Kind:         Managed  
    FSType:         
    CachingMode:  ReadOnly  
    ReadOnly:     false  
Events:           <none>  

Wordpress:

k describe pv pvc-c66de91e-602c-4934-b513-b930993ad8b7  
Name:              pvc-c66de91e-602c-4934-b513-b930993ad8b7  
Labels:            failure-domain.beta.kubernetes.io/region=uksouth  
                   failure-domain.beta.kubernetes.io/zone=uksouth-3  
Annotations:       pv.kubernetes.io/bound-by-controller: yes  
                   pv.kubernetes.io/provisioned-by: kubernetes.io/azure-disk  
                   volumehelper.VolumeDynamicallyCreatedByKey: azure-disk-dynamic-provisioner  
Finalizers:        [kubernetes.io/pv-protection]  
StorageClass:      managed-retain  
Status:            Bound  
Claim:             wordpress/wordpress-data  
Reclaim Policy:    Retain  
Access Modes:      RWO  
VolumeMode:        Filesystem  
Capacity:          5Gi  
Node Affinity:       
  Required Terms:    
    Term 0:        failure-domain.beta.kubernetes.io/region in [uksouth]  
                   failure-domain.beta.kubernetes.io/zone in [uksouth-3]  
Message:             
Source:  
    Type:         AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)  
    DiskName:     kubernetes-dynamic-pvc-c66de91e-602c-4934-b513-b930993ad8b7  
    DiskURI:      /subscriptions/**redacted**/resourceGroups/mc_credresgpproduction_**redacted**_uksouth/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-c66de91e-602c-4934-b513-b930993ad8b7  
    Kind:         Managed  
    FSType:         
    CachingMode:  ReadOnly  
    ReadOnly:     false  
Events:           <none>  

The PV definitions also point to different DiskNames (kubernetes-dynamic-pvc-69707b58-30d9-4001-9004-76bbd0db1a36 and kubernetes-dynamic-pvc-c66de91e-602c-4934-b513-b930993ad8b7).

Thanks very much!

Karsten

Azure Kubernetes Service (AKS)

1 answer

  1. Karsten 1 Reputation point
    2021-12-03T12:34:20.31+00:00

    Hi,

    Many thanks for your reply and your efforts to reproduce the issue. Here are the outputs from my system:

    crictl ps|egrep rabbit\|wordpress
    55c0e11ff37f4       87505dc99f218       10 days ago         Running             rabbitmq              0                   11b08bab81bdc
    a8e4aeabe747e       cfb931188dab8       2 weeks ago         Running             wordpress             0                   a64c38b0ee2ef
    
    
    
    crictl inspect 55c0e11ff37f4 a8e4aeabe747e | grep pvc | grep hostPath
            "hostPath": "/var/lib/kubelet/pods/96efafa7-660e-41a9-800f-8e5089c582e4/volumes/kubernetes.io~azure-disk/pvc-69707b58-30d9-4001-9004-76bbd0db1a36",
            "hostPath": "/var/lib/kubelet/pods/8f218053-7c93-4c75-97b2-e55e06f417e4/volumes/kubernetes.io~azure-disk/pvc-c66de91e-602c-4934-b513-b930993ad8b7",
    

    I'm afraid I can't do the diff on the vol_data.json, because the file doesn't exist on this cluster. It appears to have been introduced in a more recent version: the affected cluster is on the slightly outdated v1.19.9, while I do find the file on one of my newer clusters, which runs v1.22. I realise I need to upgrade, but I'm a bit worried that this will make things worse.

    df -h | grep pvc doesn't show me the volumes, but I can find them with:

    root@aks-a2m-15622180-vmss000002:/# df -h |grep '/dev/sd'
    /dev/sda1                                                                                                     97G   21G   77G  21% /
    /dev/sda15                                                                                                   105M  4.4M  100M   5% /boot/efi
    /dev/sde1                                                                                                     20G   44M   19G   1% /mnt
    /dev/sdf                                                                                                     4.8G  615M  4.2G  13% /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m716043828
    /dev/sdd                                                                                                     148G   26G  122G  18% /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m1670688361
    

    It's the /dev/sdf disk. ls -l /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m716043828 shows me the files I can see in both the Wordpress and the RabbitMQ containers. The /dev/sdd disk backs the other Wordpress pod (a different deployment), which also suddenly shares storage with an Elasticsearch pod (same issue, different pair of pods).
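Kernel device names like /dev/sdf can change across reboots, so it may be worth tying them back to the Azure LUNs the control plane thinks are attached. A sketch; the cluster-side commands are hypothetical invocations, and only the device-to-mountpoint mapping over the df output above is run here:

```shell
# On an Azure VM, /dev/disk/azure/scsi1/ holds lunN symlinks to the
# data-disk device nodes, so /dev/sdf can be resolved to a LUN:
#   ls -l /dev/disk/azure/scsi1/
# Cross-checking those LUNs against the VMSS model, e.g.
#   az vmss show -g <mc-resource-group> -n <vmss-name> \
#     --query 'virtualMachineProfile.storageProfile.dataDisks'
# would show whether the node and the Azure control plane disagree
# about the attached disks after the reboot.
#
# Mapping device -> kubelet mount point from the df output above:
cat <<'EOF' | awk '$6 ~ /azure-disk/ {print $1, "->", $6}'
/dev/sdf 4.8G 615M 4.2G 13% /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m716043828
/dev/sdd 148G 26G 122G 18% /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m1670688361
EOF
# prints:
#   /dev/sdf -> /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m716043828
#   /dev/sdd -> /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m1670688361
```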

    root@aks-a2m-15622180-vmss000002:/# blkid
    /dev/sda1: LABEL="cloudimg-rootfs" UUID="5a9997c3-aafd-46e9-954c-781f2b11fb68" TYPE="ext4" PARTUUID="cbc2fcb7-e40a-4fec-a370-51888c246f12"                                                                                                    
    /dev/sda15: LABEL="UEFI" UUID="2FBA-C33A" TYPE="vfat" PARTUUID="53fbf8ed-db79-4c52-8e42-78dbf30ff35c"
    /dev/sdb: UUID="2bf2a47b-e77e-4648-9ba6-7fcd0cb2a1cd" TYPE="ext4"
    /dev/sdc: UUID="64e259b0-9ca4-421f-b23a-90b304bcb383" TYPE="ext4"
    /dev/sdd: UUID="3c76aea0-b4e1-4013-a1b0-ff6b67bc88df" TYPE="ext4"
    /dev/sde1: UUID="9b50c84e-44f9-4fd8-b4b1-32ab7d16de1e" TYPE="ext4" PARTUUID="dc2b2422-01"
    /dev/sda14: PARTUUID="de01bd39-4bfe-4bc8-aff7-986e694f7972"
    /dev/sdf: UUID="c4ba423b-1ada-4ff4-bfe6-098caf5fdb08" TYPE="ext4"
    

    Some additional information: the problem started on the 16th of November, after I drained and rebooted all the nodes in turn. The cluster is almost two years old and I have patched and upgraded it several times without major issues. The configuration of the affected deployments/stateful sets hasn't changed in over a year.

    Some of the PVs took ages to re-attach after the reboot (about 30-45 minutes). All I did to resolve that was delete the affected pods occasionally until the volumes eventually attached. Afterwards things looked OK, until I discovered that some pods share a volume. All the affected nodes are in availability zone 3.

    Any ideas what I could do to fix this?

    Thanks again for your help with this!

    Karsten