PVs are being assigned the wrong nodeAffinity
We're running Kubernetes 1.19.13 on AKS; one of our node pools doesn't have an availability zone defined, and we prefer it that way.
When scaling, new nodes get the labels
  failure-domain.beta.kubernetes.io/region: eastus
  failure-domain.beta.kubernetes.io/zone: '0'
which is expected.
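(For reference, a quick way to confirm those labels after a scale-up; these are the label keys on our version, newer clusters may use the topology.kubernetes.io/* ones instead:)
kubectl get nodes -L failure-domain.beta.kubernetes.io/region -L failure-domain.beta.kubernetes.io/zone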
However, when we add a new PVC, the PV that gets created is assigned the wrong nodeAffinity:
nodeAffinity:
  required:
    nodeSelectorTerms:
    - matchExpressions:
      - key: failure-domain.beta.kubernetes.io/region
        operator: In
        values:
        - eastus
      - key: failure-domain.beta.kubernetes.io/zone
        operator: In
        values:
        - eastus-1
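(The affinity can be read directly off the PV object; the PV name below is just a placeholder:)
kubectl get pv pvc-0123abcd -o jsonpath='{.spec.nodeAffinity.required}'
Given the node labels above, we would expect the zone term to be '0' here, not eastus-1.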
Pods don't start due to volume node affinity conflicts.
How do we troubleshoot this?
This is so weird!
4 answers
-
Kube VS 6 Reputation points
2022-03-21T11:51:47.15+00:00
Fixed after upgrading to 1.22.6. This was probably a bug.
-
shiva patpi 13,256 Reputation points Microsoft Employee
2022-03-15T05:43:58.127+00:00
Hello @Kube VS,
Sorry for the late reply here. Are you still seeing the issue?
Can you start by checking the below data:
kubectl describe pv
kubectl describe pvc
kubectl describe sc
kubectl describe pod
kubectl describe node
Are you sure none of your nodes are in an AZ?
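If it helps, the volume affinity failure usually surfaces in the Pod's scheduling events; one quick way to pull it up (the pod name is just a placeholder):
kubectl describe pod my-pod | grep -i affinity
kubectl get events --field-selector reason=FailedScheduling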
I am thinking that when the new nodes got scaled up with those zone labels, and the new PV got bound to a Pod, that Pod might have been assigned to one of the nodes which is in a zone.
Hence, if the Pod is using the PV via the PVC and the underlying node is in a zone, that PV probably picked up that node's affinity. If you look at the corresponding disk provisioned through the PVC, you might see that the disk is in one of the availability zones (you can check this in the Azure Portal for the disk).
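The same check can be done from the Azure CLI; the resource group and disk name below are placeholders (the disk is normally created in the cluster's MC_* node resource group and named after the PV):
az disk show --resource-group MC_myrg_mycluster_eastus --name pvc-0123abcd --query zones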
(I was trying to repro locally but was not able to, since you are using an older version of Kubernetes.)
Starting with 1.21.*, Kubernetes has made a lot of changes w.r.t. the provisioner.
Prior to 1.21.* the provisioner was: kubernetes.io/azure-disk
With 1.21.* the default provisioner is: disk.csi.azure.com
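You can check which provisioner your StorageClasses are using with something like:
kubectl get storageclass -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner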
-
Kube VS 6 Reputation points
2022-03-18T17:19:43.42+00:00
Hi @shiva patpi, thank you for your reply.
Everything has been checked, a couple of times, over and over again.
I don't have a single node in an AZ in that node pool (how could I?).
Furthermore, all the nodes have been recreated (node pool scaled to 0 and then scaled up again).
Yet the disk is being provisioned in AZ 1 (also verified in the Azure Portal). I am considering upgrading to 1.22; hopefully this will resolve it.
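(For anyone else hitting this, the node pool's zone configuration can also be confirmed from the CLI; resource group, cluster and pool names below are placeholders:)
az aks nodepool show --resource-group myrg --cluster-name mycluster --name mypool --query availabilityZones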
-
Mohamed Basher 1 Reputation point
2022-05-19T09:30:30.29+00:00
This is happening again in Kubernetes 1.23.