Dear community,
Our prod AKS cluster is in a Failed state (power state Running). The worker nodes are also in NotReady state.
Can you please help resolve this issue? We cannot even stop the cluster because it is stuck in the Failed state. We tried running the following commands, but they failed after running for hours:
az resource update --ids /subscriptions/xxx/resourcegroups/xxx/providers/Microsoft.ContainerService/ManagedClusters/xxx
az aks update --resource-group xxx --name xxx
I also see the following error:
Warning NetworkNotReady 2m34s (x14701 over 8h) kubelet network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
Below is some output from kubectl commands:
[nlweb-prod@AZSAPLPPUSA01 tmp]$ kubectl get no -o wide
E0819 13:10:27.880180 25142 memcache.go:255] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0819 13:10:27.896518 25142 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0819 13:10:27.900139 25142 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0819 13:10:27.904477 25142 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-primarypool-19920242-vmss000006 NotReady agent 10h v1.25.11 10.82.46.4
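In case it helps with troubleshooting, here is a sketch of the extra diagnostics we can collect. This assumes kubectl and az are still configured against this cluster; the resource group and cluster names (xxx) are placeholders, and the node name is taken from the output above.

```shell
#!/bin/sh
# Diagnostic sketch for the NotReady node / Failed cluster state.
# "|| true" keeps the script going even if a command fails or is unavailable.

NODE=aks-primarypool-19920242-vmss000006   # NotReady node from "kubectl get no"

# Node conditions and recent events (should show the NetworkNotReady warning)
kubectl describe node "$NODE" || true

# System pods: check whether the CNI / kube-proxy pods are running on the node
kubectl get pods -n kube-system -o wide || true

# Azure-side view of the cluster's provisioning state (expecting "Failed")
az aks show --resource-group xxx --name xxx --query provisioningState -o tsv || true

echo "diagnostics collected"
STATUS=done
```

If the kube-system pods cannot be scheduled at all, the node-level output from `kubectl describe node` is usually the most informative part, since the "cni plugin not initialized" message points at the container runtime network on the node itself.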