@Poluri, Venudhar, thank you for the question.
While AKS has resilience mechanisms to withstand and recover from a node VM being stopped or deallocated, this isn't a supported configuration. Stop your cluster instead.
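As a rough sketch of what stopping the cluster looks like with the Azure CLI: the resource group and cluster names below are placeholders, and the function names are only illustrative wrappers around `az aks stop`, `az aks start`, and `az aks show`.

```shell
# Placeholder names (assumptions): replace with your own values.
RESOURCE_GROUP="myResourceGroup"
CLUSTER_NAME="myAKSCluster"

# Stop the whole cluster instead of stopping individual node VMs.
stop_cluster() {
    az aks stop --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME"
}

# Start the cluster again when it is needed.
start_cluster() {
    az aks start --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME"
}

# Print the current power state of the cluster.
power_state() {
    az aks show --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME" \
        --query powerState.code --output tsv
}
```

Call `stop_cluster`, then `power_state` to confirm the cluster has stopped before relying on it.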
Azure Kubernetes Service node auto-repair still applies, but it works differently from Automatic instance repairs for Azure virtual machine scale sets.
If the node stays in a NotReady state for a long time after the node VM has started, please try the following steps:
- SSH to the node. How-to
- Collect kubelet logs. How-to
- Check if the docker daemon is running with `sudo systemctl status docker` [for containerd use `sudo systemctl status containerd`]. For Windows nodes use the `Get-Service` command.
- If it is inactive, try starting docker using `sudo systemctl start docker` [for containerd use `sudo systemctl start containerd`]. For Windows nodes use the `Start-Service` command.
- Check if the kubelet service is running with `sudo systemctl status kubelet`. For Windows nodes use `Get-Service`.
- If it is inactive, try starting the kubelet service using `sudo systemctl start kubelet`. For Windows nodes use `Start-Service`.
- If the node is still in a NotReady state, try restarting the VM/VMSS instance.
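The service checks in the steps above can be sketched as a small shell helper to run on a Linux node after connecting over SSH. This is only an illustration: `ensure_running` is a hypothetical name, not an AKS tool, and it assumes the node uses systemd and that you have sudo rights.

```shell
# Hypothetical helper: check a systemd service and start it if it is not active.
ensure_running() {
    svc="$1"
    # is-active --quiet exits 0 only when the unit is active.
    if systemctl is-active --quiet "$svc"; then
        echo "$svc is active"
        return 0
    fi
    echo "$svc is not active, trying to start it"
    sudo systemctl start "$svc"
    # Re-check so a failed start is reported instead of silently ignored.
    if systemctl is-active --quiet "$svc"; then
        echo "$svc started"
    else
        echo "$svc failed to start" >&2
        return 1
    fi
}
```

For example, run `ensure_running containerd` (or `ensure_running docker` on older nodes) followed by `ensure_running kubelet`. If either call reports a failure, the kubelet logs collected earlier are the next place to look.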
If you are still facing the issue, please let us know.
----------
Hope this helps.
Please "Accept as Answer" if it helped, so that it can help others in the community looking for help on similar topics.