Hello @Ramkumar
Welcome to the Microsoft Q&A Platform. Thank you for reaching out & I hope you are doing well.
The error you are seeing, getaddrinfo EAI_AGAIN, indicates a temporary failure in name resolution, which usually points to a DNS resolution or timeout issue.
Please follow the steps below to troubleshoot the issue:
- Check the logs of the CoreDNS pods
kubectl -n kube-system logs -l k8s-app=kube-dns
Check whether the logs show errors such as timeouts or restarts.
- Check CoreDNS pod resource utilization
kubectl top pod -n kube-system -l k8s-app=kube-dns
If the pods are overutilized, consider increasing their resource limits.
- Restart the CoreDNS pods and see if the issue persists
kubectl rollout restart deployment coredns -n kube-system
- Run a DNS test pod
Deploy a test pod to run DNS lookups:
kubectl run -it --rm dnsutils --image=tutum/dnsutils -- bash
Then test DNS from inside the pod:
dig api.nbq.ae
dig kubernetes.default
- Log in to a node and check whether DNS resolution on the node is working
nslookup api.nbq.ae
- Sometimes pods on an AKS cluster fall back to public DNS resolution even though private resolution is set up correctly. Such cases can be verified by running packet captures on the cluster nodes.
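Because EAI_AGAIN is a temporary failure, a single lookup may succeed even when the problem is still there. As a rough sketch, the helper below repeats a lookup and prints how many attempts failed. count_dns_failures is a hypothetical name (not part of kubectl or AKS), and the default resolver assumes getent is available in the image you run it from; you can pass another command such as nslookup instead.

```shell
#!/bin/sh
# Hypothetical helper: repeat a DNS lookup and print the number of failed attempts.
# Usage: count_dns_failures HOST ATTEMPTS [RESOLVER_CMD...]
# RESOLVER_CMD is invoked as: RESOLVER_CMD HOST
# Assumption: "getent ahosts" exists in the image and is a reasonable default.
count_dns_failures() {
    host=$1
    attempts=$2
    shift 2

    # Fall back to the default resolver when none is supplied.
    if [ "$#" -eq 0 ]; then
        set -- getent ahosts
    fi

    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Any non-zero exit status is counted as a failed lookup.
        if ! "$@" "$host" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done

    echo "$failures"
}

# Example (run inside the test pod or on a node):
#   count_dns_failures api.nbq.ae 20
```

If the count is above zero from inside a pod but zero on the node, that would suggest the problem sits between the pod and CoreDNS rather than with the upstream DNS server.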
Kindly check and let us know if you have any further queries.
Regards
Ujjawal Tyagi