Hello Subin Sabu
Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.
I understand that you are experiencing a network issue on your AKS cluster.
Troubleshooting AKS can sometimes be complicated, but I want to share some tools that you can try.
- My first question: do you have Azure monitoring enabled for both AKS clusters?
- If yes, what insights do you get from it?
A)
You can try collecting a tcpdump capture on the node, which you can then open in Wireshark to read.
https://github.com/ioanc/k8s-network-troubleshooting/blob/master/tcpdump-node-local.sh
Official Microsoft documentation on how to collect it: https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/logs/capture-tcp-dump-linux-node-aks
B)
Install some troubleshooting tools to help. For example, pull this image:
docker.io/fransouza/troubleshooting-network-tools:v1
Contents: network tools (nslookup, ping, nc, dig, traceroute, ifconfig)
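One way to use the image is to start a temporary pod from it and run the tools from inside the cluster. This is only a minimal sketch: the function name nettools_shell and the pod name nettools are examples I made up, and it assumes the image ships a shell at sh.

```shell
# Hypothetical helper: start a throwaway pod from the image above and open a shell in it.
# Assumptions: the image contains "sh"; the pod name "nettools" is only an example.
nettools_shell() {
  namespace="${1:-default}"   # namespace to test from, defaults to "default"
  kubectl run nettools \
    --namespace "$namespace" \
    --image=docker.io/fransouza/troubleshooting-network-tools:v1 \
    --restart=Never --rm -it -- sh
}
```

Call it as nettools_shell my-namespace, then from inside the pod try something like nslookup kubernetes.default.svc.cluster.local or nc -zv <service-name> <port> to check DNS resolution and connectivity. The pod is deleted when you exit the shell because of --rm.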
C)
Basic network troubleshooting in AKS:
https://learn.microsoft.com/en-us/azure/architecture/operator-guides/aks/troubleshoot-network-aks#pod-fails-to-allocate-the-ip-address
D)
Comparing the two AKS clusters is also a good idea.
- Please check the ports and any other details for the CoreDNS pods.
- How many instances do you have in the System mode node pool? Are there any alerts?
- During the latency or timeout events, what do you see in kubectl get events or in kubectl describe for the affected pods?
- Run kubectl top nodes to monitor utilization and spot possible capacity issues.
- What kind of disks are you using for the nodes, managed or ephemeral?
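The checks in this list could be run with commands along these lines, once against each cluster so you can compare the output. It is a sketch, not a complete diagnostic: the function names are just examples, and the node label kubernetes.azure.com/mode=system is my assumption about how AKS marks System mode nodes, so please verify it with kubectl get nodes --show-labels.

```shell
# Read-only checks; each function wraps one kubectl command.
coredns_pods() {
  # CoreDNS pods carry the label k8s-app=kube-dns in the kube-system namespace.
  kubectl get pods --namespace kube-system --selector k8s-app=kube-dns --output wide
}
coredns_service() {
  # Shows the ports exposed by the cluster DNS service.
  kubectl get service kube-dns --namespace kube-system
}
system_nodes() {
  # Assumption: System mode nodes are labeled kubernetes.azure.com/mode=system.
  kubectl get nodes --selector kubernetes.azure.com/mode=system
}
recent_events() {
  # Events from all namespaces, oldest first.
  kubectl get events --all-namespaces --sort-by=.lastTimestamp
}
node_usage() {
  # Requires metrics-server to be running in the cluster.
  kubectl top nodes
}
```

For a specific pod that shows latency or timeouts, kubectl describe pod <pod-name> --namespace <namespace> lists its events at the end of the output.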
Details here:
E)
Last but not least, if you are currently using the Free tier and want an availability guarantee, you can upgrade the pricing tier of your AKS cluster.
The Free tier of AKS does not include a financially backed uptime SLA.
https://learn.microsoft.com/en-us/azure/aks/free-standard-pricing-tiers
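If you want to check or change the tier from the command line, something like the following should work with a recent Azure CLI. Treat it as a sketch: the function names are examples, the resource group and cluster name are placeholders you pass in, and I am assuming your CLI version supports the --tier parameter on az aks update.

```shell
# Usage: aks_show_tier <resource-group> <cluster-name>
aks_show_tier() {
  # Prints the sku section of the cluster, which includes the current tier.
  az aks show --resource-group "$1" --name "$2" --query sku --output json
}
# Usage: aks_set_standard_tier <resource-group> <cluster-name>
aks_set_standard_tier() {
  # Switches the cluster to the Standard tier; this changes billing.
  az aks update --resource-group "$1" --name "$2" --tier standard
}
```

Run aks_show_tier first to confirm which tier the cluster is on before you change anything.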
If this was helpful, please accept the answer and upvote this post to let us know.
Thank you.
Lisboa