Hi @Tanul
Thanks for reaching out on Microsoft Q&A.
I've checked internally, and for now we have only one scenario that might be considered a bug: the AKS operation that migrates a cluster from legacy Azure CNI to Overlay (https://learn.microsoft.com/en-us/azure/aks/azure-cni-overlay#upgrade-an-existing-cluster-to-cni-overlay).
In that scenario, the cluster had both the old azure-ip-masq-agent-config configmap and the azure-ip-masq-agent-config-reconciled configmap, with nonMasqueradeCIDRs populated with the VNET, node subnet, and service CIDRs.
When the update to Overlay ran, only the azure-ip-masq-agent-config-reconciled configmap was updated: all previous nonMasqueradeCIDRs entries were removed and only the pod CIDR was kept. The old azure-ip-masq-agent-config configmap was left unchanged.
This can break some pod traffic, because traffic from the pods was not being SNATed when it was sent to some address spaces in the Azure network setup.
- Mitigation while the fix is being rolled out to all regions: remove the nonMasqueradeCIDRs entries from the azure-ip-masq-agent-config configmap, then restart the azure-ip-masq-agent pods.
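To illustrate why the leftover entries matter, here is a minimal Python sketch of the SNAT decision. It is not the agent's actual code. It assumes the agent treats the entries from both configmaps as one combined list, and every CIDR and IP address below is a hypothetical example value, not taken from any real cluster.

```python
import ipaddress


def is_masqueraded(dest_ip, non_masquerade_cidrs):
    """Return True if traffic to dest_ip would be SNATed.

    Traffic is SNATed unless the destination falls inside one of the
    nonMasqueradeCIDRs ranges.
    """
    dest = ipaddress.ip_address(dest_ip)
    return not any(
        dest in ipaddress.ip_network(cidr) for cidr in non_masquerade_cidrs
    )


# Hypothetical example values (assumptions for illustration only).
POD_CIDR = "10.244.0.0/16"
VNET_CIDR = "10.224.0.0/12"
NODE_SUBNET_CIDR = "10.224.0.0/16"
SERVICE_CIDR = "10.0.0.0/16"

# azure-ip-masq-agent-config-reconciled after the Overlay update.
reconciled = [POD_CIDR]
# Old azure-ip-masq-agent-config, left untouched by the update.
stale = [VNET_CIDR, NODE_SUBNET_CIDR, SERVICE_CIDR]

# Assumption: entries from both configmaps apply together.
effective_before_mitigation = reconciled + stale
# After the mitigation, only the reconciled entries remain.
effective_after_mitigation = reconciled

vnet_destination = "10.224.5.4"  # hypothetical address inside the VNET
print(is_masqueraded(vnet_destination, effective_before_mitigation))
print(is_masqueraded(vnet_destination, effective_after_mitigation))
```

Under these assumptions, traffic to the VNET address is not SNATed while the stale entries are present, and it is SNATed again once they are removed, which is the effect the mitigation is meant to achieve.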
If you run into any bug, connectivity, or latency issue, please report it to us by opening a support ticket in the Azure portal so we can investigate and follow up internally. I also recommend following our GitHub page, where you can find weekly updates, bug reports, and new releases for AKS.
Hope this helps. Please "Accept as Answer" if it helped, so that it can help others in the community looking for help on similar topics.