AKS outbound long-running request hanging

Mateusz T 0 Reputation points
2023-02-01T17:01:58.1166667+00:00

Cluster A exposes the api.example.com endpoint, which proxies traffic to a custom pod in cluster B (exposed as v1.api.example.com). This custom pod has 2 endpoints: one (let's call it B1) responding after 10 seconds and a second one responding after 5 minutes (let's call it B2). Proxying is configured in cluster A using a Kubernetes Ingress pointing at an ExternalName service whose externalName is set to the cluster B domain.
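
For clarity, the proxy configuration in cluster A looks roughly like this (a simplified sketch - the resource names, the backend-protocol/upstream-vhost annotations and the HTTPS port are illustrative, not the exact manifests):

    apiVersion: v1
    kind: Service
    metadata:
      name: cluster-b-proxy
    spec:
      # ExternalName service resolving to the cluster B domain
      type: ExternalName
      externalName: v1.api.example.com
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: api-proxy
      annotations:
        # proxy to cluster B over HTTPS and pass its hostname upstream
        nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
        nginx.ingress.kubernetes.io/upstream-vhost: v1.api.example.com
    spec:
      ingressClassName: nginx
      rules:
      - host: api.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: cluster-b-proxy
                port:
                  number: 443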

Let's consider following scenarios:

  1. If I send a request (from Postman/curl) directly to cluster B (on v1.api.example.com), I get a response from both endpoints B1 and B2 after the expected delay.
  2. If I send the request through cluster A (using api.example.com), the request is proxied to cluster B, but the request to B1 succeeds while the request to B2 hangs from the client's perspective and returns no result (even after far longer than 5 minutes, and with no HTTP error - I would expect something like 502/504; only when the client cancels the request does HTTP 499 appear in cluster A's logs). In cluster B's logs (both ingress-nginx and the service itself) I can see that the result was returned as HTTP 200 after the expected time for both B1 and B2, but it seems it was never consumed by ingress-nginx in cluster A, even though the connection appears to stay open the whole time.
  3. Moreover, if I deploy a pod in cluster A with similar services - one responding after 10 seconds and the other after 5 minutes - both work properly when reached through cluster A.
  4. If I exec into a pod in cluster A and hit the B2 request directly, it returns the response properly after 5 minutes. If I hit cluster A's ingress-nginx from that pod using its internal IP (and setting the required Host header), the B2 request also hangs indefinitely, while B1 does not, so the B1/B2 services are reachable from there (see the curl sketch below).
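
For reference, the in-cluster tests from point 4 look roughly like this (the ingress internal IP and the /b2 path are placeholders, not the real ones):

    # Hit the cluster B endpoint directly from a pod in cluster A - responds after ~5 minutes
    curl -v --max-time 600 https://v1.api.example.com/b2

    # Hit cluster A's own ingress-nginx via its internal IP with the Host header set - this one hangs
    curl -v --max-time 600 -H "Host: api.example.com" http://<ingress-internal-ip>/b2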

My conclusion is that this is related to some setting affecting outbound traffic from cluster A to cluster B (or to the internet in general). I was reading the documentation about AKS outbound connectivity and the load balancer idle timeout, but I haven't found the cause so far.

AKS setup:

  • Kubernetes version 1.24.6
  • Azure Load Balancer with public ingress and egress IPs. The Load Balancer idle timeout is set to 25 minutes on both the frontend and the outbound configuration, but it seems strange to me that the automatically created aksOutboundRule doesn't allow choosing the kubernetes backend pool, which is greyed out (when creating a different outbound rule the behavior is the same), so I'm wondering whether the 25-minute timeout is actually applied to the outbound traffic - see the az commands after this list (screenshot of the outbound rule attached).
  • Ingress-nginx as the ingress controller. I have tried to set all timeout-related settings in cluster A to high values to make sure they are not silently shutting down the underlying TCP connection:
       keep-alive: "420"
       proxy-connect-timeout: "60"
       proxy-read-timeout: "1800"
       proxy-send-timeout: "1800"
       upstream-keepalive-timeout: "420"
  • Network profile extracted using az aks show:
     "networkProfile": {
       "dnsServiceIp": "10.0.0.10",
       "dockerBridgeCidr": "172.17.0.1/16",
       "ipFamilies": [
         "IPv4"
       ],
       "loadBalancerProfile": {
         "allocatedOutboundPorts": 0,
         "effectiveOutboundIPs": [
           {
             "id": "<id>",
             "resourceGroup": "<rg-id>"
           }
         ],
         "enableMultipleStandardLoadBalancers": null,
         "idleTimeoutInMinutes": 25,
         "managedOutboundIPs": {
           "count": 1,
           "countIpv6": null
         },
         "outboundIPs": null,
         "outboundIpPrefixes": null
       },
       "loadBalancerSku": "Standard",
       "natGatewayProfile": null,
       "networkMode": null,
       "networkPlugin": "azure",
       "networkPolicy": "calico",
       "outboundType": "loadBalancer",
       "podCidr": null,
       "podCidrs": null,
       "serviceCidr": "10.0.0.0/16",
       "serviceCidrs": [
         "10.0.0.0/16"
       ]
     },
  • The only Kubernetes NetworkPolicy in the cluster (also autogenerated):
   apiVersion: networking.k8s.io/v1
   kind: NetworkPolicy
   metadata:
     annotations:
     generation: 1
     labels:
       addonmanager.kubernetes.io/mode: Reconcile
     name: konnectivity-agent
     namespace: kube-system
     resourceVersion: "598"
   spec:
     egress:
     - {}
     podSelector:
       matchLabels:
         app: konnectivity-agent
     policyTypes:
     - Egress
   status: {}
  • Clusters A and B are deployed in separate Virtual Networks, and each of them exposes its services on a separate domain.
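
To illustrate the doubt about the outbound rule above, this is roughly how I check whether the 25-minute idle timeout really landed on the outbound side (assuming the default AKS naming, i.e. the managed load balancer is called kubernetes and lives in the MC_* node resource group):

    # Idle timeout configured on the automatically created outbound rule
    az network lb outbound-rule show \
      --resource-group MC_<rg>_<cluster>_<region> \
      --lb-name kubernetes \
      --name aksOutboundRule \
      --query "{idleTimeoutInMinutes: idleTimeoutInMinutes, enableTcpReset: enableTcpReset}"

    # Idle timeout as reported by the AKS load balancer profile
    az aks show --resource-group <rg> --name <cluster> \
      --query networkProfile.loadBalancerProfile.idleTimeoutInMinutes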

Can anyone please help me figure out what could cause the long-running requests to hang when traffic goes through cluster A to cluster B?
