TCP ClientIP Affinity/Session Stickiness for Pods Over Multiple Ports

M van Staden 20 Reputation points
2023-06-22T13:23:25.1+00:00

We have an application whose clients need to connect to the same pod based on their client IP. The application uses three different ports. We have an Application Gateway that exposes the public IP, with a load balancer behind it.

*IPs are for illustrative purposes only

We've tried running the nginx ingress controller behind a LoadBalancer-type service in our cluster, with the following stream configuration (the IPs are the backend pod IPs):

stream { 
	upstream app1 {     
		zone app1 256k;      
		hash $remote_addr consistent;      
		server 10.200.1.50:56000;     
		server 10.200.2.10:56000;     
		server 10.200.3.15:56000;     
		server 10.200.4.20:56000; 
	}  
	server {      
		listen 56000;     
		proxy_pass app1; 
	}  

	upstream app2 {     
		zone app2 256k;      
		hash $remote_addr consistent;      
		server 10.200.1.50:56002;     
		server 10.200.2.10:56002;     
		server 10.200.3.15:56002;     
		server 10.200.4.20:56002; 
	}  
	server {      
		listen 56002;     
		proxy_pass app2; 
	}
  
	upstream app3 {     
		zone app3 256k;      
		hash $remote_addr consistent;      
		server 10.200.1.50:56001;     
		server 10.200.2.10:56001;     
		server 10.200.3.15:56001;     
		server 10.200.4.20:56001; 
	}  
	server {      
		listen 56001;     
		proxy_pass app3; 
	}   
}

Unfortunately, the consistent-hash ring differs per upstream (most likely because each ring is built over the server IP:port combinations, which differ between groups), so traffic from the same client IP on different ports does not land on the same pod. We need separate upstream groups because each connection must be forwarded to the backend on the same port it arrived on.
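
One workaround we are considering (a sketch only; the bucket split and hardcoded pod IPs are assumptions, and pod IPs are ephemeral, so this is fragile) is to map each client IP to a backend IP once, with no port in the key, and reuse that choice on every listener via $server_port:

stream {
	# Bucket each client IP to one backend pod IP; the key contains
	# no port, so the choice is identical on every listener.
	split_clients "${remote_addr}" $backend_ip {
		25%	10.200.1.50;
		25%	10.200.2.10;
		25%	10.200.3.15;
		*	10.200.4.20;
	}

	server {
		listen 56000;
		listen 56001;
		listen 56002;
		# $server_port is the port the client connected on, so the
		# connection reaches the chosen pod on the matching port.
		proxy_pass $backend_ip:$server_port;
	}
}

Note that split_clients buckets by percentage rather than by a consistent hash, so the mapping shifts if the pod set changes.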

We then tried routing the traffic from the nginx controller to the cluster IP of the backend service, with ClientIP affinity set on the receiving service:

stream {      
	server { 
		listen 56001; 
		proxy_pass 10.0.5.44:56001; 
	} 
	server { 
		listen 56000; 
		proxy_pass 10.0.5.44:56000; 
	} 
	server { 
		listen 56002; 
		proxy_pass 10.0.5.44:56002; 
	} 
}

At present the correct client IP is visible on the nginx controller, but once the connection is proxied to the cluster IP the source is replaced with the nginx pod IP. That is a separate problem; it means that, for now, the service sees the same source IP regardless of the remote client. Even so, ClientIP affinity does not appear to be working: traffic is still being spread across the pods, and I am unsure what in our configuration causes this.
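
If preserving the original source IP all the way to the service turns out to be necessary, nginx's stream module also supports IP transparency (a sketch only; it requires the nginx worker to run with root/CAP_NET_ADMIN and node routing that sends return traffic back through the proxy, which is non-trivial inside a cluster):

stream {
	server {
		listen 56000;
		# Bind upstream connections to the client's own address so the
		# backend sees the real source IP; replies must route back
		# through this proxy.
		proxy_bind $remote_addr transparent;
		proxy_pass 10.0.5.44:56000;
	}
}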

The backend service configuration:

kind: Service
apiVersion: v1
metadata:
  name: camserver-svc
  namespace: camera
  uid: fb973598-908b-4b15-95f5-02582f965757
  resourceVersion: "24990840"
  creationTimestamp: "2023-04-27T08:17:41Z"
  labels:
    app.kubernetes.io/instance: appserver
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: appserver
    app.kubernetes.io/version: 1.16.0
    helm.sh/chart: camserver-0.1.0
  annotations:
    meta.helm.sh/release-name: appserver
    meta.helm.sh/release-namespace: app
spec:
  ports:
    - name: app1
      protocol: TCP
      port: 56000
      targetPort: 56000
    - name: app2
      protocol: TCP
      port: 56002
      targetPort: 56002
    - name: app3
      protocol: TCP
      port: 56001
      targetPort: 56001
  selector:
    app: camserver
  clusterIP: 10.0.0.4
  clusterIPs:
    - 10.0.0.4
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  internalTrafficPolicy: Cluster
status:
  loadBalancer: {}

The health of the nodes seems fine.

Any input would be greatly appreciated. If there is a way to debug how/why a pod is selected for a given connection, please let me know, or if there is an alternative approach, please indicate how to accomplish it.
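
(For context, the endpoint set and the affinity settings that actually applied can be inspected with kubectl, using the service and namespace names above:)

# Pod IPs backing each service port
kubectl get endpoints camserver-svc -n camera -o yaml

# Confirm the affinity settings on the service
kubectl get svc camserver-svc -n camera \
	-o jsonpath='{.spec.sessionAffinity} {.spec.sessionAffinityConfig}{"\n"}'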


Accepted answer
  1. Ben Gimblett 4,560 Reputation points Microsoft Employee
    2023-06-26T08:37:05.47+00:00

    @M van Staden - I had another look at your question. I don't believe that hosting different endpoints (on different ports) per pod is related to the issue. Your fundamental ask is: "...an application ...needs to connect to the same pod based on the client IP" - so client IP affinity.

    This isn't going to work with a typical out-of-the-box ingress controller (nginx, AGIC, Ambassador, etc.). Ingress controllers are reverse proxies, so between the proxy and the backend the original client IP survives only in a forwarded-for header (where the proxy supports adding one). The source IP of packets leaving the reverse proxy is the proxy's own address, so that the return path goes back through it.
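
    For raw TCP streams like these, the closest analogue to a forwarded-for header is the PROXY protocol, which the nginx stream proxy can prepend (a sketch; the backend application must be able to parse the PROXY protocol header, which is an assumption here):

        stream {
            server {
                listen 56000;
                # Prepend a PROXY protocol header carrying the original
                # client IP/port on each upstream connection.
                proxy_protocol on;
                proxy_pass 10.0.5.44:56000;
            }
        }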

    Without an ingress controller, you can use client IP affinity on the Service object, as sketched below.
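
    A minimal sketch of that approach (the service name is hypothetical; the selector and ports are taken from the question, and externalTrafficPolicy: Local makes the cloud load balancer preserve the client source IP):

        apiVersion: v1
        kind: Service
        metadata:
          name: camserver-lb               # hypothetical name
        spec:
          type: LoadBalancer
          externalTrafficPolicy: Local     # preserve the client source IP
          sessionAffinity: ClientIP
          sessionAffinityConfig:
            clientIP:
              timeoutSeconds: 10800
          selector:
            app: camserver
          ports:
            - name: app1
              protocol: TCP
              port: 56000
              targetPort: 56000
            - name: app2
              protocol: TCP
              port: 56002
              targetPort: 56002
            - name: app3
              protocol: TCP
              port: 56001
              targetPort: 56001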

    With an ingress controller, you can use cookie-based affinity (for controllers that support it), which implies the traffic is HTTP and the client making the connection into the cluster can handle cookies; see the sketch below.
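
    With ingress-nginx, for example, cookie affinity is switched on per Ingress via annotations (a sketch; the host, backend service port, and resource name are placeholders, and this applies to HTTP(S) traffic only, not to the raw TCP ports above):

        apiVersion: networking.k8s.io/v1
        kind: Ingress
        metadata:
          name: camserver-ing              # hypothetical name
          annotations:
            nginx.ingress.kubernetes.io/affinity: "cookie"
            nginx.ingress.kubernetes.io/session-cookie-name: "route"
        spec:
          ingressClassName: nginx
          rules:
            - host: app.example.com        # placeholder host
              http:
                paths:
                  - path: /
                    pathType: Prefix
                    backend:
                      service:
                        name: camserver-svc
                        port:
                          number: 80       # placeholder HTTP port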

    1 person found this answer helpful.

1 additional answer

  1. msrini-MSFT 9,291 Reputation points Microsoft Employee
    2023-06-23T13:23:38.2066667+00:00

    Application Gateway is a reverse proxy, which means that when you send traffic to Application Gateway it creates a new session to its backend, and the client IP is not preserved in the IP header. That is the reason your load balancer is not able to route to the same pod. To solve this, remove Application Gateway and expose your AKS workload with a public load balancer in front, with client-IP-based affinity enabled; a sketch of one way to do that follows.
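
    One way to try that with the existing service (a sketch; it assumes the camserver-svc service from the question, which already has sessionAffinity: ClientIP, and it will allocate a public IP):

        kubectl patch svc camserver-svc -n camera -p \
          '{"spec":{"type":"LoadBalancer","externalTrafficPolicy":"Local"}}'

    externalTrafficPolicy: Local keeps the original client source IP, which the service's ClientIP sessionAffinity then keys on.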

    1 person found this answer helpful.
