A custom network security group blocks traffic
When you access an application that's hosted on an Azure Kubernetes Service (AKS) cluster, you receive a "Timed out" error message. This error can occur even if the application is running and the rest of the configuration appears to be correct.
Prerequisites
- The Kubernetes kubectl tool, or a similar tool, to connect to the cluster. To install kubectl by using Azure CLI, run the az aks install-cli command (see the example after this list).
- The Client URL (cURL) tool, or a similar command-line tool.
- The apt-get command-line tool for handling packages.
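For example, assuming that the Azure CLI is already installed and you're signed in, the setup might look like the following sketch (the resource group and cluster names are placeholders):
$ az aks install-cli # Install kubectl by using Azure CLI.
$ az aks get-credentials --resource-group <my-resource-group> --name <my-aks-cluster> # Download credentials so that kubectl can connect to the cluster.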
Symptoms
If you run the following kubectl get and cURL commands, you experience "Timed out" errors that resemble the following console output:
$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
my-deployment-66648877fc-v78jm   1/1     Running   0          5m53s
$ kubectl get service
NAME                      TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
my-loadbalancer-service   LoadBalancer   10.0.107.79   10.81.x.x     80:31048/TCP   4m14s
$ curl -Iv http://10.81.x.x # Use the IP address from the "EXTERNAL-IP" column.
* Trying 10.81.x.x:80...
* connect to 10.81.x.x port 80 failed: Timed out
* Failed to connect to 10.81.x.x port 80 after 21033 ms: Timed out
* Closing connection 0
curl: (28) Failed to connect to 10.81.x.x port 80 after 21033 ms: Timed out
Cause
If you experience the same "Timed out" error every time, it usually means that a network component is blocking the traffic.
To troubleshoot this issue, start by checking access to the pod, and then work outward toward the client in an inside-out approach.
To check the pod, run the following kubectl get and kubectl describe commands:
$ kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP            NODE
my-deployment-66648877fc-v78jm   1/1     Running   0          53s   172.25.0.93   aks-agentpool-42617579-vmss000000
$ kubectl describe pod my-deployment-66648877fc-v78jm # Specify the pod name from the previous command.
...
...
Events:
Type    Reason     Age    From               Message
----    ------     ---    ----               -------
Normal  Scheduled  117s   default-scheduler  Successfully assigned default/my-deployment-66648877fc-v78jm to aks-agentpool-42617579-vmss000000
Normal  Pulling    116s   kubelet            Pulling image "httpd"
Normal  Pulled     116s   kubelet            Successfully pulled image "httpd" in 183.532816ms
Normal  Created    116s   kubelet            Created container webserver
Normal  Started    116s   kubelet            Started container webserver
Based on this output, the pod seems to be running correctly, without any restarts.
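As an additional check that isn't shown in the preceding output, you can review the container logs to verify that the web server started without errors:
$ kubectl logs my-deployment-66648877fc-v78jm # Specify the pod name from the previous command.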
Open a test pod to check access to the application pod. Run the following kubectl get, kubectl run, apt-get, and cURL commands:
$ kubectl get pods -o wide # Get the pod IP address.
NAME                             READY   STATUS    RESTARTS   AGE     IP            NODE
my-deployment-66648877fc-v78jm   1/1     Running   0          7m45s   172.25.0.93   aks-agentpool-42617579-vmss000000
$ kubectl run -it --rm aks-ssh --image=debian:stable # Launch the test pod.
If you don't see a command prompt, try pressing enter.
$ root@aks-ssh:
$ # Install packages inside the test pod.
$ root@aks-ssh: apt-get update -y && apt-get install dnsutils -y && apt-get install curl -y
Get:1 http://deb.debian.org/debian bullseye InRelease [116 kB]
Get:2 http://deb.debian.org/debian bullseye-updates InRelease [39.4 kB]
...
...
Running hooks in /etc/ca-certificates/update.d...
done.
$ # Try to check access to the pod using the pod IP address from the "kubectl get" output.
$ curl -Iv http://172.25.0.93
* Trying 172.25.0.93:80...
* Connected to 172.25.0.93 (172.25.0.93) port 80 (#0)
...
...
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
...
...
* Connection #0 to host 172.25.0.93 left intact
The pod is accessible directly. Therefore, the application is running.
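You can also confirm that the service is wired to this pod before you examine the path from the client. A quick check, using the service name from the earlier kubectl get service output:
$ kubectl get endpoints my-loadbalancer-service # The pod IP address (172.25.0.93) should appear in the ENDPOINTS column.
$ kubectl get service my-loadbalancer-service -o jsonpath='{.spec.type}' # Prints the service type.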
The defined service is a LoadBalancer type. This means that the request flow from the end client to the pod is as follows:
Client >> Load balancer >> AKS node >> Application pod
In this request flow, the traffic can be blocked by either of the following components:
- Network policies in the cluster
- The network security group (NSG) for the AKS subnet and AKS node
To check the network policies, run the following kubectl get command:
$ kubectl get networkpolicy --all-namespaces
NAMESPACE     NAME                 POD-SELECTOR             AGE
kube-system   konnectivity-agent   app=konnectivity-agent   3h8m
Only the AKS default policy exists. Therefore, network policy doesn't seem to be blocking the traffic.
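If a nondefault policy were listed, you could inspect its ingress rules before moving on. A quick sketch (the policy name and namespace here are hypothetical):
$ kubectl describe networkpolicy <my-policy> --namespace <my-namespace> # Review the Ingress rules and the pod selector.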
To check the NSGs and their associated rules by using AKS, follow these steps:
1. In the Azure portal, search for and select Virtual machine scale sets.
2. In the list of scale set instances, select the one that you're using.
3. In the menu pane of your scale set instance, select Networking.
The Networking page for the scale set instance appears. In the Inbound port rules tab, two sets of rules are displayed that are based on the two NSGs that act on the scale set instance:
The first set is composed of NSG rules at the subnet level. These rules are displayed under the following note heading:
Network security group <my-aks-nsg> (attached to subnet: <my-aks-subnet>)
This arrangement is common if a custom virtual network and custom subnet for the AKS cluster are used. The set of rules at the subnet level might resemble the following table.
Priority   Name                            Port   Protocol   Source              Destination      Action
65000      AllowVnetInBound                Any    Any        VirtualNetwork      VirtualNetwork   Allow
65001      AllowAzureLoadBalancerInBound   Any    Any        AzureLoadBalancer   Any              Allow
65500      DenyAllInBound                  Any    Any        Any                 Any              Deny
The second set is composed of NSG rules at the network adapter level. These rules are displayed under the following note heading:
Network security group aks-agentpool-<agentpool-number>-nsg (attached to network interface: aks-agentpool-<vm-scale-set-number>-vmss)
This NSG is applied by the AKS cluster, and it's managed by AKS. The corresponding set of rules might resemble the following table.
Priority   Name                            Port   Protocol   Source              Destination      Action
500        <guid>-TCP-80-Internet          80     TCP        Internet            10.81.x.x        Allow
65000      AllowVnetInBound                Any    Any        VirtualNetwork      VirtualNetwork   Allow
65001      AllowAzureLoadBalancerInBound   Any    Any        AzureLoadBalancer   Any              Allow
65500      DenyAllInBound                  Any    Any        Any                 Any              Deny
At the network adapter level, there's an NSG inbound rule that allows TCP traffic to IP address 10.81.x.x on port 80 (the priority 500 rule in the table). However, an equivalent rule is missing from the NSG at the subnet level.
Why didn't AKS apply the rule to the custom NSG? AKS doesn't apply NSGs to its subnet, and it doesn't modify any of the NSGs that are associated with that subnet. AKS modifies NSGs only at the network adapter level. For more information, see Can I configure NSGs with AKS?.
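You can also list both sets of rules from the command line instead of the portal. The following Azure CLI sketch uses placeholder names; the AKS-managed NSG lives in the MC_* node resource group:
$ # Rules in the custom NSG at the subnet level.
$ az network nsg rule list --resource-group <my-resource-group> --nsg-name <my-aks-nsg> --include-default --output table
$ # Rules in the AKS-managed NSG at the network adapter level.
$ az network nsg rule list --resource-group <MC_my-resource-group_my-aks-cluster_location> --nsg-name aks-agentpool-<agentpool-number>-nsg --include-default --output table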
Solution
If the application is enabled for access on a certain port, you must make sure that the custom NSG allows that port as an Inbound rule. After the appropriate rule is added to the custom NSG at the subnet level, the application is accessible.
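For example, a rule that mirrors the AKS-managed rule at the network adapter level might be created as follows (the resource group, NSG name, rule name, and priority are placeholders, and 10.81.x.x represents the service's external IP address):
$ az network nsg rule create \
    --resource-group <my-resource-group> --nsg-name <my-aks-nsg> \
    --name Allow-TCP-80-Internet --priority 500 \
    --direction Inbound --access Allow --protocol Tcp \
    --source-address-prefixes Internet \
    --destination-address-prefixes 10.81.x.x \
    --destination-port-ranges 80
After the rule takes effect, the same cURL test from the client succeeds: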
$ curl -Iv http://10.81.x.x
* Trying 10.81.x.x:80...
* Connected to 10.81.x.x (10.81.x.x) port 80 (#0)
...
...
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
...
...
* Connection #0 to host 10.81.x.x left intact
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.