Capture a TCP dump from a Windows node in an AKS cluster
Networking issues may occur when you're using a Microsoft Azure Kubernetes Service (AKS) cluster. To help investigate these issues, this article explains how to capture a TCP dump from a Windows node in an AKS cluster, and then download the capture to your local machine.
Prerequisites
- Azure CLI, version 2.0.59 or later. You can open Azure Cloud Shell in the web browser to enter Azure CLI commands. Or install or upgrade Azure CLI on your local machine. To find the version that's installed on your machine, run az --version.
- An AKS cluster. If you don't have an AKS cluster, create one using Azure CLI or through the Azure portal.
Step 1: Find the nodes to troubleshoot
How do you determine which node to pull the TCP dump from? You first get the list of nodes in the AKS cluster using the Kubernetes command-line client, kubectl. Follow the instructions to connect to the cluster and run the kubectl get nodes --output wide command using the Azure portal or Azure CLI. A node list that's similar to the following output appears:
$ kubectl get nodes --output wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
akswin000000 Ready agent 3m8s v1.20.9 10.240.0.4 <none> Windows Server 2019 Datacenter 10.0.17763.2237 docker://20.10.6
akswin000001 Ready agent 3m50s v1.20.9 10.240.0.115 <none> Windows Server 2019 Datacenter 10.0.17763.2237 docker://20.10.6
akswin000002 Ready agent 3m32s v1.20.9 10.240.0.226 <none> Windows Server 2019 Datacenter 10.0.17763.2237 docker://20.10.6
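On a cluster that mixes Linux and Windows node pools, it can help to narrow the list to Windows nodes only. The following is a minimal sketch, not part of the original procedure: list_windows_nodes is a hypothetical helper that reads the wide node listing on standard input and assumes the column order shown above (NAME first, INTERNAL-IP sixth).

```shell
# Print the NAME and INTERNAL-IP of every node whose line mentions "Windows"
# in `kubectl get nodes --output wide` text supplied on standard input.
# Assumes NAME is column 1 and INTERNAL-IP is column 6, as in the sample output.
list_windows_nodes() {
  awk 'NR > 1 && /Windows/ { print $1, $6 }'
}

# Usage (requires a configured kubectl context):
#   kubectl get nodes --output wide | list_windows_nodes
```

Alternatively, because nodes carry the kubernetes.io/os label that the HostProcess manifest in Step 2 selects on, running kubectl get nodes --selector kubernetes.io/os=windows --output wide should return only the Windows nodes.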
Step 2: Connect to a Windows node
The next step is to establish a connection to the AKS cluster node. You authenticate either using a Secure Shell (SSH) key, or using the Windows admin password in a Remote Desktop Protocol (RDP) connection. Both methods require creating an intermediate connection, because you can't currently connect directly to the AKS Windows node. Whether you connect to a node through SSH or RDP, you need to specify the user name for the AKS nodes. By default, this user name is azureuser. Besides using an SSH or RDP connection, you can connect to a Windows node from the HostProcess container.
Create hostprocess.yaml with the following content. Replace AKSWINDOWSNODENAME with the AKS Windows node name.
apiVersion: v1
kind: Pod
metadata:
  labels:
    pod: hpc
  name: hpc
spec:
  securityContext:
    windowsOptions:
      hostProcess: true
      runAsUserName: "NT AUTHORITY\\SYSTEM"
  hostNetwork: true
  containers:
    - name: hpc
      image: mcr.microsoft.com/windows/servercore:ltsc2022 # Use servercore:1809 for WS2019
      command:
        - powershell.exe
        - -Command
        - "Start-Sleep 2147483"
      imagePullPolicy: IfNotPresent
  nodeSelector:
    kubernetes.io/os: windows
    kubernetes.io/hostname: AKSWINDOWSNODENAME
  tolerations:
    - effect: NoSchedule
      key: node.kubernetes.io/unschedulable
      operator: Exists
    - effect: NoSchedule
      key: node.kubernetes.io/network-unavailable
      operator: Exists
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
Run the kubectl apply -f hostprocess.yaml command to deploy the Windows HostProcess container in the specified Windows node.
Run the kubectl exec -it [HPC-POD-NAME] -- powershell command.
Run any PowerShell commands inside the HostProcess container to access the Windows node.
Note
To access the files in the Windows node, switch the root folder to C:\ inside the HostProcess container.
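The placeholder substitution in the HostProcess steps above can also be scripted instead of edited by hand. The sketch below is only one way to do it: render_hostprocess_manifest is a hypothetical helper, and hostprocess.template.yaml is an assumed name for a copy of the manifest that still contains the AKSWINDOWSNODENAME placeholder.

```shell
# Replace every AKSWINDOWSNODENAME placeholder in a manifest read from
# standard input with the node name given as the first argument.
render_hostprocess_manifest() {
  node_name=$1
  sed "s/AKSWINDOWSNODENAME/${node_name}/g"
}

# Usage (hostprocess.template.yaml is an assumed file name):
#   render_hostprocess_manifest akswin000000 < hostprocess.template.yaml > hostprocess.yaml
#   kubectl apply -f hostprocess.yaml
#   kubectl wait --for=condition=Ready pod/hpc --timeout=120s
#   kubectl exec -it hpc -- powershell
```

The kubectl wait line is optional. It blocks until the hpc pod reports Ready, which avoids running kubectl exec while the container image is still being pulled.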
Step 3: Create a packet capture
When you're connected to the Windows node through SSH or RDP, or from the HostProcess container, a form of the Windows command prompt appears:
azureuser@akswin000000 C:\Users\azureuser>
Now open a command prompt and enter the Network Shell (netsh) command below for capturing traces (netsh trace start). This command starts the packet capture process.
netsh trace start capture=yes tracefile=C:\temp\AKS_node_name.etl
Output appears that's similar to the following text:
Trace configuration:
-------------------------------------------------------------------
Status: Running
Trace File: AKS_node_name.etl
Append: Off
Circular: On
Max Size: 250 MB
Report: Off
While the trace is running, reproduce your issue several times to make sure that it's captured within the TCP dump. Note the time stamp each time you reproduce the issue. To stop the packet capture when you're done, enter netsh trace stop:
azureuser@akswin000000 C:\Users\azureuser>netsh trace stop
Merging traces ... done
Generating data collection ... done
The trace file and additional troubleshooting information have been compiled as "C:\temp\AKS_node_name.cab".
File location = C:\temp\AKS_node_name.etl
Tracing session was successfully stopped.
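When you capture from several nodes, naming each trace file after its node keeps the captures apart. The following sketch, run on your local machine, only prints the netsh command to enter at the node's prompt. build_trace_start is a hypothetical helper. The maxsize option is netsh's maximum trace file size in MB, and its default of 250 matches the Max Size value in the output above.

```shell
# Print the netsh command that starts a capture named after the given node.
# The optional second argument is the maximum trace size in MB (default 250).
build_trace_start() {
  node_name=$1
  max_size_mb=${2:-250}
  printf 'netsh trace start capture=yes tracefile=C:\\temp\\%s.etl maxsize=%s\n' \
    "$node_name" "$max_size_mb"
}

# Usage:
#   build_trace_start akswin000000 500
```

Because the default trace is circular, a larger maxsize value retains more history before older packets are overwritten.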
Step 4: Transfer the capture locally
After you complete the packet capture, identify the HostProcess pod so that you can copy the dump locally.
On your local machine, open a second console, and then get a list of pods by running the kubectl get pods command:
kubectl get pods
NAME READY STATUS RESTARTS AGE
azure-vote-back-6c4dd64bdf-m4nk7 1/1 Running 2 3d21h
azure-vote-front-85b4df594d-jhpzw 1/1 Running 2 3d21h
hpc 1/1 Running 0 3m58s
The HostProcess pod's default name is hpc, as shown in the third line.
Copy the TCP dump files locally by running the following commands. Use hpc as the pod name.
kubectl cp -n default hpc:/temp/AKS_node_name.etl ./AKS_node_name.etl
tar: Removing leading '/' from member names
kubectl cp -n default hpc:/temp/AKS_node_name.cab ./AKS_node_name.cab
tar: Removing leading '/' from member names
The .etl and .cab files will now be present in your local directory.
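A quick check that both files arrived and aren't empty can catch a failed copy before you delete anything on the node. This is a minimal sketch, not part of the original procedure. check_capture_files is a hypothetical helper that takes the file path without its extension.

```shell
# Succeed only if both <base>.etl and <base>.cab exist and are non-empty.
check_capture_files() {
  base=$1
  for ext in etl cab; do
    if [ ! -s "${base}.${ext}" ]; then
      echo "missing or empty: ${base}.${ext}" >&2
      return 1
    fi
  done
}

# Usage:
#   check_capture_files ./AKS_node_name && echo "capture files look complete"
```

After the files are verified, you can optionally remove the HostProcess pod by running kubectl delete -f hostprocess.yaml.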
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to the Azure feedback community.