Capture a TCP dump from a Windows node in an AKS cluster

Networking issues may occur when you're using a Microsoft Azure Kubernetes Service (AKS) cluster. To help investigate these issues, this article explains how to capture a TCP dump from a Windows node in an AKS cluster, and then download the capture to your local machine.

Prerequisites

Step 1: Find the nodes to troubleshoot

How do you determine which node to pull the TCP dump from? You first get the list of nodes in the AKS cluster using the Kubernetes command-line client, kubectl. Follow the instructions to connect to the cluster and run the kubectl get nodes --output wide command using the Azure portal or Azure CLI. A node list that's similar to the following output appears:

$ kubectl get nodes --output wide
NAME                                STATUS   ROLES   AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION     CONTAINER-RUNTIME
akswin000000                        Ready    agent   3m8s    v1.20.9   10.240.0.4     <none>        Windows Server 2019 Datacenter   10.0.17763.2237    docker://20.10.6
akswin000001                        Ready    agent   3m50s   v1.20.9   10.240.0.115   <none>        Windows Server 2019 Datacenter   10.0.17763.2237    docker://20.10.6
akswin000002                        Ready    agent   3m32s   v1.20.9   10.240.0.226   <none>        Windows Server 2019 Datacenter   10.0.17763.2237    docker://20.10.6

Step 2: Connect to a Windows node

The next step is to establish a connection to the AKS cluster node. You authenticate either using a Secure Shell (SSH) key, or using the Windows admin password in a Remote Desktop Protocol (RDP) connection. Both methods require creating an intermediate connection, because you can't currently connect directly to the AKS Windows node. Whether you connect to a node through SSH or RDP, you need to specify the user name for the AKS nodes. By default, this user name is azureuser. Besides using an SSH or RDP connection, you can connect to a Windows node from the HostProcess container.

  1. Create hostprocess.yaml with the following content. Replace AKSWINDOWSNODENAME with the AKS Windows node name.

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        pod: hpc
      name: hpc
    spec:
      securityContext:
        windowsOptions:
          hostProcess: true
          runAsUserName: "NT AUTHORITY\\SYSTEM"
      hostNetwork: true
      containers:
        - name: hpc
          image: mcr.microsoft.com/windows/servercore:ltsc2022 # Use servercore:1809 for WS2019
          command:
            - powershell.exe
            - -Command
            - "Start-Sleep 2147483"
          imagePullPolicy: IfNotPresent
      nodeSelector:
        kubernetes.io/os: windows
        kubernetes.io/hostname: AKSWINDOWSNODENAME
      tolerations:
        - effect: NoSchedule
          key: node.kubernetes.io/unschedulable
          operator: Exists
        - effect: NoSchedule
          key: node.kubernetes.io/network-unavailable
          operator: Exists
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
    
  2. Run the kubectl apply -f hostprocess.yaml command to deploy the Windows HostProcess container in the specified Windows node.

  3. Run the kubectl exec -it [HPC-POD-NAME] -- powershell command.

  4. Run any PowerShell commands inside the HostProcess container to access the Windows node.

    Note

    To access the files in the Windows node, switch the root folder to C:\ inside the HostProcess container.

Step 3: Create a packet capture

When you're connected to the Windows node through SSH or RDP, or from the HostProcess container, a form of the Windows command prompt appears:

azureuser@akswin000000 C:\Users\azureuser>

Now open a command prompt and enter the Network Shell (netsh) command below for capturing traces (netsh trace start). This command starts the packet capture process.

netsh trace start capture=yes tracefile=C:\temp\AKS_node_name.etl

Output appears that's similar to the following text:

Trace configuration:
-------------------------------------------------------------------
Status:             Running
Trace File:         AKS_node_name.etl
Append:             Off
Circular:           On
Max Size:           250 MB
Report:             Off

While the trace is running, replicate your issue many times. This action ensures the issue has been captured within the TCP dump. Note the time stamp while you replicate the issue. To stop the packet capture when you're done, enter netsh trace stop:

azureuser@akswin000000 C:\Users\azureuser>netsh trace stop
Merging traces ... done
Generating data collection ... done
The trace file and additional troubleshooting information have been compiled as "C:\temp\AKS_node_name.cab".
File location = C:\temp\AKS_node_name.etl
Tracing session was successfully stopped.

Step 4: Transfer the capture locally

After you complete the packet capture, identify the HostProcess pod so that you can copy the dump locally.

  1. On your local machine, open a second console, and then get a list of pods by running the kubectl get pods command:

    kubectl get pods
    NAME                                                    READY   STATUS    RESTARTS   AGE
    azure-vote-back-6c4dd64bdf-m4nk7                        1/1     Running   2          3d21h
    azure-vote-front-85b4df594d-jhpzw                       1/1     Running   2          3d21h
    hpc                                                     1/1     Running   0          3m58s
    

    The HostProcess pod's default name is hpc, as shown in the third line.

  2. Copy the TCP dump files locally by running the following commands. Replace the pod name with hpc.

    kubectl cp -n default hpc:/temp/AKS_node_name.etl ./AKS_node_name.etl
    tar: Removing leading '/' from member names
    kubectl cp -n default hpc:/temp/AKS_node_name.etl ./AKS_node_name.cab
    tar: Removing leading '/' from member names
    

    The .etl and .cab files will now be present in your local directory.

Contact us for help

If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.