I think I have found a bug in the default NAT of virtual networks in combination with public IP addresses. In my setup, ICMP error messages (e.g. port unreachable or fragmentation needed) are not translated correctly for virtual machines exposed via a public IP address. Instead of the public IP address, the ICMP packets still reference the internal private IP address.
Can somebody confirm whether this is a bug, intended behavior, or unsupported for some reason?
In my case the issue "just" leads to performance issues. But I think that exposing the internal structure of an Azure setup might be a security issue.
Long story
ICMP error messages usually carry three relevant IP addresses: the source and destination of the ICMP packet itself, and the destination of the original IP packet that triggered the error, which is part of the original header copied into the ICMP payload. Since Azure translates public IP addresses to private IP addresses and back (NAT), this translation is also necessary for ICMP packets to and from the virtual machine.
I observed that source and destination are translated correctly, but the destination of the original IP packet is not changed and remains the private IP address.
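To make the three addresses concrete, here is a minimal Python sketch that rebuilds an ICMP port unreachable packet and reads the addresses back out. It uses the placeholder addresses from the example below. The helper names `ipv4_header` and `parse_icmp_error` are my own, the IP and ICMP checksums are left at 0, and the UDP length field covers only the header, so this only illustrates the packet layout and is not a capture tool.

```python
import socket
import struct

ICMP_PROTO = 1
UDP_PROTO = 17


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left at 0 for brevity)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + payload_len,   # version/IHL, TOS, total length
        0, 0,                        # identification, flags/fragment offset
        64, proto, 0,                # TTL, protocol, checksum (not computed)
        socket.inet_aton(src), socket.inet_aton(dst),
    )


def parse_icmp_error(packet: bytes) -> dict:
    """Extract the outer addresses and the embedded original addresses."""
    outer_len = (packet[0] & 0x0F) * 4      # outer IP header length in bytes
    if packet[9] != ICMP_PROTO:
        raise ValueError("not an ICMP packet")
    icmp = packet[outer_len:]
    embedded = icmp[8:]  # the original IP header follows the 8-byte ICMP header
    return {
        "outer_src": socket.inet_ntoa(packet[12:16]),
        "outer_dst": socket.inet_ntoa(packet[16:20]),
        "icmp_type": icmp[0],
        "icmp_code": icmp[1],
        "orig_src": socket.inet_ntoa(embedded[12:16]),
        "orig_dst": socket.inet_ntoa(embedded[16:20]),
    }


# Rebuild the ICMP error as it arrives at the client-side router.
udp = struct.pack("!HHHH", 54123, 6000, 8, 0)            # sport, dport, length, checksum
original = ipv4_header("9.9.9.9", "1.1.1.4", UDP_PROTO, len(udp)) + udp
icmp = struct.pack("!BBHI", 3, 3, 0, 0) + original       # type 3, code 3: port unreachable
packet = ipv4_header("2.2.2.2", "9.9.9.9", ICMP_PROTO, len(icmp)) + icmp

info = parse_icmp_error(packet)
# With a complete translation, outer_src and orig_dst would be the same address.
print(info["outer_src"], info["orig_dst"])
```

The outer source is the public address, while the embedded original destination is still the private one. That mismatch is the behavior this post is about.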
Example with unreachable port
In this example, I replaced the IP ranges for privacy reasons. So please don't be confused. ;-)
The setup is as follows:
A virtual machine is connected to a virtual network with a private subnet (1.1.1.0/24). A public IP address (2.2.2.2) is associated with the virtual network interface (1.1.1.4) of the virtual machine.
The Internet router on the client side has the public IP address 9.9.9.9. To make the issue visible, we send a UDP packet from the client side to an unreachable port, here 2.2.2.2:6000. The network security group is configured to accept packets on that port and passes the packet to the host.
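For anyone who wants to reproduce this, a probe like the one described can be sent with a few lines of Python. This is only a sketch: the function name `send_udp_probe` is hypothetical, the payload is zero bytes, and the 1450-byte size matches the length shown in the captures in this post. The address in the comment is the placeholder from this post, so substitute your own public IP.

```python
import socket


def send_udp_probe(host: str, port: int, size: int = 1450, src_port: int = 0) -> int:
    """Send one UDP datagram of `size` bytes and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if src_port:
            # Optionally pin the source port so the flow is easy to spot in tcpdump.
            sock.bind(("", src_port))
        return sock.sendto(b"\x00" * size, (host, port))


# Example call with the placeholder address from this post (not executed here):
# send_udp_probe("2.2.2.2", 6000)
```

Sending the probe once per second while tcpdump runs on both ends should produce request/reply pairs like the ones in the captures.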
On the host, we see the packet in tcpdump, and the host replies with an ICMP port unreachable packet. The destination of the original IP packet and the source of the ICMP packet are equal in this case.
13:56:38.901679 IP 9.9.9.9.54123 > 1.1.1.4.6000: UDP, length 1450
13:56:38.901735 IP 1.1.1.4 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556
13:56:39.903037 IP 9.9.9.9.54123 > 1.1.1.4.6000: UDP, length 1450
13:56:39.903088 IP 1.1.1.4 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556
13:56:40.904540 IP 9.9.9.9.54123 > 1.1.1.4.6000: UDP, length 1450
13:56:40.904598 IP 1.1.1.4 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556
On the inbound interface of the client-side router, we see that the source has changed as expected, but the destination of the original IP packet has not. It still refers to the internal private IP address within the Azure private subnet. Instead of 1.1.1.4, it should be 2.2.2.2.
14:56:38.911702 IP 2.2.2.2 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556
14:56:39.915429 IP 2.2.2.2 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556
14:56:40.914425 IP 2.2.2.2 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556
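The comparison between the two captures can also be automated. The sketch below scans tcpdump text output for ICMP unreachable lines and reports those where the ICMP source differs from the address referenced inside the ICMP message. The regular expression and the function name `untranslated_references` are my own, and the check assumes the error was generated by the destination host itself (as in this port unreachable example). For errors generated by an intermediate router, the two addresses differ legitimately.

```python
import re

# Matches lines such as: "IP 2.2.2.2 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable"
ICMP_LINE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+): "
    r"ICMP (?P<embedded>\d+\.\d+\.\d+\.\d+) "
)


def untranslated_references(lines):
    """Return (source, embedded) pairs where the two addresses differ."""
    mismatches = []
    for line in lines:
        match = ICMP_LINE.search(line)
        if match and match["src"] != match["embedded"]:
            mismatches.append((match["src"], match["embedded"]))
    return mismatches


host_capture = [
    "13:56:38.901735 IP 1.1.1.4 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556",
]
router_capture = [
    "14:56:38.911702 IP 2.2.2.2 > 9.9.9.9: ICMP 1.1.1.4 udp port 6000 unreachable, length 556",
]

print(untranslated_references(host_capture))    # consistent on the host
print(untranslated_references(router_capture))  # mismatch after NAT
```

On the host capture the list is empty, while the router capture yields the pair of public source and private referenced address.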
Other experiments
The same happens with ICMP messages regarding fragmentation. The address in the IP header is masqueraded, but the reference inside the ICMP packet remains the private IP address of the virtual network.
13:49:26.695277 IP 1.1.1.68 > 9.9.9.9: ICMP 1.1.1.68 unreachable - need to frag (mtu 1437), length 556
13:49:27.696577 IP 1.1.1.68 > 9.9.9.9: ICMP 1.1.1.68 unreachable - need to frag (mtu 1437), length 556