
Cluster Network Validation - fail UDP port 3343

Notes Admin 96 Reputation points
2021-01-28T13:48:49.307+00:00

The cluster network validation test fails when run, before cluster creation, on two HPE DL380 Gen10 nodes running Windows Server 2019 (LTSC 1809) with the Hyper-V role. Both nodes are fully patched, with firmware and drivers up to date. The test reports the error below:
Network interfaces s-test-01.assemblyni.gov.uk - LOM1Port1_Mgmt and s-test-02.assemblyni.gov.uk - LOM1Port 1_Mgmt are on the same cluster network, yet address 10.63.35.30 is not reachable from 10.63.35.31 using UDP on port 3343.

The problematic NICs above are 1Gbps and are used for management, RDP, etc. They are the only NICs with a default gateway set, and they are connected through a Cisco 3750 switch with no ACLs or port security configured.
Each server also has a single dual-port 25Gbps NIC; the ports are connected directly between the servers with DAC cables, as we do not yet have 25Gbps switches.
All other NICs are vNICs created on a Switch Embedded Team (SET) on each server, which uses the dual-port 25Gbps NIC.
What has been tried:

  1. The firewall has been disabled on all profiles on both servers. There are no other firewalls between the servers.
  2. Real-time monitoring has been disabled on both servers in Windows Defender, which is the only AV in use.
  3. Both servers are fully patched with HPE SPP 2020-09 and all Windows OS updates, and have been restarted several times.
  4. When I move the management NIC on one server to a different subnet, the validation test passes, but why? Also, when you create the cluster it asks for a cluster VIP address, which needs to be in the same subnet across all servers, and it only offers the management NIC subnets. I assume that is because they are the only ones with a default gateway set?
    I can find plenty of similar articles, but none that answer this scenario, and I would really appreciate any help or advice.
    Thanks
    Stu
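
On the subnet question in item 4: the wording of the error ("are on the same cluster network") suggests that validation groups interfaces by IP subnet and then tests UDP 3343 between nodes inside each group. The sketch below illustrates that grouping only; it is not the validator's actual logic. The two addresses come from the error message, while the /24 prefix length and the assignment of each address to a node are assumptions, since the post states neither. `group_by_subnet` and `cross_node_pairs` are hypothetical helper names.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(interfaces):
    """Group (node, nic_name, 'address/prefix') tuples by IP subnet."""
    networks = defaultdict(list)
    for node, nic, cidr in interfaces:
        iface = ipaddress.ip_interface(cidr)
        networks[iface.network].append((node, nic, str(iface.ip)))
    return dict(networks)


def cross_node_pairs(members):
    """Return address pairs that sit on different nodes within one subnet."""
    return [
        (a[2], b[2])
        for i, a in enumerate(members)
        for b in members[i + 1:]
        if a[0] != b[0]
    ]


# Addresses are taken from the error message. The /24 prefix and the
# node-to-address mapping are assumptions made for illustration.
interfaces = [
    ("s-test-01", "LOM1Port1_Mgmt", "10.63.35.30/24"),
    ("s-test-02", "LOM1Port1_Mgmt", "10.63.35.31/24"),
]

for network, members in group_by_subnet(interfaces).items():
    print(network, cross_node_pairs(members))
```

Under these assumptions, moving one management NIC to another subnet leaves no cross-node pair in any group. That would be consistent with the test passing after the change: the failing path would no longer be exercised, rather than being fixed.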
Windows for business | Windows Server | Storage high availability | Clustering and high availability

Answer accepted by question author
  1. Notes Admin 96 Reputation points
    2021-02-05T10:33:40.72+00:00

    For anyone interested, I found a solution, but I cannot tell you why it works.
    Instead of using a single physical onboard network port, I teamed two of the onboard 1Gbps network adapters, created a virtual NIC on the team, and used it for management traffic on both server nodes, and whatya know, it flamin worked!? But WHY?
    So I don't know whether this is a failover cluster requirement, or why I couldn't create the cluster when using a single physical network port for management traffic. Specifically, the problem was the failure to communicate over UDP on port 3343.
    I have not read any article warning that a single physical port is unsupported, or stating that a prerequisite for a Windows Server 2019 Hyper-V cluster is resilient virtual NICs for management traffic.
    If anyone is able to explain this, please feel free to enlighten me and/or others :-)

    To finish, I have to thank MIco, who enlightened me on the multi-subnet cluster articles.
    I would also like to thank Romain Serre, whose article made me think to try using vNICs for management.
    https://www.tech-coffee.net/2-node-hyperconverged-cluster-with-windows-server-2016/#comment-3732
    I also found this article useful:
    https://social.technet.microsoft.com/Forums/windowsserver/en-US/c3e15170-2a83-48a8-b671-efc2a9afe4cf/s2d-cluster-validation-fails-firewall-and-udp-port-3343


9 additional answers

  1. kumar kaushal 176 Reputation points Microsoft Employee
    2021-02-08T08:51:33.563+00:00

    The network validation test in a cluster does two things:

    1) It runs a ping test.
    2) It sends heartbeat packets over port 3343 on each interface. That is why the network validation test shows either 0% or 100% packet loss, depending on whether the interface is reachable.
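
To check independently whether plain UDP datagrams can cross between two addresses, a rough probe along the following lines may help. It is only a sketch: it sends ordinary datagrams to a small echo responder and reports the percentage that went unanswered. It does not speak the cluster heartbeat protocol and does not reproduce the validation test. The function names are hypothetical, and the demo uses loopback with an OS-chosen port; between real nodes you would run the responder on one node and the probe on the other, with port 3343, assuming nothing else is already bound to that port.

```python
import socket
import threading


def run_echo_server(sock, stop):
    """Echo every datagram back to its sender until stop is set."""
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def udp_loss_percent(host, port, count=10, timeout=0.5):
    """Send count datagrams and return the percentage left unanswered."""
    received = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for i in range(count):
            payload = f"probe-{i}".encode()
            s.sendto(payload, (host, port))
            try:
                data, _ = s.recvfrom(2048)
                if data == payload:
                    received += 1
            except OSError:
                pass  # a timeout or a port-unreachable error counts as a loss
    return 100.0 * (count - received) / count


def loopback_demo():
    """Run the responder and the probe on 127.0.0.1 with an OS-chosen port."""
    stop = threading.Event()
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    port = server.getsockname()[1]
    worker = threading.Thread(target=run_echo_server, args=(server, stop))
    worker.start()
    try:
        return udp_loss_percent("127.0.0.1", port, count=5)
    finally:
        stop.set()
        worker.join()
        server.close()
```

If a probe like this shows loss between the two management addresses while ping succeeds, that would point at something on the path treating UDP differently from ICMP, which narrows where to look.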

    Cluster heartbeats use RCP_request and RCP_response packets, which you can see in a network capture. RCP here stands for Route Control Protocol.

    The problem is that you won't be able to decode these in network traces, because that requires a parser that is not built into Network Monitor.

    The enhancement is from the 2012 OS onwards.
