Since Feb 2026 we've had around 11 servers running different Windows OS versions go into an unresponsive state. The majority of them are running Server 2019, but two of them are running 2016/2025. The issue has not recurred on any of the same systems. We have a little over 1,000 virtual servers running different apps on multiple physical hosts, and only a very small portion of them have had this issue. All servers run the same AV and other management software.
When the issue occurs, the server is still pingable, but services stop working on it. One symptom we use to detect the issue is the System Center Operations Manager agent alert 'Health Service Heartbeat Failure'.
Application services running on the affected server also stop working. Remote Desktop connections fail, and if we try to access the local console, the screen stays black. If we reboot the server, once it boots back up it always shows 'Getting Windows Ready | Do Not Turn Off Your Computer'.
Each time, we check whether any updates or installations occurred on the same day as the issue, but we don't find anything. The event logs show a gap between the time the issue started and the time it was resolved.
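In case it helps anyone compare against their own systems, here is a rough sketch of how a gap like this could be located in a list of event timestamps. It is only an illustration: the function name `find_gaps`, the 30-minute threshold, and the sample timestamps are all made up, and it assumes the timestamps have already been exported from the event log (for example, the TimeCreated values returned by Get-WinEvent).

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, min_gap=timedelta(minutes=30)):
    """Return (last_event_before, first_event_after, duration) for every
    pair of consecutive events that are at least min_gap apart."""
    ordered = sorted(timestamps)
    gaps = []
    # Walk consecutive pairs of events in chronological order.
    for earlier, later in zip(ordered, ordered[1:]):
        duration = later - earlier
        if duration >= min_gap:
            gaps.append((earlier, later, duration))
    return gaps


# Hypothetical timestamps, not taken from a real server.
sample = [
    datetime(2026, 2, 10, 1, 0),
    datetime(2026, 2, 10, 1, 5),
    datetime(2026, 2, 10, 1, 12),
    datetime(2026, 2, 10, 5, 47),
    datetime(2026, 2, 10, 5, 50),
]

for start, end, duration in find_gaps(sample):
    print(f"No events between {start} and {end} (gap of {duration})")
```

The threshold is a guess and would need tuning, since a quiet but healthy server can also go a while without logging anything.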
In one case we left the server running (didn't reboot it) and the issue eventually resolved itself hours later.
These are all VMware servers. We've tried updating the VMware ESXi host software and hardware drivers, as well as updating to the newest VMware Tools version. However, the issue is still occurring at random (we had one do it this morning).