Multiple Random Windows Servers have freezing issue

Ramagost, Paul 0 Reputation points
2026-04-29T12:51:09.54+00:00

Since February 2026, around 11 of our servers, running different Windows OS versions, have gone into an unresponsive state. Most of them run Server 2019, but two run 2016/2025. The issue does not recur on the same systems. We have a little over 1,000 virtual servers running different apps on multiple physical hosts, and only a very small portion have had this issue. All servers run the same AV and other management software.

When the issue occurs, the server still responds to ping, but its services stop working. A symptom we use to detect the issue is the System Center Operations Manager agent alert 'Health Service Heartbeat Failure'.

Application services running on the affected servers also stop working. Remote Desktop connections fail, and the local console shows only a black screen. If we reboot the server, it always displays 'Getting Windows Ready | Do Not Turn Off Your Computer' as it boots back up.

We check whether any updates or installations occurred on the same day as the issue but find nothing. The event logs show a gap between the time the issue started and the time it was resolved.

In one case we left the server running (didn't reboot it) and the issue eventually resolved itself hours later.

These are all VMware VMs. We've tried updating the VMware ESXi host software and hardware drivers, as well as updating to the newest VMware Tools version. However, the issue still occurs randomly (one server did it this morning).

Windows for business | Windows Server | Performance | System performance

2 answers

  1. Scott Nguyen 1,470 Reputation points Independent Advisor
    2026-04-29T13:37:47.45+00:00

    Hello. Since the servers remain pingable but the UI and services hang, my guess is that the TrustedInstaller.exe or TiWorker.exe process has locked the file system or registry hives while performing cleanup, which could cause the black screen and the gap in the event logs.

    You should check %windir%\Logs\CBS\CBS.log for entries that align with the SCOM heartbeat failure, specifically looking for worker threads that hang during "Component Cleanup" or "Registry Hive" compaction.
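
As a rough illustration of lining CBS.log entries up with an incident time, here is a minimal Python sketch. It assumes each CBS.log entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which is worth verifying against your own log, and the function name `cbs_lines_near` is a hypothetical helper, not part of any Windows tooling.

```python
from datetime import datetime, timedelta

# Assumed prefix of a CBS.log entry, e.g. "2026-04-29 12:40:00, Info  CBS  ..."
CBS_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def cbs_lines_near(lines, incident_time, window_minutes=30):
    """Return log lines stamped within +/- window_minutes of incident_time."""
    window = timedelta(minutes=window_minutes)
    matches = []
    for line in lines:
        try:
            when = datetime.strptime(line[:19], CBS_TIMESTAMP_FORMAT)
        except ValueError:
            continue  # continuation line or a format this sketch does not know
        if abs(when - incident_time) <= window:
            matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # Hypothetical path to a copy of the log taken from an affected server.
    with open("CBS.log", encoding="utf-8", errors="replace") as handle:
        for hit in cbs_lines_near(handle, datetime(2026, 4, 29, 12, 51)):
            print(hit)
```

CBS.log is normally written in the server's local time, so convert the SCOM alert time to the same zone before comparing.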

    To prevent further occurrences, I suggest temporarily disabling the "StartComponentCleanup" task under Task Scheduler Library\Microsoft\Windows\Servicing to see whether the issue continues to happen.


  2. AI answer

    2026-04-29T12:51:42.7633333+00:00

    The behavior matches a generic Windows freeze/hang on virtualized servers. Use the Windows freeze troubleshooting guidance and, critically, collect a memory dump while a server is in the frozen state so the root cause can be identified.

    Recommended approach:

    1. Baseline and identify the pattern
    • Confirm for each affected VM:
      • Hypervisor/host (ESXi version, host name) and whether multiple affected VMs share the same host.
      • Windows version and patch level.
      • What was happening at the time (backup, AV scan, patching, high I/O, etc.).
    • Note that the gap in Event Logs and the “Getting Windows ready” message after reboot are typical of a system that was hung while Windows was attempting to complete operations (for example, servicing or shutdown-related work) and then resumes/finishes after recovery.
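
One way to look for a shared factor across the affected VMs is to tally the incident records by each attribute. The sketch below is only an illustration: the record fields (`host`, `os`, `activity`) and every value in the sample list are made-up placeholders, not data from this thread.

```python
from collections import Counter


def tally(incidents, key):
    """Count incidents per value of `key`, most frequent first."""
    return Counter(record[key] for record in incidents).most_common()


if __name__ == "__main__":
    # Placeholder records; replace with the real incident list.
    incidents = [
        {"vm": "vm-a", "host": "esx-01", "os": "2019", "activity": "backup"},
        {"vm": "vm-b", "host": "esx-02", "os": "2019", "activity": "unknown"},
        {"vm": "vm-c", "host": "esx-01", "os": "2016", "activity": "backup"},
    ]
    for field in ("host", "os", "activity"):
        print(field, tally(incidents, field))
```

With only about 11 incidents out of more than 1,000 VMs, a skew toward one host or one activity is a lead to investigate rather than proof of a cause.
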
    1. Use the official freeze troubleshooting workflow
    Follow the “Advanced troubleshooting for Windows freezes” guidance:
    • Determine scope and frequency:
      • Which computers are freezing (specific VMs/hosts or random)?
      • What operation is happening when they freeze (backup, patching, heavy load)?
      • How often and on how many computers this occurs.
      This is explicitly recommended in the “Identify the problem” section.
    • While a VM is frozen but still running:
      • Attempt RDP (already fails in this case, which is a data point).
      • Test ping (already succeeds).
      • Try administrative access (Computer Management, Server Manager, administrative shares like \\ServerName\c$).
      • Try Ctrl+Alt+Delete on the console and note whether there is any response.
      These steps help distinguish between full kernel hangs, user-mode deadlocks, or storage/network stalls.
    • After the VM recovers:
      • Review System and Application logs around the time of the freeze for:
        • Application Error events (crashes of critical processes).
        • Service Control Manager errors for critical services.
        • Error Event IDs 2019/2020 with source Srv/Server (resource exhaustion).
      • Run perfmon /report to generate a System Diagnostics report for each affected server and compare for common bottlenecks.
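
To pin down the freeze window from the event log gap, the largest interval between consecutive events can be computed from exported timestamps. This is a minimal sketch that assumes the event times have already been parsed into `datetime` objects (for example from a CSV export); `largest_gap` is a hypothetical helper name.

```python
from datetime import datetime


def largest_gap(timestamps):
    """Return (start, end, duration) of the longest silence between events.

    Returns None when fewer than two timestamps are supplied.
    """
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None
    start, end = max(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])
    return start, end, end - start


if __name__ == "__main__":
    # Placeholder event times; replace with timestamps from the System log.
    events = [
        datetime(2026, 4, 29, 3, 10),
        datetime(2026, 4, 29, 3, 12),
        datetime(2026, 4, 29, 9, 45),
        datetime(2026, 4, 29, 9, 46),
    ]
    print(largest_gap(events))
```

The start of the gap is a reasonable first estimate of when the hang began, and it can be compared with the SCOM heartbeat alert time and host-side logs.
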
    1. Prepare to capture a complete memory dump
    Because the issue is intermittent and not easily reproducible, configure the VMs now so that the next occurrence produces a memory dump that can be analyzed.
    • Configure complete memory dump on each affected VM:
      1. Run sysdm.cpl → Advanced → Performance → Settings → Advanced → Change to verify virtual memory.
      2. In System Properties, go to Advanced → Startup and Recovery → Settings.
      3. Under “Write debugging information”, select Complete memory dump.
      4. Select Overwrite any existing file.
      5. Ensure there is a paging file on the system drive at least 100 MB larger than installed RAM.
      6. Ensure the system drive has more free space than the physical RAM size.
    • Enable keyboard-triggered crash (CrashOnCtrlScroll) inside each VM so a dump can be forced when the hang occurs:
      1. In Registry Editor, create CrashOnCtrlScroll (REG_DWORD = 1) under both:
        • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters
        • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters
      2. Restart the VM.
      3. When the VM exhibits the freeze, hold Right Ctrl and press Scroll Lock twice to force a crash and generate %SystemRoot%\MEMORY.DMP.
      This is the recommended method for capturing a dump on a frozen Windows system.
    • If system drive space is constrained on some servers, configure a dedicated dump file volume using the DedicatedDumpFile registry entry as described in “Configure the destination path for a memory dump” and “How to use the DedicatedDumpFile registry value to overcome space limitations on the system drive”.
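
The sizing rules and registry values in this step can be sketched in Python as follows. This is an illustration under stated assumptions: `dump_readiness` and `crash_on_ctrl_scroll_reg` are hypothetical helper names, the sizes are passed in by the caller rather than read from the system, and the second function only produces .reg file text for review and does not change the registry.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Registry keys named in the step above.
CRASH_KEYS = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters",
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters",
)


def dump_readiness(ram_bytes, pagefile_bytes, free_bytes):
    """Return a list of problems that could prevent a complete memory dump."""
    problems = []
    if pagefile_bytes < ram_bytes + 100 * MB:
        problems.append("paging file is smaller than installed RAM + 100 MB")
    if free_bytes <= ram_bytes:
        problems.append("system drive free space does not exceed installed RAM")
    return problems


def crash_on_ctrl_scroll_reg():
    """Build .reg file text that sets CrashOnCtrlScroll = 1 under both keys."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key in CRASH_KEYS:
        lines += [f"[{key}]", '"CrashOnCtrlScroll"=dword:00000001', ""]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Placeholder sizes for a 16 GB VM.
    print(dump_readiness(16 * GB, 16 * GB, 20 * GB))
    print(crash_on_ctrl_scroll_reg())
```

Running the readiness check across all VMs ahead of time helps avoid finding out during the next freeze that the dump could not be written.
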
    1. VM-specific data collection
    Since all affected systems are VMware VMs, also prepare a hypervisor-based capture path:
    • Use VMware snapshots or suspend state and convert to a dump:
      • Take a snapshot or suspend the VM while it is frozen.
      • Use VMware’s Checkpoint To Core Tool (vmss2core) to convert .vmss or .vmsn to a Windows-compatible dump file.
      • Analyze the dump with standard Windows debugging tools or provide it to Microsoft Support.

    This is explicitly recommended for VMware environments and is useful if the in-guest CrashOnCtrlScroll cannot be triggered.
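
For the conversion step, the sketch below only assembles a vmss2core command line so it can be reviewed before anyone runs it. The `-W` and `-W8` switches and the `.vmss`/`.vmem` argument order are taken from memory of VMware's vmss2core documentation and should be treated as assumptions to verify for your tool version; the file names are hypothetical.

```python
from pathlib import PureWindowsPath


def vmss2core_command(state_file, memory_file, modern_windows=True):
    """Build (not run) a vmss2core argument list.

    Assumption: -W8 targets Windows 8 / Server 2012 and later guests,
    and -W targets older Windows guests.
    """
    flag = "-W8" if modern_windows else "-W"
    return ["vmss2core.exe", flag, str(state_file), str(memory_file)]


if __name__ == "__main__":
    # Hypothetical files copied from the datastore of a suspended VM.
    folder = PureWindowsPath(r"D:\captures\vm-a")
    print(" ".join(vmss2core_command(folder / "vm-a.vmss", folder / "vm-a.vmem")))
```

Keeping this as a reviewed command string rather than an automated call leaves room to adjust the switches once they are checked against the VMware article that ships with the tool.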

    1. Analyze and remediate
    • Once a dump is captured from a frozen state, analyze it (or provide to Microsoft Support) to determine whether the root cause is:
      • Kernel-mode driver deadlock (for example, storage, network, filter drivers, backup agents, AV, or VMware tools components).
      • Resource exhaustion (nonpaged pool, handles, etc.).
      • A specific Windows component or third-party service.
    • Based on the dump findings, typical next steps include:
      • Updating or removing problematic drivers/agents.
      • Adjusting backup or AV scanning schedules or methods.
      • Addressing storage or network latency issues at the hypervisor or SAN level.
    1. If the VM is frozen and cannot be interacted with
    • If the VM is completely unresponsive to keyboard input, use the hypervisor to trigger a dump:
      • For Hyper-V, the guidance is to use Debug-VM -InjectNonMaskableInterrupt; in VMware, use the documented snapshot/suspend plus vmss2core method.
    1. Engage support with collected data
    • Open a support case with Microsoft and/or VMware so they can correlate guest and host behavior, and include:
      • A complete memory dump from a frozen state.
      • System/Application event logs.
      • perfmon /report output.
      • VMware host logs around the same time.

    This workflow is designed specifically for intermittent freeze issues like those described and is applicable to Windows Server 2016/2019/2025 running on VMware.


    AI-generated content may be incorrect. Read our transparency notes for more information.
