Windows Server 2019 - Failover Cluster Manager - Event ID: 1069, 5120 & 5217

Youssef Kassem 0 Reputation points
2024-04-02T08:36:51.1466667+00:00

Cluster resource 'Virtual Machine AUS-Helpdesk' of type 'Virtual Machine' in clustered role 'AUS-Helpdesk' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.


'Virtual Machine Configuration AUS-Helpdesk' failed to unregister the virtual machine configuration during the initialization of the resource: The wait operation timed out. (0x00000102).


Cluster Events:

Cluster Shared Volume 'Volume4' ('Cluster Disk 5') is no longer available on this node because of 'STATUS_IO_TIMEOUT(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished.


Software snapshot creation on Cluster Shared Volume(s) ('\\?\Volume{c566e925-0762-4ab3-b05c-ef1db83ad228}') with snapshot set id '{07f1675d-cf74-4052-969b-1ed3712e3531}' failed with error 'HrError(0x80042308)(2147754760)'. Please check the state of the CSV resources and the system events of the resource owner nodes.


Please help us to resolve the issue!

Hyper-V
Windows Server Clustering

2 answers

  1. Ian Xue 37,706 Reputation points Microsoft Vendor
    2024-04-05T08:42:31.3366667+00:00

    Hi Youssef Kassem,

    Thanks for your update. Based on my research, Event ID 5120 indicates that there has been an interruption to communication between a cluster node and a volume in Cluster Shared Volumes (CSV). This interruption may be short enough that it isn't noticeable or long enough that it interferes with services and applications using the volume. If the interruption persists, review other events in the System or Application event logs for information about communication between the node and the volume.

    The status code STATUS_IO_TIMEOUT (c00000b5) indicates that a redirected file system I/O operation took longer than the time allowed. The timeout is two minutes for synchronous operations and four minutes for asynchronous operations.
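
As a rough illustration of how to read the codes quoted in this thread, the sketch below splits an NTSTATUS value such as c00000b5 and an HRESULT such as 0x80042308 into their severity, facility, and code fields. It assumes the standard published bit layouts for both types. The function names are made up for this example and are not part of any Windows API.

```python
# Hypothetical helpers for illustration only; not a Windows or cluster API.
NTSTATUS_SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS into severity, facility, and code."""
    return {
        "severity": NTSTATUS_SEVERITY[(value >> 30) & 0x3],  # bits 31-30
        "facility": (value >> 16) & 0xFFF,                   # bits 27-16
        "code": value & 0xFFFF,                              # bits 15-0
    }


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into failure flag, facility, and code."""
    return {
        "failed": bool((value >> 31) & 0x1),  # bit 31 set means failure
        "facility": (value >> 16) & 0x7FF,    # bits 26-16
        "code": value & 0xFFFF,               # bits 15-0
    }


if __name__ == "__main__":
    print(decode_ntstatus(0xC00000B5))  # STATUS_IO_TIMEOUT from Event ID 5120
    print(decode_hresult(0x80042308))   # HrError from the snapshot failure
    print(0x80042308)                   # decimal form of the same HRESULT
```

The decimal form of 0x80042308 is 2147754760, which is the second number shown in the snapshot error, so both values in that message refer to the same HRESULT.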

    The cause of the timeout varies. It might indicate a software, configuration, or hardware problem.

    1. Check your system event log for events that indicate network connectivity problems, host bus adapter (HBA) problems, or disk problems.
    2. Make sure that the affected system has the latest versions of network and storage drivers and firmware installed. Additionally, make sure that Microsoft updates and hotfixes are up to date. In particular, on systems running OS build 14393 (Windows Server 2016), make sure that the October 18, 2018 update KB4462928 (OS Build 14393.2580) is installed. This update addresses an issue that occurs when a node is restarted after being drained: Event ID 5120 appears in the log with a "STATUS_IO_TIMEOUT c00000b5" message, which may slow or stop input and output (I/O) to the VMs and can sometimes cause nodes to drop out of cluster membership.
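
One way to sanity-check the update level from step 2 is to compare a node's OS build string against the build quoted above (14393.2580). The sketch below is only an illustration: the helper names are hypothetical, and it assumes updates are cumulative, so a higher revision on the same build includes the earlier fix. A build from a different release line (for example 17763, which is Windows Server 2019) is reported as not comparable rather than as patched or unpatched.

```python
from typing import Optional, Tuple


def parse_build(build: str) -> Tuple[int, int]:
    """Split an OS build string such as '14393.2580' into (build, revision)."""
    major, _, revision = build.strip().partition(".")
    return int(major), int(revision or 0)


def is_at_least(installed: str, required: str = "14393.2580") -> Optional[bool]:
    """Return True/False when both builds share a release line, else None."""
    inst_build, inst_rev = parse_build(installed)
    req_build, req_rev = parse_build(required)
    if inst_build != req_build:
        return None  # different release line, so this update does not apply
    return inst_rev >= req_rev


if __name__ == "__main__":
    # The build strings below are made-up inputs, not values from this thread.
    for sample in ("14393.2580", "14393.2000", "17763.5576"):
        print(sample, is_at_least(sample))
```

A result of None is a hint to look up the latest cumulative update for that node's own OS version instead of the KB named above.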

    Reference: Event ID 5120 Cluster Shared Volume troubleshooting guidance - Windows Server | Microsoft Learn

    Best Regards,

    Ian Xue



  2. Alex Bykovskyi 2,166 Reputation points
    2024-04-08T17:23:28.4533333+00:00

    Hey,

    As mentioned, your issue might be related to storage connectivity. Check that multipathing is configured properly and that the nodes can reach the storage over all paths.
    https://www.dell.com/support/kbdoc/en-us/000138971/cluster-storage-loses-access-to-cluster-shared-volumes

    In addition, make sure that your hosts are fully patched.
    If you need shared storage, StarWind VSAN would be a great option. It provides replicated shared storage for failover clusters to maximize VM uptime.
    https://www.starwindsoftware.com/starwind-virtual-san

    Cheers,

    Alex Bykovskyi

    StarWind Software

    Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

