Hyper-V cluster on Windows Server 2012 R2: both nodes restarted at the same time

Vajram Gajengi 1 Reputation point
2020-10-17T05:06:10.18+00:00

Hi Team,

We have a Hyper-V cluster running Windows Server 2012 R2, and both servers restarted within one minute of each other. We found Event ID 41, which indicates that the servers lost power unexpectedly, and the .dmp file shows a bug check that references CLASSPNP.SYS, but we still have not found the root cause.

Could you please help with this?

Regards

Vajram Gajengi

Windows Server Clustering

2 answers

  1. TimCerling(ret) 1,156 Reputation points
    2020-10-17T23:14:08.987+00:00

    You say it lost power. Why are you not calling that a root cause?


  2. Xiaowei He 9,866 Reputation points
    2020-10-19T06:00:23.483+00:00

    Hi,

    First, please understand that root cause analysis requires dedicated logs and a reproducible issue, and that only limited root cause analysis is possible on a forum.

    Based on my experience, I would suggest checking whether the cluster nodes are up to date. If they are not, please install the latest Windows updates on the cluster nodes.

    In addition, check whether event 1146 appears in the cluster log. If it does, it is recommended to enable RHS dump collection so that, if the issue recurs, the related logs can be collected for analysis:

    ---------
    On problematic nodes
    a. Confirm that the cluster node is configured for a kernel memory dump and that the disk holding the paging file and the dump location has enough free space.

    To set up a kernel memory dump, follow these steps:

    1. Enable the system to generate a kernel memory dump by changing the following registry value:
      HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\CrashControl
      Value Name: CrashDumpEnabled
      Data Type: REG_DWORD
      Value: 2
    2. Explicitly specify the paging file on the system drive.
      · Navigate to the registry key HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\Memory Management
      · Double-click 'PagingFiles' and set the paging file on the C drive to "C:\pagefile.sys 8300 8300", which sets both the initial size and the maximum size of the paging file.
    3. Change the dump file location:
      · Navigate to the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl
      · Double-click 'DumpFile' and change the location of the dump file. Choose another disk drive with more free space.
    4. Reboot the server for the settings to take effect.

    b. Add a DWORD registry value named "DebugBreakOnDeadlock" with the value 3 under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters

    c. Restart the cluster service.

    d. Download RHSMon.zip from the attachment and run run.cmd.

    e. Upload the system event log and the cluster log from both nodes.
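
The registry changes in step a above can also be written down as commands rather than edited by hand. The following is only a minimal Python sketch that builds `reg add` command lines for the three values named in that step; it prints the commands and does not run them. The dump path `D:\MEMORY.DMP` is a hypothetical placeholder for "another disk drive with more free space", and the `RegValue` class and helper names are illustrative, not part of any Microsoft tool.

```python
from dataclasses import dataclass
from typing import List

CRASH_CONTROL = r"HKLM\SYSTEM\CurrentControlSet\Control\CrashControl"
MEMORY_MGMT = r"HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"


@dataclass(frozen=True)
class RegValue:
    """One registry value to set (illustrative helper type)."""
    key: str
    name: str
    reg_type: str
    data: str


def kernel_dump_settings(dump_path: str, pagefile_mb: int = 8300) -> List[RegValue]:
    """Return the three values from step a: dump type, paging file, dump location."""
    return [
        # CrashDumpEnabled = 2 selects a kernel memory dump.
        RegValue(CRASH_CONTROL, "CrashDumpEnabled", "REG_DWORD", "2"),
        # Same initial and maximum size, as in the answer above.
        RegValue(MEMORY_MGMT, "PagingFiles", "REG_MULTI_SZ",
                 rf"C:\pagefile.sys {pagefile_mb} {pagefile_mb}"),
        # dump_path is chosen by the administrator (assumption: a drive with free space).
        RegValue(CRASH_CONTROL, "DumpFile", "REG_EXPAND_SZ", dump_path),
    ]


def to_reg_add(value: RegValue) -> str:
    """Format one value as a reg.exe command line (/f overwrites without prompting)."""
    return (f'reg add "{value.key}" /v {value.name} '
            f'/t {value.reg_type} /d "{value.data}" /f')


if __name__ == "__main__":
    # Hypothetical dump location; adjust to a real drive before use.
    for setting in kernel_dump_settings(r"D:\MEMORY.DMP"):
        print(to_reg_add(setting))
```

The printed commands would need to be run from an elevated command prompt on each node, followed by the reboot described in step a.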

    When the RHS deadlock issue occurs, rhsmon.exe will detect it and launch dumpncrash.cmd, which creates a user dump for rhs.exe/clussvc.exe and then crashes the machine to capture the memory dump. Please remove the DebugBreakOnDeadlock value from this node once the issue has been reproduced and the dump has been generated.

    To have the logs analyzed and to obtain the RHSMon tool, it is recommended to open a case with Microsoft:
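
Because DebugBreakOnDeadlock has to be added before the reproduction and removed afterwards, it may help to keep the two commands side by side. The sketch below is a hedged Python example that only builds the `reg add` and `reg delete` command lines. It assumes the value lives directly under the `ClusSvc\Parameters` key, which is one reading of the path given in step b; the function names are illustrative.

```python
# Assumption: DebugBreakOnDeadlock is a DWORD value under ClusSvc\Parameters.
CLUSSVC_PARAMETERS = r"HKLM\SYSTEM\CurrentControlSet\Services\ClusSvc\Parameters"
VALUE_NAME = "DebugBreakOnDeadlock"


def enable_command(data: int = 3) -> str:
    """Command line that adds the value (3 is the data given in step b)."""
    return (f'reg add "{CLUSSVC_PARAMETERS}" /v {VALUE_NAME} '
            f'/t REG_DWORD /d {data} /f')


def remove_command() -> str:
    """Command line that removes the value once the dump has been collected."""
    return f'reg delete "{CLUSSVC_PARAMETERS}" /v {VALUE_NAME} /f'


if __name__ == "__main__":
    print("Before reproducing the issue:")
    print("  " + enable_command())
    print("After the dump has been generated:")
    print("  " + remove_command())
```

As in step c, the cluster service still has to be restarted after the value is added for the setting to apply.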

    https://support.microsoft.com/en-us/gp/customer-service-phone-numbers

    Thanks for your time!
    Best Regards,
    Anne
