HangRecoveryAction
Specifies the recovery action taken by the cluster service in response to a heartbeat countdown timeout.
Attribute | Value |
---|---|
Data type | DWORD |
Access | Read/write |
Structure | CLUSPROP_DWORD |
Minimum | WatchdogActionDisable (0). Windows Server 2008 R2: ClussvcHangActionDisable (0) is the minimum value. |
Maximum | WatchdogActionLiveDumpAndTerminateProcess (6). Windows Server 2012: WatchdogActionDebugBreak (5) is the maximum value. Windows Server 2008 R2: ClussvcHangActionBugCheckMachine (3) is the maximum value. |
Default | WatchdogActionLiveDumpAndTerminateProcess (6). Windows Server 2012: WatchdogActionBugCheckOnlyFromNetFt (3) is the default value. Windows Server 2008 R2: ClussvcHangActionBugCheckMachine (3) is the default value. |
Remarks
The constant for this property is CLUSTER_HANG_RECOVERY_ACTION_KEYNAME.
The Cluster network driver (ClusNet) maintains a countdown timer. When the timer reaches 0 (zero), the driver performs the action specified by the HangRecoveryAction property. Whenever the ClusNet driver receives a Cluster service heartbeat, the countdown timer is reset to the value of the ClusSvcHeartbeatTimeout property. Additionally, when the Cluster service stops for any reason, the Cluster network driver automatically turns off the countdown timer.
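The countdown behavior described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the driver's implementation: the `HangWatchdog` type, the `watchdog_*` function names, and the notion of a "tick" as the time unit are all hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the heartbeat countdown; not the ClusNet driver's code. */
typedef struct {
    uint32_t timeout_ticks;   /* stands in for the ClusSvcHeartbeatTimeout value */
    uint32_t remaining_ticks; /* current countdown value */
    bool     armed;           /* false once the Cluster service has stopped */
} HangWatchdog;

/* Start the countdown at the configured timeout. */
static void watchdog_start(HangWatchdog *w, uint32_t timeout_ticks)
{
    w->timeout_ticks = timeout_ticks;
    w->remaining_ticks = timeout_ticks;
    w->armed = true;
}

/* A heartbeat from the Cluster service resets the countdown to the timeout. */
static void watchdog_heartbeat(HangWatchdog *w)
{
    if (w->armed) {
        w->remaining_ticks = w->timeout_ticks;
    }
}

/* When the Cluster service stops, the countdown is turned off. */
static void watchdog_service_stopped(HangWatchdog *w)
{
    w->armed = false;
}

/* Advance time by one tick. Returns true only on the tick where the
 * countdown reaches 0, which is when the recovery action would be taken. */
static bool watchdog_tick(HangWatchdog *w)
{
    if (!w->armed || w->remaining_ticks == 0) {
        return false;
    }
    w->remaining_ticks--;
    return w->remaining_ticks == 0;
}
```

In this model, a caller would invoke `watchdog_heartbeat` each time a heartbeat arrives and `watchdog_tick` once per time unit. A `true` return from `watchdog_tick` corresponds to the point at which the configured HangRecoveryAction would be carried out.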
The HangRecoveryAction property can be set to the following values.
Value | Description |
---|---|
WatchdogActionDisable (0). Windows Server 2008 R2: The name of the value is ClussvcHangActionDisable. | NetFt WatchDog: Takes no action. Core Operations WatchDog: Takes no action. Windows Server 2008 R2: Disables the cluster heartbeat and monitoring mechanism. |
WatchdogActionLog (1). Windows Server 2008 R2: The name of the value is ClussvcHangActionLog. | NetFt WatchDog: Logs a system event. Core Operations WatchDog: Logs a system event. Windows Server 2008 R2: Logs an event in the system log of the event viewer when a heartbeat countdown timeout occurs. |
WatchdogActionTerminateProcess (2). Windows Server 2008 R2: The name of the value is ClussvcHangActionTerminateService. | NetFt WatchDog: Terminates the Cluster service. Core Operations WatchDog: Terminates the Cluster service. Windows Server 2008 R2: Terminates the Cluster service when a heartbeat countdown timeout occurs. |
WatchdogActionBugCheckOnlyFromNetFt (3). Windows Server 2008 R2: The name of the value is ClussvcHangActionBugCheckMachine. | NetFt WatchDog: Bugchecks the machine. Core Operations WatchDog: Terminates the Cluster service. Windows Server 2008 R2: Creates a system Stop error (BugCheck) when a heartbeat countdown timeout occurs. |
WatchdogActionBugCheckAlsoFromProcess (4). This value is available starting with Windows Server 2012. | NetFt WatchDog: Bugchecks the machine. Core Operations WatchDog: Bugchecks the machine. |
WatchdogActionDebugBreak (5). This value is available starting with Windows Server 2012. | NetFt WatchDog: Causes a debug break to occur. Core Operations WatchDog: Causes a debug break to occur. |
WatchdogActionLiveDumpAndTerminateProcess (6). This value is available starting with Windows 10, version 1703. | NetFt WatchDog: Live dumps and terminates the process. Core Operations WatchDog: Live dumps and terminates the process. |
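
To make the difference between the two watchdogs easier to see, the table above can be expressed as a lookup function. This is a sketch for illustration only: the enumerations are declared locally with the names and numbers from the table, and `WatchdogSource`, `WatchdogEffect`, and `effect_for` are hypothetical names that do not come from the cluster API. It models the NetFt and Core Operations columns only, not the Windows Server 2008 R2 descriptions.

```c
/* Value names and numbers as listed in the table; declared locally for this sketch. */
typedef enum {
    WatchdogActionDisable = 0,
    WatchdogActionLog = 1,
    WatchdogActionTerminateProcess = 2,
    WatchdogActionBugCheckOnlyFromNetFt = 3,
    WatchdogActionBugCheckAlsoFromProcess = 4,
    WatchdogActionDebugBreak = 5,
    WatchdogActionLiveDumpAndTerminateProcess = 6
} HangRecoveryActionValue;

/* Hypothetical: which watchdog detected the timeout. */
typedef enum { SourceNetFt, SourceCoreOperations } WatchdogSource;

/* Hypothetical: the resulting behavior described in the table. */
typedef enum {
    EffectNone,
    EffectLogEvent,
    EffectTerminateService,
    EffectBugCheck,
    EffectDebugBreak,
    EffectLiveDumpAndTerminate,
    EffectInvalid /* value outside the documented range 0..6 */
} WatchdogEffect;

/* Map a HangRecoveryAction value and a watchdog source to the documented effect. */
static WatchdogEffect effect_for(unsigned value, WatchdogSource source)
{
    switch (value) {
    case WatchdogActionDisable:
        return EffectNone;
    case WatchdogActionLog:
        return EffectLogEvent;
    case WatchdogActionTerminateProcess:
        return EffectTerminateService;
    case WatchdogActionBugCheckOnlyFromNetFt:
        /* The only value where the two watchdogs behave differently. */
        return source == SourceNetFt ? EffectBugCheck : EffectTerminateService;
    case WatchdogActionBugCheckAlsoFromProcess:
        return EffectBugCheck;
    case WatchdogActionDebugBreak:
        return EffectDebugBreak;
    case WatchdogActionLiveDumpAndTerminateProcess:
        return EffectLiveDumpAndTerminate;
    default:
        return EffectInvalid;
    }
}
```

For example, `effect_for(3, SourceCoreOperations)` yields `EffectTerminateService`, while `effect_for(3, SourceNetFt)` yields `EffectBugCheck`, which mirrors the row for WatchdogActionBugCheckOnlyFromNetFt.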
Note
In some extreme cases, system services may also stop responding, and actions 1 and 2 may not succeed. In such cases, action 3 (bugcheck) is the only effective recovery measure.
If the action is set to cause a bugcheck on the cluster node, Windows stops and reports a Stop error with bugcheck code 0x9E. The Stop error causes a failover to another cluster node. Additionally, if the node where the Stop error occurs is configured to capture a memory dump file, you may be able to use the information in the memory dump file to diagnose why the cluster node stopped responding.
The following is an example of a stack trace from a kernel dump that the Cluster network driver initiated:
ChildEBP RetAddr
f9c33ea8 f6e2e11f nt!KeBugCheckEx+0x19
f9c33ecc f6e2e836 clusnet!CnpCheckClussvcHang+0xef
f9c33ef0 805070d7 clusnet!CnpHeartBeatDpc+0x47e
f9c33fa4 8050735d nt!KiTimerExpiration+0x371
f9c33ff4 80543ccf nt!KiRetireDpcList+0x63
The bugcheck output is similar to the following: BugCheck 9E, {812d5b08, 3c, 0, 0}
Note
You must manually configure the server to generate a memory dump file in response to a Bugcheck.
Examples
The property value portion of a property list entry for HangRecoveryAction can be set with the following example code:
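// Recovery action to apply: 1 = WatchdogActionLog (log a system event).
DWORD ClusSvcHangActionData = 1;

// Value portion of the property list entry.
CLUSPROP_DWORD ClusSvcHangActionValue;
ClusSvcHangActionValue.Syntax.dw = CLUSPROP_SYNTAX_LIST_VALUE_DWORD; // value is a DWORD
ClusSvcHangActionValue.cbLength = sizeof(DWORD);                     // size of the data, in bytes
ClusSvcHangActionValue.dw = ClusSvcHangActionData;                   // the action value itself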
Requirements
Requirement | Value |
---|---|
Minimum supported client | None supported |
Minimum supported server | Windows Server 2008 Enterprise, Windows Server 2008 Datacenter |