Redigera

Dela via


Determine which cluster node failed for Analytics Platform System

This article describes how to determine the name of the Analytics Platform System (PDW) node that failed after a cluster failover has occurred and a cluster failover alert has been raised. As part of troubleshooting a cluster failover, you must determine the name of the node that failed before contacting Microsoft to help resolve the problem.

Background

For high availability in SQL Server PDW, the Control node and the Compute nodes are configured as active or passive components of Windows failover clusters. When an active server fails to respond to critical system requests, the passive server fails over and performs the functions of the server that failed.

After a cluster failover, when SQL Server PDW reports on node status, the passive server has a failed over status. However, it is not obvious which server or node failed, especially if the server that failed is still online. To troubleshoot the cluster failure, you must determine the name of the node that failed over.

Admin Console solution

Find the name of the node that failed

  1. Open the Admin Console. For more information about the Admin Console, see Monitor the Appliance by Using the Admin Console (Analytics Platform System). After failover occurs, the failover event is included in the number of alerts on the HEALTH page. There is an HEALTH page for the PDW region and for the fabric region of the appliance. Each Health page has an ALERTS tab. To learn more about an alert, select the Health page, the Alerts tab, and then select an alert.

System view solution

The following SQL statement shows how to use the sys.dm_pdw_component_health_active_alerts system view to find the name of the server that failed.

SELECT  
SUBSTRING( component_instance_id, 2, charindex(' ', component_instance_id, 1)-2) AS failed_node_name,  
create_time AS failover_time  
FROM sys.dm_pdw_component_health_active_alerts  
WHERE alert_id = 500139  
ORDER BY failed_node_name;