SCOM (cluster state?) monitoring causes high CPU usage by clussvc.exe

Dirk Hondong 871 Reputation points
2021-05-18T06:44:39.697+00:00

Hi everyone,
I'm trying to resolve a high CPU condition on a two-node Windows Server 2012 R2 cluster hosting a SQL Server 2016 Availability Group.
The virtual servers have 4 vCPUs each and nothing special is running on them, but clussvc.exe keeps 2 vCPUs at 50% usage each.
When I deactivate the cluster state monitoring in SCOM, the CPU usage of clussvc.exe drops to almost 0% a few minutes later.

Has anyone had the same issue, or does anyone have an idea what exactly causes clussvc.exe to go crazy?

I am not a SCOM expert (more the SQL Server guy) but I'd like to get this solved / explained. I do not want to burn CPU time for nothing.

Regards,
Dirk

Operations Manager

1 answer

  1. Dirk Hondong 871 Reputation points
    2021-05-21T19:26:25.877+00:00

    OK, I think I have found the solution myself.

    There were a lot of errors and warnings in the failover cluster log, which you can access through Event Viewer.
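
    Those errors and warnings can also be pulled with PowerShell instead of browsing the event log by hand. This is only a minimal sketch: the post does not say which log exactly, so the channel name below is an assumption, and it has to be run on one of the cluster nodes.

    ```powershell
    # Assumption: the events are in the FailoverClustering operational channel;
    # adjust LogName if they are in the System log instead.
    # Level 2 = Error, Level 3 = Warning.
    Get-WinEvent -FilterHashtable @{
        LogName = 'Microsoft-Windows-FailoverClustering/Operational'
        Level   = 2, 3
    } -MaxEvents 50 |
        Select-Object TimeCreated, Id, LevelDisplayName, Message |
        Format-List
    ```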

    The next hint was this old TechNet post:
    https://social.technet.microsoft.com/Forums/en-US/e7e95c36-8f82-45b6-a88b-bdd423a4a1b5/access-denied-errors-api-sapiopenclusterex-failed-with-error-5?forum=virtualmachinemgrclustering
    where Erland pointed out that NT Authority\Network Service did not have sufficient rights.
    I checked the properties of my cluster object in Failover Cluster Manager, and Network Service only had Read access.
    After I ran the PowerShell command Grant-ClusterAccess "NT Authority\Network Service" -Full
    (https://learn.microsoft.com/en-us/powershell/module/failoverclusters/grant-clusteraccess?view=windowsserver2019-ps)
    the high CPU usage moved from clussvc.exe to the WMI Provider. Putting both nodes into maintenance mode for a few minutes and then letting Health Monitor resume its work did the trick.
    No high CPU usage anymore.
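
    For anyone who wants to check the same thing, the permission check and the fix described above can be sketched roughly like this. It assumes the FailoverClusters module is available on the node and that the session is elevated; whether Full access for Network Service is acceptable in a given environment is a separate decision.

    ```powershell
    Import-Module FailoverClusters

    # Show which accounts currently have access to the cluster, and at what level.
    Get-ClusterAccess

    # Grant full access to Network Service (the command from the post,
    # with the -User parameter name spelled out).
    Grant-ClusterAccess -User "NT Authority\Network Service" -Full

    # Verify the change for that account.
    Get-ClusterAccess -User "NT Authority\Network Service"
    ```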

    Regards
    Dirk
