Management Configuration Service Roll-up stuck in Unhealthy State

Malako 136 Reputation points
2020-09-23T11:29:42.507+00:00

Hi!

We're currently running SCOM 2016 UR6 (7.2.12066.0), and one of our Management Servers has an availability roll-up (Management Configuration Service) stuck in an Unhealthy state.

All underlying monitors show a healthy state. Their state history shows no unhealthy states, although warnings have occurred.

We've tried resetting the monitor, but it goes back to the old state without any indication of what's wrong.

If anyone has any idea as to how we can get to the bottom of this, we'd appreciate it!

Thanks!

Operations Manager

Accepted answer
  1. SChalakov 10,271 Reputation points MVP
    2020-09-23T14:57:14.327+00:00

    Hi @Malako-7845,

    you can clear the cache, which often helps in situations like this. The two articles below cover the topic, and as you will see, it is almost a "standard" troubleshooting procedure. I personally do it quite often when I have issues with the management servers (MS cache) or the console:

    How and When to Clear the Cache

    and

    What does the option "Flushing the Health Service and State and Cache" do?

    About putting your MS server in MM: I think this shouldn't be a big deal, although that depends on your monitored infrastructure. There is also a workaround: an admin task that you can trigger, which does the same as putting the server into MM. The task's name is "Resubmit local cache for state change events".

    Hope this answers your question.

    Regards,
    Stoyan

4 additional answers

  1. Leon Laude 85,681 Reputation points
    2020-09-23T12:07:38.28+00:00

    Hi @Malako-7845,

    Can you please check the Operations Manager event log for any warnings or errors?
    If any warnings or errors are found, please post them here.
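
One way to pull those entries is sketched below. It assumes the script runs on the management server itself and shells out to the built-in `wevtutil` tool; the function names and the default of 50 events are only illustrative.

```python
import subprocess

LOG_NAME = "Operations Manager"  # the event log named in this thread


def build_query_command(max_events=50):
    """Build a wevtutil command that lists recent errors and warnings."""
    # In the Windows event schema, Level 2 is Error and Level 3 is Warning.
    xpath = "*[System[(Level=2 or Level=3)]]"
    return [
        "wevtutil", "qe", LOG_NAME,
        f"/q:{xpath}",       # filter by level
        f"/c:{max_events}",  # cap the number of events returned
        "/rd:true",          # newest events first
        "/f:text",           # plain-text output, easy to paste into a reply
    ]


def recent_warnings_and_errors(max_events=50, run=subprocess.run):
    """Run the query and return wevtutil's text output.

    `run` is injectable only so the function can be exercised
    without a Windows host.
    """
    result = run(build_query_command(max_events),
                 capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `print(recent_warnings_and_errors(20))` from an elevated prompt on the affected server should list the 20 most recent warnings and errors.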


    (If the reply was helpful please don't forget to upvote or accept as answer, thank you)

    Best regards,
    Leon

  2. SChalakov 10,271 Reputation points MVP
    2020-09-23T12:12:12.567+00:00

    Hi @Malako-7845,

    to add to Leon's request, can you please check the alert details of the alert that is being raised when this monitor fires? What does it state?

    Regards,
    Stoyan

  3. CyrAz 5,181 Reputation points
    2020-09-23T12:15:18.547+00:00

    Usually, a stuck rollup monitor can be fixed by putting the instance targeted by the monitor into maintenance mode for a few minutes.
    If that doesn't work, clearing the agent cache will probably be necessary. Since in this specific case it is the Management Server's cache, that may require some planning, for example doing it during the night.
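
As a rough sketch of what clearing that cache usually involves: stop the health service, move the "Health Service State" folder aside, and start the service again so the folder is rebuilt. The install path below is an assumption based on a default SCOM 2016 installation, and the function name is only illustrative; check both against your own environment before running anything.

```python
import subprocess
import time
from pathlib import Path

SERVICE = "HealthService"  # Windows service name of the SCOM health service
# Assumed default path for a SCOM 2016 management server; adjust as needed.
STATE_DIR = Path(
    r"C:\Program Files\Microsoft System Center 2016"
    r"\Operations Manager\Server\Health Service State"
)


def clear_health_service_cache(state_dir=STATE_DIR, run=subprocess.run):
    """Stop the service, rename the cache folder, start the service again.

    Returns the path of the renamed folder, or None if there was
    nothing to rename. `run` is injectable for dry runs and testing.
    """
    state_dir = Path(state_dir)
    run(["net", "stop", SERVICE], check=True)
    backup = None
    try:
        if state_dir.exists():
            stamp = time.strftime("%Y%m%d-%H%M%S")
            backup = state_dir.with_name(f"{state_dir.name}.old-{stamp}")
            # Rename instead of delete, so the old cache can still be inspected.
            state_dir.rename(backup)
    finally:
        # Always try to bring the service back, even if the rename failed.
        run(["net", "start", SERVICE], check=True)
    return backup
```

Renaming rather than deleting keeps the old cache around until the server is confirmed healthy; the renamed folder can be removed afterwards.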

  4. Malako 136 Reputation points
    2020-09-23T13:19:10.83+00:00

    @SChalakov
    No alert is raised when the roll-up returns to its unhealthy state.

    @Leon Laude
    In the event log I have a few warnings on the mentioned MS:

    ----------

    EventID: 29120 (Warning)
    OpsMgr Management Configuration Service failed to process configuration request
    <....>
    Microsoft.EnterpriseManagement.ManagementConfiguration.Interop.ConfigManager.ManagementPackNotFoundException: Management pack with version dependent id 'f06bf782-8dbf-45b6-a8e9-474636b15cab' was not found in the store

    EventID: 2115 (Warning)
    A Bind Data Source in Management Group SCOM has posted items to the workflow, but has not received a response in 60 seconds. This indicates a performance or functional problem with the workflow.
    Workflow Id : Microsoft.SystemCenter.CollectDiscoveryData*

    The Instance ID of this workflow points to the HealthService of the Management Server in question.

    ----------

    No other events seem to be associated with the roll-up :S

    @CyrilAzoulay
    Is it OK to do this on the Management Servers? What are the implications? Will it affect anything in particular, and are there any precautions we should take?

    Thank you all for your time!