SCOM 1807 - Unable to add new agents/clients

vafran 121 Reputation points
2020-10-14T06:36:31.127+00:00

Hello,

Suddenly I am unable to add new agents to the SCOM 2016 environment. They are installed correctly, but appear as not monitored in the console.

32242-image.png

On the agents I get the following events:

20070
20071
21016 (less frequently)

On the SCOM server I get events 20000.

The agent control panel shows the FQDN of the management server and port 5723, which is open. Also, there are no certificates in the environment.
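For anyone who wants to double-check the port from the agent side, here is a minimal sketch in Python. It assumes Python is available on the agent machine. The function name `can_connect` and the host name in the usage note are placeholders, not taken from this environment. A successful TCP connection only shows that the port is reachable; it says nothing about whether the agent is authenticated or receiving configuration.

```python
import socket


def can_connect(host: str, port: int = 5723, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    5723 is the management server port mentioned in the post.
    """
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Usage would look like `can_connect("scom-ms01.example.local")`, where the FQDN is a hypothetical stand-in for the management server name shown in the agent control panel.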

I searched for the issue and ended up removing all objects in maintenance mode.

This seems to have started happening since I added a new SQL cluster, but it may be a coincidence. I added the first two nodes just fine, but a third one that I added a few weeks later (a reinstall of a node from another cluster, preserving the server name) was the first agent I found to fail. Also, the instances of this cluster are detected from the proxy node, but they too appear as not monitored.

The agents are added from the SCOM console, but if I install an agent manually and then approve it from the console, the situation is exactly the same.

This is the sequence of events on a freshly installed agent on a new server:

32147-image.png

Any advice?

Operations Manager

6 answers

  1. vafran 121 Reputation points
    2020-11-03T09:52:55.763+00:00

    Hello,

    I restored from a backup taken on a date when everything was definitely working. At first it seemed to work, but it then went back to the non-working state.

    The culprit is the delta synchronization:

    The System Center Management Configuration Service has failed to perform the Configuration Store Delta synchronization state task in an acceptable amount of time.

    The purpose of this monitor is to determine if the Configuration Service has failed to run the “DeltaSynchronization” work item over the last 15 minutes (default). The impact of the “DeltaSynchronization” work item failing is that, during this time, the management group could experience inconsistent behavior in its ability to update agents with new configuration.

    So I have these 29181 events for the failed sync, with a System.InvalidCastException error.

    Before the restore, the following query on the database would return 10 (failed):

    select * from cs.WorkItem where workitemname like '%snapshot%' order by StartedDateTimeUtc desc

    After the restore, I got just one successful sync (20) but no further attempts, while before the restore there was one failed attempt every few seconds.

    ![37057-image.png][1]

    I checked this article:
    https://learn.microsoft.com/en-us/troubleshoot/system-center/scom/configuration-not-updated-with-event-29181

    But in my build those settings are already well above these numbers, and the environment is not that large.

    Also, this is not a timeout, like most of the issues found in forums, but an InvalidCastException type of error.
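To see at a glance how often the work items fail versus succeed, the rows returned by the cs.WorkItem query above could be tallied by state. Below is a minimal sketch in Python, assuming the rows have been exported as dictionaries and that the state column is called `WorkItemStateId` (an assumption; adjust to whatever the query actually returns). Only the two codes mentioned in this post, 10 (failed) and 20 (succeeded), are named; any other code is reported as-is rather than guessed.

```python
from collections import Counter

# State codes as described in the post: 10 = failed, 20 = succeeded.
# Other codes are deliberately left unnamed.
STATE_NAMES = {10: "failed", 20: "succeeded"}


def summarize_work_items(rows, state_column="WorkItemStateId"):
    """Count work items per state.

    rows: iterable of dict-like records, one per cs.WorkItem row.
    state_column: name of the state column (assumed, not confirmed).
    """
    counts = Counter()
    for row in rows:
        code = row[state_column]
        counts[STATE_NAMES.get(code, f"other ({code})")] += 1
    return dict(counts)
```

For example, passing two rows with state 10 and one with state 20 would give `{"failed": 2, "succeeded": 1}`, which makes a pattern like "one failed attempt every few seconds" easy to spot over a longer time window.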
