peewhy asked

Management server Issues

Hello All,

We recently built a SCOM 2019 environment, and everything ran smoothly until the agents were migrated. We are on SCOM 2019 UR2 with around 5000 agents, 4 management servers (MS) and 2 gateways (GW). Around 900 servers report through the gateways.

Since yesterday I have observed the MS toggling between grey, healthy and critical states. Something doesn't seem right. I reviewed the event logs and found a few interesting errors.

I see the error below on all 4 MS:

85140-microsoftteams-image-13.png

After a little browsing, I decided to run the "Request Snapshot synchronization task" under Service group state from the Monitoring pane, targeting the service group.

Then I restarted the configuration service on all 4 MS.

This did not resolve the issue. The event is still being logged on my MS. Any direction would be highly appreciated.

85244-microsoftteams-image-14.png


Additionally, I find these errors in the event logs too:

85251-untitled.png


msc-operations-manager

CyrAz answered

Did you try this procedure? It may apply to your case, since you are running 5000 agents: https://docs.microsoft.com/en-US/troubleshoot/system-center/scom/configuration-not-updated-with-event-29181


@CyrilAzoulay Thanks for your response.

Yes, I did read that article. My config file has the following values:

<Setting Name="SnapshotSyncManagedEntityBatchSize" Value="50000" />
<Setting Name="SnapshotSyncRelationshipBatchSize" Value="50000" />
<Setting Name="SnapshotSyncTypedManagedEntityBatchSize" Value="100000" />

I was wondering whether it is right for me to reduce the values as suggested. Can I proceed?


I don't know, these settings are pretty much undocumented (or at least not publicly).


Hello @CyrilAzoulay ,

The issue seems to be resolved. Although I did not change the parameters above, I changed the timeout values in the config file on all the MS.


The configservice.config file is located at C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server
Update the file with the values highlighted below:
Snapshot Synchronization:
<Operation Name="EndSnapshot" TimeoutSeconds="1800" />
Delta Synchronization:
<OperationTimeout DefaultTimeoutSeconds="300">
<Operation Name="GetEntityChangeDeltaList" TimeoutSeconds="300" />

Earlier it was

<Operation Name="EndSnapshot" TimeoutSeconds="900" />
<Operation Name="GetEntityChangeDeltaList" TimeoutSeconds="30" />
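
For anyone who wants to script the same change across several management servers, here is a minimal Python sketch. It assumes only that the timeouts live in `<Operation Name="..." TimeoutSeconds="..."/>` elements like the lines quoted above. The `<Config>` wrapper and the layout of `SAMPLE` are hypothetical stand-ins, not the real structure of the config file, and `set_operation_timeouts` is an illustrative helper name. Note that it updates every `<Operation>` with a matching name, wherever it sits in the document.

```python
import xml.etree.ElementTree as ET

# New timeout values reported in this thread (seconds).
NEW_TIMEOUTS = {
    "EndSnapshot": 1800,
    "GetEntityChangeDeltaList": 300,
}


def set_operation_timeouts(xml_text, timeouts):
    """Return xml_text with TimeoutSeconds updated on matching <Operation> elements."""
    root = ET.fromstring(xml_text)
    for op in root.iter("Operation"):
        name = op.get("Name")
        if name in timeouts:
            op.set("TimeoutSeconds", str(timeouts[name]))
    return ET.tostring(root, encoding="unicode")


# Hypothetical, simplified fragment used only to exercise the helper.
# The real config file has more sections and nesting than this.
SAMPLE = """<Config>
  <OperationTimeout DefaultTimeoutSeconds="300">
    <Operation Name="EndSnapshot" TimeoutSeconds="900" />
    <Operation Name="GetEntityChangeDeltaList" TimeoutSeconds="30" />
  </OperationTimeout>
</Config>"""

if __name__ == "__main__":
    print(set_operation_timeouts(SAMPLE, NEW_TIMEOUTS))
```

ElementTree drops XML comments when parsing and does not preserve the original formatting, so keep a backup of the original file and compare the result before replacing anything on a management server.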

Crystal-MSFT answered

@peewhy, based on my research, this issue can also be caused by a management pack that created a health service entry with invalid data.

You can run the following query to identify the condition and the management pack that caused it:

 select DiscoveryName, MPName, MPFriendlyName, MPVersion, MPIsSealed, MPLastModified, MPCreated
 from discovery d
 join ManagementPack MP on MP.ManagementPackId = d.ManagementPackId
 where discoveryId in
   (select DiscoveryRuleId from discoverysource where discoverysourceid in
     (select DiscoverySourceId from DiscoverySourceToTypedManagedEntity where TypedManagedEntityId in
       (select BaseManagedEntityID from MTV_HealthService where MaximumQueueSize is null or DisplayName = '')))

After that, you can remove the offending management pack or disable the discovery that contributed the invalid health service entry to the MTV_HealthService table.

Hope this helps.





Hello @Crystal-MSFT .

Appreciate your response. Below is the result.

85565-untitled.png


However, I do have an MP that is logging massive amounts of errors in the event log. It is related to Azure Managed Instance and has logged 16000 events in the last hour.
Do you suggest removing this MP?

85530-untitled.png




@peewhy, based on my research, this MP is used to monitor Azure SQL instances. Do you have any Azure SQL instances to monitor in your environment? If not, you can remove this MP to see whether the issue is resolved.
https://techcommunity.microsoft.com/t5/sql-server/released-azure-sql-managed-instance-management-pack-7-0-22-0/ba-p/1503931

Hope this helps.


Appreciate your kind assistance here, @Crystal-MSFT. I still have to work on this MP and will keep you posted on developments. It is indeed creating a lot of noise in the event logs.
