Configuration breaks every few hours with event ID 29181
Hi all!
There is an issue that manifests with the following event ID:
Event ID [29181]
OpsMgr Management Configuration Service failed to execute 'SnapshotSynchronization' engine work item due to the following exception
Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessException: Data access operation failed
at Microsoft.EnterpriseManagement.ManagementConfiguration.DataAccessLayer.DataAccessOperation.ExecuteSynchronously(Int32 timeoutSeconds, WaitHandle stopWaitHandle)
System.InvalidCastException: Specified cast is not valid.
at Microsoft.EnterpriseManagement.ManagementConfiguration.CmdbOperations.ManagedEntitySnapshotReadOperation.ReadData(SqlDataReader reader)
When you get an InvalidCastException, it is a fairly clear sign that something is wrong with the data in the OpsMgr DB, and the first place to look is the mtv_healthservice table.
There I observed that for one particular agent, which was also a filer (a cluster with a file server role), the display_name was blank.
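If you want to check for this quickly, one option is to export the mtv_healthservice rows to CSV and scan them for blank names. The snippet below is only a rough sketch: the export step, the `name` column used in the usage line, and the literal `NULL` placeholder are my assumptions, and the display_name column may be spelled differently in your database.

```python
import csv
import io


def blank_display_names(csv_text, column="display_name"):
    """Return the rows whose display name is missing, empty, or whitespace only.

    Assumption: the table was exported to CSV with a header row, and SQL NULLs
    were written either as an empty field or as the literal text "NULL".
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    blanks = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value == "" or value.upper() == "NULL":
            blanks.append(row)
    return blanks
```

Called on the exported text, e.g. `blank_display_names(open("export.csv").read())`, it returns the suspicious rows as dictionaries, so you can see which agent is affected.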
When the filer was rediscovered, the entry would get populated with the filer's virtual name, but it would just *disappear* again after a few hours.
In every trace, however, the cluster discovery was executing fine with the correct parameters!
It turned out, however, that another discovery was running from the BranchCache MP: cscript.exe /nologo "BranchCache.FileServerNodeHierarchy.Discovery.vbs" {E5971A03-C724-04C4-7083-4405CEC3B5D7} {BDAC4C39-E170-4624-04FE-22CFF49E51E1} serverFQDN
If you open up the BranchCache MP, you can see that the target for this is <Value>$Target/Property[Type="Windows!Microsoft.Windows.Server.Computer"]/IsVirtualNode$</Value>
This discovery is the one responsible for writing a null value into display_name.
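If you need to track down which elements of an MP reference a given property, a small script can save some scrolling. This is just a sketch, assuming the MP has been exported to plain XML; the function name is made up for illustration and nothing here is specific to the BranchCache MP.

```python
import xml.etree.ElementTree as ET


def elements_referencing(xml_text, needle):
    """Return (tag, text) pairs for every element whose text contains `needle`.

    Assumption: `xml_text` is the content of an MP exported as XML.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        # elem.text is None for elements that contain only child elements
        if elem.text and needle in elem.text:
            hits.append((elem.tag, elem.text.strip()))
    return hits
```

Calling `elements_referencing(mp_xml, "IsVirtualNode")` lists every element whose text mentions that property, which is a reasonable starting point for finding the discovery that uses it.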
So the easiest solution would be to remove this MP and enjoy a clean config forever... :)
Or just disable this faulty discovery.
Or remove the filers from monitoring.
I am expecting a new BranchCache MP with a fix for this issue, which should come soon. Until then, be warned: it's a really nasty issue that is hard to see in the traces, and it breaks the whole environment until the filer is manually rediscovered.