Unable to remove agent of cluster node

EvHulsel 0 Reputation points
2024-07-24T10:00:22.6433333+00:00

Last week I upgraded our SCOM environment from 2019 to 2022 (UR2), and everything went without too much fuss. There's one agent that is giving me a headache, though. It fails to run its discoveries, so performance monitors are never added and the agent doesn't report its newer version back to the SCOM environment.

Since it's a part of a cluster (and we monitor clusters) it throws an error when uninstalling or deleting it in the console: "The agent (server.domain.local) is managing other devices and cannot be uninstalled. Resolve this issue via Agentless Managed view in Administration prior to attempting to uninstall again." The resources that show up in Agentless Managed are monitored by the other cluster node, so that shouldn't be an issue. And there's no other resource that the to-be-uninstalled agent is monitoring.

I've tried disallowing the agent to act as a proxy, but that makes no difference. I also tried reinstalling the agent manually on the server itself, which works without a hitch but doesn't fix the original problem. This leads me to believe that the issue is in SCOM itself (or some sort of DB screwup).

As a desperate measure I also tried to remove the other node, which understandably shows the same error. Temporarily moving the resources to the to-be-uninstalled agent doesn't work either, so that's also a no-go.

I never had any issues with removing agents that run on cluster nodes, but this one is being ridiculous...

Do any of you have an idea I could try? I'd prefer not to mess with the database, except as a last resort.

Operations Manager

2 answers

  1. XinGuo-MSFT 18,771 Reputation points
    2024-07-25T07:38:23.0666667+00:00

    Hi,

    Please try to uninstall the agent by using MOMAgent.msi from the command line:

    %WinDir%\System32\msiexec.exe /x <path>\MOMAgent.msi /qb

    Then follow the "Uninstall the agent from a cluster" procedure.


  2. EvHulsel 0 Reputation points
    2024-07-25T13:14:16.4266667+00:00

    Update: I created an override to disable the Cluster Service discovery on all the nodes of the cluster. I then waited for the cluster objects to disappear from the Agentless Managed window, but that didn't happen. I ended up using Kevin Holman's how-to (https://kevinholman.com/2018/05/03/deleting-and-purging-data-from-the-scom-database/) to remove the objects from SCOM, after which I could remove the cluster objects from the Agentless Managed window and ultimately remove the agent from the faulty cluster node.

    I then removed the created override and after that I was able to reinstall the agent and get it to work properly. It took some time for everything to be rediscovered. It even had some weird issue, crashing the console when going to the Agentless Managed window, but eventually when all the cluster resources were discovered that went away. I blame the harsh approach of removing the resources :P
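
For anyone who wants to script the command-line uninstall from the first answer, here is a minimal Python sketch that only builds the msiexec argument list. The function name, the `C:\Windows` default and the example paths under `C:\Temp` are assumptions for illustration, not values from this thread. The optional `/l*v` switch is the standard msiexec verbose-log option, added here because a log can help if the uninstall misbehaves.

```python
import subprocess
from pathlib import PureWindowsPath


def build_agent_uninstall_command(msi_path, log_path=None, windir=r"C:\Windows"):
    """Return the msiexec argument list for uninstalling the agent MSI.

    msi_path and log_path are hypothetical placeholders; windir stands in
    for %WinDir% and is assumed to be C:\\Windows unless overridden.
    """
    msiexec = str(PureWindowsPath(windir) / "System32" / "msiexec.exe")
    # /x uninstalls the package, /qb shows a basic progress UI
    # (the same switches as the command in the first answer).
    command = [msiexec, "/x", str(PureWindowsPath(msi_path)), "/qb"]
    if log_path is not None:
        # /l*v <file> asks msiexec to write a verbose log.
        command += ["/l*v", str(PureWindowsPath(log_path))]
    return command


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_agent_uninstall_command(
        r"C:\Temp\MOMAgent.msi", log_path=r"C:\Temp\MOMAgent-uninstall.log"
    )
    # Print the command line instead of running it, so the sketch is safe anywhere.
    print(subprocess.list2cmdline(cmd))
```

On the cluster node itself, the returned list could be passed to `subprocess.run(cmd, check=True)` from an elevated session. Note that this only removes the agent locally; as described above, the stale cluster objects in SCOM still had to be cleaned up separately.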

