Linux monitoring: Cannot resolve hostname error

T_Schneider 186 Reputation points
2023-08-04T13:26:12.27+00:00

Hello,

For three out of 256 Linux machines, we regularly receive the following alert:
"Cannot resolve hostname"

Error Description:
WSManFault

WS-Management cannot process the request. The operation failed because of an HTTP error. The HTTP error (12152) is: The server returned an invalid or unrecognized response .

Those errors clear after five minutes (which is the default monitoring interval). I've done all of the basic checks, and while the alert is active the hostnames still resolve correctly from the management servers.

There is no real pattern in how often those alerts trigger. As a test I've added manual entries into the hosts file on the management servers which has made no difference.
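For anyone who wants to repeat the resolution check while an alert is active, here is a minimal sketch of how it could be scripted on a management server. It assumes Python is available there; `timed_resolve` is a hypothetical helper name, not part of SCOM, and the hostname in the usage note is a placeholder.

```python
import socket
import time


def timed_resolve(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname and return (addresses, elapsed_seconds, error).

    The resolver is injectable so the function can be exercised without
    touching the network; by default it uses the OS resolver, which also
    honours the local hosts file.
    """
    start = time.perf_counter()
    try:
        infos = resolver(hostname, None)
        # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address for both IPv4 and IPv6.
        addresses = sorted({info[4][0] for info in infos})
        error = None
    except socket.gaierror as exc:
        addresses, error = [], str(exc)
    elapsed = time.perf_counter() - start
    return addresses, elapsed, error
```

Calling `timed_resolve("linuxhost.example.com")` in a loop and logging the result with a timestamp would show whether name resolution ever fails or slows down at the moments the alert fires. If it never does, the alert name is probably misleading and the failure sits further along in the WS-Management request.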

Here is some additional background information about our environment:

  • SCOM 2022 with UR1 and KB5024286 patch (OpenSSL 3.0)
  • 8 SCOM management servers, 2 dedicated to Linux, located in main datacenter
  • globally distributed sites
  • majority of Linux machines at the main datacenter, but also at various remote sites

So, there is one thing in common among the three affected machines:
they run Ubuntu 22 and RHEL 9

Interestingly enough, we have one Ubuntu 22 in the main datacenter which does not behave like this.

So to me it looks like there is a timing issue which only affects machines using the most recent version of the OpenSSL library.

Ping response time to the affected machines is <300ms (60ms for one of them).

Thanks
Thorsten

Operations Manager

1 answer

  1. XinGuo-MSFT 22,066 Reputation points
    2023-08-07T09:39:55.8533333+00:00

    Hi,

    This looks like a network issue. Can you move the affected machines into the main datacenter environment and check whether the error still persists?

    With complex and specific issues like this, it's often helpful to use packet capture software such as Network Monitor or Wireshark for a more in-depth investigation and resolution.
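As a lighter first step before a full packet capture, the TCP connect and TLS handshake times to the agent could be measured separately from a management server. The following is only a sketch under assumptions: it assumes Python is available on the management server and that the Linux agent listens on TCP port 1270 (the usual default for the SCOM UNIX/Linux agent, so verify it for your environment). `probe_tls` is a hypothetical helper name. Certificate verification is turned off on purpose because the goal is timing, not trust.

```python
import socket
import ssl
import time


def probe_tls(host, port=1270, timeout=10.0):
    """Time the TCP connect and the TLS handshake to host:port.

    Returns a dict with connect_s, handshake_s, tls_version and error.
    Fields stay None for any stage that was not reached.
    """
    result = {"connect_s": None, "handshake_s": None,
              "tls_version": None, "error": None}

    # Diagnostic context: skip certificate checks, we only measure timing.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    t0 = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            t1 = time.perf_counter()
            result["connect_s"] = t1 - t0
            # wrap_socket performs the handshake before returning.
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                result["handshake_s"] = time.perf_counter() - t1
                result["tls_version"] = tls.version()
    except OSError as exc:
        # Covers refused connections, timeouts and ssl.SSLError alike.
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result
```

Running this periodically against one affected and one healthy machine and comparing `connect_s` with `handshake_s` should indicate whether the delay or failure is in the network path or in the TLS negotiation. That in turn narrows down what to look for in a Wireshark trace.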

