HPC Pack 2016 - Getting error while starting cluster manager on single cluster head node

Attuchirayil, Ajay 21 Reputation points
2022-09-17T02:46:18.24+00:00

Hi,

I am getting an error when I try to start the cluster manager on the head node.

Setup:

HPC installation file used: HPCPack2016Update3-Full-Refresh-v6450.zip
A few details about the cluster are below:

  • Windows Server 2016
  • Single Head node configuration
  • Network Topology 5
  • 48 Compute Nodes added to the cluster
  • 5 databases set up on a remote DB server

Context:
After installation, it worked for a month or so.
Then, when I was ready to set up jobs, I started getting this error on startup of the cluster manager.

Error:
The error prompts me to use another head node. I am attaching a screenshot for your reference.

Also, on checking the Event Viewer logs, I see the following error being logged every 3 minutes.

Failed to initialize collector. Retrying in 60 seconds. System.AggregateException: One or more errors occurred. ---> System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: The remote server returned an error: (502) Bad Gateway.
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at System.Net.Http.HttpClientHandler.GetResponseCallback(IAsyncResult ar)
--- End of inner exception stack trace ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Hpc.Monitoring.ManagementRestClient.<>c__DisplayClass19_0.<
