Azure Stack HCI server hardware monitoring and alerting

Pankaj Vishwakarma 1 Reputation point
2022-09-02T13:51:00.313+00:00

Hi,

We are validating the Azure Stack HCI (AZSHCI) solution in our lab using Dell AX-650 servers. We currently have a three-node HCI cluster and are running a POC on this setup.
We are now validating monitoring for the entire HCI setup: HCI nodes, cluster, services, and server hardware, i.e. every aspect of AZSHCI.

We have tested monitoring and alerting for the HCI layer (nodes, cluster, health, etc.) using Azure Monitor and a Log Analytics workspace.

However, we are stuck on creating alerts for the server hardware. SNMP monitoring is not supported on HCI nodes, and the Dell OpenManage Integration extension in Windows Admin Center (WAC) does not support alerting for the Dell server hardware. So if any hardware fails, we would not get an alert notification.

We are therefore looking for suggestions on whether there is a way to integrate monitoring of the Dell servers' hardware layer with Azure Monitor/Insights as well.

Thank you in advance

Azure Stack HCI

1 answer

  1. Prrudram-MSFT 21,886 Reputation points
    2022-09-06T13:26:27.22+00:00

    Hello @Pankaj Vishwakarma,

    Thank you for reaching out to the Microsoft Q&A platform. Happy to answer your question.

    You mentioned using WAC with the extension. If that does not surface the alerts, my guess is that this was left to Dell to surface in their own product.

    The only approach I know of that has been used in the past is Microsoft System Center Operations Manager (SCOM/OpsMgr) with the Dell Hardware Management Pack.
    The Dell MP at the time (circa 2010 to 2016) was exceptional: it tapped into the hardware layer and surfaced the errors that would have been generated in Dell OpenManage.
    SCOM 2019/2022 can also tie into Azure Monitor/Log Analytics; however, I have not used it in some time.
    Unless someone else has another idea, I would say you need an instance of SCOM deployed (which may be a bit of overkill here, but it can be an all-in-one instance) and tied into Azure Monitor.
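
For illustration only, here is what "surfacing hardware errors" amounts to in practice: read per-component health and forward anything unhealthy as a flat record that a log platform could alert on. This is a minimal sketch, not how SCOM or the Dell MP works internally. It assumes the health data arrives as Redfish-style JSON objects with a `Status.Health` field (Redfish health values are `OK`, `Warning`, and `Critical`). The function name, the sample payload, the node name, and the record field names are all hypothetical, and the step of actually sending records to a Log Analytics workspace is left out.

```python
from __future__ import annotations

from typing import Any

# Redfish health states ranked by severity; anything other than "OK" is alert-worthy here.
SEVERITY = {"OK": 0, "Warning": 1, "Critical": 2}


def unhealthy_components(node: str, components: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one flat record per component whose Status.Health is not OK."""
    records = []
    for comp in components:
        # Health can be missing or null; treat that as "Unknown" rather than healthy.
        health = comp.get("Status", {}).get("Health") or "Unknown"
        if health == "OK":
            continue
        records.append({
            "Node": node,
            "Component": comp.get("Name", comp.get("Id", "unknown")),
            "Health": health,
            # Unknown health is ranked as a warning so it is not silently dropped.
            "Severity": SEVERITY.get(health, 1),
        })
    # Most severe first.
    return sorted(records, key=lambda r: r["Severity"], reverse=True)


# Hypothetical sample payload (not real output from any Dell tool).
sample = [
    {"Id": "PSU.Slot.1", "Name": "PS1 Status", "Status": {"Health": "OK"}},
    {"Id": "PSU.Slot.2", "Name": "PS2 Status", "Status": {"Health": "Critical"}},
    {"Id": "Fan.Embedded.1", "Name": "Fan1", "Status": {"Health": "Warning"}},
]
alerts = unhealthy_components("hci-node-01", sample)
```

Each returned record could then be written to a custom log table, where an ordinary log alert rule on the severity field would raise the notification.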

    --please don't forget to upvote and accept as answer if the reply is helpful--
