Heartbeat - Aggregated by Datacenter

Daniel Lucas 21 Reputation points Microsoft Employee
2021-11-12T04:59:48.847+00:00

I have on-premises computers being monitored by Azure Monitor.
I would like to create an alert that fires when one of my datacenters is offline.
The alert would use the computer naming convention, which identifies the datacenter, to count the computers in each datacenter and calculate the expected number of heartbeats. If more than 50% of the servers in a datacenter are offline, it should generate an alert.

Is that possible?

Thanks

Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.

Accepted answer
  1. Alistair Ross 7,101 Reputation points Microsoft Employee
    2021-11-12T13:19:10.637+00:00

    Hi @Daniel Lucas ,
    Yes, you can do this, although it's hard to give an exact query without knowing your naming convention. Here is an example based on a made-up naming convention, using the extract() function to pull out the site code:

    let Threshold = 0.5;  
    let startTime = ago(10m);  
    let endTime = now();  
    // Count of Unique Computer Names  
    let ComputerCount = toscalar( Heartbeat  
    | where TimeGenerated between (startTime .. endTime)  
        | distinct Computer  
        | count);  
    // Expected Number of heartbeats in the time range  
    let heartbeatPerAgent = array_length(range(startTime, endTime, 1m)) -1;  
    // Number of Agents per computer. There can be more than one with the AMA and MMA  
    let agentsPerComputer = Heartbeat  
    | where TimeGenerated between (startTime .. endTime)  
    | distinct Computer, SourceComputerId  
    | summarize NumberOfAgents = count() by Computer;  
    // Gather Heartbeats  
    Heartbeat  
    | where TimeGenerated between (startTime .. endTime)  
    | summarize Count = count() by Computer  
    | join kind=innerunique (  
        // Join the number of agents to the summarized heartbeat data  
        agentsPerComputer  
    ) on Computer  
    | extend CountofHeartbeats = Count / NumberOfAgents  
    | extend ExpectedHeartbeats = heartbeatPerAgent  
    | order by Computer  
    //Used to simulate Computer naming convention based on my heartbeat data  
    | extend RenamedComputer = case ( row_number() < (ComputerCount /2),   
    strcat("SiteA", strrep("0", (4-strlen(tostring(row_number())))), row_number()),  
    strcat("SiteB", strrep("0", (4-strlen(tostring(row_number())))), row_number())  
    )  
    | project-away Computer, Computer1, NumberOfAgents  
    // Using Extract to identify the Site Code  
    |  extend Site = extract(@"(Site)(\w)(\d+)",2, RenamedComputer)     
    | summarize TotalHeartbeats = sum(CountofHeartbeats), ExpectedHeartbeats = sum(ExpectedHeartbeats) by Site  
    | where TotalHeartbeats < (ExpectedHeartbeats * Threshold)  
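
    The query above renames computers only to simulate a naming convention. Against real data, that simulation step can be dropped and extract() applied directly to Computer. What follows is a minimal sketch, not a tested alert rule. It assumes names of the form "SiteA0001" (a hypothetical convention), one heartbeat per agent per minute, and a one-day lookback (inventoryStart, also an assumption) to build the list of known computers. The lookback matters because a computer that sent no heartbeat in the 10-minute window does not appear in that window at all, so it would otherwise be missing from the expected total.

    ```kusto
    let Threshold = 0.5;
    let startTime = ago(10m);
    let endTime = now();
    // Assumption: any computer seen in the last day is expected to be online
    let inventoryStart = ago(1d);
    // Expected heartbeats per agent in the window, assuming one per minute
    let heartbeatPerAgent = (endTime - startTime) / 1m;
    // Known computers and their agent counts, taken from the longer lookback
    let inventory = Heartbeat
        | where TimeGenerated between (inventoryStart .. endTime)
        | summarize NumberOfAgents = dcount(SourceComputerId) by Computer;
    inventory
    | join kind=leftouter (
        // Heartbeats actually received in the alert window
        Heartbeat
        | where TimeGenerated between (startTime .. endTime)
        | summarize Count = count() by Computer
    ) on Computer
    // Computers with no heartbeat in the window get a count of 0
    | extend CountofHeartbeats = todouble(coalesce(Count, 0)) / NumberOfAgents
    | extend ExpectedHeartbeats = heartbeatPerAgent
    // Hypothetical convention: "Site" + one site character + digits
    | extend Site = extract(@"^(Site\w)\d+", 1, Computer)
    | summarize TotalHeartbeats = sum(CountofHeartbeats), ExpectedHeartbeats = sum(ExpectedHeartbeats) by Site
    | where TotalHeartbeats < (ExpectedHeartbeats * Threshold)
    ```

    Each row returned is a site that sent fewer than half of its expected heartbeats, so a log alert that fires when the result count is greater than 0 would match the 50% requirement. Computer names that do not match the pattern should end up grouped under an empty Site value, so adjust the regular expression to your real naming convention before relying on the result.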
    