Azure VM availability Report or dashboard

Bhanot Ravi 31 Reputation points
2021-07-16T14:47:41.26+00:00

I am looking for a way to get a consolidated view or extract of the availability status of my Azure virtual machines. These VMs report to several different Log Analytics workspaces.

I am looking for the best KQL query, or any other approach, that can extract the status and present it in a single view or dashboard.

Thanks,
RB

Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.

2 answers

  1. Bhanot Ravi 31 Reputation points
    2021-07-16T21:27:59.56+00:00

    VM Insights provides performance stats, but I am looking for data that shows whether a VM went down, rebooted unexpectedly, or had intermittent connectivity failures; basically, whether the VM heartbeat stopped. I want to view the availability status of all VMs from all subscriptions in a single dashboard, as those VMs report to different Log Analytics workspaces.

    Thanks,
    RB


  2. kobulloc-MSFT 26,321 Reputation points Microsoft Employee
    2021-07-23T00:08:42.213+00:00

    There are a lot of different ways to go about this (and going through the options would likely make for a good Azure Monitor question), but if we rule out VM Insights, a Kusto query gives you the most control and options, especially if you are looking at multiple Log Analytics workspaces. There are three key steps to creating this dashboard:

    Create a New Custom Dashboard
    The documentation will walk you through the process of creating a custom dashboard here:
    https://learn.microsoft.com/en-us/azure/azure-portal/azure-portal-dashboards

    1. Sign in to the Azure portal
    2. From the menu, select Dashboard
    3. Click on +New Dashboard > Blank dashboard
    4. Name the dashboard and click on Done customizing

    Create Your Kusto Query
    I suspect this is really the core of your question. There are a lot of possible options here, so it depends on what you are after. In my example, I'm using three queries recommended under Monitoring > Logs on the VM resource page.

    While you could leave this at just "Not Reporting VMs", there's some good information to be gained from "Last Heartbeat Status" and "Agent latency spikes - Heartbeat table". This is totally up to you, however, and there are a lot more queries you could add to this dashboard.


    Not Reporting VMs

    // Not reporting VMs   
    // VMs that have not reported a heartbeat in the last 5 minutes.   
    // To create an alert for this query, click '+ New alert rule'  
    Heartbeat   
    | where TimeGenerated > ago(24h)  
    | summarize LastCall = max(TimeGenerated) by Computer, _ResourceId  
    | where LastCall < ago(5m)  
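
    The query above only lists VMs that are silent right now. To also see past interruptions (for example, a VM that stopped reporting for a while and then came back), one option is to look for gaps between consecutive heartbeats. The following is a minimal sketch, not one of the built-in sample queries; the 5-minute threshold and the column names PrevComputer, PrevHeartbeat, GapStart, GapEnd and GapMinutes are assumptions chosen for illustration:

    ```kusto
    // Sketch: find gaps longer than 5 minutes between consecutive heartbeats.
    // The threshold and the new column names are illustrative assumptions.
    Heartbeat
    | where TimeGenerated > ago(24h)
    | project Computer, TimeGenerated
    // Sorting serializes the rows so that prev() can read the previous row
    | order by Computer asc, TimeGenerated asc
    | extend PrevComputer = prev(Computer), PrevHeartbeat = prev(TimeGenerated)
    // Only compare rows that belong to the same computer
    | where Computer == PrevComputer
    | extend GapMinutes = datetime_diff("minute", TimeGenerated, PrevHeartbeat)
    | where GapMinutes > 5
    | project Computer, GapStart = PrevHeartbeat, GapEnd = TimeGenerated, GapMinutes
    ```

    A gap here only means that no heartbeat reached the workspace; it does not by itself tell you whether the cause was a reboot, a stopped agent, or a network issue. A VM that is still down will not show a closed gap, so this is best used alongside "Not Reporting VMs".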
    

    Last heartbeat of each computer

    // Last heartbeat of each computer.   
    // Show the last heartbeat sent by each computer.   
    Heartbeat  
    | summarize arg_max(TimeGenerated, *) by Computer  
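
    If you would rather show a percentage than a timestamp on the dashboard, a rough availability figure can be derived from the same table. This is a sketch under stated assumptions: an hour counts as "available" if at least one heartbeat arrived in it, the window is fixed at 24 hours, and the names StartTime, HourBucket, HoursWithHeartbeat and AvailabilityPercent are made up for this example:

    ```kusto
    // Sketch: share of the last 24 hourly buckets in which each computer sent a heartbeat.
    // Bucket size, window and column names are illustrative assumptions.
    let StartTime = ago(24h);
    Heartbeat
    | where TimeGenerated > StartTime
    // bin_at aligns the buckets to StartTime so the window splits into exactly 24 hours
    | summarize Heartbeats = count() by Computer, HourBucket = bin_at(TimeGenerated, 1h, StartTime)
    | summarize HoursWithHeartbeat = count() by Computer
    | extend AvailabilityPercent = round(100.0 * HoursWithHeartbeat / 24, 1)
    | order by AvailabilityPercent asc
    ```

    Hourly buckets are coarse, so short outages inside an hour will not lower the figure, and a computer with no heartbeat at all in the window will not appear in the result.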
    

    Agent latency spikes - Heartbeat table

    // Agent latency spikes - Heartbeat table.   
    // Check for agent latency spikes in the ingestion of Heartbeats in the last 24 hours.   
    // This query calculates ingestion duration every 10 minutes, and looks for spikes  
    let StartTime = ago(24h);  
    let EndTime = now();  
    let MinRSquare = 0.9; // Minimum R-squared of the 2-line fit required to report a spike. Higher values make detection stricter, so fewer spikes are reported  
    Heartbeat  
    | where TimeGenerated between (StartTime .. EndTime)  
    // calculate ingestion duration in seconds  
    | extend AgentLatencySeconds = (_TimeReceived-TimeGenerated)/1s  
    // Create a time series  
    | make-series RatioSeries=avg(AgentLatencySeconds) default=0 on TimeGenerated in range(StartTime , EndTime,10m)  
    // Apply a 2-line regression to the time series  
    | extend (RSquare2, SplitIdx, Variance2, RVariance2, LineFit2) = series_fit_2lines(RatioSeries)  
    // Find out if our 2-line is trending up or down  
    | extend (Slope, Interception, RSquare, Variance, RVariance, LineFit) = series_fit_line(LineFit2)  
    // Check whether the line fit reaches the threshold, and if the spike represents an increase (rather than a decrease)  
    | project PatternMatch = iff(RSquare2 > MinRSquare and Slope>0, "Spike detected", "No spike")  
    

    To query across multiple resources/workspaces, you would use a union. For example:
    https://learn.microsoft.com/en-us/azure/azure-monitor/logs/cross-workspace-query#performing-a-query-across-multiple-resources

    union Update, workspace("contosoretail-it").Update, workspace("b459b4u5-912x-46d5-9cb1-p43069212nb4").Update  
    | where TimeGenerated >= ago(1h)  
    | where UpdateState == "Needed"  
    | summarize dcount(Computer) by Classification  
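
    Applied to this question, the same union pattern can wrap the "Not Reporting VMs" query so that one tile covers every workspace. A minimal sketch follows; the workspace identifiers are simply reused from the documentation example above as placeholders, and SourceWorkspace and MinutesSinceLastHeartbeat are names chosen for illustration:

    ```kusto
    // Sketch: "Not reporting VMs" across several workspaces.
    // Replace the placeholder workspace identifiers with your own.
    union withsource = SourceWorkspace
        Heartbeat,
        workspace("contosoretail-it").Heartbeat,
        workspace("b459b4u5-912x-46d5-9cb1-p43069212nb4").Heartbeat
    | where TimeGenerated > ago(24h)
    | summarize LastCall = max(TimeGenerated) by Computer, _ResourceId, SourceWorkspace
    // Keep only computers whose most recent heartbeat is older than 5 minutes
    | where LastCall < ago(5m)
    | extend MinutesSinceLastHeartbeat = datetime_diff("minute", now(), LastCall)
    | order by LastCall asc
    ```

    The withsource column shows which workspace each row came from, which helps when the same computer name exists in more than one subscription. You need read access to every workspace referenced in the union.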
    


    Save and Pin the Kusto Query to Your Dashboard
    After you have your query set, all that's left is to save and pin the query to your custom dashboard:
    https://learn.microsoft.com/en-us/azure/azure-monitor/visualize/tutorial-logs-dashboards

    Click Save, name the query, then select Pin to dashboard.



