Azure Log Analytics: Service Level Agreement (SLA) part 2
My post yesterday got a little long so, I’ve done a part2 to show how else you might group your solutions.
Here is the new query, I kept it simple so removed the JOIN and links to Perf and Memory that were shown in Part 1.
let start_time =startofday(ago(7d));
let end_time =startofday(now());
// Lets define our groups of computers, using various search techniques
let webServers = (Heartbeat
| where Computer startswith "WEB"
| distinct Computer
| project Computer
);
let domainControllers = (Event
| where EventLog == "Directory Service"
| distinct Computer
| project Computer
);
let aksServers = (Heartbeat
| where Computer has "aks"
| distinct Computer
| project Computer
);
let subnetServers = (WireData
| where LocalIP matches regex "10.118.51."
| where Computer !startswith "D"
| distinct Computer
| project Computer
);
Heartbeat
| where TimeGenerated > start_time and TimeGenerated < end_time
// import generated computer list(s)
| where Computer in (webServers, domainControllers, aksServers, subnetServers)
// Main calculation
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend total_number_of_buckets=round((end_time-start_time)/1h)
| extend availability_rate=total_available_hours*100/total_number_of_buckets
// Create some readable groupnames
| extend GroupName =
case(
Computer in (webServers), GroupName ="Web Servers",
Computer in (domainControllers),GroupName ="Domain Controllers",
Computer in (aksServers), GroupName ="AKS servers",
Computer in (subnetServers), GroupName ="Subnet51",
"Unknown"
)
// Show the results
| summarize
ServerCount=dcount(Computer),
Availability=avg(availability_rate),
AvailabilityFailedOn=countif(availability_rate < 100)
by GroupName
Essentially this query looks at 4 groups of servers:
1. Web Servers
2. AKS Servers (same as in part1)
3. Domain Controllers
4. A subnet
Web Servers and AKS, I used the same technique as I did in part1. However there is a subtle use of the ‘has’ for the AKS servers to pick up the text “aks” anywhere in the computer name.
Domain Controllers, with this group I used the Event table, and looked for the “Directory Service” log which should be on each DC.
Subnet, this uses a regex to look for a specific subnet, I also excluded any Computers that had a name that started with “D”.
These are few techniques you can use to group computers or services together.
I also wanted a friendly name for each service, so used the Case function for that:
| extend GroupName =
case(
Computer in (webServers), GroupName ="Web Servers",
Computer in (domainControllers),GroupName ="Domain Controllers",
Computer in (aksServers), GroupName ="AKS servers",
Computer in (subnetServers), GroupName ="Subnet51",
"Unknown"
)