Creating KQL Query to Detect and Alert on Offline Log Sources

Michael Redbourne 1 Reputation point
2022-01-24T23:56:16.267+00:00

G'Day,

We're trying to alert when one or more log sources go offline in Sentinel, then project or summarize the offline log source(s) into an offense for review. I'm using the Heartbeat table here as an example because most people will have it. The immediate way I thought of doing this was with set operations: pull the set of "heartbeats" from the last 24 hours (for example), then pull a separate set of heartbeats from the last hour. Using set_difference() ("A - B"), if the returned array is not empty, report an offense to Sentinel. (Evidently, if it's empty, nothing has gone offline.) The reverse case, B - A, is not a consideration, as that would be new log sources being added, which is fine.

If this was only a log source or two we needed information on, I'd create a different kind of rule. But in some cases we have thousands of high priority log sources that cannot go offline.

I've taken a variety of stabs at this using different methods of summarizing the data I need, with... no results.

let HBDay = Heartbeat
| where TimeGenerated > ago(24h)
| summarize make_set(Computer);
Heartbeat
| where TimeGenerated > ago(1h)
| summarize make_set(Computer)
| project set_difference(HBDay.set_Computer, set_Computer)

The last line, I suspect, isn't valid KQL. I was trying to pull the set out of the first query and then compare the two. That (expectedly) failed. I've tried variations of this: pulling the first set of data using distinct, assigning that to a variable, then checking it against the second set. Something like:

let A = Heartbeat
| where TimeGenerated > ago(24h)
| distinct Computer;
Heartbeat
| where TimeGenerated > ago(24h)
| distinct Computer
| where Computer !in ((A))

I haven't gotten anywhere, except a headache. Has anyone done something similar or have guidance?
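
For reference, the set-difference idea described above might be written in valid KQL along the following lines. This is only a sketch, not a tested detection rule: it assumes the 24h and 1h windows from the example, wraps each single-row `summarize` in `toscalar()` so the two sets become plain dynamic arrays, and uses `Offline` and `HBHour` as made-up names.

```kusto
// Set A: every computer that sent a heartbeat in the last 24 hours.
let HBDay = toscalar(
    Heartbeat
    | where TimeGenerated > ago(24h)
    | summarize make_set(Computer));
// Set B: every computer that sent a heartbeat in the last hour.
let HBHour = toscalar(
    Heartbeat
    | where TimeGenerated > ago(1h)
    | summarize make_set(Computer));
// A - B: seen in the last day, but silent for the last hour.
print Offline = set_difference(HBDay, HBHour)
| where array_length(Offline) > 0       // no row at all when nothing is offline
| mv-expand Offline to typeof(string)   // one row per offline computer
```

The second attempt above returns nothing for a simpler reason: both sides use the same 24h window, so every computer is filtered out by `!in`. Changing the `let` to `ago(1h)` while keeping the outer query at `ago(24h)` should give the same list as a table rather than an array. How either form behaves when the last-hour set is completely empty is worth checking before relying on it.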


2 answers

  1. Andrew Blumhardt 9,491 Reputation points Microsoft Employee
    2022-01-25T00:10:44.423+00:00

    There are several workbooks with related queries that can be used as examples. Maybe something like this:

    let DownLimit = ago(1h);
    union withsource=TableName1 *
    | project TimeGenerated, TableName1
    | summarize arg_max(TimeGenerated, *) by TableName1
    | where TimeGenerated < DownLimit
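
    The query above works per table. For the per-computer case in the question, the same "last seen" pattern could be applied to Heartbeat, roughly as sketched here. The 24h lookback and the column names `LastHeartbeat` and `SilentFor` are assumptions for illustration, not part of the original answer.

    ```kusto
    let DownLimit = ago(1h);
    Heartbeat
    | where TimeGenerated > ago(24h)                              // only computers seen in the last day
    | summarize LastHeartbeat = max(TimeGenerated) by Computer    // most recent heartbeat per computer
    | where LastHeartbeat < DownLimit                             // nothing received in the last hour
    | extend SilentFor = now() - LastHeartbeat                    // how long each computer has been quiet
    ```

    Each returned row is one computer that reported within the lookback but has gone quiet since, which avoids building and comparing two sets.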

  2. Anonymous
    2022-01-25T14:49:11.66+00:00

    // this is what I tend to run and prefer, this shows me Tables with no data in the past 15mins
    // and also the 3 previous 15min intervals (so I know in the Alert if this is normal or not)
    union *
    // Look back in the last hour, run this rule on a 5min frequency
    | make-series count() default=0 on TimeGenerated from ago(1h) to now() step 15m by Type
    | where count_[-1] == 0 // look at the last record [-1] and only show events when last data point was equal to zero
    | project-away TimeGenerated
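
    The same series idea could be pointed at Heartbeat and grouped by Computer instead of Type, as in this sketch. `WindowStart`, `WindowEnd` and `Beats` are illustrative names, and the explicit time filter is an addition that is not in the query above.

    ```kusto
    let WindowStart = ago(1h);
    let WindowEnd = now();
    Heartbeat
    | where TimeGenerated between (WindowStart .. WindowEnd)
    // Four 15-minute buckets per computer; empty buckets are filled with 0.
    | make-series Beats = count() default = 0 on TimeGenerated from WindowStart to WindowEnd step 15m by Computer
    | where Beats[-1] == 0        // the most recent bucket has no heartbeats
    | project Computer, Beats     // keep the counts so the alert shows the recent pattern
    ```

    One limitation of this sketch: a computer with no heartbeats anywhere in the hour has no rows after the filter, so it produces no series and is not reported. It catches computers that went quiet within the window, not ones that were already silent before it started.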
