Intermittent "Missing Heartbeat" Alerts in Sentinel Even Though Logs Show No Gap

Muhammad Arif Ahmed 0 Reputation points
2026-03-12T09:49:00.31+00:00

Hi everyone,

I have an on-premises virtual machine onboarded to Azure Arc and I’m collecting Heartbeat logs using the Azure Monitor Agent (AMA) in Microsoft Sentinel.

I created an analytics rule to trigger an alert if a heartbeat is missing for 10 minutes. The query I’m using is:

Heartbeat

However, I am occasionally receiving alerts for missing heartbeat, but when I check the Heartbeat table in Sentinel, I can see that the VM is still sending heartbeats and there doesn’t appear to be a gap.

So essentially:

• The alert fires, indicating a missing heartbeat
• But when I review the logs afterward, the heartbeat entries appear continuous

I’m trying to understand what could cause this behavior. For example:

• Could this be related to query execution timing or ingestion delay?
• Is there a better way to design a missing heartbeat alert for Azure Arc machines?
• Has anyone experienced similar false-positive heartbeat alerts?

Any suggestions or best practices would be appreciated.

Thanks!

Microsoft Security | Microsoft Sentinel

2 answers

  1. VEMULA SRISAI 13,030 Reputation points Microsoft External Staff Moderator
    2026-03-12T13:51:05.5366667+00:00

    Hello Muhammad Arif Ahmed,

    This behavior can occur due to query timing and ingestion delay when monitoring heartbeats in Microsoft Sentinel using the Azure Monitor Agent on machines connected through Azure Arc.

    Sometimes the heartbeat is generated on time by the VM, but it may take a few minutes before it is fully ingested into the Log Analytics workspace. If the analytics rule runs during that delay window, it may temporarily appear that the heartbeat is missing, which can trigger a false alert. Later, when the data finishes ingesting, the Heartbeat table appears continuous.
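
    One way to check whether ingestion delay is large enough to explain the alerts is to compare each record's ingestion time with its TimeGenerated. The sketch below assumes the ingestion_time() function is available for the Heartbeat table in the workspace; the 24-hour window and the output column names are illustrative choices, not part of the original answer.

    ```kql
    // Sketch only: the 24h window and column names are assumptions for illustration.
    Heartbeat
    | where TimeGenerated > ago(24h)
    // Delay between when the agent stamped the record and when it became queryable, in seconds
    | extend IngestionDelaySec = (ingestion_time() - TimeGenerated) / 1s
    | summarize AvgDelaySec = avg(IngestionDelaySec),
                P95DelaySec = percentile(IngestionDelaySec, 95),
                MaxDelaySec = max(IngestionDelaySec)
              by Computer
    ```

    If the maximum delay for the affected machine approaches the alert window, ingestion latency is a plausible explanation for the false positives.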

    A better approach is to design the alert rule so it calculates the time since the last heartbeat rather than simply checking if a record exists. For example:

    Heartbeat
    

    Then configure the alert rule with:

    • Measure: MinutesSinceLastHeartbeat
    • Aggregation type: Maximum
    • Aggregation granularity: Same as the evaluation frequency (for example, 5 minutes)

    Also make sure the Override query time range is larger than the threshold. For example, if you want to alert when a VM has no heartbeat for 10 minutes, set the query time range to something higher such as 30–60 minutes. Otherwise, a VM that stopped sending heartbeats may not appear in the query results at all.

    Many environments also add a buffer (for example 15 minutes instead of 10) to account for occasional ingestion latency.
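
    The example query in this answer did not come through beyond its first line. As a minimal sketch of the "time since last heartbeat" idea, the following could work. It assumes a 1-hour lookback and reuses the MinutesSinceLastHeartbeat name from the Measure setting above. The LastHeartbeat column and the lookback value are illustrative and are not the original author's query.

    ```kql
    // Sketch only: the 1h lookback is an assumption and should exceed the alert threshold.
    Heartbeat
    | where TimeGenerated > ago(1h)
    // Latest heartbeat seen per machine within the lookback window
    | summarize LastHeartbeat = max(TimeGenerated) by Computer
    // Age of that heartbeat in whole minutes at query time
    | extend MinutesSinceLastHeartbeat = datetime_diff('minute', now(), LastHeartbeat)
    ```

    In a Sentinel scheduled analytics rule, which alerts on the number of returned rows rather than on a measure, the threshold would presumably go in the query itself, for example by appending `| where MinutesSinceLastHeartbeat > 15`. A machine with no heartbeat anywhere in the lookback window will not appear in the results, which is why the lookback needs to be comfortably larger than the threshold.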

    This approach usually reduces false-positive alerts while still detecting genuine heartbeat failures.

    https://learn.microsoft.com/en-us/azure/azure-monitor/reference/queries/heartbeat

  2. Q&A Assist
    2026-03-12T09:49:24.2766667+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    This behavior is consistent with how log-based “missing heartbeat” alerts can misfire due to ingestion latency and the way log alerts are evaluated.

    Key points from the scenario and context:

    1. Log alerts are not ideal for detecting “lack of data”
      Log search alerts work best when detecting the presence of specific events, not the absence of events. For heartbeats, Azure Monitor documentation explicitly notes that log alerts are less effective when trying to detect missing data, such as a virtual machine heartbeat. Latency or short gaps in ingestion can cause the rule to evaluate “no results” at a particular run, fire an alert, and then later the delayed heartbeat records appear in the Heartbeat table, making it look like there was never a gap.
    2. Ingestion latency can cause false positives
      Azure Monitor processes large volumes of logs, and ingestion latency or spikes can occur. When the analytics rule runs, if the relevant Heartbeat records have not yet been ingested, the query can temporarily return no data for that 10‑minute window, causing the alert to fire. Later, once ingestion completes, the Heartbeat table appears continuous. The troubleshooting guidance for log alerts highlights that latency can lead to unexpected alert triggers and recommends considering metric-based alerts when misfires occur on latent data.
    3. General ingestion issues can affect all heartbeats
      If there are broader ingestion issues in the Log Analytics workspace or region, heartbeats from Linux or Windows agents may temporarily not appear, even though the agents are still sending them. The Linux heartbeat troubleshooting guidance shows that a query returning no heartbeats can indicate ingestion problems rather than an agent or VM issue. Similar ingestion issues can intermittently affect Arc machines using AMA.
    4. Better design for heartbeat monitoring of Arc machines
      For Azure and Arc-enabled servers, the recommended pattern is:
      • Use metric-based alerts where possible for availability/heartbeat scenarios, because metrics are near real time and less affected by ingestion latency than logs. Azure Monitor supports sending data to the metric store and then using metric alerts, which are more reliable for “is it alive” checks.
      • Use log alerts for detecting specific error events or patterns in logs, not for detecting missing heartbeats.
      For virtual machines in Azure, best practices recommend:
      • An availability metric alert to track when a VM is running (for Azure VMs).
      • An agent heartbeat alert rule to verify agent health, but with the understanding that this is a log-based alert and subject to ingestion behavior.
      For Arc-enabled servers, the same principle applies: use Azure Monitor Agent with data collection rules to send metrics and logs, and prefer metric alerts when possible for heartbeat/availability.
    5. Sentinel analytics rule integrity and timing
      Sentinel provides the SentinelHealth table, queried through the _SentinelHealth() function, to monitor analytics rule execution, including delays and failures. If a scheduled or near-real-time (NRT) rule experiences delays or query execution issues, it can affect when and how the rule evaluates the Heartbeat data. Monitoring rule delay and status can help confirm whether rule execution timing contributes to the false positives.
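
    As a rough sketch of the rule-health check in point 5, a query along these lines could list recent analytics rule runs that did not succeed. It assumes health monitoring is enabled for the Sentinel workspace and that the column names and values match the SentinelHealth schema as commonly documented. Verify them against your own workspace before relying on the results.

    ```kql
    // Sketch only: column names and values are assumed from the SentinelHealth schema.
    _SentinelHealth()
    | where TimeGenerated > ago(1d)
    | where SentinelResourceType == "Analytics Rule"
    // Keep only runs that were not reported as successful
    | where Status != "Success"
    | project TimeGenerated, SentinelResourceName, OperationName, Status, Description, Reason
    | order by TimeGenerated desc
    ```

    If the heartbeat rule shows up here around the times of the false alerts, rule execution timing is worth investigating alongside ingestion delay.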

    Practical recommendations based on the above:

    1. Prefer a metric-based heartbeat/availability alert for Arc-enabled servers where feasible, instead of a pure log-based “missing Heartbeat” rule.
    2. If a log-based rule must be used:
      • Increase the missing-heartbeat window beyond 10 minutes to reduce sensitivity to short ingestion delays.
      • Consider having the query look back over a larger time window and summarize the most recent TimeGenerated per Computer, so that a brief ingestion gap does not read as a missing heartbeat.
    3. Monitor for ingestion issues:
      • Use queries similar to the Linux heartbeat troubleshooting example to verify whether heartbeats from multiple machines are missing at the same time, which indicates ingestion problems rather than a single VM issue.
      • Check Azure service health and ingestion latency guidance when multiple agents show missing heartbeats.
    4. Use _SentinelHealth() to monitor Sentinel analytics rule delays and failures, and alert on significant rule delays that might correlate with false-positive heartbeat alerts.
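
    To illustrate recommendation 3, the following sketch counts how many machines currently look stale at the same time. The 1-hour lookback and the output column names are assumptions. The 10-minute staleness value comes from the question.

    ```kql
    // Sketch only: lookback and column names are illustrative assumptions.
    Heartbeat
    | where TimeGenerated > ago(1h)
    | summarize LastHeartbeat = max(TimeGenerated) by Computer
    | extend MinutesSinceLastHeartbeat = datetime_diff('minute', now(), LastHeartbeat)
    // Compare the number of stale machines with the total reporting in the window
    | summarize StaleComputers = countif(MinutesSinceLastHeartbeat > 10),
                TotalComputers = count()
    ```

    If many machines look stale at once, a workspace or regional ingestion problem is more likely than a failure of a single VM.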

    These steps align with Azure Monitor and Sentinel guidance: use metrics for near-real-time availability/heartbeat checks, use logs for content-based detection, and account for ingestion latency when designing log-based “missing data” alerts.

