Graph API event availability latency

Andrew Lytle 10 Reputation points
2023-03-24T17:36:46.5266667+00:00

Hello all,

We are ingesting data from several Microsoft Graph API endpoints (Alerts, Risk Detections, Risky Users, Directory Audits, Sign-Ins) into our internal SIEM-like system. We use a time-based polling method: on each iteration we query all the events that were created since the previous poll.

What we've noticed is that we often miss events that show up in the API later with older timestamps. For example, if we query all events from 09:00-09:05, we get N total events. If we wait some period of time (minutes to hours) and then make the exact same query, we might receive more than N events. Looking at the individual events, it appears that there is some latency before they appear, but the timestamps associated with those logs are in the past. The timestamp is probably correct in that it reflects when the event occurred, but not when it "became available for query" in the API. This makes it very tricky to make sure we are capturing all the event data.
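For reference, a window query like the 09:00-09:05 example could be built roughly as follows. This is only a minimal sketch: the helper names are ours, and it assumes the endpoint accepts `ge`/`lt` filters on the given timestamp field. The field name differs between endpoints (for example, `createdDateTime` on sign-ins), so check each endpoint's documentation.

```python
from datetime import datetime, timezone
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def to_odata(ts: datetime) -> str:
    """Format an aware datetime as a UTC timestamp usable in an OData $filter."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_window_url(resource: str, time_field: str,
                     start: datetime, end: datetime) -> str:
    """Build a URL selecting events with start <= time_field < end.

    `resource` and `time_field` are supplied by the caller; which fields are
    filterable on which endpoint is an assumption to verify per endpoint.
    """
    flt = f"{time_field} ge {to_odata(start)} and {time_field} lt {to_odata(end)}"
    return f"{GRAPH_BASE}/{resource}?$filter={quote(flt)}"


if __name__ == "__main__":
    start = datetime(2023, 3, 24, 9, 0, tzinfo=timezone.utc)
    end = datetime(2023, 3, 24, 9, 5, tzinfo=timezone.utc)
    print(build_window_url("auditLogs/signIns", "createdDateTime", start, end))
```

Using a half-open window (`ge` start, `lt` end) avoids counting an event twice when consecutive windows share a boundary.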

We read through the Graph API latency SLA page (which appears to have been removed in the last few months; it used to be at https://learn.microsoft.com/en-us/azure/active-directory/reports-monitor). It stated that in some circumstances it might take a number of hours for events to become available for query, depending on the type of event.

To validate this, we started a test where we "delay" our queries. We still poll about ten minutes after the time window (which is our default), but we're also querying the same time window after 2 hours and 8 hours, looking for how many total events have appeared in the queries. Our goal here is to understand whether we can capture all the events from the API if we wait longer before we query a given time range.
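The comparison in this test can be sketched as below. It assumes each event carries a stable unique ID, and it defines the "late" share as the fraction of all events eventually seen for a window that were missing from the initial query. That definition and the function name are our own choices for illustration.

```python
from typing import Iterable


def late_fraction(initial_ids: Iterable[str], delayed_ids: Iterable[str]) -> float:
    """Fraction of events for one time window that only appeared in the delayed query.

    initial_ids: event IDs returned by the first (about 10 minute) query.
    delayed_ids: event IDs returned by the same query repeated later (e.g. 2h or 8h).
    """
    initial, delayed = set(initial_ids), set(delayed_ids)
    late = delayed - initial          # events that were not visible at first
    total = len(initial | delayed)    # everything ever seen for the window
    return len(late) / total if total else 0.0


if __name__ == "__main__":
    first = ["a", "b", "c", "d"]
    after_2h = ["a", "b", "c", "d", "e"]
    print(f"{late_fraction(first, after_2h):.1%} of events arrived late")
```

Tracking this value per endpoint and per window makes it easier to see whether the late arrivals cluster around particular times or event types.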

What we've found is that the share of events which first appear in the 2-hour and 8-hour delayed queries varies widely, from as low as 0.01% to as much as 20% over a 24-hour period.

Would it be possible to confirm whether this is expected? Our analysis is leading us to believe that there's a scaling event, resource contention, or other reason out of our control which is affecting the timeliness of these events.

Thanks!

Microsoft Security Microsoft Graph

1 answer

  1. Erkan Sahin 840 Reputation points
    2023-03-25T13:37:05.4633333+00:00

    It is possible that there is a delay in events becoming available for query in the Graph API due to a variety of factors such as resource contention, scaling events, or other issues that may be out of your control. The fact that the number of events that appear in the delayed queries varies widely, from 0.01% to up to 20%, over a 24-hour period seems to suggest that there may be some latency in the API.

    To help identify the cause of this latency, you may want to consider monitoring the performance of your API queries over time. This could involve measuring the time it takes for events to become available for query and tracking any trends or patterns that emerge. You may also want to consider working with Microsoft support to troubleshoot the issue and identify any potential solutions.

    In the meantime, one potential workaround could be to increase the time delay before querying a given time range. While this may not guarantee that you capture all events, it could help ensure that you capture more events over time. Another option could be to use a different method for ingesting data, such as real-time streaming, if that is available in the Graph API.

    Please mark if my answer is helpful :-)

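One way to apply the "longer delay" workaround from the answer without giving up timeliness is to re-query an overlapping lookback window on every poll and drop events already ingested. The sketch below is an illustration under stated assumptions, not a Microsoft-recommended pattern: `LookbackPoller`, the `fetch` callback, and the event shape (`{"id": str, "ts": datetime}`) are all hypothetical, and the 8-hour lookback simply mirrors the longest delay tested in the question.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable

# Hypothetical callback: returns events whose timestamp is in [start, end).
Fetch = Callable[[datetime, datetime], Iterable[dict]]


class LookbackPoller:
    """Re-query a sliding lookback window and emit only events not seen before."""

    def __init__(self, fetch: Fetch,
                 lookback: timedelta = timedelta(hours=8),
                 settle: timedelta = timedelta(minutes=10)) -> None:
        self.fetch = fetch
        self.lookback = lookback  # how far back each poll re-reads
        self.settle = settle      # initial delay before a window is first queried
        self.seen: dict[str, datetime] = {}  # event id -> event timestamp

    def poll(self, now: datetime) -> list[dict]:
        end = now - self.settle
        start = end - self.lookback
        fresh = []
        for ev in self.fetch(start, end):
            if ev["id"] not in self.seen:
                self.seen[ev["id"]] = ev["ts"]
                fresh.append(ev)
        # Forget IDs that have fallen out of the window so memory stays bounded.
        self.seen = {i: ts for i, ts in self.seen.items() if ts >= start}
        return fresh


if __name__ == "__main__":
    t0 = datetime(2023, 3, 24, 9, 0, tzinfo=timezone.utc)
    visible = [{"id": "a", "ts": t0}]

    def fake_fetch(start, end):
        return [e for e in visible if start <= e["ts"] < end]

    poller = LookbackPoller(fake_fetch)
    print(poller.poll(t0 + timedelta(minutes=15)))
    # A late event with an older timestamp becomes visible afterwards.
    visible.append({"id": "b", "ts": t0 + timedelta(minutes=1)})
    print(poller.poll(t0 + timedelta(hours=2)))
```

The trade-off is extra API traffic, since every poll re-reads the full lookback window, and events that surface later than the lookback period are still missed.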
