Gateway Timeout on auditLogs/signIns

Andrew Batchelor 1 Reputation point
2022-04-13T22:25:11.363+00:00

While calling the auditLogs/signIns endpoint with paging via $top and $skiptoken, some applications return a 504 Gateway Timeout error after dozens of pages have been retrieved successfully.

Once we get this error, retrying with the same URL returns a 400 with "Skip Token is null". By contrast, when we get a 429 error, we are able to retry with the same URL after the throttling period has passed.

We have tried using smaller page sizes (as low as 500 records) but still run into the same problem for the same applications.

Any idea what could be causing this issue?

Perhaps a certain sign-in record has more data than others and leads to a timeout?

Also, is there a way to limit the data returned per sign-in? All we really need is the date of the sign-in and a user ID, but the $select option is not supported.
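
Since $select is reported as unsupported here, one workaround is to drop the unneeded fields client-side as each page arrives. This does not shrink the response the service has to build, so it is unlikely to help with the timeout itself. A minimal sketch, assuming each page is the parsed JSON body with a `value` array whose records carry `createdDateTime` and `userId` (the helper name `project_sign_ins` is hypothetical):

```python
from typing import Any, Dict, Iterator, Tuple


def project_sign_ins(page: Dict[str, Any]) -> Iterator[Tuple[str, str]]:
    """Yield (createdDateTime, userId) pairs from one parsed response page.

    Assumes the usual collection shape {"value": [...]}.
    Records missing either field are skipped rather than raising.
    """
    for record in page.get("value", []):
        created = record.get("createdDateTime")
        user_id = record.get("userId")
        if created is not None and user_id is not None:
            yield created, user_id
```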

Here are the headers and response from the Gateway Timeout error:
URL: https://graph.microsoft.com/beta/auditLogs/signIns?$filter=createdDateTime+ge+2022-03-16+and+appId+eq+%<<appid>>%27&$top=500&$skiptoken=47d96df265450ef838de40dac783215e22736d0aede079fb5f631fe09364a515

"request-id" "de8908f3-f79e-4f1f-b672-6977089b9329"
client-request-id "de8908f3-f79e-4f1f-b672-6977089b9329"
"x-ms-ags-diagnostic" {"ServerInfo":{"DataCenter":"West US 2","Slice":"E","Ring":"1","ScaleUnit":"002","RoleInstance":"MWH0EPF0005A6AE"}}
"Date" Fri, 15 Apr 2022 03:29:02 GMT

Response Body:
{"error":{"code":"UnknownError","message":"","innerError":{"date":"2022-04-15T03:29:02","request-id":"de8908f3-f79e-4f1f-b672-6977089b9329","client-request-id":"de8908f3-f79e-4f1f-b672-6977089b9329"}}}

And then, if we retry with the same URL:
{"error":{"code":"UnknownError","message":"Invalid Skip Token, skip token is null","innerError":{"date":"2022-04-15T03:36:00","request-id":"c48d64c7-8cd7-487a-901e-b06df5090161","client-request-id":"c48d64c7-8cd7-487a-901e-b06df5090161"}}}

This was on the beta API, but we experienced the same behavior on the v1.0 version as well.

We do use the Retry-After header and back off calls for the interval given. It doesn't seem like a throttling issue so much as a problem with a specific page load.
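
For readers trying to reproduce this, the paging behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the question: `fetch` is a hypothetical callable returning `(status, headers, parsed_body)`, the `Retry-After` value is assumed to be a number of seconds, and the next page URL is read from `@odata.nextLink`. A 429 is retried on the same URL after the advertised wait. Any other failure, including the 504, is raised with the failing URL, because a same-URL retry after the 504 returned a 400 in this thread.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional, Tuple

# Hypothetical fetcher signature: url -> (status_code, headers, parsed_json_body)
Fetch = Callable[[str], Tuple[int, Dict[str, str], Dict[str, Any]]]


class PageFailed(Exception):
    """Raised when a page fails in a way a same-URL retry is not expected to fix."""


def iter_pages(first_url: str, fetch: Fetch,
               sleep: Callable[[float], None] = time.sleep,
               max_throttle_retries: int = 5) -> Iterator[Dict[str, Any]]:
    """Yield parsed pages, following @odata.nextLink until it is absent."""
    url: Optional[str] = first_url
    while url:
        for _ in range(max_throttle_retries + 1):
            status, headers, body = fetch(url)
            if status != 429:
                break
            # Throttled: wait for the advertised interval, then retry the same URL.
            sleep(float(headers.get("Retry-After", "1")))
        if status != 200:
            # Covers the 504 case and a 429 that never cleared.
            raise PageFailed(f"HTTP {status} for {url}")
        yield body
        url = body.get("@odata.nextLink")  # absent on the last page
```

Recording the failing URL in the exception makes it easier to restart the query from the beginning with a narrower filter instead of reusing a skip token that the service no longer accepts.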

Microsoft Entra ID

2 answers

  1. Vicky Kumar (Mindtree Consulting PVT LTD) 1,161 Reputation points Microsoft Employee
    2022-04-14T08:28:22.873+00:00

    Throttling behavior can depend on the type and number of requests. For example, if you have a high volume of requests, all request types are throttled. You may also be affected if you have consumed more than 0.8 of the limit, as mentioned in the docs.

    [Image: 193016-image.png]

    The following are best practices for handling throttling:

    Reduce the number of operations per request.
    Reduce the frequency of calls.
    Avoid immediate retries, because all requests accrue against your usage limits.

    Note: there are also service-specific limits.

    For more information about throttling, see the documentation: https://learn.microsoft.com/en-us/graph/throttling
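
The "avoid immediate retries" advice can be illustrated with a small delay calculation. This is a sketch under assumptions rather than anything prescribed by the throttling documentation: the function name `retry_delay`, the 1-second base, and the 60-second cap are all hypothetical choices. It honors a numeric `Retry-After` value when one is present and otherwise falls back to capped exponential backoff.

```python
from typing import Mapping, Optional


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Return the seconds to wait before retry number `attempt` (0-based)."""
    raw: Optional[str] = headers.get("Retry-After")
    if raw is not None:
        try:
            return max(0.0, float(raw))
        except ValueError:
            pass  # not a plain number of seconds; fall through to backoff
    # No usable header: double the wait on each attempt, up to the cap.
    return min(cap, base * (2 ** attempt))
```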


  2. Pak-Hun Chan 15 Reputation points
    2024-07-10T17:18:17.88+00:00

    I know this thread is old, but since this post is the top search result, I wanted to share some information here for anyone encountering this problem.


    If you're suddenly facing a 504 error, check your $filter and $top parameters. It's likely that your $filter is so specific that there aren't enough records to meet the $top value, causing the compute that's backing the Graph API to search through EVERY row of data.

    This extensive search may not complete within the ~2 min timeout, resulting in a 504 Gateway Timeout. Graph API seems to store only the last 30 days of sign-ins, so it's possible that your API calls work perfectly on some days and fail repeatedly on other days.

    In my case, I had set $filter=appDisplayName eq 'oneOfMyOrganizationsAppName' and $top=100, but it started timing out one day. While debugging, I noticed the 504 error persisted even as I reduced $top to 20 and then to 10. It finally worked when I reduced $top to 2, but this made filtering by application id/name completely impractical. Since there are periods when users rarely sign into the applications we were tracking, I decided to refactor my code to remove the application-specific filters from the Graph API $filter parameter, and handled the filtering on our backend compute. Since then, I haven't had any issues
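
The backend-side filtering described above might look something like this. It is only a sketch of the idea, not the code from this answer: it assumes the sign-in records have already been fetched without the application filter and parsed into dictionaries with an `appDisplayName` field. The function name is hypothetical, and the case-insensitive comparison is a choice made for this sketch.

```python
from typing import Any, Dict, Iterable, Iterator


def sign_ins_for_app(records: Iterable[Dict[str, Any]],
                     app_display_name: str) -> Iterator[Dict[str, Any]]:
    """Keep only the sign-ins whose appDisplayName matches the given name."""
    wanted = app_display_name.casefold()
    for record in records:
        # Treat a missing or null appDisplayName as an empty string.
        name = record.get("appDisplayName") or ""
        if name.casefold() == wanted:
            yield record
```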

    Your API calls would probably still be okay when filtering by application id/name in the API parameters if you also filter using a very recent createdDateTime value (e.g., $filter = createdDateTime ge <ISO-8601 datetime for 6 hours ago> and appDisplayName eq '<app name>').
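
A filter of that shape can be built like this. It is a rough sketch: the function name and the `now` parameter (included so the result can be checked with a fixed time) are hypothetical, and the single quote is doubled because that is how OData string literals escape it. The returned string still needs to be URL-encoded before it is placed in the query string.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recent_app_filter(app_display_name: str, hours: int = 6,
                      now: Optional[datetime] = None) -> str:
    """Build a $filter value limited to the last `hours` hours for one app.

    `now` should be a timezone-aware datetime; it defaults to the current UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    since = (now - timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    escaped = app_display_name.replace("'", "''")  # OData doubles single quotes
    return f"createdDateTime ge {since} and appDisplayName eq '{escaped}'"
```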

