MS Graph API: High Frequency of Unique Error Responses During User and Chat Operations

Ron Farkash 0 Reputation points
2025-06-05T08:16:05.2666667+00:00

Hello. Over the past few months I have used the MS Graph API extensively: fetching users, managing change notifications for various resources, fetching chats, events, and online meetings, adding and removing applications in chats, and more.

The following list describes in detail the actions I perform on a daily basis:

fetch users

subscribe/unsubscribe/renew/get subscriptions of user chat messages

subscribe/unsubscribe/renew/get subscriptions of user chats

subscribe/unsubscribe/renew/get subscriptions of user online meeting

subscribe/unsubscribe/renew/get subscriptions of user calendar

install/remove Teams bot application from chats

get chats/events/online meetings by id

I use the most recent versions of the msgraph-sdk and msgraph-beta-sdk Python packages to perform those actions, mostly the non-beta one.

Over the past months I have experienced a set of odd errors that I have no way to handle other than retrying the same request and hoping the error does not recur.

Those errors occur in my local environment (and my colleagues') and also on Kubernetes, which makes an environment-related cause unlikely.

Overall my system works, but only with the help of retries, and it is becoming very difficult to manage.
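For context, the kind of retry wrapper described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the SDK's own retry handling: `TransientError` and `call_with_retries` are hypothetical names, and mapping a real SDK exception to a status code is left out. It uses exponential backoff with full jitter so that many clients do not retry in lockstep.

```python
import random
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status code."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message or f"HTTP {status_code}")
        self.status_code = status_code


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    retry_statuses: Iterable[int] = (500, 502, 503, 504),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on the given statuses with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    retryable = set(retry_statuses)
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError as exc:
            # Give up on non-retryable statuses or when attempts are exhausted.
            if exc.status_code not in retryable or attempt == max_attempts:
                raise
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is only a convenience for testing the wrapper without real waiting; in production the default `time.sleep` applies.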

The status codes range from 500 to 504, indicating server-side issues beyond my application's control.
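A small sketch of how these status codes might be classified before deciding whether to retry. It assumes that 500, 502, 503, and 504 are worth retrying and that 501 is not, and that a response may carry a standard HTTP `Retry-After` header (either a number of seconds or an HTTP date). The function names are illustrative, not part of any SDK.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

# Assumption: these statuses are treated as transient. 501 is excluded because
# "Not Implemented" does not normally change between attempts.
RETRYABLE_STATUSES = frozenset({500, 502, 503, 504})


def is_retryable(status_code: int) -> bool:
    """Return True when the status is in the assumed transient set."""
    return status_code in RETRYABLE_STATUSES


def retry_after_seconds(
    headers: Mapping[str, str], now: Optional[datetime] = None
) -> Optional[float]:
    """Parse Retry-After as seconds or an HTTP date; None if absent or invalid."""
    value = next(
        (v for k, v in headers.items() if k.lower() == "retry-after"), None
    )
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without an offset as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    # A date in the past means "retry now".
    return max(0.0, (when - current).total_seconds())
```

When a server-provided delay is available, using it in place of a locally computed backoff is usually the safer choice.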

The following list describes the errors I experience (I have omitted some data that may be considered sensitive):

error=MainError(additional_data={}, code='ExtensionError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=, odata_type=None, request_id=''), message='Operation: Update; Exception: [A task was canceled.]', target=None))

error=MainError(additional_data={}, code='ExtensionError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=, odata_type=None, request_id=''), message='Operation: Update; Exception: [Status Code: InternalServerError; Reason: Failed to execute backend request.]', target=None))

error=MainError(additional_data={}, code='ExtensionError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=datetime.datetime(2025, 6, 4, 15, 7, 9), odata_type=None, request_id=''), message='Operation: Update; Exception: [Status Code: BadGateway; Reason: ]', target=None))

error=MainError(additional_data={}, code='UnknownError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=, odata_type=None, request_id=''), message='Bad Gateway', target=None)

error=MainError(additional_data={}, code='UnknownError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=datetime.datetime(2025, 6, 4, 14, 50, 46), odata_type=None, request_id=''), message='Service Unavailable', target=None)

error=MainError(additional_data={}, code='BadGateway', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=, odata_type=None, request_id=''), message='Failed to execute backend request.', target=None)

error=MainError(additional_data={}, code='ExtensionError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=, odata_type=None, request_id=''), message='Operation: Read; Exception: [Status Code: BadGateway; Reason: ]', target=None)

HTTP Request: GET https://graph.microsoft.com/v1.0//chats/{chat_id} "HTTP/2 504 Gateway Timeout"

error: MainError(additional_data={}, code='UnknownError', details=None, inner_error=InnerError(additional_data={}, client_request_id='', date=, odata_type=None, request_id=''), message='You do not have permission to view this directory or page using the credentials that you supplied.', target=None)
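To make errors like the ones listed above easier to log and compare, they could be reduced to a flat summary. The sketch below relies only on the attribute names visible in the printed errors (`code`, `message`, `inner_error`, `request_id`, `client_request_id`, `date`) and on the `Operation: ...; Exception: [...]` message pattern. `summarize_error` is a hypothetical helper, and the `SimpleNamespace` object merely imitates the error shape for illustration.

```python
import re
from types import SimpleNamespace
from typing import Any, Dict, Optional

# Matches messages such as:
#   Operation: Update; Exception: [Status Code: BadGateway; Reason: ]
_EXTENSION_MESSAGE = re.compile(
    r"Operation:\s*(?P<operation>\w+);\s*Exception:\s*\[(?P<detail>.*)\]"
)
_STATUS = re.compile(r"Status Code:\s*(?P<status>\w+)")


def summarize_error(error: Any) -> Dict[str, Optional[str]]:
    """Flatten an error object into a dict suitable for structured logging."""
    inner = getattr(error, "inner_error", None)
    message = getattr(error, "message", None) or ""
    date = getattr(inner, "date", None)
    summary: Dict[str, Optional[str]] = {
        "code": getattr(error, "code", None),
        "message": message,
        # Empty strings are normalised to None so missing IDs are obvious.
        "request_id": getattr(inner, "request_id", None) or None,
        "client_request_id": getattr(inner, "client_request_id", None) or None,
        "date": str(date) if date else None,
        "operation": None,
        "backend_status": None,
    }
    match = _EXTENSION_MESSAGE.search(message)
    if match:
        summary["operation"] = match.group("operation")
        status = _STATUS.search(match.group("detail"))
        if status:
            summary["backend_status"] = status.group("status")
    return summary


# Illustrative object imitating one of the errors listed above.
example = SimpleNamespace(
    code="ExtensionError",
    message="Operation: Update; Exception: [Status Code: BadGateway; Reason: ]",
    inner_error=SimpleNamespace(request_id="", client_request_id="", date=None),
)
print(summarize_error(example)["backend_status"])  # prints "BadGateway"
```

Keeping `request_id` and `date` in the summary matters mainly because those are the values a support engineer would typically ask for when tracing a failed request.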

The errors above occur sporadically, at random times, with no correlation to the actions I perform. Some appear more often when renewing subscriptions, some when adding the bot to a chat, and some when fetching users or even chats. My application permissions have been untouched for months and have been consented to, and my system is tested daily. I have reviewed my configuration and have not identified any recent changes that could be causing these issues.
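One way to check whether the failures really are uncorrelated with the operation being performed is to tally outcomes per operation. The following is a rough sketch under the assumption that every call is recorded as an `(operation, status)` pair; the operation labels in the usage example are hypothetical, and only statuses 500 to 504 are counted as failures.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class OperationStats(NamedTuple):
    total: int
    failures: int
    failure_rate: float


def failure_rates(attempts: Iterable[Tuple[str, int]]) -> Dict[str, OperationStats]:
    """Compute per-operation totals and the share of 500-504 responses."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for operation, status in attempts:
        totals[operation] += 1
        if 500 <= status <= 504:
            failures[operation] += 1
    return {
        op: OperationStats(totals[op], failures[op], failures[op] / totals[op])
        for op in totals
    }


# Hypothetical records: (operation label, HTTP status of the response).
records = [
    ("renew_subscription", 200),
    ("renew_subscription", 502),
    ("get_chat", 504),
    ("get_chat", 200),
    ("get_chat", 200),
    ("list_users", 200),
]
stats = failure_rates(records)
```

Comparing `failure_rate` across operations over a longer period would show whether, for example, subscription renewals fail noticeably more often than reads.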

It is also important to mention that since yesterday I have been receiving "Service Unavailable" 502 responses at a high frequency, which hinders my system to the point where it may become unusable.

The number of errors has grown over time in an unpredictable manner, gradually preventing my system from functioning correctly. I am therefore seeking help in finding the cause of these errors, handling them correctly, and becoming less dependent on retry mechanisms.

Thank you in advance.

Microsoft Security | Microsoft Graph