Understanding SCIM Client Request Behavior and Expected Load for Large-Scale Operations

Question

Amrit Khanna 0 Reputation points

We are developing a SCIM 2.0 compliant service and are preparing to integrate with Microsoft Entra ID as an identity provider for our customers. To ensure we can build a robust and scalable solution, we need to understand the specific request patterns of the Entra ID SCIM client, particularly for scenarios involving a large number of users (e.g., thousands).

Our primary questions are about the expected request load for different SCIM operations:

1. Initial Bulk Synchronization / Onboarding:

When a customer onboards and assigns our SCIM application to a group in Entra ID containing several thousand users, at what rate will the Entra ID SCIM client send requests to our server?
How are these requests typically distributed across our SCIM endpoints? For example, what is the expected rate of POST /Users requests?
Does the client send requests sequentially or in parallel? If in parallel, what is the typical level of concurrency we should expect?

2. Bulk Deactivation / Deletion:

Similarly, if an administrator unassigns a large number of users from our application (or if users are deleted), what is the expected request rate for PATCH requests (to set active to false) or DELETE /Users/{id} requests?
How does Entra ID handle the deletion of a group with a large membership? Will it trigger individual requests for each user, and if so, at what rate?
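
For context, the deactivation request mentioned above is a SCIM 2.0 PatchOp message (RFC 7644) that sets active to false. The Python sketch below shows one way a server might detect it. The function names requested_active_state and _as_bool are hypothetical, and the tolerance for a capitalized op value and for string booleans such as "False" is a defensive assumption about client variations, not a statement of what the Entra ID client sends.

```python
PATCH_OP_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def _as_bool(value):
    """Accept real booleans and the strings "true"/"false" (any case)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value.strip().lower() in ("true", "false"):
        return value.strip().lower() == "true"
    return None


def requested_active_state(patch_body):
    """Return True or False if the PATCH sets `active`, otherwise None."""
    if PATCH_OP_SCHEMA not in patch_body.get("schemas", []):
        return None
    result = None
    for op in patch_body.get("Operations", []):
        # Op names are compared case-insensitively (defensive assumption).
        if str(op.get("op", "")).lower() not in ("replace", "add"):
            continue
        path, value = op.get("path"), op.get("value")
        if path is None and isinstance(value, dict):
            # Path-less form: {"op": "replace", "value": {"active": false}}
            value = value.get("active")
        elif str(path).lower() != "active":
            continue
        parsed = _as_bool(value)
        if parsed is not None:
            result = parsed
    return result


example = {
    "schemas": [PATCH_OP_SCHEMA],
    "Operations": [{"op": "Replace", "path": "active", "value": "False"}],
}
deactivate = requested_active_state(example) is False
```

A handler could call requested_active_state on each PATCH /Users/{id} body and treat a False result as a soft delete.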

3. Client-Side Throttling and Retry Logic:

If our server responds with a 429 Too Many Requests status code and includes a Retry-After header, what is the precise behavior of the Entra ID SCIM client?
Will it pause all provisioning activities for the duration specified in the Retry-After header for that specific customer?
In the case of throttling during a large group membership update, will the client retry only the failed requests, or will it re-synchronize the entire group membership?

Navya 19,795 Reputation points Microsoft External Staff Moderator

2025-06-25T08:43:32.02+00:00

Hi @Amrit Khanna

Just checking in to see if you had a chance to review the solution provided by Megan Truong. If the information was helpful in addressing your question, please consider accepting the answer. This helps us and also improves searchability for others in the community who might be looking for similar information. If you have any further questions, do let us know.

Answer 1

Megan Truong 635 Reputation points Independent Advisor

Hello @Amrit Khanna

Thank you for posting in the Q&A forum. Regarding each of your questions:

Initial Bulk Synchronization / Onboarding: Rate limiting is currently only available for gallery apps that Microsoft has built and onboarded; there is no rate limiter for custom non-gallery apps. Each provisioning job runs independently, with no knowledge of the others. The interval between cycles is 40 minutes, though for very large sets of users or groups a cycle may take considerably longer. Reference: https://learn.microsoft.com/en-us/entra/identity/app-provisioning/application-provisioning-when-will-provisioning-finish-specific-user#how-long-will-it-take-to-provision-users
Bulk Deactivation / Deletion: The rate and duration likewise depend on the number of users that the administrator wants to deactivate or delete. One configured instance of provisioning on an AAD Enterprise App/custom non-gallery app corresponds to one provisioning job. If you have ten clients, each with one provisioning job configured, there will be ten provisioning jobs.
Client-Side Throttling and Retry Logic: If you enable automated provisioning for your SCIM application, the client will retry requests that fail with the error "429 Too Many Requests". It will suspend all provisioning activities for that particular client for the period indicated in the Retry-After header. Rather than resynchronizing the full group membership, the client retries only the failed requests.
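
As an illustration of the 429 and Retry-After handling discussed above, here is a minimal Python sketch of a per-tenant token bucket on the SCIM server side. It is not taken from Microsoft documentation: the names TokenBucket and check_request, and the rate and capacity values, are assumptions chosen for the example and would need tuning against real traffic.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Token bucket for one tenant; rate and capacity are illustrative."""
    rate: float = 25.0      # tokens refilled per second (assumed value)
    capacity: float = 50.0  # largest burst allowed (assumed value)
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self):
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def try_acquire(self, now=None):
        """Return 0 if the request may proceed, else whole seconds to wait."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0
        # Retry-After carries integer seconds, so round the wait up.
        return math.ceil((1.0 - self.tokens) / self.rate)


buckets = {}  # tenant id -> TokenBucket


def check_request(tenant_id, now=None):
    """Return (status, headers) to apply before handling a SCIM request."""
    bucket = buckets.setdefault(tenant_id, TokenBucket())
    wait = bucket.try_acquire(now)
    if wait == 0:
        return 200, {}
    return 429, {"Retry-After": str(wait)}
```

Because each tenant gets its own bucket, one customer's bulk assignment is throttled without slowing provisioning jobs that belong to other customers.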

Kindly let me know if this works for you, and please let me know if you have any further questions.

If I have answered your question, please accept this answer as a token of appreciation, and don't forget to give a thumbs up for "Was it helpful"!

Best regards,

Megan.

Megan Truong 635 Reputation points Independent Advisor

2025-06-20T01:29:02.9633333+00:00

Hello @Amrit Khanna

I hope everything is going well on your end.

Following up on the recent discussion regarding the inquiry, please let me know if there are any additional questions or concerns that require further assistance from us.

If the above answer was helpful and resolved your question, please click "Accept Answer" and select "Yes" for "Was this answer helpful".

Have a good day!

Megan Truong 635 Reputation points Independent Advisor

2025-06-23T02:06:52.03+00:00

Hello @Amrit Khanna

I hope you're having a great day so far!

Following up on the recent discussion regarding the question, please let me know if there are any additional questions or concerns that require further assistance from us.

If the above answer was helpful and resolved your question, please click "Accept Answer" and select "Yes" for "Was this answer helpful".

Have a good day!

Amrit Khanna 0 Reputation points

2025-06-27T11:08:51.2766667+00:00

Hi @Megan Truong,

Thank you for the detailed response.

I have one follow-up question to help us fine-tune our server-side architecture.

Regarding a bulk operation, for example, if an administrator assigns a group with 5,000 users to our application, or deactivates 5,000 existing users at once:

Should we expect the Azure SCIM client to send all 5,000 individual requests to our endpoint almost simultaneously in a massive burst? Or, does the Azure client have an internal mechanism that sends these requests in smaller, sequential batches (e.g., in batches of 100 or 200 at a time)?

Understanding whether the requests will arrive as one large spike or as a series of smaller, rapid batches is critical for us to correctly implement our throttling strategy.

Thank you for your clarification.
