Understanding SCIM Client Request Behavior and Expected Load for Large-Scale Operations

Amrit Khanna 0 Reputation points
2025-06-17T06:32:07.4666667+00:00

We are developing a SCIM 2.0 compliant service and are preparing to integrate with Microsoft Entra ID as an identity provider for our customers. To ensure we can build a robust and scalable solution, we need to understand the specific request patterns of the Entra ID SCIM client, particularly for scenarios involving a large number of users (e.g., thousands).

Our primary questions are about the expected request load for different SCIM operations:

1. Initial Bulk Synchronization / Onboarding:

  • When a customer onboards and assigns our SCIM application to a group in Entra ID containing several thousand users, at what rate will the Entra ID SCIM client send requests to our server?
  • How are these requests typically distributed across our SCIM endpoints? For example, what is the expected rate of POST /Users requests?
  • Does the client send requests sequentially or in parallel? If in parallel, what is the typical level of concurrency we should expect?

2. Bulk Deactivation / Deletion:

  • Similarly, if an administrator unassigns a large number of users from our application (or if users are deleted), what is the expected request rate for PATCH requests (to set active to false) or DELETE /Users/{id} requests?
  • How does Entra ID handle the deletion of a group with a large membership? Will it trigger individual requests for each user, and if so, at what rate?

3. Client-Side Throttling and Retry Logic:

  • If our server responds with a 429 Too Many Requests status code and includes a Retry-After header, what is the precise behavior of the Entra ID SCIM client?
  • Will it pause all provisioning activities for the duration specified in the Retry-After header for that specific customer?
  • In the case of throttling during a large group membership update, will the client retry only the failed requests, or will it re-synchronize the entire group membership?
Microsoft Security | Microsoft Entra | Microsoft Entra ID

2 answers

  1. Shubham Kumar 0 Reputation points
    2025-11-25T07:11:43.65+00:00

    Hi @Megan Truong,

    I have a follow-up question regarding the batching behavior you described:

    Batching Behavior
    - Requests are sent in small, rapid batches, not all at once.
    - Typical batch sizes are around 40–200 users per batch, depending on tenant configuration and system load.

    You mentioned that the batch size depends on "tenant configuration". Could you please elaborate on this? Is this a setting that we, as the application developer, or the customer can configure? If so, what exactly can be configured?

    Our goal is to minimize the initial onboarding time for our customers, especially for large-scale deployments. If we could influence the batch size to be on the higher end of the 40-200 range (e.g., closer to 200), it would significantly improve the user experience.

    Any guidance or documentation you could point us to on this topic would be greatly appreciated.


  2. Megan Truong 800 Reputation points
    2025-06-18T08:46:30.0866667+00:00

    Hello @Amrit Khanna

    Thank you for posting on the Q&A forum. To address each of your points:

    1. Initial Bulk Synchronization / Onboarding: Rate limiting is currently available only for gallery apps that Microsoft has built and onboarded; there is no rate limiter for custom non-gallery apps. Each provisioning job runs independently, with no knowledge of the others. The interval between cycles is 40 minutes, although for very large sets of users or groups a cycle may take considerably longer. Reference: https://learn.microsoft.com/en-us/entra/identity/app-provisioning/application-provisioning-when-will-provisioning-finish-specific-user#how-long-will-it-take-to-provision-users
    2. Bulk Deactivation / Deletion: The rate and duration likewise depend on the number of users that the administrator wants to deactivate or delete. Each configured provisioning instance on an AAD enterprise app or custom non-gallery app corresponds to one provisioning job. If you have ten clients, each with one provisioning job configured, there will be ten provisioning jobs.
    3. Client-Side Throttling and Retry Logic: If you enable automatic provisioning for your SCIM application, the provisioning service retries requests that fail with "429 Too Many Requests". It suspends all provisioning activity for that particular client for the period indicated in the Retry-After header. The client then retries only the failed requests rather than resynchronizing the full group membership.
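
Since there is no rate limiter on the Entra ID side for non-gallery apps, the SCIM server has to protect itself. The sketch below is one possible way to do that in Python: a per-tenant token bucket that decides when to answer with 429 and what Retry-After value to send. The names `TokenBucket` and `throttle`, and the rate and capacity values, are illustrative assumptions, not part of Entra ID or the SCIM specification.

```python
import math
import time


class TokenBucket:
    """Hypothetical per-tenant limiter: `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def try_acquire(self):
        """Return 0 if the request may proceed, else whole seconds to wait."""
        now = self.clock()
        # Refill according to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 0
        # Time until one full token is available, rounded up to whole seconds.
        return math.ceil((1 - self.tokens) / self.rate)


def throttle(bucket):
    """Return (status, headers) for an incoming SCIM request.

    A 429 response carries Retry-After in whole seconds.
    """
    wait = bucket.try_acquire()
    if wait == 0:
        return 200, {}
    return 429, {"Retry-After": str(wait)}
```

In a real service you would keep one bucket per tenant (for example, keyed by the bearer token or tenant ID) and call `throttle` before handling each POST, PATCH, or DELETE on /Users. The limits you choose are up to you and should be sized from your own load tests.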

    Please let me know if this works for you or if you have any further questions.

    If I have answered your question, please accept this answer as a token of appreciation and don't forget to give a thumbs up for "Was it helpful"!

    Best regards,

    Megan.
