Rate and usage limits

Azure DevOps Services

Azure DevOps Services uses multi-tenancy to reduce costs and improve performance. This design leaves users vulnerable to performance issues and even outages when other users of their shared resources have spikes in consumption. For this reason, Azure DevOps limits the resources individuals can consume and the number of requests they can make to certain commands. When these limits are exceeded, subsequent requests might be either delayed or blocked.

For more information, see Git limits and Best practices to avoid hitting rate limits.

Global consumption limit

Azure DevOps currently has a global consumption limit, which delays requests from individual users beyond a threshold when shared resources are in danger of being overwhelmed. The sole purpose of this limit is to avoid outages. Individual users typically get delayed requests only when one of the following conditions occurs:

  • One of their shared resources is at risk of being overwhelmed
  • Their personal usage exceeds 200 times the consumption of a typical user within a (sliding) five-minute window

The amount of the delay depends on the user's sustained level of consumption. Delays range from a few milliseconds per request up to 30 seconds. Once consumption goes to zero or the resource is no longer overwhelmed, the delays stop within five minutes. If consumption remains high, delays might continue indefinitely to protect the resource.

When a user's request gets delayed by a significant amount, that user receives an email and sees a warning banner in the web portal. For the build service account and other identities without an email address, members of the Project Collection Administrators group get the email. For more information, see Usage monitoring.

When an individual user's requests get blocked, the service returns responses with HTTP status code 429 (Too Many Requests) and a message similar to the following:

TF400733: The request has been canceled: Request was blocked due to exceeding usage of resource <resource name> in namespace <namespace ID>.
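A client can recognize a blocked request from the status code and, where available, pull the resource and namespace out of the message for logging. The following is a minimal sketch, assuming the message body matches the sample above; the text says only that the message is "similar", so the pattern and the function name `parse_blocked_message` are illustrative assumptions, not part of any Azure DevOps API.

```python
import re

# Assumed pattern, modeled on the sample TF400733 message shown above.
# The real wording might differ, so treat a non-match as "details unknown".
BLOCKED_PATTERN = re.compile(
    r"TF400733: .*exceeding usage of resource (?P<resource>.+?)"
    r" in namespace (?P<namespace>.+?)\.?$"
)


def parse_blocked_message(status_code, body):
    """Return (resource, namespace) for a blocked response, else None.

    A 429 status is what marks the request as blocked; the message text
    is used only to extract details for a human-readable log entry.
    """
    if status_code != 429:
        return None
    match = BLOCKED_PATTERN.search(body.strip())
    if match is None:
        return ("unknown", "unknown")
    return (match.group("resource"), match.group("namespace"))
```

Because the check relies on the status code rather than the message text, a change in wording degrades only the logged details, not the detection itself.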

Azure DevOps throughput units (TSTUs)

Azure DevOps users consume many shared resources, and consumption depends on the following factors:

  • Uploading a large number of files to version control creates a large amount of load on databases and storage accounts
  • Complex work item tracking queries create database load based on the number of work items they search through
  • Builds drive load by downloading files from version control and producing log output
  • All operations consume CPU and memory on various parts of the service

To account for these different kinds of load, Azure DevOps resource consumption is expressed in abstract units called Azure DevOps throughput units, or TSTUs. TSTUs will eventually incorporate a blend of the following items:

  • Azure SQL Database DTUs as a measure of database consumption
  • Application tier and job agent CPU, memory, and I/O as a measure of compute consumption
  • Azure Storage bandwidth as a measure of storage consumption

For now, TSTUs are primarily focused on Azure SQL Database DTUs, since Azure SQL Databases are the shared resources most commonly overwhelmed by excessive consumption. A single TSTU is the average load we expect a single normal user of Azure DevOps to generate per five minutes. Normal users also generate spikes in load. These spikes are typically 10 or fewer TSTUs per five minutes. Less frequently, spikes go as high as 100 TSTUs.

The global consumption limit is 200 TSTUs within a sliding five-minute window.
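To make the idea of a sliding window concrete, the following sketch tracks consumption over the most recent five minutes and compares it to the 200 TSTU limit. It illustrates sliding-window accounting in general and is not how Azure DevOps measures TSTUs internally; the class name `SlidingWindowUsage` and the per-event TSTU costs are hypothetical.

```python
from collections import deque

WINDOW_SECONDS = 300  # five-minute window, from the text
LIMIT_TSTUS = 200     # global consumption limit, from the text


class SlidingWindowUsage:
    """Sum of consumption recorded during the last WINDOW_SECONDS.

    Timestamps are plain seconds and must be passed in non-decreasing order.
    """

    def __init__(self):
        self._events = deque()  # (timestamp_seconds, tstus), oldest first
        self._total = 0.0

    def _evict(self, now):
        # Drop events that have aged out of the window.
        while self._events and self._events[0][0] <= now - WINDOW_SECONDS:
            _, old_cost = self._events.popleft()
            self._total -= old_cost

    def record(self, now, tstus):
        self._events.append((now, tstus))
        self._total += tstus
        self._evict(now)

    def usage(self, now):
        self._evict(now)
        return self._total

    def over_limit(self, now):
        return self.usage(now) > LIMIT_TSTUS
```

For example, recording 150 TSTUs at second 0 and 60 TSTUs at second 100 puts usage at 210, which is over the limit; by second 300 the first event has aged out and usage falls back to 60.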

We recommend that you at least respond to the Retry-After header. If you detect a Retry-After header in any response, wait for the indicated time before you send another request. Doing so helps your client application experience fewer enforced delays. Keep in mind that a delayed request still returns a 200 response, so you don't need to apply retry logic to that request.
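The following sketch shows one way a client might honor Retry-After before sending its next request. It assumes the response headers are available as a plain mapping of names to string values and that the header carries a number of seconds, as this article describes; the function names are hypothetical.

```python
import time


def retry_after_seconds(headers):
    """Return the wait time in seconds, or 0.0 if there is no usable header."""
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            try:
                return max(0.0, float(value))
            except (TypeError, ValueError):
                return 0.0  # not a number of seconds; ignore it in this sketch
    return 0.0


def wait_if_requested(headers, sleep=time.sleep):
    """Pause before the next request if the last response asked for it.

    The response itself succeeded, so the caller does not resend it; the
    pause applies only to whatever request comes next.
    """
    delay = retry_after_seconds(headers)
    if delay > 0:
        sleep(delay)
    return delay
```

Calling `wait_if_requested(response_headers)` after every response keeps the logic in one place. The `sleep` parameter exists only so the pause can be replaced in tests.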

If possible, we further recommend that you monitor X-RateLimit-Remaining and X-RateLimit-Limit headers. Doing so allows you to approximate how quickly you're approaching the delay threshold. Your client can intelligently react and spread out its requests over time.
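One way to spread requests out is to add a small pause that grows as the remaining budget shrinks. The following sketch assumes the two headers are present in a plain dictionary under exactly these names; the thresholds `start_fraction` and `max_delay` are arbitrary illustrative choices, not values recommended by Azure DevOps.

```python
def pacing_delay(headers, max_delay=5.0, start_fraction=0.5):
    """Suggest a pause in seconds before the next request.

    Returns 0.0 while more than start_fraction of the budget remains, then
    scales linearly up to max_delay as X-RateLimit-Remaining approaches 0.
    """
    try:
        limit = float(headers["X-RateLimit-Limit"])
        remaining = float(headers["X-RateLimit-Remaining"])
    except (KeyError, TypeError, ValueError):
        return 0.0  # headers missing or malformed: no information, no pause
    if limit <= 0:
        return 0.0
    fraction_left = max(0.0, min(1.0, remaining / limit))
    if fraction_left >= start_fraction:
        return 0.0
    return max_delay * (1.0 - fraction_left / start_fraction)
```

With a limit of 200, a client that has 150 TSTUs remaining doesn't pause at all, one with 50 remaining pauses for 2.5 seconds, and one with 0 remaining pauses for the full 5 seconds.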

Note

Identities that are used by tools and applications to integrate with Azure DevOps might occasionally need rate and usage limits higher than the allowed consumption limit. You can get higher limits by assigning the Basic + Test Plans access level to the identities used by your application. Once the need for higher rate limits is fulfilled, you can go back to the access level that the identity used to have. You're charged for the Basic + Test Plans access level only for the time it's assigned to the identity.

Identities that are already assigned a Visual Studio Enterprise subscription can't be assigned the Basic + Test Plans access level until that subscription is removed.

Pipelines

Rate limiting is similar for Azure Pipelines. Each pipeline gets treated as an individual entity with its own resource consumption tracked. Even if build agents are self-hosted, they generate load in the form of cloning and sending logs.

We apply a 200 TSTU limit to each individual pipeline in a sliding five-minute window. This limit is the same as the global consumption limit for users. If a pipeline gets delayed or blocked by rate limiting, a message appears in the attached logs.

API client experience

When requests get delayed or blocked, Azure DevOps returns response headers to help API clients react. While not fully standardized, these headers are broadly in line with other popular services.

The following table lists the headers available and what they mean. Except for X-RateLimit-Delay, all of these headers get sent before requests start getting delayed. This design gives clients the opportunity to proactively slow down their rate of requests.

| Header name | Description |
|---|---|
| Retry-After | The RFC 6585-specified header sent to tell you how long to wait before you send your next request to fall under the detection threshold. Units: seconds. |
| X-RateLimit-Resource | A custom header indicating the service and type of threshold that was reached. Threshold types and service names might vary over time and without warning. We recommend displaying this string to a human, but not relying on it for computation. |
| X-RateLimit-Delay | How long the request was delayed. Units: seconds with up to three decimal places (milliseconds). |
| X-RateLimit-Limit | Total number of TSTUs allowed before delays are imposed. |
| X-RateLimit-Remaining | Number of TSTUs remaining before being delayed. If requests are already being delayed or blocked, it's 0. |
| X-RateLimit-Reset | Time at which, if all resource consumption stopped immediately, tracked usage would return to 0 TSTUs. Expressed in Unix epoch time. |
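The headers described above can be collected into a single structure so the rest of a client doesn't deal with raw strings. The following is a sketch, assuming headers arrive as a mapping of names to string values and that X-RateLimit-Reset is expressed in whole seconds since the Unix epoch; the `RateLimitInfo` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    retry_after: Optional[float]  # seconds to wait before the next request
    resource: Optional[str]       # display only; don't compute on it
    delay: Optional[float]        # seconds this request was delayed
    limit: Optional[float]        # TSTUs allowed before delays
    remaining: Optional[float]    # TSTUs left before delays
    reset: Optional[datetime]     # when tracked usage would return to 0


def _number(value):
    """Convert a header value to float, or None if absent or malformed."""
    try:
        return float(value) if value is not None else None
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    h = {name.lower(): value for name, value in headers.items()}
    reset_epoch = _number(h.get("x-ratelimit-reset"))
    return RateLimitInfo(
        retry_after=_number(h.get("retry-after")),
        resource=h.get("x-ratelimit-resource"),
        delay=_number(h.get("x-ratelimit-delay")),
        limit=_number(h.get("x-ratelimit-limit")),
        remaining=_number(h.get("x-ratelimit-remaining")),
        # Assumption: the epoch value is in seconds, interpreted as UTC.
        reset=(datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
               if reset_epoch is not None else None),
    )
```

Every field is optional because most headers appear only as a client nears or passes the threshold, so a response without them yields a `RateLimitInfo` in which every field is `None`.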


Work tracking, process, & project limits

Azure DevOps imposes limits for the number of projects you can have in an organization and the number of teams you can have within each project. Also be aware of limits for work items, queries, backlogs, boards, dashboards, and more. For more information, see Work tracking, process, and project limits.

Wiki

In addition to the usual repository limits, wikis defined for a project are limited to 25 MB per single file.

Service connections

There are no per-project limits on creating service connections. However, Microsoft Entra ID might impose its own limits. For additional information, review the following articles: