Understand how Azure Resource Manager throttles requests

This article describes how Azure Resource Manager throttles requests. It shows you how to track the number of requests that remain before reaching the limit, and how to respond when you reach the limit.

Throttling happens at two levels. Azure Resource Manager throttles requests for the subscription and tenant. If the request is under the throttling limits for the subscription and tenant, Resource Manager routes the request to the resource provider. The resource provider applies throttling limits that are tailored to its operations.

The following image shows how throttling is applied as a request goes from the user to Azure Resource Manager and the resource provider. Requests are initially throttled per principal ID and per Azure Resource Manager instance in the region of the user sending the request. The requests are throttled per hour. When the request is forwarded to the resource provider, requests are throttled per region of the resource rather than per Azure Resource Manager instance in the region of the user. The resource provider requests are also throttled per principal ID and per hour.

Diagram that shows how throttling is applied as a request goes from the user to Azure Resource Manager and the resource provider.

Subscription and tenant limits

Every subscription-level and tenant-level operation is subject to throttling limits. Subscription requests are ones that involve passing your subscription ID, such as retrieving the resource groups in your subscription. For example, sending a request to https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups?api-version=2022-01-01 is a subscription-level operation. Tenant requests don't include your subscription ID, such as retrieving valid Azure locations. For example, sending a request to https://management.azure.com/tenants?api-version=2022-01-01 is a tenant-level operation.
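As a rough illustration of the distinction above, the following Python sketch classifies a Resource Manager URL by scope. The function name request_scope is hypothetical, and the rule it applies (a path that starts with /subscriptions/{subscriptionId} is subscription-level, anything else is tenant-level) is a simplification of the description in this section, not an official algorithm.

```python
from urllib.parse import urlparse


def request_scope(url: str) -> str:
    """Classify a Resource Manager URL as 'subscription' or 'tenant' level.

    Assumption: a request is subscription-level when its path starts with
    /subscriptions/{subscriptionId}; every other request is treated as
    tenant-level.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) >= 2 and segments[0].lower() == "subscriptions":
        return "subscription"
    return "tenant"


print(request_scope(
    "https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups?api-version=2022-01-01"
))  # subscription
print(request_scope("https://management.azure.com/tenants?api-version=2022-01-01"))  # tenant
```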

The default throttling limits per hour are shown in the following table.

Scope          Operations   Limit (per hour)
Subscription   reads        12000
Subscription   deletes      15000
Subscription   writes       1200
Tenant         reads        12000
Tenant         writes       1200

These limits are scoped to the security principal (user or application) making the requests and the subscription ID or tenant ID. If your requests come from more than one security principal, your limit across the subscription or tenant is greater than 12,000 and 1,200 per hour.

These limits apply to each Azure Resource Manager instance. There are multiple instances in every Azure region, and Azure Resource Manager is deployed to all Azure regions. Because the requests from a user are usually handled by different instances of Azure Resource Manager, the effective limits are, in practice, higher than the values in the table.

The remaining requests are returned in the response header values.

Migrating to regional throttling and token bucket algorithm

Starting in 2024, Microsoft is migrating Azure subscriptions to a new throttling architecture. With this change, you'll experience new throttling limits. The new throttling limits are applied per region rather than per instance of Azure Resource Manager. The new architecture uses a token bucket algorithm to manage API throttling.

The token bucket represents the maximum number of requests that you can send for each second. When you reach the maximum number of requests, the refill rate determines how quickly tokens become available in the bucket.

These updated limits make it easier for you to refresh and manage your quota.

The new limits are:

Scope          Operations   Bucket size   Refill rate per sec
Subscription   reads        250           25
Subscription   deletes      200           10
Subscription   writes       200           10
Tenant         reads        250           25
Tenant         deletes      200           10
Tenant         writes       200           10

The subscription limits apply per subscription, per service principal, and per operation type. There are also global subscription limits that are equivalent to 15 times the individual service principal limits for each operation type. The global limits apply across all service principals. Requests will be throttled if the global, service principal, or tenant specific limits are exceeded.

The limits may be smaller for free or trial customers.

For example, suppose you have a bucket size of 250 tokens for read requests and refill rate of 25 tokens per second. If you send 250 read requests in a second, the bucket is empty and your requests are throttled. Each second, 25 tokens become available until the bucket reaches its maximum capacity of 250 tokens. You can use tokens as they become available.
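The example above can be simulated with a few lines of Python. This is a minimal sketch of a generic token bucket, not Azure Resource Manager's actual implementation. The class TokenBucket and its methods are hypothetical names, and time is advanced manually so that the arithmetic is easy to follow.

```python
class TokenBucket:
    """Minimal token bucket: at most `capacity` tokens, refilled at `refill_rate` per second."""

    def __init__(self, capacity: float, refill_rate: float) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # the bucket starts full

    def advance(self, seconds: float) -> None:
        """Add the tokens earned over `seconds`, never exceeding the capacity."""
        self.tokens = min(self.capacity, self.tokens + seconds * self.refill_rate)

    def try_consume(self, count: int = 1) -> bool:
        """Spend `count` tokens if available; False means the request would be throttled."""
        if self.tokens >= count:
            self.tokens -= count
            return True
        return False


bucket = TokenBucket(capacity=250, refill_rate=25)

# Send 250 read requests within the same second: all are accepted and the bucket empties.
accepted = sum(bucket.try_consume() for _ in range(250))
print(accepted, bucket.tokens)  # 250 0
print(bucket.try_consume())     # False: the next request is throttled

bucket.advance(1)               # one second later, 25 tokens are available again
print(bucket.tokens)            # 25
bucket.advance(60)              # refilling stops at the bucket size
print(bucket.tokens)            # 250
```

The burst size is bounded by the bucket size, while the sustained request rate is bounded by the refill rate.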

How do I know if my subscription uses the new throttling experience?

After your subscription is migrated to the new throttling experience, the response header shows the remaining requests per minute instead of per hour. Also, your Retry-After value shows one minute or less, instead of five minutes. For more information, see Error code.

Why is throttling changing to per region rather than per instance?

Since different regions have a different number of Resource Manager instances, throttling per instance causes inconsistent throttling performance. Throttling per region makes throttling consistent and predictable.

How does the new throttling experience affect my limits?

You can send more requests. Write requests increase by 30 times. Delete requests increase by 2.4 times. Read requests increase by 7.5 times.
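These multipliers appear to follow from comparing the sustained refill rate over one hour with the earlier hourly limit for subscription-level requests. That reading is an inference from the two tables in this article rather than something the article states, and the following Python sketch only checks the arithmetic. The name increase_factor is hypothetical.

```python
# Earlier hourly limits and new refill rates (tokens per second) for
# subscription-level requests, copied from the tables in this article.
legacy_per_hour = {"reads": 12000, "deletes": 15000, "writes": 1200}
refill_per_second = {"reads": 25, "deletes": 10, "writes": 10}

SECONDS_PER_HOUR = 3600


def increase_factor(operation: str) -> float:
    """Sustained requests per hour under the refill rate, divided by the earlier hourly limit."""
    return refill_per_second[operation] * SECONDS_PER_HOUR / legacy_per_hour[operation]


for operation in legacy_per_hour:
    sustained = refill_per_second[operation] * SECONDS_PER_HOUR
    print(f"{operation}: {sustained} per hour, {increase_factor(operation):g}x the earlier limit")
# reads: 90000 per hour, 7.5x the earlier limit
# deletes: 36000 per hour, 2.4x the earlier limit
# writes: 36000 per hour, 30x the earlier limit
```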

Can I prevent my subscription from migrating to the new throttling experience?

No, all subscriptions will eventually be migrated.

Resource provider limits

Resource providers apply their own throttling limits. Within each subscription, the resource provider throttles per region of the resource in the request. Because Resource Manager throttles by instance of Resource Manager, and there are several instances of Resource Manager in each region, the resource provider might receive more requests than the default limits in the previous section.

This section discusses the throttling limits of some widely used resource providers.

Storage throttling

The following limits apply only when you perform management operations by using Azure Resource Manager with Azure Storage. The limits apply per region of the resource in the request.

Resource                                        Limit
Storage account management operations (read)    800 per 5 minutes
Storage account management operations (write)   10 per second / 1200 per hour
Storage account management operations (list)    100 per 5 minutes

Network throttling

The Microsoft.Network resource provider applies the following throttle limits:

Operation              Limit
write / delete (PUT)   1000 per 5 minutes
read (GET)             10000 per 5 minutes

In addition to those general limits, see the usage limits for Azure DNS.

Compute throttling

Microsoft Compute implements throttling to provide an optimal experience for Virtual Machine and Virtual Machine Scale Set users. Compute Throttling Limits provides comprehensive information on throttling policies and limits for VM, Virtual Machine Scale Sets and Scale Set VMs.

Azure Resource Graph throttling

Azure Resource Graph limits the number of requests to its operations. The steps in this article to determine the remaining requests and how to respond when the limit is reached also apply to Resource Graph. However, Resource Graph sets its own limit and reset rate. For more information, see Resource Graph throttling headers.

Other resource providers

For information about throttling in other resource providers, see:

Error code

When you reach the limit, you receive the HTTP status code 429 Too Many Requests. The response includes a Retry-After value, which specifies the number of seconds your application should wait (or sleep) before sending the next request. If you send a request before the retry interval elapses, your request isn't processed and a new Retry-After value is returned.

If you're using an Azure SDK, the SDK may have an auto retry configuration. For more information, see Retry guidance for Azure services.

Some resource providers return 429 to report a temporary problem. The problem could be an overload condition that isn't directly caused by your request. Or, it could represent a temporary error about the state of the target resource or dependent resource. For example, the network resource provider returns 429 with the RetryableErrorDueToAnotherOperation error code when the target resource is locked by another operation. To determine if the error comes from throttling or a temporary condition, view the error details in the response.
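If you aren't relying on an SDK's retry policy, the behavior described in this section could be handled along the following lines. This is a minimal Python sketch under several assumptions: send_with_retry and error_code are hypothetical helpers, the send callable stands in for whatever HTTP client you use and returns a (status code, headers, body) tuple, Retry-After is a plain number of seconds, and the error body is JSON with the error code under error.code.

```python
import json
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status code, headers, body text)


def error_code(body: str) -> str:
    """Return error.code from a JSON error body, or '' when it can't be read."""
    try:
        return json.loads(body).get("error", {}).get("code", "")
    except (ValueError, AttributeError, TypeError):
        return ""


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    default_wait: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns something other than 429 or the attempts run out."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, body = response
        if status != 429:
            break
        # Retry-After is assumed to be the number of seconds to wait.
        wait = float(headers.get("Retry-After", default_wait))
        # Log the error code so throttling can be told apart from a temporary condition.
        print(f"429 received (error code: {error_code(body) or 'unknown'}), waiting {wait:g}s")
        sleep(wait)
        response = send()
    return response


# Usage with a stand-in transport that returns 429 once and then succeeds.
replies = iter([
    (429, {"Retry-After": "2"}, '{"error": {"code": "RetryableErrorDueToAnotherOperation"}}'),
    (200, {}, "{}"),
])
status, _, _ = send_with_retry(lambda: next(replies), sleep=lambda seconds: None)
print(status)  # 200
```

Passing sleep as a parameter keeps the sketch testable without real delays. A plain dict is case-sensitive, so a real client should supply a case-insensitive header mapping.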

Remaining requests

You can determine the number of remaining requests by examining response headers. Read requests return a value in the header for the number of remaining read requests. Write requests include a value for the number of remaining write requests. The following table describes the response headers you can examine for those values:

Response header                                               Description
x-ms-ratelimit-remaining-subscription-deletes                 Subscription scoped deletes remaining. This value is returned on delete operations.
x-ms-ratelimit-remaining-subscription-reads                   Subscription scoped reads remaining. This value is returned on read operations.
x-ms-ratelimit-remaining-subscription-writes                  Subscription scoped writes remaining. This value is returned on write operations.
x-ms-ratelimit-remaining-tenant-reads                         Tenant scoped reads remaining.
x-ms-ratelimit-remaining-tenant-writes                        Tenant scoped writes remaining.
x-ms-ratelimit-remaining-subscription-resource-requests       Subscription scoped resource type requests remaining. This header value is only returned if a service overrides the default limit. Resource Manager adds this value instead of the subscription reads or writes.
x-ms-ratelimit-remaining-subscription-resource-entities-read  Subscription scoped resource type collection requests remaining. This header value is only returned if a service overrides the default limit. This value provides the number of remaining collection requests (list resources).
x-ms-ratelimit-remaining-tenant-resource-requests             Tenant scoped resource type requests remaining. This header is only added for requests at tenant level, and only if a service overrides the default limit. Resource Manager adds this value instead of the tenant reads or writes.
x-ms-ratelimit-remaining-tenant-resource-entities-read        Tenant scoped resource type collection requests remaining. This header is only added for requests at tenant level, and only if a service overrides the default limit.

The resource provider can also return response headers with information about remaining requests. For information about response headers returned by the Compute resource provider, see Call rate informational response headers.
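To work with these headers programmatically, you could collect every header that starts with x-ms-ratelimit-remaining- into one dictionary. The following Python sketch assumes the headers are already available as a mapping of names to string values. remaining_requests is a hypothetical helper name. Values that aren't plain integers, which some resource provider headers may use, are skipped.

```python
from typing import Dict, Mapping

PREFIX = "x-ms-ratelimit-remaining-"


def remaining_requests(headers: Mapping[str, str]) -> Dict[str, int]:
    """Map each remaining-request header (prefix stripped) to its integer value."""
    remaining = {}
    for name, value in headers.items():
        lowered = name.lower()  # HTTP header names are case-insensitive
        if lowered.startswith(PREFIX):
            try:
                remaining[lowered[len(PREFIX):]] = int(value)
            except ValueError:
                continue  # skip values that aren't plain integers
    return remaining


sample = {
    "Pragma": "no-cache",
    "x-ms-ratelimit-remaining-subscription-reads": "11999",
}
print(remaining_requests(sample))  # {'subscription-reads': 11999}
```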

Retrieving the header values

Retrieving these header values in your code or script is no different from retrieving any other header value.

For example, in C#, you retrieve the header value from an HttpWebResponse object named response with the following code:

response.Headers.GetValues("x-ms-ratelimit-remaining-subscription-reads").GetValue(0)

In PowerShell, you retrieve the header value from an Invoke-WebRequest operation.

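# Send an authenticated GET request; $authHeaders is assumed to hold the Authorization header.
$r = Invoke-WebRequest -Uri "https://management.azure.com/subscriptions/{guid}/resourcegroups?api-version=2016-09-01" -Method GET -Headers $authHeaders
# Read the number of remaining subscription-scoped reads from the response headers.
$r.Headers["x-ms-ratelimit-remaining-subscription-reads"]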

For a complete PowerShell example, see Check Resource Manager Limits for a Subscription.

If you want to see the remaining requests for debugging, you can provide the -Debug parameter on your PowerShell cmdlet.

Get-AzResourceGroup -Debug

The command returns many values, including the following response value:

DEBUG: ============================ HTTP RESPONSE ============================

Status Code:
OK

Headers:
Pragma                        : no-cache
x-ms-ratelimit-remaining-subscription-reads: 11999

To get write limits, use a write operation:

New-AzResourceGroup -Name myresourcegroup -Location westus -Debug

The command returns many values, including the following values:

DEBUG: ============================ HTTP RESPONSE ============================

Status Code:
Created

Headers:
Pragma                        : no-cache
x-ms-ratelimit-remaining-subscription-writes: 1199

In Azure CLI, you retrieve the header value by adding the --verbose and --debug options to a command.

az group list --verbose --debug

The command returns many values, including the following values:

msrest.http_logger : Response status: 200
msrest.http_logger : Response headers:
msrest.http_logger :     'Cache-Control': 'no-cache'
msrest.http_logger :     'Pragma': 'no-cache'
msrest.http_logger :     'Content-Type': 'application/json; charset=utf-8'
msrest.http_logger :     'Content-Encoding': 'gzip'
msrest.http_logger :     'Expires': '-1'
msrest.http_logger :     'Vary': 'Accept-Encoding'
msrest.http_logger :     'x-ms-ratelimit-remaining-subscription-reads': '11998'

To get write limits, use a write operation:

az group create -n myresourcegroup --location westus --verbose --debug

The command returns many values, including the following values:

msrest.http_logger : Response status: 201
msrest.http_logger : Response headers:
msrest.http_logger :     'Cache-Control': 'no-cache'
msrest.http_logger :     'Pragma': 'no-cache'
msrest.http_logger :     'Content-Length': '163'
msrest.http_logger :     'Content-Type': 'application/json; charset=utf-8'
msrest.http_logger :     'Expires': '-1'
msrest.http_logger :     'x-ms-ratelimit-remaining-subscription-writes': '1199'
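If you capture debug output like the samples above as text, you could extract the remaining-request values with a regular expression. The following Python sketch assumes the log lines have the shape shown above, which may differ between Azure CLI versions. remaining_from_debug_log is a hypothetical helper name.

```python
import re
from typing import Dict

# Matches lines such as: 'x-ms-ratelimit-remaining-subscription-writes': '1199'
HEADER_LINE = re.compile(r"'(x-ms-ratelimit-remaining-[\w-]+)':\s*'(\d+)'")


def remaining_from_debug_log(log_text: str) -> Dict[str, int]:
    """Extract remaining-request headers from captured debug output."""
    return {name: int(value) for name, value in HEADER_LINE.findall(log_text)}


log_text = """\
msrest.http_logger : Response status: 201
msrest.http_logger :     'Pragma': 'no-cache'
msrest.http_logger :     'x-ms-ratelimit-remaining-subscription-writes': '1199'
"""
print(remaining_from_debug_log(log_text))
# {'x-ms-ratelimit-remaining-subscription-writes': 1199}
```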

Next steps