Rate limit an HTTP handler in .NET
In this article, you'll learn how to create a client-side HTTP handler that rate limits the number of requests it sends. You'll see an HttpClient that accesses the "www.example.com"
resource. Resources are consumed by apps that rely on them, and when an app makes too many requests for a single resource, it can lead to resource contention. Resource contention occurs when a resource is consumed by too many apps, and the resource is unable to serve all of the apps that are requesting it. This can result in a poor user experience, and in some cases, it can even lead to a denial of service (DoS) attack. For more information on DoS, see OWASP: Denial of Service.
What is rate limiting?
Rate limiting is the concept of limiting how much a resource can be accessed. For example, you may know that a database your app accesses can safely handle 1,000 requests per minute, but it may not handle much more than that. You can put a rate limiter in your app that only allows 1,000 requests every minute and rejects any more requests before they can access the database. Thus, rate limiting your database and allowing your app to handle a safe number of requests. This is a common pattern in distributed systems, where you may have multiple instances of an app running, and you want to ensure that they don't all try to access the database at the same time. There are multiple different rate-limiting algorithms to control the flow of requests.
To use rate limiting in .NET, you'll reference the System.Threading.RateLimiting NuGet package.
Implement a DelegatingHandler
subclass
To control the flow of requests, you implement a custom DelegatingHandler subclass. This is a type of HttpMessageHandler that allows you to intercept and handle requests before they're sent to the server. You can also intercept and handle responses before they're returned to the caller. In this example, you'll implement a custom DelegatingHandler
subclass that limits the number of requests that can be sent to a single resource. Consider the following custom ClientSideRateLimitedHandler
class:
internal sealed class ClientSideRateLimitedHandler(
RateLimiter limiter)
: DelegatingHandler(new HttpClientHandler()), IAsyncDisposable
{
protected override async Task<HttpResponseMessage> SendAsync(
HttpRequestMessage request, CancellationToken cancellationToken)
{
using RateLimitLease lease = await limiter.AcquireAsync(
permitCount: 1, cancellationToken);
if (lease.IsAcquired)
{
return await base.SendAsync(request, cancellationToken);
}
var response = new HttpResponseMessage(HttpStatusCode.TooManyRequests);
if (lease.TryGetMetadata(
MetadataName.RetryAfter, out TimeSpan retryAfter))
{
response.Headers.Add(
"Retry-After",
((int)retryAfter.TotalSeconds).ToString(
NumberFormatInfo.InvariantInfo));
}
return response;
}
async ValueTask IAsyncDisposable.DisposeAsync()
{
await limiter.DisposeAsync().ConfigureAwait(false);
Dispose(disposing: false);
GC.SuppressFinalize(this);
}
protected override void Dispose(bool disposing)
{
base.Dispose(disposing);
if (disposing)
{
limiter.Dispose();
}
}
}
The preceding C# code:
- Inherits the
DelegatingHandler
type. - Implements the IAsyncDisposable interface.
- Defines a
RateLimiter
field that is assigned from the constructor. - Overrides the
SendAsync
method to intercept and handle requests before they're sent to the server. - Overrides the DisposeAsync() method to dispose of the
RateLimiter
instance.
Looking a bit closer at the SendAsync
method, you'll see that it:
- Relies on the
RateLimiter
instance to acquire aRateLimitLease
from theAcquireAsync
. - When the
lease.IsAcquired
property istrue
, the request is sent to the server. - Otherwise, an HttpResponseMessage is returned with a
429
status code, and if thelease
contains aRetryAfter
value, theRetry-After
header is set to that value.
Emulate many concurrent requests
To put this custom DelegatingHandler
subclass to the test, you'll create a console app that emulates many concurrent requests. This Program
class creates an HttpClient with the custom ClientSideRateLimitedHandler
:
var options = new TokenBucketRateLimiterOptions
{
TokenLimit = 8,
QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
QueueLimit = 3,
ReplenishmentPeriod = TimeSpan.FromMilliseconds(1),
TokensPerPeriod = 2,
AutoReplenishment = true
};
// Create an HTTP client with the client-side rate limited handler.
using HttpClient client = new(
handler: new ClientSideRateLimitedHandler(
limiter: new TokenBucketRateLimiter(options)));
// Create 100 urls with a unique query string.
var oneHundredUrls = Enumerable.Range(0, 100).Select(
i => $"https://example.com?iteration={i:0#}");
// Flood the HTTP client with requests.
var floodOneThroughFortyNineTask = Parallel.ForEachAsync(
source: oneHundredUrls.Take(0..49),
body: (url, cancellationToken) => GetAsync(client, url, cancellationToken));
var floodFiftyThroughOneHundredTask = Parallel.ForEachAsync(
source: oneHundredUrls.Take(^50..),
body: (url, cancellationToken) => GetAsync(client, url, cancellationToken));
await Task.WhenAll(
floodOneThroughFortyNineTask,
floodFiftyThroughOneHundredTask);
static async ValueTask GetAsync(
HttpClient client, string url, CancellationToken cancellationToken)
{
using var response =
await client.GetAsync(url, cancellationToken);
Console.WriteLine(
$"URL: {url}, HTTP status code: {response.StatusCode} ({(int)response.StatusCode})");
}
In the preceding console app:
- The
TokenBucketRateLimiterOptions
are configured with a token limit of8
, and queue processing order ofOldestFirst
, a queue limit of3
, and replenishment period of1
millisecond, a tokens per period value of2
, and an auto-replenish value oftrue
. - An
HttpClient
is created with theClientSideRateLimitedHandler
that is configured with theTokenBucketRateLimiter
. - To emulate 100 requests, Enumerable.Range creates 100 URLs, each with a unique query string parameter.
- Two Task objects are assigned from the Parallel.ForEachAsync method, splitting the URLs into two groups.
- The
HttpClient
is used to send aGET
request to each URL, and the response is written to the console. - Task.WhenAll waits for both tasks to complete.
Since the HttpClient
is configured with the ClientSideRateLimitedHandler
, not all requests will make it to the server resource. You can test this assertion by running the console app. You'll see that only a fraction of the total number of requests are sent to the server, and the rest are rejected with an HTTP status code of 429
. Try altering the options
object used to create the TokenBucketRateLimiter
to see how the number of requests that are sent to the server changes.
Consider the following example output:
URL: https://example.com?iteration=06, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=60, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=55, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=59, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=57, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=11, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=63, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=13, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=62, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=65, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=64, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=67, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=14, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=68, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=16, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=69, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=70, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=71, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=17, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=18, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=72, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=73, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=74, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=19, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=75, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=76, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=79, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=77, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=21, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=78, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=81, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=22, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=80, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=20, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=82, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=83, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=23, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=84, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=24, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=85, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=86, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=25, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=87, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=26, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=88, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=89, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=27, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=90, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=28, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=91, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=94, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=29, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=93, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=96, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=92, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=95, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=31, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=30, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=97, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=98, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=99, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=32, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=33, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=34, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=35, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=36, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=37, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=38, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=39, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=40, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=41, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=42, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=43, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=44, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=45, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=46, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=47, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=48, HTTP status code: TooManyRequests (429)
URL: https://example.com?iteration=15, HTTP status code: OK (200)
URL: https://example.com?iteration=04, HTTP status code: OK (200)
URL: https://example.com?iteration=54, HTTP status code: OK (200)
URL: https://example.com?iteration=08, HTTP status code: OK (200)
URL: https://example.com?iteration=00, HTTP status code: OK (200)
URL: https://example.com?iteration=51, HTTP status code: OK (200)
URL: https://example.com?iteration=10, HTTP status code: OK (200)
URL: https://example.com?iteration=66, HTTP status code: OK (200)
URL: https://example.com?iteration=56, HTTP status code: OK (200)
URL: https://example.com?iteration=52, HTTP status code: OK (200)
URL: https://example.com?iteration=12, HTTP status code: OK (200)
URL: https://example.com?iteration=53, HTTP status code: OK (200)
URL: https://example.com?iteration=07, HTTP status code: OK (200)
URL: https://example.com?iteration=02, HTTP status code: OK (200)
URL: https://example.com?iteration=01, HTTP status code: OK (200)
URL: https://example.com?iteration=61, HTTP status code: OK (200)
URL: https://example.com?iteration=05, HTTP status code: OK (200)
URL: https://example.com?iteration=09, HTTP status code: OK (200)
URL: https://example.com?iteration=03, HTTP status code: OK (200)
URL: https://example.com?iteration=58, HTTP status code: OK (200)
URL: https://example.com?iteration=50, HTTP status code: OK (200)
You'll notice that the first logged entries are always the immediately returned 429 responses, and the last entries are always the 200 responses. This is because the rate limit is encountered client-side and avoids making an HTTP call to a server. This is a good thing because it means that the server isn't flooded with requests. It also means that the rate limit is enforced consistently across all clients.
Note also that each URL's query string is unique: examine the iteration
parameter to see that it's incremented by one for each request. This parameter helps to illustrate that the 429 responses aren't from the first requests, but rather from the requests that are made after the rate limit is reached. The 200 responses arrive later but these requests were made earlier—before the limit was reached.
To have a better understanding of the various rate-limiting algorithms, try rewriting this code to accept a different RateLimiter
implementation. In addition to the TokenBucketRateLimiter
you could try:
ConcurrencyLimiter
FixedWindowRateLimiter
PartitionedRateLimiter
SlidingWindowRateLimiter
Summary
In this article, you learned how to implement a custom ClientSideRateLimitedHandler
. This pattern could be used to implement a rate-limited HTTP client for resources that you know have API limits. In this way, you're preventing your client app from making unnecessary requests to the server, and you're also preventing your app from being blocked by the server. Additionally, with the use of metadata to store retry timing values, you could also implement automatic retry logic.