Description
This is a summary of the Design doc.
Background and Motivation
Outages caused when system activities exceed the system’s capacity are a leading concern in system design. The ability to handle system activity efficiently, and to gracefully limit the execution of activities before the system is under stress, is fundamental to system resiliency. .NET does not have a standardized means for expressing and managing the rate limiting logic needed to produce a resilient system. This adds complexity to designing and developing resilient software in .NET by introducing an easy vector for competing rate limiting logic and anti-patterns. A standardized interface in .NET for limiting activities will make it easier for developers to build resilient systems for all scales of deployment and workload.
Users will interact with the proposed APIs to ensure that rate and/or concurrency limits are enforced. This abstraction requires explicit release semantics to accommodate non-self-replenishing (i.e. concurrency) limits, similar to how semaphores operate. The abstraction also accounts for self-replenishing (i.e. rate) limits, where no explicit release semantics are needed because the permits are replenished automatically over time. This component encompasses the Acquire/WaitAsync mechanics (i.e. check vs. wait behaviours), and default implementations will be provided for select accounting methods (fixed window, sliding window, token bucket, simple concurrency). The return type is a RateLimitLease type, which indicates whether acquisition was successful and manages the lifecycle of the acquired permits.
Proposed API - Abstractions
namespace System.Threading.RateLimiting
{
public abstract class RateLimiter
{
// An estimated count of available permits. Potential uses include diagnostics.
public abstract int GetAvailablePermits();
// Fast synchronous attempt to acquire permits
// Set permitCount to 0 to get whether permits are exhausted
public RateLimitLease Acquire(int permitCount = 1);
// Implementation
protected abstract RateLimitLease AcquireCore(int permitCount);
// Wait until the requested permits are available or permits can no longer be acquired
// Set permitCount to 0 to wait until permits are replenished
public ValueTask<RateLimitLease> WaitAsync(int permitCount = 1, CancellationToken cancellationToken = default);
// Implementation
protected abstract ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public abstract class RateLimitLease : IDisposable
{
// This represents whether lease acquisition was successful
public abstract bool IsAcquired { get; }
// Method to extract any general metadata. This is implemented by subclasses
// to return the metadata they support.
public abstract bool TryGetMetadata(string metadataName, out object? metadata);
// This casts the metadata returned by the general method above to known types of values.
public bool TryGetMetadata<T>(MetadataName<T> metadataName, [MaybeNullWhen(false)] out T metadata);
// Used to get a list of metadata that is available on the lease which can be dictionary keys or static list of strings.
// Useful for debugging purposes but TryGetMetadata should be used instead in product code.
public abstract IEnumerable<string> MetadataNames { get; }
// Virtual method that extracts all the metadata using the list of metadata names and TryGetMetadata().
public virtual IEnumerable<KeyValuePair<string, object?>> GetAllMetadata();
// Follow the general .NET pattern for dispose
public void Dispose() { Dispose(true); GC.SuppressFinalize(this); }
protected virtual void Dispose(bool disposing);
}
// Curated set of known MetadataName<T>
public static class MetadataName
{
public static MetadataName<TimeSpan> RetryAfter { get; } = Create<TimeSpan>("RETRY_AFTER");
public static MetadataName<string> ReasonPhrase { get; } = Create<string>("REASON_PHRASE");
public static MetadataName<T> Create<T>(string name) => new MetadataName<T>(name);
}
// Wrapper of string and a type parameter signifying the type of the metadata value
public sealed class MetadataName<T> : IEquatable<MetadataName<T>>
{
public MetadataName(string name);
public string Name { get; }
}
}
The Acquire call represents a fast synchronous check that immediately returns whether there are enough permits available to continue with the operation and, if there are, atomically acquires them. It returns a RateLimitLease whose RateLimitLease.IsAcquired value represents whether the acquisition was successful; when it was, the lease itself represents the acquired permits. The user can pass in a permitCount of 0 to check whether the permit limit has been reached without acquiring any permits.
WaitAsync, on the other hand, represents an awaitable request to check whether permits are available. If permits are available, it obtains them and returns immediately with a RateLimitLease representing the acquired permits. If the permits are not available, the caller is willing to pause the operation and wait until the necessary permits become available. The user can also pass in a permitCount of 0, which indicates that the user wants to wait until more permits become available.
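The wait behaviour described above can be sketched with a small queueing limiter. This is only an illustration under stated assumptions: DemoQueueingLimiter, its bool result (standing in for a RateLimitLease), and its Release method are hypothetical names that are not part of the proposal, and queue limits, NewestFirst ordering, and cancellation are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var limiter = new DemoQueueingLimiter(permitLimit: 1);

Task<bool> first = limiter.WaitAsync();    // a permit is free, completes immediately
Task<bool> second = limiter.WaitAsync();   // no permit left, the request is queued
Console.WriteLine($"{first.IsCompleted} {second.IsCompleted}");   // True False

limiter.Release();                         // hands the permit to the oldest waiter
Console.WriteLine(await second);           // True

// Hypothetical stand-in for a concurrency-style limiter (OldestFirst only).
sealed class DemoQueueingLimiter
{
    private readonly object _lock = new();
    private readonly Queue<TaskCompletionSource<bool>> _waiters = new();
    private int _available;

    public DemoQueueingLimiter(int permitLimit) => _available = permitLimit;

    public Task<bool> WaitAsync()
    {
        lock (_lock)
        {
            if (_available > 0)
            {
                _available--;
                return Task.FromResult(true);
            }

            // No permit available: park the caller until Release is called.
            var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
            _waiters.Enqueue(tcs);
            return tcs.Task;
        }
    }

    public void Release()
    {
        TaskCompletionSource<bool> next;
        lock (_lock)
        {
            if (!_waiters.TryDequeue(out next))
            {
                _available++;              // nobody is waiting, return the permit
                return;
            }
        }
        next.SetResult(true);              // the permit passes directly to the waiter
    }
}
```

Completing the waiter outside the lock keeps continuations from running while the limiter's state is locked.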
GetAvailablePermits() is envisioned as a flexible and simple way for the limiter to communicate its status to the user. This count is similar in essence to SemaphoreSlim.CurrentCount. It can also be used in diagnostics to track the usage of the rate limiter.
The abstract class RateLimitLease is used to facilitate the release semantics of rate limiters. That is, for non-self-replenishing limiters, the permits obtained via Acquire/WaitAsync are returned by disposing the RateLimitLease. This ensures that the user can't release more permits than were obtained.
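A minimal sketch of these release semantics follows, assuming a simple counter-based concurrency limiter. DemoConcurrencyLimiter and DemoLease are hypothetical stand-ins for illustration and are not the proposed types. The sketch shows the all-or-nothing acquisition, the permitCount of 0 probe, and an idempotent Dispose that cannot return more permits than the lease holds.

```csharp
using System;
using System.Threading;

var limiter = new DemoConcurrencyLimiter(permitLimit: 2);

var first = limiter.Acquire(2);                       // takes both permits
Console.WriteLine(first.IsAcquired);                  // True
Console.WriteLine(limiter.Acquire(0).IsAcquired);     // False: permits are exhausted
Console.WriteLine(limiter.Acquire().IsAcquired);      // False: nothing left to acquire

first.Dispose();
first.Dispose();                                      // second Dispose is a no-op
Console.WriteLine(limiter.GetAvailablePermits());     // 2

// Hypothetical stand-in, not the proposed ConcurrencyLimiter.
sealed class DemoConcurrencyLimiter
{
    private readonly object _lock = new();
    private int _available;

    public DemoConcurrencyLimiter(int permitLimit) => _available = permitLimit;

    public int GetAvailablePermits() { lock (_lock) { return _available; } }

    // All-or-nothing: either every requested permit is taken or none is.
    // A permitCount of 0 only reports whether any permit is still available.
    public DemoLease Acquire(int permitCount = 1)
    {
        lock (_lock)
        {
            bool exhausted = permitCount == 0 && _available == 0;
            if (permitCount > _available || exhausted)
            {
                return new DemoLease(this, 0, acquired: false);
            }
            _available -= permitCount;
            return new DemoLease(this, permitCount, acquired: true);
        }
    }

    internal void Release(int count) { lock (_lock) { _available += count; } }
}

sealed class DemoLease : IDisposable
{
    private readonly DemoConcurrencyLimiter _owner;
    private readonly int _count;
    private int _disposed;

    internal DemoLease(DemoConcurrencyLimiter owner, int count, bool acquired)
    {
        _owner = owner;
        _count = count;
        IsAcquired = acquired;
    }

    public bool IsAcquired { get; }

    public void Dispose()
    {
        // Only the first Dispose of a successful lease returns its permits.
        if (Interlocked.Exchange(ref _disposed, 1) == 0 && IsAcquired)
        {
            _owner.Release(_count);
        }
    }
}
```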
The RateLimitLease.IsAcquired property is used to express whether the acquisition request was successful. TryGetMetadata() is implemented by subclasses to allow for returning additional metadata as part of the rate limit decision. A curated list of well-known names for commonly used metadata is provided via MetadataName, which keeps a list of MetadataName<T> instances; each is a wrapper of a string and a type parameter indicating the value type. To optimize performance, implementations will need to pool RateLimitLease instances.
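One way the typed lookup could sit on top of the string-based one is sketched below. All type names here (DemoMetadataName<T>, DemoNames, MetadataLease) are hypothetical stand-ins, and the dictionary-backed storage is an assumption made for the example; the proposal leaves storage to the subclass.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var lease = new MetadataLease(new Dictionary<string, object?>
{
    ["RETRY_AFTER"] = TimeSpan.FromSeconds(5),
});

// The typed name carries the value type, so the caller never casts.
if (lease.TryGetMetadata(DemoNames.RetryAfter, out TimeSpan retryAfter))
{
    Console.WriteLine(retryAfter.TotalSeconds);                            // 5
}
Console.WriteLine(lease.TryGetMetadata(DemoNames.ReasonPhrase, out _));   // False

// Hypothetical counterpart of MetadataName<T>: a string plus a type parameter.
sealed class DemoMetadataName<T>
{
    public DemoMetadataName(string name) => Name = name;
    public string Name { get; }
}

static class DemoNames
{
    public static DemoMetadataName<TimeSpan> RetryAfter { get; } = new("RETRY_AFTER");
    public static DemoMetadataName<string> ReasonPhrase { get; } = new("REASON_PHRASE");
}

sealed class MetadataLease
{
    private readonly Dictionary<string, object?> _metadata;

    public MetadataLease(Dictionary<string, object?> metadata) => _metadata = metadata;

    // General, untyped lookup (what a subclass would implement).
    public bool TryGetMetadata(string name, out object? value) =>
        _metadata.TryGetValue(name, out value);

    // Typed wrapper: succeeds only if the stored value has the expected type.
    public bool TryGetMetadata<T>(DemoMetadataName<T> name, out T? value)
    {
        if (TryGetMetadata(name.Name, out object? raw) && raw is T typed)
        {
            value = typed;
            return true;
        }
        value = default;
        return false;
    }
}
```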
Usage Examples
For components enforcing limits, the standard usage pattern will be:
RateLimiter limiter = new SomeRateLimiter(options => ...);
// Synchronous checks
endpoints.MapGet("/acquire", async context =>
{
// Check limiter using `Acquire` that should complete immediately
using var lease = limiter.Acquire();
// RateLimitLease was successfully obtained, the using block ensures
// that the lease is released upon processing completion.
if (lease.IsAcquired)
{
await context.Response.WriteAsync("Hello World!");
}
else
{
// Rate limit check failed, send 429 response
context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
return;
}
});
// Async checks
endpoints.MapGet("/waitAsync", async context =>
{
// Check limiter using `WaitAsync` which may complete immediately
// or wait until permits are available. Using block ensures that
// the lease is released upon processing completion.
using var lease = await limiter.WaitAsync();
if (lease.IsAcquired)
{
await context.Response.WriteAsync("Hello World!");
}
else
{
// Rate limit check failed, send 429 response
context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
return;
}
});
Proposed API - Concrete Implementations
namespace System.Threading.RateLimiting
{
// This specifies the behaviour of `WaitAsync` when PermitLimit has been reached
public enum QueueProcessingOrder
{
OldestFirst,
NewestFirst
}
public sealed class ConcurrencyLimiterOptions
{
public ConcurrencyLimiterOptions(
int permitLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit);
// Specifies the maximum number of permits for the limiter
public int PermitLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
}
public sealed class TokenBucketRateLimiterOptions
{
public TokenBucketRateLimiterOptions(
int tokenLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit,
TimeSpan replenishmentPeriod,
int tokensPerPeriod,
bool autoReplenishment = true);
// Specifies the maximum number of tokens for the limiter
public int TokenLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
// Specifies the period between replenishments
public TimeSpan ReplenishmentPeriod { get; }
// Specifies how many tokens to restore each replenishment
public int TokensPerPeriod { get; }
// Whether to create a timer to trigger replenishment automatically
public bool AutoReplenishment { get; }
}
// Window based rate limiter options
public sealed class FixedWindowRateLimiterOptions
{
public FixedWindowRateLimiterOptions(
int permitLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit,
TimeSpan window,
bool autoReplenishment = true);
// Specifies the maximum number of permits for the limiter
public int PermitLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
// Specifies the duration of the window where the rate limit is applied
public TimeSpan Window { get; }
// Whether to create a timer to trigger replenishment automatically
public bool AutoRefresh { get; }
}
public sealed class SlidingWindowRateLimiterOptions
{
public SlidingWindowRateLimiterOptions(
int permitLimit,
QueueProcessingOrder queueProcessingOrder,
int queueLimit,
TimeSpan window,
int segmentsPerWindow,
bool autoReplenishment = true);
// Specifies the maximum number of permits for the limiter
public int PermitLimit { get; }
// Permits exhausted mode, configures `WaitAsync` behaviour
public QueueProcessingOrder QueueProcessingOrder { get; }
// Queue limit when queuing is enabled
public int QueueLimit { get; }
// Specifies the duration of the window where the rate limit is applied
public TimeSpan Window { get; }
// Specifies the number of segments the Window should be divided into
public int SegmentsPerWindow { get; set; }
// Whether to create a timer to trigger replenishment automatically
public bool AutoRefresh { get; }
}
// Limiter implementations
public sealed class ConcurrencyLimiter : RateLimiter
{
public ConcurrencyLimiter(ConcurrencyLimiterOptions options);
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public sealed class TokenBucketRateLimiter : RateLimiter
{
public TokenBucketRateLimiter(TokenBucketRateLimiterOptions options);
public bool TryReplenish();
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public sealed class FixedWindowRateLimiter : RateLimiter
{
public FixedWindowRateLimiter(FixedWindowRateLimiterOptions options);
public bool TryRefresh();
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
public sealed class SlidingWindowRateLimiter : RateLimiter
{
public SlidingWindowRateLimiter(SlidingWindowRateLimiterOptions options);
public bool TryRefresh();
public override int GetAvailablePermits();
protected override RateLimitLease AcquireCore(int permitCount);
protected override ValueTask<RateLimitLease> WaitAsyncCore(int permitCount, CancellationToken cancellationToken = default);
}
}
For more details on how these options work, see the Design Doc.
Adoption samples
This API will be used in implementing ASP.NET Core middleware in .NET 6.0 and can be useful in implementing limits for various BCL types in the future including:
- Channels
- Pipelines
- Streams
- HttpClient
Sample implementation in Channels; note that this uses a slightly outdated API.
For more theoretical samples of RateLimiter implementations, see the Proof of Concepts in the Design Doc.
We also anticipate adoption for enforcing limits in YARP, as well as conversion of existing implementations in ATS and ACR.
Alternative Designs
Token bucket rate limiter external replenishment
The default implementation will allocate a new System.Threading.Timer to trigger permit replenishment. This can be expensive when many limiters are in use, and a better pattern is to trigger the replenishment via a single Timer. The current proposal has two APIs to support this: a public void Replenish() on the limiter and a public bool AutoReplenishment { get;set; } on the options class.
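The single-timer pattern can be sketched as follows. DemoTokenBucket is a hypothetical stand-in rather than the proposed TokenBucketRateLimiter, and the meaning of the TryReplenish return value (false when the bucket was already full) is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

var buckets = new List<DemoTokenBucket>
{
    new DemoTokenBucket(tokenLimit: 3, tokensPerPeriod: 1),
    new DemoTokenBucket(tokenLimit: 5, tokensPerPeriod: 2),
};
foreach (var bucket in buckets)
{
    bucket.TryTake(3);                     // leaves 0 and 2 tokens
}

// One shared callback replenishes every bucket; no bucket owns a timer.
Action replenishAll = () =>
{
    foreach (var b in buckets)
    {
        b.TryReplenish();
    }
};

replenishAll();                            // what each timer tick would do
Console.WriteLine($"{buckets[0].Available} {buckets[1].Available}");   // 1 4

// In a host, a single System.Threading.Timer would drive the same callback.
using var timer = new Timer(_ => replenishAll(), null, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1));

// Hypothetical stand-in for a token bucket created with auto-replenishment turned off.
sealed class DemoTokenBucket
{
    private readonly object _lock = new();
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private int _tokens;

    public DemoTokenBucket(int tokenLimit, int tokensPerPeriod)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _tokens = tokenLimit;
    }

    public int Available { get { lock (_lock) { return _tokens; } } }

    public bool TryTake(int count)
    {
        lock (_lock)
        {
            if (count > _tokens) return false;
            _tokens -= count;
            return true;
        }
    }

    // Assumed semantics: returns false when the bucket was already full.
    public bool TryReplenish()
    {
        lock (_lock)
        {
            if (_tokens == _tokenLimit) return false;
            _tokens = Math.Min(_tokenLimit, _tokens + _tokensPerPeriod);
            return true;
        }
    }
}
```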
Subclasses can override default behaviour
Instead of exposing the two APIs, we can make the class extensible and allow subclasses to add the Replenish() method as well as the external replenishment functionality. However, the AutoReplenishment option still needs to exist so the default implementation knows whether a Timer needs to be created.
Heuristics-based replenishment
We can rely on recomputing the permit count, based on how much time has passed since the last replenishment, on every invocation of Acquire, WaitAsync and GetAvailablePermits. However, we'll still need to allocate a Timer to process queued WaitAsync calls.
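A rough sketch of this recompute-on-access idea is shown below, using an injected clock so the example is deterministic. LazyTokenBucket and its members are hypothetical names; the sketch is not thread-safe and does not cover queued waiters, which is exactly the part that would still need a Timer.

```csharp
using System;

var now = TimeSpan.Zero;                   // fake clock keeps the example deterministic
var bucket = new LazyTokenBucket(
    tokenLimit: 4,
    tokensPerPeriod: 2,
    period: TimeSpan.FromSeconds(1),
    clock: () => now);

bucket.TryTake(4);                                // the bucket is now empty
now = TimeSpan.FromSeconds(1.5);                  // one full period has elapsed
Console.WriteLine(bucket.GetAvailableTokens());   // 2
now = TimeSpan.FromSeconds(10);                   // many periods later
Console.WriteLine(bucket.GetAvailableTokens());   // 4, capped at the limit

// Hypothetical token bucket that replenishes lazily instead of on a timer.
sealed class LazyTokenBucket
{
    private readonly int _tokenLimit;
    private readonly int _tokensPerPeriod;
    private readonly TimeSpan _period;
    private readonly Func<TimeSpan> _clock;
    private int _tokens;
    private TimeSpan _lastReplenish;

    public LazyTokenBucket(int tokenLimit, int tokensPerPeriod, TimeSpan period, Func<TimeSpan> clock)
    {
        _tokenLimit = tokenLimit;
        _tokensPerPeriod = tokensPerPeriod;
        _period = period;
        _clock = clock;
        _tokens = tokenLimit;
        _lastReplenish = clock();
    }

    public int GetAvailableTokens()
    {
        Refresh();
        return _tokens;
    }

    public bool TryTake(int count)
    {
        Refresh();
        if (count > _tokens) return false;
        _tokens -= count;
        return true;
    }

    // Credits tokens for every whole period that elapsed since the last credit.
    private void Refresh()
    {
        long periods = (_clock() - _lastReplenish).Ticks / _period.Ticks;
        if (periods <= 0) return;
        _tokens = (int)Math.Min(_tokenLimit, _tokens + periods * _tokensPerPeriod);
        _lastReplenish += TimeSpan.FromTicks(periods * _period.Ticks);
    }
}
```

Advancing _lastReplenish by whole periods only, rather than to the current time, keeps the partial period from being lost between calls.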
Separate abstractions for rate and concurrency limits
A design where rate limits and concurrency limits are expressed by separate abstractions was considered. That design more clearly expresses the intended use pattern, where rate limits do not need to return a RateLimitLease and do not possess release semantics. In comparison, in the proposed design the release semantics for rate limits are a no-op.
However, that design has a drawback for consumers of rate limits, since there are two possible limiter types that can be specified by the user. To alleviate some of the complexity, a wrapper for rate limits was considered. However, the complexity of this design was deemed undesirable and a unified abstraction for rate and concurrency limits was preferred.
A struct instead of class for RateLimitLease
This approach was considered since allocating a new RateLimitLease for each acquisition request is considered to be a performance bottleneck. The design evolved to the following:
// Represents a permit lease obtained from the limiter. The user disposes this type to release the acquired permits.
public struct RateLimitLease : IDisposable
{
// This represents whether permit acquisition was successful
public bool IsAcquired { get; }
// This represents the count of permits obtained in the lease
public int Count { get; }
// This represents additional metadata that can be returned as part of a call to Acquire/AcquireAsync
// Potential uses could include a RetryAfter value or an error code.
public object? State { get; }
// Private fields to be used by `onDispose()`, this is not a public API shown here for completeness
private RateLimiter? _rateLimiter;
private Action<RateLimiter?, int, object?>? _onDispose;
// Constructor which sets all the readonly values
public RateLimitLease(
bool isAcquired,
int count,
object? state,
RateLimiter? rateLimiter,
Action<RateLimiter?, int, object?>? onDispose);
// Return the acquired permits, calls onDispose with RateLimiter
// This can only be called once, it's a user error if called more than once
public void Dispose();
}
However, this design became problematic with the consideration of including an AggregatedRateLimiter<TKey>, which necessitates the existence of another struct, RateLimitLease<TKey>, with a private reference to the AggregatedRateLimiter<TKey>. This bifurcation of the return types of Acquire and WaitAsync between the AggregatedRateLimiter<TKey> and RateLimiter makes it very difficult to consume aggregated and simple limiters in a consistent manner. Additional complexity in defining an API to store and retrieve additional metadata is also a concern, see below. For this reason, it is better to make RateLimitLease a class instead of a struct and require implementations to pool if optimization for performance is required.
Additional concerns that needed to be resolved for a struct RateLimitLease are elaborated below:
Permit as reference ID
There was an alternative proposal where the struct only contains a reference ID and additional APIs on the RateLimiter instance are used to return permits and obtain additional metadata. This is equivalent to the RateLimiter internally tracking outstanding permit leases and allowing permit release via RateLimiter.Release(RateLimitLease.ID) or retrieval of additional metadata via RateLimiter.TryGetMetadata(RateLimitLease.ID, MetadataName). This shifts the need to pool data structures for tracking idempotency of Dispose and additional metadata to the RateLimiter implementation itself. This additional indirection doesn't resolve the bifurcation issue mentioned previously and necessitates additional APIs on the RateLimiter that are hard to use and implement; as such, this alternative was not chosen.
RateLimitLease state
The current proposal uses an object State to communicate additional information on a rate limit decision. This is the most general way to provide additional information since the RateLimiter can add any arbitrary type or collection via object State. However, there is a tradeoff between the generality and flexibility of this approach and its usability. For example, we have gotten feedback from ATS that they want a simpler way to specify a set of values such as RetryAfter, error codes, or percentage of permits used. As such, here are several design alternatives.
Interfaces
One option to support access to values is to keep the object State but require limiters to set a state that implements different interfaces. For example, there could be an IRateLimiterRetryAfterHeaderValue interface that looks like:
public interface IRateLimiterRetryAfterHeaderValue
{
string RetryAfter { get; }
}
Consumers of the RateLimiter would then check whether the State object implements the interface before retrieving the value. This also puts a burden on the implementers of RateLimiters, since they should also define a set of interfaces to represent commonly used values.
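The consumer-side check could look roughly like the sketch below. ThrottledState is a hypothetical state type introduced only for illustration, and the "30" value is made up for the example.

```csharp
using System;

// The limiter would hand back some state object; its concrete type is unknown to the consumer.
object state = new ThrottledState("30");

// The consumer probes for each interface it understands before reading a value.
if (state is IRateLimiterRetryAfterHeaderValue retry)
{
    Console.WriteLine($"Retry-After: {retry.RetryAfter}");   // Retry-After: 30
}

public interface IRateLimiterRetryAfterHeaderValue
{
    string RetryAfter { get; }
}

// Hypothetical state type that a limiter implementation would have to define.
sealed record ThrottledState(string RetryAfter) : IRateLimiterRetryAfterHeaderValue;
```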
Property bags
Property bags like Activity.Baggage and Activity.Tags are very well suited to storing the values that were identified by the ATS team. For web workloads, where these values are likely to be header and header value pairs, this is a good way to express the State field on RateLimitLease. Specifically, the type would be one of the following:
Option 1: IReadOnlyDictionary<string,string?> State
However, there is a drawback here in terms of generality, since it would mean that we are opinionated about keys and values both being strings. Alternatively, we can modify this to be:
Option 2: IReadOnlyDictionary<string,object?> State
This is slightly more flexible since the value can be any type. However, to use these values, the user would need to know ahead of time what the value type for each key is and downcast the object to that type. Going one step further:
Option 3: IReadOnlyDictionary<object,object?> State
This gives the most flexibility in the property bag, since we are no longer opinionated about the key type. But the same issue as with option 2 remains, and it's unclear whether this generality of key type would actually be useful.
Feature collection
Another way to represent the State would be something like an IFeatureCollection. The benefit of this interface is that it is general enough to contain any type of value, while specific implementations can still optimize for commonly accessed fields by accessing them directly (e.g. https://github.com/dotnet/aspnetcore/blob/52eff90fbcfca39b7eb58baad597df6a99a542b0/src/Http/Http/src/DefaultHttpContext.cs).
A bool returned by TryAcquire to indicate success/failure and throw for WaitAsync to indicate failure
An earlier iteration proposed the following API instead:
namespace System.Threading.RateLimiting
{
public abstract class RateLimiter
{
// An estimated count of permits. Potential uses include diagnostics.
abstract int GetAvailablePermits();
// Fast synchronous attempt to acquire permits.
// Set requestedCount to 0 to get whether permit limit has been reached.
abstract bool TryAcquire(int requestedCount, out RateLimitLease lease);
// Wait until the requested permits are available.
// Set requestedCount to 0 to wait until permits are replenished.
// An exception is thrown if permits cannot be obtained.
abstract ValueTask<RateLimitLease> WaitAsync(int requestedCount, CancellationToken cancellationToken = default);
}
public struct RateLimitLease: IDisposable
{
// This represents additional metadata that can be returned as part of a call to TryAcquire/WaitAsync
// Potential uses could include a RetryAfter value.
public object? State { get; init; }
// Constructor
public RateLimitLease(object? state, Action<RateLimitLease>? onDispose);
// Return the acquired permits
public void Dispose();
// This static field can be used for rate limiters that do not require release semantics or for failed concurrency limiter acquisition requests.
public static RateLimitLease NoopSuccess = new RateLimitLease(null, null);
}
}
This was proposed since the method name TryAcquire seemed to convey the idea that it is a quick synchronous check. However, this also impacted the shape of the API, which by convention returns bool and returns additional information via out parameters. If a limiter wants to communicate a failure for a WaitAsync call, it would throw an exception. This may occur if the limiter has reached the hard cap. The drawback here is that these cases, which may be frequent depending on the workload, will necessitate the allocation of an Exception.
Another alternative was identified with WaitAsync returning a tuple, i.e. ValueTask<(bool, RateLimitLease)> WaitAsync(...). The consumption pattern would then look like:
(bool successful, RateLimitLease lease) = await limiter.WaitAsync(1);
if (successful)
{
// Dispose the lease when processing completes
using (lease)
{
// continue processing
}
}
else
{
// limit reached
}
Release APIs on RateLimiter
Instead of using RateLimitLease to track the release of permits, an alternative approach proposes adding a void Release(int releaseCount) method on RateLimiter and requiring users to call this method explicitly. However, this requires the user to call Release with the correct count, which can be error prone, so the RateLimitLease approach was preferred.
Partial acquisition and release
Currently, the acquisition and release of permits is all-or-nothing.
Additional APIs will be needed to allow acquiring only part of the requested permits. For example, 5 permits are requested, but the caller is willing to accept a subset if not all 5 are available.
Similarly, additional APIs can be added to RateLimitLease to facilitate the release of a part of the acquired permits. For example, 5 permits are obtained, but as processing continues, each permit can be released individually.
These APIs are not included in this proposal since no concrete use cases have been identified so far.
Risks
This is a proposal for a new API, and the main concerns include:
- Consumption patterns of rate limiters should be simple and idiomatic to prevent pitfalls.
- The default rate and/or concurrency limiters should suffice in most general use cases.
- The abstraction should be expressive enough to allow for customized rate limiters.