
Add IMemoryPoolFactory and cleanup memory pool while idle #61554


Merged: 17 commits merged into main on Jun 9, 2025

Conversation


@BrennanConroy BrennanConroy commented Apr 18, 2025

Adds the ability to release memory from the PinnedBlockMemoryPool back to the GC. It does this by counting Rent and Return calls and using heuristics based on pool usage to decide whether to free memory.

Additionally implements the newly approved IMemoryPoolFactory<T> interface and plumbs it through Kestrel, IIS, and HttpSys with the default implementation for IMemoryPoolFactory<byte> being the PinnedBlockMemoryPool.

Kestrel makes use of its heartbeat loop to clean up the memory pools, whereas IIS and HttpSys don't have an equivalent, so they use a PeriodicTimer instead.
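A minimal sketch of the PeriodicTimer approach described above (type and member names here are hypothetical; the actual implementation lives in the PR's DefaultMemoryPoolFactory):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative only: a background loop that runs pool cleanup on a fixed
// cadence, for servers (IIS, HttpSys) that have no heartbeat to hook into.
internal sealed class PoolCleanupLoop : IAsyncDisposable
{
    private readonly PeriodicTimer _timer = new(TimeSpan.FromSeconds(10));
    private readonly Action _evict;
    private readonly Task _timerTask;

    public PoolCleanupLoop(Action evict)
    {
        _evict = evict;
        _timerTask = RunAsync();
    }

    private async Task RunAsync()
    {
        // WaitForNextTickAsync returns false once the timer is disposed,
        // which ends the loop cleanly.
        while (await _timer.WaitForNextTickAsync())
        {
            _evict();
        }
    }

    public async ValueTask DisposeAsync()
    {
        _timer.Dispose();
        await _timerTask;
    }
}
```

PeriodicTimer (rather than System.Threading.Timer) keeps the callback non-reentrant: a slow eviction pass simply delays the next tick instead of overlapping with it.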

Adds metrics to the pool; the tracking issue for those is #61594.

Closes #61003
Closes #27394
Closes #55490

Did a few perf runs to see the impact of Interlocked.Increment on the Rent and Return paths.
Did longer 60 second runs to make sure the memory pools had plenty of opportunity to run clean up logic and see the impact of incrementing counters.

  • A few runs of Plaintext Platform for 60 seconds had no noticeable impact (this is our 24m RPS scenario so it should be hitting the pool a lot)
  • A few runs of MvcJsonOutput60k for 60 seconds had no noticeable impact (wanted a benchmark that wrote data to the response)
  • A few runs of Plaintext Platform with 2 IOQueues (the default is cores/2) to force more requests onto the same pool. This one might have a tiny impact on RPS (5.6m and 5.7m with 2 queues versus 5.8m and 6m without the PR), but the results were noisy.

TL;DR: didn't notice a perf impact with this PR, so removed the ScalableApproximateCounting code for now. We can always add it back if we find problems.

@BrennanConroy BrennanConroy added feature-iis Includes: IIS, ANCM feature-kestrel feature-httpsys area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions labels Apr 18, 2025
@adityamandaleeka adityamandaleeka linked an issue May 2, 2025 that may be closed by this pull request
@BrennanConroy BrennanConroy marked this pull request as ready for review May 15, 2025 01:13
{
}

public PinnedBlockMemoryPool(IMeterFactory meterFactory)
Member

Maybe we should pass an options object so we can easily add future options

Member Author

How would you set the options? Just have a generic MemoryPoolOptions type that users can configure in DI? That sounds like something we could add in the future without breaking anything.

internal sealed class PinnedBlockMemoryPoolMetrics
{
// Note: Dot separated instead of dash.
public const string MeterName = "System.Buffers.PinnedBlockMemoryPool";
Member

@JamesNK JamesNK May 15, 2025

I'm not sure these metrics should be on a new meter. Couldn't they be on a meter with the same name as Kestrel's?

A new meter name means folks would need to configure OTEL to export both the Kestrel meter and this System.Buffers.PinnedBlockMemoryPool meter. Having memory stats related to Kestrel on the Kestrel meter makes more sense to me.

Member

@JamesNK JamesNK May 15, 2025

I see there are multiple frameworks that use System.Buffers.PinnedBlockMemoryPool, so maybe it does make sense for it to be its own meter. Although if multiple frameworks use the pool at the same time, their numbers will get merged together. Is that what you intend?

A solution would be an owner tag that says who used the memory. When a memory pool is created, the owner could pass in its name (e.g. kestrel, sockets, iis), and the name would be internally passed to counters when they're changed. An owner tag would let you view total memory usage versus just the Kestrel HTTP layer, the sockets layer, or IIS (if running together in the same process).
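A rough sketch of that owner-tag idea (the metric and tag names below are illustrative, not an API in this PR):

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical sketch: the creator of a pool passes its name ("kestrel",
// "sockets", "iis", ...) and every measurement carries it as a tag, so
// per-owner memory usage can be separated in a metrics viewer.
internal sealed class TaggedPoolMetrics
{
    private readonly UpDownCounter<long> _currentMemory;
    private readonly KeyValuePair<string, object?> _ownerTag;

    public TaggedPoolMetrics(IMeterFactory meterFactory, string owner)
    {
        var meter = meterFactory.Create("System.Buffers.PinnedBlockMemoryPool");
        _currentMemory = meter.CreateUpDownCounter<long>(
            "memory_pool.current_memory", unit: "By"); // metric name is illustrative
        _ownerTag = new KeyValuePair<string, object?>("owner", owner);
    }

    public void UpdateCurrentMemory(long delta) => _currentMemory.Add(delta, _ownerTag);
}
```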

Member

I think the meter should be renamed to indicate that it's specific to ASP.NET Core. Maybe Microsoft.AspNetCore.MemoryPool? That is simple and obvious what the meter is for.

I don't think people care that the underlying pool is implemented using pinned blocks.

Member Author

Changed the name. But can we limit metric feedback to the issue #61594?

The owner tag seems interesting, but how would it actually work? For example, the app layer can grab the memory pool factory via DI and get a pool instance, so if you wanted a different name for app code then the interface would need to allow setting a name. If this is something you think we should pursue, can you comment on the issue I linked?


public PinnedBlockMemoryPoolMetrics(IMeterFactory meterFactory)
{
_meter = meterFactory.Create(MeterName);
Member

@JamesNK JamesNK May 15, 2025

It looks like there could be multiple instances of the memory pool used across assemblies via shared source. Fortunately, I believe that as long as the same IMeterFactory is used for all of them (injected by DI), the meters and counters created will be the same instances, and metric values will be aggregated and saved to the same location.

However, I think a unit test to double check that is important.

Member Author

Added a test and updated our TestMeterFactory implementation to return the same Meter instance for similar meters. Same as the default meter factory impl: https://source.dot.net/#Microsoft.Extensions.Diagnostics/Metrics/DefaultMeterFactory.cs,37
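The kind of check being discussed can be sketched as follows (assuming the Microsoft.Extensions.Diagnostics packages; this is not the test added in the PR, just an illustration of the caching behavior):

```csharp
using System.Diagnostics.Metrics;
using Microsoft.Extensions.DependencyInjection;

// The default IMeterFactory caches meters by name/version/tags, so two
// Create calls with the same name return the same Meter instance, and
// counters created from it aggregate to the same place.
internal static class MeterCachingCheck
{
    public static bool SameMeterForSameName()
    {
        using var services = new ServiceCollection()
            .AddMetrics()
            .BuildServiceProvider();
        var factory = services.GetRequiredService<IMeterFactory>();

        Meter meter1 = factory.Create("System.Buffers.PinnedBlockMemoryPool");
        Meter meter2 = factory.Create("System.Buffers.PinnedBlockMemoryPool");

        return ReferenceEquals(meter1, meter2);
    }
}
```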

@JamesNK
Member

JamesNK commented May 15, 2025

@lmolkova @noahfalk More metrics are being added to ASP.NET Core. If you have time, your feedback on naming and metrics usage would be valuable.

@JamesNK
Member

JamesNK commented May 15, 2025

@BrennanConroy FYI naming guidelines for OTEL are here: https://opentelemetry.io/docs/specs/semconv/general/naming/

@davidfowl
Member

Here’s a review of the updated PinnedBlockMemoryPool from @dotnet/aspnetcore/pull/61554:


Key Changes and Features

  • Idle Cleanup/Eviction:
    The pool now supports scheduled eviction of idle blocks using an adaptive algorithm based on recent rent/return activity.
  • Eviction Scheduling:
    • TryScheduleEviction schedules an eviction every 10 seconds if needed.
    • Uses ThreadPool.UnsafeQueueUserWorkItem for eviction work.
  • Eviction Algorithm (PerformEviction):
    • If activity is trending down, removes a percentage of blocks proportional to excess returns.
    • If activity is steady, removes 1% of the current pool (at least 1).
    • If no activity, removes 5% (at least 10) for aggressive idle trimming.
  • Metrics:
    • Tracks metrics for memory usage, evictions, allocations, etc.
  • Dispose Pattern:
    • Thread-safe with a lock and an optional disposal callback.
  • Rent/Return:
    • Rent increments a counter, returns blocks to the pool unless disposed.
    • Rent over block size throws.
    • Double-increments rent count on allocation to bias the eviction algorithm.
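The eviction heuristic summarized above can be sketched roughly as follows (names and the exact "trending down" formula are approximations of the shipped PerformEviction, not the actual code):

```csharp
using System;

// Illustrative sketch of the adaptive trimming decision: given the current
// pool size and the rent/return counts since the last eviction pass, decide
// how many blocks to release to the GC.
internal static class EvictionHeuristic
{
    public static int ComputeBurstAmount(int currentCount, long rentCount, long returnCount)
    {
        if (rentCount == 0 && returnCount == 0)
        {
            // No recent activity: trim aggressively, 5% of the pool (at least 10 blocks).
            return Math.Min(currentCount, Math.Max(10, currentCount / 20));
        }

        if (returnCount > rentCount)
        {
            // Activity trending down: evict in proportion to the excess returns.
            return (int)Math.Min(currentCount, returnCount - rentCount);
        }

        // Steady or growing activity: trim slowly, 1% of the pool (at least 1 block).
        return Math.Max(1, currentCount / 100);
    }
}
```

Note how the double rent increment on new allocations feeds into this: it biases the counters toward "activity high", so a pool that is actively growing is less likely to be trimmed.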

Strengths

  • Adaptive Trimming:
    The eviction logic is sophisticated and adapts to changing patterns of usage, which is excellent for minimizing memory waste and avoiding thrashing.
  • Thread-Safe:
    Uses ConcurrentQueue and Interlocked for all counters and state, as well as a lock for Dispose.
  • Metrics:
    Well-instrumented, which will help diagnose issues and tune performance.
  • Activity-Aware:
    The algorithm takes recent rent/return activity into account, reducing the likelihood of evicting blocks that are needed soon.
  • Clear Ownership Model:
    Internal comments and method signatures make it clear who owns buffers and how they must be returned.

Potential Issues / Suggestions

  1. Eviction Calculation Granularity

    • When the pool is very small (say, less than 100 blocks), currentCount / 100 can be zero, so no blocks are evicted in “steady” or “trending less” activity. However, the Math.Max(1, ...) ensures at least 1 is evicted in the steady case.
  2. Eviction Timer Drift

    • _nextEviction = now.AddSeconds(10);
      If eviction is delayed (ThreadPool busy), evictions may drift, but this is typical for such background cleanup and not a problem for a memory pool.
  3. Dispose Race/Redundancy

    • The lock in Dispose(bool) is correct, but some state (_isDisposed) is set before the lock, which could allow a thread to see a partial dispose.
    • This risk is minor and would only affect metrics or rare edge cases. Consider moving _isDisposed = true; inside the lock for maximum safety.
  4. Metrics Double-Counting

    • The logic for incrementing _rentCount twice on allocation is intentional (to reduce eviction under high allocation), but could slightly skew metrics for external consumers. This is a fair tradeoff for the intended behavior.
  5. No Age-Based Eviction

    • All eviction is based on activity, not block idle age. This is typical and generally fine, but if age-based trimming is needed (e.g., only trim oldest), another queue or timestamps would be required.
  6. Block Finalizer/Leak Recovery

    • The comments reference finalizer-based leak recovery, but this is not visible in this excerpt—assume the MemoryPoolBlock handles this as before.
  7. Eviction During High Concurrency

    • If eviction is scheduled during a burst of high concurrency, there could be brief over-trimming. The activity-based scheme should minimize this risk, but monitoring is recommended in real workloads.

Summary Table

| Aspect | Status | Notes |
| --- | --- | --- |
| Thread Safety | ✅ | Uses concurrent structures and locking as needed |
| Adaptive Eviction | ✅ | Algorithm considers activity and trims efficiently |
| Metrics/Diagnostics | ✅ | Good instrumentation |
| Defensive Coding | ✅ | Careful checks on sizes, disposal, and ownership |
| Over/Under-trimming | ⚠️/✅ | Well-tuned, but as with all heuristics, workload-dependent |
| Disposal Robustness | ✅ | Locking and callback support |

Conclusion

This is a robust, production-grade adaptive memory pool.

  • The adaptive eviction is an excellent improvement over static trimming.
  • Thread safety, metrics, and defensive checks are all strong.
  • The only possible enhancement would be optional age-based eviction for even more precise memory usage control, but for most server workloads, this scheme is ideal.

Overall: 👍 Looks great and ready for real workloads.


public DefaultMemoryPoolFactory(IMeterFactory? meterFactory = null)
{
_meterFactory = meterFactory ?? NoopMeterFactory.Instance;
Member

@JamesNK JamesNK May 29, 2025

This looks like it would only be null in tests? I think it would be better if the type always expected a meter factory from DI, with test code responsible for creating the type with a test meter factory. Then NoopMeterFactory can be removed from this type and https://github.com/dotnet/aspnetcore/blob/2ad1ff0a9e870a77522fc0725153154834bcb0ac/src/Shared/Metrics/TestMeterFactory.cs used when this type is created in tests.

await _timerTask;
}

private sealed class NoopMeterFactory : IMeterFactory
Member

@JamesNK JamesNK May 29, 2025

This name isn't accurate because the factory will still create meters and people can still listen to them. The only difference is that the factory isn't set as the owner of the meters. In other words, they're harder to test.

A better simulation of "no op" would be to not create PinnedBlockMemoryPoolMetrics when there's no meter factory, and make every call to it null-conditional: _metrics?.UpdateCurrentMemory(-block.Memory.Length);

As mentioned here, I think the best improvement to make here is always require the meter factory and have tests always specify it.

Member Author

I went with allowing a null IMeterFactory. Most tests aren't doing anything with meters, so it seems like a waste to require them to pass in a meter factory.
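The nullable-metrics pattern discussed here can be sketched like this (type names are illustrative, not the PR's actual classes):

```csharp
using System.Diagnostics.Metrics;

// Metrics wrapper that only exists when a meter factory was supplied.
internal sealed class PoolMetrics
{
    private readonly UpDownCounter<long> _currentMemory;

    public PoolMetrics(IMeterFactory meterFactory)
        => _currentMemory = meterFactory.Create("System.Buffers.PinnedBlockMemoryPool")
            .CreateUpDownCounter<long>("memory_pool.current_memory", unit: "By");

    public void UpdateCurrentMemory(long delta) => _currentMemory.Add(delta);
}

// The pool accepts a null IMeterFactory (e.g. in tests) and guards every
// metrics call with the null-conditional operator.
internal sealed class Pool
{
    private readonly PoolMetrics? _metrics;

    public Pool(IMeterFactory? meterFactory = null)
        => _metrics = meterFactory is null ? null : new PoolMetrics(meterFactory);

    public void OnReturn(int blockLength) => _metrics?.UpdateCurrentMemory(-blockLength);
}
```

This keeps test construction trivial (`new Pool()`) while apps that resolve the pool through DI still get full instrumentation.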

@BrennanConroy
Member Author

@ reviewers, any more feedback? Want to get this merged for preview6.

@JamesNK
Member

JamesNK commented Jun 3, 2025

Yes.

I think metrics + multiple memory pools are only useful if there is a way to tell the pools apart: a name needs to be added to the metrics. Right now all instances increment the same counters, and without different dimensions you can't tell what in the app is using memory.

I see Http.sys, IIS, Kestrel, named pipes, and sockets use the factory. Someone probably won't use multiple web servers in the same app, but Kestrel + transports will be a thing. If memory usage is high, there won't be a way to tell whether Kestrel or the transport is causing it. Yes, someone could take a dump, but passing a name when creating a memory pool seems very easy, and then the extra level of detail is available in metrics apps like the Aspire dashboard, Grafana, or Azure Monitor.

This isn't just feedback on metrics names. It impacts the public API of IMemoryPoolFactory and will require a bunch of other changes to pass the name through various layers onto the counters.

@BrennanConroy
Member Author

Great, we can change the API in previews, so let's work on that in preview7 and RC1. The core change here is evicting memory, and we'd like customers to be able to try it out sooner rather than later.

/// if activity is high, fewer or no blocks are evicted.
/// Evicted blocks are removed from the pool and their memory is unrooted for garbage collection.
/// </summary>
internal void PerformEviction()
Member

@JamesNK JamesNK Jun 3, 2025

Do other heartbeat activities log?

I'm wondering if there should be some trace-level logging here. For example, if the memory pool decides to evict memory then write a log with details about why and how much.

Member

Kestrel does eviction of Http2Stream streams on the heartbeat. I don't think it's logged...

@JamesNK
Member

JamesNK commented Jun 4, 2025

(This is my final non-metrics, non-API feedback. IMO good to merge and test in the next preview once addressed.)

@BrennanConroy
Member Author

/azp run


Azure Pipelines successfully started running 3 pipeline(s).

@@ -166,5 +166,5 @@ public static Socket CreateDefaultBoundListenSocket(EndPoint endpoint)
return listenSocket;
}

internal Func<MemoryPool<byte>> MemoryPoolFactory { get; set; } = System.Buffers.PinnedBlockMemoryPoolFactory.Create;
internal IMemoryPoolFactory<byte> MemoryPoolFactory { get; set; } = DefaultSimpleMemoryPoolFactory.Instance;
Member

Are we at all concerned with breaking reflection-based code that was setting this previously?

Member Author

Not really. The couple of teams we gave code to were told the private reflection wouldn't work in 10.0.

services.AddOptions<SocketTransportOptions>().Configure((SocketTransportOptions options, IMemoryPoolFactory<byte> factory) =>
{
// Set the IMemoryPoolFactory from DI on SocketTransportOptions. Usually this should be the PinnedBlockMemoryPoolFactory from UseKestrelCore.
options.MemoryPoolFactory = factory;
Member

Given that we're changing the type of MemoryPoolFactory from a Func to an IMemoryPoolFactory, meaning we're breaking reflection anyway, why flow the factory through the options instead of constructor-injecting it in SocketTransportFactory? Just to minimize churn? I guess this also ensures that anyone constructing it themselves with an options instance resolved from DI gets the appropriate IMemoryPoolFactory.

I think it's fine to leave it, but I think it's our last chance not to flow the factory through options if we don't want to.

Member Author

It would mean new API (since the type is public) or using the internal impl trick and switching to that. We'd need to do it on NamedPipes as well.

catch (Exception ex)
{
// Log the exception, but don't let it crash the thread pool
_logger?.LogError(ex, "Error while performing eviction in PinnedBlockMemoryPool.");
Member

Suggested change
_logger?.LogError(ex, "Error while performing eviction in PinnedBlockMemoryPool.");
_logger?.LogCritical(ex, "Error while performing eviction in PinnedBlockMemoryPool.");

I think a critical log is appropriate considering something must have gone horribly wrong to get here.

Member Author

I chose error because we're still going to be running eviction code next iteration. Critical to me feels more like we stopped doing work now and in the future because something is horribly wrong. i.e. the accept loop in Kestrel: https://source.dot.net/#Microsoft.AspNetCore.Server.Kestrel.Core/Internal/ConnectionDispatcher.cs,74

}

// Remove from queue and let GC clean the memory up
while (burstAmount > 0 && _blocks.TryDequeue(out var block))
Member

I wonder if it'd be interesting to emit some metric if this _blocks.TryDequeue operation fails. It seems like this would be a good indicator that our heuristics aren't working as well as we'd like given how conservative we try to make the burst amount.

Member Author

Leave a comment on #61594? That could happen any time we are performing eviction and new traffic starts coming in, or if the pool was low and the heuristics didn't limit themselves to the size of the pool.

@BrennanConroy
Member Author

/ba-g Test failures have been fixed in main, saving trees by skipping check

@BrennanConroy BrennanConroy merged commit b24508e into main Jun 9, 2025
25 of 27 checks passed
@BrennanConroy BrennanConroy deleted the brecon/kestrelmemorypool branch June 9, 2025 21:09
@dotnet-policy-service dotnet-policy-service bot added this to the 10.0-preview6 milestone Jun 9, 2025