Add IMemoryPoolFactory and cleanup memory pool while idle #61554
Conversation
src/Servers/Kestrel/Kestrel/src/WebHostBuilderKestrelExtensions.cs (outdated review thread, resolved)
public PinnedBlockMemoryPool(IMeterFactory meterFactory)
Maybe we should pass an options object so we can easily add more options in the future.
How would you set the options? Just have a generic MemoryPoolOptions type that users can configure in DI? That sounds like something we could add in the future without breaking anything.
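For illustration, a hypothetical sketch of what such a DI-configurable options type could look like (neither MemoryPoolOptions nor these properties exist in the PR; names and defaults are made up):

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical options type; names and defaults are illustrative only.
public sealed class MemoryPoolOptions
{
    // How often an idle pool is checked for blocks to release back to the GC.
    public TimeSpan EvictionInterval { get; set; } = TimeSpan.FromSeconds(10);

    // Upper bound on memory the pool keeps around while idle.
    public long MaxRetainedBytes { get; set; } = 64 * 1024 * 1024;
}

public static class MemoryPoolOptionsSetup
{
    public static void Configure(IServiceCollection services)
    {
        // Configured through the standard options pattern, like other framework options.
        services.Configure<MemoryPoolOptions>(o => o.MaxRetainedBytes = 128 * 1024 * 1024);
    }
}
```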
internal sealed class PinnedBlockMemoryPoolMetrics
{
    // Note: Dot separated instead of dash.
    public const string MeterName = "System.Buffers.PinnedBlockMemoryPool";
I'm not sure these metrics should be on a new meter. Couldn't they be on a meter with the same name as Kestrel's meter? A new meter name means folks would need to configure OTEL to export both the Kestrel meter and this System.Buffers.PinnedBlockMemoryPool meter. Having memory stats related to Kestrel on the Kestrel meter makes more sense to me.
I see there are multiple frameworks that use System.Buffers.PinnedBlockMemoryPool. Maybe it does make sense for it to be its own meter, although if you have multiple frameworks using the pool at the same time then the numbers will get merged together. Is that what you intend?

A solution would be to have an owner tag that says who used the memory. When a memory pool is created, the owner could pass in their name, e.g. kestrel, sockets, iis, etc., and the name is internally passed to counters when they're changed. An owner tag would let you view total memory usage, vs. just the Kestrel HTTP layer, vs. the sockets layer, vs. IIS (if running together in the same process).
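For illustration, a rough sketch of how such an owner tag could be attached to the pool's counters (the class shape, meter name, instrument name, and tag value here are assumptions for the sketch, not what the PR or the metrics issue specifies):

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Sketch of the "owner" tag idea; names are illustrative only.
internal sealed class MemoryPoolMetricsSketch
{
    private readonly UpDownCounter<long> _currentMemory;
    private readonly KeyValuePair<string, object?> _ownerTag;

    public MemoryPoolMetricsSketch(IMeterFactory meterFactory, string owner)
    {
        var meter = meterFactory.Create("Microsoft.AspNetCore.MemoryPool");
        _currentMemory = meter.CreateUpDownCounter<long>("aspnetcore.memory_pool.current_memory", unit: "By");

        // e.g. "kestrel", "sockets", "iis" -- supplied by whoever creates the pool.
        _ownerTag = new KeyValuePair<string, object?>("owner", owner);
    }

    // Every measurement carries the owner dimension, so dashboards can split or sum by it.
    public void UpdateCurrentMemory(long deltaBytes) => _currentMemory.Add(deltaBytes, _ownerTag);
}
```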
I think the meter should be renamed to indicate that it's specific to ASP.NET Core. Maybe Microsoft.AspNetCore.MemoryPool? That is simple and makes it obvious what the meter is for. I don't think people care that the underlying pool is implemented using pinned blocks.
Changed the name. But can we limit metric feedback to the issue #61594?
The owner tag seems interesting, but how would it actually work? For example, the app layer can grab the memory pool factory via DI and get a pool instance, so if you wanted a different name for app code then the interface would need to allow setting a name. If this is something you think we should pursue, can you comment on the issue I linked?
public PinnedBlockMemoryPoolMetrics(IMeterFactory meterFactory)
{
    _meter = meterFactory.Create(MeterName);
It looks like there could be multiple instances of the memory pool used across assemblies with shared source. Fortunately, I believe that as long as the same IMeterFactory is used for all of them (injected by DI), the meter and counters created will be the same instances, and metric values will be aggregated and saved to the same location.
However, I think a unit test to double-check that is important.
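A test along these lines could verify that behavior (names are illustrative and the PR's actual test may differ):

```csharp
using System.Diagnostics.Metrics;
using Microsoft.Extensions.DependencyInjection;
using Xunit;

public class MeterFactoryCachingTest
{
    // The default IMeterFactory caches meters per name/options, so two pools sharing the
    // DI-provided factory should end up on the same Meter and aggregate into the same counters.
    [Fact]
    public void SameFactory_SameName_ReturnsSameMeterInstance()
    {
        var services = new ServiceCollection();
        services.AddMetrics();
        using var provider = services.BuildServiceProvider();

        var meterFactory = provider.GetRequiredService<IMeterFactory>();

        Meter first = meterFactory.Create("Microsoft.AspNetCore.MemoryPool");
        Meter second = meterFactory.Create("Microsoft.AspNetCore.MemoryPool");

        Assert.Same(first, second);
    }
}
```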
Added a test and updated our TestMeterFactory implementation to return the same Meter instance for similar meters. Same as the default meter factory impl: https://source.dot.net/#Microsoft.Extensions.Diagnostics/Metrics/DefaultMeterFactory.cs,37
@BrennanConroy FYI naming guidelines for OTEL are here: https://opentelemetry.io/docs/specs/semconv/general/naming/
Here’s a review of the updated changes. (Sections for Key Changes and Features, Strengths, Potential Issues / Suggestions, and Summary Table collapsed.)
Conclusion: This is a robust, production-grade adaptive memory pool.
Overall: 👍 Looks great and ready for real workloads.
src/Servers/Kestrel/Core/src/Internal/PinnedBlockMemoryPoolFactory.cs (outdated review thread, resolved)
public DefaultMemoryPoolFactory(IMeterFactory? meterFactory = null)
{
    _meterFactory = meterFactory ?? NoopMeterFactory.Instance;
This looks like it would only be null in tests? I think it would be better if the type always expected a meter factory from DI, and then test code is responsible for creating the factory with a test meter factory. Then NoopMeterFactory can be removed from this type and https://github.com/dotnet/aspnetcore/blob/2ad1ff0a9e870a77522fc0725153154834bcb0ac/src/Shared/Metrics/TestMeterFactory.cs used when this type is created in tests.
    await _timerTask;
}

private sealed class NoopMeterFactory : IMeterFactory
This name isn't accurate because the factory will still create meters and people can still listen to those meters. The only difference is the factory isn't set as the owner of the meters. In other words, they're harder to test.
A better simulation of "no op" would be to not create PinnedBlockMemoryPoolMetrics if there's no meter factory, and any calls to it would be _metrics?.UpdateCurrentMemory(-block.Memory.Length);
As mentioned here, I think the best improvement to make here is to always require the meter factory and have tests always specify it.
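For reference, the pattern being suggested might look roughly like this (a sketch only; the surrounding class and the OnBlockEvicted method are illustrative stand-ins for the pool's real code paths):

```csharp
using System.Diagnostics.Metrics;

// Sketch of the null-conditional metrics pattern.
internal sealed class PinnedBlockMemoryPoolSketch
{
    private readonly PinnedBlockMemoryPoolMetrics? _metrics;

    public PinnedBlockMemoryPoolSketch(IMeterFactory? meterFactory = null)
    {
        // Only create the metrics object when a meter factory is available.
        _metrics = meterFactory is null ? null : new PinnedBlockMemoryPoolMetrics(meterFactory);
    }

    private void OnBlockEvicted(MemoryPoolBlock block)
    {
        // Each call site uses null-conditional access, so "no metrics" becomes a no-op.
        _metrics?.UpdateCurrentMemory(-block.Memory.Length);
    }
}
```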
I went with allowing a null IMeterFactory. Most tests aren't doing anything with meters, so it seems like a waste to require them to pass in a meter factory.

@reviewers, any more feedback? Want to get this merged for preview6.
Yes. I think metrics + multiple memory pools are only useful if there are ways to tell them apart. There needs to be a name that is added to the metrics: all instances increment the same counters, and you don't know what in the app is using memory without different dimensions in the metrics. I see Http.sys, IIS, Kestrel, named pipes, and sockets use the factory. Someone probably won't use multiple web servers in the same app, but Kestrel + transports will be a thing. If memory usage is high, there won't be a way to tell whether it is Kestrel or the transport that is causing the memory issues. Yes, someone could take a dump, but the solution of passing a name when creating a memory pool seems very easy. Then the extra level of detail is available in metrics apps like the Aspire dashboard, Grafana, or Azure Monitor.

This isn't just feedback on metric names; it impacts the public API as well.
Great, we can change the API in previews so let's work on that in preview7 and RC1. The core change here is evicting memory, and we would like customers to be able to try this out sooner rather than later.
src/Servers/Kestrel/Core/test/PinnedBlockMemoryPoolFactoryTests.cs (outdated review thread, resolved)
/// if activity is high, fewer or no blocks are evicted.
/// Evicted blocks are removed from the pool and their memory is unrooted for garbage collection.
/// </summary>
internal void PerformEviction()
Do other heartbeat activities log?
I'm wondering if there should be some trace-level logging here. For example, if the memory pool decides to evict memory then write a log with details about why and how much.
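Something along these lines is the kind of log being suggested (a fragment-style sketch; the message wording and fields are made up, using burstAmount and _blocks from the eviction code shown later in this thread and an assumed BlockSize constant):

```csharp
// Sketch only: hypothetical trace log emitted when eviction decides to free blocks.
if (_logger?.IsEnabled(LogLevel.Trace) == true)
{
    _logger.LogTrace(
        "Memory pool evicting {BlockCount} blocks (~{Bytes} bytes); {Remaining} blocks remain pooled.",
        burstAmount,
        burstAmount * BlockSize,
        _blocks.Count);
}
```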
Kestrel does eviction of Http2Stream streams on the heartbeat. I don't think it's logged...
(This is my final non-metrics, non-API feedback. IMO good to merge and test in the next preview once addressed.)
/azp run
Azure Pipelines successfully started running 3 pipeline(s).
@@ -166,5 +166,5 @@ public static Socket CreateDefaultBoundListenSocket(EndPoint endpoint)
     return listenSocket;
 }

-internal Func<MemoryPool<byte>> MemoryPoolFactory { get; set; } = System.Buffers.PinnedBlockMemoryPoolFactory.Create;
+internal IMemoryPoolFactory<byte> MemoryPoolFactory { get; set; } = DefaultSimpleMemoryPoolFactory.Instance;
Are we at all concerned with breaking reflection-based code that was setting this previously?
Not really. The couple of teams we gave code to were told the private reflection wouldn't work in 10.0.
services.AddOptions<SocketTransportOptions>().Configure((SocketTransportOptions options, IMemoryPoolFactory<byte> factory) =>
{
    // Set the IMemoryPoolFactory from DI on SocketTransportOptions. Usually this should be the PinnedBlockMemoryPoolFactory from UseKestrelCore.
    options.MemoryPoolFactory = factory;
Given that we're changing the type of MemoryPoolFactory from a Func to an IMemoryPoolFactory, meaning we're breaking reflection anyway, why flow the factory through the options instead of constructor-injecting it in SocketTransportFactory? Just to minimize churn? I guess this also ensures anyone constructing it themselves with an options instance resolved from DI gets the appropriate IMemoryPoolFactory.

I think it's fine to leave it, but this is our last chance not to flow the factory through options if we don't want to.
It would mean new API (since the type is public) or using the internal impl trick and switching to that. We'd need to do it on NamedPipes as well.
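For comparison, a hypothetical sketch of the constructor-injection shape being discussed (this is not what the PR does; the extra parameter is exactly the new public API referred to above, and usings for the ASP.NET Core transport and IMemoryPoolFactory<byte> types are omitted):

```csharp
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;

// Sketch only: interface members and unrelated details omitted.
public sealed class SocketTransportFactorySketch
{
    private readonly SocketTransportOptions _options;
    private readonly ILoggerFactory _loggerFactory;
    private readonly IMemoryPoolFactory<byte> _memoryPoolFactory;

    public SocketTransportFactorySketch(
        IOptions<SocketTransportOptions> options,
        ILoggerFactory loggerFactory,
        IMemoryPoolFactory<byte> memoryPoolFactory) // resolved from DI instead of read off the options
    {
        _options = options.Value;
        _loggerFactory = loggerFactory;
        _memoryPoolFactory = memoryPoolFactory;
    }
}
```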
catch (Exception ex)
{
    // Log the exception, but don't let it crash the thread pool
    _logger?.LogError(ex, "Error while performing eviction in PinnedBlockMemoryPool.");
Suggested change:
-_logger?.LogError(ex, "Error while performing eviction in PinnedBlockMemoryPool.");
+_logger?.LogCritical(ex, "Error while performing eviction in PinnedBlockMemoryPool.");
I think a critical log is appropriate considering something must have gone horribly wrong to get here.
I chose error because we're still going to run the eviction code on the next iteration. Critical, to me, feels more like we stopped doing work now and in the future because something is horribly wrong, e.g. the accept loop in Kestrel: https://source.dot.net/#Microsoft.AspNetCore.Server.Kestrel.Core/Internal/ConnectionDispatcher.cs,74
// Remove from queue and let GC clean the memory up
while (burstAmount > 0 && _blocks.TryDequeue(out var block))
I wonder if it'd be interesting to emit some metric if this _blocks.TryDequeue operation fails. It seems like this would be a good indicator that our heuristics aren't working as well as we'd like, given how conservative we try to make the burst amount.
Leave a comment on #61594? That could happen any time we are performing eviction and new traffic starts coming in, or if the pool was low and the heuristics didn't limit themselves to the size of the pool.
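If such a metric were added, the eviction loop could look roughly like this (a sketch; _metrics and IncrementEvictionEmpty are hypothetical names, and the loop shape mirrors the code quoted above):

```csharp
while (burstAmount > 0)
{
    if (!_blocks.TryDequeue(out var block))
    {
        // The pool drained (or traffic re-rented blocks) faster than the heuristic assumed.
        _metrics?.IncrementEvictionEmpty();
        break;
    }

    burstAmount--;
    // Block is not re-queued, so its pinned memory can be reclaimed once unrooted.
    _metrics?.UpdateCurrentMemory(-block.Memory.Length);
}
```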
/ba-g Test failures have been fixed in main, saving trees by skipping check
Adding the ability to release (to the GC) memory from the PinnedBlockMemoryPool. It does this by counting the rent and return calls and using heuristics to decide if it should free up memory or not based on pool usage.

Additionally implements the newly approved IMemoryPoolFactory<T> interface and plumbs it through Kestrel, IIS, and HttpSys, with the default implementation for IMemoryPoolFactory<byte> being the PinnedBlockMemoryPool.

Kestrel makes use of its heartbeat loop to clean up the memory pools, whereas IIS and HttpSys don't have an equivalent, so they use a PeriodicTimer approach.

Adds metrics to the pool. Issue for those is at #61594.
Closes #61003
Closes #27394
Closes #55490
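To illustrate the rent/return counting mentioned above, here is a minimal sketch of the idea (the thresholds and formula are made up and differ from the PR's actual heuristics):

```csharp
using System;
using System.Threading;

// Minimal sketch of counting pool activity and evicting more aggressively when idle.
internal sealed class EvictionHeuristicSketch
{
    private int _rentCount;
    private int _returnCount;

    public void OnRent() => Interlocked.Increment(ref _rentCount);
    public void OnReturn() => Interlocked.Increment(ref _returnCount);

    // Called periodically (Kestrel's heartbeat, or a timer elsewhere); returns how many
    // pooled blocks to release to the GC this iteration.
    public int ComputeEvictionCount(int blocksInPool)
    {
        int rents = Interlocked.Exchange(ref _rentCount, 0);
        int returns = Interlocked.Exchange(ref _returnCount, 0);

        if (rents > 0 || returns > 0)
        {
            // Pool saw traffic recently: trim at most a small slice.
            return blocksInPool / 100;
        }

        // Pool looks idle: release a larger burst, but never more than it holds.
        return Math.Min(blocksInPool, Math.Max(1, blocksInPool / 4));
    }
}
```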
Did a few perf runs to see the impact of Interlocked.Increment on the Rent and Return paths. Did longer 60-second runs to make sure the memory pools had plenty of opportunity to run the cleanup logic and to see the impact of incrementing counters.

TL;DR: didn't notice a perf impact with this PR, so removed the ScalableApproximateCounting code for now. Can always add it back if we find problems.
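And a sketch of the PeriodicTimer-style cleanup loop mentioned for IIS and HttpSys, which have no heartbeat to hook into (the one-second period, names, and shutdown handling are assumptions):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch: periodically run eviction on a pool that has no server heartbeat.
internal sealed class MemoryPoolEvictionLoop : IAsyncDisposable
{
    private readonly PeriodicTimer _timer = new(TimeSpan.FromSeconds(1));
    private readonly CancellationTokenSource _cts = new();
    private readonly Task _timerTask;

    public MemoryPoolEvictionLoop(Action performEviction)
    {
        _timerTask = RunAsync(performEviction);
    }

    private async Task RunAsync(Action performEviction)
    {
        try
        {
            while (await _timer.WaitForNextTickAsync(_cts.Token))
            {
                performEviction(); // e.g. pool.PerformEviction()
            }
        }
        catch (OperationCanceledException)
        {
            // Shutting down.
        }
    }

    public async ValueTask DisposeAsync()
    {
        _cts.Cancel();
        _timer.Dispose();
        await _timerTask;
        _cts.Dispose();
    }
}
```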