## Description

### The problem
At the moment `Buffer.allocUnsafe` uses a pre-allocated internal Buffer, which is then sliced for newly allocated buffers when possible. The size of the pool may be configured by setting the `Buffer.poolSize` property (8KB by default). Once the current internal buffer fills up, the pool allocates a new one.
Here is the related fragment of the official documentation:
> The `Buffer` module pre-allocates an internal `Buffer` instance of size `Buffer.poolSize` that is used as a pool for the fast allocation of new `Buffer` instances created using `Buffer.allocUnsafe()` and the deprecated `new Buffer(size)` constructor only when `size` is less than or equal to `Buffer.poolSize >> 1` (floor of `Buffer.poolSize` divided by two).
While this mechanism certainly improves performance, it doesn't implement actual pooling, i.e. reclamation and reuse of buffers. Thus it doesn't decrease the allocation rate in scenarios where `Buffer.allocUnsafe` is on the hot path (a common situation for network client libraries, e.g. DB drivers).
This issue aims to discuss the value of such a pooling mechanism and to provide a good starting point for initial design feedback.
There are two main options for pooling in Node.js:
- Provide an API for manual reference counting. This approach is unsafe and requires a lot of discipline from users, so it doesn't sound like a good candidate.
- Integrate with the GC for buffer reclamation. This approach is more complicated, as it depends on hooks into the GC lifecycle.
It appears that option 2 is feasible with experimental APIs (`FinalizationGroup` and, possibly, `WeakRef`s).
### The solution
The implementation idea was described by @addaleax (see #30611 (comment))
The `FinalizationGroup` API could be enough to implement a pool that would reuse internal buffers once all slices pointing at them have been GCed. When the pool uses an active internal buffer (let's call it a source buffer) to create slices, it could register those slices (as "the object whose lifetime we're concerned with") with their source buffer (as "holdings") in a `FinalizationGroup`. Once the finalizer function is called and reports that all slices have been GCed (this could be based on a simple counter), the pool could reclaim the corresponding source buffer and reuse it.
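A rough sketch of this scheme, written against `FinalizationRegistry` (the name `FinalizationGroup` eventually shipped under); the class name, the 64KB source size, and the counter layout are all illustrative assumptions, not part of any proposal:

```javascript
'use strict';

// Sketch: a pool that hands out slices of a large source buffer and
// reclaims the source buffer once the GC has collected every slice.
class ReusePool {
  constructor(sourceSize = 64 * 1024) {
    this.sourceSize = sourceSize;
    this.free = [];      // source buffers whose slices have all been GCed
    this.source = null;  // the active source buffer being sliced
    this.offset = 0;     // next free byte in the active source buffer
    this.state = null;   // shared "holdings" for slices of the active source
    this.registry = new FinalizationRegistry((state) => this.release(state));
  }

  // Drop one reference to a source buffer; once the pool and every slice
  // have released it, the buffer becomes reusable.
  release(state) {
    if (--state.count === 0) this.free.push(state.source);
  }

  // Allocations larger than sourceSize are out of scope for this sketch.
  alloc(size) {
    if (this.source === null || this.offset + size > this.sourceSize) {
      // Retire the old source buffer: give up the pool's own reference.
      if (this.state !== null) this.release(this.state);
      // Reuse a reclaimed source buffer if one is available.
      this.source = this.free.pop() ?? Buffer.allocUnsafeSlow(this.sourceSize);
      this.offset = 0;
      // count starts at 1 for the pool's own reference, so a source
      // buffer can never be reclaimed while it is still being sliced.
      this.state = { source: this.source, count: 1 };
    }
    const slice = this.source.subarray(this.offset, this.offset + size);
    this.offset += size;
    this.state.count++;
    // Register the slice so that its collection decrements the counter.
    this.registry.register(slice, this.state);
    return slice;
  }
}
```

The extra `count` of 1 held by the pool itself guards against the race where all outstanding slices are collected while the source buffer is still actively being sliced.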
There are some concerns with this approach.
First, this pooling mechanism may potentially lead to performance degradation under certain conditions. For large source buffers, the cost of bookkeeping all slices in the `FinalizationGroup` may be too high. For small source buffers (especially the 8KB source buffers that are the default in the standard API's global pool), it's probably not worth it at all. This could be mitigated by a hybrid approach: keep the current mechanism as the fallback for small allocations. So some PoC experiments and benchmarking have to be done.
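The hybrid fallback could look roughly like this; the threshold value and the `pool.alloc(size)` interface are assumptions made purely for illustration:

```javascript
'use strict';

// Placeholder cutoff: the right value must come from benchmarks,
// not from this sketch.
const FINALIZER_THRESHOLD = 4096;

// Assumption: `pool` is some GC-integrated pool exposing alloc(size).
function hybridAlloc(pool, size) {
  if (size < FINALIZER_THRESHOLD) {
    // Small allocations: keep the existing cheap slab mechanism, where
    // the bookkeeping of a finalizer-based pool wouldn't pay off.
    return Buffer.allocUnsafe(size);
  }
  // Larger allocations: worth tracking for reclamation and reuse.
  return pool.alloc(size);
}
```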
Another concern is that this is an experimental API, so I'm not sure whether `FinalizationGroup`s are available internally without the `--harmony-weak-refs` flag (or are planned to become available), as is done for the internal `WeakReference` API.
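Whatever the answer, user-land code built on this would want to feature-detect rather than assume the API exists; a minimal guard (with a hypothetical fallback) might look like:

```javascript
'use strict';

// Check both the shipped name (FinalizationRegistry) and the earlier
// proposal name (FinalizationGroup), since flag-gated V8 builds differ.
const hasFinalizers =
  typeof globalThis.FinalizationRegistry === 'function' ||
  typeof globalThis.FinalizationGroup === 'function';

function alloc(size) {
  // Without finalizer support, fall back to the existing behavior.
  if (!hasFinalizers) return Buffer.allocUnsafe(size);
  // ...otherwise route through the GC-integrated pool (omitted here)...
  return Buffer.allocUnsafe(size); // placeholder for the pooled path
}
```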
Proposed plan:
- Gather initial feedback
- Implement a PoC and benchmark it
- Discuss next steps (potentially, support in node core)
### Alternatives
Such a pool could certainly be implemented as an npm module. However, support in node core could bring the performance improvements to a much broader audience.