Description
openedon Jun 4, 2023
I'm a bit uneasy with the fact that it's allowed to create a Disposable without registering its dispose method to some kind of disposal scope, either with using
on the current scope, or add to some DisposableStack
.
In fact, the disposal registration is opt-in, not opt-out. This gives a Disposable object leaky behavior by default. It's like still doing malloc()
and free()
, but added opt-in support for scope based RAII. This is especially concerning given most of the resource management in JavaScript to date has been implicit by garbage collection. A Disposable object can leak even after it getting garbage collected.
function createHandle(): Disposable & Handle {
const { write, close } = nativeHandle();
return { write, [Symbol.dispose](): { close() } };
}
function main() {
const handle = createHandle();
handle.write(data);
// ops you forget to dispose handle
}
main();
// now it's leaking quietly
// any because handle is garbage collected you can't even find it in heap dump
It's worse than leaking - it's leaking quietly, and it can disappear from heap dump because the dispose method can be garbage collected. Diagnosing a leak like this would be a nightmare.
I know, before the proposal people have already been doing explicit resource management in JavaScript. But that's because it's the only way to do it. I would expect a language level improvement on resource management to do better than just providing a syntactic sugar without addressing the unsafe nature of explicit resource management.
I'm not saying I have a solution to the problem. But I'd like to at least start a conversation on possibilities.
My proposal have improved for several iterations thanks to in-depth discussion. To help people to quickly follow up with the most recent proposal of mine, please check this mega post.
Maybe it's better to not exposing the dispose method on any returned value. A dispose method must be registered to some DisposeScope
. While decoupling DisposeScope
from block scope, making it similar to Autorelease Pool Blocks in Objective-C. So we don't need explicit transfer of ownership when a function is using
some Disposable resource but wants to delegate its disposal to a parent DisposeScope
. Also a DisposeScope
can be stored and re-opened to support use cases like class.
So it would look something like this:
interface DisposeScope {
defer<T>((value: T) => Promise<void>|void): void;
}
interface DisposableFunction<T, Args extends any[]> {
(scope: DisposeScope, ...Args): T;
}
interface DisposableClass<T, Args extends any[]> {
new (scope: DisposeScope, ...Args): T;
}
function Allocate(scope: DisposeScope, length: number): Array {
const arr = new Array(length);
scope.defer(() => arr.splice(0, arr.length));
return arr;
}
function Bar() {
const bar = using Allocate(10); // using always bind to the nearest dispose scope above the call stack
}
async dispose { // create a DisposeScope at this stack level
const foo = using Allocate(5);
{
Bar();
}
} // await bar dispose(); awiat foo dispose();
const baz = using Allocate(5); // error: no enclosing DisposeScope
// For class
class Resource {
constructor(scope: DisposeScope) {
this.#scope = scope;
}
async run(): Promise<void> {
async dispose(this.#scope) { // reopen a DisposeScope
this.#arr = using Allocate(5);
} // disposal is being deferred to the owner of this.#scope
}
}
async dispose { // create a DisposeScope at this stack level
const resource = using new Resource();
} // await resource.#arr dispose();
Enclosing some code inside a DisposeScope
means you want the scope to collect all the resource dispose methods, and call them when the DisposeScope
is closed, and to make sure all dispose methods complete before executing the statements after the DisposeScope
block.
You can even take it further to not pass DisposeScope
as an argument. Instead making it an environment slot on the execution context. A defer
keyword can be added to register a disposal method to the enclosing DisposeScope
. However storing and re-opening a scope would require some changes to be supported.
So it can even look something like this:
function Allocate(length: number): Array {
const arr = new Array(length);
defer arr.splice(0, arr.length); // the expression is deferred for evaluation in the enclosing DisposeScope
async defer { // a block expression can be used for defer
await cleanup(); // await can be used after async defer
}; // async defer can be used in a sync function as long as the enclosing DisposeScope is async dispose
return arr;
}
function Bar() {
const bar = Allocate(10); // function call always inherits the nearest dispose scope above the call stack
}
async dispose { // create a DisposeScope at this stack level
const foo = Allocate(5);
{
Bar();
}
} // await bar dispose(); awiat foo dispose();
const baz = Allocate(5); // error: no enclosing DisposeScope
// For class
function EnclosingDisposeScope(): unique symbol {
// [native function]
// returns a special symbol uniquely identifies the current enclosing DisposeScope
}
class Task {
constructor() {
this.#scope = EnclosingDisposeScope();
}
run(): void {
dispose(this.#scope) { // reopen a DisposeScope identified by the symbol
this.#arr = Allocate(5);
} // disposal is being deferred to the scope identified by this.#scope
}
}
async dispose { // create a DisposeScope at this stack level
const task = new Task();
dispose { // a new DisposeScope at top of the stack
task.run();
}
} // await task.#arr dispose();
In this way there's no need to define a Disposable
type, and no need to use a DisposableStack
. Even the using
keyword can be omitted. A function created disposable resource no longer has to return a Disposable
type. Every function just return whatever value they produce, and defer
statement can be inserted at any location to dispose resource at the enclosing DisposeScope
.
In this way, a disposal cannot be scheduled with defer
if there's no enclosing DisposeScope
. And the only way to register disposal operation is to use defer
, and the dispose method is never exposed to the outside.
This would provide a safer default behavior when using Disposable resource. If you somehow "forgot" to create an new DisposeScope
, resource created directly or indirectly from your function won't leak silently - it either fails to defer
to an enclosing DisposeScope
, resulting an error being thrown, or it's guarenteed to be disposed when the enclosing DisposeScope
closed. If a resource is living longer than you expected, it's very likely being deferred to some long running DisposeScope
, and you can easily find both the DisposeScope
and the dispose method it referecnes in a heap dump. Diagnosing resource retaintion problems would be easier this way.
Extending the lifetime of a DisposeScope
Resource lifetime and execution context (functions, try-catch blocks etc) lifetime are loosely related. Resource references can be cloned, assigned, transferred, returned, down and up the call stack. You simply cannot confine a resouce in a fixed block of code enclosed by a pair of curly braces. Execution context on the other hand, is more ordered - it's always in a Last-In-First-Out stack. There are special behaviors like async/await, macrotasks, generator functions but if you generalize execution context enough, these can also follow the same rule.
A curly brace block syntax for a resource DisposeScope
made it easy to implicitly pass the scope down the execution stack, but it doesn't capture the dynamic nature of resource passing between different execution contexts, and in many cases, outside the curly brace block the DisposeScope
can enclose. When that happen, we will have use-after-free problems:
function createHandle(): Handle {
const { write, close } = nativeHandle();
defer close();
return { write };
}
dispose {
const handle = createHandle();
doSomething().then(() => {
handle.write(data); // error: handle already disposed
}); // this returned synchronously while the callback is scheduled on an async context
}
Note that DisposableStack
also has this problem:
function createHandle(): Disposable & Handle {
const { write, close } = nativeHandle();
return { write, [Symbol.dispose](): { close() } };
}
function main() {
using handle = createHandle();
doSomething().then(() => {
handle.write(data); // error: handle already disposed
}); // this returned synchronously while the callback is scheduled on an async context
}
However, if we decouple the DisposeScope
from execution block scope, we can define a different semantic for it. Just look at the following code and it would be very intuitive if it can just work:
function createHandle(): Handle {
const { write, close } = nativeHandle();
defer close();
return { write };
}
dispose { // spoiler: this should be an async dispose
const scope = EnclosingDisposeScope();
const handle = createHandle();
doSomething().then(() => {
dispose(scope) { // re-open a captured scope on an async context
handle.write(data);
}
});
}
An immediate observation tell us, we can make it work if we add the following rule for a DisposeScope
disposal: Only dispose a DisposeScope
when all the references to that scope were gone (unreachable / garbage collected).
We can manually analyze the bahavior by denoting reference counts of the scope.
dispose { // open the scope, refcount++: 1
const scope = EnclosingDisposeScope(); // refcount++: 2
const handle = createHandle();
doSomething().then(/* ... */); // captures scope, refcount++: 3
// const scope gone, refcount--: 2
} // close the scope, refcount--: 1
// on the async execution context:
() => {
dispose(scope) { // re-open the scope, refcount++: 2
handle.write(data);
} // close the scope, refcount--: 1
} // captured scope gone, refcount--: 0, now we can dispose scope
We can see that after reaching the end of the dispose block that created the DisposeScope
, there is still reference to the scope. So the dispose doesn't happen. It's until the scheduled async context finished, and the reference count of the scope dropped to 0, we can finally dispose the scope.
This however, still has a problem. The DisposeScope
we created here is an synchronous scope. But the refcount of the scope only reach to 0 on a different async execution context. This means we should use an async dispose
scope instead, even the deferred function is synchronous:
function createHandle(): Handle {
const { write, close } = nativeHandle();
defer close();
return { write };
}
async dispose {
const scope = EnclosingDisposeScope();
const handle = createHandle();
doSomething().then(() => {
async dispose(scope) { // re-open the captured scope on an async context
handle.write(data);
}
});
}
We can write self managed resources:
function createHandle(): Handle {
const scope = EnclosingDisposeScope();
const { write, close } = nativeHandle();
defer close();
return { write, scope }; // extending the scope's lifetime to the lifetime of the returned object
}
async dispose {
const handle = createHandle();
doSomething().then(() => {
// no need to capture the scope explicitly, handle will hold a reference to the scope already
handle.write(data);
});
}
// For class
class Task {
constructor() {
this.#scope = EnclosingDisposeScope(); // extending the scope's lifetime to the lifetime of the returned object
}
run(): void {
dispose(this.#scope) {
this.#arr = Allocate(5);
}
}
}
async dispose {
const task = new Task();
task.run();
}
We can even keep improving this by integrating async context tracking mechanism like Node.js AsyncLocalStorage
/ AsyncResource
or the AsyncContext
proposal. So the DisposeScope
itself is an AsyncResource
.
There are however some performance implications if we binds the lifetime of an async DisposeScope
to its references. It means the dispose can only happen after a GC cycle. But personally I think the benifits are worth the tradeoff. We can achieve almost implicit usage of managed resource this way.