Disposable is leaky by default

I'm a bit uneasy with the fact that it's allowed to create a Disposable without registering its dispose method to some kind of disposal scope, either with `using` on the current scope, or add to some `DisposableStack`.

In fact, the disposal registration is opt-in, not opt-out. This gives a Disposable object leaky behavior by default. It's like still doing `malloc()` and `free()`, but added opt-in support for scope based RAII. This is especially concerning given most of the resource management in JavaScript to date has been implicit by garbage collection. **A Disposable object can leak even after it getting garbage collected.**

```js
function createHandle(): Disposable & Handle {
    const { write, close } = nativeHandle();
    return { write, [Symbol.dispose](): { close() } };
}

function main() {
    const handle = createHandle();
    handle.write(data);
    // ops you forget to dispose handle
}

main();
// now it's leaking quietly
// any because handle is garbage collected you can't even find it in heap dump
```

It's worse than leaking - it's leaking **quietly**, and it can **disappear from heap dump** because the dispose method can be garbage collected. Diagnosing a leak like this would be a nightmare.

I know, before the proposal people have already been doing explicit resource management in JavaScript. But that's because it's the only way to do it. I would expect a language level improvement on resource management to do better than just providing a syntactic sugar without addressing the unsafe nature of explicit resource management.

I'm not saying I have a solution to the problem. But I'd like to at least start a conversation on possibilities.

> <img width="100%" src="https://raw.githubusercontent.com/ken-okabe/web-images/main/note.svg">
>
> My proposal have improved for several iterations thanks to in-depth discussion. To help people to quickly follow up with the most recent proposal of mine, [please check this mega post](https://github.com/tc39/proposal-explicit-resource-management/issues/159#issuecomment-1620895118).

Maybe it's better to not exposing the dispose method on any returned value. A dispose method must be registered to some `DisposeScope`. While decoupling `DisposeScope` from block scope, making it similar to Autorelease Pool Blocks in Objective-C. So we don't need explicit transfer of ownership when a function is `using` some Disposable resource but wants to delegate its disposal to a parent `DisposeScope`. Also a `DisposeScope` can be stored and re-opened to support use cases like class.

So it would look something like this:

```ts
interface DisposeScope {
    defer<T>((value: T) => Promise<void>|void): void;
}

interface DisposableFunction<T, Args extends any[]> {
    (scope: DisposeScope, ...Args): T;
}

interface DisposableClass<T, Args extends any[]> {
    new (scope: DisposeScope, ...Args): T;
}

function Allocate(scope: DisposeScope, length: number): Array {
    const arr = new Array(length);
    scope.defer(() => arr.splice(0, arr.length));
    return arr;
}

function Bar() {
    const bar = using Allocate(10); // using always bind to the nearest dispose scope above the call stack
}

async dispose { // create a DisposeScope at this stack level
    const foo = using Allocate(5);
    {
        Bar();
    }
} // await bar dispose(); awiat foo dispose();

const baz = using Allocate(5); // error: no enclosing DisposeScope


// For class

class Resource {
    constructor(scope: DisposeScope) {
        this.#scope = scope;
    }
    async run(): Promise<void> {
        async dispose(this.#scope) { // reopen a DisposeScope
            this.#arr = using Allocate(5);
        } // disposal is being deferred to the owner of this.#scope
    }
}

async dispose { // create a DisposeScope at this stack level
    const resource = using new Resource();
} // await resource.#arr dispose();
```

Enclosing some code inside a `DisposeScope` means you want the scope to collect all the resource dispose methods, and call them when the `DisposeScope` is closed, and to make sure all dispose methods complete before executing the statements after the `DisposeScope` block.

You can even take it further to not pass `DisposeScope` as an argument. Instead making it an environment slot on the execution context. A `defer` keyword can be added to register a disposal method to the enclosing `DisposeScope`. However storing and re-opening a scope would require some changes to be supported.

So it can even look something like this:

```ts
function Allocate(length: number): Array {
    const arr = new Array(length);
    defer arr.splice(0, arr.length); // the expression is deferred for evaluation in the enclosing DisposeScope
    async defer { // a block expression can be used for defer
        await cleanup(); // await can be used after async defer
    }; // async defer can be used in a sync function as long as the enclosing DisposeScope is async dispose
    return arr;
}

function Bar() {
    const bar = Allocate(10); // function call always inherits the nearest dispose scope above the call stack
}

async dispose { // create a DisposeScope at this stack level
    const foo = Allocate(5);
    {
        Bar();
    }
} // await bar dispose(); awiat foo dispose();

const baz = Allocate(5); // error: no enclosing DisposeScope


// For class

function EnclosingDisposeScope(): unique symbol {
    // [native function]
    // returns a special symbol uniquely identifies the current enclosing DisposeScope
}

class Task {
    constructor() {
        this.#scope = EnclosingDisposeScope();
    }
    run(): void {
        dispose(this.#scope) { // reopen a DisposeScope identified by the symbol
            this.#arr = Allocate(5);
        } // disposal is being deferred to the scope identified by this.#scope
    }
}

async dispose { // create a DisposeScope at this stack level
    const task = new Task();
    dispose { // a new DisposeScope at top of the stack
        task.run();
    }
} // await task.#arr dispose();
```

In this way there's no need to define a `Disposable` type, and no need to use a `DisposableStack`. Even the `using` keyword can be omitted. A function created disposable resource no longer has to return a `Disposable` type. Every function just return whatever value they produce, and `defer` statement can be inserted at any location to dispose resource at the enclosing `DisposeScope`.

In this way, a disposal cannot be scheduled with `defer` if there's no enclosing `DisposeScope`. And the only way to register disposal operation is to use `defer`, and the dispose method is never exposed to the outside.

This would provide a safer default behavior when using Disposable resource. If you somehow "forgot" to create an new `DisposeScope`, resource created directly or indirectly from your function won't leak silently - it either fails to `defer` to an enclosing `DisposeScope`, resulting an error being thrown, or it's guarenteed to be disposed when the enclosing `DisposeScope` closed. If a resource is living longer than you expected, it's very likely being deferred to some long running `DisposeScope`, and you can easily find both the `DisposeScope` and the dispose method it referecnes in a heap dump. Diagnosing resource retaintion problems would be easier this way.

### Extending the lifetime of a `DisposeScope`

Resource lifetime and execution context (functions, try-catch blocks etc) lifetime are loosely related. Resource references can be cloned, assigned, transferred, returned, down and up the call stack. You simply cannot confine a resouce in a fixed block of code enclosed by a pair of curly braces. Execution context on the other hand, is more ordered - it's always in a Last-In-First-Out stack. There are special behaviors like async/await, macrotasks, generator functions but if you generalize execution context enough, these can also follow the same rule.

A curly brace block syntax for a resource `DisposeScope` made it easy to implicitly pass the scope down the execution stack, but it doesn't capture the dynamic nature of resource passing between different execution contexts, and in many cases, outside the curly brace block the `DisposeScope` can enclose. When that happen, we will have use-after-free problems:

```ts
function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
dispose {
    const handle = createHandle();
    doSomething().then(() => {
        handle.write(data); // error: handle already disposed
    }); // this returned synchronously while the callback is scheduled on an async context
}
```

Note that `DisposableStack` also has this problem:

```ts
function createHandle(): Disposable & Handle {
    const { write, close } = nativeHandle();
    return { write, [Symbol.dispose](): { close() } };
}
function main() {
    using handle = createHandle();
    doSomething().then(() => {
        handle.write(data); // error: handle already disposed
    }); // this returned synchronously while the callback is scheduled on an async context
}
```

However, if we decouple the `DisposeScope` from execution block scope, we can define a different semantic for it. Just look at the following code and it would be very intuitive if it can just work:

```ts
function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
dispose { // spoiler: this should be an async dispose
    const scope = EnclosingDisposeScope();
    const handle = createHandle();
    doSomething().then(() => {
        dispose(scope) { // re-open a captured scope on an async context
            handle.write(data);
        }
    });
}
```

An immediate observation tell us, we can make it work if we add the following rule for a `DisposeScope` disposal: **Only dispose a `DisposeScope` when all the references to that scope were gone (unreachable / garbage collected).**

We can manually analyze the bahavior by denoting reference counts of the scope.

```ts
dispose { // open the scope, refcount++: 1
    const scope = EnclosingDisposeScope(); // refcount++: 2
    const handle = createHandle();
    doSomething().then(/* ... */); // captures scope, refcount++: 3
    // const scope gone, refcount--: 2
} // close the scope, refcount--: 1

// on the async execution context:
() => {
    dispose(scope) { // re-open the scope, refcount++: 2
        handle.write(data);
    } // close the scope, refcount--: 1
} // captured scope gone, refcount--: 0, now we can dispose scope
```

We can see that after reaching the end of the dispose block that created the `DisposeScope`, there is still reference to the scope. So the dispose doesn't happen. It's until the scheduled async context finished, and the reference count of the scope dropped to 0, we can finally dispose the scope.

This however, still has a problem. The `DisposeScope` we created here is an synchronous scope. But the refcount of the scope only reach to 0 on a different async execution context. This means we should use an `async dispose` scope instead, even the deferred function is synchronous:

```ts
function createHandle(): Handle {
    const { write, close } = nativeHandle();
    defer close();
    return { write };
}
async dispose {
    const scope = EnclosingDisposeScope();
    const handle = createHandle();
    doSomething().then(() => {
        async dispose(scope) { // re-open the captured scope on an async context
            handle.write(data);
        }
    });
}
```

We can write self managed resources:

```ts
function createHandle(): Handle {
    const scope = EnclosingDisposeScope();
    const { write, close } = nativeHandle();
    defer close();
    return { write, scope }; // extending the scope's lifetime to the lifetime of the returned object
}
async dispose {
    const handle = createHandle();
    doSomething().then(() => {
        // no need to capture the scope explicitly, handle will hold a reference to the scope already
        handle.write(data);
    });
}

// For class

class Task {
    constructor() {
        this.#scope = EnclosingDisposeScope(); // extending the scope's lifetime to the lifetime of the returned object
    }
    run(): void {
        dispose(this.#scope) {
            this.#arr = Allocate(5);
        }
    }
}
async dispose {
    const task = new Task();
    task.run();
}
```

We can even keep improving this by integrating async context tracking mechanism like [Node.js `AsyncLocalStorage` / `AsyncResource`](https://nodejs.org/api/async_context.html#asyncresourceemitdestroy) or the [`AsyncContext` proposal](https://github.com/tc39/proposal-async-context). So the `DisposeScope` itself is an `AsyncResource`.

There are however some performance implications if we binds the lifetime of an async `DisposeScope` to its references. It means the dispose can only happen after a GC cycle. But personally I think the benifits are worth the tradeoff. We can achieve almost implicit usage of managed resource this way.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disposable is leaky by default #159

rixtox
openedon Jun 4, 2023

Extending the lifetime of a `DisposeScope`

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Disposable is leaky by default #159

Description

rixtoxopenedon Jun 4, 2023

Extending the lifetime of a DisposeScope

Metadata