Description
The current design of napi::AsyncWorker has a number of defects and limitations with it's current design
Issues
Unsafe Self-Destructing Behavior
Managing object lifetimes across multiple threads is a challenging problem. In the case of AsyncWorker, the caller that instantiates a new AsyncWorker instance has no control over when the work will be completed. To work around this problem, the AsyncWorker class currently has a "self-destructing" behavior - the last step of the OnWorkComplete callback function causes the object to delete itself. (delete self;
). This behavior is unsafe and causes a number of problems:
-
The AsyncWorker class does not know how the AsyncWorker memory was allocated. Imagine the following code as an example:
void runWorkOnThread() { AsyncWorkerSubclass worker; worker.Queue(); }
In this case,new
was never called and the AsyncWorker (unfortunately) lives on the stack. -
the work_complete callback executes asynchronously from the caller's lifetime of the object. If the caller holds a reference to the AsyncWorker and later calls Cancel() asynchronously from the main node.js event loop, a race condition will occur. Cancellation is always an asynchronous operation and must not fail in a dangerous manner when called after execution of the asynchronous work has already completed. Imagine this scenario:
- The AsyncWorker is instantiated and queued, a reference to the AsyncWorker pointer is retained.
- Call flow returns out of the addon and back to node.js.
- Node.js executes other tasks on the event loop, one of which is the work_complete callback.
- The AsyncWorker is deleted.
- User code is executed on the Node.js event loop that calls an operation on the addon that holds the reference to the AsyncWorker and calls Cancel.
- Invalid memory is referenced - boom.
Hard-coded Javascript Function Callbacks
The AsyncWorker currently takes a callback napi::Function and optional napi::Object receiver as required parameters. The intent behind the callback function is to provide an easy mechanism to automatically call a user-defined callback function outside of the addon, but this makes an incorrect assumption that the implementing addon will use a callback design pattern.
Consider a "legacy-free" addon that wishes to return a Promise for direct use with async and await. The addon would need to implement a Javascript function as a callback, wrap C++ code inside the Javascript Function that marshals their return value and/or errors, then have that C++ code signal a deferred.
The requirement for a "Javascript round-trip" to execute C++ code or implement Promises is a major headache.
No Suppose for Promises
As discussed above, there is no way to directly bridge an AsyncWorker to a Promise.
No Support for Error objects
The AsyncWorker currently only supports Javascript Error objects instantiated with a message; there is no way to return a "richer" Error object or different error type from within an AsyncWorker. In my own case, I want to return additional, important error information returned from the proprietary encryption engine I'm wrapping - there's no way to do this presently, even from within OnOK() or OnError(), which execute on the main Node.js event loop.
No Return Values
There is no mechanism to pass a return value out of the AsyncWorker, whether to the callback function or a Promise.
Unsafe Access to Napi::Env
The class contains a public reference to a napi_env retrieved from the callback function passed to the constructor. While it's ultimately up to implementers to write their code correctly, the public Env() function is a dangerous temptation - if used from within the Execute() function, multiple threads will be unsafely using the same v8 Isolate.
Resolution
I've been exploring a number of potential solutions to the problems listed above, but I haven't solved all of the issues in a single solution I can pitch to the node-addon-api community. The intent behind this issue is to create a single place to list all of the potential problems with the current type and to keep track of the various design options and discussion for resolving the defects and limitations.
Potential ideas I've been exploring include:
- Removing the Env() function from the AsyncWorker type and modifying OnOK() and OnError() to take a Napi::Env as a parameter instead. .
- Changing
void Queue()
tostatic void Queue(AsyncWorker* worker);
to force it to be a pointer. - Removing the self-deleting behavior in favor of the caller managing lifetime. The idea would be to have a higher level class/construct that addon implementers would use, while the AsyncWorker would mainly be used internally for management of the napi_async_work resource.
- Remove the callback function from the type and separate out any callback and/or promise behavior to subclasses.
- Have Queue() return an object with a cancellation token (Cancel function) and methods to bind to a Promise or a Callback function.
- Replace OnError() and OnOK() with a
virtual napi_value OnComplete(Napi::Env env)
that could return a value back from the asyncronous operation. Note this still has the problem of getting the value or error safely out of the execute function. - Add reference counting internally to avoid thread-lifetime issues with napi_async_work.
- Add a higher level construct that allows the use of async work without subclassing. For example, something that worked similarly to std::async(); something like a Napi::WorkerPool::Run(Callable) that could take a lambda function, regular function, or callable type.
- Exploring options of bridging C++'s async types (std:future, std::promise, etc) with AsyncWorker. Probably not feasible, but newer versions of C++ are adding support for things like Future.then().
No slam-dunk options yet - just quite a few ugly trade-offs. The idea of a higher level type that takes a Callable (lambda/etc) is very tempting, but the only clean way to do that and still keep our sanity is to use std::invoke. That creates a dependency on C++17 - in the case of Windows, Visual Studio 2015 or so.