-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Long running synchronous iterators block the event loop when fed into Readable.from() #41821
Comments
I just found https://nodejs.org/en/docs/guides/dont-block-the-event-loop/ (linked from goldbergyoni/nodebestpractices#294) which more makes me think it's a bug. But I'm not clear what is a good fix? Each loop of the generator is a very small task, it's just lot of them. |
You are performing work that is not I/O - that defers at most a microtask. Stuff that doesn't do I/O will always happen before stuff that does - this is by design. You don't need const r = new Readable({ read() { this.push(1); }, objectMode: true });
r.on('data', (chunk) => console.log('got data', chunk));
setInterval(() => console.log('interval'));``` |
In this example yes, but I'm talking about a general case.
I understand that. It's how the event loop works. It doesn't change that this is possible with What I think is needed is either a change to the documentation (to say, don't use If you need a more conceptual case, look at this:
or anything similar. Is this potentially the wrong way to do stuff? Is it dumb? Possibly yes to both! Should something tell you not to, or make it possible to work around? I think so. |
(assuming I am reading this issue correctly) This surprises me too. I guess I would've expected that sync iterators are wrapped in async iterators, like for-await. |
@devsnek even if the sync iterator is wrapped in an async iterator - it would still defer at most microtasks since it doesn't do I/O - like a function that does |
Well, If you do long-running work without yielding to I/O - you are not yielding to I/O. Whether or not that is appropriate to your use case that is entirely up to you. You can easily yield to the event loop by doing something like: from(async function*() {
let i = 0;
for (const item of getMySyncGenerator()) {
yield item;
if (i++ % 20 === 0) await setImmediate(); // yield to the event loop every 20 items;
}
}); Or something similar - this isn't unique to |
@benjamingr I did try something similar and I didn't get it to yield, but this looks like it might work. If it doesn't I'll update you. Really, the reason this is surprising is that a lot of other streams stuff seems to pass back to the event loop because V8 knows to pass it to an underlying C++ library. I not saying
It's too early to be sure those words even make sense, but that sort of thing would have prevent my confusion at least 😅 Also, if it's as easy as your snippet makes out it would be trivial to implement directly |
Well, a lot of streams do perform I/O so they yield back to the event loop. If you are doing work that explicitly does not do I/O (like just There is not a single place streams "pass back to the event loop" - it is always the actual source of the stream (or the destination) where the I/O happens - the stream just pauses waiting for the read/write to complete. Note streams aren't special in this regard - this is true for other things with microtick semantics (like if you nest I am happy with a doc change as a resolution to this but I am not sure where it would go since this behavior isn't specific to streams and is omnipresent in Node APIs. |
Yeah, that's why I said seems to. Adding it to |
Here's the code that got this working: function setImmediatePromise() {
return new Promise(resolve => setImmediate(resolve));
}
const iterStream = Readable.from(async function* () {
let i = 0
for await (const item of baseGenerator) {
yield item;
i++;
if (i % 1e5 === 0) await setImmediatePromise();
}
}()); Based on comments here and this snyk.io article |
Sure though "working" might be a misnomer since the fact that microtasks (promises/queueMicrotask/nextTick etc aka jobs) never yield to I/O is quite intentional and is a strong guarantee on the order of events (you never process another I/O event before you process all microticks - also true in browsers). Old versions of node used to warn on nextTick nesting but that was removed since there are many places that cause microticks now. As a side note, you can simply your code to: import { setImmediate } from 'timers/promises'; // that's built in now;
const iterStream = Readable.from(baseGenerator)
.asIndexedPairs()
.map(async ([i, x]) => {
if(i % 1e5 === 0) await setImmediate();
return x;
}); |
@nodejs/documentation the ask here is to more explicitly state that if you perform actions that don't perform I/O (like promise |
I think if you make a conscious choice to yield to the event loop, when the code you wrote does that, you should be ok to say it's working without it being a misnomer. On the plus side I very much like your updated version. |
Happy you like it nodejs/nodejs.org#4404 |
Version
v16.13.0
Platform
Linux WIN-******* 4.4.0-19041-Microsoft #1237-Microsoft Sat Sep 11 14:32:00 PST 2021 x86_64 x86_64 x86_64 GNU/Linux
Subsystem
streams
What steps will reproduce the bug?
Seen here: https://stackoverflow.com/q/70915042/1075247
Run the below code (with an appropriate
subPerm
function - something that is synchronous generator and long running).Note that the logs inside the
on('data', ...
are printed (i.e. the ones prefaced with 'manual time'), but the ones from thesetInterval
are not.How often does it reproduce? Is there a required condition?
This occurs every time
What is the expected behavior?
I would expect the loop that processes the generator (
Readable.from
) to give way to the event loop at some point, but it does notWhat do you see instead?
No event loop processing happens at all. It doesn't even seem to print something at the end of the
Additional information
Batching seems like the sort of thing that would help here, but I've seen the code for
Readable.from
and I couldn't see a way to easily modify it to include that.#34207 looks highly related, but again even under that proposal I'm not clear if you'd want to wrap one in the other, or use batching 'under the hood'.
Also, I'm aware this is labled as a bug, even though the behaviour is quite explainable. I'd argue it's very unintuitive and completely undocumented that the event loop is essentially paused under these circumstances (although I admit long running synchronous generators used where it matters might be rare in the wild) - and potentially harmful if any authentication timeouts are relying on
setTimeout
etc.The text was updated successfully, but these errors were encountered: