Cannot properly close/cleanup character device ReadStream (process hangs) #15439
Some additional information in case somebody else is having similar problems. I'm fairly confident that this happens because the data event handler remains in the node event loop (even after calling
Output from a simple test using Why Is Node Running?:
Potential workaround
If an event is sent to the stream after close/destroy has been called, the process will end successfully without any errors. Thus, if you are able to simultaneously read and write from/to your stream source, you could ensure shutdown by artificially injecting some event in the
A simple way to test the above is to use your keyboard or mouse as an input source, e.g.
Unable to reproduce this on macOS, or with some other character devices on Alpine Linux, so it looks platform-specific. I can't access the said device on my Linux box, as it looks like that needs root permission. Can you run with
Certainly. Here's a log as requested, starting just where we do the
Thanks for looking into this! |
Thanks @lfk for the data, and sorry! I actually missed your updates among the GitHub notification flood.
Minimal reproduce:
$ cat 15439.js
const fs = require('fs')
const stream = fs.createReadStream('/dev/input/event0')
setTimeout(() => {
  stream.close()
}, process.argv[2] / 1)
I have tried with
My theory: the file system work is carried out by an internal worker thread, which has initiated a blocking read from the device. Since the data never comes, the main thread is unable to move ahead: it is held up polling for an async event that has to come from that worker. If I reduce the timeout I observe a race condition: the process sometimes exits, otherwise it hangs. I believe the race is between the read thread and the close thread. If the close happens first, the read is abandoned. If the read is initiated first, nothing else matters.
/cc @nodejs/fs @nodejs/libuv
A C reproduce; I wasn't successful with a libuv recreate.
$ cat 15439.c
#include <signal.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

int fd;

/* Reader thread: issues blocking reads on the shared fd. */
void *worker(void *data) {
  char buf[65536];
  read(fd, buf, 65536);
  read(fd, buf, 65536);
  return NULL;
}

int main() {
  pthread_t tid;
  fd = open("/dev/random", O_RDONLY);
  pthread_create(&tid, NULL, worker, NULL);
  sleep(1);
  /* Close the fd while the worker may still be inside read(). */
  close(fd);
  /* Wait for the worker to return from its reads. */
  pthread_join(tid, NULL);
  return 0;
}
The behavior is obvious, as the reader thread has entered the kernel and is oblivious to the closing of fd in the main thread.
Fixing this at its core will be a bit involved, as it would need detecting such cases and interrupting worker threads. Given that it is a corner case, I recommend documenting this as a limitation.
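Since a blocked worker thread cannot easily be interrupted from inside the process, one alternative (not proposed by anyone in this thread, only a sketch) is to move the blocking read into a separate process that can be signalled. The helper name `readViaChild` is hypothetical, and the usage assumes a `cat` binary on PATH and read permission on the device. Whether SIGTERM reliably interrupts the read depends on the device driver, so treat this as something to try rather than a guaranteed fix.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper (not a Node.js API): run `command args...` and forward
// its stdout chunks to onChunk. The blocking read happens in the child, so the
// parent's thread pool never gets stuck on the device.
function readViaChild(command, args, onChunk) {
  const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'inherit'] });
  child.stdout.on('data', onChunk);
  // A failed spawn (e.g. command not found) is reported via 'error'.
  child.on('error', (err) => console.error('reader failed:', err.message));
  return {
    child,
    // Stop forwarding data and ask the child to terminate.
    stop() {
      child.stdout.removeListener('data', onChunk);
      return child.kill('SIGTERM');
    },
  };
}

// Usage (assumption: `cat` exists and the device is readable):
// const reader = readViaChild('cat', ['/dev/input/event0'],
//   (buf) => console.log('got', buf.length, 'bytes'));
// setTimeout(() => reader.stop(), 1000);
```

The parent only ever reads from a pipe, which Node polls without a blocking thread-pool call, so once the child is gone nothing should be left to keep the parent's event loop alive.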
This is not something libuv would (or could) fix but documenting it is a good way forward. Labels updated. Aside: |
Thank you again @gireeshpunathil, really appreciate that you took the time to look into this and provide such a detailed breakdown. Just for the sake of completeness, in my case the |
@lfk - sure. Do you want to attempt a documentation change to this effect and make it a PR? I guess you can best explain the limitation being the user who experienced it. Let me know! |
character device streaming works just like any other streams, but hangs on the close callsite due to the worker thread being blocked on the read and main thread waiting for any async event that may not occur. Document this behavior and suggest a potential workaround. Fixes: nodejs#15439
character device streaming works just like any other streams, but hangs on the close callsite due to the worker thread being blocked on the read and main thread waiting for any async event that may not occur. Document this behavior and suggest a potential workaround. Fixes: #15439 PR-URL: #21212 Reviewed-By: Vse Mozhet Byt <vsemozhetbyt@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
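For readers who have not seen the documentation change, the workaround it suggests is, as far as I recall, along these lines: after closing, mark the stream as ended by hand so that it can finish closing. The wrapper name `closeAndMarkEnded` is hypothetical; only `close()`, `push(null)` and `read(0)` are real stream methods. This does not cancel a read that is already blocked in the thread pool, so the process may still fail to exit until that read returns.

```javascript
// Hypothetical wrapper (not a Node.js API) around the close-then-mark-ended idea.
function closeAndMarkEnded(stream) {
  // Request the close; on a blocking character device this alone may not finish.
  stream.close();
  // Mark end-of-stream artificially, as if the device had reported EOF itself.
  stream.push(null);
  // Zero-length read to nudge the stream into processing the end marker.
  stream.read(0);
}

// Usage (assumption: the device path exists and is readable):
// const fs = require('fs');
// const stream = fs.createReadStream('/dev/input/event0');
// setTimeout(() => closeAndMarkEnded(stream), 100);
```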
The workaround in the docs did not help me, but I managed to get it to work with:
Issue observed on two distinct systems:
System 1. Linux laptop
System 2. Beaglebone Black
I'm using fs.ReadStream to continuously read from a character device, e.g. /dev/input/event0. Receiving events works great, but on attempting to shut down, the process does not exit as expected. Googling has turned up similar-sounding issues all the way back to node-0.x, but I have been unable to find a solution.
Code to reproduce (behaves the same on both aforementioned systems):
Output:
A few interesting (?) observations:
Explicitly obtaining the file descriptor, we get identical results, unless the .on('data', ...) listener is not attached, in which case we can exit as expected. Removing the listener in the shutdown process does not work, however, nor do we get this behavior if the fd is opened by fs.createReadStream (i.e. simply commenting out the .on('data', ...) from the first sample). Adjusted sample:
This is probably not related to the problem, but I get the same result if I "trigger" an end-event by doing the following in my shutdown routine as such:
Is this a bug, or am I simply missing something? Cheers!