-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: fix worker send error #20973
test: fix worker send error #20973
Conversation
From the CI:
|
2658ac4
to
6229b90
Compare
for what this PR was raised, the same test failed! good thing is that it precisely showed that fix did not work, the best CI run I have ever seen! Looking at the failure and associated code, I see what is happening: node/lib/internal/child_process.js Lines 628 to 632 in 4f28015
the error in the child child_process.send is either emitted to its receiver and eventually to the worker object, if there is no callback supplied to send. if callback is supplied, error is never emitted, instead passed to the callback. So I should apply the error filter in the callback, not in the error handler. |
pushed commit to that effect and squashed with the last one, but noticed that the head commit has got associated with this one! how is that possible? what mess up I would have made? |
6229b90
to
fb5ac32
Compare
ok, forget the last comment - I actually |
gist of CI:
windows:
first one is unrelated flake. the other failure (parallel/test-child-process-fork-net) fails with the same symptom as that of this test, and require the same patch. I suggested @mmarchini to address that based on the result of this. In short, this PR is stable, and CI failures are unrelated. |
@nodejs/testing @nodejs/child_process |
err.code !== 'EMFILE' | ||
) { | ||
throw err; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am not mistaken the test actually does not work in case one of these errors pops up. So instead of ignoring the error, I would like to skip the test with a message. That way there is at least some kind of notification that the test actually did not work as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BridgeAR : the sole purpose of the test is to make sure the channel
field of a closed worker process is propery nullified (ref: #2847).
This is asserted after the first send causes the worker to exit and subsequently called back in the worker's closure callback
The race comes after: the second request can fail through a variety of reasons based on the state of the worker and the server, and their failure or success is immaterial to the test.
Hope this clarifies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the error does not have any influence on the actual thing we want to test: great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I think these should be written as assertions, rather than if
statements with throw
s.
thanks @cjihrig . Given that:
I think it is worthwhile to go as is (fixing the issue at hand) and look at improvements later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM -- I'm personally in favor of getting this landed sooner rather than later since this fixes Windows CI breakage
@cjihrig - I know you approved it, but your suggestion with a on a separate note: can anyone tell me the discrete steps for landing self PRs? I know the general landing procedure, but am looking for the |
Try this: https://github.com/nodejs/node-core-utils/blob/master/docs/git-node.md it makes everything supereasy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
BTW, I ran a node-stress-single-test yesterday, and there were no failures out of 9999 runs :)
@bzoz The tool does not solve the merge status issue (but I guess if you reset your PR branch to the HEAD that you are about to push to master and force push to update the PR branch, then GitHub would recognize the PR has been merged?) |
In test-child-process-fork-closed-channel-segfault.js, race condition is observed between the server getting closed and the worker sending a message. Accommodate the potential errors. Earlier, the same race was observed between the client and server and was addressed through ignoring the relevant errors through error handler. The same mechanism is re-used for worker too. The only difference is that the filter is applied at the callback instead of at the worker's error listener. Refs: #3635 (comment) Fixes: #20836 PR-URL: #20973 Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Jon Moss <me@jonathanmoss.me> Reviewed-By: Matheus Marchini <matheus@sthima.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com> Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
fb5ac32
to
397eceb
Compare
Landed in 397eceb |
Patch inspired on 397eceb to fix flakyness on test-child-process-fork-net. Ref: nodejs#20973
`flaky-test-child-process-fork-net` has been failing constantly for the past few days, and all solutions suggestes so far were didn't work. Marking it as faky while the issue is not fixed. Ref: nodejs#21012 Ref: nodejs#20973 Ref: nodejs#20973
`flaky-test-child-process-fork-net` has been failing constantly for the past few days, and all solutions suggestes so far were didn't work. Marking it as faky while the issue is not fixed. Ref: #21012 Ref: #20973 Ref: #20973 PR-URL: #21018 Refs: #21012 Refs: #20973 Refs: #20973 Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: Jon Moss <me@jonathanmoss.me> Reviewed-By: Anatoli Papirovski <apapirovski@mac.com> Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
In test-child-process-fork-closed-channel-segfault.js, race condition is observed between the server getting closed and the worker sending a message. Accommodate the potential errors. Earlier, the same race was observed between the client and server and was addressed through ignoring the relevant errors through error handler. The same mechanism is re-used for worker too. The only difference is that the filter is applied at the callback instead of at the worker's error listener. Refs: #3635 (comment) Fixes: #20836 PR-URL: #20973 Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Jon Moss <me@jonathanmoss.me> Reviewed-By: Matheus Marchini <matheus@sthima.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com> Reviewed-By: Bartosz Sosnowski <bartosz@janeasystems.com>
`flaky-test-child-process-fork-net` has been failing constantly for the past few days, and all solutions suggestes so far were didn't work. Marking it as faky while the issue is not fixed. Ref: #21012 Ref: #20973 Ref: #20973 PR-URL: #21018 Refs: #21012 Refs: #20973 Refs: #20973 Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: Jon Moss <me@jonathanmoss.me> Reviewed-By: Anatoli Papirovski <apapirovski@mac.com> Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
In test-child-process-fork-closed-channel-segfault.js, race condition
is observed between the server getting closed and the worker sending
a message. Accommodate the potential errors.
Earlier, the same race was observed between the client and server
and was addressed through ignoring the relevant errors through error
handler. The same mechanism is re-used for worker too.
Refs: #3635 (comment)
Fixes: #20836
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes