Investigate flaky test-dns #5554
Another failure:
And another:
It appears that this test is flaky due to timing out when the callback never fires on one of the two test cases:
I'm not sure what can be done (from within Node) to make sure those Anyway, if anyone has ideas on how to make those tests more robust, I'd love to hear them. /cc @nodejs/testing @nodejs/build
Oh, I see now that I had it backwards in my mind: OK, so I'm thinking there might be no foolproof way around this problem and maybe these test cases should be moved from
Maybe we can move those two tests into their own test file and find a way to replace
Still happening: https://ci.nodejs.org/job/node-test-binary-arm/1540/RUN_SUBSET=1,nodes=pi2-raspbian-wheezy/console Any ideas, anyone? Pretty thin roster using
Do those requests come back, given enough time? I don't think 4306786 would cause the flaky behavior.
They time out after 120 seconds, so I think the answer is "no, they don't come back". I imagine this is related to the general uptick in pi2-raspbian-wheezy flakiness we've seen lately, but I don't know what the ultimate source is. It does seem to be primarily or exclusively in tests that involve network connections (to localhost in most/all cases, but that's the nature of our CI tests). I haven't dug into c-ares, so I'm not sure if this test might actually touch the network for certain configurations or whatever. It passes without any network connectivity, so it doesn't need the network. Yeah, and I agree that it seems unlikely that 4306786 is the problem. It's more likely the victim here than the perpetrator.
Good news. I tried putting all those hints tests in their own file to see if that fixed the flakiness, on the theory that the tests were either interacting with other tests or else cumulatively hitting some sort of threshold that was triggering Pi2 to sometimes drop a ... packet or something. Anyway, looks like that fixes it, and I'm all for splitting out the really big honkin' test files into smaller honkin' test files. So, win-win, I guess. Stress test on current master confirming flakiness of current test: https://ci.nodejs.org/job/node-stress-single-test/577/nodes=pi2-raspbian-wheezy/console Stress test on my branch confirming non-flakiness of the test-dns stuff split across two files: https://ci.nodejs.org/job/node-stress-single-test/576/nodes=pi2-raspbian-wheezy/console (215 tests and counting... still time for it to go sideways, but I'm optimistic...) UPDATE: 1 failure in 500+ runs, which is a whole lot better than the 10 failures in 100+ runs that we get on current master, but still, there's an issue...
A few of the hints tests were flaky in CI on pi2-raspbian-wheezy. Moving them to their own file fixes it. It could be that there is an unnoticed interaction with other tests in the file, or it could be that there is a cumulative threshold on some resource that is reached that causes Pi2 to sometimes stall out. (The test was timing out.) Fixes: nodejs#5554
Stress test for splitting up the test further in the hopes of increased reliability: https://ci.nodejs.org/job/node-stress-single-test/578/nodes=pi2-raspbian-wheezy/console
OK, so it does use the network despite
Use empty string instead of `www.google.com` for tests where we are just doing parameter evaluation. This will avoid DNS lookups which appear to be causing flakiness on Raspberry Pi devices in CI. Fixes: nodejs#5554
New PR: #5996
Use empty string instead of `www.google.com` for tests where we are just doing parameter evaluation. This will avoid DNS lookups which appear to be causing flakiness on Raspberry Pi devices in CI. PR-URL: #5996 Fixes: #5554 Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Example failure: