-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait more in HTTP server specs in interpreted mode #12420
Conversation
@@ -2,7 +2,12 @@ require "spec" | |||
require "../spec_helper" | |||
require "../../support/fibers" | |||
|
|||
private def wait_for(timeout = 5.seconds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was no caller with an explicit timeout so for clarify I moved the argument to inside the method.
private def wait_for(timeout = 5.seconds) | ||
private def wait_for | ||
timeout = {% if flag?(:interpreted) %} | ||
25.seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
25 seconds seems a lot. Are you sure we need that much margin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you suggest? Let's say we go with 10. It times out. We try with 15. It times out. We keep bumping it. Is the timeout necessary at all in these specs if we expect them to pass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We expect specs to pass. But if they break for some reason, I wouldn't want to wait for the specs to finish for an accumulation of perhaps many minutes in total. For that reason the timeout should be chosen as small as possible while still ensuring specs to pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I don't know what's a good value. Maybe we can change it to 10 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a related note, all specs have a hard-coded timeout of 15 seconds for the -Dpreview_mt
workflow, and every now and then one spec fails. Example: https://github.com/crystal-lang/crystal/runs/7470644243?check_suite_focus=true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I'm sure the interpreter can do it in 5 seconds or much less... if it's complied in release mode. But in CI the interpreted is not optimized, and the non-optimized version is much, much slower than the optimized version (let's say, like 100 or 1000 times slower)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just ran some quick test and with a non-release compiler in interpreter mode, it takes about 22-24 ms for server.listening?
to respond with true
. It's basically just waiting to switch to the enqueued fiber, doing a syscall and switching back.
In compile mode, it's more like 0.1-0.2 ms. So it's (only) 100x faster than the non-release interpreter.
So if it happens to exceed the 5 seconds timeout in CI, it takes a lot longer than expected and might be worth looking into the reasons for that instead of just increasing the timeout limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I still think that the timeout as suggested by this PR is the right choice. If you can improve the performance and lower the timeout along with that, then great. Otherwise it's always a clear improvement to bring timeouts in line with reality according to the rule of thumb that I mention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which failing spec does this timeout configuration even affect? The PR description doesn't give any context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any of the HTTP server specs that use this method used to randomly fail.
Here's another similar failure:
There's a timeout of 1 second in |
I think we should rather investigate why it takes such an exuberant amount of time in the first place. |
Of course! The only downside is that until we figure it out all PRs will show a red cross and appear as failed. An alternative is to bump timeouts in interpreted mode, capture this timeout issue as a separate issue and move on. |
Sure, increasing the timeout as a temporary workaround is probably okay. But then it should be designated as such, with a TODO comment attached to the code and an open issue to track the original problem. |
No description provided.