Always retry a request even if the sender returns a non-nil error #464
Conversation
Fixes #450
The code LGTM, but I'm not sure these error codes are truly inclusive of all the error cases you might run into across OSes.
Also, you might want to take a look at what 1.13 is doing with https://golang.org/doc/go1.13#error_wrapping. It seems like you're close to the Unwrap / As pattern, but not quite there.
internal/internal_test.go (outdated)

```go
func TestIsTemporaryNetworkErrorTrue(t *testing.T) {
	if !IsTemporaryNetworkError(someTempError{}) {
		t.Fatal("expected someTempError to be a temporary network error")
	}
}
```
You really want to fail fast :)
I've been running this on a branch in Travis and just saw another transient failure. Maybe instead of keeping a list of errnos to retry, it should keep a list of errnos *not* to retry?
It's an interesting idea, but then the question becomes: what's the list of error codes we shouldn't retry? EHOSTDOWN and EHOSTUNREACH seem like good candidates, although couldn't these too be transient in the face of network connectivity problems? Getting all the conditions right might be tricky (same as with the changes in this PR). I'm not against the idea, just wondering which one is easier to maintain (having a list of errnos not to retry kinda feels like trying to prove a negative). Of course we could just always retry on error, which is what we originally used to do; it's the simplest solution but runs the risk of retrying on non-transient failures.
Retrying on non-transient failures is better than not retrying on transient ones.
Also, it seems that with this PR I still see connection reset failures: https://travis-ci.org/kahing/goofys/jobs/585257461

```go
if detailedError, ok := err.(autorest.DetailedError); ok {
	if urlErr, ok := detailedError.Original.(*url.Error); ok {
		adl2Log.Errorf("url.Err: %T: %v %v %v", urlErr.Err, urlErr.Err, urlErr.Temporary(), urlErr.Timeout())
	}
}
```

That code produced this log line:
I agree that retrying is better than not.
I think this is addressed in the retry guidance: my read of 'timeout' includes all connection failures, with a complete failure only after the 5 recommended attempts in the backoff process.
Force-pushed from 5a31b97 to f81aab2
@kahing @mbrancato I've reworked this to always retry failed requests.
Still got some errors:

and
Thanks for the info. For the first case: at present we don't have any retry logic when reading a response body; the retry logic in this PR only covers calling the REST API. Can you please open a new issue to track adding retries for reading responses? Please note that this is likely a significant design change, so I don't know how fast a fix will be forthcoming.
Pass rate is better, but I still get #470 quite often. This is definitely a step forward though.
Minor CHANGELOG fix.
Thank you for your contribution to Go-AutoRest! We will triage and review it as soon as we can.
As part of submitting, please make sure you can make the following assertions:
- I've targeted the `dev` branch, except in the case of urgent bug fixes warranting their own release.
- If I'm targeting `master`, I've updated CHANGELOG.md to address the changes I'm making.