-
Notifications
You must be signed in to change notification settings - Fork 4.8k
retry build errors #1194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
retry build errors #1194
Conversation
|
[test] |
|
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_openshift3/1238/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this message belongs here - retry decisions belong in a higher layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the controller is the bit that has the logic to know whether a particular event can/should be retried or not (based on what state has been modified or would be duplicated if the event is replayed) so i don't see how you can(well, should) defer the decision to retry out of the controller. The fact is the controller is making the decision to retry or not based on the type of error it returns and knowing the logic of the retry controller anyway. There's really no avoiding the fact that the controller needs to have some knowledge of the types of errors the retry handler is expecting and how it will handle them. the less of that there is, the better, imho. (which is why I preferred the approach that just returned RetryableError vs FatalError wrapping the real error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Higher layer = your retry handler. That's where you make a decision about retry. Adding it here too confuses the issue
657c533 to
482ca27
Compare
|
@smarterclayton updated per your comments. thoughts? notable and debatable design points:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Godoc
|
Minor comment, otherwise this is fine. |
|
godoc'd. [merge] |
|
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_openshift3/1115/) (Image: devenv-fedora_973) |
|
Evaluated for origin up to 2841124 |
|
post-hoc LGTM ----- Original Message -----
|
UPSTREAM: 59931: do not delete node in openstack, if those still exist in cloudprovider
For now this will retry retryable failures forever on a tight loop. I'm still contemplating switching it to something like 100. The problem is that without backoff (which the current retrycontroller logic does not offer for good reason), if there's a temporary API outage we'll burn through any reasonable number of retries rapidly.