Add automatic retry to GHA nightly build workflows #3652
Comments
Checked the solutions possible in our case:
From the above, we decided to try two different solutions: 2 and 3.
After checking 2 and 3 from the above, we've figured out that:
To make it work at the workflow level, we must also create actions from the workflows. So the plan is to start from the checks and apply retry at the steps level.
The current working solution is tracking the job status at the level of the checks (
To make it work we add
The drawback of the solution is that we copy-paste all the steps, which makes it harder to maintain. We cannot do the same at the level of the
The alternative possible solution is to create a custom
Sources used:
From the above and PS with @ankatiyar, it was decided NOT to proceed with any of the described solutions, as they seem to bring more difficulties than value. @merelcht, @SajidAlamQB, what do you think? Please let me know if I'm missing anything or if you have another possible solution in mind!
@ElenaKhaustova thanks for investigating this in so much detail and explaining all possibilities. This is definitely a lot more complex than I thought. I agree it's not worth having such a complex retry system at this point in time, because jobs aren't failing due to flakiness that frequently. We can always revisit this if we find that our builds aren't stable enough anymore and we need to retry too often.
Closing this issue: after research and several discussions, it was decided not to proceed with it for now.
Description
The nightly build and notification workflow creates a lot of spurious failure-notification issues, which are usually resolved by re-running the failed tests.
Proposal: Add an automatic re-run of the failed tests (or the entire test suite) before the workflow reaches the step that creates the failure-notification issue, so that only genuine failures create issues.
Possible Implementation
I looked into this very briefly and saw these actions, but haven't tried it out yet.
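To illustrate the intended behaviour only: below is a minimal Python sketch of "retry first, notify only if every attempt fails". It is not a GitHub Actions configuration and not what the workflows currently do. The names `run_with_retry`, `nightly_check` and the `report_failure` callback are hypothetical, and `max_attempts=2` is an assumed default.

```python
import subprocess
import sys


def run_with_retry(command, max_attempts=2):
    """Run `command` up to `max_attempts` times; return True on the first success."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return True
        print(f"attempt {attempt}/{max_attempts} failed (exit code {result.returncode})")
    return False


def nightly_check(command, report_failure, max_attempts=2):
    """Call `report_failure` (hypothetical notifier) only if every attempt failed."""
    if run_with_retry(command, max_attempts):
        return True
    report_failure()
    return False


if __name__ == "__main__":
    # Example: a command that always succeeds, so no failure is reported.
    always_ok = [sys.executable, "-c", "import sys; sys.exit(0)"]
    nightly_check(always_ok, lambda: print("would open a failure issue here"))
```

In a workflow, the equivalent would be to let the issue-creating step run only after the retries are exhausted, so a flaky failure that passes on re-run never produces a notification.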