Commit 4b69748
ci: add custom timeout to ci job in order to save resources (#8504)
Fixes #8503.
### Description
Added a custom timeout for the `cron-conda` job of the `cron-conda`
workflow, based on historical data.
### More details
Over the last 633 successful runs, the `cron-conda` job has a maximum
runtime of 40 minutes (mean=23, std=2) across all matrix combinations.
However, some runs fail only after reaching the 6-hour limit that
GitHub imposes. In other words, these jobs seem to get
stuck, possibly for external or random reasons.
One such example is
[this](https://github.com/Project-MONAI/MONAI/actions/runs/14507295168/job/40698965737)
job run, which failed after 6 hours. More stuck jobs have been observed
over the last six months, the first on 11-Jan-2025 and the last
on 17-Apr-2025; more recent occurrences are also possible
because our dataset has a cutoff date around late May. With the proposed
changes, a total of **145 hours would have been saved** over the last
six months retrospectively, clearing the queue for other workflows and
**speeding up the CI** of the project, while also **saving resources**
in general 🌱.
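As a rough illustration of how such a retrospective saving can be estimated, the sketch below sums the minutes each stuck job ran beyond a given timeout. The function name `hours_saved` and the sample durations are hypothetical and are not taken from the dataset behind the 145-hour figure.

```python
from typing import Iterable


def hours_saved(stuck_job_minutes: Iterable[float], timeout_minutes: float) -> float:
    """Hours that would not have been spent if each job had been
    cancelled at `timeout_minutes` instead of running to its actual end."""
    saved_minutes = sum(max(0.0, d - timeout_minutes) for d in stuck_job_minutes)
    return saved_minutes / 60


# Hypothetical input: three jobs that each ran into the 6-hour (360-minute) limit.
example_durations = [360, 360, 360]
print(hours_saved(example_durations, timeout_minutes=46))
```

Jobs that finish below the timeout contribute nothing to the total, so only the stuck runs matter for the estimate.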
The idea is to set a timeout to stop jobs that run much longer than
their historical maximum, because such jobs are probably stuck and will
simply fail with a timeout at 6 hours.
Our PR proposes to set the timeout to `max + 3*std = 46 minutes` where
`max` and `std` (standard deviation) are derived from the history of 633
successful runs. This will provide sufficient margin if the workflow
gets naturally slower in the future, but if you would prefer a lower or
higher threshold, we would be happy to change it.
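The threshold arithmetic can be sketched as follows. This is a minimal illustration that assumes the reported statistics (max of 40 minutes, std of 2 minutes); the helper name `proposed_timeout_minutes` and the rounding up to whole minutes are assumptions for this example.

```python
import math


def proposed_timeout_minutes(max_runtime: float, std_runtime: float, k: float = 3.0) -> int:
    """Return max + k * std, rounded up to a whole number of minutes."""
    return math.ceil(max_runtime + k * std_runtime)


# Statistics reported for the 633 successful runs: max = 40 min, std = 2 min.
print(proposed_timeout_minutes(max_runtime=40, std_runtime=2))  # 46
```

Changing `k` is one way to express a lower or higher margin without touching the historical statistics.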
Note that the timeout applies to each matrix job individually, not to
their sum, and overrides GitHub's default 6-hour timeout.
### Context
Hi,
We are a team of [researchers](https://www.ifi.uzh.ch/en/zest.html) from
the University of Zurich and we are currently working on energy
optimizations in GitHub Actions workflows.
Feel free to let us know (here or at the email below) if you have any
questions, and thanks for taking the time to read this.
Best regards,
[Konstantinos
Kitsios](https://www.ifi.uzh.ch/en/zest/team/konstantinos_kitsios.html)
konstantinos.kitsios@uzh.ch
### Types of changes
- [x] Non-breaking change (fix or new feature that would not break
existing functionality).
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
  * Set a 46-minute timeout for the scheduled Conda workflow to improve
    reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Konstantinos <konstantinos.kitsios@uzh.ch>

1 parent d388d1c commit 4b69748
1 file changed: 1 addition, 0 deletions (one line inserted at line 24 of the changed file; diff content not preserved).