Add special CI mode #90

doublep · 2023-07-05T19:52:44Z

I have long wanted to improve CI stability with Eldev, but never really got to doing it.

The problem: CI tests (including those for Eldev itself) often fail for reasons completely external to the project being tested, e.g. because of networking problems, MELPA bugs (they apparently don't have a transactional upgrade mode, so your CI can fail because you just happen to run it at a "hiccup" moment, when the package archive itself is in an inconsistent state; or at least that's how it looked to me), or maybe something else entirely. That is, if you just manually restart the CI run, it will often succeed. This is very annoying and reduces trust in CI testing overall.

Idea: add a global option to Eldev, called --ci or --robust-mode or something like that. When in that mode, Eldev should retry on certain failures instead of immediately giving up. This mode should be automatically active on various common CI servers, starting with GitHub test servers. That is, the default value should be "auto", and Eldev would then use some heuristics to determine whether it is being executed in a CI setup (where "auto" would resolve to "yes", i.e. robust mode) or just locally (where it would resolve to "no").
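To make the "auto" resolution concrete, here is a minimal sketch in Python. It is not Eldev's implementation (Eldev is written in Emacs Lisp), and the names `looks_like_ci` and `resolve_robust_mode` are hypothetical. The only heuristic assumed is checking the `CI` and `GITHUB_ACTIONS` environment variables, both of which GitHub Actions sets to `true`; a real implementation would likely need more checks for other CI services.

```python
import os

# Assumption: these two variables are enough for a first heuristic.
# GitHub Actions sets both of them to "true" in every workflow run.
CI_ENV_VARS = ("CI", "GITHUB_ACTIONS")


def looks_like_ci(env=None):
    """Guess whether we are running on a CI server (hypothetical helper)."""
    env = os.environ if env is None else env
    truthy = ("1", "true", "yes")
    return any(env.get(name, "").strip().lower() in truthy for name in CI_ENV_VARS)


def resolve_robust_mode(setting="auto", env=None):
    """Resolve the option value ("auto", "yes" or "no") to a boolean."""
    if setting == "auto":
        # "auto" becomes "yes" on CI and "no" for local runs.
        return looks_like_ci(env)
    if setting in ("yes", "no"):
        return setting == "yes"
    raise ValueError(f"unknown robust mode setting: {setting!r}")
```

An explicit "yes" or "no" always wins over the heuristic, so a user can still force either behavior when the detection guesses wrong.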

The hardest part is figuring out the specific errors on which Eldev should retry instead of giving up. This is, of course, made particularly difficult by the fact that such errors are not reproducible and happen only from time to time.

@ikappaki, @bbatsov, @sirikid, @LaurenceWarne, @juergenhoetzel, @DarwinAwardWinner: Sorry for batch-pinging, but if you are interested in this, please link (or duplicate here, especially if you restart a CI run) stacktraces that look like such intermittent errors where Eldev should retry, and I'll try to figure it out. If not, just unsubscribe from this thread, and sorry again.


DarwinAwardWinner · 2023-07-05T21:19:27Z

Unfortunately, I haven't done many CI runs recently, and it looks like GitHub has cleaned up my older CI history. So while I'm sure I've seen transient errors, I'm not sure I have a way to find and provide any stack traces at this time.

doublep · 2023-07-14T18:52:46Z

OK, looks like I got the first example, with Eldev itself. The CI run failed because of "End of file during parsing" "When updating contents of package archive ‘melpa-stable’" during integration tests. I'll see if I can somehow make Eldev robust against such failures.

MELPA-INTERMITTENT-FAILURE.log
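The kind of retry discussed here could look roughly like the following Python sketch. It is an illustration only, not Eldev's code: `run_with_retries`, `is_transient`, the attempt count and the delay are all hypothetical, and the single marker string is taken from the failure quoted above. Errors that do not match a known transient marker are re-raised immediately, so real bugs are not hidden behind retries.

```python
import time

# Assumption: transient failures are recognized by substrings of the
# error message; so far only the one reported in this thread is listed.
TRANSIENT_MARKERS = ("End of file during parsing",)


def is_transient(error):
    """Return True if the error message matches a known transient failure."""
    return any(marker in str(error) for marker in TRANSIENT_MARKERS)


def run_with_retries(operation, robust, max_attempts=3, delay_seconds=0.0,
                     sleep=time.sleep):
    """Call operation(); in robust mode, retry it on transient errors."""
    attempts = max_attempts if robust else 1
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Give up on the last attempt or on any non-transient error.
            if attempt == attempts or not is_transient(error):
                raise
            sleep(delay_seconds)
```

Outside robust mode the operation runs exactly once, which keeps local behavior unchanged.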


doublep added a commit that referenced this issue Jul 16, 2023

Add "robust" mode (active by default on CI servers), in which Eldev retries certain things several times before giving up (issue #90).

1636af3

doublep added the "enhancement" and "in progress" labels Jul 20, 2023
