We should not fail fast in our e2e test suite

The last release showed that our current approach of stopping the test run as soon as one test fails can hide other important test failures. We should change the test runner so that it does not fail fast and instead keeps running the remaining tests.

Some of the things we have to be aware of and might need to address in the test runner:
* resource cleanup: I believe created resources are not deleted when a test fails. This can lead to resource exhaustion and to subsequent test failures that are caused by the exhausted resources and not by bugs in the product.
* test interdependencies: we discussed concerns about the license tests causing problems. However, after looking at them I don't see an issue as long as we don't parallelise the tests (which we currently have no intention of doing).
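
To make the two points above concrete, here is a minimal, language-neutral sketch written in Python. It is not the project's test runner: `Env`, `run_suite`, the `fail_fast` parameter and the three example cases are all hypothetical names invented for illustration. It shows a sequential runner that records every failure and always removes the resources a test created, whether the test passed or failed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Env:
    """Hypothetical stand-in for the shared test environment."""
    resources: List[str] = field(default_factory=list)


Case = Tuple[str, Callable[[Env], None]]


def run_suite(cases: List[Case], env: Env, fail_fast: bool = False) -> List[str]:
    """Run cases one after another (no parallelism); return the failed names."""
    failed: List[str] = []
    for name, case in cases:
        before = list(env.resources)  # snapshot to restore afterwards
        try:
            case(env)
        except Exception:
            failed.append(name)
            if fail_fast:
                break  # old behaviour: stop at the first failure
        finally:
            # Runs on success, on failure, and on the fail-fast break:
            # drop everything the case created so later cases start clean.
            env.resources[:] = before
    return failed


def create_case(env: Env) -> None:
    env.resources.append("cluster-a")
    raise AssertionError("simulated failure after creating a resource")


def upgrade_case(env: Env) -> None:
    env.resources.append("cluster-b")


def license_case(env: Env) -> None:
    raise AssertionError("second, unrelated failure")


CASES: List[Case] = [
    ("create", create_case),
    ("upgrade", upgrade_case),
    ("license", license_case),
]
```

With `fail_fast=True` the sketch reports only `create` and the `license` failure stays hidden. With the default it reports both, and `env.resources` is empty again after the run.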

I assume part of the motivation to use `failfast` was to keep the feedback cycle short, so that we do not have to wait through hours of tests before we get a go/no-go signal. But I think we have since changed the way we use the e2e tests quite a bit. We now use a much shorter smoke test for the PR builds, which is effectively a single test. The full e2e test suite is run overnight or after integration into `main`, where turnaround time is less of a concern. We also have the proposal to add a new e2e test pipeline that is a bit more comprehensive than the smoke test but covers only the most important operations, namely version upgrades of the stack.

We should not fail fast in our e2e test suite #6338

pebrc opened on Jan 18, 2023
