
Add test suite parallelization for E2E tests in CI #9064

Open
cnotv opened this issue Jun 7, 2023 · 8 comments
Labels
area/ci (CI and automation related, e.g. GitHub Actions), area/e2e, area/test (Test: e2e and unit), kind/tech-debt (Technical debt), QA/dev-automation (Issues that engineers have written automation around so QA doesn't have to look at this)
Milestone

Comments

@cnotv
Contributor

cnotv commented Jun 7, 2023

Description

As an effort to reduce the E2E workflow execution time at scale, the preferred approach is to run tests in parallel using the --parallel flag, as described in the Cypress guide.
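For illustration, a minimal sketch of what the Cypress-level parallelization could look like in the workflow. This assumes a Cypress Cloud (or compatible) record key is available, since --parallel requires --record; the job name, group name, and secret name are assumptions, not the repo's actual configuration:

```yaml
# Hypothetical excerpt of the E2E workflow; names are illustrative only.
e2e-parallel:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      # Four copies of the same job; Cypress Cloud load-balances specs between them.
      containers: [1, 2, 3, 4]
  steps:
    - uses: actions/checkout@v4
    - name: Run E2E specs in parallel
      run: |
        npx cypress run \
          --record --parallel \
          --ci-build-id "${{ github.run_id }}" \
          --group e2e
      env:
        CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
```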

Context

Tests must be idempotent if parallelization is enabled. Although the specs are isolated on the client side, the same does not happen on the server side. Server initialization is a separate topic which will have to be tackled on its own.
In the current state (to be verified with the new tests), the only suite that must run before the others is the Rancher setup, which can also be skipped via the TEST_SKIP_SETUP env var, as defined in the documentation.

Note: in the current state, the E2E tests already run in 2 different groups, admin and user, but within the same job.
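As a rough sketch of how this could fit together, assuming hypothetical yarn script names and a hypothetical variable for selecting the group (only TEST_SKIP_SETUP comes from the documentation referenced above):

```yaml
# Illustrative only: run the Rancher setup suite once, then let the
# admin and user groups run in parallel with the setup skipped.
jobs:
  setup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: yarn e2e:setup   # hypothetical script running only the setup specs
  e2e:
    needs: setup
    runs-on: ubuntu-latest
    strategy:
      matrix:
        role: [admin, user]
    steps:
      - uses: actions/checkout@v4
      - run: yarn e2e:run     # hypothetical script for the remaining specs
        env:
          TEST_SKIP_SETUP: 'true'
          TEST_ROLE: ${{ matrix.role }}   # hypothetical variable selecting the group
```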

@cnotv cnotv added the area/test, kind/tech-debt, area/ci, QA/dev-automation and area/e2e labels Jun 7, 2023
@cnotv cnotv added this to the v2.7.next3 milestone Jun 7, 2023
@cnotv
Contributor Author

cnotv commented Jun 7, 2023

Just to collect what has already been discussed previously: the idea is to reduce the current 30-minute workflow, which can stretch to 1h in case of failures.

Please bear in mind that this will not reduce the time as drastically as correcting the tests or fixing the related issues (high resource load on the stack) would:

  • Each suite used to have an execution time of ~30s, so I would recommend that any other test fall within this timing; currently we have tests ranging from 3min to 9min
  • Timeout issues while loading resources make tests easily and often flaky; in some cases, even worse, extra logic is required, which makes them unreliable
  • The timeout is set by default to a max of 60s, which is fully used up in case of errors (the most common case), multiplied by the 3 retries of the test itself; this was adopted because of the issue above, as an attempt to reduce failing tests (see the sketch after this list)
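For reference, a sketch of how the timeout and retry values in question can be set from the CI command line; the values simply mirror the ones mentioned above and are not a recommendation:

```yaml
# Illustrative step only: the 60s command timeout and 3 retries discussed above,
# expressed as Cypress CLI config overrides.
- name: Run E2E suite
  run: |
    npx cypress run \
      --config defaultCommandTimeout=60000,retries=3
```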

@richard-cox
Member

We should be mindful of how multiple tests impact performance in the runner. If that's starved of resources we may hit similar problems.

If we need to define tests that can / cannot run in a parallelised way, the work @yonasberhe23 is doing to add tags can help.

@nwmac
Member

nwmac commented Jun 8, 2023

A different approach to running tests in parallel against one deployment is to split the tests into suites and have each suite run as a separate job in GH Actions; these jobs can then run in parallel, each with its own separate environment.
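A hypothetical sketch of that suite-per-job approach; the suite folders, the environment provisioning step and the spec path are made up for illustration:

```yaml
# Illustrative only: one GH Actions job per suite, each with its own environment.
jobs:
  e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # Suite folders are illustrative, not the repo's actual layout.
        suite: [navigation, cluster-manager, users-roles]
    steps:
      - uses: actions/checkout@v4
      - name: Start a dedicated backend for this job
        run: ./scripts/e2e-env-up.sh   # placeholder for whatever provisions the environment
      - name: Run the suite
        run: npx cypress run --spec "cypress/e2e/tests/${{ matrix.suite }}/**"
```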

@cnotv
Contributor Author

cnotv commented Jun 8, 2023

A different approach to running tests in parallel against one deployment is to split the tests into suites and have each suite run as a separate job in GH Actions; these jobs can then run in parallel, each with its own separate environment.

It's not a bad idea; just throwing in these points to consider.

Pro:

  • It would make sense to split the 1h and turn it into ~40min (30 + 10 overhead), or ~25min (15 + 10 overhead) if you also run parallel tests within the parallel jobs
  • You can ensure tests pass without using too much time

Cons:

  • The stats you get out are completely messed up (they mention groups, but I don't think that works with our dashboard), so you have no fine-grained metrics, which makes it harder to find out the current state
  • Overhead of ~10min
  • The extra configuration may be overwhelming, considering all the aspects involved, and may even be a blocker in some cases

It may make sense to split by initial configuration if we want to test different scenarios or browsers, though that is not our case so far.

@nwmac
Member

nwmac commented Jun 9, 2023

I was thinking we'd have one build step that uploads a build which the many E2E jobs can then download, so they don't all have to build.
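A rough sketch of that build-once / download-many idea; the build commands, artifact name, paths and suite names are assumptions:

```yaml
# Illustrative only: build once, share the output with the parallel E2E jobs.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: yarn install --frozen-lockfile && yarn build   # hypothetical build commands
      - uses: actions/upload-artifact@v4
        with:
          name: dashboard-dist
          path: dist/
  e2e:
    needs: build
    runs-on: ubuntu-latest
    strategy:
      matrix:
        suite: [suite-a, suite-b]   # placeholder suite names
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: dashboard-dist
          path: dist/
      - run: yarn e2e:run   # hypothetical script serving dist/ and running the suite
```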

I was also thinking that, for code coverage, we could drop it from PRs and have a job that runs on merge to master, running all unit tests and all E2E tests in one go. This would take time but wouldn't block anything; it could then upload coverage data, and we would not have to do fiddly work uploading data from each suite and then combining it.
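A minimal sketch of the trigger for such a non-blocking coverage run; the job body is a placeholder:

```yaml
# Illustrative trigger for a coverage job that only runs on merge to master.
on:
  push:
    branches: [master]
jobs:
  full-test-and-coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # run all unit + E2E tests here, then upload the combined coverage data
```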

@richard-cox
Member

Linking related issue - #9243. The new initial setup flow needs to run first, with admin and standard user buckets running in parallel afterwards

@richard-cox
Member

Example of how GH jobs can be used to run in parallel (and other good features): nwmac#18. We'll need to combine this with the grep tags concept.
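For context, a sketch of how grep tags are typically wired into a job matrix with the @cypress/grep plugin; the tag names and the two-way split are made up for illustration:

```yaml
# Illustrative only: select specs by tag per parallel job using @cypress/grep.
strategy:
  matrix:
    include:
      - tags: '@adminUser'
      - tags: '@standardUser'
steps:
  - uses: actions/checkout@v4
  - name: Run only the tests carrying this job's tag
    run: npx cypress run --env grepTags="${{ matrix.tags }}"
```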

@cnotv
Contributor Author

cnotv commented Jul 4, 2023

Would it not be better to have an E2E composite action or reusable workflow at this point, instead of repeating most of it 3 times?
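A minimal sketch of what a reusable workflow could look like; the file name, input name and tag values are hypothetical:

```yaml
# .github/workflows/e2e-reusable.yml (hypothetical file)
on:
  workflow_call:
    inputs:
      grep-tags:
        type: string
        required: true
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx cypress run --env grepTags="${{ inputs.grep-tags }}"
```

Each of the three variants would then just call it:

```yaml
jobs:
  e2e-admin:
    uses: ./.github/workflows/e2e-reusable.yml
    with:
      grep-tags: '@adminUser'
```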

@gaktive gaktive modified the milestones: v2.7.next3, v2.8.0 Aug 3, 2023
@gaktive gaktive modified the milestones: v2.8.0, v2.8.next1 Sep 22, 2023
@cnotv cnotv changed the title from "Add parallelization for E2E tests in CI" to "Add test suite parallelization for E2E tests in CI" Nov 29, 2023
@nwmac nwmac modified the milestones: v2.9.0, v2.9.x Feb 27, 2024
@gaktive gaktive removed this from the v2.9.x milestone Oct 18, 2024
@gaktive gaktive added this to the v2.12.0 milestone Oct 18, 2024