Description
Context
We have already done a lot of work around Cypress to prepare the tests to be executed successfully on the second quality gate:
- We have created specific executions per area team in Buildkite to improve the visibility and ownership of the tests in case of failure
- We have created the logic to mimic our current Buildkite executions on MKI
- We have cleaned up the code to make the tests more reliable and robust
- We have provided developers with an easy way to use Cypress with MKI on their local machines
- We have implemented the ability to perform login in serverless using SAML
With all of the above, we are extremely close to having our tests ready to be integrated with the second quality gate, but some work still needs to be finished before the integration.
Final Steps
We need to take into consideration that any failure on the second quality gate is going to block a deployment to production. This is why we need to be extremely careful about the robustness of our tests. To guarantee that our tests are robust and to minimise the risk of blocking releases to production, the following actions should be performed:
- Skip flaky or non-working tests on MKI
- Guarantee the retriability of our tests
- Have several green executions in a row without flakiness
- Integration with the quality gate
Skip flaky or non-working tests on MKI
Before integrating our tests with the Kibana release quality gate, we need to make sure we have green executions. We want to reach that point as soon as possible, so any failing or flaky test that requires investigation will be skipped from the execution with a @brokenInServerlessQA label.
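As a rough illustration of how a label can exclude tests from an execution, the sketch below models tag-based selection in plain TypeScript. It is not the actual runner code: the SpecSuite type, the selectSuitesForMki function, the suite titles, and the @ess and @serverless tags are assumptions made for the example. Only @brokenInServerlessQA comes from the text above.

```typescript
// Hypothetical model of tag-based selection; not the real runner implementation.
interface SpecSuite {
  title: string;
  tags: string[];
}

// Label named in the text above: suites carrying it are left out of MKI runs.
const BROKEN_IN_SERVERLESS_QA = '@brokenInServerlessQA';

// '@serverless' is an assumed tag marking suites intended to run on serverless.
function selectSuitesForMki(suites: SpecSuite[]): SpecSuite[] {
  return suites.filter(
    (suite) =>
      suite.tags.includes('@serverless') &&
      !suite.tags.includes(BROKEN_IN_SERVERLESS_QA)
  );
}

// Made-up suites, used only to exercise the filter.
const suites: SpecSuite[] = [
  { title: 'Timeline creation', tags: ['@ess', '@serverless'] },
  { title: 'Timeline notes', tags: ['@ess', '@serverless', BROKEN_IN_SERVERLESS_QA] },
  { title: 'Legacy flyout', tags: ['@ess'] },
];

const selected = selectSuitesForMki(suites).map((suite) => suite.title);
console.log(selected);
```

In this sketch only the suite without the label survives the filter, so a skipped test stays in the codebase and can be re-enabled by removing the label once it has been investigated.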
Guarantee the retriability of our tests
We all know that flakiness may happen from time to time. Ideally, the only flakiness we should face is the kind caused by external factors such as slow machines or network issues.
With Cypress we have the test retries functionality enabled. Test retries have been configured with 1 retry attempt, so Cypress will retry a failed test one additional time (for a total of 2 attempts) before marking it as failed. When a test is re-executed, the beforeEach and afterEach hooks are re-run as well. However, failures in before and after hooks do not trigger a retry, and the test is marked as failed.
So, in order to have 'retriable' tests, we should get rid of the before and after hooks in favor of the beforeEach and afterEach hooks, or at least make sure that the code executed in the before and after hooks is not prone to fail (e.g. es_archiver).
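The retry behaviour described above can be sketched with a small model. This is a simplification written for illustration, not Cypress internals: runSpec, SpecModel, and flakyOnce are hypothetical names, and the model only covers a single test with one retry. It shows why the same transient setup failure is fatal in a before hook but recoverable in a beforeEach hook.

```typescript
// Simplified model of the retry behaviour described above; not Cypress internals.
type Hook = () => void;

interface SpecModel {
  before?: Hook; // runs once, a failure here is never retried
  beforeEach?: Hook; // re-run on every attempt
  test: Hook;
  afterEach?: Hook; // re-run on every attempt
  after?: Hook; // runs once, a failure here is never retried
}

interface Outcome {
  status: 'passed' | 'failed';
  attempts: number; // number of times the test body was attempted
}

function runSpec(spec: SpecModel, retries: number): Outcome {
  try {
    spec.before?.();
  } catch {
    // A failing before hook fails the test without any retry.
    return { status: 'failed', attempts: 0 };
  }

  const maxAttempts = retries + 1;
  let outcome: Outcome = { status: 'failed', attempts: maxAttempts };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      spec.beforeEach?.();
      spec.test();
      spec.afterEach?.();
      outcome = { status: 'passed', attempts: attempt };
      break;
    } catch {
      // Swallow the error and move on to the next attempt, if any is left.
    }
  }

  try {
    spec.after?.();
  } catch {
    // A failing after hook also fails the test without any retry.
    outcome = { ...outcome, status: 'failed' };
  }
  return outcome;
}

// Hypothetical setup step that fails only the first time it is called.
function flakyOnce(): Hook {
  let calls = 0;
  return () => {
    calls += 1;
    if (calls === 1) throw new Error('transient setup failure');
  };
}

const inBefore = runSpec({ before: flakyOnce(), test: () => {} }, 1);
const inBeforeEach = runSpec({ beforeEach: flakyOnce(), test: () => {} }, 1);
console.log(inBefore, inBeforeEach);
```

With 1 retry configured, the setup placed in before fails the test outright, while the same setup placed in beforeEach passes on the second attempt.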
Another thing we need to take into consideration, to guarantee that a test can be retried, is that the data the test might generate must be properly cleaned.
Each spec file is executed on a clean environment, but retries are not. Retries are executed on the same environment in which the execution was initiated, which is why it is important to make sure that the data the test may generate is cleaned at the beginning.
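To make the clean-up point concrete, here is a minimal sketch that uses an in-memory set as a stand-in for the shared environment. All names (environment, deleteRules, createRule, runWithOneRetry) are hypothetical, and a real test would call the relevant APIs instead. The test body creates data and then fails once, as a flaky test would.

```typescript
// In-memory stand-in for the environment shared by all attempts of one spec.
const environment = new Set<string>();

// Hypothetical helpers; real tests would call the relevant APIs instead.
function deleteRules(): void {
  environment.clear();
}

function createRule(name: string): void {
  if (environment.has(name)) {
    throw new Error(`rule "${name}" already exists`);
  }
  environment.add(name);
}

let attempt = 0;

// Test body that generates data and then fails on its first attempt only.
function testBody(): void {
  attempt += 1;
  createRule('my-rule');
  if (attempt === 1) throw new Error('simulated flaky assertion');
}

// Runs the test with 1 retry; `setup` plays the role of a beforeEach hook.
function runWithOneRetry(setup: () => void): 'passed' | 'failed' {
  attempt = 0;
  environment.clear(); // each spec file starts on a clean environment
  for (let i = 0; i < 2; i++) {
    try {
      setup();
      testBody();
      return 'passed';
    } catch {
      // The environment is NOT reset here: the retry reuses the same state.
    }
  }
  return 'failed';
}

const withoutCleanup = runWithOneRetry(() => {});
const withCleanup = runWithOneRetry(deleteRules);
console.log({ withoutCleanup, withCleanup });
```

Without clean-up, the retry trips over the rule left behind by the first attempt and fails for a reason unrelated to the original flakiness. With clean-up at the beginning of each attempt, the retry starts from a known state and passes.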
Have several green executions in a row without flakiness
We cannot integrate tests until we have several green executions in a row.
Integration with the quality gate
Once we are sure that our tests are consistently passing on MKI, they will be integrated with the quality gate. Note that the executions are currently split by area team, so as soon as an area team has its tests ready, those will be integrated.
Tasks to be done
- [Security Solution] Improving test automation reliability #173508
- Investigations
  - Skip non-working tests
  - [Investigations] Remove before and after hooks from Cypress tests #175019
  - [Investigations] Audit the clean-up of data in Cypress #175095
  - Make sure tests are stable in MKI
  - Integrate tests with the quality gate
  - [Investigations] Add Cypress tests to the Kibana release quality gate #180282
- Explore
  - Skip non-working tests
  - [Explore] Remove before and after hooks from Cypress tests #175020
  - [Explore] Audit clean-up of data in Cypress #175096
  - Make sure tests are stable in MKI
  - Integrate tests with the quality gate
  - [Explore] Add Cypress tests to the Kibana release quality gate #180283
- Detection Engine
  - [Security Solution] Enable API integration tests in Serverless, second quality gate #169185
  - [Detection Engine] Remove before and after hooks from Cypress tests #175021
  - [Explore] Audit clean-up of data in Cypress #175096
  - Make sure tests are stable in MKI
  - Integrate tests with the quality gate
  - [Detection Engine] Add Cypress tests to the Kibana release quality gate #180277
- Rule management
  - Skip non-working tests
  - [Rule Management] Remove before and after hooks from Cypress tests #175022
  - [Rule Management] Audit clean-up of data in Cypress #175098
  - Make sure tests are stable in MKI
  - Integrate tests with the quality gate
  - [Rule Management] Add Cypress tests to the Kibana release quality gate #180278
- Entity Analytics
  - Skip non-working tests
  - [Entity Analytics] Remove before and after hooks from Cypress tests #175023
  - [Entity Analytics] Audit clean-up of data in Cypress #175099
  - Make sure tests are stable in MKI
  - Integrate tests with the quality gate
  - [Entity Analytics] Add Cypress tests to the Kibana release quality gate #180281
- AI Assistant
  - Skip non-working tests
  - Remove before and after hooks
  - Make sure data is cleaned
  - Make sure tests are stable in MKI
  - Integrate tests with the quality gate
  - [GenAI] Add Cypress tests to the Kibana release quality gate #180280