Improve testing harness to separate DB and non-db test (#35160)
This PR marks DB tests as such and allows splitting test execution
in CI so that the DB tests run against the various databases, while
the non-DB tests run without a DB in a separate run.

In order to do that, the code to select which tests to run has been
moved from `entrypoint_ci.sh` bash to breeze's Python code, which is
generally much nicer to maintain and common for both "DB" and
"non-DB" tests.

This will have the nice side effect that it will be easier in the
future to manage different test types and contain some specific
flaky test types.

This change also adds the possibility to isolate some of the test types
when parallel DB tests are run, and adds a new test type,
PythonOperator, carved out of the Operators type. This test type is best
run in isolation because creating and destroying virtualenvs in Docker
while running in parallel with other tests is very slow for some reason
and leads to flaky tests.

Python operator tests are therefore separated out from Operators and
treated separately as isolated tests.

This will help not only with speed but also with stability of the
test suite.
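
The isolation idea described above can be sketched roughly as follows. This is an illustration only, not breeze's actual code: the function name `split_test_types` and the `ISOLATED_TEST_TYPES` set are hypothetical, and only the `PythonOperator` and `Operators` type names come from the commit message.

```python
from __future__ import annotations

# Hypothetical sketch, not the actual breeze implementation.
# Assumption: PythonOperator is the only test type that needs isolation.
ISOLATED_TEST_TYPES = frozenset({"PythonOperator"})


def split_test_types(test_types: list[str]) -> tuple[list[str], list[str]]:
    """Return (parallel, isolated) test types, preserving the input order."""
    parallel = [t for t in test_types if t not in ISOLATED_TEST_TYPES]
    isolated = [t for t in test_types if t in ISOLATED_TEST_TYPES]
    return parallel, isolated


parallel, isolated = split_test_types(["API", "Operators", "PythonOperator", "WWW"])
print("run in parallel:", parallel)
print("run in isolation:", isolated)
```

In such a scheme the parallel group would be started first, and each isolated type would run on its own once nothing else competes for the machine.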
potiuk authored Oct 31, 2023
1 parent 651b326 commit a7e76ba
Showing 79 changed files with 4,147 additions and 1,498 deletions.
4 changes: 2 additions & 2 deletions .github/actions/post_tests_success/action.yml
@@ -28,14 +28,14 @@ runs:
path: ./files/warnings-*.txt
retention-days: 7
- name: "Move coverage artifacts in separate directory"
if: env.COVERAGE == 'true' && env.TEST_TYPES != 'Helm'
if: env.ENABLE_COVERAGE == 'true' && env.TEST_TYPES != 'Helm'
shell: bash
run: |
mkdir ./files/coverage-reposts
mv ./files/coverage*.xml ./files/coverage-reposts/ || true
- name: "Upload all coverage reports to codecov"
uses: codecov/codecov-action@v3
if: env.COVERAGE == 'true' && env.TEST_TYPES != 'Helm'
if: env.ENABLE_COVERAGE == 'true' && env.TEST_TYPES != 'Helm'
with:
name: coverage-${{env.JOB_ID}}
flags: python-${{env.PYTHON_MAJOR_MINOR_VERSION}},${{env.BACKEND}}-${{env.BACKEND_VERSION}}
122 changes: 86 additions & 36 deletions .github/workflows/ci.yml

Large diffs are not rendered by default.

142 changes: 114 additions & 28 deletions BREEZE.rst
@@ -970,12 +970,14 @@ Here is the detailed set of options for the ``breeze testing`` command.
Iterate on tests interactively via ``shell`` command
....................................................
You can simply enter the ``breeze`` container and run ``pytest`` command there. You can enter the
container via just ``breeze`` command or ``breeze shell`` command (the latter has more options
useful when you run integration or system tests). This is the best way if you want to interactively
run selected tests and iterate with the tests. Once you enter ``breeze`` environment it is ready
out-of-the-box to run your tests by running the right ``pytest`` command (autocomplete should help
you with autocompleting test name if you start typing ``pytest tests<TAB>``).
You can simply enter the ``breeze`` container in an interactive shell (via ``breeze`` or the more comprehensive
``breeze shell`` command), or use your local virtualenv, and run the ``pytest`` command there.
This is the best way to interactively run selected tests and iterate on them.
The good thing about the ``breeze`` interactive shell is that it has all the dependencies needed to run all the tests,
and it has a running, configured backend database started for you when you decide to run DB tests.
It also has auto-complete enabled for the ``pytest`` command, so it should help you complete
the test name if you start typing ``pytest tests<TAB>``.
Here are few examples:
@@ -991,25 +993,30 @@ To run the whole test class:
pytest tests/core/test_core.py::TestCore
You can re-run the tests interactively, add extra parameters to pytest and modify the files before
You can re-run the tests interactively, add extra parameters to pytest and modify the files before
re-running the test to iterate over the tests. You can also add more flags when starting the
``breeze shell`` command when you run integration tests or system tests. Read more details about it
in the `testing doc <TESTING.rst>`_ where all the test types and information on how to run them are explained.
This applies to all kinds of tests - all our tests can be run using pytest.
Running unit tests
..................
Running unit tests with ``breeze testing`` commands
...................................................
An option you have is that you can also run tests via built-in ``breeze testing tests`` command - which
is a "swiss-army-knife" of unit testing with Breeze. This command has a lot of parameters and is very
flexible thus might be a bit overwhelming.
Another option is to run tests via the built-in ``breeze testing tests`` command.
The iterative ``pytest`` command allows you to run tests individually, by class, or in any other way
pytest supports, and to run them interactively, while the ``breeze testing tests`` command allows you to
run the tests in the same test "types" that are used to run the tests in CI: for example Core, Always,
API, Providers. This is how our CI runs them - running each group in parallel to other groups - and you can
replicate this behaviour.
In most cases, if you want to run tests you want to use the dedicated ``breeze testing db-tests``
or ``breeze testing non-db-tests`` commands, which automatically run groups of tests and allow you to choose
a subset of tests to run (with the ``--parallel-test-types`` flag).
Another interesting use of the ``breeze testing tests`` command is that you can easily specify sub-set of the
tests for Providers.
Using ``breeze testing tests`` command
......................................
The ``breeze testing tests`` command lets you easily specify a subset of the tests, including
selecting specific Providers tests to run.
For example this will only run provider tests for airbyte and http providers:
@@ -1025,7 +1032,6 @@ For example this will run tests for all providers except amazon and google provi
breeze testing tests --test-type "Providers[-amazon,google]"
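
To make the bracket syntax concrete, here is a minimal sketch of how such a test type string could be interpreted. It is not breeze's parser: the helper name `parse_providers_test_type` is hypothetical, the include form `Providers[airbyte,http]` is assumed by analogy with the exclude form shown above, and the rule that a leading `-` applies to the whole list is inferred from the "all providers except amazon and google" description.

```python
from __future__ import annotations

import re

# Hypothetical helper: illustrates the bracket syntax only, not breeze's parser.
_PROVIDERS_SPEC = re.compile(r"^Providers(?:\[(?P<body>[^\]]*)\])?$")


def parse_providers_test_type(spec: str) -> tuple[str, list[str]]:
    """Return (mode, providers) where mode is 'all', 'include' or 'exclude'."""
    match = _PROVIDERS_SPEC.match(spec)
    if match is None:
        raise ValueError(f"Not a Providers test type: {spec!r}")
    body = match.group("body")
    if not body:
        # Plain "Providers" (or empty brackets) selects every provider.
        return "all", []
    # Assumption: a leading "-" negates the whole comma-separated list.
    mode = "exclude" if body.startswith("-") else "include"
    names = [name.strip() for name in body.lstrip("-").split(",") if name.strip()]
    return mode, names


print(parse_providers_test_type("Providers[-amazon,google]"))
```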
You can also run parallel tests with the ``--run-in-parallel`` flag - by default it will run all test types
in parallel, but you can specify the test types that you want to run with a space-separated list of test
types passed to the ``--parallel-test-types`` flag.
@@ -1039,12 +1045,9 @@ For example this will run API and WWW tests in parallel:
There are few special types of tests that you can run:
* ``All`` - all tests are run in single pytest run.
* ``PlainAsserts`` - some of our tests fail when the ``--assert=rewrite`` feature of pytest is used, which
is enabled in order to get better output of ``assert`` statements. This is a special test type that runs those
select tests with the ``--assert=plain`` flag.
* ``Postgres`` - runs all tests that require Postgres database
* ``MySQL`` - runs all tests that require MySQL database
* ``Quarantine`` - runs all tests that are in quarantine (marked with ``@pytest.mark.quarantined``
* ``All-Postgres`` - runs all tests that require Postgres database
* ``All-MySQL`` - runs all tests that require MySQL database
* ``All-Quarantine`` - runs all tests that are in quarantine (marked with ``@pytest.mark.quarantined``
decorator)
Here is the detailed set of options for the ``breeze testing tests`` command.
@@ -1054,6 +1057,86 @@ Here is the detailed set of options for the ``breeze testing tests`` command.
:width: 100%
:alt: Breeze testing tests
Using ``breeze testing db-tests`` command
.........................................
The ``breeze testing db-tests`` command is a simplified version of the ``breeze testing tests`` command
that only allows you to run tests that are bound to a database - in parallel, utilising all your CPUs.
The DB-bound tests are the ones that require a database to be started and configured separately for
each test type run, and they are run in parallel containers/parallel docker compose projects to
utilise the multiple CPUs your machine has - thus allowing you to quickly run a few groups of tests in parallel.
This command is used in CI to run DB tests.
By default this command will run the complete set of test types we have, thus allowing you to see the result
of all DB tests we have, but you can choose a subset of test types to run with the ``--parallel-test-types``
flag or exclude some test types by specifying the ``--excluded-parallel-test-types`` flag.
Run all DB tests:
.. code-block:: bash
breeze testing db-tests
Only run DB tests from "API CLI WWW" test types:
.. code-block:: bash
breeze testing db-tests --parallel-test-types "API CLI WWW"
Run all DB tests excluding those in CLI and WWW test types:
.. code-block:: bash
breeze testing db-tests --excluded-parallel-test-types "CLI WWW"
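
How the two flags combine can be sketched as below. This is a simplified illustration under stated assumptions, not the breeze implementation: the function name `select_test_types` is hypothetical, both arguments are treated as space-separated strings as in the examples above, and applying the exclusion after the inclusion is an assumption.

```python
from __future__ import annotations


def select_test_types(
    all_test_types: list[str],
    parallel_test_types: str | None = None,
    excluded_parallel_test_types: str | None = None,
) -> list[str]:
    """Apply space-separated include and exclude lists to the full set of test types."""
    # No include list means "run the complete set".
    selected = parallel_test_types.split() if parallel_test_types else list(all_test_types)
    excluded = set(excluded_parallel_test_types.split()) if excluded_parallel_test_types else set()
    return [t for t in selected if t not in excluded]


# Hypothetical, shortened list of test types used only for this example.
ALL_TYPES = ["API", "CLI", "Core", "WWW"]
print(select_test_types(ALL_TYPES, parallel_test_types="API CLI WWW"))
print(select_test_types(ALL_TYPES, excluded_parallel_test_types="CLI WWW"))
```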
Here is the detailed set of options for the ``breeze testing db-tests`` command.
.. image:: ./images/breeze/output_testing_db-tests.svg
:target: https://raw.githubusercontent.com/apache/airflow/main/images/breeze/output_testing_db-tests.svg
:width: 100%
:alt: Breeze testing db-tests
Using ``breeze testing non-db-tests`` command
.............................................
The ``breeze testing non-db-tests`` command is a simplified version of the ``breeze testing tests`` command
that only allows you to run tests that are not bound to a database - in parallel, utilising all your CPUs.
The non-DB-bound tests are the ones that do not expect a database to be started and configured, so we can
utilise the multiple CPUs your machine has via the ``pytest-xdist`` plugin - thus allowing you to quickly
run a few groups of tests in parallel using a single container rather than many of them, as is the case for
DB-bound tests. This command is used in CI to run non-DB tests.
By default this command will run the complete set of test types we have, thus allowing you to see the result
of all non-DB tests we have, but you can choose a subset of test types to run with the ``--parallel-test-types``
flag or exclude some test types by specifying the ``--excluded-parallel-test-types`` flag.
Run all non-DB tests:
.. code-block:: bash
breeze testing non-db-tests
Only run non-DB tests from "API CLI WWW" test types:
.. code-block:: bash
breeze testing non-db-tests --parallel-test-types "API CLI WWW"
Run all non-DB tests excluding those in CLI and WWW test types:
.. code-block:: bash
breeze testing non-db-tests --excluded-parallel-test-types "CLI WWW"
Here is the detailed set of options for the ``breeze testing non-db-tests`` command.
.. image:: ./images/breeze/output_testing_non-db-tests.svg
:target: https://raw.githubusercontent.com/apache/airflow/main/images/breeze/output_testing_non-db-tests.svg
:width: 100%
:alt: Breeze testing non-db-tests
Running integration tests
.........................
@@ -1076,11 +1159,14 @@ Here is the detailed set of options for the ``breeze testing integration-tests``
:alt: Breeze testing integration-tests
Running Helm tests
..................
Running Helm unit tests
.......................
You can use Breeze to run all Helm tests. Those tests are run inside the breeze image as there are all
necessary tools installed there.
You can use Breeze to run all Helm unit tests. Those tests are run inside the breeze image, as all the
necessary tools are installed there. Those tests merely check whether our Helm chart renders
as expected when given a set of configuration parameters. The tests can be run in parallel if you have
multiple CPUs by specifying the ``--run-in-parallel`` flag - in which case they will run in separate containers
(one per helm-test package), in parallel.
.. image:: ./images/breeze/output_testing_helm-tests.svg
:target: https://raw.githubusercontent.com/apache/airflow/main/images/breeze/output_testing_helm-tests.svg
4 changes: 3 additions & 1 deletion CI.rst
@@ -394,7 +394,9 @@ This workflow is a regular workflow that performs all checks of Airflow code.
+---------------------------------+----------------------------------------------------------+----------+----------+-----------+-------------------+
| Tests airflow release commands | Tests if airflow release command works | - | Yes | Yes | - |
+---------------------------------+----------------------------------------------------------+----------+----------+-----------+-------------------+
| Tests (Backend/Python matrix) | Run the Pytest unit tests (Backend/Python matrix) | Yes | Yes | Yes | Yes (8) |
| Tests (Backend/Python matrix) | Run the Pytest unit DB tests (Backend/Python matrix) | Yes | Yes | Yes | Yes (8) |
+---------------------------------+----------------------------------------------------------+----------+----------+-----------+-------------------+
| No DB tests | Run the Pytest unit Non-DB tests (with pytest-xdist) | Yes | Yes | Yes | Yes (8) |
+---------------------------------+----------------------------------------------------------+----------+----------+-----------+-------------------+
| Integration tests | Runs integration tests (Postgres/Mysql) | Yes | Yes | Yes | Yes (9) |
+---------------------------------+----------------------------------------------------------+----------+----------+-----------+-------------------+
2 changes: 1 addition & 1 deletion CI_DIAGRAMS.md
@@ -189,7 +189,7 @@ sequenceDiagram
par
opt
GitHub Registry ->> Tests: Pull CI Images<br>[COMMIT_SHA]
Note over Tests: Unit Tests<br>Python/DB matrix
Note over Tests: Unit Tests<br>Python/DB matrix/No DB
end
and
opt