Refactor tests to use pytest as a test runner for all the packages #732

Merged
merged 18 commits into from
Mar 1, 2023

@delatrie delatrie commented Feb 23, 2023

This PR introduces a new testing scheme for the repo. The idea is to use pytest as the main testing framework and express all existing tests in its terms.

Table of contents

Brief usage examples
    Package installation
    Testing
    Collect allure results
    Linting
Motivation
    Current state
        Constraints
            Tests diversity and reusability issues
            Poor examples
            Running tests
            Incompatibility between allure-pytest and allure-pytest-bdd
New testing scheme
    The tests folder structure
    Examples
Other changes
    Linting
    Dependency management
    Dynamic parameter examples
    Allure test result flushing
    User experience improvements
    Bug fixes
    Test fixes and optimizations
    Other minor fixes
    Documentation fixes

Brief usage examples

All commands are executed from the root of the repo.

Package installation

You may install all packages in editable mode or only one/some of them depending on your needs.

Install all packages

Install all allure-python packages in editable mode and their dependencies (including testing and linting dependencies) with this single command:

pip install -r requirements/all.txt

Install single package

The sequence remains the same as before:

pip install -e allure-commons
pip install -e allure-commons-test
pip install -e allure-<framework>
pip install -r allure-<framework>/requirements.txt

Example: installing allure-pytest

pip install -e allure-commons
pip install -e allure-commons-test
pip install -e allure-pytest
pip install -r allure-pytest/requirements.txt

Testing

Below are some examples of how to run tests.

Run all tests

With poethepoet:

poe tests

With pytest directly:

pytest

Test single package

With poethepoet:

cd allure-<framework>
poe tests

With pytest directly:

pytest tests/allure_<framework>

Example: test allure-pytest

cd allure-pytest
poe tests

or

pytest tests/allure_pytest

Collect allure results

If you want to run tests with allure enabled, use the allure-collect task:

poe allure-collect

Generate and view the report

Use the allure-generate task to create an allure report from the previously collected allure results:

poe allure-generate

Use the allure-open task to serve the generated report:

poe allure-open

You can navigate to the report using a browser (if it didn't open automatically).

You can run all tests with allure, generate and open the report using this one-liner:

poe allure-collect; poe allure-generate && poe allure-open

ℹ️ allure-generate and allure-open require you to have allure installed. You may download it here.

Linting

Use the linter task to run static checks against the entire codebase:

poe linter

Or against a single package:

cd allure-<framework>
poe linter

E.g., for allure-pytest this would be:

cd allure-pytest
poe linter

Motivation

We have some long-running issues with tests in this repository. To better understand them, let's first describe the current testing scheme.

Current state

Before the change, each framework integration package had its own end-to-end tests with the following characteristics:

  • these tests are written using the target framework itself (i.e., tests on allure-behave are written using behave, tests on allure-robotframework - using robotframework, etc.)
  • the tests are contained within the package, in the tests folder (features in case of allure-behave).

Constraints

While this makes packages self-contained, the cons are quite notable:

  • testing code is hard to reuse
  • tests are more diverse and harder to maintain than they would be if written using a single framework
  • some examples are hidden inside tests and thus are hard to follow
  • it's hard to run all tests before, e.g., committing to the repo.

Tests diversity and reusability issues

To test an allure integration with a framework you need a test runner to invoke the target framework and some code to assert the results.

Currently, we have integrations with five testing frameworks: behave, nose2, pytest, pytest-bdd and robot framework and, accordingly, behave tests, nose2 tests, pytest tests, pytest-bdd tests and robot framework tests. This results in five different test runner implementations, hence, duplicated and more complex test logic. Also, it's impossible to reuse, say, the code of behave tests in a test that uses robot framework and vice versa.

Additionally, tests folders are not included in the package distributions and thus cannot be imported from other modules. Unlike, say, in Java, in Python we can't conditionally make package subfolders resolvable from other places. The only option here is to move tests to a separate package.

There is no strong reason to have this diversity and distribution of tests across packages in the first place. From a maintainer's perspective, it's much easier to have all tests written using the same framework and contained in a single separate package, so that they share as much code as possible.

Poor examples

Actually, we don't have allure-behave examples at all. What we have in the allure-behave/features folder is a description of how to test allure-behave. The actual example of usage is just one part of it:

Feature: Label

    Scenario: Scenario label
    Given feature definition
        """
        Feature: Step status
          @allure.label.owner:me
          Scenario: Scenario with passed step
              Given simple passed step
        """
     When I run behave with allure formatter
     Then allure report has a scenario with name "Scenario with passed step"
      And scenario has "owner" label with value "me"

Here the example is provided as a step description and the rest of the file is test info on how to check if this example is correct.

Some feature files don't include an example of how to use allure with behave at all. They only describe how exactly allure should support the framework and contain no allure API usage inside.

The same applies to allure-pytest-bdd's examples. The allure-robotframework examples look better but lack textual descriptions (they mostly contain code blocks only).

Examples should be written as human readable textual documents with code blocks inside. Test descriptions should go separately.

Running tests

To test all packages, one had to do the following:

  1. Install allure-python-commons and allure-python-commons-test packages.
  2. Install other packages.
  3. Install dependencies for package tests.
  4. Change the current directory to the package's directory.
  5. Run tests. Optionally, run the linter.
  6. Repeat steps 4 and 5 for all other packages.

Ideally, this should be doable in two steps:

  1. Install all dependencies
  2. Run all tests

Incompatibility between allure-pytest and allure-pytest-bdd

There is a known issue (#109) that prevents allure-pytest and allure-pytest-bdd from running simultaneously in the same pytest session. That means we cannot test either of the plugins if both are installed at the same time, even if we use the -p pytest option to disable one of them, because the -p option doesn't propagate to a nested pytest session.
Another option would be the PYTEST_DISABLE_PLUGIN_AUTOLOAD environment variable, but we can't use that because some tests check allure-pytest's support of 3rd party pytest plugins that need to be loaded.

Ideally, this should be fixed in allure-pytest-bdd itself, and I have plans to do that. But for now, there should be a way to explicitly list the pytest plugins required by a test.
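To make the idea of an explicit plugin list concrete, here is a minimal sketch that builds the command line and environment for a nested pytest run. The -p option and the PYTEST_DISABLE_PLUGIN_AUTOLOAD variable are real pytest features; the helper name nested_pytest_invocation and the plugin module name passed to it are assumptions made for illustration, not code from this PR.

```python
import os
import sys
from typing import Dict, List, Optional, Sequence, Tuple


def nested_pytest_invocation(
    plugins: Sequence[str],
    test_path: str,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build argv and env for a nested pytest run with explicit plugins only.

    Hypothetical helper: plugin autoloading is switched off, then every
    plugin the test needs is enabled explicitly with "-p".
    """
    env = dict(os.environ if base_env is None else base_env)
    # Stops pytest from loading installed plugins through entry points.
    env["PYTEST_DISABLE_PLUGIN_AUTOLOAD"] = "1"

    argv = [sys.executable, "-m", "pytest"]
    for name in plugins:
        # "-p name" asks pytest to load the named plugin early.
        argv += ["-p", name]
    argv.append(test_path)
    return argv, env


# The plugin module name below is an assumption used for illustration.
argv, env = nested_pytest_invocation(
    ["allure_pytest.plugin"], "test_sample.py", base_env={}
)
```

The resulting pair could be handed to subprocess.run(argv, env=env). Because the plugin list is an argument, a test on allure-pytest-bdd and a test on allure-pytest can each request only what they need.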

New testing scheme

All tests were rewritten using pytest. They were moved from allure-<framework> folders to tests/allure_<framework>. Therefore, the tests folder now contains all tests.

The tests folder structure

The tests folder structure now looks like this:

tests
├── allure_behave
│   ├── acceptance
│   │   ├── allure_api
│   │   └── behave_support
│   ├── defects
│   ├── behave_runner.py
│   └── conftest.py
├── allure_nose2
│   ├── acceptance
│   │   ├── allure_api
│   │   └── nose2_support
│   ├── conftest.py
│   └── nose2_runner.py
├── allure_pytest
│   ├── acceptance
│   ├── externals
│   ├── conftest.py
│   └── pytest_runner.py
├── allure_pytest_bdd
│   ├── acceptance
│   └── conftest.py
├── allure_robotframework
│   ├── acceptance
│   │   ├── allure_api
│   │   └── robotframework_support
│   ├── conftest.py
│   └── robot_runner.py
├── conftest.py
└── e2e.py

The first level

On the first level we have framework-specific test folders, a high-level conftest.py and the e2e module. Potentially, there could be tests on cross-package functionality on that level as well (i.e., compatibility tests).

tests/e2e.py

This module contains some functions and classes that come in handy for end-to-end tests. Historically, those are almost the only tests we have (e.g., the repo originally contained no unit tests), and they share some common patterns that allow us to reuse lots of code. See the docstrings on the functions and classes themselves to better understand what they can be used for.

tests/conftest.py

This is the high-level conftest.py of our pytest setup. Its job is to enable pytester and to declare commonly used fixtures: rst_examples and docstring.

Framework-specific test folders

A framework-specific test folder mainly consists of tests on allure integration with the framework. The tests are grouped into nested folders depending on the area they test, their granularity, purpose, etc. I used the following categories on the highest level of separation:

  • acceptance - for end-to-end tests, i.e., tests to ensure that allure fully and correctly supports a corresponding framework. Usually, those tests have doc examples or another form of human readable description associated with them. I further divided acceptance tests into framework support tests (on whether allure correctly translates framework constructs into allure ones) and allure API tests (whether allure API is accessible to a framework user and produces the expected results).
  • defects - for bug reproduction tests.
  • externals - for tests on allure compatibility with 3rd party packages (e.g., pytest plugins).

Framework-specific test runners

Each allure integration has its own test runner (except allure-pytest-bdd; it uses allure-pytest's one) and a fixture associated with it. Each runner inherits from a common base class and implements how exactly the framework should be executed. It also provides a simple way to specify test description (depending on the framework it could be a path to a file, a string or an ID of a code block in an example file or any combination of all these options).

Refer to each test runner's docstrings for the capabilities it provides.
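The relationship between the common base class and a framework-specific runner can be sketched roughly as follows. This is only an illustration of the pattern described above: the names BaseRunner, EchoRunner and _run_framework are hypothetical and do not mirror the actual classes in tests/e2e.py.

```python
import abc
from typing import Any, Dict, List


class BaseRunner(abc.ABC):
    """Hypothetical common base: shared bookkeeping lives here."""

    def __init__(self) -> None:
        self.allure_results: List[Dict[str, Any]] = []

    def run(self, *sources: str) -> List[Dict[str, Any]]:
        # Template method: reset the state, delegate the framework call,
        # then expose the collected results to the test's assertions.
        self.allure_results = []
        for source in sources:
            self.allure_results.extend(self._run_framework(source))
        return self.allure_results

    @abc.abstractmethod
    def _run_framework(self, source: str) -> List[Dict[str, Any]]:
        """Execute the target framework against one test description."""


class EchoRunner(BaseRunner):
    """Stand-in for a real framework runner; it just echoes its input."""

    def _run_framework(self, source: str) -> List[Dict[str, Any]]:
        return [{"name": source, "status": "passed"}]


runner = EchoRunner()
results = runner.run("first.feature", "second.feature")
```

A real runner would replace the body of _run_framework with a call into behave, nose2, pytest or Robot Framework, while the bookkeeping in run stays shared.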

Framework-specific conftest.py

These conftest.py files declare additional fixtures for tests to use. Typically, they are test runner fixtures.

Examples

Examples were rewritten as reStructuredText (.rst) documents with code blocks inside. Each code block has an ID assigned to it. A test may refer to that ID to access the code block content and use it as an input for a framework run.

Here is a fragment of a document, describing how to assign an allure label to a behave test:

It's possible to add a custom label to a behave scenario. Simply apply :code:`@allure.label.<name>:<value>` tag to your scenario, e.g.:

..  code:: gherkin
    :name: label-feature

    Feature: Allure label for behave tests
        @allure.label.author:John-Doe
        Scenario: Scenario marked with an author label
            Given noop

The document is human readable. Also, there is a test that checks whether the example is correct:

def test_label_from_feature_file(behave_runner: AllureBehaveRunner):
    behave_runner.run_behave(
        feature_rst_ids=["label-feature"],
        step_literals=["given('noop')(lambda c:None)"]
    )
    assert_that(
        behave_runner.allure_results,
        has_test_case(
            "Scenario marked with an author label",
            with_status("passed"),
            has_label("author", "John-Doe")
        )
    )

The test takes the feature file from the example using the code block ID and provides the step definition as an inline string.
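For readers curious how a code block could be looked up by its ID, here is a simplified, standard-library-only sketch. It is an assumption about the general technique, not the repo's implementation (which may well rely on a proper reStructuredText parser); the function name named_code_blocks is hypothetical, and only the code and code-block directives with a :name: option are handled.

```python
import re
import textwrap
from typing import Dict, List, Optional

DIRECTIVE = re.compile(r"^\.\.\s+code(?:-block)?::")
NAME_OPTION = re.compile(r"^\s+:name:\s*(\S+)\s*$")


def named_code_blocks(rst: str) -> Dict[str, str]:
    """Map each code block's :name: to its dedented content."""
    blocks: Dict[str, str] = {}
    lines = rst.splitlines()
    i = 0
    while i < len(lines):
        if not DIRECTIVE.match(lines[i]):
            i += 1
            continue
        i += 1
        name: Optional[str] = None
        # Option lines (":key: value") directly follow the directive.
        while i < len(lines) and lines[i].strip().startswith(":"):
            match = NAME_OPTION.match(lines[i])
            if match:
                name = match.group(1)
            i += 1
        # The body is every following blank or indented line.
        body: List[str] = []
        while i < len(lines) and (not lines[i].strip() or lines[i][0].isspace()):
            body.append(lines[i])
            i += 1
        if name is not None:
            blocks[name] = textwrap.dedent("\n".join(body)).strip("\n")
    return blocks


EXAMPLE = """\
Simply apply a tag to your scenario, e.g.:

..  code:: gherkin
    :name: label-feature

    Feature: Allure label for behave tests
        @allure.label.author:John-Doe
        Scenario: Scenario marked with an author label
            Given noop

Some trailing paragraph.
"""

feature_text = named_code_blocks(EXAMPLE)["label-feature"]
```

A runner could then write feature_text to a temporary .feature file and pass it to the framework.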

Other changes

The PR also contains lots of minor changes.

Linting

Previously, linting was performed on a per-package basis. That isn't necessary: linting is fast compared to building and testing, and it performs only static checks, so it requires no dependencies to be installed and cannot run into dependency conflicts.

You can still lint on a per-package basis if you wish, but now there is a more general way to lint the entire code base with a single poe linter command.

The build allure python workflow was rewritten to use this way of linting instead of creating a full-blown package matrix and linting each package separately. Dependencies installation was removed (linting doesn't require dependencies).

Dependency management

Dependency management wasn't the main concern during this work. We definitely have more to do to ease project management but that will be later. The PR contains some improvements though.

Dependencies can now be installed using the following requirement files:

  • requirements/core.txt: dependencies required to manage the repository (currently, poethepoet only).
  • requirements/commons.txt: allure_commons and allure_commons_test in editable mode.
  • requirements/linting.txt: linting dependencies only.
  • requirements/testing.txt: common testing dependencies.
  • requirements/testing/allure-.txt: additional package-specific testing dependencies
  • allure-<framework>/requirements.txt: testing and linting dependencies specific for the package
  • requirements/all.txt: all packages in editable mode, and all dependencies required to test and lint them.

See examples above.

Dynamic parameter examples

This was originally in PR #728. It was moved into this PR to accommodate layout changes.
Closes #727
Obsoletes #728

Allure test result flushing

Allure test result closing was moved from the pytest_runtest_logfinish hook to pytest_runtest_protocol. This makes more sense because this is the last hook in the lifecycle (see the hookspec for more info on that). The only practical implication of this I can think of though is when a test fixture calls pytest.exit. The pytest_runtest_logfinish is not executed in that case, previously leaving the test result unreported.

User experience improvements

The PR contains the following changes that affect user experience:

  • [core]: The allure_commons.mapping.parse_tag function is now less strict: any characters except : and = are now allowed between the last period of a tag and its value separator. Previously, only word characters were allowed. This affects robotframework and behave tags.
  • [allure-pytest]: A test now is reported as skipped if one of its fixtures calls pytest.exit.
  • [allure-robotframework]: General allure tag syntax in form allure.<link-type>[.<link-name>]:<value> is now supported for link tags in a Robot Framework test case file (see examples).
  • [allure-robotframework]: An allure testplan entry now may omit either id or selector. Previously, both properties were mandatory.
  • [allure-pytest]: If --clean-alluredir is specified, all content is now removed from the allure directory. Previously, directories were not deleted. Typically, those were history directories copied from the previous allure report (closes #470, "clean-alluredir command line argument removes only files from alluredir root").
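The relaxed tag rule from the [core] item above can be illustrated with a small sketch. This is not the actual allure_commons.mapping.parse_tag code; the function name split_tag and its return shape are assumptions chosen to show the rule: everything between the last period and the first : or = is accepted as the name, whatever characters it contains.

```python
import re
from typing import Optional, Tuple

SEPARATOR = re.compile(r"[:=]")


def split_tag(tag: str) -> Optional[Tuple[str, str, str]]:
    """Split 'prefix.name<sep>value' into (prefix, name, value).

    Hypothetical helper: returns None when the tag has no value
    separator or no period before it.
    """
    match = SEPARATOR.search(tag)
    if match is None:
        return None
    head, value = tag[: match.start()], tag[match.end():]
    # The name is whatever follows the last period of the head, so it
    # may contain hyphens or other non-word characters.
    prefix, dot, name = head.rpartition(".")
    if not dot or not name:
        return None
    return prefix, name, value


parsed = split_tag("allure.label.code-owner:me")
```

Under a word-characters-only rule the hyphen in code-owner would have prevented a match; here the name is taken as is.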

Bug fixes

The following bugs or errors were fixed:

  • [core]: The allure_commons.types.LinkType.TEST_CASE enum had an invalid value, "testcase". It was changed to "tms", as expected by the allure reporter (closes #448, "Link of type test_case not recognized as proper type in generated report").
  • [allure-robotframework]: Allure links specified using robot framework tags in form <link type>:[<link text>]<URL> (e.g., link:[homepage]https://qameta.io) were parsed incorrectly if the link text contains characters from outer parts of the URL due to incorrect usage of str.strip.
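The str.strip pitfall behind the link parsing bug is easy to reproduce. The snippet below is an illustration, not the original allure-robotframework code: it shows how str.strip treats its argument as a set of characters rather than as a prefix or suffix, and then sketches one possible regex-based alternative. The name parse_link_value and the assumption that the bracketed link text is optional are mine.

```python
import re
from typing import Optional, Tuple

# Optional "[text]" prefix followed by a non-empty URL.
LINK_VALUE = re.compile(r"^(?:\[(?P<text>[^\]]*)\])?(?P<url>.+)$")


def parse_link_value(value: str) -> Tuple[Optional[str], str]:
    """Split '[<link text>]<URL>' into (text, url); text may be absent."""
    match = LINK_VALUE.match(value)
    if match is None:
        raise ValueError(f"not a link value: {value!r}")
    return match.group("text"), match.group("url")


value = "[homepage]https://qameta.io"

# str.strip removes any of the given characters from both ends, so it
# also eats the leading "h" and the trailing "o" of the URL.
broken = value.strip("[homepage]")

text, url = parse_link_value(value)
```

Here broken ends up as "ttps://qameta.i", while parse_link_value returns the text and the URL intact.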

Test fixes and optimizations

The PR contains the following fixes and improvements:

  • [allure-nose2, allure-pytest]: Unnecessary parametrization was removed from some tests.
  • [allure-behave, allure-pytest]: Unicode tests are obsolete and cover no logic now since python 2 support was dropped. These tests were removed.
  • [allure-pytest]: The following tests were fixed and got their skip marks removed:
    • tests/allure_python/acceptance/duration/duration_time_test.py::test_duration[exit] (original)
    • tests/allure_python/acceptance/duration/duration_time_test.py::test_with_fixture_duration[exit] (original)
    • tests/allure_python/acceptance/duration/duration_time_test.py::test_with_fixture_finalizer_duration[exit] (original)
    • tests/allure_python/acceptance/labels/suite/default_suite_test.py::test_default_suite (original)
    • tests/allure_python/acceptance/labels/suite/default_suite_test.py::test_default_class_suite (original)
    • tests/allure_python/acceptance/step/step_placeholder_test.py::test_args_less_than_placeholders (original)
  • [allure-pytest]: The following tests were fixed and got their xfail marks removed:
    • tests/allure_python/acceptance/parametrization/parametrization_test.py::test_parametrization_with_ids (original)
    • tests/allure_python/acceptance/parametrization/parametrization_test.py::test_parametrization_decorators_with_partial_ids (original)
  • [allure-pytest]: The tests/allure_pytest/acceptance/step/test_step_with_several_step_inside_thread.py::test_step_with_reused_threads test (original) used the time.sleep function to interleave the allure steps it creates. This led to a massive time overhead. The steps are now interleaved using threading.Event, which made the test 30-40 times faster.
  • [allure-pytest]: Allure-results from doctest examples are now cached and can be reused by another test in a module. This reduces the test execution time by approximately 10%. Caching must be enabled explicitly on a per-test basis to prevent subtle test errors.
  • [allure-pytest]: The --log-cli-level pytest option was replaced with more common --log-level in log capture tests.
  • [allure-pytest]: The executed_docstring_source and executed_docstring_path fixtures were removed. The "act" test phase was moved inside test functions (a fixture should ideally take care of the "arrange" phase only).
  • [allure-pytest]: Less-than assertions were added to duration tests to make stop time testing more expressive.
  • [allure-pytest]: The tests/allure_pytest/acceptance/fixture/fixture_test.py::test_one_fixture_on_two_tests (originally) test was fixed: the second assertion is now actually executed.
  • [allure-pytest]: The test files step_parameters.py and custom_label.py were renamed to step_parameters_test.py and custom_label_test.py and can now be collected by pytest as intended.
  • [allure-pytest]: The duplicated parameter set was removed from tests/allure_pytest/acceptance/step/step_parameters_test.py::test_step_parameters (originally).
  • [allure-pytest]: The parameter set with the None testplan was removed from the tests/allure_pytest/acceptance/testplan/select_test_from_testplan_test.py::test_select_by_testcase_id_test (originally) test as redundant. Almost any other test is a None-testplan test.
  • [allure-pytest]: The tests/allure_pytest/acceptance/testplan/select_test_from_testplan_test.py::test_select_by_testcase_id_test (originally) test was rewritten to not depend upon pytest's basetemp folder. Previously, the basetemp folder had to be nested inside the allure-pytest folder.
  • [allure-pytest, allure-pytest-bdd]: Plugins are now not loaded automatically when a nested pytest session is created. All required plugins must be explicitly specified by a test in one of its fixtures.
  • [allure-robotframework]: Test results now contain no duplicated tags. Previously, all allure tags were added twice.
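The switch from time.sleep to threading.Event in the thread-reuse test mentioned above follows a general pattern that can be sketched as below. This is not the test from the PR: the worker functions and log messages are hypothetical, and the list stands in for the allure steps. The point is that events force a fixed interleaving without waiting for wall-clock time to pass.

```python
import threading
from typing import List

log: List[str] = []
first_started = threading.Event()
second_done = threading.Event()


def first_worker() -> None:
    log.append("first: start")
    first_started.set()          # let the second worker proceed
    second_done.wait(timeout=5)  # block until the second worker is done
    log.append("first: stop")


def second_worker() -> None:
    first_started.wait(timeout=5)  # never run before the first one starts
    log.append("second: start")
    log.append("second: stop")
    second_done.set()


# Start order does not matter: the events enforce the interleaving.
threads = [threading.Thread(target=f) for f in (second_worker, first_worker)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Each wait returns as soon as the matching set is called, so the second worker's entries always land between the first worker's start and stop entries, with no sleeping involved.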

Other minor fixes

The PR contains the following small changes:

  • [allure-pytest]: The typo in duration_time_test.py was fixed.
  • [allure-pytest]: The allure_pytest.utils.escape_name function was removed as it does nothing after parameters were removed from a test full name (closes #280, "Allure throws UnicodeDecodeError when recieve \\u in test parameter value").
  • [allure-pytest]: Parameter set ids were assigned to some parametrized tests.
  • [allure-pytest]: The select_test_from_testplan_test.py and pytest_get_allure_plugin_test.py tests were moved to acceptance as they check the pytest support by allure.
  • [allure-pytest]: Exception-based logic was replaced with precondition checks in the allure_pytest.utils.allure_title function and in the allure_commons.logger.AllureFileLogger constructor.

Documentation fixes

The following changes were made in the documentation:

  • Invalid badge links in the packages' README files were fixed.
  • The separator between the listener and its argument was changed from the semicolon (;) to colon (:) in allure-robotframework's README as noted here (closes #427, "Documentation error in allure-robot framework.").
  • Links to allure-behave, allure-pytest and allure-robotframework examples were added to the main README and packages' README files.

The tests directory follows the following pattern:
  <package>/<test-type>
Where a test type could be an acceptance test (i.e., end-to-end test
on the allure integration itself),
a unit test, a test on a defect, a test on an external library support,
etc. Acceptance tests could be further divided into two groups: tests
to ensure the user API implementation is correct, and tests that check
if allure supports the framework itself properly.

Also contains the following changes:
  - all: refactor test execution logic into the abstract runner class
  - allure_pytest: fix typo in duration_time_test.py
  - allure_pytest: fix and enable skipped and xfail tests
  - allure_pytest, allure_nose2: remove unnecessary parametrization
    from some tests
  - allure_pytest: change --log-cli-level to more common --log-level
    to test log capturing
  - allure_pytest, allure_pytest_bdd: move act phase from fixtures to
    tests
  - allure_pytest: add less_than assertion to the duration tests
  - allure_pytest: fix the second assertion on test_one_fixture_on_two_tests
  - allure_pytest: add ids to some parametrized tests
  - allure_pytest: fix test file names for them to be collected by pytest
  - allure_pytest: remove duplicated parameter set from test_step_parameters
  - allure_pytest: remove unnecessary test on missing testplan
  - allure_python: remove unicode tests (obsolete since py2 support dropped)
  - all: disable pytest automatic plugin loading
  - allure_pytest: move testplan and pluginmanager tests to acceptance group
  - add test folders to change collection filters
  - change the linting job to check all the code base at once
  - rename the build job to test
  - remove unnecessary deps installation from lint & test jobs
@delatrie delatrie added theme:core type:enhancement type:documentation Improvements or additions to documentation labels Feb 24, 2023
@delatrie delatrie self-assigned this Feb 27, 2023
@delatrie delatrie marked this pull request as ready for review February 28, 2023 06:48
@delatrie delatrie removed the request for review from sseliverstov March 1, 2023 07:50
@delatrie delatrie merged commit 86cff25 into master Mar 1, 2023
@delatrie delatrie deleted the test-unification branch March 1, 2023 08:00