[RFC] Stripping down the test set (i.e. make the CI go brrrrr) #3238

Open
@GaetanLepage

Motivation

Our test suite is quite exhaustive, but it is large and slow to evaluate and run (especially from scratch).
Fortunately, there is a lot of redundancy, and we can surely lighten it significantly.

General idea

  • Establish, design and implement a criterion for marking a test as platform-agnostic. Such tests would not need to run on all four platforms.
  • Refactor our test set to maximize the number of such tests:
    • Remove the platform-specificity from most tests,
    • Create extra (but way fewer) tests to account for what has been removed.

Background

Which tests are targeted by this RFC

Let us decompose our huge test set $T$ into two categories:

  • $T_\text{custom}$ ("top-level" tests): Everything that is not automatically generated from test-sources/.
  • $T_\text{config}$ ("config" tests): Everything under test-sources/ that is automatically processed as a configuration test (i.e., we run nvim with this config).

Let's leave $T_\text{custom}$ aside. There aren't too many tests in this group and it makes sense to have them run on all platforms.

Let's instead focus on $T_\text{config}$.

Platform-agnostic vs. Platform-specific tests

A nixvim configuration is said to be platform-agnostic (PA) if it has no platform-specific behavior, i.e.:


Definition: A configuration where extraPackages, extraPlugins, extraPythonPackages... are empty.
(Note: this definition is probably incomplete)
Conversely, tests that are not PA will be referred to as platform-specific (PS).


Of course such tests still rely on neovim, which is platform-specific, but we can make the assumption that neovim is working on all platforms (it is extensively tested elsewhere).
Also, nothing prevents a test from having "platform-specific" Lua code in the extraConfigLua option, which would technically make it PS.
I will assume that such code is rare, or at least carries very little risk.
These purely platform-agnostic tests can be limited to running exclusively on our primary platform (most likely x86_64-linux).
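
As a minimal sketch of the definition above, the PA check could be written as a predicate over a raw test attribute set. The helper name `isPlatformAgnostic` and the `platformSpecificOptions` list are hypothetical and not part of nixvim. The sketch also assumes each option is a plain list, which is a simplification, and it inherits the incompleteness of the definition.

```nix
# Hypothetical helper, not part of nixvim: decides whether a raw test
# configuration is platform-agnostic (PA) according to the definition above.
let
  # Options whose non-empty value pulls in platform-specific derivations.
  # This list mirrors the (admittedly incomplete) definition above.
  platformSpecificOptions = [
    "extraPackages"
    "extraPlugins"
    "extraPythonPackages"
  ];

  # A config is PA when every option listed above is unset or an empty list.
  isPlatformAgnostic =
    config: builtins.all (name: (config.${name} or [ ]) == [ ]) platformSpecificOptions;
in
{
  # Only Lua configuration: evaluates to true.
  agnostic = isPlatformAgnostic { extraConfigLua = "vim.opt.number = true"; };
  # Pulls in a package (a string stands in for a derivation): evaluates to false.
  specific = isPlatformAgnostic { extraPackages = [ "ripgrep" ]; };
}
```

The expression only uses builtins, so it can be evaluated on its own with `nix eval --file`.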

Factorization of the test set

Currently, most of our tests are technically PS.
However, the main idea of this proposal relies on two properties:

  1. Each test $t \in T_\text{config}$ can be split into two sub-tests: $t_{PS}$ (a platform-specific component) and $t_{PA}$ (a platform-agnostic component).
  2. We can easily build a few macro tests that factorize all $t_{PS}$ components of the tests in $T_\text{config}$.
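
To give a rough idea of the gain, here is a back-of-the-envelope count of test runs under these two properties. The symbols $N$ (number of platforms, four here) and $k$ (number of macro tests) are introduced only for this illustration. The count assumes the best case, where every config test currently runs on every platform and every test can be split as described:

$$
\underbrace{N \cdot |T_\text{config}|}_{\text{today}}
\quad\longrightarrow\quad
\underbrace{|T_\text{config}| + N \cdot k}_{\text{after factorization}},
\qquad k \ll |T_\text{config}|
$$

Each $t_{PA}$ runs once on the primary platform, and only the $k$ macro tests still run on all $N$ platforms.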

I will illustrate this with the example of plugin tests.
By default, plugin tests are PS.
However, a plugin test checks three things:

  1. The plugin can be installed in the wrapper on all platforms.
    -> PS (a package/plugin can be broken on a specific platform)
  2. This configuration can be evaluated (i.e., options exist and have valid, correctly typed values)
    -> PA, except for evaluating the required packages/plugins, which is PS (see 1.)
  3. The Lua configuration is valid. Neovim starts without error with this config.
    -> Although theoretically PS, it can be assumed to be PA

Having a single allPluginPackages test that installs all plugin packages that we support (similarly to our already existing modules-dependencies-all test) could single-handedly account for testing all plugins for property (1).

Assuming that such a test exists, all plugin tests can then be assumed to be PA.
Hence, they can all be marked as such and run on a single platform.
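
A minimal sketch of how such a macro test could be assembled, assuming each plugin module exposes its package under `package.default`. The `pluginOptions` attribute set is a hand-written stand-in for the evaluated options tree, and strings stand in for derivations so that the expression evaluates on its own. The real collection logic would likely need to handle nested modules and plugins with several packages.

```nix
let
  # Stand-in for the evaluated nixvim `options.plugins` tree (assumption).
  # Real packages are derivations; strings keep the sketch self-contained.
  pluginOptions = {
    telescope.package.default = "telescope-nvim";
    lualine.package.default = "lualine-nvim";
    # A plugin module without a `package` option must be skipped.
    noPackage.enable.default = false;
  };

  # Keep only plugins exposing `package.default`, then extract it.
  collectedPackages = builtins.map (opts: opts.package.default) (
    builtins.filter (opts: opts ? package.default) (builtins.attrValues pluginOptions)
  );
in
{
  # Shape of the hypothetical macro test: one config installing everything.
  allPluginPackages = {
    extraPlugins = collectedPackages;
  };
}
```

Because this single test carries every package, it is the only plugin-related test that would still need to run on all platforms.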

Implementation

  • Introduce a tests.platformSpecific (boolean) flag to each test config that encodes the PS/PA property.
    Maybe, this should be opt-in (true by default) to prevent a
  • Add an allPluginPackages test.
    We could collect all options.plugins.*.package.default items and add them to the extraPlugins of this test.
    This is the general idea. In practice, a more cautious search might be necessary to effectively collect all plugin packages across the Nixvim [sub-]modules.
  • Automatically mark the following tests as PA:
    • Tests that already are PA (no extraPackages & co)
    • Tests whose PS effects are already tested elsewhere
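
The flag and the per-platform selection could look roughly like the sketch below. The test names, the `runsOn` and `testsFor` helpers, and the choice of x86_64-linux as the primary platform are assumptions made for illustration. Only the `tests.platformSpecific` flag and its default of true come from the proposal itself.

```nix
let
  # Assumed primary platform on which PA tests run.
  primarySystem = "x86_64-linux";

  # Hypothetical test set using the proposed `tests.platformSpecific` flag.
  testConfigs = {
    all-plugin-packages = { tests.platformSpecific = true; };
    plugin-foo = { tests.platformSpecific = false; };
    # No flag set: falls back to the default (true), i.e. treated as PS.
    legacy-test = { };
  };

  # Should this test run on `system`? PS tests run everywhere,
  # PA tests only on the primary platform.
  runsOn = system: cfg: (cfg.tests.platformSpecific or true) || system == primarySystem;

  # Names of the tests selected for a given system.
  testsFor =
    system: builtins.filter (name: runsOn system testConfigs.${name}) (builtins.attrNames testConfigs);
in
{
  # All three tests are selected on the primary platform.
  linux = testsFor "x86_64-linux";
  # Only the PS tests (explicit or by default) are selected elsewhere.
  darwin = testsFor "aarch64-darwin";
}
```

Defaulting to true keeps unmarked tests running everywhere, so a test only leaves the other platforms when someone explicitly marks it as PA.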

Conclusion

This proposal could drastically reduce the CI load on nix-community's infrastructure.
The darwin tests are the most problematic ones.
They take far longer to run than the linux ones, and the Mac mini that runs our darwin CI often gets overwhelmed.

In terms of drawbacks and limitations, I can think of two:

  1. The PS and PA qualification of the different tests relies on some assumptions.
    While my intuition is that performing the aforementioned factorization would not effectively weaken our current test coverage, it is important to think carefully about potential flaws caused by these assumptions.
  2. Implementing this logic is not trivial and will inevitably complicate the test creation code.
