Skip to content

Maintenance: improve continuous code quality automation #2123

Open
@dreamorosi

Description

Summary

Currently we are running only limited code quality checks before merging a PR (linting and unit tests), while integration tests are still ran manually by maintainer either as a part of a PR review or before a release.

This is error prone and increases the risk of issues slipping into the codebase.

This issue came out of a discussion under #2072.

Why is this needed?

So that we can have code quality checks on the code merged on main and increase confidence on what's in it.

Which area does this relate to?

Automation, Tests

Solution

We should create a new workflow called "Code Quality" with trigger to be defined with several steps:

  • reusable "Run unit tests" workflow (to be renamed and/or split to reflect that it runs also linting)
  • integration/e2e tests
  • in the future fuzzying

The linting and unit tests complete within ~2 minutes while the integration tests take 8-9 minutes to complete.

We are already parallelizing via a matrix of architectures, runtime version, and packages. This results in 35+ tests and (~80+) CloudFormation stacks being deployed which in some cases seems to trigger some queuing in CloudFormation.

Regarding the trigger, I think we have four options:

  1. Run the code quality periodically - i.e. every few hours or every day. Given that we release at best bi-weekly this type of frequency would be sufficient for us to catch issues and would allow us to merge things quickly. At the same time, doing this would increase the risk of compounding issues if we were to branch out or rebase another feature branch from a main with issues.
  2. Run the code quality after code is pushed into main as a result of a merged PR. Depending on how we set up the CI this would mean a cooldown of ~10 minutes between merged PRs. On the other hand this would ensure that issues are caught as early as possible.
  3. Run the code quality as PR check. Today we are already running the unit tests in this way but while this method would give us high confidence it would also be wasteful and introduce latency since checks are run on every PR update.
  4. Run the code quality right before code is merged on main using merge queues. This would provide the similar benefits to the previous two options, but with two added values: 1/ maintainers don't have to wait for the longer checks to finish before merging one or more PRs, 2/ issues are detected before they are merged to main, thus giving the maintainer an opportunity to fix them without having to create a dedicated issue & additional PR.

At least based on my current understanding of merge queues option 4 seems the most promising option. With that said, this is a relatively new feature and I don't think any of us has had the chance to experiment with it yet.

Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

Activity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    automationThis item relates to automationinternalPRs that introduce changes in governance, tech debt and chores (linting setup, baseline, etc.)on-holdThis item is on-hold and will be revisited in the future

    Type

    No type

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions