
Add linting to CI via pre-commit #987

Merged 6 commits into python:main on Jul 31, 2023

Conversation

@hugovk (Member) commented Nov 19, 2022

I was going to add this to the forthcoming release cycle (draft) PR, but it's better to split it out into its own PR.

This adds formatters and linters to the CI via pre-commit; we can adjust these as needed.

@ezio-melotti (Member) left a comment

Can you elaborate (here and/or in comments on the files) on what these two new YAML files accomplish?

  • lint.yml seems pretty straightforward, but I also noticed that pre-commit/action is apparently deprecated in favor of pre-commit.ci: https://github.com/pre-commit/action
  • pre-commit-config.yaml is used by pre-commit.ci, and possibly by pre-commit/action too. However it's not obvious why it lists seemingly unrelated repos.

Are we using pre-commit.ci or just the action? Is there something else that contributors need to do to enable the pre-commit hooks? Should this be documented somewhere?

@hugovk (Member, Author) commented Nov 20, 2022

Can you elaborate (here and/or in comments on the files) on what these two new YAML files accomplish?

Sure!

lint.yaml

lint.yaml is a GitHub Actions workflow to run the "pre-commit" tool on the CI. The tool's name can be a bit misleading, as it works very well as a tool for pinning and updating linters and so on, even if you don't use it as a local Git pre-commit hook.

The workflow checks out this repo, sets up a recent Python, and runs the tool via its GitHub Action.
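For reference, here's a minimal sketch of roughly what that workflow looks like (the action versions shown are illustrative, not necessarily the exact pins in this PR):

name: Lint

on: [push, pull_request, workflow_dispatch]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      # Check out the repo and set up a recent Python
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      # Run every hook defined in .pre-commit-config.yaml
      - uses: pre-commit/action@v3.0.0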


.pre-commit-config.yaml

.pre-commit-config.yaml is the config file for the pre-commit tool.

This lists a number of "hooks", each a separate check, like a linter or formatter:

(screenshot of the configured hook list omitted)

They're listed under their repo. Repos generally have a single check (e.g. isort), although some have multiple (e.g. check-json, check-yaml). We can define optional arguments to pass (such as --target-version=py37 for Black) or additional dependencies (like flake8-2020 for Flake8).

The tool uses repos to fetch hooks rather than installing from PyPI. This is because the tool is cross-platform and can run things written in 20 languages, not just Python.

Importantly, each repo is pinned via a rev version number. This means we don't get any surprises when a tool is updated to add new checks and our CI suddenly fails; we only see those when explicitly updating the pins.
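As a condensed sketch of that structure (the Black pin and the flake8 extra dependency are the ones from this PR; the flake8 repo URL and rev here are just placeholders for illustration):

repos:
  # Each entry points at a hook repo, pinned to an exact rev
  - repo: https://github.com/psf/black
    rev: 22.10.0
    hooks:
      - id: black
        args: [--target-version=py37]
  - repo: https://github.com/PyCQA/flake8
    rev: 5.0.4  # placeholder rev
    hooks:
      - id: flake8
        additional_dependencies: [flake8-2020]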

We can periodically update the pins easily using the tool:

python -m pip install pre-commit
pre-commit autoupdate

https://github.com/pre-commit/action says:

DEPRECATED this action is in maintenance-only mode and will not be accepting new features.

Please switch to using pre-commit.ci which is faster and has more features.

However, there have been two maintenance releases since the note was added, and the action only does what we need anyway: install the tool, run it, and do some caching. If it stops working, it's straightforward to replace with our own commands.


Are we using pre-commit.ci or just the action?

Just the action.

The difference:

  • the action runs the tool on GitHub Actions and reports pass/fail.

  • pre-commit.ci runs the tool on its own CI, reports pass/fail, but can also send a fix commit back to PRs for formatting changes. The ci: autoupdate_schedule: quarterly config also defines the period for it to send PRs to update the pins.

In my projects I'm using both:

  • Benefit of the action on GHA CI is that it's more likely to run for forks, so contributors get fast feedback.

  • Benefit of pre-commit.ci is fixing PRs and periodic autoupdates.

So we could also enable pre-commit.ci for this repo.


  • pre-commit-config.yaml is used by pre-commit.ci, and possibly by pre-commit/action too. However it's not obvious why it lists seemingly unrelated repos.

Yep, both use it. As mentioned above, the repos define where the tool installs each of the formatters/checkers from.


Is there something else that contributors need to do to enable the pre-commit hooks? Should this be documented somewhere?

So for this PR we're using pre-commit as a tool to pin and run linters, not as an actual Git pre-commit tool. But that option is open to developers (and I'll be using it).

I never used to be a fan of using Git pre-commit hooks locally until I found the https://pre-commit.com/ tool, which makes it much, much easier to manage: no need to manually edit files under the .git dir. Some people really don't like it, even with this tool, which is totally fine! So I'm a bit hesitant about whether we want to recommend it to contributors.

Anyway, if you do want to use it locally:

# Install the tool
python -m pip install pre-commit

# Run on all files
pre-commit run --all-files

# Update the pins
pre-commit autoupdate

# Activate it for this repo, so it will run on just the staged files when you 'git commit'
pre-commit install

# No longer want it to run for 'git commit'?
pre-commit uninstall

@hugovk (Member, Author) left a comment

Here's some notes on the hooks I added here.

They're some I commonly use for my projects. We can add or remove as we see fit.

- repo: https://github.com/asottile/pyupgrade
  rev: v3.2.2
  hooks:
    - id: pyupgrade

pyupgrade is a formatter that upgrades to newer Python syntax: https://github.com/asottile/pyupgrade

- repo: https://github.com/psf/black
  rev: 22.10.0
  hooks:
    - id: black

The Black code formatter: https://github.com/psf/black

- repo: https://github.com/PyCQA/isort
  rev: 5.10.1
  hooks:
    - id: isort

isort to sort imports: https://github.com/PyCQA/isort

Comment on lines +22 to +24
    - id: flake8
      additional_dependencies: [flake8-2020]

Flake8 finds style errors using pycodestyle and pyflakes (both built-in plugins: https://flake8.pycqa.org/en/latest/) and flake8-2020 (which finds Python version comparison problems: https://github.com/asottile/flake8-2020).

A reviewer (Member) replied:

Maybe worth replacing flake8-2020 with something much more useful, flake8-bugbear. It's more or less the official set of first-party optional checks by the PyCQA that don't meet the strict standards of pyflakes but are still very likely to be mistakes of some sort; I'm not sure I've ever seen a false positive with it. It seems a lot more worth adding than flake8-2020, which is a single check for a single specific issue that is not used here and very unlikely ever to be, at this point.
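For illustration, the swap would just change the flake8 hook's extra dependency in .pre-commit-config.yaml, something like this (repo URL and rev shown are placeholders; keep whatever is already pinned):

- repo: https://github.com/PyCQA/flake8
  rev: 5.0.4  # whatever rev is currently pinned
  hooks:
    - id: flake8
      additional_dependencies: [flake8-bugbear]  # instead of flake8-2020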

- repo: https://github.com/pre-commit/pygrep-hooks
  rev: v1.9.0
  hooks:
    - id: python-check-blanket-noqa

This repo has a bunch of simple grep checks. This hook makes sure any # noqa annotations include a code like # noqa: F401:

https://github.com/pre-commit/pygrep-hooks

- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.3.0
  hooks:
    - id: check-json

This repo has lots of different checkers. This one checks the syntax of JSON files (which will be added in the release chart PR).

  rev: v4.3.0
  hooks:
    - id: check-json
    - id: check-merge-conflict

Checks for accidentally committed merge conflict markers:

<<<<<<< HEAD
=======
>>>>>>> new_branch_to_merge_later

  hooks:
    - id: check-json
    - id: check-merge-conflict
    - id: check-yaml

Checks syntax of YAML files. Can help find mistakes before the CI fails to run.

Comment on lines +36 to +38
    - id: end-of-file-fixer
    - id: trailing-whitespace

Some whitespace fixers.

Comment on lines +39 to +41
ci:
  autoupdate_schedule: quarterly

Defines a schedule for https://pre-commit.ci/ to send autoupdate PRs. I find the weekly default too noisy; monthly is also an option, and the pins can still be updated manually at any time with the tool, as needed.

@ezio-melotti (Member):

Thanks for the detailed and very informative reply!
I now have a better understanding of this PR, and I'm mostly fine with it.

The only issue I have is that I'm not a big fan of black, which I assume is responsible for the changes to the two .py files. Given that we don't have much Python code in this repo and since the code looked fine (or even better IMHO) before the changes, perhaps we can skip that? blue might be an alternative if we really want something similar, but I'm not sure what output it will generate.

@terryjreedy (Member):

Personally, I don't like black's .py changes. They are also a violation of at least the spirit of PEP8.

@CAM-Gerlach (Member):

To avoid holding up this PR over Black, you could consider dropping it from here and deferring it to a future one. Or, to at least hugely reduce the diff noise to almost nothing (unless I'm missing something), you could disable the quote normalization, which only leaves one single change (using a single-line hanging indent instead of an ugly two-line split in one spot).

@hugovk (Member, Author) commented Nov 22, 2022

For comparison I've updated Black to skip string normalisation. How does that look?

And here's how blue looks:

But happy to drop it if you prefer.

@CAM-Gerlach (Member):

As far as I'm aware, blue @ 88 chars is identical to the current black settings (the two only differ on a few specific points, of course, and the two settings you changed are equally configurable on both black and blue), so I figure if we're going to do one of them, we may as well just use black as it is now—that would be consistent with what we use on the PEPs repo for the infrastructure code.

I don't always 100% agree with every choice Black makes, but the way I see it, as much as I sometimes love to haggle over such minutia (and don't we all?), there are way more important things to worry and debate about than code style, since everyone has different preferences when it comes to that. The advantage of Black is you can just write your code however you want, and it will automatically take care of conforming it to a standard style while avoiding bikeshedding over what color to paint it. I've learned the long-term value of this the hard way (and continue to)...

@ezio-melotti (Member):

Of the three I like the third the most. Dropping both (black and blue) and leaving the code as-is is OK too.

Click for bikeshedding aside
  • I prefer single quotes over double quotes (but I'm -0 on a mass quote rewrite and I don't worry too much about (repo-level) consistency since they both have valid uses). They are easier to type and (IMHO) to read.
  • I prefer using
    res = some_func(that, has, many, arguments,
                    on, just, two, lines)
    instead of
    res = some_func(
       that, has, many, arguments, on, three, or, even,
       four, lines
    )
    The former is more compact and easier to read (it's clearer that they are all arguments and not separate expressions in an indented block).
  • I personally use 80 chars, but I can live with 88.

@hugovk (Member, Author) commented Nov 23, 2022

As far as I'm aware, blue @ 88 chars is identical to the current black settings (the two only differ on a few specific points, of course, and the two settings you changed are equally configurable on both black and blue), so I figure if we're going to do one of them, we may as well just use black as it is now—that would be consistent with what we use on the PEPs repo for the infrastructure code.

Ah yes! Pretty much, plus some other minor things we're not worried about, so it's better to track upstream Black.

Shall we go for what's currently in this PR (Black+88+skip quote normalisation) or drop it altogether?

Or if people aren't keen on a formatter, I'm fine with leaving it out.

Click for bikeshedding aside
  • I'm the opposite on quotes, I find double easier to type and read :)

  • Formatting:

instead of

res = some_func(
   that, has, many, arguments, on, three, or, even,
   four, lines
)

Nitpick: Black won't format like that. Here's an example based on real code. If there's room, it keeps them on one line:

def _carry(value1: float, value2: float, ratio: float) -> tuple[float, float]:
    ...

If too long, it does this:

def _carry(
    value1: float, value2: float, ratio: float, unit: Unit, min_unit: Unit
) -> tuple[float, float]:
    ...

And if still too long, it does this:

def _carry(
    value1: float,
    value2: float,
    ratio: float,
    unit: Unit,
    min_unit: Unit,
    suppress: typing.Iterable[Unit],
) -> tuple[float, float]:
    ...

A benefit is that diffs become clearer: they show exactly the one new addition. Readability is important, and a lot of time is spent reading code in diffs, often side-by-side in narrower columns.

There's also some stuff in Black I'm not so keen on, but like CAM said, I find there's a massive benefit in just letting it take care of all that stuff. Before Black there was yapf, which had so many options it could be hard to settle on which to use :)

@CAM-Gerlach (Member):

Shall we go for what's currently in this PR (Black+88+skip quote normalisation) or drop it altogether?

FWIW I prefer the former if others don't have strong objections, for the reasons mentioned (it gives the same results as Blue for this repo, in particular the "third option" above that Ezio and I both said we prefer the most).

Before Black there was yapf, which had so many options it could be hard to settle on which to use :)

That, and yapf used a complex dynamic weighting algorithm that gave quite non-deterministic, variable results depending on the code's context, which resulted in a lot of fluctuations and diff noise. I also seem to recall some other ancillary issues with it. And before yapf, there was autopep8, which wasn't as comprehensive an autoformatter (one where two identical blocks of CST would always be formatted the same) and just strictly enforced PEP 8's guidelines and no more. Surely there were other formatters before that too (yapf stood for "Yet Another Python Formatter", after all), but it seems their names have been mostly lost to time.

@ezio-melotti (Member):

I recently contributed to another repo that uses pre-commit and ran into two problems while trying to fix some CI failures:

  1. It is not clear what command/arguments were used. I got errors from both black and pylint [1], and I wanted to run those two checks locally, but the CI output didn't include the exact commands.
  2. After installing pylint locally and getting different errors, I noticed that in the .pre-commit-config.yaml an old version of pylint was specified. This raises the question: will we need to manually (or perhaps automatically, e.g. through @dependabot) update all these versions?

Footnotes

  1. Anecdotally, one of the corrections made by black was wrong (https://github.com/psf/black/issues/1327 -- the others were just quote swaps), and the errors reported by pylint were false positives, resulting in wasted time and some extra cruft added to the code solely to make the tools happy.

@hugovk (Member, Author) commented Dec 9, 2022

I recently contributed to another repo that uses pre-commit and ran into two problems while trying to fix some CI failures:

  1. It is not clear what command/arguments were used. I got errors from both black and pylint [1], and I wanted to run those two checks locally, but the CI output didn't include the exact commands.

Did you run the tools directly? e.g. black . and pylint?

With pre-commit, the magic command is pre-commit run --all-files. This installs the pinned versions and runs with the pinned arguments. No "works on my machine/CI" - we all use the same tools, versions and arguments.

  1. After installing pylint locally and getting different errors, I noticed that in the .pre-commit-config.yaml an old version of pylint was specified. This raises the question: will we need to manually (or perhaps automatically, e.g. through @dependabot) update all these versions?

Yes.

Pinning means we don't run into sudden failures, for example, when a linter adds a new check, and means we can deal with those separately instead of in "normal" PRs.

Yes, we need to update those from time to time. Either manually (pre-commit autoupdate) or automatically. Dependabot doesn't update .pre-commit-config.yaml, but https://pre-commit.ci/ does.

A second benefit of https://pre-commit.ci: if I submit a PR and don't apply Black formatting, the CI will do it and update the PR automatically. It saves a lot of manual work for reviewer and contributor: you don't need to tell me what to do (manual formatting or how to install and run extra tools), and I don't need to do it.


Footnotes

  1. Anecdotally, one of the corrections made by black was wrong (psf/black/issues/1327 -- the others were just quote swaps), and the errors reported by pylint were false positives, resulting in wasted time and some extra cruft added to the code solely to make the tools happy.

Yeah, pylint is pretty noisy; I very occasionally run it locally and don't use it in any CI.

@ezio-melotti (Member):

Did you run the tools directly? e.g. black . and pylint?

Yes. The repo had a bunch of other checks, and I thought it would be easier to just install those two.

A second benefit of https://pre-commit.ci/: if I submit a PR and don't apply Black formatting, the CI will do it and update the PR automatically.

In my specific case, black would have "broken" the code, requiring an additional commit to fix it again. Maybe overall it saves time, even taking into account these issues.

@CAM-Gerlach (Member):

Yes. The repo had a bunch of other checks, and I thought it would be easier to just install those two.

If you use pre-commit run --all check-name, it will only automatically install the check(s) you specify, and will cache them between runs and between on-demand pre-commit run and automatic pre-commit hooks, if you've enabled them with pre-commit install.

Yes, we need to update those from time to time. Either manually (pre-commit autoupdate) or automatically. Dependabot doesn't update .pre-commit-config.yaml, but pre-commit.ci does.

It's also very straightforward to have a monthly, quarterly, etc. GitHub Action that just runs pre-commit autoupdate and opens a PR with the changes, without any need to rely on a third-party cloud service and its limitations. See example here.
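The linked example isn't reproduced here, but a rough sketch of such a scheduled workflow might look like this (the cron schedule, action choices, and versions are all assumptions for illustration, not the referenced example):

name: Autoupdate pre-commit pins

on:
  schedule:
    - cron: "0 0 1 */3 *"  # roughly quarterly; adjust to taste
  workflow_dispatch:

jobs:
  autoupdate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      - run: python -m pip install pre-commit
      - run: pre-commit autoupdate
      # Open a PR with the updated pins (one possible third-party action for this)
      - uses: peter-evans/create-pull-request@v4
        with:
          branch: pre-commit-autoupdate
          title: "Update pre-commit hooks"
          commit-message: "Update pre-commit hooks"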

I typically use such automation on repos with a handful of relatively straightforward checks. With more tools in play, I find that tool updates often require manual fixes, introduce new checks/settings, or have other non-trivial changes, so on most of my repos I run pre-commit autoupdate myself locally each quarter, check the changelogs of the updated checks, and make any other fixes, changes or adjustments to the checks.

A second benefit of pre-commit.ci: if I submit a PR and don't apply Black formatting, the CI will do it and update the PR automatically. It saves a lot of manual work for reviewer and contributor: you don't need to tell me what to do (manual formatting or how to install and run extra tools), and I don't need to do it.

That can also be done via a GitHub Action without the need for a third-party service and the significant limitations of pre-commit.ci; see this example. There's also the brand new pre-commit.ci lite, which allows us to get the same auto-fix features as pre-commit.ci by just adding one extra step to the GitHub Actions workflow added here. Contrary to the name, it is this "lite" version that actually offers the full power of pre-commit, as pre-commit.ci can only execute a limited set of pre-installed checks. It does require enabling the pre-commit.ci lite GitHub application for some reason, which we may want to avoid by just implementing it ourselves like in the example above.
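If we went the pre-commit.ci lite route, the extra step would be roughly this, appended to the job in lint.yml (the version tag is an assumption; check the lite-action docs for the current one):

      # After the pre-commit run, push any auto-fixes back to the PR branch
      - uses: pre-commit-ci/lite-action@v1.0.1
        if: always()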

Yeah, pylint is pretty noisy; I very occasionally run it locally and don't use it in any CI.

I do run Pylint in CI via pre-commit and find it useful, but it does take some customization to turn down or disable some of the checks, especially the refactoring ones. Others often find a more shotgun approach useful: disabling all but the warnings and errors, as well as some categories of those with a higher false-positive ratio.

In my specific case, black would have "broken" the code, requiring an additional commit to fix it again. Maybe overall it saves time, even taking into account these issues.

Personally, I always just pre-commit install locally, so I never commit anything that fails the checks, but it is understandable especially in repos like this that others don't. In this particular case, we could enable --experimental-string-processing, or upgrade to a version where it is the default (if it has been released yet), which would have avoided this.

@pradyunsg (Member) commented Jan 31, 2023

Given that this repository does not have a large corpus of code, we've configured black to avoid rewriting quotes to reduce churn, and ~all of the code already formats as black would, does any of us think that adopting black here is a blocking concern for this PR?

If no one thinks it's a blocking concern, then let's move forward on this. Otherwise, let's remove black from the list of linters, merge this PR, and argue about black separately.

Either way, we should do an update of the pins before landing this. :)

@CAM-Gerlach (Member):

FWIW, my take is that, given that with quote normalization disabled there's only a single existing line changed by Black, we should just rebase on main, ensure the checks continue to pass, and then merge the PR. If it becomes a problem, we can always disable it later.

@CAM-Gerlach (Member) left a comment

LGTM, FWIW

@CAM-Gerlach (Member) left a comment

One other thing here: If we're going to add pre-commit, we should move the Sphinx-Lint check under pre-commit as opposed to being hardcoded into the build workflow.
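sphinx-lint ships its own pre-commit hook, so the move would roughly amount to adding an entry like this to .pre-commit-config.yaml (the rev shown is an assumption; pin whatever release matches what the build workflow currently uses):

- repo: https://github.com/sphinx-contrib/sphinx-lint
  rev: v0.6.7  # example pin, use the current release
  hooks:
    - id: sphinx-lint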

@hugovk (Member, Author) commented Feb 6, 2023

Moved!

@hugovk merged commit 1cb3c18 into python:main on Jul 31, 2023
2 checks passed
@hugovk deleted the linting branch on July 31, 2023 08:06