Better primer tests (False positives detection)

### Current problem

Right now the primer tests are only checking for fatal messages and crashes. But anticipating false positive would be a huge plus in term of quality of releases.

### Desired solution

A solution to anticipate the false positives by running pylint on external repositories. 

### Additional context

Originally posted by @DanielNoord (https://github.com/PyCQA/pylint/pull/5173#issuecomment-968105390):

Some points I found while working on this:
1) The initial approach of asserting `ex.code == 0` doesn't work since many packages will return error messages. Even the `stdlib` packages. We should therefore only look for crashes (`ex.code == 32`) or fatal messages (`ex.code == 1`) I think.

2) I have created a new CI job to do the primer tests on Linux on every push or pull requests. We don't want to run them only when bumping a version (as we did with the previous acceptance tests), but I think it is good to separate them from the other tests. Especially since they will probably take much longer to complete (more on that later) and sometimes early fails on the tests are helpful in finalising a PR (especially after GitHub review comments break a test).

3) The lazy loading of repo's checks for the SHA1 hashes of the local commit and remote commit and reclones when it finds a difference.

4) `tests/primer/primer_external_packages.py` is used to store a list of packages to download. We currently include `numpy` but I would argue against including it. `numpy` has its tests included in the sources files. Normally we don't want to run `pylint` over tests as this can create problems when tests use frameworks that use non-normal python code. Besides, running over the tests also really inflates the time it takes to run the primer tests. I would think we could come up with 2/3 other projects that might be better. Note that these projects do not need to use `pylint`, we just need to be able to assume that their code is "normal" and therefore shouldn't crash `pylint` or `astroid`.

5) We might also want to improve the message that gets raised when the test fails. Currently it is not immediately clear where `pylint` crashed. Improving the message might help expedite the process of fixing it.
Perhaps we should add `--only-fatal`? To make the output only print fatal errors?

Originally posted by @cdce8p  (https://github.com/PyCQA/pylint/pull/5173#discussion_r753879354):

Maybe the --capture=tee-sys option works?
https://pytest.org/en/6.2.x/capture.html#setting-capturing-methods-or-disabling-capturing

Originally posted by @DanielNoord (https://github.com/PyCQA/pylint/pull/5173#issuecomment-968272755):

> The reasoning for checking for error is that if there is an "error" in a well tested largely used package main branch, it means pylint created a false positive. We don't need to use the `--error-only` option as we can bit decode the exit code to check if there were errors or not. But it does force us to install the project dependencies.

I'm not sure this is the case. Take `undefined-variable` for example:
https://github.com/PyCQA/pylint/blob/e5082d3f9401dbcf65b40ce6a819d2a09beccb5c/pylint/checkers/variables.py#L393-L397

We know there are issues with this message. There are false positives and negatives and due to the lack of control-flow we are not able to solve this (now).
If we make it so the primer tests fail whenever an `Error`-level message is found in a project there cannot be any `undefined-variable` messages. This is (currently) unfeasible. There are many more `Error`-level messages where we know there are issues, which we cannot always fix so easily.

We can (sort of) avoid this by only including projects that use `pylint` as they are likely to have already disabled current false positives, but that doesn't fully help us.
What if a commit changes the way `undefined-variable` behaves and the project emits 10 new warnings. We would need to investigate whether any of these are correct or false positive. For larger projects this number might be much higher. Even if we identify 1 false positive and fix it, the primer will still fail because of 9 `undefined-variable` messages. We know these messages are correct (the project just hasn't updated to the WIP-pylint version) and can merge the commit, **but** from now on every primer CI job will fail because of those 9 messages.

By only checking for `Fatal` and `Crash` we make sure that `pylint` can parse any type of code (`pylint`-enforced and non-enforced) without breaking/crashing.
For reference the only `Fatal` messages are: `method-check-failed`, `fatal`, `astroid-error` and `parse-error`.

> > We might want to create a command --use-all-extensions
> 
> I agree with your three point, but this one is really good as this can be useful for those who want to use all extensions. I'm also thinking about making is possible to disable some extensions because preventing while loop is really opinionated for example. Maybe a `disable-plugin` option to mirror the `load-plugin` one ?

So you would want to allow the use of `--use-all-extensions` and `disable-plugin` at the same time?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Better primer tests (False positives detection) #5364

Current problem

Desired solution

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Better primer tests (False positives detection) #5364

Description

Current problem

Desired solution

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions