Skip to content

Experimental: Unify NamingSchemes. #3704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

bukzor
Copy link
Contributor

@bukzor bukzor commented Jun 21, 2020

Description

The current default pylint naming schemes only differ on three parameters:

  1. characters allowed at beginning of name
  2. characters allowed elsewhere
  3. minimum length of name

Where this was not true was due to either bugs, oversight, or legacy
inertia; i's hard for me to tell the difference. Many odd
discrepancies among the naming schemes have been rectified, and this
change will bother users by calling out some of their bizarre naming
conventions, but all such issues are valid.

For churn considerations, I've not (yet?) modified the names of the
naming scheme objects although they've transitioned from being classes
to objects, and there's a couple other bits of odd naming in here for
the same reason. My aim was chiefly to show that this was possible, and
spur consideration of the idea.

A further improvement would be to set a "minimum length" for names in
the configuration, which should be checked apart from "naming scheme"
issues. This would greatly reduce the need for users to read and
understand regular expressions, and will make messages more actionable,
but also separating that additional resonsibility would further simplify
and rectify the naming schemes.

Type of Changes

Type
🐛 Bug fix
✨ New feature
🔨 Refactoring
📜 Docs

@bukzor bukzor force-pushed the unify-naming-schemes branch 6 times, most recently from b231639 to fa1f686 Compare June 21, 2020 08:15
@bukzor
Copy link
Contributor Author

bukzor commented Jun 21, 2020

I don't understand these test failures. I'm unable to reproduce them reliably, and I notice they only happen when I run the whole test suite -- the tests alone don't fail. I also see that I sometimes get the same errors in master branch. Are there currently flaky tests in master?

@bukzor bukzor force-pushed the unify-naming-schemes branch 3 times, most recently from 632dcc0 to 5b8f498 Compare November 19, 2020 17:09
The current default pylint naming schemes only differ on three parameters:

 1. characters allowed at beginning of name
 2. characters allowed elsewhere
 3. minimum length of name

Where this was not true was due to either bugs, oversight, or legacy
inertia. It's hard for me to tell the difference.  Many odd
discrepancies among the naming schemes have been rectified, and this
change will bother users by calling out some of their bizarre naming
conventions, but all such issues are valid.

For churn considerations, I've not (yet?) modified the names of the
naming scheme objects although they've transitioned from being classes
to objects, and there's a couple other bits of odd naming in here for
the same reason. My aim was chiefly to show that this was possible, and
spur consideration of the idea.

A further improvement would be to set a "minimum length" for names in
the configuration, which should be checked apart from "naming scheme"
issues. This would greatly reduce the need for users to read and
understand regular expressions, and will make messages more actionable,
but also separating that additional resonsibility would further simplify
and rectify the naming schemes.
@bukzor bukzor force-pushed the unify-naming-schemes branch from 5b8f498 to edae48b Compare November 19, 2020 17:10
@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 90.839% when pulling edae48b on bukzor:unify-naming-schemes into 9a5e1b3 on PyCQA:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 90.839% when pulling edae48b on bukzor:unify-naming-schemes into 9a5e1b3 on PyCQA:master.

@coveralls
Copy link

coveralls commented Nov 19, 2020

Coverage Status

Coverage increased (+0.02%) to 90.839% when pulling edae48b on bukzor:unify-naming-schemes into 9a5e1b3 on PyCQA:master.

@hippo91 hippo91 self-assigned this Nov 26, 2020
Copy link
Contributor

@hippo91 hippo91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bukzor thanks a lot for this PR and sorry for the delay.
It is clearly an improvement in our naming scheme management.
However i'm a bit confused by the fact that, at this point, this PR prevents the user to use variable with only one character in its name...
By the way, i would like to have @PCManticore on this topic.


1. increase the length of names to be more descriptive (recommended)
2. add strange-but-useful names to the good-names configuration
3. override the default regexen, via pylintrc (not recommended)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that not recommended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not recommended because even for those deeply familiar with regex (few) it is much too easy to make regex with unintended matches / exclusions.

name_template = r"[^\W\d_%s][^\W%s]%s"

def __init__(self, head_exclude: str, tail_exclude: str, min_length: int):
self.head_exclude = head_exclude
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it could be interesting to add comments explaining what do you call head and tail. From what i understand, head is only one character long whereas tail is all the rest. I'm i right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. As in head and tail of a snake -- the tail is everything but the head.

@@ -1,4 +1,4 @@
# pylint: disable=literal-comparison,missing-docstring,misplaced-comparison-constant
# pylint: disable=literal-comparison,missing-docstring,misplaced-comparison-constant,invalid-name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find a little bit damageable if pylint is not compatible with constant name that are one letter long.
What do you think about it?

==input.func_w0801:3
==input.w0801_same:3
==input.func_w0801:4
==input.w0801_same:4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change needed?

@@ -0,0 +1,2 @@
[basics]
good-names=i,j,k,ex,Run,_,_dt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again i think it is very risky to forbid using single character named variable...

@hippo91 hippo91 added Waiting on author Indicate that maintainers are waiting for a message of the author and removed Work in progress labels Jan 4, 2021
@DanielNoord
Copy link
Collaborator

@bukzor Thanks for you initial contribution. I would like to add this to pylint and finish the work you started. Would you be okay with me creating a new PR based on the work you already did?

@DanielNoord
Copy link
Collaborator

I'm going to close this PR as it has become stale.

I looked at the changes this PR would introduce and have decided not to take them further myself. I think a real contribution would be finding a way to do the naming styles without having to compile 25+ regex patterns, but I'm not sure if that is even feasible.

@DanielNoord DanielNoord closed this Apr 1, 2022
@bukzor
Copy link
Contributor Author

bukzor commented Aug 11, 2022

I intend to revive this work -- after a long haitus, I'm once again participaing in the open-source community.

Please advise:

  1. Would you prefer I update this PR or start a fresh one?
  2. Are there any of these changes that should be excluded?

@DanielNoord
Copy link
Collaborator

I intend to revive this work -- after a long haitus, I'm once again participaing in the open-source community.

Welcome back! 😄

Please advise:

  1. Would you prefer I update this PR or start a fresh one?
  2. Are there any of these changes that should be excluded?

I think a fresh one is probably better since many of the files here have now moved or significantly changed.

That said, I know there was some discussion about whether we should pursue the changes in this PR. I'm still sympathetic to simplifying this code, but I'm not sure how feasible the current approach still is.
We have now also have types that are only part of a specific style and are not defined on other styles, such as TypeVar.

@bukzor
Copy link
Contributor Author

bukzor commented Aug 11, 2022

I intend to revive this work -- after a long haitus, I'm once again participaing in the open-source community.

Welcome back! 😄

Thank you!

Please advise:

  1. Would you prefer I update this PR or start a fresh one?
  2. Are there any of these changes that should be excluded?

I think a fresh one is probably better since many of the files here have now moved or significantly changed.

Roger.

That said, I know there was some discussion about whether we should pursue the changes in this PR. I'm still sympathetic to simplifying this code, but I'm not sure how feasible the current approach still is. We have now also have types that are only part of a specific style and are not defined on other styles, such as TypeVar.

I'm aware that the approach taken isn't strictly agreeable, but I do believe the goals are still relevant and attainable. Once I have something worth discussing I'll bother you again. Async text (i.e. github issues) isn't very conducive to debating abstract ideas such as hypothetical implementations, especially between strangers. How do you normally handle such complex communications?

@DanielNoord
Copy link
Collaborator

I'm aware that the approach taken isn't strictly agreeable, but I do believe the goals are still relevant and attainable. Once I have something worth discussing I'll bother you again. Async text (i.e. github issues) isn't very conducive to debating abstract ideas such as hypothetical implementations, especially between strangers. How do you normally handle such complex communications?

Normally we use issues to discuss significant proposals of change. It's not really perfect like you said, but it also helps narrow the scope of PRs which greatly helps with reviewing them. If the discussion in a PR is becoming too long it is often a good indication that it might try to do too much in one PR.

@Pierre-Sassoulas
Copy link
Member

Hello @bukzor and welcome back !

(I started to work on this issue myself and lost the code some time ago so I had some ideas about it)

I think short variable name should be correctly checked as per hippo's comment. Also imo a split between a new check for variable names that are too small and variable names not respecting the name convention is required (using old_names so retro compatibility for all messages disabled with 'invalid-name' previousely are still disabled for both new message).

We usually use async text communication for everything as we're in various timezones and pylint is something I do on my free time / when convenient. So I don't want to schedule a call unless absolutely necessary. But I think it's possible to talk about the required steps and do them one by one with issues as Daniel said though.

@bukzor
Copy link
Contributor Author

bukzor commented Aug 11, 2022

Hello @bukzor and welcome back !

Thank you :D

I think short variable name should be correctly checked as per hippo's comment.

Please expand on "correctly" here? My best guess is that you mean: One-character variables should be admitted by the name checker without error, by default.

Also imo a split between a new check for variable names that are too small and variable names not respecting the name convention is required ...

To clarify, you're advocating a split of the existing NameChecker into two new checkers? One for naming style, and one for naming length. The naming-style checker would disregard any length problems.

If so, my goal would be to add the two new checkers in a disabled-by-default state until you're ready, while leaving the old checker untouched, for now.

... (using old_names so retro compatibility for all messages disabled with 'invalid-name' previousely are still disabled for both new message).

I don't understand this bit. What is old_names? Do you mean that there would be two new error names that would be set to also disable themselves with disablement of invalid-name? I didn't know that was a feature of checkers.

We usually use async text communication for everything as we're in various timezones and pylint is something I do on my free time / when convenient. So I don't want to schedule a call unless absolutely necessary. But I think it's possible to talk about the required steps and do them one by one with issues as Daniel said though.

Got it! I will endevour to adopt this workflow. Thank you for including me :)

@Pierre-Sassoulas
Copy link
Member

Please expand on "correctly" here? My best guess is that you mean: One-character variables should be admitted by the name checker without error, by default.

I was thinking that a name that is in the list of accepted name should be ok for both checks.

ab should be valid snake case but too short, and that aB should be invalid and too short, Ab should be valid PascalCase, but not ab, etc.. We would also have to define what we want for one letter name. i.e. A and a, is A Pascalcase or uppercase with underscore ? is a camelCase or snake_case ? Maybe both ?

snake_case name A a Ab ab Abcd abcd
invalid-name yes no yes no yes no
name-too-short yes yes yes yes no no

I realize that if we keep regex this is going to be hell so maybe no invalid name for name that are too short when the regex used is the default one (maybe someone crafted a masterful regex that handle short names properly somewhere) is acceptable:

snake_case name A a Ab ab Abcd abcd
invalid-name no no no no yes no
name-too-short yes yes yes yes no no

To clarify, you're advocating a split of the existing NameChecker into two new checkers? One for naming style, and one for naming length. The naming-style checker would disregard any length problems.

I think depending on what we decide for the problem above having one checker instead of two could be simpler but yes.

If so, my goal would be to add the two new checkers in a disabled-by-default state until you're ready, while leaving the old checker untouched, for now.

That could work well but add complexity, if we can refactor the existing checker it's probably less work overall. How would you do the change to the TypeVar check @DanielNoord ? Would a standalone checker with only TypeVar work ? Should we keep TypeVar "generic" ?

Do you mean that there would be two new error names that would be set to also disable themselves with disablement of invalid-name?

Yes, check this code: https://github.com/PyCQA/pylint/blob/main/pylint/checkers/base/docstring_checker.py#L64
Result here, it's possible to enable missing-function-docstring by using the old name : https://github.com/PyCQA/pylint/blob/main/tests/functional/u/use/use_symbolic_message_instead.py#L20

bukzor added a commit to bukzor/pylint that referenced this pull request Aug 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Waiting on author Indicate that maintainers are waiting for a message of the author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants