proposal: a simplified and generalized invalid-name config #7305

Open
@bukzor

Current problem

The current configuration system for naming style in pylint can be seen as simultaneously overly rigid and overly flexible. There are two distinct ways to configure naming schemes. Primarily there is a named-scheme selection (-naming-style), where users choose among lower_case, UPPER_CASE, PascalCase, etc.; and there is a regex-based (-rgx) configuration, where the user is meant to write a regular expression that matches all (and only) permissible variable names.

As a concrete example:

[BASIC]
const-naming-style=UPPER_CASE
# from https://github.com/openqasm/openqasm/blob/main/.pylintrc
variable-rgx=[a-z_][a-z0-9_]{2,30}$

The named-style is convenient for users, but gives no clarity as to what exactly will match, and admits no ability to make small tweaks to its definition. The regex configuration is its near opposite, being completely unambiguous and easy to modify but painful-to-impossible to use for the average human.

Desired solution

This is my (tentative!) proposal: (keep in mind that all details will/would be modified to reflect maintainers' requirements)

Let's move this configuration out of "basic" to a new, separate "naming" namespace.

[NAMING]
variable = lower_case

[NAMING.lower_case]
allowed-characters = [a-zA-Z0-9_]
allowed-first-character = [a-z]
min-length = 3
max-length = 0  # no limit

Or, alternatively: (I prefer the above)

[NAMING]
upper-case.allowed-characters = [A-Z0-9_]
upper-case.allowed-first-character = [A-Z]
upper-case.min-length = 0  # no limit
upper-case.max-length = 30

Each naming style would be defined by just four characteristics: allowed characters, allowed first character, minimum length, and maximum length. If we can expose those quadruplets to the configuration, then the configuration is easy to use (users can largely ignore these fine details), unambiguous, and easy to modify, and there's no (obvious) need to expose users to the full footgun of regular expressions.
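To make the quadruplet idea concrete, here is a rough sketch of how the first config layout above could be read with the standard-library configparser. This is not pylint's actual config loader: the names NamingStyle and load_styles are hypothetical, and treating 0 as "no limit" follows the comments in the example config.

```python
import configparser
from dataclasses import dataclass


@dataclass(frozen=True)
class NamingStyle:
    """Hypothetical container for the four proposed characteristics."""
    allowed_characters: str
    allowed_first_character: str
    min_length: int = 0  # 0 means "no limit", as in the proposal's comments
    max_length: int = 0  # 0 means "no limit"


EXAMPLE = """
[NAMING]
variable = lower_case

[NAMING.lower_case]
allowed-characters = [a-zA-Z0-9_]
allowed-first-character = [a-z]
min-length = 3
max-length = 0  # no limit
"""


def load_styles(text: str) -> dict[str, NamingStyle]:
    """Collect every [NAMING.<style>] section into a NamingStyle."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    styles = {}
    for section in parser.sections():
        if not section.startswith("NAMING."):
            continue
        name = section[len("NAMING."):]
        opts = parser[section]
        styles[name] = NamingStyle(
            allowed_characters=opts["allowed-characters"],
            allowed_first_character=opts["allowed-first-character"],
            min_length=opts.getint("min-length", fallback=0),
            max_length=opts.getint("max-length", fallback=0),
        )
    return styles


styles = load_styles(EXAMPLE)
print(styles["lower_case"])
```

One caveat of this sketch: with inline "#" comments enabled, a character class containing a space followed by "#" would be truncated, so a real implementation would need to decide how such values are quoted.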

Note: The "character" configs are required to be a single character class (i.e. a single-char regex beginning and ending with square brackets). This limitation allows excellent error messaging in the case of an invalid-name error, and also enables programmatic use.
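As an illustration of why the single-character-class restriction helps, the sketch below validates a config value and then reports exactly which characters of a name fall outside the class. Both function names are hypothetical, and the bracket check is a deliberately simple approximation (it rejects some exotic but legal classes rather than risk accepting a multi-part regex).

```python
import re


def is_single_character_class(pattern: str) -> bool:
    """Return True if pattern looks like exactly one [...] class."""
    if len(pattern) < 3 or not (pattern.startswith("[") and pattern.endswith("]")):
        return False
    body = pattern[1:-1]
    if body.startswith("^"):
        body = body[1:]
    if body.startswith("]"):  # a literal "]" is allowed as the first member
        body = body[1:]
    i = 0
    while i < len(body):
        if body[i] == "\\":
            i += 2  # skip the escaped character
            continue
        if body[i] == "]":
            return False  # the class closed early, so this is more than one class
        i += 1
    if i > len(body):
        return False  # a trailing backslash would escape the closing bracket
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def explain_invalid_name(name: str, first: str, chars: str) -> list[str]:
    """List human-readable reasons why name violates the two classes."""
    problems = []
    if name and not re.fullmatch(first, name[0]):
        problems.append(f"first character {name[0]!r} is not in {first}")
    bad = sorted({c for c in name[1:] if not re.fullmatch(chars, c)})
    if bad:
        problems.append(f"characters {bad} are not in {chars}")
    return problems


print(is_single_character_class("[a-z]"))
print(is_single_character_class("[a-z][0-9]+"))
print(explain_invalid_name("Foo-bar", "[a-z]", "[a-z0-9_]"))
```

Because each class matches one character at a time, the checker can point at the offending characters directly, which a failed match against an arbitrary regex cannot do.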

This also opens up the possibility (but not the requirement) of letting users define custom named naming schemes.

All naming schemes would then be defined as (approximately) f'_{{0,2}}{allowed-first-character}{allowed-characters}{{{min-length},{max-length}}}'
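A runnable version of that template might look like the sketch below. The function name style_to_regex is hypothetical. It follows the formula literally, so the length bounds apply to the characters after the first one, and it assumes a max-length of 0 means "no limit", which maps to an open-ended quantifier.

```python
import re


def style_to_regex(first: str, chars: str, min_length: int = 0, max_length: int = 0) -> re.Pattern:
    """Build the regex described by the proposal's (approximate) template."""
    upper = "" if max_length == 0 else str(max_length)  # 0 -> open-ended "{n,}"
    return re.compile(f"_{{0,2}}{first}{chars}{{{min_length},{upper}}}")


lower_case = style_to_regex("[a-z]", "[a-zA-Z0-9_]", min_length=3)
upper_case = style_to_regex("[A-Z]", "[A-Z0-9_]", max_length=30)

print(lower_case.pattern)
for candidate in ("name", "__private_name", "Name", "abc"):
    print(candidate, bool(lower_case.fullmatch(candidate)))
```

Note that under this literal reading "abc" is rejected by the lower_case example, since only two characters follow the first; whether the bounds should instead count the whole name is one of the details the proposal leaves open.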

I believe the regular-expression-based configuration could then be phased out entirely, but could of course be retained if wanted as a fifth, optional regex attribute of the naming scheme with default None.

Additional context

This idea was split out of #3704, which considers revamping the invalid-name checker.

I (@bukzor) am volunteering to implement this change, if ratified.

Labels: Enhancement ✨ (Improvement to a component); Needs specification 🔐 (Accepted as a potential improvement, and needs to specify edge cases, message names, etc.)
