Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect and prevent implicit string concatenations e.g. lists #13031

Closed
KennethEnevoldsen opened this issue Aug 21, 2024 · 12 comments
Closed

Detect and prevent implicit string concatenations e.g. lists #13031

KennethEnevoldsen opened this issue Aug 21, 2024 · 12 comments
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule

Comments

@KennethEnevoldsen
Copy link

A typical error I (and colleagues) run into is as follows:

mylist = ["a", "b" "c"] 
# results in:
mylist # ['a', 'bc']

# as opposed to the intended ["a", "b", "c"] 

While this pattern is often well known, this pattern can often lead to unintended side effects, which are hard to track and detect.

This suggestion is in line with pep3126 (which was rejected due to "the feature to be removed isn’t all that harmful").

I have searched existing ruff rules and only found the ISC rules, which do not cover this pattern. The proposed pattern is in line with some of the ISC rules (ISC001), while I would say it disagrees with (ISC003)

@KennethEnevoldsen KennethEnevoldsen changed the title Detect and prevent implicit split concetatenations e.g. lists Detect and prevent implicit string concetatenations e.g. lists Aug 21, 2024
@MichaReiser
Copy link
Member

Can you help me understand how this rule would be different from ISC001 or why we should have both rules? We generally prefer not to have rules with overlapping concerns.

@KennethEnevoldsen
Copy link
Author

KennethEnevoldsen commented Aug 21, 2024

In the example above the intention is:

mylist = ["a", "b" "c"] 

ISC001 would suggest (according to the documentation and notably in autofixes). That it should be:

mylist = ["a", "bc"] # correctly reflects how it is evaluated, but not the most likely intention

While I would prefer:

mylist = ["a", "b", "c"] # correctly reflects the intention

@KennethEnevoldsen
Copy link
Author

KennethEnevoldsen commented Aug 21, 2024

In general, I would prefer not to have any implicit string concatenation in my code, which at least in my reading of the ISC it argues for due to implicit string concatenation being more readable. This might be true, but at least it can often carry bugs so I prefer to avoid it if possible.

@MichaReiser
Copy link
Member

In general, I would prefer not to have any implicit string concatenation in my code, which at least in my reading of the ISC it argues for due to implicit string concatenation being more readable.

Using ISC001 and ISC002 should give you the desired behavior that all implicitly concatenated strings are flagged.

ISC001 would suggest (according to the documentation and notably in autofixes). That it should be:

Thank's. That's helpful. This is related to #13014 Overall it seems that there could be an addition of one more implicit string concatenation rule that detects suspicious usages (vs denying it for stylistic reasons). Although it isn't clear to me how to solve the overlap with ISC001, and ISC002

@MichaReiser MichaReiser added rule Implementing or modifying a lint rule needs-decision Awaiting a decision from a maintainer labels Aug 21, 2024
@KennethEnevoldsen
Copy link
Author

Using ISC001 and ISC002 should give you the desired behavior that all implicitly concatenated strings are flagged.

The auto fix does still seem problematic to me. I can of course just disable it.

vs denying it for stylistic reasons

I wouldn't call denying a pattern that often carries bugs a stylistic reason - the discussion around the pep acknowledged this as well, with the reason for not removing it mainly concerning backward compatibility. A choice that I fully understand, but where it is hard to remove it is easy to prevent using linters.

Although it isn't clear to me how to solve the overlap with ISC001, and ISC002

I do think the change is inherently incompatible. ISC002 and ISC003 prefer implicit concatenation over explicit concatenation (ISC001 is fully compatible at least during linting). Where I would argue anytime you see a string concatenation in a list, set, etc. (["a", "b" "c"]) it is much more likely to be a bug than an intended string concatenation.

@AlexWaygood
Copy link
Member

AlexWaygood commented Aug 21, 2024

The exact autofix behaviour you want from ISC001 here likely depends on whether the single-line implicit concatenation was introduced by a human or by an autoformatter. If the implicitly concatenated string was added by a human, then it's almost certainly a mistake, and the human likely meant to put a comma in between the two strings (as you say @KennethEnevoldsen). But if it was inserted by an autoformatter, it was probably previously a multiline implicitly concatenated string that the autoformatter realised was needleslly spread out over multiple lines; in that case, the proper autofix is to merge the two strings, as ISC001 currently does.

Ideally we'd just fix Black and the Ruff formatter so that they never create single-line implicitly concatenated strings in the first place. But the ISC001 rule (originally as a flake8 plugin) was originally created to address an issue where Black would often undesirably collapse multiline implicitly concatenated strings into single-line implicitly concatenated strings. So that's probably the motivation for the current autofix behaviour.

Unfortunately it's not really possible for a linter rule to know whether the implicit concatenation was introduced by a human or an autoformatter...!

@KennethEnevoldsen
Copy link
Author

Thanks for the clarification @AlexWaygood, this gives a lot of context to the current formulation of ISC001. From the warning it also seems to conflict with the ruff formatter:

warning: The following rules may cause conflicts when used with the formatter: `ISC001`.

Just to be clear, I don't believe the auto fix behavior ["a", "b" "c"] -> ["a", "b", "c"] should be implemented.

I (and I imagine others as well) would prefer a set of rules with simply raise a linting error on implicit string concatenation (not only in a single line). This resolves the conflict with the formatter (as far as I understand) and makes parsing simpler. Naturally, it would be optional and incompatible with ISC003.

@AlexWaygood
Copy link
Member

AlexWaygood commented Aug 21, 2024

So, I think you can already get what you're asking for @KennethEnevoldsen using this configuration:

[tool.ruff.lint]
extend-select = ["ISC"]
unfixable = ["ISC001"]

[tool.ruff.lint.flake8-implicit-str-concat]
allow-multiline = false

That will result in Ruff complaining about both single-line and multiline implicitly concatenated strings, and it will switch off the autofix for ISC001 (even if you run Ruff with --fix).

@AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood changed the title Detect and prevent implicit string concetatenations e.g. lists Detect and prevent implicit string concatenations e.g. lists Aug 22, 2024
@KennethEnevoldsen
Copy link
Author

Thanks, @AlexWaygood that does indeed resolve the issue (couldn't find a case where it failed). I should have noticed that sooner. I went through the documentation to check if this might be a chance to improve the documentation, but there is nothing that I would change as is.

I will close this issue for now. Thanks, @AlexWaygood and @MichaReiser for the help.

@hugovk
Copy link
Contributor

hugovk commented Aug 22, 2024

Ideally we'd just fix Black and the Ruff formatter so that they never create single-line implicitly concatenated strings in the first place. But the ISC001 rule (originally as a flake8 plugin) was originally created to address an issue where Black would often undesirably collapse multiline implicitly concatenated strings into single-line implicitly concatenated strings.

Black has a fix in preview mode to prevent some of the ISC-related issues, but it's not been ready for the past couple of annual style releases. See psf/black#2188.

@MichaReiser
Copy link
Member

@hugovk we plan on implementing a simpler solution that we feel more confident in stabilizing (the mentioned black preview style has been in preview for a long term). See #9457

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

4 participants