-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[flake8-comprehensions] Fix C420 to prepend whitespace when needed
#18616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[flake8-comprehensions] Fix C420 to prepend whitespace when needed
#18616
Conversation
9dae719 to
bdf0ffb
Compare
|
| /// identifier is ASCII-only or not by mutably altering a reference to a | ||
| /// boolean value passed in. | ||
| fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool { | ||
| pub fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure you're fine with this PR increasing the coupling of the crates. However, I didn't want to duplicate that logic. But maybe there's even a better general approach for fixing the issue that works on a higher logical level than chars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a neat find, I didn't know about this function before.
I think I would have instead reached for something more like was done in #17648 (checking if two token ranges are adjacent). Would that work here too?
I'm not totally against using the lexer function if not, but I'm not sure we really want to make it pub like you said.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use edits::pad_start instead or is_identifier. I'd prefer to not make this function public
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pad_start sounds exactly like what we want... However, looking at its implementation I suspect (not tested yet) that it's implemented incorrectly. It only considers is_ascii_alphabetic. Python's identifiers allow much more than ASCII alphabetic characters, which is exactly why I reached for is_identifier_continuation. How do you suggest to proceed? My gut feeling is
- use
pad_startfor this PR - open a separate issue to discuss fixing
pad_start
What is_identifier are you referring to? There are multiple functions with that name in the repo 😇 If you mean ruff_python_stdlib::identifiers::is_identifier, I suppose we could, but it'd require looking at more than just the preceding character which might only be a valid identifier continuation, but not a start (e.g., a digit). Sounds more costly and error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, maybe in this case it's correct because we only care about preceding keywords, which only are contain ASCII alphabetic chars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, after looking at some examples in the code-base as well as Python's grammar for expressions, I'm fairly convinced, that pad, pad_start, and pad_end are fine as long as all edits that are padded are replacing a full expression as those cannot be surrounded by arbitrary identifiers as far as I can tell.
ntBre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks good overall, just a couple of suggestions.
| /// identifier is ASCII-only or not by mutably altering a reference to a | ||
| /// boolean value passed in. | ||
| fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool { | ||
| pub fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a neat find, I didn't know about this function before.
I think I would have instead reached for something more like was done in #17648 (checking if two token ranges are adjacent). Would that work here too?
I'm not totally against using the lexer function if not, but I'm not sure we really want to make it pub like you said.
..._linter/src/rules/flake8_comprehensions/rules/unnecessary_dict_comprehension_for_iterable.rs
Outdated
Show resolved
Hide resolved
ntBre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks really nice with pad_start. Thanks for double checking that it was working correctly too.
flake8_comprehensions] Fix C420 to prepend whitespace when neededflake8-comprehensions] Fix C420 to prepend whitespace when needed
astral-sh#18616) <!-- Thank you for contributing to Ruff/ty! To help us out with reviewing, please consider the following: - Does this pull request include a summary of the change? (See below.) - Does this pull request include a descriptive title? (Please prefix with `[ty]` for ty pull requests.) - Does this pull request include references to any relevant issues? --> ## Summary This PR fixes rule C420's fix. The fix replaces `{...}` with `dict....(...)`. Therefore, if there is any identifier or such right before the fix, the fix will fuse that previous token with `dict...`. The example in the issue is ```python 0 or{x: None for x in "x"} # gets "fixed" to 0 ordict.fromkeys(iterable) ``` ## Related Issues Fixes: astral-sh#18599
Summary
This PR fixes rule C420's fix. The fix replaces
{...}withdict....(...). Therefore, if there is any identifier or such right before the fix, the fix will fuse that previous token withdict....The example in the issue is
Related Issues
Fixes: #18599