Skip to content

Conversation

@robsdedude
Copy link
Contributor

@robsdedude robsdedude commented Jun 10, 2025

Summary

This PR fixes rule C420's fix. The fix replaces {...} with dict....(...). Therefore, if there is any identifier or such right before the fix, the fix will fuse that previous token with dict....

The example in the issue is

0 or{x: None for x in "x"}
# gets "fixed" to
0 ordict.fromkeys(iterable)

Related Issues

Fixes: #18599

@robsdedude robsdedude force-pushed the fix/18599-C420-prepend-padding branch from 9dae719 to bdf0ffb Compare June 10, 2025 20:01
@github-actions
Copy link
Contributor

github-actions bot commented Jun 10, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

/// identifier is ASCII-only or not by mutably altering a reference to a
/// boolean value passed in.
fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool {
pub fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure you're fine with this PR increasing the coupling of the crates. However, I didn't want to duplicate that logic. But maybe there's even a better general approach for fixing the issue that works on a higher logical level than chars.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat find, I didn't know about this function before.

I think I would have instead reached for something more like was done in #17648 (checking if two token ranges are adjacent). Would that work here too?

I'm not totally against using the lexer function if not, but I'm not sure we really want to make it pub like you said.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use edits::pad_start instead or is_identifier. I'd prefer to not make this function public

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pad_start sounds exactly like what we want... However, looking at its implementation I suspect (not tested yet) that it's implemented incorrectly. It only considers is_ascii_alphabetic. Python's identifiers allow much more than ASCII alphabetic characters, which is exactly why I reached for is_identifier_continuation. How do you suggest to proceed? My gut feeling is

  • use pad_start for this PR
  • open a separate issue to discuss fixing pad_start

What is_identifier are you referring to? There are multiple functions with that name in the repo 😇 If you mean ruff_python_stdlib::identifiers::is_identifier, I suppose we could, but it'd require looking at more than just the preceding character which might only be a valid identifier continuation, but not a start (e.g., a digit). Sounds more costly and error prone.

Copy link
Contributor Author

@robsdedude robsdedude Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, maybe in this case it's correct because we only care about preceding keywords, which only are contain ASCII alphabetic chars.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, after looking at some examples in the code-base as well as Python's grammar for expressions, I'm fairly convinced, that pad, pad_start, and pad_end are fine as long as all edits that are padded are replacing a full expression as those cannot be surrounded by arbitrary identifiers as far as I can tell.

@robsdedude robsdedude marked this pull request as ready for review June 10, 2025 20:20
@ntBre ntBre added bug Something isn't working fixes Related to suggested fixes for violations labels Jun 10, 2025
Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good overall, just a couple of suggestions.

/// identifier is ASCII-only or not by mutably altering a reference to a
/// boolean value passed in.
fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool {
pub fn is_identifier_continuation(c: char, identifier_is_ascii_only: &mut bool) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat find, I didn't know about this function before.

I think I would have instead reached for something more like was done in #17648 (checking if two token ranges are adjacent). Would that work here too?

I'm not totally against using the lexer function if not, but I'm not sure we really want to make it pub like you said.

@robsdedude robsdedude requested a review from ntBre June 27, 2025 23:11
Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks really nice with pad_start. Thanks for double checking that it was working correctly too.

@ntBre ntBre changed the title [flake8_comprehensions] Fix C420 to prepend whitespace when needed [flake8-comprehensions] Fix C420 to prepend whitespace when needed Jun 30, 2025
@ntBre ntBre merged commit 34052a1 into astral-sh:main Jun 30, 2025
35 checks passed
@robsdedude robsdedude deleted the fix/18599-C420-prepend-padding branch June 30, 2025 19:00
iyakushev pushed a commit to iyakushev/ruff that referenced this pull request Jul 1, 2025
astral-sh#18616)

<!--
Thank you for contributing to Ruff/ty! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title? (Please prefix
with `[ty]` for ty pull
  requests.)
- Does this pull request include references to any relevant issues?
-->

## Summary
This PR fixes rule C420's fix. The fix replaces `{...}` with
`dict....(...)`. Therefore, if there is any identifier or such right
before the fix, the fix will fuse that previous token with `dict...`.

The example in the issue is
```python
0 or{x: None for x in "x"}
# gets "fixed" to
0 ordict.fromkeys(iterable)
```

## Related Issues

Fixes: astral-sh#18599
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working fixes Related to suggested fixes for violations

Projects

None yet

Development

Successfully merging this pull request may close these issues.

C420 fix should add spaces between tokens

3 participants