[`ruff`] Fix syntax error introduced for an empty string followed by a u-prefixed string (`UP025`) #18899

lubaskinc0de · 2025-06-23T17:08:34Z

Summary

/closes #18895

Test Plan

github-actions · 2025-06-23T18:01:30Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

ntBre

Thanks! I had a suggestion for a slightly simpler version of the current implementation and also a question that I'm not sure about either.

ntBre · 2025-06-23T17:54:47Z

crates/ruff_linter/src/rules/pyupgrade/rules/unicode_kind_prefix.rs

+        let first_char = checker
+            .locator()
+            .slice(TextRange::at(string.start(), TextSize::new(1)));
+        let u_position = u32::from(!(first_char == "u" || first_char == "U"));
+        let prefix_range =
+            TextRange::at(string.start() + TextSize::new(u_position), TextSize::new(1));


I don't think this arithmetic is necessary. The is_unicode check above confirms that the first character will be a u, so this just boils down to what was inside Edit::range_deletion originally.

Suggested change

let first_char = checker

.locator()

.slice(TextRange::at(string.start(), TextSize::new(1)));

let u_position = u32::from(!(first_char == "u" || first_char == "U"));

let prefix_range =

TextRange::at(string.start() + TextSize::new(u_position), TextSize::new(1));

let prefix_range = TextRange::at(string.start(), TextSize::new(1));

In general, I think something like this might be a simpler version of the current implementation:

let prefix_range = TextRange::at(string.start(), TextSize::new(1)); let locator = checker.locator(); let content = locator .slice(TextRange::new(prefix_range.end(), string.end())) .to_owned(); // If the preceding character is equivalent to the quote character, insert a space to avoid a // syntax error. For example, when removing the `u` prefix in `""u""`, rewrite to `"" ""` // instead of `""""`. // see https://github.com/astral-sh/ruff/issues/18895 let edit = if locator .slice(TextRange::up_to(prefix_range.start())) .chars() .last() .is_some_and(|char| content.starts_with(char)) { Edit::range_replacement(" ".to_string(), prefix_range) } else { Edit::range_deletion(prefix_range) }; diagnostic.set_fix(Fix::safe_edit(edit));

This just gives either a deletion like before or simply replaces the u with a space, instead of having to replace the whole string.

ntBre · 2025-06-23T18:42:05Z

...uff_linter/src/rules/pyupgrade/snapshots/ruff_linter__rules__pyupgrade__tests__UP025.py.snap

 26 26 | 
 27    |-f"foo"u"bar"  # OK
-   27 |+f"foo""bar"  # OK
+   27 |+f"foo" "bar"  # OK


I'm slightly concerned that we're now adding spaces where they aren't totally necessary. The issue indicated a few more constraints on the problem:

introduce a syntax error when the prefixed string follows a non-triple-quoted empty string with the same kind of quotation marks without an intervening space

Cases like this ("foo""bar" and even "foo""") are okay, as are empty triple-quoted strings like """""""" (8 double quotes):

$ python -m tokenize <<<'""""""""' 1,0-1,6: STRING '""""""' 1,6-1,8: STRING '""' 1,8-1,9: NEWLINE '\n' 2,0-2,0: ENDMARKER ''

I think we could check the kind of the preceding token to see if it's a triple-quoted string, but I don't think we can perform an is_empty check on the tokens.

@MichaReiser what do you think? Is it okay to add spaces where they aren't strictly necessary, or do you know of a better way to check if the previous string part is non-triple-quoted and empty?

I think the easiest would be to make this a rule that operates on StringLike and then iterate over all parts. This also ensures that, when removing multiple parts, that the spacing is only inserted when necessary.

I think we could check the kind of the preceding token to see if it's a triple-quoted string, but I don't think we can perform an is_empty check on the tokens.

I think this is correct because of line continuations where subtracting the prefix from the range doesn't yield the correct result:

a= "\ " u"c"

s it okay to add spaces where they aren't strictly necessary, or do you know of a better way to check if the previous string part is non-triple-quoted and empty?

I think it's fine for as long as it only pads at most one space (never results in two consecutive spaces)

Going back to this problem, I don't really understand the next steps. Here's the current implementation

/// UP025 pub(crate) fn unicode_kind_prefix(checker: &Checker, string: &StringLiteral) { if string.flags.prefix().is_unicode() { let mut diagnostic = checker.report_diagnostic(UnicodeKindPrefix, string.range); let prefix_range = TextRange::at(string.start(), TextSize::new(1)); let locator = checker.locator(); let content = locator .slice(TextRange::new(prefix_range.end(), string.end())) .to_owned(); // If the preceding character is equivalent to the quote character, insert a space to avoid a // syntax error. For example, when removing the `u` prefix in `""u""`, rewrite to `"" ""` // instead of `""""`. // see https://github.com/astral-sh/ruff/issues/18895 let edit = if locator .slice(TextRange::up_to(prefix_range.start())) .chars() .last() .is_some_and(|char| content.starts_with(char)) { Edit::range_replacement(" ".to_string(), prefix_range) } else { Edit::range_deletion(prefix_range) }; diagnostic.set_fix(Fix::safe_edit(edit)); } }

As far as I understand, it correctly handles the following cases:
before fix:

a = "\ " u"c" f"foo"u"bar" f"foo" u"bar" ""u""

after fix:

a = "\ " "c" f"foo" "bar" f"foo" "bar" "" ""

In that case, I don't understand the drawbacks of the current simple implementation. Are there any cases that are currently being handled differently than desired?

My argument was that a case like this:

f"foo"u"bar"

is not being handled correctly because we're adding a space that isn't necessary. In other words, this fix would be fine:

f"foo""bar"

but it's being fixed to

f"foo" "bar"

Micha said that could be okay as long as we never add a second space, but I don't entirely follow the line-continuation example. It does seem to be handled properly in the current version, as you said.

As Micha pointed out, another approach would be to iterate over the parts of the string instead of looking at a single part at a time. Then we could check the conditions mentioned in the issue that actually require a space to be inserted: the preceding string part is not triple quoted and not empty.

To be more concrete, we would pass in value here (or a transformed version of it to allow passing both f-strings and regular strings) instead of passing a single string_part at a time:

ruff/crates/ruff_linter/src/checkers/ast/analyze/expression.rs

Lines 1575 to 1579 in 10a14a9

if checker.is_rule_enabled(Rule::UnicodeKindPrefix) {

for string_part in value {

pyupgrade::rules::unicode_kind_prefix(checker, string_part);

}

}

With all that being said, maybe it's not worth worrying about and we can just move ahead with this version. I think the space looks better anyway, and I can't come up with a counter-example that adds a second space.

I think that the simpler the implementation (and it's quite simple now) the better, and if it works correctly, why not keep it as it is? but it might be worth asking someone else

anyway, if you need to do it differently, I'll do it, just tell me

I think you're right, let's move ahead with the simple fix for now. Thank you!

…ix.rs Co-authored-by: Brent Westbrook <36778786+ntBre@users.noreply.github.com>

fix issue

f3edbd0

ntBre added bug Something isn't working fixes Related to suggested fixes for violations labels Jun 23, 2025

ntBre reviewed Jun 23, 2025

View reviewed changes

lubaskinc0de and others added 2 commits June 23, 2025 22:57

Update crates/ruff_linter/src/rules/pyupgrade/rules/unicode_kind_pref…

1bcb5b7

…ix.rs Co-authored-by: Brent Westbrook <36778786+ntBre@users.noreply.github.com>

simple implementation

10a14a9

ntBre approved these changes Jul 1, 2025

View reviewed changes

ntBre changed the title ~~[ruff] Fix introduces a syntax error for an empty string followed by a u-prefixed string (UP025)~~ [ruff] Fix syntax error introduced for an empty string followed by a u-prefixed string (UP025) Jul 1, 2025

ntBre merged commit 667dc62 into astral-sh:main Jul 1, 2025
35 checks passed

BrewTestBot mentioned this pull request Jul 3, 2025

ruff 0.12.2 Homebrew/homebrew-core#228969

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[`ruff`] Fix syntax error introduced for an empty string followed by a u-prefixed string (`UP025`) #18899

[`ruff`] Fix syntax error introduced for an empty string followed by a u-prefixed string (`UP025`) #18899

Uh oh!

lubaskinc0de commented Jun 23, 2025

Uh oh!

github-actions bot commented Jun 23, 2025 •

edited

Loading

Uh oh!

ntBre left a comment

Uh oh!

ntBre Jun 23, 2025

Uh oh!

ntBre Jun 23, 2025

Uh oh!

ntBre Jun 23, 2025

Uh oh!

MichaReiser Jun 24, 2025

Uh oh!

lubaskinc0de Jun 29, 2025

Uh oh!

ntBre Jun 30, 2025

Uh oh!

lubaskinc0de Jul 1, 2025

Uh oh!

lubaskinc0de Jul 1, 2025

Uh oh!

ntBre Jul 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	if checker.is_rule_enabled(Rule::UnicodeKindPrefix) {
	for string_part in value {
	pyupgrade::rules::unicode_kind_prefix(checker, string_part);
	}
	}

[ruff] Fix syntax error introduced for an empty string followed by a u-prefixed string (UP025) #18899

[ruff] Fix syntax error introduced for an empty string followed by a u-prefixed string (UP025) #18899

Uh oh!

Conversation

lubaskinc0de commented Jun 23, 2025

Summary

Test Plan

Uh oh!

github-actions bot commented Jun 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

ruff-ecosystem results

Linter (stable)

Linter (preview)

Uh oh!

ntBre left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[`ruff`] Fix syntax error introduced for an empty string followed by a u-prefixed string (`UP025`) #18899

[`ruff`] Fix syntax error introduced for an empty string followed by a u-prefixed string (`UP025`) #18899

github-actions bot commented Jun 23, 2025 •

edited

Loading

`ruff-ecosystem` results