Skip to content

Conversation

@ntBre
Copy link
Contributor

@ntBre ntBre commented Nov 12, 2025

Summary

This is a second attempt that fixes #19226 based on the feedback in #20494 (comment).

We currently mark the comment in an expression like this:

if '' and (not #
0):
    pass

as a leading comment on not, eventually causing a panic in
Operand::has_unparenthesized_leading_comments because the end of the comment
is greater than the start of the expression.

TextRange::new(comment.end(), expression.start()),

This PR fixes the issue by instead making such a comment a dangling comment on
the unary expression.

In the third commit, I instead tried making the comment a leading comment on the
operand, which also looks pretty reasonable to me. Making it a dangling comment
seems more in line with the docs on handle_unary_op_comments, though.

I also tried deleting the leading comment logic in favor of the new dangling
logic in the fifth commit before reverting in the sixth. This looks okay to me
too, but the current state of the PR seems like the least invasive fix.

Test Plan

A new, minimized test case based on the issue. I also checked that the original
snippet from the report works now.

ntBre and others added 6 commits November 12, 2025 15:31
Summary
--

This is a second attempt at fixing #19226 based on the feedback in #20494 (comment).

We currently mark the comment in an expression like this:

```py
if '' and (not #
0):
    pass
```

as a leading comment on `not`, eventually causing a panic in
`Operand::has_unparenthesized_leading_comments` because the end of the comment
is greater than the start of the expression.

https://github.com/astral-sh/ruff/blob/a1d9cb5830eca9b63e7fb529504fc536e99bca23/crates/ruff_python_formatter/src/expression/binary_like.rs#L843

This PR fixes the issue by instead making such a comment a dangling comment on
the unary expression.

In the third commit, I instead tried making the comment a leading comment on the
operand, which also looks pretty reasonable to me. Making it a dangling comment
seems more in line with the docs on `handle_unary_op_comments`, though.

I also tried deleting the leading comment logic in favor of the new dangling
logic in the fifth commit before reverting in the sixth. This looks okay to me
too, but the current state of the PR seems like the least invasive fix.

Test Plan
--

A new, minimized test case based on the issue. I also checked that the original
snippet from the report works now.

Co-authored-by: Takayuki Maeda <takoyaki0316@gmail.com>
fix leading comments
This reverts commit 4c8540ede798f544a895225b541f46552649fd7b.
@ntBre ntBre added bug Something isn't working formatter Related to the formatter labels Nov 12, 2025
@astral-sh-bot
Copy link

astral-sh-bot bot commented Nov 12, 2025

ruff-ecosystem results

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@ntBre ntBre marked this pull request as ready for review November 12, 2025 20:51
@ntBre ntBre requested a review from MichaReiser as a code owner November 12, 2025 20:51
if comment.end() < up_to {
.map(|lparen| lparen.start());
let up_to = lparen_start.unwrap_or(unary_op.operand.start());
if lparen_start.is_none()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test case for

if '' and (not #
(0)
):

and

if '' and (not 
	(  #
	0
)):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that immediately panicked again, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second case is okay, it formats like this:

if "" and (
    not (  #
        0
    )
):
    pass

but the first case is related to an existing test case:

x = (
    # a
    not # b
    # c
    ( # d
        # e
        True
    )
)

which formats like this:

x = (
    # a
    # b
    # c
    not (  # d
        # e
        True
    )
)

but this causes a panic on main:

x = '' and (
    # a
    not # b
    # c
    ( # d
        # e
        True
    )
)

So I'm kind of wondering if adjusting the placement is really the right fix. It seems like we often want this kind of comment to become a leading comment to preserve the relative order with other comments and a change to the parent expression is what introduces the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a different fix in binary_like in a separate branch: main...brent/unary-comment2

I kind of like this because it feels more targeted and doesn't affect any existing snapshots, but it also feels a bit like just treating the symptom. However, it seems hard to differentiate the cases in my comment above otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess that branch is quite similar to #20494 though 😅

&& comment.end() < unary_op.operand.start()
&& comment.line_position().is_end_of_line()
{
CommentPlacement::dangling(unary_op, comment)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we now need to update the documentation of this method and explain in which situation it makes comments as dangling

if ( # comment
not aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
if (
not aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa # comment
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you double check that

if (  # comment
    not aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Comment
    + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
    pass

still formats the same before and after your PR, just to make sure we don't accidentially change the stable style

@ntBre
Copy link
Contributor Author

ntBre commented Nov 14, 2025

I played a bit more with this and chatted with Micha, and I think we should just patch this in the binary expression formatting. I think the closest I got to working something out was making the comment placement handling in handle_unary_op_comment:

if comment.end() < up_to {
CommentPlacement::leading(unary_op, comment)
} else {
CommentPlacement::Default(comment)

CommentPlacement::dangling and then partitioning these dangling comments into leading and trailing comments in the unary expression formatting proper:

// Split off the comments that follow after the operator and format them as trailing comments.
// ```python
// (not # comment
// a)
// ```
trailing_comments(dangling).fmt(f)?;

But this causes issues with other parent expression formatting. For example, the if statement formatting needs to know if the condition expression has any leading (or trailing) comments so that it can parenthesize it, but if we make the comments dangling, it won't, and I was causing a syntax error in a case like this:

if (
  not
  # comment
  a):
    pass

which was formatted to

if 
# comment
not a:
    pass

I also temporarily added a check for leading comments that start after their expression in CommentPlacement::leading. Seeing that this is not the only case that violates the TextRange assumption in the binary expression formatting makes me feel slightly better about just adding a check to prevent the panic.

I still need to verify that the surrounding code makes sense after a change like in main...brent/unary-comment2, but that's what I'm currently planning to pursue.

@MichaReiser
Copy link
Member

Regarding the if formatting, I assume that needs_parentheses returns never. You'd have to change it to return Multiline if there's a leading operand comment

@ntBre
Copy link
Contributor Author

ntBre commented Nov 14, 2025

Regarding the if formatting, I assume that needs_parentheses returns never. You'd have to change it to return Multiline if there's a leading operand comment

Ooh, thank you! This is looking quite promising. I'm getting an extra newline now, but if I can fix that, this might work!

      361   378not a
      362   379 │ ):
      363   380pass
      364   381365-if (  # comment
            382+if (
            383+    # comment
      366   384not a
      367   385 │ ):
      368   386pass
      369   387

@ntBre
Copy link
Contributor Author

ntBre commented Nov 14, 2025

Tracking down that newline turned out to be quite the adventure. I think this is another case where things "just work" if the comments on the unary op are leading but would require additional special-casing as dangling comments. I'm pushing up my changes from the other branch with a bit of refactoring and some documentation, but I'm happy to revisit if I'm still missing something.

@MichaReiser
Copy link
Member

Tracking down that newline turned out to be quite the adventure. I think this is another case where things "just work" if the comments on the unary op are leading but would require additional special-casing as dangling comments. I'm pushing up my changes from the other branch with a bit of refactoring and some documentation, but I'm happy to revisit if I'm still missing something.

Can you share an example of where it goes wrong?

@ntBre
Copy link
Contributor Author

ntBre commented Nov 14, 2025

These cases all have their comments moved from end of line to own-line comments, as in the diff above:

if (
    not # comment
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
    pass

if (
  not  # comment
  a):
    pass

if '' and (not #
0):
    pass

if '' and (not #
(0)
):
    pass

if '' and (not
	(  #
	0
)):
    pass

I tracked the issue into soft_block_indent called from here in the if header formatting:

group(&soft_block_indent(&content)).fmt(f)

triggered by not having leading comments here:

if !node_comments.has_leading() && !node_comments.has_trailing() {
parenthesized("(", &format_expr, ")")
.with_hugging(is_expression_huggable(expression, f.context()))
.fmt(f)

And I was playing with this approach on yet another branch full of debug prints, in case you want to try it: main...brent/unary-3

ntBre added a commit that referenced this pull request Nov 14, 2025
see #21410 (comment), but
the short summary is that `if` (and likely other) statement formatting code that
uses `maybe_parenthesize` checks if the condition has any leading or trailing
comments, so if we try to smuggle the comments in as dangling comments, it
thinks the expression won't break, so it doesn't add parentheses when formatting
a case like this:

```py
if (
  not
  # comment
  a):
    pass
```

and we end up with a syntax error:

```py
if
not a:
    pass
```

There may be some other way around this, but this is why I'm giving up for now.
It really feels like we want another CommentPlacement variant or some kind of
dangling tag, like Micha mentioned when I met with him.
@MichaReiser
Copy link
Member

I'm not sure I follow your latest changes. Pasting your snippet shows that all these comments are places as leading unary expression comments and not dangling comments. Is this intentional? You can see this when opening the comments tab in the playground

@ntBre
Copy link
Contributor Author

ntBre commented Nov 15, 2025

Yes that's intentional. I reverted my earlier changes where I was trying to make them dangling comments, restored them to leading comments like on main, and then worked around the panic in the binary formatting instead.

Basically I gave up on the initial approach, but I'm still happy to keep trying if we come up with something better.

@MichaReiser
Copy link
Member

Oh I see. I don't think I understand the changes you made in that branch. What I'd expected:

  • Why are we now formatting the trailing comments before the operator?
  • We need to insert a hard line break if there are any leading comments (before the operator)
  • No changes should be needed in the expression formatting

Happy to jump on a pair programming on this but I think we should maybe start fresh in the unary expression formatting and try to make small incremental changes.

@ntBre
Copy link
Contributor Author

ntBre commented Nov 15, 2025

  • Why are we now formatting the trailing comments before the operator?

I thought this is what we wanted to avoid splitting the operator and the operand?

  • We need to insert a hard line break if there are any leading comments (before the operator)

If I'm thinking of the same hard line break, this would change our stable formatting. That's what I meant about the newline in my comments above. But maybe you're thinking of a different one.

  • No changes should be needed in the expression formatting

Happy to jump on a pair programming on this but I think we should maybe start fresh in the unary expression formatting and try to make small incremental changes.

That sounds good! The other branch was my attempt to start fresh, but I'm sure I missed a simpler option along the way. I'll convert this back to a draft for now.

@ntBre ntBre marked this pull request as draft November 15, 2025 19:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working formatter Related to the formatter

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Panic assertion failed: start.raw <= end.raw in ruff_python_formatter/src/expression/binary_like.rs

4 participants