-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Allow newlines after function headers without docstrings #21110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary -- This is a first step toward fixing #9745. After reviewing our open issues and several Black issues and PRs, I personally found the function case the most compelling, especially with very long argument lists: ```py def func( self, arg1: int, arg2: bool, arg3: bool, arg4: float, arg5: bool, ) -> tuple[...]: if arg2 and arg3: raise ValueError ``` or many annotations: ```py def function( self, data: torch.Tensor | tuple[torch.Tensor, ...], other_argument: int ) -> torch.Tensor | tuple[torch.Tensor, ...]: do_something(data) return something ``` I think docstrings help the situation substantially both because syntax highlighting will usually give a very clear separation between the annotations and the docstring and because we already allow a blank line _after_ the docstring: ```py def function( self, data: torch.Tensor | tuple[torch.Tensor, ...], other_argument: int ) -> torch.Tensor | tuple[torch.Tensor, ...]: """ A function doing something. And a longer description of the things it does. """ do_something(data) return something ``` There are still other comments on #9745, such as [this one] with 9 upvotes, where users specifically request blank lines in all block types, or at least including conditionals and loops. I'm sympathetic to that case as well, even if personally I don't find an [example] like this: ```py if blah: # Do some stuff that is logically related data = get_data() # Do some different stuff that is logically related results = calculate_results() return results ``` to be more readable at all than: ```py if blah: # Do some stuff that is logically related data = get_data() # Do some different stuff that is logically related results = calculate_results() return results ``` I'm probably just used to the latter from the formatters I've used, but I do prefer it. I also think that functions are the least susceptible to the accidental introduction of a newline after refactoring described in Micha's [comment] on #8893. I actually considered further restricting this change to functions with multiline headers. I don't think very short functions like: ```py def foo(): return 1 ``` benefit nearly as much from the allowed newline, but I just went with any function without a docstring or immediate comment for now. I guess a marginal case like: ```py def foo(a_long_parameter: ALongType, b_long_parameter: BLongType) -> CLongType: return 1 ``` might be a good argument for not restricting it. I caused a couple of syntax errors before adding special handling for the ellipsis-only case, so I suspect that there are some other interesting edge cases that may need to be handled better. Test Plan -- Existing tests, plus a few simple new ones. As noted above, I suspect that we may need a few more for edge cases I haven't considered. [this one]: #9745 (comment) [example]: psf/black#902 (comment) [comment]: #8893 (comment)
|
|
The ecosystem results look good to me. There were only a couple of somewhat surprising cases:
These both feel a bit awkward to me because they preserve a blank line inside the inner functions but not outside. But I guess we're only preserving the input whitespace, so this might still be what the authors intended. |
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of restricting it to functions with docstrings but I'm also still concerned that it reduces consistency and will lead to formatting discussions on diff, the very problem Ruff format tries to solve.
For example, I dislike this change and I'd be inclined to comment in the PR review that the empty line here is unnecessary:
def nlg_app(base_url="/"):
+
app = Sanic("test_nlg")
@app.route(base_url, methods=["POST"])
Having said that, I think this strikes a good balance.
I went through the original issue and I noticed that we don't preserve the empty line for
if long_boolean_variable_name and another_long_conditional:
# A pretty long comment documenting what the block does
print("Do stuff")Which I'd find surprising and seems worth checking if we can support it too
| # And we discard empty lines after the first: | ||
| def partially_preserved1(): | ||
|
|
||
|
|
||
| return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add more tests to this, specifically tests including comments or cases where the first element is a function on its own?
| && matches!(self.kind, SuiteKind::Function) | ||
| && matches!(first, SuiteChildStatement::Other(_)) | ||
| && !comments.has_leading(first) | ||
| && !contains_only_an_ellipsis(statements, f.context().comments()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This elipsis check should exactly match the check we do in the function formatting (also test for trailing comments)
ruff/crates/ruff_python_formatter/src/statement/clause.rs
Lines 451 to 453 in c4506f0
| if should_collapse_stub | |
| && contains_only_an_ellipsis(self.body, f.context().comments()) | |
| && self.trailing_comments.is_empty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added two new tests trying to exercise this, and they both seemed to work as I expected without any additional check:
def preserved5():
...
# trailing comment prevents collapsing the stub
def removed4():
... # trailing same-line comment does not prevent collapsing the stuband in fact when I updated this condition to:
let will_collapse_stub_function =
contains_only_an_ellipsis(statements, f.context().comments())
&& !first_comments.has_trailing();I caused another syntax error for the removed4 case. contains_only_an_ellipsis already has one kind of check for trailing comments:
ruff/crates/ruff_python_formatter/src/statement/suite.rs
Lines 730 to 739 in d38a529
| pub(crate) fn contains_only_an_ellipsis(body: &[Stmt], comments: &Comments) -> bool { | |
| match body { | |
| [Stmt::Expr(ast::StmtExpr { value, .. })] => { | |
| let [node] = body else { | |
| return false; | |
| }; | |
| value.is_ellipsis_literal_expr() | |
| && !comments.has_leading(node) | |
| && !comments.has_trailing_own_line(node) | |
| } |
but I'm very likely to be missing something. Is first_comments not what I should be checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's not relevant. You have to trace through FormatStmtFunctionDef.
But yes, first_comments is the wrong thing to test. trailing_comments in FormatClauseBody are the trailing_definition_comments in FormatStmtFunctionDef. So I think it's more about something like
def test(): # comment
...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, I see now. Thanks! Yes, your example comment prevents collapsing the stub in FormatClauseBody, but we remove the newline because the logic is different.
Is there a way to pass this should_collapse_stub information into the suite formatting (maybe another with_options implementation?)? Or alternatively a way to retrieve the parent function definition so that we could recompute it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed on Discord, I added a variant of contains_only_an_ellipsis that returns the ellipsis statement, which we can then format directly in FormatClauseBody instead of calling into the suite formatting at all. Now the blank line in this example is correctly preserved.
| && !contains_only_an_ellipsis(statements, f.context().comments()); | ||
|
|
||
| if allow_newline_after_block_open | ||
| && lines_before(first.start(), f.context().source()) > 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is correct but maybe worth adding a test that we collapse any empty line before a comment:
def test():
# comment
agets formatted to
def test():
# comment
aThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that different from the new test here?
ruff/crates/ruff_python_formatter/resources/test/fixtures/ruff/newlines.py
Lines 358 to 362 in abb54b3
| def removed2(): | |
| # Comment | |
| return 1 |
It looks like the formatting got slightly mangled in your comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, this is what's coming up in the ecosystem check now. I accidentally skirted around this by not preserving cases with any leading comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lines_before may not be what I want here, if we want to preserve blank lines before comments. All of these are returning 2 lines_before:
def untouched1():
# comment
# more lines in the comment
# and one more
return 0
def untouched2():
# comment
return 0
def untouched3():
# comment
return 0There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That, or you'd have to pass the position of the first leading comment as the offset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, we do have a lines_after_ignore_trivia (or similar) that uses the tokenizer over doing a string search
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I didn't consider that I could pass a comment offset there. I assumed it had to be a statement node. Thanks!
I saw lines_after_ignore_trivia, but not a before variant. Happy to try that approach if it would be better, but using the comment offset seems to be working well.
crates/ruff_python_formatter/tests/snapshots/format@range_formatting__indent.py.snap
Show resolved
Hide resolved
Should we try preserving the empty line for any One reason I was wary of the conditionals is that they don't seem much different from loops. For example: for i in some_generator_function(
with_many_arguments=[1, 2, 3], another=(4, 5, 6), one_more=None
):
# a comment
print("body of the loop")Should we handle those too? Functions seemed like a good place to draw the line to me, but I can see an argument for more consistency too. |
No sorry. I think I tested it with a function header but then copied the original example from the issue, but maybe I'm just jetlagged and tested it with a function header? So what I think we should preserve is: def foo():
# A pretty long comment documenting what the block does
print("Do stuff") |
|
Ohh okay, that makes sense! So preserve lines before comments instead of removing them. I think that does make sense. At least one person on one of the Black issues even wanted to preserve empty lines before docstrings. I don't feel too strongly about either comments or docstrings, I was just trying to narrow the scope even further. |
The docstrings make sense to me (not allowing an empty line) and is sort of named exactly in the rule. The comments just were a surprise to me (but I don't feel strongly about it) |
|
Oops, last commit was too simplistic. We're now adding blank lines before comments rather than just preserving them. That looks like the majority of new ecosystem hits. |
we're adding a blank line before a comment if the there's a blank line after it
|
The ecosystem check looks good to me again. I didn't click through to every change, but I downloaded the full output from the Actions log and checked more than the subset in the comment.
|
this avoids needing to keep the two should_collapse checks in sync
Summary
This is a first step toward fixing #9745. After reviewing our open issues and several Black issues and PRs, I personally found the function case the most compelling, especially with very long argument lists:
or many annotations:
I think docstrings help the situation substantially both because syntax highlighting will usually give a very clear separation between the annotations and the docstring and because we already allow a blank line after the docstring:
There are still other comments on #9745, such as this one with 9 upvotes, where users specifically request blank lines in all block types, or at least including conditionals and loops. I'm sympathetic to that case as well, even if personally I don't find an example like this:
to be much more readable than:
I'm probably just used to the latter from the formatters I've used, but I do prefer it. I also think that functions are the least susceptible to the accidental introduction of a newline after refactoring described in Micha's comment on #8893.
I actually considered further restricting this change to functions with multiline headers. I don't think very short functions like:
benefit nearly as much from the allowed newline, but I just went with any function without a docstring for now. I guess a marginal case like:
might be a good argument for not restricting it.
I caused a couple of syntax errors before adding special handling for the ellipsis-only case, so I suspect that there are some other interesting edge cases that may need to be handled better.
Test Plan
Existing tests, plus a few simple new ones. As noted above, I suspect that we may need a few more for edge cases I haven't considered.