Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comprehensions poor formatting #733

Closed
jgirardet opened this issue Mar 4, 2019 · 7 comments
Closed

comprehensions poor formatting #733

jgirardet opened this issue Mar 4, 2019 · 7 comments
Labels
F: linebreak How should we split up lines? T: style What do we want Blackened code to look like?

Comments

@jgirardet
Copy link
Contributor

black 18.9b

class Test:
    def test_list_fields(self, big, big_j, param):
        assert [asdict(y) for x in big for y in getattr(x, param)] == [
            y for x in big_j for y in x[param]
        ]

I don't know if something better can be done but thats not very readable

@ambv
Copy link
Collaborator

ambv commented Mar 14, 2019

I agree this does not look great. However, there's no way for Black to handle all those situations correctly without making it way more complex than it is today. It would need to start understanding concepts like "symmetry" to format this better automatically.

If I tweak the current heuristic to solve this particular situation better, it will suck for other kinds of asserts. I will look into this but this is hard to get right.

If the auto-formatter fails to format very complex expressions, I'd suggest maybe refactoring it slightly:

class Test:
    def test_list_fields(self, big, big_j, param):
        expected = [asdict(y) for x in big for y in getattr(x, param)]
        actual = [y for x in big_j for y in x[param]]
        assert expected == actual

@JelleZijlstra JelleZijlstra added the T: style What do we want Blackened code to look like? label May 5, 2019
@devxpy
Copy link

devxpy commented Aug 8, 2019

I'm not sure whether this is possible, but this transformation looks very pretty

items = [item for iterable in iterables for item in iterable if item < 5]

To

items = [
    item 
    for iterable in iterables 
        for item in iterable 
            if item < 5
]

@pykong
Copy link

pykong commented Nov 7, 2019

The lack of formatting for comprehensions is my sole pain using black.
As comprehensions are such central feature of python, formatting them is an absolute must.
Hence please make this the top priority for new features to implement.

On the implementation

Simple comprehensions, i.e. those not featuring conditionals or nesting are fine just being put on single line. However, once you deal with complex comprehensions proper formatting becomes vital to maintain readability.

The formatting convention suggests putting statements on separate line, like:

values = [
   expression
   for value in collection
   if condition
]

For nested comprehensions indention like the one suggested by @devxpy in his comment certainly also helps.

It might also be worthwhile considering keeping each level of iteration on the same level indentation.
Modifying devxpy 's example would then give us:

items = [
    item 
        for iterable in iterables 
    for item in iterable 
    if item < 5
]

@freakboy3742
Copy link

I appreciate that this is a complex topic, with lots of edge cases, and I'm sure opinions abound. I'll also pre-declare that I'm not familiar with the internals of black, so if there are existing terms of art that I'm missing, I apologize.

However, I humbly submit a proposal:

Proposal

Comprehensions should always be treated as multiline expressions, analogous to long lists/dicts that won't fit on a single line, with each "clause" in the expression being an item in the list. By this rule:

[A for C in D if E]
{A for C in D for E in F if G}
{A: B for C in D if E]

becomes

[
    A 
    for C in D 
    if E
]
{
    A 
    for C in D 
    for E in F
    if G
}
{
    A: B 
    for C in D 
    if E
]

The only exception to this would be if the comprehension is "simple"; "simple" expressions would be collapsed to a single line of code, formatted as originally presented; or, as

[
    A for C in D if E
]

if necessary given the context in which the comprehension is being used.

I appreciate that "simple" is a bit of a handwave-y statement; but as a starting point for discussion, I'll assert a comprehension ceases to be "simple" if:

  • It has more than one attribute access in any single term of the overall expression
  • It has more than one operator in any single term of the overall expression
  • A single term in the expression contains both an attribute access and an operator
  • The entire expression has a length exceeding 40 chars
  • It has an if clause
  • It has more than 1 for clause

I won't claim this is a perfect definition of "simple", but it's a starting point.

Using this strategy, the example from the original post would become:

class Test:
    def test_list_fields(self, big, big_j, param):
        assert [
            asdict(y) 
            for x in big 
            for y in getattr(x, param)
        ] == [
            y 
            for x in big_j 
            for y in x[param]
        ]

Yes, this uses a lot more lines of code. However, I would also argue that it is a very dense and complex expression, and the extra whitespace significantly improves readability (although I'll acknowledge that is subjective)

By the rule I've expressed here, if the "for x" clauses weren't needed, both sides would be considered "simple" comprehensions - and the entire statement would fit on one line.

        assert [asdict(y) for y in getattr(x, param)] == [y for y in x[param]]

Underlying reasoning

One of the goals of Black is to produce small, easily readable diffs on future edits of the code. The simplest list/set comprehension is a dense expression containing at least 3 sub-expressions; the simplest dict comprehension has at least 4 sub-expressions. Each one of those sub-expressions could potentially be the subject of a meaningful code change. If there is any complexity in those terms, the code (and, by extension, a diff of that code) immediately becomes harder to read.

A key-value reverse comprehension:

{v: k for k, v in mydata.items()}

is nearing the simplest possible dictionary expression you can write, but to my eyes is also nearing the practical limit of what can be reasonably consumed as a single line (especially if that expression is inlined as an argument to a function or something similar).

The slightly more complex expression:

{v.upper(): k.upper() for k, v in mydata.items()}

would exceed the 40 character "complexity" limit, and be transformed into:

{
    v.upper(): k.upper() 
    for k, v in mydata.items()
}

This approach doesn't require any analysis of "symmetry" or similar aesthetic concerns. It only aims to recognize that dense comprehensions are, in fact, complex statements, and a little whitespace can considerably improve readability.

@JelleZijlstra JelleZijlstra added the F: linebreak How should we split up lines? label May 30, 2021
@mira01
Copy link

mira01 commented Nov 18, 2021

+1

@JelleZijlstra
Copy link
Collaborator

The original example here is a case of #236. It may be better to open a new issue for suggestions that we always break up comprehensions.

@freakboy3742
Copy link

@JelleZijlstra As requested, I've moved the comprehension line-splitting conversation to #2841.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F: linebreak How should we split up lines? T: style What do we want Blackened code to look like?
Projects
None yet
Development

No branches or pull requests

7 participants