comprehensions poor formatting #733

jgirardet · 2019-03-04T21:47:08Z

black 18.9b

class Test:
    def test_list_fields(self, big, big_j, param):
        assert [asdict(y) for x in big for y in getattr(x, param)] == [
            y for x in big_j for y in x[param]
        ]

I don't know if something better can be done but thats not very readable

ambv · 2019-03-14T16:52:51Z

I agree this does not look great. However, there's no way for Black to handle all those situations correctly without making it way more complex than it is today. It would need to start understanding concepts like "symmetry" to format this better automatically.

If I tweak the current heuristic to solve this particular situation better, it will suck for other kinds of asserts. I will look into this but this is hard to get right.

If the auto-formatter fails to format very complex expressions, I'd suggest maybe refactoring it slightly:

class Test:
    def test_list_fields(self, big, big_j, param):
        expected = [asdict(y) for x in big for y in getattr(x, param)]
        actual = [y for x in big_j for y in x[param]]
        assert expected == actual

devxpy · 2019-08-08T16:39:42Z

I'm not sure whether this is possible, but this transformation looks very pretty

items = [item for iterable in iterables for item in iterable if item < 5]

To

items = [
    item 
    for iterable in iterables 
        for item in iterable 
            if item < 5
]

pykong · 2019-11-07T08:51:55Z

The lack of formatting for comprehensions is my sole pain using black.
As comprehensions are such central feature of python, formatting them is an absolute must.
Hence please make this the top priority for new features to implement.

On the implementation

Simple comprehensions, i.e. those not featuring conditionals or nesting are fine just being put on single line. However, once you deal with complex comprehensions proper formatting becomes vital to maintain readability.

The formatting convention suggests putting statements on separate line, like:

values = [
   expression
   for value in collection
   if condition
]

For nested comprehensions indention like the one suggested by @devxpy in his comment certainly also helps.

It might also be worthwhile considering keeping each level of iteration on the same level indentation.
Modifying devxpy 's example would then give us:

items = [
    item 
        for iterable in iterables 
    for item in iterable 
    if item < 5
]

freakboy3742 · 2020-04-19T02:54:22Z

I appreciate that this is a complex topic, with lots of edge cases, and I'm sure opinions abound. I'll also pre-declare that I'm not familiar with the internals of black, so if there are existing terms of art that I'm missing, I apologize.

However, I humbly submit a proposal:

Proposal

Comprehensions should always be treated as multiline expressions, analogous to long lists/dicts that won't fit on a single line, with each "clause" in the expression being an item in the list. By this rule:

[A for C in D if E]
{A for C in D for E in F if G}
{A: B for C in D if E]

becomes

[
    A 
    for C in D 
    if E
]
{
    A 
    for C in D 
    for E in F
    if G
}
{
    A: B 
    for C in D 
    if E
]

The only exception to this would be if the comprehension is "simple"; "simple" expressions would be collapsed to a single line of code, formatted as originally presented; or, as

[
    A for C in D if E
]

if necessary given the context in which the comprehension is being used.

I appreciate that "simple" is a bit of a handwave-y statement; but as a starting point for discussion, I'll assert a comprehension ceases to be "simple" if:

It has more than one attribute access in any single term of the overall expression
It has more than one operator in any single term of the overall expression
A single term in the expression contains both an attribute access and an operator
The entire expression has a length exceeding 40 chars
It has an if clause
It has more than 1 for clause

I won't claim this is a perfect definition of "simple", but it's a starting point.

Using this strategy, the example from the original post would become:

class Test:
    def test_list_fields(self, big, big_j, param):
        assert [
            asdict(y) 
            for x in big 
            for y in getattr(x, param)
        ] == [
            y 
            for x in big_j 
            for y in x[param]
        ]

Yes, this uses a lot more lines of code. However, I would also argue that it is a very dense and complex expression, and the extra whitespace significantly improves readability (although I'll acknowledge that is subjective)

By the rule I've expressed here, if the "for x" clauses weren't needed, both sides would be considered "simple" comprehensions - and the entire statement would fit on one line.

        assert [asdict(y) for y in getattr(x, param)] == [y for y in x[param]]

Underlying reasoning

One of the goals of Black is to produce small, easily readable diffs on future edits of the code. The simplest list/set comprehension is a dense expression containing at least 3 sub-expressions; the simplest dict comprehension has at least 4 sub-expressions. Each one of those sub-expressions could potentially be the subject of a meaningful code change. If there is any complexity in those terms, the code (and, by extension, a diff of that code) immediately becomes harder to read.

A key-value reverse comprehension:

{v: k for k, v in mydata.items()}

is nearing the simplest possible dictionary expression you can write, but to my eyes is also nearing the practical limit of what can be reasonably consumed as a single line (especially if that expression is inlined as an argument to a function or something similar).

The slightly more complex expression:

{v.upper(): k.upper() for k, v in mydata.items()}

would exceed the 40 character "complexity" limit, and be transformed into:

{
    v.upper(): k.upper() 
    for k, v in mydata.items()
}

This approach doesn't require any analysis of "symmetry" or similar aesthetic concerns. It only aims to recognize that dense comprehensions are, in fact, complex statements, and a little whitespace can considerably improve readability.

mira01 · 2021-11-18T15:02:58Z

+1

JelleZijlstra · 2022-01-29T16:52:31Z

The original example here is a case of #236. It may be better to open a new issue for suggestions that we always break up comprehensions.

freakboy3742 · 2022-01-31T04:47:53Z

@JelleZijlstra As requested, I've moved the comprehension line-splitting conversation to #2841.

JelleZijlstra added the T: style What do we want Blackened code to look like? label May 5, 2019

JelleZijlstra added the F: linebreak How should we split up lines? label May 30, 2021

JelleZijlstra closed this as completed Jan 29, 2022

freakboy3742 mentioned this issue Jan 31, 2022

Always split "complex" comprehensions into multiple lines #2841

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

comprehensions poor formatting #733

comprehensions poor formatting #733

jgirardet commented Mar 4, 2019

ambv commented Mar 14, 2019

devxpy commented Aug 8, 2019

pykong commented Nov 7, 2019 •

edited

Loading

freakboy3742 commented Apr 19, 2020

mira01 commented Nov 18, 2021

JelleZijlstra commented Jan 29, 2022

freakboy3742 commented Jan 31, 2022

comprehensions poor formatting #733

comprehensions poor formatting #733

Comments

jgirardet commented Mar 4, 2019

ambv commented Mar 14, 2019

devxpy commented Aug 8, 2019

pykong commented Nov 7, 2019 • edited Loading

On the implementation

freakboy3742 commented Apr 19, 2020

Proposal

Underlying reasoning

mira01 commented Nov 18, 2021

JelleZijlstra commented Jan 29, 2022

freakboy3742 commented Jan 31, 2022

pykong commented Nov 7, 2019 •

edited

Loading