docs: Explain differences between list.eval and list.agg #25336

@cr7pt0gr4ph7

Description

There is a dedicated list.agg method for aggregations, plus a list.eval method that can do both element-wise mapping and aggregations. It is currently not clear when to use which, or what advantage one has over the other: even the linked documentation of the two methods only contains examples that yield the same results when one method is swapped for the other.

Even the docstring for list.agg presents the following example, which arguably isn't an aggregation (an operation that maps many values to a single value) but a per-element filter:

>>> df.with_columns(no_nulls=pl.col.a.list.agg(pl.element().drop_nulls()))
shape: (3, 2)
┌──────────────┬───────────┐
│ a            ┆ no_nulls  │
│ ---          ┆ ---       │
│ list[i64]    ┆ list[i64] │
╞══════════════╪═══════════╡
│ [1, null]    ┆ [1]       │
│ [42, 13]     ┆ [42, 13]  │
│ [null, null] ┆ []        │
└──────────────┴───────────┘

...so it would arguably be more accurate to use list.eval instead:

>>> df.with_columns(no_nulls=pl.col.a.list.eval(pl.element().drop_nulls()))
shape: (3, 2)
┌──────────────┬───────────┐
│ a            ┆ no_nulls  │
│ ---          ┆ ---       │
│ list[i64]    ┆ list[i64] │
╞══════════════╪═══════════╡
│ [1, null]    ┆ [1]       │
│ [42, 13]     ┆ [42, 13]  │
│ [null, null] ┆ []        │
└──────────────┴───────────┘

The only obvious difference I can see is that list.eval has a parallel=True|False parameter, while list.agg doesn't. The confusion is compounded by the fact that for top-level mapping or aggregation, there is only a single DataFrame.select method instead of two separate agg/eval methods.
