Skip to content

Conversation

dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Sep 18, 2025

What type of PR is this? (check all applicable)

  • ✨ Feature

Related issues

Tasks

Mapping things out a bit, no compliant yet
There's a few gaps, but overall surprised how much was reusable 🥳
lol didn't realise it was just describing python dict behavior
Everything here seems to be working already? 😱

May as well show it off

```py
>>> df.group_by("a", nwp.nth(2, 8)).agg(nwp.mean("d", "e", "g").name.suffix("_mean"))
NotImplementedError: TODO: `GroupBy.agg` needs a `CompliantGroupBy` to dispatch to:

keys:
(a=col('a'), c=col('c'), i=col('i'))

aggs:
(d_mean=col('d').mean(), e_mean=col('e').mean(), g_mean=col('g').mean())

result_schema:
FrozenSchema([
 ('a', String),
 ('c', Int64),
 ('i', Unknown),
 ('d_mean', Unknown),
 ('e_mean', Unknown),
 ('g_mean', Unknown),
])
```
Quite different to current version(s)
@dangotbanned dangotbanned added the enhancement New feature or request label Sep 18, 2025
@dangotbanned dangotbanned mentioned this pull request Sep 18, 2025
60 tasks
Gonna need space for the mini translator
Borrowing some ideas from #2528, #2680
`pyarrow` has the same behavior as `polars`
Wasn't expecting so much to be working already 🥳 🥳 🥳
Just pushing this as tests are working.

Useful changes to follow:
- Column renaming stuff will be avoidable
  - we just use `ArrowAggExpr.output_name`
- Awkward stuff `first`, `last`, `_ensure_single_thread` can be avoided
  - `use_threads` was always available on `Declaration.to_table`
  - Whether we need to use can just be an `__ior__`
`ArrowDataFrame.drop_nulls` is shorter and waaaaaaay more efficient than `main`
Comment on lines +130 to +144
class temp: # noqa: N801
"""Temporary mini namespace for temporary utils."""

_MAX_ITERATIONS: ClassVar[int] = 100

@classmethod
def column_name(
cls,
source: _StoresColumns | Iterable[str],
/,
*,
prefix: str = "nw",
n_chars: int = 16,
) -> str:
"""Generate a single, unique column name that is not present in `source`."""
Copy link
Member Author

@dangotbanned dangotbanned Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Document + test all the temp stuff

Rushed this a bit since I needed temp.column_names.

Would be nice to have something to point to about why this:

nw.temp.column_name(df, prefix="count_")

is preferable to this:

subset = df.columns
nw.generate_temporary_column_name(n_bytes=8, columns=subset, prefix="count_")

RE: #3147

Comment on lines 83 to 88
# NOTE: Want to be able to call `with_columns` at compliant level - and still get the right schema
# - Currently it acts like select in `group_by`
# - Doing some gymnastics to workaround for now
def with_columns(self, *exprs: OneOrIterable[IntoExpr], **named_exprs: Any) -> Self:
named_irs, _ = self._project(exprs, named_exprs, ExprContext.WITH_COLUMNS)
return self._from_compliant(self._compliant.with_columns(named_irs))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Don't merge back into (#2572) without addressing this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added these rules before I realised it described `dict`
…ojection`

Can accomodate
- any narwhals/compliant-level frame
- any reasonable transformation on a schema
- the result of `freeze_schema` can be passed *back* into itself for free
- Currently just replaces the narwhals-level bits
- Next part is making use of them for compliant-level
Tricky to do this in a way that doesn't end up duplicating lots of logic/layers
- Enough to pass the remaining tests
- Hoping to reduce the complexity in `__iter__` next
Bug isnt present on `main`, probably need to redo the resolve stuff to fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request internal tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant