api: plan for narwhals.stable.v2
#1657
-
One more comment on this - I think this would be much simpler to implement as nw.col('sales').cum_sum().over(order_by='date') as opposed to nw.col('sales').cum_sum(order_by='date'). Because then, we just have a single nw.col('sales').cum_sum().over('store', order_by='date') whereas Having said that,
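To make the comparison concrete, here is a plain-Python sketch (no Narwhals involved) of the result that nw.col('sales').cum_sum().over('store', order_by='date') would be expected to produce: a running total per store, accumulated in date order and reported in the original row order. The helper name cum_sum_over and the sample rows are made up for illustration and are not part of any library.

```python
from collections import defaultdict


def cum_sum_over(rows, value, partition_by=None, order_by=None):
    """Running total of `value`, restarted per partition, accumulated in `order_by` order.

    Hypothetical helper for illustration only; results keep the input row order.
    """
    # Visit rows in `order_by` order (or in input order when it is not given).
    indices = range(len(rows))
    if order_by is not None:
        indices = sorted(indices, key=lambda i: rows[i][order_by])

    totals = defaultdict(int)  # one running total per partition key
    result = [None] * len(rows)
    for i in indices:
        key = rows[i][partition_by] if partition_by is not None else None
        totals[key] += rows[i][value]
        result[i] = totals[key]
    return result


rows = [
    {"store": "a", "date": 2, "sales": 10},
    {"store": "b", "date": 1, "sales": 5},
    {"store": "a", "date": 1, "sales": 3},
    {"store": "b", "date": 2, "sales": 7},
]

# Analogue of .over('store', order_by='date'): one running total per store.
per_store = cum_sum_over(rows, "sales", partition_by="store", order_by="date")
# Analogue of .over(order_by='date'): a single running total over all rows.
overall = cum_sum_over(rows, "sales", order_by="date")
```

Partitioning and ordering are two arguments of the same operation here, which is the appeal of expressing both through a single over call.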
-
Have you considered moving some of the instance attributes into Here, I'm thinking more in terms of things like:

narwhals/narwhals/dataframe.py, lines 73 to 77 in 4a2ca52

Instead you'd have:

    class BaseFrame(Generic[_FrameT]):
        _compliant_frame: Any
        _level: ClassVar[Literal["full", "lazy"]]

        @classmethod
        def _from_compliant_dataframe(cls, df: Any):
            # NOTE: `_level` would already be accessible in `__init__`
            return cls(df)

Looking at Achieves the same goal, but you get a context that is evaluated once per class (e.g. enforce some error handling at declaration time).

Short example of what that could look like:

    # nw.dataframe.DataFrame / BaseFrame
    class NwDataFrame(Generic[IntoFrameT]):
        def __init_subclass__(
            cls, *args: Any, version: Version = Version.MAIN, **kwds: Any
        ) -> None:
            super().__init_subclass__(*args, **kwds)
            cls._version: Version = version

    class DataFrame(NwDataFrame[IntoDataFrameT], version=Version.V1): ...

I'm just picking out Maybe there is good reason to be able to configure this on the instance level - but if not - you'd end up with less context that needs passing around 🙂
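For anyone who wants to try the idea in isolation, below is a self-contained sketch of the same pattern using only the standard library. The Version enum, the class names, and the validation inside __init_subclass__ are stand-ins invented for this example; they are not Narwhals' actual definitions.

```python
from enum import Enum, auto
from typing import Any, Generic, TypeVar

FrameT = TypeVar("FrameT")


class Version(Enum):
    # Stand-in for the real version enum (assumption for this sketch).
    V1 = auto()
    MAIN = auto()


class BaseFrame(Generic[FrameT]):
    _version: Version = Version.MAIN

    def __init_subclass__(
        cls, *args: Any, version: Version = Version.MAIN, **kwds: Any
    ) -> None:
        super().__init_subclass__(*args, **kwds)
        # Runs once per subclass, at class-definition time.
        if not isinstance(version, Version):
            msg = f"`version` must be a Version, got {type(version).__name__}"
            raise TypeError(msg)
        cls._version = version

    def __init__(self, df: FrameT) -> None:
        self._compliant_frame = df

    @classmethod
    def _from_compliant_dataframe(cls, df: Any) -> "BaseFrame[Any]":
        # No version argument to thread through: it already lives on `cls`.
        return cls(df)


class MainDataFrame(BaseFrame[FrameT]): ...


class V1DataFrame(BaseFrame[FrameT], version=Version.V1): ...
```

One caveat of this exact shape: a subclass of V1DataFrame that omits the version keyword falls back to the default, Version.MAIN, because __init_subclass__ runs again with its default argument. If inheritance of the parent's version is wanted, the default would need to be read from the parent class instead.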
-
I think we can aim for narwhals.stable.v2 some time in 2025.

The main things I'd like to achieve are:

- Eager dataframes, which have Series and have a defined row order (pandas, polars.DataFrame, pyarrow.Table, modin.DataFrame, cudf.DataFrame).
- Lazy frames, which support collect, adding/removing/subsetting columns, and filtering rows. Initially, at least, row-order-dependent operations (such as nw.cum_sum) will be left out. This will include polars.LazyFrame, DuckDBPyRelation, pyspark.DataFrame, ibis.Table, dask.DataFrame. Later we can add an order_by argument to these functions, which will become required when working with nw.LazyFrame.

Candidate changes:

- from_native. It will get simpler once we remove interchange-level things, but it's still fairly complicated. Not sure how much I like it, to be completely honest, but I also haven't come up with anything better.

Usage of narwhals.stable.v1 should remain unaffected.

Tentative date: some time in 2025
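As a rough illustration of the eager/lazy split described above, here is a toy sketch in plain Python. ToyEagerFrame and ToyLazyFrame are hypothetical classes written for this example only; they do not mirror Narwhals' real classes or signatures. The point is the contract: the eager frame has a defined row order, so order_by is optional, while the lazy frame refuses an order-dependent operation unless order_by is given.

```python
from itertools import accumulate


def _with_running_total(rows, column):
    # Replace `column` with its running total, following the order of `rows`.
    totals = accumulate(row[column] for row in rows)
    return [{**row, column: total} for row, total in zip(rows, totals)]


class ToyEagerFrame:
    """Toy eager frame: rows live in a list, so row order is well defined."""

    def __init__(self, rows):
        self.rows = list(rows)

    def cum_sum(self, column, *, order_by=None):
        # Row order is defined, so `order_by` is optional here.
        rows = self.rows
        if order_by is not None:
            rows = sorted(rows, key=lambda row: row[order_by])
        return ToyEagerFrame(_with_running_total(rows, column))


class ToyLazyFrame:
    """Toy lazy frame: makes no promise about row order."""

    def __init__(self, rows):
        self._rows = list(rows)

    def filter(self, predicate):
        # Order-independent operations are always allowed.
        return ToyLazyFrame(row for row in self._rows if predicate(row))

    def cum_sum(self, column, *, order_by=None):
        # Order-dependent operations must state which order they mean.
        if order_by is None:
            msg = "`order_by` is required for order-dependent operations on a lazy frame"
            raise ValueError(msg)
        ordered = sorted(self._rows, key=lambda row: row[order_by])
        return ToyLazyFrame(_with_running_total(ordered, column))

    def collect(self):
        return ToyEagerFrame(self._rows)
```

In this toy, ToyLazyFrame(rows).cum_sum("sales") raises a ValueError, while passing order_by="date" succeeds; a real implementation would build a query plan rather than sorting Python lists.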
Why remove support for the dataframe interchange protocol?
Outstanding items
Breaking changes:
- eager_or_interchange_only from from_native
- group_by(...).agg should aggregate nw.all() #1780 (to make sure that addressing it doesn't require further API breaks)
- Does nw.LazyFrame really need .clone? (feat!: require at least one expression be passed to lazyframe select and lazyframe.with_columns, remove lazyframe.clone #2206)
- native_namespace -> backend in IO functions (Deprecate native_namespace in favour of backend in IO functions #1888)
- LazyFrame.gather_every
- Should pd.Categorical get mapped to nw.Enum? (Feat: nw.Enum support for pandas #2192)
- 0 or null?
- keep argument in mode (feat: add required keep argument to Expr.mode #1793)
- LazyFrame.with_row_index: allow? require that order_by be specified? should it be nw.row_index instead? ([Enh]: Allow LazyFrame.with_row_index(..., order_by=...) #2307)
- concat(how='horizontal') for lazyframes #2340
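For the native_namespace -> backend item, a common way to stage such a rename is to accept both keywords for a while and warn on the old one. The sketch below shows that general pattern with the standard warnings module; the function name resolve_backend and its error messages are hypothetical, and this is not a description of how Narwhals implements the deprecation.

```python
import warnings


def resolve_backend(*, backend=None, native_namespace=None):
    """Return the backend to use, accepting the deprecated keyword for now.

    Hypothetical helper for illustration only.
    """
    if backend is not None and native_namespace is not None:
        msg = "Pass only `backend`; `native_namespace` is deprecated."
        raise TypeError(msg)
    if native_namespace is not None:
        # stacklevel=2 points the warning at the caller, not at this helper.
        warnings.warn(
            "`native_namespace` is deprecated, use `backend` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return native_namespace
    if backend is None:
        msg = "`backend` is required."
        raise TypeError(msg)
    return backend
```

Callers using the old keyword keep working but see a DeprecationWarning, and the old keyword can be dropped outright in a later breaking release.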