ENH: add consortium standard entrypoint #54383

MarcoGorelli · 2023-08-03T10:18:48Z

Here's an entry point to the Consortium's DataFrame API Standard.

It lets dataframe-consuming libraries check for a __dataframe_consortium_standard__ attribute on any dataframe they receive. So long as they then stick to the spec defined in https://data-apis.org/dataframe-api/draft/index.html, their code should work the same way regardless of which dataframe library originally backed the data.
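For illustration, the consumer-side check described above might look roughly like the sketch below. Only the attribute name __dataframe_consortium_standard__ comes from this PR. ToyFrame, StandardFrame, names() and to_standard() are hypothetical stand-ins so that the example runs without pandas or the compat package, and the entry point takes no arguments here purely to keep the sketch small.

```python
from typing import Any, Dict, List


class StandardFrame:
    """Hypothetical stand-in for a standard-compliant dataframe object."""

    def __init__(self, columns: Dict[str, list]) -> None:
        self._columns = columns

    def names(self) -> List[str]:
        # Illustrative helper only; not taken from the spec.
        return list(self._columns)


class ToyFrame:
    """Hypothetical dataframe class that exposes the entry point."""

    def __init__(self, columns: Dict[str, list]) -> None:
        self.columns = columns

    def __dataframe_consortium_standard__(self) -> StandardFrame:
        # A real library would hand off to a compat package here.
        return StandardFrame(self.columns)


def to_standard(df: Any) -> Any:
    """Consumer-side helper: opt in only if the entry point is present."""
    if not hasattr(df, "__dataframe_consortium_standard__"):
        raise TypeError(
            f"{type(df).__name__} does not expose "
            "__dataframe_consortium_standard__"
        )
    return df.__dataframe_consortium_standard__()


std = to_standard(ToyFrame({"a": [1, 2], "b": [3, 4]}))
print(std.names())
```

The consumer never imports the producing library; it only relies on the attribute being there and on the returned object following the spec.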

Use case: scikit-learn is currently very keen on using this, as they'd rather not depend on pyarrow (which will become required in pandas), and the interchange protocol only goes so far (e.g. converting to an ndarray is out of scope for it). If we can get this to work there, other use cases may emerge.

The current spec should be enough for scikit-learn, and having this entry point makes it easier to move forwards with development (without monkey-patching / special-casing)

For reference, polars has already merged this: pola-rs/polars#10244

Maintenance burden on pandas

None

I want to be very clear about this: the compat package will be developed, maintained, and tested by the consortium and community of libraries which use it. It is up to consuming libraries to set a minimum version of the dataframe-api-compat package. No responsibility will land on pandas maintainers. If any bugs are reported to pandas, then they can (and should) be politely closed.
All this is just an entry point to the Consortium's Standard
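As a rough sketch of what "setting a minimum version" could look like on the consuming side, a library might check the installed dataframe-api-compat distribution before relying on the entry point. The helper names and the "0.1.0" minimum below are hypothetical; only the package name comes from this PR. The version parsing is deliberately simplistic (leading numeric components only), and a real project would more likely use a dedicated version-parsing library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Keep the leading numeric part of each dotted component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after padding them to the same length."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def compat_is_usable(minimum: str, dist_name: str = "dataframe-api-compat") -> bool:
    """Return False if the distribution is missing or older than `minimum`."""
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)


# Hypothetical minimum chosen by a consuming library.
print(compat_is_usable("0.1.0"))
```

With a check like this living in the consuming library, version problems surface there rather than in pandas.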

Tagging @pandas-dev/pandas-core for visibility. Some people raised objections (maintenance burden, naming) when asked in private, and I think I've addressed the concerns. Anything else?
Thanks 🙌

jbrockmendel · 2023-08-03T16:33:45Z

No objections here.

lithomas1

Can you add this to the docs as an optional dependency?
(I assume we'll want a minimum version of the standard at some point)

lithomas1

This looks good to me now.
Can you also add a whatsnew entry for 2.1.0 saying that pandas supports the consortium standard, or something like that?

changes addressed

mroeschke · 2023-08-07T20:19:33Z

Thanks @MarcoGorelli

* add consortium standard entrypoint
* add dataframe-api-compat to optional dependencies, add to test_downstream
* align
* use importorskip
* mypy
* fixup table in docs
* whatsnew
* add check to package-checks.yml

add consortium standard entrypoint

fd5021c

MarcoGorelli marked this pull request as ready for review August 3, 2023 10:38

lithomas1 previously requested changes Aug 3, 2023

lithomas1 added this to the 2.1 milestone Aug 3, 2023

MarcoGorelli added 3 commits August 3, 2023 19:59

add dataframe-api-compat to optional dependencies, add to test_downstream

956d5be

Merge remote-tracking branch 'upstream/main' into consortium-entrypoint

d1615a8

align

2dc5e04

MarcoGorelli requested a review from mroeschke as a code owner August 3, 2023 20:36

mroeschke added the Interchange (Dataframe Interchange Protocol) label Aug 3, 2023

MarcoGorelli added 3 commits August 4, 2023 11:20

Merge remote-tracking branch 'upstream/main' into consortium-entrypoint

3a78a64

use importorskip

06d0764

mypy

c6c7754

MarcoGorelli requested a review from lithomas1 August 4, 2023 12:55

MarcoGorelli added 3 commits August 4, 2023 14:31

fixup table in docs

ce7267b

whatsnew

7e2e252

Merge remote-tracking branch 'upstream/main' into consortium-entrypoint

2ce05c2

lithomas1 reviewed Aug 4, 2023

mroeschke reviewed Aug 4, 2023

environment.yml (resolved)

MarcoGorelli added 2 commits August 7, 2023 17:51

Merge remote-tracking branch 'upstream/main' into consortium-entrypoint

a4705d5

add check to package-checks.yml

cae1afb

mroeschke reviewed Aug 7, 2023

pyproject.toml (resolved)

MarcoGorelli added the Build (Library building on various platforms) label Aug 7, 2023

mroeschke approved these changes Aug 7, 2023

mroeschke merged commit 809f371 into pandas-dev:main Aug 7, 2023
70 checks passed

anmyachev mentioned this pull request Jan 26, 2024

Add consortium DataFrame standard entrypoint modin-project/modin#6890

Closed

rgommers mentioned this pull request May 29, 2024

CLN remove dataframe api consortium entrypoint #57482

Merged
