ENH: add consortium standard entrypoint #54383

Merged on Aug 7, 2023 (12 commits)
1 change: 1 addition & 0 deletions ci/deps/actions-311-downstream_compat.yaml
@@ -73,5 +73,6 @@ dependencies:
  - pyyaml
  - py
  - pip:
    - dataframe-api-compat>=0.1.7
    - pyqt5>=5.15.6
    - tzdata>=2022.1
1 change: 1 addition & 0 deletions ci/deps/actions-39-minimum_versions.yaml
@@ -61,5 +61,6 @@ dependencies:
  - zstandard=0.17.0

  - pip:
    - dataframe-api-compat==0.1.7
    - pyqt5==5.15.6
    - tzdata==2022.1
11 changes: 11 additions & 0 deletions doc/source/getting_started/install.rst
@@ -415,3 +415,14 @@ brotli 0.7.0 compression Brotli compression
python-snappy             0.6.1              compression     Snappy compression
Zstandard                 0.17.0             compression     Zstandard compression
========================= ================== =============== =============================================================

Consortium Standard
^^^^^^^^^^^^^^^^^^^

Installable with ``pip install "pandas[consortium-standard]"``

========================= ================== =================== =============================================================
Dependency                Minimum Version    pip extra           Notes
========================= ================== =================== =============================================================
dataframe-api-compat      0.1.7              consortium-standard Consortium Standard-compatible implementation based on pandas
========================= ================== =================== =============================================================
121 changes: 62 additions & 59 deletions doc/source/whatsnew/v2.1.0.rst
@@ -218,6 +218,7 @@ Other enhancements
- Many read/to_* functions, such as :meth:`DataFrame.to_pickle` and :func:`read_csv`, support forwarding compression arguments to lzma.LZMAFile (:issue:`52979`)
- Reductions :meth:`Series.argmax`, :meth:`Series.argmin`, :meth:`Series.idxmax`, :meth:`Series.idxmin`, :meth:`Index.argmax`, :meth:`Index.argmin`, :meth:`DataFrame.idxmax`, :meth:`DataFrame.idxmin` are now supported for object-dtype objects (:issue:`4279`, :issue:`18021`, :issue:`40685`, :issue:`43697`)
- :meth:`DataFrame.to_parquet` and :func:`read_parquet` will now write and read ``attrs`` respectively (:issue:`54346`)
- Added support for the DataFrame Consortium Standard (:issue:`54383`)
- Performance improvement in :meth:`GroupBy.quantile` (:issue:`51722`)

.. ---------------------------------------------------------------------------
@@ -256,65 +257,67 @@ Increased minimum versions for dependencies
Some minimum supported versions of dependencies were updated.
If installed, we now require:

+-----------------+-----------------+----------+---------+
| Package | Minimum Version | Required | Changed |
+=================+=================+==========+=========+
| numpy | 1.22.4 | X | X |
+-----------------+-----------------+----------+---------+
| mypy (dev) | 1.4.1 | | X |
+-----------------+-----------------+----------+---------+
| beautifulsoup4 | 4.11.1 | | X |
+-----------------+-----------------+----------+---------+
| bottleneck | 1.3.4 | | X |
+-----------------+-----------------+----------+---------+
| fastparquet | 0.8.1 | | X |
+-----------------+-----------------+----------+---------+
| fsspec | 2022.05.0 | | X |
+-----------------+-----------------+----------+---------+
| hypothesis | 6.46.1 | | X |
+-----------------+-----------------+----------+---------+
| gcsfs | 2022.05.0 | | X |
+-----------------+-----------------+----------+---------+
| jinja2 | 3.1.2 | | X |
+-----------------+-----------------+----------+---------+
| lxml | 4.8.0 | | X |
+-----------------+-----------------+----------+---------+
| numba | 0.55.2 | | X |
+-----------------+-----------------+----------+---------+
| numexpr | 2.8.0 | | X |
+-----------------+-----------------+----------+---------+
| openpyxl | 3.0.10 | | X |
+-----------------+-----------------+----------+---------+
| pandas-gbq | 0.17.5 | | X |
+-----------------+-----------------+----------+---------+
| psycopg2 | 2.9.3 | | X |
+-----------------+-----------------+----------+---------+
| pyreadstat | 1.1.5 | | X |
+-----------------+-----------------+----------+---------+
| pyqt5 | 5.15.6 | | X |
+-----------------+-----------------+----------+---------+
| pytables | 3.7.0 | | X |
+-----------------+-----------------+----------+---------+
| pytest | 7.3.2 | | X |
+-----------------+-----------------+----------+---------+
| python-snappy | 0.6.1 | | X |
+-----------------+-----------------+----------+---------+
| pyxlsb | 1.0.9 | | X |
+-----------------+-----------------+----------+---------+
| s3fs | 2022.05.0 | | X |
+-----------------+-----------------+----------+---------+
| scipy | 1.8.1 | | X |
+-----------------+-----------------+----------+---------+
| sqlalchemy | 1.4.36 | | X |
+-----------------+-----------------+----------+---------+
| tabulate | 0.8.10 | | X |
+-----------------+-----------------+----------+---------+
| xarray | 2022.03.0 | | X |
+-----------------+-----------------+----------+---------+
| xlsxwriter | 3.0.3 | | X |
+-----------------+-----------------+----------+---------+
| zstandard | 0.17.0 | | X |
+-----------------+-----------------+----------+---------+
+----------------------+-----------------+----------+---------+
| Package | Minimum Version | Required | Changed |
+======================+=================+==========+=========+
| numpy | 1.22.4 | X | X |
+----------------------+-----------------+----------+---------+
| mypy (dev) | 1.4.1 | | X |
+----------------------+-----------------+----------+---------+
| beautifulsoup4 | 4.11.1 | | X |
+----------------------+-----------------+----------+---------+
| bottleneck | 1.3.4 | | X |
+----------------------+-----------------+----------+---------+
| dataframe-api-compat | 0.1.7 | | X |
+----------------------+-----------------+----------+---------+
| fastparquet | 0.8.1 | | X |
+----------------------+-----------------+----------+---------+
| fsspec | 2022.05.0 | | X |
+----------------------+-----------------+----------+---------+
| hypothesis | 6.46.1 | | X |
+----------------------+-----------------+----------+---------+
| gcsfs | 2022.05.0 | | X |
+----------------------+-----------------+----------+---------+
| jinja2 | 3.1.2 | | X |
+----------------------+-----------------+----------+---------+
| lxml | 4.8.0 | | X |
+----------------------+-----------------+----------+---------+
| numba | 0.55.2 | | X |
+----------------------+-----------------+----------+---------+
| numexpr | 2.8.0 | | X |
+----------------------+-----------------+----------+---------+
| openpyxl | 3.0.10 | | X |
+----------------------+-----------------+----------+---------+
| pandas-gbq | 0.17.5 | | X |
+----------------------+-----------------+----------+---------+
| psycopg2 | 2.9.3 | | X |
+----------------------+-----------------+----------+---------+
| pyreadstat | 1.1.5 | | X |
+----------------------+-----------------+----------+---------+
| pyqt5 | 5.15.6 | | X |
+----------------------+-----------------+----------+---------+
| pytables | 3.7.0 | | X |
+----------------------+-----------------+----------+---------+
| pytest | 7.3.2 | | X |
+----------------------+-----------------+----------+---------+
| python-snappy | 0.6.1 | | X |
+----------------------+-----------------+----------+---------+
| pyxlsb | 1.0.9 | | X |
+----------------------+-----------------+----------+---------+
| s3fs | 2022.05.0 | | X |
+----------------------+-----------------+----------+---------+
| scipy | 1.8.1 | | X |
+----------------------+-----------------+----------+---------+
| sqlalchemy | 1.4.36 | | X |
+----------------------+-----------------+----------+---------+
| tabulate | 0.8.10 | | X |
+----------------------+-----------------+----------+---------+
| xarray | 2022.03.0 | | X |
+----------------------+-----------------+----------+---------+
| xlsxwriter | 3.0.3 | | X |
+----------------------+-----------------+----------+---------+
| zstandard | 0.17.0 | | X |
+----------------------+-----------------+----------+---------+

For `optional libraries <https://pandas.pydata.org/docs/getting_started/install.html>`_ the general recommendation is to use the latest version.
The following table lists the lowest version per library that is currently being tested throughout the development of pandas.
1 change: 1 addition & 0 deletions environment.yml
@@ -115,6 +115,7 @@ dependencies:
  - pygments # Code highlighting

  - pip:
    - dataframe-api-compat>=0.1.7
    - sphinx-toggleprompt # conda-forge version has stricter pins on jinja2
    - typing_extensions; python_version<"3.11"
    - tzdata>=2022.1
1 change: 1 addition & 0 deletions pandas/compat/_optional.py
@@ -19,6 +19,7 @@
    "blosc": "1.21.0",
    "bottleneck": "1.3.4",
    "brotli": "0.7.0",
    "dataframe-api-compat": "0.1.7",
    "fastparquet": "0.8.1",
    "fsspec": "2022.05.0",
    "html5lib": "1.1",
15 changes: 15 additions & 0 deletions pandas/core/frame.py
@@ -932,6 +932,21 @@ def __dataframe__(

        return PandasDataFrameXchg(self, nan_as_null, allow_copy)

    def __dataframe_consortium_standard__(
        self, *, api_version: str | None = None
    ) -> Any:
        """
        Provide entry point to the Consortium DataFrame Standard API.

        This is developed and maintained outside of pandas.
        Please report any issues to https://github.com/data-apis/dataframe-api-compat.
        """
        dataframe_api_compat = import_optional_dependency("dataframe_api_compat")
        convert_to_standard_compliant_dataframe = (
            dataframe_api_compat.pandas_standard.convert_to_standard_compliant_dataframe
        )
        return convert_to_standard_compliant_dataframe(self, api_version=api_version)

    # ----------------------------------------------------------------------

    @property
17 changes: 17 additions & 0 deletions pandas/core/series.py
@@ -39,6 +39,7 @@
from pandas._libs.lib import is_range_indexer
from pandas.compat import PYPY
from pandas.compat._constants import REF_COUNT
from pandas.compat._optional import import_optional_dependency
from pandas.compat.numpy import function as nv
from pandas.errors import (
    ChainedAssignmentError,
@@ -955,6 +956,22 @@ def __array__(self, dtype: npt.DTypeLike | None = None) -> np.ndarray:
            arr.flags.writeable = False
        return arr

    # ----------------------------------------------------------------------

    def __column_consortium_standard__(self, *, api_version: str | None = None) -> Any:
        """
        Provide entry point to the Consortium DataFrame Standard API.

        This is developed and maintained outside of pandas.
        Please report any issues to https://github.com/data-apis/dataframe-api-compat.
        """
        dataframe_api_compat = import_optional_dependency("dataframe_api_compat")
        return (
            dataframe_api_compat.pandas_standard.convert_to_standard_compliant_column(
                self, api_version=api_version
            )
        )

    # ----------------------------------------------------------------------
    # Unary Methods

21 changes: 21 additions & 0 deletions pandas/tests/test_downstream.py
@@ -334,6 +334,27 @@ def test_from_obscure_array(dtype, array_likes):
    tm.assert_index_equal(result, expected)


def test_dataframe_consortium() -> None:
    """
    Test some basic methods of the dataframe consortium standard.

    Full testing is done at https://github.com/data-apis/dataframe-api-compat,
    this is just to check that the entry point works as expected.
    """
    pytest.importorskip("dataframe_api_compat")
    df_pd = DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
    df = df_pd.__dataframe_consortium_standard__()
    result_1 = df.get_column_names()
    expected_1 = ["a", "b"]
    assert result_1 == expected_1

    ser = Series([1, 2, 3])
    col = ser.__column_consortium_standard__()
    result_2 = col.get_value(1)
    expected_2 = 2
    assert result_2 == expected_2


def test_xarray_coerce_unit():
    # GH44053
    xr = pytest.importorskip("xarray")
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -82,11 +82,13 @@ plot = ['matplotlib>=3.6.1']
output_formatting = ['jinja2>=3.1.2', 'tabulate>=0.8.10']
clipboard = ['PyQt5>=5.15.6', 'qtpy>=2.2.0']
compression = ['brotlipy>=0.7.0', 'python-snappy>=0.6.1', 'zstandard>=0.17.0']
consortium-standard = ['dataframe-api-compat>=0.1.7']
all = ['beautifulsoup4>=4.11.1',
       # blosc only available on conda (https://github.com/Blosc/python-blosc/issues/297)
       #'blosc>=1.21.0',
       'bottleneck>=1.3.4',
       'brotlipy>=0.7.0',
       'dataframe-api-compat>=0.1.7',
       'fastparquet>=0.8.1',
       'fsspec>=2022.05.0',
       'gcsfs>=2022.05.0',
1 change: 1 addition & 0 deletions requirements-dev.txt
@@ -84,6 +84,7 @@ feedparser
pyyaml
requests
pygments
dataframe-api-compat>=0.1.7
sphinx-toggleprompt
typing_extensions; python_version<"3.11"
tzdata>=2022.1
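
Note for readers of this diff: the pattern added to `frame.py` and `series.py` (a dunder entry point that imports an optional package only when called) can be sketched without pandas or `dataframe-api-compat` installed. The snippet below is only an illustration under stated assumptions, not the pandas implementation. `ToyFrame`, `toy_compat`, `import_optional`, `column_names`, and `_StandardFrame` are hypothetical names invented for this example; only the `__dataframe_consortium_standard__` method name, its `api_version` keyword, and `get_column_names()` come from the diff above.

```python
import importlib
import sys
import types


def import_optional(name: str, extra: str):
    """Import `name` on demand; raise a descriptive ImportError if it is absent."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            f"Missing optional dependency '{name}'. "
            f"Install the '{extra}' extra to use this entry point."
        ) from err


class ToyFrame:
    """Hypothetical stand-in for a dataframe class (not pandas.DataFrame)."""

    def __init__(self, data: dict):
        self.data = data

    def __dataframe_consortium_standard__(self, *, api_version=None):
        # Imported inside the method, so the compat package is only needed
        # by callers that actually use the entry point.
        compat = import_optional("toy_compat", extra="consortium-standard")
        return compat.convert(self, api_version=api_version)


def column_names(obj):
    """Consumer side: accept any object that exposes the entry point."""
    if not hasattr(obj, "__dataframe_consortium_standard__"):
        raise TypeError("object does not expose the consortium entry point")
    return obj.__dataframe_consortium_standard__().get_column_names()


frame = ToyFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Without the (hypothetical) compat package, the call fails with a clear message.
try:
    column_names(frame)
    missing_message = None
except ImportError as err:
    missing_message = str(err)


class _StandardFrame:
    """Hypothetical standard-compliant wrapper returned by the fake package."""

    def __init__(self, wrapped, api_version):
        self._wrapped = wrapped
        self.api_version = api_version

    def get_column_names(self):
        return list(self._wrapped.data)


# Register a fake module to simulate the optional package being installed.
fake = types.ModuleType("toy_compat")
fake.convert = lambda obj, api_version=None: _StandardFrame(obj, api_version)
sys.modules["toy_compat"] = fake

names = column_names(frame)
```

The point of the deferred import is that the dataframe library itself never depends on the compat package; only code that calls the entry point does, which matches how the diff lists `dataframe-api-compat` as an optional extra.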