ENH: Add future.python_scalars #63016

rhshadrach · 2025-11-06T22:22:37Z

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Adds an experimental option to return Python scalars instead of NumPy scalars across the API. This is not yet implemented everywhere (e.g. Series.__getitem__ is not covered), but I'm hoping reductions are a substantial chunk of the work.

This is complicated by #62988, where it was found that many of our doctests are not running. We run those doctests using NumPy>=2, and to get them to pass as-is we would need to change the NumPy reprs from e.g. 2 to np.int64(2). If we then change reductions et al. to return Python scalars, we would change all those reprs back from np.int64(2) to 2. So instead I think we can:

1. Merge this experimental option, not yet advertising it to users.
2. Merge (after some work) DOC: Run all doctests #62988, where we run doctests with the experimental option enabled. This would reduce churn in the documentation.
3. Finish work on this option, expose it to users in pandas 3.x, and start the deprecation process for changing the default.
4. Change the default of future.python_scalars to True in 4.0 and deprecate the future option.
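
The repr churn described above can be reproduced with NumPy alone. This is a minimal sketch, assuming only that NumPy is installed; it does not use the new option, and `.item()` here merely stands in for whatever unboxing pandas ends up doing.

```python
import numpy as np

# A NumPy scalar, as a reduction might return today.
numpy_scalar = np.int64(2)

# The equivalent built-in Python scalar.
python_scalar = numpy_scalar.item()

# On NumPy >= 2 the first repr is 'np.int64(2)'; on NumPy 1.x it is '2'.
print(repr(numpy_scalar))

# The Python int repr is '2' regardless of the NumPy version.
print(repr(python_scalar))
```

Because the Python scalar's repr does not depend on the NumPy version, doctests written against it would not need to change again once the default flips.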

rhshadrach · 2025-11-06T23:24:41Z

cc @jbrockmendel @mroeschke @jorisvandenbossche

jbrockmendel · 2025-11-07T18:05:07Z

Perf impact?

possible xref #13468, #23106, #29738, #20791, #21256

rhshadrach · 2025-11-07T18:33:09Z

> Perf impact?

I plan to run a full set of ASVs next week. In the meantime, some microbenchmarks:

import numpy as np
import pandas as pd

from pandas.core.dtypes.cast import maybe_unbox_numpy_scalar

with pd.option_context("python_scalars", True):
    %timeit maybe_unbox_numpy_scalar(np.int64(2))
    # 828 ns ± 9.91 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
    %timeit maybe_unbox_numpy_scalar(2)
    # 161 ns ± 0.414 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

ser = pd.Series([1, 2, 3] * 10_000)
with pd.option_context("python_scalars", True):
    %timeit ser.sum()
    # 9.42 μs ± 423 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
with pd.option_context("python_scalars", False):
    %timeit ser.sum()
    # 8.28 μs ± 137 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
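
Outside IPython, a rough version of the first comparison can be made with the standard-library timeit module. This is only a sketch: `.item()` is used as a stand-in for the unboxing step and is not the PR's `maybe_unbox_numpy_scalar`, so the absolute numbers will differ from those above.

```python
import timeit

import numpy as np

numpy_scalar = np.int64(2)
python_scalar = 2

n = 100_000

# Cost of converting a NumPy scalar to a Python int via .item().
unbox_time = timeit.timeit(lambda: numpy_scalar.item(), number=n)

# Baseline: a lambda that just returns an existing Python int.
noop_time = timeit.timeit(lambda: python_scalar, number=n)

print(f"np.int64 -> int via .item(): {unbox_time / n * 1e9:.0f} ns per call")
print(f"Python int passthrough:      {noop_time / n * 1e9:.0f} ns per call")
```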

Dr-Irv · 2025-11-10T18:03:44Z

pandas/core/dtypes/cast.py

+    if using_python_scalars() and isinstance(value, np.generic):
+        if isinstance(result, np.longdouble):
+            result = float(result)


What if value is np.int? Don't you want to handle that differently?

rhshadrach · Nov 10, 2025

Can you give an example input to this function that you think gives an undesirable result?

Dr-Irv · Nov 10, 2025

`np.array([1, 2, 3]).sum()` returns `np.int64(6)`, so wouldn't `maybe_unbox_numpy_scalar(value)` need to return `6` instead of `6.0` if `value` were `np.int64(6)`?
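
To make the question concrete, here is a hypothetical stand-in for the function under review. The name `unbox_numpy_scalar_sketch` is invented, the `using_python_scalars()` check is omitted, and it assumes `result` is obtained via `value.item()`, which the quoted diff does not show. Under those assumptions, the `float(...)` branch only applies to `np.longdouble`, so an integer input stays an integer.

```python
import numpy as np


def unbox_numpy_scalar_sketch(value):
    """Hypothetical sketch: convert a NumPy scalar to a Python scalar."""
    if not isinstance(value, np.generic):
        # Already a plain Python object; return it unchanged.
        return value
    # Assumption: .item() is the conversion step used before the check below.
    result = value.item()
    if isinstance(result, np.longdouble):
        # .item() may leave a longdouble as a NumPy scalar, so fall back
        # to float (possibly losing precision).
        result = float(result)
    return result


print(type(unbox_numpy_scalar_sketch(np.int64(6))))         # <class 'int'>
print(type(unbox_numpy_scalar_sketch(np.float64(6.5))))     # <class 'float'>
print(type(unbox_numpy_scalar_sketch(np.longdouble(1.5))))  # <class 'float'>
```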

rhshadrach added 3 commits November 6, 2025 17:02

ENH: Add future.python_scalars

477cc4f

Indicate config is experimental

bd953a2

Add CI job

0896a2f

rhshadrach mentioned this pull request Nov 10, 2025

DOC: Series.sum() has examples that don't illustrate the actual results #62966


Dr-Irv reviewed Nov 10, 2025
