DISC: Make all user-facing dtypes ExtensionDtype #53778

Open
jbrockmendel opened this issue Jun 21, 2023 · 1 comment
Labels
API Design; Clean; ExtensionArray (Extending pandas with custom dtypes or arrays.); PDEP missing values (Issues that would be addressed by the Ice Cream Agreement from the Aug 2023 sprint); Refactor (Internal refactoring of code)

Comments

@jbrockmendel
Member

jbrockmendel commented Jun 21, 2023

A lot of internal code could be simplified if we always worked with ExtensionArrays (EAs) instead of having to check for EA-vs-ndarray. In particular I'm thinking of Block.values, ArrayManager.arrays, Series._values, Index._data, Index._values, and maybe frame._values.
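To make the EA-vs-ndarray check concrete, here is a minimal sketch of the kind of branching the comment describes. It is an illustration only, not pandas' internal code: `describe_backing` is a hypothetical helper, and it reads the private `Series._values` attribute named above, whose behavior may differ between pandas versions.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray

numpy_backed = pd.Series([1, 2, 3])                 # int64, stored as an ndarray
ea_backed = pd.Series([1, 2, None], dtype="Int64")  # stored as a masked ExtensionArray


def describe_backing(ser: pd.Series) -> str:
    """Hypothetical helper: report which kind of array backs a Series."""
    values = ser._values  # private attribute, one of those named in the issue text
    if isinstance(values, ExtensionArray):
        return "ExtensionArray"
    return "ndarray"


print(describe_backing(numpy_backed))
print(describe_backing(ea_backed))
```

If every Series were backed by an ExtensionArray, the `isinstance` branch in a helper like this would no longer be needed.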

The other big upside is that essentially all numpy-specific logic could eventually be migrated into PandasArray methods.

We could make that change without changing the .dtype/.dtypes properties, but on balance I think the inconsistency that would introduce would be a footgun.
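A rough analogue of that inconsistency already exists on the public `Series.array` accessor, which wraps NumPy-backed data in an ExtensionArray. The sketch below assumes a reasonably recent pandas; the concrete wrapper class names differ between versions, so only the base classes are checked.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray, ExtensionDtype

ser = pd.Series([1.5, 2.5])

public_dtype = ser.dtype       # what users see today: an np.dtype
wrapper = ser.array            # ExtensionArray wrapper around the ndarray
wrapper_dtype = wrapper.dtype  # ExtensionDtype wrapper around the np.dtype

print(isinstance(public_dtype, np.dtype))
print(isinstance(wrapper, ExtensionArray))
print(isinstance(wrapper_dtype, ExtensionDtype))
```

The same data is described by two different dtype objects depending on which attribute is read, which is the sort of mismatch that would spread if the internals moved to EAs while `.dtype` kept returning `np.dtype`.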

This might be a hassle for users who rely on obj.dtype/dtypes being np.dtype objects, or on the attributes listed above being ndarrays.
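As a sketch of the user code that could be affected, compare a check that assumes `obj.dtype` is an `np.dtype` with one that goes through `pandas.api.types`. Both function names here are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_integer_dtype

numpy_backed = pd.Series([1, 2, 3])
ea_backed = pd.Series([1, 2, None], dtype="Int64")


def is_integer_numpy_only(ser: pd.Series) -> bool:
    """Hypothetical check that only recognizes np.dtype-backed integers."""
    return isinstance(ser.dtype, np.dtype) and ser.dtype.kind in "iu"


def is_integer_any_backend(ser: pd.Series) -> bool:
    """Hypothetical check that does not care how the dtype is represented."""
    return is_integer_dtype(ser.dtype)


for ser in (numpy_backed, ea_backed):
    print(ser.dtype, is_integer_numpy_only(ser), is_integer_any_backend(ser))
```

Code written like the first function would stop matching NumPy-backed data if `.dtype` started returning an ExtensionDtype; the second style should be unaffected.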

If we go down this road I think it'd be important to rename PandasDtype->NumpyExtensionDtype or something like it (xref #53694). Maybe even rename ExtensionDtype->Dtype.

xref #24662 about implementing a dtype for tznaive dt64 and td64.
xref #40021, which would be "fixed" because the monkeypatching in the PandasArray tests would no longer be necessary.
xref #24877, which would be much more appealing if we didn't need to unwrap PandasArray anytime we see it.

BTW I'm not "+1" on this ATM, just thinking about it.

@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jun 21, 2023
@lithomas1 lithomas1 added API Design Refactor Internal refactoring of code Clean ExtensionArray Extending pandas with custom dtypes or arrays. and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jun 21, 2023
@jbrockmendel
Member Author

cc @jorisvandenbossche

@jbrockmendel jbrockmendel added the PDEP missing values Issues that would be addressed by the Ice Cream Agreement from the Aug 2023 sprint label Oct 25, 2023
2 participants