DISC: Make all user-facing dtypes ExtensionDtype #53778

Open
@jbrockmendel

A lot of code could be simplified internally if we were always working with EAs (ExtensionArrays) instead of having to check for EA-vs-ndarray. In particular I'm thinking of Block.values, ArrayManager.arrays, Series._values, Index._data, Index._values, maybe frame._values.
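To make the EA-vs-ndarray check concrete, here is a rough sketch using only public API rather than the internal attributes listed above. `describe_values` is a hypothetical helper written for illustration, not something that exists in pandas; it just stands in for the kind of branch that would no longer be needed.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionArray


def describe_values(values):
    """Hypothetical helper: the EA-vs-ndarray branch in question."""
    if isinstance(values, ExtensionArray):
        return "EA"
    if isinstance(values, np.ndarray):
        return "ndarray"
    raise TypeError(f"unexpected container: {type(values)!r}")


numpy_backed = pd.Series([1, 2, 3])
nullable = pd.Series([1, 2, None], dtype="Int64")

# Series.values returns a raw ndarray for NumPy dtypes and an EA otherwise,
# so callers have to handle both cases.
kinds_today = [describe_values(numpy_backed.values), describe_values(nullable.values)]

# Series.array always returns an EA (a thin wrapper around the ndarray for
# NumPy dtypes), so only one branch is ever taken.
kinds_wrapped = [describe_values(numpy_backed.array), describe_values(nullable.array)]
```

With `.values` both branches are hit; with `.array` only the EA branch is, which is roughly the simplification being discussed for the internal attributes.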

The other big upside is that essentially all numpy-specific logic could eventually be migrated into PandasArray methods.

We could make that change without changing the .dtype/.dtypes properties, but on the margin I think the inconsistency this would introduce would be a footgun.

This might be a hassle for users who rely on obj.dtype/dtypes being np.dtype objects, or on the ndarray attributes listed above being ndarrays.
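As an illustration of the kind of user code that could be affected, the sketch below compares what `Series.dtype` returns today for a NumPy-backed Series with the dtype of the EA wrapper exposed by `Series.array`. `to_numpy_dtype` is a hypothetical helper, and the assumption that the wrapper dtype keeps a `numpy_dtype` attribute under this proposal is mine, not something decided in this issue.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype

ser = pd.Series([1.5, 2.5])

# Today: a NumPy-backed Series reports a plain np.dtype, and user code may
# depend on that.
is_np_dtype_today = isinstance(ser.dtype, np.dtype)

# The EA wrapper around the same data already reports an ExtensionDtype.
wrapped_dtype = ser.array.dtype
is_ea_dtype = isinstance(wrapped_dtype, ExtensionDtype)


def to_numpy_dtype(dtype):
    """Hypothetical helper for user code that wants an np.dtype either way."""
    if isinstance(dtype, np.dtype):
        return dtype
    # The NumPy-wrapping extension dtype exposes the wrapped dtype here;
    # returns None for extension dtypes that have no such attribute.
    return getattr(dtype, "numpy_dtype", None)


underlying = to_numpy_dtype(wrapped_dtype)
```

Code written against something like `to_numpy_dtype` would behave the same whether `.dtype` returns an np.dtype or the wrapper dtype; code that does `isinstance(obj.dtype, np.dtype)` directly would not.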

If we go down this road I think it'd be important to rename PandasDtype->NumpyExtensionDtype or something like it (xref #53694). Maybe even rename ExtensionDtype->Dtype.

xref #24662 about implementing a dtype for tznaive dt64 and td64.
xref #40021 which would be "fixed" because the monkeypatching in the PandasArray tests would no longer be necessary.
xref #24877 would be much more appealing if we didn't need to unwrap PandasArray anytime we see it.

BTW I'm not "+1" on this ATM, just thinking about it.

Labels: API Design; Clean; ExtensionArray (Extending pandas with custom dtypes or arrays.); PDEP missing values (Issues that would be addressed by the Ice Cream Agreement from the Aug 2023 sprint); Refactor (Internal refactoring of code)
