Allow dtype promotion in Series[ExtensionArray].__setitem__? #24020

Closed
@TomAugspurger

Description

Pandas typically allows for __setitem__ to change an object's dtype.

In [5]: a = pd.Series([1, 2])

In [6]: a[0] = 'a'

In [7]: a
Out[7]:
0    a
1    2
dtype: object

This typically won't work for ExtensionArrays:

In [8]: b = pd.Series([1, 2], dtype='Int64')

In [9]: b[0] = 'a'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-a40f9ead6ccf> in <module>
----> 1 b[0] = 'a'

...

TypeError: <U1 cannot be converted to an IntegerDtype

So, two questions:

  1. Do we want to allow this kind of type promotion for EA-backed series?
  2. If so, how do we design it? Presumably something like ExtensionArray._can_hold_item(item: Union[scalar, Sequence[scalar]]) -> bool. Or maybe the return value could be True for "I can hold this", False for "No, raise an exception", and something else (a dtype?) for "Yes, but astype me to something else first".
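To make the three-way return value concrete, here is a minimal pure-Python sketch. Nothing in it is pandas API: `ToyIntArray`, `ToyObjectArray`, and `setitem_with_promotion` are hypothetical names, the hook handles scalars only, and the rule "int fits, str forces object, anything else is rejected" is an arbitrary choice for illustration.

```python
from typing import Any, List, Sequence, Union

# Hypothetical verdict type: True, False, or the name of a dtype to cast to first.
CanHold = Union[bool, str]


class ToyObjectArray:
    """Toy object-dtype array that accepts any value (not a pandas class)."""

    dtype = "object"

    def __init__(self, values: Sequence[Any]) -> None:
        self._data: List[Any] = list(values)

    def _can_hold_item(self, item: Any) -> CanHold:
        return True


class ToyIntArray:
    """Toy stand-in for an integer ExtensionArray (not a pandas class)."""

    dtype = "Int64"

    def __init__(self, values: Sequence[int]) -> None:
        self._data: List[Any] = list(values)

    def _can_hold_item(self, item: Any) -> CanHold:
        # Hypothetical hook named after the proposal; scalar items only.
        if isinstance(item, int) and not isinstance(item, bool):
            return True        # "I can hold this"
        if isinstance(item, str):
            return "object"    # "Yes, but astype me to something else first"
        return False           # "No, raise an exception"

    def astype(self, dtype: str) -> ToyObjectArray:
        if dtype != "object":
            raise TypeError(f"cannot cast {self.dtype} to {dtype}")
        return ToyObjectArray(self._data)  # copies the underlying list


def setitem_with_promotion(arr: Any, key: int, value: Any) -> Any:
    """Set arr[key] = value, casting first if the array asks for it.

    Returns the array that now holds the value, which may be a new object.
    """
    verdict = arr._can_hold_item(value)
    if verdict is False:
        raise TypeError(f"{value!r} cannot be stored in {arr.dtype}")
    if verdict is not True:
        arr = arr.astype(verdict)  # the verdict names the target dtype
    arr._data[key] = value
    return arr
```

In this sketch the caller (playing the role of Series) owns the decision to swap arrays, so the array itself never changes dtype in place; whether pandas would want that split of responsibility is exactly the open design question.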

I need to work around this for DatetimeTZArray, since we do allow setting a value with a different timezone (upcasting to object).

In [8]: ser = pd.Series(pd.date_range("2000", periods=4, tz='UTC'))

In [9]: ser.dtype
Out[9]: datetime64[ns, UTC]

In [10]: ser[0] = pd.Timestamp('2000', tz='US/Central')

In [11]: ser.dtype
Out[11]: dtype('O')
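For comparison, the same end state can be reached without relying on implicit promotion by upcasting explicitly before the assignment. This is only a sketch of that alternative, assuming pandas is installed; it is not the workaround referred to above.

```python
import pandas as pd

ser = pd.Series(pd.date_range("2000", periods=4, tz="UTC"))

# Explicit upcast: astype(object) returns a new Series of tz-aware Timestamps.
obj = ser.astype(object)

# An object-dtype Series accepts a Timestamp in any timezone.
obj.iloc[0] = pd.Timestamp("2000", tz="US/Central")
```

The original `ser` keeps its datetime64 timezone-aware dtype; only `obj` holds the mixed-timezone values.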

Metadata

    Labels

    Closing Candidate: May be closeable, needs more eyeballs
    Dtype Conversions: Unexpected or buggy dtype conversions
    Enhancement
    ExtensionArray: Extending pandas with custom dtypes or arrays.
    Needs Discussion: Requires discussion from core team before further action
