Skip to content

Converting a StringDtype series to an Inte64Dtype not working as expected #32450

Closed
@brent-field

Description

@brent-field

I am interested in converting a StringDtype series to an Inte64Dtype. The following code produces a TypeError:

x = pd.Series(['1', pd.NA, '3'], dtype=pd.StringDtype())
x.astype('Int64')
...
TypeError: data type not understood

If I rewrite it as follows, I get a different TypeError:

x = pd.Series(['1', pd.NA, '3'], dtype=pd.StringDtype())
x.astype(int)
...
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NAType

The only way I have been able to convert from StringDtype is:

x = pd.Series(['1', pd.NA, '3'], dtype=pd.StringDtype())
pd.to_numeric(x, errors='coerce').convert_dtypes()
...
0       1
1    <NA>
2       3
dtype: Int64

This works fine, but is inelegant. I would have expect astype to be able to do the conversion directly. Is there a recommended way to convert between these types?

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugExtensionArrayExtending pandas with custom dtypes or arrays.NA - MaskedArraysRelated to pd.NA and nullable extension arrays

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions