BUG: Can't add a missing value to an int64 or Int64 column without it being upcast inconsistently #47214
Comments
Thanks for the report! Certainly 1, 3, and 5 look like definitive bugs to me. For the other two:
For 1, my thinking here is that if the user encounters pd.NA, then they are working with the nullable dtypes, and so upcasting to them is okay.
1 is not a bug; currently, we treat pd.NA as object in numpy dtypes by design, as far as I am aware. 3 and 5 are bugs, but they have the same cause: we seem to cast to object when enlarging the DataFrame. This works as expected when overwriting an existing value.
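To make the overwrite-versus-enlarge distinction concrete, here is a minimal sketch. It is not the reporter's original example; the column name "a" and the sample values are assumptions chosen for illustration. The resulting dtype in the enlarging case depends on the pandas version, so the code prints it rather than assuming a value.

```python
import pandas as pd

# Hypothetical data: one nullable Int64 column named "a" (illustrative only).
overwritten = pd.DataFrame({"a": pd.array([1, 2, 3], dtype="Int64")})
enlarged = overwritten.copy()

# Overwriting: row label 0 already exists, so this sets a value in place.
overwritten.loc[0, "a"] = pd.NA

# Enlarging: row label 3 does not exist yet, so .loc appends a new row.
enlarged.loc[3, "a"] = pd.NA

print("after overwrite:", overwritten["a"].dtype)
print("after enlarge:  ", enlarged["a"].dtype)  # may differ by pandas version
```

Comparing the two printed dtypes on a given installation shows whether the enlarging path still changes the column dtype there.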
Duplicate of #32346? There is already some discussion there on the conversion to object dtype. If 3 and 5 are covered by that discussion, 1 and 2 are not bugs, and 4 is already covered as well, we should probably close this issue to help keep the discussion in one place.
Yes, you are correct. I'll try to look into this. We can close here, but maybe copy the NA case over to add tests later.
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Int64 can hold missing values; however, when adding a missing value to an int64 or Int64 column, it gets upcast to float64, Float64, or object.
Expected Behavior
int64 should be upcast to Int64, and Int64 should not be upcast at all.
Installed Versions
1.4.2