Closed
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
# gets upcast to object
df = pd.DataFrame({"a": [1, 2, 3]}, dtype="int64")
df.loc[4] = pd.NA
# gets upcast to float64
df = pd.DataFrame({"a": [1, 2, 3]}, dtype="int64")
df.loc[4] = np.nan
# gets upcast to object
df = pd.DataFrame({"a": [1, 2, 3]}, dtype="Int64")
df.loc[4] = pd.NA
# gets upcast to Float64
df = pd.DataFrame({"a": [1, 2, 3]}, dtype="Int64")
df.loc[4] = np.nan
# can hold a missing value when initialized with it and remain Int64
df = pd.DataFrame({"a": [1, 2, 3, pd.NA]}, dtype="Int64")
# then gets upcast anyway when you add a second missing value
df.loc[4] = pd.NA
Issue Description
Int64 can hold missing values. However, when a missing value is added as a new row to an int64 or Int64 column, the column gets upcast to float64, Float64, or object, depending on the starting dtype and on whether the value is pd.NA or np.nan.
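To check these outcomes on a particular installation, the four cases from the example can be run in a loop. This is only a sketch: the helper name `dtype_after_enlargement` is made up for illustration, and no output is shown because the resulting dtypes may differ between pandas versions.

```python
import numpy as np
import pandas as pd


def dtype_after_enlargement(start_dtype, value):
    """Return the dtype of column "a" after adding a new row holding `value`.

    Hypothetical helper for illustration, not part of pandas.
    """
    df = pd.DataFrame({"a": [1, 2, 3]}, dtype=start_dtype)
    df.loc[4] = value  # label 4 does not exist yet, so this enlarges the frame
    return df["a"].dtype


# Every combination of starting dtype and missing-value sentinel from the report.
results = {
    (start, repr(value)): dtype_after_enlargement(start, value)
    for start in ("int64", "Int64")
    for value in (pd.NA, np.nan)
}

for (start, value), result in results.items():
    print(f"{start} + {value} -> {result}")
```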
Expected Behavior
int64 should be upcast to Int64, and Int64 should not be upcast at all.
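Until enlargement through `.loc` behaves this way, one possible workaround is to convert to the nullable dtype first and add the new label with `reindex`, which fills the new row with the dtype's own missing value. The sketch below assumes every column is an integer column. `append_missing_row` is a hypothetical helper name, not a pandas API.

```python
import pandas as pd


def append_missing_row(df, label):
    """Add a row of missing values at `label` and keep nullable integer dtypes.

    Hypothetical helper; assumes every column can be cast to Int64.
    """
    nullable = df.astype("Int64")  # int64 -> Int64; Int64 stays as it is
    # Reindexing to an extended index fills the new row with pd.NA
    # rather than going through setting-with-enlargement.
    return nullable.reindex([*nullable.index, label])


df_int = append_missing_row(pd.DataFrame({"a": [1, 2, 3]}, dtype="int64"), 4)
df_nullable = append_missing_row(
    pd.DataFrame({"a": [1, 2, 3, pd.NA]}, dtype="Int64"), 4
)
print(df_int.dtypes)
print(df_nullable.dtypes)
```

This avoids the `.loc` code path entirely, so it does not fix the reported behavior. It only gives the dtype that the expected behavior describes.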
Installed Versions
1.4.2