Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
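import pandas as pd

df = pd.DataFrame({"a": range(1000)})
df["a"] = 1.2345678 * df["a"]  # float64 values from 0 to about 1233
df["b"] = df["a"].astype("float16")  # same data stored in half precision
df.describe()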
                 a          b
count  1000.000000  1000.0000
mean    616.666616        inf
std     356.567176        inf
min       0.000000     0.0000
25%     308.333308   308.4375
50%     616.666616   616.7500
75%     924.999924   924.8750
max    1233.333232  1233.0000
Issue Description
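When a column is cast to a low-precision float dtype such as float16, describe() reports a wrong mean or std. In the example above, the mean of column b is inf instead of roughly 616.67, and its std is also inf. In some cases the result is NaN even though the column contains no NaN values, for example: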
count 674522.000000
mean NaN
std 0.000000
min -17.359375
25% -1.610352
50% -0.280029
75% 1.049805
max 19.984375
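A likely explanation for the inf result, not confirmed in this report, is that the sum behind the mean is accumulated in float16, whose largest finite value is 65504. The sketch below uses NumPy only and is an illustration of that assumption rather than pandas' actual code path; the variable names are made up for the example.

```python
import numpy as np

# Same data as column "b" in the reproducible example.
values = (1.2345678 * np.arange(1000)).astype(np.float16)

# Largest finite float16 value (65504.0).
f16_max = float(np.finfo(np.float16).max)

# The true sum, computed in float64, is far above the float16 range.
true_sum = float(values.astype(np.float64).sum())

# Forcing a float16 result makes the sum overflow to inf
# (assumption: this mirrors what the reduction does internally).
with np.errstate(over="ignore"):
    f16_sum = values.sum(dtype=np.float16)

# Dividing an infinite sum by the count still gives inf.
naive_mean = f16_sum / np.float16(len(values))

print(f16_max, true_sum, f16_sum, naive_mean)
```

Every individual value fits in float16 (the maximum is about 1233), so only the aggregate overflows, which would explain why min, max and the quartiles look fine while mean and std do not.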
Expected Behavior
The mean and std of the float16 column should be finite and close to those of the same data stored as float64.
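Until the behavior changes, one possible workaround is to upcast float16 columns before calling describe(). This is a minimal sketch, assuming a temporary copy in float64 is affordable; `f16_cols` and `summary` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": 1.2345678 * np.arange(1000)})
df["b"] = df["a"].astype("float16")

# Find the half-precision columns and upcast only those.
f16_cols = [col for col in df.columns if df[col].dtype == np.float16]
summary = df.astype({col: "float64" for col in f16_cols}).describe()

# The statistics for "b" are now computed in float64.
print(summary)
```

The values in b are still rounded to float16 precision, so small differences from column a remain, but the mean and std no longer overflow.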
Installed Versions
/home/terry/.local/lib/python3.8/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit : 87cfe4e
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-46-generic
Version : #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.0
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 62.1.0
pip : 22.2.2
Cython : 0.29.30
pytest : 7.1.2
hypothesis : None
...
xlrd : 2.0.1
xlwt : None
zstandard : None
tzdata : 2022.1