Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from numpy import nan
from pandas import DataFrame, DatetimeIndex
df = DataFrame({'B': [1, nan, 4]})
times = DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03'])
# ignore_na=False: the last value of the result is the expected 3.4
right = df.ewm(halflife='1 day', times=times, ignore_na=False).mean()
# ignore_na=True: same data and times, but the result differs
wrong = df.ewm(halflife='1 day', times=times, ignore_na=True).mean()
print(wrong)
print(right)
Issue Description
When the times parameter is passed, NA values should not affect the weights. ignore_na is meant to remove the NAs from the perspective of the weights (it controls whether past weights are based on absolute or relative positions), but when times are passed in, the positions can only be absolute.
The current implementation does not warn and returns a result that differs from the expected behavior. From tweaking the ordering of the data and the input times, I think the underlying bug could be that ignore_na=True drops only the NA values but NOT the corresponding weights (which should also be dropped).
The solution should be either to force ignore_na to be False when times is passed (and say so in the docs), or to drop the weights properly when ignore_na=True (this reaches the same answer but takes more work).
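To make the two readings concrete, here is a minimal pure-Python sketch. It is not the pandas implementation. The function names are hypothetical, times are expressed as day offsets rather than timestamps, and the second function is only a model of the suspected bug described above (the time gap ending on an NA row is never applied), not a transcription of pandas internals.

```python
import math


def ewm_mean_by_time(values, days, halflife_days=1.0):
    """Time-decayed mean where each weight depends only on absolute time gaps.

    A NaN observation is dropped together with its weight.
    """
    out = []
    for i, t_now in enumerate(days):
        num = den = 0.0
        for x, t in zip(values[: i + 1], days[: i + 1]):
            if math.isnan(x):
                continue  # drop the NaN and its weight
            w = 0.5 ** ((t_now - t) / halflife_days)
            num += w * x
            den += w
        out.append(num / den if den else math.nan)
    return out


def ewm_mean_skip_decay_on_nan(values, days, halflife_days=1.0):
    """Hypothetical model of the suspected bug.

    Older weights are decayed only on rows that hold an observation, and only
    by the gap to the immediately preceding row, so a gap that ends on a NaN
    row is lost.
    """
    out = []
    num = den = 0.0
    for i, x in enumerate(values):
        if math.isnan(x):
            out.append(num / den if den else math.nan)
            continue
        if i > 0:
            decay = 0.5 ** ((days[i] - days[i - 1]) / halflife_days)
            num *= decay
            den *= decay
        num += x
        den += 1.0
        out.append(num / den)
    return out


values = [1.0, math.nan, 4.0]
days = [0.0, 1.0, 2.0]
print(ewm_mean_by_time(values, days))            # last entry is 3.4
print(ewm_mean_skip_decay_on_nan(values, days))  # last entry is 3.0
```

In the first function, dropping an NA together with its weight gives the same answer as the ignore_na=False reading, which is why either proposed fix would reach the same result when times are given.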
Expected Behavior
The ignore_na=False call is correct: its last value is the proper answer of 3.4.
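For reference, the value 3.4 can be checked by hand. This assumes weights that halve for every halflife h = 1 day of elapsed time, with the observations x_0 = 1 on 2020-01-01 and x_2 = 4 on 2020-01-03, and the NA on 2020-01-02 contributing nothing:

```latex
w_j = 2^{-(t_2 - t_j)/h}, \qquad w_0 = 2^{-2} = 0.25, \qquad w_2 = 2^{0} = 1

\bar{x}_2 = \frac{w_0 x_0 + w_2 x_2}{w_0 + w_2}
          = \frac{0.25 \cdot 1 + 1 \cdot 4}{0.25 + 1}
          = \frac{4.25}{1.25}
          = 3.4
```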
Installed Versions
INSTALLED VERSIONS
commit : 0f43794
python : 3.11.0.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.191-el7.lime.2.x86_64
Version : #1 SMP Thu Jul 14 19:09:52 EDT 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.0.3
numpy : 1.24.2
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 67.1.0
pip : 23.0
Cython : None
pytest : 7.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.9.0
pandas_datareader: None
bs4 : 4.11.2
bottleneck : None
brotli :
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.3
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.1
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.10.0
snappy : None
sqlalchemy : 2.0.14
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : 2.3.0
pyqt5 : None