BUG: Timedelta created by to_timedelta does not add correctly to datetime in Pandas 2 #53643

Open
3 tasks done
Flix6x opened this issue Jun 13, 2023 · 4 comments
Labels
Bug; Non-Nano (datetime64/timedelta64 with non-nanosecond resolution); Numeric Operations (Arithmetic, Comparison, and Logical operations); Timedelta (Timedelta data type)

Comments

@Flix6x
Contributor

Flix6x commented Jun 13, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

from datetime import datetime
import pandas as pd

# Create a datetime object (midnight)
dt = datetime(2015, 1, 1, 0)

# Create a pd.Timedelta object (1 hour)
delta_1 = pd.Timedelta("1H")

# Create a pd.Timedelta object from a DatetimeIndex frequency (1 hour)
delta_2 = pd.to_timedelta(pd.date_range('2015-01-01 00', '2015-01-01 01', freq="1H").freq)

# First check that the pd.Timedelta objects are equal
assert(delta_1 == delta_2)  # True

# Then check whether adding each pd.Timedelta to the datetime object moves the datetime by one hour: this does not behave correctly since Pandas 2.0
assert(dt + delta_1 == dt + delta_2)  # False -> AssertionError

# Left-hand side looks right: the hour was added
dt + delta_1  # datetime.datetime(2015, 1, 1, 1, 0)

# Right-hand side appears to be wrong: the hour was not added
dt + delta_2  # datetime.datetime(2015, 1, 1, 0, 0)

Issue Description

It seems that the pd.Timedelta object produced by calling pd.to_timedelta on a pd.DatetimeIndex frequency can no longer be added correctly to a datetime.datetime object in pandas versions after 1.5.3: the addition runs without error but leaves the datetime unchanged.

It's not evident to me why the two pd.Timedelta objects (appearing to be equal) are behaving differently.

Expected Behavior

Running the above example in pandas==1.5.3 (and also 1.4.4) works as expected.

Installed Versions

INSTALLED VERSIONS

commit : 4149f32
python : 3.9.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.11-76051511-generic
Version : #202112220937164018548121.04~b3a2c21-Ubuntu SMP Mon Jan 3 16:5
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.1.0.dev0+956.g4149f32388
numpy : 1.22.4
pytz : 2022.2.1
dateutil : 2.8.2
setuptools : 65.2.0
pip : 23.1.2
Cython : 0.29.30
pytest : 7.0.1
hypothesis : None
sphinx : 4.4.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : 2.9.3
jinja2 : 3.1.2
IPython : None
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : 0.56.0
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.0
snappy : None
sqlalchemy : 1.4.31
tables : None
tabulate : 0.8.9
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : 2022.5
qtpy : None
pyqt5 : None

@Flix6x added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels on Jun 13, 2023
@jbrockmendel
Member

I have an explanation but not a solution.

delta_2 here has unit="s". Timedeltas with unit="s" can go outside the range supported by datetime.timedelta, so the way they are initialized passes seconds=0 to the parent class (datetime.timedelta) constructor. So the pytimedelta struct (which pandas ignores) looks like timedelta(0). When you do dt + delta_2, that calls the pydatetime's __add__ method, which treats delta_2 as a regular pytimedelta, not a pd.Timedelta. So it uses the 0, giving the incorrect result.

Reversing the operation and doing delta_2 + dt gives the correct result, as does pd.Timestamp(dt) + delta_2.

A solution here would need to be in the stdlib, with datetime.__add__ returning NotImplemented.
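The mechanism described above can be reproduced without pandas. The following is a minimal sketch, assuming CPython with its C-accelerated datetime module; `ZeroedDelta` and `_real_seconds` are hypothetical names made up for illustration and are not pandas internals. It only imitates the pattern of passing seconds=0 to the datetime.timedelta constructor while keeping the real duration elsewhere.

```python
from datetime import datetime, timedelta


class ZeroedDelta(timedelta):
    """Hypothetical stand-in for a timedelta subclass whose parent struct holds 0.

    This is NOT pandas code; it only imitates the initialization pattern
    described in the comment above.
    """

    def __new__(cls, real_seconds):
        # The stdlib struct is initialized with seconds=0 ...
        self = super().__new__(cls, seconds=0)
        # ... while the real duration is kept in a separate attribute.
        self._real_seconds = real_seconds
        return self

    def __add__(self, other):
        # Reached for `delta + dt`, where the subclass is the left operand.
        if isinstance(other, datetime):
            return other + timedelta(seconds=self._real_seconds)
        return NotImplemented


dt = datetime(2015, 1, 1, 0)
delta = ZeroedDelta(3600)

forward = dt + delta   # handled by datetime.__add__, which sees timedelta(0)
reverse = delta + dt   # handled by ZeroedDelta.__add__, which uses the real value

print(forward)  # 2015-01-01 00:00:00
print(reverse)  # 2015-01-01 01:00:00
```

In this sketch the subclass never gets a chance to intervene in `dt + delta`, because the left operand's `__add__` accepts any datetime.timedelta instance and reads the stored fields directly.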

@jbrockmendel added the Timedelta (Timedelta data type) and Non-Nano (datetime64/timedelta64 with non-nanosecond resolution) labels on Nov 1, 2023
@miccoli
Contributor

miccoli commented May 25, 2024

"A solution here would need to be in the stdlib datetime.__add__ returning NotImplemented"

No way! pd.Timedelta claims to be a datetime.timedelta subclass:

>>> import datetime
>>> import pandas as pd
>>> issubclass(pd.Timedelta, datetime.timedelta)
True

The Liskov substitution principle requires that datetime.__add__(self, other) behave the same way whether other is a plain datetime.timedelta or an instance of any datetime.timedelta subclass.

By the way, why is pd.Timedelta a datetime.timedelta subclass? They are mostly incompatible. The only way forward seems to me to completely drop the claim that pd.Timedelta should be a datetime.timedelta subclass.

@mroeschke added the Numeric Operations (Arithmetic, Comparison, and Logical operations) label and removed the Needs Triage (Issue that has not been reviewed by a pandas team member) label on Jul 17, 2024
@GiorgioBalestrieri

Just caught a bug caused by this while trying to upgrade to pandas 2.0. This is both quite concerning (as mentioned above, pd.Timedelta should behave like dt.timedelta, since it claims to be a subclass) and potentially a blocker, because we'll have to go and ensure that every single pd.Timedelta is converted to dt.timedelta before doing any arithmetic on it.

I really don't want to sound like I'm demanding a fix for a bug/issue in an open source library/project, but I think solving this should be a priority.
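The conversion step mentioned above could be centralized in one place rather than repeated at every call site. This is only a sketch of one possible workaround, not a fix endorsed in this thread: `to_stdlib_timedelta` and `add_delta` are hypothetical helper names, and the code assumes that pandas objects expose `to_pytimedelta()` (which pd.Timedelta does) and that it returns the real duration.

```python
from datetime import datetime, timedelta


def to_stdlib_timedelta(delta):
    """Return a plain datetime.timedelta for a timedelta-like object.

    Assumption: pandas-style objects expose to_pytimedelta(); anything else
    is expected to already be a datetime.timedelta.
    """
    convert = getattr(delta, "to_pytimedelta", None)
    if callable(convert):
        return convert()
    if isinstance(delta, timedelta):
        return delta
    raise TypeError(f"unsupported delta type: {type(delta).__name__}")


def add_delta(dt, delta):
    """Hypothetical helper: add a delta to a datetime via the stdlib type only."""
    return dt + to_stdlib_timedelta(delta)


# Plain stdlib values pass through unchanged.
print(add_delta(datetime(2015, 1, 1, 0), timedelta(hours=1)))  # 2015-01-01 01:00:00
```

With pandas installed, `add_delta(dt, delta_2)` from the reproducible example would go through `delta_2.to_pytimedelta()` before the addition. The alternatives already noted in this thread, `delta_2 + dt` and `pd.Timestamp(dt) + delta_2`, avoid the helper entirely.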

@giuliobeseghi

+1. This behavior doesn't seem right, and since pandas 2.0 is increasingly being adopted, this is quite a risky bug.
