Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import pytz
from dateutil.tz import gettz
# CET = pytz.timezone("Europe/Paris")
CET = gettz("Europe/Paris")
start = pd.Timestamp("2021-10-31 01:45", tz=CET)
idx = pd.date_range(start, pd.Timestamp("2021-10-31 03:45", tz=CET), freq="30T")
pd.Series(idx, idx)
# 2021-10-31 01:45:00+02:00 2021-10-31 01:45:00+02:00
# 2021-10-31 02:15:00+02:00 2021-10-31 02:15:00+02:00
# 2021-10-31 02:45:00+02:00 2021-10-31 02:45:00+02:00
# 2021-10-31 02:15:00+01:00 2021-10-31 02:15:00+02:00 --> offset of index changes here
# 2021-10-31 02:45:00+01:00 2021-10-31 02:45:00+02:00
# 2021-10-31 03:15:00+01:00 2021-10-31 03:15:00+01:00 --> offset of values changes here
# 2021-10-31 03:45:00+01:00 2021-10-31 03:45:00+01:00
# Freq: 30T, dtype: datetime64[ns, tzfile('/usr/share/zoneinfo/Europe/Paris')]
Issue Description
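I noticed unexpected duplicated values when building a timestamp range across the end of daylight saving time (2021-10-31) with the Europe/Paris timezone from dateutil. In the output above, the index switches from +02:00 to +01:00 at the repeated 02:15, but the values keep +02:00 for the repeated 02:15 and 02:45, so those two values appear twice. There is no issue with the same timezone from pytz.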
Expected Behavior
No duplicates in the values of the series above: each value should carry the same UTC offset as its index label.
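The expectation can be spelled out by building the same seven instants in UTC and converting them to Paris time, which avoids any ambiguity in the repeated wall-clock hour. This is only a sketch and not part of the original report: the string timezone `"Europe/Paris"`, the `"30min"` frequency alias, and the helper `utc_offsets_hours` are assumptions made for illustration.

```python
import pandas as pd


def utc_offsets_hours(idx):
    """Return the UTC offset, in hours, of every timestamp in idx (illustrative helper)."""
    return [ts.utcoffset().total_seconds() / 3600 for ts in idx]


# 2021-10-31 01:45+02:00 is 2021-10-30 23:45 UTC; seven 30-minute steps
# end at 2021-10-31 02:45 UTC, which is 03:45+01:00 in Paris.
utc_idx = pd.date_range("2021-10-30 23:45", periods=7, freq="30min", tz="UTC")
paris_idx = utc_idx.tz_convert("Europe/Paris")

# Same construction as in the report: the index is used as both labels and values.
ser = pd.Series(paris_idx, index=paris_idx)

# Expected behavior: values are unique and identical to the index labels.
values_are_unique = bool(ser.is_unique)
values_match_index = bool((ser.index == pd.DatetimeIndex(ser)).all())
offsets = utc_offsets_hours(ser)

print(values_are_unique, values_match_index)
print(offsets)
```

Running the same three checks on the series from the reproducible example should show whether the values diverge from the index in a given pandas and dateutil combination.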
Installed Versions
INSTALLED VERSIONS
commit : bb1f651
python : 3.10.2.final.0
python-bits : 64
OS : Darwin
OS-release : 21.3.0
Version : Darwin Kernel Version 21.3.0: Wed Jan 5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.4.0
numpy : 1.21.0
pytz : 2020.1
dateutil : 2.8.2
pip : 22.0.4
setuptools : 58.1.0
Cython : None
pytest : None
hypothesis : None
...
xarray : None
xlrd : 2.0.1
xlwt : None
zstandard : None