Specific Timestamps breaks time series indexing (.loc returns wrong results) #18029

Closed
@linar-jether

Description

When accessing labels with .loc using a specific list of Timestamp objects or a DatetimeIndex (see attached csv file), the resulting index contains the UTC wall-clock times but still carries the original timezone offset.

This seems to happen only in very specific cases: when the index passed to .loc contains labels that do not exist in the Series or DataFrame and also contains duplicates.
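To tell whether a given lookup falls into this case, the requested labels can be checked for missing and repeated entries before calling .loc. The following is a minimal sketch, not part of the original report: the helper name `describe_lookup` is hypothetical, and the date range is shortened to one month to keep the example small.

```python
import pandas as pd

def describe_lookup(index, labels):
    """Return (missing, repeated) for the requested labels.

    missing:  labels that are not present in `index`
    repeated: labels that occur more than once in the request
    """
    labels = pd.DatetimeIndex(labels)
    missing = labels[~labels.isin(index)]
    repeated = labels[labels.duplicated()].unique()
    return missing, repeated

# Hourly, timezone-aware index (shortened range; an assumption for brevity)
idx = pd.date_range('2017-09-01', '2017-10-01 00:00:00', freq='h', tz='America/Chicago')

# Same labels as the failing case: one duplicate and one label past the end of idx
labels = pd.to_datetime([
    '2017-09-30 06:00:00',
    '2017-09-30 23:00:00',
    '2017-09-30 23:00:00',
    '2017-10-01 00:00:00',
    '2017-10-01 01:00:00',
]).tz_localize('America/Chicago')

missing, repeated = describe_lookup(idx, labels)
print("missing: ", list(missing))
print("repeated:", list(repeated))
```

If both lists are non-empty, the lookup matches the conditions described above.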

@yuval-jether

import numpy as np
import pandas as pd

# Hourly, timezone-aware index
idx = pd.date_range('2011-01-01', '2017-10-01 00:00:00', freq='h', tz='America/Chicago')
s = pd.Series(np.random.rand(len(idx)), index=idx)
# Requested labels: all present in the Series, one of them duplicated
i = pd.to_datetime(['2017-09-30 06:00:00', '2017-09-30 23:00:00', '2017-09-30 23:00:00', '2017-10-01 00:00:00']).tz_localize('America/Chicago')
s.loc[i]

Out[39]: 
2017-09-30 06:00:00-05:00    0.380138
2017-09-30 23:00:00-05:00    0.774696
2017-09-30 23:00:00-05:00    0.774696
2017-10-01 00:00:00-05:00    0.728027
dtype: float64

# Added a label that does not exist in the Series
import numpy as np
import pandas as pd

# Hourly, timezone-aware index
idx = pd.date_range('2011-01-01', '2017-10-01 00:00:00', freq='h', tz='America/Chicago')
s = pd.Series(np.random.rand(len(idx)), index=idx)
# Requested labels: one duplicate, plus '2017-10-01 01:00:00', which is past the end of idx
i = pd.to_datetime(['2017-09-30 06:00:00', '2017-09-30 23:00:00', '2017-09-30 23:00:00', '2017-10-01 00:00:00', '2017-10-01 01:00:00']).tz_localize('America/Chicago')
s.loc[i]

Out[40]: 
2017-09-30 11:00:00-05:00    0.645350
2017-10-01 04:00:00-05:00    0.099323
2017-10-01 04:00:00-05:00    0.099323
2017-10-01 05:00:00-05:00    0.037136
2017-10-01 06:00:00-05:00         NaN
dtype: float64
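In the output above, each returned label is shifted by five hours relative to the requested one while keeping the -05:00 offset. One way to detect this on a given installation is to compare the index of the result with the requested labels. The sketch below is an illustration under assumptions: the helper name `result_matches_request` is hypothetical, the date range is shortened, and the fallback to `reindex` is there because recent pandas versions raise `KeyError` when a list passed to .loc contains a missing label.

```python
import numpy as np
import pandas as pd

def result_matches_request(series, labels):
    """Return (ok, result); ok is True when result.index equals the requested labels."""
    try:
        result = series.loc[labels]
    except KeyError:
        # Recent pandas raises KeyError if any requested label is missing;
        # reindex accepts missing labels and fills their values with NaN.
        result = series.reindex(labels)
    return result.index.equals(labels), result

# Hourly, timezone-aware index (shortened range; an assumption for brevity)
idx = pd.date_range('2017-09-01', '2017-10-01 00:00:00', freq='h', tz='America/Chicago')
s = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# One duplicated label and one label that is not in the Series
i = pd.to_datetime([
    '2017-09-30 06:00:00',
    '2017-09-30 23:00:00',
    '2017-09-30 23:00:00',
    '2017-10-01 00:00:00',
    '2017-10-01 01:00:00',
]).tz_localize('America/Chicago')

ok, result = result_matches_request(s, i)
print("index matches request:", ok)
print(result)
```

On an unaffected version the check passes and only the missing label maps to NaN; a shifted index like the one reported here would make the check fail.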

Output of pd.show_versions()

pd.show_versions()
INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Windows
OS-release: 8.1
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.3
pytest: 3.0.7
pip: 9.0.1
setuptools: 36.5.0
Cython: None
numpy: 1.13.1
scipy: 0.18.1
xarray: 0.9.6
IPython: 5.5.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: 3.8.0
bs4: 4.5.1
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.4.0

Labels: Indexing (related to indexing on series/frames, not to indexes themselves), Testing (pandas testing functions or related to the test suite), Timezones (timezone data dtype)
