groupby/apply behaves differently if a column is timezone-aware or not #27212

@jordansamuels

Code Sample, a copy-pastable example if possible

import pandas as pd

dates = ['2001-01-01'] * 2 + ['2001-01-02'] * 2 + ['2001-01-03'] * 2
index_no_tz = pd.DatetimeIndex(dates)
index_tz = pd.DatetimeIndex(dates, tz='UTC')
df1 = pd.DataFrame({'x': list(range(2))*3, 'y': range(6), 't': index_no_tz})
df2 = pd.DataFrame({'x': list(range(2))*3, 'y': range(6), 't': index_tz})

result1 = df1.groupby('x', group_keys=False).apply(lambda df: df[['x','y']].copy())
result2 = df2.groupby('x', group_keys=False).apply(lambda df: df[['x','y']].copy())

try:
    pd.testing.assert_frame_equal(result1, result2)
except AssertionError:
    # the results differ although only the tz-awareness of 't' changed
    print("does not work")

Problem description

groupby/apply appears to order the result rows differently for the two dataframes in the example; the only difference between them is the timezone-awareness of the t column.

(Note: I posted this in the pydata/pandas gitter, and @jorisvandenbossche suggested I open an issue here for it.)
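To check whether a mismatch like this comes down to row order alone, the two results can be compared again after sorting by index. The snippet below is a minimal sketch: `differs_only_in_row_order` is a hypothetical helper (not part of pandas), and the two frames are hand-built stand-ins for an "original order" result and a "group by group" result, not actual output from the example above.

```python
import pandas as pd


def differs_only_in_row_order(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Return True if the frames are unequal as given but equal once sorted by index.

    Hypothetical helper for illustration; not part of pandas.
    """
    if left.equals(right):
        return False  # already identical, including row order
    return left.sort_index().equals(right.sort_index())


# Hand-built stand-ins (assumed shapes, not real groupby/apply output):
# one frame keeps the original row order, the other lists the rows of
# group x == 0 first and the rows of group x == 1 afterwards.
in_original_order = pd.DataFrame({'x': [0, 1] * 3, 'y': range(6)})
in_group_order = in_original_order.loc[[0, 2, 4, 1, 3, 5]]

print(differs_only_in_row_order(in_original_order, in_group_order))
```

Calling `differs_only_in_row_order(result1, result2)` on the frames from the code sample would confirm or rule out the ordering explanation. If it returns True, comparing `result1.sort_index()` with `result2.sort_index()` is a possible user-side workaround until the underlying behavior is made consistent.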

Expected Output

The assertion should pass, i.e. the "does not work" message should not be printed.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.0-24-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 4.6.3
pip: 19.1.1
setuptools: 41.0.1
Cython: None
numpy: 1.16.4
scipy: 1.3.0
pyarrow: 0.13.0
xarray: None
IPython: 7.5.0
sphinx: None
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.4
bs4: None
html5lib: None
sqlalchemy: 1.3.5
pymysql: None
psycopg2: None
jinja2: 2.10.1
s3fs: 0.2.1
fastparquet: 0.3.1
pandas_gbq: None
pandas_datareader: None
gcsfs: None
