BUG: resample ops includes sampled column #47079

Closed
@rhshadrach

This is similar to an example in the DataFrame.resample docstring:

import pandas as pd

d = {
    'price': [10, 11, 9, 13, 14, 18, 17, 19],
    'volume': [50, 60, 40, 100, 50, 100, 40, 50]
}
df = pd.DataFrame(d)
df['week_starting'] = pd.date_range('01/01/2018', periods=8, freq='W')
print(df.resample('M', on='week_starting').first())

# Gives
               price  volume week_starting
week_starting                             
2018-01-31        10      50    2018-01-07
2018-02-28        14      50    2018-02-04

I am not sure, but it seems to me that the on column is not intended to be included in the aggregation. Because of #46560, this currently throws warnings when .first is replaced by e.g. .sum or .mean, and there is no way to safely avoid them.
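For comparison, here is a rough sketch of two ways to keep the on column out of the aggregated result: selecting the value columns explicitly, or moving week_starting into the index before resampling. This is not from the report and has not been checked against the 1.5 development build discussed here, so it may not silence the warnings on that build. The names selected, indexed, and month_end are only illustrative, and passing a pd.offsets.MonthEnd() object in place of the 'M' alias is an assumption made so the snippet does not depend on how a given pandas version spells that alias.

```python
import pandas as pd

d = {
    "price": [10, 11, 9, 13, 14, 18, 17, 19],
    "volume": [50, 60, 40, 100, 50, 100, 40, 50],
}
df = pd.DataFrame(d)
df["week_starting"] = pd.date_range("01/01/2018", periods=8, freq="W")

# Assumption: use an offset object rather than the 'M' string alias.
month_end = pd.offsets.MonthEnd()

# Option 1: select the value columns explicitly after resampling on the column.
selected = df.resample(month_end, on="week_starting")[["price", "volume"]].sum()

# Option 2: move the datetime column into the index, then resample.
indexed = df.set_index("week_starting").resample(month_end).sum()

print(selected)
print(indexed)
```

Both variants should produce only the price and volume columns, with week_starting appearing solely as the index.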

Marking this for 1.5 because of the introduced warnings.
