
fix for BUG: grouping with tz-aware: Values falls after last bin #24973

Merged 16 commits on Jan 29, 2019
fix #24972
ahcub committed Jan 28, 2019
commit 1940cfd6b387ca427c46b02c0e547250cbc6ac10
10 changes: 10 additions & 0 deletions pandas/core/resample.py
@@ -1413,6 +1413,16 @@ def _get_time_bins(self, ax):
ambiguous='infer',
nonexistent='shift_forward')

# GH #24972
# In edge case of tz-aware grouping binner last index can be
# less than the ax.max() variable in data object, this happens
Contributor


This should go in _adjust_bin_edges.

Contributor Author


There are a lot of issues with generating the remaining bins when I try to move it there. If this way of generating the bins is correct, then I believe it is better to leave it here: in _adjust_bin_edges we lose information about the tz of ax.

Member


Or this should possibly be fixed in _get_timestamp_range_edges. This patch here seems like a bandaid.

Contributor Author

@ahcub ahcub Jan 28, 2019


OK, let me apply the patch you suggested in the issue description; it does look better to me.

# because of normalization
if len(binner) > 1 and binner[-1] < ax.max():
    extra_date_range = pd.date_range(binner[-1], ax.max() + self.freq,
                                     freq=self.freq, tz=binner[-1].tz,
                                     name=ax.name)
    binner = labels = binner.append(extra_date_range[1:])

ax_values = ax.asi8
binner, bin_edges = self._adjust_bin_edges(binner, ax_values)
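
The idea behind the guard in this hunk (extend the bin edges when the last edge falls before the largest value in the data) can be sketched with the standard library alone. This is only an illustrative analogue, not the pandas implementation: `extend_bin_edges`, `edges`, `data_max`, and `step` are hypothetical names, and the step is assumed to be a fixed-length `timedelta`, so DST-dependent frequencies are not modelled.

```python
from datetime import datetime, timedelta, timezone


def extend_bin_edges(edges, data_max, step):
    """Return a copy of `edges`, extended so the last edge lies after `data_max`.

    Hypothetical helper mirroring the guard in the hunk above: it only acts
    when there are at least two edges and the last one is before `data_max`.
    """
    extended = list(edges)
    if len(extended) > 1 and extended[-1] < data_max:
        # Like date_range(binner[-1], ax.max() + freq, freq=freq)[1:]:
        # keep stepping from the last edge while we stay <= data_max + step.
        limit = data_max + step
        next_edge = extended[-1] + step
        while next_edge <= limit:
            extended.append(next_edge)
            next_edge += step
    return extended


# Two daily edges, but the data runs into the middle of Jan 3 (UTC).
utc = timezone.utc
edges = [datetime(2019, 1, 1, tzinfo=utc), datetime(2019, 1, 2, tzinfo=utc)]
data_max = datetime(2019, 1, 3, 12, tzinfo=utc)

extended = extend_bin_edges(edges, data_max, timedelta(days=1))
```

With these inputs the last extended edge lies strictly after `data_max`, so no value falls after the last bin. As in the patch, the extension can produce one more edge than strictly needed.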
