-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
BUG: Fix a bug in 'timedelta_range' that produced an extra point on a edge case (fix #30353) #33498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ee4c095 to
499b905
Compare
499b905 to
8820776
Compare
266499a to
cb9c023
Compare
ffe1caf to
89a9af5
Compare
WillAyd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the refactor looks reasonable - @jbrockmendel any thoughts?
|
A few comments on the refactor, but i really like the idea of re-using generate_regular_range |
| - Bug in :func:`timedelta_range` that produced an extra point on a edge case (:issue:`30353`, :issue:`33498`) | ||
| - Bug in :meth:`DataFrame.resample` that produced an extra point on a edge case (:issue:`30353`, :issue:`13022`, :issue:`33498`) | ||
| - Bug in :meth:`DataFrame.resample` that ignored the ``loffset`` argument when dealing with timedelta (:issue:`7687`, :issue:`33498`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this is okay @WillAyd 🙂. I'll be happy to change this if you disagree with this changelog.
BTW, regarding loffset, as mentioned earlier (https://github.com/pandas-dev/pandas/pull/33498/files#r407231964):
Useful because otherwise
tests/resample/test_base.py::test_resample_loffset_arg_typewas always failing.
However, this piece of code is questionable since #31809 will deprecateloffsetanyway.
| if isinstance(expected.index, TimedeltaIndex): | ||
| msg = "DataFrame are different" | ||
| with pytest.raises(AssertionError, match=msg): | ||
| tm.assert_frame_equal(result_agg, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is testing incorrect behavior? thats pretty weird
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel Yeah, that was the case. That was testing an incorrect behavior... 😕 And this was the case for two reasons:
loffsetsupport was missing in.resamplewhen dealing with timedelta. (loffset has no effect when passing in a numyp.timedelta64 #7687)- an extra bin (fixed with this PR) was present each time when dealing with timedelta. (BUG: resample with TimedeltaIndex, fenceposts are off #13022, Resampler sometimes adds extra bin with NaN #30353)
Here an example of the values when setting a breakpoint with PyCharm line 222 in pandas/tests/resample/test_base.py on master:
The expected result (with loffset and without an extra bin):

The actual result (without loffset and with an extra bin):

I think that was the reason of the presence of this TODO:
pandas/pandas/tests/resample/test_base.py
Line 218 in 3d4f9dc
| # GH 13022, 7687 - TODO: fix resample w/ TimedeltaIndex |
IMO, the presence of this if was just to skip the tests of Timedeltas when resampling without skipping the tests of Timestamps and Periods parametrized with @all_ts.
jreback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm a couple of questions.
Thank you! 🙂 Let me know if you need anything else! |
a3b41bf to
404ae11
Compare
404ae11 to
71438d6
Compare
|
thanks @hasB4K very nice! |
The issue from #30353 came actually from
timedelta_range.The outputs from
mock_timedelta_rangeand frompd.timedelta_rangeare supposed to equivalent, but are not on pandas v1.0.0:Outputs without this PR:
Outputs with this PR:
It also solve an issue (that fail on master) related to this comment: #13022 (comment)
Outputs with this PR:
On the firsts commits I just fixed the issue in
_generate_regular_range, then I decided to do a refactor and to use the same code that generate ranges incore/arrays/_ranges.pyfordate_rangeandtimedelta_range.Linked with #10887.
closes BUG: resample with TimedeltaIndex, fenceposts are off #13022
closes loffset has no effect when passing in a numyp.timedelta64 #7687 (for loffset when resampling Timedelta)
black pandasgit diff upstream/master -u -- "*.py" | flake8 --diff