-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
DEPR: Deprecate use of strings denoting units with 'M', 'Y' or 'y' in pd.to_timedelta (36666) #36838
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Do we only want to deprecate variations with a space ( |
pandas/core/tools/timedeltas.py
Outdated
| raise ValueError( | ||
| "unit must not be specified if the input is/contains a str" | ||
| ) | ||
| elif arg.upper().endswith(" M") or arg.upper().endswith(" Y"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in your other commented, we also want to catch arguments without the space as well e.g. "1Y".
Also we need a test for this warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
Ah I see some tests are failing. I will update the code. |
fde639e to
c12eeaa
Compare
|
All tests related to |
c12eeaa to
f56796a
Compare
|
Updated my code and merged master. Everything passes now. |
| ], | ||
| ) | ||
| def test_unambiguous_timedelta_values(self, val, warning): | ||
| with tm.assert_produces_warning(warning): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you but the github issue number here as a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. CI fails due to other tests on master though. Will merge master later once that is solved.
pandas/core/tools/timedeltas.py
Outdated
| if unit is not None: | ||
| raise ValueError( | ||
| "unit must not be specified if the input is/contains a str" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm we already have this check on L103, so something must be going wrong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you do: pd.to_timedelta("1 Y") then unit is not specified and by default None. L103 checks only if unit is "Y", "M", 'y' or 'm' (e.g. pd.to_timedelta(1, unit="Y") ), and therefore you never reach the conditional code.
| expected = Timedelta(np.timedelta64(2, "m").astype("timedelta64[ns]")) | ||
|
|
||
| result = to_timedelta(f"2{unit}") | ||
| warning = None if unit != "m" else FutureWarning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change this to parameterize this (e.g. elminate the for loop and parameterize on units)
then add a parameter whether should test for FutureWarning or not (could be bool is ok).
then in the body you can easily check this for both to_timedelta and Timedelta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have approx 50 values that have no warning and just one ("m") that has a warning. IMO a parameter for this would be overkill. But if we want this then I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the test. IMO test_unit_parser could be refactored even further: split in 2 tests: test_unit_parser_array and test_unit_parser_scalar and apply a parameter on the function that converts its to timedelta. Do we want to do this?
|
I found out that the following was also possible: |
jreback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to do the warning inside the string to timedelta conversion code in timedelta.pyx itself;
pandas/core/tools/timedeltas.py
Outdated
| "represent unambiguous timedelta values durations." | ||
| ) | ||
|
|
||
| if isinstance(arg, (list, tuple, ABCSeries, ABCIndexClass, np.ndarray)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the warning shouldn't be here
pandas/_libs/tslibs/timedeltas.pyx
Outdated
| return res | ||
|
|
||
|
|
||
| def check_unambiguous_timedelta_values(object[:] values): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than this you can do it in the Timedelta string converter
pandas/core/tools/timedeltas.py
Outdated
| "unit must not be specified if the input is/contains a str" | ||
| ) | ||
| elif re.search(r"^\d+\s?[M|Y|m|y]$", arg): | ||
| warnings.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wouldn't do this here at all
|
Thanks @jreback , moved the warning to timedelta.pyx and the code is much cleaner now. |
cef33b7 to
cdc5821
Compare
b83228b to
c76486e
Compare
c76486e to
0cc6dff
Compare
|
@jreback implemented all suggestions, could you have a 2nd look? CI is still failing after merging master, but that is related to other issues than introduced in this PR. |
|
CI failed due to FATAL ERROR: Committing semi space failed. Allocation failed - process out of memory. Can someone please restart it? Thanks! |
|
why is 'm' (lowercase) m deprecated? this is minutes as in not ambiguous. (agree on Y, y, M) |
|
My bad, I thought we wanted to deprecate all variations of m (lowercase and uppercase). Will revert it then. |
kk great, otherwise lgtm. ping on green. |
|
@jreback reverted the depr of lowercase m, merged master and CI on green. |
|
thanks @avinashpancham very nice! |
… pd.to_timedelta (36666) (pandas-dev#36838)
… pd.to_timedelta (36666) (pandas-dev#36838)
black pandasgit diff upstream/master -u -- "*.py" | flake8 --diff