Skip to content

DEPR: Deprecate use of strings denoting units with 'M', 'Y' or 'y' in pd.to_timedelta (36666) #36838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Oct 31, 2020
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
de4fb4d
Throw warning when M,Y,m or y is used in an arg for pd.to_timedelta
avinashpancham Oct 3, 2020
61ec5b6
Update docstring by removing old logic and adding depr notice
avinashpancham Oct 3, 2020
abb613e
Add whatsnew entry
avinashpancham Oct 3, 2020
4ccf43d
Update wording in docs
avinashpancham Oct 3, 2020
271f2fc
Place space at the end of string instead of at the beginning of next …
avinashpancham Oct 3, 2020
7b11a92
Also catch month and year timedelta variations without a space
avinashpancham Oct 4, 2020
b39eadf
Add test for depr behavior
avinashpancham Oct 4, 2020
f56796a
Catch deprecated patterns with a regex to prevent false positives and…
avinashpancham Oct 4, 2020
ad38ad3
Add description to unittest
avinashpancham Oct 12, 2020
bc03065
Also find appearences of unambiguous timedelta values in containers
avinashpancham Oct 13, 2020
d491cce
Update test_unit_parser in test_timedelta.py
avinashpancham Oct 15, 2020
76e6602
Move warnings code to timedeltas.pyx and update tests
avinashpancham Oct 18, 2020
cdc5821
Update code for failing tests
avinashpancham Oct 18, 2020
5c471b3
Merge branch 'master' into GH36666
avinashpancham Oct 18, 2020
02d642b
Remove empty line
avinashpancham Oct 18, 2020
89af5bb
Update depr message
avinashpancham Oct 19, 2020
a5ef0c3
Merge branch 'master' into GH36666
avinashpancham Oct 20, 2020
0bc1a89
Move warning location and update tests
avinashpancham Oct 22, 2020
3682687
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 22, 2020
2abd0ed
Update parse_iso_format_string and tests that give warnings
avinashpancham Oct 22, 2020
6f4e7ca
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 23, 2020
8701f26
Replace ipython directives with code-blocks in whatsnew v0.15.0
avinashpancham Oct 24, 2020
783e039
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 26, 2020
0cc6dff
Update newly added test
avinashpancham Oct 26, 2020
6a02d05
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 27, 2020
71f801e
Update test to prevent depr warning
avinashpancham Oct 28, 2020
e183f9b
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 28, 2020
2d8c9fe
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 30, 2020
1d2871b
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 31, 2020
e01f75d
Revert the depr of lowercase m
avinashpancham Oct 31, 2020
5277943
Revert more depr of m
avinashpancham Oct 31, 2020
3b83aa0
Merge remote-tracking branch 'upstream/master' into GH36666
avinashpancham Oct 31, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -293,6 +293,7 @@ Deprecations
- :meth:`Rolling.count` with ``min_periods=None`` will default to the size of the window in a future version (:issue:`31302`)
- Deprecated slice-indexing on timezone-aware :class:`DatetimeIndex` with naive ``datetime`` objects, to match scalar indexing behavior (:issue:`36148`)
- :meth:`Index.ravel` returning a ``np.ndarray`` is deprecated, in the future this will return a view on the same index (:issue:`19956`)
- Deprecate use of strings denoting units with 'M', 'Y', 'm' or 'y' in :func:`~pandas.to_timedelta` (:issue:`36666`)

.. ---------------------------------------------------------------------------

Expand Down
27 changes: 21 additions & 6 deletions pandas/core/tools/timedeltas.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
"""
timedelta support tools
"""
import re
import warnings

import numpy as np

Expand All @@ -25,10 +27,13 @@ def to_timedelta(arg, unit=None, errors="raise"):
Parameters
----------
arg : str, timedelta, list-like or Series
The data to be converted to timedelta. The character M by itself,
e.g. '1M', is treated as minute, not month. The characters Y and y
are treated as the mean length of the Gregorian calendar year -
365.2425 days or 365 days 5 hours 49 minutes 12 seconds.
The data to be converted to timedelta.

.. deprecated:: 1.2
Strings denoting units with 'M', 'Y', 'm' or 'y' do not represent
unambiguous timedelta values durations and will removed in a future
version

unit : str, optional
Denotes the unit of the arg for numeric `arg`. Defaults to ``"ns"``.

Expand Down Expand Up @@ -118,8 +123,18 @@ def to_timedelta(arg, unit=None, errors="raise"):
"arg must be a string, timedelta, list, tuple, 1-d array, or Series"
)

if isinstance(arg, str) and unit is not None:
raise ValueError("unit must not be specified if the input is/contains a str")
if isinstance(arg, str):
if unit is not None:
raise ValueError(
"unit must not be specified if the input is/contains a str"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm we already have this check on L103, so something must be going wrong here.

Copy link
Contributor Author

@avinashpancham avinashpancham Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you do: pd.to_timedelta("1 Y") then unit is not specified and by default None. L103 checks only if unit is "Y", "M", 'y' or 'm' (e.g. pd.to_timedelta(1, unit="Y") ), and therefore you never reach the conditional code.

elif re.search(r"^\d+\s?[M|Y|m|y]$", arg):
warnings.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wouldn't do this here at all

"Denoting units with 'M', 'Y', 'm' or 'y' do not represent unambiguous "
"timedelta values durations and will removed in a future version",
FutureWarning,
stacklevel=2,
)

# ...so it must be a scalar value. Return scalar.
return _coerce_scalar_to_timedelta_type(arg, unit=unit, errors=errors)
Expand Down
4 changes: 3 additions & 1 deletion pandas/tests/scalar/timedelta/test_timedelta.py
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,9 @@ def test_unit_parser(self, units, np_unit, wrapper):
if unit == "M":
expected = Timedelta(np.timedelta64(2, "m").astype("timedelta64[ns]"))

result = to_timedelta(f"2{unit}")
warning = None if unit != "m" else FutureWarning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change this to parameterize this (e.g. elminate the for loop and parameterize on units)

then add a parameter whether should test for FutureWarning or not (could be bool is ok).

then in the body you can easily check this for both to_timedelta and Timedelta

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have approx 50 values that have no warning and just one ("m") that has a warning. IMO a parameter for this would be overkill. But if we want this then I can change it.

Copy link
Contributor Author

@avinashpancham avinashpancham Oct 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the test. IMO test_unit_parser could be refactored even further: split in 2 tests: test_unit_parser_array and test_unit_parser_scalar and apply a parameter on the function that converts its to timedelta. Do we want to do this?

with tm.assert_produces_warning(warning):
result = to_timedelta(f"2{unit}")
assert result == expected
result = Timedelta(f"2{unit}")
assert result == expected
Expand Down
21 changes: 21 additions & 0 deletions pandas/tests/tools/test_to_timedelta.py
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,27 @@ def test_to_timedelta_invalid(self):
invalid_data, to_timedelta(invalid_data, errors="ignore")
)

@pytest.mark.parametrize(
"val, warning",
[
("1M", FutureWarning),
("1 M", FutureWarning),
("1Y", FutureWarning),
("1 Y", FutureWarning),
("1m", FutureWarning),
("1 m", FutureWarning),
("1y", FutureWarning),
("1 y", FutureWarning),
("1 day", None),
("2day", None),
],
)
def test_unambiguous_timedelta_values(self, val, warning):
# GH36666 Deprecate use of strings denoting units with 'M', 'Y', 'm' or 'y'
# in pd.to_timedelta
with tm.assert_produces_warning(warning):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you but the github issue number here as a comment?

Copy link
Contributor Author

@avinashpancham avinashpancham Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. CI fails due to other tests on master though. Will merge master later once that is solved.

to_timedelta(val)

def test_to_timedelta_via_apply(self):
# GH 5458
expected = Series([np.timedelta64(1, "s")])
Expand Down