DEPR: lowercase strings w, d, b and c denoting frequencies in `Week`, `Day`, `BusinessDay` and `CustomBusinessDay` classes (pandas-dev#58998)

* deprecate lowercase 'd'

* fix tests, add tests

* deprecate lowercase alias 'b', fix tests

* fix tests and docs

* fix tests, fix an example in v0.20.0

* deprecate 'c', fix examples in v0.22.0, add tests and a note to v3.0.0

* correct examples in whatsnew

* update examples in user_guide/io.rst

---------

Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
natmokval and mroeschke authored Jul 5, 2024
1 parent e3af7c6 commit 039edee
Showing 33 changed files with 351 additions and 139 deletions.
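For orientation (not part of the diff): a minimal sketch of the user-facing effect, assuming a pandas build that contains this change. The lowercase aliases (`"d"`, `"b"`, `"c"`, `"w"`, `"w-mon"`, ...) still resolve, but now emit a `FutureWarning` pointing at the uppercase spelling.

```python
import warnings

import pandas as pd

# Illustrative only: behaviour assumes a pandas version that includes this deprecation.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated = pd.date_range("2024-01-01", periods=3, freq="d")  # lowercase alias

print([str(w.message) for w in caught])                         # FutureWarning suggesting 'D'
recommended = pd.date_range("2024-01-01", periods=3, freq="D")  # preferred spelling
assert deprecated.equals(recommended)
```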
4 changes: 2 additions & 2 deletions doc/source/user_guide/io.rst
@@ -2161,7 +2161,7 @@ a JSON string with two fields, ``schema`` and ``data``.
{
"A": [1, 2, 3],
"B": ["a", "b", "c"],
"C": pd.date_range("2016-01-01", freq="d", periods=3),
"C": pd.date_range("2016-01-01", freq="D", periods=3),
},
index=pd.Index(range(3), name="idx"),
)
@@ -2270,7 +2270,7 @@ round-trippable manner.
{
"foo": [1, 2, 3, 4],
"bar": ["a", "b", "c", "d"],
"baz": pd.date_range("2018-01-01", freq="d", periods=4),
"baz": pd.date_range("2018-01-01", freq="D", periods=4),
"qux": pd.Categorical(["a", "b", "c", "c"]),
},
index=pd.Index(range(4), name="idx"),
25 changes: 19 additions & 6 deletions doc/source/whatsnew/v0.18.0.rst
@@ -322,15 +322,28 @@ Tz-aware are rounded, floored and ceiled in local times
Timedeltas

-.. ipython:: python
+.. code-block:: ipython

-   t = pd.timedelta_range('1 days 2 hr 13 min 45 us', periods=3, freq='d')
-   t
-   t.round('10min')
+   In [37]: t = pd.timedelta_range('1 days 2 hr 13 min 45 us', periods=3, freq='d')
+
+   In [38]: t
+   Out[38]:
+   TimedeltaIndex(['1 days 02:13:00.000045', '2 days 02:13:00.000045',
+                   '3 days 02:13:00.000045'],
+                  dtype='timedelta64[ns]', freq='D')
+
+   In [39]: t.round('10min')
+   Out[39]:
+   TimedeltaIndex(['1 days 02:10:00', '2 days 02:10:00',
+                   '3 days 02:10:00'],
+                  dtype='timedelta64[ns]', freq=None)

   # Timedelta scalar
-   t[0]
-   t[0].round('2h')
+   In [40]: t[0]
+   Out[40]: Timedelta('1 days 02:13:00.000045')
+
+   In [41]: t[0].round('2h')
+   Out[41]: Timedelta('1 days 02:00:00')

In addition, ``.round()``, ``.floor()`` and ``.ceil()`` will be available through the ``.dt`` accessor of ``Series``.
27 changes: 19 additions & 8 deletions doc/source/whatsnew/v0.20.0.rst
@@ -308,15 +308,26 @@ The new orient ``'table'`` for :meth:`DataFrame.to_json`
will generate a `Table Schema`_ compatible string representation of
the data.

-.. ipython:: python
+.. code-block:: ipython

-   df = pd.DataFrame(
-       {'A': [1, 2, 3],
-        'B': ['a', 'b', 'c'],
-        'C': pd.date_range('2016-01-01', freq='d', periods=3)},
-       index=pd.Index(range(3), name='idx'))
-   df
-   df.to_json(orient='table')
+   In [38]: df = pd.DataFrame(
+      ....:     {'A': [1, 2, 3],
+      ....:      'B': ['a', 'b', 'c'],
+      ....:      'C': pd.date_range('2016-01-01', freq='d', periods=3)},
+      ....:     index=pd.Index(range(3), name='idx'))
+
+   In [39]: df
+   Out[39]:
+        A  B          C
+   idx
+   0    1  a 2016-01-01
+   1    2  b 2016-01-02
+   2    3  c 2016-01-03
+
+   [3 rows x 3 columns]
+
+   In [40]: df.to_json(orient='table')
+   Out[40]:
+   '{"schema":{"fields":[{"name":"idx","type":"integer"},{"name":"A","type":"integer"},{"name":"B","type":"string"},{"name":"C","type":"datetime"}],"primaryKey":["idx"],"pandas_version":"1.4.0"},"data":[{"idx":0,"A":1,"B":"a","C":"2016-01-01T00:00:00.000"},{"idx":1,"A":2,"B":"b","C":"2016-01-02T00:00:00.000"},{"idx":2,"A":3,"B":"c","C":"2016-01-03T00:00:00.000"}]}'

See :ref:`IO: Table Schema for more information <io.table_schema>`.
21 changes: 16 additions & 5 deletions doc/source/whatsnew/v0.22.0.rst
@@ -157,16 +157,27 @@ sum and ``1`` for product.
*pandas 0.22.0*

-.. ipython:: python
+.. code-block:: ipython

-   s = pd.Series([1, 1, np.nan, np.nan], index=pd.date_range("2017", periods=4))
-   s.resample("2d").sum()
+   In [11]: s = pd.Series([1, 1, np.nan, np.nan],
+      ....:               index=pd.date_range("2017", periods=4))
+
+   In [12]: s.resample("2d").sum()
+   Out[12]:
+   2017-01-01    2.0
+   2017-01-03    0.0
+   Freq: 2D, Length: 2, dtype: float64

To restore the 0.21 behavior of returning ``NaN``, use ``min_count>=1``.

-.. ipython:: python
+.. code-block:: ipython

-   s.resample("2d").sum(min_count=1)
+   In [13]: s.resample("2d").sum(min_count=1)
+   Out[13]:
+   2017-01-01    2.0
+   2017-01-03    NaN
+   Freq: 2D, Length: 2, dtype: float64

In particular, upsampling and taking the sum or product is affected, as
upsampling introduces missing values even if the original series was
60 changes: 48 additions & 12 deletions doc/source/whatsnew/v0.23.0.rst
@@ -50,19 +50,55 @@ JSON read/write round-trippable with ``orient='table'``

A ``DataFrame`` can now be written to and subsequently read back via JSON while preserving metadata through usage of the ``orient='table'`` argument (see :issue:`18912` and :issue:`9146`). Previously, none of the available ``orient`` values guaranteed the preservation of dtypes and index names, amongst other metadata.

-.. ipython:: python
+.. code-block:: ipython

-   df = pd.DataFrame({'foo': [1, 2, 3, 4],
-                      'bar': ['a', 'b', 'c', 'd'],
-                      'baz': pd.date_range('2018-01-01', freq='d', periods=4),
-                      'qux': pd.Categorical(['a', 'b', 'c', 'c'])},
-                     index=pd.Index(range(4), name='idx'))
-   df
-   df.dtypes
-   df.to_json('test.json', orient='table')
-   new_df = pd.read_json('test.json', orient='table')
-   new_df
-   new_df.dtypes
+   In [1]: df = pd.DataFrame({'foo': [1, 2, 3, 4],
+      ...:                    'bar': ['a', 'b', 'c', 'd'],
+      ...:                    'baz': pd.date_range('2018-01-01', freq='d', periods=4),
+      ...:                    'qux': pd.Categorical(['a', 'b', 'c', 'c'])},
+      ...:                   index=pd.Index(range(4), name='idx'))
+
+   In [2]: df
+   Out[2]:
+        foo bar        baz qux
+   idx
+   0      1   a 2018-01-01   a
+   1      2   b 2018-01-02   b
+   2      3   c 2018-01-03   c
+   3      4   d 2018-01-04   c
+
+   [4 rows x 4 columns]
+
+   In [3]: df.dtypes
+   Out[3]:
+   foo             int64
+   bar            object
+   baz    datetime64[ns]
+   qux          category
+   Length: 4, dtype: object
+
+   In [4]: df.to_json('test.json', orient='table')
+
+   In [5]: new_df = pd.read_json('test.json', orient='table')
+
+   In [6]: new_df
+   Out[6]:
+        foo bar        baz qux
+   idx
+   0      1   a 2018-01-01   a
+   1      2   b 2018-01-02   b
+   2      3   c 2018-01-03   c
+   3      4   d 2018-01-04   c
+
+   [4 rows x 4 columns]
+
+   In [7]: new_df.dtypes
+   Out[7]:
+   foo             int64
+   bar            object
+   baz    datetime64[ns]
+   qux          category
+   Length: 4, dtype: object

Please note that the string ``index`` is not supported with the round trip format, as it is used by default in ``write_json`` to indicate a missing index name.
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -279,6 +279,8 @@ Other Deprecations
- Deprecated allowing non-keyword arguments in :meth:`Series.to_markdown` except ``buf``. (:issue:`57280`)
- Deprecated allowing non-keyword arguments in :meth:`Series.to_string` except ``buf``. (:issue:`57280`)
- Deprecated behavior of :meth:`Series.dt.to_pytimedelta`, in a future version this will return a :class:`Series` containing python ``datetime.timedelta`` objects instead of an ``ndarray`` of timedelta; this matches the behavior of other :meth:`Series.dt` properties. (:issue:`57463`)
- Deprecated lowercase strings ``d``, ``b`` and ``c`` denoting frequencies in :class:`Day`, :class:`BusinessDay` and :class:`CustomBusinessDay` in favour of ``D``, ``B`` and ``C`` (:issue:`58998`)
- Deprecated lowercase strings ``w``, ``w-mon``, ``w-tue``, etc. denoting frequencies in :class:`Week` in favour of ``W``, ``W-MON``, ``W-TUE``, etc. (:issue:`58998`)
- Deprecated parameter ``method`` in :meth:`DataFrame.reindex_like` / :meth:`Series.reindex_like` (:issue:`58667`)
- Deprecated strings ``w``, ``d``, ``MIN``, ``MS``, ``US`` and ``NS`` denoting units in :class:`Timedelta` in favour of ``W``, ``D``, ``min``, ``ms``, ``us`` and ``ns`` (:issue:`59051`)
- Deprecated using ``epoch`` date format in :meth:`DataFrame.to_json` and :meth:`Series.to_json`, use ``iso`` instead. (:issue:`57063`)
10 changes: 10 additions & 0 deletions pandas/_libs/tslibs/dtypes.pyx
@@ -359,6 +359,16 @@ cdef dict c_DEPR_UNITS = {

cdef dict c_PERIOD_AND_OFFSET_DEPR_FREQSTR = {
    "w": "W",
+    "w-mon": "W-MON",
+    "w-tue": "W-TUE",
+    "w-wed": "W-WED",
+    "w-thu": "W-THU",
+    "w-fri": "W-FRI",
+    "w-sat": "W-SAT",
+    "w-sun": "W-SUN",
+    "d": "D",
+    "b": "B",
+    "c": "C",
    "MIN": "min",
}

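As a rough plain-Python sketch (not the actual Cython in `dtypes.pyx` / `offsets.pyx`), the mapping above is the kind of table a normalisation helper consults: deprecated spellings warn and are translated to their canonical form. Names and the exact warning text here are simplified.

```python
import warnings

# Simplified stand-in for c_PERIOD_AND_OFFSET_DEPR_FREQSTR (illustrative subset).
DEPR_FREQSTR = {"w": "W", "w-mon": "W-MON", "d": "D", "b": "B", "c": "C"}


def normalize_alias(name: str) -> str:
    """Return the canonical alias, warning when a deprecated spelling is used."""
    if name in DEPR_FREQSTR:
        replacement = DEPR_FREQSTR[name]
        warnings.warn(
            f"'{name}' is deprecated and will be removed in a future version, "
            f"please use '{replacement}' instead.",
            FutureWarning,
            stacklevel=2,
        )
        return replacement
    return name


print(normalize_alias("d"))  # warns, returns "D"
print(normalize_alias("D"))  # no warning, returned unchanged
```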
20 changes: 10 additions & 10 deletions pandas/_libs/tslibs/offsets.pyx
@@ -4890,16 +4890,16 @@ cpdef to_offset(freq, bint is_period=False):
)
name = c_PERIOD_TO_OFFSET_FREQSTR.get(name.upper())

-                if name in c_PERIOD_AND_OFFSET_DEPR_FREQSTR:
-                    warnings.warn(
-                        f"\'{name}\' is deprecated and will be removed "
-                        f"in a future version, please use "
-                        f"\'{c_PERIOD_AND_OFFSET_DEPR_FREQSTR.get(name)}\' "
-                        f" instead.",
-                        FutureWarning,
-                        stacklevel=find_stack_level(),
-                    )
-                    name = c_PERIOD_AND_OFFSET_DEPR_FREQSTR.get(name)
+            if name in c_PERIOD_AND_OFFSET_DEPR_FREQSTR:
+                warnings.warn(
+                    f"\'{name}\' is deprecated and will be removed "
+                    f"in a future version, please use "
+                    f"\'{c_PERIOD_AND_OFFSET_DEPR_FREQSTR.get(name)}\' "
+                    f" instead.",
+                    FutureWarning,
+                    stacklevel=find_stack_level(),
+                )
+                name = c_PERIOD_AND_OFFSET_DEPR_FREQSTR.get(name)
if sep != "" and not sep.isspace():
raise ValueError("separator must be spaces")
prefix = _lite_rule_alias.get(name) or name
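Usage-wise, the warning path above can be exercised through `to_offset`; a hedged example, again assuming a pandas build that contains this change:

```python
import warnings

from pandas.tseries.frequencies import to_offset

# Illustrative only: the lowercase alias still resolves to the same offset, but warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    offset = to_offset("d")  # deprecated spelling

print(offset)                            # <Day>
print(offset == to_offset("D"))          # True; "D" is the supported spelling
print([str(w.message) for w in caught])  # FutureWarning recommending 'D'
```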
2 changes: 1 addition & 1 deletion pandas/io/json/_table_schema.py
@@ -275,7 +275,7 @@ def build_table_schema(
>>> df = pd.DataFrame(
... {'A': [1, 2, 3],
... 'B': ['a', 'b', 'c'],
-... 'C': pd.date_range('2016-01-01', freq='d', periods=3),
+... 'C': pd.date_range('2016-01-01', freq='D', periods=3),
... }, index=pd.Index(range(3), name='idx'))
>>> build_table_schema(df)
{'fields': \
2 changes: 1 addition & 1 deletion pandas/tests/arithmetic/test_period.py
@@ -1086,7 +1086,7 @@ def test_parr_add_timedeltalike_minute_gt1(self, three_days, box_with_array):
with pytest.raises(TypeError, match=msg):
other - rng

@pytest.mark.parametrize("freqstr", ["5ns", "5us", "5ms", "5s", "5min", "5h", "5d"])
@pytest.mark.parametrize("freqstr", ["5ns", "5us", "5ms", "5s", "5min", "5h", "5D"])
def test_parr_add_timedeltalike_tick_gt1(self, three_days, freqstr, box_with_array):
# GH#23031 adding a time-delta-like offset to a PeriodArray that has
# tick-like frequency with n != 1
10 changes: 7 additions & 3 deletions pandas/tests/frame/methods/test_astype.py
@@ -715,8 +715,12 @@ def test_astype_ignores_errors_for_extension_dtypes(self, data, dtype, errors):
df.astype(float, errors=errors)

def test_astype_tz_conversion(self):
-        # GH 35973
-        val = {"tz": date_range("2020-08-30", freq="d", periods=2, tz="Europe/London")}
+        # GH 35973, GH#58998
+        msg = "'d' is deprecated and will be removed in a future version."
+        with tm.assert_produces_warning(FutureWarning, match=msg):
+            val = {
+                "tz": date_range("2020-08-30", freq="d", periods=2, tz="Europe/London")
+            }
df = DataFrame(val)
result = df.astype({"tz": "datetime64[ns, Europe/Berlin]"})

@@ -727,7 +731,7 @@ def test_astype_tz_conversion(self):
@pytest.mark.parametrize("tz", ["UTC", "Europe/Berlin"])
def test_astype_tz_object_conversion(self, tz):
# GH 35973
val = {"tz": date_range("2020-08-30", freq="d", periods=2, tz="Europe/London")}
val = {"tz": date_range("2020-08-30", freq="D", periods=2, tz="Europe/London")}
expected = DataFrame(val)

# convert expected to object dtype from other tz str (independently tested)
5 changes: 4 additions & 1 deletion pandas/tests/frame/methods/test_reindex.py
@@ -754,7 +754,10 @@ def test_reindex_axes(self):
index=[datetime(2012, 1, 1), datetime(2012, 1, 2), datetime(2012, 1, 3)],
columns=["a", "b", "c"],
)
-        time_freq = date_range("2012-01-01", "2012-01-03", freq="d")
+
+        msg = "'d' is deprecated and will be removed in a future version."
+        with tm.assert_produces_warning(FutureWarning, match=msg):
+            time_freq = date_range("2012-01-01", "2012-01-03", freq="d")
some_cols = ["a", "b"]

index_freq = df.reindex(index=time_freq).index.freq
2 changes: 1 addition & 1 deletion pandas/tests/frame/test_query_eval.py
@@ -763,7 +763,7 @@ def test_check_tz_aware_index_query(self, tz_aware_fixture):
# https://github.com/pandas-dev/pandas/issues/29463
tz = tz_aware_fixture
df_index = date_range(
start="2019-01-01", freq="1d", periods=10, tz=tz, name="time"
start="2019-01-01", freq="1D", periods=10, tz=tz, name="time"
)
expected = DataFrame(index=df_index)
df = DataFrame(index=df_index)
4 changes: 2 additions & 2 deletions pandas/tests/groupby/test_groupby_dropna.py
@@ -420,7 +420,7 @@ def test_groupby_drop_nan_with_multi_index():
),
),
"datetime64[ns]",
"period[d]",
"period[D]",
"Sparse[float]",
],
)
@@ -437,7 +437,7 @@ def test_no_sort_keep_na(sequence_index, dtype, test_series, as_index):
# Unique values to use for grouper, depends on dtype
if dtype in ("string", "string[pyarrow]"):
uniques = {"x": "x", "y": "y", "z": pd.NA}
elif dtype in ("datetime64[ns]", "period[d]"):
elif dtype in ("datetime64[ns]", "period[D]"):
uniques = {"x": "2016-01-01", "y": "2017-01-01", "z": pd.NA}
else:
uniques = {"x": 1, "y": 2, "z": np.nan}
10 changes: 8 additions & 2 deletions pandas/tests/indexes/datetimes/methods/test_snap.py
@@ -7,6 +7,8 @@
import pandas._testing as tm


@pytest.mark.filterwarnings(r"ignore:PeriodDtype\[B\] is deprecated:FutureWarning")
@pytest.mark.filterwarnings("ignore:Period with BDay freq:FutureWarning")
@pytest.mark.parametrize("tz", [None, "Asia/Shanghai", "Europe/Berlin"])
@pytest.mark.parametrize("name", [None, "my_dti"])
def test_dti_snap(name, tz, unit):
@@ -27,7 +29,9 @@ def test_dti_snap(name, tz, unit):
dti = dti.as_unit(unit)

result = dti.snap(freq="W-MON")
expected = date_range("12/31/2001", "1/7/2002", name=name, tz=tz, freq="w-mon")
msg = "'w-mon' is deprecated and will be removed in a future version."
with tm.assert_produces_warning(FutureWarning, match=msg):
expected = date_range("12/31/2001", "1/7/2002", name=name, tz=tz, freq="w-mon")
expected = expected.repeat([3, 4])
expected = expected.as_unit(unit)
tm.assert_index_equal(result, expected)
Expand All @@ -37,7 +41,9 @@ def test_dti_snap(name, tz, unit):

result = dti.snap(freq="B")

expected = date_range("1/1/2002", "1/7/2002", name=name, tz=tz, freq="b")
msg = "'b' is deprecated and will be removed in a future version."
with tm.assert_produces_warning(FutureWarning, match=msg):
expected = date_range("1/1/2002", "1/7/2002", name=name, tz=tz, freq="b")
expected = expected.repeat([1, 1, 1, 2, 2])
expected = expected.as_unit(unit)
tm.assert_index_equal(result, expected)
20 changes: 20 additions & 0 deletions pandas/tests/indexes/datetimes/test_date_range.py
@@ -791,6 +791,26 @@ def test_frequency_A_raises(self, freq):
with pytest.raises(ValueError, match=msg):
date_range("1/1/2000", periods=2, freq=freq)

+    @pytest.mark.parametrize(
+        "freq,freq_depr",
+        [
+            ("2W", "2w"),
+            ("2W-WED", "2w-wed"),
+            ("2B", "2b"),
+            ("2D", "2d"),
+            ("2C", "2c"),
+        ],
+    )
+    def test_date_range_depr_lowercase_frequency(self, freq, freq_depr):
+        # GH#58998
depr_msg = f"'{freq_depr[1:]}' is deprecated and will be removed "
"in a future version."

expected = date_range("1/1/2000", periods=4, freq=freq)
with tm.assert_produces_warning(FutureWarning, match=depr_msg):
result = date_range("1/1/2000", periods=4, freq=freq_depr)
tm.assert_index_equal(result, expected)


class TestDateRangeTZ:
"""Tests for date_range with timezones"""
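The check the new test performs can be reproduced outside the suite with the helpers it imports; a minimal sketch, assuming a pandas build that includes this change:

```python
import pandas as pd
import pandas._testing as tm

# Mirrors the parametrized test above: the lowercase alias must emit a
# FutureWarning and still produce the same index as its uppercase replacement.
expected = pd.date_range("2000-01-01", periods=4, freq="2W")
with tm.assert_produces_warning(FutureWarning, match="'w' is deprecated"):
    result = pd.date_range("2000-01-01", periods=4, freq="2w")
tm.assert_index_equal(result, expected)
```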
2 changes: 1 addition & 1 deletion pandas/tests/indexes/datetimes/test_partial_slicing.py
@@ -35,7 +35,7 @@ def test_string_index_series_name_converted(self):
def test_stringified_slice_with_tz(self):
# GH#2658
start = "2013-01-07"
idx = date_range(start=start, freq="1d", periods=10, tz="US/Eastern")
idx = date_range(start=start, freq="1D", periods=10, tz="US/Eastern")
df = DataFrame(np.arange(10), index=idx)
df["2013-01-14 23:44:34.437768-05:00":] # no exception here
