Commit 9ff7ec9

Merge branch 'main' into fix/resample-interpolate-fails-with-inplace-true-58690-remove-inplace-option
2 parents c124121 + a787f45 commit 9ff7ec9

46 files changed: 634 additions, 487 deletions

ci/code_checks.sh

Lines changed: 0 additions & 1 deletion
@@ -471,7 +471,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.plotting.andrews_curves RT03,SA01" \
         -i "pandas.plotting.lag_plot RT03,SA01" \
         -i "pandas.plotting.scatter_matrix PR07,SA01" \
-        -i "pandas.qcut PR07,SA01" \
         -i "pandas.set_eng_float_format RT03,SA01" \
         -i "pandas.testing.assert_extension_array_equal SA01" \
         -i "pandas.tseries.offsets.BDay PR02,SA01" \

doc/source/getting_started/intro_tutorials/09_timeseries.rst

Lines changed: 1 addition & 1 deletion
@@ -295,7 +295,7 @@ Aggregate the current hourly time series values to the monthly maximum value in

 .. ipython:: python

-    monthly_max = no_2.resample("ME").max()
+    monthly_max = no_2.resample("MS").max()
     monthly_max

 A very powerful method on time series data with a datetime index, is the

doc/source/user_guide/10min.rst

Lines changed: 1 addition & 1 deletion
@@ -101,7 +101,7 @@ truncated for brevity.
 Viewing data
 ------------

-See the :ref:`Essentially basics functionality section <basics>`.
+See the :ref:`Essential basic functionality section <basics>`.

 Use :meth:`DataFrame.head` and :meth:`DataFrame.tail` to view the top and bottom rows of the frame
 respectively:

doc/source/user_guide/boolean.rst

Lines changed: 13 additions & 0 deletions
@@ -37,6 +37,19 @@ If you would prefer to keep the ``NA`` values you can manually fill them with ``

     s[mask.fillna(True)]

+If you create a column of ``NA`` values (for example to fill them later)
+with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
+new column. The performance on this column will be worse than with
+the appropriate type. It's better to use
+``df['new_col'] = pd.Series(pd.NA, dtype="boolean")``
+(or another ``dtype`` that supports ``NA``).
+
+.. ipython:: python
+
+    df = pd.DataFrame()
+    df['objects'] = pd.NA
+    df.dtypes
+
 .. _boolean.kleene:

 Kleene logical operations

doc/source/user_guide/integer_na.rst

Lines changed: 13 additions & 0 deletions
@@ -84,6 +84,19 @@ with the dtype.
 In the future, we may provide an option for :class:`Series` to infer a
 nullable-integer dtype.

+If you create a column of ``NA`` values (for example to fill them later)
+with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
+new column. The performance on this column will be worse than with
+the appropriate type. It's better to use
+``df['new_col'] = pd.Series(pd.NA, dtype="Int64")``
+(or another ``dtype`` that supports ``NA``).
+
+.. ipython:: python
+
+    df = pd.DataFrame()
+    df['objects'] = pd.NA
+    df.dtypes
+
 Operations
 ----------

doc/source/user_guide/timeseries.rst

Lines changed: 2 additions & 2 deletions
@@ -1864,15 +1864,15 @@ to resample based on datetimelike column in the frame, it can passed to the
        ),
    )
    df
-   df.resample("ME", on="date")[["a"]].sum()
+   df.resample("MS", on="date")[["a"]].sum()

 Similarly, if you instead want to resample by a datetimelike
 level of ``MultiIndex``, its name or location can be passed to the
 ``level`` keyword.

 .. ipython:: python

-   df.resample("ME", level="d")[["a"]].sum()
+   df.resample("MS", level="d")[["a"]].sum()

 .. _timeseries.iterating-label:

doc/source/whatsnew/v3.0.0.rst

Lines changed: 31 additions & 1 deletion
@@ -42,6 +42,7 @@ Other enhancements
 - :meth:`DataFrame.corrwith` now accepts ``min_periods`` as optional arguments, as in :meth:`DataFrame.corr` and :meth:`Series.corr` (:issue:`9490`)
 - :meth:`DataFrame.cummin`, :meth:`DataFrame.cummax`, :meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods now have a ``numeric_only`` parameter (:issue:`53072`)
 - :meth:`DataFrame.fillna` and :meth:`Series.fillna` can now accept ``value=None``; for non-object dtype the corresponding NA value will be used (:issue:`57723`)
+- :meth:`DataFrame.pivot_table` and :func:`pivot_table` now allow the passing of keyword arguments to ``aggfunc`` through ``**kwargs`` (:issue:`57884`)
 - :meth:`Series.cummin` and :meth:`Series.cummax` now supports :class:`CategoricalDtype` (:issue:`52335`)
 - :meth:`Series.plot` now correctly handle the ``ylabel`` parameter for pie charts, allowing for explicit control over the y-axis label (:issue:`58239`)
 - Restore support for reading Stata 104-format and enable reading 103-format dta files (:issue:`58554`)
@@ -280,6 +281,34 @@ Other Deprecations

 Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Enforced deprecation of aliases ``M``, ``Q``, ``Y``, etc. in favour of ``ME``, ``QE``, ``YE``, etc. for offsets
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Renamed the following offset aliases (:issue:`57986`):
+
++-------------------------------+------------------+------------------+
+| offset                        | removed alias    | new alias        |
++===============================+==================+==================+
+|:class:`MonthEnd`              | ``M``            | ``ME``           |
++-------------------------------+------------------+------------------+
+|:class:`BusinessMonthEnd`      | ``BM``           | ``BME``          |
++-------------------------------+------------------+------------------+
+|:class:`SemiMonthEnd`          | ``SM``           | ``SME``          |
++-------------------------------+------------------+------------------+
+|:class:`CustomBusinessMonthEnd`| ``CBM``          | ``CBME``         |
++-------------------------------+------------------+------------------+
+|:class:`QuarterEnd`            | ``Q``            | ``QE``           |
++-------------------------------+------------------+------------------+
+|:class:`BQuarterEnd`           | ``BQ``           | ``BQE``          |
++-------------------------------+------------------+------------------+
+|:class:`YearEnd`               | ``Y``            | ``YE``           |
++-------------------------------+------------------+------------------+
+|:class:`BYearEnd`              | ``BY``           | ``BYE``          |
++-------------------------------+------------------+------------------+
+
+Other Removals
+^^^^^^^^^^^^^^
 - :class:`.DataFrameGroupBy.idxmin`, :class:`.DataFrameGroupBy.idxmax`, :class:`.SeriesGroupBy.idxmin`, and :class:`.SeriesGroupBy.idxmax` will now raise a ``ValueError`` when used with ``skipna=False`` and an NA value is encountered (:issue:`10694`)
 - :func:`concat` no longer ignores empty objects when determining output dtypes (:issue:`39122`)
 - :func:`concat` with all-NA entries no longer ignores the dtype of those entries when determining the result dtype (:issue:`40893`)
@@ -343,7 +372,7 @@ Removal of prior version deprecations/changes
 - Enforced deprecation of string ``A`` denoting frequency in :class:`YearEnd` and strings ``A-DEC``, ``A-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`57699`)
 - Enforced deprecation of string ``BAS`` denoting frequency in :class:`BYearBegin` and strings ``BAS-DEC``, ``BAS-JAN``, etc. denoting annual frequencies with various fiscal year starts (:issue:`57793`)
 - Enforced deprecation of string ``BA`` denoting frequency in :class:`BYearEnd` and strings ``BA-DEC``, ``BA-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`57793`)
-- Enforced deprecation of strings ``T``, ``L``, ``U``, and ``N`` denoting frequencies in :class:`Minute`, :class:`Second`, :class:`Milli`, :class:`Micro`, :class:`Nano` (:issue:`57627`)
+- Enforced deprecation of strings ``T``, ``L``, ``U``, and ``N`` denoting frequencies in :class:`Minute`, :class:`Milli`, :class:`Micro`, :class:`Nano` (:issue:`57627`)
 - Enforced deprecation of strings ``T``, ``L``, ``U``, and ``N`` denoting units in :class:`Timedelta` (:issue:`57627`)
 - Enforced deprecation of the behavior of :func:`concat` when ``len(keys) != len(objs)`` would truncate to the shorter of the two. Now this raises a ``ValueError`` (:issue:`43485`)
 - Enforced deprecation of values "pad", "ffill", "bfill", and "backfill" for :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` (:issue:`57869`)
@@ -453,6 +482,7 @@ Categorical

 Datetimelike
 ^^^^^^^^^^^^
+- Bug in :attr:`is_year_start` where a DateTimeIndex constructed via a date_range with frequency 'MS' wouldn't have the correct year or quarter start attributes (:issue:`57377`)
 - Bug in :class:`Timestamp` constructor failing to raise when ``tz=None`` is explicitly specified in conjunction with timezone-aware ``tzinfo`` or data (:issue:`48688`)
 - Bug in :func:`date_range` where the last valid timestamp would sometimes not be produced (:issue:`56134`)
 - Bug in :func:`date_range` where using a negative frequency value would not include all points between the start and end values (:issue:`56382`)
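
Reviewer note: the rename table in this whatsnew entry can be read as a small lookup for migrating old frequency strings. The sketch below is only an illustration in plain Python. `RENAMED_OFFSET_ALIASES` and `migrate_alias` are hypothetical names, not pandas API, and the anchored-suffix handling (`Q-DEC` to `QE-DEC`) is inferred from the `c_OFFSET_RENAMED_FREQSTR` entries shown later in this commit.

```python
# Hypothetical helper, not part of pandas: maps removed offset aliases to
# their replacements, using the eight rows of the whatsnew table.
RENAMED_OFFSET_ALIASES = {
    "M": "ME",
    "BM": "BME",
    "SM": "SME",
    "CBM": "CBME",
    "Q": "QE",
    "BQ": "BQE",
    "Y": "YE",
    "BY": "BYE",
}


def migrate_alias(alias: str) -> str:
    """Return the new spelling of a removed alias, or the input unchanged."""
    # Split off an optional anchor such as "-DEC"; it is carried over as-is
    # and is not validated here.
    base, sep, anchor = alias.partition("-")
    new_base = RENAMED_OFFSET_ALIASES.get(base, base)
    return f"{new_base}{sep}{anchor}"


print(migrate_alias("M"))
print(migrate_alias("Q-DEC"))
print(migrate_alias("MS"))
```

Aliases that are already valid, such as `MS` or `ME`, pass through unchanged, so the helper could be applied to a whole list of frequency strings before calling into pandas.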

pandas/_libs/tslibs/dtypes.pxd

Lines changed: 3 additions & 2 deletions
@@ -12,9 +12,10 @@ cdef NPY_DATETIMEUNIT get_supported_reso(NPY_DATETIMEUNIT reso)
 cdef bint is_supported_unit(NPY_DATETIMEUNIT reso)

 cdef dict c_OFFSET_TO_PERIOD_FREQSTR
-cdef dict c_OFFSET_DEPR_FREQSTR
-cdef dict c_REVERSE_OFFSET_DEPR_FREQSTR
+cdef dict c_PERIOD_TO_OFFSET_FREQSTR
+cdef dict c_OFFSET_RENAMED_FREQSTR
 cdef dict c_DEPR_ABBREVS
+cdef dict c_PERIOD_AND_OFFSET_DEPR_FREQSTR
 cdef dict attrname_to_abbrevs
 cdef dict npy_unit_to_attrname
 cdef dict attrname_to_npy_unit

pandas/_libs/tslibs/dtypes.pyx

Lines changed: 40 additions & 5 deletions
@@ -176,6 +176,10 @@ OFFSET_TO_PERIOD_FREQSTR: dict = {
     "EOM": "M",
     "BME": "M",
     "SME": "M",
+    "BMS": "M",
+    "CBME": "M",
+    "CBMS": "M",
+    "SMS": "M",
     "BQS": "Q",
     "QS": "Q",
     "BQE": "Q",
@@ -228,7 +232,6 @@ OFFSET_TO_PERIOD_FREQSTR: dict = {
     "YE-NOV": "Y-NOV",
     "W": "W",
     "ME": "M",
-    "Y": "Y",
     "BYE": "Y",
     "BYE-DEC": "Y-DEC",
     "BYE-JAN": "Y-JAN",
@@ -245,7 +248,7 @@ OFFSET_TO_PERIOD_FREQSTR: dict = {
     "YS": "Y",
     "BYS": "Y",
 }
-cdef dict c_OFFSET_DEPR_FREQSTR = {
+cdef dict c_OFFSET_RENAMED_FREQSTR = {
     "M": "ME",
     "Q": "QE",
     "Q-DEC": "QE-DEC",
@@ -303,10 +306,37 @@ cdef dict c_OFFSET_DEPR_FREQSTR = {
     "BQ-OCT": "BQE-OCT",
     "BQ-NOV": "BQE-NOV",
 }
-cdef dict c_OFFSET_TO_PERIOD_FREQSTR = OFFSET_TO_PERIOD_FREQSTR
-cdef dict c_REVERSE_OFFSET_DEPR_FREQSTR = {
-    v: k for k, v in c_OFFSET_DEPR_FREQSTR.items()
+PERIOD_TO_OFFSET_FREQSTR = {
+    "M": "ME",
+    "Q": "QE",
+    "Q-DEC": "QE-DEC",
+    "Q-JAN": "QE-JAN",
+    "Q-FEB": "QE-FEB",
+    "Q-MAR": "QE-MAR",
+    "Q-APR": "QE-APR",
+    "Q-MAY": "QE-MAY",
+    "Q-JUN": "QE-JUN",
+    "Q-JUL": "QE-JUL",
+    "Q-AUG": "QE-AUG",
+    "Q-SEP": "QE-SEP",
+    "Q-OCT": "QE-OCT",
+    "Q-NOV": "QE-NOV",
+    "Y": "YE",
+    "Y-DEC": "YE-DEC",
+    "Y-JAN": "YE-JAN",
+    "Y-FEB": "YE-FEB",
+    "Y-MAR": "YE-MAR",
+    "Y-APR": "YE-APR",
+    "Y-MAY": "YE-MAY",
+    "Y-JUN": "YE-JUN",
+    "Y-JUL": "YE-JUL",
+    "Y-AUG": "YE-AUG",
+    "Y-SEP": "YE-SEP",
+    "Y-OCT": "YE-OCT",
+    "Y-NOV": "YE-NOV",
 }
+cdef dict c_OFFSET_TO_PERIOD_FREQSTR = OFFSET_TO_PERIOD_FREQSTR
+cdef dict c_PERIOD_TO_OFFSET_FREQSTR = PERIOD_TO_OFFSET_FREQSTR

 # Map deprecated resolution abbreviations to correct resolution abbreviations
 cdef dict c_DEPR_ABBREVS = {
@@ -316,6 +346,11 @@ cdef dict c_DEPR_ABBREVS = {
     "S": "s",
 }

+cdef dict c_PERIOD_AND_OFFSET_DEPR_FREQSTR = {
+    "w": "W",
+    "MIN": "min",
+}
+

 class FreqGroup(Enum):
     # Mirrors c_FreqGroup in the .pxd file
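
Reviewer note: the new `PERIOD_TO_OFFSET_FREQSTR` table follows a regular pattern: `M` maps to `ME`, and every `Q`/`Y` key, with or without a month anchor, gains an `E` after the prefix. The plain-Python sketch below rebuilds the same 27 entries from that pattern. It is only a way to check the table by eye; `MONTHS` and `build_period_to_offset` are hypothetical names, and the commit itself spells the dictionary out literally.

```python
# Month anchors in the order they appear in the diff (DEC first).
MONTHS = [
    "DEC", "JAN", "FEB", "MAR", "APR", "MAY",
    "JUN", "JUL", "AUG", "SEP", "OCT", "NOV",
]


def build_period_to_offset() -> dict:
    """Rebuild the period-alias to offset-alias table from its pattern."""
    mapping = {"M": "ME"}
    for prefix in ("Q", "Y"):
        # Unanchored alias, e.g. "Q" -> "QE".
        mapping[prefix] = f"{prefix}E"
        # Anchored aliases, e.g. "Q-MAR" -> "QE-MAR".
        for month in MONTHS:
            mapping[f"{prefix}-{month}"] = f"{prefix}E-{month}"
    return mapping


table = build_period_to_offset()
print(len(table))
print(table["Q-MAR"], table["Y-NOV"])
```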

pandas/_libs/tslibs/fields.pyx

Lines changed: 2 additions & 1 deletion
@@ -253,9 +253,10 @@ def get_start_end_field(
         # month of year. Other offsets use month, startingMonth as ending
         # month of year.

-        if freq_name.lstrip("B")[0:2] in ["MS", "QS", "YS"]:
+        if freq_name.lstrip("B")[0:2] in ["QS", "YS"]:
             end_month = 12 if month_kw == 1 else month_kw - 1
             start_month = month_kw
+
         else:
             end_month = month_kw
             start_month = (end_month % 12) + 1
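
Reviewer note: the effect of dropping `"MS"` from the list is easier to see with the branch isolated. The sketch below copies the post-change branch into a standalone function. `month_bounds` is a hypothetical name, and the surrounding Cython function is not reproduced. With the default `month_kw` of 12 from `datetimes.py`, `"MS"` now falls into the `else` branch and yields a January start and a December end, which appears to be the intent behind the `is_year_start` fix listed in the whatsnew.

```python
def month_bounds(freq_name: str, month_kw: int) -> tuple:
    """Return (start_month, end_month) following the post-change branch."""
    if freq_name.lstrip("B")[0:2] in ["QS", "YS"]:
        # Start-anchored quarterly/yearly offsets: month_kw is the first month.
        end_month = 12 if month_kw == 1 else month_kw - 1
        start_month = month_kw
    else:
        # Everything else, now including "MS": month_kw is the last month.
        end_month = month_kw
        start_month = (end_month % 12) + 1
    return start_month, end_month


print(month_bounds("MS", 12))
print(month_bounds("QS-JAN", 1))
print(month_bounds("BQS-FEB", 2))
```

Before this commit `"MS"` took the first branch, so the same call would have produced a start month of 12 and an end month of 11.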

pandas/_libs/tslibs/offsets.pyx

Lines changed: 49 additions & 47 deletions
@@ -57,8 +57,10 @@ from pandas._libs.tslibs.ccalendar cimport (
 from pandas._libs.tslibs.conversion cimport localize_pydatetime
 from pandas._libs.tslibs.dtypes cimport (
     c_DEPR_ABBREVS,
-    c_OFFSET_DEPR_FREQSTR,
-    c_REVERSE_OFFSET_DEPR_FREQSTR,
+    c_OFFSET_RENAMED_FREQSTR,
+    c_OFFSET_TO_PERIOD_FREQSTR,
+    c_PERIOD_AND_OFFSET_DEPR_FREQSTR,
+    c_PERIOD_TO_OFFSET_FREQSTR,
     periods_per_day,
 )
 from pandas._libs.tslibs.nattype cimport (
@@ -4711,6 +4713,34 @@ INVALID_FREQ_ERR_MSG = "Invalid frequency: {0}"
 _offset_map = {}


+def _validate_to_offset_alias(alias: str, is_period: bool) -> None:
+    if not is_period:
+        if alias.upper() in c_OFFSET_RENAMED_FREQSTR:
+            raise ValueError(
+                f"\'{alias}\' is no longer supported for offsets. Please "
+                f"use \'{c_OFFSET_RENAMED_FREQSTR.get(alias.upper())}\' "
+                f"instead."
+            )
+        if (alias.upper() != alias and
+                alias.lower() not in {"s", "ms", "us", "ns"} and
+                alias.upper().split("-")[0].endswith(("S", "E"))):
+            raise ValueError(INVALID_FREQ_ERR_MSG.format(alias))
+    if (is_period and
+            alias.upper() in c_OFFSET_TO_PERIOD_FREQSTR and
+            alias != "ms" and
+            alias.upper().split("-")[0].endswith(("S", "E"))):
+        if (alias.upper().startswith("B") or
+                alias.upper().startswith("S") or
+                alias.upper().startswith("C")):
+            raise ValueError(INVALID_FREQ_ERR_MSG.format(alias))
+        else:
+            alias_msg = "".join(alias.upper().split("E", 1))
+            raise ValueError(
+                f"for Period, please use \'{alias_msg}\' "
+                f"instead of \'{alias}\'"
+            )
+
+
 # TODO: better name?
 def _get_offset(name: str) -> BaseOffset:
     """
@@ -4850,54 +4880,26 @@ cpdef to_offset(freq, bint is_period=False):

             tups = zip(split[0::4], split[1::4], split[2::4])
             for n, (sep, stride, name) in enumerate(tups):
-                if not is_period and name.upper() in c_OFFSET_DEPR_FREQSTR:
-                    warnings.warn(
-                        f"\'{name}\' is deprecated and will be removed "
-                        f"in a future version, please use "
-                        f"\'{c_OFFSET_DEPR_FREQSTR.get(name.upper())}\' instead.",
-                        FutureWarning,
-                        stacklevel=find_stack_level(),
-                    )
-                    name = c_OFFSET_DEPR_FREQSTR[name.upper()]
-                if (not is_period and
-                        name != name.upper() and
-                        name.lower() not in {"s", "ms", "us", "ns"} and
-                        name.upper().split("-")[0].endswith(("S", "E"))):
-                    warnings.warn(
-                        f"\'{name}\' is deprecated and will be removed "
-                        f"in a future version, please use "
-                        f"\'{name.upper()}\' instead.",
-                        FutureWarning,
-                        stacklevel=find_stack_level(),
-                    )
-                    name = name.upper()
-                if is_period and name.upper() in c_REVERSE_OFFSET_DEPR_FREQSTR:
-                    if name.upper().startswith("Y"):
-                        raise ValueError(
-                            f"for Period, please use \'Y{name.upper()[2:]}\' "
-                            f"instead of \'{name}\'"
-                        )
-                    if (name.upper().startswith("B") or
-                            name.upper().startswith("S") or
-                            name.upper().startswith("C")):
-                        raise ValueError(INVALID_FREQ_ERR_MSG.format(name))
-                    else:
-                        raise ValueError(
-                            f"for Period, please use "
-                            f"\'{c_REVERSE_OFFSET_DEPR_FREQSTR.get(name.upper())}\' "
-                            f"instead of \'{name}\'"
-                        )
-                elif is_period and name.upper() in c_OFFSET_DEPR_FREQSTR:
-                    if name.upper() != name:
+                _validate_to_offset_alias(name, is_period)
+                if is_period:
+                    if name.upper() in c_PERIOD_TO_OFFSET_FREQSTR:
+                        if name.upper() != name:
+                            raise ValueError(
+                                f"\'{name}\' is no longer supported, "
+                                f"please use \'{name.upper()}\' instead.",
+                            )
+                        name = c_PERIOD_TO_OFFSET_FREQSTR.get(name.upper())
+
+                if name in c_PERIOD_AND_OFFSET_DEPR_FREQSTR:
                         warnings.warn(
-                            f"\'{name}\' is deprecated and will be removed in "
-                            f"a future version, please use \'{name.upper()}\' "
-                            f"instead.",
+                            f"\'{name}\' is deprecated and will be removed "
+                            f"in a future version, please use "
+                            f"\'{c_PERIOD_AND_OFFSET_DEPR_FREQSTR.get(name)}\' "
+                            f" instead.",
                             FutureWarning,
                             stacklevel=find_stack_level(),
-                        )
-                        name = c_OFFSET_DEPR_FREQSTR.get(name.upper())
-
+                            )
+                        name = c_PERIOD_AND_OFFSET_DEPR_FREQSTR.get(name)
                 if sep != "" and not sep.isspace():
                     raise ValueError("separator must be spaces")
                 prefix = _lite_rule_alias.get(name) or name
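
Reviewer note: to see which aliases the new `_validate_to_offset_alias` rejects, here is a plain-Python rendering of its branches as I read them from the diff. It is a sketch under stated assumptions: `OFFSET_RENAMED` and `OFFSET_TO_PERIOD` are small hypothetical subsets standing in for the Cython tables `c_OFFSET_RENAMED_FREQSTR` and `c_OFFSET_TO_PERIOD_FREQSTR`, `validate_alias` is a hypothetical name, and the second offset check is assumed to sit inside the `if not is_period` block.

```python
INVALID_FREQ_ERR_MSG = "Invalid frequency: {0}"

# Hypothetical subsets of the real lookup tables, for illustration only.
OFFSET_RENAMED = {"M": "ME", "Q": "QE", "Y": "YE", "BM": "BME"}
OFFSET_TO_PERIOD = {"ME": "M", "QE": "Q", "YE": "Y", "BME": "M"}


def validate_alias(alias: str, is_period: bool) -> None:
    upper = alias.upper()
    if not is_period:
        # Removed spellings such as "M" now raise instead of warning.
        if upper in OFFSET_RENAMED:
            raise ValueError(
                f"'{alias}' is no longer supported for offsets. "
                f"Please use '{OFFSET_RENAMED[upper]}' instead."
            )
        # Lowercase forms of end/start aliases (e.g. "me") are rejected.
        if (upper != alias
                and alias.lower() not in {"s", "ms", "us", "ns"}
                and upper.split("-")[0].endswith(("S", "E"))):
            raise ValueError(INVALID_FREQ_ERR_MSG.format(alias))
    if (is_period
            and upper in OFFSET_TO_PERIOD
            and alias != "ms"
            and upper.split("-")[0].endswith(("S", "E"))):
        # Offset-only aliases are invalid for Period.
        if upper.startswith(("B", "S", "C")):
            raise ValueError(INVALID_FREQ_ERR_MSG.format(alias))
        # Otherwise suggest the period spelling by dropping the first "E".
        alias_msg = "".join(upper.split("E", 1))
        raise ValueError(
            f"for Period, please use '{alias_msg}' instead of '{alias}'"
        )


for candidate, as_period in [("ME", False), ("M", False), ("ME", True)]:
    try:
        validate_alias(candidate, as_period)
        print(candidate, as_period, "accepted")
    except ValueError as exc:
        print(candidate, as_period, "rejected:", exc)
```

The three sample calls cover the accepted offset alias, the removed offset alias, and an offset alias passed where a period alias is expected.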

pandas/core/arrays/datetimes.py

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@ def f(self):
             month_kw = 12
             if freq:
                 kwds = freq.kwds
-                month_kw = kwds.get("startingMonth", kwds.get("month", 12))
+                month_kw = kwds.get("startingMonth", kwds.get("month", month_kw))

             if freq is not None:
                 freq_name = freq.name
