ERR: consistent error messages for unsupported reduction operations

While working on and reviewing the string dtype work, we regularly have to update the error messages matched in the tests (because there is a new dtype), and that made it clear that we phrase the same message in many different ways.

Focusing specifically on reductions for a moment, some of the variations currently in use (all `TypeError` unless noted otherwise):

```
# plain Series
datetime64 type does not support operation 'any'
'DatetimeArray' with dtype datetime64[ns] does not support operation 'sum'
Accumulation cumsum not supported for <class 'pandas.core.arrays.datetimes.DatetimeArray'>
cumprod not supported for Timedelta.
(NotImplementedError) cannot perform cummin with type interval[float64, right]
Cannot perform reduction 'any' with string dtype
cannot perform cummin with type string
# Series groupby
agg function failed [how->mean,dtype->object]
cummin is not supported for object dtype
'quantile' cannot be performed against 'object' dtypes!
datetime64 type does not support operation 'sum'
Period type does not support sum operations
'std' and 'sem' are not valid for PeriodDtype
Cannot use quantile with bool dtype
category type does not support sum operations
category dtype does not support aggregation 'mean'
```



<details>

```python
import numpy as np

import pandas as pd
from pandas import Index, CategoricalIndex, IntervalIndex

# from conftest.py
indices_dict = {
    "string-object": Index([f"pandas_{i}" for i in range(10)], dtype=object),
    "datetime": pd.date_range("2020-01-01", periods=10),
    "datetime-tz": pd.date_range("2020-01-01", periods=10, tz="US/Pacific"),
    "period": pd.period_range("2020-01-01", periods=10, freq="D"),
    "timedelta": pd.timedelta_range(start="1 day", periods=10, freq="D"),
    "range": pd.RangeIndex(10),
    "int8": Index(np.arange(10), dtype="int8"),
    "int16": Index(np.arange(10), dtype="int16"),
    "int32": Index(np.arange(10), dtype="int32"),
    "int64": Index(np.arange(10), dtype="int64"),
    "uint8": Index(np.arange(10), dtype="uint8"),
    "uint16": Index(np.arange(10), dtype="uint16"),
    "uint32": Index(np.arange(10), dtype="uint32"),
    "uint64": Index(np.arange(10), dtype="uint64"),
    "float32": Index(np.arange(10), dtype="float32"),
    "float64": Index(np.arange(10), dtype="float64"),
    "bool-object": Index([True, False] * 5, dtype=object),
    "bool-dtype": Index([True, False] * 5, dtype=bool),
    "complex64": Index(
        np.arange(10, dtype="complex64") + 1.0j * np.arange(10, dtype="complex64")
    ),
    "complex128": Index(
        np.arange(10, dtype="complex128") + 1.0j * np.arange(10, dtype="complex128")
    ),
    "categorical": CategoricalIndex(list("abcd") * 2),
    "interval": IntervalIndex.from_breaks(np.linspace(0, 100, num=11)),
    "empty": Index([]),
    "nullable_int": Index(np.arange(10), dtype="Int64"),
    "nullable_uint": Index(np.arange(10), dtype="UInt16"),
    "nullable_float": Index(np.arange(10), dtype="Float32"),
    "nullable_bool": Index(np.arange(10).astype(bool), dtype="boolean"),
    "string-python": Index(
        pd.array([f"pandas_{i}" for i in range(10)], dtype="string[python]")
    ),
    "string-pyarrow": Index(
        pd.array([f"pandas_{i}" for i in range(10)], dtype="string[pyarrow]")
    ),
}

# reductions and accumulations to try on every dtype
ops = [
    "any", "all", "min", "max", "sum", "mean", "median", "prod",
    "std", "var", "sem", "kurt", "skew", "cummin", "cummax", "cumsum",
    "cumprod", "quantile",
]

# plain Series
for dtype, data in indices_dict.items():
    for op in ops:
        try:
            getattr(pd.Series(data), op)()
        except Exception as e:
            print(dtype, op, type(e), e)

# Series groupby (all rows in a single group)
for dtype, data in indices_dict.items():
    for op in ops:
        try:
            getattr(pd.Series(data).groupby([0] * len(data)), op)()
        except Exception as e:
            print(dtype, op, type(e), e)
```

</details>

I think it would be useful to harmonize those error messages, both for us as maintainers/contributors (consistency in the code base, easier to test) and for users (a clear and consistent message).

For a single message, I would definitely specify the dtype (and not the array class), and I think a bit of quoting would help to clearly distinguish the operation (and potentially the dtype).
I have no strong opinion on the actual wording, though. Some potential suggestions for a single dtype/operation:

1. dtype 'datetime64[ns]' does not support operation 'sum'
2. 'datetime64[ns]' dtype does not support operation 'sum'
3. operation 'sum' is not supported for dtype 'datetime64[ns]'
4. cannot perform reduction 'sum' with 'datetime64[ns]' dtype
5. cannot use 'sum' with 'datetime64[ns]' dtype

(could also be all without quotes around the dtype)
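To make the idea concrete, here is a rough sketch of how a single shared message could be produced and matched in tests. It uses wording option 1 purely as an example, and the helper names `unsupported_op_message` and `raise_unsupported_op` are hypothetical (nothing like this exists in pandas under those names); it also assumes that `str(dtype)` gives the dtype name we want to show.

```python
import re


def unsupported_op_message(dtype, op):
    """Format wording option 1 for a dtype/operation pair (hypothetical helper)."""
    return f"dtype '{dtype}' does not support operation '{op}'"


def raise_unsupported_op(dtype, op):
    """Raise the harmonized TypeError; a stand-in for the various raise sites."""
    raise TypeError(unsupported_op_message(dtype, op))


# Test side: one escaped template instead of an alternation of variants.
pattern = re.escape(unsupported_op_message("datetime64[ns]", "sum"))
try:
    raise_unsupported_op("datetime64[ns]", "sum")
except TypeError as err:
    caught = str(err)
    assert re.search(pattern, caught)
    print(caught)
```

With something along these lines, a test would only need the dtype and the operation name to build its expected message, whichever of the wordings above ends up being chosen.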

Any preferences?


ERR: consistent error messages for unsupported reduction operations #59580
