Skip to content

Remove all preprocessing from PlotAccessor.__call__ #29412

Open
@TomAugspurger

Description

@TomAugspurger

In #28647, we discovered that PlotAccessor.__call___ does quite a bit of preprocessing of the data before handing it off to the backend.

# The original data structured can be transformed before passed to the
# backend. For example, for DataFrame is common to set the index as the
# `x` parameter, and return a Series with the parameter `y` as values.
data = self._parent.copy()
if isinstance(data, ABCSeries):
kwargs["reuse_plot"] = True
if kind in self._dataframe_kinds:
if isinstance(data, ABCDataFrame):
return plot_backend.plot(data, x=x, y=y, kind=kind, **kwargs)
else:
raise ValueError(
("plot kind {} can only be used for data frames").format(kind)
)
elif kind in self._series_kinds:
if isinstance(data, ABCDataFrame):
if y is None and kwargs.get("subplots") is False:
msg = "{} requires either y column or 'subplots=True'"
raise ValueError(msg.format(kind))
elif y is not None:
if is_integer(y) and not data.columns.holds_integer():
y = data.columns[y]
# converted to series actually. copy to not modify
data = data[y].copy()
data.index.name = y
elif isinstance(data, ABCDataFrame):
data_cols = data.columns
if x is not None:
if is_integer(x) and not data.columns.holds_integer():
x = data_cols[x]
elif not isinstance(data[x], ABCSeries):
raise ValueError("x must be a label or position")
data = data.set_index(x)
if y is not None:
# check if we have y as int or list of ints
int_ylist = is_list_like(y) and all(is_integer(c) for c in y)
int_y_arg = is_integer(y) or int_ylist
if int_y_arg and not data.columns.holds_integer():
y = data_cols[y]
label_kw = kwargs["label"] if "label" in kwargs else False
for kw in ["xerr", "yerr"]:
if kw in kwargs and (
isinstance(kwargs[kw], str) or is_integer(kwargs[kw])
):
try:
kwargs[kw] = data[kwargs[kw]]
except (IndexError, KeyError, TypeError):
pass
# don't overwrite
data = data[y].copy()
if isinstance(data, ABCSeries):
label_name = label_kw or y
data.name = label_name
else:
match = is_list_like(label_kw) and len(label_kw) == len(y)
if label_kw and not match:
raise ValueError(
"label should be list-like and same length as y"
)
label_name = label_kw or data.columns
data.columns = label_name
return plot_backend.plot(data, kind=kind, **kwargs)

Depending on the plot kind and backend, some of this preprocesing may not be appropriate. The backend might want to see the "raw" data. That's not to say we shouldn't do any preprocessing though. We'll need some feedback from backend authors on what should be done before handing it off.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Needs DiscussionRequires discussion from core team before further actionRefactorInternal refactoring of codeVisualizationplotting

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions