Refactor BasePlot not to create a dataframe representation of the data #3718

@ilan-gold

What kind of feature would you like to request?

Additional function parameters / changed functionality / changed defaults?

Please describe your wishes

See #3717 for what prompted me to look at this code.

Currently, BasePlot creates an in-memory copy, as a dataframe, of the main data of interest (obsm, X, layers, etc.):

self.categories, self.obs_tidy = _prepare_dataframe(
    adata,
    self.var_names,
    groupby,
    use_raw=use_raw,
    log=log,
    num_categories=num_categories,
    layer=layer,
    gene_symbols=gene_symbols,
)

I believe this to be unnecessary as this dataframe is only ever used for groupby operations, for which we have a zero-copy solution in https://scanpy.readthedocs.io/en/latest/generated/scanpy.get.aggregate.html

Thus we should
