Plotting rework 1: use axes, use render_plot everywhere #166
Updates:
Regarding the readthedocs failing: these are warnings that are raised to errors. The warnings appear to happen in
Okay, now I'm finally declaring it finished (again). I changed the base branch to keep this separate from a few bugfixes I'd like to push first, so that we can collect it with the other plotting changes and maybe some other cleanup, like deleting the old CREsted object, for a 2.0 release. The warnings making the readthedocs CI fail are already fixed in executablebooks/MyST-NB#706. Looking at progress on their final PRs, I'm expecting a release (hopefully) very soon that removes the cause of the warnings. Since they don't actually break the rendering and the fix should land very soon, I'm not pinning sphinx below 9.1 for now. Edit: never mind, the warnings were caused by something else; see #180, fixed in #181 (which is merged into this).
Merging this into v2.0.0. We can do major in-practice quality testing there.


This PR reworks CREsted's plotting functions to be axis-focused according to a few core principles, inspired by scanpy and seaborn. Just in time for Christmas 😅
Core principles
- Working with an `ax`: Core plotting functions should accept an axis and some data and plot the data on that axis.
  - … `sc.pl.umap(colors=['var1', 'var2'])`.
- Customizing the plotting function: The underlying plotting functions should be exposed through a `plot_kws` argument (like seaborn does for complicated functions with e.g. `sns.lmplot(scatter_kws={})`)
  - … `pl.hist.distribution`, without requiring all of them to be separate plotting function arguments (which can become overwhelming).
- Unified syntax: All plotting functions should use an identical syntax, aligning with each other and with matplotlib.
  - … `width` and `height`, and is set on plot creation rather than post-hoc resizing.
  - … `render_plot` unless really not possible, and things like setting axis labels, titles, tick label rotations, etc., should as much as possible be handed off to `render_plot` by putting things in `kwargs` in the plotting functions. (This allows users to prevent any change to pre-existing properties by simply setting `property=None` in the plotting function.)
  - … figsave arguments are all unified to use `save_path` in `render_plot` (or manually if it's one of the few functions that doesn't use it).

High-level summary:
- … `render_plot` (except a few `pl.patterns` modisco clustermaps)
- … `plot_kws` to add and customize the underlying plotting function's arguments.
- … `width` and `height` to set dimensions, and multi-plot functions also take `sharex`/`sharey`.
- `render_plot` now can also set ax-level labels and titles, set x/y lims, and add a grid.
- … `log_transform` or not.

Complete(ish) changelist:
- … `plot_kws` and `ax`.
- … `create_plot`, which automatically uses kwargs to create a nice base figure.
- `bar`:
  - `region_predictions` now uses `region` to plot its components and allows for manual setting of the truth and prediction colors.
  - `prediction` is cleaned up and made consistent with other functions.
- `heatmap`
  - … `sns.heatmap(square=True)`) by default, and default fig size was slightly changed to make it fit a square heatmap + a colorbar well.
- `hist`
- `locus`
  - `locus.locus_scoring` now takes an axis (if only plotting the locus scoring and not the extra bigwig track) and separate `plot_kws` for both the locus and bigwig plots. Previous custom arguments are now folded into the `plot_kws` or `render_plot` kwargs. Highlights can now also be customized with `highlight_kws`.
  - `locus.track` was expanded from the beta function I implemented at some point. (Fixes Expand pl.track.locus #161)
    - … `class_idx`, instead of requiring the user to subset dimensions before passing in the data.
    - … `zoom_n_bases`, `highlight_positions`, `highlight_kws`.
- `patterns`
  - `contribution_scores`:
    - … `highlight_kws`.
    - … `sharey=True/False/'sequence'`.
    - … `coordinates`, to plot the explainer on genomic coordinates rather than just `range(0, seq_len)`.
  - `_enhancer_design`:
    - `enhancer_design_steps_predictions`: Spelling mistake in the arguments fixed. Now always creates a square grid of plots if supplying a lot of classes, following `hist.distributions`.
    - `enhancer_design_steps_contribution_scores` is fully reworked to wrap around `contribution_scores` and do all the nice things `contribution_scores` can do. Also fixed `zoom_n_bases` (fixes zoom_n_bases broken in enhancer_design_steps_contribution_scores #167).
  - `_modisco_results`: These plots are more convoluted/specific (and I very rarely use them), so I didn't touch them beyond the basics.
    - … `render_plot`. Clustermap functions now use `g.savefig()` as recommended by seaborn instead of `fig.savefig`.
    - `clustermap_with_pwm_logos` pwm positioning logic was slightly adjusted, since they were all overlapping on my test run. Now they're all neatly aligned and separated, in my tests at least.
    - `selected_instances` now takes an axis if plotting a single index.
    - … `cmap` as an argument.
- `scatter`
  - `class_density` can now be customized more and has better defaults (figsize mostly square with or without colorbar, colorbar off by default, nicer labels)
  - `class_density` now has a properly colored and properly ranged colorbar.
  - `class_density` now has an optional argument for an identity (y=x) line.
- `violin`:
  - `violin.correlations` now takes `plot_kws` and `ax`. Label adjusted if using log-transformed data.
- `render_plot`
  - `[x/y]_*` arguments aligned with matplotlib and setting arguments (e.g. `xlabel`'s fontsize is now set by `xlabel_fontsize` rather than `x_label_fontsize`, also to prevent `supx_label_fontsize`, which looks weird).
  - … `[x/y]tick`, since `x/ylabel` refers to the axis labels, not the axis tick labels.
  - `tight_layout` is now off by default, because of the new 'constrained' layout engine (see `create_plot`)
- `create_plot`
  - … `plt.subplots` calls, shorthand for `if ax is not None; fig = ax.get_figure(); else; fig, ax = plt.subplots()`
  - … `kwargs` that have to do with plotting if the user wants to customise behavior, otherwise uses defaults set per-function as appropriate.
  - … `fig.tight_layout()` unneeded. Note that calling that anyway will disable this layout engine and go back to previous `tight_layout` behavior.
- `utils`:
  - … `strand_reverser()` and `parse_region()` into general utils, which make working with region strings/tuples easier. These originate from some dataloader work elsewhere, but were also useful here and generally across CREsted I'd imagine.
  - … `modiscolite` is not installed (as `crested.tl.modisco` is also not available then).

Compatibility
I've endeavored to keep code as backward compatible as possible.
- … `show=False`, `render_plot` does now return both a fig and ax(s), so code previously doing `fig = crested.pl.func(show=False); axs = fig.axes` or something similar will have to update to `fig, axs = crested.pl.func(show=False)`.
- … `_modisco_results`, I tested that all functions at least work with an old CREsted-based modisco run I had lying around, but haven't played with parameters a lot. I did not test the two TF expression-based plots (`tf_expression_per_cell_type` & `clustermap_tf_motif`), since I didn't have an elegans TF list available, so anyone testing those is appreciated.

Future work
- `bar.region`/`bar.prediction` automatically also plotting multiple regions?
- `render_plot` for clustermaps?
- `log_and_raise`: currently looks bad in notebooks because it duplicates the error message, and makes errors uncatchable with try/except. Not sure what the advantages are.

This is the first (and biggest) part of a plotting overhaul. The next parts will add some new plots, rework plot categorisation, and update all tutorials.
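
For reviewers who want the gist of the axis-first pattern from the core principles in runnable form, here is a minimal sketch using plain matplotlib. It is not CREsted's actual code: `create_plot_sketch` and `bar_sketch` are hypothetical names, and their signatures and defaults are assumptions made for illustration. It shows reusing a passed `ax` or creating a figure sized by `width`/`height` with the constrained layout engine, merging user `plot_kws` over defaults, skipping any label set to `None`, and returning both the figure and the axis.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def create_plot_sketch(ax=None, width=6.0, height=4.0):
    """Hypothetical stand-in for a figure helper: reuse `ax` or make a new one."""
    if ax is not None:
        # An existing axis was supplied, so draw there and leave its figure alone.
        return ax.get_figure(), ax
    # Size and layout engine are set at creation instead of resizing afterwards.
    return plt.subplots(figsize=(width, height), layout="constrained")


def bar_sketch(values, labels, ax=None, plot_kws=None, width=6.0, height=4.0, **kwargs):
    """Hypothetical axis-first plotting function (not a CREsted function)."""
    fig, ax = create_plot_sketch(ax=ax, width=width, height=height)

    # Function-level defaults that the caller can override through plot_kws.
    bar_kws = {"color": "tab:blue"}
    bar_kws.update(plot_kws or {})
    ax.bar(labels, values, **bar_kws)

    # Labels are only touched when a non-None value is given, so passing
    # e.g. title=None keeps whatever the axis already had.
    setters = {"xlabel": ax.set_xlabel, "ylabel": ax.set_ylabel, "title": ax.set_title}
    for name, setter in setters.items():
        value = kwargs.get(name)
        if value is not None:
            setter(value)

    return fig, ax


# Usage: draw into one panel of a user-made figure, then make a standalone plot.
fig, axs = plt.subplots(1, 2, layout="constrained")
bar_sketch([0.2, 0.8], ["class A", "class B"], ax=axs[0], plot_kws={"color": "red"}, title="Panel 1")
fig2, ax2 = bar_sketch([1, 2, 3], ["a", "b", "c"], width=5, height=3, ylabel="Prediction")
```

Returning `fig, ax` from both code paths mirrors the compatibility note above: callers unpack two values instead of reading `fig.axes` afterwards.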