Plotting rework 1: use axes, use render_plot everywhere #166

Merged: casblaauw merged 96 commits into v2.0 from plotting_axes on Feb 5, 2026

@casblaauw
Collaborator

@casblaauw casblaauw commented Dec 19, 2025

This PR reworks CREsted's plotting functions to be axis-focused according to a few core principles, inspired by scanpy and seaborn. Just in time for Christmas 😅

Core principles

  • Working with an ax: Core plotting functions should accept an axis and some data and plot the data on that axis.

    • This allows for composite plots and easy adding of extra annotations/elements from the user's side, by far my greatest frustration currently.
      • Some plots (inherently multi-panel plots, clustermaps) are of course exempt.
    • If no axis is provided, it should create a sensible-sized plot for the user (as it does now already), returning both the fig and axis (if show=False).
    • If multiple values are provided (e.g. predictions from multiple models), it should automatically create a figure with multiple axes (like sc.pl.umap(color=['var1', 'var2'])).
    • Labels/titles should preferably also be set on axes rather than figures, especially on single-axis plots: they don't look misaligned like suptitle does, and a function should not disturb the larger figure without being explicitly instructed to (through suptitle/sup[x/y]label).
  • Customizing the plotting function: The underlying plotting functions should be exposed through a plot_kws argument (like seaborn does for complicated functions with e.g. sns.lmplot(scatter_kws={}))

    • Default arguments in the plot should also be overridable with this, e.g. color in pl.hist.distribution, without requiring all of them to be separate plotting function arguments (which can become overwhelming).
  • Unified syntax: All plotting functions should use an identical syntax, aligning with each other and with matplotlib.

    • Figure size is now always set with width and height, and is set on plot creation rather than post-hoc resizing.
    • All plots should use render_plot unless really not possible, and things like setting axis labels, titles, tick label rotations, etc, should as much as possible be handed off to render_plot by putting things in kwargs in the plotting functions. (This allows users to prevent any change to pre-existing properties by simply setting property=None in the plotting function)
    • Things like separate unique figsave arguments are all unified to use save_path in render_plot (or manually if it's one of the few functions that doesn't use it).
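
To make the first two principles concrete, here is a minimal sketch of an axis-first plotting function. It is not CREsted's code: hist_sketch and its defaults are hypothetical names made up for illustration, and only the matplotlib calls are real API.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def hist_sketch(values, ax=None, plot_kws=None, width=6, height=4, title=None):
    """Hypothetical axis-first plotting function (illustration only)."""
    # Create a figure only when the caller did not hand us an axis.
    if ax is None:
        fig, ax = plt.subplots(figsize=(width, height))
    else:
        fig = ax.get_figure()

    # Function defaults come first; user-supplied plot_kws override them.
    kws = {"bins": 20, "color": "tab:blue"}
    kws.update(plot_kws or {})
    ax.hist(values, **kws)

    # The title goes on the axis, not the figure. title=None means
    # "leave whatever is already there untouched".
    if title is not None:
        ax.set_title(title)
    return fig, ax


# Composite figure: both calls draw onto axes that the user created.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))
data = [0.1, 0.4, 0.4, 0.5, 0.9, 1.3, 1.3, 1.4]
hist_sketch(data, ax=ax_left, title="defaults")
hist_sketch(data, ax=ax_right, plot_kws={"bins": 5, "color": "tab:orange"}, title="customized")
```

Because the function only touches the axis it was given, the user stays free to add annotations or extra panels to the same figure afterwards.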

High-level summary:

  • All functions now use render_plot (except a few pl.patterns modisco clustermaps)
  • (Almost) all functions now accept an axis to plot their data on, if plotting a single panel.
  • (Almost) all functions now accept plot_kws to add and customize the underlying plotting function's arguments.
  • All functions take width and height to set dimensions, and multi-plot functions also take sharex/sharey.
  • render_plot now can also set ax-level labels and titles, set x/y lims, and add a grid.
  • Default axis labels now denote whether you're using log_transform or not.
  • Lots of plotting functions had their figure size, labels, etc defaults improved, but the core plotting has been untouched.
  • (Almost) all plotting functions are now tested.

Complete(ish) changelist:

  • All* plotting functions:
    • Now take plot_kws and ax.
    • *Exceptions for a few functions that are always multi-plot.
    • Plot creation is now handled through create_plot, which automatically uses kwargs to create a nice base figure.
  • bar:
    • region_predictions now uses region to plot its components and allows for manual setting of the truth and prediction colors.
    • prediction is cleaned up and made consistent with other functions.
    • All barplots now use a y-only grid by default, since an x-grid is superfluous with a categorical axis.
  • heatmap
    • Colormap is now customizable.
    • Colorbar now has a label to show its units (Pearson correlation), indicating log1p-transformation if used.
    • Heatmaps are now square (sns.heatmap(square=True)) by default, and default fig size was slightly changed to make it fit a square heatmap + a colorbar well.
  • hist
    • Add nice default axis labels, including denoting log-transformation if used.
    • Non-used plots in the plot grid (if plotting multiple classes) are now hidden by default.
  • locus
    • locus.locus_scoring now takes an axis (if only plotting the locus scoring and not the extra bigwig track) and separate plot_kws for both the locus and bigwig plots. Previous custom arguments are now folded into the plot_kws or render_plot kwargs. Highlights can now also be customized with highlight_kws.
    • locus.track was expanded from the beta function I implemented at some point. (Fixes #161, "Expand pl.track.locus".)
      • Now accepts standard track model outputs and a class_idx, instead of requiring the user to subset dimensions before passing in the data.
      • Automatically creates multiple axes for every class provided.
      • Also now takes zoom_n_bases, highlight_positions, highlight_kws.
  • patterns
    • contribution_scores:
      • Class labels are now consistently placed at 70% of the plot height, rather than at 70% of the positive values (which made them inconsistent if the data contained negative values), and at 2.5% of the plot width instead of at x=5 (which is the same at the default zoom level of 200bp, but varies if the zoom level is changed). For 'mutagenesis', they are at 30% by default, since we expect those values to be mostly negative.
      • Inputs with missing dimensions are now automatically expanded where possible.
      • Highlights can now be customized with highlight_kws.
      • y-limit sharing between sequences can now explicitly be customized with sharey=True/False/'sequence'.
      • Internals cleaned up, also makes some behavior more consistent.
      • Now takes coordinates, to plot the explainer on genomic coordinates rather than just range(0, seq_len).
    • _enhancer_design:
      • enhancer_design_steps_predictions: Spelling mistake in the arguments fixed. Now always creates a square grid of plots if supplying a lot of classes, following hist.distributions.
      • enhancer_design_steps_contribution_scores is fully reworked to wrap around contribution_scores and do all the nice things contribution_scores can do. Also fixed zoom_n_bases (fixes #167, "zoom_n_bases broken in enhancer_design_steps_contribution_scores").
    • _modisco_results: These plots are more convoluted/specific (and I very rarely use them), so I didn't touch them beyond the basics.
      • All functions now take width/height, and the non-clustermap functions now all use render_plot. Clustermap functions now use g.savefig() as recommended by seaborn instead of fig.savefig.
      • clustermap_with_pwm_logos pwm positioning logic was slightly adjusted, since they were all overlapping on my test run. Now they're all neatly aligned and separated in my tests at least.
      • selected_instances now takes an axis if plotting a single index.
      • All clustermaps/heatmaps in this module should now have cmap as an argument.
  • scatter
    • class_density can now be customized more and has better defaults (figsize mostly square with or without colorbar, colorbar off by default, nicer labels)
    • class_density now has properly colored and properly ranged colorbar.
    • class_density now has an optional argument for an identity (y=x) line.
  • violin:
    • violin.correlations now takes plot_kws and ax. Label adjusted if using log-transformed data.
  • render_plot
    • Now primarily axis-focused, taking and returning axes, and only disturbing the figure if explicitly asked to. (Fig resizing moved to plot creation, rather than post-hoc, to follow this rule).
    • Can now set axis titles, x/ylabels, and limits. Can handle both a single value (applying that to all axes) and a list of values (one per axis).
    • Rotated labels now align with their ticks, optimized to some heuristics. Primarily important with longer cell type names.
    • [x/y]_* arguments renamed to align with matplotlib and with the properties they set (e.g. xlabel's fontsize is now set by xlabel_fontsize rather than x_label_fontsize, which also avoids supx_label_fontsize, which looks weird).
    • rotation arguments renamed to [x/y]tick, since x/ylabel refers to the axis labels, not the axis tick labels.
    • Can now add a grid with nice defaults (drawn behind the data). Works for a single axis (x-only or y-only) as well as for both axes.
    • tight_layout is now off by default, because of the new 'constrained' layout engine (see create_plot)
  • create_plot
    • New function to replace plt.subplots calls, shorthand for the pattern 'if ax is not None: fig = ax.get_figure(); else: fig, ax = plt.subplots()'.
    • Automatically consumes kwargs that have to do with plotting if the user wants to customise behavior, otherwise uses defaults set per-function as appropriate.
    • Now uses matplotlib's new recommended 'constrained' layout engine (which is set at plot creation), making fig.tight_layout() unneeded. Note that calling that anyway will disable this layout engine and go back to previous tight_layout behavior.
  • General utils:
    • Brought over strand_reverser() and parse_region() into general utils, which make working with region strings/tuples easier. These originate from some dataloader work elsewhere, but were also useful here and generally across CREsted I'd imagine.
  • Tests:
    • Now have tests for (almost) all plotting functions.
      • We're still missing the enhancer design functions and some of the modisco ones, I think. The tl.modisco-using ones are currently skipped if modiscolite is not installed (as crested.tl.modisco is then not available either).
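
As an illustration of the create_plot idea from the changelist, here is a rough sketch of a helper that reuses a given axis or creates a new figure with the 'constrained' layout engine, consuming the size-related kwargs along the way. The name create_plot_sketch, its signature, and the defaults are assumptions for this example; the real create_plot may look different.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def create_plot_sketch(ax=None, kwargs=None, default_width=6, default_height=4):
    """Hypothetical stand-in for create_plot (illustration only)."""
    kwargs = {} if kwargs is None else kwargs
    # Consume creation-related keys in place, so that only the remaining
    # keys are left for the later rendering step.
    width = kwargs.pop("width", default_width)
    height = kwargs.pop("height", default_height)
    if ax is not None:
        # Reuse the caller's axis and never resize someone else's figure.
        return ax.get_figure(), ax
    # Size and layout engine are both fixed at creation time.
    fig, ax = plt.subplots(figsize=(width, height), layout="constrained")
    return fig, ax


render_kwargs = {"width": 8, "height": 3, "xlabel": "Class"}
fig, ax = create_plot_sketch(kwargs=render_kwargs)
# render_kwargs now only holds {"xlabel": "Class"} for the rendering step.
```

Setting the size at creation rather than resizing afterwards is what lets a function leave a user-provided figure alone.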

Compatibility

I've endeavored to keep the code as backward compatible as possible.

  • All renamed arguments still work, and raise a warning on how to use them with the renamed version or new syntax.
  • If using show=False, render_plot does now return both a fig and ax(s), so code previously doing fig = crested.pl.func(show=False); axs = fig.axes or something similar will have to update to fig, axs = crested.pl.func(show=False).
  • title as a kwarg now refers to the axis title rather than the suptitle; the figure-level title is now set with suptitle. This leads to better titles and nicer plots in 90% of cases, but might need some manual changes in multi-panel plots where you expected a suptitle.
  • I've tested all base functions (everything except modisco_results) pretty thoroughly (also adjusting plot_kws, etc), but something might've slipped through.
    • For _modisco_results, I tested that all functions at least work with an old CREsted-based modisco run I had lying around, but haven't played with the parameters a lot. I did not test the two TF expression-based plots (tf_expression_per_cell_type & clustermap_tf_motif), since I didn't have an elegans TF list available, so anyone testing those is appreciated.
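
For readers wondering how renamed arguments can keep working while warning, one common pattern looks roughly like this. It is a generic sketch, not the code in this PR: resolve_renamed is a hypothetical helper, and the warning category used here (DeprecationWarning) is an assumption. The rename itself (x_label_fontsize to xlabel_fontsize) is taken from the changelist above.

```python
import warnings


def resolve_renamed(kwargs, renames):
    """Map deprecated argument names to their new names, warning on each use."""
    resolved = dict(kwargs)  # work on a copy, leave the caller's dict alone
    for old, new in renames.items():
        if old in resolved:
            warnings.warn(
                f"Argument `{old}` is deprecated; use `{new}` instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            # An explicitly passed new name wins over the deprecated one.
            value = resolved.pop(old)
            resolved.setdefault(new, value)
    return resolved


RENAMES = {"x_label_fontsize": "xlabel_fontsize"}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_kwargs = resolve_renamed({"x_label_fontsize": 12, "title": "A"}, RENAMES)
```

Old call sites keep running and get pointed at the new spelling, so the rename can be followed gradually.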

Future work

  • bar.region/bar.prediction automatically also plotting multiple regions?
  • Look into also using render_plot for clustermaps?
  • Think about log_and_raise: currently looks bad in notebooks because it duplicates the error message, and makes errors uncatchable with try/except. Not sure what the advantages are.

This is the first (and biggest) part of a plotting overhaul. The next parts will add some new plots, rework plot categorisation, and update all tutorials.

@casblaauw
Collaborator Author

Updates:

  • Added a full test suite for plotting functions, expanded from the tests @LukasMahieu started at some point.
    • Currently don't test some of the modisco functions, otherwise full coverage.
  • enhancer_design_steps_contribution_scores() has been reworked to wrap around contribution_scores, which fixes #167 (zoom_n_bases broken in enhancer_design_steps_contribution_scores).
  • contribution_scores() code cleaned up and now optionally takes argument coordinates, which plots the contribution scores on genomic coordinates.
  • General util function added to parse these coordinates, for contribution_scores(), track(), and track_scoring()
  • track() expanded to support zoom_n_bases and highlight_positions, which fixes #161 (Expand pl.track.locus).
  • Updated default layout engine to 'constrained', as recommended by matplotlib (see guide here). Because of this, tight_layout is now False by default, but this basically looks the same for most plots.
    • The big improvement/reason for this is that it better deals with labels and titles. This is relevant wherever we use suptitle, and for contribution_scores where this better separates different sequences in some cases. See next comment for an illustration.
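
To give an idea of what parsing coordinates for plotting involves, here is a small standalone sketch. It is not the utility added in this PR: parse_region_sketch, the accepted string format (chrom:start-end with an optional :strand), and the returned tuple layout are all assumptions for illustration.

```python
import re

# Assumed format: "chr1:1000-1200" or "chr1:1000-1200:+"
_REGION_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)(?::(?P<strand>[+-]))?$"
)


def parse_region_sketch(region):
    """Return (chrom, start, end, strand) from a string or tuple; strand may be None."""
    if isinstance(region, str):
        # Drop thousands separators such as "1,000" before matching.
        match = _REGION_RE.match(region.replace(",", ""))
        if match is None:
            raise ValueError(f"Could not parse region string: {region!r}")
        chrom = match["chrom"]
        start, end = int(match["start"]), int(match["end"])
        strand = match["strand"]
    else:
        chrom, start, end, *rest = region
        start, end = int(start), int(end)
        strand = rest[0] if rest else None
    if start >= end:
        raise ValueError(f"Region start must be below end: {region!r}")
    return chrom, start, end, strand


chrom, start, end, strand = parse_region_sketch("chr1:1000-1200")
# Genomic x values for plotting, instead of range(0, seq_len).
x_positions = range(start, end)
```

Accepting both strings and tuples in one place means each plotting function only has to deal with a single normalized form.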

@casblaauw
Collaborator Author

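[Images: two screenshots compared side by side (image vs image), illustrating the layout difference described in the previous comment]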

@casblaauw
Collaborator Author

Regarding the failing readthedocs build: these are warnings that are raised to errors. The warnings appear to come from myst-nb's use of deprecated sphinx code. This seems to be addressed at executablebooks/MyST-NB#681 and at sphinx-doc/sphinx#13644; I will look into it further to see whether we need to change anything on our side.

@casblaauw casblaauw changed the base branch from main to v2.0 February 4, 2026 08:38
@casblaauw
Collaborator Author

casblaauw commented Feb 4, 2026

Okay, now I'm finally declaring it finished (again). Changed base branch to keep it separate from a few bugfixes I'd like to push before, while we can collect this with the other plotting changes and maybe some other cleanup like deleting the old CREsted object for a 2.0 release.

The warnings making the readthedocs CI fail are already fixed in executablebooks/MyST-NB#706. Looking at progress on their final PRs, I'm expecting a release (hopefully) very soon that removes the cause of the warnings. Since they don't even actually break the rendering and it should be done very soon, I'm not fixing sphinx to below 9.1 for now.

Edit: nvm, warnings were something else, see #180, fixed in #181 (which is merged into this).

@casblaauw
Collaborator Author

Merging this into v2.0.0. We can do major in-practice quality testing on there.

@casblaauw casblaauw merged commit 900ca76 into v2.0 Feb 5, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

  • zoom_n_bases broken in enhancer_design_steps_contribution_scores (#167)
  • Expand pl.track.locus (#161)