@nbedanova
Contributor

Added deconvolution code and visualization.

@nbedanova nbedanova requested a review from Copilot November 6, 2025 21:08
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request updates multiple dependencies and introduces a new cytokine deconvolution analysis feature to the codebase. The changes include upgrading key scientific computing libraries and adding a new visualization for decomposed cytokine effects.

  • Updated core dependencies: anndata (0.11.3 → 0.12.4), scipy (1.15.2 → 1.16.3), statsmodels (0.14.2 → 0.14.5), and others
  • Added new deconvolution_cytokine function that decomposes cytokine factor matrices using matrix factorization with L1 regularization
  • Created new figure module figureParseFactorsDeconv.py for visualizing deconvolved cytokine effects

Reviewed Changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| requirements.lock | Updates locked dependency versions including anndata, scipy, statsmodels, packaging, cupy-cuda12x, and adds new dependencies (zarr, numcodecs, donfig, crc32c) |
| requirements-dev.lock | Mirrors dependency updates from requirements.lock for development environment |
| pyproject.toml | Updates minimum statsmodels version requirement from 0.14.1 to 0.14.4 |
| pf2rnaseq/factorization.py | Adds new deconvolution_cytokine function for matrix factorization with L1 regularization to decompose cytokine effects into direct and induced components |
| pf2rnaseq/figures/figureParseFactorsDeconv.py | New figure module for visualizing deconvolved cytokine effects showing original, deconvolved, and interaction matrices |
| pf2rnaseq/figures/commonFuncs/plotFactors.py | Adds centering parameter to plot_condition_factors function to control data normalization, comments out legend, and removes extra blank line |

nbedanova and others added 4 commits November 6, 2025 13:13
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@nbedanova nbedanova requested a review from aarmey November 6, 2025 21:14
@nbedanova nbedanova marked this pull request as ready for review November 6, 2025 21:15
Member

@aarmey aarmey left a comment

Very nice!

Contributor Author

@aarmey I have been building this figure to test out the regularization. I have a rank-100 decomposition saved as "/home/nicoleb/ParsePf2_100_D11_filt.h5ad" and then call deconvolution_cytokine, where alpha (the regularization strength) can be adjusted. The deconvolved matrix, the original matrix, and the convolution matrix are plotted. It should also print the MSE every 100 iterations and the final sparsity.

@aarmey
Member

aarmey commented Nov 14, 2025

@nbedanova I made two changes here: (1) I removed the non-negative bounds (there are negative values in the factor matrix), and (2) I removed the regularization from the diagonal of W. This seems to improve things: I'm seeing interactions that make sense, with fewer "bands".
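
As a rough illustration of change (2), here is a minimal sketch of an objective that penalizes only the off-diagonal entries of W. It assumes the model has the form X ≈ W @ H, which this thread does not confirm; the function name `deconv_objective`, the argument names, and the shapes are all hypothetical and are not taken from `deconvolution_cytokine`.

```python
import numpy as np


def deconv_objective(X, W, H, alpha):
    """MSE of X ~ W @ H plus an L1 penalty on the off-diagonal entries of W.

    Hypothetical shapes (an assumption, not the repository's API):
    X and H are (n_cytokines, n_components); W is (n_cytokines, n_cytokines).
    """
    residual = X - W @ H
    mse = np.mean(residual**2)
    # Boolean mask that is False on the diagonal, so self-effects are not penalized.
    off_diag = ~np.eye(W.shape[0], dtype=bool)
    # No sign constraint is applied anywhere: W and H may hold negative values.
    return mse + alpha * np.sum(np.abs(W[off_diag]))
```

With W equal to the identity and H equal to X, this objective is zero for any alpha, because the diagonal is excluded from the penalty.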

I think there are two ways to make further improvements:

  1. I think the banding comes from the fact that some cytokines are higher or lower across all of the components. Fixing this might be as easy as subtracting off the median value of each cytokine across components. Something to reduce the banding in the original factor matrix will improve the deconvolved output.
  2. L-BFGS-B may be finding a local minimum, and I noticed it fits very slowly. This problem can be solved with ADMM, though writing the solver for that is more involved. Gemini/Claude may be able to write you a good starting point; they gave me a good explanation of how you would go about it.
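
The two suggestions above could be sketched roughly as follows. Everything here is an assumption for illustration: `center_by_median` assumes cytokines are rows and components are columns, and `admm_lasso` is the standard ADMM iteration for a generic lasso problem, not a solver for the full deconvolution. How the deconvolution would be split into subproblems of this form is not specified in this thread.

```python
import numpy as np


def center_by_median(X):
    """Subtract each row's median (assumes rows = cytokines, columns = components)."""
    return X - np.median(X, axis=1, keepdims=True)


def soft_threshold(v, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def admm_lasso(A, b, alpha, rho=1.0, n_iter=500):
    """Minimize 0.5 * ||A @ x - b||^2 + alpha * ||x||_1 with ADMM.

    Generic lasso building block; rho and n_iter are illustrative defaults.
    """
    n = A.shape[1]
    # The x-update solves the same linear system every iteration,
    # so its matrix and the A^T b term are computed once up front.
    lhs = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)
    for _ in range(n_iter):
        x = np.linalg.solve(lhs, Atb + rho * (z - u))  # quadratic step
        z = soft_threshold(x + u, alpha / rho)         # sparsity step
        u = u + x - z                                  # scaled dual update
    return z
```

When A is the identity, the lasso solution is just the soft-thresholded b, which gives a quick sanity check for the solver.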
