Skip to content

Conversation

@flying-sheep
Copy link
Member

@flying-sheep flying-sheep commented Sep 26, 2024

  • Release notes not necessary because:

An alternative would be to subclass PCA, but that would involve erroring out or reimplementing all of its options.

Ideally #3267 would be merged first and this one integrated into its improved decision tree.

@codecov
Copy link

codecov bot commented Sep 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.18%. Comparing base (121f2db) to head (97c3812).
Report is 56 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3263      +/-   ##
==========================================
+ Coverage   77.01%   77.18%   +0.16%     
==========================================
  Files         110      111       +1     
  Lines       12492    12582      +90     
==========================================
+ Hits         9621     9711      +90     
  Misses       2871     2871              
Files with missing lines Coverage Δ
src/scanpy/preprocessing/_pca/__init__.py 90.81% <100.00%> (+0.70%) ⬆️
src/scanpy/preprocessing/_pca/_dask_sparse.py 100.00% <100.00%> (ø)

@flying-sheep flying-sheep marked this pull request as ready for review September 30, 2024 11:53
@flying-sheep flying-sheep added this to the 1.11.0 milestone Sep 30, 2024
@flying-sheep flying-sheep mentioned this pull request Sep 30, 2024
Comment on lines +241 to +243
if isinstance(adata.X, DaskArray) and issparse(adata.X._meta):
reason = "TruncatedSVD is not supported for sparse Dask yet"
request.applymarker(pytest.mark.xfail(reason=reason))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Intron7 you said it would be possible to adapt the code to do TruncatedSVD as well?

I think this PR is pretty clean, so maybe we can extend it in another PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, PR into this PR if we start pre-merge

@flying-sheep flying-sheep changed the title Implement sparse PCA using dask Implement sparse covariance_eigh PCA using Dask Oct 18, 2024
ilan-gold

This comment was marked as off-topic.

@ilan-gold
Copy link
Contributor

ilan-gold commented Oct 18, 2024

Can we add a test here to find out (or to at least know) how far our implementation here is from #3296 for sparse?

@flying-sheep
Copy link
Member Author

Done!

Copy link
Contributor

@ilan-gold ilan-gold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If everything passes CI, good to go from me

Copy link
Member

@Intron7 Intron7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@flying-sheep flying-sheep enabled auto-merge (squash) October 18, 2024 16:19
@flying-sheep flying-sheep disabled auto-merge October 18, 2024 17:04
@flying-sheep flying-sheep merged commit bae1610 into main Oct 18, 2024
@flying-sheep flying-sheep deleted the pca-dask-sparse branch October 18, 2024 17:04
@flying-sheep flying-sheep mentioned this pull request Nov 19, 2024
21 tasks
kaushalprasadhial pushed a commit to sanchit-misra/scanpy that referenced this pull request Feb 4, 2025
Co-authored-by: Ilan Gold <ilanbassgold@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dask Sparse PCA

4 participants