Description
Please make sure these conditions are met
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of scanpy.
- (optional) I have confirmed this bug exists on the main branch of scanpy.
What happened?
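When .X is a dask array with sparse (csr_matrix) chunks, zero_center does not seem to be handled correctly in sc.pp.pca. With zero_center=False or zero_center=None a TypeError is raised, even though the message complains about zero-centering, while with zero_center=True the function computes without error.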
Minimal code sample
import scanpy as sc
import dask.array as da
import anndata as ad
from scipy.sparse import csr_matrix
X = da.random.random((1000, 500)).map_blocks(csr_matrix)
adata = ad.AnnData(X=X)
sc.pp.pca(adata, zero_center=True) # does not produce error
sc.pp.pca(adata, zero_center=False) # produces error
sc.pp.pca(adata, zero_center=None) # produces error
Error output
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[23], line 9
6 X = X.map_blocks(scipy.sparse.csr_matrix)
7 adata = ad.AnnData(X=X)
----> 9 sc.pp.pca(adata, zero_center=False)
File [...]/envs/scanpy/lib/python3.10/site-packages/scanpy/preprocessing/_pca/__init__.py:352, in pca(***failed resolving arguments***)
350 if issparse(X._meta):
351 msg = "Dask sparse arrays do not support zero-centering (yet)"
--> 352 raise TypeError(msg)
353 from dask_ml.decomposition import TruncatedSVD
355 svd_solver = _handle_dask_ml_args(svd_solver, TruncatedSVD)
TypeError: Dask sparse arrays do not support zero-centering (yet)
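For anyone triaging: the traceback shows the error sits behind an issparse(X._meta) check. The snippet below is only a sketch to confirm that this check is true for the array from the minimal code sample, so the sparse-dask branch is the one being taken. It does not call scanpy at all, it relies on the private dask attribute _meta (the same one the traceback uses), and the variable name meta_is_sparse is just something introduced here for illustration.

```python
import dask.array as da
from scipy.sparse import csr_matrix, issparse

# Same construction as in the minimal code sample.
X = da.random.random((1000, 500)).map_blocks(csr_matrix)

# `_meta` is dask's empty placeholder describing the block type.
# This mirrors the check at line 350 in the traceback; it is a
# private attribute, so treat this as a debugging aid only.
meta_is_sparse = issparse(X._meta)

print(type(X._meta).__name__, meta_is_sparse)
```

If this reports a sparse meta, then the branch that raises is reached purely because of the chunk type, and the open question is why it is entered for zero_center=False and zero_center=None rather than for zero_center=True.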
Versions
dask 2024.7.1
---- ----
pillow 11.1.0
pyarrow 18.1.0
importlib_metadata 8.5.0
PyYAML 6.0.2
pickleshare 0.7.5
pytz 2024.1
zarr 2.18.3
backports.tarfile 1.2.0
sparse 0.17.0
texttable 1.7.0
tornado 6.4.2
psutil 6.1.1
jedi 0.19.2
six 1.17.0
llvmlite 0.43.0
jaraco.context 5.3.0
leidenalg 0.10.2
zipp 3.21.0
zict 3.0.0
ipython 8.31.0
asciitree 0.3.3
lz4 4.3.3
parso 0.8.4
pure_eval 0.2.3
joblib 1.4.2
asttokens 3.0.0
cytoolz 1.0.1
MarkupSafe 3.0.2
decorator 5.1.1
stack_data 0.6.3
toolz 1.0.0 (toolz: 1.0.0, tlz: 1.0.1)
wcwidth 0.2.13
setuptools 75.6.0
igraph 0.11.8
kiwisolver 1.4.7
louvain 0.8.2
cycler 0.12.1
sortedcontainers 2.4.0
jaraco.collections 5.1.0
tblib 3.0.0
executing 2.1.0
tqdm 4.67.1
prompt_toolkit 3.0.48
msgpack 1.1.0
distributed 2024.7.1
locket 1.0.0
numba 0.60.0
h5py 3.10.0
jaraco.text 3.12.1
numcodecs 0.13.1
jaraco.functools 4.0.1
natsort 8.4.0
python-dateutil 2.9.0.post0
zstandard 0.23.0
dask-expr 1.1.9
---- ----
Python 3.10.17 | packaged by conda-forge | (main, Apr 10 2025, 22:19:12) [GCC 13.3.0]
OS Linux-4.18.0-553.33.1.el8_10.x86_64-x86_64-with-glibc2.28
CPU 128 logical CPU cores, x86_64
GPU No GPU found
Updated 2025-05-26 13:34