Description
A recurring problem is that the GRN inference step of pySCENIC, which uses Arboreto's GRNBoost2/GENIE3 implementation, fails to complete. This seems to be caused by newer Dask releases being incompatible with the existing GRNBoost2/GENIE3 implementation.
Possible errors
ValueError: Metadata mismatch found in from_delayed
Expected partition of type DataFrame but got NoneType
ValueError: tuple is not allowed for map key
...
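To tell these known failures apart from unrelated problems, the messages above can be matched in a small wrapper. The following is only a sketch: `run_grn_step` and `grn_callable` are hypothetical names standing in for whatever function you use to run the GRN step, and only short fragments of the messages are matched because the exact formatting may differ between Dask versions.

```python
# Short fragments taken from the error list above; the list is not exhaustive.
KNOWN_DASK_ERROR_FRAGMENTS = (
    "Metadata mismatch found",
    "Expected partition of type",
    "tuple is not allowed for map key",
)


def looks_like_dask_incompatibility(exc):
    """Return True if the exception message contains a known fragment."""
    message = str(exc)
    return any(fragment in message for fragment in KNOWN_DASK_ERROR_FRAGMENTS)


def run_grn_step(grn_callable, *args, **kwargs):
    """Call grn_callable (a placeholder) and add a hint on known errors."""
    try:
        return grn_callable(*args, **kwargs)
    except ValueError as exc:
        if looks_like_dask_incompatibility(exc):
            # Keep the original exception attached as the cause.
            raise RuntimeError(
                "GRN inference failed with a known Dask-related error; "
                "see the possible solutions in this issue."
            ) from exc
        raise  # Unrelated ValueError: re-raise unchanged.
```

Any other `ValueError` is re-raised unchanged, so the wrapper does not hide unrelated bugs.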
Possible solutions
- In many cases, using an older version of the dask/distributed packages fixes this. The preferred way to do so is to use the Docker images, which already contain the stable versions of these packages (see here for usage details). Alternatively, install them via pip:
pip install dask==1.0.0 'distributed>=1.21.6,<2.0.0'
- Alternatively, some users have reported that upgrading to the newest version of Dask can resolve this as well (see issue #147, "Error when running grnboost2 : ValueError: tuple is not allowed for map key").
- Another option is to use a helper script (arboreto_with_multiprocessing.py) that runs the Arboreto GRN algorithms (GRNBoost2, GENIE3) without Dask, for compatibility. See here, or the basic usage is:
arboreto_with_multiprocessing.py \
    expr_mat.loom \
    allTFs_hg38.txt \
    --output adj.tsv \
    --num_workers 20
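Before picking one of the options above, it can help to check which dask/distributed versions are actually installed. The sketch below compares them against the pins from the pip command in the first option (dask 1.0.0, distributed at least 1.21.6 and below 2.0.0). The helper names are made up for illustration, and the version parsing is deliberately simplified: it only reads the leading numeric components.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.21.6' into (1, 21, 6), padded to at least three components."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # Stop at the first non-numeric component (e.g. 'post1').
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def dask_matches_pin(installed):
    """True if the version equals the pinned dask==1.0.0."""
    return parse_version(installed) == (1, 0, 0)


def distributed_matches_pin(installed):
    """True if the version satisfies >=1.21.6,<2.0.0."""
    return (1, 21, 6) <= parse_version(installed) < (2, 0, 0)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    checks = (("dask", dask_matches_pin), ("distributed", distributed_matches_pin))
    for name, check in checks:
        found = installed_version(name)
        if found is None:
            print(f"{name}: not installed")
        else:
            status = "matches" if check(found) else "does not match"
            print(f"{name} {found}: {status} the pinned range")
```

If the installed versions fall outside the pinned range and the errors listed above appear, the Docker images or the multiprocessing helper script are the alternatives described in this issue.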