Replies: 2 comments
First, open the h5ad file in memory mode if you want efficient random indexing. Fragment data are stored as a specialized sparse row matrix, so random indexing along the rows should be fast.
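To illustrate why row-wise random access into a sparse row matrix is cheap, here is a minimal sketch of a CSR-style layout in plain Python. The arrays and the `get_row` helper are hypothetical and only illustrate the general idea; this is not SnapATAC2's actual storage format or API.

```python
# Illustrative CSR-style layout; NOT SnapATAC2's actual storage format.
# Row i owns the slice indptr[i]:indptr[i + 1] of the flat arrays.
indptr = [0, 2, 2, 5]             # 3 rows (cells); row 1 is empty
indices = [10, 42, 7, 19, 88]     # hypothetical fragment start positions
data = [150, 200, 90, 310, 120]   # hypothetical fragment lengths


def get_row(i):
    """Return (position, length) pairs for row i without scanning other rows."""
    start, end = indptr[i], indptr[i + 1]
    return list(zip(indices[start:end], data[start:end]))


# Rows can be fetched in any order; each lookup touches only its own slice.
rows = [get_row(i) for i in (2, 0, 1)]
print(rows)
```

Each lookup costs two reads of `indptr` plus a slice proportional to the number of entries in that row, which is why random access along rows stays fast when the arrays are held in memory.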
Hi @kaizhang,
Currently, when using fragments in torch, I use only a single torch worker. This avoids the 'unpickleable' errors that occur when torch multiprocessing spawns workers and cannot pickle the fragment file, apparently due to an issue with HDF5.
Even with this setup, loading is still quite slow: fetching fragments at random indices from the fragment file appears to be the bottleneck, though I have not pinned down why.
Have you addressed this, possibly within Selene SDK or some other framework?
I considered pre-serializing the fragments for each cell using snapatac.export, but I would much prefer suggestions for a low-latency approach to multiprocessing and indexing on the native snapatac fragment files themselves.
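For context, a common way around this kind of pickling error is to keep only the file path on the dataset object and open the handle lazily inside each process. The sketch below shows that pattern with the standard library only. `LazyHandleDataset` is a hypothetical name, and a plain binary file of fixed-width records stands in for the fragment file, so none of this is SnapATAC2's or HDF5's API.

```python
import os
import pickle
import tempfile


class LazyHandleDataset:
    """Map-style dataset that opens its file lazily, once per process.

    Hypothetical stand-in: a binary file of fixed-width records plays the
    role of the fragment file; this is not SnapATAC2's API.
    """

    RECORD_SIZE = 4  # bytes per record in this toy file

    def __init__(self, path):
        self.path = path      # only the path needs to be pickled
        self._handle = None   # opened on first access in each process

    def __getstate__(self):
        # Drop the open handle so the object can be sent to worker processes.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state

    def _file(self):
        if self._handle is None:
            self._handle = open(self.path, "rb")
        return self._handle

    def __len__(self):
        return os.path.getsize(self.path) // self.RECORD_SIZE

    def __getitem__(self, i):
        f = self._file()
        f.seek(i * self.RECORD_SIZE)
        return int.from_bytes(f.read(self.RECORD_SIZE), "little")


# Build a toy file holding the integers 0..9.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for value in range(10):
        tmp.write(value.to_bytes(4, "little"))

dataset = LazyHandleDataset(tmp.name)
first = dataset[3]                           # opens the handle in this process
clone = pickle.loads(pickle.dumps(dataset))  # works despite the open handle
print(first, clone[7], len(clone))
```

An object shaped like this (with `__len__` and `__getitem__`) should be usable with `torch.utils.data.DataLoader(dataset, num_workers=...)`, where each worker would then open its own handle on first access. Whether the same approach works for the native fragment files depends on how the underlying reader behaves when reopened per process, which is an assumption here.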