Benchmark on number of cells and runtime #3
Comments
Hi, it's been two days now and GenKI appears to be stuck! |
Hi Rohit,
I’m currently not in the lab, and sorry for the late reply.
Since GenKI internally constructs the gene network using a regression-based
method, you may encounter delays if your number of genes is beyond 3k. So I
suggest you first run it with the top features, such as highly variable
genes.
We have a running-speed evaluation in the supplementary material; I hope it
helps.
Feel free to let me know if you have any questions.
Best,
Yongjian
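To make the suggestion above concrete, here is a minimal, dependency-free sketch of restricting the input to the most variable genes before building the network. It ranks genes by plain variance on a toy matrix; the function name `top_variable_genes` and the toy values are made up for illustration and are not part of GenKI. In a scanpy workflow, `sc.pp.highly_variable_genes` would be the usual tool, and it typically ranks on a normalized dispersion measure rather than raw variance.

```python
from statistics import pvariance


def top_variable_genes(matrix, gene_names, n_top):
    """Return the names of the n_top genes with the largest variance across cells.

    matrix: list of rows, one row per cell and one column per gene.
    """
    # Transpose so that each entry holds one gene's values across all cells.
    per_gene = list(zip(*matrix))
    variances = [pvariance(values) for values in per_gene]
    # Sort gene indices by decreasing variance and keep the first n_top.
    order = sorted(range(len(gene_names)), key=lambda j: variances[j], reverse=True)
    return [gene_names[j] for j in order[:n_top]]


# Toy data: 4 cells x 3 genes (hypothetical values, not real expression data).
cells = [
    [0.0, 5.0, 1.0],
    [0.0, 1.0, 1.1],
    [0.0, 9.0, 0.9],
    [0.0, 3.0, 1.0],
]
genes = ["geneA", "geneB", "geneC"]
selected = top_variable_genes(cells, genes, n_top=2)
print(selected)
```

The expression matrix would then be subset to the selected genes (keeping the knockout target among them) before it is handed to GenKI.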
On Wed, Nov 1, 2023 at 14:04 Rohit Satyam ***@***.***> wrote:
> Hi it's been two days now and GenKI appears to have stuck!!
|
Hi @yjgeno, I went to the paper's supplementary material and saw that, according to the benchmark, it should take around 200 minutes with up to 5K genes. I started a gene knockout run yesterday at 9 PM; more than 12 hours later it is still running without any output, even with all available cores this time, using
|
Hi @Rohit-Satyam, it looks like you used more than 100 CPUs, and I suspect the delay was caused by the interaction between them, which is handled by Ray. If you still have the problem, could you try reducing the number of CPUs to a lower number, for example 8, when initiating? |
Hi @yjgeno. I was using just 12 CPUs previously, but it still went on forever. That's when I decided to use all the CPUs. Besides, since there is no progress bar, it's difficult to know whether it is proceeding at all or is just stuck! |
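On the missing progress bar: one generic workaround, sketched below with the standard library only, is to wrap the long-running step in a context manager that prints a heartbeat at a fixed interval. The name `heartbeat` and its parameters are hypothetical and not part of GenKI. A heartbeat only shows that the process is alive and how long it has been running; it cannot tell whether the computation is actually advancing.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def heartbeat(label, interval=60.0, emit=print):
    """Call emit() every `interval` seconds while the wrapped block runs."""
    stop = threading.Event()
    start = time.monotonic()

    def beat():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            emit(f"[{label}] still running after {time.monotonic() - start:.0f} s")

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        yield
    finally:
        # Stop the background thread and report the total elapsed time.
        stop.set()
        worker.join()
        emit(f"[{label}] finished in {time.monotonic() - start:.1f} s")


# Demo: a short sleep stands in for the long-running step.
messages = []
with heartbeat("demo step", interval=0.05, emit=messages.append):
    time.sleep(0.2)
print(messages)
```

To see where a seemingly stuck run is spending its time, the standard-library call `faulthandler.dump_traceback_later(timeout, repeat=True)` can additionally dump the Python stack of every thread at a fixed interval.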
@yjgeno I tried 8 CPUs as well, and it's been two days. Instead of running the code in a Jupyter notebook, I converted your code into an executable file
I am attaching |
@Rohit-Satyam Hi, have you finished your run? I didn't encounter any problems handling data on this scale. If you're still facing issues, feel free to send your data my way, and I'll take care of the run for you |
Hi @yjgeno @jamesjcai @guanxunli I have the same runtime issue using GenKI. I have been using GenKI with the 10X PBMC3k scRNA-seq dataset, which contains 2,698 cells and 1,865 highly variable genes after filtering. I am attempting to simulate the knockout of a single gene. However, even after running the computation on 60 CPUs for an entire week, I have not yet obtained any results.
sc.settings.verbosity = 0
import GenKI as gk
adata = build_adata("pbmc3k_10X_filtered_scaled.h5ad")
data_wrapper = DataLoader(
data_wt = data_wrapper.load_data()
hyperparams = {"epochs": 2,
sensei = VGAE_trainer(data_wt,
z_mu_wt, z_std_wt = sensei.get_latent_vars(data_wt)
z_mu_wt=pd.DataFrame(z_mu_wt)
res_raw = utils.get_generank(data_wt, dis, rank=True)
null = sensei.pmt(data_ko, n=5, by="KL")
Could you please provide any guidance on whether this computation time is expected, or if there are any optimizations I can apply to speed up the process? Any help would be greatly appreciated. Thank you for your assistance! |
Hi @LPH-BIG, it seems that your single-cell dataset is relatively small, so the GRN calculation shouldn't be this slow. Could you try reducing the number of CPUs to something like
If you wish, and if the data is not sensitive, you could share it with me and I will run it for you. |
@LPH-BIG Were you able to resolve it using 4 CPUs? |
Hi @yjgeno @jamesjcai @guanxunli
Thanks for developing GenKI. I was excited to use it to knock out a gene from the Malaria Cell Atlas and see whether it gives better results in terms of biology than scTenifoldKnk. Since the MCA is large, nearly 29K cells, I split the cells into their cell stages and then tried running GenKI. I have 7K cells in the current run and I am using 12 cores, but it has been an entire day and it seems to be stuck at the DataLoader step. Do you have any benchmark of how much time it will take against the number of cells?
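In the absence of a published cells-versus-runtime benchmark, one rough way to estimate it locally is to time the slow step on a few small subsamples and extrapolate with a power-law fit. The sketch below assumes that runtime grows approximately as a * n^b in the number of cells n, which may not hold for GenKI. The helper names and the timings are hypothetical illustration values, not measurements of GenKI.

```python
import math
import time


def fit_power_law(sizes, seconds):
    """Least-squares fit of log(t) = log(a) + b * log(n); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def time_step(step, n):
    """Wall-clock seconds for a single call of step(n)."""
    start = time.perf_counter()
    step(n)
    return time.perf_counter() - start


# Hypothetical timings (seconds) for subsamples of 250, 500 and 1000 cells.
# Real values would come from time_step() applied to the actual slow step.
sizes = [250, 500, 1000]
seconds = [10.0, 40.0, 160.0]
a, b = fit_power_law(sizes, seconds)
predicted_7k = a * 7000 ** b
print(f"exponent ~ {b:.2f}, predicted time for 7000 cells ~ {predicted_7k / 3600:.1f} h")
```

If the extrapolated time is far below what the full run has already consumed, that points to a hang (for example in the parallel backend) rather than to an expensive but progressing computation.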