Implement the order of magnitude algorithm from Cell Ranger in emptyDropsCellRanger #119
After reading this for half an hour, I have no idea what this is doing. Why is there a need for bootstrapping? Why can't the threshold be found exactly?
I guess the idea behind the bootstrapping is that those barcodes represent a tiny fraction of the population, so they bootstrap to get a "robust" threshold that is a "guaranteed approximation"? Haha, I guess you found the reason why I did not copy and paste the function you wrote in #119 but spent a day translating their Python functions into R. 😉 I have tested that the numbers I got using the function you wrote differed from their implementation by only tens of cells. It is totally fine if we just apply your succinct and straightforward function, but I guess there will be issues popping up requesting "Cell Ranger"'s algorithms.
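To make the bootstrapping question concrete, here is a minimal Python sketch of a bootstrap-averaged order-of-magnitude count. It is not a translation of Cell Ranger's code: the function names, the 0.99 quantile and the 10-fold ratio are assumptions chosen for illustration.

```python
import numpy as np

def ordmag_count(counts, expected_cells, quantile=0.99, ratio=10.0):
    """Count barcodes within one order of magnitude of a robust maximum.

    Hypothetical helper: `quantile` and `ratio` are illustrative defaults.
    """
    ordered = np.sort(counts)[::-1]  # totals in descending order
    # Use a high quantile of the expected cells as the "robust maximum"
    # instead of the single largest barcode, which may be an outlier.
    idx = min(int(round(expected_cells * (1.0 - quantile))), len(ordered) - 1)
    cutoff = max(1.0, ordered[idx] / ratio)
    return int(np.sum(ordered >= cutoff))

def bootstrap_ordmag(counts, expected_cells, n_boot=100, seed=0):
    """Average the count over bootstrap resamples of the barcode totals."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    counts = counts[counts > 0]  # barcodes with zero counts carry no information
    top_n_boot = np.array([
        ordmag_count(rng.choice(counts, size=len(counts), replace=True),
                     expected_cells)
        for _ in range(n_boot)
    ])
    # Mean is the bootstrap estimate; variance shows how stable it is.
    return float(top_n_boot.mean()), float(top_n_boot.var())

# Made-up data: 100 high-count barcodes among 5000 low-count ones.
counts = np.concatenate([np.full(100, 1500), np.full(5000, 5)])
mean_n, var_n = bootstrap_ordmag(counts, expected_cells=100)
```

On data like this, calling `ordmag_count` once on the full vector gives an exact answer; the bootstrap only adds an estimate of how much that answer moves under resampling, which is the trade-off being questioned above.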
Hm. I'm caught between two concerns:
So how about this. We'll spin off the existing existing Depending on the scope of this new package, I am also willing to offer the other 10X-specific functions, e.g.,
I am totally fine with the plan. I will try my best to help implement any future changes in these functions!
Alright, let me ping BioC slack and see if there are any other interested parties, then I will start the migration process.
Well, I guess there aren't any more interested parties. Can you set up a repo and put me as a collaborator, then I'll move over some code.
Haha it's OK. I will work on this soon!
Sorry for the lateness @DongzeHE, I just remembered to deal with this. I set up https://github.com/LTLA/cram (placeholder name, actual name is up to you). I've already moved your Let me know where you want me to transfer the repo. Then you add your new stuff, submit it to BioC, etc. Happy to help in that regard. If we are quick, you might be able to get it into the next release (3.22).
Hi @LTLA,
Happy New Year!! This is Dongze, using my Altos GitHub account. I finally got some cycles to address #88 (and #109 as a bonus ;P).
According to CellRanger's GitHub repo, I implemented the core functions for the order of magnitude algorithm to estimate the n.expected.cells parameter if it is set as NULL (the default). I have tested that the R implementation resulted in the same number as the python implementation in CellRanger if ignoring the effects of random seeds.

The only concern I have is that the current implementation uses population variance in this line (or here in Cell Ranger). Please feel free to change this to sample var (var(top_n_boot)) if you think it is more appropriate.

Best,
Dongze
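
For reference, the two variance conventions mentioned above differ only in the divisor. The Python snippet below shows this on made-up numbers; the values in `top_n_boot` are invented for illustration and are not output from either implementation. NumPy's `np.var` divides by n by default (population variance), while `ddof=1` divides by n - 1, which is what R's `var()` computes.

```python
import numpy as np

# Made-up bootstrap results, for illustration only.
top_n_boot = np.array([98, 102, 101, 99, 100])

pop_var = np.var(top_n_boot)             # divides by n (ddof=0): 2.0
sample_var = np.var(top_n_boot, ddof=1)  # divides by n - 1, like R's var(): 2.5

n = len(top_n_boot)
# The two estimates differ only by the factor (n - 1) / n.
ratio = pop_var / sample_var
```

With a large number of bootstrap iterations the factor (n - 1) / n is close to 1, so the choice mostly matters when n is small.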