Replies: 5 comments
The If you want to sample a fixed number of pairs, rather than a proportion, then reservoir sampling can be used. Also see: https://www.biostars.org/p/110107/
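To make the reservoir sampling suggestion concrete, here is a minimal sketch of the classic single-pass variant (often called Algorithm R). It is only an illustration: `reservoir_sample` is a hypothetical helper, not an existing pairsamtools function, and it assumes the input is an iterable of already-parsed data lines (no header handling).

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], k: int,
                     seed: Optional[int] = None) -> List[str]:
    """Draw k lines uniformly at random from a stream of unknown length.

    Hypothetical helper for illustration; not part of pairsamtools.
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Pick a slot in [0, i]; the new line replaces an existing
            # entry only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = line
    return reservoir
```

Note that the returned lines are not in their original order, so a sorted pairs file would probably need re-sorting after sampling. If the stream holds fewer than `k` lines, all of them are returned.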
@nvictus, but do you agree that it would be a generic enough and overall useful tool to have?
At its simplest, it seems to be a very generic operation. Unix However, as many point out, if you're happy with an approximate result, it's a simple one-liner to downsample a stream of lines. Unless this tool would do more sophisticated things that
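For reference, the approximate approach mentioned above can be sketched as an independent coin flip per line. This is only a rough illustration: `downsample_lines` is a hypothetical name, and the assumption that header lines start with `#` and should always be kept is mine, not something stated in this thread.

```python
import random
from typing import Iterable, Iterator, Optional


def downsample_lines(lines: Iterable[str], fraction: float,
                     seed: Optional[int] = None) -> Iterator[str]:
    """Keep each data line independently with probability `fraction`.

    Hypothetical helper for illustration; not part of pairsamtools.
    Being a generator, it validates `fraction` on first iteration.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    for line in lines:
        # Assumption: lines starting with '#' are header lines and are
        # always passed through unchanged.
        if line.startswith("#") or rng.random() < fraction:
            yield line
```

The output keeps the input order and needs no buffering, but the number of retained lines is only approximately `fraction` times the input size, which is the trade-off against sampling an exact count.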
Ah, my bad. It seems
Hi @sergpolly, @nvictus, isn't that resolved by
I feel like we would benefit from having a simple `pairsamtools subsample` tool (or an option to subsample for `pairsamtools select`) ...
The rationale is to enable us to do some "rigorous" statistics, significance estimation, bootstrapping, or permutation testing for some of the analyses. For example, if we want to measure a "subtle" compartment strength difference between 2 experiments, and we have 10 mln and 12 mln pairs for the experiments, we can subsample both down to 5 mln several times, calculate a compartment strength for each subsample, and compare the resulting distributions. Another example would be subsampling and mixing mitotic and G1 pairs to check if some experimental effects could be explained by such a simple mixture, etc.
Technical notes/questions:
- Would a `pairix` index help speed up subsampling? Should we rely on it?
- Does `subsample` fit into `select`, or does it deserve to be a separate tool?