Hi,
I was doing some trio binning and noticed that hap1 had 0 reads while hap2 had the vast majority. As it turned out, the estimated informative k-mer peak for hap1 was at 1018, while for hap2 it was 5.
I did some digging and it was due to this line:
    if (2*minAve*aveSize<thisSum) // Over estimates the minimum sum when thisLen < aveSize - i.e., for
For my k-mer distribution, the ratio thisSum/(minAve*aveSize) peaked at 1.863 at minFreq=5, just short of the 2 that this condition requires. The for loop therefore ran to completion without ever satisfying the break condition. It is fairly easy to set this manually for splitHaplotype, but I was wondering if there is a more lenient condition, or at least some fallback, because nearly picking minFreq=5 but then actually picking minFreq=1018 is a huge difference and effectively filters out most k-mers.
Not very rigorous, but I could recover minFreq=5 (which isn't necessarily the "right" answer) by doing something like:
    double maxRatio=0;
    uint32 maxRatioFreq=1;
    if (thisSum/(minAve*aveSize)>maxRatio) {
      maxRatio=thisSum/(minAve*aveSize);
      maxRatioFreq=minFreq;
    }
    ...
    // if haven't hit the break condition but exited the for loop
    if (f==histoLen) minFreq=maxRatioFreq;
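To make the fallback idea easier to discuss, here is a minimal self-contained sketch of it. It is not canu's code: `pickMinFreq`, the `ratios` vector, and the `threshold` parameter are hypothetical names, and `ratios[f]` simply stands in for thisSum/(minAve*aveSize) at frequency f, however that is actually computed inside splitHaplotype. It also assumes the original loop stops at the first frequency that satisfies the condition.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of canu.
// ratios[f] stands in for thisSum/(minAve*aveSize) at frequency f;
// index 0 is unused so that indices match frequencies.
//
// Returns the first frequency whose ratio exceeds `threshold` (the
// existing break condition, assumed here to be "ratio > 2"). If no
// frequency does, falls back to the frequency with the largest ratio.
std::uint32_t pickMinFreq(const std::vector<double> &ratios,
                          double threshold = 2.0) {
  double        maxRatio     = 0.0;
  std::uint32_t maxRatioFreq = 1;

  for (std::size_t f = 1; f < ratios.size(); f++) {
    if (ratios[f] > threshold)
      return static_cast<std::uint32_t>(f);    // break condition satisfied

    if (ratios[f] > maxRatio) {                // remember the best near miss
      maxRatio     = ratios[f];
      maxRatioFreq = static_cast<std::uint32_t>(f);
    }
  }

  return maxRatioFreq;   // loop ran to completion: use the peak instead
}
```

With a distribution like the one described here, where the ratio peaks at 1.863 at frequency 5 and never exceeds 2, this would return 5 rather than whatever value is left over when the loop runs off the end of the histogram.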
Expected coverage is around 17x on a ~2.7 Gb genome for hap1 (male), closer to 19x for hap2 (female). Here are the first 30 lines from the histogram for both.
For reference, the line quoted above is canu/src/haplotyping/splitHaplotype.C, line 380 in f29343c.