This is not a bug but more of a question. I ran Salmon in alignment mode on a transcriptome BAM file generated by STAR. The BAM file contains no unaligned reads. There is often a small number of reads that are not assigned to any rich equivalence class, and I am trying to understand what these reads are. I have noticed that this only happens with paired-end input. I suspect the unassigned reads may be dovetailed read pairs, but I am not sure. The --allowDovetail option is not available in alignment mode. Here is an excerpt of the log:
Completed first pass through the alignment file.
Total # of mapped reads : 6205189
# of uniquely mapped reads : 1718004
# ambiguously mapped reads : 4487185
[2024-08-14 18:21:52.491] [jointLog] [info] Computed 350358 rich equivalence classes for further processing
[2024-08-14 18:21:52.491] [jointLog] [info] Counted 6192944 total reads in the equivalence classes
As you can see, 6192944 of the 6205189 mapped reads were assigned to rich equivalence classes, which leaves 12245 unassigned.
It would be nice to know what the excluded reads are, and whether there is an option to rescue them, similar to --allowDovetail.
This is Salmon version 1.10.3, but I also ran an older version, which gave the same results.
I realized that most of these unassigned reads are probably read pairs that did not match the specified libType, which was "IU" (inward, unstranded). To check, I ran samtools stats on my BAM file:
SN inward oriented pairs: 6191674
SN outward oriented pairs: 13515
The samtools count of inward pairs, 6191674, is close to the 6192944 that Salmon assigned, but not identical. That is fine, since Salmon and samtools probably define inward and outward read pairs differently.
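To make the comparison concrete, here is a small Python sketch that pulls the counts out of the two excerpts quoted above and does the arithmetic. The regular expressions and the `grab` helper are my own assumptions about the line layout (they are not part of Salmon or samtools), and the only inputs are the lines pasted in this thread.

```python
import re

# Lines copied from the Salmon log excerpt in this thread.
salmon_log = """\
Total # of mapped reads : 6205189
[2024-08-14 18:21:52.491] [jointLog] [info] Counted 6192944 total reads in the equivalence classes
"""

# Lines copied from the samtools stats summary in this thread.
samtools_sn = """\
SN inward oriented pairs: 6191674
SN outward oriented pairs: 13515
"""


def grab(pattern, text):
    """Hypothetical helper: return the first captured integer, or fail loudly."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return int(match.group(1))


total_mapped = grab(r"Total # of mapped reads\s*:\s*(\d+)", salmon_log)
counted = grab(r"Counted (\d+) total reads", salmon_log)
inward = grab(r"inward oriented pairs:\s*(\d+)", samtools_sn)
outward = grab(r"outward oriented pairs:\s*(\d+)", samtools_sn)

# Reads Salmon mapped but did not place in any equivalence class.
unassigned = total_mapped - counted

print("unassigned by Salmon:        ", unassigned)
print("outward pairs (samtools):    ", outward)
print("inward + outward (samtools): ", inward + outward)
print("Salmon counted minus inward: ", counted - inward)
```

With these numbers, the inward and outward pairs reported by samtools add up exactly to Salmon's total of mapped reads, and Salmon counted 1270 more reads than samtools classifies as inward. That is consistent with the orientation explanation, although it does not prove it.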
I think it would be helpful if Salmon reported in the log how many reads were excluded, and for what reason. Thanks.