Hi Alex,
First of all, thank you for developing this great tool.
I have a question about the gene counts in the .tab output produced with --quantMode GeneCounts, and more specifically about how to get from those counts to RPM (reads per million).
My data are unstranded small RNA-Seq, so when I retrieve the gene counts I use only the first column of counts in the .tab files.
Now, if I compute the column sums of a matrix (4 samples) built by joining that first count column from all the .tab files, my understanding is that I should get the number of mapped reads in each sample. Is this assumption correct?
This is the output of the column sums:
S1 S2 S3 S4
1169953 821332 755315 1050780
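A minimal Python sketch of the column-sum step described above, for concreteness. It assumes each line of a .tab file is tab-separated, with an identifier in the first field and the counts of interest in the second; the function name `count_column_sum` and the toy table are made up for illustration and are not real data. It sums every line of the table as-is, without filtering any rows.

```python
import csv
import io


def count_column_sum(tab_text, column=1):
    """Sum one count column over every line of a tab-separated table.

    `column` is the 0-based field index; 1 is assumed here to be the
    first count column (the field right after the identifier).
    """
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    # Every non-empty line contributes to the sum; nothing is filtered out.
    return sum(int(row[column]) for row in reader if row)


# Made-up toy table: identifier, then three count columns.
example = "geneA\t10\t6\t4\ngeneB\t5\t2\t3\ngeneC\t0\t0\t0\n"
total = count_column_sum(example)
print(total)
```

Running the same function on each sample's table gives one sum per sample, which is the quantity compared against the log files below.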
If my assumption is correct, I should see the same numbers reported as "uniquely mapped reads" in the .Log.final.out files, right?
This is a screenshot of the .Log.final.out for each of my samples:
_log_out_s1s4_smallrnaseq.pdf
As you can see, the column sums that I get from the .tab files are actually closer to the number of input reads in the log files than to the number of mapped reads.
My plan was to use these column sums to turn the read counts into RPM, but now I am more confused.
Can you help?
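The conversion in question can be sketched as follows: RPM = count / library size × 1,000,000. This is only an illustration in Python; `to_rpm` and the toy counts are hypothetical, and which total should serve as the library size (the column sum, or a read count from the log file) is exactly the open point, so it is left as a parameter.

```python
def to_rpm(counts, library_size):
    """Convert raw read counts to reads per million (RPM).

    `counts` maps a gene identifier to its raw count; `library_size`
    is whatever total is chosen as the denominator (an assumption of
    this sketch, not something the sketch decides).
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return {gene: count / library_size * 1_000_000
            for gene, count in counts.items()}


# Made-up toy counts for one sample.
counts = {"geneA": 10, "geneB": 5, "geneC": 0}
rpm = to_rpm(counts, library_size=sum(counts.values()))
```

When the denominator is the sum of the same counts being scaled, the RPM values add up to one million; with any other denominator they will not.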
Thanks a lot