Hi,
I am working on genome annotation and plan to use mapped RNA-seq data as evidence for gene prediction.
I have worked on genome assembly and have read about read error correction there, but I had never heard of error correction for RNA-seq data. I searched for programs specific to RNA-seq and was glad to find Rcorrector and Seecer. I was about to try Rcorrector, but I wanted to ask a question first.
Should I use Rcorrector before or after adaptor trimming/quality control?
I am also asking because, while looking for software to error-correct RNA-seq data, I found Semblans (https://github.com/gladshire/Semblans/ @gladshire). Its workflow runs error correction with Rcorrector first, followed by Trimmomatic (@vivienrosenthal). That order seemed odd to me: if Rcorrector uses trusted k-mers to correct errors, adapter sequences might induce artificial (false-positive) corrections, since adapter k-mers will be very abundant in untrimmed reads.
So I wanted to ask you whether that approach is correct, and whether I should use the raw reads or the quality-filtered/adapter-trimmed reads for error correction.
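To make the concern concrete, here is a toy Python sketch of what I mean. It is not Rcorrector's actual algorithm; it only counts k-mers and calls a k-mer "trusted" when its count reaches a fixed threshold, which is my simplification. The reads, `K`, and `THRESHOLD` are made up for illustration, and the adapter is the common Illumina adapter prefix appended to every read.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Toy data (hypothetical): four different inserts, each followed by read-through
# into the same adapter sequence.
ADAPTER = "AGATCGGAAGAGC"
inserts = ["ACGTTGCATGCA", "TTGACCGTAAGC", "GGCATCGATTAC", "CATGGTACCTGA"]
reads = [ins + ADAPTER for ins in inserts]

K = 8          # illustrative k-mer size, not a tool default
THRESHOLD = 3  # simplified "trusted" cutoff, an assumption for this sketch

counts = count_kmers(reads, K)
adapter_kmers = {ADAPTER[i:i + K] for i in range(len(ADAPTER) - K + 1)}
trusted = {kmer for kmer, c in counts.items() if c >= THRESHOLD}

# Adapter k-mers occur once per read, so they pass the cutoff,
# while k-mers from the (unique) inserts do not.
print("adapter k-mers counted as trusted:", len(adapter_kmers & trusted))
print("trusted k-mers in total:", len(trusted))
```

In this toy case the abundant k-mers come from the adapter rather than from the transcripts, which is the situation I am worried about with untrimmed reads.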
On a related note, I did trim the reads, and the 5' ends look unusual in terms of base composition. Should I hard-trim those ends or use the reads as they are?
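For reference, this is roughly how I am looking at the 5' base composition: a minimal Python sketch that computes the fraction of each base at the first few read positions, similar in spirit to the per-base content plots from QC tools. The function name `base_composition` and the toy reads are hypothetical and only there for illustration.

```python
from collections import Counter

def base_composition(reads, n_positions):
    """Fraction of each base (A, C, G, T, N) at each of the first n_positions."""
    profile = []
    for pos in range(n_positions):
        # Only consider reads long enough to have a base at this position.
        bases = Counter(read[pos] for read in reads if len(read) > pos)
        total = sum(bases.values())
        profile.append({b: bases[b] / total for b in "ACGTN"} if total else {})
    return profile

# Toy reads (hypothetical) with a G-rich start.
toy_reads = ["GGACGT", "GGTCAA", "GCATGC", "GGCTAG"]
profile = base_composition(toy_reads, 3)

for pos, fractions in enumerate(profile, start=1):
    print(pos, fractions)
```

A position where one base dominates, as in the first column of the toy reads, is the kind of pattern I see at the 5' ends.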
Thank you very much.