Hi,
I am having an issue that was reported before in a closed issue. I am running Rcorrector on some RNAseq samples (at first, two at a time), and most of the samples (though not all) seem to lose reads after the run. I ran it a second time on some samples (this time one by one) and still see the same problem, with a different number of reads.
For example:
Samples before running Rcorrector:
~/data/RNAseq_PRJNA338760/FastQrawRNAseq$ grep -c "@HISEQ1" SRR4030253_1.fastq
18218121
~/data/RNAseq_PRJNA338760/FastQrawRNAseq$ grep -c "@HISEQ1" SRR4030253_2.fastq
18218121
After running Rcorrector (first time):
grep: SRR4030253_1.fastq: No such file or directory
mcv@bambi:~/data/RNAseqCorrected$ grep -c "@HISEQ1" SRR4030253_1.cor.fq
7839585
mcv@bambi:~/data/RNAseqCorrected$ grep -c "@HISEQ1" SRR4030253_2.cor.fq
7839611
After running Rcorrector (second time):
/angela/rcorrector$ perl run_rcorrector.pl -1 ~/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_1.fastq -2 ~/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_2.fastq -od RNAseqCorrected253
Put the kmers into bloom filter
/home/mcv/angela/rcorrector/jellyfish/bin/jellyfish bc -m 23 -s 100000000 -C -t 1 -o tmp_a798458599d74d3e1d510f550790024f.bc /home/mcv/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_1.fastq /home/mcv/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_2.fastq
Count the kmers in the bloom filter
/home/mcv/angela/rcorrector/jellyfish/bin/jellyfish count -m 23 -s 100000 -C -t 1 --bc tmp_a798458599d74d3e1d510f550790024f.bc -o tmp_a798458599d74d3e1d510f550790024f.mer_counts /home/mcv/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_1.fastq /home/mcv/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_2.fastq
Dump the kmers
/home/mcv/angela/rcorrector/jellyfish/bin/jellyfish dump -L 2 tmp_a798458599d74d3e1d510f550790024f.mer_counts > tmp_a798458599d74d3e1d510f550790024f.jf_dump
Error correction
/home/mcv/angela/rcorrector/rcorrector -od RNAseqCorrected253 -p /home/mcv/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_1.fastq /home/mcv/data/RNAseq_PRJNA338760/FastQrawRNAseq/SRR4030253_2.fastq -c tmp_a798458599d74d3e1d510f550790024f.jf_dump
Stored 83145603 kmers
Weak kmer threshold rate: 0.014117 (estimated from 0.950/1 of the chosen kmers)
Bad quality threshold is '#'
Processed 36436242 reads
Corrected 41138010 bases.
~/angela/rcorrector/RNAseqCorrected253$ grep -c "@HISEQ1" SRR4030253_1.cor.fq
10861080
~/angela/rcorrector/RNAseqCorrected253$ grep -c "@HISEQ1" SRR4030253_2.cor.fq
10861067
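One cross-check that may help narrow this down: `grep -c "@HISEQ1"` counts every line containing that string, so it depends on the header pattern, whereas a line count does not. Note also that the log above reports "Processed 36436242 reads", which is exactly 2 x 18218121, the number of input pairs. The sketch below is only an assumption-laden helper, not part of Rcorrector: it assumes plain four-line FASTQ records (no wrapped sequence or quality lines) and a trailing newline at the end of the file, and the function names are hypothetical.

```shell
# Count FASTQ records by line count instead of by header pattern.
# Assumption: every record is exactly four lines and the file ends
# with a newline (otherwise wc -l is one line short).
count_fastq_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Succeeds only if the line count is a multiple of four; a failure
# would suggest a truncated or otherwise incomplete file.
fastq_line_count_ok() {
    [ $(( $(wc -l < "$1") % 4 )) -eq 0 ]
}

# Hypothetical usage with the file names from this report:
#   count_fastq_reads SRR4030253_1.cor.fq
#   count_fastq_reads SRR4030253_2.cor.fq
#   fastq_line_count_ok SRR4030253_1.cor.fq || echo "incomplete record"
```

If the two mates give the same record count this way and it matches the input, the discrepancy would come from the grep pattern; if the counts are still lower, or the line count is not a multiple of four, the output files themselves would be worth inspecting.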
I have no idea what is causing this. Could you help?
Thanks in advance,
Angela