NO FULLREADS FILES PRODUCED #17

Open
cecilelorrain opened this issue Apr 12, 2019 · 3 comments

Comments

@cecilelorrain

Hi,

I am struggling to run step 4 of the pipeline. The flankingReads files are produced, but the fullreads files are not:

/home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2
fastq
/home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R1.te_repeat.flankingReads.fq
/home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R2.te_repeat.flankingReads.fq
all unpaired: /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R1.te_repeat.flankingReads.fq
all unpaired: /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R2.te_repeat.flankingReads.fq
testing if bam exists: /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/bwa_aln/Zt09_assembly.structural_variation.2822_AB-R1.te_repeat.flankingReads.fq_1.te_repeat.flankingReads.bwa.mates.bam
pre: /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R1.te_repeat.flankingReads.fq
pre: /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R2.te_repeat.flankingReads.fq
bam not exists, preceed with bwa to map the reads
[bwa_aln_core] convert to sequence coordinate... 0.92 sec
[bwa_aln_core] refine gapped alignments... 0.78 sec
[bwa_aln_core] print alignments... 2.29 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] convert to sequence coordinate... 0.94 sec
[bwa_aln_core] refine gapped alignments... 0.74 sec
[bwa_aln_core] print alignments... 2.31 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] convert to sequence coordinate... 0.63 sec
[bwa_aln_core] refine gapped alignments... 0.44 sec
[bwa_aln_core] print alignments... [bwa_aln_core] convert to sequence coordinate... 0.80 sec
[bwa_aln_core] refine gapped alignments... 0.77 sec
[bwa_aln_core] print alignments... 1.47 sec
[bwa_aln_core] 691602 sequences have been processed.
[main] Version: 0.6.2-r126
[main] CMD: /home/cecile/RelocaTE2/bin/bwa samse /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/Zt09_assembly.structural_variation.fasta /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/bwa_aln/Zt09_assembly.structural_variation.2822_AB-R1.te_repeat.flankingReads.bwa.single.sai /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R1.te_repeat.flankingReads.fq
[main] Real time: 31.539 sec; CPU: 13.452 sec
2.27 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] convert to sequence coordinate... 0.75 sec
[bwa_aln_core] refine gapped alignments... 0.70 sec
[bwa_aln_core] print alignments... 1.69 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] convert to sequence coordinate... 0.39 sec
[bwa_aln_core] refine gapped alignments... 0.34 sec
[bwa_aln_core] print alignments... 0.56 sec
[bwa_aln_core] 658180 sequences have been processed.
[main] Version: 0.6.2-r126
[main] CMD: /home/cecile/RelocaTE2/bin/bwa samse /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/Zt09_assembly.structural_variation.fasta /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/bwa_aln/Zt09_assembly.structural_variation.2822_AB-R2.te_repeat.flankingReads.bwa.single.sai /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/repeat/flanking_seq/2822_AB-R2.te_repeat.flankingReads.fq
[main] Real time: 30.302 sec; CPU: 10.876 sec
mergeing bam file: 2/2 files
[W::bam_merge_core2] No @HD tag found.
mergeing fullread bam file: 0/0 files

job: sh /home/cecile/Desktop/Postdoc_cecile/2_Zymo/4-Mutation_accumulation_analysis_TE_dynamics/1-TESTS/test_RelocalTE2/2822_AB_test_nosplit/shellscripts/step_4/step_4.Zt09_assembly.structural_variation.repeat.align.sh

Here are the files in flanking_seq/
-rw-rw-r-- 1 cecile cecile 362M Apr 12 09:31 2822_AB-R1.te_repeat.flankingReads.fq
-rw-rw-r-- 1 cecile cecile 345M Apr 12 09:36 2822_AB-R2.te_repeat.flankingReads.fq

And in bwa_aln/*
-rw-rw-r-- 1 cecile cecile 2,6K Apr 12 09:52 bwa.stderr
-rw-rw-r-- 1 cecile cecile 106M Apr 12 09:52 Zt09_assembly.structural_variation.2822_AB-R1.te_repeat.flankingReads.bwa.single.bam
-rw-rw-r-- 1 cecile cecile 108M Apr 12 09:52 Zt09_assembly.structural_variation.2822_AB-R2.te_repeat.flankingReads.bwa.single.bam
-rw-rw-r-- 1 cecile cecile 214M Apr 12 09:53 Zt09_assembly.structural_variation.repeat.bwa.bam
-rw-rw-r-- 1 cecile cecile 1,4K Apr 12 09:52 Zt09_assembly.structural_variation.repeat.bwa.bam.sh
-rw-rw-r-- 1 cecile cecile 119M Apr 12 09:54 Zt09_assembly.structural_variation.repeat.bwa.sorted.bam
-rw-rw-r-- 1 cecile cecile 45K Apr 12 09:54 Zt09_assembly.structural_variation.repeat.bwa.sorted.bam.bai

Thank you in advance for your help,
Cécile

@jonathan-wells

jonathan-wells commented Apr 23, 2019

Hi, I am getting the same problem as Cécile: the fullread BAM files are not being produced. The relevant error report is:

job: sh /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/shellscripts/step_2/1.fq2fa.sh
job: sh /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/shellscripts/step_3/0.te_repeat.blat.sh
job: sh /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/shellscripts/step_3/1.te_repeat.blat.sh
testing if bam exists: /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/bwa_aln/chr25.SRR7081528.chr25_2.te_repeat.flankingReads.fq_1.te_repeat.flankingReads.bwa.mates.bam
bam not exists, preceed with bwa to map the reads
[main] Version: 0.6.2-r126
[main] CMD: /programs/miniconda2/envs/RelocaTE2/bin/bwa samse /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/chr25.fa /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/bwa_aln/chr25.SRR7081528.chr25_1.te_repeat.flankingReads.bwa.single.sai /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/flanking_seq/SRR7081528.chr25_1.te_repeat.flankingReads.fq
[main] Real time: 0.001 sec; CPU: 0.003 sec
[main] Version: 0.6.2-r126
[main] CMD: /programs/miniconda2/envs/RelocaTE2/bin/bwa samse /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/chr25.fa /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/bwa_aln/chr25.SRR7081528.chr25_2.te_repeat.flankingReads.bwa.single.sai /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/flanking_seq/SRR7081528.chr25_2.te_repeat.flankingReads.fq
[main] Real time: 0.001 sec; CPU: 0.002 sec
mergeing bam file: 2/2 files
[W::bam_merge_core2] No @HD tag found.
mergeing fullread bam file: 0/0 files
job: sh /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/shellscripts/step_4/step_4.chr25.repeat.align.sh
Step5: Find non-reference insertions
find insertions on chr25
fullread bam: /local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/bwa_aln/chr25.repeat.fullreads.bwa.sorted.bam
Traceback (most recent call last):
  File "/programs/miniconda2/envs/RelocaTE2/scripts/relocaTE_insertionFinder.py", line 1825, in <module>
    main()
  File "/programs/miniconda2/envs/RelocaTE2/scripts/relocaTE_insertionFinder.py", line 1809, in main
    read_junction_reads_align(align_file_f, read_repeat, teJunctionReads)
  File "/programs/miniconda2/envs/RelocaTE2/scripts/relocaTE_insertionFinder.py", line 1648, in read_junction_reads_align
    fsam = pysam.AlignmentFile(align_file_f, 'rb')
  File "pysam/calignmentfile.pyx", line 333, in pysam.calignmentfile.AlignmentFile.__cinit__ (pysam/calignmentfile.c:4808)
  File "pysam/calignmentfile.pyx", line 533, in pysam.calignmentfile.AlignmentFile._open (pysam/calignmentfile.c:7027)
IOError: file `/local/workdir/jnw72/Projects/drerio-tes/scripts/danio_test/RelocaTE2_outdir/repeat/bwa_aln/chr25.repeat.fullreads.bwa.sorted.bam` not found

I haven't been able to get this up and running yet, and the problem has always been some variant of a Python IOError caused by an expected file not existing. The error shown above comes from a test dataset whose format I tried to match as closely as possible to the test_data/ directory provided with the package.

Thanks in advance,
Jon
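
For anyone hitting the same traceback, a small pre-check before step 5 can make the missing file obvious earlier. The sketch below is not part of RelocaTE2; `expected_step4_bams` and `missing_files` are hypothetical helper names, and the path layout (`<outdir>/repeat/bwa_aln/<prefix>.repeat[.fullreads].bwa.sorted.bam`) is only inferred from the logs in this thread.

```python
import os


def expected_step4_bams(outdir, prefix):
    """Build the two BAM paths that appear in the logs of this thread.

    Assumption: the layout <outdir>/repeat/bwa_aln/<prefix>.repeat... is
    inferred from the pasted output, not from RelocaTE2 documentation.
    """
    bwa_dir = os.path.join(outdir, "repeat", "bwa_aln")
    return [
        os.path.join(bwa_dir, prefix + ".repeat.bwa.sorted.bam"),
        os.path.join(bwa_dir, prefix + ".repeat.fullreads.bwa.sorted.bam"),
    ]


def missing_files(paths):
    """Return the paths that do not exist or are empty."""
    missing = []
    for path in paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Example values taken from the log above; adjust to your own run.
    for path in missing_files(expected_step4_bams("RelocaTE2_outdir", "chr25")):
        print("missing or empty:", path)
```

If the fullreads BAM is listed as missing while the plain repeat BAM exists, that matches the "mergeing fullread bam file: 0/0 files" line in both logs.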

@davidecarlson

I had the same issue as Jon and Cécile, but I managed to figure out what the problem was. In my case, at least, RelocaTE2 was unable to figure out that my input fastq files were actually paired, so it treated all data as unpaired. After some tests I determined that the problem was the structure of my input fastq file names. My fastq files were:

sample_name.1.fastq sample_name.2.fastq

After changing the filenames to:

sample_name_1.fastq sample_name_2.fastq

RelocaTE2 finished successfully.

I just wanted to make note of this in case others continue to experience this problem.
Thanks,
Dave
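
The naming issue described above can be checked before launching a run. The following is a minimal sketch, assuming that a file counts as a mate only when its name ends with the mate identifier right before the `.fq`/`.fastq` extension. That rule is an assumption made for illustration and is not taken from the RelocaTE2 source; `split_mate` and `pair_fastqs` are hypothetical helpers.

```python
import os
import re


def split_mate(filename, mate_1_id="_1", mate_2_id="_2"):
    """Return (sample, mate) if the name ends with a mate id just before
    the FASTQ extension, otherwise (None, None)."""
    base = os.path.basename(filename)
    match = re.match(r"^(.*?)(\.fastq|\.fq)(\.gz)?$", base)
    if not match:
        return None, None
    stem = match.group(1)
    for mate, tag in ((1, mate_1_id), (2, mate_2_id)):
        if tag and stem.endswith(tag):
            return stem[: -len(tag)], mate
    return None, None


def pair_fastqs(filenames, mate_1_id="_1", mate_2_id="_2"):
    """Group file names into complete pairs; everything else is unpaired."""
    by_sample = {}
    unpaired = []
    for name in filenames:
        sample, mate = split_mate(name, mate_1_id, mate_2_id)
        if sample is None:
            unpaired.append(name)
        else:
            by_sample.setdefault(sample, {})[mate] = name
    pairs = {}
    for sample, mates in by_sample.items():
        if 1 in mates and 2 in mates:
            pairs[sample] = (mates[1], mates[2])
        else:
            unpaired.extend(mates.values())
    return pairs, sorted(unpaired)


if __name__ == "__main__":
    print(pair_fastqs(["sample_name.1.fastq", "sample_name.2.fastq"]))
    print(pair_fastqs(["sample_name_1.fastq", "sample_name_2.fastq"]))
```

Under this assumed rule, the `sample_name.1.fastq` style ends up entirely in the unpaired list with the default identifiers, while the `sample_name_1.fastq` style forms one pair.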

@JinfengChen
Owner

Hi David,

Thank you so much for resolving the issue. Just a reminder that there are options for this that you can explore. If they work for you, they can save you some time in the future. Thanks.

Jinfeng

-1 MATE_1_ID, --mate_1_id MATE_1_ID
    string define paired-end read1, default = "_1"
-2 MATE_2_ID, --mate_2_id MATE_2_ID
    string define paired-end read2, default = "_2"
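
To pick values for the two options above without renaming files, one could derive candidate identifiers from an existing pair of file names. This is only a sketch: `guess_mate_ids` is a hypothetical helper, it assumes the two names differ in exactly one character, and whether RelocaTE2 accepts the resulting strings (for example ".1" and ".2") as-is has not been confirmed in this thread.

```python
import os


def guess_mate_ids(read1, read2):
    """Guess mate identifiers from two paired FASTQ file names.

    Assumption: the names differ in exactly one character (e.g. 1 vs 2).
    A non-alphanumeric separator directly before that character, if any,
    is included in the returned identifiers.
    """
    a, b = os.path.basename(read1), os.path.basename(read2)
    if len(a) != len(b):
        raise ValueError("file names differ in length; cannot guess mate ids")
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        raise ValueError("expected exactly one differing character")
    i = diff[0]
    start = i - 1 if i > 0 and not a[i - 1].isalnum() else i
    return a[start:i + 1], b[start:i + 1]


if __name__ == "__main__":
    id1, id2 = guess_mate_ids("sample_name.1.fastq", "sample_name.2.fastq")
    # These would be the candidate values for -1/--mate_1_id and -2/--mate_2_id.
    print("--mate_1_id", id1, "--mate_2_id", id2)
```

If the guessed identifiers do not work, renaming the files to the default `_1`/`_2` pattern, as Dave did, remains the fix that is confirmed above.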
