Hello,
I'm currently developing the McClintock meta-pipeline for TE detection software, and I've been working on integrating RelocaTE2 as a component method. RelocaTE2 runs well on paired-end data, but I've run into an issue when running it with unpaired data. It seems that if no paired-end data is detected, relocaTE_align.py never produces the fullread bam file, which causes downstream steps to fail.
Here is how I've replicated the error with the test data:
# install
conda create --name relocate2 -c bioconda relocate2=2.0.1=pypl526_4
conda activate relocate2
# get test data
git clone https://github.com/JinfengChen/RelocaTE2.git
# make an "unpaired" fastq directory
test_dir="RelocaTE2/test_data"
mkdir unpaired
cp $test_dir/MSU7.Chr3_2M.ALL_reads/MSU7.Chr3_2M.ALL_reads_6X_100_500_1.fq.gz unpaired/MSU7.Chr3_2M.unPaired.fq.gz
# run relocaTE2 with only unpaired data
relocaTE2.py --te_fasta $test_dir/RiceTE.fa \
--genome_fasta $test_dir/MSU7.Chr3_2M.fa \
--fq_dir unpaired \
--outdir out \
--reference_ins $test_dir/MSU7.Chr3_2M.fa.RepeatMasker.out \
--sample rice \
--size 500 \
--step 1234567 \
--mismatch 2 \
--cpu 20 \
--aligner blat \
--verbose 4
The first exception raised is IOError: file '/scratch/pjb68507/test/relocate2/out/repeat/bwa_aln/MSU7.Chr3_2M.repeat.fullreads.bwa.sorted.bam' not found, and I'm assuming the remaining exceptions are a consequence of relocaTE_insertionFinder.py failing to run because of this missing file.
Looking at relocaTE_align.py (RelocaTE2/scripts/relocaTE_align.py, lines 351 to 356 in b774698), I think that the function that is supposed to produce the fullread bam does not actually produce it when no paired-end data is found; it produces only the single.bam.
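To confirm which alignment outputs were actually written, a quick check like the following may help. It is only a sketch: check_fullreads_bam is a hypothetical helper name, and the only file name it relies on is the fullreads.bwa.sorted.bam suffix taken from the IOError above. It reports whether that file exists and lists whatever BAM files are present in the directory.

```shell
# Hypothetical helper (not part of RelocaTE2): report whether the full-read BAM
# named in the IOError exists, then list the BAM files that were written.
check_fullreads_bam() {
    aln_dir="$1"    # alignment output directory, e.g. out/repeat/bwa_aln
    prefix="$2"     # file prefix, e.g. MSU7.Chr3_2M.repeat
    fullreads="$aln_dir/$prefix.fullreads.bwa.sorted.bam"
    if [ -f "$fullreads" ]; then
        echo "found: $fullreads"
    else
        echo "missing: $fullreads"
    fi
    # List the BAM files that are present (prints nothing if there are none).
    find "$aln_dir" -maxdepth 1 -name '*.bam' -print 2>/dev/null | sort
}
```

For the run above, it would be called as: check_fullreads_bam out/repeat/bwa_aln MSU7.Chr3_2M.repeat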
I am not sure whether RelocaTE2 is intended to require paired-end data, but given the functionality of the original RelocaTE and the presence of the --unpaired_id flag, I assumed that RelocaTE2 could work with both paired and unpaired data.
I appreciate any help you can provide on this issue.
Thanks,
Preston