Hi,
Thank you for your great tools! I am trying to run xTea on PacBio CCS long-read sequencing data, and I followed the instructions in the README. My command is " xtea_long -i sample_id.txt -b long_read_bam_list.txt -p /bastianlab/data1/hpan/xTea -o submit_jobs.sh --rmsk /bastianlab/data1/hpan/xTea/rep_lib_annotation/LINE/hg38/hg38_L1_larger_500_with_all_L1HS.out -r /bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa --cns /bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus/LINE1.fa --rep /bastianlab/data1/hpan/xTea/rep_lib_annotation --xtea /c4/home/ucsf-pan/software/xTea/xtea_long -f 31 -y 15 -n 8 -m 32 --slurm -q long -t 2-0:0:0"
But the following errors occur:
clip cutoff is: 0
Loaded consensus file list: ['/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/LINE1.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/ALU.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/HERV.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/SVA_ori.fa']
Begin to construct the TE kmer library!
The TE kmer library is constructed/loaded!
Error: File None doesn't exist!!!
Running command: minimap2 -x ava-pb -c -a -t 8 /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa.algn_2_itself.sorted.bam -
[ERROR] failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa': No such file or directory
Traceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 352, in <module>
i_max_clip, i_min_overlap, iset_cutoff, s_cluster_folder)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 111, in cluster_reads_by_flank_region
m_info, m_reads, l_reads = self._parse_self_aligned_reads(sf_algnmt, i_max_clip)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 392, in _parse_self_aligned_reads
samfile = pysam.AlignmentFile(sf_bam, "rb")
File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False
Running command: minimap2 -ax asm5 -t 8 /bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/all_tei_seq_2_ref.bam -
^CTraceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 311, in <module>
lrc.classify_ins_seqs(sf_rep_ins, sf_ref, flk_lenth, sf_rslt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 121, in classify_ins_seqs
self.classify_from_ref_algnmt(sf_ref, sf_rep_ins, sf_rslt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 114, in classify_from_ref_algnmt
xtea_contig.align_contigs_2_reference_genome(sf_ref, sf_rep_ins, self.n_jobs, sf_algnmt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/x_contig.py", line 106, in align_contigs_2_reference_genome
self.run_cmd(cmd)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/x_contig.py", line 47, in run_cmd
self.cmd_runner.run_cmd_small_output(cmd)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/cmd_runner.py", line 13, in run_cmd_small_output
subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE).communicate()
File "/c4/home/ucsf-pan/miniconda3/envs/svim/lib/python3.7/subprocess.py", line 951, in communicate
stdout = self.stdout.read()
KeyboardInterrupt
(svim) [ucsf-pan@c4-n25 MaMel-144al]$ sh run_xTEA_pipeline.sh
Ave coverage is 0: using parameters clip with value 1
Ave coverage is 0: using parameters clip with value 1
clip cutoff is: 0
Loaded consensus file list: ['/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/LINE1.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/ALU.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/HERV.fa', '/bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/SVA_ori.fa']
Begin to construct the TE kmer library!
The TE kmer library is constructed/loaded!
Error: File None doesn't exist!!!
Running command: minimap2 -x ava-pb -c -a -t 8 /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa.algn_2_itself.sorted.bam -
[ERROR] failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/ghost_reads.fa.separate_flanking.fa': No such file or directory
Traceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 352, in <module>
i_max_clip, i_min_overlap, iset_cutoff, s_cluster_folder)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 111, in cluster_reads_by_flank_region
m_info, m_reads, l_reads = self._parse_self_aligned_reads(sf_algnmt, i_max_clip)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_ghost_TE.py", line 392, in _parse_self_aligned_reads
samfile = pysam.AlignmentFile(sf_bam, "rb")
File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False
Running command: minimap2 -ax asm5 -t 8 /bastianlab/data1/Shared_datasets/Database/References/ucsc_hg38.bwa-index/hg38.fa /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/all_tei_seq_2_ref.bam -
[M::mm_idx_gen::53.871*1.58] collected minimizers
[M::mm_idx_gen::75.459*1.80] sorted minimizers
[M::main::75.459*1.80] loaded/built the index for 455 target sequence(s)
[M::mm_mapopt_update::102.087*1.59] mid_occ = 144
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 455
[M::mm_idx_stat::106.027*1.57] distinct minimizers: 214834535 (90.55% are singletons); average occurrences: 1.424; average spacing: 10.491; total length: 3209286105
ERROR: failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa': No such file or directory
ERROR: failed to map the query file
Running command: samtools index /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/all_tei_seq_2_ref.bam
Working on polyA with contigs /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa and consensus /bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa
Running command: minimap2 -k11 -w5 --sr --frag=yes -A2 -B4 -O4,8 -E2,1 -r150 -p.5 -N5 -n1 -m20 -s30 -g200 -2K50m --MD --heap-sort=yes --secondary=no --cs -a -t 8 /bastianlab/data1/hpan/xTea/rep_lib_annotation/consensus_mask_lrd/polyA.fa /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa | samtools view -hSb - | samtools sort -o /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/polyA_cns.bam -
[M::mm_idx_gen::0.002*1.85] collected minimizers
[M::mm_idx_gen::0.003*3.76] sorted minimizers
[M::main::0.003*3.73] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.003*3.69] mid_occ = 219
[M::mm_idx_stat] kmer size: 11; skip: 5; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.003*3.65] distinct minimizers: 1 (0.00% are singletons); average occurrences: 218.000; average spacing: 1.050; total length: 229
ERROR: failed to open file '/bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa': No such file or directory
ERROR: failed to map the query file
Running command: samtools index /bastianlab/data1/hpan/xTea/MaMel-144al/tmp/classification/polyA_cns.bam
Traceback (most recent call last):
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_main.py", line 311, in <module>
lrc.classify_ins_seqs(sf_rep_ins, sf_ref, flk_lenth, sf_rslt)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 175, in classify_ins_seqs
self.get_unmasked_seqs(sf_rep_ins_tmp, sf_tmp_out, sf_new_tmp)
File "/c4/home/ucsf-pan/software/xTea/xtea_long/l_rep_classification.py", line 297, in get_unmasked_seqs
with pysam.FastxFile(sf_ori) as fin_ori, open(sf_new, "w") as fout_new:
File "pysam/libcfaidx.pyx", line 550, in pysam.libcfaidx.FastxFile.__cinit__
File "pysam/libcfaidx.pyx", line 580, in pysam.libcfaidx.FastxFile._open
OSError: file /bastianlab/data1/hpan/xTea/MaMel-144al/all_ins_seqs.fa not found
I have no idea how to solve this. Could you please guide me how to solve this?
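A possible first diagnostic, offered as a rough sketch and not a confirmed fix: the log reports "Ave coverage is 0" and "File None doesn't exist", and every later failure is a missing intermediate file (ghost_reads.fa.separate_flanking.fa, all_ins_seqs.fa), so the input BAM list may be worth checking before anything else. The Python below uses only the standard library. The function name check_bam_list is hypothetical, and the list layout (one '<sample_id> <bam_path>' pair per line, whitespace-separated) is an assumption that should be compared against the xTea README.

```python
import os


def check_bam_list(list_path):
    """Return a list of human-readable problems found in a BAM list file.

    Assumption: each non-empty line is '<sample_id> <bam_path>', separated
    by whitespace. Adjust the parsing if your list uses another layout.
    """
    if not os.path.isfile(list_path):
        return ["BAM list not found: " + list_path]

    problems = []
    with open(list_path) as fin:
        for line_no, line in enumerate(fin, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) < 2:
                problems.append(
                    "line %d: expected '<sample_id> <bam_path>'" % line_no)
                continue
            sample_id, bam_path = fields[0], fields[1]
            if not os.path.isfile(bam_path):
                problems.append(
                    "%s: BAM not found: %s" % (sample_id, bam_path))
                continue
            if os.path.getsize(bam_path) == 0:
                problems.append(
                    "%s: BAM is empty: %s" % (sample_id, bam_path))
            # An index is commonly named x.bam.bai or x.bai.
            index_candidates = [
                bam_path + ".bai",
                os.path.splitext(bam_path)[0] + ".bai",
            ]
            if not any(os.path.isfile(p) for p in index_candidates):
                problems.append(
                    "%s: no .bai index next to %s" % (sample_id, bam_path))
    return problems
```

Calling check_bam_list("long_read_bam_list.txt") and printing each returned string would show whether the sample ID or BAM path on any line is missing or unreadable. An empty result only rules out these basic problems; it does not show that the BAM itself contains aligned reads.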