Hi,
Thank you for continuing to develop ORFik!
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS
ORFik_1.22.2
STAR.align.folder(steps = 'tr-co-ge') prints many log statements.
The initial "Arguments for folder run are the following:" block is helpful.
However, for each of the three steps (trimming, contaminant depletion and genome alignment), the same "Paired end mode for files:" block is printed.
That block is accurate for trimming, but not for steps 2 and 3: each of those steps takes its input from the preceding step, and the output directories and files differ for each step.
Arguments for folder run are the following:
-f input folder: /Bio_data/raw_data/Ribo-seq/ltm_may/
-o output folder: /Bio_data/processed_data/Ribo-seq/ltm_may/
-p paired end: yes
-l minimum length of reads: 20
-T max mismatches of reads: 3
-g genome dir for all STAR indices: /Bio_data/references/hs_assembly/STAR_index/
-s steps to do: tr-co-ge
-a adapter sequence: auto
-t trim front number: 0
-m max multimap: 10
-q quality filtering: default
-A alignment type: Local
-B allow_introns: TRUE
-m maxCPU: 28
-i subfolders: n
-K Keep contamination reads: no
-u Keep unmapped genome reads: None
-S STAR location: /bin/STAR-2.7.4a/bin/Linux_x86_64/STAR
-P fastp location: /bin/fastp
-I align_single location: /opt/R/4.3.2/lib/R/library/ORFik/STAR_Aligner/RNA_Align_pipeline.sh
-C cleaning location: /opt/R/4.3.2/lib/R/library/ORFik/STAR_Aligner/cleanup_folders.sh
Total number of files are:
4
Current step:
tr
#############################################
Paired end mode for files:
Forward: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz
Reverse: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz
Files 1 and 2 / 4
-o output folder: /Bio_data/processed_data/Ribo-seq/ltm_may/
-f input file: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz
-F input file 2: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz
-a adapter sequence: auto
-q quality filtering: default
-s steps to do: tr-co-ge
-r resume (r or new n): tr
-l minimum length of reads: 20
-T max mismatches of reads: 3
-g genome dir for all indices: /Bio_data/references/hs_assembly/STAR_index/
-m maxCPU: 28
-A alignment type: Local
-B allow_introns: TRUE
-t trim front (nt): 0
-k Keep Star Index loaded: y
-K Keep contamination reads: no
-u Keep unmapped genome reads: None
-P fastp location: /bin/fastp
-S STAR location: /bin/STAR-2.7.4a/bin/Linux_x86_64/STAR
...
Current step:
co
#############################################
Paired end mode for files:
Forward: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz
Reverse: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz
Files 1 and 2 / 4
-o output folder: /Bio_data/processed_data/Ribo-seq/ltm_may/
-f input file: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz
-F input file 2: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz
-a adapter sequence: auto
-q quality filtering: default
-s steps to do: tr-co-ge
-r resume (r or new n): tr
-l minimum length of reads: 20
-T max mismatches of reads: 3
-g genome dir for all indices: /Bio_data/references/hs_assembly/STAR_index/
-m maxCPU: 28
-A alignment type: Local
-B allow_introns: TRUE
-t trim front (nt): 0
-k Keep Star Index loaded: y
-K Keep contamination reads: no
-u Keep unmapped genome reads: None
-P fastp location: /bin/fastp
-S STAR location: /bin/STAR-2.7.4a/bin/Linux_x86_64/STAR
...
Current step:
ge
#############################################
Paired end mode for files:
Forward: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz
Reverse: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz
Files 1 and 2 / 4
-o output folder: /Bio_data/processed_data/Ribo-seq/ltm_may/
-f input file: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz
-F input file 2: /Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz
-a adapter sequence: auto
-q quality filtering: default
-s steps to do: tr-co-ge
-r resume (r or new n): tr
-l minimum length of reads: 20
-T max mismatches of reads: 3
-g genome dir for all indices: /Bio_data/references/hs_assembly/STAR_index/
-m maxCPU: 28
-A alignment type: Local
-B allow_introns: TRUE
-t trim front (nt): 0
-k Keep Star Index loaded: y
-K Keep contamination reads: no
-u Keep unmapped genome reads: None
-P fastp location: /bin/fastp
-S STAR location: /bin/STAR-2.7.4a/bin/Linux_x86_64/STAR
After the "Arguments for folder run are the following:" block, the current per-step output for tr, co and ge could be omitted. Instead, it would be helpful to have step-specific log statements that show exactly what was executed, like this entry printed for fastp:
/bin/fastp --in1=/Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R1.fastq.gz --in2=/Bio_data/raw_data/Ribo-seq/ltm_may//x2-LTM-Plus-kit_S2_R2.fastq.gz --out1=/Bio_data/processed_data/Ribo-seq/ltm_may//trim/trimmed_x2-LTM-Plus-kit_S2_R1.fastq --out2=/Bio_data/processed_data/Ribo-seq/ltm_may//trim/trimmed2_x2-LTM-Plus-kit_S2_R1.fastq --json=/Bio_data/processed_data/Ribo-seq/ltm_may//trim/report_x2-LTM-Plus-kit_S2_R1.json --html=/Bio_data/processed_data/Ribo-seq/ltm_may//trim/report_x2-LTM-Plus-kit_S2_R1.html --trim_front1=0 --trim_front2=0 --length_required=20 --thread 16
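To illustrate the kind of step-specific logging I mean, here is a minimal shell sketch. It is not taken from ORFik's RNA_Align_pipeline.sh: the helper names `log_step` and `run_logged` and the contaminant output folder name are hypothetical, and only the trimmed file paths come from the fastp entry above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ORFik): print what one step reads and writes.
# Usage: log_step <step> <output folder> <input file> [<input file 2>]
log_step() {
    local step="$1" out_dir="$2" in1="$3" in2="${4:-}"
    echo "Current step: ${step}"
    echo "  input file:    ${in1}"
    # The second input line is printed only in paired end mode
    if [ -n "${in2}" ]; then
        echo "  input file 2:  ${in2}"
    fi
    echo "  output folder: ${out_dir}"
}

# Hypothetical helper: print the exact command line, then execute it
run_logged() {
    echo "Command: $*"
    "$@"
}

out="/Bio_data/processed_data/Ribo-seq/ltm_may/"
# Placeholder folder name, assumed for illustration only
co_dir="${out}/contaminant_step_output"

# For the co step, the inputs are the trimmed files written by fastp
log_step "co" "${co_dir}" \
    "${out}/trim/trimmed_x2-LTM-Plus-kit_S2_R1.fastq" \
    "${out}/trim/trimmed2_x2-LTM-Plus-kit_S2_R1.fastq"

# Stand-in command; the real pipeline would pass the actual STAR call here
run_logged echo "aligner call for step co"
```

With something like this, each step would report the files it actually consumed and the command it actually ran, instead of repeating the raw input paths for every step.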