How otb.sh works
otb.sh is the script that the user calls to run an instance of otb. The open secret is that otb.sh is just a wrapper script for run.nf, with some callouts to prefetch_containers and check_env (these are in the scr directory). Under the hood, otb.sh parses the user's command-line flags, checks the user's compute environment, pre-downloads all the required software containers, sets up busco, and then runs a Nextflow script (run.nf).
otb runs the check environment script, the prefetch containers script, the check containers script, and the Nextflow script the same way: it builds a command in a bash variable, appends flags to it, and then evaluates it. The following snippet from otb.sh shows how this is done:
prefetch_container="./scr/prefetch_containers.sh"
[ -n "$YAHS" ] && prefetch_container+=" -y"
[ -n "$BUSCO" ] && prefetch_container+=" -b"
[ -n "$POLISHTYPE" ] && prefetch_container+=" -p $POLISHTYPE"
[ -n "$NXF_SINGULARITY_CACHEDIR" ] || ( mkdir -p "./work/singularity"; prefetch_container+=" -l ./work/singularity" )
eval $prefetch_container
For example, when the $YAHS variable is set, " -y" is appended to the prefetch_container variable, and the final eval then runs the assembled command.
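The pattern can be tried in isolation. The following is a minimal sketch, not code from otb.sh: `echo` stands in for ./scr/prefetch_containers.sh so that it runs anywhere, and the names `demo_cmd` and `demo_out`, as well as the value `simple`, are placeholders chosen for illustration. It uses the portable `var="$var ..."` form, which has the same effect as the `+=` used in otb.sh.

```shell
# Sketch of the build-then-eval pattern; `echo` replaces the real script.
YAHS="true"          # as if -y / --yahs had been passed
BUSCO=""             # as if --busco had not been passed
POLISHTYPE="simple"  # placeholder value for --polish-type

demo_cmd="echo prefetch_containers.sh"
[ -n "$YAHS" ] && demo_cmd="$demo_cmd -y"
[ -n "$BUSCO" ] && demo_cmd="$demo_cmd -b"
[ -n "$POLISHTYPE" ] && demo_cmd="$demo_cmd -p $POLISHTYPE"

# Parentheses run in a subshell, so this append is discarded when it exits.
( demo_cmd="$demo_cmd -l lost" )
# Braces group commands in the current shell, so this append persists.
{ demo_cmd="$demo_cmd -l ./work/singularity"; }

# Evaluate the assembled string as a command and capture what it prints.
demo_out=$(eval "$demo_cmd")
echo "$demo_out"
# prints: prefetch_containers.sh -y -p simple -l ./work/singularity
```

Because `( ... )` runs in a subshell, a flag appended inside parentheses never reaches the evaluated command. This may be worth checking for the -l line in the snippet above.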
If a developer wanted to add an option to otb.sh, they'd add a line to the option-parsing while loop near the beginning of the script (remembering to also add it to the help() function at the beginning of otb.sh). That line would set a variable, which is then used to append a flag to the corresponding eval command.
For instance, if a developer wanted to add a --bar option, with a corresponding short form -b, to otb, and this option took an integer for the prefetch containers script, the following changes would be made:
In the documentation:
-r or --reverse
another fastq or fastq.gz file for the pipeline, order does not matter
-in or --reads
path to reads (generally from pacbio), may include a wildcard for multiple files, can be fastq or bam
suggested:
-m or --mode
mode to use, must be one of \"phasing\",\"homozygous\",\"heterozygous\",\"trio\", default: homozygous
-t or --threads
number of threads to use, clusters sometimes use this as number of cores, default: 20
-n or --name
a name for the assembly
-y or --yahs
run yahs as well
Would become:
-r or --reverse
another fastq or fastq.gz file for the pipeline, order does not matter
-in or --reads
path to reads (generally from pacbio), may include a wildcard for multiple files, can be fastq or bam
suggested:
-m or --mode
mode to use, must be one of \"phasing\",\"homozygous\",\"heterozygous\",\"trio\", default: homozygous
-t or --threads
number of threads to use, clusters sometimes use this as number of cores, default: 20
-b or --bar
the bar setting for prefetch_containers operation, default 20
-n or --name
a name for the assembly
-y or --yahs
run yahs as well
and then the getopts while loop would be modified:
while [ $# -gt 0 ] ; do
case $1 in
-h | --help) help ;;
-v | --version) version ;;
-s | --supress) SUPRESS="true";;
-c | --check) TEST="true";;
-f | --forward) R1="$2" ;;
-r | --reverse) R2="$2" ;;
-in | --reads) READS="$2" ;;
-m | --mode) MODE="$2";;
-t | --threads) THREADS="$2";;
-n | --name) NAME="$2";;
-y | --yahs) YAHS="true";;
--sge) RUNNER="sge";;
--slurm) RUNNER="slurm";;
--slurm-usda) RUNNER="slurm_usda";;
--slurm-atlas) RUNNER="slurm_atlas";;
--none) RUNNER="none";;
--busco) BUSCO="--busco ";;
--polish-type) POLISHTYPE="$2";;
--auto-lineage) LINEAGE="auto-lineage";;
--auto-lineage-prok) LINEAGE="auto-lineage-prok";;
--auto-lineage-euk) LINEAGE="auto-lineage-euk";;
-l | --lineage) LINEAGE="$2";;
-p | --busco-path) BUSCOPATH="$2";;
esac
shift
done
becoming:
while [ $# -gt 0 ] ; do
case $1 in
-h | --help) help ;;
-v | --version) version ;;
-s | --supress) SUPRESS="true";;
-c | --check) TEST="true";;
-f | --forward) R1="$2" ;;
-r | --reverse) R2="$2" ;;
-in | --reads) READS="$2" ;;
-m | --mode) MODE="$2";;
-t | --threads) THREADS="$2";;
-b | --bar) BAR="$2";;
-n | --name) NAME="$2";;
-y | --yahs) YAHS="true";;
--sge) RUNNER="sge";;
--slurm) RUNNER="slurm";;
--slurm-usda) RUNNER="slurm_usda";;
--slurm-atlas) RUNNER="slurm_atlas";;
--none) RUNNER="none";;
--busco) BUSCO="--busco ";;
--polish-type) POLISHTYPE="$2";;
--auto-lineage) LINEAGE="auto-lineage";;
--auto-lineage-prok) LINEAGE="auto-lineage-prok";;
--auto-lineage-euk) LINEAGE="auto-lineage-euk";;
-l | --lineage) LINEAGE="$2";;
-p | --busco-path) BUSCOPATH="$2";;
esac
shift
done
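To see how the new case line behaves, here is a trimmed, runnable sketch of the loop with only three of the options, wrapped in a hypothetical function `parse_otb_args`. The default handling and the integer check at the end are assumptions added for illustration (the help text above documents a default of 20 but otb.sh is not shown enforcing it).

```shell
# Hypothetical, trimmed version of the option loop for experimentation.
parse_otb_args() {
  THREADS="20"; BAR=""; YAHS=""
  while [ $# -gt 0 ] ; do
    case $1 in
      -t | --threads) THREADS="$2";;
      -b | --bar) BAR="$2";;
      -y | --yahs) YAHS="true";;
    esac
    # One shift per pass: an option's value becomes $1 on the next pass
    # and is skipped because it matches no case pattern.
    shift
  done
  # Assumed post-processing: apply the documented default, reject non-integers.
  BAR="${BAR:-20}"
  case "$BAR" in
    *[!0-9]*) echo "error: --bar expects an integer, got '$BAR'" >&2; return 1 ;;
  esac
}

parse_otb_args --bar 8 -y
echo "BAR=$BAR YAHS=$YAHS THREADS=$THREADS"
# prints: BAR=8 YAHS=true THREADS=20
```

Validating in otb.sh, before the value is appended to a command string, keeps a malformed value from ever reaching eval.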
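and finally, the BAR variable could either be processed further in otb.sh or passed directly to prefetch_containers.sh (the snippet reuses -b for illustration; because prefetch_containers.sh already takes -b for busco, a real option would need an unused letter):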
prefetch_container="./scr/prefetch_containers.sh"
[ -n "$YAHS" ] && prefetch_container+=" -y"
[ -n "$BUSCO" ] && prefetch_container+=" -b"
[ -n "$POLISHTYPE" ] && prefetch_container+=" -p $POLISHTYPE"
[ -n "$BAR" ] && prefetch_container+=" -b $BAR"
[ -n "$NXF_SINGULARITY_CACHEDIR" ] || ( mkdir -p "./work/singularity"; prefetch_container+=" -l ./work/singularity" )
eval $prefetch_container
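The receiving script also has to accept the new flag. How prefetch_containers.sh parses its arguments is not shown here, so the following is only a sketch under assumptions: it uses the shell's `getopts` builtin, a hypothetical function `parse_prefetch_flags`, and a hypothetical letter `-x` for the bar setting, chosen because -b is already used for busco.

```shell
# Hypothetical stand-in for option handling inside prefetch_containers.sh.
parse_prefetch_flags() {
  OPTIND=1   # reset so the function can be called more than once
  want_yahs=""; want_busco=""; polish=""; cache=""; bar=""
  # A colon after a letter means that option takes a value (in $OPTARG).
  while getopts "ybp:l:x:" opt; do
    case "$opt" in
      y) want_yahs="true" ;;
      b) want_busco="true" ;;
      p) polish="$OPTARG" ;;
      l) cache="$OPTARG" ;;
      x) bar="$OPTARG" ;;   # assumed letter for the new bar setting
      *) return 1 ;;
    esac
  done
}

parse_prefetch_flags -y -x 20 -l ./work/singularity
echo "yahs=$want_yahs busco=$want_busco bar=$bar cache=$cache"
```

With an option string like this, passing `-b 20` would set the busco switch and leave `20` unparsed, which is why a separate letter is needed.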
otb is in the public domain in the United States per 17 U.S.C. § 105