homostack

The script homostack visualizes alignments among multiple homologous sequences.

required data

multiple query sequences
BED files with highlight regions (optional) [format, 7 columns separated by Tab]:
1. chr
2. start(0-based)
3. end(1-based)
4. label
5. height(e.g., 0.1)
6. strand(+/-)
7. color(R compatible)
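
As an illustration of the seven-column layout above, the sketch below writes a small highlight file and checks its shape before it is passed to homostack. The sequence name, coordinates, labels, and the `check_bed` helper are made up for this example and are not part of homostack; the start-smaller-than-end check is an assumption based on the 0-based start and 1-based end convention.

```shell
# Write a two-row, tab-separated highlight file (all values are illustrative).
bed=$(mktemp)
printf 'seq1\t100\t250\texon1\t0.1\t+\tsteelblue\n' >  "$bed"
printf 'seq1\t400\t620\texon2\t0.1\t+\tsteelblue\n' >> "$bed"

# check_bed: hypothetical helper that reports rows not matching the 7-column layout.
# It returns a non-zero status if any row fails a check.
check_bed() {
    awk -F'\t' '
        NF != 7 { printf "line %d: expected 7 columns, found %d\n", NR, NF; bad = 1; next }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { printf "line %d: start and end must be integers\n", NR; bad = 1; next }
        $2 + 0 >= $3 + 0 { printf "line %d: start must be smaller than end\n", NR; bad = 1; next }
        $6 != "+" && $6 != "-" { printf "line %d: strand must be + or -\n", NR; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

check_bed "$bed" && echo "highlight file looks well-formed"
```

Checking the column count first catches the most common problem, spaces used in place of tabs, before the file reaches the plotting step.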

example

running script

Parameter --seq takes a fasta sequence as input and parameter --annot takes the BED highlight file for that sequence. The two are used as a pair: once --annot is given for one sequence, every other sequence must be paired with its own --annot input, in the same order. If a sequence has no BED file, specify "--annot none".

seq01=/wikiexample/1_data/mads69/B73/MAD69/MAD69.3.Zm00001eb143080.fasta
bed01=/wikiexample/1_data/mads69/B73/MAD69/MAD69.2.transcripts/Zm00001eb143080_T001.adjusted.bed
seq02=/wikiexample/1_data/mads69/B97/MAD69/MAD69.3.Zm00018ab147610.fasta
bed02=/wikiexample/1_data/mads69/B97/MAD69/MAD69.2.transcripts/Zm00018ab147610_T001.adjusted.bed
seq03=/wikiexample/1_data/mads69/Ms71/MAD69/MAD69.3.Zm00035ab147480.fasta
bed03=/wikiexample/1_data/mads69/Ms71/MAD69/MAD69.2.transcripts/Zm00035ab147480_T001.adjusted.bed

perl ../../homostack \
    --seq $seq01 --annot $bed01 --plotname Zm00001eb143080 \
    --seq $seq02 --annot $bed02 --plotname Zm00018ab147610 \
    --seq $seq03 --annot $bed03 --plotname Zm00035ab147480
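
When one of the sequences has no highlight file, the pairing can be kept intact with "none". The sketch below assembles such a command and counts the two options before anything is run. The file names are placeholders, and the command is printed rather than executed so that the sketch works without homostack or the input files in place.

```shell
# Placeholder file names, not files shipped with homostack.
seq01=B73.fasta;  annot01=B73.bed
seq02=B97.fasta;  annot02=none      # no highlight regions for this sequence
seq03=Ms71.fasta; annot03=Ms71.bed

# Collect the options in the positional parameters so that
# each --seq is followed by its own --annot.
set --
set -- "$@" --seq "$seq01" --annot "$annot01"
set -- "$@" --seq "$seq02" --annot "$annot02"
set -- "$@" --seq "$seq03" --annot "$annot03"

# Count both options; the usage text asks for equal numbers
# unless --annotskip is specified.
nseq=0
nannot=0
for arg in "$@"; do
    case $arg in
        --seq)   nseq=$((nseq + 1)) ;;
        --annot) nannot=$((nannot + 1)) ;;
    esac
done

if [ "$nseq" -eq "$nannot" ]; then
    # Print the command instead of running it.
    echo "perl homostack $*"
else
    echo "mismatch: $nseq --seq inputs but $nannot --annot inputs" >&2
fi
```

Replacing the `echo` with the actual call, for example `perl ../../homostack "$@"`, would run the assembled command.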

output plot

full usage

Usage: perl ../homostack --seq <fasta> --annot <annot_file> [options]
    [Options]
    --seq <file>     fasta file containing a sequence as the query; required
                     multiple sequences are needed by using --seq multiple times
    --annotskip      skip annotation if specified; NO skipping by default
    --annot <file>   bed file to highlight regions in query; if --annotskip is specified, --annot will be ignored; required otherwise
                     [format]: 7 columns separated by Tab
                               chr start(0-based) end(1-based) label height(e.g., 0.1) strand(+/-) color(R compatible)
                     [NOTE 1]: if no --annotskip, the number of --annot inputs needs to match the number of --seq inputs;
                               they will be paired by their order, i.e., 1st --seq paired with 1st --annot;
                               if some --annot has no data, input "none".
                     [NOTE 2]: "height" is the ratio of height of highlighted bars to height of each alignment unit
                               a highlighted bar fills the specified region if the height equals the --seqheight value
    --plotname       sequence names to be used in the plot; multiple sequences allowed; if specified, equal number of inputs should be used as --seq inputs 
    --alnskip        skip alignments if specified; NO skipping by default
    --identity <int> minimal percentage of identity from 0 to 100 (80)
    --match <int>    minimal bp match of an alignment (100)
    --prefix <str>   the output directory and the prefix for output files (hsout)
    --title <str>    the title of the plot (ALNStack)
    --minident <int> lowest identity for plotting color scaling, 0-100 or auto (auto)
    --maxident <int> highest identity for plotting color scaling, 0-100 or auto (auto)
    --threads <int>  number of cpus (1)
    --seqheight <float> ratio of height of a sequence to height of each alignment unit (0.1)
    --bandcol <str>  a valid R color name (bisque3)
    --cleanup        clean up outputs if specified; NO cleanup by default
    --version        version information
    --help           help information.

Output from homostack

Plot of sequential alignments of multiple sequences: <prefix>.3.alnstack.pdf
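
A short sketch of how the plot might be located after a run with a custom prefix. The prefix, title, and filter values are illustrative. The option names come from the usage text, and the --seq and --annot inputs are left out for brevity. The search below the current directory is deliberate, since it makes no assumption about where homostack writes the file relative to the prefix.

```shell
prefix=mads69   # hypothetical value for --prefix; the documented default is hsout

# Options from the usage text: keep alignments with at least 90% identity
# and at least 200 matching bp, and run on 4 CPUs.
set -- --identity 90 --match 200 --threads 4 --prefix "$prefix" --title "MADS69 haplotypes"
echo "options: $*"

# The plot name follows the documented pattern <prefix>.3.alnstack.pdf.
plot="${prefix}.3.alnstack.pdf"

# Search below the current directory rather than assuming the output location.
find . -name "$plot" -print
```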
