For local installation, a command-line version of ASAFind can be downloaded from our GitHub repository:
https://github.com/ASAFind/ASAFind-2
On GitHub, you can either download the files as a zip archive or clone the repository using the provided URL. After downloading the latest version, follow these installation steps:
- Change into the directory where you want to make the installation, e.g.
cd /home/marta/asafind
- Make a clone of the GitHub repository:
git clone https://github.com/ASAFind/ASAFind-2.git
- Run the following commands from the command line:
python3 -m venv asafind_line_command
. asafind_line_command/bin/activate
pip install --upgrade pip
cd ASAFind-2/
pip install -r requirements.txt
- Now you are in the virtual environment named asafind_line_command. Here you can ask for help, e.g.
(asafind_line_command) marta@marta-mini:~/asafind/ASAFind-2$ python3 S0_ASAFind.py --help
The installation commands above create the environment called asafind_line_command, including the subdirectories temp and output, activate it, and install all required packages.
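To confirm that the environment is active before running ASAFind, a quick check like the following can help. It is only a sketch: it relies on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment, and the function names `in_virtualenv` and `python_is_supported` are illustrative and not part of ASAFind.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment and
    # sys.base_prefix at the Python installation it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def python_is_supported(version=None, minimum=(3, 10)) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python version supported:", python_is_supported())
```

Run it with the same interpreter you intend to use for ASAFind, so that the check reflects the activated environment.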
The program S1_ASAFind_v3.py is located in the environment root directory.
It takes a FASTA file and a companion TargetP v.2.0 short-format file as input; the TargetP file must include the complete TargetP header
(two lines starting with '#'). Some versions of SignalP truncate the sequence names: SignalP-3.0 to 20 characters,
and SignalP 4.0 and 4.1 to 58 characters. Therefore,
ASAFind only considers the corresponding first characters of the FASTA name (the first 90 in the
case of TargetP 2.0), which must be unique within the file. Any part of the FASTA name
beyond that length is ignored. Additionally, the FASTA name may not contain a '-' or '|',
because SignalP converts characters in sequence names (e.g. '-' is changed to '_').
ASAFind requires at least 7 aa upstream and 22 aa downstream of the cleavage site suggested by
SignalP. The output of this script is a tab-delimited table.
Python >= 3.10 is required.
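The input requirements above can be checked before a run. The following is a minimal sketch, not ASAFind's own validation code. It makes several assumptions: the whole header line after '>' is treated as the FASTA name, the cleavage position is taken as the number of residues preceding the cleavage site, and all function and constant names are hypothetical.

```python
from collections import Counter

# Name lengths taken from the text; the keys are labels chosen for this
# sketch and are not ASAFind identifiers.
NAME_LENGTHS = {"signalp3": 20, "signalp4": 58, "targetp2": 90}
FORBIDDEN_CHARS = ("-", "|")


def read_fasta_names(lines):
    """Yield the name from each FASTA header line (text after '>')."""
    for line in lines:
        if line.startswith(">"):
            yield line[1:].strip()


def check_fasta_names(names, name_length=NAME_LENGTHS["targetp2"]):
    """Return a list of human-readable problems found in the FASTA names."""
    names = list(names)
    problems = []
    # Rule 1: names may not contain '-' or '|'.
    for name in names:
        for char in FORBIDDEN_CHARS:
            if char in name:
                problems.append(f"{name!r} contains forbidden character {char!r}")
    # Rule 2: the first `name_length` characters must be unique in the file.
    counts = Counter(name[:name_length] for name in names)
    for prefix, count in counts.items():
        if count > 1:
            problems.append(
                f"{count} names share the first {name_length} characters: {prefix!r}"
            )
    return problems


def has_enough_flanking(sequence_length, cleavage_position, upstream=7, downstream=22):
    """Check the 7 aa upstream / 22 aa downstream requirement.

    Assumption: `cleavage_position` is the number of residues before the
    cleavage site, so the remainder of the sequence lies downstream.
    """
    residues_upstream = cleavage_position
    residues_downstream = sequence_length - cleavage_position
    return residues_upstream >= upstream and residues_downstream >= downstream
```

For example, `check_fasta_names(read_fasta_names(open("proteins.fsa")))` would list offending names for a hypothetical file `proteins.fsa`; an empty list means no problems were found by this sketch.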
python S0_ASAFind.py --help
usage: S1_ASAFind_v3.py
usage: S0_ASAFind.py [-h] -f FASTA_FILE -p SIGNALP_FILE [-s SIMPLE_SCORE_CUTOFF] [-t FASTA_FILE_WITH_MOTIFS] [-w] [-v1] [-ppc] [-s_ppc SCORE_CUTOFF_PPC] [-t_ppc FASTA_FILE_WITH_MOTIFS_PPC] [-l] [-my_org MY_ORGANISM] [-v]

| Option | Description |
|---|---|
| -h, --help | show this help message and exit |
| -f FASTA_FILE, --fasta_file FASTA_FILE | Specify the input fasta FILE. |
| -p SIGNALP_FILE, --signalp_file SIGNALP_FILE | Specify the input TargetP FILE. |
| -s SIMPLE_SCORE_CUTOFF, --simple_score_cutoff SIMPLE_SCORE_CUTOFF | Optionally, specify an explicit score cutoff, rather than using ASAFind's default algorithm; not compatible with option -v1. The score given here will not be normalized and therefore should be obtained from a distribution of normalized scores. |
| -t FASTA_FILE_WITH_MOTIFS, --fasta_file_with_motifs FASTA_FILE_WITH_MOTIFS | Optionally, specify a custom scoring table. The scoring table will be normalized with the maximum score, which allows for processing of non-normalized as well as normalized scoring tables. |
| -w, --web_output | Format output for web display. This is mostly useful when called by a web app. |
| -v1, --reproduce_ASAFind_1 | Reproduce ASAFind 1.x scores and results (non-normalized scores; if no custom scoring table is specified, the original default scoring table generated without small sample size correction will be used; not compatible with option -s). |
| -ppc, --include_ppc_prediction | Include prediction of proteins that might be targeted to the periplastidic compartment. |
| -t SCORE_TABLE_FILE, --score_table_file SCORE_TABLE_FILE | Optionally, specify a custom scoring table. The scoring table will be normalized with the maximum score, which allows for processing of non-normalized as well as normalized scoring tables. |
| -o OUT_FILE, --out_file OUT_FILE | Specify the path and name of the output file you wish to create. Default will be the same as the fasta_file, but with a ".tab" suffix. |
| -s_ppc SCORE_CUTOFF_PPC, --score_cutoff_ppc SCORE_CUTOFF_PPC | Optionally, specify an explicit score cutoff for the ppc protein prediction; if given, ppc protein prediction will be included. The score given here will not be normalized and therefore should be obtained from a distribution of normalized scores. |
| -t_ppc SCORE_TABLE_FILE_PPC, --score_table_file_ppc SCORE_TABLE_FILE_PPC | Optionally, specify a custom scoring table for the ppc protein prediction; if given, ppc protein prediction will be included. The scoring table will be normalized with the maximum score, which allows for processing of non-normalized as well as normalized scoring tables. |
| -l {yes,no}, --logomaker {yes,no} | Choose "yes" or "no" (default: no). From version 3.0, this option controls whether the program also generates the logomaker pictures in .png and .svg formats. They will be included in the compressed output package. |
| -my_org MY_ORGANISM, --my_organism MY_ORGANISM | Specify the name of the organism. |
| -v, --version | show program's version number and exit |
python S0_ASAFind.py -f /data/haptophyta/temp/haptophyta_fasta_for_targetp2.fsa -p /data/haptophyta/temp/haptophyta_fasta_for_targetp2.targetp2 -my_org haptophyta -l
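When ASAFind is run for several datasets, the call can be assembled from Python instead of being typed by hand. The sketch below uses only the -f, -p and -my_org options listed in the help text; the helper names `build_asafind_command` and `run_asafind` are hypothetical, and the script is assumed to be started from the ASAFind-2 directory inside the activated environment.

```python
import subprocess
import sys


def build_asafind_command(fasta_file, targetp_file, organism=None, script="S0_ASAFind.py"):
    """Assemble the argument list for one ASAFind run."""
    # sys.executable is the running interpreter, so the call stays inside
    # the virtual environment this script was started from.
    command = [sys.executable, script, "-f", str(fasta_file), "-p", str(targetp_file)]
    if organism:
        command += ["-my_org", organism]
    return command


def run_asafind(fasta_file, targetp_file, organism=None):
    """Run ASAFind; raises CalledProcessError on a non-zero exit status."""
    command = build_asafind_command(fasta_file, targetp_file, organism)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.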