Protty uses profile Hidden Markov Models (HMMs) built from the MEROPS database to predict putative proteases among query protein sequences.
git clone https://github.com/ArtemF42/protty.git
cd protty
pip install -e .
# Not available at the moment
pip install protty
Note: protty-build uses Clustal Omega to perform multiple sequence alignment and requires it to be installed. By default, Protty assumes clustalo is in your PATH. If this is not the case, you should specify the --clustalo parameter.
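Before running protty-build, it can help to confirm that clustalo really is discoverable on PATH. The snippet below is a minimal sketch using only the Python standard library; the helper name resolve_executable is hypothetical and is not part of Protty.

```python
import shutil


def resolve_executable(name="clustalo"):
    """Return the full path to `name` if it is found on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = resolve_executable()
    if path is None:
        # Not on PATH: its location has to be supplied via --clustalo.
        print("clustalo not found on PATH; specify its location with --clustalo")
    else:
        print(f"clustalo found at {path}")
```

If the check reports that clustalo is missing, either add its directory to PATH or point protty-build at the executable with the --clustalo parameter.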
protty-build [options] $DATABASE
Once the process is complete, the profile HMMs will be available in $DATABASE/profiles. The easiest way to merge them into the database is to use cat:
cat $DATABASE/profiles/*.hmm > $DATABASE/merops.hmm
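Where cat is not available (for example, on Windows without a Unix shell), the same concatenation can be done in Python. This is a sketch rather than part of Protty: the function name merge_profiles is hypothetical, and it assumes the merged file is written outside the profiles directory, as in the cat command above.

```python
from pathlib import Path


def merge_profiles(profiles_dir, output_path):
    """Concatenate every *.hmm file in profiles_dir into output_path.

    Files are appended in sorted name order. Returns the number of files merged.
    """
    hmm_files = sorted(Path(profiles_dir).glob("*.hmm"))
    with open(output_path, "wb") as merged:
        for hmm_file in hmm_files:
            # Copy the bytes unchanged, exactly as cat would.
            merged.write(hmm_file.read_bytes())
    return len(hmm_files)
```

For a database directory stored in a variable database, the equivalent of the cat command would be merge_profiles(f"{database}/profiles", f"{database}/merops.hmm").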
The protty-build pipeline consists of 3 major steps:
- Downloading MEROPS data
- Filtering raw FASTA files
- Building profile HMMs
Use the --skip option if you want to control the pipeline manually. For example, the command below skips steps 2 and 3, so it only downloads data from the MEROPS server:
protty-build --skip 2,3 $DATABASE
protty-scan [options] $DATABASE/merops.hmm proteins.faa
By default, protty-scan generates two files named predicted_proteases.tsv and predicted_proteases.faa, located in the working directory. Use the --tsv and --faa options to change the default behavior.
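As a quick sanity check after a scan, the predicted sequences can be listed with a few lines of Python. This sketch assumes that predicted_proteases.faa is an ordinary FASTA file, which the extension suggests but this README does not state; the helper read_fasta_ids is hypothetical. The columns of the TSV file are not described here, so the example does not attempt to parse it.

```python
def read_fasta_ids(path):
    """Return the identifier (first word after '>') of each FASTA record in path."""
    ids = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                # Keep only the identifier, dropping any free-text description.
                ids.append(header.split()[0] if header else "")
    return ids


if __name__ == "__main__":
    import os

    default_output = "predicted_proteases.faa"
    if os.path.exists(default_output):
        ids = read_fasta_ids(default_output)
        print(f"{len(ids)} predicted protease sequence(s)")
```

Run from the directory where protty-scan wrote its output, this prints how many sequences were reported as putative proteases.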