The web server, train and test sets of Brewery are available at http://distilldeep.ucd.ie/brewery/.
The docker container is available at https://hub.docker.com/r/mircare/brewery (HOWTO).
The predictions of the UniProtKB entries for COVID-19 are available at http://distilldeep.ucd.ie/brewery/.
See https://github.com/mircare/Porter5 to predict protein secondary structure only.
Diagram of the pipeline we propose to gather and exploit deeper profiles.
$ git clone https://github.com/mircare/Brewery/ --depth 1 && rm -rf Brewery/.git
- Python3 (https://www.python.org/downloads/);
- NumPy (https://www.scipy.org/scipylib/download.html);
- HHblits (https://github.com/soedinglab/hh-suite/);
- uniprot20 (http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/old-releases/uniprot20_2016_02.tgz).
- PSI-BLAST (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/);
- UniRef90 (ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz).
# To exploit HHblits only (for fast and accurate predictions)
$ python3 Brewery/Brewery.py -i Brewery/example/2FLGA.fasta --cpu 4 --fast
# To exploit both PSI-BLAST and HHblits (for very accurate predictions)
$ python3 Brewery/Brewery.py -i Brewery/example/2FLGA.fasta --cpu 4
# To split a FASTA file with multiple sequences (Optionally)
$ python3 Brewery/split_fasta.py many_sequences.fasta
# To predict all the fasta files in a given directory (Fastas)
$ python3 Brewery/multiple_fasta.py -i Fastas/ --cpu 4 --fast
# To run multiple predictions in parallel (using a total of 8 cores)
$ python3 Brewery/multiple_fasta.py -i Fastas/ --cpu 4 --parallel 2 --fast
$ python3 Brewery/Brewery.py --help
usage: Brewery.py [-h] [-input fasta_file] [--cpu CPU] [--fast] [--noSS]
[--noTA] [--noSA] [--noCD] [--distill] [--setup]
This is the standalone of Brewery5. Run it on a FASTA file to predict its
Secondary Structure in 3- and 8-classes (Porter5), Solvent Accessibility in 4
classes (PaleAle5), Torsional Angles in 14 classes (Porter+5) and Contact
Density in 4 classes (BrownAle).
optional arguments:
-h, --help show this help message and exit
-input fasta_file FASTA file containing the protein to predict
--cpu CPU How many cores to perform this prediction
--fast Use only HHblits (skip PSI-BLAST)
--bfd Harness also the BFD database (https://bfd.mmseqs.com/)
--noSS Skip Secondary Structure prediction with Porter5
--noTA Skip Torsional Angles prediction with Porter+5
--noSA Skip Solvent Accessibility prediction with PaleAle5
--noCD Skip Contact Density prediction with BrownAle5
--distill Generate useful outputs for 3D protein structure prediction
--setup Initialize Brewery5 from scratch (it is recommended when
there has been any change involving PSI-BLAST, HHblits,
Brewery itself, etc).
E.g., run Brewery on 4 cores: python3 Brewery5.py -i example/2FLGA --cpu 4
# Set the absolute PATHs for databases and query sequences (stored locally)
$ docker run --name brewery -v /**PATH_to_uniprot20_2016_02**:/uniprot20 \
-v /**PATH_to_UniRef90_optional**:/uniref90 -v /**PATH_to_fasta_to_predict**:/Brewery/query \
--cap-add IPC_LOCK mircare/brewery sleep infinity &
# Run a prediction using 5 cores and HHblits only
$ docker exec brewery python3 Brewery.py -i query/2FLGA.fasta --cpu 5 --fast
Method | Q3 per AA | SOV'99 per AA | Q3 per protein | SOV'99 per protein |
---|---|---|---|---|
Brewery | 83.81% | 80.41% | 84.32% | 81.05% |
SPIDER 3 | 83.15% | 79.43% | 83.42% | 79.79% |
Brewery HHblits only | 83.06% | 79.49% | 83.68% | 80.26% |
SSpro 5.1 with templates | 82.58% | 78.54% | 83.94% | 80.29% |
PSIPRED 4.01 | 81.88% | 77.36% | 82.48% | 78.22% |
RaptorX-Property | 81.86% | 78.08% | 82.57% | 78.99% |
Porter 4 | 81.66% | 78.05% | 82.29% | 78.61% |
SSpro 5.1 ab initio | 81.17% | 76.87% | 81.10% | 76.92% |
DeepCNF | 81.04% | 76.74% | 81.16% | 76.99% |
Reference: Table 1 in https://doi.org/10.1101/289033.
Method | Q8 per AA | SOV8_refine per AA | Q8 per protein | SOV8_refine per protein |
---|---|---|---|---|
Brewery | 73.02% | 72.09% | 73.92% | 72.64% |
SSpro 5.1 with templates | 71.91% | 70.72% | 74.46% | 73.45% |
Brewery HHblits only | 71.8% | 71.16% | 72.83% | 71.74% |
RaptorX-Property | 70.74% | 69.65% | 71.78% | 70.03% |
DeepCNF | 69.76% | 68.5% | 70.14% | 68.06% |
SSpro 5.1 ab initio | 68.85% | 67.54% | 69.27% | 67.91% |
Reference: Table 2 in https://doi.org/10.1101/289033.
Method | Q2 per AA | Q3 per AA | Q4 per AA |
---|---|---|---|
ACCpro 5.1 with templates | 80.5% | N.A. | N.A. |
Brewery | 80.48% | 66.41% | 56.46% |
PaleAle 4 | 78.21% | N.A. | 52.53% |
SPIDER 3 | 77.91% | 61.19% | 49.01% |
ACCpro 5.1 ab initio | 76.6% | N.A. | N.A. |
RaptorX-Property | N.A. | 63.25% | N.A. |
Method | Q14 per AA | Q14 per protein |
---|---|---|
Brewery | 69.93% | 70.59% |
SPIDER 3 | 66.58% | 66.27% |
Porter+ | 64.73% | 66% |
Method | Q4 per AA | Q4 per protein |
---|---|---|
Brewery | 50.01% | 48% |
BrownAle | 46.5% | N.A. |
If you use Brewery, please cite our Bioinformatics paper:
@article{torrisi_brewery_2020,
title = {Brewery: Deep Learning and deeper profiles for the prediction of 1D protein structure annotations},
doi = {10.1093/bioinformatics/btaa204},
journal = {Bioinformatics},
author = {Torrisi, Mirko and Pollastri, Gianluca}
}
Brewery: Deep Learning and deeper profiles for the prediction of 1D protein structure annotations,
Bioinformatics, Oxford University Press; Mirko Torrisi and Gianluca Pollastri;
Guest link: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa204/5811232?guestAccessKey=9a73ae2a-2cb6-4fe1-b333-a4f3261f02cf.
Protein Structure Annotations; Essentials of Bioinformatics, Volume I. Springer Nature;
Mirko Torrisi and Gianluca Pollastri; Post-print: https://www.researchgate.net/publication/332048741_Protein_Structure_Annotations.
Deeper Profiles and Cascaded Recurrent and Convolutional Neural Networks for state-of-the-art Protein Secondary Structure Prediction, Scientific Reports, Nature Publishing Group; Mirko Torrisi, Manaz Kaleel and Gianluca Pollastri;
doi: https://doi.org/10.1038/s41598-019-48786-x.
PaleAle 5.0: prediction of protein relative solvent accessibility by deep learning, Amino Acids, Springer
Manaz Kaleel, Mirko Torrisi, Catherine Mooney and Gianluca Pollastri; Guest Link: https://rdcu.be/bNlXS.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Email us at gianluca[dot]pollastri[at]ucd[dot]ie if you wish to use it for purposes not permitted by the CC BY-NC-SA 4.0.