Check out the latest version here.
- Clone the repository
- Create a conda environment
conda create --name precog python=2.7 scipy numpy pandas scikit-learn=0.19.0
- Activate the environment
conda activate precog
- Install the HMMER package
Tip: Use version v3.1b2 for output comparable to that of the webserver/original paper
- See help to know the command-line parameters
./precog.py --help
Note: This version does not include structure-based predictions (using InterPreTS). It has been tested on Ubuntu-based environments.
- The input file must be FASTA formatted. For example:
>sp|P30518|V2R_HUMAN Vasopressin V2 receptor OS=Homo sapiens OX=9606 GN=AVPR2 PE=1 SV=1
MLMASTTSAVPGHPSLPSLPSNSSQERPLDTRDPLLARAELALLSIVFVAVALSNGLVLA
ALARRGRRGHWAPIHVFIGHLCLADLAVALFQVLPQLAWKATDRFRGPDALCRAVKYLQM
VGMYASSYMILAMTLDRHRAICRPMLAYRHGSGAHWNRPVLVAWAFSLLLSLPQLFIFAQ
RNVEGGSGVTDCWACFAEPWGRRTYVTWIALMVFVAPTLGIAACQVLIFREIHASLVPGP
SERPGGRRRGRRTGSPGEGAHVSAAVAKTVRMTLVIVVVYVLCWAPFFLVQLWAAWDPEA
PLEGAPFVLLMLLASLNSCTNPWIYASFSSSVSSELRSLLCCARGRTPPSLGPQDESCTT
ASSSLAKDTSS
>Q14330
MITRNNQDQPVPFNSSHPDEYKIAALVFYSCIFIIGLFVNITALWVFSCTTKKRTTVTIY
MMNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFCQILGALTVFYPSIALWLLAFISADRY
MAIVQPKYAKELKNTCKAVLACVGVWIMTLTTTTPLLLLYKDPDKDSTPATCLKISDIIY
LKAVNVLNLTRLTFFFLIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRIIITLLVQ
VLVCFMPFHICFAFLMLGTGENSYNPWGAFTTFLMNLSTCLDVILYYIVSKQFQARVISV
MLYRNYLRSMRRKSFRSGSLRSLSNINSEML
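Before running PRECOG it can help to confirm that an input file really is FASTA formatted. The following is a minimal sketch of such a check, not PRECOG's own parser; the function name `read_fasta` is hypothetical and only illustrates the header/sequence layout shown above.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs.

    Hypothetical helper for checking input files; it is not part of PRECOG.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# First two sequence lines of the Q14330 example, used only as test input.
example = """>Q14330
MITRNNQDQPVPFNSSHPDEYKIAALVFYSCIFIIGLFVNITALWVFSCTTKKRTTVTIY
MMNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFCQILGALTVFYPSIALWLLAFISADRY
"""
records = read_fasta(example)
for rec_name, rec_seq in records:
    print("{0}: {1} residues".format(rec_name, len(rec_seq)))
```

A record with an empty sequence is returned as-is, which is what a mutation-only entry would look like.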
- Mutations can be given in the following format:
>Q14330/Y60R
MITRNNQDQPVPFNSSHPDEYKIAALVFYSCIFIIGLFVNITALWVFSCTTKKRTTVTIR
MMNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFCQILGALTVFYPSIALWLLAFISADRY
MAIVQPKYAKELKNTCKAVLACVGVWIMTLTTTTPLLLLYKDPDKDSTPATCLKISDIIY
LKAVNVLNLTRLTFFFLIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRIIITLLVQ
VLVCFMPFHICFAFLMLGTGENSYNPWGAFTTFLMNLSTCLDVILYYIVSKQFQARVISV
MLYRNYLRSMRRKSFRSGSLRSLSNINSEML
- However, the mutated sequence of a GPCR need not be provided again if its wild-type sequence has already been given earlier in the file. In that case the header line alone is sufficient:
>Q14330
MITRNNQDQPVPFNSSHPDEYKIAALVFYSCIFIIGLFVNITALWVFSCTTKKRTTVTIY
MMNVALVDLIFIMTLPFRMFYYAKDEWPFGEYFCQILGALTVFYPSIALWLLAFISADRY
MAIVQPKYAKELKNTCKAVLACVGVWIMTLTTTTPLLLLYKDPDKDSTPATCLKISDIIY
LKAVNVLNLTRLTFFFLIPLFIMIGCYLVIIHNLLHGRTSKLKPKVKEKSIRIIITLLVQ
VLVCFMPFHICFAFLMLGTGENSYNPWGAFTTFLMNLSTCLDVILYYIVSKQFQARVISV
MLYRNYLRSMRRKSFRSGSLRSLSNINSEML
>Q14330/Y60R
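To illustrate how a label such as Y60R relates to the sequence, here is a minimal sketch that applies a substitution to a wild-type sequence. It assumes the label reads as wild-type residue, 1-based position, new residue, which matches the example above (Y at position 60 becomes R). The function name `apply_mutation` is hypothetical and not part of PRECOG.

```python
import re


def apply_mutation(wild_type, mutation):
    """Return wild_type with a single substitution such as 'Y60R' applied.

    Assumed label format: <wild-type residue><1-based position><new residue>.
    """
    match = re.match(r"^([A-Z])(\d+)([A-Z])$", mutation)
    if match is None:
        raise ValueError("unrecognised mutation label: {0}".format(mutation))
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if pos < 1 or pos > len(wild_type):
        raise ValueError("position {0} is outside the sequence".format(pos))
    if wild_type[pos - 1] != ref:
        raise ValueError(
            "expected {0} at position {1}, found {2}".format(
                ref, pos, wild_type[pos - 1]
            )
        )
    # Strings are immutable, so rebuild the sequence around the changed residue.
    return wild_type[: pos - 1] + alt + wild_type[pos:]


# First 60 residues of the Q14330 example.
first_line = "MITRNNQDQPVPFNSSHPDEYKIAALVFYSCIFIIGLFVNITALWVFSCTTKKRTTVTIY"
mutated = apply_mutation(first_line, "Y60R")
print(mutated[-5:])
```

Checking the wild-type residue before substituting catches off-by-one errors in the position, which are easy to make when a sequence is numbered from a different start.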
The output file contains the following headers:
- GPCR/MUT: Name of the input. Must be alphanumeric.
- GNAI3 - GNAL: Coupling values predicted by PRECOG (probabilities), known from IUPHAR (PC: Primary Coupling, SC: Secondary Coupling), or measured in the shedding assay experiment (LogRAi >= -1.0 is coupled, otherwise uncoupled). If the input sequence cannot be searched against a G-protein model, or is unavailable in IUPHAR or in the shedding assay data, the value is shown as -.
- 7TM1_POS/BW/ALN_POS: Denotes the Pfam 7tm1 (7TM1_POS) position, Ballesteros-Weinstein numbering (BW) or alignment position (ALN_POS) of the positions affected in the given input GPCR.
- Mutation_Info: Information about the input mutation. Not applicable for wild type input.
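The shedding assay rule stated above (LogRAi >= -1.0 is coupled, otherwise uncoupled) can be written as a small function. This is only a sketch of that threshold rule; the name `shedding_call` is hypothetical, and how LogRAi values are laid out in the output file is not specified here.

```python
def shedding_call(log_rai, threshold=-1.0):
    """Classify a shedding assay LogRAi value using the rule from the text.

    Values at or above the threshold count as coupled.
    """
    return "coupled" if log_rai >= threshold else "uncoupled"


# Illustrative values only, not measurements from the paper.
calls = [shedding_call(value) for value in (-0.2, -1.0, -1.7)]
print(calls)
```

Note that the boundary value -1.0 itself counts as coupled, since the rule uses >= rather than >.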
Singh G, Inoue A, Gutkind JS, Russell RB, Raimondi F. PRECOG: PREdicting COupling probabilities of G-protein coupled receptors. Nucleic Acids Res. 2019 Jul 2;47(W1):W395-W401. doi: 10.1093/nar/gkz392. PMID: 31143927; PMCID: PMC6602504.
gurdeep[at]bioquant[.]uni-heidelberg[.]de