TAFPred: Torsion Angle Fluctuations Prediction from Protein Sequences
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
The dataset can be found in the Dataset/FullDataset directory. The dataset is collected from [1].
We have tested TAFPred on Ubuntu 20.04. You need to install the following software before replicating this framework on your local machine or server.
- pyenv (latest version)
curl https://pyenv.run | bash
exec $SHELL
For more details, visit: https://github.com/pyenv/pyenv-installer
- Python version 3.9.5
pyenv install miniconda3-3.9-4.10.3
pyenv local miniconda3-3.9-4.10.3
For more details, visit: https://github.com/pyenv/pyenv
Alternatively, Python version 3.9.5 can be installed using Anaconda.
conda create -n py39 python=3.9.5
conda activate py39
- Poetry version 1.3.2
curl -sSL https://install.python-poetry.org | python3 - --version 1.3.2
For more details, visit: https://python-poetry.org/docs/
- Docker
Feature extraction depends on many external tools, so we created a Docker image that extracts the features without any additional setup. Extracting all the features might take quite some time.
- Protein Databases
The tool depends on the nr and uniclust30_2017_04 databases. The databases should be placed in the following directory structure. Alternatively, the database path can be passed as a parameter.
project
│─── README.md
│
└─── script
     └─── Databases
          └─── nr
          └─── uniclust30_2017_04
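Before starting a long feature-extraction run, it can help to confirm that the database layout matches the structure above. The following is a minimal sketch, not part of TAFPred: `missing_databases` and `REQUIRED_DATABASES` are hypothetical names, and the check assumes each database appears as an entry named exactly `nr` or `uniclust30_2017_04` directly under the database directory.

```python
from pathlib import Path

# Database names as listed in this README. That each one appears as an entry
# with exactly this name under the database directory is an assumption.
REQUIRED_DATABASES = ("nr", "uniclust30_2017_04")


def missing_databases(database_root):
    """Return the required database names that are not found under database_root."""
    root = Path(database_root)
    return [name for name in REQUIRED_DATABASES if not (root / name).exists()]


if __name__ == "__main__":
    # Default location taken from the directory structure above,
    # relative to the project root.
    root = "script/Databases"
    missing = missing_databases(root)
    if missing:
        print(f"Missing under {root}: {', '.join(missing)}")
    else:
        print(f"All required databases found under {root}")
```

If the databases live elsewhere, pass that directory to `missing_databases` instead of the default location.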
- Retrieve the code
git clone https://github.com/wasicse/TAFPred.git
To run the program, first install all required libraries by running the following command:
poetry install
Then run the following commands to execute TAFPred from the script directory on the example dataset. To get predictions for new protein sequences, change the input in the Dataset/example directory, and replace DATABASE_PATH with the absolute path of the databases, e.g., "/home/wasi/TAFPred/script/Databases/".
cd script
poetry run python run_tafpred.py -f "taffeatures" -o "./output/" -d "DATABASE_PATH"
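Because DATABASE_PATH must be an absolute path, one way to avoid typing it by hand is to resolve it programmatically. The sketch below is an illustration only: `build_tafpred_command` is a hypothetical helper, not part of TAFPred, and it simply reassembles the command shown above using the `-f`, `-o`, and `-d` options. It keeps a trailing path separator because the example path in this README ends with one; whether the tool requires that is an assumption.

```python
import os
import shlex


def build_tafpred_command(database_path, features="taffeatures", output_dir="./output/"):
    """Build the run command from this README with an absolute database path."""
    absolute_db = os.path.abspath(os.path.expanduser(database_path))
    # Keep a trailing separator, matching the example path in this README.
    absolute_db = os.path.join(absolute_db, "")
    return [
        "poetry", "run", "python", "run_tafpred.py",
        "-f", features,
        "-o", output_dir,
        "-d", absolute_db,
    ]


if __name__ == "__main__":
    # "./Databases" is relative to the script directory, where the command is run.
    command = build_tafpred_command("./Databases")
    # Print a copy-pasteable command line instead of executing it.
    print(shlex.join(command))
```

The printed command can be pasted into a shell opened in the script directory.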
- Finally, check the output folder for results. The output directory contains the predicted labels with probabilities for each residue.
Md Wasi Ul Kabir, Duaa Mohammad Alawad, Avdesh Mishra, and Md Tamjidul Hoque. For any issue please contact: Md Tamjidul Hoque, thoque@uno.edu
[1] Zhang, Tuo, Eshel Faraggi, and Yaoqi Zhou. "Fluctuations of backbone torsion angles obtained from NMR-determined structures and their prediction." Proteins: Structure, Function, and Bioinformatics 78, no. 16 (December 2010): 3353–62. https://doi.org/10.1002/prot.22842.