Opt_PredLLPS

About Opt_PredLLPS

Opt_PredLLPS is a two-task predictor that identifies potential phase separation (PS) proteins and then predicts their mechanism. The first task model combines a CNN and a BiLSTM through a fully connected layer: the CNN takes evolutionary information features as input, and the BiLSTM takes multimodal features as input. If a protein is predicted to be a PS protein, it is passed to the second task model, which predicts whether the protein interacts with partners to undergo PS. The second task model is based on the XGBoost classification algorithm and uses the 37 physicochemical properties retained after a 3-step feature selection.

The datasets can be found in ./datasets/. The Opt_PredLLPS models are available in ./model/. The prediction code is in Opt_PredLLPS.py, Opt_PredLLPS_Self.py and Opt_PredLLPS_Part.py.

Tools

The PSSM features are obtained from POSSUM (https://possum.erc.monash.edu). Make sure that the FASTA file submitted to POSSUM is the same FASTA file used as input to Opt_PredLLPS. Submissions must meet the following requirements:

1. Each submitted sequence must be at least 50 and at most 5000 residues long.
2. No more than 500 sequences may be submitted at once.
3. Sequences may contain only the 20 standard amino acids: 'A', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'V', 'W', 'Y'.
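
The three constraints above can be checked locally before uploading a file. The following is a minimal sketch, not part of the repository: the helper names `parse_fasta` and `check_sequences` are hypothetical, and "within 500" is assumed to mean at most 500 sequences.

```python
from typing import Dict, List

# The 20 amino acid letters listed above.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LENGTH, MAX_LENGTH = 50, 5000
MAX_SEQUENCES = 500  # assumption: "within 500" means at most 500


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence} (hypothetical helper)."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def check_sequences(records: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every rule passes."""
    problems = []
    if len(records) > MAX_SEQUENCES:
        problems.append(
            f"{len(records)} sequences exceed the limit of {MAX_SEQUENCES}")
    for header, seq in records.items():
        if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
            problems.append(
                f"{header}: length {len(seq)} is outside "
                f"{MIN_LENGTH}-{MAX_LENGTH}")
        unknown = sorted(set(seq) - STANDARD_AMINO_ACIDS)
        if unknown:
            problems.append(f"{header}: non-standard residues {unknown}")
    return problems
```

Calling `check_sequences(parse_fasta(open(path).read()))` on the input file and fixing anything it reports may save a rejected submission.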

HMM features require a multiple sequence alignment tool and a database. The code for generating HMM features is located in ./utils/hhblits_search.

1. hhblits: an efficient protein sequence alignment tool that quickly searches for homologous sequences in large databases.
2. uniclust30_2018_08: the database can be downloaded from https://wwwuser.gwdg.de/~compbiol/uniclust/2018_08/uniclust30_2018_08_hhsuite.tar.gz .
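
For orientation, an hhblits call that writes an HMM profile for one query can be assembled as below. This is a sketch under stated assumptions, not the code in ./utils/hhblits_search: the function name, the output naming scheme, and the default iteration and CPU counts are illustrative choices. The options `-i`, `-d`, `-ohhm`, `-n` and `-cpu` are standard hhblits options.

```python
from pathlib import Path
from typing import List


def build_hhblits_command(query_fasta: str, database_prefix: str,
                          output_dir: str, iterations: int = 3,
                          cpus: int = 4) -> List[str]:
    """Build an hhblits argument list that writes <query stem>.hhm.

    The defaults for iterations and cpus are assumptions for illustration,
    not values taken from the repository.
    """
    out_hhm = Path(output_dir) / (Path(query_fasta).stem + ".hhm")
    return [
        "hhblits",
        "-i", query_fasta,       # query sequence in FASTA format
        "-d", database_prefix,   # database path prefix (no file extension)
        "-ohhm", str(out_hhm),   # HMM profile output file
        "-n", str(iterations),   # number of search iterations
        "-cpu", str(cpus),       # number of CPU threads
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. The database prefix is the common stem of the extracted database files, which for this download is typically a path ending in `uniclust30_2018_08/uniclust30_2018_08`.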

Requirements

•python==3.7
•numpy==1.21.5
•pandas==1.3.5
•scikit-learn==1.0.2
•tensorflow==1.14.0
•hhblits==3.3.0

Usage

Running Predictions (Opt_PredLLPS.py)

• Input: The script takes an input file in FASTA format.
• Output: The prediction results are saved in Opt_PredLLPS prediction results.csv.
• Interpreting Scores: If a protein's first task model score is at least 0.5, it is considered an LLPS protein. If its second task model score is at least 0.5, it is considered a PS-Self protein.
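
The scoring rules above, together with the fact that only predicted PS proteins are passed to the second task model, can be sketched as follows. The function name and the returned dictionary keys are hypothetical and do not reflect the column names of the output CSV.

```python
from typing import Dict, Optional

THRESHOLD = 0.5  # cutoff stated above for both task models


def interpret_scores(first_task_score: float,
                     second_task_score: Optional[float] = None
                     ) -> Dict[str, Optional[bool]]:
    """Apply the 0.5 cutoff to both task scores.

    The second score is only interpreted when the first task model
    predicts an LLPS protein; otherwise is_ps_self stays None.
    """
    is_llps = first_task_score >= THRESHOLD
    is_ps_self = None
    if is_llps and second_task_score is not None:
        is_ps_self = second_task_score >= THRESHOLD
    return {"is_llps": is_llps, "is_ps_self": is_ps_self}
```

For instance, `interpret_scores(0.8, 0.7)` marks the protein as both LLPS and PS-Self, while `interpret_scores(0.3, 0.9)` ignores the second score because the first task model did not predict phase separation.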

Running Predictions (Opt_PredLLPS_Self.py)

• Input: The script takes an input file in FASTA format.
• Output: The prediction results are saved in Opt_PredLLPS_Self prediction results.csv.
• Interpreting Scores: If a protein's score is at least 0.5, it is considered a PS-Self protein.

Running Predictions (Opt_PredLLPS_Part.py)

• Input: The script takes an input file in FASTA format.
• Output: The prediction results are saved in Opt_PredLLPS_Part prediction results.csv.
• Interpreting Scores: If a protein's score is at least 0.5, it is considered a PS-Part protein.

Example

Simply run:

 python Opt_PredLLPS.py --input_fasta_file "test/9 proteins/test.fasta"

The prediction results are saved in Opt_PredLLPS prediction results.csv.
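
To launch a prediction from another Python program, the command can be passed as an argument list, which keeps a path containing spaces intact without shell quoting. This is a small sketch: the helper names are hypothetical, and it assumes that Opt_PredLLPS_Self.py and Opt_PredLLPS_Part.py accept the same `--input_fasta_file` option as Opt_PredLLPS.py.

```python
import subprocess
import sys
from typing import List


def build_prediction_command(fasta_path: str,
                             script: str = "Opt_PredLLPS.py") -> List[str]:
    """Return the argument list for one prediction run."""
    # sys.executable reuses the interpreter that runs the calling program.
    return [sys.executable, script, "--input_fasta_file", fasta_path]


def run_prediction(fasta_path: str, script: str = "Opt_PredLLPS.py") -> None:
    """Run the script and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_prediction_command(fasta_path, script), check=True)
```

Calling `run_prediction("test/9 proteins/test.fasta")` from the repository root would then correspond to the command shown above.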

Zhou-Yetong/Opt_PredLLPS
