about the TM domain prediction #46
Comments
For example, this paper uses mature protein sequences as the input file for TMHMM: https://www.frontiersin.org/articles/10.3389/fpls.2014.00098/full
Cheers,
Hi Xizhe, Usually the approach we take is to just ignore any TM domains predicted within the SP region. The output includes the positions of the predicted TM domains, and also the expected number of TM residues within the first 60 AAs (which the LTR model uses to decide if it should worry about any TM domains). I'm not personally in favour of restricting it.
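As an illustration of the "ignore TM domains within the SP region" approach, here is a minimal Python sketch. It is not the pipeline's actual code: the function name, the `(start, end)` tuple layout, and the rule that any TM domain starting at or before the cleavage site counts as "within the SP region" are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical helper, not part of the pipeline.
# Coordinates are assumed to be 1-based and inclusive.
def drop_tm_in_signal_peptide(
    tm_domains: List[Tuple[int, int]],
    sp_cleavage_site: Optional[int],
) -> List[Tuple[int, int]]:
    """Return only the TM domains that start after the signal peptide.

    tm_domains: (start, end) positions of predicted TM helices.
    sp_cleavage_site: position of the last SP residue, or None if
        no signal peptide was predicted for this protein.
    """
    if sp_cleavage_site is None:
        # No signal peptide predicted: keep every TM domain.
        return list(tm_domains)
    # Assumption: a helix starting inside the SP is treated as an SP artefact.
    return [(start, end) for start, end in tm_domains if start > sp_cleavage_site]


kept = drop_tm_in_signal_peptide([(5, 22), (80, 102)], sp_cleavage_site=24)
print(kept)
```

With this rule the full-length sequence still goes to TMHMM, so no candidates are removed before prediction and the reported positions stay in full-length coordinates.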
I know what you mean and I agree with your point. There's a better choice: we could combine the mature protein sequences (from proteins with a predicted SP) and the complete protein sequences (from proteins without an SP) into a single input file for TMHMM. That would be accurate and would not lose any candidates!
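The combined input suggested above could be built roughly as follows. This is only a sketch under stated assumptions: the function names and the dictionary layout are hypothetical, and the cleavage site is assumed to be the 1-based position of the last signal peptide residue, taken from whatever signal peptide predictor was run.

```python
from typing import Dict

# Hypothetical helpers for illustration only.
def build_tmhmm_input(
    proteins: Dict[str, str],
    cleavage_sites: Dict[str, int],
) -> Dict[str, str]:
    """Use the mature sequence where an SP was predicted, else the full one.

    proteins: sequence ID -> complete protein sequence.
    cleavage_sites: sequence ID -> position of the last SP residue
        (1-based); proteins without a predicted SP are simply absent.
    """
    combined = {}
    for name, seq in proteins.items():
        site = cleavage_sites.get(name)
        # Slicing from `site` drops residues 1..site, i.e. the signal peptide.
        combined[name] = seq[site:] if site else seq
    return combined


def to_fasta(records: Dict[str, str]) -> str:
    """Serialise the records as a simple single-line FASTA string."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records.items())


example = build_tmhmm_input(
    {"protA": "MKTLLAAAQWERTY", "protB": "MSEQ"},
    {"protA": 8},
)
print(to_fasta(example))
```

One consequence of this design is that TM positions reported for a trimmed sequence are relative to the mature protein, so the cleavage site would need to be added back to map them onto the full-length sequence.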
We've had a bit of an internal discussion about this one. I can imagine some edge cases where your suggestion might provide some benefit. I think the best way to settle this is to benchmark it and see what happens. I'll leave this open as a reminder until then and hopefully we'll know in the next major release. Thanks for the suggestion!
Dear Darcy,
When I ran the pipeline, I found that the complete protein sequences were used as the input to TMHMM. I think the mature protein sequences predicted by SignalP would be a better input than the complete protein sequences, because a TM domain predicted within the signal peptide would have no function. What do you think about this?
Thanks,
Xizhe