Juami Hermine Mariama van Gils, Dea Gogishvili, Jan van Eck, Robbin Bouwmeester, Erik van Dijk, Sanne Abeln
Proteins tend to bury hydrophobic residues inside their core during the folding process to provide stability to the protein structure and to prevent aggregation. Nevertheless, proteins do expose some ‘sticky’ hydrophobic residues to the solvent. These residues can play an important functional role, e.g. in protein–protein and membrane interactions. Here, we first investigate how hydrophobic protein surfaces are, using three measures of surface hydrophobicity: the total hydrophobic surface area (THSA), the relative hydrophobic surface area (RHSA) and, using our MolPatch method, the largest hydrophobic patch (LHP). Secondly, we analyze how difficult it is to predict these measures from sequence: by adapting solvent accessibility predictions from NetSurfP2.0, we obtain well-performing prediction methods for the THSA and RHSA, while predicting the LHP is more challenging. Finally, we analyze the implications of exposed hydrophobic surfaces: we show that hydrophobic proteins typically have low expression, suggesting that cells avoid an overabundance of sticky proteins.
Bioinformatics Advances, Volume 2, Issue 1, 2022, vbac002, https://doi.org/10.1093/bioadv/vbac002
Outline of the study. (1) Structure-based definitions of the three hydrophobicity measures: red and yellow indicate the surface of hydrophobic residues, and blue indicates the surface of hydrophilic residues. The THSA is calculated by summing the surface area of all hydrophobic residues (red and yellow). The RHSA is calculated by dividing the THSA by the total accessible surface area (TASA; red, yellow and blue). The LHP is the largest area of adjacent hydrophobic residues (red only). (2) We train and benchmark sequence-based prediction methods for the three hydrophobicity measures. (3) THSA, RHSA and LHP values for the human proteome were predicted with the best-performing methods and used to estimate the abundance of hydrophobic proteins in various diseases and tissues.
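The three structure-based definitions in the caption can be illustrated with a minimal residue-level sketch. It is not the MolPatch implementation: the function name `surface_hydrophobicity`, its input format (a per-residue accessible surface area, a set of residues labelled hydrophobic, and a list of adjacent residue pairs) and the rule that a residue counts as exposed when its area is greater than zero are all assumptions made for illustration. The LHP is approximated here as the largest connected group of adjacent exposed hydrophobic residues, measured by summed area.

```python
from collections import defaultdict


def surface_hydrophobicity(areas, hydrophobic, contacts):
    """Return (THSA, RHSA, LHP) for one protein.

    areas       -- dict: residue id -> accessible surface area
    hydrophobic -- set of residue ids labelled hydrophobic (assumed given)
    contacts    -- iterable of (i, j) pairs of adjacent residues (assumed given)
    """
    # Assumption: a residue is exposed if it has non-zero accessible area.
    exposed = {r for r in hydrophobic if areas.get(r, 0.0) > 0.0}

    tasa = sum(areas.values())                 # total accessible surface area
    thsa = sum(areas[r] for r in exposed)      # total hydrophobic surface area
    rhsa = thsa / tasa if tasa > 0 else 0.0    # relative hydrophobic surface area

    # Adjacency restricted to exposed hydrophobic residues.
    neighbours = defaultdict(set)
    for i, j in contacts:
        if i in exposed and j in exposed:
            neighbours[i].add(j)
            neighbours[j].add(i)

    # Largest connected patch by summed area (depth-first search).
    lhp = 0.0
    seen = set()
    for start in exposed:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        patch_area = 0.0
        while stack:
            r = stack.pop()
            patch_area += areas[r]
            for n in neighbours.get(r, ()):
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        lhp = max(lhp, patch_area)

    return thsa, rhsa, lhp


# Toy example with made-up areas: residues 1, 2 and 4 are hydrophobic,
# and residue 3 (hydrophilic) separates residue 4 from the 1-2 patch.
toy_areas = {1: 30.0, 2: 20.0, 3: 50.0, 4: 10.0, 5: 40.0}
toy_hydrophobic = {1, 2, 4}
toy_contacts = [(1, 2), (2, 3), (3, 4)]
thsa, rhsa, lhp = surface_hydrophobicity(toy_areas, toy_hydrophobic, toy_contacts)
```

In this toy case the THSA is the summed area of residues 1, 2 and 4, while the LHP covers only residues 1 and 2, because residue 4 is not adjacent to another hydrophobic residue. This shows why a protein can have a large THSA but a small LHP when its hydrophobic surface is scattered.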