Co-reporter: Martin Weigt; Robert A. White; James A. Hoch; Hendrik Szurmant; Terence Hwa
PNAS 2009 Volume 106 (Issue 1) pp: 67-72
Publication Date (Web): 2009-01-06
DOI: 10.1073/pnas.0805923106
Understanding the molecular determinants of specificity in protein–protein interaction is an outstanding challenge of postgenome
biology. The availability of large protein databases generated from sequences of hundreds of bacterial genomes enables various
statistical approaches to this problem. In this context, covariance-based methods have been used to identify correlations between
amino acid positions in interacting proteins. However, these methods have an important shortcoming, in that they cannot distinguish
between directly and indirectly correlated residues. We developed a method that combines covariance analysis with global inference
analysis, adopted from use in statistical physics. Applied to a set of >2,500 representatives of the bacterial two-component
signal transduction system, the combination of covariance with global inference successfully and robustly identified residue
pairs that are proximal in space without resorting to ad hoc tuning parameters, both for heterointeractions between sensor
kinase (SK) and response regulator (RR) proteins and for homointeractions between RR proteins. The success of
this method illustrates the effectiveness of global inference in identifying direct interactions from sequence
information alone. We expect this method to be applicable soon to interaction surfaces between proteins present in only 1
copy per genome as the number of sequenced genomes continues to expand. Use of this method could significantly increase the
potential targets for therapeutic intervention, shed light on the mechanism of protein–protein interaction, and establish
the foundation for the accurate prediction of interacting protein partners.
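
To make the covariance step concrete, the following is a minimal sketch of how correlation between two alignment positions might be scored. It assumes mutual information as the covariance measure, which is one common choice; the abstract does not name the measure, and the function names and the toy alignment here are hypothetical. The sketch covers only the pairwise covariance step and does not attempt the global inference step that separates direct from indirect correlations.

```python
from collections import Counter
from math import log


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns.

    Each column is a sequence of residue symbols, one per aligned protein.
    This is an illustrative covariance measure, not the paper's full method.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")
    freq_a = Counter(col_a)                # residue counts at position a
    freq_b = Counter(col_b)                # residue counts at position b
    freq_ab = Counter(zip(col_a, col_b))   # joint residue-pair counts
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        mi += p_ab * log(count * n / (freq_a[a] * freq_b[b]))
    return mi


def pairwise_mi(alignment):
    """Score every pair of positions in a list of equal-length sequences."""
    columns = list(zip(*alignment))
    return {
        (i, j): mutual_information(columns[i], columns[j])
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    }


# Hypothetical toy alignment: three positions, six sequences.
toy_alignment = ["AKE", "AKE", "GRD", "GRD", "AKD", "GRE"]
scores = pairwise_mi(toy_alignment)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

A pairwise score of this kind ranks position pairs by how strongly they co-vary, but a high score alone cannot say whether two positions are in direct contact or merely both coupled to a third position. That is the shortcoming the global inference step is meant to address.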