Co-reporter:Yun Mou;Fang-Ciao Hsu;Po-Ssu Huang;Shing-Jong Huang
PNAS 2015 Volume 112 (Issue 34 ) pp:10714-10719
Publication Date(Web):2015-08-25
DOI:10.1073/pnas.1505072112
Homodimers are the most common type of protein assembly in nature and have distinct features compared with heterodimers and
higher order oligomers. Understanding homodimer interactions at the atomic level is critical both for elucidating their biological
mechanisms of action and for accurate modeling of complexes of unknown structure. Computation-based design of novel protein–protein
interfaces can serve as a bottom-up method to further our understanding of protein interactions. Previous studies have demonstrated
that the de novo design of homodimers can be achieved to atomic-level accuracy by β-strand assembly or through metal-mediated
interactions. Here, we report the design and experimental characterization of an α-helix–mediated homodimer with C2 symmetry
based on a monomeric Drosophila engrailed homeodomain scaffold. A solution NMR structure shows that the homodimer exhibits parallel helical packing similar
to the design model. Because the mutations leading to dimer formation resulted in poor thermostability of the system, design
success was facilitated by the introduction of independent thermostabilizing mutations into the scaffold. This two-step design
approach, function and stabilization, is likely to be generally applicable, especially if the desired scaffold is of low thermostability.
Co-reporter:Heidi K. Privett;Gert Kiss;Toni M. Lee;Leonard M. Thomas;Rebecca Blomberg;Roberto A. Chica;Kendall N. Houk;Donald Hilvert
PNAS 2012 Volume 109 (Issue 10 ) pp:
Publication Date(Web):2012-03-06
DOI:10.1073/pnas.1118082108
A general approach for the computational design of enzymes to catalyze arbitrary reactions is a goal at the forefront of the
field of protein design. Recently, computationally designed enzymes have been produced for three chemical reactions through
the synthesis and screening of a large number of variants. Here, we present an iterative approach that has led to the development
of the most catalytically efficient computationally designed enzyme for the Kemp elimination to date. Previously established
computational techniques were used to generate an initial design, HG-1, which was catalytically inactive. Analysis of HG-1
with molecular dynamics simulations (MD) and X-ray crystallography indicated that the inactivity might be due to bound waters
and high flexibility of residues within the active site. This analysis guided changes to our design procedure, moved the design
deeper into the interior of the protein, and resulted in an active Kemp eliminase, HG-2. The cocrystal structure of this enzyme
with a transition state analog (TSA) revealed that the TSA was bound in the active site, interacted with the intended catalytic
base in a catalytically relevant manner, but was flipped relative to the design model. MD analysis of HG-2 led to an additional
point mutation, HG-3, that produced a further threefold improvement in activity. This iterative approach to computational
enzyme design, including detailed MD and structural analysis of both active and inactive designs, promises a more complete
understanding of the underlying principles of enzymatic catalysis and furthers progress toward reliably producing active enzymes.
Co-reporter:Roberto A. Chica;Matthew M. Moore;Benjamin D. Allen
PNAS 2010 Volume 107 (Issue 47 ) pp:20257-20262
Publication Date(Web):2010-11-23
DOI:10.1073/pnas.1013910107
The longer emission wavelengths of red fluorescent proteins (RFPs) make them attractive for whole-animal imaging because cells
are more transparent to red light. Although several useful RFPs have been developed using directed evolution, the quest for
further red-shifted and improved RFPs continues. Herein, we report a structure-based rational design approach to red-shift
the fluorescence emission of RFPs. We applied a combined computational and experimental approach that uses computational protein
design as an in silico prescreen to generate focused combinatorial libraries of mCherry mutants. The computational procedure helped us identify
residues that could fulfill interactions hypothesized to cause red-shifts without destabilizing the protein fold. These interactions
include stabilization of the excited state through H-bonding to the acylimine oxygen atom, destabilization of the ground state
by hydrophobic packing around the charged phenolate, and stabilization of the excited state by a π-stacking interaction. Our
methodology allowed us to identify three mCherry mutants (mRojoA, mRojoB, and mRouge) that display emission wavelengths > 630 nm,
representing red-shifts of 20–26 nm. Moreover, our approach required the experimental screening of a total of ∼5,000 clones,
a number several orders of magnitude smaller than those previously used to achieve comparable red-shifts. Additionally, crystal
structures of mRojoA and mRouge allowed us to verify fulfillment of the interactions hypothesized to cause red-shifts, supporting
their contribution to the observed red-shifts.
Co-reporter:Benjamin D. Allen;Alex Nisthal
PNAS 2010 Volume 107 (Issue 46 ) pp:19838-19843
Publication Date(Web):2010-11-16
DOI:10.1073/pnas.1012985107
The stability, activity, and solubility of a protein sequence are determined by a delicate balance of molecular interactions
in a variety of conformational states. Even so, most computational protein design methods model sequences in the context of
a single native conformation. Simulations that model the native state as an ensemble have been mostly neglected due to the
lack of sufficiently powerful optimization algorithms for multistate design. Here, we have applied our multistate design algorithm
to study the potential utility of various forms of input structural data for design. To facilitate a more thorough analysis,
we developed new methods for the design and high-throughput stability determination of combinatorial mutation libraries based
on protein design calculations. The application of these methods to the core design of a small model system produced many
variants with improved thermodynamic stability and showed that multistate design methods can be readily applied to large structural
ensembles. We found that exhaustive screening of our designed libraries helped to clarify several sources of simulation error
that would have otherwise been difficult to ascertain. Interestingly, the lack of correlation between our simulated and experimentally
measured stability values shows clearly that a design procedure need not reproduce experimental data exactly to achieve success.
This surprising result suggests potentially fruitful directions for the improvement of computational protein design technology.
Co-reporter:Oscar Alvizo
PNAS 2008 Volume 105 (Issue 34 ) pp:12242-12247
Publication Date(Web):2008-08-26
DOI:10.1073/pnas.0805858105
An accurate force field is essential to computational protein design and protein fold prediction studies. Proper force field
tuning is problematic, however, due in part to the incomplete modeling of the unfolded state. Here, we evaluate and optimize
a protein design force field by constraining the amino acid composition of the designed sequences to that of a well behaved
model protein. According to the random energy model, unfolded state energies are dependent only on amino acid composition
and not the specific arrangement of amino acids. Therefore, energy discrepancies between computational predictions and experimental
results, for sequences of identical composition, can be directly attributed to flaws in the force field's ability to properly
account for folded state sequence energies. This aspect of fixed composition design allows for force field optimization by
focusing solely on the interactions in the folded state. Several rounds of fixed composition optimization of the 56-residue
β1 domain of protein G yielded force field parameters with significantly greater predictive power: Optimized sequences exhibited
higher wild-type sequence identity in critical regions of the structure, and the wild-type sequence showed an improved Z-score.
Experimental studies revealed a designed 24-fold mutant to be stably folded with a melting temperature similar to that of
the wild-type protein. Sequence designs using engrailed homeodomain as a scaffold produced similar results, suggesting the
tuned force field parameters were not specific to protein G.
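The fixed-composition argument in this abstract can be written out as a short derivation. The symbols below are illustrative notation, not taken from the paper: $s$ is a sequence, $c(s)$ its amino acid composition, $E_F$ the modeled folded-state energy, and $E_U$ the unfolded-state energy, which under the random energy model is assumed to depend only on composition.

```latex
\begin{align}
  \Delta G_{\mathrm{fold}}(s) &\approx E_F(s) - E_U\bigl(c(s)\bigr) \\
  \intertext{For two sequences $s_1$, $s_2$ with identical composition, $c(s_1) = c(s_2)$, the unfolded-state terms cancel:}
  \Delta\Delta G_{\mathrm{fold}}
    &= \Delta G_{\mathrm{fold}}(s_1) - \Delta G_{\mathrm{fold}}(s_2) \\
    &\approx E_F(s_1) - E_F(s_2)
\end{align}
```

Under these assumptions, any disagreement between the predicted and measured stability difference of two same-composition sequences reflects error in the folded-state energy terms alone.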
Co-reporter:Thomas P. Treynor;Christina L. Vizcarra;Daniel Nedelcu
Proceedings of the National Academy of Sciences 2007 104(1) pp:48-53
Publication Date(Web):December 19, 2006
DOI:10.1073/pnas.0609647103
To determine which of seven library design algorithms best introduces new protein function without destroying it altogether,
seven combinatorial libraries of green fluorescent protein variants were designed and synthesized. Each was evaluated by distributions
of emission intensity and color compiled from measurements made in vivo. Additional comparisons were made with a library constructed by error-prone PCR. Among the designed libraries, fluorescent
function was preserved for the greatest fraction of samples in a library designed by using a structure-based computational
method developed and described here. A trend was observed toward greater diversity of color in designed libraries that better
preserved fluorescence. Contrary to trends observed among libraries constructed by error-prone PCR, preservation of function
was observed to increase with a library's average mutation level among the four libraries designed with structure-based computational
methods.
Co-reporter:Jonathan Kyle Lassila;Heidi K. Privett;Benjamin D. Allen
Proceedings of the National Academy of Sciences 2006 103(45) pp:16710-16715
Publication Date(Web):October 30, 2006
DOI:10.1073/pnas.0607691103
The incorporation of small-molecule transition state structures into protein design calculations poses special challenges
because of the need to represent the added translational, rotational, and conformational freedoms within an already difficult
optimization problem. Successful approaches to computational enzyme design have focused on catalytic side-chain contacts to
guide placement of small molecules in active sites. We describe a process for modeling small molecules in enzyme design calculations
that extends previously described methods, allowing favorable small-molecule positions and conformations to be explored simultaneously
with sequence optimization. Because all current computational enzyme design methods rely heavily on sampling of possible active
site geometries from discrete conformational states, we tested the effects of discretization parameters on calculation results.
Rotational and translational step sizes as well as side-chain library types were varied in a series of computational tests
designed to identify native-like binding contacts in three natural systems. We find that conformational parameters, especially
the type of rotamer library used, significantly affect the ability of design calculations to recover native binding-site geometries.
We describe the construction and use of a crystallographic conformer library and find that it more reliably captures active-site
geometries than traditional rotamer libraries in the systems tested.
Co-reporter:Julia M. Shifman
PNAS 2003 Volume 100 (Issue 23 ) pp:13274-13279
Publication Date(Web):2003-11-11
DOI:10.1073/pnas.2234277100
Calmodulin (CaM) is a second messenger protein that has evolved to bind tightly to a variety of targets and, as such, exhibits
low binding specificity. We redesigned CaM by using a computational protein design algorithm to improve its binding specificity
for one of its targets, smooth muscle myosin light chain kinase (smMLCK). Residues in or near the CaM/smMLCK binding interface
were optimized; CaM interactions with alternative targets were not directly considered in the optimization. The predicted
CaM sequences were constructed and tested for binding to a set of eight targets including smMLCK. The best CaM variant, obtained
from a calculation that emphasized intermolecular interactions, showed up to a 155-fold increase in binding specificity. The
increase in binding specificity was not due to improved binding to smMLCK, but due to decreased binding to the alternative
targets. This finding is consistent with the fact that the sequence of wild-type CaM is nearly optimal for interactions with
numerous targets.
Co-reporter:Daniel N Bolon, Christopher A Voigt, Stephen L Mayo
Current Opinion in Chemical Biology 2002 Volume 6(Issue 2) pp:125-129
Publication Date(Web):1 April 2002
DOI:10.1016/S1367-5931(02)00303-4
The challenging field of de novo enzyme design is beginning to produce exciting results. The application of powerful computational methods to functional protein design has recently succeeded at engineering target activities. In addition, efforts in directed evolution continue to expand the transformations that can be accomplished by existing enzymes. The engineering of completely novel catalytic activity requires traversing inactive sequence space in a fitness landscape, a feat that is better suited to computational design. Optimizing activity, which can include subtle alterations in backbone conformation and protein motion, is better suited to directed evolution, which is highly effective at scaling fitness landscapes towards maxima. Improved rational design efforts coupled with directed evolution should dramatically improve the scope of de novo enzyme design.
Co-reporter:
Nature Structural and Molecular Biology 2002 9(7) pp:553 - 558
Publication Date(Web):03 June 2002
DOI:10.1038/nsb805
Co-reporter:Christopher A. Voigt;Frances H. Arnold;Zhen-Gang Wang
PNAS 2001 Volume 98 (Issue 7 ) pp:3778-3783
Publication Date(Web):2001-03-27
DOI:10.1073/pnas.051614498
We introduce a computational method to optimize the in
vitro evolution of proteins. Simulating evolution with a simple
model that statistically describes the fitness landscape, we find that
beneficial mutations tend to occur at amino acid positions that are
tolerant to substitutions, in the limit of small libraries and low
mutation rates. We transform this observation into a design strategy by
applying mean-field theory to a structure-based computational model to
calculate each residue's structural tolerance. Thermostabilizing and
activity-increasing mutations accumulated during the experimental
directed evolution of subtilisin E and T4 lysozyme are strongly
directed to sites identified by using this computational approach. This
method can be used to predict positions where mutations are likely to
lead to improvement of specific protein properties.
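As a rough illustration of ranking positions by structural tolerance, the sketch below scores each position by the Shannon entropy of its amino acid probability distribution. This is an assumption about how tolerance might be quantified, not the paper's exact procedure. The probabilities, position labels, and function names are hypothetical; in practice the probabilities would come from a mean-field calculation on a structure-based model.

```python
import math


def site_entropy(probabilities):
    """Shannon entropy (in nats) of one position's amino acid distribution."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probabilities:
        q = p / total  # renormalise defensively in case of rounding
        if q > 0:
            entropy -= q * math.log(q)
    return entropy


def rank_tolerant_positions(profile):
    """Return position labels sorted from most to least tolerant (highest entropy first)."""
    return sorted(profile, key=lambda pos: site_entropy(profile[pos]), reverse=True)


# Hypothetical per-position probabilities over a reduced four-letter alphabet.
example_profile = {
    "pos_10": [0.97, 0.01, 0.01, 0.01],  # nearly fixed: low tolerance
    "pos_42": [0.25, 0.25, 0.25, 0.25],  # uniform: high tolerance
    "pos_77": [0.60, 0.30, 0.05, 0.05],  # intermediate
}
ranking = rank_tolerant_positions(example_profile)
```

Positions near the top of `ranking` would be the first candidates for mutagenesis under this simplified view.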
Co-reporter:Stephen L. Mayo;Daniel N. Bolon
PNAS 2001 Volume 98 (Issue 25 ) pp:14274-14279
Publication Date(Web):2001-12-04
DOI:10.1073/pnas.251555398
We report the development and initial experimental validation of a
computational design procedure aimed at generating enzyme-like protein
catalysts called “protozymes.” Our design approach utilizes a
“compute and build” strategy that is based on the
physical/chemical principles governing protein stability and
catalytic mechanism. By using the catalytically inert
108-residue Escherichia coli thioredoxin as a scaffold,
the histidine-mediated nucleophilic hydrolysis of
p-nitrophenyl acetate as a model reaction, and the
ORBIT protein design software to compute sequences,
an active site scan identified two promising catalytic positions and
surrounding active-site mutations required for substrate binding.
Experimentally, both candidate protozymes demonstrated catalytic
activity significantly above background. One of the proteins, PZD2,
displayed “burst” phase kinetics at high substrate
concentrations, consistent with the formation of a stable enzyme
intermediate. The kinetic parameters of PZD2 are comparable to those of early
catalytic antibodies. Unlike catalytic antibody design, however, our design procedure is
independent of fold, suggesting a possible mechanism for examining the
relationships between protein fold and the evolvability of protein
function.
Co-reporter:
Nature Structural and Molecular Biology 2000 7(8) pp:674-678
Publication Date(Web):
DOI:10.1038/77978
We have taken a computational approach to design mutations that stabilize a large protein domain of 200 residues in two alternative conformations. Mutations in the hydrophobic core of the αMβ2 integrin I domain were designed to stabilize the crystallographically defined open or closed conformers. When expressed on the cell surface as part of the intact heterodimeric receptor, binding of the designed open and closed I domains to the ligand iC3b, a form of the complement component C3, was either increased or decreased, respectively, compared to wild type. Moreover, when expressed in isolation from other integrin domains using an artificial transmembrane domain, designed open I domains were active in ligand binding, whereas designed closed and wild-type I domains were inactive. Comparison to an open mutant designed by a human expert showed that the computationally designed mutants are far more active. Thus, computational design can be used to stabilize a molecule in a desired conformation, and conformational change in the I domain is physiologically relevant to regulation of ligand binding.
Co-reporter:Premal S. Shah, Geoffrey K. Hom, Scott A. Ross, Jonathan Kyle Lassila, ... Stephen L. Mayo
Journal of Molecular Biology (7 September 2007) Volume 372(Issue 1) pp:1-6
Publication Date(Web):7 September 2007
DOI:10.1016/j.jmb.2007.06.032
Computational protein design procedures were applied to the redesign of the entire sequence of a 51 amino acid residue protein, Drosophila melanogaster engrailed homeodomain. Various sequence optimization algorithms were compared and two resulting designed sequences were experimentally evaluated. The two sequences differ by 11 mutations and share 22% and 24% sequence identity with the wild-type protein. Both computationally designed proteins were considerably more stable than the naturally occurring protein, with midpoints of thermal denaturation greater than 99 °C. The solution structure was determined for one of the two sequences using multidimensional heteronuclear NMR spectroscopy, and the structure was found to closely match the original design template scaffold.
Co-reporter:Yun Mou, Po-Ssu Huang, Leonard M. Thomas, Stephen L. Mayo
Journal of Molecular Biology (14 August 2015) Volume 427(Issue 16) pp:2697-2706
Publication Date(Web):14 August 2015
DOI:10.1016/j.jmb.2015.06.006
• Computational protein design (CPD) calculations that do not consider competing states may lead to off-target folding.
• We developed an MD simulation protocol as a post-CPD screening tool.
• The MD protocol identifies mutations leading to undesired competing states.
• The MD protocol predicts mutations that favor the target fold.
• CPD combined with MD screening can greatly improve design success rates.
In standard implementations of computational protein design, a positive-design approach is used to predict sequences that will be stable on a given backbone structure. Possible competing states are typically not considered, primarily because appropriate structural models are not available. One potential competing state, the domain-swapped dimer, is especially compelling because it is often nearly identical to its monomeric counterpart, differing by just a few mutations in a hinge region. Molecular dynamics (MD) simulations provide a computational method to sample different conformational states of a structure. Here, we tested whether MD simulations could be used as a post-design screening tool to identify sequence mutations leading to domain-swapped dimers. We hypothesized that a successful computationally designed sequence would have backbone structure and dynamics characteristics similar to those of the input structure and that, in contrast, domain-swapped dimers would exhibit increased backbone flexibility and/or altered structure in the hinge-loop region to accommodate the large conformational change required for domain swapping. While attempting to engineer a homodimer from a 51-amino-acid fragment of the monomeric protein engrailed homeodomain (ENH), we had instead generated a domain-swapped dimer (ENH_DsD). MD simulations of these proteins showed increased simulation-derived B-factors in the hinge loop of ENH_DsD relative to monomeric ENH. Two point mutants of ENH_DsD designed to recover the monomeric fold were then tested with an MD simulation protocol. The MD simulations suggested that one of these mutants would adopt the target monomeric structure, which was subsequently confirmed by X-ray crystallography.
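For readers unfamiliar with simulation-derived B-factors, the sketch below shows one standard way to obtain them: compute each atom's root-mean-square fluctuation (RMSF) over trajectory frames and convert with B = (8π²/3)·RMSF². It is a minimal illustration, not the protocol used in the paper. It assumes the frames have already been superimposed on a common reference, and the function names and toy coordinates are hypothetical.

```python
import math


def rmsf(positions):
    """RMSF (in Å) of one atom, given its (x, y, z) position in each aligned frame."""
    n = len(positions)
    if n == 0:
        raise ValueError("at least one frame is required")
    # Time-averaged position of the atom.
    mean = [sum(p[i] for p in positions) / n for i in range(3)]
    # Mean squared deviation from the average position.
    msd = sum(sum((p[i] - mean[i]) ** 2 for i in range(3)) for p in positions) / n
    return math.sqrt(msd)


def b_factor_from_rmsf(rmsf_angstrom):
    """Convert an RMSF in Å to an isotropic B-factor in Å^2: B = (8*pi^2/3) * RMSF^2."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_angstrom ** 2


# Toy trajectory: one atom alternating between two positions 2 Å apart.
toy_frames = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_rmsf = rmsf(toy_frames)
toy_b = b_factor_from_rmsf(toy_rmsf)
```

Comparing such per-residue values between a design and its template, for example in a hinge loop, is the kind of check the abstract describes.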