Co-reporter:Tiyun Han, Quan Chen, and Haiyan Liu
ACS Synthetic Biology 2017 Volume 6(Issue 2) pp:
Publication Date(Web):October 30, 2016
DOI:10.1021/acssynbio.6b00248
Genetic switches in which the activity of T7 RNA polymerase (RNAP) is directly regulated by external signals are obtained with an engineering strategy of splitting the protein into fragments and using regulatory domains to modulate their reconstitution. Robust switchable systems with excellent dark-off/light-on properties are obtained with the light-activatable VVD domain and its variants as regulatory domains. For the best split position found, working switches exploit either the light-induced interactions between the VVD domains or allosteric effects. The split fragments show high modularity when they are combined with different regulatory domains such as those with chemically inducible interaction, enabling chemically controlled switches. To summarize, the T7 RNA polymerase-based switches are powerful tools to implement light-activated gene expression in different contexts. Moreover, results about the studied split positions and domain organizations may facilitate future engineering studies on this and on related proteins.
Keywords: allosteric control; genetic switch; light-regulated protein−protein interaction; modular domain organization; split T7 RNA polymerase
Co-reporter:Yinliang Zhang, Zheng Zhao, and Haiyan Liu
ACS Catalysis 2015 Volume 5(Issue 4) pp:2559
Publication Date(Web):March 13, 2015
DOI:10.1021/cs501709d
Here we use an approach of active site alignment and clustering of many evolutionarily distant enzymes catalyzing similar reactions to identify conserved residues/interactions that may play key chemical roles in catalysis. Density functional theory (DFT) calculations on cluster models are then used to investigate the chemical essentialness of such residues/interactions and its mechanistic basis. We apply this approach to 130 glycoside hydrolases (GHs) of the (βα)8-barrel fold. These enzymes adopt either a classical retaining mechanism or a substrate-assisted intramolecular nucleophilic attack mechanism, both in need of a general acid/general base residue for catalysis. On the basis of the multiple active site alignments, the enzyme active sites can be clustered into six categories. The conserved or convergently evolved hydrogen bond/salt bridge involving the general acid/general base in different categories suggests the importance of this interaction. DFT calculations indicate that its presence may reduce the energetic barrier by as much as 17–20 kcal/mol. The mechanistic explanation for this large effect is that a proton transfer from the general acid to the leaving group takes place before the nucleophile attacks the transition state. The large energetic effect suggests that this interaction should be considered as chemically essential, although it is realized with varied residue types in different GH categories. In addition, for the substrate-assisted mechanism, an interaction between the substrate nucleophile group and a tyrosine is found to have been convergently evolved in enzymes of two different categories. This interaction does not seem to have favorable effects on the energetic barrier. Instead, it might contribute to reducing the activation entropy. 
In summary, active site alignment of distant enzymes combined with quantum mechanical calculation may constitute a powerful approach to obtain new insights into enzyme catalysis.
Keywords: active site alignments; conserved interactions; general acid/general base catalysis; glycoside hydrolase; quantum chemistry calculations
Co-reporter:Haiyan Liu
Quantitative Biology 2015 Volume 3(Issue 4) pp:157-167
Publication Date(Web):2015 December
DOI:10.1007/s40484-015-0054-x
Statistical energy functions are general models about atomic or residue-level interactions in biomolecules, derived from existing experimental data. They provide quantitative foundations for structural modeling as well as for structure-based protein sequence design. Statistical energy functions can be derived computationally either based on statistical distributions or based on variational assumptions. We present overviews on the theoretical assumptions underlying the various types of approaches. Theoretical considerations underlying important pragmatic choices are discussed.
Co-reporter:Zexian Liu, Yongbo Wang, Changhai Zhou, Yu Xue, Wei Zhao, Haiyan Liu
Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics 2014 Volume 1844(Issue 1) pp:171-180
Publication Date(Web):January 2014
DOI:10.1016/j.bbapap.2013.03.001
• A computational approach was proposed to characterize protein zinc-binding sites.
• The geometry-based, training-independent approach achieved a promising performance.
• Zinc-binding is involved in more complicated biological processes during evolution.
• Zinc-binding was implicated in a variety of diseases and enriched in drug targets.
Zinc is one of the most essential metals utilized by organisms, and zinc-binding proteins play an important role in a variety of biological processes such as transcription regulation, cell metabolism and apoptosis. Thus, characterizing the precise zinc-binding sites is fundamental to an elucidation of the biological functions and molecular mechanisms of zinc-binding proteins. Using systematic analyses of structural characteristics, we observed that 4-residue and 3-residue zinc-binding sites have distinctly specific geometric features. Based on the results, we developed the novel computational program Geometric REstriction for Zinc-binding (GRE4Zn) to characterize the zinc-binding sites in protein structures, by restricting the distances between zinc and its coordinating atoms. The comparison between GRE4Zn and analogous tools revealed that it achieved a superior performance. A large-scale prediction for structurally characterized proteins was performed with this predictor, and statistical analyses of the results indicated that, during the course of evolution, zinc-binding proteins have come to be significantly involved in more complicated biological processes in higher species than in simpler species. Further analyses suggested that zinc-binding proteins are preferentially implicated in a variety of diseases and highly enriched in known drug targets, and the prediction of zinc-binding sites can be helpful for the investigation of molecular mechanisms. In this regard, these prediction and analysis results should prove helpful for further biomedical study and drug design. 
The online service of GRE4Zn is freely available at: http://biocomp.ustc.edu.cn/gre4zn/. This article is part of a Special Issue entitled: Computational Proteomics, Systems Biology & Clinical Implications. Guest Editor: Yudong Cai.
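The idea of restricting the distances between zinc and its coordinating atoms can be illustrated with a minimal sketch. This is not the GRE4Zn algorithm: the cutoff value, the atom labels, and the function name `coordinating_atoms` are assumptions made purely for illustration.

```python
import math

# Hypothetical distance cutoff in angstroms, chosen only for this example.
MAX_ZN_LIGAND_DIST = 2.8

def coordinating_atoms(zinc_xyz, atoms, cutoff=MAX_ZN_LIGAND_DIST):
    """Return labels of atoms whose distance to the zinc position is within `cutoff`.

    `atoms` is a list of (label, (x, y, z)) pairs; the labels are free-form strings.
    """
    return [label for label, xyz in atoms if math.dist(zinc_xyz, xyz) <= cutoff]

# Toy coordinates (not taken from any real structure).
zinc = (0.0, 0.0, 0.0)
candidates = [
    ("CYS10:SG", (2.3, 0.0, 0.0)),
    ("HIS25:NE2", (0.0, 2.1, 0.0)),
    ("ASP40:OD1", (0.0, 0.0, 5.0)),
]
site = coordinating_atoms(zinc, candidates)
```

In this toy case only the first two atoms fall inside the cutoff. A real predictor would also need to handle several geometric features at once, as the abstract indicates for 3-residue and 4-residue sites.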
Co-reporter:Yue Hu and Haiyan Liu
The Journal of Physical Chemistry A 2014 Volume 118(Issue 39) pp:9272-9279
Publication Date(Web):June 18, 2014
DOI:10.1021/jp503856h
We studied ligand dissociation from the inducer-binding domain of the Lac repressor protein using temperature-accelerated molecular dynamics (TAMD) simulations. With TAMD, ligand dissociation could be observed within relatively short simulation time. This allowed many dissociation trajectories to be sampled. Under the adiabatic approximation of TAMD, all but one degree of freedom of the system were sampled from usual canonical ensembles at room temperature. Thus, meaningful statistical analyses could be carried out on the trajectories. A systematic approach was proposed to analyze possible correlations between ligand dissociation and fluctuations of various protein conformational coordinates. These analyses employed relative entropies, allowing both linear and nonlinear correlations to be considered. Applying the simulation and analysis methods to the inducer binding domain of the Lac repressor protein, we found that ligand dissociation from this protein correlated mainly with fluctuations of side-chain conformations of a few residues that surround the binding pocket. In addition, the two binding sites of the dimeric protein were dynamically coupled: occupation of one site by an inducer molecule could significantly reduce or slow down conformational dynamics around the other binding pocket.
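One way to see why a relative-entropy measure can capture nonlinear as well as linear correlations is the mutual information, which is the relative entropy between a joint distribution and the product of its marginals. The sketch below assumes discretized coordinates and a simple counting estimator; the abstract does not specify the exact measure or estimator used, so this is an illustration rather than the authors' procedure.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Relative entropy (in nats) between the joint distribution of two
    discretized variables and the product of their marginals."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# y = x**2 has zero linear (Pearson) correlation with x on this symmetric
# sample, yet the mutual information is clearly positive.
xs = [-1, 0, 1]
ys = [x * x for x in xs]
mi_nonlinear = mutual_information(xs, ys)
```

Here `xs` could stand for a binned side-chain dihedral angle and `ys` for a binned ligand-pocket distance; both names are hypothetical.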
Co-reporter:Yue Hu, Wei Hong, Yunyu Shi, and Haiyan Liu
Journal of Chemical Theory and Computation 2012 Volume 8(Issue 10) pp:3777-3792
Publication Date(Web):April 13, 2012
DOI:10.1021/ct300061g
In molecular simulations, accelerated sampling can be achieved efficiently by raising the temperature of a small number of coordinates. For collective coordinates, the temperature-accelerated molecular dynamics method or TAMD has been previously proposed, in which the system is extended by introducing virtual variables that are coupled to these coordinates and simulated at higher temperatures (Maragliano, L.; Vanden-Eijnden, E. Chem. Phys. Lett. 2005, 426, 168–175). In such accelerated simulations, steady state or equilibrium distributions may exist but deviate from the canonical Boltzmann one. We show that by assuming adiabatic decoupling between the subsystems simulated at different temperatures, correct canonical distributions and ensemble averages can be obtained through reweighting. The method makes use of the low-dimensional free energy surfaces that are estimated as Gaussian mixture probability densities through maximum likelihood and expectation maximization. Previously, we proposed the amplified collective motion method or ACM. The method employs the coarse-grained elastic network model or ANM to extract collective coordinates for accelerated sampling. Here, we combine the ideas of ACM and of TAMD to develop a general technique that can achieve canonical sampling through reweighting under the adiabatic approximation. To test the validity and accuracy of adiabatic reweighting, first we consider a single n-butane molecule in a canonical stochastic heat bath. Then, we use explicitly solvated alanine dipeptide and GB1 peptide as model systems to demonstrate the proposed approaches. With alanine dipeptide, it is shown that sampling can be accelerated by more than an order of magnitude with TAMD while correct distributions and canonical ensemble averages can be recovered, necessarily through adiabatic reweighting. 
For the GB1 peptide, the conformational distribution sampled by ACM-TAMD, after adiabatic reweighting, suggested that a normal simulation suffered significantly from insufficient sampling and that the reweighted ACM-TAMD distribution may present significant improvements over the normal simulation in representing the local conformational ensemble around the folded structure of GB1.
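The reweighting idea can be sketched for a single discretized collective coordinate. The sketch assumes that, under adiabatic decoupling, the accelerated coordinate z is sampled with probability proportional to exp(-F(z)/kT_high), where F(z) is the free energy at the physical temperature T, so that each frame receives a weight proportional to p_sampled(z)^(T_high/T - 1). This form is an assumption of the example, not a formula quoted from the abstract, and a plain histogram stands in for the Gaussian mixture density estimate mentioned above.

```python
from collections import Counter

def adiabatic_weights(bins, t_low, t_high):
    """Normalized per-frame weights p_sampled(bin) ** (t_high / t_low - 1).

    `bins` holds the discretized value of the accelerated coordinate for
    each frame; densities are estimated by simple counting.
    """
    n = len(bins)
    counts = Counter(bins)
    exponent = t_high / t_low - 1.0
    raw = [(counts[b] / n) ** exponent for b in bins]
    total = sum(raw)
    return [w / total for w in raw]

def reweighted_average(values, weights):
    """Weighted ensemble average of a per-frame observable."""
    return sum(v * w for v, w in zip(values, weights))

# Toy two-state trajectory: state "A" is visited 3 times, state "B" once,
# with the coordinate kept at twice the physical temperature.
frames = ["A", "A", "A", "B"]
weights = adiabatic_weights(frames, t_low=300.0, t_high=600.0)
population_a = reweighted_average([1.0, 1.0, 1.0, 0.0], weights)
```

In this toy case the sampled populations of 0.75 and 0.25 map to reweighted populations of 0.9 and 0.1, reflecting the sharper distribution expected at the lower temperature.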
Co-reporter:Liling Zhao, Zhijun Liu, Zanxia Cao, Haiyan Liu, Jihua Wang
Computational and Theoretical Chemistry 2011 Volume 978(Issues 1–3) pp:152-159
Publication Date(Web):30 December 2011
DOI:10.1016/j.comptc.2011.10.004
The thermal intermediate state of the high mobility group box 5 domain in human upstream binding factor was detected at 55 °C by nuclear magnetic resonance (NMR) experiments. Owing to insufficient data, however, the tertiary structure of the intermediate state cannot be resolved experimentally in the way the native state can. To characterize the intermediate state ensemble, we performed ensemble-averaged molecular dynamics simulations on the box 5 protein with 421 distance restraints derived from nuclear Overhauser enhancement and paramagnetic relaxation enhancement, as well as 122 dihedral angle restraints obtained from the program TALOS based on atomic chemical shifts. Forty-eight replicas were simulated for 60 ns each, giving a total simulation time of 2.88 μs. The results indicated that the intermediate state ensemble of box 5 was highly heterogeneous and that most secondary structures were formed. The N-terminal coil and helix 1 moved toward the C-terminal region; helix 3 was more stable and native-like than the other two helices; the hydrophobic core was not formed completely in the intermediate ensemble; and the L-shaped topology of the native conformation disappeared. In addition, some experimental inconsistencies were found, which could not be resolved in one conformation. In this study, the structural characteristics of the box 5 thermal intermediate state ensemble were determined, which cannot be directly achieved through wet experiments. The findings of the current work are useful for the understanding of the protein folding mechanism. To our knowledge, this is the first report on the structural and thermal characteristics of the intermediate state ensemble of box 5.
Highlights
► The “invisible” intermediate state was predicted with ensemble-averaged simulations.
► Distance and angle restraints from experiments were imposed on the simulations.
► The structural characteristics of the intermediate state ensemble were depicted in detail.
► The ensemble was highly heterogeneous and the L-shaped topology had vanished.
► Helix 3 was more native-like and the hydrophobic core was not completely formed.
Co-reporter:Chao Xu, Jun Wang and Haiyan Liu
Journal of Chemical Theory and Computation 2008 Volume 4(Issue 8) pp:1348-1359
Publication Date(Web):June 21, 2008
DOI:10.1021/ct7003534
We present a Hamiltonian replica exchange approach and apply it to investigate the effects of various factors on the conformational equilibrium of the peptide backbone. In different replicas, biasing potentials of varying strengths are applied to all backbone (φ,ψ) torsional angle pairs to overcome sampling barriers. A general form of constructing biasing potentials based on a reference free energy surface is employed to minimize sampling in physically irrelevant parts of the conformational space. An extension of the weighted histogram analysis formulation allows for conformational free energy surfaces to be computed using all replicas, including those with biased Hamiltonians. This approach can significantly reduce the statistical uncertainties in computed free energies. For the peptide systems considered, it allows for effects of the order of 0.5−1 kJ/mol to be quantified using explicit solvent simulations. We applied this approach to capped peptides of 2−5 peptide units containing Ala, Phe, or Val in explicit water solvent and focused on how the conformational equilibrium of a single pair of backbone angles is influenced by changing the residue types of the same and neighboring residues as well as the conformations of neighboring residues. For the effects of changing side-chain types of the same residue, our results consistently showed increased preference of β for Phe and Val relative to Ala. As for neighbor effects, our results not only indicated that they can be as large as the effects of changing the side-chain type of the same residue but also led to several new insights. We found that for the N-terminal neighbors, their conformations seem to have large effects. Relative to the β conformer of an N-terminal neighbor, its α conformer stabilizes the β conformer of its next Ala regardless of the residue type of the neighbor. For C-terminal neighbors, their chemical identities seem to play more important roles. 
Val as the C-terminal neighbor significantly increases the PII propensity of its previous Ala regardless of its own conformational state. These results are in good accordance with reported statistics of protein coil structure libraries, proving the persistent presence of such effects in short peptides as well as in proteins. We also observed other side-chain identity and neighbor effects which have been consistently reproduced in our simulations of different small peptide systems but not displayed by coil library statistics.
Co-reporter:Minghui Dong and Haiyan Liu
The Journal of Physical Chemistry B 2008 Volume 112(Issue 33) pp:10280-10290
Publication Date(Web):July 24, 2008
DOI:10.1021/jp711209j
The Escherichia coli peptide deformylase (PDF) and Bacillus thermoproteolyticus thermolysin (TLN) are two representative metal-requiring peptidases having remarkably similar active centers but distinctively different metal preferences. Zinc is a competent catalytic cofactor for TLN but not for PDF. Reaction pathways and the associated energetics for both enzymes were determined using combined semiempirical and ab initio quantum mechanical/molecular mechanical modeling, without presuming reaction coordinates. The results confirmed that both enzymes catalyze via the same chemical steps, and reproduced their different preferences for zinc or iron as competent cofactors. Further analyses indicated that different feasibility of the nucleophilic attack step leads to different metal preferences of the two enzymes. In TLN, the substrate is strongly activated and can serve as the fifth coordination ligand of zinc prior to the chemical steps. In PDF, the substrate carbonyl is activated by the chemical step itself, and becomes the fifth coordination partner of zinc only in a later stage of the nucleophilic attack. This leads to a much more difficult nucleophilic attack in PDF than in TLN. Contrary to some earlier suggestions, zinc has no difficulty in accepting an activated substrate as the fifth ligand to switch from tetra- to penta-coordination in either PDF or TLN. When iron replaces zinc, its stronger interaction with the hydroxide ligand may lead to a higher activation barrier in TLN. In PDF, the stronger interactions of iron with ligands allow iron−substrate coordination to take place either before or at a very early stage of the chemical step, leading to effective catalysis. Our calculations also show that combined semiempirical and ab initio quantum mechanical modeling can be an efficient approach to explore complicated reaction pathways in enzyme systems.
Co-reporter:Zheng Zhao and Haiyan Liu
The Journal of Physical Chemistry B 2008 Volume 112(Issue 41) pp:13091-13100
Publication Date(Web):September 24, 2008
DOI:10.1021/jp802262m
The catalytic mechanism of a pyridoxal 5′-phosphate-dependent enzyme, L-serine dehydratase, has been investigated using ab initio quantum mechanical/molecular mechanical (QM/MM) methods. New insights into the chemical steps have been obtained, including the chemical role of the substrate carboxyl group in the Schiff base formation step and a proton-relaying mechanism involving the phosphate of the cofactor in the β-hydroxyl-leaving step. The latter step is barrierless and follows sequentially after the elimination of the α-proton, leading to a single but sequential α,β-elimination step. The rate-limiting transition state is specifically stabilized by the enzyme environment. At this transition state, charges are localized on the substrate carboxyl group, as well as on the amino group of Lys41. Specific interactions of the enzyme environment with these groups are able to lower the activation barrier significantly. One major difficulty associated with studies of complicated enzymatic reactions using ab initio QM/MM models is the appropriate choice of reaction coordinates. In this study, we have made use of efficient semiempirical models and pathway optimization techniques to overcome this difficulty.
Co-reporter:Xiaoqun Zhou, Peng Xiong, Meng Wang, Rongsheng Ma, Jiahai Zhang, Quan Chen, Haiyan Liu
Journal of Structural Biology (December 2016) Volume 196(Issue 3) pp:350-357
Publication Date(Web):1 December 2016
DOI:10.1016/j.jsb.2016.08.002
We report that using mainly a statistical energy model, protein sequence design for designable backbones can be carried out with high confidence without considering backbone relaxation. A recently developed statistical energy function for backbone-based protein sequence design has been rationally revised to improve its accuracy. As a demonstrative example, this revised model is applied to design a de novo protein for a target backbone for which the previous model had relied on after-design directed evolution to produce a well-folded protein. The actual backbone structure of the newly designed protein agrees excellently with the corresponding target. Besides presenting a new protein design protocol with experimental verification on different backbone types, our study implies that with an energy model of an appropriate resolution, proteins of well-defined structures instead of molten globules can be designed without the explicit consideration of backbone variations due to side chain changes, even if the side chain changes correspond to complete sequence redesigns.