Heather A. Carlson

Name: Carlson, Heather
Organization: University of Michigan, USA
Department: Department of Medicinal Chemistry
Title: Professor (PhD)
Co-reporter:Katrina W. Lexa and Heather A. Carlson
Journal of Chemical Information and Modeling February 25, 2013 Volume 53(Issue 2) pp:
Publication Date(Web):January 17, 2013
DOI:10.1021/ci300430v
Computational approaches to fragment-based drug design (FBDD) can complement experiments and facilitate the identification of potential hot spots along the protein surface. However, the evaluation of computational methods for mapping binding sites frequently focuses upon the ability to reproduce crystallographic coordinates to within a low RMSD threshold. This dependency on the deposited coordinate data overlooks the original electron density from the experiment, thus techniques may be developed based upon subjective—or even erroneous—atomic coordinates. This can become a significant drawback in applications to systems where the location of hot spots is unknown. On the basis of comparison to crystallographic density, we previously showed that mixed-solvent molecular dynamics (MixMD) accurately identifies the active site for HEWL, with acetonitrile as an organic solvent. Here, we concentrated on the influence of protic solvent on simulation and refined the optimal MixMD approach for extrapolation of the method to systems without established sites. Our results establish an accurate approach for comparing simulations to experiment. We have outlined the most efficient strategy for MixMD, based on simulation length and number of runs. The development outlined here makes MixMD a robust method which should prove useful across a broad range of target structures. Lastly, our results with MixMD match experimental data so well that consistency between simulations and density may be a useful way to aid the identification of probes vs waters during the refinement of future multiple solvent crystallographic structures.
Co-reporter:Phani Ghanakota
Journal of Computer-Aided Molecular Design 2017 Volume 31( Issue 11) pp:979-993
Publication Date(Web):19 October 2017
DOI:10.1007/s10822-017-0077-7
NMR and X-ray crystallography are the two most widely used methods for determining protein structures. Our previous study examining NMR versus X-Ray sources of protein conformations showed improved performance with NMR structures when used in our Multiple Protein Structures (MPS) method for receptor-based pharmacophores (Damm, Carlson, J Am Chem Soc 129:8225–8235, 2007). However, that work was based on a single test case, HIV-1 protease, because of the rich data available for that system. New data for more systems are available now, which calls for further examination of the effect of different sources of protein conformations. The MPS technique was applied to Growth factor receptor bound protein 2 (Grb2), Src SH2 homology domain (Src-SH2), FK506-binding protein 1A (FKBP12), and Peroxisome proliferator-activated receptor-γ (PPAR-γ). Pharmacophore models from both crystal and NMR ensembles were able to discriminate between high-affinity, low-affinity, and decoy molecules. As we found in our original study, NMR models showed optimal performance when all elements were used. The crystal models had more pharmacophore elements compared to their NMR counterparts. The crystal-based models exhibited optimum performance only when pharmacophore elements were dropped. This supports our assertion that the higher flexibility in NMR ensembles helps focus the models on the most essential interactions with the protein. Our studies suggest that the “extra” pharmacophore elements seen at the periphery in X-ray models arise as a result of decreased protein flexibility and make very little contribution to model performance.
Co-reporter:Phani Ghanakota and Heather A. Carlson
Journal of Medicinal Chemistry 2016 Volume 59(Issue 23) pp:10383-10399
Publication Date(Web):August 3, 2016
DOI:10.1021/acs.jmedchem.6b00399
Identifying binding hotspots on protein surfaces is of prime interest in structure-based drug discovery, either to assess the tractability of pursuing a protein target or to drive improved potency of lead compounds. Computational approaches to detect such regions have traditionally relied on energy minimization of probe molecules onto static protein conformations in the absence of the natural aqueous environment. Advances in high performance computing now allow us to assess hotspots using molecular dynamics (MD) simulations. MD simulations integrate protein flexibility and the complicated role of water, thereby providing a more realistic assessment of the complex kinetics and thermodynamics at play. In this review, we describe the evolution of various cosolvent-based MD techniques and highlight a myriad of potential applications for such technologies in computational drug development.
Co-reporter:Heather A. Carlson; Richard D. Smith; Kelly L. Damm-Ganamet; Jeanne A. Stuckey; Aqeel Ahmed; Maire A. Convery; Donald O. Somers; Michael Kranz; Patricia A. Elkins; Guanglei Cui; Catherine E. Peishoff; Millard H. Lambert; James B. Dunbar Jr.
Journal of Chemical Information and Modeling 2016 Volume 56(Issue 6) pp:1063-1077
Publication Date(Web):May 6, 2016
DOI:10.1021/acs.jcim.5b00523
The 2014 CSAR Benchmark Exercise was the last community-wide exercise that was conducted by the group at the University of Michigan, Ann Arbor. For this event, GlaxoSmithKline (GSK) donated unpublished crystal structures and affinity data from in-house projects. Three targets were used: tRNA (m1G37) methyltransferase (TrmD), Spleen Tyrosine Kinase (SYK), and Factor Xa (FXa). A particularly strong feature of the GSK data is its large size, which lends greater statistical significance to comparisons between different methods. In Phase 1 of the CSAR 2014 Exercise, participants were given several protein–ligand complexes and asked to identify the one near-native pose from among 200 decoys provided by CSAR. Though decoys were requested by the community, we found that they complicated our analysis. We could not discern whether poor predictions were failures of the chosen method or an incompatibility between the participant’s method and the setup protocol we used. This problem is inherent to decoys, and we strongly advise against their use. In Phase 2, participants had to dock and rank/score a set of small molecules given only the SMILES strings of the ligands and a protein structure with a different ligand bound. Overall, docking was a success for most participants, much better in Phase 2 than in Phase 1. However, scoring was a greater challenge. No particular approach to docking and scoring had an edge, and successful methods included empirical, knowledge-based, machine-learning, shape-fitting, and even those with solvation and entropy terms. Several groups were successful in ranking TrmD and/or SYK, but ranking FXa ligands was intractable for all participants. 
Methods that were able to dock well across all submitted systems include MDock, Glide-XP, PLANTS, Wilma, Gold, SMINA, Glide-XP/PELE, FlexX, and MedusaDock. In fact, the submission based on Glide-XP/PELE cross-docked all ligands to many crystal structures, and it was particularly impressive to see success across an ensemble of protein structures for multiple targets. For scoring/ranking, submissions that showed statistically significant achievement include MDock using ITScore with a flexible-ligand term, SMINA using Autodock-Vina, FlexX using HYDE, and Glide-XP using XP DockScore with and without ROCS shape similarity. Of course, these results are for only three protein targets, and many more systems need to be investigated to truly identify which approaches are more successful than others. Furthermore, our exercise is not a competition.
Co-reporter:Richard D. Smith; Kelly L. Damm-Ganamet; James B. Dunbar Jr.; Aqeel Ahmed; Krishnapriya Chinnaswamy; James E. Delproposto; Ginger M. Kubish; Christine E. Tinberg; Sagar D. Khare; Jiayi Dou; Lindsey Doyle; Jeanne A. Stuckey; David Baker
Journal of Chemical Information and Modeling 2016 Volume 56(Issue 6) pp:1022-1031
Publication Date(Web):September 29, 2015
DOI:10.1021/acs.jcim.5b00387
Community Structure–Activity Resource (CSAR) conducted a benchmark exercise to evaluate the current computational methods for protein design, ligand docking, and scoring/ranking. The exercise consisted of three phases. The first phase required the participants to identify and rank order which designed sequences were able to bind the small molecule digoxigenin. The second phase challenged the community to select a near-native pose of digoxigenin from a set of decoy poses for two of the designed proteins. The third phase investigated the ability of current methods to rank/score the binding affinity of 10 related steroids to one of the designed proteins (pKd = 4.1 to 6.7). We found that 11 of 13 groups were able to correctly select the sequence that bound digoxigenin, with most groups providing the correct three-dimensional structure for the backbone of the protein as well as all atoms of the active-site residues. Eleven of the 14 groups were able to select the appropriate pose from a set of plausible decoy poses. The ability to predict absolute binding affinities is still a difficult task, as 8 of 14 groups were able to correlate scores to affinity (Pearson-r > 0.7) of the designed protein for congeneric steroids and only 5 of 14 groups were able to correlate the ranks of the 10 related ligands (Spearman-ρ > 0.7).
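The two correlation thresholds quoted in this abstract (Pearson r > 0.7 on scores, Spearman ρ > 0.7 on ranks) can be illustrated with a minimal sketch. The function names below are hypothetical, and the exercise's actual analysis scripts are not described in the abstract; this only shows the textbook definitions, with tied values given their average rank.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))
```

A group's submission would pass the stated thresholds if `pearson_r(scores, pkd) > 0.7` or `spearman_rho(scores, pkd) > 0.7`, assuming scores are oriented so that larger means tighter binding.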
Co-reporter:Heather A. Carlson
Journal of Chemical Information and Modeling 2016 Volume 56(Issue 6) pp:951-954
Publication Date(Web):June 27, 2016
DOI:10.1021/acs.jcim.6b00182
Co-reporter:Phani Ghanakota and Heather A. Carlson
The Journal of Physical Chemistry B 2016 Volume 120(Issue 33) pp:8685-8695
Publication Date(Web):June 3, 2016
DOI:10.1021/acs.jpcb.6b03515
Mixed-solvent molecular dynamics (MixMD) is a hotspot-mapping technique that relies on molecular dynamics simulations of proteins in binary solvent mixtures. Previous work on MixMD has established the technique’s effectiveness in capturing binding sites of small organic compounds. In this work, we show that MixMD can identify both competitive and allosteric sites on proteins. The MixMD approach embraces full protein flexibility and allows competition between solvent probes and water. Sites preferentially mapped by probe molecules are more likely to be binding hotspots. There are two important requirements for the identification of ligand-binding hotspots: (1) hotspots must be mapped at very high signal-to-noise ratio and (2) the hotspots must be mapped by multiple probe types. We have developed our mapping protocol around acetonitrile, isopropanol, and pyrimidine as probe solvents because they allowed us to capture hydrophilic, hydrophobic, hydrogen-bonding, and aromatic interactions. Charged probes were needed for mapping one target, and we introduce them in this work. In order to demonstrate the robust nature and wide applicability of the technique, a combined total of 5 μs of MixMD was applied across several protein targets known to exhibit allosteric modulation. Most notably, all the protein crystal structures used to initiate our simulations had no allosteric ligands bound, so there was no preorganization of the sites to predispose the simulations to find the allosteric hotspots. The protein test cases were ABL Kinase, Androgen Receptor, CHK1 Kinase, Glucokinase, PDK1 Kinase, Farnesyl Pyrophosphate Synthase, and Protein-Tyrosine Phosphatase 1B. The success of the technique is demonstrated by the fact that the top-four sites solely map the competitive and allosteric sites. Lower-ranked sites consistently map other biologically relevant sites, multimerization interfaces, or crystal-packing interfaces. 
Lastly, we highlight the importance of including protein flexibility by demonstrating that MixMD can map allosteric sites that are not detected in half the systems using FTMap applied to the same crystal structures.
Co-reporter:Peter M. U. Ung;Phani Ghanakota;Sarah E. Graham;Katrina W. Lexa
Biopolymers 2016 Volume 105( Issue 1) pp:21-34
Publication Date(Web):
DOI:10.1002/bip.22742

ABSTRACT

Mixed-solvent molecular dynamics (MixMD) simulations use full protein flexibility and competition between water and small organic probes to achieve accurate hot-spot mapping on protein surfaces. In this study, we improved MixMD using human immunodeficiency virus type-1 protease (HIVp) as the test case. We used three probe–water solutions (acetonitrile–water, isopropanol–water, and pyrimidine–water), first at 50% w/w concentration and later at 5% v/v. Paradoxically, better mapping was achieved by using fewer probes; 5% simulations gave a superior signal-to-noise ratio and far fewer spurious hot spots than 50% MixMD. Furthermore, very intense and well-defined probe occupancies were observed in the catalytic site and potential allosteric sites that have been confirmed experimentally. The Eye site, an allosteric site underneath the flap of HIVp, has been confirmed by the presence of a 5-nitroindole fragment in a crystal structure. MixMD also mapped two additional hot spots: the Exo site (between the Gly16-Gly17 and Cys67-Gly68 loops) and the Face site (between Glu21-Ala22 and Val84-Ile85 loops). The Exo site was observed to overlap with crystallographic additives such as acetate and dimethyl sulfoxide that are present in different crystal forms of the protein. Analysis of crystal structures of HIVp in different symmetry groups has shown that some surface sites are common interfaces for crystal contacts, which means that they are surfaces that are relatively easy to desolvate and complement with organic molecules. MixMD should identify these sites; in fact, their occupancy values help establish a solid cut-off where “druggable” sites are required to have higher occupancies than the crystal-packing faces. © 2015 Wiley Periodicals, Inc. Biopolymers 105: 21–34, 2016.

Co-reporter:Peter M.-U. Ung; James B. Dunbar Jr.; Jason E. Gestwicki
Journal of Medicinal Chemistry 2014 Volume 57(Issue 15) pp:6468-6478
Publication Date(Web):July 25, 2014
DOI:10.1021/jm5008352
NMR and MD simulations have demonstrated that the flaps of HIV-1 protease (HIV-1p) adopt a range of conformations that are coupled with its enzymatic activity. Previously, a model was created for an allosteric site located between the flap and the core of HIV-1p, called the Eye site ( Biopolymers 2008, 89, 643−652). Here, results from our first study were combined with a ligand-based, lead-hopping method to identify a novel compound (NIT). NIT inhibits HIV-1p, independent of the presence of an active-site inhibitor such as pepstatin A. Assays showed that NIT acts on an allosteric site other than the dimerization interface. MD simulations of the ligand–protein complex show that NIT stably binds in the Eye site and restricts the flaps. That bound state of NIT is consistent with a crystal structure of similar fragments bound in the Eye site ( Chem. Biol. Drug Des. 2010, 75, 257−268). Most importantly, NIT is equally potent against wild-type and a multidrug-resistant mutant of HIV-1p, which highlights the promise of allosteric inhibitors circumventing existing clinical resistance.
Co-reporter:Katrina W. Lexa, Garrett B. Goh, and Heather A. Carlson
Journal of Chemical Information and Modeling 2014 Volume 54(Issue 8) pp:2190-2199
Publication Date(Web):July 24, 2014
DOI:10.1021/ci400741u
Probe mapping is a common approach for identifying potential binding sites in structure-based drug design; however, it typically relies on energy minimizations of probes in the gas phase and a static protein structure. The mixed-solvent molecular dynamics (MixMD) approach was recently developed to account for full protein flexibility and solvation effects in hot-spot mapping. Our first study used only acetonitrile as a probe, and here, we have augmented the set of functional group probes through careful testing and parameter validation. A diverse range of probes is needed in order to map complex binding interactions. A small variation in probe parameters can adversely affect mixed-solvent behavior, which we highlight with isopropanol. We tested 11 solvents to identify six with appropriate behavior in TIP3P water to use as organic probes in the MixMD method. In addition to acetonitrile and isopropanol, we have identified acetone, N-methylacetamide, imidazole, and pyrimidine. These probe solvents will enable MixMD studies to recover hydrogen-bonding sites, hydrophobic pockets, protein–protein interactions, and aromatic hotspots. Also, we show that ternary-solvent systems can be incorporated within a single simulation. Importantly, these binary and ternary solvents do not require artificial repulsion terms like other methods. Within merely 5 ns, layered solvent boxes become evenly mixed for soluble probes. We used radial distribution functions to evaluate solvent behavior, determine adequate mixing, and confirm the absence of phase separation. We recommend that radial distribution functions should be used to assess adequate sampling in all mixed-solvent techniques rather than the current practice of examining the solvent ratios at the edges of the solvent box.
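As an illustration of the radial distribution function recommended in this abstract, the following is a minimal single-frame sketch. It assumes a cubic periodic box and point-like probe sites, and the function name and parameters are hypothetical; the authors' actual analysis tools and settings are not given here. A well-mixed solvent would be expected to give g(r) approaching 1 at large r, whereas phase separation would show persistent deviations.

```python
import math


def radial_distribution(sites_a, sites_b, box, r_max, n_bins):
    """g(r) between two sets of points in a cubic periodic box.

    sites_a, sites_b: lists of (x, y, z) coordinates.
    box: cubic box edge length; r_max should not exceed box / 2.
    Returns a list of n_bins values, one per shell of width r_max / n_bins.
    """
    dr = r_max / n_bins
    counts = [0] * n_bins
    for p in sites_a:
        for q in sites_b:
            d2 = 0.0
            for u, v in zip(p, q):
                delta = u - v
                delta -= box * round(delta / box)  # minimum-image convention
                d2 += delta * delta
            d = math.sqrt(d2)
            if 0.0 < d < r_max:  # skip self-pairs at zero distance
                counts[min(int(d / dr), n_bins - 1)] += 1

    density_b = len(sites_b) / box ** 3
    g = []
    for i, count in enumerate(counts):
        # Volume of the spherical shell covering bin i.
        shell = 4.0 / 3.0 * math.pi * (((i + 1) * dr) ** 3 - (i * dr) ** 3)
        # Normalize by the pair count expected for an ideal, uniform solvent.
        g.append(count / (len(sites_a) * density_b * shell))
    return g
```

In practice the histogram would be averaged over many trajectory frames before normalizing; a single frame is used here only to keep the sketch short.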
Co-reporter:Heather A. Carlson
Journal of Chemical Information and Modeling 2013 Volume 53(Issue 8) pp:1837-1841
Publication Date(Web):August 5, 2013
DOI:10.1021/ci4004249
Co-reporter:James B. Dunbar Jr., Richard D. Smith, Kelly L. Damm-Ganamet, Aqeel Ahmed, Emilio Xavier Esposito, James Delproposto, Krishnapriya Chinnaswamy, You-Na Kang, Ginger Kubish, Jason E. Gestwicki, Jeanne A. Stuckey, and Heather A. Carlson
Journal of Chemical Information and Modeling 2013 Volume 53(Issue 8) pp:1842-1852
Publication Date(Web):April 25, 2013
DOI:10.1021/ci4000486
A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) has collected several data sets from industry and added in-house data sets that may be used for this purpose (www.csardock.org). CSAR has currently obtained data from Abbott, GlaxoSmithKline, and Vertex and is working on obtaining data from several others. Combined with our in-house projects, we are providing a data set consisting of 6 protein targets, 647 compounds with biological affinities, and 82 crystal structures. Multiple congeneric series are available for several targets with a few representative crystal structures of each of the series. These series generally contain a few inactive compounds, usually not available in the literature, to provide an upper bound to the affinity range. The affinity ranges are typically 3–4 orders of magnitude per series. For our in-house projects, we have had compounds synthesized for biological testing. Affinities were measured by Thermofluor, Octet RED, and isothermal titration calorimetry for the most soluble. This allows the direct comparison of the biological affinities for those compounds, providing a measure of the variance in the experimental affinity. It appears that there can be considerable variance in the absolute value of the affinity, making the prediction of the absolute value ill-defined. However, the relative rankings within the methods are much better, and this fits with the observation that predicting relative ranking is a more tractable problem computationally. For those in-house compounds, we also have measured the following physical properties: logD, logP, thermodynamic solubility, and pKa. This data set also provides a substantial decoy set for each target consisting of diverse conformations covering the entire active site for all of the 58 CSAR-quality crystal structures. 
The CSAR data sets (CSAR-NRC HiQ and the 2012 release) provide substantial, publicly available, curated data sets for use in parametrizing and validating docking and scoring methods.
Co-reporter:Kelly L. Damm-Ganamet, Richard D. Smith, James B. Dunbar Jr., Jeanne A. Stuckey, and Heather A. Carlson
Journal of Chemical Information and Modeling 2013 Volume 53(Issue 8) pp:1853-1870
Publication Date(Web):April 2, 2013
DOI:10.1021/ci400025f
The Community Structure–Activity Resource (CSAR) recently held its first blinded exercise based on data provided by Abbott, Vertex, and colleagues at the University of Michigan, Ann Arbor. A total of 20 research groups submitted results for the benchmark exercise where the goal was to compare different improvements for pose prediction, enrichment, and relative ranking of congeneric series of compounds. The exercise was built around blinded high-quality experimental data from four protein targets: LpxC, Urokinase, Chk1, and Erk2. Pose prediction proved to be the most straightforward task, and most methods were able to successfully reproduce binding poses when the crystal structure employed was co-crystallized with a ligand from the same chemical series. Multiple evaluation metrics were examined, and we found that RMSD and native contact metrics together provide a robust evaluation of the predicted poses. It was notable that most scoring functions underpredicted contacts between the hetero atoms (i.e., N, O, S, etc.) of the protein and ligand. Relative ranking was found to be the most difficult area for the methods, but many of the scoring functions were able to properly identify Urokinase actives from the inactives in the series. Lastly, we found that minimizing the protein and correcting histidine tautomeric states positively trended with low RMSD for pose prediction but minimizing the ligand negatively trended. Pregenerated ligand conformations performed better than those that were generated on the fly. Optimizing docking parameters and pretraining with the native ligand had a positive effect on the docking performance as did using restraints, substructure fitting, and shape fitting. Lastly, for both sampling and ranking scoring functions, the use of the empirical scoring function appeared to trend positively with the RMSD. 
Here, by combining the results of many methods, we hope to provide a statistically relevant evaluation and elucidate specific shortcomings of docking methodology for the community.
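The two pose-evaluation metrics named in this abstract, RMSD and native contacts, can be sketched as follows. This is a minimal illustration under stated assumptions: the predicted and crystallographic ligand share the same atom ordering and the same protein frame (so no superposition is done), and a contact is any ligand-protein atom pair within a 4.0 Å cutoff. The cutoff and function names are hypothetical; the exercise's own contact definition is not given in the abstract.

```python
import math


def rmsd(predicted, reference):
    """In-place RMSD between two poses with identical atom ordering."""
    if len(predicted) != len(reference):
        raise ValueError("poses must contain the same number of atoms")
    total = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(total / len(reference))


def contacts(ligand, protein, cutoff=4.0):
    """Set of (ligand atom index, protein atom index) pairs within cutoff."""
    return {
        (i, j)
        for i, l_atom in enumerate(ligand)
        for j, p_atom in enumerate(protein)
        if math.dist(l_atom, p_atom) <= cutoff
    }


def native_contact_fraction(predicted, reference, protein, cutoff=4.0):
    """Fraction of crystallographic contacts reproduced by the predicted pose."""
    native = contacts(reference, protein, cutoff)
    if not native:
        return 0.0
    return len(native & contacts(predicted, protein, cutoff)) / len(native)
```

Using both numbers together guards against a pose that sits close to the crystal coordinates on average but misses the specific interactions, or the reverse.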
Co-reporter:Richard D. Smith, Alaina L. Engdahl, James B. Dunbar Jr., and Heather A. Carlson
Journal of Chemical Information and Modeling 2012 Volume 52(Issue 8) pp:2098-2106
Publication Date(Web):June 19, 2012
DOI:10.1021/ci200612f
In classic work, Kuntz et al. (Proc. Nat. Acad. Sci. USA 1999, 96, 9997–10002) introduced the concept of ligand efficiency. Though that study focused primarily on drug-like molecules, it also showed that metal binding led to the greatest ligand efficiencies. Here, the physical limits of binding are examined across the wide variety of small molecules in the Binding MOAD database. The complexes with the greatest ligand efficiencies share the trait of being small, charged ligands bound in highly charged, well buried binding sites. The limit of ligand efficiency is −1.75 kcal/mol·atom for the protein–ligand complexes within Binding MOAD, and 95% of the set have efficiencies below a “soft limit” of −0.83 kcal/mol·atom. On the basis of buried molecular surface area, the hard limit of ligand efficiency is −117 cal/mol·Å², which is in surprising agreement with the limit of macromolecule–protein binding. Close examination of the most efficient systems reveals their incredibly high efficiency is dictated by tight contacts between the charged groups of the ligand and the pocket. In fact, a misfit of 0.24 Å in the average contacts inherently decreases the maximum possible efficiency by at least 0.1 kcal/mol·atom.
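A per-atom ligand efficiency such as the values quoted in this abstract can be sketched from a dissociation constant. This assumes the common definition, binding free energy ΔG = RT ln Kd divided by the number of non-hydrogen atoms, at an assumed temperature of 298.15 K; the abstract does not spell out the exact procedure used in the paper, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temperature * math.log(kd_molar)


def ligand_efficiency(kd_molar, n_heavy_atoms, temperature=298.15):
    """Free energy per non-hydrogen atom, in kcal/mol per atom."""
    if n_heavy_atoms <= 0:
        raise ValueError("n_heavy_atoms must be positive")
    return binding_free_energy(kd_molar, temperature) / n_heavy_atoms


def is_beyond_soft_limit(efficiency, soft_limit=-0.83):
    """True when the efficiency is more favorable than the quoted soft limit."""
    return efficiency < soft_limit
```

For example, a hypothetical 1 nM ligand with 30 heavy atoms gives roughly −0.41 kcal/mol·atom under these assumptions, well inside the −0.83 soft limit quoted in the abstract.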
Co-reporter:Heather A. Carlson and James B. Dunbar Jr.
Journal of Chemical Information and Modeling 2011 Volume 51(Issue 9) pp:2025-2026
Publication Date(Web):September 26, 2011
DOI:10.1021/ci200398g
Co-reporter:James B. Dunbar Jr., Richard D. Smith, Chao-Yie Yang, Peter Man-Un Ung, Katrina W. Lexa, Nickolay A. Khazanov, Jeanne A. Stuckey, Shaomeng Wang, and Heather A. Carlson
Journal of Chemical Information and Modeling 2011 Volume 51(Issue 9) pp:2036-2046
Publication Date(Web):July 5, 2011
DOI:10.1021/ci200082t
A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) aims to collect available data from industry and academia which may be used for this purpose (www.csardock.org). Also, CSAR is charged with organizing community-wide exercises based on the collected data. The first of these exercises was aimed to gauge the overall state of docking and scoring, using a large and diverse data set of protein–ligand complexes. Participants were asked to calculate the affinity of the complexes as provided and then recalculate with changes which may improve their specific method. This first data set was selected from existing PDB entries which had binding data (Kd or Ki) in Binding MOAD, augmented with entries from PDBbind. The final data set contains 343 diverse protein–ligand complexes and spans 14 pKd. Sixteen proteins have three or more complexes in the data set, from which a user could start an inspection of congeneric series. Inherent experimental error limits the possible correlation between scores and measured affinity; R² is limited to ∼0.9 when fitting to the data set without overparametrizing. R² is limited to ∼0.8 when scoring the data set with a method trained on outside data. The details of how the data set was initially selected, and the process by which it matured to better fit the needs of the community are presented. Many groups generously participated in improving the data set, and this underscores the value of a supportive, collaborative effort in moving our field forward.
Co-reporter:Richard D. Smith, James B. Dunbar Jr., Peter Man-Un Ung, Emilio X. Esposito, Chao-Yie Yang, Shaomeng Wang, and Heather A. Carlson
Journal of Chemical Information and Modeling 2011 Volume 51(Issue 9) pp:2115-2131
Publication Date(Web):August 3, 2011
DOI:10.1021/ci200269q
As part of the Community Structure-Activity Resource (CSAR) center, a set of 343 high-quality, protein–ligand crystal structures were assembled with experimentally determined Kd or Ki information from the literature. We encouraged the community to score the crystallographic poses of the complexes by any method of their choice. The goal of the exercise was to (1) evaluate the current ability of the field to predict activity from structure and (2) investigate the properties of the complexes and methods that appear to hinder scoring. A total of 19 different methods were submitted with numerous parameter variations for a total of 64 sets of scores from 16 participating groups. Linear regression and nonparametric tests were used to correlate scores to the experimental values. Correlation to experiment for the various methods ranged R² = 0.58–0.12, Spearman ρ = 0.74–0.37, Kendall τ = 0.55–0.25, and median unsigned error = 1.00–1.68 pKd units. All types of scoring functions (force field based, knowledge based, and empirical) had examples with high and low correlation, showing no bias/advantage for any particular approach. The data across all the participants were combined to identify 63 complexes that were poorly scored across the majority of the scoring methods and 123 complexes that were scored well across the majority. The two sets were compared using a Wilcoxon rank-sum test to assess any significant difference in the distributions of >400 physicochemical properties of the ligands and the proteins. Poorly scored complexes were found to have ligands that were the same size as those in well-scored complexes, but hydrogen bonding and torsional strain were significantly different. These comparisons point to a need for CSAR to develop data sets of congeneric series with a range of hydrogen-bonding and hydrophobic characteristics and a range of rotatable bonds.
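Two of the nonparametric measures quoted in this abstract, Kendall τ and the median unsigned error, can be sketched in a few lines. This is a minimal illustration with hypothetical function names: it uses the simple tau-a form without a tie correction and assumes predictions are already expressed in pKd units, neither of which is confirmed by the abstract.

```python
from statistics import median


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def median_unsigned_error(predicted, experimental):
    """Median absolute difference between predicted and measured values."""
    return median(abs(p - e) for p, e in zip(predicted, experimental))
```

Because τ depends only on pair ordering, it is insensitive to the scale of a scoring function, which makes it a reasonable companion to R² when scores are not in affinity units.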
Co-reporter:Katrina W. Lexa
Journal of the American Chemical Society 2010 Volume 133(Issue 2) pp:200-202
Publication Date(Web):December 15, 2010
DOI:10.1021/ja1079332
A traditional technique for structure-based drug design (SBDD) is mapping of protein surfaces with probe molecules to identify “hot spots” where key functional groups can best complement the receptor. Common methods, such as minimization of probes or calculation of grids, use a fixed protein structure in the gas phase, ignoring both protein flexibility and proper competition between the probes and water. As a result, the potential surface is quite rugged, and many spurious local minima are identified. In this work, we compared rigid and fully flexible proteins in mixed-solvent molecular dynamics, which allows for flexibility and full solvent effects. We were surprised to find that the large number of local minima are still found when a protein’s conformational sampling is restricted; the dynamic averaging of probes and competition with water do not smooth the potential surface as one might expect. Only when a protein is allowed to be fully flexible in the simulation are the proper minima located and the spurious ones eliminated. Our results indicate that inclusion of full protein flexibility is critical to accurate hot-spot mapping for SBDD.
Co-reporter:Heather A. Carlson; Richard D. Smith; Nickolay A. Khazanov; Paul D. Kirchhoff; James B. Dunbar Jr.; Mark L. Benson
Journal of Medicinal Chemistry 2008 Volume 51(Issue 20) pp:6432-6441
Publication Date(Web):October 1, 2008
DOI:10.1021/jm8006504
Physical differences in small molecule binding between enzymes and nonenzymes were found through mining the protein−ligand database, Binding MOAD (Mother of All Databases). The data suggest that divergent approaches may be more productive for improving the affinity of ligands for the two classes of proteins. High-affinity ligands of enzymes are much larger than those with low affinity, indicating that the addition of complementary functional groups is likely to improve the affinity of an enzyme inhibitor. However, this process may not be as fruitful for ligands of nonenzymes. High- and low-affinity ligands of nonenzymes are nearly the same size, so modest modifications and isosteric replacement might be most productive. The inherent differences between enzymes and nonenzymes have significant ramifications for scoring functions and structure-based drug design. In particular, nonenzymes were found to have greater ligand efficiencies than enzymes. Ligand efficiencies are often used to indicate druggability of a target, and this finding supports the feasibility of nonenzymes as drug targets. The differences in ligand efficiencies do not appear to come from the ligands; instead, the pockets yield different amino acid compositions despite very similar distributions of amino acids in the overall protein sequences.
Co-reporter:Michael G. Lerner;Kristin L. Meagher
Journal of Computer-Aided Molecular Design 2008 Volume 22( Issue 10) pp:727-736
Publication Date(Web):2008 October
DOI:10.1007/s10822-008-9231-6
Use of solvent mapping, based on multiple-copy minimization (MCM) techniques, is common in structure-based drug discovery. The minima of small-molecule probes define locations for complementary interactions within a binding pocket. Here, we present improved methods for MCM. In particular, a Jarvis–Patrick (JP) method is outlined for grouping the final locations of minimized probes into physical clusters. This algorithm has been tested through a study of protein–protein interfaces, showing the process to be robust, deterministic, and fast in the mapping of protein “hot spots.” Improvements in the initial placement of probe molecules are also described. A final application to HIV-1 protease shows how our automated technique can be used to partition data too complicated to analyze by hand. These new automated methods may be easily and quickly extended to other protein systems, and our clustering methodology may be readily incorporated into other clustering packages.
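The Jarvis–Patrick grouping mentioned in this abstract can be sketched in its textbook form: two points join the same cluster when each appears in the other's list of k nearest neighbors and the two lists share at least `k_min` members. The parameter values, the function name, and the use of plain 3D points for probe positions are assumptions made for illustration; the paper's specific variant and settings are not given in the abstract.

```python
import math


def jarvis_patrick(points, k=3, k_min=1):
    """Cluster points by shared nearest neighbors.

    Returns a sorted list of clusters, each a list of point indices.
    """
    n = len(points)

    # k nearest neighbors of every point, excluding the point itself.
    neighbors = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        neighbors.append(set(others[:k]))

    # Union-find structure used to merge linked points into clusters.
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    for i in range(n):
        for j in range(i + 1, n):
            mutual = j in neighbors[i] and i in neighbors[j]
            if mutual and len(neighbors[i] & neighbors[j]) >= k_min:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

The procedure has no random initialization, so repeated runs on the same minimized probe positions give the same clusters, consistent with the deterministic behavior the abstract describes.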
Co-reporter:Kelly L. Damm;Peter M. U. Ung;Jerome J. Quintero;Jason E. Gestwicki
Biopolymers 2008 Volume 89( Issue 8) pp:643-652
Publication Date(Web):
DOI:10.1002/bip.20993

Abstract

A novel mechanism of inhibiting HIV-1 protease (HIVp) is presented. Using computational solvent mapping to identify complementary interactions and the Multiple Protein Structure method to incorporate protein flexibility, we generated a receptor-based pharmacophore model of the flexible flap region of the semiopen, apo state of HIVp. Complementary interactions were consistently observed at the base of the flap, only within a cleft with a specific structural role. In the closed, bound state of HIVp, each flap tip docks against the opposite monomer, occupying this cleft. This flap-recognition site is filled by the protein and cannot be identified using traditional approaches based on bound, closed structures. Virtual screening and dynamics simulations show how small molecules can be identified to complement this cleft. Subsequent experimental testing confirms inhibitory activity of this new class of inhibitor. This may be the first new inhibitor class for HIVp since dimerization inhibitors were introduced 17 years ago. © 2008 Wiley Periodicals, Inc. Biopolymers 89: 643–652, 2008.

Co-reporter:Richard D. Smith, Liegi Hu, Jayson A. Falkner, Mark L. Benson, Jason P. Nerothin, Heather A. Carlson
Journal of Molecular Graphics and Modelling 2006 Volume 24(Issue 6) pp:414-425
Publication Date(Web):May 2006
DOI:10.1016/j.jmgm.2005.08.002
We have recently announced the largest database of protein–ligand complexes, Binding MOAD (Mother of All Databases). After the August 2004 update, Binding MOAD contains 6816 complexes. There are 2220 protein families and 3316 unique ligands. After searching 6000+ crystallography papers, we have obtained binding data for 1793 (27%) of the complexes. We have also created a non-redundant set of complexes with only one complex from each protein family; in that set, 630 (28%) of the unique complexes have binding data. Here, we present information about the data provided at the Binding MOAD website. We also present the results of mining Binding MOAD to map the degree of solvent exposure for binding sites. We have determined that most cavities and ligands (70–85%) are well buried in the complexes. This fits with the common paradigm that a large degree of contact between the ligand and protein is significant in molecular recognition. GoCAV and the GoCAVviewer are the tools we created for this study. To share our data and make our online dataset more useful to other research groups, we have integrated the viewer into the Binding MOAD website (www.BindingMOAD.org).
1-Cyclohexene-1-carboxylicacid, 4-(acetylamino)-5-amino-3-(1-ethylpropoxy)-, (3R,4R,5S)-
5BETA-CHOLANIC ACID 3,7-DIONE METHYL ESTER
Card-20(22)-enolide,3-(acetyloxy)-14-hydroxy-, (3b,5b)-
Card-20(22)-enolide,3,14,16-trihydroxy-, (3b,5b,16b)-
Card-20(22)-enolide,3-[(6-deoxy-a-L-mannopyranosyl)oxy]-5,14-dihydroxy-19-oxo-,(3b,5b)-