Co-reporter:Johannes Kirchmair, Mark J. Williamson, Avid M. Afzal, Jonathan D. Tyzack, Alison P. K. Choy, Andrew Howlett, Patrik Rydberg, and Robert C. Glen
Journal of Chemical Information and Modeling November 25, 2013 Volume 53(Issue 11) pp:
Publication Date(Web):November 12, 2013
DOI:10.1021/ci400503s
FAst MEtabolizer (FAME) is a fast and accurate predictor of sites of metabolism (SoMs). It is based on a collection of random forest models trained on diverse chemical data sets of more than 20 000 molecules annotated with their experimentally determined SoMs. Using a comprehensive set of available data, FAME aims to assess metabolic processes from a holistic point of view. It is not limited to a specific enzyme family or species. Besides a global model, dedicated models are available for human, rat, and dog metabolism; specific prediction of phase I and II metabolism is also supported. FAME is able to identify at least one known SoM among the top-1, top-2, and top-3 highest ranked atom positions in up to 71%, 81%, and 87% of all cases tested, respectively. These prediction rates are comparable to or better than SoM predictors focused on specific enzyme families (such as cytochrome P450s), despite the fact that FAME uses only seven chemical descriptors. FAME covers a very broad chemical space, which together with its inter- and extrapolation power makes it applicable to a wide range of chemicals. Predictions take less than 2.5 s per molecule in batch mode on an Ultrabook. Results are visualized using Jmol, with the most likely SoMs highlighted.
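The top-1, top-2, and top-3 rates quoted in this abstract can be read as the fraction of molecules for which at least one known SoM appears among the k highest-ranked atoms. The following is a minimal sketch of that metric only, not FAME's own code; the function name, the data layout, and the toy molecules are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def top_k_success_rate(
    ranked_atoms: Dict[str, List[int]],
    known_soms: Dict[str, Set[int]],
    k: int,
) -> float:
    """Fraction of molecules with at least one known SoM in the top-k ranked atoms.

    ranked_atoms maps a molecule id to atom indices sorted from most to
    least likely SoM; known_soms maps the same ids to the annotated SoMs.
    """
    hits = 0
    for mol_id, ranking in ranked_atoms.items():
        # A molecule counts as a hit if any of its k best-ranked atoms is a known SoM.
        if set(ranking[:k]) & known_soms[mol_id]:
            hits += 1
    return hits / len(ranked_atoms)


# Hypothetical toy data: three molecules with made-up rankings and annotations.
ranked = {
    "mol_a": [3, 7, 1, 4],
    "mol_b": [2, 5, 0],
    "mol_c": [6, 1, 8],
}
known = {
    "mol_a": {7},
    "mol_b": {2, 9},
    "mol_c": {8},
}

rates = {k: top_k_success_rate(ranked, known, k) for k in (1, 2, 3)}
print(rates)
```

Because a molecule needs only one annotated SoM in the top k to count as a hit, the rate can never decrease as k grows, which matches the increasing 71%, 81%, 87% pattern reported above.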
Co-reporter:Alexios Koutsoukas, Robert Lowe, Yasaman KalantarMotamedi, Hamse Y. Mussa, Werner Klaffke, John B. O. Mitchell, Robert C. Glen, and Andreas Bender
Journal of Chemical Information and Modeling 2013 Volume 53(Issue 8) pp:1957-1966
Publication Date(Web):July 8, 2013
DOI:10.1021/ci300435j
In this study, two probabilistic machine-learning algorithms were compared for in silico target prediction of bioactive molecules, namely the well-established Laplacian-modified Naïve Bayes classifier (NB) and the more recently introduced (to Cheminformatics) Parzen-Rosenblatt Window. Both classifiers were trained in conjunction with circular fingerprints on a large data set of bioactive compounds extracted from ChEMBL, covering 894 human protein targets with more than 155,000 ligand-protein pairs. This data set is also provided as a benchmark data set for future target prediction methods due to its size as well as the number of bioactivity classes it contains. In addition to evaluating the methods, different performance measures were explored. This is not as straightforward as in binary classification settings, due to the number of classes, the possibility of multiple class memberships, and the need to translate model scores into “yes/no” predictions for assessing model performance. Both algorithms achieved a recall of correct targets that exceeds 80% in the top 1% of predictions. Performance depends significantly on the underlying diversity and size of a given class of bioactive compounds, with small classes and low structural similarity affecting both algorithms to different degrees. When tested on an external test set extracted from WOMBAT covering more than 500 targets by excluding all compounds with Tanimoto similarity above 0.8 to compounds from the ChEMBL data set, the current methodologies achieved a recall of 63.3% and 66.6% among the top 1% for Naïve Bayes and Parzen-Rosenblatt Window, respectively. While those numbers seem to indicate lower performance, they are also more realistic for settings where protein targets need to be established for novel chemical substances.
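The external test set described above was built by dropping every compound whose Tanimoto similarity to a ChEMBL training compound exceeds 0.8. The sketch below illustrates that kind of filter under simplifying assumptions: fingerprints are represented as plain sets of on-bit indices, and the function names and toy fingerprints are hypothetical rather than taken from the paper.

```python
from typing import FrozenSet, List

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def filter_external(
    external: List[Fingerprint],
    training: List[Fingerprint],
    threshold: float = 0.8,
) -> List[Fingerprint]:
    """Keep external compounds whose similarity to every training compound is <= threshold."""
    return [
        fp
        for fp in external
        if all(tanimoto(fp, ref) <= threshold for ref in training)
    ]


# Hypothetical toy fingerprints.
training_set = [frozenset({1, 2, 3, 4, 5})]
external_set = [
    frozenset({1, 2, 3, 4, 5, 6}),  # Tanimoto 5/6 to the training compound: removed
    frozenset({1, 2, 9, 10}),       # Tanimoto 2/7 to the training compound: kept
]

kept = filter_external(external_set, training_set)
print(kept)
```

Filtering this way removes near-duplicates of the training data, which is why the resulting recall figures are lower but closer to what can be expected for novel chemical matter.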
Co-reporter:Johannes Kirchmair, Andrew Howlett, Julio E. Peironcely, Daniel S. Murrell, Mark J. Williamson, Samuel E. Adams, Thomas Hankemeier, Leo van Buren, Guus Duchateau, Werner Klaffke, and Robert C. Glen
Journal of Chemical Information and Modeling 2013 Volume 53(Issue 2) pp:354-367
Publication Date(Web):January 25, 2013
DOI:10.1021/ci300487z
Understanding which physicochemical properties, or property distributions, are favorable for successful design and development of drugs, nutritional supplements, cosmetics, and agrochemicals is of great importance. In this study we have analyzed molecules from three distinct chemical spaces, (i) approved drugs, (ii) human metabolites, and (iii) traditional Chinese medicine (TCM), to investigate four aspects determining the disposition of small organic molecules. First, we examined the physicochemical properties of these three classes of molecules and identified characteristic features resulting from their distinctive biological functions. For example, human metabolites and TCM molecules can be larger and more hydrophobic than drugs, which makes them less likely to cross membranes. We then quantified the shifts in physicochemical property space induced by metabolism from a holistic perspective by analyzing a data set of several thousand experimentally observed metabolic trees. Results show how the metabolic system aims to retain nutrients/micronutrients while facilitating a rapid elimination of xenobiotics. In the third part we compared these global shifts with the contributions made by individual metabolic reactions. For better resolution, all reactions were classified into phase I and phase II biotransformations. Interestingly, not all metabolic reactions lead to more hydrophilic molecules. We were able to identify biotransformations leading to an increase of logP by more than one log unit, which could be used for the design of drugs with enhanced efficacy. The study closes with the analysis of the physicochemical properties of metabolites found in the bile, faeces, and urine. Metabolites in the bile can be large and are often negatively charged. Molecules with molecular weight >500 Da are rarely found in the urine, and most of these large molecules are charged phase II conjugates.
Co-reporter:Jonathan D. Tyzack, Mark J. Williamson, Rubben Torella, and Robert C. Glen
Journal of Chemical Information and Modeling 2013 Volume 53(Issue 6) pp:1294-1305
Publication Date(Web):May 24, 2013
DOI:10.1021/ci400058s
Metabolism of xenobiotic and endogenous compounds is frequently complex, not completely elucidated, and therefore often ambiguous. The prediction of sites of metabolism (SoM) can be particularly helpful as a first step toward the identification of metabolites, a process especially relevant to drug discovery. This paper describes a reactivity approach for predicting SoM whereby reactivity is derived directly from the ground state ligand molecular orbital analysis, calculated using Density Functional Theory, using a novel implementation of the average local ionization energy. Thus each potential SoM is sampled in the context of the whole ligand, in contrast to other popular approaches where activation energies are calculated for a predefined database of molecular fragments and assigned to matching moieties in a query ligand. In addition, one of the first descriptions of molecular dynamics of cytochrome P450 (CYP) isoforms 3A4, 2D6, and 2C9 in their Compound I state is reported, and, from the representative protein structures obtained, an analysis and evaluation of various docking approaches using GOLD is performed. In particular, a covalent docking approach is described coupled with the modeling of important electrostatic interactions between CYP and ligand using spherical constraints. Combining the docking and reactivity results, obtained using standard functionality from common docking and quantum chemical applications, enables a SoM to be identified in the top 2 predictions for 75%, 80%, and 78% of the data sets for 3A4, 2D6, and 2C9, respectively, results that are accessible and competitive with other recently published prediction tools.
Co-reporter:Johannes Kirchmair; Andrew Howlett; Julio Peironcely
Journal of Cheminformatics 2013 Volume 5(Issue 1 Supplement) pp:
Publication Date(Web):2013 March
DOI:10.1186/1758-2946-5-S1-O12
Co-reporter:Johannes Kirchmair, Mark J. Williamson, Jonathan D. Tyzack, Lu Tan, Peter J. Bond, Andreas Bender, and Robert C. Glen
Journal of Chemical Information and Modeling 2012 Volume 52(Issue 3) pp:617-648
Publication Date(Web):February 17, 2012
DOI:10.1021/ci200542m
Metabolism of xenobiotics remains a central challenge for the discovery and development of drugs, cosmetics, nutritional supplements, and agrochemicals. Metabolic transformations are frequently related to the incidence of toxic effects that may result from the emergence of reactive species, the systemic accumulation of metabolites, or the induction of metabolic pathways. Experimental investigation of the metabolism of small organic molecules is particularly resource demanding; hence, computational methods are of considerable interest to complement experimental approaches. This review provides a broad overview of structure- and ligand-based computational methods for the prediction of xenobiotic metabolism. Current computational approaches to address xenobiotic metabolism are discussed from three major perspectives: (i) prediction of sites of metabolism (SOMs), (ii) elucidation of potential metabolites and their chemical structures, and (iii) prediction of direct and indirect effects of xenobiotics on metabolizing enzymes, where the focus is on the cytochrome P450 (CYP) superfamily of enzymes, the cardinal xenobiotic-metabolizing enzymes. For each of these domains, a variety of approaches and their applications are systematically reviewed, including expert systems, data mining approaches, quantitative structure–activity relationships (QSARs), machine learning-based methods, pharmacophore-based algorithms, shape-focused techniques, molecular interaction fields (MIFs), reactivity-focused techniques, protein–ligand docking, molecular dynamics (MD) simulations, and combinations of methods. Predictive metabolism is a developing area, and there is still enormous potential for improvement. However, it is clear that the combination of rapidly increasing amounts of available ligand- and structure-related experimental data (in particular, quantitative data) with novel and diverse simulation and modeling approaches is accelerating the development of effective tools for prediction of in vivo metabolism, which is reflected by the diverse and comprehensive data sources and methods for metabolism prediction reviewed here. This review attempts to survey the range and scope of computational methods applied to metabolism prediction and also to compare and contrast their applicability and performance.
Co-reporter:Robert Charles Glen
Journal of Computer-Aided Molecular Design 2012 Volume 26(Issue 1) pp:47-49
Publication Date(Web):2012 January
DOI:10.1007/s10822-011-9501-6
Computers have changed the way we do science. Surrounded by a sea of data and with phenomenal computing capacity, the methodology and approach to scientific problems is evolving into a partnership between experiment, theory and data analysis. Given the pace of change of the last twenty-five years, it seems folly to speculate on the future, but along with unpredictable leaps of progress there will be a continuous evolution of capability, which points to opportunities and improvements that will certainly appear as our discipline matures.
Co-reporter:Alexios Koutsoukas, Benjamin Simms, Johannes Kirchmair, Peter J. Bond, Alan V. Whitmore, Steven Zimmer, Malcolm P. Young, Jeremy L. Jenkins, Meir Glick, Robert C. Glen, Andreas Bender
Journal of Proteomics 2011 Volume 74(Issue 12) pp:2554-2574
Publication Date(Web):18 November 2011
DOI:10.1016/j.jprot.2011.05.011
Given the tremendous growth of bioactivity databases, the use of computational tools to predict protein targets of small molecules has been gaining importance in recent years. Applications span a wide range, from the ‘designed polypharmacology’ of compounds to mode-of-action analysis. In this review, we firstly survey databases that can be used for ligand-based target prediction and which have grown tremendously in size in the past. We furthermore outline methods for target prediction that exist, both based on the knowledge of bioactivities from the ligand side and methods that can be applied in situations when a protein structure is known. Applications of successful in silico target identification attempts are discussed in detail, which were based partly or in whole on computational target predictions in the first instance. This includes the authors' own experience using target prediction tools, in this case considering phenotypic antibacterial screens and the analysis of high-throughput screening data. Finally, we will conclude with the prospective application of databases to not only predict, retrospectively, the protein targets of a small molecule, but also how to design ligands with desired polypharmacology in a prospective manner.
Highlights
► Public bioactivity databases are increasing in size at a tremendous rate.
► Integration of chemical and biological data is next key step; first examples exist.
► Target prediction and mode-of-action analysis help us rationalize compound action.
► Incorporating pathway information potentially enables multi-target drug design.
Co-reporter:Dr. N. J. Maximilian Macaluso; Dr. Sarah L. Pitkin; Dr. Janet J. Maguire; Dr. Anthony P. Davenport; Robert C. Glen
ChemMedChem 2011 Volume 6(Issue 6) pp:1017-1023
Publication Date(Web):
DOI:10.1002/cmdc.201100069
The apelin receptor (APJ) is a class A G-protein-coupled receptor (GPCR) and is a putative target for the treatment of cardiovascular and metabolic diseases. Apelin-13 (NH2-QRPRLSHKGPMPF-COOH) is a vasoactive peptide and one of the most potent endogenous inotropic agents identified to date. We report the design and discovery of a novel APJ antagonist. By using a bivalent ligand approach, we have designed compounds with two ‘affinity’ motifs and a short series of linker groups with different conformational and non-bonded interaction properties. One of these, cyclo(1–6)CRPRLC-KH-cyclo(9–14)CRPRLC, is a competitive antagonist at APJ. Radioligand binding in CHO cells transfected with human APJ gave a Ki value of 82 nM, competition binding in human left ventricle gave a KD value of 3.2 μM, and cAMP accumulation assays in CHO-K1-APJ cells gave a KD value of 1.32 μM.
Co-reporter:N. J. Maximilian Macaluso; Robert C. Glen
ChemMedChem 2010 Volume 5(Issue 8) pp:1247-1253
Publication Date(Web):
DOI:10.1002/cmdc.201000061
Co-reporter:Andreas Bender and Robert C. Glen
Organic & Biomolecular Chemistry 2004 vol. 2(Issue 22) pp:3204-3218
Publication Date(Web):14 Oct 2004
DOI:10.1039/B409813G
Molecular Informatics utilises many ideas and concepts to find relationships between molecules. The concept of similarity, where molecules may be grouped according to their biological effects or physicochemical properties, has found extensive use in drug discovery. Some areas of particular interest have been in lead discovery and compound optimisation. For example, in designing libraries of compounds for lead generation, one approach is to design sets of compounds ‘similar’ to known active compounds in the hope that alternative molecular structures are found that maintain the properties required while enhancing, e.g., patentability or medicinal chemistry opportunities, or even achieving optimised pharmacokinetic profiles. Thus the practical importance of the concept of molecular similarity has grown dramatically in recent years. The predominant users are pharmaceutical companies, employing similarity methods in a wide range of applications, e.g. virtual screening, estimation of absorption, distribution, metabolism, excretion and toxicity (ADME/Tox) and prediction of physicochemical properties (solubility, partitioning etc.). In this perspective, we discuss the representation of molecular structure (descriptors), methods of comparing structures and how these relate to measured properties. This leads to the concept of molecular similarity, its various definitions and uses and how these have evolved in recent years. Here, we wish to evaluate and in some cases challenge accepted views and uses of molecular similarity. Molecular similarity, as a paradigm, contains many implicit and explicit assumptions, in particular with respect to the prediction of the binding and efficacy of molecules at biological receptors. The fundamental observation is that molecular similarity has a context which both defines and limits its use. The key issues of solvation effects, heterogeneity of binding sites and the fundamental problem of the form of similarity measure to use are addressed.