John B. O. Mitchell

Name:
Organization: University of St Andrews, Scotland
Department: Biomedical Sciences Research Complex and EaStCHEM School of Chemistry
Title: Reader (PhD)
Co-reporter:James L. McDonagh, Neetika Nath, Luna De Ferrari, Tanja van Mourik, and John B. O. Mitchell
Journal of Chemical Information and Modeling March 24, 2014 Volume 54(Issue 3) pp:
Publication Date(Web):February 24, 2014
DOI:10.1021/ci4005805
We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure–property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ∼1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9–1.0 log S units.
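The abstract above reports RMSE values from a 10-fold cross-validation. As a rough illustration of how such a figure is obtained, the sketch below pools held-out predictions over the folds and computes one RMSE. It is not the authors' pipeline: the single-descriptor least-squares model, the helper names (`rmse`, `fit_line`, `cross_validated_rmse`) and the synthetic data are all assumptions made for the example, standing in for the machine learning models and the DLS-100 dataset.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b (stand-in for a real QSPR model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def cross_validated_rmse(xs, ys, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, and pool all predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(a * xs[i] + b)
    return rmse(y_true, y_pred)


# Synthetic toy data (NOT from the paper): 100 "molecules", one descriptor each.
rng = random.Random(42)
descriptor = [rng.uniform(-2.0, 2.0) for _ in range(100)]
log_s = [-1.5 * x - 3.0 + rng.gauss(0.0, 1.0) for x in descriptor]
print(round(cross_validated_rmse(descriptor, log_s), 2))
```

Because every molecule is predicted exactly once by a model that never saw it during fitting, the pooled RMSE estimates out-of-sample error rather than goodness of fit.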
Co-reporter:R. E. Skyner;J. B. O. Mitchell;C. R. Groom
CrystEngComm (1999-Present) 2017 vol. 19(Issue 4) pp:641-652
Publication Date(Web):January 23, 2017
DOI:10.1039/C6CE02119K
The abundance of crystal structures of solvated organic molecules reflects the common role of solvent in the crystallisation process. An understanding of solvation is therefore important for crystal engineering, with solvent choice often affecting polymorphism as well as influencing the crystal structure. Of particular importance is the role of water, and a number of approaches have previously been considered in the analysis of large datasets of organic hydrates. In this work we attempt to develop a method suitable for application to organic hydrate crystal structures, in order to better understand the distribution of water molecules in such systems. We present a model aimed at combining the distribution functions of multiple atom pairs from a number of crystal structures. From this, we can comment qualitatively on the average distribution of water in organic hydrates.
Co-reporter:James L. McDonagh, David S. Palmer, Tanja van Mourik, and John B. O. Mitchell
Journal of Chemical Information and Modeling 2016 Volume 56(Issue 11) pp:2162-2179
Publication Date(Web):October 17, 2016
DOI:10.1021/acs.jcim.6b00033
We compare a range of computational methods for the prediction of sublimation thermodynamics (enthalpy, entropy, and free energy of sublimation). These include a model from theoretical chemistry that utilizes crystal lattice energy minimization (with the DMACRYS program) and quantitative structure property relationship (QSPR) models generated by both machine learning (random forest and support vector machines) and regression (partial least squares) methods. Using these methods we investigate the predictability of the enthalpy, entropy and free energy of sublimation, with consideration of whether such a method may be able to improve solubility prediction schemes. Previous work has suggested that the major source of error in solubility prediction schemes involving a thermodynamic cycle via the solid state is in the modeling of the free energy change away from the solid state. Yet contrary to this conclusion other work has found that the inclusion of terms such as the enthalpy of sublimation in QSPR methods does not improve the predictions of solubility. We suggest the use of theoretical chemistry terms, detailed explicitly in the Methods section, as descriptors for the prediction of the enthalpy and free energy of sublimation. A data set of 158 molecules with experimental sublimation thermodynamics values and some CSD refcodes has been collected from the literature and is provided with their original source references.
Co-reporter:R. E. Skyner, J. L. McDonagh, C. R. Groom, T. van Mourik and J. B. O. Mitchell  
Physical Chemistry Chemical Physics 2015 vol. 17(Issue 9) pp:6174-6191
Publication Date(Web):January 23, 2015
DOI:10.1039/C5CP00288E
Over the past decade, pharmaceutical companies have seen a decline in the number of drug candidates successfully passing through clinical trials, though billions are still spent on drug development. Poor aqueous solubility leads to low bio-availability, reducing pharmaceutical effectiveness. The human cost of inefficient drug candidate testing is of great medical concern, with fewer drugs making it to the production line, slowing the development of new treatments. In biochemistry and biophysics, water mediated reactions and interactions within active sites and protein pockets are an active area of research, in which methods for modelling solvated systems are continually pushed to their limits. Here, we discuss a multitude of methods aimed towards solvent modelling and solubility prediction, aiming to inform the reader of the options available, and outlining the various advantages and disadvantages of each approach.
Co-reporter:David S. Palmer and John B. O. Mitchell
Molecular Pharmaceutics 2014 Volume 11(Issue 8) pp:2962-2972
Publication Date(Web):June 11, 2014
DOI:10.1021/mp500103r
We report the results of testing quantitative structure–property relationships (QSPR) that were trained upon the same druglike molecules but two different sets of solubility data: (i) data extracted from several different sources from the published literature, for which the experimental uncertainty is estimated to be 0.6–0.7 log S units (referred to mol/L); (ii) data measured by a single accurate experimental method (CheqSol), for which experimental uncertainty is typically <0.05 log S units. Contrary to what might be expected, the models derived from the CheqSol experimental data are not more accurate than those derived from the “noisy” literature data. The results suggest that, at the present time, it is the deficiency of QSPR methods (algorithms and/or descriptor sets), and not, as is commonly quoted, the uncertainty in the experimental measurements, which is the limiting factor in accurately predicting aqueous solubility for pharmaceutical molecules.
Keywords: ADME; ADMET; bioavailability; CheqSol; crystal; dissolution; druglike; experimental error; general solubility equation; Henderson−Hasselbalch; machine learning; Noyes−Whitney; pharmaceutical; polymorph; QSAR; QSPR; Random Forest; rule-of-five; solubility
Co-reporter:Rosanna G. Alderson;Daniel Barker
Journal of Molecular Evolution 2014 Volume 79(Issue 3-4) pp:117-129
Publication Date(Web):October 2014
DOI:10.1007/s00239-014-9639-7
Bacteria use metallo-β-lactamase enzymes to hydrolyse lactam rings found in many antibiotics, rendering them ineffective. Metallo-β-lactamase activity is thought to be polyphyletic, having arisen on more than one occasion within a single functionally diverse homologous superfamily. Since discovery of multiple origins of enzymatic activity conferring antibiotic resistance has broad implications for the continued clinical use of antibiotics, we test the hypothesis of polyphyly further; if lactamase function has arisen twice independently, the most recent common ancestor (MRCA) is not expected to possess lactam-hydrolysing activity. Two major problems present themselves. Firstly, even with a perfectly known phylogeny, ancestral sequence reconstruction is error prone. Secondly, the phylogeny is not known, and in fact reconstructing a single, unambiguous phylogeny for the superfamily has proven impossible. To obtain a more statistical view of the strength of evidence for or against MRCA lactamase function, we reconstructed a sample of 98 MRCAs of the metallo-β-lactamases, each based on a different tree in a bootstrap sample of reconstructed phylogenies. InterPro sequence signatures and homology modelling were then used to assess our sample of MRCAs for lactamase functionality. Only 5 % of these models conform to our criteria for metallo-β-lactamase functionality, suggesting that the ancestor was unlikely to have been a metallo-β-lactamase. On the other hand, given that ancestral proteins may have had metallo-β-lactamase functionality with variation in sequence and structural properties compared with extant enzymes, our criteria are conservative, estimating a lower bound of evidence for metallo-β-lactamase functionality but not an upper bound.
Co-reporter:David S. Palmer, James L. McDonagh, John B. O. Mitchell, Tanja van Mourik, and Maxim V. Fedorov
Journal of Chemical Theory and Computation 2012 Volume 8(Issue 9) pp:3322-3337
Publication Date(Web):July 25, 2012
DOI:10.1021/ct300345m
We demonstrate that the intrinsic aqueous solubility of crystalline druglike molecules can be estimated with reasonable accuracy from sublimation free energies calculated using crystal lattice simulations and hydration free energies calculated using the 3D Reference Interaction Site Model (3D-RISM) of the Integral Equation Theory of Molecular Liquids (IET). The solubilities of 25 crystalline druglike molecules taken from different chemical classes are predicted by the model with a correlation coefficient of R = 0.85 and a root mean square error (RMSE) equal to 1.45 log10S units, which is significantly more accurate than results obtained using implicit continuum solvent models. The method is not directly parametrized against experimental solubility data, and it offers a full computational characterization of the thermodynamics of transfer of the drug molecule from crystal phase to gas phase to dilute aqueous solution.
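The thermodynamic route described above (crystal to gas phase to dilute aqueous solution) can be sketched numerically. The example below is a minimal illustration, not the paper's implementation: it assumes both free energies are given in kcal/mol and referred to the same 1 mol/L standard state, so that the solution free energy is their sum and log10 S follows from ΔG_sol = −RT ln S. The function names and the input values are hypothetical, and the paper's exact standard-state convention is not stated in the abstract.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def solution_free_energy(dg_sub, dg_hyd):
    """Close the cycle: crystal -> gas (sublimation) plus gas -> water (hydration)."""
    return dg_sub + dg_hyd


def log10_solubility(dg_sub, dg_hyd, temperature=298.15):
    """log10 of solubility in mol/L, assuming a shared 1 mol/L standard state."""
    dg_sol = solution_free_energy(dg_sub, dg_hyd)
    return -dg_sol / (R_KCAL * temperature * math.log(10))


# Hypothetical inputs in kcal/mol, chosen only to show the arithmetic.
print(round(log10_solubility(dg_sub=15.0, dg_hyd=-11.0), 2))
```

At 298.15 K each additional RT ln 10 (about 1.36 kcal/mol) of solution free energy lowers the predicted solubility by one log unit, so modest errors in either leg of the cycle translate into sizeable errors in log S.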
Co-reporter:Pedro J. Ballester and John B. O. Mitchell
Journal of Chemical Information and Modeling 2011 Volume 51(Issue 8) pp:1739-1741
Publication Date(Web):May 18, 2011
DOI:10.1021/ci200057e
Co-reporter:Robert Lowe, Robert C. Glen, and John B. O. Mitchell
Molecular Pharmaceutics 2010 Volume 7(Issue 5) pp:1708-1714
Publication Date(Web):August 27, 2010
DOI:10.1021/mp100103e
Phospholipidosis is an adverse effect caused by numerous cationic amphiphilic drugs and can affect many cell types. It is characterized by the excess accumulation of phospholipids and is most reliably identified by electron microscopy of cells revealing the presence of lamellar inclusion bodies. The development of phospholipidosis can cause a delay in the drug development process, and the importance of computational approaches to the problem has been well documented. Previous work on predictive methods for phospholipidosis showed that state of the art machine learning methods produced the best results. Here we extend this work by looking at a larger data set mined from the literature. We find that circular fingerprints lead to better models than either E-Dragon descriptors or a combination of the two. We also observe very similar performance in general between Random Forest and Support Vector Machine models.
Keywords: in silico; machine learning; Phospholipidosis; prediction; Random Forest; Support Vector Machine
2-[(1S,6S)-6-isopropenyl-3-methyl-1-cyclohex-2-enyl]-5-propyl-benzene-1,3-diol
Methyl (3s,4r)-3-benzoyloxy-8-methyl-8-azabicyclo[3.2.1]octane-4-carboxylate
1-[isopropylamino]-3-[isopropoxyethoxymethylphenoxy]-2-propanol
cortisone
phenobarbital
2-Propanol,1-[(1-methylethyl)amino]-3-[2-(2-propen-1-yloxy)phenoxy]-
Esmolol
2-Propanol, 1-[(1-methylethyl)amino]-3-(1-naphthalenyloxy)-
delta-9-Tetrahydrocannabinol
Pteridine