Co-reporter:Florbela Pereira;Kaixia Xiao;Chengcheng Wu;Qingyou Zhang;Joao Aires-de-Sousa;Diogo A. R. S. Latino
Journal of Chemical Information and Modeling January 23, 2017 Volume 57(Issue 1) pp:11-21
Publication Date(Web):December 29, 2016
DOI:10.1021/acs.jcim.6b00340
Machine learning algorithms were explored for the fast estimation of HOMO and LUMO orbital energies calculated by DFT B3LYP, using molecular descriptors based exclusively on connectivity. The project involved the retrieval and generation of molecular structures, quantum chemical calculations for a database of >111 000 structures, development of new molecular descriptors, and training/validation of machine learning models. Several machine learning algorithms were screened, and an applicability domain was defined based on Euclidean distances to the training set. Random forest models predicted an external test set of 9989 compounds with mean absolute errors (MAE) as low as 0.15 and 0.16 eV for the HOMO and LUMO orbitals, respectively. The impact of the quantum chemical calculation protocol was assessed with a subset of compounds. Inclusion of the orbital energy calculated by PM7 as an additional descriptor significantly improved the quality of estimations (reducing the MAE by >30%).
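The abstract does not say how the Euclidean-distance applicability domain was parameterized. As a minimal sketch, assuming a nearest-neighbor rule with a fixed cutoff (the function names, the `threshold` parameter, and the toy descriptor vectors are all hypothetical), the idea and the MAE metric could look like this:

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_training_distance(query, training):
    """Distance from a query vector to its closest training vector."""
    return min(euclidean(query, t) for t in training)


def in_domain(query, training, threshold):
    """Assumed rule: a query is inside the applicability domain when its
    nearest training compound lies within `threshold`."""
    return nearest_training_distance(query, training) <= threshold


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy two-dimensional descriptors, for illustration only.
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(in_domain((0.5, 0.5), training, threshold=1.0))
print(in_domain((5.0, 5.0), training, threshold=1.0))
print(mean_absolute_error([-6.0, -5.5], [-6.1, -5.3]))
```

In practice the cutoff would be chosen from the distribution of distances within the training set; the value used here is arbitrary.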
Co-reporter:Qi Huang, Qingyou Zhang, Enze Wang, Yanmei Zhou, Han Qiao, Lanfang Pang, Fang Yu
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2016 Volume 152 pp:70-76
Publication Date(Web):5 January 2016
DOI:10.1016/j.saa.2015.07.062
•The sensor was designed based on a spirolactam ring-opening reaction.
•The sensor showed high selectivity for Al3+ in aqueous solution.
•The sensor can be applied to water samples and to imaging of Al3+ in living cells.
In this paper, a new fluorescent probe was synthesized and applied as an “off–on” sensor for the detection of Al3+ with high sensitivity and excellent selectivity in aqueous media. The sensor, named RBP, was easily prepared by a one-step reaction between rhodamine B hydrazide and pyridoxal hydrochloride. Its structure was characterized by nuclear magnetic resonance and electrospray ionization mass spectrometry. The fluorescence intensity and absorbance of the sensor were linear with the concentration of Al3+ in the ranges of 0–12.5 μM and 8–44 μM, respectively, with detection limits of 0.23 μM and 1.90 μM. The sensor RBP was preliminarily applied to the determination of Al3+ in water samples from the lake of Henan University and in tap water, with satisfactory results. Moreover, it can be used as a bioimaging reagent for imaging of Al3+ in living cells.
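The abstract reports linear ranges and detection limits but not how the limits were computed. A minimal sketch, assuming the common 3σ/slope convention (the calibration points, the blank standard deviation, and the function names below are hypothetical, not the paper's data):

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def detection_limit(blank_sd, slope, k=3.0):
    """Assumed convention: LOD = k * (standard deviation of blank) / slope."""
    return k * blank_sd / abs(slope)


# Hypothetical calibration: Al3+ concentration (uM) vs. fluorescence intensity.
conc = [0.0, 2.5, 5.0, 7.5, 10.0, 12.5]
signal = [10.0 + 40.0 * c for c in conc]

slope, intercept = fit_line(conc, signal)
lod = detection_limit(blank_sd=3.0, slope=slope)
print(slope, intercept, lod)
```

The resulting limit has the same concentration units as the calibration axis, here μM.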
Co-reporter:Qingyou Zhang; Chengcheng Wu; Fangfang Zheng; Tanfeng Zhao; Yanmei Zhou;Lu Xu
Journal of Chemical Information and Modeling 2015 Volume 55(Issue 7) pp:1308-1315
Publication Date(Web):June 3, 2015
DOI:10.1021/acs.jcim.5b00044
A highly discriminating topological index, EAID, was developed in our laboratory. A systematic search for degeneracy was performed on a total of over 14 million structures, and no duplicates occurred. These structures are as follows: over 3.8 million alkane trees with 1–22 carbon atoms; over 0.38 million structures containing heteroatoms; over 4 million benzenoids with 1–13 benzene rings; and over 5.9 million compounds from three databases of real compounds. However, in a search of over 20 million alkane trees with 23 and 24 carbon atoms, five and 13 duplicates occurred, respectively, and for over 20 million compounds from the ZINC database, 10 duplicates occurred. To increase the discriminating power of the index, EAID was extended, and the resulting index is termed 2-EAID. All of the over 55 million structures mentioned above were uniquely identified by 2-EAID, except for two duplicates that occurred in the ZINC database. EAID and 2-EAID are the most highly discriminating indices examined to date. Thus, the two indices have not only theoretical significance but also potential applications. For example, they could possibly be used as a supplementary reference to CAS Registry Numbers for structure documentation.
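A degeneracy search of this kind amounts to grouping non-isomorphic structures by index value and reporting any value shared by more than one structure. The sketch below illustrates only that grouping step; EAID itself is not defined in the abstract, so a deliberately weak stand-in index (the sorted vertex-degree sequence of the heavy-atom graph) is used, and all function names are hypothetical:

```python
from collections import defaultdict


def adjacency_from_edges(edges):
    """Build an undirected adjacency map from a list of (atom, atom) bonds."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degree_sequence_index(adjacency):
    """Toy stand-in index (NOT EAID): the sorted vertex degrees."""
    return tuple(sorted(len(neighbors) for neighbors in adjacency.values()))


def find_degenerate_groups(structures, index_fn):
    """Return the groups of structure names that share one index value."""
    groups = defaultdict(list)
    for name, adjacency in structures.items():
        groups[index_fn(adjacency)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Carbon skeletons of three C6 alkane isomers (hydrogens omitted).
structures = {
    "n-hexane": adjacency_from_edges([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]),
    "2-methylpentane": adjacency_from_edges([(1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]),
    "3-methylpentane": adjacency_from_edges([(1, 2), (2, 3), (3, 4), (4, 5), (3, 6)]),
}

print(find_degenerate_groups(structures, degree_sequence_index))
# → [['2-methylpentane', '3-methylpentane']]
```

The toy index collides on the two methylpentanes, which is exactly the kind of duplicate a highly discriminating index is meant to avoid. Hashing by index value keeps the search linear in the number of structures, which matters at the scale of tens of millions of compounds.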
Co-reporter:Jing-Jie Suo, Qing-You Zhang, Jing-Ya Li, Yan-Mei Zhou, Lu Xu
Journal of Molecular Graphics and Modelling 2013 Volume 43 pp:11-20
Publication Date(Web):June 2013
DOI:10.1016/j.jmgm.2013.03.005
•A chemoinformatics approach was applied to the prediction of enantioselectivity.
•A chiral substituent code was developed specifically for secondary alcohols.
•The suggested code has the potential to be applied to other kinds of compounds.
•The models were assessed in terms of single enantiomers and pairs of enantiomers.
A chiral substituent code was proposed based on the features of secondary alcohols, in which the chiral center is attached to two substituents in addition to the OH and H substituents. The new chirality code, which is generated by predefining the positional information of the four substituents attached to the stereocenter, was applied to two datasets composed of secondary alcohols as the enantioselective products of asymmetric reactions. In the first dataset, the chemical reaction was catalyzed by a biocatalyst, lipase from Candida rugosa. The catalyst for the second dataset was (−)-diisopinocampheylchloroborane. The structure–enantioselectivity relationship models were constructed using random forests with the chiral substituent code as the input. The resulting models were assessed both in terms of single enantiomers and pairs of enantiomers. Satisfactory results were obtained for both datasets. Although the chiral substituent code was developed specifically for secondary alcohols, it can easily be extended to represent chiral compounds possessing a specific chiral center bonded to two variable substituents.