Instant machine learning predictions of molecular properties are desirable for materials design, but the predictive power of the methodology is mainly tested on well-known benchmark datasets. Here, we investigate the performance of machine learning with kernel ridge regression (KRR) for the prediction of molecular orbital energies on three large datasets: the standard QM9 small organic molecules set, amino acid and dipeptide conformers, and organic crystal-forming molecules extracted from the Cambridge Structural Database. We focus on the prediction of highest occupied molecular orbital (HOMO) energies, computed at the density-functional level of theory. Two different representations that encode the molecular structure are compared: the Coulomb matrix (CM) and the many-body tensor representation (MBTR). We find that KRR performance depends significantly on the chemistry of the underlying dataset and that the MBTR is superior to the CM, predicting HOMO energies with a mean absolute error as low as 0.09 eV. To demonstrate the power of our machine learning method, we apply our model to structures of 10k previously unseen molecules. We gain instant energy predictions that allow us to identify interesting molecules for future applications.
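As an illustration of the simpler of the two representations named above, the following is a minimal sketch of the Coulomb matrix together with the mean absolute error used as the accuracy metric. It assumes the commonly cited definition (diagonal 0.5 Z_i^2.4, off-diagonal Z_i Z_j / |R_i - R_j|) with positions in atomic units. The function names and the H2 test geometry are hypothetical, and the sketch is not taken from the authors' implementation.

```python
import math


def coulomb_matrix(charges, positions):
    """Build the Coulomb matrix for one molecule.

    charges:   list of nuclear charges Z_i
    positions: list of (x, y, z) tuples, assumed here to be in bohr
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: fitted estimate of the free-atom energy
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal: Coulomb repulsion between nuclei i and j
                distance = math.dist(positions[i], positions[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


def mean_absolute_error(predicted, reference):
    """Average absolute deviation between predictions and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical example: two hydrogen nuclei placed 1 bohr apart
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)])
```

In a kernel ridge regression workflow, matrices like `h2` would typically be padded to a common size, flattened, and passed to a kernel function. That step is omitted here.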
28 May 2019
Research Article
May 31 2019
Chemical diversity in molecular orbital energy predictions with kernel ridge regression
Annika Stuke (1, a)
Milica Todorović (1)
Matthias Rupp (2, b)
Christian Kunkel (1, 3)
Kunal Ghosh (1, 4)
Lauri Himanen (1)
Patrick Rinke (1, 3)
1 Department of Applied Physics, Aalto University, P.O. Box 11100, Aalto FI-00076, Finland
2 Fritz Haber Institute of the Max Planck Society, Faradayweg 4-6, 14195 Berlin, Germany
3 Chair for Theoretical Chemistry and Catalysis Research Center, Technische Universität München, Lichtenbergstr. 4, 85747 Garching, Germany
4 Department of Computer Science, Aalto University, P.O. Box 15400, Aalto FI-00076, Finland
a) Electronic mail: [email protected]
b) Current address: Citrine Informatics, 702 Marshall Street, Redwood City, California 94063, USA
J. Chem. Phys. 150, 204121 (2019)
Article history
Received: December 18 2018
Accepted: April 21 2019
Citation
Annika Stuke, Milica Todorović, Matthias Rupp, Christian Kunkel, Kunal Ghosh, Lauri Himanen, Patrick Rinke; Chemical diversity in molecular orbital energy predictions with kernel ridge regression. J. Chem. Phys. 28 May 2019; 150 (20): 204121. https://doi.org/10.1063/1.5086105