The goal of the present work is to obtain accurate potential energy surfaces (PESs) for high-dimensional molecular systems with a small number of ab initio calculations in a system-agnostic way. We use probabilistic modeling based on Gaussian processes (GPs). We illustrate that it is possible to build an accurate GP model of a 51-dimensional PES based on 5000 randomly distributed ab initio calculations with a global accuracy of <0.2 kcal/mol. Our approach uses GP models with composite kernels designed to enhance the Bayesian information content and represents the global PES as a sum of a full-dimensional GP and several GP models for molecular fragments of lower dimensionality. We demonstrate the potency of these algorithms by constructing the global PES for the protonated imidazole dimer, a molecular system with 19 atoms. We illustrate that GP models thus constructed can extrapolate the PES from low energies (<10 000 cm⁻¹), yielding a PES at high energies (>20 000 cm⁻¹). This opens the prospect for new applications of GPs, such as mapping out phase transitions by extrapolation or accelerating Bayesian optimization, for high-dimensional physics and chemistry problems with a restricted number of inputs, i.e., for high-dimensional problems where obtaining training data is very difficult.
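
As a rough illustration of the composite-kernel idea described above, the following pure-Python sketch fits a GP whose covariance is the sum of a full-dimensional kernel and a kernel acting only on a subset of coordinates, which stands in for a molecular fragment. A sum of independent GPs is itself a GP whose kernel is the sum of the individual kernels, so a single model suffices here. Everything specific in the sketch is an assumption made for illustration: the three-dimensional toy energy function, the squared-exponential kernels, the fixed length scales, the fragment coordinates, and the use of the Bayesian information criterion as a model score. It is not the authors' implementation and is far smaller than the 51-dimensional problem treated in this work.

```python
import math
import random

# Illustrative assumptions (none of these values come from the paper).
FRAGMENT_DIMS = (0, 1)   # hypothetical "fragment": the first two coordinates
FULL_LENGTH = 0.7        # assumed length scale of the full-dimensional kernel
FRAG_LENGTH = 0.5        # assumed length scale of the fragment kernel
NOISE = 1e-10            # small diagonal jitter for numerical stability


def rbf(x, y, dims, length):
    """Squared-exponential kernel restricted to the coordinates in `dims`."""
    d2 = sum((x[i] - y[i]) ** 2 for i in dims)
    return math.exp(-0.5 * d2 / length ** 2)


def composite_kernel(x, y):
    """Sum of a full-dimensional kernel and a lower-dimensional fragment kernel."""
    full = rbf(x, y, range(len(x)), FULL_LENGTH)
    frag = rbf(x, y, FRAGMENT_DIMS, FRAG_LENGTH)
    return full + frag


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def cho_solve(low, b):
    """Solve (low @ low.T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit(X, y):
    """Return the GP weights alpha and the log marginal likelihood."""
    n = len(X)
    K = [[composite_kernel(X[i], X[j]) + (NOISE if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(K)
    alpha = cho_solve(low, y)
    log_ml = (-0.5 * sum(yi * ai for yi, ai in zip(y, alpha))
              - sum(math.log(low[i][i]) for i in range(n))
              - 0.5 * n * math.log(2.0 * math.pi))
    return alpha, log_ml


def predict(X, alpha, x_new):
    """Posterior mean of the GP at a new point."""
    return sum(a * composite_kernel(xi, x_new) for xi, a in zip(X, alpha))


def bic(log_ml, n_params, n_points):
    """Bayesian information criterion in the 'higher is better' convention."""
    return log_ml - 0.5 * n_params * math.log(n_points)


def toy_energy(x):
    """Toy stand-in for an ab initio energy; purely illustrative."""
    return math.sin(x[0]) * math.cos(x[1]) + 0.1 * x[0] * x[2]


random.seed(0)
X_train = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(15)]
y_train = [toy_energy(x) for x in X_train]

alpha, log_ml = fit(X_train, y_train)
score = bic(log_ml, n_params=2, n_points=len(X_train))  # two length scales
print("prediction at origin:", predict(X_train, alpha, [0.0, 0.0, 0.0]))
print("BIC score:", score)
```

In a realistic setting the kernel parameters would be optimized by maximizing the marginal likelihood, and candidate kernel combinations could be compared through a score such as `bic`; here the length scales are fixed only to keep the sketch short.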

