Recent work has demonstrated the promise of using machine-learned surrogates, in particular Gaussian process (GP) surrogates, to reduce the number of electronic structure calculations (ESCs) needed to perform surrogate-model-based (SMB) geometry optimization. In this paper, we study geometry meta-optimization with GP surrogates, in which an SMB optimizer additionally learns from its past “experience” performing geometry optimization. To validate this idea, we start with the simplest setting, where a geometry meta-optimizer learns from previous optimizations of the same molecule with different initial-guess geometries. We give empirical evidence that geometry meta-optimization with GP surrogates is effective and requires less tuning than SMB optimization with GP surrogates on the ANI-1 dataset of off-equilibrium initial structures of small organic molecules. Unlike SMB optimization, where a surrogate must be immediately useful for optimizing a given geometry, a surrogate in geometry meta-optimization has more flexibility because it can distribute its ESC savings across a set of geometries. Indeed, we find that GP surrogates that preserve rotational invariance provide increased marginal ESC savings across geometries. As a more stringent test, we also apply geometry meta-optimization to conformational search on a hand-constructed dataset of hydrocarbons and alcohols. We observe that while SMB optimization and geometry meta-optimization do save on ESCs, they also tend to miss higher-energy conformers compared to standard geometry optimization. We believe that further research into characterizing the divergence between GP surrogates and potential energy surfaces is critical not only for advancing geometry meta-optimization but also for exploring the potential of machine-learned surrogates in geometry optimization in general.
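To make the SMB setting concrete, the following is a minimal sketch of surrogate-model-based geometry optimization with a GP surrogate: fit a GP posterior mean to the ESCs evaluated so far, minimize the cheap surrogate, and pay one true ESC per outer iteration. A toy two-dimensional quadratic stands in for the expensive electronic structure calculation, and all names (`esc_energy`, `smb_optimize`, the kernel length scale, the seeding scheme) are illustrative assumptions, not the paper's actual method or code.

```python
import numpy as np
from scipy.optimize import minimize

def esc_energy(x):
    """Stand-in for an expensive ESC; toy quadratic with its minimum at (1, 1)."""
    return float(np.sum((np.asarray(x) - 1.0) ** 2))

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel matrix between row-stacked points A and B."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / ell**2)

def smb_optimize(x0, n_outer=20, seed=0):
    rng = np.random.default_rng(seed)
    # Seed the surrogate with the initial geometry plus a few perturbed copies.
    X = np.array([x0] + [x0 + rng.normal(scale=0.3, size=2) for _ in range(3)])
    y = np.array([esc_energy(x) for x in X])
    for _ in range(n_outer):
        mu = y.mean()
        K = rbf(X, X) + 1e-8 * np.eye(len(X))          # jitter for stability
        alpha = np.linalg.solve(K, y - mu)             # GP posterior-mean weights
        surrogate = lambda x: float(mu + rbf(np.asarray(x)[None, :], X) @ alpha)
        # Minimize the cheap surrogate, then spend one true ESC at its proposal.
        res = minimize(surrogate, X[np.argmin(y)])
        X = np.vstack([X, res.x])
        y = np.append(y, esc_energy(res.x))
        if abs(y[-1] - y[-2]) < 1e-10:                 # converged on the surrogate
            break
    i = int(np.argmin(y))
    return X[i], y[i], len(y)                          # len(y) = number of ESCs spent

x_best, e_best, n_esc = smb_optimize(np.array([3.0, -2.0]))
```

The ESC count `len(y)` is the quantity SMB methods try to drive down; meta-optimization, as studied in the paper, would additionally carry the surrogate's training data (or hyperparameters) across optimizations of related geometries rather than starting each run from scratch.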
