There are many atomistic simulation methods with very different costs, accuracies, transferabilities, and numbers of empirical parameters. I show how statistical model selection can compare these methods fairly, even when they are very different. These comparisons are also useful for developing new methods that balance cost and accuracy. As an example, I build a semiempirical model for hydrogen clusters.
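As a rough illustration of what statistical model selection looks like in practice, the sketch below scores two methods with the Akaike information criterion, a standard tool that trades goodness of fit against the number of fitted parameters. This is an assumption for illustration only: the abstract does not say which criterion is used, the Gaussian error model is a simplification, the error values are invented, and computational cost is not accounted for here.

```python
import math


def gaussian_aic(residuals, n_params):
    """AIC of a method whose errors are modeled as i.i.d. zero-mean Gaussian.

    The error variance is estimated from the residuals by maximum
    likelihood and counted as one additional fitted parameter.
    """
    n = len(residuals)
    variance = sum(r * r for r in residuals) / n
    log_likelihood = -0.5 * n * (math.log(2.0 * math.pi * variance) + 1.0)
    return 2.0 * (n_params + 1) - 2.0 * log_likelihood


# Hypothetical errors (arbitrary energy units) of two methods on the same
# eight reference calculations; the numbers are made up for this sketch.
errors_simple = [0.9, -1.1, 1.0, -0.8, 1.2, -1.0, 0.95, -1.05]
errors_flexible = [0.8, -1.0, 0.9, -0.8, 1.1, -0.9, 0.85, -0.95]

aic_simple = gaussian_aic(errors_simple, n_params=1)      # 1 empirical parameter
aic_flexible = gaussian_aic(errors_flexible, n_params=4)  # 4 empirical parameters

# The lower AIC is preferred: here the small accuracy gain of the flexible
# method does not pay for its three extra parameters.
preferred = "simple" if aic_simple < aic_flexible else "flexible"
```

The point of the design is that both methods are scored on one scale even though they differ in parameter count, which is the kind of like-for-like comparison the abstract describes.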

Several quantum chemistry codes were considered for this work, but I was not able to get all of the methods needed for this project working in any one code without modifications. I ultimately chose PySCF because its open-source Python codebase made it easier to find and fix bugs and to contribute the fixes. The bugs in question were a logic error in some post-HF methods when a fully spin-polarized system had no beta electrons, and a memory leak in the CCSD(T) code that caused crashes in long workflows.

The most relevant details for success rates were SCF convergence tolerances set to 10⁻⁸, the maximum number of DIIS vectors set to 10, and the maximum number of SCF cycles set to 100 for def2-SVP and 200 for def2-QZVPP. The CCSD calculations were also set to a maximum number of 200 iterations and a convergence tolerance of 10⁻⁵ on the cluster operator.
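The settings above might map onto PySCF roughly as follows. This is a minimal sketch, not the workflow actually used: the function name `ccsd_t_energy` is hypothetical, the choice of a UHF reference is an assumption, and so is the reading of the SCF tolerance as PySCF's energy criterion `conv_tol` and of the cluster-operator tolerance as `conv_tol_normt`.

```python
# Values quoted in the note; the mapping to PySCF attributes is an assumption.
SCF_CONV_TOL = 1e-8
DIIS_VECTORS = 10
SCF_MAX_CYCLE = {"def2-svp": 100, "def2-qzvpp": 200}
CCSD_MAX_CYCLE = 200
CCSD_CONV_TOL_NORMT = 1e-5


def ccsd_t_energy(atom, basis="def2-svp", charge=0, spin=0):
    """Return a CCSD(T) total energy in hartree, or None if a step fails to converge."""
    # Imported lazily so the settings above can be inspected without PySCF installed.
    from pyscf import cc, gto, scf

    mol = gto.M(atom=atom, basis=basis, charge=charge, spin=spin, verbose=0)

    mf = scf.UHF(mol)
    mf.conv_tol = SCF_CONV_TOL                   # SCF energy convergence threshold
    mf.diis_space = DIIS_VECTORS                 # number of DIIS vectors kept
    mf.max_cycle = SCF_MAX_CYCLE[basis.lower()]  # basis-dependent SCF cycle limit
    mf.kernel()
    if not mf.converged:
        return None

    mycc = cc.CCSD(mf)
    mycc.max_cycle = CCSD_MAX_CYCLE
    mycc.conv_tol_normt = CCSD_CONV_TOL_NORMT    # threshold on the amplitude change
    mycc.kernel()
    if not mycc.converged:
        return None

    # Total energy = reference + CCSD correlation + perturbative triples.
    return mf.e_tot + mycc.e_corr + mycc.ccsd_t()
```

With PySCF installed, a call such as `ccsd_t_energy("H 0 0 0; H 0 0 0.74")` would run the full chain. Returning `None` on non-convergence makes it straightforward to tally a success rate over many structures.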
