A combinatorially optimized, range-separated hybrid, meta-GGA density functional with VV10 nonlocal correlation is presented. The final 12-parameter functional form is selected from approximately 10 × 109 candidate fits that are trained on a training set of 870 data points and tested on a primary test set of 2964 data points. The resulting density functional, ωB97M-V, is further tested for transferability on a secondary test set of 1152 data points. For comparison, ωB97M-V is benchmarked against 11 leading density functionals including M06-2X, ωB97X-D, M08-HX, M11, ωM05-D, ωB97X-V, and MN15. Encouragingly, the overall performance of ωB97M-V on nearly 5000 data points clearly surpasses that of all of the tested density functionals. In order to facilitate the use of ωB97M-V, its basis set dependence and integration grid sensitivity are thoroughly assessed, and recommendations that take into account both efficiency and accuracy are provided.

Density functional theory (DFT) is built on exact foundations,1 but the exact functional, even if it were accessible, would likely be so complicated that it would give little practical advantage relative to the best wave function theories. The great achievement of functional development to date is the fact that very approximate functionals can provide useful levels of accuracy for many electronic structure problems in chemistry and condensed matter physics. The quest to obtain improved functionals that are computationally tractable continues in many research groups today, and this paper describes a promising effort in that direction.

The parameterization of empirical density functionals via linear least-squares fitting is perhaps the most widely used method for functional development in the quantum chemistry community. Introduced by Axel Becke with the B97 density functional,2 it relies on expanding inhomogeneity variables based on physically relevant ingredients, such as the spin-density (ρσ), its gradient (ρσ), or the kinetic energy density (τσ=inσψiσ2), in one or more power series, whose coefficients are determined with the use of a training set of high-quality reference values.

Since 1997, at least 40 semi-empirical density functionals have been developed based on the concept introduced by B97. These functionals range from generalized gradient approximation (GGA) and nonseparable gradient approximation (NGA) functionals to meta-GGA and meta-NGA functionals. GGAs, representing Rung 2 of Perdew’s Jacob’s Ladder,3 usually depend on a single inhomogeneity variable that is a function of both ρσ and ρσ, while NGAs additionally depend on an inhomogeneity variable that is solely a function of ρσ. Meta-GGAs (Rung 3) expand upon GGAs by including an additional dependence on an inhomogeneity variable that depends on ρσ and τσ, while meta-NGAs expand upon meta-GGAs in the same way that NGAs expand upon GGAs.

The inclusion of exact exchange, popularized in 1993 with the B3PW91 density functional,4 has conventionally been of the global variety, meaning that the fraction of exact exchange is uniform for all interelectronic distances. More recently, these global hybrids have been replaced by range-separated hybrid (RSH) functionals,5 which have a fraction of short-range exact exchange that typically either smoothly increases to 1 (long-range-corrected) or smoothly decreases to 0 (screened-exchange).

Finally, since density functionals are unable to properly account for long-range correlation, most modern parameterizations simultaneously train a dispersion correction onto the exchange-correlation functional. The simplest form for a dispersion correction is a damped, atom-atom potential (DFT-D) such as Grimme’s DFT-D2 or DFT-D3 models.6–8 A more rigorous treatment of dispersion is provided by nonlocal correlation (NLC) functionals such as VV109 and vdW-DF-2.10 However, the most elaborate and computationally demanding choice for a dispersion correction is a post-Hartree–Fock correlation (post-HFC) method such as MP2, RPA, or CCSD.

In light of the above considerations, Figure 1 presents an alternate view of elements that can be combined to define most existing density functionals. The first element (Local Exchange-Correlation) pertains to the ingredients that constitute the local exchange-correlation functional, with the available choices mimicking the first three rungs of Jacob’s Ladder. The second element (Exact Exchange) pertains to the optional use of exact exchange contributions to define hybrid functionals. Finally, the third element (Dispersion Correction) generally accounts for the optional treatment of long-range correlation by one of the aforementioned approaches.

FIG. 1.

An alternate view of elements that can be combined to define most existing density functionals.

FIG. 1.

An alternate view of elements that can be combined to define most existing density functionals.

Close modal

The selection of semi-empirical density functionals based on the B97 concept is listed below (dispersion-corrected functionals are underlined):

  • GGA Functionals

    • Local: HCTH/93, HCTH/120, HCTH/147, HCTH/407, B97-D, SOGGA116,11–14

    • Global Hybrid: B97-1, B97-2, B97-K, B97-3, SOGGA11-X11,15–18

    • Range-Separated Hybrid: ωB97, ωB97X, ωB97X-D, ωB97X-D3, ωB97X-V19–22 

  • NGA Functionals

    • Local: N12, GAM23,24

    • Range-Separated Hybrid: N12-SX25 

  • meta-GGA Functionals

    • Local: τ-HCTH, M06-L, M11-L, B97M-V26–29 

    • Global Hybrid: τ-HCTHh, BMK, M05, M05-2X, M06, M06-2X, M06-HF, M08-HX, M08-SO16,26,30–34

    • Range-Separated Hybrid: M11, ωM05-D, ωM06-D321,35,36

  • meta-NGA Functionals

    • Local: MN12-L, MN15-L37,38

    • Global Hybrid: MN1539 

    • Range-Separated Hybrid: MN12-SX.25 

The simplest form for the power series utilized in GGA functionals is

(1)

with y representing an inhomogeneity variable based on one of the aforementioned physically relevant ingredients, and N representing the maximum truncation order for the summation. Conventionally, the value of N has either been chosen a priori or determined based on a “goodness-of-fit” to the training set. Smaller values of N can yield smoother and perhaps more transferable inhomogeneity correction factors, while larger values necessarily provide better fits to training data, whose transferability must subsequently be assessed.

The most general approach, however, is to choose a value for N and consider all possible combinations of the N + 1 coefficients. This approach was explored several years ago,40 resulting in the development of the ωB97X-V functional.22 Using this combinatorial approach leads to a total of 2N+1 − 1 fits, instead of just a handful. With a large number of candidate fits, the transferability of the fits can be assessed on a test set, allowing them to be ranked based on both their training set and test set performance. Furthermore, fits can be discarded based on undesirable physical characteristics or other relevant criteria.

In contrast to the one-dimensional power series that characterizes a GGA density functional, the most general power series that can accommodate a meta-GGA density functional is two-dimensional,

(2)

In the spirit of the original B97 density functional, three components of the local exchange-correlation functional require parameterization: exchange, same-spin correlation, and opposite-spin correlation. With each component contributing (N′ + 1)(N + 1) coefficients, the total number of possible fits is 23N+1N+11. Setting N′, the meta-GGA maximum truncation order, to 8, and N, the GGA maximum truncation order, to 4, bring the total number of possible combinations to an astounding 2135 − 1 ≈ 1041, a “functional genome” whose rank is approaching the square of Avogadro’s number.

The development of the B97M-V density functional29 was a first attempt at a partial search of the meta-GGA functional genome within the combinatorial optimization approach. Apart from the difficult issue of choosing appropriate weights for different sets of training and testing data, the main compromise made in the design of B97M-V was the choice to exclude exact exchange.

The goal of this paper is to improve upon B97M-V by revisiting the meta-GGA combinatorial search problem with the inclusion of exact exchange. The objective is to combinatorially design a range-separated hybrid, meta-GGA density functional which includes VV10 nonlocal correlation. It must be stressed that the combinatorial search performed to define B97M-V should not and will not be used in any direct way for this purpose. The addition of exact exchange means that different coefficients in Equation (2) will emerge as significant, perhaps apart from the few lowest-order terms. Unfortunately, the whole reason for adopting a combinatorial design approach is that it is impossible to anticipate which other terms will emerge as significant. Accordingly, it is a brand new search problem.

After describing the search process and its outcome, the functional which emerges as most transferable (ωB97M-V) can then be compared to existing functionals. ωB97M-V will be compared against two functionals that were designed in a similar fashion (ωB97X-V and B97M-V), as well as some of the best alternative functionals from other groups, particularly those that include similar functional ingredients. However, no other existing functional yet combines precisely the same components as ωB97M-V.

A few of the issues that will be particularly interesting to examine are the following:

  1. Comparing ωB97M-V against ωB97X-V will demonstrate the value of adding kinetic energy density dependence to a range-separated hybrid GGA functional containing VV10 nonlocal correlation.

  2. Comparing ωB97M-V against B97M-V will demonstrate the value of adding exact exchange to a local meta-GGA functional containing VV10 nonlocal correlation.

  3. Comparing ωB97M-V against the best available semi-empirical hybrid meta-GGAs and meta-NGAs including M06-2X, M08-HX, M11, ωM05-D, and MN15 will test the value of the combinatorial optimization strategy.

The hope is that within the limits of the functional form that has been chosen for optimization, the combinatorial design approach will permit the discovery of the most broadly accurate Rung 1-4 density functional to date.

This paper describes the task of designing, self-consistently training, and extensively testing an accurate, transferable, and well-behaved range-separated hybrid meta-GGA density functional with VV10 nonlocal correlation. The use of highly accurate datasets is integral to the functional development process, and the 84 datasets utilized in this work are described in Section IV and Table I. This is followed by a description of the equations that characterize the chosen functional form (Section V). The design process, described in detail in Section VI, involves searching through over 10 × 109 potential fits (Table II and Figure 2), whose parameters are individually trained on a training set and tested for transferability on a primary test set. The fit with the best performance across the training and primary test sets is self-consistently parameterized (Table III and Figure 3) and named ωB97M-V. In Section VII, the new functional is compared to the 11 existing functionals shown in Table IV, across the training set (Figure 5), the primary test set (Figure 6), and a secondary test set (Figure 7). Since the test set performance is the most significant measure of the merit of a semi-empirical density functional, Figure 4 shows the performance of all 12 functionals for the 8 datatypes defined in Section IV across 3547 test data points. The interpolated equilibrium bond lengths and equilibrium binding energies of 90 potential energy curves are analyzed for all 12 functionals in Figure 8, and the benzene-argon potential energy curve is displayed in Figure 9 as an example of the performance of these functionals for weak interactions. In Section VIII, the basis set dependence of ωB97M-V is thoroughly assessed by applying 21 basis sets to 4 datasets, and the results are summarized in Figure 10. In Section IX, the integration grid sensitivity of ωB97M-V is thoroughly assessed (Table V) by calculating 3247 of the 3834 data points in the training and primary test sets with a variety of integration grids. Finally, the paper is concluded in Section X with a summary (Figure 11) of the performance of all 12 functionals for the 8 aforementioned datatypes across 4399 data points.

TABLE I.

Summary of the 84 datasets that comprise the training, primary test, and secondary test sets. The datatypes are explained in Section IV. The sixth column contains the root-mean-squares of the dataset reaction energies. PEC stands for potential energy curve, SR stands for single-reference, MR stands for multi-reference, Bz stands for benzene, Me stands for methane, and Py stands for pyridine.

NameSetDatatype#DescriptionΔE (kcal/mol)References
A24 Train NCED 24 Binding energies of small non-covalent complexes 2.65 55  
DS14 Train NCED 14 Binding energies of complexes containing divalent sulfur 3.70 56  
HB15 Train NCED 15 Binding energies of hydrogen-bonded dimers featuring ionic groups common in biomolecules 19.91 57  
HSG Train NCED 21 Binding energies of small ligands interacting with protein receptors 6.63 58 and 59  
NBC10 Train NCED 184 PECs for BzBz (5), BzMe (1), MeMe (1), BzH2S(1), and PyPy (2) 1.91 59–62  
S22 Train NCED 22 Binding energies of hydrogen-bonded and dispersion-bound non-covalent complexes 9.65 59 and 63  
X40 Train NCED 31 Binding energies of non-covalent interactions involving halogenated molecules 5.26 64  
A21x12 PriTest NCED 252 PECs for the 21 equilibrium complexes from A24 1.43 65  
BzDC215 PriTest NCED 215 PECs for benzene interacting with two rare-gas atoms and eight first- and second-row hydrides 1.81 66  
HW30 PriTest NCED 30 Binding energies of hydrocarbon-water dimers 2.34 67  
NC15 PriTest NCED 15 Binding energies of very small non-covalent complexes 0.95 68  
S66 PriTest NCED 66 Binding energies of non-covalent interactions found in organic molecules and biomolecules 6.88 69 and 70  
S66x8 PriTest NCED 528 PECs for the 66 complexes from S66x8 5.57 69  
3B-69-DIM SecTest NCED 207 Binding energies of all relevant pairs of monomers from 3B-69-TRIM 5.87 71  
AlkBind12 SecTest NCED 12 Binding energies of saturated and unsaturated hydrocarbon dimers 3.14 72  
CO2Nitrogen16 SecTest NCED 16 Binding energies of CO2 to molecular models of pyridinic N-doped graphene 3.84 73  
HB49 SecTest NCED 49 Binding energies of small- and medium-sized hydrogen-bonded systems 14.12 74–76  
Ionic43 SecTest NCED 43 Binding energies of anion-neutral, cation-neutral, and anion-cation dimers 69.94 77  
H2O6Bind8 Train NCEC Binding energies of isomers of (H2O)6 46.96 78 and 79  
HW6Cl Train NCEC Binding energies of Cl(H2O)n (n = 1 − 6) 57.71 78 and 79  
HW6F Train NCEC Binding energies of F(H2O)n (n = 1 − 6) 81.42 78 and 79  
FmH2O10 PriTest NCEC 10 Binding energies of isomers of F(H2O)10 168.50 78 and 79  
Shields38 PriTest NCEC 38 Binding energies of (H2O)n (n = 2 − 10) 51.54 80  
SW49Bind345 PriTest NCEC 31 Binding energies of isomers of SO42(H2O)n (n = 3 − 5) 28.83 81  
SW49Bind6 PriTest NCEC 18 Binding energies of isomers of SO42(H2O)6 62.11 81  
WATER27 PriTest NCEC 23 Binding energies of neutral and charged water clusters 67.48 82 and 83  
3B-69-TRIM SecTest NCEC 69 Binding energies of trimers, with three different orientations of 23 distinct molecular crystals 14.36 71  
CE20 SecTest NCEC 20 Binding energies of water, ammonia, and hydrogen fluoride clusters 30.21 84 and 85  
H2O20Bind10 SecTest NCEC 10 Binding energies of isomers of (H2O)20 (low-energy structures) 198.16 79  
H2O20Bind4 SecTest NCEC Binding energies of isomers of (H2O)20 (dod, fc, fs, and es) 206.12 82, 83, 86, and 87  
TA13 Train NCD 13 Binding energies of dimers involving radicals 22.00 88  
XB18 Train NCD Binding energies of small halogen-bonded dimers 5.23 89  
Bauza30 PriTest NCD 30 Binding energies of halogen-, chalcogen-, and pnicogen-bonded dimers 23.65 90 and 91  
CT20 PriTest NCD 20 Binding energies of charge-transfer complexes 1.07 92  
XB51 PriTest NCD 20 Binding energies of large halogen-bonded dimers 6.06 89  
AlkIsomer11 Train IE 11 Isomerization energies of n = 4 − 8 alkanes 1.81 93  
Butanediol65 Train IE 65 Isomerization energies of butane-1,4-diol 2.89 94  
ACONF PriTest IE 15 Isomerization energies of alkane conformers 2.23 83 and 95  
CYCONF PriTest IE 11 Isomerization energies of cysteine conformers 2.00 83 and 96  
Pentane14 PriTest IE 14 Isomerization energies of stationary points on the n-pentane torsional surface 6.53 97  
SW49Rel345 PriTest IE 31 Isomerization energies of SO42(H2O)n (n = 3 − 5) 1.47 81  
SW49Rel6 PriTest IE 18 Isomerization energies of SO42(H2O)6 1.22 81  
H2O16Rel5 SecTest IE Isomerization energies of (H2O)16 (boat and fused cube structures) 0.40 98  
H2O20Rel10 SecTest IE 10 Isomerization energies of (H2O)20 (low-energy structures) 2.62 79  
H2O20Rel4 SecTest IE Isomerization energies of (H2O)20 (dod, fc, fs, and es) 5.68 82, 83, 86, and 87  
Melatonin52 SecTest IE 52 Isomerization energies of melatonin 5.54 99  
YMPJ519 SecTest IE 519 Isomerization energies of the proteinogenic amino acids 8.33 100  
EIE22 Train ID 22 Isomerization energies of enecarbonyls 4.97 101  
Styrene45 Train ID 45 Isomerization energies of C8H8 68.69 102  
DIE60 PriTest ID 60 Isomerization energies of reactions involving double-bond migration in conjugated dienes 5.06 103  
ISOMERIZATION20 PriTest ID 20 Isomerization energies 44.05 104  
C20C24 SecTest ID Isomerization energies of the ground state structures of C20 and C24 36.12 105  
AlkAtom19 Train TCE 19 n = 1 − 8 alkane atomization energies 1829.31 93  
BDE99nonMR Train TCE 83 Bond dissociation energies (SR) 114.98 104  
G21EA Train TCE 25 Adiabatic electron affinities of atoms and small molecules 40.86 83 and 106  
G21IP Train TCE 36 Adiabatic ionization potentials of atoms and small molecules 265.35 83 and 106  
TAE140nonMR Train TCE 124 Total atomization energies (SR) 381.05 104  
AlkIsod14 PriTest TCE 14 n = 3 − 8 alkane isodesmic reaction energies 10.35 93  
BH76RC PriTest TCE 30 Reaction energies from HTBH38 and NHTBH38 30.44 83, 107, and 108  
EA13 PriTest TCE 13 Adiabatic electron affinities 42.51 109  
HAT707nonMR PriTest TCE 505 Heavy-atom transfer energies (SR) 74.79 104  
IP13 PriTest TCE 13 Adiabatic ionization potentials 256.24 109  
NBPRC PriTest TCE 12 Reactions involving NH3/BH3 and PH3/BH3 30.52 83, 110, and 111  
SN13 PriTest TCE 13 Nucleophilic substitution energies 25.67 104  
BSR36 SecTest TCE 36 Hydrocarbon bond separation reaction energies 20.06 111 and 112  
HNBrBDE18 SecTest TCE 18 Homolytic N–Br bond dissociation energies 56.95 113  
WCPT6 SecTest TCE Tautomerization energies for water-catalyzed proton-transfer reactions 7.53 114  
BDE99MR PriTest TCD 16 Bond dissociation energies (MR) 54.51 104  
HAT707MR PriTest TCD 202 Heavy-atom transfer energies (MR) 83.41 104  
TAE140MR PriTest TCD 16 Total atomization energies (MR) 147.20 104  
PlatonicHD6 SecTest TCD Homodesmotic reactions involving platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 136.71 115  
PlatonicID6 SecTest TCD Isodesmic reactions involving platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 96.19 115  
PlatonicIG6 SecTest TCD Isogyric reactions involving platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 356.33 115  
PlatonicTAE6 SecTest TCD Total atomization energies of platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 2539.27 115  
BHPERI26 Train BH 26 Barrier heights of pericyclic reactions 23.15 83 and 116  
CRBH20 Train BH 20 Barrier heights for cycloreversion of heterocyclic rings 46.40 117  
DBH24 Train BH 24 Diverse barrier heights 28.34 118 and 119  
CR20 PriTest BH 20 Cycloreversion reaction energies 22.31 120  
HTBH38 PriTest BH 38 Hydrogen transfer barrier heights 16.05 108  
NHTBH38 PriTest BH 38 Non-hydrogen transfer barrier heights 33.48 107  
PX13 SecTest BH 13 Barrier heights for proton exchange in water, ammonia, and hydrogen fluoride clusters 28.83 84 and 85  
WCPT27 SecTest BH 27 Barrier heights of water-catalyzed proton-transfer reactions 38.73 114  
AE18 Train – 18 Absolute atomic energies of hydrogen through argon 148 739.00 121  
RG10 PriTest – 569 PECs for the 10 rare-gas dimers involving helium through krypton 1.21 122  
NameSetDatatype#DescriptionΔE (kcal/mol)References
A24 Train NCED 24 Binding energies of small non-covalent complexes 2.65 55  
DS14 Train NCED 14 Binding energies of complexes containing divalent sulfur 3.70 56  
HB15 Train NCED 15 Binding energies of hydrogen-bonded dimers featuring ionic groups common in biomolecules 19.91 57  
HSG Train NCED 21 Binding energies of small ligands interacting with protein receptors 6.63 58 and 59  
NBC10 Train NCED 184 PECs for BzBz (5), BzMe (1), MeMe (1), BzH2S(1), and PyPy (2) 1.91 59–62  
S22 Train NCED 22 Binding energies of hydrogen-bonded and dispersion-bound non-covalent complexes 9.65 59 and 63  
X40 Train NCED 31 Binding energies of non-covalent interactions involving halogenated molecules 5.26 64  
A21x12 PriTest NCED 252 PECs for the 21 equilibrium complexes from A24 1.43 65  
BzDC215 PriTest NCED 215 PECs for benzene interacting with two rare-gas atoms and eight first- and second-row hydrides 1.81 66  
HW30 PriTest NCED 30 Binding energies of hydrocarbon-water dimers 2.34 67  
NC15 PriTest NCED 15 Binding energies of very small non-covalent complexes 0.95 68  
S66 PriTest NCED 66 Binding energies of non-covalent interactions found in organic molecules and biomolecules 6.88 69 and 70  
S66x8 PriTest NCED 528 PECs for the 66 complexes from S66x8 5.57 69  
3B-69-DIM SecTest NCED 207 Binding energies of all relevant pairs of monomers from 3B-69-TRIM 5.87 71  
AlkBind12 SecTest NCED 12 Binding energies of saturated and unsaturated hydrocarbon dimers 3.14 72  
CO2Nitrogen16 SecTest NCED 16 Binding energies of CO2 to molecular models of pyridinic N-doped graphene 3.84 73  
HB49 SecTest NCED 49 Binding energies of small- and medium-sized hydrogen-bonded systems 14.12 74–76  
Ionic43 SecTest NCED 43 Binding energies of anion-neutral, cation-neutral, and anion-cation dimers 69.94 77  
H2O6Bind8 Train NCEC Binding energies of isomers of (H2O)6 46.96 78 and 79  
HW6Cl Train NCEC Binding energies of Cl(H2O)n (n = 1 − 6) 57.71 78 and 79  
HW6F Train NCEC Binding energies of F(H2O)n (n = 1 − 6) 81.42 78 and 79  
FmH2O10 PriTest NCEC 10 Binding energies of isomers of F(H2O)10 168.50 78 and 79  
Shields38 PriTest NCEC 38 Binding energies of (H2O)n (n = 2 − 10) 51.54 80  
SW49Bind345 PriTest NCEC 31 Binding energies of isomers of SO42(H2O)n (n = 3 − 5) 28.83 81  
SW49Bind6 PriTest NCEC 18 Binding energies of isomers of SO42(H2O)6 62.11 81  
WATER27 PriTest NCEC 23 Binding energies of neutral and charged water clusters 67.48 82 and 83  
3B-69-TRIM SecTest NCEC 69 Binding energies of trimers, with three different orientations of 23 distinct molecular crystals 14.36 71  
CE20 SecTest NCEC 20 Binding energies of water, ammonia, and hydrogen fluoride clusters 30.21 84 and 85  
H2O20Bind10 SecTest NCEC 10 Binding energies of isomers of (H2O)20 (low-energy structures) 198.16 79  
H2O20Bind4 SecTest NCEC Binding energies of isomers of (H2O)20 (dod, fc, fs, and es) 206.12 82, 83, 86, and 87  
TA13 Train NCD 13 Binding energies of dimers involving radicals 22.00 88  
XB18 Train NCD Binding energies of small halogen-bonded dimers 5.23 89  
Bauza30 PriTest NCD 30 Binding energies of halogen-, chalcogen-, and pnicogen-bonded dimers 23.65 90 and 91  
CT20 PriTest NCD 20 Binding energies of charge-transfer complexes 1.07 92  
XB51 PriTest NCD 20 Binding energies of large halogen-bonded dimers 6.06 89  
AlkIsomer11 Train IE 11 Isomerization energies of n = 4 − 8 alkanes 1.81 93  
Butanediol65 Train IE 65 Isomerization energies of butane-1,4-diol 2.89 94  
ACONF PriTest IE 15 Isomerization energies of alkane conformers 2.23 83 and 95  
CYCONF PriTest IE 11 Isomerization energies of cysteine conformers 2.00 83 and 96  
Pentane14 PriTest IE 14 Isomerization energies of stationary points on the n-pentane torsional surface 6.53 97  
SW49Rel345 PriTest IE 31 Isomerization energies of SO42(H2O)n (n = 3 − 5) 1.47 81  
SW49Rel6 PriTest IE 18 Isomerization energies of SO42(H2O)6 1.22 81  
H2O16Rel5 SecTest IE Isomerization energies of (H2O)16 (boat and fused cube structures) 0.40 98  
H2O20Rel10 SecTest IE 10 Isomerization energies of (H2O)20 (low-energy structures) 2.62 79  
H2O20Rel4 SecTest IE Isomerization energies of (H2O)20 (dod, fc, fs, and es) 5.68 82, 83, 86, and 87  
Melatonin52 SecTest IE 52 Isomerization energies of melatonin 5.54 99  
YMPJ519 SecTest IE 519 Isomerization energies of the proteinogenic amino acids 8.33 100  
EIE22 Train ID 22 Isomerization energies of enecarbonyls 4.97 101  
Styrene45 Train ID 45 Isomerization energies of C8H8 68.69 102  
DIE60 PriTest ID 60 Isomerization energies of reactions involving double-bond migration in conjugated dienes 5.06 103  
ISOMERIZATION20 PriTest ID 20 Isomerization energies 44.05 104  
C20C24 SecTest ID Isomerization energies of the ground state structures of C20 and C24 36.12 105  
AlkAtom19 Train TCE 19 n = 1 − 8 alkane atomization energies 1829.31 93  
BDE99nonMR Train TCE 83 Bond dissociation energies (SR) 114.98 104  
G21EA Train TCE 25 Adiabatic electron affinities of atoms and small molecules 40.86 83 and 106  
G21IP Train TCE 36 Adiabatic ionization potentials of atoms and small molecules 265.35 83 and 106  
TAE140nonMR Train TCE 124 Total atomization energies (SR) 381.05 104  
AlkIsod14 PriTest TCE 14 n = 3 − 8 alkane isodesmic reaction energies 10.35 93  
BH76RC PriTest TCE 30 Reaction energies from HTBH38 and NHTBH38 30.44 83, 107, and 108  
EA13 PriTest TCE 13 Adiabatic electron affinities 42.51 109  
HAT707nonMR PriTest TCE 505 Heavy-atom transfer energies (SR) 74.79 104  
IP13 PriTest TCE 13 Adiabatic ionization potentials 256.24 109  
NBPRC PriTest TCE 12 Reactions involving NH3/BH3 and PH3/BH3 30.52 83, 110, and 111  
SN13 PriTest TCE 13 Nucleophilic substitution energies 25.67 104  
BSR36 SecTest TCE 36 Hydrocarbon bond separation reaction energies 20.06 111 and 112  
HNBrBDE18 SecTest TCE 18 Homolytic N–Br bond dissociation energies 56.95 113  
WCPT6 SecTest TCE Tautomerization energies for water-catalyzed proton-transfer reactions 7.53 114  
BDE99MR PriTest TCD 16 Bond dissociation energies (MR) 54.51 104  
HAT707MR PriTest TCD 202 Heavy-atom transfer energies (MR) 83.41 104  
TAE140MR PriTest TCD 16 Total atomization energies (MR) 147.20 104  
PlatonicHD6 SecTest TCD Homodesmotic reactions involving platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 136.71 115  
PlatonicID6 SecTest TCD Isodesmic reactions involving platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 96.19 115  
PlatonicIG6 SecTest TCD Isogyric reactions involving platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 356.33 115  
PlatonicTAE6 SecTest TCD Total atomization energies of platonic hydrocarbon cages, CnHn (n = 4, 6, 8, 10, 12, 20) 2539.27 115  
BHPERI26 Train BH 26 Barrier heights of pericyclic reactions 23.15 83 and 116  
CRBH20 Train BH 20 Barrier heights for cycloreversion of heterocyclic rings 46.40 117  
DBH24 Train BH 24 Diverse barrier heights 28.34 118 and 119  
CR20 PriTest BH 20 Cycloreversion reaction energies 22.31 120  
HTBH38 PriTest BH 38 Hydrogen transfer barrier heights 16.05 108  
NHTBH38 PriTest BH 38 Non-hydrogen transfer barrier heights 33.48 107  
PX13 SecTest BH 13 Barrier heights for proton exchange in water, ammonia, and hydrogen fluoride clusters 28.83 84 and 85  
WCPT27 SecTest BH 27 Barrier heights of water-catalyzed proton-transfer reactions 38.73 114  
AE18 Train – 18 Absolute atomic energies of hydrogen through argon 148 739.00 121  
RG10 PriTest – 569 PECs for the 10 rare-gas dimers involving helium through krypton 1.21 122  
TABLE II.

Progression from the 7-parameter fits to the 15-parameter fits based on the SR-LSDA+VV10 initial guess. The first column indicates the additional coefficient that is frozen (compulsorily selected) in order to achieve the associated set of fits. The second column contains the total number of least-squares fits that are performed, of which only the top 100,000 (ranked by total WRMSD) are analyzed. The fourth column indicates the number of fits (of 100,000) that remain after the first filter is applied. The fifth column indicates the number of fits (of at most 1000) from the previous column that remain after the second filter is applied. Finally, the last column indicates the coefficient that is most commonly utilized in the surviving fits (shown in Column 5).

FrozenNo of initial fits# (Fitted)Filter 1Filter 2Common
– 2 898 753 715 3 342 413 cx,01 
cx,01 2 641 902 120 1 476 454 cx,10 
cx,10 2 404 808 340 2 806 623 ccss,10 
ccss,10 2 186 189 400 10 7 144 764 ccss,20 
ccss,20 1 984 829 850 11 18 021 591 ccos,20 
ccos,20 1 799 579 064 12 6 531 469 ccss,00 
ccss,00 1 629 348 612 13 2 985 696 ccos,10 
ccos,10 1 473 109 704 14 1 868 726 ccos,21 
ccos,21 1 329 890 705 15  120 101 ccos,60 
FrozenNo of initial fits# (Fitted)Filter 1Filter 2Common
– 2 898 753 715 3 342 413 cx,01 
cx,01 2 641 902 120 1 476 454 cx,10 
cx,10 2 404 808 340 2 806 623 ccss,10 
ccss,10 2 186 189 400 10 7 144 764 ccss,20 
ccss,20 1 984 829 850 11 18 021 591 ccos,20 
ccos,20 1 799 579 064 12 6 531 469 ccss,00 
ccss,00 1 629 348 612 13 2 985 696 ccos,10 
ccos,10 1 473 109 704 14 1 868 726 ccos,21 
ccos,21 1 329 890 705 15  120 101 ccos,60 
FIG. 2.

Total WRMSDs corresponding to the least-squares fits from the SR-LSDA+VV10 initial guess (red) and the “Best 8” updated guess (blue). The smallest total WRMSD for a given number of linear parameters is displayed only for the “Best 8” data. The total WRMSD of the least-squares fit that corresponds to ωB97M-V is boxed.

FIG. 2.

Total WRMSDs corresponding to the least-squares fits from the SR-LSDA+VV10 initial guess (red) and the “Best 8” updated guess (blue). The smallest total WRMSD for a given number of linear parameters is displayed only for the “Best 8” data. The total WRMSD of the least-squares fit that corresponds to ωB97M-V is boxed.

Close modal
TABLE III.

Linear parameters from the beginning of all six cycles of the self-consistent optimization of ωB97M-V. The “Best 8” column refers to the freezing of eight commonly occurring parameters. The nonlinear parameters that are taken from previous studies2,22,123 are γx = 0.004, γc,ss = 0.2, γc,os = 0.006, ω = 0.3, b = 6, and C = 0.01.

Parameter12 (Best 8)3456 (Final)
cx,00 0.85 0.85 0.85 0.85 0.85 0.85 
cx,10 −0.097 0.265 0.259 0.259 0.259 
cx,01 0.789 1.014 1.007 1.007 1.007 
ccss,00 0.216 0.457 0.443 0.443 0.443 
ccss,10 −2.496 −4.868 −4.55 −4.536 −4.535 
ccss,20 −0.816 −3.401 −3.391 −3.39 −3.39 
ccss,43 4.107 4.267 4.278 4.278 
ccss,04 −1.533 −1.438 −1.437 −1.437 
ccos,00 
ccos,10 3.13 1.573 1.372 1.359 1.358 
ccos,20 1.736 3.002 2.931 2.925 2.924 
ccos,60 −1.736 −1.419 −1.392 −1.39 
ccos,21 −1.591 −8.241 −8.776 −8.81 −8.812 
ccos,61 8.582 9.113 9.141 9.142 
cx 0.15 0.15 0.15 0.15 0.15 0.15 
Parameter12 (Best 8)3456 (Final)
cx,00 0.85 0.85 0.85 0.85 0.85 0.85 
cx,10 −0.097 0.265 0.259 0.259 0.259 
cx,01 0.789 1.014 1.007 1.007 1.007 
ccss,00 0.216 0.457 0.443 0.443 0.443 
ccss,10 −2.496 −4.868 −4.55 −4.536 −4.535 
ccss,20 −0.816 −3.401 −3.391 −3.39 −3.39 
ccss,43 4.107 4.267 4.278 4.278 
ccss,04 −1.533 −1.438 −1.437 −1.437 
ccos,00 
ccos,10 3.13 1.573 1.372 1.359 1.358 
ccos,20 1.736 3.002 2.931 2.925 2.924 
ccos,60 −1.736 −1.419 −1.392 −1.39 
ccos,21 −1.591 −8.241 −8.776 −8.81 −8.812 
ccos,61 8.582 9.113 9.141 9.142 
cx 0.15 0.15 0.15 0.15 0.15 0.15 
FIG. 3.

Exchange, same-spin correlation, and opposite-spin correlation inhomogeneity correction factor plots for the ωB97M-V density functional (bottom). In addition, the inhomogeneity correction factor plots for the SR-LSDA+VV10 initial guess (top) and the “Best 8” updated guess (middle) are included for comparison.

FIG. 3.

Exchange, same-spin correlation, and opposite-spin correlation inhomogeneity correction factor plots for the ωB97M-V density functional (bottom). In addition, the inhomogeneity correction factor plots for the SR-LSDA+VV10 initial guess (top) and the “Best 8” updated guess (middle) are included for comparison.

Close modal
TABLE IV.

Details for the 11 exemplary functionals chosen for comparison to ωB97M-V. L stands for local, GH stands for global hybrid, and RSH stands for range-separated hybrid. The second column lists the number of parameters that were optimized on a training set for the given functional. The third column lists the percentage of exact exchange, 100 ⋅ cx, as well as the value for ω in parentheses, if applicable. The column labeled UEG indicates whether or not the uniform electron gas limits were satisfied.

Functional# (Fitted)100 ⋅ cx (ω)ClassRungUEGReferences
B97-D3(BJ) 12 L GGA DFT-D3(BJ) No 6–8  
B97M-V 12 L meta-GGA VV10 Yes 29  
ωB97X-V 10 16.7-100 (0.3) RSH GGA VV10 No 22  
ωB97M-V 12 15-100 (0.3) RSH meta-GGA VV10 No … 
ωB97X-D 15 22.2-100 (0.2) RSH GGA CHG Yes 20  
ωM05-D 21 37.0-100 (0.2) RSH meta-GGA CHG Yes 36  
M06-2X 29 54 GH meta-GGA Yes 32  
M08-HX 47 52.23 GH meta-GGA Yes 34  
M11 40 42.8-100 (0.25) RSH meta-GGA Yes 35  
MN15 59 44 GH meta-NGA No 39  
M06-L 34 L meta-GGA Yes 27  
MN15-L 58 L meta-NGA No 38  
Functional# (Fitted)100 ⋅ cx (ω)ClassRungUEGReferences
B97-D3(BJ) 12 L GGA DFT-D3(BJ) No 6–8  
B97M-V 12 L meta-GGA VV10 Yes 29  
ωB97X-V 10 16.7-100 (0.3) RSH GGA VV10 No 22  
ωB97M-V 12 15-100 (0.3) RSH meta-GGA VV10 No … 
ωB97X-D 15 22.2-100 (0.2) RSH GGA CHG Yes 20  
ωM05-D 21 37.0-100 (0.2) RSH meta-GGA CHG Yes 36  
M06-2X 29 54 GH meta-GGA Yes 32  
M08-HX 47 52.23 GH meta-GGA Yes 34  
M11 40 42.8-100 (0.25) RSH meta-GGA Yes 35  
MN15 59 44 GH meta-NGA No 39  
M06-L 34 L meta-GGA Yes 27  
MN15-L 58 L meta-NGA No 38  
FIG. 5.

RMSDs in kcal/mol for 24 of the 25 training datasets (AE18 is excluded) for 12 density functionals. Table I contains information regarding the datasets, and the datatypes are explained in Section IV.

FIG. 5.

RMSDs in kcal/mol for 24 of the 25 training datasets (AE18 is excluded) for 12 density functionals. Table I contains information regarding the datasets, and the datatypes are explained in Section IV.

Close modal
FIG. 6.

RMSDs in kcal/mol for 34 of the 35 primary test datasets (RG10 is excluded) for 12 density functionals. Table I contains information regarding the datasets, and the datatypes are explained in Section IV.

FIG. 6.

RMSDs in kcal/mol for 34 of the 35 primary test datasets (RG10 is excluded) for 12 density functionals. Table I contains information regarding the datasets, and the datatypes are explained in Section IV.

Close modal
FIG. 7.

RMSDs in kcal/mol for the 24 secondary test datasets for 12 density functionals. Table I contains information regarding the datasets, and the datatypes are explained in Section IV.

FIG. 7.

RMSDs in kcal/mol for the 24 secondary test datasets for 12 density functionals. Table I contains information regarding the datasets, and the datatypes are explained in Section IV.

Close modal
FIG. 4.

RMSDs in kcal/mol for 8 datatypes for 12 density functionals. These datatype RMSDs include data points from the primary and secondary test sets only. NCED stands for non-covalent dimers (easy), NCEC stands for non-covalent clusters (easy), NCD stands for non-covalent dimers (difficult), IE stands for isomerization energies (easy), ID stands for isomerization energies (difficult), TCE stands for thermochemistry (easy), TCD stands for thermochemistry (difficult), and BH stands for barrier heights. The partitioning of the 3547 data points contained in this figure into the 8 datatypes is: 1433, 223, 70, 679, 88, 660, 258, and 136.

FIG. 4.

RMSDs in kcal/mol for 8 datatypes for 12 density functionals. These datatype RMSDs include data points from the primary and secondary test sets only. NCED stands for non-covalent dimers (easy), NCEC stands for non-covalent clusters (easy), NCD stands for non-covalent dimers (difficult), IE stands for isomerization energies (easy), ID stands for isomerization energies (difficult), TCE stands for thermochemistry (easy), TCD stands for thermochemistry (difficult), and BH stands for barrier heights. The partitioning of the 3547 data points contained in this figure into the 8 datatypes is: 1433, 223, 70, 679, 88, 660, 258, and 136.

Close modal
FIG. 8.

Equilibrium bond length (EBL) RMSDs in Å and equilibrium binding energy (EBE) RMSDs in kcal/mol for 12 density functionals. The first section contains the EBL RMSDs while the second section contains the EBE RMSDs. The All* category contains 81 data points and is a combination of BzDC215, NBC10, and S66x8. More information regarding the datasets and excluded potential energy curves can be found in Table I and Section VII C, respectively.

FIG. 8.

Equilibrium bond length (EBL) RMSDs in Å and equilibrium binding energy (EBE) RMSDs in kcal/mol for 12 density functionals. The first section contains the EBL RMSDs while the second section contains the EBE RMSDs. The All* category contains 81 data points and is a combination of BzDC215, NBC10, and S66x8. More information regarding the datasets and excluded potential energy curves can be found in Table I and Section VII C, respectively.

Close modal
FIG. 9.

Potential energy curves (computed with the def2-QZVPPD basis set and the (99,590)/SG-1 grid) for the benzene-argon dimer from BzDC215 as computed by the 12 benchmarked density functionals. The gray curve represents the DFT method, while the blue curve represents the reference method. The line immediately following the functional name contains the equilibrium bond length in Å and the error (with respect to the reference) in parentheses. The following line contains the same information for the equilibrium binding energy (in kcal/mol).

FIG. 9.

Potential energy curves (computed with the def2-QZVPPD basis set and the (99,590)/SG-1 grid) for the benzene-argon dimer from BzDC215 as computed by the 12 benchmarked density functionals. The gray curve represents the DFT method, while the blue curve represents the reference method. The line immediately following the functional name contains the equilibrium bond length in Å and the error (with respect to the reference) in parentheses. The following line contains the same information for the equilibrium binding energy (in kcal/mol).

Close modal
FIG. 10.

RMSDs in kcal/mol for 4 datasets computed with 21 different basis sets. S66 represents non-covalent interactions, Pentane14 represents isomerization energies, AlkAtom19 represents thermochemistry, and CRBH20 represents barrier heights. The S66 dataset is computed both with and without counterpoise corrections (designated CP and noCP, respectively). The RMSDs are taken with respect to both the reference values (vs. Ref) as well as the training set basis (vs. TSB), which is def2-QZVPPD. The basis sets are sorted based on the geometric mean (GM) of the S66 CP, Pentane14, AlkAtom19, and CRBH20 RMSDs relative to the TSB. The S66 noCP RMSD is excluded from the GM because it unfairly disadvantages triple-zeta basis sets. The number of basis functions (BF) that each basis set contains for octane is shown in the last column.

FIG. 10.

RMSDs in kcal/mol for 4 datasets computed with 21 different basis sets. S66 represents non-covalent interactions, Pentane14 represents isomerization energies, AlkAtom19 represents thermochemistry, and CRBH20 represents barrier heights. The S66 dataset is computed both with and without counterpoise corrections (designated CP and noCP, respectively). The RMSDs are taken with respect to both the reference values (vs. Ref) as well as the training set basis (vs. TSB), which is def2-QZVPPD. The basis sets are sorted based on the geometric mean (GM) of the S66 CP, Pentane14, AlkAtom19, and CRBH20 RMSDs relative to the TSB. The S66 noCP RMSD is excluded from the GM because it unfairly disadvantages triple-zeta basis sets. The number of basis functions (BF) that each basis set contains for octane is shown in the last column.

Close modal
TABLE V.

Grid error ranges for 3247 data points from the training and primary test sets. From the original 3834 data points, the 18 data points from AE18 and the 569 data points from RG10 are excluded. The errors are taken with respect to the (250,974)/SG-1 grid. The grids are assessed with respect to the absolute error (AE) in kcal/mol.

AE (kcal/mol)[0,0.015)[0.015,0.03)[0.03,0.045)[0.045,0.06)[0.06,0.075)[0.075,0.09)[0.09,∞)
(99,590)/SG-1 3247 
(75,590)/SG-1 3244 
(99,302)/SG-1 3200 39 
(75,302)/SG-1 3189 50 
(75,302)/SG-0 3122 107 11 
SG-1/SG-0 1851 587 303 160 101 83 162 
AE (kcal/mol)[0,0.015)[0.015,0.03)[0.03,0.045)[0.045,0.06)[0.06,0.075)[0.075,0.09)[0.09,∞)
(99,590)/SG-1 3247 
(75,590)/SG-1 3244 
(99,302)/SG-1 3200 39 
(75,302)/SG-1 3189 50 
(75,302)/SG-0 3122 107 11 
SG-1/SG-0 1851 587 303 160 101 83 162 
FIG. 11.

RMSDs in kcal/mol for 8 datatypes for 12 density functionals. These datatype RMSDs include data points from the training, primary test, and secondary test sets. NCED stands for non-covalent dimers (easy), NCEC stands for non-covalent clusters (easy), NCD stands for non-covalent dimers (difficult), IE stands for isomerization energies (easy), ID stands for isomerization energies (difficult), TCE stands for thermochemistry (easy), TCD stands for thermochemistry (difficult), and BH stands for barrier heights. The partitioning of the 4399 data points contained in this figure into the 8 datatypes is: 1744, 243, 91, 755, 155, 947, 258, and 206.

FIG. 11.

RMSDs in kcal/mol for 8 datatypes for 12 density functionals. These datatype RMSDs include data points from the training, primary test, and secondary test sets. NCED stands for non-covalent dimers (easy), NCEC stands for non-covalent clusters (easy), NCD stands for non-covalent dimers (difficult), IE stands for isomerization energies (easy), ID stands for isomerization energies (difficult), TCE stands for thermochemistry (easy), TCD stands for thermochemistry (difficult), and BH stands for barrier heights. The partitioning of the 4399 data points contained in this figure into the 8 datatypes is: 1744, 243, 91, 755, 155, 947, 258, and 206.

Close modal

Since several of the density functionals that appear in this paper contain both a local exchange-correlation functional and a nonlocal correlation functional, the integration grids used to evaluate these two components will be reported together, separated by a forward slash (local/nonlocal). Regarding the integration grid, the notation (x,y) indicates x radial shells with y angular grid points per shell.

The (99,590)/SG-1 grid was used for all of the datasets in the training, primary test, and secondary test sets, except AE18 and RG10. The (500,974)/(75,302) grid was used for both of these datasets. The def2-QZVPPD basis set was used for all of the datasets in the training, primary test, and secondary test sets, without counterpoise corrections.

For the basis set limit tests, the (99,590)/SG-1 grid was used, while 21 basis sets from 4 different families were tested: the cc-pVXZ and aug-cc-pVXZ (X = D,T,Q) Dunning basis sets,41–43 the pc-X and aug-pc-X (X = 0,1,2,3) Jensen basis sets,44–46 the def2-SVP, def2-SVPD, and def2-XZVPP and def2-XZVPPD (X = T,Q) Karlsruhe basis sets,47–51 and the 6-311++G(3df,3pd) Pople basis set (LP). For the integration grid limit tests, the def2-QZVPPD basis set was used, while seven different grid combinations were tested: (250,974)/SG-1, (99,590)/SG-1, (99,302)/SG-1, (75,590)/SG-1, (75,302)/SG-1, (75,302)/SG-0, and SG-1/SG-0.52,53

All of the calculations were performed with a development version of Q-Chem 4.0.54 

A total of 84 existing datasets are employed in this work, containing 4986 data points (and requiring 5931 single-point calculations). 82 of these 84 datasets (AE18 and RG10 are excluded) are classified according to 8 categories (or datatypes): NCED (non-covalent dimers (easy)), NCEC (non-covalent clusters (easy)), NCD (non-covalent dimers (difficult)), IE (isomerization energies (easy)), ID (isomerization energies (difficult)), TCE (thermochemistry (easy)), TCD (thermochemistry (difficult)), and BH (barrier heights). Difficult interactions are characterized by either strong correlation or self-interaction error, while easy interactions are characterized by neither. The number of data points (and datasets) that are classified according to NCED, NCEC, NCD, IE, ID, TCE, TCD, and BH are 1744 (18), 243 (12), 91 (5), 755 (12), 155 (5), 947 (15), 258 (7), and 206 (8), respectively.

For the purposes of functional development and testing, the datasets are divided into three categories. A training set is used to fit the parameters of each candidate functional in the combinatorial search, and then again to self-consistently train the best candidate. A primary test set is used in conjunction with the training set to perform the combinatorial search, and identify the best candidate (details of this procedure are given in Section VI). Finally, a secondary test set is used to assess the final optimized functional. Detailed information about the training, primary test, and secondary test sets can be found in Table I. The training set contains 870 data points overall, the primary test set contains 2964 data points overall, and the secondary test set contains 1152 data points overall. Thus, the training set constitutes only 17.5% of the entire database used to develop and assess ωB97M-V.

The complete functional form for ωB97M-V is given by Equations (3)-(5). The acronyms used in this section are exchange-correlation (xc), exchange (x), correlation (c), short-range (sr), long-range (lr), meta-GGA (mGGA), same-spin (ss), opposite-spin (os), and nonlocal (nl),

(3)
(4)
(5)

The local spin-density approximation (LSDA) for exchange can be expressed in terms of the first-order spinless reduced density matrix for an infinite uniform electron gas (UEG),

(6)
(7)

where s = r1r2, r=12r1+r2, and kFσ=6π2ρσ1/3 is the spin-polarized Fermi wave vector. Integration of Equation (6) over s gives the well-known expression for the LSDA exchange energy in terms of the exchange energy density per unit volume,

(8)
(9)

Transforming ExLSDA to its short-range counterpart, Ex,srLSDA, is accomplished by replacing 1s in Equation (6) with erfcωss and carrying out the same integration. The resulting SR-LSDA exchange functional,

(10)

is conveniently identical to its unattenuated counterpart, with the exception of a multiplicative factor,

(11)

where aσ=ωkFσ and ω is a nonlinear parameter that controls the transition from local DFT exchange to nonlocal exact exchange with respect to interelectronic distance.

Accounting for inhomogeneities in the electron density is achieved by multiplying the integrand of the SR-LSDA exchange functional (Equation (10)) by a power series inhomogeneity correction factor (ICF), gx, resulting in the SR-mGGA exchange functional,

(12)
(13)
(14)
(15)

where the variable, wx,σ ∈ [ − 1, 1], is a finite domain transformation of the dimensionless ratio of the UEG kinetic energy density to the exact kinetic energy density, tσ=τσUEGτσ, with τσUEG=356π22/3ρσ5/3, and the variable, ux,σ ∈ [0, 1], is a finite domain transformation of the dimensionless spin-density gradient, sσ=|ρσ|ρσ4/3. The linear DFT exchange parameters, cx,ij, will be determined via least-squares fitting to a training set in Section VI, while γx = 0.004 is a fixed nonlinear DFT exchange parameter that was fit to the Hartree–Fock exchange energies of 20 atoms in 1986 by Becke.123 

Nonlocal exact exchange is introduced by splitting the conventional 1r Coulomb operator into a short-range component (Ex,srexact) and a long-range component (Ex,lrexact) with the erfcωrr and erfωrr Coulomb functions, respectively,

(16)
(17)

where ψ and ψ are occupied Kohn–Sham spatial orbitals and r=r1r2. Instead of setting the percentage of exact-exchange at r = 0 to zero, an optimizable parameter, cx, controls the amount of short-range exact exchange.

Closed-form expressions for the correlation energy density per particle of an infinite uniform electron gas, ϵcUEG, are only known for the low- and high-density limits of the paramagnetic and ferromagnetic cases. Using the Monte-Carlo data of Ceperley and Alder,124 Perdew and Wang developed an analytic spin-compensated representation125 for ϵcUEG. Combined with the spin-polarization interpolation formula of Vosko, Wilk, and Nusair,126 the spin-polarized PW92 correlation energy density per electron, ϵcPW92, is the starting point for the correlation functional,

(18)

Using the spin decomposition technique of Stoll and co-workers,127 the LSDA correlation energy is separated into same-spin and opposite-spin components,

(19)
(20)

where ec,σσPW92 and ec,αβPW92 are the PW92 same-spin and opposite-spin correlation energy densities per unit volume, respectively. Extending Equations (19) and (20) to account for inhomogeneities in the electron density is straightforward, since the approach used for the exchange functional can be utilized,

(21)
(22)
(23)
(24)
(25)
(26)
(27)
(28)

where tαβ=12tα+tβ and sαβ2=12sα2+sβ2. The linear DFT correlation parameters, ccss,ij and ccos,ij, will be determined via least-squares fitting to a training set in Section VI, while γc,ss = 0.2 and γc,os = 0.006 are nonlinear DFT correlation parameters that were fit to the correlation energies of neon and helium in 1997 by Becke.2 

Nonlocal correlation is taken into account via the VV10 NLC functional,9 

(29)

where Φr1,r2;{b,C} is the nonlocal correlation kernel defined in Ref. 9. The VV10 NLC functional contains two nonlinear parameters: b, which controls the short-range damping of the 1/r6 asymptote, and C, which controls the accuracy of the asymptotic C6 coefficients.

With a total of 3834 data points in the training and primary test sets, a three-dimensional nonlinear optimization of the parameters associated with range-separation (ω) and VV10 (b and C) is impractical. As a result, the values of ω = 0.3, b = 6, and C = 0.01 that were optimized for ωB97X-V are taken without further investigation. Any inaccuracies in these parameters will be accounted for during the optimization of the linear parameters.

In order to generate the data that are needed to carry out the least-squares fits, it is necessary to choose an initial guess for the linear coefficients. As explained in Section I, the value of N′ (the maximum truncation order for w in Equations (13), (22), and (26)) is set to 8 and the value of N (the maximum truncation order for u in Equations (13), (22), and (26)) is set to 4. This results in 135 coefficients that arise from the local exchange-correlation functional. Additionally, the fraction of short-range exact exchange (cx) from Equation (4) is the 136th coefficient.

For the 135 coefficients that arise due to the three power series ICFs, the most flexible initial guess is that of the SR-LSDA, where cx,00 = ccss,00 = ccos,00 = 1, and the remaining 132 coefficients are zero. For cx, the most straightforward initial guess is zero. However, since the contribution from exact exchange is bound to constitute a large fraction of the total exchange-correlation energy, it is beneficial to pick a value for cx that is as close to the final value as possible, in order to minimize discrepancies between the root-mean-square deviations (RMSD) that are generated using the initial guess, and those of the final, self-consistently optimized functional. With the value of cx = 0.167 from ωB97X-V serving as a guide, the initial guess for the fraction of short-range exact exchange is set to cx = 0.15. Finally, the only constraint that is explicitly enforced is the UEG limit for exchange (cx,00 + cx = 1).

The fundamental equations that will be used throughout the least-squares fitting procedure are

(30)

and

(31)

where Δx = xi+1xi is the change in the linear coefficients (length: 136), WTr is a diagonal matrix of training weights (dimensions: 3834 × 3834), b = ErefEi is the difference between the reference and initial guess energies (length: 3834), and A=Axi is the matrix of ICF contributions (dimensions: 3834 × 136). While the first three quantities are conceptually straightforward, it is worthwhile to further explain how to generate the A matrix.

A is most generally a (# of data points) × (# of linear parameters) matrix. As an example, the element A7,23 corresponds to the contribution to the 7th data point from the 23rd linear parameter (cx,42). Assuming that the seventh data point corresponds to the binding energy of the water dimer from the S22 dataset, computing A7,23 requires calculating the following quantity:

(32)

for the water dimer and both of its monomers, and computing the associated contribution to the binding energy. In this work, the A matrix is computed twice: once with the (99,590)/SG-1 grid and once with the (250,974)/SG-1 grid (the (500,974)/(75,302) grid is always used for AE18 and RG10). The rationale for computing A in two different grids will be explained when the filtering procedure is discussed. Unless otherwise specified, terms from Equations (30) and (31) that appear henceforth are computed with the (250,974)/SG-1 grid. To refer to a quantity, Z, computed in the (99,590)/SG-1 grid, the following notation will be used: Z′.

Since the optimization procedure involves both a training set (for determining the values of the coefficients) and a primary test set (for assessing the transferability of the resulting coefficients), a measure of the overall performance of the fits is necessary for ranking them. For this purpose, the total weighted root-mean-square-deviation (WRMSD),

(33)

is used, where W is a diagonal matrix of training and primary test set weights (dimensions: 3834 × 3834). The diagonals of W and WTr are identical for the elements that belong to the training set, while WTr contains zeros for the elements that belong to the primary test set.

The weighting scheme used for ωB97M-V is considerably different than that used for B97M-V. Initially, each data point is given a weight that corresponds to the inverse of the product of the number of data points in the associated dataset and the root-mean-square of the reaction energies in the associated dataset. These values can be found in the fourth and sixth columns of Table I, respectively. Within each of the datatypes, the weights are normalized by dividing by the smallest weight, and then exponentiated such that they lie between 1 and 2. For the determination of the weights only, the NCED and NCEC datatypes are consolidated into a single datatype, NCE, giving a total of seven datatypes. Furthermore, AE18 is included in the TCE datatype in order to receive a weight. Finally, the seven datatypes get a multiplicative weight based on intuition: 0.1 for TCD, 1 for TCE, 10 for NCD, ID, and BH, and 100 for NCE and IE. As RG10 does not belong to a datatype, it gets a weight of 10,000. At this point, all of the data points (besides those in RG10) have a weight between 0.1 and 200. In order to promote transferability, the datasets in the primary test set get another multiplicative weight of 2.

Following the initial setup described above, the search for the optimal least-squares fit can proceed. Applying the single UEG constraint to the initial parameter space of 136 brings the total number of linearly independent parameters to 135. With the available computing resources (a 64-core node), the maximum number of fits that can be performed in a single day is around two billion. Therefore 135C5 = 346 700 277, which returns only 5-parameter fits, is the largest calculation that can be performed in a single day using all 135 parameters (since 135C6 = 7 511 839 335). In order to explore fits with more parameters, it is necessary to either consider different truncations of the parameter space or compulsorily select commonly occurring parameters. The first option reduces the value of n (from nCk), allowing for larger values of k. For example, the binomial coefficients 135C5 and 82C6 are both similar in value, yet the latter allows the exploration of 6-parameter fits. The second option requires F commonly occurring parameters to be compulsorily selected (or frozen), allowing for the exploration of (n-F)Ck (k+F)-parameter fits. For example, if the results from 135C5 indicate that cx,01 is the most important parameter, a possible next step would be to freeze cx,01 (not its value but simply its inclusion in all successive fits) and explore the results of 134C5 with cx,01 frozen, giving 6-parameter fits.

For the optimization of the functional at hand, a combination of these two options is utilized. While the meta-GGA parameter space described thus far contains 135 linearly independent parameters, its GGA subset contains only 15, which amounts to 215 − 1 = 32 767 total possible fits. The GGA parameter space is fully searchable and the recent ωB97X-V density functional was developed within this subspace. Since ωB97X-V has seven linear parameters, it is plausible to assume that the minimum number of linear parameters necessary to parameterize a functional within the meta-GGA parameter space is seven. Thus, the initial parameter space of 135 should be truncated such that 7-parameter fits are possible without freezing. After considering a multitude of different truncations, a parameter space designated as 2I6I6I was selected. This includes variables up to second order individually in w and u for exchange and up to sixth order individually in w and u for same-spin and opposite-spin correlation. With 9 coefficients from exchange, 35 coefficients from same-spin correlation, 35 coefficients from opposite-spin correlation, an additional coefficient from short-range exact exchange, and a single constraint, the 2I6I6I designation reduces the parameter space from 135 to 79.

Although the 2I6I6I designation reduces the parameter space substantially, the largest calculation that is possible is still only 79C7 = 2 898 753 715. Thus, only up to 7-parameter fits can be explored. In order to advance past seven, it is necessary to compulsorily select commonly occurring parameters. The process of selecting the parameter that will be frozen involves a series of steps. First, from all of the 7-parameter fits resulting from the 79C7 optimization, the top 100,000 (ranked by total WRMSD) are saved and filtered twice. The first set of filters deals with the physical characteristics of the fits:

  • xi+125

  • 0 ≤ gx ≤ 2.273

  • −10 ≤ gc,ss ≤ 10

  • −10 ≤ gc,os ≤ 10

  • EiEi+AAΔx0.015kcal/mol.

Of the conditions listed above, the first ensures that the coefficients are small, the second ensures that the exchange functional obeys the local Lieb-Oxford bound and that all contributions to exchange are negative, the third and fourth ensure that the correlation functionals are well-behaved, and the last ensures that the absolute interaction energy errors attributed to the integration grid are no larger than 0.015 kcal/mol.

Following the first set of filters, at most 1000 of the remaining fits are passed through a second filter, which deals with the accuracy of the fits for predicting equilibrium bond lengths for non-covalent interactions. The BzDC215, NBC10, RG10, and S66x8 datasets are utilized to this effect. In total, these four datasets contain 96 potential energy curves (PEC), with BzDC215, NBC10, and RG10 each having 10, and S66x8 having 66. However, the benzene-neon dimer PEC from BzDC215 is removed because of severe integration grid issues. The remaining 95 PECs are interpolated and the equilibrium bond lengths are evaluated for all of the fits that pass through the first filter. Only fits that have an RMSD of less than 0.03 Å across the 95 PECs are allowed through the second filter.

Finally, the surviving fits are analyzed in order to determine the coefficient that is most commonly used. This coefficient is then compulsorily selected in the next set of least-squares fits in order to allow for the exploration of 8-parameter fits. This procedure was repeated until a total of eight parameters were frozen (cx,01, cx,10, ccss,10, ccss,20, ccos,20, ccss,00, ccos,10, ccos,21). The progression from the 7-parameter fits to the 15-parameter fits can be tracked in Table II.

Due to the nonlinear nature of the self-consistent field method, the A matrix changes with every update to the parameters (since A=Axi), and it is very likely that the A matrix that is constructed from the initial guess will be vastly different from the last A matrix that will be used to finalize the parameters. Furthermore, the larger the difference between the starting point and the ending point, the higher the chance that the initial RMSDs will differ significantly from the RMSDs of the final functional. In fact, this phenomenon was first encountered during the development of the B97M-V density functional and was bypassed by updating the initial guess with a better guess formed by the first nine parameters that were frozen. Since this methodology worked well during the development of B97M-V, it is utilized in the present work. Thus, the eight parameters shown in the first column of Table II (“Best 8”) are used to update the SR-LSDA+VV10 initial guess.

With the updated guess, the A matrix is computed with both the (99,590)/SG-1 and (250,974)/SG-1 grids and the same procedure outlined above is followed. However, no additional parameters need to be frozen, since with eight parameters already frozen, it is trivial to explore 9- through 15-parameter fits (71C1 through 71C7). However, both sets of filters are still applied. Beginning with the 9-parameter fits, an additional parameter is accepted only if it improves the total WRMSD by more than 0.05 kcal/mol (a protocol which was successfully utilized during the exploration of the GGA subspace40). Consequently, a 12-parameter fit emerges as the optimal fit and is self-consistently optimized. The progression of the minimum total WRMSDs from the 9- to 15-parameters fits is {–, 5.04, 3.52, 3.46, 3.44, 3.42, 3.39}. The total WRMSDs corresponding to the least-squares fits from the SR-LSDA+VV10 initial guess and the “Best 8” updated guess are shown in Figure 2 in red and blue, respectively. The smallest total WRMSD for a given number of linear parameters is displayed only for the “Best 8” data. The total WRMSD of the least-squares fit that corresponds to ωB97M-V is boxed.

Including the initial cycle (Cycle 1) with the unoptimized ωB97M-V density functional as well as the “Best 8” cycle, the self-consistent optimization of ωB97M-V required six cycles. The final parameters of ωB97M-V can be found in the last column of Table III, and Figure 3 shows the exchange, same-spin correlation, and opposite-spin correlation ICF plots for ωB97M-V (bottom). In addition, it displays the ICFs for the SR-LSDA+VV10 initial guess (top) and the “Best 8” updated guess (middle). The final ICFs of ωB97M-V are smooth and well-behaved, with the maximum value of the exchange ICF (2.116) lying well under the Lieb-Oxford bound (2.273). The lower and upper bounds of all three ICFs are

  • 0.591 ≤ gx ≤ 2.116

  • −7.482 ≤ gc,ss ≤ 4.429

  • −1.957 ≤ gc,os ≤ 4.222.

The four parameters that are chosen in addition to the “Best 8” are ccss,43, ccss,04, ccos,60, and ccos,61. Interestingly, the chosen fit does not optimize the value of short-range exact exchange away from cx = 0.15.

In order to expedite the implementation of ωB97M-V in quantum chemistry software packages, Fortran routines for the local exchange, same-spin correlation, and opposite-spin correlation functionals are provided in the supplementary material.128 Furthermore, ωB97M-V will be available in Q-Chem 4.4.

In order to verify that ωB97M-V is a broadly accurate density functional, it is necessary to compare it to existing density functionals. Thus, 11 exemplary density functionals are selected for comparison to ωB97M-V across the nearly 5000 data points in the training, primary test, and secondary test sets. Furthermore, all 12 functionals are benchmarked on 90 potential energy curves in order to assess equilibrium binding energies and equilibrium bond lengths. Details for the 11 density functionals selected for comparison to ωB97M-V are given in Table IV.

Since the training, primary test, and secondary test sets contain 84 datasets overall, it is easier to obtain a general idea of the performance of ωB97M-V by first considering the eight datatypes defined in Section IV. However, in order to make the comparison as unbiased as possible, only the data points from the primary and secondary test sets (henceforth referred to as the test set) are considered. Furthermore, since RG10 does not belong to any of the datatypes, it is excluded from this analysis. The total number of data points and datasets considered in Figure 4 are 3547 and 58, respectively. The density functionals are compared using RMSDs because this convention provides a better summary of the distribution of residuals than the mean absolute deviation (MAD).129 However, both mean signed errors (MSE) and maximum absolute deviations (MAX) are provided in Tables S1 and S2 of the supplementary material.128 

Beginning with the NCED datatype, the functional with the best performance is ωB97M-V, with an RMSD of 0.18 kcal/mol. The next best functionals are ωB97X-V and B97M-V, both with RMSDs of 0.23 kcal/mol. Thus, the test set performance of ωB97M-V for over 1400 non-covalent dimer binding energies is at least 20% better than that of both of its predecessors. Besides the VV10-containing functionals, ωM05-D and ωB97X-D perform similarly, both with RMSDs of 0.38 kcal/mol, followed by M06-2X, B97-D3(BJ), and MN15, which have RMSDs of 0.42, 0.44, and 0.48 kcal/mol, respectively. The remaining functionals (M06-L, M08-HX, M11, and MN15-L) have RMSDs in excess of 0.5 kcal/mol, with MN15-L having by far the largest (1.39 kcal/mol).

Moving on to the 223 non-covalent cluster binding energies in the test set with the NCEC datatype, the best performance is once again reserved for ωB97M-V, which has an RMSD of 0.50 kcal/mol. The next best functional (ωB97X-V) is about 30% worse, with an RMSD of 0.66 kcal/mol. Only two other functionals have RMSDs under or around 1 kcal/mol: B97M-V and ωB97X-D. The rest of the functionals have RMSDs that range from 1.73 kcal/mol (M08-HX) to 13.20 kcal/mol (MN15-L).

The NCD datatype contains non-covalent dimer binding energies that are susceptible to self-interaction error. Thus, local functionals should perform significantly worse than hybrid functionals, while hybrid functionals with a large fraction of exact exchange should perform best. Accordingly, the density functionals with the largest RMSDs across the 70 data points in the test set are the local ones, with RMSDs between 1.35 and 1.75 kcal/mol. Surprisingly, the best density functional is ωB97M-V, with an RMSD of 0.49 kcal/mol, followed by MN15, ωB97X-V, and ωM05-D, which have RMSDs of 0.61, 0.63, and 0.78 kcal/mol, respectively. The rest of the hybrid functionals have RMSDs around 1 kcal/mol.

The test set contains a total of 679 standard isomerization energies, ranging from alkane conformers to amino acid conformers. The three functionals with the best performance for the IE data points in the test set are ωB97X-V, B97M-V, and ωB97M-V, with RMSDs slightly under 0.30 kcal/mol. The next best functionals are M06-2X (0.52 kcal/mol) and ωM05-D (0.54 kcal/mol), while six of the remaining seven functionals (M08-HX, ωB97X-D, MN15, M06-L, M11, and B97-D3(BJ)) have RMSDs between 0.6 and 0.8 kcal/mol. MN15-L, on the other hand, has an RMSD (1.57 kcal/mol) that is more than 2 times larger than that of M06-L.

The ID datatype contains isomerization energies that are sensitive to self-interaction error. As with the NCD category, the local functionals exhibit the worst performance across the 88 data points in the test set, with RMSDs larger than 7.9 kcal/mol. From the hybrid functionals, ωB97X-V and ωB97M-V perform almost indistinguishably, with RMSDs of around 2.3 kcal/mol, while the next best functional, ωB97X-D, performs about 65% worse, with an RMSD of 3.84 kcal/mol. The other hybrids have RMSDs that are 2 to 3 times larger than that of ωB97M-V.

The 660 TCE data points in the test set include heavy-atom transfer energies, homolytic bond dissociation energies, as well as isodesmic reaction energies. ωB97M-V is the best-performing functional by a comfortable margin, with an RMSD of 2.45 kcal/mol. The next best functionals, M06-2X, MN15, and ωM05-D perform more than 30% worse, with RMSDs of 3.17, 3.19, and 3.28 kcal/mol, respectively. The remaining four hybrids (M08-HX, M11, ωB97X-V, and ωB97X-D) perform comparably and have RMSDs around 3.6 kcal/mol. The best local functionals are B97M-V (3.59 kcal/mol) and MN15-L (3.95 kcal/mol), both performing significantly better than M06-L (5.26 kcal/mol).

Of the 258 TCD data points in the test set, 234 are multi-reference atomization energies, bond dissociation energies, and heavy-atom transfer energies from the W4-11 database, while 24 are atomization energies and homodesmotic, isodesmic, and isogyric reactions from the Platonic24 dataset. While the first grouping should be a major challenge for hybrid functionals, the second grouping should present difficulties for local functionals. The best-performing density functional is a hybrid (ωB97M-V with an RMSD of 4.30 kcal/mol), followed by a local functional (B97M-V with an RMSD of 4.82 kcal/mol). While ωB97X-V comes in third with an RMSD of 5.01 kcal/mol, the next best functionals are ωB97X-D and ωM05-D, with RMSDs of 5.79 and 5.87 kcal/mol, respectively. MN15 and M06-2X perform about 50% and 65% worse than ωB97M-V, respectively, while M08-HX and M11 perform about 2 times worse. The second-best local functional is B97-D3(BJ), with an RMSD (7.92 kcal/mol) that is significantly smaller than that of M06-L (12.97 kcal/mol).

Finally, the BH datatype contains five test datasets (136 data points), two of which (HTBH38 and NHTBH38) are found in the Minnesota density functional training sets. Nevertheless, ωB97M-V, with an RMSD of 1.80 kcal/mol, has the smallest RMSD out of the 12 benchmarked functionals, followed by M08-HX, which performs only slightly worse, and MN15, which performs approximately 20% worse. Surprisingly, M06-2X is only sixth best, with an RMSD of 2.97 kcal/mol, followed closely by M11 (3.18 kcal/mol) and distantly by ωM05-D (4.11 kcal/mol). From the local functionals, B97M-V performs the best, with an RMSD of 3.95 kcal/mol, followed by MN15-L, which has an RMSD of 4.93 kcal/mol. M06-L performs about 25% worse than MN15-L, while the worst functional overall is B97-D3(BJ), with an RMSD of 7.85 kcal/mol.

Overall, the performance of ωB97M-V across 3547 test data points is very encouraging. Across the eight datatypes, ωB97M-V performs significantly better than the next best functional for five of the datatypes (NCED, NCEC, NCD, TCE, and TCD), is indistinguishable from the next best functional for one of the datatypes (BH), and is indistinguishable from the best functional for the two remaining datatypes (IE and ID). It is worth noting that the size of the test set (3547 data points) used to validate ωB97M-V is more than 7 times larger than the entire 2015A Minnesota database used to train and test the MN15-L functional. Furthermore, of the 4986 total data points in the training, primary test, and secondary test sets, only 870 data points are used for training, while the other 82.5% are used for testing. Thus, the transferability of ωB97M-V is satisfactorily demonstrated with the results documented thus far.

Figures 57 contain RMSDs for datasets in the training, primary test, and secondary test sets, respectively. Although there are 84 datasets in total, the AE18 and RG10 datasets from the training and primary test sets are excluded both because they are not associated with any of the datatypes and because the corresponding RMSDs are not very meaningful. The performance of ωB97M-V on the training datasets will be discussed very briefly, since it is bad scientific practice to compare the performance of a semi-empirical density functional to that of existing functionals using its own training set.

One training result worth mentioning is that across the 124 atomization energies in the TAE140nonMR dataset,104ωB97M-V affords an RMSD of 2.23 kcal/mol, which significantly improves over ωB97X-V (2.95 kcal/mol) and B97M-V (3.89 kcal/mol). This shows the improvement that is possible by including exact exchange, local meta-GGA contributions, as well as nonlocal correlation, since TAE140nonMR was included in the training set of all three functionals.

Moving away from the relatively unimportant training datasets toward the more meaningful primary test datasets (Figure 6), the performance of ωB97M-V is generally satisfactory, and the new functional has the smallest RMSD for 14 of the 34 primary test datasets considered. In order to circumvent the laborious process of documenting the performance of ωB97M-V for all 34 primary test datasets in Figure 6, only a handful of the datasets will be analyzed.

The S66 dataset69,70 is certainly the most popular of the NCED datasets in the primary test set, and the performance of ωB97M-V is very satisfactory. Its RMSD of 0.15 kcal/mol is slightly larger than that of ωB97X-V (0.13 kcal/mol), which is the best performer. ωB97M-V performs more than 2 times better than M06-2X, M08-HX, and ωB97X-D, more than 3 times better than ωM05-D, and more than 4 times better than M11 and MN15. At 0.18 kcal/mol, B97M-V has the third smallest RMSD, and is the best local functional tested.

The Shields38 dataset80 contains 38 water clusters, ranging from dimers to decamers. ωB97M-V performs the best, with an RMSD of 0.48 kcal/mol, followed by M08-HX (0.51 kcal/mol), MN15 (0.63 kcal/mol), and B97M-V, ωB97X-V, and ωB97X-D, which have RMSDs around 0.7 kcal/mol. The rest of the functionals have RMSDs between 1 and 3 kcal/mol, with the exception of MN15-L, which performs very poorly (10.44 kcal/mol). Binding energies for larger clusters are evaluated later, with the 14 water 20-mers in the secondary test set.

The CYCONF dataset is taken from the GMTKN30 database83,96 and contains the isomerization energies of 10 cysteine conformers. Again, ωB97M-V performs the best for this dataset, with a small RMSD of 0.07 kcal/mol that is virtually indistinguishable from that of ωB97X-V, and nearly 3.5 times better than that of the next best functional, M06-2X.

The HAT707nonMR dataset from the W4-11 database104 contains 505 heavy-atom transfer energies, and is one of the largest datasets in the primary test set. ωB97M-V affords an impressive RMSD of 2.64 kcal/mol on this dataset, performing nearly 20% better than the next best functional, M06-2X (3.27 kcal/mol), and 30% better than ωB97X-V (3.84 kcal/mol).

The most interesting datasets contained in this paper are found in the secondary test set, as most of them are taken from papers that were very recently published. Furthermore, the secondary test set is the truest form of transferability testing, as it was compiled and evaluated after ωB97M-V was fully self-consistently trained. Thus, the bulk of the remaining discussion in this section will be focused on the datasets from the secondary test set (Figure 7).

3B-69-DIM is a dataset created from the 3B-69 dataset of Beran and co-workers71 and contains all relevant pairs of monomers that can be constructed from the 69 trimers. This results in a total of 207 dimer binding energies and serves as a stringent test of transferability for the new functional. ωB97M-V performs outstandingly for this dataset, with an RMSD of 0.16 kcal/mol, followed by ωB97X-V and B97M-V, with RMSDs of 0.20 and 0.21 kcal/mol, respectively. The next best functional, ωM05-D, performs nearly 2 times worse than ωB97M-V, while the best Minnesota functional, MN15, performs 3 times worse.

Since the two NCED datasets discussed thus far (S66 and 3B-69-DIM) contain a mixture of hydrogen-bonded and dispersion-bound systems, it is useful to consider a dataset that deals specifically with the latter. For this purpose, the AlkBind12 dataset,72 which contains saturated and unsaturated hydrocarbon dimers, will be analyzed. The RMS of the 12 reference energies comprising AlkBind12 is only 3.14 kcal/mol, indicating that the systems are fairly weakly bound. The two best functionals are ωB97X-V and ωB97M-V, with almost identical RMSDs of 0.12 and 0.13 kcal/mol, respectively. Only three other functionals manage RMSDs under 0.3 kcal/mol (an error of roughly 10%): B97M-V, M08-HX, and M06-2X. MN15 overbinds the dimers with an RMSD (1.18 kcal/mol) that is almost 10 times larger than that of ωB97M-V, while the worst-performing functional is MN15-L, which completely overbinds the alkanes with an RMSD of 3.49 kcal/mol.

HB4974–76 is a very interesting dataset constructed by Boese, and contains the binding energies of 49 hydrogen-bonded dimers. In fact, a recent benchmark of density functionals on the HB49 dataset found that MP2 at the basis set limit, with an RMSD of approximately 0.3 kcal/mol, performed better than all of the tested Rung 1-4 density functionals. Therefore, it is of interest to assess the performance of ωB97M-V on the HB49 dataset. The results are very encouraging: with a low RMSD of 0.23 kcal/mol, ωB97M-V is the only functional tested which significantly outperforms MP2. In addition, ωB97X-V performs very comparably to MP2, with an RMSD of 0.29 kcal/mol. From the local functionals, B97M-V performs best, with an RMSD of 0.47 kcal/mol.

While the 3B-69 dataset was originally intended as a benchmark for three-body intermolecular interaction energies, it can also be used as a benchmark for trimer binding energies (3B-69-TRIM). This is a good transferability test for ωB97M-V, since very few trimers are found in its training set. ωB97M-V performs very well for this dataset, with the smallest RMSD of 0.32 kcal/mol. The next best functionals are ωB97X-V and B97M-V, with RMSDs of 0.39 and 0.47 kcal/mol, respectively. Only two other functionals manage RMSDs under 1 kcal/mol: ωM05-D and ωB97X-D, with RMSDs of 0.65 and 0.88 kcal/mol, respectively. The best Minnesota functional, MN15 (1.11 kcal/mol), performs more than 3 times worse than ωB97M-V.

Having tested the performance of ωB97M-V for small- to medium-sized water clusters with the Shields38 dataset in the primary test set, it is time to consider the H2O20Bind10 and H2O20Bind4 datasets in the secondary test set, as they contain a total of 14 water 20-mer binding energies. In order to address both datasets simultaneously, the geometric mean of the two RMSDs (GMRMSD) will be considered. ωB97M-V performs best overall, with a GMRMSD of 1.01 kcal/mol, while ωB97X-D and ωB97X-V perform second and third best, with GMRMSDs of 1.38 and 1.48 kcal/mol. ωM05-D and B97M-V perform very similarly, with GMRMSDs around 2.85 kcal/mol, while M06-2X is the best of the tested Minnesota functionals, with a GMRMSD of 3.40 kcal/mol. After B97M-V, the next best local functional is M06-L, with a GMRMSD (7.16 kcal/mol) that is 6 times smaller than that of MN15-L (43.24 kcal/mol).

While the binding energies of small, medium, and large water clusters have been thoroughly addressed thus far, it is important to assess the performance of ωB97M-V for the relative energies of water clusters. This is done with the help of three datasets from the secondary test set: H2O16Rel5, H2O20Rel10, and H2O20Rel4. Once again, the geometric mean of these three datasets will be considered for brevity. The GMRMSD of ωB97M-V across these three datasets is remarkably small, at only 0.09 kcal/mol. The next best functionals are ωB97X-D and ωB97X-V, with very similar GMRMSDs of 0.22 and 0.24 kcal/mol, respectively. B97M-V is the best local functional, with a GMRMSD of 0.41 kcal/mol, while the two remaining non-Minnesota functionals (ωM05-D and B97-D3(BJ)) have GMRMSDs of 0.67 and 1.04 kcal/mol, respectively. All of the tested Minnesota functionals perform poorly for these isomerization energies, with GMRMSDs ranging from 0.98 kcal/mol (M08-HX) to 4.17 kcal/mol (MN15-L).

A recent benchmark by Karton and co-workers on the YMPJ519 dataset of amino acid isomerization energies100 found ωB97X-V to be the best Rung 1-4 density functional. Thus, it is important to verify that ωB97M-V performs as well as its GGA counterpart. Accordingly, both ωB97X-V and ωB97M-V have impressive RMSDs of 0.30 and 0.32 kcal/mol, respectively, while the smallest RMSD is surprisingly reserved for B97M-V, at 0.28 kcal/mol. The rest of the functionals have RMSDs that range from 0.49 kcal/mol (M06-2X) to 0.77 kcal/mol (MN15) to 1.51 kcal/mol (MN15-L).

Another recent benchmark by Martin and co-workers assessed the performance of various density functionals for the relative energies of a handful of C20 and C24 structures.105 The study found that only double hybrid functionals were able to afford RMSDs under 10 kcal/mol (with the smallest RMSD being around 8.3 kcal/mol). Therefore, it is interesting to assess the performance of ωB97M-V on this dataset to see if it can break the 10 kcal/mol barrier. Both ωB97X-V and ωB97M-V manage RMSDs under 7 kcal/mol, with the former slightly outperforming the latter. By contrast, the best local functional is B97M-V with an RMSD of 25.39 kcal/mol, followed closely by MN15-L (27.43 kcal/mol). From the remaining hybrid functionals, ωB97X-D, MN15, ωM05-D, and M08-HX manage RMSDs under 20 kcal/mol, while M11 and M06-2X perform more than 3 times worse than ωB97M-V.

A very challenging benchmark set by Martin and co-workers115 assessed the performance of density functionals for various reactions (homodesmotic, isodesmic, and isogyric) involving platonic hydrocarbon cages, in addition to their atomization energies. While the individual RMSDs for these four datasets are given in Figure 7, the functionals will be assessed based on the geometric mean of the four RMSDs. Although the original study found ωB97X-V to be the most promising Rung 1-4 functional overall, ωB97M-V manages a GMRMSD of only 3.86 kcal/mol, compared to ωB97X-V’s GMRMSD of 5.99 kcal/mol. Thus, ωB97M-V improves over ωB97X-V by over 35%. The next best functional (B97M-V) is surprisingly a local one and has a GMRMSD of 7.27 kcal/mol. The performance of B97M-V is certainly noteworthy, since the two local Minnesota functionals have GMRMSDs larger than 20 kcal/mol. While M06-2X, MN15, and M08-HX perform at least 3 times worse than ωB97M-V, M11 performs more than 4.5 times worse than ωB97M-V.

Within the NCED category, the BzDC215, NBC10, and S66x8 datasets contain potential energies curves that can be used to assess the accuracy of density functionals for predicting equilibrium properties of dimers. Furthermore, the RG10 dataset contains all 10 PECs that can be constructed between the rare-gas dimers from helium to krypton. In total, these four datasets contain 96 PECs, with BzDC215, NBC10, and RG10 each having 10, and S66x8 having 66. Unfortunately, even with the (99,590)/SG-1 grid, some of the resulting potential energy curves are too oscillatory to be accurately interpolated,130–132 primarily for the Minnesota density functionals. Consequently, the benzene-neon dimer and the benzene-argon dimer PECs from BzDC215 were removed, the sandwich benzene dimer, the methane dimer, and the sandwich (S2) pyridine dimer PECs from NBC10 were removed, and the helium dimer PEC from RG10 was removed, leaving a total of 90 potential energy curves. Figure 8 contains the equilibrium bond length (EBL) and equilibrium binding energy (EBE) RMSDs for these four datasets, along with the corresponding total RMSDs with RG10 excluded (All*). In order to keep the discussion succinct, only the RG10 and All* results will be discussed.

For the nine rare-gas dimers, the three VV10-containing functionals predict reasonably accurate equilibrium bond lengths, with RMSDs around 0.07 Å. Only two other functionals manage an EBL RMSD of under 0.1 Å for the rare-gas dimers: MN15-L and MN15. However, it is important to mention that both of these functionals were fit to at least a single point from the PEC of six of the nine rare-gas dimers considered here. The rest of the Minnesota functionals perform very poorly, with RMSDs between 0.2 and 0.7 Å. ωB97X-D also performs poorly, with an RMSD of 0.403, while the worst overall performer is ωM05-D, with an RMSD of 0.722 Å.

Moving on to the 81 PECs in the All* category, the best performance is exhibited by ωB97M-V, with a very impressive equilibrium bond length RMSD of only 0.014 Å. In fact, ωB97M-V performs almost 2 times better than the next best functional, B97M-V, and 3 times better than ωB97X-V. The six Minnesota functionals have RMSDs that range from 0.042 Å (MN15) to 0.088 Å (M08-HX and MN15-L), while ωM05-D, a range-separated hybrid functional based on the M05 functional form,30 performs almost as well as B97M-V, with an RMSD of 0.027 Å.

As for the All* equilibrium binding energies, ωB97X-V, ωB97M-V, and B97M-V perform very well, with RMSDs between 0.15 and 0.17 kcal/mol, while the rest of the functionals (except MN15-L) have RMSDs that range from 0.33 kcal/mol (M06-2X) to 0.60 kcal/mol (M06-L and MN15). MN15-L, on the other hand, has an All* EBE RMSD of nearly 2 kcal/mol, which is more than 3 times larger than that of M06-L.

Although the benzene-argon dimer was removed from the BzDC215 dataset in order to generate the RMSDs discussed thus far, it is nevertheless an interesting example of a system bound primarily by dispersion. Furthermore, due to the inherent weakness of the interaction, it is a case that can be used to assess the sensitivity of density functionals (especially meta-GGAs) to the integration grid. Figure 9 displays the PEC for the benzene-argon dimer as calculated by the 12 benchmarked density functionals with the (99,590)/SG-1 grid. It is evident that the grid filtering that was applied in Section VI worked successfully, since the PEC of ωB97M-V is nearly as smooth as that of ωB97X-V for this system. By contrast, the Minnesota functionals are far harder to converge with respect to the grid, with M06-2X, M08-HX, and MN15 appearing to behave better than M06-L, M11, and MN15-L.

Considering the accuracy of the PECs themselves, ωB97M-V, ωB97X-V, and B97M-V are very accurate, with equilibrium bond length errors of −0.008, −0.01, and −0.026 Å, respectively. Of the remaining nine functionals, ωM05-D, B97-D3(BJ), ωB97X-D, and M06-L predict equilibrium bond lengths that are at least 0.1 Å too long, while MN15-L, M06-2X, M11, and M08-HX predict equilibrium bond lengths that are at least 0.1 Å too short. The three VV10-containing functionals manage to reproduce the equilibrium binding energy rather well, with the largest error (−0.128 kcal/mol) attributed to ωB97X-V, an error of −0.083 kcal/mol attributed to ωB97M-V, and the smallest error associated with B97M-V (−0.019 kcal/mol). Despite predicting a bond length that is more than 0.1 Å too short, M06-2X underestimates the EBE of the benzene-argon dimer by only 0.072 kcal/mol. By contrast, MN15-L overbinds the system by more than 1.15 kcal/mol.

Although ωB97M-V was consistently trained and tested in the def2-QZVPPD basis set (without counterpoise corrections), it is inevitable that it will be used with different basis sets. As a result, this section explores the use of ωB97M-V with 21 basis sets from 4 different families, and makes recommendations based on how closely these basis sets can mimic the results of the training set basis (TSB), def2-QZVPPD. For this purpose, four datasets are selected and tested: S66 representing non-covalent interactions, Pentane14 representing isomerization energies, AlkAtom19 representing thermochemistry, and CRBH20 representing barrier heights. For the S66 dataset, the calculations are performed both with and without counterpoise corrections (designated CP and noCP, respectively), because it is very unlikely that a double- or triple-zeta basis set without counterpoise corrections will be able to reproduce the quadruple-zeta, def2-QZVPPD basis set binding energies. The results, summarized in Figure 10, are analyzed using two sets of RMSDs (the first relative to the reference values and the second relative to the def2-QZVPPD values) for each of the five categories of interest: S66 noCP, S66 CP, Pentane14, AlkAtom19, and CRBH20. In order to facilitate the use of Figure 10, the basis sets are sorted based on the geometric mean (GM) of the S66 CP, Pentane14, AlkAtom19, and CRBH20 RMSDs relative to the TSB. The S66 noCP RMSD is excluded from the GM because it unfairly disadvantages triple-zeta basis sets. Furthermore, the RMSDs within each dataset are color-coded, with green indicating that the use of the corresponding basis set with the type of interaction represented by the corresponding dataset is recommended, yellow indicating that the pairing should be used with caution, and red indicating that the pairing should not be used. Finally, the number of basis functions that each basis set contains for octane is shown in the last column of Figure 10.

From the outset, it is clear that a handful of basis sets are entirely incompatible with ωB97M-V, namely def2-SVP, def2-SVPD, pc-0, aug-pc-0, pc-1, aug-pc-1, cc-pVDZ, and aug-cc-pVDZ. This result is expected, since the functional is trained as close to the basis set limit as possible. On the other hand, it is clear that certain basis sets are very compatible with ωB97M-V, namely pc-3, aug-pc-3, aug-cc-pVQZ, and of course, def2-QZVPPD. These basis sets work superbly well for isomerization energies, thermochemistry, and barrier heights, and provide accurate binding energies for non-covalent interactions with and even without counterpoise corrections. With counterpoise corrections, aug-pc-2 and def2-QZVPP additionally provide satisfactory results for all four types of interactions. While the smallest basis set that can successfully handle all four categories is aug-pc-2 with 782 basis functions for octane, two smaller basis sets, pc-2 and def2-TZVPPD, are almost always satisfactory, with the former being considerably smaller than aug-pc-2. In fact, the only result that makes pc-2 not fully satisfactory is the S66 CP RMSD relative to the TSB (0.11 kcal/mol). However, the RMSD relative to the reference values is actually very impressive (0.13 kcal/mol). Thus, pc-2, with only 492 basis functions for octane, is the most economical option that can be recommended for use with ωB97M-V. def2-TZVPPD, on the other hand, has an RMSD of 2.75 kcal/mol for AlkAtom19, relative to the TSB. However, the RMSD of 1.86 kcal/mol relative to the reference values is still acceptable, making def2-TZVPPD, with only 602 basis functions for octane, another economical basis set choice. The rest of the basis sets that have not been mentioned explicitly must be used very cautiously.

Different density functionals converge to the integration grid limit at different rates. During the training process of ωB97M-V, the billions of candidate functionals were filtered such that the least-squares fit energies generated in the (99,590)/SG-1 and (250,974)/SG-1 grids differed by an absolute maximum of 0.015 kcal/mol. The effectiveness of this decision is tested by analyzing the grid sensitivity of ωB97M-V on all of the data points in the training and primary test sets (with the exception of those in AE18 and RG10) with the following grids: (250,974)/SG-1, (99,590)/SG-1, (99,302)/SG-1, (75,590)/SG-1, (75,302)/SG-1, (75,302)/SG-0, and SG-1/SG-0.

Table V summarizes the results of this comprehensive test involving 3247 data points, which are binned with respect to the absolute error (AE) in kcal/mol. The table is populated with the assumption that the (250,974)/SG-1 results are fully converged with respect to the grid. Starting with the (99,590)/SG-1 grid, it is clear that the filtering applied during the training stage has completely transferred to the final functional form, since all 3247 data points have absolute errors less than 0.015 kcal/mol. Furthermore, changing the number of radial shells from 99 to 75 (while keeping the number of angular grid points constant) seems to have a negligible effect on the results, yet accelerates the integration of the local exchange-correlation functional by 25%. On the other hand, changing the number of angular grid points from 590 to 302 (while keeping the number of radial shells constant) seems to have a much more profound effect. Based on the negligible effect of transitioning from (99,590)/SG-1 to (75,590)/SG-1, it is reasonable to assume that transitioning from (99,302)/SG-1 to (75,302)/SG-1 should have a negligible effect on the (99,302)/SG-1 results. The (75,302)/SG-1 results indicate that this is indeed true.

The goal of this grid analysis is to recommend three tiers of integration grids for use with ωB97M-V: fine, medium, and coarse. So far, it is clear that the (99,590)/SG-1 grid is certainly the finest grid that is necessary to obtain fully converged results. Thus, the (99,590)/SG-1 grid is deemed to be the “fine” option for ωB97M-V. Furthermore, the only other combination that is computationally more efficient than the (99,590)/SG-1 grid yet maintains its accuracy is the (75,590)/SG-1 grid, which receives the “medium” certification. While (75,302)/SG-1 appears to be a viable “coarse” option, it is useful to see if the nonlocal grid can be reduced without incurring substantial additional errors. Modifying the nonlocal grid from SG-1 to SG-0 (while maintaining the (75,302) local grid) only negligibly affects the results. On the other hand, modifying the local grid from (75,302) to SG-1 (with SG-0 as the nonlocal grid) has a devastating effect on the quality of the results and is absolutely not recommended. Therefore, the “coarse” specification is deemed to be the (75,302)/SG-0 grid.

Based on these results, the following three grids are recommended for use with ωB97M-V:

  • fine: (99,590)/SG-1

  • medium: (75,590)/SG-1

  • coarse: (75,302)/SG-0.

For semi-empirical density functionals, universality (or transferability) is impossible to fully guarantee, because such functionals are necessarily approximate. In other words, for a given system, a new density functional cannot necessarily improve upon existing ones, even though it may often do so. Nonetheless, within a class of functionals, transferability can be enhanced by minimizing the number of empirical parameters (i.e., avoiding overfitting), while increasing the size of the training and test sets. Even then, the use of a new density functional should only be advocated if it statistically improves upon a wide variety of existing competitors in and below its class, across a very diverse set of benchmark systems.

ωB97M-V was developed upon these foundations. A combinatorial, “survival-of-the-most-transferable” approach was utilized to screen over 10 × 109 candidate least-squares fits based on accuracy, transferability, and desired physical properties, an immense database of nearly 5000 data points was used to train and test the most promising fit, and the final, self-consistently optimized density functional was assessed against 11 well-respected semi-empirical functionals across all 4986 data points. The results are very encouraging, beginning with a large reduction in the number of trained parameters versus other meta-GGA functionals from 29 (M06-2X), 40 (M11), 47 (M08-HX), or 59 (MN15), to 12 in ωB97M-V. The use of additional parameters does not yield significantly better transferability in the screening and gradually leads to potential problems with overfitting.

The combined training, primary test, and secondary test set results (summarized in Figure 11) indicate that ωB97M-V is remarkably accurate for non-covalent interactions, isomerization energies, thermochemistry, and barrier heights across the main-group elements. For both NCED and NCEC, ωB97M-V is at least 25% more accurate than the next best tested functional, which is ωB97X-V. ωB97M-V is equivalent to ωB97X-V and B97M-V for IE, but outperforms all tested functionals by at least 25% for ID. Additionally, ωB97M-V is almost 30% more accurate than ωB97X-V for TCE, and 25% more accurate than any tested functional. For TCD, ωB97M-V significantly outperforms the next best functionals, which are B97M-V and ωB97X-V. Finally, despite only having 15% short-range exact exchange, ωB97M-V is the best tested density functional for BH.

ωB97M-V was consistently trained and tested in the def2-QZVPPD basis set (without counterpoise corrections). Thus, it is meant to be used as close as practically possible to the basis set limit. Its basis set dependence has been thoroughly tested across four types of interactions (non-covalent interactions, isomerization energies, thermochemistry, and barrier heights) in order to identify basis sets that can provide results similar in quality to those acquired with the basis set used for training the parameters. The def2-QZVPPD, pc-3, aug-pc-3, and aug-cc-pVQZ basis sets are recommended for use, both with and without counterpoise corrections (when applicable). Additionally, the def2-QZVPP and aug-pc-2 basis sets are recommended for use with counterpoise corrections (when applicable). Finally, the pc-2 and def2-TZVPPD basis sets (to be used with counterpoise corrections, when applicable) should serve as economical choices under most circumstances.

Since the evaluation of the kinetic energy density is very sensitive to the integration grid, ωB97M-V was trained with the intention of making the (99,590)/SG-1 grid the integration grid limit. This goal was met by filtering fits during the training stage based on their absolute energetic deviation from the (250,974)/SG-1 grid. Based on tests spanning 3247 data points, the (75,302)/SG-0 grid is recommended as a viable coarse option for use with ωB97M-V (particularly for quick calculations), while the (99,590)/SG-1 grid is recommended as the fine option if results near the integration grid limit are required. For most applications, the medium-sized (75,590)/SG-1 grid can serve as a compromise between these two limits.

It is important to discuss the remaining limitations of ωB97M-V. Like most Kohn–Sham density functionals, it is not appropriate for use when strong correlation effects are significant (e.g., see the TCD results in Figure 11). It contains some self-interaction error, which causes larger errors in problems involving odd electrons or holes (e.g., see the NCD results in Figure 11). Additionally, it is trained and tested on main-group elements only, so its performance on transition metal-containing systems remains to be tested as suitable reference values become available. However, the minimal empiricism of ωB97M-V gives reason for cautious optimism in cases such as organometallic systems, where strong correlation is not important.

Finally, it is desirable to apply the same approach used here to develop other semi-empirical density functionals with improved physical content, so that the resulting density functionals are likewise minimally parameterized and optimally transferable. Perhaps the most obvious next step is a range-separated hybrid, meta-GGA density functional that includes nonlocal correlation through virtual orbitals. A functional of this type should have significantly lower errors due to self-interaction. We hope to report such a development in due course.

N.M. would like to thank Jan M. L. Martin for helpful comments and for providing the reference values for the YMPJ519 dataset prior to their publication online. N.M. would like to thank Amir Karton for providing the geometries for the Platonic24 and YMPJ519 datasets prior to their publication online. N.M. would like to thank Gregory Beran for help with the 3B-69 dataset. N.M. would like to thank A. Daniel Boese for providing the geometries for the HB49 dataset. N.M. would like to thank Susi Lehtola, Jonathan Thirman, and Jon Witte for helpful comments. This work was supported by the Director, Office of Energy Research, Office of Basic Energy Sciences, Chemical Sciences Division of the U.S. Department of Energy under Contract No. DE-AC0376SF00098, and by a grant from the SciDac Program.

1.
W.
Kohn
and
L. J.
Sham
, “
Self-consistent equations including exchange and correlation effects
,”
Phys. Rev.
140
,
A1133
A1138
(
1965
).
2.
A. D.
Becke
, “
Density-functional thermochemistry. V. Systematic optimization of exchange-correlation functionals
,”
J. Chem. Phys.
107
,
8554
8560
(
1997
).
3.
J. P.
Perdew
,
A.
Ruzsinszky
,
J.
Tao
,
V. N.
Staroverov
,
G. E.
Scuseria
, and
G. I.
Csonka
, “
Prescription for the design and selection of density functional approximations: More constraint satisfaction with fewer fits
,”
J. Chem. Phys.
123
,
062201
(
2005
).
4.
A. D.
Becke
, “
Density-functional thermochemistry. III. The role of exact exchange
,”
J. Chem. Phys.
98
,
5648
5652
(
1993
).
5.
P. M. W.
Gill
,
R. D.
Adamson
, and
J. A.
Pople
, “
Coulomb-attenuated exchange energy density functionals
,”
Mol. Phys.
88
,
1005
1009
(
1996
).
6.
S.
Grimme
, “
Semiempirical GGA-type density functional constructed with a long-range dispersion correction
,”
J. Comput. Chem.
27
,
1787
1799
(
2006
).
7.
S.
Grimme
,
J.
Antony
,
S.
Ehrlich
, and
H.
Krieg
, “
A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu
,”
J. Chem. Phys.
132
,
154104
(
2010
).
8.
S.
Grimme
,
S.
Ehrlich
, and
L.
Goerigk
, “
Effect of the damping function in dispersion corrected density functional theory
,”
J. Comput. Chem.
32
,
1456
1465
(
2011
).
9.
O. A.
Vydrov
and
T. V.
Voorhis
, “
Nonlocal van der Waals density functional: The simpler the better
,”
J. Chem. Phys.
133
,
244103
(
2010
).
10.
K.
Lee
,
E. D.
Murray
,
L.
Kong
,
B. I.
Lundqvist
, and
D. C.
Langreth
, “
Higher-accuracy van der Waals density functional
,”
Phys. Rev. B
82
,
081101
(
2010
).
11.
F. A.
Hamprecht
,
A. J.
Cohen
,
D. J.
Tozer
, and
N. C.
Handy
, “
Development and assessment of new exchange-correlation functionals
,”
J. Chem. Phys.
109
,
6264
6271
(
1998
).
12.
A. D.
Boese
,
N. L.
Doltsinis
,
N. C.
Handy
, and
M.
Sprik
, “
New generalized gradient approximation functionals
,”
J. Chem. Phys.
112
,
1670
1678
(
2000
).
13.
A. D.
Boese
and
N. C.
Handy
, “
A new parametrization of exchange–correlation generalized gradient approximation functionals
,”
J. Chem. Phys.
114
,
5497
5503
(
2001
).
14.
R.
Peverati
,
Y.
Zhao
, and
D. G.
Truhlar
, “
Generalized gradient approximation that recovers the second-order density-gradient expansion with optimized across-the-board performance
,”
J. Phys. Chem. Lett.
2
,
1991
1997
(
2011
).
15.
P. J.
Wilson
,
T. J.
Bradley
, and
D. J.
Tozer
, “
Hybrid exchange-correlation functional determined from thermochemical data and ab initio potentials
,”
J. Chem. Phys.
115
,
9233
9242
(
2001
).
16.
A. D.
Boese
and
J. M. L.
Martin
, “
Development of density functionals for thermochemical kinetics
,”
J. Chem. Phys.
121
,
3405
3416
(
2004
).
17.
T. W.
Keal
and
D. J.
Tozer
, “
Semiempirical hybrid functional with improved performance in an extensive chemical assessment
,”
J. Chem. Phys.
123
,
121103
(
2005
).
18.
R.
Peverati
and
D. G.
Truhlar
, “
Communication: A global hybrid generalized gradient approximation to the exchange-correlation functional that satisfies the second-order density-gradient constraint and has broad applicability in chemistry
,”
J. Chem. Phys.
135
,
191102
(
2011
).
19.
J.-D.
Chai
and
M.
Head-Gordon
, “
Systematic optimization of long-range corrected hybrid density functionals
,”
J. Chem. Phys.
128
,
084106
(
2008
).
20.
J.-D.
Chai
and
M.
Head-Gordon
, “
Long-range corrected hybrid density functionals with damped atom-atom dispersion corrections
,”
Phys. Chem. Chem. Phys.
10
,
6615
6620
(
2008
).
21.
Y.-S.
Lin
,
G.-D.
Li
,
S.-P.
Mao
, and
J.-D.
Chai
, “
Long-range corrected hybrid density functionals with improved dispersion corrections
,”
J. Chem. Theory Comput.
9
,
263
272
(
2013
).
22.
N.
Mardirossian
and
M.
Head-Gordon
, “
ωB97X-V: A 10-parameter, range-separated hybrid, generalized gradient approximation density functional with nonlocal correlation, designed by a survival-of-the-fittest strategy
,”
Phys. Chem. Chem. Phys.
16
,
9904
9924
(
2014
).
23.
R.
Peverati
and
D. G.
Truhlar
, “
Exchange-correlation functional with good accuracy for both structural and energetic properties while depending only on the density and its gradient
,”
J. Chem. Theory Comput.
8
,
2310
2319
(
2012
).
24.
H. S.
Yu
,
W.
Zhang
,
P.
Verma
,
X.
He
, and
D. G.
Truhlar
, “
Nonseparable exchange-correlation functional for molecules, including homogeneous catalysis involving transition metals
,”
Phys. Chem. Chem. Phys.
17
,
12146
12160
(
2015
).
25.
R.
Peverati
and
D. G.
Truhlar
, “
Screened-exchange density functionals with broad accuracy for chemistry and solid-state physics
,”
Phys. Chem. Chem. Phys.
14
,
16187
16191
(
2012
).
26.
A. D.
Boese
and
N. C.
Handy
, “
New exchange-correlation density functionals: The role of the kinetic-energy density
,”
J. Chem. Phys.
116
,
9559
9569
(
2002
).
27.
Y.
Zhao
and
D. G.
Truhlar
, “
A new local density functional for main-group thermochemistry, transition metal bonding, thermochemical kinetics, and noncovalent interactions
,”
J. Chem. Phys.
125
,
194101
(
2006
).
28.
R.
Peverati
and
D. G.
Truhlar
, “
M11-L: A local density functional that provides improved accuracy for electronic structure calculations in chemistry and physics
,”
J. Phys. Chem. Lett.
3
,
117
124
(
2012
).
29.
N.
Mardirossian
and
M.
Head-Gordon
, “
Mapping the genome of meta-generalized gradient approximation density functionals: The search for B97M-V
,”
J. Chem. Phys.
142
,
074111
(
2015
).
30.
Y.
Zhao
,
N. E.
Schultz
, and
D. G.
Truhlar
, “
Exchange-correlation functional with broad accuracy for metallic and nonmetallic compounds, kinetics, and noncovalent interactions
,”
J. Chem. Phys.
123
,
161103
(
2005
).
31.
Y.
Zhao
,
N. E.
Schultz
, and
D. G.
Truhlar
, “
Design of density functionals by combining the method of constraint satisfaction with parametrization for thermochemistry, thermochemical kinetics, and noncovalent interactions
,”
J. Chem. Theory Comput.
2
,
364
382
(
2006
).
32.
Y.
Zhao
and
D.
Truhlar
, “
The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: Two new functionals and systematic testing of four M06-class functionals and 12 other functionals
,”
Theor. Chem. Acc.
120
,
215
241
(
2008
).
33.
Y.
Zhao
and
D. G.
Truhlar
, “
Density functional for spectroscopy: No long-range self-interaction error, good performance for Rydberg and charge-transfer states, and better performance on average than B3LYP for ground states
,”
J. Phys. Chem. A
110
,
13126
13130
(
2006
).
34.
Y.
Zhao
and
D. G.
Truhlar
, “
Exploring the limit of accuracy of the global hybrid meta density functional for main-group thermochemistry, kinetics, and noncovalent interactions
,”
J. Chem. Theory Comput.
4
,
1849
1868
(
2008
).
35.
R.
Peverati
and
D. G.
Truhlar
, “
Improving the accuracy of hybrid meta-GGA density functionals by range separation
,”
J. Phys. Chem. Lett.
2
,
2810
2817
(
2011
).
36.
Y.-S.
Lin
,
C.-W.
Tsai
,
G.-D.
Li
, and
J.-D.
Chai
, “
Long-range corrected hybrid meta-generalized-gradient approximations with dispersion corrections
,”
J. Chem. Phys.
136
,
154109
(
2012
).
37.
R.
Peverati
and
D. G.
Truhlar
, “
An improved and broadly accurate local approximation to the exchange-correlation density functional: The MN12-L functional for electronic structure calculations in chemistry and physics
,”
Phys. Chem. Chem. Phys.
14
,
13171
13174
(
2012
).
38.
H. S.
Yu
,
X.
He
, and
D. G.
Truhlar
, “
MN15-L: A new local exchange-correlation functional for Kohn–Sham density functional theory with broad accuracy for atoms, molecules, and solids
,”
J. Chem. Theory Comput.
12
,
1280
1293
(
2016
).
39.
H. S.
Yu
,
X.
He
,
S. L.
Li
, and
D. G.
Truhlar
, “
MN15: A Kohn-Sham global-hybrid exchange-correlation density functional with broad accuracy for multi-reference and single-reference systems and noncovalent interactions
,”
Chem. Sci.
(published online, 2016).
40.
N.
Mardirossian
and
M.
Head-Gordon
, “
Exploring the limit of accuracy for density functionals based on the generalized gradient approximation: Local, global hybrid, and range-separated hybrid functionals with and without dispersion corrections
,”
J. Chem. Phys.
140
,
18A527
(
2014
).
41.
H.
Thom
and
J.
Dunning
, “
Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen
,”
J. Chem. Phys.
90
,
1007
1023
(
1989
).
42.
R. A.
Kendall
,
H.
Thom
,
J.
Dunning
, and
R. J.
Harrison
, “
Electron affinities of the first-row atoms revisited. Systematic basis sets and wave functions
,”
J. Chem. Phys.
96
,
6796
6806
(
1992
).
43.
D. E.
Woon
,
H.
Thom
, and
J.
Dunning
, “
Gaussian basis sets for use in correlated molecular calculations. III. The atoms aluminum through argon
,”
J. Chem. Phys.
98
,
1358
1371
(
1993
).
44.
F.
Jensen
, “
Polarization consistent basis sets: Principles
,”
J. Chem. Phys.
115
,
9113
9125
(
2001
).
45.
F.
Jensen
, “
Polarization consistent basis sets. II. Estimating the Kohn–Sham basis set limit
,”
J. Chem. Phys.
116
,
7372
7379
(
2002
).
46.
F.
Jensen
, “
Polarization consistent basis sets. III. The importance of diffuse functions
,”
J. Chem. Phys.
117
,
9234
9240
(
2002
).
47.
A.
Schäfer
,
H.
Horn
, and
R.
Ahlrichs
, “
Fully optimized contracted Gaussian basis sets for atoms Li to Kr
,”
J. Chem. Phys.
97
,
2571
2577
(
1992
).
48.
A.
Schäfer
,
C.
Huber
, and
R.
Ahlrichs
, “
Fully optimized contracted Gaussian basis sets of triple zeta valence quality for atoms Li to Kr
,”
J. Chem. Phys.
100
,
5829
5835
(
1994
).
49.
F.
Weigend
,
F.
Furche
, and
R.
Ahlrichs
, “
Gaussian basis sets of quadruple zeta valence quality for atoms H–Kr
,”
J. Chem. Phys.
119
,
12753
12762
(
2003
).
50.
F.
Weigend
and
R.
Ahlrichs
, “
Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy
,”
Phys. Chem. Chem. Phys.
7
,
3297
3305
(
2005
).
51.
D.
Rappoport
and
F.
Furche
, “
Property-optimized Gaussian basis sets for molecular response calculations
,”
J. Chem. Phys.
133
,
134105
(
2010
).
52.
P. M.
Gill
,
B. G.
Johnson
, and
J. A.
Pople
, “
A standard grid for density functional calculations
,”
Chem. Phys. Lett.
209
,
506
512
(
1993
).
53.
S.-H.
Chien
and
P. M. W.
Gill
, “
SG-0: A small standard grid for DFT quadrature on large systems
,”
J. Comput. Chem.
27
,
730
739
(
2006
).
54.
Y.
Shao
,
Z.
Gan
,
E.
Epifanovsky
,
A. T.
Gilbert
,
M.
Wormit
,
J.
Kussmann
,
A. W.
Lange
,
A.
Behn
,
J.
Deng
,
X.
Feng
 et al., “
Advances in molecular quantum chemistry contained in the Q-Chem 4 program package
,”
Mol. Phys.
113
,
184
215
(
2015
).
55.
J.
Řezáč
and
P.
Hobza
, “
Describing noncovalent interactions beyond the common approximations: How accurate is the gold standard CCSD(T) at the complete basis set limit?
,”
J. Chem. Theory Comput.
9
,
2151
2155
(
2013
).
56.
B. J.
Mintz
and
J. M.
Parks
, “
Benchmark interaction energies for biologically relevant noncovalent complexes containing divalent sulfur
,”
J. Phys. Chem. A
116
,
1086
1092
(
2012
).
57.
J.
Řezáč
and
P.
Hobza
, “
Advanced corrections of hydrogen bonding and dispersion for semiempirical quantum mechanical methods
,”
J. Chem. Theory Comput.
8
,
141
151
(
2012
).
58.
J. C.
Faver
,
M. L.
Benson
,
X.
He
,
B. P.
Roberts
,
B.
Wang
,
M. S.
Marshall
,
M. R.
Kennedy
,
C. D.
Sherrill
, and
K. M.
Merz
, “
Formal estimation of errors in computed absolute interaction energies of protein–ligand complexes
,”
J. Chem. Theory Comput.
7
,
790
797
(
2011
).
59.
M. S.
Marshall
,
L. A.
Burns
, and
C. D.
Sherrill
, “
Basis set convergence of the coupled-cluster correction, δMP2CCSD(T): Best practices for benchmarking non-covalent interactions and the attendant revision of the S22, NBC10, HBC6, and HSG databases
,”
J. Chem. Phys.
135
,
194102
(
2011
).
60.
E. G.
Hohenstein
and
C. D.
Sherrill
, “
Effects of heteroatoms on aromatic π–π interactions: Benzene–pyridine and pyridine dimer
,”
J. Phys. Chem. A
113
,
878
886
(
2009
).
61.
C. D.
Sherrill
,
T.
Takatani
, and
E. G.
Hohenstein
, “
An assessment of theoretical methods for nonbonded interactions: Comparison to complete basis set limit coupled-cluster potential energy curves for the benzene dimer, the methane dimer, benzene–methane, and benzene–H2S
,”
J. Phys. Chem. A
113
,
10146
10159
(
2009
).
62.
T.
Takatani
and
C.
David Sherrill
, “
Performance of spin-component-scaled Møller-Plesset theory (SCS-MP2) for potential energy curves of noncovalent interactions
,”
Phys. Chem. Chem. Phys.
9
,
6106
6114
(
2007
).
63.
P.
Jurečka
,
J.
Šponer
,
J.
Černý
, and
P.
Hobza
, “
Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, DNA base pairs, and amino acid pairs
,”
Phys. Chem. Chem. Phys.
8
,
1985
1993
(
2006
).
64.
J.
Řezáč
,
K. E.
Riley
, and
P.
Hobza
, “
Benchmark calculations of noncovalent interactions of halogenated molecules
,”
J. Chem. Theory Comput.
8
,
4285
4292
(
2012
).
65.
J.
Witte
,
M.
Goldey
,
J. B.
Neaton
, and
M.
Head-Gordon
, “
Beyond energies: Geometries of nonbonded molecular complexes as metrics for assessing electronic structure approaches
,”
J. Chem. Theory Comput.
11
,
1481
1492
(
2015
).
66.
D. L.
Crittenden
, “
A systematic CCSD(T) study of long-range and noncovalent interactions between benzene and a series of first- and second-row hydrides and rare gas atoms
,”
J. Phys. Chem. A
113
,
1663
1669
(
2009
).
67.
K. L.
Copeland
and
G. S.
Tschumper
, “
Hydrocarbon/water interactions: Encouraging energetics and structures from DFT but disconcerting discrepancies for Hessian indices
,”
J. Chem. Theory Comput.
8
,
1646
1656
(
2012
).
68.
D. G. A.
Smith
,
P.
Jankowski
,
M.
Slawik
,
H. A.
Witek
, and
K.
Patkowski
, “
Basis set convergence of the post-CCSD(T) contribution to noncovalent interaction energies
,”
J. Chem. Theory Comput.
10
,
3140
3150
(
2014
).
69.
J.
Řezáč
,
K. E.
Riley
, and
P.
Hobza
, “
S66: A well-balanced database of benchmark interaction energies relevant to biomolecular structures
,”
J. Chem. Theory Comput.
7
,
2427
2438
(
2011
).
70.
J.
Řezáč
,
K. E.
Riley
, and
P.
Hobza
, “
Extensions of the S66 data set: More accurate interaction energies and angular-displaced nonequilibrium geometries
,”
J. Chem. Theory Comput.
7
,
3466
3470
(
2011
).
71.
J.
Řezáč
,
Y.
Huang
,
P.
Hobza
, and
G. J. O.
Beran
, “
Benchmark calculations of three-body intermolecular interactions and the performance of low-cost electronic structure methods
,”
J. Chem. Theory Comput.
11
,
3065
3079
(
2015
).
72.
J.
Granatier
,
M.
Pitoňák
, and
P.
Hobza
, “
Accuracy of several wave function and density functional theory methods for description of noncovalent interaction of saturated and unsaturated hydrocarbon dimers
,”
J. Chem. Theory Comput.
8
,
2282
2292
(
2012
).
73.
S.
Li
,
D. G. A.
Smith
, and
K.
Patkowski
, “
An accurate benchmark description of the interactions between carbon dioxide and polyheterocyclic aromatic compounds containing nitrogen
,”
Phys. Chem. Chem. Phys.
17
,
16560
16574
(
2015
).
74.
A. D.
Boese
, “
Assessment of coupled cluster theory and more approximate methods for hydrogen bonded systems
,”
J. Chem. Theory Comput.
9
,
4403
4413
(
2013
).
75.
A. D.
Boese
, “
Basis set limit coupled-cluster studies of hydrogen-bonded systems
,”
Mol. Phys.
113
,
1618
1629
(
2015
).
76.
A. D.
Boese
, “
Density functional theory and hydrogen bonds: Are we there yet?
,”
ChemPhysChem
16
,
978
985
(
2015
).
77.
K. U.
Lao
,
R.
Schäffer
,
G.
Jansen
, and
J. M.
Herbert
, “
Accurate description of intermolecular interactions involving ions using symmetry-adapted perturbation theory
,”
J. Chem. Theory Comput.
11
,
2473
2486
(
2015
).
78.
K. U.
Lao
and
J. M.
Herbert
, “
An improved treatment of empirical dispersion and a many-body energy decomposition scheme for the explicit polarization plus symmetry-adapted perturbation theory (XSAPT) method
,”
J. Chem. Phys.
139
,
034107
(
2013
).
79.
K. U.
Lao
and
J. M.
Herbert
, “
Accurate and efficient quantum chemistry calculations for noncovalent interactions in many-body systems: The XSAPT family of methods
,”
J. Phys. Chem. A
119
,
235
252
(
2015
).
80.
B.
Temelso
,
K. A.
Archer
, and
G. C.
Shields
, “
Benchmark structures and binding energies of small water clusters with anharmonicity corrections
,”
J. Phys. Chem. A
115
,
12034
12046
(
2011
).
81.
N.
Mardirossian
,
D. S.
Lambrecht
,
L.
McCaslin
,
S. S.
Xantheas
, and
M.
Head-Gordon
, “
The performance of density functionals for sulfate–water clusters
,”
J. Chem. Theory Comput.
9
,
1368
1380
(
2013
).
82.
V. S.
Bryantsev
,
M. S.
Diallo
,
A. C. T.
van Duin
, and
W. A.
Goddard
, “
Evaluation of B3LYP, X3LYP, and M06-Class density functionals for predicting the binding energies of neutral, protonated, and deprotonated water clusters
,”
J. Chem. Theory Comput.
5
,
1016
1026
(
2009
).
83.
L.
Goerigk
and
S.
Grimme
, “
A general database for main group thermochemistry, kinetics, and noncovalent interactions—Assessment of common and reparameterized (meta-)GGA density functionals
,”
J. Chem. Theory Comput.
6
,
107
126
(
2010
).
84.
A.
Karton
,
R. J.
O’Reilly
,
B.
Chan
, and
L.
Radom
, “
Determination of barrier heights for proton exchange in small water, ammonia, and hydrogen fluoride clusters with G4(MP2)-type, MPn, and SCS-MPn procedures—A caveat
,”
J. Chem. Theory Comput.
8
,
3128
3136
(
2012
).
85.
B.
Chan
,
A. T. B.
Gilbert
,
P. M. W.
Gill
, and
L.
Radom
, “
Performance of density functional theory procedures for the calculation of proton-exchange barriers: Unusual behavior of M06-type functionals
,”
J. Chem. Theory Comput.
10
,
3777
3783
(
2014
).
86.
G. S.
Fanourgakis
,
E.
Aprà
, and
S. S.
Xantheas
, “
High-level ab initio calculations for the four low-lying families of minima of (H2O)20. I. Estimates of MP2/CBS binding energies and comparison with empirical potentials
,”
J. Chem. Phys.
121
,
2655
2663
(
2004
).
87.
T.
Anacker
and
J.
Friedrich
, “
New accurate benchmark energies for large water clusters: DFT is better than expected
,”
J. Comput. Chem.
35
,
634
643
(
2014
).
88.
P. R.
Tentscher
and
J. S.
Arey
, “
Binding in radical-solvent binary complexes: Benchmark energies and performance of approximate methods
,”
J. Chem. Theory Comput.
9
,
1568
1579
(
2013
).
89.
S.
Kozuch
and
J. M. L.
Martin
, “
Halogen bonds: Benchmarks and theoretical analysis
,”
J. Chem. Theory Comput.
9
,
1918
1931
(
2013
).
90.
A.
Bauzá
,
I.
Alkorta
,
A.
Frontera
, and
J.
Elguero
, “
On the reliability of pure and hybrid DFT methods for the evaluation of halogen, chalcogen, and pnicogen bonds involving anionic and neutral electron donors
,”
J. Chem. Theory Comput.
9
,
5201
5210
(
2013
).
91.
A. O.
de-la Roza
,
E. R.
Johnson
, and
G. A.
DiLabio
, “
Halogen bonding from dispersion-corrected density-functional theory: The role of delocalization error
,”
J. Chem. Theory Comput.
10
,
5436
5447
(
2014
).
92.
S. N.
Steinmann
,
C.
Piemontesi
,
A.
Delachat
, and
C.
Corminboeuf
, “
Why are the interaction energies of charge-transfer complexes challenging for DFT?
,”
J. Chem. Theory Comput.
8
,
1629
1640
(
2012
).
93.
A.
Karton
,
D.
Gruzman
, and
J. M. L.
Martin
, “
Benchmark thermochemistry of the CnH2n+2 alkane isomers (n = 2–8) and performance of DFT and composite ab initio methods for dispersion-driven isomeric equilibria
,”
J. Phys. Chem. A
113
,
8434
8447
(
2009
).
94.
S.
Kozuch
,
S. M.
Bachrach
, and
J. M.
Martin
, “
Conformational equilibria in butane-1,4-diol: A benchmark of a prototypical system with strong intramolecular H-bonds
,”
J. Phys. Chem. A
118
,
293
303
(
2014
).
95.
D.
Gruzman
,
A.
Karton
, and
J. M. L.
Martin
, “
Performance of ab initio and density functional methods for conformational equilibria of CnH2n+2 alkane isomers (n = 4–8)
,”
J. Phys. Chem. A
113
,
11974
11983
(
2009
).
96.
J. J.
Wilke
,
M. C.
Lind
,
H. F.
Schaefer
,
A. G.
Csaszar
, and
W. D.
Allen
, “
Conformers of gaseous cysteine
,”
J. Chem. Theory Comput.
5
,
1511
1523
(
2009
).
97.
J. M. L.
Martin
, “
What can we learn about dispersion from the conformer surface of n-Pentane?
,”
J. Phys. Chem. A
117
,
3118
3132
(
2013
).
98.
S.
Yoo
,
E.
Aprà
,
X. C.
Zeng
, and
S. S.
Xantheas
, “
High-level ab initio electronic structure calculations of water clusters (H2O)16 and (H2O)17: A new global minimum for (H2O)16
,”
J. Phys. Chem. Lett.
1
,
3122
3127
(
2010
).
99.
U. R.
Fogueri
,
S.
Kozuch
,
A.
Karton
, and
J. M.
Martin
, “
The melatonin conformer space: Benchmark and assessment of wave function and DFT methods for a paradigmatic biological and pharmacological molecule
,”
J. Phys. Chem. A
117
,
2269
2277
(
2013
).
100.
M. K.
Kesharwani
,
A.
Karton
, and
J. M. L.
Martin
, “
Benchmark ab initio conformational energies for the proteinogenic amino acids through explicitly correlated methods. Assessment of density functional methods
,”
J. Chem. Theory Comput.
12
,
444
454
(
2016
).
101.
L.-J.
Yu
,
F.
Sarrami
,
A.
Karton
, and
R. J.
O’Reilly
, “
An assessment of theoretical procedures for π-conjugation stabilisation energies in enones
,”
Mol. Phys.
113
,
1284
1296
(
2015
).
102.
A.
Karton
and
J. M.
Martin
, “
Explicitly correlated benchmark calculations on C8H8 isomer energy separations: How accurate are DFT, double-hybrid, and composite ab initio procedures?
,”
Mol. Phys.
110
,
2477
2491
(
2012
).
103.
L.-J.
Yu
and
A.
Karton
, “
Assessment of theoretical procedures for a diverse set of isomerization reactions involving double-bond migration in conjugated dienes
,”
Chem. Phys.
441
,
166
177
(
2014
).
104.
A.
Karton
,
S.
Daon
, and
J. M.
Martin
, “
W4-11: A high-confidence benchmark dataset for computational thermochemistry derived from first-principles W4 data
,”
Chem. Phys. Lett.
510
,
165
178
(
2011
).
105.
D.
Manna
and
J. M. L.
Martin
, “
What are the ground state structures of C20 and C24? An explicitly correlated ab initio approach
,”
J. Phys. Chem. A
120
,
153
160
(
2016
).
106.
L. A.
Curtiss
,
K.
Raghavachari
,
G. W.
Trucks
, and
J. A.
Pople
, “
Gaussian-2 theory for molecular energies of first- and second-row compounds
,”
J. Chem. Phys.
94
,
7221
7230
(
1991
).
107.
Y.
Zhao
,
N.
González-García
, and
D. G.
Truhlar
, “
Benchmark database of barrier heights for heavy atom transfer, nucleophilic substitution, association, and unimolecular reactions and its use to test theoretical methods
,”
J. Phys. Chem. A
109
,
2012
2018
(
2005
).
108.
Y.
Zhao
,
B. J.
Lynch
, and
D. G.
Truhlar
, “
Multi-coefficient extrapolated density functional theory for thermochemistry and thermochemical kinetics
,”
Phys. Chem. Chem. Phys.
7
,
43
52
(
2005
).
109.
B. J.
Lynch
,
Y.
Zhao
, and
D. G.
Truhlar
, “
Effectiveness of diffuse basis functions for calculating relative energies by density functional theory
,”
J. Phys. Chem. A
107
,
1384
1388
(
2003
).
110.
S.
Grimme
,
H.
Kruse
,
L.
Goerigk
, and
G.
Erker
, “
The mechanism of dihydrogen activation by frustrated Lewis pairs revisited
,”
Angew. Chem., Int. Ed.
49
,
1402
1405
(
2010
).
111.
L.
Goerigk
and
S.
Grimme
, “
Efficient and accurate double-hybrid-meta-GGA density functionals–evaluation with the extended GMTKN30 database for general main group thermochemistry, kinetics, and noncovalent interactions
,”
J. Chem. Theory Comput.
7
,
291
309
(
2011
).
112.
H.
Krieg
and
S.
Grimme
, “
Thermochemical benchmarking of hydrocarbon bond separation reaction energies: Jacob’s ladder is not reversed!
,”
Mol. Phys.
108
,
2655
2666
(
2010
).
113.
R. J.
O’Reilly
and
A. A.
Karton
, “
Dataset of highly accurate homolytic N–Br bond dissociation energies obtained by means of W2 theory
,”
Int. J. Quantum Chem.
116
,
52
60
(
2016
).
114.
A.
Karton
,
R. J.
O’Reilly
, and
L.
Radom
, “
Assessment of theoretical procedures for calculating barrier heights for a diverse set of water-catalyzed proton-transfer reactions
,”
J. Phys. Chem. A
116
,
4211
4221
(
2012
).
115.
A.
Karton
,
P. R.
Schreiner
, and
J. M. L.
Martin
, “
Heats of formation of platonic hydrocarbon cages by means of high-level thermochemical procedures
,”
J. Comput. Chem.
37
,
49
58
(
2016
).
116.
A.
Karton
and
L.
Goerigk
, “
Accurate reaction barrier heights of pericyclic reactions: Surprisingly large deviations for the CBS-QB3 composite method and their consequences in DFT benchmark studies
,”
J. Comput. Chem.
36
,
622
632
(
2015
).
117.
L.-J.
Yu
,
F.
Sarrami
,
R. J.
O’Reilly
, and
A.
Karton
, “
Reaction barrier heights for cycloreversion of heterocyclic rings: An Achilles’ heel for DFT and standard ab initio procedures
,”
Chem. Phys.
458
,
1
8
(
2015
).
118.
J.
Zheng
,
Y.
Zhao
, and
D. G.
Truhlar
, “
Representative benchmark suites for barrier heights of diverse reaction types and assessment of electronic structure methods for thermochemical kinetics
,”
J. Chem. Theory Comput.
3
,
569
582
(
2007
).
119.
A.
Karton
,
A.
Tarnopolsky
,
J.-F.
Lamère
,
G. C.
Schatz
, and
J. M. L.
Martin
, “
Highly accurate first-principles benchmark data sets for the parametrization and validation of density functional and other approximate methods. Derivation of a robust, generally applicable, double-hybrid functional for thermochemistry and thermochemical kinetics
,”
J. Phys. Chem. A
112
,
12868
12886
(
2008
).
120.
L.-J.
Yu
,
F.
Sarrami
,
R. J.
O’Reilly
, and
A.
Karton
, “
Can DFT and ab initio methods describe all aspects of the potential energy surface of cycloreversion reactions?
,”
Mol. Phys.
114
,
21
33
(
2016
).
121.
S. J.
Chakravorty
,
S. R.
Gwaltney
,
E. R.
Davidson
,
F. A.
Parpia
, and
C. F.
Fischer
, “
Ground-state correlation energies for atomic ions with 3 to 18 electrons
,”
Phys. Rev. A
47
,
3649
3670
(
1993
).
122.
K. T.
Tang
and
J. P.
Toennies
, “
The van der Waals potentials between all the rare gas atoms from He to Rn
,”
J. Chem. Phys.
118
,
4976
4983
(
2003
).
123.
A. D.
Becke
, “
Density functional calculations of molecular bond energies
,”
J. Chem. Phys.
84
,
4524
4529
(
1986
).
124.
D. M.
Ceperley
and
B. J.
Alder
, “
Ground state of the electron gas by a stochastic method
,”
Phys. Rev. Lett.
45
,
566
569
(
1980
).
125.
J. P.
Perdew
and
Y.
Wang
, “
Accurate and simple analytic representation of the electron-gas correlation energy
,”
Phys. Rev. B
45
,
13244
13249
(
1992
).
126.
S. H.
Vosko
,
L.
Wilk
, and
M.
Nusair
, “
Accurate spin-dependent electron liquid correlation energies for local spin density calculations: A critical analysis
,”
Can. J. Phys.
58
,
1200
1211
(
1980
).
127.
H.
Stoll
,
E.
Golka
, and
H.
Preuß
, “
Correlation energies in the spin-density functional formalism
,”
Theor. Chem. Acc.
55
,
29
41
(
1980
).
128.
See supplementary material at http://dx.doi.org/10.1063/1.4952647 for more information on datasets, mean signed errors and maximum absolute errors for the datasets, and Fortran routines for theωB97M-V local exchange, same-spin correlation, and opposite-correlation functionals.
129.
B.
Ruscic
, “
Uncertainty quantification in thermochemistry, benchmarking electronic structure computations, and active thermochemical tables
,”
Int. J. Quantum Chem.
114
,
1097
1101
(
2014
).
130.
J.
Grafenstein
,
D.
Izotov
, and
D.
Cremer
, “
Avoiding singularity problems associated with meta-GGA (generalized gradient approximation) exchange and correlation functionals containing the kinetic energy density
,”
J. Chem. Phys.
127
,
214103
(
2007
).
131.
E. R.
Johnson
,
A. D.
Becke
,
C. D.
Sherrill
, and
G. A.
DiLabio
, “
Oscillations in meta-generalized-gradient approximation potential energy surfaces for dispersion-bound complexes
,”
J. Chem. Phys.
131
,
034111
(
2009
).
132.
S. E.
Wheeler
and
K. N.
Houk
, “
Integration grid errors for meta-gga-predicted reaction energies: Origin of grid errors for the M06 suite of functionals
,”
J. Chem. Theory Comput.
6
,
395
404
(
2010
).

Supplementary Material