In the past decade, a number of approaches have been developed to fix the failure of (semi)local density-functional theory (DFT) in describing intermolecular interactions. The performance of several such approaches with respect to highly accurate benchmarks is compared here on a set of separation-dependent interaction energies for ten dimers. Since the benchmarks were unknown before the DFT-based results were collected, this comparison constitutes a blind test of these methods.

Computer modeling of materials has become a prominent means to reduce the cost associated with notional material synthesis, testing, and application. With many significant developments in electronic structure theory and computer science, as well as due to rapid increases in available computer power, material modeling has reached a level of maturity that yields reliable predictions and, in fact, sometimes the computed observables for a variety of properties may rival experiment in accuracy. There is a wide range of computational methods available. The most accurate methods use wave-functions (WF) and utilize many-body perturbation theory (MBPT) or coupled cluster (CC) expansions. The price for this accuracy is steep scaling of the cost with system size which, even with access to powerful supercomputers and highly scalable software packages, renders these methods inapplicable to many problems of interest. Another group of computational approaches is based on density-functional theory (DFT). These approaches utilize only electron densities and occupied orbitals and therefore scale much better than WF-based methods. In fact, DFT is the only computationally viable option for first-principles predictions involving molecules beyond the reach of WF-based methods, as well as for condensed phase systems. DFT has been applied to numerous molecular systems as well as to many types of condensed phases: from metals to molecular crystals. However, the application of DFT has some problems. Whereas it is in principle an exact theory,1 the exact functional is unknown and one has to use one of the many approximate variants of DFT. Although such variants have been successfully applied to a variety of problems (many approximations are specifically tailored for given types of problems), problematic cases were reported for even the best DFT methods when applied to intermolecular interactions. These interactions, also called van der Waals (vdW) or noncovalent interactions, are at least one order of magnitude weaker than the chemical bonds and can be decomposed into contributions of different physical origins: electrostatic, induction, dispersion, and exchange. These components are defined in symmetry-adapted perturbation theory (SAPT).2,3 While DFT was able to describe reasonably well interactions dominated by electrostatics and induction components, it was failing badly in calculations of the potential energy curves for pairs of rare-gas atoms, dominated by dispersion energies.4–6 The dispersion energies originate from mutual correlations of electron motions between monomers. For a pair of rare-gas atoms at separations a couple times larger than the vdW minimum distance, the dispersion energy constitutes nearly 100% of the interaction energy, so it was clear that the failure is related to this component. One should note that while the dimers of rare-gas atoms are an extreme example of the importance of dispersion interactions, these interactions are significant for most intermolecular complexes at all separations. As pointed out by Kristyan and Pulay5 in 1994, one reason for the failure is the local (or semilocal) character of the majority of approximate DFT functionals. As a result, the “exchange-correlation holes” modeled by such functionals have a range on the order of about 1 Å, whereas dispersion interactions involve correlations of electron motion at distances of several angstrøms which is typical of separations between monomers. Another type of interactions where DFT was found to have problems are those with large charge-transfer effects which are a part of the induction effects.7 

The failures of DFT for intermolecular interactions have spawned intense research activity directed toward fixing this problem. The initial attempts were based on hopes that one of the approximate density functionals will actually work for such cases. However, methods performing well on a few dimers were later found to perform poorly on others. Searches for the “best” DFT method for intermolecular interactions were later systematized to utilize sets of benchmarks including dozens of dimers representing various classes of intermolecular interactions. However, these tests have shown that a DFT method that performs well for intermolecular interactions of broad classes of systems does not yet exist. For example, Zhao and Truhlar8 examined 45 different density functionals on a set of 28 dimers at their minimum configurations and found that the best performing functional had an average unsigned error of 0.46 kcal/mol which should be considered large since the interaction energies of the set range from −0.04 to −16.1 kcal/mol, with an average of −3.8 kcal/mol. Investigations of this type have also shown that different functionals yield errors of different signs for given systems (overbinding, underbinding or not binding at all). This fact indicates that the inability to recover dispersion energies is not the only problem of DFT. If this were the case, all methods should underbind, or not bind, since the dispersion energy is always negative (for all distances). Some authors pointed out a lack of balance between kinetic and exchange interactions in Kohn-Sham DFT as a possible additional problem.4,9

The proposed strategies for making DFT-based methods more predictive for intermolecular interactions generally fall into the following four categories: (i) Optimization of parameters in existing functionals on training sets including interaction energies; (ii) addition of dispersion energies in the form of an asymptotic expansion to existing functionals, the so-called DFT+D approaches; (iii) development of nonlocal correlation density functionals which should in principle be capable of describing dispersion interactions. Since the existing methods of this type either do not include couplings to the local parts of the functional, or the couplings result in practically negligible effects, methods of this group can be viewed as a type of DFT+D methods. (iv) Application of “post-DFT” approaches, i.e., approaches that include virtual orbitals.

Methods belonging to each of the four categories defined above were first formulated in the early 2000s and have been actively developed later on. We will refer to these methods as “intermolecular interactions cognizant” methods, whereas other DFT variants will be called standard methods.

With quite a number of intermolecular interactions cognizant DFT methods available, the question arises which of them is most suitable for application to problems in materials science. Obviously, the most relevant features are (a) the best accuracy in all regions of the potential energy surface, (b) computational efficiency, and (c) efficient when applied to condensed-phase systems with periodic boundary conditions, in particular when using plane waves. Whereas there are a number of papers in the literature comparing the accuracy of various intermolecular interactions cognizant DFT methods, it is not easy for researchers not directly involved in such work to find out which methods are the best. Most of the published papers place emphasis on the methods developed by the authors of these papers and sometimes the test set of benchmarks is the same, or very similar, to the training set used to optimize parameters. The most extensive tests including a variety of methods and distance-dependent interaction energies have been published in Refs. 10–12.

The present paper is related to a workshop entitled “Dispersion Interactions and Density Functional Theory” co-organized by the Army Research Laboratory (ARL) and the University of Delaware (UD). The workshop consisted of presentations (available online)13 covering developments of DFT-based methods for reliable treatment of intermolecular interactions. A part of this workshop was a “blind” test of such methods. Each speaker received a set of dimer geometries and was asked to compute interaction energies applying DFT-based methods developed or used in their groups. Independently, the organizers of the workshop computed benchmark interaction energies which were not made known to the speakers prior to the workshop. At the workshop, the results were revealed and compared.

During the comparison of the data when working on the present manuscript, it became clear that the benchmarks were not accurate for large separations. The problem was first traced to a coding error in the coupled cluster triple excitation subroutines of the NWChem software package.14 The error was fixed by the authors of NWChem and all the benchmark calculations were repeated. An analysis of the new results revealed another problem, resulting from loss of accuracy due to linear dependencies in the largest basis sets used by us. We overcame this problem and repeated relevant calculations, as described in Sec. IV A. Both problems resulted in a significant delay of publishing our results.

As a benchmark set, we have chosen a set of dimers relevant to the field of energetic materials. This set includes 10 dimers which range in size from 6 to 32 atoms. For each dimer, we considered the equilibrium configuration, as well as several configurations with different separations between the centers of mass of the monomers (sampling the repulsive wall, minimum, and asymptotic regions), but with the same relative orientation as in the minimum configuration. In this way, we created a set of 80 benchmark interaction energies for the following dimers:

  1. Water: (H2O)2;

  2. Ethanol: (C2H5OH)2;

  3. Nitromethane: (H3C–NO2)2;

  4. Methylformate: (C2H4O2)2;

  5. Benzene-methane: C6H6–CH4;

  6. Benzene-water: C6H6–H2O;

  7. Imidazole: (C3H4N2)2;

  8. Nitrobenzene: (C6H5NO2)2;

  9. 1,1-diamino-2,2-dinitroethylene (FOX-7): (C2N4O4H4)2;

  10. Ethylenedinitramine (EDNA): (C2O4N4H6)2.

This benchmark set has several advantages compared to similar sets used in the literature. The latter sets tended to be dominated by dispersion-bonded and hydrogen-bonded dimers. Whereas our set does contain systems belonging to these two categories, most of our dimers cannot be classified in this way. Furthermore, most of the dimers have not been used in previous sets, which allows us to test methods on systems not used in the optimization of these methods. For each system, interaction energies were computed applying the CC method with single, double, and perturbative triple excitations, CCSD(T), and using complete basis set (CBS) extrapolations, following the approach of Ref. 15.

Although in principle the participants could have computed the benchmark interaction energies themselves, which would have violated the “blind” character of the test, in practice this was not possible except for the smallest dimers in our set. As described later, the calculations of the benchmarks required extraordinary amounts of computer resources and took several months to complete which was more than the time available to the participants.

We have included 12 different intermolecular interactions cognizant DFT methods, representing all the approaches (i)–(iv) described above. We believe this selection is broader than in any previous survey of this type. These methods are briefly described in Sec. II. Technical details of the DFT-based calculations such as the basis sets are specified in Sec. III. The methodology used to obtain the benchmarks is outlined in Sec. IV. The results and their graphical and statistical analysis are presented in Sec. V. In the supplementary material, we include Cartesian coordinates for all configurations as well as the benchmark and DFT interaction energies for all entries in the data set.

To facilitate the comparisons of various methods, we briefly describe in this section the four types of intermolecular interactions cognizant DFT methods defined in the Introduction. A summary of the salient features of each method used in this work is presented in Table I, along with some details of the computational approach (basis set, use of counterpoise correction, etc.). In most cases the authors used their own codes to perform the calculations.

TABLE I.

Summary of methods. See text for explanations of acronyms; The term “meta” indicates that the exchange-correlation functionals contain enhancement factors depending on the kinetic energy density.

Method Correlation Exchange Exact exchange Parameters Optimized on Eint Counterpoise Basis set
Reoptimized standard functionals 
M06  meta LDA/B96/VSXC  meta LDA/PBE/VSXC  27%  36  Yes  No  jun-cc-pVTZ 
M11  meta LDA/PBE  RSH LDA, meta PBE/RPBE  42.8%  46  Yes  No  jun-cc-pVTZ 
DCACP  PBE  PBE  None  2/atom  Yes  BSSE Free  Plane waves 
              300 Ry cutoff 
DFT+D approachesa 
dlDF+D  meta modified PBE  meta LDA  61.4%  11b  Noc  Yes  aug-cc-pVTZ 
B3LYP-D3  B88  LYP  20%  2d  Yes  Yes  aug-cc-pVX
              X = Q, 5, 6e 
LC-ωPBE-D3  RSH PBE  PBE  ω = 0.4  2d  Yes  Yes  aug-cc-pVQZ 
LC-BOP12+LRD  RSH B88  OP  ω = 0.42f  2g  Yes  Yes  aug-cc-pVTZ 
LCgau-BOP+LRD  RSH B88  OP  ω = 0.42h  2i  Yes  Yes  aug-cc-pVTZ 
Functionals with nonlocal correlation 
vdW-DF2  PW86  P86  None  None  No  BSSE Free  Plane waves 
              85 Ry cutoff 
Post-KS methods 
SAPT(DFT)  PBE  PBE  25%  None  No  BSSE free  aug-cc-pVTZ+mb 
RSH+lrMP2  RSH PBEj  RSH PBEj  ω = 0.5  None  No  Yes  aug-cc-pVTZ 
RSH+lrRPAx-SO2  RSH PBEj  RSH PBEj  ω = 0.5  None  No  Yes  aug-cc-pVTZk 
Method Correlation Exchange Exact exchange Parameters Optimized on Eint Counterpoise Basis set
Reoptimized standard functionals 
M06  meta LDA/B96/VSXC  meta LDA/PBE/VSXC  27%  36  Yes  No  jun-cc-pVTZ 
M11  meta LDA/PBE  RSH LDA, meta PBE/RPBE  42.8%  46  Yes  No  jun-cc-pVTZ 
DCACP  PBE  PBE  None  2/atom  Yes  BSSE Free  Plane waves 
              300 Ry cutoff 
DFT+D approachesa 
dlDF+D  meta modified PBE  meta LDA  61.4%  11b  Noc  Yes  aug-cc-pVTZ 
B3LYP-D3  B88  LYP  20%  2d  Yes  Yes  aug-cc-pVX
              X = Q, 5, 6e 
LC-ωPBE-D3  RSH PBE  PBE  ω = 0.4  2d  Yes  Yes  aug-cc-pVQZ 
LC-BOP12+LRD  RSH B88  OP  ω = 0.42f  2g  Yes  Yes  aug-cc-pVTZ 
LCgau-BOP+LRD  RSH B88  OP  ω = 0.42h  2i  Yes  Yes  aug-cc-pVTZ 
Functionals with nonlocal correlation 
vdW-DF2  PW86  P86  None  None  No  BSSE Free  Plane waves 
              85 Ry cutoff 
Post-KS methods 
SAPT(DFT)  PBE  PBE  25%  None  No  BSSE free  aug-cc-pVTZ+mb 
RSH+lrMP2  RSH PBEj  RSH PBEj  ω = 0.5  None  No  Yes  aug-cc-pVTZ 
RSH+lrRPAx-SO2  RSH PBEj  RSH PBEj  ω = 0.5  None  No  Yes  aug-cc-pVTZk 
a

For DFT+D approaches, the parameters in dispersion functions fitted to vdW constants or to dispersion energies are not included in the parameter count, but parameters fitted to total energies are included.

b

Number of parameters in dlDF functional.

c

The dlDF and D parts were separately optimized on the corresponding components of interaction energies.

d

Parameters in the switching function of the dispersion energy.

e

X = 6 for water dimer, X = Q for FOX-7, nitrobenzene, and EDNA dimers, and X = 5 for the remaining dimers.

f

A parameter in the OP functional, denoted by q OP α β , was reoptimized to the value of 2.46 to match this value of ω.

g

The LDR parameters were κ = 0.216 a.u., R0 = 4.760 a.u., and λ = 0.228.

h

The Gaussian exponent in the attenuation function was α = 0.011 and the linear coefficient k = − 18. The parameter q OP α β = 2 . 37 .

i

The LDR parameters were κ = 0.248 a.u., R0 = 4.690 a.u., and λ = 0.229.

j

Short-range PBE exchange and correlation functionals from Ref. 16. Note that this short-range PBE exchange functional is different from the one used in LC-ωPBE-D3.

k

For the nitrobenzene, FOX-7, and EDNA dimers, the results were obtained as E int = E int RSH + lrRPAx SO 2 ( aug - cc - pV DZ ) + [ E int RSH + lrMP 2 ( aug - cc - pV TZ ) E int RSH + lrMP 2 ( aug - cc - pV DZ ) ] .

This subsection includes parametrized semilocal functionals (or semilocal functionals supplemented by “exact” exchange) that utilize a standard or meta generalized-gradient approximation (GGA) and in some cases include a fraction of the exact exchange. The standard GGA functionals employ the exchange-correlation terms depending only on the density and the magnitude of its gradient. Meta GGA functionals also include factors dependent on the kinetic energy density τ, i.e., on the sum of the squared magnitudes of gradients of occupied Kohn-Sham (KS) orbitals. The functionals incorporating a fraction of the exact exchange are called “hybrid” functionals. The functionals in this group depend on a number of adjustable parameters that are optimized using a training set containing theoretical benchmarks including interaction energies and possibly some experimental quantities. Some values of parameters may be inherited from earlier work or fixed using theoretical constraints. An example of a functional of this type was published by Xu and Goddard17 in 2004 and a large number of such functionals appeared since then. We have included in our survey three such functionals, described in Subsections II A 1II A 3.

1. M06

The M06 functional,18 where M stands for Minnesota, belongs to the suite of methods developed by Truhlar and co-workers. It can be classified as a hybrid meta-GGA functional. The GGA component of the exchange part is taken from the Perdew-Burke-Ernzerhof (PBE) functional19 and the factors containing kinetic energy densities are modelled in the spirit of the van Voorhis-Scuseria exchange-correlation functional (VSXC).20 The M06 exchange functional includes in addition to the PBE term also a local-density approximation (LDA) term. The correlation functional is in a modified LDA form with enhancement factors following a functional introduced by Becke in 1996 (B96)21 and the VSXC functional. M06 includes a modest amount (27%) of the exact exchange. The set of benchmark interaction energies from Ref. 8 was used in the optimization of M06.

2. M11

The M11 functional22 adopts a range-separated hybrid (RSH) approach,23–25 i.e., the exact exchange is included to a varying degree depending on the interelectron separation. This is achieved by partitioning the Coulomb operator into the long- and short-range components

(1)

where erf is the error function and erfc = 1-erf. Since erf(r) is zero at small r and one at large r, when this function is used in the exchange integrals, it introduces the exact exchange only at larger separations. The second term in Eq. (1) switches off the Coulomb interaction for large r and is used in appropriately modified DFT exchange functionals.24,26–28 The extent of the DFT versus the exact exchange is controlled by the parameter ω. The M11 functional attenuates the Coulomb interactions in the DFT part using the expression derived by Gill, Adamson, and Pople26 and by Savin24 within the LDA method. The attenuated LDA exchange energy density is multiplied by the sum of the PBE19 and revised PBE (RPBE)29 exchange factors, each of the two factors multiplied in turn by its own meta-enhancement factor. The RPBE functional differs from the PBE functional by using a different form of the exchange enhancement factor. M11 includes in addition to the range-separated exact exchange also 42.8% of all-separation exact exchange. The correlation functional in M11 is a sum of the LDA term and of the gradient correction to this term taken from the PBE functional, both multiplied by their own meta-enhancement factors. The set of benchmark interaction energies from Ref. 8 was used in the optimization of M11.

3. DCACP

In the dispersion-corrected atom-centered potentials (DCACP) approach,30,31 the atomic pseudopotentials routinely used in DFT calculations with plane-wave bases are parametrized and the parameters (two per atom type) are fitted to a training set of benchmark interaction energies. These pseudopotentials can be fitted to be used with any DFT functional. In the present work, the DCACP approach was used with the PBE functional in a plane-wave basis with a kinetic energy cutoff of 300 Ry.

Although, as discussed above, inability to recover dispersion energies is certainly not the only problem that DFT encounters in predicting intermolecular interaction energies, one can hope that the predictions will be improved if dispersion energies calculated using a non-DFT approach are added to DFT interaction energies for selected functionals that tend to underbind molecular complexes. This idea was quite an obvious one since a method consisting of adding dispersion energies to Hartree-Fock (HF) interaction energies (so-called HFD approach) was proposed by Scoles and co-workers32,33 already in the 1970s. An extended HFD method proposed in Ref. 34 is competitive with the DFT+D approaches. Note that the addition of the dispersion energy to the HF interaction energy is rigorous, i.e., by the definitions of these quantities, it involves no double counting, which is not the case for DFT+D. The DFT+D method was applied for the first time by Gianturco et al.35 in 1998 and by Wu et al.36 in 2001 for some specific systems. A generally applicable approach was formulated by Wu and Yang37 in 2002. These authors selected a few functionals that tended to underbind intermolecular complexes and added to the dimer interaction energies given by these functionals dispersion energies in the form of a simple atom-atom function

(2)

where A and B refer to the two monomers, rab are interatomic distances, C 6 a b are constants, and f(rab) is a damping function. This form of dispersion energy was well known in the work on vdW clusters and in biomolecular force fields. The interatomic coefficients C 6 a b were expressed in terms of atomic coefficients using simple combination rules. The latter coefficients were fitted to reproduce molecule-molecule C6 coefficients known from experiments, resulting in a universal atom-atom dispersion function. Wu and Yang examined several damping functions, but have not optimized parameters of these functions on benchmark interaction energies. This critical ingredient of the method was introduced by Grimme38 in 2004 who also included an overall scaling factor multiplying the dispersion energy. This approach led to a significant improvement of the results since the dispersion energy was de facto used not only to reproduce this component, but also to cancel various deficiencies of the DFT methods to which it was applied. On the other hand, the fact that such approaches rely on error cancellations is their weak point from a physics point of view. This problem was removed by the dispersionless density functional plus dispersion (dlDF+D) method proposed by Pernal et al.39 in 2009. This method uses a density functional optimized to reproduce the interaction energies with the dispersion (and exchange-dispersion) energies subtracted. Then atom-atom functions representing the true dispersion energies at all intermolecular separations can be added to the dlDF interaction energies. In fact, one can add accurate dispersion and exchange-dispersion energies computed ab initio using SAPT.3 One should note that DFT+D approaches provide improvements only for the energetics, whereas all the other properties of the complexes (for example, dipole moments or polarizabilities) remain the same as given by the DFT methods used. In our study, we have included the following DFT+D methods: dlDF+D,39 B3LYP+D3,40 LC-ωPBE + D3,41 LC-BOP12+LDR,42,43 and LCgau-BOP+LDR.43,44

1. dlDF+D

The dispersionless density functional (dlDF)39 is a hybrid meta GGA functional with parameters optimized on intermolecular interaction energies with the dispersion and exchange-dispersion energies subtracted. The total interaction energies were computed for a set of dimers using the CCSD(T) method and the dispersion and exchange-dispersion energies were computed using SAPT based on DFT description of monomers [SAPT(DFT)]45–48 (see a description of the latter method below). In contrast to the functionals of Sec. II A, dlDF includes only the components of intermolecular interactions that can be described by (semi)local DFT methods. The dlDF functional form is analogous to that of the M05-2X functional,49 i.e., it uses the PBE exchange density multiplied by a meta-enhancement factor for the exchange part and a modified meta-enhanced LDA form for the correlation part.

Due to the way dlDF was constructed, one can add to the dlDF interaction energies accurate dispersion and exchange-dispersion energies computed using SAPT(DFT), and this approach leads to accurate total interaction energies for all intermonomer separations.39 However, since this post-KS approach is more time consuming than DFT calculations, one can alternatively add an atom-atom function fitted to the SAPT(DFT) dispersion and exchange-dispersion energies: such functions have been developed in Refs. 34 and 39. The functions from Ref. 34 used in the present test have the form

(3)

where the Tang-Toennies50 damping functions fn(r) are defined as

(4)

and the interatomic constants are expressed in terms of atomic ones using the following combination rules: β a b = β a β b and C n a b = C n a C n b . The damping parameters are the same for n = 6 and 8. All the parameters in this expression were fitted only to the sum of the dispersion and exchange-dispersion energies, so that these functions are DFT-independent and can be added to any dispersion-free interaction energy. Notice that the dlDF+D method, in contrast to most of the other approaches considered here, involves no fitting to the total interaction energies. The concept of dispersionless interaction energies was recently developed in alternative ways by Rajchel et al.51 and Austin et al.52 

2. B3LYP-D3 and LC-ωPBE-D3

The B3LYP-D340 and LC-ωPBE-D341 approaches use the standard DFT functionals for the DFT part and add the D3 atom-atom dispersion function from Ref. 40 with different dispersion switching parameters. The hybrid GGA B3LYP functional53,54 consists of the B88 Becke’s exchange functional55 of 1988 and of the Lee, Yang, and Parr (LYP)56 correlation functional. B3LYP contains three parameters (indicated by the numeral 3 in the acronym) fitted to a set of atomization energies, ionization potentials, proton affinities, and total atomic energies.

The LC-ωPBE functional,57 where LC stands for “long-range corrected,” is an RSH version of the PBE functional. Its exchange-correlation functional is a sum of the PBE correlation functional, of the short-range [attenuated according to Eq. (1)] PBE exchange functional, and of the long-range exact exchange. The former exchange component used the expression developed by Heyd et al.28,58 based on the exchange hole model. Note that despite the use of this expression, LC-ωPBE is a functional which is completely different from the functionals of Refs. 28 and 58 which use a certain amount of exact exchange only at short range and therefore have a wrong asymptotic behavior of the exchange potential. LC-ωPBE is actually similar in spirit to the long-range corrected functionals of Hirao and co-workers.27 These relations explain the origin of the LC in the LC-ωPBE acronym, which may seem redundant in view of Eq. (1) since the use of range separation alone, indicated by the parameter ω, assures long-range correctness for exchange functionals of this type. Note that whereas RSH functionals lead to correct asymptotics of exchange-correlation potentials, the more important electron densities produced by such functionals still have wrong asymptotics with the standard values of ω.59 LC-ωPBE was used here with the recommended value of ω = 0.4 bohr−1 optimized for a broad set of thermochemical data.

The D3 dispersion energy40 is of similar character but different in several details from that defined by Eqs. (3) and (4). First, the following quantity is computed for the dimer AB and the monomers A and B:

(5)

(D3 can also include a third-order term which has not been used here) where the switching function is of the form

(6)

The summation here is performed over all pairs of atoms in a system. The dispersion energy is then calculated using the supermolecular approach

Note that all terms coming from intramonomer atom-atom pairs cancel in this expression, so that one could equivalently sum only over intermonomer pairs to obtain the dispersion energy directly. The scaling parameters sn and tn (the latter denoted as sr,n in the original paper) were chosen to minimize deviations of the predictions from benchmarks that include both intermolecular interaction energies and thermochemical data. The value of s6 and t8 is set to 1.0 for both B3LYP and LC-ωPBE functionals. Thus, only two scaling parameters, s8 and t6, are optimized. The constants r a b 0 are distances where atom-atom Heitler-London energies computed for some test systems using Slater determinants built from KS orbitals reach a prescribed value. Thus, the product t n r a b 0 defines the value of atom-atom separation where the switching function changes rapidly from zero to one. We use the name “switching” rather than “damping” function for the fn of Eq. (6) to emphasize that the main role of these functions is not to damp the asymptotic inverse power expansion of dispersion energy in order to account for the charge-overlap effects at finite intermonomer distances R, as done by the Tang-Toennies function, but rather to completely switch off the dispersion energies at some chosen distance (the dispersion energies damped by the functions of Eq. (4) remain finite even at R = 0). Note furthermore that whereas the function of Eq. (3) contains summation only over the pairs of atoms from different monomers, the function of Eq. (5) sums over all pairs of atoms in a dimer and in the corresponding monomers. The powers in Eq. (6) were chosen as α6 = 14 and α8 = 16.

The reference coefficients, C 6 , ref m k a b , for a pair of atoms a = X and b = Y in their hydrides XmHn and YkHl are computed from coupled KS frequency-dependent polarizabilites using the Casimir-Polder formula

(7)

Different hydrides give different coefficients for a given pair of atoms which corresponds to atoms in different coordination states (e.g., corresponding to sp3, sp2, sp hybridizations). For atoms X and Y in actual molecules, their fractional coordination numbers are computed from their environments and then the final value of C 6 a b is obtained as a weighted sum of the reference values from Eq. (7).

The B3LYP interaction energies were obtained using the NWChem software package14 and the D3 correction was evaluated with the DFTD3 V2.1 program.40 The LC-ωPBE-D3 calculations were performed with Gaussian09 codes.60 In both cases only the two-body, second-order dispersion energy was used.

3. LC-BOP+LRD and LCgau-BOP+LRD

The LC-BOP method27 is an RSH approach which employs the B88 exchange functional and the one-parameter progressive (OP) correlation functional.61 One should emphasize that the DFT-based short-range exchange energy in LC-BOP has a very different form than that used in the LC-ωPBE functional discussed above. The recently modified42 form of this functional, LC-BOP12, was utilized in the present work. The LCgau-BOP functional44 differs from LC-BOP by the use of a more elaborate range partitioning function which in addition to the terms present in Eq. (1) includes two mutually cancelling Gaussian terms, one incorporated into the DFT part and the other one into the exact exchange part.

A “local-response dispersion” (LRD) function of Sato and Nakai62 is added to the LC-BOP12/LCgau-BOP interaction energies. The form of this function is the same as given by Eq. (3), except that the n = 10 term is also included. However, in contrast to previously discussed approaches, the coefficients C n a b (van der Waals constants) are calculated on-the-fly from electron densities. The starting point is the generalized Casimir-Polder expression for the exact second-order dispersion energy63 

(8)

where αX(r, r′|iu) is the frequency-dependent density-density response function (also called the frequency-dependent density susceptibility) of monomer X. Such functions can be computed very accurately using the time-dependent DFT (TD-DFT) method. Therefore, the use of TD-DFT response functions gives very accurate dispersion energies, as shown for the first time in Refs. 45 and 46.

The first key simplifying assumption of Sato and Nakai was the use of an approximate expression for the response function, which is a modified version of the expression proposed by Dobson and Dinte,64 

(9)

where ρ(r) is the density of system X. The quantity ω0 is a parametrized function of the density and of the magnitude of its gradient

(10)

where kF(r) = (3π2ρ(r))1/3 and s ( r ) = ρ ( r ) / ( 2 k F ( r ) ρ ( r ) ) . The parameter λ was empirically optimized as described below.

The second simplifying assumption of Sato and Nakai was to expand the interelectronic distances in Eq. (8) in the multipole series, leading to the asymptotic expansion of the dispersion energy. The volume integrals give then frequency-dependent multipole polarizabilities. The subsequent integration over the frequency gives the van der Waals constants. In the simplest case, this expansion is a series in inverse powers of the distance between centers of mass of monomers, but one can also formulate the so-called distributed expansion where all the intermonomer atom-atom distances are used. To this end, one partitions the response function into atomic domains and computes distributed polarizabilities. Sato and Nakai achieved such distribution using numerical integration grids centered around individual atoms. Recently, it has been shown65,66 that one can formulate a faster-converging distributed expansion of dispersion energy which almost exactly reproduces the SAPT(DFT) dispersion energy in all regions where the overlap effects can be neglected.

The parameter λ in Eq. (10) has been originally optimized to reproduce the empirical C6 constants for pairs of rare-gas atoms. It was recently reoptimized43 for 1225 known empirical C6 constants for atoms and small molecules.

To damp the asymptotic expansion at short intermonomer separations, Sato and Nakai used the following damping function:

(11)

where r0 is a constant specific for each pair of atoms and dependent on two parameters κ and R0,

(12)

where αx is the average static dipole-dipole polarizability for atom x computed using the response function of Eq. (9). In the original work, these parameters were optimized to reproduce the empirical minima of dimers of rare-gas atoms. In the calculations presented here, the values recently reoptimized43 on the so-called S66 set67 of separation-dependent benchmark interaction energies of 22 dimers.

Our survey included a functional with nonlocal correlation, vdW-DF2,68 which is a modified version of the earlier vdW-DF functional.69 The vdW-DF2 functional adds to a standard DFT functional a nonlocal term dependent on the density and the magnitude of its gradient, aimed at reproducing the dispersion energy. One has to properly select the standard DFT functional used in the vdW-DF method and the choice was based on similarities of predictions to those given by the HF approach. The original choice in vdW-DF was the revPBE functional,70 a reparametrized version of PBE. In vdW-DF2, the PW86R functional was chosen. This functional is a refitted71 version of the PW86 functional with the exchange part from Ref. 72 and the correlation part (P86) from Ref. 73. PW86 was chosen based on good agreement of its exchange-only version with HF interaction energies for several dimers. The refit concerned the parameters in the exchange enhancement factor and was done to make this factor have the exact large-s and small-s behavior, which was not the case for PW86. The agreement of PW86 and PW86R exchange-only interaction energies with HF interaction energies is about the same.

The nonlocal term has been derived by making approximations to the response function in the spirit of Eq. (9). The resulting expression reads

(13)

Here ρ(r) is the density of the dimer, so that vdW-DF methods include also the nonlocal correlation energies within monomers. The kernel ϕ(r, r′) is given by the expression

(14)

where W and T are relatively simple functions of their arguments, given by Eqs. (15) and (16) in Ref. 69: W is a linear combination of products of sines and cosines of the variables a and b (so W does not depend on density), whereas T is a sum of terms containing inverse powers of the binomials of the arguments. The arguments of T are defined as

(15)

and ν′ differs from ν by q 0 2 ( r ) q 0 2 ( r ) . The quantity q0 is given by

(16)

where ϵ x LDA ( r ) = 3 k F / 4 π and

(17)

The LDA correlation energy density, ϵ c LDA ( r ) , is an approximation to this quantity used in the associated standard DFT functional. The parameter Zab has been chosen based on theoretical constraints arguments and its value of −0.8491 used in vdW-DF was changed to −1.887 in vdW-DF2.

All the methods discussed so far use only electron densities, their gradients, and occupied KS orbitals. These methods scale therefore as N4 with system size, where system size can be defined as the number of electrons (the methods that do not use the exact exchange can easily be programmed to scale as N3). One way to improve the results is to make use of virtual KS orbitals, a step towards WF theory. This usually makes the scaling worse, but if it is only increased to N5, such an approach can still be applied to very large systems. The post-KS terms have to be properly added to the DFT interaction energy to avoid double counting.

1. SAPT(DFT)

In SAPT,3,74 the interaction energy is computed using a perturbative approach. Starting from the wave functions of unperturbed monomers, the intermolecular interaction energy can be evaluated order-by-order as an expansion in powers of intermolecular interaction operator V collecting Coulomb interactions of all electrons and nuclei of monomer A with those of monomer B. The resulting wave functions have to be properly antisymmetrized which explains the phrase “symmetry-adapted” in the name of this method and produces the so-called exchange components of interaction energies such as the exchange-dispersion term mentioned earlier. The SAPT interaction energy is naturally and rigorously decomposed into electrostatic, exchange, induction, and dispersion contributions. Since the exact wave functions of monomers are unknown, the starting points in standard SAPT are actually products of HF determinants of monomers. These wave functions are corrected by using the MBPT/CC expansion in powers of the intramonomer correlation operators WX, i.e., the so-called Møller-Plesset (MP) fluctuation potentials which are differences between the exact Hamiltonian of a monomer and the Fock Hamiltonian. The HF-based SAPT properly truncated at some power of WA gives interaction energies of similar accuracy as those produced by the CCSD(T) method, but also scales in the same way (N7) and therefore is similarly time-consuming. A much better scaling method can be obtained using a DFT description of the monomers. Since all components of interaction energies are computed as a post-KS process, there is by definition no double counting. An important ingredient of SAPT(DFT) is the use of asymptotically corrected exchange-correlation potentials59 giving electron densities with correct long-range behavior. The dispersion energies are computed from Eq. (8) using TD-DFT response functions without any approximations. SAPT(DFT)45–48,75–77 utilizing density fitting techniques45,48,78 scales as N5. If a nonhybrid DFT is used to describe the monomers and some small exchange terms are neglected, SAPT(DFT) scales in practice as N4, i.e., as a hybrid DFT, and can be applied to systems with hundreds of atoms79 [one step, the integral transformation, scales formally as N5, but this scaling is proportional to o2v2Naux, where o(v) is the number of occupied (virtual) orbitals and Naux the number of auxiliary functions, and the prefactor of this term is small].

SAPT(DFT) is usually applied at the second-order in V, but for systems with large polarization effects, the polarization terms of higher order are approximated by the so-called δ int HF correction which is the difference between the supermolecular Hartree-Fock interaction energy and the sum of the first- and second-order SAPT(HF) terms that are accounted for in the HF description of the system. The criterion for including δ int HF was the ratio of the sum of the induction and exchange-induction energies to the total interaction energy for the distance where the interaction energy was the most negative. If this ratio was larger than 12.5%, δ int HF was included. By this criterion, it was included for all systems except those containing benzene and nitrobenzene. The threshold value was determined by analyzing systems studied by SAPT(DFT) in the past. It turned out in the post-analysis that this criterion was not working well for the nitrobenzene dimer. Whereas for this system the large dipole moment of 2.1 a.u. does not result in a significant induction contribution near the minimum, the δ int HF term is very important for large intermolecular separations where it significantly improves the agreement with the benchmarks. However, in order not to violate the ‘blind’ character of this test, δ int HF was not used for the nitrobenzene dimer. In future applications of SAPT(DFT), the criterion for adding δ int HF should also take into account the dipole moments of the monomers.

The DFT calculations for monomers have been performed using the Tozer-Handy-Fermi-Amaldi asymptotic correction.80,81 The electrostatic energies were computed in quadruple precision arithmetics82 and using JK-optimized auxiliary bases.83 For all other terms, the correlation-optimized bases84 were used.

2. RSH+lrMP2 and RSH+lrRPAx-SO2

As discussed earlier, the RSH methods mix DFT and exact (HF) exchange in a ratio dependent on the interelectronic distance, using the DFT exchange at short range (sr) and exact exchange at long range (lr). It is also possible to mix the correlation part in the same way. However, the long-range correlation part has to be obtained from some WF-type approach. The simplest such method is the second-order MBPT with the MP partition of the Hamiltonian, known under the name MP2. The MP2 expression in RSH methods is formally identical to that of the regular MBPT, except that it is computed with attenuated interelectronic interactions and with KS orbitals and orbital energies. A method of this type, dubbed RSH+lrMP2, was developed in Ref. 85. In the first implementation, this method used the short-range LDA exchange-correlation functional of Ref. 86. In the present work, the short-range PBE exchange-correlation functional of Goll et al.16 (which is a modification of the functional of Ref. 87) is used. Note that the short-range PBE exchange part of this functional is different from the one of Refs. 28 and 58 used in LC-ωPBE.

The next step beyond MP2 is the random-phase approximation (RPA) approach. RPA may be derived from the CCD method by selecting from the CCD expression only terms including products of two-electron integrals that can be illustrated by the so-called ring diagrams (the simplest ring diagram represents MP2).88 The RPA amplitudes are obtained from equations of a similar type as used in calculations of response functions. There are several versions of RPA available. The method selected in our survey, RSH+lrRPAx-SO2,89 uses the Hartree kernel plus the Hartree-Fock exchange kernel in the equations for amplitudes. However, matrix elements multiplying these amplitudes are not antisymmetrized. This is variant 2 of a method introduced by Szabo and Ostlund,90 which is indicated by SO2 in the acronym. As for RSH+lrMP2, the calculations were done with the short-range PBE exchange-correlation functional of Goll et al.16 

Ideally, in order to have the most unambiguous comparisons, all calculations should be performed using CBS extrapolations from large basis sets. This approach, however, would be too time consuming for some methods. Furthermore, some methods have been optimized using specific basis sets and perform best if used with the same basis sets. Finally, DFT calculations are known to converge faster with respect to basis set size than post-KS ones. The organizers of the blind test suggested that the augmented triple-zeta basis set of Dunning and co-workers,91,92 denoted as aug-cc-pVTZ, be used and most participants followed this suggestion. This basis set obviously could not be used in codes based on the plane-wave functions. The other basis sets used were mostly different members of the aug-cc-pVXZ family with X = 2 (marked by D), 4 (Q), and 5. The jun-cc-pVTZ basis set is the aug-cc-pVTZ basis with omitted diffuse f functions on all atoms and without any diffuse functions on hydrogens. We will discuss possible uncertainties resulting from the use of different basis sets in Sec. V D. SAPT(DFT) calculations used the so-called dimer-centered plus midbond basis set format.93 

Interaction energies computed using the supermolecular approach, i.e., by subtracting the total monomer energies from the dimer’s total energy, suffer the so-called basis set superposition error (BSSE) resulting from the lowering of monomer energies in dimer calculations due to “borrowing” of the basis set from the interacting partner. One way to reduce this error is to compute such energies in the counterpoise (CP) corrected way,94 i.e., by performing calculations for monomers in the full dimer’s basis set. The organizers of the blind test suggested that the CP correction will be used in all calculations, but some participants have chosen not to use it. Note that codes using plane-wave basis sets have to perform all calculations in the same basis set so they are BSSE free. The SAPT approach is by definition free of BSSE since the interaction energy is computed directly, i.e., it does not involve any subtraction of dimer and monomer energies.

The DFT-based methods described in Sec. II were assessed by comparisons to a set of benchmark interaction energies computed in the same way as in Ref. 15,

(18)

where E int HF CBS is the CBS-extrapolated HF interaction energy, δ E int MP 2 CBS is the CBS-extrapolated difference between the MP2 and HF interaction energies, and δ E int CCSD ( T ) is the difference between CCSD(T) and MP2 interaction energies, computed in the same basis set, generally smaller than the bases used for the HF and MP2 extrapolations. All correlated calculations employed the frozen-core approximation. The HF and MP2 interaction energies were extrapolated to the CBS limit using two-point extrapolations, exponential in the HF case

(19)

with α fixed at 1.63 as recommended in Ref. 95, and inverse third power in the case of MP2,

(20)

where “(X)” denotes that a given energy was obtained using the basis aug-cc-pVXZ (plus midbond functions, see below). The CBS values were obtained from calculations in X − 1 and X bases. For all systems, extrapolations were performed with (X − 1, X) = (4, 5) except for the water dimer, for which we used (5,6) and the EDNA dimer for which we used (3,4). The δ E int CCSD ( T ) contribution was obtained using the aug-cc-pVTZ basis for all systems except for the water dimer for which the aug-cc-pVQZ basis was used and for the EDNA dimer for which aug-cc-pVDZ was used. For each configuration, to improve basis set convergence, an additional set of 3s (α = 0.9,0.3,0.1), 3p (α = 0.9,0.3,0.1), 2d (α = 0.6,0.2), and 2f (α = 0.6,0.2) midbond (mb) functions was used to augment the principal basis, with the location of the midbond functions determined using the algorithm of Ref. 96. The advantages of using midbond functions in conjunction with CBS extrapolations were demonstrated in Ref. 97. The NWChem software package14 was used to obtain all benchmark interaction energies except for some MP2 energies in large basis sets, see below.

For larger monomers, our largest basis sets become linearly dependent to the point that NWChem removes a number of linear combinations of atomic orbitals from the basis set. With the default threshold of 10−5 for the eigenvalues of the overlap matrix, it removes 23 (47) linear combinations from the benzene-water aug-cc-pVQZ+mb (aug-cc-pV5Z+mb) basis set at R = 7 Å. Since our calculations are performed in the same basis set for the dimer and both monomers, and this reduction of basis set size is identical in all three cases, the counterpoise procedure is still rigorously imposed. On the other hand, the CBS extrapolations may be affected. Worse, the interaction energies can still be inaccurate despite the reduction and we found this to be the case for NWChem with the threshold given above. As in the case of the triple excitation error in CCSD(T), we found this problem by comparisons with SAPT(DFT) results which are very reliable at large separations. We first tried to lower the threshold to 10−6, but it resulted in even more erratic results and self-consistent field (SCF) iterations convergence problems in some cases (even with the 10−5 threshold for the overlap matrix, the iterations would sometimes not converge if a tight SCF convergence threshold was used). We then repeated the calculations of MP2 energies using the GAMESS98 and Gaussian60 packages and found that the results are consistent and appear to be sufficiently accurate (probably to at least about 3 significant digits for the interaction energies, based on the level of consistency). Also the CBS extrapolations were consistent (although this could be due to the same number of linear combinations removed). Therefore, we repeated all the MP2 calculations using GAMESS.

The CCSD(T) calculations (NWChem) were run on a Cray XE6 (32 cores/node, 60 GB/node) at the U.S. Army Engineer Research and Development Center (ERDC) using 512 cores for all systems except the 3 largest dimers (FOX-7, EDNA, nitrobenzene) which used 4096 cores. For the small/intermediate sized systems, most calculations were completed in 12-36 h (wall clock) with the total time being a function of the size of the system, as well as the center of mass (COM) separation R, with configurations at large R completing faster due to the significantly reduced number of non-negligible intermonomer integrals remaining at large separation. The CCSD calculations for the large systems were completed in 4-6 days (wall clock) on average and due to the 7-day maximum runtime queue limits at ERDC, the (T) energy was evaluated using NWChem’s restart option. The SCF convergence tolerance for all calculations was equal to 10−10 hartree where the threshold corresponds to the magnitude of the norm of the orbital gradient in NWChem’s quadratically convergent SCF implementation. The CCSD convergence tolerance for the energy was equal to 10−8 hartree. By default, NWCHEM uses a convergence tolerance for the root mean square (RMS) error of the amplitudes (10−8) that has the same magnitude as that used for the energy. For the three biggest dimers, this led to an unacceptably large number of CC iterations (>45) thereby prolonging the total wall clock and CPU time of these already expensive calculations. However, it was observed that although the energy converged fairly quickly (15-20 iterations), the remaining iterations were spent reducing the amplitude error to an equivalent level with little to no change observed in the energy. Therefore, for the large systems, we modified NWChem so that it only used the energy convergence criterion, and not that of the amplitudes, to terminate the CCSD iterations. Although not used as a criterion for terminating the CC iterations, the root mean square error of the amplitudes was still monitored and for all cases the final RMS error was on the order of 10−6 or below.

The large-basis MP2 calculations (using GAMESS) were run on a Cray XC40 (32 cores/node, 126 GB/node) at the U.S. Army Research Laboratory’s Defense Shared Resource Center using 2048 cores for all dimers except EDNA which used 5120 cores. The SCF energy was converged to 10−10 hartree and required 1-8 h (wall clock) time per job, again depending on the system size and COM separation. For three of the required energy components (MP2/aQZ energy of nitromethane monomer A at R = 4.131 Å, MP2/aQZ energy of the methylformate dimer at R = 4.243 Å, and MP2/a5Z energy of benzene-methane R = 8.80 Å), the GAMESS SCF was not convergent and these energies were computed using Gaussian on the same Cray XC40 using 32–256 cores. In total, the computation of the benchmark energies required approximately 40 × 106 CPU hours with most of that time resulting from the required recalculation of the benchmarks due to the issues described above.

The equilibrium geometries of the dimers were taken from the literature, when available, or optimized at the MP2/aug-cc-pVTZ level of theory. Complete geometry optimizations were performed so that the two monomers have slightly different geometries in homogeneous dimers. The optimizations were not CP corrected. All the geometries and references to the appropriate literature sources are given in the supplementary material. For each system, given the optimized configuration, radial potential energy curves were obtained by variation of the radial separation R between the COMs of the two monomers and keeping the mutual orientations and intramonomer coordinates unchanged. The interaction energies computed are the vertical ones, i.e., the geometry of each monomer in calculations for this monomer is exactly the same as in the dimer calculations. All interaction energies from the DFT-based methods were obtained for the same configurations as used to compute the benchmark energies, i.e., the geometries of the dimers were not optimized using these methods. The number of points per dimer ranged from six to sixteen (depending on the size of the molecule) with geometries chosen to sample the repulsive wall, potential well, and the asymptotic region. The entire reference data set contains a total of 80 points.

All the results of calculations leading to our benchmarks are listed in the supplementary material. These tables show that the CBS extrapolations only modestly influence our results: the median unsigned percentage error of the nonextrapolated CCSD(T) interaction energies computed in the applied basis sets (aug-cc-pVTZ+mb for all but two systems) with respect to the CBS interaction energies is 1.04%. The reason is our use of the bond functions which makes the results computed in the aug-cc-pVTZ+mb basis quite well converged. However, there are a number of percentage deviations of the CCSD(T)/aug-cc-pVTZ+mb results from the CBS ones, all occurring for the shortest distances, that are very large in magnitude, up to 357% (for the benzene–methane dimer at R = 3.28 Å). Despite such a large discrepancy, we believe that the CBS results at these separations are still reliable, as can be seen from the following analysis. The large percentage errors are due to a combination of the fact that δ E int CCSD ( T ) is quite large for these dimers and that the interaction energy at these distances, near the points where the potential curves cross zero, are small differences of large numbers. Table II illustrates these relations for the benzene-methane dimer. The 357% error quoted above is for the difference between E int CCSD ( T ) computed in the basis aug-cc-pVTZ+mb, equal to 0.119 kcal/mol, and E int CBS equal to −0.047 kcal/mol. The MP2/CBS energy is still quite different: −0.822 kcal/mol. Although such dramatic differences may suggest a low reliability of our benchmarks, an analysis of Table II shows that this is not the case. Clearly, the observed behaviour results from the fact that E int HF and δ E int MP 2 , the former converged to better than 0.001 kcal/mol and the latter to about 0.04 kcal/mol, cancel to a large extent. Since δ E int CCSD ( T ) is close to the MP2 interaction energy but of opposite sign, adding it results in a near zero interaction energy, which obviously is then very sensitive in relative terms to basis set size and CBS extrapolations.

TABLE II.

Basis set convergence of calculations for benzene-methane at R = 3.28, 3.8, and 8.8 Å. Energies are in kcal/mol and aXZ denotes aug-cc-pVXZ+mb. All quantities in parentheses are obtained from extrapolation formulas as described in the text.

R = 3.28 aDZ aTZ aQZ a5Z CBS
E int HF   5.105  5.105  5.101  5.099  (5.099) 
δ E int MP 2   −5.562  −5.761  −5.845  −5.882  (−5.921) 
δ E int CCSD ( T )   0.725  0.775  (0.787)a  (0.791)a  (0.796)b 
Eintc  0.268  0.119  (0.043)  (0.009)  (−0.025) 
E int CBS d          (−0.047) 
R = 3.8           
E int HF   0.885  0.897  0.898  0.898  (0.897) 
δ E int MP 2   −2.599  −2.663  −2.688  −2.699  (2.711) 
δ E int CCSD ( T )   0.310  0.337  (0.344)a  (0.346)a  (0.349)b 
Eintc  −1.403  −1.428  (−1.446)  (−1.455)  (−1.465) 
E int CBS d          (−1.476) 
R = 8.8           
E int HF   −0.0056  −0.0052  −0.0052  −0.0051  (−0.0051) 
δ E int MP 2   −0.0145  −0.0149  −0.0150  −0.0149  (−0.0149) 
δ E int CCSD ( T )   0.0026  0.0025  (0.0025)a  (0.0025)a  (0.0025)b 
Eintc  −0.0175  −0.0176  (−0.0177)  (−0.0176)  (−0.0176) 
E int CBS d          (−0.0175) 
R = 3.28 aDZ aTZ aQZ a5Z CBS
E int HF   5.105  5.105  5.101  5.099  (5.099) 
δ E int MP 2   −5.562  −5.761  −5.845  −5.882  (−5.921) 
δ E int CCSD ( T )   0.725  0.775  (0.787)a  (0.791)a  (0.796)b 
Eintc  0.268  0.119  (0.043)  (0.009)  (−0.025) 
E int CBS d          (−0.047) 
R = 3.8           
E int HF   0.885  0.897  0.898  0.898  (0.897) 
δ E int MP 2   −2.599  −2.663  −2.688  −2.699  (2.711) 
δ E int CCSD ( T )   0.310  0.337  (0.344)a  (0.346)a  (0.349)b 
Eintc  −1.403  −1.428  (−1.446)  (−1.455)  (−1.465) 
E int CBS d          (−1.476) 
R = 8.8           
E int HF   −0.0056  −0.0052  −0.0052  −0.0051  (−0.0051) 
δ E int MP 2   −0.0145  −0.0149  −0.0150  −0.0149  (−0.0149) 
δ E int CCSD ( T )   0.0026  0.0025  (0.0025)a  (0.0025)a  (0.0025)b 
Eintc  −0.0175  −0.0176  (−0.0177)  (−0.0176)  (−0.0176) 
E int CBS d          (−0.0175) 
a

Values estimated from formula (20) applied to δ E int CCSD ( T ) with the CBS energy and the constant B obtained from the (D,T) extrapolation.

b

(D,T) extrapolation analogous to formula (20).

c

Sum of the values in the preceding rows.

d

Equation (18).

Additional support for the reliability of our benchmarks, even for those difficult cases, can be obtained from performing the (D,T) CBS extrapolation of δ E int CCSD ( T ) . We do not use such extrapolations in our benchmarks since extrapolations from such low X are generally unreliable, but let us assume the (D,T) extrapolation works in this case. Table II shows that the extrapolation gives the value of 0.796 kcal/mol (in the CBS column) and increases δ E int CCSD ( T ) by 0.021 kcal/mol, so that the interaction energy including this extrapolation is −0.026 kcal/mol, 0.021 kcal/mol from our recommended value. Also the sequence of the interaction energies in the row Eint is seen to converge smoothly (since aug-cc-pVTZ+mb was the largest basis set that we could use at the CCSD(T) level, the δ E int CCSD ( T ) entries in columns aQZ and a5Z were obtained using the X−3 dependence with the constants calculated using the aug-cc-pVDZ+mb and aug-cc-pVTZ+mb bases). Whereas it is difficult to say whether the value of −0.025 kcal/mol that includes the (D,T) extrapolation of δ E int CCSD ( T ) is a better representation of the exact CCSD(T) value than E int CBS = 0 . 047 kcal/mol, it is clear from the convergence patterns that either of these results is more accurate than E int CCSD ( T ) computed using aug-cc-pVTZ+mb and equal to 0.119 kcal/mol. Based on the analysis performed above, we can estimate the residual error of E int CBS to be about 0.04 kcal/mol. Whereas it is a large error percentagewise, it is reasonably small in absolute terms.

The results for two other distances presented also in Table II show that in these cases there are no doubts about the adequate convergence of the interaction energies. At R = 3.8 Å, the residual error can be roughly estimated to amount to about 0.02 kcal/mol or about 1% of interaction energy. This is the same relative accuracy as given in Ref. 15 for the set of benchmarks obtained there for systems involving only equilibrium dimer configurations. At R = 8.8 Å, the absolute error is smaller than 0.0001 kcal/mol and the relative one is smaller than 0.6%.

For our largest dimer, (EDNA)2 containing 32 atoms, we were able to use only the comparatively small aug-cc-pVDZ+mb basis in calculations of the δ E int CCSD ( T ) contribution. Table II shows that if this level of basis sets was applied to the benzene-methane dimer, the error relative to E int CBS would be 0.15, 0.025, and 0.0001 kcal/mol for the three consecutive distances. The two latter errors are comparable to the estimated errors of E int CBS for these distances, so the results will be less accurate by about a factor of two. This decrease of accuracy makes a difference only for the best-performing DFT approaches. At R = 3.28 Å, the error increases about 4 times, so that at very small R the EDNA dimer benchmarks could be insufficiently accurate. However, the shortest R included for this system is well outside the zero-crossing point, so that this issue does not become a problem.

Plots of the potential energy surfaces for each complex are presented in Figs. 1–10 and a file listing all of the DFT and benchmark interaction energies is available in the supplementary material. In each figure, for clarity of presentation, the DFT methods were split into two groups with 6 methods in each group, each plotted in a separate panel. For a given system, both panels have the same scales on both axes, therefore curves residing in different panels can be compared directly.

FIG. 1.

Potential energy curves for the water dimer. The left panel compares the performance of DFT+D-type methods and of a functional with nonlocal correlation, whereas the right panel includes parametrized (hybrid) semilocal functionals and methods using virtual orbitals. Lines are drawn only to guide the eye.

FIG. 1.

Potential energy curves for the water dimer. The left panel compares the performance of DFT+D-type methods and of a functional with nonlocal correlation, whereas the right panel includes parametrized (hybrid) semilocal functionals and methods using virtual orbitals. Lines are drawn only to guide the eye.

Close modal
FIG. 2.

Potential energy curves for the ethanol dimer. See Fig. 1 for explanations.

FIG. 2.

Potential energy curves for the ethanol dimer. See Fig. 1 for explanations.

Close modal
FIG. 3.

Potential energy curves for the nitromethane dimer. See Fig. 1 for explanations.

FIG. 3.

Potential energy curves for the nitromethane dimer. See Fig. 1 for explanations.

Close modal
FIG. 4.

Potential energy curves for the methylformate dimer. See Fig. 1 for explanations.

FIG. 4.

Potential energy curves for the methylformate dimer. See Fig. 1 for explanations.

Close modal
FIG. 5.

Potential energy curves for the benzene-methane dimer. See Fig. 1 for explanations.

FIG. 5.

Potential energy curves for the benzene-methane dimer. See Fig. 1 for explanations.

Close modal
FIG. 6.

Potential energy curves for the benzene-water dimer. See Fig. 1 for explanations.

FIG. 6.

Potential energy curves for the benzene-water dimer. See Fig. 1 for explanations.

Close modal
FIG. 7.

Potential energy curves for the imidazole dimer. See Fig. 1 for explanations.

FIG. 7.

Potential energy curves for the imidazole dimer. See Fig. 1 for explanations.

Close modal
FIG. 8.

Potential energy curves for the nitrobenzene dimer. See Fig. 1 for explanations.

FIG. 8.

Potential energy curves for the nitrobenzene dimer. See Fig. 1 for explanations.

Close modal
FIG. 9.

Potential energy curves for the FOX-7 dimer. See Fig. 1 for explanations.

FIG. 9.

Potential energy curves for the FOX-7 dimer. See Fig. 1 for explanations.

Close modal
FIG. 10.

Potential energy curves for the EDNA dimer. See Fig. 1 for explanations.

FIG. 10.

Potential energy curves for the EDNA dimer. See Fig. 1 for explanations.

Close modal

Starting the comparisons from the asymptotic region, one can see in the figures that for most systems the majority of the methods seem to perform very well. One reason is that all but two of our monomers have sizable dipole moments (in some cases very large: 3.6 a.u. for FOX-7 and 2.1 a.u. for nitrobenzene), so that the interaction energy decays like 1/R3. For benzene-water, the decay is still slow, as 1/R4, and only for benzene-methane the decay of the electrostatic energy is 1/R6, i.e., the same as that of the induction and dispersion energies. Since standard DFT methods give fairly accurate multipole moments, one expects accurate reproduction of electrostatically dominated asymptotic decays. Another reason is just smallness of the interaction energy in this region: the typical percentage errors at the largest distances are actually not negligible and amount to a few percent for the systems with the 1/R3 decay. There are a few methods, DCACP, M06, M11, and vdW-DF2, which for several systems give still much larger errors, in the range 20%-230%. Since at these distances the electrostatic energy constitutes a major fraction of the interaction energy, this indicates that either the multipole moments of some monomers are not well reproduced by these approaches or there are some unphysical components appearing at these distances. For benzene-methane, the only dimer where electrostatics decays as fast as dispersion and the SAPT(DFT) energy decomposition shows that at large R the interaction energy consists of equal fractions of these components, these four methods have errors between 52%–77%, whereas the other methods perform overall similarly as for other systems. One should point out that while DCACP, M06, and M11 are not expected to produce the correct asymptotic behavior of the dispersion energy, vdW-DF2 should be able to do so, so the large error of this method is unexpected. For DCACP, this shortcoming can be made less severe through the addition of a larger number of atom-type dependent parameters.99 

Somewhat surprisingly, even for methods with potentially correct asymptotics, the relative unsigned errors often increase with R for large R. The observed behaviour may be due to unphysical effects in the DFT part resulting from the wrong decay of DFT densities. It is probably not due to insufficiently accurate C6 coefficients, since for large enough R this error should be constant (but perhaps R is not large enough in our calculations).

Near the equilibria, the discrepancies visible in the figures are generally much larger than at large separations. In particular, RSH+lrMP2 and RSH+lrRPAx-SO2 systematically overbind, the former more strongly than the latter. In contrast, DCACP systematically underbinds. So do M06 and vdW-DF2, but to a much smaller extent. M11 is the only method which sometimes overbinds and sometimes underbinds. However, for each of the remaining methods there is a dimer or two where a given method clearly over or underbinds.

At the smallest R for each dimer, the deviations from the benchmarks are generally the largest. The main reason is that this point, for most systems, is near the region where the potential energy curves cross zero. Thus, the cancellations similar to those discussed in Sec. V A occur for most methods. The observed discrepancies are actually not that critical since these interaction energies are at the onset of the repulsive wall. As it is well seen on the example of the water dimer (Fig. 1), the walls produced by different methods are quite similar: the lateral shift of the wall between the extreme cases is less than 0.05 Å.

A summary of the performance of all methods on all dimers is presented in Table III. Papers comparing the performance of various methods often use the average unsigned errors (AUE). To see that this approach does not provide sufficient information about the performance, one can recall the example given in the Introduction where the 0.46 kcal/mol AUE tells absolutely nothing about the performance of a method for the weakest bound system with the interaction energy of −0.04 kcal/mol. A much better measure, in the case when a benchmark set contains only energies at minima, is the average unsigned percentage error (AUPE), which works also very well in the asymptotic region. Unfortunately, AUPE cannot be used for the complete potential energy curves since the percentage errors in the region where a given curve crosses zero can be huge, as we have seen in Sec. V A. Thanthiriwatte et al.11 proposed an elaborate procedure to evaluate such results. Another option is to use the median unsigned percentage error (MUPE) since a few very large errors at the edge of the range do not change the position of the median. We have used MUPE to compare the methods in Table III.

TABLE III.

Median unsigned percentage errors (MUPE) for individual dimers with respect to the benchmarks. The MP2 entry are the CBS extrapolated values obtained as described earlier.

SAPT(DFT) MP2 B3-D3a dlDF+D LC-D3b gauBOPc BOPd vdw-DF2 M06 lrRPAe M11 lrMP2f DCACP Ave.
Water dimer  1.38  1.04  1.61  1.39  4.44  3.28  6.23  5.56  4.48  7.03  5.32  7.63  7.06  4.34 
Ethanol dimer  1.94  1.59  3.96  2.31  4.49  6.75  10.10  16.68  22.65  8.22  29.64  9.18  39.12  12.05 
Nitromethane dimer  3.65  4.99  3.48  11.14  7.79  9.27  11.30  6.58  4.36  9.51  7.30  12.53  16.11  8.31 
Methylformate dimer  1.58  7.99  8.03  3.18  1.94  3.13  4.41  9.09  18.87  5.41  11.25  7.40  35.86  9.09 
Benzene-methane  3.47  13.93  7.84  6.16  9.11  7.73  8.23  28.02  29.77  8.87  40.98  16.40  67.74  19.10 
Benzene-water  4.38  3.31  4.90  5.56  11.09  6.95  8.08  25.03  7.54  7.65  5.17  9.96  18.26  9.07 
Imidazole dimer  0.70  4.38  1.20  4.11  2.66  3.04  5.38  2.17  6.85  4.23  4.01  5.70  6.79  3.94 
Nitrobenzene dimer  10.49  18.39  13.99  17.04  9.57  8.57  10.42  11.28  9.87  10.87  10.18  28.69  28.71  14.47 
FOX-7 dimer  6.45  0.20  5.16  6.04  6.28  6.03  7.87  7.68  5.77  8.84  5.14  10.16  9.10  6.52 
EDNA dimer  3.65  1.09  1.74  7.70  3.41  5.02  7.59  7.52  16.99  8.85  30.40  13.00  31.98  10.69 
Average of MUPEs  3.77  5.69  5.19  6.46  6.08  5.98  7.96  11.96  12.72  7.95  14.94  12.07  26.07   
SAPT(DFT) MP2 B3-D3a dlDF+D LC-D3b gauBOPc BOPd vdw-DF2 M06 lrRPAe M11 lrMP2f DCACP Ave.
Water dimer  1.38  1.04  1.61  1.39  4.44  3.28  6.23  5.56  4.48  7.03  5.32  7.63  7.06  4.34 
Ethanol dimer  1.94  1.59  3.96  2.31  4.49  6.75  10.10  16.68  22.65  8.22  29.64  9.18  39.12  12.05 
Nitromethane dimer  3.65  4.99  3.48  11.14  7.79  9.27  11.30  6.58  4.36  9.51  7.30  12.53  16.11  8.31 
Methylformate dimer  1.58  7.99  8.03  3.18  1.94  3.13  4.41  9.09  18.87  5.41  11.25  7.40  35.86  9.09 
Benzene-methane  3.47  13.93  7.84  6.16  9.11  7.73  8.23  28.02  29.77  8.87  40.98  16.40  67.74  19.10 
Benzene-water  4.38  3.31  4.90  5.56  11.09  6.95  8.08  25.03  7.54  7.65  5.17  9.96  18.26  9.07 
Imidazole dimer  0.70  4.38  1.20  4.11  2.66  3.04  5.38  2.17  6.85  4.23  4.01  5.70  6.79  3.94 
Nitrobenzene dimer  10.49  18.39  13.99  17.04  9.57  8.57  10.42  11.28  9.87  10.87  10.18  28.69  28.71  14.47 
FOX-7 dimer  6.45  0.20  5.16  6.04  6.28  6.03  7.87  7.68  5.77  8.84  5.14  10.16  9.10  6.52 
EDNA dimer  3.65  1.09  1.74  7.70  3.41  5.02  7.59  7.52  16.99  8.85  30.40  13.00  31.98  10.69 
Average of MUPEs  3.77  5.69  5.19  6.46  6.08  5.98  7.96  11.96  12.72  7.95  14.94  12.07  26.07   
a

B3LYP-D3.

b

LC-ωPBE-D3.

c

LCgau-BOP+LRD.

d

LC-BOP+LRD.

e

RSH-lrRPAx-SO2.

f

RSH-lrMP2.

As Table III shows, the two “easiest” systems for all methods turned out to be the imidazole dimer and the water dimer, where the average MUPE was only 3.9% and 4.3% and the largest MUPE was 6.9% and 7.6%, respectively. The reason for good performance in the former case is that with the dipole moment of imidazole equal to 1.5 a.u., this interaction is strongly dominated by the electrostatic effects: for the largest R considered (10.25 Å), the dispersion energy is only 3.5% of the electrostatic energy. The water dimer interaction is dominated by electrostatics as well, although to a lesser extent since the water dipole moment is only 0.78 a.u. However, one expects a good performance for water from methods with fitted parameters since the water dimer is included in most training data sets. Despite the low average MUPEs, the differences in performance between different methods are significant. For example, for the water dimer, several curves [SAPT(DFT), LCgau-BOP+LRD, LC-ωPBE-D3] almost overlap the benchmark curve, while methods such as B3LYP-D3, RSH-lrMP2, RSH-lrRPAx-SO2, and vdW-DF2 are quite distinctly shifted from it.

The performance is only a bit worse (average MUPE 6.5% and maximum MUPE 10.2%) for the FOX-7 dimer, again related to the large FOX-7’s dipole moment of 3.6 a.u. As for the imidazole dimer, the dispersion energy is only about 3.5% of the electrostatic energy at the largest R (10.58 Å). One may ask why the importance of the dispersion energy is about the same for the two dimers despite the fact that FOX-7 has a dipole moment that is more than twice that of imidazole. The reason is that whereas electrostatic interactions coming from various regions of monomers are of both signs, and therefore tend to cancel, the dispersion contributions all add up. Therefore, for large molecules, dispersion becomes relatively more important at the same level of polarity.100,101 Since FOX-7 is twice as large as imidazole, this partially explains the observed relations.

The next group of systems includes the nitromethane, benzene-water, methylformate, and EDNA dimers, with average MUPEs in the range 8.3%–10.7% and maximum MUPEs 16%–36%. The monomers have modest dipole moments per size of a given molecule, so the increase of difficulty for DFT-based approaches compared to the group discussed above is expected. Exceptions are benzene-water with one dipole moment of zero and nitromethane dimer with the dipole moment of 1.6 a.u. The better than expected average performance on benzene-water can be due to the fact that this system is included in many training data sets. The nitromethane monomer has about the same size and dipole moment as the imidazole monomer, yet the performance of the DFT methods is about twice worse in the former case. This factor of two is quite consistently reflected by individual methods in both cases. The answer to this puzzle is provided by SAPT(DFT) components. Whereas at the largest separation the ratio of the dispersion energy to the electrostatic energy, amounting to 9% for the nitromethane dimer, is larger than the 3.5% found in the case of the imidazole dimer, but still very small, at the equilibria the ratios are 202% and 39%. Thus, overall the difficult to reproduce dispersion energy is much more important in the case of the nitromethane dimer. This relative behavior of the electrostatic and dispersion energies is related to the dimer geometry. The nitromethane dimer has a geometry where all atoms of a given monomer are fairly close to those of the interacting partner, whereas the imidazole dimer has two rings lying roughly in the same plane so only a few atoms from each monomer are in close contact.

Somewhat surprisingly, for the ethanol dimer the average MUPE is 12.1% and the maximum MUPE is 39%, one of the largest. According to the tendencies discussed above, the ethanol dimer is expected to be a more difficult case than the water dimer since the ethanol dipole moment of 0.69 a.u. is slightly smaller than that of water and the number of atoms is twice as large. Also, in the vdW minimum region, the dispersion energy is almost two times larger in magnitude than the electrostatic energy in the former case, whereas in the latter case, the ratio is about one third, similar to the nitromethane vs. imidazole comparison. One may also notice that for the ethanol dimer about half of the methods perform very well with MUPEs in the range 1.6%–6.8%, whereas the remaining methods have errors from 8.2% to 39%. Thus, only a subset of methods makes the average MUPE so large.

The next to last most difficult system is the nitrobenzene dimer, with the average MUPE of 14.5% and the maximum MUPE of 29%. None of the methods performed well on this system, with all MUPEs above 8.6%. This happens since despite the very large dipole moment of nitrobenzene, amounting to 2.1 a.u., the dispersion energy is about two times larger in magnitude than the electrostatic energy in the region of the vdW minimum. Even SAPT(DFT) gives a very large MUPE of 10.5% for this systems, however, as mentioned earlier, this is partly due to the fact that the δ int HF correction was mistakenly omitted. If it is added, SAPT(DFT)’s MUPE drops to 3.7%.

The largest average MUPE of 19.1% and the maximum MUPE of 68% was obtained for benzene-methane. This is expected as both monomers are nonpolar and the interaction is dispersion dominated: the ratio of the dispersion to the electrostatic energies is about 1 at the largest R, but it is 3.2 near the minimum. The performance of the methods falls into three groups. The best performing SAPT(DFT) method gives a MUPE of only 3.5%, the DFT+D methods and RSH-lrRPAx-SO2 give 6.2%–9.1%, MP2 and RSH-lrMP2 give 14% and 16%, respectively, and the remaining errors are above 28%. The very large errors in the last group lead to the very large average MUPE.

In summary, the order of difficulty correlates reasonably well with the size of the dipole moments, i.e., the larger the moment the easier it is for most methods to obtain accurate predictions of interaction energies. However, for a given size of dipole moment, the performance will be worse for larger systems since the dispersion energies increase with the number of atoms whereas electrostatic interactions from various regions of molecules tend to cancel. These simple rules explain most trends, some more puzzling cases were explained by analyzing the SAPT components.

Table IV lists the overall errors of all computed interaction energies for each method. The methods are ordered using MUPE, but the AUE errors are also listed. The MUPEs are computed for the whole set of 80 configurations and therefore are different from the values in the last row of Table III which are averages of MUPEs for individual dimers. The MUPEs in Table IV range from 2.6% to 16.6%. Thus, the best methods perform really well. In fact, in routine calculations for these systems with the CCSD(T) method, where one would have to choose a much smaller basis set than used in calculations of our benchmarks, MUPE due to limited size of the basis set would certainly have been equal to a few percent, as the MUPE of CCSD(T) predictions in the largest basis sets used by us (without any CBS extrapolations) is 1.2%.

TABLE IV.

Errors of the investigated methods with respect to the benchmarks. AUPE: average unsigned percentage error; AUE: average unsigned error; MUPE: median unsigned percentage error computed for all dimers. The MP2 entries are the CBS extrapolated values obtained as described earlier.

AUPE AUE
Method R > 1.5 Rmin (kcal/mol) MUPE
SAPT(DFT)  3.3  0.17  2.6 
MP2  4.3  0.30  3.0 
B3LYP-D3  4.4  0.17  3.4 
dlDF+D  5.5  0.26  4.2 
LC-ωPBE-D3  6.3  0.15  5.3 
LCgau-BOP+LRD  5.8  0.23  5.5 
M06  20.6  0.31  7.3 
vdW-DF2  14.5  0.42  7.5 
LC-BOP12+LRD  8.1  0.27  7.7 
RSH-lrRPAx-SO2  7.8  0.29  7.7 
M11  22.7  0.34  8.5 
RSH+lrMP2  11.3  0.57  10.2 
DCACP  26.7  0.61  16.6 
AUPE AUE
Method R > 1.5 Rmin (kcal/mol) MUPE
SAPT(DFT)  3.3  0.17  2.6 
MP2  4.3  0.30  3.0 
B3LYP-D3  4.4  0.17  3.4 
dlDF+D  5.5  0.26  4.2 
LC-ωPBE-D3  6.3  0.15  5.3 
LCgau-BOP+LRD  5.8  0.23  5.5 
M06  20.6  0.31  7.3 
vdW-DF2  14.5  0.42  7.5 
LC-BOP12+LRD  8.1  0.27  7.7 
RSH-lrRPAx-SO2  7.8  0.29  7.7 
M11  22.7  0.34  8.5 
RSH+lrMP2  11.3  0.57  10.2 
DCACP  26.7  0.61  16.6 

We have also included in Table IV the MP2 method at the basis set level used to calculate the benchmarks extrapolated to the CBS limit. The overall MUPE of MP2 is small, 3.0%, but the method gives very large errors for systems with π electrons, as seen in Table III. For example, the MUPE for the nitrobenzene dimer is as large as 18.4%.

For the purpose of systematizing the discussion, one can divide the methods into three groups using the MUPE ranges below 5%, 5% to 8%, and above 8%. The first group includes two methods that use virtual orbitals: SAPT(DFT) and MP2, and two DFT+D methods: B3LYP-D3 and dlDF+D. B3LYP-D3 calculations are fully converged in basis set size due to much larger basis set used than for other methods, see Table I, which might have improved the relative performance of this method. Despite using virtual orbitals, for systems of the size included in the present comparison, a SAPT(DFT) calculation is about as time consuming as a DFT supermolecular calculation of the interaction energy.102 Thus, for such systems, there is no cost difference between SAPT(DFT) and the two DFT+D approaches. The MP2 calculations presented in Table IV were, of course, orders of magnitude more expensive due to the use of very large basis sets. SAPT(DFT) also is overall most reliable since if the nitrobenzene dimer is excluded, the range of MUPE’s is between 0.7% and 6.5%. The SAPT(DFT) MUPE of 2.6% is actually very close to the uncertainties resulting from basis set effects: the CBS benchmarks have about 1% uncertainties and a CBS extrapolation of SAPT(DFT) interaction energies would likely result in changes of the order of 1%. Thus, one may say that SAPT(DFT) gives results of CCSD(T) quality at DFT costs for dimers with up to a few dozens of atoms.

The second group includes three DFT+D methods: LC-ωPBE-D3, LCgau-BOP+LRD, and LC-BOP12+LRD, one hybrid meta-GGA functional, M06, one functional with nonlocal correlation, vdW-DF2, and one method using virtual orbitals, RSH-lrRPAx-SO2. Thus, each type of approach has a representative in this group. One should point out that there are significant differences between these methods in terms of the largest MUPE (see Table III): the M06 and vdW-DF2 methods have the maximum MUPEs of 29.8% and 28.1%, respectively, whereas for the remaining approaches the largest MUPE is 11.3%.

The third group includes the M11, RSH+lrMP2, and DCACP methods, with maximum MUPEs (see Table III) amounting to 41%, 28%, and 68%, respectively.

The average unsigned errors (AUE), also shown in Table IV, allow for much less precise differentiation between various methods than MUPEs since the range is fairly narrow, from 0.15 to 0.61 kcal/mol. The main reason is that the asymptotic region has little effect on AUEs and they reflect mainly the minimum and the repulsive region results. For modelling of molecular crystals, the long-range behavior downplayed by the AUE criterion is critical since a major contribution to lattice energy comes from dimers with large intermonomer separations. Thus, AUE is not a good measure for applications in this field.

One more quantity listed in Table IV is the average unsigned percentage error for distances larger than 1.5 times the minimum distance. For such separations, one can meaningfully use the simple average. These errors mainly show the quality of the dispersion function. It so happens that the four top methods in terms of MUPE have also the lowest AUPE(R > 1.5 Rmin). The AUPE of LC-ωPBE-D3 is 1.7% larger than that of B3LYP-D3, despite the fact that both methods share asymptotically the same dispersion function. However, the short-range switching factors are different in the two methods and the dispersion interaction is not the only component influencing AUPE(R > 1.5 Rmin), as discussed earlier.

One should also discuss here the influence of basis set size on the comparisons. The benchmarks and the MP2 results are at the CBS level. Since DFT is known to converge much faster in basis set size than the calculations using expansions involving virtual orbitals, the standard DFT methods, the DFT methods with nonlocal functionals, and the DFT+D methods should all be fairly close to their CBS limits despite using a finite basis set and no extrapolations. In particular, the B3LYP-D3 calculations were performed in basis sets with the cardinal numbers up to 6, so these results are certainly fully converged in basis set size. In contrast, the SAPT(DFT), RSH-lrRPAx-SO2, and RSH+lrMP2 predictions would probably improve somewhat if calculations were performed in larger basis sets and extrapolated to the CBS limit.

After having established the relative performance of the investigated approaches, one should consider the other properties of the methods discussed in the Introduction. The methods using virtual orbitals scale as N5 for SAPT(DFT) and RSH-lrMP2, and as N6 for RSH+lrRPAx-SO2 in the present implementation. Despite such scaling, with proper programming SAPT(DFT) and MP2 are comparable in costs to hybrid DFT methods, scaling as N4, for systems of the size considered in the present test. However, for a much larger system the better scaling of the latter method will result in a cost advantage. Only two DFT methods considered here, vdW-DF2 and DCACP, are non-hybrid and in practice scale as N3. Since the exact exchange is particularly costly in calculations with periodic boundary conditions, these two methods are most economical in such cases. This effectiveness has to be considered, however, in the context of accuracy of these methods.

We have assessed the performance of a variety of DFT-based methodologies by comparisons to a set of CCSD(T)/CBS interaction energies obtained for 10 dimers at varying intermonomer distances. This set avoids the typical bias of mixing similar amounts of hydrogen-bonded and purely dispersion-bonded systems and should be more representative for work in the broad field of molecular crystals. The benchmark data reported here should be useful in the development of new methodologies and both the benchmarks and the performance of DFT-based methods examined in this work should enable a convenient assessment of new approaches relative to currently existing techniques. Since our test was blind and used different monomers than considered in the related literature, none of the surveyed methods was optimized on benchmarks similar to those included here (except for the water dimer).

See supplementary material for Cartesian coordinates of all configurations (file Geometries.txt), the (T), CBS, and DFT interaction energies (file Energies.xlsx), and the Hartree-Fock, MP2 and (T) energies (file CBS-Extrap-Components.xlsx) used to obtain the CBS reference values.

The ARL-UD workshop that led to the present paper was funded by ARO. We thank Dr. James Parker and Dr. Betsy Rice for their encouragement and constant support of this work. We thank Dr. Edoardo Apra and Dr. Karol Kowalski for fixing the errors in NWChem found by us. K.S. was partly supported by NSF Grant No. CHE-1152899. R.P. acknowledges support from the National Science Centre, Poland, Grant No. 2015/17/B/ST4/03727. The work at Rice University was supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Computational and Theoretical Chemistry Program under Award No. DE-FG02-09ER16053. G.E.S. is a Welch Foundation Chair (Grant No. C-0036). C.Z. and G.G. were supported by Department of Energy/Basic Energy Sciences Grant No. DE-FG02-06ER46262. F.G. acknowledges support from Department of Energy Basic Energy Sciences Grant No. DE-SC0008938.

1.
P.
Hohenberg
and
W.
Kohn
,
Phys. Rev.
136
,
B864
(
1964
).
2.
R.
Eisenschitz
and
F.
London
,
Z. Phys.
60
,
491
(
1930
).
3.
B.
Jeziorski
,
R.
Moszyński
, and
K.
Szalewicz
,
Chem. Rev.
94
,
1887
(
1994
).
4.
5.
S.
Kristyan
and
P.
Pulay
,
Chem. Phys. Lett.
229
,
175
(
1994
).
6.
P.
Hobza
,
J.
Sponer
, and
T.
Reschel
,
J. Comput. Chem.
16
,
1315
(
1995
).
7.
E.
Ruiz
,
D. R.
Salahub
, and
A.
Vela
,
J. Phys. Chem.
100
,
12265
(
1996
).
8.
Y.
Zhao
and
D. G.
Truhlar
,
J. Chem. Theory Comput.
1
,
415
(
2005
).
9.
I. C.
Gerber
and
J. G.
Ángyán
,
J. Chem. Phys.
126
,
044103
(
2007
).
10.
L. A.
Burns
,
A.
Vázquez-Mayagoitia
,
B. G.
Sumpter
, and
C. D.
Sherrill
,
J. Chem. Phys.
134
,
084107
(
2011
).
11.
K. S.
Thanthiriwatte
,
E. G.
Hohenstein
,
L. A.
Burns
, and
C. D.
Sherrill
,
J. Chem. Theory Comput.
7
,
88
(
2011
).
12.
R.
Podeszwa
and
K.
Szalewicz
,
J. Chem. Phys.
136
,
161102
(
2012
).
13.
“Dispersion interactions and density functional theory,” http://www.physics.udel.edu/~szalewic/AROwks12, Army Research Laboratory and University of Delaware Workshop, Newark, DE, 1-3 August 2012.
14.
M.
Valiev
,
E. J.
Bylaska
,
N.
Govind
,
K.
Kowalski
,
T. P.
Straatsma
,
H. J. J.
Van Dam
,
D.
Wang
,
J.
Nieplocha
,
E.
Apra
,
T. L.
Windus
, and
W. A.
deJong
,
Comput. Phys. Commun.
181
,
1477
(
2010
).
15.
R.
Podeszwa
,
K.
Patkowski
, and
K.
Szalewicz
,
Phys. Chem. Chem. Phys.
12
,
5974
(
2010
).
16.
E.
Goll
,
H.-J.
Werner
,
H.
Stoll
,
T.
Leininger
,
P.
Gori-Giorgi
, and
A.
Savin
,
Chem. Phys.
329
,
276
(
2006
).
17.
X.
Xu
and
W. A.
Goddard
,
Proc. Natl. Acad. Sci. U. S. A.
101
,
2673
(
2004
).
18.
Y.
Zhao
and
D. G.
Truhlar
,
Theor. Chem. Acc.
120
,
215
(
2008
).
19.
J. P.
Perdew
,
K.
Burke
, and
M.
Ernzerhof
,
Phys. Rev. Lett.
77
,
3865
(
1996
).
20.
T.
van Voorhis
and
G. E.
Scuseria
,
J. Chem. Phys.
109
,
400
(
1998
).
21.
A. D.
Becke
,
J. Chem. Phys.
104
,
1040
(
1996
).
22.
R.
Peverati
and
D. G.
Truhlar
,
J. Phys. Chem. Lett.
2
,
2810
(
2011
).
23.
H.
Stoll
and
A.
Savin
, in
Density Functional Methods in Physics
, edited by
R. M.
Dreizler
and
J. D.
Providencia
(
Plenum
,
New York
,
1985
), p.
177
.
24.
A.
Savin
, in
Recent Developments of Modern Density Functional Theory
, edited by
J. M.
Seminario
(
Elsevier
,
Amsterdam
,
1996
), pp.
327
357
.
25.
I. C.
Gerber
and
J. G.
Ángyán
,
Chem. Phys. Lett.
415
,
100
(
2005
).
26.
P. M. W.
Gill
,
R. D.
Adamson
, and
J. A.
Pople
,
Mol. Phys.
88
,
1005
(
1996
).
27.
H.
Iikura
,
T.
Tsuneda
,
T.
Yanai
, and
K.
Hirao
,
J. Chem. Phys.
115
,
3540
(
2001
).
28.
J.
Heyd
,
G. E.
Scuseria
, and
M.
Ernzerhof
,
J. Chem. Phys.
118
,
8207
(
2003
).
29.
B.
Hammer
,
L. B.
Hansen
, and
J. K.
Norskov
,
Phys. Rev. B
59
,
7413
(
1999
).
30.
O. A.
von Lilienfeld
,
I.
Tavernelli
,
U.
Rothlisberger
, and
D.
Sebastiani
,
Phys. Rev. Lett.
93
,
153004
(
2004
).
31.
I.-C.
Lin
,
M. D.
Coutinho-Neto
,
C.
Felsenheimer
,
O. A.
von Lilienfeld
,
I.
Tavernelli
, and
U.
Rothlisberger
,
Phys. Rev. B
75
,
205131
(
2007
).
32.
J.
Hepburn
,
G.
Scoles
, and
R.
Penco
,
Chem. Phys. Lett.
36
,
451
(
1975
).
33.
R.
Ahlrichs
,
R.
Penco
, and
G.
Scoles
,
Chem. Phys.
19
,
119
(
1977
).
34.
R.
Podeszwa
,
K.
Pernal
,
K.
Patkowski
, and
K.
Szalewicz
,
J. Phys. Chem. Lett.
1
,
550
(
2010
).
35.
F. A.
Gianturco
,
F.
Paesani
,
M. F.
Laranjeira
,
V.
Vassilenko
,
M. A.
Cunha
,
A. G.
Shashkov
, and
A. F.
Zolotoukhina
,
Mol. Phys.
94
,
605
(
1998
).
36.
X.
Wu
,
M. C.
Vargas
,
S.
Nayak
,
V. L.
Lotrich
, and
G.
Scoles
,
J. Chem. Phys.
115
,
8748
(
2001
).
37.
Q.
Wu
and
W.
Yang
,
J. Chem. Phys.
116
,
515
(
2002
).
38.
S.
Grimme
,
J. Comput. Chem.
25
,
1463
(
2004
).
39.
K.
Pernal
,
R.
Podeszwa
,
K.
Patkowski
, and
K.
Szalewicz
,
Phys. Rev. Lett.
103
,
263201
(
2009
).
40.
S.
Grimme
,
J.
Antony
,
S.
Elrich
, and
H.
Krieg
,
J. Chem. Phys.
132
,
154104
(
2010
).
41.
L.
Goerigk
and
S.
Grimme
,
Phys. Chem. Chem. Phys.
13
,
6670
(
2011
).
42.
J.-W.
Song
and
K.
Hirao
,
Chem. Phys. Lett.
563
,
15
(
2013
).
43.
R.
Kar
,
J.-W.
Song
,
T.
Sato
, and
K.
Hirao
,
J. Comput. Chem.
34
,
2353
(
2013
).
44.
J.-W.
Song
,
S.
Tokura
,
T.
Sato
,
M. A.
Watson
, and
K.
Hirao
,
J. Chem. Phys.
127
,
154109
(
2007
);
[PubMed]
Errata,
J.-W.
Song
,
S.
Tokura
,
T.
Sato
,
M. A.
Watson
, and
K.
Hirao
,
J. Chem. Phys.
131
,
059901
(
2009
).
45.
A. J.
Misquitta
,
B.
Jeziorski
, and
K.
Szalewicz
,
Phys. Rev. Lett.
91
,
033201
(
2003
).
46.
A.
Hesselmann
and
G.
Jansen
,
Chem. Phys. Lett.
367
,
778
(
2003
).
47.
A. J.
Misquitta
,
R.
Podeszwa
,
B.
Jeziorski
, and
K.
Szalewicz
,
J. Chem. Phys.
123
,
214103
(
2005
).
48.
A.
Hesselmann
,
G.
Jansen
, and
M.
Schütz
,
J. Chem. Phys.
122
,
014103
(
2005
).
49.
Y.
Zhao
,
N. E.
Schultz
, and
D. G.
Truhlar
,
J. Chem. Theory Comput.
2
,
364
(
2006
).
50.
K. T.
Tang
and
J. P.
Toennies
,
J. Chem. Phys.
80
,
3726
(
1984
).
51.
L.
Rajchel
,
P. S.
Zuchowski
,
M. M.
Szczesniak
, and
G.
Chalasinski
,
Phys. Rev. Lett.
104
,
163001
(
2010
).
52.
A.
Austin
,
G. A.
Petersson
,
M. J.
Frisch
,
F. J.
Dobek
,
G.
Scalmani
, and
K.
Throssell
,
J. Chem. Theory Comput.
8
,
4989
(
2012
).
53.
A. D.
Becke
,
J. Chem. Phys.
98
,
5648
(
1993
).
54.
P. J.
Stephens
,
F. J.
Devlin
,
C. F.
Chabalowski
, and
M. J.
Frisch
,
J. Chem. Phys.
98
,
11623
(
1994
).
55.
A. D.
Becke
,
Phys. Rev. A
38
,
3098
(
1988
).
56.
C.
Lee
,
W.
Yang
, and
R. G.
Parr
,
Phys. Rev. B
37
,
785
(
1988
).
57.
O. A.
Vydrov
and
G. E.
Scuseria
,
J. Chem. Phys.
125
,
234109
(
2006
).
58.
J.
Heyd
and
G. E.
Scuseria
,
J. Chem. Phys.
120
,
7274
(
2004
).
59.
W.
Cencek
and
K.
Szalewicz
,
J. Chem. Phys.
139
,
024104
(
2013
);
[PubMed]
Erratum,
W.
Cencek
and
K.
Szalewicz
,
J. Chem. Phys.
140
,
149902
(
2014
).
60.
M. J.
Frisch
,
G. W.
Trucks
,
H. B.
Schlegel
,
G. E.
Scuseria
,
M. A.
Robb
,
J. R.
Cheeseman
,
G.
Scalmani
,
V.
Barone
,
B.
Mennucci
,
G. A.
Petersson
,
H.
Nakatsuji
,
M.
Caricato
,
X.
Li
,
H. P.
Hratchian
,
A. F.
Izmaylov
,
J.
Bloino
,
G.
Zheng
,
J. L.
Sonnenberg
,
M.
Hada
,
M.
Ehara
,
K.
Toyota
,
R.
Fukuda
,
J.
Hasegawa
,
M.
Ishida
,
T.
Nakajima
,
Y.
Honda
,
O.
Kitao
,
H.
Nakai
,
T.
Vreven
,
J. A.
Montgomery
, Jr.
,
J. E.
Peralta
,
F.
Ogliaro
,
M.
Bearpark
,
J. J.
Heyd
,
E.
Brothers
,
K. N.
Kudin
,
V. N.
Staroverov
,
R.
Kobayashi
,
J.
Normand
,
K.
Raghavachari
,
A.
Rendell
,
J. C.
Burant
,
S. S.
Iyengar
,
J.
Tomasi
,
M.
Cossi
,
N.
Rega
,
J. M.
Millam
,
M.
Klene
,
J. E.
Knox
,
J. B.
Cross
,
V.
Bakken
,
C.
Adamo
,
J.
Jaramillo
,
R.
Gomperts
,
R. E.
Stratmann
,
O.
Yazyev
,
A. J.
Austin
,
R.
Cammi
,
C.
Pomelli
,
J. W.
Ochterski
,
R. L.
Martin
,
K.
Morokuma
,
V. G.
Zakrzewski
,
G. A.
Voth
,
P.
Salvador
,
J. J.
Dannenberg
,
S.
Dapprich
,
A. D.
Daniels
,
O.
Farkas
,
J. B.
Foresman
,
J. V.
Ortiz
,
J.
Cioslowski
, and
D. J.
Fox
, gaussian 09, Revision A.02, Gaussian Inc., Wallingford, CT 06492, 2009.
61.
T.
Tsuneda
,
T.
Suzumura
, and
K.
Hirao
,
J. Chem. Phys.
110
,
10664
(
1999
).
62.
T.
Sato
and
H.
Nakai
,
J. Chem. Phys.
131
,
224104
(
2009
).
63.
H. C.
Longuet-Higgins
,
Discuss. Faraday Soc.
40
,
7
(
1965
).
64.
J. F.
Dobson
and
B. P.
Dinte
,
Phys. Rev. Lett.
76
,
1780
(
1996
).
65.
F.
Rob
and
K.
Szalewicz
,
Chem. Phys. Lett.
572
,
146
(
2013
).
66.
F.
Rob
and
K.
Szalewicz
,
Mol. Phys.
111
,
1430
(
2013
).
67.
J.
Řezáč
,
K. E.
Riley
, and
P.
Hobza
,
J. Chem. Theory Comput.
7
,
2427
(
2011
).
68.
K.
Lee
,
E. D.
Murray
,
L.
Kong
,
B. I.
Lundqvist
, and
D. C.
Langreth
,
Phys. Rev. B
82
,
081101
(
2010
).
69.
M.
Dion
,
H.
Rydberg
,
E.
Schröder
,
D. C.
Langreth
, and
B. I.
Lundqvist
,
Phys. Rev. Lett.
92
,
246401
(
2004
).
70.
Y.
Zhang
and
W.
Yang
,
Phys. Rev. Lett.
80
,
890
(
1998
).
71.
E. D.
Murray
,
K.
Lee
, and
D. C.
Langreth
,
J. Chem. Theory Comput.
5
,
2754
(
2009
).
72.
J. P.
Perdew
and
Y.
Wang
,
Phys. Rev. B
33
,
8800
(
1986
).
73.
J. P.
Perdew
,
Phys. Rev. B
33
,
8822
(
1986
).
74.
K.
Szalewicz
,
Wiley Interdiscip. Rev.: Comput. Mol. Sci.
2
,
254
(
2012
).
75.
H. L.
Williams
and
C. F.
Chabalowski
,
J. Phys. Chem. A
105
,
646
(
2001
).
76.
A. J.
Misquitta
and
K.
Szalewicz
,
Chem. Phys. Lett.
357
,
301
(
2002
).
77.
A.
Hesselmann
and
G.
Jansen
,
Chem. Phys. Lett.
357
,
464
(
2002
).
78.
R.
Bukowski
,
R.
Podeszwa
, and
K.
Szalewicz
,
Chem. Phys. Lett.
414
,
111
(
2005
).
79.
R.
Podeszwa
,
W.
Cencek
, and
K.
Szalewicz
,
J. Chem. Theory Comput.
8
,
1963
(
2012
).
80.
D. J.
Tozer
and
N. C.
Handy
,
J. Chem. Phys.
109
,
10180
(
1998
).
81.
E.
Fermi
and
G.
Amaldi
,
Mem. R. Accad. Ital.
6
(
1
),
119
(
1934
).
82.
R.
Podeszwa
,
R.
Bukowski
, and
K.
Szalewicz
,
J. Chem. Theory Comput.
2
,
400
(
2006
).
83.
F.
Weigend
,
Phys. Chem. Chem. Phys.
4
,
4285
(
2002
).
84.
F.
Weigend
,
A.
Köhn
, and
C.
Hättig
,
J. Chem. Phys.
116
,
3175
(
2002
).
85.
J. G.
Ángyán
,
I. C.
Gerber
,
J.
Toulouse
, and
A.
Savin
,
Phys. Rev. A
72
,
012510
(
2005
).
86.
J.
Toulouse
,
A.
Savin
, and
H.-J.
Flad
,
Int. J. Quantum Chem.
100
,
1047
(
2004
).
87.
J.
Toulouse
,
F.
Colonna
, and
A.
Savin
,
J. Chem. Phys.
122
,
014110
(
2005
).
88.
R.
Moszyński
,
B.
Jeziorski
, and
K.
Szalewicz
,
Int. J. Quantum Chem.
45
,
409
(
1993
).
89.
J.
Toulouse
,
W.
Zhu
,
A.
Savin
,
G.
Jansen
, and
J. G.
Ángyán
,
J. Chem. Phys.
135
,
084119
(
2011
).
90.
A.
Szabo
and
N. S.
Ostlund
,
J. Chem. Phys.
67
,
4351
(
1977
).
91.
T. H.
Dunning
, Jr.
,
J. Chem. Phys.
90
,
1007
(
1989
).
92.
R. A.
Kendall
,
T. H.
Dunning
, Jr.
, and
R. J.
Harrison
,
J. Chem. Phys.
96
,
6796
(
1992
).
93.
H. L.
Williams
,
E. M.
Mas
,
K.
Szalewicz
, and
B.
Jeziorski
,
J. Chem. Phys.
103
,
7374
(
1995
).
94.
S. F.
Boys
and
F.
Bernardi
,
Mol. Phys.
19
,
553
(
1970
).
95.
A.
Halkier
,
T.
Helgaker
,
P.
Jorgensen
,
W.
Klopper
, and
J.
Olsen
,
Chem. Phys. Lett.
302
,
437
(
1999
).
96.
R.
Podeszwa
,
R.
Bukowski
, and
K.
Szalewicz
,
J. Phys. Chem. A
110
,
10345
(
2006
).
97.
M.
Jeziorska
,
R.
Bukowski
,
W.
Cencek
,
M.
Jaszuński
,
B.
Jeziorski
, and
K.
Szalewicz
,
Collect. Czech. Chem. Commun.
68
,
463
(
2003
).
98.
M. W.
Schmidt
,
K. K.
Baldridge
,
J. A.
Boatz
,
S. T.
Elbert
,
M. S.
Gordon
,
J. H.
Jensen
,
S.
Koseki
,
N.
Matsunaga
,
K. A.
Nguyen
,
S.
Su
,
T. L.
Windus
,
M.
Dupuis
, and
J. A.
Montgomery
,
J. Comput. Chem.
14
,
1347
(
1993
).
99.
I.
Tavernelli
,
I. C.
Lin
, and
U.
Rothlisberger
,
Phys. Rev. B
79
,
045106
(
2009
).
100.
R.
Bukowski
,
K.
Szalewicz
, and
C.
Chabalowski
,
J. Phys. Chem. A
103
,
7322
(
1999
).
101.
J.
Hoja
,
A. F.
Sax
, and
K.
Szalewicz
,
Chem. - Eur. J.
20
,
2292
(
2014
).
102.
K.
Szalewicz
,
Acc. Chem. Res.
47
,
3266
(
2014
).

Supplementary Material