We have used diffusion Monte Carlo (DMC) to perform calculations on the L7 benchmark set. DMC is a stochastic numerical integration scheme in real-space and part of a larger set of quantum Monte Carlo methods. The L7 set was designed to test the ability of electronic structure methods to include dispersive interactions. While the agreement between DMC and quantum-chemical state-of-the-art methods is excellent for some of the structures, there are significant differences in others. In contrast to wavefunction-based quantum chemical methods, DMC is a first-principle many-body method with the many-body wavefunction evolving in real space. It includes explicitly all electron–electron interactions and is relatively insensitive to the size of the basis set.
Benchmark systems have been indispensable tools for validating and advancing computational electronic structure methods, both for solids as well as for molecular compounds. In particular, the quantum chemistry community has developed a wide range of benchmark sets that probe different aspects, such as closed shells, covalent bonding, dispersive interactions, or excitation energies.1–5 Usually, great care goes into producing “gold standard” calculations that represent the state-of-the-art, in addition to accurate experimental data when such exist. Typically, quantum-chemical methods such as coupled cluster single-double with perturbative triple [CCSD(T)]6,7 are used in such calculations, with great efforts directed at ensuring that the Complete Basis Set (CBS) limit has been reached or has been extrapolated to. However, quantum-chemical methods such as CCSD(T) and quadratic configuration interaction single-double perturbative triple [QCISD(T)]8 scale poorly as , with N being the number of orbitals, and perturbative triplets may not always converge.9,10 The poor scaling limits the applicability of such methods to a wider range of compounds or solids. Recent developments in methodologies as well as their implementations have enabled linear scaling11 for large, and even periodic, systems. However, the linear scaling requires that distant pairs of localized orbitals be discarded, which adds a new set of approximations that reduce the accuracy of the method.
Nevertheless, “gold standard” calculations are extremely useful for the development of much less expensive methods, in particular density functional theory (DFT) methods,12,13 as the accuracy of DFT methods as well as the origin of errors can be assessed, and better approximate exchange-correlation functional can be developed and tested. One particular area that traditionally has been problematic for DFT methods is systems with dispersive non-local interactions, such as van der Waals interactions in molecular and layered systems.14–17 The L7 benchmark set was developed precisely with the intent of testing the ability to include dispersive interactions in electronic structure methods,18 and several proposed corrected dispersive interactions have been tested on the L7 benchmark set19,20 with the QCISD(T)18 results and CCSD(T) results21–26 for five of the L7 structures as the best available references. Apart from the original L7 work by Sedlak et al.,18 the works by Zen et al.,27 as well as the works in Refs. 22–26, highlight the challenges of converging wavefunctions for large supramolecular complexes, while works by Grimme and co-workers25,28–31 highlight the need for reference interactions on supramolecular and other complex compounds as well as the large community that tries to address this need.
As mentioned above, there are cases for which it is known that CCSD(T) and QCISD(T) give rise to errors, in particular multi-reference systems32 but also spin-crossover systems.33,34 Furthermore, these methods are not variational when perturbative estimations of triplet contributions are added, so it is not always easy to devise systematic extensions that unambiguously lead to improvements. It is therefore of interest to compare the results obtained with CCSD(T) or QCISD(T) with the results obtained with methods that involve fewer, or at least different, approximations but also rigorously satisfy a variational principle.35,36 One such method is the Diffusion Monte Carlo (DMC) method.37,38 While computationally costly, recent advances in algorithms and hardware have led to the application of DMC to a range of molecular compounds and solids.36,39,40,41 DMC is a real-space method that shows only a weak dependence on the basis set compared to wavefunction-based quantum chemical methods, and this dependence enters primarily through the quality of the nodal surface inherited from the initial trial wavefunction35,42,43 (see below). The real-space nature of DMC also gives rise to another, more subtle advantage compared to coupled-cluster methods. DMC naturally includes virtual excitations in the core, or core correlations. This is an orbital effect to which DMC is insensitive because it works in a complete basis. However, the extent to which these core correlations are captured in coupled-cluster methods depends on the basis set used. While, in principle, exact for ground state calculations, in practice, DMC requires a number of controlled approximations. However, the effect of these approximations can be removed. For molecules, the finite imaginary-time step that is used in evolving the wavefunction can be extrapolated to zero, and the effect of pseudopotentials can be entirely removed by doing all-electron calculations when this is computationally feasible. The only uncontrolled approximation, which nevertheless is bounded by the variational principle, is the fixed-node (FN) approximation44 in which the nodal surface of the many-body wavefunction is inherited from a trial wavefunction, typically a single Slater determinant. Despite the unknown bias introduced by the FN approximation, DMC has shown to give excellent results for van der Waals-bonded systems as these interactions are included without any other approximations.45–50 Zen et al.27 demonstrated the ability of DMC to go beyond CCSD(T) and reproduce accurately the lattice energy of molecular crystals within 4 kJ/mol from experimental results. Many of those authors were also involved in a study by Al-Hamdani et al.51 that demonstrated the higher accuracy of FN-DMC when compared to methods such as the random phase approximation (RPA) and second-order Møller–Plesset (MP2) theory in describing the van der Waals-dominated mechanism of adsorption of H2O on hexagonal boron-nitride substrates. We have here applied state-of-the-art DMC to the L7 benchmark set. Among others, Dubecký52 and Zen et al.27 showed that the quality of the DMC wavefunction is almost independent of the trial wavefunction for closed-shell and single-reference systems. Because the L7 molecular compounds are all closed-shell, the FN approximation should not give rise to any appreciable errors.33,42,48,53,54 Furthermore, because the elements in the L7 compounds are all light, we could do all-electron DMC calculations, thereby avoiding potential errors that arise from the use of pseudopotentials. Our results show interesting differences between state-of-the-art QCISD(T) and CCSD(T) results18,21 and the DMC results for several of the molecular structures in the L7 set. We believe that these differences stem from the approximate nature in which the complete basis set limit is reached and how higher-order correlations are included in QCISD(T) and CCSD(T); there is also evidence that core corrections are important52,55—these are avoided in our all-electron calculations.
Quantum Monte Carlo (QMC) approaches are a class of methods solving the many-body Schrödinger equation stochastically. They are general approaches and applicable to a wide range of physical and chemical systems in any dimension or boundary condition. With a favorable scaling with the number of electrons N, QMC methods are well suited for electronic structure problems with a high accuracy on large-scale parallel computers. The efficiency of QMC methods originates from the stochastic numerical sampling where samples can be evaluated independently, making the method embarrassingly parallel and highly efficient for high performance computing (HPC). By including explicitly many-body electronic interactions, methods within QMC can be rigorously exact but, in practice, require some approximations to remain computationally feasible. Some of these approximations are controlled and can be rigorously extrapolated out.
The Variational Monte Carlo (VMC) method56,57 allows for estimating the energy and properties of a given trial wavefunction without requiring to compute the matrix elements, posing no restriction on its functional form. Using the VMC algorithm, through stochastic numerical integration schemes, the expectation value of the energy for any form of the trial wavefunction can be estimated by averaging the local energy over an ensemble of configurations with a distribution given by the probability distribution set by the wavefunction itself, |ψ|2, sampled during a random walk in the configuration space using Metropolis58 or Langevin algorithms.37 The fluctuations of the local energy depend on the quality of the trial wavefunction, and they are zero if the exact wavefunction is used (the so-called zero-variance principle).
Using the same stochastic numerical integration scheme, DMC solves the Schrödinger equation in imaginary time τ = it using a projector or a Green’s function based method. Any initial state |ψ⟩ that is not orthogonal to the ground state |ϕ0⟩ will evolve to the ground state in the long time limit, and any excited-state component will decay exponentially fast, leading to the true ground state of the function,
By introducing a constant offset to the energy, ET = ϵ0, the long-time limit of Eq. (1) can be kept finite. If the Hamiltonian is separated into the kinetic energy and potential terms, the imaginary time Schrödinger equation takes on a form similar to a diffusion equation,
that can be solved using Monte Carlo sampling methods. While the term in the brackets in Eq. (2) represents a density of diffusing particles, the second term can be viewed as a branching term, potentially-dependent and capturing the increase or decrease in the particle density. The potential is unbounded in Coulombic systems, and hence, the rate term, , can diverge. Large fluctuations in the particle density lead to impractically large statistical errors. Moreover, this formulation does not take into account the fermionic nature of electrons (the sign of the many-body fermionic wavefunction must change when two electrons are exchanged). A fermionic wavefunction will have nodes and consequently regions of positive and negative sign, essential for maintaining the anti-symmetry. If left unconstrained, DMC will lead to a symmetric bosonic solution of the Schrödinger equation. This is known as the “nodal problem.” The FN approximation44 consists of imposing the nodes of a reference wavefunction. This approximation is variational: the fixed-node DMC energy is an upper bound to the exact ground state energy59 and its accuracy is wholly dependent upon the nodes of the reference (trial) wavefunction.
By introducing a guiding/trial function that closely approximates the ground state,
Eq. (2) becomes
In this case, ET is a “trial energy” introduced to maintain normalization of the projected solution at large τ, and is the local energy at position R. The last term on the right-hand side of Eq. (4) is a branching term that will kill any walker that crosses a node (where the wavefunction changes sign), and any walker (statistically independent samplings of the many-body wavefunction) that lowers the energy will be cloned as the configuration is deemed closer to the ground state. This is also known as the birth and death process.
Despite the FN approximation, DMC has been shown successfully to reach accuracy below the “chemical accuracy” of 1 kcal/mol for chemical systems33,42,48,53,54 and a few tens of meV/unit-cells for solids with periodic boundary conditions.38,50,60,61 Recently, multiple calculations using a selected Configuration Interaction (sCI) trial wavefunction have demonstrated how to systematically reduce the error from the fixed nodes.62–64 Nevertheless, these errors have proven to be significant only for open shell or multi-reference molecules.33,34
III. COMPUTATIONAL DETAILS
Binding energies of each molecule were computed by evaluating the total energy of the molecule minus the total energies of the molecular fragments constituting the molecule using the highly efficient QMCPACK code.36,39 The geometries of the fragments were not re-optimized, following the procedure of Refs. 18 and 21. For all DMC total energy calculations, we used a trial wavefunction with a Slater Jastrow form65
where is a Slater determinant expressed in terms of single particle orbitals . Our trial many-body wave functions are constructed with the product of the Jastrow functions and a single Slater determinant from B3LYP66–69 Kohn–Sham (KS) orbitals and the def2-QZVP basis set using the molecular structures in the supplementary material of Ref. 18 obtained from the PYSCF70 package. Using a variant of the linear method of Umrigar et al.,71 up to 40 variational parameters including one-body, two-body, and three-body Jastrow factors are optimized within VMC, after which we can obtain the ground-state energies with DMC under the FN approximation.38 Jastrow parameters were optimized independently for each molecular structure and its independent molecular fragment. We used time steps of 0.008 hartree−1, 0.005 hartree−1, and 0.002 hartree−1 and found that the extrapolated energies were equal to that of 0.005 hartree−1. We therefore were satisfied that a time step of 0.005 was sufficient to ensure convergence with respect to the imaginary-time step. Each molecule used 32 768 walkers and enough blocks to ensure convergence below 1 kcal/mol. With our all-electron calculations, the largest system (circumcoronene ⋯ guanine-cytosine dimer) contained 101 atoms, 478 electrons, and 5001 orbitals and required about 65 000 KNL core-hours (4 h on 256 KNL nodes) to reach an error bar of 0.6 kcal/mol; these calculations would not have been possible only a few years ago. DMC evolves in real space and therefore shows small dependence on the size of the basis set, as shown in Refs. 43 and 52. For one molecule (coronene dimer), we also used a smaller basis set (def2-TZVP) and obtained for both basis sets statistically indistinguishable total energies, 5.7 kcal/mol ± 5.7 kcal/mol. Therefore, we have confidence that even with a smaller basis set, the results are converged to a complete basis set. In any case, we used the def2-QZVP basis set to minimize any fixed-node errors arising from a smaller basis set on the geometries given be Sedlak et al.18
Table I shows in tabular form the binding energies in kcal/mol of the L7 molecular structures obtained QCISD and QCISD(T),18 CCSD(T),21 as well as our own DMC results, and Fig. 1 depicts the difference in binding energies relative to CCSD(T) and QCISD(T) from Refs. 18 and 21 (positive value means the DMC binding energy is smaller).
|Method .||C3A .||C3GC .||C2C2PD .||GCGC .||GGG .||CBH .||PHE .|
|Method .||C3A .||C3GC .||C2C2PD .||GCGC .||GGG .||CBH .||PHE .|
For four of the compounds, cirumcoronene ⋯ adenine trimer, (C3A), guanine trimer (GGG), the octadecane trimer (CBH), and the phenylalanine residues trimer (PHE), the differences between DMC and CCSD(T) or QCISD(T) binding energies are smaller than 2 kcal/mol and are close to the DMC error bars. For three compounds, cirumcoronene ⋯ guanine-cytosine dimer (C3GC), coronene dimer (C2C2PD), and for guanine-cytosine base pair dimer (GCGC), the DMC binding energies differ from those of CCSD(T) and QCISD(T) by a significant amount, up to 7 kcal/mol. There is also quite a spread in the differences in binding energies for these compounds, with the CCSD(T) results from Christensen, Elstner, and Cui24 consistently yielding a smaller binding energy in magnitude than those of Calbo et al.23 and Brandenburg et al.25 (both of which we have published data for all L7 compounds), and closer to our computed DMC binding energies. We note that Christensen, Elstner, and Cui used the aug-cc-pVDZ basis set. The augmented basis sets have added diffuse functions to the correlation-consistent basis sets that are typically used to describe anions and long–range interactions, such as van der Waals interactions. The C3A and C3GC structures were designed to represent strong aromatic dispersive interactions with implicit account for three-body interactions and H-bonding-stacking additivity, respectively, while C2C2PD and GCGC were constructed to represent strong aromatic dispersions (see Fig. 2 for depictions of PHE, C3A, C3GC, and C2C2PD). In contrast, GGG and CHB were constructed to represent aromatic stacking of π ⋯ π interactions (GGG) and aliphatic dispersive-dominated interactions (CBH). This suggests that DMC, CCSD(T), and QCISD(T) can handle pure aromatic stacking and aliphatic dispersive interactions equally well, and the origin of the differences in binding energies between DMC and CCSD(T)/QCISD(T) for C3GC, C2C2PD, and GCGC (and perhaps C3A) comes from aromatic dispersive interactions, perhaps further complicated by the inclusion of other many-body interactions.
The fact that the CCSD(T) results are significantly closer to those of DMC for C3GC, C2C2PD, and GCGC when the augmented basis set is used (note that Ma and Werner22 also used an augmented basis set but no CBS extrapolation) clearly suggests a sensitivity of the CCSD(T) results to the basis set. In contrast, DMC is relatively insensitive to the basis set, as described earlier, and the FN approximation with a single Slater determinant trial wavefunction does not lead to any appreciable errors in closed-shell systems. Within these negligible constraints, the DMC results are based on the complete Hamiltonian with all many-body interactions without any approximations. This suggests that the differences in bonding energies originate in the basis-set dependent approximate treatment of many-body interactions inherent in QCISD(T) and perhaps also in the absence of core corrections, while DMC is free from such limitations.
PHE is an amyloid fragment consisting of a trimer of phenylalanine in a mixed H-bonded-stacked conformation. This molecular structure was designed to represent mixed-character interactions of aromatic dispersion and hydrogen bonding as well as implicit many-body interactions. What is interesting is that this molecule binds already at the B3LYP level in our calculations. We examined this molecule a little bit more closely with this in mind. We calculated the binding energy at various levels of DFT theory, using Perdew–Burke–Ernzerhof (PBE), PBE0, and B3LYP with and without dispersive corrections using the D372 or Hirshfeld corrections.73 We note that already DFT with PBE, PBE0, or B3LYP would bind the PHE molecule with a binding energy of about −15 kcal/mol (negative indicating that the energy of the molecule is less than the sum of the energies of the fragments). This clearly indicates that interactions included already at the level of generalized gradient approximation (GGA) or hybrid exchange in DFT play a large role for this molecule. This is not the case for the other molecules in the test set where hybrid functionals are unable to bind the molecules without dispersive corrections. What catches the eye of the PHE depiction in Fig. 2(a) are the indications of strong interactions between the hydrogen and oxygen on adjacent fragments, indicated by dashed lines in Fig. 2(a), which may be the reason why the binding energy is relatively large already at the GGA- or hybrid-DFT level. The DMC binding energy difference arising from using a B3LYP or PBE trial wavefunction is only −1.3 ± 0.9 kcal/mol (negative indicating that the B3LYP nodal surface gives a lower binding energy; the B3LYP nodal surface reduced the total energy for each fragments as well as for the molecular complex, so by the variational principle, the B3LYP nodal surface is better both for the fragments and for the complex). In order to rule out the role of core electron interactions in the binding, we calculated a binding energy of PHE as the difference between the PHE molecule and the sum of the energies of the three isolated fragments, using all-electron calculations with the def2-QZVP basis set and also using plane waves and core-correlated pseudopotentials.74–76 (QUANTUM ESPRESSO77 was used to generate the plane wave trial wavefunction.) The DMC binding energies differed by 0.4 kcal/mol ± 0.7 kcal/mol, confirming the weak dependence of DMC on the size of the basis set and also demonstrating that excluding deep core electrons does not affect the binding of PHE.
We have calculated binding energies of the molecular structures in the L7 benchmark set18 using Diffusion Monte Carlo methods (DMC) and compared the results obtained using QCISD(T)18 and CCSD(T).21–25 We find that for molecular structures with pure aromatic or aliphatic dispersion interactions, the DMC and CCSD(T)/QCISD(T) results are identical. For molecular structures with mixed interactions, aromatic dispersion interactions, also including H-bonding and/or many-body interactions (C3GC, C2C2PD, and GCGC), there are differences in the binding energies of molecule up to 7 kcal/mol between the DMC and CCSD(T)/QCISD(T) and less than 3 kcal/mol between DMC and CCSD(T)/QCISD(T) for GGG, CBH, and PHE, which represent aromatic stacking interactions and aliphatic dispersive-dominated interactions. In addition, the agreement between DMC and CCSD(T)/QCISD(T) is very good for PHE; this molecular compound differs from the others in that DFT at the PBE, B3LYP, or PBE0 level binds, and adding dispersive corrections to these adds a smaller enhancement to the binding energy. The agreement between DMC and CCSD(T)/QCISD(T) for PHE indicates that dispersive interactions are not important to obtain a good nodal surface for DMC. We believe that the origin of the differences for C3GC, C2C2PD, and GCGC (and perhaps C3A) comes from the sensitivity of these methods to the basis set, in particular when dealing with aromatic dispersive interactions. In contrast, DMC includes all interactions to all orders and works in a complete basis set. The only constraints on the DMC methods used here are the fixed-node approximation and single-determinant trial wavefunction, both of which have been shown not to introduce appreciable errors in closed-shell systems.
We gratefully acknowledge comments from T. Janowski and P. Hobza on our work. This work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, as part of the Computational Materials Sciences Program and Center for Predictive Simulation of Functional Materials. We gratefully acknowledge the computing resources provided on Bebop, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory used to generate the trial wavefunctions. An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract No. DE-AC02-06CH11357, and resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract No. DE-AC05-00OR22725.