The accuracy of multipole approximations for distant pair energies in local second-order Møller-Plesset perturbation theory (LMP2) as introduced by Hetzer et al. [Chem. Phys. Lett. 290, 143 (1998)] is investigated for three chemical reactions involving molecules with up to 92 atoms. Various iterative and non-iterative approaches are compared, using different energy thresholds for distant pair selection. It is demonstrated that the simple non-iterative dipole-dipole approximation, which has been used in several recent pair natural orbitals (PNO)-LMP2 and PNO-LCCSD (local coupled-cluster with singles and doubles) methods, may underestimate the distant pair energies by up to 50% and can lead to significant errors in relative energies, unless very tight thresholds are used. The accuracy can be much improved by including higher multipole orders and by optimizing the distant pair amplitudes iteratively along with all other amplitudes. A new approach is presented in which very small special PNO domains for distant pairs are used in the iterative approach. This reduces the number of distant pair amplitudes by 3 orders of magnitude and keeps the additional computational effort for the iterative optimization of distant pair amplitudes minimal.
The efficiency of local electron correlation methods depends crucially on a simplified treatment of the correlation of electrons in spatially distant localized molecular orbitals (LMOs) and . It is well known that the leading term of the distant pair correlation energies Eij decays with , where Rij is the distance between the charge centers of the two LMOs. Typically, for the distant pair energies become individually very small (). However, since the number of distant pairs increases quadratically with the number of correlated LMOs, the total energy contribution of distant pairs in large molecules can be substantial. In this communication, we describe various approximations to compute distant pair energies and investigate how these affect correlation and reaction energies of some larger systems.
In closed-shell local second-order Møller-Plesset perturbation theory (LMP2), the pair correlation energies for an orbital pair are given by
where the indices a, b refer to suitable local virtual orbitals [e.g., projected atomic orbitals1 (PAOs), orbital specific virtuals2 (OSVs), or pair natural orbitals3,4 (PNOs)]. are the amplitudes, and the two-electron repulsion integrals. The summation is limited to a subspace (pair domain) [ij] of virtual orbitals a, b that are spatially close to the LMOs i or j. We will assume that the orbitals in each of these domains are “pseudo-canonical,” i.e., that they diagonalize the Fock matrix within the domain, for . These orbitals are pair-specific, and within a domain they are orthonormal. However, the orbitals of different pair domains [ij] and [kl] are mutually non-orthogonal, with overlap matrix . In the following, we will omit the superscripts of the orbital labels, since it should always be obvious to which pairs they belong.
The amplitudes are determined by solving first-order LMP2 equations, for , with5
Approximate amplitudes can be obtained by neglecting the coupling terms
This is denoted as “semi-canonical” (SC) amplitude approximation, since the Fock matrix is not diagonal in the occupied space.
If i, j, a, b are local, the integrals (ai|bj) decay exponentially with the distances Rai and Rbj. Thus, for distant pairs, one can limit the virtual labels to and , where [i] and [j] denote domains of virtual orbitals that are spatially close to the LMOs i and j, respectively. This means that distant pair energies are approximated as
Charge transfer contributions or , as well as exchange contributions and , are neglected. The impact of this “NOEX” approximation will be demonstrated later on. The amplitudes can either be approximated by Eq. (3) (“semi-canonical distant pair approximation”) or determined iteratively by solving the amplitude equations for and (“iterative distant pair approximation’’).
The integrals describe the Coulomb interaction of effective charge distributions and ,
with . Due to the orthonormality of the virtual and occupied orbitals, these charge distributions carry no charge, i.e., . Thus, the lowest-order contribution at long-range is the dipole-dipole interaction, which decays with . Consequently, the pair energies Eij decay with .
For non-overlapping charge distributions, these integrals can be approximated by multipole approximations, as first proposed in the LMP2 context by Hetzer et al.6 It is assumed that the charge centers of the charge distributions of and coincide with those of the LMOs, and , respectively. Let R = Rj Ri, R = |R|, and . The integrals are then approximated as
In this expansion all terms that decay with up to Rp+2 are included, and p is the highest order of the multipole moments involved. is the “interaction matrix,” which only depends on . The index runs over the (m + 1) (m + 2)/2 unique Cartesian multipole moment components of order m, and are the integrals over these operator components, evaluated at the origin Ri. The sum over ν is restricted accordingly. Note that we use Cartesian multipole moment operators rather than traceless multipole operators as in other formulations.7,8 More details of the derivation and explicit expressions up to p = 3 are given in the supplementary material.
The lowest-order approximation (p = 1) is the dipole-dipole approximation. Increasing the order p improves the accuracy at long range but leads to faster divergence at short range. Hetzer et al.6 therefore recommended to truncate the expansion at p = 3, which gives the best compromise of stability, accuracy, and efficiency.
Using the multipole approximations described above, the semi-canonical LMP2 pair energies [cf. Eqs. (3) and (4)] can be evaluated on-the-fly without storing the integrals. Though not linear scaling, this is very fast, even for large molecules. It is therefore convenient to compute the pair energies for all pairs for which (using, e.g., ) and then select those for which as distant pairs. All remaining pairs are fully included in the LMP2 calculation. The approximated distant pair energies are summed up to yield the total distant correlation energy , which is added to the final LMP2 energy. This approach has been used in the PNO-LMP2 and PNO-LCCSD (local coupled-cluster with singles and doubles) methods of Riplinger et al.9,10 and ours,11–13 as well as in recent PNO-NEVPT214 and PNO-CASPT215 methods. In all these works, only the semi-canonical dipole-dipole approximation was employed. Either PAO or OSV domains were used in these methods to compute the distant pair energy.
Alternatively, a better approximation is obtained by storing the approximate integrals (ai|bj)(p) (, ) and optimizing the corresponding amplitudes and pair energies iteratively along with all other pairs.
As will be shown below, the iterative approach is significantly more accurate than the semi-canonical one. However, it is also more expensive, since the semi-canonical PAO or OSV orbital domains [i] and [j] may be rather large. For accurate calculations using triple-ζ basis sets, the PAO and OSV orbital domains typically include 300 and 150 virtual orbitals, respectively. Since the number of distant pairs is very large, a straightforward use of these domains causes very high storage and central processing unit (CPU) demands.
This problem can be avoided by defining distant pair PNO domains [i]dPNO and [j]dPNO for each pair ij. In contrast to the PAO or OSV orbital domains, these are specific for each distant pair, and very few PNOs per pair domains are sufficient to optimally describe the correlation of the distant pair electrons. This strongly reduces the computational effort for optimizing the distant pair amplitudes. In general, PNOs are obtained by diagonalizing external pair density matrices (ij =
in the basis of pseudo-canonical OSV pair domains [ij]OSV. For distant pairs, the amplitudes are only non-zero for , and obtained using Eq. (3) and the multipole approximation of the integrals . Here [i]OSV and [j]OSV denote pseudo-canonical OSV orbital domains. The pair density then becomes block-diagonal, with the two blocks
These blocks can be diagonalized separately, and the eigenvalues can be used to select distant pair dPNO domains [i]dPNO and [j]dPNO, respectively, using a PNO occupation number threshold . Alternatively, a completeness criterion can be employed, and then as many PNOs are included as needed to recover for each pair at least , where the semi-canonical pair energies are used (see the supplementary material for more details of this procedure). Since the occupation numbers of distant pairs are very small, we found the latter criterion more suitable and used for all calculations presented in this paper. Test calculations with a stricter threshold of 0.999 yielded nearly the same results (see the supplementary material).
Finally, the new domains are made pseudo-canonical, and the integrals are transformed accordingly. The dPNO domains are very much smaller than the original OSV domains. Often only 3-6 functions are sufficient to recover more than 99.5% of the OSV pair energies, and therefore both the memory demand and the CPU time needed to optimize the distant pair amplitudes are dramatically reduced. Nevertheless, as will be demonstrated in the following, the results are almost unchanged. The transformation to the dPNO domains can be done on-the-fly when generating the integrals by the multipole approximations, so that only integrals for the very small domains need to be stored.
We demonstrate the effect of the various approximations for the dissociation of a gold(I)-aminonitrene complex (AuC41H45N4P, for simplicity denoted Auamin, see Fig. 1), which has been used in previous benchmarks of PNO-LCCSD methods.12,16 This is an excellent and difficult benchmark case since the correlation effects on the reaction energy are very large (Hartree-Fock, LMP2-F12, and LCCSD-F12 yield 92, 251, and 190 kJ mol−1, respectively, using the VTZ-F12 basis set16). The results can be directly compared to an experimental gas-phase value17 of 196.5 kJ/mol, which has been obtained by subtracting the PW91/cc-pVTZ-pp zero-point correction17−8.2 kJ/mol from the measured value. The distant pair contribution to the reaction energy is the largest of all systems we have studied so far. Additional results for two other reactions are presented in the supplementary material.
For all atoms except gold, we use the VTZ-F12 basis set,18 and for gold the aug-cc-pVTZ-PP basis19 with the ECP60MDF effective core potential.20 The PAO, OSV, and PNO domains were selected as described in Ref. 11, using well established default thresholds (see the supplementary material for details).
The contributions of distant pairs to PNO-LMP2 reaction energies are presented in Table I for p = 1, 2, 3 and three distant pair selection thresholds of Tdist. Both the semi-canonical (non-iterative) and iterative multipole approximations have been used. For the ease of comparison, the total distant pair contributions obtained without multipole approximations are given in column ; the remaining columns show the errors relative to these values that are introduced by the various approximations. In all cases, the distant pairs have been selected using the semi-canonical p = 1 pair energies, i.e., the number of distant pairs is independent of the approximation applied. In the cases where the distant pair amplitudes are optimized iteratively, all distant pairs were included. For comparison, also reaction energies computed without multipole approximations, but still neglecting the charge-transfer and exchange type integrals are shown (NOEX approximation).
PNO-LMP2 distant pair energy contributions to the reaction energies without multipole approximations and errors relative to these values caused by various multipole approximations. All energies in kJ mol−1, thresholds Tdist in . The numbers over the columns correspond to the multipole order applied. NOEX refers to calculations without multipole approximation but neglected exchange terms. Ndist is the number of distant pairs; the total number of pairs is 7503. and are average orbital domain sizes for distant pairs. The computed total PNO-LMP2 reaction energy (without distant pair approximations) amounts to 239.7 kJ mol−1.
. | . | . | . | Errors of reaction energies . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
. | . | . | . | . | Semi-canonical . | Iterative . | NOEX . | |||||
Tdist . | Ndist . | . | . | . | 1 . | 2 . | 3 . | 1 . | 2 . | 3 . | PNO . | OSV . |
10 | 4812 | 143 | 3.8 | 22.02 | −10.20 | −7.13 | −6.22 | −7.91 | −4.12 | −2.89 | −2.92 | −2.86 |
3 | 3609 | 143 | 3.3 | 6.93 | −2.71 | −1.91 | −1.77 | −1.82 | −0.79 | −0.58 | −0.55 | −0.50 |
1 | 2355 | 143 | 3.1 | 1.98 | −0.64 | −0.46 | −0.44 | −0.46 | −0.13 | −0.10 | −0.10 | −0.08 |
. | . | . | . | Errors of reaction energies . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
. | . | . | . | . | Semi-canonical . | Iterative . | NOEX . | |||||
Tdist . | Ndist . | . | . | . | 1 . | 2 . | 3 . | 1 . | 2 . | 3 . | PNO . | OSV . |
10 | 4812 | 143 | 3.8 | 22.02 | −10.20 | −7.13 | −6.22 | −7.91 | −4.12 | −2.89 | −2.92 | −2.86 |
3 | 3609 | 143 | 3.3 | 6.93 | −2.71 | −1.91 | −1.77 | −1.82 | −0.79 | −0.58 | −0.55 | −0.50 |
1 | 2355 | 143 | 3.1 | 1.98 | −0.64 | −0.46 | −0.44 | −0.46 | −0.13 | −0.10 | −0.10 | −0.08 |
The table demonstrates that the contribution of the distant pairs to the reaction energy is significant (up to 22 kJ mol−1 for ). Using the simple semi-canonical dipole-dipole approximation, this contribution is underestimated by 10 kJ mol−1, while with the iterative scheme and p = 3 the error is reduced to 2.9 kJ mol−1. The remaining error is almost exclusively due to the NOEX approximation.
We also compared calculations with the full OSV orbital domains to those with the small dPNO domains. As can be seen in Table I, the reduction of the domain sizes by the dPNO treatment is dramatic. Despite this, more than 99.5% of the distant pair energy is recovered with (on the average) only 3-4 dPNOs per pair, and the reaction energies change negligibly by only 0.02–0.06 kJ mol−1.
The computed absolute distant pair energies of Auamin and some other molecules can be found in the supplementary material. With a large threshold of the approximate distant pair energies are in all cases significantly underestimated. There are three sources of errors: (i) the neglect of charge-transfer and exchange-type excitations for distant pairs (NOEX); (ii) the non-iterative semi-canonical (SC) approximation; and (iii) the multipole approximation for the integrals (MLTP). The largest error (about ) arises from the NOEX approximation; the error caused by the SC approximation is also significant ( to ), while the error of the multipole approximation itself decreases quickly with p and is in all cases below for p = 3. For example, for the Auamin molecule, the errors of the distant pair correlation energy caused by the iterative MLTP approximation (relative to NOEX) amount to , , and for p = 1, 2, 3, respectively (Tdist = 105 Eh). The semi-canonical dipole-dipole approximation (p = 1) recovers only of the exact PNO-LMP2 distant pair energy for Auamin.
For the small threshold of , the errors caused by the NOEX approximation are reduced but still amount to about for the distant pair correlation energy of Auamin. The relative errors of the other approximations are quite similar as for the larger threshold, but the absolute errors are reduced by one order of magnitude.
Even though only 3 reactions have been investigated in the current work, we believe that these results are representative and sufficient to conclude that the numerous, but individually very small energy contributions of distant pairs in large systems are by no means negligible when summed up. Naturally, the effects of distant pair approximations grow with the molecular size. Furthermore, distant pairs have particularly large effects in systems with many aromatic groups, as demonstrated here by the Auamin reaction. It is therefore important to treat distant pairs as accurately as possible in a most cost effective way. The simple dipole-dipole approximation, which has been applied in previous works, is of limited accuracy. Higher multipole orders and the iterative optimization of distant pair amplitudes lead to significant improvements. It appears that is a safe choice. Very distant pairs with energies smaller than can be treated by the non-iterative semi-canonical approximation without loss of accuracy (see the supplementary material). This restores linear scaling for the optimization of the remaining pairs in the iterative PNO-LMP2.
Finally we note that weak and distant pair energies computed by LMP2 are often significantly (typically 30%) overestimated relative to accurate coupled-cluster values. This effect can be reduced by using spin-component scaled21 (SCS-LMP2) distant pair energies. The use of LMP2 or SCS-LMP2 approximations for weak pairs () in local coupled-cluster calculations can lead to large errors and is no longer recommended.12 Significant computational savings for close and weak pair classes are possible, however, using approximate coupled-cluster treatments, as described in Ref. 12. Furthermore, basis set incompleteness errors may be large, even when triple-ζ basis sets are used. For the Auamin PNO-LMP2/VTZ-F12 reaction energy, the basis set error amounts to about 12 kJ mol−1. This error can be very effectively reduced by explicit correlation (F12) treatments; at the same time, the F12 terms minimize errors caused by the domain approximations.11–13,22,23 This makes it possible to compute highly accurate LCCSD(T)-F12 energies for molecules with more than 100 atoms.
See supplementary material for explicit expressions for the multipole approximations up to p = 3 which are provided, and benchmark results for three reactions are presented.