We present a method to improve upon the resolution-of-the-identity (RI) for correlation methods. While RI is known to allow for drastic speedups, it relies on a cancellation of errors. Our method eliminates the errors introduced by RI which are known to be problematic for absolute energies. In this way, independence of the error compensation assumption for relative energies is also achieved. The proposed method is based on the idea of starting with an oversized RI basis and projecting out all of its unphysical parts. The approach can be easily implemented into existing RI codes and results in an overhead of about 30%, while effectively removing the RI error. In passing, this process alleviates the problem that for many frequently employed basis sets no optimized RI basis sets have been constructed. In this paper, the theory is presented and results are discussed exemplarily for the random phase approximation and Møller-Plesset perturbation theory.

The resolution-of-the-identity (RI) approximation1–6 is a common approach throughout the electronic structure theories of quantum chemistry in which an identity is resolved into a product of terms dependent on a preoptimized (auxiliary) basis. This allows to reformulate theories into a computationally more tractable form. For example, within explicitly correlated methods7–9 (F12) several three-electron integrals occur which are commonly split into products of two-electron integrals via RI. However, the by far most frequently encountered use of RI is the case of 4-center-2-electron repulsion integrals (ERIs) occurring in almost any correlation method, which are split into products of at most 3-center-2-electron integrals.10,11 The final formulations obtained in this case are identical to those that can be derived from another perspective, often called density-fitting12 (DF). For this reason, in theories containing only this special type of RI the terms RI and DF are often used interchangeably.13 While we will focus on this major case of RI, i.e., DF, in the following, we expect the transfer to other types of RI to be straightforward. As an example, we sketch the application to F12 in the supplementary material.

Preoptimized incomplete auxiliary bases have to balance accuracy against computational performance. Several attempts and discussions have therefore revolved around improving the RI approach.14–17 While a lot of effort has been put into advancing upon the approximation formula itself18–23 as well as the auxiliary basis sets,24–26 the null space structure of the physical models remains more or less opaque.

In this paper, we present an approach which explicitly recognizes the physical model of the correlation method. By starting with an oversized auxiliary basis, we eliminate the RI error and then project out any contribution spanned by the auxiliary basis which is not used by the physical model. While the theory extends to any precomputable null space structure of the physical model, here we focus on the concrete examples of second order Møller-Plesset perturbation theory27 (MP2; as a post-Hartree–Fock (HF) theory) and the direct random phase approximation28 (RPA; as a post-Kohn–Sham (KS) theory) because the supports of their physical models are both encompassed by particle-hole space. This allows us to treat both of these theories simultaneously.

In formulating our theory, we employ the Cholesky decomposition (CD) and the singular value decomposition (SVD) which have been studied in related works by Aquilante, Lindh, and Pedersen21–23 and Kállay,29 respectively.

Consider a generic correlation method (Einstein’s sum convention is used throughout)

(1)

which contains a single set of ERIs

(2)

over basis functions {χ}, where Q is given by the physical model (μν|1r12|I) denotes a corresponding three-center integral. Note that if Q again depends on ERIs, the following considerations can be applied to each of them in turn. Even if the actual computation may be done in basis functions χμ,χν,, most physical models will not describe the interaction of all basis functions, but instead of a physically relevant subspace. Writing a theory like Eq. (1), however, hides this information in the physical model Q.

For the ERIs [Eq. (2)], the RI results in the following equality (see the supplementary material for details):

(3)

where I and J are indices over an auxiliary basis set. The B and C matrices are computed in any chosen metric m12 as

(4)
(5)

with the most common choice being the Coulomb metric m12 = 1/r12. Note that (K|L)=(K|1r12|L) is a Coulomb integral independent of the chosen metric. In typical approaches, a preoptimized auxiliary basis set is employed which is only 3-4 times larger in cardinality than the set {χ} because by the reasoning of DF one finds that Eq. (3) is the best approximation in a metric-specific sense also when the auxiliary basis is incomplete. While drastically improving performance, this introduces errors of several meVs in absolute energies. In contrast, our considerations start with a saturated auxiliary basis, such that Eq. (3) in fact effectively constitutes an identity.

By employing the RI, the correlation energy [Eq. (1)] becomes

(6)

As two exemplary cases we consider the RPA and MP2 methods. As post-KS/post-HF methods, they can be formulated in molecular orbitals as linear combinations of atomic orbitals. cμi, cνa, etc. denote the coefficient matrix elements, where i,j,k, index the occupied orbitals and a,b,c, the virtual orbitals. Within the RI-RPA11,30–33 in the time-determining step, one has to compute a quantity of the form

(7)

where ω is a variable to be integrated over in a later step of the algorithm and 𝜀i, etc. denote orbital energies. Therefore, we can associate

(8)

Within RI-MP234 on the other hand, the time-determining step is the recombination of the ERIs according to Eq. (3) in computing

(9)

so we can associate

(10)

Both of these methods share a commonality: Their physical models are built solely around particle-hole (ph) interactions as

(11)

To treat both methods within the same derivation, we will only focus on this common property and restrict ourselves to treat the inner part as a black box.

We aim to project out all auxiliary basis functions and linear combinations in the null space of the physical model (Null(Q̃)={x|Q̃x=0}) because these will not contribute to the final energy, or equivalently project onto its complement, called support.

We begin with a thought experiment: If Q̃ was known beforehand, one could construct a minimal-rank projector P onto the support of Q̃ such that

(12)

To do so, one would first apply SVD to Q̃

(13)

where Σ is a diagonal matrix containing only rank(Q̃) non-zero singular values and the columns of the unitary matrices U and V are the corresponding left and right singular vectors. We now discard all singular vectors with a corresponding zero singular value. In numerical implementations, it is useful to further discard those corresponding to very small singular values. This step is justified by the Eckart-Young-Mirsky theorem.35 Precisely which numerical values qualify as very small in this context will be further analyzed in Sec. III A.

As the left and right singular vectors form an orthogonal system each, recombining the remaining singular vectors according to

(14)

defines a left and a right projector. The right projector fulfils Eq. (12) and the left projector can similarly be introduced to the left of Q̃ at any point in the derivation.

Therefore one can redefine Eqs. (4) and (5) without changing the final correlation energy, Eq. (6),

(15)
(16)

This reduces the effective size of the auxiliary basis in all the time determining steps to the width of the singular vector matrices, which per constructionem is exactly limited by the rank of the physical model.

For the case at hand, there is no need for additional steps in the construction. Nonetheless, we want to propose one that may prove useful in other cases. Once the projector matrix is known, it can be decomposed by pivoted CD as

(17)

where the number of columns in the Cholesky factor L again equals the rank of P. Therefore, Eqs. (15) and (16) can be carried out with L instead of V. This may prove useful, for example, when P is sparse because unlike the singular decomposition, pivoted CD on a sparse matrix can return sparse factors. Although not necessary, we use the Cholesky factors throughout, since their use does not deteriorate the results compared to the use of V.

Computing the exact physical model is almost always part of the time determining step. To exploit the previous discussion, it is therefore necessary to find an auxiliary matrix H, such that

(18)

and construct the projectors from H instead of Q̃. The central idea to finding H is to recognize that because in a k-linear map N:WX, e.g., a matrix or tensor product, the zero element in W is mapped to the zero element in X, in a succession of maps M:VW,N:WX, the null space of the first map is contained within the null space of the succession of all maps.

Returning to the generic example, Eq. (11), this means

(19)

allowing the definition of H for this case as

(20)

By the SVD decomposition of H, it becomes obvious that HTH in fact has the same span as H (although in numerical implementations a tighter threshold for the singular values is needed) so we may equally define

(21)

which is favorable as it allows for a cheaper SVD. Applying the discussion of Sec. II B to H instead of Q̃ thus delivers the new projection. Because it does not make use of all the internal structure of Q̃, it may eliminate less than—however due to Eq. (18) never more than—the projector constructed directly from Q̃. Therefore no additional error is introduced.

While the argument extends naturally to any other operator with a non-trivial null space structure, e.g., Q̃C, by replacing Q̃ with that operator in Eq. (18), we will only evaluate the straightforward example, which we have presented here, in Sec. III.

All calculations have been done in a development version of the program package FermiONs++36,37 employing the Coulomb metric. All RPA calculations have been based on self-consistent orbitals obtained with the Perdew-Burke-Ernzerhof (PBE) functional.

Two practical questions have yet to be addressed: The auxiliary basis to start the projection from and the numerical threshold to choose for the SVD.

In principle, any auxiliary basis large enough to fulfill Eq. (3) as an identity can be taken as the starting point. While a naïve overly large auxiliary basis can be used, it is advisable to make use of specialized auxiliary basis sets whenever available, to lessen the overhead of the projector construction.

To find a suitable choice for both the starting basis and the threshold, we evaluated all 198 monomers and dimers within the S66 test set of small to medium sized molecules.38 In Fig. 1 we compare the deviation in energy from the RI-free result to the total number of auxiliary basis functions left after the projection along the continuous range of thresholds. The variation over the different molecules is expressed by the empirical standard deviation of the energy differences (at threshold values 0 and 1010,109,,105, respectively). The unprojected RI results with the canonical discrete set of preoptimized auxiliary basis sets (cc-pVXZ-RI) are also given for comparison.

FIG. 1.

Mean errors (left axis) and standard deviations (error bars) against the number of auxiliary functions using ph-projection. Solid lines: Projection from cc-pV6Z-RI (red) and cc-pV5Z-RI (orange). Larger thresholds result in fewer auxiliary basis functions. Gray bars labelled XZ = 6Z, 5Z, QZ, TZ represent the corresponding unprojected cc-pVXZ-RI bases. Mean error and standard deviation nearly vanish for cc-pV6Z-RI. Null space removal grants sub-meV accuracy with an auxiliary basis set size only slightly larger than the canonical RI basis.

FIG. 1.

Mean errors (left axis) and standard deviations (error bars) against the number of auxiliary functions using ph-projection. Solid lines: Projection from cc-pV6Z-RI (red) and cc-pV5Z-RI (orange). Larger thresholds result in fewer auxiliary basis functions. Gray bars labelled XZ = 6Z, 5Z, QZ, TZ represent the corresponding unprojected cc-pVXZ-RI bases. Mean error and standard deviation nearly vanish for cc-pV6Z-RI. Null space removal grants sub-meV accuracy with an auxiliary basis set size only slightly larger than the canonical RI basis.

Close modal

RPA/cc-pVQZ results are effectively converged with an auxiliary basis two cardinality numbers larger than the AO basis, cc-pV6Z-RI (Fig. 1, top). Kállay’s very different reasoning29 had focused on reducing the size of the typical auxiliary basis (here cc-pVQZ-RI). In contrast, our understanding as presented above, has led us to consider saturated auxiliary basis sets. We find that even when aiming for the same auxiliary size after reduction, projections based on larger basis sets lead to significantly better results. SVD thresholds tighter than 10−6 hardly change the results, so we recommend the combination cc-pV6Z-RI 106 and use it for the following analysis.

For MP2 (Fig. 1, bottom), where the typical basis set size is triple-ζ, here cc-pVTZ, already the pentuple-ζ auxiliary basis can be considered converged. Because projecting from a smaller basis means that less auxiliary functions are left after projection with the same threshold, we recommend tightening the threshold to 10−7 in this case.

Our recommended choices for the projector construction (see Sec. III A) in eliminating the RI error result in slightly more auxiliary functions entering the energy evaluation than canonical RI (e.g., cc-pVQZ-RI for cc-pVQZ). Additional compute time overhead is caused by the construction of the projector itself. While the most expensive step scales formally as O(N4) [Eq. (21); asymptotic limit with a local metric: O(N)] the same holds true for the RI correlation methods [MP2: O(N5), RPA: O(N4log(N))], so the overhead is approximately given by a constant fraction of the total runtime. In Table I we compare the overall overhead of our method to the gains in accuracy for some larger molecules (Fig. 2). Timings comprise the complete correlation energy calculation after the generation of the Kohn–Sham orbitals. This includes projector construction, integral evaluation and transformation, and evaluation of the Hamiltonian expectation value as part of the full RPA energy. Our recommended choices eliminate the RI error to an insignificant residue of less than one meV in absolute energy also for these larger molecules, causing a total overhead of only about 30% compared to the canonical RI approach, which shows two orders of magnitude larger errors.

FIG. 2.

Selection of molecules employed for testing (see Table I; C26H22O2S,39 adenosine-thymine base pair, and C60).

FIG. 2.

Selection of molecules employed for testing (see Table I; C26H22O2S,39 adenosine-thymine base pair, and C60).

Close modal
TABLE I.

Comparison of accuracy gain and computational overhead of cc-pV6Z-RI106 ph-projection against canonical cc-pVQZ-RI for molecules in Fig. 2 (geometries and more detailed results are given in the supplementary material). All calculations were carried out at the RPA@PBE/cc-pVQZ level with 12 concurrent evaluation threads on an Intel [email protected] architecture.

MoleculeABC
Canonical RI error (meV) 30.2 40.9 56.5 
ph-projection error (meV) 0.7 0.6 0.8 
Total walltime overhead +27% +35% +14% 
MoleculeABC
Canonical RI error (meV) 30.2 40.9 56.5 
ph-projection error (meV) 0.7 0.6 0.8 
Total walltime overhead +27% +35% +14% 

As the projection is essentially error-free, so are potential energy surfaces evaluated with it. We computed the dissociation curve of the C60 dimer in Fig. 3. The equilibrium distance can be deduced from experiment to be 9.93 Å,40 in good agreement with our results. Obtaining a good experimental estimate for the binding energy is more intricate.41,42

FIG. 3.

cc-pV6Z-RI 106 ph-projected potential energy dissociation curve of the C60 dimer. Geometries were taken from Ref. 43. The experimental equilibrium distance40,44 is given as a black line for reference. All calculations have been counter-poise corrected.45 

FIG. 3.

cc-pV6Z-RI 106 ph-projected potential energy dissociation curve of the C60 dimer. Geometries were taken from Ref. 43. The experimental equilibrium distance40,44 is given as a black line for reference. All calculations have been counter-poise corrected.45 

Close modal

We have introduced the concept of null space projection of the physical model. The theory was presented in a general manner and is not limited to the discussed applications at the MP2 and RPA levels. For these specifically we have shown that the RI error can be eliminated to a residue of less than one meV while overall runtime increases by only about 30%, within a few lines of code. Other correlation methods may similarly benefit from our presented null space projection idea.

See supplementary material for further results and discussions as indicated in the text.

We acknowledge funding by DFG: Oc35/4-1, SFB749, and EXC114.

1.
S. F.
Boys
and
I.
Shavitt
, “
A fundamental calculation of the energy surface for the system of three hydrogen atoms
,” Technical Report,
University of Wisconsin
,
1959
.
2.
J. L.
Whitten
,
J. Chem. Phys.
58
,
4496
(
1973
).
3.
E.
Baerends
,
D.
Ellis
, and
P.
Ros
,
Chem. Phys.
2
,
41
(
1973
).
4.
B. I.
Dunlap
,
J. W. D.
Connolly
, and
J. R.
Sabin
,
J. Chem. Phys.
71
,
4993
(
1979
).
5.
C.
Van Alsenoy
,
J. Comput. Chem.
9
,
620
(
1988
).
6.
O.
Vahtras
,
J.
Almlöf
, and
M. W.
Feyereisen
,
Chem. Phys. Lett.
213
,
514
(
1993
).
7.
W.
Kutzelnigg
,
Theor. Chim. Acta
68
,
445
(
1985
).
8.
C.
Hättig
,
W.
Klopper
,
A.
Köhn
, and
D. P.
Tew
,
Chem. Rev.
112
,
4
(
2012
).
9.
L.
Kong
,
F. A.
Bischoff
, and
E. F.
Valeev
,
Chem. Rev.
112
,
75
(
2012
).
10.
M.
Feyereisen
,
G.
Fitzgerald
, and
A.
Komornicki
,
Chem. Phys. Lett.
208
,
359
(
1993
).
11.
H.
Eshuis
,
J.
Yarkony
, and
F.
Furche
,
J. Chem. Phys.
132
,
234114
(
2010
).
12.
H.-J.
Werner
,
F. R.
Manby
, and
P. J.
Knowles
,
J. Chem. Phys.
118
,
8149
(
2003
).
13.

In Ref. 12 the term DF was introduced as a synonym to RI, highlighting the difference in perspective. Some authors have adapted this nomenclature

[cf.
J. G.
Hill
,
Int. J. Quantum Chem.
113
,
21
(
2013
)],
others have not [cf.
T.
Kjærgaard
,
J. Chem. Phys.
146
,
044103
(
2017
)].
[PubMed]
14.
I.
Duchemin
,
J.
Li
, and
X.
Blase
,
J. Chem. Theory Comput.
13
,
1199
(
2017
).
15.
F.
Weigend
,
M.
Kattannek
, and
R.
Ahlrichs
,
J. Chem. Phys.
130
,
164106
(
2009
).
16.
L.
Grajciar
,
J. Comput. Chem.
36
,
1521
(
2015
).
17.
S.
Manzer
,
P. R.
Horn
,
N.
Mardirossian
, and
M.
Head-Gordon
,
J. Chem. Phys.
143
,
024113
(
2015
).
18.
E. G.
Hohenstein
,
R. M.
Parrish
, and
T. J.
Martínez
,
J. Chem. Phys.
137
,
044103
(
2012
).
19.
R. M.
Parrish
,
E. G.
Hohenstein
,
T. J.
Martínez
, and
C. D.
Sherrill
,
J. Chem. Phys.
137
,
224106
(
2012
).
20.
E. G.
Hohenstein
,
R. M.
Parrish
,
C. D.
Sherrill
, and
T. J.
Martínez
,
J. Chem. Phys.
137
,
221101
(
2012
).
21.
F.
Aquilante
,
T. B.
Pedersen
,
A.
Sánchez de Merás
, and
H.
Koch
,
J. Chem. Phys.
125
,
174101
(
2006
).
22.
T. B.
Pedersen
,
F.
Aquilante
, and
R.
Lindh
,
Theor. Chem. Acc.
124
,
1
(
2009
).
23.
F.
Aquilante
,
L.
Boman
,
J.
Boström
,
H.
Koch
,
R.
Lindh
,
A.
Sánchez de Merás
, and
T. B.
Pedersen
,
Linear-Scaling Techniques in Computational Chemistry and Physics
(
Springer
,
Dordrecht, The Netherlands
,
2011
), Vol. 13, pp.
301
343
.
24.
F.
Weigend
,
A.
Köhn
, and
C.
Hättig
,
J. Chem. Phys.
116
,
3175
(
2002
).
25.
C.
Hättig
,
Phys. Chem. Chem. Phys.
7
,
59
(
2005
).
26.
G. L.
Stoychev
,
A. A.
Auer
, and
F.
Neese
,
J. Chem. Theory Comput.
13
,
554
(
2017
).
27.
C.
Møller
and
M. S.
Plesset
,
Phys. Rev.
46
,
618
(
1934
).
28.
D.
Bohm
and
D.
Pines
,
Phys. Rev.
92
,
609
(
1953
).
29.
M.
Kállay
,
J. Chem. Phys.
141
,
244113
(
2014
).
30.
F.
Furche
,
J. Chem. Phys.
129
,
114105
(
2008
).
31.
H.
Eshuis
and
F.
Furche
,
J. Chem. Phys.
136
,
084105
(
2012
).
32.
H. F.
Schurkus
and
C.
Ochsenfeld
,
J. Chem. Phys.
144
,
031101
(
2016
).
33.
A.
Luenser
,
H. F.
Schurkus
, and
C.
Ochsenfeld
,
J. Chem. Theory Comput.
13
,
1647
(
2017
).
34.
F.
Weigend
,
M.
Häser
,
H.
Patzelt
, and
R.
Ahlrichs
,
Chem. Phys. Lett.
294
,
143
(
1998
).
35.
C.
Eckart
and
G.
Young
,
Psychometrika
1
,
211
(
1936
).
36.
J.
Kussmann
,
M.
Beer
, and
C.
Ochsenfeld
,
Wiley Interdiscip. Rev.: Comput. Mol. Sci.
3
,
614
(
2013
).
37.
J.
Kussmann
,
A.
Luenser
,
M.
Beer
, and
C.
Ochsenfeld
,
J. Chem. Phys.
142
,
094101
(
2015
).
38.
J.
Řezáč
,
K. E.
Riley
, and
P.
Hobza
,
J. Chem. Theory Comput.
7
,
2427
(
2011
).
39.
T.
Gatzenmeier
,
M.
van Gemmeren
,
Y.
Xie
,
D.
Hofler
,
M.
Leutzsch
, and
B.
List
,
Science
351
,
949
(
2016
).
40.
P. A.
Heiney
,
J. E.
Fischer
,
A. R.
McGhie
,
W. J.
Romanow
,
A. M.
Denenstein
,
J. P.
McCauley
, Jr.
,
A. B.
Smith
, and
D. E.
Cox
,
Phys. Rev. Lett.
66
,
2911
(
1991
).
41.
J.
Abrefah
,
D. R.
Olander
,
M.
Balooch
, and
W. J.
Siekhaus
,
Appl. Phys. Lett.
60
,
1313
(
1992
).
42.
F.
Tournus
,
J.-C.
Charlier
, and
P.
Mélinon
,
J. Chem. Phys.
122
,
094315
(
2005
).
43.
D. I.
Sharapa
,
J. T.
Margraf
,
A.
Hesselmann
, and
T.
Clark
,
J. Chem. Theory Comput.
13
,
274
(
2017
).
44.
Y.-y.
Ohnishi
,
K.
Ishimura
, and
S.
Ten-no
,
J. Chem. Theory Comput.
10
,
4857
(
2014
).
45.
S.
Boys
and
F.
Bernardi
,
Mol. Phys.
19
,
553
(
1970
).

Supplementary Material