Four-center two-electron Coulomb integrals routinely appear in electronic structure algorithms. The resolution-of-the-identity (RI) is a popular technique to reduce the computational cost for the numerical evaluation of these integrals in localized basis-sets codes. Recently, Duchemin and Blase proposed a separable RI scheme [J. Chem. Phys. 150, 174120 (2019)], which preserves the accuracy of the standard global RI method with the Coulomb metric and permits the formulation of cubic-scaling random phase approximation (RPA) and GW approaches. Here, we present the implementation of a separable RI scheme within an all-electron numeric atom-centered orbital framework. We present comprehensive benchmark results using the Thiel and the GW100 test set. Our benchmarks include atomization energies from Hartree–Fock, second-order Møller–Plesset (MP2), coupled-cluster singles and doubles, RPA, and renormalized second-order perturbation theory, as well as quasiparticle energies from GW. We found that the separable RI approach reproduces RI-free HF calculations within 9 meV and MP2 calculations within 1 meV. We have confirmed that the separable RI error is independent of the system size by including disordered carbon clusters up to 116 atoms in our benchmarks.
I. INTRODUCTION
The accurate computation of molecular or materials’ properties that derive from the electronic structure is important in quantum chemistry and materials science. Kohn–Sham density functional theory1,2 (KS-DFT) is the most popular electronic-structure method to date, providing a reasonable compromise between accuracy and computational efficiency.3 However, KS-DFT faces several limitations, including the correct description of strong correlation,4 van der Waals (vdW) interactions,4,5 or excited-state properties.6 Various methods have been explored to overcome or mitigate the short comings of KS-DFT. They can be broadly classified into wave-function based approaches, such as coupled cluster (CC)7,8 and many-body perturbation theory (MBPT) schemes, although the distinction is becoming blurry as MBPT reformulations of CC methods were presented in the literature.5,9–11 Examples for MBPT schemes are second order Møller–Plesset (MP2) perturbation theory,12 the random phase approximation13,14 (RPA), the GW approximation to Hedin’s equations,15 and renormalized second-order perturbation theory (rPT2), which includes next to RPA the renormalized single excitation16 (rSE) and second-order screened exchange17,18 (SOSEX).
Four-center two-electron repulsion integrals (4c-ERIs) appear in all of the mentioned methods and their direct evaluation scales O(N4) with respect to basis set size N and thus with system size. Resolution-of-the-identity (RI), also known as density fitting, is a popular technique to alleviate the computational demands for the calculation of the 4c-ERIs within a localized-basis-set framework,19–22 which is typically used in the quantum chemistry community. RI expands the product of two atomic orbitals (AOs), φiφj, in a linear combination of auxiliary basis functions (ABFs) and replaces the exact evaluation of the four-center Coulomb interactions with the Coulomb interactions of a set of ABFs. The most popular and accurate RI flavor is the global Coulomb-fitting scheme (RI-V), where the 4c-ERIs are refactored into two- and three-center Coulomb integrals.23 Even though the number of ABFs is at least two times larger than the corresponding AO basis sets,21,22 the refactoring reduces the memory demand and the number of required operations considerably. The RI integrals can be evaluated analytically or numerically depending on the AO type. Analytic schemes are used for, e.g., Gaussian-type orbitals (GTOs),24–29 while integrals over numeric atom-centered orbitals (NAOs) are computed on grids.30
Alternative approaches to approximate the 4c-ERIs employ partial or full real-space quadrature strategies, regardless of the AO type. Examples are Friesner’s pseudospectral method,31 which involves the use of both numerical grids and analytical integrals, and the related semi-numerical integration scheme developed by Neese and co-workers.32 Another alternative approach to RI is the least square tensor hypercontraction (LS-THC) method,33,34 which completely expresses the 4c-ERIs on real-space grids, introducing a high complexity of O(N4–5). In 2015, a cubic-scaling reformulation of the THC method was proposed,35 which was later referred to as interpolative separable density fitting (ISDF) in the literature.36–38 As in RI schemes, ISDF linearly expands the AO pairs φiφj, but uses a separable expression for the expansion coefficients. While the RI expansion coefficients entangle φi(r)φj(r) through the three-center integrals, the ISDF coefficients are defined as the product φi(rk)φj(rk), where {rk} is a subset of the real-space grid {r}. The so-called interpolation points {rk} are selected with randomized QR decomposition or K-means clustering algorithms and the ISDF-equivalent to the ABFs are then obtained by least-square optimization.38
The ISDF method has been used to accelerate different electronic structure methods, including HF or hybrid functionals,37,39–42 MP2,43 RPA,36, GW,44 the Bethe–Salpeter equation45,46 (BSE), or quantum Monte Carlo,47 and has been implemented in plane-wave, real-space, GTO, and NAO basis set codes.38 Recently, Duchemin and Blase48 proposed a separable real-space RI scheme (RI-RS), where ISDF-like expansion coefficients were obtained over a compact set of real-space points by fitting them to the RI-V coefficients. The RI-RS approach does not reduce the computational cost of canonical implementations. However, Duchemin and Blase showed that it enables the formulation of highly accurate cubic-scaling RPA48 and GW49 algorithms, in combination with the Laplace transform technique50 and the space–time method,51 respectively.
In the last decade, RPA and GW became popular in quantum chemistry,6 and low-scaling algorithms for RPA48,52–55 and GW49,56–62 are currently very actively developed for localized basis sets. Sparsity has been introduced in the low-scaling RPA and GW expressions in different ways. One approach is to replace the Coulomb metric in RI-V with a local metric, such as an overlap,52,56 attenuated,53 and truncated57,62 Coulomb metric. Alternatively, the Coulomb metric is kept, but adopted to a local RI scheme,55,58–60,63 which restricts the ABFs to the centers of the AO pairs. High accuracy compared to canonical implementations has been achieved with the first approach,57 at the price of introducing sparse tensor–tensor multiplications, which require special libraries. The second approach, local RI, is computationally less demanding than the first scheme, but controlling the accuracy can be more challenging.64 The RI-RS approach now offers a third way. Its advantage, compared to the other two approaches, is its automatic sparsity in the RPA polarizability. In addition, sparse tensor operations are avoided and RI-RS is highly accurate as previous work suggests.48
Duchemin and Blase implemented the RI-RS approach within a GTO framework and applied it to HF, MP2, RPA, and GW.48,49 In this work, we present our implementation and validation of RI-RS within an NAO framework and extend the use of the RI-RS approach to coupled cluster singles and doubles (CCSD), SOSEX, rSE, and rPT2 energies. The purpose of this paper is to validate the accuracy of the RI-RS scheme for a wide range of methods and highlight its potential for deriving low-scaling algorithms.
This article is organized as follows. In Sec. II, we briefly review the theoretical background of the electronic-structure methods and the numerical techniques used. In Sec. III, we present details of the implementation. Computational details are given in Sec. IV. In Sec. V, we discuss comprehensive benchmark results, including total, atomization and quasiparticle energies. Finally, we draw conclusions in Sec. VI.
II. THEORY
In this section, we summarize the RI-V formalism and the RI-V working equations for the HF exchange and the CCSD, MP2, and RPA correlation energies. We also briefly review the theoretical background of the rPT2 and G0W0 methodologies and their corresponding RI-V expressions. We then introduce the RI-RS scheme in detail. We restrict the presentation in Sec. II to the closed-shell case and drop the spin index. Throughout this discussion, we adopt the following indexing: p, q, … denote general molecular orbitals (MOs), m, n, … occupied (occ) MOs, and a, b, … unoccupied or equivalently virtual (virt) MOs. We use i, j, … to index the functions of the primary basis set and P, Q, … (in capital letters) are used to denote the ABFs.
A. Resolution of the identity
The different RI techniques can be grouped into global and local. In global RI, the sum in Eq. (5) runs over all ABFs,23 while it is restricted to a subset in local RI.65–68 Given the pair φiφj, where φi is centered at atom A and φj at atom B, only RI basis functions with centers at A and B are used in local RI expansions. We note that we use global RI in this work and thus exclude local RI from the discussion in the following.
B. RI-V expression for different theoretical methods
1. Hartree–Fock theory
The exact exchange energy is recomputed in each SCF iteration. In practice, we first compute and store the three-center quantity before the start of the SCF loop. In each iteration, the MO coefficients are updated and is transformed to the MO basis using Eq. (13), which then yields . The scaling of Eq. (14) formally remains O(N4),30 but is in practice usually lower. The dominating step, in particular in our NAO scheme, is the evaluation of the three-center AO integrals [Eq. (9)], which scales O(N3) for small to medium-sized systems and O(N2) for large systems because the product φiφj becomes sparse.
2. Møller–Plesset perturbation theory
3. Coupled cluster theory
Coupled-cluster theory7,8,76 is a wave-function-based approach, which is well-established in quantum chemistry77 and is becoming increasingly popular in computational materials science.78–80 CCSD with perturbative triples, CCSD(T), is considered as one of the most accurate methods, which is still computationally affordable for larger molecules.81 However, we will restrict the discussion to CCSD since our goal is the numerical validation of RI-RS rather than providing the best possible computational reference.
4. Random phase approximation
The total RPA energy is given as sum of the HF energy and the RPA correlation energy , which corresponds to the fifth rung of “Jacobs ladder”82 in the context of KS-DFT, where the local-density approximation (LDA), generalized gradient approximation (GGA), meta GGAs, and hybrid functionals are placed on rungs 1 to 4 in that order. In practice, RPA is typically performed on-top of a preceding KS-DFT calculation. This implies that we insert the KS-DFT orbitals and energies in the expressions for the exact HF exchange and . In this post-processing mode, the total RPA energy is obtained by adding both terms to the KS-DFT total energy and subtracting the KS-DFT exchange–correlation contribution.
5. Renormalized second-order perturbation theory
While long-range interactions are well described in RPA, short-range exchange and correlation are not sufficiently captured5,87 and several schemes have been developed to go beyond RPA. Ren et al.16 proposed to add the rSE correction to RPA, while Grüneis and co-workers17 put forward an RPA + SOSEX approach. Adding both corrections (RPA + SOSEX + rSE) was investigated by Paier et al.88 and later promoted as renormalized second-order perturbation theory (rPT2).89 Benchmark studies88,89 of small inorganic and organic molecular systems showed that rPT2 is generally superior to RPA + SOSEX or RPA + rSE for reaction barrier heights and atomization energies. It has also been argued that rPT2 is a well-rounded approach from a diagrammatic perspective. The goldstone diagrams of RPA, SOSEX, and rSE are distinct infinite summations, where each series sums up topologically similar diagrams. The leading terms, i.e., the first terms in the sum, are the second-order direct, second-order exchange, and single excitation terms, respectively. The combination of the three leading terms is known as second-order Rayleigh–Schrödinger perturbation theory (RSPT). This is reflected in the name of the method: the infinite expansion in RPA + SOSEX + rSE can be understood as a renormalization of RSPT yielding the “renormalized second-order perturbation theory” or in short rPT2.
6. GW quasiparticle energies
C. Separable resolution of the identity
III. IMPLEMENTATION DETAILS
The ABFs are also NAOs in FHI-aims. Most localized-basis set codes use pre-optimized ABFs obtained via variational procedures21 whereas the default procedure in FHI-aims is to compute them on-the-fly during run-time. The ABFs are obtained by generating on-site products of the primary basis functions. Linear dependencies are removed with a Gram–Schmidt orthogonalization procedure. More details can be found in Ref. 30.
The real-space grid points {rk} are pre-optimized for the primary basis set of each atomic species. We used in this work the {rk} grids developed by Duchemin and Blase,48,49 which was so far restricted to the cc-pVTZ and def2-TZVP basis sets. The atomic grids are constructed for each atom type by different combinations of four base shells, which are subsets of Lebedev quadrature grids.108 Each base shell is replicated for each species with different radii. The number of repetitions of each base shell and the corresponding radius were optimized such that is minimized for the isolated atom case.48 This procedure was followed for the grids optimized for usage with the cc-pVTZ basis set.48 In later work by Duchemin and Blase,49 the procedure was further refined by a subsequent optimization step, where the grid points are allowed to move freely without being constrained by any symmetry condition. The grids for the def2-TZVP basis sets were generated by this more sophisticated approach.
Figure 1 shows the rk grid points for the cc-pVTZ basis set for (a) a hydrogen atom, (b) a carbon atom, and (c) the ethylene molecule. We observe that the grids are very compact, which is essential for low-scaling algorithms with small prefactors. The number of grid points is only approximately ten times larger than the number of primary basis functions. For the cc-pVTZ grids used in this work, we have points for each C, N, and O atom and 167 points for the H atom. The {rk} grids for def2-TZVP are of similar size: 136 points for H, 100 for He, and 336 for second-row elements (C, N, O, etc.).49 The number of points increases to 436 and 536 points for third and fourth-row elements, respectively.49 The system-specific grid points are constructed as shown in Fig. 1(c): the real-space grid for the ethylene molecule is formed by a superposition of the rk points of each hydrogen and carbon atom. The grids for the def2-TZVP basis are shown in Fig. S1 (supplementary material), which demonstrates that the arrangement of the grids points is significantly less symmetric as the one shown for the cc-pVTZ grids in Fig. 1.
Real space grid rk (red dots) for (a) the hydrogen atom, (b) carbon atom, and (c) ethylene molecule. The real space grid for a given molecule is constructed by the superposition of the grid of each species. Displayed are the grids for the cc-pVTZ basis set.
Real space grid rk (red dots) for (a) the hydrogen atom, (b) carbon atom, and (c) ethylene molecule. The real space grid for a given molecule is constructed by the superposition of the grid of each species. Displayed are the grids for the cc-pVTZ basis set.
The pseudocode for our RI-RS implementation in FHI-aims is summarized in Fig. 2. We start by computing the coefficients . The two-center Coulomb integrals Vμν [Eq. (7)] are computed with logarithmic Bessel transforms109 in Fourier space.110,111 The evaluation of the three-center integrals [Eq. (9)] is performed in real space using overlapping atom-centered spherical grids; see also Ref. 30 for more details on the NAO integration procedures. In the next step, we build the real-space grids rk from the provided specifications of the base shells and corresponding radii (cc-pVTZ grids) or directly read them from an external file (def2-TZVP grids). In step 3, the product of two basis functions, , is computed for each grid point rk. We only include pairs ij, where the atomic orbitals φi and φj have a significant overlap of their spatial extension. To simplify the computation of , we form the combined index ρ ≡ ij yielding a matrix D of dimension Nρ × Nk, where Nρ is the number of pairs and Nk the number of grid points. In steps 4 and 5, we build the auxiliary matrices A and B, respectively, and invert A. We found that the inversion is numerically stable without L2 regularization, which was proposed in Ref. 49. The quantity Mkμ [Eq. (38)] is obtained by matrix multiplication of A−1 and B. Finally, the RI-RS fitting coefficients (in the AO basis) are obtained from Eq. (37). Figure 2 shows that the computation of the RI-RS coefficients is straightforward, only requiring matrix multiplications and one inversion.
IV. COMPUTATIONAL DETAILS
For the validation of the RI-RS implementation, we used the Thiel set,112 the GW100 benchmark set,113 and ten of the amorphous carbon clusters from our previous work.100 The Thiel set comprises 28 small organic molecules composed of the elements C, N, O, and H. The whole benchmark set is shown in Fig. 3. We used the MP2/6-31-G(d) geometries supplied in Ref. 112 by Thiel and collaborators. The GW100 benchmark set contains 100 molecules, encompassing a broad spectrum of elements from the periodic table. We used the structures reported in the original GW100 benchmark study113 and exclude the seven molecules, which contain 5th period elements (Xe, Rb2, I2, C2H3I, CI4, AlI3, Ag2). For these seven molecules, optimized real-space grid points {rk} are currently still lacking. The amorphous carbon clusters were obtained by carving clusters from periodic structures of three-dimensional disordered carbon materials with the elemental composition CHO. Dangling C–C bonds were passivated with hydrogen; see Ref. 100 for more details.
We used Dunning’s cc-pVTZ114 basis sets for all calculations with the Thiel benchmark set and for the amorphous carbon clusters. We repeated the HF and MP2 calculations for the Thiel benchmark set with the def2-TZVP basis set.115 We also used the def2-TZVP basis set for the GW100 benchmark. Both basis sets are spherical GTOs. For the RI-RS calculations, we used the atomic grids {rk} specifically optimized for the respective basis set. For cc-pVTZ, we utilized the grids published in the SI of Ref. 48, while for def2-TZVP, we used the grids employed in Ref. 49 and published in Ref. 116.
We performed HF and MP2 calculations without RI with the NWChem code117 to obtain exact 4c-ERIs references values. The NWChem default settings are otherwise used in these calculations. HF, MP2, CCSD, RPA, SOSEX, rSE, rPT2, and GW are exclusively implemented with RI in the FHI-aims code. Non-RI versions of these methods are not implemented. We show in the following results from RI-V and RI-RS. For the automatic generation of the ABFs, we used a threshold of 1.0 × 10−10 for the basis–basis product screening. The Perdew–Burke–Ernzerhof (PBE)118,119 exchange–correlation functional was used for the preceding KS-DFT calculations in RPA, SOSEX, rSE, rPT2, and G0W0 with the Thiel benchmark set. The GW100 calculations employed the PBE0120,121 hybrid functional as starting point, while the G0W0 calculations of the amorphous carbon clusters started from a PBE-based hybrid (PBEh)122 functional with 45% of exact exchange α denoted as PBEh(α = 0.45). The frequency integral of the GW self-energy [Eq. (31)] was computed using a modified Gauss–Legendre grid5 with 200 imaginary frequency points. The self-energy Σn(iω) was computed for the same set {iω} and then analytically continued to the real axis employing a Padé model123 with 16 parameters. The QP equation [Eq. (30)] was solved iteratively. The SOSEX frequency integral in Eq. (24) was evaluated numerically using also a modified Gauss–Legendre grid with 200 points. The integration over λ in SOSEX [Eq. (26)] was computed using a Gauss–Legendre quadrature with five points. We made the input and output files of all the FHI-aims calculations (RI-V and RI-RS) available in the NOMAD database.124
V. RESULTS AND DISCUSSION
A. Comparison to non-RI reference
We start the discussion by comparing our RI-V and RI-RS results to the exact 4c-ERIs reference (non-RI) obtained from NWChem. We restrict this comparison to HF and MP2. The errors with respect to the non-RI values are displayed in Figs. 4(a) and 4(b) for the total and in Figs. 4(c) and 4(d) for the atomization energies, respectively. The corresponding mean error (ME), mean absolute error (MAE), and maximum absolute error (maxAE) are given in Table I. The discussion refers, unless otherwise noted, to the cc-pVTZ results.
(a) Errors of RI-V and RI-RS with respect to the non-RI reference. Total energy errors (meV/electron) for (a) HF exchange and (b) MP2 correlation energy. Atomization energy errors (meV) for (c) HF and (d) MP2. The errors are defined as ERI−V/RI−RS − Enon−RI. The cc-pVTZ basis set was used.
(a) Errors of RI-V and RI-RS with respect to the non-RI reference. Total energy errors (meV/electron) for (a) HF exchange and (b) MP2 correlation energy. Atomization energy errors (meV) for (c) HF and (d) MP2. The errors are defined as ERI−V/RI−RS − Enon−RI. The cc-pVTZ basis set was used.
Mean errors (ME), mean absolute errors (MAEs), and maximum absolute error (maxAE) with respect to the non-RI reference for total HF exchange and MP2 correlation energies (meV/electron) and HF and MP2 atomization energies (meV). The errors are defined as ERI−V/RI−RS − Enon−RI.
. | Total energy . | Atomization energy . | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
cc-pVTZ . | def2-TZVP . | cc-pVTZ . | def2-TZVP . | |||||||||||||
HF exch. . | MP2 corr. . | HF exch. . | MP2 corr. . | HF . | MP2 . | HF . | MP2 . | |||||||||
RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | |
ME | −0.019 | 0.132 | −0.003 | −0.004 | −0.022 | 0.015 | −0.002 | −0.010 | −0.490 | −8.568 | 0.045 | −0.274 | −0.145 | −2.997 | 0.146 | 0.531 |
MAE | 0.063 | 0.207 | 0.005 | 0.025 | 0.064 | 0.124 | 0.005 | 0.013 | 0.700 | 9.927 | 0.633 | 1.324 | 0.483 | 4.205 | 0.451 | 0.744 |
maxAE | 0.175 | 0.645 | 0.016 | 0.066 | 0.178 | 0.341 | 0.018 | 0.038 | 3.962 | 29.219 | 3.010 | 8.481 | 1.706 | 11.965 | 1.857 | 3.319 |
. | Total energy . | Atomization energy . | ||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
cc-pVTZ . | def2-TZVP . | cc-pVTZ . | def2-TZVP . | |||||||||||||
HF exch. . | MP2 corr. . | HF exch. . | MP2 corr. . | HF . | MP2 . | HF . | MP2 . | |||||||||
RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | RI-V . | RI-RS . | |
ME | −0.019 | 0.132 | −0.003 | −0.004 | −0.022 | 0.015 | −0.002 | −0.010 | −0.490 | −8.568 | 0.045 | −0.274 | −0.145 | −2.997 | 0.146 | 0.531 |
MAE | 0.063 | 0.207 | 0.005 | 0.025 | 0.064 | 0.124 | 0.005 | 0.013 | 0.700 | 9.927 | 0.633 | 1.324 | 0.483 | 4.205 | 0.451 | 0.744 |
maxAE | 0.175 | 0.645 | 0.016 | 0.066 | 0.178 | 0.341 | 0.018 | 0.038 | 3.962 | 29.219 | 3.010 | 8.481 | 1.706 | 11.965 | 1.857 | 3.319 |
For the total energies, we report the errors for the HF exchange [Eq. (14)] in Fig. 4(a) and the MP2 correlation energy [Eq. (15)] in Fig. 4(b). The errors for the MP2 correlation energy are an order of magnitude smaller, for both RI-V and RI-RS, which is expected since the absolute values of and differ by an order of magnitude as well. The RI-V scheme reproduces the non-RI reference almost exactly, with MAEs of 0.063 meV/electron and 0.005 meV/electron for and , respectively. Previously published RI-V results for the Thiel benchmark set are similar: Duchemin and Blase48 reported MAEs of 0.103 meV/electron for and 0.030 meV/electron for with respect to the non-RI reference. Our RI-V results are closer to the non-RI reference, which we attribute to the automatic ABF generation in FHI-aims that provides a more complete RI basis set compared to the pre-optimized ones. For cc-pVTZ, the number of auto-generated ABFs is approximately a factor of two larger than the pre-optimized ones used in Ref. 48.
As evident from Figs. 4(a) and 4(b), the RI-RS errors oscillate around the non-RI reference without a clear trend for an over- or underestimation. The RI-RS MAEs and maxAEs of the total energies are ∼4–5 times larger than the RI-V ones. Reference 48 reported only a factor of 1.5–2. One possible cause is the neglect of the rotational non-invariance of the RI-RS approach. Unlike NAOs or GTOs, the {rk} grid is not rotational invariant, which was accounted for in the previous work by averaging over 40 random orientations of each molecule in the Thiel set.48 However, the effect is small. The RI-RS MAEs reported with respect to the non-RI results were 0.171 meV/electron and 0.0435 mev/electron in Ref. 48. Our RI-RS MAEs are very similar or even smaller (0.207/0.025 meV/electron for HF/MP2) because our RI-V reference is closer to the non-RI result. This demonstrates that the influence of other parameters is larger, such as improving an already good RI basis set, justifying to neglect the rotational non-invariance.
Turning from total to relative energies, we compute the atomization energies according to Eq. (44). For MP2, we compute the total exchange and correlation energy Et in Eq. (44) with Eq. (42): the HF ground state energy is calculated with RI-V, while the MP2 correlation energy is computed either with RI-V or RI-RS. The reason for keeping RI-V for the HF contribution is that is an order of magnitude larger than and would dominate the RI-RS error in the MP2 atomization energies, making the assessment of the RI-RS performance for correlated methods difficult. We note that keeping RI-V in the HF exchange is also a reasonable choice in a low-scaling RI-RS implementation of a correlated method, where the purpose of the RI-RS scheme would be the scaling reduction in the calculation of the correlation energy and not in the calculation of the Fock exchange. Since we have to compute the RI-V expansion coefficients also in RI-RS, using them in instead of their RI-RS counterparts keeps the absolute error as small as possible.
The RI-V and RI-RS atomization energy errors with respect to the non-RI reference are shown for HF and MP2 in Figs. 4(c) and 4(d), respectively. The RI-V approach yields again excellent results with MAEs meV and maxAEs meV for both, HF and MP2. The RI-RS MAEs are 14 times larger for HF, but only a factor of 2 for MP2. For HF, RI-RS tends to slightly underestimate the non-RI reference data, while the MP2 errors oscillate around the reference line following the RI-V curve. The largest RI-RS HF and RI-RS MP2 errors are 29 and 8 meV, respectively. Setting the generally accepted threshold of 1 kcal/mol (=4 kJ/mol, = 43 meV) for chemical accuracy, we find that our RI-RS errors are well within this threshold.
We repeated the non-RI, RI-V and RI-RS calculations with the def2-TZVP basis set. The results are reported in Table I and Fig. S2 (supplementary material). For the RI-V results, we get almost the same errors for the total and atomization energies as with cc-pVTZ. This is not surprising since the ABF basis set for def2-TZVP is also generated automatically in FHI-aims and is thus very complete. However, the RI-RS errors with respect to the non-RI reference are noticeably smaller than for cc-pVTZ. The MAEs and maxAEs for the total HF and MP2 energies are reduced by a factor of two. Likewise, the MAEs and maxAEs for the atomization energies are 2–3 times smaller. For example, the MAE/maxAE for the HF atomization energy are 10/29 meV for cc-pVTZ, but only 4/12 meV for def2-TZVP. The reduction of the RI-RS errors might be explained with the more sophisticated procedure used to generate the def2-TZVP real-space grids. While the {rk} grids for cc-pVTZ and def2-TZVP are of similar size, the location of the def2-TZVP grids points is, unlike for the cc-pVTZ grids, not restricted to the surface of the shells of the Lebedev grids. Our results indicate that a more flexible arrangement of the real-space grid points is key to reduce the RI-RS error without increasing the computational cost.
B. Thiel benchmark for correlated methods
In this section, we extend our benchmarking of the RI-RS approach to CCSD, RPA, SOSEX, rSE, rPT2, and G0W0. We compare from now on to RI-V, which reproduces the non-RI results almost exactly, in particular in combination with our large auto-generated auxiliary basis sets as shown in Sec. V A. To the best of our knowledge, RI-RS benchmarks for CCSD, SOSEX, rSE, and rPT2 have not been presented previously, while RI-RS results have been published for low-scaling RPA48 and low-scaling G0W049 implementations. Note that we discuss here the accuracy of RI-RS in conventional RPA and G0W0 with O(N4) scaling, excluding any other sources for possible deviations caused by a low-scaling reformulation. The errors reported in the following are exclusively due to changing the RI scheme from RI-V to RI-RS.
In Figs. 5(a)–5(d), we report total (correlation) energies from CCSD, RPA, SOSEX and rSE. Overall, the trend and magnitude of the errors are similar to the MP2 correlation energies discussed in Sec. V A. The errors are all within ±0.050 meV/electron. Only one molecule (No.20, acetone), shows a larger error of 0.084 meV/electron in the RPA correlation [see Fig. 5(b) and Table II]. The same molecule causes also the largest error (0.030 meV/electron) in the SOSEX energy.
RI-RS errors with respect to RI-V for the total correlation energy (meV/electron) from (a) CCSD, (b) RPA, (c) SOSEX, and (d) rSE and for the atomization energy (meV) from (e) CCSD, (f) RPA, and (g) rPT2, as well as for (h) the G0W0@PBE correction of the HOMO level (meV). The errors are defined as ERI−RS − ERI−V.
RI-RS errors with respect to RI-V for the total correlation energy (meV/electron) from (a) CCSD, (b) RPA, (c) SOSEX, and (d) rSE and for the atomization energy (meV) from (e) CCSD, (f) RPA, and (g) rPT2, as well as for (h) the G0W0@PBE correction of the HOMO level (meV). The errors are defined as ERI−RS − ERI−V.
RI-RS mean errors (ME), mean absolute errors (MAEs) and maximum absolute error (maxAE) with respect to RI-V for total correlation energies from CCSD, RPA, SOSEX, rSE (meV/electron) and for atomization energies from CCSD, RPA, rPT2 (meV), as well as for the HOMO energies from G0W0 (meV). The errors are defined as ERI−RS − ERI−V.
. | Total correlation energy . | ||||
---|---|---|---|---|---|
MP2 . | CCSD . | RPA . | SOSEX . | rSE . | |
ME | −0.001 | −0.002 | −0.002 | −0.001 | −0.005 |
MAE | 0.024 | 0.015 | 0.025 | 0.007 | 0.015 |
maxAE | 0.074 | 0.042 | 0.084 | 0.030 | 0.035 |
Atomization energy | G0W0 | ||||
MP2 | CCSD | RPA | rPT2 | HOMO | |
ME | −0.319 | −0.228 | −0.225 | −0.261 | 0.157 |
MAE | 1.113 | 0.744 | 1.103 | 0.385 | 0.950 |
maxAE | 5.471 | 2.799 | 3.563 | 1.151 | 4.600 |
. | Total correlation energy . | ||||
---|---|---|---|---|---|
MP2 . | CCSD . | RPA . | SOSEX . | rSE . | |
ME | −0.001 | −0.002 | −0.002 | −0.001 | −0.005 |
MAE | 0.024 | 0.015 | 0.025 | 0.007 | 0.015 |
maxAE | 0.074 | 0.042 | 0.084 | 0.030 | 0.035 |
Atomization energy | G0W0 | ||||
MP2 | CCSD | RPA | rPT2 | HOMO | |
ME | −0.319 | −0.228 | −0.225 | −0.261 | 0.157 |
MAE | 1.113 | 0.744 | 1.103 | 0.385 | 0.950 |
maxAE | 5.471 | 2.799 | 3.563 | 1.151 | 4.600 |
We then turn to the RI-RS errors of the atomization energies from CCSD, RPA, and rPT2 displayed in Figs. 5(e)–5(g). Following the same reasoning as for the MP2 atomization energies, we compute the Fock exchange in the total exchange and correlation energies [Eq. (43)] always with RI-V and only the correlation parts with RI-V or RI-RS. The errors are within 4 meV for CCSD and RPA and even within 1.2 meV for rPT2, which is an order of magnitude smaller than the threshold required for chemical accuracy. The MAEs are smaller or around 1 meV, which confirms the excellent accuracy of the RI-RS approach over a wide range of correlated methods.
Finally, we present G0W0 QP energies for the highest occupied molecular orbital (HOMO) in Fig. 5(h). The errors are within 5 meV, and the MAE is meV (see also Table II). The error introduced by RI-RS is thus negligible for GW, where our target accuracy is usually 50 meV for comparisons between different codes. Better agreements can be reached using the same basis set, even if the GW implementations differ significantly. For example, an MAE of 3 meV has been reported comparing GW100 benchmark results of the HOMO from FHI-aims and turbomole with the def2-QZVP Gaussian basis set.113 The RI-RS error is still three times smaller.
C. QP energies for GW100 test set and carbon clusters
For a more comprehensive assessment of the RI-RS approach, we applied it to the GW100 benchmark set. While the Thiel set comprises only organic molecules, the GW100 test set also contains inorganic molecules, noble gases, and generally also heavier elements, covering larger regions of the chemical space. As in Sec. V B, we compare the RI-RS results to the RI-V reference and use RI-RS in combination with a conventional GW implementation. The QP energy errors for the HOMO and the lowest unoccupied molecular orbital (LUMO) are shown in Fig. 6 and Table III. As for the Thiel set, we obtain excellent results: The RI-RS errors are tightly centered around zero. The MAE is in the sub-meV range and the maxAE is below 5 meV. The errors for the LUMO are smaller, which must be attributed to the fact that their absolute values are for most systems smaller than for the HOMO.
Histograms of QP energy errors of RI-RS with respect to RI-V at the G0W0@PBE0 level using the def2-TZVP basis. The error is defined as ɛRI−RS − ɛRI−V. The corresponding box plots are shown below the histograms.
Histograms of QP energy errors of RI-RS with respect to RI-V at the G0W0@PBE0 level using the def2-TZVP basis. The error is defined as ɛRI−RS − ɛRI−V. The corresponding box plots are shown below the histograms.
RI-RS mean errors (ME), mean absolute errors (MAEs), and maximum absolute error (maxAE) with respect to RI-V for QP energies from G0W0@PBE0 for the GW100 benchmark set and from G0W0@PBEh(α = 0.45) for amorphous carbon clusters (meV). The errors are defined as ɛRI−RS − ɛRI−V.
. | QP energy . | |||
---|---|---|---|---|
GW100 . | Carbon clusters . | |||
HOMO . | LUMO . | HOMO . | LUMO . | |
ME | 0.222 | −0.006 | −0.140 | −0.110 |
MAE | 0.562 | 0.212 | 0.480 | 0.180 |
maxAE | 4.500 | 4.700 | 1.300 | 0.600 |
. | QP energy . | |||
---|---|---|---|---|
GW100 . | Carbon clusters . | |||
HOMO . | LUMO . | HOMO . | LUMO . | |
ME | 0.222 | −0.006 | −0.140 | −0.110 |
MAE | 0.562 | 0.212 | 0.480 | 0.180 |
maxAE | 4.500 | 4.700 | 1.300 | 0.600 |
The GW100 benchmark was previously49 performed with a low-scaling GW implementation based on RI-RS and the space–time method, using the same primary basis set, real-space grids, and starting point. The MAEs reported in Ref. 49 (0.21 meV for HOMO and 0.09 meV for LUMO) are 2–3 smaller than ours (see Table III). It might seem surprising that the low-scaling RI-RS GW implementation,49 which is also affected by other error sources, such as the time–frequency transformations, yields smaller errors than our RI-RS implementation with conventional scaling. The neglect of the rotational invariance can be excluded as possible reason since it was also not accounted for in Ref. 49. The most likely cause are the different RI-V references: Ref. 49 uses pre-optimized ABFs for RI-V, while our ABFs are generated automatically. A more detailed analysis shows that our auto-generated ABF set is larger and contains more diffuse functions with higher maximal angular momentum number. Therefore, our RI-V reference is closer to the RI-free results, as already discussed in Sec. V A. However, it might pose a greater challenge to reproduce our RI-V reference with RI-RS, leading to slightly larger RI-RS errors.
Comparing our QP-HOMO results for the Thiel and GW100 benchmark, we find that the MAE is 1.7× smaller for the latter (0.95 vs 0.56 meV). This must be attributed to the quality of the {rk} grids: For the Thiel benchmark, we used the cc-pVTZ basis set and for the GW100 the def2-TZVP basis set, where the corresponding grids were generated with a more refined procedure than for cc-pVTZ. In combination with the findings in Sec. V A, the GW100 results support the assumption that lifting symmetry constraints for the real-space grids improves the accuracy of the RI-RS approach.
So far, the discussion has been restricted to small molecules. We now expand our benchmarking efforts to QP energies of large, three-dimensional amorphous carbon clusters, motivated by our primary goal to use the RI-RS approach for low-scaling GW algorithms. We selected 10 clusters, which were part of our GW training set used to develop machine-learning models for the prediction of x-ray photoemission spectra of disordered carbon materials. Unlike in our previous work,100 where we computed the carbon 1 s excitation of the central carbon atom in each cluster, we present here the HOMO and LUMO energies. We selected amorphous carbon clusters of different sizes, ranging from 29 atoms to 116 atoms, which corresponds to 646 and 2360 basis functions, respectively. Three of the cluster are exemplarily depicted in Fig. 7. Five clusters have the elemental composition CHO, while the other five contain only carbon and hydrogen.
RI-RS error for the QP energies of amorphous carbon clusters. The QP error is defined as ɛRI−RS − ɛRI−V. The results were obtained at the G0W0@PBEh(α = 0.45) level.
RI-RS error for the QP energies of amorphous carbon clusters. The QP error is defined as ɛRI−RS − ɛRI−V. The results were obtained at the G0W0@PBEh(α = 0.45) level.
The RI-RS error of the QP energies is shown in Fig. 7. The amorphous carbon structures are spin-polarized and the error is thus depicted for each spin channel separately. For the HOMO, we find that the error is meV for 8 of the ten structures and the maxAE is only 1.3 meV. The error for the LUMO is even smaller with a maxAE of only 0.6 meV. The MAE and ME are averaged over both spin channels and shown in Table III. Both errors are smaller or in the same range compared to MAEs and MEs for the Thiel benchmark and the GW100 set.
D. Performance
Replacing RI-V in a conventional implementation by RI-RS, as shown in Eq. (40), actually increases the computational cost. The purpose of this paper is to assess the accuracy of the RI-RS approach, but the goal is to utilize the RI-RS idea to formulate accurate low-scaling algorithms for correlated methods. For the development of low-scaling algorithms, it is essential to monitor the overhead of the algorithm, i.e., the prefactor introduced by the additional operations in the computation. A large prefactor in the low-scaling algorithm might otherwise shift the crossover point, where the low-scaling implementation becomes computationally more efficient than the conventional implementation, to large system sizes, rendering the low-scaling approach in the worst case useless. The computation of the RI-RS expansion coefficients is a step, which introduces such an additional prefactor compared to the conventional implementation: in addition to the computation of the RI-V expansion coefficients , a least-square fitting is performed, which involves matrix–matrix multiplications and a matrix inversion.
Figure 8 shows the CPU time for computing the RI-RS expansion coefficients for linear alkane chains CnH2n+2 with n = 4–10. The RI-RS time includes the computation of the RI-V expansion coefficients, which dominates the timings. Only 6%–8% of the total time is spent in calculating the coefficients Mμk given in Eq. (38) (fitting step). The total time to compute also includes step 7 in Fig. 2. However, for cubic-scaling RPA and GW methods, the computation of is not required, we only need Mμk.48,49 Compared to a canonical implementation, we expect therefore a small overhead of less than 10% in the RI step.
CPU timings (h) for computing the RI-V and RI-RS expansion coefficients for linear alkane chains. The computation of the RI-RS expansion coefficients includes the time for constructing the RI-V expansion coefficients and the computation of Mμk given in Eq. (38). To guide the eye, the data points are connected by lines.
CPU timings (h) for computing the RI-V and RI-RS expansion coefficients for linear alkane chains. The computation of the RI-RS expansion coefficients includes the time for constructing the RI-V expansion coefficients and the computation of Mμk given in Eq. (38). To guide the eye, the data points are connected by lines.
VI. CONCLUSIONS
We presented an implementation of the RI-RS approach in an NAO framework. We benchmarked the accuracy of the RI-RS scheme for HF, MP2, CCSD, RPA, SOSEX, rSE, rPT2, and G0W0 by replacing the RI-V expansion coefficients with the RI-RS ones in conventional algorithms of the methods. While this strategy does not result in a speed-up of the calculation, it allows a direct assessment of the RI-RS errors.
We computed total and atomization energies for the 28 molecules of the Thiel benchmark set. We presented QP energies at the G0W0 level for the Thiel and GW100 benchmark sets as well as large amorphous carbon clusters up to 116 atoms. We showed that RI-V in combination with our auto-generated ABFs reproduces the non-RI calculation de facto exactly and thus set RI-V as reference for our RI-RS benchmarks. We found that RI-RS reproduces the RI-V results for atomization energies from MP2, CCSD, RPA, and rPT2 as well as G0W0 QP energies on average within 1 meV or better. The RI-RS error of the HF atomization energies is as expected one order of magnitude larger. We also showed that the RI-RS error remains in the sub-meV range, independent of the system size.
We also found that the RI-RS scheme only introduces a small additional computational prefactor of less than 10% in the RI step. This highlights the potential of the RI-RS approach for deriving highly accurate and computationally efficient low-scaling algorithms of correlated electronic-structure methods. Future and on-going work comprises the extension of RI-RS to periodic systems as well as the development of the actual low-scaling algorithms. Modifications of the scheme, such as fitting the RI-RS expansion coefficients to other RI flavors than RI-V, are also explored to increase the computational efficiency. A tool to generate the real-space grid points {rk} on the fly or with little effort prior to the calculations is also necessary to exploit the potential of the RI-RS scheme.
SUPPLEMENTARY MATERIAL
See the supplementary material for the computed total energies (Tables S1–S5 and S10–S12), atomization energies (Tables S6, S7, and S13), and G0W0 energies (Table S8) for the Thiel set, the GW100 set (Table S14) and amorphous carbon clusters (Table S9) as well as further details on the RI-RS workflow (Sec. S5). A plot of the def2-TZVP real-space grids is shown for ethylene in Fig. S1. RI-RS errors for HF and MP2 total and atomization energies employing the def2-TZVP basis set are plotted in Fig. S2.
ACKNOWLEDGMENTS
The authors acknowledge the European Union’s Horizon 2020 Research and Innovation Program for financial support under Grant No. 951786 (Nomad Center of Excellence) and the Horizon Europe MSCA Doctoral Network Grant No. 101073486 (EUSpecLab). D.G. acknowledges funding by the Emmy Noether Program of the German Research Foundation (Project No. 453275048). We wish to acknowledge CSC–IT Center for Science (Finland), the Jülich Supercomputer Center (Germany), and Aalto Science-IT project for computational resources.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Francisco A. Delesma: Data curation (equal); Formal analysis (equal); Investigation (equal); Writing – original draft (equal). Moritz Leucke: Data curation (supporting); Investigation (supporting); Writing – review & editing (supporting). Dorothea Golze: Conceptualization (equal); Formal analysis (equal); Supervision (equal); Writing – review & editing (equal). Patrick Rinke: Conceptualization (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are openly available in the NOMAD database, see Ref. 124.