Benchmarking the accuracy of the separable resolution of the identity approach for correlated methods in the numeric atom-centered orbitals framework

Four-center two-electron Coulomb integrals routinely appear in electronic structure algorithms. The resolution-of-the-identity (RI) is a popular technique to reduce the computational cost for the numerical evaluation of these integrals in localized basis-sets codes. Recently, Duchemin and Blase proposed a separable RI scheme [J. Chem. Phys. 150, 174120 (2019)], which preserves the accuracy of the standard global RI method with the Coulomb metric (RI-V) and permits the formulation of cubic-scaling random phase approximation (RPA) and $GW$ approaches. Here, we present the implementation of a separable RI scheme within an all-electron numeric atom-centered orbital framework. We present comprehensive benchmark results using the Thiel and the GW100 test set. Our benchmarks include atomization energies from Hartree-Fock, second-order M{\o}ller-Plesset (MP2), coupled-cluster singles and doubles, RPA and renormalized second-order perturbation theory as well as quasiparticle energies from $GW$. We found that the separable RI approach reproduces RI-free HF calculations within 9 meV and MP2 calculations within 1 meV. We have confirmed that the separable RI error is independent of the system size by including disordered carbon clusters up to 116 atoms in our benchmarks


I. INTRODUCTION
The accurate computation of molecular or materials' properties that derive from the electronic structure is important in quantum chemistry and materials science.Kohn-Sham density functional theory 1,2 (KS-DFT) is the most popular electronic-structure method to date, providing a reasonable compromise between accuracy and computational efficiency. 3However, KS-DFT faces several limitations, including the correct description of strong correlation, 4 vander-Waals (vdW) interactions 4,5 or excited-state properties. 6arious methods have been explored to overcome or mitigate the short-comings of KS-DFT.0][11] Examples for MBPT schemes are second order Møller-Plesset (MP2) perturbation theory, 12 the random phase approximation 13,14 (RPA), the GW approximation to Hedin's equations, 15 renormalized second-order perturbation theory (rPT2), which includes next to RPA the renormalized single excitation 16 (rSE) and second-order screened exchange 17,18 (SOSEX).
Four-center two-electron repulsion integrals (4c-ERIs) appear in all of the mentioned methods and their direct evaluation scales O(N 4 ) with respect to basis set size N and thus with system size.Resolution-of-the-identity (RI), also known as density fitting, is a popular technique to alleviate the computational demands for the calculation of the 4c-ERIs within a localized-basis-set framework, [19][20][21][22] which is typically used in the quantum chemistry community.RI expands the product of two atomic orbitals (AOs), ϕ i ϕ j , in a linear combina-tion of auxiliary basis functions (ABFs) and replaces the exact evaluation of the four-center Coulomb interactions with the Coulomb interactions of a set of ABFs.The most popular and accurate RI flavor is the global Coulomb-fitting scheme RI-V, where the 4c-ERIs are refactored into two-and threecenter Coulomb integrals. 23Even though the number of ABFs is at least two times larger than the corresponding AO basis sets, 21,22 the refactoring reduces the memory demand and the number of required operations considerably.The RI integrals can be evaluated analytically or numerically depending on the AO type.Analytic schemes are used for, e.g., Gaussiantype orbitals (GTOs), [24][25][26][27][28][29] while integrals over numeric atomcentered orbitals (NAOs) are computed on grids. 30ternative approaches to approximate the 4c-ERIs employ partial or full real-space quadrature strategies, regardless of the AO type.Examples are Friesner's pseudospectral method, 31 which involves the use of both numerical grids and analytical integrals, and the related semi-numerical integration scheme developed by Neese and coworkers. 32Another alternative approach to RI is the least square tensor hypercontraction (LS-THC) method, 33,34 which completely expresses the 4c-ERIs on real-space grids, introducing a high complexity of O(N 4−5 ).7][38] As in RI schemes, ISDF linearly expands the AO pairs ϕ i ϕ j , but uses a separable expression for the expansion coefficients.While the RI expansion coefficients entangle ϕ i (r)ϕ j (r) through the three-center integrals, the ISDF coefficients are defined as the product ϕ i (r k )ϕ j (r k ), where {r k } is a subset of the real-space grid {r}.The socalled interpolation points {r k } are selected with randomized QR decomposition or K-means clustering algorithms and The article is organized as follows.In Section II, we briefly review the theoretical background of the electronic-structure methods and the numerical techniques used.In Section III we present details of the implementation.Computational details are given in Section IV.In Section V, we discuss comprehensive benchmark results, including total, atomization and quasiparticle energies.Finally, we draw conclusions in Section VI.

II. THEORY
In this section, we summarize the RI-V formalism and the RI-V working equations for the HF exchange and the CCSD, MP2, RPA correlation energies.We also briefly review the theoretical background of the rPT2 and G 0 W 0 methodologies and their corresponding RI-V expressions.We then introduce the RI-RS scheme in detail.We restrict the presentation in Section II to the closed-shell case and drop the spin index.Throughout this discussion, we adopt the following indexing: p, q, ... denote general molecular orbitals (MOs), m, n, ... occupied (occ) MOs and a, b, ... unoccupied or equivalently virtual (virt) MOs.We use i, j, .. to index the functions of the primary basis set and P, Q, ... (in capital letters) are used to denote the ABFs.

A. Resolution of the identity
The common ingredient of all the methods discussed in this work are the 4c-ERIs, which are defined as where ψ p (r) are MO (also called single-particle orbitals).The MOs are expanded in terms of the basis functions {ϕ i (r)} where c i p are the MO coefficients.Inserting the expansion of Eq.( 2) into the Eq.( 1) yields where the two-electron integrals in the AO basis are defined as The RI approximation expands the product of two AO functions, ϕ i (r)ϕ j (r), in terms of a set of ABFs, namely where µ labels the ABFs, and {P µ } and C µ i j are the RI expansion coefficients.
The different RI techniques can be grouped into global and local.6][67][68] Given the pair ϕ i ϕ j , where ϕ i is centered at atom A and ϕ j at atom B, only RI basis functions with centers at A and B are used in local RI expansions.We note that we use global RI in this work and thus exclude local RI from the discussion in the following.
Inserting Eq. ( 5) into Eq.(4) yields the expression with the two-center Coulomb integrals Several ways to obtain the expansion coefficients, C µ i j , were proposed in the literature. 23One choice is to use the so-called overlap metric, i.e., to minimize the residual δ ρ i j = ρi j (r) − ρ i j (r) with respect to C µ i j by a simple least-square fit, yielding the RI-SVS approximation. 23Another criterion for obtaining C µ i j is to use the Coulomb metric, which implies minimizing the Coulomb repulsion of the density residual, (δ ρ i j |δ ρ i j ), yielding 19,20,23,69 where the three-center Coulomb integrals in Eq. ( 8) are given by Computing the expansion via Eq.( 8) is known as the RI-V approach.RI-V converges faster with respect to the size of the RI basis set than RI-SVS and is typically more robust and accurate than the latter. 19,20,23,69We therefore choose RI-V as reference for the RI-RS coefficients.Making use of the symmetry of V defined in Eq. ( 7), we can rewrite Eq. ( 6) as µ kl (10)   with Inserting Eq. ( 10) into Eq.( 1), we obtain the RI reformulation of the 4c-ERIs using the MO transformation of B. RI-V expression for different theoretical methods

Hartree-Fock theory
The HF method is fundamental to electronic structure theory.In HF, the electronic structure is described by a single Slater determinant and the energy is minimized by varying the one-electron orbitals using self-consistent field (SCF) algorithms.Evaluating the exact-exchange energy, E HF x , is the most expensive step in HF.The RI-V expression of E HF x is given by The exact exchange energy is recomputed in each SCF iteration.In practice, we first compute and store the three-center quantity M µ i j before the start of the SCF loop.In each iteration, the MO coefficients are updated and M µ i j is transformed to the MO basis using Eq. ( 13), which then yields O µ mn .The scaling of Eq. ( 14) formally remains O(N 4 ), 30 but is in practice usually lower.The dominating step, in particular in our NAO scheme, is the evaluation of the three-center AO integrals (Eq.( 9)), which scales O(N 3 ) for small to medium-sized systems and O(N 2 ) for large systems because the product ϕ i ϕ j becomes sparse.

Møller-Plesset perturbation theory
MP2 12 is widely used in quantum chemistry for adding correlation to the HF ground state. 70The MP2 correlation energy is obtained by adding a small perturbation to the HF Hamiltonian and then terminating the perturbative expansion at second order.The RI-V expression of the MP2 correlation energy for closed shell systems reads 21,22,71 where ε m/n and ε a/b are the orbital energies of the occupied and unoccupied states, respectively.The first term in Eq. ( 15) is the second-order Coulomb term, which is also referred to as direct MP2 term, and corresponds, in a diagrammatic representation, to the leading term in RPA.The second one is the second-order exchange energy, which appears in a dressed form in SOSEX. 5[74][75]

Coupled cluster theory
9][80] CCSD with perturbative triples, CCSD(T), is considered as one of the most accurate methods, which is still computationally affordable for larger molecules. 81However, we will restrict the discussion to CCSD since our goal is the numerical validation of RI-RS rather than providing the best possible computational reference.
The CCSD wave function is given by an exponential ansatz where T is the cluster operator, which generates a linear combination of excited determinants when acting on the HF single Slater determinant ground state.For CCSD, T is defined as where T1 generates singly-and T2 doubly excited determinants.In second quantization these operators read for closed shell systems: The operator Êa m contains the annihilation operator â and the creation operator â † : where Once the cluster amplitudes t a m and t ab mn are computed, the closed shell CCSD correlation energy in its RI-V expression is given by 81 The complexity of the described approach is O(N 6 ), rendering CCSD also a computationally costly approach.

Random phase approximation
The total RPA energy is given as sum of the HF energy and the RPA correlation energy E RPA c , which corresponds to the fifth rung of "Jacobs ladder" 82 in the context of KS-DFT, where LDA, GGA, meta GGAs and hybrid functionals are placed on rungs 1 to 4 in that order.In practice, RPA is typically performed on-top of a preceding KS-DFT calculation.This implies that we insert the KS-DFT orbitals and energies in the expressions for the exact HF exchange and E RPA c .In this post-processing mode, the total RPA energy is obtained by adding both terms to the KS-DFT total energy and subtracting the KS-DFT exchange-correlation contribution.
In a diagrammatic description, RPA expands the direct MP2 term to infinite order.Its RI-V expression reads for the closedshell case 30,48,83,84 with the irreducible polarizability in the ABFs representation 30 where ε m are again the KS-DFT orbital energies of the occupied states and ε a are the orbital energies of the unoccupied states, and c.c. denotes the complex conjugate.The RI-V reformulation of RPA reduces the scaling from O(N 6 ) to O(N 4 ), 30,83,85 which still restricts the accessible system sizes to less than 100 atoms.5]64,86 5. Renormalized second-order perturbation theory While long-range interactions are well described in RPA, short-range exchange and correlation are not sufficiently captured 5,87 and several schemes have been developed to go beyond RPA.Ren et al. 16 proposed to add the rSE correction to RPA, while Grüneis and coworkers 17 put forward an RPA + SOSEX approach.Adding both corrections (RPA + SOSEX + rSE) was investigated by Paier et al., 88 and later promoted as renormalized second-order perturbation theory (rPT2). 89Benchmark studies 88,89 of small inorganic and organic molecular systems showed that rPT2 is generally superior to RPA+SOSEX or RPA+rSE for reaction barrier heights and atomization energies.It has also been argued that rPT2 is a well-rounded approach from a diagrammatic perspective.The goldstone diagrams of RPA, SOSEX and rSE are distinct infinite summations, where each series sums up topologically similar diagrams.The leading terms, i.e., the first terms in the sum, are the second-order direct, second-order exchange and single excitation terms, respectively.The combination of the three leading terms is known as second-order Rayleigh-Schrödinger perturbation theory (RSPT).This is reflected in the name of the method: the infinite expansion in RPA+SOSEX+rSE can be understood as a renormalization of RSPT yielding the "renormalized second-order perturbation theory" or in short rPT2.
SOSEX originates from the wave-function theory and was first derived in the context of CC. 18,90 Diagrammatically, we add higher order exchange diagrams to RPA in the RPA+SOSEX scheme, which removes the correlation part of the one-electron self-interaction from the theory.The RI-V expression for the SOSEX correlation is given for the closed-shell case as 89 with the factors and the coupling-constant-averaged dielectric function is defined as The integration over λ in Eq. ( 26) is performed numerically.
In RSPT, the SE term is given by 91 where vHF is the HF and vMF the mean-field potential from, e.g., KS-DFT.f is the Fock operator, f = −∇ 2 /2 + vext + vHF and vext is the external potential (electron-nuclei interaction).The derivation of Eqs. ( 27) and ( 28) can be found in the SI of Ref. 16.It is obvious from Eq. 27 that the SE contribution is zero if vMF corresponds to the HF potential, which directly follows from Brillouin's theorem. 91he renormalization of the SEs was proposed to avoid divergence problems for systems with vanishing KS gaps.The rSE approximation was initially implemented approximately, 16 including only the diagonal terms in the infinite expansion.The rSE approach was later refined, 89 accounting now also for the off-diagonal elements.The refined version is the one we use in this work.The procedure relies on a subspace diagonalization as follows: we first evaluate the occupied block f mn and unoccupied f ab of the Fock matrix with the KS single-particle orbitals {ψ p }, i.e., The two blocks f mn and f ab are then diagonalized separately which yields two new sets of eigenvalues εm/a and orbitals ψm/a .The rSE correlation energy is then obtained by replacing the KS energies and orbitals in Eq. ( 28) with the ones obtained after the subspace diagonalization yielding The rSE correction was first investigated in the context of RPA, 5,16,88,89 but has been also explored for quasiparticle energies from GW [92][93][94] and the T −matrix approximation, 94 charge-neutral excitations from BSE@GW 95 and dissociation curves from multireference methods. 96

GW quasiparticle energies
The GW approximation to Hedin's equations 15 is a widespread method for the computation of photoemission spectra of molecules and solids. 6][99][100][101][102][103] Central to GW is the self-energy Σ, which is obtained from the Green's function G and the screened Coulomb interaction W .The poles of G are the quasiparticle energies (QPs), which can be directly linked to the charged excitations measured in photoemission spectroscopy.The most common flavor of GW is the single-shot G 0 W 0 approach, where the QP energies are obtained as correction to the KS-DFT eigenvalues via the QP equation: with Σ p = ⟨ψ p |Σ|ψ p ⟩ and v xc p = ⟨ψ p |v xc |ψ p ⟩ and where v xc denotes the exchange-correlation potential from KS-DFT.The evaluation of the self-energy correction term Σ p (ω) involves a frequency integration over G 0 and W 0 , which can be performed with different techniques. 6The RI-V approach has been used in several frequency integration techniques, such as the fully-analytic approach 84,104 , contour deformation 97,101,103,105 and analytic continuation. 30,105,106e use here the analytic continuation, where we calculate the self-energy for a set of imaginary frequencies {i ω}.The RI-V expression for Σ p (i ω) reads 30 where ε F is the Fermi energy and Π µν is given in Eq. ( 23).
The self-energy is then analytically continued from the imaginary to the real-frequency axis before the QP energies are calculated by Eq. ( 30).The evaluation of Eq. ( 31) scales O(N 4 ).

C. Separable Resolution of the Identity
In RI-V, the atomic orbital product ϕ i (r)ϕ j (r) is entangled through the three-center Coulomb integrals in the expansion coefficients C µ i j (Eq.( 8)).The idea of the separable RI or equivalently RI-RS is to disentangle this product, similarly as in ISDF, by the following separable expression 48 [C where k runs over a given set of real-space grid points {r k }.
In Eq. ( 32), the orbital basis functions, ϕ i (r k ), are evaluated for each r k grid point and the contraction coefficients Z kµ are obtained by employing the minimization condition: argmin The reference for the RI-RS approach, as proposed by Duchemin and Blase, 48 is RI-V, but could be in principle replaced by another RI flavor.In this work, we keep the RI-V reference due to its known high accuracy.With D k i j defined as follows Eq. ( 33) can be written as argmin which yields Following our notation for RI-V in Eq. ( 11), we obtain for the AO-based three-center quantity M µ i j in RI-RS: with The MO transformation of [M µ i j ] RI-RS yields the RI-RS version of Eq. ( 13): which replaces the three-center quantity O µ pq in the RI-V expressions of the methods introduced in Section II B. For example, the Fock exchange energy in RI-RS becomes The RI-RS expressions for MP2, CCSD, RPA, SOSEX, rSE and GW are obtained analogously.

III. IMPLEMENTATION DETAILS
The separable-RI approach has been implemented in the all-electron program package FHI-aims, 107 which expands the MOs in NAO basis functions {ϕ i }.The NAOs have the following form where u i are radial functions and Y lm (Ω) spherical harmonics.
The radial part of the NAOs is fully flexible and not restricted to a particular shape.Popular local basis functions such as GTOs or Slater-type orbitals (STOs) present special forms of an NAO and are numerically tabulated when provided as input to FHI-aims.
The ABFs are also NAOs in FHI-aims.Most localizedbasis set codes use pre-optimized ABFs obtained via variational procedures 21 whereas the default procedure in FHIaims is to compute them on-the-fly during run-time.The ABFs are obtained by generating on-site products of the primary basis functions.Linear dependencies are removed with a Gram-Schmidt orthogonalization procedure.More details can be found in Ref. 30.
The real-space grid points {r k } are pre-optimized for the primary basis set of each atomic species.We used in this work the {r k } grids developed by Duchemin and Blase, 48,49 which was so far restricted to the cc-pVTZ and def2-TZVP basis sets.The atomic grids are constructed for each atom type by different combinations of four base shells, which are subsets of Lebedev quadrature grids. 108Each base shell is replicated for each species with different radii.The number of repetitions of each base shell and the corresponding radius were optimized such that 2 is minimized for the isolated atom case. 48This procedure was followed for the grids optimized for usage with the cc-pVTZ basis set. 48In later work by Duchemin and Blase, 49 the procedure was further refined by a subsequent optimization step, where the grid points are allowed to move freely without being constrained by any symmetry condition.The grids for the def2-TZVP basis sets were generated by this more sophisticated approach.
Figure 1 shows the r k grid points for the cc-pVTZ basis set for a) a hydrogen atom, b) a carbon atom and c) the ethylene molecule.We observe that the grids are very compact, which is essential for low-scaling algorithms with small prefactors.The number of grid points is only approximately 10 times larger than the number of primary basis functions.For the cc-pVTZ grids used in this work, we have ≈ 310 points for each C, N, and O atom and 167 points for the H atom.The {r k } Algorithm: 2 FIG. 2. Algorithm for the computation of separable-RI fitting coefficients grids for def2-TZVP are of similar size: 136 points for H, 100 for He and 336 for second-row elements (C, N, O etc.). 49The number of points increases to 436 and 536 points for third and fourth-row elements, respectively. 49The system-specific grid points are constructed as shown in Figure 1c: the real-space grid for the ethylene molecule is formed by a superposition of the r k points of each hydrogen and carbon atom.The grids for the def2-TZVP basis are shown in Figure S1 (Supporting Information), which demonstrates that the arrangement of the grids points is significantly less symmetric as the one shown for the cc-pVTZ grids in Figure 1.
The pseudocode for our RI-RS implementation in FHI-aims is summarized in Figure 2. We start by computing the coefficients M µ i j .The two-center Coulomb integrals V µν , Eq. ( 7), are computed with logarithmic Bessel transforms 109 in Fourier space. 110,111The evaluation of the three-center integrals, Eq. ( 9), is performed in real space using overlapping atom-centered spherical grids; see also Ref. 30 for more details on the NAO integration procedures.In the next step, we build the real-space grids r k from the provided specifications of the base shells and corresponding radii (cc-pVTZ grids) or directly read them from an external file (def2-TZVP grids).In step 3, the product of two basis functions, D k i j , is computed for each grid point r k .We only include pairs i j, where the atomic orbitals ϕ i and ϕ j have a significant overlap of their spatial extension.To simplify the computation of D k i j , we form the combined index ρ ≡ i j yielding a matrix D of dimension N ρ × N k , where N ρ is the number of pairs and N k the number of grid points.In steps 4 and 5, we build the auxiliary matrices A and B, respectively and invert A. We found that the inversion is numerically stable without L 2 regularization, which was proposed in Ref. 49.The quantity M kµ , Eq. (38), is obtained by matrix multiplication of A −1 and B. Finally, the RI-RS fitting coefficients (in the AO basis) are obtained from Eq. (37).Algorithm 2 shows that the computation of the RI-RS coefficients is straightforward, only requiring matrix multiplications and one inversion.

IV. COMPUTATIONAL DETAILS
For the validation of the RI-RS implementation, we used the Thiel set, 112 the GW100 benchmark set 113 and 10 of the armorphous carbon clusters from our previous work. 100The Thiel set comprises 28 small organic molecules composed of the elements C, N, O and H.The whole benchmark set is shown in Figure 3.We used the MP2/6-31-G(d) geometries supplied in Ref. 112 by Thiel and collaborators.The GW100 benchmark set contains 100 molecules, encompassing a broad spectrum of elements from the periodic table.We used the structures reported in the original GW100 benchmark study 113 and exclude the seven molecules, which contain 5 th period elements (Xe, Rb 2 , I 2 , C 2 H 3 I, CI 4 , AlI 3 , Ag 2 ).For these seven molecules, optimized real-space grid points {r k } are currently still lacking.The amorphous carbon clusters were obtained by carving clusters from periodic structures of three-dimensional disordered carbon materials with the elemental composition CHO.Dangling C-C bonds were passivated with hydrogen, see Ref. 100 for more details.
We used Dunning's cc-pVTZ 114 basis sets for all calculations with the Thiel benchmark set and for the amorphous carbon clusters.We repeated the HF and MP2 calculations for the Thiel benchmark set with the def2-TZVP basis set. 115We also used the def2-TZVP basis set for the GW100 benchmark.Both basis sets are spherical GTOs.For the RI-RS calculations, we used the atomic grids {r k } specifically optimized for the respective basis set.For cc-pVTZ, we utilized the grids published in the SI of Ref. 48, while for def2-TZVP, we used the grids employed in Ref. 49 and published in Ref. 116.
We performed HF and MP2 calculations without RI with the NWChem code 117 to obtain exact 4c-ERIs references values.The NWChem default settings are otherwise used in these calculations.HF, MP2, CCSD, RPA, SOSEX, rSE, rPT2 and GW are exclusively implemented with RI in the FHI-aims code.Non-RI versions of these methods are not implemented.We show in the following results from RI-V and RI-RS.For the automatic generation of the ABFs, we used a threshold of 1.0 × 10 −10 for the basis-basis product screening.The Perdew-Burke-Ernzerhof (PBE) 118,119 exchangecorrelation functional was used for the preceding KS-DFT calculations in RPA, SOSEX, rSE, rPT2 and G 0 W 0 with the Thiel benchmark set.The GW100 calculations employed the PBE0 120,121 hybrid functional as starting point, while the G 0 W 0 calculations of the amorphous carbon clusters started from a PBE-based hybrid (PBEh) 122 functional with 45% of exact exchange α denoted as PBEh(α=0.45).The frequency integral of the GW self-energy (Eq.( 31)) was computed using a modified Gauss-Legendre grid 5 with 200 imaginary frequency points.The self-energy Σ n (i ω) was computed for the same set {i ω} and then analytically continued to the real axis employing a Padé model 123 with 16 parameters.The QP equation (Eq.( 30)) was solved iteratively.The SOSEX frequency integral in Eq. ( 24) was evaluated numerically using also a modified Gauss Legendre grid with 200 points.The integration over λ in SOSEX (Eq.( 26)) was computed using a Gauss-Legendre quadrature with 5 points.We made the input and output files of all the FHI-aims calculations (RI-V and RI-RS) available in the NOMAD database. 124e computed the correlation energies (E c ) from MP2, CCSD, RPA, SOSEX, rSE and rPT2 and the HF exchange with RI-V or RI-RS.We also computed MP2, CCSD, RPA and rPT2 total energies (E t ).The MP2 and CCSD total energies were computed as where E HF RI-V gs is the HF ground state energy.The RPA and rPT2 total energies were calculated as where E PBE gs is the KS-DFT ground state with PBE, E PBE xc is the PBE exchange-correlation energy and E HF RI-V x the HF exchange based on RI-V.Note that we computed E HF gs and E HF x in Eqs. ( 42) and ( 43) always with RI-V, also when RI-RS is used for E c .We used the total energies to compute atomization energies ∆E a for molecules with the general formula where is the total energy of the molecule and E t (C), E t (N), E t (O) and E t (H) is the total energy of the carbon, nitrogen, oxygen and hydrogen atom, respectively.Since we are interested in a purely numerically comparison of RI approaches, we omitted corrections for the basis set superposition error (BSSE).We note that such BSSE corrections are necessary for a comparison to computational and experimental reference values.

A. Comparison to non-RI reference
We start the discussion by comparing our RI-V and RI-RS results to the exact 4c-ERIs reference (non-RI) obtained from NWChem.We restrict this comparison to HF and MP2.The errors with respect to the non-RI values are displayed in Figures 4 a and b for the total and in c and d for the atomization energies, respectively.The corresponding mean error (ME), mean absolute error (MAE) and maximum absolute error (maxAE) are given in Table I.The discussion refers, unless otherwise noted, to the cc-pVTZ results.
For the total energies, we report the errors for the HF exchange E HF x (Eq.( 14)) in Figure 4a and the MP2 correlation energy E MP2 c (Eq. ( 15)) in Figure 4b.The errors for the MP2 correlation energy are an order of magnitude smaller, for both RI-V and RI-RS, which is expected since the absolute values of E MP2 c and E HF x differ by an order of magnitude as well.The RI-V scheme reproduces the non-RI reference almost exactly, with MAEs of 0.063 meV/electron and 0.005 meV/electron for E HF x and E MP2 c , respectively.Previously published RI-V results for the Thiel benchmark set are similar: Duchemin et al. 48reported MAEs of 0.103 meV/electron for E HF x and 0.030 meV/electron for E MP2 c with respect to the non-RI reference.Our RI-V results are closer to the non-RI reference, which we attribute to the automatic ABF generation in FHIaims that provides a more complete RI basis set compared to the pre-optimized ones.For cc-pVTZ, the number of autogenerated ABFs is approximately a factor of two larger than the pre-optimized ones used in Ref. 48.
As evident from Figure 4a and b, the RI-RS errors oscillate around the non-RI reference without a clear trend for an overor underestimation.The RI-RS MAEs and maxAEs of the total energies are approximately 4 to 5 times larger than the RI-V ones.Reference 48 reported only a factor of 1.5 to 2. One possible cause is the neglect of the rotational non-invariance of the RI-RS approach.Unlike NAOs or GTOs, the {r k } grid is not rotational invariant, which was accounted for in the previous work by averaging over 40 random orientations of each molecule in the Thiel set. 48However, the effect is small.The RI-RS MAEs reported with respect to the non-RI results were 0.171 meV/electron (E HF x ) and 0.0435 mev/electron (E MP2 c ) in Ref. 48.Our RI-RS MAEs are very similar or even smaller (0.207/0.025 meV/electron for HF/MP2) because our RI-V reference is closer to the non-RI result.This demonstrates that the influence of other parameters is larger, such as improving an already good RI basis set, justifying to neglect the rotational non-invariance.
Turning from total to relative energies, we compute the atomization energies according to Eq. (44).For MP2, we compute the total exchange and correlation energy E t in Eq. (44)  with Eq. ( 42): the HF ground state energy is calculated with RI-V, while the MP2 correlation energy is computed either with RI-V or RI-RS.The reason for keeping RI-V for the HF contribution is that |E HF x | is an order of magnitude larger than |E MP2 c | and would dominate the RI-RS error in the MP2 atomization energies, making the assessment of the RI-RS perfor- mance for correlated methods difficult.We note that keeping RI-V in the HF exchange is also a reasonable choice in a lowscaling RI-RS implementation of a correlated method, where the purpose of the RI-RS scheme would be the scaling reduction in the calculation of the correlation energy and not in the calculation of the Fock exchange.Since we have to compute the RI-V expansion coefficients also in RI-RS, using them in E HF x instead of their RI-RS counterparts keeps the absolute error as small as possible.
The RI-V and RI-RS atomization energy errors with respect to the non-RI reference are shown for HF and MP2 in Figure 4c and d, respectively.The RI-V approach yields again excellent results with MAEs < 1meV and maxAEs < 4meV for both, HF and MP2.The RI-RS MAEs are 14 times larger for HF, but only a factor of 2 for MP2.For HF, RI-RS tends to slightly underestimate the non-RI reference data, while the MP2 errors oscillate around the reference line following the RI-V curve.The largest RI-RS HF and RI-RS MP2 errors are 29 meV and 8 meV, respectively.Setting the generally accepted threshold of 1 kcal/mol (= 4 kJ/mol, = 43 meV) for chemical accuracy, we find that our RI-RS errors are well within this threshold.
We repeated the non-RI, RI-V and RI-RS calculations with the def2-TZVP basis set.The results are reported in Table I and in Figure S2 (Supporting Information).For the RI-V results, we get almost the same errors for the total and atomization energies as with cc-pVTZ.This is not surprising since also the ABF basis set for def2-TZVP is generated automatically in FHI-aims and is thus very complete.However, the RI-RS errors with respect to the non-RI reference are noticeably smaller than for cc-pVTZ.The MAEs and maxAEs for the total HF and MP2 energies are reduced by a factor of two.Likewise, the MAEs and maxAEs for the atomization energies are 2-3 times smaller.For example, the MAE/maxAE for the HF atomization energy are 10/29 meV for cc-pVTZ, but only 4/12 meV for def2-TZVP.The reduction of the RI-RS errors might be explained with the more sophisticated procedure used to generate the def2-TZVP real-space grids.While the {r k } grids for cc-pVTZ and def2-TZVP are of similar size, the location of the def2-TZVP grids points is, unlike for the cc-pVTZ grids, not restricted to the surface of the shells of the Lebedev grids.Our results indicate that a more flexible arrangement of the real-space grid points is key to reduce the RI-RS error without increasing the computational cost.

B. Thiel benchmark for correlated methods
In this section, we extend our benchmarking of the RI-RS approach to CCSD, RPA, SOSEX, rSE, rPT2 and G 0 W 0 .We compare from now on to RI-V, which reproduces the non-RI results almost exactly, in particular in combination with our large auto-generated auxiliary basis sets as shown in Section V A. To the best of our knowledge, RI-RS benchmarks for CCSD, SOSEX, rSE and rPT2 have not been presented previously, while RI-RS results have been published for low-scaling RPA 48 and low-scaling G 0 W 0 49 implementations.Note that we discuss here the accuracy of RI-RS in conventional RPA and G 0 W 0 with O(N 4 ) scaling, excluding any other sources for possible deviations caused by a low-scaling reformulation.The errors reported in the following are exclusively due to changing the RI scheme from RI-V to RI-RS.
In Figure 5a-d we report total (correlation) energies from CCSD, RPA, SOSEX and rSE.Overall, the trend and magnitude of the errors are similar to the MP2 correlation energies discussed in the previous section.The errors are all within ±0.050 meV/electron.Only one molecule (#20, acetone), shows a larger error of 0.084 meV/electron in the RPA correlation (see Figure 5b and Table II).The same molecule causes also the largest error (0.030 meV/electron) in the SO-SEX energy.We then turn to the RI-RS errors of the atomization energies from CCSD, RPA and rPT2 displayed in Figure 5e-g.Following the same reasoning as for the MP2 atomization energies, we compute the Fock exchange in the total exchange and correlation energies E RPA,rPT2 t (Eq.( 43)) always with RI-V and only the correlation parts with RI-V or RI-RS.The errors are within 4 meV for CCSD and RPA and even within 1.2 meV for rPT2, which is an order of magnitude smaller than the threshold required for chemical accuracy.The MAEs are smaller or around 1 meV, which confirms the excellent accuracy of the RI-RS approach over a wide range of correlated methods.
Finally, we present G 0 W 0 QP energies for the highest occupied molecular orbital (HOMO) in Figure 5h.The errors are within 5 meV and the MAE is < 1 meV, see also Tab.II.The error introduced by RI-RS is thus negligible for GW , where our target accuracy is usually 50 meV for comparisons between different codes.Better agreements can be reached using the same basis set, even if the GW implementations differ significantly.For example, an MAE of 3 meV has been reported comparing GW100 benchmark results of the HOMO from FHI-aims and Turbomole with the def2-QZVP Gaussian basis set. 113The RI-RS error is still three times smaller.

C. QP energies for GW100 test set and carbon clusters
For a more comprehensive assessment of the RI-RS approach, we applied it to the GW100 benchmark set.While the Thiel set comprises only organic molecules, the GW100 test set also contains inorganic molecules, noble gases and generally also heavier elements, covering larger regions of the chemical space.As in Section V B, we compare the RI-RS results to the RI-V reference and use RI-RS in combination with a conventional GW implementation.The QP energy errors for the HOMO and the lowest unoccupied molecular orbital (LUMO) are shown in Figure 6 and Table III.As for the Thiel set, we obtain excellent results: The RI-RS errors are tightly centered around zero.The MAE is in the sub-meV range and the maxAE is below 5 meV.The errors for the LUMO are smaller, which must be attributed to the fact that their absolute values are for most systems smaller than for the HOMO.
The GW100 benchmark was previously 49 performed with a low-scaling GW implementation based on RI-RS and the TABLE III.RI-RS mean errors (ME), mean absolute errors (MAEs) and maximum absolute error (maxAE) with respect to RI-V for QP energies from G 0 W 0 @PBE0 for the GW100 benchmark set and from G 0 W 0 @PBEh(α = 0.A more detailed analysis shows that our auto-generated ABF set is larger and contains more diffuse functions with higher maximal angular momentum number.Therefore, our RI-V reference is closer to the RI-free results, as already discussed in Section V A. However, it might pose a greater challenge to reproduce our RI-V reference with RI-RS, leading to slightly larger RI-RS errors. Comparing our QP-HOMO results for the Thiel and GW100 benchmark, we find that the MAE is 1.7× smaller for the latter (0.95 vs 0.56 meV).This must be attributed to the quality of the {r k } grids: For the Thiel benchmark we used the cc-pVTZ basis set and for the GW100 the def2-TZVP basis set, where the corresponding grids were generated with more refined procedure than for cc-pVTZ.In combination with the findings in Section V A, the GW100 results support the assumption that lifting symmetry constraints for the real-space grids improves the accuracy of the RI-RS approach.
So far the discussion has been restricted to small molecules.We now expand our benchmarking efforts to QP energies of large, three-dimensional amorphous carbon clusters, motivated by our primary goal to use the RI-RS approach for low-scaling GW algorithms.We selected 10 clusters, which were part of our GW training set used to develop machinelearning models for the prediction of X-ray photoemission spectra of disordered carbon materials.Unlike in our previous work, 100 where we computed the carbon 1s excitation of the central carbon atom in each cluster, we present here the HOMO and LUMO energies.We selected amorphous carbon clusters of different sizes, ranging from 29 atoms to 116 atoms, which corresponds to 646 and 2360 basis functions, FIG. 6. Histograms of QP energy errors of RI-RS with respect to RI-V at the G 0 W 0 @PBE0 level using the def2-TZVP basis.The error is defined as ε RI-RS − ε RI-V .The corresponding box plots are shown below the histograms.FIG. 7. RI-RS error for the QP energies of amorphous carbon clusters.The QP error is defined as ε RI-RS − ε RI-V .The results were obtained at the G 0 W 0 @PBEh(α = 0.45) level.
respectively.Three of the cluster are exemplarily depicted in Figure 7. Five clusters have the elemental composition CHO, while the other five contain only carbon and hydrogen.
The RI-RS error of the QP energies are shown in Figure 7.The amorphous carbon structures are spin-polarized and the error is thus depicted for each spin channel separately.For the HOMO, we find that the error is ≤ 1 meV for 8 of the 10 structures and the maxAE is only 1.3 meV.The error for the LUMO is even smaller with a maxAE of only 0.6 meV.The MAE and ME are averaged over both spin channels and shown in Tab.III.Both errors are smaller or in the same range compared to MAEs and MEs for the Thiel benchmark and the GW100 set.

D. Performance
Replacing RI-V in a conventional implementation by RI-RS as shown in Eq. ( 40), actually increases the computational cost.The purpose of this paper is to assess the accuracy of the RI-RS approach, but the goal is to utilize the RI-RS idea to formulate accurate low-scaling algorithms for correlated methods.For the development of low scaling algorithms, it is essential to monitor the overhead of the algorithm, i.e., the prefactor introduced by the additional operations in the computation.A large prefactor in the lowscaling algorithm might otherwise shift the crossover point, where the low-scaling implementation becomes computationally more efficient than the conventional implementation, to large system sizes, rendering the low-scaling approach in the worst case useless.The computation of the RI-RS expansion coefficients [C µ i j ] RI-RS is a step, which introduces such an additional prefactor compared to the conventional implementation: in addition to the computation of the RI-V expansion co- efficients [C µ i j ] RI-V , a least-square fitting is performed, which involves matrix-matrix multiplications and a matrix inversion.
Figure 8 shows the CPU time for computing the RI-RS expansion coefficients for linear alkane chains C n H 2n+2 with n = 4 − 10.The RI-RS time includes the computation of the RI-V expansion coefficients, which dominates the timings.Only 6 to 8 % of the total time is spent in calculating the coefficients M µk given in Eq. ( 38) (fitting step).The total time to compute [C µ i j ] RI-RS also includes step 7 in the Figure 2.However, for cubic-scaling RPA and GW methods, the computation of [C µ i j ] RI-RS is not required, we only need M µk . 48,49ompared to a canonical implementation, we expect therefore a small overhead of less than 10% in the RI step.

VI. CONCLUSIONS
We presented an implementation of the RI-RS approach in an NAO framework.We benchmarked the accuracy of the RI-RS scheme for HF, MP2, CCSD, RPA, SOSEX, rSE, rPT2 and G 0 W 0 by replacing the RI-V expansion coefficients with the RI-RS ones in conventional algorithms of the methods.While this strategy does not result in a speed-up of the calculation, it allows a direct assessment of the RI-RS errors.
We computed total and atomization energies for the 28 molecules of the Thiel benchmark set.We presented QP energies at the G 0 W 0 level for the Thiel and GW100 benchmark sets as well as large amorphous carbon clusters up to 116 atoms.We showed that RI-V in combination with our auto-generated ABFs reproduces the non-RI calculation defacto exactly and thus set RI-V as reference for our RI-RS benchmarks.We found that RI-RS reproduces the RI-V results for atomization energies from MP2, CCSD, RPA and rPT2 as well as G 0 W 0 QP energies on average within 1 meV or better.The RI-RS error of the HF atomization energies is as expected one order of magnitude larger.We also showed that the RI-RS error remains in the sub-meV range, independent of the system size.
We also found that the RI-RS scheme only introduces a small additional computational prefactor of less than 10% in the RI step.This highlights the potential of the RI-RS approach for deriving highly accurate and computationally efficient low-scaling algorithms of correlated electronic-structure methods.Future and on-going work comprises the extension of RI-RS to periodic systems as well as the development of the actual low-scaling algorithms.Modifications of the scheme, such as fitting the RI-RS expansion coefficients to other RI flavors than RI-V, are also explored to increase the computational efficiency.A tool to generate the real-space grid points {r k } on the fly or with little effort prior to the calculations is also necessary to exploit the potential of the RI-RS scheme.

1 FIG. 1 .
FIG. 1. Real space grid r k (red dots) for a) the hydrogen atom, b) carbon atom and c) ethylene molecule.The real space grid for a given molecule is constructed by the superposition of the grid of each species.Displayed are the grids for the cc-pVTZ basis set.

1 FIG. 3 .
FIG. 3. Thiel test set of small organic compounds Thiel set Energy error [meV/electron] Energy error [meV] 1 FIG. 4. a) Errors of RI-V and RI-RS with respect to the non-RI reference.Total energy errors [meV/electron] for a) HF exchange and b) MP2 correlation energy.Atomization energy errors [meV] for c) HF and d) MP2.The errors are defined as E RI-V/RI-RS − E non-RI .The cc-pVTZ basis set was used.Thiel set Energy error [meV/electron] Energy error [meV] 1 FIG. 5. RI-RS errors with respect to RI-V for the total correlation energy [meV/electron] from a) CCSD, b) RPA, c) SOSEX and d) rSE and for the atomization energy [meV] from e) CCSD, f) RPA and g) rPT2 as well as for h) the G 0 W 0 @PBE correction of the HOMO level [meV].The errors are defined as E RI-RS − E RI-V .

FIG. 8 .
FIG. 8. CPU timings [h]for computing the RI-V and RI-RS expansion coefficients for linear alkane chains.The computation of the RI-RS expansion coefficients includes the time for constructing the RI-V expansion coefficients and the computation of M µk given in Eq.(38).To guide the eye, the data points are connected by lines.

TABLE I .
Mean errors (ME), mean absolute errors (MAEs) and maximum absolute error (maxAE) with respect to the non-RI reference for total HF exchange and MP2 correlation energies [meV/electron] and HF and MP2 atomization energies[meV].The errors are defined as E RI-V/RI-RS − E non-RI .

TABLE II .
RI-RS mean errors (ME), mean absolute errors (MAEs) and maximum absolute error (maxAE) with respect to RI-V for total correlation energies from CCSD, RPA, SOSEX, rSE [meV/electron] and for atomization energies from CCSD, RPA, rPT2 [meV] as well as for the HOMO energies from G 0 W 0 [meV].The errors are defined as E RI-RS − E RI-V .
49)for amorphous carbon clusters[meV].The errors are defined as ε RI-RS − ε RI-V .-timemethod,using the same primary basis set, realspace grids and starting point.The MAEs reported in Ref. 49 (0.21 meV for HOMO and 0.09 meV for LUMO) are 2-3 smaller than ours, see TableIII.It might seem surprising that the low-scaling RI-RS GW implementation,49which is also affected by other error sources such as the time-frequency transformations, yields smaller errors than our RI-RS implementation with conventional scaling.The neglect of the rotational invariance can be excluded as possible reason since it was also not accounted for in Ref.49.The most likely cause are the different RI-V references: Ref. 49 uses a pre-optimized ABFs for RI-V, while our ABFs are generated automatically. space