The Bethe-Salpeter equation (BSE) based on *GW* quasiparticle levels is a successful approach for calculating the optical gaps and spectra of solids and also for predicting the neutral excitations of small molecules. We here present an all-electron implementation of the *GW*+BSE formalism for molecules, using numeric atom-centered orbital (NAO) basis sets. We present benchmarks for low-lying excitation energies for a set of small organic molecules, denoted in the literature as “Thiel’s set.” Literature reference data based on Gaussian-type orbitals are reproduced to about one millielectron-volt precision for the molecular benchmark set, when using the same *GW* quasiparticle energies and basis sets as the input to the BSE calculations. For valence correlation consistent NAO basis sets, as well as for standard NAO basis sets for ground state density-functional theory with extended augmentation functions, we demonstrate excellent convergence of the predicted low-lying excitations to the complete basis set limit. A simple and affordable augmented NAO basis set denoted “tier2+aug2” is recommended as a particularly efficient formulation for production calculations. We finally demonstrate that the same convergence properties also apply to linear-response time-dependent density functional theory within the NAO formalism.

## I. INTRODUCTION

Predicting the neutral (including optical) excitations of molecules and materials is of fundamental importance in photovoltaics, optoelectronics, and other technologically relevant areas. Several distinct types of computational formalisms are frequently employed in the community for this purpose, including wavefunction-based methods, e.g., equation-of-motion coupled cluster (EOM-CC),^{1–3} complete active space second-order perturbation theory (CASPT2),^{4–7} the quantum Monte Carlo method,^{8–11} linear-response time-dependent density functional theory (LR-TDDFT),^{12–15} or the Bethe-Salpeter equation (BSE) approach in the context of many-body perturbation theory (MBPT).^{16–23} EOM-CC and CASPT2 have been shown to produce highly accurate values for small and mid-sized molecules, when combined with sufficiently high-quality basis sets. They are therefore often used as a trusted reference,^{1–3,5–7,24} although their applicability to larger systems is somewhat limited by the associated computational cost. LR-TDDFT has been widely applied to predict optical excitations for molecules due to its computational efficiency and often reasonable accuracy, especially when combined with carefully designed exchange-correlation (XC) functionals.^{25–30} However, LR-TDDFT calculations can encounter problems for charge-transfer (CT) excitations,^{31,32} especially when used with a simple XC functional such as the adiabatic local density approximation (LDA)^{33} and generalized gradient approximations (GGAs).^{34} In LR-TDDFT, including long-range exact exchange in the XC functional can mitigate this problem.^{28–30,35}

The BSE approach is founded upon MBPT, based on Green’s function (*G*) theory and the idea of using the screened Coulomb interaction *W*.^{16} The BSE formalism was originally proposed in the field of nuclear physics in the 1950s.^{17} Combined with the *GW* approximation in MBPT,^{16} the BSE approach has been shown to successfully approximate the optical spectra of solids^{19,22,36–38} and nanosystems^{39} and later work demonstrates similar applicability to excitations in atoms and molecules.^{20,40–51} The *GW* approach^{16,21,52,53} allows one to predict fundamental gaps—i.e., gaps between highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) quasiparticle states—as well as single-quasiparticle excitation spectra that are more accurate than those obtained by standard density functional theory (DFT) for a wide range of systems, including both solids and molecules.^{54–68} The description of optical excitations within the BSE approach then uses charged excitations, i.e., electron removal and addition excitations, from the *GW* approach as its input. The BSE method based on the *GW* method has several formal advantages over LR-TDDFT. The electron-hole interaction in the BSE approach has the correct asymptotic behavior in the so-called “Mulliken limit”^{66} for both solids and molecules [i.e., Ref. 66: “*E*_{CT0}(*D*) = *EA*(acceptor) − *IP*(donor) − 1/*D*, where *EA*(acceptor) and *IP*(donor) are the acceptor electronic affinity and the donor ionization energy” for large acceptor-donor distance *D*, all in atomic units]. This limit is not captured by the LR-TDDFT formalism without a long range exchange component.^{43,66} CT excitations, which are problematic for LR-TDDFT especially with LDA and GGA functionals, can be efficiently and accurately predicted by the BSE approach.^{48,66,69–71} This has been demonstrated for CT excitations including both intermolecular and intramolecular types in systems such as simple dipeptides,^{70,72} and more complex fullerene/polymer aggregates.^{71,73}

Calculations of BSE excitation energies within MBPT usually adopt a three-step procedure: (i) Evaluate the Kohn-Sham (KS)^{33} or generalized Kohn-Sham (gKS)^{74} DFT orbitals. (ii) Apply self-energy corrections at the *G*^{0}*W*^{0} level or partially self-consistent^{23,51} *GW* level (*G*^{0}*W*^{0}@DFT or *GW*@DFT; *G*^{0} stands for Green’s function of a noninteracting reference system and *W*^{0} is the screened Coulomb interaction of that reference system).^{16,20,21,52} (iii) Solve the BSE (in practice, an approximate version thereof, see below) based on the *G*^{0}*W*^{0} or *GW* quasiparticle energies and screened and unscreened Coulomb integrals of (g)KS orbitals (BSE@*G*^{0}*W*^{0}@DFT or BSE@*GW*@DFT).^{17,43,44,46} BSE implementations exist in different computational packages based on different basis functions, e.g., MolGW,^{75} Fiesta,^{69,76} and Turbomole,^{46,77,78} which are based on Gaussian-type orbital (GTO) basis sets, BerkeleyGW,^{79} Yambo,^{80} ABINIT,^{81,82} VASP,^{83} and Quantum Espresso,^{84} which are based on plane waves, or Exciting^{85,86} and Elk,^{87} which are based on augmented plane waves.

The present work introduces an accurate implementation of the BSE formalism utilizing compact and efficient numeric atom-centered orbital (NAO) basis sets^{88,89} in the context of the all-electron electronic structure code FHI-aims.^{68,88,90,91} This paper focuses on spin-unpolarized systems (spin-polarized calculations are not yet possible in the present implementation). To obtain the two-electron Coulomb and screened Coulomb interaction matrix elements, we use an efficient and highly accurate variant of the resolution-of-identity (RI) technique.^{68} In FHI-aims, this RI technique is the numerical foundation for all methods beyond semilocal DFT, including Hartree-Fock (HF), hybrid density functionals, the random-phase approximation (RPA), second-order Møller-Plesset perturbation theory (MP2), and the *GW* method.^{68,91} Our current implementation also uses the ELSI infrastructure^{92} and the ELPA eigenvalue solver^{93} for parallel eigenvalue solutions.

This paper is organized as follows: In Sec. II, we introduce the *GW*+BSE formalism in the context of MBPT. In Sec. III, we discuss the details of our implementation. In Sec. IV, we demonstrate the numerical correctness of our BSE implementation by comparing the excitation energies computed by FHI-aims and by the MolGW code^{43,75} with GTO correlation-consistent basis sets for Thiel’s molecular benchmark set.^{24,26} In assessing our BSE implementation, we emphasize the dependence of the BSE results on the *GW* quasiparticle energies. We then study the convergence behavior of excitation energies to the complete basis set limit, combining standard NAO basis sets for ground state DFT (FHI-aims-2009)^{88} or valence correlation consistent NAO basis sets (NAO-VCC-nZ)^{89} with extended augmentation functions (NAO+aug). We demonstrate that the standard FHI-aims-2009 basis sets give essentially basis set converged numerical results for low-lying optical excitation energies when combined with a few extended augmentation functions (NAO+aug basis sets) that are also commonly included in Gaussian-type basis sets. Finally, similar convergence behavior is demonstrated for LR-TDDFT with adiabatic LDA as the kernel.^{33,94}

## II. METHODS

Typical calculations of the neutral (optical) excitation energies of molecules using the BSE approach adopt the following three-step procedure, which is utilized by a wide range of electronic structure packages for calculations of neutral (optical) excitation properties in the framework of MBPT:^{43,44,46,79,80,82,83,85}

(i) The initial step is performed by solving the self-consistent (g)KS equations with an approximate functional for the exchange-correlation energy *E*_{xc}. Common choices for *E*_{xc} are the LDA, GGAs, HF, and hybrid functionals. In KS theory (e.g., LDA and GGAs), we define $v^xc$ as the functional derivative of *E*_{xc} with respect to the electron density. In the gKS case (HF and hybrid functionals), $v^xc$ is the functional derivative with respect to the set of orbitals *ψ*_{l}. In either case, the *ψ*_{l} are constructed as

Equation (1) states the electronic (g)KS single-particle equations for the effective single-particle orbitals *ψ*_{l} and eigenvalues *ε*_{l} (*l* = 1, 2, …, *N*_{orbit}). Equation (2) details the gKS Hamiltonian *ĥ*^{(g)KS}, including the effective single-particle kinetic energy (with relativistic corrections) $t^s$, the external potential $v^ext$, the electrostatic or Hartree potential of the electron density $v^H$, and the exchange-correlation potential $v^xc$. The underlying orbitals {*ψ*_{l}} are here expanded on the basis of NAOs {*φ*_{i}, *i* = 1, 2, …, *N*_{basis}} as

where the NAOs are of the form^{88}

where **r** is a position vector with respect to the nucleus, *r* is its modulus, and Ω is the corresponding angular coordinate on the unit sphere. In the FHI-aims code, the *u*_{i} are numerically tabulated functions, defined as cubic splines in units of a logarithmic grid. *Y*_{lm} are the real-valued spherical harmonics, and *l* and *m* are implicitly included in the basis function index *i*. The eigenvalues and eigenfunctions produced by the initial step serve as a first guess for the quasiparticles and are used to evaluate the Coulomb interaction, the screened Coulomb interaction, and the *GW* self-energy in the subsequent *GW* and BSE@*GW* steps. Although, in the nonperiodic case, *ψ*_{l} can be chosen to be real-valued, we include complex conjugates in the derivations below.

(ii) A perturbative *GW* approach is then applied to obtain the quasiparticle energies as follows:^{68}

where $\epsilon lGW$ is the quasiparticle energy. By convention, the arguments $\epsilon lGW$ used to evaluate the self-energy Σ^{GW} on the right-hand side are updated self-consistently until they match the $\epsilon lGW$ values obtained on the left-hand side, even though the function Σ^{GW}(*ω*) itself is not further updated in the process. The *GW* self-energy is calculated from the Green’s function *G* and the screened Coulomb potential *W* following the *GW* approximation proposed by Hedin:^{16}

In the single-shot perturbative *GW* (i.e., *G*^{0}*W*^{0}) approach, the Green’s function *G* is approximated by the noninteracting Green’s function *G*^{0}, which is calculated from single-particle orbitals *ψ*_{l} and orbital energies *ε*_{l},^{68}

where *ε*_{F} is the Fermi energy and *η* is a positive infinitesimal. The screened Coulomb potential *W*^{0} is calculated from the dielectric function *ε* as^{68}

where the dielectric function *ε* is obtained at the RPA level, using DFT results. The *G*^{0}*W*^{0} self-energy can be calculated using an exact analytic treatment on the real axis, which is the case in the MolGW package.^{60,75} We refer to Sec. III.C of Ref. 60 for the details of this formalism. This treatment is limited to small systems. Instead, two-pole^{95} and Padé^{96} approximations are implemented in the FHI-aims code for the evaluation of the self-energy on the real axis.^{68} Both of these approximations are based on an exact treatment of *G*^{0}, *W*^{0}, and the self-energy $\Sigma G0W0$ on the imaginary frequency axis. $\Sigma G0W0$ is then extended to the real axis by performing an analytic fit of the data on the imaginary axis to a function with a form that has poles on the real axis. This process is usually referred to as “analytic continuation.” The smooth behavior of all quantities (*G*^{0}, *W*^{0}, $\Sigma G0W0$) on the imaginary frequency axis significantly reduces the number of frequency points needed, compared to a full frequency integration along the real frequency axis.^{97} Specifically, the self-energy is approximated to have the following mathematical form in the complex plane in the two-pole approximation:^{95}

where the values of *a*_{n} and *b*_{n} depend on the indices *i* and *j*. In the Padé approximation, the self-energy is expressed as^{96}

where N denotes the total number of parameters in the Padé approximation. We note already here that the Padé approximation can be more accurate than the two-pole approximation to represent the true self-energy but that the Padé approximation is also, in practice, more prone to numerical problems, including nonunique solutions that can be difficult to control without manual inspection of all resulting eigenvalues. In addition to the two approaches mentioned above, another, more elaborate approach to evaluate the self-energy directly on the real axis by contour deformation (CD) was implemented in FHI-aims by Golze and co-workers^{98} while this paper was being completed. We do not assess this approach here because our emphasis here is on the BSE, but we note that essentially exact *G*^{0}*W*^{0} input data to the BSE are expected from the CD approach. On the other hand, the analytical continuation of Σ according to Eq. (9) or Eq. (10) is advantageous over the CD approach in terms of computational cost, both in terms of the base cost (often called the prefactor) and in terms of the scaling exponent with system size if the number of needed *G*^{0}*W*^{0} eigenvalues scales with the size of the system.^{98}

Here, we perform one-shot perturbative *G*^{0}*W*^{0} calculations based on a fixed DFT or HF reference. The quasiparticle energy in Eq. (5) is thus rewritten as $\epsilon lG0W0$. Some studies investigate the effect of iterating the *GW* equations by updating the eigenvalues in Eq. (7) by those calculated from Eq. (5), whereas the wavefunctions *ψ*_{l} are kept at the DFT level.^{44,49,99} This procedure, denoted eigenvalue self-consistent *GW*, is reported to give better agreement with experimental results and wavefunction-based reference methods compared with single-shot *G*^{0}*W*^{0} for some systems.^{44,45,49,99}

(iii) The BSE is a Dyson-like equation for the two-particle correlation function *L*,^{20,21,43}

where the set of variables 1, 2, etc. is short for position, time, and spin (**r**_{1}, *t*_{1}, *σ*_{1}), (**r**_{2}, *t*_{2}, *σ*_{2}), etc. *L*(12; 1′2′) is the electron-hole correlation function which describes the probability amplitude of an electron propagating from 1′ to 2 and a hole propagating from 1 to 2′.^{21}^{,}*L*_{0}(12; 1′2′) represents the correlation function of the noninteracting system defined below in Eq. (12). *K*(35; 46), usually called the electron-hole interaction kernel, is the screened interaction between the electron and the hole (including bare exchange). *L*_{0} and *K* can be expressed in the following equations:^{20}

where *G* is the one-particle Green’s function and *M* is equal to the sum of the self-energy and the Hartree potential,

By applying Eq. (14) to Eq. (13), performing a time-energy Fourier transformation and ignoring the dynamical properties of *W*,^{20} the BSE kernel can be simplified to

where the variables 3, 4, etc. are reduced to **r**_{3}, **r**_{4}, etc., *v* is the bare Coulomb interaction, and *W* is the screened Coulomb interaction, with the frequency-dependence ignored.^{20} This approximation means that the actual BSE part (once the *GW* quasiparticle energies are fixed) is independent of a particular analytical continuation choice since only the *ω* = 0 value of *W* enters the approximated BSE.

In practical implementations, the BSE is usually rewritten in the following matrix form in a transition space spanned by the products of occupied and unoccupied orbitals:^{20,43,44}

where *E*_{s} are the optical excitation eigenvalues and (*X*_{s}, *Y*_{s}) are the eigenvectors. The *X*_{s} and *Y*_{s} are expressed in electron-hole space of the unperturbed system with elements *X*_{s,ia} and *Y*_{s,ia}, i.e., the actual BSE wavefunctions are linear combinations of the product of (g)KS orbitals. The excitation wavefunctions *X*_{s}, *Y*_{s} can be taken to be real-valued in finite (molecular) systems without an external field. Blocks *A* and −*A*^{†} correspond to resonant transitions from occupied to unoccupied orbitals and the antiresonant transitions, respectively.^{100} Blocks *B* and −*B*^{†} describe the coupling between blocks *A* and −*A*^{†}. In the BSE, the diagonal matrices *A*(−*A*^{†}) and off-diagonal matrices *B*(−*B*^{†}) are defined as follows:^{20,43,44}

The indices *i* and *j* denote occupied states and *a* and *b* denote unoccupied states. $\epsilon aGW$ and $\epsilon iGW$ are the quasiparticle energies denoted $\epsilon lGW$ in Eq. (5). The coefficient *α*^{S/T} is equal to 2 for singlet excitations and 0 for triplet excitations. The index conventions for the bare and screened Coulomb interactions *V* and *W* are as follows:^{75}

where (*pq*|*rs*) are the two-electron integrals in a basis set representation,

and the same convention (Mulliken notation) for r and r′ is used in the notation of the screened Coulomb integrals (*pq*|*W*|*rs*) as well. The neglect of the coupling blocks *B*(−*B*^{†}) in Eq. (16) is known as the Tamm-Dancoff approximation (TDA).^{37,101} In the TDA, which also we compare below, the relevant equation becomes simply,

The oscillator strength *f*_{s} can be calculated from the eigenvalues and eigenvectors obtained by solving the BSE eigenvalue problem as follows:^{102}

where *d*_{s,μ} can be calculated as^{102}

Since we are dealing with finite systems, the dipole operator $\mu ^$ is simply taken to be the position operator, i.e., $\mu ^\u2261(x,y,z)$. For convenience, we reference the coordinates *x*, *y*, *z* to the center (average of atomic positions) of the molecule.

We will also compare our observations for the BSE to analogous results for LR-TDDFT, which is widely used in chemistry. We therefore briefly recapitulate the LR-TDDFT formalism, the mathematical structure of which is similar to the BSE, albeit with a two-point kernel instead of the four-point kernel of BSE. A deeper discussion of the mathematical similarities and differences of both levels of theory is given in Ref. 22. LR-TDDFT is often expressed as the Casida eigenvalue equation,^{103} which is formally equivalent to Eq. (16). Here, the LR-TDDFT formalism becomes

In LR-TDDFT, Ω is called the Casida matrix, which has the same dimension as *A* or *B* in Eq. (16). $Es2$ are squares of the neutral many-body excitation energies and *F*_{s} are the eigenvectors of this eigenvalue problem, which can also be related to the oscillator strengths via the dipole operator.^{104} The Casida matrix can be written in the basis of products of (g)KS orbitals as

where *δ* denotes the Kronecker delta. The kernel *K*_{ia,jb} is defined as

where *f*_{xc}[*n*_{0}] is the exchange correlation kernel and *n*_{0} is the (g)KS ground state electron density. As in the case of the BSE approach, the LR-TDDFT method can also employ the TDA approximation as discussed in Ref. 105. Then, an analogous equation to Eq. (22) is solved but with the matrix A being

The kernel is formally defined through the functional derivative of the time-dependent Kohn-Sham exchange and correlation potential *v*_{xc}[*n*](**r**, *t*) with respect to the time-dependent density *n*(**r**′, *t*′) such that

In practice, the so-called adiabatic approximation^{106} is employed in Eq. (29), as we do here, and the exchange-correlation kernel reads

This approximation makes the exchange-correlation kernel frequency-independent.

## III. IMPLEMENTATION

In our implementation, the two-electron Coulomb interaction in Eqs. (17), (18), and (26), the static screened Coulomb interaction in Eqs. (17) and (18), and the two-electron integrals of the exchange correlation kernel in Eq. (26) are calculated employing the RI approach.^{68,107–111} The RI represents pair products of atomic basis functions *φ*_{p}(**r**) ⋅ *φ*_{q}(**r**) in terms of auxiliary basis functions (ABFs),

where *P*_{μ}(**r**) (*μ* = 1, 2, …, *N*_{aux}) are the ABFs and $Cpq\mu $ are the expansion coefficients. The construction of the ABFs in FHI-aims is explained in Ref. 68 and in detail in Ref. 91. The evaluation of the integrals (21) then reduces to

The computation of the expansion coefficients $Cpq\mu $ requires three-center integrals involving the ABFs and the pair products of the NAOs,

and (*ν*|*μ*)^{−1} denotes the inverse of the Coulomb matrix in ABF representation. Thus, the expensive computation of four-center integrals (*pq*|*rs*) is reduced to the computation of much cheaper three-center and two-center ones,

using

where (*ν*|*μ*)^{−1/2} denotes the square root of the inverse Coulomb matrix. This enables the efficient computation of the Coulomb matrix elements both in time and in memory. The screened Coulomb interaction *W* can be represented in terms of the ABFs in a similar way to the Coulomb interaction *V* as follows:

The dielectric function *ε* can be calculated as

where *χ*^{0} is the noninteracting density response function. In real space and for a non-spin-polarized system, *χ*^{0} is defined as

where “c.c.” denotes the complex conjugate. We refer to Ref. 68 for more details. The current BSE implementation in FHI-aims uses global RI.^{68} In the future, use of the localized RI formalism^{91} is expected to facilitate scalability to larger systems as well as support for extended (periodic) systems.

## IV. RESULTS

### A. Numerical validation

We quantify the precision of our BSE implementation by calculating the neutral excitation energies of the molecular benchmark set published by Schreiber *et al.*,^{24} known in the literature as “Thiel’s set.” This set (see Fig. 1 in Ref. 24) includes *N* = 28 small and medium-sized organic molecules, the largest of which is naphthalene with 18 atoms. The chemical elements represented in these organic molecules are H, C, N, and O. The atomic coordinates for the included molecules are taken from the supplementary material of Ref. 24. Schreiber *et al.* focused on obtaining “Best Estimate (BE)” values for singlet and triplet excitation energies of these molecules from *ab initio* calculations, including rather demanding multireference, coupled cluster, or complete active space approaches of their own or from the literature. While the BE values have been used as reference data by both Thiel’s and other groups for implementations, evaluation and development of a variety of methods,^{26,43,44} our present study is aimed at establishing the numerical precision of our approach at a fixed level of theory, i.e., BSE or LR-TDDFT. The primary objective of this paper is therefore not to compare to the BE results but rather to compare the BSE excitation energies calculated by our present NAO-based implementation to results obtained at the same level of theory, using the MolGW code as a benchmark. Regarding the basis set, Schreiber *et al.*^{24} (and therefore, also some of the results from other methods available for comparison in the literature) employed a relatively limited basis set level for correlated calculations, the TZVP basis set by Schäfer *et al.*^{112} A previous study by Bruneval *et al.*^{43} indicates that BSE@*G*^{0}*W*^{0}@DFT-B3LYP^{113} excitation energies for ethene and pyrrole, obtained with the TZVP basis set, overestimate the analogous results from the much larger aug-cc-pVQZ basis set^{114} by about 0.45–0.65 eV. Therefore, the goal of our following investigation is twofold. We first establish the numerical precision of our own implementation in comparison with MolGW using the TZVP basis set.^{112} We then discuss basis set convergence for low-lying singlet and triplet excitations using NAO basis sets.

In Fig. 1, we compare the numerical precision of the present BSE implementation and that in MolGW using the TZVP basis set. Specifically, we show state-resolved mean absolute error (MAE) values of the BSE-approximated energies of the lowest ten singlet and triplet excited states, respectively, of all molecules in Thiel’s set. The state-resolved MAE, MAE(*i*), of a given dataset *D* = {*D*_{i,n}} in comparison with a reference set *R* = {*R*_{i,n}} is defined as

where *i* = 1, …, 10 is the index for the state and *n* = 1, …, *N* is the index for the molecules in Thiel’s set. For the MAE plotted in Fig. 1, the dataset *D* is the set of BSE excitation energies calculated using the FHI-aims implementation. The reference set *R* is the set of BSE excitation energies calculated by MolGW. The Perdew-Burke-Ernzerhof (PBE)^{34} exchange-correlation functional is used for the initial DFT calculations, and the *G*^{0}*W*^{0} quasiparticle energies that enter the BSE are taken from MolGW in both sets. Panels (a) and (b) show the results for singlet excitations with and without TDA, respectively. Panels (c) and (d) show the results for triplet excitations with and without TDA, respectively. The BSE results from the present implementation agree with the results of the MolGW package at the level of 1 meV or below.

Next, Fig. 2 compares BSE oscillator strengths, Eqs. (23) and (24), from both implementations for singlet states in the TDA. The MAEs for the oscillator strengths are at the level of 10^{−4} or below, i.e., numerically negligible. The actual value of the excitation energy and oscillator strength investigated in this section for all the molecules in Thiel’s set can be found in Tables S1–S5 of the supplementary material. Since the calculation results of the FHI-aims and MolGW package agree within 1 meV for the excitation energy and 10^{−4} for the oscillator strength, the values in Tables S1–S5 are valid for both packages within the significant digits given. In short, Figs. 1 and 2 validate our implementation.

### B. Effects of the frequency dependence of the self-energy in GW

We next investigate how the BSE energies are impacted by different treatments of the frequency dependence of the *G*^{0}*W*^{0} self-energy $\Sigma G0W0$. In Sec. IV A, we took the *G*^{0}*W*^{0} quasiparticle energies calculated by MolGW as the input for the FHI-aims BSE calculations. *G*^{0}*W*^{0} calculations in MolGW employ an exact analytic treatment of $\Sigma G0W0$ on the real axis.^{60,75} A similarly precise result is expected from the CD technique by Golze *et al.*^{98} In this section, we investigate the impact of two frequently employed inexact but potentially cost-saving^{98} approximations to the self-energy on the real axis, namely, the two-pole approximation,^{95} Eq. (9), and the Padé approximation,^{96} Eq. (10), with 16 parameters.

In Fig. 3, we show mean absolute errors MAE(*i*) of the *G*^{0}*W*^{0} quasiparticle energies of the HOMO-3 to LUMO+3 states, calculated using either the two-pole approximation [Fig. 3(a)] or the 16-parameter Padé approximation [Fig. 3(b)] and compared to the MolGW reference values. The *G*^{0}*W*^{0} results are based on DFT calculations using the PBE^{34} exchange-correlation functional and employ the Gaussian-type TZVP basis set.^{112} We see that the two-pole approximation gives MAE values up to 0.3 eV in the investigated states. Although not a small value, this must be viewed in the context of the overall change from plain DFT single-particle energies to correlated *G*^{0}*W*^{0} quasiparticle energies, which are typically of the order of several eV. The 16-parameter Padé approximation can reduce this error by a factor of two or more, as shown in Fig. 3(b). Most of the quasiparticle energies agree with the MolGW results within 0.1 eV, except for the state LUMO+2, where the MAE is around 0.12 eV. The performance of both approximations can be compared with a broader analysis performed in Ref. 97 (the “GW100” benchmark). For a different set of molecules, Ref. 97 reports MAE values of 0.125 eV for HOMO levels and 0.071 eV for LUMO levels when comparing FHI-aims’ two-pole approximation with the 16-parameter Padé approximation. These values are somewhat worse than the precision of the two-pole approximation for our molecular set in Fig. 3(a). Additionally, Fig. 3(a) includes states away from the HOMO and LUMO (not considered in Ref. 97), with somewhat higher MAEs. Given the differences in assessed molecules and levels, the two-pole approximation precision reported in Fig. 3(a) is thus very much in line with that found in Ref. 97. For FHI-aims’ 16-parameter Padé approximation, Ref. 97 finds MAE values of 0.003 eV (HOMO) and 0.006 eV (LUMO) when comparing to an RI-free implementation in the Turbomole code and covering a subset of the smaller molecules in the GW100 benchmark. Although this finding is consistent with our general finding that the 16-parameter Padé approximation significantly improves over the average error associated with the two-pole approximation, the MAE values reported for the present test set in Fig. 3(b) still amount to a few tens of millielectronvolt for the HOMO and LUMO. The remaining discrepancy (though small) between our MAE values and those from the GW100 benchmark could partially be attributable to the different molecular test subsets considered in the assessment of the 16-parameter Padé approximation. Additionally, as also discussed in more depth in Sec. V A of Ref. 97, the Padé approximation can result in serious ambiguities for individual eigenvalues $\epsilon lGW$ if there are multiple solutions for the self-consistency condition between the left and the right sides of Eq. (5). In our own assessment, we did not pursue the question of different solutions of Eq. (5) between the 16-parameter Padé approximation in FHI-aims and the numerically exact self-energy in MolGW. While our overall reported MAE values in Fig. 3(b) are evidently small, it is conceivable that multiple solutions of Eq. (5) for individual eigenvalues contribute to the observed MAE, as they would in a practical simulation, where existing implementations typically also do not screen specifically for multiple solutions. In fact, despite its higher MAE, a practical advantage of the two-pole approximation is its relative numerical simplicity and therefore its relative robustness against numerical ambiguities, compared to the Padé approximation. Thus, the two-pole approximated self-energy can be preferable for simple reasons of stability, at the price of reduced precision compared to a formally exact *G*^{0}*W*^{0} self-energy. However, one should bear in mind that the two-pole approximation can in some cases also fail dramatically as was shown in Ref. 97.

As shown in Fig. 4, the different approximate *G*^{0}*W*^{0} self-energy treatments affect the BSE excitation energies, which take the *G*^{0}*W*^{0} quasiparticle energies as input. We compare BSE results based on *G*^{0}*W*^{0} quasiparticle energies calculated using the self-energies of Eqs. (9) and (10) to BSE results based on the MolGW *G*^{0}*W*^{0} eigenvalues with the exact analytic self-energy treatment.^{60,75} Panels (a) and (b) show the MAE(*i*) values for the ten lowest singlet and triplet BSE excitation energies, respectively, using the two-pole approximation and averaged over all molecules in Thiel’s set. Panels (c) and (d) show the analogous results obtained using the 16-parameter Padé approximation. We see that the 16-parameter Padé approximation yields smaller MAE(*i*) (≈0.1 eV across all investigated states) than the two-pole approximation [MAE(*i*) of 0.1–0.4 eV]. The difference is a direct reflection of the different MAE(*i*) of the *G*^{0}*W*^{0} quasiparticle energies (Fig. 3), which constitute the input for the BSE calculations.

In addition to the MAE of the BSE@*G*^{0}*W*^{0}@PBE results between FHI-aims and MolGW, we can also define the state-dependent mean signed error MSE(*i*) as

Just as in Eq. (41), *i* and *n* are indices for states and molecules, respectively, averaging over *N* = 28 molecules in Thiel’s set. Table I summarizes the overall MSE and MAE (averaged over states *i* in addition to averaging over molecules) of the BSE@*G*^{0}*W*^{0}@PBE results using the two-pole and 16-parameter Padé approximation vs MolGW as a reference. As noted above, the MAE of BSE results based on quasiparticle eigenvalues from the 16-parameter Padé approximation is less by a factor of two than the MAE of BSE results from two-pole approximated quasiparticle eigenvalues. The MSE indicates that BSE results from the two-pole quasiparticle eigenvalues overestimate the expected BSE@*G*^{0}*W*^{0} excitation energies based on an exact *G*^{0}*W*^{0} self-energy.

G^{0}W^{0}
. | Two-pole . | 16-Parameter Padé . | ||
---|---|---|---|---|

self-energy . | Singlet . | Triplet . | Singlet . | Triplet . |

MSE | 0.10 eV | 0.15 eV | 0.01 eV | 0.04 eV |

MAE | 0.16 eV | 0.22 eV | 0.06 eV | 0.08 eV |

G^{0}W^{0}
. | Two-pole . | 16-Parameter Padé . | ||
---|---|---|---|---|

self-energy . | Singlet . | Triplet . | Singlet . | Triplet . |

MSE | 0.10 eV | 0.15 eV | 0.01 eV | 0.04 eV |

MAE | 0.16 eV | 0.22 eV | 0.06 eV | 0.08 eV |

It is interesting to compare the errors incurred from the different *G*^{0}*W*^{0} self-energy treatments to the errors associated with the BSE approach itself. Bruneval *et al.*^{43} show that the BSE@*G*^{0}*W*^{0}@PBE singlet excitation energies at the TZVP basis set level give an MSE of −0.8 eV and an MAE of 0.8 eV compared to the BE results of Schreiber *et al.*^{24} The MSE and MAE of the BSE@*G*^{0}*W*^{0}@PBE triplet excitation energies are around −1.2 eV and 1.2 eV, respectively.^{43} In comparison, the error incurred through the approximate *G*^{0}*W*^{0} self-energies in Table I is rather small for the two-pole approximation and negligible for the 16-parameter Padé approximation. Additionally, the sign of the MSE from both self-energy approximations is the opposite of that of the MSE compared to the BE values, i.e., especially the two-pole approximation would actually reduce the MSE compared to the BE values, as a result of fortuitous error cancellation. However, this reduction should not be trusted systematically. For a true improvement over the reported small-molecule BSE excitation energies, it would obviously be preferable to pursue higher-level approaches than BSE@*G*^{0}*W*^{0}@PBE at the TZVP basis set level—in terms of the DFT starting point, in terms of the theoretical treatment of the neutral excitation, and in terms of the basis set.

Even for the small-molecule systems in Thiel’s set, the BSE correction still accurately captures the vast majority of the change between straight differences of HOMO and LUMO levels from *G*^{0}*W*^{0} calculations and actual neutral excitation energies. This correction is much larger than the differences incurred above as a result of the approximate *G*^{0}*W*^{0} self-energies. In our calculations, the BSE@*G*^{0}*W*^{0}@PBE results using both 16-parameter Padé approximation and two-pole approximations reduce the lowest *G*^{0}*W*^{0} HOMO-LUMO gap by about 57%, averaged over all molecules in Thiel’s set. For the He atom, essentially basis set converged results by Li *et al.*^{49} show a *G*^{0}*W*^{0}@Hartree-Fock (HF) fundamental gap of 24.69 eV, compared to significantly lower lowest-lying triplet and singlet excitation energies of 19.82 eV and 21.22 eV, respectively. Even in this extreme case of an isolated two-electron atom, these essentially basis set converged BSE@*G*^{0}*W*^{0}@HF results agree with the exact result better than ≈0.01 Ha (≈0.3 eV). For larger molecules, such as free-base porphyrin and tetraphenylporphyrin,^{115} BSE@*G*^{0}*W*^{0}LDA excitations can match experimental absorption spectra to a similarly good degree (essentially exact within the remaining approximations made in Ref. 115). Here, the BSE again corrects the simple *G*^{0}*W*^{0} HOMO-LUMO energy difference by several eV, much more than the magnitude of changes due to analytical corrections to the *G*^{0}*W*^{0} self-energies reported above.

### C. Basis set convergence

We now address the basis set convergence of the NAO basis sets for the BSE calculation, in comparison with cc-pVnZ basis sets^{116} and aug-cc-pVnZ basis sets^{114} by Dunning and co-workers. Two types of NAO basis sets have been constructed in the context of FHI-aims. The first, denoted “FHI-aims-2009,” was introduced in Ref. 88 and aimed at ground-state DFT calculations. The FHI-aims-2009 basis sets come in different tiers (i.e., levels) n = 1, 2, 3, 4 (in the case of H, a fourth tier does not exist and the molecular results for tier4 below employ tier3 for H). These basis sets allow one to achieve total-energy convergence corresponding to fast qualitative calculations to few-millielectronvolts per atom^{117} calculations in a single, hierarchical basis set scheme (i.e., for a given element, each basis set level contains the exact basis functions from all lower-accuracy basis set levels as a subset). For first- and second-row elements, FHI-aims’ “tight” settings employ all FHI-aims-2009 basis functions up to and including tier 2, as shown in Table S6 in the supplementary material. The second type of basis set is denoted NAO-VCC-nZ with n = 2, 3, 4, 5.^{89} The NAO-VCC-nZ basis functions are constructed in analogy to the cc-pVnZ correlation-consistent (CC) basis sets by Dunning,^{116} but employing the numerically tabulated shape of NAOs (nodeless hydrogen-like radial functions with a numerical confinement potential applied to the tails). The NAO-VCC-nZ basis sets are optimized to be suitable for converging electronic total-energy calculations based on valence-only correlation methods that include sums over unoccupied states, e.g., RPA or MP2.^{89} In the following, the above two types of NAO basis sets, as well as cc-pVnZ (n = 2, 3, 4, 5)^{116} and aug-cc-pVnZ (n = 2, 3, 4, 5) basis sets,^{114} are compared to the aug-cc-pV5Z basis sets as the benchmark reference in the BSE calculations.

In Fig. 5, we show the difference between the BSE excitation energy computed with different basis sets and that of the aug-cc-pV5Z basis set for the lowest five singlet excitations of the ethene molecule in Thiel’s set. The different investigated basis set types and levels are identified on the *x* axes of all three panels. The size of the different basis sets for the ethene molecule is plotted on the *y* axis in panel (a). The difference Δ*E*_{i} between the BSE excitation energy computed using different basis sets and that computed using the aug-cc-pV5Z basis set is identified on the *y* axes in panels (b) and (c),

The Δ*E*_{i} of the lowest five singlet states (*i* = 1–5) are plotted in Fig. 5(b) without the TDA and in Fig. 5(c) with the TDA. In both Figs. 5(b) and 5(c), we see that the results obtained with the cc-pVnZ basis sets converge slowly toward the reference value as the basis size increases. The remaining discrepancy is greater than 0.5 eV even for the very expensive cc-pV5Z basis set. Although the magnitude of this discrepancy is unsatisfactory, its occurrence is not surprising. The stated reason for developing the augmented cc basis sets was an improved description of electron affinities,^{114} a key constituent of the BSE@*GW*@DFT excitation energies discussed here. Accordingly, the results obtained with the aug-cc-pVnZ basis sets converge much faster. This is in line with the literature:^{43,44} e.g., in Fig. 3 of Ref. 43, the LR-TDDFT and BSE@*G*^{0}*W*^{0}@B3LYP^{113} excitation energies of the ethene and pyrrole molecules using the aug-cc-pVDZ and aug-cc-pVQZ basis sets show differences less than 0.1 eV for ethene and about 0.2 eV for pyrrole, for both LR-TDDFT and BSE. In Figs. 5(b) and 5(c), the results obtained with the NAO-VCC-nZ basis sets, which are constructed analogously to the cc-pVnZ basis sets, display a similarly unsatisfactory convergence behavior as that of the results from cc-pVnZ basis sets. The other type of NAO basis sets, i.e., the FHI-aims-2009 “tier” basis sets, behaves slightly differently compared to the NAO-VCC-nZ basis sets. The FHI-aims-2009 tier2 basis set improves the BSE excitation energies significantly compared to the FHI-aims-2009 tier1 results, in fact even slightly better than the two unaugmented cc basis set prescriptions. However, the FHI-aims-2009 tier3 and tier4 results are very similar to those obtained using the tier2 basis set, displaying no further significant improvement.

The results discussed so far confirm the importance of the augmentation functions. We thus extend the FHI-aims-2009 tier2 basis set with different numbers of Gaussian-type augmentation functions obtained from the aug-cc-pV5Z basis set. The label “tier2+aug1” in Fig. 5 denotes the basis set generated by adding the first Gaussian augmentation function from the aug-cc-pV5Z basis set (angular momentum quantum number *l* = 0) to the FHI-aims-2009 tier2 basis set. Similarly, the label “tier2+aug2” denotes the basis set obtained by combining the FHI-aims-2009 tier2 basis set with the first two Gaussian augmentation functions (*l* = 0, 1) from the aug-cc-pV5Z basis set. We see that the addition of the augmentation functions significantly improves the accuracy of the FHI-aims-2009 tier2 results compared to the reference aug-cc-pV5Z values. Specifically, the tier2+aug2 basis set already yields essentially basis set converged values for the excitation energies of ethene shown in Fig. 5, with a remaining discrepancy of 0.1 eV or less compared to aug-cc-pV5Z. This conclusion is independent of whether or not the TDA is used in the BSE calculation, as shown in Fig. 5(c). As an important main result, the tier2+aug2 basis sets can here provide rather well converged values, comparable to the aug-cc-pV5Z reference values for low-lying neutral excitations. As will be shown below, this result can be generalized to the remainder of the molecules in Thiel’s set. Interestingly, the tier2+aug basis sets thus provide a recipe allowing one to use a basis set that is precise but affordable for ground-state DFT^{117} and, with a very limited modification, sufficient to achieve highly converged BSE results for low-lying excitations at the same time. For the lowest-energy excitations, which are often those of the greatest interest, we can thus use a very similar NAO basis set prescription as in ground-state DFT.

It is instructive to analyze the origins of the BSE@*G*^{0}*W*^{0}@PBE basis set convergence behavior in Fig. 5 in terms of the underlying electronic structure steps. We therefore next discuss the convergence of the HOMO and LUMO levels of ethene as well as of the HOMO-LUMO gap for DFT-PBE (Fig. S1 in the supplementary material) and for *G*^{0}*W*^{0}@PBE (Fig. S2 in the supplementary material). Not surprisingly, the DFT-PBE results are all significantly better converged than the BSE@*G*^{0}*W*^{0}@PBE results in Fig. 5. The LUMO converges slightly less well than the HOMO, not unexpectedly since it lies only a bit more than 1 eV below the vacuum level and is thus weakly bound. Much in line with the motivation of Kendall *et al.*^{114} to address electron affinities in their construction of the augmented cc basis sets, the addition of two augmentation functions converges the LUMO to the visual resolution of Fig. S1(b), i.e., a scale of roughly 0.01 eV. As is well known,^{68} the convergence of *G*^{0}*W*^{0} quasiparticle energies is much slower with basis set size than that of the DFT single-particle energies. All the correlation consistent basis sets show noticeable improvements of the quasiparticle HOMO, LUMO, and gap values in Fig. S2, up to the largest basis sets investigated here. However, they do not yet reach visual convergence for the LUMO and, thus, for the gap, even for the largest basis sets. The same is true for the FHI-aims-2009 “tier” basis sets, although this group of basis sets was constructed for ground-state DFT^{88} and is thus, technically, not designed to converge explicitly correlated methods with the same degree of systematicity as the cc basis sets.^{89} The augmentation functions added to “tier2” do not contribute to the convergence of the quasiparticle HOMO, whereas they do contribute somewhat to the convergence of the quasiparticle LUMO. In the cases of cc basis sets, basis set extrapolation strategies could be employed as done in Ref. 97 for *G*^{0}*W*^{0} quasiparticle levels. However, the BSE@*G*^{0}*W*^{0}@PBE results using augmented basis sets in Fig. 5 appear to converge well without extrapolation, which is initially surprising given the slow convergence of the quasiparticle energies. Regarding BSE@*G*^{0}*W*^{0}@PBE convergence, the most striking insight arises from the quasiparticle gap behavior in Fig. S2(c). The gap for all nonaugmented basis sets converges slowly, as seen in Fig. 5. In contrast, the slow convergence of quasiparticle HOMO and LUMO cancels nearly exactly for the augmented cc basis sets as well as for the tier2+aug2 and tier2+aug3 basis sets. Since *G*^{0}*W*^{0} quasiparticle energy differences are key terms in BSE@*G*^{0}*W*^{0} excitation energies, the quasiparticle gap behavior in Fig. S2(c) indicates that, at least for ethene, *both* error cancellation between HOMO and LUMO *G*^{0}*W*^{0} correlation energies and the better description of the LUMO by the augmentation functions contribute to low-lying BSE@*G*^{0}*W*^{0} excitation energies. As a result, the latter is largely numerically converged already for the relatively small tier2+aug2 basis set as well as for the augmented cc basis sets.

Fig. 6 shows the convergence of the MAE(*i*) of the five lowest-lying excitation energies as a function of the size of various basis sets in BSE calculations, employing the TDA and averaged over all molecules in Thiel’s set. The reference is, again, the aug-cc-pV5Z basis set. The different investigated basis set types and levels are identified on the *x* axes of panels (a) and (b). The basis size $N\xafbasis$ of different basis sets, averaged over all molecules in Thiel’s set, is plotted as the *y* axis in Fig. 6(a). $N\xafbasis$ is defined as

where *N*_{mol} is the number of molecules in Thiel’s set and $Nbasisj$ is the basis size for the molecule *j* in the benchmark set. We see that the convergence of different basis sets is similar to the earlier observations made for ethene in Fig. 5. Specifically, the tier2+aug2 basis set produces rather well converged results for all molecules investigated here. The lowest five singlet excitation energies calculated by BSE@*G*^{0}*W*^{0}@PBE using the aug-cc-pV5Z and tier2+aug2 basis sets are also listed in Table S7 in the supplementary material for all molecules in Thiel’s set.

### D. Convergence with respect to the BSE matrix size

In the BSE calculations presented in this work so far, we include the orbital pairs of *all* occupied and unoccupied orbitals in the construction of the BSE matrix. The dimension of the matrix problem Eq. (16) thus grows quadratically with the basis set size and also (for a fixed basis set level) with molecular size. Solving this full BSE matrix problem produces an excitation spectrum of the studied molecule that includes very high excitation energies. In many practical applications, however, we are interested in only the low-lying part of the excitation spectrum. In such a situation, one might suspect that the high-lying unoccupied quasiparticle states do not contribute much to the low-lying optical excitations. This can, in fact, not be entirely true, since single-quasiparticle like *GW* observables such as the ionization potential can depend significantly on high-lying parts of the spectrum being included in unoccupied-state sums in the relevant perturbation expressions.^{118} All-electron approaches to *G*^{0}*W*^{0} band gaps suffer from similar convergence issues with basis set size, specifically the basis set resolution in those regions of space that are closer to a nucleus.^{119} However, neutral excitations are not the same objects, and the question thus remains how many quasiparticle states, especially the high-lying unoccupied orbitals, should be included in the construction of the BSE matrix, in order to obtain sufficiently precise results for low-lying optical excitation energies. Previous investigations on leaving away states in the calculation of excitation energies exist. For instance, this was done in Ref. 115, already referenced above. As another common example, various studies show how to efficiently select the desirable orbital space in the study of complete active space approaches.^{4,120–122} As a final example, in the calculation of the electronic spectra of molecules in solution or surfaces, Besley developed an approach within LR-TDDFT and single excitation configuration interaction that limits electronic excitations to include only those between orbitals localized on the solute or adsorbent.^{123}

In Fig. 7, we show the errors incurred in the BSE low-lying singlet and triplet excitation energies obtained when applying different values of a cutoff energy *E*_{cut} for unoccupied states, limiting the number of states *a* and *b* entering the matrix construction in Eqs. (17) and (18). Here, *E*_{cut} denotes a cutoff energy above which the high-lying unoccupied quasiparticle states are omitted from the BSE matrix (however, such cutoffs are not applied in the construction of the quantities entering the BSE matrix). The average numbers of unoccupied states included for different choices of *E*_{cut} are tabulated in Table II. Figure 7 shows MAE(*i*) values for the lowest ten singlet (a) or triplet (b) excitation energies for the different cutoff energy values, using the tier2+aug2 basis set for all calculations and the excitation energies from the full calculation (no cutoff imposed) as a reference. In these calculations, all occupied states are kept in the construction of the BSE matrix, i.e., no cutoff threshold is applied to the occupied quasiparticle states. Figure 7 shows that the error of the results calculated with *E*_{cut} = 20 eV is about 20–30 meV. Setting *E*_{cut} = 40 eV yields excitation energies with an error closer to 10 meV. Larger *E*_{cut} values of 60 and 80 eV lead to further small improvements. In view of the remaining errors of these calculations (due to the level of theory for underlying DFT, quasiparticle energies, and neutral excitation formalism itself), these results suggest that one may use 40 eV as a threshold beyond which the impact of high-lying unoccupied quasiparticle states becomes negligible in BSE calculations of low-lying excitations. This can reduce the computational effort significantly because of the reduction of the number of states needed in construction of the BSE matrix. Specifically, the time complexity and memory complexity of constructing the BSE matrix in Eqs. (17) and (18) scale as $Nocc2\xd7Nunocc2$, where *N*_{occ} and *N*_{unocc} denote the number of occupied and unoccupied (g)KS single-particle states, respectively. By setting *E*_{cut} = 40 eV and for the tier2+aug2 basis sets, the number of unoccupied states is reduced to about 1/3 of the analogous number if no threshold energy values are used (see Table II). Additionally, the formal effort for solving the full BSE, Eq. (16), would scale as *O*(*N*^{6}), where *N* is a measure of system size, due to the same considerations of how *N*_{occ} and *N*_{unocc} impact the matrix dimension. While imposing *E*_{cut} will not reduce the formal scaling, the actual computational effort will nevertheless be reduced substantially in the limit of large systems where the BSE solution must eventually dominate. In short, both the time and the memory requirements of the BSE calculation of low-lying excitations are expected to be reduced significantly by imposing *E*_{cut}, without sacrificing much accuracy.

### E. Comparison to LR-TDDFT

In this section, we first investigate the basis set convergence of LR-TDDFT excitation energies for the molecules in Thiel’s set, using the strategy already employed for the BSE in Sec. IV C. The LR-TDDFT in FHI-aims was implemented following Eqs. (25)–(30). The exchange-correlation kernel used is the LDA, employing the parametrization of the correlation energy by Perdew and Wang,^{33,94} provided by the Libxc library.^{124,125} Note that the LR-TDDFT formalism leaves one with the freedom to choose different prescriptions of the XC functional for (i) the self-consistent solutions of single-particle orbitals and energies and (ii) the XC kernel *f*_{xc} [Eq. (30)] used in the actual LR-TDDFT construction. In this section, the exchange correlation functional for the self-consistent solutions of single-particle orbitals is PBE.^{34} We will use the notation “LR-TDDFT-LDA@PBE” in the following to denote this approach. To maintain consistency, the TDA is employed in the LR-TDDFT calculations shown in the main text of this paper, as this is widely done for BSE calculations as well. Additionally, LR-TDDFT calculations without making the TDA are included in the SM.

Figure 8 provides state-dependent MAE(*i*) values for the lowest five LR-TDDFT excitation energies computed with different basis sets, averaged over Thiel’s set. Excitation energies computed with the aug-cc-pV5Z basis sets are used as reference values. The different investigated basis set types and levels are identified on the *x* axes of both panels (a) and (b). Figure 8(a) for singlet states and Fig. 8(b) for triplet states show essentially identical behavior. The overall convergence pattern associated with all the basis sets investigated here is also very similar to the behavior seen for the BSE in Fig. 6. Excitation energy values derived from both the cc-pVnZ basis sets and the NAO-VCC-nZ basis sets converge slowly toward the reference value as the basis size increases. The largest error is about 0.5 eV for the very expensive cc-pV5Z basis set for singlet excitation energies and 0.3 eV for triplet excitation energies. The aug-cc-pVnZ basis sets, again, converge much faster. As for the BSE, the FHI-aims-2009 basis sets improve significantly toward the reference results from tier1 to tier2, but not further using tier3 and tier4. Finally, by adding augmentation functions to the FHI-aims-2009 tier2 basis sets, one can obtain well converged LR-TDDFT results by including only one or two augmentation basis functions. The tier2+aug2 basis set, already discussed for BSE calculations above, leads to compellingly good basis set convergence as a production recipe. The lowest five singlet and triplet excitation energies calculated by LR-TDDFT-LDA@PBE using aug-cc-pV5Z and tier2+aug2 basis sets are provided in Tables S8 and S9 in the supplementary material for all molecules in Thiel’s set.

As pointed out by a reviewer of this work, it is initially somewhat surprising that the convergence of LR-TDDFT and of BSE@G^{0}W^{0} is qualitatively so similar since LR-TDDFT contains no quasiparticle energies and no screened Coulomb interaction. Indeed, in our discussion of ethene above (Fig. 5 as well as Figs. S1 and S2 in the supplementary material), we traced the slow convergence of BSE@G^{0}W^{0} excitation energies to the quasiparticle energies as well as the need for augmentation functions to better capture the quasiparticle LUMO level. However, the underlying concepts of BSE and LR-TDDFT are of course similar—see, for instance, Ref. 22 for a detailed derivation. Indeed, the derivation of LR-TDDFT is based on the linear response function *χ* of the system [see Eq. (5.7) and following in Ref. 22 for details], and the linear response *χ*^{0} of the underlying generalized Kohn-Sham system is common to both BSE@G^{0}W^{0} and LR-TDDFT. It was shown by Bruneval^{126} that treating only *χ*^{0} in a larger basis set can enhance the convergence of G^{0}W^{0} quasiparticle eigenvalues, indicating that this is the quantity that determines the common convergence behavior of both approaches.

We next compare the results of BSE@*G*^{0}*W*^{0}@PBE with those of LR-TDDFT-LDA@PBE, implemented in FHI-aims and using the tier2+aug2 basis set validated above. While the primary focus of the present work is numerical and basis set convergence, we provide this comparison here because LR-TDDFT is widely used in quantum chemistry as a computationally efficient method for optical excitation calculations. We note that similar comparisons can be found in the literature,^{43,44} albeit not using the same basis sets. In our comparison, the underlying DFT calculations for both BSE and LR-TDDFT are carried out using the PBE^{34} exchange-correlation functional. Figure 9 shows the MSE(*i*) and the MAE(*i*) between LR-TDDFT-LDA@PBE results and BSE@*G*^{0}*W*^{0}@PBE results, averaged over all molecules in Thiel’s set. The BSE@*G*^{0}*W*^{0}@PBE results are taken as the reference. The lowest ten singlet and triplet excitation energies are compared. The MSE between LR-TDDFT-LDA@PBE and BSE@*G*^{0}*W*^{0}@PBE results for singlet and triplet excitation states is plotted in Figs. 9(a) and 9(b), respectively. The MAE between LR-TDDFT-LDA@PBE and BSE@*G*^{0}*W*^{0}@PBE results for singlet and triplet excitation states is plotted in Figs. 9(c) and 9(d), respectively. We see from Fig. 9 that the MAE between LR-TDDFT-LDA@PBE and BSE@*G*^{0}*W*^{0}@PBE for singlet excitation energies is less than 0.5 eV, whereas the MAE for triplet excitations energies can be as large as 0.7 eV. Singlet excitation energy values obtained from LR-TDDFT-LDA@PBE tend to be lower than those obtained from BSE@*G*^{0}*W*^{0}@PBE, whereas LR-TDDFT-LDA@PBE triplet excitation energies appear to be larger by up to 0.7 eV than those obtained from BSE@*G*^{0}*W*^{0}@PBE. Previous studies of BSE and LR-TDDFT show that the excitation energies computed by BSE and LR-TDDFT depend highly on the DFT starting point.^{43,44} Bruneval *et al.*^{43} compare the MSE of both BSE@*G*^{0}*W*^{0}@B3LYP and LR-TDDFT@B3LYP excitation energies with the BEs of Thiel’s set, showing that the MSE of LR-TDDFT is about 0.4 eV lower than the MSE of BSE.^{26,43} However, there are several differences between their comparison and the comparison shown in this work. First, the dataset used by Bruneval and co-workers is the BEs of Thiel’s set,^{24} which contains 103 singlet and 63 triplet excitation energies. In our work, we have a larger dataset to include the lowest ten singlet and triplet states of each molecule in Thiel’s set. Second, the BSE and LR-TDDFT calculations analyzed in this section rely on the TDA, which is not employed in the comparison performed by Bruneval *et al.*^{43} Finally, we here use a basis set that is essentially converged for both BSE and LR-TDDFT calculations. Another set of comparable results is therefore those of Jacquemin and co-workers,^{44} who compare BSE@*G*^{0}*W*^{0}@PBE0 and LR-TDDFT@PBE0 in a benchmark paper using the aug-cc-pVTZ basis set, which has similar convergence behavior as the tier2+aug2 basis set used here. Different MAE values between BSE@*G*^{0}*W*^{0}@PBE0 and LR-TDDFT@PBE0 excitation energies are reported for different categories of molecules in Thiel’s set: 0.27 eV for unsaturated aliphatic hydrocarbons; 0.51 eV for aromatic compounds; 0.37 eV for aldehydes, ketones, and amides; and 0.47 eV for nucleobases. The reported values are comparable to the values that we find in Fig. 9, in the range of 0.2–0.7 eV for different states of singlet and triplet excitation energies.

Finally, we briefly return to the overall accuracy of both BSE@*G*^{0}*W*^{0}@PBE and LR-TDDFT-LDA@PBE compared to BE^{24} values. In the literature and for TZVP basis sets, the reported error for BSE@*G*^{0}*W*^{0}@PBE can be around 0.8 eV (singlets) or 1.2 eV (triplets),^{43} as already alluded to in Sec. IV B. In Fig. 10, we show results using our own recommended basis set, tier2+aug2, comparing to BE values for the lowest singlets and (where available) triplets for all molecules in Thiel’s set. Interestingly, the BSE@*G*^{0}*W*^{0}@PBE energies appear to deviate even somewhat more from the BE values as a result of the, arguably improved, basis set than those reported for the TZVP basis set.^{43} In comparison, the LR-TDDFT-LDA@PBE level of theory performs better for this set of molecules [Fig. 10(b)], shown using the TDA, albeit still off on average by several tenths of electronvolts. This observation does not depend on the TDA, as shown in Fig. S3 in the SM, which compares LR-TDDFT-LDA@PBE data with and without the TDA to the same set of BE data as used in Fig. 10. As noted earlier, the primary objective of this paper is not a discussion of the accuracy compared to the BE value but rather the assessment of basis set convergence and other methodological aspects of our approach. Nevertheless, the results shown in Fig. 10 should be borne in mind by users of the approaches. Significant improvements to both BSE and LR-TDDFT for molecules are of course possible and documented in the literature, e.g., by changing the underlying density functional approximation^{44} and/or (for BSE) the underlying level of the *GW* approach applied.^{23,51}

### F. Remarks on time and memory cost of the BSE and LR-TDDFT implementation

While the present work does not include a performance optimization of either the BSE or the LR-TDDFT implementations reported above, we provide some individual timings indicative of the computational cost of our (not fully optimized) BSE and LR-TDDFT implementations. The results should only be understood as qualitative indicators of the current implementation, since no dedicated optimization was carried out, but nevertheless give some idea of the relative cost of different steps at present and indicate avenues for the future work in our own implementation. In this section, we consider a series of acene molecules of increasing size: benzene, naphthalene, and anthracene. Both BSE and LR-TDDFT calculations were performed using the “tier2+aug2” basis sets described in Secs. IV C and IV E, respectively, and employing the TDA.

In Table III, we show the timings of BSE@*G*^{0}*W*^{0}@PBE and LR-TDDFT-LDA@PBE performed for the three acene molecules selected. We apply a cutoff energy *E*_{cut} = 40 eV to limit the number of unoccupied states entering the BSE and Casida matrix construction as described in Sec. IV D. The calculations are performed using a single node with 44 cores (Intel Xeon, Broadwell microarchitecture, 2.4 GHz) on the Dogwood cluster at University of North Carolina, Chapel Hill. The total time is decomposed into three parts, i.e., RI basis setup denoted by “Basis” (precomputed three-center and two-center integrals in Sec. III), BSE/LR-TDDFT matrix construction or building denoted by “Build Mat.,” and solving the BSE/LR-TDDFT matrix as eigenvalue problems in Eqs. (22) and (25) denoted by “Solver.” The “Build Mat.” timing accounts for the computational effort in building the BSE matrix as outlined in Eqs. (17) and (18) vs the LR-TDDFT matrix in Eq. (26). We note that the timings comprise the BSE/LR-TDDFT step only, whereas the timings for the preceding steps are not counted, such as the *G*^{0}*W*^{0} and DFT-PBE steps in the BSE@*G*^{0}*W*^{0}@PBE calculation and the DFT-PBE step in the LR-TDDFT-LDA@PBE calculation. Table III shows that the majority of the total timing is attributed to the product basis setup step for both BSE and LR-TDDFT calculations. Both the total timing and the timings for each step are comparable for BSE and LR-TDDFT calculations, which is expected due to the similar formalisms. However, as will be seen below in Table IV, the difference of timings for the matrix building step between BSE and TDDFT becomes more significant in our present implementations if the matrix size is increased.

. | BSE@G^{0}W^{0}@PBE
. | LR-TDDFT-LDA@PBE . | ||||||
---|---|---|---|---|---|---|---|---|

Molecules . | Basis . | Build Mat. . | Solver . | Total . | Basis . | Build Mat. . | Solver . | Total . |

Benzene | 270 | 9 | 1 | 280 | 293 | 7 | 1 | 301 |

Naphthalene | 1200 | 44 | 5 | 1249 | 1184 | 39 | 5 | 1228 |

Anthracene | 3089 | 142 | 16 | 3247 | 2907 | 129 | 19 | 3055 |

. | BSE@G^{0}W^{0}@PBE
. | LR-TDDFT-LDA@PBE . | ||||||
---|---|---|---|---|---|---|---|---|

Molecules . | Basis . | Build Mat. . | Solver . | Total . | Basis . | Build Mat. . | Solver . | Total . |

Benzene | 270 | 9 | 1 | 280 | 293 | 7 | 1 | 301 |

Naphthalene | 1200 | 44 | 5 | 1249 | 1184 | 39 | 5 | 1228 |

Anthracene | 3089 | 142 | 16 | 3247 | 2907 | 129 | 19 | 3055 |

. | BSE@G^{0}W^{0}@PBE
. | LR-TDDFT-LDA@PBE . | ||||||
---|---|---|---|---|---|---|---|---|

Molecules . | Basis . | Build Mat. . | Solver . | Total . | Basis . | Build Mat. . | Solver . | Total . |

Benzene | 276 | 31 | 7 | 313 | 292 | 22 | 7 | 321 |

Naphthalene | 1205 | 314 | 72 | 1592 | 1200 | 173 | 71 | 1443 |

Anthracene | 3175 | 1531 | 407 | 5113 | 3010 | 824 | 405 | 4239 |

. | BSE@G^{0}W^{0}@PBE
. | LR-TDDFT-LDA@PBE . | ||||||
---|---|---|---|---|---|---|---|---|

Molecules . | Basis . | Build Mat. . | Solver . | Total . | Basis . | Build Mat. . | Solver . | Total . |

Benzene | 276 | 31 | 7 | 313 | 292 | 22 | 7 | 321 |

Naphthalene | 1205 | 314 | 72 | 1592 | 1200 | 173 | 71 | 1443 |

Anthracene | 3175 | 1531 | 407 | 5113 | 3010 | 824 | 405 | 4239 |

To demonstrate how the cutoff energy *E*_{cut} = 40 eV reduces the computational timings, we show in Table IV the timings for the same systems without using any cutoff energy for both BSE@*G*^{0}*W*^{0}@PBE and LR-TDDFT-LDA@PBE calculations. All other computational details remain the same as those used to obtain Table III. We see in Table IV that, by using all the unoccupied states in the BSE and LR-TDDFT calculations, the timings for building and solving matrix steps increase significantly compared to those in Table III. In contrast, the timings for the product basis setup step stay almost unchanged, which is expected since the number of basis functions and auxiliary basis functions is not affected by applying the cutoff energy to limit the number of unoccupied states. As mentioned above, the matrix building step in the present implementation reveals a difference between the BSE and the LR-TDDFT timings in Table IV. Specifically, the matrix building step for LR-TDDFT requires the calculations of the kernel *K*_{ia,jb} [Eqs. (26) and (27)]. In our parallel implementation, the state indices *i*, *a*, *j*, and *b* are distributed among different processors, as are the RI two- and three-center integrals, and the calculation of *K*_{ia,jb} requires significant interprocessor communication to get the correct state indices. For BSE, the analogous interprocessor communication has to be conducted twice in the calculations of (*ia*|*V* | *jb*) and (*ij*|*W*|*ab*) because the state index order is different in (*ia*|*V* | *jb*) and (*ij*|*W*|*ab*), and thus, they are calculated in separate steps. As a result, the timing of the matrix building step in the BSE is significantly larger than that in the LR-TDDFT calculations.

Finally, we verify how the cutoff energy *E*_{cut} = 40 eV reduces the memory requirements for the BSE and LR-TDDFT matrices. In Table V, we compare the memory used to store the BSE or LR-TDDFT matrix using all unoccupied states and the memory requirements for the matrices when applying *E*_{cut} = 40 eV, indicating a reduction by a factor of ∼9–11. This is consistent with that in Sec. IV D, where *E*_{cut} = 40 eV was shown to reduce the number of unoccupied states to about 1/3 of the full number of states if no cutoff energy values are used. Since the BSE and LR-TDDFT matrix size scales as $Nocc2\xd7Nunocc2$ [Eqs. (17) and (18)], the overall reduction amounts to (1/3)^{2} = 1/9, which is confirmed by the memory reduction shown in Table V.

. | BSE@G^{0}W^{0}@PBE
. | LR-TDDFT-LDA@PBE . | ||||
---|---|---|---|---|---|---|

Molecules . | Full . | E_{cut} = 40 eV
. | Ratio . | Full . | E_{cut} = 40 eV
. | Ratio . |

Benzene | 400 | 45 | 8.9 | 400 | 49 | 8.2 |

Naphthalene | 2549 | 261 | 9.8 | 2549 | 282 | 9.0 |

Anthracene | 9129 | 876 | 10.4 | 9129 | 979 | 9.3 |

. | BSE@G^{0}W^{0}@PBE
. | LR-TDDFT-LDA@PBE . | ||||
---|---|---|---|---|---|---|

Molecules . | Full . | E_{cut} = 40 eV
. | Ratio . | Full . | E_{cut} = 40 eV
. | Ratio . |

Benzene | 400 | 45 | 8.9 | 400 | 49 | 8.2 |

Naphthalene | 2549 | 261 | 9.8 | 2549 | 282 | 9.0 |

Anthracene | 9129 | 876 | 10.4 | 9129 | 979 | 9.3 |

## V. CONCLUSIONS

We describe an implementation of the Bethe-Salpeter equation approach to neutral excitations in small molecules based on numeric atom-centered basis sets in an all-electron electronic structure framework (the FHI-aims code). Benchmarks performed using Thiel’s set of small molecules^{24} demonstrate the numerical correctness of the implementation. Mean absolute errors less than 1 meV are obtained compared to the reference values computed using the MolGW code when exactly the same basis set and underlying technical approximations are used. We next investigate the impact of analytical approximations to the *G*^{0}*W*^{0} self-energy (the two-pole and 16-parameter Padé approximations) on the *G*^{0}*W*^{0} quasiparticle energies entering the BSE. The MAE of the BSE@*G*^{0}*W*^{0}@PBE results with these analytical approximations is around 0.05–0.20 eV, compared with the exact *G*^{0}*W*^{0} self-energy used in the MolGW reference code. The 16-parameter Padé approximation is more precise than the two-pole approximation where it can be used, but (due to the possible occurrence of multiple solutions in the quasiparticle equation, discussed elsewhere^{97}) the two-pole approximation offers an overall numerically more stable avenue (with the caveats laid out in Ref. 97). Ultimately, the differences due to either approximation are smaller than typical basis set errors and errors due to the level of theory itself as assessed in other benchmark publications. The basis set convergence behavior of the predicted low-lying excitations is investigated for the cc-pVnZ,^{116} FHI-aims-2009,^{88} NAO-VCC-nZ,^{89} and aug-cc-pVnZ^{114} literature basis sets, as well as for a simple modification of the FHI-aims-2009 tier2 basis set by adding two augmentation functions from the aug-cc basis sets,^{114} called “tier2+aug2.” For both BSE@*G*^{0}*W*^{0}@PBE and LR-TDDFT-LDA@PBE, adequate precision requires the use of augmentation functions, as expected from the literature. The “tier2+aug2” basis set provides high precision for both BSE and LR-TDDFT calculations while remaining applicable in production calculations. Furthermore, the convergence is investigated with respect to the number of unoccupied states included in the BSE or LR-TDDFT matrix construction. A threshold of *E*_{cut} = 40 eV is suggested, above which the unoccupied states are discarded in the construction of either the BSE or the LR-TDDFT matrix. This threshold significantly reduces the time and memory consumption while maintaining high precision of the results for low-lying excitations. Finally, BSE@*G*^{0}*W*^{0}@PBE and LR-TDDFT-LDA@PBE results are compared using the tier2+aug2 basis set for Thiel’s set of molecules. The difference between BSE@*G*^{0}*W*^{0}@PBE and LR-TDDFT-LDA@PBE is quantified by an MAE in the range of 0.2–0.7 eV for different singlet and triplet states calculated for molecules in Thiel’s set. In agreement with the literature, deviations from higher-level “best estimate” values are of a similar magnitude; one likely mitigation strategy is the selection of a better starting-point density functional for BSE@*G*^{0}*W*^{0}@DFT.

## SUPPLEMENTARY MATERIAL

See the supplementary material for excitation energies of the molecules in Thiel’s set using the BSE with and without the TDA; corresponding oscillator strengths of singlets within the TDA; definition of the “tier2” basis sets and numerical settings; DFT and *G*^{0}*W*^{0} single-particle HOMO, LUMO, and gap data for the ethene molecule; and basis set convergence of excitation energies of the molecules in Thiel’s set using the BSE and LR-TDDFT; comparison of LR-TDDFT results with and without the TDA to BE data. We furthermore provide numerical versions of the data shown in the supplementary material, Tables S1–S5 and S7–S9, in a comma-separated value (csv) format as file “si_tab.csv.”

## ACKNOWLEDGMENTS

This work was financially supported by the NSF under Award Nos. DMR-1729297 and DMR-1728921, as well as through the Research Triangle MRSEC (Grant No. DMR-11-21107). An award of computer time was provided by the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. This research used resources of the Argonne Leadership Computing Facility (ALCF), which is a DOE Office of Science User Facility supported under Contract No. DE-AC02-06CH11357. The authors would like to thank the University of North Carolina at Chapel Hill and the Research Computing group for providing computational resources and support that have contributed to these research results. X.R. acknowledges the financial support from the Chinese National Science Foundation (Grant Nos. 11574283 and 11874335) and the Max Planck Partner Group project. The authors are grateful to the reviewers of this work for their careful reading and numerous helpful suggestions that the authors were fortunate to be able to include in this work.

## REFERENCES

_{0}W

_{0}for molecular systems