Rationale for the Extrapolation Procedure in Selected Configuration Interaction

Selected configuration interaction (SCI) methods have emerged as state-of-the-art methodologies for achieving high accuracy and generating benchmark reference data for ground and excited states in small molecular systems. However, their precision relies heavily on extrapolation procedures to produce a final estimate of the exact result. Using the structure of the exact electronic energy landscape, we provide a rationale for the common linear extrapolation of the variational energy as a function of the second-order perturbative correction. In particular, we demonstrate that the energy gap and the coupling between the so-called internal and external spaces are the key factors determining the rate at which the linear regime is reached. Starting from first principles, we also derive a new non-linear extrapolation formula that improves the post-processing of data generated from SCI methods and can be applied to both ground- and excited-state energies.


I. INTRODUCTION
9][20] Their primary purpose is to calculate reference correlation and excitation energies in small molecular systems, 19,[21][22][23][24][25][26][27][28] for which they have demonstrated a remarkable ability to yield highly accurate estimates of full configuration interaction (FCI) results.The numerous variations of SCI all perform a sparse exploration of the Hilbert space by selecting only the most energetically relevant determinants.This natural philosophy emerges from the observation that, among the incredibly large number of determinants in the FCI space, only a tiny fraction of them significantly contribute to the energy.7][48][49][50] Stochastic CI methods, such as Monte Carlo CI (MCCI) 51,52 and FCI quantum Monte Carlo (FCIQMC), [53][54][55][56][57][58] follow a similar philosophy by using a stochastic representation to select the most important determinants.
The SCI wave function corresponds to a truncated CI expansion constructed from determinants in some internal (or model) space with the associated variational energy E var = ⟨Ψ var | Ĥ|Ψ var ⟩, where we assume the normalisation of the variational wave function, i.e., ⟨Ψ var |Ψ var ⟩ = 1.The accuracy of |Ψ var ⟩ can be assessed using the second-order Epstein-Nesbet perturbation correction, computed using a) Electronic mail: hgaburton@gmail.comb) Electronic mail: loos@irsamc.ups-tlse.frthe determinants {α} that lie outside the model space (i.e. in the external space A) as where H αα = ⟨α| Ĥ|α⟩.This perturbative approximation is traditionally derived using Löwdin partitioning, as shown in Appendix A. The exact FCI wave function and energy are indicated by the limit E PT2 → 0 − .Despite the sparse exploration of the Hilbert space, these state-of-the-art methods still rely heavily on extrapolation procedures to produce final FCI estimates. 19,20,23,34n particular, it is widely observed that an approximate linear relationship appears between E var and E PT2 when E PT2 becomes small enough, and thus a linear or quadratic extrapolation of E var for E PT2 → 0 is generally used to estimate the exact energy for an unconverged SCI calculation. 23The precision and reliability of this post-processing extrapolation procedure are critical in order to produce meaningful estimates.However, to the best of our knowledge, no theoretical justification for the linear (or otherwise) relationship between E var and E PT2 has been proposed.
To illustrate this extrapolation procedure, Fig. 1 shows the evolution of the variational correlation energy of benzene as a function of E PT2 , computed in the cc-pVDZ basis and within the frozen-core approximation.These SCI calculations were performed with quantum package using the CIPSI algorithm 34 and the data are extracted from Ref. 35.The FCI estimate of the correlation energy (solid black line in Fig. 1) was estimated to be −862.890mE h and was obtained by performing a five-point linear fit (dashed black line in Fig. 1) of the CIPSI data.This estimate carries an error of the order of 1 mE h and the fitting error was estimated to be 0.266 mE h .From Fig. 1, it is clear that, for sufficiently small E PT2 , the variational quantity is linear with respect to E PT2 .However, the SCI data deviate significantly from linearity for larger values of E PT2 , which we shall address in detail later on.
In this Communication, we provide a rationale to justify the linear extrapolation of the (zeroth-order) variational energy as a function of the second-order perturbative energy.We adopt a geometric approach that considers the variational wave function as a point on the exact electronic energy landscape, 59 allowing the second-order perturbative correction to be derived from the local gradient and curvature of this energy landscape.Moreover, we investigate a two-state model in which an analytic relationship between E var and E PT2 can be derived, leading to a novel parametrised non-linear formula that facilitates a more robust extrapolation procedure.

II. RATIONALE FOR THE LINEAR EXTRAPOLATION
The linear relationship between E var and E PT2 can be derived from first principles by considering the structure of the electronic energy landscape.While Ref. 59 describes this energy landscape perspective in detail, the salient points are summarised here.Any normalised wave function in the full N -dimensional Hilbert space can be represented by a vector v subject to the normalisation constraint v † • v = 1, which constrains the wave function to the surface of a hypersphere.The energy is given by the quadratic form and exact eigenstates of the Hamiltonian correspond to stationary points of E constrained to the surface of the hypersphere, as illustrated in Fig. 2. At any point on the hypersphere, the tangent space T contains the vectors that are orthogonal to v, which can be collected as the 2. Sketch of the exact electronic energy landscape in a three-dimensional Hilbert space.Eigenstates correspond to stationary points constrained to the surface of the unit sphere.At any point, the tangent space T is spanned by the two vectors (red) that are orthogonal to the position vector.
columns of an N × (N − 1) matrix v ⊥ .These tangent vectors correspond to the states |T ⟩ that are orthogonal to |Ψ⟩ and satisfy where 1 is the identity operator.A constrained step s on this landscape is parametrised using a unitary transformation as Assuming real wave functions, the components of the constrained energy gradient are then while the elements of the Hessian matrix of constrained second-derivatives become So far, we have only considered the structure of the electronic energy landscape for an arbitrary wave function in the full Hilbert space.For a SCI variational wave function, the only non-zero elements of v correspond to determinants included in the internal space, with coefficients c I , as defined in Eq. (1).The tangent vectors can then be split into two disjoint sets corresponding to the eigenstates within the internal space that are orthogonal to |Ψ var ⟩, denoted J = {|J⟩} J̸ =var , and the determinants in the external space, giving T = J ∪ A, as illustrated in Fig. 3.This separation of the internal and external spaces is further explained in Appendix A. Since The variational space I (dashed blue circle) is built from two configurations (blue vectors) and the external space A contains one configuration (red vector).At |Ψvar⟩, the tangent space is spanned by one tangent direction |J⟩, which is locally parallel to the variational space I, and one orthogonal tangent direction in the external space |α⟩.
|Ψ var ⟩ is a CI solution within the internal space, such that ⟨J| Ĥ|Ψ var ⟩ = 0, the gradient in Eq. ( 7) is only non-zero in the direction of the tangent vectors in A.
The local structure of the energy landscape around the variational wave function |Ψ var ⟩ is given by a second-order Taylor series expansion as Optimising this quadratic form with the Newton-Raphson step s = −Q −1 • g gives an estimate of the difference between the exact energy E exact and E var as or, equivalently, where we have exploited the fact that the gradient is zero in the direction of the tangent vectors within J .Crucially, the exact energy landscape becomes quadratic near an eigenstate. 59Therefore, Eq. ( 10) becomes an exact relationship when |Ψ var ⟩ is sufficiently accurate, and we rigorously recover the linear asymptotic relationship The steps that we have taken to derive this asymptotic relation are analytically rigorous.However, the matrix elements ⟨α|( Ĥ − E var 1) −1 |α ′ ⟩ in Eq. ( 11) are too expensive to compute in practice.Instead, we can assume that the Hamiltonian is diagonally dominant in A and take the leading order approximation where Ĥ(0) is the zeroth-order Hamiltonian within the Epstein-Nesbet partitioning.The energy correction then reduces to the second-order Epstein-Nesbet expression obtained in Eq. ( 2) to give Since E PT2 is the leading-order approximation to ∆E, we obtain ∆E ∼ E PT2 for ∆E → 0. Therefore, when |Ψ⟩ is sufficiently close to an eigenstate, we obtain the asymptotic relationship which justifies a linear extrapolation of E var against E PT2 .The only approximation employed in our derivation is to assume that Eq. ( 14) provides the dominant contribution to Eq. (11).From the exact energy landscape, we have shown that the asymptotic behaviour of E var is linear for any SCI variational wave function as E PT2 → 0, and that the slope of this relationship is close to −1 for E var → E exact .These properties are illustrated in Fig. 4, where we plot E var against ∆E and E PT2 for 10,000 randomly selected SCI internal spaces in H 2 O (STO-3G).We systematically target the ground state by ensuring that the HF determinant is always included in the internal space.The linear asymptote (black dashed) and the validity of  12) and ( 15), for H2O (STO-3G).We randomly select 10,000 sets of determinants of varying sizes (including the HF ground-state determinant) to construct the SCI internal space and variational wave function.

III. INSIGHTS FROM A TWO-STATE MODEL
In practice, linear or quadratic extrapolation procedures only work using a limited number of points and for well-converged calculations.As illustrated in Fig. 1, the variational energy generally deviates away from linearity for larger values of E PT2 .The cause of these deviations can be studied using a two-state model that represents the separation of the internal and external spaces in a SCI calculation, and for which the relationship between E var and E PT2 can be analytically derived.
Our model contains individual states |Ψ I ⟩ and |Ψ A ⟩ representing wave functions in the internal and external spaces, respectively, with characteristic energies E I and E A .The Hamiltonian matrix in this basis is then where t = ⟨Ψ I | Ĥ|Ψ A ⟩ represents the strength of the coupling between the internal and external spaces, and δE = E A − E I provides a measure of their energetic separation.The exact ground-state energy is The improvement of |Ψ var ⟩ during the course of a SCI calculation can be modelled by mixing |Ψ I ⟩ and |Ψ A ⟩ to give the parametrisation with 0 ≤ θ < 2π.The corresponding energy is Following a Taylor series expansion, the second-order correction is By solving Eq. ( 20) for θ, we can invert these equations to express E var in terms of E PT2 as (21) which naturally reduces to Eq. ( 17) for E PT2 = 0.This expression reveals that the more general form of E var as a function of E PT2 involves a square-root term that deviates away from linearity, and that this departure from the linear regime is directly related to the energetic separation (δE) and coupling strength (t) between the internal and external spaces.For (E PT2 ) 2 ≪ δE 2 2 + t 2 , we recover the linear behaviour E var ≈ E exact −E PT2 .In a real SCI calculation, we expect |t| ≪ δE 2 .Therefore, the larger the energy separation, the sooner the linear regime is reached, while a large coupling between I and A also leads more rapidly to the linear regime.These features are further demonstrated through the series expansion of E var at small E PT2 , which shows that the quadratic behaviour is minimal when |δE| or |t| is large.This functional relationship can be illustrated by evaluating E var for various values of E PT2 in the limit of interest, as shown in Fig. 5 for E I = −1, δE = 1 and t = 1.5][26][27]34 Figure 1 nicely illustrates this square-root behaviour for a realistic system, and the similarities between Figs. 1 and 5 are striking.For realistic systems, the two-state model can be approximately engineered using a suitable transformation of the external space A such that only one external state couples to the variational wave function through the Hamiltonian, as described in Appendix B.

IV. A NON-LINEAR EXTRAPOLATION FORMULA
The insights from our two-state model suggest that a non-linear functional form may be more suitable for extrapolating SCI data.The separation of the model and perturbation space in SCI means that the relationship between E var and E PT2 is not as straightforward as the model system.However, as more determinants are added to the model space, the variational wave function follows a path towards the exact ground state that is likely to resemble Eq. ( 21).The concave form of Eq. ( 21) suggests that a linear extrapolation procedure will generally underestimate the exact correlation energy and that any two-point linear fit would provide an upper bound to the exact energy.Therefore, we propose a new non-linear extrapolation formula where a, b, and c are fitting parameters.1][62][63] Crucially, Eq. ( 23) reduces to a linear fit for |E PT2 | ≪ c 2b and can reproduce the observed non-linearity for larger E PT2 .The fitted value of a provides the estimate for the FCI result.
The variation of the FCI estimate of the correlation energy of benzene computed in the cc-pVDZ basis using a linear or non-linear extrapolation procedure is compared in Fig. 6.These data show that the non-linear formula can accurately fit the SCI data at larger E PT2 values than the linear procedure.In the left panel of Fig. 7, we study the influence of the number of points included in the linear, quadratic, or non-linear fits, each of them starting from the point associated with the smallest value of E PT2 .The error bars indicate the standard errors associated with the fitting procedure.The fits are weighted according to the inverse square of the perturbative corrections.Although the FCI estimates obtained from the non-linear formula [see Eq. ( 23)] have larger fitting errors for a small number of points, they quickly stabilise and remain relatively consistent compared to those obtained from the linear procedure, which rises much faster.Consequently, a larger number of points can, and should, be employed for the non-linear extrapolation formula, while the linear extrapolation procedure becomes systematically worse when more points are used.Our results also demonstrate that, except for small numbers of points, the quadratic and non-linear fits are nearly identical.However, in contrast to the quadratic fit, the non-linear formula will never predict a maximum in E var vs E PT2 .
In the right panel of Fig. 7, we consider a fixed-length window of consecutive points for the linear, quadratic, and non-linear fits, but we vary the index of the starting point, with the point at index 1 corresponding to the smallest value of E PT2 (i.e., the higher the index of the starting point, the larger the E PT2 correction).Using 5 points for the linear fit and 8 points for the quadratic and non-linear fits, we again show that the quadratic and non-linear procedures are slightly more stable than the linear version with respect to the starting point, although all extrapolation schemes eventually underestimate the correlation energy for larger E PT2 values.These results indicate that, compared to the linear approach, the non-linear extrapolation procedure can provide a more accurate estimate of the FCI result for SCI calculations that are less well converged.
The benzene example offers a practical illustration that is close to the upper limit of what is currently achievable using SCI methods. 19,21,35However, since the exact FCI correlation energy remains unknown, there are potential ambiguities in the comparison of different extrapolation procedures.Instead, we turn to another example where the Hilbert space is significantly smaller, allowing the SCI calculations to be converged to the FCI limit.We consider the water molecule with the geometry used in Ref. 25, for which the exact (frozen-core) correlation energy in the cc-pVDZ basis set is −215.027mE h .We conduct a ground-state CIPSI calculation based on HF orbitals.The calculation is intentionally halted prematurely to assess the feasibility of estimating the true correlation energy with smaller variational spaces.Once again, we evaluate and compare the performance of linear, quadratic, and non-linear fits.
The left panel of Fig. 8 reports the convergence of the variational correlation energy as a function of E PT2 as well as the 5-point linear and non-linear fits obtained with minimal and maximal values of E PT2 = 9.5 × 10 −3 E h and E PT2 = 1.1 × 10 −1 E h , respectively.(The quadratic fit is almost indiscernible from the non-linear fit.)The right panel of Fig. 8 shows the FCI estimates produced from the linear, quadratic, and non-linear fits relying on a 5-point window of consecutive points.(The point of index 1 corresponds to a value of E PT2 = 1.6 × 10 −3 mE h .)Once again, the quadratic and non-linear fitting procedures are nearly identical and have considerably greater stability for less well-converged calculations compared to the linear version.For example, considering a set of only 351 determinants as the largest variational space (which is associated with a PT2 value of 7.5 × 10 −2 E h ), the estimated FCI value is −216.192mE h which is in error by around 1 mE h , while the linear fit predicts −204.07 mE h .

V. CONCLUDING REMARKS
In this Communication, we have proposed a theoretical rationale for the linear extrapolation procedure of the variational zeroth-order energy, E var , as a function of the second-order perturbative correction, E PT2 , commonly employed in SCI methods.Our derivation is based on connecting the SCI variational wave function to the properties of the underlying electronic energy landscape.The accuracy of the extrapolation of E var for E PT2 → 0 is critical for accurately estimating the final FCI value.
Our investigations led us to the discovery of a novel non-linear extrapolation formula that more effectively cap-tures the behaviour of E var for larger E PT2 values, thereby enhancing the robustness of extrapolations toward the FCI limit.Based on a two-state model, we derived the analytic form of this non-linear extrapolation formula and examined its mathematical properties.Specifically, we illustrated that the rate at which the linear regime is attained is primarily determined by the energetic gap and the coupling between the internal and external spaces.As a concrete example, we studied the ground-state correlation energy of benzene and water, which illustrates the versatility and reliability of this new extrapolation procedure.
We anticipate that our findings will provide a useful contribution to the computational toolbox for performing SCI calculations.In particular, introducing a mathematically motivated non-linear extrapolation alongside linear and quadratic approaches provides an alternative route to gain confidence in the validity of extrapolation procedures.While our numerical results indicate that the non-linear extrapolation performs similarly to a quadratic fit, the non-linear approach has the advantage that it cannot create a maximum in E var vs E PT2 , preventing catastrophically unphysical extrapolations.Furthermore, we expect that our two-state model and our framing of the SCI approach within the electronic energy landscape framework will allow other intriguing mathematical aspects of these methods to be explored in the future.
The H α0 = ⟨α| Ĥ|0⟩ elements directly couple the internal and external spaces and are proportional to the relevant gradient components g α = 2H α0 .The SVD ensures that |0⟩ directly couples to only one external state, giving

FIG. 1 .
FIG.1.Frozen-core (variational) correlation energy of benzene as a function of EPT2 computed in the cc-pVDZ basis, as described in Ref.35.

FIG. 3 .
FIG.3.Sketch of the tangent space construction T = J ∪ A. The variational space I (dashed blue circle) is built from two configurations (blue vectors) and the external space A contains one configuration (red vector).At |Ψvar⟩, the tangent space is spanned by one tangent direction |J⟩, which is locally parallel to the variational space I, and one orthogonal tangent direction in the external space |α⟩.

FIG. 4 .
FIG.4.Numerical illustration of the asymptotic relationships, Eqs.(12) and (15), for H2O (STO-3G).We randomly select 10,000 sets of determinants of varying sizes (including the HF ground-state determinant) to construct the SCI internal space and variational wave function.

FIG. 5 .
FIG.5.Evar (red markers) as a function of EPT2 for the two-state system with EI = −1, δE = 1 and t = 1.These data deviate from the linear approximation (black dashed line) due to the square-root term in Eq. (21).

FIG. 6 .
FIG.6.Comparison of the linear and non-linear extrapolation procedure for the correlation energy of benzene computed in the cc-pVDZ basis, using the SCI data from Ref.35.

FIG. 7 .FIG. 8 .
FIG.7.Evolution of the FCI estimate of the correlation energy of benzene computed in the cc-pVDZ basis.Left: A variable number of fitting points is considered in the linear (blue), quadratic (green), or non-linear (orange) fits, with each fit starting from the point associated with the smallest value of EPT2.Right: A fixed-length window of consecutive points is used for the linear (red), quadratic (purple), or non-linear (black) fits.The point of index 1 corresponds to the smallest value of EPT2, and a higher index for the starting point corresponds to a larger EPT2 correction.The error bars indicate the standard errors associated with the fitting procedure.
B2)Ignoring the indirect coupling elements H α ′ 1 and H α ′ β ′ then gives the two-state model with E I = H 00 , t = H α ′ 0 , and E A = H α ′ α ′ .Furthermore, since the states within I are decoupled by solving the CI problem, we can construct this two-state model for any variational state by choosing |Ψ I ⟩ and performing the SVD on the corresponding gradient vector.Therefore, this two-state model can represent both ground and low-lying excited states using different values of E I , δE, and t.While this transformation is not feasible in practice, it conceptually links real SCI data and the present two-state model.