xTC: An efficient treatment of three-body interactions in transcorrelated methods

An efficient implementation for approximate inclusion of the three-body operator arising in transcorrelated methods via exclusion of explicit three-body components (xTC) is presented and tested against results in the"HEAT"benchmark set [A. Tajti et al., J. Chem. Phys. 121, 11599 (2004)]. Using relatively modest basis sets and computationally simple methods, total, atomization, and formation energies within near-chemical accuracy from HEAT results were obtained. The xTC ansatz reduces the nominal scaling of the three-body part of transcorrelation by two orders of magnitude to O(N^5) and can readily be used with almost any quantum chemical correlation method.


I. INTRODUCTION
Achieving chemical accuracy in computational quantum chemistry calculations is still a significant challenge for all but the most simple systems. Convergence of the correlation energy with basis set size is often slow and high-quality correlation methods are required to obtain accurate results, leading to rapidly escalating computational cost.
One of the reasons for the slow convergence of the energy with the size of the basis set are the discontinuities in the first derivative of the wave function at the electron coalescence points r ij = 0, i.e., the Kato cusp. 1 Explicitly correlated methods aim to address this issue by introducing explicit dependence on the inter-electronic distances into the wave function, leading to significantly improved convergence to the basis set limit.  However, extension to higher-order methods is non-trivial.
Alternatively, the electronic wavefunction Ψ can be factorized into a Jastrow factor e τ and a smoother function Φ with the correlation factor describing the correlation of the electron pairs via the symmetric functions u(r i , r j ). 31 Transferring the Jastrow factor to the Hamiltonian via a similarity transformation gives rise to the non-hermitian transcorrelated Hamiltonian,H. The transcorrelated ansatz was pioneered by Boys and Handy 32 and has seen renewed interest in recent years.  Compared to the explicitly correlated methods, the transcorrelated ansatz has the advantage that it not only improves the basis set convergence, but also the quality of the underlying correlation method.
A major bottleneck of transcorrelated methods is the calculation of the three-body integrals arising from the similarity transformation of the Hamiltonian. Further, they require the implementation of additional modifications in the subsequent correlation methods that cost both additional CPU and developer time. Recently, approximate transcorrelated coupled cluster [57][58][59] and distinguishable cluster [60][61][62][63][64] methods that drop the pure three-body part of the normal ordered similarity transformed Hamiltonian have been proposed and shown to introduce only negligible errors in the final energies for atoms and molecules 53,56 as well as the three-dimensional uniform electron gas 52 . However, explicit calculation of the costly three-body integrals via numerical integration was still required.
In this work, we introduce an efficient algorithm for the inclusion of the transcorrelated three-body operator excluding explicit three-body components (xTC) via modifications of the zero, one-and two-body integrals. Instead of calculating the three-body integrals explicitly, the corrections to the integrals are obtained directly from the intermediates of the numerical integration, leading to a reduction of the scaling of the matrix-element evaluation stage by two orders of magnitude with respect to a naïve implementation. Extensive benchmarking results using different correlation methods, basis sets and additional approximations are presented.
The paper is organized as follows. In section II we will derive the working equations for the xTC ansatz in spin-orbital basis. Section III shows the results of the benchmarking calculations. Section IV will offer a brief summary of the results. Finally, spin-integrated working equations are shown in the appendix. The transcorrelated Hamiltonian in second quantization isH where P , Q, R, ... are general spin-orbital indices and are products of creation and annihilation operators. Note that here and throughout the article we use the Einstein convention, i.e., we implicitly sum over repeated indices on the right hand side of the equation as long as they do not appear on the left hand side. In addition to the standard integrals the Baker-Campbell-Hausdorff expansion of the similarity transformed Hamiltonian gives rise to the nonhermitian two-body integrals and the three-body integrals where we have used the symmetric permutation operators Aside from evaluation of the costly three-body integrals, inclusion of the three-body operator L N requires modification of any theory using transcorrelation to include terms interacting with a three-body operator in the Hamiltonian. In order to find approximations that will solve these issues, we will first introduce normal ordering in the next section.

B. Generalized normal ordering
Instead of defining normal-order with respect to a single determinant, Kutzelnigg and Mukherjee proposed a generalized normal ordering with respect to an arbitrary reference function |Φ 0 ⟩. 65,66 In the general normal ordering, the antisymmetrized products of Kronecker deltas are replaced by the density matrices of the reference function For a three-body operator, this yields withã being normal ordered operators and the sums going over all nine non-redundant permutations of (PRT) and (QSU) with the appropriate sign.
Introducing the combined two-electron integrals the normal ordered Hamiltonian can be separated into effective one-, two-and three-body operators with C. The xTC approximation: excluding explicit three-body components We can now exclude the explicit three-body components L N (xTC) and incorporate the remaining 3-body contributions via a change of the two-, one-and zerobody integrals.Ū Neglect of the pure normal ordered three-body operator eq. (14c) has been shown to introduce only minor errors, even for normal ordering with respect to the Hartree-Fock determinant. 53,56 The correction to the two-electron integrals is trivially obtained as InsertingŪ instead of U into F N yields a new generalized Fock-operator that differs from the original generalized Fock-operator F N according to In order to ensure that the generalized Fock-operator remains invariant with respect to the change in integrals, we combine the correction of the change due to ∆U QS P R in F N with the one-body correction arising due to the normal ordering of the three-body operator to obtain The zero-body correction is simply the expectation value of the three-body operator So far these equations are valid for an arbitrary reference function |Φ 0 ⟩. In the next section we will show how a single-determinant reference function greatly simplifies the equations.

D. Single determinant case
If the reference function |Φ 0 ⟩ is a single determinant, the higher order density matrices are simply antisymmetrized products of the one-body density matrix In this case, the one-and zero-body corrections are trivially obtained from contraction of the higher-body correction with the one-body density matrix While these two equations are only exact for a singledeterminant reference, they can still be used approximately with correlated densities (i.e., negelectng the higher order cumulants in the cumulant expansion of the respective density 65,66 ), which offers the potential for additional flexibility in the xTC ansatz.
The equations given above still require explicit calculation of the L-matrix. In the next section we will describe how we can obtain the corrections in (15) without the need of costly six-index intermediates.

E. Decomposition of two-body correction
In our implementation 47 , the three-body integrals are obtained via numerical integration over the grid points and the intermediate We use boldface here to indicate that V S R (i) and some of the following intermediates are three-dimensional spatial vectors and use the scalar product to indicate contraction over the spatial coordinates.
Making use of the decomposition eq. (24) allows us to avoid calculation of the three-body integrals entirely and instead calculate ∆U directly. By inserting eq. (21) into eq. (16) and changing the order of summation, we obtain with the intermediates Whereas the formal scaling of evaluation of the threebody integrals is of order N 6 orb N grid , the most expensive step of the xTC ansatz presented here, i.e., the assembly of the two-body correction eq. (26), scales only as N 4 orb N grid .

A. Computational details
The "HEAT" series of papers 67-70 provide bechmarkquality energetics for 31 atoms and molecules obtained with computationally expensive, accurate correlation methods and large basis sets, employing basis-set extrapolation to produce the final results. We compute the total, atomization, and formation energies of these systems using the transcorrelation ansatz excluding explicit three-body components (xTC) with moderate basis-set sizes without extrapolation and compare them against the corresponding nonrelativistic electronic energies from HEAT, reported as Hartree-Fock and correlation energy contributions in Table I of Ref. 67, which we treat as "exact" values.
Note that carbon is a solid under standard conditions, so "proper" formation energies of compounds containing carbon atoms cannot be calculated from our results. For the purpose of comparing with Table III of Ref. 67 we compute formation energies with respect to carbon monoxide instead, e.g., the formation energy for C 2 H 2 is obtained from the reaction with electron-electron, electron-nucleus, and electronelectron-nucleus contributions v, χ, and f , respectively, expanded in natural powers. 71,72 The Jastrow functions have been optimized by minimizing the variance of the reference energy within variational Monte Carlo, with the details being described elsewhere. 55 Numerical quadrature over the direct product of atom centered grids built from Treutler-Ahlrichs radial grids and Lebedev angular grids obtained from PySCF has been employed. 73 We observe that results are essentially converged for grid level 2 (see supplementary material) and this is the choice we have used throughout the paper. Final results were obtained using unrestriced coupled cluster (CC) and distinguishable cluster (DC) methods implemented in the Molpro quantum chemistry package. 74 In the following, we investigate the quality of the energies obtained with different coupled cluster based correlation methods as well as different types and sizes of basis sets. A comparison to the well established explicitly correlated methods represented by the F12a approximation with the diagonal ansatz 3C(D) as implemented in Molpro [75][76][77][78] has been included as well. Finally, we investigate the effect of further approximations to the xTC ansatz itself. Unless otherwise specified, the density matrix used in the xTC contractions is the HF density as described in appendix A for open-shell molecules and appendix B for closed-shell molecules, which corresponds to approximation B in previous work. 53,56 Furthermote, if not otherwise indicated, all-electron calculations have been carried out.
In order to simplify the comparison, we will only present the signed mean error (ME), standard deviation (STD) and maximum error (MaxE) for total energies and the mean absolute error (MAE), root mean square error (RMSE) and MaxE for relative energies in the body of this work. Individual results and additional figures can be found in the supplementary material.

B. Method dependence
The results for transcorrelated and explicitly correlated F12a methods are shown in Table II and Figures 1, 2 and 3. Additionally, the variational Monte-Carlo (VMC) energies obtained in the Jastrow optimization as well as the mean-field energies obtained with the xTC integrals (xTC-HF) have been included in Table II. Note that we have used CCSDT instead of CCSD(T) for the transcorrelated calculations since use of the xTC ansatz would require an iterative approach to the perturbative inclusion of the triples. Furthermore, there are several possibilities for non-iterative (T) corrections in transcorrelated methods, which will be investigated in a separate publication.
For total energies, transcorelated methods are on average by a factor of roughly two closer to the HEAT reference than the respective explicitly correlated methods. For relative energies however, the F12a methods benefit to a much larger degree from error compensation than the transcorrelated methods. Nevertheless, xTC-CCSD and xTC-DCSD on average still perform better than their explicitly correlated counterparts. While CCSD is unable to provide satisfactory results both with the transcorralated and the explicitly correlated ansatz, DCSD performs significantly better. xTC-DCSD in particular yields average errors close to or even within chemical accuracy (1 kcal/mol = 4.2 kJ/mol = 1.6 mhartree) that are however still about twice as large as those of CCSD(T)-F12a. Nevertheless, it provides a reasonable compromise between accuracy and computational cost. xTC-CCSDT further improves both absolute and relative energies compared to xTC-DCSD, so the success of xTC-DCSD is unlikely to be due to fortuitous error compensation alone. CCSD(T)-F12a emerges as the overall best method for the treatment of relative energies, with xTC-CCSDT yielding slightly worse results for atomization energies and almost identical results for formation energies. However, unlike CCSD(T)-F12a where explicit correlation in the triples part enters only implicitly via the doubles, inclusion of explicit correlation via xTC for higher order excitations is trivial in xTC-CCSDT and subsequent non-perturbative methods in the coupled cluster hierarchy and comes with no additional computational cost beyond generation of the xTC integrals.

C. Cardinal number and diffuse functions
A comparison of the results for xTC-DCSD with the basis sets cc-pVXZ and aug-cc-pVXZ (X = D, T, Q) 79 can be found in Table III. The total energies become progressively better with increasing cardinal number, but on average do not reach mhartree accuracy. Inclusion of diffuse functions further improves the results for the total energies, but the average errors remain larger than for the unaugmented basis set with the next higher cardinal number.
This clear trend does not translate to the atomization and formation energies. The double zeta basis sets yield results with average errors of up to roughly ten times chemical accuracy and are therefore clearly unsuitable. Triple zeta basis sets on the other hand are already able to achieve chemical accuracy for many molecules, especially if diffuse functions are included. However, despite the further improvement in the total energies for quadruple zeta basis sets, no further improvement in the relative energies is observed. Instead, the results become slightly worse while still remaining close to chemical accuracy, indicating that we benefit from favorable error compensation for the triple zeta basis sets.

D. Core correlated basis sets
The results for the use of the core-correlated basis sets cc-pwCVXZ and aug-cc-pwCVXZ (X = D, T) 80 are shown in Table IV. For double zeta basis sets, we see improvement in both total and relative energies, but not nearly enough to justify their use. For triple zeta basis sets, the total energies improve compared to the respective basis sets without additional core functions. For atomization energies however, use of core-correlated basis sets leads to larger errors outside of chemical accuracy and aug-cc-pVTZ results remain closest to the reference for the systems under investigation. For formation energies, the deterioration of results is less pronounced and the results remain close to chemical accuracy.

E. Effect of F12 basis sets
Results for the basis sets cc-pVXZ-F12 and aug-cc-pVXZ-F12 (X = D, T) 81,82 optimized for use with explicitly correlated methods are shown in Table V. While total energies are greatly improved compared to the respective basis sets discussed previously, the relative energies paint an ambiguous picture. Atomization and formation energies for double zeta basis sets are greatly enhanced but still far from chemical accuracy. For triple zeta basis sets however we observe slight deterioration of the results despite the more accurate total energies. Therefore, we come to the conclusion that the triple zeta F12 basis sets offer no advantage here compared to the smaller aug-cc-pVTZ basis set.  An obvious approach to reduce the cost of the calculation is the frozen core approximation. The orbitals can be frozen either before generating the additional integrals (FC-xTC) or after calculating the xTC corrections (xTC-FC).
In the first approach, we employ the standard frozen core approximation after evaluation of the one-and twoelectron integrals eqs. (6a) and (6b), respectively, and evaluate the transcorrelated integrals only for the remaining orbital space. However, this means that we simply dismiss the contribution of the core orbitals to the transcorrelation ansatz.
For the xTC-FC approach on the other hand, we first Method dependence of atomization energies (aug-cc-pVTZ) compared to HEAT. Dotted lines indicate chemical accuracy. The shaded area corresponds to the sum of gaussians centered at the data points with the width of the gaussians chosen such that equidistantly distributed gaussians would be contained to 95% in the corresponding segment.  NH3  CO2  HO2  OH  F  NO  HNO  OF  NH2  H  O  NH  HCO  N  HCN  CH3  CF  CH2  CN  C2H2  CH  C  CCH   CCSD-F12a   HF  H2O  H2O2  NH3  CO2  HO2  OH  F  NO  HNO  OF  NH2  H  O  NH  HCO  N  HCN  CH3  CF  CH2  CN  C2H2  CH  C  CCH   DCSD-F12a   HF  H2O  H2O2  NH3  CO2  HO2  OH  F  NO  HNO  OF  NH2  H  O  NH  HCO  N  HCN  CH3  CF  CH2  CN  C2H2 CH C CCH CCSD(T)-F12a 30 20    evaluate the xTC corrections to the integrals and only afterwards apply the frozen core approximation according to the standard equations, leading to inclusion of additional correlation arising from the core orbitals. Note that for xTC-FC, the one-electron integrals are no longer hermitian due to the contraction and addition of nonhermitian two-electron integrals onto the one-electron integrals. While FC-xTC obviously reduces the cost of the generation of the transcorrelated integrals, the effect on the efficiency of the subsequent correlation treatment is identical for both approaches. Being able to employ the FC-xTC approach would enable treatment of larger systems, but that would require the additional correlation included in xTC-FC via the transcorrelated integrals to be negligible.
The results for both approaches can be found in Table VI together with results for DCSD-F12a and CCSD(T)-F12a with and without frozen core approximation. For all molecules, the 1s-shell was frozen for each C, N, O and F atom in the respective molecule. The FC-xTC approximation increases the average error in the total energies by an order of magnitude and the average errors in the atomization and formation energies by a factor of about three and two, respectively. For the F12a methods use of the frozen core approximation also significantly deteriorates the quality of the total energies. The effect on the atomization and formation energies however is far less severe, showing that the F12 approach benefits again from excellent error compensation.
The total energies for FC-xTC and the FC-F12a methods are remarkably close to each other (see also the additional data provided in the supplementary material) despite the all electron calculations providing vastly different results. The xTC-FC approach however introduces comparatively small errors in both total and relative energies. This indicates that most of the correlation due to core-core and core-valence interactions is already included in the xTC ansatz and the subsequent correlation treatment primarily accounts for the valence-valence interactions.

G. Density and orbital choice
The xTC ansatz offers some additional flexibility through the choice of the density matrix used in the contractions. We are going to investigate three cases: HF orbitals with HF density (HF/HF), HF orbitals with DCSD density (HF/DCSD) and DCSD natural orbitals with DCSD density (DCSD/DCSD). Furthermore, it would be appealing to avoid using the restricted open-shell algorithm (appendix A) for open-shell molecules, since it leads not only to increased computational cost compared to the restricted closed-shell algorithm (appendix B) but also to spin-specific one-and two-body integrals, requiring additional storage space and the subsequent correlation methods to be able to handle spin-specific inte-grals. This can be accomplished by using the density of an open-shell system as the density in the closed-shell algorithm, which we will refer to as the spin-averaged (SA) approximation.
The results for the three orbital and density choices with and without SA approximation are listed in Table VII. The orbital and density choice has only a marginal impact on the total energies. For atomization and formation energies, HF/DCSD and DCSD/DCSD yield slightly better or worse results, respectively, on average. These changes are unlikely to be meaningful and are likely a result of changes in error cancellation.
Curiously, using the SA approximation does not significantly change the quality of either total or relative energies. It can even lead to slightly better results, as can be seen for the HF/HF orbital choice using the SA approximation, which shows the overall best results for the choices investigated here. Therefore, considering that other error sources in the current approach lead to significantly larger deviations, there is no compelling practical reason not to use the SA approximation.

IV. CONCLUSION AND OUTLOOK
We have implemented an approximation to the full transcorrelation treatment which excludes the full threebody term of the normal ordered transcorrelated Hamiltonian (xTC) and includes the effective zero-, one-and two-body contributions of the three-body term via modification of the remaining integrals and the nuclear repulsion term. Explicit calculation of three-body integrals is avoided by obtaining the correction to the integrals via contractions of a density matrix with the three-index intermediates for the numerical integration of the threebody integrals, reducing the scaling by two orders of magnitude while introducing only minor errors.
The scheme presented here has the added benefit of no longer requiring explicit contractions with the threebody integrals in the subsequent correlation treatment, thus removing the need for costly and complicated modifications in conventional correlation methods in order to allow the use of the transcorrelated method. In contrast to F12 methods, inclusion of explicit correlation via xTC in multireference methods or methods using triples and higher excitations is therefore trivial as long as one accomodates non-hermitian integrals.
Furthermore, unlike the frozen core approximation for F12 methods, freezing the core orbitals after generation of the xTC integrals (xTC-FC) allows partial inclusion of explicit correlation for the core orbitals. This has been shown to lead to only minor deviations from the all electron calculations allowing for computational savings in the subsequent correlation treatment without significant loss in accuracy.
Benchmark calculations have been carried out on the molecules of the HEAT set and it has been demonstrated that chemical accuracy for atomization and formation en- ergies is within reach for the comparatively cheap xTC-DCSD method and the aug-cc-pVTZ basis set. For methods that do not include triple excitations, results are generally better than those obtained with the respective explicitly correlated F12 method while methods that include triples excitations yield similar results for xTC and F12. While improving total energies, the more expensive basis sets (aug-)cc-pVQZ, (aug-)cc-pwCVTZ and (aug-)cc-pVTZ-F12 fail to show improvement in the relative energies compared to results obtained with aug-cc-pVTZ. The xTC ansatz in combination with the DCSD method enables the application of transcorrelation to systems of several hundred orbitals at a reasonable computational cost. Further development should focus on additional efficiency improvements and the development of more balanced Jastrow factors in order to further improve the quality of relative energies. Since the DCSD method has proven to be less unstable for systems with strong static correlation than conventional coupled cluster methods, the study of strongly correlated systems may be an interesting application as well.

SUPPLEMENTARY MATERIAL
Raw data and additional figures can be found in the supplementary material.

ACKNOWLEDGMENTS
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) -455145945. Financial support from the Max-Planck Society is gratefully acknowledged. P.L.R. and A.A. acknowledge support from the European Centre of Excellence in Exascale Computing TREX, funded by the Horizon 2020 program of the European Union under grant no. 952165. Any views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union or the European Research Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available within the article and its supplementary material.