Symmetry-adapted perturbation theory (SAPT) is a method for calculations of intermolecular (noncovalent) interaction energies. The set of SAPT codes that is described here, the current version named SAPT2020, includes virtually all variants of SAPT developed so far, among them two-body SAPT based on perturbative, coupled cluster, and density functional theory descriptions of monomers, three-body SAPT, and two-body SAPT for some classes of open-shell monomers. The properties of systems governed by noncovalent interactions can be predicted only if potential energy surfaces (force fields) are available. SAPT is the preferred approach for generating such surfaces since it is seamlessly connected to the asymptotic expansion of interaction energy. SAPT2020 includes codes for automatic development of such surfaces, enabling generation of complete dimer surfaces with a rigid monomer approximation for dimers containing about one hundred atoms. These codes can also be used to obtain surfaces including internal degrees of freedom of monomers.

## I. INTRODUCTION

Intermolecular (or noncovalent) interactions are on average more than an order of magnitude weaker than chemical (covalent) interactions. Despite being weak, noncovalent interactions govern a broad range of phenomena, from spectra of clusters to properties of most condensed phases and structures of bioaggregates. Due to their relative smallness, interaction energies are most naturally computed by perturbative methods, collectively known as symmetry-adapted perturbation theory (SAPT), starting from the isolated molecules (monomers). This means that the Hamiltonian *H* of a cluster is written as the sum of Hamiltonians of the separated monomers, *H*_{0}, and of the perturbation operator *V* that collects Coulomb interactions between the electrons and nuclei of each monomer and those of the interacting partners. The simplest approach of this type (which is actually not symmetry-adapted) is Rayleigh–Schrödinger’s perturbation theory (RS or RSPT), but RSPT is unphysical at short separations since it does not predict the existence of the repulsive wall. To include repulsion, one has to properly antisymmetrize the cluster wave function, i.e., enforce Pauli’s exclusion principle, resulting in a symmetry-adapted perturbation theory. There are several ways to perform symmetry adaptation, the simplest one being symmetrization of the wave functions of the RS method, leading to the symmetrized RS (SRS) approach^{1} (the approach used exclusively for many-electron applications). SAPT has been presented in detail in many original papers,^{1–11} reviews,^{12–17} and textbooks.^{18–21}

SAPT is formally an exact theory, and in practice, its accuracy is similar to the coupled cluster method with single, double, and noniterative triple excitations [CCSD(T)].^{22} To get such accuracy in the wave-function approach, monomers have to be described with a relatively high-level treatment of electron correlation, and such SAPT is comparable to CCSD(T) also in terms of costs. SAPT based on monomers described by density-functional theory (DFT), a method denoted as SAPT(DFT), is about as accurate as the regular SAPT at its highest programmed level, but significantly faster. SAPT has important features that make it different from the supermolecular method that subtracts monomer energies from a dimer energy to obtain the interaction energy: $Eint=EtotAB\u2212EtotA\u2212EtotB$, where the subscript “tot” denotes total electronic energies. In particular, SAPT provides a unique ability to interpret properties dependent on intermolecular forces in terms of the four fundamental physical mechanisms that lead to the electrostatic, exchange, induction, and dispersion contributions to the interaction energies. Furthermore, since SAPT is a perturbation theory that calculates interaction energy directly, starting from the isolated monomers, no subtractions are involved, and in consequence, SAPT is free of the basis-set superposition error (BSSE) present in the supermolecular approach.

SAPT is *the* theory of intermolecular forces, and most textbooks discussing intermolecular interactions use SAPT concepts even if the SAPT acronym is not mentioned. SAPT is also known under several other names, e.g., the effective fragment potential (EFP) method^{23} is an approximate semiempirical version of SAPT. SAPT provides what can be called the standard model of energy decomposition analysis (EDA) for intermolecular interactions since SAPT by design defines interaction energies as the sum of the physical contributions listed earlier. These contributions are defined in a unique way and can be calculated with potentially arbitrary accuracy and at complete basis set (CBS) limits.

Since the exact wave functions are unknown for larger monomers, SAPT approximates these functions at several levels of electronic structure theory. The original version of SAPT is a double perturbation theory (when applied to dimers) with perturbation operators *V* and *W* = *W*_{A} + *W*_{B}. The latter operator describes intramonomer correlation effects, i.e., *W*_{A} and *W*_{B} are the Møller–Plesset (MP) fluctuation potentials of the respective monomers. If *W* is neglected, the monomers are described at the Hartree–Fock (HF) level. Higher levels include consecutive powers of *W*. This means that each monomer is described at the *n*th-order of many-body perturbation theory, MBPT*n*, denoted often as MP*n*. Selective summations to infinite order in *W* applying the coupled-clusters method are also possible. In this case, SAPT becomes nonperturbative in the operator *W*. In addition, SAPT(DFT) is a perturbation theory only in *V*.

Some of the statements we will make about SAPT refer to general theory defined by *V* and the use of symmetry adaptation. In practical applications, SAPT comes in one of the several flavors due to the different ways of imposing permutational symmetry and also due to the approximations made in accounting for electron correlation effects in monomers. Distinctions between various flavors will be made explicit whenever needed. If symmetry adaptation is not stated explicitly, the SRS approach will be assumed.

### A. Accuracy of SAPT results

While SAPT has become a mainstream electronic structure method, various misconceptions on SAPT are circulated. In particular, it is sometimes stated that SAPT is “just” one more EDA method useful for interpreting interaction energies, but of low accuracy and divergent. In fact, SAPT is an exact method in the sense that it reproduces the exact interaction energy as the order in *V* goes to infinity, provided that appropriate symmetry-enforcing techniques and exact monomer wave functions are used (or the number of excitations in the CC method used to describe monomers approaches the number of electrons in each monomer). These statements are based on high-order calculations for small systems (see Ref. 13 for a review of this work). Figure 1 shows results of SAPT calculations from Refs. 7 and 24 for the triplet state of Li–H at the interatomic separation *R* = 11.5 bohrs near the van der Waals minimum for this state. All electrons were considered. Lithium atom wave functions were obtained using the full configuration interaction (FCI) method, and the SAPT wave function corrections were expanded in FCI bases built as the direct product of the Li basis and the set of H atom orbitals. Thus, the perturbation corrections are exact in the chosen orbital space. Figure 1 shows the percentage of the FCI interaction energy recovered by a given method as a function of the perturbation theory order *n*. Three versions of symmetry-adaptation are shown in Fig. 1, and all three seem to converge to the FCI result in high order. In fact, two of these methods converge in absolute sense, while SRS converges only asymptotically (see the next paragraph). The first version of SAPT was proposed by Eisenschitz and London,^{25} and later by Hirschfelder^{26} and van der Avoird^{27} (EL-HAV). As shown in Fig. 1, this version converges very well in high order (the calculations were performed up to *n* = 70, and the error continues to decrease for all *n* considered). However, EL-HAV performs poorly in low order. Furthermore, low-order EL-HAV is incorrect in the asymptotic region; thus, it would not be a proper variant of SAPT to be used as an EDA. The reason for this problem is that while SRS imposes permutational symmetry at a minimal required extent, EL-HAV does it in the most extensive way, in fact, so extensive that it spoils the correct asymptotic behavior of RSPT.

The SRS method appears to converge in Fig. 1, but if one goes to higher orders, it shows divergent behavior. The order where the divergence starts is dependent on the separation *R* between monomers. Methods with such behavior are called asymptotically convergent. This means that the error of SRS truncated at the order just before the divergence shows up is arbitrarily small if *R* is sufficiently large. One should also add that at the Li–H van der Waals minimum for the triplet state, SRS starts to diverge around *n* = 25 (Ref. 24), which is, of course, inconsequential in any practical applications. In the asymptotic region, the SRS interaction energy corrections become identical to the RS ones; therefore, SRS by definition has the correct asymptotic behavior, i.e., it is seamlessly connected to the multipole expansion of interaction energy. Figure 1 also shows that in orders *n* = 2–4, SRS converges faster than EL-HAV. The convergence of SRS in low order can be improved by regularizing the electron–nuclei interactions,^{28} i.e., by removing the Coulomb singularity at small electron–nucleus separations. The resulting R-SRS method is identical in low order to the R-SRS+EL-HAV method shown in Fig. 1 and discussed below.

The third variant of SAPT shown in Fig. 1, the R-SRS+EL-HAV approach proposed in Ref. 7, combines the best features of all theories discussed above. In low order, it uses R-SRS so that it is accurate there and it has the correct asymptotic behavior. In high order, it uses the EL-HAV method to impose permutational symmetry, which leads to fast high-order convergence. There are several other versions of symmetry adaptation that lead to different convergence patterns (see, for example, Refs. 1, 7, 13, 24, and 29–37).

Since SAPT is a perturbation expansion in powers of *V*, one may expect that SAPT will start to diverge as *R* becomes small and, consequently, *V* becomes a large perturbation. Apparently, this divergence is not observed even for interactions as strong as those characteristic of chemical bonds. Figure 2 of Ref. 13 shows the convergence of several variants of SAPT at the chemical minimum of Li–H (*R* = 3.015 bohrs). Here, SRS shows the high-order divergence already around *n* = 10, but EL-HAV and several other variants of SAPT converge perfectly well. In addition, low-order SAPT calculations for larger systems show only a minimal worsening of convergence when one goes to small *R*’s (see Ref. 38 for an analysis of the Ar_{2} SAPT results at *R* = 1.5 Å, where the interaction energy is more than three orders of magnitude larger than the absolute value of this quantity at the van der Waals minimum, *R* = 3.76 Å).

### B. SAPT vs CCSD(T)

Another misconception about SAPT is that the programmed general version of SAPT^{39–41} is at best equivalent to MBPT in the second order (MP2). In fact, the version of SAPT that is applicable to interactions of arbitrary closed-shell molecules and some open-shell ones includes terms up to the third order in *V* and up to CCSD level treatment of electron correlation in monomers. By comparing individual terms at the RS level to supermolecular MBPT/CC terms, Chalasinski and Szczesniak^{42,43} demonstrated that the MP4 level includes dispersion corrections $Edisp(ni)$ such that *n* + *i* ≤ 4 and the electrostatic correction $Eelst(14)$ (see Sec. II for definitions). The highest programmed level of SAPT in the regular MBPT/CC module includes most such corrections, in particular, $Eelst(14)$ is included in $Eelst(1)$(CCSD) and $Edisp(22)$ is coded. Corrections $Edisp(40)$ and $Edisp(31)$ are not available, but since $Edisp(30)$ is already quite small, they are expected to be negligible. SAPT components such as $Eelst(1)$(CCSD), $Eexch(1)$(CCSD), and $Edisp(2)$[CCD+ST(CCD)] described later on go beyond the supermolecular MP4 level in terms of intramonomer correlation. While SAPT does not include triple excitations within monomers, intermonomer triple excitations are present. Apparently, the former excitations do not contribute significantly to interaction energies, as shown by the agreement between these two methods to within a few percent (see below) found in many if not most calculations. Thus, while a formal proof is not available, one may conjecture that the highest programmed level of SAPT is approximately equivalent to the supermolecular CCSD(T) level. The SAPT variant discussed above is the “standard” SAPT(HF/MP/CC) applicable to fairly large dimers and scaling not worse than *N*^{7}, where *N* is the system size. The equivalence to supermolecular CCSD(T) is even closer for SAPT(CCSD),^{44–51} but these codes scale more steeply with the system size. One should point out here that the agreement of SAPT with high-level supermolecular approaches is better than the results in Fig. 1 may suggest. The reason is that Li is a difficult case for SAPT since the current SAPT assumes that the unperturbed state is nondegenerate, whereas Li is a quasi-degenerate system due to the near-degeneracy of its 2s and 2p orbitals.

While two-body SAPT has been programmed up to the third order in *V*, most applications are restricted to the second order since the third order contributes typically only a few percent^{52,53} to the total interaction energies. However, truncation of the other perturbation expansion, in powers of the intramonomer correlation operators *W*, impacts the accuracy of the predictions substantially. If *W* is neglected, i.e., SAPT(HF) is used (this level is also sometimes referred to as SAPT0), the resulting interaction energies have errors that can be as large as 20%–30%. Perhaps somewhat counterintuitively, SAPT(HF) interaction energies are close to supermolecular MP2 interaction energies rather than to HF interaction energies. This is because the intermonomer correlation effects, such as the dispersion energy, are included at this level. The next level, SAPT2, computationally roughly equivalent to the supermolecular MP2 level and therefore sometimes denoted as SAPT(MP2), is significantly more accurate (SAPT*i* descriptors were initially keywords in the SAPT^{54} codes, and they were recently extended and systematized in Ref. 55). As already mentioned, one can also use the coupled-cluster ansatz to sum up the selected MBPT terms to infinite order in *W* (see Ref. 56 for relations between MBPT and CC), as in Ref. 57 where the dispersion energy was computed with both monomers’ excitations at the CCD level and with the doubly excited dispersion amplitudes of the first-order in *V* iterated to infinite order in *W*. In addition, the single- and triple-excitation contributions in the first-order dispersion functions were computed with monomers’ CCD amplitudes. This approximation to the dispersion energy is denoted as $Edisp(2)$[CCD+ST(CCD)]. All SAPT corrections of the first- and second-order in *V* are also available at the SAPT(CCSD) level.^{44–51} In this notational convention, if DFT is used to describe the monomers, this approach is denoted as SAPT(DFT).^{5,6,9–11} This notation is only qualitative in the sense that in most cases, the monomers are not described at exactly the indicated level. For example, in the case of SAPT(HF), the induction energies use monomer orbitals computed with the response to the external field due to the HF charge distribution of the interacting partner, and in the case of SAPT(DFT), monomer frequency-dependent density–density response functions [also known as frequency dependent density susceptibilities (FDDSs)] are used. Bear in mind that, as discussed elsewhere in this paper, SAPT(HF) and SAPT(DFT) interaction energies are dramatically different from the supermolecular HF and DFT interaction energies, respectively. In addition, SAPT(CCSD) interaction energies are close to supermolecular CCSD(T) values and are significantly more accurate than the CCSD ones.

To illustrate the relations between the results from the programmed version of SAPT and CCSD(T), we plot in Fig. 2 the interaction energies for the helium dimer.^{58,59} All results are at CBS limits, and the “exact” curve is accurate to better than the width of the line. Interestingly, for this system at the van der Waals minimum, SAPT is slightly more accurate than CCSD(T). To achieve CBS limits in SAPT calculations shown in Fig. 2, the corrections *E*^{(1)}, $Edisp(20)$, and $Edisp(21)$ were computed using explicitly correlated bases of Gaussian-type geminals (GTG)^{2,60,61} and programs that are not included in SAPT2020, but the same goal could be achieved by extrapolations from large orbital bases [in addition, *E*^{(1)} computed in GTG bases is equivalent to *E*^{(1)}(CCSD) for He_{2}].

A broader comparison of SAPT with CCSD(T) is presented in Fig. 3. The results are taken from Ref. 62 where CCSD(T)/CBS benchmarks were computed for ten dimers, containing up to 32 atoms, varying the intermonomer separation *R* from asymptotic to repulsive configurations. The median unsigned percentage error of SAPT(DFT) is only about 1% larger than that of CCSD(T), both methods computed in the same basis set [the CCSD(T) error here is, of course, entirely due to basis set incompleteness]. SAPT(DFT) performs significantly better than all other DFT-based methods included in Fig. 3 (see Ref. 62 for discussion and references to the methods shown in Fig. 3).

SAPT has been used to develop potential energy surfaces (PESs) for a large number of dimers. Often, SAPT PESs are among the most accurate ones published and if used in nuclear dynamics calculations predict observables in excellent agreement with experiment, e.g., for the water dimer spectra.^{63,64} SAPT PESs also allow reliable predictions of crystal structures.^{65} Thus, there is a strong connection between SAPT and experiment. Although these comparisons involve the total PESs, there is a connection to SAPT components as well. For example, to predict correctly crystal densities, one has to have the repulsive walls at the right places, which tests the exchange–repulsion energy. The crystals of monomers dominated by dispersion interactions, for example, the argon crystal,^{66} test indirectly this component of SAPT.

### C. SAPT(DFT) vs supermolecular KS DFT

Another misconception is that SAPT(DFT) is an approximation to the KS DFT supermolecular approach. This is no more the case than that SAPT based on HF orbitals is an approximation to the supermolecular HF method. While it should be already clear from the discussion above that SAPT(DFT) is not an approximation to KS DFT, a dramatic illustration is presented in Fig. 4. The data were taken from Refs. 38 and 67, except for SAPT(DFT) values computed by us using the Perdew–Burke-Ernzerhof plus exact exchange (PBE0)^{68,69} functional, the experimental ionization potential of argon^{70} [needed for the gradient regulated asymptotic correction (GRAC)^{71}], aug-cc-pV*X*Z basis sets^{72} plus a 3s3p2d2f set of midbond functions, and 1/*X*^{3} extrapolations. All other calculations were also performed at CBS limits. As expected, the SAPT(DFT) curve is very close to the CCSD(T) one. In stark contrast, supermolecular DFT calculations give interaction energies spread all over the place. This is related to the inability of semilocal DFT methods to reproduce the dispersion energy, which is particularly visible for large *R* where such methods predict essentially zero interaction, whereas SAPT(DFT) and CCSD(T) curves coincide with the dispersion plus exchange–dispersion energy curve (denoted as *E*_{dispx}), as shown in Fig. 4.

### D. Higher-order induction and the $\delta EintHF$ term

The SAPT results for polar systems are usually supplemented by the so-called $\delta EintHF$ term.^{52,53,73} This term is defined as the difference between the supermolecular HF interaction energy $EintHF$ and the sum of the first-order, induction, and exchange–induction components. This sum is included in $EintHF$ so that $\delta EintHF$ accounts mainly for higher-order induction and exchange–induction effects (but does include some relatively small first- and second-order contributions). One can view this approach also as a hybrid method consisting in adding to $EintHF$ the SAPT correlation contributions to interaction energies. Note that this is a well justified procedure since $EintHF$ does not include any electron correlation. However, this procedure is not completely rigorous since some correlation effects in induction energies depend on whether correlation is defined with respect to monomers’ or dimers’ HF energies. For most systems, such ambiguities are not relevant, but for systems with relatively large dispersion interactions, the use of the $\delta EintHF$ term leads to a decrease in accuracy. The authors of Ref. 74 proposed criteria for including the $\delta EintHF$ term. One version of the hybrid approach is the Hartree-Fock plus dispersion plus first order correlation (HFDc^{(1)}) method proposed in Ref. 75, recently applied to dimers with nearly 200 atoms in Ref. 76. This method consists in adding to $EintHF$ the dispersion and exchange–dispersion energies, as well as the intramonomer correlation contribution to first-order energies. HFDc^{(1)} is intended to be used in a fully *ab initio* way but can also be used with a fitted dispersion function. The later approach is very inexpensive and still gives reasonably accurate interaction energies.

### E. SAPT components

The physical components of SAPT are uniquely defined and can be computed to potentially arbitrary accuracy. Each of these terms has a precise physical interpretation. The electrostatic energy is the Coulomb interaction of unperturbed charge distributions. The induction energy of second order in *V* results from the response of monomer A (B) to the field of unperturbed charge distribution of monomer B (A). The dispersion energy results from correlations of electron positions between monomer A and monomer B. All these components are defined in a basis-set independent way at all *R*’s, not only in the asymptotic region. Finally, the exchange–repulsion component results from exchange tunneling of electrons between interacting systems and decays exponentially. Thus, in the case of SAPT, there is no lack of precision in defining the physical origins of these components. Consequently, SAPT can be used as the standard model for EDAs in the intermolecular interaction sector, and EDAs inconsistent with SAPT should be not be used since such EDAs will also be incompatible with the physical contents of intermolecular interactions.

For physical interpretation of induction interactions, it is best to look at the sum of the induction and exchange–induction corrections, sometimes denoted as *E*_{indx}, rather than at just the induction energy. The reason is that at *R* near the van der Waals minimum and smaller, the charge-overlap contributions in induction energies become large, leading to large discrepancies between the asymptotic expansion of induction energy and SAPT values,^{77} larger in magnitude than typical damping effects and of opposite sign. This is due to the fact that for systems with one of the monomers having more than two electrons, the interacting system is submerged in the continuum of Pauli-forbidden states. Unless such states are projected out by properly enforcing antisymmetry,^{7} which is not done in RSPT that defines induction energies, wave functions have spurious contributions from such states. If the exchange–induction term is added, the contributions coming from the violation of symmetry are canceled out to a large extent. The admixture of unphysical states can be significantly reduced by using a regularized operator *V* (see Refs. 7 and 28). This approach was used in Refs. 78 and 79 to approximately separate the charge-transfer contribution in induction energies. When higher-order induction effects are important, as they are in strongly bound systems with large polarization and charge-delocalization, one should include *E*_{indx} computed to the third order in *V* as well as the appropriate $\delta EintHF$ term.^{52,53}

SAPT physical component analysis has been used in a large number of papers to interpret intermolecular interactions. In some cases, such an analysis revised the established interpretations (see, for example, a novel analysis of hydrogen bonds in Refs. 80 and 81). It is applied not only to typical molecular dimers but also to a variety of more exotic systems such as interactions of molecules with carbon nanotubes^{82,83} or adsorption of silver atomic clusters on the surface of titanium oxide.^{84}

### F. Recent SAPT developments by other groups

Several recent developments of SAPT are available in programs written by other groups. These include the methods of the XSAPT family^{85–90} that uses SAPT with monomers polarized in the field of interacting partners via the X-POL method^{91} (the “X” in SAPT^{85–90} denotes the use of X-POL), the use of explicitly correlated basis set, the so-called SAPT-F12 approach,^{92,93} a method of calculating the induction and dispersion energies from the Bethe–Salpeter equation,^{94} or algorithms for dispersion and exchange–dispersion energies based on a multireference description of monomers (still restricted to nondegenerate ground states) presented in Refs. 95 and 96. SAPT-F12 follows a long-time tradition of using explicitly correlated functions in SAPT^{2,59,60,97–99} for small dimers. The *S*^{2} approximation in the second-order exchange energies (see Sec. II) was removed at the SAPT(HF) level by Schäffer and Jansen.^{100,101} Several other SAPT developments made by various groups will be discussed in Secs. IV and VII.

The remainder of this paper is organized as follows: Section II outlines two-body SAPT based on HF, MP, or CC description of monomers, while Sec. III presents an extension to the three-body case. Section IV discusses SAPT based on the DFT description of monomers. In particular, Sec. IV A presents a new, fully density-fitted version of SAPT(DFT) that can be applied to dimers with hundreds of atoms, Sec. IV B lists some technical details of SAPT(DFT) implementation crucial for the proper use of the programs, Sec. IV C sketches parallel performance, Sec. IV D is devoted to three-body SAPT(DFT), and Sec. IV E details open-shell SAPT(DFT). The composition of the SAPT2020 package is expounded in Sec. V. Section VI introduces the autoPES methodology for automatic fitting of PESs to SAPT interaction energies and SAPT asymptotic expansion. Other computer programs implementing the SAPT approach are listed in Sec. VII, while Sec. VIII contains a brief summary of this paper.

## II. SAPT BASED ON MP/CC DESCRIPTION OF MONOMERS

Since the Schrödinger equation for the monomers can be solved exactly or nearly exactly only for very small atoms or molecules, SAPT has to use some approximated description of monomers. The most natural way is to use many-body (body ≡ electron) perturbation theory based on the MP partition of the Hamiltonian of each monomer. In the two-body (body ≡ monomer) SAPT case, the total Hamiltonian then becomes

where *F* = *F*_{A} + *F*_{B} is the sum of the Fock operators for monomers A and B, and *W* = *W*_{A} + *W*_{B} is the intramonomer correlation operator (sum of MP fluctuation potentials) with *W*_{X} = *H*_{X} − *F*_{X}, X = A or B.

The simplest zeroth-order wave function in a perturbational approach based on Eq. (1) is the product of monomer HF determinants,

The function $\Phi 0HF$ is an eigenfunction of the operator *F*. The simplest perturbation expansion that one may employ is the RS method (extended to the case of two perturbation operators). However, as explained earlier, the RS method is not adequate, except for large intermonomer separations, since the antisymmetry condition is satisfied for exchanges within monomers but not between them. The antisymmetry requirement can be imposed by acting on the wave functions with the *N*-electron antisymmetrization operator. This antisymmetrization can be performed in many ways and leads to various versions of SAPT, as described earlier. The simplest of them, the SRS method, has been implemented in the many-electron context in Ref. 3.

The SRS energy corrections are obtained by expanding the expression for the interaction energy in powers of the perturbation operators,

where *n* denotes the order with respect to *V* and *i* denotes the order with respect to *W*. The last expression splits each $ESRS(ni)$ component into the part obtained from the RS expansion and the remainder due to the electron exchanges. General expressions for individual corrections $ERS(ni)$ and $Eexch(ni)$ are given in Refs. 2 and 3. In order to solve the resulting equations for the corrections to the wave function, one usually assumes a finite orbital basis (the so-called algebraic approximation), although other approaches are also possible.^{2} Derivation of explicit forms of the SRS expressions in terms of two-electron integrals and orbital energies can be done using the standard MBPT techniques. The formulas for several of those components have been given in Refs. 3, 52, 53, 57, 60, and 102–107.

Each RS component appearing in Eq. (3) can be related to the physical picture of intermolecular interactions resulting from the long-range expansion of the interaction energy. The first-order corrections $ERS(1i)$ represent the electrostatic interaction of unperturbed charge distributions and are usually denoted by $Eelst(1i)$. In the second order in *V*, the contributions are naturally separated into the induction and dispersion components. The exchange energies can also be related to the corresponding RS terms. In this way, one can, for example, distinguish the exchange–dispersion interactions. The standard level of the second-order SAPT implemented in SAPT2020^{39} represents the four fundamental interaction energy components by the following expansions:

The terms with the subscript “resp” are calculated using monomer orbitals distorted in the field of the interacting partner (non-response versions of all such terms are also available). The quantity $\epsilon exch(1)(CCSD)$ is the intramonomer correlation contribution to the first-order exchange energy calculated using monomers correlated at the CCSD level [it can be replaced with minor differences in costs by $Eexch(11)+Eexch(12)$], while $Eind(22)t$ and $Eexch\u2013ind(22)t$ denote those portions of the second-order intramonomer correlation correction to the induction and exchange–induction energies, respectively, which are not included in $Eind,resp(20)$ and $Eexch\u2212\u2212ind,resp(20)$. In most applications, one adds to the set of SAPT corrections the term^{73}

where $EintHF$ is the supermolecular HF interaction energy. If response terms are replaced by the non-response ones, one gets the corresponding $\delta EintHF$ correction. As already mentioned, this procedure involves an approximation since the relation between SAPT and the supermolecular methods can be established rigorously only in the asymptotic limit of large intermolecular separations.

All exchange corrections, except for $Eexch(10)$, are programmed in the so-called single-exchange or *S*^{2} approximation,^{3,12} which results from keeping in the antisymmetrizer only the terms involving exchange of a pair of electrons, one from system A and the other one from system B. The name *S*^{2}-approximation comes from the fact that the single-exchange approximation leads in the algebraic approximation to expressions involving only squares of overlap integrals *S* between orbitals of the two monomers. The *S*^{2}-approximation generally works well but breaks down for some systems for separations significantly smaller than the van der Waals minimum distance.^{108,109} The HF-level second-order exchange corrections without the *S*^{2}-approximation have been recently developed^{100,101} but are not available in SAPT2020. It turns out that the terms beyond *S*^{2} are much more important for the exchange–induction energies than for the exchange–dispersion energies. Therefore, the problem largely disappears if one uses the hybrid approach, i.e., includes the term $\delta Eint,respHF$.

The electrostatic energy can also be computed at the CCSD level. This correction, denoted as $Eelst,resp(1)$(CCSD), is rarely used since the MBPT expansion in Eq. (4) provides sufficient accuracy in most cases. The dispersion energy can be computed with improved accuracy from CCD infinite summations by using $Edisp=Edisp(2)$[CCD+ST(CCD)] (Ref. 57), as discussed earlier. The third order corrections are also available,^{52,53}

It has been recommended to scale the $Eexch\u2013ind(30)$ by the following ratio: $Eind,resp(30)/Eind(30)$. Such a scaled correction is denoted as $E\u0303exch\u2013ind,resp(30)$, and it replaces $Eexch\u2013ind(30)$ in Eq. (9). If third-order induction and exchange–induction corrections are available, the $\delta Eint,respHF$ term should be replaced by the corresponding expression involving such corrections. The third-order dispersion energy can also be computed from the modified density–density response functions,^{110} but this algorithm has not been programmed yet.

While the grouping of corrections used in Eqs. (4)–(7) is appropriate for discussions of the physical components, when SAPT corrections are fitted by analytic functions, it is better, as discussed earlier, to group second-order exchange and exchange–induction corrections together,

and for consistency, do the same for the dispersion and exchange–dispersion terms, e.g.,

The third-order terms can also be included (such high-accuracy *E*_{dispx} values have recently been used to analyze^{38,111} the performance of various approximate dispersion energies that are added to DFT interaction energies in an approach denoted DFT+D).

The first- and second-order terms in Eqs. (4)–(7) have been available in the SAPT distribution codes since its inception, whereas the third-order terms are there since SAPT2006. The codes in the main SAPT module do not use density fitting and therefore can be used for medium-size dimers only. However, a well parallelized version^{112} of these codes is available in the SAPT2020 distribution. All the *E*^{(10)} and *E*^{(20)} terms can also be computed with efficient density fitting using the fastdf module for SAPT(DFT) calculations (see Sec. IV A), so the HF-level calculations can be performed for dimers with hundreds of atoms. This includes the $Eind,resp(20)$ and $Eexch\u2013ind,resp(20)$ terms, which are computed in fastdf by means of an iterative scheme described in Ref. 113. $Eexch\u2013disp(20)$ [called uncoupled KS dispersion energy in SAPT(DFT)] has also been programmed very efficiently^{114} and scales approximately as *O*(*o*^{2}$v$^{2}*N*_{aux}), where *o* and $v$ denote the number of occupied and virtual orbitals, respectively, and *N*_{aux} is the number of auxiliary basis functions used in density fitting expansions. Calculation of this correction becomes the time-dominating step for large systems. The terms beyond SAPT(HF) contain terms scaling at the worst as *O*(*o*^{3}$v$^{4}). Such terms appear in the triples contribution to $Edisp(22)$ and in $Eexch\u2013disp(30)$ [the other third-order terms scale at worst as *O*(*o*^{2}$v$^{3})].

SAPT2020 also contains a separate module computing the SAPT(CCSD) corrections.^{44–51} This module extends the capabilities of the main module, which includes the first-order terms at the CCSD level, to the second order. However, some of the terms at the *E*^{(2)}(CCSD) level are very computer resource-consuming, so this method is mainly used for benchmarking purposes.

The RS-level SAPT components discussed above are related to the corresponding terms in the asymptotic expression for the interaction energy since this expression is obtained by inserting the multipole expansion of the operator *V* into RSPT formulas. Since, for large intermonomer separations, the exchange and charge-overlap effects vanish, SAPT interaction energies become asymptotically equal to those obtained from the multipole expansion of interaction energy.^{12,20} One can say that SAPT is seamlessly connected to the asymptotic expansion, the property used in fitting PESs to SAPT interaction energies (see Sec. VI). The asymptotic expansion can be expressed in terms of *ab initio* computed multipole moments, polarizabilities, and dynamic polarizabilities located at the centers of masses (COM) of the monomers.^{12,20,115–119} SAPT2020 contains modules using methods and programs developed by Wormer and Hettema^{120,121} that compute such monomer properties and assemble from them the asymptotic expansion coefficients. The level of electron correlation in calculations of monomer properties has to be compatible with the level used in SAPT. Appropriate relations have been developed in Refs. 122–124. For monomers containing more than a few atoms, the COM–COM asymptotic expansion is valid only at separations much larger than the van der Waals minimum separation. For such systems, it is more appropriate to use the distributed expansions, i.e., expansions depending on separations between atoms of different monomers (see Sec. VI).

## III. THREE-BODY SAPT

Three-body SAPT is a generalization of two-body SAPT, i.e., extension from dimers to trimers, clusters involving three monomers. The three-body pairwise-nonadditive interactions energy is defined as

where $Eint=EtotABC\u2212EtotA\u2212EtotB\u2212EtotC$ is the interaction energy of the trimer and *E*_{int}[2, 3] is the sum of the interaction energies of all three dimers formed from the trimer monomers (all geometries are frozen). The notation *E*_{int}[*K*, *N*] denotes, in general, a *K*-body contribution to an *N*-mer cluster, and the definition of *E*_{int}[3, 3] given above can be trivially generalized to larger clusters and to higher than three-body effects.^{125}

Similarly to the two-body SAPT, in the three-body SAPT approach, the Hamiltonian is partitioned according to Eq. (1), but now *F* = *F*_{A} + *F*_{B} + *F*_{C}, *V* = *V*_{AB} + *V*_{BC} + *V*_{AC}, where *V*_{XY} collects Coulomb interactions of all particles of monomer X with those of monomer Y, and *W* = *W*_{A} + *W*_{B} + *W*_{C}. The three-body interaction energy is then expanded as a perturbation series,^{4,43,125–127}

where *i* and *j* correspond to the perturbation orders in *V* and *W*, respectively. This equation is analogous to Eq. (3), and the semicolon is used since the detailed formulas have six indices corresponding to six different perturbation operators, but often triples of intermonomer or intramonomer indices are contracted to one index, so the semicolon is needed to distinguish the *V* and *W* parts.

The expansion in *V* represents different physical terms and, similarly to the two-body case, can be separated into first-order exchange, induction, dispersion, exchange–induction, exchange–dispersion, and induction–dispersion contributions. There are, however, some notable differences from the two-body expansion. First, the electrostatic interaction is purely additive, so there are no electrostatic terms in Eq. (13). Second, the second-order dispersion is also additive, and the first nonadditive dispersion contribution appears only in the third order. Nonadditive induction and dispersion contributions can also be of any sign, in contrast to the two-body case where these contributions are always attractive. The expansion in *W* does not impact the physical interpretation of the terms. For more details on physical interpretation, see Ref. 125.

The first-order exchange energy is a major component of the three-body nonadditivity, and therefore, it was developed up to the second order in *W*. The dispersion nonadditivity is very important for nonpolar systems, in particular for rare-gas trimers, as it has been known for a long time from the work of Axilrod–Teller–Muto (ATM)^{128,129} in the asymptotic region. Therefore, $Edisp(3,1)[3,3]$, the leading intramonomer correlation correction to the dispersion nonadditivity, was included, and the dispersion term of the fourth order in *V* was also developed.

Three-body SAPT has been successfully applied to several systems,^{66,127,130–135} providing the first trimer PESs as well as important physical insights into the investigated trimers (see Ref. 125 for a review of the results published up to 2005). The three-body codes have not been parallelized or density-fitted, so they are applicable to only fairly small trimers with few-atomic monomers. However, the three-body SAPT(DFT) version, described in Sec. IV D, can be applied to much larger systems, for example, to a trimer of benzene molecules.^{136}

## IV. SAPT(DFT)

SAPT(DFT)^{9} (also referred to as DFT-SAPT^{10}) has been developed following an initial idea of Williams and Chabalowski^{137} to replace the Fock operators *F*_{X} of monomers by the KS operators *K*_{X} and neglect the *H*_{X} − *K*_{X} operators, the equivalents of *W*_{X}. In practical terms, such a change is equivalent to performing SAPT0 calculations with the HF orbital coefficients and orbital energies replaced by their KS counterparts. Thus, such a calculation does not require any changes in the SAPT codes. The rationale for this idea was that the KS method effectively accounts for the correlation effects, at least for short-range interelectronic distances, so one could omit the expensive SAPT intramonomer correlation terms. It turned out, however, that this simple approach did not work well,^{137} with the accuracy not better than that of the (equally inexpensive) HF-based SAPT0 approach. It was already suggested in Ref. 137 that one cause of these disappointing results may be the wrong long-range behavior of the densities produced by most semilocal density functionals. This conjecture was confirmed in Refs. 138 and 139 by showing that the application of an asymptotic correction (AC) to exchange–correlation potentials results in accurate electrostatic and first-order exchange energies. However, the dispersion energies calculated by using the expression for $Edisp(20)$ with asymptotically corrected KS orbitals and orbital energies were still very different from accurate values. The conventional (London-type) spectral sum expression for $Edisp(20)$ can be written in the so-called generalized Casimir–Polder form,^{12,140,141} expressing this quantity in terms of frequency-dependent density–density response functions (FDDS functions). The functions needed for $Edisp(20)$ are the so-called noninteracting KS response functions. However, one can instead use in the generalized Casimir–Polder expression the KS response functions computed using the time-dependent DFT (TD-DFT) method. The dispersion energies computed with such response functions^{5,6} turned out to be very close to the values given by SAPT(MP) [see Eq. (6)]. In SAPT(DFT), one refers to these values as coupled KS (CKS) dispersion energies, while the values computed with noninteracting KS response functions are referred to as uncoupled KS (UCKS) dispersion energies. The reason for this terminology is that the induction energies can also be computed from the response functions (at zero frequency). The noninteracting and TD-DFT cases correspond now to the expression computed with unperturbed orbitals and with orbitals deformed in response to the field of the interacting partner, respectively, the quantities denoted previously by $Eind(20)$ and $Eind,resp(20)$. The approach with UCKS [CKS] response functions is denoted as SAPT(KS) [SAPT(DFT)]. A detailed description of these methods can be found in Refs. 8 and 9. These references also gave physical reasons why SAPT(DFT) works so well. In exact DFT, densities and response functions are exact. Therefore, the electrostatic, induction, and dispersion energies also become exact in such an approach. For the first-order exchange energies, the SAPT(KS) approach in exact DFT is asymptotically exact since the asymptotic forms of the interaction density matrices from exact wave-function theory and from exact Kohn–Sham theory are identical. Interestingly, the UCKS/AC induction energies turned out to be fairly accurate,^{138,142} essentially as accurate as CKS/AC ones, in the region of van der Waals minima, whereas for larger intermonomer separations, the latter are more accurate. This was explained in Ref. 8 to result from fortuitous cancellations of errors in UCKS response functions with those in charge-overlap effects.

SAPT(DFT) energies depend on the density functional used, but relatively weakly. It was found that the hybrid PBE0^{68,69} functional with AC performs best in most cases, and for larger systems, one can use its pure version, i.e., PBE, to reduce computational costs. A recent study^{143} examining the use of the latest meta-generalized gradient approximation (meta-GGA) functionals in SAPT(DFT) also found out that PBE0 is still the best performer. One can also formulate SAPT(DFT) using an “exact DFT” approach. This is related to the fact that SAPT(DFT) depends only on the Kohn–Sham exchange–correlation potential $vxcr$, i.e., on the functional derivative of the exchange–correlation energy, and one can find a $v$_{xc}(** r**) that gives nearly exact density using the so-called Zhao–Morrison–Parr (ZMP) procedure.

^{144}The ZMP method requires knowledge of accurate monomer wave functions, so its time requirements are not DFT-like, but since this needs to be done only once and only for monomers when calculating a PES, the additional costs are acceptable. The ZMP-based SAPT(DFT) was proposed in Ref. 145 and recently extended to larger systems in Ref. 146.

The original SAPT(DFT) codes of Ref. 9 were interfaced with CADPAC^{147} (this interface is not maintained anymore). Later, interfaces with DALTON^{148} (SAPT2006 release) and ORCA^{149} (SAPT2012 release) were developed. Both interfaces are active and have a partly different functionality, as it will be discussed below. The details of SAPT(DFT) implementation in SAPT2006 are described in Refs. 8, 9, 150, and 151. Below we will discuss some methodological and algorithmic developments since the SAPT2006 release.

*TD-DFT kernel*: The calculations of the FDDSs require the so-called kernel integrals, i.e., integrals involving a functional derivative of $v$_{xc}(** r**) (see, e.g., Ref. 152). The most consistent way when using a given GGA functional in SAPT(DFT) is to calculate the kernel using the appropriate $v$

_{xc}(

**), i.e., to use adiabatic GGA (AGGA) response functions. This is possible with the DALTON interface (without density fitting) since SAPT(DFT) uses the kernel integrals computed by DALTON. However, if the GGA kernel is replaced by the adiabatic local density approximation (LDA) kernel (ALDA approximation) computed with GGA orbitals and densities, only a very small error, usually below 1%, is introduced.**

*r*^{9,153}The LDA kernel is also less expensive to compute. In both cases, one-electron four-index integrals have to be computed (by numerical integration), so the scaling is

*O*(

*o*

^{2}$v$

^{2}

*g*), where

*g*is the number of numerical grid points; in the LDA case, the integrals include a product of four orbitals and of a function independent of orbital indices, while the GGA case includes in addition several terms containing derivatives of orbitals. Density fitting introduced in Ref. 150 (as a patch to DALTON) reduced the scaling to

*O*(

*o*$v$

*N*

_{aux}

*g*), where

*N*

_{aux}is the number of fitting (auxiliary) basis set functions. This was done only for the LDA kernel since, as explained in detail in Ref. 151, the GGA kernel is difficult to approximate with the density fitting (df) method. The LDA kernel is the default for non-df calculations (and the only option for df ones). The ORCA interface was designed in such a way that only the KS orbitals and orbital energies are imported from ORCA. Thus, the kernel integrals were programmed in one of the SAPT(DFT) modules, again only in the LDA version since only df-SAPT(DFT) is coded with this interface. In the fastdf version of codes (Sec. IV A), the calculations of the kernel integrals take a moderate fraction of the total time.

*Density-fitted calculations of FDDSs*: The original implementation of SAPT(DFT) scaled *O*(*N*^{6}) with system size *N* (the number of electrons) due to the *O*(*o*^{3}$v$^{3}) matrix inversion in calculations of FDDSs. It had already been suggested in Ref. 5 that this scaling could be reduced by using density fitting (also known as the resolution of identity). This idea was realized in Ref. 10 (for nonhybrid functionals only) and in Refs. 150 and 151 (for pure and hybrid functionals). Density fitting reduced the scaling of FDDS calculations to *O*(*N*^{4}) for pure and to *O*(*N*^{5}) for hybrid functionals. The kernel integrals and integral transformation were also density fitted. However, the first-order corrections and the corrections $Eexch\u2013ind(2)$(KS) and $Eexch\u2013disp(2)$(KS) were computed from non-df formulas (the four-index molecular integrals were computed from the transformed three-index ones). For pure functionals, the *O*(*N*^{5}) scaling of $Eexch\u2013disp(2)$(KS) became the limiting step. With all these improvements, for systems such as a 42-atom RDX dimer in an augmented double-zeta basis set, the dominating step of SAPT(DFT) calculations became the DFT calculations for monomers (see Fig. 4 of Ref. 154). Despite the pure FDDSs taking significantly less time than the hybrid ones, the overall times for both types of calculations were essentially the same. A new algorithm for FDDS calculation, applicable only to pure DFT, was developed in Ref. 155. It has the same *O*(*N*^{4}) scaling as the previous algorithm, but with a significantly smaller prefactor and smaller memory and disk requirements. This version was also interfaced with ORCA,^{149} which significantly reduced the time of monomer calculations. These improvements made it possible to compute dispersion energies for dimers containing more than 200 atoms.^{155} Figure 4 of Ref. 154 shows that the overall speedup for the RDX dimer was about 14 times for pure functionals and 6 times for the hybrid ones, and calculations with pure functionals have become significantly faster than with the hybrid ones. The most recent further improvements of the efficiency of the codes are discussed in Sec. IV A. With these codes, *complete* SAPT(DFT) calculations are possible for dimers containing more than 200 atoms.^{76,114}

*Exchange–induction and exchange–dispersion energies*: In SAPT2006, approximate CKS exchange–induction and exchange–dispersion energies were obtained by the following scaling of the UCKS quantities:

The corrections $Eexch\u2013ind(2)(KS)$ and $Eexch\u2013disp(2)(KS)$ use the amplitudes (from wave-function corrections) of first order in *V* and of zeroth order in $HX\u2212KX$. In Ref. 10, the second-order exchange corrections were computed by replacing the perturbational amplitudes by those computed from the TD-DFT response functions. Such directly computed CKS corrections will be indicated by symbols without tilde: $Eexch\u2013ind(2)(CKS)$ and $Eexch\u2013disp(2)(CKS)$, and starting from SAPT2012, these types of corrections are computed by default (the scaled components are still available for compatibility with the earlier results). It has been shown^{51} by comparing the $E\u0303exch\u2013disp(2)(CKS)$ and $Eexch\u2013disp(2)(CKS)$ energies to the $Eexch\u2013disp(2)(CCSD)$ ones that the scaled approach is considerably less accurate than the direct one: at the van der Waals minima configurations of eight small dimers, root-mean-square errors (RMSEs) of $E\u0303exch\u2013disp(2)(CKS)$ and $Eexch\u2013disp(2)(CKS)$ relative to $Eexch\u2013disp(2)(CCSD)$ were 17% and 5%, respectively. However, in absolute terms, the differences are small and RMSEs of the sum of the dispersion and exchange–dispersion energies were 2.2% and 3.0% for $Edisp(2)(CKS)+E\u0303exch\u2013disp(2)(CKS)$ and $Edisp(2)(CKS)+Eexch\u2013disp(2)(CKS)$, respectively, relative to $Edispx(2)(CCSD)$. Of course, the fact that the scaled approach gives a smaller error is purely accidental. One should mention here that the programmed correction $Eexch\u2013disp(2)(CCSD)$ involves several approximations and its values were found to be fairly different from those computed by Hapka *et al.*^{96} using a multireference description of monomers. These authors also have shown that the formula for $Eexch\u2013disp(2)(CKS)$ proposed in Ref. 10 and programmed in SAPT2020 misses, in fact, a component of the response functions. However, for systems that do not require a multireference description, the effects due to this omission are minor.

### A. The fastdf program

SAPT2020 includes a set of new codes, fastdf, aimed at computing SAPT(DFT) interaction energies of large dimers. It is interfaced with ORCA for the HF or DFT self-consistent field (SCF) calculations and with LIBINT^{156} for the computation of overlap, kinetic, and nuclear-attraction one-electron integrals, as well as of the three-center electron–repulsion integrals (two-electron integrals). All the other tasks are performed by fastdf. LIBINT is able to compute integrals with arbitrary angular momentum numbers, and the numerical integration program for the kernel integrals from the previous version of SAPT codes has been extended to support Gaussian orbitals with angular momentum quantum numbers up to *l* = 6, therefore allowing one to use main basis sets of up to the quintuple-zeta level for up to the third-row atoms and up to quadruple-zeta for the higher-row atoms (the *l* value higher by one or two is needed for the auxiliary bases) compared to the triple-zeta level in SAPT2016^{157} with the ORCA interface (provided that the SCF calculations converge, which can be very difficult for quintuple-zeta basis sets with an AC in ORCA). The fastdf program has a simplified user interface and requires users to prepare one simple input file, as opposed to the up to seven files required in the older versions of SAPT codes. Another new feature is that fastdf computes and stores on disk only the quantities that are strictly required for a given calculation, therefore saving time and resources if one were to compute a subset of corrections. One of the most relevant new features concerns efficient computations of the $Eexch\u2013disp(2)$(CKS) correction. This correction is much more difficult to compute for large systems than $Eexch\u2013disp(2)$(KS) despite using essentially the same algorithm. The reason is that while in the latter case, the dispersion amplitudes can be easily computed as needed, in the former case, the dispersion amplitudes of the size *O*(*o*^{2}$v$^{2}) are computed by a numerical frequency integration from FDDS density fitted amplitudes (i.e., linear coefficients of the response functions) of the two monomers, each of size $(ovNauxN\omega )$, where *N*_{ω} is the number of frequencies. This problem has been solved by combining the algorithms for the computation of the density-fitted FDDS amplitudes and of the dispersion amplitudes.^{114} This combined algorithm significantly reduces the disk space by storing only half of the *O*(*N*^{3}) density-fitted FDDS amplitudes and computing from them the full *O*(*N*^{4}) coupled dispersion amplitudes on-the-fly, instead of storing both amplitudes on disk, as the former SAPT codes did. This facilitated^{76} the computation of the $Eexch\u2013disp(2)$(CKS) corrections for all the dimers included in the S12L set^{158} (with up to 177 atoms). Although the described algorithm only works with nonhybrid functionals, the fastdf program also has the option to compute the hybrid-functional-based density-fitted FDDS amplitudes of both monomers and use them in the $Eexch\u2013disp(2)$(CKS) algorithm. The algorithm for the hybrid FDDS amplitudes involves two-electron exchange integrals that are not density fitted, and consequently, the program requires *O*(*N*^{4}) disk space. Still, it is significantly faster than the non-density-fitted one. In previous work^{159} for systems of this size, only $Eexch\u2013disp(2)$(KS) was computed, and then it was scaled by empirical parameters. Some additional performance increases (not as significant as the ones discussed above) were gained in fastdf by storing and reusing some partial contractions of amplitudes and integrals in a similar way as done in Ref. 113.

To demonstrate speedups due to the new developments in fastdf, Table I shows the wall time, memory, and disk requirements for those parts of SAPT(DFT) calculations that were made more efficient in fastdf compared to the SAPT2016.2 version (the total wall times for components that were not changed are also included). The calculations were performed for the 42-atom RDX dimer with the PBE functional,^{68} the GRAC asymptotic correction, and aug-cc-pVDZ and aug-cc-pVTZ basis sets plus a 3s3p2d2f set of midbond (mb) functions^{151} in the MC^{+}BS format (see Sec. IV B for definitions of basis-set formats) correlating all electrons. The total times include the DFT calculation times (5 min and 16 min for the aDZ+ and aTZ+ bases, respectively). The “Integrals and transformation” entry includes all one- and two-electron atomic-orbital integral calculations and transformations but excludes the kernel integrals. The significant time differences for this step are due to (a) the use of only three-index integrals in fastdf, while four-index integrals were computed from the three-index ones in SAPT2016; (b) the removal of integrals of the kind (*K*|$vv$′), where *K* is an auxiliary basis index (these integrals are not needed when pure functionals are employed but were computed in SAPT2016); and (c) the use of LIBINT for the computation of atomic-orbital integrals, which is a far more efficient code than the one used in SAPT2016. The time differences for $E\u0303dispx(2)$(CKS) can be explained by the fact that (a) in SAPT2016, the $Eexch\u2013disp(2)$(KS) algorithm was not density-fitted; and (b) in SAPT2016, both $Eexch\u2013disp(2)$(KS) and $Eexch\u2013disp(2)$(CKS) were always computed, whereas fastdf computes only the one requested by the user [$Eexch\u2013disp(2)$(CKS) by default]. The speedup for the computation of $Edispx(2)(CKS)$ is still larger since the dispersion amplitudes are not stored on disk, therefore decreasing the disk input/output significantly. Additional gains are possible by using the frozen core approximation,^{160} which reduces the time needed for $Edispx(2)$(CKS) by about 37%, with only minor changes of the results. Table I also shows the disk space and the minimum memory necessary to run the calculations. Both have been greatly reduced by avoiding the storage of *O*(*N*^{3}) quantities on disk and using memory buffers of only *O*(*N*^{2}) size (the latter was already implemented in SAPT2016, but only for the induction and dispersion modules).

. | SAPT2016.2 . | SAPT2020.1 . | ||
---|---|---|---|---|

. | aDZ+ . | aTZ+ . | aDZ+ . | aTZ+ . |

Wall-time (min) | ||||

Monomer DFT and kernel integrals | 9 | 23 | 9 | 23 |

Integrals and transformation | 16 | 158 | 1.4 | 5.3 |

$E(1)+Eindx(2)$ | 5 | 32 | 0.6 | 2.2 |

$Edispx(2)$(CKS) | 21 | 340 | 8.2 | 40 |

Total SAPT(DFT) with $Edispx(2)$ | 51 | 553 | 19 | 71 |

$E\u0303dispx(2)$(CKS) | 7.7 | 256 | 4.5 | 20 |

Total SAPT(DFT) with $E\u0303dispx(2)$ | 38 | 469 | 16 | 51 |

Storage (GB) | ||||

Disk space | 130 | 758 | 15 | 50 |

Memory | 2.5 | 5.8 | 1.1 | 2.1 |

. | SAPT2016.2 . | SAPT2020.1 . | ||
---|---|---|---|---|

. | aDZ+ . | aTZ+ . | aDZ+ . | aTZ+ . |

Wall-time (min) | ||||

Monomer DFT and kernel integrals | 9 | 23 | 9 | 23 |

Integrals and transformation | 16 | 158 | 1.4 | 5.3 |

$E(1)+Eindx(2)$ | 5 | 32 | 0.6 | 2.2 |

$Edispx(2)$(CKS) | 21 | 340 | 8.2 | 40 |

Total SAPT(DFT) with $Edispx(2)$ | 51 | 553 | 19 | 71 |

$E\u0303dispx(2)$(CKS) | 7.7 | 256 | 4.5 | 20 |

Total SAPT(DFT) with $E\u0303dispx(2)$ | 38 | 469 | 16 | 51 |

Storage (GB) | ||||

Disk space | 130 | 758 | 15 | 50 |

Memory | 2.5 | 5.8 | 1.1 | 2.1 |

With fastdf, we were able to significantly push the limits on the size of dimers for which SAPT(DFT) calculations are feasible. One may note that from this point of view, reductions of memory and storage are as important as reductions of CPU time. In fact, calculations of SAPT(DFT) interaction energies for dimers containing about 200 atoms are now tractable on a well-equipped laptop: with 4 TB of fast scratch storage, 30 GB of memory, and an 8-core CPU. Table II shows the timings for system C4b of the S12L set (the largest dimer contained in this set), shown in Fig. 5. The “Other contributions” entry includes the electrostatic, exchange, induction, and exchange–induction components. The time shown in parentheses refers to the computation of $E\u0303dispx(2)$(CKS) and to the total time of the SAPT(DFT) calculation including this form of exchange–dispersion energy. The timings given in Tables I and II are for calculations on 8 and 16 cores, respectively. Since scaling of the SAPT(DFT) codes with the number of processor cores is only modest (see Sec. IV C), these timings are a bit pessimistic in the sense that a better parallelization of the codes could reduce them by a factor of 2–3.

Part . | Time (h) . |
---|---|

DFT for monomers | 1.2 |

Integrals and transformation | 2.7 |

Kernel integrals | 6.2 |

Other contributions | 1.8 |

$Edispx(2)$ ($E\u0303dispx(2)$) | 67.1 (25.3) |

Total (total with $E\u0303dispx(2)$) | 78.8 (37.3) |

Part . | Time (h) . |
---|---|

DFT for monomers | 1.2 |

Integrals and transformation | 2.7 |

Kernel integrals | 6.2 |

Other contributions | 1.8 |

$Edispx(2)$ ($E\u0303dispx(2)$) | 67.1 (25.3) |

Total (total with $E\u0303dispx(2)$) | 78.8 (37.3) |

It is important to note that for systems as large as discussed here, standard CCSD(T) calculations are not feasible. Approximate CCSD(T) calculations with methods such as DLPNO-CCSD(T)^{161,162} are possible, but SAPT(DFT) in our current implementation is a less expensive method than DLPNO-CCSD(T), while it provides comparable accuracy.^{76,114} For the RDX dimer case reported in Table I, DLPNO-CCSD(T) calculations in the aug-cc-pVTZ+mb basis took 40 h on 8 cores compared to 1.2 h taken by the SAPT(DFT) calculations, while the disk use was 160 GB and 50 GB, respectively. We used the “TightPNO” setting in ORCA since with the “NormalPNO” setting, the interaction energy is 12% different in magnitude, which is not acceptable in intended applications of such work.

### B. SAPT(DFT) implementation options

SAPT2020 includes the complete implementation of the SAPT(DFT) method developed by our group. There are several options of running these codes, and the availability of options depends on the DFT front-end: DALTON^{148} for the older and slower, but the most comprehensive version of SAPT(DFT) or ORCA^{149} for the fast versions of the density-fitted implementation, including the new fastdf module. One may add that while the interface with DALTON is “deep,” i.e., SAPT(DFT) uses several quantities computed by DALTON and an extensive patch to DALTON is needed [therefore SAPT(DFT) works with the version 2.0 of DALTON, but not with newer editions], the interface with ORCA is “shallow,” i.e., only orbital coefficients and energies are taken from ORCA (thus, fastdf codes can be easily interfaced with any other DFT package that has an AC programmed). Below, we discuss various options for running SAPT(DFT) calculations.

*Format of basis set*: Dimer-centered basis sets (DCBSs), monomer-centered basis sets (MCBSs), and monomer-centered “plus” basis sets (MC^{+}BSs) are available in all versions. The latter format includes “midbond” functions placed between monomers and/or “farbond” functions, i.e., a subset of the basis set of the interacting partner.^{163} The midbond functions can also be used with the DCBS format, and such bases are denoted as DC^{+}BS. The basis functions on monomers can be unrelated, as in the “pure” MCBS format, identical, as in DCBS and DC^{+}BS, or anything between, as in MC^{+}BS. The latter leads to significant computational savings with a modest loss of accuracy. An additional bonus is that the MC^{+}BS format allows one to include a larger number of diffuse functions in cases where DCBS/DC^{+}BS becomes too linearly dependent for convergence of SCF iterations. In the fastdf codes, the formation of the MC^{+} bases, which was a bit burdensome in the previous versions of input, is completely automated. A depository of basis sets is also now available.

*Regular vs density-fitting versions*: The complete set of SAPT(DFT) corrections without density fitting is available only with the DALTON front-end. Three different density-fitting implementations are available: a partial density fitting with DALTON and ORCA interfaces and a complete density fitting in the fastdf module of SAPT2020 (interfaced with ORCA). The partial density fitting is the approach of Refs. 150 and 151: only three-index two-electron integrals in atomic basis sets are computed, molecular integrals are obtained from a df transformation, dispersion and induction energies are computed using df algorithms, but the first-order corrections and the second-order exchange corrections are computed from non-df formulas (generating four-index two-electron molecular integrals from the transformed three-index ones). The main difference between the partial-df versions interfaced with DALTON vs ORCA is that the latter implements the fast dispersion algorithm developed in Ref. 155. The fastdf codes are recommended for all SAPT(DFT) calculations as uncertainties due to the use of df are well below 1%,^{150,151} and these codes are orders of magnitude faster than the non-df ones. In fact, even with current computational resources, non-df codes are difficult to use for dimers with monomers containing more than a few atoms.

*Pure vs hybrid functionals*: As discussed earlier, SAPT(DFT) based on hybrid functionals is more accurate than that based on the corresponding pure functionals,^{9} and it is recommended for all systems for which it can be afforded. However, calculations with hybrid functionals are also more expensive, except for the non-df version where these two types of calculations are of almost the same difficulty. In the df approach, calculations of hybrid FDDSs are much more difficult than those of nonhybrid ones, and in fact, SAPT2006 and later versions are the only codes that include a density-fitted hybrid FDDS algorithm. Still, calculations of nonhybrid FDDSs are much faster, provided an appropriate algorithm is used.^{114,155} For large systems, massive resource savings can be achieved by using pure functionals in fastdf (note that fastdf can be used to compute density-fitted FDDSs and dispersion energies also with hybrid functionals, using an algorithm that was slightly changed from its version developed in Ref. 151). One can also compute the relatively inexpensive but always large in magnitude first-order corrections with hybrid functionals and the second-order ones with nonhybrid functionals, an approach that was described in Ref. 76 to produce results that agree better with the benchmarks than a purely nonhybrid approach while being significantly less expensive than a purely hybrid one.

*AGGA and ALDA kernels*: With the non-df DALTON interface, the kernels are computed by DALTON, and since the second functional derivatives of GGA exchange–correlation energies are available for all functionals included in DALTON, one can use the appropriate AGGA or ALDA kernels. However, as already stated, the df-SAPT(DFT) version interfaced with DALTON can use only the ALDA kernel. With the ORCA interfaces, kernel integrals are computed by the SAPT codes, and again, only the ALDA kernel is available. As discussed earlier, the use of the ALDA kernel (with orbitals from a GGA method) introduces very small errors.

*Asymptotic corrections*: SAPT2020 supports two AC schemes: the Fermi–Amaldi^{164} asymptotic potential with the Tozer–Handy splicing method (FATH)^{165} and GRAC,^{71} the former only available with the DALTON interface and the latter with both the DALTON and ORCA interfaces. For small dimers, FATH was found to often perform better than GRAC^{153} so that it is recommended for such systems, as long as the relative slowness of the DALTON interface is not a problem. For larger systems, GRAC is expected to be more accurate since FATH is inversely proportional to the total number of electrons, while in large molecules the density is sensitive only to the number of electrons from a neighborhood of a given place. It is also possible to omit AC if one of the range-separated hybrid DFT functionals with the KS $v$_{xc} potential replaced at long-range with a shifted HF potential is used,^{153,166,167} provided that the separation parameter *ω* is tuned to make IP+*ϵ*_{HOMO} = 0, where IP denotes the ionization potential of a monomer and *ϵ*_{HOMO} is its highest-occupied molecular orbital energy. However, the accuracy of such an approach is a bit lower than with the best performing asymptotically corrected functionals. Hapka *et al.*^{143} found recently that SCAN0, a meta-GGA hybrid functional,^{168,169} can also be used without AC.

*Ionization potentials*: Both AC schemes require the IPs of the monomers as input parameters. For smaller systems, we recommend to use experimental IPs.^{70} When such IPs are not available, separate calculations have to be performed for each monomer and its ion (in both cases using the same geometry as in SAPT calculations). SAPT(DFT) results are only weakly dependent on the values of ionization potentials. The *ϵ*_{HOMO} value, also required for AC, is obtained by DALTON or ORCA and adjusted during the SCF iterations so that the AC correction is self-consistent in *ϵ*_{HOMO}, but not in IP.

### C. Parallel performance

SAPT(HF/MP/CC) has been effectively parallelized^{112} using the Message Passing Interface (MPI) protocol. It was scaling well up to 64 (single-core) processors on a set of single-processor nodes with only 1 Gbit/s interconnect. The parallel SAPT codes are available in the SAPT2020 distribution.

The DFT calculations in ORCA are parallelized using MPI and show a speedup of about 9 on 16 cores (all speedups quoted are for the C2b system from S12L^{158} with PBE/AC, all electrons, cc-pVTZ basis set in the MC^{+}BS format). We added the OpenMP directive to LIBINT to parallelize the most critical fragments of two-electron integral calculations and achieved a speedup of 3 on 16 cores. Parallelization of SAPT(DFT) was done using different techniques for different modules. The FDDS functions and CKS dispersion and exchange–dispersion calculations utilize the parallel Basic Linear Algebra Subroutines (pBLAS) software package. Such parallelization is very easy to introduce and works reasonably well since almost all computations for these components are programmed using BLAS. The kernel calculation, the three-center integrals, and the integral transformation codes are parallelized using MPI. However, they also scale reasonably well when using pBLAS, so this method is usually chosen, and was used in the example quoted here. The overall speedups of the SAPT(DFT) calculation for system C2b on 16/8/4 cores are 5/4/3. Thus, the optimal number of cores for running SAPT(DFT) is 4. However, since calculations for large systems anyway use a large part of the disk space on a typical single node, running on 16 cores does make sense (on our cluster, we can run two simultaneous jobs for the large S12L dimers due to disk size limitations, so it makes sense to run each job on half of the available 32 cores per node).

### D. Three-body SAPT(DFT)

Three-body pairwise nonadditive interaction energies in the SAPT(DFT) approach were developed in Ref. 136. The KS-level approach results from a simple replacement of the monomer HF orbital coefficients and orbital energies by the KS ones in the expressions for the corrections *E*^{(i;0)}[3, 3] in the three-body code of Ref. 4, leading to corrections now denoted as *E*^{(i)}[3, 3](KS). Furthermore, $Eind(2)[3,3]$(CKS) and $Edisp(3)[3,3]$(CKS) have been developed and programmed both in the df and non-df forms. All these terms are available in the nonadditive interaction energy module of the SAPT2020 suite. For applications to large systems, two of the present authors^{136} recommended a hybrid approach, labeled MP2+SDFT, where nonadditive energies are obtained as a sum of the MP2 supermolecular nonadditive energies and $Edisp(3)[3,3]$(CKS). The MP2+SDFT method was used in the first high-level *ab initio* calculations of the nonadditive interaction energies (including overlap effects) of the benzene trimer.^{170} These energies were then used to compute the cohesion energy of the benzene crystal.^{170} These calculations were possible since the computational costs of $Edisp(3)[3,3]$(CKS) scale only as *O*(*N*^{4}) for pure functionals and *O*(*N*^{5}) for the hybrid ones, the same scaling as for the two-body second-order dispersion energy. This is due to the fact that the costs are determined by calculations of monomer FDDSs. Therefore, the three-body codes would benefit from fast methods of calculations of FDDSs developed in Refs. 114 and 155. Although this option has not been implemented in SAPT2020, it would be very simple to do so. Since the integral transformation and MP2 calculations scale as *O*(*N*^{5}), the MP2+SDFT method scales as *O*(*N*^{5}). This scaling is lower than that of a supermolecular MP3 calculation, which scales as *O*(*N*^{6}) and is the lowest level of supermolecular theory that includes $Edisp(3;0)[3,3]$. The performance of MP2+SDFT was later evaluated by Huang and Beran^{171} who made comparisons to benchmark calculations for a wide range of trimers (MP2+SDFT was named MP2+CKS in Ref. 171). They also considered a method denoted as MP2+ATM, which adds to MP2 the asymptotic form of $Edisp(3)[3,3]$(CKS) restricted to the leading term and empirically damped. They found that the RMSE of MP2+SDFT results was 0.034 kcal/mol, while that of MP2/MP3 was 0.059/0.035 kcal/mol. Surprisingly, the RMSE of MP2+ATM, which is an approximation to MP2+SDFT, was lower, 0.020 kcal/mol, which is, of course, due to the empirical damping. The median unsigned percent errors (computed from the results in Table S1 of Ref. 171) for MP2, MP2+ATM, MP2+SDFT, and MP3 are 15.3%, 5.0%, 7.4%, and 6.0%, respectively. Clearly, MP2+SDFT is the least expensive, i.e., *O*(*N*^{5}), first-principles method, which gives reasonably accurate values of three-body nonadditive energies for any type of trimer.

### E. Open-shell SAPT(DFT)

The open-shell SAPT(DFT) [osSAPT(DFT)] method implemented in the SAPT2020 suite was developed in Ref. 11. It is applicable to monomers that can be described at the restricted open-shell Kohn–Sham (ROKS) level, and the spins of monomers have to be coupled to the high-spin states of dimers. The osSAPT(DFT) approach generalizes the SAPT(KS/CKS) closed-shell methodology of Ref. 9 to ROKS monomers. All the terms developed in Ref. 9 have been generalized, except for the exchange–induction and exchange–dispersion at the CKS level [these components are approximated by scaling the KS corrections analogously to Eqs. (15) and (16)]. The asymptotic correction can use either the FATH^{164,165} or GRAC schemes,^{71} which for ROKS requires two ionization potentials, one for *α* electrons, and a separate one for *β*. The implementation requires a significant patch to DALTON^{148} for computations of the asymptotic corrections and of the CKS kernels. The osSAPT(DFT) codes have not been density fitted yet, so they are not applicable to such large systems as in the case of closed-shell codes. The osSAPT(DFT) approach was tested in Ref. 11 on several dimers, and the accuracy of the predictions appeared to be as good as in the closed-shell case. The osSAPT(DFT) codes can also be used with the ROHF reference, forming the open-shell variant of SAPT0; however, the accuracy of this approach is lower than that of osSAPT(DFT).^{11} Such osSAPT0 has been programmed with density fitting in Ref. 172 and is available in the PSI4 codes.^{41,173} While the generalization to ROKS monomers involved several complications in the derivation and implementation compared to the closed-shell case, the usage of osSAPT(DFT) is no more complicated than in the closed-shell case. The osSAPT(DFT) has been applied successfully to several dimers.^{174–179} A few extensions of osSAPT(DFT) have been developed after Ref. 11 was published, including a method using unrestricted Kohn–Sham descriptions of monomers^{180} and the first-order corrections for low-spin dimer states of high-spin monomers,^{181,182} but neither of these additions are implemented in SAPT2020.

## V. SAPT2020 PACKAGE

The SAPT codes were made available for the first time in the METECC-94 depository of electronic structure programs^{183} in 1993. Several consecutive versions followed, with SAPT2020^{39} being the latest one. The SAPT2020 suite can be found at https://www.physics.udel.edu/∼szalewic/SAPT/. It is available free of charge and with the source codes. The potential users are asked to download and sign a user license. An extensive manual describing the installation and usage is available. The programs are written mostly in Fortran with minor sections in C and C++. The control and processing of calculations are performed with the help of shell and AWK scripts, with the exception of the new fastdf program that relies entirely on Fortran. SAPT2020 works on multiple Unix-like systems, including the most popular 64-bit x86 Linux platform. Several compilers are supported, including GNU Fortran, Intel Fortran and Portland Fortran, although not all the compilers are supported by every module (see SAPT2020 manual). Binary files are not distributed, and users have to compile the code (proper compilation scripts are included).

SAPT calculations require HF or KS orbitals and orbital energies for monomers, and in most cases also integrals, from front-end programs. SAPT2020 includes ATMOL1024^{184} for HF calculations, but several external front-end HF programs are interfaced with SAPT: DALTON,^{148} CADPAC,^{147} MOLPRO,^{40} GAMESS(US),^{185} and GAUSSIAN.^{186} These codes can be used for all SAPT(HF/MP/CC) calculations. For KS monomer calculations needed in SAPT(DFT), SAPT2020 can use either DALTON^{148} or ORCA.^{149} As explained in Sec. IV B, the functionality of SAPT(DFT) is different from these two packages. All these programs have to be obtained and installed separately.

The SAPT2020 suite is organized into several modules. These include, in particular, the main SAPT package with closed-shell SAPT(HF/MP/CC) and the SAPT(DFT) package. SAPT(CCSD) package that implements the theory of Refs. 44–51 requires the MOLPRO front-end that has to be recompiled with a patch supplied in SAPT2020. Three-body SAPT, open-shell SAPT, and asymptotic programs are other separate packages with various degrees of autonomy with respect to the main SAPT package. AutoPES (see Sec. VI) is also a separate package distributed with SAPT2020 and calling SAPT as well as the asymptotic modules.

## VI. POTENTIAL ENERGY SURFACES BASED ON SAPT

A solution of the Schrödinger equation in the Born–Oppenheimer (BO) approximation, i.e., involving only electrons with nuclei clamped in space, gives the total electronic energy a cluster for each nuclear configuration. Each such set containing the energy and the coordinates is called a grid point. After performing calculations for a number of grid points, it is convenient to fit these data by an analytic function, i.e., develop a PES. Such a PES, a function of all nuclear coordinates, can then be used in a variety of nuclear-motion calculations. Since PESs for the isolated systems do not depend on the translation and overall rotation of such systems, they are 3*n* − 6-dimensional, where *n* is the number of atoms in a system. In the case of molecular clusters, it is advantageous to separate the total PES into intermolecular and intramolecular components, with the former defined as the difference between the total PES and the sum of intramolecular parts. Since we concentrate on clusters, we will consider only nonreactive PESs, i.e., interatomic distances within each molecule are restricted to change within the range corresponding to molecular vibrational motions. The derivatives of the potential energy with respect to the nuclear coordinates are the forces acting on atoms, and therefore, PESs are often referred to as force fields. Intermolecular PESs are still functions of all nuclear coordinates and can be greatly simplified if the intramolecular coordinates are kept fixed, leading to PESs in the so-called rigid-monomer approximation. This gives a substantial reduction in dimensionality, from 3*n* − 6 to 6 or less for any type of dimer, so PESs for dimers containing several dozens of atoms can be developed.^{187} In contrast, flexible-monomer PESs become intractable by analytic fits already for rather small systems. Once a PES has been developed, the most rigorous next step is to solve the Schrödinger equation for nuclei with the PES used as the potential energy term in this equation. This task is generally nontrivial if the number of degrees of freedom is a dozen or more, but, fortunately, for many types of system, it is sufficient to solve the classical Newton equations of motion, which can be done for systems with millions of atoms. This approach is referred to as molecular dynamics (MD). The physical quantities obtained from quantum or MD calculations can be directly compared with experiment, which is generally not possible for interaction energies. Note that for individual molecules, meaningful comparisons with experiment can be made using single-point harmonic transition energies, but cluster PESs are so highly anharmonic that a whole PES is needed for such comparisons. Clearly, PESs are fundamental to investigations of clusters and condensed phases. Currently, a great majority of MD simulations use empirically derived PESs, i.e., PESs fitted in MD simulations to reproduce some quantities that have been measured. *Ab initio*-derived PESs are less often used, mainly due to the significant computational effort required to calculate the interaction energies and due to the difficulties encountered in fitting PESs to these data. A broader use of *ab initio* PESs is desirable since they have several distinct advantages over the empirical ones (note that these advantages will persist even if a method that is not strictly *ab initio*, for example, one of the parameterized DFT+D methods, is used). One of these advantages is that while empirical PESs are applicable only to specific systems (e.g., empirical PESs fitted to the bulk properties of water cannot reproduce the spectra of small clusters), *ab initio* PESs are very general [(for instance, a water *ab initio* PES^{132,133,188–190} was successfully applied to both the spectra of the water dimer and the properties of liquid water (see Ref. 191 for a review of this work)].

PESs fitted to SAPT interaction energies were developed soon after the many-electron SAPT codes became available (see Refs. 63, 77, 122–124, and 192–196). These surfaces, as well as many other developed later, were always used in predicting the observed properties of clusters and condensed phases. In particular, a number of surfaces were developed to interpret experiments in superfluid helium nanodroplets (see Ref. 197 for a review of this work). In recent years, SAPT has been used to generate PESs for interactions of monomers containing about 20 atoms each,^{65,170,187} relevant for crystal structure predictions (CSPs). The autoPES approach discussed below is based on the methodologies developed in these papers. In particular, all of these PESs included very accurate asymptotics, which is a distinctive feature of autoPES.

Although the number of published *ab initio* PESs has been gradually increasing, until recently, creation of such a PES, even a rigid one, was very computer-time-consuming and required significant human effort. The fitting process was often by trial and error, and most published PESs were obtained using case-by-case fitting procedures and ways to select grid points, which was human-time consuming. Since tens of thousands of grid points were often used, running large numbers of jobs also required significant effort, and calculations were computer-time consuming if methods such as CCSD(T) were employed. These problems were greatly reduced with the development of the autoPES method and its computational implementation.^{74,198} AutoPES works completely automatically, and the only input it needs are monomer geometries and an accuracy threshold (autoPES creates detailed inputs, generates grid points, submits jobs that perform computations of close-range and asymptotic interaction energies, extracts results from outputs, and performs iterations to convergence without any human involvement). Computer resources needed for *ab initio* calculations are greatly reduced if SAPT(DFT) is used rather than CCSD(T) [even relative to very efficient implementations of CCSD(T) such as DLPNO-CCSD(T)^{161,162} (cf. Sec. IV A)]. As discussed earlier, SAPT(DFT) interaction energies agree to within a couple percent with the CCSD(T) ones computed in the same basis set.^{62} However, autoPES can use CCSD(T) for close-range grid points or, in fact, any other electronic structure method, *ab initio* or not. If very high accuracy is required, as is the case, for example, in the predictions of spectra of small complexes,^{199,200} CCSDT(Q),^{201} a method including full triple excitations and noniterative quadruple ones, can be used. Very substantial savings of computational resources are achieved by reducing the number of necessary grid points. This reduction is due to two reasons. First, autoPES uses Monte Carlo sampling of configurational space in all dimensions, rather than the product of one-dimensional grid approach used in most of older literature work. Moreover, this sampling is guided by the strength of interactions. An alternative and better performing method of sampling, called iterative variance minimizing grid (IVMG) approach,^{202} has recently been published and is available in SAPT2020. Second, autoPES uses the asymptotic expansion of the interaction energy to generate *ab initio* grid points in the long-range regime, essentially at zero costs compared to close-range calculations. Therefore, expensive close-range calculations can be restricted to a small region of space compared to those used in most published work. As a consequence, for rigid monomers, autoPES uses ∼10^{3} rather than ∼10^{5} grid points. AutoPES has two methods for generating asymptotic potentials. The first method uses the COM–COM expansion discussed already in Sec. II and its appropriate modification in the case of SAPT(DFT).^{203} This expansion is then used to generate a set of long-range asymptotic energies, which are subsequently fitted by a simple sum of site–site isotropic multipole expansion terms. The second method, seamlessly connected only to SAPT(DFT), uses the *ab initio* distributed asymptotic expansion based on the distributed multipoles and polarizabilities. Several forms of such an expansion have been published.^{204–216} AutoPES uses density-fitting type approaches.^{217–220} In distributed expansions of this kind, each site–site potential is anisotropic and such a form would not be accepted by most MD or CSP programs. Therefore, this expansion is used to generate long-range energies that are fitted in the same way as in the COM–COM approach. To assure high accuracy of fits, autoPES uses a more elaborate functional form than that used in biomolecular force fields, but still accepted by most MD/CSP codes. This form includes polynomials multiplied by exponentials, terms with up to inverse tenth power of atom–atom distances, and damping functions. In addition to atoms, off-atomic fitting sites can be used. A polarization model can be used by autoPES as a part of a fit. Such a model approximates induction energies by computing electric dipoles induced on a given atom by the electric field of interacting partners. This model can be used in a many-body context to approximate pairwise nonadditive polarization effects. For polar systems in the condensed phase, such a model is a good approximation of the complete pairwise nonadditive effects.^{221}

The autoPES package has been included in the SAPT suite of codes starting with the SAPT2016 version. A limitation of this implementation was that it was using the rigid-monomer approximation. This limitation was lifted in SAPT2020. The flexible-monomer version of autoPES is described in Ref. 222. It should be noted that the rigid-monomer approximation works very well and will continue to be used for most future PESs. Paradoxically, in practice, this approximation may lead to more accurate predictions than those from the flexible-monomer surfaces since often various compromises have to be made in the latter case due to the much larger numbers of grid points needed than in the former case (like a low level of theory and small basis sets). Errors of flexible-monomer fits also tend to be fairly large since difficulty of fitting increases with the number of dimensions. Examples comparing the rigid- and flexible-monomer approaches are the work on H_{2}–CO^{223,224} and water clusters.^{225} In the former case, a rigid-monomer PES predicted scattering cross sections in better agreement with experiment than a flexible-monomer one. In the latter case, a rigid-monomer PES gave cluster energies closer to the benchmark values than did the best published flexible-monomer PESs. However, as the predictive power of theory improves and accuracy expectations increase, at some point, monomer flexibility has to be accounted for. For example, for the water dimer to reach sub-wavenumber accuracy of the rovibrational spectrum requires flexible-monomer PESs.^{226,227} Even if such accuracy is not needed, there are some physical problems where monomer flexibility has to be accounted for, e.g., to describe the effects of the intermolecular interaction on monomers’ vibrational frequencies (red or blue shifts). Other examples are molecules with soft internal degrees of freedom, whose structure in crystals can be radically different than in the gas phase. Often, there are only a few such degrees of freedom, and autoPES can generate fits including only these degrees, going around the impossible-to-fit problem in 3*n* − 6 dimensions.

There are several ways to couple intramonomer degrees of freedom to intermonomer ones. One way is to start from a rigid-monomer PES, with elaborate site–site functions, and expand its parameters in intramonomer coordinates, as done, for example, in Refs. 228 and 229. Another option is to use the sum of products (SOP) of simple atom–atom functions.^{230–234} Usually, such a SOP approach does not distinguish between intra- and intermolecular degrees of freedom. The autoPES method of Ref. 222 implemented an intermediate path between the two options: the intermonomer atom–atom functions are more complex than in the standard SOP approach and off-atomic sites are used, while the SOP-type products are limited to double ones only and the intramolecular part in these products is just a polynomial function of the site–site distances. Despite not using the terms dependent on bond–bond and dihedral angles, this method fits very well the energy of ethylene glycol as functions of the rotation around the central C–C bond.^{222}

The flexible-monomer autoPES method has been tested^{222} on dimers involving the water and ethylene glycol molecules in both cases with three internal degrees of freedom per monomer, i.e., the latter case assumes a partial monomer flexibility. For the water dimer, the developed PES has stationary points closer to the benchmarks^{235} than any published PES, despite being fit to only about 5000 points in 12-dimensional space. For the ethylene glycol dimer, comparisons were made with the empirical optimized potential for liquid simulations with all atoms (OPLS-AA).^{236,237} PES, which in this case should be reasonably accurate since ethylene glycol was one of the training systems in the development of OPLS. The agreement between the two surfaces was quite acceptable. Comparisons were also made with a PES obtained using the interaction potentials from machine learning (IPML) method.^{238} In this case, very good agreement was found for large *R*, but the IPML PES has incorrect behavior at short distances.

SAPT has also been used to develop universal force fields, i.e., applicable to a class of monomers (see, for example, Refs. 239–244). SAPT is also increasingly used in the development of biomolecular force fields.^{245–249}

## VII. SAPT IMPLEMENTATIONS IN OTHER ELECTRONIC STRUCTURE PACKAGES

Various flavors of SAPT have been implemented in several software distributions developed by other groups. Unlike the SAPT suite, most of the other distributions are general-purpose electronic structure packages that have one or more variants of SAPT as one of the available methods. For instance, Q-Chem^{250,251} contains algorithms to compute SAPT0 interaction energies, including linearly scaling SAPT0.^{252} In the latter case, for terms other than $Edispx(20)$, SAPT0 formulas expressed in atomic orbitals were programmed using sparse matrix multiplications, while some more specific sparsity-exploiting techniques were used for $Edisp(20)$. The correction $Eexch\u2013disp(20)$ was not programmed, instead an empirical scaling of $Edisp(20)$ was used. PSI4^{41,173} contains efficient density-fitted algorithms^{113,172,253–255} for SAPT interaction energy up to the third order [all the terms included in Eqs. (4)–(7) and (9), except for $\u03f5exch(1)(CCSD)$]. The PSI4 implementation can use either MCBS or DCBS, but not MC^{+}BS. PSI4 includes also the “atomic” SAPT (A-SAPT)^{256} and functional group SAPT (F-SAPT) methods^{257} that provide atomic or functional group partitioning of interaction energies at the SAPT0 level. As an extension to F-SAPT, intramolecular SAPT (I-SAPT) has also been proposed^{258} and implemented in PSI4. I-SAPT estimates the intramolecular interaction energies of some molecular fragments inside a molecule. Furthermore, PSI4 includes the second-order terms beyond the *S*^{2} approximation^{100,101} and spin-flip SAPT of Ref. 181. MOLPRO^{40,259,260} contains algorithms for the computation of SAPT0 interaction energies up to second-order in *V* and can account for intramonomer correlation by using either SAPT(CCSD) or SAPT(DFT). In the latter case, MOLPRO includes the possibility of using the so-called exact-exchange (EEX) kernel,^{261} leading to a more advanced form of TD-DFT (note that this term is different from the term “exact-exchange” denoting the use of the HF exchange operator computed with KS orbitals) and partial decoupling of the TD-DFT equations,^{262} i.e., setting in the CKS Hessian some matrix elements between occupied-virtual pairs to zero. MOLPRO also includes SAPT0 second-order terms without the single-exchange approximation.^{100,101} CamCASP^{263} is another set of codes that includes SAPT(DFT); however, these codes are not efficient enough to be applied to as large systems as the other discussed codes can handle.

## VIII. SUMMARY

The SAPT package of computer codes, SAPT2020 being the current version, represents the most comprehensive set of programs for SAPT calculations available. Its capabilities include SAPT based on the wave-function description of monomers (HF, MP*n*, and CCSD) and on the DFT description. The latter version, which gives results within a couple percent of those given by supermolecular CCSD(T), has recently been significantly optimized and is capable of calculating complete intermolecular interaction energies for two hundred-atom dimers with modest computational resources. The SAPT2020 package can perform restricted open-shell calculations for high-spin complexes, calculations of nonadditive three-body effects (both with HF/MP and DFT based monomers), calculations of interaction energies from monomer asymptotic properties, and it contains the autoPES code for the automated creation of potential energy surfaces with rigid or flexible monomers. The codes are available free of charge and are supported on multiple Unix-based computer platforms.

## DATA AVAILABILITY

The data that support the findings of this study are available within this article.

## ACKNOWLEDGMENTS

This work was supported by the U.S. Army Research Laboratory and Army Research Office (Grant No. W911NF-19-1-0117) and the NSF (Grant No. CHE-1900551). R.P. acknowledges support from the Polish National Science Centre (Grant No. 2015/17/B/ST4/03727). The authors thank Dr. Bogumił Jeziorski and Dr. Michael P. Metz for reading and commenting on this manuscript.

## REFERENCES

_{2}O and HF dimers

*ab initio*program for many-body symmetry-adapted perturbation theory calculations of intermolecular interaction energies, University of Delaware and University of Warsaw, 2020, https://www.physics.udel.edu/∼szalewic/SAPT.

*ab initio*programs,

*ab initio*program for many-body symmetry-adapted perturbation theory calculations of intermolecular interaction energies, University of Delaware and University of Warsaw, 2002, http://www.physics.udel.edu/∼szalewic/SAPT.

_{2}

^{−}, He–HF, H

_{2}–HF, and Ar–H

_{2}dimers

_{2}O–H

_{2}O and Ar–NH

_{3}

_{2}HF trimer

_{2}O)

_{20}dodecahedral cage

*et al.*.

*ab Initio*Program for Many-Body Symmetry-Adapted Perturbation Theory Calculations of Intermolecular Interaction Energies, University of Delaware and University of Warsaw, 2016, https://www.physics.udel.edu/∼szalewic/SAPT.

^{3}Σ

^{−}) − NH(

^{3}Σ

^{−}) with analytical long range

_{2}–O

_{2}, N

_{2}–N

_{2}and O

_{2}–N

_{2}

_{2}($(\Sigma g\u22123)$) + N

_{2}($(\Sigma g+1)$) interaction. Applications to the collisional, spectroscopic, and thermodynamic properties of the complex

_{2}in the M

^{+}–H

_{2}complexes: Performance of DFT and DFT-D methods

_{2}intermolecular potential energy surface

_{2}–CO and D

_{2}–CO van der Waals complexes

_{2}from ab initio potential energy surfaces

_{2}–CO van der Waals complex

_{2}–CO spectra

_{2}in full dimensionality

_{2}complex

_{2}O)

_{2}

*ab initio*programs, 2019, see https://www.molpro.net.