An efficient representation of molecular correlated wave functions is proposed, which features regularization of the Coulomb electron–electron singularities via the F12-style explicit correlation and a pair-natural orbital factorization of the correlation components of the wave function expressed in the real space. The pair-natural orbitals are expressed in an adaptive multiresolution basis and computed directly by iterative variational optimization. The approach is demonstrated by computing the second-order Moller–Plesset energies of small- and medium-sized molecules. The resulting MRA-PNO-MP2-F12 method allows for the first time to compute correlated wave functions in a real-space representation for systems with dozens of atoms (as demonstrated here by computations on alkanes as large as C_{10}H_{22}), with precision exceeding what is achievable with the conventional explicitly correlated MP2 approaches based on the atomic orbital representations.

## I. INTRODUCTION

Predictive computation of properties of molecules and materials requires a highly precise numerical representation of many-body electronic wave functions/operators. Accurate electronic wave functions, such as in the coupled-cluster method, are traditionally represented in the Linear Combination of Atomic Orbitals (LCAO) representation using pre-optimized sequences of atom-centered basis sets (represented by analytic Gaussian-type or Slater-type functions, or represented numerically^{1}). Recently, many-body electronic structure computations have also been demonstrated in the (augmented) plane-wave (PW) representation.^{2} Both representations suffer from slow convergence to the basis set limit, requiring the use of basis set extrapolation^{3,4} or explicit correlation.^{5,6} Although it is possible to closely match experimental chemical energy differences, such as atomization energies of small molecules,^{7} with the LCAO approaches, such computations suffer from high cost and poor conditioning and do not lend themselves to fast (reduced-scaling) reformulation due to the fully dense representation of electronic states and operators. The known bias of common AO basis sets toward ground-state energies makes it difficult to attain similar accuracy for excited states and response properties. Meanwhile, the plane wave representation struggles with the description of position-localized (e.g., intra-atomic) features of the electronic wave functions and thus also do not naturally lend themselves to low-order formulations.

Grid-based real-space numerical representations of many-body wave functions, such as finite difference and finite/spectral elements, provide an alternative to the LCAO and PW representations with a number of attractive features, such as the ability to resolve spatially localized features, systematic bias-free improvability, and good conditioning. The main bottleneck to routine use of such representations is the impractical size of the 3*k*-dimensional grid needed to represent a *k*-particle state. Even pair theories (*k* = 2) such as MP2 and CCSD become prohibitively expensive for molecular systems due to the grid size; only symmetry-based dimension reduction in atoms makes the application of pair theories possible.^{8–12}

Recently, we demonstrated for the first time real-space 2-body electronic structure methods (MP2, CIS(D), and CC2), in general, molecular systems without any use of symmetry by exploiting (1) multiresolution adaptive spectral element representation of the 2-electron (pair) functions, (2) re-formulation of the integrodifferential pair function equations in the integral form, (3) tensor compression of pair function coefficients, and (4) regularization of electron–electron and electron–nuclear Coulomb singularities via explicit electron–electron and electron–nuclear correlation.^{13–18} Let us re-emphasize that our approach is distinct from other notable recent uses of grids in many-body electronic structures^{19,20} in that the LCAO representation is not used *at all*. Our real-space approach achieved supremacy to the best available LCAO-F12 counterparts by producing correlation energies and other molecular properties with smaller numerical errors; this suggests that when the utmost numerical precision is necessary, the real space representation becomes the preferred choice.

Despite the promising numerical performance, real space many-body wave functions are costly to compute due to the sheer size of the 2-particle basis, *even* with the adaptive basis refinement, coefficient compression, and operator regularization. Here, we propose to address this challenge by leveraging a lesson of the reduced scaling LCAO many-body technology based on pair natural orbitals,^{21–30} namely, that pair functions are *globally* separable with a low (≪100) separation rank. The initial demonstration will be limited to the ground-state MP2 method for simplicity, but extensions to higher-order methods, and to excited states and properties, will be straightforward. The key novel feature of the real-space MRA-PNO-MP2-F12 method is the direct optimization of the pair natural orbitals (PNOs) represented in a real-space representation and the adaptive determination of the PNO subspace ranks.

The rest of this paper is organized as follows. Section II briefly recaps the application of the multiresolution analysis (MRA) to the many-body electronic structure methods as well as the proposed MRA-PNO-MP2-F12 method. Section III describes the technical details of the computations whose results are reported and analyzed in Sec. IV. Our findings are summarized in Sec. V.

## II. FORMALISM

### A. Multiresolution spectral element representation

The multiresolution analysis (MRA) of a Hilbert space defines its recursive decomposition that translates into a straightforward adaptive construction of scale-invariant real-space spectral element bases.^{31} The multiresolution adaptive spectral element representation (or, simply, MRA representation) allows a compact representation of sufficiently smooth functions with arbitrary guaranteed precision *ϵ*, sparse representation of many differential and integral operators, and fast algorithms for their applications,^{32–35} thus realizing a *basis-free* computational integrodifferential calculus. A number of electronic structure methods have been demonstrated in the multiresolution representation, including one-body (Hartree–Fock and DFT^{36}) and two-body (MP2^{13} and CC2^{18}) methods, with energies, forces, and other molecular properties^{17,38,39} of ground and excited states. Complete details can be found in Refs. 34, 36, 40, and 41.

### B. MRA-MP2 and MRA-PNO-MP2-F12

#### 1. MRA-MP2

The *closed-shell* second-order Moller–Plesset energy can be obtained by minimizing its Hylleraas functional^{42} with respect to the first-order pair functions *τ*_{ij}(**r**_{1}, **r**_{2}) ≡ ⟨**r**_{1}**r**_{2}|*τ*_{ij}⟩,

where $P^12$ permutes electrons 1 and 2, $Fij\u2261i|F^|\u2009j$ is the matrix element of the Fock operator $F^$, $F^12=F^1+F^2$, and $g12\u2261r12\u22121$ is the electron–electron repulsion. The strong orthogonality projector $Q^12$ is defined as

with $O^\u2261\u2211m|mm|$ being the projector onto the subspace spanned by the occupied orbitals of the reference determinant.^{37} For real pair functions, the stationarity condition of the Hylleraas functional can be compactly written as

with the functional derivative with respect to the pair functions given by

The real-space projection of Eq. (3) is an integrodifferential equation that can be solved by a number of methods; here, we follow the standard approach of Refs. 13, 14, 17, and 18 by converting it to the integral form,

Here, we introduced a shorthand notation for the potential energy component of the Fock operator,

where $J^$ and $K^$ are the usual Coulomb and exchange operators and $G^\mu \u2261T^12\u2212\mu 2\u22121$ is the 6-dimensional Helmholtz Green’s operator for free-space bound states^{36,44} parameterized by

While for some methods the initial guess has to be constructed by nontrivial approaches, here, the trial real-space pair function can be set to zero. The coupled system of integral equations is then iterated until the fixed point is reached to the desired precision.

#### 2. MRA-PNO-MP2-F12

In this work, we will solve the first-order MP equations by representing the pair functions as products of pair natural orbitals (PNOs). First introduced by Edmiston and Krauss^{21} and developed by Meyer, Kutzelnigg, Ahlrichs, and others,^{23–25} these were recently reintroduced by Neese as a means to reduce the cost of reduced-scaling coupled-cluster methods in the LCAO representation.^{26–30} Here, we will extend the PNO-based MP2 approach as a means to make the real-space representation practical for many-body methods.

The PNOs in the LCAO approach are usually computed by diagonalizing pair contributions to the unoccupied block of the correlation contribution to 1-RDM evaluated from approximate (uncoupled) pair functions. Once determined, the PNOs are traditionally fixed, although recently one of us discovered that PNO optimization can offer a substantial improvement in the context of coupled-cluster singles and doubles.^{45} In a real-space representation, it is not feasible to form 1-RDM (a function of 2 particle coordinates), considering the large number of parameters needed to represent even 1-particle functions. Thus, we will instead determine the PNOs directly by minimizing the second-order Hylleraas functional expressed in terms of PNO-expanded pair functions.

Each first-order pair function |*τ*_{ij}⟩ will be expanded as a linear combination of all possible products of its PNOs {|*c*_{ij}⟩},

The second-order Hylleraas functional thus depends on the PNOs and the corresponding amplitudes,

where the usual contravariant amplitudes were introduced,

Note that the overlaps *S*_{ab} ≡ ⟨*a*|*b*⟩ in Eq. (10) appear since the PNOs for different pairs are not orthogonal to each other.

The stationarity conditions given with respect to both quantities can be written in compact form as

where

The amplitude gradient (12) is evaluated similar to Eq. (12) of Ref. 28 where the PNOs for a given pair were forced to be mutually orthonormal, hence leading to a slightly different equation. In this work, the mutual orthogonality of the PNOs for a given pair is optional and can be ensured by diagonalizing the pair specific Fock matrices in each iteration. Note that Ref. 28 used a fixed (pair-specific) basis of projected atomic orbitals to represent the PNOs, while in this work, the basis used to represent the PNO changes from iteration to iteration. The PNO gradient (13) can be further simplified to

by introducing the PNO–PNO block of the cross-pair reduced density matrix,

which for the same-pair case (i.e., *k* = *i*) reduces, due to the orthogonality between the PNOs of a given pair, to the usual PNO 1-RDM matrix,^{28}

We also introduced the energy-weighted variant of the 1-pair RDM, given by

### C. Explicitly correlated ansatz for MRA-PNO-MP2

The first-order pair functions in real space contain the 3-dimensional seam of the electron–electron cusp caused by the singular electron–electron interaction.^{43} While this leads to the well-known slow asymptotic decay of the basis set error in LCAO,^{46} the cusp seam results in an impractical number of parameters (e.g., grid size) in a real-space representation of the pair functions. Regardless of a representation, it is mandatory to describe the known analytic structure of the cusp by an explicit contribution to the pair function. We thus introduce the explicitly correlated R12/F12 ansatz for the pair function^{5,47}

with the explicitly correlated cusp-containing term given by

and |*u*_{ij}⟩ is a cusp-free electron pair function. The explicitly correlated ansatz regularizes the MP1 equations by canceling the singularity of the Coulomb potential with the kinetic energy operator acting on $w$_{ij}.^{13,16,18}

With the ansatz of Eq. (19), the Hylleraas functional is split into one functional depending only on the regularized pair functions *u*_{ij}, one depending only on the explicitly correlated residues $w$_{ij}, and a coupling functional,

The regularized pair functions *u* are expanded into the PNOs as in Eq. (9); henceforth, all PNOs and amplitudes will refer to the regularized pair functions *u*.

For the evaluation of the functionals depending on $w$, we use the regularized potential $g\u030312$ defined by^{18}

In the coupling functional $L(2)c[u,w]$ and the Hylleraas functional that depends on $w$ only, $L(2)[w]$, all coupling terms between the individual electron pairs cancel out, leaving only

The regularized Hylleraas functional $L(2)u$ takes the same form as in Eq. (10). If the coupling functional is combined with the Hylleraas functional of the regularized pair functions, the Coulomb parts will cancel, resulting in the same functional as before with the Coulomb potential replaced by the regularized potential $g\u030312$,

where we introduced the regularized PNO-MP2-F12 Hylleraas functional,

#### 1. MP2-F12 energies

Having determined the regularized pair functions, the MP2-F12 energy can be evaluated by the Hylleraas functional

The pure *f*_{12} dependent functional $L(2)w$ of Eq. (28) will contain terms such as

emerging from the regularized potential $g\u0303$. The second term on the right-hand side of Eq. (30) can be straightforwardly evaluated without the need to represent 2-particle quantities. This is not the case for the first term,

One way to evaluate such terms within an MRA framework is to compute a two-particle intermediate function

making the costs of the MRA-PNO-MP2-F12 computation comparable to the full (non-PNO) MRA-MP2-F12 computation and therefore shares its disadvantages, namely, the need to compute and store a 6-dimensional function. Another possibility is to introduce an auxiliary virtual basis set $|vk$ in order to approximate $Q^\u2248\u2211k|vkvk|$ and reduce Eq. (30) to an expectation value over one-particle functions

Note that it is not necessary to approximate the full two-particle projector $Q^12$. The introduction of such an auxiliary basis set is possible and can be performed in reasonable time compared to the optimization of the PNOs, but it introduces an additional approximation.

To avoid these challenges, we compute the energy by projection

such that terms such as Eq. (28) do not appear anymore and the energy can be calculated as

In LCAO-MP2 without explicit correlation, the projection formula and Hylleraas functional yield identical working equations for the energy. The two approaches differ if the MP1 wave function is not solved exactly, as it is the case in LCAO-MP2-F12. Since MRA-MP2-F12 and MRA-PNO-MP2-F12 compute the MP1 (residual) wave function numerically exactly, the two expressions again yield the same energy once the wave function is converged with respect to the number of PNOs. Another argument for using projected MP2 energies is that they are similar to the projected coupled-cluster energies so that the quality of the optimized PNOs when used in the coupled-cluster context can be better estimated by the projected MP2 energies. Note, however, that the projected MP2 energies are not variational (in the sense of the Hylleraas functional) so that the final MRA-PNO-MP2-F12 energy can be lower than the exact (basis set limit) MP2 energy. In addition, if the PNOs are far from being converged, the Hylleraas functional and the projected energies can differ quite significantly. If not explicitly stated otherwise, we will always show projected energies in this article.

### D. Optimizing PNOs and their amplitudes

#### 1. Optimization of the PNO amplitudes

At each step of the PNO optimization procedure (see Fig. 2), the amplitudes are optimized for the current PNOs. The straightforward way to do so is by solving Eq. (12), but this might cause convergence problems, especially for larger systems with localized orbitals. A solution to this is the canonicalization of the PNOs, i.e., rotating them in a way that the PNO–PNO Fock matrix $Faijbij$ for each pair becomes diagonal and the coupling between the PNOs of the pair vanishes. Note that if localized orbitals are used, the coupling between the different pairs is still present so that the amplitudes still need to be solved in an iterative way. It is also possible to use a quadratic optimization scheme to improve the convergence behavior with or without reduced coupling; here, we used the KAIN solver introduced in Ref. 48.

#### 2. Optimization of the PNOs

The real-space PNOs are optimized in a manner similar to how the Hartree–Fock/Kohn–Sham orbitals are optimized by using the BSH Green’s operator in an iterative fashion.^{36,44} However, compared to the mean-field methods, the number of PNOs is not fixed; this makes the PNO optimization more complicated compared to the standard LCAO-based PNO methods in which a set of PNOs is determined once and fixed. The current procedure is closer in spirit to the optimized PNO-CCSD approach of Ref. 45, where the number of PNOs was varied.

Before optimization, a given set of PNOs is truncated by diagonalizing the reduced density matrix given in Eq. (17) and using the eigenvalues as a truncation criterium, i.e., for each pair, the PNOs that correspond to eigenvalues of the reduced density matrix smaller than *δ* are removed. This will be referred to as compression of the PNOs; this step is completely analogous how the PNOs are determined in the LCAO-based MP2 and CC methods. In the representation that diagonalizes the reduced density matrix, the PNO gradient of Eq. (15) is

where $daij$ is the corresponding eigenvalue of the PNO RDM (note that the corresponding inter-pair RDMs are not diagonal). The gradient can be rearranged to give

with the potential

The optimized PNOs are then given as a solution to

where $G^\u03f5aij$ is the pair-specific 3D BSH Green’s operator parameterized by

This operator is constructed and applied as described in Ref. 36. Due to the $2daij$ factor in Eq. (41), the MRA precision *ϵ* should be in a similar range as the RDM cutoff threshold *δ* if all PNOs should be optimized to convergence. Since the RDM threshold is usually chosen to be 10^{−6} or smaller, this is, in general, not practical, but since the PNOs corresponding to the small RDM eigenvalues will also contribute less to the overall energy, it seems unnecessary to fully optimize those PNOs.

The PNO optimization as described below does not increase their number; the PNO update keeps the number of PNOs unchanged, whereas the compression step can only decrease their number. Thus, it is important to consider how to adaptively grow the set of PNOs to be able to guarantee the target precision of the MP2 energy. The simplest possibility is just to read in a sufficiently large set of guess PNOs, defined by some predefined atomic basis set and use this as first guess for all pairs. Here, we instead adaptively generate the guess PNOs from scratch in a pair-specific way as follows: The procedure is similar to how guess functions were generated in MRA-CIS, namely, the orbital-like functions in the linear-response equations were created by multiplying “excitation operators” onto the occupied reference orbitals.^{49} In Ref. 49, monomials (i.e., *x*^{a}*y*^{b}*z*^{c}) were used as “excitation operators” in order to introduce desired symmetries into the guess functions. Due to the regularization with the explicitly correlated ansatz, we expect that the low order of the monomials used to generate the guess PNOs will be sufficient for an accurate representation. In this work, we will follow this approach where we wrap the roots of the monomials with a sin function (i.e., $xaybzc\u2192sinaxsinbysincz$) in order to remove the growth of the functions toward the boundary, which can lead to large undesired numerical noise at the boundaries after multiplication with the reference orbital (see Table I). Reference 50 also used a similar strategy to create natural orbital-like functions from the occupied reference orbitals called “product plane waves.” The main difference is that the functions here only serve as guess to provide correct initial symmetries and are optimized afterward.

dipole+ = {x, y, z, r^{2}} |

quadrupole = {x, y, z, x^{2}, y^{2}, z^{2}, xy, xz, yz} |

octupole = quadrupole ∪ {x^{3}, y^{3}, z^{3}, x^{2}y, x^{2}z, xy^{2}, xz^{2}, y^{2}z, yz^{2}, xyz} |

$multipole=macroiteration=1:dipole+\u2003macroiteration=2:quadrupole\u2003macroiteration>2:octopole\u2003$ |

dipole+ = {x, y, z, r^{2}} |

quadrupole = {x, y, z, x^{2}, y^{2}, z^{2}, xy, xz, yz} |

octupole = quadrupole ∪ {x^{3}, y^{3}, z^{3}, x^{2}y, x^{2}z, xy^{2}, xz^{2}, y^{2}z, yz^{2}, xyz} |

$multipole=macroiteration=1:dipole+\u2003macroiteration=2:quadrupole\u2003macroiteration>2:octopole\u2003$ |

The guess functions for the PNOs are created pair-specifically as

where exop is a set of PNO generators given in Table I. After initialization, the PNOs are iterated until convergence, which defines one macroiteration. A macroiteration consist of several microiterations where the PNOs are compressed and optimized. Due to compression, the number of PNOs is usually reduced. In the next macroiteration, a new set of PNOs is added to the already converged ones and the procedure is repeated until the overall convergence of the energy is reached. In order to save computational time, it is possible to grow the PNO ranks with a lower MRA threshold and reiterate the final pre-optimized PNOs again with a larger MRA threshold. The two phases of PNO optimization are illustrated in Figs. 1 and 2.

## III. COMPUTATIONAL DETAILS

The MRA-PNO-MP2-F12 method was implemented in the open-source package MADNESS (revision c6bdfa1fd762080ce9754d17c0b88f26b366b529).^{40}

MRA-PNO-MP2-F12 computations with MRA target precision *ϵ* and PNO truncation threshold *δ* are abbreviated as MRA(*ϵ*, *δ*) (Table II). In this work, we restrict ourselves to 7 initial polynomials per dimension (i.e., *k* = 7) for 7^{3} = 343 grid points per volume element.

Projector onto the occupied space | $O^=\u2211i\u2208occ|ii|$ |

Projector onto the virtual space | $Q^=1\u2212O^$ |

Two-body notation | $X^1\u2261X^1,\u2009X^12\u2261X^1,2$ |

Two-body Fock operator (same for $K^$ and $J^$) | $F^12=F^1+F^2$ |

Fock matrix | $Fpq=p|F^|q$ |

Fock operator | $F^=T^+J^\u2212K^$ |

Potential part | $v^F\u2261F^\u2212T^$ |

Particle and orbital permutation | $P^12,\u2009P^pq,\u2009$ |

Explicit correlation factor | $f12\u2261f1,2=1\u2212exp\u2212r12$ |

Coulomb potential | $g12=r12\u22121$ |

Hylleraas functional | $L2$ see Eqs. (1), (28), and (10) |

BSH Green’s function | $G^\mu $ see Ref. 36 |

MP2 pair function | |τ_{ij}⟩ |

MRA with accuracy 10^{−x} and PNO threshold of 10^{−δ} | MRA(x, δ) |

aug-cc-pVXZ basis-set | aXZ |

Projector onto the occupied space | $O^=\u2211i\u2208occ|ii|$ |

Projector onto the virtual space | $Q^=1\u2212O^$ |

Two-body notation | $X^1\u2261X^1,\u2009X^12\u2261X^1,2$ |

Two-body Fock operator (same for $K^$ and $J^$) | $F^12=F^1+F^2$ |

Fock matrix | $Fpq=p|F^|q$ |

Fock operator | $F^=T^+J^\u2212K^$ |

Potential part | $v^F\u2261F^\u2212T^$ |

Particle and orbital permutation | $P^12,\u2009P^pq,\u2009$ |

Explicit correlation factor | $f12\u2261f1,2=1\u2212exp\u2212r12$ |

Coulomb potential | $g12=r12\u22121$ |

Hylleraas functional | $L2$ see Eqs. (1), (28), and (10) |

BSH Green’s function | $G^\mu $ see Ref. 36 |

MP2 pair function | |τ_{ij}⟩ |

MRA with accuracy 10^{−x} and PNO threshold of 10^{−δ} | MRA(x, δ) |

aug-cc-pVXZ basis-set | aXZ |

In general, a lower *k* value reduces memory demand in any vertex of the MRA tree but also leads to deeper trees and therefore more vertices.^{41,51,52} It is, therefore, hard to state which *k* values should be used in general. A rule of thumb is to choose *k* ≥ 5 and increase it with target precision where *k* = 5 is usually only used in six dimensional representations where higher values have significant impact on memory consumption.^{14} We chose *k* = 7 because it usually performs well within the heuristics of madness for three dimensional representations and for the accuracy range chosen for this work.^{40} We performed some tests with *k* = 8 and *k* = 6 and did not note significant changes.

Unless otherwise mentioned, the MRA computations are performed with reference orbitals localized via the standard Foster–Boys^{53} procedure. All final correlation energies were obtained by the adaptive PNO generation algorithm as described in Secs. II D 2 and IV B.

For comparison with the LCAO-based methods, we used the MP2-F12 method with ansatz 2B, F+K approximation, and fixed-amplitudes (sp) ansatz^{54} implemented in the ricc2^{55} program of the TURBOMOLE^{56} package with the basis sets of Dunning *et al.*^{57–60} with the matching auxiliary (density-fitting) basis sets^{61,62} of the TURBOMOLE library. In most plots and figures, the Dunning basis sets aug-cc-pVXZ are abbreviated as aVXZ. For second row elements, we used the corrected aug-cc-pV(X+d)Z^{63} basis sets abbreviated as aV(Z+d)Z and the corresponding auxiliary basis sets.^{64} All computations are performed using the frozen-core approximation (i.e., neglecting the core orbitals in the electron correlation treatment).

## IV. RESULTS

### A. Saturating the PNO subspace: The helium atom

In Fig. 3, the convergence of MRA-PNO-MP2-F12 computations with different sets of virtual orbitals and MRA thresholds is shown. The smallest set of initial PNOs is created by the “dipole+” operator of Table I, which for the case of the ground state He atom generates an s-type and three p-type orbitals from the reference orbital (crosses and open circles). The energy after 5 iterations (at point M1) differs by several m*E*_{h} from the MP2-F12/aV6Z reference, indicating an incomplete basis. Adding d-type functions (“quadrupole,” filled circles) decreases the basis set error. Note that since the energy is computed by projection, the variational principle does not apply, and the energy can be overestimated, resulting in positive errors with respect to the reference.

In the second macroiteration (converged at point M2), additional PNOs of the same symmetry are generated and the energy changes significantly, while in the last macroiteration, no such changes are observed. At that point, the partial wave sets are saturated, and it can be seen that the 2s2p set (open circles) still misses the 2d functions (filled circles), but no f-functions are needed (filled circles and triangles are on top of each other).

The final macroiteration adds another set of functions of the respective symmetry, which does not change the energies much. However, the two sp-sets start to differ from the third iteration on, where the additional s- and p-functions are removed from the set in the low-threshold calculation (crosses), while they are kept in the tight-threshold calculation (open circles).

In order to reach the sub-millihartree range, it is necessary to not only have the partial wave expansion saturated but also increase the MRA and PNO thresholds. If the d- and f-functions are used, the difference to the MP2-F12/aV6Z reference is below 0.1 m*E*_{h}.

### B. Saturating the PNO subspace: The methane molecule

A similar analysis was performed for methane, a highly symmetric system with more than one electron pair. Figure 4 illustrates the convergence of the MRA-PNO-MP2-F12 energy of a methane molecule with respect to the PNO generators used to generate the initial PNOs.

Just as in the He atom, the error with respect to the MP2-F12/aV6Z reference decreases as the number of PNO generators is increased. The choice of the MRA threshold is investigated in Fig. 5 where the same computation is performed with different MRA thresholds. As in the case of a helium atom, the 10^{−4} and 10^{−5} thresholds do not lead to a significantly different result if accuracy in the millihartree regime is desired, while an MRA threshold of 10^{−3} underestimates the correlation energy by several m*E*_{h}. If the threshold is, however, tightened to 10^{−4} in the last macrocycle, the result is nearly the same as for the computation with MRA(4,6). This indicates that the MRA threshold does not have significant effects on the quality of the generated PNOs and the error arises dominantly from the accumulation of small errors in the individual pair energies.

The first step in Fig. 5 of the third macrocycle (first filled blue dot of M3) of the MRA(3-4,6) computation shows the result for the PNOs generated and optimized with MRA(3,6) but the projected energy calculated with MRA(4,6). This steps corrects the majority of the error of the lower threshold MRA(3,4) result compared to the MP2-F12/aV6Z reference. Since there are still significant changes after the optimization with MRA(4,6), it is not enough to just recompute the energy with tightened precision; the orbitals need to be also relaxed. It is, however, in this case, sufficient to generate and pre-optimize the PNOs with the lower MRA precision so that only a few iterations with the tightened precision are necessary for an accurate result.

In Fig. 6, an adaptive protocol is used where the PNOs are generated and pre-optimized with MRA(3,6) and where the PNO generator sets also increase in each macrocycle. After convergence, the MRA threshold is increased to (4,6) and the pre-optimized orbitals are re-iterated but not further expanded. The computation is performed for the methane and the butane molecule resulting in sub-millihartree agreement with the MP2-F12/aV6Z reference.

### C. Energy computation: The projection formula vs the Hylleraas functional

In Fig. 7, the Hylleraas functional and the projected energies are compared for the water molecule where the $L(2)wij$ part of the functional was computed using a quintuple-*ζ* auxiliary basis set. We note that the projected energies typically converge from below (i.e., the correlation energies are too large in magnitude), while the energies computed with the Hylleraas functions converge from above, as they should according to the variational principle. Energies evaluated with the two approaches converge to the same value but differ significantly in the beginning. The advantage of the projected energies is that there is no need for a 6D computation or the introduction of an auxiliary basis for the computation of the $f12K^12,f12$ term.

### D. Reference orbitals: Canonical vs localized

The effects of the localization are illustrated in Fig. 8. The MRA calculation with canonical orbitals differs significantly from the MP2-F12/aV6Z reference and the error increases with the size of the linear alkane chain, while both are not the case for the localized computation.

The more an orbital becomes delocalized, the smaller the absolute values of the function on any grid point is. Due to the truncation scheme of MRA, this leads to loss of precision. In the extreme case, when an orbital is delocalized enough and its function values fall below the MRA threshold, it would be truncated completely. Tightening the thresholds helps pushing this limiting case further, but loss of precision will occur eventually. Localized orbitals, on the other hand, have magnitude that is asymptotically independent of the system size, and this allows us to maintain high precision.

### E. Error growth with the system size: *n*-alkanes

We computed MRA-PNO-MP2-F12 energies for *n*-alkanes of varying sizes (methane to decane), comparing the results to MP2-F12/aVXZ (X = D, T, Q, 5, 6) in Figs. 9 and 10. Note that the conventional (LCAO) quintuple- and hextuple-*ζ* calculations could not be performed for the larger alkanes, and the quadruple-*ζ* MP2-F12 is not a sufficiently precise reference (e.g., the MP2-F12/aVQZ energy differs from the MP2-F12/aV6Z reference by more than a m*E*_{h} already for pentane). Hence, in Fig. 10, the MRA values were used as the reference.

For the smaller basis sets, growth of the absolute error with respect to the system size is observable, as expected due to the accumulation of errors in the individual pair energies. The dependence of the error on the system size is strongly nonlinear since these small systems are too small to reach the asymptotic $O(N)$ behavior of the MP2 energy (see also the inset in Fig. 10).

The MRA results are in good agreement with the corresponding best available MP2-F12/aVXZ energies. The error of MRA-PNO-MP2-F12 energy does not show artifactual dependence on the system size as was observed for the standard MRA(3,6)-MP2-F12 in Ref. 18. We believe that this is due to the use of localized reference orbitals, which prevents the number of significant pairs from growing quadratically (see the inset in Fig. 10), and due to the increased MRA precision threshold. Note that, in Fig. 6, this system size dependent error is observable for the computations with the MRA threshold of 10^{−3}.

### F. Error growth with the system size: *n*-silanes

A similar analysis was performed for *n*-silanes to test the ability to describe heavier-element core orbitals in the MRA-PNO-MP2-F12 method, which, despite the frozen core approximation, enter the expressions via the strong orthogonality projectors. The results presented in Fig. 11 suggest that, for heavier elements, tighter MRA thresholds are needed, presumably to be able to ensure strict orthogonality of the unoccupied states to the core one-particle states; the unoccupied states need to be resolved on the lengthscale of the tightest occupied orbitals to maintain strict orthogonality. The use of pseudopotentials could alleviate the need for the high resolution of the core regions of the unoccupied orbitals. In all-electron computations, no such simplifications appear possible.

### G. Conventional LCAO PNOs vs optimized MRA PNOs: Comparison of ranks and precision

MRA PNOs can be viewed as essentially exact PNOs (modulo the finite MRA precision parameter). In principle, these should be equivalent to the PNOs obtained from LCAO MP1-F12 amplitudes [with the strong orthogonality projector of Eq. (2) rather than the traditional F12 projector] determined in a sufficiently complete basis. However, in the LCAO representation, the PNOs are traditionally formed from conventional (non-F12) MP1 amplitudes. Thus, a fair comparison of our approach is to the F12-optimized PNOs of Tew *et al.*,^{30} which result in more compact PNO^{30,66} sets that approximate the exact PNOs determined from MP1-F12. To understand the differences between the exact and approximate LCAO (F12-)PNOs, we compared PNO ranks and the corresponding PNO truncation errors in MRA and LCAO approaches for the HCl–H_{2}O complex (see Table III). For comparable accuracy of about −547m*E*_{h}, the LCAO requires average PNO rank 84, where the optimized PNOs of MRA require only 51 PNOs per pair. This suggests that the LCAO F12-PNOs are far from being optimal even for high ranks.

Method . | Max rank . | Av. rank . | Energy (mE_{h})
. |
---|---|---|---|

PNO-MP2-F12, aV(Q+d)Z, t = 6 | 41 | 22 | −542.025 |

PNO-MP2-F12, aV(Q+d)Z, t = 7 | 68 | 42 | −543.863 |

PNO-MP2-F12, aV(Q+d)Z, t = 8 | 105 | 70 | −544.365 |

MP2-F12, aV(Q+d)Z | 307 | 307 | −544.520 |

PNO-MP2-F12, aV(5+d)Z, t = 6 | 45 | 23 | −542.961 |

PNO-MP2-F12, aV(5+d)Z, t = 7 | 83 | 46 | −545.485 |

PNO-MP2-F12, aV(5+d)Z, t = 8 | 127 | 84 | −546.138 |

MP2-F12, aV(5+d)Z | 503 | 503 | −546.350 |

MP2-F12, aV6Z | 768 | 768 | −547.056 |

MRA(3-4,6) | 37 | 17 | −548.729 |

MRA(3-4,7) | 69 | 32 | −547.849 |

MRA(3-4,8) | 118 | 51 | −547.058 |

Method . | Max rank . | Av. rank . | Energy (mE_{h})
. |
---|---|---|---|

PNO-MP2-F12, aV(Q+d)Z, t = 6 | 41 | 22 | −542.025 |

PNO-MP2-F12, aV(Q+d)Z, t = 7 | 68 | 42 | −543.863 |

PNO-MP2-F12, aV(Q+d)Z, t = 8 | 105 | 70 | −544.365 |

MP2-F12, aV(Q+d)Z | 307 | 307 | −544.520 |

PNO-MP2-F12, aV(5+d)Z, t = 6 | 45 | 23 | −542.961 |

PNO-MP2-F12, aV(5+d)Z, t = 7 | 83 | 46 | −545.485 |

PNO-MP2-F12, aV(5+d)Z, t = 8 | 127 | 84 | −546.138 |

MP2-F12, aV(5+d)Z | 503 | 503 | −546.350 |

MP2-F12, aV6Z | 768 | 768 | −547.056 |

MRA(3-4,6) | 37 | 17 | −548.729 |

MRA(3-4,7) | 69 | 32 | −547.849 |

MRA(3-4,8) | 118 | 51 | −547.058 |

### H. Performance for relative energies: Water clusters

The LCAO-based computations often achieve remarkable accuracy for relative energies compared to their performance on total energies due to robust cancellation of “intra-atomic” errors as the geometry varies; the LCAO-based PNO relative energies are also known to be very precise. The MRA-based approaches, in contrast, do not generally profit from the intra-atomic error cancellation by construction, since the structure of the basis near each nucleus is not fixed. Thus, it is vitally important to examine the performance of MRA-PNO-MP2-F12 for relative energetics. To this end, we computed binding energies of small water clusters up to tetramers. The MP2 binding energies of the water clusters are just a few millihartrees; thus, generally very high precision in absolute energies, even for the LCAO approaches, is needed to compute the binding energies precisely.

The errors due to the MP2-F12/aV6Z reference for the total and binding energies are shown in Figs. 12 and 13, respectively. All MRA-PNO-MP2-F12 results give precise results in the binding energies, except MRA(3-4,6); we hypothesized that this is caused by premature truncation of PNOs in the optimization. To investigate this further, we performed computations with a tight PNO truncation threshold (*δ* = 10^{−8}), but the final PNO ranks were restricted to a fixed small value [see the “MRA(3-4,8)-maxrank-X” data in Figs. 12 and 13, with *X* = 20, 30]. The absolute energies obtained with the final PNO ranks were restricted to 20 and 30 PNOs, which differed significantly from those with the non-truncated counterpart (Fig. 12); the average and maximum PNO ranks for the single H_{2}O molecule in the latter were 60 and 90, i.e., much larger. In contrast, the effect of the final PNO truncation on the relative energies (Fig. 13) is very small due to the apparent systematic convergence of the PNO truncation error with the cluster size. This suggests that the PNO construction procedure can be improved further by gradually reducing PNO ranks during their construction.

### I. Formal scaling and expected computational performance

The formal computational complexity of the presented MRA-PNO-MP2-F12 approach is identical to that of the corresponding LCAO counterpart. The cost of computing the MP1 wave function in the LCAO representation grows with the fifth power of the system size or more precisely as $Nocc3Nvirt2$ without any approximation. With localized orbitals, the cost reduces to $Nocc3$; using the real-space representation for the pair functions with localized occupied orbitals results in the same complexity.^{18} The complexity is further reduced to quadratic due to the exponential decay of the Fock operator with a finite HOMO–LUMO gap. Thus, the MRA-PNO-MP2-F12 method as implemented here has quadratic complexity with the system size. Further reduction of complexity to linear could be achieved by pair screening but was not implemented in our work. Although the complexity of our implementation is formally quadratic, the cost is dominated by the short-range pairs, which have the highest PNO ranks, and thus, the *apparent* scaling of the cost with the system size is subquadratic.

A rigorous and fair comparison of the computational cost of MRA vs LCAO PNO-MP2-F12 is complicated due to the need to compare costs of computations that achieve same precision. An additional practical complication was that we performed the two on different computers due to technical and licensing reasons. After accounting for the hardware factors, some MRA-PNO calculations were faster than the LCAO counterparts, namely, aug-cc-pV6Z PNO-MP2-F12 computations for propane and larger systems. For the LCAO calculations, the wall-times are dominated by the Hartree–Fock part, while for PNO-MRA-MP2 calculations, the wall-times are dominated by the MP2 part. In MRA based computations, Hartree–Fock solves for the optimized occupied orbitals only, and the most time consuming step is the actual optimization of the PNOs, while in LCAO based calculations, Hartree–Fock is solved by rotating in the full orbital space defined by the basis-set. In other words, in both calculations, the construction of the basis in which the MP2 equations are solved is the expensive part.^{65}

A potentially interesting question in comparing LCAO with MRA-PNO is also the performance of truncated schemes with regard to relative energies. We illustrated that truncated MRA-PNO schemes can, in principle, achieve consistent and accurate results in cases where small and medium LCAO based calculations fail even with the F12 correction (see Fig. 13). A rigorous investigation is beyond the scope of this work.

## V. SUMMARY AND PERSPECTIVE

We presented a real-space implementation of the simplest many-body electronic structure method, namely, the second-order Moller–Plesset perturbation theory, with the pair functions represented as a linear combination of products of pair-natural orbitals represented in real space on an adaptive multiresolution spectral-element basis. Similarly to the previously demonstrated6-dimensional” MP2 approach^{13,14} in which pair functions were represented directly in real space, regularization of the electron–electron Coulomb singularity is mandatory to be able to solve equations robustly and reach high precision. However, in contrast to the original approach in which each 6D pair function was written as a sum of single-particle products only *locally* (within each element of the adaptive volume decomposition), employing *global* separation of pair functions into products of orbitals allows us to greatly extend the size of systems that can be treated with highly precise real-space MP2 computations on up to 32 atoms demonstrated for the first time.

Compared to the PNO methods in the LCAO representation in which the PNOs are determined once and fixed (for an exception, see Ref. 45 in which PNOs were optimized), in our approach, the PNOs must be determined directly by iterative minimization of an appropriate Lagrangian (namely, the Hylleraas functional for the MP2 energy). Similarly, the rank of the PNO basis is varied adaptively to reach a given precision by employing PNO generating operators (which in atomic systems increase principal and orbital quanta of the argument functions) to repeatedly augment PNO subspaces for each pair. The PNO subspaces generated by this procedure converge rapidly and produce more compact and precise sets than those obtained in the LCAO representation.

Benchmark calculations on *n*-alkanes (up to decane) shows that a size-independent maximum rank of only 25–30 is needed to achieve chemical accuracy of the absolute correlation energy for the whole molecule with perhaps a few dozen atoms and can be applied to systems too large to be treatable by conventional LCAO MP2-F12 approaches of similar precision. Further calculations on a linear chain of alkanes show that the basis set limit can be reached in a systematic way, where, in this case, higher thresholds were necessary, presumably due to the increased number of core orbitals still present in the projector. Initial assessment for relative energies was performed using weakly interacting water clusters up to four molecules where the PNO representation adaptively grown to a low fixed rank achieved precise interaction energies despite having large absolute errors.

The proposed approach can be generalized straightforwardly to higher-order methods, such as coupled-cluster. Of particular interest would be application to systems in electronically excited states and to molecular properties other than energy, where the weaknesses of the LCAO representation become more dramatic. Another prospective application would be to atoms and molecules in strong electromagnetic fields where the atom-centered representations such as LCAO are not appropriate due to their inability to describe large spatial distortions of the electronic wavepacket.

## ACKNOWLEDGMENTS

J.S.K. acknowledges the support from Fonds der Chemischen Industrie and Deutsche Forschungsgemeinschaft (Grant No. BI-1432/2-1). The work of F.A.B. was supported by the Deutsche Forschungsgemeinschaft (Grant No. BI-1432/2-1). The work of E.F.V. was supported by the U.S. National Science Foundation (Award Nos. 1550456 and 1800348).

## REFERENCES

_{3}and H

_{2}

_{3}

^{2}for the sparse representation of integral operators

We follow the notation used in previous works denoting the particles onto which an operator acts as subscripts, i.e., $O^12\u2261O^1,2$. For one-particle equations, the subscripts are dropped for brevity.

^{−}, He, Li

^{+}, Be

^{++}usw

In this work, we only focus on the electron–electron cusp. The electron–nuclear cusp also plays a role in the real-space many-body methods since its seam has the same dimension as that of the electron–electron cusp. Although it is possible to regularize the electron–nuclei cusps via the *NEMO* approach described in Refs. 15 and 16, this is less crucial for the PNO-based approach we pursue here. It is because the cusps in PNOs are pointwise, and just like the cusps in the Hartree–Fock orbitals, these can be effectively described by the adaptive multiresolution refinement of the basis.

_{12}-Dependent terms in the wave function as closed sums of partial wave amplitudes for large l

_{2}O, N

_{2}and C

_{2}in full configuration interaction and coupled cluster theory

The number of iterations in the Hartree–Fock calculation did not change significantly.