A new quantum Monte Carlo (QMC) method for anharmonic vibrational zero-point energies and transition frequencies is developed, which combines the diagrammatic vibrational many-body perturbation theory based on the Dyson equation with Monte Carlo integration. The infinite sums of the diagrammatic and thus size-consistent first- and second-order anharmonic corrections to the energy and self-energy are expressed as sums of a few *m*- or 2*m*-dimensional integrals of wave functions and a potential energy surface (PES) (*m* is the vibrational degrees of freedom). Each of these integrals is computed as the integrand (including the value of the PES) divided by the value of a judiciously chosen weight function evaluated on demand at geometries distributed randomly but according to the weight function via the Metropolis algorithm. In this way, the method completely avoids cumbersome evaluation and storage of high-order force constants necessary in the original formulation of the vibrational perturbation theory; it furthermore allows even higher-order force constants essentially up to an infinite order to be taken into account in a scalable, memory-efficient algorithm. The diagrammatic contributions to the frequency-dependent self-energies that are stochastically evaluated at discrete frequencies can be reliably interpolated, allowing the self-consistent solutions to the Dyson equation to be obtained. This method, therefore, can compute directly and stochastically the transition frequencies of fundamentals and overtones as well as their relative intensities as pole strengths, without fixed-node errors that plague some QMC. It is shown that, for an identical PES, the new method reproduces the correct deterministic values of the energies and frequencies within a few cm^{−1} and pole strengths within a few thousandths. With the values of a PES evaluated on the fly at random geometries, the new method captures a noticeably greater proportion of anharmonic effects.

## I. INTRODUCTION

To accurately predict the vibrational spectra of molecules and solids, it is necessary to go beyond the harmonic approximation to their potential energy surfaces (PES). The vibrational Hamiltonian then becomes a high-rank many-body operator. Sophisticated many-body methods are needed with such a high-rank Hamiltonian, and they should also be able to treat both ground and excited states on an equal footing so that accurate transition frequencies and intensities are obtained as well as zero-point energies.

For small polyatomic molecules, it is now routine to solve the anharmonic vibrational Schrödinger equations nearly exactly with, for instance, a variational method using the discrete variable representation.^{1,2} For larger molecules and solids, systematic hierarchies of approximations to the vibrational wave functions similar to those in the field of electronic structure theory have been developed. The vibrational self-consistent field (VSCF) method^{3–5} is a vibrational analog of the Hartree–Fock method of electronic structure theory. Vibrational correlation methods, including vibrational Møller–Plesset perturbation (VMP) theory,^{6,7} vibrational configuration-interaction (VCI) theory,^{8} and vibrational coupled-cluster theory^{9} have been reported. These methods have the advantage of being able to employ virtually any mean-field theory as a reference, including VSCF (which is implied in VMP). With these methods, however, transition frequencies are not directly calculable, but need to be evaluated as small differences between total energies of the ground and excited states. The results thus obtained are not necessarily size consistent,^{10–12} though this is the most important requirement when the theory is applied to large molecules and solids.

Alternatively, many-body perturbation theory (MBPT)^{13,14} can be defined with a harmonic reference wave function,^{15–22} leading to vibrational perturbation theory (VPT) that achieves surprisingly high accuracy at the second order for weakly anharmonic molecules.^{21} MBPT is distinguished from Rayleigh–Schrödinger perturbation theory (RSPT), upon which VMP is based, in that its computed observables consist of linked diagrams only and are thus rigorously size consistent.^{10–12} Also, VPT can compute transition frequencies directly as well as zero-point energies. In fact, VPT is a molecular generalization of the crystal phonon diagrammatic formalism^{23,24} and improved self-consistent phonon method^{25,26} in solid-state physics.

We have recently introduced vibrational MBPT (Ref. 27) based on the Dyson equation using the reference wave function obtained either from one of the diagrammatically size-consistent variants of VSCF named XVSCF (Refs. 28–30) or the harmonic approximation. XVSCF is closely related to the self-consistent phonon (SCP) method^{31–33} in solid-state physics. We call the second-order vibrational MBPT with the XVSCF reference (i.e., using the Møller–Plesset partitioning of the Hamiltonian) XVMP2 and that with the harmonic reference XVH2.^{27} Like VPT, XVMP2 and XVH2 can compute zero-point energies and transition frequencies directly and in a size-consistent manner. Unlike VPT, however, XVMP2 and XVH2 can solve the Dyson equation self-consistently for multiple roots and their relative intensities, which correspond to the fundamental, overtones, and combinations. They resist divergence in the near presence of resonances even though the underlying theory is still perturbative. Closely related methods have been considered by Stuchebrukhov and co-workers.^{34–36}

All of the aforementioned methods have the common drawback of having to generate and store, at some point in a whole calculation, a high-dimensional array of the Hamiltonian matrix elements such as force constants. For instance, the evaluation and storage of all the *n*th-order force constants of a molecule with *m* vibrational degrees of freedom incur *O*(*m*^{n}) operation and memory costs. Hence, the costs increase exponentially with *n*, even though the overall effect in the frequencies is observed to decay with *n*. There is clearly a considerable waste in computing all force constants up to a finite order or in any method that scans a PES on a product grid. The algorithms for PES scan are not as scalable on a massively parallel supercomputer as stochastic algorithms can be.

A solution to this issue is offered by stochastic methods such as quantum Monte Carlo (QMC).^{37–40} QMC and particularly its variant diffusion Monte Carlo (DMC)^{41,42} have been applied to anharmonic molecular vibrational problems. Their computational kernels involve nearly independent (thus highly parallel-executable) random walk of nuclei on a PES. No precomputed and stored PES or force constants are necessary as the value of PES at any nuclear geometry is evaluated on the fly and geometries are importance-sampled.^{43} However, QMC has the well-known, persistent shortcoming of being limited to the ground vibrational states, which are nodeless. McCoy^{41,42} has used DMC to evaluate the wave functions of a few lowest-lying excited states with assumed nodal structures, but these results inevitably suffer from fixed-node errors. Furthermore, a transition frequency needs to be evaluated as a total-energy difference with one transition at a time, which is particularly troublesome in QMC because total energies have statistical errors.

Here, we “wed” vibrational second-order MBPT (XVMP2 and XVH2)^{27} and QMC to arrive at the methods that inherit the merits of both approaches and cancel the demerits of each other. We call the methods MC-XVMP2 and MC-XVH2. We show that second-order corrections to zero-point energies and transition frequencies can be re-expressed as sums of a few high-dimensional integrals of the vibrational wave functions and the PES. These integrals are, in turn, evaluated by the Monte Carlo (MC) method with nuclear geometries generated randomly by the Metropolis algorithm but according to a judiciously chosen weight function, to be specified in this article. In each MC step, only the values of the vibrational wave functions and the PES at a nuclear geometry are needed, and they are computed on the fly and never stored; again, the geometries are importance-sampled according to the weight function. These MC steps are even more independent than those in DMC and should be highly scalable. The new methods can compute not only the transition frequencies and zero-point energies directly (not as energy differences) but also frequency-dependent self-energies at a discrete set of frequencies, which can then be interpolated reliably and used in self-consistent solutions of the Dyson equation. Hence, MC-XVMP2 and MC-XVH2 inherit the same remarkable ability to determine multiple roots, which resist divergences near resonances, and their intensities. There is no sign problem or fixed-node error in MC-XVMP2 or MC-XVH2.

## II. XVMP2 AND XVH2

Here, we briefly review the formalisms of XVMP2 and XVH2.^{27} They are based on the inverse Dyson equation in the diagonal approximation, which solves

for ν for each mode, where ω_{i} is the *i*th-mode frequency of a reference mean-field theory and Σ_{i}(ν) is the *i*th diagonal element of the Dyson self-energy expanded in a diagrammatic perturbation series truncated at the second order.^{27,44} The reference wave function has to be a product of *effective* harmonic-oscillator functions along normal coordinates with effective harmonic frequencies {ω_{i}}. It is obtained either by XVSCF (Refs. 28–30) or the harmonic approximation. The former choice leads to XVMP2,^{27} whereas the latter defines XVH2.^{27}

Apart from being rigorously size-consistent (which is synonymous with size-extensive for total energies and with size-intensive for frequencies), these methods have unique advantages not seen in the parent second-order VMP (VMP2).^{6,7} The former can obtain anharmonic transition frequencies directly from Eq. (1), in contrast to the latter, which calculates them as differences in total energy between the ground and excited states for one excited state at a time. Also, XVMP2 and XVH2 can resist divergence in the presence of strong anharmonic mode-mode coupling (such as a Fermi resonance) and locate multiple physically meaningful roots from the equation of one mode (with relative intensities also calculable as pole strengths of the corresponding Green's function), owing to the recursive structure of the inverse Dyson equation.^{27}

When seeking roots far from any resonances, however, one can use an even more streamlined approach and obtain an accurate approximate solution to Eq. (1) by evaluating

or

The self-energy in these equations is a sum of connected, irreducible, nonredundant, open diagrams with two stubs^{27} and written as

each of which is defined diagrammatically in Figs. 1–3, where the integer in the parenthesized superscript denotes the perturbation order. In XVMP2, the first-order term

^{28–30}However, for reasons made clear below, we retain this term in this general expression.

These diagrams are translated straightforwardly to algebraic expressions according to the interpretation rules documented in our previous paper.^{27} The first-order self-energy diagrams in Fig. 1 are interpreted as

where *F*_{iijj} is the quartic force constant with respect to the *i*th and *j*th normal coordinates and *F*_{iijjkk} and *F*_{iijjkkll} the sextic and octic force constants, respectively. Likewise, the second-order diagrams in Figs. 2 and 3 are given algebraically as

and

where the prime on the summation symbol in Eq. (6) indicates that *j* = *k* needs to be excluded from the summation in XVMP2.^{27}

The corresponding zero-point energy expressions are given by a unified formula,

where

*n*) and in the XVSCF(

*n*) approximation in the case of XVMP2(

*n*).

^{27}The first- and second-order corrections to this zeroth-order energy are written as

in notation similar to the one used in Ref. 27. The right-hand sides are given algebraically by

where, again, ω's are the harmonic frequencies in XVH2(*n*) and the XVSCF(*n*) frequencies in XVMP2(*n*) and, in the latter case, *i* = *j* is excluded in the summation of Eq. (14) (as designated by the prime on the summation symbol).^{27,45} These expressions are obtained by applying the interpretation rules stipulated in Ref. 27 to the corresponding diagrams in Fig. 4. Although the first-order correction is zero in XVMP2(*n*),^{27} we retain the corresponding term in Eq. (8) for the reason given below.

In practice, the infinite sums in Eqs. (5)–(7), (9), and (10) are truncated after terms containing the *n*th-order force constants. With this truncation, these equations define the XVMP2(*n*) and XVH2(*n*) methods,^{27} where *n* is the truncation order of the Taylor-series expansion of the PES. XVMP2(*n*) is based on a reference wave function of XVSCF(*n*),^{28} which, in turn, uses even-order force constants of the types appearing in Eq. (5) up to the *n*th order.

We have also introduced XVMP2[*n*],^{27} employing the XVSCF[*n*] reference wave function,^{29} in which odd-order force constants up to the *n*th order of certain types are additionally taken into account to define a new center of the normal coordinates known as the first-order Dyson geometry.^{30} In XVMP2[*n*], Eq. (6) for

*n*] reference. In all cases, the self-energy is formally expressed by Eqs. (4)–(7), and the zero-point energy is formally expressed by Eqs. (8)–(16), with differences in the definition of ω and the coordinate center.

^{27,30}The derivation of MC-XVMP2(

*n*) presented in Sec. III applies equally to MC-XVMP2[

*n*] by simply replacing “(

*n*)” by “[

*n*]” everywhere it appears, and the coordinates in the latter are understood to be centered at the first-order Dyson geometry.

^{29,30}Hereafter, we will not separately discuss MC-XVMP2[

*n*].

A typical algorithm of the XVH2 and XVMP2 methods is as follows: (i) We first determine the equilibrium geometry and harmonic force constants of a molecule using an electronic structure method; (ii) we then perform a normal-mode analysis to obtain the normal coordinates {*Q*_{i}} and the harmonic frequencies

*n*th order; (iv) for XVMP2(

*n*) or XVMP2[

*n*], we carry out the XVSCF(

*n*) or XVSCF[

*n*] calculation to obtain the effective harmonic frequencies {ω

_{i}} and/or adjust the geometry. For XVH2(

*n*), this step is unnecessary as

Of these, step (iii) is by far the most expensive in terms of both operation and memory costs and is also arduous to program. To obtain all *n*th-order force constants, one must differentiate a potential energy function with respect to normal coordinates at an *O*(*m*^{n}) computational cost, where *m* is the vibrational degrees of freedom. One can alternatively compute the force constants in the Cartesian coordinates (e.g., in order to use analytical derivative capabilities^{46} of *ab initio* molecular orbital theory or in order to exploit the locality of such force constants), but their transformation to the normal coordinates is an *O*(*m*^{n + 1}) process. These become impractical for *n* > 4 not only because of the exponential scaling of cost with *n* but also owing to the increase in numerical errors caused by the repeated applications of finite-difference methods.^{47} It is also not particularly scalable with respect to the number of processors in a modern parallel supercomputer. It is imperative that this high-order force-constant evaluation step be eliminated altogether so as to fully realize the intrinsic efficiency of XVMP2 (XVH2), especially for large molecules and solids.

## III. MC-XVMP2 AND MC-XVH2

In the stochastic implementation of these methods, we consider two scenarios (*direct* and *indirect*) for evaluating a PES and two ways (*recursive* and *nonrecursive*) of performing frequency calculations.

In a direct calculation, the value of a PES at a given geometry is calculated on demand by an electronic structure calculation, discarded after its use, and never stored. Furthermore, the geometries are generated randomly but in accordance with their probability of occurrence during vibrational motion. This is achieved by the Metropolis algorithm with a suitable weight function (see below). Hence, the overall computational cost is dominated by that of electronic structure calculations and how the rest of the integrand is evaluated is rather unimportant for the computational cost. The stochastic sampling of the PES also allows much higher-order force constants than those considered in the reference method to be implicitly included in the calculation. We call the stochastic XVMP2 methods using this algorithm MC-XVMP2(∞) and MC-XVH2(∞), emphasizing the fact that they can, in principle, include important high-order force constants up to an infinite order. This is a great advantage of these methods unmatched by any force-constant-based methods, but it also causes mismatch in the PES between the reference and perturbation methods. This is why the first-order energy and self-energy are no longer zero in MC-XVMP2(∞). This issue is discussed more fully in Appendix A.

In an indirect calculation, a PES is precalculated and stored in the form of a mathematical function or a set of force constants, which allows the value of the PES to be rapidly evaluated at any geometry during MC integrations. The overall cost is dominated by the precalculation of the PES, which is unchanged by the stochastic algorithm introduced here. For instance, if the PES is given as a quartic force field (QFF), XVMP2(4) and XVH2(4) already constitute the most efficient, deterministic algorithms that take into account the whole QFF in the second-order perturbation theory; the stochastic counterparts, which we call MC-XVMP2(4) and MC-XVH2(4), do not offer any advantage in this case. In other words, the algorithm proposed here is meaningful only in the direct calculation. Nonetheless, we report the results of indirect calculations for the purpose of code verification and assessment of the methods' performance.

In a recursive calculation, we solve the recursive inverse Dyson equation [Eq. (1)] for self-consistent solutions by an iterative algorithm such as the one proposed by us,^{27} which brings about various advantages over the conventional VMP2 method mentioned in the Introduction. In a nonrecursive calculation, we use the nonrecursive approximation of Eq. (3) to significantly reduce the computational cost and algorithm complexity at the expense of losing some of the aforementioned advantages. The total (zero-point) energy calculations are always nonrecursive.

### A. First-order correction to zero-point energy

We seek a mathematical transformation that eliminates any explicit reference to force constants and thereby brings these expressions into forms that lend themselves to MC integrations. Adopting the strategy of Ref. 48, we can immediately rewrite the first-order correction to the zero-point energy, Eq. (9), in such a form:

where ** Q** collectively refers to all normal modes and hence

*dQ*

_{1}

*dQ*

_{2}…

*dQ*

_{m}. That this is equal to Eq. (9) can be understood by recognizing it as the first-order correction in the RSPT, which is identical to the corresponding expression in the diagrammatic many-body perturbation theory.

Here,

where

*n*th-order force constants and

*n*) or in the XVSCF(

*n*).

^{27}We distinguish the highest order (

*n*) of force constants in the Hamiltonian, which is generally undefined and can be essentially infinite in a direct calculation, from the finite order (

In MC-XVH2(*n*), the fluctuation potential is written as

where

*n*th-order force constants and

*V*

_{ref}is its value at the equilibrium geometry (the minimum).

In MC-XVMP2(*n*), the fluctuation potential is

where

^{30}and ω

_{i}is the

*i*th-mode frequency of XVSCF(

The real zero-point wave function, Φ_{0}, is that of the reference method and is written as

where

*n*

_{i}along the

*i*th normal mode,

*Q*

_{i}, obtained either by the harmonic approximation in the case of MC-XVH2(

*n*) or by XVSCF(

*n*).

In a direct calculation, the cost of the calculation is determined by that of evaluating

*O*(

*m*

^{1}

*n*

^{0}), where

*m*is the vibrational degrees of freedom and

*n*is the rank of

_{0}(

*Q*

_{i}) for each

*i*= 1, …,

*m*. This concerns the cost of evaluating the integrand, not the integral, of Eq. (17).

### B. Second-order correction to zero-point energy

We also rewrite the second-order correction to the zero-point energy into a single high-dimensional integral. To do this, we consider the sum of all second-order corrections instead of the individual contributions and write it as

where Φ_{N} is the *N*th excited-state wave function of the reference method and

where

The summation over *N* must formally go over all states that are reached by

_{0}. The number of such states, denoted by

*N*

_{max}, is proportional to the number of force constants included in

*O*(

*m*

^{n}).

We can, however, reduce this *O*(*m*^{n}) scaling by invoking the same mathematical trick used in Ref. 48. First, we note that, in the reference method,

where *n*_{i} is the quantum number of the *i*th mode in the *N*th state. Second, we use the Laplace transform of

Third, we find

where

and

The cost of evaluating Eq. (28) is now reduced from *O*(*m*^{n}) to

*n*

_{max}and some states with up to rank

*mn*

_{max}; we consider

*n*

_{max}≈

*n*for the cost analysis.

The last term in Eq. (29) corresponds to the exclusion of the *N* = 0 term from the summation in Eq. (22), in which it causes divergence. Diagrammatically, this term represents the sum of all disconnected diagrams; the subtraction of this term, therefore, amounts to the act of eliminating disconnected diagrams. Although the foregoing transformations may appear to be a regression from diagrammatic MBPT (Refs. 13,44) to RSPT, the diagrammatic logic to maintain size consistency of the theory is carefully maintained.

### C. First-order correction to frequencies

The first-order self-energy expression of Eq. (5) is converted into a single *m*-dimensional integral,

where

*i*, which satisfy

Therefore, the programmable expression of Eq. (31) is

which is subject to MC integration with an *O*(*m*^{1}*n*^{0}) cost to evaluate the integrand excluding

Equation (34) is equal to the infinite sum of Eq. (5), which can be proven algebraically by substituting the Taylor series of

### D. Second-order correction to frequencies

The second-order corrections to self-energy consist of the four terms denoted by

The contributions from the “pendant” diagrams (Fig. 2) are written as single 2*m*-dimensional integrals as follows:

Insertion of

The first of these equations is transformed to

with

and

where

The last term of Eq. (40) is diagrammatically interpreted as the subtraction of disconnected diagrams. Equation (38) costs *O*(*N*_{max}) = *O*(*m*^{n}) operations to evaluate, whereas Eq. (39) can be calculated at an

*n*

_{max}≈

*n*. The expression for

Next, we consider the “bubble” diagrams (Fig. 3). They are written in integral forms as

Insertion of

In the summations, *N* = 0 is excluded because it corresponds to disconnected diagrams (see above). Also excluded is *N* = *i*, where the *i*th state is the one singly excited in the *i*th mode, i.e., *n*_{i} = 1 and *n*_{j} = 0 (∀*j* ≠ *i*). This exclusion amounts to the deletion of reducible diagrams, which can be understood as follows: Since Φ_{i} is a singly excited state, there is only one line between the two vertexes and its mode index is *i*. The self-energy diagrams with this motif are, e.g., the second diagrams from the left in Fig. 3 with *i* = *j*, which are reducible.^{27}

These equations are further simplified to

with

Unlike Eq. (24) or (38), this does not readily lend itself to a Laplace transform because the denominator,

*N*; their ν-dependence prohibits the application of a Laplace transform. In a recursive calculation, where such dependence is essential, we evaluate

*N*

_{max}truncated at the states with excitation rank

In a nonrecursive calculation, on the other hand, we only need

and

where

## IV. STOCHASTIC ALGORITHMS

### A. Monte Carlo integrations

Any quadrature algorithm for evaluating the *m*- or 2*m*-dimensional integrals appearing in Sec. III would have an operation cost which grows exponentially with *m* (the vibrational degrees of freedom), which is prohibitive for all but the smallest molecules. On the other hand, experiences with QMC have been that the operation cost required to reach a given accuracy increases only as *O*(*m*^{3}).^{37} The trade-off is the slow *N*^{−1/2} fall-off of the statistical error, where *N* is the number of MC steps of sampling points (not to be confused with the state label in Sec. III). Here, we use the MC method.

Each of the integrals *I* appearing in Sec. III is approximated by

with

when the integral is Eqs. (23) and (37), or (44), where *f* refers to the integrand in one of these expressions, *w* is the weight function to be specified below, and

*i*th sampling point in the

*m*-dimensional space of normal coordinates generated stochastically by the Metropolis algorithm

^{49}according to the weight function.

More specifically, we use different formulas to evaluate the integrands depending on whether the solutions of the recursive or nonrecursive inverse Dyson equation are sought. In a recursive calculation, the integrands of

^{48,50}using a change of variable to

The variance

_{N}are estimated by

This estimate is, however, accurate only when *I*_{i} is uncorrelated from one MC step to another and this is not necessarily the case here because a new sampling point is generated from an old one by a random-walk step with a semi-fixed radius (see below).

To more accurately estimate the statistical uncertainty, we use the blocking algorithm of Flyvbjerg and Petersen.^{51} We thus modify the definition of σ_{N} as

where *N*′ = *N*/*N*_{B}, *N*_{B} is the block length, and

In this way, we evaluate the statistical uncertainty using every *N*_{B} integrals, and these data become uncorrelated for sufficiently large *N*_{B} (we use *N*_{B} = 75).

It has also been found that the use of Abelian point-group symmetry helps reduce the statistical uncertainty. This is discussed in Appendix B.

### B. Weight functions

The feasibility of MC integration depends critically on the quality of the weight function. An effective weight function

^{48}

One candidate weight function is the zero-point wave function in the harmonic approximation or its probability density, which has been used implicitly in stochastic SCP and VCI methods^{52–55} (see below). The weight function used in this study is also the probability density of the zero-point wave function of the reference method, i.e., the harmonic approximation for MC-XVH2(*n*) and the XVSCF(

*n*):

where α is a scaling parameter. This weight function is positive everywhere, integrates analytically to unity, and generally behaves like the integrands, when the PES is not too anharmonic. The weight function for the 2*m*-dimensional integral is simply the product

The operation cost in a direct calculation is determined by the cost of electronic structure calculations and the memory cost is always negligible. The overall efficiency of MC-XVMP2 is, therefore, equal to the efficiency of sampling the PES, which, in turn, depends on how well the forms of the PES and weight function match with each other.

### C. Metropolis algorithm

In each MC step, a new geometry,

*m*-dimensional space of the normal coordinates is generated as a random-walk step from the previous geometry,

*j*th normal coordinate, where β is a parameter. The new geometry is accepted if the ratio

^{5}steps (the initial equilibration stage), no PES evaluation is performed, after which the MC integration with PES evaluations starts. The same collection of walkers, guided by the same weight function, is used to evaluate all

*m*- and 2

*m*-dimensional integrals.

In this study, we have also extended the redundant-walker algorithm of Willow *et al.*^{56} to MC-XVMP2(*n*) and MC-XVH2(*n*). Though this algorithm is essential for the efficiency of these methods, its underlying idea is the same as in MC-MP2 for electron correlation and its details are relegated to Appendix C. Briefly, it involves propagating a set of *N*_{w} walkers simultaneously and exploiting a twofold summation over walkers to accelerate the evaluation of the 2*m*-dimensional integrals. In the calculations reported below, “one MC step” is defined as one random-walk step of all *N*_{w} walkers. Since the PES is evaluated roughly half as many times as a single walker is propagated, a calculation reported to have *N* steps evaluates the PES roughly *N*_{w}*N*/2 times.

A highly scalable parallel algorithm of these methods can be readily accomplished by simply executing a Metropolis process in each processor of a parallel machine. There are nearly no inter-processor communications needed, though our implementation issues MPI commands to synchronize and collect

*N*

_{B}steps. The only step that fundamentally resists parallelization is the initial equilibration stage as it has to occur as many times as the number of processors instead of just once in a serial execution. Nevertheless, this MC algorithm naturally exposes an exceptionally high degree of parallelism in its kernel.

## V. OTHER MONTE CARLO METHODS

MC-XVMP2(*n*), MC-XVMP2[*n*], and MC-XVH2(*n*) (collectively MC-XVMP2 in this section) introduced in this work have apparent similarities with DMC for vibrations, but they differ greatly in the theoretical underpinning and the quantities that are available from the respective calculations.

The mathematical basis of DMC is the isomorphism between the Schrödinger equation and the diffusion equation for the lowest, thus nodeless, vibrational wave function.^{57} However, since one is usually interested in transition frequencies in a vibrational study, excited states need to be determined by DMC. In such calculations, an approximate nodal surface has to be assumed,^{41,42} leading to the so-called fixed-node errors. Procedures exist for optimizing the nodal surface with a sufficiently simple functional form,^{58} but finding the latter entails defining an appropriate set of nodal coordinates,^{59} which is generally difficult. Although it has been shown^{60} that multiple *rotational* states can be determined simultaneously by DMC, anharmonic vibrational states must usually be calculated separately and transition frequencies are then obtained as differences between large, noisy total energies. Such a procedure is not only cumbersome, but also easily plagued by large relative errors.

MC-XVMP2, in contrast, computes energy differences directly and simultaneously, which are anharmonic corrections to zero-point energies and to transition frequencies for all modes in a single MC run; they are *not* small differences of two total energies obtained in two separate MC runs. MC-XVMP2 does not require any user-specified constraints on wave functions, while potentially achieving an even greater degree of parallelism in their computational kernel than DMC (see below). On the other hand, one of the advantages of DMC over MC-XVMP2 is that it does not presume small anharmonicity and can perform for any type of bound PES's; MC-XVMP2 is effective when the anharmonicity is small enough to be considered as a perturbation.

A variant of DMC that can compute excited states and hence transition frequencies without an assumed nodal surface is the projection operator imaginary time spectral evolution (POITSE).^{61,62} POITSE computes the autocorrelation functions of various projection operators during an ordinary DMC calculation of the ground state. An inverse Laplace transform is then performed numerically on these noisy functions, yielding a quantity analogous to the vibrational Green's function, from which the locations of its poles and thus anharmonic frequencies can be obtained. While mathematically elegant and exact, it has to perform the inverse Laplace transform numerically, which is known to be computationally ill-conditioned,^{61,62} making the method's practical utility limited. MC-XVMP2, on the other hand, directly compute the self-energies (which are a factor of the exact Green's function) in the frequency domain^{63,64} and the Laplace transform (when applicable) is carried out by a quadrature with no noticeable systematic or statistical error.

There is also a difference in the parallel algorithm between DMC (including POITSE) and MC-XVMP2. In DMC, the number of walkers varies during the calculation and the algorithm must make sure to prevent many walkers from “dying” at once; otherwise many processors suddenly go idle. The most common technique for this prevention requires the current best estimate of the energy to be distributed to all processors, which involves frequent all-to-all inter-processor communications and thus constitutes a bottleneck for parallel scalability.^{65} In MC-XVMP2, this bottleneck does not exist because the number of (redundant) walkers is constant and no inter-processor communication is strictly required. MC integrations on different processors can be performed, in principle, completely independently and asynchronously; they can be paused and restarted at any time after the equilibration stage and some can even be allowed to halt without a catastrophic failure of the whole calculation.

The correlation function quantum Monte Carlo (CFQMC) method^{66,67} is another MC method for excited states that can compute vibrational frequencies directly. It propagates a set of basis functions in imaginary time, yielding a new set with progressively more accurate eigenvalues for some lowest-lying states. It may be viewed as a MC implementation of a subspace diagonalization of the Hamiltonian matrix. The main drawback of this method seems to be the exponential growth of the statistical errors in excited-state results with imaginary time.^{66} Even for semi-rigid molecules like formaldehyde, therefore, the use of initial basis functions that are already close to the exact excited-state wave functions is necessary.^{67}

The most closely related to our proposed methods are the stochastic algorithms of SCP (Refs. 52,53,55) and of VCI with a SCP reference wave function^{54} reported recently, in which high-dimensional integrals appearing in their formalisms are evaluated by MC methods. In our laboratory, the development of a stochastic algorithm of XVSCF,^{28–30} which is related to SCP, is underway, which, when combined with MC-XVMP2, makes the whole anharmonic vibrational analysis proceed without evaluating any anharmonic force constant. Our methods have been inspired by the recent studies that aim at combining *ab initio* electron-correlation methods with QMC conducted in our laboratory^{48,56,68–70} and elsewhere.^{71–79}

## VI. RESULTS AND DISCUSSION

### A. Computational details

Here, we report the results of 12 different types of stochastic calculations for anharmonic zero-point energy and transition frequencies of the water molecule: MC-XVH2(*n*), MC-XVMP2(*n*), and MC-XVMP2[*n*] using either the direct or indirect evaluation of the PES, seeking solutions of either the recursive or nonrecursive inverse Dyson equation. MC-XVH2(*n*), MC-XVMP2(*n*), and MC-XVMP2[*n*] are, respectively, based on the harmonic, XVSCF(4), and XVSCF[4] reference wave functions.^{28–30} These reference wave functions are obtained by their deterministic algorithms using a QFF calculated at the MP2/aug-cc-pVDZ electronic structure theory.

In the indirect calculations, the same QFF precalculated and stored is used throughout. The corresponding methods are designated as MC-XVH2(4), MC-XVMP2(4), and MC-XVMP2[4] and are expected to reproduce the numerical results of their deterministic counterparts of Ref. 27, i.e., XVH2(4), XVMP2(4), and XVMP2[4]. The direct calculations, on the other hand, evaluate the PES at random geometries on demand. They are denoted by MC-XVH2(∞), MC-XVMP2(∞), and MC-XVMP2[∞], reflecting the fact that the highest order of the force constants included implicitly is undefined, though the reference wave function methods still use the QFF. In the latter, the first-order corrections are nonzero and need to be evaluated, as discussed in Appendix A.

In the nonrecursive calculations, the right-hand side of Eq. (3) is evaluated just once for each mode, giving the frequencies of the fundamental transitions and no pole strengths, but using the Laplace-transformed expressions of integrands such as Eq. (46). We set *n*_{max} = 8, meaning that the effect of 24th-order force constants can in principle be included when the PES evaluation is direct. The zero-point energy calculations can always invoke the Laplace-transformed expressions (if applicable).

In the recursive calculations, Eq. (1) is solved using the more elaborate algorithm described in detail in Appendix D for frequencies of the fundamentals and overtones as well as their associated pole strengths. In this case, the sum-over-states expression must be used for

We use the redundant-walker algorithm described in Appendix C with the number of walkers (*N*_{w}) equal to 20. The block length (*N*_{B}) of the Flyvbjerg–Petersen method^{51} is chosen to be 75. These calculations are parallel executed with up to 9 processors using NWCHEM (Ref. 80) as the back-end electronic structure software for the on-the-fly PES evaluations.

The purpose of these calculations is primarily to establish the feasibility and accuracy of these new stochastic methods and their algorithmic behavior. The overall speed of these calculations excluding the PES evaluation step is many orders of magnitude slower than the deterministic counterpart for small molecules such as water. As mentioned above, the total performance of these methods relies more heavily on the efficiency of the Metropolis sampling of the PES, which is, in turn, determined largely by the characteristics of the PES itself and the weight function. This requires a separate systematic study covering a broad range of systems.

### B. Weight functions

First, we assess the performance of the weight functions,

*N*) up to

*N*= 5 × 10

^{4}(corresponding to ∼5 × 10

^{5}PES evaluations) in Figs. 5 and 6.

The zero-point energies of MC-XVH2(4) converge towards the correct limit obtained by the deterministic XVH2(4) calculation for all values of α studied; there seems no systematic bias in the limit, with the weight function merely affecting the rate of convergence. The statistical uncertainty does depend strongly on α. With α = 0.7 or 0.9, the plots in Fig. 6 display characteristic spikes on otherwise *O*(*N*^{−1/2}) decay, including a large spike for α = 0.7 near *N* = 10 000. These spikes occur because Eq. (50) or (51) is experiencing “division by near zero” at these geometries, which, in turn, means that the weight function is decaying too fast as compared with the integrand, causing severe under-sampling. This is why they are more prevalent with greater values of α.

The statistical uncertainties with α = 0.1 and 0.5 exhibit smooth and nearly overlapping *O*(*N*^{−1/2}) decay, but they are roughly twice as large as the statistical uncertainties with α = 0.3. This means that the weight functions with α = 0.1 and 0.5 are slightly over-sampling and under-sampling, respectively. The weight function with α = 0.3 seems to strike a near-optimal balance.

### C. Nonrecursive inverse Dyson equation

Method . | $E_{0}^{(1)}$ $E0(1)$
. | $E_{0}^{(2)}$ $E0(2)$
. | ν_{1}
. | ν_{2}
. | ν_{3}
. |
---|---|---|---|---|---|

Expt.^{a} | … | … | 3657.1 | 1594.7 | 3755.9 |

Harmonic | ±0 | ±0 | 3822.1 | 1628.3 | 3947.9 |

XVH2(4)^{b} | +51.6 | −120.6 | 3645.1 | 1566.9 | 3767.4 |

MC-XVH2(4)^{b} | +51.3 ± 1.1 | −119.1 ± 0.7 | 3646.9 ± 4.5 | 1566.3 ± 2.1 | 3768.5 ± 3.2 |

MC-XVH2(∞)^{c} | +84.9 ± 1.5 | −190.0 ± 1.0 | 3570.8 ± 6.1 | 1577.6 ± 3.2 | 3694.5 ± 4.3 |

XVMP2(4)^{b} | +0.0 | −100.1 | 3679.3 | 1561.9 | 3802.0 |

MC-XVMP2(4)^{b} | −0.5 ± 1.1 | −100.3 ± 0.5 | 3680.0 ± 3.8 | 1561.2 ± 1.8 | 3801.6 ± 2.6 |

MC-XVMP2(∞)^{c} | +29.7 ± 1.4 | −159.1 ± 0.7 | 3613.3 ± 5.1 | 1568.4 ± 2.5 | 3742.9 ± 3.4 |

XVMP2[4]^{b} | +0.0 | −22.7 | 3686.2 | 1560.1 | 3807.6 |

MC-XVMP2[4]^{b} | +0.1 ± 0.7 | −22.7 ± 0.2 | 3687.8 ± 1.8 | 1559.6 ± 1.1 | 3806.1 ± 1.8 |

MC-XVMP2[∞]^{c} | −16.4 ± 0.6 | −35.1 ± 0.3 | 3632.0 ± 2.4 | 1565.9 ± 1.3 | 3750.9 ± 2.3 |

Method . | $E_{0}^{(1)}$ $E0(1)$
. | $E_{0}^{(2)}$ $E0(2)$
. | ν_{1}
. | ν_{2}
. | ν_{3}
. |
---|---|---|---|---|---|

Expt.^{a} | … | … | 3657.1 | 1594.7 | 3755.9 |

Harmonic | ±0 | ±0 | 3822.1 | 1628.3 | 3947.9 |

XVH2(4)^{b} | +51.6 | −120.6 | 3645.1 | 1566.9 | 3767.4 |

MC-XVH2(4)^{b} | +51.3 ± 1.1 | −119.1 ± 0.7 | 3646.9 ± 4.5 | 1566.3 ± 2.1 | 3768.5 ± 3.2 |

MC-XVH2(∞)^{c} | +84.9 ± 1.5 | −190.0 ± 1.0 | 3570.8 ± 6.1 | 1577.6 ± 3.2 | 3694.5 ± 4.3 |

XVMP2(4)^{b} | +0.0 | −100.1 | 3679.3 | 1561.9 | 3802.0 |

MC-XVMP2(4)^{b} | −0.5 ± 1.1 | −100.3 ± 0.5 | 3680.0 ± 3.8 | 1561.2 ± 1.8 | 3801.6 ± 2.6 |

MC-XVMP2(∞)^{c} | +29.7 ± 1.4 | −159.1 ± 0.7 | 3613.3 ± 5.1 | 1568.4 ± 2.5 | 3742.9 ± 3.4 |

XVMP2[4]^{b} | +0.0 | −22.7 | 3686.2 | 1560.1 | 3807.6 |

MC-XVMP2[4]^{b} | +0.1 ± 0.7 | −22.7 ± 0.2 | 3687.8 ± 1.8 | 1559.6 ± 1.1 | 3806.1 ± 1.8 |

MC-XVMP2[∞]^{c} | −16.4 ± 0.6 | −35.1 ± 0.3 | 3632.0 ± 2.4 | 1565.9 ± 1.3 | 3750.9 ± 2.3 |

The results of the indirect MC-XVH2(4), MC-XVMP2(4), and MC-XVMP2[4] calculations that use the precalculated QFF reproduce the corresponding results of the deterministic XVH2(4), XVMP2(4), and XVMP2[4] calculations at *N* = 67500 (∼7 × 10^{5} PES evaluations). In all cases, the stochastic results are within a few cm^{−1} of the correct limits and these errors are also accurately estimated as the statistical uncertainties. The first-order corrections to the zero-point energy in MC-XVMP2(4) and MC-XVMP2[4] are zero to within the statistical uncertainties, as they should be.

MC-XVH2(∞), MC-XVMP2(∞), and MC-XVMP2[∞] perform on-the-fly PES evaluations in each MC step. They do not reproduce the deterministic results as they include the effect of higher-than-quartic force constants implicitly. The first-order corrections to the zero-point energy in MC-XVMP2(∞) and MC-XVMP2[∞] are no longer zero for the same reason. According to these calculations, the fifth- and higher-order force constants roughly account for one third of the anharmonic corrections (at both first and second orders) to the zero-point energy. Such information is exceedingly difficult to obtain by force-constant-based methods.

The statistical uncertainties of the anharmonic frequencies are greater than those of the zero-point corrections because the integrands for the frequencies involve harmonic oscillators with higher quantum numbers and are more rapidly oscillating. The uncertainties of the stretching frequencies in MC-XVH2(4) and MC-XVMP2(4) also appear to be slightly overestimated, possibly due to approximations we make in evaluating these uncertainties, which are discussed in Appendix D. However, like the zero-point energy, the statistical uncertainties of the frequencies decrease as *O*(*N*^{−1/2}). To reduce the uncertainties to within 1 cm^{−1} would require approximately 2 × 10^{6} MC steps or 2 × 10^{7} PES evaluations. Overall, the rate of convergence seems considerably faster than that in the similar method for electrons,^{48} which may be ascribed to the absence of singularities in the integrands in vibrational problems.

Each of the direct calculations takes ∼10^{6} CPU seconds spread across 9 processors, of which ∼80% is consumed by 7 × 10^{5} electronic structure calculations at the MP2/aug-cc-pVDZ level. The overall efficiency of these calculations is greater when the proportion of the time spent by the electronic structure calculations is higher. The efficiency is also greater when the rate of convergence is faster, as judged by the decay rate of the statistical uncertainties in Fig. 6. It is not meaningful to compare the speed of MC-XVMP2(∞) and XVMP2(*n*) as the former captures a greater degree of anharmonic effects.

### D. Recursive inverse Dyson equation

Figures 7 and 8 show how well MC-XVH2(4) can reproduce the self-energy for mode one (symmetric stretch) of the water molecule as a function of frequency ν. The gray curves in Fig. 7 are the XVH2(4) results and the green dots in Figs. 7 and 8 plot the self-energies and statistical uncertainties of MC-XVH2(4), respectively, at a discrete set of frequencies. The green curves are analytic functions obtained by the linear-least-squares fitting procedure described in Appendix D. It can be seen that this procedure works exceedingly well with the interpolated self-energy being indistinguishable from the XVH2(4) results.

The self-energy is divergent at 1628, 3257, and 4885 cm^{−1} in the frequency domain of Fig. 7, corresponding to the bending fundamental (ω_{2}) and first and second bending overtones (2ω_{2} and 3ω_{2}) in the harmonic approximation (which is the reference method for XVH2). Owing to these poles, there are four roots to the recursive inverse Dyson equation of XVH2(4) in this frequency domain, only two of which have non-negligible pole strengths and thus are physically meaningful. They are the symmetric stretch fundamental (ν_{1}) at 3633 cm^{−1} and the first bending overtone (2ν_{2}) at 3238 cm^{−1} with pole strengths of 0.983 and 0.051, respectively. In Fig. 8, the statistical uncertainty is nearly constant at 4 cm^{−1} except at these poles, where it shoots up.

Figure 9 illustrates these two roots of the inverse Dyson equation graphically in the salient frequency domain. They occur at the intersections (the red circles) of two curves: the left- and right-hand sides of the square root of the inverse Dyson equation [Eq. (1)] drawn as the blue line and green curves, respectively. Remarkably, MC-XVH2(4) can reproduce the positions of these roots within 2 cm^{−1}. The statistical uncertainties of these frequencies are the same as those of the self-energies and they are 4.1 cm^{−1} in ν_{1} and 8.7 cm^{−1} in 2ν_{2}. The latter is much greater than the others because the 2ν_{2} root is close to the pole at 2ω_{2}. However, it is likely overestimated, considering the insensitivity of the root location of 2ν_{2} with the vertical placement of the self-energy curve there.

The results of the direct and indirect MC-XVH2 and MC-XVMP2 calculations for the solutions of the recursive inverse Dyson equation are compiled in Table II. The indirect calculations using the same QFF as the deterministic algorithms should reproduce the results (both frequencies and pole strengths) of the latter and they do so within no more than twice the corresponding statistical uncertainties, although the statistical uncertainties of some anharmonic frequencies may be slightly overestimated for reasons mentioned above. We have thus shown that it is entirely possible to solve the recursive Dyson equation for multiple roots and pole strengths using a stochastic algorithm.

Method . | ν_{1}
. | ν_{2}
. | ν_{3}
. | 2ν_{2} ≡ ν_{4}
. | P_{1}
. | P_{2}
. | P_{3}
. | P_{4}
. |
---|---|---|---|---|---|---|---|---|

Expt.^{a} | 3657.1 | 1594.7 | 3755.9 | 3151.6 | … | … | … | … |

Harmonic | 3822.1 | 1628.3 | 3947.9 | 3256.6 | … | … | … | … |

XVH2(4)^{b} | 3633.4 | 1565.6 | 3730.8 | 3238.3 | 0.983 | 1.038 | 1.031 | 0.051 |

MC-XVH2(4)^{b} | 3634.6 ± 4.1 | 1565.3 ± 1.7 | 3731.9 ± 2.7 | 3238.9 ± 8.7 | 0.985 ± 0.001 | 1.038 ± 0.001 | 1.031 ± 0.001 | 0.049 ± 0.001 |

MC-XVH2(∞)^{c} | 3569.3 ± 5.4 | 1577.6 ± 2.4 | 3649.7 ± 3.6 | 3224.1 ± 8.3 | 0.949 ± 0.002 | 1.029 ± 0.002 | 1.050 ± 0.001 | 0.104 ± 0.002 |

XVMP2(4)^{b} | 3668.1 | 1560.4 | 3770.7 | 3064.0 | 1.039 | 0.984 | 1.059 | 0.026 |

MC-XVMP2(4)^{b} | 3668.6 ± 3.5 | 1559.1 ± 1.4 | 3769.2 ± 2.3 | 3064.0 ± 9.4 | 1.039 ± 0.001 | 0.984 ± 0.001 | 1.060 ± 0.001 | 0.026 ± 0.000 |

MC-XVMP2(∞)^{c} | 3606.9 ± 4.6 | 1566.2 ± 2.0 | 3705.7 ± 2.9 | 3056.4 ± 9.6 | 1.037 ± 0.001 | 0.979 ± 0.001 | 1.074 ± 0.001 | 0.045 ± 0.001 |

XVMP2[4]^{b} | 3667.6 | 1559.1 | 3774.5 | 3127.8 | 0.997 | 1.004 | 1.011 | 0.018 |

MC-XVMP2[4]^{b} | 3668.5 ± 1.6 | 1557.5 ± 1.0 | 3772.5 ± 1.7 | 3127.8 ± 8.2 | 0.996 ± 0.001 | 1.005 ± 0.001 | 1.012 ± 0.000 | 0.018 ± 0.000 |

MC-XVMP2[∞]^{c} | 3616.7 ± 2.1 | 1563.1 ± 1.1 | 3712.4 ± 2.0 | 3121.5 ± 5.5 | 0.994 ± 0.001 | 1.001 ± 0.001 | 1.024 ± 0.001 | 0.034 ± 0.000 |

Method . | ν_{1}
. | ν_{2}
. | ν_{3}
. | 2ν_{2} ≡ ν_{4}
. | P_{1}
. | P_{2}
. | P_{3}
. | P_{4}
. |
---|---|---|---|---|---|---|---|---|

Expt.^{a} | 3657.1 | 1594.7 | 3755.9 | 3151.6 | … | … | … | … |

Harmonic | 3822.1 | 1628.3 | 3947.9 | 3256.6 | … | … | … | … |

XVH2(4)^{b} | 3633.4 | 1565.6 | 3730.8 | 3238.3 | 0.983 | 1.038 | 1.031 | 0.051 |

MC-XVH2(4)^{b} | 3634.6 ± 4.1 | 1565.3 ± 1.7 | 3731.9 ± 2.7 | 3238.9 ± 8.7 | 0.985 ± 0.001 | 1.038 ± 0.001 | 1.031 ± 0.001 | 0.049 ± 0.001 |

MC-XVH2(∞)^{c} | 3569.3 ± 5.4 | 1577.6 ± 2.4 | 3649.7 ± 3.6 | 3224.1 ± 8.3 | 0.949 ± 0.002 | 1.029 ± 0.002 | 1.050 ± 0.001 | 0.104 ± 0.002 |

XVMP2(4)^{b} | 3668.1 | 1560.4 | 3770.7 | 3064.0 | 1.039 | 0.984 | 1.059 | 0.026 |

MC-XVMP2(4)^{b} | 3668.6 ± 3.5 | 1559.1 ± 1.4 | 3769.2 ± 2.3 | 3064.0 ± 9.4 | 1.039 ± 0.001 | 0.984 ± 0.001 | 1.060 ± 0.001 | 0.026 ± 0.000 |

MC-XVMP2(∞)^{c} | 3606.9 ± 4.6 | 1566.2 ± 2.0 | 3705.7 ± 2.9 | 3056.4 ± 9.6 | 1.037 ± 0.001 | 0.979 ± 0.001 | 1.074 ± 0.001 | 0.045 ± 0.001 |

XVMP2[4]^{b} | 3667.6 | 1559.1 | 3774.5 | 3127.8 | 0.997 | 1.004 | 1.011 | 0.018 |

MC-XVMP2[4]^{b} | 3668.5 ± 1.6 | 1557.5 ± 1.0 | 3772.5 ± 1.7 | 3127.8 ± 8.2 | 0.996 ± 0.001 | 1.005 ± 0.001 | 1.012 ± 0.000 | 0.018 ± 0.000 |

MC-XVMP2[∞]^{c} | 3616.7 ± 2.1 | 1563.1 ± 1.1 | 3712.4 ± 2.0 | 3121.5 ± 5.5 | 0.994 ± 0.001 | 1.001 ± 0.001 | 1.024 ± 0.001 | 0.034 ± 0.000 |

A comparison of Tables I and II indicates that the nonrecursive approximation of Eq. (3) causes errors up to ∼40 cm^{−1} in the frequencies of the fundamentals. The nonrecursive calculations do not provide any information about overtones, combinations, resonances, or their pole strengths, either. They, however, dramatically simplify the computational procedure; in the recursive algorithm, MC integration needs to be performed at various frequencies (the green dots in Fig. 7) for each mode, whereas a single MC integration at ν = 0 per mode suffices in the nonrecursive algorithm. This is the trade-off between the two algorithms.

Interestingly, ν_{1} and ν_{3} predicted by MC-XVH2(4) are in more accurate agreement with the observed than those obtained by MC-XVH2(∞), even though the latter takes into account higher-than-quartic force constants. This is fortuitous and likely caused by the accidental cancellation of errors between the higher-than-quartic force constants and the higher-than-second order perturbation corrections, both neglected by MC-XVH2(4). Note that the second-order vibrational perturbation method is known to yield exact results for a Morse oscillator with only cubic and quartic force constants.^{15,21} Note also that MC-XVMP2 takes into account a certain class of higher-order perturbation corrections (relative to the harmonic reference wave function) all the way to an infinite order, which may explain why the same cancellation of errors is hardly seen in the MC-XVMP2 results. This underscores the importance of going beyond QFF, which is exceedingly difficult with theories or algorithms specifically designed for a certain fixed representation of a PES (such as a QFF), but is achieved readily by the direct MC-XVMP2 and MC-XVH2 methods.

## VII. CONCLUSION

In an anharmonic vibrational analysis of a molecule, it is often the evaluation and storage of all force constants up to the *n*th order that constitutes the computational bottleneck. Its operational and memory costs grow exponentially with *n* and as a high power of *m* (the vibrational degrees of freedom). Computer programs for this step require an exponentially increasing effort to write with *n* and also tend to be numerically unstable when they are based on finite-difference formulas as they usually are.

In this work, we have introduced the whole new algorithm of the diagrammatic second-order vibrational MBPT that eradicates the need for evaluating and storing force constants, leading to a highly scalable and easily implemented algorithm that can capture anharmonic effects due to higher-order force constants essentially up to an infinite order. This has been achieved by wedding XVMP2 (XVH2) with QMC, that is, by reformulating the energy and self-energy expressions of XVMP2 (XVH2) into sums of a few high-dimensional integrals that lend themselves to brute-force MC integrations with on-the-fly electronic structure calculations of a PES.

The resulting methods, MC-XVMP2 and MC-XVH2, can directly compute the zero-point energy and transition frequencies of not only fundamentals but also overtones as well as their relative intensities all from a single MC run and *not* as small differences of noisy total energies obtained from separate MC runs. They solve the recursive inverse Dyson equation stochastically. Unlike DMC, they neither require the knowledge of nodal surfaces of excited-state wave functions nor suffer from the resulting fixed-node errors. Furthermore, the zero-point energy and transition frequencies obtained by MC-XVMP2 (MC-XVH2) are rigorously size-extensive and intensive, respectively, as their mathematical definitions obey the extensive and intensive diagram theorems,^{11} while including anharmonic effects of higher-order force constants not accessible by any force-constant-based methods.

We have applied MC-XVMP2 and MC-XVH2 to a small molecule to establish their feasibility and assess their performance. We plan to extend the applications to larger molecules, where these methods are expected to become superior to their deterministic counterparts.

## ACKNOWLEDGMENTS

We thank Dr. Lucas K. Wagner for his advice on the role of symmetry in MC calculations. This work is supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-FG02-11ER16211. S.H. is a Camille Dreyfus Teacher-Scholar and a Scialog Fellow of the Research Corporation for Science Advancement.

### APPENDIX A: HIGHER-ORDER DIAGRAMS

In a direct calculation, the PES is evaluated at geometries randomly generated by the Metropolis algorithm. Therefore, it includes the effects of higher-order force constants and their corresponding diagrammatic contributions implicitly. For instance, diagrammatic contributions in Fig. 10, which involves fifth-order force constants, are included in MC-XVH2(∞), MC-XVMP2(∞), and MC-XVMP2[∞] using the direct algorithm, but not in XVH2(4), XVMP2(4), or XVMP2[4], which is based on a QFF.

This is one of the advantages of the direct MC-XVMP2 methods, but it has the following side effect: the PES's used in the MC-XVMP2 and preceding XVSCF reference wave function calculations differ from each other; the partitioning is no longer the Møller–Plesset type. Hence, the first-order diagrams such as those shown in Fig. 11, which involve sixth-order force constants, have nonzero contributions in MC-XVMP2(∞) and MC-XVMP2[∞] when the reference wave function is XVSCF(4) or XVSCF[4]. Therefore, the first-order energy and self-energy contributions are nonzero in MC-XVMP2(∞) and MC-XVMP2[∞] calculations and need to be evaluated using Eqs. (17) and (34).

### APPENDIX B: POINT-GROUP SYMMETRY

For a molecule that belongs to an Abelian point-group symmetry, many individual contributions to the integrals become symmetrically zero. Eliminating those vanishing contributions using a symmetry logic does not change the values of the integrals or the operation cost per MC step, but it reduces the statistical uncertainty and thus increases overall efficiency of MC-XVMP2 and MC-XVH2 to a noticeable degree.

Let us illustrate the proposed symmetry algorithm by taking

in a recursive calculation. Here, Γ_{N} stands for the irreducible representation of the *N*th state and the *N*_{i}th state is the one obtained by incrementing the quantum number of the *i*th mode in the *N*th state by one. Hence,

*N*

_{i}th and zero-point states transform as the same (i.e., totally symmetric) irreducible representation; it is zero otherwise. This has the effect of compelling symmetrically zero integral contributions to be exactly zero rather than be merely convergent towards zero after a long MC run.

In a nonrecursive calculation, we instead use the Laplace-transformed expression, Eq. (46), in which we no longer have individual states to which irreducible representations can be assigned. We can nonetheless use the same symmetry logic with the following modified definition of

with

where *R* is a symmetry operation, *h* is the order of the symmetry group, and

_{i}under operation

*R*. The above equations can be derived from Eq. (B1) using the identity:

Symmetry can be applied to all the other integrals analogously.

### APPENDIX C: REDUNDANT-WALKER ALGORITHM

At minimum, we need one random-walking point (walker),

*m*-dimensional space to evaluate Eq. (50) and two walkers,

However, propagating much more than these minimal numbers of walkers, we can increase the efficiency of the MC integrations by many orders of magnitude; this is called the redundant-walker algorithm.^{56} The only changes we need to make to the formalism are in Eqs. (50) and (51), which now read

for an *m*-dimensional integral, and

for a 2*m*-dimensional integral, where *N*_{w} is the number of walkers and

*j*th walker coordinates in the

*i*th MC step. Hence, propagating

*N*

_{w}walkers instead of just two at an extra cost of

*O*(

*N*

_{w}), we can increase the number of MC sampling points by

*O*(

*N*

_{w}) enhancement of sampling efficiency.

### APPENDIX D: FREQUENCIES AND POLE STRENGTHS

Given the self-energy evaluated as a function of frequency, it remains to solve Eq. (1) for self-consistent solutions. In our implementation, the self-energy is approximated by a function of the form,

with

where *c*_{1}, *c*_{2}, and {*d*_{N}} are adjustable parameters determined by a weighted linear-least-squares fit to

*N*=

*N*

_{max}. Once the continuous function Σ

_{i}(ν) is obtained in this way, we can determine all roots using the bisection method described in Ref. 27.

The statistical uncertainty σ_{i} of Σ_{i}(ν_{i}) obtained by the least-squares fitting is calculated by

where cov denotes covariance, σ_{(1)} and

with *f* = *aA* + *bB*, where *a* and *b* are constants and *A* and *B* are variables. The statistical uncertainty of ν_{i} is approximately equal to σ_{i} since, from a Taylor expansion of Eq. (1), we have

The relative intensity of this transition is proportional to the pole strength given by

See Ref. 27 for its derivation. To estimate its statistical uncertainty

_{i}(ν) that enters the formula for

*P*

_{i}. For the interpolated Σ

_{i}(ν) given in Eq. (D1), its first derivative is written as

whose associated statistical uncertainty at ν = ν_{i} is

where we have used Eq. (D4). Using this with another formula for the propagation of uncertainty,

with *f* = *a*/*A*, we find

The formulas described here for the statistical uncertainties assume that there is no covariance between

*n*) calculations, in which the contributions to the self-energy from all three terms are significant. However, we have found in Sec. VI that these estimates are reasonable, tending only to slightly overestimate the errors of the anharmonic frequencies. The covariances between

## REFERENCES

_{3}

^{+}, H

_{3}O

^{+}, and CH

_{5}

^{+}using diffusion Monte Carlo