TurboRVB is a computational package for ab initio Quantum Monte Carlo (QMC) simulations of both molecular and bulk electronic systems. The code implements two types of well established QMC algorithms: Variational Monte Carlo (VMC) and diffusion Monte Carlo in its robust and efficient lattice regularized variant. A key feature of the code is the possibility of using strongly correlated many-body wave functions (WFs), capable of describing several materials with very high accuracy, even when standard mean-field approaches [e.g., density functional theory (DFT)] fail. The electronic WF is obtained by applying a Jastrow factor, which takes into account dynamical correlations, to the most general mean-field ground state, written either as an antisymmetrized geminal power with spin-singlet pairing or as a Pfaffian, including both singlet and triplet correlations. This WF can be viewed as an efficient implementation of the so-called resonating valence bond (RVB) Ansatz, first proposed by Pauling and Anderson in quantum chemistry [L. Pauling, The Nature of the Chemical Bond (Cornell University Press, 1960)] and condensed matter physics [P.W. Anderson, Mat. Res. Bull 8, 153 (1973)], respectively. The RVB Ansatz implemented in TurboRVB has a large variational freedom, including the Jastrow correlated Slater determinant as its simplest, but nontrivial case. Moreover, it has the remarkable advantage of remaining with an affordable computational cost, proportional to the one spent for the evaluation of a single Slater determinant. Therefore, its application to large systems is computationally feasible. The WF is expanded in a localized basis set. Several basis set functions are implemented, such as Gaussian, Slater, and mixed types, with no restriction on the choice of their contraction. The code implements the adjoint algorithmic differentiation that enables a very efficient evaluation of energy derivatives, comprising the ionic forces. Thus, one can perform structural optimizations and molecular dynamics in the canonical NVT ensemble at the VMC level. For the electronic part, a full WF optimization (Jastrow and antisymmetric parts together) is made possible, thanks to state-of-the-art stochastic algorithms for energy minimization. In the optimization procedure, the first guess can be obtained at the mean-field level by a built-in DFT driver. The code has been efficiently parallelized by using a hybrid MPI-OpenMP protocol, which is also an ideal environment for exploiting the computational power of modern Graphics Processing Unit accelerators.

The solution of the many-body Schrödinger equation, which describes the interaction between electrons and ions at the quantum mechanical level, represents a fundamental challenge in computational chemistry, condensed matter, and materials science. Since about a century ago, there has been a relentless theoretical and computational effort to find an accurate solution to this problem, which also features many recent interdisciplinary applications in machine learning1 and materials informatics.2 While computationally the scaling of the problem is exponential with the number of electrons, several numerical approximate methods have been put forward in the last decades. Among them, the Density Functional Theory (DFT) method, proposed by Kohn and Sham3 in 1965, is one of the most successful approaches. In this framework, the original interacting 3N many-body problem (N being the number of electrons in a system) is mapped to a non-interacting electron system, defined by an effective mean-field potential, to be determined self-consistently.4 While DFT is an exact theory in principle, the exact form for the exchange-correlation functional, which is an essential part of the DFT mean-field potential, remains unknown. Unfortunately, the progress in generating increasingly successful approximations of this functional is rather slow,5 partly because there is no established strategy for systematic improvement,6 while maintaining an efficient scaling with the system size. The commonly adopted approximations for the exchange-correlation DFT functional have well-known limitations, especially in describing weak dispersive interactions, strongly correlated materials, and extreme environments (e.g., high pressure).7,8

Alternative strategies are represented by the so-called wave function (WF)-based approaches popular in quantum chemistry applications.9 In these methods, electronic correlations are captured either variationally or perturbatively by post-Hartree–Fock theories, such as the Møller–Plesset perturbation theory (MP),10 configuration interaction (CI) and Full-CI (FCI),11 multi-configurational self-consistent field (MCSCF),12 and Coupled-Cluster (CC) theory,13 to name a few. Among them, coupled-cluster with single, double, and perturbative triple excitations, or CCSD(T), is considered to be the gold standard in quantum chemistry as it typically provides results in good agreement with experiments and a reasonable balance between accuracy and computational affordability (despite its cost growing as the seventh power of the number of electrons). While quantum chemistry methods have the advantage of treating electronic exchange and correlation effects in a systematically improvable fashion, they are also much more computationally demanding compared to DFT, and their applicability to large or periodic systems is often computationally prohibitive.

Another way to tackle the problem of the electron correlation and the huge dimension of a many-body WF, exponentially large in the number of electrons, is by means of stochastic approaches, which are in this context referenced with the widely used expression of quantum Monte Carlo (QMC) methods.14,15 Since the invention of Markov Chain Monte Carlo (MCMC) in the 1940s,16stochastic approaches to numerical algorithms had a pervasive influence in a wide range of fields, from physics and engineering to finance. In this respect, the QMC framework is qualitatively different from the deterministic one mentioned above, and it represents, therefore, an original and alternative approach for the solution of the Schrödinger equation, overcoming some of the drawbacks of DFT and the deterministic WF-based approaches of quantum chemistry. In particular, QMC does not rely on uncontrolled approximations; its accuracy can be systematically improved, the scaling with system size is good, and it is straightforwardly applicable to both isolated and periodic systems. While the scaling with the system size is comparable with standard DFT methods, the prefactor is typically much larger. However, methods based on stochastic sampling are well suited for massively parallel computing architectures as the algorithms can sustain an almost ideal scaling with the number of cores. As a result, the feasibility and popularity of QMC are expected to increase with the foreseeable substantial improvements in high-performance computing (HPC) facilities.

TurboRVB includes two of the most popular ground-state QMC algorithms: variational Monte Carlo (VMC) and diffusion Monte Carlo (DMC).14 In VMC, a systematically improvable approximation for the ground state is obtained by direct minimization of the energy, evaluated by a parametrized many-body WF. In an explicitly correlated WF, the electron coordinates are not separable, and the expectation value of the Hamiltonian has to be calculated with Monte Carlo integration, hence the name. In realistic calculations, a faithful and nevertheless compact parametrization for the trial state is essential for a successful energy optimization. The first property allows an unbiased treatment of the electronic correlations across different regimes, such as bond dissociation and electronic phase transitions. The second is needed for a stable optimization of the variational parameters, which may become a too difficult task when the chosen parametrization is redundant or unnecessarily detailed. Therefore, a central effort, in the TurboRVB project, has been devoted to the development of efficient and systematic parametrizations of correlated WFs.

As we will see in the following, an accurate trial WF is also fundamental in DMC, which is the second main algorithm present in TurboRVB. DMC is an imaginary-time projection technique,15 which, when combined with the so-called fixed-node (FN) approximation, represents a powerful route to electronic ground-state calculations. In this approximation, the nodal-surface, which is kept fixed during the projection, can be determined by an accurate variational optimization.

So far, several groups have implemented the above algorithms and established excellent QMC codes such as QMCPACK,17 CASINO,18 QWALK,19 CHAMP,20 and HANDE-QMC.21 We also remark that other QMC algorithms have also been developed recently, such as the auxiliary field quantum Monte Carlo (AF-QMC),22,23 full-configuration interaction quantum Monte Carlo (FCI-QMC),24,25 coupled cluster Monte Carlo (CCMC),26,27 density matrix quantum Monte Carlo (DMQMC),28,29 model space quantum Monte Carlo (MSQMC),30–32 clock quantum Monte Carlo (CQMC),33 or driven-dissipative quantum Monte Carlo (DDQMC).34 

A typical workflow for a QMC calculation performed with TurboRVB is shown in Fig. 1. It shall be noticed that the choice of the most suited Ansatz for the WF, as well as the optimization of its parameters, is a prerequisite of both VMC and DMC evaluations. Ionic forces, if evaluated, can be used to perform QMC-based structural relaxations, Langevin molecular dynamics (MD), or path integral molecular dynamics (PIMD). Moreover, several post-processing tools are available to analyze the outcomes. The TurboRVB code is peculiar because it relies on the assumption that a complex electronic problem, though, until now, cannot be solved exactly with a black-box computer software, can be described very accurately by appropriate variational wave functions, guided by robust physical and chemical requirements and human ingenuity. In particular, TurboRVB is different from other codes in the following features: (i) It employs the Resonating Valence Bond (RVB) WF, first proposed by Pauling and Anderson in quantum chemistry35 and condensed matter physics,36 respectively. It includes statical and dynamical correlation effects beyond the commonly used Slater determinant, while keeping the computational cost at the single-determinant level, thanks to its efficient implementation. (ii) The code implements a VMC algorithm based on localized orbitals (e.g., Gaussians) and state-of-the-art optimization methods, such as the stochastic reconfiguration. Therefore, at the VMC level, one can optimize not only the amplitude of the WF (i.e., the Jastrow factor) but also the nodal surfaces (e.g., the Slater determinant). This leads to a better variational energy in general and also improves the corresponding FN-DMC energy. (iii) The energy derivatives (e.g., atomic forces) are calculated very efficiently, thanks to an implementation based on the Adjoint Algorithmic Differentiation (AAD). As a consequence, one can perform structural optimizations and Langevin molecular dynamics. (iv) The code implements the newly developed Lattice Regularized Diffusion Monte Carlo (LRDMC), a stable DMC algorithm that, very recently, has shown to have a better scaling with the atomic number Z, compared with standard DMC.37 

FIG. 1.

Schematic picture of some of the possible tools available in TurboRVB. The arrows indicate typical workflow calculations.

FIG. 1.

Schematic picture of some of the possible tools available in TurboRVB. The arrows indicate typical workflow calculations.

Close modal

This review is organized as follows: in Sec. II, we briefly explain the fundamental QMC algorithms implemented in TurboRVB, namely, the VMC and LRDMC; in Sec. III, we describe the RVB WF; in Sec. IV, we extend the WF to treat periodic systems; in Sec. V, we introduce the DFT algorithm efficiently implemented in TurboRVB; in Sec. VI, we describe the way energy derivatives are computed (e.g., atomic forces) by means of AAD; in Sec. VII, we list the TurboRVB steps to optimize a many-body WF; in Sec. VIII, we introduce the first and the second-order Langevin molecular dynamics implemented in TurboRVB; in Sec. IX, we summarize the typical QMC workflow from the WF generation to the final LRDMC calculation; in Sec. X, we show weak and strong scaling results of TurboRVB, measured on the Marconi/CINECA supercomputer; in Sec. XI, we list the physical properties that can be calculated by TurboRVB; in Sec. XII, we review the major TurboRVB applications done so far; and in Sec. XIII, we introduce a Python-based workflow system, named Turbo-Genius.

TurboRVB implements two types of well established quantum Monte Carlo methods: Variational Monte Carlo (VMC) and Lattice Regularized Diffusion Monte Carlo (LRDMC). We summarize these methods in this section. The interested readers should also refer to the comprehensive review of QMC14 for details. TurboRVB also implements an original DFT engine to generate trial WFs, which is more suitable for the VMC and LRDMC calculations, as shown in Sec. V.

Starting from the variational principle, the expectation value of the energy evaluated for a given WF Ψ can be written as

E=dxΨ2xH^Ψx/ΨxdxΨ2x=dxeLxπx,
(1)

where x=r1σ1,r2σ2,,rNσN here and henceforth is a shorthand notation for the N electron coordinates and their spins, whereas

eLxH^ΨxΨx and πxΨ2xdxΨ2x

are the so-called local energy and the probability of the configuration x, respectively. This multidimensional integration can be evaluated stochastically by generating a set xi according to the distribution πx using the Markov chain Monte Carlo such as the (accelerated38,39) Metropolis method and by averaging the obtained local energies eLxi,

EVMC=eLxπx1Mi=1MeLxi,
(2)

which has an associated statistical error of Var[eL(xi)]/M̃, where Var[eL(xi)] is the variance of the sampled local energies and M̃ is the sampling size M divided by the autocorrelation time. This indicates that the precision of the VMC evaluation is inversely proportional to the square root of the number of samplings (i.e., of the computational cost). It is worth to note that if Ψ(x) is an eigenfunction of H^, say with eigenvalue E0, then eL(x) = E0 for each x, implying that the variance of the local energy is zero and EVMC = E0 with no stochastic uncertainty. This feature is known as the zero-variance property.

The probability distribution used for the importance sampling can also differ from πx. Indeed, one can use an arbitrary probability distribution function πx=ΨG2x/dxΨG2x and estimate a generic local observable Ox either by using ŌVMC=Oxπx or

ŌVMC=OxWxπxWxπxi=1MOxiWxii=1MWxi,
(3)

where Wx=|Ψx/ΨGx|2, and the points xi are distributed according to π′. This reweighting scheme is very important when evaluating atomic forces, as discussed in Sec. VI B. Since the evaluations of the standard deviations are nontrivial in this case, TurboRVB employs the bootstrap and jackknife methods in order to estimate the mean value and the statistical error,15 which are also used when evaluating those of the local energy, forces, and so on. Indeed, the code outputs the history of the local energies, forces, or other properties in appropriate files (when the corresponding option is true), thus allowing the error bar estimates by simple post-processing. The user can also use the reblocking (binning) technique to remove the autocorrelation bias.40 

One can optimize the WF based on the variational theorem by introducing a set of parameters α1,α2,,αp to the WF Ψx,α,

EVMCα=dxeLx,απx,αEexact.
(4)

However, the optimization of a many-body WF remains a difficult challenge not only because optimizing a cost function containing many parameters is a complex numerical task, due to the presence of several local minima in the energy landscape, but also because this difficult task is further complicated by the presence of statistical errors in the QMC evaluation of any quantity. Nevertheless, a great improvement in this field has been achieved when the QMC optimization techniques have made use of the explicit evaluation of energy derivatives with finite statistical errors. In particular, in TurboRVB, the adjoint algorithmic differentiation (AAD) has been implemented, by allowing the efficient calculation of generalized forces (fi=Eαi)41 and very efficient optimization methods, the so-called “stochastic reconfiguration”42,43 and “the linear method.”44–46 These methods are discussed in Secs. VII A and VII B.

Lattice regularized diffusion Monte Carlo (LRDMC), which was initially proposed by Casula et al.,47 is a projection technique that allows us to improve a variational Ansatz systematically. This method is based on Green’s function Monte Carlo (GFMC),48–50 filtering out the ground state WF ϒ0 from a given trial WF ΨT: since the eigenstates of the Hamiltonian have the completeness property, the trial WF can be expanded as

ΨT=nanϒn,
(5)

where an is the coefficient for the nth eigenvectors (Υn). Therefore, by applying GM=ΛH^M, one can obtain

ϒ0limMΛH^MΨT=limMλE0Ma0ϒ0+n0λEnλE0Manϒn,
(6)

where Λ is a diagonal matrix with Λx′,x = λδx′,x (λ should be sufficiently large to obtain the ground state) and En is the nth eigenvalue of H^. Since λEnλE0<1, the projection filters out the ground state WF Υ0 from a given trial WF ΨT, as long as the trial WF is not orthogonal to the true ground state (i.e., a0 ≡ ⟨Υ0T⟩ ≠ 0). To apply the GFMC for ab initio electron calculations, the original continuous Hamiltonian is regularized by allowing electron hopping with step size a, in order to mimic the electronic kinetic energy. The corresponding Hamiltonian H^a is then defined such that H^aH^ for a → 0. Namely, the kinetic part is approximated by a finite difference form

Δifxi,yi,ziΔiafxi,yi,zi1a2fxi+afxi+fxiafxiyizi,
(7)

and the potential term is modified as

Vax=Vx+12iΔiaΔiΨGxΨGx.
(8)

The corresponding Green’s function matrix elements are

Gx,x=xΛH^axΛHax,x
(9)

and the single LRDMC iteration step is given by the following equation:

Ψn+1x=xGx,xΨnx.
(10)

The sketch of the LRDMC algorithm, a Markov chain that evolves the many-body WF according to Eq. (10), is as follows:47 (STEP 1) Prepare a walker with configuration x and weight w (w0 = 1). (STEP 2) A new configuration x′ is generated by the transition probability,

px,x=Gx,x/bx,
(11)

where

bx=xGx,x
(12)

is a normalization factor. By applying the discretized Hamiltonian to a given configuration (H^a|x), (6N + 1) configurations |x′⟩ are determined according to the probability px′,x in Eq. (11), where N is the number of electrons in the system.51 This allows the evaluation of the normalization factor bx in Eq. (12) even in a continuous model. Note that 6N comes from the diffusion of each electron in two directions (±a) and the remaining 1 stands for the starting configuration x before the possible hoppings (all N electrons) (i.e., x′ = x). (STEP 3) Finally, update the weight with wn+1 = wnbx and return to STEP I. After a sufficiently large number of iterations (the Markov process is equilibrated), one can calculate the ground state energy E0,

E0wneLxnwn,
(13)

where denotes the statistical average over many independent samples generated by the Markov chain and eLx is called the (bare) local energy that reads

eLx=xHx,x=λbx.
(14)

Indeed, the ground state energy can be calculated after many independent n-step calculations. A more efficient computation can be realized by using the so-called “correcting factor” technique: after a single simulation that is much larger than the equilibration time, one can imagine starting a projection of length p from each (np)th iteration. The accumulated weight for each projection is

Gnp=j=1pbnj.
(15)

Then, the ground state energy E0 can be estimated by

E0nGnpeLxnnGnp.
(16)

This straightforward implementation of the above simple method is not suitable for realistic simulations due to fluctuations of weights, large correlation times, sign problems, and so on. TurboRVB implements the following state-of-art techniques for real electronic structure calculations.

If the potential term [Eq. (8)] is unbounded (it is the case in ab initio calculations), the bare weight bx [Eq. (12)] and the local energy eLx [Eq. (14)] significantly fluctuate, making the numerical simulation very unstable and inefficient. To overcome this difficulty, the code employs the importance sampling scheme,15 in which the original Green’s function is modified using the so-called guiding function ΨG as

G̃x,x=Gx,xΨGxΨGx,
(17)

and the projection is modified as

ΨGxΨn+1x=xG̃x,xΨGxΨnx.
(18)

In practice, the guiding function is prepared by a VMC calculation. The modified Green’s function for importance sampling G̃x,x has the same eigenvalues as the original one, and this transformation does not change the formalism of LRDMC. The weight is updated by

b̃x=xG̃x,x,
(19)

and the local energy with importance sampling is

ẽLx=ΨG|H^|xΨG|x=xHx,xΨGxΨGx.
(20)

Equation (20) implies that if the guiding function ΨG is an exact eigenstate of the Hamiltonian, there are no statistical fluctuations, implying the zero variance property, namely, the computational efficiency to obtain a given statistical error on the energy improves with the quality of the variational WF. In this respect, it is also important to emphasize that a meaningful reduction in the statistical fluctuations is obtained by satisfying the so-called cusp conditions. As long as they are satisfied, the resulting local energy does not diverge at the coalescence points where two particles are overlapped, despite the singularity of the Coulomb potential term [V(x) in Eq. (8)].51 In addition, the importance sampling maintains the electrons in a region away from the nodal surface since the guiding function vanishes there [i.e., the RHS of Eq. (17) → 0]. This clearly enhances the efficiency of the sampling because the local energy diverges at the nodal surface.

The Green’s function cannot be made strictly positive for fermions; therefore, the fixed-node (FN) approximation has to be introduced15 in order to avoid the sign problem. Indeed, the Hamiltonian is modified using the spin-flip term Vsfx=x:sx,x>0Hx,xΨGx/ΨGx,

Hx,xFN,γ=Hx,x+1+γVSFx    for    x=xHx,x                                        for    xx,sx,x<0γHx,x                                 for    xx,sx,x>0,
(21)

where sx,x=ΨGxHx,x/ΨGx and γ ≥ 0 is a real parameter. The use of the fixed-node Green’s function

G̃x,xFN=ΛHFNx,xΨGxΨGx
(22)

can prevent the crossing of regions where the configuration space yields a sign flip of the Green’s function; therefore, the walkers are constrained in the same nodal pockets and avoid the sign problem.

TurboRVB also implements the many-walker technique and the branching (denoted as reconfiguration49 in TurboRVB) scheme for a more efficient computation.15 The code performs the branching as follows: (1) Set the new weights equal to the average of the old ones,

wα,n=w¯1Nwβwβ,n.
(23)

(2) Select the new walkers from the old ones with a probability that is proportional to the old walkers’ weights,

pα,n=wα,nβwβ,n,
(24)

which does not change the statistical average of weights but suppresses the fluctuations by dropping walkers having small weights. The code performs branching (reconfiguration) after a projection time τ, which can be chosen as a user input parameter. In practice, within the many-walker and branching schemes, the average weights are stored and are set to one for all walkers after each branching. The user can retrieve the accumulated weights at the end of the simulation,

Gnp=j=1pw¯nj,
(25)

and calculate the ground state energy,

E0nGnpeLxnnGnp,
(26)

where eLxn is the mean local energy averaged over the walkers, which reads

eLxn=αwα,neLxα,nαwα,n
(27)

and is evaluated just before each reconfiguration. Note that p is also an input parameter, which has to be carefully chosen by the user to allow energy convergence.

When λ is sufficiently large, the correlation time also becomes large because the diagonal terms of the Green’s function become very close to one (i.e., a walker remains in the same configuration), which causes a very large correlation time. In TurboRVB, this difficulty is solved by considering in a different way the diagonal and non-diagonal moves. In a given interval of M iterations, the non-diagonal updates are efficiently calculated using a random number according to the probability that the configuration remains in the same one (diagonal updates). This technique can be generalized to the continuous-time limit, namely, M, at MΛ=τ fixed. In the M limit, the projection ΛH^M is equal to the imaginary time evolution expτH^, apart from an irrelevant constant ΛM. Thus, the user should specify only τ as the input parameter. Indeed, the walker weight is updated by wwexpτξeLx and the imaginary time is updated by τleftτleftτξ at each non-diagonal update until τleft becomes 0, where τξ=log1ξ/bx is a diagonal move time step determined by a uniform random number 0 ≤ ξ < 1. The branching (reconfiguration) is performed after each projection time of length τ that a user puts in the input file within the many walker and the branching implementation.

In practice, there are three important features in LRDMC. First, there is not a time step error in LRDMC because the Suzuki–Trotter decomposition is not necessary, unlike the standard DMC algorithm.15 Instead, there is a finite-size lattice error due to the discretization of the Hamiltonian (a). Therefore, in order to obtain an unbiased FN energy, it is important to extrapolate the LRDMC energy to the a → 0 limit by using several results corresponding to different lattice spaces.47 This is then consistent with the standard DMC energy estimate (Fig. 2) obtained in the limit of an infinitely small time step. Probably one of the most important advantages of the LRDMC method is that the extrapolation to the a → 0 limit is very smooth and reliable so that unbiased FN energies are easily obtained with low order polynomial fits. Second, LRDMC can straightforwardly handle different length scales of a WF by introducing different mesh sizes (a and a′) so that electrons in the vicinity of the nuclei and those in the valence region can be appropriately diffused,37,47 which defines the so-called double-grid LRDMC. This scheme saves a substantial computational cost in all-electron calculations, especially for a system including large atomic number atoms,37 with a typical computational cost scaling with Z∼6, where Z is the maximum atomic number. TurboRVB makes use of an appropriate ratio of the mesh sizes (i.e., a/a′), the smaller one a used when electrons are close to the nuclei and the larger one a′ ≫ a adopted in the valence region. By choosing a proper Thomas–Fermi characteristic length around the nuclei, where short hops of lengths a mostly occur, a significant improvement of the scaling (i.e., Z∼5.6Z∼5) has recently been reported.37 Finally, the inclusion of non-local pseudopotentials in this framework is straightforward by means of an additional spherical grid defined in an appropriate mesh.51 As described in Refs. 47 and 52, LRDMC provides an upper bound for the true ground-state energy and allows the estimation of EFN, even in the presence of non-local pseudopotentials. Note that this variational property has also been extended to the standard DMC framework,53 with the introduction of the so-called T-moves. Moreover, the recently introduced determinant locality approximation (DLA)54 to deal with non-local pseudopotentials is also implemented in TurboRVB and can be optionally used in LRDMC.

FIG. 2.

Fixed-node energies for the all-electron carbon atom computed within DMC, single-grid LRDMC (one lattice), and double-grid LRDMC. The lattice discretization parameter a is mapped to the time step τ as a=τ. Reprinted with permission from M. Casula, C. Filippi, and S. Sorella, Phys. Rev. Lett. 95, 100201 (2005). Copyright 2005 APS.

FIG. 2.

Fixed-node energies for the all-electron carbon atom computed within DMC, single-grid LRDMC (one lattice), and double-grid LRDMC. The lattice discretization parameter a is mapped to the time step τ as a=τ. Reprinted with permission from M. Casula, C. Filippi, and S. Sorella, Phys. Rev. Lett. 95, 100201 (2005). Copyright 2005 APS.

Close modal

Both the accuracy and the computational efficiency of QMC approaches crucially depend on the WF Ansatz. The optimal Ansatz is typically a trade-off between accuracy and efficiency. On the one side, a very accurate Ansatz can be involved and cumbersome, having many parameters and being expensive to evaluate. On the other hand, an efficient Ansatz is described only by the most relevant parameters and can be quickly and easily evaluated. In particular, in Sec. II, we have seen that QMC algorithms, both at the variational and fixed-node levels, imply several calculations of the local energy eL(x) and the ratio Ψ(x)/Ψ(x′) for different electronic configurations x and x′. The computational cost of these operations determines the overall efficiency of QMC and its scaling with system size.

TurboRVB employs a many-body WF Ansatz Ψ, which can be written as the product of two terms,

Ψ=ΦAS×expJ,
(28)

where the term exp J, conventionally dubbed Jastrow factor, is symmetric under electron exchange and the term ΦAS, also referred to as the determinant part of the WF, is antisymmetric. The resulting WF Ψ is antisymmetric, thus fermionic.

In the majority of QMC applications, the chosen ΦAS is a single Slater determinant (SD) ΦSD, i.e., an antisymmetrized product of single-electron WFs. Clearly, SD alone does not include any correlation other than the exchange. However, when a Jastrow factor, explicitly depending on the inter-electronic distances, is applied to ΦSD, the resulting Ansatz ΨJSD = ΦSD * exp J often provides over 70% of the correlation energy55 at the variational level. Thus, the Jastrow factor proves very effective in describing the correlation, employing only a relatively small number of parameters and therefore providing a very efficient way to improve the Ansatz.56 A Jastrow correlated SD (JSD) function yields a computational cost for QMC simulations—both VMC and FN—about ∝N3, namely, the same scaling of most DFT codes. Therefore, although QMC has a much larger prefactor, it represents an approach much cheaper than traditional quantum chemistry ones, at least for large enough systems.

While the JSD Ansatz is quite satisfactory in several applications, there are situations where very high accuracy is required, and a more advanced Ansatz is necessary. The route to improve JSD is not unique, and different approaches have been attempted within the QMC community. First, it should be mentioned that improving the Jastrow factors is not an effective approach to achieve higher accuracy at the FN level as the Jastrow is positive and cannot change the nodal surface. A popular approach is through the employment of backflow,57 which is a remapping of the electronic configurations that enter into ΦAS (SD as a special case) where each electron position is appropriately changed depending on nearby electrons and nuclei. Backflow is an effective way to recover correlation energy, both at the variational and FN levels. However, it can be used at a price to increase significantly an already large computational cost.58 Another possibility is to improve ΨSD similar to conventional quantum chemistry approaches, namely, by considering ΦAS as a linear expansion of several Slater determinants. While this second approach can provide very high accuracy, it may be extremely expensive as the number of determinants necessary to remain with a pre-defined accuracy grows combinatorially with the system size.

The vision embraced in TurboRVB is that the route toward an improved Ansatz should not compromise the efficiency and good scaling of QMC. For this reason, neither backflow nor explicit multideterminant expansions are implemented in the code. Within the TurboRVB project, the main goal is instead to consider an Ansatz that can be implicitly equivalent to a multideterminant expansion but remains in practice as efficient as a single determinant. There are five alternatives for the choice of ΦAS, which correspond to (i) the Pfaffian (Pf), (ii) the Pfaffian with a constrained number of molecular orbitals (Pfn), (iii) the Antisymmetrized Geminal Power (AGP), (iv) the Antisymmetrized Geminal Power with a constrained number of molecular orbitals (AGPn), and (v) the single Slater determinant. It is interesting to observe that the latter four WFs are all obtained by introducing specific constraints on the most general Pf Ansatz. The hierarchy of the five Ansätze is represented in the Venn diagram of Fig. 3. Clearly, a more general Ansatz is more accurate in the total energy but not necessarily in the energy differences. Moreover, it is described by more variational parameters that could imply a more challenging optimization and a slightly higher cost. TurboRVB includes several tools to go from one Ansatz to another, as represented in Fig. 4. Typically, the starting point is SD, which can be obtained from a Hartree–Fock (HF) or a DFT calculation with different exchange-correlation functionals. Both methods are not expected to provide the optimal parameters when the Jastrow factor is included in the WF. Indeed, in QMC, the WFs are always meant to include the Jastrow factor, which proves fundamental to improve the properties of the overall WF. For instance, AGP carries more correlation than SD. However, it is not size-consistent unless it is multiplied by a Jastrow factor. Thus, a fundamental step to take advantage of the WF Ansatz is the possibility to perform reliable optimizations of the parameters. Optimization will be discussed in Sec. VII. In this section, we will describe the functional form of the Jastrow factor implemented in TurboRVB (Sec. III A), the Pfaffian (Sec. III B), the AGP (Sec. III C), the AGPn and Pfn (Sec. III D), the SD (Sec. III E), the multiconfigurational character of the AGP (Sec. III F), the basis set (Sec. III G), the pseudopotentials (Sec. III H), the contractions of the orbitals (Sec. III I), and the conversion tool (Sec. III J).

FIG. 3.

Ansatz hierarchy. The output of Hartree–Fock (HF) or DFT simulations with different exchange-correlation functionals are special instances of the SD Ansatz.

FIG. 3.

Ansatz hierarchy. The output of Hartree–Fock (HF) or DFT simulations with different exchange-correlation functionals are special instances of the SD Ansatz.

Close modal
FIG. 4.

Ansatz conversion.

FIG. 4.

Ansatz conversion.

Close modal

The Jastrow factor (exp J) plays an important role in improving the correlation of the WF and in fulfilling Kato’s cusp conditions.59TurboRVB implements the Jastrow term composed of one-body, two-body, and three/four-body factors (J = J1 + J2 + J3/4). The one-body and two-body factors are used to fulfill the electron–ion and electron–electron cusp conditions, respectively, and the three/four-body factors are employed to consider a systematic expansion, in principle converging to the most general electron pair contribution. The one-body Jastrow factor J1 is the sum of two parts, the homogeneous part (enforcing the electron–ion cusp condition),

J1hr1,,rN=i=1Na=1Nat2Za3/4ua(2Za)1/4riRa,
(29)

and the corresponding inhomogeneous part,

J1inhr1σ1,,rNσN=i=1Na=1NatlMa,lσiχa,lri,
(30)

where ri are the electron positions, Ra are the atomic positions with corresponding atomic number Za, l runs over atomic orbitals χa,l (e.g., GTO) centered on the atom a, σi represents the electron spin ( or ), {Ma,lσi} are variational parameters, and uar is a simple bounded function. In TurboRVB, the most common choice for ua is

uar=12bea1erbea,
(31)

depending on a single variational parameter bea, which may be optimized independently for each atomic species.

The two-body Jastrow factor is defined as

J2r1σ1,,rNσN=i<jvσi,σjrirj,
(32)

where vσi,σj is another simple bounded function. There are several possible choices for vσi,σj implemented in TurboRVB (all listed in the file input.tex in the doc folder), and one of them is, for instance, the following spin-dependent form:

vσi,σjri,j=ri,j41+beeparari,j1(σi=σj)ri,j21+beeantiri,j1(σiσj),
(33)

where ri,j=rirj, and beepara and beeanti are variational parameters.

The three/four-body Jastrow factor reads

J3/4r1σ1,,rNσN=i<ja,lb,mMa,l,b,mσi,σjχa,l(ri)χb,m(rj),
(34)

where the indices l and m again indicate different orbitals centered on corresponding atoms a and b, and {Ma,l,b,mσi,σj} are variational parameters. Sometimes, it is convenient to set to zero part of the coefficients of the four-body Jastrow factor, namely, those corresponding to ab, as they increase the overall variational space significantly and make the optimization more challenging, without being much more effective in improving the variational WF.

The SD is an antisymmetrized product of single-electron WFs. Thus, SD neglects almost entirely the correlation between electrons. A natural way to improve this description is to include explicitly in the Ansatz pairwise correlations among electrons. This is precisely what the Pfaffian WF does.60–62 The building block of the Pfaffian WF is the pairing function g(i, j) between any pair of electrons i and j. Henceforth, we denote with the generic bolded index i both the space ri coordinates and the spin values σi,

i(ri,σi),
(35)

corresponding to the ith electron.

For simplicity, let us first consider a system with an even number N of electrons. The WF, written in terms of pairing functions, is

ΦAS(1,,N)=Ag(1,2)g(3,4)g(N1,N),
(36)

where A is the antisymmetrization operator,

A1N!PSNϵPP^,
(37)

SN is the permutation group of N elements, P^ is the operator corresponding to the generic permutation P, and ϵP is its sign.

Let us define G the N × N matrix with elements Gi,j = g(i, j). Note that

g(i,j)=g(j,i)(and Gi,j=Gj,i),
(38)

as a consequence of the statistics of fermionic particles; thus, G is skew-symmetric (i.e., GT = −G, T being the transpose operator), so the diagonal is zero and the number of independent entries is n=1N1n=N(N1)/2.

The Pfaffian63 of a N × N skew-symmetric matrix G is defined as

Pf(G)12N/2(N/2)!PSNϵPGP(1),P(2)GP(N1),P(N)
(39)

if N is even, and it is zero if N is odd.64 Therefore, the WF ΦAS defined in the right-hand side (RHS) of Eq. (36) equals 1N!!Pf(G), where the semifactorial N!!N!2N/2(N/2)! is irrelevant in QMC as it affects only the normalization of the WF. Thus, we can define our electronic WF as

ΦPf=Pf(G).
(40)

Note that the ΦPf here defined allows the description of any system with Nu electrons with spin-up and Nd electrons with spin-down, provided that N = Nu + Nd is even. Indeed, with no loss of generality, we can assume that electrons i = 1, …, Nu have σi = 1/2 and electrons with i = Nu + 1, …, N have σi = −1/2. Thus, the N × N skew-symmetric matrix G is written as

formula
(41)

where Guu is a Nu × Nu skew-symmetric matrix with elements guu(i, j), Gdd is a Nd × Nd skew-symmetric matrix with elements gdd(i, j), Gud is a Nu × Nd matrix with elements gud(i, j), and Gdu=GudT, i.e., gdu(i, j) = −gud(j, i). guu describes the pairing between a pair of electrons with spin-up,

guu(i,j)=fuu(ri,rj),
(42)

where the function fuu describes the spatial dependence on the coordinates ri, rj for i, jNu. Note that fuu(rj, ri) = −fuu(ri, rj) as a consequence of the properties of g. The spin part describes a system with unit total spin and spin projection along the z-axis and will be indicated by 1,+1. Similarly, gdd describes the pairing between pairs of electrons with spin-down for i, j > Nu,

gdd(i,j)=fdd(ri,rj)
(43)

with fdd(rj, ri) = −fdd(ri, rj), and the spin part describes a system with total unit spin and negative spin projection along the z-axis, indicated with 1,1. gud describes the pairing between pairs of electrons with unlike spins. Since two electrons with unlike spins can form a singlet 0,0=2 or a triplet 1,0=+2, in the general case, the pairing function gud will be a linear combination of the two components,

gud(i,j)=fS(ri,rj)2+fT(ri,rj)+2,
(44)

where fS(ri, rj) = fS(rj, ri) describes the spatial dependence of the singlet part of gud and fT(ri, rj) = −fT(rj, ri) describes the spatial dependence of the triplet part. Therefore, the generic pairing function g(i, j) is the sum of all four components mentioned above, namely,

gi,j=fS(ri,rj)0,0+fT(ri,rj)1,0+fuu(ri,rj)1,+1+fdd(ri,rj)1,1.
(45)

The pairing functions fS, fT, fuu, and fdd are expanded over atomic orbitals (see Sec. III G). Say, for a generic pairing function f, we have

fri,rj=l,m,a,bAa,l,b,mψa,lriψb,mrj,
(46)

where ψa,l and ψb,m are primitive or contracted atomic orbitals, their indices l and m indicate different orbitals centered on atoms a and b, while i and j label the electron coordinates. Symmetries on the system or properties of the underlying pairing function f imply constraints on the coefficients. For instance, the coefficients of fS are such that Aa,l,b,m=Ab,m,a,l because fS(ri, rj) = fS(rj, ri), whereas Aa,l,b,m=Ab,m,a,l for fT, fuu, and fdd.

Let us consider now the remaining case of a system with an odd number of electrons N. The simplest way to handle this case is to consider a system with an extra fictitious electron N + 1 ↔ (rN+1, σN+1) that we set at infinity rN+1, thus non-interacting with all N physical electrons. The extra matrix elements are easily computed,

g(N+1,N+1)=0,g(i,N+1)=g(N+1,i)=ϕ(i)   foriN,
(47)

where ϕ(i) ≡ ϕ(ri, σi) can be considered an extra spin dependent unpaired orbital vanishing at infinity.

The antisymmetric part of the WF of the overall system is then

ΦAS(1,,N)=Ag(1,2)g(3,4)g(N,N+1),
(48)

where the antisymmetrization operator acts on the (now even) N + 1 electrons. This is a perfectly allowed N electron fermionic WF because by definition (i) it is antisymmetric for all permutations of the N + 1 particles and, in particular, for the N− physical ones and (ii) it does not depend on the extra N + 1th coordinates and spin as the fictitious particle is at infinity. On the other hand, it is easy to show that this is a nontrivial and nonvanishing WF because we can use the basic Pfaffian formula in Eq. (39) for the antisymmetrization of the N + 1 particles and we readily obtain that

ΦPf=Pf(G̃),
(49)

where

formula
(50)

is a (N + 1) × (N + 1) skew-symmetric matrix and Φ is a N-dimensional vector whose elements are ϕ(i), i = 1, …, N. Its Pfaffian is quite generally nonvanishing for at least some configuration, provided that the diagonalization of the Pfaffian (see later) has at least (N + 1)/2 non-zero eigenvalues (likewise for the Slater determinant, it is enough that the N−molecular orbitals are linearly independent), and we obtain in this way that the Pfaffian is perfectly defined even in the odd number of electrons case.

Finally, we would like to emphasize that the above argument can be generalized to arbitrary number k of unpaired orbitals and arbitrary boundary conditions including calculations on periodic supercells. We can indeed assume at the beginning that when N is odd (even), we add an odd (even) number k of unpaired orbitals such that there are N electrons coupled by the matrix G plus k unpaired electrons. The k unpaired electrons are paired with k fictitious electrons [with a pairing function satisfying conditions analogous to those in Eq. (47)]. This yields a Pfaffian of even linear dimension N + k of the same form as the one in Eq. (50) but with Φ being a block N × k rectangular matrix, determined by k different spin-dependent orbitals. The same argument holds that the antisymmetrized WF is written as the Pfaffian of the corresponding (N + k) × (N + k) skew-symmetric matrix and does not depend on the fictitious particle coordinates introduced and therefore is an allowed physical WF antisymmetric over the N physical particles.

A special case that is of interest is when all N electrons are set to be unpaired (i.e., when k = N and Φ is a generic N × N matrix) and it is assumed that there is no pairing among them [i.e., G = 0 in the N × N block matrix in Eq. (50)]. Thus, using a well-known relation of a Pfaffian

formula
(51)

we recover the standard Slater determinant. In this way, we can clearly see that the Pfaffian represents a quite remarkable generalization of the single determinant picture and that a larger variational freedom is exploited by allowing pairing correlations with a non-zero matrix G.

Recently, we have noticed that, following the above-mentioned derivation, it is possible to generalize further the Pfaffian WF Ansatz by introducing fictitious particles and relaxing the constraints on the G̃ matrix, even in the lower block diagonal part, which can be assumed to be an arbitrary non-zero skew-symmetric matrix. Work is in progress in order to understand whether this extended formulation can further improve the accuracy within a single Pfaffian WF.

If we consider only the case of a pairing function g(i, j) that is a spin singlet [namely, fuu, fddfT in Eq. (45) are set to zero, yielding g(i,j)=fS(ri,rj)0,0], then we obtain the singlet Antisymmetrized Geminal Power (denoted as AGP).

Let us consider first an unpolarized system, having an even number N of electrons, and without loss of generality, we can assume that the electrons i = 1, …, N/2 have spin up and electrons j = N/2 + 1, …, N have spin down. Then, the matrices Guu and Gdd defined in Sec. III B are both zero matrices of size N/2 × N/2, and the matrix Gud has only the contribution coming from the singlet, which we dub GS. The antisymmetrization operator implies the computation of

formula
(52)

where the equality follows from a property of the Pfaffian [see Eq. (51)]. The overall sign is arbitrary for a WF; thus, the antisymmetrized product of singlet pairs (geminals) is indeed equivalent to the computation of the determinant of the matrix GS,

ΦAGPs=det(GS).
(53)

It should be noted that it is not necessary that the matrix Gud be symmetric to reduce the Pfaffian to a single determinant evaluation. As long as the matrices Guu and Gdd are zero, the Pfaffian is indeed equivalent to det(Gud) and describes an antisymmetric WF. However, if Gud is not symmetric, the function

ΦAGP=det(Gud)
(54)

is not an eigenstate of the spin. In other terms, there is a spin contamination, similar to the case of unrestricted HF calculations.

The AGP Ansatz can be generalized to describe polarized systems, i.e., systems where the number Nu of electrons with spin up is different from the number Nd of electrons with spin down. With no loss of generality, we can assume that Nu > Nd; thus, the system is constituted by a number p = Nd of electron pairs and a number k = NuNd of unpaired electrons (clearly, N = Nu + Nd = 2p + k). We aim at evaluating

Ag(1,2)g(2p1,2p)ϕ1(2p+1)ϕk(N).
(55)

As mentioned above, we assume that the pairing function is zero for like-spin pairs. With no loss of generality, we can assume that electrons i = 1, …, Nu have spin up and electrons j = Nu + 1, …, N have spin down. Moreover, as was done at the end of Subsection III B, we can add k fictitious entries to g(i, j) such that g(i, N + l) = ϕl(i) for l = 1, …, k and i = 1, …, Nu. Thus, the antisymmetrization implies the use of Eq. (51), i.e.,

formula
(56)

with the Nu × Nu matrix G̃=Gud|Φ, the Nu × Nd matrix Gud describing the pairing between the Nu spin up electrons and the Nd spin down electrons, and the Nu × k matrix Φ describing the k unpaired orbitals. Thus, we need to evaluate

ΦAGP=det(G̃).
(57)

One of the most important advantages of the AGP Ansatz is that it is equivalent to a linear combination of Slater determinants (i.e., multi configurations), but the computational cost remains at the level of a single-determinant one65–67 (see Sec. III F). This is especially important for large systems because the naive multireference approach requires an exponentially large number of Slater determinants, which drastically increases the computational cost. The AGP Ansatz was applied to the ab initio calculation by Casula and Sorella65 for the first time in 2003; then, it has also been implemented in other QMC codes.

A convenient way to impose constraints on the variational parameters defining the AGP or Pf WF is obtained by rewriting the expansion of the geminal in terms of molecular orbitals (MOs). As shown in Eq. (46), a geminal g(i, j) is natively expressed in terms of the atomic orbitals {ψa,l(r)}, by summing over all the atoms a, the corresponding orbitals l, and the spin index σ (as the elements of the basis may in principle also depend on the spin, even though the most common choice is to take the same orbital for each of the two possible values of the spins). In order to simplify the notation here, let us merge the atomic orbital and spin indices in a unique one that is indicated with a Greek symbol [e.g., μ ↔ (a, l)] running from 1 to the total dimension of the 2L atomic orbitals used, L for each spin component. Therefore,

g(i,j)=μ,ν2LAμ,νψμ(i)ψν(j),
(58)

and clearly, the symmetry of g implies that Aμ,ν = −Aν,μ. The coefficients Aμ,ν define a 2L × 2L skew-symmetric matrix A. If we define the 2L dimensional vector Ψi=ψ1(i),,ψ2L(i)T, Eq. (58) rewrites as g(i,j)=ΨiTAΨj. Moreover, the overlaps Sμ,ν ≡ ⟨ψμ|ψν⟩ between atomic orbitals define the overlap matrix S, which in the case of a spin-dependent basis is block diagonal,

formula
(59)

with Suu and Sdd being positive definite L × L square matrices (Suu = Sdd when orbitals are the same for spin up and spin down). In TurboRVB, the overlap matrix S is computed on a suitable uniform mesh with an efficient and general parallel algorithm. Then, an orthonormal basis is defined,

ψ̃μ(i)=νSμ,ν1/2ψν(i),
(60)

where S−1/2 is well defined since S is strictly positive definite.68 The matrix elements of g can be recasted in the new orthonormal basis ψ̃μ(i), yielding g(i,j)=μ,ν2LÃμ,νψ̃μ(i)ψ̃ν(j)=Ψ̃iTÃΨ̃j, with the matrix ÃS1/2AS1/2 that is skew symmetric.

At this point, from the spectral theory of skew symmetric matrices, it is possible to perform the Youla decomposition of à (see Ref. 69), which can be written in the form à = QΣQT, where Q is unitary (also real if à is real) and the matrix Σ is block diagonal with Σ2k−1,2k = λk = −Σ2k,2k−1 for k = 1, …, L and zero everywhere else, with λk ≥ 0.70 So, the pairing function g(i, j) can be written as ΦiTΣΦj, where Φi=QTΨ̃i for each i. This defines a basis of L MOs ϕk(i) and corresponding twinned ones ϕ¯k(i), forming together a basis of 2L mutually orthonormal elements for which the original geminal function reads

gi,j=k=1Lλkϕk(i)ϕ¯k(j)ϕ¯k(i)ϕk(j)
(61)

with λk ≥ 0. After these transformations, these MOs can be finally written in the chosen (hybrid) atomic basis,

ϕk(i)=ν=12LPμ,lψμ(i),ϕ¯k(i)=ν=12LP¯μ,lψμ(i)
(62)

by appropriate p × 2L rectangular matrices P and P¯. Then, with no loss of generality, we can assume that the molecular orbitals ϕk(i),ϕ¯k(i) are ranked such that λ1λ2 ≥…≥ λL ≥ 0. The above expression highlights that the most important MOs are those corresponding to the larger values of λk. Therefore, it is possible to constrain the variational freedom by neglecting all orbitals with k > n, yielding the pairing function

gni,j=k=1nλkϕk(i)ϕ¯k(j)ϕ¯k(i)ϕk(j),
(63)

where n is conveniently chosen and is ≪L. This yields the AGPn Ansatz and the ΦAGPn WF, which can be useful to improve the stability of the wave-function optimization. The corresponding algorithm (enabled by setting molopt = −1 in the optimization section input of TurboRVB), based on projection operators in the space of the n molecular orbitals considered, has been described extensively in Ref. 15. Moreover, in the original paper71 introducing the AGPn, a precise recipe was given to improve the evaluation of the binding energies. Indeed, despite a constraint on the variational parameters necessarily increasing the variational energy expectation value, energy differences may actually improve by an appropriate choice of n. In the mentioned work,71 this promising approach was applied with an AGP containing only singlet correlations, but the binding energies were defined without using a rigorous size consistent criterion. This drawback can now be removed, by exploiting the full variational freedom of the Pf WF combined with a general spin-dependent Jastrow factor (see, e.g., Fig. 14). Work is in progress in this interesting research direction.

The variational optimization of an AGP with a fixed number n of molecular orbitals can be easily generalized to the Pf case, by exploiting that the constrained Pf WF, dubbed Pfn, can be written either in the canonical form with MOs, as in Eq. (63), or in the localized basis set expansion, as in Eq. (58), with a corresponding matrix Aμ,νn=k=1nλkPμ,kP¯ν,kPν,kP¯μ,k. According to Eq. (63), an arbitrary small variation δgn of the constrained pairing function gn reads

δgni,j=k=1nδλkϕk(i)ϕ¯k(j)ϕ¯k(i)ϕk(j)+k=1nλkδϕk(i)ϕ¯k(j)δϕ¯k(i)ϕk(j)+k=1nλkϕk(i)δϕ¯k(j)ϕ¯k(i)δϕk(j)
(64)

and therefore satisfies the following property, as will be shown later,

(ÎL)δgn(ÎR)=0,
(65)

where Î is the identity operator, L and R are projection operators over the occupied MOs, i.e., L2(i, j) = ∫ dkL(i, k)L(k, j) = L(i, j), and similarly, R2 = R, where here and henceforth the shorthand integration symbol dk=σkdrk contains implicitly also the spin summation. These operators are then defined as follows:

L(i,j)=k=1nϕk(i)ϕk*(j)+ϕ¯k(i)ϕ¯k*(j),R(i,j)=k=1nϕk*(i)ϕk(j)+ϕ¯k*(i)ϕ¯k(j).
(66)

With the above definitions, Eq. (64) is easily verified because each term of Eq. (64) is annihilated either by the left (ÎL) or the right (ÎR) projection over the unoccupied MOs. Note that L = R in the real case and L = R* in the most general complex case. In this way, in order to implement a constrained variation δgn of the Pfn WF, corresponding to an appropriate variation of its matrix δAμ,νn, it is useful to work with a small free variation δg (with corresponding δAμ,ν). This is then projected onto the chosen restricted Ansatz by means of the following equation:

δgn=δg(ÎL)δg(ÎR).
(67)

Indeed, it is easy to show that the RHS of the above equation vanishes if we apply ÎL and ÎR to its left and its right, respectively, just because ÎR and ÎL are projection operators, being such R and L, yielding (ÎR)2 = (ÎR) and (ÎL)2 = (ÎL), from which Eq. (67) fulfills Eq. (65). Equation (67) represents, therefore, a linear relation applied to the variational parameter matrix change δAμ,ν corresponding to the unconstrained geminal g in Eq. (58), yielding the new constrained variation δAμ,νn. Indeed, by using the definitions of the projector operators in Eq. (66) and the expansion of the MOs in the atomic (hybrid) basis [see Eq. (62)], Eq. (67) turns to a number of matrix–matrix operations acting on δA, P, P¯, and the overlap matrix S that can be easily and efficiently implemented.15 

This linear relation between A and An can be therefore easily implemented together with the corresponding derivatives necessary for the optimization of the energy72 and allows the explicit calculation of the new matrix Aμ,νn, yielding the new constrained geminal gn + δgn. Then, the new geminal can be recasted in the form of Eq. (63) by the mentioned diagonalization of skew-symmetric matrices, in this way implicitly neglecting nonlinear contributions that are irrelevant close to convergence, when δgn → 0. After employing several iterations of this type, the lowest energy Ansatz of the JPfn type can be obtained in a relatively simple and very efficient way.

Finally, it is also important to emphasize that this constrained optimization algorithm allows the reduction of the number of parameters by efficiently exploiting locality, namely that variational parameters Aμ,ν corresponding to atoms at a distance larger than a reasonable cutoff (an input named rmax in TurboRVB) can be safely disregarded with negligible error.15 

An important special case of the AGPn and Pfn Ansätze discussed in Sec. III D is when we constrain the pairing function gn to use the minimum possible number of molecular orbitals providing a non-zero WF. The minimum number for an unpolarized system with N electrons is equal to the number of electron pairs, that is, n = N/2. The WF obtained in this way starting from the AGP is indeed the single Slater determinant (SD),15,71 and we dub it ΦSD. In principle, also the Pfn WF with n = N/2 corresponds to a single Slater determinant with spin dependent molecular orbitals, a case that has never been considered so far, but it represents an available option within the most recent versions of TurboRVB.

Similarly, in a polarized system having Nu spin up electrons and Nd spin down electrons (we assume Nu > Nd), the SD Ansatz is obtained by using NuNd unpaired electrons and using a constrained pairing function gn with n = Nd/2.

In Secs. III B and III C, it has been mentioned that the Pfaffian and the AGP Ansätze have a multiconfigurational character despite the fact that they can be evaluated at the cost of a single determinant. In this section, we expand the AGP Ansatz in terms of Slater determinants to show this aspect explicitly. In order to simplify the derivation, we will consider here a simplified case, while the most general case could be studied with a similar approach but involving more cumbersome expressions.

We consider the real AGPs Ansatz [Eq. (53)] for an unpolarized system of N = 2Np electrons (i.e., Nu = Nd = N/2). The symmetry implies that the twin molecular orbitals ϕk(i),ϕ¯k(i) appearing in the RHS of Eq. (61) have the same spatial part, which we denote ϕ̃k(r), modulus a sign (because they are orthonormal), and they have an opposite spin part. Without the loss of generality, we can assume that ϕk is spin up and ϕ¯k is spin down. Given this convention, the scalar product lk between the spatial parts of ϕk and ϕ¯k will either be +1 or −1. We define λ̃k=lkλk, where λk is the same as the one in the RHS of Eq. (61). Note that λ̃k=λk, so λ̃ are ranked in decreasing order of their absolute value (i.e., λ̃kλ̃k+1). The pairing function can then be written as

gi,j=k=1Lλ̃kϕ̃k(ri)ϕ̃k(rj).
(68)

This expression is useful for comparing with the standard CI expansion.

It is convenient now to use the second quantization notations in order to simplify the derivation. In particular, we indicate with âk, (âk,) the operator that creates an electron of spin up (down) in the orbitals ϕ̃k and satisfies the canonical anticommutation relations. We can write the pairing functions g(i,j)i,j|ĝ|0, where 0 is the vacuum, i,j is the WF of a system with one electron with coordinates (ri, σi) and another with coordinates (rj, σj), and ĝ is the operator

ĝ=k=1Lλ̃kâk,âk,,
(69)

where λ̃k is the one defined above. Using this notation, it is easy to show (see Appendix of Ref. 67) that the pairing function g is equivalent to the complete active space of two electrons on the L molecular orbitals ϕ̃k.

Within this notation, the AGP WF is

|ΦAGPs=ĝp|0,
(70)

where pN/2 is the number of electron pairs. If we substitute in ĝp the expansion for ĝ in Eq. (69), after having conveniently defined the operator b^kâk,âk,, which created an electron pair on the orbital ϕ̃k, and having noted that b^kb^l=b^lb^k and b^k2=0 (as following from the anticommutation relations of the âk operators), we obtain that

ĝp=p!1i1<i2<<ipLλ̃i1λ̃i2λ̃ipb^i1b^i2b^ip.
(71)

The chosen order of the λ̃k coefficients implies that the leading term in ĝp is given by the term with (i1, …, ip) = (1, …, p).73 Thus, we can rewrite ĝp in terms of this leading term and excitations with respect to it,

ĝp0p!i=1pλ̃i=i=1pb^i|0+j=1pq=p+1Lλ̃qλ̃ji=1pijb^ib^q|0+1j<kpp<q<rLλ̃qλ̃rλ̃jλ̃ki=1pikijb^ib^qb^r|0+.
(72)

In the above expansion, we shall recognize that the first element is a single Slater determinant of the molecular orbitals ϕ̃ii=1,,N/2; the second element is the summation of all possible double excitations going from an orbital jN/2 to an orbital q > N/2; the third element is the summation of a subset of the possible quadruple excitations and so on. In other terms, AGPs can be written as the zero seniority74 subset of the CI expansion, having some constraints on the coefficients of the expansion.

It is worth mentioning that AGPs are also reliable in cases where a single Slater determinant is not a good reference, as, for instance, in the case of a broken covalent bond, where the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) are degenerate (see Fig. 5). In fact, we can have that λ̃p=λ̃p+1, so the two determinants i=1pb^i0 and i=1p1b^ib^p+10 have the same weight, and they both are the leading terms of the constrained zero-seniority expansion in Eq. (72).

FIG. 5.

The twisted ethylene molecule (C2H4) is a prototypical system requiring a multireference method. The ground state geometry is planar (torsional angle equal to zero); C atoms have a double bond, and the HOMO and LUMO orbitals are separated. As the molecule is twisted, one bond is broken and the HOMO–LUMO orbitals become degenerate for a torsional angle of 90°. The upper panel shows evaluations of the torsional barrier as obtained using JSD and JAGP at the VMC and LRDMC level and using a complete active space (CAS) of 12 orbitals in 12 electrons. JAGP provides reliable results, whereas JSD (which neglects the LUMO orbital) overestimates the barrier both at the variational and diffusion levels. The middle panel shows the ratio λ̃LUMO/λ̃HOMO in the JAGP Ansatz and the corresponding value in the CAS calculations. The bottom panel shows the HOMO and LUMO orbitals for the planar and twisted geometries. Readapted with permission from Zen et al., J. Chem. Theory Comput. 10, 1048 (2014). Copyright 2014 ACS.

FIG. 5.

The twisted ethylene molecule (C2H4) is a prototypical system requiring a multireference method. The ground state geometry is planar (torsional angle equal to zero); C atoms have a double bond, and the HOMO and LUMO orbitals are separated. As the molecule is twisted, one bond is broken and the HOMO–LUMO orbitals become degenerate for a torsional angle of 90°. The upper panel shows evaluations of the torsional barrier as obtained using JSD and JAGP at the VMC and LRDMC level and using a complete active space (CAS) of 12 orbitals in 12 electrons. JAGP provides reliable results, whereas JSD (which neglects the LUMO orbital) overestimates the barrier both at the variational and diffusion levels. The middle panel shows the ratio λ̃LUMO/λ̃HOMO in the JAGP Ansatz and the corresponding value in the CAS calculations. The bottom panel shows the HOMO and LUMO orbitals for the planar and twisted geometries. Readapted with permission from Zen et al., J. Chem. Theory Comput. 10, 1048 (2014). Copyright 2014 ACS.

Close modal

Note that in the AGPn Ansatz, we can perform an expansion similar to Eq. (72), but the excitations stop at the orbital n rather than L. Thus, it is straightforward to see that for n = N/2, there are no excitations and the only term is a single Slater determinant.

TurboRVB employs localized atomic orbitals such as the Gaussian type,

ψl,±m,IGaussianr;ζ=rRIleζrRI2R[(i)1±12Yl,m,Iθ,φ],
(73)

or the Slater type,

ψl,±m,ISlaterr;ζ=rRIleζrRIR[(i)1±12Yl,m,Iθ,φ],
(74)

where the real and the imaginary part (m > 0) of the spherical harmonic function Yl,m,Iθ,φ centered at RI are taken and rewritten in Cartesian coordinates in order to work with real defined and easy to compute orbitals, l is the corresponding angular momentum, and m ≥ 0 is its projection number along the z−quantization axis. The localized atomic orbitals are also present in the one-body and three/four-body Jastrow parts [i.e., denoted as χr in Eqs. (30) and (34)]. One can use standard basis sets with exponents ζ (also coefficients for a contracted basis set) taken from the available standard database such as the Basis Set Exchange75 or from other more specific references when using pseudopotential.76–81 A Python wrapper, named Turbo-Genius, makes the above procedure much easier, as shown later.

TurboRVB supports pseudopotential calculations both in VMC and LRDMC calculations. Many ECPs have been generated and successfully used in quantum chemistry codes, but they are usually tuned to match Density Functional Theory (DFT) or Hartree–Fock (HF) all-electron (AE) calculations, which are not expected to be optimal for state-of-the-art many-body techniques. Recently, some progress has been made in this direction, and pseudopotentials determined by correlated many-body techniques are also available.76–83 All the pseudopotentials used in QMC employ the standard semi-local form

V^ppIri=VlocIri,I+l=0lmaxVlIri,Im=llYl,mYl,m,
(75)

where ri,I = |riRI| is the distance between the ith electron and the Ith ions, lmax is the maximum angular momentum of the ion I, and l=0lmaxm=llYl,mYl,m is a projection operator on the spherical harmonics centered at the ion I. In TurboRVB, the angular momentum projector is calculated by using standard polyhedral quadrature formulas for the angular integrations.84 As it is now becoming a common practice not only in QMC, both the local VlocIri,I and the non-local VlIri,I functions are expanded over a simple Gaussian basis parametrized by coefficients (e.g., effective charge Zeff and other simple constants), multiplying simple powers of r and a corresponding gaussian term

r2Vlr=kαk,lrβk,lexpγk,lr2,
(76)

where αk,l, βk,l (usually small positive integers), and γk,l are the parameters obtained by appropriate fitting. Several published pseudopotentials have already been tabulated in TurboRVB. Of course, one can also use any pseudopotential employing the semi-local form in the mentioned Gaussian basis with a straightforward little extra work for the input preparation.

The contraction of atomic orbitals has been widely used in quantum chemistry and DFT codes, and was originally introduced to define a pseudo-Slater orbital by combining several primitive Gaussian orbitals. This is also important in the QMC context because it decreases the number of variational parameters by a large factor, as will be shown in this section. Although one can obtain a contracted basis set directly from a database, TurboRVB allows us to prepare a high-quality hybrid basis set (contraction) starting from a given primitive basis, by the so-called “geminal embedding scheme.”85 To this purpose, let us decompose the geminal function in terms of atomic contribution, as far as the dependence over i (the left argument) is concerned,

gi,j=IUprojIi,j=Iμ,νAμ,νIϕμiϕνj,
(77)

where I represents an atom in a system; UprojIi,j is the pairing function projected on the atom I; Aμ,νI is Aμ,ν if μI and Aμ,νI=0 otherwise, where Aμ,ν is assumed to be given for the system under consideration, e.g., obtained by a standard DFT calculation, where in this case, by Eq. (63),

Aμ,ν=k=1ncμkc¯νkc¯μkcνk,
(78)

where cμk and c¯μk are the coefficients of the DFT molecular orbitals ϕk(i)=μcμkϕμ(i) and ϕ¯k(i)=νc¯νkϕν(i) in the atomic basis expansion, respectively.

Quite generally, the projected pairing function can be expanded in a truncated space spanned by q terms,

ŨprojIi,j=k=1qσkψkGEOiψkEnvj.
(79)

In other words, the Schmidt decomposition is applied to the matrix ŨprojIi,j describing the coupling between a given atom I and the environment, within the geminal Ansatz. This procedure defines the so-called geminal embedded orbitals (GEOs), which are determined in terms of an expansion over all atomic orbitals used for the atom I,

ψkGEOi=μIμk,μGEOψμi,
(80)

where ψkGEOi are orthonormal. Following the Schmidt decomposition, it is possible to determine the best GEOs by minimizing the Euclidian distance between the original and the truncated geminal functions,

d2=UprojIŨprojI2.
(81)

Considering all possible unconstrained functions ψkEnvj and employing the steady condition δd2δψkEnvj=0, d2 reads

d2=UprojI2k=1pdidjDprojIi,jψkGEOiψkGEOj,
(82)

where UprojI2=didjUprojIi,j2 and DprojIi,j is the density matrix that reads

DprojIi,j=dkUprojIi,k*UprojIj,k.
(83)

Since the GEOs ψkGEOr are orthonormal and the density matrix is Hermitian, Eq. (82) becomes minimum when the GEOs coincide with the p eigenvectors of the density matrix with the maximum eigenvalues (denoted as wi). The original atomic basis ψμi is usually non-orthogonal, so the problem turns into the generalized eigenvalue equation,

jJAI*SAISi,jμk,jGEO=ωkμk,iGEO.
(84)

The truncation error is readily estimated by the summation of the eigenvalues d2=UprojI2i=1pwp. Since the eigenvalues are sorted in ascending order, a suitably chosen value of p allows the user to neglect the most irrelevant vector components with small eigenvalues wi and work with enough accuracy even with a few GEOs per atom, thus minimizing the number of variational parameters necessary to describe well the system, as will be shown below. With this construction, a new geminal is defined in the GEO basis, namely,

g̃i,j=μ,νÃμ,νψμGEOiψνGEOj,
(85)

where the matrix coefficients Ãμ,ν are given by maximizing the normalized overlap (Q) between the original and the new geminals,

Q=g̃|g2g̃|g̃g|g,
(86)

where g̃|g=didjg̃i,j*gi,j. It turns out that the overlap remains large even for small GEO basis set size p, implying that by using this scheme, one can decrease the number of variational parameters corresponding to the matrix A, i.e., from 4L2 to 4p2 ≪ 4L2.

As described in Sec. III, TurboRVB implements different types of Ansätze: (i) the Pfaffian (Pf), (ii) the Pfaffian with a constrained number of molecular orbitals (Pfn), (iii) the Antisymmetrized Geminal Power (AGP), (iv) the Antisymmetrized Geminal Power with a constrained number of molecular orbitals (AGPn), and (v) the single Slater determinant (SD). One can choose a proper Ansatz depending on a target system, considering the computational cost of a chosen Ansatz and the relevant physical and chemical properties of a target material. During the simulation, a user can go back and forth between the Ansatz using modules implemented in TurboRVB, with/without losing the information of an optimized Ansatz (see Fig. 4): The first case is to add molecular orbitals to an Ansatz, i.e., JAGP ⇒ JSD, JAGP ⇒ JAGPn, or JPf ⇒ JPfn. In TurboRVB, this is obtained by rewriting the expansion of the geminal in terms of molecular orbitals (see Secs. III D and III E). The corresponding tool is convertfort10mol.x. The second important case is to convert an Ansatz among the available ones, i.e., JSD, JAGP, or JAGPn ⇒ JAGP. In TurboRVB, this is the purpose of the convertfort10.x tool and is achieved by maximizing the overlap between the two WFs, one of them being the input (fort.10_in) and the other being the type (fort.10_out) to be filled by new geminal matrix coefficients (result written in fort.10_new). In more details, in TurboRVB, the following overlap between two geminals is maximized:

maxQ=gnew|gori2gnew|gnewgori|gori,
(87)

in order to obtain new geminal matrix coefficients Aμ,νnew, defining the new pairing function as

gnewi,j=μνAμνnewψμnewiψνnewj,
(88)

while the original geminal was given in terms of the parameter matrix Aμ,νori,

gorii,j=μ,νAμ,νoriψμoriiψνorij.
(89)

Note that 0 ⩽ Q ⩽ 1; therefore, the larger the Q, the better the conversion, and Q approaches the unit value if the conversion is perfect. For this type of conversion, one can also apply the geminal embedding scheme to construct a hybrid basis set, as described in Sec. III I.

The final case is to convert a JAGP Ansatz to JPf. Since the JAGP Ansatz is a special case of the JPf one, where only Gud and Gdu terms are defined, as described in Sec. III C, the conversion can be realized just by direct substitution. Therefore, the main challenge is to find a reasonable initialization for the two spin-triplet sectors Guu and Gdd that are not described in the JAGP and that otherwise have to be set to 0. There are two possible approaches:62,69: (i) for polarized systems, we can build the Guu block of the matrix by using an even number of unpaired orbitals {ϕi} and build an antisymmetric guu by means of Eq. (63), where the eigenvalues λk are chosen to be large enough to occupy certainly these unpaired states, as in the standard Slater determinant used for our initialization. Again, we emphasize that this works only for polarized systems. (ii) The second approach that also works in a spin-unpolarized case is to determine a standard broken symmetry single determinant Ansatz (e.g., by the TurboRVB built-in DFT within the LSDA) and modify it with a global spin rotation. Indeed, in the presence of finite local magnetic moments, it is often convenient to rotate the spin moments of the WF in a direction perpendicular to the spin quantization axis chosen for our spin-dependent Jastrow factor, i.e., the z quantization axis. In this way, one can obtain reasonable initializations for Guu and Gdd. TurboRVB allows for every possible rotation, including an arbitrary small one close to the identity. A particularly important case is when a rotation of π/2 is applied around the y direction. This operation maps

|12|+|and|12||.
(90)

One can convert from an AGP the pairing function that is obtained from a VMC optimization

gud(i,j)=fS(ri,rj)2+fT(ri,rj)+2
(91)

to a Pf one

gud(i,j)gi,j=fS(ri,rj)2+fT(ri,rj).
(92)

This transformation provides a meaningful initialization to the Pfaffian WF that can then be optimized for reaching the best possible description of the ground state within this Ansatz.

The application of TurboRVB is not limited to open systems such as atoms and molecules. TurboRVB can also simulate bulk systems in large supercells with arbitrary twisted boundary conditions. These are used to minimize finite-size effects and represent quite an important approach86,87 in order to reach a meaningful and accurate thermodynamic limit.

For periodic system calculations, the many-body WF should satisfy the many-body Bloch condition,88,89

Ψksr1,,ri+Ts,,rN=eiksTsΨksr1,,ri,,rN,
(93)

which follows from the property that the many-body Hamiltonian is invariant under the translation of any electron coordinate by a simulation-cell vector Ts, where Ts = la + mb + nc is determined by arbitrary integers l, n, m and the three vectors a, b, and c define the supercell.

In TurboRVB, a single-particle basis set satisfies the following condition:

ψl,m,IPBCr+Ts;ζ=eiksTsψl,m,IPBCr;ζ,
(94)

where ks is a twist vector [ks=ksx,ksy,ksz] and Ts represents an arbitrary simulation cell vector. Note that the use of a non-vanishing twist vector generally makes a many-body WF complex. TurboRVB implements the CRYSTAL periodic basis,15,90,91

ψl,m,IPBCr;ζ=Tsψl,m,Ir+Ts;ζeiksTs,
(95)

where ψl,m,I is a non-periodic real atomic orbital such as Gaussian [Eq. (73)] and Slater [Eq. (74)]. The use of Gaussian or Slater orbitals that rapidly decay far from nuclei guarantees that the above summation converges fast with a finite small number of Ts. Note that, in TurboRVB, Ts are not limited to orthorhombic ones, but has recently been extended to include all possible crystal translation groups (e.g., rhombohedral, hexagonal triclinic).

The same procedure is applied to the basis set for the Jastrow part, although using simple periodic boundary conditions,15 because the twists do not affect the Jastrow part of the WF, namely,

χl,m,IPBCr;ζ=Tsχl,m,Ir+Ts;ζ,
(96)

which satisfies χl,m,IPBCr+Ts;ζ=χl,m,IPBCr;ζ.

Moreover, the homogeneous one-body and two-body Jastrow factors have to be appropriately periodized because they are not defined in terms of the above periodic basis. Namely, the homogeneous one-body Jastrow part [Eq. (30)] should satisfy

J̃1r+Ts=J̃1r,
(97)

and the two-body Jastrow part [Eq. (32)] should fulfill

J2r1σ1,,ri+Tsσi,,rNσN=J2r1σ1,,riσi,,rNσN.
(98)

In order to satisfy the above constraints, we consider the relative electron–nuclei or electron–electron coordinate differences rd, necessary to evaluate J1 and J2, respectively, and expand them as

rd=raa+rbb+rcc,
(99)

where ra, rb, and rc are appropriate transformed coordinates, which are conveniently defined within a cube of unit length because of the assumed periodicity of the supercell, namely, |ra|, |rb|, |rc| < 1/2. As a consequence, this mapping makes the physical electron–electron and electron–ion distance periodic by definition (i.e., they refer to the minimum distance image of the supercell). However, there may be divergences or singularity at the boundaries of this unit cube. Therefore, before computing the distance corresponding to rd, these coordinates are transformed ra,rb,rcr¯a,r¯b,r¯c=p(ra),p(rb),p(rc) by use of an appropriate function p(x), with at least a continuous first derivative for |x| < 1/2. This function is chosen to preserve the physical meaning at short distances, i.e., p(x) = x in these cases and nonlinear elsewhere, in order to satisfy not only the periodicity but also the requirement of continuous first derivatives of the many-body WF Ψks. We have, therefore, defined p(x) as follows:

px=x(14<x<14)18(1+2x)(12x14)18(12x)(14x12).
(100)

Indeed, although the modified relative distance diverges (i.e., rd) at the edges of the Wigner–Seitz cell (e.g., r=±12a,±12b,±12c), the exponential [Eq. (31)] and the Padé [Eq. (33)] functions remain finite, ur1/2bea and vσi,σjri,j1/4beepara or 1/2beeanti, respectively, thus preserving with continuity the periodicity of the one-/two-body homogeneous parts of the Jastrow factor. However, for the Padé form, one has to change the expression for p(x) in order to satisfy also the continuity in the WF derivatives, even when the modified relative distances diverge, i.e.,

px=x(16<x<16)154(1/2+x)2(12x16)154(1/2x)2(16x12).
(101)

In this case, the region where p(x) = x further shrinks. Thus, it is often more convenient to use the exponential form to obtain a more accurate variational WF because the long-range part is implicitly corrected by the inhomogeneous terms in Eq. (34).

Finally, we remark that the many-body WF also obeys the second Bloch condition,88,89 namely,

Ψkpri+Tp=eikpTpΨkpri,
(102)

where Tp represents a unit-cell (not supercell) vector and kp is the crystal momentum. This comes from the property that the many-body Hamiltonian is invariant under the simultaneous translation of all-electron coordinates by a unit-cell vector Tp. Within TurboRVB, this condition can be employed by imposing the intra-unit cell translational symmetries on the Jastrow and the pairing function as simple linear constraints in the variational parameters. However, this option is restricted to the case kp = 0. On the other hand, it is possible to use different twists on each spin component, which has proven very effective for implementing the mentioned translation symmetries within pairing WFs.92 

The systematic error induced by a finite simulation cell is a long-standing issue in the ab initio QMC calculation. There are two types of finite-size errors in QMC calculations: one is the so-called one-body effect that arises from the kinetic energy term of the Hamiltonian and the other is the so-called two-body effect that arises from the periodic Ewald contribution resulting from the electron–electron interaction. Note that, in the independent-particle calculations (i.e., DFT), only the former is present, which can be readily evaluated by k summation in the first Brillouin zone, but the two-body finite-size effects are not present because the exchange-correlation energy used in DFT usually derives from DMC results extrapolated to the infinite simulation cell size. To correct the one-body finite-size error, one can use the twisted averaged boundary conditions,86 special k-points methods,88,89 or the exact special twist (EST) method,87 all of which have been implemented in TurboRVB. Within the TurboRVB implementation, the Jastrow part is independent of twists (i.e., TurboRVB uses a common Jastrow for all twists),

Ψksr1,,rN=Jr1,,rNΦASksr1,,rN.
(103)

As emphasized in Ref. 86, at variance with DFT, the QMC computational effort is independent of the number of k-points used. For the two-body finite-size effects, which cannot be alleviated by the above remedy, one can employ the model periodic Coulomb (MPC) interaction.93–95 This method has not been implemented in TurboRVB yet. Nevertheless, simpler alternatives exist. For instance, one can alleviate the two-body finite size effects by directly increasing the supercell size, or one can estimate these effects by employing the KZK exchange-correlation function96 at the DFT level. Moreover, it is also possible to employ systematic finite-size corrections based on the knowledge of the density structure factor in momentum space,97 which can be readily computed within TurboRVB, with a short postprocessing computation.

Although most QMC codes load their trial WFs from available DFT/quantum chemistry codes such as Gaussian,98 CRYSTAL,91 and Quantum Espresso,99TurboRVB is a self-consistent many-body package and does not require input from other codes. Indeed, in TurboRVB, an original DFT code is implemented that has the important feature to work in exactly the same basis and pseudopotentials (if any) used for the QMC many-body WF and, for instance, can work with mixed Slater–Gaussian basis or with arbitrary contraction, i.e., mixing angular momenta. On the other hand, the drawback is that only very few functionals are available so far because, within the TurboRVB approach, the DFT has to be used only to have a reasonable initialization of the antisymmetric part of the many-body WF. This is then optimized at best by direct minimization of the energy in the presence of the Jastrow factor and should very weakly depend on the initialization or, for instance, on the DFT functional used. At present, TurboRVB supports Perdew–Zunger (PZ81) Local-Spin Density Approximation (LSDA)100 and those combined with KZK exchange-correlation energy96 but, for instance, does not allow LDA+U calculations. This will be the goal of a possible future work.

An important issue in any DFT algorithm based on localized basis sets, such as Gaussian, is how to reach the so-called complete basis set limit. In the plane-wave method, this is systematically achieved by increasing the number of plane waves (e.g., kinetic cutoff energy). However, with a localized basis set, by increasing its dimension, numerical instabilities may occur as a consequence of the redundancy and non-orthonormality of the basis. On the other hand, the use of a small basis set implies too much biased results. In TurboRVB, an original and systematic algorithm to remove this redundancy101 has been implemented. When the basis set is non-orthogonal, the key self consistent step for solving the DFT Kohn–Sham equations turns to a generalized eigenvalue problem,

Hc=ESc,
(104)

where H is the single-particle Hamiltonian matrix, c is the vector of the coefficients of the orbitals, E is the single-particle energy, and the S is the Nb × Nb overlap matrix, where Nb is the dimension of the basis set,

Si,j=ϕi|ϕj=drϕi*rϕjr.
(105)

The overlap matrix is strictly positive definite, namely, all its eigenvalues are positive. The problem mentioned above shows up when the overlap matrix S becomes ill-conditioned. If the condition number, namely, the ratio between the largest (denoted as sNb) and the smallest eigenvalues (denoted as s1), becomes very large, the CBS limit cannot be achieved by the standard procedure due to numerical instabilities. Therefore, in TurboRVB, the small eigenvalues and the corresponding eigenvectors are disregarded. Indeed, the original basis is modified as follows:

eji=1siυji  for  si/sNεmach,
(106)

where si and υji are the ith eigenvalues and eigenvectors of the overlap matrix, εmach is usually set to the machine precision, j runs from 1 to Nb, i runs from 1 to Mb, and Mb represents the number of eigenvalues satisfying the above inequality. If infinite numerical accuracy were available, the matrix S in the new basis would be just the identity and the generalized eigenvalue problem in Eq. (104) would turn to a standard one in this basis. Within finite precision arithmetic, it is better to iterate further the stabilization procedure and define a new basis,

ẽji=1s̃iυ̃ji,
(107)

where s̃i and υ̃ji are the ith eigenvalues and eigenvectors of the recomputed overlap matrix s̃i,j=ei|S|ej (now well conditioned since very close to the identity), respectively. For using the latter basis set, the Nb × Mb global transformation matrix is stored as it takes into account the projection from the original basis to the final modified one,

Ui,j=kejkẽki.
(108)

On this basis set, the generalized eigenvalue problem becomes a well-conditioned one with a corresponding Mb × Mb Hamiltonian matrix H̃=UHŨ. In this way, one can avoid the numerical instabilities induced by a too redundant large basis set. This stabilization introduces an error at most εmachNb, which is typically negligible compared with the target chemical accuracy.

Another feature of the DFT code implemented in TurboRVB is that electron–nuclei cusp conditions are exactly fulfilled for any basis (i.e., Gaussian orbital) even within the DFT framework. This is achieved by an appropriate modification of the standard basis sets commonly used (e.g., ccpVTZ)102 for WF based calculations: the new basis is obtained by multiplying each element of the original basis by a suitably chosen one-body Jastrow factor introducing the correct cusps, namely,

ϕ̃jbrRb=ϕjbrRbJ̃1r,
(109)

where J̃1r is the same as in Eq. (29) and the parameter b in Eq. (31) is optimized by direct minimization of the chosen DFT energy functional. In this way, each element of the reshaped basis set satisfies the so-called electron–ion cusp conditions, namely that, when r is close to any atomic position Rb, the following relation holds:

limrRbϕ̃jaϕ̃ja=ZbrRb|rRb|
(110)

for all a, b. This formulation allows us to reach the CBS limit extremely fast in DFT calculations, especially within all electrons. Indeed, we have experienced that a given target accuracy can be obtained with a smaller basis, e.g., our modified ccpVDZ basis is typically equivalent in accuracy to the much larger ccpVTZ (and the ccpVTZ equivalent to the ccpVQZ, and so on and so forth).

For QMC application, this tool is particularly useful because the Slater determinant obtained with this new basis does not have divergences in the local energy, which are instead present in the original basis, when electron positions are close to the nuclear ones. This feature emphasizes further the clear advantage of TurboRVB to allow the use of a special purpose DFT code, just devoted to the optimal initialization of a QMC many-body electronic WF.

As described in Sec. V B, TurboRVB employs standard atomic orbitals, modified by means of an appropriately chosen one-body factor. Thus, the electron–electron integrations cannot be evaluated analytically even when the Gaussian atomic basis set is employed. Therefore, TurboRVB calculates the electron–electron Hartree potential by solving Poisson’s equation with the fast Fourier transform (FFT) on a Cartesian grid both in open and periodic systems. The numerical solution becomes problematic in all-electron calculations when the atomic number Z becomes large. Indeed, in this case, the Kohn–Sham Hamiltonian matrix elements have to be computed in the presence of a rapidly varying electron density in the vicinity of nuclei and therefore require an exceedingly large FFT mesh, with an almost prohibitive computational cost for its evaluation. TurboRVB alleviates this drawback by a double-grid scheme,103 in which the Hartree potential is calculated with standard FFT convolution on a coarse mesh. Then, these values are interpolated on a much finer mesh in the vicinity of the nuclei. Although the DFT energy obtained with the above approximation is not exactly consistent with the one corresponding to a very dense uniform mesh, the corresponding VMC energies and the variances are almost indistinguishable to each other, at a much cheaper computational cost.

Derivatives of total energies with respect to variational parameters represent an essential ingredient for optimizing a many-body WF. Forces (derivative with respect to atomic positions) are also essential for performing structural optimization or molecular dynamics. However, in a complex code, and especially in QMC, the evaluation of the functional derivatives, necessary for the WF optimization, is very complicated, mainly for the complexity of the algorithm that, in turn, may lead to a very inefficient implementation, although recent progress has been done.104 For instance, a simple approach is to compute them with finite difference expressions, leading to a too large computational time, as obviously proportional to the number of targeted derivatives.

Algorithmic differentiation (AD) is a method capable of solving all the above problems, essentially by a smart application of the differentiation chain rule. There are two types of algorithms: the forward algorithmic differentiation (FAD) that implements the chain rule straightforwardly from the beginning to the end of the algorithm and the adjoint algorithmic differentiation (AAD) that uses the chain rule starting from the end of the algorithm (also known as backward propagation). When the number of input parameters is much larger than the corresponding output ones, AAD is much more efficient than FAD and indeed allows the calculation of all possible derivatives in a computational time proportional (with a small prefactor; see, e.g., Fig. 6) to the one for computing the target function (i.e., the energy or the WF value). The former is therefore the ideal method for quantum Monte Carlo when the variational WF contains many variational parameters. The AAD was applied for the calculation of atomic forces in Ref. 41. To our knowledge, this was the first time that AAD was used within QMC. Now, TurboRVB implements all derivatives such as those of the local energy [αkeLxi] or those corresponding to the WF logarithm [αklnΨxi] using the AAD, which drastically improves the efficiency (Fig. 6)41 and reliability of the calculation.

FIG. 6.

Ratio of CPU time required to compute energies and all force components referenced to the one required for the simple energy calculation within VMC. The calculations refer to 1, 2, 4, and 32 water molecules. The inset is an expansion of the lower part of the plot. Reprinted with permission from S. Sorella and L. Capriotti, J. Chem. Phys. 133, 234111 (2010). Copyright 2010 AIP Publishing LLC.

FIG. 6.

Ratio of CPU time required to compute energies and all force components referenced to the one required for the simple energy calculation within VMC. The calculations refer to 1, 2, 4, and 32 water molecules. The inset is an expansion of the lower part of the plot. Reprinted with permission from S. Sorella and L. Capriotti, J. Chem. Phys. 133, 234111 (2010). Copyright 2010 AIP Publishing LLC.

Close modal

The derivatives of the energy with respect to a given real variational parameter αk (one complex parameter can be thought to be composed by two real ones, its real and imaginary part, respectively) is represented as a generalized force,

fk=Eααk=αkΨα|H^|ΨαΨα|Ψα.
(111)

In variational Monte Carlo, the derivative can be evaluated using M configurations of electron coordinates,15 

fk=2RxeL*xOkxŌkΨαx2xΨαx22R1Mi=1MeL*xiOkxiŌk,
(112)

where eLx is the local energy, Okx is the logarithmic derivative of the WF [i.e., Okx=lnΨαxαk], and Ōk is its average over M samples [i.e., Ōk=1Mi=1MOkxi]. In TurboRVB, the logarithmic derivative [Okx] is computed very efficiently by using the AAD algorithm. Note that the derivatives of the local energy are not needed here because the Hamiltonian does not depend on any variational parameter. Instead, these terms are necessary in order to calculate ionic forces (i.e., derivatives of the total energies with respect to atomic positions). If the WF is an exact eigenstate of the Hamiltonian, the generalized forces fk exactly vanish without statistical errors because the local energy is no longer dependent on x. In other words, the derivatives have the zero-variance property and therefore represent the fundamental ingredients for an efficient WF optimization, as described in Sec. VII.

In practice, during an optimization, the code monitors the variational energy [Eα] and the maximum value of the signal to noise ratio among all the force components, which is denoted as devmax in the code,

devmaxmaxkfkσfk,
(113)

where σfk represents the estimated error bar of the force fk. This value is used in TurboRVB as one of the convergence criteria of the optimization. To our experience, after a reasonable number of iterations, devmax stabilizes to a value ≤4.0, when a correct (possibly local) minimum is being approached.

In TurboRVB, a very useful space warp coordinate transformation (SWCT)41 is employed for the calculation of atomic forces. This scheme was introduced several decades ago in Ref. 105, in order to decrease the statistical errors of forces. The finite-difference expression for ionic force calculations is

Fa=ΔΔRaE=ERa+ΔRaERaΔRa+OΔRa.
(114)

The SWCT is used to mimic the displacement of charges around the nucleus,

r¯i=ri+ΔRaωari,
(115)
ωar=κrRab=1MκrRb,
(116)

where κr is a function that decays sufficiently fast for large r because the charges far from the nuclei should not be affected by the SWCT, and ω → 0, as a consequence of this requirement. It turns out that the original105 simple choice κr=1/r4 works very well and is indeed adopted in TurboRVB.

Starting from the finite-difference expression, one can straightforwardly derive the corresponding differential expression,41 

Fa=ddRaeL+2eLddRalnJ12ΨeLddRalnJ12Ψ,
(117)

where J is the Jacobian of the above transformation and the brackets indicate a Monte Carlo average over the trial WF. All the terms above can be written by the partial derivatives of the local energy and those of the logarithm of the WF,41 

ddRaeL=RaeL+iωaririeL,
(118)
ddRalnJ12Ψ=RalnΨ+iωaririlnΨ+12riωari.
(119)

In order to evaluate these differential expressions, 6N + 6Nat components have to be evaluated, namely, rieL, rilnΨ, RalneL, and RalnΨ.41 These values are very efficiently computed in TurboRVB, by using the aforementioned AAD, which works even in the presence of pseudopotentials.

The SWCT significantly decreases the statistical errors, but the forces still have infinite variance properties because ∂eL and lnΨ diverge in the vicinity of the nodal surfaces. Attaccalite and Sorella106 developed a reweighting method to address this issue in an unbiased way. Within this scheme, the QMC sampling is not driven by the chosen trial WF ΠTx=ΨT2x but by a slightly different guiding function ΨGx defined by

ΨGx=RεxRxΨTx,
(120)

where Rx vanishes more weakly than the trial function does, when the configuration x approaches the nodal surface, whereas Rε is an appropriately chosen regularization of R, depending on ε, that never vanishes. In this way, the guiding function is larger when approaching the nodal surface, and this singular region can be more accurately sampled in order to avoid infinite variance problems in the calculation of forces, as will be shown later. For this purpose, Rx is defined in a way to vanish as the antisymmetric part of the WF to some power 2θR ≤ 1. However, in order to avoid too large fluctuations, a simple relation is used that Pf(G)1Gi,j1, where Gi,j is the pairing function [Eq. (45)], here referred to as the general AGP case (the other cases straightforwardly follows upon some restriction of the matrix G). Indeed, if the Pfaffian goes to zero, most of the matrix elements of the inverse of G diverge inversely proportional to it. Thus, the quantity

Rx=Si,jGi,j12θR
(121)

satisfies the requirement without depending explicitly on the full Pfaffian that has fluctuations exponentially large in the number of particles and would lead to an inefficient regularization.106 Here, we have also introduced the scaling factor S that takes into account that the WF may vanish also when a single particle is going to infinity, and not because it is approaching the nodal surface. In this case, the scaling factor has to vanish in a proper way so as to allow a non-vanishing R. In this way, the guiding function can decay sufficiently fast in this limit and can be normalized in an open system. This represents a necessary condition to have a stable simulation; otherwise, all electrons will be kicked out at infinity, after a long Markov chain. Therefore, the factor S is chosen to vanish as long as one raw of the pairing function vanishes because it is corresponding to an electron going to infinity (all matrix elements containing such an electron coordinate have to vanish),

S=minij|Gij|2.
(122)

This definition is also useful because the regularization proposed is scale-invariant, namely, Rx remains unchanged if G is scaled by an arbitrary constant and therefore is adopted as is also for bulk systems, and it is particularly important when, in these cases, there exist large regions of almost negligible electronic density.107 

After that, Rεx is defined as

Rεx=maxRx,ε,
(123)

which is much simpler than the original proposal reported in Ref. 106. By using the new probability, forces (the HF and the Pulay) can be evaluated with finite variance as

WxFxΠGx/WxΠGx,
(124)

where Wx is the new weight,

Wx=ΨTx/ΨGx2RxmaxRx,ε2
(125)

and Fx, according to Eq. (117), indicates the average of an appropriate random variable containing either the derivative of the local energy (the HF contribution) or the product of the local energy and the log derivative of the WF (the Pulay one), both contributions diverging as 1δ2, where δ is the distance from the nodal surface. The point is that since in the vicinity of the nodal surface, Wx is proportional to δ4θR, whereas the probability vanishes much slower as ΠGxδ24θR, the reweighting scheme solves the infinite variance problem for 1/4 < θR ≤ 1/2 because the random variable WxFx remains with finite variance. This follows after simple inspection of its integrability in the small δ region, i.e., dδδ4θR2<. Thus, both the HF and the Pulay terms can be evaluated with finite variances in the TurboRVB evaluation of atomic forces. In the most recent versions of TurboRVB, the default value of θR is taken to be at the middle of the interval of stability, i.e., θR = 3/8, because the value θ = 1/2, previously chosen,106 leads to some instability; namely, during the QMC sampling, the nodal surface is approached too much closely, and the determinants or Pfaffians become very ill-conditioned with obvious problems of numerical accuracy. Moreover, the value of ε is automatically selected during the first steps of the simulation in a way that the average reweighting factor Wx0.8, which represents empirically an almost optimal setting.

Finally, we remark that this method to compute the energy derivatives with finite variance is applied by default in TurboRVB even for the optimization of the WF variational parameters and that a similar approach can be used when computing atomic forces within the LRDMC algorithm.

Once the energy derivatives can be computed, the most straightforward strategy to optimize a WF is to employ the steepest descent method, where the WF parameters are iteratively updated as follows:

αkαk=αk+δαk,
(126)
δαk=Δfk,
(127)

where Δ is a small constant and fkEαk is the generalized force already defined in Eq. (111). However, it does not work well when optimizing highly non-linear WF parameters because a small change in a given variational parameter may produce a very different WF, whereas another parameter change may weakly affect the WF. Of course, one can use more sophisticated methods such as the Newton–Raphson method, the conjugate gradient, and the quasi-Newton method, but the straightforward implementation of these optimizations does not work efficiently within a stochastic approach such as QMC. In order to overcome this difficulty, a more efficient change in the variational parameters has been defined by means of a positive-definite preconditioning matrix S and the generalized force vector f,

αkαk+ΔS1fk,
(128)

where the matrix S is stochastically evaluated by means of M configuration samples x=x1,x2,,xM,

Sk,k=1Mi=1MOkxiŌk*OkxiŌk,
(129)

where Okxi=lnΨxiαk and Ōk=1Mi=1MOkxi. The resulting approach is the so-called stochastic reconfiguration (SR) method.42 Mazzola et al.108 pointed out that the matrix S is essentially a metric for the parameter space, measuring the distance of the underlying normalized WF. Therefore, Eq. (128) is simply the steepest descent in this curved manifold. This observation connects the SR method with the so-called natural gradient method, widely used in the context of deep learning.109 In this context, for each parameter α, ÔkŌk and Sk,k can be interpreted as the score function (i.e., the gradient of the log-likelihood function) and the Fisher information matrix (FIM), respectively, while the WF square Ψx2 plays the role of the likelihood function. In this sense, the stochastic reconfiguration method is essentially identical to the natural gradient optimization with the FIM that has been intensively used in the machine-learning community.

The straightforward implementation of the SR method is not stable mainly because the statistical noise sometimes makes the matrix S ill-conditioned, which deteriorates the efficiency of the optimization method.43 Therefore, in practice, the diagonal elements of the preconditioning matrix S are shifted by a small positive parameter (ε) as

si,i=sii(1+ε).
(130)

This modification improves the efficiency of the optimization by several orders of magnitude, as shown in Fig. 7. Finally, the variational parameters are updated as

αkαk+ΔS1fk.
(131)

In practice, the user should provide two input parameters, namely, Δ (denoted as tpar in the code input variables) and ε (denoted as parr).

FIG. 7.

Optimization of the variational WF in the simple one-dimensional Heisenberg model H=JiSiSi+1 with the standard SR (ε = 0, open circles) and with the present regularization (ε = 0.001, open triangles). The evolution of the nearest neighbor spin–spin (Sz) Jastrow parameter is plotted. The figure clearly shows that the SR method with regularization is several orders of magnitude more efficient than the standard SR for determining the variational parameter with a given statistical accuracy. The inset shows the first few iterations. Reprinted with permission from S. Sorella, M. Casula, and D. Rocca, J. Chem. Phys. 127, 014105 (2007). Copyright 2007 AIP Publishing LLC.

FIG. 7.

Optimization of the variational WF in the simple one-dimensional Heisenberg model H=JiSiSi+1 with the standard SR (ε = 0, open circles) and with the present regularization (ε = 0.001, open triangles). The evolution of the nearest neighbor spin–spin (Sz) Jastrow parameter is plotted. The figure clearly shows that the SR method with regularization is several orders of magnitude more efficient than the standard SR for determining the variational parameter with a given statistical accuracy. The inset shows the first few iterations. Reprinted with permission from S. Sorella, M. Casula, and D. Rocca, J. Chem. Phys. 127, 014105 (2007). Copyright 2007 AIP Publishing LLC.

Close modal

In TurboRVB, the state-of-the-art QMC optimization method is also implemented, namely, the so-called linear method.44–46 In this scheme, a many-body WF is expanded up the linear order (i.e., considering only the first derivatives for the variational parameters),

|Ψα+δα=|Ψα+k=1δαkαk|Ψα|Ψ0+k=1δαk|Ψk,
(132)

where αk is the kth variational parameter and Ψkx is the first derivative with respect to the kth variational parameter. Within this expansion, the expectation value of the energy reads

Eα+z=k,k=0pzk*zkΨk|H^|Ψkk,k=0pzk*zkΨk|Ψk.
(133)

In order to obtain the vector z that minimizes this expectation value, one should solve the generalized eigenvalue problem,

Hz=ESz,
(134)

where Hk,k=Ψk|H^|Ψk and Sk,k=Ψk|Ψk.

This algorithm is more conveniently implemented in the so-called semi-orthogonal basis,15 

|Ψα+δα=z0|Ψ̃α+k=1zkÔkŌk|Ψ̃α,
(135)

where |Ψ̃α=|Ψα/Ψα, x|Ôk|x=δx,xOkx, Okx=αklnΨαx, and Ōk = ⟨Ψα|Ôkα⟩/⟨Ψαα⟩. TurboRVB calculates z by solving the generalized equation [Eq. (134)] with the matrices,

Hk,k1Mi=1MOkxiŌk*xi|H^Ok̂Ōk|Ψαxi|Ψα,
(136)
Sk,k1Mi=1MOkxiŌk*OkxiŌk,
(137)

which can be readily evaluated by a Monte Carlo sampling, using not only the random variables Ok(xi) necessary for the simpler SR technique but also the parameter derivatives of the local energy αkeL(xi), which can be directly computed by AAD, and thus allow the calculation of the above Hamiltonian matrix elements, by straightforward algebra,

αkeL(xi)=αkxi|H^|Ψαxi|Ψα=xi|H^Ok^Ōk|Ψαxi|Ψα.
(138)

Note that the code does not always take the eigenvector corresponding to the lowest eigenvalue (that in turn may be also complex, as in the linear method the matrix Hk,k is not Hermitian) but takes the one that maximizes z02 (i.e., the coefficient of the zeroth-order term) in order to have the most stable parameter change, namely, the updated WF being as close as possible to the old one. Finally, the code updates the variational parameters by using the obtained z and an input parameter Δ (denoted as tpar) as

αkαk+Δzk/z0,
(139)

where Δ ≃ 1/3 is the default TurboRVB choice that is much more stable than the original algorithm (Δ = 1), but it is approximately 1/3 slower. The linear method is usually rather unstable for large systems and many parameters, and also, in such cases, each iteration requires a diagonalization of a huge matrix (a task that can be often prohibitive). In TurboRVB, a practical compromise has been devised by using the linear method for a restricted variational space containing up to a maximum number (npbra) of variational parameters with the largest signal to noise ratio [optionally larger than parcutpar, see Eq. (113)] and/or a number (ncg) of global line parameter directions spanned by the natural gradient ones (e.g., Δ is one variational parameter of this form in Eq. (131)] calculated at the given optimization iteration or the previous closest ones,43 optionally neglecting the ones with a smaller signal-noise ratio (<parcutmin).

In most optimizations, the number of samplings in VMC is much larger than the number of variational parameters and the optimization is stable and efficient. In the opposite case, the preconditioning matrix S becomes rank-deficient singular; therefore, the optimization does not work properly (cf. the Cramér–Rao inequality). A practical rule is to set the number of samples M such that15 

M510×p,
(140)

where p is the number of variational parameters. However, one can also employ a very small number of samplings with sufficiently large ε in the stochastic reconfiguration method because the regularization of the matrix S in Eq. (130) can remove all instabilities as long as ε is sufficiently large, as in the infinite ε limit one recovers the very powerful stochastic gradient technique well-known within machine learning. This method seems promising for optimizing even a huge amount of variational parameters and could certainly represent an important development in QMC.

At present, the inversion in Eq. (131) can be done in TurboRVB without storing explicitly the matrix110 as the associated linear problem is solved iteratively using conjugate gradients and implicit small matrix-vector operations, optimally distributed within the MPI protocol. Thus, the cost of this inversion becomes negligible in the large ε limit and linear with the number of variational parameters.

In TurboRVB, several types of ab initio molecular dynamics (MD) have been implemented for both classical and quantum nuclei. The MD is driven by ionic forces F = −RV(R), where F ≡ {F1, …, FN}, R ≡ {R1, …, RN}, and the potential energy landscape V(R) is evaluated by VMC, namely,

V(R)=ΨR|H^(R)|ΨRΨR|ΨR.
(141)

In Eq. (141), |ΨR⟩ is the QMC WF that, according to the Born–Oppenheimer (BO) approximation, minimizes the expectation value of H(R) at each ionic position R. The expression for the ionic forces is quite complex. Computing these forces with finite variance and in a fast way, as done in TurboRVB (Sec. VI B), is of paramount importance to make a QMC-based MD possible. Moreover, in the QMC framework, the evaluation of V(R) and its corresponding force estimators are intrinsically noisy. The statistical noise must be kept under control if one wants to have an unbiased sampling of the phase space during the propagation of the trajectory. This issue has been solved by resorting to a Langevin type of molecular dynamics in both classical and quantum formulations, where the QMC noise becomes a controlled source of thermalization at the target temperature T.

Two types of Langevin dynamics (LD) have been implemented in TurboRVB: the second-order LD, in its classical (Sec. VIII A) and quantum (Sec. VIII B) variants, and the first-order LD accelerated by the covariance matrix of QMC forces (Sec. VIII C).

The equations of motion of the second-order LD read106 

υ̇=γRυ+FR+ηt,
(142)
Ṙ=υ,
(143)
ηitηjt=Si,jRδtt,
(144)

where R, υ, f, and η are the 3Nat-dimensional vectors representing atomic positions, velocities, and deterministic and stochastic forces of Nat atoms, written in mass-scaled coordinates,

Ri=Ri0mi,Fi=Fi0/mi,ηi=ηi0mi
(145)

for i = 1, …, Nat. The stochastic forces are related to the friction matrix γ through the fluctuation–dissipation theorem, namely,

SR=2TγR,
(146)

with the temperature T expressed in atomic units. TurboRVB exploits the freedom in Eq. (146), by assuming

SR=s0I+Δ0SQMCR,
(147)

where s0 and Δ0 are constants, I is the identity, and SQMCR is the covariance matrix of QMC forces,

Si,jQMCR=FiRFiRFjRFjR.
(148)

In the above equation, refers to the average over the QMC sampling at the given MD step. The friction, a quantity that controls the nuclear sampling efficiency, is, therefore, position-dependent now. Luo et al. have shown that the force covariance matrix Si,jQMCR is, within a good approximation, proportional to the dynamical matrix (Fig. 8). Therefore, the choice in Eq. (148) realizes the nearly optimal damping of the high-frequency modes. After discretizing Eqs. (142) and (143) in the time interval tnτ/2 < t < tn + τ/2, where the index n denotes the time slice tn = , with time step τ, and by integrating the above equations, one obtains

υn+1t=eγnτυn+ΓnFn+η̃,
(149)
Rn+1=Rn+τυn+1,
(150)

with FnFRn, γnγRn,

Γn=γn1Ieγnτ,
(151)

and

η̃=γn2sinhγnτ/2tnτ/2tn+τ/2eγnttnηtdt.
(152)

By using Eq. (152), it is easy to show that the correlator defining the discrete noise is given by the following 3Nat × 3Nat matrix:

η̃iη̃jS*=Tγn2cothγnτ/2.
(153)

In this way, η̃ fulfill the fluctuation–dissipation theorem required by the Langevin thermostat. However, part of this noise is already present in the QMC force evaluation (η̃QMC), as quantified by Eq. (148). Hence, TurboRVB adds the external random force η̃ext such that η̃=η̃ext+η̃QMC. This is generated according to the following correlator:

η̃iextη̃jext=S*SQMC,
(154)

in order to fulfill the original fluctuation–dissipation theorem. Since the correlation matrix in Eq. (154) is positive definite as long as τ < Δ0, the correlated external force can be readily generated by standard algorithms (e.g., the Box–Muller method112 combined with the matrix diagonalization). This noise correction plays an important role in targeting a stable temperature even in the presence of statistical errors in the QMC forces. The original numerical integration detailed here has been further improved by Mazzola and Sorella (see the supplementary notes of Ref. 113).

FIG. 8.

Eigenvectors of the 3 × 3 correlation matrix of QMC forces SQMC in the water monomer. The obtained eigenvectors correspond to the well-known three vibrational modes: bending (red arrows), symmetrical (blue arrows), and asymmetrical (green arrows) stretching. The smaller eigenvalue corresponds to the lowest-frequency vibrational mode. The eigenvalues in the plot are all rescaled by the lowest one. Reprinted with permission from Y. Luo, A. Zen, and S. Sorella, J. Chem. Phys. 141, 194112 (2014). Copyright 2014 AIP Publishing LLC.

FIG. 8.

Eigenvectors of the 3 × 3 correlation matrix of QMC forces SQMC in the water monomer. The obtained eigenvectors correspond to the well-known three vibrational modes: bending (red arrows), symmetrical (blue arrows), and asymmetrical (green arrows) stretching. The smaller eigenvalue corresponds to the lowest-frequency vibrational mode. The eigenvalues in the plot are all rescaled by the lowest one. Reprinted with permission from Y. Luo, A. Zen, and S. Sorella, J. Chem. Phys. 141, 194112 (2014). Copyright 2014 AIP Publishing LLC.

Close modal

Mouhat et al. have recently introduced the nuclear quantum effects into the second-order LD based on a path integral molecular dynamics (PIMD) approach,114 driven by QMC nuclear forces. As a starting point of the Feynman path integral (PI) theory, the quantum partition function Z=TreβH (with β = 1/kBT) can be written as115 

Z1(2π)3NatLdNatLpdNatLqeτβHL(p,q),
(155)

with τβ = β/L denoting the imaginary time step. We have replaced here the true quantum particles by fictitious classical ring polymers, consisting of L replicas (beads) of the system. The beads in these necklaces are connected to each other by harmonic springs, evolving in the imaginary time. Without the loss of generality, we can extend the definition of classical vectors to the quantum case, by including all L replicas, such that the resulting vector qR(1),,R(i),,R(L)T is 3NatL-dimensional. If interpreted classically, the partition function Z in Eq. (155) describes a system at the effective temperature LT. The Hamiltonian HL corresponding to this quantum-to-classical isomorphism reads as

HL(p,q)=i=1Natj=1L12[pi(j)]2+12ω̃L2Ri(j)Ri(j1)2+j=1LV(R1(j),,RNat(j)),
(156)

written in mass-scaled coordinates [Eq. (145)], with ω̃L=L/β, and subjected to the ring boundary condition Ri(0)Ri(L). The system is thermalized with a Langevin thermostat such that the related Liouville operator L can be written as

L=i=13NatLFipi+piqij=13NatLγijpipj+kBTMpipj
(157)

in mass-scaled coordinates, with qiqi. L is built upon the Hamiltonian propagation, driven by the first two terms, and the Langevin thermostat, represented by the last two, deriving from the Fokker–Planck equation. Hereafter in this section, i and j run over all particles and beads indexed together. In Eq. (157), FiFiBO+Fiharm is the total force acting on each replica, comprising the BO (intra-replica) and harmonic contributions (inter-replicas), where FiBOqiṼ and Ṽ=j=1LV(R1(j),,RNat(j)).

The quantum-to-classical isomorphism Hamiltonian in Eq. (156) includes very different energy scales. To cope with this, we split the Liouvillian in Eq. (157) into two operators, one containing only the physical (BO) modes and the other depending exclusively on the harmonic modes of the rings. To do so, we first separate the friction matrix into two contributions,

γ=γBO+γharm.
(158)

We can then rewrite the total Liouvillian as the sum of two terms, L=LBO+Lharm, where

Lharm=iFiharmpi+piqijγijharmpipj+kBTMpipj,
(159)
LBO=iFiBOpijγijBOpipj+kBTMpipj,
(160)

in such a way that we can break up the evolution operator via a Trotter factorization115 to get the following symmetric propagator:116,117

eLδteLBOδt/2eLharmδteLBOδt/2.
(161)

The equations of motion corresponding to the propagator in Eq. (161) have been implemented in TurboRVB, where now FiBO can be computed in a QMC framework, and therefore, it can be affected by an intrinsic noise, as was in the classical scheme (Sec. VIII A).

We note that the equations of motion generated by Lharm are linear in both p and q, thus exactly solvable in an analytic closed form. This is because the non-linear BO components of the total force have been put in the LBO factor. The dynamics generated by Lharm is a quantum Ornstein–Uhlenbeck process, which describes the motion of Brownian quantum particles under the influence of friction.

The exact integration of Lharm without further splitting the Langevin thermostat from the harmonic modes of the rings is a major achievement, which gives the algorithm an enhanced stability and better scaling with respect to the time step size τ and the number of replica L. Moreover, the time step error is significantly smaller than the one of other PILD methods.118 Thus, this is a key feature of the PI algorithm implemented in TurboRVB. This is also why the implemented method is called path integral Ornstein–Uhlenbeck dynamics (PIOUD).

If we now look at the LBO factor in Eq. (161), the corresponding equations of motion are the ones in Eq. (12) of Ref. 111. Indeed, only the momenta are evolved in LBO, and the resulting equations are equal to those of the simple classical Langevin algorithm introduced in Ref. 106, restricted to p, which in the present case is a 3NatL-dimensional vector. As in the classical second-order QMC-driven LD, the QMC noise affecting the BO forces is controlled, thanks to the friction γBO and the fluctuation–dissipation theorem. A noise reduction scheme similar to the one described in Eqs. (153) and (154) is also applied in this case, yielding a thermalization toward a steady temperature ensemble even for quantum particles.

We refer the interested reader to Ref. 114, where we report a detailed description of the PIOUD algorithm and the integrated equations of motion.

TurboRVB also features an accelerated first-order dynamics to sample the (classical) ionic canonical distribution. This molecular dynamics scheme has been introduced recently by Mazzola and Sorella.119 Here, the ions formally undergo the following dynamics:

Ṙ=S1RFR+η,
(162)
ηiηj=2TδttSi,j1R,
(163)

where T is the temperature, FR and η are the deterministic and stochastic forces, and S is the covariance matrix of the QMC forces, as introduced in Eq. (148). This dynamics generalizes the Newton method, which is valid for structural relaxation, to finite-temperature.

Indeed, the use of the matrix S, which is empirically proportional to the Hessian matrix, can drastically decrease the autocorrelation time of the Markov chain. In other words, this metric plays an important role in treating the different time scales of a given system. In this way, one can still use a quite large integration time step τ, which is beneficial in reducing the autocorrelation time of quantities that depend on slowly varying degrees of freedom (e.g., molecular diffusion), without introducing any bias in the sampling.

Discretization of Eq. (162) for a finite time step τ at finite temperature is rather involved and has been described in Ref. 119 and reads

Rt+τ=Rt+2Tτzt+τS1RFRS1RSRtτSR2RtτRt,
(164)

with

zitzjt=Si,j1Rt.
(165)

As the effective temperature depends on the finite integration time, the convergence to the target temperature Ttarget for τ → 0 can be improved by a renormalization of the temperature T used in the simulation, namely, by using the above equations with an appropriate choice of T,

T=Ttarget(1aτ),
(166)

where a = 1/2 in the case of the harmonic potential leads to the exact result for any discretization time τ, as long as S is the exact elastic matrix. In the general case, the parameter a should be tuned to speed up convergence to the τ → 0 limit. Conversely, the target temperature of the simulation can be measured a posteriori, as is often done in the case of second-order Langevin dynamics, by measuring the average squared velocities of the particles.

The accelerated first-order Langevin dynamics has been successfully employed in the most recent studies of dense liquid hydrogen, providing equilibrated simulations even close to phase transitions.119,120 Moreover, nuclear quantum effects can also be computed within this first-order LD, in exactly the same way as was implemented for the second-order case discussed in Subsection VIII B. However, so far, it is not clear which method is more efficient, and further studies are also necessary to clarify what is the optimal choice of the acceleration matrix S(R) within the QMC framework.

Table I shows the main modules implemented in TurboRVB and a brief description of their functionalities. All programs reported in Table I are coded in Fortran90. The flow-chart of a typical QMC calculation (see Fig. 1) is detailed as follows: starting from the geometry and the chemical composition, the user first chooses the basis set for both Jastrow and antisymmetric factors and generates a JAGP-type Ansatz using makefort10.x (fort.10 is the wave function filename in TurboRVB).

TABLE I.

Main modules in TurboRVB.

ModuleDescription
makefort10.x Generates a JAGP WF from a given basis set and structure 
assembling_pseudo.x Prepares pseudopotentials 
convertfort10mol.x Adds molecular orbitals. If the number of molecular orbitals is equal to (larger than) half the number of electrons in a system, the resultant WF is the JSD (JAGPn) 
convertfort10.x Converts a wave function type (fort.10_in), i.e., JSD/JAGP/JAGPn/JPf/JPfn, to another one with a different type (fort.10_out). The output (fort.10_new) is as close as possible to the input and may also include an optimal contracted (hybrid) basis set. 
convertfortpfaff.x Converts a JAGP WF to a JPf one 
prep.x Performs a DFT calculation 
turborvb.x Performs VMC optimization, VMC evaluation, LRDMC, structural optimization, and molecular dynamics 
readforward.x Performs correlated samplings and calculates various physical properties 
ModuleDescription
makefort10.x Generates a JAGP WF from a given basis set and structure 
assembling_pseudo.x Prepares pseudopotentials 
convertfort10mol.x Adds molecular orbitals. If the number of molecular orbitals is equal to (larger than) half the number of electrons in a system, the resultant WF is the JSD (JAGPn) 
convertfort10.x Converts a wave function type (fort.10_in), i.e., JSD/JAGP/JAGPn/JPf/JPfn, to another one with a different type (fort.10_out). The output (fort.10_new) is as close as possible to the input and may also include an optimal contracted (hybrid) basis set. 
convertfortpfaff.x Converts a JAGP WF to a JPf one 
prep.x Performs a DFT calculation 
turborvb.x Performs VMC optimization, VMC evaluation, LRDMC, structural optimization, and molecular dynamics 
readforward.x Performs correlated samplings and calculates various physical properties 

The antisymmetric part of the WF is initially determined at the DFT mean-field level. Since the DFT calculation is based on a single Slater determinant, the generated JAGP WF should be converted to a JSD one. This is performed by the convertfort10mol.x module, which adds M molecular orbitals to the WF, where M = N.121 The DFT molecular orbitals are then used to construct the initial trial WF for subsequent QMC calculations. In QMC, the user can choose to work with five Ansätze, namely, JSD, JAGP, JAGPn, JPfn, and JPf. The initial antisymmetric part of the JSD and JAGPn wave functions can be directly obtained from the DFT orbitals. In the JSD case, only the occupied orbitals are imported, while in the JAGPn WF also the unoccupied orbitals will be used in the geminal expansion, up to the nth orbital. Another possibility is to employ a JAGP Ansatz. In that case, one has to convert the initial JSD trial WF obtained by DFT to the JAGP one using convertfort10.x.

JAGPn can also be obtained by applying convertfort10mol.x to a previously determined JAGP Ansatz. Analogously, a JPf WF can be converted from a JAGP WF using convertpfaff.x. JPfn can be obtained by applying convertfort10mol.x to the JPf Ansatz. These conversions can be done either before or after a QMC optimization. The possibilities for Ansatz conversion are schematically drawn in Fig. 4.

After the initial determination of the antisymmetric part of the WF, one performs the WF optimization at the VMC level. This is an important step used to determine the best variational parameters of the Jastrow factor at fixed determinant or to fully relax the WF by optimizing both Jastrow and determinant parameters.

Finally, the user can proceed to VMC and LRDMC calculations using the optimized WF. In LRDMC, in order to remove the lattice discretization error, it is better to repeat the calculations with several lattice spaces a and to extrapolate the results with a quartic function Ea=E0+k1a2+k2a4, where E0 is the extrapolated (a → 0) energy.

The Python tool Turbo-Genius makes all the above steps more straightforward, as discussed later.

TurboRVB has been parallelized using the MPI and OpenMP libraries, also supporting the hybrid MPI/OpenMP protocol. As well known, the QMC algorithm is readily parallelized by employing many independent random walks (called walkers), initialized with different seeds for random number generators. Figure 9 shows the strong and the weak scaling performances of TurboRVB, which are compared with the ideal scaling. The benchmarks have been measured using the conventional 2 × 2 × 2 diamond supercell containing 64 carbon atoms for a simulation with 256 electrons on Marconi KNL nodes (Cineca supercomputer/SCAI, Intel Xeon Phi 7250 CPU, 1.40 GHz, 68 cores per node without hyperthreading). For the weak scaling case, we set the number of walkers M to the minimum possible one (one walker per core) and increased the number of cores from 68 (one node) to 27 200 (400 nodes). On the other hand, for the strong scaling case, we fixed the number of walkers M = 2176 and increased the number of cores from 544 (8 nodes) to 8704 (128 nodes) with the hybrid MPI/OpenMP protocol (the number of openMP threads was set to four). For both cases, the performances are excellent because all techniques are off from the ideal scaling by ∼5% at worst in all cases performed here. At present, the code does not support the use of Graphics Processing Unit (GPU) accelerators. This feature is currently under development.

FIG. 9.

(a) The weak and (b) strong scalings of TurboRVB. The benchmarks were measured using the conventional 2 × 2 × 2 diamond with the ccECP pseudopotential (256 electrons in the simulation cell) on Marconi KNL nodes (Cineca supercomputer/SCAI).

FIG. 9.

(a) The weak and (b) strong scalings of TurboRVB. The benchmarks were measured using the conventional 2 × 2 × 2 diamond with the ccECP pseudopotential (256 electrons in the simulation cell) on Marconi KNL nodes (Cineca supercomputer/SCAI).

Close modal

A very efficient and easy-to-use correlated sampling technique has been implemented in TurboRVB. This is very useful in evaluatingtotal energy differences between two very similar WFs. One of its practical sides is to investigate whether a conversion of a WF (e.g., JSD → JAGP) has been successful. A user should run a short VMC calculation using one of the two mentioned WFs, during which the walkers’ history is stored in an appropriate scratch area (turborvb.scratch). After that, a second run should be executed (by using the readforward.x module) to determine at the end the correlated energy difference and the overlap between the two WFs. Correlated samplings for LRDMC or for the difference in other quantities, e.g., forces, have not been implemented yet. The latter could be very important for the frozen phonon calculation based on the ionic force difference method, which represents a possible subject of research.

TurboRVB enables us to calculate various properties using the optimized many-body WF at both the VMC and LRDMC levels by averaging local observables and applying the forward walking technique.15 The observables presently available in the readforward.x module can be categorized as follows: (i) Charge density [ρr=ρr+ρr] and spin density [ρσr=(ρrρr)/2], fundamental properties obtained by a stochastic evaluation of multi-dimensional integrals, involving the many-body WF and improved estimators, such as those proposed by Assaraf, Caffarel, and Scemama.122Figure 10 shows the charge density of the square four hydrogen (H4) depicted by Genovese et al. (ii) Electronic correlation functions, such as the charge–charge and spin–spin structure factors, useful to study the critical behavior close to a phase transition or the physical properties of a given phase. (iii) The expectation values of the spin component along the quantization axis (Sz) and the spin square (S2) operators inside a sphere centered on each atom,62,69 whose radius can be specified by the user. (iv) The Berry phase: zα=Ψ|ei2π/Lαi=1Nriα|Ψ, where the complex polarization is computed for the α = {x, y, z} component riα of the position vector ri and Lα is the supercell length in the same direction α.123 This property can be used to characterize the metal–insulator transition based on the many-body WF, as has been done in the one-dimensional hydrogen chain by Stella et al. (Fig. 11).

FIG. 10.

The xy-plane charge density of the square four hydrogen (H4), calculated using a JAGP WF. Reprinted with permission from C. Genovese, A. Meninno, and S. Sorella, J. Chem. Phys. 150, 084102 (2019).124 Copyright 2019 AIP Publishing LLC.

FIG. 10.

The xy-plane charge density of the square four hydrogen (H4), calculated using a JAGP WF. Reprinted with permission from C. Genovese, A. Meninno, and S. Sorella, J. Chem. Phys. 150, 084102 (2019).124 Copyright 2019 AIP Publishing LLC.

Close modal
FIG. 11.

(a) Electronic localization length λN divided by the interatomic distance a as a function of a, for the one-dimensional hydrogen chain, where λN is defined as λN=L/2πlnzN2/N, N is the number of electrons (hydrogen atoms), and L is the length of the simulation cell along the one dimensional chain. (b) Modulus of the complex polarization zN as a function of the interatomic distance of the hydrogen atoms. According to the previous works,125,126 in the thermodynamic limit (N → 0), a metal can be characterized by a vanishing modulus of the complex polarization (zN0, λN → +), while it becomes unity in the insulating case (zN1, λN → 0). Thus, one can discuss the metal–insulator transition. Reprinted with permission from Stella et al., Phys. Rev. B 84, 245117 (2011). Copyright 2011 APS.

FIG. 11.

(a) Electronic localization length λN divided by the interatomic distance a as a function of a, for the one-dimensional hydrogen chain, where λN is defined as λN=L/2πlnzN2/N, N is the number of electrons (hydrogen atoms), and L is the length of the simulation cell along the one dimensional chain. (b) Modulus of the complex polarization zN as a function of the interatomic distance of the hydrogen atoms. According to the previous works,125,126 in the thermodynamic limit (N → 0), a metal can be characterized by a vanishing modulus of the complex polarization (zN0, λN → +), while it becomes unity in the insulating case (zN1, λN → 0). Thus, one can discuss the metal–insulator transition. Reprinted with permission from Stella et al., Phys. Rev. B 84, 245117 (2011). Copyright 2011 APS.

Close modal

Since the start of the TurboRVB project in 2003 by Casula and Sorella,65 the code has been applied to study atomic species, molecular systems, and various materials, including challenging systems such as large complexes, surfaces, and liquids. Here, we provide a brief review of the applications done so far.

TurboRVB has been employed to study the properties of several molecular systems, including small diatomic systems (such as Li2,127 Be2,71,128 B2,71 C2,62,71 N2,71 O2,69,71,127,129 F2,71 Na2,103 LiF,71 CN,71 and Fe2130), reactive oxygen species (singlet O2, O2, OH, OH, NO, NO, HOO, HOO, and cis and trans HOOO),129 aromatic molecules (benzene127,131 and oligoacene series,132 see Sec. XII D), the water molecule,107,127 dimer133,134 and hexamer clusters,134 the Zundel ion (H5O2+),114,135 H2S, SO2, NH3, PH3,111 and others. Most of the studies report both variational and FN diffusion Monte Carlo results, and several of the WF Ansätze available in TurboRVB are tested. Often, it is shown that quite good results are already obtained at the variational level, and the FN DMC further improves the accuracy of the outcomes. For instance, see the case of the sodium dimer in Fig. 12.

FIG. 12.

PES of the sodium dimer. The dashed lines show the obtained Murrell–Sorbie function. The error bars are within the markers. The obtained LRDMC PES agrees with the CCSD(T) and experimental ones. Reprinted with permission from K. Nakano, R. Maezono, and S. Sorella, J. Chem. Theory Comput. 15, 4044 (2019). Copyright 2019 ACS.

FIG. 12.

PES of the sodium dimer. The dashed lines show the obtained Murrell–Sorbie function. The error bars are within the markers. The obtained LRDMC PES agrees with the CCSD(T) and experimental ones. Reprinted with permission from K. Nakano, R. Maezono, and S. Sorella, J. Chem. Theory Comput. 15, 4044 (2019). Copyright 2019 ACS.

Close modal

Table II and Fig. 13 report the binding energies for the molecular dimers of the first-row atoms, evaluated using LRDMC with the JSD, JAGP, or JPf WF Ansätze as guiding functions. Some of them are close or reach the chemical accuracy (i.e., 1 kcal/mol or ∼0.04 eV). A systematic improvement going from the JSD → JAGP → JPf can be appreciated. It follows from the employment of a more flexible parametrization of the WF, which improves both the variational energy and the quality of the nodes. However, the improvement given by the JPf Ansatz over JAGP and JSD is even more fundamental than what Table II and Fig. 13 indicate, where the reportedbinding energies are evaluated as the difference between the energy of the molecule and twice the energy of the atom. This evaluation does not highlight that JPf is a size-consistent Ansatz, while both the JSD and the JAGP have a size-consistency issue with most of the molecules considered.136 Let us consider the oxygen molecule as an example: the O2 ground state is a spin triplet (  3Σg) which, in both JSD and JAGP, can be described using two unpaired electrons with the same spin. However, as we take the oxygen atoms far apart (say, the O–O system), they should both be in the ground state for the oxygen atom, which is also a spin triplet (3P). Neither JSD nor JAGP has the flexibility to describe the O–O system by the same WF parametrization used for O2;137 thus, EO-O > 2EO. In contrast, JPf has the flexibility to describe the dissociation limit correctly, so EO-OJPf=2EOJPf. Figure 14 shows that JPf is indeed able to provide a reliable dissociation curve of the oxygen molecule, whereas JSD and JAGP are reliable only in proximity of the minimum. Note that the FN projection scheme alleviates in part the limitations of the underlying Ansatz as both JSD and JAGP have a smaller size-inconsistency at the LRDMC than at the VMC level, but the correction is incomplete and the inconsistency is sizable (>1 eV) for all but the JPf Ansatz. Therefore, in general, JPf will yield a better description of the overall potential energy surface (PES), including binding energies, vibrational properties, zero-point motion, and transition states.

TABLE II.

Binding energy of the first-row dimers calculated by TurboRVB with LRDMC and different types of Ansätze as a guiding function (GF), at the experimental bond length (see Ref. 138). Js denotes the symmetric (i.e., spin-independent) Jastrow factor. LRDMC calculations are performed with several lattice spaces a and extrapolated for a → 0. The spin–orbit coupling and the zero point energies71 are not included in these evaluations.

Binding energy (eV)Li2Be2B2C2N2O2F2
LRDMC (GF = JSD) 0.976(6) 0.144(7) 2.83(1) 5.74(2) 9.67(1) 4.94(3) 1.27(1) 
LRDMC (GF = JsAGPs) 0.9812(12) −0.0270(12) 2.66(1) 6.01(1)a 9.91(1)a 5.06(2)a 1.56(1) 
LRDMC (GF = JPf) 1.0580(12) 0.0304(22) 2.75(1) 6.31(1)a 9.97(1)a 5.127(6)a 1.64(1) 
Est. exact 1.06(4)b 0.1153(3)c 2.91(6)d 6.43(2)e 9.902(3)e 5.233(3)e 1.693(5)e 
Binding energy (eV)Li2Be2B2C2N2O2F2
LRDMC (GF = JSD) 0.976(6) 0.144(7) 2.83(1) 5.74(2) 9.67(1) 4.94(3) 1.27(1) 
LRDMC (GF = JsAGPs) 0.9812(12) −0.0270(12) 2.66(1) 6.01(1)a 9.91(1)a 5.06(2)a 1.56(1) 
LRDMC (GF = JPf) 1.0580(12) 0.0304(22) 2.75(1) 6.31(1)a 9.97(1)a 5.127(6)a 1.64(1) 
Est. exact 1.06(4)b 0.1153(3)c 2.91(6)d 6.43(2)e 9.902(3)e 5.233(3)e 1.693(5)e 
a

See Ref. 69.

b

See Ref. 139.

c

See Ref. 140.

d

See Ref. 141.

e

See Ref. 142.

FIG. 13.

Binding energies of the first-row dimers at their experimental bond length (see Table II). The vertical axis shows the difference between the LRDMC and the estimated exact binding energies. The results obtained from the DMC using a Jastrow correlated full-valence-CAS as trial WF [DMC(JFVCAS)] are taken from Ref. 138.

FIG. 13.

Binding energies of the first-row dimers at their experimental bond length (see Table II). The vertical axis shows the difference between the LRDMC and the estimated exact binding energies. The results obtained from the DMC using a Jastrow correlated full-valence-CAS as trial WF [DMC(JFVCAS)] are taken from Ref. 138.

Close modal
FIG. 14.

O2 dissociation curve as obtained with TurboRVB at the variational (VMC) and fixed-node projected (LRDMC) levels of theory and using JSD with DFT orbitals (JDFT), JAGP, and JPf WF Ansätze.

FIG. 14.

O2 dissociation curve as obtained with TurboRVB at the variational (VMC) and fixed-node projected (LRDMC) levels of theory and using JSD with DFT orbitals (JDFT), JAGP, and JPf WF Ansätze.

Close modal

The applications of TurboRVB are not limited to single point evaluations. As discussed in Sec. VI, a distinctive feature of TurboRVB among most QMC codes is the efficient evaluation of atomic forces. This capability has been extensively exploited to investigate several properties of the PES, such as vibrational frequencies, and to perform geometry optimizations and molecular dynamics simulations at finite temperature. It is important to remember that QMC atomic forces are, as the QMC energy, affected by a stochastic error. In a first attempt143 to evaluate vibrational frequencies of a triatomic molecule (namely, water), a grid of points close to the guessed minimum geometry was used to evaluate energy and forces. The results were used to fit the PES around the minimum with a truncated Taylor expansion, yielding evaluations of the structural minimum and of both the harmonic and anharmonic vibrational frequencies (using the second-order perturbation theory). It was observed that the inclusion of the estimates of forces reduces the stochastic error on the vibrational frequencies by around an order of magnitude with respect to the fit using only the energy evaluations (on the same grid and using the same sampling size on each single point evaluation). This corresponds to about a 25 times larger speedup.144 Moreover, it was observed that the optimal grid is a trade-off between maximizing the precision on the vibrational properties and minimizing the errors on the fit. A systematic study of the quality of the variational WF Ansätze in relation to the vibrational (and other) properties of the water molecule is reported in the work of Zen et al.107 Afterward, Luo et al.111,145 automated the process of finding the optimal grid by performing fits on samples generated by a finite temperature molecular dynamics and applied the approach to H2O, H2S, SO2, NH3, and PH3. The obtained vibrational frequencies are in good agreement with the experimental values.

Zen et al.134 applied the second-order LD to study the structural properties of liquid water. They obtained radial distribution functions in good agreement with recent neutron scattering and X-ray experiments, especially for the position of the oxygen–oxygen peak, which is very sensitive to the quality of the water description and, in particular, to the inclusion of dispersive vdW forces. This is a stringent test for ab initio approaches as the shape of the peak can be directly compared with neutron scattering data.

Mouhat et al. developed a path integral LD formalism particularly suited for QMC and applied it to the Zundel ion (H5O2+), with the aim of studying nuclear quantum effects.114 The outcome of their study shows how essential is the inclusion of nuclear quantum effects to properly deal with proton transfer in water and aqueous systems, even at room temperature.

An important application for QMC is the description of systems interacting through weak van der Waals (vdW) dispersive interactions, which are fundamental in supramolecular chemistry, layered and porous materials, molecular crystals, adsorption of molecules on surfaces, and so on. vdW interactions play a major role also in non-covalent intra-molecular bonds, which result from the interplay of forces of different nature, including electrostatic contributions. An accurate evaluation of the overall strength of these non-covalent bonds requires each component of the interaction to be modeled correctly. This is a big challenge for all ab initio approaches, and only a few methods prove reliable. QMC and CCSD(T) are considered to yield very accurate estimates, and there is a general agreement among the predictions from these two theories. Thus, they are typically used to produce benchmark values. Traditionally, mean-field schemes have difficulties in describing dispersive interactions because they arise from non-local electron correlations, completely missed by local density functionals. However, in the context of DFT, there has been an intensive effort in developing new exchange correlation functionals146 to include vdW interactions. This is usually done in a semi-empirical way, often building on the benchmark provided by CCSD(T) or QMC.

The non-covalent interactions are typically a tiny fraction of the total energy of a compound. Similar to the cases discussed in Sec. XII A, also in the evaluation of weak interactions, it is important to use a size-consistent approach. However, very often, the molecules form a closed shell, and most of the times, the intermolecular interactions are among fragments having zero spin. In this case, a single Slater determinant is a size-consistent Ansatz, and so is the JSD. The AGP (without Jastrow) is not size-consistent due to the presence of unphysical charge fluctuations,43,147 but a very flexible (i.e., large) Jastrow factor allows damping the charge fluctuations and recovering the size-consistency in the JAGP. This is of course true also in a subsequent FN-DMC projection, which is indeed equivalent to applying an infinitely flexible Jastrow factor to the trial WF. At the variational level, the JSD Ansatz occasionally leads to better results than the JAGP. Even though JAGP leads to a lower total energy, it is indeed difficult to use a Jastrow factor large enough to damp any charge fluctuation. So, there is a worse error cancellation when energy differences are considered in JAGP. This was observed, for instance, in the water dimer,134 although at the FN level JSD and JAGP lead to the same binding energy. It should also be considered that in the widely used DMC scheme,148 a size-consistency issue was present in any simulation with finite time step τ, until the modification of the “standard” algorithm has been introduced by Zen et al.,149 whose solution removes almost completely the size-consistency error. At variance with DMC, the LRDMC algorithm does not suffer from this bias, and it is always size consistent.

The DMC scheme is commonly used to project a single Slater determinant WF where only the Jastrow factor has been optimized, while the antisymmetric part is filled with DFT orbitals150 (JDFT). The resulting FN energy is usually biased by the nodal surface, kept frozen from previous DFT or lower level mean-field calculations. Optimizing the Jastrow factor and determinant together, which is routinely done in TurboRVB, improves significantly the total energy. However, in the evaluation of non-covalent interactions, there is not yet a clearcut indication of the advantage of the full optimization over JDFT. This is due to the fact that the error cancellation plays a major role. Indeed, it is expected that, for non-covalent interactions, JDFT has almost the same bias in the interacting and non-interacting systems. When pseudopotentials are used, DLA54 can be used to remove any bias given by the Jastrow optimization.

TurboRVB with JDFT, JSD, and JAGP guiding functions has been applied to several interesting systems, such as molecular hydrogen adsorbed on benzene,151 benzene dimers,43 graphite layers,152 water or methanol molecules adsorbed on a clay surface (see Fig. 15),153 and metallic clusters,154 and to evaluate the cohesive energies of B2O3 polymorphs.155 

FIG. 15.

Adsorption of water and methanol on the hydroxyl-terminated and the silicate-terminated faces of kaolinite (side view in the first row and top view in the second row). The adsorbed molecule on kaolinite is depicted in cyan and gray, and the H-bonds are represented by the blue dashed lines. Reprinted with permission from Zen et al., J. Phys. Chem. C 120, 26402 (2016). Copyright 2016 ACS.

FIG. 15.

Adsorption of water and methanol on the hydroxyl-terminated and the silicate-terminated faces of kaolinite (side view in the first row and top view in the second row). The adsorbed molecule on kaolinite is depicted in cyan and gray, and the H-bonds are represented by the blue dashed lines. Reprinted with permission from Zen et al., J. Phys. Chem. C 120, 26402 (2016). Copyright 2016 ACS.

Close modal

Another important QMC application is the study of strongly correlated materials, which represents a challenge for any theoretical method, because in several cases a fully consistent description remains, so far, elusive. In this context, QMC methods have been widely used for solving, as accurately as possible, strongly correlated lattice Hamiltonians, such as, the Hubbard model. Real-space ab initio QMC methods could fill the gap between realistic Hamiltonians and lattice models, by providing an accuracy similar to the one reached in lattice Hamiltonians while keeping the complexity of real materials. TurboRVB has been originally conceived in this perspective, with the idea that one of the most flexible WF Ansätze tested on correlated lattice models,156–160 namely, the generalized version of the resonating valence bond (RVB) WF, could be translated into a successful variational WF for ab initio systems. One of the most appealing features of the RVB WF is that it allows several broken-symmetry phases, such as antiferromagnetic insulators and superconductors, which are present—and sometimes coexisting—in the extremely rich phase diagrams of strongly correlated materials. As introduced in Sec. III, the RVB WF employed in TurboRVB takes the static and dynamic correlation effects into consideration beyond the commonly used Jastrow–Slater determinant (JSD) WF. Therefore, thecode is expected to describe strongly correlated systems more accurately. In particular, among the various families of correlated WFs that go beyond the simplest JSD form, the JAGP/JPfaffian families are very suitable to describe extended systems as they keep a compact and still strongly correlated form even in the thermodynamic limit. Indeed, as detailed in Sec. III, the AGP part is the particle number conserving version of the BCS WF. In the same way, the JAGP variational form can be seen as an efficient formulation of a RVB WF. Thus, superconducting materials including cuprates161 and iron pnictides and selenides162 are prominent applications of the JAGP/RVB WFs. By running extensive VMC calculations for CaCuO2, a parent compound of cuprate high-temperature superconductors, Marchi et al. validated the JAGP description by correctly finding the expected d-wave symmetry of the pairing function.163 Casula and Sorella applied the RVB WF to FeSe, one of the iron-based high-temperature superconductors,164 whose electronic structure and pairing mechanism have been unclear and intensively debated. They determined the symmetry of the superconducting order parameter and the size of its gap entirely from first principles, by analyzing the AGP pairing function (Fig. 16). Busemeyer et al. applied DMC and LRDMC to study the structural and magnetic properties of the normal state of FeSe under pressure and reported that collinear spin configurations are energetically more favorable than other spin patterns, such as ferromagnetic, checkerboard, and staggered dimer, over a large range of pressures.165 

FIG. 16.

(a) Even–even and (b) odd–even components of the AGP pairing function of FeSe at 0 GPa obtained by VMC calculations, where even and odd refer to the parities of the orbitals to the reflection through the Fe plane. The contour plots show ΦRcenter,r with Rcenter set to be the iron lattice site at the center of the supercell, while r spans the plane defined by the 4 × 4 lattice. Red (yellow) balls are iron (selenium) sites. Arbitrary units of blue (red) intensity indicate negative (positive) regions with corresponding magnitude. Reprinted with permission from M. Casula and S. Sorella, Phys. Rev. B 88, 155125 (2013). Copyright 2013 APS.

FIG. 16.

(a) Even–even and (b) odd–even components of the AGP pairing function of FeSe at 0 GPa obtained by VMC calculations, where even and odd refer to the parities of the orbitals to the reflection through the Fe plane. The contour plots show ΦRcenter,r with Rcenter set to be the iron lattice site at the center of the supercell, while r spans the plane defined by the 4 × 4 lattice. Red (yellow) balls are iron (selenium) sites. Arbitrary units of blue (red) intensity indicate negative (positive) regions with corresponding magnitude. Reprinted with permission from M. Casula and S. Sorella, Phys. Rev. B 88, 155125 (2013). Copyright 2013 APS.

Close modal

Aromatic compounds are also successful examples of the RVB theory because they are characterized by resonating C–C bonds that can be efficiently represented by antisymmetrized singlet pairing functions (i.e., antisymmetrized product of geminals or AGP). Casula et al. applied the AGP WF combined with a Jastrow factor to the simplest aromatic compound, that is, the benzene molecule (C6H6) for the first time.127,131 They reported that the inclusion of the resonance between the two possible Kekulé states significantly lowers the VMC energy compared with the one obtained by a single Kekulé WF. They also showed the importance of adding a three-body Jastrow factor to improve the AGP description, just in the spirit of the RVB framework. By selectively switching on the intersite resonances, they proved that the Dewar contributions (i.e., including the third nearest neighbor carbons) improves the description of the resonating valence bond. Genovese et al. have applied a more general AGP WF, i.e., the Pfaffian, to the benzene molecule.69 They reported that the obtained atomization energies are reasonably consistent with the experimental value at the LRDMC level. On the other hand, when starting from JDFT, i.e., from frozen DFT nodes, the obtained atomization energy is severely underestimated in most cases, implying that the optimization of the nodal surface at the VMC level is essential to obtain a correct value for the atomization energy. As mentioned in Sec. XII B, Sorella et al. applied the RVB WF to calculate the binding energy of the face-to-face and displaced parallel benzene dimers43 and obtained the binding energy very close to the experimental value in the displaced parallel case. Dupuy and Casula applied the same RVB WF to study the ground-state properties of the oligoacene series, from anthracene up to the nonacene.132 They found that the ground state obtained by the RVB WF has a weak diradical or polyradical instabilities until the nonacene, which is in contrast with the results previously found by lower-level theories, such as Hartree–Fock, hybrid functionals, and CASSCF in a restricted active space. TurboRVB has been applied not only to simple aromatic molecules but also to honeycomb lattice compounds, in order to study the role of RVB correlations, in other words, to investigate whether the resonance energy is sizable also in extended systems. Marchi et al. studied the nature of chemical bonds in graphene using the RVB WF163 and found that the RVB energy gain becomes extremely small in the thermodynamic limit, i.e., in the infinite lattice. Conversely, Sorella et al. applied the RVB WF to isotropically strained graphene166 (Fig. 17) and found that the RVB effect is crucial in the Kekulé-like dimerized (DIM) phase that has been proposed to become stable under the isotropic strain (Fig. 18). They also reported that the JAGP energy gain with respect to JSD is negligible in the perfect honeycomb structure—as previously found by Marchi et al.—but is extremely important in the DIM phase.

FIG. 17.

The graphene structures studied in Ref. 166. (a) SEM honeycomb, semimetallic; (b) AFI honeycomb antiferromagnetic insulator; (c) DIM dimerized Kekulé-like insulator; and (d) HEX distorted hexagonal insulator. There are two carbons per unit cell in (a) and (b) and six in (c) and (d). tA, tB, and tC schematically denote different hopping integrals magnitudes. Reprinted with permission from Sorella et al., Phys. Rev. Lett. 121, 066402 (2018). Copyright 2018 APS.

FIG. 17.

The graphene structures studied in Ref. 166. (a) SEM honeycomb, semimetallic; (b) AFI honeycomb antiferromagnetic insulator; (c) DIM dimerized Kekulé-like insulator; and (d) HEX distorted hexagonal insulator. There are two carbons per unit cell in (a) and (b) and six in (c) and (d). tA, tB, and tC schematically denote different hopping integrals magnitudes. Reprinted with permission from Sorella et al., Phys. Rev. Lett. 121, 066402 (2018). Copyright 2018 APS.

Close modal
FIG. 18.

Energy gains by the resonance effect, measured by the energy per atom difference between the single determinant Ansatz (Jastrow–Slater determinant WF) and the corresponding multideterminant JAGP WF. The largest energy gain occurs in the DIM state, underlining its resonating valence-bond nature, actually increasing for large strain ε. Small negative values at small strain are finite-size effects. Inset: finite-size scaling of this correlation energy gain in the DIM state at ε = 15%. Reprinted with permission from Sorella et al., Phys. Rev. Lett. 121, 066402 (2018). Copyright 2018 APS.

FIG. 18.

Energy gains by the resonance effect, measured by the energy per atom difference between the single determinant Ansatz (Jastrow–Slater determinant WF) and the corresponding multideterminant JAGP WF. The largest energy gain occurs in the DIM state, underlining its resonating valence-bond nature, actually increasing for large strain ε. Small negative values at small strain are finite-size effects. Inset: finite-size scaling of this correlation energy gain in the DIM state at ε = 15%. Reprinted with permission from Sorella et al., Phys. Rev. Lett. 121, 066402 (2018). Copyright 2018 APS.

Close modal

Diradical and conjugated organic compounds are other interesting applications of the RVB WF because both static and dynamic correlation effects play important roles in these systems. Zen et al. found that the RVB WF gives a correct torsion barrier of the ethylene (C2H4) and the triple-singlet gap in the methylene (CH2).66 They also investigated the charge-transfer and diradical nature of the electronic states of the retinal minimal model (penta-2,4-dieniminium cation) at the VMC and LRDMC levels.67 They proved that the dynamical electronic correlation is key to get a reliable ground state energy surface in proximity of the conical intersection between the electronic ground- and first excited-state of the molecule. Their obtained energy landscapes are significantly different from the CASSCF ones, inverting the relative stability of several torsional paths, and similar to what was obtained by other correlated approaches. Barborini and Guidoni applied VMC and LRDMC to calculate the torsion barriers of 1,3-butadiene and the ring-opening barrier of cyclobutene,167 which are in good agreement with the experimental results. Barborini and Coccia investigated the potential energy curves of the two spin states of the tetramethyleneethane (TME) molecule and the two anionic states of TME as a function of the torsion of the central dihedral angle.168 Through ab initio geometrical optimizations at the VMC level, they proposed possible structural interconversions between the states, which are in agreement with the ion photoelectron spectroscopy experiments.

The bond length alternation (BLA), namely, the impact on the geometry of the difference between the single and double carbon bonds, is one of the key structural descriptors in conjugated organic compounds. Barborini et al. investigated the effects of the static and dynamical electronic correlations on the BLA of the 1,3-butadiene169 using a VMC-JAGP Ansatz and other quantum chemistry calculations. They found inconsistency between BLAs obtained by CCSD(T)-CBS (the most accurate) and VMC-JAGP calculations, but the reason for the discrepancy is still unclear. They also investigated the structural properties of polyacetylene chains H-(C2H2)n-H up to N = 12 acetylene units,170 in which they revealed that the BLA obtained by the extrapolation to n is 0.0910(7) Å, which is compatible with the experimental data. Coccia et al. applied VMC to study geometries of the retinal model, the chromophore of the rhodopsin involved in the mechanism of vision, and discussed the effects of the electronic correlation on the BLA171,172 by comparing with DFT and quantum chemistry methods. Wave-function and geometry optimizations of the retinal model were also performed in the presence of the protein field, classically described via electrostatic and mechanical interactions: the complex environment dramatically affects the BLA of the chromophore, which consistently increases with respect to the gas-phase case. They also performed VMC geometry optimization of the peridinin chromophore in order to verify the interplay between the BLA and the optical properties of a large conjugated moiety, involved in the light-harvesting step of photosynthesis in the peridinin-chlorophyll-protein complex.173 Comparison with DFT and wave-function approaches shows that the combined application of RVB-based VMC and Bethe–Salpeter methods for geometry and excitations, respectively, gives an accurate estimation of the electronic transition energies of peridinin. They recently applied VMC optimization to obtain a ground state structure of keto-1, enol-1, and enol-2 forms of oxyluciferin,174 which is used to calculate the S1S0 absorption energy based on the Bethe–Salpeter formalism. Coccia et al. carried out a systematic basis-set analysis on polarizability, quadrupole moment, and electronic density of the C2H2 molecule, with the aim to test VMC and LRDMC calculations in the presence of an external electric field.175 In particular, a relatively small hybrid (Gaussian + Slater functions) basis set was seen to be able to properly reproduce reference values.

Recently, TurboRVB has also been used to study metal-organic complexes. Chu et al. studied free-energy reaction barriers of water splitting reactions, using a simplified computational model based on the cobalt ion176 (Fig. 19). They found that the total free-energy differences of the water oxidation, computed at the QMC level, fairly agree with the experimental reference values, in a surprisingly better way than the corresponding CCSD(T) ones. M. Barborini and L. Guidoni applied VMC to tackle the geometry optimization of a Fe2S2(SH)  2 model complex (high-spin and broken symmetry states)177 based on the extended broken-symmetry (EBS) approach. The number of applications is still limited in this category, but the recent development of sophisticated transition-metal pseudopotentials will make it more feasible in the near future.

FIG. 19.

Reaction cycle involving the cobalt ion model, studied in Ref. 176 by VMC and LRDMC. The pink, red, and white balls represent the cobalt, oxygen, and hydrogen atoms, respectively. Reprinted with permission from Chu et al., J. Chem. Theory Comput. 12, 5803 (2016). Copyright 2016 ACS.

FIG. 19.

Reaction cycle involving the cobalt ion model, studied in Ref. 176 by VMC and LRDMC. The pink, red, and white balls represent the cobalt, oxygen, and hydrogen atoms, respectively. Reprinted with permission from Chu et al., J. Chem. Theory Comput. 12, 5803 (2016). Copyright 2016 ACS.

Close modal

Ab initio electronic calculations allow one to study the phase stability between crystal polymorphs by comparing the enthalpies or free energies and by computing their equation of states (EOSs). As mentioned in the Introduction, DFT’s predictive power strongly depends on the choice of the exchange and correlation functionals, sometimes causing qualitatively different phase stability predictions. On the other hand, VMC and LRDMC calculations do not suffer from this drawback in principle. Sorella et al. studied the pressure-induced metal–insulator transition in bulk silicon through EOS calculations.178 They obtained a reasonable transition pressure at room temperature by VMC, however still a little far from the experimental value. Devaux et al. studied the α-γ phase transition in elemental cerium at the VMC and LRDMC levels.179 They revealed that the volume-collapse transition could be understood as a conventional first-order transition of electronic origin, induced by the p-f hybridization and by the strong local repulsion affecting the f states. Hay et al. investigated the effect of long-range van der Waals forces in the SiO2 polymorphs using DFT and QMC.180 Their LRDMC calculations showed that the obtained energy differences in quartz-cristobalite and quartz-stishovite agree well with the experimental values within sub-chemical accuracy, implying that QMC is a promising method in describing both conventional and high-pressure SiO2 polymorphs. Note that, at present, only the zero-temperature energy (without the entropic term) is available at the VMC and LRDMC levels. Therefore, the entropic effects are estimated by DFT-PBE calculations.178 In the near future, it will be possible to estimate the entropic corrections directly within the VMC framework, via phonon calculations under the quasi-harmonic approximation. As described in Sec. IV, considering twist boundary conditions is essential to correct the one-body finite-size errors for periodic systems, especially in a metallic case. Although the twisted average approach is commonly used to correct the error, one can also use the exact special twist (EST) method87 developed by Dagrada et al. to save some computational cost. EST works even in realistic systems, namely, the bcc-hydrogen, the bcc-lithium, and the silicon in the high-pressure β-tin phase.87 

One-dimensional chains of atoms have been intensively studied based on both effective and ab initio Hamiltonians because they embody many important open questions in modern condensed matter physics, despite their simplicity. The finite or infinite hydrogen chains are ideal systems for this purpose because it is not necessary to use a pseudopotential or to consider the relativistic effects. They are therefore benchmark systems. Stella et al. investigated the metal–insulator transition in the hydrogen chain by explicitly calculating the complex polarization,123 as shown in Sec. XI. They found that the model has no metal–insulator transition and always behaves as an insulating 1D Hubbard model at half-filling, at least for a large enough interatomic distance (R > 1). Motta et al. (Simons Collaboration on the Many-Electron Problem)181 undertook a comprehensive benchmark study in the finite and infinite linear hydrogen chains using state-of-the-art many-body methods, including VMC and LRDMC with extrapolations to the thermodynamic and the complete-basis-set limits. They provided accurate potential energy curves online, which will be very useful for benchmarking other methods. They have recently investigated the ground states of the hydrogen-chain more in detail, namely, the insulator-to-metal transition, dimerization, and magnetic phases.182 They revealed a fascinating phase diagram, with several emergent quantum phases, depending on the interproton distance.

Hydrogen behavior at high pressures is still not fully understood and is subject of intense research. In 1935, Wigner and Huntington predicted its metallization upon compression,183 while Ashcroft also proposed its high-Tc superconductivity.184 Besides its fundamental interest as the simplest realistic condensed matter system in Nature, calculating its equilibrium properties remains of fundamental importance for planetary science applications.185 This system is particularly challenging for DFT-based simulations due to the presence of strong correlation effects and a subtle interplay between structural and electronic phase transitions. A straightforward way to study a phase diagram under pressure and at finite temperature in QMC is to apply the Langevin molecular dynamics described in Sec. VIII. First-order phase transitions can be identified from a discontinuous behavior in the calculated equation of state and of pair correlation functions. Attaccalite and Sorella have formulated a second-order LD suitable for QMC simulation for the first time, as described in Sec. VIII. They applied it to assess the relative stability of solid and liquid hydrogen at high pressure and room temperature.106 Years later, in a series of works, Mazzola et al. applied this technique to study the proposed liquid–liquid transition (LLT) from a molecular insulating to an atomic conducting phase of hydrogen.113,186 Mazzola and Sorella have also formulated the accelerated first-order LD,119 as described in Sec. VIII, to study such an elusive phase transition with properly equilibrated simulations.119 Finally, under the same framework, the first QMC molecular dynamics simulations of a hydrogen–helium mixture has been performed at Jupiter’s interior conditions.120 The study revealed that mixing He has a significant influence on the H metallization pressure, as shown in Fig. 20, and provided useful benchmark for the equation of states currently adopted by the planetary science community. Dagrada et al. revealed that their developed EST technique also works well in the LD simulation for liquid hydrogen, but a perfect agreement with the twisted-average technique is achieved only when relatively large supercells are used.87 Therefore, a careful finite size scaling analysis is needed when the EST technique is employed, although the thermodynamic limit extrapolation is much smoother at the special twist, saving some computational cost. Moreover, EST allows for the determinant/AGP optimization.

FIG. 20.

Phase diagram of dense hydrogen and a hydrogen–helium mixture. “This work” refers to the results obtained by VMC-Langevin dynamics simulations. The shaded areas represent the liquid–liquid transitions (LLTs) between the insulating-molecular and the metallic-atomic fluids. The solid symbols refer to their QMC LLTs for the pure hydrogen (blue circles) and for the hydrogen–helium mixture (red triangles), and the blue and red solid lines indicate first order LLTs. The empty (solid) left (right) triangles correspond to simulations displaying a clear atomic (molecular) behavior; on the other hand, the red diamonds represent an intermediate behavior. A calculated Jupiter’s adiabat is shown as a gray line. Pure hydrogen first-order LLT predicted by QMC (cyan) and DFT (solid and dashed lines) are also shown. The light green and dark green triangles and the brown stars represent experimental results. Reprinted with permission from G. Mazzola, R. Helled, and S. Sorella, Phys. Rev. Lett. 120, 025701 (2018). Copyright 2018 APS.

FIG. 20.

Phase diagram of dense hydrogen and a hydrogen–helium mixture. “This work” refers to the results obtained by VMC-Langevin dynamics simulations. The shaded areas represent the liquid–liquid transitions (LLTs) between the insulating-molecular and the metallic-atomic fluids. The solid symbols refer to their QMC LLTs for the pure hydrogen (blue circles) and for the hydrogen–helium mixture (red triangles), and the blue and red solid lines indicate first order LLTs. The empty (solid) left (right) triangles correspond to simulations displaying a clear atomic (molecular) behavior; on the other hand, the red diamonds represent an intermediate behavior. A calculated Jupiter’s adiabat is shown as a gray line. Pure hydrogen first-order LLT predicted by QMC (cyan) and DFT (solid and dashed lines) are also shown. The light green and dark green triangles and the brown stars represent experimental results. Reprinted with permission from G. Mazzola, R. Helled, and S. Sorella, Phys. Rev. Lett. 120, 025701 (2018). Copyright 2018 APS.

Close modal

TurboRVB allows studying not only the ground-state electronic properties but also excited states. It is trivial to obtain excitation properties when they are characterized by different particles or different spin projection because these quantities are clearly conserved. For example, Barborini et al. calculated the 3B1u1Ag vertical triplet excitation of the ethylene molecule and its adiabatic excitation 3A11Ag.187 One can also deal with nontrivial cases based on a flexible symmetry-adapted RVB WF within its MO representation, which was proposed by Dupuy et al. in 2015.188 Instead of dealing with a linear combination of symmetry-adapted configuration state functions, they construct a symmetry-adapted RVB WF that accurately describes the targeted symmetry. They reported that their developed constrained (fixed-rank) minimization of the geminal expansion is stable without symmetry contamination of the starting excited state symmetry; as a result, the vertical ionization energies and the electron affinities of anthracene obtained by VMC and LRDMC calculations agree with the experimental values.

Barborini et al. proposed a new ab initio approach to calculate quasiparticle WFs (QPWFs) of isolated systems and resolved in momentum space, based on the correlated sampling technique. They applied the new approach to simulate Scanning Tunneling Microscopy (STM) images189 and Angle-Resolved Photoemission Spectroscopy (ARPES).190 The STM image of [CuCl4]2 obtained by this approach is shown in Fig. 21. It reveals that the electronic correlation has a significant effect on the QPWF in this system. Indeed, the resulting STM image clearly differs from that obtained by uncorrelated methods (i.e., the Hartree–Fock counterpart). In general, the electronic calculation should be useful to interpret the experimental results theoretically. We believe therefore that TurboRVB can become useful also for state-of-the-art research in this field.

FIG. 21.

The STM images of in-plane [CuCl4]2 simulated using UHF and QMC calculations. The left column shows the isosurfaces of the square modulus of the eQPWF for a fixed value of probability density of 9.7 × 10−5 Å−3, while in the right column the isosurfaces take the value of 6.7 × 10−6 Å−3. Panels (a) and (a′) display the LUMO+1 UHF molecular orbitals, while panels (b) and (b′) display the QMC images. Reprinted with permission from Barborini et al., J. Chem. Theory Comput. 12, 5339 (2016). Copyright 2016 ACS.

FIG. 21.

The STM images of in-plane [CuCl4]2 simulated using UHF and QMC calculations. The left column shows the isosurfaces of the square modulus of the eQPWF for a fixed value of probability density of 9.7 × 10−5 Å−3, while in the right column the isosurfaces take the value of 6.7 × 10−6 Å−3. Panels (a) and (a′) display the LUMO+1 UHF molecular orbitals, while panels (b) and (b′) display the QMC images. Reprinted with permission from Barborini et al., J. Chem. Theory Comput. 12, 5339 (2016). Copyright 2016 ACS.

Close modal

In general, a computational study involves many complicated operations, such as preparing input files, searching for optimal parameters, performing benchmark studies, transferring information coming from previous calculations to a subsequent simulation’s input file, and analyzing output files. Automatizing these operations can significantly save researcher’s burden, avoid trivial human errors, and is beneficial for accelerating the distribution of a computational package to a much wider community. We are now developing a Python-based workflow system suitable for TurboRVB named Turbo-Genius, such as Nexus191 and QMC-SW.192 The workflow system has been implemented in Python 3 in an object-oriented fashion, so the modules are highly extensible. Note that, since TurboRVB is an all-in-one package, any other commercial code is not necessary to run the code. There are two main modules, turbo-genius-serial.py and turbo-genius-sequential.py, which manage other modules/classes. The former allows one to perform a single job specified by an option (e.g., turbo-genius-serial.py -j makefort10), and the latter does a sequential job (e.g., makefort10.xconvertfort10mol.xprep.xturborvb.x). By using Turbo-Genius, one can also analyze a simulation output. For example, one can plot an optimization history, average the variational parameters after optimization, and search for an optimal number of equilibration steps and a reblocking length.40Turbo-Genius is still at an early stage of development. However, its main features are already there. In the near future, the Turbo-Genius modules will be capable of sophisticated TurboRVB job automatizations, as well as setting optimal parameters depending on target systems.

In this paper, we have reviewed TurboRVB, an ab initio quantum Monte Carlo package featuring two well established QMC algorithms for electronic structure: variational Monte Carlo and diffusion Monte Carlo. The beginning of the TurboRVB development dates back to 2003, with the Ph.D. thesis project of Casula supervised by Sorella. Since then, TurboRVB has rapidly grown up, and it has been applied to the study of several molecular and condensed matter systems, from benzene to high-temperature superconducting materials. Central to this package are variational parameterizations of correlated electronic WFs in the product form Ψ = ΦAS * exp J, made of an antisymmetric part (ΦAS) and a Jastrow factor (J).

Concerning the antisymmetric part, TurboRVB implements five Ansätze (in order of decreasing variational flexibility): (i) the Pfaffian (Pf), (ii) the Pfaffian with a constrained number of molecular orbitals (Pfn), (iii) the Antisymmetrized Geminal Power (AGP), (iv) the Antisymmetrized Geminal Power with a constrained number of molecular orbitals (AGPn), and (v) the single Slater determinant (SD). Notably, the user can freely navigate between these Ansätze. Indeed, the package includes conversion modules, which allow the user to choose the most suitable WF form for the target system.

All the above Ansätze can be optimized by state-of-the-art algorithms, namely, the stochastic reconfiguration and the linear method, thus achieving highly accurate variational and fixed-node energies, with a computational cost remaining at the single-determinant level, thanks to efficient algorithmic developments.

The stochastic optimization of many variational parameters (so far up to the order of 105) is feasible in TurboRVB, thanks to an efficient evaluation of energy derivatives using the AAD. This algorithmic scheme, which gives the code an original architecture, drastically decreases the computational cost of ionic force calculations, keeping them of order N3. Thus, it paves the way for efficient structural optimizations and molecular dynamics simulations at the VMC level of theory. Indeed, by means of the TurboRVB package, it is at present possible to perform QMC molecular dynamics simulations of dense hydrogen (with up to 256 ions) and liquid water (with up to 64 water molecules), with a remarkably large number of atoms, not far from the DFT state of the art. Thanks to TurboRVB, nuclear quantum effects are also accessible in a correlated electronic environment provided by the VMC WF.

Large scale calculations are possible with TurboRVB not only because of cutting-edge algorithmic developments but also by virtue of an efficient parallelization based on a hybrid MPI and OpenMP protocol, which is ideal for employing GPU accelerators as well, recently available in the most advanced HPC infrastructures.

A Python wrapper, named Turbo-Genius, has recently been developed to make the code user-friendly and to allow both beginners and professional users to handle it more efficiently. We will continue developing TurboRVB and Turbo-Genius to implement new QMC algorithms and to extend their range of applications for the current developers and expected new upcoming contributors and users.

The data that support the findings of this study are available within this article and the cited references. TurboRVB is available from S.S.’s web site (https://people.sissa.it/∼sorella/) upon request.

The authors acknowledge all contributors for the development of the code TurboRVB over the last 20 years. We would especially like to acknowledge very fruitful collaborations with F. Becca, G. Carleo, C. Cavazzoni, N. Devaux, N. Dupuy, L. Guidoni, H. Hay, R. Hlubina, Y. Iqbal, R. Maezono, F. Mouhat, A. Parola, K. Seki, T. Shirakawa, L. Spanu, L. Stella, L. F. Tocchio, and S. Yunoki.

K.N. is grateful for computational resources from PRACE, Project No. 2019204934, and those from the facilities of the Research Center for Advanced Computing Infrastructure at Japan Advanced Institute of Science and Technology (JAIST). K.N. also acknowledges financial support from the Simons Foundation and that from Grant-in-Aid for Scientific Research on Innovative Areas (No. 16H06439). C.A. acknowledges funding from the European Union Seventh Framework Program under Grant Agreement No. 785219 Graphene Core 2 and COST Action TUMIEE CA17126. M.C. is grateful to the French grand équipement national de calcul intensif (GENCI) for the computational time provided through these years under Project No. 0906493; for the funded PRACE Project Nos. 2012061116, 2015133179, and 2016163936; and for the access to the Hokusai and K-computer granted by the Institute of Physical and Chemical Research (RIKEN). Y.L. was supported by the Argonne Leadership Computing Facility, which is a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-06CH11357. A.Z.’s work was sponsored by the Air Force Office of Scientific Research, Air Force Material Command, U.S. Air Force, under Grant No. FA9550-19-1-7007. S.S. acknowledges financial support from PRIN 2017BZPKSZ and computational resources from PRACE, Project No. 2019204934. S.S. is also grateful to his wife L. Urgias for bearing the too many weekends spent on the development of TurboRVB.

We dedicate this paper to the memory of P. W. Anderson, remembering him as one of the most influential contributors to physics and chemistry of the past century, and in particular for deeply inspiring this work with his RVB theory of matter.

The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan. http://energy.gov/downloads/doe-public-access-plan.

A

antisymmetrization operator

a, b, I, J

indices of atoms

Fa

atomic force for an atom a, FadEdRa

fi

general force for the ith variational parameter, fiEαi

fri,rj

spatial part of the pairing function

G

N × N matrix with elements Gi,j = g(i, j)

gi,j

pairing function between electrons i ↔ (ri, σi) and j ↔ (rj, σj)

i, j

compact notation for (ri, σi) and (rj, σj), respectively

i, j

indices of electrons/orbitals/Monte Carlo sampling

J

Jastrow factor, composed of J1, J2, and J3/4

l, m

indices of orbitals

M

the number of configurations of electrons (the number of Monte Carlo samplings)

N

the number of electrons

Nat

the number of atoms

Nd

the number of electrons with spin down

Nu

the number of electrons with spin up

Okx

the logarithmic derivative of a many-body WF OkxlnΨxαk

Ra

ath ion coordination

ri

ith electron coordination

S

overlap matrix with elements Sμ,ν ≡⟨ψμ|ψν

S

variance–covariance matrix of the logarithmic derivative of a many-body WF

SQMCR

variance–covariance matrix of QMC forces

x

N set of electron coordinations including spins, the same as xi

xi

N set of electron coordinations including spins r1σ1,r2σ2,,rNσNi. The index i refers to the Monte Carlo sampling index

Φ

N × k matrix with elements Φi,k=ϕki

ΦAGP

antisymmetrized geminal power (AGP)

ΦAS

antisymmetric (AS) part

ΦPf

Pfaffian (Pf)

ΦSD

single Slater determinant (SD)

Ψ

many-body WF Ansatz, Ψ = ΦAS × exp J

ΨG

many-body guiding function

ΨT

many-body trial WF

α1, …, αp

variational parameters of the trial WF

αk

kth variational parameter

μ, ν

indices of orbitals

σi

spin of ith electron

χir

primitive atomic orbital for the Jastrow part

ψir

primitive atomic orbital for the antisymmetric part

ϕkr

kth molecular orbital for the antisymmetric part

di

compact notation for σidri

1.
R.
Ramprasad
,
R.
Batra
,
G.
Pilania
,
A.
Mannodi-Kanakkithodi
, and
C.
Kim
,
npj Comput. Mater.
3
,
54
(
2017
).
2.
K.
Rajan
,
Mater. Today
8
,
38
(
2005
).
3.
W.
Kohn
and
L. J.
Sham
,
Phys. Rev.
140
,
A1133
(
1965
).
4.
R. M.
Martin
,
Electronic Structure: Basic Theory and Practical Methods
(
Cambridge University Press
,
2004
).
5.
M. G.
Medvedev
,
I. S.
Bushmarinov
,
J.
Sun
,
J. P.
Perdew
, and
K. A.
Lyssenko
,
Science
355
,
49
(
2017
).
6.
J. P.
Perdew
and
K.
Schmidt
,
AIP Conf. Proc.
577
,
1
20
(
2001
).
7.
K.
Burke
,
J. Chem. Phys.
136
,
150901
(
2012
).
8.
A. J.
Cohen
,
P.
Mori-Sánchez
, and
W.
Yang
,
Chem. Rev.
112
,
289
(
2012
).
9.
A.
Szabo
and
N.
Ostlund
,
Modern Quantum Chemistry
(
Macmillan Publishing Co., Inc.
,
1982
).
10.
C.
Møller
and
M. S.
Plesset
,
Phys. Rev.
46
,
618
(
1934
).
11.
P. J.
Knowles
and
N. C.
Handy
,
Chem. Phys. Lett.
111
,
315
(
1984
).
12.
B. O.
Roos
,
P. R.
Taylor
, and
P. E. M.
Sigbahn
,
Chem. Phys.
48
,
157
(
1980
).
13.
J.
Čížek
,
J. Chem. Phys.
45
,
4256
(
1966
).
14.
W.
Foulkes
,
L.
Mitas
,
R.
Needs
, and
G.
Rajagopal
,
Rev. Mod. Phys.
73
,
33
(
2001
).
15.
F.
Becca
and
S.
Sorella
,
Quantum Monte Carlo Approaches for Correlated Systems
(
Cambridge University Press
,
2017
).
16.
N.
Metropolis
and
S.
Ulam
,
J. Am. Stat. Assoc.
44
,
335
(
1949
).
17.
J.
Kim
,
A. D.
Baczewski
,
T. D.
Beaudet
,
A.
Benali
,
M. C.
Bennett
,
M. A.
Berrill
,
N. S.
Blunt
,
E. J. L.
Borda
,
M.
Casula
,
D. M.
Ceperley
,
S.
Chiesa
,
B. K.
Clark
,
R. C.
Clay
,
K. T.
Delaney
,
M.
Dewing
,
K. P.
Esler
,
H.
Hao
,
O.
Heinonen
,
P. R. C.
Kent
,
J. T.
Krogel
,
I.
Kylänpää
,
Y. W.
Li
,
M. G.
Lopez
,
Y.
Luo
,
F. D.
Malone
,
R. M.
Martin
,
A.
Mathuriya
,
J.
McMinis
,
C. A.
Melton
,
L.
Mitas
,
M. A.
Morales
,
E.
Neuscamman
,
W. D.
Parker
,
S. D. P.
Flores
,
N. A.
Romero
,
B. M.
Rubenstein
,
J. A. R.
Shea
,
H.
Shin
,
L.
Shulenburger
,
A. F.
Tillack
,
J. P.
Townsend
,
N. M.
Tubman
,
B.
Van Der Goetz
,
J. E.
Vincent
,
D. C.
Yang
,
Y.
Yang
,
S.
Zhang
, and
L.
Zhao
,
J. Phys.: Condens. Matter
30
,
195901
(
2018
).
18.
R. J.
Needs
,
M. D.
Towler
,
N. D.
Drummond
, and
P. L.
Ríos
,
J. Phys.: Condens. Matter
22
,
023201
(
2009
).
19.
L. K.
Wagner
,
M.
Bajdich
, and
L.
Mitas
,
J. Comput. Phys.
228
,
3390
(
2009
).
20.
C. J.
Umrigar
and
C.
Filippi
, Cornell-Holland Ab-initio Materials Package (CHAMP), http://pages.physics.cornell.edu/∼cyrus/champ.html; accessed
27 October 2019
.
21.
J. S.
Spencer
,
N. S.
Blunt
,
S.
Choi
,
J.
Etrych
,
M.-A.
Filip
,
W. M. C.
Foulkes
,
R. S.
Franklin
,
W. J.
Handley
,
F. D.
Malone
,
V. A.
Neufeld
 et al,
J. Chem. Theory Comput.
15
,
1728
(
2019
).
22.
S.
Zhang
,
J.
Carlson
, and
J. E.
Gubernatis
,
Phys. Rev. B
55
,
7464
(
1997
).
23.
M.
Motta
and
S.
Zhang
,
Wiley Interdiscip. Rev.: Comput. Mol. Sci.
8
,
e1364
(
2018
).
24.
G. H.
Booth
,
A. J.
Thom
, and
A.
Alavi
,
J. Chem. Phys.
131
,
054106
(
2009
).
25.
G. H.
Booth
,
A.
Grüneis
,
G.
Kresse
, and
A.
Alavi
,
Nature
493
,
365
(
2013
).
26.
A. J. W.
Thom
,
Phys. Rev. Lett.
105
,
263004
(
2010
).
27.
J. S.
Spencer
and
A. J.
Thom
,
J. Chem. Phys.
144
,
084108
(
2016
).
28.
N. S.
Blunt
,
T. W.
Rogers
,
J. S.
Spencer
, and
W. M. C.
Foulkes
,
Phys. Rev. B
89
,
245124
(
2014
).
29.
F. D.
Malone
,
N.
Blunt
,
J. J.
Shepherd
,
D.
Lee
,
J.
Spencer
, and
W.
Foulkes
,
J. Chem. Phys.
143
,
044116
(
2015
).
30.
S.
Ten-no
,
J. Chem. Phys.
138
,
164126
(
2013
).
31.
Y.
Ohtsuka
and
S.
Ten-no
,
J. Chem. Phys.
143
,
214107
(
2015
).
32.
S.
Ten-no
,
J. Chem. Phys.
147
,
244107
(
2017
).
33.
J. R.
McClean
and
A.
Aspuru-Guzik
,
Phys. Rev. A
91
,
012311
(
2015
).
34.
A.
Nagy
and
V.
Savona
,
Phys. Rev. A
97
,
052129
(
2018
).