We present efficient algorithms for using selected configuration interaction (sCI) trial wave functions in phaseless auxiliary field quantum Monte Carlo (ph-AFQMC). These advances, geared toward optimizing computational performance for longer configuration interaction expansions, allow us to use up to a million configurations in the trial state for ph-AFQMC. In one example, we found the cost of ph-AFQMC per sample to increase only by a factor of about 3 for a calculation with 10^{4} configurations compared to that with a single one, demonstrating the tiny computational overhead due to a longer expansion. This favorable scaling allows us to study the systematic convergence of the phaseless bias in auxiliary field quantum Monte Carlo calculations with an increasing number of configurations and provides a means to gauge the accuracy of ph-AFQMC with other trial states. We also show how the scalability issues of sCI trial states for large system sizes could be mitigated by restricting them to a moderately sized orbital active space and leveraging the near-cancellation of out of active space phaseless errors.

## I. INTRODUCTION

Quantum Monte Carlo (QMC) is a powerful tool in our arsenal to tackle the quantum many-body problem.^{1–7} Among various QMC approaches, phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC)^{8} has emerged as an accurate and efficient method. While originally from the condensed matter community [therein usually referred to as constrained-path auxiliary field quantum Monte Carlo (AFQMC)],^{9–11} ph-AFQMC has gained popularity in chemistry in recent years.^{12–24} The accuracy and scalability of ph-AFQMC are determined in large part by the choice of the trial wave function. The use of a trial wave function becomes necessary for retaining statistical efficiency (in sample complexity) to control the fermionic phase (or sign) problem.^{25,26} The constraint imposed to control the phase problem is called the phaseless approximation.^{8} When the trial wave function approaches the exact ground state, the corresponding ph-AFQMC energy tends to the ground state energy. The commonly used trial wave function for ph-AFQMC is the broken-symmetry Hartree–Fock (HF) wave function. It scales as *O*(*N*^{3}) to obtain the trial wave function and *O*(*N*^{5}) to perform the ph-AFQMC calculation with the trial to obtain energy for a fixed statistical error, where *N* is the system size. Beyond HF, one may try to use a single determinant trial using approximate Bruckner orbitals,^{23} which essentially keeps the same computational scaling for ph-AFQMC. While single determinant trial wave functions are attractive due to their scalability, their accuracy can be limited and questionable in many examples.^{13,21,27}

On the other hand, there has been a flurry of developments in selected configuration interaction (sCI) methods in the last few years.^{28–31} Despite their steep (formally exponential) scaling with system size, sCI methods have increasingly been employed to study moderately sized systems and perform large active space correlated calculations. These methods have the capability of generating systematically more accurate approximations to the state of interest by increasing the number of configurations judiciously. This attractive feature makes sCI wave functions natural candidates for trial states in projector QMC methods. Based on significant advances in real-space QMC algorithms with such trial states,^{32,33} several papers have already reported calculations with long sCI expansions in diffusion Monte Carlo.^{34–37} There has also been some recent progress for using them in ph-AFQMC,^{21,24,38} but the commonly used algorithm based on the Sheman–Morrison–Woodbury identity has an inherent computational bottleneck with the local energy evaluation scaling as *O*(*N*_{c}*N*^{4}), where *N*_{c} is the number of determinants. This steeper scaling compared to real-space methods is due to the Gaussian orbital representation of the electron repulsion interaction. Furthermore, the force bias evaluation scales as *O*(*N*_{c}*N*^{2} + *N*^{3}), which can be expensive since it is performed once every time step. Due to these, it has been challenging to scale up the ph-AFQMC calculations beyond 100–1000 configurations.

In this work, we propose to use one of the sCI methods, heat-bath CI (HCI),^{30,39,40} to generate a large trial with up to 10^{6} configurations. We then combine several algorithmic advances made by two of us^{27,41} using the generalized Wick’s theorem and make further improvements to accelerate ph-AFQMC calculations with a large trial. In particular, in our new algorithm, the local energy evaluation scales as *O*(*N*_{c}*N* + *N*^{4}), and the force bias evaluation scales as *O*(*N*_{c} + *N*^{3}). To alleviate the scaling issues of sCI wave functions, we study the efficacy of active space trial states in calculating energy differences and properties. The active space size is another variable to converge the phaseless error systematically in larger systems while keeping the calculation cost manageable. We use the new algorithm for ph-AFQMC with HCI trials to investigate the behavior of the phaseless bias as a function of the number of configurations in the trial for several challenging systems.

This paper is organized as follows: We first review the ph-AFQMC algorithm (Sec. II); we then discuss how one can drastically speed up the evaluation of force bias and local energy of sCI trials using the generalized Wick’s theorem (Sec. III); we show how phaseless errors in the ground state energy and dipole moments change as a function of the number of determinants in hydrogen chains, transition metal oxides, and a few small molecules (Sec. IV); and we finally conclude (Sec. V).

## II. PHASELESS AUXILIARY FIELD QUANTUM MONTE CARLO

We briefly summarize the procedure for phaseless AFQMC here and refer the reader to Ref. 16 for more details. Consider the *ab initio* Hamiltonian given by

where *h*_{ij} are one-electron integrals and $Lij\gamma $ are Cholesky decomposed two-electron integrals in an orthonormal orbital basis. We will use letters *N*, *M*, and *X* to denote the number of electrons, the number of orbitals, and the number of Cholesky vectors, respectively. We note that in chemical systems, empirically, *X* ∼ *O*(*M*) with a proportionality constant usually smaller than 10. For density fitting (not employed here), we expect *X* ∼ 4 − 5*M*.^{42} AFQMC uses the exponential form of the projector to converge to the ground state of the Hamiltonian as

where *τ* is the imaginary time, $\Psi 0$ is the ground state, and $\psi I$ is an initial state such that $\psi I|\psi 0\u22600$. Using the Hubbard–Stratonovic transform,^{43,44} the short-time exponential projector can be written as

where **x** is the vector of auxiliary fields (one scalar field per Cholesky component), *p*(**x**) is the standard normal Gaussian distribution of the auxiliary fields, and $B\u0302(x)$ is a complex propagator given by the exponential of a one-body operator. Due to Thouless’ theorem,^{45} $B\u0302(x)$ acts on a Slater determinant $\varphi $ as

where $\varphi (x)$ is another Slater determinant obtained by a linear transformation of the orbitals in $\varphi $. By sampling the auxiliary fields in Eq. (3) and applying the short time propagator sufficiently many times, we obtain a stochastic representation of the ground state wave function in the long time limit as

where *w*_{i} are weights, $\varphi i$ are Slater determinants with complex orbitals sampled during propagation, and $\psi T$ is the trial state used for importance sampling. More accurate trial states lead to less noisy weights.

We use the hybrid approximation^{8} in this paper, which avoids the expensive calculation of local energy at each propagation step. Importance sampling in AFQMC involves shifting the sampled auxiliary fields by the force bias given as

where $\varphi $ is the walker. The force bias can be thought of as a dynamic correction to the mean-field contour shift^{46} that vanishes as Δ*τ* → 0. Force bias is not enough, by itself, to control the large fluctuations stemming from the phase problem. The phaseless constraint^{8} can be used to overcome the phase problem at the expense of a systematic bias in the sampled wave function. The size of this phaseless bias is dictated by the accuracy of the trial state.

After an equilibration time, the energy of the sampled wave function can be measured as

where

is the local energy of the walker $\varphi i$. We note that this estimator has a zero-variance property, i.e., in the limit that the trial state is the exact, ground state the variance of the estimator vanishes. In practice, this means that more accurate trial states lead to less noisy local energy values.

It is evident that more accurate trial states lead to less noisy simulations and smaller phaseless biases. However, in general, the use of more sophisticated trial states comes at the expense of greater computational cost. In Sec. III, we present efficient algorithms to calculate force bias and local energy of sCI trial states.

## III. SELECTED CONFIGURATION INTERACTION TRIAL STATES

A selected CI wave function is given by

where $\psi n$ are configurations obtained by particle–hole excitations from the reference configuration $\psi 0$, *N*_{c} is the number of configurations in the expansion, *c*_{n} are real expansion coefficients, and *k*_{n} are the excitation ranks. We will use the indices *p*_{μ} and *t*_{μ} to denote occupied and virtual orbitals, respectively, whereas indices *i*, *j*, … will be used for general orbitals. The orthonormal orbital basis set used in the expansion can be chosen to be natural orbitals obtained from an HCI calculation or can be optimized using a self-consistent procedure. In the following, we use the same basis set to express the Hamiltonian in Eq. (1) and, in general, all second-quantized operators refer to this basis. In our prior work, we have proposed efficient algorithms for using selected CI states in variational Monte Carlo^{41} and free projection AFQMC.^{27} These focused on the calculation of local energy using the generalized Wick’s theorem. Here, we discuss an algorithm for calculating the force bias required in ph-AFQMC and briefly summarize the calculation of local energy.

### A. Force bias

The calculation of the force bias for importance sampling is one of the computationally intensive parts of propagation in ph-AFMQC. By substituting the sCI trial state into the force bias expression in Eq. (6), we obtain

The conventional way of evaluating the force bias proceeds by separately calculating the contribution of each configuration in the CI expansion using the Sherman–Morrison–Woodbury identity, resulting in a cost scaling of *O*(*N*_{c}*NM* + *XNM*).^{38} Here, we present an algorithm with cost scaling as *O*(*N*_{c} + *XM*^{2}) by essentially separating the sum over CI excitations from that over Cholesky indices in Eq. (10).

We define the Green’s function matrix for the reference configuration $\psi 0$ and the walker configuration $\varphi $ as^{47,48}

where *ψ*_{0} and *ϕ* are the orbital coefficient matrices of the corresponding configurations and superscripts and subscripts denote row and column indices, respectively. Note that since we work in the orbital basis of the CI expansion, *ψ*_{0} has a particularly simple form with all its columns being unit vectors. Thus, the cost of calculating Green’s function in this basis scales as *O*(*N*^{2}*M*). For convenience, we also define the related quantity as

According to the generalized Wick’s theorem,^{48} we have

where the sets of indices $p\mu $ and $t\mu $ denote the *k* × *k* slice of the *G* matrix, with *k* being the rank of the excitation. This expression is obtained by taking pairwise contractions of the operators in the string of excitations according to the generalized Wick’s theorem with the determinant structure arising due to fermionic permutation parity factors. Therefore, the denominator in Eq. (10) can be expressed as

The computational cost scaling of this overlap ratio is, thus, *O*(*N*_{c}) once Green’s function has been calculated. Since Wick’s theorem applies naturally to overlap ratios, we will find it convenient to express matrix elements as ratios with the reference overlap $\psi 0|\varphi $.

To evaluate the numerator of Eq. (10), consider the matrix element ratio given by

where we have dropped the configuration index subscript (*n*) on the excitations for clarity. This ratio can also be evaluated using the generalized Wick’s theorem, and by choosing the appropriate order of operations, one can achieve a significant reduction in the cost of calculating the numerator. To this end, we note that the pairwise contractions in the string of excitation operators can be divided into two groups: (1) those without any contractions between CI excitations and $a\u0302i\u2020a\u0302j$, (2) those containing contractions between two of the CI operators and $a\u0302i\u2020a\u0302j$. Algebraically, the two terms are given as

where on the first line, the matrix of size (*k* + 1) × (*k* + 1) is written in a block form, and the notation $p\mu \p\nu $ indicates the set of indices $p\mu $ excluding *p*_{ν}. Using this grouping of terms, we get for the numerator in Eq. (10),

where

is an intermediate formed by summing over the CI configurations. The two terms in Eq. (17) are shown as tensor contraction diagrams in Table I for the case of doubly excited CI configurations. Note that a selected CI state does not necessarily include all excitations of a given rank; however, for a tensor representation, we consider all doubly excited configurations in the table. In practice, the sum over CI excitations in Table I is only performed over the configurations in the selected CI state. The first term has a cost scaling of *O*(*N*_{c} + *XNM*), while the second one using the intermediate has a scaling of *O*(*N*_{c} + *XM*^{2}). If the selected CI state is restricted to an active space of size *A*, the overall cost can be reduced to *O*(*N*_{c} + *XNM* + *XAM*).

. | . | This work^{a}
. | |||||
---|---|---|---|---|---|---|---|

Quantity . | SMW . | Scaling . | Tensor diagrams for doubly excited configurations . | ||||

Force bias | NM(N_{c} + X) | N_{c} + XNM | |||||

Local energy | N_{c}XN^{2}M | N_{c} + XM^{2} | |||||

N_{c} + XNM | |||||||

N_{c} + XN^{2}M | |||||||

N_{c} + XM^{2} | |||||||

N_{c} + XNM^{2} | |||||||

X(N_{c} + NM^{2}) |

. | . | This work^{a}
. | |||||
---|---|---|---|---|---|---|---|

Quantity . | SMW . | Scaling . | Tensor diagrams for doubly excited configurations . | ||||

Force bias | NM(N_{c} + X) | N_{c} + XNM | |||||

Local energy | N_{c}XN^{2}M | N_{c} + XM^{2} | |||||

N_{c} + XNM | |||||||

N_{c} + XN^{2}M | |||||||

N_{c} + XM^{2} | |||||||

N_{c} + XNM^{2} | |||||||

X(N_{c} + NM^{2}) |

^{a}

*N*_{c}: number of configurations; *N*: Number of electrons; *M*: Number of orbitals; *X*: Number of Cholesky vectors; *c*: CI coefficients; *L*: Cholesky vectors; *G* and $G$: Green’s functions; *i*, *j*, *k*, *l*: Hamiltonian indices; and *p*, *q*, *t*, *u*: CI excitation indices.

### B. Local energy

The calculation of local energy was discussed in detail in Ref. 27. Here, we give a brief summary of the algorithm. Consider the two-body part of the local energy [see Eq. (8)] expressed as

The calculation of the overlap ratios in the denominator is described in Sec. III A. The matrix element in the numerator for the *n*th configuration is given by using Wick’s theorem as

where we have again dropped the configuration index subscript. Similar to the force bias calculation, the terms in this determinant can be split into two groups based on the kinds of pairwise contractions: those that do not involve cross-contractions between CI and Hamiltonian excitation operators and those that involve at least one such cross-contraction. The second group containing cross-contractions can be further split into two groups depending on whether one or both of the Hamiltonian excitations are cross-contracted with CI excitations. The resulting terms are shown as a tensor diagram in Table I. Again, we assume all double excited CI configurations only for representational convenience. Different orders of performing the tensor contractions lead to different cost scalings, and using the one described in Ref. 27 leads to a scaling of *O*(*N*_{c}*X* + *XNM*^{2}). For a selected CI state restricted to an active space of size *A*, this cost is reduced to *O*(*N*_{c}*X* + *XNAM* + *XN*^{2}*M*).

## IV. RESULTS

In this section, we present the results of our ph-AFQMC/HCI calculations and analyze the convergence of the phaseless error for several systems. In particular, we consider the utility of active space trial states for obtaining accurate energy differences and dipole moments. We used PySCF^{49} to obtain molecular integrals and to perform all quantum chemistry wave function calculations. The SHCI code Dice was used to obtain the trial states. We do not converge the HCI variational calculations close to near-exact accuracies but only generate relatively crude expansions at a modest computational cost. The code used to perform ph-AFQMC calculations is available in a public repository.^{50} Input and output files for all calculations can also be accessed from a public repository.^{51} Sources of systematic errors in ph-AFQMC calculations, besides the phaseless bias, include Trotter error, population control bias, and errors in the Cholesky decomposition of the electron repulsion integrals, matrix exponential Taylor series truncation errors, and bias due to filtering of rare large fluctuations. We used a conservative time step of 0.005 a.u. in all ph-AFQMC calculations. For population control, we used the reconfiguration procedure described in Ref. 52. Cholesky decompositions were calculated up to a threshold error of 10^{−5}. Matrix exponentials were calculated by keeping terms up to the sixth power in the Taylor expansion. We filtered large fluctuations by capping weights and local energies.^{16} Although it is difficult to get an accurate estimate of the systematic errors due to all these factors in general, they can be controlled systematically. We estimate these errors to be much smaller than the statistical errors in all the results presented here.

### A. Hydrogen chains

Hydrogen chain systems are convenient for benchmarking due to their simplicity and the availability of accurate density matrix renormalization group (DMRG) ground state energies.^{53} First, we consider a chain of 50 equidistant hydrogen atoms with an interatomic distance *d* = 1.6 Bohr in the minimal STO-6G basis. We performed ph-AFQMC/HCI calculations on this system using progressively larger HCI expansions. These trial states were constructed as truncations of the HCI state obtained with *ϵ*_{1} = 10^{−4} in the canonical RHF basis. Figure 1 shows the convergence of the phaseless error (calculated as a difference to the DMRG energy reported in reference) with the number of configurations in the trial state. We observe a nearly monotonic decrease in the phaseless error, and the error is less than 2 mH for the trial with 10^{6} configurations. The figure also shows the scaling of the cost of the ph-AFQMC calculation against the number of configurations with respect to the cost of a single configuration calculation. Remarkably, the bare cost factor for up to 10^{4} configurations is less than 10 and increases roughly linearly with the number of configurations thereafter. This observation is consistent with the scaling relations discussed in Sec. III, with the *O*(*N*_{c}*X*) factor in the cost scaling of local energy evaluation dominating for large HCI expansions. We also show an error adjusted cost factor to estimate the cost for obtaining a fixed stochastic error. Since more accurate trial states lead to a reduced stochastic noise, this adjusted cost factor is less than one for *N*_{c} ≤ 10^{4}, demonstrating the efficacy of these trial states. We found that changing the value of *ϵ*_{1} or slightly changing the selection criterion for choosing the configurations only makes a difference in ph-AFQMC energies for the smaller number of configurations, leading to similar convergence behaviors for longer expansions.

We also studied the H_{10} chain in the cc-pVDZ basis at different bond lengths to gauge the performance of active space trial states. We first performed a (10e, 10o) complete active space self-consistent field (CASSCF) calculation and generated trial states by truncating the active space wave function. Figure 2 shows the convergence of the phaseless error with the size of the HCI trial. For all three bond lengths, we see a systematic convergence to an almost vanishing error with the active space trial. The shortest bond length near equilibrium converges rapidly, while the stretched bond length requires almost the full active space wave function to reach convergence. The out of active space phaseless error, arising due to not correlating any orbitals outside the active space in the trial state, is tiny for all three bond lengths in this case. We also show ph-AFQMC/UHF phaseless errors in the figure for comparison. ph-AFQMC/UHF is known to be very accurate for hydrogen chains except at stretched geometries. At stretched geometries, a CI expansion based on an unrestricted HF reference has been employed in the past as a trial state^{53} and could be a handy tool in other problems as well. The algorithms described in this work can be straightforwardly extended to work with UHF-based CI expansions.

### B. Transition metal oxides

Recent studies of transition metal oxide molecules reported benchmark ground state energies using various accurate many-body methods.^{21,54} Here, we use ph-AFQMC/HCI to study convergence of the phaseless errors for active space trials in TiO, CrO, MnO, and FeO. The Trails-Needs pseudopotential and the corresponding DZ basis sets^{55} were used at equilibrium geometries used in Ref. 21.

Figure 3 shows the convergence of phaseless errors with the number of configurations in the trial state at equilibrium geometries used in Ref. 21. We first performed CASSCF calculations with active spaces consisting of the metal 3*d* and 4*d* orbitals and oxygen 2*p* orbitals. The HCI trial states were then generated in a space including the CASSCF active orbitals and the core orbitals, including all electrons. *ϵ*_{1} = 10^{−5} was used for the HCI calculations. Interestingly, for FeO, CrO, and MnO, the phaseless error first increases before converging to a fixed value as the number of configurations in the trial is increased. This peculiar behavior highlights the fact that a lower energy trial state does not guarantee a smaller phaseless error. We note that the trends for smaller expansions are dependent on the particular schemes used for choosing the orbital spaces and for truncating the HCI states. However, results for a large number of configurations are largely independent of such ambiguity. For TiO, using more configurations makes little difference to the phaseless error. We also performed ph-AFQMC/UHF calculations for comparison and found the phaseless errors to be 2(1), 5.3(8), −10(1), and 30.6(9) mH for TiO, CrO, MnO, and FeO, respectively.

The residual phaseless errors can be attributed solely to the out of HCI active space correlation. These errors can be systematically reduced by increasing the size of the active space, but vanishing errors this way is unlikely to be practical for larger systems. On the other hand, as noted in Sec. IV A, it is reasonable to expect near-cancellation of this out of active space error when energy differences at different geometries are calculated using the same active space. Similar trends have been observed in multireference perturbation theories. To gauge the efficacy of this heuristic, we calculated ph-AFQMC/HCI energies of CrO at three bond lengths using the procedure described above. Table II shows that the converged phaseless errors are almost identical at the three points. We note that it is crucial to converge the phaseless error of the active space trial, and different geometries may require trial states with different numbers of configurations to achieve convergence. The table also shows energies at infinite separation, showing that the error in the ph-AFQMC/HCI dissociation energy compared to SHCI is 1.8(9) mH. This evidence of small non-parallelity errors suggests that in large problems, scalability issues of HCI trial states can be mitigated using active spaces.

d
. | ph-AFQMC/HCI . | SHCI . | Phaseless error . |
---|---|---|---|

2.65 | −102.4993(5) | −102.5040(5) | 4.6(7) |

3.06 | −102.5531(5) | −102.5584^{21} | 5.3(5) |

3.59 | −102.5269(7) | −102.5325(5) | 5.6(8) |

∞ | −102.3862(8) | −102.3933^{21} | 7.1(8) |

### C. Dipole moments

The mixed estimator of an observable $O\u0302$ in the ground state is given by

where $\Psi 0$ is the ground state. In ph-AFQMC, one has access to an approximate sampling of the ground state, making the calculation of this mixed estimator convenient,

where

is the local observable value.

For observables $O\u0302$ that do not commute with the system Hamiltonian, the mixed estimator has a systematic bias due to the trial state used to measure the observable. This bias can be reduced by using the extrapolated estimator, often employed in diffusion Monte Carlo,^{56–58} given by

where the variational estimator of the trial state is used to correct the mixed estimator. The accuracy of this extrapolation depends critically on the quality of the trial state. Since ph-AFQMC calculations on molecular systems are typically performed using crude single determinant trials, extrapolation is not a viable option, and backpropagation is used to evaluate variational estimators of the sampled state instead.^{59} Here, we study the behavior of the mixed and extrapolated estimators of dipole moments of a few small molecules. The evaluation of the local dipole moment for an HCI trial state is similar to the force bias calculation outlined in Sec. III A since they both involve matrix elements of one-electron operators.

We first look at the exact solvable problem of NH_{3} in the 6-31G basis (see the supplementary material for geometry). We fully diagonalize this (10e, 14o) problem and calculate its ground state dipole moment. Mixed and extrapolated ph-AFQMC estimators were evaluated with trial states consisting of an increasing number of leading configurations from the ground state in the RHF canonical orbital basis. Figure 4 shows the convergence of the estimators with the trial state. In this case, we see a systematic convergence of the variational and mixed estimators to the exact value. The extrapolation works reasonably well with consistently smaller errors than the mixed estimator.

Table III shows the calculated dipole moments for three molecules at equilibrium geometries in the aug-cc-pVQZ basis. The large basis set precludes exact evaluation, but CCSD(T) moments are known to be very accurate for the molecules considered here and agree well with the experimental values. We computed orbital relaxed coupled cluster and MP2 dipole moments by calculating energies at two different electric fields and evaluating the numerical derivative. For obtaining ph-AFQMC/HCI estimators, we first performed a full-valence CASSCF calculations. The trial states were then constructed using HCI to correlate all electrons in a space of 50 CASSCF orbitals. In all cases, both mixed and extrapolated estimators were nearly converged with the number of configurations in the active space (details can be found in the supplementary material). Consider the case of the challenging CO molecule, which has a small dipole moment, and RHF predicts the wrong direction polarity in this case. Furthermore, MP2 overestimates the dipole moment significantly. The convergence of the dipole moment estimators with the number of configurations in the trial state is shown in Fig. 5. For a small number of configurations (fewer than 100), the variational estimator is qualitatively incorrect similar to RHF. Adding more configurations flips the sign, and the variational estimator seems to eventually converge to the asymptotic value for the truncated space of 50 orbitals. The mixed estimator also exhibits a non-monotonic convergence with its value for a single determinant close to the experimental value because of a fortuitous cancellation of mixed estimator and phaseless biases. The extrapolated estimator for the small number of configurations in the trial is very poor due to the erroneous variational estimates. For more than 10^{3} configurations, while the mixed estimator seems to be converging to a substantially biased dipole moment value, the extrapolation rectifies this bias reasonably well. The extrapolated estimators are in good agreement with the CCSD(T) and experimental values for all three molecules. Evidently, extrapolation effectively corrects the out of active space bias in the mixed estimator for CO and BF.

. | ph-AFQMC/HCI . | . | . | . | . | . | |
---|---|---|---|---|---|---|---|

Molecule . | Mixed . | Extrapolated . | RHF . | MP2 . | CCSD . | CCSD(T) . | Experiment . |

CO | 0.070(3) | 0.044(6) | −0.104 | 0.108 | 0.024 | 0.048 | 0.048^{60} |

BF | 0.356(4) | 0.321(8) | 0.333 | 0.377 | 0.314 | 0.325 | ⋯ |

H_{2}O | 0.726(2) | 0.734(4) | 0.780 | 0.733 | 0.738 | 0.729 | 0.730^{61} |

## V. CONCLUSION

In this work, we presented efficient algorithms for using selected CI trial states in ph-AFQMC. We demonstrated their favorable scaling by showing that simulations with long sCI expansions incur little overhead compared to a single reference trial. In our analysis of the convergence of phaseless error in ground-state energies and dipole moments as a function of the number of configurations, we found it to be non-monotonic in some cases, highlighting the importance of using a sufficient number of determinants in these systems. The use of states restricted to moderately sized active spaces yielded excellent results for energy differences and dipole moments. Our numerical experiments suggest that this may be a practical way to tackle larger systems, where the generation and handling of sCI states in the full space are challenging. Finally, we showed that the extrapolated estimator for dipole moments is accurate if enough configurations are present in the trial state. If this behavior turns out to be true in general, extrapolated estimators could be a viable alternative to backpropagation for calculating ground state properties.

Our technique can be employed in different ways for studying challenging systems. It can be used to validate the use of simpler and less expensive trial states in ph-AFQMC for large calculations. ph-AFQMC/HCI can be adopted as an accurate solver in embedding schemes for describing the embedded cluster. It would be interesting to study the feasibility of the extrapolated estimator in larger systems and for calculating two-body properties. Finally, we note that this technique can be utilized in the quantum–classical hybrid AFQMC (QC-AFQMC) for prototypical and practical quantum computations.^{62}

## SUPPLEMENTARY MATERIAL

See the supplementary material for geometries and details of wave functions used.

## ACKNOWLEDGMENTS

A.M. and S.S. were supported by the National Science Foundation through Grant No. CHE-1800584. S.S. was also partly supported through the Sloan research fellowship. J.L. thanks David Reichman for support and encouragement. All calculations were performed on the Blanca and Summit clusters at CU Boulder.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

## DATA AVAILABILITY

The data that support the findings of this study are available in a public repository in Ref. 51.

## REFERENCES

_{36}fullerene and an iron porphyrin model complex

^{4}He