Benchmark datasets are crucial for developing and assessing methods for the treatment of electron correlation. In the past several years, the Simons Collaboration on the many-electron problem has produced multiple important benchmark studies on the two-dimensional Hubbard model,1–3 transition metal molecules,4 and hydrogen chains.5,6 These studies have offered new insight into the underlying difficulties associated with distinct electronic structure approaches in a variety of settings and have provided the community with state-of-the-art benchmarks where exact, unbiased calculations are generally not yet feasible.

The recent blind test7 by Eriksen et al. contributes to the body of knowledge in the field of computational electronic structure theory in a similar manner. The target application of the study is the calculation of the non-relativistic Born–Oppenheimer frozen-core correlation energy of a benzene molecule in the cc-pVDZ basis set,7 with a resulting correlation space of 30 electrons and 108 orbitals. Unlike the benchmark studies of the Simons Collaboration on the many-electron problem, the work of Ref. 7 is focused on a single-point calculation but is completely blind, such that the authors have contributed their final results without knowledge of the exact answer or the results from other contributors. This latter aspect significantly enhances the unbiased assessment of competing and complementary approaches.

The blind test reports the frozen-core correlation energies from a total of eight methods, all developed by the authors of Ref. 7. These methods can be broadly grouped into five categories: (1) one based on a many-body expansion approach (MBE-FCI), (2) three based on a selected configuration interaction approach with a second-order perturbative correction (ASCI, iCI, and SHCI), (3) one based on a selected coupled-cluster theory approach with a second-order perturbative correction [FCCR or more precisely FCCR(2)], (4) one based on a matrix product state parametrization (DMRG), and (5) two based on the full configuration interaction quantum Monte Carlo (AS-FCIQMC and CAD-FCIQMC). See Ref. 7 for further information on each method.

In the present note, we examine the accuracy of phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) on the identical problem posed by Eriksen et al. ph-AFQMC is a method that has prominently featured in several benchmark studies led by the Simons Collaboration1–6 and has stood out as a flexible and state-of-the-art ab initio approach.8–27 ph-AFQMC is a projector MC method that naturally parametrizes the wavefunction in a non-linear fashion. We refer the reader to the recent review by Motta and Zhang for details of the approach.28 The only uncontrolled bias introduced in ph-AFQMC is the error due to the phaseless constraint8 imposed via a predefined trial wavefunction. While one must be cognizant of this bias, imposing the phaseless constraint is necessary to remove the fermionic sign (or phase) problem entirely. In other words, due to the constraint, statistical errors do not grow exponentially with the system size and one does not need exponentially many walkers to gather reasonable statistics. Furthermore, as long as the underlying phaseless constraint is imposed with a size-consistent trial wavefunction, the approach guarantees size-consistency overall.23 One major potential drawback of this method is that the resulting ph-AFQMC energy is not variational.29 It has been shown that ph-AFQMC can be exceptionally accurate for systems with mainly dynamic correlation (such as benzene).14,19,20,23,25 We also note that ph-AFQMC has been shown to perform well on systems with strong correlation, although complicated trial wavefunctions have often been necessary in such cases.2,3,5,6,19,20,25 Given the features of ph-AFQMC outlined above, as well as the fact that it falls in a class distinct from the five categories of the examined methods, we believe that the addition of ph-AFQMC results to those of Ref. 7 would be quite useful. We provide these results here, along with some additional observations associated with the use and accuracy of ph-AFQMC in its most scalable form. Obviously, our results are not “blind” in the manner of those presented in Ref. 7; however, we strive to present completely unbiased and unadjusted results, which reflect standard practice and complete convergence.
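To make the role of the phaseless constraint concrete, the following minimal sketch shows how a single walker weight could be updated in the commonly described form of the approximation (see Refs. 8 and 28): the magnitude of the importance factor is retained, and the weight is further multiplied by max(0, cos Δθ), where Δθ is the phase of the overlap ratio with the trial wavefunction. The function and argument names are hypothetical and chosen for illustration only; they do not correspond to the QMCPACK implementation used in this work.

```python
import cmath
import math


def phaseless_weight_update(weight, importance, overlap_ratio):
    """Return a walker weight after one phaseless-constrained step.

    weight        : current real, non-negative walker weight
    importance    : complex importance-sampling factor for this step
    overlap_ratio : <trial|walker_new> / <trial|walker_old> (complex)

    All names are illustrative assumptions, not a library API.
    """
    # Phase rotation of the walker relative to the trial wavefunction.
    dtheta = cmath.phase(overlap_ratio)
    # Keep only the magnitude of the importance factor and project onto
    # the real axis; walkers rotated by more than 90 degrees are removed.
    return weight * abs(importance) * max(0.0, math.cos(dtheta))


if __name__ == "__main__":
    # No phase rotation: the weight is simply rescaled.
    print(phaseless_weight_update(1.0, 1.2 + 0.0j, 1.0 + 0.0j))
    # Rotation by pi: the walker is discarded.
    print(phaseless_weight_update(1.0, 1.2 + 0.0j, -1.0 + 0.0j))
```

Because the updated weights remain real and non-negative, the statistical averages do not suffer from phase cancellation, at the cost of the trial-dependent bias discussed above.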

The choice of trial wavefunction wholly determines the accuracy of ph-AFQMC. It is possible to exploit multi-determinant (MD) trial wavefunctions and observe a convergence of the ph-AFQMC energy with respect to the number of determinants.23,30 While large MD trial wavefunctions may yield near-exact energies, such wavefunctions are not scalable in general and destroy the size-consistency of ph-AFQMC when the determinantal expansion is aggressively truncated. A size-consistent, scalable trial wavefunction is a single determinant (SD) wavefunction. The combination of ph-AFQMC with an SD trial wavefunction [e.g., Hartree–Fock (HF) orbitals,23 Kohn–Sham density functional theory orbitals,20 and approximate Brueckner orbitals25] has previously demonstrated relatively high accuracy and scalability. In particular, in our experience, the use of an HF trial with ph-AFQMC provides accuracy at roughly the CCSD(T) level while enabling the treatment of larger systems.11,23,25,27 Further studies are necessary to understand the scope of ph-AFQMC with an SD trial relative to that of CCSD(T). The ph-AFQMC algorithm with an SD trial consists of a quartic-scaling molecular orbital transformation of Cholesky-decomposed integrals [O(N^4)] performed once at the beginning of the QMC run, cubic-scaling propagation [O(N^3)] for each time step, and the local energy evaluation [O(N^4), or O(N^3) with recent advances22,24,31] for each sampled MC block. The number of samples required for a fixed statistical error scales as O(N^2), but this overhead is not difficult to address in practice because the sampling is highly parallelizable. Therefore, with recent developments,22,24,31 the method is overall quintic-scaling for a fixed statistical error, which is more scalable than many other state-of-the-art approaches.
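One way to read the overall scaling quoted above is as the product of the number of samples and the cost per sample, with the one-time integral transformation entering only as a lower-order term. The grouping below is a schematic illustration of that counting, not a rigorous cost model:

```latex
\underbrace{\mathcal{O}(N^{2})}_{\text{samples for fixed error}}
\times
\underbrace{\mathcal{O}(N^{3})}_{\text{propagation and local energy per sample}}
\;+\;
\underbrace{\mathcal{O}(N^{4})}_{\text{one-time integral transformation}}
\;=\; \mathcal{O}(N^{5}).
```

If the local energy is instead evaluated at O(N^4) cost, the same counting would give O(N^6) for a fixed statistical error.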

Respecting the unbiased nature of this benchmark, instead of focusing on removing the phaseless error via large MD trial wavefunctions, we first employed the most scalable (and yet least accurate) trial wavefunction, an SD trial based on spin-restricted HF (RHF) orbitals. We refer to the resulting method as ph-AFQMC + RHF. We emphasize that ph-AFQMC with an SD trial wavefunction is what most practitioners of ph-AFQMC would employ for general large-scale ab initio applications. Furthermore, we examined the improvement that one gains by using a simple MD trial wavefunction based on a complete active space self-consistent field (CASSCF) wavefunction with a six-electron and six-orbital active space (π and π* orbitals). We refer to this method as ph-AFQMC + CAS(6,6).32 We used QMCPACK33,34 to run ph-AFQMC calculations on benzene. ph-AFQMC + RHF was run with the cc-pVDZ, cc-pVTZ, and cc-pVQZ basis sets,35 whereas ph-AFQMC + CAS(6,6) was performed only for the cc-pVDZ basis set. We used 1024 walkers for ph-AFQMC + RHF and 1280 walkers for ph-AFQMC + CAS(6,6). A time step of 0.005 a.u. was used. The population control bias and the time step error were found to be negligible for the purpose of this study. Molecular integrals for QMCPACK were generated by PySCF.36 CCSD and CCSD(T) calculations were performed with Q-Chem.37 The smallest basis set, cc-pVDZ, was used in the blind test,7 but we further report larger basis set results along with the T–Q extrapolated complete basis-set (CBS) energy according to Helgaker’s formula.38 The assessment of different approaches in the complete basis set limit is also important for the detailed evaluation of various methodologies.
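The T–Q extrapolation can be sketched as follows, assuming the standard two-point X^-3 form associated with Ref. 38, E_X = E_CBS + A X^-3, with cardinal numbers X = 3 (cc-pVTZ) and X = 4 (cc-pVQZ). The function names are illustrative, and the error propagation assumes that the two statistical uncertainties are independent, which is our assumption rather than a statement from the text. The input energies are the ph-AFQMC + RHF and CCSD(T) values listed in Table II.

```python
import math


def cbs_two_point(e_x, e_y, x=3, y=4):
    """Two-point X^-3 extrapolation of correlation energies.

    Assumes E_X = E_CBS + A * X**-3 for cardinal numbers x < y.
    """
    return (y**3 * e_y - x**3 * e_x) / (y**3 - x**3)


def cbs_two_point_error(s_x, s_y, x=3, y=4):
    """Propagate independent statistical errors through cbs_two_point."""
    return math.hypot(y**3 * s_y, x**3 * s_x) / (y**3 - x**3)


# cc-pVTZ and cc-pVQZ correlation energies (mEh) copied from Table II.
afqmc_cbs = cbs_two_point(-1033.7, -1085.5)
afqmc_err = cbs_two_point_error(0.3, 0.4)
ccsd_t_cbs = cbs_two_point(-1027.3, -1079.0)

print(f"ph-AFQMC + RHF CBS: {afqmc_cbs:.1f} +/- {afqmc_err:.1f} mEh")
print(f"CCSD(T) CBS:        {ccsd_t_cbs:.1f} mEh")
```

Under these assumptions, the sketch gives −1123.3(7) mEh for ph-AFQMC + RHF and −1116.7 mEh for CCSD(T), in agreement with the CBS row of Table II.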

As mentioned previously, ph-AFQMC + RHF places emphasis on scalability over accuracy. As a result, in Table I, we see that ph-AFQMC + RHF deviates by −3.1(3) mEh from the value on which many methods agree (i.e., −863 mEh). We note that this deviation is comparable in magnitude to that of ASCI [+3.0(2) mEh], as well as CCSDT/CCSD(T), although the direction of the deviation clearly reflects the non-variational nature of ph-AFQMC. This ph-AFQMC + RHF calculation required modest computational resources: 4 h with 32 graphics processing units (GPUs) [NVIDIA V100 (VOLTA); 4 GPUs per node and 16 GB per GPU]. On the other hand, ph-AFQMC + CAS(6,6) is more accurate than ph-AFQMC + RHF, while its scalability is ultimately limited by the CAS calculation itself for general applications. The resulting ph-AFQMC + CAS(6,6) energy deviation from the “exact” answer (−863 mEh) is −1.3(4) mEh, which is about a factor of 2.4 improvement over ph-AFQMC + RHF. This deviation is comparable to that of SHCI [−1(2) mEh] and highlights the accuracy of ph-AFQMC with a relatively simple MD trial wavefunction. It should be noted that more accurate ASCI and SHCI post-blind-test results may be found in the supplementary material of Ref. 7. The ph-AFQMC + CAS(6,6) calculation was performed on 160 cores [Intel(R) Xeon(R) Gold 6148 CPU @ 2.40 GHz; 40 cores per node] for 10 h.
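The deviations quoted in this paragraph follow directly from the entries of Table I. The short sketch below recomputes them, treating −863 mEh as an exact reference (so that each deviation simply inherits the uncertainty of the corresponding method); the variable names are illustrative.

```python
# Reference value on which several blind-test methods agree (mEh).
E_REF = -863.0

# (energy, uncertainty) in mEh, copied from Table I.
results = {
    "ASCI": (-860.0, 0.2),
    "SHCI": (-864.0, 2.0),
    "ph-AFQMC + CAS(6,6)": (-864.3, 0.4),
    "ph-AFQMC + RHF": (-866.1, 0.3),
}

# Signed deviation from the reference, rounded to the reported precision.
deviations = {name: round(e - E_REF, 1) for name, (e, _) in results.items()}

# Improvement of the CAS(6,6) trial over the RHF trial.
ratio = deviations["ph-AFQMC + RHF"] / deviations["ph-AFQMC + CAS(6,6)"]

for name, dev in deviations.items():
    print(f"{name:22s} {dev:+.1f} mEh")
print(f"RHF / CAS(6,6) deviation ratio: {ratio:.1f}")
```

The negative signs for both ph-AFQMC entries make the non-variational character of the method explicit, while the ratio recovers the factor of about 2.4 stated above.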

TABLE I. The frozen-core correlation energy of benzene in the cc-pVDZ basis set using various methods. All energies other than CCSD(T) and ph-AFQMC were taken from the blind test results in Ref. 7. Note that the numbers in parentheses, except those for AS-FCIQMC and ph-AFQMC, represent author-assessed uncertainties associated with a method-specific extrapolation procedure. These uncertainties are not directly comparable between different methods since they were estimated by different means as explained in the supplementary material of Ref. 7.

Method                 Ecorr (mEh)
CCSD(T)                −859.5
CCSDT                  −859.9
ASCI                   −860.0(2)
iCIPT2                 −861.1(5)
CCSDTQ                 −862.4
DMRG                   −862.8(7)
FCCR(2)                −863.0
MBE-FCI                −863.0
CAD-FCIQMC             −863.4
AS-FCIQMC              −863.7(3)
SHCI                   −864(2)
ph-AFQMC + CAS(6,6)    −864.3(4)
ph-AFQMC + RHF         −866.1(3)

Finally, we report the larger basis set ph-AFQMC + RHF results and the corresponding CBS limit energy in Table II. The correlation space increases from 108 orbitals (cc-pVDZ) to 258 orbitals (cc-pVTZ) and 504 orbitals (cc-pVQZ). As in the cc-pVDZ case, the ph-AFQMC + RHF correlation energies are 6 mEh–7 mEh lower than those of CCSD(T) in both bases and in the complete basis set limit. We expect converged ph-AFQMC with an MD trial wavefunction to lie between these two numbers, as it does for cc-pVDZ. The same computational resources as those of ph-AFQMC + RHF/cc-pVDZ were used for cc-pVTZ, and 64 GPUs were used for 4 h for cc-pVQZ.

TABLE II. The frozen-core correlation energy (mEh) of benzene using ph-AFQMC + RHF, CCSD, and CCSD(T) in the cc-pVTZ and cc-pVQZ basis sets.

Basis      ph-AFQMC + RHF    CCSD       CCSD(T)
cc-pVTZ    −1033.7(3)        −975.2     −1027.3
cc-pVQZ    −1085.5(4)        −1027.3    −1079.0
CBS        −1123.3(7)        −1057.4    −1116.7

In summary, we report ph-AFQMC correlation energies for the problem posed in the recent blind test of Ref. 7, namely, a benzene molecule in the cc-pVDZ basis set. In addition, we report the ph-AFQMC + RHF correlation energies in larger basis sets (cc-pVTZ and cc-pVQZ) along with the extrapolated complete basis set limit correlation energy. We believe that due to the accuracy, flexibility, and scalability of the approach, the addition of ph-AFQMC results to those of the recent blind test will contribute to the informed use of a broad set of methods to tackle diverse electronic structure problems. Challenges to objective benchmark studies include the broad coverage of relevant methods, the choice of representative targets such as energy differences, and the study of systems as close as possible to the complete basis set limit. The recent studies by the Simons many-electron collaboration1–6 and by Eriksen et al.7 illustrate community efforts toward this goal, to which we hereby add restricted but useful information concerning the ph-AFQMC approach.

The data that support the findings of this study are available within the article.

D.R.R. was supported by Grant No. NSF-CHE 1954791. The work of F.D.M. was supported by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, as part of the Computational Materials Sciences Program and Center for Predictive Simulation of Functional Materials (CPSFM). The work of F.D.M. was performed under the auspices of the U.S. DOE by LLNL under Contract No. DE-AC52-07NA27344. Some of the AFQMC calculations received computing support from the LLNL Institutional Computing Grand Challenge program.

1. J. P. F. LeBlanc et al., Phys. Rev. X 5, 041041 (2015).
2. B.-X. Zheng et al., Science 358, 1155 (2017).
3. M. Qin et al., Phys. Rev. X 10, 031016 (2020).
4. K. T. Williams et al., Phys. Rev. X 10, 011041 (2020).
5. M. Motta et al., Phys. Rev. X 7, 031059 (2017).
6. M. Motta et al., Phys. Rev. X 10, 031058 (2020).
7. J. J. Eriksen et al., arXiv:2008.02678 (2020).
8. S. Zhang and H. Krakauer, Phys. Rev. Lett. 90, 136401 (2003).
9. W. A. Al-Saidi, H. Krakauer, and S. Zhang, Phys. Rev. B 73, 075103 (2006).
10. M. Suewattana et al., Phys. Rev. B 75, 245123 (2007).
11. W. Purwanto et al., J. Chem. Phys. 128, 114309 (2008).
12. W. Purwanto, S. Zhang, and H. Krakauer, J. Chem. Phys. 142, 064302 (2015).
13. M. Motta and S. Zhang, J. Chem. Theory Comput. 13, 5367 (2017).
14. H. Hao et al., J. Phys. Chem. Lett. 9, 6185 (2018).
15. Y. Liu, M. Cho, and B. Rubenstein, J. Chem. Theory Comput. 14, 4722 (2018).
16. J. Shee et al., J. Chem. Theory Comput. 14, 4109 (2018).
17. M. Motta and S. Zhang, J. Chem. Phys. 148, 181101 (2018).
18. S. Zhang, F. D. Malone, and M. A. Morales, J. Chem. Phys. 149, 164102 (2018).
19. J. Shee et al., J. Chem. Theory Comput. 15, 2346 (2019).
20. J. Shee et al., J. Chem. Theory Comput. 15, 4924 (2019).
21. M. Motta, S. Zhang, and G. K.-L. Chan, Phys. Rev. B 100, 045127 (2019).
22. F. D. Malone, S. Zhang, and M. A. Morales, J. Chem. Theory Comput. 15, 256 (2019).
23. J. Lee, F. D. Malone, and M. A. Morales, J. Chem. Phys. 151, 064122 (2019).
24. J. Lee and D. R. Reichman, J. Chem. Phys. 153, 044131 (2020).
25. J. Lee, F. D. Malone, and M. A. Morales, J. Chem. Theory Comput. 16, 3019 (2020).
26. Y. Liu et al., J. Chem. Theory Comput. 16, 4298 (2020).
27. F. D. Malone, S. Zhang, and M. A. Morales, J. Chem. Theory Comput. 16, 4286 (2020).
28. M. Motta and S. Zhang, Wiley Interdiscip. Rev.: Comput. Mol. Sci. 8, e1364 (2018).
29. J. Carlson et al., Phys. Rev. B 59, 12788 (1999).
30. E. J. Landinez Borda, J. Gomez, and M. A. Morales, J. Chem. Phys. 150, 074105 (2019).
31. M. Motta, J. Shee, S. Zhang, and G. K.-L. Chan, J. Chem. Theory Comput. 15, 3510 (2019).
32. The determinantal expansion in the CAS(6,6) trial wavefunction was truncated by a coefficient threshold of 0.999 999, which yielded a variational energy that is essentially identical to the full determinantal expansion. The resulting truncated trial wavefunction consists of a total of 87 determinants. We further note that CASSCF changes the 1s core orbitals of carbon atoms from those of RHF, but the corresponding core relaxation lowers the CASSCF energy only by 0.8 μEh. Such a small effect is negligible compared to the statistical error in ph-AFQMC, and therefore, we ignore this.
33. J. Kim et al., J. Phys.: Condens. Matter 30, 195901 (2018).
34. P. R. C. Kent et al., J. Chem. Phys. 152, 174105 (2020).
35. T. H. Dunning, J. Chem. Phys. 90, 1007 (1989).
36. Q. Sun et al., Wiley Interdiscip. Rev.: Comput. Mol. Sci. 8, e1340 (2017).
37. Y. Shao et al., Mol. Phys. 113, 184 (2015).
38. T. Helgaker et al., J. Chem. Phys. 106, 9639 (1997).