Approximate solutions to the ab initio electronic structure problem have been a focus of theoretical and computational chemistry research for much of the past century, with the goal of predicting relevant energy differences to within “chemical accuracy” (1 kcal/mol). For small organic molecules, or in general, for weakly correlated main group chemistry, a hierarchy of single-reference wave function methods has been rigorously established, spanning perturbation theory and the coupled cluster (CC) formalism. For these systems, CC with singles, doubles, and perturbative triples is known to achieve chemical accuracy, albeit at (N7) computational cost. In addition, a hierarchy of density functional approximations of increasing formal sophistication, known as Jacob’s ladder, has been shown to systematically reduce average errors over large datasets representing weakly correlated chemistry. However, the accuracy of such computational models is less clear in the increasingly important frontiers of chemical space including transition metals and f-block compounds, in which strong correlation can play an important role in reactivity. A stochastic method, phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC), has been shown to be capable of producing chemically accurate predictions even for challenging molecular systems beyond the main group, with relatively low (N3 − N4) cost and near-perfect parallel efficiency. Herein, we present our perspectives on the past, present, and future of the ph-AFQMC method. We focus on its potential in transition metal quantum chemistry to be a highly accurate, systematically improvable method that can reliably probe strongly correlated systems in biology and chemical catalysis and provide reference thermochemical values (for future development of density functionals or interatomic potentials) when experiments are either noisy or absent. Finally, we discuss the present limitations of the method and where we expect near-term development to be most fruitful.
I. PRELIMINARIES
The scope of this Perspective will be restricted to the following:
Solving what is commonly known as the electronic structure problem, i.e., finding the lowest eigenvalue of the Schrödinger equation (of a given symmetry) under the Born–Oppenheimer Hamiltonian;
quantum chemical calculations on classical computing devices. We posit that, in parallel with the rapid development of quantum algorithms and hardware, the continued advancement of classical approaches will enable robust, converged predictions—ideally from more than one method—for challenging molecular systems; and
the background, present state, and future prospects of auxiliary-field quantum Monte Carlo (AFQMC) with an emphasis on chemical applications.
II. WHY MORE ACCURATE AND SCALABLE METHODS ARE STILL NEEDED IN 2023
The coupled cluster (CC) model with single, double, and perturbative triple excitations [CCSD(T)]1 is widely accepted as the “gold standard” method for small-molecule, main-group chemistry. Indeed, it has paved the way for the development of today’s state-of-the-art semi-empirical density functional approximations. The most successful density functionals have (1–10) parameters optimized to minimize errors vs reference values, typically from CCSD(T) extrapolated to the complete basis set (CBS) limit, for main-group chemical reactions.2,3 These computationally efficient density functionals, in turn, enabled the accurate parameterization of interatomic potentials amenable to large-scale molecular simulations,4 e.g., via fits to density functional theory (DFT) torsional scans. Notable demonstrations of transformational technologies in drug discovery attest to the academic and commercial utility due to this hierarchy of computational methods.5
In recent years, significant efforts have been made to develop machine learning architectures that can either directly predict energies from structural features6–10 or correct mean-field or tight-binding electronic structure methods.11–13 Machine learning has also been used to develop novel density functionals14 and expand the reach of variational Monte Carlo with neural network many-body wave functions.15 Although more data-efficient architectures for these applications are under rapid development,16 often relying on the inclusion of physically motivated constraints,17 these models generally require large amounts of training data and are, thus, inherently limited by the cost and accuracy of the high-level methods on which they are trained. The reliability of CCSD(T) and even many density functional approximations, as well as plentiful gas-phase experimental measurements, for small main-group molecules has enabled the development of efficient machine learning-based approaches for this region of chemical space.
However, the chemical elements found in organic molecules form a very small subset of the entire Periodic Table. The involvement of d and f orbitals in transition metals, lanthanides, and actinides opens up a world of reactive possibilities. Compounds with such elements play important roles in most areas of chemistry, ranging from inorganic chemistry, biochemistry, and materials science to radiochemistry, drug design, chemical catalysis, and emerging fields of molecular magnetism and quantum information science. Can computational chemists routinely perform predictive simulations—e.g., of redox events, reaction pathways, or binding free energies—with kcal/mol accuracy for chemical systems containing elements beyond the main group? We believe the answer is “not yet.”
The fundamental reason is a lack of accurate reference data in systems with d and f electrons, which becomes a bottleneck despite machine learning advances and progress in molecular mechanics-inspired potentials.18,19 This is, in part, a consequence of the scarcity of reliable, gas-phase experimental data (though progress in experimental techniques is encouraging20), and reported measurements often are accompanied by large uncertainties. Meanwhile, the computational cost of exact theoretical approaches, which, in principle, could yield predictions of quality comparable to experiment scales exponentially with system size. For approximate computational methods, several factors contribute to make systematically accurate calculations very challenging. d and f orbitals can give rise to strong static correlation, which arises from energetic near-degeneracies in transition metal, lanthanide, and actinide complexes. This type of strong correlation, also known as multireference character,21 can occur in molecules wherein spatially localized electrons are magnetically coupled. Chemically relevant examples include bridged multi-metal clusters, reduced monometal complexes with redox non-innocent ligands, or (poly)-radical sites in extended organic chromophores such as polyacenes.21,22 This low-spin, partial recoupling phenomenon is more readily found in intermediate regions of bond-dissociation curves. Another complicating factor that is not typically encountered in organic molecules is that these types of orbitals can also enable complex bonding (e.g., σ, π, and δ bonding) involving multiple dynamically correlated electron pairs. Furthermore, the presence of both types of correlations is coupled with enhanced and potentially significant contributions from relativistic effects.
While CCSD(T) typically yields sub-kcal/mol accuracy for properties related to thermochemistry, kinetics, non-covalent interactions, etc., of small organic molecules, its accuracy has been questioned when it comes to transition metal systems21,23,24 and also larger (especially polarizable) systems.25,26 We believe that methodological advances within CC theory, including the development of multireference variants,27–29 spinor-based relativistic variants,30,31 and fine-tuned localization protocols,32,33 can substantially improve the outlook for d- and f-electron systems. Many other quantum chemical methods have also been designed to treat systems with strong electronic correlations. Among them are multireference perturbation theories based on the driven similarity renormalization group,34,35 non-orthogonal configuration interaction36 with the inclusion of dynamic correlation,37 multi-configurational density functional theory,38 approaches based on the density matrix renormalization group,39–41 and both selected42,43 and stochastic44 approaches to approximate full configuration interaction.
Regarding transition metal thermochemistry, it has not yet been convincingly demonstrated that an approximate quantum-chemical model can produce predictions on available computing resources that are accurate to within 1–2 kcal/mol of exact theoretical predictions or experimental measurements for a test set of cases, each containing atoms. To achieve such a goal will require continued development of the leading quantum chemical methods, and quite possibly, comparative use of more than one approximate method. The key challenge is to treat both static and dynamic correlations accurately and on an equal footing, while avoiding exponentially growing computational costs with system size. In our view, the ph-AFQMC method stands out in this regard as a promising candidate.
III. AFQMC IN CONTEXT
Mature and robust electronic structure methods typically undergo decades of development, validation, and optimization. In this sense, AFQMC is still relatively new compared to most methods in the standard quantum chemistry repertoire. In this section, we chart the trajectory and maturation of AFQMC in the context of electronic structure theory.
Formal details are reviewed in Refs. 45–47. Very briefly, the lowest-energy eigenstate of a Hamiltonian can be obtained from an arbitrary initial wave function (with non-zero overlap with the ground state) by employing a Wick rotation, it → τ, in the time-dependent Schrödinger equation. Imaginary time propagation will then exponentially damp the coefficients of all other eigenstates. In AFQMC, the propagator is mapped to an integral of one-body operators in auxiliary fields via the Hubbard–Stratonovich transformation. In a seminal work, Zhang and Krakauer introduced a computational framework for fermionic systems that uses Monte Carlo to sample a manifold of non-orthogonal Slater determinants while avoiding the exponential decrease of the signal-to-noise ratio (with respect to system size or imaginary time projection). A key idea was to reformulate the field-theoretic representation of path integrals of actions into open-ended random walks in a space of DFT-like solutions in stochastic auxiliary fields, which afforded natural connections to the formalism familiar to the electronic structure community. This makes possible48 the introduction of a second ingredient, a constraint that imposes a gauge condition on the phases of the Slater determinants, yielding a polynomial scaling algorithm at the expense of a systematically improvable bias.49 The resulting method is non-perturbative and naturally multireference.
A unique computational advantage of AFQMC, when compared with more traditional quantum chemical methods, is that it can be implemented in an essentially embarrassingly parallel way, with random walkers divided into subsets and propagated on different compute nodes. In recent years, AFQMC has undergone intense optimization efforts. Recognizing that the entire algorithm can be executed with matrix computations, ph-AFQMC has been dramatically accelerated via the use of graphical processing units (GPUs).50,51 Another significant advance is the recent implementation of a single-parameter localized orbital approximation, which does not depend on single-reference methods to construct highly compact orbital spaces.52 An all-electron, localized-orbital ph-AFQMC calculation of the Fe(acac)3 complex with around 1,000 basis functions (cc-pVTZ) and a trial wave function with ∼100 determinants requires only 3 h of wall-time, running on 100 nodes (on current Summit resources). Additional algorithmic advances include efficient ways to utilize multi-determinant trial wave functions,50,53 a correlated sampling approach to more efficiently converge energy differences,54 tensor decompositions,55,56 frozen core and downfolding techniques,57 stochastic resolution-of-the-identity strategies,58 constraint release methods, nuclear gradients and geometry optimization,59,60 and the back-propagation algorithm to estimate observables that do not commute with the Hamiltonian.61
In the limit of an exact trial wave function, ph-AFQMC will recover the exact ground-state energy. However, in practice, the most important user input is the choice of an approximate trial wave function for the constraint. There is mounting evidence that for typical main-group compounds, accurate thermochemistry and even total energies within 1 kcal/mol (1.6 mHa) of the exact reference values can be obtained from ph-AFQMC calculations with complete active space self-consistent field (CASSCF) trial wave functions. For the ground-state of the carbon dimer, C2, at equilibrium, ph-AFQMC with truncated CASSCF expansions with less than 100 determinants is able to converge to the exact total energy.62 In the hydrogen chain, accurate results were obtained in a size-consistent manner across the entire regime of bond lengths with generalized Hartree–Fock (GHF) trial wave functions63 that better preserve symmetry.64
Selected CI wave functions, which can approximately solve active spaces far exceeding 18 orbitals,65 have recently emerged as promising trial wave functions for ph-AFQMC.66 For C2, convergence to the exact total energy has been shown with a selected CI trial (8e90o active space) with 103–4 determinants retained.67 Figure 1 shows that convergence to chemical accuracy can be achieved with selected CI trial wave functions in an active space of 8e16o with at least an order of magnitude fewer determinants. For a fixed number of determinants, it appears preferable to represent the trial wave function in its natural orbital (NO) basis—detailed investigations of these important subtleties are ongoing in our research groups and others.
Convergence of ph-AFQMC total energies with respect to the number of determinants kept in selected CI trial wave functions (ϵ1 = 10−4, 8e16o active space) for C2 at R = 1.242 53 Å. The cc-pVQZ basis is used; HCI refers to selected CI without orbital optimization in the canonical (CO) or natural orbital (NO) basis. The reference value is from an FCIQMC calculation.68 All statistical error bars are 0.6 mHa.
Convergence of ph-AFQMC total energies with respect to the number of determinants kept in selected CI trial wave functions (ϵ1 = 10−4, 8e16o active space) for C2 at R = 1.242 53 Å. The cc-pVQZ basis is used; HCI refers to selected CI without orbital optimization in the canonical (CO) or natural orbital (NO) basis. The reference value is from an FCIQMC calculation.68 All statistical error bars are 0.6 mHa.
We note that other types of trial wave functions have been tested,46,69–74 along with strategies that do not employ the phaseless constraint75,76 or release it after an initial equilibration period. For large molecules and/or strongly correlated states, the use of a trial wave function to implement the phaseless constraint is likely to be necessary to overcome the otherwise exponentially decaying signal-to-noise ratio. Thus, the search for robust and scalable procedures to produce trial wave functions, which can yield chemically-accurate ph-AFQMC predictions for large and diverse chemical systems, is arguably the most momentous outstanding challenge in this field.
IV. SALIENT FRONTIERS OF QUANTUM CHEMISTRY AND THE PROMISE OF AFQMC
A. Transition metals
3d transition metal compounds are particularly challenging from an electron correlation perspective (vs 4d and 5d elements), as the relatively weak ligand field splitting can result in small energy gaps between low- and high-spin states. In compounds with high symmetry or a low coordination number, there tends to be more degenerate or nearly-degenerate orbitals, and static correlation may arise from competing spin states,21 leading to a wide range of predictions from DFT functionals and a breakdown of the usual route to systematic improvement (ascending Jacob’s ladder for small organic molecules). In addition, bonding in organometallic complexes can involve both dispersion effects77 and correlations between multiple electron pairs (e.g., in a metal–carbonyl bond simultaneous σ-donation and π-backbonding is a six-electron process). Such complicated dynamical correlation effects are typically not found in small organic molecules and require a theoretical description beyond second-order in perturbation theory.21 As the high scaling of canonical CCSD and CCSD(T) renders these methods prohibitively costly, those seeking accurate predictions from wave function methods frequently turn to localized orbital approximations, the most popular being the domain based local pair natural orbital (DLPNO) approach.78,79 However, the reliability of DLPNO-CCSD(T) for some transition metal systems has been drawn into question,80,81 although extrapolation schemes appear promising.32,33
In light of this, we believe that another ab initio, potentially benchmark-quality method, which is systematically improvable and computationally feasible for realistic transition metal systems, would be highly desirable. In a series of papers focused on 3d transition metal compounds, the present authors have demonstrated that ph-AFQMC can achieve average errors of around 1 kcal/mol and maximum errors of roughly 3 kcal/mol for atomic ionization potentials,50 bond dissociation energies of metal–nonmetal diatomics82 and 4–6 coordinate complexes,83 and vertical and adiabatic ionization energies of metallocenes.84 The mean average errors from experimental measurements are shown in Table I.
Mean averaged error (kcal/mol) of ph-AFQMC calculations from gas-phase experiments for five different 3d transition metal-containing test sets: 10 atomic ionization potentials (AIP10, Ref. 50), 41 bond dissociation energies for metal–nonmetal diatomics (BDE41, Ref. 82), 34 ligand dissociation energies of 4–6 coordination compounds (LDE34, Ref. 83), and 6 metallocene adiabatic ionization potentials (MCIP6, Ref. 84).
Test set . | MAE . |
---|---|
AIP10 | 0.5 (0.7) |
BDE41 | 1.4 (0.4) |
LDE34 | 1.1 (0.3) |
MCIP6 | 1.7 (1.0) |
Test set . | MAE . |
---|---|
AIP10 | 0.5 (0.7) |
BDE41 | 1.4 (0.4) |
LDE34 | 1.1 (0.3) |
MCIP6 | 1.7 (1.0) |
In these studies, CASSCF trial wave functions limited (due to resource constraints) to 18 active orbitals were used. For many systems with strong static and/or dynamic correlations, it is unlikely that this type of trial wave function provides sufficient flexibility to conclusively demonstrate convergence of total energies with respect to the phaseless bias. Indeed, as shown explicitly in Fig. 2 for FeO, one of the most difficult cases encountered, even with selected CI trials with all orbitals active (computationally feasible in the def2-SVP basis), convergence of the total energies to within chemical accuracy is relatively laborious. However, the energy difference does converge rapidly with the percentage of the trial wave function retained, which is consistent with the chemically accurate thermochemical predictions (ionization potentials, bond dissociations, etc.) shown in our transition metal publications to date. To give a sense of the computational efficiency of this result, Fe, O, and FeO with 90% of the trial wave function weight retained corresponds to 2, 1, and 120 determinants, respectively (and likely to fewer in the natural orbital bases).
Convergence of ph-AFQMC total energies with respect to the percentage of the determinant weight kept in the Semistochastic Heat-Bath Configuration Interaction (SHCI) trial wave function (ϵ1 = 10−4, full active space) for FeO, Fe, and O toward the near-exact, ASCIPT2 reference values.24 The ph-AFQMC energy difference, i.e., the bond dissociation energy, is within the chemical accuracy over the range of percentages investigated, i.e., converges much more quickly than the total energies. All statistical error bars are 0.3 mHa. For comparison, CCSD and CCSD(T) dissociation energies are in error by 20.6 and 4.2 mHa, respectively.24
Convergence of ph-AFQMC total energies with respect to the percentage of the determinant weight kept in the Semistochastic Heat-Bath Configuration Interaction (SHCI) trial wave function (ϵ1 = 10−4, full active space) for FeO, Fe, and O toward the near-exact, ASCIPT2 reference values.24 The ph-AFQMC energy difference, i.e., the bond dissociation energy, is within the chemical accuracy over the range of percentages investigated, i.e., converges much more quickly than the total energies. All statistical error bars are 0.3 mHa. For comparison, CCSD and CCSD(T) dissociation energies are in error by 20.6 and 4.2 mHa, respectively.24
Mean absolute and maximum errors of various computational methods for the BDE41 dataset, with respect to experimental measurements. “QMC/X” indicates ph-AFQMC in a triple-ζ basis extrapolated to the CBS limit with triple- and quadruple-ζ calculations using method X [MP2 or CCSD(T)]. “BestQMC” indicates that some CBS extrapolations used X = ph-AFQMC. Reprinted (adapted) with permission from Shee et al., J. Chem. Theory Comput. 15, 2346-2358 (2019). Copyright 2019 American Chemical Society.
Mean absolute and maximum errors of various computational methods for the BDE41 dataset, with respect to experimental measurements. “QMC/X” indicates ph-AFQMC in a triple-ζ basis extrapolated to the CBS limit with triple- and quadruple-ζ calculations using method X [MP2 or CCSD(T)]. “BestQMC” indicates that some CBS extrapolations used X = ph-AFQMC. Reprinted (adapted) with permission from Shee et al., J. Chem. Theory Comput. 15, 2346-2358 (2019). Copyright 2019 American Chemical Society.
Kernel density estimation plot of the single-reference subset of metal-oxide dissociation energies and atomic ionization potentials with respect to SHCI reference calculations. Reprinted with permission from Williams et al., Phys. Rev. X 10, 011041 (2020). Copyright 2020 Authors, licensed under a Creative Commons Attribution 4.0 License.
Kernel density estimation plot of the single-reference subset of metal-oxide dissociation energies and atomic ionization potentials with respect to SHCI reference calculations. Reprinted with permission from Williams et al., Phys. Rev. X 10, 011041 (2020). Copyright 2020 Authors, licensed under a Creative Commons Attribution 4.0 License.
Furthermore, taking advantage of the fact that most quantities that can be measured experimentally are energy differences, a correlated sampling algorithm54 proved very effective in accelerating the convergence (with respect to statistical error bars) of the energy differences directly. Single-precision floating point arithmetic greatly accelerated the calculations on GPUs, though for larger systems than the ones investigated previously, mixed precision algorithms will likely be required. Figure 3 illustrates the remarkable accuracy of ph-AFQMC relative to other representative computational approaches for the BDE41 set.
For closed-shell and decidedly single-reference transition metal complexes, appropriately performed DLPNO-CCSD(T) calculations are expected to produce reliable reference values, which can be used, e.g., to assess the performance of various density functional approximations. Such high-quality datasets have recently been reported, such as MOR4186 and ROST61.87 However, for many electrochemically relevant, open-shelled transition metal systems, there is reason to suspect that methods based on CCSD(T) may not be sufficiently accurate. For example, the CC ansatz is based on a single-determinant reference, yet unrestricted Hartree–Fock (UHF) states are often heavily spin-contaminated, and it is not uncommon that ROHF calculations do not converge (or land on qualitatively wrong solutions) for spin states of high multiplicity. Furthermore, predictions (total energies and even energy differences) from CCSD(T) can differ substantially from those from CCSD. In our view, ph-AFQMC and its localized orbital implementation (which, unlike the DLPNO approximation, is not expected to be less efficient for multireference systems) can contribute a new route for accurate computations in situations of moderate to strong static correlations. A proof of concept study comparing ph-AFQMC and CCSD(T) for electrochemically relevant complexes is currently under way. Selected CI trial wave functions, including those generated via an iterative orbital optimization procedure,88 with (10–100) active orbitals are now feasible to obtain with modest computational resources. This represents a promising route toward systematic elimination of the phaseless bias.
When converged predictions from more than one systematically improvable, non-empirical, and formally independent method agree, the probability that all are identically wrong is negligibly small. This multi-method approach was convincingly invoked to unambiguously characterize the ground-state charge and spin order of a microscopic model system of cuprate high-temperature superconductors.89 In the same spirit, electronic structure properties of transition metal ions and oxide diatomics have been extensively benchmarked with a large number of ab initio methods, as shown in Fig. 4.85 AFQMC played an important role in both of these projects, and we believe that it will be a powerful tool going forward for quantum chemical applications. For example, when ph-AFQMC and CCSD(T) predictions agree for cases that do not exhibit notable multireference character, the consensus prediction is stronger than that from one method alone. This type of combined approach can potentially be the new “gold standard” and can be especially useful when experimental values do not exist.
B. The interplay of electron correlation and relativity
For heavier elements, i.e., those with occupied 4-5d and/or f orbitals, computationally tractable approximations to the full Dirac equation must be considered, complicating the job of quantum chemical modeling. An important challenge is to treat electron–electron correlations and both scalar and spin-dependent relativistic effects in an accurate and balanced manner. Currently available computational tools for such systems are significantly less robust vs nonrelativistic approaches. For example, variants of DFT incorporating scalar relativistic effects and spin–orbit coupling (SOC) to some extent yield average errors of 10 and 20 kcal/mol for thermochemistry involving relatively small actinide and lanthanide complexes, respectively.90–92 Recent progress in all-electron basis set development and composite correlated wave function approaches appears more promising.93–96 Encouragingly, new findings suggest that the inclusion of spin-dependent relativistic effects, e.g., via two-component formalisms with complex-valued orbitals, can reduce the degree of multireference character in some cases relative to scalar relativistic, real-valued descriptions.31
Reference 97 presents a relativistic many-body formalism to perform ph-AFQMC calculations on two-component Hamiltonians, which include an explicit treatment of SOC. The non-relativistic ph-AFQMC framework is generalized to include additional spin-flip sectors, which doubles the problem size. The size of the Hilbert space can be reduced via the use of effective core potentials and the frozen-core approximation. With single-determinant trial wave functions, encouraging accuracy in chemical and solid-state properties has been demonstrated. For instance, the electron affinity of the Pb atom is computed to within 0.1 eV of the experimentally measured value and bond dissociation energies of heavier halide diatomics to within 2 kcal/mol. The ph-AFQMC prediction of the cohesive energy of solid Bi deviates from experiment by less than 0.1 eV.
Current work involves the computation of zero-field splittings for transition metal complexes, as shown in Fig. 5. A key challenge is that, unlike typical thermochemical quantities, magnetic coupling constants and zero-field splitting energy scales are on the order of 10–100 wavenumbers (roughly 0.03–0.3 kcal/mol). For large molecular systems and/or those with substantial electron correlation effects, converging statistical error bars to below these thresholds implies a hefty computational cost (the error bars go as , where S is the number of samples). Correlated sampling is one possible route, and improved algorithms are currently being developed with such applications in mind. Alternatively, embedding approaches to downfold the full Hilbert space into a computational tractable and chemically relevant subspace99 are compelling alternatives to solving the full molecular problem explicitly.
(Left) Molecular structure of Co(C(SiMe2ONaph)3)2. Purple, gray, turquoise, and red spheres represent Co, C, Si, and O, respectively. Hydrogen atoms have been omitted for clarity. (Right) The calculated splitting of the ground state by spin–orbit coupling. The red line is the experimentally determined energy of the MJ = ±7/2 state. Reprinted with permission from Bunting et al., Science 362, eaat7319 (2018).98 Copyright 2018 AAAS.
(Left) Molecular structure of Co(C(SiMe2ONaph)3)2. Purple, gray, turquoise, and red spheres represent Co, C, Si, and O, respectively. Hydrogen atoms have been omitted for clarity. (Right) The calculated splitting of the ground state by spin–orbit coupling. The red line is the experimentally determined energy of the MJ = ±7/2 state. Reprinted with permission from Bunting et al., Science 362, eaat7319 (2018).98 Copyright 2018 AAAS.
C. Spin gaps in photochemistry and magnetically coupled systems
The relative energy splittings among states of different spin multiplicities are critical quantities in photochemical processes. For example, photoinduced spin-conserving electronic excitations from closed-shell ground-states can be followed by inter-system crossing to a non-emissive triplet state with a relatively longer lifetime such that subsequent photochemistry can be accomplished. Alternatively, under certain conditions, photoexcitation can be followed by singlet fission, which produces two triplet excitons from a single photon.100 This process and its reverse known as triplet–triplet annihilation photon upconversion101 have the potential to dramatically increase solar cell efficiency.
However, the spin gaps relevant to the above-mentioned processes are typically difficult, if not impossible, to characterize experimentally. For this reason, triplet states (in general, excited states with zero-valued oscillator strength with respect to a singlet ground-state, such as doubly excited singlet states) are frequently referred to as “dark.” The accurate prediction of electronic spectra including both bright and dark states represents a challenge for most popular quantum chemical methods, especially those based on DFT. For example, it is well known that relative spin state energetics are sensitive to the exchange-correlation functional employed,79 as the inclusion of exact Hartree–Fock exchange artificially favors higher-spin states. In addition, linear response time-dependent (TD) DFT102 can rigorously only describe single excitations (although we note that other DFT-based approaches,103,104 while not formally rigorous, have been found capable of yielding accurate energetics for pure doubly excited states).
ph-AFQMC can compute spin gaps with consistently high accuracy even for relatively large molecules and/or when multireference states are involved. Singlet-triplet gaps of biradicaloids involving open-shell singlet states, such as the benzyne isomers shown in Fig. 6, and of polyacenes, which are among the most popular and versatile photocatalysts, have been predicted accurately using single-determinant trial wave functions.72 We then demonstrated that ph-AFQMC can be a valuable tool for in silico design of novel upconverting annihilators. By comparing triplet energies computed with ph-AFQMC with S1 energies, we used the thermodynamic criteria of S1 ≥ 2 T1 and to screen myriad candidate annihilators, and our prediction that phenyl-substituted benzothiadiazole would upconvert (and that other BTDs would not) was confirmed by experimental collaborators (see Figs. 7 and 8).105 We note that the triplet-quintet gap of an iron porphyrin was accurately computed with ph-AFQMC,70 and that a simple algorithm for computing excited states with the same symmetry as the ground-state has been proposed62 (an alternate approach for AFQMC computations of general excitations has been shown to perform well in solids106 but has not been tested in molecules).
Singlet-triplet gaps for ortho-, meta-, and para-benzynes relative to experimental values. AFQMC/CAS refers to ph-AFQMC calculations using CASSCF trials with active spaces of eight electrons in 16 orbitals, truncated such that ≥99.5% of the wave function is retained. The AFQMC/U protocol for biradicaloid species is defined as follows: UHF orbitals are used in the single-determinant trial wave function by default, unless 1.1 or 2.1, in which case unrestricted B3LYP (UB3LYP) orbitals are used instead. Reprinted (adapted) with permission from Shee et al., J. Chem. Theory Comput. 15, 4924 (2019). Copyright 2019 American Chemical Society.
Singlet-triplet gaps for ortho-, meta-, and para-benzynes relative to experimental values. AFQMC/CAS refers to ph-AFQMC calculations using CASSCF trials with active spaces of eight electrons in 16 orbitals, truncated such that ≥99.5% of the wave function is retained. The AFQMC/U protocol for biradicaloid species is defined as follows: UHF orbitals are used in the single-determinant trial wave function by default, unless 1.1 or 2.1, in which case unrestricted B3LYP (UB3LYP) orbitals are used instead. Reprinted (adapted) with permission from Shee et al., J. Chem. Theory Comput. 15, 4924 (2019). Copyright 2019 American Chemical Society.
Singlet-triplet gaps for BTD series. DLPNO-CCSD(T) indicates the T0 perturbative triples procedure. Ph-BTD was experimentally found to upconvert when paired with the PtOEP sensitizer and not with the ZnTPP sensitizer (sensitizer triplet energies are shown with the horizontal lines). Reproduced with permission from Weber et al., Chem. Sci. 12, 1068 (2021). Copyright 2021 Royal Society of Chemistry.
Singlet-triplet gaps for BTD series. DLPNO-CCSD(T) indicates the T0 perturbative triples procedure. Ph-BTD was experimentally found to upconvert when paired with the PtOEP sensitizer and not with the ZnTPP sensitizer (sensitizer triplet energies are shown with the horizontal lines). Reproduced with permission from Weber et al., Chem. Sci. 12, 1068 (2021). Copyright 2021 Royal Society of Chemistry.
These encouraging applications, which demonstrate that spin-gaps involving even multi-configurational low-spin states can be accurately predicted with a scalable algorithm, suggest that ph-AFQMC has the potential to tackle many grand challenge problems in bioinorganic chemistry involving transition metal catalysts and magnetically coupled metal clusters. Prominent systems include the oxygen evolving complex in photosystem II,107,108 and iron–sulfur centers109,110 in the electron transport chain and nitrogenase enzymes. Such systems involve intricate electrochemical processes critical to their biological function, which are not completely understood. We envision that ph-AFQMC will soon be used to elucidate the subtle reaction mechanisms involved in these natural processes, which will enable the rational design of new synthetic catalysts. Finally, accurate predictions of the spin ladders of molecular magnets—which typically contain transition metal,111 lanthanide,112,113 or actinide114 elements—will enable the extraction of important properties such as magnetization barriers and magnetic susceptibilities.
V. OUTLOOK
In our view, the following properties set ph-AFQMC apart from other quantum chemical methods: the computational cost scales quartically with system size for Gaussian basis sets (and is cubic with a localized orbital implementation or with a plane-wave basis), the resulting accuracy is, in principle, systematically improvable, the algorithm exhibits near-perfect parallel efficiency and can leverage hardware such as GPUs, and the method has the flexibility to provide a balanced and accurate treatment of both static and dynamic electron correlations.
What remains to be done to have a truly robust and scalable benchmark-quality method for weakly to strongly correlated molecular systems? As mentioned, AFQMC is a relatively new member of the quantum chemistry toolbox, and as such, there are a number of possible future research directions and fruitful opportunities for development and optimization. We highlight several of these as follows:
Explore more efficient ways to reach the CBS limit. This is essential for comparing with both experimental measurements and (e.g., one-shot triple- or quadruple-ζ) DFT calculations. Approaches based on similarity transformations and explicitly correlated procedures115–117 currently scale at least as ; while this scaling is acceptable for methods based on CCSD, additional approximations would be appropriate to have practical utility for large systems with AFQMC. Composite extrapolation strategies, i.e., those based on EQMC(CBS) ∼ EQMC(small) + [EX(large) − EX(small)], where X might be, e.g., DLPNO-CCSD(T) or multireference perturbation theory,34,118 are also promising.
Finally, we note the recent optimization of specialized basis sets for correlated calculations of solid-state119,120 and molecular121–123 systems.
The combination of correlated sampling and a mechanism for branching and population control (i.e., duplicating walkers with large weights and annihilating walkers with small weights with appropriate probability) is desirable. Algorithms are currently under investigation and have, indeed, shown significantly improved performance in correlated sampling in lattice models and solids. Adapting and further developing such approaches in molecular systems will be valuable.
It will be useful to build up a much larger dataset to better clarify how the bias resulting from the phaseless constraint depends on the trial wave function employed, for a variety of different chemical systems. Is there a general, automatable protocol to generate optimal trial wave functions? What is the computationally cheapest trial wave function, or set of trial wave functions, required to converge ph-AFQMC to within the chemical accuracy for a given system? This effort can complement and create synergy with the development of better trial wave function ansatzes (pseudo-BCS, symmetry-projected wave functions, non-orthogonal Slater determinants, self-consistency, etc.).
Pople introduced the idea of a “theoretical model chemistry”124,125 that, although approximate, makes uniform approximations and ideally satisfies a list of pre-defined properties advantageous for the description of chemical reactions. Note that, in this sense, ph-AFQMC can give rise to various model chemistries as determined by, e.g., the choice of trial wave function, much like the CC framework can give rise to various models such as CCSD or CCSD(T). While, of course, it would be desirable (and is increasingly possible, as we have shown with C2 and FeO) to converge every single calculation independently with respect to the phaseless bias, we invite the community to consider Pople’s perspective, and to make efforts to undertake broad assessments using diverse thermochemical datasets of pre-defined ph-AFQMC model chemistries.
Explore alternatives to the cosine projection of the phaseless constraint. During the early development of ph-AFQMC, several versions were tested including the half-plane and harmonic potential, which yielded indistinguishable results from the cosine projection in jellium and simple molecules.126 These results were all obtained with plane-wave basis functions, and different basis and Hubbard–Stratonovich choices can have an effect on the accuracy. Efforts in this direction are under way, including strategies to perform constraint-release combining phaseless with the Metropolis algorithm.
While the AFQMC community at present is expanding along other avenues—coupled electron–phonon systems,127 finite-temperature algorithms,128–131 quantum model systems,89,132,133 ab initio calculations of solids,120,134–137 and hybrid algorithms leveraging quantum devices74—we hope we have clarified that ph-AFQMC has an important role to play in the field of molecular quantum chemistry. From highly accurate thermochemical predictions for transition metals and f-block compounds to mechanistic investigations of biological processes or chemical catalysis where one or more steps involve strongly correlated electronic effects, ph-AFQMC has the potential to play a unique and possibly transformative role. We view ph-AFQMC as complementary to other ab initio approaches, and encourage the joint use of multiple independent methods to solve scientific challenges and provide reference values, especially in the absence of experiment. In this spirit, we reiterate that in today’s era of artificial intelligence and machine learning, high-quality training data are the new gold. In the near future, we envision that routine ph-AFQMC predictions will enable the development of new semi-empirical density functionals and interatomic potentials, along with those based on machine learning.
ACKNOWLEDGMENTS
J.S. acknowledges funding from the National Institute of General Medical Sciences of the National Institutes of Health under Award No. F32GM142231. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. The Flatiron Institute is a division of the Simons Foundation. We thank many collaborators and colleagues who contributed to this work and/or significantly influenced our perspective on the subject, including Martin Head-Gordon, Benjamin Rudshteyn, Evan Arthur, Mario Motta, Andreas Hansen, Hung Vuong, Pierre Devlaminck, Brandon Eskridge, Miguel Morales, Fengjie Ma, and Hao Shi. We dedicate this paper to the memory of Henry Krakauer.
AUTHOR DECLARATIONS
Conflict of Interest
The author has no conflicts to disclose.
Author Contributions
James Shee: Conceptualization (lead); Data curation (lead); Funding acquisition (equal); Investigation (lead); Methodology (equal); Software (equal); Supervision (equal); Writing – original draft (lead); Writing – review & editing (equal). John L. Weber: Data curation (supporting); Investigation (supporting); Methodology (supporting); Software (equal); Writing – review & editing (supporting). David R. Reichman: Conceptualization (supporting); Methodology (supporting); Writing – review & editing (supporting). Richard A. Friesner: Conceptualization (supporting); Methodology (supporting); Writing – review & editing (supporting). Shiwei Zhang: Conceptualization (supporting); Funding acquisition (equal); Investigation (supporting); Methodology (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article or from the corresponding author upon reasonable request.