We show that, as in Hartree–Fock theory, the orbitals for excited state mean field theory can be optimized via a self-consistent one-electron equation in which electron–electron repulsion is accounted for through mean field operators. In addition to showing that this excited state ansatz is sufficiently close to a mean field product state to admit a one-electron formulation, this approach brings the orbital optimization speed to within roughly a factor of two of ground state mean field theory. The approach parallels Hartree Fock theory in multiple ways, including the presence of a commutator condition, a one-electron mean-field working equation, and acceleration via direct inversion in the iterative subspace. When combined with a configuration interaction singles Davidson solver for the excitation coefficients, the self-consistent field formulation dramatically reduces the cost of the theory compared to previous approaches based on quasi-Newton descent.

## I. INTRODUCTION

Hartree–Fock (HF) theory^{1,2} is so immensely useful, in large part, due to the rigorous and convenient link it provides between a qualitatively correct many-electron description and an affordable and more intuitive one-electron equation. The link it makes is rigorous in that, when solved, its one-electron equation guarantees that the many-electron description underneath it is optimal in a variational sense, meaning that the energy is made stationary with respect to changes in the wave function. The link is also convenient because many-electron properties like the energy can be evaluated in terms of inexpensive one-electron quantities and because solving a one-electron equation, even one with mean field operators that must be brought to self-consistency, is in most cases easier and less expensive than a direct minimization of the many-electron energy. The fact that this useful link is possible at all owes much to the simplicity of the Slater determinant many-electron wave function on which HF theory is built. Essentially, the Slater determinant is as close as we can get to a truly mean field, correlation-free Hartree product ansatz while still capturing the important effects of Pauli correlation. Happily, this single step away from a product state does not prevent a useful and intuitive formulation in terms of a self-consistent one-electron equation in which mean field operators account for electron–electron coulomb repulsion.

In this paper, we will show how excited state mean field (ESMF) theory^{3} can also be formulated in terms of a one-electron mean field equation that, when solved self-consistently, produces optimal orbitals. As in HF theory, this formulation is possible, thanks to the ansatz hewing closely to the mean field limit: ESMF takes only one additional step away from a truly mean field product state by adding the open-shell correlation that arises in an excitation on top of the Pauli correlations already present in the ground state. Perhaps most importantly, the resulting one-electron equation that determines the optimal orbitals can, like the Roothaan equations, be solved by iteratively updating a set of mean field operators until they are self-consistent with the orbital shapes. As we will see, when accelerated by direct inversion in the iterative subspace (DIIS),^{4} this self-consistent field (SCF) approach brings the orbital optimization cost down to within a factor of two of HF theory and significantly lowers the overall cost of ESMF theory compared to previous approaches. Given that ESMF offers a powerful platform upon which to construct excited-state-specific correlation theories^{5–7} and that it has recently been shown to out-compete other low-cost methods like configuration interaction singles (CIS) and density functional theory in the prediction of charge density changes,^{8} this acceleration of the theory and simplification of its implementation should prove broadly useful.

While recent work has provided an improved ability to optimize the ESMF ansatz via the nonlinear minimization of a generalized variational principle (GVP),^{6,8} the current lack of an SCF formulation stands in sharp contrast to the general state of affairs for methods based on Slater determinants. Even in contexts outside standard HF for ground states, SCF procedures are the norm rather than the exception when it comes to optimizing Slater determinants’ orbitals. Indeed, among many others, the ΔSCF,^{9–12} restricted open-shell Kohn Sham,^{13,14} constrained density functional theory,^{15} ensemble density functional theory,^{16–18} projected HF,^{19} and *σ*-SCF^{20,21} methods all favor SCF optimization approaches. Although the direct minimization of a GVP or the norm of the energy gradient^{22} offers protection against a Slater determinant’s “variational collapse” to the ground state or lower excited states, this rigorous safety comes at some cost to efficiency. It is for good reason that direct energy minimization methods, although available,^{23} are not the default HF optimization methods in quantum chemistry codes. In cases where they prove stable, SCF approaches are typically more efficient. In the case of the ESMF ansatz, an SCF approach is also at risk of collapse to an undesired state but, even in such troublesome cases, a brief relaxation of the orbitals by SCF may still offer a low-cost head start for the direct minimization of a GVP. In cases where an SCF approach to ESMF is stable, history strongly suggests that it will be more efficient than nonlinear minimization. In short, our preliminary data agree with history’s suggestion.

## II. THEORY

### A. Hartree–Fock theory

To understand how an SCF formulation of ESMF theory comes about, it is useful to first review the formulation of HF theory and in particular how its condition for optimal orbitals can be written as a commutator between a mean field operator and a one-body reduced density matrix (RDM). In HF theory, the energy of the Slater determinant Ψ_{SD} is made stationary with respect to changes in the orbital variables, which is the Slater determinant’s approximation of the more general condition that an exact energy eigenstate will have an energy that is stationary with respect to any infinitesimal variation in the wave function. For convenience, and without loss of generality, the molecular orbitals (MOs) are constrained by Lagrange multipliers to be orthonormal.^{1} For Restricted Hartree–Fock (RHF), the resulting Lagrangian,

where ** C** is the matrix whose columns hold the molecular orbital coefficients,

**is the atomic orbital (AO) overlap matrix,**

*S***is the identity matrix,**

*I***is the symmetric matrix of Lagrange multipliers, tr[] is the matrix trace operation, and**

*ϵ**E*

_{RHF}is the RHF energy (given below), is then made stationary by setting derivatives with respect to

**equal to zero. After some rearrangement,**

*C*^{1}this condition can be formulated into the famous Roothaan equations,

where ** h** is the matrix representation of the one-electron components of the Hamiltonian in the atomic orbital basis and

**is interpreted as a mean field approximation for electron–electron repulsions. Of course, this mean field repulsion depends on the orbital shapes, causing the operator**

*W***to be a function of**

*W***, the Aufbau determinant’s 1-body**

*A**α*-spin RDM. In what comes below, we will consider RDMs and other matrices in both the atomic orbital (AO) and molecular orbital (MO) bases and will adopt the notation that a matrix with no superscript (e.g.,

**) refers to the AO representation, while the MO representation is explicitly denoted as such (e.g.,**

*A*

*A*^{(MO)}). The closed-shell Aufbau determinant’s RDM has the form

where the matrix *I*_{o} has ones on the first *n*_{o} elements of its diagonal and zeros elsewhere (*n*_{o} is the number of occupied molecular orbitals). Although in many contexts, it is useful to separate the restricted HF (RHF) mean field electron–electron repulsion operator ** W**[

**] = 2**

*A***[**

*J***] −**

*A***[**

*K***] into its “coulomb”**

*A***and “exchange”**

*J***components,**

*K*defined here using the two-electron integrals (TEIs) in 1122 order, this separation is not necessary at present and so we will work instead in terms of the combined mean field operator ** W**.

Now, while the Roothaan equation has both an intuitive appeal as a one-electron Schrödinger equation and a practical appeal as a convenient setup for an SCF cycle based on the efficient numerical diagonalization of a symmetric generalized eigenvalue problem, it is not the only way to formulate HF theory’s central requirement of Lagrangian stationarity. Noting that only the first *n*_{o} columns of ** C** affect the ansatz, we can right-multiply Eq. (2) by

*I*_{o}=

*A*^{(MO)}to focus our attention to them while at the same time left-multiplying by

*C*^{T}to eliminate the overlap matrix, which results in

where we have made the usual definition of the Fock operator,

If we ensure that we work in the canonical representation,^{1} the matrix ** ϵ** will be diagonal, and so Eq. (6) essentially says that the product

*F*^{(MO)}

*A*^{(MO)}must produce a symmetric matrix. We may enforce this requirement by setting the difference between this product and its transpose equal to zero, which leads to a commutator condition for Lagrangian stationarity that can be used as an alternative to the Roothaan equation when optimizing orbitals,

^{4,24}

If we consider the HF energy expression

alongside the Fock operator definition ** F** =

**+**

*h***, we see a nice connection between the commutator condition and the energy. Specifically, if one halves the one-electron component of the mean field operator whose trace with the density yields the energy, the resulting operator (**

*W***in this case) must, when put in the MO basis, commute with the MO basis representation of the density matrix in order for the Lagrangian to be stationary. With this connection pointed out, we now turn our attention to ESMF theory, where a generalization of Eq. (8) yields a useful SCF formulation for orbital optimization.**

*F*### B. Excited state mean field theory

Like HF theory, the energy expression for the ESMF ansatz for a singlet excited state can be written in terms of traces between mean field operators and density-like matrices. In particular, if we take the simple version of the singlet ESMF ansatz in which the Aufbau coefficient is set to zero,

where ** t** is the matrix of CIS-like configuration interaction coefficients and $|i\u2191a\u2191\u27e9$ is the Slater determinant resulting from an

*i*→

*a α*-spin excitation out of the Aufbau determinant (note we do not say the HF determinant as we are not in the HF MO basis), then the ESMF singlet energy amounts to four traces between mean field operators and density-like matrices,

Here, *γ* is the one-body alpha-spin RDM for the ESMF ansatz,

The matrix ** A** is the Aufbau determinant’s one-body RDM, as in Eq. (3). We define the difference between these density matrices as

**=**

*D**γ*−

**. Finally,**

*A***is the non-symmetric matrix that, in its MO representation, has the**

*T**α*-spin transition density matrix between the Aufbau determinant and the ESMF ansatz (which is as for CIS just

**) in its upper-right corner,**

*t*With the ESMF energy written in terms of one-body mean field operators and density-like matrices, we can now present our central result in which the stationarity conditions for the ESMF Lagrangian,

with respect to orbital variations are written in a one-electron equation that admits an SCF-style solution. We begin, as in HF theory, by setting the (somewhat messy) derivatives *∂L*_{ESMF}/*∂*** C** equal to zero. With some care, this condition can be organized into

whose structure is similar to but also notably different from the analogous HF expression in Eq. (2). The formal difference is that there are now four terms on the left-hand side, one for each trace in the energy expression. The practical difference is that the ESMF equation is not an eigenvalue problem, and it is not obvious that it can be reorganized into one due to the incompatible kernels of the matrices *γ*^{(MO)}, *A*^{(MO)}, and *T*^{(MO)}. Thus, it is at present not clear whether this ESMF equation can offer the same spectral information that the Roothaan equation provides for HF. Nonetheless, for orbital optimization, we have found a convenient alternative by transforming this stationary condition into a commutator form by following the same steps that took us from Eqs. (2)–(8) in HF theory. Defining *F*_{A} = ** h** +

**[**

*W***], the result is that the Lagrangian stationary condition can be written as**

*A*It is interesting that the same pattern holds as in the HF case: the commutator condition has one commutator per trace in the energy expression, and the mean field operators (with any one-electron parts halved) are again paired with the same density-like matrices as in the energy traces. We find this pattern especially interesting in light of the fact that it does not simply follow that each trace produces one commutator. Instead, cancellations of terms coming from derivatives on different traces are needed to arrive at the commutators above, and so we do wonder whether this is a happy accident or whether there is an underlying reason to expect such cancellations.

### C. Self-consistent solution

Either way, Eq. (16) forms the basis for an efficient SCF optimization of the ESMF orbitals. Assuming that we are a small orbital rotation away from stationarity, we insert the rotation ** C** →

**exp(**

*C***) into our commutator condition and then expand the exponential and drop all terms higher than the linear order in the anti-symmetric matrix**

*X***. The result is a linear equation for**

*X***[see Eq (S24) in the supplementary material], which we solve via the iterative Generalized Minimal Residual Method (GMRES) method. Note that, if desired, one can control the maximum step size in**

*X***by simply stopping the GMRES iterations early if the norm of**

*X***grows beyond a user-supplied threshold. This may be desirable, as we did after all assume that only a small rotation was needed and our linearization of the equation prevents us from trusting any proposed rotation that is large in magnitude. In parallel to SCF HF theory, which holds**

*X***fixed while solving the Roothaan equation for new orbitals, we hold**

*F*

*F*_{A},

**[**

*W***], and**

*D***[**

*W***] fixed while solving our linear equation. Thus, although the modified GMRES solver is not as efficient as the dense eigenvalue solvers used for HF theory, it remains relatively inexpensive as it does not involve any Fock builds and so does not have to access the two-electron integrals. [**

*T**Technical note: in practice, we can speed up the GMRES solver considerably by preconditioning it with a diagonal approximation to the linear transformation that is set to one for*

*X**elements in the occupied–occupied and virtual–virtual blocks (since these are expected to play little role in the orbital relaxation) and, in the other blocks, replaces*

*C*^{T}

*F*_{A}

*C**with its diagonal, replaces γ*

^{(MO)}

*with*

*I*_{o}

*, and neglects*

**[**

*W***]**

*D**and*

**[**

*W***]**

*T**(see the*supplementary material

*for the explicit form). DIIS is also effective when we take Eq. (16) transformed into the AO basis as the error vector and the*

*F*_{A}

*,*

**[**

*W***]**

*D**, and*

**[**

*W***]**

*T**matrices as the DIIS parameters. We use both of these accelerations in all calculations.*] Only after the linear equation is solved and the orbitals are updated do we rebuild the three mean field operators, and so each overall SCF iteration requires just three Fock builds, which, as they can be done during the same loop over the two-electron integrals, come at a cost that is not much different than HF theory’s single Fock build. This arrangement contrasts sharply with the nine Fock builds and two integral loops that are necessary to form the analytic derivative of the energy with respect to

**that is used in descent-based orbital optimization.**

*C*^{8}In summary, the ESMF orbitals, like the HF orbitals, can be optimized particularly efficiently via the self-consistent solution of a one-electron mean field equation.

Although this exciting result makes clear that the ESMF ansatz really does hew closely enough to the mean field product-state limit for one-electron mathematics to be of use, there are a number of questions we should now address. First, and we will go into more detail on this point in the next paragraph, is the SCF approach actually faster than descent. The answer, at least in simple systems, is a resounding yes. Second, what of the configuration interaction coefficients ** t**. At present, we optimize them in a two-step approach in which we go back and forth between orbital SCF solutions and CIS calculations (taking care to include the new terms that arise for CIS when not in the HF MO basis) until the energy stops changing. In the future, more sophisticated approaches that provide approximate coupling between these optimizations may be possible, as has long been true in multi-reference theory.

^{25}Third, what physical roles can we ascribe to the different mean field operators that appear in the SCF approach to ESMF. The operator

*F*_{A}obviously carries the lion’s share of the electron–electron repulsion as it is the only mean field operator derived from a many-electron density matrix. Indeed,

**[**

*W***] and**

*D***[**

*W***] represent repulsions from one-electron densities, and so they cannot provide the bulk of the electron–electron repulsion. Thus, we suggest that it is useful to view**

*T*

*F*_{A}as a good starting point that includes the various repulsions between electrons not involved in the excitation but that gets the repulsions affected by the excitation wrong.

**[**

*W***] and**

*D***[**

*W***] then act as single-electron-density corrections to this starting point. If one considers the simple case in which we ignore all electrons other than the pair involved in the excitation (e.g., consider the HOMO/LUMO excitation in H**

*T*_{2}), then a close inspection reveals that

**[**

*W***] eliminates the spurious HOMO–HOMO repulsion that is present in the first trace of the energy expression, while the**

*D***[**

*W***] terms bring the excited electron pair’s repulsion energy into alignment with the actual repulsion energy that results from the singlet’s equal superposition of two open-shell determinants.**

*T*## III. RESULTS

### A. Efficiency comparisons

Returning now to the question of practical efficiency, we report in Table I the convergence of the energy for the HOMO/LUMO excitation in the water molecule for both SCF-based and GVP descent-based ESMF (note all geometries can be found in the supplementary material). Whether one measures by the number of times the expensive two-electron integral (TEI) access must be performed or by the wall time, the two-step SCF approach is dramatically more efficient than GVP-based descent in this case. (The keen-eyed observer will notice that in the SCF case, the TEI loop count and the wall time do not increase at the same rate, which is due to the CIS iterations having many fewer matrix operations to do, as compared to SCF in between each access of the TEIs.) If we focus on just the orbital optimization, as shown in Table II, we find that the SCF approach for ESMF is almost as efficient as ground state HF theory. In practice, of course, we also want to optimize ** t**, and for now, we rely on the two-step approach, as used in Table I.

SCF ESMF . | GVP ESMF . | ||||
---|---|---|---|---|---|

TEI loops . | Time (s) . | ΔE (a.u.)
. | TEI loops . | Time (s) . | ΔE (a.u.)
. |

10 | 0.007 | 0.062 605 | 76 | 0.397 | 0.003 761 |

20 | 0.025 | 0.000 032 | 150 | 0.783 | 0.000 654 |

30 | 0.033 | 0.000 004 | 226 | 1.187 | 0.000 184 |

40 | 0.054 | 0.000 000 | 300 | 1.579 | 0.000 001 |

SCF ESMF . | GVP ESMF . | ||||
---|---|---|---|---|---|

TEI loops . | Time (s) . | ΔE (a.u.)
. | TEI loops . | Time (s) . | ΔE (a.u.)
. |

10 | 0.007 | 0.062 605 | 76 | 0.397 | 0.003 761 |

20 | 0.025 | 0.000 032 | 150 | 0.783 | 0.000 654 |

30 | 0.033 | 0.000 004 | 226 | 1.187 | 0.000 184 |

40 | 0.054 | 0.000 000 | 300 | 1.579 | 0.000 001 |

Molecule . | Basis . | RHF (s) . | n_{i}
. | ESMF (s) . | n_{i}
. |
---|---|---|---|---|---|

Water | cc-pVTZ | 0.087 | 8 | 0.185 | 6 |

Formaldehyde | cc-pVTZ | 0.424 | 11 | 0.862 | 8 |

Ethylene | cc-pVTZ | 0.903 | 8 | 1.735 | 6 |

Toluene | cc-pVDZ | 4.366 | 19 | 6.835 | 11 |

Molecule . | Basis . | RHF (s) . | n_{i}
. | ESMF (s) . | n_{i}
. |
---|---|---|---|---|---|

Water | cc-pVTZ | 0.087 | 8 | 0.185 | 6 |

Formaldehyde | cc-pVTZ | 0.424 | 11 | 0.862 | 8 |

Ethylene | cc-pVTZ | 0.903 | 8 | 1.735 | 6 |

Toluene | cc-pVDZ | 4.366 | 19 | 6.835 | 11 |

While the SCF approach has clear advantages in simple cases, the GVP is still expected to be essential for cases in which the SCF approach may not be stable. For example, without implementing an interior root solver or freezing an open core (and we have not done either), Davidson-based CIS would be problematic for a core excitation. However, as shown in Table III, a combination of an initial SCF optimization of the orbitals followed by a full GVP optimization of ** t** and

**together is quite effective. In this case, the SCF approach brings the energy close to its final value, converging to an energy that is too low by 54**

*C**μ*E

_{h}(remember, excited states do not have any upper bound guarantee, even when a variational principle like energy stationarity or the GVP is in use). From this excellent starting point, the GVP’s combined optimization of

**and**

*C***converges quickly to the final energy, needing just ten gradient evaluations to get within 1**

*t**μ*E

_{h}. In contrast, if the initial SCF orbital optimization is omitted, the GVP coupled optimization requires hundreds of gradient evaluations (exactly how many depends on the choice for

*ω*and how

*μ*is stepped down to zero)

^{6}to reach the same level of convergence and was only able to converge to the correct state at all by setting

*μ*to 0.5 and

*ω*to 0.08 E

_{h}lower than the final energy for the initial iterations to avoid converging to a higher-energy core excitation. Especially interesting is the fact that, if we move to the aug-cc-pVTZ basis, the ESMF predictions for the two lowest core excitations in H

_{2}O are 534.3 eV and 536.2 eV, which are quite close to the experimental values

^{26}of 534.0 eV and 535.9 eV and which match the delta between them even more closely. Thus, even in cases where the SCF approach would be difficult to use on its own, it can offer significant benefits in partnership with direct minimization.

### B. PYCM

To verify that the benefits of the SCF approach are not confined to smaller molecules, we exhibit its use on a charge transfer state in the PYCM molecule that Subotnik used to demonstrate CIS’s bias against charge transfer states.^{27} Working in a cc-pVDZ basis for the heavy atoms and 6-31G for hydrogen, we consider the lowest charge transfer state, for which we provide iteration-by-iteration convergence details in the supplementary material. In Fig. 1, we plot the ESMF prediction for the donor and acceptor orbitals, which in this case are just the relaxed HOMO and LUMO orbitals as the ** t** matrix coefficients are strongly dominated by the HOMO → LUMO transition. We see that this state transfers charge from the

*π*bonding orbital on the methylated ethylene moiety to the

*π*

^{*}orbital on the cyano-substituted ethylene moiety. Aside from the efficiency of the SCF solver in this case (it takes just two-and-a-half times as long as RHF when using the same Fock build code), it is interesting to compare the prediction against that of CIS, which is the analogous theory when orbital relaxation is ignored. CIS predicts a 7.30 eV excitation energy for the lowest state in which this charge transfer transition plays a significant role, whereas ESMF predicts a 4.82 eV excitation energy. This multiple-eV energy lowering after orbital relaxation serves as a stark reminder of how important these relaxations are for charge transfer states.

## IV. CONCLUSION

In conclusion, orbital optimization in ESMF theory can be formulated in terms of a one-electron equation in which mean field operators provide electron–electron repulsion and which is brought to self-consistency through an efficient iterative process that closely mirrors ground state HF theory. In particular, it is possible to formulate the excited state many-electron energy in terms of four traces between density matrices and mean field operators, and the central commutator condition likewise contains four commutators between these density matrices and their partner mean field operators. In a sense, this is a straightforward extension of the HF case, where only one trace and one commutator are needed. As has long been true for Slater determinants, the SCF approach to the ESMF orbitals appears to be significantly more efficient than quasi-Newton methods, at least in cases where the SCF iteration converges stably to the desired state. Looking forward, it will be interesting to see if, as in the ground state case, the SCF approach admits Kohn–Sham-style density functionals and whether the optimization of the excitation coefficients can be more tightly coupled to the optimization of the orbitals.

## SUPPLEMENTARY MATERIAL

See the supplementary material for additional mathematical details, additional calculation details, and molecular geometries.

## ACKNOWLEDGMENTS

This work was supported by the Early Career Research Program of the Office of Science, Office of Basic Energy Sciences, the U.S. Department of Energy, Grant No. DE-SC0017869. While final timing calculations were carried out on a laptop, many preliminary calculations with earlier pilot code were performed using the Berkeley Research Computing Savio cluster.

## DATA AVAILABILITY

The data that support the findings of this study are available within the article and its supplementary material.

## REFERENCES

^{5}-scaling excited-state-specific perturbation theory