We report robust initial guesses for the amplitudes and z-vectors in a configuration interaction singles or Tamm–Dancoff approximation calculation that consistently reduce the total number of iterations required for an excited state calculation often by over 50%. The end result of these guesses is that the practicing chemist can expect to generate excited state optimized structures with a total wall time reduced by as much as 30% in the future without any approximations—simply by using information gathered at one geometry and applying it to another geometry.
During a geometry optimization or ab initio molecular dynamics, it is common to use a set of previously computed molecular orbitals (MOs) as the starting guess for the next geometry.1 This no-cost approach can reduce the error at the initial step by an order of magnitude and shaves a few cycles off of the SCF procedure. Moreover, in the course of dynamics, several researchers have shown that extrapolating a set of previously computed Fock matrices via a power series expansion1–3 or linear-predictor methods4 can reduce the error and the cost without introducing significant bias into the final solution. While these approaches are routine today for ground state calculations, as far as we are aware, only limited efforts are made in standard electronic structure packages to read-in excited state amplitudes from one geometry to the next.5,6 Rather, for example, in Configuration Interaction Singles (CIS) or Tamm–Dancoff Approximation Time-Dependent Density Functional Theory (TDA-TDDFT) calculations, one usually initializes the diagonalization routine using a standard set of guesses for the amplitudes based on zero order energy differences (Δϵ). As we discuss below, this state of affairs arises presumably for reasons of stability.
In this Communication, we report that a simple rotation of the old amplitudes into the space of the new molecular orbitals using the Kabsch algorithm7 (which has many other names as well8,9) can serve as an exceedingly good and stable guess for the excited state amplitudes. This guess often has an error that is several orders of magnitude lower than if no such guess is made, and the Davidson algorithm10 converges dramatically faster to the relevant excited states in a consistently stable fashion. The same approach is applied to the z-vector (which is used in the calculation of the gradient) for similar effect.
The rotation in Eq. (4) improves substantially upon simply reading in the previous guess because, in the absence of , it is the analytically optimal rotation of the old MOs into the new—a task often referred to as the Procrustes problem and long known in the chemistry community.8,12 Interestingly, if one implements Eq. (3) with the cross-geometry atomic-orbital overlap, [instead of ]—as one might be tempted to do for the most rigorous overlap scheme—we find much worse performance; a similar issue was observed in previous efforts to extrapolate the Fock matrix.2,13
In Table I, we show the relative speed-up for computing an excited state optimization of (1) the S1 surface of naphthalene and (2) the T1 surface of a methyl-bridged benzophenone anthracene complex (pictured in Fig. 1) after implementing Eqs. (2)–(4) and (6) into a developmental version of Q-Chem 6.0.15 Excited state energies and gradients were computed via restricted Configuration Interaction Singles (CIS) in the cc-pVTZ (naphthalene) or 6-31G** (C28H20O) basis set on 32 cores of an Intel Gold 6242 CPU clocked at 2.80 GHz. The starting geometries were the converged Hartree–Fock ground state geometries. We computed the lowest three singlet (naphthalene) or triplet (C28H20O) states in each case. Over the course of the optimizations, each optimization cycle was converged when the DIIS error in the SCF was less than 10−8, the norm of the CIS residuals was less than 10−6, and the error in the z-vector was less than 10−7. Absolute energies at each structure computed by all methods never differed by more than 2 × 10−6 Eh, and the energy deviation at the final geometry was less than 2 × 10−7 Eh for both molecules. The optimization was considered converged when the maximum component of the gradient was less than 3 × 10−4 Eh/a0 and either (1) the maximum component of the displacement from the previous step was less than 1.2 × 10−3 a0 or (2) the absolute energy between successive structures was less than 10−6 Eh.
The first two columns, respectively, give the mean number of Davidson and CPHF iterations required per step during a geometry optimization. The third column (CI + z) gives the mean time per step to converge the CIS and z-vector. The final column reports the total wall time for the optimization. Timings are the mean of three consecutive runs.
. | Davidson . | CPHF . | (CI + z)/s . | Wall/s . |
---|---|---|---|---|
Naphthalene/cc-pVTZ | ||||
Δϵ guess | 7.3 | 6.0 | 100.1 | 898 |
prev. | 10.0 | 6.7 | 114.7 | 986 |
sgn. prev. | 8.2 | 5.7 | 96.2 | 875 |
rot. prev. | 4.7 | 4.0 | 68.7 | 713 |
C28H20O/6-31G** | ||||
Δϵ guess | 22.5 | 9.0 | 88.8 | 815 |
prev. | 9.6 | 8.1 | 61.7 | 625 |
sgn. prev. | 7.7 | 6.4 | 49.9 | 542 |
rot. prev. | 7.0 | 5.3 | 42.4 | 490 |
. | Davidson . | CPHF . | (CI + z)/s . | Wall/s . |
---|---|---|---|---|
Naphthalene/cc-pVTZ | ||||
Δϵ guess | 7.3 | 6.0 | 100.1 | 898 |
prev. | 10.0 | 6.7 | 114.7 | 986 |
sgn. prev. | 8.2 | 5.7 | 96.2 | 875 |
rot. prev. | 4.7 | 4.0 | 68.7 | 713 |
C28H20O/6-31G** | ||||
Δϵ guess | 22.5 | 9.0 | 88.8 | 815 |
prev. | 9.6 | 8.1 | 61.7 | 625 |
sgn. prev. | 7.7 | 6.4 | 49.9 | 542 |
rot. prev. | 7.0 | 5.3 | 42.4 | 490 |
Methyl-bridged benzophenone anthracene donor–acceptor complex, C28H20O, in the ground state geometry used for the start of the geometry optimizations.
Methyl-bridged benzophenone anthracene donor–acceptor complex, C28H20O, in the ground state geometry used for the start of the geometry optimizations.
In Figs. 2(a) and 2(d), we plot the number of iterations required to converge the excited states as a function of the optimization cycle; after the first cycle, our guess nearly always requires less than half of the iterations required by the standard Δϵ guess. In Figs. 2(c) and 2(f), we plot the error (norm of the largest residual) during the Davidson algorithm during the second optimization cycle of the amplitude diagonalization [see the arrows in Figs. 2(a) and 2(d)]. Figures 2(b) and 2(e) show the number of cycles required to converge the coupled perturbed Hartree–Fock (CPHF) equations. We declare the CI to be converged when the max error is less than 10−6 and the z-vector when the max error is less than 10−7; we have chosen these error thresholds because it is known that using a guess can lead to small systematic errors in the gradient and long time energy drift2 when solving Hartree–Fock equations. For all problems we have studied so far, such problems can be avoided with the present choice of thresholds, although, no doubt, future work in dynamics would want to investigate long time drift in more detail (including schemes of Niklasson to eliminate such drift16,17) and/or potentially use an extrapolation of more than one previous guess.2 Future work should also investigate non-orthogonal Krylov methods18,19 for reducing cost even further.
Convergence properties for excited state optimization of naphthalene (left) and C28H20O (right). Number of iterations required for the Davidson algorithm (a) and (d) and coupled-perturbed Hartree–Fock equations (b) and (e). (c) and (f) The rate of convergence during optimization cycle 2 for the amplitude diagonalization [see the arrows in (a) and (d)] in terms of the norm of the largest residual vector during the Davidson algorithm; convergence is achieved when all residuals are less than 10−6. In all cases, the proposed guess is entirely stable and often can offer large savings.
Convergence properties for excited state optimization of naphthalene (left) and C28H20O (right). Number of iterations required for the Davidson algorithm (a) and (d) and coupled-perturbed Hartree–Fock equations (b) and (e). (c) and (f) The rate of convergence during optimization cycle 2 for the amplitude diagonalization [see the arrows in (a) and (d)] in terms of the norm of the largest residual vector during the Davidson algorithm; convergence is achieved when all residuals are less than 10−6. In all cases, the proposed guess is entirely stable and often can offer large savings.
In conclusion, we report a robust initial guess for CIS/TDA amplitudes and z-vectors that reduce the number of iterations for excited state calculations by over 50% and reduce excited state optimization times by as much as 40%. Looking forward, we anticipate that the present guessing strategy can be applied to other electronic structure methods, such as TD-DFT under the particle–hole Random Phase Approximation (ph-RPA), Algebraic Diagrammatic Construction (ADC),20 as well as multi-configuration SCF (MCSCF) calculations.6 The most computationally expensive part of the procedure is the singular value decomposition in Eq. (3). The matrix’s size is determined by the number of basis functions, and even in the examples given here, the complete procedure requires less than 1 s (making the guess effectively free). The bottom line is that we anticipate that Eqs. (2) and (6) above can and will be implemented within modern electronic structure programs (including Q-Chem) in the near future and that chemists looking for optimized excited state structures for large molecules can anticipate at least a 30% reduction in cost going forward. We also anticipate savings when running ab initio excited state dynamics.21
This work was supported by the U.S. Air Force Office of Scientific Research (USAFOSR) (under Grant Nos. FA9550-23-1-0368 and FA9550-18-1-0420). We acknowledge the DoD High Performance Computing Modernization Program for computer time.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
D. Vale Cofer-Shabica: Conceptualization (equal); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). Vishikh Athavale: Conceptualization (equal); Investigation (equal); Methodology (equal); Writing – review & editing (equal). Joseph E. Subotnik: Conceptualization (lead); Funding acquisition (lead); Supervision (lead); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
REFERENCES
We believe this anomaly can be explained by noting that S′ is not invariant under translation or rotation. By contrast, the transformation in Eq. (4) is robust in all cases we have studied.