We report robust initial guesses for the amplitudes and z-vectors in a configuration interaction singles or Tamm–Dancoff approximation calculation that consistently reduce the total number of iterations required for an excited state calculation often by over 50%. The end result of these guesses is that the practicing chemist can expect to generate excited state optimized structures with a total wall time reduced by as much as 30% in the future without any approximations—simply by using information gathered at one geometry and applying it to another geometry.

During a geometry optimization or ab initio molecular dynamics, it is common to use a set of previously computed molecular orbitals (MOs) as the starting guess for the next geometry.1 This no-cost approach can reduce the error at the initial step by an order of magnitude and shaves a few cycles off of the SCF procedure. Moreover, in the course of dynamics, several researchers have shown that extrapolating a set of previously computed Fock matrices via a power series expansion1–3 or linear-predictor methods4 can reduce the error and the cost without introducing significant bias into the final solution. While these approaches are routine today for ground state calculations, as far as we are aware, only limited efforts are made in standard electronic structure packages to read-in excited state amplitudes from one geometry to the next.5,6 Rather, for example, in Configuration Interaction Singles (CIS) or Tamm–Dancoff Approximation Time-Dependent Density Functional Theory (TDA-TDDFT) calculations, one usually initializes the diagonalization routine using a standard set of tia guesses for the amplitudes based on zero order energy differences (Δϵ). As we discuss below, this state of affairs arises presumably for reasons of stability.

In this Communication, we report that a simple rotation of the old amplitudes into the space of the new molecular orbitals using the Kabsch algorithm7 (which has many other names as well8,9) can serve as an exceedingly good and stable guess for the excited state amplitudes. This guess often has an error that is several orders of magnitude lower than if no such guess is made, and the Davidson algorithm10 converges dramatically faster to the relevant excited states in a consistently stable fashion. The same approach is applied to the z-vector (which is used in the calculation of the gradient) for similar effect.

Mathematically, the novel guess amplitudes are quite straightforward to compute. Notationally, let us write down a CIS/TDA wave function11 in terms of single excitations, Φia, from the Hartree–Fock ground state,
(1)
In Eq. (1), i indexes occupied orbitals, a indexes virtual orbitals, and σ indexes the spin. The proposed transformation from the old amplitudes t̃ into the new guess, t, is
(2)
The rotation matrices Rσw for w ∈ {occupied, virtual} and σ ∈ {α, β} are constructed by maximizing the weighted overlap of the old MOs with the new MOs by way of the Kabsch algorithm,7,8
(3)
(4)
Here, C̃σw and Cσw are the converged molecular orbitals from the old and new SCF procedures, respectively, and (S)μν = ⟨μ|ν⟩ is the atomic-orbital overlap at the new geometry. The matrices Wσw weight the orbital overlaps by their contribution to the previous electronic states so that one puts most emphasis on aligning orbitals that are relevant to the given electronic state(s),
(5)
with I indexing the set of N previously computed electronic states.

The rotation in Eq. (4) improves substantially upon simply reading in the previous guess because, in the absence of Wσw, it is the analytically optimal rotation of the old MOs into the new—a task often referred to as the Procrustes problem and long known in the chemistry community.8,12 Interestingly, if one implements Eq. (3) with the cross-geometry atomic-orbital overlap, (S)μν=μ(x̃)|ν(x) [instead of (S)μν=μ(x)|ν(x)]—as one might be tempted to do for the most rigorous overlap scheme—we find much worse performance; a similar issue was observed in previous efforts to extrapolate the Fock matrix.2,13

The same approach can be applied to the z-vector—which is used in a variety of response properties, including gradients and dipole moments. Solving for the z-vector requires an initial guess and then performing a coupled perturbed Hartree–Fock (CPHF) calculation. In the spirit of extrapolating the Fock matrix, there have been previous attempts to improve the initial guess for the z-vector.14 For our purposes, we will investigate the implications of using the simple ansatz
(6)
to update the z-vector using the same transformation in Eq. (6) as in Eq. (2), In principle, one could imagine constructing the weights in Eq. (5) using z̃σ for the z-vector instead of t̃σ, but in our limited testing, we find that this modification confers no benefit.
Equations (2) and (6) rely on the principle that when we transform from the old to new molecular orbitals, we treat the occupied and virtual subspaces separately and without mixing. For example, given a CIS/TDA eigenvalue problem of the form
(7)
it is straightforward to prove that when the occupied and virtual MOs are rotated by Rσo and Rσv, respectively, the A tensor and the amplitudes t should be rotated accordingly as well. Therefore, saving what are generally small changes in the atomic orbitals between the old and new geometry, the transformed amplitudes in Eq. (2) exactly solve the CIS/TDA eigenvalue problem at the new geometry and consequently constitute a substantial improvement over the standard Δϵ guess. We note that the standard Δϵ guess is simply formed by taking the N single orbital transitions with smallest energy as the N lowest eigenstates.

In Table I, we show the relative speed-up for computing an excited state optimization of (1) the S1 surface of naphthalene and (2) the T1 surface of a methyl-bridged benzophenone anthracene complex (pictured in Fig. 1) after implementing Eqs. (2)(4) and (6) into a developmental version of Q-Chem 6.0.15 Excited state energies and gradients were computed via restricted Configuration Interaction Singles (CIS) in the cc-pVTZ (naphthalene) or 6-31G** (C28H20O) basis set on 32 cores of an Intel Gold 6242 CPU clocked at 2.80 GHz. The starting geometries were the converged Hartree–Fock ground state geometries. We computed the lowest three singlet (naphthalene) or triplet (C28H20O) states in each case. Over the course of the optimizations, each optimization cycle was converged when the DIIS error in the SCF was less than 10−8, the norm of the CIS residuals was less than 10−6, and the error in the z-vector was less than 10−7. Absolute energies at each structure computed by all methods never differed by more than 2 × 10−6 Eh, and the energy deviation at the final geometry was less than 2 × 10−7 Eh for both molecules. The optimization was considered converged when the maximum component of the gradient was less than 3 × 10−4 Eh/a0 and either (1) the maximum component of the displacement from the previous step was less than 1.2 × 10−3 a0 or (2) the absolute energy between successive structures was less than 10−6 Eh.

TABLE I.

The first two columns, respectively, give the mean number of Davidson and CPHF iterations required per step during a geometry optimization. The third column (CI + z) gives the mean time per step to converge the CIS and z-vector. The final column reports the total wall time for the optimization. Timings are the mean of three consecutive runs.

DavidsonCPHF(CI + z)/sWall/s
Naphthalene/cc-pVTZ 
Δϵ guess 7.3 6.0 100.1 898 
prev. tia 10.0 6.7 114.7 986 
sgn. prev. tia 8.2 5.7 96.2 875 
rot. prev. tia 4.7 4.0 68.7 713 
C28H20O/6-31G** 
Δϵ guess 22.5 9.0 88.8 815 
prev. tia 9.6 8.1 61.7 625 
sgn. prev. tia 7.7 6.4 49.9 542 
rot. prev. tia 7.0 5.3 42.4 490 
DavidsonCPHF(CI + z)/sWall/s
Naphthalene/cc-pVTZ 
Δϵ guess 7.3 6.0 100.1 898 
prev. tia 10.0 6.7 114.7 986 
sgn. prev. tia 8.2 5.7 96.2 875 
rot. prev. tia 4.7 4.0 68.7 713 
C28H20O/6-31G** 
Δϵ guess 22.5 9.0 88.8 815 
prev. tia 9.6 8.1 61.7 625 
sgn. prev. tia 7.7 6.4 49.9 542 
rot. prev. tia 7.0 5.3 42.4 490 
FIG. 1.

Methyl-bridged benzophenone anthracene donor–acceptor complex, C28H20O, in the ground state geometry used for the start of the geometry optimizations.

FIG. 1.

Methyl-bridged benzophenone anthracene donor–acceptor complex, C28H20O, in the ground state geometry used for the start of the geometry optimizations.

Close modal
As one can discern from Table I, implementing the excited state amplitude guess in Eq. (2) leads to a massive reduction in the number of iterations and the time required for the Davidson method to converge the excited states for the large donor–bridge–acceptor system. For this system, the required number of Davidson iterations is reduced by 69%, the mean time to converge the CI is reduced by 52%, and the overall calculation time is reduced by 40% relative to the standard, Δϵ, guess. In Table I, we also investigate the implications of two other simpler schemes: (1) simply reading in the previously converged amplitudes (prev. tia) without the transformation in Eq. (2) and (2) correcting tia using only the relative signs of the new and old molecular orbitals (sgn. prev. tia), that is, constructing a new matrix, Rσw, to be used in Eqs. (2) and (6),
(8)
where sgn(x) is the sign function defined as
We find that, while less effective than the novel transformation proposed here, either of these naïve starting guesses may be helpful in some cases, as for C28H20O. However, both may also make things worse, as for naphthalene. Likely for this reason, most existing electronic structure packages do not read in the previous set of excited state amplitudes by default as a guess. In contrast, the transformation in Eq. (4) never worsens performance.

In Figs. 2(a) and 2(d), we plot the number of iterations required to converge the excited states as a function of the optimization cycle; after the first cycle, our guess nearly always requires less than half of the iterations required by the standard Δϵ guess. In Figs. 2(c) and 2(f), we plot the error (norm of the largest residual) during the Davidson algorithm during the second optimization cycle of the amplitude diagonalization [see the arrows in Figs. 2(a) and 2(d)]. Figures 2(b) and 2(e) show the number of cycles required to converge the coupled perturbed Hartree–Fock (CPHF) equations. We declare the CI to be converged when the max error is less than 10−6 and the z-vector when the max error is less than 10−7; we have chosen these error thresholds because it is known that using a guess can lead to small systematic errors in the gradient and long time energy drift2 when solving Hartree–Fock equations. For all problems we have studied so far, such problems can be avoided with the present choice of thresholds, although, no doubt, future work in dynamics would want to investigate long time drift in more detail (including schemes of Niklasson to eliminate such drift16,17) and/or potentially use an extrapolation of more than one previous guess.2 Future work should also investigate non-orthogonal Krylov methods18,19 for reducing cost even further.

FIG. 2.

Convergence properties for excited state optimization of naphthalene (left) and C28H20O (right). Number of iterations required for the Davidson algorithm (a) and (d) and coupled-perturbed Hartree–Fock equations (b) and (e). (c) and (f) The rate of convergence during optimization cycle 2 for the amplitude diagonalization [see the arrows in (a) and (d)] in terms of the norm of the largest residual vector during the Davidson algorithm; convergence is achieved when all residuals are less than 10−6. In all cases, the proposed guess is entirely stable and often can offer large savings.

FIG. 2.

Convergence properties for excited state optimization of naphthalene (left) and C28H20O (right). Number of iterations required for the Davidson algorithm (a) and (d) and coupled-perturbed Hartree–Fock equations (b) and (e). (c) and (f) The rate of convergence during optimization cycle 2 for the amplitude diagonalization [see the arrows in (a) and (d)] in terms of the norm of the largest residual vector during the Davidson algorithm; convergence is achieved when all residuals are less than 10−6. In all cases, the proposed guess is entirely stable and often can offer large savings.

Close modal

In conclusion, we report a robust initial guess for CIS/TDA amplitudes and z-vectors that reduce the number of iterations for excited state calculations by over 50% and reduce excited state optimization times by as much as 40%. Looking forward, we anticipate that the present guessing strategy can be applied to other electronic structure methods, such as TD-DFT under the particle–hole Random Phase Approximation (ph-RPA), Algebraic Diagrammatic Construction (ADC),20 as well as multi-configuration SCF (MCSCF) calculations.6 The most computationally expensive part of the procedure is the singular value decomposition in Eq. (3). The matrix’s size is determined by the number of basis functions, and even in the examples given here, the complete procedure requires less than 1 s (making the guess effectively free). The bottom line is that we anticipate that Eqs. (2) and (6) above can and will be implemented within modern electronic structure programs (including Q-Chem) in the near future and that chemists looking for optimized excited state structures for large molecules can anticipate at least a 30% reduction in cost going forward. We also anticipate savings when running ab initio excited state dynamics.21 

This work was supported by the U.S. Air Force Office of Scientific Research (USAFOSR) (under Grant Nos. FA9550-23-1-0368 and FA9550-18-1-0420). We acknowledge the DoD High Performance Computing Modernization Program for computer time.

The authors have no conflicts to disclose.

D. Vale Cofer-Shabica: Conceptualization (equal); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). Vishikh Athavale: Conceptualization (equal); Investigation (equal); Methodology (equal); Writing – review & editing (equal). Joseph E. Subotnik: Conceptualization (lead); Funding acquisition (lead); Supervision (lead); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
P.
Pulay
and
G.
Fogarasi
,
Chem. Phys. Lett.
386
,
272
(
2004
).
2.
J. M.
Herbert
and
M.
Head-Gordon
,
Phys. Chem. Chem. Phys.
7
,
3269
(
2005
).
3.
J.
Fang
,
X.
Gao
,
H.
Song
, and
H.
Wang
,
J. Chem. Phys.
144
,
244103
(
2016
).
4.
J. D.
Herr
and
R. P.
Steele
,
Chem. Phys. Lett.
661
,
42
(
2016
).
5.
K. K.
Baeck
and
T. J.
Martinez
,
Chem. Phys. Lett.
375
,
299
(
2003
).
6.
B. G.
Levine
,
J. D.
Coe
,
A. M.
Virshup
, and
T. J.
Martínez
, “
Ultrafast photoinduced processes in polyatomic molecules
,”
Chem. Phys.
347
,
3
(
2008
).
7.
W.
Kabsch
,
Acta Crystallogr., Sect. A
32
,
922
(
1976
).
8.
B. C.
Carlson
and
J. M.
Keller
,
Phys. Rev.
105
,
102
(
1957
).
9.
J. G.
Aiken
,
J. A.
Erdos
, and
J. A.
Goldstein
,
Int. J. Quantum Chem.
18
,
1101
(
1980
).
10.
11.
J. B.
Foresman
,
M.
Head-Gordon
,
J. A.
Pople
, and
M. J.
Frisch
,
J. Phys. Chem.
96
,
135
(
1992
).
12.
G. H.
Golub
and
C. F.
Van Loan
,
Matrix Computations
(
JHU Press
,
2013
).
13.

We believe this anomaly can be explained by noting that S′ is not invariant under translation or rotation. By contrast, the transformation in Eq. (4) is robust in all cases we have studied.

14.
R. P.
Steele
and
J. C.
Tully
,
Chem. Phys. Lett.
500
,
167
(
2010
).
15.
E.
Epifanovsky
,
A. T. B.
Gilbert
,
X.
Feng
,
J.
Lee
,
Y.
Mao
,
N.
Mardirossian
,
P.
Pokhilko
,
A. F.
White
,
M. P.
Coons
,
A. L.
Dempwolff
,
Z.
Gan
,
D.
Hait
,
P. R.
Horn
,
L. D.
Jacobson
,
I.
Kaliman
,
J.
Kussmann
,
A. W.
Lange
,
K. U.
Lao
,
D. S.
Levine
,
J.
Liu
,
S. C.
McKenzie
,
A. F.
Morrison
,
K. D.
Nanda
,
F.
Plasser
,
D. R.
Rehn
,
M. L.
Vidal
,
Z.-Q.
You
,
Y.
Zhu
,
B.
Alam
,
B. J.
Albrecht
,
A.
Aldossary
,
E.
Alguire
,
J. H.
Andersen
,
V.
Athavale
,
D.
Barton
,
K.
Begam
,
A.
Behn
,
N.
Bellonzi
,
Y. A.
Bernard
,
E. J.
Berquist
,
H. G. A.
Burton
,
A.
Carreras
,
K.
Carter-Fenk
,
R.
Chakraborty
,
A. D.
Chien
,
K. D.
Closser
,
V.
Cofer-Shabica
,
S.
Dasgupta
,
M.
de Wergifosse
,
J.
Deng
,
M.
Diedenhofen
,
H.
Do
,
S.
Ehlert
,
P.-T.
Fang
,
S.
Fatehi
,
Q.
Feng
,
T.
Friedhoff
,
J.
Gayvert
,
Q.
Ge
,
G.
Gidofalvi
,
M.
Goldey
,
J.
Gomes
,
C. E.
González-Espinoza
,
S.
Gulania
,
A. O.
Gunina
,
M. W. D.
Hanson-Heine
,
P. H. P.
Harbach
,
A.
Hauser
,
M. F.
Herbst
,
M.
Hernández Vera
,
M.
Hodecker
,
Z. C.
Holden
,
S.
Houck
,
X.
Huang
,
K.
Hui
,
B. C.
Huynh
,
M.
Ivanov
,
A.
Jász
,
H.
Ji
,
H.
Jiang
,
B.
Kaduk
,
S.
Kähler
,
K.
Khistyaev
,
J.
Kim
,
G.
Kis
,
P.
Klunzinger
,
Z.
Koczor-Benda
,
J. H.
Koh
,
D.
Kosenkov
,
L.
Koulias
,
T.
Kowalczyk
,
C. M.
Krauter
,
K.
Kue
,
A.
Kunitsa
,
T.
Kus
,
I.
Ladjánszki
,
A.
Landau
,
K. V.
Lawler
,
D.
Lefrancois
,
S.
Lehtola
,
R. R.
Li
,
Y.-P.
Li
,
J.
Liang
,
M.
Liebenthal
,
H.-H.
Lin
,
Y.-S.
Lin
,
F.
Liu
,
K.-Y.
Liu
,
M.
Loipersberger
,
A.
Luenser
,
A.
Manjanath
,
P.
Manohar
,
E.
Mansoor
,
S. F.
Manzer
,
S.-P.
Mao
,
A. V.
Marenich
,
T.
Markovich
,
S.
Mason
,
S. A.
Maurer
,
P. F.
McLaughlin
,
M. F. S. J.
Menger
,
J.-M.
Mewes
,
S. A.
Mewes
,
P.
Morgante
,
J. W.
Mullinax
,
K. J.
Oosterbaan
,
G.
Paran
,
A. C.
Paul
,
S. K.
Paul
,
F.
Pavošević
,
Z.
Pei
,
S.
Prager
,
E. I.
Proynov
,
A.
Rák
,
E.
Ramos-Cordoba
,
B.
Rana
,
A. E.
Rask
,
A.
Rettig
,
R. M.
Richard
,
F.
Rob
,
E.
Rossomme
,
T.
Scheele
,
M.
Scheurer
,
M.
Schneider
,
N.
Sergueev
,
S. M.
Sharada
,
W.
Skomorowski
,
D. W.
Small
,
C. J.
Stein
,
Y.-C.
Su
,
E. J.
Sundstrom
,
Z.
Tao
,
J.
Thirman
,
G. J.
Tornai
,
T.
Tsuchimochi
,
N. M.
Tubman
,
S. P.
Veccham
,
O.
Vydrov
,
J.
Wenzel
,
J.
Witte
,
A.
Yamada
,
K.
Yao
,
S.
Yeganeh
,
S. R.
Yost
,
A.
Zech
,
I. Y.
Zhang
,
X.
Zhang
,
Y.
Zhang
,
D.
Zuev
,
A.
Aspuru-Guzik
,
A. T.
Bell
,
N. A.
Besley
,
K. B.
Bravaya
,
B. R.
Brooks
,
D.
Casanova
,
J.-D.
Chai
,
S.
Coriani
,
C. J.
Cramer
,
G.
Cserey
,
A. E.
DePrince
,
R. A.
DiStasio
,
A.
Dreuw
,
B. D.
Dunietz
,
T. R.
Furlani
,
W. A.
Goddard
,
S.
Hammes-Schiffer
,
T.
Head-Gordon
,
W. J.
Hehre
,
C.-P.
Hsu
,
T.-C.
Jagau
,
Y.
Jung
,
A.
Klamt
,
J.
Kong
,
D. S.
Lambrecht
,
W.
Liang
,
N. J.
Mayhall
,
C. W.
McCurdy
,
J. B.
Neaton
,
C.
Ochsenfeld
,
J. A.
Parkhill
,
R.
Peverati
,
V. A.
Rassolov
,
Y.
Shao
,
L. V.
Slipchenko
,
T.
Stauch
,
R. P.
Steele
,
J. E.
Subotnik
,
A. J. W.
Thom
,
A.
Tkatchenko
,
D. G.
Truhlar
,
T.
Van Voorhis
,
T. A.
Wesolowski
,
K. B.
Whaley
,
H. L.
Woodcock
,
P. M.
Zimmerman
,
S.
Faraji
,
P. M. W.
Gill
,
M.
Head-Gordon
,
J. M.
Herbert
, and
A. I.
Krylov
,
J. Chem. Phys.
155
,
084801
(
2021
).
16.
A. M. N.
Niklasson
,
C. J.
Tymczak
, and
M.
Challacombe
,
J. Chem. Phys.
126
,
144103
(
2007
).
17.
A.
Odell
,
A.
Delin
,
B.
Johansson
,
N.
Bock
,
M.
Challacombe
, and
A. M. N.
Niklasson
,
J. Chem. Phys.
131
,
244106
(
2009
).
18.
F.
Furche
,
B. T.
Krull
,
B. D.
Nguyen
, and
J.
Kwon
,
J. Chem. Phys.
144
,
174105
(
2016
).
19.
R. M.
Parrish
,
E. G.
Hohenstein
, and
T. J.
Martínez
,
J. Chem. Theory Comput.
12
,
3003
(
2016
).
20.
A.
Dreuw
and
M.
Wormit
,
Wiley Interdiscip. Rev.: Comput. Mol. Sci.
5
,
82
(
2015
).
21.
E.
Tapavicza
,
G. D.
Bellchambers
,
J. C.
Vincent
, and
F.
Furche
,
Phys. Chem. Chem. Phys.
15
,
18336
(
2013
).