We recently developed an algorithm to compute response properties for the state-averaged complete active space self-consistent field method (SA-CASSCF) that capitalized on sparsity in the atomic orbital basis. Our original algorithm was limited to treating small to moderate sized active spaces, but the recent development of graphical processing unit (GPU) based direct-configuration interaction algorithms provides an opportunity to extend this to large active spaces. We present here a direct-compatible version of the coupled perturbed equations, enabling us to compute response properties for systems treated with arbitrary active spaces (subject to available memory and computation time). This work demonstrates that the computationally demanding portions of the SA-CASSCF method can be formulated in terms of seven fundamental operations, including Coulomb and exchange matrix builds and their derivatives, as well as, generalized one- and two-particle density matrix and σ vector constructions. As in our previous work, this algorithm exhibits low computational scaling and is accelerated by the use of GPUs, making possible optimizations and nonadiabatic dynamics on systems with O(1000) basis functions and O(100) atoms, respectively.

The state-averaged complete active space self-consistent field (SA-CASSCF) method is an important theoretical method for the treatment of molecular systems involving multiple contributing electronic configurations.1–8 State averaging is especially critical for photochemical applications because it enables the treatment of multiple electronic states on an equal footing and alleviates the well-known root-flipping problem associated with optimizing an excited state wavefunction.8–12 As such, SA-CASSCF is commonly employed in the simulation of photochemical reactions. We recently developed algorithms for the calculation of the SA-CASSCF energy, nuclear gradient, and nonadiabatic coupling (NAC) vector that exploit the intrinsic sparsity in the atomic orbital (AO) basis and capitalize on graphical processing unit (GPU) acceleration.13,14 We were able to reduce the formal O(N4) scaling to effectively O(N2), where N is the number of basis functions, without introducing any approximations. Our original algorithms were designed for small to medium sized active spaces and involved explicitly building and diagonalizing the entire CAS configuration interaction (CI) Hamiltonian, which is the full CI Hamiltonian for a subset of active orbitals. Moreover, we worked in the basis of eigenstates of the CI Hamiltonian to solve the coupled perturbed multiconfiguration self-consistent field (CP-MCSCF) equations and form nuclear gradients and NAC vectors. Despite the robustness of these algorithms, the number of determinants scales factorially with the size of the active space, rendering them impractical for large active spaces due to memory requirements.

In order to treat large active spaces, direct-CI algorithms were developed to efficiently solve for a specified number of eigenvalues (energies) and eigenvectors of the CI Hamiltonian.15–19 These algorithms are efficient because they avoid the explicit construction of the CI Hamiltonian. Using the Slater-Condon rules and the resolution of the identity, direct techniques exploit the well-defined structure of the Hamiltonian matrix to reduce the scaling of each matrix-vector product required during the iterative solution to the CI problem from quadratic to linear with respect to the number of determinants. Although spin-adapted formulations of direct methods exist and reduce the dimensionality of the CI problem by a factor of two to three,20,21 determinant-based algorithms are often employed due to the convenient integer form of the coupling coefficients,22 which are required for computing the energy and response properties. In recent work,23 we implemented a vectorized determinantal direct-CI algorithm on GPUs based on the work of Knowles and Handy.15 Our implementation substantially reduced the cost of determining the lowest several energies and eigenvectors of the CI Hamiltonian, enabling the efficient treatment of very large active spaces using commonly available desktop hardware. With the addition of fast routines to compute the one- and two-particle density matrices, it was straightforward to connect our direct-CI and SA-CASSCF energy codes. However, further development was necessary in order to compute nuclear gradients and NAC vectors, as described in the present manuscript.

Our interest in adapting our SA-CASSCF code stems from the importance of nuclear gradients and nonadiabatic coupling vectors to chemical research. Nuclear gradients are broadly useful in electronic structure theory and are required for minimizations, transition state searches, and molecular dynamics. Nonadiabatic coupling (NAC) vectors are important for the study of photochemistry due to their use in dynamics simulations on multiple electronic states, where the Born-Oppenheimer approximation breaks down, as well as in conical intersection searches.24–27 Pulay and Schaefer, among others, developed the initial methods for computing nuclear gradients,28–32 and Yarkony and co-workers established that NAC vectors could be computed with essentially the same theory.33,34 More recent work has simplified the theory behind the evaluation of analytical gradients through the use of Lagrangians.35,36 We employ and further develop the Lagrangian perspective on gradient theory in the present work.

As previously noted, our original implementation of the SA-CASSCF analytical gradients and NAC vectors was performed in a basis of eigenvectors of the CI Hamiltonian.14 This necessitates determining all possible spin-adapted CI eigenvectors, which is only computationally tractable for small active spaces (where the CI Hamiltonian matrix can be stored in memory). To make our gradient code compatible with large active spaces, we reformulate the CP-MCSCF equations in a basis of determinants and design an algorithm that utilizes GPU-accelerated direct-CI operations. Past research has established some of the frameworks required for adapting our method to large active spaces.37,38 However, to the best of our knowledge, this is the first work that discusses a well-defined procedure for computing analytical gradients and NAC vectors for large molecules in an arbitrarily large determinant basis without approximating the electron repulsion integrals, and it is certainly the first to do so using GPUs. As in our previous work,13,14,23 we present timings demonstrating both the favorable efficiency and scaling of our algorithm. In addition, we report results from a minimum energy conical intersection (MECI) search on the Dronpa-2 protein using mixed quantum mechanics/molecular mechanics (QM/MM) and a large (>100 atoms) quantum region. We believe that this represents one of the largest conical intersection searches ever performed. The algorithms developed in this work will enable picosecond (and beyond) time scale excited state dynamics simulations on large molecular systems, including systems that require large active spaces.

Throughout this work, we employ the following notation. Unless otherwise noted, Θ, Λ, Ψ, Φ, ϓ, and Ξ refer to eigenstates of the CI Hamiltonian. Θ and Λ are the particular eigenstates of interest for determining response properties. Ψ, Φ, ϓ, and Ξ are arbitrary eigenstates within the state-averaged space. When an eigenstate index appears in a summation, one should assume that the summation runs over all states within the state-averaged space. A state is considered as a part of the state-averaged space when it has nonzero weight, and there are Ns such states. Determinants will be indexed with I and J, and there are Nd determinants total. Atomic orbitals (AOs) will be indexed with μ, ν, ρ, and σ. Arbitrary molecular orbitals (MOs) will be indexed with p, q, r, and s. The inactive subset of occupied MOs will be indexed with i, j, k, and l, the active subset of MOs will be indexed with t, u, v, w, and x, and the virtual MOs will be indexed with a, b, c, and d. There are No total molecular orbitals.

The SA-CASSCF wavefunction is determined by variationally optimizing the following energy expression:39 

ESA=pqγpqSA(p|h^|q)+pqrsΓpqrsSA(pq|rs).
(1)

The state-averaged one-particle density matrix (OPDM), γSA, and two-particle density matrix (TPDM), ΓSA, are each a weighted average of the corresponding state-specific OPDMs and TPDMs,

γpqSA=ΨωΨγpqΨ,
(2)
ΓpqrsSA=ΨωΨΓpqrsΨ.
(3)

The weight for state Ψ is denoted as ωΨ, and these weights are typically positive and sum to unity.

The use of state averaging complicates the calculation of nuclear gradients and other response properties for a specific state because the energy of a single state is not variational with respect to the wavefunction parameters (the orbitals are optimized to minimize the state-averaged energy and not the energy of any individual state). However, by expressing the energy as a Lagrangian,36 it is possible to construct a variational expression for each electronic state,

LΘ=EΘ+[pqκ¯pqΘESAκpq0]+[ΨIc¯ΨIΘESAcΨI0],
(4)

where κpq and cΨI are orbital and CI parameters and κ¯pqΘ and c¯ΨIΘ are the corresponding Lagrange multipliers that make Eq. (4) variational. One can determine these multipliers by taking the partial derivative with respect to the orbital and CI parameters. The resulting set of linear equations is known as the coupled perturbed multiconfiguration self-consistent field equations,29,36

pqκ¯pqΘ2ESAκpqκrs+ΨIc¯ΨIΘ2ESAcΨIκrs=EΘκrs,
(5)
pqκ¯pqΘ2ESAκpqcΦJ+ΨIc¯ΨIΘ2ESAcΨIcΦJ=0.
(6)

Eqs. (5) and (6) can be rewritten in terms of matrix vector products as

H𝐎𝐎𝜿¯Θ+H𝐎𝐂𝒄¯Θ=𝐠oΘ,
(7)
H𝐂𝐎𝜿¯Θ+H𝐂𝐂𝒄¯Θ=0,
(8)

where H𝐎𝐎 and H𝐂𝐂 are the orbital-orbital and CI−CI components of the Hessian matrix, respectively. The blocks H𝐎𝐂 and H𝐂𝐎 are the orbital-CI coupling components of the Hessian matrix, and H𝐎𝐂=(H𝐂𝐎)T. The vector, 𝐠oΘ, is the state-specific orbital gradient. To calculate the nonadiabatic coupling vector between states Θ and Λ, 𝐠oΘ is replaced by the orbital interstate coupling vector, Θ|H/κpq|Λ, but the Hessian is the same. In principle, solving the coupled perturbed equations is just a straightforward exercise in numerical linear algebra. However, the Hessian can be extremely large and cannot be stored entirely in memory. As a result, it is necessary to iteratively solve the coupled perturbed equations, which only requires matrix-vector products between the Hessian and a trial vector. We will call the trial vector z and the product between the Hessian and a trial vector η, where

𝜼=H𝐳
(9)

and

𝐳=(𝜿¯Θ𝒄¯Θ)
(10)

at convergence. In certain instances, we will refer to elements and subvectors of 𝜿¯Θ, 𝒄¯Θ, and z. If two indices appear as a subscript, then we are referencing an element of the vector. When a single index appears as a subscript, we are referencing a subvector containing all elements with that subscript. In the case of 𝜿¯Θ, we will periodically use the words “core,” “act,” “virt,” and “all” as subscript index pairs to describe the subvector containing elements with these pair designations, expressed as a matrix. These will often appear as submatrices of a No × No matrix of orbital pairs.

In our previous works,13,14 we showed that it is possible to calculate the orbital gradient as well as certain components of H𝐎𝐎 (that scale as O(No2) in memory) using a combination of J and K matrix builds

Jμν(𝐃)=ρσ(μν|ρσ)Dρσ,
(11)
Kμν(𝐃)=ρσ(μρ|νσ)Dρσ,
(12)

where D is a generalized density matrix. This was largely accomplished by constructing a subset of the electron repulsion integrals (ERIs)—more specifically the set of two-electron integrals involving two active orbitals with the form (pq|tu) or (pt|qu)—as well as a few Fock-like matrices. The remaining contributions to H𝐎𝐎 were recovered using two J and two K matrix builds during each iteration of the solution to the coupled perturbed equations. This procedure was discussed in detail in Refs. 13 and 14 and will not be repeated here. However, for the sake of completeness, we provide the equations for the nonredundant portions of the orbital gradient

EΘκia=giaΘ=4Fiacore+2Fiaact(𝜸Θ),
(13)
EΘκta=gtaΘ=2uγtuΘFaucore+4uvwΓtuvwΘ(au|vw),
(14)
EΘκit=gitΘ=4Fitcore+2Fitact(𝜸Θ) 2uγtuΘFiucore4uvwΓtuvwΘ(iu|vw),
(15)

as well as the nonredundant terms of the state-averaged orbital Hessian

2ESAκitκju=it,juOO=4vw[ΓutvwSA(vw|ij)+(ΓuwvtSA+ΓuwtvSA)(vi|wj)]2δijvwx[ΓtvwxSA(uv|wx)+ΓuvwxSA(tv|wx)]+ 2v(δtv+γtvSA)[4(vi|uj)(ui|vj)(uv|ij)]+2v(δuv+γuvSA)[4(vj|ti)(tj|vi)(tv|ij)]+ 2γtuSAFijcoreδijv(γuvSAFtvcore+γtvSAFuvcore)+δij(4Ftucore+2Ftuact(𝜸SA))δtu(4Fijcore+2Fijact(𝜸SA)),
(16)
2ESAκitκja=it,jaOO={2u(2δtuγtuSA)[4(aj|ui)(au|ij)(ai|uj)]}2δijuvwΓtuvwSA(au|vw)+δij(4Fatcore+2Fatact(𝜸SA))δijuγtuSAFaucore,
(17)
2ESAκitκua=it,uaOO=4vw[ΓtuvwSA(ai|vw)+(ΓtvuwSA+ΓtvwuSA)(aw|vi)]2vγuvSA[(ai|tv)+(at|vi) 4(av|ti)]2γtuSAFaicore+δtu(2Faicore+Faiact(𝜸SA)),
(18)
2ESAκiaκjb=ia,jbOO={4[4(ai|bj)(ab|ij)(aj|bi)]}+δij(4Fabcore+2Fabact(𝜸SA))δab(4Fijcore+2Fijact(𝜸SA)),
(19)
2ESAκiaκtb=ia,tbOO={2uγtuSA[4(ai|bu)(au|bi)(ab|ui)]}2δabuvwΓtuvwSA(ui|vw)δab(2Fticore+Ftiact(𝜸SA))δabuγtuSAFuicore,
(20)
2ESAκtaκub=ta,ubOO=4vw[ΓtuvwSA(ab|vw)+(ΓtwvuSA+ΓtwuvSA)(aw|bv)]+2γtuSAFabcore2δabvwx[ΓtvwxSA(uv|wx)+ΓuvwxSA(tv|wx)]δabv(γtvSAFuvcore+γuvSAFtvcore).
(21)

In Eqs. (16)–(21), the braced terms are not explicitly constructed but instead recovered using matrix-vector products during the iterative solution to the coupled perturbed equations. The Fock-like matrices, Fcore and Fact(γ), and the subset of ERIs with two active indices can be computed and expressed in the following way:40,41

Fpqcore=(p|h^|q)+i[2(pq|ii)(pi|qi)]=μνCμpCνq[(μ|h^|ν)+2Jμν(𝑫core)Kμν(𝑫core)],
(22)
Fpqact(𝜸)=tu[2(pq|tu)(pt|qu)]γtu=μνCμpCνq[2Jμν(𝜸(act))Kμν(𝜸(act))],
(23)
(pq|tu)=μνCμpCνq(μν|tu)=μνCμpCνqJμν(𝑷(tu)),
(24)
(pt|qu)=μνCμpCνq(μt|νu)=μνCμpCνqKμν(𝑷(tu)),
(25)

where

Dμνcore=iCμiCνi,
(26)
γμν(act)=tuCμtCνuγtu,
(27)
Pμν(tu)=CμtCνu,
(28)

and p are MO coefficients. Components of the gradient and Hessian with certain molecular orbital pairs (e.g., ij, tu, and ab) are guaranteed to be zero because the energy is invariant to orbital rotations within the core, active, and virtual subspaces. As a result, the Lagrange multipliers corresponding to these orbital index pairs are guaranteed to be zero and are not coupled to the other Lagrange multipliers. For this reason, we can and do omit these terms entirely from the coupled perturbed equations. Furthermore, pqrsOO=qpsrOO=rspqOO=srqpOO and gpqΘ=gqpΘ. As a result, there are linear dependencies in the coupled perturbed equations, which can be removed by only including the terms in which p < q and r < s. Importantly, these terms are not zero and must be accounted for once the coupled perturbed equations are solved. Namely, it is important to remember that κ¯pqΘ=κ¯qpΘ. These omissions exist even when we employ the general MO indices p, q, r, and s in various other equations in this paper. In addition, we want to emphasize that the procedure to handle the orbital contribution to the nuclear gradient and nonadiabatic coupling vector remains unchanged in our direct-compatible implementation.

Our initial implementation of SA-CASSCF was designed for small to moderate sized active spaces (6 electrons in 8 orbitals or smaller). We explicitly built and diagonalized the CI Hamiltonian matrix

HIJ=δIJEcore+tuγtuIJ(t|h^|u)+12tuvwΓtuvwIJ(tu|vw).
(29)

Here, Ecore is the core energy, which is essentially the energy of the closed molecular orbitals and introduces a rigid shift for the eigenvalues, and the elements γtuIJ and ΓtuvwIJ are one- and two-particle coupling coefficients. For small to moderate sized active spaces, the Hamiltonian is small, and this strategy is tractable. Moreover, diagonalizing the entire Hamiltonian produces a complete set of energies and CI eigenvectors for the given active space, which are the eigenvalues and eigenvectors of the CI Hamiltonian, respectively. This made it possible to build and store H𝐎𝐂, H𝐂𝐎, and H𝐂𝐂, which are also quite small, using a basis of CI eigenvectors (instead of a basis of determinants). Unfortunately, the number of determinants increases combinatorially with the size of the active space. For large active spaces, it is often impossible to store the CI Hamiltonian, let alone diagonalize it completely.

This problem is resolved by employing the direct-CI technique,15–17,19 which involves the iterative solution of the Ns lowest eigenvalue-eigenvector pairs of the CI Hamiltonian. In the direct-CI formalism, it is only necessary to explicitly construct the ERIs corresponding to the active space as well as matrix-vector products between the CI Hamiltonian and a trial vector, often called 𝝈,

σI=JHIJcJ.
(30)

In recent work,23 we developed a highly efficient direct-CI algorithm on the GPU, demonstrating that our atomic-orbital based CAS algorithm is still highly efficient for large active spaces. The work focused on the complete active space configuration interaction (CAS-CI) method. However, it is entirely compatible with SA-CASSCF, given an efficient algorithm for the construction of a generalized OPDM,

γtuAB=IJcIAcJBγtuIJ,
(31)

and generalized TPDM,

ΓtuvwAB=12IJcIAcJBΓtuvwIJ.
(32)

In Eqs. (31) and (32), A and B serve only to indicate that the bra and ket coefficient vectors, c, need not be the same. Unless otherwise noted, generalized OPDMs and TPDMs are always symmetrized, which entails that γtuAB=γutAB and ΓtuvwAB=ΓutwvAB=ΓvwtuAB=ΓwvutAB. The ordering of A and B makes no difference, assuming the CI coefficients are real. When we write a generalized OPDM or TPDM with a single superscript, it is to be assumed that A = B. In practice, the resolution of the identity is employed in Eqs. (29)–(32) to avoid explicit computation and storage of the two-particle coupling coefficients.19,42 The algorithm to efficiently compute the OPDM and TPDM follows the one used to compute 𝝈.

Without diagonalizing the entire Hamiltonian, there is no way to construct H𝐎𝐂, H𝐂𝐎, and H𝐂𝐂 in a basis of CI eigenvectors. Moreover, explicitly constructing these blocks quickly becomes intractable as the size of the active space increases. As such, it is necessary to develop a direct-compatible implementation to calculate gradients and other response properties for SA-CASSCF wavefunctions with large active spaces. In a basis of determinants and when equal weights are employed, H𝐂𝐂 can be expressed as36 

ΨI,ΦJCC=2δΨΦωΨ[(HIJEΨδIJ)ϓcIϓcJϓ(EϓEΨ)].
(33)

The first term is a shifted variant of the CI Hamiltonian, and the second term projects out the contributions to H𝐂𝐂 coming from states within the state-averaged space. Similarly, H𝐎𝐂 and H𝐂𝐎 can be expressed as36,43

pq,ΨI𝐎𝐂=ΨI,pq𝐂𝐎=2ωΨ[Tpq(Ψ,I)ϓcIϓTpq(Ψ,ϓ)].
(34)

The matrix, 𝑻pq(Ψ,X), is a generalized orbital gradient matrix and is constructed with symmetrized, generalized density matrices, γΨX and ΓΨX. The variable, X, is a placeholder for any possible vector. In the above equations, X appears as the determinant, I, and the CI eigenvector, ϓ. Analogous to the case for H𝐂𝐂, the second term in Eq. (34) can be thought of as a projector, which removes the contribution to the Hessian from each state within the state-averaged space. The non-redundant components of 𝑻pq(Ψ,X) can be expressed as

Tia(Ψ,X)=4Ψ|XFiacore+2Fiaact(𝜸ΨX),
(35)
Tta(Ψ,X)=2uγtuΨXFaucore+4uvwΓtuvwΨX(au|vw),
(36)
Tit(Ψ,X)=4Ψ|XFitcore+2Fitact(𝜸ΨX)2uγtuΨXFiucore4uvwΓtuvwΨX(iu|vw).
(37)

We note that in the case when X = Ψ, T becomes the orbital gradient, 𝒈oΨ. For any other eigenvector of the CI Hamiltonian, T becomes an orbital interstate coupling vector, which replaces the orbital gradient in the calculation of nonadiabatic coupling vectors. All of these terms can be computed with J and K matrix builds in addition to OPDM and TPDM builds.

As previously noted, we do not explicitly construct any of the matrix blocks defined by Eqs. (33) and (34). Instead, we build products of each matrix block with trial vectors, much like a 𝝈 build, during each iteration of the solution to the coupled perturbed equations. We will begin by showing how to compute the product of H𝐂𝐂 and some trial vector because this is the most straightforward matrix-vector product. The expression for H𝐂𝐂 multiplied by the trial vector 𝒄ΦΘ is

ΦJΨI,ΦJCCcΦJΘ=Φ2δΨΦωΨ[J(HIJEΨδIJ)cΦJΘϓcIϓ(EϓEΨ)JcJϓcΦJΘ].
(38)

The vector 𝒄ΦΘ is an estimate for 𝒄¯ΦΘ, the set of CI Lagrange multipliers belonging to state Φ. The first term is simply a 𝝈 build in which 𝒄ΦΘ is the trial vector and Ecore is replaced by (EcoreEΨ) in Eq. (29). For the second term, the CI eigenvector for each state in the state-averaged space, cϓ, is weighted by the energy difference between states ϓ and Ψ as well as the overlap between 𝒄ΦΘ and cϓ. In practice, the second term in Eq. (38) is not required. The solution to the coupled perturbed equations is orthogonal to the state-averaged space, and there are no rotations within the Hessian between states inside and outside the state-averaged space. As such, it is guaranteed that all trial vectors are orthogonal to the state-averaged space as long as the initial guess is orthogonal to the space and the preconditioner contains no rotations between states inside and outside the state-averaged space. This is discussed at length in Ref. 36, and we employ similar techniques in this work. As such,

ΦJΨI,ΦJCCcΦJΘ=Φ2δΨΦωΨ[J(HIJEΨδIJ)cΦJΘ],
(39)

and only Ns𝝈 builds need to be constructed per iteration, since H𝐂𝐂 is block diagonal. It is also quite straightforward to compute H𝐎𝐂 multiplied by a trial vector,

ΨIpq,ΨIOCcΨIΘ=Ψ2ωΨ[Tpq(Ψ,𝒄ΨΘ)ϓTpq(Ψ,ϓ)IcIϓcΨIΘ].
(40)

Utilizing the same approach that is employed to transform Eq. (38) to (39),

ΨIpq,ΨIOC𝒄ΨIΘ=Ψ2ωΨTpq(Ψ,cΨΘ).
(41)

The generalized gradient, 𝐓(Ψ,𝒄ΨΘ), and by extension the generalized density matrices, 𝜸Ψ𝒄¯ΨΘ and 𝚪Ψ𝒄¯ΨΘ, must be computed Ns times per iteration.

The calculation of H𝐂𝐎 multiplied by a trial vector is somewhat more complicated,

pqΨI,pqCOκpqΘ=2ωΨpq[Tpq(Ψ,I)κpqΘϓcIϓTpq(Ψ,ϓ)κpqΘ].
(42)

As with the CI Lagrange multipliers, 𝜿Θ is an estimate of the vector of orbital Lagrange multipliers, 𝜿¯Θ. It is not feasible to construct T(Ψ,I) because the number of determinants can become extremely large. However, by inserting Eqs. (35)–(37) into Eq. (42), it is evident that Eq. (42) can be rewritten in the following way:

pqΨI,pqCOκpqΘ=2ωΨ[4αcIΨ+2tuγtuΨIFtuact(𝐃)+2tuγtuΨIFtucore(𝐃¯)+4tuvwΓtuvwΨI(tu|vw)mpq,ϓϓ|ITpq(Ψ,ϓ)κpqΘ],
(43)

where the variables α, Ftuact(𝐃), Ftucore(𝐃¯), and (tu|vw)m are defined as

α=pqFpq𝑐𝑜𝑟𝑒Dpq,
(44)
Ftu𝑎𝑐𝑡(𝐃)=pqDpq[2(pq|tu)(pt|qu)],
(45)
Ftu𝑐𝑜𝑟𝑒(𝐃¯)=pD¯tpFpu𝑐𝑜𝑟𝑒,
(46)
(tu|vw)m=pD¯tp(pu|vw)
(47)

and

𝐃=[0κcore,actΘκcore,virtΘ000000],
(48)
𝐃¯=[000κact,coreΘ0κact,virtΘ000].
(49)

By symmetrizing 𝐅act(𝐃), 𝐅core(𝐃¯), and (tu|vw)m, we can further simplify Eq. (43),

pqΨI,pqCOκpqΘ=2ωΨ[4αcIΨ+σIΨpq,ϓcIϓTpq(Ψ,ϓ)κpqΘ],
(50)

where

σIΨ=JHIJcJΨ,
(51)
HIJ=tu2γtuIJ(Ftuact(𝐃)+Ftucore(𝐃¯))+12tuvw4ΓtuvwIJ(tu|vw)m.
(52)

The terms α, 𝐅act(𝐃), 𝐅core(𝐃¯), and (tu|vw)m need only be constructed once per iteration. In addition, Ns modified 𝝈 builds and Ns2 dot products between 𝑻(Ψ,ϓ) and 𝜿¯pq are required per iteration. However, the matrices, 𝐓(Ψ,ϓ), need only be computed once and stored in memory. The reason this is tractable is that there are Ns(Ns+1)/2 unique orbital interstate coupling matrices (or gradient matrices, if Ψ = ϓ), which is a relatively small number in most SA-CASSCF calculations. In addition, each matrix only requires O(No2) memory.

This procedure alone is valid as long as the states included within the state-averaged space all have equal weight. This is not always desirable.44,45 Fortunately, modification of the procedure to treat a wavefunction in which the states are not equally weighted is quite straightforward. Stålring et al. derived a set of more general equations for the CI−CI and orbital-CI blocks of the Hessian,36 

ΨI,ΦJCC=2δΨΦ[ωΨ(HIJEΨδIJ)ϓωϓcIϓcJϓ(EϓEΨ)]+2cIΦcJΨ(EΨEΦ)(ωΨωΦ),
(53)
pq,ΨIOC=ΨI,pqCO=2ωΨTpq(Ψ,I)2ϓωϓcIϓTpq(Ψ,ϓ).
(54)

Importantly, the weights here are not the same as in Eqs. (33) and (34), in which all terms are weighted by ωΨ. In our implementation, however, we do not alter Eqs. (33) and (34). Instead, we introduce an additional set of CI Lagrange multipliers (expressed in the basis of CI eigenvectors) for the pairs of states within the state-averaged space that have different weights. The new Lagrangian takes the following form:

LΘ=EΘ+[pqκ¯pqΘESAκpq0]+[ΨIc¯ΨIΘESAcΨI0]+[Ψ<Φc¯ΨΦΘESAcΨΦ0].
(55)

By taking the derivative with respect to each of the orbital and two CI parameters, we recover an extended set of coupled perturbed equations

pqκ¯pqΘ2ESAκpqκrs+ΨIc¯ΨIΘ2ESAcΨIκrs+Ψ<Φc¯ΨΦΘ2ESAcΨΦκrs=EΘκrs,
(56)
pqκ¯pqΘ2ESAκpqcΦJ+ΨIc¯ΨIΘ2ESAcΨIcΦJ+Ψ<ϓc¯ΨϓΘ2ESAcΨϓcΦJ=0,
(57)
pqκ¯pqΘ2ESAκpqcϓΞ+ΨIc¯ΨIΘ2ESAcΨIcϓΞ+Ψ<Φc¯ΨΦΘ2ESAcΨΦcϓΞ=0.
(58)

As written, Eqs. (56)–(58) have multiple linear dependences because both the determinant and CI eigenvector blocks of the orbital-CI and CI−CI Hessian contain information regarding how the energy changes as a function of rotations within the state-averaged space. Eqs. (33) and (34) project out the components of the determinant Hessian that exist within the state-averaged space, and employing these equations in place of Eqs. (53) and (54) removes all linear dependencies. There is no CI−CI coupling between blocks expressed in a projected determinant and CI eigenvector basis because they span orthogonal spaces. For this reason, ΨI,ΦϓCC=Φϓ,ΨICC=0. With that in mind, the additional blocks of the Hessian corresponding to the states with nonzero weight can then be expressed in the basis of eigenstates of the CI Hamiltonian36,43

ΨΦ,ϓΞCC=2δΨϓ(ωΨωΦ)(Φ|H^|ΞEΨδΦΞ),
(59)
pq,ΨΦOC=ΨΦ,pqCO=2(ωΨωΦ)Tpq(Ψ,Φ).
(60)

It is worth noting that Eqs. (53) and (59) and Eqs. (54) and (60) are the same equations expressed in different bases (though the indices Ψ and Ξ would run over all eigenstates, as opposed to those within the state-averaged space, if we built the entire CI−CI and orbital-CI blocks in a basis of CI eigenvectors). It is tractable to treat this block in the basis of eigenstates because there can be at most Ns(Ns+1)/2 non-redundant Lagrange multipliers. In fact, all of the terms appearing in Eqs. (59) and (60) are already computed and stored to solve the coupled perturbed equations (see Eq. (50)). The advantage of using Eqs. (53) and (54) is that no additional CI Lagrange multipliers are introduced. Furthermore, there are fewer Hessian matrix singularities, which get introduced when the components of the state-averaged space are projected out of the determinant blocks. The disadvantage is that the coupled perturbed equations expressed in a determinant basis exhibit linear dependencies. By expressing Eqs. (53) and (54) in a basis of CI eigenvectors (as is done in Eqs. (59) and (60)), it is clear that c¯ΨΦΘ=c¯ΦΨΘ, which is a linear dependency. This can be easily removed in a basis of CI eigenvectors, but the same is not true in a basis of determinants. Moreover, the CI−CI block of the Hessian is no longer block diagonal when states are not equally weighted. This increases the cost of computing the product of the Hessian and a trial vector and potentially diminishes the rate of convergence of the coupled perturbed equations, which become less diagonally dominant. Our strategy alleviates these problems. We believe that the advantages of our strategy outweigh the disadvantages. The number of additional Lagrange multipliers introduced is trivial compared to the size of the Hessian (at least for most tractable active spaces). Moreover, the Hessian is guaranteed to be singular in a basis of determinants, and the noise from the additional singularities can be easily removed during the course of the solution to the coupled perturbed equations, as the direction of each singularity is known.

Although the primary difficulty in solving the coupled perturbed equations is the efficient computation of matrix-vector products involving the Hessian and a trial vector, there is another difficulty worth mentioning. In a basis of determinants, the Hessian is typically indefinite and always singular. Consider Eqs. (59) and (60). If ωΨ = ωΦ, then the matrix will be singular. In addition, there are linear dependencies in the Hessian, as cΨΦ = -cΦΨ, that introduce additional singularities. Similarly, if ωΨ < ωΦ and EΨ < EΦ or vice versa, then the matrix could be indefinite. This occurs because states with a different spin multiplicity are often lower in energy than the states of a given spin multiplicity included within the state-averaged space. When working in a basis of CI eigenvectors, we can guarantee that the Hessian is positive definite by removing these terms from the coupled perturbed equations, as long as the state weights are nonincreasing within a given spin block. This is the case because the Lagrange multipliers for these terms are either 0 or can be determined from another Lagrange multiplier. This transformation is shown pictorially in Figure 1. Unfortunately, we cannot cleanly excise these terms from the coupled perturbed equations when working in a basis of determinants and are forced to solve an indefinite system of linear equations with infinitely many valid solutions. Of course, only the least squares solution will produce the correct gradient or nonadiabatic coupling vector. There are a variety of possible ways to obtain the least squares solution. We use direct inversion of the iterative subspace (DIIS) to solve the coupled perturbed equations and employ a preconditioner, M, that projects the singular directions out of each subspace vector. The preconditioner employed to solve the coupled perturbed equations requires explanation. It is block diagonal, with separate orbital and CI blocks. The orbital block consists of the partially constructed portion of H𝐎𝐎 (the non-braced terms in Eqs. (16)–(21)) and will be called 𝐌𝐎. 𝐌𝐎 cannot be analytically and sparsely inverted and is instead inverted numerically using a preconditioned conjugate gradient routine during each iteration of the coupled perturbed equations. The CI block of the preconditioner can be written as

𝐌𝐂=𝐏𝐌¯𝐂𝐏,
(61)

where

M¯ΨIC=2ωΨ(HIIEΨ)
(62)

and

𝐏=𝟏ϓ|ϓϓ|
(63)

is the projector responsible for removing rotations between states inside and outside the active space. As demonstrated in prior work,36 𝐌𝐂 can be inverted analytically in the following way:

[𝐌𝐂]1=[𝐌¯𝐂]1Φϓ[𝐌¯𝐂]1|Φϓ|[𝐌¯𝐂]1QΦϓ1.
(64)

Since 𝐌¯𝐂 is diagonal, it is trivial to invert and store this matrix, and the CI eigenvectors for states Φ and ϓ are also readily available. The matrix Q defines the overlap between each CI eigenvector pair and [𝐌¯𝐂]1. It can be expressed as

QΦϓ=Φ|[𝐌¯𝐂]1|ϓ.
(65)

There is no analytical expression for the inverse of Eq. (65), but this matrix is only Ns2 in size, which is quite small. As such, one can do a dense inversion of Q to determine Q−1. The preconditioner enables us to omit certain components in H𝐂𝐂 and H𝐎𝐂 as previously mentioned.

FIG. 1.

The sparsity pattern and relative size of the state-averaged Hessian (in which two states are averaged) in a determinant basis (left), CI eigenvector basis (center), and CI eigenvector basis absent redundant parameters (right). CI eigenvectors are organized in order of increasing spin multiplicity, and elements equal to zero are shown in white. This illustrates the advantage of working in the CI eigenvector basis. Redundant CI parameters can be omitted, ensuring that the Hessian is positive definite. Otherwise, the Hessian is singular and potentially indefinite. Because H𝐂𝐂 is diagonal in a basis of CI eigenvectors, the CP-MCSCF equations can be more easily solved iteratively, as a diagonal preconditioner can be used effectively. However, a determinant basis (or configuration state function basis) is required if the number of states becomes sufficiently large due to time and memory constraints.

FIG. 1.

The sparsity pattern and relative size of the state-averaged Hessian (in which two states are averaged) in a determinant basis (left), CI eigenvector basis (center), and CI eigenvector basis absent redundant parameters (right). CI eigenvectors are organized in order of increasing spin multiplicity, and elements equal to zero are shown in white. This illustrates the advantage of working in the CI eigenvector basis. Redundant CI parameters can be omitted, ensuring that the Hessian is positive definite. Otherwise, the Hessian is singular and potentially indefinite. Because H𝐂𝐂 is diagonal in a basis of CI eigenvectors, the CP-MCSCF equations can be more easily solved iteratively, as a diagonal preconditioner can be used effectively. However, a determinant basis (or configuration state function basis) is required if the number of states becomes sufficiently large due to time and memory constraints.

Close modal

Once the coupled perturbed equations are solved, the procedure to calculate the gradient and nonadiabatic coupling vector for a system is essentially identical to the procedure presented in our previous work.14 However, we repeat much of this work for the sake of completeness. The nuclear gradient can be expressed as

EΘξ=pqγpqe(p|h^|q)ξ+pqrsΓpqrse(pq|rs)ξpqXpqe(p|q)ξ,
(66)

where ξ is an arbitrary electric field or nuclear perturbation and γe, Γe, and Xe are the relaxed (or effective) OPDM, TPDM, and CI Lagrangian, respectively. The effective OPDM is straightforward to compute and store, taking the form

𝜸e=𝜸Θ+𝜸+𝜸¯,
(67)

where

γpq=oγoqSAκ¯opΘγpoSAκ¯qoΘ,
(68)
γ¯pq=Ψ2ωΨγpqΨ𝒄¯ΨΘ+Ψ<Φ2c¯ΨΦΘ(ωΨωΦ)γpqΨΦ.
(69)

The use of Eq. (69) to construct the CI correction to the OPDM is one of the few modifications to our prior algorithm, apart from all the changes to the coupled perturbed equations just discussed. Importantly, the second term in Eq. (69) only appears when unequal weights are employed. The effective TPDM can be expressed in a very similar way,

𝚪e=𝚪Θ+𝚪+𝚪¯,
(70)

where

Γpqrs=oΓoqrsSAκ¯opΘ+ΓporsSAκ¯oqΘ+ΓpqosSAκ¯orΘ+ΓpqroSAκ¯osΘ,
(71)
Γ¯pqrs=Ψ2ωΨΓpqrsΨ𝒄¯ΨΘ+Ψ<Φ2c¯ΨΦΘ(ωΨωΦ)ΓpqrsΨΦ.
(72)

The use of Eq. (72) is the other modification required to make our algorithm direct-compatible. Again, the second term only appears in the case where unequal weights are utilized. Due to the size of the TPDM, these matrices are never explicitly computed. One can derive a factorization for the effective TPDM in terms of contributions to the effective OPDM as well as the complete eigenvalue decomposition of the active portion of the various TPDM contributions. This is accomplished by inserting the factorization of the TPDM into Eqs. (70)–(72),

ΓpqrsΘ=ΓpqrsΘ,act+12γpqΘγrsΘ12γpqΘ,actγrsΘ,act14γpsΘγqrΘ+14γpsΘ,actγqrΘ,act,
(73)

where the first term can be further factorized by taking the eigenvalue decomposition of the all-active state-specific TPDM,

ΓpqrsΘ,act=PωpqΘ,act,PλΘ,act,PωrsΘ,act,P.
(74)

A similar factorization exists for the state-averaged TPDM and two-particle transition density matrices. The result is the following matrix factorization for the effective TPDM:

ΓpqrsΘ=PωpqΘ,act,PλΘ,act,PωrsΘ,act,P+12γpqΘγrsΘ12γpqΘ,actγrsΘ,act14γpsΘγqrΘ+14γpsΘ,actγqrΘ,act,
(75)
Γpqrs=Poκ¯opΘωoqSA,act,PλSA,act,PωrsSA,act,P+PoωpoSA,act,Pκ¯oqΘλSA,act,PωrsSA,act,P+PoωpqSA,act,PλSA,act,Pκ¯orΘωosSA,act,P+PoωpqSA,act,PλSA,act,PωroSA,act,Pκ¯osΘ+12γpqSAγrs12γpqSA,actγrsact+12γpqγrsSA12γpqactγrsSA,act14γpsSAγqr+14γpsSA,actγqract14γpsγqrSA+14γpsactγqrSA,act,
(76)
Γ¯pqrs=Pω¯pqact,Pλ¯act,Pω¯rsact,P+12γpqcoreγ¯rs+12γ¯pqγrscore14γpscoreγ¯qr14γ¯psγqrcore.
(77)

Here, the “act” and “core” superscripts generally denote that only terms with exclusively active or core indices are nonzero. There is one exception to this rule,

γpqact=oγoqSA,actκ¯opΘγpoSA,actκ¯qoΘ.
(78)

Clearly, p and q need not both be active indices for the terms in 𝜸act to be nonzero, but 𝜸act is constructed from a state-averaged matrix in which all nonzero matrix elements have active indices. In practice, the state-specific and CI portions of the all-active TPDM are combined in order to reduce the cost of the calculation, but we separate them in Eqs. (75)–(77) for clarity.

With the effective OPDM and TPDM defined, the effective CI Lagrangian can be expressed as

Xpqe=XpqΘ+Xpq+X¯pq=(oγpoΘ(q|h^|o)+2orsΓporsΘ(qo|rs))+(oγpo(q|h^|o)+2orsΓpors(qo|rs))+(oγ¯po(q|h^|o)+2orsΓ¯pors(qo|rs))=oγpoe(q|h^|o)+2orsΓporse(qo|rs).
(79)

The state-specific contribution to the effective CI Lagrangian (𝐗Θ) is trivial to recover

XipΘ=2Fip𝑐𝑜𝑟𝑒+Fip𝑎𝑐𝑡(𝜸Θ),
(80)
XtpΘ=u𝜸tuΘFpu𝑐𝑜𝑟𝑒+2uvwΓtuvwΘ(pu|vw).
(81)

All other elements are zero. The terms appearing in Eqs. (80) and (81) were previously computed to solve the coupled perturbed equations, and as a result, this contribution to the CI Lagrangian is easy to compute. The CI correction to the effective CI Lagrangian (𝑿¯) is also quite straightforward to compute, exhibiting a very similar form to Eqs. (80) and (81),

X¯ip=Fip𝑎𝑐𝑡(𝜸¯),
(82)
X¯tp=uγ¯tuFpu𝑐𝑜𝑟𝑒+2uvwΓ¯tuvw(pu|vw).
(83)

Again, all other terms are zero. There is no such convenient expression for the computation of X, but this term can be recovered with some additional effort

Xpq=oγpo(q|h^|o)+2orsΓpors(qo|rs)=rγpr(q|h^|r)+2tuvΓ(qt|uv)ptuv1+2rtuΓ(qr|tu)prtu2+2rtuΓ(qt|ru)ptru3+2rtuΓ(qt|ur)ptur4+rγprSAJqr(𝜸)rγprSA,actJqr(𝜸act)+rγprJqr(𝜸SA)rγpractJqr(𝜸SA,act)12rγprSAKqr(𝜸)+12rγprSA,actKqr(𝜸act)12rγprKqr(𝜸SA)+12rγpractKqr(𝜸SA,act),
(84)

where

Γ=ptuv1Poκ¯opΘωotSA,act,PλSA,act,PωuvSA,act,P,
(85)
Γ=prtu2PoωpoSA,act,Pκ¯orΘλSA,act,PωtuSA,act,P,
(86)
Γ=ptru3PoωptSA,act,PλSA,act,Pκ¯orΘωouSA,act,P,
(87)
Γ=ptur4PoωptSA,act,PλSA,act,PωuoSA,act,Pκ¯orΘ.
(88)

At least three indices in Eqs. (85)–(88) must correspond to active orbitals in order for the elements to be nonzero, and the identity of each index that does not need to be active is denoted by the superscript (e.g., 3 indicates that the third index need not be active). To emphasize this point, the active orbital indices t, u, and v are used when possible. Although p generally designates a general MO index, p must correspond to an active MO in Eqs. (86)–(88) for the element to be nonzero. Note that p is a general MO index in Eq. (85). We use this index, however, in order to have a clean expression for the orbital correction to the CI Lagrangian, where the first index in each factorized density matrix is a general MO index. Because the terms defined in Eqs. (85)–(88) must have three active indices to be nonzero, only ERIs with at most two general MO indices are required to recover the contribution to the CI Lagrangian arising from these terms, and these ERIs were previously built to solve the coupled perturbed equations. The remaining contribution to the orbital correction to the CI Lagrangian can then be recovered using a combination of J and K builds and matrix products, as shown in Eq. (84). The J and K builds are implicitly transformed into the MO space for the sake of brevity, but these calculations are performed in the AO basis.

With the effective OPDM, TPDM, and CI Lagrangian defined, it is possible to compute the nuclear gradient for a specific electronic state. It is straightforward to compute the one-electron contributions to the gradient by transforming the effective OPDM and CI Lagrangian into the AO basis and contracting with the derivative one-election integrals and overlap integrals, respectively. As detailed in our previous work, one can recover the two-electron contribution to the gradient using derivative J and K matrices,

J(𝑨,𝑩)ξ=μνρσ(μν|ρσ)ξAμνBρσ,
(89)
K(𝑨,𝑩)ξ=μνρσ(μρ|νσ)ξAμνBρσ,
(90)

where A and B are arbitrary matrices. To accomplish this, one needs only to transform each matrix pair in Eqs. (75)–(77) into the AO basis and contract with the appropriate J and K derivative matrix. In Eqs. (75)–(77), a J gradient evaluation is required when p/q are paired and r/s are paired. Otherwise, a K gradient evaluation is required

pqrsΓpqrse(pq|rs)ξ=PλΘ,act,PJ(𝝎Θ,act,P,𝝎Θ,act,P)ξ+12J(𝜸Θ,𝜸Θ)ξ12J(𝜸Θ,act,𝜸Θ,act)ξ14K(𝜸Θ,𝜸Θ)ξ+14K(𝜸Θ,act,𝜸Θ,act)ξ+PλSA,act,PJ((κ¯act,allΘ)T𝝎SA,act,P,𝝎SA,act,P)ξ+PλSA,act,PJ(𝝎SA,act,P𝜿¯act,allΘ,𝝎SA,act,P)ξ+PλSA,act,PJ(𝝎SA,act,P,(κ¯act,allΘ)T𝝎SA,act,P)ξ+PλΘ,act,PJ(𝝎Θ,act,P,𝝎Θ,act,P𝜿¯act,allΘ)ξ+12J(𝜸SA,𝜸)ξ12J(𝜸SA,act,𝜸act)ξ+12J(𝜸,𝜸SA)ξ12J(𝜸act,𝜸SA,act)ξ14K(𝜸SA,𝜸)ξ+14K(𝜸SA,act,𝜸act)ξ14K(𝜸,𝜸SA)ξ+14K(𝜸act,𝜸SA,act)ξ+Pλ¯act,PJ(𝝎¯act,P,𝝎¯act,P)ξ+12J(𝜸core,𝜸¯)ξ+12J(𝜸¯,𝜸core)ξ14K(𝜸core,𝜸¯)ξ14K(𝜸¯,𝜸core)ξ,
(91)

where the superscript T indicates the matrix transpose. This is important because the orbital Lagrange multiplier matrix, 𝜿¯Θ, is antisymmetric, much like the orbital gradient.

As demonstrated by Yarkony and co-workers33,34 and discussed in our previous work,14 the calculation of the nonadiabatic coupling vectors is essentially identical to that of the gradient, with a few minor changes. To determine the NAC vector for states Θ and Λ, the coupled perturbed equations become

pqκ¯pqΘ2ESAκpqκrs+ΨIc¯ΨIΘ2ESAcΨIκrs=Trs(Θ,Λ),
(92)
pqκ¯pqΘ2ESAκpqcΦJ+ΨIc¯ΨIΘ2ESAcΨIcΦJ=0.
(93)

The equation for the NAC vector is quite similar to that of the gradient and can be expressed as

Θ|ξ|Λ=(EΘEΛ)1[pqγpqe(p|h^|q)ξ+pqrsΓpqrse(pq|rs)ξpqXpqe(p|q)ξ]12pqγpqΘΛ[(pξ|q)(q|pξ)].
(94)

The crescent above the transition density matrix denotes that this matrix is not symmetrized. The relaxed transition density matrices are being contracted with symmetric derivative electron integrals. For this reason, any contribution to the effective transition density matrices can be symmetrized, but the overlap-like matrix in the final term is antisymmetric. As such, this transition density matrix cannot be symmetrized. In addition to the differences already noted, the effective density matrices and their corresponding factorization are different, with the state-specific density matrix replacement by a transition density matrix,

𝜸e=𝜸ΘΛ+𝜸+𝜸¯,
(95)
𝚪e=𝚪ΘΛ+𝚪+𝚪¯.
(96)

As noted in our prior work, Γ¯ is simply a weighted sum of transition density matrices. As a result, the factorization of ΓΘΛ takes the same form as Eq. (77), as opposed to Eq. (75), and can be combined during the computation of the NAC vector, reducing the number of derivative J and K gradient evaluations required. Similarly, the equation for the contribution to the effective CI Lagrangian takes the form of Eqs. (82) and (83) as opposed to Eqs. (80) and (81), and can be combined with the calculation of the CI contribution to effective CI Lagrangian. Otherwise, the gradient and NAC vector programs are exactly the same.

Unrelated to our direct-compatible algorithm, we also made some technical improvements to our previous algorithm. Both the K builds and K gradient builds are substantially more efficient, when the input matrices are symmetric and identical because these routines can capitalize on this information in order to reduce the number of required operations. As a result, there are multiple instances of the K matrix and K gradient available in our code. In the case of the K matrix build, there are two routines, one for symmetric and one for asymmetric input matrices. The symmetric routine is faster than the asymmetric routine by a factor of two. During the iterative solution to the coupled perturbed equations, we build a pair of matrices called D1 and D2. In our original implementation,14 these matrices are not symmetric, but symmetrizing these matrices substantially reduces the cost of the K build

𝐃𝟏=12[00𝜿core,virtΘ000𝜿core,virtΘ,T00],
(97)
𝐃𝟐=12[0𝜿core,actΘ(2𝐈𝜸act,actSA)2𝜿core,virtΘ(2𝐈𝜸act,actSA)𝜿core,actΘ,T0𝜸act,actSA𝜿act,virtΘ2𝜿core,virtΘ,T𝜿act,virtΘ,T𝜸act,actSA0].
(98)

Of course, a minor modification to our algorithm is necessary to ensure that the result is the same. This modification is present in Algorithm 1 lines 10b–10g. We also capitalize on symmetric K build routines to reduce the cost of computing ERIs of the form (pt|qt). A good implementation will further take advantage of the fact that (pq|tu) = (pq|ut) and (pt|qu) = (qu|pt), reducing the number of J and K builds required to build the ERIs by roughly a factor of two. In the case of the K gradient build, there are three routines. The most general routine requires no information about the input matrices. The second requires the input matrices to be identical but not symmetric. This routine is twice as fast as the first routine.The fastest routine requires that the input matrices are identical and symmetric, and this routine is faster than the most general routine by a factor of four. We can reformulate matrix factorizations in Eqs. (75)–(77) in terms of identical matrix pairs by recognizing that

K(𝑨,𝑩)ξ=12[K(𝑨+𝑩,𝑨+𝑩)ξK(𝑨𝑩,𝑨𝑩)ξ].
(99)

This simultaneously reduced the number of K gradient evaluations required and reduced the cost of each evaluation. Eq. (99) is essentially a fourth possible K gradient routine, for a case where the input matrices are different but both are symmetric. This routine is twice as fast as the most general routine, improving our computation of the K gradient by roughly a factor of two. Note also that K(𝑨,𝑩)ξ=K(𝑩,𝑨)ξ, reducing the number of K gradient routines required. We can further reduce the number of K gradient calls by noticing that

γ^psΘγ^qrΘγ^psΘ,actγ^qrΘ,act=γpsΘγqrΘγpsΘ,actγqrΘ,act+γpscoreγ¯qr+γ¯psγqrcore,
(100)

where

𝜸^Θ=𝜸Θ+𝜸¯,
(101)
𝜸^Θ,act=𝜸Θ,act+𝜸¯.
(102)

Of course, Eqs. (100)–(102) are only relevant for the computation of nuclear gradients.

Algorithm 1.

An integral-direct-compatible algorithm for performing a GPU accelerated SA-CASSCF nuclear gradient calculation.

1: Obtain CI eigenvectors and the state energiesa 
2: Construct the active-by-active subset of the ERIs: (pq|tu) and (pt|qu) ▶ Eqs. (24) and (25) 
3: Construct the partially built Hessian and components of the preconditioner, M
3a: Obtain γSA, ΓSA, Fcore, and Fact(γSA) ▶ Eqs. (2), (3), (22), and (23) 
3b: Construct the orbital component of the preconditioner: 𝐌𝐎 ▶ non-braced parts of Eqs. (16)–(21) 
3c: Construct the components of 𝐌𝐂 ▶ Eqs. (61)–(65) 
3d: For all non-redundant pairs of eigenstates Ψ and Φ: 
If Ψ = Φ then compute γΨ, ΓΨ, and Fact(γΨ), and 𝐠oΨ ▶ Eqs. (13)–(15), (23), (31), and (32) 
If Ψ Φ then compute γΨΦ, ΓΨΦ, and Fact(γΨΦ), and T(Ψ,Φ) ▶ Eqs. (23), (31), (32)
and (35)–(37) 
4: Compute γΘ, ΓΘ, and Fact(γΘ) ▶ Eqs. (23), (31), and (32) 
5: Construct the orbital gradient: 𝐠oΘ ▶ Eqs. (13)–(15) 
6: Initialize incremental direct inversion of the iterative subspace (I-DIIS): 
𝐫(0)=𝐌1𝐠Θ;b𝐳(0)=𝐫(0); 𝜼(0)=0; 𝜼inc(0)=0; k=1 
Add z(0) and r(0) to I-DIIS subspace and residual vector lists respectively. 
7: loop 
8: Ifk > 1 then z(k) = DIIS extrapolated vector else z(k) = z(k−1) 
9: Ifk mod Nincc = 0 then z = z(k)else z = z(k)z(k−1) 
10: Build H𝐎𝐎𝐳 contribution to η
10a: 𝜼inc(k)=𝐌𝐎𝐳d 
10b: Build D1 and D2: ▶ Eqs. (97) and (98) 
10c: Jpq=μνCμpCνqJμν(𝑫𝟏); Kpq=μνCμpCνqKμν(𝑫𝟏) 
10d: J¯pq=μνCμpCνqJμν(𝑫𝟐); K¯pq=μνCμpCνqKμν(𝑫𝟐) 
10e: ηinc,it(k)+=2u(2δtu𝜸tu)(4Jiu2Kiu) 
10f: ηinc,ia(k)+=8J¯ia4K¯ia 
10g: ηinc,ta(k)+=2uγtu(4Jua2Kua) 
11: Build H𝐂𝐂𝐳 contribution to η: ▶ Eq. (39) 
11a: ηinc,ΨI(k)+=Φ2δΨΦωΨ[J(HIJEΨδIJ)zΦJ] 
12: Build H𝐎𝐂𝐳 contribution to η: ▶ Eq. (41) 
12a: ηinc,pq(k)+=Ψ2ωΨTpq(Ψ,zΨ) 
13: Build H𝐂𝐎𝐳 contribution to η: ▶Eq. (50) 
13a: Construct 𝐃 and 𝐃¯ ▶Eqs. (48) and (49) 
13b: Construct α, 𝐅act(𝐃), 𝐅core(𝐃¯), and (tu|vw)m ▶Eqs. (44)–(47) 
13c: Setup 𝐇 in order to compute σ ▶Eqs. (51) and (52) 
13d: ηinc,ΨI(k)+=2ωΨ[4αcIΨ+σIΨpq,ϓcIϓTpq(Ψ,ϓ)κpq] 
14: Ifk mod Ninc = 0 then𝜼(k)=𝜼inc(k) else η(k)=ηinc(k)+η(k1) 
15: 𝐫(k)=η(k)+𝐠oΘ 
16: if𝐫(k)2< threshold then
17: return z(k) 
18: end if 
19: 𝐞(k)=𝐌𝟏𝐫(k) ▶Eqs. (64) and (65) 
20: 𝐩(k)=𝐳(k)𝐞(k) 
21: Add 𝐩(k)and 𝐞(k) to DIIS subspace vector and residual vector set respectively 
22: end loop 
23: Build the state-specific density matrices: 𝜸Θand 𝚪Θ 
24: Build the components of 𝜸e and the factorized variant of 𝚪e ▶ Eqs. (67)–(72) and (75)–(77) 
25: Perform an eigenvalue decomposition of 𝚪Θ,act, 𝚪SA,act, and 𝚪¯act ▶ Eq. (74) 
26: Construct the effective Lagrangian: 𝑿e ▶ Eq. (79) 
27: Calculate the one-body contribution to the gradient: ▶ Eq. (66) 
28a: One-electron gradient: pqγpqe(p|h^|q)ξ 
28b: Overlap gradient: pqXpqe(p|q)ξ 
29: To recover the two-electron gradient: pqrsΓpqrse(pq|rs)ξ ▶Eq. (91) 
29a: loop over matrix products in the factorization of 𝚪e=𝚪Θ+𝚪+𝚪¯: ▶ Eqs. (75)–(77) 
29b: Set first and second matrices in the product equal to A and B respectively 
29c: if the indices pq and rs are paired in Eqs. (75)–(77)then
29d: Do a J gradient calculation on A and B ▶ Eq. (89) 
29e: If applicable, weight the contribution to the gradient by λact,P 
29f: else if the indices ps and qr are paired in Eqs. (75)–(77)then
29g: if A = B
29h: Do a K gradient calculation on A and B ▶ Eq. (90) 
29i: else
29j: K(𝑨,𝑩)ξ=12[K(𝑨+𝑩,𝑨+𝑩)ξK(𝑨𝑩,𝑨𝑩)ξ] ▶ Eq. (99) 
29k: end if 
29l: end if 
29m: end loop 
1: Obtain CI eigenvectors and the state energiesa 
2: Construct the active-by-active subset of the ERIs: (pq|tu) and (pt|qu) ▶ Eqs. (24) and (25) 
3: Construct the partially built Hessian and components of the preconditioner, M
3a: Obtain γSA, ΓSA, Fcore, and Fact(γSA) ▶ Eqs. (2), (3), (22), and (23) 
3b: Construct the orbital component of the preconditioner: 𝐌𝐎 ▶ non-braced parts of Eqs. (16)–(21) 
3c: Construct the components of 𝐌𝐂 ▶ Eqs. (61)–(65) 
3d: For all non-redundant pairs of eigenstates Ψ and Φ: 
If Ψ = Φ then compute γΨ, ΓΨ, and Fact(γΨ), and 𝐠oΨ ▶ Eqs. (13)–(15), (23), (31), and (32) 
If Ψ Φ then compute γΨΦ, ΓΨΦ, and Fact(γΨΦ), and T(Ψ,Φ) ▶ Eqs. (23), (31), (32)
and (35)–(37) 
4: Compute γΘ, ΓΘ, and Fact(γΘ) ▶ Eqs. (23), (31), and (32) 
5: Construct the orbital gradient: 𝐠oΘ ▶ Eqs. (13)–(15) 
6: Initialize incremental direct inversion of the iterative subspace (I-DIIS): 
𝐫(0)=𝐌1𝐠Θ;b𝐳(0)=𝐫(0); 𝜼(0)=0; 𝜼inc(0)=0; k=1 
Add z(0) and r(0) to I-DIIS subspace and residual vector lists respectively. 
7: loop 
8: Ifk > 1 then z(k) = DIIS extrapolated vector else z(k) = z(k−1) 
9: Ifk mod Nincc = 0 then z = z(k)else z = z(k)z(k−1) 
10: Build H𝐎𝐎𝐳 contribution to η
10a: 𝜼inc(k)=𝐌𝐎𝐳d 
10b: Build D1 and D2: ▶ Eqs. (97) and (98) 
10c: Jpq=μνCμpCνqJμν(𝑫𝟏); Kpq=μνCμpCνqKμν(𝑫𝟏) 
10d: J¯pq=μνCμpCνqJμν(𝑫𝟐); K¯pq=μνCμpCνqKμν(𝑫𝟐) 
10e: ηinc,it(k)+=2u(2δtu𝜸tu)(4Jiu2Kiu) 
10f: ηinc,ia(k)+=8J¯ia4K¯ia 
10g: ηinc,ta(k)+=2uγtu(4Jua2Kua) 
11: Build H𝐂𝐂𝐳 contribution to η: ▶ Eq. (39) 
11a: ηinc,ΨI(k)+=Φ2δΨΦωΨ[J(HIJEΨδIJ)zΦJ] 
12: Build H𝐎𝐂𝐳 contribution to η: ▶ Eq. (41) 
12a: ηinc,pq(k)+=Ψ2ωΨTpq(Ψ,zΨ) 
13: Build H𝐂𝐎𝐳 contribution to η: ▶Eq. (50) 
13a: Construct 𝐃 and 𝐃¯ ▶Eqs. (48) and (49) 
13b: Construct α, 𝐅act(𝐃), 𝐅core(𝐃¯), and (tu|vw)m ▶Eqs. (44)–(47) 
13c: Setup 𝐇 in order to compute σ ▶Eqs. (51) and (52) 
13d: ηinc,ΨI(k)+=2ωΨ[4αcIΨ+σIΨpq,ϓcIϓTpq(Ψ,ϓ)κpq] 
14: Ifk mod Ninc = 0 then𝜼(k)=𝜼inc(k) else η(k)=ηinc(k)+η(k1) 
15: 𝐫(k)=η(k)+𝐠oΘ 
16: if𝐫(k)2< threshold then
17: return z(k) 
18: end if 
19: 𝐞(k)=𝐌𝟏𝐫(k) ▶Eqs. (64) and (65) 
20: 𝐩(k)=𝐳(k)𝐞(k) 
21: Add 𝐩(k)and 𝐞(k) to DIIS subspace vector and residual vector set respectively 
22: end loop 
23: Build the state-specific density matrices: 𝜸Θand 𝚪Θ 
24: Build the components of 𝜸e and the factorized variant of 𝚪e ▶ Eqs. (67)–(72) and (75)–(77) 
25: Perform an eigenvalue decomposition of 𝚪Θ,act, 𝚪SA,act, and 𝚪¯act ▶ Eq. (74) 
26: Construct the effective Lagrangian: 𝑿e ▶ Eq. (79) 
27: Calculate the one-body contribution to the gradient: ▶ Eq. (66) 
28a: One-electron gradient: pqγpqe(p|h^|q)ξ 
28b: Overlap gradient: pqXpqe(p|q)ξ 
29: To recover the two-electron gradient: pqrsΓpqrse(pq|rs)ξ ▶Eq. (91) 
29a: loop over matrix products in the factorization of 𝚪e=𝚪Θ+𝚪+𝚪¯: ▶ Eqs. (75)–(77) 
29b: Set first and second matrices in the product equal to A and B respectively 
29c: if the indices pq and rs are paired in Eqs. (75)–(77)then
29d: Do a J gradient calculation on A and B ▶ Eq. (89) 
29e: If applicable, weight the contribution to the gradient by λact,P 
29f: else if the indices ps and qr are paired in Eqs. (75)–(77)then
29g: if A = B
29h: Do a K gradient calculation on A and B ▶ Eq. (90) 
29i: else
29j: K(𝑨,𝑩)ξ=12[K(𝑨+𝑩,𝑨+𝑩)ξK(𝑨𝑩,𝑨𝑩)ξ] ▶ Eq. (99) 
29k: end if 
29l: end if 
29m: end loop 
a

We only compute the CI eigenvectors and energies for states within the state-averaged space. This is also true in step 3d.

b

gΘ is the gradient of state Θ with respect to both orbital and CI parameters. The orbital portion of gΘ is equal to the orbital gradient, 𝐠oΘ, and the CI portion of gΘ is 0.

c

Ninc is a user-defined variable that determines how often we need to build 𝝈inc(k) from the full trial vector, as opposed to an incremental update to the new trial vector. We use an incremental algorithm to increase the speed of the calculation, as each incremental update becomes sparser. It is necessary to periodically build 𝝈inc(k) from the full trial vector to avoid numerical instability. In our algorithm, we use Ninc = 8.

d

The matrix, 𝐌𝐎, is the partially constructed portion of H𝐎𝐎. This is equivalent to what we referred to as the orbital-orbital part of H[2] in Ref. 14.

The extension of our AO-based SA-CASSCF gradient and nonadiabatic coupling program to treat large active spaces has been implemented within the TeraChem electronic structure package.46–48 Timings for this program were obtained on Intel Xeon E5-2643 CPUs clocked at 3.30 GHz with either NVIDIA GeForce GTX 970 or TITAN X (Pascal) GPUs. Unless otherwise noted, all calculations were performed on a single GPU in double precision. In Section IV, we will use the notation SA-Ns-CAS(m,n) to describe the details of the SA-CASSCF wavefunction, where m is the number of active electrons, and n is the number of active orbitals. Molecular geometries for microsolvated polyenes were generated using the Amber 12 simulation package49 by equilibrating the system using constant temperature molecular dynamics and minimizing the final frame. The general AMBER force field (GAFF)50,51 was used to parameterize the solute, and the force field parameters for methanol were taken from previous work.52 The structure for Dronpa-2 was taken from the protein data bank crystal structure 4UTS.53 Standard amino acid residues were parameterized using the ff99SBildn force field,54–56 while parameters for the chromophore were developed using a combination of GAFF and RESP charges from the R.E.D. Server.57–61 To determine the protonation state of the various titratable residues, we employed the PDB2PQR Server.62 The protein was solvated using the SPCFw flexible water model.63 Dronpa-2 was treated using QM/MM calculations,64–68 which were recently introduced in a developmental version of TeraChem. The MM portion of the calculation is carried out by the OpenMM molecular dynamics package,69–73 which is linked to TeraChem as a library using an API. We implemented the equations for the interaction between the QM and MM regions74,75 following the QM/MM implementation in Amber and interfaced OpenMM with TeraChem for the evaluation of QM, MM, and QM/MM terms in the energy and forces. Geometry optimizations and conical intersection searches were performed using the DL-FIND optimization package.76 We employed the gradient projection algorithm developed by Bearpark and Robb to search for conical intersections.26 The algorithm which we used to compute the gradients is presented as Algorithm 1. As in our previous work, the algorithm to compute the coupling vector is essentially identical to that used to compute the gradient, and the differences are highlighted in Algorithm 2.

Algorithm 2.

An integral-direct-compatible algorithm for performing a GPU accelerated SA-CASSCF nonadiabatic coupling vector calculation.

1-3: Same as Algorithm 1
4: Compute γΘΛ, ΓΘΛ, and Fact,ΘΛ ▶ Eqs. (23), (31), and (32) 
5: Construct the orbital interstate coupling gradient: T(Θ,Λ) ▶ Eqs. (35)–(37) 
6-22: Same as Algorithm 1, but replace 𝐠oΘ with T(Θ,Λ) 
23: Build the transition density matrices: 𝜸ΘΛ,𝚪ΘΛ,and𝜸˘ΘΛ 
24-26: Same as in Algorithm 1, but replace 𝜸Θand𝚪Θ with 𝜸ΘΛand𝚪ΘΛa 
27: Scale the effective density matrices and Lagrangian by (EΘEΛ)1 
28-30: Calculate the nonadiabatic coupling vector: ▶ Eq. (94) 
(Same as lines 27-29 in Algorithm 1
31: Antisymmetric overlap contribution: pq𝜸˘pqΘΛ[(p|qξ)(pξ|q)] 
1-3: Same as Algorithm 1
4: Compute γΘΛ, ΓΘΛ, and Fact,ΘΛ ▶ Eqs. (23), (31), and (32) 
5: Construct the orbital interstate coupling gradient: T(Θ,Λ) ▶ Eqs. (35)–(37) 
6-22: Same as Algorithm 1, but replace 𝐠oΘ with T(Θ,Λ) 
23: Build the transition density matrices: 𝜸ΘΛ,𝚪ΘΛ,and𝜸˘ΘΛ 
24-26: Same as in Algorithm 1, but replace 𝜸Θand𝚪Θ with 𝜸ΘΛand𝚪ΘΛa 
27: Scale the effective density matrices and Lagrangian by (EΘEΛ)1 
28-30: Calculate the nonadiabatic coupling vector: ▶ Eq. (94) 
(Same as lines 27-29 in Algorithm 1
31: Antisymmetric overlap contribution: pq𝜸˘pqΘΛ[(p|qξ)(pξ|q)] 
a

The factorization of 𝚪ΘΛ takes the form of Eq. (77) as opposed to Eq. (75).

We tested the efficiency of our direct-compatible, atomic orbital-based SA-CASSCF gradient program by applying it to a series of conjugated polyenes in droplets of up to 250 quantum methanol molecules. The microsolvated conjugated polyenes, octatetraene, decapentaene, and dodecahexaene, were treated at the SA-4-CAS(8,8)/6-31G, SA-5-CAS(10,10)/6-31G, and SA-5-CAS(12,12)/6-31G levels of theory, respectively. The highest excited state in each calculation corresponds to the first optically bright state, and all π and π* molecular orbitals are included within the active space. A representative subset of these systems is displayed in Figure 2.In Figure 3, the scaling of computational effort with respect to the number of basis functions is reported for the ERIs, effective Lagrangian, J gradient, and K gradient. These terms are described in Eqs. (24), (25), (79), (89), and (90), respectively. In addition, we report scaling with respect to basis set size for the partial construction of the Hessian (labeled “Hessian”), the first iteration of the coupled perturbed equations (labeled “CP-MCSCF”), and the entire gradient calculation (assuming one iteration of the coupled perturbed equations is required, labeled “Total”). Typically, 10-20 iterations of the coupled perturbed equations are required to reach convergence. However, the construction of the ERIs still has the highest cost, as can be seen in Figure 3 and will be discussed shortly. The partially constructed Hessian consists of the non-braced terms in Eqs. (16)–(21) as well as the orbital gradients, 𝑻(Ψ,Ψ), and interstate couplings, 𝑻(Ψ,ϓ), required in Eq. (50). The computationally expensive portion of each iteration of the coupled perturbed equations consists of doing a J and K matrix evaluation for each matrix defined in Eqs. (97) and (98) as well as evaluating Eqs. (39), (41), and (50). The gradient exhibits approximately O(No2) scaling, with the Hessian and K gradient construction exhibiting the worst scaling. The scaling of the entire gradient calculation tends to decrease with the size of the active space due to the fact that more time is spent computing the relatively weakly scaling ERIs. However, the scaling of the calculation depends significantly on the sparsity of the density matrices in the AO basis. The decapentaene series exhibits lower scaling than the dodecahexaene series, despite the use of a smaller active space. This occurs because the decapentaene systems happen to have a sparser representation in the AO basis, likely by chance, and most of the contributions to the gradient exhibit relatively weaker scaling. Otherwise, the scaling behavior reported here is essentially the same as that reported in our previous work,13,14 further demonstrating that the scaling is consistent as the active space increases.

FIG. 2.

Geometries of microsolvated dodecahexaene solvated by (a) 50 methanol molecules, (b) 150 methanol molecules, and (c) 250 methanol molecules.

FIG. 2.

Geometries of microsolvated dodecahexaene solvated by (a) 50 methanol molecules, (b) 150 methanol molecules, and (c) 250 methanol molecules.

Close modal
FIG. 3.

Timings (in minutes) for one gradient evaluation (assuming a single CP-MCSCF iteration) for (a) octatetraene, (b) decapentaene, and (c) dodecahexaene microsolvated in 50 to 250 quantum methanol molecules, increasing in increments of 50. Each polyene was treated at the (a) SA-4-CAS(8,8)/6-31G, (b) SA-5-CAS(10,10)/6-31G, and (c) SA-5-CAS(12,12)/6-31G levels of theory. The lowest R2 value of all the lines was 0.982. Timings were obtained using 1 NVIDIA GeForce GTX 970 GPU.

FIG. 3.

Timings (in minutes) for one gradient evaluation (assuming a single CP-MCSCF iteration) for (a) octatetraene, (b) decapentaene, and (c) dodecahexaene microsolvated in 50 to 250 quantum methanol molecules, increasing in increments of 50. Each polyene was treated at the (a) SA-4-CAS(8,8)/6-31G, (b) SA-5-CAS(10,10)/6-31G, and (c) SA-5-CAS(12,12)/6-31G levels of theory. The lowest R2 value of all the lines was 0.982. Timings were obtained using 1 NVIDIA GeForce GTX 970 GPU.

Close modal

In contrast to the results presented in our previous work,14 the ERI builds are the most expensive operation in the computation of the gradient for systems treated with large active spaces. Despite exhibiting relatively weak scaling, the number of ERIs required to solve the coupled perturbed equations increases quadratically with the number of active molecular orbitals. The size of the partially constructed Hessian and the number of J gradient calls also increase quadratically with the number of active orbitals, but these terms exhibit much lower prefactors in relation to the ERIs. This additional cost is an unavoidable consequence of employing large active spaces and has motivated the development of semi-numerical77 techniques and density fitting78,79 approximations within CASSCF. Although it is possible that our algorithm could also benefit from these approximations, our SA-CASSCF energy and response property implementation is currently both the fastest and weakest scaling such algorithm and requires neither density-fitting approximations78,79 nor overly stringent convergence thresholds.77 Moreover, there is likely still sparsity that we are not exploiting, which would reduce the cost of recovering the contributions from the ERIs. Nonetheless, our current algorithm is efficient enough to perform calculations on systems with O(10) active orbitals, O(1000) basis functions, and O(108) determinants on a single GPU in hours, significantly advancing the scale of simulations possible at the SA-CASSCF level of theory.

In our previous work,23 we developed efficient routines to compute a 𝝈 vector, the matrix vector product between the CI Hamiltonian matrix and a trial vector, which scaled linearly with the number of determinants. The construction of the OPDM and TPDM follows a very similar algorithm. As noted in Section II, the matrix vector products between the orbital-CI blocks and CI−CI blocks of the Hamiltonian can be written in terms of 𝝈 vector builds and generalized OPDM and TPDM builds. The scaling of H𝐂𝐎𝐳, H𝐎𝐂𝐳, and H𝐂𝐂𝐳 with respect to the number of determinants is displayed in Figure 4. These calculations were performed on hexadecaoctaene treated at the SA-2-CAS(16,12)/6-31G through SA-2-CAS(16,16)/6-31G levels of theory. For the SA-2-CAS(16,16)/6-31G calculation, all π and π* molecular orbitals were included within the active space, and the smaller active spaces include the lowest π* molecular orbitals in addition to all π orbitals. In accordance with our previous work, the construction of H𝐂𝐎𝐳 and H𝐂𝐂𝐳 exhibits approximately linear scaling with respect to the number of determinants. This is expected because the construction of these blocks is dominated by 𝝈 vector builds, as demonstrated in Eqs. (39) and (50). The construction of H𝐎𝐂𝐳 is somewhat more expensive and exhibits slightly worse scaling (but still nearly linear) of O(Nd1.15). This is due to the fact that it is slightly more difficult to compute a TPDM compared to a 𝝈 vector. Although the scaling of each of these blocks increases nearly linearly with the number of determinants, the number of determinants increases factorially with the size of the active space, and memory requirements are an issue as the active space increases beyond 16 electrons in 16 orbitals, which is the largest active spaces we have employed. This is an inherent limitation to all CASSCF codes and is the subject of continuing research. Despite this, our GPU-based algorithms substantially reduce the cost of performing SA-CASSCF energy and gradient calculations for both large and small active spaces.

FIG. 4.

Timings (in minutes) for the construction of H𝐎𝐂𝐳, H𝐂𝐎𝐳, and H𝐂𝐂𝐳 for hexadecaoctaene treated at the SA-2-CAS(16,12)/6-31G through SA-2-CAS(16,16)/6-31G levels of theory in increments of 1 active orbital. These terms are described by Eqs. (41), (50), and (39), respectively. The timings for H𝐂𝐎𝐳 and H𝐂𝐂𝐳 are almost identical, so we have shifted the H𝐂𝐂𝐳 line down for clarity. The lowest R2 value was 0.993. Timings were obtained using 1 NVIDIA TITAN X (Pascal) GPU.

FIG. 4.

Timings (in minutes) for the construction of H𝐎𝐂𝐳, H𝐂𝐎𝐳, and H𝐂𝐂𝐳 for hexadecaoctaene treated at the SA-2-CAS(16,12)/6-31G through SA-2-CAS(16,16)/6-31G levels of theory in increments of 1 active orbital. These terms are described by Eqs. (41), (50), and (39), respectively. The timings for H𝐂𝐎𝐳 and H𝐂𝐂𝐳 are almost identical, so we have shifted the H𝐂𝐂𝐳 line down for clarity. The lowest R2 value was 0.993. Timings were obtained using 1 NVIDIA TITAN X (Pascal) GPU.

Close modal

To further demonstrate the efficiency and accuracy of our code, we apply our method to the Dronpa-2 protein solvated in a water droplet, shown in Figure 5. Dronpa-2 is a fast switching variant of the reversibly photoswitchable protein Dronpa,80 a homologue to the green fluorescent protein, with the M159T point mutation.81 Reversible photoswitchable proteins are fluorescent proteins that can be switched between optically bright and dark states numerous times, which typically involves isomerization of the chromophore, and are important due to their use in super resolution spectroscopy.82–86 Dronpa-2 has been the subject of a variety of experiments,81,87–90 which suggest that fast photoswitching is enabled by a significant reduction in steric resistance to chromophore twisting within the protein matrix. Due to its fast switching dynamics, it was also recently studied using SA-2-CAS(6,6)/3-21G with a very small quantum region (21 atoms).91 

FIG. 5.

The Dronpa-2 protein solvated in a water droplet (left) is treated at the QM/MM SA-2-CAS(12,11)/6-31G* level of theory. The quantum region (right) consists of the chromophore, including the posttranslational modified glycine and cysteine residues, as well as the side chains of Arg66, Arg91, Ser142, Glu144, His193, and Glu211 and two crystal waters. The structure corresponds to that of the conical intersection. Carbon, oxygen, nitrogen, sulfur, and hydrogen atoms are shown in gray, red, blue, yellow, and white, respectively. In total, there are 122 quantum atoms, including the 8 hydrogen link atoms.

FIG. 5.

The Dronpa-2 protein solvated in a water droplet (left) is treated at the QM/MM SA-2-CAS(12,11)/6-31G* level of theory. The quantum region (right) consists of the chromophore, including the posttranslational modified glycine and cysteine residues, as well as the side chains of Arg66, Arg91, Ser142, Glu144, His193, and Glu211 and two crystal waters. The structure corresponds to that of the conical intersection. Carbon, oxygen, nitrogen, sulfur, and hydrogen atoms are shown in gray, red, blue, yellow, and white, respectively. In total, there are 122 quantum atoms, including the 8 hydrogen link atoms.

Close modal

Here we perform a conical intersection search on Dronpa-2 at the SA-2-CAS(12,11)/6-31G* level of theory with a significantly expanded quantum region. This active space has been used in previous studies on the closely related green fluorescent protein chromophore.92,93 Our quantum region consists of the chromophore (Cys62, Tyr63, and Gly64) as well as the side chains of Arg66, Arg91, Ser142, Glu144, His193, and Glu211 (where Glu211 is protonated). Furthermore, the backbone NH atoms from Asp65 and CO atoms from Phe61 in addition to two crystal water molecules were included in the quantum region, as displayed in Figure 5 embedded within the broader protein matrix. A total of 122 atoms are included in the quantum region for this calculation. Although this system has roughly half as many molecular orbitals as the microsolvated butadiene system from our previous paper,14 it has over 5000 times more CI parameters (213,444 total). The total number of parameters optimized during the coupled perturbed equations is 608,507, and as such, this is one of the largest conical intersection searches ever performed at the SA-CASSCF level of theory. Moreover, each step of the conical intersection search required on average 20 min on 8 NVIDIA GTX 970 GPUs in mixed precision,94 placing systems of this scale within the realm of possibility for nonadiabatic dynamics.

The MECI for Dronpa-2 is plotted in Figure 6. It is peaked or very slightly sloped, indicating that radiationless decay is likely to be quite fast once the chromophore twists. However, the fluorescence quantum yield for Dronpa-2 is 0.33,81 and there is expected to be a modest barrier to rotation. The direction of the gradient difference and coupling vectors corresponds to resonance and hula twisting motions, respectively. Of these directions, the hula twisting motion is responsible for isomerization and determines the switching yield. Experimental results81,87–90 demonstrate that the bright-to-dark state switching quantum yield is very low (4.7 × 10−2)81 and the branching ratio between the bright and dark state heavily favors the bright state. These results are consistent with the intersection topology shown in Figure 6, but excited-state dynamics simulations of Dronpa-2 are needed to estimate the branching ratio using computational models. This will be the subject of future work.

FIG. 6.

The MECI for Dronpa-2 treated at the QM/MM SA-2-CAS(12,11)/6-31G* level of theory. The MECI is peaked or slightly sloped, indicating that radiationless decay is fast once the chromophore twists. The chromophore is in a hula twist structure, and the gradient difference and NAC vectors involve bond alternation and hula twisting motions, respectively.

FIG. 6.

The MECI for Dronpa-2 treated at the QM/MM SA-2-CAS(12,11)/6-31G* level of theory. The MECI is peaked or slightly sloped, indicating that radiationless decay is fast once the chromophore twists. The chromophore is in a hula twist structure, and the gradient difference and NAC vectors involve bond alternation and hula twisting motions, respectively.

Close modal

We have generalized our previous algorithm for computing response properties for a SA-CASSCF wavefunction in order to enable efficient treatment of large active spaces. This algorithm simultaneously exploits the intrinsic sparsity in the AO basis, efficient direct-CI routines, and GPU acceleration. Even for very large active spaces, we observe effective O(No2) scaling, as opposed to the formal O(No4) scaling. Although the number of determinants increases factorially with the size of the active space, the scaling of each block of the Hessian multiplied by a trial vector is nearly linear with respect to the number of determinants. As such, we were able to carry out one of the largest SA-CASSCF gradient and nonadiabatic coupling vector calculations in a short timeframe using a single GPU. Importantly, our work demonstrates that the computationally demanding portions of the SA-CASSCF method can be formulated, essentially entirely, in terms of seven fundamental operations, including J and K matrix builds and their derivatives, as well as, generalized 𝝈, OPDM, and TPDM builds. We believe that these are fundamental operations in electronic structure theory, much like BLAS and LAPACK routines are considered fundamental operations in scientific computing.

Although our methods represent an exciting opportunity to study a variety of photochemical reactions at an unprecedented scale, we are continuing to work on methods that address the primary shortcoming of SA-CASSCF—limited treatment of dynamic electron correlation. This includes the development of tensor hypercontraction methods95–100 for multistate complete active space perturbation theory (MS-CASPT2)101,102 as well as methods that combine aspects of SA-CASSCF and density functional theory.103–106 In the meantime, our SA-CASSCF code can be applied to the wide variety of systems where dynamic correlation can be neglected (often due to a judicious choice of active space and state averaging107–109).

As mentioned in several of our past papers,13,14,23 the primary motivation for this work is to apply this code to nonadiabatic dynamics simulations. We recently developed an interface between TeraChem and full multiple spawning (FMS) nonadiabatic dynamics,24,25 and applied this code to the study of the photoinduced ring opening of provitamin D3.110 At the time, this was the largest simulation of its kind and predicted a nonadiabatic mechanism that was subsequently corroborated by experiment.111 With the development of this integral-direct compatible program, we can now simulate nonadiabatic dynamics on molecules with large active spaces (limited primarily by memory requirements for the CI vector). Although our method has not expanded the maximum tractable active space, we can perform calculations with large active spaces faster than previous codes, especially for large molecular systems. More specifically, our method allows us to routinely perform nonadiabatic dynamics on systems with several hundreds to a thousand basis functions, which corresponds to systems with O(100) quantum atoms, in active spaces with up to 12 electrons in 12 orbitals. Moreover, we can perform critical point searches on systems with O(1000) basis functions and with active spaces of up to 16 electrons in 16 orbitals. Molecules that might benefit from such treatment include DNA bases, photochromic molecular switches, and transition metal complexes, where multiple roots of different electronic character as well as the ability to treat degenerate states are important. Furthermore, our implementation of QM/MM has enabled the treatment of large protein systems. In the future, we intend to apply our code to study the dynamics of photoswitchable proteins.

See supplementary material for coordinates of the molecules used in this article as well as input and output files with detailed energy information.

This work was supported by the National Science Foundation (Grant No. ACI-1450179). J.W.S. is grateful to the National Science Foundation for an NSF Graduate Fellowship. This work used the XStream computational resource supported by the NSF MRI Program (Grant No. ACI-1429830). T.J.M. is a cofounder of PetaChem, LLC.

1.
B. O.
Roos
,
P. R.
Taylor
, and
P. E. M.
Siegbahn
,
Chem. Phys.
48
,
157
(
1980
).
2.
B. O.
Roos
,
Int. J. Quantum Chem.
18
(S14),
175
(
1980
).
3.
K.
Ruedenberg
,
M. W.
Schmidt
,
M. M.
Gilbert
, and
S. T.
Elbert
,
Chem. Phys.
71
,
41
(
1982
).
4.
K.
Ruedenberg
,
M. W.
Schmidt
, and
M. M.
Gilbert
,
Chem. Phys.
71
,
51
(
1982
).
5.
K.
Ruedenberg
,
M. W.
Schmidt
,
M. M.
Gilbert
, and
S. T.
Elbert
,
Chem. Phys.
71
,
65
(
1982
).
6.
B. O.
Roos
,
Adv. Chem. Phys.
69
,
399
(
2007
).
7.
M. W.
Schmidt
and
M. S.
Gordon
,
Annu. Rev. Phys. Chem.
49
,
233
(
1998
).
8.
R. N.
Diffenderfer
and
D. R.
Yarkony
,
J. Phys. Chem.
86
,
5098
(
1982
).
9.
D. R.
Yarkony
,
Acc. Chem. Res.
31
,
511
(
1998
).
10.
B. G.
Levine
,
J. D.
Coe
, and
T. J.
Martínez
,
J. Phys. Chem. B
112
,
405
(
2008
).
11.
B. G.
Levine
,
C.
Ko
,
J.
Quenneville
, and
T. J.
Martínez
,
Mol. Phys.
104
,
1039
(
2006
).
12.
M.
Garavelli
,
F.
Bernardi
,
M.
Olivucci
,
T.
Vreven
,
S.
Klein
,
P.
Celani
, and
M. A.
Robb
,
Faraday Discuss.
110
,
51
(
1998
).
13.
E. G.
Hohenstein
,
N.
Luehr
,
I. S.
Ufimtsev
, and
T. J.
Martínez
,
J. Chem. Phys.
142
,
224103
(
2015
).
14.
J. W.
Snyder
, Jr.
,
E. G.
Hohenstein
,
N.
Luehr
, and
T. J.
Martínez
,
J. Chem. Phys.
143
,
154107
(
2015
).
15.
P. J.
Knowles
and
N. C.
Handy
,
Chem. Phys. Lett.
111
,
315
(
1984
).
16.
J.
Olsen
,
B. O.
Roos
,
P.
Jørgensen
, and
H. J. A.
Jensen
,
J. Chem. Phys.
89
,
2185
(
1988
).
17.
S.
Zarrabian
,
C. R.
Sarma
, and
J.
Paldus
,
Chem. Phys. Lett.
155
,
183
(
1989
).
18.
G. L.
Bendazzoli
and
S.
Evangelisti
,
J. Chem. Phys.
98
,
3141
(
1993
).
19.
D. C.
Sherrill
and
H. F.
Schaefer
 III
,
Adv. Quantum Chem.
34
,
143
(
1999
).
21.
I.
Shavitt
,
Int. J. Quantum Chem.
14
,
5
(
1978
).
22.
N. C.
Handy
,
Chem. Phys. Lett.
74
,
280
(
1980
).
23.
B. S.
Fales
and
B. G.
Levine
,
J. Chem. Theory Comput.
11
,
4708
(
2015
).
24.
M.
Ben-Nun
,
J.
Quenneville
, and
T. J.
Martínez
,
J. Phys. Chem. A
104
,
5161
(
2000
).
25.
M.
Ben-Nun
and
T. J.
Martínez
,
Adv. Chem. Phys.
121
,
439
(
2002
).
26.
M. J.
Bearpark
,
M. A.
Robb
, and
H. B.
Schlegel
,
Chem. Phys. Lett.
223
,
269
(
1994
).
27.
M. R.
Manaa
and
D. R.
Yarkony
,
J. Chem. Phys.
99
,
5251
(
1993
).
29.
Y.
Yamaguchi
,
Y.
Osamura
,
J. D.
Goddard
, and
H. F.
Schaefer
,
A New Dimension to Quantum Chemistry: Analytic Derivative Methods in Ab Initio Molecular Electronic Structure Theory
(
Oxford University Press
,
New York
,
1994
).
30.
S.
Kato
and
K.
Morokuma
,
Chem. Phys. Lett.
65
,
19
(
1979
).
31.
J. D.
Goddard
,
N. C.
Handy
, and
H. F.
Schaefer
,
J. Chem. Phys.
71
,
1525
(
1979
).
32.
Y.
Osamura
,
Y.
Yamaguchi
, and
H. F.
Schaefer
,
J. Chem. Phys.
77
,
383
(
1982
).
33.
B. H.
Lengsfield
and
D. R.
Yarkony
,
Adv. Chem. Phys.
82
,
1
(
2007
).
34.
B. H.
Lengsfield
,
P.
Saxe
, and
D. R.
Yarkony
,
J. Chem. Phys.
81
,
4549
(
1984
).
35.
T.
Helgaker
and
P.
Jørgensen
,
Theor. Chim. Acta
75
,
111
(
1989
).
36.
J.
Stålring
,
A.
Bernhardsson
, and
R.
Lindh
,
Mol. Phys.
99
,
103
(
2001
).
37.
B. H.
Lengsfield
and
B.
Liu
,
J. Chem. Phys.
75
,
478
(
1981
).
38.
B. H.
Lengsfield
,
J. Chem. Phys.
77
,
4073
(
1982
).
39.
B.
Levy
,
Int. J. Quantum Chem.
4
,
297
(
1970
).
40.
P.
Siegbahn
,
A.
Heiberg
,
B.
Roos
, and
B.
Levy
,
Phys. Scr.
21
,
323
(
1980
).
41.
P. E. M.
Siegbahn
,
J.
Almlöf
,
A.
Heiberg
, and
B. O.
Roos
,
J. Chem. Phys.
74
,
2384
(
1981
).
42.
P. E. M.
Siegbahn
,
Chem. Phys. Lett.
109
,
417
(
1984
).
43.
K. L.
Bak
,
J.
Boatz
, and
J.
Simons
,
Int. J. Quantum Chem.
40
,
361
(
1991
).
44.
W. J.
Glover
,
J. Chem. Phys.
141
,
171102
(
2014
).
45.
M. P.
Deskevich
,
D. J.
Nesbitt
, and
H.-J.
Werner
,
J. Chem. Phys.
120
,
7281
(
2004
).
46.
I. S.
Ufimtsev
and
T. J.
Martínez
,
J. Chem. Theory Comput.
4
,
222
(
2008
).
47.
I. S.
Ufimtsev
and
T. J.
Martínez
,
J. Chem. Theory Comput.
5
,
1004
(
2009
).
48.
I. S.
Ufimtsev
and
T. J.
Martínez
,
J. Chem. Theory Comput.
5
,
2619
(
2009
).
49.
D. A.
Case
 et al,
AMBER 12
(
University of California
,
San Francisco
,
2012
).
50.
J. M.
Wang
,
R. M.
Wolf
,
J. W.
Caldwell
,
P. A.
Kollman
, and
D. A.
Case
,
J. Comput. Chem.
25
,
1157
(
2004
).
51.
J. M.
Wang
,
W.
Wang
,
P. A.
Kollman
, and
D. A.
Case
,
J. Mol. Graphics Model.
25
,
247
(
2006
).
52.
J. W.
Caldwell
and
P. A.
Kollman
,
J. Phys. Chem.
99
,
6208
(
1995
).
53.
M.
Kaucikas
,
A.
Fitzpatrick
,
E.
Bryan
,
A.
Struve
,
R.
Henning
,
I.
Kosheleva
,
V.
Srajer
,
G.
Groenhof
, and
J. J.
Van Thor
,
Proteins: Struct., Funct., Bioinf.
83
,
397
(
2015
).
54.
J. M.
Wang
,
P.
Cieplak
, and
P. A.
Kollman
,
J. Comput. Chem.
21
,
1049
(
2000
).
55.
V.
Hornak
,
R.
Abel
,
A.
Okur
,
B.
Strockbine
,
A.
Roitberg
, and
C.
Simmerling
,
Proteins: Struct., Funct., Bioinf.
65
,
712
(
2006
).
56.
K.
Lindorff-Larsen
,
S.
Piana
,
K.
Palmo
,
P.
Maragakis
,
J. L.
Klepeis
,
R. O.
Dror
, and
D. E.
Shaw
,
Proteins: Struct., Funct., Bioinf.
78
,
1950
(
2010
).
57.
E.
Vanquelef
,
S.
Simon
,
G.
Marquant
,
E.
Garcia
,
G.
Klimerak
,
J. C.
Delepine
,
P.
Cieplak
, and
F. Y.
Dupradeau
,
Nucleic Acids Res.
39
,
W511
(
2011
).
58.
F.
Wang
,
J. P.
Becker
,
P.
Cieplak
, and
F. Y.
Dupradeau
, R.E.D. Python: Object oriented programming for Amber force fields,
Université de Picardie - Jules Verne, Sanford Burnham Prebys Medical Discovery Institute
,
2013
.
59.
F. Y.
Dupradeau
,
A.
Pigache
,
T.
Zaffran
,
C.
Savineau
,
R.
Lelong
,
N.
Grivel
,
D.
Lelong
,
W.
Rosanski
, and
P.
Cieplak
,
Phys. Chem. Chem. Phys.
12
,
7821
(
2010
).
60.
C. I.
Bayly
,
P.
Cieplak
,
W. D.
Cornell
, and
P. A.
Kollman
,
J. Phys. Chem.
97
,
10269
(
1993
).
61.
M. J.
Frisch
 et al, gaussian 09, Revision E.01,
Gaussian, Inc.
,
Wallingford, CT, USA
,
2009
.
62.
T. J.
Dolinsky
,
J. E.
Nielsen
,
J. A.
McCammon
, and
N. A.
Baker
,
Nucleic Acids Res.
32
,
W665
(
2004
).
63.
Y. J.
Wu
,
H. L.
Tepper
, and
G. A.
Voth
,
J. Chem. Phys.
124
,
024503
(
2006
).
64.
A.
Warshel
and
M.
Levitt
,
J. Mol. Biol.
103
,
227
(
1976
).
65.
M. J.
Field
,
P. A.
Bash
, and
M.
Karplus
,
J. Comput. Chem.
11
,
700
(
1990
).
66.
J. L.
Gao
,
Acc. Chem. Res.
29
,
298
(
1996
).
67.
R. A.
Friesner
and
V.
Guallar
,
Annu. Rev. Phys. Chem.
56
,
389
(
2005
).
68.
U. C.
Singh
and
P. A.
Kollman
,
J. Comput. Chem.
7
,
718
(
1986
).
69.
P.
Eastman
 et al,
J. Chem. Theory Comput.
9
,
461
(
2013
).
70.
P.
Eastman
and
V.
Pande
,
Comput. Sci. Eng.
12
,
34
(
2010
).
71.
P.
Eastman
and
V. S.
Pande
,
J. Chem. Theory Comput.
6
,
434
(
2010
).
72.
P.
Eastman
and
V. S.
Pande
,
J. Comput. Chem.
31
,
1268
(
2010
).
73.
M. S.
Friedrichs
,
P.
Eastman
,
V.
Vaidyanathan
,
M.
Houston
,
S.
Legrand
,
A. L.
Beberg
,
D. L.
Ensign
,
C. M.
Bruns
, and
V. S.
Pande
,
J. Comput. Chem.
30
,
864
(
2009
).
74.
R. C.
Walker
,
M. F.
Crowley
, and
D. A.
Case
,
J. Comput. Chem.
29
,
1019
(
2008
).
75.
C. M.
Isborn
,
A. W.
Götz
,
M. A.
Clark
,
R. C.
Walker
, and
T. J.
Martínez
,
J. Chem. Theory Comput.
8
,
5092
(
2012
).
76.
J.
Kästner
,
J. M.
Carr
,
T. W.
Keal
,
W.
Thiel
,
A.
Wander
, and
P.
Sherwood
,
J. Phys. Chem. A
113
,
11856
(
2009
).
77.
A. A.
Granovsky
,
J. Chem. Phys.
143
,
231101
(
2015
).
78.
M. G.
Delcey
,
T. B.
Pedersen
,
F.
Aquilante
, and
R.
Lindh
,
J. Chem. Phys.
143
,
044110
(
2015
).
79.
I. F.
Galván
,
M. G.
Delcey
,
T. B.
Pedersen
,
F.
Aquilante
, and
R.
Lindh
,
J. Chem. Theory Comput.
12
,
3636
(
2016
).
80.
R.
Ando
,
H.
Mizuno
, and
A.
Miyawaki
,
Science
306
,
1370
(
2004
).
81.
R.
Ando
,
C.
Flors
,
H.
Mizuno
,
J.
Hofkens
, and
A.
Miyawaki
,
Biophys. J.
92
,
L97
(
2007
).
82.
E.
Betzig
,
G. H.
Patterson
,
R.
Sougrat
,
O. W.
Lindwasser
,
S.
Olenych
,
J. S.
Bonifacino
,
M. W.
Davidson
,
J.
Lippincott-Schwartz
, and
H. F.
Hess
,
Science
313
,
1642
(
2006
).
83.
S. T.
Hess
,
T. P. K.
Girirajan
, and
M. D.
Mason
,
Biophys. J.
91
,
4258
(
2006
).
84.
J. S.
Biteen
,
M. A.
Thompson
,
N. K.
Tselentis
,
G. R.
Bowman
,
L.
Shapiro
, and
W. E.
Moerner
,
Nat. Methods
5
,
947
(
2008
).
85.
D.
Bourgeois
and
V.
Adam
,
IUBMB Life
64
,
482
(
2012
).
86.
X. X.
Zhou
and
M. Z.
Lin
,
Curr. Opin. Chem. Biol.
17
,
682
(
2013
).
87.
A. C.
Stiel
,
S.
Trowitzsch
,
G.
Weber
,
M.
Andresen
,
C.
Eggeling
,
S. W.
Hell
,
S.
Jakobs
, and
M. C.
Wahl
,
Biochem. J.
402
,
35
(
2007
).
88.
A.
Lukacs
,
A.
Haigney
,
R.
Brust
,
K.
Addison
,
M.
Towrie
,
G. M.
Greetham
,
G. A.
Jones
,
A.
Miyawaki
,
P. J.
Tonge
, and
S. R.
Meech
,
J. Phys. Chem. B
117
,
11954
(
2013
).
89.
M.
Kaucikas
,
M.
Tros
, and
J. J.
van Thor
,
J. Phys. Chem. B
119
,
2350
(
2015
).
90.
D.
Yadav
,
F.
Lacombat
,
N.
Dozova
,
F.
Rappaport
,
P.
Plaza
, and
A.
Espagne
,
J. Phys. Chem. B
119
,
2404
(
2015
).
91.
D.
Morozov
and
G.
Groenhof
,
Angew. Chem., Int. Ed.
55
,
576
(
2016
).
92.
M. E.
Martin
,
F.
Negri
, and
M.
Olivucci
,
J. Am. Chem. Soc.
126
,
5452
(
2004
).
93.
I. V.
Polyakov
,
B. L.
Grigorenko
,
E. M.
Epifanovsky
,
A. I.
Krylov
, and
A. V.
Nemukhin
,
J. Chem. Theory Comput.
6
,
2377
(
2010
).
94.
N.
Luehr
,
I. S.
Ufimtsev
, and
T. J.
Martínez
,
J. Chem. Theory Comput.
7
,
949
(
2011
).
95.
E. G.
Hohenstein
,
R. M.
Parrish
, and
T. J.
Martínez
,
J. Chem. Phys.
137
,
044103
(
2012
).
96.
R. M.
Parrish
,
E. G.
Hohenstein
,
T. J.
Martínez
, and
C. D.
Sherrill
,
J. Chem. Phys.
137
,
224106
(
2012
).
97.
E. G.
Hohenstein
,
S. I. L.
Kokkila
,
R. M.
Parrish
, and
T. J.
Martínez
,
J. Chem. Phys.
138
,
124111
(
2013
).
98.
E. G.
Hohenstein
,
S. I. L.
Kokkila
,
R. M.
Parrish
, and
T. J.
Martínez
,
J. Phys. Chem. B
117
,
12972
(
2013
).
99.
S. I. L.
Kokkila Schumacher
,
E. G.
Hohenstein
,
R. M.
Parrish
,
L.-P.
Wang
, and
T. J.
Martínez
,
J. Chem. Theory Comput.
11
,
3042
(
2015
).
100.
C.
Song
and
T. J.
Martínez
,
J. Chem. Phys.
144
,
174111
(
2016
).
101.
K.
Andersson
,
P. A.
Malmqvist
, and
B. O.
Roos
,
J. Chem. Phys.
96
,
1218
(
1992
).
102.
T.
Shiozaki
,
W.
Győrffy
,
P.
Celani
, and
H.-J.
Werner
,
J. Chem. Phys.
135
,
081106
(
2011
).
103.
K.
Sharkas
,
A.
Savin
,
H. J. A.
Jensen
, and
J.
Toulouse
,
J. Chem. Phys.
137
,
044104
(
2012
).
104.
J.
Gao
,
A.
Grofe
,
H.
Ren
, and
P.
Bao
,
J. Phys. Chem. Lett.
7
,
5143
(
2016
).
105.
L.
Gagliardi
,
D. G.
Truhlar
,
G.
Li Manni
,
R. K.
Carlson
,
C. E.
Hoyer
, and
J. L.
Bao
,
Acc. Chem. Res.
50
,
66
(
2017
).
106.
S.
Pijeau
and
E. G.
Hohenstein
,
J. Chem. Theory Comput.
13
,
1130
(
2017
).
107.
J. D.
Coe
and
T. J.
Martínez
,
J. Phys. Chem. A
110
,
618
(
2006
).
108.
B. G.
Levine
and
T. J.
Martínez
,
J. Phys. Chem. A
113
,
12815
(
2009
).
109.
A. M.
Virshup
,
C.
Punwong
,
T. V.
Pogorelov
,
B. A.
Lindquist
,
C.
Ko
, and
T. J.
Martínez
,
J. Phys. Chem. B
113
,
3280
(
2009
).
110.
J. W.
Snyder
, Jr.
,
B. F. E.
Curchod
, and
T. J.
Martínez
,
J. Phys. Chem. Lett.
7
,
2444
(
2016
).
111.
B. D.
Smith
,
K. G.
Spears
, and
R. J.
Sension
,
J. Phys. Chem. A
120
,
6575
(
2016
).

Supplementary Material