We introduce a new implementation of time-dependent density-functional theory which allows the entire spectrum of a molecule or extended system to be computed with a numerical effort comparable to that of a single standard ground-state calculation. This method is particularly well suited for large systems and/or large basis sets, such as plane waves or real-space grids. By using a superoperator formulation of linearized time-dependent density-functional theory, we first represent the dynamical polarizability of an interacting-electron system as an off-diagonal matrix element of the resolvent of the Liouvillian superoperator. One-electron operators and density matrices are treated using a representation borrowed from time-independent density-functional perturbation theory, which permits us to avoid the calculation of unoccupied Kohn–Sham orbitals. The resolvent of the Liouvillian is evaluated through a newly developed algorithm based on the nonsymmetric Lanczos method. Each step of the Lanczos recursion essentially requires twice as many operations as a single step of the iterative diagonalization of the unperturbed Kohn–Sham Hamiltonian. Suitable extrapolation of the Lanczos coefficients allows for a dramatic reduction of the number of Lanczos steps necessary to obtain well converged spectra, bringing such number down to hundreds (or a few thousands, at worst) in typical plane-wave pseudopotential applications. The resulting numerical workload is only a few times larger than that needed by a ground-state Kohn–Sham calculation for a same system. Our method is demonstrated with the calculation of the spectra of benzene, fullerene, and of chlorophyll a.
I. INTRODUCTION
Time-dependent density-functional theory1 TDDFT stands as a promising alternative to cumbersome many-body approaches to the calculation of the electronic excitation spectra of molecular and condensed-matter systems.2 According to a theorem established by Runge and Gross in the mid-1980s,1 for any given initial state of an interacting-electron system, the external, time-dependent, potential acting on it is uniquely determined by the time evolution of the one-electron density, , for . Using this theorem, one can formally establish a time-dependent Kohn–Sham (KS) equation from which various one-particle properties of the system can be obtained as functions of time. Unfortunately, if little is known about the exchange-correlation (XC) potential in ordinary density-functional theory (DFT),3,4 even less is known about it in the time-dependent case. Most of the existing applications of TDDFT are based on the so-called adiabatic local density or adiabatic generalized gradient approximations (generically denoted in the following by the acronym ADFT),5 which amount to assuming the same functional dependence of the XC potential upon density as in the static case. Despite the crudeness of these approximations, optical spectra calculated from them are in some cases almost as accurate as those obtained from more computationally demanding many-body approaches.2 TDDFT is in principle an exact theory. Progress in understanding and characterizing the XC functional will substantially increase the predictive power of TDDFT, while (hopefully) keeping its computational requirements at a significantly lower level than that of methods based on many-body perturbation theory.
Linearization of TDDFT with respect to the strength of some external perturbation to an otherwise time-independent system leads to a non-Hermitian eigenvalue problem whose eigenvalues are excitation energies, and whose eigenvectors are related to oscillator strengths.6 Not surprisingly, this eigenvalue problem has the same structure that arises in the time-dependent Hartree–Fock theory,7,8 and the dimension of the resulting matrix (the Liouvillian) is twice the product of the number of occupied (valence) states with the number of unoccupied (conduction) states . The calculation of the Liouvillian is by itself a hard task that is often tackled directly in terms of the unperturbed KS eigenpairs. This approach requires the calculation of the full spectrum of the unperturbed KS Hamiltonian, a step that one may want to avoid when very large basis sets are used. The diagonalization of the resulting matrix can be accomplished using iterative techniques,9,10 often, but not necessarily, in conjunction with the Tamm–Dancoff approximation,11–13 which amounts to enforcing Hermiticity by neglecting the anti-Hermitian component of the Liouvillian. The use of iterative diagonalization techniques does not necessarily entail the explicit construction of the matrix to be diagonalized, but just the availability of a black-box routine that performs the product of the matrix with a test vector (“ products”). An efficient way to calculate such a product without explicitly calculating the Liouvillian can be achieved using a representation of the perturbed density matrix and of the Liouvillian superoperator borrowed from time-independent density-functional perturbation theory (DFPT).14–18 Many applications of TDDFT to atoms, molecules, and clusters have been performed within such a framework, see, for example, Refs. 5, 19, and 20. This approach is most likely to be optimal when a small number of excited states is required. In large systems, however, the number of quantum states in any given energy range grows with the system size. The number of pseudodiscrete states in the continuum also grows with the basis-set size even in a small system, thus making the calculation of individual eigenpairs of the Liouvillian more difficult and not as meaningful. This problem is sometimes addressed by directly calculating the relevant response function(s), rather than individual excitation eigenpairs.2,9,17,21 The price paid in this case is the calculation and further manipulation (inversion, multiplication) of large matrices for any individual frequency, a task which may again be impractical for large systems/basis sets, particularly when an extended portion of a richly structured spectrum is sought after. For these reasons, a method to model the absorption spectrum directly—without calculating individual excited states and not requiring the calculation, manipulation, and eventual disposal of large matrices—would be highly desirable.
Such an alternative approach to TDDFT, which avoids diagonalization altogether, was proposed by Yabana and Bertsch.22 In this method, the TDDFT KS equations are solved in the time domain and susceptibilities are obtained by Fourier analyzing the response of the system to appropriate perturbations in the linear regime. This scheme has the same computational complexity as standard time-independent (ground-state) iterative methods in DFT. For this reason, real-time methods have recently gained popularity in conjunction with the use of pseudopotentials (PPs) and real-space grids,23 and a similar success should be expected using plane-wave (PW) basis sets.24,25 The main limitation in this case is that, because of stability requirements, the time step needed for the integration of the TDDFT KS equations is very small (of the order of in typical PP applications) and decreasing as the inverse of the PW kinetic energy cutoff (or as the square of the real-space grid step).25 The resulting number of steps necessary to obtain a meaningful time evolution of the TDDFT KS equations may be exceedingly large.
In a recent letter a new method was proposed26 to calculate optical spectra in the frequency domain—thus avoiding any explicit integration of the TDDFT KS equations—which does not require any diagonalization (of either the unperturbed KS Hamiltonian or the TDDFT Liouvillian) nor any time-consuming matrix operations. Most important, the full spectrum is obtained at once without repeating time-consuming operations for different frequencies. In this method, which is particularly well suited for large systems and PW, or real-space grid, basis sets, a generalized susceptibility is represented by a matrix element of the resolvent of the Liouvillian superoperator, defined in some appropriate operator space. This matrix element is then evaluated using a Lanczos recursion technique. Each link of the Lanczos chain—that is calculated once for all frequencies—requires a number of floating-point operations which is only twice as large as that needed by a single step of the iterative calculation of a static polarizability within time-independent DFPT.14–16 This number is in turn the same as that needed in a single step of the iterative diagonalization of a ground-state KS Hamiltonian or a single step of Car–Parrinello molecular dynamics.
The purpose of the present paper is to provide an extended and detailed presentation of the method of Ref. 26 and to introduce a few methodological improvements, including a new and more efficient approach to the calculation of off-diagonal elements of the resolvent of a non-Hermitian operator and an extrapolation technique that allows one to substantially reduce the number of Lanczos recursion steps needed to calculate well converged optical spectra. The paper is organized as follows. In Sec. II we introduce the linearized Liouville equation of TDDFT, including the derivation of an expression for generalized susceptibilities in terms of the resolvent of the Liouvillian superoperator, the DFPT representation of response operators and of the Liouvillian superoperator, and the extension of the formalism to ultrasoft PPs;27 in Sec. III we describe our new Lanczos algorithm for calculating selected matrix elements of the resolvent of the Liouvillian superoperator; in Sec. IV we present a benchmark of the numerical performance of the new method, and we introduce an extrapolation technique that allows for an impressive enhancement of it; Sec. V contains applications of the new methodology to the spectra of fullerene and to chlorophyll a; Sec. VI finally contains our conclusions.
II. LINEARIZED TDDFT
The time-dependent KS equations of TDDFT read1
where
is a time-dependent KS Hamiltonian and and being the time-dependent external and Hartree plus XC potentials, respectively. In the above equation, as well as in the following, quantum-mechanical operators are denoted by a hat “̂” and Hartree atomic units are used. When no confusion can arise, local operators, such as one-electron potentials will be indicated by the diagonal of their real-space representation as in Eq. (2).
Let us now assume that the external potential is split into a time-independent part, , plus a time-dependent perturbation, , and let us assume that the ’s satisfy the initial conditions
where are ground-state eigenfunctions of the unperturbed KS Hamiltonian ,
To first order in the perturbation, the time-dependent KS equations can be cast into the form
where
are the orbital response functions, which can be chosen to be orthogonal to all of the unperturbed occupied orbitals .
Equation (1) can be equivalently expressed in terms of a quantum Liouville equation,
where is the reduced one-electron KS density matrix whose kernel reads
and the square brackets indicate a commutator. Linearization of Eq. (7) with respect to the external perturbation leads to
where is the unperturbed density matrix, , is the perturbing external potential, and is the linear variation of the Hartree plus XC potential induced by ,
In the ADFT, the functional derivative of the XC potential is assumed to be local in time, , where is the functional derivative of the ground-state XC potential, calculated at the ground-state charge density, : . In this approximation the perturbation to the XC potential, Eq. (10), therefore reads
where . By inserting Eq. (11) into Eq. (9), the linearized Liouville equation is cast into the form
where the action of the Liouvillian super operator onto , is defined as
and is the linear operator functional of whose (diagonal) kernel is given by Eq. (11). By Fourier analyzing Eq. (12) we obtain
where the tilde indicates the Fourier transform and the hat, which denotes quantum operators, has been suppressed in and in order to keep the notation simple. In the absence of any external perturbations , Eq. (14) becomes an eigenvalue equation for , whose eigenpairs describe free oscillations of the system, i.e., excited states.6 Eigenvalues correspond to excitation energies, whereas eigenvectors can be used to calculate transition oscillator strengths and/or the response of system properties to any generic external perturbation.
One is hardly interested in the response of the more general property of a system to the more general perturbation. When simulating the results of a specific spectroscopy experiment, one is instead usually interested in the response of a specific observable to a specific perturbation. The expectation value of any one-electron operator can be expressed as the trace of its product with the one-electron density matrix. The Fourier transform of the dipole linearly induced by the perturbing potential , for example, therefore reads
where is the quantum-mechanical position operator and is the solution to Eq. (14). Let us now suppose that the external perturbation is a homogeneous electric field,
The dipole given by Eq. (15) therefore reads
where the dynamical polarizability is defined by
Traces of products of operators can be seen as scalar products defined on the linear space of quantum-mechanical operators. Let and be two general one-electron operators. We define their scalar product as
Equation (18) can thus be formally written as
where
is the commutator between the position operator and the unperturbed one-electron density matrix. The results obtained so far and embodied in Eq. (20) can be summarized by saying that within TDDFT the dynamical polarizabilty can be expressed as an appropriate off-diagonal matrix element of the resolvent of the Liouvillian superoperator. A similar conclusion was reached in Ref. 17 in the context of a slightly different formalism. This statement can be extended in a straightforward way to the dynamic linear response of any observable to any local one-electron perturbation. It is worth noticing that the operators that enter the definition of the scalar product in Eq. (20) are orthogonal because is Hermitian and anti-Hermitian (being the commutator of two Hermitian operators), and the trace of the product of one Hermitian and one anti-Hermitian operators vanishes.
A. Representation of density matrices and other one-electron operators
The calculation of the polarizability using Eq. (18) or (20) implies that we should be able to compute in a superoperator linear system. The latter task, in turn, requires an explicit representation for the density-matrix response , for its commutator with the unperturbed Hamiltonian, for local operators, such as or , for their commutators with the unperturbed density matrix, as well as for the Liouvillian superoperator, or at least for its product with any relevant operators such as .
A link between the orbital and density-matrix representations of TDDFT expressed by Eqs. (5) and (9) can be obtained by linearizing the expression (8) for the time-dependent density matrix,
whose Fourier transform reads
Equation (23) shows that is univocally determined by the two sets of orbital response functions, and . A set of a number of orbitals equal to the number of occupied states, such as or , will be nicknamed a batch of orbitals. Notice that is not Hermitian because the Fourier transform of a Hermitian, time-dependent, operator is not Hermitian, unless the original operator is even with respect to time inversion. Because of the orthogonality between occupied and response orbitals , Eq. (22) implies that the matrix elements of between two unperturbed KS orbitals which are both occupied or both empty vanish , as required by the idempotency of density matrices in DFT. As a consequence, in order to calculate the response of the expectation values of a Hermitian operator such as in Eq. (15), one only needs to know and represent the occupied-empty and empty-occupied matrix elements of , and . In other terms, if we define and as the projectors onto the occupied- and empty-state manifolds, respectively, one has that
where is the component of , which can be easily and conveniently represented in terms of batches of orbitals. To this end, let us define the orbitals,
One has then
If Eqs. (27) and (28) are used to represent density matrices, then the free oscillations corresponding to setting in Eq. (14) would be described by Casida’s eigenvalue equations.6
For simplicity and without much loss of generality, from now on we will assume that the unperturbed system is time-reversal invariant, so that the unperturbed KS orbitals, and , can be assumed to be real. The two batches of orbitals and will be called the batch representation of the operator and indicated with the notation or . Scalar products between operators (traces of operator products) can be easily expressed in terms of their batch representations. Let be the batch representation of the operator . If either of the two operators, or , has vanishing and components, one has
If is Hermitian, its batch representation satisfies the relation , whereas anti-Hermiticity would imply . Due to time-reversal invariance and the consequent reality of the unperturbed KS orbitals, the batch representation of a real (imaginary) operator is real (imaginary), and the batch representation of a local operator (which is Hermitian, when real, or non-Hermitian, when complex) satisfies .
In order to solve the superoperator linear system, Eq. (14), using the batch representation, one needs to work out the batch representation of as a functional of , as well as of the various commutators appearing therein. The charge-density response to an external perturbation reads
where is the batch representation of the density-matrix response . The Hartree-plus-XC potential response is
Using Eqs. (25) and (26) the batch representation of the Hartree-plus-XC potential response therefore reads
where
[see Eq. (11)]. Let be the batch representation of a local operator . The batch representation of the commutator between and the unperturbed density matrix, , reads
The batch representation of the commutator between the unperturbed Hamiltonian and the density-matrix response, , reads
The batch representation of the product of the Liouvillian with the density-matrix response appearing in Eq. (14) reads
where the action of the and superoperators on batches of orbitals is defined as
Note that, according to Eqs. (38)–(40), the calculation of the product of the Liouvillian with a general one-electron operator in the batch representation only requires operating on a number of one-electron orbitals equal to the number of occupied KS states (number of electrons), without the need to calculate any empty states. In particular, the calculation of Eq. (40) is best performed by first calculating the HXC potential generated by the fictitious charge density , and then applying it to each unperturbed occupied KS orbital, . The projection of the resulting orbitals onto the empty-state manifold implied by the multiplication with is easily performed using the identity , as it is common practice in DFPT.
Following Tsiper,28 it is convenient to perform a 45° rotation in the space of batches and define
Equations (41) and (42) define the standard batch representation (SBR) of the density-matrix response. The SBR of the response charge density is
The SBR of a general one-electron operator is defined in a similar way. In particular, the SBR of a real Hermitian operator has zero component, whereas the SBR of the commutator of such an operator with the unperturbed density matrix has zero component. The SBR of the TDDFT Liouville equation, Eq. (14), reads
In conclusion, the batch representation of response density matrices and of general one-electron operators allows one to avoid the explicit calculation of unoccupied KS states, as well as of the Liouvillian matrix, which is mandatory when (very) large one-electron basis sets (such as PWs or real-space grids) are used to solve the ground-state problem. This representation is the natural extension to the time-dependent regime of the practice that has become common since the introduction of time-independent DFPT.14,16,30
B. Ultrasoft pseudopotentials
The formalism outlined above applies to all-electron as well as to PP calculations performed using norm-conserving PPs, which give rise to an ordinary KS ground-state eigenvalue problem. Ultrasoft pseudopotentials (USPPs),27 instead, give rise to a generalized KS ground-state eigenvalue problem and the time evolution within TDDFT has to be modified accordingly.24,25 The generalization of the TDDFT formalism to USPPs has been presented in full detail in Ref. 25, and here we limit ourselves to report the main formulas.
In the framework of USPPs, the charge density is written as a sum . The delocalized contribution is represented as the sum over the squared moduli of the KS orbitals: . The augmentation charge , instead, is written in terms of the so-called augmentation functions ,
The augmentation functions, as well as the functions , are localized in the core region of atom . Each consists of an angular momentum eigenfunction times a radial function that vanishes outside the core radius. Typically one or two such functions are used for each angular momentum channel and atom type. The indices and in Eq. (45) run over the total number of such functions for atom . In practice, the functions and are provided with the PP for each type of atom.
The advantage of using USPPs over standard norm-conserving PPs comes from this separation of the strongly localized contributions to the charge density from the more delocalized contributions. The square moduli of the KS orbitals only represent the latter part of , and therefore fewer Fourier components in the representation of the orbitals are sufficient for a correct representation of the charge density. As a consequence the kinetic energy cutoff which determines the size of the basis set can be chosen much smaller in typical USPP applications than in corresponding calculations with norm-conserving PPs. As shown in Ref. 25, the smaller basis set not only reduces the dimensions of the matrices during the computation, but it allows also for a faster convergence of spectroscopic quantities, when calculated both with real-time or with spectral Lanczos techniques (see Sec. III).
The generalized expression for the USPP charge density given above entails a more complicated structure of the KS eigenvalue problem. Instead of the standard eigenvalue Eq. (4), one now has
where the overlap operator is defined as
with and the identity operator. Consequently, the equation for the time-dependent KS orbitals, Eq. (5), also contains the overlap operator in the USPP formalism,
Using the same derivation as before, but starting from Eq. (48) instead of Eq. (5), we arrive at a SBR of the TDDFT Liouville equation in the USPP formalism. It has the same form as Eq. (44) above, but with the superoperators and replaced by
and the right-hand side of Eq. (44) by
In this case the projector onto the empty-state manifold is defined as
The inverse overlap operator appearing in these expressions can be cast in the form
which is very similar to the operator itself, given in Eq. (47), except the fact that generally connects -functions localized on different atoms. The numbers can be obtained from the condition . If the atoms are kept at fixed positions, as it is the case here, the overlap operator is independent of time and the ’s need to be calculated only once for all.
III. GENERALIZED SUSCEPTIBILITIES FROM LANCZOS RECURSION CHAINS
According to Eq. (20), the polarizability can be expressed as an appropriate off-diagonal matrix element of the resolvent of the non-Hermitian Liouvillian (super-) operator between two orthogonal vectors. The standard way to calculate such a matrix element is to solve first a linear system whose right-hand side is the ket of the matrix element, and to calculate then the scalar product between the solution of this linear system with the bra.9,17 The main limitation of such an approach is that solving linear systems entails the manipulation and storage of a large amount of data and that a different linear system has to be solved from scratch for each different value of the frequency. In the case of a diagonal element of a Hermitian operator, a very efficient method, based on the Lanczos factorization algorithm (Ref. 31, p. 185 and ff.) is known, which allows us to avoid the solution of the linear system altogether.32–35 Using such a method (known as the Lanczos recursion method) a diagonal matrix element of the resolvent of a Hermitian operator can be efficiently and elegantly expressed in terms of a continued fraction generated by a Lanczos recursion chain starting from the vector with respect to which one wants to calculate the matrix element.32–35 The generalization of the Lanczos recursion method to non-Hermitian operators is straightforward, based on the Lanczos biorthogonalization algorithm (Ref. 36, p. 503). This generalization naturally applies to the calculation of an off-diagonal matrix element between vectors that are not orthogonal. Less evident is how to encompass the calculation of off-diagonal matrix elements between orthogonal vectors. In Ref. (26) such matrix elements were treated using a block version of the Lanczos biorthogonalization. This approach has the drawback that a different Lanczos chain has to be calculated for the response of each different property to a given perturbation (i.e., for each different bra in the matrix element corresponding to a same ket). In the following, we generalize the recursion method of Haydock and co-workers,32–35 so as to encompass the case of an off-diagonal element of the resolvent of a non-Hermitian operator without resorting to a block variant of the algorithm and allowing us to deal with the case in which the left and the right vectors are orthogonal. This will allow us to calculate the full dynamical response of any dynamical property to a given perturbation, from a single scalar Lanczos chain.
We want to calculate quantities such as
where is a non-Hermitian matrix defined in some linear space, whose dimension will be here denoted , and and are elements of this linear space, which we suppose to be normalized: , where . For simplicity, and without loss of generality in view of applications to time-reversal invariant quantum-mechanical problems, we will assume that the linear space is defined over real numbers. Let us define a sequence of left and right vectors, and , from the following procedure, known as the Lanczos biorthogonalization algorithm (Ref. 36, p. 503),
where
and and are scaling factors for the vectors and , respectively, so that they will satisfy
Thus, from an algorithmic point of view, the right-hand sides of Eqs. (57) and (58) are evaluated first with obtained from Eq. (59). Then, the two scalars and are determined so that Eq. (60) is satisfied. Equation (60) only gives a condition on the product of and . If we call and the vectors on the right-hand sides of Eqs. (57) and (58), respectively, this condition is that . In practice one typically sets
The set of and vectors thus generated are said to be links of a Lanczos chain. In exact arithmetics, it is known that these two sequences of vectors are mutually orthogonal to each other, i.e., , where is the Kronecker symbol.
The resulting algorithm is described in detail, e.g., in Refs. 31 and 36. Let us define and as the matrices,
and let indicate the unit vector in a -dimensional space (when there is no ambiguity on the dimensionality of the space, the superscript will be dropped). The following Lanczos factorization holds in terms of the quantities calculated from the recursions Eqs. (56)–(58)
where indicates the unit matrix and is the tridiagonal matrix,
In the present case, because of the special block structure of the Liouvillian superoperator and of the right-hand side appearing in Eq. (44), at each step of the Lanczos recursion one has that is always orthogonal to , so that, according to Eq. (59), . Let us now rewrite Eq. (65) as
By multiplying Eq. (69) by on the left and by on the right, we obtain
Taking the relation into account, Eq. (70) can be cast as
where
is an array of dimension , and
is the error made when truncating the Lanczos chain at the step. Neglecting we arrive at the following approximation to defined in Eq. (54):
This approximation is the scalar product of two arrays of dimension : , where is obtained by solving a tridiagonal linear system,
Three important practical observations should be made at this point. The first is that solving tridiagonal systems is extremely inexpensive (its operation count scales linearly with the system size). The second is that the calculation of the sequence of vectors from Eq. (72) does not require the storage of the matrix. In fact, each component is the scalar product between one fixed vector and the Lanczos recursion vector , and it can be therefore calculated on the fly along the Lanczos recursion chain. We note that a slightly better approach to evaluating Eq. (74) would be via the LU factorization of the matrix . If , then , which can be implemented as the scalar product of two sequences of vectors. We finally observe that the components of decrease rather rapidly as functions of the iteration count, so that only a relatively small number of components have to be explicitly calculated. This will turn out to be essential for extrapolating the Lanczos recursion, as proposed and discussed in Sec. IV. The components of also tend to decrease, although not as rapidly. In fact, this is used to measure convergence of the Lanczos or Arnoldi algorithms for solving linear systems, see, e.g., Ref. 37.
From the algorithmic point of view, much attention is usually paid in the literature to finding suitable preconditioning strategies that would allow one to reduce the number of steps that are needed to achieve a given accuracy within a given iterative method.9 Although preconditioning can certainly help reduce the number of iterations, it will, in general, destroy the nice structure of the Lanczos factorization, Eq. (65), which is essential to avoid repeating the time-consuming factorization of the Liouvillian for different frequencies. In the next section we will show how a suitable extrapolation of the Lanczos coefficient allows for a substantial reduction of the number of iterations without affecting (but rather exploiting) the nice structure of the Lanczos factorization, Eqs. (66) and (65).
We conclude that the nonsymmetric Lanczos algorithm allows one to easily calculate a systematic approximation to the off-diagonal matrix elements of the resolvent of a non-Hermitian matrix. It is easily seen that, in the case of a diagonal matrix element, this same algorithm would lead to a continued-fraction representation of the matrix element. Although the representation of Eq. (71), which is needed in the case of a nondiagonal element, is less elegant than the continued-fraction one, its actual implementation is in practice no more time consuming from the numerical point of view.
The idea of using the Lanczos algorithm to compute functions such the one in Eq. (54) is not new. In control theory, this function is called a transfer function, and it is used to analyze the frequency response of a system much like it is done here. Using the Lanczos algorithm for computing transfer functions has been considered in, e.g., Refs. 38 and 39. The Lanczos and Arnoldi methods are also important tools in the closely related area of model reduction in control theory, see, e.g., Ref. 40.
IV. BENCHMARKING THE NEW ALGORITHM AND ENHANCING ITS NUMERICAL PERFORMANCE
In this section we proceed to a numerical benchmark of the new methodology against the test case of the benzene molecule, a system for which several TDDFT studies already exist and whose optical spectrum is known to be accurately described by ADFT.23,24,26,41 A careful inspection of the convergence of the calculated spectrum with respect to the length of the Lanczos chain allows us to formulate a simple extrapolation scheme that dramatically enhances the numerical performance of our method. All the calculations reported in the present paper have been performed using the Quantum ESPRESSO distribution of codes for PW DFT calculations.42 Ground-state calculations have been performed with the PWscf code contained therein, whereas TDDFT linear response calculations have been performed with a newly developed code, soon to be included in the distribution.
A. Numerical benchmark
The benchmark has been performed using the Perdew–Burke–Ernzerhof43 (PBE) XC functional and USPPs (Refs. 25, 27, and 44) with a PW basis set up to a kinetic energy cutoff of ( for the charge density). This corresponds to a wavefunction basis set of about 25 000 PWs, resulting in a Liouvillian superoperator whose dimension is of the order of 750 000. Periodic boundary conditions have been used, with the molecule placed horizontally flat in a tetragonal supercell of . The absorption spectrum is calculated as , where is the spherical average (average of the diagonal elements) of the molecular dipole polarizability. A small imaginary part has been added to the frequency argument, , so as to regularize the spectrum. This shift into the complex frequency plane has the effect of introducing a spurious width into the discrete spectral lines. In the continuous part of the spectrum, truncation of the Lanczos chain to any finite order results in the discretization of the spectrum, which appears then as the superposition of discrete peaks. The finite width of the spectral lines has in this case the effect of broadening spectral features finer than the imaginary part of the frequency, thus reestablishing the continuous character of the spectrum. The optimal value of the imaginary part of the frequency is slightly larger than the minimum separation between pseudodiscrete peaks and depends in principle on the details of the system being studied, as well as on the length of the Lanczos chain and on the spectral region. Throughout our benchmark we have rather arbitrarily set . Later in this section, we will see that the length of the Lanczos chain can be effectively and inexpensively increased up to any arbitrarily large size. By doing so, the distance between neighboring (pseudo-) discrete states in the continuum correspondingly decreases, thus making the choice of noncritical.
In Fig. 1 we report our results for the absorption spectrum of the benzene molecule. The agreement is quite good with both experimental data45 and previous theoretical work.23,24,26,41 Above the ionization threshold the TDDFT spectrum displays a fine structure (wiggles), which is not observed in experiments and that was suggested in Ref. 41 to be due to size effects associated with the use of a finite simulation cell. Finite-size effects on the fine structure of the continuous portion of the spectrum are illustrated in Fig. 2, where we display the spectrum of benzene as calculated using two simulation cells of different sizes.
Absorption spectrum calculated using Lanczos method with ultrasoft pseudopotentials. The figure shows the curve at different numbers of recursive steps; a vertical shift has been introduced for clarity.
Absorption spectrum calculated using Lanczos method with ultrasoft pseudopotentials. The figure shows the curve at different numbers of recursive steps; a vertical shift has been introduced for clarity.
Comparison with experimental results of the converged spectrum of benzene for two different sizes of the cell; for the larger cell the structure in the continuum decreases and reproduces the experimental curve better. Theoretical results have been scaled so as to obtain the same integrated intensity as experimental data.
Comparison with experimental results of the converged spectrum of benzene for two different sizes of the cell; for the larger cell the structure in the continuum decreases and reproduces the experimental curve better. Theoretical results have been scaled so as to obtain the same integrated intensity as experimental data.
Our purpose here is not to analyze the features of the benzene absorption spectrum, which are already rather well understood (see, e.g., Ref. 26), nor to dwell on the comparison between theory and experiment, but rather to understand what determines the convergence properties of the new method and how they can be possibly improved. The number of iterations necessary to achieve perfect convergence lies in this case in between 2000 and 3000: The improvement with respect to Ref. 26 is due to the smaller basis set, made possible by the use of USPPs, as discussed in Ref. 25. It is worth noting that the convergence is faster in the low energy portion of the spectrum. This does not come as a surprise because the lowest eigenvalues of the tridiagonal matrix generated by the Lanczos recursion converge to the corresponding lowest eigenvalues of the Liouvillian, and the lower the state the faster the convergence.
A comparison between the performance of the new method with a more conventional approach based on the diagonalization of the Liouvillian is not quite possible because the two methodologies basically address different aspects of a same problem. While the former addresses the global spectrum of a specific response function, the latter focuses on individual excited states, from which many different response functions can be obtained, at the price of calculating all of the individual excited states in a given energy range. It suffices to say that it would be impractical to obtain a spectrum over such a wide energy range as in Fig. 1 by calculating all the eigenvalues of a Liouvillian. Using a localized basis set, which is the common choice in most implementations of Casida’s equations, it would be extremely difficult to resolve the high lying portion of the one-electron spectrum with the needed accuracy; using PW or real-space grid basis sets, instead, the calculation of very many individual eigenpairs of the Liouvillian matrix whose dimension easily exceeds several hundred thousands would be a formidable task.
The comparison with time propagation schemes is instead straightforward and more meaningful. Typical time steps and total simulation lengths in a time propagation approach are of the orders of and , respectively, which amounts to about 10 000 time propagation steps.25 The computational workload at each time step depends on the propagation algorithm. One commonly used technique relies on a fourth-order Taylor expansion of the propagator, together with so-called enforced time-reversal symmetry.46 In this case, each time step requires eight products and one evaluation of the Hartree plus XC potentials. In the Lanczos approach, each step requires two products and one evaluation of the potentials. Furthermore, the response orbitals must be kept orthogonal to the ground-state orbitals. This results in a computational effort which is markedly lower for one recursion step than for one time propagation step. Considering both the larger number of propagation steps and the more expensive workload at each step, we can conclude that our approach is definitely more efficient than the time propagation method to compute linear response spectra.
B. Analysis
In Fig. 3 we report the values of the coefficients and of the last component of the vectors [see Eqs. (61) and (72)], as functions of the Lanczos iteration count, calculated when the direction of both the perturbing electric field and the observed molecular dipole are parallel to each other and lying in the molecular plane (this would correspond to calculating, say, the component of the polarizability tensor). It is seen that the components rapidly tend toward zero, whereas the ’s tend to a constant. Closer inspection of the behavior of the latter actually shows that the values of the ’s are scattered around two close, but distinct, values for even and odd iteration counts. The coefficients [see Eq. (62)] are, in general, equal to the ’s and only in correspondence to few iterative steps they assume a negative sign.
(a) Numerical behavior of the components of the vector given by Eq. (72). Apart for some out of scale oscillation they tend rapidly to a value near zero. (b) Numerical behavior of coefficients given by Eq. (61). They tend rapidly to a constant value even if some larger scale oscillation is present. In the inset the same data are shown on a different scale and with different colors for odd (green) and even (red) coefficients.
(a) Numerical behavior of the components of the vector given by Eq. (72). Apart for some out of scale oscillation they tend rapidly to a value near zero. (b) Numerical behavior of coefficients given by Eq. (61). They tend rapidly to a constant value even if some larger scale oscillation is present. In the inset the same data are shown on a different scale and with different colors for odd (green) and even (red) coefficients.
All the calculated quantities, , , and , are subject to occasional oscillations off their asymptotic values. The observed oscillations in the coefficients and can be partly explained from their definitions, namely, Eqs. (61) and (62). Note at first that there is a risk of a division by zero in Eq. (61). The occurrence of a zero scalar product is known as a breakdown. Several situations can take place. A lucky breakdown occurs when one of the vectors or is zero. Then the eigenvalues of the tridiagonal matrix are exact eigenvalues of the matrix , as the space spanned by (when ) becomes invariant under or the space spanned by (when ) becomes invariant under . Another known situation is when neither nor are zero but their inner product is exactly zero. This situation has been studied extensively in the literature: See, e.g., Refs. 47–49. One of the main results is that when this breakdown takes place at step , say, then it is often still possible to continue the algorithm by essentially bypassing step and computing or some , where , directly. Intermediate vectors are needed to replace the missing and , but these vectors are no longer biorthogonal, resulting in the tridiagonal matrix being spoiled by bumps in its upper part. The class of algorithms devised to exploit this idea are called look-ahead Lanczos algorithms (LALAs), a term first employed in Ref. 47. Finally, an incurable breakdown occurs when no pair with some can be constructed which has the desired orthogonality properties. Note that this type of breakdown cannot occur in the Hermitian Lanczos algorithm because it is a manifestation of the existence of vectors in the right subspace (linear span of ) that are orthogonal to all the vectors of the left subspace (linear span of ), which is impossible when these spaces are the same ( in the Hermitian case). Clearly, exact breakdowns (inner product exactly equal to zero) almost never occur in practice. Near breakdowns correspond to small values of these inner products that determine the observed jumps in the coefficients . The components of the ’s can also show jumps in their magnitude since the vectors will occasionally display large variations in norm. In finite-precision arithmetics the occurrence and precise location of (near-) breakdowns would also depend on the numerical details of the implementation. Nevertheless in our experience the Lanczos recursion always converges to the same final spectrum whose calculation is therefore robust.
In order to understand what determines this robustness, we note that our algorithm amounts to implicitly solving a linear system by an iterative procedure based on a Lanczos scheme. This procedure is mathematically equivalent to the bi conjugate gradient BiCG algorithm.37 The observed robustness is therefore consistent with what is known of BiCG.37 In BiCG, the vector iterates lose their theoretical (bi-) orthogonality and the scalars used to generate the recurrence may correspondingly display very large oscillations, yet the solution of the linear system, which is a linear combination of the vector iterates, usually converges quite well. Because of this inherent robustness of the algorithm, we preferred not to use any of the several available LALAs. The shortcomings that these algorithms are designed to cure not being critical, the marginal advantages that they may possibly provide are outweighed by the drawback of losing the nice tridiagonal structure of the matrices generated by them.
Another difficulty with generic Lanczos algorithms is the loss of biorthogonality of the Lanczos vectors. As was mentioned earlier, in exact arithmetic, the left and right Lanczos vectors are orthogonal to each other. In the presence of round off, a severe loss of orthogonality eventually takes place. This loss of orthogonality is responsible for the appearance of the so-called ghost or spurious eigenpairs of the matrix to be inverted. As soon as the linear span of the Lanczos iterates is large enough as to contain a representation of an eigenvector to within numerical accuracy, the subsequent steps of the Lanczos process will tend to generate replicas of this eigenvector. At this point the Lanczos bases (left or right spaces) become linearly dependent to within machine precision. From the point of view of solving the systems , the effect of these replicated eigenvalues is not very important. Indeed, when thinking in terms of the BiCG algorithm, after the underlying sequence of approximations obtained from the BiCG algorithm converges to , further iterations will only add very small components to . As a result the contributions of these replicas are bound to be negligible and this is observed in practice. Thus, ghost eigenvectors have very small, if any, oscillator strengths and their contribution to the wanted inner products , which approximate in Eq. (54), will be negligible, in general.
C. Extrapolating the Lanczos recursion chain
The fast decrease of the components of implies that the quality of the calculated spectrum depends only on the first few hundreds of them. Specifically, if we set the components of the vector equal to zero in Eq. (71) after, say, 300–400 iterations, but we keep the dimension of the tridiagonal matrix of the order of 2–3000, the resulting spectrum appears to be still perfectly converged. Unfortunately, a relatively large number of iterations seems to be necessary to generate a tridiagonal matrix of adequate dimension. The regular behavior of the ’s for large iteration counts suggests an inexpensive strategy to extrapolate the Lanczos recursion. Let us fix the dimension of the tridiagonal matrix in Eq. (71) to some very large value (say, ) and define an effective vector and matrix by setting the component of equal to zero for and the component of equal to the appropriate estimate of the asymptotic value for odd or even iteration counts, obtained from iterations up to . In general, as previously noted, it very seldom occurs that and have a different sign, and we found that extrapolating them to the same positive value does not invalidate significantly the accuracy of the extrapolation.
In Fig. 4 we display the spectra obtained from the extrapolation procedure just outlined, which from now on will be referred to as the biconstant extrapolation of the Lanczos coefficients. One sees that the extrapolated spectrum is at perfect convergence already for a very modest value of in between and , a substantial improvement with respect to the results shown in Fig. 1. Note that this extrapolation procedure, although necessarily approximate, offers a practical solution to the problem of recovering a continuous spectrum from a limited number of recursion steps. As the dimension of the tridiagonal matrix appearing in Eq. (71) can be made arbitrarily large at a very small cost, the distance between neighboring pseudodiscrete eigenvalues in the continuous part of the spectrum can be made correspondingly small, thus allowing us to chose the imaginary part of the frequency basically as small as wanted.
Convergence of the absorption spectrum of benzene using the extrapolation procedure described in the text. After iterations the components of are set to zero and the ’s are extrapolated. The curves have been shifted vertically for clarity.
Convergence of the absorption spectrum of benzene using the extrapolation procedure described in the text. After iterations the components of are set to zero and the ’s are extrapolated. The curves have been shifted vertically for clarity.
A qualitative insight into the asymptotic behavior of the Lanczos recursion coefficients can be obtained from the analogy with the continued-fraction expansion of the local density of states (LDOS) for tight-binding (TB) Hamiltonians, a problem that has been the breeding ground for the application of Lanczos recursion methods to electronic-structure theory.32–35 Since the late 1970s it has been known that the coefficients of the continued-fraction expansion of a connected LDOS asymptotically tend to a constant—which equals one-fourth of the band width—whereas they oscillate between two values in the presence of a gap: In the latter case the average of the two limits equals one-fourth of the total band width, whereas their difference equals one-half the energy gap.50 These results can be easily verified in the case of a one dimensional TB Hamiltonian with constant hopping parameter which leads to the continued fraction,
where the sign of the square root has to be chosen so as to make for . In this case, one sees that the imaginary part of Green’s function (which equals the LDOS) is nonvanishing over a band that extends between and . In the case where consecutive hopping parameters of the recursion chain oscillate between two values, and , which we assume to be positive, resulting Green’s function reads
In this case we obtain two bands between and and between and .
In our case, the relevant band width of the Liouvillian superoperator extends from minus to plus the maximum excitation energy. In a PP-PW PP scheme, in turn, the latter is of the order of the PW kinetic energy cutoff , whereas the gap is of the order of twice the optical gap . We conclude that the asymptotic values for the and coefficients of the Liouvillian Lanczos chain are and . In Fig. 5(a) we report the behavior of the values of the coefficients of the Liouville Lanczos chain calculated for benzene, versus the iteration count, for different PW kinetic energy cutoffs. In Fig. 5(b) the average asymptotic value is plotted against the kinetic energy cutoff, demonstrating a linear dependence , in remarkable agreement to the qualitative analysis described above. Also the difference between the asymptotic values for odd and even iteration counts is in remarkable qualitative agreement with the optical gap .
(a) Behavior of ’s coefficients of benzene for different values of the kinetic energy cutoff. (b) The asymptotic values plotted as a function of the kinetic energy cutoff; the figure shows that they can be connected by a straight line with slope of about 0.5.
(a) Behavior of ’s coefficients of benzene for different values of the kinetic energy cutoff. (b) The asymptotic values plotted as a function of the kinetic energy cutoff; the figure shows that they can be connected by a straight line with slope of about 0.5.
V. APPLICATION TO LARGE MOLECULES: FULLERENE AND CHLOROPHYLL A
In order to demonstrate the applicability of our methodology to large molecular systems, we present now the results obtained for the prototypical cases of fullerene and chlorophyll a.
Let us begin with fullerene, a molecule whose spectrum has already been the subject of extensive experimental51,52 and theoretical22,41,51,53–55 studies. Our calculations have been performed with the molecule lying in a cubic supercell with side length of , using the PBE XC functional. USPPs (Ref. 44) have been used, with a PW basis set with a kinetic energy cutoff of for the wavefunctions and for the charge density. This corresponds to almost 60 000 PWs, with a dimension of the full Liouvillian exceeding . The Lanczos recursion is explicitly computed up to different orders , as indicated in Sec. III, and then extrapolated up to , as discussed in Sec. IV (this value has been chosen rather arbitrarily because both the numerical workload and the resulting accuracy depend very little on it, as long as it is large enough). In order to regularize the solution of the tridiagonal linear system, Eq. (75), the spectrum has been calculated at complex frequencies whose imaginary part is (also rather arbitrarily) taken as . In Fig. 6(a) we report the calculated absorption spectrum between 0 and . We see that, upon biconstant extrapolation, the calculated spectrum is already very good after as few as 500 iterations, and practically indistinguishable from convergence after 1500 iterations. The resulting spectrum depends very little on the precise choice of as long as its value is smaller than the distance between neighboring eigenvalues of the tridiagonal matrix of Eq. (75) (this distance goes to zero in the continuous portion of the spectrum as grows large) and larger than the desired resolution of the calculated spectrum.
(a) Convergence of the absorption spectrum of fullerene calculated between 0 and . The curves have been shifted vertically for clarity. (b) The fully converged absorption spectrum of fullerene compared with experimental results (Ref. 51) in the energy range between 2 and . Theoretical results have been scaled so as to match the integrated intensity of the experimental data.
(a) Convergence of the absorption spectrum of fullerene calculated between 0 and . The curves have been shifted vertically for clarity. (b) The fully converged absorption spectrum of fullerene compared with experimental results (Ref. 51) in the energy range between 2 and . Theoretical results have been scaled so as to match the integrated intensity of the experimental data.
The overall shape of our calculated spectrum is in substantial agreement to that calculated in Refs. 22, 41, and 54 using the real-time approach to TDDFT. In spite of the small atomic basis set used in Ref. 54, the number of integration steps that was found to be necessary to reach an acceptable accuracy (6000) is rather larger than ours. In Refs. 22 and 41 where a real-space grid representation of the KS equations was adopted, instead, the number of time steps employed is one to two orders of magnitude larger than ours (30–40 000). Considering that several products are necessary at each time step in real-time approaches, whereas only two are needed at each Lanczos recursion, we see that our combined use of the Liouville–Lanczos algorithm with biconstant extrapolation and USPPs with PWs allows for a substantial reduction of the numerical workload, while keeping the full accuracy allowed by the XC functionals currently available.
The absorption spectrum of is characterized by a low-lying and well structured portion (between, say, 3 and ) dominated by transitions, followed by a broader feature between 14 and determined by transitions from both and molecular orbitals. In Fig. 6(b) we compare our converged spectrum with the experimental results of Ref. 51. Despite a slight redshift compatible with that found in the calculations of Ref. 51, the overall shape of the TDDFT spectrum is in good agreement to experiment. Note that the theoretical results reported in Ref. 51, which were obtained by calculating individual eigenpairs of Casida’s equation, could hardly be extended to such a broad energy range as covered in the present calculation because too many lines would have to be calculated.
An even more challenging test is chlorophyll, a molecule which is of fundamental importance for life on Earth since it is responsible for the photosynthetic process. There are several different forms of this molecule, and we will focus on chlorophyll a . Historically the interpretation of the visible spectrum of chlorophyll relies on the four-orbital Gouterman model of porphyrins56 in which only the two highest occupied molecular orbitals and the two lowest unoccupied molecular orbitals are considered. In the last few years there have been several calculations of its low energy spectrum relying on different ab initio techniques.57–62 Despite the fact that TDDFT seems to produce spurious charge transfer states in the visible region,60 according to our calculations the overall shape of the low energy part of spectrum seems to be correctly predicted. Our calculations have been performed using a supercell of dimensions with the PW91 XC functional63 and USPPs.44 Molecular orbitals were expanded in PWs up to a kinetic energy cutoff of , while were used for the charge density. The PW basis sets consist of more than 120 000 PWs, while the dimension of the Liouvillian superoperator exceeds . In this case the imaginary part of the frequency was set to to better compare the results with experiments. In Fig. 7(a) we display the convergence of the spectrum with respect to the number of Lanczos steps, using the usual biconstant extrapolation of the coefficients, as calculated over a wide range of energy between 0 and . In Fig. 7(b) we compare the visible part of the spectrum calculated in this work with the experimental results obtained in diethyl solution in Ref. 64. The agreement with experiment is clearly good but the Soret (B) band located in the indigo region of the spectrum at is slightly redshifted in the calculation, while the red band (Q) has an opposite, blueshifted behavior. How much of this discrepancy has to be attributed to the limitations of the ADFT alone, or to a combination of them with the neglect of solvation effects remains to be ascertained.
(a) Convergence of the chlorophyll absorption spectrum between 0 and . The curves have been shifted vertically for clarity. (b) Chlorophyll absorption spectrum in the visible region for wavelengths between 400 and compared with the experimental data in diethyl ether of Ref. 64. Theoretical results have been scaled so as to obtain the same integrated intensity as experimental data.
(a) Convergence of the chlorophyll absorption spectrum between 0 and . The curves have been shifted vertically for clarity. (b) Chlorophyll absorption spectrum in the visible region for wavelengths between 400 and compared with the experimental data in diethyl ether of Ref. 64. Theoretical results have been scaled so as to obtain the same integrated intensity as experimental data.
VI. CONCLUSIONS
In this paper we have presented a new algorithmic approach to linearized TDDFT that combines the advantages of the more conventional real-time and Casida’s eigenvalue methods, while avoiding many of their drawbacks. This approach results from the combination of many elements which are individually not new in different communities, ranging from condensed matter and quantum chemistry, to control theory/engineering and signal processing. In particular, it is the natural extension to the dynamical regime of DFPT, a method made popular in the condensed-matter community by the calculation of static properties (such as dielectric, piezoelectric, elastic) and by the calculation of phonons and related properties in crystals. The main features of the new method are that it is tailored to the calculation of specific responses to specific perturbations and that the computational burden for the calculation of the complete spectrum of a given response function in a wide frequency range is comparable to that of a single static ground-state or response-function calculation. Of course, in principle, the absorption spectrum of any system extends up to infinite frequency. In practice, however, it is bounded from above by the -sum rule,65 so that a finite (and usually fairly small, as seen) number of Lanczos iterations suffices to capture it. We believe that, from the algorithmic point of view, the new method is close to optimal in its application range and that it opens thus the way to the simulation of the dynamical properties of large and very large molecular and condensed-matter systems. Assuredly, it cannot yield any better results than granted by the quality of the XC functional used to implement it. Devising new XC functionals capable of properly describing the electron-hole interaction responsible, e.g., of Rydberg and excitonic effects in the low-lying portion of the spectrum of molecular and extended systems, respectively, remains a major problem to be addressed and solved.
ACKNOWLEDGMENTS
S.B. wishes to thank Renata Wentzcovitch for hospitality at the University of Minnesota Department of Chemical Engineering and Materials Science and vLab, where some of the work reported in this paper was started in summer 2005 and the Leverhulme trust for a visiting professorship at the Department of Earth Sciences of the University College London, where this paper was written in 2007. R.G. and S.B. thank Brent Walker and Marco Saitta for participating in the early developments of the ideas on which this paper is founded. D.R. gratefully acknowledges useful conversations with Giuseppe Pastori Parravicini and correspondence with Liana Martinelli, the former also bearing the responsibility for making S.B. aware of the power and beauty of Lanczos methods. Y.S. acknowledges support from NSF under Grant No. NSF/ITR-0428774, the U.S. Department of Energy under Grant No. DE-FG-03ER25585, and from the Minnesota Supercomputer Institute.
REFERENCES
The batch representation of response functions has been rediscovered several times, and given several different names, in the quantum chemistry community, since it was introduced in the context of DFPT (Ref. 14). See, e.g., Refs. 17, 29, and 18.