The kernel polynomial method (KPM) is a powerful numerical method for approximating spectral densities. Typical implementations of the KPM require an a priori estimate for an interval containing the support of the target spectral density, and while such estimates can be obtained by classical techniques, this incurs additional computational costs. We propose a spectrum adaptive KPM based on the Lanczos algorithm without reorthogonalization, which allows the selection of KPM parameters to be deferred until after the expensive computation is finished. Theoretical results from numerical analysis are given to justify the suitability of the Lanczos algorithm for our approach, even in finite precision arithmetic. While conceptually simple, the paradigm of decoupling computation from approximation has a number of practical and pedagogical benefits, which we highlight with numerical examples.
I. INTRODUCTION
Over the past several decades, a number of moment-based schemes have been developed to approximate spectral densities of matrices.1–5 These methods typically access the matrix of interest through matrix–vector products and are, therefore, especially well suited for applications where matrix–vector products can be performed quickly or in which the matrix of interest is so large that exact diagonalization techniques are infeasible. Perhaps the most prominent of these methods is the Kernel Polynomial Method (KPM),1,4,6–8 which has found widespread use in quantum physics/chemistry1,4,6–13 and beyond.14,15
A standard implementation of the KPM8 produces an approximate spectral density via a truncated Chebyshev polynomial expansion obtained using the low-degree Chebyshev moments of the target spectral density. These moments are computed using a Chebyshev recurrence shifted and scaled to an interval of approximation containing the support of the target spectral density. If this interval does not contain the support of the target spectral density, the KPM approximation is unlikely to converge, but if the interval is too large, then the KPM approximation will lose resolution. All implementations of the KPM that we are aware of require this interval of approximation to be determined ahead of time. Thus, as a pre-processing step, it is typical to run another algorithm to determine bounds for the support of the target spectral density.
The focus of this paper is an implementation of the KPM that decouples computation from the choice of approximation method. For instance, our implementation allows many different intervals of approximation to be tested at essentially no cost once the main computation is completed. In fact, and more importantly, approximations corresponding to different families of polynomials can also be efficiently obtained. The choice of polynomial family can significantly impact the qualitative properties of the resulting KPM approximation, but the use of non-standard orthogonal polynomial families has so far been limited in practice, arguably due to the lack of a simple implementation.
Our spectrum adaptive KPM is based on the Lanczos algorithm without reorthogonalization. It is well-known that the Lanczos algorithm is unstable, and this has led to a general hesitance to use Lanczos-based methods for approximating spectral densities unless reorthogonalization is used.2,7,8,16–18 Amazingly, this instability does not limit the usefulness of the Lanczos algorithm for many tasks. We use theoretical results from the numerical analysis literature to justify the validity of our approach in finite precision arithmetic. We believe that this commentary will be of general value to the computational physics/chemistry communities. Complementary numerical experiments provide empirical evidence of the stability of the proposed algorithm.
Finally, we note that there are a number of related Lanczos-based algorithms, such as the Finite Temperature Lanczos Method (FTLM);2 see also stochastic Lanczos quadrature.3,17 These methods are also widely used in practice and are viewed by some as preferable to KPM-based methods in many settings.19–21 We do not advocate the use of the KPM over these methods nor the use of any of these methods over the KPM. Rather, our aim is to provide a new tool that allows practitioners to test all of these algorithms essentially for free. For instance, users no longer need to make an a priori decision to use the FTLM or the KPM; they can simply output both approximations and then decide which to use later.
II. THE KERNEL POLYNOMIAL METHOD
A. Damping
B. Evaluating orthogonal polynomials
C. Choice of reference density
FIG. 1. Impact of the support [a, b] on the standard Chebyshev KPM with s = 500 moments and Jackson's damping kernel. Here, [a, b] = [Emin − η, Emax + η] for η = 0, η = −0.00007(Emax − Emin), and η = 0.5(Emax − Emin), shown against a histogram of the true eigenvalues. Takeaway: even a slight underestimate of [Emin, Emax] results in a loss of convergence, while an overestimate of [Emin, Emax] results in a loss of resolution. Our spectrum adaptive KPM allows the reference density σ(E) to be chosen after computation. In fact, many different σ(E) can be efficiently obtained and compared.
The choice of reference density σ(E) impacts the qualitative features of the KPM approximation ρKPM(E), and in principle, any choice of unit-mass density with finite moments is possible.6,8 First, observe that the KPM approximation is exact if ρ(E)/σ(E) is a polynomial of degree at most s. Thus, one perspective is that σ(E) should be chosen to try and make ρ(E)/σ(E) as easy to approximate with polynomials as possible. In particular, the support of σ(E) must contain the support of ρ(E). The difficulty of such an approach is that many properties of the spectrum of H are not known ahead of time. Even the most basic properties, such as an interval [a, b] containing the support of ρ(E), are often unknown a priori and must be approximated numerically as a pre-processing step. The approach we describe in Sec. III addresses these difficulties by allowing σ(E) to be chosen after the computation has completed.
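To fix ideas, the KPM approximation takes the following schematic form, where {pn(E)} denotes the orthonormal polynomials of σ(E) and gn denotes the damping coefficients of Sec. II A (the normalization conventions here are ours and may differ from the paper's numbered equations, which are not reproduced in this excerpt):

$$\rho_{\mathrm{KPM}}(E) = \sigma(E) \sum_{n=0}^{s} g_n\, \mu_n\, p_n(E), \qquad \mu_n = \int p_n(E)\, \rho(E)\, \mathrm{d}E = \langle r |\, p_n(H)\, | r \rangle.$$

With gn = 1, this expansion reproduces ρ(E) exactly whenever ρ(E)/σ(E) is a polynomial of degree at most s, which is the exactness property referred to above.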
D. Computing Chebyshev moments
In the case that σ(E) is the Chebyshev density (11), the modified moments can be computed efficiently using properties of Chebyshev recurrences. Such an approach is described in detail in the literature,1,6,8 but as this is by far the most common approach to implementing the KPM, we provide a brief overview to put the contributions of this paper into context.
This approach is summarized in Algorithm 1 and clearly requires a and b to be specified ahead of time. If [a, b] does not contain all of the energies of H, then the algorithm is exponentially unstable. Meanwhile, if [a, b] is much wider than the energies of H, then the convergence may be slowed.
Algorithm 1. Chebyshev moments.
1: procedure Cheb-moments(H, |r⟩, k, a, b)
2:   α = (b + a)/2, β = (b − a)/2
3:   |v0⟩ = |r⟩, |v1⟩ = (1/β)(H − αI)|v0⟩
4:   μ0 = 1, μ1 = ⟨v0|v1⟩
5:   for n = 1, 2, …, k − 1 do
6:     |vn+1⟩ = (2/β)(H − αI)|vn⟩ − |vn−1⟩
7:     μ2n = 2⟨vn|vn⟩ − μ0
8:     μ2n+1 = 2⟨vn+1|vn⟩ − μ1
9:   return μ0, μ1, …, μ2k−1
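For concreteness, the following is a minimal NumPy sketch of Algorithm 1 (the function name is ours; we assume H is real symmetric and ⟨r|r⟩ = 1, so that μ0 = 1):

```python
import numpy as np

def cheb_moments(H, r, k, a, b):
    """Chebyshev moments mu_0, ..., mu_{2k-1} of the (L)DOS of H with respect
    to |r>, using the recurrence shifted and scaled to [a, b]."""
    alpha, beta = (b + a) / 2, (b - a) / 2      # center and half-width of [a, b]
    mu = np.zeros(2 * k)
    v0 = r.copy()
    v1 = (H @ v0 - alpha * v0) / beta
    mu[0], mu[1] = 1.0, v0 @ v1
    for n in range(1, k):
        v2 = 2 * (H @ v1 - alpha * v1) / beta - v0  # Chebyshev three-term recurrence
        mu[2 * n] = 2 * (v1 @ v1) - mu[0]           # moment-doubling identities:
        mu[2 * n + 1] = 2 * (v2 @ v1) - mu[1]       # T_2n = 2 T_n^2 - T_0, etc.
        v0, v1 = v1, v2
    return mu
```

As the surrounding text notes, only three vectors of length d need to be kept at any time, and each iteration costs one matrix–vector product plus a few vector operations.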
III. A LANCZOS-BASED SPECTRUM ADAPTIVE KPM
We now describe our proposed algorithm, which allows σ(E) (including its support) to be chosen after the expensive aspects of the algorithm have been carried out. This allows the KPM approximation generated to adapt to the energy spectrum of the Hamiltonian H. A related and very general approach to obtaining quadrature approximations from moment data has recently been described.21 The approach in this paper is more focused and straightforward, as we consider only the implementation of the KPM using Lanczos. Indeed, our approach can be summarized in one sentence: use the output of Lanczos to compute the KPM moments.
A. The Lanczos algorithm
Algorithm 2. Lanczos.
1: procedure Lanczos(H, |r⟩, k)
2:   |v0⟩ = |r⟩
3:   |ṽ1⟩ = H|v0⟩
4:   α0 = ⟨v0|ṽ1⟩, |ṽ1⟩ = |ṽ1⟩ − α0|v0⟩
5:   β0 = ‖ṽ1‖, |v1⟩ = (1/β0)|ṽ1⟩
6:   for n = 1, 2, …, k − 1 do
7:     |ṽn+1⟩ = H|vn⟩ − βn−1|vn−1⟩
8:     αn = ⟨vn|ṽn+1⟩
9:     |ṽn+1⟩ = |ṽn+1⟩ − αn|vn⟩
10:    βn = ‖ṽn+1‖
11:    |vn+1⟩ = (1/βn)|ṽn+1⟩
12:  return α0, α1, …, αk−1, β0, β1, …, βk−1
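A corresponding NumPy sketch of Algorithm 2, again with our naming conventions and with no reorthogonalization, is as follows:

```python
import numpy as np

def lanczos(H, r, k):
    """k steps of Lanczos on H started at |r>, without reorthogonalization.
    Returns (alpha_0, ..., alpha_{k-1}) and (beta_0, ..., beta_{k-1})."""
    alpha, beta = np.zeros(k), np.zeros(k)
    v = r / np.linalg.norm(r)
    v_prev, b_prev = np.zeros_like(v), 0.0
    for n in range(k):
        w = H @ v - b_prev * v_prev   # new Krylov direction, minus the beta term
        alpha[n] = v @ w
        w -= alpha[n] * v             # orthogonalize against the current vector only
        beta[n] = np.linalg.norm(w)
        v_prev, v, b_prev = v, w / beta[n], beta[n]
    return alpha, beta
```

Note that, like Algorithm 1, this requires storage for only three length-d vectors and performs one matrix–vector product and two inner products per iteration (counting the norm as one), in line with the cost comparison of Sec. III D.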
B. Getting the KPM moments
It is well-known that the Lanczos tridiagonal matrix Hk contains information suitable for computing the polynomial moments of ρ(E) through degree 2k.
The critical observation is that this allows us to obtain the KPM moments {μn} with respect to a reference density σ(E), which we can choose after we have run the Lanczos computation. In fact, we can cheaply produce approximations corresponding to many different reference measures, as this process no longer involves computations with H or any vectors of length d.
If we choose σ(E) to be the Chebyshev density (11), we can compute the moments by applying Algorithm 1 to Hk and |e0⟩. Of note is the fact that the Lanczos algorithm produces high accuracy estimates of extremal eigenvalues when started on a random vector.37–39 In particular, the largest and smallest eigenvalues of the top-left k × k sub-matrix of Hk approximate those of H from the interior. We can use this to help guide our choice of a and b.
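For instance, once Algorithm 2 has been run, [a, b] can be chosen from the Ritz values (the eigenvalues of the tridiagonal matrix) at a cost independent of d; the padding factor below is an illustrative choice of ours:

```python
from scipy.linalg import eigh_tridiagonal

# alpha, beta: Lanczos coefficients from Algorithm 2; costs O(k^2), independent of d
theta = eigh_tridiagonal(alpha, beta[:-1], eigvals_only=True)  # Ritz values
eta = 0.01 * (theta[-1] - theta[0])     # small outward padding, since the Ritz values
a, b = theta[0] - eta, theta[-1] + eta  # approximate the spectrum from the interior
```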
Algorithm 3. Get KPM moments from Lanczos.
1: procedure moments-from-Lanczos(Hk, σ(E))
2:   Obtain recurrence coefficients {γn, δn} for the orthogonal polynomials of σ(E)
3:   |u0⟩ = |e0⟩, |u1⟩ = (1/δ0)(Hk|u0⟩ − γ0|u0⟩)
4:   μ0 = ⟨e0|u0⟩, μ1 = ⟨e0|u1⟩
5:   for n = 1, 2, …, 2k − 1 do
6:     |un+1⟩ = (1/δn)(Hk|un⟩ − γn|un⟩ − δn−1|un−1⟩)
7:     μn+1 = ⟨e0|un+1⟩
8:   return μ0, μ1, …, μ2k
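A sketch of Algorithm 3 follows, assuming Hk is assembled as the k × k tridiagonal matrix with diagonal {αn} and off-diagonals {βn}; with this assembly the moments are exact through degree 2k − 1 (the precise degree bookkeeping is the paper's, not ours):

```python
import numpy as np

def moments_from_lanczos(alpha, beta, gamma, delta, s):
    """Modified moments mu_n = <e0|p_n(Hk)|e0> for n = 0, ..., s, where {p_n}
    are the orthonormal polynomials of sigma(E) with three-term recurrence
    delta_n p_{n+1}(E) = (E - gamma_n) p_n(E) - delta_{n-1} p_{n-1}(E)."""
    k = len(alpha)
    Hk = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    u0 = np.zeros(k); u0[0] = 1.0                 # |e0>
    u1 = (Hk @ u0 - gamma[0] * u0) / delta[0]
    mu = [u0[0], u1[0]]                           # <e0|u_n> is the first entry of u_n
    for n in range(1, s):
        u2 = (Hk @ u1 - gamma[n] * u1 - delta[n - 1] * u0) / delta[n]
        mu.append(u2[0])
        u0, u1 = u1, u2
    return np.array(mu)
```

Every operation here involves only k × k matrices and length-k vectors, so many reference densities σ(E) can be compared at negligible cost once the Lanczos coefficients are available.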
C. Stability of the Lanczos algorithm
The Lanczos algorithm is unstable in the sense that the tridiagonal matrix Hk and basis vectors {|vn⟩} produced in finite precision arithmetic may be nothing like what would be obtained in exact arithmetic. In particular, by symmetry, |vn+1⟩ is automatically orthogonal to |v0⟩, …, |vn−1⟩ in exact arithmetic. However, in finite precision arithmetic, this is no longer even approximately true, and the Lanczos basis vectors produced can completely lose orthogonality and even linear independence. To fix this, it is common to use reorthogonalization, that is, to explicitly orthogonalize |ṽn+1⟩ against |v0⟩, …, |vn−1⟩ before the normalization in Lines 10 and 11 of Algorithm 2. This, of course, drastically increases the cost of running the algorithm; in particular, the memory required for reorthogonalization scales as O(dk) and the arithmetic cost as O(dk²).
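In code, full reorthogonalization amounts to one extra projection inside the Lanczos loop; a sketch (our naming; the stored basis V is the source of the O(dk) memory and O(dk²) arithmetic):

```python
import numpy as np

def lanczos_reorth(H, r, k):
    """Lanczos with full reorthogonalization: identical to lanczos() above except
    that each new vector is explicitly orthogonalized against ALL previous ones."""
    d = len(r)
    V = np.zeros((d, k + 1))          # store the whole basis: O(dk) memory
    alpha, beta = np.zeros(k), np.zeros(k)
    V[:, 0] = r / np.linalg.norm(r)
    for n in range(k):
        w = H @ V[:, n]
        alpha[n] = V[:, n] @ w
        w -= V[:, :n + 1] @ (V[:, :n + 1].T @ w)  # project out |v_0>, ..., |v_n>
        beta[n] = np.linalg.norm(w)
        V[:, n + 1] = w / beta[n]
    return alpha, beta, V
```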
Despite the instabilities of the Lanczos algorithm without reorthogonalization, there is significant theoretical40–45 and empirical19,46,47 evidence that the Lanczos algorithm is highly effective for tasks related to density of states approximation. In fact, while not widely known, the Lanczos algorithm is forward stable for the tasks of computing Chebyshev polynomials and moments.45,48 We summarize the high-level ideas behind these works, the results of which we believe are relevant to the computational physics and chemistry communities.
Following the proof of Theorem 1, we might try to use (30) to understand the difference between Hⁿ|r⟩ and its Lanczos-based representation V Hkⁿ|e0⟩; i.e., how closely (27) holds in finite precision arithmetic. However, it is not hard to see that this results in an error term with an exponential dependence on k. This is fundamentally because the monomial basis is very poorly conditioned and is therefore not a good choice to work with numerically. If, instead, we use a Chebyshev basis, then it can be shown that the error term itself satisfies a Chebyshev-like three-term recurrence and grows only polynomially with k. This yields the following bound.
With some additional knowledge of properties satisfied by V†V in finite precision arithmetic (which can be very far from the identity), a similar result can be shown for the Chebyshev moments:
Theorem 3 implies that the (appropriately scaled) Chebyshev moments can be obtained from the matrix Hk, even when Lanczos was carried out in finite precision arithmetic. Thus, the KPM with a Chebyshev reference density can be implemented from Lanczos after Hk has been obtained, even in finite precision arithmetic. As with Theorem 2, the corresponding result of Ref. 48 is much more precise than what is stated in Theorem 3.
We remark that a similar argument also implies that, even in finite precision arithmetic, the FTLM approximation to ⟨r|f(H)|r⟩ is accurate provided f(E) has a good polynomial approximation; i.e., the same sort of result as would be expected in exact arithmetic. A more detailed analysis is found in Ref. 48. This provides a theoretical justification for the observation that the FTLM works well, even without reorthogonalization.19
D. Computational costs
The overall computational costs of our energy adaptive KPM and the standard KPM (assuming [a, b] is known) are almost identical. In addition to the storage required for H, both algorithms require storage for three vectors of length d. At iteration n, the Lanczos algorithm (without reorthogonalization) and a standard implementation of the KPM require one matrix–vector product, several vector updates, and two inner products. The algorithms also have additional lower-order arithmetic and storage costs, which depend only on the maximum number of iterations k but not the dimension d. Assuming k ≪ d, these costs are negligible.
As noted above, the standard KPM typically requires a pre-processing step in which a suitable interval [a, b] is determined, often via a few iterations of Lanczos. While our Lanczos-based KPM avoids the need for such a pre-processing step, this pre-processing step is often cheap relative to the overall computation. In such cases, the fact that our algorithm avoids this step is not particularly significant from a runtime perspective.
Finally, we note one situation in which the runtimes of the two algorithms may differ: on high-performance clusters, the time spent on communication for inner products can dominate the time spent on arithmetic computation. Indeed, in the case of the KPM, the two inner products are used to compute μn and μn+1 and do not prevent the algorithm from proceeding. Meanwhile, the two inner products in Lanczos are used to compute αn and βn, which are needed before the iteration can continue; i.e., they are blocking.
In such settings, if the cost of the pre-processing step is significant, one could run the energy adaptive KPM suggested here for several iterations to determine good choices of [a, b] and σ(E). Assuming that the Lanczos basis vectors are stored, pn(H)|r⟩ and pn−1(H)|r⟩ can be computed without any additional matrix–vector products, at which point an explicit three-term recurrence can be continued without the need for blocking inner products. We leave such implementation details to practitioners who have better knowledge of their individual computing environments.
IV. NUMERICAL EXPERIMENTS
Here, we provide several numerical examples to demonstrate the potential usefulness of our spectrum adaptive KPM. Rather than focusing on any single domain area, we aim to provide a diverse collection of examples from a variety of applications. Each of these examples demonstrates a particular aspect of the paradigm of decoupling computation from approximation, which may prove useful in practical settings. Unless stated otherwise, all KPM approximations are computed using our Lanczos-based approach.
A. Spin systems
Figure 1 shows a histogram of the exact spectrum, along with the degree s = 500 KPM approximation of the LDOS corresponding to a single random state |r⟩. The support [a, b] of the KPM approximation is varied to study the impact of estimating [Emin, Emax]. If En ∉ [a, b] for some index n, then the KPM approximation deteriorates, but if [a, b] is taken too large, then convergence is slowed. Thus, reasonably accurate estimates of Emin and Emax are required. Our spectrum adaptive KPM allows these estimates to be determined after computation with H has occurred.
In Fig. 2, we study the impact of finite precision arithmetic. We first consider how accurately the moments μn can be computed. In particular, we compare our Lanczos-based algorithm and a standard implementation of the KPM and observe that the computed moments agree to essentially machine precision. This is expected due to Theorem 3. For reference, we also show the level of orthogonality of the Lanczos vectors computed without reorthogonalization as well as the difference between the Lanczos recurrence coefficients {αn, βn} and those that would have been obtained with reorthogonalization. There is a complete loss of orthogonality, and the coefficients obtained with and without reorthogonalization are vastly different. This implies that the agreement of the moments is not simply due to the outputs with and without reorthogonalization being similar.
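The level of orthogonality reported in Fig. 2 is straightforward to compute once the basis vectors are stored; a sketch (V is the d × n matrix of Lanczos vectors):

```python
import numpy as np

def loss_of_orthogonality(V):
    """max over i < j of |<v_i|v_j>| for the columns of V."""
    G = V.T @ V                                     # Gram matrix of the basis
    return np.max(np.abs(G - np.diag(np.diag(G))))  # largest off-diagonal entry
```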
FIG. 2. Study of relevant quantities in finite precision arithmetic. Left: error between the Chebyshev moments computed directly and those computed using the finite precision Lanczos recurrence and Theorem 1. Right: level of orthogonality of the Lanczos basis vectors, maxi<j≤n |⟨vi|vj⟩|, and error in the Lanczos coefficients computed with and without reorthogonalization. Takeaway: even though the Lanczos algorithm completely lost orthogonality, the Chebyshev moments are computed stably! This enables us to stably implement the Chebyshev KPM with Lanczos, even without reorthogonalization.
B. Tight binding models
Another common use of the KPM is in tight binding models. Here, we consider a cuboid section of a zinc blende crystal with nearest neighbor hopping. The Hamiltonian and visualization of the zinc blende crystal in Fig. 3 were generated using the Kwant code.50 The resulting Hamiltonian is of dimension d = 56 000, and we output the average of 10 LDOSs corresponding to independent random samples of |r⟩.
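As a self-contained stand-in for the Kwant-generated Hamiltonian (the zinc blende model itself is not reproduced here; the lattice, sizes, and σ(E) below are illustrative choices of ours), the same workflow can be exercised on a simple cubic lattice with nearest neighbor hopping, reusing the lanczos and moments_from_lanczos sketches above:

```python
import numpy as np
import scipy.sparse as sp
from scipy.linalg import eigh_tridiagonal

L, k = 20, 120                                  # illustrative sizes, not the paper's
hop = sp.diags([1.0, 1.0], [-1, 1], shape=(L, L))
H = sp.kronsum(sp.kronsum(hop, hop), hop).tocsr()  # cubic-lattice NN hopping, d = L**3

rng = np.random.default_rng(0)
mu_avg = np.zeros(2 * k)
for _ in range(10):                             # average LDOS moments over 10 random |r>
    r = rng.standard_normal(H.shape[0])
    alpha, beta = lanczos(H, r / np.linalg.norm(r), k)
    # choose sigma(E) after the fact: Chebyshev density on [a, b] from the Ritz values
    theta = eigh_tridiagonal(alpha, beta[:-1], eigvals_only=True)
    c, h = (theta[-1] + theta[0]) / 2, 1.05 * (theta[-1] - theta[0]) / 2
    gamma = np.full(2 * k, c)                   # Jacobi coefficients of the Chebyshev
    delta = np.full(2 * k, h / 2)               # density on [c - h, c + h]
    delta[0] = h / np.sqrt(2)
    mu_avg += moments_from_lanczos(alpha, beta, gamma, delta, 2 * k - 1) / 10
```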
FIG. 3. Approximation of the DOS of a cubic zinc blende crystal with nearest neighbor hopping. Legend: reference density (36) with s = 240 and Chebyshev KPM with Jackson's damping with s = 240 and s = 800. Takeaway: large spikes in the spectrum are hard to resolve with the regular Chebyshev KPM but can be resolved with a suitable choice of reference density σ(E). Our spectrum adaptive KPM allows such an approximation to be computed without prior knowledge of the spectrum.
The DOS has a large spike at zero, and this spike causes an undamped Chebyshev KPM to exhibit massive Gibbs oscillations. These oscillations can be mitigated somewhat through the use of a damping kernel. However, as shown in Fig. 3, the resolution is then only O(1/s), and therefore, s must be taken large to get high resolution.
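For reference, the Jackson damping coefficients used in these experiments have a standard closed form (this is the usual expression from the KPM literature, e.g., Ref. 8; indexing conventions vary slightly across references):

```python
import numpy as np

def jackson_coefficients(s):
    """Jackson damping factors g_0, ..., g_s for moments mu_0, ..., mu_s."""
    n = np.arange(s + 1)
    q = np.pi / (s + 2)
    return ((s + 2 - n) * np.cos(n * q) + np.sin(n * q) / np.tan(q)) / (s + 2)
```

The damped expansion simply replaces μn by gn μn, which suppresses the Gibbs oscillations at the cost of the O(1/s) broadening discussed above.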
C. Density functional theory
In this example, we consider a matrix obtained in the study of Ga41As41H72 with the pseudo-potential algorithm for real-space electronic structure calculation (PARSEC).51 The matrix Ga41As41H72 can be obtained from the sparse matrix collection52 and has been used as a test matrix in past numerical studies.38,53 This matrix is of dimension d = 268 096 and has many low-lying eigenvalues |En| < 100 and a cluster of 123 large eigenvalues En ∈ [1299, 1301]. Thus, while the spectrum can be contained in two reasonably sized intervals, any single interval containing the spectrum must be large. In this experiment, we output the average of 10 random LDOSs.
FIG. 4. Approximation of the DOS of a matrix with a large gap in its spectrum. Legend: reference density (37) with a convolutional filter and s = 200, Chebyshev KPM with Jackson's damping with s = 200 and s = 800, and a histogram of the top eigenvalues. Takeaway: even with four times the computation, the standard Chebyshev KPM does not match the resolution of the KPM with a more suitable choice of reference density σ(E). Our spectrum adaptive KPM allows such an approximation to be computed without prior knowledge of the spectrum.
V. CONCLUSION
We have described an energy adaptive kernel polynomial method based on the Lanczos algorithm. Our approach allows many different KPM approximations to be tested at close to zero cost after computation with H has finished. Experiments demonstrate situations in which this allows the reference density σ(E) to be chosen in such a way as to improve the resolution of the approximation. It is our belief that the paradigm of separating computation from the desired approximation is beneficial in most settings in which the KPM is used and that our algorithm has the potential to improve the usability of the KPM.
AUTHOR DECLARATIONS
Conflict of Interest
The author has no conflicts to disclose.
Author Contributions
Tyler Chen: Conceptualization (equal); Formal analysis (equal); Visualization (equal); Writing – original draft (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article.
REFERENCES
If we can compute the moments of the DOS ρ(E), then we can apply the KPM to the DOS directly.
If the number of outlying eigenvalues is known, it is possible to deflate them.8,20 However, this requires additional computation, including the storage of the eigenvectors (which may be intractable).