The kernel polynomial method (KPM) is a powerful numerical method for approximating spectral densities. Typical implementations of the KPM require an *a priori* estimate for an interval containing the support of the target spectral density, and while such estimates can be obtained by classical techniques, this incurs additional computational costs. We propose a spectrum adaptive KPM based on the Lanczos algorithm without reorthogonalization, which allows the selection of KPM parameters to be deferred to after the expensive computation is finished. Theoretical results from numerical analysis are given to justify the suitability of the Lanczos algorithm for our approach, even in finite precision arithmetic. While conceptually simple, the paradigm of decoupling computation from approximation has a number of practical and pedagogical benefits, which we highlight with numerical examples.

## I. INTRODUCTION

Over the past several decades, a number of moment-based schemes have been developed to approximate spectral densities of matrices.^{1–5} These methods typically access the matrix of interest through matrix–vector products and are, therefore, especially well suited for applications where matrix–vector products can be performed quickly or in which the matrix of interest is so large that exact diagonalization techniques are infeasible. Perhaps, the most prominent of these methods is the Kernel Polynomial Method (KPM),^{1,4,6–8} which has found widespread use in quantum physics/chemistry^{1,4,6–13} and beyond.^{14,15}

A standard implementation of the KPM^{8} produces an approximate spectral density via a truncated Chebyshev polynomial expansion obtained using the low-degree Chebyshev moments of the target spectral density. These moments are computed using a Chebyshev recurrence shifted and scaled to an interval of approximation containing the support of the target spectral density. If this interval does not contain the support of the target spectral density, the KPM approximation is unlikely to converge, but if the interval is too large, then the KPM approximation will lose resolution. All implementations of the KPM that we are aware of require this interval of approximation to be determined ahead of time. Thus, as a pre-processing step, it is typical to run another algorithm to determine bounds for the support of the target spectral density.

The focus of this paper is an implementation of the KPM, which decouples computation from the choice of approximation method. For instance, our implementation allows many different intervals of approximation to be tested out at essentially no cost once the main computation is completed. In fact, and more importantly, approximations corresponding to different families of polynomials can also be efficiently obtained. The choice of polynomial family can significantly impact the qualitative properties of the resulting KPM approximation, but the use of non-standard orthogonal polynomial families has been limited in practice thus far, arguably due to the previous lack of a simple implementation.

Our spectrum adaptive KPM is based on the Lanczos algorithm *without reorthogonalization*. It is well-known that the Lanczos algorithm is unstable, and this has led to a general hesitance to use Lanczos-based methods for approximating spectral densities unless reorthogonalization is used.^{2,7,8,16–18} Amazingly, this instability does not limit the usefulness of the Lanczos algorithm for many tasks. We use theoretical results from the numerical analysis literature to justify the validity of our approach in finite precision arithmetic. We believe that this commentary will be of general value to the computational physics/chemistry communities. Complementary numerical experiments provide empirical evidence of the stability of the proposed algorithm.

Finally, we note that there are a number of related Lanczos-based algorithms, such as the Finite Temperature Lanczos Method (FTLM);^{2} see also stochastic Lanczos quadrature.^{3,17} These methods are also widely used in practice and are viewed by some as preferable to KPM based methods in many settings.^{19–21} We do not advocate the use of the KPM over these methods nor the use of any of these methods over KPM. Rather, our aim is to provide a new tool that allows practitioners to test out all of these algorithms for essentially free. For instance, users no longer need to make an *a priori* decision to use FTLM or KPM; they can simply output both approximations and then decide which to use later.

## II. THE KERNEL POLYNOMIAL METHOD

Consider a Hamiltonian **H** of dimension *d* < ∞ with the corresponding Density of States (DOS)

$$\rho(E) = \frac{1}{d} \sum_{n=0}^{d-1} \delta(E - E_n),$$

where *δ*(*E*) is a Dirac delta mass centered at zero and {*E*_{n}} are the energies (eigenvalues) of **H**. Given a state |**r**⟩, the Local Density of States (LDOS) is

$$\hat{\rho}(E) = \sum_{n=0}^{d-1} |\langle r | E_n \rangle|^2\, \delta(E - E_n).$$

If |**r**⟩ is a random state drawn from the uniform distribution on the unit hypersphere, then $\hat{\rho}(E)$ is an unbiased estimator for *ρ*(*E*). In fact, *quantum typicality*^{5,22} ensures that $\hat{\rho}(E)$ concentrates around *ρ*(*E*). It therefore often suffices to use $\hat{\rho}(E)$ as a proxy for *ρ*(*E*), but in the case that the variance of a single sample is too large, one can sample multiple random states independently and then average the corresponding LDOSs to reduce the variance.^{1,2,8,23,24} In numerical analysis, this is called stochastic trace estimation and has been analyzed in detail.^{25–30}
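The averaging just described is easy to prototype. The following sketch (the function name and the random symmetric test matrix are our own, not from the paper) checks that averaging quadratic forms over random unit states estimates the normalized trace:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200
H = rng.standard_normal((d, d))
H = (H + H.T) / 2  # a random symmetric "Hamiltonian" for illustration

def average_quadratic_form(H, n_samples, rng):
    """Estimate (1/d) tr(H) by averaging <r|H|r> over random unit states."""
    d = H.shape[0]
    total = 0.0
    for _ in range(n_samples):
        r = rng.standard_normal(d)
        r /= np.linalg.norm(r)  # uniform on the unit hypersphere
        total += r @ H @ r
    return total / n_samples

estimate = average_quadratic_form(H, 2000, rng)
exact = np.trace(H) / d
```

Since each sample is unbiased, the averaged estimate concentrates around the exact normalized trace as the number of samples grows.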

The goal of the KPM is to produce an approximation *ρ*_{KPM}(*E*) to the LDOS $\hat{\rho}(E)$.^{31} Toward this end, let *σ*(*E*) be a fixed *reference density* and expand $\hat{\rho}(E)/\sigma(E)$ as a formal polynomial series,

$$\frac{\hat{\rho}(E)}{\sigma(E)} = \sum_{n=0}^{\infty} \mu_n\, p_n(E),$$

where {*p*_{n}} are the orthonormal polynomials with respect to *σ*(*E*). Using the orthonormality of {*p*_{n}} with respect to *σ*(*E*), we can compute *μ*_{n} by

$$\mu_n = \int p_n(E)\, \hat{\rho}(E)\, \mathrm{d}E = \langle r |\, p_n(\mathbf{H})\, | r \rangle. \tag{6}$$

The {*μ*_{n}} are the so-called (modified) moments of $\hat{\rho}(E)$ with respect to the orthogonal polynomials of *σ*(*E*) and can be computed without explicit knowledge of $\hat{\rho}(E)$ using the expression in (6). Truncating the series after *s* moments naturally gives an approximation *ρ*_{KPM}(*E*) to $\hat{\rho}(E)$ defined by

$$\rho_{\text{KPM}}(E) = \sigma(E) \sum_{n=0}^{s} \mu_n\, p_n(E). \tag{7}$$

As *s* → ∞, this approximation converges to $\hat{\rho}(E)$ at a rate *O*(*s*^{−1}) in the Wasserstein distance.^{8,21,32} A rigorous theoretical understanding of other types of convergence is important, particularly if a density approximation is desired. However, this is not entirely straightforward as $\hat{\rho}(E)$ is a linear combination of Dirac deltas and therefore is not even a density itself.

### A. Damping

The KPM approximation *ρ*_{KPM}(*E*) need not be a proper density, as it may be negative for some values of *E*. This effect is particularly noticeable if $\hat{\rho}(E)/\sigma(E)$ is very spiky, so that polynomial approximations have large Gibbs oscillations. To ensure positivity, it is often suggested to use so-called *damping kernels*, which effectively results in an approximation,

$$\rho_{\text{KPM}}(E) = \sigma(E) \sum_{n=0}^{s} g_n\, \mu_n\, p_n(E),$$

where the damping coefficients {*g*_{n}} are carefully chosen. The most common choice of coefficients corresponds to the so-called Jackson damping kernel; see Ref. 8 for a detailed discussion on damping. It is also possible to simply apply a standard convolution to the resulting approximation, although this does not necessarily ensure positivity.
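As an illustration, the Jackson coefficients can be computed in a few lines. The formula below follows the standard KPM literature (cf. Ref. 8); the function name is ours:

```python
import numpy as np

def jackson_coefficients(s):
    """Jackson damping coefficients g_0, ..., g_s for a degree-s expansion.

    Multiplying the moments mu_n by g_n suppresses Gibbs oscillations and
    (for the Chebyshev density) yields a nonnegative KPM approximation.
    """
    n = np.arange(s + 1)
    q = np.pi / (s + 1)
    return ((s - n + 1) * np.cos(q * n) + np.sin(q * n) / np.tan(q)) / (s + 1)

g = jackson_coefficients(500)
```

The coefficients start at *g*_{0} = 1 and decay smoothly to zero at *n* = *s*, which is exactly the taper that suppresses the oscillations.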

### B. Evaluating orthogonal polynomials

Assuming *σ*(*E*) is a unit-mass (positive) density, the orthonormal polynomials satisfy a symmetric three-term recurrence,

$$p_{n+1}(E) = \frac{1}{\delta_n}\big( E\, p_n(E) - \gamma_n\, p_n(E) - \delta_{n-1}\, p_{n-1}(E) \big), \tag{10}$$

with *p*_{1}(*E*) = (1/*δ*_{0})(*E p*_{0}(*E*) − *γ*_{0} *p*_{0}(*E*)) and *p*_{0}(*E*) = 1, for some set of recurrence coefficients {*γ*_{n}, *δ*_{n}} depending on *σ*(*E*). Throughout, we will assume that these coefficients are known (or can be computed). Then, once the moments are known, the KPM approximation *ρ*_{KPM}(*E*) defined in (7) can be obtained by evaluating the polynomials by this recurrence and then forming a linear combination of these polynomials.
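This final evaluation step can be sketched as follows (a hypothetical helper, not code from the paper). The test case below uses the Chebyshev density on [−1, 1], whose orthonormal polynomials have *γ*_{n} = 0, *δ*_{0} = 1/√2, and *δ*_{n} = 1/2 for *n* ≥ 1:

```python
import numpy as np

def kpm_density(E, mu, gamma, delta, sigma):
    """Evaluate rho_KPM(E) = sigma(E) * sum_n mu_n p_n(E), where the p_n
    satisfy the symmetric three-term recurrence
        p_{n+1}(E) = ((E - gamma_n) p_n(E) - delta_{n-1} p_{n-1}(E)) / delta_n,
    with p_0(E) = 1 and p_1(E) = (E - gamma_0) p_0(E) / delta_0."""
    E = np.asarray(E, dtype=float)
    p_prev = np.ones_like(E)                 # p_0(E) = 1
    total = mu[0] * p_prev
    if len(mu) > 1:
        p_curr = (E - gamma[0]) * p_prev / delta[0]   # p_1(E)
        total = total + mu[1] * p_curr
        for n in range(1, len(mu) - 1):
            p_next = ((E - gamma[n]) * p_curr - delta[n - 1] * p_prev) / delta[n]
            total = total + mu[n + 1] * p_next
            p_prev, p_curr = p_curr, p_next
    return sigma(E) * total

# Chebyshev density on [-1, 1] and its recurrence coefficients.
sigma = lambda E: 1.0 / (np.pi * np.sqrt(1.0 - E**2))
gamma = [0.0, 0.0, 0.0]
delta = [1.0 / np.sqrt(2.0), 0.5, 0.5]
```

The same routine works for any reference density once its recurrence coefficients are supplied.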

### C. Choice of reference density

The most common choice of reference density is the Chebyshev density

$$\sigma_{a,b}^{T}(E) = \frac{1}{\pi\sqrt{(E - a)(b - E)}}, \qquad E \in [a, b], \tag{11}$$

supported on [*a*, *b*]. For this choice of reference density, there is an elegant and widely used algorithm for computing the moments, which we summarize in Sec. II D. In Fig. 1, we illustrate some of the impacts of the choice of [*a*, *b*] on the KPM approximation when this reference density is used. In particular, it is very important that [*a*, *b*] contains the spectrum of **H**. Thus, it is often suggested to take [*a*, *b*] = [*E*_{min} − *η*, *E*_{max} + *η*], where *η* > 0, to avoid the risk of [*a*, *b*] not containing the entire spectrum.^{8} However, using nonzero values of *η* reduces the resolution of the approach.^{33}

The choice of reference density *σ*(*E*) impacts the qualitative features of the KPM approximation *ρ*_{KPM}(*E*), and in principle, any choice of unit-mass density with finite moments is possible.^{6,8} First, observe that the KPM approximation is exact if $\hat{\rho}(E)/\sigma(E)$ is a polynomial of degree at most *s*. Thus, one perspective is that *σ*(*E*) should be chosen to try and make $\hat{\rho}(E)/\sigma(E)$ as easy to approximate with polynomials as possible. In particular, the support of *σ*(*E*) must contain the support of $\hat{\rho}(E)$. The difficulty of such an approach is that many properties of the spectrum of **H** are not known ahead of time. Even the most basic properties, such as an interval [*a*, *b*] containing the support of $\hat{\rho}(E)$, are often unknown *a priori* and must be approximated numerically as a pre-processing step.^{34,35} The approach we describe in Sec. III addresses these difficulties by allowing *σ*(*E*) to be chosen after the computation has completed.

### D. Computing Chebyshev moments

In the case that $\sigma(E) = \sigma_{a,b}^{T}(E)$ is the Chebyshev density (11), the modified moments can be computed efficiently using properties of Chebyshev recurrences. Such an approach is described in detail in the literature,^{1,6,8} but as this is by far the most common approach to implementing the KPM, we provide a brief overview to put the contributions of this paper into context.

The Chebyshev polynomials {*T*_{n}} satisfy the recurrence *T*_{n+1}(*E*) = 2*E T*_{n}(*E*) − *T*_{n−1}(*E*), with *T*_{1}(*E*) = *E* and *T*_{0}(*E*) = 1. These polynomials are orthogonal with respect to the weight $1/\sqrt{1 - E^2}$. The Chebyshev polynomials also satisfy the useful identities

$$T_{2n}(E) = 2\, T_n(E)^2 - T_0(E), \tag{14}$$

$$T_{2n+1}(E) = 2\, T_{n+1}(E)\, T_n(E) - T_1(E). \tag{15}$$

The orthonormal polynomials with respect to the Chebyshev density $\sigma_{a,b}^{T}(E)$ on [*a*, *b*] are defined by

$$\tilde{T}_n(E) = \sqrt{2 - \delta_{0,n}}\; T_n\!\left(\frac{E - \beta}{\alpha}\right), \qquad \alpha = \frac{b - a}{2}, \quad \beta = \frac{b + a}{2},$$

where *δ*_{0,n} is the Kronecker delta.

Now, define |*v*_{0}⟩ = |**r**⟩ and |*v*_{1}⟩ = (1/*α*)(**H** − *β* **I**)|*v*_{0}⟩, and iterate

$$|v_{n+1}\rangle = \frac{2}{\alpha}(\mathbf{H} - \beta\,\mathbf{I})\,|v_n\rangle - |v_{n-1}\rangle.$$

Then, at step *n*, we have $|v_n\rangle = T_n\big((\mathbf{H} - \beta\,\mathbf{I})/\alpha\big)\,|r\rangle$. Using (14) and (15), we see that the moments can then be computed by *μ*_{0} = 1, $\mu_1 = \sqrt{2}\,\langle r | v_1 \rangle$, and

$$\mu_{2n} = 2\sqrt{2}\,\langle v_n | v_n \rangle - \sqrt{2}\,\mu_0, \qquad \mu_{2n+1} = 2\sqrt{2}\,\langle v_{n+1} | v_n \rangle - \mu_1.$$

This approach is summarized in Algorithm 1 and clearly requires *a* and *b* to be specified ahead of time. If [*a*, *b*] does not contain all of the energies of **H**, then the algorithm is *exponentially unstable*. Meanwhile, if [*a*, *b*] is much wider than the energies of **H**, then convergence may be slowed.

**Algorithm 1.** Chebyshev moments.

1: procedure Cheb-moments(**H**, |**r**⟩, *k*, *a*, *b*)
2:&nbsp;&nbsp;*α* = (*b* − *a*)/2, *β* = (*b* + *a*)/2
3:&nbsp;&nbsp;|*v*_{0}⟩ = |**r**⟩, |*v*_{1}⟩ = (1/*α*)(**H** − *β* **I**)|*v*_{0}⟩
4:&nbsp;&nbsp;*μ*_{0} = 1, $\mu_1 = \sqrt{2}\,\langle v_1 | v_0 \rangle$
5:&nbsp;&nbsp;for *n* = 1, 2, …, *k* − 1 do
6:&nbsp;&nbsp;&nbsp;&nbsp;|*v*_{n+1}⟩ = (2/*α*)(**H** − *β* **I**)|*v*_{n}⟩ − |*v*_{n−1}⟩
7:&nbsp;&nbsp;&nbsp;&nbsp;$\mu_{2n} = 2\sqrt{2}\,\langle v_n | v_n \rangle - \sqrt{2}\,\mu_0$
8:&nbsp;&nbsp;&nbsp;&nbsp;$\mu_{2n+1} = 2\sqrt{2}\,\langle v_{n+1} | v_n \rangle - \mu_1$
9:&nbsp;&nbsp;return *μ*_{0}, *μ*_{1}, …, *μ*_{2k}

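A minimal NumPy sketch of Algorithm 1 might look as follows (the function name and the final extra update producing *μ*_{2k} are our own choices):

```python
import numpy as np

def cheb_moments(H, r, k, a, b):
    """Chebyshev moments mu_0, ..., mu_{2k} of the LDOS of H at state r,
    using the three-term recurrence shifted to [a, b] (cf. Algorithm 1).
    H may be any object supporting the @ (matvec) operator."""
    alpha, beta = (b - a) / 2, (b + a) / 2
    mu = np.zeros(2 * k + 1)
    v0 = r / np.linalg.norm(r)          # |v_0> (normalized, as the LDOS assumes)
    v1 = (H @ v0 - beta * v0) / alpha   # |v_1>
    mu[0] = 1.0
    mu[1] = np.sqrt(2) * (v1 @ v0)
    for n in range(1, k):
        v2 = 2 * (H @ v1 - beta * v1) / alpha - v0
        mu[2 * n] = 2 * np.sqrt(2) * (v1 @ v1) - np.sqrt(2) * mu[0]
        mu[2 * n + 1] = 2 * np.sqrt(2) * (v2 @ v1) - mu[1]
        v0, v1 = v1, v2
    # One extra even moment from the final vector |v_k>.
    mu[2 * k] = 2 * np.sqrt(2) * (v1 @ v1) - np.sqrt(2) * mu[0]
    return mu
```

For a diagonal test matrix, these moments can be checked directly against $\sqrt{2 - \delta_{0,n}}\sum_i |r_i|^2 T_n(\lambda_i)$.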

## III. A LANCZOS-BASED SPECTRUM ADAPTIVE KPM

We now describe our proposed algorithm, which allows *σ*(*E*) (including its support) to be chosen after the expensive aspects of the algorithm have been carried out. This allows the generated KPM approximation to adapt to the energy spectrum of the Hamiltonian **H**. A related and very general approach to obtaining quadrature approximations from moment data has recently been described.^{21} The approach in this paper is more focused and straightforward, as we consider only implementing the KPM using Lanczos. Indeed, our approach can be summarized in one sentence: *use the output of Lanczos to compute the KPM moments*.

### A. The Lanczos algorithm

Run on **H** and |**r**⟩ for *k* iterations, the Lanczos algorithm (Algorithm 2) iteratively produces an orthonormal basis {|*v*_{n}⟩} for the Krylov subspace

$$\mathcal{K}_{k+1}(\mathbf{H}, |r\rangle) = \operatorname{span}\{ |r\rangle, \mathbf{H}|r\rangle, \ldots, \mathbf{H}^k|r\rangle \}.$$

The basis vectors satisfy a three-term recurrence

$$\mathbf{H}|v_n\rangle = \beta_{n-1}|v_{n-1}\rangle + \alpha_n|v_n\rangle + \beta_n|v_{n+1}\rangle, \tag{23}$$

with |*v*_{1}⟩ = (1/*β*_{0})(**H**|*v*_{0}⟩ − *α*_{0}|*v*_{0}⟩) and |*v*_{0}⟩ = |**r**⟩. At each step, *α*_{n} is chosen so that ⟨*v*_{n+1}|*v*_{n}⟩ = 0, and then *β*_{n} is chosen so that ⟨*v*_{n+1}|*v*_{n+1}⟩ = 1. In exact arithmetic, |*v*_{n+1}⟩ is automatically orthogonal to |*v*_{i}⟩ for all *i* ≤ *n* − 2 by symmetry. However, those familiar with the Lanczos algorithm in finite precision arithmetic may be skeptical that we have omitted any form of reorthogonalization. We discuss the stability of our approach in finite precision arithmetic in Sec. III C and argue that reorthogonalization is not needed.

**Algorithm 2.** Lanczos (without reorthogonalization).

1: procedure Lanczos(**H**, |**r**⟩, *k*)
2:&nbsp;&nbsp;|*v*_{0}⟩ = |**r**⟩
3:&nbsp;&nbsp;$|\tilde{v}_1\rangle = \mathbf{H}|v_0\rangle$
4:&nbsp;&nbsp;$|\hat{v}_1\rangle = |\tilde{v}_1\rangle - \alpha_0|v_0\rangle$, $\alpha_0 = \langle v_0 | \tilde{v}_1 \rangle$
5:&nbsp;&nbsp;$|v_1\rangle = |\hat{v}_1\rangle/\beta_0$, $\beta_0 = \sqrt{\langle \hat{v}_1 | \hat{v}_1 \rangle}$
6:&nbsp;&nbsp;for *n* = 1, 2, …, *k* − 1 do
7:&nbsp;&nbsp;&nbsp;&nbsp;$|\tilde{v}_{n+1}\rangle = \mathbf{H}|v_n\rangle - \beta_{n-1}|v_{n-1}\rangle$
8:&nbsp;&nbsp;&nbsp;&nbsp;$\alpha_n = \langle v_n | \tilde{v}_{n+1} \rangle$
9:&nbsp;&nbsp;&nbsp;&nbsp;$|\hat{v}_{n+1}\rangle = |\tilde{v}_{n+1}\rangle - \alpha_n|v_n\rangle$
10:&nbsp;&nbsp;&nbsp;&nbsp;$\beta_n = \sqrt{\langle \hat{v}_{n+1} | \hat{v}_{n+1} \rangle}$
11:&nbsp;&nbsp;&nbsp;&nbsp;$|v_{n+1}\rangle = |\hat{v}_{n+1}\rangle/\beta_n$
12:&nbsp;&nbsp;return *α*_{0}, *α*_{1}, …, *α*_{k−1}, *β*_{0}, *β*_{1}, …, *β*_{k−1}

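A minimal NumPy sketch of Algorithm 2 (the function name is ours; for brevity, the special first step is folded into the loop):

```python
import numpy as np

def lanczos(H, r, k):
    """k steps of Lanczos without reorthogonalization (cf. Algorithm 2).
    Returns the diagonal coefficients alpha_0..alpha_{k-1} and the
    off-diagonal coefficients beta_0..beta_{k-1}."""
    alpha = np.zeros(k)
    beta = np.zeros(k)
    v_prev = np.zeros_like(r, dtype=float)
    v = r / np.linalg.norm(r)                 # |v_0>
    for n in range(k):
        w = H @ v - (beta[n - 1] * v_prev if n > 0 else 0.0)
        alpha[n] = v @ w                      # enforce <v_{n+1}|v_n> = 0
        w = w - alpha[n] * v
        beta[n] = np.linalg.norm(w)           # enforce <v_{n+1}|v_{n+1}> = 1
        v_prev, v = v, w / beta[n]
    return alpha, beta
```

In exact arithmetic, the resulting *k* × *k* tridiagonal matrix reproduces the moments ⟨**r**|**H**^{n}|**r**⟩ through degree 2*k* − 1, which the test below checks on a small diagonal example.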

After *k* iterations of the Lanczos algorithm, the recurrence coefficients form a (*k* + 1) × (*k* + 1) symmetric tridiagonal matrix,

$$\mathbf{H}_k = \begin{bmatrix} \alpha_0 & \beta_0 & & \\ \beta_0 & \alpha_1 & \ddots & \\ & \ddots & \ddots & \beta_{k-1} \\ & & \beta_{k-1} & \alpha_k \end{bmatrix}. \tag{24}$$

Writing **V** = [|*v*_{0}⟩, …, |*v*_{k}⟩] and letting |*e*_{n}⟩ denote the all-zeros vector with a one in index *n*, it is not hard to see that (23) implies

$$\mathbf{H}\mathbf{V} = \mathbf{V}\mathbf{H}_k + \beta_k\, |v_{k+1}\rangle\langle e_k|. \tag{25}$$

Note that it is somewhat more common to write such a recurrence with the upper-left *k* × *k* principal submatrix of **H**_{k}, which is closely related to Gaussian quadrature.^{36} However, as will become apparent in Sec. III B, using **H**_{k} as defined in (24) will provide slightly more approximation power in our Lanczos-based KPM.

### B. Getting the KPM moments

It is well-known that the Lanczos tridiagonal matrix **H**_{k} contains the information needed to compute the polynomial moments of $\hat{\rho}(E)$ through degree 2*k*.

**Theorem 1.** *Let p be any polynomial of degree at most* 2*k*. *Then,*

$$\langle r |\, p(\mathbf{H})\, | r \rangle = \langle e_0 |\, p(\mathbf{H}_k)\, | e_0 \rangle.$$

*Proof.* It suffices to consider *p*(*E*) = *E*^{n} for *n* = 0, 1, …, 2*k*. We will first show that $\mathbf{H}^n |r\rangle = \mathbf{V}\mathbf{H}_k^n |e_0\rangle$ for all *n* ≤ *k*. Since **V**^{†}**V** = **I**, this immediately implies the desired result.

Proceeding by induction, suppose $\mathbf{H}^{n-1}|r\rangle = \mathbf{V}\mathbf{H}_k^{n-1}|e_0\rangle$. Since |**r**⟩ = **V**|*e*_{0}⟩, we can use (25) to write

$$\mathbf{H}^n|r\rangle = \mathbf{H}\mathbf{V}\mathbf{H}_k^{n-1}|e_0\rangle = \mathbf{V}\mathbf{H}_k^{n}|e_0\rangle + \beta_k\, |v_{k+1}\rangle \langle e_k |\, \mathbf{H}_k^{n-1}\, | e_0 \rangle.$$

Since **H**_{k} is tridiagonal, $\mathbf{H}_k^{n-1}$ has bandwidth 2(*n* − 1) + 1, so $\langle e_k | \mathbf{H}_k^{n-1} | e_0 \rangle = 0$ provided *n* ≤ *k*. The base case |**r**⟩ = **V**|*e*_{0}⟩ is trivial.□

The critical observation is that this allows us to obtain the KPM moments {*μ*_{n}} with respect to a reference density *σ*(*E*), which we can choose *after* we have run the Lanczos computation. In fact, we can cheaply produce approximations corresponding to various different reference measures, as this process no longer involves computations with **H** or any vectors of length *d*.

If we choose $\sigma(E) = \sigma_{a,b}^{T}(E)$, we can compute the moments by applying Algorithm 1 to **H**_{k} and |*e*_{0}⟩. Of note is the fact that the Lanczos algorithm produces high accuracy estimates of extremal eigenvalues when started on a random vector.^{37–39} In particular, the largest and smallest eigenvalues of the top-left *k* × *k* submatrix of **H**_{k} approximate those of **H** from the interior. We can use this to help guide our choice of *a* and *b*.
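For instance, a simple heuristic along these lines (the function name and the padding rule are our own illustrative choices) is:

```python
import numpy as np

def estimate_interval(alpha, beta, pad=0.01):
    """Estimate [a, b] from the extremal eigenvalues (Ritz values) of the
    k x k tridiagonal Lanczos matrix. A small padding is added since the
    Ritz values approximate the spectrum of H from the interior."""
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    ritz = np.linalg.eigvalsh(T)
    width = ritz[-1] - ritz[0]
    return ritz[0] - pad * width, ritz[-1] + pad * width

# Example: for a diagonal T, the Ritz values are just the diagonal entries.
a, b = estimate_interval(np.array([0.0, 1.0, 2.0]), np.array([0.0, 0.0, 0.0]))
```

Because this uses only the stored recurrence coefficients, many candidate paddings can be compared after the fact at negligible cost.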

More generally, for any reference density *σ*(*E*), we can compute the moments directly via the three-term recurrence for the orthogonal polynomials. In particular, we use the matrix version of (10),

$$|u_{n+1}\rangle = \frac{1}{\delta_n}\big( \mathbf{H}_k|u_n\rangle - \gamma_n|u_n\rangle - \delta_{n-1}|u_{n-1}\rangle \big),$$

with |*u*_{1}⟩ = (1/*δ*_{0})(**H**_{k}|*u*_{0}⟩ − *γ*_{0}|*u*_{0}⟩) and |*u*_{0}⟩ = |*e*_{0}⟩. Then, |*u*_{n}⟩ = *p*_{n}(**H**_{k})|*e*_{0}⟩, so we can compute *μ*_{n} = ⟨*e*_{0}|*u*_{n}⟩ for *n* ≤ 2*k*. This is summarized in Algorithm 3. The cost of this process depends only on *k* and not on *d*, so tricks for halving the number of matrix–vector products with **H**_{k} are not needed.

**Algorithm 3.** Moments from Lanczos.

1: procedure moments-from-Lanczos(**H**_{k}, *σ*(*E*))
2:&nbsp;&nbsp;Obtain recurrence coefficients {*γ*_{n}, *δ*_{n}} for the orthogonal polynomials of *σ*(*E*)
3:&nbsp;&nbsp;|*u*_{0}⟩ = |*e*_{0}⟩, |*u*_{1}⟩ = (1/*δ*_{0})(**H**_{k}|*u*_{0}⟩ − *γ*_{0}|*u*_{0}⟩)
4:&nbsp;&nbsp;*μ*_{0} = ⟨*e*_{0}|*u*_{0}⟩, *μ*_{1} = ⟨*e*_{0}|*u*_{1}⟩
5:&nbsp;&nbsp;for *n* = 1, 2, …, 2*k* − 1 do
6:&nbsp;&nbsp;&nbsp;&nbsp;|*u*_{n+1}⟩ = (1/*δ*_{n})(**H**_{k}|*u*_{n}⟩ − *γ*_{n}|*u*_{n}⟩ − *δ*_{n−1}|*u*_{n−1}⟩)
7:&nbsp;&nbsp;&nbsp;&nbsp;*μ*_{n+1} = ⟨*e*_{0}|*u*_{n+1}⟩
8:&nbsp;&nbsp;return *μ*_{0}, *μ*_{1}, …, *μ*_{2k}

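A sketch of Algorithm 3 in NumPy (the function name is ours); note that it operates only on the small (*k* + 1) × (*k* + 1) matrix **H**_{k}:

```python
import numpy as np

def moments_from_lanczos(Hk, gamma, delta, s):
    """Modified moments mu_0, ..., mu_s from the tridiagonal Lanczos matrix
    Hk (cf. Algorithm 3). gamma and delta are the recurrence coefficients of
    the orthonormal polynomials of the chosen reference density sigma(E).
    No computation with the full Hamiltonian is required."""
    m = Hk.shape[0]
    e0 = np.zeros(m)
    e0[0] = 1.0
    u_prev = e0.copy()                                  # |u_0> = |e_0>
    mu = [u_prev[0]]                                    # mu_0 = <e_0|u_0>
    u = (Hk @ u_prev - gamma[0] * u_prev) / delta[0]    # |u_1>
    mu.append(u[0])
    for n in range(1, s):
        u_next = (Hk @ u - gamma[n] * u - delta[n - 1] * u_prev) / delta[n]
        mu.append(u_next[0])                            # mu_{n+1} = <e_0|u_{n+1}>
        u_prev, u = u, u_next
    return np.array(mu)
```

Swapping reference densities amounts to rerunning this loop with different {*γ*_{n}, *δ*_{n}}, which is exactly what makes the post-hoc choice of *σ*(*E*) essentially free.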

### C. Stability of the Lanczos algorithm

The Lanczos algorithm is unstable in the sense that the tridiagonal matrix *H*_{k} and basis vectors {|*v*_{n}⟩} produced in finite precision arithmetic may be nothing like what would be obtained in exact arithmetic. In particular, by symmetry, |*v*_{n+1}⟩ is automatically orthogonal to |*v*_{0}⟩, …, |*v*_{n−1}⟩ in exact arithmetic. However, in finite precision arithmetic, this is no longer even approximately true, and the Lanczos basis vectors produced can completely lose orthogonality and even linear independence. To fix this, it is common to use *reorthogonalization*, that is, to explicitly orthogonalize $|\hat{v}_{n+1}\rangle$ against |*v*_{0}⟩, …, |*v*_{n−1}⟩ before normalizing in Line 10 of Algorithm 2. This, of course, drastically increases the cost of running the algorithm; in particular, the memory required for reorthogonalization scales as *O*(*dk*) and the arithmetic cost as *O*(*dk*^{2}).

Despite the instabilities of the Lanczos algorithm without reorthogonalization, there is significant theoretical^{40–45} and empirical^{19,46,47} evidence that the Lanczos algorithm is highly effective for tasks related to density of states approximation. In fact, while not widely known, the Lanczos algorithm is *forward stable* for the tasks of computing Chebyshev polynomials and moments.^{45,48} We summarize the high-level ideas behind these works, the results of which we believe are relevant to the computational physics and chemistry communities.

In finite precision arithmetic, the Lanczos recurrence (25) holds only up to a perturbation,

$$\mathbf{H}\mathbf{V} = \mathbf{V}\mathbf{H}_k + \beta_k\, |v_{k+1}\rangle\langle e_k| + \mathbf{F}, \tag{30}$$

where **F** accounts for local errors, which can be expected to be on the order of machine precision. In addition, while **V** need not be orthonormal, |1 − ⟨*v*_{n+1}|*v*_{n+1}⟩| and |⟨*v*_{n+1}|*v*_{n}⟩| are on the order of machine precision. This (and much more) is analyzed in detail in Refs. 40–42. These assumptions form the basis of essentially all analyses of the behavior of Lanczos-based methods in finite precision arithmetic.

Following the Proof of Theorem 1, we might try to use (30) to understand the difference between **H**^{n}|**r**⟩ and $\mathbf{V}\mathbf{H}_k^n|e_0\rangle$, i.e., how closely (27) holds in finite precision arithmetic. However, it is not hard to see that this results in an error term with an exponential dependence on *k*. This is fundamentally because the monomial basis is very poorly conditioned and is therefore not a good choice to work with numerically. Indeed, if, instead, we use a Chebyshev basis, then it can be shown that the error term itself satisfies a Chebyshev-like three-term recurrence and grows only polynomially with *k*. This yields the following bound.

**Theorem 2** (Ref. 45, informal). *Suppose* [*a*, *b*] = [*E*_{min} − *η*, *E*_{max} + *η*] *for some* *η* = *O*(*ϵ*_{mach} poly(*k*)). *Let* **H**_{k} *be the output of the Lanczos algorithm run on* **H** *and* |**r**⟩ *for k iterations in finite precision arithmetic without reorthogonalization. Then, for any* *n* ≤ 2*k*,

$$\big\| \tilde{T}_n(\mathbf{H})\,|r\rangle - \mathbf{V}\,\tilde{T}_n(\mathbf{H}_k)\,|e_0\rangle \big\| = O\big(\epsilon_{\text{mach}}\,\mathrm{poly}(k)\big),$$

*where* $\tilde{T}_n$ *are the orthonormal Chebyshev polynomials shifted to* [*a*, *b*].

Here, *O*(·) hides mild dimensional constants and dependencies on *a* and *b*. The actual statement of Ref. 45 (Theorem 1) is more precise and gives explicit bounds for the degree of the poly(*k*) terms. We remark that in numerical analysis, the precise numerical value of bounds is often much less important than the intuition conveyed by the bound.

With some additional knowledge of properties satisfied by *V*^{†}** V** in finite precision arithmetic (which can be very far from the identity), a similar result can be shown for the Chebyshev moments:

**Theorem 3** (Ref. 48, informal). *Under similar assumptions to Theorem* 2*, for any* *n* ≤ 2*k*,

$$\big| \langle r|\,\tilde{T}_n(\mathbf{H})\,|r\rangle - \langle e_0|\,\tilde{T}_n(\mathbf{H}_k)\,|e_0\rangle \big| = O\big(\epsilon_{\text{mach}}\,\mathrm{poly}(k)\big).$$

Theorem 3 implies that the (appropriately scaled) Chebyshev moments can be obtained from the matrix *H*_{k}, even when Lanczos was carried out in finite precision arithmetic. Thus, the KPM with $\sigma_{a,b}^{T}(E)$ can be implemented from Lanczos after *H*_{k} has been obtained, even in finite precision arithmetic. As with Theorem 2, the corresponding result in Ref. 48 is much more precise than what is stated in Theorem 3.

These guarantees extend beyond the Chebyshev moments. Indeed, any polynomial *p*(*E*) of degree 2*k* can be decomposed in a Chebyshev series,

$$p(E) = \sum_{n=0}^{2k} c_n\, \tilde{T}_n(E),$$

and if |*p*(*E*)| ≤ *M* for all *E* ∈ [*a*, *b*], then the coefficients {*c*_{n}} are bounded in magnitude by a small constant multiple of *M*. Applying Theorem 3 to each term, the moments computed from **H**_{k} are accurate up to *O*(*ϵ*_{mach} *M* poly(*k*)). In particular, this guarantees the accuracy of the modified moments for any reference density *σ*(*E*) whose orthogonal polynomials are well-behaved.

We remark that a similar argument also implies that, even in finite precision arithmetic, the FTLM approximation to ⟨**r**|*f*(**H**)|**r**⟩ is accurate provided *f*(*E*) has a good polynomial approximation, i.e., the same sort of result as would be expected in exact arithmetic. A more detailed analysis is found in Ref. 48. This provides a theoretical justification for the observation that the FTLM works well, even without reorthogonalization.^{19}

### D. Computational costs

The overall computational costs of our spectrum adaptive KPM and the standard KPM (assuming [*a*, *b*] is known) are almost identical. In addition to the storage required for **H**, both algorithms require storage for three vectors of length *d*. At each iteration, the Lanczos algorithm (without reorthogonalization) and a standard implementation of the KPM require one matrix–vector product, several vector updates, and two inner products. The algorithms also have additional lower-order arithmetic and storage costs, which depend only on the maximum number of iterations *k* but not on the dimension *d*. Assuming *k* ≪ *d*, these costs are negligible.

As noted above, the standard KPM typically requires a pre-processing step in which a suitable interval [*a*, *b*] is determined, often via a few iterations of Lanczos. While our Lanczos-based KPM avoids the need for such a pre-processing step, this pre-processing step is often cheap relative to the overall computation. In such cases, the fact that our algorithm avoids this step is not particularly significant from a runtime perspective.

Finally, we note that one situation in which the runtimes of the two algorithms may differ is on high-performance clusters, where the time spent on communication for inner products can dominate the time spent on arithmetic computation. Indeed, in the case of the KPM, the two inner products are used to compute *μ*_{n} and *μ*_{n+1} and do not prevent the algorithm from proceeding. Meanwhile, the two inner products in Lanczos are used to compute *α*_{n} and *β*_{n} and are blocking; they must be completed before the algorithm can proceed.

In such settings, if the cost of the pre-processing step is significant, one could run the energy-adaptive KPM suggested here for several iterations to determine good choices of [*a*, *b*] and *σ*(*E*). Assuming that the Lanczos basis vectors are stored, *p*_{n}(**H**)|**r**⟩ and *p*_{n−1}(**H**)|**r**⟩ can be computed without any additional matrix–vector products, at which point an explicit three-term recurrence can be continued without the need for blocking inner products. We leave such implementation details to practitioners who have better knowledge of their individual computing environments.

## IV. NUMERICAL EXPERIMENTS

Here, we provide several numerical examples to demonstrate the potential usefulness of our spectrum adaptive KPM. Rather than focusing on any single domain area, we aim to provide a diverse collection of examples from a variety of applications. Each of these examples demonstrates a particular aspect of the paradigm of decoupling computation from approximation, which may prove useful in practical settings. Unless stated otherwise, all KPM approximations are computed using our Lanczos-based approach.

### A. Spin systems

Lanczos-based methods are widely used to study spin systems.^{19,20} Here, we consider the simplest example: the 1D XX spin chain of length *m*, with coupling strength *J* and field strength *h*.^{49} For our numerical experiments, we set *m* = 20 so that *d* = 2^{20} ≈ 10^{6}, and we use *J* = 1/6 and *h* = 6.

Figure 1 shows a histogram of the exact spectrum, along with the degree *s* = 500 KPM approximation of the LDOS corresponding to a single random state |**r**⟩. The support [*a*, *b*] of the KPM approximation is varied to study the impact of estimating [*E*_{min}, *E*_{max}]. If *E*_{n} ∉ [*a*, *b*] for some index *n*, then the KPM approximation deteriorates, but if [*a*, *b*] is taken too large, then convergence is slowed. Thus, reasonably accurate estimates of *E*_{min} and *E*_{max} are required. Our spectrum adaptive KPM allows these estimates to be determined after computation with **H** has occurred.

In Fig. 2, we study the impact of finite precision arithmetic. We first consider how accurately the moments *μ*_{n} can be computed. In particular, we compare our Lanczos-based algorithm and a standard implementation of the KPM and observe that the computed moments agree to essentially machine precision. This is expected due to Theorem 3. For reference, we also show the orthogonality of the Lanczos vectors without reorthogonalization, as well as the difference between the Lanczos recurrence coefficients {*α*_{n}, *β*_{n}} and what would have been obtained with reorthogonalization, $\{\alpha_n^*, \beta_n^*\}$. There is a complete loss of orthogonality, and the coefficients obtained with and without reorthogonalization are vastly different. This implies that the agreement of the moments is not simply due to the outputs with and without reorthogonalization being similar.

### B. Tight binding models

Another common use of the KPM is in tight binding models. Here, we consider a cuboid section of a zinc blende crystal with nearest neighbor hopping. The Hamiltonian and visualization of the zinc blende crystal in Fig. 3 were generated using the Kwant code.^{50} The resulting Hamiltonian is of dimension *d* = 56 000, and we output the average of 10 LDOSs corresponding to random independent samples of |** r**⟩.

The DOS has a large spike at zero, and this spike causes an undamped Chebyshev KPM to exhibit massive Gibbs oscillations. These oscillations can be mitigated somewhat through the use of a damping kernel. However, as shown in Fig. 3, the resolution is then only *O*(1/*s*), and therefore, *s* must be taken large to obtain high resolution.

Our spectrum adaptive KPM can instead incorporate the spike into the reference density *σ*(*E*) itself, here via a narrow component of width *η* = 10^{−2} centered at the spike. Note that the relative weighting of the spike and the bulk spectrum, as well as the width of the spike, can be tuned using our energy adaptive KPM. When properly chosen, this results in higher resolution in other parts of the spectrum, as the KPM approximation does not have to use as much of its approximation power on approximating the spike in $\hat{\rho}(E)/\sigma(E)$. Moreover, even without damping, Gibbs oscillations are relatively minor.

### C. Density functional theory

In this example, we consider a matrix obtained in the study of Ga_{41}As_{41}H_{72} with the pseudo-potential algorithm for real-space electronic structure calculation (PARSEC).^{51} The matrix Ga_{41}As_{41}H_{72} can be obtained from the sparse matrix collection^{52} and has been used as a test matrix in a past numerical study.^{38,53} This matrix is of dimension *d* = 268 096 and has many low-lying eigenvalues |*E*_{n}| < 100 and a cluster of 123 large eigenvalues *E*_{n} ∈ [1299, 1301]. Thus, while the spectrum can be contained in two reasonably sized intervals, any single interval containing the spectrum must be large. In this experiment, we output the average of 10 random LDOSs.

With a single interval of approximation containing the entire spectrum, much of the reference density's mass falls in the wide spectral gap.^{54} This significantly slows convergence, and even with *s* = 800 moments, the fine-grained structure of the upper spectrum is not resolved. To avoid this delay of convergence, we can take *σ*(*E*) as a density supported on two disjoint intervals [*a*_{1}, *b*_{1}] and [*a*_{2}, *b*_{2}] containing the spectrum of **H**. In particular, we take *σ*(*E*) to be a weighted combination of Chebyshev densities on [*a*_{1}, *b*_{1}] and [*a*_{2}, *b*_{2}], where the intervals are computed based on the eigenvalues of **H**_{k}. As shown in Fig. 4, this provides a higher resolution in each interval, and the structure of the upper cluster is visible. Here, we have applied a simple convolutional filter to the KPM approximation of the upper eigenvalues to reduce Gibbs oscillations.

## V. CONCLUSION

We have described a spectrum adaptive kernel polynomial method based on the Lanczos algorithm. Our approach allows many different KPM approximations to be tested out at close to zero cost after computation with **H** has finished. Experiments demonstrate situations in which this allows the reference density *σ*(*E*) to be chosen in such a way as to improve the resolution of the approximation. It is our belief that the paradigm of separating computation from the desired approximation is beneficial in most settings in which the KPM is used and that our algorithm has the potential to improve the usability of the KPM.

## AUTHOR DECLARATIONS

### Conflict of Interest

The author has no conflicts to disclose.

### Author Contributions

**Tyler Chen**: Conceptualization (equal); Formal analysis (equal); Visualization (equal); Writing – original draft (equal).

## DATA AVAILABILITY

The data that support the findings of this study are available within the article.

## REFERENCES

*Maximum Entropy and Bayesian Methods*

_{1−x}Sn_{x}Te using the kernel polynomial method

tr(*f*(*A*)) via stochastic Lanczos quadrature

*RR 669-M, TIM3, Informatique et Mathématiques Appliquées de Grenoble*

If we can compute the moments of the DOS *ρ*(*E*), then we can apply the KPM to the DOS directly.

*Matrices, Moments and Quadrature with Applications*, *Princeton Series in Applied Mathematics Vol. 30*

If the number of outlying eigenvalues is known, it is possible to deflate them.^{8,20} However, this requires additional computation, including the storage of the eigenvectors (which may be intractable).