We consider the chiral model of twisted bilayer graphene introduced by Tarnopolsky, Kruchkov, and Vishwanath (TKV). TKV proved that for inverse twist angles α such that the effective Fermi velocity at the moiré K point vanishes, the chiral model has a perfectly flat band at zero energy over the whole Brillouin zone. By a formal expansion, TKV found that the Fermi velocity vanishes at α ≈ 0.586. In this work, we give a proof that the Fermi velocity vanishes for at least one α between 0.57 and 0.61 by rigorously justifying TKV’s formal expansion of the Fermi velocity over a sufficiently large interval of α values. The idea of the proof is to project the TKV Hamiltonian onto a finite-dimensional subspace and then expand the Fermi velocity in terms of explicitly computable linear combinations of modes in the subspace while controlling the error. The proof relies on two propositions whose proofs are computer-assisted, i.e., numerical computation together with worst-case estimates on the accumulation of round-off error, which show that round-off error cannot possibly change the conclusion of computation. The propositions give a bound below on the spectral gap of the projected Hamiltonian, an Hermitian 80 × 80 matrix whose spectrum is symmetric about 0, and verify that two real 18-th order polynomials, which approximate the numerator of the Fermi velocity, take values with a definite sign when evaluated at specific values of α. Together with TKV’s work, our result proves the existence of at least one perfectly flat band of the chiral model.
I. INTRODUCTION
A. Outline
Twisted bilayer graphene (TBG) is formed by stacking one layer of graphene on top of another in such a way that the Bravais lattices of the layers are twisted relative to each other. For generic twist angles, the lattices will be incommensurate, so the resulting structure will not be periodic. Bistritzer and MacDonald (BM)1 introduced an approximate model (BM model) for the electronic states of TBG, which is periodic over the scale of the bilayer moiré pattern, where the twist angle enters as a parameter. Using this model, BM showed that the Fermi velocity, the velocity of electrons at the Fermi level, vanishes at particular twist angles known as “magic angles.” The largest of these angles, known as the first magic angle, is at θ ≈ 1.1°. Numerical computations on the BM model show the stronger result that at magic angles, the Bloch band of the BM model at zero energy is approximately flat over the whole Brillouin zone.1,2 The flatness of the zero energy Bloch band is thought to be a critical ingredient for recently observed superconductivity of TBG,3 although the precise mechanism for superconductivity in TBG is not yet settled.
Aiming at a simplified model that explains the nearly flat band of TBG, Tarnopolsky, Kruchkov, and Vishwanath (TKV)4 introduced a simplification of the BM model, which has an additional “chiral” symmetry, known as the chiral model. TKV showed analytically that at magic angles (of the chiral model, still defined by vanishing of the Fermi velocity), the chiral model has exactly flat bands over the whole Brillouin zone. Using a formal perturbation theory (for the chiral model, the natural parameter is the reciprocal of the twist angle up to a constant), TKV derived approximate values for the magic angles of the chiral model. It is worth noting that the first magic angles of the chiral model and the BM model are nearby, but the higher magic angles are not very close.
Becker et al. introduced a spectral characterization of magic angles of the TKV model where the role of a non-normal operator is emphasized [the operator Dα appearing in (1)]. Using this characterization, they numerically computed precise values for the magic angles of the TKV model [see the discussion below (14)].5 In the same work, they also proved that the lowest band of the TKV model becomes exponentially close to flat even away from magic angles as the natural small parameter tends to zero. The same authors also investigated flat bands of the TKV model with more general interlayer coupling potentials and the spectrum of other special cases of the BM model.6
In this work, we study the chiral model introduced by TKV and consider the problem of rigorously proving the existence of the first magic angle. We do this by justifying the formal perturbation theory of TKV to make a rigorous expansion of the Fermi velocity to high enough order, and over a large enough parameter range, so that we can prove the existence of a zero. By numerically verifying that the resulting expansion attains a negative value and proving that the result continues to hold when the effect of round-off error is included (Proposition II.2), we obtain existence of the magic angle (Theorem II.2).
The proof of validity of the expansion is challenging because the reciprocal of the twist angle at the zero of the Fermi velocity is large relative to the spectral gap of the unperturbed Hamiltonian, which means that the magic angle falls outside of the interval of twist angles where the perturbation series for the Fermi velocity is obviously convergent. To overcome this difficulty, we start by representing the chiral model Hamiltonian in a basis that takes full advantage of model symmetries. Then, using a rigorous bound on the high frequency components of the error, we reduce the error analysis to analysis of the eigenvalues of the chiral model projected onto finitely many low frequencies. The final stage of the error analysis (Theorem II.1) is to prove a proposition about the eigenvalues of the projected chiral model by a numerical computation that we prove continues to hold when the accumulation of round-off error is considered (Proposition IV.5). We discuss the limitations of our methods and, in particular, whether our methods might be generalized to the more general settings considered by Becker et al.5,6 in Remarks II.1–II.3.
B. Code availability
We have made code for the numerical computations used in our proofs available at github.com/abwats/magic_angle. We give references to specific scripts in the text.
II. STATEMENT OF RESULTS
A. Tarnopolsky–Kruchkov–Vishwanath’s chiral model
The chiral model, such as the Bistritzer–MacDonald model (B–M model) from which it is derived, is a formal continuum approximation to the atomistic tight-binding model of twisted bilayer graphene. The BM and chiral models aim to capture physics over the length-scale of the bilayer moiré pattern, which is, for small twist angles, much longer than the length-scale of the individual graphene layer lattices. Crucially, even when the graphene layers are incommensurate so that the bilayer is aperiodic on the atomistic scale, the chiral model and BM model are periodic (up to phases) with respect to the moiré lattice so that they can be analyzed via Bloch theory.
We define the moiré lattice to be the Bravais lattice
generated by the moiré lattice vectors
and we denote a fundamental cell of the moiré lattice by Ω. The moiré reciprocal lattice is the Bravais lattice
generated by the moiré reciprocal lattice vectors defined by ai · bj = 2πδij, given explicitly by
We define , which is the (re-scaled) difference of the K points (Dirac points) of each layer, and
We write Ω* for a fundamental cell of the moiré reciprocal lattice and refer to such a cell as the Brillouin zone.
Let . Tarnopolsky–Kruchkov–Vishwanath’s chiral Hamiltonian is defined as
where , , † denotes the adjoint (Hermitian transpose), and α is a real parameter, which we will take to be positive α ≥ 0 throughout [see (3)]. The chiral Hamiltonian Hα is an unbounded operator on with domain . We will write functions in as
where represents the electron density near to the K point (in momentum space) on sublattice σ and on layer τ. The diagonal terms of Dα arise from Taylor expanding the single layer graphene dispersion relation about the K point of each layer, while the off-diagonal terms of Dα couple the A and B sublattices of layers 1 and 2. The chiral model is identical to the BM model except that inter-layer coupling between sublattices of the same type is turned off in the chiral model. The precise form of the interlayer coupling potential U can be derived under quite general assumptions on the real space interlayer hopping.1,7 The parameter α is, up to unimportant constants, the ratio
Although the limit α → 0 can be thought of as the limit of vanishing interlayer hopping strength at fixed twist, it is physically more interesting to view the limit as modeling increasing twist angle at a fixed interlayer hopping strength.
B. Rigorous justification of TKV’s formal expansion of the Fermi velocity and proof of existence of first magic angle
Bistritzer and MacDonald studied the effective Fermi velocity of electrons in twisted bilayer graphene modeled by the BM model and computed values of the twist angle such that the Fermi velocity vanishes, which they called “magic angles.” One can similarly define an effective Fermi velocity for the chiral model and refer to values of α such that the Fermi velocity vanishes as “magic angles” [although technically α is related to the reciprocal of the twist angle (3)].
TKV proved the remarkable result that, at magic angles, the chiral model has a perfectly flat Bloch band at zero energy. Let denote the L2 space on a single moiré cell Ω with moiré K point Bloch boundary conditions. The starting point of TKV’s proof is an expression for the Fermi velocity as a function of α, v(α), as a functional of one of the Bloch eigenfunctions, , of Hα,
where denotes the inner product. We give precise definitions of , ψα, and v(α) in Definition III.2, Proposition III.6, and Definition III.3, respectively. We give a systematic formal derivation of why (4) is the effective Fermi velocity at the moiré K point in Appendix A. To complete the proof, TKV showed that zeros of v(α) imply zeros of ψα at special “stacking points” of Ω and that such zeros of ψα allow for Bloch eigenfunctions with zero energy to be constructed for all k in the moiré Brillouin zone.
To derive approximate values for magic angles, TKV computed a formal perturbation series approximation of ψα,
and then substituted this expression into the functional for v(α) to obtain an expansion of v(α) in powers of α,
By setting v(α) = 0, one obtains an approximation for the smallest magic angle: α ≈ 0.586.
Although TKV proved that flat bands occur at magic angles, they did not prove the existence of magic angles, and hence, they did not prove the existence of flat bands. The contribution of the present work is to prove rigorous estimates on the error in approximation (5), which are sufficiently high order and precise that, once substituted into (4), they suffice to rigorously prove the existence of a zero of v(α) and, hence, via TKV’s proof, the existence of at least one perfectly flat band.
It turns out to be relatively straightforward to prove that series (5) and (6) are uniformly convergent and to derive precise bounds on the error in truncating the series for ; see Proposition IV.3. The basic challenge, then, is to derive similar error bounds for α over an interval, which includes the expected location of the first magic angle, at . The first main theorem we will prove, roughly stated, is the following. See Theorem IV.1 for the more precise statement. The theorem relies on the existence of a spectral gap for an 80 × 80 Hermitian matrix, which requires numerical computation for its proof; see Proposition IV.5.
The functions Ψn for 0 ≤ n ≤ 8 are derived recursively: see Appendix C. We stop at eighth order in the expansion because this is the minimal order such that we can guarantee the existence of a zero of v(α), but the functions Ψn are well-defined by a recursive procedure for arbitrary positive integers n; see Proposition IV.1.
where
and
where denotes the inner product and ηα satisfies (8). The following is a straightforward calculation.
We prove Proposition II.1 in Appendix F. Naïvely, expansions (10) and (11) approximate the formal infinite series expansions of and up to terms of order α9. We prove in Proposition F.2 that because of some simplifications, expansions (10) and (11) agree with the infinite series up to terms of order α10.
We are now in a position to state and prove our second result. This result also relies on a proposition, which requires numerical computation for its proof: that one real 18-th order polynomial in α attains a negative value and another attains a positive value, when evaluated at specific values of α; see Proposition II.2.
There exist positive numbers αmin and αmax with 0.57 < αmin < αmax < 0.61 such that the Fermi velocity v(α) defined by (4) has a zero α* satisfying αmin ≤ α* ≤ αmax.
Proposition II.2 obviously implies by continuity (12), (13), and , each has at least one zero in the interval 0.57 < α < 0.61. We denote the largest zero of (12) in the interval by αmax and the smallest zero of (13) in the interval by αmin. Since the zeroes of must lie between those of (12) and (13), we are done.□
(computer-assisted). We will first prove that (12) attains a negative value at 0.61 and then explain the modifications necessary to prove that (13) is positive at 0.57. Evaluating using the double-precision floating-point arithmetic, we find that at α = 0.61, (12) attains the negative value −0.020 263 (five significant figures; this value was computed by running the script compute_expansion_symbolically.py in the Github repo). It is straightforward to bound the numerical error, which accumulates when evaluating an 18-th order polynomial using the floating-point arithmetic. Even the simplest exact bound, which does not account for error cancellation (see, e.g., Eq. (8) of Oliver),8 yields an upper bound on the possible accumulated round-off error in the evaluation of an nth order polynomial , for α ∈ [−1, 1], as , where ϵ is “machine epsilon”: roughly speaking, the maximum possible round-off error generated in a single arithmetic operation. Bounding the maximum coefficient in (12) by 1000, taking n = 18, and bounding ϵ by 3 × 10−16 (which was easily attained working in Python on our machine), the maximum possible numerical error in the evaluation of (12) is ≈10−11, which is much smaller than 0.020 263. We conclude that the first claim of Proposition II.2 must hold. Regarding the second, evaluating at α = 0.57, we find that (13) equals 0.029 138 (5sf). The same argument as before now shows that accumulated round-off error in the evaluation cannot possibly change the sign of (13) at α = 0.57.□
We do not attempt to rigorously estimate αmin and αmax precisely in this work, but numerically computing roots of polynomials (12) and (13) suggests αmin ≈ 0.576 83 (5sf) and αmax ≈ 0.601 77 (5sf), respectively, where 5sf is an abbreviation for five significant figure. Numerical computation of the first zero of gives 0.585 97 (5sf); see Fig. 1 (the zero values were computed by running the script compute_expansion_symbolically.py in the Github repo).
Left: plot of the numerator of the Fermi velocity approximated by eighth order TKV expansion (6) (orange) and of eighth order expansions with worst-case (12) (blue) and best-case (13) (green) errors. Right: detail showing computed roots of these functions near to . Numerically computing the zeroes of each curve yields α = 0.585 97 (5sf), α = 0.601 77 (5sf), and α = 0.576 83 (5sf), respectively. The values of α (0.57 and 0.61) where we evaluated expressions (13) (green line) and (12) (blue line) to prove that has a zero between 0.57 and 0.61 are shown with the black crosses.
Left: plot of the numerator of the Fermi velocity approximated by eighth order TKV expansion (6) (orange) and of eighth order expansions with worst-case (12) (blue) and best-case (13) (green) errors. Right: detail showing computed roots of these functions near to . Numerically computing the zeroes of each curve yields α = 0.585 97 (5sf), α = 0.601 77 (5sf), and α = 0.576 83 (5sf), respectively. The values of α (0.57 and 0.61) where we evaluated expressions (13) (green line) and (12) (blue line) to prove that has a zero between 0.57 and 0.61 are shown with the black crosses.
Using Proposition C.1 and the package Sympy9 for symbolic computation, we can compute the formal expansion of v(α) up to arbitrarily high order in α. In particular, we find the higher-order terms in expansion (6) to be
Truncating the numerator after order α40 and setting the numerator equal to zero yield α = 0.585 663 558 389 56 (14sf) for the first zero of the Fermi velocity (to compute this value, run compute_expansion_symbolically.py in the Github repo with N = 40). This is consistent with the numerical computation of Becker et al.,5 who found α = 0.585 663 558 389 55 truncated (not rounded) to 14 digits, by diagonalizing a non-normal but compact operator whose reciprocal eigenvalues correspond to magic angles. Note that we do not attempt to rigorously justify series (14) to such large values of α and to such high order in this work; see Remark II.4.
(Higher magic angles). The chiral model has been conjectured to have infinitely many magic angles,5 but it is not straightforward to extend our methods to prove the existence of such higher magic angles. The problem is that calculating the perturbation series centered at α = 0 requires diagonalizing the unperturbed operator H0. In principle, it might be possible to calculate the perturbation series to higher order in order to get an accurate approximation of the Fermi velocity near to the higher magic angles. However, this would require significantly more calculation compared with the present work, and we have no guarantee that the error can be made small enough to prove the existence of another zero in that case.
(More general interlayer hopping potentials). The chiral model (1) is an approximation to the full Hamiltonian of the twisted bilayer, even in the chiral limit where coupling between sublattices of the same type is turned off because the interlayer hopping potential U only allows for hopping between nearest neighbors in the momentum lattice (see Fig. 6 ). More general interlayer hopping potentials have been studied by Becker et al.6 In principle, such models should be amenable to the analysis of this work, but longer-range hopping would lead to much more involved calculations, and the construction of the finite-dimensional subspace Ξ of Proposition IV.4 would require more care: the fact that we can choose Ξ so that depends on H1 only coupling nearest neighbors in the momentum lattice. Locality of hopping in the momentum space lattice has been exploited for efficient computation of density of states10 of twisted bilayers.
(Generalization to BM model). Parts of our analysis should also apply to the full Bistritzer–MacDonald model. Specifically, one could study perturbation series for Bloch functions near to zero energy in powers of the inter-layer hopping strength, derive an equivalent expression for the Fermi velocity in terms of that series, and then study the zeroes of that series. However, there are various complications because of the lack of “chiral” symmetry. First, there is no reason for the continuation of the zero eigenvalue of the unperturbed operator to remain at zero. Second, the expression for the Fermi velocity in terms of the associated eigenfunction could be more complicated. Since zeros of the BM model Fermi velocity do not imply the existence of flat bands for that model, we do not consider these complications in this work.
(Expanding to higher order). Our methods could, in principle, be continued to justify the expansion of the Fermi velocity to arbitrarily high order and potentially over larger intervals of α values. However, these extensions are not immediate: pushing the expansion to higher order or to a larger interval of α values would require a larger set Ξ in Lemma IV.1, and Proposition IV.5 would have to be re-proved for the new set Ξ. Note that the essential difficulty is justifying the perturbation series for large α: the series are easily justified to all orders for ; see Proposition IV.3.
C. Structure of paper
We review the symmetries, Bloch theory, and symmetry-protected zero modes of TKV’s chiral model in Sec. III. We prove Theorem II.1 in Sec. IV, postponing most details of the proofs to Appendixes A– F. In Appendix A, we show why (4) corresponds to the effective Fermi velocity at the moiré K point. In Appendix B, we construct an orthonormal basis, which we refer to as the chiral basis, which allows for efficient computation and analysis of TKV’s formal expansion. We re-derive TKV’s formal expansions in Appendix C. We give details of the Proof of Theorem II.1 in Appendixes D and E. We prove Proposition II.1 in Appendix F. In the supplementary material, we list the basis functions of the subspace onto which we project the TKV Hamiltonian, give the explicit forms of the higher-order corrections in expansion (7), and present a derivation of the TKV Hamiltonian from the Bistritzer–MacDonald model.
III. SYMMETRIES, BLOCH THEORY, AND ZERO MODES OF TKV’s CHIRAL MODEL
A. Symmetries of the TKV model
In this section, we review the symmetries of the TKV model for the reader’s convenience and to fix notation. Becker et al.5 gave a group theoretical account of these symmetries, and further reviews can be found in the physics literature.11–13 Recall that , and let Rϕ denote the matrix, which rotates vectors counter-clockwise by ϕ, i.e.,
We define the following.
We then have the following.
The “chiral” symmetry (19) implies that the spectrum of Hα is symmetric about zero because
The same calculation implies that zero modes of Hα can always be chosen without loss of generality to be eigenfunctions of .
B. Bloch theory for the TKV Hamiltonian
We now want to reduce the eigenvalue problem for Hα using the symmetries just introduced. The symmetry (18) means that eigenfunctions of Hα can be chosen without loss of generality to be simultaneous eigenfunctions of τv for all v ∈ Λ. It therefore suffices to seek solutions of
for r in a fundamental cell of the moiré lattice in the symmetry-restricted spaces
where k is known as the quasimomentum. Since for any w ∈ Λ*, it suffices to restrict attention to k in a fundamental cell of Λ*, which we denote and refer to as the Brillouin zone. We also define symmetry-restricted Sobolev spaces for each k ∈ Ω* and positive integer s by
We claim the following.
For each fixed k ∈ Ω* and α ≥ 0, Hα, defined on the domain , extends to an unbounded self-adjoint elliptic operator with compact resolvent. In a complex neighborhood of every α ≥ 0, the family Hα is a holomorphic family of type (A) in the sense of Kato.14
Ellipticity is immediate since the principal symbol of Hα is invertible. Self-adjointness is clear using the Fourier transform when α = 0 and for α ≠ 0 because αH1 is a bounded symmetric perturbation of H0 (see, e.g., Theorem 1.4 of Cycon et al.15). Elliptic regularity implies that the resolvent maps , and compactness of the resolvent then follows by Rellich’s theorem (see, e.g., Proposition 3.4 of Taylor16). The family Hα is holomorphic of type (A) since the domain of Hα is independent of α, and Hαf is holomorphic for every (see Chap. 7 of Kato14).□
We now claim the following.
Let . Then, .
In particular, whenever mod Λ*, we have . Regarding such k, the following is a simple calculation.
The moiré K and K′ points k = 0 and k = −q1 and the moiré Γ point k = q1 + b1 satisfy mod Λ*.
The moiré K, K′, and Γ points are shown in Fig. 2. Note that the moiré K, K′, and Γ points should not be confused with the single layer K, K′, and Γ points. The moiré K point corresponds to the K point of layer 1, while the moiré K′ point corresponds to the K point of layer 2. Interactions with the K′ points of layers 1 and 2 are formally small for small twist angles and are hence ignored.
Diagram showing locations of moiré K (blue), K′ (red), and Γ (black) points within the moiré Brillouin zone (orange).
Diagram showing locations of moiré K (blue), K′ (red), and Γ (black) points within the moiré Brillouin zone (orange).
In this work, we will be particularly interested in Bloch functions at the moiré K and K′ points. We therefore define the following.
Let ω = eiϕ. Since the spaces and are invariant under , they can be divided up into invariant subspaces corresponding to the eigenvalues of ,
where
and , are defined similarly.
The following, which is trivial to prove, will be important for studying the zero modes of Hα.
The operator commutes with τv and and hence maps the and spaces to themselves for σ = 1, ω, ω*.
Since has eigenvalues ±1, we can define the spaces
and spaces similarly.
Note that +1 eigenspaces of correspond to wave-functions, which vanish in their third and fourth entries, which correspond, through (2), to wave-functions supported only on A sites of the layers. Similarly, −1 eigenspaces of correspond to wave-functions, which vanish in their first and second entries, which are supported only on B sites of the layers.
C. Zero modes of the chiral model
We now want to investigate zero modes of Hα in detail. When α = 0, there are exactly four zero modes given by ej, j = 1, 2, 3, 4, where ej equals 1 in its jth entry and 0 in its other entries. It is easy to check that
and hence, 0 is a simple eigenvalue of Hα when restricted to each of these subspaces. Recall that zero modes can always be chosen as eigenfunctions of , and indeed, we have
We now claim that these zero modes persist for all α. This was already established by TKV,4 and the following proposition is also similar to Proposition 3.1 of Becker et al.5 We re-state it using our notation and give the proof for completeness.
There exist smooth functions ψα with ‖ψα‖ = 1 in each of the spaces , , , and such that ψ0 is as in (21), α ↦ ψα is real-analytic, and Hαψα = 0 for all α. The dimension of ker Hα restricted to each of the spaces , , , and is always odd-dimensional.
Since preserves each of the spaces , , , and and anti-commutes with Hα, the spectrum of Hα restricted to each space must be symmetric about 0 for all α. Since Hα restricted to each space has compact resolvent and Hα is a holomorphic family of type (A), the spectrum of Hα consists of finitely degenerate isolated eigenvalues depending real-analytically on α, with associated eigenfunctions also real-analytically depending on α (although the real-analytic choice of eigenfunction at an eigenvalue crossing may not respect ordering); see Theorem 3.9 of Chap. 7 of Kato.14 The null space of Hα in each of the spaces is one-dimensional at α = 0 by explicit calculation, with the zero modes given by (up to non-zero constants) (21). For small α > 0, real-analyticity and the chiral symmetry force the null space to remain simple and it is clear how to define ψα. For large α > 0, the non-zero eigenvalues of Hα may cross 0 at isolated values of α, and in this case, we define ψα to be the real-analytic continuation of the zero mode through the crossings. Note that real-analyticity prevents non-zero eigenvalues from equaling zero except at isolated points so that the real-analytic continuation of the zero mode through the crossing must indeed be a zero mode. At crossings, the null space must be odd-dimensional in order to preserve symmetry of the spectrum of Hα about 0. It remains to check that if ψ0 is in, say, , then ψα must remain in for all α > 0. However, this must hold because the -eigenvalue of ψα cannot change abruptly while preserving real-analyticity. Smoothness of ψα follows from elliptic regularity.□
In this work, we will restrict attention to the moiré K point and, especially, the family . We expect that our analysis would go through with only minor modifications if we considered instead the moiré K′ point. The zero modes in and are related by the following symmetry.
Let and denote the zero modes of Hα in the spaces and , respectively. Then, , where , for all v ∈ Λ, Φα(Rϕr) = Φα(r). Up to gauge transformations , which preserve real-analyticity of , we have .
Since , the last two entries of must vanish, so we can write . That Φα satisfies the stated symmetries follows immediately from . It is straightforward to check using the definitions of and τv that . To see that is a zero mode, note that Φα satisfies DαΦα = 0, which implies that by a simple manipulation. To see that (up to real-analytic gauge transformations) for all α, note first that this clearly holds for α = 0 [the zero modes are explicit (22)]. For α > 0, the identity must continue to hold by uniqueness (up to real-analytic gauge transformations) of the real-analytic continuation of starting from α = 0 and continuing first along the non-zero interval where is non-degenerate in and then through eigenvalue crossings as in the Proof of Proposition III.6.□
In Appendix A, we use Proposition III.7 to derive the effective Dirac operator with the α-dependent Fermi velocity, which controls the Bloch band structure in a neighborhood of the moiré K point. The Fermi velocity of the effective Dirac operator is given by the following. Note that we drop the subscript +1 when referring to the zero mode of Hα in since the zero mode of Hα in plays no further role.
IV. RIGOROUS JUSTIFICATION OF TKV’s EXPANSION OF THE FERMI VELOCITY
A. Alternative formulation of TKV’s expansion
We now turn to approximating the zero mode by a series expansion in powers of α. We write Hα = H0 + αH1 and formally expand ψα as a series
where H0Ψ0 = 0 and
for all n ≥ 1. To solve H0Ψ0 = 0, we take Ψ0 = e1. We prove the following in Appendix C.
Expansion (24) appears different from the series studied by TKV since we work only with the self-adjoint operators H0, H1, and Hα rather than the non-self-adjoint operator Dα [defined in (1)]. Since functions in vanish in their last two components, there is no practical difference. However, working with only self-adjoint operators allows us to use the spectral theorem, which greatly simplifies the error analysis. We compute the first eight terms in expansion (24) in Proposition C.2 after developing some necessary machinery in Appendix B.
B. Rigorous error estimates for the expansion of the moiré K point Bloch function
In this section, we explain the essential challenge in proving error estimates for series (24) and explain how we overcome this challenge. Our goal is to prove the following.
Proposition IV.1 guarantees that series (24) is well-defined up to arbitrarily many terms. A straightforward bound on the growth of terms in the series comes from the following proposition.
This proposition is a combination of Propositions B.2, B.4, and B.7, proved in Appendix B.□
Proposition IV.2 implies that , which implies the following.
For any non-negative integer N, let , where Ψn are as in (26). Since Ψ0 ⊥ Ψn for all n ≥ 1 and ‖Ψ0‖ = 1, we have that ‖ψN,α‖ ≥ 1 for all N. Let ; then, we can decompose ϕN,α = cψα + ηα for some constant c and where ηα ⊥ ψα. Applying Hα to both sides, we have that . Now, fix α ≥ 0 such that . Then, α‖H1‖ < 1, and hence, the first non-zero eigenvalue of Hα is bounded away from 0 by 1 − 3α (recall that the first non-zero eigenvalues of H0 are ±1). Since ηα ⊥ ψα, where ψα spans the eigenspace of the zero eigenvalue of Hα, we have that . Using the bound ‖ΨN‖ ≤ (3α)N and the bound below on ‖ψN,α‖, we have that , which clearly → 0 as N → ∞, so that limN→∞ϕN,α = ψα (up to a non-zero constant). Now, consider v(α) defined by (23). Assuming WLOG that ‖ψα‖ = 1 and substituting ψα = ϕN,α + ηα, we find immediately, using Cauchy–Schwarz, that . In terms of ψN, we have .□
Proposition IV.3 shows that for , series (24) converges to ψα and can be used to compute the Fermi velocity. However, this restriction is too strong to prove that the Fermi velocity has a zero, which occurs at the larger value . Of course, Proposition IV.2 establishes only the most pessimistic possible bound on the expansion functions Ψn, and this bound appears to be far from sharp from explicit calculation of each Ψn; see Proposition C.3. We briefly discuss a possible route to a tighter bound in Remark C.2 but do not otherwise pursue this approach in this work.
We now explain how to obtain error estimates over a large enough range of α values to prove v(α) has a zero. We seek a solution of Hαψα = 0 in with the form
For arbitrary α, let Qα denote the projection in onto ψN,α, and Qα,⊥ ≔ I − Qα (note that Q0 = P). Note that Qα depends on N, but we suppress this to avoid clutter. We assume WLOG that Qαηα(r) = 0. It follows that ηα satisfies
To obtain a bound on ηα in L2(Ω), we require a lower bound on the operator . The following lemma gives a lower bound on this operator in terms of a lower bound on the projection of this operator onto the finite-dimensional subspace of corresponding to a finite subset of the eigenfunctions of H0. The importance of this result is that since H1 only couples finitely many modes of H0, for fixedN, by taking the subset sufficiently large, we can always arrange that ψN,α lies in this subspace.
Note that gα would be identically zero if not for the restriction that the matrix acts on since otherwise ψN,α would be an eigenfunction with eigenvalue zero for all α. As it is, g0 = 1 and α ↦ gα is real-analytic so that gα must be positive for a non-zero interval of positive α values.
For Lemma IV.1 to be useful, we must check that it is possible to choose Ξ so that the bound (31) is non-trivial, i.e., so that the constant is positive. We will prove the following in Appendix D.
The set Ξ constructed in Proposition IV.4 is the set of -eigenfunctions of H0 with eigenvalues with magnitude , augmented with two extra basis functions to ensure that . Including all -eigenfunctions of H0 with eigenvalue magnitudes up to and including ensure that ψ8,α lies in ran PΞ.
We now require the following.
Let Ξ be as in Proposition IV.4. Then, for all .
(computer-assisted). Consider acting on . Assuming that α is restricted to an interval such that the zero eigenspace of is simple, then, using orthogonality of eigenvectors corresponding to different eigenvalues and the fact that Qα is the spectral projection onto the unique zero mode of , has the same non-zero eigenvalues as the matrix Qα,⊥PΞHαPΞQα,⊥ acting on . The matrix is an 81 × 81 matrix whose spectrum is symmetric about 0 because of the chiral symmetry. When α = 0, the spectrum is explicit: 0 is a simple eigenvalue, and the smallest non-zero eigenvalues are ±1, both also simple. Proposition IV.5 is proved if we can prove that the first positive eigenvalue of is bounded away from zero by for all . Note that if this holds, the zero eigenspace of must be simple for all , and hence, our basic assumption is justified. The strategy of the proof is as follows.
Define a grid , where N is a positive integer taken sufficiently large that the grid spacing (the number 388 831 comes from Proposition IV.7).
Numerically compute the eigenvalues of for . We find that the numerically computed first positive eigenvalues of these matrices are uniformly bounded below by .
Perform a backward error analysis that fully accounts for round-off error in the numerical computation in order to prove that the exact first positive eigenvalues of the matrices must also be bounded below by at each .
Use perturbation theory to bound the exact first positive eigenvalue of below by over the whole interval of α values between 0 and .
When discussing round-off error due to working in the floating-point arithmetic, we will denote “machine epsilon” by ϵ. The significance of this number is that we will assume that all complex numbers a can be represented by floating-point numbers such that . We will also make the standard assumption about creation of round-off error in the floating-point arithmetic operations: if and are floating-point complex numbers and if and represent the numerically computed value and exact value of an arithmetic operation on the numbers and , then , where . In Python, this is indeed the case, for all reasonably sized (such that stack overflow does not occur) complex numbers, with ϵ = 2.220 44 × 10−16 (5sf). We now present the main points of parts 2–4 of the strategy, postponing proofs of intermediate lemmas to Appendix E.
For part 2 of the strategy, for each , we let denote (which is known exactly) evaluated as floating-point numbers. We generate numerically computed eigenpairs for 1 ≤ j ≤ 81 for each using numpy’s Hermitian eigensolver eigh. We find that the smallest first positive eigenvalue of for is 0.814 719 126 144 543 6 (computed using compute_PHalphaP_enclosures.py in the Github repo). Note that the difference between this number and is bounded below by 0.01.
The main tool for part 3 of the strategy is the following theorem.
See Appendix E 1.□
Naïvely, one would hope to be able to calculate enclosure intervals for every eigenvalue of and, in particular, a lower bound on the first positive eigenvalue of by directly applying Theorem IV.2 with , m = 81, and λj and vj given for each 1 ≤ j ≤ 81 by the approximate eigenpairs computed in part 2. However, we cannot directly apply the theorem because are not exactly orthonormal because of round-off error. Hence, we will prove the existence of an exactly orthonormal set close to the set and apply Theorem IV.2 to the set (with the same ) instead. Note that to carry out this strategy, we must bound the residuals . The result we need to implement this strategy is the following. Note that the result requires numerical computation of inner products and residuals, and we account for round-off error in these computations.
See Appendix E 2.□
Numerical computation (using the script compute_PHalphaP_enclosures.py in the Github repo) shows that the maximum of and over is bounded by 7 × 10−15. Hence, we can apply Theorem IV.3 with and given by the numerically computed eigenpairs of to obtain orthonormal sets whose residuals with respect to satisfy (35). The following is straightforward.
The first estimate follows from ‖PΞH0PΞ‖ ≤ 7 and ‖H1‖ ≤ 3. The second estimate follows immediately from writing the matrix in the chiral basis.□
We can now apply Theorem IV.2 with and λj, vj given by the numerically computed from part 2 and coming from Theorem IV.3, in order to derive rigorous enclosure intervals for every eigenvalue of . We find that (using the script compute_PHalphaP_enclosures.py in the Github repo) the suprema over of , , and are bounded by 1, 5 × 10−14, 7, and 8, respectively. It is then easy to see that 2 × 81 times the right-hand side of (35) is much smaller than 0.01 and is hence smaller than the distance between the minimum over of the numerically computed first positive eigenvalues of and . We can therefore conclude that the first positive eigenvalues of are bounded below by at every .
The main tool for part 4 of the strategy is the following.
See Appendix E 3.□
We would like to apply Theorem IV.4 to bound the variation of eigenvalues of . To this end, we require the following proposition, which bounds the derivative of with respect to α over the interval .
See Appendix E 4.□
Proposition IV.7 combined with Theorem IV.4 explains the choice of distance between grid points. Assuming that the first positive eigenvalue of is bounded below by at grid points between 0 and separated by h, we see that as long as
Proposition IV.7 and Theorem IV.4 guarantee that the first eigenvalue of must be greater than over the whole interval .□
In the Proof of Proposition IV.5, we bound the round-off error, which can occur in our numerical computations in order to draw rigorous conclusions. A common approach to this is the interval arithmetic; see Rump17 and references therein. Our approach applies directly to the present problem and is just as rigorous.
The results of a computation of the eigenvalues of are shown in Fig. 3.
Plot of numerically computed eigenvalues of the 81 × 81 matrix acting on (blue lines), showing that the first non-zero eigenvalues are bounded away from 0 by (red lines) when α is less than (black line). The zero eigenvalue corresponds to the subspace spanned by ψ8,α, and the non-zero eigenvalues equal those of the 80 × 80 matrix Qα,⊥PΞHαPΞQα,⊥ acting on since non-zero eigenvectors v of Qα,⊥PΞHαPΞQα,⊥ must be orthogonal to ψ8,α by orthogonality of eigenvectors corresponding to different eigenvalues.
Plot of numerically computed eigenvalues of the 81 × 81 matrix acting on (blue lines), showing that the first non-zero eigenvalues are bounded away from 0 by (red lines) when α is less than (black line). The zero eigenvalue corresponds to the subspace spanned by ψ8,α, and the non-zero eigenvalues equal those of the 80 × 80 matrix Qα,⊥PΞHαPΞQα,⊥ acting on since non-zero eigenvectors v of Qα,⊥PΞHαPΞQα,⊥ must be orthogonal to ψ8,α by orthogonality of eigenvectors corresponding to different eigenvalues.
Assuming Proposition IV.4 and Proposition IV.5, the bound (31) becomes, for all ,
We now assume the following, proved in Appendix C.
.
We can now give the Proof of Theorem IV.1.
The proof follows immediately from Lemma IV.1, Proposition IV.4, Assumption IV.5, and Proposition IV.8.□
SUPPLEMENTARY MATERIAL
In the supplementary material, we list the chiral basis functions, which span the space Ξ, list terms Ψ5 − Ψ8 in the formal expansion of ψα, and derive the TKV Hamiltonian from the Bistritzer–MacDonald model.
ACKNOWLEDGMENTS
M.L. and A.B.W. were supported, in part, by ARO MURI Award No. W911NF-14-0247 and NSF DMREF Award No. 1922165.
DATA AVAILABILITY
The data that support the findings of this study are available within the article and its supplementary material.
APPENDIX A: DERIVATION OF EXPRESSION FOR FERMI VELOCITY IN TERMS OF ZERO MODE OF Hα
The Bloch eigenvalue problem for the TKV Hamiltonian at quasi-momentum k is
where Hα is as in (1) and
By Propositions III.6 and III.7, 0 is a twofold (at least) degenerate eigenvalue at the moiré K point k = 0, with associated eigenfunctions as in Proposition III.7. In what follows, we assume that 0 is exactly twofold degenerate so that form a basis of the degenerate eigenspace. This assumption is clearly true for small α but could, in principle, be violated for α > 0.
Introducing , we derive the equivalent Bloch eigenvalue problem with k-independent boundary conditions
where
where Dx,y ≔ −i∂x,y and
Clearly, remain a basis of the zero eigenspace for the problem (A1) at k = 0.
Differentiating the operator , we find and , where I2 denotes the 2 × 2 identity matrix, so that
where the coefficients cσ,k and associated eigenvalues Ek ≈ ϵk are found by solving the matrix eigenvalue problem
Using (A2) and the explicit forms of given by Proposition III.7, we find that the matrix on the left-hand side of (A3) can be simplified to
It follows that, for small k, we have Ek ≈ ±v(α)|k|, where v(α) = |λ(α)| is as in (23).
APPENDIX B: THE CHIRAL BASIS OF AND ACTION OF H0 AND H1 WITH RESPECT TO THIS BASIS
1. The spectrum and eigenfunctions of H0 in
The first task is to understand the spectrum and eigenfunctions of H0 in . In the Appendix B 2, we will discuss the spectrum and eigenfunctions of H0 in . Recall that
where . To describe the eigenfunctions of H0 in , we introduce some notation. Let be a vector in . Then, we will write
Finally, let V denote the area of the moiré cell Ω.
The proof is a straightforward calculation taking into account the boundary conditions given by (20) with k = 0. For example, e2 and e4 are zero eigenfunctions of H0 but in , not .□
Note that (as it must be because of the chiral symmetry) the spectrum is symmetric about 0 and all of the eigenfunctions with negative eigenvalues are given by applying to the eigenfunctions with positive eigenvalues.
The union of the lattices Λ* and Λ* + q1 has the form of a honeycomb lattice in momentum space, where the lattice Λ* corresponds to “A” sites and Λ* + q1 corresponds to “B” sites (or vice versa); see Fig. 4.
Diagram showing A (blue) and B (red) sites of the momentum space lattice. Each site corresponds to two -eigenvalues of H0, given by ± the distance between the site and the origin (black). The lattice vectors b1 and b2 are shown as well as the A site nearest neighbor vectors q1, q2, and q3.
Diagram showing A (blue) and B (red) sites of the momentum space lattice. Each site corresponds to two -eigenvalues of H0, given by ± the distance between the site and the origin (black). The lattice vectors b1 and b2 are shown as well as the A site nearest neighbor vectors q1, q2, and q3.
2. The spectrum and eigenfunctions of H0 in
We now discuss the spectrum of H0 in .
The proof is another straightforward calculation starting from Proposition B.1.□
For an illustration of the support of the -eigenfunctions of H0 on the momentum space lattice, see Fig. 5. It is important to note that the notation introduced in Proposition B.2 is not one-to-one because, for example,
for any G ≠ 0 in Λ*.
Diagram showing support of -eigenfunctions of H0 superposed on the momentum space lattice. Each eigenfunction is given by superposing an -eigenfunction of H0 with its rotations by and . The support of the eigenfunctions with eigenvalues ±1 is shown with the black crosses, while the support of the eigenfunctions with eigenvalues is shown with the black circles.
Diagram showing support of -eigenfunctions of H0 superposed on the momentum space lattice. Each eigenfunction is given by superposing an -eigenfunction of H0 with its rotations by and . The support of the eigenfunctions with eigenvalues ±1 is shown with the black crosses, while the support of the eigenfunctions with eigenvalues is shown with the black circles.
3. The chiral basis of
Recall that zero modes of Hα can be assumed to be eigenfunctions of the chiral symmetry operator . It follows that the most convenient basis for our purposes is not be the spectral basis just introduced but the basis of consisting of eigenfunctions of . We call this basis the chiral basis.
The following is straightforward.
The chiral basis is an orthonormal basis of . The modes , , and are +1 eigenfunctions of , while the modes and are −1 eigenfunctions of .
Written out, chiral basis functions have a very simple form. We have
and for all G ∈ Λ*\{0},
and for all G ∈ Λ*,
We use the chiral basis to divide up as follows.
We define spaces to be the spans of the ±1 eigenfunctions of in , respectively.
Clearly, we have
We can divide up the chiral basis more finely as follows.
Note that notations A and B in Definition B.3 refer to A and B sites of the momentum space lattice, not to the A and B sites of the real space lattice. Recalling Remark III.1 and comparing (B2) and (B3) with (2), we see that corresponds to wave-functions supported on A sites of layer 1, corresponds to wave-functions supported on A sites of layer 2, corresponds to wave-functions supported on B sites of layer 1, and corresponds to wave-functions supported on B sites of layer 2.
Clearly, we have
The following propositions are straightforward to prove. For the first claim, note that .
In the Appendixes B 4 and 5, we will study the action of the operator H1 on with respect to the chiral basis.
4. The spectrum of H1 in and
Recall that
where . We claim the following.
By taking linear combinations of rotated copies of the H1 eigenfunctions, just as we did with the H0 eigenfunctions, it is straightforward to prove an analogous result to Proposition B.6 in . We record only the following.
5. The action of H1 on with respect to the chiral basis
We now want to study the action of H1 on with respect to the chiral basis. We will prove two propositions, which parallel Proposition B.4.
Note that H1 exchanges chirality ( eigenvalue) and the A and B momentum space sublattices, while H0 only exchanges chirality. Proposition B.8 has a simple interpretation in terms of nearest neighbor hopping in the momentum space lattice; see Figs. 6 and 7.
Illustration of the action of H1 in as hopping in the momentum space lattice described by Eq. (B7) (left, starting at b1) and (B8) (right, starting at q1 + b1 − b2). The origin is marked by a black dot.
Illustration of the action of H1 as hopping in the momentum space lattice described by Eq. (B5) (left, starting at 0) and (B6) (right, starting at q1). Although it appears that the hopping in these cases does not respect rotation symmetry, this is an artifact of working with chiral basis functions, which individually respect the rotation symmetry; see (B9).
Illustration of the action of H1 as hopping in the momentum space lattice described by Eq. (B5) (left, starting at 0) and (B6) (right, starting at q1). Although it appears that the hopping in these cases does not respect rotation symmetry, this is an artifact of working with chiral basis functions, which individually respect the rotation symmetry; see (B9).
The proof is similar to that of Proposition B.8 and is hence omitted.□
APPENDIX C: FORMAL EXPANSION OF THE ZERO MODE
We now bring to bear the developments of the Appendix B on the asymptotic expansion of the zero mode starting from . We first give the Proof of Proposition IV.1.
Our goal is to calculate satisfying the conditions of Proposition B.4 up to n = 8. This amounts to calculating, for n = 1 to n = 8,
We do this algorithmically by repeated application of the following proposition, which combines Proposition B.8 and Proposition B.5.
We now claim the following.
We give the explicit forms of Ψ5-Ψ8 in the supplementary material.
Using orthonormality of the chiral basis functions, it is straightforward to calculate the norms of each of the Ψn. We have the following.
Note that the sequence of norms of the expansion functions grows much slower than the pessimistic bound ‖ΨN+1‖ ≤ 3‖ΨN‖, N = 0, 1, 2, … guaranteed by Proposition IV.2. The reason is that bounds (27) and (28) are never attained. As N becomes larger, the bound (27) is very pessimistic because ΨN is mostly made up of eigenfunctions of H0 with eigenvalues strictly larger than 1. The bound (28) is also very pessimistic because it is attained only at delta functions, which can only be approximated with a superposition of a large number of eigenfunctions of H0. It seems possible that a sharper bound could be proved starting from these observations, but we do not pursue this in this work.
We finally give the Proof of Proposition IV.8.
APPENDIX D: PROOF OF PROPOSITION IV.4
We choose Ξ as
Part 1 of Proposition IV.4 follows immediately from observing that is not in Ξ, but |q1 − 2b1 − 2b2| = 7. That μ = 7 is optimal can be seen from Fig. 8.
Illustration of Ξ in the momentum space lattice. The circle has radius so that every dot within the circle corresponds to two chiral basis vectors included in Ξ. Chiral basis vectors exactly away from the origin, marked with black dots, are also included in Ξ. We also include in Ξ the chiral basis vectors , which correspond to the dots marked with circles, which are distance 7 (NB. ) from the origin. We do not include the chiral basis vectors , marked with the black crosses, which are also a distance 7 from the origin. The reason for this is so that part 3 of Proposition IV.4 holds. With this choice, every dot in Ξ has at most one nearest neighbor lattice point outside of Ξ. It follows immediately from Propositions B.8 and B.9 (H1 acts by nearest neighbor hopping in the momentum space lattice) that . Note that if we chose Ξ to include , this would no longer hold because these basis functions would have two nearest neighbors outside Ξ, resulting in the worse bound .
Illustration of Ξ in the momentum space lattice. The circle has radius so that every dot within the circle corresponds to two chiral basis vectors included in Ξ. Chiral basis vectors exactly away from the origin, marked with black dots, are also included in Ξ. We also include in Ξ the chiral basis vectors , which correspond to the dots marked with circles, which are distance 7 (NB. ) from the origin. We do not include the chiral basis vectors , marked with the black crosses, which are also a distance 7 from the origin. The reason for this is so that part 3 of Proposition IV.4 holds. With this choice, every dot in Ξ has at most one nearest neighbor lattice point outside of Ξ. It follows immediately from Propositions B.8 and B.9 (H1 acts by nearest neighbor hopping in the momentum space lattice) that . Note that if we chose Ξ to include , this would no longer hold because these basis functions would have two nearest neighbors outside Ξ, resulting in the worse bound .
Part 2 follows from the fact that ψ8,α depends only on eigenfunctions of H0 with eigenvalues with magnitude less than or equal to . The largest eigenvalue is , coming from dependence of Ψ8 on , since .
Part 3 can be seen from Fig. 8.
APPENDIX E: PROOF OF PROPOSITION IV.5
1. Proof of Theorem IV.2
We will prove Theorem IV.2 starting from Theorem 11.5.1 of Parlett,19 where the proof can be found.
2. Proof of Theorem IV.3
We start with the following lemma, which guarantees that numerically computed approximately orthonormal sets can be approximated by exactly orthonormal sets.
Using Lemma E.2, we have that there exists an exactly orthonormal set nearby to the set . We now want to bound the residuals of the in terms of numerically computable quantities. We start with the following easy lemma whose proof is a straightforward manipulation.
The following lemma quantifies the error in approximating exact residuals by numerically computed values.
We now prove estimate (35). Applying Lemma E.2 to the set yields an orthonormal set such that , where μ is as in (E1). By Lemma E.3, we have that
The estimate now follows easily upon applying Lemma E.4 and taking the sup over j.□
3. Proof of Theorem IV.4
4. Proof of Proposition IV.7
APPENDIX F: PROOF OF PROPOSITION II.1
We can now prove Proposition II.1. We start by proving (11).
1. Proof of (11)
We now prove (11). It is straightforward to derive
We now make two observations which simplify the computation. First, recall that the operator maps and . It follows that , , , and so on, and hence,
It follows that all terms in (F1) with odd powers of α vanish. Second, note that since Ψ0 ∈ ranP, while Ψn ∈ ranP⊥ for all n ≥ 1, we have that
Deriving (11) is then just a matter of computation using the properties of the chiral basis. For the leading term, we have
For the α2 term, the only non-zero term is
using (C4). For the α4 term, the possible non-zero terms are
using (C5) and orthgonality of and . We omit the derivation of the higher terms since the derivations do not require any new ideas.
2. Proof of (10)
It is straightforward to derive
We now note the following.
Let χ be a chiral basis function in . Then, χ*(−r) = χ(r).
Using Proposition F.1 and the same two observations as in the Appendix F 1, we have that the only non-zero terms in (F2) are those with even powers of α and that other than the leading term, terms involving Ψ0 do not contribute. The calculation is then similar to the previous case. For the leading order term, we have
The only non-zero α2 term is
The only non-zero α4 term is
We omit the derivation of the higher terms since the derivations do not require any new ideas.
Proposition IV.1 implies that the series expansion of ψα exists up to any order. We can therefore define formal infinite series by
We then have the following.
The series agree exactly without any simplifications up to terms of α9. However, because the even and odd terms in the expansion of ψα are orthogonal (since they lie in and , respectively), all terms with odd powers of α vanish in expansions (F3) and (F4). The series may disagree at order α10 because the infinite series includes terms arising from inner products of Ψ1 and Ψ9.□