Nonlinearities in finite dimensions can be linearized by projecting them into infinite dimensions. Unfortunately, the familiar linear operator techniques that one would then hope to use often fail since the operators cannot be diagonalized. The curse of nondiagonalizability also plays an important role even in finite-dimensional linear operators, leading to analytical impediments that occur across many scientific domains. We show how to circumvent it via two tracks. First, using the well-known holomorphic functional calculus, we develop new practical results about spectral projection operators and the relationship between left and right generalized eigenvectors. Second, we generalize the holomorphic calculus to a meromorphic functional calculus that can decompose arbitrary functions of nondiagonalizable linear operators in terms of their eigenvalues and projection operators. This simultaneously simplifies and generalizes functional calculus so that it is readily applicable to analyzing complex physical systems. Together, these results extend the spectral theorem of normal operators to a much wider class, including circumstances in which poles and zeros of the function coincide with the operator spectrum. By allowing the direct manipulation of individual eigenspaces of nonnormal and nondiagonalizable operators, the new theory avoids spurious divergences. As such, it yields novel insights and closed-form expressions across several areas of physics in which nondiagonalizable dynamics arise, including memoryful stochastic processes, open nonunitary quantum systems, and far-from-equilibrium thermodynamics. The technical contributions include the first full treatment of arbitrary powers of an operator, highlighting the special role of the zero eigenvalue. Furthermore, we show that the Drazin inverse, previously only defined axiomatically, can be derived as the negative-one power of singular operators within the meromorphic functional calculus, and we give a new general method to construct it. We provide new formulae for constructing spectral projection operators and delineate the relations among projection operators, eigenvectors, and left and right generalized eigenvectors. By way of illustrating its application, we explore several rather distinct examples. First, we analyze stochastic transition operators in discrete and continuous time. Second, we show that nondiagonalizability can be a robust feature of a stochastic process, induced even by simple counting. As a result, we directly derive distributions of the time-dependent Poisson process and point out that nondiagonalizability is intrinsic to it and the broad class of hidden semi-Markov processes. Third, we show that the Drazin inverse arises naturally in stochastic thermodynamics and that applying the meromorphic functional calculus provides closed-form solutions for the dynamics of key thermodynamic observables. Finally, we draw connections to the Ruelle–Frobenius–Perron and Koopman operators for chaotic dynamical systems and propose how to extract eigenvalues from a time series.

... the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.

A. Einstein [Ref. 1, p. 165]

Decomposing a complicated system into its constituent parts—reductionism—is one of science’s most powerful strategies for analysis and understanding. Large-scale systems with linearly coupled components give one paradigm of this success. Each can be decomposed into an equivalent system of independent elements using a similarity transformation calculated from the linear algebra of the system’s eigenvalues and eigenvectors. The physics of linear wave phenomena, whether of classical light or quantum mechanical amplitudes, sets the standard of complete reduction rather high. The dynamics is captured by an “operator” whose allowed or exhibited “modes” are the elementary behaviors out of which composite behaviors are constructed by simply weighting each mode’s contribution and adding them up.

However, one should not reduce a composite system more than is necessary nor, as is increasingly appreciated these days, more than one, in fact, can. Indeed, we live in a complex, nonlinear world whose constituents are strongly interacting. Often their key structures and memoryful behaviors emerge only over space and time. These are the complex systems. Yet, perhaps surprisingly, many complex systems with nonlinear dynamics correspond to linear operators in abstract high-dimensional spaces.2–4 And so, there is a sense in which even these complex systems can be reduced to the study of independent nonlocal collective modes.

Reductionism, however, faces its own challenges even within its paradigmatic setting of linear systems: linear operators may have interdependent modes with irreducibly entwined behaviors. These irreducible components correspond to so-called nondiagonalizable subspaces. No similarity transformation can reduce them.

In this view, reductionism can only ever be a guide. The actual goal is to achieve a happy medium, as Einstein reminds us, of decomposing a system only to that level at which the parts are irreducible. To proceed, though, begs the original question, What happens when reductionism fails? To answer this requires revisiting one of its more successful implementations, spectral decomposition of completely reducible operators.

Spectral decomposition—splitting a linear operator into independent modes of simple behavior—has greatly accelerated progress in the physical sciences. The impact stems from the fact that spectral decomposition is not only a powerful mathematical tool for expressing the organization of large-scale systems, but also yields predictive theories with directly observable physical consequences.5 Quantum mechanics and statistical mechanics identify the energy eigenvalues of Hamiltonians as the basic objects in thermodynamics: transitions among the energy eigenstates yield heat and work. The eigenvalue spectrum reveals itself most directly in other kinds of spectra, such as the frequency spectra of light emitted by the gases that permeate the galactic filaments of our universe.6 Quantized transitions, an initially mystifying feature of atomic-scale systems, correspond to distinct eigenvectors and discrete spacing between eigenvalues. The corresponding theory of spectral decomposition established the quantitative foundation of quantum mechanics.

The applications and discoveries enabled by spectral decomposition and the corresponding spectral theory fill a long list. In application, direct-bandgap semiconducting materials can be turned into light-emitting diodes (LEDs) or lasers by engineering the spatially-inhomogeneous distribution of energy eigenvalues and the occupation of their corresponding states.7 Before their experimental discovery, anti-particles were anticipated as the nonoccupancy of negative-energy eigenstates of the Dirac Hamiltonian.8 

The spectral theory, though, extends far beyond physical science disciplines. In large measure, this arises since the evolution of any object corresponds to a linear dynamic in a sufficiently high-dimensional state space. Even nominally nonlinear dynamics over several variables, the canonical mechanism of deterministic chaos, appear as linear dynamics in appropriate infinite-dimensional shift-spaces.4 A nondynamic version of rendering nonlinearities into linearities in a higher-dimensional feature space is exploited with much success today in machine learning by support vector machines; see, for example, Ref. 9. Spectral decomposition often allows a problem to be simplified by approximations that use only the dominant contributing modes. Indeed, human-face recognition can be efficiently accomplished using a small basis of “eigenfaces”.10

Certainly, there are many applications that highlight the importance of decomposition and the spectral theory of operators. However, a brief reflection on the mathematical history will give better context to its precise results, associated assumptions, and, more to the point, the generalizations we develop here in hopes of advancing the analysis and understanding of complex systems.

Following on early developments of operator theory by Hilbert and co-workers,11 the spectral theorem for normal operators reached maturity under von Neumann by the early 1930s.12,13 It became the mathematical backbone of much progress in physics since then, from classical partial differential equations to quantum physics. Normal operators, by definition, commute with their Hermitian conjugate: $A^\dagger A = A A^\dagger$. Examples include symmetric and orthogonal matrices in classical mechanics and Hermitian, skew-Hermitian, and unitary operators in quantum mechanics.

The spectral theorem itself is often identified as a collection of related results about normal operators; see, e.g., Ref. 14. In the case of finite-dimensional vector spaces,15 the spectral theorem asserts that normal operators are diagonalizable and can always be diagonalized by a unitary transformation; that left and right eigenvectors (or eigenfunctions) are simply related by complex-conjugate transpose; that these eigenvectors form a complete basis; and that functions of a normal operator reduce to the action of the function on each eigenvalue. Most of these qualities survive with only moderate provisos in the infinite-dimensional case. In short, the spectral theorem makes physics governed by normal operators tractable.

The spectral theorem, though, appears powerless when faced with nonnormal and nondiagonalizable operators. What then are we to do when confronted by, say, complex interconnected systems with nonunitary time evolution, by open systems, by structures that emerge on space and time scales different from the equations of motion, or by other novel physics governed by nonnormal and not-necessarily-diagonalizable operators? Where is the comparably constructive framework for calculations beyond the standard spectral theorem? Fortunately, portions of the necessary generalization have been made within pure mathematics,16 some finding applications in engineering and control.17,18 However, what is available is incomplete. And, even that which is available is often not in a form adapted to perform calculations that lead to quantitative predictions.

Here, we build on previous work in functional analysis and operator theory to provide both a rigorous and constructive foundation for physically relevant calculations involving not-necessarily-diagonalizable operators. In effect, we extend the spectral theorem for normal operators to a broader setting, allowing generalized “modes” of nondiagonalizable systems to be identified and manipulated. The meromorphic functional calculus we develop extends Taylor series expansion and standard holomorphic functional calculus to analyze arbitrary functions of not-necessarily-diagonalizable operators. It readily handles singularities arising when poles (or zeros) of the function coincide with poles of the operator’s resolvent—poles that appear precisely at the operator’s eigenvalues. Pole–pole and pole–zero interactions substantially modify the complex-analytic residues within the functional calculus. A key result is that the negative-one power of a singular operator exists in the meromorphic functional calculus. It is the Drazin inverse, a powerful tool that is receiving increased attention in stochastic thermodynamics and elsewhere. Furthermore, we derive consequences from the more familiar holomorphic functional calculus that readily allow spectral decomposition of nondiagonalizable operators in terms of spectral projections and left and right generalized eigenvectors—decanting the abstract mathematical theory into a more tractable framework for analyzing complex physical systems.

Taken together, the functional calculus, Drazin inverse, and methods to manipulate particular eigenspaces are key to a thorough-going analysis of many complex systems, many now accessible for the first time. Indeed, the framework has already been fruitfully employed in several specific applications, including closed-form expressions for signal processing and information measures of hidden Markov processes,19–23 compressing stochastic processes over a quantum channel,24,25 and stochastic thermodynamics.26,27 However, the techniques are sufficiently general that they will be much more widely useful. We envision new opportunities for similar detailed analyses, ranging from biophysics to quantum field theory, wherever restrictions to normal operators and diagonalizability have been roadblocks.

With this broad scope in mind, we develop the mathematical theory first without reference to specific applications and disciplinary terminology. We later give pedagogical (yet, we hope, interesting) examples, exploring several niche, but important applications to finite hidden Markov processes, basic stochastic process theory, nonequilibrium thermodynamics, signal processing, and nonlinear dynamical systems. At a minimum, the examples and their breadth serve to better acquaint readers with the basic methods required to employ the theory.

We introduce the meromorphic functional calculus in Sec. III through Sec. IV, after necessary preparation in Sec. II. Section V A further explores and gives a new formula for eigenprojectors, which we refer to here simply as projection operators. Section V B makes explicit their general relationship with eigenvectors and generalized eigenvectors and clarifies the orthonormality relationship among left and right generalized eigenvectors. Section V B 4 then discusses simplifications of the functional calculus for special cases, while Sec. VI A takes up the spectral properties of transition operators.

Section VI then turns to several applications that demonstrate the theoretical results’ broad utility. In particular, the advantage of the meromorphic approach over holomorphic is demonstrated in deriving Eq. (46) of Sec. VI D on stochastic thermodynamics. This enables a general operator approach to analyzing the nonequilibrium thermodynamics of complex systems. Generally, though, each application benefits from the meromorphic approach via the analytical forms derived from it. Specifically, the meromorphic calculus allows one to derive: (i) arbitrary powers of an operator (Eq. (25)), explicitly including the qualitatively distinct contribution of the zero eigenspace; (ii) negative powers (Eq. (28)) and the Drazin inverse (Eq. (29)); and (iii) key formulae for the spectral projection operators, including Eq. (35). These results make the applications tractable. In this way, the applications demonstrate how the meromorphic toolset is used to analyze linear operators in a range of settings. The result is an efficient and intuitive analysis of the structure and randomness in complex systems. Finally, Sec. VII closes with suggestions for future applications and research directions.

The following is relatively self-contained, assuming basic familiarity with linear algebra at the level of Refs. 15 and 17—including eigen-decomposition and knowledge of the Jordan canonical form, partial fraction expansion (see Ref. 28), and series expansion—and basic knowledge of complex analysis—including the residue theorem and calculation of residues at the level of Ref. 29. For those lacking a working facility with these concepts, a quick review of Sec. VI’s applications may motivate reviewing them. In this section, we introduce our notation and, in doing so, remind the reader of certain basic concepts in linear algebra and complex analysis that will be used extensively in the following.

To begin, we restrict attention to operators with finite representations and only sometimes do we take the limit of dimension going to infinity. That is, we do not consider infinite-rank operators outright. While this runs counter to previous presentations in mathematical physics that consider only infinite-dimensional operators, the upshot is that they—as limiting operators—can be fully treated with a countable point spectrum. We present examples of this later on. Accordingly, we restrict our attention to operators with at most a countably infinite spectrum. Such operators share many features with finite-dimensional square matrices, and so we recall several elementary but essential facts from matrix theory used repeatedly in the main development.

If A is a finite-dimensional square matrix, then its spectrum is simply the set $\Lambda_A$ of its eigenvalues:

$$\Lambda_A = \{ \lambda \in \mathbb{C} : \det(\lambda I - A) = 0 \},$$

where $\det(\cdot)$ is the determinant of its argument and $I$ is the identity matrix. The algebraic multiplicity $a_\lambda$ of eigenvalue $\lambda$ is the power of the term $(z - \lambda)$ in the characteristic polynomial $\det(zI - A)$. In contrast, the geometric multiplicity $g_\lambda$ is the dimension of the kernel of the transformation $A - \lambda I$ or, equivalently, the number of linearly independent eigenvectors associated with the eigenvalue. The algebraic and geometric multiplicities are all equal when the matrix is diagonalizable.

Since there can be multiple subspaces associated with a single eigenvalue, corresponding to different Jordan blocks in the Jordan canonical form, it is structurally important to distinguish the index of the eigenvalue associated with the largest of these subspaces.30 

Definition 1.

Eigenvalue $\lambda$'s index $\nu_\lambda$ is the size of the largest Jordan block associated with $\lambda$.

If $z \notin \Lambda_A$, then $\nu_z = 0$. Note that the index of the operator A itself is sometimes discussed.31 In such contexts, the index of A is $\nu_0$. Hence, $\nu_\lambda$ corresponds to the index of $A - \lambda I$.

The index of an eigenvalue gives information beyond what the algebraic and geometric multiplicities themselves yield. Nevertheless, for $\lambda \in \Lambda_A$, it is always true that $\nu_\lambda - 1 \leq a_\lambda - g_\lambda \leq a_\lambda - 1$. In the diagonalizable case, $a_\lambda = g_\lambda$ and $\nu_\lambda = 1$ for all $\lambda \in \Lambda_A$.
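
As a concrete illustration of these definitions, the following minimal sketch (in Python with NumPy, our choice for code illustrations throughout; the example matrix and loop bound are our own illustrative choices) assembles a matrix of known Jordan structure and recovers $a_\lambda$, $g_\lambda$, and $\nu_\lambda$ from rank computations:

```python
import numpy as np

# Hand-built example: eigenvalue 2 carries one 2x2 Jordan block and one
# 1x1 block (so a_2 = 3, g_2 = 2, nu_2 = 2); eigenvalue 5 is simple.
A = np.array([[2., 1., 0., 0.],
              [0., 2., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 5.]])

lam = 2.0
B = A - lam * np.eye(4)
a = int(sum(np.isclose(np.linalg.eigvals(A), lam)))   # algebraic multiplicity
g = 4 - np.linalg.matrix_rank(B)                      # geometric multiplicity
# Index: smallest m at which the rank of B^m stops shrinking.
nu = next(m for m in range(1, 5)
          if np.linalg.matrix_rank(np.linalg.matrix_power(B, m))
          == np.linalg.matrix_rank(np.linalg.matrix_power(B, m + 1)))
print(a, g, nu)   # 3 2 2, consistent with nu - 1 <= a - g <= a - 1
```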

The following employs basic features of complex analysis extensively in conjunction with linear algebra. Let us therefore review several elementary notions in complex analysis. Recall that a holomorphic function is one that is complex differentiable throughout the domain under consideration. A pole of order $n$ at $z_0 \in \mathbb{C}$ is a singularity that behaves as $h(z)/(z - z_0)^n$ as $z \to z_0$, where $h(z)$ is holomorphic within a neighborhood of $z_0$ and $h(z_0) \neq 0$. We say that $h(z)$ has a zero of order $m$ at $z_1$ if $1/h(z)$ has a pole of order $m$ at $z_1$. A meromorphic function is one that is holomorphic except possibly at a set of isolated poles within the domain under consideration.

Defined over the continuous complex variable $z \in \mathbb{C}$, A's resolvent:

$$R(z; A) \equiv (zI - A)^{-1},$$

captures all of A's spectral information through the poles of $R(z; A)$'s matrix elements. In fact, the resolvent contains more than just A's spectrum: we later show that the order of each pole gives the index $\nu$ of the corresponding eigenvalue.

The spectrum $\Lambda_A$ can be expressed in terms of the resolvent. Explicitly, the point spectrum (i.e., the set of eigenvalues) is the set of complex values $z$ at which $zI - A$ is not a one-to-one mapping, with the implication that the inverse of $zI - A$ does not exist:

$$\Lambda_A = \{ \lambda \in \mathbb{C} : R(\lambda; A) \neq \mathrm{inv}(\lambda I - A) \},$$

where $\mathrm{inv}(\cdot)$ is the inverse of its argument. Later, via our investigation of the Drazin inverse, it should become clear that the resolvent operator can be self-consistently defined at the spectrum, despite the lack of an inverse.

For infinite-rank operators, the spectrum becomes more complicated. In that case, the right point spectrum (the point spectrum of A) need not be the same as the left point spectrum (the point spectrum of A's dual $A^\dagger$). Moreover, the spectrum may grow to include non-eigenvalues $z$ for which the range of $zI - A$ is not dense in the vector space it transforms or for which $zI - A$ has dense range but the inverse of $zI - A$ is not bounded. These two settings give rise to the so-called residual spectrum and continuous spectrum, respectively.32 To mitigate confusion, it should be noted that the point spectrum can be continuous, yet never coincides with the continuous spectrum just described. Moreover, only an understanding of countable point spectra is necessary to follow the developments here.

Each of A's eigenvalues $\lambda$ has an associated projection operator $A_\lambda$, which is the residue of the resolvent as $z \to \lambda$.14 Explicitly:

$$A_\lambda = \mathrm{Res}\left( (zI - A)^{-1}, \, z \to \lambda \right),$$

where $\mathrm{Res}(\cdot, \, z \to \lambda)$ is the element-wise residue of its first argument as $z \to \lambda$. The projection operators are orthonormal:

$$A_\lambda A_\zeta = \delta_{\lambda,\zeta} A_\lambda$$
(1)

and sum to the identity:

$$I = \sum_{\lambda \in \Lambda_A} A_\lambda.$$
(2)
The following discusses in detail and then derives several new properties of projection operators.
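
Before moving on, a brief numerical aside may help make the residue formula concrete. The sketch below (our own illustration; the matrix and contour radius are arbitrary choices) approximates the defining contour integral of the resolvent by discretizing a small circle around each eigenvalue, and then confirms the two projection-operator properties just stated:

```python
import numpy as np

def spectral_projector(A, lam, radius=0.1, n_points=2000):
    """Approximate (1/2 pi i) * contour integral of the resolvent (zI - A)^{-1}
    around the eigenvalue lam, via the trapezoidal rule on a small circle."""
    dim = A.shape[0]
    total = np.zeros((dim, dim), dtype=complex)
    for k in range(n_points):
        phi = 2 * np.pi * k / n_points
        z = lam + radius * np.exp(1j * phi)
        dz = 1j * radius * np.exp(1j * phi) * (2 * np.pi / n_points)
        total += np.linalg.inv(z * np.eye(dim) - A) * dz
    return total / (2j * np.pi)

# Nondiagonalizable example: a 2x2 Jordan block at 0.5, plus eigenvalue 1.
A = np.array([[0.5, 1.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
P_half = spectral_projector(A, 0.5)
P_one = spectral_projector(A, 1.0)
print(np.allclose(P_half @ P_one, 0))          # orthogonality, Eq. (1)
print(np.allclose(P_half + P_one, np.eye(3)))  # resolution of identity, Eq. (2)
```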

In the following, we develop an extended functional calculus that makes sense of arbitrary functions $f(\cdot)$ of a linear operator A. Within any functional calculus, one considers how A's eigenvalues map to the eigenvalues of $f(A)$, which we call a spectral mapping. For example, it is known that holomorphic functions of bounded linear operators enjoy an especially simple spectral mapping theorem:33

$$\Lambda_{f(A)} = f(\Lambda_A).$$

To fully appreciate the meromorphic functional calculus, we first state and compare the main features and limitations of alternative functional calculi.

Inspired by the Taylor expansion of scalar functions:

$$f(a) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\xi)}{n!} (a - \xi)^n,$$

a calculus for functions of an operator A can be based on the series:

$$f(A) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\xi)}{n!} (A - \xi I)^n,$$
(3)

where $f^{(n)}(\xi)$ is the $n$th derivative of $f(z)$ evaluated at $z = \xi$.

This is often used, for example, to express the exponential of A as:

$$e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!}.$$

This particular series expansion converges for any A since $e^z$ is entire, in the sense of complex analysis. Unfortunately, for most functions the series, even when it exists, has a limited domain of convergence. For example, suppose $f(z)$ has poles and choose a Maclaurin series; i.e., $\xi = 0$ in Eq. (3). Then the series only converges when A's spectral radius is less than the modulus of the innermost pole of $f(z)$. Addressing this and related issues leads directly to alternative functional calculi.

Holomorphic functions are well behaved, smooth functions that are complex differentiable. Given a function f(·) that is holomorphic within a disk enclosed by a counterclockwise contour C, its Cauchy integral formula is given by:

$$f(a) = \frac{1}{2\pi i} \oint_C f(z) (z - a)^{-1} \, dz,$$
(4)

Taking this as inspiration, the holomorphic functional calculus performs a contour integration of the resolvent to extend f(·) to operators:

$$f(A) = \frac{1}{2\pi i} \oint_{C_{\Lambda_A}} f(z) (zI - A)^{-1} \, dz,$$
(5)

where $C_{\Lambda_A}$ is a closed counterclockwise contour that encompasses $\Lambda_A$. Assuming that $f(z)$ is holomorphic at $z = \lambda$ for all $\lambda \in \Lambda_A$, a nontrivial calculation30 shows that Eq. (5) is equivalent to the holomorphic calculus defined by:

$$f(A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} \frac{f^{(m)}(\lambda)}{m!} (A - \lambda I)^m A_\lambda.$$
(6)

After some necessary development, we will later derive Eq. (6) as a special case of our meromorphic functional calculus, showing that Eq. (6) is valid whenever $f(z)$ is holomorphic at $z = \lambda$ for all $\lambda \in \Lambda_A$.

The holomorphic functional calculus was first proposed in Ref. 30 and is now in wide use; e.g., see [Ref. 17, p. 603]. It agrees with the Taylor-series approach whenever the infinite series converges, but gives a functional calculus when the series approach fails. For example, using the principal branch of the complex logarithm, the holomorphic functional calculus admits $\log(A)$ for any nonsingular matrix, with the satisfying result that $e^{\log(A)} = A$. By contrast, the Taylor-series approach fails to converge for the logarithm of most matrices, even if the expansion for, say, $\log(1 - z)$ is used.
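
A quick numerical illustration (a sketch using SciPy's logm and expm, which compute the principal matrix logarithm and the matrix exponential; the example matrix is our own choice): for a nondiagonalizable matrix whose spectral radius exceeds 1, Maclaurin-type series for the logarithm cannot converge, yet the functional-calculus result behaves exactly as expected:

```python
import numpy as np
from scipy.linalg import logm, expm

# Nonnormal, nondiagonalizable, spectral radius 3: series expansions of
# the logarithm about the origin fail here, but logm is well defined.
A = np.array([[3.0, 1.0],
              [0.0, 3.0]])
L = logm(A)
print(np.allclose(expm(L), A))   # True: e^{log(A)} = A
```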

The major shortcoming of the holomorphic functional calculus is that it assumes $f(z)$ is holomorphic at $\Lambda_A$. Clearly, if $f(z)$ has a pole at some $z \in \Lambda_A$, then Eq. (6) fails. An example of such a failure is the negative-one power of a singular operator, which we take up later on.

Several efforts have been made to extend the holomorphic functional calculus. For example, Refs. 34 and 35 define a functional calculus that extends the standard holomorphic functional calculus to include a certain class of meromorphic functions that are nevertheless still required to be holomorphic on the point spectrum (i.e., on the eigenvalues) of the operator. However, we are not aware of any previous work that introduces and develops the consequences of a functional calculus for functions that are meromorphic on the point spectrum—which we take up in the next few sections.

Meromorphic functions are holomorphic except at a set of isolated poles of the function. The resolvent of a finite-dimensional operator is meromorphic, since it is holomorphic everywhere except for poles at the eigenvalues of the operator. We will now also allow our function f(z) to be meromorphic with possible poles that coincide with the poles of the resolvent.

Inspired again by the Cauchy integral formula of Eq. (4), but removing the restriction to holomorphic functions, our meromorphic functional calculus instead employs a partitioned contour integration of the resolvent:

$$f(A) = \sum_{\lambda \in \Lambda_A} \frac{1}{2\pi i} \oint_{C_\lambda} f(z) \, R(z; A) \, dz,$$

where $C_\lambda$ is a small counterclockwise contour around the eigenvalue $\lambda$. This and a spectral decomposition of the resolvent (to be derived later) extends the holomorphic calculus to a much wider domain, defining:

$$f(A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} A_\lambda (A - \lambda I)^m \, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{f(z)}{(z - \lambda)^{m+1}} \, dz.$$
(7)

The contour is integrated using knowledge of $f(z)$, since meromorphic $f(z)$ can introduce poles and zeros at $\Lambda_A$ that interact with the resolvent's poles.

The meromorphic functional calculus agrees with the Taylor-series approach whenever the series converges and agrees with the holomorphic functional calculus whenever f(z) is holomorphic at ΛA. However, when both the previous functional calculi fail, the meromorphic calculus extends the domain of f(A) to yield surprising, yet sensible answers. For example, we show that within it, the negative-one power of a singular operator is the Drazin inverse—an operator that effectively inverts everything that is invertible.

The major assumption of our meromorphic functional calculus is that the domain of operators must have a spectrum that is at most countably infinite—e.g., A can be any compact operator. A related limitation is that singularities of f(z) that coincide with ΛA must be isolated singularities. Nevertheless, we expect that these restrictions can be lifted with proper treatment, as discussed in fuller context later.

The preceding gave an overview of the relationship between alternative functional calculi and their trade-offs, highlighting the advantages of the meromorphic functional calculus. This section leverages these advantages and employs a partial fraction expansion of the resolvent to give a general spectral decomposition of almost any function of any operator. Then, since it plays a key role in applications, we apply the functional calculus to investigate the negative-one power of singular operators, thus deriving from first principles the Drazin inverse, an operator otherwise defined only axiomatically.

The elements of A's resolvent are proper rational functions that contain all of A's spectral information. (Recall that a proper rational function $r(z)$ is a ratio of polynomials in $z$ whose numerator has degree strictly less than the degree of the denominator.) In particular, the resolvent's poles coincide with A's eigenvalues since, for $z \notin \Lambda_A$:

$$R(z; A) = (zI - A)^{-1} = \frac{C^\top}{\det(zI - A)} = \frac{C^\top}{\prod_{\lambda \in \Lambda_A} (z - \lambda)^{a_\lambda}},$$
(8)

where $a_\lambda$ is the algebraic multiplicity of eigenvalue $\lambda$ and $C$ is the matrix of cofactors of $zI - A$. That is, C's transpose $C^\top$ is the adjugate of $zI - A$:

$$C^\top = \mathrm{adj}(zI - A),$$

whose elements will be polynomial functions of $z$ of degree less than $\sum_{\lambda \in \Lambda_A} a_\lambda$.

Recall that the partial fraction expansion of a proper rational function $r(z)$ with poles in $\Lambda$ allows a unique decomposition into a sum of constant numerators divided by monomials in $z - \lambda$ up to degree $a_\lambda$, where $a_\lambda$ is the order of the pole of $r(z)$ at $\lambda \in \Lambda$.28 Equation (8) thus makes it clear that the resolvent has the unique partial fraction expansion:

$$R(z; A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{a_\lambda - 1} \frac{1}{(z - \lambda)^{m+1}} A_{\lambda,m},$$
(9)

where $\{A_{\lambda,m}\}$ is the set of matrices with constant entries (not functions of $z$) uniquely determined elementwise by the partial fraction expansion. However, $R(z; A)$'s poles are not necessarily of the same order as the algebraic multiplicity of the corresponding eigenvalues, since the entries of $C$, and thus of $C^\top$, may have zeros at A's eigenvalues. This has the potential to render $A_{\lambda,m}$ equal to the zero matrix $\mathbf{0}$.
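
The structure of Eq. (9) is easy to inspect symbolically. A small sketch (using SymPy; the example matrix is our own choice) computes the resolvent of a single Jordan block and exposes the second-order pole that signals $\nu_\lambda = 2$:

```python
import sympy as sp

z = sp.symbols('z')
# Resolvent of one 2x2 Jordan block at 1/2.
A = sp.Matrix([[sp.Rational(1, 2), 1],
               [0, sp.Rational(1, 2)]])
R = sp.simplify((z * sp.eye(2) - A).inv())
print(R)  # diagonal entries carry a first-order pole at z = 1/2;
          # the (0, 1) entry is 1/(z - 1/2)**2: a second-order pole, so nu = 2
```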

The Cauchy integral formula indicates that the constant matrices $\{A_{\lambda,m}\}$ of Eq. (9) can be obtained as the residues:

$$A_{\lambda,m} = \frac{1}{2\pi i} \oint_{C_\lambda} (z - \lambda)^m \, R(z; A) \, dz,$$
(10)

where the residues are calculated elementwise. The projection operators $A_\lambda$ associated with each eigenvalue $\lambda$ were already referenced in Sec. II, but can now be properly introduced as the $A_{\lambda,0}$ matrices:

$$A_\lambda = A_{\lambda,0}$$
(11)
$$= \frac{1}{2\pi i} \oint_{C_\lambda} R(z; A) \, dz.$$
(12)

Since R(z; A)’s elements are rational functions, as we just showed, it is analytic except at a finite number of isolated singularities—at A’s eigenvalues. In light of the residue theorem, this motivates the Cauchy-integral-like formula that serves as the starting point for the meromorphic functional calculus:

$$f(A) = \sum_{\lambda \in \Lambda_A} \frac{1}{2\pi i} \oint_{C_\lambda} f(z) \, R(z; A) \, dz.$$
(13)

Let’s now consider several immediate consequences.

Even the simplest applications of Eq. (13) yield insight. Consider the identity as the operator function $f(A) = A^0 = I$ that corresponds to the scalar function $f(z) = z^0 = 1$. Then, Eq. (13) implies:

$$I = \sum_{\lambda \in \Lambda_A} \frac{1}{2\pi i} \oint_{C_\lambda} R(z; A) \, dz = \sum_{\lambda \in \Lambda_A} A_\lambda.$$

This shows that the projection operators are, in fact, a decomposition of the identity, as anticipated in Eq. (2).

For $f(A) = A$, Eqs. (13) and (10) imply that:

$$A = \sum_{\lambda \in \Lambda_A} \frac{1}{2\pi i} \oint_{C_\lambda} z \, R(z; A) \, dz = \sum_{\lambda \in \Lambda_A} \left[ \lambda \, \frac{1}{2\pi i} \oint_{C_\lambda} R(z; A) \, dz + \frac{1}{2\pi i} \oint_{C_\lambda} (z - \lambda) \, R(z; A) \, dz \right] = \sum_{\lambda \in \Lambda_A} \left( \lambda A_{\lambda,0} + A_{\lambda,1} \right).$$
(14)

We denote the important set of nilpotent matrices $A_{\lambda,1}$ that project onto the generalized eigenspaces by relabeling them:

$$N_\lambda \equiv A_{\lambda,1}$$
(15)
$$= \frac{1}{2\pi i} \oint_{C_\lambda} (z - \lambda) \, R(z; A) \, dz.$$
(16)

Equation (14) is the unique Dunford decomposition:16 $A = D + N$, where $D \equiv \sum_{\lambda \in \Lambda_A} \lambda A_\lambda$ is diagonalizable, $N \equiv \sum_{\lambda \in \Lambda_A} N_\lambda$ is nilpotent, and $D$ and $N$ commute: $[D, N] = 0$. This is also known as the Jordan–Chevalley decomposition.
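
The Dunford decomposition is simple to verify in code. The following sketch (using SymPy's jordan_form; the example matrix is our own choice) splits a nondiagonalizable matrix into its commuting diagonalizable and nilpotent parts:

```python
import sympy as sp

# Nondiagonalizable example: a 2x2 Jordan block at 1/2, plus eigenvalue 1.
A = sp.Matrix([[sp.Rational(1, 2), 1, 0],
               [0, sp.Rational(1, 2), 0],
               [0, 0, 1]])
Y, J = A.jordan_form()                                   # A = Y J Y^{-1}
D = Y * sp.diag(*[J[i, i] for i in range(3)]) * Y.inv()  # diagonalizable part
N = A - D                                                # nilpotent part
print(D * N == N * D)            # [D, N] = 0
print(N**2 == sp.zeros(3, 3))    # N is nilpotent: nu_{1/2} = 2
```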

The special case where A is diagonalizable implies that N = 0. And so, Eq. (14) simplifies to:

$$A = \sum_{\lambda \in \Lambda_A} \lambda A_\lambda.$$

As shown in Ref. 14, and as can be derived from Eqs. (12) and (16):

$$A_\lambda A_\zeta = \delta_{\lambda,\zeta} A_\lambda$$

and

$$A_\lambda N_\zeta = \delta_{\lambda,\zeta} N_\lambda.$$

Due to these, our spectral decomposition of the Dunford decomposition implies that:

$$N_\lambda = A_\lambda \left( A - \sum_{\zeta \in \Lambda_A} \zeta A_\zeta \right) = A_\lambda \left( A - \lambda A_\lambda \right) = A_\lambda (A - \lambda I).$$
(17)

Moreover:

$$A_{\lambda,m} = A_\lambda (A - \lambda I)^m.$$
(18)

It turns out that for $m > 0$: $A_{\lambda,m} = N_\lambda^m$. (See also [Ref. 14, p. 483].) This leads to a generalization of the projection operator orthonormality relations of Eq. (1). Most generally, the operators of $\{A_{\lambda,m}\}$ are mutually related by:

$$A_{\lambda,m} A_{\zeta,n} = \delta_{\lambda,\zeta} A_{\lambda,m+n}.$$
(19)

Finally, if we recall that the index $\nu_\lambda$ is the dimension of the largest associated subspace, we find that the index of $\lambda$ characterizes the nilpotency of $N_\lambda$: $N_\lambda^m = \mathbf{0}$ for $m \geq \nu_\lambda$. That is:

$$A_{\lambda,m} = \mathbf{0} \quad \text{for } m \geq \nu_\lambda.$$
(20)

Returning to Eq. (9), we see that all $A_{\lambda,m}$ with $m \geq \nu_\lambda$ are zero matrices and so do not contribute to the sum. Thus, we can rewrite Eq. (9) as:

$$R(z; A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} \frac{1}{(z - \lambda)^{m+1}} A_{\lambda,m}$$
(21)

or:

$$R(z; A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} \frac{1}{(z - \lambda)^{m+1}} A_\lambda (A - \lambda I)^m,$$
(22)

for $z \notin \Lambda_A$.

The following sections sometimes use $A_{\lambda,m}$ in place of $A_\lambda (A - \lambda I)^m$. This is helpful both for conciseness and when applying Eq. (19). Nonetheless, the equality in Eq. (18) is a useful one to keep in mind.

In light of Eq. (13), Eq. (21) together with Eq. (18) allow us to express any function of an operator simply and solely in terms of its spectrum (i.e., its eigenvalues for the finite dimensional case), its projection operators, and itself:

$$f(A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} A_{\lambda,m} \, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{f(z)}{(z - \lambda)^{m+1}} \, dz.$$
(23)

In obtaining Eq. (23) we finally derived Eq. (7), as promised earlier in Sec. III C. Effectively, by modulating the modes associated with the resolvent’s singularities, the scalar function f(·) is mapped to the operator domain, where its action is expressed in each of A’s independent subspaces.

Interpretation aside, how does one use this result? Equation (23) says that the spectral decomposition of f(A) reduces to the evaluation of several residues, where:

$$\mathrm{Res}(g(z), \, z \to \lambda) = \frac{1}{2\pi i} \oint_{C_\lambda} g(z) \, dz.$$

So, to make progress with Eq. (23), we must evaluate function-dependent residues of the form:

$$\mathrm{Res}\left( f(z)/(z - \lambda)^{m+1}, \, z \to \lambda \right).$$

If f(z) were holomorphic at each λ, then the order of the pole would simply be the power of the denominator. We could then use Cauchy’s differential formula for holomorphic functions:

$$f^{(n)}(a) = \frac{n!}{2\pi i} \oint_{C_a} \frac{f(z)}{(z - a)^{n+1}} \, dz,$$
(24)

for f(z) holomorphic at a. And, the meromorphic calculus would reduce to the holomorphic calculus. Often, f(z) will be holomorphic at least at some of A’s eigenvalues. And so, Eq. (24) is still locally a useful simplification in those special cases.
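
For instance, when $f(z)$ is holomorphic on the spectrum, Eqs. (23) and (24) reproduce the holomorphic prescription of Eq. (6). A small sketch (our own example; SciPy's sqrtm is used only as an independent check) evaluates $f(A) = \sqrt{A}$ on a single Jordan block this way:

```python
import numpy as np
from scipy.linalg import sqrtm

# f(z) = sqrt(z) is holomorphic at z = 4, so Eq. (6) applies:
# f(A) = f(4) A_4 + f'(4) A_4 (A - 4I) for a 2x2 Jordan block at 4.
A = np.array([[4.0, 1.0],
              [0.0, 4.0]])
P = np.eye(2)                                # A_4 = I: only one eigenvalue
fA = 2.0 * P + 0.25 * (A - 4.0 * np.eye(2))  # sqrt(4) = 2, f'(4) = 1/4
print(np.allclose(fA, sqrtm(A)))             # matches SciPy's matrix square root
print(np.allclose(fA @ fA, A))               # and squares back to A
```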

In general, though, $f(z)$ introduces poles and zeros at $\lambda \in \Lambda_A$ that change the orders of the poles of the integrand in Eq. (23). This is exactly the impetus for the generalized functional calculus. The residue of a complex-valued function $g(z)$ around its isolated pole $\lambda$ of order $n + 1$ can be calculated from:

$$\mathrm{Res}(g(z), \, z \to \lambda) = \frac{1}{n!} \lim_{z \to \lambda} \frac{d^n}{dz^n} \left[ (z - \lambda)^{n+1} g(z) \right].$$

Equation (23) says that we can explicitly derive the spectral decomposition of powers of the operator A. Of course, we already did this for the special cases of $A^0$ and $A^1$. The goal, though, is to do this in general.

For $f(A) = A^L$, i.e., $f(z) = z^L$, $z = 0$ can be either a zero or a pole of $f(z)$, depending on the value of $L$. In either case, an eigenvalue of $\lambda = 0$ will distinguish itself in the residue calculation of $A^L$ via its unique ability to change the order of the pole (or zero) at $z = 0$. For example, at this special value of $\lambda$ and for integer $L > 0$, $\lambda = 0$ induces poles that cancel with the zeros of $f(z) = z^L$, since $z^L$ has a zero at $z = 0$ of order $L$. For integer $L < 0$, an eigenvalue of $\lambda = 0$ increases the order of the $z = 0$ pole of $f(z) = z^L$. For all other eigenvalues, the residues will be as expected. Hence, from Eq. (23) and inserting $f(z) = z^L$, for any $L \in \mathbb{C}$:

$$A^L = \left[ \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} A_\lambda (A - \lambda I)^m \underbrace{\frac{1}{2\pi i} \oint_{C_\lambda} \frac{z^L}{(z - \lambda)^{m+1}} \, dz}_{= \frac{1}{m!} \lim_{z \to \lambda} \frac{d^m}{dz^m} z^L \; = \; \frac{\lambda^{L-m}}{m!} \prod_{n=1}^{m} (L - n + 1)} \right] + [0 \in \Lambda_A] \sum_{m=0}^{\nu_0 - 1} A_0 A^m \underbrace{\frac{1}{2\pi i} \oint_{C_0} z^{L-m-1} \, dz}_{= \, \delta_{L,m}}$$
$$\phantom{A^L} = \left[ \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} \binom{L}{m} \lambda^{L-m} A_\lambda (A - \lambda I)^m \right] + [0 \in \Lambda_A] \sum_{m=0}^{\nu_0 - 1} \delta_{L,m} A_0 A^m,$$
(25)

where $\binom{L}{m}$ is the generalized binomial coefficient:

$$\binom{L}{m} = \frac{1}{m!} \prod_{n=1}^{m} (L - n + 1), \quad \text{with} \quad \binom{L}{0} = 1,$$
(26)

and $[0 \in \Lambda_A]$ is the Iverson bracket, which takes on value 1 if zero is an eigenvalue of A and 0 if not. $A_{\lambda,m}$ was replaced by $A_\lambda (A - \lambda I)^m$ to suggest the more explicit calculations involved with evaluating any $A^L$. Equation (25) applies to any linear operator with only isolated singularities in its resolvent.

The eigen-decomposition of polynomials implied by Eq. (25) makes the contribution of the zero eigenvalue more explicit than previous treatments and enables closed-form expressions, e.g., for correlation functions, where the zero eigenvalue makes a qualitatively distinct contribution.21,22 Consequently, this formulation can lead to the recognition of coexistent finite- and infinite-range physical phenomena of different mechanistic origin.23,24

If $L$ is a nonnegative integer such that $L \geq \nu_\lambda - 1$ for all $\lambda \in \Lambda_A$, then:

$$A^L = \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} \binom{L}{m} \lambda^{L-m} A_{\lambda,m},$$
(27)

where $\binom{L}{m}$ is now reduced to the traditional binomial coefficient $L! / \big( m! (L - m)! \big)$.
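
As a sanity check on Eq. (27), the sketch below (our own example, with the projection operators written down by hand for a matrix of known Jordan structure) reproduces an ordinary matrix power from the spectral formula:

```python
import numpy as np
from math import comb

# Lambda_A = {0.5, 1}; eigenvalue 0.5 carries a 2x2 Jordan block (nu = 2).
A = np.array([[0.5, 1.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
A_half = np.diag([1.0, 1.0, 0.0])   # A_{1/2}, written by hand
A_one = np.diag([0.0, 0.0, 1.0])    # A_1
L = 6
AL = sum(comb(L, m) * lam**(L - m)
         * P @ np.linalg.matrix_power(A - lam * np.eye(3), m)
         for lam, P, nu in [(0.5, A_half, 2), (1.0, A_one, 1)]
         for m in range(nu))
print(np.allclose(AL, np.linalg.matrix_power(A, L)))   # True
```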

If $L$ is any negative integer, then $\binom{L}{m}$ can be written as a traditional binomial coefficient, $(-1)^m \binom{-L+m-1}{m}$, yielding:

$$A^L = \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} (-1)^m \binom{-L+m-1}{m} \lambda^{L-m} A_{\lambda,m},$$
(28)

for $L \in \{-1, -2, -3, \ldots\}$.

Thus, negative powers of an operator can be consistently defined even for noninvertible operators. In light of Eqs. (25) and (28), it appears that the zero eigenvalue does not even contribute to the function. It is well known, in contrast, that it wreaks havoc on the naive, oft-quoted definition of a matrix’s negative power:

$$A^{-1} \overset{?}{=} \frac{\mathrm{adj}(A)}{\det(A)} = \frac{\mathrm{adj}(A)}{\prod_{\lambda \in \Lambda_A} \lambda^{a_\lambda}},$$

since this would imply dividing by zero. If we can accept large positive powers of singular matrices—for which the zero eigenvalue does not contribute—it seems fair to also accept negative powers that likewise involve no contribution from the zero eigenvalue.

Editorializing aside, we note that extending the definition of $A^{-1}$ to the domain including singular operators via Eqs. (25) and (28) implies that:

$$A^L A = A A^L = A^{L+1}, \quad \text{for } L \in \mathbb{Z} \setminus \{-1\},$$

which is a very sensible and desirable condition. Moreover, we find that $A A^{-1} = I - A_0$.

Specifically, the negative-one power of any square matrix is in general not the same as the matrix inverse, since $\mathrm{inv}(A)$ need not exist. However, it is consistently defined via Eq. (28) to be:

$$A^{-1} = \sum_{\substack{\lambda \in \Lambda_A \\ \lambda \neq 0}} \sum_{m=0}^{\nu_\lambda - 1} (-1)^m \lambda^{-1-m} A_{\lambda,m}.$$
(29)

This is the Drazin inverse $A^D$ of A. Note that it is not the same as the Moore–Penrose pseudo-inverse.36,37

Although the Drazin inverse is usually defined axiomatically to satisfy certain criteria,38 it is naturally derived as the negative-one power of a singular operator in the meromorphic functional calculus. We can check that it indeed satisfies the axiomatic criteria for the Drazin inverse, enumerated according to historical precedent:

$$(1^{\nu_0}) \quad A^{\nu_0} A^D A = A^{\nu_0}$$
$$(2) \quad A^D A A^D = A^D$$
$$(5) \quad [A, A^D] = 0,$$

which gives rise to the Drazin inverse's moniker as the $\{1^{\nu_0}, 2, 5\}$-inverse.38 The analytical form of Eq. (29) has been teased out previously by other means; see, e.g., Ref. 38 and, for other settings, Refs. 39 and 40. Nevertheless, due to its utility in application, it is noteworthy and appealing that the Drazin inverse falls out organically in the meromorphic functional calculus, as the negative-one power, in contrast to its otherwise rather esoteric axiomatic origin.

While $A^{-1}$ always exists, the resolvent is nonanalytic at $z = 0$ for a singular matrix. Effectively, the meromorphic functional calculus removes the nonanalyticity of the resolvent in evaluating $A^{-1}$. As a result, as we can see from Eq. (29), the Drazin inverse inverts what is invertible; the remainder is zeroed out.

Of course, whenever A is invertible, $A^{-1}$ is equal to $\mathrm{inv}(A)$. However, we should not confuse this coincidence with equivalence. Moreover, despite historical notation, there is no reason that the negative-one power should in general be equivalent to the inverse, especially if an operator is not invertible! To avoid confusing $A^{-1}$ with $\mathrm{inv}(A)$, we use the notation $A^D$ for the Drazin inverse of A. Still, $A^D = \mathrm{inv}(A)$ whenever $0 \notin \Lambda_A$.

Amusingly, this extension of previous calculi lets us resolve an elementary but fundamental question: What is $0^{-1}$? It is certainly not infinity. Indeed, it is just as close to negative infinity! Rather: $0^{-1} = 0 \neq \mathrm{inv}(0)$.

Although Eq. (29) is a constructive way to build the Drazin inverse, it imposes more work than is actually necessary. Using the meromorphic functional calculus, we can derive a new, simple construction of the Drazin inverse that requires only the original operator and the eigenvalue-0 projector.

First, assume that $\lambda$ is an isolated singularity of $R(z; A)$, with finite separation of at least distance $\epsilon$ from the nearest neighboring singularity. And, consider the operator-valued function $f_\lambda^\epsilon$ defined via the RHS of:

$$A_\lambda = f_\lambda^\epsilon(A) = \frac{1}{2\pi i} \oint_{\lambda + \epsilon e^{i\phi}} (\zeta I - A)^{-1} \, d\zeta,$$

with $\lambda + \epsilon e^{i\phi}$ defining an $\epsilon$-radius circular contour around $\lambda$. Then we see that:

$$f_\lambda^\epsilon(z) = \frac{1}{2\pi i} \oint_{\lambda + \epsilon e^{i\phi}} (\zeta - z)^{-1} \, d\zeta = \left[ z \in \mathbb{C} : |z - \lambda| < \epsilon \right],$$
(30)

where $\left[ z \in \mathbb{C} : |z - \lambda| < \epsilon \right]$ is the Iverson bracket that takes on value 1 if $z$ is within $\epsilon$-distance of $\lambda$ and 0 if not.

Second, we use this to find that, for any $c \in \mathbb{C} \setminus \{0\}$:

$$(A + cA_0)^{-1} = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} A_{\lambda,m} \, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{\left( z + c f_0^\epsilon(z) \right)^{-1}}{(z - \lambda)^{m+1}} \, dz = A^D + \sum_{m=0}^{\nu_0 - 1} A_0 A^m \, \frac{1}{2\pi i} \oint_{C_0} \frac{(z + c)^{-1}}{z^{m+1}} \, dz = A^D + \sum_{m=0}^{\nu_0 - 1} A_0 A^m \, (-1)^m / c^{m+1},$$
(31)

where we asserted that the contour $C_0$ lies within the finite $\epsilon$-ball about the origin.

Third, we note that $A + cA_0$ is invertible for all $c \neq 0$; this can be proven by multiplying each side of Eq. (31) by $A + cA_0$. Hence, $(A + cA_0)^{-1} = \mathrm{inv}(A + cA_0)$ for all $c \neq 0$.

Finally, multiplying each side of Eq. (31) by $I - A_0$, and recalling that $A_{0,0} A_{0,m} = A_{0,m}$, we find a useful expression for calculating the Drazin inverse of any linear operator A, given only A and $A_0$. Specifically:

$$A^D = (I - A_0)(A + cA_0)^{-1},$$
(32)

which is valid for any $c \in \mathbb{C} \setminus \{0\}$. Eq. (32) generalizes the result found specifically for $c = -1$ in Ref. 41.
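
In practice, Eq. (32) is a one-line construction. The sketch below (our own example; the eigenvalue-0 projector is written by hand for a matrix of known structure) builds the Drazin inverse and verifies the axiomatic criteria listed above:

```python
import numpy as np

# Singular example: a 2x2 Jordan block at 0 (so nu_0 = 2), plus eigenvalue 2.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])
A0 = np.diag([1.0, 1.0, 0.0])      # projector onto the zero eigenspace
c = 1.0                            # any c != 0 works
AD = (np.eye(3) - A0) @ np.linalg.inv(A + c * A0)   # Eq. (32)

A2 = np.linalg.matrix_power(A, 2)
print(np.allclose(A2 @ AD @ A, A2))    # (1^{nu_0}): A^2 A^D A = A^2
print(np.allclose(AD @ A @ AD, AD))    # (2): A^D A A^D = A^D
print(np.allclose(A @ AD, AD @ A))     # (5): [A, A^D] = 0
```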

For the special case of c = −1, it is worthwhile to also consider the alternative construction of the Drazin inverse implied by Eq. (31):

$$A^D = (A - A_0)^{-1} + A_0 \left( \sum_{m=0}^{\nu_0 - 1} A^m \right).$$
(33)

By a spectral mapping ($\lambda \to 1 - \lambda$, for $\lambda \in \Lambda_T$), the Perron–Frobenius theorem and Eq. (31) yield an important consequence for any stochastic matrix $T$. The Perron–Frobenius theorem guarantees that $T$'s eigenvalues along the unit circle are associated with a diagonalizable subspace. In particular, $\nu_1 = 1$. Spectral mapping of this result means that $T$'s eigenvalue 1 maps to the eigenvalue 0 of $I - T$ and $T_1 = (I - T)_0$. Moreover:

$$\left[ (I - T) + T_1 \right]^{-1} = (I - T)^D + T_1,$$

since $\nu_0 = 1$. This corollary of Eq. (31) (with $c = 1$) corresponds to a number of important and well known results in the theory of Markov processes. Indeed, $Z \equiv (I - T + T_1)^{-1}$ is called the fundamental matrix in that setting.42
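
A short sketch (our own two-state example; the stationary distribution is written down by hand) confirms this corollary for a row-stochastic matrix:

```python
import numpy as np

# Row-stochastic T with stationary distribution pi = (0.75, 0.25).
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = np.array([0.75, 0.25])
T1 = np.outer(np.ones(2), pi)           # T_1: projector onto eigenvalue 1
Z = np.linalg.inv(np.eye(2) - T + T1)   # the fundamental matrix
# Z - T_1 acts as (I - T)^D: it inverts I - T off the stationary subspace,
# since (I - T)^D (I - T) = I - (I - T)_0 = I - T_1.
print(np.allclose((Z - T1) @ (np.eye(2) - T), np.eye(2) - T1))   # True
```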

For an infinite-rank operator A with a continuous spectrum, the meromorphic functional calculus has the natural generalization:

$$f(A) = \frac{1}{2\pi i} \oint_{C_{\Lambda_A}} f(z) (zI - A)^{-1} \, dz,$$
(34)

where the contour $C_{\Lambda_A}$ encloses the (possibly continuous) spectrum of A without including any unbounded contributions from $f(z)$ outside of $C_{\Lambda_A}$. The function $f(z)$ is expected to be meromorphic within $C_{\Lambda_A}$. This again deviates from the holomorphic approach, since the holomorphic functional calculus requires that $f(z)$ be analytic in a neighborhood around the spectrum; see Sec. VII of Ref. 43. Moreover, Eq. (34) allows an extension of the functional calculus of Refs. 34, 35, and 44, since the function can be meromorphic at the point spectrum in addition to being meromorphic on the residual and continuous spectra.

In either the finite- or infinite-rank case, whenever f(z) is analytic in a neighborhood around the spectrum, the meromorphic functional calculus agrees with the holomorphic. Whenever f(z) is not analytic in a neighborhood around the spectrum, the function is undefined in the holomorphic approach. In contrast, the meromorphic approach extends the function to the operator-valued domain and does so with novel consequences.

In particular, when $f(z)$ is not analytic in a neighborhood around the spectrum—say, $f(z)$ is nonanalytic within A's spectrum at $\Xi_f \subset \Lambda_A$—then we expect to lose both homomorphism and spectral mapping properties:

  • Loss of homomorphism: $f_1(A) f_2(A) \neq (f_1 \cdot f_2)(A)$;

  • Loss of naive spectral mapping: $f(\Lambda_A \setminus \Xi_f) \subset \Lambda_{f(A)}$.

A simple example of both losses arises with the Drazin inverse, above. There, $f_1(z) = z^{-1}$. Taking this and $f_2(z) = z$ combined with a singular operator A leads to the loss of homomorphism: $A^D A \neq I$. As for the second property, the spectral mapping can be altered for the candidate spectra at $\Xi_f$ via pole–pole or pole–zero interactions in the complex contour integral. For $f(A) = A^{-1}$, how does A's eigenvalue of 0 get mapped into the new spectrum of $A^D$? A naive application of the spectral mapping theorem might seem to yield an undefined quantity. But, using the meromorphic functional calculus self-consistently maps the eigenvalue as $0^{-1} = 0$. It remains to be explored whether the full spectral mapping is preserved for any function $f(A)$ under the meromorphic interpretation of $f(\lambda)$.

It should now be apparent that extending functions via the meromorphic functional calculus allows one to express novel mathematical properties, some likely capable of describing new physical phenomena. At the same time, extra care is necessary. The situation is reminiscent of the loss of commutativity in non-Abelian operator algebra: not all of the old rules apply, but the gain in nuance allows for mathematical description of important phenomena.

We chose to focus primarily on the finite-rank case here since it is sufficient to demonstrate the utility of the general projection-operator formalism. Indeed, there are ample nontrivial applications in the finite-rank setting that deserve attention. To appreciate these, we now turn to address the construction and properties of general eigenprojectors.

At this point, we see that projection operators are fundamental to functions of an operator. This prompts the practical question of how to actually calculate them. The next several sections address this by deriving expressions with both theoretical and applied use. We first develop the projection operators associated with index-one eigenvalues. We then explicate the relationship between eigenvectors, generalized eigenvectors, and projection operators for normal, diagonalizable, and general matrices. Finally, we show how the general results specialize in several common cases of interest. After these, we turn to examples and applications.

To obtain the projection operators associated with each index-one eigenvalue $\lambda \in \{ \zeta \in \Lambda_A : \nu_\zeta = 1 \}$, we apply the functional calculus to an appropriately chosen function of A, finding:

$$\prod_{\substack{\zeta \in \Lambda_A \\ \zeta \neq \lambda}} (A - \zeta I)^{\nu_\zeta} = \sum_{\xi \in \Lambda_A} \sum_{m=0}^{\nu_\xi - 1} \frac{A_{\xi,m}}{2\pi i} \oint_{C_\xi} \frac{\prod_{\zeta \neq \lambda} (z - \zeta)^{\nu_\zeta}}{(z - \xi)^{m+1}} \, dz = A_\lambda \, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{\prod_{\zeta \neq \lambda} (z - \zeta)^{\nu_\zeta}}{z - \lambda} \, dz = A_\lambda \prod_{\substack{\zeta \in \Lambda_A \\ \zeta \neq \lambda}} (\lambda - \zeta)^{\nu_\zeta}.$$

Therefore, if $\nu_\lambda = 1$:

$$A_\lambda = \prod_{\substack{\zeta \in \Lambda_A \\ \zeta \neq \lambda}} \left( \frac{A - \zeta I}{\lambda - \zeta} \right)^{\nu_\zeta}.$$
(35)

As convenience dictates in our computations, we can let $\nu_\zeta \to a_\zeta - g_\zeta + 1$ or even $\nu_\zeta \to a_\zeta$ in Eq. (35), since multiplying $A_\lambda$ by $(A - \zeta I)/(\lambda - \zeta)$ has no effect for $\zeta \in \Lambda_A \setminus \{\lambda\}$ if $\nu_\lambda = 1$.

Equation (35) generalizes a well known result that applies when the index of all eigenvalues is one. That is, when the operator is diagonalizable, we have:

$$A_\lambda = \prod_{\substack{\zeta \in \Lambda_A \\ \zeta \neq \lambda}} \frac{A - \zeta I}{\lambda - \zeta}.$$

To the best of our knowledge, Eq. (35) is original.

Since eigenvalues can have index larger than one, not all projection operators of a nondiagonalizable operator can be found directly from Eq. (35). Even so, it serves three useful purposes. First, it gives a practical reduction of the eigen-analysis by finding all projection operators of index-one eigenvalues. Second, if there is only one eigenvalue that has index larger than one—what we call the almost diagonalizable case—then Eq. (35), together with the fact that the projection operators must sum to the identity, does give a full solution to the set of projection operators. Third, Eq. (35) is a powerful theoretical tool that we can use directly to spectrally decompose functions, for example, of a stochastic matrix whose eigenvalues on the unit circle are guaranteed to be index-one by the Perron–Frobenius theorem.
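
To make Eq. (35) concrete, here is a minimal sketch (our own example) that builds both projection operators of a diagonalizable, nonnormal matrix directly from operator products, with no eigenvectors computed at all:

```python
import numpy as np
from functools import reduce

# Diagonalizable but nonnormal: eigenvalues {1.0, 0.5}, both index-one.
A = np.array([[0.9, 0.1],
              [0.4, 0.6]])

def projector(lam, others):
    """Eq. (35): product over the other eigenvalues of (A - zI)/(lam - z)."""
    return reduce(np.matmul,
                  [(A - z * np.eye(2)) / (lam - z) for z in others])

P1 = projector(1.0, [0.5])
P2 = projector(0.5, [1.0])
print(np.allclose(P1 + P2, np.eye(2)))   # they resolve the identity
print(np.allclose(P1 @ P2, 0))           # and are mutually orthogonal
```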

Although index-one expressions have some utility, we need a more general procedure to obtain all projection operators of any linear operator. Recall that, with full generality, projection operators can also be calculated directly via residues, as in Eq. (12).

An alternative procedure—one that extends a method familiar at least in quantum mechanics—is to obtain the projection operators via eigenvectors. However, quantum mechanics always concerns itself with a subset of diagonalizable operators. What is the necessary generalization? For one, left and right eigenvectors are no longer simply conjugate transposes of each other. More severely, a full set of spanning eigenvectors is no longer guaranteed and we must resort to generalized eigenvectors. Since the relationships among eigenvectors, generalized eigenvectors, and projection operators are critical to the practical calculation of many physical observables of complex systems, we collect these results in the next section.

Two common questions regarding projection operators are: Why not just use eigenvectors? And, why not use the Jordan canonical form? First, the eigenvectors of a defective matrix do not form a complete basis with which to expand an arbitrary vector. One needs generalized eigenvectors for this. Second, some functions of an operator require removing, or otherwise altering, the contribution from select eigenspaces. This is most adroitly handled with the projection operator formalism, where different eigenspaces (correlates of Jordan blocks) can effectively be treated separately. Moreover, even for simple cases where eigenvectors suffice, the projection operator formalism can simply be more convenient, calculationally and mathematically.

That said, it is useful to understand the relationship between projection operators and generalized eigenvectors. For example, it is often useful to create projection operators from generalized eigenvectors. This section clarifies their connection using the language of matrices. In the most general case, we show that the projection operator formalism is usefully concise.

1. Normal matrices

Unitary, Hermitian, skew-Hermitian, orthogonal, symmetric, and skew-symmetric matrices are all special cases of normal matrices. As noted, normal matrices are those that commute with their Hermitian adjoint (complex-conjugate transpose): $A^\dagger A = A A^\dagger$. Moreover, a matrix is normal if and only if it can be diagonalized by a unitary transformation: $A = U \Lambda U^\dagger$, where the columns of the unitary matrix $U$ are the orthonormal right eigenvectors of A corresponding to the eigenvalues ordered along the diagonal matrix $\Lambda$. For an M-by-M matrix A, the eigenvalues in $\Lambda_A$ are ordered and enumerated according to the possibly degenerate M-tuple $(\Lambda_A) = (\lambda_1, \ldots, \lambda_M)$. Since an eigenvalue $\lambda \in \Lambda_A$ has algebraic multiplicity $a_\lambda \geq 1$, $\lambda$ appears $a_\lambda$ times in the ordered tuple.

Assuming A is normal, each projection operator $A_\lambda$ can be constructed as the sum of all ket–bra pairs formed from the right eigenvectors corresponding to $\lambda$ composed with their conjugate transposes. We later introduce bras and kets more generally via generalized eigenvectors of the operator A and its dual $A^\dagger$. However, since the complex-conjugate transposition rule between dual spaces only applies to a ket basis derived from a normal operator, we put off using the bra-ket notation for now so as not to confuse the more familiar “normal” case with the general case.

To explicitly demonstrate this relationship between projection operators, eigenvectors, and their Hermitian adjoints in the case of normality, observe that:

$$A = U \Lambda U^\dagger = \begin{bmatrix} \vec{u}_1 & \vec{u}_2 & \cdots & \vec{u}_M \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_M \end{bmatrix} \begin{bmatrix} \vec{u}_1^{\,\dagger} \\ \vec{u}_2^{\,\dagger} \\ \vdots \\ \vec{u}_M^{\,\dagger} \end{bmatrix} = \begin{bmatrix} \lambda_1 \vec{u}_1 & \lambda_2 \vec{u}_2 & \cdots & \lambda_M \vec{u}_M \end{bmatrix} \begin{bmatrix} \vec{u}_1^{\,\dagger} \\ \vec{u}_2^{\,\dagger} \\ \vdots \\ \vec{u}_M^{\,\dagger} \end{bmatrix} = \sum_{j=1}^{M} \lambda_j \vec{u}_j \vec{u}_j^{\,\dagger} = \sum_{\lambda \in \Lambda_A} \lambda A_\lambda.$$

Evidently, for normal matrices A:

$$A_\lambda = \sum_{j=1}^{M} \delta_{\lambda, \lambda_j} \vec{u}_j \vec{u}_j^{\,\dagger}.$$

And, since $\vec{u}_i^{\,\dagger} \vec{u}_j = \delta_{i,j}$, we have an orthogonal set $\{A_\lambda\}_{\lambda \in \Lambda_A}$ with the property that:

$$A_\zeta A_\lambda = \sum_{i=1}^{M} \sum_{j=1}^{M} \delta_{\zeta,\lambda_i} \delta_{\lambda,\lambda_j} \vec{u}_i \vec{u}_i^{\,\dagger} \vec{u}_j \vec{u}_j^{\,\dagger} = \sum_{i=1}^{M} \sum_{j=1}^{M} \delta_{\zeta,\lambda_i} \delta_{\lambda,\lambda_j} \vec{u}_i \delta_{i,j} \vec{u}_j^{\,\dagger} = \sum_{i=1}^{M} \delta_{\zeta,\lambda_i} \delta_{\lambda,\lambda_i} \vec{u}_i \vec{u}_i^{\,\dagger} = \delta_{\zeta,\lambda} A_\lambda.$$

Moreover:

$$\sum_{\lambda \in \Lambda_A} A_\lambda = \sum_{j=1}^{M} \vec{u}_j \vec{u}_j^{\,\dagger} = U U^\dagger = I,$$

and so on. All of the expected properties of projection operators can be established again in this restricted setting.
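
In code, the normal case is exactly the familiar eigendecomposition routine. A minimal sketch (our own symmetric example, using NumPy's eigh):

```python
import numpy as np

# Symmetric, hence normal: eigh returns real eigenvalues and a unitary U.
A = np.array([[2., 1.],
              [1., 2.]])
w, U = np.linalg.eigh(A)
Ps = [np.outer(U[:, j], U[:, j].conj()) for j in range(2)]   # A_lambda = u u^dagger
print(np.allclose(sum(wj * P for wj, P in zip(w, Ps)), A))   # A = sum lambda A_lambda
```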

The rows of $U^{-1} = U^\dagger$ are A's left eigenvectors. In this case, they are simply the conjugate transposes of the right eigenvectors. Note that conjugate transposition is the familiar transformation rule between ket and bra spaces in quantum mechanics (see, e.g., Ref. 45)—a consequence of the restriction to normal operators, as we will show. Importantly, a more general formulation of quantum mechanics would not have this same restricted correspondence between the dual ket and bra spaces.

To elaborate on this point, recall that vector spaces admit dual spaces and dual bases. However, there is no sense of a dual correspondence of a single ket or bra without reference to a full basis.15 Implicitly in quantum mechanics, the basis is taken to be the basis of eigenstates of any Hermitian operator, supposedly since observables are self-adjoint.

To allude to an alternative, we note that ujuj is not only the Hermitian form of inner product uj,uj (where ⟨·, ·⟩ denotes the inner product) of the right eigenvector uj with itself, but importantly also the simple dot-product of the left eigenvector uj and the right eigenvector uj, where uj acts as a linear functional on uj. Contrary to the substantial effort devoted to the inner-product-centric theory of Hilbert spaces, this latter interpretation of ujuj—in terms of linear functionals and a left-eigenvector basis for linear functionals—is what generalizes to a consistent and constructive framework for the spectral theory beyond normal operators, as we will see shortly.

2. Diagonalizable matrices

By definition, diagonalizable matrices can be diagonalized, but not necessarily via a unitary transformation. All diagonalizable matrices can nevertheless be diagonalized via the transformation: $A = P \Lambda P^{-1}$, where the columns of the square matrix $P$ are the not-necessarily-orthogonal right eigenvectors of A corresponding to the eigenvalues ordered along the diagonal matrix $\Lambda$, and where the rows of $P^{-1}$ are A's left eigenvectors. Importantly, the left eigenvectors need not be the Hermitian adjoints of the right eigenvectors. As a particular example, this more general setting is required for almost any transition dynamic of a Markov chain. In other words, the transition dynamic of any interesting complex network with irreversible processes serves as an example of a nonnormal operator.

Given the M-tuple of possibly-degenerate eigenvalues $(\Lambda_A) = (\lambda_1, \lambda_2, \ldots, \lambda_M)$, there is a corresponding M-tuple of linearly-independent right-eigenvectors $(|\lambda_1\rangle, |\lambda_2\rangle, \ldots, |\lambda_M\rangle)$ and a corresponding M-tuple of linearly-independent left-eigenvectors $(\langle\lambda_1|, \langle\lambda_2|, \ldots, \langle\lambda_M|)$ such that:

$$A |\lambda_j\rangle = \lambda_j |\lambda_j\rangle$$

and:

$$\langle\lambda_j| A = \lambda_j \langle\lambda_j|,$$

with the orthonormality condition that:

$$\langle\lambda_i | \lambda_j\rangle = \delta_{i,j}.$$

To avoid misinterpretation, we stress that the bras and kets that appear above are the left and right eigenvectors, respectively, and typically do not correspond to complex-conjugate transposition.

With these definitions in place, the projection operators for a diagonalizable matrix can be written:

$$A_\lambda = \sum_{j=1}^{M} \delta_{\lambda, \lambda_j} |\lambda_j\rangle \langle\lambda_j|.$$

Then:

$$A = \sum_{\lambda \in \Lambda_A} \lambda A_\lambda = \sum_{j=1}^{M} \lambda_j |\lambda_j\rangle\langle\lambda_j| = \begin{bmatrix} \lambda_1 |\lambda_1\rangle & \lambda_2 |\lambda_2\rangle & \cdots & \lambda_M |\lambda_M\rangle \end{bmatrix} \begin{bmatrix} \langle\lambda_1| \\ \langle\lambda_2| \\ \vdots \\ \langle\lambda_M| \end{bmatrix} = \begin{bmatrix} |\lambda_1\rangle & |\lambda_2\rangle & \cdots & |\lambda_M\rangle \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_M \end{bmatrix} \begin{bmatrix} \langle\lambda_1| \\ \langle\lambda_2| \\ \vdots \\ \langle\lambda_M| \end{bmatrix} = P \Lambda P^{-1}.$$

So, we see that the projection operators introduced earlier in a coordinate-free manner have a concrete representation in terms of left and right eigenvectors when the operator is diagonalizable.
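
Numerically, the left eigenvectors need not be computed by a separate diagonalization: SciPy returns both sets at once. A minimal sketch (our own example; note the explicit renormalization enforcing $\langle\lambda_i|\lambda_j\rangle = \delta_{i,j}$, since the raw outputs are normalized independently):

```python
import numpy as np
from scipy.linalg import eig

A = np.array([[0.9, 0.1],
              [0.4, 0.6]])       # nonnormal: left and right eigenvectors differ
w, vl, vr = eig(A, left=True)    # columns of vl / vr: left / right eigenvectors
Ps = []
for j in range(2):
    ket = vr[:, [j]]                      # |lambda_j>
    bra = vl[:, [j]].conj().T             # <lambda_j|
    Ps.append(ket @ bra / (bra @ ket))    # A_lambda_j, with <l|r> scaled to 1
print(np.allclose(sum(Ps), np.eye(2)))                      # resolution of identity
print(np.allclose(sum(wj * P for wj, P in zip(w, Ps)), A))  # A = sum lambda_j A_j
```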

3. Any matrix

Not all matrices can be diagonalized, but all square matrices can be put into Jordan canonical form via the transformation: $A = Y J Y^{-1}$.17 Here, the columns of the square matrix $Y$ are the linearly independent right eigenvectors and generalized right eigenvectors corresponding to the Jordan blocks ordered along the diagonal of the block-diagonal matrix $J$. And, the rows of $Y^{-1}$ are the corresponding left eigenvectors and generalized left eigenvectors, but reverse-ordered within each block, as we will show.

Let there be $n$ Jordan blocks forming the n-tuple $(J_1, J_2, \ldots, J_n)$, with $1 \leq n \leq M$. The $k$th Jordan block $J_k$ has dimension $m_k$-by-$m_k$:

$$J_k = \begin{bmatrix} \lambda_k & 1 & 0 & \cdots & 0 \\ 0 & \lambda_k & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ \vdots & & \ddots & \lambda_k & 1 \\ 0 & \cdots & \cdots & 0 & \lambda_k \end{bmatrix} \quad (m_k \text{ rows and } m_k \text{ columns}),$$

such that $\sum_{k=1}^{n} m_k = M$.

Note that eigenvalue $\lambda \in \Lambda_A$ corresponds to $g_\lambda$ different Jordan blocks, where $g_\lambda$ is the geometric multiplicity of the eigenvalue $\lambda$. Indeed, $n = \sum_{\lambda \in \Lambda_A} g_\lambda$. Moreover, the index $\nu_\lambda$ of the eigenvalue $\lambda$ is defined as the size of the largest Jordan block corresponding to $\lambda$. So, we write this in the current notation as:

$$\nu_\lambda = \max \left\{ \delta_{\lambda, \lambda_k} m_k \right\}_{k=1}^{n}.$$

If the index of any eigenvalue is greater than one, then the conventional eigenvectors do not span the M-dimensional vector space. However, the set of M generalized eigenvectors does form a basis for the vector space.46 

Given the n-tuple of possibly-degenerate eigenvalues $(\Lambda_A) = (\lambda_1, \lambda_2, \ldots, \lambda_n)$, where each eigenvalue $\lambda \in \Lambda_A$ is listed $g_\lambda$ times, there is a corresponding n-tuple of $m_k$-tuples of linearly-independent generalized right-eigenvectors:

$$\left( |\lambda_1^{(m)}\rangle \right)_{m=1}^{m_1}, \; \left( |\lambda_2^{(m)}\rangle \right)_{m=1}^{m_2}, \; \ldots, \; \left( |\lambda_n^{(m)}\rangle \right)_{m=1}^{m_n},$$

where, for each $\lambda_k \in (\Lambda_A)$:

$$\left( |\lambda_k^{(m)}\rangle \right)_{m=1}^{m_k} \equiv \left( |\lambda_k^{(1)}\rangle, |\lambda_k^{(2)}\rangle, \ldots, |\lambda_k^{(m_k)}\rangle \right)$$

and a corresponding n-tuple of $m_k$-tuples of linearly-independent generalized left-eigenvectors:

$$\left( \langle\lambda_1^{(m)}| \right)_{m=1}^{m_1}, \; \left( \langle\lambda_2^{(m)}| \right)_{m=1}^{m_2}, \; \ldots, \; \left( \langle\lambda_n^{(m)}| \right)_{m=1}^{m_n},$$

where:

$$\left( \langle\lambda_k^{(m)}| \right)_{m=1}^{m_k} \equiv \left( \langle\lambda_k^{(1)}|, \langle\lambda_k^{(2)}|, \ldots, \langle\lambda_k^{(m_k)}| \right)$$

such that:

$$(A - \lambda_k I) |\lambda_k^{(m+1)}\rangle = |\lambda_k^{(m)}\rangle$$
(36)

and:

$$\langle\lambda_k^{(m+1)}| (A - \lambda_k I) = \langle\lambda_k^{(m)}|,$$
(37)

for $0 \leq m \leq m_k - 1$, where $|\lambda_k^{(0)}\rangle = \vec{0}$ and $\langle\lambda_k^{(0)}| = \vec{0}$. Specifically, $|\lambda_k^{(1)}\rangle$ and $\langle\lambda_k^{(1)}|$ are conventional right and left eigenvectors, respectively.

Most directly, the generalized right and left eigenvectors can be found as the nontrivial solutions to:

$$(A - \lambda_k I)^m |\lambda_k^{(m)}\rangle = \vec{0}$$

and:

$$\langle\lambda_k^{(m)}| (A - \lambda_k I)^m = \vec{0},$$

respectively.

It should be clear from Eqs. (36) and (37) that:

$$\langle\lambda_k^{(m)}| (A - \lambda_k I)^\ell |\lambda_k^{(n)}\rangle = \langle\lambda_k^{(m-\ell)} | \lambda_k^{(n)}\rangle = \langle\lambda_k^{(m)} | \lambda_k^{(n-\ell)}\rangle,$$

for $m, n \in \{0, 1, \ldots, m_k\}$ and $\ell \geq 0$. At the same time, it is then easy to show that:

$$\langle\lambda_k^{(m)} | \lambda_k^{(n)}\rangle = \langle\lambda_k^{(m+n)} | \lambda_k^{(0)}\rangle = 0, \quad \text{if } m + n \leq m_k,$$

where $m, n \in \{0, 1, \ldots, m_k\}$. Imposing appropriate normalization, we find that:

$$\langle\lambda_j^{(m)} | \lambda_k^{(n)}\rangle = \delta_{j,k} \, \delta_{m+n, \, m_k + 1}.$$
(38)

Hence, we see that the left eigenvectors and generalized eigenvectors are a dual basis to the right eigenvectors and generalized eigenvectors. Interestingly though, within each Jordan subspace, the most generalized left eigenvectors are dual to the least generalized right eigenvectors, and vice versa.

(To be clear, in this terminology “least generalized” eigenvectors are the standard eigenvectors. For example, the $\langle\lambda_k^{(1)}|$ satisfying the standard eigenvector relation $\langle\lambda_k^{(1)}| A = \lambda_k \langle\lambda_k^{(1)}|$ is the least generalized left eigenvector of subspace $k$. By way of comparison, the “most generalized” right eigenvector of subspace $k$ is $|\lambda_k^{(m_k)}\rangle$, satisfying the most generalized eigenvector relation $(A - \lambda_k I) |\lambda_k^{(m_k)}\rangle = |\lambda_k^{(m_k - 1)}\rangle$ for subspace $k$. The orthonormality relation shows that the two are dual correspondents: $\langle\lambda_k^{(1)} | \lambda_k^{(m_k)}\rangle = 1$, while all other eigen-bra–eigen-ket closures utilizing these objects are null.)

With these details worked out, we find that the projection operators for a nondiagonalizable matrix can be written as:

$$A_\lambda = \sum_{k=1}^{n} \sum_{m=1}^{m_k} \delta_{\lambda, \lambda_k} |\lambda_k^{(m)}\rangle \langle\lambda_k^{(m_k + 1 - m)}|.$$
(39)

And, we see that a projection operator includes all of its left and right eigenvectors and all of its left and right generalized eigenvectors. This implies that the identity operator must also have a decomposition in terms of both eigenvectors and generalized eigenvectors:

\[
I = \sum_{\lambda \in \Lambda_A} A_\lambda = \sum_{k=1}^{n} \sum_{m=1}^{m_k} |\lambda_k^{(m)}\rangle \langle\lambda_k^{(m_k+1-m)}| .
\]

Let $\bigl[\, |\lambda_k^{(m)}\rangle \,\bigr]_{m=1}^{m_k}$ denote the column vector:

\[
\bigl[\, |\lambda_k^{(m)}\rangle \,\bigr]_{m=1}^{m_k} =
\begin{bmatrix}
|\lambda_k^{(1)}\rangle \\
\vdots \\
|\lambda_k^{(m_k)}\rangle
\end{bmatrix} ,
\]

and let $\bigl[\, \langle\lambda_k^{(m_k+1-m)}| \,\bigr]_{m=1}^{m_k}$ denote the column vector:

\[
\bigl[\, \langle\lambda_k^{(m_k+1-m)}| \,\bigr]_{m=1}^{m_k} =
\begin{bmatrix}
\langle\lambda_k^{(m_k)}| \\
\vdots \\
\langle\lambda_k^{(1)}|
\end{bmatrix} .
\]

Then, using the above results, and the fact that Eq. (37) implies that $\langle\lambda_k^{(m+1)}| A = \lambda_k \langle\lambda_k^{(m+1)}| + \langle\lambda_k^{(m)}|$, we derive the explicit generalized-eigenvector decomposition of the nondiagonalizable operator A:

\[
A = \Bigl( \sum_{\lambda \in \Lambda_A} A_\lambda \Bigr) A
= \sum_{k=1}^{n} \sum_{m=1}^{m_k} |\lambda_k^{(m)}\rangle \langle\lambda_k^{(m_k+1-m)}|\, A
= \sum_{k=1}^{n} \sum_{m=1}^{m_k} |\lambda_k^{(m)}\rangle \Bigl( \lambda_k \langle\lambda_k^{(m_k+1-m)}| + \langle\lambda_k^{(m_k-m)}| \Bigr)
\]
\[
= \Bigl[\; \bigl[\, |\lambda_1^{(m)}\rangle \,\bigr]_{m=1}^{m_1} \;\; \bigl[\, |\lambda_2^{(m)}\rangle \,\bigr]_{m=1}^{m_2} \;\; \cdots \;\; \bigl[\, |\lambda_n^{(m)}\rangle \,\bigr]_{m=1}^{m_n} \;\Bigr]
\begin{bmatrix}
J_1 & 0 & \cdots & 0 \\
0 & J_2 & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & J_n
\end{bmatrix}
\begin{bmatrix}
\bigl[\, \langle\lambda_1^{(m_1+1-m)}| \,\bigr]_{m=1}^{m_1} \\
\bigl[\, \langle\lambda_2^{(m_2+1-m)}| \,\bigr]_{m=1}^{m_2} \\
\vdots \\
\bigl[\, \langle\lambda_n^{(m_n+1-m)}| \,\bigr]_{m=1}^{m_n}
\end{bmatrix}
= Y J Y^{-1} ,
\]

where, defining Y as:

\[
Y = \Bigl[\; \bigl[\, |\lambda_1^{(m)}\rangle \,\bigr]_{m=1}^{m_1} \;\; \bigl[\, |\lambda_2^{(m)}\rangle \,\bigr]_{m=1}^{m_2} \;\; \cdots \;\; \bigl[\, |\lambda_n^{(m)}\rangle \,\bigr]_{m=1}^{m_n} \;\Bigr] ,
\]

we are forced by Eq. (38) to recognize that:

\[
Y^{-1} =
\begin{bmatrix}
\bigl[\, \langle\lambda_1^{(m_1+1-m)}| \,\bigr]_{m=1}^{m_1} \\
\bigl[\, \langle\lambda_2^{(m_2+1-m)}| \,\bigr]_{m=1}^{m_2} \\
\vdots \\
\bigl[\, \langle\lambda_n^{(m_n+1-m)}| \,\bigr]_{m=1}^{m_n}
\end{bmatrix}
\]

since then Y−1Y = I, and we recall that the inverse is guaranteed to be unique.

The above demonstrates an explicit construction for the Jordan canonical form. One advantage we learn from this explicit decomposition is that the complete set of left eigenvectors and left generalized eigenvectors (encapsulated in Y−1) can be obtained from the inverse of the matrix of the complete set of right eigenvectors and generalized right eigenvectors (encoded in Y) and vice versa. One unexpected lesson, though, is that the generalized left eigenvectors appear in reverse order within each Jordan block.
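Continuing the numerical sketch above (same A, lam, v1, and v2), the reverse ordering of the left chain within the block can be checked directly; the construction is illustrative, not from the text:

# Columns of Y: the right Jordan chain (v1, v2); rows of inv(Y): the left chain, reversed
Y = np.column_stack([v1, v2])
J = np.linalg.inv(Y) @ A @ Y
assert np.allclose(J, [[4., 1.], [0., 4.]])      # Jordan canonical form recovered

l2, l1 = np.linalg.inv(Y)                        # the *top* row is the most generalized bra
assert np.allclose(l1 @ A, lam * l1)             # l1 is the conventional left eigenvector
assert np.allclose(l2 @ A, lam * l2 + l1)        # Eq. (37): <l2|(A - 4I) = <l1|
assert np.isclose(l1 @ v2, 1) and np.isclose(l1 @ v1, 0)   # Eq. (38): anti-diagonal duality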

Using Eqs. (39) and (18) with Eq. (37), we see that the nilpotent operators Aλ,m with m > 0 further link the various generalized eigenvectors within each subspace k. Said more suggestively, generalized modes of a nondiagonalizable subspace are necessarily cooperative.

It is worth noting that the left eigenvectors and generalized left eigenvectors form a basis for all linear functionals of the vector space spanned by the right eigenvectors and generalized right eigenvectors. Moreover, the left eigenvectors and generalized left eigenvectors are exactly the dual basis to the right eigenvectors and generalized right eigenvectors by their orthonormality properties. However, neither the left nor right eigen-basis is a priori more fundamental to the operator. Sympathetically, the right eigenvectors and generalized eigenvectors form a (dual) basis for all linear functionals of the vector space spanned by the left eigenvectors and generalized eigenvectors.

4. Simplified calculi for special cases

In special cases, the meromorphic functional calculus reduces the general expressions above to markedly simpler forms. And, this can greatly expedite practical calculations and provide physical intuition. Here, we show which reductions can be used under which assumptions.

For functions of operators with a countable spectrum, recall that the general form of the meromorphic functional calculus is:

\[
f(A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} A_{\lambda,m}\, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{f(z)}{(z-\lambda)^{m+1}}\, dz .
\]
(40)

Equations (18) and (39) gave the method to calculate Aλ,m in terms of eigenvectors and generalized eigenvectors.

When the operator is diagonalizable (not necessarily normal), this reduces to:

\[
f(A) = \sum_{\lambda \in \Lambda_A} A_\lambda\, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{f(z)}{z-\lambda}\, dz ,
\]
(41)

where Aλ can now be constructed from conventional right and left eigenvectors, although ⟨λj| is not necessarily the conjugate transpose of |λj⟩.

When the function is analytic on the spectrum of the (not necessarily diagonalizable) operator, then our functional calculus reduces to the holomorphic functional calculus:

\[
f(A) = \sum_{\lambda \in \Lambda_A} \sum_{m=0}^{\nu_\lambda - 1} \frac{f^{(m)}(\lambda)}{m!}\, A_{\lambda,m} .
\]
(42)

When the function is analytic on the spectrum of a diagonalizable (not necessarily normal) operator this reduces yet again to:

\[
f(A) = \sum_{\lambda \in \Lambda_A} f(\lambda)\, A_\lambda .
\]
(43)

When the function is analytic on the spectrum of a diagonalizable (not necessarily normal) operator with no degeneracy this reduces even further to:

\[
f(A) = \sum_{\lambda \in \Lambda_A} f(\lambda)\, \frac{|\lambda\rangle \langle\lambda|}{\langle\lambda|\lambda\rangle} .
\]
(44)

Finally, recall that an operator is normal when it commutes with its conjugate transpose. If the function is analytic on the spectrum of a normal operator, then we recover the simple form enabled by the spectral theorem of normal operators familiar in physics. That is, Eq. (43) is applicable, but now we have the extra simplification that $\langle\lambda_j|$ is simply the conjugate transpose of $|\lambda_j\rangle$: $\langle\lambda_j| = \bigl( |\lambda_j\rangle \bigr)^\dagger$.
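As a quick numerical illustration of these reductions (a minimal sketch assuming NumPy and SciPy; both matrices are our own examples chosen for brevity), Eq. (43) can be checked on a diagonalizable nonnormal matrix and Eq. (42) on a defective one, with f = exp in each case:

import numpy as np
from math import factorial
from scipy.linalg import expm

# Eq. (43): diagonalizable (but nonnormal) A, f = exp
A = np.array([[0., 1.],
              [-2., -3.]])
evals, R = np.linalg.eig(A)
Ldual = np.linalg.inv(R)                 # rows: the dual left eigenvectors
fA = sum(np.exp(lam) * np.outer(R[:, i], Ldual[i]) for i, lam in enumerate(evals))
assert np.allclose(fA, expm(A))

# Eq. (42): defective A with the single eigenvalue 4 of index 2, f = exp
A = np.array([[5., 1.],
              [-1., 3.]])
lam, nu = 4.0, 2
Nil = A - lam * np.eye(2)                # A_{lam,m} = A_lam Nil^m, with A_lam = I here
fA = sum(np.exp(lam) / factorial(m) * np.linalg.matrix_power(Nil, m) for m in range(nu))
assert np.allclose(fA, expm(A))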

To illustrate the use and power of the simultaneously generalized-and-simplified calculus, we now adapt it to analyze a suite of applications from quite distinct domains. First, we point to a set of example calculations for finite-dimensional operators of stochastic processes. Second, we show that the familiar Poisson process is intrinsically nondiagonalizable, and hint that nondiagonalizability may be common more generally in semi-Markov processes. Third, we illustrate how commonly the Drazin inverse arises in nonequilibrium thermodynamics, giving a roadmap to developing closed-form expressions for a number of key observables. Finally, we round out the applications with a general discussion of Ruelle–Frobenius–Perron and Koopman operators for nonlinear dynamical systems.

The preceding employed the notation that A represents a general linear operator. In the following examples, we reserve the symbol T for the operator of a stochastic transition dynamic. If the state-space is finite and has a stationary distribution, then T has a representation that is a nonnegative row-stochastic—all rows sum to unity—transition matrix.

The transition matrix’s nonnegativity guarantees that for each $\lambda \in \Lambda_T$ its complex conjugate $\overline{\lambda}$ is also in $\Lambda_T$. Moreover, the projection operator associated with the complex conjugate of $\lambda$ is the complex conjugate of $T_\lambda$: $T_{\overline{\lambda}} = \overline{T_\lambda}$.

If the dynamic induced by T has a stationary distribution over the state space, then the spectral radius of T is unity and all of T’s eigenvalues lie on or within the unit circle in the complex plane. The maximal eigenvalues have unity magnitude and 1 ∈ ΛT. Moreover, an extension of the Perron–Frobenius theorem guarantees that eigenvalues on the unit circle have algebraic multiplicity equal to their geometric multiplicity. And, so, νζ = 1 for all ζ ∈ {λ ∈ ΛT:|λ| = 1}.

T’s index-one eigenvalue of λ = 1 is associated with stationarity of the associated Markov process. T’s other eigenvalues on the unit circle are roots of unity and correspond to deterministic periodicities within the process.

All of these results carry over from discrete to continuous time. In continuous time, where $e^{tG} = T_{t_0}^{t_0+t}$, T’s stationary eigenvalue of unity maps to G’s stationary eigenvalue of zero. If the dynamic has a stationary distribution over the state space, then the rate matrix G is row-sum zero rather than row-stochastic. T’s eigenvalues, on or within the unit circle, map to G’s eigenvalues with nonpositive real part in the left-hand side of the complex plane.

To reduce ambiguity in the presence of multiple operators, functions of operators, and spectral mapping, we occasionally denote eigenvectors with subscripted operators on the eigenvalues within the bra or ket. For example, $|0_G\rangle = |1_T\rangle \neq |0_{\mathcal{G}}\rangle = |1_{\mathcal{T}}\rangle \neq |0_{\mathcal{T}}\rangle$ disambiguates the identification of $|0\rangle$ when we have operators $G$, $T$, $\mathcal{G}$, and $\mathcal{T}$ with $T = e^{\tau G}$, $\mathcal{T} = e^{\tau \mathcal{G}}$, and $0 \in \Lambda_G, \Lambda_{\mathcal{G}}, \Lambda_{\mathcal{T}}$.

The generalized spectral theory developed here has recently been applied to give the first closed-form expressions for many measures of complexity for stochastic processes that can be generated by probabilistic finite automata.19–23 Rather than belabor the Kolmogorov–Chaitin notion of complexity which is inherently uncomputable,47 the new analytic framework here infuses computational mechanics48 with a means to compute very practical answers about an observed system’s organization and to address the challenges of prediction.

For example, we can now answer the obvious questions regarding prediction: How random is a process? How much information is shared between the past and the future? How far into the past must we look to predict what is predictable about the future? How much about the observed history must be remembered to predict what is predictable about the future? And so on. The Supplementary Materials of Ref. 19 exploit the generalized spectral theory to answer these (and more) questions for the symbolic dynamics of a chaotic map, the spacetime domain for an elementary cellular automata, and the chaotic crystallographic structure of a close-packed polytypic material as determined from experimental X-ray diffractograms.

In the context of the current exposition, the most notable feature of the analyses across these many domains is that our questions, which entail tracking an observer’s state of knowledge about a process, necessarily induce a nondiagonalizable metadynamic that becomes the central object of analysis in each case. (This metadynamic is the so-called mixed-state presentation of Refs. 49 and 50.)

This theme, and the inherent nondiagonalizability of prediction, is explored in greater depth elsewhere.22,23 We also found that another nondiagonalizable dynamic is induced even in the context of quantum communication when determining how much memory reduction can be achieved if we generate a classical stochastic process using quantum mechanics.24 

We mention the above nondiagonalizable metadynamics primarily as a pointer to concrete worked-out examples where the generalized spectral theory has been employed to analyze finitary hidden Markov processes via explicitly calculated, generalized eigenvectors and projection operators. We now return to a more self-contained discussion, where we show that nondiagonalizability can be induced by the simple act of counting. Moreover, the theory developed is then applied to deliver quick and powerful results.

The functional calculus leads naturally to a novel perspective on the familiar Poisson counting process—a stochastic process class used widely across physics and other quantitative sciences to describe “completely random” event durations that occur over a continuous domain.51–54 The calculus shows that the basic Poisson distribution arises as the signature of a simple nondiagonalizable dynamic. More to the point, we derive the Poisson distribution directly, without requiring the limit of the discrete-time binomial distribution, as conventionally done.29

Consider all possible counts, up to some arbitrarily large integer N. The dynamics among these first N + 1 counter states constitute what can be called the truncated Poisson dynamic. We recover the full Poisson distribution as N → ∞. A Markov chain for the truncated Poisson dynamic is shown in Fig. 1. The corresponding rate matrix G, for any arbitrarily large truncation N of the possible count, is:

\[
G =
\begin{bmatrix}
-r & r & & & \\
 & -r & r & & \\
 & & \ddots & \ddots & \\
 & & & -r & r \\
 & & & & -r
\end{bmatrix} ,
\]

where Gij is the rate of transitioning to state (count) j given that the system is in state (count) i. Elements not on either the main diagonal or first superdiagonal are zero. This can be rewritten succinctly as:

\[
G = -rI + rD_1 ,
\]

where I is the identity operator in N-dimensions and $D_1$ is the upshift-by-1 matrix in N-dimensions, with zeros everywhere except 1s along the first superdiagonal. Let us also define the upshift-by-m matrix $D_m$, with zeros everywhere except 1s along the mth superdiagonal, such that $D_m = D_1^m$ and $(D_m)^n = D_{mn}$, with $D_0 = I$. Operationally, if $\langle\delta_\ell|$ is the probability distribution over counter states that is peaked solely at state $\ell$, then $\langle\delta_\ell| D_m = \langle\delta_{\ell+m}|$.

FIG. 1.

Explicit Markov-chain representation of the continuous-time truncated Poisson dynamic, giving interstate transition rates r among the first N + 1 counter-states. (State self-transition rates − r are not depicted.) Taking the limit of N → ∞ recovers the full Poisson counting distribution. It can either be time-homogeneous (transition-rate parameter r is time-independent) or time-inhomogeneous (parameter r is time-dependent).


For any arbitrarily large N, G’s eigenvalues are given by $\det(G - \lambda I) = (-r - \lambda)^{N+1} = 0$, from which we see that its spectrum is the singleton $\Lambda_G = \{-r\}$. Moreover, since it has algebraic multiplicity $a_{-r} = N + 1$ and geometric multiplicity $g_{-r} = 1$, the index of the $-r$ eigenvalue is $\nu_{-r} = N + 1$. Since $-r$ is the only eigenvalue, and all projection operators must sum to the identity, we must have the eigenprojection $G_{-r} = I$. The lesson is that the Poisson point process is highly nondiagonalizable.
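These spectral facts are easy to confirm for a small truncation (a throwaway sketch assuming NumPy; the values of r and N are arbitrary):

import numpy as np

r, N = 1.5, 4
G = -r * np.eye(N + 1) + r * np.eye(N + 1, k=1)

assert np.allclose(np.linalg.eigvals(G), -r)              # spectrum is the singleton {-r}
assert np.linalg.matrix_rank(G + r * np.eye(N + 1)) == N  # geometric multiplicity 1
assert np.allclose(np.linalg.matrix_power(G + r * np.eye(N + 1), N + 1), 0)  # index N + 1

For large N, by contrast, the eigenvalue computation itself becomes numerically delicate, a sensitivity of high-index eigenvalues that we return to in the concluding remarks.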

1. Homogeneous Poisson processes

When the transition rate r between counter states is constant in time, the net counter state-to-state transition operator from initial time 0 to later time t is given simply by:

\[
T(t) = e^{tG} .
\]

The functional calculus allows us to directly evaluate etG for the Poisson nondiagonalizable transition-rate operator G; we find:

\[
T(t) = e^{tG}
= \sum_{\lambda \in \Lambda_G} \sum_{m=0}^{\nu_\lambda - 1} G_\lambda (G - \lambda I)^m \biggl( \frac{1}{2\pi i} \oint_{C_\lambda} \frac{e^{tz}}{(z-\lambda)^{m+1}}\, dz \biggr)
= \lim_{N\to\infty} \sum_{m=0}^{N} I\, (G + rI)^m\, \frac{1}{m!} \lim_{z \to -r} \frac{d^m}{dz^m} e^{tz}
= \lim_{N\to\infty} \sum_{m=0}^{N} (r D_1)^m\, \frac{t^m e^{-rt}}{m!}
= \sum_{m=0}^{\infty} D_m\, \frac{(rt)^m e^{-rt}}{m!} .
\]

Consider the orthonormality relation ⟨δi|δj⟩ = δi,j between counter states, where |δj⟩ is represented by 0s everywhere except for a 1 at counter-state j. It effectively measures the occupation probability of counter-state j. Employing the result for T(t), we find the simple consequence that:

\[
\langle\delta_0|\, T(t)\, |\delta_n\rangle = \frac{(rt)^n e^{-rt}}{n!} = \langle\delta_m|\, T(t)\, |\delta_{m+n}\rangle .
\]

That is, the probability that the counter is incremented by n in a time interval t is independent of the initial count and given by $(rt)^n e^{-rt} / n!$.
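This prediction is easy to spot-check numerically (a minimal sketch assuming NumPy and SciPy; the rate, time, and truncation below are arbitrary):

import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

r, t, N = 1.5, 2.0, 60
G = -r * np.eye(N + 1) + r * np.eye(N + 1, k=1)
T = expm(t * G)

# Row 0 of T(t) holds <delta_0|T(t)|delta_n>: exactly the Poisson(rt) distribution
assert np.allclose(T[0, :20], poisson.pmf(np.arange(20), r * t), atol=1e-12)
# Increments are independent of the initial count: row m, offset n matches row 0
assert np.allclose(T[10, 10:30], T[0, :20])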

Let us emphasize that these steps derived the Poisson distribution directly, rather than as the typical limit of the binomial distribution. Our derivation depended critically on spectral manipulations of a highly nondiagonalizable operator. Moreover, our result for the transition dynamic T(t) allows a direct analysis of how distributions over counts evolve in time, as would be necessary, say, in a Bayesian setting with unknown prior count. This type of calculus can immediately be applied to the analysis of more sophisticated processes, for which we can generally expect nondiagonalizability to play an important functional role.

2. Inhomogeneous Poisson processes

Let us now generalize to time-inhomogeneous Poisson processes, where the transition rate r between count events is instantaneously uniform, but varies in time as r(t). Conveniently, the associated rate matrices at different times commute with each other. Specifically, with $G_a = -aI + aD_1$ and $G_b = -bI + bD_1$, we see that:

\[
[G_a, G_b] = 0 .
\]

Therefore, the net counter state-to-state transition operator from time t0 to time tf is given by:

\[
T_{t_0, t_f} = e^{\int_{t_0}^{t_f} G(t)\, dt}
= e^{\bigl( \int_{t_0}^{t_f} r(t)\, dt \bigr) (-I + D_1)}
= e^{\langle r \rangle \Delta t\, (-I + D_1)}
= e^{\Delta t\, G_{\langle r \rangle}} ,
\]
(45)

where Δt = tft0 is the time elapsed and:

\[
\langle r \rangle = \frac{1}{\Delta t} \int_{t_0}^{t_f} r(t)\, dt
\]

is the average rate during that time. Given Eq. (45), the functional calculus proceeds just as in the time-homogeneous case to give the analogous net transition dynamic:

\[
T_{t_0, t_f} = \sum_{m=0}^{\infty} D_m\, \frac{(\langle r \rangle \Delta t)^m\, e^{-\langle r \rangle \Delta t}}{m!} .
\]

The probability that the count is incremented by n during the time interval Δt follows directly:

\[
\langle\delta_m|\, T_{t_0, t_f}\, |\delta_{m+n}\rangle = \frac{(\langle r \rangle \Delta t)^n\, e^{-\langle r \rangle \Delta t}}{n!} .
\]

With relative ease, our calculus allowed us to derive an important result for stochastic process theory that is nontrivial to derive by other means. Perhaps surprisingly, we see that the probability distribution over final counts induced by any rate trajectory r(t) is the same as if the transition rate were held fixed at mean ⟨r⟩ throughout the duration. Moreover, we can directly analyze the net evolution of distributions over counts using the derived transition operator Tt0,tf.
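This invariance is simple to verify numerically (a sketch under the same NumPy/SciPy assumptions; the erratic rate trajectory is an arbitrary choice): because the instantaneous generators commute, a time-ordered product of many short exposures matches a single exposure at the mean rate.

import numpy as np
from scipy.linalg import expm

N = 60
gen = lambda r: -r * np.eye(N + 1) + r * np.eye(N + 1, k=1)

ts = np.linspace(0.0, 2.0, 401)          # 400 short steps on [0, 2]
dt = ts[1] - ts[0]
rate = lambda t: 1.0 + 0.8 * np.sin(7 * t) ** 2

T = np.eye(N + 1)
for t in ts[:-1]:                        # time-ordered product of short exposures
    T = T @ expm(dt * gen(rate(t)))

r_avg = np.mean([rate(t) for t in ts[:-1]])   # discretized (1/dt)*integral of r(t)
assert np.allclose(T, expm(2.0 * gen(r_avg)), atol=1e-8)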

Note that the nondiagonalizability of the Poisson dynamic is robust in a physical sense. That is, even when the rate parameter varies erratically in time, the inherent structure of counting imposes a fundamentally nondiagonalizable dynamic. That nondiagonalizability can be robust in a physical sense is significant, since one might otherwise be tempted to argue that nondiagonalizability is extremely fragile due to numerical perturbations within any matrix representation of the operator. This is simply not the case, since such perturbations are physically forbidden. Rather, this simple example challenges us with the fact that some processes, even those familiar and widely used, are intrinsically nondiagonalizable. On the positive side, it appears that spectral methods can now be applied to analyze them. And, this will be particularly important in more complex, memoryful processes,55–58 including the hidden semi-Markov processes51,59 that are, roughly speaking, the cross-product of hidden finite-state Markov chains and renewal processes.

The previous simple examples started to demonstrate the spectral methods of the functional calculus. Next, we show a novel application of the meromorphic functional calculus to environmentally driven mesoscopic dynamical systems, selected to give a new set of results within nonequilibrium thermodynamics. In particular, we analyze functions of singular transition-rate operators. Notably, we show that the Drazin inverse arises naturally in the general solution of Green–Kubo relations. We mention that it also arises when analyzing moments of the excess heat produced in the driven transitions atop either equilibrium steady states or nonequilibrium steady states.26 

1. Dynamics in independent eigenspaces

An important feature of the functional calculus is its ability to address particular eigenspaces independently when necessary. This feature is often taken for granted in the case of normal operators; say, in physical dynamical systems when analyzing stationary distributions or dominant decay modes. Consider a singular operator L that is not necessarily normal and not necessarily diagonalizable and evaluate the simple yet ubiquitous integral $\int_0^\tau e^{tL}\, dt$. Via the meromorphic functional calculus we find:

\[
\int_0^\tau e^{tL}\, dt
= \sum_{\lambda \in \Lambda_L} \sum_{m=0}^{\nu_\lambda - 1} L_{\lambda,m}\, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{\int_0^\tau e^{tz}\, dt}{(z-\lambda)^{m+1}}\, dz
\]
\[
= \Biggl( \sum_{m=0}^{\nu_0 - 1} L_{0,m}\, \frac{1}{2\pi i} \oint_{C_0} \frac{z^{-1} (e^{\tau z} - 1)}{z^{m+1}}\, dz \Biggr)
+ \sum_{\lambda \in \Lambda_L \setminus \{0\}} \sum_{m=0}^{\nu_\lambda - 1} L_{\lambda,m}\, \frac{1}{2\pi i} \oint_{C_\lambda} \frac{z^{-1} (e^{\tau z} - 1)}{(z-\lambda)^{m+1}}\, dz
\]
\[
= \Biggl( \sum_{m=0}^{\nu_0 - 1} \frac{\tau^{m+1}}{(m+1)!}\, L_{0,m} \Biggr) + L^{D} \bigl( e^{\tau L} - I \bigr) ,
\]
(46)

where $L^D$ is the Drazin inverse of L, discussed earlier.

The pole–pole interaction ($z^{-1}$ with $z^{-(m+1)}$) at z = 0 distinguished the 0-eigenspace in the calculations and required the meromorphic functional calculus for direct analysis. The given solution to this integral will be useful in the following.

Next, we consider the case where L is the transition-rate operator among the states of a structured stochastic dynamical system. This leads to several novel consequences within stochastic thermodynamics.

2. Green–Kubo relations

Let us reconsider the above integral in the case when the singular operator L—let us call it G—is a transition-rate operator that exhibits a single stationary distribution. By the spectral mapping $\ln \Lambda_{e^G}$ of the eigenvalue $1 \in \Lambda_{e^G}$ addressed in the Perron–Frobenius theorem, G’s zero eigenmode is diagonalizable. And, by assuming a single attracting stationary distribution, the zero eigenvalue has algebraic multiplicity $a_0 = 1$. Equation (46) then simplifies to:

\[
\int_0^\tau e^{tG}\, dt = \tau\, |0_G\rangle \langle 0_G| + G^{D} \bigl( e^{\tau G} - I \bigr) .
\]
(47)

Since G is a transition-rate operator, the above integral corresponds to integrated time evolution. The Drazin inverse $G^D$ concentrates on the transient contribution beyond the persistent stationary background. In Eq. (47), the subscript within the left and right eigenvectors explicitly links the eigenvectors to the operator G, to reduce ambiguity. Specifically, the projector $|0_G\rangle \langle 0_G|$ maps any distribution to the stationary distribution.
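As a concrete check, here is a minimal numerical sketch (assuming NumPy and SciPy; the 3-state rate matrix is our own example, and since it happens to be diagonalizable, its Drazin inverse is simply the inverse restricted to the nonzero eigenspaces):

import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad_vec

G = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.2, 0.3, -0.5]])   # rows sum to zero; single stationary distribution

evals, R = np.linalg.eig(G)
L = np.linalg.inv(R)               # rows: the dual left eigenvectors
nz = np.abs(evals) > 1e-12
GD = sum((1 / lam) * np.outer(R[:, i], L[i])            # Drazin inverse: invert the
         for i, lam in enumerate(evals) if nz[i]).real  # nonzero eigenspaces, zero out lam = 0
proj0 = sum(np.outer(R[:, i], L[i])
            for i, lam in enumerate(evals) if not nz[i]).real   # |0_G><0_G|

tau = 3.0
lhs = quad_vec(lambda t: expm(t * G), 0, tau)[0]        # direct numerical integration
rhs = tau * proj0 + GD @ (expm(tau * G) - np.eye(3))    # Eq. (47)
assert np.allclose(lhs, rhs, atol=1e-8)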

Green–Kubo-type relations60,61 connect the out-of-steady-state transport coefficients to the time integral of steady-state autocorrelation functions. They are thus very useful for understanding out-of-steady-state dissipation due to steady-state fluctuations. (Steady state here refers to either equilibrium or nonequilibrium steady state.) Specifically, the Green–Kubo relation for a transport coefficient, κ say, is typically of the form:

\[
\kappa = \int_0^\infty \Bigl( \langle A(0) A(t) \rangle_{\text{s.s.}} - \langle A \rangle_{\text{s.s.}}^2 \Bigr)\, dt ,
\]

where A(0) and A(t) are some observable of the stationary stochastic dynamical system at time 0 and time t, respectively, and the subscript ⟨·⟩s.s. emphasizes that the expectation value is to be taken according to the steady-state distribution.

Using:

\[
\langle A(0) A(t) \rangle_{\text{s.s.}} = \operatorname{tr} \bigl( |0_G\rangle \langle 0_G|\, A\, e^{tG} A \bigr) = \langle 0_G|\, A\, e^{tG} A\, |0_G\rangle ,
\]

the transport coefficient κ can be written more explicitly in terms of the relevant transition-rate operator G for the stochastic dynamics:

\[
\kappa = \lim_{\tau\to\infty} \biggl[ \int_0^\tau \langle 0_G|\, A\, e^{tG} A\, |0_G\rangle\, dt - \tau\, \langle 0_G| A |0_G\rangle^2 \biggr]
= \lim_{\tau\to\infty} \biggl[ \langle 0_G|\, A \Bigl( \int_0^\tau e^{tG}\, dt \Bigr) A\, |0_G\rangle - \tau\, \langle 0_G| A |0_G\rangle^2 \biggr]
= \lim_{\tau\to\infty} \langle 0_G|\, A\, G^{D} \bigl( e^{\tau G} - I \bigr) A\, |0_G\rangle
= -\langle A\, G^{D} A \rangle_{\text{s.s.}} .
\]
(48)

Thus, we learn that relations of Green–Kubo form are direct signatures of the Drazin inverse of the transition-rate operator for the stochastic dynamic.
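Continuing the numerical sketch above (reusing G and GD), this identity can be checked against a brute-force time integral; the observable A is an arbitrary illustrative choice:

from scipy.integrate import quad
from scipy.linalg import null_space

pi = null_space(G.T)[:, 0]
pi /= pi.sum()                  # <0_G|: the stationary distribution
ones = np.ones(3)               # |0_G>: a column of ones
A = np.diag([1.0, -2.0, 0.5])   # a state-dependent observable

corr = lambda t: pi @ A @ expm(t * G) @ A @ ones - (pi @ A @ ones) ** 2
kappa_int = quad(corr, 0, 50)[0]            # integrated steady-state autocorrelation
kappa_drazin = -(pi @ A @ GD @ A @ ones)    # Eq. (48)
assert np.isclose(kappa_int, kappa_drazin, atol=1e-6)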

The result of Eq. (48) holds quite generally. For example, if the steady state has some number of periodic flows, the result of Eq. (48) remains valid. Alternatively, in the case of nonperiodic chaotic flows—where G will be the logarithm of the Ruelle–Frobenius–Perron operator, as described later in Sec. VI E 1—|0G⟩⟨0G| still induces the average over the steady-state trajectories.

In the special case where the transition-rate operator is diagonalizable, $-\langle A\, G^{D} A \rangle_{\text{s.s.}}$ is simply the integrated contribution from a weighted sum of decaying exponentials. Transport coefficients then have a solution of the simple form:

\[
\kappa = -\sum_{\lambda \in \Lambda_G \setminus \{0\}} \frac{1}{\lambda}\, \langle 0_G|\, A\, G_\lambda A\, |0_G\rangle .
\]
(49)

Note that the minus sign keeps κ positive, since $\operatorname{Re}(\lambda) < 0$ for $\lambda \in \Lambda_G \setminus \{0\}$. Also, recall that G’s eigenvalues with nonzero imaginary part occur in complex-conjugate pairs and $G_{\overline{\lambda}} = \overline{G_\lambda}$. Moreover, if $G_{i,j}$ is the classical transition rate from state i to state j (to disambiguate from the transposed possibility), then $\langle 0_G|$ is the stationary distribution. (The latter is sometimes denoted $\langle\pi|$ in the Markov process literature.) And, $|0_G\rangle$ is a column vector of all ones (sometimes denoted $|\mathbf{1}\rangle$), which acts to integrate contributions throughout the state space.
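In the running numerical example (reusing evals, R, L, pi, A, ones, and kappa_drazin from the sketches above), the eigen-expansion of Eq. (49) reproduces the Drazin-inverse value:

kappa_eig = -sum((1 / lam) * (pi @ A @ np.outer(R[:, i], L[i]) @ A @ ones)
                 for i, lam in enumerate(evals) if abs(lam) > 1e-12)
assert np.isclose(kappa_eig.real, kappa_drazin)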

A relationship of the form of Eq. (48), between the Drazin inverse of a classical transition-rate operator and a particular Green–Kubo relation was recently found in Ref. 62 for the friction tensor for smoothly-driven transitions atop nonequilibrium steady states. Subsequently, a truncation of the eigen-expansion of the form of Eq. (49) was recently used in a similar context to bound a universal tradeoff between power, precision, and speed.63 Equation (48) shows that a fundamental relationship between a physical property and a Drazin inverse is to be expected more generally whenever the property can be related to integrated correlation.

Notably, if a Green–Kubo-like relation integrates a cross-correlation, say between A(t) and B(t) rather than an autocorrelation, then we have only the slight modification:

\[
\int_0^\infty \Bigl( \langle A(0) B(t) \rangle_{\text{s.s.}} - \langle A \rangle_{\text{s.s.}} \langle B \rangle_{\text{s.s.}} \Bigr)\, dt = -\langle A\, G^{D} B \rangle_{\text{s.s.}} .
\]
(50)

The foregoing analysis bears on both classical and quantum dynamics. G may be a so-called linear superoperator in the quantum regime;64 for example, the Lindblad superoperator65,66 that evolves density operators. A Liouville-space representation67 of the superoperator, though, exposes the superficiality of the distinction between superoperator and operator. At an abstract level, time evolution can be discussed uniformly across subfields and reinterpretations of Eq. (50) will be found in each associated physical theory.

Reference 26 presents additional constructive results that emphasize the ubiquity of integrated correlation and Drazin inverses in the transitions between steady states,68 relevant to the fluctuations within any physical dynamic. Overall, these results support the broader notion that dissipation depends on the structure of correlation.

Frequency-dependent generalizations of integrated correlation have a corresponding general solution. For example, the general solution to power spectra of a process generated by any countable-state hidden Markov chain can be given in exact closed form using our methods. Those results will be presented elsewhere.

Since trajectories in state-space can be generated independently of each other, any nonlinear dynamic corresponds to a linear operation on an infinite-dimensional vector space of complex-valued distributions (in the sense of generalized functions) over the original state-space. For example, the well-known Lorenz ordinary differential equations69 are nonlinear in their three given state-space variables—x, y, and z. Nevertheless, the dynamic is linear in the infinite-dimensional vector space $\mathcal{D}(\mathbb{R}^3)$ of distributions over $\mathbb{R}^3$. Although $\mathcal{D}(\mathbb{R}^3)$ is an unwieldy state-space, the dynamics there might be well approximated by a finite truncation of its modes.

1. Ruelle–Frobenius–Perron and Koopman operators

The preceding operator formalism applies, in principle at least. The question, of course, is: Is it practical and does it lead to constructive consequences? Let’s see. The right eigenvector is either $|0_G\rangle$ or $|1_T\rangle$, with $T = e^{\tau G}$ the Ruelle–Frobenius–Perron transition operator.70,71 Equivalently, it is also π, the stationary distribution, with support on attracting subsets of $\mathbb{R}^3$ in the case of the Lorenz dynamic. The corresponding left eigenvector, either $\langle 0_G|$ or $\langle 1_T|$, is uniform over the space. Other modes of the operator’s action, according to the eigenvalues and left and right eigenvectors and generalized eigenvectors, capture the decay of arbitrary distributions on $\mathbb{R}^3$.

The meromorphic spectral methods developed above give a view of the Koopman operator and Koopman modes of nominally nonlinear dynamical systems4 that is complementary to the Ruelle–Frobenius–Perron operator. The Koopman operator K is the adjoint—in the sense of vector spaces, not inner product spaces—of the Ruelle–Frobenius–Perron operator T: effectively, the transpose $K = T^\top$. Moreover, it has the same spectrum, with only a right–left swapping of the eigenvectors and generalized eigenvectors.

The Ruelle–Frobenius–Perron operator T is usually associated with the evolution of probability density, while the Koopman operator K is usually associated with the evolution of linear functionals of probability density. The duality of perspectives is associative in nature: $\langle f| \bigl( T^n |\rho_0\rangle \bigr)$ corresponds to the Ruelle–Frobenius–Perron perspective, with T acting on the density ρ, while $\bigl( \langle f| T^n \bigr) |\rho_0\rangle$ corresponds to the Koopman operator $T^\top = K$ acting on the observation function f. Allowing an observation vector $\vec{f} = [f_1, f_2, \ldots, f_m]$ of linear functionals, and inspecting the most general form of $K^n$ given by Eq. (25) together with the generalized eigenvector decomposition of the projection operators of Eq. (39), yields the most general form of the dynamics in terms of Koopman modes. Each Koopman mode is a length-m vector-valued functional of a Ruelle–Frobenius–Perron right eigenvector or generalized eigenvector.
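On a finite state space the adjoint relationship is transparent; a minimal sketch (assuming NumPy/SciPy, with the 3-state generator from the earlier examples standing in for a Ruelle–Frobenius–Perron operator):

import numpy as np
from scipy.linalg import expm

G = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.2, 0.3, -0.5]])
T = expm(0.1 * G)   # finite-state stand-in for the Ruelle-Frobenius-Perron operator
K = T.T             # Koopman operator: the vector-space transpose

# Identical spectra; right eigenvectors of K are left eigenvectors of T, and vice versa
assert np.allclose(np.sort_complex(np.linalg.eigvals(T).astype(complex)),
                   np.sort_complex(np.linalg.eigvals(K).astype(complex)))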

Both approaches suffer when their operators are defective. Given the meromorphic calculus’ ability to work around a wide class of such defects, adapting it to the Ruelle–Frobenius–Perron and Koopman operators suggests that it may lift their decades-long restriction to only analyzing highly idealized (e.g., hyperbolic) chaotic systems.

2. Eigenvalues from a time series

Let’s explore an additional benefit of this view of the Ruelle–Frobenius–Perron and Koopman operators, by proposing a novel method to extract the eigenvalues of a nominally nonlinear dynamic. Let $O_N(f, z)$ be ($z^{-1}$ times) the z-transform [Ref. 72, pp. 257–262] of a length-N sequence of τ-spaced type-f observations of a dynamical system:

\[
O_N(f, z) \equiv z^{-1} \sum_{n=0}^{N} z^{-n} \langle f|\, T^n |\rho_0\rangle
\;\xrightarrow{\;N \to \infty\;}\; \langle f|\, (zI - T)^{-1} |\rho_0\rangle
= \sum_{\lambda \in \Lambda_T} \sum_{m=0}^{\nu_\lambda - 1} \frac{\langle f|\, T_{\lambda,m} |\rho_0\rangle}{(r e^{i\omega} - \lambda)^{m+1}} ,
\]

as N → ∞, for $|z| = r > 1$. Note that $\langle f| T^n |\rho_0\rangle$ is simply the f-observation of the system at time $n\tau$, when the system started in state $\rho_0$. We see that this z-transform of observations automatically induces the resolvent of the hidden linear dynamic. If the process is continuous-time, then $T = e^{\tau G}$ implies $\lambda_T = e^{\tau \lambda_G}$, so that the eigenvalues shift along the unit circle if τ changes; but the eigenvalues should be invariant to τ in the appropriate τ-dependent conformal mapping of the inside of the unit circle of the complex plane to the left half of the complex plane. Specifically, for any experimentally accessible choice of inter-measurement temporal spacing τ, the fundamental set of continuous-time eigenvalues $\Lambda_G$ can be obtained from $\lambda_G = \frac{1}{\tau} \ln \lambda_T$, where each $\lambda_T \in \Lambda_T$ is extrapolated from $c / (r e^{i\omega} - \lambda_T)^n$ curves fit to $O_N(f, r e^{i\omega})$ for $c \in \mathbb{C}$, large N, and fixed r.

The square magnitude of ON(f, z) is related to the power spectrum generated by f-type observations of the system. Indeed, the power spectrum generated by any type of observation of a nominally nonlinear system is a direct fingerprint of the eigenspectrum and resolvent of the hidden linear dynamic. This suggests many opportunities for inferring eigenvalues and projection operators from frequency-domain transformations of a time series.
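The resolvent identity behind this proposal, and the spectral mapping back to continuous time, can both be sanity-checked on a small hidden linear dynamic (a sketch assuming NumPy/SciPy; the generator, observation functional, and z are arbitrary choices):

import numpy as np
from scipy.linalg import expm

G = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.2, 0.3, -0.5]])
tau = 0.1
T = expm(tau * G)
f = np.array([1.0, -1.0, 2.0])      # observation functional <f|
rho0 = np.array([1.0, 0.0, 0.0])    # initial distribution

# O_N(f, z) converges to the resolvent <f|(zI - T)^{-1}|rho0> for |z| = r > 1
z, N = 1.3 * np.exp(0.7j), 400
obs = [f @ np.linalg.matrix_power(T, n) @ rho0 for n in range(N + 1)]  # the "time series"
ON = z**-1 * sum(z**-n * o for n, o in enumerate(obs))
assert np.isclose(ON, f @ np.linalg.inv(z * np.eye(3) - T) @ rho0)

# Spectral mapping back to continuous time: lambda_G = (1/tau) ln(lambda_T)
lams_G = np.log(np.linalg.eigvals(T).astype(complex)) / tau
assert np.allclose(np.sort_complex(lams_G),
                   np.sort_complex(np.linalg.eigvals(G).astype(complex)))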

The original, abstract spectral theory of normal operators rose to central importance when, in the early development of quantum mechanics, the eigenvalues of Hermitian operators were detected experimentally in the optical spectra of energetic transitions of excited electrons. We extended this powerful theory by introducing the meromorphic functional calculus and unraveling the consequences of both the holomorphic and meromorphic functional calculi in terms of spectral projection operators and their associated left and right generalized eigenvectors. The result is a tractable spectral theory of nonnormal operators. Our straightforward examples suggest that the spectral properties of these general operators should also be experimentally accessible in the behavior of complex—open, strongly interacting—systems. We see a direct parallel with the success of the original spectral theory of normal operators as it made accessible the phenomena of the quantum mechanics of closed systems. This turns on nondiagonalizability and appreciating how ubiquitous it is.

Nondiagonalizability has consequences for settings as simple as counting, as shown in Sec. VI C. Moreover, there we found that nondiagonalizability can be robust. The Drazin inverse, the negative-one power in the meromorphic functional calculus, is quite common in the nonequilibrium thermodynamics of open systems, as we showed in Sec. VI D. And in related work,23 we found that the power spectrum of a stochastic process is a direct signature of the spectrum and projection operators of the process’ hidden linear dynamic, with nondiagonalizable subspaces yielding qualitatively distinct line profiles. This shows that the spectral character of nonnormal and nondiagonalizable operators manifests itself physically and measurably. Our new formulae for spectral projection operators and the orthonormality relation among left and right generalized eigenvectors will thus likely find use in the analytic treatment of complex physical systems.

From the perspective of functional calculus, nonunitary time evolution, open systems, and non-Hermitian generators are closely related concepts since they all rely on the manipulation of nonnormal operators. Moreover, each domain is gaining traction. Nonnormal operators have recently drawn attention, from the nonequilibrium thermodynamics of nanoscale systems73 to large-scale cosmological evolution.74 In another arena entirely, complex directed networks75 correspond to nonnormal and not-necessarily-diagonalizable weighted digraphs. There are even hints that nondiagonalizable network structures can be optimal for implementing certain dynamical functionalities.76 The opportunity here should be contrasted with the well established field of spectral graph theory77 that typically considers consequences of the spectral theorem for normal operators applied to the symmetric (and thus normal) adjacency matrices and Laplacian matrices. It seems that the meromorphic calculus and its generalized spectral theory will enable a spectral weighted digraph theory beyond the purview of current spectral graph theory.

Even if the underlying dynamic is diagonalizable, particular questions or particular choices of observable often induce a nondiagonalizable hidden linear dynamic. The examples already showed this arising from the simple imposition of counting or assuming a Poissonian dynamic. In more sophisticated examples, we recently found nondiagonalizable dynamic structures in quantum memory reduction,24 classical complexity measures,19 and prediction.22,23

Our goal has been to develop tractable, exact analytical techniques for nondiagonalizable systems. We did not discuss numerical implementation of algorithms that naturally accompany its practical application. Nevertheless, the theory does suggest new algorithms—for the Drazin inverse, projection operators, power spectra, and more. Guided by the meromorphic calculus, such algorithms can be made robust despite the common knowledge that numerics with nondiagonalizable matrices are sensitive in certain ways.

The extended spectral theory we have drawn out of the holomorphic and meromorphic functional calculi complements efforts to address nondiagonalizability, e.g., via pseudospectra.78,79 It also extends and simplifies previously known results, especially as developed by Dunford.16 Just as the spectral theorem for normal operators enabled much theoretical progress in physics, we hope that our generalized and tractable analytic framework yields rigorous understanding for much broader classes of complex systems. Importantly, the analytic framework should enable a new theory of complex systems beyond the limited purview of numerical investigations.

While the infinite-dimensional theory is in principle readily obtained from the present framework, special care must be taken to guarantee a similar level of tractability and generality. Nevertheless, even the finite-dimensional theory enables a new level of tractability for analyzing not-necessarily-diagonalizable systems, including nonnormal dynamics. Future work will take full advantage of the operator theory, with more emphasis on infinite-dimensional systems and continuous spectra. Another direction forward is to develop creation and annihilation operators within nondiagonalizable dynamics. In the study of complex stochastic information processing, for example, this would allow analytic study of infinite-memory processes generated by, say, stochastic pushdown and counter automata.58,80–82 In a physical context, such operators may aid in the study of open quantum field theories. One might finally speculate that the Drazin inverse will help tame the divergences that arise there.

JPC thanks the Santa Fe Institute for its hospitality. The authors thank Alec Boyd, Gavin Crooks, Chris Jarzynski, John Mahoney, Sarah Marzen, and Gregory Wimsatt for helpful discussions. We especially thank Gregory Wimsatt for his assistance with Sec. V B 3. This material is based upon work supported by, or in part by, the U.S. Army Research Laboratory and the U. S. Army Research Office under contracts W911NF-12-1-0234, W911NF-13-1-0390, W911NF-13-1-0340, and W911NF-18-1-0028.

1. A. Einstein, “On the method of theoretical physics,” Philosophy of Science 1(2), 163–169 (1934). The Herbert Spencer Lecture, delivered at Oxford (10 June 1933).
2. B. O. Koopman, “Hamiltonian systems and transformation in Hilbert space,” Proceedings of the National Academy of Sciences 17(5), 315–318 (1931).
3. P. Gaspard, G. Nicolis, A. Provata, and S. Tasaki, “Spectral signature of the pitchfork bifurcation: Liouville equation approach,” Phys. Rev. E 51, 74–94 (1995).
4. M. Budišić, R. Mohr, and I. Mezić, “Applied Koopmanism,” Chaos 22(4) (2012).
5. N. Trefethen, “Favorite eigenvalue problems,” SIAM News 44(10) (2011).
6. A. Sandage and G. A. Tammann, “Steps toward the Hubble constant. VII. Distances to NGC 2403, M101, and the Virgo cluster using 21 centimeter line widths compared with optical methods: The global value of H_0,” Astrophys. J. 210, 7–24 (1976).
7. A. G. Milnes, “Semiconductor heterojunction topics: Introduction and overview,” Solid-State Electronics 29(2), 99–121 (1986).
8. P. A. M. Dirac, “Theory of electrons and positrons,” in Nobel Lecture, Physics 1922–1941 (Elsevier Publishing Company, Amsterdam, 1965).
9. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning 20(3), 273–297 (1995).
10. L. Sirovich and M. Kirby, “Low-dimensional procedure for the characterization of human faces,” J. Opt. Soc. Am. A 4(3), 519–524 (1987).
11. R. Courant and D. Hilbert, Methods of Mathematical Physics, first English edition, Vol. 1 (Interscience Publishers, 1953).
12. J. von Neumann, “Zur Algebra der Funktionaloperationen und Theorie der normalen Operatoren,” Math. Annalen 102, 370–427 (1930).
13. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, New Jersey, 1955).
14. S. Hassani, Mathematical Physics (Springer, New York, 1999).
15. P. R. Halmos, Finite-Dimensional Vector Spaces (D. Van Nostrand Company, 1958).
16. N. Dunford, “Spectral operators,” Pacific J. Math. 4(3), 321–354 (1954).
17. C. D. Meyer, Matrix Analysis and Applied Linear Algebra (SIAM Press, Philadelphia, Pennsylvania, 2000).
18. P. J. Antsaklis and A. N. Michel, A Linear Systems Primer (Springer Science & Business Media, New York, New York, 2007).
19. J. P. Crutchfield, C. J. Ellison, and P. M. Riechers, “Exact complexity: Spectral decomposition of intrinsic computation,” Phys. Lett. A 380(9–10), 998–1002 (2015).
20. P. M. Riechers, D. P. Varn, and J. P. Crutchfield, “Diffraction patterns of layered close-packed structures from hidden Markov models,” e-print arXiv:1410.5028.
21. P. M. Riechers, D. P. Varn, and J. P. Crutchfield, “Pairwise correlations in layered close-packed structures,” Acta Cryst. A 71, 423–443 (2015).
22. P. M. Riechers and J. P. Crutchfield, “Spectral simplicity of apparent complexity. I. The nondiagonalizable metadynamics of prediction,” Chaos 28(3), 033115 (2018).
23. P. M. Riechers and J. P. Crutchfield, “Spectral simplicity of apparent complexity. II. Exact complexities and complexity spectra,” Chaos 28(3), 033116 (2018).
24. P. M. Riechers, J. R. Mahoney, C. Aghamohammadi, and J. P. Crutchfield, “Minimized state complexity of quantum-encoded cryptic processes,” Phys. Rev. A 93, 052317 (2016).
25. F. C. Binder, J. Thompson, and M. Gu, “A practical, unitary simulator for non-Markovian complex processes,” e-print arXiv:1709.02375.
26. P. M. Riechers and J. P. Crutchfield, “Fluctuations when driving between nonequilibrium steady states,” J. Stat. Phys. 168(4), 873–918 (2017).
27. A. B. Boyd, D. Mandal, P. M. Riechers, and J. P. Crutchfield, “Transient dissipation and structural costs of physical information transduction,” Phys. Rev. Lett. 118, 220602 (2017).
28. B. P. Lathi, Signal Processing and Linear Systems (Oxford University Press, New York, New York, 1998).
29. M. L. Boas, Mathematical Methods in the Physical Sciences, Vol. 2 (Wiley and Sons, New York, New York, 1966).
30. N. Dunford, “Spectral theory I. Convergence to projections,” Trans. Am. Math. Soc. 54(2), 185–217 (1943).
31. M. Atiyah, R. Bott, and V. K. Patodi, “On the heat equation and the index theorem,” Inventiones Mathematicae 19(4), 279–330 (1973).
32. C. S. Kubrusly, Spectral Theory of Operators on Hilbert Spaces (Springer Science & Business Media, 2012).
33. M. Haase, “Spectral mapping theorems for holomorphic functional calculi,” J. London Math. Soc. 71(3), 723–739 (2005).
34. H. A. Gindler, “An operational calculus for meromorphic functions,” Nagoya Math. J. 26, 31–38 (1966).
35. B. Nagy, “On an operational calculus for meromorphic functions,” Acta Math. Acad. Sci. Hungarica 33(3), 379–390 (1979).
36. E. H. Moore, “On the reciprocal of the general algebraic matrix,” Bull. Am. Math. Soc. 26 (1920).
37. R. Penrose, “A generalized inverse for matrices,” Math. Proc. Cambridge Phil. Soc. 51 (1955).
38. A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications, CMS Books in Mathematics (Springer, New York, New York, 2003).
39. J. J. Koliha, “A generalized Drazin inverse,” Glasgow Mathematical Journal 38(3), 367–381 (1996).
40. J. J. Koliha and T. D. Tran, “The Drazin inverse for closed linear operators and the asymptotic convergence of C0-semigroups,” J. Operator Th. 46(2), 323–336 (2001).
41. U. G. Rothblum, “A representation of the Drazin inverse and characterizations of the index,” SIAM J. App. Math. 31(4), 646–648 (1976).
42. J. G. Kemeny and J. L. Snell, Finite Markov Chains, Vol. 356 (D. Van Nostrand, New York, New York, 1960).
43. N. Dunford and J. T. Schwartz, Linear Operators (Interscience Publishers, New York, 1967).
44. T. Bermúdez, “Meromorphic functional calculus and local spectral theory,” Rocky Mountain J. Math. 29(2), 437–447 (1999).
45. J. J. Sakurai and J. J. Napolitano, Modern Quantum Mechanics (Addison-Wesley, San Francisco, California, 2011).
46. S. J. Axler, Linear Algebra Done Right, Vol. 2 (Springer, New York, New York, 1997).
47. M. Li and P. M. B. Vitanyi, An Introduction to Kolmogorov Complexity and its Applications (Springer-Verlag, New York, 1993).
48. J. P. Crutchfield, “Between order and chaos,” Nature Physics 8, 17–24 (2012).
49. J. P. Crutchfield, C. J. Ellison, and J. R. Mahoney, “Time’s barbed arrow: Irreversibility, crypticity, and stored information,” Phys. Rev. Lett. 103(9), 094101 (2009).
50. C. J. Ellison, J. R. Mahoney, and J. P. Crutchfield, “Prediction, retrodiction, and the amount of information stored in the present,” J. Stat. Phys. 136(6), 1005–1034 (2009).
51. V. S. Barbu and N. Limnios, Semi-Markov Chains and Hidden Semi-Markov Models toward Applications: Their Use in Reliability and DNA Analysis, Vol. 191 (Springer, New York, 2008).
52. W. L. Smith, “Renewal theory and its ramifications,” J. Roy. Stat. Soc. B 20(2), 243–302 (1958).
53. W. Gerstner and W. Kistler, “Statistics of spike trains,” in Spiking Neuron Models (Cambridge University Press, Cambridge, United Kingdom, 2002).
54. F. Beichelt, Stochastic Processes in Science, Engineering and Finance (Chapman and Hall, New York, 2006).
55. S. Marzen and J. P. Crutchfield, “Informational and causal architecture of discrete-time renewal processes,” Entropy 17(7), 4891–4917 (2015).
56. S. Marzen and J. P. Crutchfield, “Informational and causal architecture of continuous-time renewal processes,” J. Stat. Phys. 168, 109–127 (2017).
57. S. Marzen, M. R. DeWeese, and J. P. Crutchfield, “Time resolution dependence of information measures for spiking neurons: Scaling and universality,” Front. Comput. Neurosci. 9, 109 (2015).
58. S. Marzen and J. P. Crutchfield, “Statistical signatures of structural organization: The case of long memory in renewal processes,” Phys. Lett. A 380(17), 1517–1525 (2016).
59. S. Marzen and J. P. Crutchfield, “Structure and randomness of continuous-time discrete-event processes,” J. Stat. Phys. 169(2), 303–315 (2017).
60. M. S. Green, “Markoff random processes and the statistical mechanics of time-dependent phenomena. II. Irreversible processes in fluids,” J. Chem. Phys. 22(3), 398–413 (1954).
61. R. Zwanzig, “Time-correlation functions and transport coefficients in statistical mechanics,” Ann. Rev. Phys. Chem. 16(1), 67–102 (1965).
62. D. Mandal and C. Jarzynski, “Analysis of slow transitions between nonequilibrium steady states,” J. Stat. Mech. Th. Exp. 2016, 1–17.
63. S. Lahiri, J. Sohl-Dickstein, and S. Ganguli, “A universal tradeoff between power, precision and speed in physical communication,” e-print arXiv:1603.07758.
64. P. Löwdin, “On operators, superoperators, Hamiltonians, and Liouvillians,” Intl. J. Quant. Chem. 22(S16), 485–560 (1982).
65. G. Lindblad, “On the generators of quantum dynamical semigroups,” Comm. Math. Phys. 48(2), 119–130 (1976).
66. S. M. Barnett and S. Stenholm, “Spectral decomposition of the Lindblad operator,” J. Mod. Optics 47(14–15), 2869–2882 (2000).
67. T. Petrosky and I. Prigogine, “The Liouville space extension of quantum mechanics,” Adv. Chem. Phys. 99, 1–120 (1997).
68. Y. Oono and M. Paniconi, “Steady state thermodynamics,” Prog. Theo. Phys. Supp. 130, 29–44 (1998).
69. E. N. Lorenz, “Deterministic nonperiodic flow,” J. Atmos. Sci. 20(2), 130–141 (1963).
70. D. Ruelle and F. Takens, “On the nature of turbulence,” Comm. Math. Phys. 20(3), 167–192 (1971).
71. M. C. Mackey, Time’s Arrow: The Origins of Thermodynamic Behavior (Springer, New York, 1992).
72. R. Bracewell, The Fourier Transform and Its Applications, third edition (McGraw-Hill, New York, 1999).
73. B. Gardas, S. Deffner, and A. Saxena, “Non-Hermitian quantum thermodynamics,” Sci. Reports 6, 23408 (2016).
74. N. Berkovits and E. Witten, “Conformal supergravity in twistor-string theory,” J. High Energy Physics 2004(08), 009.
75. M. Newman, Networks: An Introduction (Oxford University Press, Oxford, United Kingdom, 2010).
76. T. Nishikawa and A. E. Motter, “Synchronization is optimal in nondiagonalizable networks,” Phys. Rev. E 73, 065106 (2006).
77. F. R. K. Chung, Spectral Graph Theory, Vol. 92 (American Mathematical Soc., Providence, Rhode Island, 1997).
78. L. N. Trefethen, “Pseudospectra of linear operators,” SIAM Review 39(3), 383–406 (1997).
79. L. N. Trefethen and M. Embree, Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators (Princeton University Press, Princeton, New Jersey, 2005).
80. J. P. Crutchfield and K. Young, “Computation at the onset of chaos,” in W. Zurek, editor, Entropy, Complexity, and the Physics of Information, Vol. VIII of SFI Studies in the Sciences of Complexity, pp. 223–269 (Addison-Wesley, Reading, Massachusetts, 1990).
81. N. Travers and J. P. Crutchfield, “Infinite excess entropy processes with countable-state generators,” Entropy 16, 1396–1413 (2014).
82. J. P. Crutchfield and S. E. Marzen, “Signatures of infinity: Nonergodicity and resource scaling in prediction, complexity, and learning,” Phys. Rev. E 91, 050106 (2015).