We develop a thorough mathematical analysis to deduce conditions for the accuracy and convergence of different approximations of the memory integral in the Mori-Zwanzig (MZ) equation. In particular, we derive error bounds and sufficient convergence conditions for short-memory approximations, the t-model, and hierarchical (finite-memory) approximations. In addition, we derive useful upper bounds for the MZ memory integral, which allow us to estimate a priori the contribution of the MZ memory to the dynamics. Such upper bounds are easily computable for systems with finite-rank projections. Numerical examples are presented and discussed for linear and nonlinear dynamical systems evolving from random initial states.

The Mori-Zwanzig (MZ) formulation is a technique from irreversible statistical mechanics that allows the development of formally exact evolution equations for the quantities of interest such as macroscopic observables in high-dimensional dynamical systems.45,10,41,42 One of the main advantages of developing such exact equations is that they provide a theoretical starting point to avoid integrating the full (possibly high-dimensional) dynamical system and instead solve directly for the quantities of interest, thus reducing the computational cost significantly. Computing the solution to the Mori-Zwanzig equation, however, is a challenging task that relies on approximations and appropriate numerical schemes. One of the main difficulties lies in the approximation of the MZ memory integral (convolution term), which encodes the effects of the so-called orthogonal dynamics in the time evolution of the quantity of interest. The orthogonal dynamics are essentially a high-dimensional flow satisfying a complex integro-differential equation. Over the years, many techniques have been proposed for approximating the MZ memory integral, the most efficient ones being problem-dependent.31,41 For example, in applications to statistical mechanics, Mori’s continued fraction method27,18 has been quite successful in determining exact solutions to several prototype problems, such as the dynamics of the auto-correlation function of a tagged oscillator in a harmonic chain.17,23 Other effective approaches to approximate the MZ memory integral rely on perturbation methods,6,42,8 mode coupling theories,2,31 functional approximation techniques,40,19,43,21,29 and data-driven methods.5,25 In a parallel effort, the applied mathematics community has, in recent years, attempted to derive general easy-to-compute representations of the memory integral. 
In particular, various approximations such as the t-model,10,12,34 the modified t-model,7 and, more recently, renormalized perturbation methods36 were proposed to address the approximation of the memory integral in situations where there is no clear separation of scales between resolved and unresolved dynamics.

The main objective of this paper is to develop rigorous estimates of the memory integral and provide a convergence analysis of different approximation models of the Mori-Zwanzig equation, such as the short-memory approximation,33 the t-model,12 and hierarchical methods.35 In particular, we study the MZ equation corresponding to two broad classes of projection operators: (i) infinite-rank projections (e.g., Chorin’s projection10) and (ii) finite-rank projections (e.g., Mori’s projection28). We develop our analysis for both state-space and probability density function (PDF) space formulations of the MZ equation. These two descriptions are connected by the same duality principle that pairs the Koopman and Frobenius-Perron operators.15

This paper is organized as follows. In Sec. II, we outline the general procedure to derive the MZ equation in the phase space and discuss the common choices of projection operators. In Sec. III, we derive error bounds for the MZ memory integral and provide the convergence analysis of different memory approximation methods, including the t-model,34,10,12 the short-memory approximation,33 and the hierarchical memory approximation technique.35 Such estimates are built upon semigroup estimation methods we present in  Appendix A. In Sec. IV, we present numerical examples demonstrating the accuracy of the memory approximation/estimation methods we develop throughout the paper. The main findings are summarized in Sec. V. Convergence analysis of the MZ memory term in the probability density function space formulation is presented in  Appendix B.

Consider the nonlinear dynamical system

$$\frac{dx}{dt}=F(x), \qquad x(0)=x_0$$
(1)

evolving on a smooth manifold $S$. For simplicity, let us assume that $S=\mathbb{R}^n$. We will consider the dynamics of scalar-valued observables $g:S\to\mathbb{C}$, and for concreteness, it will be desirable to identify structured spaces of such observable functions. In Ref. 15, it was argued that C*-algebras of observables such as $L^\infty(S,\mathbb{C})$ (the space of all measurable, essentially bounded functions on $S$) and $C_0(S,\mathbb{C})$ (the space of all continuous functions on $S$ vanishing at infinity) are natural choices. In what follows, we do not require the observables to form a C*-algebra, but we do require them to form a Banach space, as the estimation theorems of Sec. III make extensive use of the norm of this space. The Banach-space structure of the observables also gives greater context to the meaning of the linear operators $\mathcal{L}$, $\mathcal{K}$, $P$, and $Q$ defined hereafter.

The dynamics of any scalar-valued observable $g(x)$ (quantity of interest) can be expressed in terms of a semigroup $\mathcal{K}(t,s)$ of operators acting on the Banach space of observables. This is the Koopman operator,24 which acts on the function $g$ as

$$g(x(t)) = \mathcal{K}(t,s)\,g(x(s)),$$
(2)

where

$$\mathcal{K}(t,s) = e^{(t-s)\mathcal{L}}, \qquad \mathcal{L}g(x) = F(x)\cdot\nabla g(x).$$
(3)
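As a concrete illustration (not part of the original text: the matrix $A$, the observable $c$, and the integrator below are illustrative assumptions), for a linear system $dx/dt=Ax$ and a linear observable $g(x)=c\cdot x$, the Koopman identity (2) and (3) reduces to $g(x(t)) = c\cdot e^{tA}x_0$, which can be checked numerically:

```python
import numpy as np

def flow(A, x0, t, steps=20000):
    """Integrate dx/dt = A x with a basic RK4 scheme (illustrative integrator)."""
    x, h = x0.astype(float), t / steps
    for _ in range(steps):
        k1 = A @ x
        k2 = A @ (x + 0.5 * h * k1)
        k3 = A @ (x + 0.5 * h * k2)
        k4 = A @ (x + h * k3)
        x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

def expm_taylor(M, terms=60):
    """Matrix exponential via a truncated Taylor series (fine for small ||M||)."""
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

A = np.array([[0.0, 1.0], [-1.0, -0.5]])  # hypothetical damped oscillator
x0 = np.array([1.0, 0.0])
c = np.array([2.0, -1.0])                 # linear observable g(x) = c . x
t = 1.5

lhs = c @ flow(A, x0, t)                  # g(x(t)): observable along the flow
rhs = c @ (expm_taylor(t * A) @ x0)       # (e^{tL} g)(x0) = c . e^{tA} x0
print(abs(lhs - rhs))
```

The agreement of the two evaluations is exactly the statement that the Koopman semigroup propagates observables rather than states.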

Rather than computing the Koopman operator applicable to all observables, it is often more tractable to compute the evolution only of a (closed) subspace of the quantities of interest. This subspace can be described conveniently by means of a projection operator $P$ with the subspace as its image. Both $P$ and the complementary projection $Q=I-P$ act on the space of observables. The nature, mathematical properties, and connections between $P$ and the observable $g$ are discussed in detail in Ref. 15 and summarized in Sec. II A. For now, it suffices to assume that $P$ is a bounded linear operator and that $P^2=P$. The MZ formalism describes the evolution of observables initially in the image of $P$. Because the evolution of observables is governed by the semigroup $\mathcal{K}(t,s)$, we seek an evolution equation for $\mathcal{K}(t,s)P$. By using the definition of the Koopman operator (3) and the well-known Dyson identity

$$e^{t\mathcal{L}} = e^{tQ\mathcal{L}} + \int_0^t e^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,ds,$$

we obtain the operator equation

$$\frac{d}{dt}e^{t\mathcal{L}} = e^{t\mathcal{L}}P\mathcal{L} + e^{tQ\mathcal{L}}Q\mathcal{L} + \int_0^t e^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}\,ds.$$
(4)

By applying this equation to an observable function u0, we obtain the well-known MZ equation in phase space

$$\frac{\partial}{\partial t}e^{t\mathcal{L}}u_0 = e^{t\mathcal{L}}P\mathcal{L}u_0 + e^{tQ\mathcal{L}}Q\mathcal{L}u_0 + \int_0^t e^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds.$$
(5)

Acting on the left with P, we obtain the evolution equation for projected dynamics

$$\frac{\partial}{\partial t}Pe^{t\mathcal{L}}u_0 = Pe^{t\mathcal{L}}P\mathcal{L}u_0 + \int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds.$$
(6)

Note that the projection of the second term in (5) vanishes, i.e., $Pe^{tQ\mathcal{L}}Q\mathcal{L}u_0 = 0$, since $PQ=0$.

In this section, we summarize the projection operators $P$ commonly used in the Mori-Zwanzig framework. To make the definitions mathematically sound, we begin by assuming that the Liouville operator (3) acts on observable functions in a C*-algebra $A$, for instance, $L^\infty(M,\Sigma,\mu)$, where $M$ is a space such as $\mathbb{R}^N$, $\Sigma$ is a $\sigma$-algebra on $M$, and $\mu$ is a measure on $\Sigma$. Let $\sigma\in A^*$ be a positive linear functional on $A$. We define the weighted pre-inner product

$$\langle f,g\rangle_\sigma = \sigma(f^* g).$$

This can be used to define a Hilbert space H=L2(M,σ), which is the completion of the quotient space

$$H = \{f\in A : \sigma(f^* f) < \infty\}\,/\,\{f\in A : \sigma(f^* f) = 0\}$$

endowed with the inner product ⟨·,·⟩σ. The L2 norm induced by the inner product is denoted as ∥·∥σ. In the rest of the paper, the positive linear functional σ is always induced by a probability distribution σ̃ through

$$\sigma(u) = \int_M \tilde\sigma(\omega)\,u(\omega)\,d\omega,$$

where $\tilde\sigma$ is typically chosen to be the probability density of the initial condition $\rho_0$ or the equilibrium distribution $\rho_{eq}$ in statistical physics. To conform to the literature, we also use the notation $\langle\cdot,\cdot\rangle_{\rho_0}$, $\langle\cdot,\cdot\rangle_{\rho_{eq}}$, and $\langle\cdot,\cdot\rangle_{eq}$ to represent the weighted inner products corresponding to different probability measures $\tilde\sigma(\omega)\,d\omega$. With the Hilbert space determined, we now focus on the following two broad classes of orthogonal projections on $H$.

1. Infinite-rank projections

The first class of projection operators to consider in this setting are the conditional expectations $P$ such that $P^*\sigma=\sigma$. In this case, the properties of conditional expectations [in particular, that $P(P(f)\,g\,P(h)) = P(f)\,P(g)\,P(h)$39] and the fact that $P^*\sigma=\sigma$ imply that

$$\langle Pf,g\rangle_\sigma = \sigma[(Pf)^* g] = P^*(\sigma)[(Pf)^* g] = \sigma[P((Pf)^* g)] = \sigma[(Pf)^*(Pg)],$$
$$\langle f,Pg\rangle_\sigma = \sigma[f^* Pg] = P^*(\sigma)[f^* Pg] = \sigma[P(f^* Pg)] = \sigma[(Pf^*)(Pg)] = \sigma[(Pf)^*(Pg)],$$

so that

$$\langle Pf,g\rangle_\sigma = \langle f,Pg\rangle_\sigma$$

for all f,gH. It follows that

$$\langle Qf,g\rangle_\sigma = \langle f,g\rangle_\sigma - \langle Pf,g\rangle_\sigma = \langle f,g\rangle_\sigma - \langle f,Pg\rangle_\sigma = \langle f,Qg\rangle_\sigma.$$

Therefore, both $P$ and $Q$ are self-adjoint (i.e., orthogonal) projections onto closed subspaces of $H$ and hence contractions, $\|P\|_\sigma\le 1$ and $\|Q\|_\sigma\le 1$. Chorin’s projection12,10 belongs to this class and is defined as

$$Pg(\hat x_0) = \frac{\displaystyle\int_{-\infty}^{+\infty} g\big(\hat x(t;\hat x_0,\tilde x_0),\tilde x(t;\hat x_0,\tilde x_0)\big)\,\rho_0(\hat x_0,\tilde x_0)\,d\tilde x_0}{\displaystyle\int_{-\infty}^{+\infty} \rho_0(\hat x_0,\tilde x_0)\,d\tilde x_0} = \mathbb{E}_{\rho_0}[g\,|\,\hat x_0].$$
(7)

Here $x(t;x_0)=(\hat x(t;\hat x_0,\tilde x_0),\tilde x(t;\hat x_0,\tilde x_0))$ is the flow map generated by (1), split into resolved ($\hat x$) and unresolved ($\tilde x$) variables, and $g(x)=g(\hat x,\tilde x)$ is the quantity of interest. For Chorin’s projection, the positive functional $\sigma$ defining the Hilbert space $H$ may be taken to be integration with respect to the probability measure $\rho_0(\hat x_0,\tilde x_0)$. Clearly, if $x_0$ is deterministic, then $\rho_0(\hat x_0,\tilde x_0)$ is a product of Dirac delta functions. On the other hand, if $\hat x(0)$ and $\tilde x(0)$ are statistically independent, i.e., $\rho_0(\hat x_0,\tilde x_0)=\hat\rho_0(\hat x_0)\tilde\rho_0(\tilde x_0)$, then the conditional expectation $P$ simplifies to

$$Pu(\hat x_0) = \int_{-\infty}^{+\infty} u\big(\hat x(t;\hat x_0,\tilde x_0),\tilde x(t;\hat x_0,\tilde x_0)\big)\,\tilde\rho_0(\tilde x_0)\,d\tilde x_0.$$
(8)

In the special case where $u(\hat x,\tilde x)=\hat x$, i.e., the observable is the resolved variable itself, we have

$$P\hat x(\hat x_0) = \int_{-\infty}^{+\infty} \hat x(t;\hat x_0,\tilde x_0)\,\tilde\rho_0(\tilde x_0)\,d\tilde x_0,$$
(9)

i.e., the conditional expectation of the resolved variables x^(t) given the initial condition x^0. This means that an integration of (9) with respect to ρ^0(x^0) yields the mean of the resolved variables, i.e.,

$$\mathbb{E}_{\rho_0}[\hat x(t)] = \int P\hat x(\hat x_0)\,\hat\rho_0(\hat x_0)\,d\hat x_0 = \int \hat x(t;x_0)\,\rho_0(x_0)\,dx_0.$$
(10)

Obviously, if the resolved variables $\hat x(t)$ evolve from a deterministic initial state $\hat x_0$, then the conditional expectation (9) represents the average of the reduced-order flow map $\hat x(t;\hat x_0,\tilde x_0)$ with respect to the PDF of $\tilde x_0$, i.e., the flow map

$$Pe^{t\mathcal{L}}\hat x(0) = X_0(t;\hat x_0) = \int_{-\infty}^{+\infty} \hat x(t;\hat x_0,\tilde x_0)\,\tilde\rho_0(\tilde x_0)\,d\tilde x_0.$$
(11)

In this case, the MZ equation (6) is an exact (unclosed) partial differential equation (PDE) for the multivariate field X0(t,x^0). In order to close such an equation, a mean field approximation of the type Pf(x^)=f(Px^) was introduced by Chorin et al. in Refs. 10, 12, and 11, together with the assumption that the probability distribution of x0 is invariant under the flow generated by (1).
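A minimal discrete sketch (all grids and densities below are hypothetical) of a Chorin-type conditional expectation with statistically independent resolved and unresolved variables: averaging over the unresolved coordinate is idempotent and self-adjoint in the $\rho_0$-weighted inner product, which is what makes (8) an orthogonal projection in $H$:

```python
import numpy as np

rng = np.random.default_rng(0)
na, nb = 5, 7
rho_a = rng.random(na); rho_a /= rho_a.sum()   # density of resolved variables
rho_b = rng.random(nb); rho_b /= rho_b.sum()   # density of unresolved variables

def P(f):
    """Conditional expectation over the unresolved coordinate, cf. Eq. (8)."""
    m = f @ rho_b                     # E[f | resolved coordinate]
    return np.outer(m, np.ones(nb))  # constant in the unresolved direction

def inner(f, g):
    """Inner product weighted by the product measure rho_a x rho_b."""
    return float(np.sum(np.outer(rho_a, rho_b) * f * g))

f = rng.standard_normal((na, nb))
g = rng.standard_normal((na, nb))

idem = float(np.max(np.abs(P(P(f)) - P(f))))   # P^2 = P
sym = abs(inner(P(f), g) - inner(f, P(g)))     # <Pf, g> = <f, Pg>
print(idem, sym)
```

Both residuals vanish to machine precision, mirroring the operator identities derived above.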

2. Finite-rank projections

Another class of projections is defined by choosing a closed (typically finite-dimensional) linear subspace $V\subset H=L^2(M,\sigma)$ and letting $P$ be the orthogonal projection onto $V$ in the $\sigma$ inner product. An example of such a projection is Mori’s projection,46 widely used in statistical physics. For finite-dimensional $V$, given a linearly independent set $\{u_1,\ldots,u_M\}\subset V$ that spans $V$, the projection $P$ can be defined by first constructing the positive definite Gram matrix $G_{ij}=\langle u_i,u_j\rangle_\sigma$. Then

$$Pf = \sum_{i,j=1}^{M} (G^{-1})_{ij}\,\langle u_i,f\rangle_\sigma\, u_j.$$
(12)

This projection is orthogonal with respect to the $L^2_\sigma$ inner product. In statistical physics, a common choice for the positive functional $\sigma$ that generates $H$ is integration with respect to the Gibbs canonical distribution $\rho_{eq}=e^{-\beta\mathcal{H}}/Z$, where $\mathcal{H}=\mathcal{H}(p,q)$ is the Hamiltonian and $Z$ is the associated partition function. Here $q$ are generalized coordinates, while $p$ are kinetic momenta.
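The construction (12) is easy to realize in a discrete setting. The sketch below (the weights and basis vectors are made up) builds a Mori-type projection from the Gram matrix of a weighted inner product and verifies that it is idempotent, self-adjoint with respect to that inner product, and reproduces elements of $V$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 8, 3
w = rng.random(n) + 0.1     # discrete weights playing the role of sigma

def inner(u, v):
    return float(np.sum(w * u * v))

U = rng.standard_normal((M, n))   # rows: basis u_1, ..., u_M spanning V
G = np.array([[inner(U[i], U[j]) for j in range(M)] for i in range(M)])
Ginv = np.linalg.inv(G)

def P(f):
    """Mori-type projection of Eq. (12): Pf = sum_ij (G^-1)_ij <u_i, f> u_j."""
    coeff = np.array([inner(U[i], f) for i in range(M)])
    return (Ginv @ coeff) @ U

f = rng.standard_normal(n)
g = rng.standard_normal(n)
idem = float(np.max(np.abs(P(P(f)) - P(f))))
sym = abs(inner(P(f), g) - inner(f, P(g)))
repro = float(np.max(np.abs(P(U[0]) - U[0])))   # P acts as identity on V
print(idem, sym, repro)
```

The Gram-matrix inverse is what compensates for a non-orthonormal basis; with an orthonormal basis, $G=I$ and (12) reduces to the familiar sum of projections onto each $u_i$.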

In this section, we develop a thorough mathematical analysis of the MZ memory integral

$$\int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds = \int_0^t Pe^{s\mathcal{L}}\,P\,e^{(t-s)\mathcal{L}Q}\,\mathcal{L}Q\mathcal{L}u_0\,ds$$
(13)

and its approximation. We begin by describing the behavior of the semigroup norms $\|e^{t\mathcal{L}}\|$, $\|e^{tQ\mathcal{L}}Q\|$, and $\|e^{t\mathcal{L}Q}\|$ as functions of time, for different choices of projection $P$ and different norms. As we will see, the analysis gives clear computable bounds only in some circumstances, illustrating the difficulty of this problem and the need for further development and insight.

For any Liouville operator $\mathcal{L}$ of the form (3) acting on $A=L^\infty(M,\Sigma,\mu)$ and for any $\sigma$ identified with an element of $L^1(M,\Sigma,\mu)$, the functional $\mathcal{L}^*\sigma$ (assuming that $\sigma$ lies in the domain of $\mathcal{L}^*$) is absolutely continuous with respect to $\sigma$ (essentially because $\mathcal{L}$ acts locally), and the Radon-Nikodym derivative20 may be identified with the negative of the divergence of the vector field $F$ with respect to the measure induced by $\sigma$, i.e.,

$$\frac{d\mathcal{L}^*\sigma}{d\sigma} = -\operatorname{div}_\sigma F, \qquad\text{i.e.,}\qquad (\mathcal{L}^*\sigma)(u) = -\sigma\big(u\operatorname{div}_\sigma F\big),$$
(14)

where

$$\operatorname{div}_\sigma F = \frac{\nabla\cdot\big(\tilde\sigma(x)F(x)\big)}{\tilde\sigma(x)}.$$
(15)

When $M=\mathbb{R}^N$ and $\sigma$ has the form $\sigma(u)=\int_{\mathbb{R}^N}\tilde\sigma(x)u(x)\,dx$, this can be shown more directly using integration by parts. By assuming that $\tilde\sigma(x)$ or $F_i$ decays to $0$ at infinity, we have

$$\mathcal{L}^*\sigma(u) = \sigma(\mathcal{L}u) = \int_{\mathbb{R}^N}\tilde\sigma\sum_{i=1}^N F_i\frac{\partial u}{\partial x_i}\,dx = -\int_{\mathbb{R}^N} u\sum_{i=1}^N\frac{\partial(\tilde\sigma F_i)}{\partial x_i}\,dx = -\int_{\mathbb{R}^N} u\,\frac{1}{\tilde\sigma}\sum_{i=1}^N\frac{\partial(\tilde\sigma F_i)}{\partial x_i}\,\tilde\sigma\,dx$$
(16a)
$$= -\sigma\left(u\,\frac{1}{\tilde\sigma}\sum_{i=1}^N\frac{\partial(\tilde\sigma F_i)}{\partial x_i}\right) = -\sigma\big(u\operatorname{div}_\sigma F\big),$$
(16b)

from which we see

$$\operatorname{div}_\sigma F = \frac{\nabla\cdot(\tilde\sigma F)}{\tilde\sigma} = \frac{1}{\tilde\sigma}\sum_{i=1}^N\frac{\partial(\tilde\sigma F_i)}{\partial x_i} = \nabla\cdot F + F\cdot\nabla\ln\tilde\sigma.$$

Therefore,

$$\langle v, u\operatorname{div}_\sigma F\rangle_\sigma = \sigma\big(v^* u\operatorname{div}_\sigma F\big) = -(\mathcal{L}^*\sigma)(v^* u) = -\sigma\big(\mathcal{L}(v^* u)\big) = -\sigma\big(\mathcal{L}(v)^* u + v^*\mathcal{L}(u)\big) = -\langle\mathcal{L}v,u\rangle_\sigma - \langle v,\mathcal{L}u\rangle_\sigma,$$

and the following are equivalent: (i) $\mathcal{L}^*\sigma=0$ (i.e., $\sigma$ is invariant); (ii) $\operatorname{div}_\sigma F=0$; (iii) $\mathcal{L}$ is skew-adjoint with respect to the $\sigma$ inner product. More generally, on $H=L^2(M,\sigma)$, we find that

$$\mathcal{L} + \mathcal{L}^\dagger = -\operatorname{div}_\sigma F$$

so that the numerical abscissa ω37,38 (i.e., logarithmic norm14,32) of L is given by

$$\omega = \sup_{0\neq u\in H}\frac{\operatorname{Re}\langle u,\mathcal{L}u\rangle_\sigma}{\langle u,u\rangle_\sigma} = \sup_{0\neq u\in H}\frac{\langle u,(\mathcal{L}+\mathcal{L}^\dagger)u\rangle_\sigma}{2\langle u,u\rangle_\sigma} = \sup_{0\neq u\in H}\frac{-\langle u,u\operatorname{div}_\sigma F\rangle_\sigma}{2\langle u,u\rangle_\sigma} = -\frac{1}{2}\inf_x \operatorname{div}_\sigma F(x).$$
(17)

Using this numerical abscissa ω, we obtain the following Lσ2 estimation of the Koopman semigroup:

$$\|e^{t\mathcal{L}}\|_{L^2_\sigma} \le e^{\omega t},$$
(18)

and moreover, $e^{\omega t}$ is the smallest exponential function that bounds $\|e^{t\mathcal{L}}\|_\sigma$.14 When $P$ and $Q=I-P$ are orthogonal projections on $L^2(M,\sigma)$, we can observe that the numerical abscissa of $Q\mathcal{L}Q$ is bounded by that of $\mathcal{L}$. In fact,

$$\sup_{0\neq u\in H}\frac{\operatorname{Re}\langle u,Q\mathcal{L}Qu\rangle_\sigma}{\langle u,u\rangle_\sigma} = \sup_{0\neq u\in H}\frac{\operatorname{Re}\langle Qu,\mathcal{L}Qu\rangle_\sigma}{\langle u,u\rangle_\sigma} \le \sup_{0\neq u\in\operatorname{Im}Q}\frac{\operatorname{Re}\langle u,\mathcal{L}u\rangle_\sigma}{\langle u,u\rangle_\sigma} \le \sup_{0\neq u\in H}\frac{\operatorname{Re}\langle u,\mathcal{L}u\rangle_\sigma}{\langle u,u\rangle_\sigma} = \omega$$

[see Eq. (17)] so that

$$\|e^{tQ\mathcal{L}Q}\|_{L^2_\sigma} \le e^{\omega t}.$$
(19)

It should be noticed that this bound for the orthogonal semigroup is not necessarily tight. The tightness of the bound depends on the choice of projection P and comes down to whether functions in the image of Q can be chosen localized to regions where divσF is close to its infimal value.
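For a finite-dimensional linear surrogate of $\mathcal{L}$ (a matrix $A$; everything below, including the random generator and subspace, is an illustrative assumption), the numerical abscissa is the largest eigenvalue of the symmetric part, and bounds of the type (18) and (19) can be checked directly, including the version with an orthogonal projection $Q$:

```python
import numpy as np

def expm_taylor(M, terms=80):
    """Matrix exponential via truncated Taylor series (adequate for small ||M||)."""
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

rng = np.random.default_rng(3)
n = 6
A = rng.standard_normal((n, n))                    # generator (stand-in for L)
omega = np.linalg.eigvalsh(0.5 * (A + A.T)).max()  # numerical abscissa of A

# Orthogonal projection Q = I - V V^T onto the complement of a random subspace
V = np.linalg.qr(rng.standard_normal((n, 2)))[0]   # orthonormal basis columns
Q = np.eye(n) - V @ V.T

ok_L, ok_QLQ = True, True
for t in np.linspace(0.05, 1.5, 20):
    ok_L = ok_L and np.linalg.norm(expm_taylor(t * A), 2) <= np.exp(omega * t) + 1e-8
    ok_QLQ = ok_QLQ and np.linalg.norm(expm_taylor(t * Q @ A @ Q), 2) <= np.exp(omega * t) + 1e-8
print(ok_L, ok_QLQ)
```

In matrix terms the bound is the classical logarithmic-norm estimate; here $\omega>0$ for the sampled $A$, so the projected bound $\|e^{tQAQ}\|\le e^{\omega t}$ holds on the whole space as well.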

When estimating the MZ memory integral, we need to deal with the semigroup $e^{t\mathcal{L}Q}$. It turns out to be extremely difficult to prove strong continuity of such a semigroup in general, due to the unboundedness of $\mathcal{L}Q$. It is shown in Appendix A that when $P\mathcal{L}Q$ is an unbounded operator, as is typical when $P$ is a conditional expectation, the semigroup $e^{t\mathcal{L}Q}$ can only be bounded as

$$\|e^{t\mathcal{L}Q}\|_\sigma \le M_Q e^{t\omega_Q}$$
(20)

for some $M_Q>1$, due to the fact that $\|e^{t\mathcal{L}Q}\|_\sigma$ has an infinite slope at $t=0$. More work is needed to obtain satisfactory, computable values for $M_Q$ and $\omega_Q$ in the case where $P$ and $Q$ are infinite-rank projections. It is also shown that when either $P\mathcal{L}Q$ or $\mathcal{L}P$ is bounded, for example, when $P$ is a finite-rank projection, we can obtain computable semigroup bounds of the form

$$\|e^{t\mathcal{L}Q}\|_\sigma \le e^{\omega_Q t} \le e^{\frac{1}{2}\left(\sqrt{\omega^2+\|P\mathcal{L}Q\|_\sigma^2}+\omega\right)t},$$
(21a)
$$\|e^{t\mathcal{L}Q}\|_\sigma \le e^{\omega_Q t} \le e^{(\omega+\|\mathcal{L}P\|_\sigma)t},$$
(21b)

where $\omega=-\tfrac{1}{2}\inf\operatorname{div}_\sigma F$ if we use the $L^2_\sigma$ estimate (18) of $e^{t\mathcal{L}}$.

We begin by seeking to bound the MZ memory integral (13) as a whole and build our analysis from there. A key assumption of our analysis is that the semigroup $e^{t\mathcal{L}Q}$ is strongly continuous, i.e., the map $t\mapsto e^{t\mathcal{L}Q}g$ is continuous in the norm topology on the space of observables for each fixed $g$.16 As we pointed out in Sec. III A, it is very difficult to rigorously prove strong continuity of $e^{t\mathcal{L}Q}$ and to obtain a computable upper bound for unbounded generators of the form $\mathcal{L}Q$. We leave this problem for future research and assume for now that there exist constants $M_Q$ and $\omega_Q$ such that $\|e^{t\mathcal{L}Q}\|\le M_Q e^{t\omega_Q}$. Throughout this section, $\|\cdot\|$ denotes a general Banach norm. We begin with the following simple estimate:

Theorem 3.1
(Memory growth). Let $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ be strongly continuous semigroups with upper bounds $\|e^{t\mathcal{L}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{L}Q}\|\le M_Q e^{t\omega_Q}$. Then
$$\left\|\int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds\right\| \le M_0(t),$$
(22)
where
$$M_0(t) = \begin{cases} C_1\, t\, e^{t\omega_Q}, & \omega=\omega_Q,\\[4pt] \dfrac{C_1}{\omega-\omega_Q}\left[e^{t\omega}-e^{t\omega_Q}\right], & \omega\neq\omega_Q, \end{cases}$$
(23)
and $C_1=MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|$ is a constant. Clearly, $\lim_{t\to 0}M_0(t)=0$.

Proof.
We first rewrite the memory integral in the equivalent form
$$\int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds = \int_0^t Pe^{s\mathcal{L}}\,P\,e^{(t-s)\mathcal{L}Q}\,\mathcal{L}Q\mathcal{L}u_0\,ds.$$
Since $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ are assumed to be strongly continuous semigroups, we have the upper bounds $\|e^{t\mathcal{L}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{L}Q}\|\le M_Q e^{t\omega_Q}$. Therefore,
$$\left\|\int_0^t Pe^{s\mathcal{L}}\,P\,e^{(t-s)\mathcal{L}Q}\,\mathcal{L}Q\mathcal{L}u_0\,ds\right\| \le \int_0^t \|P\|\,\|e^{s\mathcal{L}}\|\,\|P\|\,\|e^{(t-s)\mathcal{L}Q}\|\,\|\mathcal{L}Q\mathcal{L}u_0\|\,ds \le C_1\, e^{t\omega_Q}\int_0^t e^{s(\omega-\omega_Q)}\,ds = \begin{cases} C_1\, t\, e^{t\omega_Q}, & \omega=\omega_Q,\\[4pt] \dfrac{C_1}{\omega-\omega_Q}\left[e^{t\omega}-e^{t\omega_Q}\right], & \omega\neq\omega_Q, \end{cases}$$
where $C_1=MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|$.

Theorem 3.1 provides an upper bound on the growth of the memory integral based on the assumption that $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ are strongly continuous semigroups. We emphasize that such upper bounds can be computed analytically only in simple cases (we will compute one of them in Sec. IV) because of the fundamental difficulty of computing an upper bound for $\|e^{t\mathcal{L}Q}\|$. However, it will be shown later that, although the specific expression for $M_0(t)$ is unknown, its form is already useful, as it enables us to derive verifiable theoretical predictions for general nonlinear systems.
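The bound (23) is straightforward to evaluate once the semigroup constants are known. The following sketch (the constants $C_1$, $\omega$, and $\omega_Q$ are made up) implements $M_0(t)$ and checks two properties noted above: the two branches agree in the limit $\omega\to\omega_Q$, and $M_0(t)\to 0$ as $t\to 0$:

```python
import numpy as np

def M0(t, C1, w, wQ):
    """Memory-growth bound of Theorem 3.1, Eq. (23)."""
    if w == wQ:   # exact-equality branch; the limit check below covers w -> wQ
        return C1 * t * np.exp(t * wQ)
    return C1 / (w - wQ) * (np.exp(t * w) - np.exp(t * wQ))

C1, t = 1.3, 0.7
val = M0(t, C1, 0.5, 0.5)            # omega = omega_Q branch
lim = M0(t, C1, 0.5 + 1e-7, 0.5)     # nearby omega != omega_Q branch
gap = abs(lim - val)
small = M0(1e-12, C1, 0.8, 0.3)      # M0 vanishes as t -> 0
print(gap, small)
```

The continuity of the two branches reflects that $(e^{t\omega}-e^{t\omega_Q})/(\omega-\omega_Q)\to t\,e^{t\omega_Q}$ as $\omega\to\omega_Q$.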

Theorem 3.1 can be employed to obtain upper bounds for well-known approximations of the memory integral. Let us begin with the t-model proposed in Ref. 12. This model relies on the approximation

$$\int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds \simeq t\,e^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0 \qquad (t\text{-model}).$$
(24)

Theorem 3.2
(Memory approximation via the $t$-model12). Let $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ be strongly continuous semigroups with upper bounds $\|e^{t\mathcal{L}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{L}Q}\|\le M_Q e^{t\omega_Q}$. Then
$$\left\|\int_0^t Pe^{s\mathcal{L}}\,P\,e^{(t-s)\mathcal{L}Q}\,\mathcal{L}Q\mathcal{L}u_0\,ds - t\,Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0\right\| \le M_1(t),$$
where
$$M_1(t) = \begin{cases} C_1\left[\dfrac{e^{t\omega_Q}-e^{t\omega}}{\omega_Q-\omega}+\dfrac{t\,e^{t\omega}}{M_Q}\right], & \omega\neq\omega_Q,\\[8pt] C_1\,\dfrac{M_Q+1}{M_Q}\,t\,e^{t\omega}, & \omega=\omega_Q, \end{cases}$$
and $C_1=MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|$.

Proof.
By applying the triangle inequality, we obtain that
$$\left\|\int_0^t Pe^{s\mathcal{L}}\,P\,e^{(t-s)\mathcal{L}Q}\,\mathcal{L}Q\mathcal{L}u_0\,ds - t\,Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0\right\| \le \left\|\int_0^t Pe^{s\mathcal{L}}\,P\,e^{(t-s)\mathcal{L}Q}\,\mathcal{L}Q\mathcal{L}u_0\,ds\right\| + t\left\|Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0\right\| \le \|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|\left[MM_Q\int_0^t e^{s\omega}e^{(t-s)\omega_Q}\,ds + t\,Me^{t\omega}\right] = C_1\left[e^{t\omega_Q}\int_0^t e^{s(\omega-\omega_Q)}\,ds + \frac{t\,e^{t\omega}}{M_Q}\right] = \begin{cases} C_1\left[\dfrac{e^{t\omega_Q}-e^{t\omega}}{\omega_Q-\omega}+\dfrac{t\,e^{t\omega}}{M_Q}\right], & \omega\neq\omega_Q,\\[8pt] C_1\,\dfrac{M_Q+1}{M_Q}\,t\,e^{t\omega}, & \omega=\omega_Q, \end{cases}$$
where $C_1=MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|$.
Theorem 3.2 provides an upper bound for the error associated with the t-model. The limit
$$\lim_{t\to 0}M_1(t)=0$$
(25)
guarantees the convergence of the $t$-model for short integration times. On the other hand, depending on the semigroup constants $M$, $\omega$, $M_Q$, and $\omega_Q$ (which may be estimated numerically), the error of the $t$-model may remain small for longer integration times (see the numerical results in Sec. IV B 2). Next, we study the short-memory approximation proposed in Ref. 33. The main idea is to replace the integration interval $[0,t]$ in (13) by a shorter time interval $[t-\Delta t,t]$, i.e.,
$$\int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds \simeq \int_{t-\Delta t}^{t} Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds \qquad (\text{short-memory approximation}),$$
where Δt ∈ [0, t] identifies the effective memory length. The following result provides an upper bound to the error associated with the short-memory approximation.

Theorem 3.3
(Short-memory approximation33). Let $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ be strongly continuous semigroups with upper bounds $\|e^{t\mathcal{L}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{L}Q}\|\le M_Q e^{t\omega_Q}$. Then the following error estimate holds true:
$$\left\|\int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds - \int_{t-\Delta t}^{t} Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds\right\| \le M_2(\Delta t,t),$$
where
$$M_2(\Delta t,t) = \begin{cases} C_1\,(t-\Delta t)\,e^{t\omega_Q}, & \omega=\omega_Q,\\[4pt] C_1\,e^{\Delta t\,\omega_Q}\,\dfrac{e^{(t-\Delta t)\omega}-e^{(t-\Delta t)\omega_Q}}{\omega-\omega_Q}, & \omega\neq\omega_Q, \end{cases}$$
and $C_1=MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|$.

We omit the proof due to its similarity to that of Theorem 3.1. Note that $\lim_{\Delta t\to t}M_2(\Delta t,t)=0$ for all finite $t>0$.
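The bound $M_2$ is cheap to evaluate once the semigroup constants are available. A small sketch (all constants are made up) confirms the two qualitative features just noted: the bound vanishes as $\Delta t\to t$ (full memory restored) and shrinks monotonically as the retained memory band grows:

```python
import numpy as np

def M2(dt, t, C1, w, wQ):
    """Short-memory error bound of Theorem 3.3 (reconstructed formula)."""
    if w == wQ:
        return C1 * (t - dt) * np.exp(t * wQ)
    return C1 * np.exp(dt * wQ) * (np.exp((t - dt) * w) - np.exp((t - dt) * wQ)) / (w - wQ)

C1, t, w, wQ = 2.0, 1.0, 0.9, 0.2    # hypothetical constants
vals = [M2(dt, t, C1, w, wQ) for dt in np.linspace(0.0, t, 11)]
print(vals[-1])   # zero error when the full memory interval is kept
```

At $\Delta t=0$ the bound reduces to $M_0(t)$ of Theorem 3.1, consistent with discarding the memory entirely.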

An alternative way to approximate the memory integral (13) was proposed by Stinis in Ref. 35. The key idea is to repeatedly differentiate (13) with respect to time and establish a hierarchy of PDEs which can eventually be truncated or approximated at some level to provide an approximation of the memory. In this section, we derive this hierarchy of memory equations and perform a thorough theoretical analysis to establish the accuracy and convergence of the method. To this end, let us first define

$$w_0(t) = \int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,Q\mathcal{L}u_0\,ds$$
(26)

to be the memory integral (13). By differentiating w0(t) with respect to time, we obtain

$$\frac{dw_0(t)}{dt} = Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0 + w_1(t),$$

where

$$w_1(t) = \int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,(Q\mathcal{L})^2 u_0\,ds.$$

By iterating this procedure n times, we obtain

$$\frac{dw_{n-1}(t)}{dt} = Pe^{t\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^{n}u_0 + w_n(t),$$
(27)

where

$$w_n(t) = \int_0^t Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,(Q\mathcal{L})^{n+1}u_0\,ds.$$
(28)

The hierarchy of Eqs. (27) and (28) is equivalent to the following infinite-dimensional system of PDEs:

$$\begin{aligned} \frac{dw_0(t)}{dt} &= Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0 + w_1(t),\\ \frac{dw_1(t)}{dt} &= Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}Q\mathcal{L}u_0 + w_2(t),\\ &\;\;\vdots\\ \frac{dw_{n-1}(t)}{dt} &= Pe^{t\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^{n}u_0 + w_n(t),\\ &\;\;\vdots \end{aligned}$$
(29)

evolving from the initial condition $w_i(0)=0$, $i=1,2,\ldots$ [see Eq. (28)]. With such an initial condition available, we can solve (29) by backward substitution, i.e., from the last equation to the first, to obtain the following (exact) Dyson series representation of the memory integral (26):

$$w_0(t) = \int_0^t Pe^{s\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0\,ds + \int_0^t\!\!\int_0^{\tau_1} Pe^{s\mathcal{L}}P\mathcal{L}Q\mathcal{L}Q\mathcal{L}u_0\,ds\,d\tau_1 + \cdots + \int_0^t\!\!\int_0^{\tau_{n-1}}\!\!\cdots\!\int_0^{\tau_1} Pe^{s\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^{n}u_0\,ds\,d\tau_1\cdots d\tau_{n-1} + \cdots.$$
(30)

So far no approximation has been introduced, i.e., the infinite-dimensional system (29) and the corresponding formal solution (30) are exact. To make progress in developing a computational scheme to estimate the memory integral (26), it is necessary to introduce approximations. The simplest of these relies on truncating the hierarchy (29) after $n$ equations, while simultaneously introducing an approximation of the $n$th-order memory integral $w_n(t)$. We denote such an approximation as $w_n^{e_n}(t)$. The truncated system takes the form

$$\begin{aligned} \frac{dw_0^n(t)}{dt} &= Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0 + w_1^n(t),\\ \frac{dw_1^n(t)}{dt} &= Pe^{t\mathcal{L}}P\mathcal{L}Q\mathcal{L}Q\mathcal{L}u_0 + w_2^n(t),\\ &\;\;\vdots\\ \frac{dw_{n-1}^n(t)}{dt} &= Pe^{t\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^{n}u_0 + w_n^{e_n}(t). \end{aligned}$$
(31)

The notation $w_j^n(t)$ ($j=0,\ldots,n-1$) emphasizes that the solution to (31) is, in general, different from the solution to (29). The initial condition of the system can be set as $w_i^n(0)=0$ for all $i=0,\ldots,n-1$. By using backward substitution, this yields the following formal solution:

$$w_0^n(t) = \int_0^t Pe^{s\mathcal{L}}P\mathcal{L}Q\mathcal{L}u_0\,ds + \int_0^t\!\!\int_0^{\tau_1} Pe^{s\mathcal{L}}P\mathcal{L}Q\mathcal{L}Q\mathcal{L}u_0\,ds\,d\tau_1 + \cdots + \int_0^t\!\!\int_0^{\tau_{n-1}}\!\!\cdots\!\int_0^{\tau_1} Pe^{s\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^{n}u_0\,ds\,d\tau_1\cdots d\tau_{n-1} + \int_0^t\!\!\int_0^{\tau_{n-1}}\!\!\cdots\!\int_0^{\tau_1} w_n^{e_n}(s)\,ds\,d\tau_1\cdots d\tau_{n-1},$$
(32)

representing an approximation of the memory integral (26). Note that, for a given system, such an approximation depends only on the number of equations $n$ in (31) and on the choice of the approximation $w_n^{e_n}(t)$. In the present paper, we consider the following choices:

  1. Approximation by truncation (H-model),
$$w_n^{e_n}(t) = 0.$$
(33)
  2. Type-I finite memory approximation (FMA),
$$w_n^{e_n}(t) = \int_{\max(0,\,t-\Delta t_n)}^{t} Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,(Q\mathcal{L})^{n+1}u_0\,ds.$$
(34)
  3. Type-II finite memory approximation,
$$w_n^{e_n}(t) = \int_{\min(t,\,t_n)}^{t} Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,(Q\mathcal{L})^{n+1}u_0\,ds.$$
(35)
  4. Ht-model,
$$w_n^{e_n}(t) = t\,Pe^{t\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^{n+1}u_0.$$
(36)

The quantities $\Delta t_n$ and $t_n$ appearing in (34) and (35) will be defined in Theorems 3.5 and 3.6, respectively. The first approximation, i.e., (33), is a truncation of the hierarchy obtained by assuming that $w_n(t)=0$. Such an approximation was originally proposed by Stinis in Ref. 35, and we shall call it the H-model. The Type-I finite memory approximation (FMA) is obtained by applying the short-memory approximation to the $n$th-order memory integral $w_n(t)$. The Type-II finite memory approximation is a modified version of Type-I with a larger memory band. The Ht-model approximation is based on replacing the $n$th-order memory integral $w_n(t)$ with a classical $t$-model. Note that in this setting, the classical $t$-model approximation proposed by Chorin and Stinis12 is equivalent to a zeroth-order Ht-model approximation.
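To see why the hierarchy behaves well when the ratios $\alpha_j$ introduced below stay bounded, one can replace the operator-valued kernels $Pe^{s\mathcal{L}}P\mathcal{L}(Q\mathcal{L})^n u_0$ by scalar surrogates (a loud assumption: $k_n(s)=\alpha^n e^{\omega s}$ below is purely illustrative, with made-up constants). Cauchy's formula collapses the $n$th Dyson term of (30) to a single integral, and the partial sums of the series settle rapidly:

```python
import numpy as np
from math import factorial

alpha, omega, t = 1.5, 0.4, 1.0   # hypothetical surrogate constants

def dyson_term(n, num=4000):
    """n-th term of the Dyson series (30) with scalar surrogate kernels,
    written as a single integral via Cauchy's repeated-integration formula."""
    s = np.linspace(0.0, t, num)
    integrand = (t - s) ** (n - 1) / factorial(n - 1) * alpha ** n * np.exp(omega * s)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(s)))

partial = np.cumsum([dyson_term(n) for n in range(1, 15)])
tail = abs(partial[-1] - partial[-2])
print(tail)   # partial sums settle; the last increment is tiny
```

The $1/(n-1)!$ factor is what ultimately defeats the geometric growth $\alpha^n$, the same mechanism exploited by the convergence results below.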

Hereafter, we present a thorough mathematical analysis that aims at estimating the error $\|w_0(t)-w_0^n(t)\|$, where $w_0(t)$ is the full memory at time $t$ [see (26) or (30)], while $w_0^n(t)$ is the solution of the truncated hierarchy (31), with $w_n^{e_n}(t)$ given by (33), (34), (35), or (36). With such error estimates available, we can infer whether the approximation of the full memory $w_0(t)$ with $w_0^n(t)$ is accurate and, more importantly, whether the algorithm to approximate the memory integral converges. To the best of our knowledge, this is the first time a rigorous convergence analysis is performed on various approximations of the MZ memory integral. It turns out that the distance $\|w_0(t)-w_0^n(t)\|$ can be controlled through the construction of the hierarchy under some constraints on the initial condition.

1. The H-model

Setting $w_n^{e_n}(t)=0$ in (31) yields an approximation by truncation, which we refer to as the H-model (hierarchical model). Such a model was originally proposed by Stinis in Ref. 35. Hereafter, we provide error estimates and convergence results for this model. In particular, we derive an upper bound for the error $\|w_0(t)-w_0^n(t)\|$ and sufficient conditions for the convergence of the reduced-order dynamical system. Such conditions are problem dependent, i.e., they involve the Liouvillian $\mathcal{L}$, the initial condition $u_0$, and the projection $P$.

Theorem 3.4
(Accuracy of the H-model). Let $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ be strongly continuous semigroups with upper bounds $\|e^{t\mathcal{L}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{L}Q}\|\le M_Q e^{t\omega_Q}$, and let $T>0$ be a fixed integration time. For some fixed $n$, let
$$\alpha_j = \frac{\|(\mathcal{L}Q)^{j+1}\mathcal{L}u_0\|}{\|(\mathcal{L}Q)^{j}\mathcal{L}u_0\|}, \qquad 1\le j\le n.$$
(37)
Then, for any 1 ≤ p ≤ n and all t ∈ [0, T], we have
$$\|w_0(t)-w_0^p(t)\| \le M_3^p(t) \le M_3^p(T),$$
where
$$M_3^p(t) = C_1 A_1 A_2\,\frac{t^{p+1}}{(p+1)!}\prod_{j=1}^{p}\alpha_j, \qquad C_1 = MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|,$$
and
$$A_1 = \max_{s\in[0,T]} e^{s(\omega-\omega_Q)} = \begin{cases}1, & \omega\le\omega_Q,\\ e^{T(\omega-\omega_Q)}, & \omega>\omega_Q,\end{cases} \qquad A_2 = \max_{s\in[0,T]} e^{s\omega_Q} = \begin{cases}1, & \omega_Q\le 0,\\ e^{T\omega_Q}, & \omega_Q>0.\end{cases}$$
(38)

Proof.
We begin with the expression for the difference between the memory term $w_0$ and its approximation $w_0^p$,
$$w_0(t)-w_0^p(t) = \int_0^t\!\!\int_0^{\tau_p}\!\!\cdots\!\int_0^{\tau_2}\!\!\int_0^{\tau_1} Pe^{s\mathcal{L}}\,P\,e^{(\tau_1-s)\mathcal{L}Q}\,(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\,ds\,d\tau_1\cdots d\tau_p.$$
(39)
Since $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ are strongly continuous semigroups, we have $\|e^{t\mathcal{L}}\|\le Me^{\omega t}$ and $\|e^{t\mathcal{L}Q}\|\le M_Q e^{\omega_Q t}$. By using Cauchy’s formula for repeated integration, we bound the norm of the error (39) as
$$\|w_0(t)-w_0^p(t)\| \le \int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^\sigma \left\|Pe^{s\mathcal{L}}\,P\,e^{(\sigma-s)\mathcal{L}Q}\,(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\right\|ds\,d\sigma \le \|P\|^2 MM_Q\|(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\|\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^\sigma e^{s\omega}e^{(\sigma-s)\omega_Q}\,ds\,d\sigma = C_1\prod_{j=1}^{p}\alpha_j\,\underbrace{\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^\sigma e^{s\omega}e^{(\sigma-s)\omega_Q}\,ds\,d\sigma}_{f_p(t,\omega,\omega_Q)} = C_1\prod_{j=1}^{p}\alpha_j\,f_p(t,\omega,\omega_Q),$$
(40)
where $C_1=MM_Q\|P\|^2\|\mathcal{L}Q\mathcal{L}u_0\|$ as before. The function $f_p(t,\omega,\omega_Q)$ may be bounded from above as
$$f_p(t,\omega,\omega_Q) \le A_1A_2\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^\sigma ds\,d\sigma = A_1A_2\,\frac{t^{p+1}}{(p+1)!}.$$
Hence, we have
$$\|w_0(t)-w_0^p(t)\| \le C_1A_1A_2\prod_{j=1}^{p}\alpha_j\,\frac{t^{p+1}}{(p+1)!} = M_3^p(t).$$

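Cauchy's formula for repeated integration, the key step in the proof above, is easy to verify numerically: iterating the antiderivative $p$ times equals a single integral against the kernel $(t-\sigma)^{p-1}/(p-1)!$. The test function, $t$, and grid below are arbitrary illustrative choices:

```python
import numpy as np
from math import factorial

t, num, p = 1.2, 2001, 3
s = np.linspace(0.0, t, num)
f = np.cos(2.0 * s) + 0.5 * s      # arbitrary smooth test function

def cumtrap(y, x):
    """Cumulative trapezoidal antiderivative starting from 0."""
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out

g = f.copy()
for _ in range(p):
    g = cumtrap(g, s)              # iterate the antiderivative p times
nested = g[-1]

w = (t - s) ** (p - 1) / factorial(p - 1) * f
single = float(np.sum(0.5 * (w[1:] + w[:-1]) * np.diff(s)))
print(abs(nested - single))        # agreement up to quadrature error
```

This identity is what converts the $p$-fold iterated integrals of (30) into the single-integral bounds used throughout this section.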
Theorem 3.4 states that for a given dynamical system (represented by $\mathcal{L}$) and quantity of interest (represented by $P$), the error bound $M_3^p(t)$ is strongly related to the ratios $\{\alpha_j\}$, which are ultimately determined by the initial condition $u_0$. It turns out that by bounding $\{\alpha_j\}$, we can control $M_3^p(t)$ and therefore the overall error $\|w_0(t)-w_0^p(t)\|$. The following corollaries discuss sufficient conditions under which the error $\|w_0(T)-w_0^n(T)\|$ decays as we increase the differentiation order $n$ for fixed time $T>0$.

Corollary 3.4.1
(Uniform convergence of theH-model). If {αj} in Theorem 3.4 satisfy
$$\alpha_j < \frac{j+1}{T}, \qquad 1\le j\le n,$$
(41)
for any fixed time T > 0, then there exists a sequence of constants $\delta_1>\delta_2>\cdots>\delta_n$ such that
$$\|w_0(T)-w_0^p(T)\| \le \delta_p, \qquad 1\le p\le n.$$

Proof.
Evaluating (40) at any fixed (finite) time $T>0$ yields
$$\|w_0(T)-w_0^p(T)\| \le C_2\prod_{j=1}^{p}\alpha_j\,f_p(T,\omega,\omega_Q) \le C_2\prod_{j=1}^{p}\alpha_j\,\frac{T^{p+1}}{(p+1)!}, \qquad \|w_0(T)-w_0^{p+1}(T)\| \le C_2\prod_{j=1}^{p+1}\alpha_j\,\frac{T^{p+2}}{(p+2)!},$$
where C2 = C2(T) = C1A1A2. If there exists δp ≥ 0 such that
$$\|w_0(T)-w_0^p(T)\| \le C_2\prod_{j=1}^{p}\alpha_j\,\frac{T^{p+1}}{(p+1)!} \le \delta_p,$$
then there exists a δp+1 such that
$$\|w_0(T)-w_0^{p+1}(T)\| \le C_2\prod_{j=1}^{p}\alpha_j\,\frac{T^{p+1}}{(p+1)!}\,\frac{\alpha_{p+1}T}{p+2} \le \delta_{p+1} < \delta_p$$
since $\alpha_{p+1}<(p+2)/T$. Moreover, the condition $\alpha_j<(j+1)/T$ holds for all $1\le j\le n$. Therefore, we conclude that for any fixed time $T>0$, there exists a sequence of constants $\delta_1>\delta_2>\cdots>\delta_n$ such that $\|w_0(T)-w_0^p(T)\|\le\delta_p$, where $1\le p\le n$.

Corollary 3.4.1 provides a sufficient condition for the error $\|w_0(t)-w_0^p(t)\|$ to decrease monotonically as we increase $p$ in (31). A stronger condition that yields an asymptotically decaying error bound is given by the following corollary.

Corollary 3.4.2
(Asymptotic convergence of the H-model). If the $\alpha_j$ in Theorem 3.4 satisfy
$$\alpha_j < C, \qquad 1\le j<+\infty,$$
(42)
for some positive constant C, then for any fixed time T > 0 and arbitrary δ > 0, there exists a constant 1 ≤ p < +∞ such that for all n > p,
$$\|w_0(T)-w_0^n(T)\| \le \delta.$$

Proof.
By introducing the condition αj < C in the Proof of Theorem 3.4, we obtain
$$\|w_0(T)-w_0^p(T)\| \le C_2\prod_{j=1}^{p}\alpha_j\,\frac{T^{p+1}}{(p+1)!} \le C_2T\,\frac{(CT)^{p}}{(p+1)!} \qquad\text{for all } 1<p<+\infty.$$
The limit
$$\lim_{p\to+\infty} C_2T\,\frac{(CT)^{p}}{(p+1)!} = 0$$
allows us to conclude that there exists a constant $1<p<+\infty$ such that for all $n>p$, $\|w_0(T)-w_0^n(T)\|\le\delta$.

An interesting consequence of Corollary 3.4.2 is the existence of a convergence barrier, i.e., a “hump” in the plot of the error $\|w_0(T)-w_0^p(T)\|$ versus $p$ generated by the H-model. While Corollary 3.4.2 only establishes this behavior for an upper bound of the error, not for the error itself, the feature is often found in the actual errors associated with numerical methods based on these ideas. The following corollary shows that the requirements on $\{\alpha_j\}$ can be dropped (we still need $\alpha_j<+\infty$) if we consider relatively short integration times $T$.
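The convergence barrier is already visible in the bound itself. With made-up constants such that $CT>1$, the sequence $b_p = C_2T\,(CT)^p/(p+1)!$ first grows before the factorial takes over and forces decay:

```python
import numpy as np
from math import factorial

# H-model error bound of Corollary 3.4.2 (constants below are hypothetical).
C2, C, T = 1.0, 6.0, 1.0
b = np.array([C2 * T * (C * T) ** p / factorial(p + 1) for p in range(1, 30)])
peak = int(np.argmax(b))
print(peak, b[0], b[-1])   # interior maximum (the "hump"), then eventual decay
```

The peak sits near $p\approx CT-2$, where the ratio of consecutive terms $CT/(p+2)$ crosses one; beyond it, increasing the truncation order pays off.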

Corollary 3.4.3
(Short-time convergence of the H-model). For any integer $n$ for which $\alpha_j<\infty$ for $1\le j\le n$ and any sequence of constants $\delta_1>\delta_2>\cdots>\delta_n>0$, there exists a fixed time $T>0$ such that
$$\|w_0(T)-w_0^p(T)\| \le \delta_p$$
for 1 ≤ p ≤ n.

Proof.
Since $\alpha_j<+\infty$, we can choose $C=\max_{1\le j\le n}\alpha_j$. By following the same steps we used in the Proof of Theorem 3.4, we conclude that for
$$T \le \frac{1}{C}\min_{1\le p\le n}\left[\frac{C\,(p+1)!}{C_2}\,\delta_p\right]^{\frac{1}{p+1}},$$
the errors satisfy
$$\|w_0(T)-w_0^p(T)\| \le C_2\prod_{j=1}^{p}\alpha_j\,\frac{T^{p+1}}{(p+1)!} \le \frac{C_2}{C}\,\frac{(CT)^{p+1}}{(p+1)!} \le \delta_p$$
as desired, for all $1\le p\le n$.

Corollaries 3.4.1 and 3.4.2 provide sufficient conditions for the error $\|w_0(T)-w_0^n(T)\|$ generated by the H-model to decay as we increase the truncation order $n$. However, we still need to answer the important question of whether the H-model actually provides accurate results for a given nonlinear dynamics ($\mathcal{L}$), quantity of interest ($P$), and initial state $x_0$. Corollary 3.4.3 provides a partial answer to this question by showing that, at least over short time periods, condition (41) is always satisfied (assuming that the $\{\alpha_j\}$ are finite). This guarantees the short-time convergence of the H-model for any reasonably smooth nonlinear dynamical system and almost any observable. However, for longer integration times $T$, convergence of the H-model for arbitrary nonlinear dynamical systems cannot be established in general, which means that we need to proceed on a case-by-case basis by applying Theorem 3.4 or by checking whether the hypotheses of Corollary 3.4.1 or Corollary 3.4.2 are satisfied. On the other hand, the convergence of the H-model can be established for any finite integration time in the case of linear dynamical systems, as we have recently shown in Ref. 44. From a practical viewpoint, the implementation of the H-model requires computing $(\mathcal{L}Q)^n\mathcal{L}u_0$ to high order in $n$. This is not straightforward if the dynamical system is nonlinear. However, for linear systems, such terms can be easily computed (see Sec. III E).

2. Type-I finite memory approximation (FMA)

The Type-I finite memory approximation is obtained by solving the system (31) with $w_n^{e_n}(t)$ given by (34). As before, we first derive an upper bound for $\|w_0(t)-w_0^n(t)\|$ and then discuss sufficient conditions for convergence. Such conditions essentially control the growth of an upper bound on $\|w_0(t)-w_0^n(t)\|$.

Theorem 3.5
(Accuracy of the Type-I FMA). Let $e^{t\mathcal{L}}$ and $e^{t\mathcal{L}Q}$ be strongly continuous semigroups and let $T>0$ be a fixed integration time. If
$$\alpha_j = \frac{\|(\mathcal{L}Q)^{j+1}\mathcal{L}u_0\|}{\|(\mathcal{L}Q)^{j}\mathcal{L}u_0\|}, \qquad 1\le j\le n,$$
(43)
then for each $1\le p\le n$ and for $\Delta t_p\le t\le T$,
$$\|w_0(t)-w_0^p(t)\| \le M_4^p(t),$$
where
$$M_4^p(t) = C_1A_1A_2\prod_{i=1}^{p}\alpha_i\,\frac{(t-\Delta t_p)^{p+1}}{(p+1)!}$$
and $C_1$, $A_1$, $A_2$ are as in Theorem 3.4.

Proof.
The error at the pth level is of the form
$$w_p(t)-w_p^{e_p}(t) = \int_0^{\max(0,\,t-\Delta t_p)} Pe^{s\mathcal{L}}\,P\mathcal{L}\,e^{(t-s)Q\mathcal{L}}\,(Q\mathcal{L})^{p+1}u_0\,ds,$$
and the error at the zeroth level is
$$\begin{aligned} w_0(t)-w_0^p(t) &= \int_0^t\!\!\int_0^{\tau_p}\!\!\cdots\!\int_0^{\tau_2}\left[w_p(\tau_1)-w_p^{e_p}(\tau_1)\right]d\tau_1\cdots d\tau_p\\ &= \int_0^t\!\!\int_0^{\tau_p}\!\!\cdots\!\int_0^{\tau_2}\!\!\int_0^{\max(0,\,\tau_1-\Delta t_p)} Pe^{s\mathcal{L}}\,P\,e^{(\tau_1-s)\mathcal{L}Q}\,(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\,ds\,d\tau_1\cdots d\tau_p\\ &= \int_{\Delta t_p}^{t}\!\!\int_{\Delta t_p}^{\tau_p}\!\!\cdots\!\int_{\Delta t_p}^{\tau_2}\!\!\int_0^{\tau_1-\Delta t_p} Pe^{s\mathcal{L}}\,P\,e^{(\tau_1-s)\mathcal{L}Q}\,(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\,ds\,d\tau_1\cdots d\tau_p\\ &= \int_0^{\max(0,\,t-\Delta t_p)}\!\!\int_0^{\tilde\tau_p}\!\!\cdots\!\int_0^{\tilde\tau_2}\!\!\int_0^{\tilde\tau_1} Pe^{s\mathcal{L}}\,P\,e^{(\tilde\tau_1+\Delta t_p-s)\mathcal{L}Q}\,(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\,ds\,d\tilde\tau_1\cdots d\tilde\tau_p, \end{aligned}$$
where $\tilde\tau_i=\tau_i-\Delta t_p$.
The norm of this error may be bounded as
$$\|w_0(t)-w_0^p(t)\| \le \int_0^{\max(0,\,t-\Delta t_p)}\!\!\int_0^{\tilde\tau_p}\!\!\cdots\!\int_0^{\tilde\tau_1}\left\|Pe^{s\mathcal{L}}\,P\,e^{(\tilde\tau_1+\Delta t_p-s)\mathcal{L}Q}\,(\mathcal{L}Q)^{p+1}\mathcal{L}u_0\right\|ds\,d\tilde\tau_1\cdots d\tilde\tau_p \le C_1\prod_{j=1}^{p}\alpha_j\int_0^{\max(0,\,t-\Delta t_p)}\!\!\int_0^{\tilde\tau_p}\!\!\cdots\!\int_0^{\tilde\tau_1} e^{s(\omega-\omega_Q)}\,e^{(\tilde\tau_1+\Delta t_p)\omega_Q}\,ds\,d\tilde\tau_1\cdots d\tilde\tau_p = C_1\prod_{j=1}^{p}\alpha_j\,f_p(t,\Delta t_p,\omega,\omega_Q),$$
where
$$f_p(t,\Delta t_p,\omega,\omega_Q) = \int_0^{\max(0,\,t-\Delta t_p)}\!\!\int_0^{\tilde\tau_p}\!\!\cdots\!\int_0^{\tilde\tau_2}\!\!\int_0^{\tilde\tau_1} e^{s(\omega-\omega_Q)}\,e^{(\tilde\tau_1+\Delta t_p)\omega_Q}\,ds\,d\tilde\tau_1\cdots d\tilde\tau_p.$$
If we bound fp as
$$f_p(t,\Delta t_p,\omega,\omega_Q) \le A_1A_2\int_0^{\max(0,\,t-\Delta t_p)}\!\!\int_0^{\tilde\tau_p}\!\!\cdots\!\int_0^{\tilde\tau_1} ds\,d\tilde\tau_1\cdots d\tilde\tau_p = \begin{cases}0, & 0\le t\le\Delta t_p,\\[4pt] A_1A_2\,\dfrac{(t-\Delta t_p)^{p+1}}{(p+1)!}, & t\ge\Delta t_p,\end{cases}$$
where A1, A2 are defined in (38), then we have that
$$\|w_0(t)-w_0^p(t)\| \le C_1A_1A_2\prod_{j=1}^{p}\alpha_j\,\frac{(t-\Delta t_p)^{p+1}}{(p+1)!} = M_4^p(t).$$

We notice that if the effective memory band at each level decreases as we increase the differentiation order $p$, then we can control the error $\|w_0(t)-w_0^n(t)\|$. The following corollary provides a sufficient condition that guarantees this sort of control of the error.

Corollary 3.5.1
(Uniform convergence of the Type-I FMA). If the $\alpha_j$ in Theorem 3.5 satisfy
$$\alpha_j < (j+1)\left[\frac{\delta\,j!}{C_1A_1A_2\prod_{k=1}^{j-1}\alpha_k}\right]^{-\frac{1}{j}}, \qquad 1\le j\le n,$$
(44)
then for any T > 0 and δ > 0, there exists an ordered sequence Δtn < Δtn−1 < ⋯ < Δt1 < T such that
$$\|w_0(T)-w_0^p(T)\| \le \delta, \qquad 1\le p\le n,$$
which satisfies
$$\Delta t_p \ge T - \left[\frac{\delta\,(p+1)!}{C_1A_1A_2\prod_{j=1}^{p}\alpha_j}\right]^{\frac{1}{p+1}}.$$
(45)

Proof.
For $1\le p\le n$, we set
$$\|w_0(t)-w_0^p(t)\| \le C_1A_1A_2\prod_{j=1}^{p}\alpha_j\,\frac{(t-\Delta t_p)^{p+1}}{(p+1)!} \le \delta.$$
This yields the following requirement on Δtp:
ΔtpTδ(p+1)!C1A1A2j=1pαj1p+1.
(46)
Since hypothesis (44) holds, it is easy to check that the lower bound on each Δtp satisfies
Tδ(p+1)!C1A1A2j=1pαj1p+1<Tδp!C1A1A2j=1p1αj1p,  Δtp>Δtp1.
Therefore, by using the equality in (46) to define a sequence of Δtn, we find that it is a decreasing time sequence 0 < Δtn < Δtn−1 < ⋯ < Δt1 < T such that w0(T)w0n(T)δ holds for all t ∈ [0, T] and which satisfies (45).
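The lower bound (45) is straightforward to evaluate numerically. The sketch below uses hypothetical values of $C_1A_1A_2$, δ, T, and of the sequence αj (none taken from the paper) and checks that the resulting memory bands are ordered as in Corollary 3.5.1:

```python
import math

def dt_lower_bound(p, T, delta, C, alphas):
    # Eq. (45): Delta t_p >= T - (delta (p+1)! / (C prod_{j<=p} alpha_j))^(1/(p+1)),
    # where C stands for the product C1*A1*A2
    prod = math.prod(alphas[:p])
    return T - (delta*math.factorial(p + 1)/(C*prod))**(1.0/(p + 1))

# hypothetical values (not taken from the paper)
T, delta, C = 2.0, 1e-3, 1.5
alphas = [0.5, 0.6, 0.7, 0.8, 0.9]
bounds = [dt_lower_bound(p, T, delta, C, alphas) for p in range(1, 6)]

# the memory bands shrink with the order p: Delta t_5 < ... < Delta t_1 < T
assert all(b2 < b1 for b1, b2 in zip(bounds, bounds[1:]))
assert 0.0 < bounds[-1] < T
```

For these (arbitrary) parameter values the sequence of lower bounds is strictly decreasing, which is exactly the ordering the corollary requires.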

Remark.
The sufficient condition provided in Corollary 3.5.1 guarantees uniform convergence of the Type-I finite memory approximation. If we replace condition (44) with
$$\alpha_j<C\qquad\text{for all}\ \ 1\le j<+\infty,$$
where C is a positive constant (independent of T), then we obtain asymptotic convergence. In other words, for each δ > 0, there exists an integer p such that for all n > p, we have $\|w_0(t)-w_0^n(t)\|<\delta$. This result is based on the limit
$$\lim_{p\to+\infty}\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\prod_{j=1}^p\alpha_j}\right)^{\frac{1}{p+1}}\ge\lim_{p\to+\infty}\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\,C^p}\right)^{\frac{1}{p+1}}=+\infty,$$
which guarantees the existence of an integer p for which the lower bound on Δtp in (45) becomes smaller than or equal to zero. In such a case, the Type-I FMA degenerates to the H-model, for which Corollary 3.4.2 holds.

3. Type-II finite memory approximation

The Type-II finite memory approximation is obtained by solving the system (31) with $w_n^n(t)$ given in (35). We first derive an upper bound for $\|w_0(t)-w_0^n(t)\|$ and then discuss sufficient conditions for convergence.

Theorem 3.6
(Accuracy of the Type-II FMA). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. If
$$\alpha_j=\frac{\big\|(LQ)^{j+1}Lu_0\big\|}{\big\|(LQ)^{j}Lu_0\big\|},\qquad 1\le j\le n,$$
(47)
then for 1 ≤ p ≤ n,
$$\big\|w_0(t)-w_0^p(t)\big\|\le M_5^p(t),$$
where
$$M_5^p(t)=C_1\prod_{j=1}^p\alpha_j\,f_p(\omega_Q,t)\,h(\omega-\omega_Q,t_p),$$
$$f_p(\omega_Q,t)=\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,e^{\sigma\omega_Q}\,d\sigma,\qquad h(\omega-\omega_Q,t_p)=\int_0^{t_p}e^{s(\omega-\omega_Q)}\,ds,$$
and $C_1=MM_Q\|P\|^2\|LQLu_0\|$.

Proof.
By following the same procedure as in the proof of Theorem 3.4, we obtain
$$w_0(t)-w_0^p(t)=\int_0^t\int_0^{\tau_p}\cdots\int_0^{\tau_2}\int_0^{\min(\tau_1,t_p)}Pe^{sL}Pe^{(\tau_1-s)LQ}(LQ)^{p+1}Lu_0\,ds\,d\tau_1\cdots d\tau_p.$$
By applying Cauchy’s formula for repeated integration, this expression may be simplified to
$$w_0(t)-w_0^p(t)=\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{\min(\sigma,t_p)}Pe^{sL}Pe^{(\sigma-s)LQ}(LQ)^{p+1}Lu_0\,ds\,d\sigma.$$
Thus,
$$\big\|w_0(t)-w_0^p(t)\big\|\le\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{\min(\sigma,t_p)}\big\|Pe^{sL}Pe^{(\sigma-s)LQ}(LQ)^{p+1}Lu_0\big\|\,ds\,d\sigma$$
$$\le MM_Q\|P\|^2\big\|(LQ)^{p+1}Lu_0\big\|\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{t_p}e^{s\omega}e^{(\sigma-s)\omega_Q}\,ds\,d\sigma$$
$$\le C_1\prod_{j=1}^p\alpha_j\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,e^{\sigma\omega_Q}\,d\sigma\int_0^{t_p}e^{s(\omega-\omega_Q)}\,ds=C_1\prod_{j=1}^p\alpha_j\,f_p(\omega_Q,t)\,h(\omega-\omega_Q,t_p)=M_5^p(t),$$
where $C_1=MM_Q\|P\|^2\|LQLu_0\|$,
$$f_p(\omega_Q,t)=\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,e^{\sigma\omega_Q}\,d\sigma=\begin{cases}\dfrac{t^p}{p!}, & \omega_Q=0,\\[6pt] \dfrac{1}{\omega_Q^{\,p}}\left(e^{t\omega_Q}-\displaystyle\sum_{k=0}^{p-1}\frac{(t\omega_Q)^k}{k!}\right), & \omega_Q\ne0,\end{cases}$$
(48)
and
$$h(\omega-\omega_Q,t_p):=\int_0^{t_p}e^{s(\omega-\omega_Q)}\,ds=\begin{cases}t_p, & \omega=\omega_Q,\\[6pt] \dfrac{e^{t_p(\omega-\omega_Q)}-1}{\omega-\omega_Q}, & \omega\ne\omega_Q\end{cases}$$
are strictly increasing functions of t and tp, respectively.
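The closed forms of $f_p$ and $h$ in (48) are elementary to verify against direct quadrature; a minimal sketch (the values of p, ωQ, and t are arbitrary):

```python
import math
from scipy.integrate import quad

def f_closed(p, w, t):
    # closed form of f_p(omega_Q, t) from Eq. (48)
    if w == 0.0:
        return t**p / math.factorial(p)
    partial = sum((t*w)**k / math.factorial(k) for k in range(p))
    return (math.exp(t*w) - partial) / w**p

def f_quad(p, w, t):
    # direct quadrature of the defining integral in (48)
    val, _ = quad(lambda s: (t - s)**(p - 1)/math.factorial(p - 1)*math.exp(s*w), 0.0, t)
    return val

for p in (1, 2, 4):
    for w in (-0.8, 0.0, 0.7):
        assert abs(f_closed(p, w, 2.0) - f_quad(p, w, 2.0)) < 1e-8

# the closed form of h(omega - omega_Q, t_p) can be checked the same way
h_quad, _ = quad(lambda s: math.exp(0.3*s), 0.0, 1.5)
assert abs(h_quad - (math.exp(0.3*1.5) - 1.0)/0.3) < 1e-10
```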

Corollary 3.6.1
(Uniform convergence of the Type-II FMA). If the αj in Theorem 3.6 satisfy
$$\alpha_j<\frac{j}{T}\ \ (\omega_Q=0)\qquad\text{or}\qquad \alpha_j<\omega_Q\,\frac{e^{T\omega_Q}-\displaystyle\sum_{k=0}^{j-2}\frac{(T\omega_Q)^k}{k!}}{e^{T\omega_Q}-\displaystyle\sum_{k=0}^{j-1}\frac{(T\omega_Q)^k}{k!}}\ \ (\omega_Q\ne0)$$
(49)
for 1 ≤ j ≤ n, then for any arbitrarily small δ > 0, there exists an ordered sequence 0 < t1 < t2 < ⋯ < tn ≤ T such that
$$\big\|w_0(T)-w_0^p(T)\big\|\le\delta,\qquad 1\le p\le n,$$
which satisfies
$$t_j\le\begin{cases}\dfrac{j!\,\delta}{C_1\prod_{i=1}^j\alpha_i\,T^j}, & \omega_Q=0,\\[8pt] \dfrac{\omega_Q^{\,j}\,\delta}{C_1\prod_{i=1}^j\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{j-1}\frac{(T\omega_Q)^k}{k!}\right)}, & \omega_Q\ne0,\end{cases}$$
when $\omega=\omega_Q$, and
$$t_j\le\begin{cases}\dfrac{1}{\omega}\ln\!\left(1+\dfrac{j!\,\omega\,\delta}{C_1\prod_{i=1}^j\alpha_i\,T^j}\right), & \omega_Q=0,\\[8pt] \dfrac{1}{\omega-\omega_Q}\ln\!\left(1+\dfrac{(\omega-\omega_Q)\,\omega_Q^{\,j}\,\delta}{C_1\prod_{i=1}^j\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{j-1}\frac{(T\omega_Q)^k}{k!}\right)}\right), & \omega_Q\ne0,\end{cases}$$
when $\omega\ne\omega_Q$.

Proof.
We consider separately the two cases $\omega=\omega_Q$ and $\omega\ne\omega_Q$. If $\omega=\omega_Q$, then
$$\big\|w_0(t)-w_0^p(t)\big\|\le C_1\prod_{i=1}^p\alpha_i\int_0^t\int_0^{\tau_p}\cdots\int_0^{\tau_2}\int_0^{t_p}e^{\tau_1\omega_Q}e^{s(\omega-\omega_Q)}\,ds\,d\tau_1\cdots d\tau_p=t_p\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,t),$$
where $f_p(\omega_Q,t)$ is defined in (48). To ensure that $\|w_0(t)-w_0^p(t)\|\le\delta$ for all 0 ≤ t ≤ T, we can take
$$t_p\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)=\max_{t\in[0,T]}t_p\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,t)\le\delta,$$
so that
$$t_p\le\frac{\delta}{C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)}=\begin{cases}\dfrac{p!\,\delta}{C_1\prod_{i=1}^p\alpha_i\,T^p}, & \omega_Q=0,\\[8pt] \dfrac{\omega_Q^{\,p}\,\delta}{C_1\prod_{i=1}^p\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{p-1}\frac{(T\omega_Q)^k}{k!}\right)}, & \omega_Q\ne0.\end{cases}$$
On the other hand, if $\omega\ne\omega_Q$, then
$$\big\|w_0(t)-w_0^p(t)\big\|\le C_1\prod_{i=1}^p\alpha_i\int_0^t\int_0^{\tau_p}\cdots\int_0^{\tau_2}\int_0^{t_p}e^{\tau_1\omega_Q}e^{s(\omega-\omega_Q)}\,ds\,d\tau_1\cdots d\tau_p=\frac{e^{t_p(\omega-\omega_Q)}-1}{\omega-\omega_Q}\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,t).$$
To ensure that $\|w_0(t)-w_0^p(t)\|\le\delta$ for all 0 ≤ t ≤ T, we can take
$$\frac{e^{t_p(\omega-\omega_Q)}-1}{\omega-\omega_Q}\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)=\max_{t\in[0,T]}\frac{e^{t_p(\omega-\omega_Q)}-1}{\omega-\omega_Q}\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,t)\le\delta.$$
Let us now consider the two cases $\omega>\omega_Q$ and $\omega<\omega_Q$ separately. When $\omega>\omega_Q$, we have
$$e^{t_p(\omega-\omega_Q)}\le1+\frac{(\omega-\omega_Q)\,\delta}{C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)}$$
and
$$t_p\le\begin{cases}\dfrac{1}{\omega}\ln\!\left(1+\dfrac{p!\,\omega\,\delta}{C_1\prod_{i=1}^p\alpha_i\,T^p}\right), & \omega_Q=0,\\[8pt] \dfrac{1}{\omega-\omega_Q}\ln\!\left(1+\dfrac{(\omega-\omega_Q)\,\omega_Q^{\,p}\,\delta}{C_1\prod_{i=1}^p\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{p-1}\frac{(T\omega_Q)^k}{k!}\right)}\right), & \omega_Q\ne0.\end{cases}$$
On the other hand, when $\omega<\omega_Q$, we have
$$\frac{1-e^{-t_p(\omega_Q-\omega)}}{\omega_Q-\omega}\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)=\max_{t\in[0,T]}\frac{1-e^{-t_p(\omega_Q-\omega)}}{\omega_Q-\omega}\,C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,t)\le\delta,$$
so that
$$e^{-t_p(\omega_Q-\omega)}\ge1-\frac{(\omega_Q-\omega)\,\delta}{C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)},$$
i.e.,
$$t_p\le-\frac{1}{\omega_Q-\omega}\ln\!\left(1-\frac{(\omega_Q-\omega)\,\delta}{C_1\prod_{i=1}^p\alpha_i\,f_p(\omega_Q,T)}\right).$$
Hence,
$$t_p\le\begin{cases}\dfrac{1}{\omega}\ln\!\left(1-\dfrac{p!\,(-\omega)\,\delta}{C_1\prod_{i=1}^p\alpha_i\,T^p}\right), & \omega_Q=0,\\[8pt] -\dfrac{1}{\omega_Q-\omega}\ln\!\left(1-\dfrac{(\omega_Q-\omega)\,\omega_Q^{\,p}\,\delta}{C_1\prod_{i=1}^p\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{p-1}\frac{(T\omega_Q)^k}{k!}\right)}\right), & \omega_Q\ne0.\end{cases}$$
In all four cases, if $\omega_Q=0$, then we have the condition αp < p/T, and the upper bounds on the time sequence satisfy
$$\frac{(p-1)!\,\delta}{C_1\prod_{i=1}^{p-1}\alpha_i\,T^{p-1}}<\frac{p!\,\delta}{C_1\prod_{i=1}^{p}\alpha_i\,T^{p}},\qquad p\ge2.$$
If $\omega_Q\ne0$, then we have the condition
$$\alpha_p<\omega_Q\,\frac{e^{T\omega_Q}-\displaystyle\sum_{k=0}^{p-2}\frac{(T\omega_Q)^k}{k!}}{e^{T\omega_Q}-\displaystyle\sum_{k=0}^{p-1}\frac{(T\omega_Q)^k}{k!}},$$
and the upper bounds on the time sequence satisfy
$$\frac{\omega_Q^{\,p-1}\,\delta}{C_1\prod_{i=1}^{p-1}\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{p-2}\frac{(T\omega_Q)^k}{k!}\right)}<\frac{\omega_Q^{\,p}\,\delta}{C_1\prod_{i=1}^{p}\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{p-1}\frac{(T\omega_Q)^k}{k!}\right)},\qquad p\ge2.$$
Therefore, there always exists an increasing time sequence 0 < t1 < ⋯ < tn such that $\|w_0(t)-w_0^p(t)\|\le\delta$ for all 0 ≤ t ≤ T. Since the δ-bound on the error holds whenever each tp satisfies the upper bounds derived in the two cases above, such an increasing sequence exists with each tp equal to the corresponding bound. Indeed, because of the coarseness of the approximations applied in the proof, time sequences with significantly larger ti may also satisfy the δ-bound.

Remark.
If we replace (49) with the stronger condition
$$\alpha_j<\frac{j}{\epsilon T}\ \ (\omega_Q=0),\qquad \alpha_j<\frac{1}{\epsilon T}\ \ (\omega_Q\ne0),\qquad 1\le j<\infty,$$
(50)
where ϵ is an arbitrary constant satisfying ϵ > 1, then the upper bounds on the tj diverge. For $\omega=\omega_Q$ and $\omega_Q=0$,
$$\lim_{j\to+\infty}\frac{j!\,\delta}{C_1\prod_{i=1}^j\alpha_i\,T^j}\ge\lim_{j\to+\infty}\frac{\delta\,\epsilon^j}{C_1}=+\infty,$$
while for $\omega_Q\ne0$, using $e^{T\omega_Q}-\sum_{k=0}^{j-1}(T\omega_Q)^k/k!\le e^{T\omega_Q}(T\omega_Q)^j/j!$,
$$\lim_{j\to+\infty}\frac{\omega_Q^{\,j}\,\delta}{C_1\prod_{i=1}^j\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{j-1}\frac{(T\omega_Q)^k}{k!}\right)}\ge\lim_{j\to+\infty}\frac{\delta\,j!\,\epsilon^j}{C_1\,e^{T\omega_Q}}=+\infty.$$
For $\omega\ne\omega_Q$, the logarithmic upper bounds on tj diverge for the same reason, since the arguments of the logarithms grow like the quantities above. Hence, there exists a j such that the upper bound for tj is greater than or equal to T. In such a case, the Type-II FMA degenerates to the truncation approximation (H-model), for which Corollary 3.4.2 grants asymptotic convergence.

4. Ht-model

The Ht-model is obtained by solving the system (31) with wnen(t) approximated using Chorin’s t-model12 [see Eq. (36)]. Convergence analysis can be performed by using the mathematical methods we employed for the proofs of the H-model. Note that the classical t-model is equivalent to a zeroth-order Ht-model.

Theorem 3.7
(Accuracy of the Ht-model). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$, and let T > 0 be a fixed integration time. For some fixed n, let
$$\alpha_j=\frac{\big\|(LQ)^{j+1}Lu_0\big\|}{\big\|(LQ)^{j}Lu_0\big\|},\qquad 1\le j\le n.$$
(51)
Then, for any 1 ≤ p ≤ n and all t ∈ [0, T], we have
$$\big\|w_0(t)-w_0^p(t)\big\|\le M_6^p(t)\le M_6^p(T),$$
where
$$M_6^p(t)=C_4\prod_{j=1}^p\alpha_j\,\frac{t^{p+1}}{(p+1)!},\qquad C_4=C_1A_1A_2+C_1M_QA_3,\qquad A_3=\max_{s\in[0,T]}e^{s\omega}=\begin{cases}1, & \omega\le0,\\ e^{T\omega}, & \omega>0,\end{cases}$$
and C1, A1, A2 are the same as before.

Proof.
For the pth-order Ht-model, the difference between the memory term w0 and its approximation w0p is
$$w_0(t)-w_0^p(t)=\int_0^t\int_0^{\tau_p}\cdots\int_0^{\tau_2}\left[\int_0^{\tau_1}Pe^{sL}Pe^{(\tau_1-s)LQ}(LQ)^{p+1}Lu_0\,ds-\tau_1\,Pe^{\tau_1L}P(LQ)^{p+1}Lu_0\right]d\tau_1\cdots d\tau_p.$$
(52)
Using Cauchy’s formula for repeated integration, we can bound the norm of the second term in (52) as
$$\left\|\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,\sigma\,Pe^{\sigma L}P(LQ)^{p+1}Lu_0\,d\sigma\right\|\le\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,\sigma\,\big\|Pe^{\sigma L}P(LQ)^{p+1}Lu_0\big\|\,d\sigma\le\|P\|^2M\big\|(LQ)^{p+1}Lu_0\big\|\,\underbrace{\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,\sigma\,e^{\sigma\omega}\,d\sigma}_{g_p(t,\omega)}\le C_1M_Q\prod_{j=1}^p\alpha_j\,g_p(t,\omega),$$
(53)
where $C_1=\|P\|^2\|LQLu_0\|MM_Q$ as before. The function gp(t, ω) may be bounded from above as
$$g_p(t,\omega)\le A_3\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,\sigma\,d\sigma=A_3\,\frac{t^{p+1}}{(p+1)!},\qquad A_3=\max_{s\in[0,T]}e^{s\omega}=\begin{cases}1, & \omega\le0,\\ e^{T\omega}, & \omega>0.\end{cases}$$
By applying the triangle inequality to (52) and taking (53) into account, we obtain
$$\big\|w_0(t)-w_0^p(t)\big\|\le C_1A_1A_2\prod_{j=1}^p\alpha_j\,\frac{t^{p+1}}{(p+1)!}+C_1M_QA_3\prod_{j=1}^p\alpha_j\,\frac{t^{p+1}}{(p+1)!}=M_6^p(t).$$

One can see that the upper bounds M6p(t) and M3p(t) (see Theorem 3.4) share the same structure, the only difference being the constant prefactor. Hence, by replacing C2 with C4, we can prove a series of corollaries similar to Corollaries 3.4.1–3.4.3. In summary, what holds for the H-model also holds for the Ht-model. For brevity, we omit the statements and proofs of these corollaries.

The upper bounds we obtained above are not easily computable for general nonlinear systems and infinite-rank projections, e.g., Chorin’s projection (7). However, if the dynamical system is linear, then such upper bounds are explicitly computable, and the convergence of the H-model can be established for linear phase space functions over any finite integration time T. To this end, consider the linear system ẋ = Ax with a random initial condition x(0) sampled from the joint probability density function

$$\rho_0(x_0)=\delta(x_{01}-x_1(0))\prod_{j=2}^N\rho_{0j}(x_{0j}).$$
(54)

In other words, the initial condition for the quantity of interest u(x) = x1 is set to be deterministic, while all other variables x2, …, xN are zero-mean and statistically independent at t = 0. Here we assume for simplicity that the ρ0j (j = 2, …, N) are independent identically distributed (i.i.d.) standard normal densities. Observe that the Liouville operator associated with the linear system ẋ = Ax is

$$L=\sum_{i=1}^N\sum_{j=1}^N A_{ij}\,x_j\,\frac{\partial}{\partial x_i},$$
(55)

where Aij are the entries of the matrix A. If we choose the observable u = x1, then Chorin’s projection operator (9) yields the evolution equation for the conditional expectation $\mathbb{E}[x_1(t)|x_1(0)]$, i.e., the conditional mean path (11), which can be explicitly written as

$$\frac{d}{dt}\mathbb{E}[x_1|x_1(0)]=A_{11}\,\mathbb{E}[x_1|x_1(0)]+w_0(t),$$
(56)

where $A_{11}$ is the (1,1) entry of the matrix A (note that $PLx_1=A_{11}x_1(0)$) and w0 represents the memory integral (26). Next, we explicitly compute the upper bounds for the memory growth and the error in the H-model for this system. To this end, we first notice that the domain of the Liouville operator can be restricted to the linear space

V=span{x1,,xN}.
(57)

In fact, V is invariant under L, P, and Q, i.e., LVV, PVV, and QVV. These operators have the following matrix representations:

$$L\leftrightarrow A^T=\begin{bmatrix}A_{11} & b^T\\ a & M_{11}^T\end{bmatrix},\qquad P\leftrightarrow\begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 0\end{bmatrix},\qquad Q\leftrightarrow\begin{bmatrix}0 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix},$$

where M11 is the minor of the matrix of A obtained by removing the first column and the first row, while

$$a=[A_{12}\ \cdots\ A_{1N}]^T,\qquad b^T=[A_{21}\ \cdots\ A_{N1}].$$
(58)

Therefore,

$$LQ\leftrightarrow\begin{bmatrix}0 & b^T\\ 0 & M_{11}^T\end{bmatrix},\qquad L(QL)^n x_1(0)\leftrightarrow\begin{bmatrix}b^T(M_{11}^T)^{n-1}a\\ (M_{11}^T)^n a\end{bmatrix}.$$
(59)

At this point, we set x01 = x1(0) and

$$q(t,x_{01},\tilde x_0)=\int_0^t e^{sL}PLe^{(t-s)QL}QLx_{01}\,ds.$$

Since $\tilde x_0=(x_2(0),\ldots,x_N(0))$ is random, $q(t,x_{01},\tilde x_0)$ is a random variable. By using Jensen’s inequality $[\mathbb{E}(X)]^2\le\mathbb{E}[X^2]$, we have the following $L^\infty$ estimate:
$$\big|(Pq)(t,x_{01})\big|\le\big\|q(t,x_{01},\cdot)\big\|_{L^2_{\rho_0}}.$$
(60)

On the other hand, we have

$$\big\|e^{tL}\big\|_{L^2_{\rho_0}(V)}\le\big\|e^{tL}\big\|_{L^2_{\rho_0}}\le e^{t\omega},\qquad \omega=-\frac12\inf\operatorname{div}_{\rho_0}(F).$$
(61)

For linear dynamical systems, both Lρ02(V) and Lρ02 upper bounds can be used to estimate the norm of the semigroup etL. However, for the semigroup etLQ, we can only obtain the explicit form of the Lρ02(V) bound, which is given by the following perturbation theorem16 (see also  Appendix A):

$$\big\|e^{tLQ}\big\|_{L^2_{\rho_0}(V)}\le e^{t\omega_Q},\quad\text{where}\quad \omega_Q=\omega+\frac{1}{|x_1(0)|}\sqrt{A_{11}^2+\sum_{i=2}^N A_{1i}^2\big\langle x_i^2(0)\big\rangle_{\rho_0}}\;\le\;\omega+\big\|LP\big\|_{L^2_{\rho_0}(V)}.$$
(62)

Memory growth. It is straightforward at this point to compute the upper bound on the memory growth we obtained in Theorem 3.1. Since $\|P\|_{L^2_{\rho_0}}=\|Q\|_{L^2_{\rho_0}}=1$ (P and Q are orthogonal projections relative to ρ0), we have the following result:

$$|w_0(t)|\le\big\|LQLx_1(0)\big\|_{L^2_{\rho_0}}\,\frac{e^{t\omega}-e^{t\omega_Q}}{\omega-\omega_Q}=\left[(b^Ta)^2x_1^2(0)+\big\|\Lambda^{1/2}M_{11}^Ta\big\|_2^2\right]^{1/2}\frac{e^{t\omega}-e^{t\omega_Q}}{\omega-\omega_Q},$$
(63)

where Λ is the (N − 1) × (N − 1) diagonal matrix with $\Lambda_{ii}=\langle x_{i+1}^2(0)\rangle_{\rho_0}$ and ∥·∥2 is the vector 2-norm.

Accuracy of the H-model. We are interested in computing the upper bound on the approximation error generated by the H-model (see Sec. III D 1, Theorem 3.4). By using the matrix representations of L, P, and Q, the nth-order H-model for the MZ equation (56) of the linear system can be explicitly written as

$$\begin{aligned}&\frac{d}{dt}\mathbb{E}[x_1|x_1(0)]=A_{11}\,\mathbb{E}[x_1|x_1(0)]+w_0^n(t)\qquad\text{(MZ equation)},\\ &\frac{dw_j^n(t)}{dt}=b^T(M_{11}^T)^j a\,\mathbb{E}[x_1|x_1(0)]+w_{j+1}^n(t),\qquad j=0,1,\ldots,n-1,\\ &\frac{dw_n^n(t)}{dt}=b^T(M_{11}^T)^n a\,\mathbb{E}[x_1|x_1(0)],\end{aligned}$$
(64)

where M11, a, and b are defined as before [see Eq. (58)]. The upper bound for the memory term approximation error is explicitly obtained as

$$\big|w_0(t)-w_0^n(t)\big|\le A_1A_2\big\|L(QL)^{n+1}x_1(0)\big\|\,\frac{t^{n+1}}{(n+1)!}=A_1A_2\left[\big(b^T(M_{11}^T)^{n}a\big)^2x_1^2(0)+\big\|\Lambda^{1/2}(M_{11}^T)^{n+1}a\big\|_2^2\right]^{1/2}\frac{t^{n+1}}{(n+1)!},$$
(65)

where A1, A2 are defined in (38), while ω and ωQ are given in (61) and (62), respectively. Note that the error bound (65) is slightly different from the one we obtained in Theorem 3.4. The reason is that here we choose to bound $\|L(QL)^{n+1}u_0\|$ directly, instead of using the quotients $\alpha_n=\|L(QL)^{n+1}u_0\|/\|L(QL)^{n}u_0\|$. For each fixed integration time T, the upper bound (65) goes to zero as we send n to infinity, i.e.,

$$\lim_{n\to+\infty}\big|w_0(T)-w_0^n(T)\big|=0.$$

This means that the H-model converges for all linear dynamical systems with observables in the linear space (57).
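For a concrete linear example, the closed hierarchy (64) can be integrated directly and compared with the exact conditional mean $\mathbb{E}[x_1(t)|x_1(0)]=[e^{tA}]_{11}x_1(0)$. The sketch below uses a hypothetical 4 × 4 stable matrix (not the matrix used in Sec. IV); the coefficients $c_j=b^T(M_{11}^T)^j a$ follow from (59):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical 4x4 stable matrix (illustrative; not the matrix used in Sec. IV)
A = np.array([[-1.0, 0.3, 0.2, 0.1],
              [ 0.2, -1.2, 0.3, 0.0],
              [ 0.1, 0.2, -0.9, 0.3],
              [ 0.0, 0.1, 0.2, -1.1]])
a, b, M11 = A[0, 1:], A[1:, 0], A[1:, 1:]

def h_model(n, T=1.0, dt=1e-3):
    # Integrate the hierarchy (64); c_j = b^T (M11^T)^j a, and the nth
    # level closes the system by dropping the term w_{n+1}.
    c = [b @ np.linalg.matrix_power(M11.T, j) @ a for j in range(n + 1)]
    y = np.zeros(n + 2); y[0] = 1.0        # y = (E[x1|x1(0)], w_0, ..., w_n)
    def rhs(y):
        dy = np.empty_like(y)
        dy[0] = A[0, 0]*y[0] + y[1]
        for j in range(n + 1):
            dy[j + 1] = c[j]*y[0] + (y[j + 2] if j < n else 0.0)
        return dy
    for _ in range(int(T/dt)):             # classical RK4 time stepping
        k1 = rhs(y); k2 = rhs(y + dt/2*k1)
        k3 = rhs(y + dt/2*k2); k4 = rhs(y + dt*k3)
        y = y + dt/6*(k1 + 2*k2 + 2*k3 + k4)
    return y[0]

# exact conditional mean at t = 1 with x1(0) = 1 and zero-mean x2(0), ..., x4(0)
exact = expm(A)[0, 0]
errors = [abs(h_model(n) - exact) for n in (0, 2, 8)]
assert errors[2] < errors[0] and errors[2] < 1e-5
```

The error decays with the differentiation order n, consistent with the factorial decay of the bound (65).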

The semigroup estimates we obtained in Sec. III A allow us to compute explicitly an a priori estimate of the memory kernel in the Mori-Zwanzig equation if we employ finite-rank projections. In this section, we outline the procedure to obtain such an estimate for Hamiltonian dynamical systems. We begin by recalling that Hamiltonian systems are always divergence-free relative to the canonical Gibbs distribution, i.e.,

$$\operatorname{div}_{\rho_{eq}}(F)=0.$$
(66)

Here, F(x) is the velocity field on the right-hand side of (1), while $\rho_{eq}=e^{-\beta H}/Z$ is the canonical Gibbs distribution. Equation (66) can be easily obtained by noticing that

$$\operatorname{div}_{\rho_{eq}}(F)=e^{\beta H}\,\nabla\cdot\big(e^{-\beta H}F\big)=e^{\beta H}\sum_{i=1}^N\left[\frac{\partial}{\partial q_i}\left(e^{-\beta H}\frac{\partial H}{\partial p_i}\right)-\frac{\partial}{\partial p_i}\left(e^{-\beta H}\frac{\partial H}{\partial q_i}\right)\right]=0.$$

By combining the estimate (18) with the divergence-free constraint (66), we find that the Koopman semigroup of a Hamiltonian dynamical system is a contraction in the Lρeq2 norm, i.e.,

$$\big\|e^{tL}\big\|_{L^2_{\rho_{eq}}}\le1.$$
(67)

Moreover, the MZ equation (6) with a finite-rank projection P of the form (12) (with σ = ρeq) can be reduced to the following Volterra integro-differential equation:

$$\frac{d}{dt}Pu_i(t)=\sum_{j=1}^M\Omega_{ij}\,Pu_j(t)-\sum_{j=1}^M\int_0^t K_{ij}(t-s)\,Pu_j(s)\,ds,\qquad i=1,\ldots,M,$$
(68)

where

$$G_{ij}=\langle u_i,u_j\rangle_{eq},$$
(69a)
$$\Omega_{ij}=\sum_{k=1}^M(G^{-1})_{jk}\,\langle u_k,Lu_i\rangle_{eq},$$
(69b)
$$K_{ij}(t-s)=\sum_{k=1}^M(G^{-1})_{jk}\,\big\langle QLu_k,\,e^{(t-s)QL}QLu_i\big\rangle_{eq}.$$
(69c)

Equation (68) is often called the generalized Langevin equation (GLE)13,31 for the projected quantity of interest ui(t). To derive (69a)–(69c), we used the fact that L is skew-adjoint and Q is self-adjoint with respect to the Lρeq2 inner product and that Q2=Q. In classical statistical physics, Eq. (69c) is known as the second fluctuation dissipation theorem. Next, define the temporal-correlation matrix

$$C_{ij}(t)=\langle u_j(0),u_i(t)\rangle_{eq}=\langle u_j(0),Pu_i(t)\rangle_{eq}.$$
(70)

By applying uj,()eq to both sides of Eq. (68), we obtain the following exact evolution equation for Cij(t):

$$\frac{dC_{ij}}{dt}=\sum_{k=1}^M\Omega_{ik}C_{kj}-\sum_{k=1}^M\int_0^t K_{ik}(t-s)\,C_{kj}(s)\,ds.$$
(71)

Moreover, if we employ a one-dimensional Mori’s basis, i.e., M = 1, then we obtain the simplified equation

$$\frac{dC(t)}{dt}=\Omega_1C(t)-\int_0^tK(t-s)\,C(s)\,ds,$$
(72)

where C(t)=u1(0),u1(t)eq. The main difficulty in solving the GLE (71) [or (72)] lies in computing the memory kernels Kij(t). Hereafter we prove that such memory kernels can be uniformly bounded by a computable quantity that depends only on the initial condition of the system. For the sake of simplicity, we will focus on the one-dimensional GLE (72).
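Once the kernel is available, the one-dimensional GLE (72) (here with Ω1 = 0) is a linear Volterra integro-differential equation that can be marched in time with a trapezoidal rule. In the sketch below, the kernel K(t) = e^(−t) is purely illustrative (it is not one of the kernels studied in this paper); for that kernel the exact solution follows from the Laplace transform of (72):

```python
import numpy as np

def solve_gle(K, T=5.0, dt=1e-3):
    # Trapezoidal marching scheme for dC/dt = -int_0^t K(t-s) C(s) ds, C(0) = 1,
    # i.e. the one-dimensional GLE (72) with Omega_1 = 0.
    n = int(T/dt)
    t = dt*np.arange(n + 1)
    Kv = K(t)
    C = np.empty(n + 1); C[0] = 1.0
    f_prev = 0.0                               # dC/dt at t = 0
    for m in range(1, n + 1):
        # trapezoidal quadrature of the memory integral; C[m] is still unknown
        hist = 0.5*Kv[m]*C[0] + Kv[m-1:0:-1] @ C[1:m]
        g = -dt*hist                           # part of f_m not involving C[m]
        # trapezoidal ODE step C[m] = C[m-1] + dt/2 (f_{m-1} + f_m),
        # with f_m = g - dt/2 K(0) C[m]; solve the linear equation for C[m]
        C[m] = (C[m-1] + 0.5*dt*(f_prev + g)) / (1.0 + 0.25*dt*dt*Kv[0])
        f_prev = g - 0.5*dt*Kv[0]*C[m]
    return t, C

# assumed kernel (illustration only): K(t) = exp(-t); the Laplace transform of
# (72) then gives C(t) = exp(-t/2) [cos(sqrt(3) t/2) + sin(sqrt(3) t/2)/sqrt(3)]
t, C = solve_gle(lambda s: np.exp(-s))
exact = np.exp(-t/2)*(np.cos(np.sqrt(3)*t/2) + np.sin(np.sqrt(3)*t/2)/np.sqrt(3))
assert np.max(np.abs(C - exact)) < 1e-4
```

The scheme is second order in dt, which is usually sufficient since the dominant error in practice comes from the approximation of K(t) itself.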

Theorem 3.8.
The memory kernel K(t) in the one-dimensional GLE (72) is uniformly bounded by $\|\dot u_1(0)\|^2_{L^2_{\rho_{eq}}}/\|u_1(0)\|^2_{L^2_{\rho_{eq}}}$, i.e.,
$$|K(t)|\le\frac{\big\|\dot u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1(0)\big\|^2_{L^2_{\rho_{eq}}}},\qquad t\ge0.$$
(73)

Proof.
From the second fluctuation-dissipation theorem (69c), the memory kernel K(t) satisfies
$$|K(t)|=\frac{\big|\big\langle e^{tQLQ}QLu_1(0),\,QLu_1(0)\big\rangle_{eq}\big|}{\langle u_1(0),u_1(0)\rangle_{eq}}\le\big\|e^{tQLQ}\big\|_{L^2_{\rho_{eq}}}\frac{\big\|QLu_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}\le\big\|e^{tQLQ}\big\|_{L^2_{\rho_{eq}}}\frac{\big\|\dot u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}.$$
On the other hand, by using the numerical abscissa (17) and formula (19), we see that the semigroup $e^{tQLQ}$ is a contraction, i.e., $\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\le1$ (here we also used that Q is an orthogonal projection with respect to ρeq, so that $\|Q\|_{L^2_{\rho_{eq}}}\le1$). This yields
$$|K(t)|\le\big\|e^{tQLQ}\big\|_{L^2_{\rho_{eq}}}\frac{\big\|\dot u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}\le\frac{\big\|\dot u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1(0)\big\|^2_{L^2_{\rho_{eq}}}}.$$

Theorem 3.8 provides an a priori and easily computable upper bound for the memory kernel defining the dynamics of any quantity of interest u1 that is initially in the Gibbs canonical ensemble ρeq=eβH/Z. In Sec. IV, we will calculate the upper bound (73) analytically and compare it with the exact memory kernel we obtain in prototype linear and nonlinear Hamiltonian systems.

Remark.
We emphasized in Sec. III A that the semigroup estimate for $e^{tQLQ}$ is not necessarily tight. In the context of high-dimensional Hamiltonian systems (e.g., molecular dynamics), it is often empirically assumed that the semigroup $e^{tQLQ}$ is dissipative, i.e., $\|e^{tQLQ}\|\le e^{t\omega_0}$, where ω0 < 0. In this case, the memory kernel turns out to be uniformly bounded by an exponentially decaying function since
$$|K(t)|\le\big\|e^{tQLQ}\big\|_{L^2_{\rho_{eq}}}\frac{\big\|\dot u_1\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1\big\|^2_{L^2_{\rho_{eq}}}}\le e^{t\omega_0}\,\frac{\big\|\dot u_1\big\|^2_{L^2_{\rho_{eq}}}}{\big\|u_1\big\|^2_{L^2_{\rho_{eq}}}}.$$

In this section, we provide simple numerical examples of the MZ memory approximation methods we discussed throughout the paper. Specifically, we study Hamiltonian systems (linear and nonlinear) with finite-rank projections (Mori’s projection) and non-Hamiltonian systems with infinite-rank projections (Chorin’s projection). In both cases, we demonstrate the accuracy of the a priori memory estimation method we developed in Secs. III F and III E. We also compute the solution to the MZ equation for non-Hamiltonian systems with the t-model, the H-model, and the Ht-model.

In this section, we consider dimension reduction in linear and nonlinear Hamiltonian dynamical systems with finite-rank projection. In particular, we consider the Mori projection and study the MZ equation for the temporal auto-correlation function of a scalar quantity of interest.

1. Harmonic chains of oscillators

Consider a one-dimensional chain of harmonic oscillators. This is a simple but illustrative example of a linear Hamiltonian dynamical system which has been widely studied in statistical mechanics, mostly in relation with the microscopic theory of Brownian motion.3,18,17 The Hamiltonian of the system can be written as

$$H(p,q)=\frac{1}{2m}\sum_{i=1}^N p_i^2+\frac{k}{2}\sum_{i=0}^{N}\big(q_{i+1}-q_i\big)^2,$$
(74)

where qi and pi are, respectively, the displacement and momentum of the ith particle, m is the mass of each particle (assumed constant along the chain), and k is the elasticity constant that modulates the intensity of the quadratic interactions. We set fixed boundary conditions at the endpoints of the chain, i.e., q0(t) = qN+1(t) = 0 and p0(t) = pN+1(t) = 0 (particles are numbered from left to right), and m = k = 1. Hamilton's equations are

$$\frac{dq_i}{dt}=\frac{\partial H}{\partial p_i},\qquad \frac{dp_i}{dt}=-\frac{\partial H}{\partial q_i},$$

(75)

which can be written in matrix-vector form as

$$\begin{bmatrix}\dot p\\ \dot q\end{bmatrix}=\begin{bmatrix}0 & k(B-D)\\ I/m & 0\end{bmatrix}\begin{bmatrix}p\\ q\end{bmatrix},$$
(76)

where B is the adjacency matrix of the chain and D is the degree matrix (see Ref. 4). Note that (76) is a linear dynamical system. We are interested in the velocity auto-correlation function of a tagged oscillator, say, the one at location j = 1. Such an auto-correlation function is defined as

$$C_{p_1}(t)=\frac{\langle p_1(0)p_1(t)\rangle_{eq}}{\langle p_1(0)p_1(0)\rangle_{eq}},$$
(77)

where the average is with respect to the Gibbs canonical distribution ρeq=eβH/Z. It was shown in Ref. 18 that Cp1(t) can be obtained analytically by employing Lee’s continued fraction method. The result is the well-known J0J4 solution

$$C_{p_1}(t)=J_0(2t)-J_4(2t),$$
(78)

where Ji(t) is the ith Bessel function of the first kind. On the other hand, the Mori-Zwanzig equation derived by the following Mori’s projection

$$P(\cdot)=\frac{\langle(\cdot),p_1(0)\rangle_{eq}}{\langle p_1(0),p_1(0)\rangle_{eq}}\,p_1(0)$$
(79)

yields the following GLE for Cp1(t):

$$\frac{dC_{p_1}(t)}{dt}=\Omega_{p_1}C_{p_1}(t)-\int_0^tK(s)\,C_{p_1}(t-s)\,ds.$$
(80)

Here,

$$\Omega_{p_1}=\frac{\langle Lp_1(0),p_1(0)\rangle_{eq}}{\langle p_1(0),p_1(0)\rangle_{eq}}=0$$

since pi(0),qj(0)eq=0, while K(t) is the MZ memory kernel. For the J0J4 solution, it is possible to derive the memory kernel K(t) analytically. To this end, we simply insert (78) into (80) and apply the Laplace transform

$$\mathcal{L}[\,\cdot\,](s)=\int_0^\infty(\cdot)\,e^{-st}\,dt$$

to obtain

$$\hat K(s)=-s+\frac{1}{\hat C(s)},$$
(81)

where Ĉ(s)=L[Cp1(t)] and K^(s)=L[K(t)]. The inverse Laplace transform of (81) can be computed analytically as

$$K(t)=\frac{J_1(2t)}{t}+1.$$
(82)

With K(t) available, we can verify the memory estimate we derived in Theorem 3.8. To this end,

$$|K(t)|\le\frac{\big\|\dot p_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|p_1(0)\big\|^2_{L^2_{\rho_{eq}}}}=\frac{\big\|q_2(0)-2q_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|p_1(0)\big\|^2_{L^2_{\rho_{eq}}}}=2.$$
(83)

Here we used the exact solution of the velocity auto-correlation function and displacement auto-correlation function of the fixed-end harmonic chain given by (see Ref. 18)

$$\langle p_i(0),p_j(0)\rangle_{eq}=\frac{k_BT}{\pi}\int_0^\pi\sin(ix)\sin(jx)\,dx,\qquad \langle q_i(0),q_j(0)\rangle_{eq}=\frac{k_BT}{\pi}\int_0^\pi\frac{\sin(ix)\sin(jx)}{4\sin^2(x/2)}\,dx.$$

In Fig. 1, we plot the absolute value of the memory kernel K(t) together with the theoretical bound (83). It is seen that the upper bound we obtain in this case is of the same order of magnitude as the memory kernel.
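Both the J0 − J4 law (78) and the kernel bound (83) can be checked numerically. For m = k = 1 and fixed ends, the stiffness matrix of the chain is tridiagonal, and under the Gibbs measure $\langle p(0)p(t)^T\rangle_{eq}=k_BT\cos(\sqrt{K}\,t)$, so $C_{p_1}(t)=[\cos(\sqrt{K}\,t)]_{11}$. The sketch below uses a finite chain in place of the infinite-chain limit, so small truncation errors are expected:

```python
import numpy as np
from scipy.special import jv

# Fixed-end harmonic chain, m = k = 1: stiffness matrix is tridiagonal
N = 1000
K = 2*np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
lam, V = np.linalg.eigh(K)                  # normal modes
w = np.sqrt(lam)

def C_p1(t):
    # velocity auto-correlation: C_{p1}(t) = [cos(sqrt(K) t)]_{11}
    return float((V[0, :]**2 * np.cos(w*t)).sum())

ts = np.linspace(0.0, 5.0, 51)
numeric = np.array([C_p1(t) for t in ts])
analytic = jv(0, 2*ts) - jv(4, 2*ts)        # the J0 - J4 solution (78)
assert np.max(np.abs(numeric - analytic)) < 1e-3

# exact MZ kernel (82): K(t) = J1(2t)/t + 1, with K(0) = 2
tk = np.linspace(1e-8, 20.0, 2000)
kernel = jv(1, 2*tk)/tk + 1.0
assert abs(kernel[0] - 2.0) < 1e-6          # bound (83) is attained at t -> 0
assert np.max(np.abs(kernel)) <= 2.0        # |K(t)| <= 2 for all t
```

Note that the bound (83) is attained exactly at t = 0, since $K(0)=\|QLp_1\|^2/\|p_1\|^2=\|\dot p_1(0)\|^2/\|p_1(0)\|^2$.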

FIG. 1.

Harmonic chain of oscillators. (a) Velocity auto-correlation function Cp1(t) and (b) memory kernel K(t) of the corresponding MZ equation. It is seen that our theoretical estimate (83) (dashed line) correctly bounds the MZ memory kernel. Note that the upper bound we obtain is of the same order of magnitude as the memory kernel.


2. Hald system

In this section, we study the Hamiltonian system considered by Chorin et al. in Refs. 12 and 9. The Hamiltonian function is defined as

$$H(p,q)=\frac12\big(q_1^2+p_1^2+q_2^2+p_2^2+q_1^2q_2^2\big),$$
(84)

while the corresponding Hamilton’s equations of motion are

$$\dot q_1=p_1,\qquad \dot p_1=-q_1(1+q_2^2),\qquad \dot q_2=p_2,\qquad \dot p_2=-q_2(1+q_1^2).$$
(85)

We assume that the initial state is distributed according to the canonical Gibbs distribution $\rho_{eq}=e^{-H(p,q)}/Z$ (i.e., β = 1). The partition function Z is given by

$$Z=e^{1/4}\,(2\pi)^{3/2}\,K_0\!\left(\tfrac14\right),$$
(86)

where K0 is the zeroth-order modified Bessel function of the second kind. We aim to study the properties of the auto-correlation function of the first component q1, which is defined as

$$C_{q_1}(t)=\frac{\langle q_1(0),q_1(t)\rangle_{eq}}{\langle q_1(0),q_1(0)\rangle_{eq}}.$$

Obviously, Cq1(0)=1. The evolution equation for Cq1(t) is obtained by using the MZ formulation with the Mori’s projection

$$P(\cdot)=\frac{\langle(\cdot),q_1(0)\rangle_{eq}}{\langle q_1(0),q_1(0)\rangle_{eq}}\,q_1(0).$$
(87)

This yields the GLE

$$\frac{dC_{q_1}(t)}{dt}=\Omega_{q_1}C_{q_1}(t)-\int_0^tK(s)\,C_{q_1}(t-s)\,ds.$$
(88)

The streaming term Ωq1Cq1(t) is again identically zero since

$$\Omega_{q_1}=\frac{\langle Lq_1(0),q_1(0)\rangle_{eq}}{\langle q_1(0),q_1(0)\rangle_{eq}}=0.$$

Theorem 3.8 provides the following computable upper bound for the modulus of K(t):

$$|K(t)|\le\frac{\big\|\dot q_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|q_1(0)\big\|^2_{L^2_{\rho_{eq}}}}=\frac{\big\|p_1(0)\big\|^2_{L^2_{\rho_{eq}}}}{\big\|q_1(0)\big\|^2_{L^2_{\rho_{eq}}}}=\frac{e^{1/4}K_0\!\left(\tfrac14\right)}{\sqrt{\pi}\,U\!\left(\tfrac12,0,\tfrac12\right)}\simeq1.39786,$$
(89)

where U(a, b, y) is the confluent hypergeometric function of the second kind. In Fig. 2(a), we plot the correlation function Cq1(t) we obtained by sampling (85) and then averaging over all realizations. The samples of the initial condition are generated from the Gibbs equilibrium distribution by using Markov Chain Monte Carlo (MCMC). In Fig. 2(b), we plot the memory kernel K(t) we computed numerically based on Cq1(t). To compute such a kernel, we inverted numerically the Laplace transform of (88), i.e.,

$$K(t)=\mathcal{L}^{-1}\!\left[-s+\frac{1}{\hat C(s)}\right]\!(t),$$
(90)

where Ĉ(s)=L[Cq1(t)]. In practice, we replaced Cq1(t) within the time interval [0, 20] with a high-order polynomial interpolant at Gauss-Chebyshev-Lobatto nodes, computed the Laplace transform of such polynomial analytically, and then computed the inverse Laplace transform (90) numerically with the Talbot algorithm.1 
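The constant in (89) can be reproduced numerically: with β = 1, $\langle p_1^2\rangle_{eq}=1$, and integrating q1 out of $e^{-H}$ analytically reduces $\langle q_1^2\rangle_{eq}$ to a ratio of one-dimensional integrals over q2. The sketch below relies on this reduction:

```python
import numpy as np
from scipy.integrate import quad

# For the Hamiltonian (84) with beta = 1, <p1^2>_eq = 1, and integrating q1 out
# of exp(-H) reduces <q1^2>_eq to a ratio of one-dimensional integrals over q2.
num, _ = quad(lambda x: (1.0 + x*x)**-1.5 * np.exp(-x*x/2), -np.inf, np.inf)
den, _ = quad(lambda x: (1.0 + x*x)**-0.5 * np.exp(-x*x/2), -np.inf, np.inf)
q1_sq = num/den                       # <q1^2>_eq
bound = 1.0/q1_sq                     # |K(t)| <= <p1^2>_eq / <q1^2>_eq, Eq. (89)
assert abs(bound - 1.39786) < 1e-3
```

The quadrature agrees with the closed form in (89) involving $K_0$ and the confluent hypergeometric function U.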

FIG. 2.

Hald Hamiltonian system (85). (a) Autocorrelation function of the displacement q1(t) and (b) memory kernel of the governing MZ equation. Here Cq1(t) is computed by Markov chain Monte-Carlo (MCMC) while K(t) is determined by inverting numerically the Laplace transform in (90) with the Talbot algorithm. It is seen that the theoretical upper bound (89) (dashed line) is of the same order of magnitude as the memory kernel.


In this section, we study the accuracy of the t-model, the H-model, and the Ht model in predicting scalar quantities of interest in non-Hamiltonian systems. In particular, we consider the MZ formulation with Chorin’s projection operator. For the particular case of linear dynamical systems, we also compute the theoretical upper bounds we obtained in Sec. III E for the memory growth and the error in the H-model and compare such bounds with exact results.

1. Linear dynamical systems

We begin by considering a low-dimensional linear dynamical system ẋ = Ax evolving from a random initial state with density ρ0(x) to verify the MZ memory estimates we obtained in Sec. III E. For simplicity, we choose A to be negative definite,

$$A=e^{-C}Be^{C},\qquad B=-\begin{bmatrix}\tfrac18 & 0 & 0\\ 0 & \tfrac23 & 0\\ 0 & 0 & \tfrac12\end{bmatrix},\qquad C=\begin{bmatrix}0 & 1 & 0\\ -1 & 0 & 1\\ 0 & -1 & 0\end{bmatrix}.$$
(91)

In this case, the origin of the phase space is a stable node and it is easy to estimate $\|e^{tL}\|_{L^2_{\rho_0}}$. We set x1(0) = 1 and take x2(0), x3(0) to be independent standard normal random variables. In this setting, the semigroup estimates (61) and (62) are explicit,

$$\big\|e^{tL}\big\|\le e^{t\omega},\quad \omega=-\tfrac12\operatorname{Tr}(A)=0.6458,\qquad \big\|e^{tLQ}\big\|\le e^{t\omega_Q},\quad \omega_Q=\omega+\sqrt{A_{11}^2+\sum_{i=2}^N A_{1i}^2\,\big\langle x_i^2(0)\big\rangle}=1.1621.$$

Therefore, we obtain the following explicit upper bounds for the memory integral and the error of the H-model [see Eqs. (63) and (65)]:

$$|w_0(t)|\le0.1964\left(e^{1.1621\,t}-e^{0.6458\,t}\right),$$
(92)
$$\big|w_0(t)-w_0^n(t)\big|\le e^{1.1621\,t}\left[\big(b^T(M_{11}^T)^{n}a\big)^2x_1^2(0)+\big\|(M_{11}^T)^{n+1}a\big\|_2^2\right]^{1/2}\frac{t^{n+1}}{(n+1)!}.$$
(93)

Next, we compare these error bounds with results obtained by solving the H-model (64) numerically. For example, the second-order H-model reads

ddtE[x1(t)|x1(0)]=0.4560E[x1(t)|x1(0)]+w02(t),dw02(t)dt=0.0586E[x1(t)|x1(0)]+w12(t),dw12(t)dt=0.0192E[x1(t)|x1(0)].
(94)

In Fig. 3, we demonstrate the convergence of the H-model to the benchmark solution computed by Monte-Carlo simulation as we increase the H-model differentiation order. In Fig. 4, we plot the bound on the memory growth [Eq. (92)] and the bound on the memory error [Eq. (93)] together with exact results.

FIG. 3.

Convergence of the H-model for the linear dynamical system with matrix (91). The benchmark solution is computed with Monte-Carlo (MC) simulation. Also, the zero-order H-model represents the Markovian approximation to the MZ equation, i.e., the MZ equation without the memory term.

FIG. 4.

Linear dynamical system with matrix (91). In (a), we plot the memory term w0(t) we obtain from Monte Carlo simulation together with the estimated upper bound (92). In (b) and (c), we plot the H-model approximation error |w0(T)w0n(T)| together with the upper bound (93) for different differentiation orders n and at different times t.


Remark.
The results we just obtained can be obviously extended to higher-dimensional linear dynamical systems. In Fig. 5, we plot the benchmark conditional mean path we obtained through Monte Carlo simulation together with the solution of the H-model (64) for the 100-dimensional linear dynamical system defined by the matrix (N = 100)
A=11(1)N1B1,
(95)
where B = eCΛeC and
$$\Lambda=-\operatorname{diag}\!\left(\frac18,\ \frac29,\ \ldots,\ \frac{N-1}{N+6}\right),$$
and C is the N × N tridiagonal matrix with the same structure as in (91).
It is seen that the H-model converges as we increase the differentiation order in any finite time interval, in agreement with the theoretical prediction of Sec. III E.

FIG. 5.

Linear dynamical system with matrix A (95). Convergence of the H-model to the conditional mean path solution E[x1(t)|x1(0)]. The initial condition is set as x1(0) = 3, while {x2(0), , x100(0)} are i.i.d. normals.


2. Nonlinear dynamical systems

The hierarchical memory approximation method we discussed in Sec. III D can be applied to nonlinear dynamical systems in the form (1). As we will see, if we employ the Ht-model, then the nonlinearity introduces a closure problem that needs to be addressed properly.

a. Lorenz-63 system.

Consider the classical Lorenz-63 model

$$\dot x_1=\sigma(x_2-x_1),\qquad \dot x_2=x_1(r-x_3)-x_2,\qquad \dot x_3=x_1x_2-\beta x_3,$$
(96)

where σ = 10 and β = 8/3. The phase space Liouville operator for this ordinary differential equation (ODE) is

$$L=\sigma(x_2-x_1)\frac{\partial}{\partial x_1}+\big(x_1(r-x_3)-x_2\big)\frac{\partial}{\partial x_2}+\big(x_1x_2-\beta x_3\big)\frac{\partial}{\partial x_3}.$$

We choose the resolved variables to be $\hat x=\{x_1,x_2\}$ and aim at formally integrating out $\tilde x=x_3$ by using the Mori-Zwanzig formalism. To this end, we set $x_3(0)\sim N(0,1)$ and consider the zeroth-order Ht-model (t-model)

$$\frac{dx_1^m}{dt}=\sigma\big(x_2^m-x_1^m\big),\qquad \frac{dx_2^m}{dt}=r\,x_1^m-x_2^m-t\,(x_1^m)^2x_2^m,$$
(97)

where x1m(t)=E[x1(t)|x1(0),x2(0)] and x2m(t)=E[x2(t)|x1(0),x2(0)] are the conditional mean paths. To obtain this system, we introduced the following mean field closure approximation:

$$t\,Pe^{tL}PLQLx_2(0)=-t\,\mathbb{E}\big[x_1(t)^2x_2(t)\,\big|\,x_1(0),x_2(0)\big]\simeq-t\,\mathbb{E}\big[x_1(t)\,\big|\,x_1(0),x_2(0)\big]^2\,\mathbb{E}\big[x_2(t)\,\big|\,x_1(0),x_2(0)\big]=-t\,(x_1^m)^2x_2^m.$$
(98)

Higher-order Ht-models can be derived based on (98). As is well known, if r < 1, the fixed point (0, 0, 0) is a global attractor and exponentially stable. In this case, the t-model (zeroth-order Ht-model) yields an accurate prediction of the conditional mean path for long times (see Fig. 6). On the other hand, if we consider the chaotic regime at r = 28, then the t-model and its higher-order extension, the Ht-model, are accurate only for relatively short times. This is in agreement with our theoretical predictions. In fact, in contrast with linear systems, where the hierarchical representation of the memory integral can be proven to converge over long times, for nonlinear systems the memory hierarchy is, in general, provably convergent only over a short time period (Theorem 3.7 and Corollary 3.4.3). This does not mean that the H-model or the Ht-model is inaccurate for nonlinear systems. It just means that the accuracy depends on the system, the quantity of interest, and the initial condition.
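As an illustration, the t-model (97) can be compared with a brute-force Monte Carlo estimate of the conditional mean path. The sketch below assumes the stable regime r = 0.5, the standard Lorenz-63 sign conventions, and the mean-field closure (98); the tolerance is deliberately loose, since both the closure and the sampling introduce errors:

```python
import numpy as np

sigma, beta, r = 10.0, 8.0/3.0, 0.5          # stable regime (r < 1)
x1_0, x2_0, T, dt, M = 1.0, 1.0, 2.0, 1e-3, 10000
rng = np.random.default_rng(0)

def rk4_step(f, t, y, dt):
    # one classical Runge-Kutta step for dy/dt = f(t, y)
    k1 = f(t, y); k2 = f(t + dt/2, y + dt/2*k1)
    k3 = f(t + dt/2, y + dt/2*k2); k4 = f(t + dt, y + dt*k3)
    return y + dt/6*(k1 + 2*k2 + 2*k3 + k4)

def lorenz(t, x):
    # full Lorenz-63 system (96); x has shape (3, M) for vectorized sampling
    return np.array([sigma*(x[1] - x[0]),
                     x[0]*(r - x[2]) - x[1],
                     x[0]*x[1] - beta*x[2]])

def tmodel(t, y):
    # zeroth-order Ht-model (t-model), Eq. (97), with mean-field closure (98)
    x1, x2 = y
    return np.array([sigma*(x2 - x1), r*x1 - x2 - t*x1**2*x2])

# Monte Carlo reference: average the full system over x3(0) ~ N(0, 1)
x = np.vstack([np.full(M, x1_0), np.full(M, x2_0), rng.standard_normal(M)])
y = np.array([x1_0, x2_0])
for k in range(int(T/dt)):
    t = k*dt
    x = rk4_step(lorenz, t, x, dt)
    y = rk4_step(tmodel, t, y, dt)

assert abs(y[0] - x[0].mean()) < 0.1
assert abs(y[1] - x[1].mean()) < 0.1
```

In the chaotic regime (r = 28), the same comparison degrades quickly with time, consistent with the short-time convergence results discussed above.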

FIG. 6.

Accuracy of the Ht model in representing the conditional mean path in the Lorenz-63 system (96). It is seen that if r = 0.5 (first row), then the zeroth-order Ht-model, i.e., the t-model is accurate for long integration times. On the other hand, if we consider the chaotic regime at r = 28 (second row), then we see that the t-model and its high-order extension (Ht-model) are accurate only for relatively short time.

b. Modified Lorenz-96 system.

As an example of a high-dimensional nonlinear dynamical system, we consider the following modified Lorenz-96 system:22,26

$$\dot x_1=-x_1+x_1x_2+F,\qquad \dot x_2=-x_2+x_1x_3+F,\qquad \dot x_i=-x_i+(x_{i+1}-x_{i-2})x_{i-1}+F\ \ (3\le i\le N-1),\qquad \dot x_N=-x_N-x_{N-2}x_{N-1}+F,$$
(99)

where F is constant. As is well known, depending on the values of N and F, this system can exhibit a wide range of behaviors.22 Suppose we take the resolved variables to be x^={x1,x2}. Correspondingly, the unresolved ones, i.e., those we aim at integrating through the MZ framework, are x̃={x3,,xN}, which we set to be independent standard normal random variables. By using the mean field approximation (98), we obtain the following zeroth-order Ht-model (t-model) of the modified Lorenz-96 system (99):

$$\dot x_1^m=-x_1^m+x_1^mx_2^m+F,\qquad \dot x_2^m=-x_2^m+F-t\big[(x_1^m)^2x_2^m-x_1^mF\big].$$
(100)

In Fig. 7, we study the accuracy of the Ht-model in representing the conditional mean path with F = 5 and N = 100. It is seen that the Ht-model converges only for short times (in agreement with the theoretical predictions) and provides results that are more accurate than the classical t-model.

FIG. 7.

Accuracy of the Ht-model in representing the conditional mean path in the modified Lorenz-96 system (99). Here we set F = 5 and N = 100. It is seen that the Ht-model converges only for short times and provides results that are more accurate than the classical t-model.


In this paper, we developed a thorough mathematical analysis to deduce conditions for the accuracy and convergence of different approximations of the memory integral in the Mori-Zwanzig (MZ) equation. In particular, we studied the short memory approximation, the t-model, and various hierarchical memory approximation techniques. We also derived useful upper bounds for the MZ memory integral which allowed us to estimate a priori its contribution to the dynamics. Such upper bounds are easily computable for systems with finite-rank projections—such as Mori’s projection. Furthermore, if the system is Hamiltonian, then the MZ memory bounds we derived are tight. This can be attributed to the fact that the Koopman operator of a Hamiltonian dynamical system is always a contraction relative to the Lρeq2 norm weighted by the Gibbs canonical distribution ρeq. In the more general case of dissipative systems, however, we do not have any guarantee that the MZ memory bounds we obtained are tight. Indeed, for dissipative systems, the numerical abscissa in Eq. (17) [see also Eqs. (21a) and (21b)] can be positive, yielding an exponential growth of the memory bounds. In the presence of infinite-rank projections—such as Chorin’s projection—we found that it is extremely difficult to obtain explicitly computable upper bounds for the MZ memory or any of its approximations discussed in this paper. The main difficulties are related to the lack of an explicitly computable upper bound for the operator semigroup etLQ, which involves the unbounded generator LQ (see the discussion in Sec. III A and  Appendix A). To demonstrate the accuracy of the approximation methods and error bounds we developed in this paper, we provided numerical examples involving Hamiltonian dynamical systems (linear and nonlinear) and more general dissipative nonlinear systems evolving from random initial states. The numerical results we obtained are found to be in agreement with our theoretical predictions.

This work was supported by the Air Force Office of Scientific Research Grant No. FA9550-16-586-1-0092.

In looking for the numerical abscissa (Ref. 14) (i.e., the logarithmic norm) of $\mathcal{LQ}$, we seek to bound

$$\sup_{\substack{x\in D(\mathcal{LQ})\\ x\neq 0}}\frac{\Re\langle x,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}.$$

Notice that if $\mathcal{PLQ}=0$, then $\mathcal{LQ}=\mathcal{QLQ}$, so that we have the previously proven bound

$$\sup\frac{\Re\langle x,\mathcal{QLQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}\le-\frac{1}{2}\inf\operatorname{div}_\sigma(F).$$

In this section, we consider what happens when $\mathcal{PLQ}\neq 0$. To that end, note that $x\in D(\mathcal{LQ})$ may be decomposed as

$$x=\mathcal{Q}x+\alpha\,\mathcal{PLQ}x+\mathcal{P}y,$$

where $\alpha\in\mathbb{C}$ and $\mathcal{P}y$ is orthogonal to $\mathcal{PLQ}x$. In other words, we define $\mathcal{P}y$ as

$$\mathcal{P}y=\mathcal{P}x-\frac{\langle\mathcal{PLQ}x,\mathcal{P}x\rangle_\sigma}{\langle\mathcal{PLQ}x,\mathcal{PLQ}x\rangle_\sigma}\,\mathcal{PLQ}x$$

and define $\alpha$ as

$$\alpha=\frac{\langle\mathcal{PLQ}x,\mathcal{P}x\rangle_\sigma}{\langle\mathcal{PLQ}x,\mathcal{PLQ}x\rangle_\sigma}.$$

Then

$$\frac{\Re\langle x,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}=\frac{\Re\langle\mathcal{Q}x+\alpha\,\mathcal{PLQ}x+\mathcal{P}y,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}=\frac{\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma+\Re(\alpha)\|\mathcal{PLQ}x\|_\sigma^2}{\|\mathcal{Q}x\|_\sigma^2+|\alpha|^2\|\mathcal{PLQ}x\|_\sigma^2+\|\mathcal{P}y\|_\sigma^2}\le\max\left(0,\;\frac{\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma+\Re(\alpha)\|\mathcal{PLQ}x\|_\sigma^2}{\|\mathcal{Q}x\|_\sigma^2+|\alpha|^2\|\mathcal{PLQ}x\|_\sigma^2}\right).$$

Since we assume $\mathcal{PLQ}\neq 0$, there exists $x$ such that $\mathcal{PLQ}x\neq 0$, and then for any $\alpha$ such that

$$\Re(\alpha)\ge-\frac{\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma}{\|\mathcal{PLQ}x\|_\sigma^2},$$

we have

$$\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma+\Re(\alpha)\|\mathcal{PLQ}x\|_\sigma^2\ge 0,$$

so that for $\mathcal{PLQ}\neq 0$,

$$\sup_{\substack{x\in D(\mathcal{LQ})\\ x\neq 0}}\frac{\Re\langle x,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}=\sup_{\substack{x\in D(\mathcal{LQ})\\ x\neq 0}}\frac{\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma+\Re(\alpha)\|\mathcal{PLQ}x\|_\sigma^2}{\|\mathcal{Q}x\|_\sigma^2+|\alpha|^2\|\mathcal{PLQ}x\|_\sigma^2}.$$

Now, fix any $\mathcal{Q}x\in D(\mathcal{L})\setminus\{0\}$ and consider the expression

$$\frac{\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma+\Re(\alpha)\|\mathcal{PLQ}x\|_\sigma^2}{\|\mathcal{Q}x\|_\sigma^2+|\alpha|^2\|\mathcal{PLQ}x\|_\sigma^2}=\frac{\xi+\Re(\alpha)\beta^2}{1+|\alpha|^2\beta^2},$$

where

$$\xi=\xi(\mathcal{Q}x)=\frac{\Re\langle\mathcal{Q}x,\mathcal{LQ}x\rangle_\sigma}{\|\mathcal{Q}x\|_\sigma^2},\qquad\beta=\beta(\mathcal{Q}x)=\frac{\|\mathcal{PLQ}x\|_\sigma}{\|\mathcal{Q}x\|_\sigma}.$$

Then, for this fixed $\mathcal{Q}x$,

$$\frac{\xi+\Re(\alpha)\beta^2}{1+|\alpha|^2\beta^2}\le\max_{a\in\mathbb{R}}\frac{\xi+a\beta^2}{1+a^2\beta^2}.$$

Differentiating with respect to $a$ and setting the derivative equal to zero, we find that the latter expression is extremized when

$$0=\beta^2(1+a^2\beta^2)-2a\beta^2(\xi+a\beta^2),$$

i.e., when

$$\beta^2a^2+2\xi a-1=0,\qquad a=\frac{-\xi\pm\sqrt{\xi^2+\beta^2}}{\beta^2}.$$

Since $\beta^2>0$, the ratio $(\xi+a\beta^2)/(1+a^2\beta^2)$ is maximized at $\hat a=\left(-\xi+\sqrt{\xi^2+\beta^2}\right)/\beta^2$. Then

$$\xi+\hat a\beta^2=\sqrt{\xi^2+\beta^2},\qquad 1+\hat a^2\beta^2=2(1-\xi\hat a)=\frac{2\sqrt{\xi^2+\beta^2}\left(\sqrt{\xi^2+\beta^2}-\xi\right)}{\beta^2}$$

so that

$$\max_{a\in\mathbb{R}}\frac{\xi+a\beta^2}{1+a^2\beta^2}=\frac{\xi+\hat a\beta^2}{1+\hat a^2\beta^2}=\frac{1}{2}\,\frac{\beta^2}{\sqrt{\xi^2+\beta^2}-\xi}=\frac{1}{2}\left(\sqrt{\xi^2+\beta^2}+\xi\right).$$

Therefore,

$$\sup_{\substack{x\in D(\mathcal{LQ})\\ x\neq 0}}\frac{\Re\langle x,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}=\sup_{\substack{\mathcal{Q}x\in D(\mathcal{L})\\ \mathcal{Q}x\neq 0}}\frac{1}{2}\left(\sqrt{\xi^2(\mathcal{Q}x)+\beta^2(\mathcal{Q}x)}+\xi(\mathcal{Q}x)\right).$$
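In the finite-dimensional case (a matrix generator with the standard inner product), the supremum above reduces to the classical numerical abscissa: the largest eigenvalue of the symmetric part of the matrix. The following pure-Python sketch (the $2\times 2$ matrix `A` is a hypothetical example, not taken from the paper) checks the resulting semigroup bound $\|e^{tA}\|\le e^{t\,\omega(A)}$:

```python
import math

# Numerical abscissa (logarithmic norm) of a real 2x2 matrix A:
# omega(A) = largest eigenvalue of the symmetric part (A + A^T)/2.
# It yields the one-sided semigroup bound ||exp(tA)|| <= exp(t*omega(A)).

def sym_eig_max(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) + math.sqrt((0.5 * (a - c)) ** 2 + b ** 2)

def numerical_abscissa(A):
    H = [[0.5 * (A[i][j] + A[j][i]) for j in range(2)] for i in range(2)]
    return sym_eig_max(H[0][0], H[0][1], H[1][1])

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, t, terms=40):
    """Taylor series for exp(tA); adequate for small ||tA||."""
    E = [[1.0, 0.0], [0.0, 1.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        P = mat_mul(P, [[t * x / k for x in row] for row in A])
        E = [[E[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return E

def spectral_norm(X):
    G = mat_mul([[X[0][0], X[1][0]], [X[0][1], X[1][1]]], X)  # X^T X
    return math.sqrt(sym_eig_max(G[0][0], G[0][1], G[1][1]))

A = [[-1.0, 3.0], [0.0, -2.0]]   # non-normal: stable spectrum, omega(A) > 0
omega = numerical_abscissa(A)
for t in [0.1, 0.5, 1.0]:
    assert spectral_norm(expm(A, t)) <= math.exp(t * omega) + 1e-9
```

Note that the eigenvalues of this `A` are negative while $\omega(A)>0$, which is exactly the transient-growth situation the abscissa is designed to control.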

When $\mathcal{PLQ}$ is unbounded, which is the typical case when $\mathcal{P}$ is an infinite-rank projection (such as most conditional expectations), there is unlikely to be a finite numerical abscissa for $\mathcal{LQ}$. In particular, notice that if $\operatorname{div}_\sigma(F)$ is a bounded function (bounded both above and below), then $\xi(\mathcal{Q}x)$ is bounded for all $\mathcal{Q}x$ while $\beta(\mathcal{Q}x)$ is unbounded, in which case

$$\sup_{\substack{x\in D(\mathcal{LQ})\\ x\neq 0}}\frac{\Re\langle x,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}=\infty.$$

It follows from Refs. 37 and 30 that in these cases, $\|e^{t\mathcal{LQ}}\|_\sigma$ has an infinite slope at $t=0$, and therefore there is no finite $\omega$ such that $\|e^{t\mathcal{LQ}}\|_\sigma\le e^{\omega t}$ for all $t\ge 0$ (see Ref. 14). Assuming that $\mathcal{LQ}$ still generates a strongly continuous semigroup, we must look to bound the semigroup as $\|e^{t\mathcal{LQ}}\|_\sigma\le Me^{\omega t}$, where

$$\omega>\omega_0=\lim_{t\to\infty}\frac{\ln\|e^{t\mathcal{LQ}}\|_\sigma}{t}$$

and

$$M\equiv M(\omega)=\sup\left\{\|e^{t\mathcal{LQ}}\|_\sigma e^{-\omega t}:t\ge 0\right\}.$$

On the other hand, if $\mathcal{PLQ}$ is a bounded operator, for example, when $\mathcal{P}$ is a finite-rank projection [e.g., Mori's projection (12)], there exists a finite value for the numerical abscissa. Indeed, in this case, since $\xi$ is bounded by $\omega=-\inf\operatorname{div}_\sigma(F)$, the numerical abscissa of $\mathcal{LQ}$ may be bounded as

$$\omega_{\mathcal{LQ}}=\sup_{\substack{x\in D(\mathcal{LQ})\\ x\neq 0}}\frac{\Re\langle x,\mathcal{LQ}x\rangle_\sigma}{\langle x,x\rangle_\sigma}\le\frac{1}{2}\left(\sqrt{\omega^2+\|\mathcal{PLQ}\|_\sigma^2}+\omega\right).\tag{A1}$$

Alternatively, in the case of finite-rank $\mathcal{P}$, the operator $\mathcal{LQ}$ may be thought of as a bounded perturbation of $\mathcal{L}$, i.e., $\mathcal{LQ}=\mathcal{L}-\mathcal{LP}$, and the numerical abscissa of $\mathcal{LQ}$ can be bounded using the bounded perturbation theorem (Ref. 16, III.1.3), obtaining

$$\omega_{\mathcal{LQ}}\le\omega+\|\mathcal{LP}\|_\sigma.\tag{A2}$$

Either of these bounds for $\omega_{\mathcal{LQ}}$ can be used to bound the semigroup norm,

$$\|e^{t\mathcal{LQ}}\|_\sigma\le e^{\omega_{\mathcal{LQ}}t}\le e^{\frac{1}{2}\left(\sqrt{\omega^2+\|\mathcal{PLQ}\|_\sigma^2}+\omega\right)t},\qquad\|e^{t\mathcal{LQ}}\|_\sigma\le e^{\omega_{\mathcal{LQ}}t}\le e^{\left(\omega+\|\mathcal{LP}\|_\sigma\right)t}.$$

Which of these two estimates gives the tighter bound will generally depend on the values of $\|\mathcal{PLQ}\|_\sigma$ and $\|\mathcal{LP}\|_\sigma$. It may be noted, however, that when $\sigma$ is invariant, $\omega=0$ and $\mathcal{L}$ is skew-adjoint, so that

$$\|\mathcal{PLQ}\|_\sigma=\|\mathcal{QL}^{*}\mathcal{P}\|_\sigma=\|\mathcal{QLP}\|_\sigma\le\|\mathcal{LP}\|_\sigma,$$

and therefore the bound in (A1) is half that of (A2).
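This factor-of-two comparison is easy to verify numerically in a small matrix model. A minimal sketch, under the assumptions that the generator is a skew-symmetric $2\times 2$ matrix and the projection has rank one (both hypothetical choices, not from the paper):

```python
import math

# Compare the abscissa bounds (A1) and (A2) for a skew-symmetric generator
# (sigma invariant, omega = 0) with a rank-1 orthogonal projection.
a = 3.0
L = [[0.0, a], [-a, 0.0]]       # skew-symmetric generator
P = [[1.0, 0.0], [0.0, 0.0]]    # rank-1 orthogonal projection
Q = [[0.0, 0.0], [0.0, 1.0]]    # Q = I - P

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_norm(X):
    # sqrt of the largest eigenvalue of X^T X (2x2 closed form)
    G = mat_mul([[X[0][0], X[1][0]], [X[0][1], X[1][1]]], X)
    tr = G[0][0] + G[1][1]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return math.sqrt(0.5 * tr + math.sqrt(max(0.25 * tr * tr - det, 0.0)))

norm_PLQ = spectral_norm(mat_mul(mat_mul(P, L), Q))   # ||PLQ|| = |a|
norm_LP = spectral_norm(mat_mul(L, P))                # ||LP||  = |a|
omega = 0.0                                           # sigma invariant
bound_A1 = 0.5 * (math.sqrt(omega ** 2 + norm_PLQ ** 2) + omega)
bound_A2 = omega + norm_LP
assert abs(bound_A1 - 0.5 * bound_A2) < 1e-12         # (A1) is half of (A2)
```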

It was shown in Ref. 15 that the Banach dual of (4) defines an evolution in the probability density function space. Specifically, the joint probability density function of the state vector $u(t)$ that solves Eq. (1) is pushed forward by the Frobenius-Perron operator $\mathcal{F}(t,0)$ [the Banach dual of the Koopman operator (3)],

$$p(x,t)=\mathcal{F}(t,s)p(x,s),\qquad\mathcal{F}(t,s)=e^{(t-s)\mathcal{M}},\tag{B1}$$

where

$$\mathcal{M}(x)p(x,t)=-\nabla\cdot\left(F(x)p(x,t)\right).\tag{B2}$$

By introducing a projection $\mathcal{P}$ in the space of probability density functions and its complement $\mathcal{Q}=\mathcal{I}-\mathcal{P}$, it is easy to show that the projected density $\mathcal{P}p$ satisfies the MZ equation (Ref. 42)

$$\frac{\partial\mathcal{P}p(t)}{\partial t}=\mathcal{PM}\mathcal{P}p(t)+\mathcal{PM}e^{t\mathcal{QM}}\mathcal{Q}p(0)+\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds.\tag{B3}$$

In Appendixes B 1 and B 2, we perform an analysis of different types of approximations of the MZ memory integral

$$\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds.\tag{B4}$$

The main objective of this analysis is to establish rigorous error bounds for widely used approximation methods and to propose new provably convergent approximation schemes.

1. Analysis of the memory integral

In this section, we develop the analysis of the memory integral arising in the PDF formulation of the MZ equation. The starting point is definition (B4). As before, we begin with the following upper-bound estimate of the integral:

Theorem B.1
(Memory growth). Let $e^{t\mathcal{MQ}}$ and $e^{t\mathcal{M}}$ be strongly continuous semigroups with upper bounds $\|e^{t\mathcal{M}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{MQ}}\|\le M_{\mathcal{Q}}e^{t\omega_{\mathcal{Q}}}$, and let $T>0$ be a fixed integration time. Then for any $0\le t\le T$, we have
$$\left\|\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds\right\|\le N_0(t),$$
where
$$N_0(t)=\begin{cases}tC_4M_{\mathcal{Q}},&\omega_{\mathcal{Q}}=0,\\[2pt]\dfrac{C_4M_{\mathcal{Q}}}{\omega_{\mathcal{Q}}}\left(e^{t\omega_{\mathcal{Q}}}-1\right),&\omega_{\mathcal{Q}}\neq 0\end{cases}$$
and $C_4=\max_{0\le s\le T}\|\mathcal{MQ}\mathcal{M}\mathcal{P}p(s)\|$. Moreover, $N_0(t)$ satisfies $\lim_{t\to 0}N_0(t)=0$.

Proof.
Using the identity $\mathcal{M}e^{t\mathcal{QM}}=e^{t\mathcal{MQ}}\mathcal{M}$, consider
$$\left\|\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds\right\|=\left\|\int_0^t\mathcal{P}e^{(t-s)\mathcal{MQ}}\mathcal{MQ}\mathcal{M}\mathcal{P}p(s)\,ds\right\|\le C_4M_{\mathcal{Q}}\int_0^te^{(t-s)\omega_{\mathcal{Q}}}\,ds=\begin{cases}tC_4M_{\mathcal{Q}},&\omega_{\mathcal{Q}}=0,\\[2pt]\dfrac{C_4M_{\mathcal{Q}}}{\omega_{\mathcal{Q}}}\left(e^{t\omega_{\mathcal{Q}}}-1\right),&\omega_{\mathcal{Q}}\neq 0,\end{cases}$$
where $C_4=\max_{0\le s\le T}\|\mathcal{MQ}\mathcal{M}\mathcal{P}p(s)\|$.
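The piecewise bound in Theorem B.1 is continuous across the two branches, since the $\omega_{\mathcal{Q}}\neq 0$ branch tends to the $\omega_{\mathcal{Q}}=0$ branch as $\omega_{\mathcal{Q}}\to 0$. A short numerical sketch with hypothetical constants $C_4$, $M_{\mathcal{Q}}$ (illustrative values only):

```python
import math

# Evaluate the memory bound N0(t) of Theorem B.1 and check that the
# omega_Q != 0 branch tends to the omega_Q = 0 branch as omega_Q -> 0.
def N0(t, C4, MQ, omegaQ):
    if omegaQ == 0.0:
        return t * C4 * MQ
    return (C4 * MQ / omegaQ) * (math.exp(t * omegaQ) - 1.0)

C4, MQ, t = 2.0, 1.5, 0.7          # hypothetical constants
assert abs(N0(t, C4, MQ, 1e-8) - N0(t, C4, MQ, 0.0)) < 1e-6
assert N0(0.0, C4, MQ, 0.3) == 0.0  # lim_{t->0} N0(t) = 0
```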

Theorem B.2
(Memory approximation via the t-model). Let $e^{t\mathcal{MQ}}$ and $e^{t\mathcal{M}}$ be strongly continuous semigroups with bounds $\|e^{t\mathcal{M}}\|\le Me^{t\omega}$ and $\|e^{t\mathcal{MQ}}\|\le M_{\mathcal{Q}}e^{t\omega_{\mathcal{Q}}}$, and let $T>0$ be a fixed integration time. If the function $k(s,t)=\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)$ (the integrand of the memory term) is at least twice differentiable with respect to $s$ for all $t\ge 0$, then
$$\left\|\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds-t\,\mathcal{PM}\mathcal{QM}\mathcal{P}p(t)\right\|\le N_1(t),$$
where $N_1(t)$ is defined as
$$N_1(t)=\begin{cases}t\left(M_{\mathcal{Q}}+1\right)C_4,&\omega_{\mathcal{Q}}=0,\\[2pt]\dfrac{C_4M_{\mathcal{Q}}}{\omega_{\mathcal{Q}}}\left(e^{t\omega_{\mathcal{Q}}}-1\right)+tC_4,&\omega_{\mathcal{Q}}\neq 0\end{cases}$$
and $C_4$ is as in Theorem B.1.

Proof.
$$\left\|\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds-t\,\mathcal{PM}\mathcal{QM}\mathcal{P}p(t)\right\|\le C_4M_{\mathcal{Q}}\int_0^te^{(t-s)\omega_{\mathcal{Q}}}\,ds+C_4t=\begin{cases}t\left(M_{\mathcal{Q}}+1\right)C_4,&\omega_{\mathcal{Q}}=0,\\[2pt]\dfrac{C_4M_{\mathcal{Q}}}{\omega_{\mathcal{Q}}}\left(e^{t\omega_{\mathcal{Q}}}-1\right)+tC_4,&\omega_{\mathcal{Q}}\neq 0.\end{cases}$$

2. Hierarchical memory approximation in PDF space

The hierarchical memory approximation methods we discussed in Sec. III D can also be developed in the PDF space. To this end, let us first define

$$v_0(t)=\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}\mathcal{QM}\mathcal{P}p(s)\,ds.\tag{B5}$$

By repeatedly differentiating $v_0(t)$ with respect to time [assuming $v_0(t)$ is smooth enough], we obtain the hierarchy of equations

$$\frac{\partial v_{i-1}(t)}{\partial t}=\mathcal{PM}(\mathcal{QM})^{i}\mathcal{P}p(t)+v_i(t),\qquad i=1,\dots,n,$$

where

$$v_i(t)=\int_0^t\mathcal{PM}e^{(t-s)\mathcal{QM}}(\mathcal{QM})^{i+1}\mathcal{P}p(s)\,ds.$$

By closely following the discussion in Sec. III D, we introduce the hierarchy of memory equations

$$\begin{aligned}\frac{dv_0^n(t)}{dt}&=\mathcal{PM}\mathcal{QM}\mathcal{P}p(t)+v_1^n(t),\\ \frac{dv_1^n(t)}{dt}&=\mathcal{PM}(\mathcal{QM})^{2}\mathcal{P}p(t)+v_2^n(t),\\ &\;\;\vdots\\ \frac{dv_{n-1}^n(t)}{dt}&=\mathcal{PM}(\mathcal{QM})^{n}\mathcal{P}p(t)+v_n^{e,n}(t)\end{aligned}\tag{B6}$$

and approximate the last term in the hierarchy in different ways. Specifically, we consider

$$\begin{aligned}v_n^{e,n}(t)&=\int_t^t\mathcal{PM}e^{(t-s)\mathcal{QM}}(\mathcal{QM})^{n+1}\mathcal{P}p(s)\,ds=0&&\text{(H-model)},\\ v_n^{e,n}(t)&=\int_{\max(0,\,t-\Delta t)}^{t}\mathcal{PM}e^{(t-s)\mathcal{QM}}(\mathcal{QM})^{n+1}\mathcal{P}p(s)\,ds&&\text{(Type-I finite-memory approximation)},\\ v_n^{e,n}(t)&=\int_{\min(t,\,t_n)}^{t}\mathcal{PM}e^{(t-s)\mathcal{QM}}(\mathcal{QM})^{n+1}\mathcal{P}p(s)\,ds&&\text{(Type-II finite-memory approximation)},\\ v_n^{e,n}(t)&=t\,\mathcal{PM}(\mathcal{QM})^{n+1}\mathcal{P}p(t)&&\text{(H}t\text{-model)}.\end{aligned}$$

Hereafter, we establish the accuracy of the approximation schemes resulting from the substitution of each $v_n^{e,n}(t)$ above into (B6).
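The three integral-based closures above differ only in the lower limit of the memory integral; the H$t$-model is not an integral but a local-in-time surrogate. The following illustrative helper (the names `dt` and `tn` stand for the $\Delta t$ and $t_n$ above; the function itself is hypothetical, not from the paper) makes the integration windows explicit:

```python
# Integration window (lower, upper) of the memory term v_n^{e,n}(t) for the
# three integral-based closures listed above. The Ht-model is excluded since
# it replaces the integral by t * PM(QM)^{n+1} P p(t).
def memory_window(scheme, t, dt=None, tn=None):
    if scheme == "H":        # H-model: empty window, v_n^{e,n} = 0
        return (t, t)
    if scheme == "FMA-I":    # Type-I: sliding window of width dt
        return (max(0.0, t - dt), t)
    if scheme == "FMA-II":   # Type-II: fixed left endpoint tn
        return (min(t, tn), t)
    raise ValueError(scheme)

assert memory_window("H", 2.0) == (2.0, 2.0)
assert memory_window("FMA-I", 2.0, dt=0.5) == (1.5, 2.0)
assert memory_window("FMA-I", 0.3, dt=0.5) == (0.0, 0.3)
assert memory_window("FMA-II", 2.0, tn=1.0) == (1.0, 2.0)
```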

Theorem B.3
(Accuracy of the H-model). Let $e^{t\mathcal{M}}$ and $e^{t\mathcal{MQ}}$ be strongly continuous semigroups, $T>0$ a fixed integration time, and
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i+1}\mathcal{M}\mathcal{P}p(s)\|}{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i}\mathcal{M}\mathcal{P}p(s)\|},\qquad 1\le i\le n.\tag{B7}$$
Then for $1\le q\le n$, we have
$$\|v_0(t)-v_0^q(t)\|\le N_2^q(t),$$
where
$$N_2^q(t)=A_2C_4\prod_{i=1}^{q}\beta_i\,\frac{t^{q+1}}{(q+1)!},$$
$A_2=\max_{s\in[0,T]}e^{s\omega_{\mathcal{Q}}}$, and $C_4$ is as in Theorem B.1.

Proof.
The error at the $q$th level can be bounded as
$$\|v_0(t)-v_0^q(t)\|\le\int_0^t\int_0^{\tau_q}\cdots\int_0^{\tau_2}\int_0^{\tau_1}\left\|\mathcal{PM}e^{(\tau_1-s)\mathcal{QM}}(\mathcal{QM})^{q+1}\mathcal{P}p(s)\right\|\,ds\,d\tau_1\cdots d\tau_q,$$
where
$$A_2=\max_{s\in[0,T]}e^{s\omega_{\mathcal{Q}}}=\begin{cases}1,&\omega_{\mathcal{Q}}\le 0,\\ e^{T\omega_{\mathcal{Q}}},&\omega_{\mathcal{Q}}\ge 0.\end{cases}\tag{B8}$$
Let
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i+1}\mathcal{M}\mathcal{P}p(s)\|}{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i}\mathcal{M}\mathcal{P}p(s)\|},\tag{B9}$$
under the assumption that these quantities are finite. Then we have
$$\|v_0(t)-v_0^q(t)\|\le A_2C_4\prod_{i=1}^{q}\beta_i\,\frac{t^{q+1}}{(q+1)!}.$$
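The factorial in the bound $N_2^q(t)=A_2C_4\prod_i\beta_i\,t^{q+1}/(q+1)!$ is what drives convergence as the hierarchy is deepened. A quick numerical sketch with hypothetical constants (chosen so that $\beta_i<(i+1)/T$, the condition of Corollary B.3.1 below) shows the bounds at $t=T$ decreasing monotonically in $q$:

```python
import math

# H-model error bound N2^q(t) = A2 * C4 * prod(beta_i) * t^(q+1) / (q+1)!
# (Theorem B.3), evaluated with hypothetical constants.
def N2(q, t, A2, C4, betas):
    prod = 1.0
    for b in betas[:q]:
        prod *= b
    return A2 * C4 * prod * t ** (q + 1) / math.factorial(q + 1)

T, A2, C4 = 2.0, 1.0, 0.8
betas = [0.9 * (i + 1) / T for i in range(1, 7)]   # beta_i < (i+1)/T
bounds = [N2(q, T, A2, C4, betas) for q in range(1, 7)]
assert all(b1 > b2 for b1, b2 in zip(bounds, bounds[1:]))
```

With this choice, each successive bound is a fixed factor $\beta_{q+1}T/(q+2)=0.9$ times the previous one, which is exactly the mechanism behind the uniform-convergence condition of Corollary B.3.1.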

Corollary B.3.1
(Uniform convergence of the H-model). If the $\beta_i$ in Theorem B.3 satisfy
$$\beta_i<\frac{i+1}{T},\qquad 1\le i\le n,$$
for any fixed time $T$, then there exists a sequence $\delta_1>\delta_2>\cdots>\delta_n$ such that
$$\|v_0(T)-v_0^q(T)\|\le\delta_q,$$
where $1\le q\le n$.

Corollary B.3.2
(Asymptotic convergence of the H-model). If the $\beta_i$ in Theorem B.3 satisfy
$$\beta_i<C,\qquad 1\le i<+\infty,$$
for some constant $C$, then for any fixed time $T$ and arbitrary $\delta>0$, there exists an integer $q$ such that for all $n>q$,
$$\|v_0(T)-v_0^n(T)\|\le\delta.$$
The proofs of Corollaries B.3.1 and B.3.2 closely follow those of Corollaries 3.4.1 and 3.4.2; we therefore omit the details.

Theorem B.4
(Accuracy of the Type-I FMA). Let $e^{t\mathcal{M}}$ and $e^{t\mathcal{MQ}}$ be strongly continuous semigroups, $T>0$ a fixed integration time, and let
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i+1}\mathcal{M}\mathcal{P}p(s)\|}{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i}\mathcal{M}\mathcal{P}p(s)\|},\qquad 1\le i\le n.\tag{B10}$$
Then for $1\le q\le n$,
$$\|v_0(t)-v_0^q(t)\|\le N_3^q(t),$$
where
$$N_3^q(t)=\begin{cases}0,&0\le t\le\Delta t_q,\\[2pt]A_2C_4\displaystyle\prod_{i=1}^{q}\beta_i\,\frac{(t-\Delta t_q)^{q+1}}{(q+1)!},&t\ge\Delta t_q\end{cases}$$
and $C_4$ is as in Theorem B.1.

Proof.
The proof is very similar to the proof of Theorem 3.5. We begin with the estimate of $\|v_0(t)-v_0^q(t)\|$,
$$\|v_0(t)-v_0^q(t)\|\le\int_0^{\max(0,\,t-\Delta t_q)}\int_0^{\tilde\tau_q}\cdots\int_0^{\tilde\tau_2}\int_0^{\tilde\tau_1}\left\|\mathcal{P}e^{(\tilde\tau_1+\Delta t_q-s)\mathcal{MQ}}(\mathcal{MQ})^{q+1}\mathcal{M}\mathcal{P}p(s)\right\|\,ds\,d\tilde\tau_1\cdots d\tilde\tau_q=\begin{cases}0,&0\le t\le\Delta t_q,\\[2pt]\displaystyle\int_0^{t-\Delta t_q}\int_0^{\sigma}\frac{(t-\Delta t_q-\sigma)^{q-1}}{(q-1)!}\left\|\mathcal{P}e^{(\sigma+\Delta t_q-s)\mathcal{MQ}}(\mathcal{MQ})^{q+1}\mathcal{M}\mathcal{P}p(s)\right\|\,ds\,d\sigma,&t\ge\Delta t_q.\end{cases}$$
This can be bounded by following the technique in the proof of Theorem B.3, which yields
$$\|v_0(t)-v_0^q(t)\|\le\begin{cases}0,&0\le t\le\Delta t_q,\\[2pt]A_2C_4\displaystyle\prod_{i=1}^{q}\beta_i\,\frac{(t-\Delta t_q)^{q+1}}{(q+1)!},&t\ge\Delta t_q.\end{cases}\tag{B11}$$

Corollary B.4.1
(Uniform convergence of the Type-I FMA). If the $\beta_i$ in Theorem B.4 satisfy
$$\beta_i<(i+1)\left(\frac{\delta\,i!}{C_4A_2\prod_{k=1}^{i-1}\beta_k}\right)^{-1/i},\tag{B12}$$
then for any $\delta>0$, there exists an ordered time sequence $\Delta t_n<\Delta t_{n-1}<\cdots<\Delta t_1<T$ such that
$$\|v_0(T)-v_0^q(T)\|\le\delta,\qquad 1\le q\le n,$$
which satisfies
$$\Delta t_q\ge T-\left(\frac{\delta\,(q+1)!}{C_4A_2\prod_{i=1}^{q}\beta_i}\right)^{\frac{1}{q+1}}.$$
The proof is very similar to that of Corollary 3.5.1, and we therefore omit it.

Theorem B.5
(Accuracy of the Type-II FMA). Let $e^{t\mathcal{M}}$ and $e^{t\mathcal{MQ}}$ be strongly continuous semigroups and $T>0$ a fixed integration time. Set
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i+1}\mathcal{M}\mathcal{P}p(s)\|}{\sup_{s\in[0,T]}\|(\mathcal{MQ})^{i}\mathcal{M}\mathcal{P}p(s)\|},\qquad 1\le i\le n.\tag{B13}$$
Then for $1\le q\le n$,
$$\|v_0(t)-v_0^q(t)\|\le N_4^q(t),$$
where
$$N_4^q(t)=\frac{C_4}{\omega_{\mathcal{Q}}}\left(1-e^{-t_q\omega_{\mathcal{Q}}}\right)\prod_{i=1}^{q}\beta_i\,f_q(\omega_{\mathcal{Q}},t),$$
$f_q(\omega_{\mathcal{Q}},t)$ is defined in (48), and $C_4$ is as in Theorem B.1.

Proof.
The proof is very similar to the proof of Theorem 3.6. Hereafter, we provide the proof for the case $\omega_{\mathcal{Q}}>0$; the other cases can easily be obtained by the same method. First of all, we have the error estimate
$$\|v_0(t)-v_0^q(t)\|\le\int_0^t\int_0^{\tau_q}\cdots\int_0^{\tau_2}\int_0^{t_q}\left\|\mathcal{PM}e^{(\tau_1-s)\mathcal{QM}}(\mathcal{QM})^{q+1}\mathcal{P}p(s)\right\|\,ds\,d\tau_1\cdots d\tau_q\le C_4\prod_{i=1}^{q}\beta_i\int_0^t\int_0^{\tau_q}\cdots\int_0^{\tau_2}\int_0^{t_q}e^{(\tau_1-s)\omega_{\mathcal{Q}}}\,ds\,d\tau_1\cdots d\tau_q=\frac{C_4}{\omega_{\mathcal{Q}}}\left(1-e^{-t_q\omega_{\mathcal{Q}}}\right)\prod_{i=1}^{q}\beta_i\,f_q(\omega_{\mathcal{Q}},t),$$
where $f_q(\omega_{\mathcal{Q}},t)$ is as in (48).

Corollary B.5.1
(Uniform convergence of the Type-II FMA). If the $\beta_i$ in Theorem B.5 satisfy
$$\beta_i<\frac{i}{T}\quad(\omega_{\mathcal{Q}}=0),\qquad\beta_i<\omega_{\mathcal{Q}}\,\frac{e^{T\omega_{\mathcal{Q}}}-\sum_{k=0}^{i-2}\frac{(T\omega_{\mathcal{Q}})^k}{k!}}{e^{T\omega_{\mathcal{Q}}}-\sum_{k=0}^{i-1}\frac{(T\omega_{\mathcal{Q}})^k}{k!}}\quad(\omega_{\mathcal{Q}}\neq 0)$$
for all $t\in[0,T]$, then for arbitrarily small $\delta>0$, there exists an ordered time sequence $0<t_1<t_2<\cdots<t_n<T$ such that
$$\|v_0(T)-v_0^q(T)\|\le\delta,\qquad 1\le q\le n,$$
which satisfies
$$t_q\le-\frac{1}{\omega_{\mathcal{Q}}}\ln\left(1-\frac{\delta\,\omega_{\mathcal{Q}}^{q+1}}{C_4\prod_{i=1}^{q}\beta_i\left(e^{T\omega_{\mathcal{Q}}}-\sum_{k=0}^{q-1}\frac{(T\omega_{\mathcal{Q}})^k}{k!}\right)}\right).$$

Proof.
To ensure that $\|v_0(t)-v_0^q(t)\|\le\delta$ for all $0\le t\le T$, we can take (for $\omega_{\mathcal{Q}}>0$)
$$\frac{C_4}{\omega_{\mathcal{Q}}}\left(1-e^{-t_q\omega_{\mathcal{Q}}}\right)\prod_{i=1}^{q}\beta_i\,f_q(\omega_{\mathcal{Q}},T)=\max_{t\in[0,T]}\frac{C_4}{\omega_{\mathcal{Q}}}\left(1-e^{-t_q\omega_{\mathcal{Q}}}\right)\prod_{i=1}^{q}\beta_i\,f_q(\omega_{\mathcal{Q}},t)\le\delta.$$
Therefore
$$e^{-t_q\omega_{\mathcal{Q}}}\ge 1-\frac{\delta\,\omega_{\mathcal{Q}}}{C_4\prod_{i=1}^{q}\beta_i\,f_q(\omega_{\mathcal{Q}},T)},\quad\text{i.e.,}\quad t_q\le-\frac{1}{\omega_{\mathcal{Q}}}\ln\left(1-\frac{\delta\,\omega_{\mathcal{Q}}^{q+1}}{C_4\prod_{i=1}^{q}\beta_i\left(e^{T\omega_{\mathcal{Q}}}-\sum_{k=0}^{q-1}\frac{(T\omega_{\mathcal{Q}})^k}{k!}\right)}\right).$$
Since for $\omega_{\mathcal{Q}}>0$ the $\beta_i$ satisfy
$$\beta_i<\omega_{\mathcal{Q}}\,\frac{e^{T\omega_{\mathcal{Q}}}-\sum_{k=0}^{i-2}\frac{(T\omega_{\mathcal{Q}})^k}{k!}}{e^{T\omega_{\mathcal{Q}}}-\sum_{k=0}^{i-1}\frac{(T\omega_{\mathcal{Q}})^k}{k!}},$$
the upper bounds above form an increasing sequence in $q$. As in Theorem 3.6, the $\delta$-bound on the error holds for every $t_q$ up to the corresponding upper bound, which implies the existence of an ordered time sequence $0<t_1<\cdots<t_n<T$ such that $\|v_0(T)-v_0^n(T)\|\le\delta$.

1. J. Abate and W. Whitt, "A unified framework for numerically inverting Laplace transforms," INFORMS J. Comput. 18(4), 408-421 (2006).
2. B. J. Alder and T. E. Wainwright, "Decay of the velocity autocorrelation function," Phys. Rev. A 1(1), 18 (1970).
3. R. J. Baxter, Exactly Solved Models in Statistical Mechanics (Elsevier, 2016).
4. N. Biggs, Algebraic Graph Theory (Cambridge University Press, 1993).
5. C. Brennan and D. Venturi, "Data-driven closures for stochastic dynamical systems," J. Comput. Phys. 372, 281-298 (2018).
6. H.-P. Breuer, B. Kappler, and F. Petruccione, "The time-convolutionless projection operator technique in the quantum theory of dissipation and decoherence," Ann. Phys. 291(1), 36-70 (2001).
7. A. Chertock, D. Gottlieb, and A. Solomonoff, "Modified optimal prediction and its application to a particle-method problem," J. Sci. Comput. 37(2), 189-201 (2008).
8. H. Cho, D. Venturi, and G. E. Karniadakis, "Statistical analysis and simulation of random shocks in Burgers equation," Proc. R. Soc. A 470(2171), 20140080 (2014).
9. A. Chorin, O. Hald, and R. Kupferman, "Optimal prediction with memory," Physica D 166(3-4), 239-257 (2002).
10. A. J. Chorin, O. H. Hald, and R. Kupferman, "Optimal prediction and the Mori-Zwanzig representation of irreversible processes," Proc. Natl. Acad. Sci. U. S. A. 97(7), 2968-2973 (2000).
11. A. J. Chorin, R. Kupferman, and D. Levy, "Optimal prediction for Hamiltonian partial differential equations," J. Comput. Phys. 162(1), 267-297 (2000).
12. A. J. Chorin and P. Stinis, "Problem reduction, renormalization and memory," Commun. Appl. Math. Comput. Sci. 1(1), 1-27 (2006).
13. E. Darve, J. Solomon, and A. Kia, "Computing generalized Langevin equations and generalized Fokker-Planck equations," Proc. Natl. Acad. Sci. U. S. A. 106(27), 10884-10889 (2009).
14. E. B. Davies, "Semigroup growth bounds," J. Oper. Theory 53(2), 225-249 (2005).
15. J. Dominy and D. Venturi, "Duality and conditional expectations in the Nakajima-Mori-Zwanzig formulation," J. Math. Phys. 58, 082701 (2017).
16. K.-J. Engel and R. Nagel, One-Parameter Semigroups for Linear Evolution Equations (Springer, 1999), Vol. 194.
17. P. Español, "Dissipative particle dynamics for a harmonic chain: A first-principles derivation," Phys. Rev. E 53(2), 1572 (1996).
18. J. Florencio and H. M. Lee, "Exact time evolution of a classical harmonic-oscillator chain," Phys. Rev. A 31(5), 3231 (1985).
19. R. F. Fox, "Functional-calculus approach to stochastic differential equations," Phys. Rev. A 33(1), 467-476 (1986).
20. S. Gudder, "A Radon-Nikodým theorem for *-algebras," Pac. J. Math. 80(1), 141-149 (1979).
21. G. D. Harp and B. J. Berne, "Time-correlation functions, memory functions, and molecular dynamics," Phys. Rev. A 2(3), 975 (1970).
22. A. Karimi and M. R. Paul, "Extensive chaos in the Lorenz-96 model," Chaos 20(4), 043105 (2010).
23. J. Kim and I. Sawada, "Dynamics of a harmonic oscillator on the Bethe lattice," Phys. Rev. E 61(3), R2172 (2000).
24. B. O. Koopman, "Hamiltonian systems and transformation in Hilbert spaces," Proc. Natl. Acad. Sci. U. S. A. 17(5), 315-318 (1931).
25. H. Lei, N. A. Baker, and X. Li, "Data-driven parameterization of the generalized Langevin equation," Proc. Natl. Acad. Sci. U. S. A. 113(50), 14183-14188 (2016).
26. E. N. Lorenz, "Predictability—A problem partly solved," in ECMWF Seminar on Predictability (ECMWF, 1996), Vol. 1, pp. 1-18.
27. H. Mori, "A continued-fraction representation of the time-correlation functions," Prog. Theor. Phys. 34(3), 399-416 (1965).
28. H. Mori, "Transport, collective motion, and Brownian motion," Prog. Theor. Phys. 33(3), 423-455 (1965).
29. Noise in Nonlinear Dynamical Systems, Vol. 1: Theory of Continuous Fokker-Planck Systems, edited by F. Moss and P. V. E. McClintock (Cambridge University Press, 1995).
30. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations (Springer, 1992).
31. I. Snook, The Langevin and Generalised Langevin Approach to the Dynamics of Atomic, Polymeric and Colloidal Systems, 1st ed. (Elsevier, 2007).
32. G. Söderlind, "The logarithmic norm. History and modern theory," BIT Numer. Math. 46(3), 631-652 (2006).
33. P. Stinis, "Stochastic optimal prediction for the Kuramoto-Sivashinsky equation," Multiscale Model. Simul. 2(4), 580-612 (2004).
34. P. Stinis, "A comparative study of two stochastic model reduction methods," Physica D 213, 197-213 (2006).
35. P. Stinis, "Higher order Mori-Zwanzig models for the Euler equations," Multiscale Model. Simul. 6(3), 741-760 (2007).
36. P. Stinis, "Renormalized Mori-Zwanzig-reduced models for systems without scale separation," Proc. R. Soc. A 471(2176), 20140446 (2015).
37. L. N. Trefethen, "Pseudospectra of linear operators," SIAM Rev. 39(3), 383-406 (1997).
38. L. N. Trefethen and M. Embree, Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators (Princeton University Press, 2005).
39. H. Umegaki, "Conditional expectation in an operator algebra. I," Tohoku Math. J. 6(2), 177-181 (1954).
40. D. Venturi, "The numerical approximation of nonlinear functionals and functional differential equations," Phys. Rep. 732, 1-102 (2018).
41. D. Venturi, H. Cho, and G. E. Karniadakis, "The Mori-Zwanzig approach to uncertainty quantification," in Handbook of Uncertainty Quantification, edited by R. Ghanem, D. Higdon, and H. Owhadi (Springer, 2016).
42. D. Venturi and G. E. Karniadakis, "Convolutionless Nakajima-Zwanzig equations for stochastic analysis in nonlinear dynamical systems," Proc. R. Soc. A 470(2166), 20130754 (2014).
43. D. Venturi, T. P. Sapsis, H. Cho, and G. E. Karniadakis, "A computable evolution equation for the joint response-excitation probability density function of stochastic dynamical systems," Proc. R. Soc. A 468(2139), 759-783 (2012).
44. Y. Zhu and D. Venturi, "Faber approximation of the Mori-Zwanzig equation," J. Comput. Phys. 372, 694-718 (2018).
45. R. Zwanzig, "Memory effects in irreversible thermodynamics," Phys. Rev. 124, 983-992 (1961).
46. R. Zwanzig, Nonequilibrium Statistical Mechanics (Oxford University Press, 2001).