We develop a thorough mathematical analysis to deduce conditions for the accuracy and convergence of different approximations of the memory integral in the Mori-Zwanzig (MZ) equation. In particular, we derive error bounds and sufficient convergence conditions for short-memory approximations, the t-model, and hierarchical (finite-memory) approximations. In addition, we derive useful upper bounds for the MZ memory integral, which allow us to estimate a priori the contribution of the MZ memory to the dynamics. Such upper bounds are easily computable for systems with finite-rank projections. Numerical examples are presented and discussed for linear and nonlinear dynamical systems evolving from random initial states.

The Mori-Zwanzig (MZ) formulation is a technique from irreversible statistical mechanics that allows the development of formally exact evolution equations for the quantities of interest such as macroscopic observables in high-dimensional dynamical systems.45,10,41,42 One of the main advantages of developing such exact equations is that they provide a theoretical starting point to avoid integrating the full (possibly high-dimensional) dynamical system and instead solve directly for the quantities of interest, thus reducing the computational cost significantly. Computing the solution to the Mori-Zwanzig equation, however, is a challenging task that relies on approximations and appropriate numerical schemes. One of the main difficulties lies in the approximation of the MZ memory integral (convolution term), which encodes the effects of the so-called orthogonal dynamics in the time evolution of the quantity of interest. The orthogonal dynamics are essentially a high-dimensional flow satisfying a complex integro-differential equation. Over the years, many techniques have been proposed for approximating the MZ memory integral, the most efficient ones being problem-dependent.31,41 For example, in applications to statistical mechanics, Mori’s continued fraction method27,18 has been quite successful in determining exact solutions to several prototype problems, such as the dynamics of the auto-correlation function of a tagged oscillator in a harmonic chain.17,23 Other effective approaches to approximate the MZ memory integral rely on perturbation methods,6,42,8 mode coupling theories,2,31 functional approximation techniques,40,19,43,21,29 and data-driven methods.5,25 In a parallel effort, the applied mathematics community has, in recent years, attempted to derive general easy-to-compute representations of the memory integral. 
In particular, various approximations such as the t-model,10,12,34 the modified t-model,7 and, more recently, renormalized perturbation methods36 were proposed to address the approximation of the memory integral in situations where there is no clear separation of scales between resolved and unresolved dynamics.

The main objective of this paper is to develop rigorous estimates of the memory integral and provide convergence analysis of different approximation models of the Mori-Zwanzig equation, such as the short-memory approximation,33 the t-model,12 and hierarchical methods.35 In particular, we study the MZ equation corresponding to two broad classes of projection operators: (i) infinite-rank projections (e.g., Chorin’s projection10) and (ii) finite-rank projections (e.g., Mori’s projection28). We develop our analysis for both state-space and probability density function (PDF) space formulations of the MZ equation. These two descriptions are connected by the same duality principle that pairs the Koopman and Frobenius-Perron operators.15

This paper is organized as follows. In Sec. II, we outline the general procedure to derive the MZ equation in the phase space and discuss the common choices of projection operators. In Sec. III, we derive error bounds for the MZ memory integral and provide the convergence analysis of different memory approximation methods, including the t-model,34,10,12 the short-memory approximation,33 and the hierarchical memory approximation technique.35 Such estimates are built upon semigroup estimation methods we present in  Appendix A. In Sec. IV, we present numerical examples demonstrating the accuracy of the memory approximation/estimation methods we develop throughout the paper. The main findings are summarized in Sec. V. Convergence analysis of the MZ memory term in the probability density function space formulation is presented in  Appendix B.

Consider the nonlinear dynamical system

$\frac{dx}{dt}=F(x), \qquad x(0)=x_0$
(1)

evolving on a smooth manifold $S$. For simplicity, let us assume that $S=\mathbb{R}^n$. We will consider the dynamics of scalar-valued observables $g:S\to\mathbb{C}$, and for concreteness, it will be desirable to identify structured spaces of such observable functions. In Ref. 15, it was argued that C*-algebras of observables such as $L^\infty(S,\mathbb{C})$ (the space of all measurable, essentially bounded functions on $S$) and $C_0(S,\mathbb{C})$ (the space of all continuous functions on $S$, vanishing at infinity) make natural choices. In what follows, we do not require the observables to comprise a C*-algebra, but we will want them to comprise a Banach space, as the estimation theorems of Sec. III make extensive use of the norm of this space. Having the structure of a Banach space of observables also gives greater context to the meaning of the linear operators $L$, $K$, $P$, and $Q$ to be defined hereafter.

The dynamics of any scalar-valued observable g(x) (quantity of interest) can be expressed in terms of a semi-group $K(t,s)$ of operators acting on the Banach space of observables. This is the Koopman operator24 which acts on the function g as

$g(x(t))=K(t,s)g(x(s)),$
(2)

where

$K(t,s)=e^{(t-s)L}, \qquad Lg(x)=F(x)\cdot\nabla g(x).$
(3)
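For linear dynamics the Koopman identity (2) and (3) can be checked numerically: if $F(x)=Ax$ and $g(x)=c\cdot x$, then $Lg$ corresponds to multiplying the coefficient vector $c$ by $A^T$, so $e^{tL}$ acts on $c$ as $e^{tA^T}$. The following Python sketch (the matrix A, observable c, and initial state are illustrative choices, not taken from the paper) verifies $g(x(t))=\left(e^{tL}g\right)(x_0)$:

```python
import numpy as np

def rk4_flow(F, x, t, steps=2000):
    """Integrate dx/dt = F(x) with classical RK4 and return x(t)."""
    dt = t / steps
    for _ in range(steps):
        k1 = F(x)
        k2 = F(x + 0.5 * dt * k1)
        k3 = F(x + 0.5 * dt * k2)
        k4 = F(x + dt * k3)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def expm(M, terms=40):
    """Matrix exponential via truncated Taylor series (adequate for small, well-scaled M)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out += term
    return out

A = np.array([[-0.5, 1.0, 0.0],
              [-1.0, -0.5, 0.2],
              [0.0, -0.2, -0.3]])
c = np.array([1.0, -2.0, 0.5])               # linear observable g(x) = c . x
x0 = np.array([0.3, -0.1, 0.7])
t = 1.3

lhs = c @ rk4_flow(lambda x: A @ x, x0, t)   # g(x(t)) from the flow map
rhs = (expm(t * A.T) @ c) @ x0               # (e^{tL} g)(x0): L acts on c as A^T
assert abs(lhs - rhs) < 1e-8
```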

Rather than computing the Koopman operator applicable to all observables, it is often more tractable to compute the evolution only of a (closed) subspace of the quantities of interest. This subspace can be described conveniently by means of a projection operator $P$ with the subspace as its image. Both $P$ and the complementary projection $Q=I−P$ act on the space of observables. The nature, mathematical properties, and connections between $P$ and the observable g are discussed in detail in Ref. 15 and summarized in Sec. II A. For now, it suffices to assume that $P$ is a bounded linear operator and that $P2=P$. The MZ formalism describes the evolution of observables initially in the image of $P$. Because the evolution of observables is governed by the semi-group $K(t,s)$, we seek an evolution equation for $K(t,s)P$. By using the definition of the Koopman operator (3) and the well-known Dyson identity

$e^{tL}=e^{tQL}+\int_0^t e^{sL}PLe^{(t-s)QL}\,ds,$

we obtain the operator equation

$\frac{d}{dt}e^{tL}=e^{tL}PL+e^{tQL}QL+\int_0^t e^{sL}PLe^{(t-s)QL}QL\,ds.$
(4)

By applying this equation to an observable function u0, we obtain the well-known MZ equation in phase space

$\frac{\partial}{\partial t}e^{tL}u_0=e^{tL}PLu_0+e^{tQL}QLu_0+\int_0^t e^{sL}PLe^{(t-s)QL}QLu_0\,ds.$
(5)

Acting on the left with $P$, we obtain the evolution equation for projected dynamics

$\frac{\partial}{\partial t}Pe^{tL}u_0=Pe^{tL}PLu_0+\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds.$
(6)

Note, in fact, that the projection of the second term in (5) vanishes, i.e., $Pe^{tQL}QLu_0=0$, since $PQ=0$.

In this section, we summarize the projection operators $P$ commonly used in the Mori-Zwanzig framework. To make our definitions mathematically sound, we begin by assuming that the Liouville operator (3) acts on observable functions in a C*-algebra $A$, for instance, $L^\infty(M,\Sigma,\mu)$, where $M$ is a space such as $\mathbb{R}^N$, Σ is a σ-algebra on $M$, and μ is a measure on Σ. Let $\sigma\in A^*$ be a positive linear functional on $A$. We define the weighted pre-inner product

$⟨f,g⟩σ=σ(f*g).$

This can be used to define a Hilbert space $H=L2(M,σ)$, which is the completion of the quotient space

$H'=\{f\in A:\sigma(f^*f)<\infty\}/\{f\in A:\sigma(f^*f)=0\}$

endowed with the inner product ⟨·,·⟩σ. The L2 norm induced by the inner product is denoted as ∥·∥σ. In the rest of the paper, the positive linear functional σ is always induced by a probability distribution $σ̃$ through

$σ(u)=∫Mσ̃(ω)u(ω)dω,$

where $σ̃$ is typically chosen to be the probability density of the initial condition ρ0 or the equilibrium distribution ρeq in statistical physics. To conform to the literature, we also use notation $⟨⋅,⋅⟩ρ0$, $⟨⋅,⋅⟩ρeq,⟨⋅,⋅⟩eq$ to represent the weighted inner product corresponding to different probability measures $σ̃(ω)dω$. With the Hilbert space determined, we now focus on the following two broad classes of orthogonal projections on $H$.

#### 1. Infinite-rank projections

The first class of projection operators to consider in this setting is that of conditional expectations $P$ such that $P^*\sigma=\sigma$. In this case, the properties of conditional expectations (in particular, that $P[P(f)gP(h)]=P(f)P(g)P(h)$39) and the fact that $P^*\sigma=\sigma$ imply that

$\langle Pf,g\rangle_\sigma=\sigma[(Pf)^*g]=P^*(\sigma)[(Pf)^*g]=\sigma[P((Pf)^*g)]=\sigma[(Pf)^*(Pg)],$
$\langle f,Pg\rangle_\sigma=\sigma[f^*Pg]=P^*(\sigma)[f^*Pg]=\sigma[P(f^*Pg)]=\sigma[(Pf^*)(Pg)]=\sigma[(Pf)^*(Pg)],$

so that

$⟨Pf,g⟩σ=⟨f,Pg⟩σ$

for all $f,g∈H$. It follows that

$⟨Qf,g⟩σ=⟨f,g⟩σ−⟨Pf,g⟩σ=⟨f,g⟩σ−⟨f,Pg⟩σ=⟨f,Qg⟩σ.$

Therefore, both $P$ and $Q$ are self-adjoint (i.e., orthogonal) projections onto closed subspaces of $H$ and hence contractions, $\|P\|_\sigma\le 1$, $\|Q\|_\sigma\le 1$. Chorin’s projection12,10 belongs to this class and is defined as

$Pg(\hat x_0)=\frac{\int_{-\infty}^{+\infty}g(\hat x(t;\hat x_0,\tilde x_0),\tilde x(t;\hat x_0,\tilde x_0))\,\rho_0(\hat x_0,\tilde x_0)\,d\tilde x_0}{\int_{-\infty}^{+\infty}\rho_0(\hat x_0,\tilde x_0)\,d\tilde x_0}=\mathbb{E}_{\rho_0}[g\,|\,\hat x_0].$
(7)

Here $x(t;x_0)=(\hat x(t;\hat x_0,\tilde x_0),\tilde x(t;\hat x_0,\tilde x_0))$ is the flow map generated by (1), split into resolved ($\hat x$) and unresolved ($\tilde x$) variables, and $g(x)=g(\hat x,\tilde x)$ is the quantity of interest. For Chorin’s projection, the positive functional σ defining the Hilbert space $H$ may be taken to be integration with respect to the probability measure $\rho_0(\hat x_0,\tilde x_0)$. Clearly, if $x_0$ is deterministic, then $\rho_0(\hat x_0,\tilde x_0)$ is a product of Dirac delta functions. On the other hand, if $\hat x(0)$ and $\tilde x(0)$ are statistically independent, i.e., $\rho_0(\hat x_0,\tilde x_0)=\hat\rho_0(\hat x_0)\tilde\rho_0(\tilde x_0)$, then the conditional expectation $P$ simplifies to

$Pu(\hat x_0)=\int_{-\infty}^{+\infty}u(\hat x(t;\hat x_0,\tilde x_0),\tilde x(t;\hat x_0,\tilde x_0))\,\tilde\rho_0(\tilde x_0)\,d\tilde x_0.$
(8)

In the special case where $u(x^,x̃)=x^(t;x^0,x̃0)$, we have

$P\hat x(\hat x_0)=\int_{-\infty}^{+\infty}\hat x(t;\hat x_0,\tilde x_0)\,\tilde\rho_0(\tilde x_0)\,d\tilde x_0,$
(9)

i.e., the conditional expectation of the resolved variables $x^(t)$ given the initial condition $x^0$. This means that an integration of (9) with respect to $ρ^0(x^0)$ yields the mean of the resolved variables, i.e.,

$\mathbb{E}_{\rho_0}[\hat x(t)]=\int_{-\infty}^{\infty}P\hat x(\hat x_0)\,\hat\rho_0(\hat x_0)\,d\hat x_0=\int_{-\infty}^{\infty}\hat x(t,x_0)\,\rho_0(x_0)\,dx_0.$
(10)

Obviously, if the resolved variables $\hat x(t)$ evolve from a deterministic initial state $\hat x_0$, then the conditional expectation (9) represents the average of the reduced-order flow map $\hat x(t;\hat x_0,\tilde x_0)$ with respect to the PDF of $\tilde x_0$, i.e., the flow map

$Pe^{tL}\hat x(0)=X_0(t;\hat x_0)=\int_{-\infty}^{+\infty}\hat x(t;\hat x_0,\tilde x_0)\,\tilde\rho_0(\tilde x_0)\,d\tilde x_0.$
(11)

In this case, the MZ equation (6) is an exact (unclosed) partial differential equation (PDE) for the multivariate field $X0(t,x^0)$. In order to close such an equation, a mean field approximation of the type $Pf(x^)=f(Px^)$ was introduced by Chorin et al. in Refs. 10, 12, and 11, together with the assumption that the probability distribution of x0 is invariant under the flow generated by (1).
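For a linear system, the conditional expectation (8) can be approximated by Monte Carlo averaging over samples of the unresolved initial data. The following Python sketch (the 2×2 matrix, time, and sample size are illustrative choices, not from the paper) compares the estimate with the exact conditional mean $\left(e^{tA}\right)_{11}\hat x_0$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[-0.2, 0.7],
              [-0.7, -0.5]])
t, N = 1.0, 400_000

# matrix exponential e^{tA} via truncated Taylor series (adequate for this small matrix)
E, term = np.eye(2), np.eye(2)
for k in range(1, 40):
    term = term @ (t * A) / k
    E += term

xhat0 = 0.8                               # resolved initial condition (deterministic)
xtil0 = rng.standard_normal(N)            # unresolved initial data, rho~_0 = N(0, 1)
x1_t = E[0, 0] * xhat0 + E[0, 1] * xtil0  # resolved variable at time t, per sample
mc = x1_t.mean()                          # Monte Carlo estimate of (P e^{tL} xhat)(xhat0)
exact = E[0, 0] * xhat0                   # E[xtil0] = 0, so the conditional mean is E_11 * xhat0
assert abs(mc - exact) < 5e-3
```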

#### 2. Finite-rank projections

Another class of projections is defined by choosing a closed (typically finite-dimensional) linear subspace $V\subset H=L^2(M,\sigma)$ and letting $P$ be the orthogonal projection onto V in the σ inner product. An example of such a projection is Mori’s projection,46 widely used in statistical physics. For finite-dimensional V, given a linearly independent set {u1, …, uM} ⊂ V that spans V, $P$ can be defined by first constructing the positive definite Gram matrix $G_{ij}=\langle u_i,u_j\rangle_\sigma$. Then

$Pf=\sum_{i,j=1}^M (G^{-1})_{ij}\,\langle u_i,f\rangle_\sigma\, u_j.$
(12)

This projection is orthogonal with respect to the $L^2_\sigma$ inner product. In statistical physics, a common choice for the positive functional σ that generates $H$ is integration with respect to the Gibbs canonical distribution $\rho_{eq}=e^{-\beta H}/Z$, for the Hamiltonian $H=H(p,q)$ and the associated partition function Z. Here, q are generalized coordinates, while p are kinetic momenta.
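A minimal numerical sketch of (12), with functions discretized as vectors and σ represented by a positive quadrature weight (all choices below are illustrative assumptions), verifies that the resulting matrix is idempotent, self-adjoint in the weighted inner product, and acts as the identity on V:

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 50, 3
W = np.diag(rng.uniform(0.5, 2.0, n))  # discretized positive weight sigma~
U = rng.standard_normal((n, M))        # columns u_1, ..., u_M span V

G = U.T @ W @ U                        # Gram matrix G_ij = <u_i, u_j>_sigma
P = U @ np.linalg.inv(G) @ U.T @ W     # matrix form of the projection (12)

assert np.allclose(P @ P, P)           # idempotent: P^2 = P
assert np.allclose(P.T @ W, W @ P)     # self-adjoint in the weighted inner product
assert np.allclose(P @ U, U)           # P restricted to V is the identity
```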

In this section, we develop a thorough mathematical analysis of the MZ memory integral

$\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds=\int_0^t Pe^{sL}Pe^{(t-s)LQ}LQLu_0\,ds$
(13)

and its approximation. We begin by describing the behavior of the semigroup norms $‖etL‖$, $‖etQLQ‖$, and $‖etLQ‖$ as functions of time, for different choices of projection $P$ and different norms. As we will see, the analysis will give clear computable bounds only in some circumstances, illustrating the difficulty of this problem and the need for further development and insight.

For any Liouville operator $L$ of the form (3) acting on $A=L∞(M,Σ,μ)$ and for any σ identified with an element of $L1(M,Σ,μ)$, the functional $L*σ$ (assuming that σ lies in the domain of $L*$) is absolutely continuous with respect to σ (essentially because $L$ acts locally) and the Radon-Nikodym derivative20 may be identified with the negative of the divergence of the vector field F with respect to the measure induced by σ, i.e.,

$\frac{dL^*\sigma}{d\sigma}=-\mathrm{div}_\sigma F, \quad\text{i.e.,}\quad (L^*\sigma)(u)=-\sigma(u\,\mathrm{div}_\sigma F),$
(14)

where

$\mathrm{div}_\sigma F=\frac{\nabla\cdot(\tilde\sigma(x)F(x))}{\tilde\sigma(x)}.$
(15)

When $M=\mathbb{R}^N$ and σ has the form $\sigma(u)=\int_{\mathbb{R}^N}\tilde\sigma(x)u(x)\,dx$, this can be shown more directly using integration by parts. By assuming that $\tilde\sigma(x)$ or $F_i$ decays to 0 at infinity, we have

$L^*\sigma(u)=\sigma(Lu)=\int_{\mathbb{R}^N}\tilde\sigma\sum_{i=1}^N F_i\frac{\partial u}{\partial x_i}\,dx=-\int_{\mathbb{R}^N}u\sum_{i=1}^N\frac{\partial}{\partial x_i}(\tilde\sigma F_i)\,dx=-\int_{\mathbb{R}^N}u\,\frac{1}{\tilde\sigma}\sum_{i=1}^N\frac{\partial}{\partial x_i}(\tilde\sigma F_i)\,\tilde\sigma\,dx$
(16a)
$=\sigma\!\left(-u\,\frac{1}{\tilde\sigma}\sum_{i=1}^N\frac{\partial}{\partial x_i}(\tilde\sigma F_i)\right)=\sigma\left(-u\,\mathrm{div}_\sigma F\right)$
(16b)

from which we see

$\mathrm{div}_\sigma F=\frac{\nabla\cdot(\tilde\sigma F)}{\tilde\sigma}=\frac{1}{\tilde\sigma}\sum_{i=1}^N\frac{\partial}{\partial x_i}(\tilde\sigma F_i)=\nabla\cdot F+F\cdot\nabla\ln\tilde\sigma.$

Therefore,

$\langle v,-u\,\mathrm{div}_\sigma F\rangle_\sigma=\sigma(-v^*u\,\mathrm{div}_\sigma F)=(L^*\sigma)(v^*u)=\sigma(L(v^*u))=\sigma(L(v)^*u+v^*L(u))=\langle Lv,u\rangle_\sigma+\langle v,Lu\rangle_\sigma$

and the following are equivalent: (i) $L^*\sigma=0$ (i.e., σ is invariant); (ii) $\mathrm{div}_\sigma F=0$; (iii) $L$ is skew-adjoint with respect to the σ inner product. More generally, on $H=L^2(M,\sigma)$, we find that

$L+L^{\dagger}=-\mathrm{div}_\sigma F$

so that the numerical abscissa ω37,38 (i.e., logarithmic norm14,32) of $L$ is given by

$\omega=\sup_{0\neq u\in H}\frac{\mathrm{Re}\,\langle u,Lu\rangle_\sigma}{\langle u,u\rangle_\sigma}=\sup_{0\neq u\in H}\frac{\langle u,(L+L^{\dagger})u\rangle_\sigma}{2\langle u,u\rangle_\sigma}=\sup_{0\neq u\in H}\frac{\langle u,-u\,\mathrm{div}_\sigma F\rangle_\sigma}{2\langle u,u\rangle_\sigma}=-\frac{1}{2}\inf_x \mathrm{div}_\sigma F(x).$
(17)

Using this numerical abscissa ω, we obtain the following $L^2_\sigma$ estimate of the Koopman semigroup:

$\|e^{tL}\|_{L^2_\sigma}\le e^{\omega t},$
(18)

and moreover, $e^{\omega t}$ is the smallest exponential function that bounds $\|e^{tL}\|_\sigma$.14 When $P$ and $Q=I-P$ are orthogonal projections on $L^2(M,\sigma)$, we can observe that the numerical abscissa of $QLQ$ is bounded by that of $L$. In fact,

$\sup_{0\neq u\in H}\frac{\mathrm{Re}\,\langle u,QLQu\rangle_\sigma}{\langle u,u\rangle_\sigma}=\sup_{0\neq u\in H}\frac{\mathrm{Re}\,\langle Qu,LQu\rangle_\sigma}{\langle u,u\rangle_\sigma}=\sup_{0\neq u\in \mathrm{Im}\,Q}\frac{\mathrm{Re}\,\langle u,Lu\rangle_\sigma}{\langle u,u\rangle_\sigma}\le\sup_{0\neq u\in H}\frac{\mathrm{Re}\,\langle u,Lu\rangle_\sigma}{\langle u,u\rangle_\sigma}=\omega$

[see Eq. (17)] so that

$\|e^{tQLQ}\|_{L^2_\sigma}\le e^{\omega t}.$
(19)

It should be noticed that this bound for the orthogonal semigroup is not necessarily tight. The tightness of the bound depends on the choice of projection $P$ and comes down to whether functions in the image of $Q$ can be chosen localized to regions where divσF is close to its infimal value.
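For a matrix generator, estimate (18) reduces to the classical logarithmic-norm bound $\|e^{tA}\|_2\le e^{\omega t}$, with ω the largest eigenvalue of the symmetric part $(A+A^T)/2$. The following Python sketch (the random matrix and test times are illustrative choices) checks this bound numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
A = 0.8 * rng.standard_normal((6, 6))
omega = np.linalg.eigvalsh((A + A.T) / 2).max()  # numerical abscissa of A

def expm(M, terms=80):
    """Matrix exponential via truncated Taylor series (adequate at these scales)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out += term
    return out

# the logarithmic-norm bound ||e^{tA}||_2 <= e^{omega t} holds for all t >= 0
for t in (0.1, 0.5, 1.0, 2.0):
    assert np.linalg.norm(expm(t * A), 2) <= np.exp(omega * t) * (1 + 1e-8)
```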

When estimating the MZ memory integral, we need to deal with the semigroup $etLQ$. It turns out to be extremely difficult to prove strong continuity of such a semigroup in general, due to the unboundedness of $LQ$. It is shown in  Appendix A that when $PLQ$ is an unbounded operator, as is typical when $P$ is a conditional expectation, the semigroup $etLQ$ can only be bounded as

$\|e^{tLQ}\|_\sigma\le M_Q e^{t\omega_Q}$
(20)

for some $M_Q>1$, due to the fact that $\|e^{tLQ}\|_\sigma$ has an infinite slope at t = 0. More work is needed to obtain satisfactory, computable values for $M_Q$ and $\omega_Q$ in the case where $P$ and $Q$ are infinite-rank projections. It is also shown that when either $PLQ$ or $LP$ is bounded, for example, when $P$ is a finite-rank projection, we can obtain computable semigroup bounds of the form

$\|e^{tLQ}\|_\sigma\le e^{\omega_Q t}\le e^{\left(\frac{1}{2}\sqrt{\omega^2+\|PLQ\|_\sigma^2}+\omega\right)t},$
(21a)
$\|e^{tLQ}\|_\sigma\le e^{\omega_Q t}\le e^{(\omega+\|LP\|_\sigma)t},$
(21b)

where $\omega=-\tfrac{1}{2}\inf_x \mathrm{div}_\sigma F(x)$ if we use the $L^2_\sigma$ estimate of $e^{tL}$.

We begin by seeking to bound the MZ memory integral (13) as a whole and build our analysis from there. A key assumption of our analysis is that the semigroup $e^{tLQ}$ is strongly continuous, i.e., the map $t\mapsto e^{tLQ}g$ is continuous in the norm topology on the space of observables for each fixed g.16 As we pointed out in Sec. III A, it is very difficult to rigorously prove strong continuity of $e^{tLQ}$ and to obtain a computable upper bound for unbounded generators of the form $LQ$. We leave this problem for future research and assume for now that there exist constants $M_Q$ and $\omega_Q$ such that $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. Throughout this section, ∥·∥ denotes a general Banach norm. We begin with the following simple estimate:

Theorem 3.1
(Memory growth). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. Then
$\left\|\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds\right\|\le M_0(t),$
(22)
where
$M_0(t)=\begin{cases}C_1\,t\,e^{t\omega_Q}, & \omega=\omega_Q,\\[4pt] \dfrac{C_1}{\omega-\omega_Q}\left[e^{t\omega}-e^{t\omega_Q}\right], & \omega\neq\omega_Q,\end{cases}$
(23)
and $C_1=MM_Q\|P\|^2\|LQLu_0\|$ is a constant. Clearly, $\lim_{t\to 0}M_0(t)=0$.

Proof.
We first rewrite the memory integral in the equivalent form
$\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds=\int_0^t Pe^{sL}Pe^{(t-s)LQ}LQLu_0\,ds.$
Since $e^{tL}$ and $e^{tLQ}$ are assumed to be strongly continuous semigroups, we have the upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. Therefore,
$\left\|\int_0^t Pe^{sL}Pe^{(t-s)LQ}LQLu_0\,ds\right\|\le\int_0^t\left\|Pe^{sL}Pe^{(t-s)LQ}LQLu_0\right\|ds\le MM_Q\|P\|^2\|LQLu_0\|\,e^{t\omega_Q}\int_0^t e^{s(\omega-\omega_Q)}\,ds=\begin{cases}C_1\,t\,e^{t\omega_Q}, & \omega=\omega_Q,\\[4pt] \dfrac{C_1}{\omega-\omega_Q}\left[e^{t\omega}-e^{t\omega_Q}\right], & \omega\neq\omega_Q,\end{cases}$
where $C1=MMQ‖P‖2‖LQLu0‖$.

Theorem 3.1 provides an upper bound for the growth of the memory integral based on the assumption that $etL$ and $etLQ$ are strongly continuous semigroups. We emphasize that only for simple cases can such upper bounds be computed analytically (we will compute one of the cases later in Sec. IV) because of the fundamental difficulties in computing the upper bound of $etLQ$. However, it will be shown later that, although the specific expression for M0(t) is unknown, the form of it is already useful as it enables us to derive some verifiable theoretical predictions for general nonlinear systems.
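For a linear system with a finite-rank projection, both the memory integral (13) and the bound $M_0(t)$ of Theorem 3.1 are computable, identifying $L$ with $A^T$ acting on observable coefficients and taking $M=M_Q=1$ in the matrix 2-norm. The Python sketch below (the matrix, projection, and quadrature resolution are illustrative assumptions, not the paper's examples) verifies that the bound holds:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = 0.5 * rng.standard_normal((n, n))
LT = A.T                               # on observable coefficients, L acts as A^T
P = np.zeros((n, n)); P[0, 0] = 1.0    # rank-1 orthogonal projection (Mori-type)
Q = np.eye(n) - P
u0 = rng.standard_normal(n)

def expm(M, terms=60):
    out, term = np.eye(n), np.eye(n)
    for k in range(1, terms):
        term = term @ M / k
        out += term
    return out

def abscissa(M):
    """Numerical abscissa; gives M = M_Q = 1 bounds in the 2-norm."""
    return np.linalg.eigvalsh((M + M.T) / 2).max()

w, wQ = abscissa(LT), abscissa(LT @ Q)
v = LT @ Q @ LT @ u0                   # LQL u0
t, N = 1.5, 400
s = np.linspace(0.0, t, N + 1)
vals = np.array([P @ expm(si * LT) @ P @ expm((t - si) * (LT @ Q)) @ v for si in s])
mem = (t / N) * (0.5 * vals[0] + vals[1:-1].sum(axis=0) + 0.5 * vals[-1])  # trapezoid

C1 = np.linalg.norm(v)                 # here M = M_Q = ||P|| = 1
if abs(w - wQ) < 1e-12:
    M0 = C1 * t * np.exp(t * wQ)
else:
    M0 = C1 * (np.exp(t * w) - np.exp(t * wQ)) / (w - wQ)
assert np.linalg.norm(mem) <= M0 * (1 + 1e-6)
```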

Theorem 3.1 can be employed to obtain upper bounds for well-known approximations of the memory integral. Let us begin with the t-model proposed in Ref. 12. This model relies on the approximation

$\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds\simeq t\,e^{tL}PLQLu_0 \quad \text{(t-model)}.$
(24)

Theorem 3.2
(Memory approximation via the t-model12). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. Then
$\left\|\int_0^t Pe^{sL}Pe^{(t-s)LQ}LQLu_0\,ds-t\,Pe^{tL}PLQLu_0\right\|\le M_1(t),$
where
$M_1(t)=\begin{cases}C_1\left[\dfrac{e^{t\omega_Q}-e^{t\omega}}{\omega_Q-\omega}+\dfrac{t\,e^{t\omega}}{M_Q}\right], & \omega\neq\omega_Q,\\[8pt] C_1\,\dfrac{M_Q+1}{M_Q}\,t\,e^{t\omega}, & \omega=\omega_Q,\end{cases}$
and $C_1=MM_Q\|P\|^2\|LQLu_0\|$.

Proof.
By applying the triangle inequality, we obtain that
$\begin{aligned}\left\|\int_0^t Pe^{sL}Pe^{(t-s)LQ}LQLu_0\,ds-t\,Pe^{tL}PLQLu_0\right\|&\le\|P\|^2\|LQLu_0\|\left(MM_Q\int_0^t e^{s\omega}e^{(t-s)\omega_Q}\,ds+t\,Me^{t\omega}\right)\\&=C_1\left(e^{t\omega_Q}\int_0^t e^{s(\omega-\omega_Q)}\,ds+\frac{t\,e^{t\omega}}{M_Q}\right)\\&=\begin{cases}C_1\left[\dfrac{e^{t\omega_Q}-e^{t\omega}}{\omega_Q-\omega}+\dfrac{t\,e^{t\omega}}{M_Q}\right], & \omega\neq\omega_Q,\\[8pt] C_1\,\dfrac{M_Q+1}{M_Q}\,t\,e^{t\omega}, & \omega=\omega_Q,\end{cases}\end{aligned}$
where $C1=MMQ‖P‖2‖LQLu0‖$.
Theorem 3.2 provides an upper bound for the error associated with the t-model. The limit
$\lim_{t\to 0}M_1(t)=0$
(25)
guarantees the convergence of the t-model for short integration times. On the other hand, depending on the semigroup constants M, ω, $MQ$, and $ωQ$ (which may be estimated numerically), the error of the t-model may remain small for longer integration times (see the numerical results in Sec. IV B 2). Next, we study the short-memory approximation proposed in Ref. 33. The main idea is to replace the integration interval [0, t] in (13) by a shorter time interval [t − Δt, t], i.e.,
$\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds\simeq\int_{t-\Delta t}^{t}Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds \quad \text{(short-memory approximation)},$
where Δt ∈ [0, t] identifies the effective memory length. The following result provides an upper bound to the error associated with the short-memory approximation.

Theorem 3.3
(Short-memory approximation33). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. Then the following error estimate holds true:
$\left\|\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds-\int_{t-\Delta t}^{t}Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds\right\|\le M_2(\Delta t,t),$
where
$M_2(\Delta t,t)=\begin{cases}C_1(t-\Delta t)\,e^{t\omega_Q}, & \omega=\omega_Q,\\[4pt] C_1\,e^{\Delta t\,\omega_Q}\,\dfrac{e^{(t-\Delta t)\omega}-e^{(t-\Delta t)\omega_Q}}{\omega-\omega_Q}, & \omega\neq\omega_Q,\end{cases}$
and $C_1=MM_Q\|P\|^2\|LQLu_0\|$.

We omit the proof due to its similarity to that of Theorem 3.1. Note that $\lim_{\Delta t\to t}M_2(\Delta t,t)=0$ for all finite t > 0.

An alternative way to approximate the memory integral (13) was proposed by Stinis in Ref. 35. The key idea is to repeatedly differentiate (13) with respect to time and establish a hierarchy of PDEs which can eventually be truncated or approximated at some level to provide an approximation of the memory. In this section, we derive this hierarchy of memory equations and perform a thorough theoretical analysis to establish the accuracy and convergence of the method. To this end, let us first define

$w_0(t)=\int_0^t Pe^{sL}PLe^{(t-s)QL}QLu_0\,ds$
(26)

to be the memory integral (13). By differentiating $w_0(t)$ with respect to time, we obtain

$\frac{dw_0(t)}{dt}=Pe^{tL}PLQLu_0+w_1(t),$

where

$w_1(t)=\int_0^t Pe^{sL}PLe^{(t-s)QL}(QL)^2u_0\,ds.$

By iterating this procedure n times, we obtain

$\frac{dw_{n-1}(t)}{dt}=Pe^{tL}PL(QL)^{n}u_0+w_n(t),$
(27)

where

$w_n(t)=\int_0^t Pe^{sL}PLe^{(t-s)QL}(QL)^{n+1}u_0\,ds.$
(28)

The hierarchy of Eqs. (27) and (28) is equivalent to the following infinite-dimensional system of PDEs:

$\begin{aligned}\frac{dw_0(t)}{dt}&=Pe^{tL}PLQLu_0+w_1(t),\\ \frac{dw_1(t)}{dt}&=Pe^{tL}PLQLQLu_0+w_2(t),\\ &\ \,\vdots\\ \frac{dw_{n-1}(t)}{dt}&=Pe^{tL}PL(QL)^{n}u_0+w_n(t),\\ &\ \,\vdots\end{aligned}$
(29)

evolving from the initial condition $w_i(0)=0$, $i=1,2,\dots$ [see Eq. (28)]. With such an initial condition available, we can solve (29) by backward substitution, i.e., from the last equation to the first one, to obtain the following (exact) Dyson series representation of the memory integral (26):

$w_0(t)=\int_0^t Pe^{sL}PLQLu_0\,ds+\int_0^t\!\!\int_0^{\tau_1}Pe^{sL}PLQLQLu_0\,ds\,d\tau_1+\cdots+\int_0^t\!\!\int_0^{\tau_{n-1}}\!\!\cdots\int_0^{\tau_1}Pe^{sL}PL(QL)^{n}u_0\,ds\,d\tau_1\cdots d\tau_{n-1}+\cdots.$
(30)

So far, no approximation has been introduced, i.e., the infinite-dimensional system (29) and the corresponding formal solution (30) are exact. To make progress in developing a computational scheme to estimate the memory integral (26), it is necessary to introduce approximations. The simplest of these relies on truncating the hierarchy (29) after n equations, while simultaneously introducing an approximation of the nth-order memory integral $w_n(t)$. We denote such an approximation as $w_n^{e_n}(t)$. The truncated system takes the form

$\begin{aligned}\frac{dw_0^n(t)}{dt}&=Pe^{tL}PLQLu_0+w_1^n(t),\\ \frac{dw_1^n(t)}{dt}&=Pe^{tL}PLQLQLu_0+w_2^n(t),\\ &\ \,\vdots\\ \frac{dw_{n-1}^n(t)}{dt}&=Pe^{tL}PL(QL)^{n}u_0+w_n^{e_n}(t).\end{aligned}$
(31)

The notation $w_j^n(t)$ (j = 0, …, n − 1) emphasizes that the solution to (31) is, in general, different from the solution to (29). The initial condition of the system can be set as $w_i^n(0)=0$ for all $i=0,\dots,n-1$. By using backward substitution, this yields the following formal solution:

$w_0^n(t)=\int_0^t Pe^{sL}PLQLu_0\,ds+\int_0^t\!\!\int_0^{\tau_1}Pe^{sL}PLQLQLu_0\,ds\,d\tau_1+\cdots+\int_0^t\!\!\int_0^{\tau_{n-1}}\!\!\cdots\int_0^{\tau_1}Pe^{sL}PL(QL)^{n}u_0\,ds\,d\tau_1\cdots d\tau_{n-1}+\int_0^t\!\!\int_0^{\tau_{n-1}}\!\!\cdots\int_0^{\tau_1}w_n^{e_n}(s)\,ds\,d\tau_1\cdots d\tau_{n-1},$
(32)

representing an approximation of the memory integral (26). Note that, for a given system, such an approximation depends only on the number of equations n in (31) and on the choice of the approximation $w_n^{e_n}(t)$. In the present paper, we consider the following choices:

1. Approximation by truncation (H-model),

$w_n^{e_n}(t)=0.$
(33)
2. Type-I finite memory approximation (FMA),

$w_n^{e_n}(t)=\int_{\max(0,\,t-\Delta t_n)}^{t}Pe^{sL}PLe^{(t-s)QL}(QL)^{n+1}u_0\,ds.$
(34)
3. Type-II finite memory approximation,

$w_n^{e_n}(t)=\int_{\min(t,\,t_n)}^{t}Pe^{sL}PLe^{(t-s)QL}(QL)^{n+1}u_0\,ds.$
(35)
4. Ht-model,

$w_n^{e_n}(t)=t\,Pe^{tL}PL(QL)^{n+1}u_0.$
(36)

The quantities $t_n$ and $\Delta t_n$ appearing in (34) and (35) will be defined in Theorems 3.5 and 3.6, respectively. The first approximation, i.e., (33), is a truncation of the hierarchy obtained by assuming that $w_n(t)=0$. Such an approximation was originally proposed by Stinis in Ref. 35, and we shall call it the H-model. The Type-I finite memory approximation (FMA) is obtained by applying the short-memory approximation to the nth-order memory integral $w_n(t)$. The Type-II finite memory approximation is a modified version of the Type-I, with a larger memory band. The Ht-model approximation is based on replacing the nth-order memory integral $w_n(t)$ with a classical t-model. Note that in this setting, the classical t-model approximation proposed by Chorin and Stinis12 is equivalent to a zeroth-order Ht-model approximation.
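For a linear system, the H-model closure (33) can be tested directly: by Cauchy's formula for repeated integration, the kth term of the truncated Dyson series collapses to a single integral with kernel $(t-s)^{k-1}/(k-1)!$. The Python sketch below (the matrix, projection, and integration time are illustrative assumptions, identifying $L$ with $A^T$ on observable coefficients) compares truncations of increasing order with the full memory integral:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(4)
nn = 5
A = 0.5 * rng.standard_normal((nn, nn))
LT = A.T                               # L acts on observable coefficients as A^T
P = np.zeros((nn, nn)); P[0, 0] = 1.0
Q = np.eye(nn) - P
u0 = rng.standard_normal(nn)
t, Ns = 0.5, 200
s = np.linspace(0.0, t, Ns + 1)

def expm(M, terms=50):
    out, term = np.eye(nn), np.eye(nn)
    for k in range(1, terms):
        term = term @ M / k
        out += term
    return out

def trap(vals):
    """Trapezoid rule along axis 0 on the uniform grid s."""
    return (t / Ns) * (0.5 * vals[0] + vals[1:-1].sum(axis=0) + 0.5 * vals[-1])

# full memory: w0(t) = int_0^t P e^{sL} P e^{(t-s)LQ} LQL u0 ds
v = LT @ Q @ LT @ u0
w0 = trap(np.array([P @ expm(si * LT) @ P @ expm((t - si) * (LT @ Q)) @ v for si in s]))

def h_model(p):
    """Truncated Dyson series: repeated integrals collapsed via Cauchy's formula."""
    total = np.zeros(nn)
    for k in range(1, p + 1):
        g = LT @ np.linalg.matrix_power(Q @ LT, k) @ u0   # L (QL)^k u0
        Fk = np.array([(t - si) ** (k - 1) / factorial(k - 1)
                       * (P @ expm(si * LT) @ P @ g) for si in s])
        total += trap(Fk)
    return total

errs = [np.linalg.norm(w0 - h_model(p)) for p in (1, 2, 3, 4)]
assert errs[3] < errs[0]               # the truncation error shrinks with the order
```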

Hereafter, we present a thorough mathematical analysis that aims at estimating the error $\|w_0(t)-w_0^n(t)\|$, where $w_0(t)$ is the full memory at time t [see (26) or (30)], while $w_0^n(t)$ is the solution of the truncated hierarchy (31), with $w_n^{e_n}(t)$ given by (33), (34), (35), or (36). With such error estimates available, we can infer whether the approximation of the full memory $w_0(t)$ with $w_0^n(t)$ is accurate and, more importantly, whether the algorithm to approximate the memory integral converges. To the best of our knowledge, this is the first time a rigorous convergence analysis has been performed on various approximations of the MZ memory integral. It turns out that the error $\|w_0(t)-w_0^n(t)\|$ can be controlled through the construction of the hierarchy under some constraints on the initial condition.

#### 1. The H-model

Setting $w_n^{e_n}(t)=0$ in (31) yields an approximation by truncation, which we will refer to as the H-model (hierarchical model). Such a model was originally proposed by Stinis in Ref. 35. Hereafter, we provide error estimates and convergence results for this model. In particular, we derive an upper bound for the error $\|w_0(t)-w_0^n(t)\|$ and sufficient conditions for the convergence of the reduced-order dynamical system. Such conditions are problem dependent, i.e., they involve the Liouvillian $L$, the initial condition u0, and the projection $P$.

Theorem 3.4
(Accuracy of the H-model). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$, and let T > 0 be a fixed integration time. For some fixed n, let
$\alpha_j=\frac{\|(LQ)^{j+1}Lu_0\|}{\|(LQ)^{j}Lu_0\|}, \qquad 1\le j\le n.$
(37)
Then, for any 1 ≤ p ≤ n and all t ∈ [0, T], we have
$\|w_0(t)-w_0^p(t)\|\le M_3^p(t)\le M_3^p(T),$
where
$M_3^p(t)=C_1A_1A_2\frac{t^{p+1}}{(p+1)!}\prod_{j=1}^p\alpha_j, \qquad C_1=MM_Q\|P\|^2\|LQLu_0\|,$
and
$A_1=\max_{s\in[0,T]}e^{s(\omega-\omega_Q)}=\begin{cases}1, & \omega\le\omega_Q,\\ e^{T(\omega-\omega_Q)}, & \omega\ge\omega_Q,\end{cases}\qquad A_2=\max_{s\in[0,T]}e^{s\omega_Q}=\begin{cases}1, & \omega_Q\le 0,\\ e^{T\omega_Q}, & \omega_Q\ge 0.\end{cases}$
(38)

Proof.
We begin with the expression for the difference between the memory term $w_0$ and its approximation $w_0^p$,
$w_0(t)-w_0^p(t)=\int_0^t\!\!\int_0^{\tau_p}\!\!\cdots\int_0^{\tau_2}\!\!\int_0^{\tau_1}Pe^{sL}Pe^{(\tau_1-s)LQ}(LQ)^{p+1}Lu_0\,ds\,d\tau_1\cdots d\tau_p.$
(39)
Since $e^{tL}$ and $e^{tLQ}$ are strongly continuous semigroups, we have $\|e^{tL}\|\le Me^{\omega t}$ and $\|e^{tLQ}\|\le M_Qe^{\omega_Q t}$. By using Cauchy’s formula for repeated integration, we bound the norm of the error (39) as
$\begin{aligned}\|w_0(t)-w_0^p(t)\|&\le\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{\sigma}\left\|Pe^{sL}Pe^{(\sigma-s)LQ}(LQ)^{p+1}Lu_0\right\|ds\,d\sigma\\&\le\|P\|^2MM_Q\|(LQ)^{p+1}Lu_0\|\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{\sigma}e^{s\omega}e^{(\sigma-s)\omega_Q}\,ds\,d\sigma\\&\le C_1\prod_{j=1}^p\alpha_j\underbrace{\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{\sigma}e^{s\omega}e^{(\sigma-s)\omega_Q}\,ds\,d\sigma}_{f_p(t,\omega,\omega_Q)}=C_1\prod_{j=1}^p\alpha_j\,f_p(t,\omega,\omega_Q),\end{aligned}$
(40)
where $C_1=\|P\|^2\|LQLu_0\|MM_Q$ as before. The function $f_p(t,\omega,\omega_Q)$ may be bounded from above as
$f_p(t,\omega,\omega_Q)\le A_1A_2\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\int_0^{\sigma}ds\,d\sigma=A_1A_2\frac{t^{p+1}}{(p+1)!}.$
Hence, we have
$\|w_0(t)-w_0^p(t)\|\le C_1A_1A_2\prod_{j=1}^p\alpha_j\frac{t^{p+1}}{(p+1)!}=M_3^p(t).$

Theorem 3.4 states that for a given dynamical system (represented by $L$) and quantity of interest (represented by $P$), the error bound $M_3^p(t)$ is strongly related to {αj}, which is ultimately determined by the initial condition u0. It turns out that by bounding {αj}, we can control $M_3^p(t)$ and therefore the overall error $\|w_0(t)-w_0^p(t)\|$. The following corollaries discuss sufficient conditions such that the error $\|w_0(T)-w_0^n(T)\|$ decays as we increase the differentiation order n for fixed time T > 0.

Corollary 3.4.1
(Uniform convergence of the H-model). If {αj} in Theorem 3.4 satisfy
$\alpha_j<\frac{j+1}{T}, \qquad 1\le j\le n,$
(41)
for any fixed time T > 0, then there exists a sequence of constants $\delta_1>\delta_2>\cdots>\delta_n$ such that
$\|w_0(T)-w_0^p(T)\|\le\delta_p, \qquad 1\le p\le n.$

Proof.
Evaluating (40) at any fixed (finite) time T > 0 yields
$\|w_0(T)-w_0^p(T)\|\le C_2\prod_{j=1}^p\alpha_j\,f_p(T,\omega,\omega_Q)\le C_2\prod_{j=1}^p\alpha_j\frac{T^{p+1}}{(p+1)!},\qquad \|w_0(T)-w_0^{p+1}(T)\|\le C_2\prod_{j=1}^{p+1}\alpha_j\frac{T^{p+2}}{(p+2)!},$
where $C_2=C_2(T)=C_1A_1A_2$. If there exists $\delta_p\ge 0$ such that
$\|w_0(T)-w_0^p(T)\|\le C_2\prod_{j=1}^p\alpha_j\frac{T^{p+1}}{(p+1)!}\le\delta_p,$
then there exists a δp+1 such that
$\|w_0(T)-w_0^{p+1}(T)\|\le\left(C_2\prod_{j=1}^p\alpha_j\frac{T^{p+1}}{(p+1)!}\right)\frac{\alpha_{p+1}T}{p+2}\le\delta_{p+1}<\delta_p$
since $\alpha_{p+1}<(p+2)/T$. Moreover, the condition $\alpha_j<(j+1)/T$ holds for all $1\le j\le n$. Therefore, we conclude that for any fixed time T > 0, there exists a sequence of constants $\delta_1>\delta_2>\cdots>\delta_n$ such that $\|w_0(T)-w_0^p(T)\|\le\delta_p$, where $1\le p\le n$.

Corollary 3.4.1 provides a sufficient condition for the error $\|w_0(t)-w_0^p(t)\|$ to decrease monotonically as we increase p in (31). A stronger condition that yields an asymptotically decaying error bound is given by the following corollary.

Corollary 3.4.2
(Asymptotic convergence of the H-model). If {αj} in Theorem 3.4 satisfy
$\alpha_j<C, \qquad 1\le j\le n,$
(42)
for some positive constant C, then for any fixed time T > 0 and arbitrary δ > 0, there exists a constant $1\le p<+\infty$ such that for all n > p,
$\|w_0(T)-w_0^n(T)\|\le\delta.$

Proof.
By introducing the condition αj < C in the Proof of Theorem 3.4, we obtain
$\|w_0(T)-w_0^p(T)\|\le C_2\prod_{j=1}^p\alpha_j\frac{T^{p+1}}{(p+1)!}\le C_2T\frac{(CT)^{p}}{(p+1)!}\qquad\text{for all } 1\le p<+\infty.$
The limit
$\lim_{p\to+\infty}C_2T\frac{(CT)^{p}}{(p+1)!}=0$
allows us to conclude that there exists a constant $1\le p<+\infty$ such that for all n > p, $\|w_0(T)-w_0^n(T)\|\le\delta$.

An interesting consequence of Corollary 3.4.2 is the existence of a convergence barrier, i.e., a “hump” in the plot of the error $\|w_0(T)-w_0^p(T)\|$ versus p generated by the H-model. While Corollary 3.4.2 only establishes this behavior for an upper bound of the error, not directly for the error itself, the feature is often found in the actual errors associated with numerical methods based on these ideas. The following corollary shows that the requirements on {αj} can be dropped (we still need $\alpha_j<+\infty$) if we consider relatively short integration times T.

Corollary 3.4.3
(Short-time convergence of the H-model). For any integer n for which $\alpha_j<\infty$ for $1\le j\le n$ and any sequence of constants $\delta_1>\delta_2>\cdots>\delta_n>0$, there exists a fixed time T > 0 such that
$\|w_0(T)-w_0^p(T)\|\le\delta_p$
for 1 ≤ p ≤ n.

Proof.
Since $\alpha_j<+\infty$, we can choose $C=\max_{1\le j\le n}\alpha_j$. By following the same steps we used in the Proof of Theorem 3.4, we conclude that for
$T\le\frac{1}{C}\min_{1\le p\le n}\left[\frac{C(p+1)!\,\delta_p}{C_2}\right]^{\frac{1}{p+1}},$
the errors satisfy
$\|w_0(T)-w_0^p(T)\|\le C_2\prod_{j=1}^p\alpha_j\frac{T^{p+1}}{(p+1)!}\le\frac{C_2}{C}\frac{(CT)^{p+1}}{(p+1)!}\le\delta_p$
as desired, for all $1\le p\le n$.

Corollaries 3.4.1 and 3.4.2 provide sufficient conditions for the error $\|w_0(T)-w_0^n(T)\|$ generated by the H-model to decay as we increase the truncation order n. However, we still need to answer the important question of whether the H-model actually provides accurate results for a given nonlinear dynamics ($L$), quantity of interest ($P$), and initial state x0. Corollary 3.4.3 provides a partial answer to this question by showing that, at least over short time periods, condition (41) is always satisfied (assuming that {αj} are finite). This guarantees the short-time convergence of the H-model for any reasonably smooth nonlinear dynamical system and almost any observable. However, for longer integration times T, convergence of the H-model for arbitrary nonlinear dynamical systems cannot be established in general, which means that we need to proceed on a case-by-case basis by applying Theorem 3.4 or by checking whether the hypotheses of Corollary 3.4.1 or Corollary 3.4.2 are satisfied. On the other hand, the convergence of the H-model can be established for any finite integration time in the case of linear dynamical systems, as we have recently shown in Ref. 44. From a practical viewpoint, the implementation of the H-model requires computing $(LQ)^nLu_0$ to high order in n. This is not straightforward if the dynamical system is nonlinear. However, for linear systems, such terms can be easily computed (see Sec. III E).
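For a linear system, the ratios $\alpha_j$ in (37) reduce to norms of matrix-vector products, so the hypotheses of the corollaries above can be checked explicitly. The Python sketch below (the matrix and observable are illustrative assumptions, identifying $L$ with $A^T$ on observable coefficients) computes $\alpha_j$, verifies the telescoping identity $\prod_{j=1}^p\alpha_j=\|(LQ)^{p+1}Lu_0\|/\|LQLu_0\|$, and finds the largest T for which condition (41) holds:

```python
import numpy as np

rng = np.random.default_rng(5)
nn = 5
A = 0.5 * rng.standard_normal((nn, nn))
LT = A.T                               # L acts on observable coefficients as A^T
P = np.zeros((nn, nn)); P[0, 0] = 1.0
Q = np.eye(nn) - P
u0 = rng.standard_normal(nn)

def lq_pow_L(j):
    """(LQ)^j L u0 as a matrix-vector product: LQ acts as (A^T Q)."""
    return np.linalg.matrix_power(LT @ Q, j) @ (LT @ u0)

alphas = [np.linalg.norm(lq_pow_L(j + 1)) / np.linalg.norm(lq_pow_L(j))
          for j in range(1, 6)]

# telescoping product of the ratios alpha_j
p = 5
prod = np.prod(alphas[:p])
assert abs(prod - np.linalg.norm(lq_pow_L(p + 1)) / np.linalg.norm(lq_pow_L(1))) < 1e-10

# largest T for which alpha_j < (j+1)/T holds for all computed j
# (enumerate index is 0-based, so alpha_{j+1} is compared against (j+2)/T)
T_star = min((j + 2) / a for j, a in enumerate(alphas))
assert all(a < (j + 2) / (0.9 * T_star) for j, a in enumerate(alphas))
```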

#### 2. Type-I finite memory approximation (FMA)

The Type-I finite memory approximation is obtained by solving the system (31) with $w_n^{e_n}(t)$ given by (34). As before, we first derive an upper bound for $\|w_0(t)-w_0^n(t)\|$ and then discuss sufficient conditions for convergence. Such conditions essentially control the growth of an upper bound on $\|w_0(t)-w_0^n(t)\|$.

Theorem 3.5
(Accuracy of the Type-I FMA). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups, and let T > 0 be a fixed integration time. If
$\alpha_j=\frac{\|(LQ)^{j+1}Lu_0\|}{\|(LQ)^{j}Lu_0\|}, \qquad 1\le j\le n,$
(43)
then for each 1 ≤ p ≤ n and for Δtp ≤ t ≤ T,
$\|w_0(t)-w_0^p(t)\|\le M_4^p(t),$
where
$M_4^p(t)=C_1A_1A_2\prod_{i=1}^p\alpha_i\,\frac{(t-\Delta t_p)^{p+1}}{(p+1)!}$
and $C_1$, $A_1$, and $A_2$ are as in Theorem 3.4.

Proof.
The error at the pth level is of the form
$w_p(t)-w_p^{e_p}(t)=\int_0^{\max(0,\,t-\Delta t_p)}Pe^{sL}PLe^{(t-s)QL}(QL)^{p+1}u_0\,ds,$
and the error at the zeroth level is
$\begin{aligned}w_0(t)-w_0^p(t)&=\int_0^t\!\!\int_0^{\tau_p}\!\!\cdots\int_0^{\tau_2}\left[w_p(\tau_1)-w_p^{e_p}(\tau_1)\right]d\tau_1\cdots d\tau_p\\&=\int_0^t\!\!\int_0^{\tau_p}\!\!\cdots\int_0^{\tau_2}\!\!\int_0^{\max(0,\,\tau_1-\Delta t_p)}Pe^{sL}Pe^{(\tau_1-s)LQ}(LQ)^{p+1}Lu_0\,ds\,d\tau_1\cdots d\tau_p\\&=\int_{\Delta t_p}^t\!\!\int_{\Delta t_p}^{\tau_p}\!\!\cdots\int_{\Delta t_p}^{\tau_2}\!\!\int_0^{\tau_1-\Delta t_p}Pe^{sL}Pe^{(\tau_1-s)LQ}(LQ)^{p+1}Lu_0\,ds\,d\tau_1\cdots d\tau_p\\&\ \ \vdots\\&=\int_0^{\max(0,\,t-\Delta t_p)}\!\!\int_0^{\tilde\tau_p}\!\!\cdots\int_0^{\tilde\tau_2}\!\!\int_0^{\tilde\tau_1}Pe^{sL}Pe^{(\tilde\tau_1+\Delta t_p-s)LQ}(LQ)^{p+1}Lu_0\,ds\,d\tilde\tau_1\cdots d\tilde\tau_p.\end{aligned}$
The norm of this error may be bounded as
$‖w0(t)−w0p(t)‖≤∫0max(0,t−Δtp)∫0τ̃p⋯∫0τ̃2∫0τ̃1PesLPe(τ̃1+Δtp−s)LQ(LQ)p+1Lu0dsdτ̃1⋯dτ̃p≤C1∏j=1pαj∫0max(0,t−Δtp)∫0τ̃p⋯∫0τ̃2∫0τ̃1es(ω−ωQ)e(τ̃1+Δtp)ωQdsdτ̃1⋯dτ̃p≤C1∏j=1pαjfp(t,Δtp,ω,ωQ),$
where
$fp(t,Δtp,ω,ωQ)=∫0max(0,t−Δtp)∫0τ̃p⋯∫0τ̃2∫0τ̃1es(ω−ωQ)e(τ̃1+Δtp)ωQdsdτ̃1⋯dτ̃p.$
If we bound fp as
$fp(t,Δtp,ω,ωQ)≤A1A2∫0max(0,t−Δtp)∫0τ̃p⋯∫0τ̃2∫0τ̃1dsdτ̃1⋯dτ̃p=0, 0≤t≤Δtp,A1A2(t−Δtp)p+1(p+1)!, t≥Δtp,$
where A1, A2 are defined in (38), then we have that
$‖w0(t)−w0p(t)‖≤C1A1A2∏j=1pαj(t−Δtp)p+1(p+1)!=M4p(t).$

We notice that if the effective memory band at each level decreases as we increase the differentiation order p, then we can control the error $‖w0(t)−w0n(t)‖$. The following corollary provides a sufficient condition that guarantees this sort of control of the error.

Corollary 3.5.1
(Uniform convergence of the Type-I FMA). If the αj in Theorem 3.5 satisfy
$\alpha_j<(j+1)\left(\frac{\delta\, j!}{C_1A_1A_2\prod_{k=1}^{j-1}\alpha_k}\right)^{-1/j}, \quad 1\le j\le n,$
(44)
then for any T > 0 and δ > 0, there exists an ordered sequence Δtn < Δtn−1 < ⋯ < Δt1 < T such that
$\|w_0(T)-w_0^p(T)\|\le\delta, \quad 1\le p\le n,$
which satisfies
$\Delta t_p\le T-\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\prod_{j=1}^p\alpha_j}\right)^{\frac{1}{p+1}}.$
(45)

Proof.
For 1 ≤ pn, we set
$\|w_0(t)-w_0^p(t)\|\le C_1A_1A_2\prod_{j=1}^p\alpha_j\,\frac{(t-\Delta t_p)^{p+1}}{(p+1)!}\le\delta.$
This yields the following requirement on Δtp:
$\Delta t_p\ge T-\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\prod_{j=1}^p\alpha_j}\right)^{\frac{1}{p+1}}.$
(46)
Since hypothesis (44) holds, it is easy to check that the lower bound on each Δtp satisfies
$T-\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\prod_{j=1}^p\alpha_j}\right)^{\frac{1}{p+1}}<\Delta t_{p-1}.$
Therefore, by using the equality in (46) to define a sequence of Δtn, we find that it is a decreasing time sequence 0 < Δtn < Δtn−1 < ⋯ < Δt1 < T such that $‖w0(T)−w0n(T)‖≤δ$ holds for all t ∈ [0, T] and which satisfies (45).

Remark.
The sufficient condition provided in Corollary 3.5.1 guarantees uniform convergence of the Type-I finite memory approximation. If we replace condition (44) with
$\alpha_j\le C, \quad 1\le j\le n,$
where C is a positive constant (independent of T), then we obtain asymptotic convergence. In other words, for each δ > 0, there exists an integer p such that for all n > p, we have $\|w_0(t)-w_0^n(t)\|<\delta$. This result is based on the limit
$\lim_{p\to+\infty}\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\prod_{j=1}^p\alpha_j}\right)^{\frac{1}{p+1}}\ge\lim_{p\to+\infty}\left(\frac{\delta\,(p+1)!}{C_1A_1A_2\,C^p}\right)^{\frac{1}{p+1}}=+\infty,$
which guarantees the existence of an integer p for which the upper bound on Δtp is smaller or equal to zero. In such a case, the Type I FMA degenerates to the H-model, for which Corollary 3.4.2 holds.
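The construction in Corollary 3.5.1 can be sketched numerically. In the snippet below, all numbers (T, δ, the constant C standing in for C1A1A2, and the ratios αj) are illustrative choices of ours, not values from the paper; the point is that defining the memory cutoffs Δtp by equality in (46) yields a decreasing positive sequence when the αj are small enough.

```python
import math

# Illustrative sketch of Corollary 3.5.1 (all constants below are our own
# choices): define the memory cutoffs Delta t_p by equality in (46) and check
# that they form a decreasing sequence 0 < Dt_n < ... < Dt_1 < T.
T, delta, C = 10.0, 1e-3, 2.0                    # C plays the role of C1*A1*A2
alpha = [0.05 * j for j in range(1, 9)]          # hypothetical ratios alpha_1..alpha_8

dts, prod = [], 1.0
for p, a_p in enumerate(alpha, start=1):
    prod *= a_p
    dts.append(T - (delta * math.factorial(p + 1) / (C * prod)) ** (1.0 / (p + 1)))

assert all(0.0 < dt < T for dt in dts)           # valid cutoffs inside (0, T)
assert all(x > y for x, y in zip(dts, dts[1:]))  # strictly decreasing sequence
```

The length of the neglected memory band, T − Δtp, grows with p, which is exactly the mechanism that lets the Type-I FMA degenerate gracefully into the H-model.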

#### 3. Type-II finite memory approximation

The Type-II finite memory approximation is obtained by solving the system (31) with $wnen(t)$ given in (35). We first derive an upper bound for $‖w0(t)−w0n(t)‖$ and then discuss sufficient conditions for convergence.

Theorem 3.6
(Accuracy of the Type-II FMA). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$. If
$\alpha_j=\frac{\|(LQ)^{j+1}Lu_0\|}{\|(LQ)^jLu_0\|}, \quad 1\le j\le n,$
(47)
then for 1 ≤ p ≤ n,
$\|w_0(t)-w_0^p(t)\|\le M_5^p(t),$
where
$M_5^p(t)=C_1\prod_{j=1}^p\alpha_j\,f_p(\omega_Q,t)\,h(\omega-\omega_Q,t_p),$
$f_p(\omega_Q,t)=\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}e^{\sigma\omega_Q}\,d\sigma,\qquad h(\omega-\omega_Q,t_p)=\int_0^{t_p}e^{s(\omega-\omega_Q)}\,ds,$
and $C_1=MM_Q\|P\|^2\|LQLu_0\|$.

Proof.
By following the same procedure as in the proof of Theorem 3.4, we obtain
$w0(t)−w0p(t)=∫0t∫0τp⋯∫0τ2∫0min(τ1,tp)PesLPe(τ1−s)LQ(LQ)p+1Lu0dsdτ1⋯dτp.$
By applying Cauchy’s formula for repeated integration, this expression may be simplified to
$w0(t)−w0p(t)=∫0t(t−σ)p−1(p−1)!∫0min(σ,tp)PesLPe(σ−s)LQ(LQ)p+1Lu0dsdσ.$
Thus,
$‖w0(t)−w0p(t)‖≤∫0t(t−σ)p−1(p−1)!∫0min(σ,tp)PesLPe(σ−s)LQ(LQ)p+1Lu0dsdσ≤MMQ‖P‖2‖(LQ)p+1Lu0‖∫0t(t−σ)p−1(p−1)!∫0tpesωe(σ−s)ωQdsdσ≤C1∏j=1pαj∫0t(t−σ)p−1(p−1)!eσωQdσ∫0tpes(ω−ωQ)ds=C1∏j=1pαjfp(ωQ,t)h(ω−ωQ,tp)=M5p(t),$
where $C_1=MM_Q\|P\|^2\|LQLu_0\|$,
$f_p(\omega_Q,t)=\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}e^{\sigma\omega_Q}\,d\sigma=\begin{cases}\dfrac{t^p}{p!}, & \omega_Q=0,\\[6pt] \dfrac{1}{\omega_Q^p}\left(e^{t\omega_Q}-\displaystyle\sum_{k=0}^{p-1}\frac{(t\omega_Q)^k}{k!}\right), & \omega_Q\neq0,\end{cases}$
(48)
and
$h(\omega-\omega_Q,t_p):=\int_0^{t_p}e^{s(\omega-\omega_Q)}\,ds=\begin{cases}t_p, & \omega=\omega_Q,\\[4pt] \dfrac{e^{t_p(\omega-\omega_Q)}-1}{\omega-\omega_Q}, & \omega\neq\omega_Q,\end{cases}$
are both strictly increasing functions of t and tp, respectively.

Corollary 3.6.1
(Uniform convergence of the Type-II FMA). If the αj in Theorem 3.6 satisfy
$\alpha_j<\begin{cases}\dfrac{j}{T}, & \omega_Q=0,\\[6pt] \omega_Q\,\dfrac{e^{T\omega_Q}-\sum_{k=0}^{j-2}(T\omega_Q)^k/k!}{e^{T\omega_Q}-\sum_{k=0}^{j-1}(T\omega_Q)^k/k!}, & \omega_Q\neq0,\end{cases}$
(49)
for 1 ≤ jn, then for any arbitrarily small δ > 0, there exists an ordered sequence 0 < t0 < t1 < ⋯ < tnT such that
$‖w0(T)−w0p(T)‖≤δ, 1≤p≤n,$
which satisfies
$t_j\ge\begin{cases}\dfrac{j!\,\delta}{C_1\prod_{i=1}^j\alpha_i\,T^j}, & \omega_Q=0,\\[8pt] \dfrac{\omega_Q^j\,\delta}{C_1\prod_{i=1}^j\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{j-1}(T\omega_Q)^k/k!\right)}, & \omega_Q\neq0,\end{cases}$
when $\omega=\omega_Q$, and
$t_j\ge\begin{cases}\dfrac{1}{\omega}\ln\left(1+\dfrac{j!\,\omega\,\delta}{C_1\prod_{i=1}^j\alpha_i\,T^j}\right), & \omega_Q=0,\\[8pt] \dfrac{1}{\omega-\omega_Q}\ln\left(1+\dfrac{(\omega-\omega_Q)\,\omega_Q^j\,\delta}{C_1\prod_{i=1}^j\alpha_i\left(e^{T\omega_Q}-\sum_{k=0}^{j-1}(T\omega_Q)^k/k!\right)}\right), & \omega_Q\neq0,\end{cases}$
when $\omega\neq\omega_Q$.

Proof.
We now consider separately the two cases where $ω=ωQ$ and where $ω≠ωQ$. If $ω=ωQ$, then
$‖w0(t)−w0p(t)‖≤C1∏i=1pαi∫0t∫0τp⋯∫0τ2∫0tpeτ1ωQes(ω−ωQ)dsdτ1⋯dτp=tpC1∏i=1pαi∫0t∫0τp⋯∫0τ2eτ1ωQdτ1⋯dτp=tpC1∏i=1pαifp(ωQ,t),$
where $fp(ωQ,t)$ is defined in (48). To ensure that $‖w0(t)−w0p(t)‖≤δ$ for all 0 ≤ tT, we can take
$tpC1∏i=1pαifp(ωQ,T)=maxt∈[0,T]tpC1∏i=1pαifp(ωQ,t)≤δ$
so that
$tp≤δC1∏i=1pαifp(ωQ,T)=p!δC1∏i=1pαiTp, ωQ=0,ωQpδC1∏i=1pαieTωQ−∑k=0p−1(TωQ)kk!, ωQ≠0.$
On the other hand, if $ω≠ωQ$, then
$‖w0(t)−w0p(t)‖≤C1∏i=1pαi∫0t∫0τp⋯∫0τ2∫0tpeτ1ωQes(ω−ωQ)dsdτ1⋯dτp=etp(ω−ωQ)−1ω−ωQC1∏i=1pαi∫0t∫0τp⋯∫0τ2eτ1ωQdτ1⋯dτp=etp(ω−ωQ)−1ω−ωQC1∏i=1pαifp(ωQ,t).$
To ensure that $‖w0(t)−w0p(t)‖≤δ$ for all 0 ≤ tT, we can take
$etp(ω−ωQ)−1ω−ωQC1∏i=1pαifp(ωQ,T)=maxt∈[0,T]etp(ω−ωQ)−1ω−ωQC1∏i=1pαifp(ωQ,t)≤δ.$
Let us now consider the two cases $ω>ωQ$ and $ω<ωQ$ separately. When $ω>ωQ$, we have
$etp(ω−ωQ)≤1+(ω−ωQ)δC1∏i=1pαifp(ωQ,T)$
and
$tp≤1ωln1+p!ωδC1∏i=1pαiTp, ωQ=0,1ω−ωQln1+(ω−ωQ)ωQpδC1∏i=1pαieTωQ−∑k=0p−1(TωQ)kk!, ωQ≠0.$
On the other hand, when $ω<ωQ$, we have
$1−e−tp(ωQ−ω)ωQ−ωC1∏i=1pαifp(ωQ,T)=maxt∈[0,T]1−e−tp(ωQ−ω)ωQ−ωC1∏i=1pαifp(ωQ,t)≤δ$
so that
$e−tp(ωQ−ω)≥1−(ωQ−ω)δC1∏i=1pαifp(ωQ,T),$
i.e.,
$−tp(ωQ−ω)≥ln1−(ωQ−ω)δC1∏i=1pαifp(ωQ,T).$
Hence,
$tp≤1ωln1−p!(−ω)δC1∏i=1pαiTp, ωQ=0,−1ωQ−ωln1−(ωQ−ω)ωQpδC1∏i=1pαieTωQ−∑k=0p−1(TωQ)kk!, ωQ≠0.$
For all the four cases, if $ωQ=0$, then we have condition αp < p/T, and the upper bound of the time sequence satisfies
$p!δC1∏i=1pαiTp<(p−1)!δC1∏i=1p−1αiTp−1, p≥2.$
If $ωQ≠0$, then we have the condition
$αp<ωQeTωQ−∑k=0p−2(TωQ)kk!eTωQ−∑k=0p−1(TωQ)kk!$
and the upper bound of the time sequence satisfies
$ωQp−1eTωQ−∑k=0p−2(TωQ)kk!∏i=1p−1αi<ωQpeTωQ−∑k=0p−1(TωQ)kk!∏i=1pαi, p≥2.$
Therefore, there always exists an increasing time sequence 0 < t1 < ⋯ < tn such that $\|w_0(t)-w_0^p(t)\|\le\delta$ for all 0 ≤ t ≤ T. Since we have proved that this δ-bound on the error holds whenever the tn are upper bounded as in the two cases above, there exists such an increasing time sequence 0 < t1 < ⋯ < tn with tn lower-bounded by the same quantities. Indeed, because of the coarseness of the approximations applied in the proof, there may exist such a time sequence with significantly larger ti.

Remark.
If we replace (49) with the stronger condition
$\alpha_j<\begin{cases}\dfrac{j}{\epsilon T}, & \omega_Q=0,\\[8pt] \dfrac{\omega_Q}{\epsilon}\,\dfrac{e^{T\omega_Q}-\sum_{k=0}^{j-2}(T\omega_Q)^k/k!}{e^{T\omega_Q}-\sum_{k=0}^{j-1}(T\omega_Q)^k/k!}, & \omega_Q\neq0,\end{cases}$
(50)
where ϵ is some arbitrary constant satisfying ϵ > 1, then we have
$limj→+∞tj≥limj→+∞j!δC1∏i=1jαiTj≥δC1ϵj=+∞, ωQ=0,ωQjδC1∏i=1jαieTωQ−∑k=0j−1(TωQ)kk!=ωQjδC1∏i=1jαio(TjωQj)=+∞, ωQ≠0$
for $ω=ωQ$ and
$limj→+∞tj≥limj→+∞jωlnδC1ωϵ=+∞, ωQ=0,jω−ωQln(ω−ωQ)TjC1o(Tj)=+∞, ωQ≠0.$
Hence, there exists a j such that the upper bound for tj is greater than or equal to T. For such a case, the Type II FMA degenerates to the truncation approximation (H-model), for which Corollary 3.4.2 grants us asymptotic convergence.

#### 4. Ht-model

The Ht-model is obtained by solving the system (31) with $wnen(t)$ approximated using Chorin’s t-model12 [see Eq. (36)]. Convergence analysis can be performed by using the mathematical methods we employed for the proofs of the H-model. Note that the classical t-model is equivalent to a zeroth-order Ht-model.

Theorem 3.7
(Accuracy of the Ht-model). Let $e^{tL}$ and $e^{tLQ}$ be strongly continuous semigroups with upper bounds $\|e^{tL}\|\le Me^{t\omega}$ and $\|e^{tLQ}\|\le M_Qe^{t\omega_Q}$, and let T > 0 be a fixed integration time. For some fixed n, let
$\alpha_j=\frac{\|(LQ)^{j+1}Lu_0\|}{\|(LQ)^jLu_0\|}, \quad 1\le j\le n.$
(51)
Then, for any 1 ≤ pn and all t ∈ [0, T], we have
$‖w0(t)−w0p(t)‖≤M6p(t)≤M6p(T),$
where
$M_6^p(t)=C_4\prod_{j=1}^p\alpha_j\,\frac{t^{p+1}}{(p+1)!},\qquad C_4=C_1A_1A_2+C_1M_QA_3,\qquad A_3=\max_{s\in[0,T]}e^{s\omega}=\begin{cases}1, & \omega\le0,\\ e^{T\omega}, & \omega>0,\end{cases}$
and C1, A1, A2are the same as before.

Proof.
For the pth-order Ht-model, the difference between the memory term $w_0$ and its approximation $w_0^p$ is
$w0(t)−w0p(t)=∫0t∫0τp⋯∫0τ2∫0τ1PesLPe(τ1−s)LQ(LQ)p+1Lu0ds−τ1Peτ1LP(LQ)p+1Lu0× dτ1⋯dτp.$
(52)
Using Cauchy’s formula for repeated integration, we can bound the norm of the second term in (52) as
$∫0t(t−σ)p−1(p−1)!σPeσLP(LQ)p+1Lu0dσ≤∫0t(t−σ)p−1(p−1)!‖σPeσLP(LQ)p+1Lu0‖dσ≤‖P‖2M‖(LQ)p+1Lu0‖∫0t(t−σ)p−1(p−1)!σeσωdσ︸gp(t,ω)=C1MQ∏j=1pαjgp(t,ω),$
(53)
where $C1=‖P‖2‖LQLu0‖MMQ$ as before. The function gp(t, ω) may be bounded from above as
$g_p(t,\omega)\le A_3\int_0^t\frac{(t-\sigma)^{p-1}}{(p-1)!}\,\sigma\,d\sigma=A_3\frac{t^{p+1}}{(p+1)!},\qquad A_3=\max_{s\in[0,T]}e^{s\omega}=\begin{cases}1, & \omega\le0,\\ e^{T\omega}, & \omega>0.\end{cases}$
By applying the triangle inequality to (52) and taking (53) into account, we obtain
$‖w0(t)−w0p(t)‖≤C1A1A2∏j=1pαjtp+1(p+1)!+C1MQA3∏j=1pαjtp+1(p+1)!=M6p(t).$

One can see that the upper bounds $M6p(t)$ and $M3p(t)$ (see Theorem 3.4) share the same structure, the only difference being the constant out front. Hence by changing C2 to C4, we can prove a series of corollaries similar to 3.4.1–3.4.3. In summary, what holds for the H-model also holds for the Ht-model. For the sake of brevity, we omit the statement and proofs of those corollaries.

The upper bounds we obtained above are not easily computable for general nonlinear systems and infinite-rank projections, e.g., Chorin's projection (7). However, if the dynamical system is linear, then such upper bounds are explicitly computable, and the convergence of the H-model can be established for linear phase space functions in any finite integration time T. To this end, consider the linear system $\dot{x}=Ax$ with a random initial condition x(0) sampled from the joint probability density function

$\rho_0(x_0)=\delta(x_{01}-x_1(0))\prod_{j=2}^N\rho_{0j}(x_{0j}).$
(54)

In other words, the initial condition for the quantity of interest u(x) = x1(t) is set to be deterministic, while all other variables x2, …, xN are zero-mean and statistically independent at t = 0. Here we assume for simplicity that the ρ0j (j = 2, …, N) are independent identically distributed (i.i.d.) standard normal distributions. Observe that the Liouville operator associated with the linear system $\dot{x}=Ax$ is

$L=\sum_{i=1}^N\sum_{j=1}^NA_{ij}x_j\frac{\partial}{\partial x_i},$
(55)

where Aij are the entries of the matrix A. If we choose observable u = x1(t), then Chorin’s projection operator (9) yields the evolution equation for the conditional expectation $E[x1(t)|x1(0)]$, i.e., the conditional mean path (11), which can be explicitly written as

$\frac{d}{dt}E[x_1|x_1(0)]=A_{11}E[x_1|x_1(0)]+w_0(t),$
(56)

where $A_{11}=PLx_1(0)$ is the first entry of the matrix A and $w_0$ represents the memory integral (26). Next, we explicitly compute the upper bounds for the memory growth and the error in the H-model for this system. To this end, we first notice that the domain of the Liouville operator can be restricted to the linear space

$V=span{x1,…,xN}.$
(57)

In fact, V is invariant under $L$, $P$, and $Q$, i.e., $LV⊆V$, $PV⊆V$, and $QV⊆V$. These operators have the following matrix representations:

$L\simeq A^T=\begin{bmatrix}a_{11} & b^T\\ a & M_{11}^T\end{bmatrix},\qquad P\simeq\begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 0\end{bmatrix},\qquad Q\simeq\begin{bmatrix}0 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix},$

where M11 is the minor of the matrix of A obtained by removing the first column and the first row, while

$a=[A12⋯A1N]T, bT=[A21⋯AN1].$
(58)

Therefore,

$LQ\simeq\begin{bmatrix}0 & b^T\\ 0 & M_{11}^T\end{bmatrix},\qquad L(QL)^nx_1(0)\simeq\begin{bmatrix}b^T(M_{11}^T)^{n-1}a\\ (M_{11}^T)^na\end{bmatrix}.$
(59)
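The block representation (59) is easy to check numerically. The snippet below is a sketch (the 5 × 5 matrix A is an arbitrary illustrative choice): in the basis of V, the operators act as $L\simeq A^T$ and $Q\simeq\mathrm{diag}(0,1,\dots,1)$, and applying QL repeatedly to the coordinates of x1 reproduces the closed-form block expression.

```python
import numpy as np

# Check of the block representation (59) for a random linear system (sketch;
# the matrix A is an arbitrary illustrative choice).
rng = np.random.default_rng(0)
N = 5
A = rng.standard_normal((N, N))

a = A[0, 1:]            # a   = [A_12, ..., A_1N]^T
b = A[1:, 0]            # b^T = [A_21, ..., A_N1]
M11T = A[1:, 1:].T      # transpose of the minor M_11

Lmat = A.T                              # matrix representation of L on V
Q = np.diag([0.0] + [1.0] * (N - 1))    # matrix representation of Q on V
e1 = np.eye(N)[0]                       # coordinates of the observable x_1

for n in range(1, 6):
    w = e1.copy()
    for _ in range(n):
        w = Q @ (Lmat @ w)              # apply the operator QL n times
    direct = Lmat @ w                   # L(QL)^n x_1(0), computed directly
    blockform = np.concatenate((
        [b @ np.linalg.matrix_power(M11T, n - 1) @ a],
        np.linalg.matrix_power(M11T, n) @ a))
    assert np.allclose(direct, blockform)
```

This is the computation that makes the H-model practical for linear systems: the terms $(LQ)^nLx_0$ reduce to matrix-vector products with the minor M11.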

At this point, we set x01 = x1(0) and

$q(t,x_{01},\tilde{x}_0)=\int_0^te^{sL}PLe^{(t-s)QL}QLx_{01}\,ds.$

Since $x̃0=(x2(0),…,xN(0))$ is random, $q(t,x01,x̃0)$ is a random variable. By using Jensen’s inequality $[E(X)]2≤E[X2]$, we have the following L estimate:

$\|(Pq)(t,x_{01})\|_{L^\infty}\le\|q(t,x_{01},\cdot)\|_{L^2_{\rho_0}}.$
(60)

On the other hand, we have

$\|e^{tL}\|_{L^2_{\rho_0}(V)}\le\|e^{tL}\|_{L^2_{\rho_0}}\le e^{t\omega},\qquad \omega=-\frac12\inf\left(\operatorname{div}_{\rho_0}F\right).$
(61)

For linear dynamical systems, both $‖⋅‖Lρ02(V)$ and $‖⋅‖Lρ02$ upper bounds can be used to estimate the norm of the semigroup $etL$. However, for the semigroup $etLQ$, we can only obtain the explicit form of the $‖⋅‖Lρ02(V)$ bound, which is given by the following perturbation theorem16 (see also  Appendix A):

$\|e^{tLQ}\|_{L^2_{\rho_0}(V)}\le e^{t\omega_Q},\quad\text{where}\quad \omega_Q=\omega+\sqrt{\frac{A_{11}^2+\sum_{i=2}^NA_{1i}^2\langle x_i^2(0)\rangle_{\rho_0}}{x_1^2(0)}}\ge\omega+\|LP\|_{L^2_{\rho_0}(V)}.$
(62)

Memory growth. It is straightforward at this point to compute the upper bound of the memory growth we obtained in Theorem 3.1. Since $‖P‖Lρ02=‖Q‖Lρ02=1$ ($P$ and $Q$ are the orthogonal projections relative to ρ0), we have the following result:

$|w_0(t)|\le\|LQLx_1(0)\|\,\frac{e^{t\omega}-e^{t\omega_Q}}{\omega-\omega_Q}=\left[(b^Ta)^2x_1^2(0)+\|\Lambda^{1/2}M_{11}^Ta\|_2^2\right]^{1/2}\frac{e^{t\omega}-e^{t\omega_Q}}{\omega-\omega_Q},$
(63)

where $\Lambda$ is the (N − 1) × (N − 1) diagonal matrix with $\Lambda_{ii}=\langle x_{i+1}^2(0)\rangle_{\rho_0}$ and ∥·∥2 is the vector 2-norm. Accuracy of the H-model. We are interested in computing the upper bound of the approximation error generated by the H-model (see Sec. III D 1, Theorem 3.4). By using the matrix representation of $L$, $P$, and $Q$, the nth order H-model approximation of the MZ Eq. (56) for the linear system can be explicitly written as

$\frac{d}{dt}E[x_1|x_1(0)]=A_{11}E[x_1|x_1(0)]+w_0^n(t)\quad\text{(MZ equation)},\qquad \frac{dw_j^n(t)}{dt}=b^T(M_{11}^T)^ja\,E[x_1|x_1(0)]+w_{j+1}^n(t),\quad j=0,1,\dots,n-1,\qquad \frac{dw_n^n(t)}{dt}=b^T(M_{11}^T)^na\,E[x_1|x_1(0)],$
(64)

where M11, a, and b are defined as before [see Eq. (58)]. The upper bound for the memory term approximation error is explicitly obtained as

$|w_0(t)-w_0^n(t)|\le A_1A_2\|L(QL)^{n+1}x_1(0)\|\frac{t^{n+1}}{(n+1)!}=A_1A_2\left[(b^T(M_{11}^T)^na)^2x_1^2(0)+\|\Lambda^{1/2}(M_{11}^T)^{n+1}a\|_2^2\right]^{1/2}\frac{t^{n+1}}{(n+1)!},$
(65)

where A1, A2 are defined in (38), while ω and $\omega_Q$ are given in (61) and (62), respectively. Note that the error bound (65) is slightly different from the one we obtained in Theorem 3.4. The reason is that here we choose to bound $\|L(QL)^{n+1}u_0\|$ directly, instead of the quotient $\alpha_n=\|L(QL)^{n+1}u_0\|/\|L(QL)^nu_0\|$. For each fixed integration time T, the upper bound (65) goes to zero as we send n to infinity, i.e.,

$limn→+∞|w0(T)−w0n(T)|=0.$

This means that the H-model converges for all linear dynamical systems with observables in the linear space (57).
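This convergence is simple to observe numerically. The sketch below builds the hierarchy (64) for a generic linear system $\dot{x}=Ax$ (the matrix A is an arbitrary illustrative choice, scaled to have small norm) and compares the H-model solution with the exact conditional mean path, which for x(0) = (x1(0), 0, …, 0) in mean is $(e^{At})_{11}x_1(0)$.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Sketch of the n-th order H-model hierarchy (64) for a generic linear system
# dx/dt = A x (A is an arbitrary illustrative choice, scaled to small norm).
rng = np.random.default_rng(1)
N, n, T, x10 = 6, 8, 2.0, 1.0
A = rng.standard_normal((N, N))
A *= 0.5 / np.linalg.norm(A, 2)

a, b, M11T = A[0, 1:], A[1:, 0], A[1:, 1:].T
coef = [b @ np.linalg.matrix_power(M11T, j) @ a for j in range(n + 1)]

def rhs(t, y):
    # y = [E[x1|x1(0)], w_0^n, ..., w_n^n], cf. Eq. (64)
    dy = np.empty_like(y)
    dy[0] = A[0, 0] * y[0] + y[1]
    for j in range(n):
        dy[j + 1] = coef[j] * y[0] + y[j + 2]
    dy[n + 1] = coef[n] * y[0]
    return dy

y0 = np.zeros(n + 2); y0[0] = x10
tgrid = np.linspace(0.0, T, 21)
sol = solve_ivp(rhs, (0.0, T), y0, t_eval=tgrid, rtol=1e-10, atol=1e-12)

# exact conditional mean path: E[x(0)] = (x10, 0, ..., 0)
exact = np.array([expm(A * t)[0, 0] * x10 for t in tgrid])
assert np.max(np.abs(sol.y[0] - exact)) < 1e-3
```

With ‖A‖ small the truncation error behaves like the factorial bound (65), so even moderate orders n already match the exact mean path to high accuracy over [0, T].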

The semigroup estimates we obtained in Sec. III A allow us to compute explicitly an a priori estimate of the memory kernel in the Mori-Zwanzig equation if we employ finite-rank projections. In this section, we outline the procedure to obtain such an estimate for Hamiltonian dynamical systems. We begin by recalling that, in general, Hamiltonian systems are necessarily divergence-free, i.e.,

$divρeq(F)=0.$
(66)

Here, F(x) is the velocity field at the right-hand side of (1), while $ρeq=e−βH/Z$ is the canonical Gibbs distribution. Equation (66) can be easily obtained by noticing that

$\operatorname{div}_{\rho_{eq}}(F)=e^{\beta H}\nabla\cdot\left(e^{-\beta H}F\right)=e^{\beta H}\sum_{i=1}^N\left[\frac{\partial}{\partial q_i}\left(e^{-\beta H}\frac{\partial H}{\partial p_i}\right)-\frac{\partial}{\partial p_i}\left(e^{-\beta H}\frac{\partial H}{\partial q_i}\right)\right]=0.$

By combining the estimate (18) with the divergence-free constraint (66), we find that the Koopman semigroup of a Hamiltonian dynamical system is a contraction in the $Lρeq2$ norm, i.e.,

$\|e^{tL}\|_{L^2_{\rho_{eq}}}\le1.$
(67)

Moreover, the MZ equation (6) with a finite-rank projection $P$ of the form (12) (with σ = ρeq) can be reduced to the following Volterra integro-differential equation:

$\frac{d}{dt}Pu_i(t)=\sum_{j=1}^M\Omega_{ij}Pu_j(t)-\sum_{j=1}^M\int_0^tK_{ij}(t-s)Pu_j(s)\,ds,\quad i=1,\dots,M,$
(68)

where

$G_{ij}=\langle u_i,u_j\rangle_{eq},$
(69a)
$\Omega_{ij}=\sum_{k=1}^M(G^{-1})_{jk}\langle u_k,Lu_i\rangle_{eq},$
(69b)
$K_{ij}(t-s)=-\sum_{k=1}^M(G^{-1})_{jk}\langle QLu_k,e^{(t-s)QL}QLu_i\rangle_{eq}.$
(69c)

Equation (68) is often called the generalized Langevin equation (GLE)13,31 for the projected quantity of interest ui(t). To derive (69a)–(69c), we used the fact that $L$ is skew-adjoint and $Q$ is self-adjoint with respect to the $Lρeq2$ inner product and that $Q2=Q$. In classical statistical physics, Eq. (69c) is known as the second fluctuation dissipation theorem. Next, define the temporal-correlation matrix

$Cij(t)=⟨uj(0),ui(t)⟩eq=⟨uj(0),Pui(t)⟩eq.$
(70)

By applying $⟨uj,(⋅)⟩eq$ to both sides of Eq. (68), we obtain the following exact evolution equation for Cij(t):

$dCijdt=∑k=1MΩikCkj−∑k=1M∫0tKik(t−s)Ckj(s)ds.$
(71)

Moreover, if we employ a one-dimensional Mori’s basis, i.e., M = 1, then we obtain the simplified equation

$\frac{dC(t)}{dt}=\Omega_1C(t)-\int_0^tK(t-s)C(s)\,ds,$
(72)

where $C(t)=⟨u1(0),u1(t)⟩eq$. The main difficulty in solving the GLE (71) [or (72)] lies in computing the memory kernels Kij(t). Hereafter we prove that such memory kernels can be uniformly bounded by a computable quantity that depends only on the initial condition of the system. For the sake of simplicity, we will focus on the one-dimensional GLE (72).
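Before bounding the kernel, it is worth noting that solving the GLE (72) for a given K(t) is itself straightforward. The following minimal solver (a sketch only; production code would use a higher-order Volterra scheme) combines forward Euler in time with the trapezoidal rule for the memory integral, and is checked against the closed-form case Ω = 0, K ≡ 1, for which differentiating the GLE gives C″ = −C, i.e., C(t) = cos t.

```python
import numpy as np

# Minimal solver sketch for the one-dimensional GLE (72):
#   dC/dt = Omega C(t) - int_0^t K(t-s) C(s) ds,   C(0) = 1.
def solve_gle(K, Omega, T, nsteps):
    dt = T / nsteps
    t = np.linspace(0.0, T, nsteps + 1)
    C = np.ones(nsteps + 1)
    Kv = K(t)
    for i in range(nsteps):
        # trapezoidal rule for int_0^{t_i} K(t_i - s) C(s) ds
        w = Kv[i::-1] * C[: i + 1]            # K(t_i - s_j) C(s_j), j = 0..i
        conv = dt * (w.sum() - 0.5 * (w[0] + w[-1])) if i > 0 else 0.0
        C[i + 1] = C[i] + dt * (Omega * C[i] - conv)
    return t, C

# Consistency check: Omega = 0, K = 1 gives C'' = -C, i.e. C(t) = cos(t).
t, C = solve_gle(lambda s: np.ones_like(s), 0.0, 5.0, 5000)
assert np.max(np.abs(C - np.cos(t))) < 1e-2
```

The hard part, as emphasized above, is not this quadrature but obtaining K(t) itself, which motivates the a priori bound of Theorem 3.8.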

Theorem 3.8.
The memory kernel K(t) in the one-dimensional GLE (72) is uniformly bounded by $\|\dot{u}_1(0)\|^2_{L^2_{\rho_{eq}}}/\|u_1(0)\|^2_{L^2_{\rho_{eq}}}$, i.e.,
$|K(t)|\le\frac{\|\dot{u}_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|u_1(0)\|^2_{L^2_{\rho_{eq}}}}, \quad \forall t\ge0.$
(73)

Proof.
From the second-fluctuation dissipation theorem (69c), the memory kernel K(t) satisfies
$|K(t)|=\frac{|\langle e^{tQLQ}QLu_1(0),QLu_1(0)\rangle_{eq}|}{\langle u_1(0),u_1(0)\rangle_{eq}}\le\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\frac{\|Lu_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|u_1(0)\|^2_{L^2_{\rho_{eq}}}}=\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\frac{\|\dot{u}_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|u_1(0)\|^2_{L^2_{\rho_{eq}}}}.$
On the other hand, by using the numerical abscissa (17) and formula (19), we see that the semigroup $e^{tQLQ}$ is contractive, i.e., $\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\le1$. Since $Q$ is an orthogonal projection with respect to ρeq, we have $\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}=\|Qe^{tQLQ}\|_{L^2_{\rho_{eq}}}\le\|Q\|_{L^2_{\rho_{eq}}}\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\le1$. This yields
$|K(t)|\le\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\frac{\|\dot{u}_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|u_1(0)\|^2_{L^2_{\rho_{eq}}}}\le\frac{\|\dot{u}_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|u_1(0)\|^2_{L^2_{\rho_{eq}}}}.$

Theorem 3.8 provides an a priori and easily computable upper bound for the memory kernel defining the dynamics of any quantity of interest u1 that is initially in the Gibbs canonical ensemble $ρeq=e−βH/Z$. In Sec. IV, we will calculate the upper bound (73) analytically and compare it with the exact memory kernel we obtain in prototype linear and nonlinear Hamiltonian systems.

Remark.
We emphasized in Sec. III A that the semigroup estimate for $etQLQ$ is not necessarily tight. In the context of high-dimensional Hamiltonian systems (e.g., molecular dynamics), it is often empirically assumed that the semigroup $etQLQ$ is dissipative, i.e., $‖etQLQ‖≤etω0$, where ω0 < 0. In this case, the memory kernel turns out to be uniformly bounded by an exponentially decaying function since
$|K(t)|\le\|e^{tQLQ}\|_{L^2_{\rho_{eq}}}\frac{\|Lu_1\|^2_{L^2_{\rho_{eq}}}}{\|u_1\|^2_{L^2_{\rho_{eq}}}}\le e^{t\omega_0}\frac{\|\dot{u}_1\|^2_{L^2_{\rho_{eq}}}}{\|u_1\|^2_{L^2_{\rho_{eq}}}}.$

In this section, we provide simple numerical examples of the MZ memory approximation methods we discussed throughout the paper. Specifically, we study Hamiltonian systems (linear and nonlinear) with finite-rank projections (Mori’s projection) and non-Hamiltonian systems with infinite-rank projections (Chorin’s projection). In both cases, we demonstrate the accuracy of the a priori memory estimation method we developed in Secs. III F and III E. We also compute the solution to the MZ equation for non-Hamiltonian systems with the t-model, the H-model, and the Ht-model.

In this section, we consider dimension reduction in linear and nonlinear Hamiltonian dynamical systems with finite-rank projection. In particular, we consider the Mori projection and study the MZ equation for the temporal auto-correlation function of a scalar quantity of interest.

#### 1. Harmonic chains of oscillators

Consider a one-dimensional chain of harmonic oscillators. This is a simple but illustrative example of a linear Hamiltonian dynamical system which has been widely studied in statistical mechanics, mostly in relation with the microscopic theory of Brownian motion.3,18,17 The Hamiltonian of the system can be written as

$H(p,q)=\frac{1}{2m}\sum_{i=1}^Np_i^2+\frac{k}{2}\sum_{i=0}^N(q_{i+1}-q_i)^2,$
(74)

where qi and pi are, respectively, the displacement and momentum of the ith particle, m is the mass of the particles (assumed constant throughout the network), and k is the elasticity constant that modulates the intensity of the quadratic interactions. We set fixed boundary conditions at the endpoints of the chain, i.e., q0(t) = qN+1(t) = 0 and p0(t) = pN+1(t) = 0 (particles are numbered from left to right) and m = k = 1. The Hamilton’s equations are

$dqidt=∂H∂pi, dpidt=−∂H∂qi,$
(75)

which can be written in a matrix-vector form as

$\begin{bmatrix}\dot{p}\\ \dot{q}\end{bmatrix}=\begin{bmatrix}0 & k(B-D)\\ I/m & 0\end{bmatrix}\begin{bmatrix}p\\ q\end{bmatrix},$
(76)

where B is the adjacency matrix of the chain and D is the degree matrix (see Ref. 4). Note that (76) is a linear dynamical system. We are interested in the velocity auto-correlation function of a tagged oscillator, say, the one at location j = 1. Such an auto-correlation function is defined as

$Cp1(t)=⟨p1(0)p1(t)⟩eq⟨p1(0)p1(0)⟩eq,$
(77)

where the average is with respect to the Gibbs canonical distribution $ρeq=e−βH/Z$. It was shown in Ref. 18 that $Cp1(t)$ can be obtained analytically by employing Lee’s continued fraction method. The result is the well-known J0J4 solution

$Cp1(t)=J0(2t)−J4(2t),$
(78)

where Ji(t) is the ith Bessel function of the first kind. On the other hand, the Mori-Zwanzig equation derived from the following Mori projection

$P(\cdot)=\frac{\langle(\cdot),p_1(0)\rangle_{eq}}{\langle p_1(0),p_1(0)\rangle_{eq}}\,p_1(0)$
(79)

yields the following GLE for $Cp1(t)$:

$dCp1(t)dt=Ωp1Cp1(t)−∫0tK(s)Cp1(t−s)ds.$
(80)

Here,

$\Omega_{p_1}=\frac{\langle Lp_1(0),p_1(0)\rangle_{eq}}{\langle p_1(0),p_1(0)\rangle_{eq}}=0$

since $⟨pi(0),qj(0)⟩eq=0$, while K(t) is the MZ memory kernel. For the J0J4 solution, it is possible to derive the memory kernel K(t) analytically. To this end, we simply insert (78) into (80) and apply the Laplace transform

$L [⋅](s)=∫0∞(⋅)e−stdt$

to obtain

$\hat{K}(s)=-s+\frac{1}{\hat{C}(s)},$
(81)

where $Ĉ(s)=L [Cp1(t)]$ and $K^(s)=L [K(t)]$. The inverse Laplace transform of (81) can be computed analytically as

$K(t)=\frac{J_1(2t)}{t}+1.$
(82)

With K(t) available, we can verify the memory estimate we derived in Theorem 3.8. To this end,

$|K(t)|\le\frac{\|\dot{p}_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|p_1(0)\|^2_{L^2_{\rho_{eq}}}}=\frac{\|q_2(0)-2q_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|p_1(0)\|^2_{L^2_{\rho_{eq}}}}=2.$
(83)

Here we used the exact solution of the velocity auto-correlation function and displacement auto-correlation function of the fixed-end harmonic chain given by (see Ref. 18)

$\langle p_i(0),p_j(0)\rangle_{eq}=\frac{k_BT}{\pi}\int_0^\pi\sin(ix)\sin(jx)\,dx,\qquad \langle q_i(0),q_j(0)\rangle_{eq}=\frac{k_BT}{\pi}\int_0^\pi\frac{\sin(ix)\sin(jx)}{4\sin^2(x/2)}\,dx.$

In Fig. 1, we plot the absolute value of the memory kernel K(t) together with the theoretical bound (83). It is seen that the upper bound we obtain in this case is of the same order of magnitude as the memory kernel.
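The bound (83) can also be checked directly from the closed-form kernel (82), K(t) = J1(2t)/t + 1, using standard Bessel function routines (a sketch relying on scipy.special):

```python
import numpy as np
from scipy.special import jv

# Numerical check of the bound (83): the exact kernel K(t) = J1(2t)/t + 1
# of the fixed-end harmonic chain never exceeds 2, and K(t) -> 2 as t -> 0.
t = np.linspace(1e-8, 50.0, 10000)
K = jv(1, 2.0 * t) / t + 1.0

assert np.max(np.abs(K)) <= 2.0      # the bound of Theorem 3.8 holds
assert abs(K[0] - 2.0) < 1e-6        # K(0+) = 2, saturating the bound
```

Since J1(x)/x ≤ 1/2 with equality only at x = 0, the bound is attained exactly at t = 0, which explains why the estimate of Theorem 3.8 is of the same order of magnitude as the kernel for this system.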

FIG. 1.

Harmonic chain of oscillators. (a) Velocity auto-correlation function $Cp1(t)$ and (b) memory kernel K(t) of the corresponding MZ equation. It is seen that our theoretical estimate (83) (dashed line) correctly bounds the MZ memory kernel. Note that the upper bound we obtain is of the same order of magnitude as the memory kernel.


#### 2. Hald system

In this section, we study the Hamiltonian system considered by Chorin et al. in Refs. 12 and 9. The Hamiltonian function is defined as

$H(p,q)=12(q12+p12+q22+p22+q12q22),$
(84)

while the corresponding Hamilton’s equations of motion are

$q̇1 =p1,ṗ1 =−q1(1+q22),q̇2 =p2,ṗ2 =−q2(1+q12).$
(85)

We assume that the initial state is distributed according to canonical Gibbs distribution $ρeq=e−H(p,q)/Z$. The partition function Z is given by

$Z=e^{1/4}(2\pi)^{3/2}K_0\!\left(\tfrac14\right),$
(86)

where K0 is the modified Bessel function of the second kind. We aim to study the properties of the autocorrelation function of the first component q1, which is defined as

$C_{q_1}(t)=\frac{\langle q_1(0),q_1(t)\rangle_{eq}}{\langle q_1(0),q_1(0)\rangle_{eq}}.$

Obviously, $Cq1(0)=1$. The evolution equation for $Cq1(t)$ is obtained by using the MZ formulation with the Mori’s projection

$P(\cdot)=\frac{\langle(\cdot),q_1(0)\rangle_{eq}}{\langle q_1(0),q_1(0)\rangle_{eq}}\,q_1(0).$
(87)

This yields the GLE

$dCq1(t)dt=Ωq1Cq1(t)−∫0tK(s)Cq1(t−s)ds.$
(88)

The streaming term $Ωq1Cq1(t)$ is again identically zero since

$\Omega_{q_1}=\frac{\langle Lq_1(0),q_1(0)\rangle_{eq}}{\langle q_1(0),q_1(0)\rangle_{eq}}=0.$

Theorem 3.8 provides the following computable upper bound for the modulus of K(t):

$|K(t)|\le\frac{\|\dot{q}_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|q_1(0)\|^2_{L^2_{\rho_{eq}}}}=\frac{\|p_1(0)\|^2_{L^2_{\rho_{eq}}}}{\|q_1(0)\|^2_{L^2_{\rho_{eq}}}}=\frac{e^{1/4}K_0(1/4)}{\sqrt{\pi}\,U(1/2,0,1/2)}\approx1.39786,$
(89)

where U(a, b, y) is the confluent hypergeometric function of the second kind. In Fig. 2(a), we plot the correlation function $Cq1(t)$ we obtained by sampling (85) and then averaging over all realizations. The samples of the initial condition are generated from the Gibbs equilibrium distribution by using Markov Chain Monte Carlo (MCMC). In Fig. 2(b), we plot the memory kernel K(t) we computed numerically based on $Cq1(t)$. To compute such a kernel, we inverted numerically the Laplace transform of (88), i.e.,

$K(t)=\mathcal{L}^{-1}\left[-s+\frac{1}{\hat{C}(s)}\right],$
(90)

where $Ĉ(s)=L [Cq1(t)]$. In practice, we replaced $Cq1(t)$ within the time interval [0, 20] with a high-order polynomial interpolant at Gauss-Chebyshev-Lobatto nodes, computed the Laplace transform of such polynomial analytically, and then computed the inverse Laplace transform (90) numerically with the Talbot algorithm.1
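The numerical value in (89) can be verified without special functions by reducing the Gibbs average to one-dimensional quadratures: integrating $\rho_{eq}=e^{-H}/Z$ over p1, p2, and q2 analytically gives ⟨p1²⟩ = 1 and a q1-marginal proportional to $e^{-q_1^2/2}(1+q_1^2)^{-1/2}$. A minimal sketch (assuming scipy):

```python
import numpy as np
from scipy.integrate import quad

# Check of the bound (89) for the Hald system: the bound is
# <p1^2>/<q1^2> = num/mom, where num and mom are the one-dimensional
# integrals of the reduced q1-marginal below.
f = lambda q: np.exp(-0.5 * q * q) / np.sqrt(1.0 + q * q)

num, _ = quad(f, -np.inf, np.inf)                        # marginal normalization
mom, _ = quad(lambda q: q * q * f(q), -np.inf, np.inf)   # <q1^2> times num

assert abs(num / mom - 1.39786) < 1e-3
```

The normalization integral here equals $e^{1/4}K_0(1/4)$, consistent with the partition function (86), so this quadrature reproduces the closed-form value in (89).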

FIG. 2.

Hald Hamiltonian system (85). (a) Autocorrelation function of the displacement q1(t) and (b) memory kernel of the governing MZ equation. Here $Cq1(t)$ is computed by Markov chain Monte-Carlo (MCMC) while K(t) is determined by inverting numerically the Laplace transform in (90) with the Talbot algorithm. It is seen that the theoretical upper bound (89) (dashed line) is of the same order of magnitude as the memory kernel.


In this section, we study the accuracy of the t-model, the H-model, and the Ht model in predicting scalar quantities of interest in non-Hamiltonian systems. In particular, we consider the MZ formulation with Chorin’s projection operator. For the particular case of linear dynamical systems, we also compute the theoretical upper bounds we obtained in Sec. III E for the memory growth and the error in the H-model and compare such bounds with exact results.

#### 1. Linear dynamical systems

We begin by considering a low-dimensional linear dynamical system $\dot{x}=Ax$ evolving from a random initial state with density ρ0(x) to verify the MZ memory estimates we obtained in Sec. III E. For simplicity, we choose A to be negative definite,

$A=e^CBe^{-C},\qquad B=\begin{bmatrix}-\frac18 & 0 & 0\\ 0 & -\frac23 & 0\\ 0 & 0 & -\frac12\end{bmatrix},\qquad C=\begin{bmatrix}0 & 1 & 0\\ -1 & 0 & 1\\ 0 & -1 & 0\end{bmatrix}.$
(91)

In this case, the origin of the phase space is a stable node, and it is easy to estimate $\|e^{tL}\|_{L^2_{\rho_0}}$. We set x1(0) = 1 and x2(0), x3(0) independent standard normal random variables. In this setting, the semigroup estimates (61) and (62) are explicit,

$\|e^{tL}\|\le e^{t\omega},\quad \omega=-\frac12\mathrm{Tr}(A)=0.6458,\qquad \|e^{tLQ}\|\le e^{t\omega_Q},\quad \omega_Q=\omega+\sqrt{\frac{A_{11}^2+\sum_{i=2}^NA_{1i}^2}{x_1^2(0)}}=1.1621.$

Therefore, we obtain the following explicit upper bounds for the memory integral and the error of the H-model [see Eqs. (63) and (65)]:

$|w_0(t)|\le0.1964\left(e^{1.1621t}-e^{0.6458t}\right),$
(92)
$|w_0(t)-w_0^n(t)|\le e^{1.1621t}\sqrt{(b^T(M_{11}^T)^na)^2x_1^2(0)+\|(M_{11}^T)^{n+1}a\|_2^2}\,\frac{t^{n+1}}{(n+1)!}.$
(93)

Next, we compare these error bounds with numerical results obtained by solving numerically the H-model (64). For example, the second-order H-model reads

$\frac{d}{dt}E[x_1(t)|x_1(0)]=-0.4560\,E[x_1(t)|x_1(0)]+w_0^2(t),\qquad \frac{dw_0^2(t)}{dt}=0.0586\,E[x_1(t)|x_1(0)]+w_1^2(t),\qquad \frac{dw_1^2(t)}{dt}=-0.0192\,E[x_1(t)|x_1(0)].$
(94)

In Fig. 3, we demonstrate the convergence of the H-model to the benchmark solution computed by Monte Carlo simulation as we increase the H-model differentiation order. In Fig. 4, we plot the bound on the memory growth [Eq. (92)] and the bound on the memory error [Eq. (93)] together with exact results.
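The numerical coefficients in (94) follow directly from the construction (91); a short sketch reproducing them (relying only on numpy/scipy):

```python
import numpy as np
from scipy.linalg import expm

# Reproduce the coefficients of the second-order H-model (94)
# from the matrix (91).
B = np.diag([-1.0 / 8.0, -2.0 / 3.0, -1.0 / 2.0])
C = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 1.0], [0.0, -1.0, 0.0]])
A = expm(C) @ B @ expm(-C)

a, b, M11T = A[0, 1:], A[1:, 0], A[1:, 1:].T
assert abs(A[0, 0] + 0.4560) < 1e-3       # A11, Markovian coefficient
assert abs(b @ a - 0.0586) < 1e-3         # b^T a, coefficient in dw0/dt
assert abs(b @ M11T @ a + 0.0192) < 1e-3  # b^T M11^T a, coefficient in dw1/dt
```

Note that since B is diagonal and e^C is orthogonal, the resulting A is symmetric, so b equals a here.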

FIG. 3.

Convergence of the H-model for the linear dynamical system with matrix (91). The benchmark solution is computed with Monte-Carlo (MC) simulation. Also, the zero-order H-model represents the Markovian approximation to the MZ equation, i.e., the MZ equation without the memory term.

FIG. 4.

Linear dynamical system with matrix (91). In (a), we plot the memory term $w$0(t) we obtain from Monte Carlo simulation together with the estimated upper bound (92). In (b) and (c), we plot the H-model approximation error $|w0(T)−w0n(T)|$ together with the upper bound (93) for different differentiation orders n and at different times t.


Remark.
The results we just obtained can be obviously extended to higher-dimensional linear dynamical systems. In Fig. 5, we plot the benchmark conditional mean path we obtained through Monte Carlo simulation together with the solution of the H-model (64) for the 100-dimensional linear dynamical system defined by the matrix (N = 100)
$A=\begin{bmatrix}-1 & 1 & \cdots & (-1)^N\\ 1 & & & \\ \vdots & & B & \\ 1 & & & \end{bmatrix},$
(95)
where $B=e^C\Lambda e^{-C}$ and
$\Lambda=\begin{bmatrix}-\frac18 & 0 & \cdots & 0\\ 0 & -\frac29 & & \vdots\\ \vdots & & \ddots & 0\\ 0 & \cdots & 0 & -\frac{N-1}{N+6}\end{bmatrix},\qquad C=\begin{bmatrix}0 & 1 & & 0\\ -1 & 0 & \ddots & \\ & \ddots & \ddots & 1\\ 0 & & -1 & 0\end{bmatrix}.$
It is seen that the H-model converges as we increase the differentiation order in any finite time interval, in agreement with the theoretical prediction of Sec. III E.

FIG. 5.

Linear dynamical system with matrix A (95). Convergence of the H-model to the conditional mean path solution $E[x_1(t)|x_1(0)]$. The initial condition is set as x1(0) = 3, while {x2(0), …, x100(0)} are i.i.d. standard normal random variables.


#### 2. Nonlinear dynamical systems

The hierarchical memory approximation method we discussed in Sec. III D can be applied to nonlinear dynamical systems in the form (1). As we will see, if we employ the Ht-model, then the nonlinearity introduces a closure problem that needs to be addressed properly.

##### a. Lorenz-63 system.

Consider the classical Lorenz-63 model

$ẋ1=σ(x2−x1),ẋ2=x1(r−x3)−x2,ẋ3=x1x2−βx3,$
(96)

where σ = 10 and β = 8/3. The phase space Liouville operator for this ordinary differential equation (ODE) is

$L=\sigma(x_2-x_1)\frac{\partial}{\partial x_1}+\left(x_1(r-x_3)-x_2\right)\frac{\partial}{\partial x_2}+\left(x_1x_2-\beta x_3\right)\frac{\partial}{\partial x_3}.$

We choose the resolved variables to be $\hat{x}=\{x_1,x_2\}$ and aim at formally integrating out $\tilde{x}=x_3$ by using the Mori-Zwanzig formalism. To this end, we set $x_3(0)\sim N(0,1)$ and consider the zeroth-order Ht-model (t-model)

$\frac{dx_1^m}{dt}=\sigma(x_2^m-x_1^m),\qquad \frac{dx_2^m}{dt}=-x_2^m+rx_1^m-t\,(x_1^m)^2x_2^m,$
(97)

where $x1m(t)=E[x1(t)|x1(0),x2(0)]$ and $x2m(t)=E[x2(t)|x1(0),x2(0)]$ are the conditional mean paths. To obtain this system, we introduced the following mean field closure approximation:

$tPe^{tL}PLQLx_2(0)=-t\,E[x_1(t)^2x_2(t)|x_1(0),x_2(0)]\simeq-t\,E[x_1(t)|x_1(0),x_2(0)]^2\,E[x_2(t)|x_1(0),x_2(0)]=-t\,(x_1^m)^2x_2^m.$
(98)

Higher-order Ht-models can be derived based on (98). As is well known, if r < 1, the fixed point (0, 0, 0) is a global attractor and exponentially stable. In this case, the t-model (zeroth-order Ht-model) yields an accurate prediction of the conditional mean path for long times (see Fig. 6). On the other hand, if we consider the chaotic regime at r = 28, then the t-model and its higher-order extension, i.e., the Ht-model, are accurate only for relatively short times. This is in agreement with our theoretical predictions. In fact, unlike linear systems, where the hierarchical representation of the memory integral can be proven to be convergent for long times, in nonlinear systems the memory hierarchy is, in general, provably convergent only over a short time period (Theorem 3.7 and Corollary 3.4.3). This does not mean that the H-model or the Ht-model is inaccurate for nonlinear systems. It just means that the accuracy depends on the system, the quantity of interest, and the initial condition.
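The t-model (97) is a two-dimensional non-autonomous ODE and is trivial to integrate. A minimal sketch (the initial condition is an illustrative choice of ours), for σ = 10, β = 8/3, and the globally stable regime r = 0.5:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Sketch of the zeroth-order Ht-model (t-model) (97) for Lorenz-63 with the
# mean field closure (98); sigma = 10, r = 0.5 (globally stable regime).
sigma, r = 10.0, 0.5

def t_model(t, x):
    x1m, x2m = x
    return [sigma * (x2m - x1m),
            -x2m + r * x1m - t * x1m**2 * x2m]

sol = solve_ivp(t_model, (0.0, 20.0), [1.0, 1.0], rtol=1e-8, atol=1e-10)

# for r < 1 the origin is globally attracting, and the t-model tracks the decay
assert np.all(np.abs(sol.y[:, -1]) < 1e-2)
```

In the chaotic regime r = 28 the same code runs, but, as discussed above, the closure is only accurate over short times, so any quantitative comparison there requires a Monte Carlo benchmark of the full system.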

FIG. 6.

Accuracy of the Ht model in representing the conditional mean path in the Lorenz-63 system (96). It is seen that if r = 0.5 (first row), then the zeroth-order Ht-model, i.e., the t-model is accurate for long integration times. On the other hand, if we consider the chaotic regime at r = 28 (second row), then we see that the t-model and its high-order extension (Ht-model) are accurate only for relatively short time.

##### b. Modified Lorenz-96 system.

As an example of a high-dimensional nonlinear dynamical system, we consider the following modified Lorenz-96 system:22,26

$$\dot{x}_1=-x_1+x_1x_2+F,\qquad \dot{x}_2=-x_2+x_1x_3+F,\qquad\ldots,\qquad \dot{x}_i=-x_i+(x_{i+1}-x_{i-2})x_{i-1}+F,\qquad\ldots,\qquad \dot{x}_N=-x_N-x_{N-2}x_{N-1}+F,$$
(99)

where F is constant. As is well known, depending on the values of N and F, this system can exhibit a wide range of behaviors.22 Suppose we take the resolved variables to be $x^={x1,x2}$. Correspondingly, the unresolved ones, i.e., those we aim at integrating through the MZ framework, are $x̃={x3,…,xN}$, which we set to be independent standard normal random variables. By using the mean field approximation (98), we obtain the following zeroth-order Ht-model (t-model) of the modified Lorenz-96 system (99):

$$\dot{x}_{1m}=-x_{1m}+x_{1m}x_{2m}+F,\qquad \dot{x}_{2m}=-x_{2m}+F+t\left(x_{1m}^2x_{2m}-x_{1m}F\right).$$
(100)

In Fig. 7, we study the accuracy of the Ht-model in representing the conditional mean path with F = 5 and N = 100. It is seen that the Ht-model converges only for short time (in agreement with the theoretical predictions), and it provides results that are more accurate than the classical t-model.
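The t-model (100) is an explicit, non-autonomous two-dimensional ODE system and is straightforward to integrate. A minimal sketch follows; the initial condition (1, 1) is an arbitrary choice for illustration (the paper does not list the one used in Fig. 7), and the integration time is kept short, consistent with the short-time validity of the t-model.

```python
import numpy as np

F = 5.0

# Right-hand side of the zeroth-order Ht-model (t-model), Eq. (100); note the
# explicit dependence on t in the secular closure term.
def t_model_rhs(t, x):
    x1m, x2m = x
    return np.array([-x1m + x1m * x2m + F,
                     -x2m + F + t * (x1m**2 * x2m - x1m * F)])

# Fourth-order Runge-Kutta; the explicit t-dependence is handled exactly as
# for any non-autonomous system.
def rk4(f, x0, t_final, dt):
    n = int(round(t_final / dt))
    x = np.asarray(x0, dtype=float)
    out = [x.copy()]
    for k in range(n):
        t = k * dt
        k1 = f(t, x)
        k2 = f(t + dt / 2, x + dt / 2 * k1)
        k3 = f(t + dt / 2, x + dt / 2 * k2)
        k4 = f(t + dt, x + dt * k3)
        x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        out.append(x.copy())
    return np.array(out)

traj = rk4(t_model_rhs, [1.0, 1.0], t_final=0.2, dt=1e-3)
```

Because of the secular factor t multiplying the closure term, the solution of (100) should only be trusted (and, in practice, only remains bounded) over short times.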

FIG. 7.

Accuracy of the Ht-model in representing the conditional mean path in the modified Lorenz-96 system (99). Here we set F = 5 and N = 100. It is seen that the Ht-model converges only for short time and provides results that are more accurate than the classical t-model.


In this paper, we developed a thorough mathematical analysis to deduce conditions for the accuracy and convergence of different approximations of the memory integral in the Mori-Zwanzig (MZ) equation. In particular, we studied the short-memory approximation, the t-model, and various hierarchical memory approximation techniques. We also derived useful upper bounds for the MZ memory integral which allowed us to estimate a priori its contribution to the dynamics. Such upper bounds are easily computable for systems with finite-rank projections, such as Mori's projection. Furthermore, if the system is Hamiltonian, then the MZ memory bounds we derived are tight. This can be attributed to the fact that the Koopman operator of a Hamiltonian dynamical system is always a contraction relative to the $L^2_{\rho_{\mathrm{eq}}}$ norm weighted by the Gibbs canonical distribution $\rho_{\mathrm{eq}}$. In the more general case of dissipative systems, however, we have no guarantee that the MZ memory bounds we obtained are tight. Indeed, for dissipative systems, the numerical abscissa in Eq. (17) [see also Eqs. (21a) and (21b)] can be positive, yielding an exponential growth of the memory bounds. In the presence of infinite-rank projections, such as Chorin's projection, we found that it is extremely difficult to obtain explicitly computable upper bounds for the MZ memory or any of its approximations discussed in this paper. The main difficulty is the lack of an explicitly computable upper bound for the operator semigroup $e^{tLQ}$, which involves the unbounded generator $LQ$ (see the discussion in Sec. III A and Appendix A). To demonstrate the accuracy of the approximation methods and error bounds we developed in this paper, we provided numerical examples involving Hamiltonian dynamical systems (linear and nonlinear) and more general dissipative nonlinear systems evolving from random initial states. The numerical results are in agreement with our theoretical predictions.

This work was supported by the Air Force Office of Scientific Research Grant No. FA9550-16-1-0092.

In looking for the numerical abscissa14 (i.e., the logarithmic norm) of $LQ$, we seek to bound

$$\sup_{D(LQ)\ni x\neq0}\frac{\Re\langle x,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}.$$

Notice that if $PLQ=0$, then $LQ=QLQ$ so that we have the previously proven bound

$$\sup\frac{\Re\langle x,QLQx\rangle_\sigma}{\langle x,x\rangle_\sigma}\le-\frac{1}{2}\inf\operatorname{div}_\sigma F.$$

In this section, we consider what happens when $PLQ≠0$. To that end, let us note that $x∈D(LQ)$ may be decomposed as

$$x=Qx+\alpha PLQx+Py,$$

where $α∈C$ and $Py$ are orthogonal to $PLQx$. In other words, we define $Py$ as

$$Py=Px-\frac{\langle PLQx,Px\rangle_\sigma}{\langle PLQx,PLQx\rangle_\sigma}PLQx$$

and define α as

$$\alpha=\frac{\langle PLQx,Px\rangle_\sigma}{\langle PLQx,PLQx\rangle_\sigma}.$$

Then

$$\frac{\Re\langle x,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}=\frac{\Re\langle Qx+\alpha PLQx+Py,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}=\frac{\Re\langle Qx,LQx\rangle_\sigma+\Re(\alpha)\|PLQx\|_\sigma^2}{\|Qx\|_\sigma^2+|\alpha|^2\|PLQx\|_\sigma^2+\|Py\|_\sigma^2}\le\max\left(0,\;\frac{\Re\langle Qx,LQx\rangle_\sigma+\Re(\alpha)\|PLQx\|_\sigma^2}{\|Qx\|_\sigma^2+|\alpha|^2\|PLQx\|_\sigma^2}\right).$$

Since we assume $PLQ≠0$, there exists x such that $PLQx≠0$ and then for any α such that

$$\Re(\alpha)\ge-\frac{\Re\langle Qx,LQx\rangle_\sigma}{\|PLQx\|_\sigma^2},$$

we have

$$\Re\langle Qx,LQx\rangle_\sigma+\Re(\alpha)\|PLQx\|_\sigma^2\ge0$$

so that for $PLQ≠0$,

$$\sup_{D(LQ)\ni x\neq0}\frac{\Re\langle x,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}=\sup_{D(LQ)\ni x\neq0}\frac{\Re\langle Qx,LQx\rangle_\sigma+\Re(\alpha)\|PLQx\|_\sigma^2}{\|Qx\|_\sigma^2+|\alpha|^2\|PLQx\|_\sigma^2}.$$

Now, fix any nonzero $Qx\in D(L)$ and consider the expression

$$\frac{\Re\langle Qx,LQx\rangle_\sigma+\Re(\alpha)\|PLQx\|_\sigma^2}{\|Qx\|_\sigma^2+|\alpha|^2\|PLQx\|_\sigma^2}=\frac{\xi+\Re(\alpha)\beta^2}{1+|\alpha|^2\beta^2},$$

where

$$\xi=\xi(Qx)=\frac{\Re\langle Qx,LQx\rangle_\sigma}{\|Qx\|_\sigma^2},\qquad\beta=\beta(Qx)=\frac{\|PLQx\|_\sigma}{\|Qx\|_\sigma}.$$

Then, for this fixed $Qx$,

$$\frac{\xi+\Re(\alpha)\beta^2}{1+|\alpha|^2\beta^2}\le\max_{a\in\mathbb{R}}\frac{\xi+a\beta^2}{1+a^2\beta^2}.$$

Differentiating with respect to a and setting equal to zero, we find that the latter expression is extremized when

$$0=\beta^2(1+a^2\beta^2)-2a\beta^2(\xi+a\beta^2),$$

i.e., when

$$\beta^2a^2+2\xi a-1=0,\qquad a=\frac{-\xi\pm\sqrt{\xi^2+\beta^2}}{\beta^2}.$$

Since $\beta^2>0$, $(\xi+a\beta^2)/(1+a^2\beta^2)$ is maximized at $\hat{a}=\left(-\xi+\sqrt{\xi^2+\beta^2}\right)/\beta^2$. Then

$$\xi+\hat{a}\beta^2=\sqrt{\xi^2+\beta^2},\qquad 1+\hat{a}^2\beta^2=2(1-\xi\hat{a})=\frac{2\sqrt{\xi^2+\beta^2}\left(\sqrt{\xi^2+\beta^2}-\xi\right)}{\beta^2}$$

so that

$$\max_{a\in\mathbb{R}}\frac{\xi+a\beta^2}{1+a^2\beta^2}=\frac{\xi+\hat{a}\beta^2}{1+\hat{a}^2\beta^2}=\frac{1}{2}\,\frac{\beta^2}{\sqrt{\xi^2+\beta^2}-\xi}=\frac{1}{2}\left(\sqrt{\xi^2+\beta^2}+\xi\right).$$

Therefore,

$$\sup_{D(LQ)\ni x\neq0}\frac{\Re\langle x,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}=\sup_{D(L)\ni Qx\neq0}\frac{1}{2}\left(\sqrt{\xi^2(Qx)+\beta^2(Qx)}+\xi(Qx)\right).$$
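The closed form of the inner maximization over a is easy to verify numerically. In the sketch below, ξ and β are arbitrary test values (any ξ ∈ ℝ and β > 0 would do), and the maximum over a dense grid is compared against the formula derived above:

```python
import numpy as np

xi, beta = -0.7, 1.3          # arbitrary test values (beta > 0)

# Brute-force maximization of (xi + a*beta^2) / (1 + a^2*beta^2) over a grid.
a = np.linspace(-50.0, 50.0, 2_000_001)
numeric = ((xi + a * beta**2) / (1 + a**2 * beta**2)).max()

# Closed-form maximizer a_hat and maximum derived in the text.
a_hat = (-xi + np.sqrt(xi**2 + beta**2)) / beta**2
closed = 0.5 * (np.sqrt(xi**2 + beta**2) + xi)
```

The grid maximum agrees with the closed-form value to within the grid resolution, and the value of the objective at a_hat matches the closed form to machine precision.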

When $PLQ$ is unbounded, which is the typical case when $P$ is an infinite-rank projection, such as most conditional expectations, there is unlikely to be a finite numerical abscissa for $LQ$. In particular, notice that if divσ(F) is a bounded function (bounded both above and below), then $ξ(Qx)$ is bounded for all $Qx$ while $β(Qx)$ is unbounded, in which case

$$\sup_{D(LQ)\ni x\neq0}\frac{\Re\langle x,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}=\infty.$$

It follows from Refs. 37 and 30 that in these cases, $\|e^{tLQ}\|_\sigma$ has an infinite slope at t = 0, and therefore there is no finite ω such that $\|e^{tLQ}\|_\sigma\le e^{\omega t}$ for all t ≥ 0 (see Ref. 14). Assuming still that $LQ$ generates a strongly continuous semigroup, we must look to bound the semigroup as $\|e^{tLQ}\|_\sigma\le Me^{\omega t}$, where

$$\omega>\omega_0=\lim_{t\to\infty}\frac{\ln\|e^{tLQ}\|_\sigma}{t}$$

and

$$M\ge M(\omega)=\sup\left\{\|e^{tLQ}\|_\sigma e^{-\omega t}:t\ge0\right\}.$$

On the other hand, if $PLQ$ is a bounded operator, for example, when $P$ is a finite-rank projection [e.g., Mori's projection (12)], there exists a finite value for the numerical abscissa. Indeed, in this case, since $\xi$ is bounded by $\omega=-\inf\operatorname{div}_\sigma(F)$, the numerical abscissa of $LQ$ may be bounded as

$$\omega_{LQ}=\sup_{D(LQ)\ni x\neq0}\frac{\Re\langle x,LQx\rangle_\sigma}{\langle x,x\rangle_\sigma}\le\frac{1}{2}\left(\sqrt{\omega^2+\|PLQ\|_\sigma^2}+\omega\right).$$
(A1)

Alternatively, in the case of finite-rank $P$, the operator $LQ$ may be thought of as a bounded perturbation of $L$, i.e., $LQ=L-LP$, and the numerical abscissa of $LQ$ can be bounded using the bounded perturbation theorem (Ref. 16, III.1.3), obtaining

$$\omega_{LQ}\le\omega+\|LP\|_\sigma.$$
(A2)

Either of these bounds for $ωLQ$ can be used to bound the semigroup norm

$$\|e^{tLQ}\|_\sigma\le e^{\omega_{LQ}t}\le e^{\frac{1}{2}\left(\sqrt{\omega^2+\|PLQ\|_\sigma^2}+\omega\right)t},\qquad\|e^{tLQ}\|_\sigma\le e^{\omega_{LQ}t}\le e^{(\omega+\|LP\|_\sigma)t}.$$

Which of these two estimates gives the tighter bound will generally depend on the values of $‖PLQ‖σ$ and $‖LP‖σ$. It may be noted, however, that, when σ is invariant, ω = 0 and $L$ is skew-adjoint so that

$$\|PLQ\|_\sigma=\|QL^\dagger P\|_\sigma=\|QLP\|_\sigma\le\|LP\|_\sigma$$

and therefore the bound in (A1) is at most half that of (A2).
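In finite dimensions, both estimates are easy to test. The sketch below assumes a uniform weight σ, so that ⟨·,·⟩_σ reduces to the standard inner product, takes a random matrix L as the generator with ω set to the numerical abscissa of L (which bounds ξ(Qx)), and uses a random rank-2 orthogonal projection P; the bounds (A1) and (A2) are then verified directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 2

L = rng.standard_normal((n, n))          # finite-dimensional stand-in for L
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
P = V @ V.T                              # rank-k orthogonal projection
Q = np.eye(n) - P
LQ = L @ Q

def abscissa(A):
    # Numerical abscissa = largest eigenvalue of the symmetric part of A.
    return np.linalg.eigvalsh((A + A.T) / 2).max()

omega = abscissa(L)                      # upper bound for xi(Qx)
omega_LQ = abscissa(LQ)

# Bound (A1): half-sum form involving ||PLQ||; bound (A2): perturbation form.
bound_A1 = 0.5 * (np.sqrt(omega**2 + np.linalg.norm(P @ LQ, 2)**2) + omega)
bound_A2 = omega + np.linalg.norm(L @ P, 2)
```

Which bound is tighter depends on the realization of L and P; for skew-symmetric L (the σ-invariant case discussed above) the first is at most half the second.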

It was shown in Ref. 15 that the Banach dual of (4) defines an evolution in the probability density function space. Specifically, the joint probability density function of the state vector u(t) that solves Eq. (1) is pushed forward by the Frobenius-Perron operator $F(t,0)$ [Banach dual of the Koopman operator (3)],

$$p(x,t)=F(t,s)p(x,s),\qquad F(t,s)=e^{(t-s)M},$$
(B1)

where

$$M(x)p(x,t)=-\nabla\cdot\left(F(x)p(x,t)\right).$$
(B2)

By introducing a projection $P$ in the space of probability density functions and its complement $Q=I−P$, it is easy to show that the projected density $Pp$ satisfies the MZ equation42

$$\frac{\partial Pp(t)}{\partial t}=PMPp(t)+PMe^{tQM}Qp(0)+\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds.$$
(B3)

In Appendixes B 1 and B 2, we perform an analysis of different types of approximations of the MZ memory integral

$$\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds.$$
(B4)

The main objective of such analysis is to establish rigorous error bounds for widely used approximation methods and also propose new provably convergent approximation schemes.

#### 1. Analysis of the memory integral

In this section, we develop the analysis of the memory integral arising in the PDF formulation of the MZ equation. The starting point is the definition (B4). As before, we begin with the following upper bound estimate for the integral:

Theorem B.1
(Memory growth). Let $e^{tMQ}$ and $e^{tM}$ be strongly continuous semigroups with upper bounds $\|e^{tM}\|\le Me^{t\omega}$ and $\|e^{tMQ}\|\le M_Qe^{t\omega_Q}$, and let T > 0 be a fixed integration time. Then for any 0 ≤ t ≤ T, we have
$$\left\|\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds\right\|\le N_0(t),$$
where
$$N_0(t)=\begin{cases}tM_QC_4, & \omega_Q=0,\\[4pt] \dfrac{M_QC_4}{\omega_Q}\left(e^{t\omega_Q}-1\right), & \omega_Q\neq0\end{cases}$$
and $C_4=\max_{0\le s\le T}\|P\|\,\|MQMPp(s)\|$. Moreover, $N_0(t)$ satisfies $\lim_{t\to0}N_0(t)=0$.

Proof.
Consider
$$\left\|\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds\right\|=\left\|\int_0^tPe^{(t-s)MQ}MQMPp(s)\,ds\right\|\le C_4M_Q\int_0^te^{(t-s)\omega_Q}\,ds=\begin{cases}tM_QC_4, & \omega_Q=0,\\[4pt] \dfrac{M_QC_4}{\omega_Q}\left(e^{t\omega_Q}-1\right), & \omega_Q\neq0,\end{cases}$$
where $C_4=\max_{0\le s\le T}\|P\|\,\|MQMPp(s)\|$.
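For a finite-dimensional surrogate of (B1) and (B2), the bound of Theorem B.1 can be verified directly. In the sketch below, M is a random matrix standing in for the (generally unbounded) operator, P is a random rank-2 orthogonal projection, ω_Q is the numerical abscissa of MQ with M_Q = 1 (for matrices, ‖e^{tA}‖₂ ≤ e^{tω(A)} with constant 1), and the memory integral is computed by trapezoidal quadrature in the equivalent form used in the proof.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
M = rng.standard_normal((n, n))
V, _ = np.linalg.qr(rng.standard_normal((n, 2)))
P = V @ V.T                              # rank-2 orthogonal projection
Q = np.eye(n) - P

def expm(A):
    # Matrix exponential via eigendecomposition (A is generically diagonalizable).
    w, W = np.linalg.eig(A)
    return (W @ np.diag(np.exp(w)) @ np.linalg.inv(W)).real

p0 = rng.standard_normal(n)
t = 1.0
s = np.linspace(0.0, t, 801)
ds = s[1] - s[0]

# Integrand in the equivalent form P e^{(t-s) MQ} MQMP p(s), with p(s) = e^{sM} p0.
MQ = M @ Q
K = M @ Q @ M @ P
vals = np.array([P @ expm((t - sj) * MQ) @ K @ expm(sj * M) @ p0 for sj in s])
v0 = ds * (vals[1:-1].sum(axis=0) + 0.5 * (vals[0] + vals[-1]))   # trapezoid
v0_norm = np.linalg.norm(v0)

# Ingredients of the bound N_0(t), with M_Q = 1 and ||P||_2 = 1.
omega_Q = np.linalg.eigvalsh((MQ + MQ.T) / 2).max()
C4 = max(np.linalg.norm(K @ expm(sj * M) @ p0) for sj in s)
N0 = C4 / omega_Q * (np.exp(t * omega_Q) - 1.0) if omega_Q != 0 else t * C4
```

In practice the bound is far from tight for generic nonnormal M, since the norm of the integral is much smaller than the integral of the norms.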

Theorem B.2
(Memory approximation via the t-model). Let $e^{tMQ}$ and $e^{tM}$ be strongly continuous semigroups with bounds $\|e^{tM}\|\le Me^{t\omega}$ and $\|e^{tMQ}\|\le M_Qe^{t\omega_Q}$, and let T > 0 be a fixed integration time. If the function $k(s,t)=PMe^{(t-s)QM}QMPp(s)$ (integrand of the memory term) is at least twice differentiable with respect to s for all t ≥ 0, then
$$\left\|\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds-tPMQMPp(t)\right\|\le N_1(t),$$
where $N_1(t)$ is defined as
$$N_1(t)=\begin{cases}t(M_Q+1)C_4, & \omega_Q=0,\\[4pt] \dfrac{C_4M_Q}{\omega_Q}\left(e^{t\omega_Q}-1\right)+tC_4, & \omega_Q\neq0\end{cases}$$
and $C_4$ is as in Theorem B.1.

Proof.
$$\left\|\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds-tPMQMPp(t)\right\|\le C_4M_Q\int_0^te^{(t-s)\omega_Q}\,ds+C_4t=\begin{cases}t(M_Q+1)C_4, & \omega_Q=0,\\[4pt] \dfrac{C_4M_Q}{\omega_Q}\left(e^{t\omega_Q}-1\right)+tC_4, & \omega_Q\neq0.\end{cases}$$

#### 2. Hierarchical memory approximation in PDF space

The hierarchical memory approximation methods we discussed in Sec. III D can also be developed in the PDF space. To this end, let us first define

$$v_0(t)=\int_0^tPMe^{(t-s)QM}QMPp(s)\,ds.$$
(B5)

By repeatedly differentiating $v_0(t)$ with respect to time [assuming $v_0(t)$ smooth enough], we obtain the hierarchy of equations

$$\frac{\partial v_{i-1}(t)}{\partial t}=PM(QM)^iPp(t)+v_i(t),\qquad i=1,\ldots,n,$$

where

$$v_i(t)=\int_0^tPMe^{(t-s)QM}(QM)^{i+1}Pp(s)\,ds.$$
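The first identity in this hierarchy can be checked numerically in a finite-dimensional surrogate, where M is a matrix, p(s) = e^{sM}p0, the integrals defining v_i are computed by trapezoidal quadrature, and the time derivative of v_0 is approximated by central differences:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = 0.5 * rng.standard_normal((n, n))    # toy generator
V, _ = np.linalg.qr(rng.standard_normal((n, 2)))
P = V @ V.T                              # rank-2 orthogonal projection
Q = np.eye(n) - P

def expm(A):
    # Matrix exponential via eigendecomposition (A is generically diagonalizable).
    w, W = np.linalg.eig(A)
    return (W @ np.diag(np.exp(w)) @ np.linalg.inv(W)).real

p0 = rng.standard_normal(n)
QM = Q @ M

def v(i, t, n_nodes=801):
    # v_i(t) = int_0^t P M e^{(t-s) QM} (QM)^{i+1} P p(s) ds  (trapezoidal rule).
    s = np.linspace(0.0, t, n_nodes)
    ds = s[1] - s[0]
    A = np.linalg.matrix_power(QM, i + 1)
    vals = np.array([P @ M @ expm((t - sj) * QM) @ A @ P @ expm(sj * M) @ p0
                     for sj in s])
    return ds * (vals[1:-1].sum(axis=0) + 0.5 * (vals[0] + vals[-1]))

t, h = 0.5, 1e-3
lhs = (v(0, t + h) - v(0, t - h)) / (2 * h)         # d v_0 / dt
rhs = P @ M @ QM @ P @ expm(t * M) @ p0 + v(1, t)   # P M (QM) P p(t) + v_1(t)
```

The two sides agree up to the quadrature and finite-difference errors, which confirms the differentiation step used to build the hierarchy.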

By following closely the discussion in Sec. III D, we introduce the hierarchy of memory equations

$$\frac{dv_0^n(t)}{dt}=PMQMPp(t)+v_1^n(t),\qquad\frac{dv_1^n(t)}{dt}=PM(QM)^2Pp(t)+v_2^n(t),\qquad\ldots,\qquad\frac{dv_{n-1}^n(t)}{dt}=PM(QM)^nPp(t)+v_n^{e_n}(t)$$
(B6)

and approximate the last term in such hierarchy in different ways. Specifically, we consider

$$\begin{aligned}
&v_n^{e_n}(t)=\int_t^tPMe^{(t-s)QM}(QM)^{n+1}Pp(s)\,ds=0 &&\text{(H-model)},\\
&v_n^{e_n}(t)=\int_{\max(0,t-\Delta t)}^tPMe^{(t-s)QM}(QM)^{n+1}Pp(s)\,ds &&\text{(Type I finite memory approximation)},\\
&v_n^{e_n}(t)=\int_{\min(t,t_n)}^tPMe^{(t-s)QM}(QM)^{n+1}Pp(s)\,ds &&\text{(Type II finite memory approximation)},\\
&v_n^{e_n}(t)=tPM(QM)^{n+1}Pp(t) &&\text{(Ht-model)}.
\end{aligned}$$

Hereafter we establish the accuracy of the approximation schemes resulting from the substitution of each $v_n^{e_n}(t)$ above into (B6).

Theorem B.3
(Accuracy of the H-model). Let $e^{tM}$ and $e^{tMQ}$ be strongly continuous semigroups, T > 0 a fixed integration time, and
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(MQ)^{i+1}MPp(s)\|}{\sup_{s\in[0,T]}\|(MQ)^iMPp(s)\|},\qquad 1\le i\le n.$$
(B7)
Then for 1 ≤ q ≤ n, we have
$$\|v_0(t)-v_0^q(t)\|\le N_2^q(t),$$
where
$$N_2^q(t)=A_2C_4\prod_{i=1}^q\beta_i\frac{t^{q+1}}{(q+1)!},$$
$A_2=\max_{s\in[0,T]}e^{s\omega_Q}$, and $C_4$ is as in Theorem B.1.

Proof.
The error at the qth level can be bounded as
$$\|v_0(t)-v_0^q(t)\|\le\int_0^t\int_0^{\tau_q}\cdots\int_0^{\tau_2}\int_0^{\tau_1}\left\|PMe^{(\tau_1-s)QM}(QM)^{q+1}Pp(s)\right\|ds\,d\tau_1\cdots d\tau_q\le A_2C_4\prod_{i=1}^q\beta_i\frac{t^{q+1}}{(q+1)!},$$
where
$$A_2=\max_{s\in[0,T]}e^{s\omega_Q}=\begin{cases}1, & \omega_Q\le0,\\ e^{T\omega_Q}, & \omega_Q>0.\end{cases}$$
(B8)
Let
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(MQ)^{i+1}MPp(s)\|}{\sup_{s\in[0,T]}\|(MQ)^iMPp(s)\|},$$
(B9)
under the assumption that these quantities are finite. Then we have
$$\|v_0(t)-v_0^q(t)\|\le A_2C_4\prod_{i=1}^q\beta_i\frac{t^{q+1}}{(q+1)!}.$$

Corollary B.3.1
(Uniform convergence of the H-model). If the βi in Theorem B.3 satisfy
$$\beta_i<\cdots$$
for any fixed time T, then there exists a sequence δ1 > δ2 > ⋯ > δn such that
$$\|v_0(T)-v_0^q(T)\|\le\delta_q,$$
where 1 ≤ q ≤ n.

Corollary B.3.2
(Asymptotic convergence of the H-model). If the βi in Theorem B.3 satisfy $\beta_i\le C$ for some constant C, then for any fixed time T and arbitrary δ > 0, there exists an integer q such that for all n > q,
$$\|v_0(T)-v_0^n(T)\|\le\delta.$$
The proofs of Corollaries B.3.1 and B.3.2 closely follow those of Corollaries 3.4.1 and 3.4.2; we therefore omit the details here.

Theorem B.4
(Accuracy of Type-I FMA). Let $e^{tM}$ and $e^{tMQ}$ be strongly continuous semigroups, T > 0 be a fixed integration time, and let
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(MQ)^{i+1}MPp(s)\|}{\sup_{s\in[0,T]}\|(MQ)^iMPp(s)\|},\qquad 1\le i\le n.$$
(B10)
Then for 1 ≤ q ≤ n,
$$\|v_0(t)-v_0^q(t)\|\le N_3^q(t),$$
where
$$N_3^q(t)=A_2C_4\prod_{i=1}^q\beta_i\frac{(t-\Delta t_q)^{q+1}}{(q+1)!}$$
and $C_4$ is as in Theorem B.1.

Proof.
The proof is very similar to the proof of Theorem 3.5. We begin with the estimate of $v_0(t)-v_0^q(t)$,
$$\int_0^{\max(0,t-\Delta t_q)}\int_0^{\tilde{\tau}_q}\cdots\int_0^{\tilde{\tau}_2}\int_0^{\tilde{\tau}_1}Pe^{(\tilde{\tau}_1+\Delta t_q-s)MQ}(MQ)^{q+1}MPp(s)\,ds\,d\tilde{\tau}_1\cdots d\tilde{\tau}_q=\begin{cases}0, & 0\le t\le\Delta t_q,\\[4pt] \displaystyle\int_0^{t-\Delta t_q}\int_0^{\sigma}\frac{(t-\Delta t_q-\sigma)^{q-1}}{(q-1)!}Pe^{(\sigma+\Delta t_q-s)MQ}(MQ)^{q+1}MPp(s)\,ds\,d\sigma, & t\ge\Delta t_q.\end{cases}$$
This can be bounded by following the technique in the proof of Theorem B.3. This yields
$$\|v_0(t)-v_0^q(t)\|\le\begin{cases}0, & 0\le t\le\Delta t_q,\\[4pt] \displaystyle A_2C_4\prod_{i=1}^q\beta_i\frac{(t-\Delta t_q)^{q+1}}{(q+1)!}, & t\ge\Delta t_q.\end{cases}$$
(B11)

Corollary B.4.1
(Uniform convergence of Type-I FMA). If the βi in Theorem B.4 satisfy
$$\beta_i<\left[\frac{(i+1)!\,\delta}{C_4A_2\prod_{k=1}^{i-1}\beta_k}\right]^{\frac{1}{i}},$$
(B12)
then for any δ > 0, there exists an ordered time sequence Δtn < Δtn−1 < ⋯ < Δt1 < T such that
$$\|v_0(T)-v_0^q(T)\|\le\delta,\qquad 1\le q\le n,$$
which satisfies
$$\Delta t_q\le T-\left[\frac{\delta\,(q+1)!}{C_4A_2\prod_{i=1}^q\beta_i}\right]^{\frac{1}{q+1}}.$$
The proof is very similar to that of Corollary 3.5.1, and therefore we omit it.

Theorem B.5
(Accuracy of Type-II FMA). Let $e^{tM}$ and $e^{tMQ}$ be strongly continuous semigroups and T > 0 be a fixed integration time. Set
$$\beta_i=\frac{\sup_{s\in[0,T]}\|(MQ)^{i+1}MPp(s)\|}{\sup_{s\in[0,T]}\|(MQ)^iMPp(s)\|},\qquad 1\le i\le n.$$
(B13)
Then for 1 ≤ q ≤ n,
$$\|v_0(t)-v_0^q(t)\|\le N_4^q(t),$$
where
$$N_4^q(t)=\frac{C_4}{\omega_Q}\left(1-e^{-t_q\omega_Q}\right)\prod_{i=1}^q\beta_i\,f_q(\omega_Q,t),$$
$f_q(\omega_Q,t)$ is defined in (48), and $C_4$ is as in Theorem B.1.

Proof.
The proof is very similar to the proof of Theorem 3.6. Hereafter we provide the proof for the case $\omega_Q>0$; the other cases can be easily obtained by the same method. First of all, we have the error estimate
$$\|v_0(t)-v_0^q(t)\|\le\int_0^t\int_0^{\tau_q}\cdots\int_0^{\tau_2}\int_0^{t_q}\left\|PMe^{(\tau_1-s)QM}(QM)^{q+1}Pp(s)\right\|ds\,d\tau_1\cdots d\tau_q\le C_4\prod_{i=1}^q\beta_i\int_0^t\int_0^{\tau_q}\cdots\int_0^{\tau_2}\int_0^{t_q}e^{(\tau_1-s)\omega_Q}\,ds\,d\tau_1\cdots d\tau_q=\frac{C_4}{\omega_Q}\left(1-e^{-t_q\omega_Q}\right)\prod_{i=1}^q\beta_i\,f_q(\omega_Q,t),$$
where $f_q(\omega_Q,t)$ is as in (48).

Corollary B.5.1
(Uniform convergence of Type-II FMA). If the βi in Theorem B.5 satisfy
$$\beta_i<\omega_Q\frac{\displaystyle e^{T\omega_Q}-\sum_{k=0}^{i-2}\frac{(T\omega_Q)^k}{k!}}{\displaystyle e^{T\omega_Q}-\sum_{k=0}^{i-1}\frac{(T\omega_Q)^k}{k!}}$$
for all t ∈ [0, T], then for arbitrarily small δ > 0, there exists an ordered time sequence 0 < t1 < ⋯ < tn < T such that
$$\|v_0(T)-v_0^q(T)\|\le\delta,\qquad 1\le q\le n,$$
which satisfies
$$t_q\le\frac{1}{\omega_Q}\ln\left[1-\frac{\delta\,\omega_Q^{q+1}}{C_4\prod_{i=1}^q\beta_i\left(e^{T\omega_Q}-\sum_{k=0}^{q-1}\frac{(T\omega_Q)^k}{k!}\right)}\right]^{-1}.$$

Proof.
To ensure that $\|v_0(t)-v_0^q(t)\|\le\delta$ for all 0 ≤ t ≤ T, we can take (for $\omega_Q>0$)
$$\frac{C_4}{\omega_Q}\left(1-e^{-t_q\omega_Q}\right)\prod_{i=1}^q\beta_i\,f_q(\omega_Q,T)=\max_{t\in[0,T]}\frac{C_4}{\omega_Q}\left(1-e^{-t_q\omega_Q}\right)\prod_{i=1}^q\beta_i\,f_q(\omega_Q,t)\le\delta.$$
Therefore
$$e^{-t_q\omega_Q}\ge1-\frac{\delta\,\omega_Q}{C_4\prod_{i=1}^q\beta_i\,f_q(\omega_Q,T)},\quad\text{i.e.,}\quad t_q\le\frac{1}{\omega_Q}\ln\left[1-\frac{\delta\,\omega_Q^{q+1}}{C_4\prod_{i=1}^q\beta_i\left(e^{T\omega_Q}-\sum_{k=0}^{q-1}\frac{(T\omega_Q)^k}{k!}\right)}\right]^{-1}.$$
Moreover, for $\omega_Q>0$ we have the condition
$$\beta_i<\omega_Q\frac{\displaystyle e^{T\omega_Q}-\sum_{k=0}^{i-2}\frac{(T\omega_Q)^k}{k!}}{\displaystyle e^{T\omega_Q}-\sum_{k=0}^{i-1}\frac{(T\omega_Q)^k}{k!}}.$$
Thus, there exists an ordered time sequence 0 < t1 < ⋯ < tn such that $\|v_0(T)-v_0^n(T)\|\le\delta$. As in Theorem 3.6, the δ-bound on the error holds for all $t_q$ satisfying the bound above, which implies the existence of such an increasing time sequence.

1. J. Abate and W. Whitt, "A unified framework for numerically inverting Laplace transforms," INFORMS J. Comput. 18(4), 408–421 (2006).
2. B. J. Alder and T. E. Wainwright, "Decay of the velocity autocorrelation function," Phys. Rev. A 1(1), 18 (1970).
3. R. J. Baxter, Exactly Solved Models in Statistical Mechanics (Elsevier, 2016).
4. N. Biggs, Algebraic Graph Theory (Cambridge University Press, 1993).
5. C. Brennan and D. Venturi, "Data-driven closures for stochastic dynamical systems," J. Comput. Phys. 372, 281–298 (2018).
6. H.-P. Breuer, B. Kappler, and F. Petruccione, "The time-convolutionless projection operator technique in the quantum theory of dissipation and decoherence," Ann. Phys. 291(1), 36–70 (2001).
7. A. Chertock, D. Gottlieb, and A. Solomonoff, "Modified optimal prediction and its application to a particle-method problem," J. Sci. Comput. 37(2), 189–201 (2008).
8. H. Cho, D. Venturi, and G. E. Karniadakis, "Statistical analysis and simulation of random shocks in Burgers equation," Proc. R. Soc. A 470(2171), 20140080 (2014).
9. A. Chorin, O. Hald, and R. Kupferman, "Optimal prediction with memory," Physica D 166(3-4), 239–257 (2002).
10. A. J. Chorin, O. H. Hald, and R. Kupferman, "Optimal prediction and the Mori-Zwanzig representation of irreversible processes," Proc. Natl. Acad. Sci. U. S. A. 97(7), 2968–2973 (2000).
11. A. J. Chorin, R. Kupferman, and D. Levy, "Optimal prediction for Hamiltonian partial differential equations," J. Comput. Phys. 162(1), 267–297 (2000).
12. A. J. Chorin and P. Stinis, "Problem reduction, renormalization and memory," Commun. Appl. Math. Comput. Sci. 1(1), 1–27 (2006).
13. E. Darve, J. Solomon, and A. Kia, "Computing generalized Langevin equations and generalized Fokker-Planck equations," Proc. Natl. Acad. Sci. U. S. A. 106(27), 10884–10889 (2009).
14. E. B. Davies, "Semigroup growth bounds," J. Oper. Theory 53(2), 225–249 (2005).
15. J. Dominy and D. Venturi, "Duality and conditional expectations in the Nakajima-Mori-Zwanzig formulation," J. Math. Phys. 58, 082701 (2017).
16. K.-J. Engel and R. Nagel, One-Parameter Semigroups for Linear Evolution Equations (Springer, 1999), Vol. 194.
17. P. Español, "Dissipative particle dynamics for a harmonic chain: A first-principles derivation," Phys. Rev. E 53(2), 1572 (1996).
18. J. Florencio and H. M. Lee, "Exact time evolution of a classical harmonic-oscillator chain," Phys. Rev. A 31(5), 3231 (1985).
19. R. F. Fox, "Functional-calculus approach to stochastic differential equations," Phys. Rev. A 33(1), 467–476 (1986).
20. S. Gudder, Pac. J. Math. 80(1), 141–149 (1979).
21. G. D. Harp and B. J. Berne, "Time-correlation functions, memory functions, and molecular dynamics," Phys. Rev. A 2(3), 975 (1970).
22. A. Karimi and M. R. Paul, "Extensive chaos in the Lorenz-96 model," Chaos 20(4), 043105 (2010).
23. J. Kim and I. Sawada, "Dynamics of a harmonic oscillator on the Bethe lattice," Phys. Rev. E 61(3), R2172 (2000).
24. B. O. Koopman, "Hamiltonian systems and transformation in Hilbert spaces," Proc. Natl. Acad. Sci. U. S. A. 17(5), 315–318 (1931).
25. H. Lei, N. A. Baker, and X. Li, "Data-driven parameterization of the generalized Langevin equation," Proc. Natl. Acad. Sci. U. S. A. 113(50), 14183–14188 (2016).
26. E. N. Lorenz, "Predictability—A problem partly solved," in ECMWF Seminar on Predictability (ECMWF, 1996), Vol. 1, pp. 1–18.
27. H. Mori, "A continued-fraction representation of the time-correlation functions," Prog. Theor. Phys. 34(3), 399–416 (1965).
28. H. Mori, "Transport, collective motion, and Brownian motion," Prog. Theor. Phys. 33(3), 423–455 (1965).
29. Noise in Nonlinear Dynamical Systems, Vol. 1: Theory of Continuous Fokker-Planck Systems, edited by F. Moss and P. V. E. McClintock (Cambridge University Press, 1995).
30. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations (Springer, 1992).
31. I. Snook, The Langevin and Generalised Langevin Approach to the Dynamics of Atomic, Polymeric and Colloidal Systems, 1st ed. (Elsevier, 2007).
32. G. Söderlind, "The logarithmic norm. History and modern theory," BIT Numer. Math. 46(3), 631–652 (2006).
33. P. Stinis, "Stochastic optimal prediction for the Kuramoto–Sivashinsky equation," Multiscale Model. Simul. 2(4), 580–612 (2004).
34. P. Stinis, "A comparative study of two stochastic model reduction methods," Physica D 213, 197–213 (2006).
35. P. Stinis, "Higher order Mori-Zwanzig models for the Euler equations," Multiscale Model. Simul. 6(3), 741–760 (2007).
36. P. Stinis, "Renormalized Mori–Zwanzig-reduced models for systems without scale separation," Proc. R. Soc. A 471(2176), 20140446 (2015).
37. L. N. Trefethen, "Pseudospectra of linear operators," SIAM Rev. 39(3), 383–406 (1997).
38. L. N. Trefethen and M. Embree, Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators (Princeton University Press, 2005).
39. H. Umegaki, "Conditional expectation in an operator algebra. I," Tohoku Math. J. 6(2), 177–181 (1954).
40. D. Venturi, "The numerical approximation of nonlinear functionals and functional differential equations," Phys. Rep. 732, 1–102 (2018).
41. D. Venturi, H. Cho, and G. E. Karniadakis, "The Mori-Zwanzig approach to uncertainty quantification," in Handbook of Uncertainty Quantification, edited by R. Ghanem, D. Higdon, and H. Owhadi (Springer, 2016).
42. D. Venturi and G. E. Karniadakis, "Convolutionless Nakajima-Zwanzig equations for stochastic analysis in nonlinear dynamical systems," Proc. R. Soc. A 470(2166), 20130754 (2014).
43. D. Venturi, T. P. Sapsis, H. Cho, and G. E. Karniadakis, "A computable evolution equation for the joint response-excitation probability density function of stochastic dynamical systems," Proc. R. Soc. A 468(2139), 759–783 (2012).
44. Y. Zhu and D. Venturi, "Faber approximation of the Mori-Zwanzig equation," J. Comput. Phys. 372, 694–718 (2018).
45. R. Zwanzig, "Memory effects in irreversible thermodynamics," Phys. Rev. 124, 983–992 (1961).
46. R. Zwanzig, Nonequilibrium Statistical Mechanics (Oxford University Press, 2001).