We develop a thorough mathematical analysis to deduce conditions for the accuracy and convergence of different approximations of the memory integral in the Mori-Zwanzig (MZ) equation. In particular, we derive error bounds and sufficient convergence conditions for short-memory approximations, the t-model, and hierarchical (finite-memory) approximations. In addition, we derive useful upper bounds for the MZ memory integral, which allow us to estimate a priori the contribution of the MZ memory to the dynamics. Such upper bounds are easily computable for systems with finite-rank projections. Numerical examples are presented and discussed for linear and nonlinear dynamical systems evolving from random initial states.
I. INTRODUCTION
The Mori-Zwanzig (MZ) formulation is a technique from irreversible statistical mechanics that allows the development of formally exact evolution equations for the quantities of interest such as macroscopic observables in high-dimensional dynamical systems.45,10,41,42 One of the main advantages of developing such exact equations is that they provide a theoretical starting point to avoid integrating the full (possibly high-dimensional) dynamical system and instead solve directly for the quantities of interest, thus reducing the computational cost significantly. Computing the solution to the Mori-Zwanzig equation, however, is a challenging task that relies on approximations and appropriate numerical schemes. One of the main difficulties lies in the approximation of the MZ memory integral (convolution term), which encodes the effects of the so-called orthogonal dynamics in the time evolution of the quantity of interest. The orthogonal dynamics are essentially a high-dimensional flow satisfying a complex integro-differential equation. Over the years, many techniques have been proposed for approximating the MZ memory integral, the most efficient ones being problem-dependent.31,41 For example, in applications to statistical mechanics, Mori’s continued fraction method27,18 has been quite successful in determining exact solutions to several prototype problems, such as the dynamics of the auto-correlation function of a tagged oscillator in a harmonic chain.17,23 Other effective approaches to approximate the MZ memory integral rely on perturbation methods,6,42,8 mode coupling theories,2,31 functional approximation techniques,40,19,43,21,29 and data-driven methods.5,25 In a parallel effort, the applied mathematics community has, in recent years, attempted to derive general easy-to-compute representations of the memory integral. In particular, various approximations such as the t-model,10,12,34 the modified t-model,7 and, more recently, renormalized perturbation methods36 were proposed to address the approximation of the memory integral in situations where there is no clear separation of scales between resolved and unresolved dynamics.
The main objective of this paper is to develop rigorous estimates of the memory integral and provide convergence analysis of different approximation models of the Mori-Zwanzig equation, such as the short-memory approximation,33 the t-model,12 and hierarchical methods.35 In particular, we study the MZ equation corresponding to two broad classes of projection operators: (i) infinite-rank projections (e.g., Chorin’s projection10) and (ii) finite-rank projections (e.g., Mori’s projection28). We develop our analysis for both state-space and probability density function (PDF) space formulations of the MZ equation. These two descriptions are connected by the same duality principle that pairs the Koopman and Frobenious-Perron operators.15
This paper is organized as follows. In Sec. II, we outline the general procedure to derive the MZ equation in the phase space and discuss the common choices of projection operators. In Sec. III, we derive error bounds for the MZ memory integral and provide the convergence analysis of different memory approximation methods, including the t-model,34,10,12 the short-memory approximation,33 and the hierarchical memory approximation technique.35 Such estimates are built upon semigroup estimation methods we present in Appendix A. In Sec. IV, we present numerical examples demonstrating the accuracy of the memory approximation/estimation methods we develop throughout the paper. The main findings are summarized in Sec. V. Convergence analysis of the MZ memory term in the probability density function space formulation is presented in Appendix B.
II. THE MORI-ZWANZIG FORMULATION
Consider the nonlinear dynamical system
evolving on a smooth manifold . For simplicity, let us assume that . We will consider the dynamics of scalar-valued observables , and for concreteness, it will be desirable to identify structured spaces of such observable functions. In Ref. 15, it was argued that C*-algebras of observables such as (the space of all measureable, essentially bounded functions on ) and (the space of all continuous functions on , vanishing at infinity) make natural choices. In what follows, we do not require the observables to comprise a C*-algebra, but we will want them to comprise a Banach space as the estimation theorems of Sec. III make an extensive use of the norm of this space. Having the structure of a Banach space of observables also gives greater context to the meaning of the linear operators , , , and to be defined hereafter.
The dynamics of any scalar-valued observable g(x) (quantity of interest) can be expressed in terms of a semi-group of operators acting on the Banach space of observables. This is the Koopman operator24 which acts on the function g as
where
Rather than computing the Koopman operator applicable to all observables, it is often more tractable to compute the evolution only of a (closed) subspace of the quantities of interest. This subspace can be described conveniently by means of a projection operator with the subspace as its image. Both and the complementary projection act on the space of observables. The nature, mathematical properties, and connections between and the observable g are discussed in detail in Ref. 15 and summarized in Sec. II A. For now, it suffices to assume that is a bounded linear operator and that . The MZ formalism describes the evolution of observables initially in the image of . Because the evolution of observables is governed by the semi-group , we seek an evolution equation for . By using the definition of the Koopman operator (3) and the well-known Dyson identity
we obtain the operator equation
By applying this equation to an observable function u0, we obtain the well-known MZ equation in phase space
Acting on the left with , we obtain the evolution equation for projected dynamics
Note, in fact, that the projection of the second term in (5), i.e., , vanishes since .
A. Projection operators
In this section, we make a summary on the commonly used projection operators in the Mori-Zwanzig framework. To make our definition mathematically sound, we begin by assuming that the Liouville operator (3) acts on observable functions in a C*-algebra , for instance, , where is a space such as , Σ is a σ-algebra on , and μ is a measure on Σ. Let be a positive linear functional on . We define the weighted pre-inner product
This can be used to define a Hilbert space , which is the completion of the quotient space
endowed with the inner product ⟨·,·⟩σ. The L2 norm induced by the inner product is denoted as ∥·∥σ. In the rest of the paper, the positive linear functional σ is always induced by a probability distribution through
where is typically chosen to be the probability density of the initial condition ρ0 or the equilibrium distribution ρeq in statistical physics. To conform to the literature, we also use notation , to represent the weighted inner product corresponding to different probability measures . With the Hilbert space determined, we now focus on the following two broad classes of orthogonal projections on .
1. Infinite-rank projections
The first class of projection operators to consider in this setting are the conditional expectations such that . In this case, the properties of conditional expectations (in particular, that 39) and the fact that imply that
so that
for all . It follows that
Therefore both and are self-adjoint (i.e., orthogonal) projections onto closed subspaces of and hence contractions , . Chorin’s projection12,10 is one of this class and is defined as
Here is the flow map generated by (1) split into resolved () and unresoved () variables, and is the quantity of interest. For Chorin’s projection, the positive functional σ defining the Hilbert space may be taken to be integration with respect to the probability measure . Clearly, if x0 is deterministic, then is a product of Dirac delta functions. On the other hand, if and are statistically independent, i.e., , then the conditional expectation simplifies to
In the special case where , we have
i.e., the conditional expectation of the resolved variables given the initial condition . This means that an integration of (9) with respect to yields the mean of the resolved variables, i.e.,
Obviously, if the resolved variables evolve from a deterministic initial state , then the conditional expectation (9) represents the the average of the reduced-order flow map with respect to the PDF of , i.e., the flow map
In this case, the MZ equation (6) is an exact (unclosed) partial differential equation (PDE) for the multivariate field . In order to close such an equation, a mean field approximation of the type was introduced by Chorin et al. in Refs. 10, 12, and 11, together with the assumption that the probability distribution of x0 is invariant under the flow generated by (1).
2. Finite-rank projections
Another class of projections is defined by choosing a closed (typically finite-dimensional) linear subspace and letting to be the orthogonal projection onto V in the σ inner product. An example of such projection is Mori’s projection,46 widely used in statistical physics. For finite-dimensional V, given a linearly independent set {u1, …, uM} ⊂ V that spans V, can be defined by first constructing the positive definite Gram matrix . Then
This projection is orthogonal with respect to the inner product. In statistical physics, a common choice for the positive functional σ that generates is integration with respect to the Gibbs canonical distribution , for the Hamiltonian and the associated partition function Z. Here q are generalized coordinates while p are kinetic momenta.
III. ANALYSIS OF THE MEMORY INTEGRAL
In this section, we develop a thorough mathematical analysis of the MZ memory integral
and its approximation. We begin by describing the behavior of the semigroup norms , , and as functions of time, for different choices of projection and different norms. As we will see, the analysis will give clear computable bounds only in some circumstances, illustrating the difficulty of this problem and the need for further development and insight.
A. Semigroup estimates
For any Liouville operator of the form (3) acting on and for any σ identified with an element of , the functional (assuming that σ lies in the domain of ) is absolutely continuous with respect to σ (essentially because acts locally) and the Radon-Nikodym derivative20 may be identified with the negative of the divergence of the vector field F with respect to the measure induced by σ, i.e.,
where
When and σ has the form , this can be shown more directly using integration by parts. By assuming that or Fi decays to 0 at ∞, we have
from which we see
Therefore,
and the following are equivalent: (i) (i.e., σ is invariant); (ii) divσF = 0; (iii) is the skew-adjoint with respect to the σ inner product. More generally, on , we find that
Using this numerical abscissa ω, we obtain the following estimation of the Koopman semigroup:
and moreover, eωt is the smallest exponential function that bounds .14 When and are the orthogonal projections on , we can observe that the numerical abscissa of is bounded by that of . In fact,
[see Eq. (17)] so that
It should be noticed that this bound for the orthogonal semigroup is not necessarily tight. The tightness of the bound depends on the choice of projection and comes down to whether functions in the image of can be chosen localized to regions where divσF is close to its infimal value.
When estimating the MZ memory integral, we need to deal with the semigroup . It turns out to be extremely difficult to prove strong continuity of such a semigroup in general, due to the unboundedness of . It is shown in Appendix A that when is an unbounded operator, as is typical when is a conditional expectation, the semigroup can only be bounded as
for some , due to the fact that has an infinite slope at t = 0. More work is needed to obtain satisfactory, computable values for and , in the case where and are the infinite-rank projections. It is also shown that when either or is bounded, for example, when is a finite-rank projection, we can get computable semigroup bounds of the form
where ω = −inf divσF if we use the estimation of .
B. Memory growth
We begin by seeking to bound the MZ memory integral (B4) as a whole and build our analysis from there. A key assumption of our analysis is that the semigroup is strongly continuous, i.e., the map is continuous in the norm topology on the space of observables for each fixed g.16 As we pointed out in Sec. III A, it is very difficult to rigorously prove a strong continuity of and to obtain a computable upper bound for unbounded generators of the form . We leave this problem for future research, and assume for now that that there exist constants and such that . Throughout this section, ∥·∥ denotes a general Banach norm. We begin with the following simple estimate:
Theorem 3.1 provides an upper bound for the growth of the memory integral based on the assumption that and are strongly continuous semigroups. We emphasize that only for simple cases can such upper bounds be computed analytically (we will compute one of the cases later in Sec. IV) because of the fundamental difficulties in computing the upper bound of . However, it will be shown later that, although the specific expression for M0(t) is unknown, the form of it is already useful as it enables us to derive some verifiable theoretical predictions for general nonlinear systems.
C. Short memory approximation and the t-model
Theorem 3.1 can be employed to obtain upper bounds for well-known approximations of the memory integral. Let us begin with the t-model proposed in Ref. 12. This model relies on the approximation
We omit the proof due to its similarity to that of Theorem 3.1. Note that M2(Δt, t) = 0 for all finite t > 0.
D. Hierarchical memory approximation
An alternative way to approximate the memory integral (13) was proposed by Stinis in Ref. 35. The key idea is to repeatedly differentiate (13) with respect to time and establish a hierarchy of PDEs which can eventually be truncated or approximated at some level to provide an approximation of the memory. In this section, we derive this hierarchy of memory equations and perform a thorough theoretical analysis to establish the accuracy and convergence of the method. To this end, let us first define
to be the memory integral (13). By differentiating 0(t) with respect to time, we obtain
where
By iterating this procedure n times, we obtain
where
The hierarchy of Eqs. (27) and (28) is equivalent to the following infinite-dimensional system of PDEs:
evolving from the initial condition i(0) = 0, i = 1, 2, … [see Eq. (28)]. With such an initial condition available, we can solve (29) with backward substitution, i.e., from the last equation to the first one, to obtain the following (exact) Dyson series representation of the memory integral (26):
So far no approximation was introduced, i.e., the infinite-dimensional system (29) and the corresponding formal solution (30) are exact. To make progress in developing a computational scheme to estimate the memory integral (26), it is necessary to introduce approximations. The simplest of these rely on truncating the hierarchy (29) after n equations, while simultaneously introducing an approximation of the nth order memory integral n(t). We denote such an approximation as . The truncated system takes the form
The notation (j = 0, …, n − 1) emphasizes that the solution to (31) is, in general, different from the solution to (29). The initial condition of the system can be set as , for all i = 0, …, n − 1. By using backward substitution, this yields the following formal solution:
representing an approximation of the memory integral (26). Note that, for a given system, such approximation depends only on the number of equations n in (31) and on the choice of the approximation . In the present paper, we consider the following choices:
Approximation by truncation (H-model),
Type-I finite memory approximation (FMA),
Type-II finite memory approximation,
Ht-model,
The quantities tn and Δtn appearing in (34) and (35) will be defined in Theorems 3.5 and 3.6, respectively. The first approximation, i.e., (33), is a truncation of the hierarchy obtained by assuming that n(t) = 0. Such approximation was originally proposed by Stinis in Ref. 35, and we shall call it the H-model. The Type-I finite memory approximation (FMA) is obtained by applying the short memory approximation to the nth order memory integral n(t). The Type-II finite memory approximation (FMA) is a modified version of the Type-I, with a larger memory band. The Ht-model approximation is based on replacing the nth order memory integral n(t) with a classical t-model. Note that in this setting, the classical t-model approximation proposed by Chorin and Stinis12 is equivalent to a zeroth-order Ht-model approximation.
Hereafter, we present a thorough mathematical analysis that aims at estimating the error , where 0(t) is the full memory at time t [see (26) or (30)], while is the solution of the truncated hierarchy (31), with given by (33), (34), (35), or (36). With such error estimates available, we can infer whether the approximation of the full memory 0(t) with is accurate and, more importantly, if the algorithm to approximate the memory integral converges. To the best of our knowledge, this is the first time a rigorous convergence analysis is performed on various approximations of the MZ memory integral. It turns out that the distance can be controlled through the construction of the hierarchy under some constraint on the initial condition.
1. The H-model
Setting in (31) yields an approximation by truncation, which we will refer to as the H-model (hierarchical model). Such a model was originally proposed by Stinis in Ref. 35. Hereafter we provide error estimates and convergence results for this model. In particular, we derive an upper bound for the error and sufficient conditions for the convergence of the reduced-order dynamical system. Such conditions are problem dependent, i.e., they involve the Liouvillian , the initial condition u0, and the projection .
Theorem 3.4 states that for a given dynamical system (represented by ) and quantity of interest (represented by ), the error bound is strongly related to {αj} which is ultimately determined by the initial condition x0. It turns out that by bounding {αj}, we can control and therefore the overall error . The following corollaries discuss sufficient conditions such that the error decays as we increase the differentiation order n for fixed time T > 0.
Corollary 3.4.1 provides a sufficient condition for the error to decrease monotonically as we increase p in (31). A stronger condition that yields an asymptotically decaying error bound is given by the following Corollary.
An interesting consequence of Corollary 3.4.2 is the existence of a convergence barrier, i.e., a “hump” in the error plot versus p generated by the H-model. While Corollary 3.4.2 only shows that behavior for an upper bound of the error, not directly the error itself, the feature is often found in the actual errors associated with numerical methods based on these ideas. The following corollary shows that the requirements on {αj} can be dropped (we still need αj < +∞) if we consider relatively short integration times T.
Corollaries 3.4.1 and 3.4.2 provide sufficient conditions for the error generated by the H-model to decay as we increase the truncation order n. However, we still need to answer the important question of whether the H-model actually provides accurate results for a given nonlinear dynamics (), quantity of interest () and initial state x0. Corollary 3.4.3 provides a partial answer to this question by showing that, at least in the short time period, condition (41) is always satisfied (assuming that {αj} are finite). This guarantees the short-time convergence of the H-model for any reasonably smooth nonlinear dynamical system and almost any observable. However, for longer integration times T, convergence of the H-model for arbitrary nonlinear dynamical systems cannot be established in general, which means that we need to proceed on a case-by-case basis by applying Theorem 3.4 or by checking whether the hypotheses of Corollary 3.4.1 or Corollary 3.4.2 are satisfied. On the other hand, the convergence of the H-model can be established for any finite integration time in the case of linear dynamical systems, as we have recently shown in Ref. 44. From a practical viewpoint, the implementation of the H-model requires computing to high-order in n. This is not straightforward if the dynamical systems are nonlinear. However, for linear systems, such terms can be easily computed (see Sec. III E).
2. Type-I finite memory approximation (FMA)
The Type-I finite memory approximation is obtained by solving the system (31) with given by (34). As before, we first derive an upper bound for and then discuss sufficient conditions for convergence. Such conditions basically control the growth of an upper bound on .
We notice that if the effective memory band at each level decreases as we increase the differentiation order p, then we can control the error . The following corollary provides a sufficient condition that guarantees this sort of control of the error.
3. Type-II finite memory approximation
The Type-II finite memory approximation is obtained by solving the system (31) with given in (35). We first derive an upper bound for and then discuss sufficient conditions for convergence.
4. Ht-model
The Ht-model is obtained by solving the system (31) with approximated using Chorin’s t-model12 [see Eq. (36)]. Convergence analysis can be performed by using the mathematical methods we employed for the proofs of the H-model. Note that the classical t-model is equivalent to a zeroth-order Ht-model.
One can see that the upper bounds and (see Theorem 3.4) share the same structure, the only difference being the constant out front. Hence by changing C2 to C4, we can prove a series of corollaries similar to 3.4.1–3.4.3. In summary, what holds for the H-model also holds for the Ht-model. For the sake of brevity, we omit the statement and proofs of those corollaries.
E. Linear dynamical systems
The upper bounds we obtained above are not easily computable for general nonlinear systems and infinite-rank projections, e.g., Chorin’s projection (7). However, if the dynamical system is linear, then such upper bounds are explicitly computable and the convergence of the H-model can be established for linear phase space functions in any finite integration time T. To this end, consider the linear system ẋ = Ax with a random initial condition x(0) sampled from the joint probability density function
In other words, the initial condition for the quantity of interest u(x) = x1(t) is set to be deterministic, while all other variables x2, …, xN are zero-mean and statistically independent at t = 0. Here we assume for simplicity that ρ0j (j = 2, …, N) are independent identically distributed (i.i.d.) standard normal distributions. Observe that the Liouville operator associated with the linear system ẋ = Ax is
where Aij are the entries of the matrix A. If we choose observable u = x1(t), then Chorin’s projection operator (9) yields the evolution equation for the conditional expectation , i.e., the conditional mean path (11), which can be explicitly written as
where is the first entry of the matrix A and 0 represents the memory integral (26). Next, we explicitly compute the upper bounds for the memory growth and the error in the H-model for this system. To this end, we first notice that the domain of the Liouville operator can be restricted to the linear space
In fact, V is invariant under , , and , i.e., , , and . These operators have the following matrix representations:
where M11 is the minor of the matrix of A obtained by removing the first column and the first row, while
Therefore,
At this point, we set x01 = x1(0) and
Since is random, is a random variable. By using Jensen’s inequality , we have the following L∞ estimate:
On the other hand, we have
For linear dynamical systems, both and upper bounds can be used to estimate the norm of the semigroup . However, for the semigroup , we can only obtain the explicit form of the bound, which is given by the following perturbation theorem16 (see also Appendix A):
Memory growth. It is straightforward at this point to compute the upper bound of the memory growth we obtained in Theorem 3.1. Since ( and are the orthogonal projections relative to ρ0), we have the following result:
where is a N − 1 × N − 1 diagonal matrix with and ∥·∥2 is the vector 2-norm. Accuracy of theH-model. We are interested in computing the upper bound of the approximation error generated by the H-model (see Sec. III D 1–Theorem 3.4). By using the matrix representation of , , and , the nth order H-model MZ Eq. (56) for the linear system can be explicitly written as
where M11, a, and b are defined as before [see Eq. (58)]. The upper bound for the memory term approximation error is explicitly obtained as
where A1, A2 are defined in (38), while ω and are given in (61) and (62), respectively. Note that the error bound (65) is slightly different from the one we obtained in Theorem 3.4. The reason is that here we choose to bound , instead of the quotient . For each fixed integration time T, the upper bound (65) goes to zero as we send n to infinity, i.e.,
This means that the H-model converges for all linear dynamical systems with observables in the linear space (57).
F. Memory estimates for finite-rank projections and Hamiltonian systems
The semigroup estimates we obtained in Sec. III A allow us to compute explicitly an a priori estimate of the memory kernel in the Mori-Zwanzig equation if we employ finite-rank projections. In this section, we outline the procedure to obtain such an estimate for Hamiltonian dynamical systems. We begin by recalling that, in general, Hamiltonian systems are necessarily divergence-free, i.e.,
Here, F(x) is the velocity field at the right-hand side of (1), while is the canonical Gibbs distribution. Equation (66) can be easily obtained by noticing that
By combining the estimate (18) with the divergence-free constraint (66), we find that the Koopman semigroup of a Hamiltonian dynamical system is a contraction in the norm, i.e.,
Moreover, the MZ equation (6) with a finite-rank projection of the form (12) (with σ = ρeq) can be reduced to the following Volterra integro-differential equation:
where
Equation (68) is often called the generalized Langevin equation (GLE)13,31 for the projected quantity of interest ui(t). To derive (69a)–(69c), we used the fact that is skew-adjoint and is self-adjoint with respect to the inner product and that . In classical statistical physics, Eq. (69c) is known as the second fluctuation dissipation theorem. Next, define the temporal-correlation matrix
By applying to both sides of Eq. (68), we obtain the following exact evolution equation for Cij(t):
Moreover, if we employ a one-dimensional Mori’s basis, i.e., M = 1, then we obtain the simplified equation
where . The main difficulty in solving the GLE (71) [or (72)] lies in computing the memory kernels Kij(t). Hereafter we prove that such memory kernels can be uniformly bounded by a computable quantity that depends only on the initial condition of the system. For the sake of simplicity, we will focus on the one-dimensional GLE (72).
Theorem 3.8 provides an a priori and easily computable upper bound for the memory kernel defining the dynamics of any quantity of interest u1 that is initially in the Gibbs canonical ensemble . In Sec. IV, we will calculate the upper bound (73) analytically and compare it with the exact memory kernel we obtain in prototype linear and nonlinear Hamiltonian systems.
IV. NUMERICAL EXAMPLES
In this section, we provide simple numerical examples of the MZ memory approximation methods we discussed throughout the paper. Specifically, we study Hamiltonian systems (linear and nonlinear) with finite-rank projections (Mori’s projection) and non-Hamiltonian systems with infinite-rank projections (Chorin’s projection). In both cases, we demonstrate the accuracy of the a priori memory estimation method we developed in Secs. III F and III E. We also compute the solution to the MZ equation for non-Hamiltonian systems with the t-model, the H-model, and the Ht-model.
A. Hamiltonian dynamical systems with finite-rank projections
In this section, we consider dimension reduction in linear and nonlinear Hamiltonian dynamical systems with finite-rank projection. In particular, we consider the Mori projection and study the MZ equation for the temporal auto-correlation function of a scalar quantity of interest.
1. Harmonic chains of oscillators
Consider a one-dimensional chain of harmonic oscillators. This is a simple but illustrative example of a linear Hamiltonian dynamical system which has been widely studied in statistical mechanics, mostly in relation with the microscopic theory of Brownian motion.3,18,17 The Hamiltonian of the system can be written as
where qi and pi are, respectively, the displacement and momentum of the ith particle, m is the mass of the particles (assumed constant throughout the network), and k is the elasticity constant that modulates the intensity of the quadratic interactions. We set fixed boundary conditions at the endpoints of the chain, i.e., q0(t) = qN+1(t) = 0 and p0(t) = pN+1(t) = 0 (particles are numbered from left to right) and m = k = 1. The Hamilton’s equations are
which can be written in a matrix-vector form as
where B is the adjacency matrix of the chain and D is the degree matrix (see Ref. 4). Note that (76) is a linear dynamical system. We are interested in the velocity auto-correlation function of a tagged oscillator, say, the one at location j = 1. Such an auto-correlation function is defined as
where the average is with respect to the Gibbs canonical distribution . It was shown in Ref. 18 that can be obtained analytically by employing Lee’s continued fraction method. The result is the well-known J0 − J4 solution
where Ji(t) is the ith Bessel function of the first kind. On the other hand, the Mori-Zwanzig equation derived by the following Mori’s projection
yields the following GLE for :
Here,
since , while K(t) is the MZ memory kernel. For the J0 − J4 solution, it is possible to derive the memory kernel K(t) analytically. To this end, we simply insert (78) into (80) and apply the Laplace transform
to obtain
where and . The inverse Laplace transform of (81) can be computed analytically as
With K(t) available, we can verify the memory estimated we derived in Theorem 3.8. To this end,
Here we used the exact solution of the velocity auto-correlation function and displacement auto-correlation function of the fixed-end harmonic chain given by (see Ref. 18)
In Fig. 1, we plot the absolute value of the memory kernel K(t) together with the theoretical bound (83). It is seen that the upper bound we obtain in this case is of the same order of magnitude as the memory kernel.
Harmonic chain of oscillators. (a) Velocity auto-correlation function and (b) memory kernel K(t) of the corresponding MZ equation. It is seen that our theoretical estimate (83) (dashed line) correctly bounds the MZ memory kernel. Note that the upper bound we obtain is of the same order of magnitude as the memory kernel.
Harmonic chain of oscillators. (a) Velocity auto-correlation function and (b) memory kernel K(t) of the corresponding MZ equation. It is seen that our theoretical estimate (83) (dashed line) correctly bounds the MZ memory kernel. Note that the upper bound we obtain is of the same order of magnitude as the memory kernel.
2. Hald system
In this section, we study the Hamiltonian system studied by Chorin et al. in Refs. 12 and 9. The Hamiltonian function is defined as
while the corresponding Hamilton’s equations of motion are
We assume that the initial state is distributed according to canonical Gibbs distribution . The partition function Z is given by
where K0(t) is the modified Bessel function of the second kind. We aim to study the properties of the autocorrelation function of the first component q1, which is defined as
Obviously, . The evolution equation for is obtained by using the MZ formulation with the Mori’s projection
This yields the GLE
The streaming term is again identically zero since
Theorem 3.8 provides the following computable upper bound for the modulus of K(t):
where U(a, b, y) is the confluent hypergeometric function of the second kind. In Fig. 2(a), we plot the correlation function we obtained by sampling (85) and then averaging over all realizations. The samples of the initial condition are generated from the Gibbs equilibrium distribution by using Markov Chain Monte Carlo (MCMC). In Fig. 2(b), we plot the memory kernel K(t) we computed numerically based on . To compute such a kernel, we inverted numerically the Laplace transform of (88), i.e.,
where . In practice, we replaced within the time interval [0, 20] with a high-order polynomial interpolant at Gauss-Chebyshev-Lobatto nodes, computed the Laplace transform of such polynomial analytically, and then computed the inverse Laplace transform (90) numerically with the Talbot algorithm.1
Hald Hamiltonian system (85). (a) Autocorrelation function of the displacement q1(t) and (b) memory kernel of the governing MZ equation. Here is computed by Markov chain Monte-Carlo (MCMC) while K(t) is determined by inverting numerically the Laplace transform in (90) with the Talbot algorithm. It is seen that the theoretical upper bound (89) (dashed line) is of the same order of magnitude as the memory kernel.
Hald Hamiltonian system (85). (a) Autocorrelation function of the displacement q1(t) and (b) memory kernel of the governing MZ equation. Here is computed by Markov chain Monte-Carlo (MCMC) while K(t) is determined by inverting numerically the Laplace transform in (90) with the Talbot algorithm. It is seen that the theoretical upper bound (89) (dashed line) is of the same order of magnitude as the memory kernel.
B. Non-Hamiltonian systems with infinite-rank projections
In this section, we study the accuracy of the t-model, the H-model, and the Ht model in predicting scalar quantities of interest in non-Hamiltonian systems. In particular, we consider the MZ formulation with Chorin’s projection operator. For the particular case of linear dynamical systems, we also compute the theoretical upper bounds we obtained in Sec. III E for the memory growth and the error in the H-model and compare such bounds with exact results.
1. Linear dynamical systems
We begin by considering a low-dimensional linear dynamical system ẋ = Ax evolving from a random initial state with density ρ0(x) to verify the MZ memory estimates we obtained in Sec. III E. For simplicity, we choose A to be negative definite
In this case, the origin of the phase space is a stable node and it is easy to estimate . We set x1(0) = 1 and x2(0), x3(0) independent standard normal random variables. In this setting, the semigroup estimates (61) and (62) are explicit,
Therefore, we obtain the following explicit upper bounds for the memory integral and the error of the H-model [see Eqs. (63) and (65)]:
Next, we compare these error bounds with numerical results obtained by solving numerically the H-model (64). For example, the second-order H-model reads
In Fig. 3, we demonstrate the convergence of the H-model to the benchmark solution computed by Monte-Carlo simulation as we increase the H-model differentiation order. In Fig. 4, we plot the bound on the memory growth [Eq. (92)] and the bound in the memory error [Eq. (93)] together with exact results.
Convergence of the H-model for the linear dynamical system with matrix (91). The benchmark solution is computed with Monte-Carlo (MC) simulation. Also, the zero-order H-model represents the Markovian approximation to the MZ equation, i.e., the MZ equation without the memory term.
Convergence of the H-model for the linear dynamical system with matrix (91). The benchmark solution is computed with Monte-Carlo (MC) simulation. Also, the zero-order H-model represents the Markovian approximation to the MZ equation, i.e., the MZ equation without the memory term.
Linear dynamical system with matrix (91). In (a), we plot the memory term 0(t) we obtain from Monte Carlo simulation together with the estimated upper bound (92). In (b) and (c), we plot the H-model approximation error together with the upper bound (93) for different differentiation orders n and at different times t.
Linear dynamical system with matrix (91). In (a), we plot the memory term 0(t) we obtain from Monte Carlo simulation together with the estimated upper bound (92). In (b) and (c), we plot the H-model approximation error together with the upper bound (93) for different differentiation orders n and at different times t.
Linear dynamical system with matrix A (95). Convergence of the H-model to the conditional mean path solution . The initial condition is set as x1(0) = 3, while {x2(0), …, x100(0)} are i.i.d. normals.
Linear dynamical system with matrix A (95). Convergence of the H-model to the conditional mean path solution . The initial condition is set as x1(0) = 3, while {x2(0), …, x100(0)} are i.i.d. normals.
2. Nonlinear dynamical systems
a. Lorenz-63 system.
Consider the classical Lorenz-63 model
where σ = 10 and β = 8/3. The phase space Liouville operator for this ordinary differential equation (ODE) is
We choose the resolved variables to be and aim at formally integrating out by using the Mori-Zwanzig formalism. To this end, we set and consider the zeroth-order Ht-model (t-model)
where and are the conditional mean paths. To obtain this system, we introduced the following mean field closure approximation:
Higher-order Ht-models can be derived based on (98). As is well known, if r < 1, the fixed point (0, 0, 0) is a global attractor and exponentially stable. In this case, the t-model (zeroth-order Ht-model) yields an accurate prediction of the conditional mean path for long time (see Fig. 6). On the other hand, if we consider the chaotic regime at r = 28, then the t-model and its higher-order extension, i.e., the Ht-model, are accurate only for relatively short time. This is in agreement with our theoretical predictions. In fact, different from linear systems where the hierarchical representation of the memory integral can be proven to be convergent for long time, in nonlinear systems, the memory hierarchy is, in general, provably convergent only in a short time period (Theorem 3.7 and Corollary 3.4.3). This does not mean that the H-model or the Ht-model is not accurate for nonlinear systems. It just means that the accuracy depends on the system, the quantity of interest, and the initial condition.
Accuracy of the Ht model in representing the conditional mean path in the Lorenz-63 system (96). It is seen that if r = 0.5 (first row), then the zeroth-order Ht-model, i.e., the t-model is accurate for long integration times. On the other hand, if we consider the chaotic regime at r = 28 (second row), then we see that the t-model and its high-order extension (Ht-model) are accurate only for relatively short time.
Accuracy of the Ht model in representing the conditional mean path in the Lorenz-63 system (96). It is seen that if r = 0.5 (first row), then the zeroth-order Ht-model, i.e., the t-model is accurate for long integration times. On the other hand, if we consider the chaotic regime at r = 28 (second row), then we see that the t-model and its high-order extension (Ht-model) are accurate only for relatively short time.
b. Modified Lorenz-96 system.
As an example of a high-dimensional nonlinear dynamical system, we consider the following modified Lorenz-96 system:22,26
where F is constant. As is well known, depending on the values of N and F, this system can exhibit a wide range of behaviors.22 Suppose we take the resolved variables to be . Correspondingly, the unresolved ones, i.e., those we aim at integrating through the MZ framework, are , which we set to be independent standard normal random variables. By using the mean field approximation (98), we obtain the following zeroth-order Ht-model (t-model) of the modified Lorenz-96 system (99):
In Fig. 7, we study the accuracy of the Ht-model in representing the conditional mean path with F = 5 and N = 100. It is seen that the the Ht-model converges only for short time (in agreement with the theoretical predictions), and it provides results that are more accurate than the classical t-model.
Accuracy of the Ht-model in representing the conditional mean path in the Lorenz-96 system (96). Here we set F = 5 and N = 100. It is seen that the Ht-model converges only for short time and provides results that are more accurate that the classical t-model.
Accuracy of the Ht-model in representing the conditional mean path in the Lorenz-96 system (96). Here we set F = 5 and N = 100. It is seen that the Ht-model converges only for short time and provides results that are more accurate that the classical t-model.
V. SUMMARY
In this paper, we developed a thorough mathematical analysis to deduce conditions for the accuracy and convergence of different approximations of the memory integral in the Mori-Zwanzig (MZ) equation. In particular, we studied the short memory approximation, the t-model, and various hierarchical memory approximation techniques. We also derived useful upper bounds for the MZ memory integral which allowed us to estimate a priori its contribution to the dynamics. Such upper bounds are easily computable for systems with finite-rank projections—such as Mori’s projection. Furthermore, if the system is Hamiltonian, then the MZ memory bounds we derived are tight. This can be attributed to the fact that the Koopman operator of a Hamiltonian dynamical system is always a contraction relative to the norm weighted by the Gibbs canonical distribution ρeq. In the more general case of dissipative systems, however, we do not have any guarantee that the MZ memory bounds we obtained are tight. Indeed, for dissipative systems, the numerical abscissa in Eq. (17) [see also Eqs. (21a) and (21b)] can be positive, yielding an exponential growth of the memory bounds. In the presence of infinite-rank projections—such as Chorin’s projection—we found that it is extremely difficult to obtain explicitly computable upper bounds for the MZ memory or any of its approximations discussed in this paper. The main difficulties are related to the lack of an explicitly computable upper bound for the operator semigroup , which involves the unbounded generator (see the discussion in Sec. III A and Appendix A). To demonstrate the accuracy of the approximation methods and error bounds we developed in this paper, we provided numerical examples involving Hamiltonian dynamical systems (linear and nonlinear) and more general dissipative nonlinear systems evolving from random initial states. The numerical results we obtained are found to be in agreement with our theoretical predictions.
ACKNOWLEDGMENTS
This work was supported by the Air Force Office of Scientific Research Grant No. FA9550-16-586-1-0092.
APPENDIX A: SEMIGROUP BOUNDS VIA FUNCTION DECOMPOSITION
In looking for the numerical abscissa14 (i.e., the logarithmic norm) of , we seek to bound
Notice that if , then so that we have the previously proven bound
In this section, we consider what happens when . To that end, let us note that may be decomposed as
where and are orthogonal to . In other words, we define as
and define α as
Then
Since we assume , there exists x such that and then for any α such that
we have
so that for ,
Now, fix any and consider the expression
where
Then, for this fixed ,
Differentiating with respect to a and setting equal to zero, we find that the latter expression is extremized when
i.e., when
Since β2 > 0, is maximized at . Then
so that
Therefore,
When is unbounded, which is the typical case when is an infinite-rank projection, such as most conditional expectations, there is unlikely to be a finite numerical abscissa for . In particular, notice that if divσ(F) is a bounded function (bounded both above and below), then is bounded for all while is unbounded, in which case
It follows Refs. 37 and 30 that in these cases, has an infinite slope at t = 0, and therefore there is no finite ω such that for all t ≥ 0 (see Ref. 14). Assuming still that generates a strongly continuous semigroup, we must look to bound the semigroup as , where
and
On the other hand, if is a bounded operator, for example, when is a finite-rank projection [e.g., Mori’s projection (12)], there exists a finite value for the numerical abscissa. Indeed, in this case, since ζ is bounded by ω = −inf divσ(F), the numerical abscissa of may be bounded as
Alternatively, in the case of finite rank , the operator may be thought of as a bounded perturbation of , i.e., , and the numerical abscissa of can be bounded using the bounded perturbation theorem Ref. 16, III.1.3, obtaining
Either of these bounds for can be used to bound the semigroup norm
Which of these two estimates gives the tighter bound will generally depend on the values of and . It may be noted, however, that, when σ is invariant, ω = 0 and is skew-adjoint so that
APPENDIX B: THE MORI-ZWANZIG FORMULATION IN PDF SPACE
It was shown in Ref. 15 that the Banach dual of (4) defines an evolution in the probability density function space. Specifically, the joint probability density function of the state vector u(t) that solves Eq. (1) is pushed forward by the Frobenius-Perron operator [Banach dual of the Koopman operator (3)],
where
By introducing a projection in the space of probability density functions and its complement , it is easy to show that the projected density satisfies the MZ equation42
In Appendixes B 1 and B 2, we perform an analysis of different types of approximations of the MZ memory integral
The main objective of such analysis is to establish rigorous error bounds for widely used approximation methods and also propose new provably convergent approximation schemes.
1. Analysis of the memory integral
In this section, we develop the analysis of the memory integral arising in the PDF formulation of the MZ equation. The starting point is the definition (B4). As before, we begin with the following estimate of upper bound estimation of the integral:
2. Hierarchical memory approximation in PDF space
The hierarchical memory approximation methods we discussed in Sec. III D can also be developed in the PDF space. To this end, let us first define
By repeatedly differentiating 0(t) with respect to time [assuming 0(t) smooth enough], we obtain the hierarchy of equations
where
By following closely the discussion in Sec. III D, we introduce the hierarchy of memory equations
and approximate the last term in such hierarchy in different ways. Specifically, we consider
Hereafter we establish the accuracy of the approximation schemes resulting from the substitution of each above into (B6).