We propose a new coarse-grained (CG) molecular simulation technique based on the Mori–Zwanzig (MZ) formalism along with the iterative Boltzmann inversion (IBI). Non-Markovian dissipative particle dynamics (NMDPD) taking into account memory effects is derived in a pairwise interaction form from the MZ-guided generalized Langevin equation. It is based on the introduction of auxiliary variables that allow for the replacement of a non-Markovian equation with a Markovian one in a higher dimensional space. We demonstrate that the NMDPD model exploiting MZ-guided memory kernels can successfully reproduce the dynamic properties such as the mean square displacement and velocity autocorrelation function of a Lennard–Jones system, as long as the memory kernels are appropriately evaluated based on the Volterra integral equation using the force-velocity and velocity-velocity correlations. Furthermore, we find that the IBI correction of a pair CG potential significantly improves the representation of *static* properties characterized by a radial distribution function and pressure, while it has little influence on the dynamic processes. Our findings suggest that combining the advantages of both the MZ formalism and IBI leads to an accurate representation of both the static and dynamic properties of microscopic systems that exhibit non-Markovian behavior.

## I. INTRODUCTION

Soft matter and biological systems intrinsically involve hierarchical phenomena that are governed by interactions and processes on a wide range of spatiotemporal scales.^{1,2} Therefore, it is of great importance to elucidate static and dynamic properties at each scale and seamlessly link them towards a better understanding of such multiscale phenomena. Molecular dynamics (MD)^{3,4} is a powerful tool for analyzing these properties in a microscopic system where the constitutive equation for a continuum description is unknown. However, all-atom MD simulations of complex systems are computationally expensive because they explicitly deal with fine-grained atomistic motions according to Newton’s law, and hence they are typically impractical for the application to mesoscale phenomena. Coarse-grained (CG) molecular simulations have been often employed to bridge the scales from a microscopic level to a mesoscopic one.^{5,6} In CG simulations, a single CG particle represents multiple atoms or molecules, allowing for tracing mesoscopic trajectories at significantly larger spatiotemporal scales compared to all-atom MD. Obviously, the qualities of static and dynamic properties of a CG system are determined by the description of the CG force field and its fluctuations.

In general, there are two methods for the construction of CG models. The first strategy is the so-called reverse coarse-graining, where a CG force field is determined by tuning functional forms or parameters so that target properties such as radial distribution functions (RDFs), pressure, isothermal compressibility, and mean force acting on centers-of-mass (COMs) of molecules are reproduced as accurately as possible. Indeed, most CG models that have been proposed thus far fall into this category. Among such reverse coarse-graining techniques are, for instance, the force-matching method,^{7,8} iterative Boltzmann inversion (IBI),^{9,10} reverse Monte Carlo,^{11,12} stochastic parameter optimization using Bayesian inference,^{13,14} and minimization of relative entropy.^{15,16} Certainly, these CG models can accurately represent the target properties, but the qualities of other properties are not generally guaranteed. In particular, an effective CG potential alone cannot yield correct *dynamic* properties such as diffusivity, viscosity, or, more generally, time correlation functions, unless the dynamic properties themselves are chosen as target properties. More specifically, a coarse-grained molecular dynamics (CGMD) simulation generally gives rise to faster diffusion dynamics compared to MD^{17,18} since the effective CG potential is not accompanied by friction and fluctuating forces responsible for dynamic properties, which should be taken into account to compensate for the lost degrees-of-freedom (DOFs) as a consequence of coarse-graining. By contrast, in the second strategy—referred to as forward coarse-graining—a CG force field is directly constructed from microscopic trajectories of an underlying fine-grained system based on the Mori–Zwanzig (MZ) formalism,^{19–30} inherently involving friction and fluctuating terms. In the forward coarse-graining, once relevant variables are chosen, the generalized Langevin equation (GLE) to which they are subject can be straightforwardly derived using the projection operator method while retaining an explicit relation to the microscopic dynamical variables. Therefore, there are no free parameters to be tuned in principle, although a few approximations are required for practical implementation.

Recently, we have proposed non-Markovian dissipative particle dynamics (NMDPD) models,^{25,29} which are derived from the MZ-guided GLE on the assumption that forces acting on CG particles are pairwise decomposable and that many-body correlations between different pairs of CG particles are negligible. The GLE-based NMDPD with pairwise interactions allows for an accurate representation of hydrodynamics because it rigorously satisfies mass and momentum conservation. Moreover, in contrast to the conventional DPD based on the Markovian approximation,^{31–33} the NMDPD models explicitly incorporate memory effects, corresponding to the extension of the conventional DPD. In fact, we found that a typical time scale of CG particle motion is comparable to that of memory kernels in dense liquid systems, thus necessitating the incorporation of the memory effects into CG modeling.^{25,29} Subsequently, we demonstrated that incorporating the memory effects based on the MZ formalism improves the dynamic properties remarkably. More recently, Li *et al.*^{30} have proposed MZ-guided NMDPD employing auxiliary variables, which enable the replacement of a non-Markovian equation with a Markovian one in a higher dimensional space. The technique of introducing auxiliary variables has also been investigated by other researchers to address the complexity in computing the memory term in the GLE.^{34–36} Unlike the original NMDPD which needs to store historical variables to explicitly compute memory kernels, the NMDPD employing auxiliary variables yields a convolution-free Markovian equation, leading to orders of magnitude less memory usage and faster computation in terms of numerical simulation.

The MZ-guided modeling of memory kernels is indeed a powerful choice for the faithful representations of dynamic properties including time correlation functions. However, static properties such as a RDF and pressure, which are predominantly determined by an effective CG potential, may not be generated as accurately compared to the dynamic ones since our NMDPD model utilizes a pair CG potential obtained from a direct ensemble average of instantaneous forces between two clusters composed of microscopic particles.^{24,25} Indeed, the pair CG potentials obtained in that manner fail to yield RDFs of corresponding MD systems where Lennard–Jones (LJ) clusters are greatly overlapping, implying non-negligible many-body effects.^{25} To tackle this issue, we propose a new coarse-graining strategy that combines forward and reverse coarse-graining methods. More specifically, we demonstrate that incorporating memory effects based on the MZ-guided GLE, followed by the IBI correction of a CG potential, is an effective technique for constructing a CG model that can successfully represent both static and dynamic properties of an underlying microscopic system. In addition, we compare the memory kernels constructed from the Volterra integral equation using force-velocity and velocity-velocity correlations with their zeroth-order approximate counterparts, which are also referred to as a *Q*-approximation.^{26,27,37}

The remainder of this paper is organized as follows. In Sec. II, we detail the MZ-guided NMDPD employing auxiliary variables followed by a brief review on the IBI method. Section III describes a reference MD system to be coarse-grained, where the COM positions and momenta of each LJ cluster are regarded as relevant variables. Here, we present in detail the procedure for extracting a pair CG potential and memory kernels together with their zeroth-order approximate forms from the MD simulation. In Sec. IV, we describe the method for implementation of the NMDPD models exploiting auxiliary variables based on the CG potential and memory kernels. Subsequently, we evaluate the NMDPD models by comparing static and dynamic properties with those of the MD system. Finally, we discuss the effect of the IBI correction of the CG potential on the static and dynamic properties. Concluding remarks are presented in Sec. V.

## II. THEORY

### A. Generalized Langevin equation derived from the Mori–Zwanzig formalism

The MZ projection operator method^{19,20,38} provides us a systematic way for extracting relevant dynamical variables from a very high DOF system. Let us consider a microscopic system composed of *N* interacting particles. The microscopic particles are grouped into *K* clusters, and the COM positions and momenta of each cluster {**R**_{I}, **P**_{I}; *I* = 1, …, *K*} are regarded as the relevant variables. By introducing the projection operator $P$ onto the subspace spanned by the relevant variables and the orthogonal counterpart $Q=(1\u2212P)$, the equation of motion (EOM) for the COMs equivalent to CG particles can be derived in the form of the GLE^{21,23–25,28–30}

where *β* = (*k*_{B}*T*)^{−1} with *k*_{B} as the Boltzmann constant and *T* as the thermodynamic temperature, **R** = {**R**_{I}; *I* = 1, …, *K*} is a phase point of the CG configuration space, and *M*_{I} is the mass of CG particle *I*; also, *ω*(**R**) is a normalized partition function associated with all the microscopic configurations at phase point **R**. For a detailed definition of the projector operator $P$, see Eq. (17) in Ref. 21. Each term of the right-hand side of Eq. (1) corresponds to mean force, friction force, and time-correlated fluctuating force, respectively. The friction and fluctuating forces intrinsically appear in Eq. (1) to compensate for the lost DOFs as a consequence of coarse-graining. The friction force involves the time convolutions of the memory kernels $\beta \delta FIQt\delta FJQ0T$ and the momenta of CG particles, satisfying the second fluctuation-dissipation theorem (FDT).^{21,39} More generally, nonlinear terms appear in the memory when the stochastic force is not fully decoupled from the past CG coordinates, resulting in position- and momentum-dependent memory functions.^{26} However, it is very demanding to compute the nonlinear terms for realistic systems. In a practical sense, the linear approximation of the memory in Eq. (1) is considered to be reasonable unless the system is far from equilibrium.

Although GLE (1) accurately represents the trajectories in the CG phase space, direct computation of Eq. (1) is very demanding even for one-dimensional harmonic chain systems^{40,41} since the mean force depends on all the positions of CG particles, while the friction force depends on all the momenta of CG particles, thus requiring the evaluation of multi-body interactions. For practical implementation, we employ two approximations. The first approximation is a pairwise decomposition, where the force acting on CG particle *I* is given in a pairwise additive form $FIt\u2248\u2211J\u2260IFIJt$ with $FIJt$ as the instantaneous force acting from CG particle *J* to *I*. The second approximation is to neglect many-body correlations between different pairs $\delta FIJQt\delta FIXQ0TX\u2260J\u22480$. These two approximations render Eq. (1) a GLE in a pairwise additive form corresponding to NMDPD^{25,29}

where $FIJ$ is an ensemble average of **F**_{IJ}, $\delta FIJQt$ is a pairwise fluctuating force acting from CG particle *J* to *I*, $KIJt=\beta \delta FIJQt\delta FIJQ0T$ is a pairwise memory kernel satisfying the second FDT, and $VIJ=VI\u2212VJ$ is a relative velocity with **V**_{I} as the velocity of CG particle *I*. If **K**_{IJ}(*t*) reduces to $2\gamma IJ\delta t$ in Eq. (2), where $\gamma IJ\u2261\u222b0\u221eKIJtdt$, then the EOM for Markovian DPD is recovered.

### B. Non-Markovian dissipative particle dynamics incorporating auxiliary variables

To implement the EOM for NMDPD (2), we introduce the intermediate variables defined for each pair of CG particles d**P**_{IJ}, which represent the increments of momenta contributed from CG particle *J* to *I*.^{30} Accordingly, the time variation of **P**_{I} is given by

We note that the pairwise intermediate variables d**P**_{IJ} must be calculated for each pair *I* and *J*, namely, d**P**_{IJ}/d*t* ≠ d**P**_{I}/d*t* − d**P**_{J}/d*t*. Then, instead of explicitly evaluating the non-Markovian equation (3b),^{25,29,42} we introduce auxiliary variables **s**_{IJ} for each pair *I* and *J*, which are linearly coupled to **P**_{IJ} and themselves. Consider the following stochastic differential equations with respect to **P**_{IJ} and **s**_{IJ}:^{30,34}

where $Asp\u2009=\u2009\u2212ApsT$ and $\xi IJ$ is a vector of uncorrelated Gaussian random numbers with each element $\xi IJi$ satisfying $\xi IJit\u2009=\u20090$ and $\xi IJit\xi KLj0\u2009=\u2009\delta ij\delta IK\delta JL+\delta IL\delta JK\xd7\delta t$. The matrices $A\u22610,AspT,Aps,AssT$ and $B\u2009\u2261\u2009diag0,Bs$ are termed *drift* and *diffusion* matrices. Assuming that non-Markovian dynamics has finite memory, we can safely take $sIJ\u2212\u221e=0$ and solve Eq. (4) with respect to **s**_{IJ},

Substituting Eq. (5) into Eq. (4) allows for the elimination of **s**_{IJ} in the resultant equation for **P**_{IJ},^{30,34}

The lower limit of the integration in Eq. (3b) can also be taken as −∞ since the pairwise memory kernel is finite, i.e., $KIJt\u2032t\u2032>t=0$. Therefore, Eq. (6) corresponds to Eq. (3b), meaning that solving Eq. (3b) in a non-Markovian form is equivalent to solving Eq. (4) in a Markovian form. The second FDT reads

for the Markovian dynamics given by Eq. (4).^{30,34} Equation (7) can be seen as an extension of the standard second FDT for free-particle Langevin dynamics, i.e., *b*^{2} = 2*ak*_{B}*T* with *a* being the friction coefficient and *b* being the intensity of random force. Practically, **B**_{s} can be obtained through the Cholesky factorization^{43} by designing $Ass+AssT$ to be positive-definite or positive-semidefinite. This constraint also ensures that the stochastic process is consistent with the second law of thermodynamics since the Fourier transform of the memory kernel is positive for all real frequencies.^{44} It has been reported that this modest loss of generality does not significantly affect the accuracy or flexibility of the representation of the memory kernel.^{34} Furthermore, when **A**_{ss} is a real matrix and has complex eigenvalues with positive real parts, $K\u0303IJt$ can be expressed as a linear combination of exponentially damped oscillators.^{30,34}

### C. Iterative Boltzmann inversion

In contrast to the MZ formalism, the IBI is a structure-based inverse coarse-graining method for obtaining a CG pair potential that reproduces the RDF of a reference system.^{9} The simple Boltzmann inversion yields

where *R* is the separation between CG particles, $grefR$ is the target reference RDF, and *U*_{PMF}(*R*) is the potential of mean force (PMF). Unfortunately, this PMF cannot be directly adopted as a pair potential for a CG model since it inherently involves multibody contribution from all the particles in the system.^{45} Therefore, the iterative update of the pair potential is performed in the standard IBI as

where the subscript *i* denotes the iteration number. The RDF after the *i*th update *g*_{i}(*R*) is usually obtained through a CGMD simulation exploiting the pair potential $UiCGR$. This iterative update is performed until the converged pair potential that reproduces *g*_{ref}(*R*) is attained. Typically, *U*_{PMF}(*R*) given by Eq. (8) is adopted as an initial guess of the pair potential $U0CGR$, although in the present study, we utilize a MZ-guided pair potential as an initial guess as described in Sec. IV C. In addition, the practical implementation of the IBI requires smoothing and extrapolation of the potential to provide continuous forces.

In principle, the Henderson theorem^{46} guarantees a one-to-one correspondence between the target RDF *g*_{ref}(*R*) and CG pair potential. However, in practical molecular simulations, the cutoff distance for pairwise interactions breaks down the uniqueness of the CG pair potential. More specifically, different cutoff distances lead to different CG pair potentials, all of which yield identical RDFs but different pressures. In the present study, we choose the cutoff distance obtained from the MZ-guided pair potential because it ensures the accurate representations of both the RDF and pressure.^{47}

## III. MOLECULAR DYNAMICS SYSTEM TO BE COARSE-GRAINED

### A. Reference Lennard–Jones system

Based on the MZ formalism, we extract the pair potential and memory kernels from a LJ system, where 10 LJ particles are grouped into a single cluster and the COM positions and momenta of each cluster {**R**_{I}, **P**_{I}; *I* = 1, …, *K*} are regarded as CG variables (Fig. 1). The system size is 158 × 158 × 158 Å^{3} with periodic boundary conditions in all the directions. The system consists of 8000 clusters and thus 80 000 LJ particles. The interatomic potential is given by the LJ (12-6) form

where *r* is the interatomic distance, *σ* = 3.4 Å, and *ε* = 1.67 × 10^{−21} J corresponding to argon atoms with the mass of 6.63 × 10^{−26} kg. The cutoff radius is 5*σ*. Each LJ cluster is subject to the following constraint condition:^{24,25,48}

where *N*_{m} = 10 is the number of LJ particles per cluster, **r**_{Ii} is the position of LJ particle *i* in cluster *I*, and *R*_{g} = 4.985 Å is the constant radius of gyration. The constraint condition (11) prevents extreme expansion of constituent LJ particles, thus allowing for the evaluation of intercluster forces. Here, *R*_{g} is determined such that the inner number density, defined by 3*N*_{m}/(4π*R*_{g}^{3}), matches the number density of the system. The MD simulation is performed in a microcanonical ensemble with a 2-fs time step. The average temperature of the system is *T* = 121 K.

### B. Extraction of the pair potential

The instantaneous force **F**_{IJ} acting from cluster *J* to *I* is evaluated by summing the interatomic forces between the LJ particles in both clusters, and it is decomposed into two components that are parallel and orthogonal to **R**_{IJ} = **R**_{I} − **R**_{J}. Then, the distance-dependent conservative force is obtained by averaging the parallel component of **F**_{IJ} as

where $eIJ=RIJ/RIJ$ with $RIJ=RIJ$, $FIJCRIJ$ is the distance-dependent magnitude of the conservative force, and *V*_{CG}(*R*_{IJ}) is the pair CG potential. The pair potential shown in Fig. 2 has a significantly soft repulsion core and yields a finite value even around *R*_{IJ} ≈ 0, although there are larger statistical errors for smaller *R*_{IJ}, where it is more improbable to find cluster pairs. Accordingly, the pair potential is fitted by the linear combination of cubic spline functions as

where *R*_{c} is the cutoff distance of the smoothing function and *n* is a dimensionless parameter. Equation (13b) with *n* = 1/4 reduces to the smoothing function commonly used in smoothed particle hydrodynamics (SPH).^{49,50} Unlike the cubic spline function used in SPH, where the first derivative of the function is necessarily zero, Eq. (13) allows for the extrapolation of the pair potential, whose first derivative is a finite negative value at *R*_{IJ} = 0, as shown in Fig. 2.

### C. Extraction of the memory kernels

Let us define a deviation of the instantaneous force from the mean value as $\delta FIJt\u2261FIJt\u2212FIJ$. Then, Eq. (2) yields^{29,30}

By multiplying both sides with $VIJT0$ and taking an ensemble average, Eq. (14) can be further reformulated into a form of the Volterra integral equation of the first kind,

Here, we note that $\delta FIJQtVIJT0$ vanishes because $\delta FIJQt$ originates from the orthogonal dynamics,^{19,21} and thus the correlation with the relevant variable $VIJT0$ is zero. For simplicity, we define $AIJt\u2261\delta FIJtVIJT0$ and $BIJt\u2261VIJtVIJT0$, leading to

For practical implementation, it is useful to decompose each function in Eq. (16) into two components that are parallel and perpendicular to **e**_{IJ}. *δ***F**_{IJ}, $\delta FIJQ$, and **V**_{IJ} can be rewritten as

where the superscripts “$||$” and “$\u22a5$” denote parallel and perpendicular components. Therefore, **A**_{IJ}, **B**_{IJ}, and **K**_{IJ} can be approximated by

where the scalar quantities are given by

Note that the orthogonal scalar quantities (23b), (24b), and (25b) involve factors of 1/2; this can be easily confirmed by taking the traces of both sides of, for instance, $AIJ\u22a5t1\u2212eIJeIJT=\delta FIJ\u22a5tVIJ\u22a50T$. Substitution of Eqs. (20)–(22) into Eq. (16) yields

Both the force-velocity correlations $AIJ||,\u22a5t$ and velocity-velocity correlations $BIJ||,\u22a5t$ can be directly evaluated from the MD simulation, as shown in Fig. 3. We note that these correlations are dependent on the inter-cluster separation as well as the time, and representative correlations at *R*_{IJ} = 9.5 Å are displayed in Fig. 3. Subsequently, the memory kernels $KIJ||,\u22a5t$ can be obtained through the deconvolutions of Eq. (26), as shown in Fig. 4. The velocity autocorrelation function (VACF) of the cluster COMs is also displayed, indicating that its time scale is comparable with that of the memory kernels and hence the Markovian approximation is not applicable to the present system. In addition, we compute an approximate form of **K**_{IJ}(*t*) given by

where

Since the force deviation is related to the fluctuating force as $\delta FIQt=exp\u2212QiLt\delta FI$ with *L* as the Liouville operator and *Q* as the orthogonal projection operator,^{21}^{,}**C**_{IJ}(*t*) can be seen as a zeroth-order approximation of **K**_{IJ}(*t*),^{26,29} i.e., $exp\u2212QiLt\u2248exp\u2212iLt$, and it has been adopted as a memory kernel in some studies.^{24,25,28,51} However, Fig. 4 indicates the clear difference between **K**_{IJ}(*t*) and **C**_{IJ}(*t*) in contrast to a MD system of star polymer melts,^{29} where there is no significant difference between them and their time integrals. The validity of the approximation $KIJt\u2248CIJt$ will be confirmed in Sec. IV B.

The distance-dependent friction coefficients in the radial and perpendicular directions are defined by

where *t*_{c} is the upper limit of the time integration. Basically, a self-diffusion coefficient which characterizes long-time dynamics at the hydrodynamic spatiotemporal scale is mainly determined by the friction coefficients, while short-time dynamics originating from molecular interaction significantly depends on the functional forms of the memory kernels as well as their integrals. Figure 5 shows the friction coefficients in the radial and perpendicular directions computed from $KIJ||,\u22a5RIJ;t$ and $CIJ||,\u22a5RIJ;t$ with *t*_{c} = 1 ps. Again, there are remarkable differences between the friction coefficients obtained from $KIJ||,\u22a5$ and $CIJ||,\u22a5$. The friction coefficients are fitted by

where *R*_{p} = 10 Å, *s*_{3} = 5, and the other parameters (Λ, *s*_{1}, *R*_{f1}, *η*, *s*_{2}, *R*_{f2}, and *χ*) for each case are listed in Table I. $\gamma \u22a5RIJ$ based on $KIJ\u22a5t$ can be well fitted by Eq. (30b) alone. The function *γ*_{tail}(*R*_{IJ}) is responsible for the algebraically decaying tails of the friction coefficients (see the insets of Fig. 5), which would slightly influence the diffusivity of CG particles.

Memory kernel . | γ (kg/s)
. | Λ (10^{−12} kg/s)
. | s_{1}
. | R_{f1} (Å)
. | η (10^{−11} kg/s)
. | s_{2}
. | R_{f2} (Å)
. | χ
. |
---|---|---|---|---|---|---|---|---|

$KIJ||,\u22a5t$ | $\gamma ||RIJ$ | 3.168 | 2.8 | 16.7 | 5 | 20 | 19 | 2 × 10^{3} |

$\gamma \u22a5RIJ$ | 3.271 | 4.8 | 18.8 | … | … | … | … | |

$CIJ||,\u2009\u22a5\u2009t$ | $\gamma ||RIJ$ | 1.690 | 2.2 | 16.6 | 6 | 20 | 19 | 3 × 10^{3} |

$\gamma \u22a5RIJ$ | 1.699 | 2.4 | 15.6 | 2 | 21 | 18 | 3 × 10^{3} |

Memory kernel . | γ (kg/s)
. | Λ (10^{−12} kg/s)
. | s_{1}
. | R_{f1} (Å)
. | η (10^{−11} kg/s)
. | s_{2}
. | R_{f2} (Å)
. | χ
. |
---|---|---|---|---|---|---|---|---|

$KIJ||,\u22a5t$ | $\gamma ||RIJ$ | 3.168 | 2.8 | 16.7 | 5 | 20 | 19 | 2 × 10^{3} |

$\gamma \u22a5RIJ$ | 3.271 | 4.8 | 18.8 | … | … | … | … | |

$CIJ||,\u2009\u22a5\u2009t$ | $\gamma ||RIJ$ | 1.690 | 2.2 | 16.6 | 6 | 20 | 19 | 3 × 10^{3} |

$\gamma \u22a5RIJ$ | 1.699 | 2.4 | 15.6 | 2 | 21 | 18 | 3 × 10^{3} |

## IV. NON-MARKOVIAN DISSIPATIVE PARTICLE DYNAMICS SIMULATIONS

### A. Implementation of non-Markovian dissipative particle dynamics with auxiliary variables

To take into account the friction and fluctuating forces orthogonal to **R**_{IJ}, we define the local coordinates (**e**_{IJ}, $eIJ\u22a51$, and $eIJ\u22a52$) assigned to each pair of CG particles as depicted in Fig. 6. Here, the unit vectors $eIJ\u22a51$ and $eIJ\u22a52$ are orthogonal to each other and $eIJ$. Then, the EOM (3a) can be rewritten as

where $dPIJ||$, $dPIJ\u22a51$, and $dPIJ\u22a52$ represent the increments of momenta along **e**_{IJ}, $eIJ\u22a51$, and $eIJ\u22a52$, respectively. By substituting Eq. (22) into Eq. (3b), we arrive at

Here, $VIJ\u22a51=VIJ\u22c5eIJ\u22a51eIJ\u22a51$ and $VIJ\u22a52=VIJ\u22c5eIJ\u22a52eIJ\u22a52$, while $\delta FIJQ,\u22a51=\delta FIJQ\u22c5eIJ\u22a51eIJ\u22a51$ and $\delta FIJQ,\u22a52=\delta FIJQ\u22c5eIJ\u22a52eIJ\u22a52$.

For each Eqs. (32a)–(32c), we consider the corresponding Markovian equations with auxiliary variables as shown in Eq. (4), necessitating the construction of the drift and diffusion matrices. The memory kernels $KIJ||RIJ;t$ and $KIJ\u22a5RIJ;t$ are divided into distance-dependent and time-dependent parts as

where $\theta IJ||0=\theta IJ\u22a50=1$. The time-dependent parts are approximated by a linear combination of exponentially damped oscillators given as^{30}

Equation (34) with *n*_{t} = 4 and the fitting parameters listed in Table II faithfully represent the time-dependent parts of the memory kernels, as shown in Fig. 7(a). The distance-dependent parts $\Phi IJ||,\u22a5RIJ$ are obtained through Eqs. (29) and (33) as

which ensures accurate representations of the integral values of the memory kernels. Then, by fitting Eq. (6b) to the memory kernels using the Nelder–Mead method,^{52} the corresponding elements in the drift matrices in Eq. (4) are obtained as

where $\lambda =\Phi IJ||RIJ$ and $\mu =\Phi IJ\u22a5RIJ$. The method for constructing drift and diffusion matrices from the fitting functions is detailed in a previously published paper.^{30} Practically, care must be taken for $K\u0303IJ=KIJ/M$ with *M* being the mass of a CG particle as indicated by Eqs. (3b) and (6a), meaning that $Aps||,\u22a5$ must be replaced by $Aps||,\u22a5/M$ in accordance with Eq. (6a). Once the matrices $Ass||,\u22a5$ are obtained, then the elements in the diffusion matrices $Bs||,\u22a5$ are automatically determined through the second FDT (7).

. | Index i
. | κ_{i}
. | τ_{i} (fs)
. | ω_{i} (ps^{−1})
. | φ_{i}
. |
---|---|---|---|---|---|

$\theta IJ||t$ | 1 | 0.4136 | 197.4 | 7.831 | 0.5741 |

2 | 0.4536 | 77.58 | 17.03 | 0.6478 | |

3 | 0.1672 | 177.7 | 12.98 | 0.4092 | |

4 | 0.2021 | 412.8 | 2.345 | 0.8017 | |

$\theta IJ\u22a5t$ | 1 | 0.8312 | 127.4 | 11.37 | 0.6040 |

2 | 0.1276 | 100.7 | 23.11 | 0.4059 | |

3 | 5.134 × 10^{−4} | 247.5 | 87.78 | 4.600 × 10^{−2} | |

4 | 0.2386 | 215.0 | 6.741 | 0.6040 |

. | Index i
. | κ_{i}
. | τ_{i} (fs)
. | ω_{i} (ps^{−1})
. | φ_{i}
. |
---|---|---|---|---|---|

$\theta IJ||t$ | 1 | 0.4136 | 197.4 | 7.831 | 0.5741 |

2 | 0.4536 | 77.58 | 17.03 | 0.6478 | |

3 | 0.1672 | 177.7 | 12.98 | 0.4092 | |

4 | 0.2021 | 412.8 | 2.345 | 0.8017 | |

$\theta IJ\u22a5t$ | 1 | 0.8312 | 127.4 | 11.37 | 0.6040 |

2 | 0.1276 | 100.7 | 23.11 | 0.4059 | |

3 | 5.134 × 10^{−4} | 247.5 | 87.78 | 4.600 × 10^{−2} | |

4 | 0.2386 | 215.0 | 6.741 | 0.6040 |

Likewise, when the memory kernel **K**_{IJ}(*t*) is approximated by **C**_{IJ}(*t*), their time-dependent parts are fitted by Eq. (34) with *n*_{t} = 3 and the fitting parameters listed in Table III, as displayed in Fig. 7(b). Here, we neglect physically unnatural offsets in the long-time region of the temporal parts of the original data, which may be due to the time variations in inter-cluster relative positions.^{25} Subsequently, the elements in the drift matrices in Eq. (4) are given by

. | Index i
. | κ_{i}
. | τ_{i} (fs)
. | ω_{i} (ps^{−1})
. | φ_{i}
. |
---|---|---|---|---|---|

$\theta IJ||t$ | 1 | 0.1200 | 454.8 | 1.281 | 1.043 |

2 | 0.3424 | 100.4 | 17.57 | 0.5158 | |

3 | 0.7544 | 167.1 | 10.10 | 0.5352 | |

$\theta IJ\u22a5t$ | 1 | 0.2293 | 373.0 | 0.9893 | 1.217 |

2 | 0.8910 | 78.05 | 13.04 | 0.7768 | |

3 | 0.3235 | 246.4 | 10.59 | 0.3660 |

. | Index i
. | κ_{i}
. | τ_{i} (fs)
. | ω_{i} (ps^{−1})
. | φ_{i}
. |
---|---|---|---|---|---|

$\theta IJ||t$ | 1 | 0.1200 | 454.8 | 1.281 | 1.043 |

2 | 0.3424 | 100.4 | 17.57 | 0.5158 | |

3 | 0.7544 | 167.1 | 10.10 | 0.5352 | |

$\theta IJ\u22a5t$ | 1 | 0.2293 | 373.0 | 0.9893 | 1.217 |

2 | 0.8910 | 78.05 | 13.04 | 0.7768 | |

3 | 0.3235 | 246.4 | 10.59 | 0.3660 |

We perform the NMDPD simulations augmented with the auxiliary variables corresponding to **K**_{IJ} and **C**_{IJ}. The systems consist of 8000 CG particles in the periodic boxes of 158 × 158 × 158 Å^{3} in accordance with the MD system. The generalized Verlet scheme^{53} is employed for the numerical integration with a time step of Δ*t* = 10 fs. The local coordinates assigned to each pair of CG particles are rotated at each time step.^{25} More specifically, $eIJ\u22a51t+\Delta t$ and $eIJ\u22a52t+\Delta t$ are updated by rotating $eIJ\u22a51t$ and $eIJ\u22a52t$ around $eIJt\xd7eIJt+\Delta t$, which is uniquely determined from **R**_{IJ}.

### B. Evaluation of coarse-grained models

Figure 8 shows the dynamic properties in the MD, *K*_{IJ}-based NMDPD, and *C*_{IJ}-based NMDPD systems. The *K*_{IJ}-based NMDPD model can successfully reproduce the VACF of the MD system, indicating an accurate representation of the short-time dynamics. In contrast, the *C*_{IJ}-based NMDPD model fails to reproduce the VACF of the MD system, involving a noticeable undershoot after the exponential decay. This undershoot may be partly attributed to the undershoot of the memory kernel in the radial direction $CIJ||t$, as displayed in Fig. 4(a). Moreover, the mean square displacement (MSD) characterizing the long-time dynamics is accurately reproduced by the *K*_{IJ}-based NMDPD model as shown in Fig. 8(b), while the *C*_{IJ}-based NMDPD model underestimates the long-time diffusivity. Specifically, the self-diffusion coefficient in the *K*_{IJ}-based NMDPD system is 4.20 × 10^{−10} m^{2}/s, in excellent agreement with that in the MD system 4.23 × 10^{−10} m^{2}/s, although that in the *C*_{IJ}-based NMDPD system is 3.23 × 10^{−10} m^{2}/s. These results indicate that the NMDPD model based on the exact memory kernels **K**_{IJ} can accurately reproduce both the short- and long-time dynamics, while the approximate memory kernels **C**_{IJ} are inappropriate for the present liquid system.

Figure 9 shows the RDFs in the MD, *K*_{IJ}-based NMDPD, and *C*_{IJ}-based NMDPD systems. Both the CG models provide identical RDFs because static properties of the CG system depend primarily on the conservative force field.^{7} The RDFs in the CG systems differ significantly from that in the MD system. In addition, the pressures in the CG systems are 15.3 MPa, which is significantly smaller than that in the MD system, 46.4 MPa. These discrepancies imply that the many-body effect among CG particles should be effectively incorporated into the pair potential to accurately reproduce the static properties in the dense liquid system. As a matter of fact, the RDF in the MD system indicates significant overlapping among the clusters, highlighting the non-negligible many-body effect.

### C. Evaluation of a coarse-grained model exploiting the optimal pair potential

To improve CG representations of the static properties, the pair CG potential is modified through the IBI described in Sec. II C. The MZ-guided pair potential shown in Fig. 2 is adopted as an initial guess, and the cutoff distance is set to be 22.32 Å according to that of the MZ-guided potential. The IBI procedure is implemented in the VOTCA package,^{54} where a series of CGMD simulations with a 2-fs time step is performed using GROMACS.^{55} In each iteration, the system is initially equilibrated for 60 ps followed by a production run of 440 ps.

Figure 10(a) depicts the original and modified pair potentials, where the modified one has a softer and longer-range repulsion core than the original one. The resultant RDF obtained from a *K*_{IJ}-based NMDPD simulation employing the modified potential is displayed in Fig. 10(b), being an accurate representation of the MD counterpart. Moreover, the pressure in the *K*_{IJ}-IBI-based NMDPD system is 46.3 MPa, showing excellent agreement with the MD pressure, 46.4 MPa. Most importantly, the dynamics properties characterized by the VACF and MSD are little influenced by the IBI correction of the pair potential, as shown in Figs. 10(c) and 10(d). Actually, the self-diffusion coefficient in the *K*_{IJ}-IBI-based NMDPD system is 4.82 × 10^{−10} m^{2}/s, still in good agreement with the MD value 4.23 × 10^{−10} m^{2}/s, although it is slightly larger than that of the *K*_{IJ}-based NMDPD model 4.20 × 10^{−10} m^{2}/s because of less caging effects of the modified potential. These results indicate that the dynamic properties are primarily determined by the memory kernels, with negligible dependence on the form of the pair CG potential. Therefore, the MZ-guided construction of memory kernels with the subsequent IBI correction of a pair CG potential is a powerful technique to accurately represent both the static *and* dynamic properties of microscopic systems.

Although the present study has investigated moderate coarse-graining of 10 LJ particles into a single cluster, the same methodology can be applied, in principle, to more aggressive coarse-graining, i.e., 100 LJ particles in a cluster. However, in that case, the many-body effect would be more pronounced^{27,47} and it would be difficult to separate the memory effect from the many-body effect. Hence, we plan a future study where we can systematically investigate the interplay of these two important effects in aggressively coarse-grained systems.

## V. CONCLUSIONS

We have proposed a non-Markovian coarse-graining method, which takes advantage of the Mori–Zwanzig (MZ) formalism for extracting memory kernels together with the iterative Boltzmann inversion (IBI) for improving the representation of static properties. We have demonstrated that the MZ-guided non-Markovian dissipative particle dynamics model exploiting auxiliary variables (allowing for the replacement of a non-Markovian equation with a Markovian one in a higher dimensional space) can successfully reproduce the dynamic properties such as the velocity autocorrelation function and mean square displacement of the reference Lennard–Jones system, as long as the pairwise memory kernels are appropriately evaluated. More specifically, we have found that the memory kernels should be constructed based on the Volterra integral equation using force-velocity and velocity-velocity correlations, rather than based on the time-correlations of instantaneous force deviations from the mean force, which correspond to a zeroth-order approximation of the exact memory kernels. Furthermore, the IBI correction of the pair potential ensures the accurate representation of the radial distribution function and pressure, while it has little influence on the dynamic properties that depend primarily on the memory kernels. The present study indicates that constructing memory kernels based on the MZ formalism followed by the IBI correction of a pair potential is a promising method for obtaining coarse-grained models that can successfully represent both the static and dynamic properties of underlying microscopic systems.

## ACKNOWLEDGMENTS

This work was partly supported by JSPS KAKENHI Grant No. 16K18009. This work was also supported by the U.S. Army Research Laboratory and was accomplished under Cooperative Agreement No. W911NF-12-2-0023.