We propose a route for parameterizing isotropic (generalized) Langevin [(G)LE] thermostats with the aim to correct the dynamics of coarse-grained (CG) models with pairwise conservative interactions. The approach is based on the Mori–Zwanzig formalism and derives the memory kernels from Q-projected time correlation functions. Bottom-up informed (GLE and LE) thermostats for a CG star-polymer melt are investigated, and it is demonstrated that the inclusion of memory in the CG simulation leads to predictions of polymer diffusion in quantitative agreement with fine-grained simulations. Interestingly, memory effects are observed in the diffusive regime. We demonstrate that previously neglected cross-correlations between the “irrelevant” and the CG degree of freedom are important and lie at the origin of shortcomings in previous CG simulations.

## I. INTRODUCTION

To reduce the numerical complexity in computer simulations of soft matter, coarse-grained (CG) models are commonly used. The systematic derivation of these models frequently targets at structural and thermodynamical properties based on procedures that map a high-dimensional space of atomistic degrees of freedom (DoFs) on a lower dimensional space of CG DoFs. The interaction sites (“beads”) of the resulting CG models usually correspond to the centers of mass of several atoms grouped together to represent, e.g., a small molecule, functional group, monomer, thermal blob, or entire macromolecule.^{1–8} Contrary to static properties, the dynamic properties of these CG models are usually poorly represented in standard molecular dynamics simulations.^{9–12} Despite progress made in recent years, challenges remain, in particular, in representing the dynamic properties of material-specific CG models on all relevant time scales.

The Hamiltonian dynamics of a fine-grained (FG) system can be represented with a CG model when non-Markovian interactions, represented by friction and noise terms, are introduced in the equation of motion (EoM). The Mori–Zwanzig theory^{13–15} provides an exact theoretical framework to link the FG and CG dynamics and has been explored in recent years with varying success.^{16–23} The difficulties encountered are usually caused by incomplete time scale separation of the relevant (CG) and the removed DoFs. Therefore, configuration-dependent frictions, e.g., pairwise, distance-dependent, and time-independent friction coefficients in dissipative particle dynamics (DPD) models, cannot readily be computed. A conceptually simpler approach involves the use of configuration-independent friction coefficients or memory kernels in a Langevin (LE) or a generalized Langevin (GLE) thermostat coupled to the CG DoFs.^{24,25}

Methods for constructing (G)LE thermostats, which do not rely on an *a posteriori* optimization scheme,^{25–27} have in common that FG trajectories are analyzed by separating forces into contributions from an assumed conservative force field, governing structure and thermodynamical properties, and fluctuating contributions that are assigned to frictional and random interactions.^{16–23} In early studies, Lei *et al.*^{24} using this approach attempted to correct the self-diffusivity of a liquid composed of CG fluid particles representing clusters of Lennard-Jones (LJ) particles. A Markovian Langevin thermostat was applied where the friction coefficient was parameterized based on the integral of the force auto-correlation function (FACF) of the fluctuating forces. This approach was derived based on the EoM derived by Kinjo and Hyodo^{28} by neglecting cross-correlations between different CG beads and subsequently assuming the Markovian behavior. The authors found that the self-diffusion coefficient was underestimated by a factor of 4 in the CG Langevin models, while DPD-type models performed significantly better for the same system,^{24} indicating that the mismatch was due to the missing configuration dependency in the frictional and random forces. In the following years, the same research group focused their efforts on improving the methodology to derive more accurate DPD-type models by, e.g., the introduction of memory effects.^{19–21} Yet, the magnitude of the mismatch in self-diffusion coefficients in the Langevin models is intriguing as one could reasonably assume that configuration dependencies would average out in isotropic systems, similar to the case of single-particle implicit solvent GLE models, which can capture the underlying FG dynamics exactly in the absence of conservative interactions.^{29} In principle, the parametrization of (generalized) Langevin models can be optimized iteratively until the fine-grained dynamics is matched.^{25–27} While this is a valid approach and widely used in the derivation of conservative interactions for CG models,^{1,4,30} such methods obscure the origin of the poor performance of bottom-up derived friction coefficients and memory kernels for LE and GLE models.

Herein, we will demonstrate that through the splitting of interactions into conservative and fluctuating forces on the CG level, frictional contributions arise, which were not accounted for in earlier studies. The calculation of these contributions from Q-projected time correlation functions provides a route to compute memory kernels with which dynamic consistency is achieved in GLE simulations. It furthermore provides improved insight into interactions that lead to memory effects in soft matter systems manifested on diffusive time scales.

We shall be interested in a GLE of the form

in which *F*(*t*) is the total force on a CG particle, *F*^{C}(*t*) is the conservative force corresponding to a CG force field, *m* is the particle mass, and *v*(*t*) is its velocity. One-dimensional descriptions are used throughout, which for isotropic systems trivially hold for all Cartesian dimensions. The random force $F\u0303R(t)$ and the memory kernel $K\u0303(t)$ are related through the fluctuation–dissipation theorem (FDT)^{15,31} and represent the GLE thermostat. The question of interest herein is how the microscopic interactions encode $K\u0303(t)$.

To this end, it is instructive to first consider the GLE for a single particle in an isotropic medium. Expressed in the Mori–Zwanzig formulation, the EoM is

in which *F*_{Q}(*t*) denotes the Q-projected total force on the particle.^{15,32} Its relation to *K*(*t*) is provided by the FDT according to

where *k*_{B} is the Boltzmann constant, *T* is temperature, and ⟨⋯⟩ denotes averaging over configurations of the microscopic (FG) system. To make the evaluation of Eq. (2) tractable in computer simulations, *F*_{Q}(*t*) has to be interpreted to be a random force *F*^{R}(*t*), in which case the EoM reads

The Q-projected total force in Eq. (3) can be written as $FQ(t)=FQC(t)+\delta FQ(t)$. The kernel can therefore be expressed as

A similar splitting of contributions to *K*(*t*) in terms of attractive and repulsive interactions in a LJ fluid was carried out by Carof *et al.*^{32} The above splitting of *F*_{Q}(*t*) in Eq. (2) generally leads to correlated Q-projected conservative and fluctuating forces, i.e., $FQC(t)\delta FQ(0)\u22600$. The *random* force, $F\u0303R(t)$, in Eq. (1) is often interpreted to stem from the same fluctuations that determine *δF*_{Q}(*t*).^{24} This however leads to the tacit assumption that $FQC(t)\delta FQ(0)=0$, and correspondingly, *K*_{X}(*t*) = 0, which can only be assumed if $FQC(t)$ and *δF*_{Q}(*t*) evolve on separated time scales.

The above analysis of the single-particle model has consequences for interpreting the memory kernel in Eq. (1). The CG conservative forces are explicit in Eq. (1) and, therefore, do not contribute to the friction represented by the kernel $K\u0303$. We may therefore approximate $K\u0303(t)$ as

if we assume that *K*_{X}(*t*) = 0. In the Markovian limit, the Q-projected forces may be replaced with the real forces (“Q-approximation”) so that

Based on these approximations, Lei *et al.*^{24} performed Langevin dynamics simulations of CG LJ clusters in which the applied friction coefficient *γ*_{FACF} was obtained from

In the general case, Eq. (5) however suggests that the kernel $K\u0303$ in Eq. (1) should be written as

It thus contains a contribution of Q-projected correlations between fluctuating forces, encoded by the term $K\delta (t)=\delta FQ(t)\delta FQ(0)/mkBT$, and a contribution corresponding to Q-projected cross-correlations of fluctuating and conservative forces, encoded by $KX(t)\u2261FQC(t)\delta FQ(0)/mkBT$. The remaining contribution $KC(t)=FQC(t)FQC(0)/mkBT$ encodes the friction and memory due to the conservative interactions *F*^{C}(*t*). The Q-projected time correlation functions are herein computed on the basis of the backward orthogonal dynamics method proposed by Carof *et al.*^{32} while using the second order integration scheme proposed by Jung *et al.*^{26} for numerical evaluation.

Alternatively, different memory kernels can be defined by multiplying Eq. (1) or Eq. (4) with *v*(0) and subsequent averaging, leading to Volterra equations. These equations can be numerically inverted to derive memory kernels from time correlation functions of mapped FG trajectories. In particular, we define the memory kernels *K*^{V}(*t*), $K\u0303V(t)$, and $KCV(t)$ linked to the force–velocity correlation functions $F(t)v(0)$, $\delta F(t)v(0)$, and $FC(t)v(0)$, respectively. The Volterra inversion approach is discussed in more detail in the supplementary material. In Table I, definitions of memory kernels both in terms of the backward orthogonal dynamics method and the Volterra inversion method are listed, as are their interrelations. Memory kernels derived from the inversion of Volterra equations are denoted by the superscript *V*. We note that a Volterra inversion applied to Eq. (4) provides *K*^{V}(*t*) = *K*(*t*).^{29,33–36} Solving the Volterra equation corresponding to Eq. (1) however yields $K\u0303V(t)=K\delta (t)+KX(t)\u2260K\u0303(t)$. This will be demonstrated numerically in Sec. II A and discussed analytically in the supplementary material.

Backward orthogonal dynamics . | Volterra equations . | Auxiliary relations . |
---|---|---|

$K(t)=FQ(t)FQ(0)/mkBT$ | $F(t)v(0)=\u2212m\u222b0tdsKV(t\u2212s)v(s)v(0)$ | K_{δ,X}(t) = K_{δ}(t) + K_{X}(t) |

$KC(t)=FQC(t)FQC(0)/mkBT$ | $\delta F(t)v(0)=\u2212m\u222b0tdsK\u0303V(t\u2212s)v(s)v(0)$ | $K\u0303(t)=K\delta (t)+2KX(t)$ |

$K\delta (t)=\delta FQ(t)\delta FQ(0)/mkBT$ | $FC(t)v(0)=\u2212m\u222b0tdsKCV(t\u2212s)v(s)v(0)$ | K_{C,X}(t) = K_{C}(t) + K_{X}(t) |

$KX(t)=FQC(t)\delta FQ(0)/mkBT$ | K^{V}(t) = K(t) | |

$K\u0303V(t)=K\delta ,X(t)$ | ||

$KCV(t)=KC,X(t)$ |

Backward orthogonal dynamics . | Volterra equations . | Auxiliary relations . |
---|---|---|

$K(t)=FQ(t)FQ(0)/mkBT$ | $F(t)v(0)=\u2212m\u222b0tdsKV(t\u2212s)v(s)v(0)$ | K_{δ,X}(t) = K_{δ}(t) + K_{X}(t) |

$KC(t)=FQC(t)FQC(0)/mkBT$ | $\delta F(t)v(0)=\u2212m\u222b0tdsK\u0303V(t\u2212s)v(s)v(0)$ | $K\u0303(t)=K\delta (t)+2KX(t)$ |

$K\delta (t)=\delta FQ(t)\delta FQ(0)/mkBT$ | $FC(t)v(0)=\u2212m\u222b0tdsKCV(t\u2212s)v(s)v(0)$ | K_{C,X}(t) = K_{C}(t) + K_{X}(t) |

$KX(t)=FQC(t)\delta FQ(0)/mkBT$ | K^{V}(t) = K(t) | |

$K\u0303V(t)=K\delta ,X(t)$ | ||

$KCV(t)=KC,X(t)$ |

## II. RESULTS

As a test case for the proposed methods, we consider a generic star-polymer melt system, where every star polymer in the microscopic (FG) description consists of one central bead with ten arms and three beads per arm, following Wang *et al.*^{25} We use the iterative Boltzmann inversion (IBI) method for the derivation of the conservative CG model in which every star polymer is represented as a single interaction site. A visualization of a single star-polymer in its FG and CG representation is shown in Fig. 1. Further details can be found in the supplementary material.

### A. Memory kernels

We first demonstrate the equivalence of the derived memory kernels based on Volterra equations and the backward orthogonal dynamics method. In particular, we want to validate that the cross contribution *K*_{X}(*t*) contributes equivalently to $KCV(t)$ and $K\u0303V(t)$ obtained by a Volterra equation approach (see the supplementary material). The memory kernels obtained by these methods are compared in Fig. 2. Except for small numerical deviations, we find that *K*^{V}(*t*) = *K*(*t*), $KCV(t)=KC,X(t)$, and $K\u0303V(t)=K\delta ,X(t)$ holds.

Figure 3 shows all memory kernels evaluated based on the backward orthogonal dynamics method and their respective integrals Γ(*t*). It can be seen that on small time scales, the total memory kernel *K*(*t*) is mainly governed by the fast DoFs, i.e., *K*_{δ}(*t*). On longer time scales, the conservative CG interactions, i.e., *K*_{C}(*t*), dominate. The contributions from cross-correlations, i.e., *K*_{X}(*t*), overall are small in amplitude. Yet, over long time scales, the mainly negative contributions accumulate, giving rise to an effective decrease in the total friction. A similar behavior was found for the cross-correlations of long-ranged and short-ranged interactions in LJ fluids.^{32}

For the parameterization of the GLE-thermostat models, we fit the integrals of the memory kernels *K*(*t*), *K*_{δ}(*t*), *K*_{δ,X}(*t*), and $K\u0303(t)$ with six damped oscillators, respectively (see the supplementary material for details). We chose to fit the integrals rather than the memory kernels to account for the accumulative nature of memory effects, which effectively govern the qualitative behavior. Furthermore, we want to emphasize the long time dynamics. In Fig. 4, the fitted functions are overlayed as dotted lines. For *K*(*t*), *K*_{δ}(*t*), and *K*_{δ,X}(*t*), the fitted functions well reproduce the integral of the memory kernels up to a time of *t* = 50. For $K\u0303(t)$, we chose to fit the function only up to *t* ≈ 20. The reason is that the cross contributions *K*_{X}(*t*) show a long-lived negative tail, leading to divergence in the plateau of the integral. All memory kernels that can be derived from Volterra equations seem not to suffer from this problem.

### B. Langevin dynamics simulations

For the parameterization of Markovian LE models, we chose to derive the friction coefficient from the plateau value of the fit functions. By this, the effective friction coefficient employed in the Markovian LE models is consistent with the memory kernels of the non-Markovian GLE models discussed in Sec. II C, which allows for an analysis of the relevance of memory effects. We will below refer to the different Markovian LE models as “*γ*-models” and adjust the notation for “*γ*” according to the corresponding kernels [i.e., *γ*-model ↔ *K*, *γ*_{δ}-model ↔ *K*_{δ}, *γ*_{δ,X}-model ↔ *K*_{δ,X}, $\gamma \u0303$-model $\u2194K\u0303$, and *γ*_{FACF}-model $\u2194\delta F(t)\delta F(0)/mkBT$].

In Fig. 5, the mean square displacement (MSD), scaled by $16t$ is plotted in the linear regime and compared to the FG results. The scaled MSD is constant for the shown time scale for all models and represents the self-diffusion coefficient. As expected, the *γ*-model, for which all interactions are encoded in *γ* obtained by integrating *K*(*t*), accurately reproduces the self-diffusion coefficient of the FG system and can be seen as a validation of the method for calculating the memory kernels. For the remaining models, the LE thermostat is used in conjunction with CG conservative interactions based on the IBI method. For the *γ*_{FACF}-model [Eq. (8)], we find that the self-diffusion coefficient is underestimated by ≈25%. This is in line with the findings of Lei *et al.*,^{24} who found an even larger deviation of approximately a factor of 4 for LJ clusters. The finding that LE models based on the FACF of the fluctuating forces *δF*(*t*) show significantly slower dynamics appears to be quite general. Comparing the result for the *γ*_{FACF}-model with the *γ*_{δ}-model shows that the main error source for the deviation does not lay in the *Q*-approximation as the *γ*_{δ}-model predicts even lower self-diffusion coefficients. The *γ*_{δ,X}-model, in which the friction coefficient is identical to the one obtained by integrating the kernel $K\u0303V(t)$ obtained from the Volterra equation (cf. Table I), improves the result significantly but still leads to an underestimation of the self-diffusion coefficient. The $\gamma \u0303$-model further enhances the dynamics compared to the *γ*_{δ,X}-model but is found to overshoot, effectively leading to an overestimation of the self-diffusion coefficient by a similar margin as the *γ*_{δ,X}-model shows an underestimation of the self-diffusion coefficient.

The very fast self-diffusion of the $\gamma \u0303$-model can be understood in terms of memory effects, which are neglected in Markovian models. Generally, when applying a non-Markovian GLE thermostat, the effects on the dynamics depend on the density of states governed by the conservative interactions.^{37,38} In a simple model system, this feature of GLE thermostats was nicely demonstrated by Kappler *et al.* by applying a GLE thermostat to a particle in a one-dimensional double well potential.^{39,40} They have shown that the application of an exponential memory kernel, which decays on much longer time scales than the typical time scale of fluctuations within one minimum of the double well, leads to a slow down of the barrier crossing dynamics. In particular, the introduction of a second exponential term in the memory kernel has shown that the effects of the shorter lived memory kernel dominate.^{40} This underlines the feature that dynamic modes in the system more strongly couple to a memory kernel that relaxes on comparable time scales.

Differences in the self-diffusion coefficients obtained from Markovian and non-Markovian models parameterized based on the same memory kernel provide indications for memory effects in the long time dynamics. The preceding arguments applied to the CG model used in this work can be used to explain the role of memory. The IBI-model is effectively purely repulsive. As we consider a dense star-polymer melt, the overall dynamics is dominated by short-lived collisions of the beads with their nearest neighbors. The positive amplitudes of the memory kernel on short time scales couple strongly to the dominant dynamic modes. The introduction of the cross contributions *K*_{X}(*t*) in the memory kernel, as demonstrated in Figs. 3 and 4, introduces negative amplitudes on long time scales, which should mostly decouple from the short-lived fluctuations due to collisions. Including contributions on all time scales indiscriminately in the derivation of the friction coefficient of the LE-thermostat leads to an increase in the diffusion coefficient compared to non-Markovian models, which correctly account for the frequency or time scale dependence of dissipative interactions. We will validate this interpretation of the data in Sec. II C.

### C. Generalized Langevin dynamics simulations

The dynamics of the non-Markovian models is evaluated based on the velocity auto-correlation function (VACF) and compared to the FG model in Fig. 6. We do not show the non-Markovian counterpart of the *γ*_{FACF}-model for which the functional form of the memory kernel is assumed not to be physically meaningful due to a lack of time scale separation. The model without conservative interaction (*K*, blue line) reproduces the FG VACF and its integral $D(t)=\u222b0tdsv(s)v(0)$ on all time scales and is shown here to validate the method for the calculation and fitting of the memory kernel *K*(*t*). For the non-Markovian models, we again find that the models not including the cross-correlation contributions [*K*_{δ}(*t*)] or including it only once [*K*_{δ,X}(*t*)] lead to an underestimation of the self-diffusion coefficient. The self-diffusion coefficient is well matched for $K\u0303(t)$. With this kernel, the VACF is also better matched in terms of its amplitudes on small time scales compared to *K*_{δ}(*t*) and *K*_{δ,X}(*t*). The origin of the remaining deviations from the FG VACF can be attributed to errors in the conservative interactions and is discussed in the supplementary material.

A comparison of the derived self-diffusion coefficients between the Markovian and non-Markovian models allows us to assess the importance of memory effects in the models at hand. Most evidently, we find that the $K\u0303(t)$-model shows a significantly slower self-diffusion than the Markovian $\gamma \u0303$-model. The same holds true to a lesser extent when comparing the *K*_{δ,x}(*t*)-model and the *γ*_{δ,x}-model, while the self-diffusion coefficients for the *K*_{δ}(*t*)-model and the *γ*_{δ}-model are comparable. This finding is exactly in line with the discussion presented in Sec. II B.

## III. CONCLUSION

In this work, we have presented a new method for the parameterization of LE thermostats and GLE thermostats for dynamic consistency in CG models. Recently, it has been demonstrated that GLE-thermostat models can be parameterized by means of optimization schemes,^{25,26} which ensure that the dynamical properties such as the VACF of CG beads are reproduced. We instead aimed to illuminate the main origins of the deficiencies in early attempts^{24} of constructing Langevin thermostat based CG models based on the Mori–Zwanzig theory. We have found that cross-correlations of CG and FG degrees of freedom must be considered to consistently model friction and memory effects on a CG level. In non-Markovian models, inclusion of the cross-correlated contribution to the memory leads to improved representability of the dynamics, also on short time scales. Neglecting the cross-correlated contribution to the memory kernel leads to dynamic inconsistencies of the CG model and fully accounts for the very slow dynamics predicted with methods solely based on the correlations of fluctuating forces.

Additionally, the comparison of Markovian and non-Markovian models has shown that, in the CG system studied in this work, memory effects manifest on the diffusive time scale. This has shown to be only the case for memory kernels, which exhibit a significant negative contribution on large time scales. In the supplementary material, we demonstrate that this memory effect is inverted for positive long-lived memory kernels and is less pronounced in dilute (implicit solvent) CG systems.

For the proposed approach to perform optimally, in principle, the conservative interactions would have to represent the exact multi-body potential of mean force. Deviations can lead to a discrepancy of the friction caused by the CG conservative interactions in sampling governed by the CG force field compared to the sampling of the microscopic (FG) system (see the supplementary material). Generally, the loosest requirement to have a fully dynamically consistent model, independent of the choice of how to model friction and random interactions, is an exact representation of the multi-body potential of the mean force on the CG level.^{11,41} This is, of course, not generally feasible, and compromises have to be made. While an iterative optimization technique^{25,26} can be used to optimally reproduce a single measure such as the VACF, this procedure does not guarantee that all conceivable dynamical properties are reproduced. This is related to the representability problem of CG conservative interactions,^{42–44} where, e.g., the IBI method only guarantees the reproduction of the radial distribution function (RDF) but is known to, e.g., overestimate the pressure, underestimate isothermal compressibilities,^{45,46} and does not guarantee to represent higher-order correlations.

In cases in which the exact reproduction of the VACF is required, the memory kernel obtained with the method proposed in this work can be used as an optimal initial guess for the memory kernel in iterative reconstruction methods, which have shown to need up to hundreds of iterations to reach convergence, when the initial guess is far from the optimal solution.^{26}

While in the past much effort has been put into the development of new methods for parameterizing CG methods with conserved dynamical properties,^{11} predictive modeling of the dynamic properties of the material-specific soft matter system still poses challenges.^{12} For non-Markovian models, the form of Eq. (1) is computationally the least demanding in both derivation and implementation with a Mori–Zwanzig based CG modeling approach. These types of models are not as rigorous as formulations including configuration-dependent memory kernels,^{27} but the ease of their implementation, in principle, allows for application to more complex problems. For future endeavors within this field of research, it is desirable to extend the efforts to more complex systems such that predictive capabilities of the existing approaches can be tested.

## SUPPLEMENTARY MATERIAL

See the supplementary material for more details on the setup of the simulations and the derivation of the coarse-grained models and also for complementary data and discussions on memory effects, the origin of remaining inaccuracies in the coarse-grained models, and the Markovian, high mass limit for coarse-grained generalized Langevin models.

## ACKNOWLEDGMENTS

Financial support for this work was granted by the Deutsche Forschungsgemeinschaft (DFG) via SFB TRR 146 (Grant No. 233630050, Project A2). The authors thank Swaminath Bharadwaj, Marvin P. Bernhardt, Madhusmita Tripathy, and Friederike Schmid for inspiring discussions.

## DATA AVAILABILITY

The data that support the findings of this study are available within the article and its supplementary material. Input parameters for simulations, analysis scripts, and raw data of the shown figures are available from the corresponding author upon reasonable request.