The development of dynamically consistent coarse-grained models for molecular simulations is often based on generalized Langevin equations, motivated by the application of the projection operator formalism (Mori–Zwanzig theory). While Mori’s projection operator yields linear generalized Langevin equations that can be computationally efficiently implemented in numerical simulations, the downside is that Mori’s generalized Langevin equation does not encompass the multi-body potential of mean force required to correctly encode structural and thermodynamic properties in coarse-grained many-body systems. Zwanzig’s projection operator yields nonlinear generalized Langevin equations including the multi-body potential of mean force, while the remaining force contributions are not as cheap to implement in molecular simulation without making it formally hard to justify approximations. For many-particle coarse-grained models, due to computational and conceptual simplicity, an often used approach is to combine nonlinear conservative interactions with linear expressions to model dissipation. In a previous study [V. Klippenstein and N. F. A. van der Vegt, J. Chem. Phys. **154**, 191102 (2021)], we proposed a method to parameterize such models to achieve dynamic consistency in coarse-grained models, allowing us to reconcile Mori’s and Zwanzig’s approach for practical purposes. In the current study, by applying the same strategy, we develop coarse-grained implicit solvent models for the continuous Asakura–Oosawa model, which under certain conditions allows us to develop very accurate coarse-grained potentials. By developing coarse-grained models for different reference systems with varying parameters, we test the broader applicability of the proposed procedure and demonstrate the relevance of accurate coarse-grained potentials in bottom-up derived dissipative models. We study how different system parameters affect the dynamic representability of the coarse-grained models. In particular, we find that the quality of the coarse-grained potential is crucial to correctly model the backscattering effect due to collisions on the coarse-grained scale. As hydrodynamic interactions are not explicitly modeled in the presented coarse-graining approach, deviations are observed in the long-time dynamics. The Asakura–Oosawa model allows for the tuning of system parameters to gain an improved understanding of this limitation. We also propose three new iterative optimization schemes to fine-tune the generalized Langevin thermostat to exactly match the reference velocity-autocorrelation function.

## I. INTRODUCTION

In recent years, there has been an increasing interest in developing methods to systematically derive coarse-grained (CG) models for molecular simulations that reproduce dynamic properties.^{1–3} Several systematic approaches have been proposed to derive CG force fields in a systematic way,^{4–6} which (ideally) allow one to describe specific fine-grained (FG) reference systems. Applying CG force fields in standard equilibrium molecular dynamics (MD) simulations allows us to reproduce structural and thermodynamic properties of the FG reference system, if the CG force field represents the multi-body potential of mean force (MB-PMF) sufficiently well.^{7} Independent of the quality of the CG force field, standard MD methods, however, do not correctly represent dynamical properties, as coarse-graining leads to a reduction in the number of degrees of freedom (DoFs) and to a corresponding reduction in friction and noise.^{1}

With an eye for representing both structure and dynamics, the coarse-graining process can be split up into two separate tasks. The first task is to derive the conservative interactions in the CG model and construct a potential, *U*_{CG}(** R**), that closely approximates the equilibrium probability distribution in the space of CG configurational variables

**,**

*R*^{7}

Here, *W*(** R**) is the MB-PMF. The quantity $M\u0302R(r)$ represents a mapping operator, which defines the CG configurational DoFs $R=R1,\u2026,RN$ and is usually a linear function of FG (i.e., atomic) Cartesian coordinates

**= {**

*r*

*r*_{1}, …,

*r*_{n}} with

*n*>

*N*. Multiple points in the FG configurational space spanned by

**map onto a single point in the CG space spanned by**

*r***. The FG potential energy is denoted by**

*R**U*

_{FG}(

**), and**

*r**k*

_{B}is the Boltzmann constant and

*T*the temperature. Bottom-up coarse-graining methods implicitly or explicitly aim to construct a CG potential that best approximates

*W*(

**). For nonbonded interactions, the CG potential, due to computational and conceptional simplicity, is usually modeled in terms of two-body potentials,**

*R*Here, *U*_{IJ}(*R*_{IJ}) is a CG pair-potential for a pair of CG particles *I* and *J* and *R*_{IJ} is the distance between particles *I* and *J*. The force evaluation for standard CG-MD simulations is then

where $e\u0302IJ$ is the unit vector parallel to the distance vector between particle *I* and *J*.

The second task is to account for the effects of the FG DoFs removed by the coarse-graining process on the dynamics of the remaining CG DoFs. To this end, the total force, *F*_{I}, on the CG particle *I* is augmented with a dissipative and random force according to

The superscripts *C*, *D*, and *R* denote conservative, dissipative, and random forces, respectively. In practice, Eq. (4) can be realized by augmenting standard MD algorithms with a dissipative thermostat that simultaneously controls the temperature and adjusts the dynamics. To this end, different types of models have been proposed, such as Markovian Langevin^{8,9} or Markovian dissipative particle dynamics (DPD) models^{9–12} and non-Markovian isotropic generalized Langevin,^{13,14} non-Markovian DPD,^{15–17} or non-isotropic generalized Langevin models.^{18} The strategies for the parameterization of the dissipative and random forces are typically based on either an iterative optimization strategy^{13,18–20} (comparable to iterative Boltzmann inversion (IBI),^{21} inverse Monte Carlo (IMC),^{22} hypernetted-chain Newton (HNCN)^{23} methods in structural coarse-graining) or the direct reconstruction of friction coefficients or memory kernels from time-correlation functions.^{8,9,14–17}

The phenomenologically intuitive form of Eq. (4) is often motivated by the Mori–Zwanzig theory,^{3,24,25} which formalizes the coarse-graining process in terms of the projection operator formalism. The mathematical form the terms in Eq. (4) formally take depends on the choice of the projection operator.^{3} While the projection operator of Mori leads to a CG equation in which all terms are linear in the CG DoFs, applying Zwanzig’s projection operator leads to a CG EoM in which both the terms $FIC$ and $FID$ can be nonlinear in the CG DoFs.^{3}

In fact, Zwanzigs projection operator allows us to derive an EoM in which the conservative force $FIC$ is the negative gradient of the MB-PMF.^{26} This makes it a good candidate to use in conjunction with established coarse-graining methods, but simultaneously, the nonlinearity of the dissipative term, in general, complicates the justification of computationally tractable models.^{26}

Mori’s projection operator allows us to derive a linear generalized Langevin equation (GLE) for which the dissipative and random forces can be efficiently implemented in molecular dynamics simulations as a non-Markovian GLE thermostat.^{27,28} This comes with the drawback that the conservative forces in Mori’s GLE are not, in general, equivalent to the negative gradient of the MB-PMF and, thus, structural and thermodynamic properties generally cannot be modeled purely based on this approach.

From a modeling perspective, a combination of both approaches would be optimal. In this study, we parameterize models of the form

where *P*_{I}(*t*) is the momentum of the CG particle *I* and $K\u0303(t)$ is an isotropic, configuration independent memory kernel, which is related to the random force $F\u0303IR(t)$ via the fluctuation–dissipation relation

where *M*_{I} is the mass associated with the *I*th CG DoF.

In Ref. 14, we proposed a method to parameterize $K\u0303(t)$, which is motivated based on the single-particle GLE,

which can be rigorously derived using Mori’s linear projection operator for a freely diffusing particle in an isotropic environment. We keep the subscript *I*, as we will use Eq. (7) to motivate the parameterization of many-body models following Eq. (5), while *de facto* Eq. (7) arises as CG EoM for a single tagged particle. $FIQ(t)$ is called $Q$-projected force, which is propagated in time in the subspace that is orthogonal to the CG momentum at the time origin *P*_{I}(0) and is related to the memory kernel *K*(*t*) via

Equation (7) is an exact CG EoM and the memory kernel *K*(*t*) can be readily obtained from a reference trajectory of a single tagged particle.^{29–35} The $Q$-projected force in Eq. (7) is often interpreted as random force, which allows us to construct a single-particle CG model reproducing, e.g., the velocity autocorrelation function (VACF) of the Refs. 31 and 34.

In general, models of the form of Eq. (5) with a nonlinear conservative force term and linear friction can only under certain conditions be derived exactly from microscopic dynamics,^{26,36} while the validity of assuming these conditions can in principle be numerically assessed as recently demonstrated by Ayaz *et al.*^{36}

In a recent study, we have proposed a link between the exact EoM given by Eqs. (7) and (5), which allows us to parameterize $K\u0303(t)$ such that the single-particle dynamic properties are well reproduced in many-body CG simulations following Eq. (5), which we have shown for a star-polymer melt system.^{14}

The proposed parameterization scheme is based a method called backward-orthogonal dynamics (BOD), introduced by Carof *et al.*,^{29} which allows us to directly evaluate $Q$-projected correlation functions. This method can be used to determine the effective single-particle memory kernel *K*(*t*) by analyzing the trajectory of a tagged particle. The crucial difference to other methods is that the BOD-method allows us to derive $Q$-projected correlation functions for any observable. For our purpose, we analyze a mapped FG (e.g., atomistic) trajectory, i.e., we apply $M\u0302R(r)$ to a FG trajectory. Hence, we track the evolution of the CG DoFs in a many-body FG system, in which we split the total force on any CG DoF *I* as $FI(t)=FIC(t)+\delta FI(t)$. Here, *F*_{I}(*t*) is the total force on the CG DoF *I* in the mapped FG reference trajectory, $FIC(t)$ is the force a priorly defined CG potential would yield for the same CG configurations, and *δ**F*_{I}(*t*) is the residual force, not captured by the CG potential. For each force contribution, and for their cross correlation, a $Q$-projected correlation function can be calculated following the BOD-method. This allows us to split the total memory kernel in Eq. (8) into different contributions, as shown in the following equation:

In the CG target model, Eq. (5), the conservative forces will effectively exert friction on any particle *I*, in addition to the friction due to the GLE thermostat. This has to be taken into account in the parameterization of $K\u0303(t)$. Equation (9) suggests a relation between *K*_{C}(*t*) and the conservative interactions. Assuming that the additional friction and memory effects due to the conservative interactions in the multi-body CG simulations are well estimated by *K*_{C}(*t*), we suggest^{14} that

should yield an optimal parameterization for dynamic consistency. One requirement for this approach to work optimally is that the CG model samples the same distribution $\rho (R)\u221de\u2212W(R)/kBT$ as the FG reference, as only then the overall force fluctuations can be consistent.

We applied the sketched approach to coarse-grain a star-polymer melt system and were able to reproduce the overall mobility and the form of the velocity autocorrelation function (VACF) reasonably well.^{14} Still, some deviations were observed, especially around the first minimum of the VACF. We argued that the residual deviations of the VACF in our CG model compared to the reference were mainly due to the fact that the conservative forces in the CG model did not optimally represent the MB-PMF, as higher order correlations cannot be accounted for in CG force fields based on pairwise potentials. This is the main hypothesis we want to test in the current work, for which we need a CG particle based model, with an accurate representation of the MB-PMF in terms of pairwise interactions.

To this end, we herein study the (continuous) Asakura–Oosawa^{37–40} (AO) model for which we parameterize a GLE thermostat for the CG representation. The AO model by itself is a generic CG representation that models a colloidal suspension in a polymer solution. Both colloids and polymers are modeled as purely repulsive single beads. Due to entropic effects, the colloid beads effectively attract each other. It was shown that these effective interactions can be exactly described in terms of pairwise interactions if the size ratio of polymer beads to colloid beads is small enough.^{37,39} For this case, an implicit solvent model, only resolving the colloid beads, can be derived for which the MB-PMF can be exactly reproduced. By changing the polymer–colloid bead size ratio, the quality of the CG model can be readily tuned and our main hypothesis can be tested.

The remainder of this paper is structured as follows: Sec. II A describes some routes to evaluate the effective single-particle memory kernel based on the inversion of Volterra equations. Section II B sketches the main ideas behind the BOD-method.^{29} In Sec. II C, we elaborate in more detail the choice for the parameterization of $K\u0303(t)$. In Sec. II D, we propose three different iterative optimization schemes to optimize the parameterization of GLE thermostats to correct for residual deviations in the VACF from its reference. In Sec. II E, we provide a summary of the general coarse-graining procedure. In Sec. III, we summarize how we model the FG and CG continuous AO model. In Sec. IV, we report the main results. In particular, we compare dynamic properties of the FG reference and CG GLE models for three different AO models with different polymer–colloid bead size ratios. We also demonstrate the performance of the newly proposed iterative optimization schemes for memory kernels. In Sec. V, we discuss our findings and its implications and conclude with a short summary and outlook in Sec. VI.

## II. THEORY

### A. Evaluation of the memory kernel through inversion of Volterra equations

In order to evaluate the memory kernel in the exact single-particle EoM given by Eq. (7), we multiply both sides of the equation with *P*_{I}(0) and subsequently take the canonical average. Due to the definition of the projection operator,

and, thus,

or, equivalently,

where we have defined the time-correlation functions $CFP(t)=13FI(t)PI(0)$, $CPP(t)=13PI(t)PI(0)$, $CFV(t)=13FI(t)VI(0)$, and the VACF $CVV(t)=13VI(t)VI(0)$. These correlation functions can be directly evaluated from molecular dynamics trajectories and, thus, allow one to evaluate *K*(*t*) through inversion of Eq. (13). This can be done by discretization.^{33,34} In our previous study,^{14} we exploited the convolution theorem to recover *K*(*t*) in the Fourier space.

Generally, the schemes used to derive *K*(*t*) from Eq. (13) tend to be numerically unstable. This can be improved by integrating Eq. (13),^{35}

Here, *G*(*t*) is the integrated memory kernel,

where *G*_{i} ≡ *G*(*iδt*) and *C*_{VV,i} ≡ *C*_{VV}(*iδt*), with *δt* being the discretization step of the VACF. To further increase the numerical stability, a predictor–corrector scheme is employed as follows:

where the superscript * indicates values calculated in the predictor step and Eq. (17) is the corrector step. In addition to improved numerical stability, this scheme has the advantage that it can be used to evaluate the memory kernel from trajectories in which the total force is not stored, which is often the case in standard MD software when stochastic thermostats are used. We apply it to derive reference memory kernels, against which the results of the BOD-method can be compared and to derive effective single-particle memory kernels for CG GLE trajectories.

### B. Direct evaluation of projected time-correlation function: The backward-orthogonal dynamics method

Two alternative routes to evaluate memory kernels based on the GLE in Eq. (7) were proposed by Carof *et al.*,^{29} called the forward- and backward-orthogonal dynamics (BOD) methods. As in Ref. 14, we apply the BOD-method in this study, which is summarized in the supplementary material. The main idea behind the BOD-method is to express $Q$-projected correlation functions in terms of properties that are easily accessible form FG MD trajectories. Accordingly, arbitrary projected correlation functions of the form

can be evaluated.

By choosing *A* = *B* = *F*_{I}, the memory kernel *K*(*t*) can be evaluated as

While the BOD-method and the Volterra-inversion approach described in Sec. II A yield the same memory kernel,^{14,29,31} the BOD-method can be used to evaluate an arbitrary $Q$-projected correlation function.^{29} In particular, by choosing $A=B=FIC$, $A=B=\delta FI=FI\u2212FIC$, or $A=FIC$ and *B* = *δ**F*_{I}, the total memory kernel can be split into different contributions [Eq. (9)] as described in the Introduction.

The three contributions to *K*(*t*), namely, *K*_{C}(*t*), *K*_{δ}(*t*), and *K*_{X}(*t*), can be interpreted as the memory and friction due to effective conservative interactions between the CG DoFs, due to interactions with the lost DoFs, and due to the cross-correlations between these interactions.

### C. Definition of dynamic consistency

In this work, we define dynamic consistency in terms of the reproduction of single-particle correlation functions. In practice, we can map the dynamics of any single particle *I* onto a GLE of the form of Eq. (7), where the memory kernel *K*(*t*) determines all other time-correlation functions. By replacing $FIQ(t)$ with a random force $FIR(t)$, which fulfills the fluctuation–dissipation theorem, single-particle CG simulations, in which, e.g., the VACF is correctly reproduced, can be readily carried out.^{31,34} The EoM for such a model is

with $3MIkBTK(t)=FIR(t)FIR(0)$. Equivalently, the memory kernel *K*(*t*) can always be represented by a sum of *n* functions with *n* independent random forces,

where every random force term has to fulfill its own fluctuation–dissipation relation $3MkBTKi(t)=FI,iR(t)FI,iR(0)$. As long as $K(t)=\u2211i=1nKi(t)$, Eqs. (20) and (21) are equivalent in the sense that they yield the same single-particle time-correlation functions, such as the VACF.

Analogously, we interpret the target model, Eq. (5), as a many-particle model in which the dynamics of any particle *I* is governed by the memory and friction due to the sum of two contributions, those due to the conservative interactions and due to the memory kernel $K\u0303(t)$. Considering such a CG simulation based on Eq. (5), the trajectory of any particle in the system can be mapped onto a GLE of the form of Eq. (7), and an effective single-particle memory kernel *K*^{CG}(*t*) can be derived from the Volterra equation

[or its integrated form as in Eq. (14)], where the superscript *CG* denotes evaluation from a CG trajectory. From Eq. (5), we can also derive

Inversion of Eq. (24) allows us to evaluate $\Delta KCG(t)=KCG(t)\u2212K\u0303(t)$, which is the memory and friction effectively added through the conservative interaction $FIC(t)$. The question now is how to parameterize $K\u0303(t)$ such that *K*^{CG}(*t*) ≈ *K*(*t*). Here, the BOD-method described in Sec. II B can be exploited.

The physical interpretation that Δ*K*^{CG}(*t*) is due to conservative interactions suggests

if *K*^{CG}(*t*) ≈ *K*(*t*). Because of the additivity of memory kernels implied by Eq. (21), this furthermore suggests that

should be an optimal *a priori* choice for the parameterization of the target model, which we will validate in Sec. IV.

One might be tempted to replace the CG correlation functions in Eq. (23) with correlation functions from reference simulations to derive a memory kernel for the target model. This approach was used in the past to parameterize a Markovian Langevin thermostat.^{8} The assumption in the derivation of Eq. (23) is $FIR(t)PI(0)=0$, similar to the assumption in Eq. (11), and holds true for the GLE thermostat by construction. Contrary to the single-particle GLE, the target EoM Eq. (5) from which Eq. (23) is derived is not equivalent to a formally exact CG EoM^{26} and, thus, the link between the Volterra equation Eq. (23) and the FG dynamics, without further justification,^{36} is at least ambiguous.

### D. Iterative optimization

Similar to structural coarse-gaining,^{21,22} memory kernels in GLE models can be iteratively optimized to match certain target properties, where the VACF is the most prominent target to this day.^{13,31} As the VACF is a single-particle property, it can always be related to an effective (integrated) single-particle memory kernel following Eq. (14). For an iterative optimization scheme, *G*(*t*) can be directly used as target. The rationale is that the relation between the optimal memory kernel $K\u0303(t)$, or its integrated form $G\u0303(t)=\u222b0tdsK\u0303(s)$, and the effective integrated single-particle memory kernel *G*^{CG}(*t*) is less complex and, thus, straightforward, hopefully stable, iterative schemes, can be established. Here, we want to present three different update schemes for the task. First, we notice that in a CG simulation,

Assuming that the residual between the integrated target memory kernel *G*^{tgt}(*t*) ≡ *G*(*t*) and the CG integrated effective single-particle memory kernel *G*^{CG}(*t*), both evaluated based on the VACF using Eq. (14), can be compensated by a change in $G\u0303(t)$ suggests the update scheme

In the limit where conservative interactions are negligibly weak, we have

and, thus, Eq. (28) would converge within a single step. When conservative interactions play a significant role in the dynamics of the system, Eq. (28) would converge within a single step, only if Δ*G*^{CG}(*t*) is independent of $G\u0303(t)$.

As we will demonstrate in Sec. IV D, Δ*G*^{CG}(*t*) in fact depends on $G\u0303(t)$, which can lead to an oscillatory approach of the target. To prevent that, a step size parameter can be included in Eq. (28) to increase the stability. As the optimal step parameter is not known *a priori*, an update scheme that is generally less prone to suffer from oscillations might be preferable. We, therefore, consider

or by using $GiCG(t)=\Delta GiCG(t)+G\u0303i(t),$

In a case where $\Delta GiCG(t)=0$, this update scheme again should trivially converge within one step. For the case of large $\Delta GiCG(t)$, we also expect that the absolute change of $\Delta GiCG(t)$ will be large. The formulation of Eq. (30) automatically decreases the step size in this case and, thus, should prevent oscillatory behavior. As will be demonstrated in Sec. IV D, Eq. (30) can yield slow convergence in some cases.

We consider a third update scheme to further improve the convergence by approximately quantifying the dependence of Δ*G*^{CG}(*t*) on the GLE thermostat parameterization due to $G\u0303(t)$. We start from $GCG(t)=\Delta GCG(t)+G\u0303(t)$ and assume the linear relation $\Delta GCG(t)\u2248a(t)+b(t)G\u0303(t)$. In this expression, *b*(*t*) accounts for the fact that application of a GLE thermostat affects the friction due to particle collisions (conservative interactions). Without a GLE thermostat, the approximation yields Δ*G*^{CG}(*t*) = *a*(*t*), with *a*(*t*) corresponding to the overall single-particle friction in a CG-MD simulation, i.e., *a*(*t*) = *G*_{CG-MD}(*t*). To derive an iterative scheme, we apply these assumptions and write

for each iteration and

for the target. Solving Eq. (32) for *b*(*t*), using the result in Eq. (33), and replacing $G\u0303tgt(t)$ by $G\u0303i+1(t)$ yield

For this third update scheme, an additional CG-MD simulation has to be carried out to predetermine *a*(*t*).

For negligible conservative interactions, *a*(*t*) vanishes, in which limit Eq. (34) is equivalent to Eq. (30) and again should converge within one iteration step. For strong conservative interactions, the correction term *a*(*t*) takes care of the case where friction due to conservative interactions is quite dominant. This effectively leads to larger steps close to the target, while still preventing oscillatory behavior.

### E. Summary of the bottom-up coarse-graining procedure and of notations

Here, we want to summarize the procedure needed to derive coarse-grained models for the target EoM Eq. (5). The first step is always to derive a conservative CG potential. For the parameterization of the GLE thermostat, the following steps need to be carried out.

Generate trajectories for the FG reference model.

Generate a mapped trajectory, representing the reference dynamics in terms of CG DoFs.

Analyze the mapped trajectory to compute

*K*_{C}(*t*),*K*_{δ}(*t*), and*K*_{X}(*t*) according to the BOD-method.Run CG simulations following Eq. (5), with $K\u0303(t)=K\delta (t)+2KX(t)$.

Evaluate dynamic properties from CG trajectories, e.g., Δ

*K*^{CG}(*t*) from Eq. (24) to test the assumption*K*_{C}(*t*) ≈ Δ*K*^{CG}(*t*).

The above summary is generic. In practice, before carrying out step 4, we fit the respective memory kernel with dampened oscillators to parameterize the auxiliary variable GLE thermostat due to Ceriotti.^{27} (See the supplementary material for a detailed description.) To study the relevance of the cross-correlation term, *K*_{X}(*t*), in step 4 we also consider *K*_{δ}(*t*) and *K*_{δ,X}(*t*) = *K*_{δ}(*t*) + *K*_{X}(*t*) for the parameterization of the GLE thermostat.

As we have introduced many different memory kernels with slightly differing notations, a short recap follows:

*K*(*t*),*K*_{δ}(*t*),*K*_{C}(*t*), and*K*_{X}(*t*) always refer to the single-particle memory kernel and its different contributions, as defined by Mori’s linear single-particle EoM [Eq. (7)]. These memory kernels are always evaluated from FG trajectories based on the BOD-method.$K\u0303(t)$ refers to the proposed optimal bottom-up parameterization of the target EoM Eq. (5).

Δ

*K*^{CG}(*t*) refers to the memory added through conservative interactions in CG simulations following Eq. (5). Δ*K*^{CG}(*t*) is evaluated based on the inversion of Eq. (24).*K*^{CG}(*t*) refers to the effective single-particle memory kernel in CG simulations. In practice, we evaluate only its integral,*G*^{CG}(*t*), directly from CG VACFs, following Eq. (14).*G*generally refers to integrated memory kernels, with the respective subscripts, superscripts, etc., denoting the particular memory kernel we refer to.

For the iterative optimization schemes, we additionally introduced the following notations:

*G*^{tgt}(*t*) refers to the target integrated memory kernel, equivalent to*G*(*t*), but evaluated from the FG reference VACF following Eq. (14).$G\u0303i(t)$ refers to the parameterization of the GLE thermostat of the

*i*th iteration. In particular, $G\u03030(t)$ is the initial guess, which can be chosen in different ways.$GiCG(t)$ refers to the integrated effective single-particle memory kernel, evaluated based on the VACF from the

*i*th CG simulation.$\Delta GiCG(t)$ refers to integrated memory implicitly introduced due to conservative interactions in the

*i*th CG simulation. (It has not to be explicitly evaluated.)

## III. MODELS

### A. Fine-grained reference: The continuous Asakura–Oosawa model

The Asakura–Oosawa (AO) model^{37,38} is a well-known generic model for the description of colloid–polymer mixtures.^{39} In this model, colloids and polymers are described, respectively, by single beads of different radii *σ*_{cc} and *σ*_{pp}, where *σ*_{pp} < *σ*_{cc}. The colloid–colloid and the colloid–polymer interactions are described by hard-sphere interactions, while there are no polymer–polymer interactions. While all interactions in this model are repulsive, the *effective* (depletion) interaction between the colloids is attractive and related to the translational entropy of the polymer. For a sufficiently small size ratio *q* = *σ*_{pp}/*σ*_{cc} < 0.1547,^{39,41} it can be shown that these effective interactions can be described exactly in terms of pair-interactions. This means that a CG implicit solvent model can be derived, for which the MB-PMF can be exactly represented with pairwise CG potentials.

For our purpose, the hard-sphere description is not very useful, which is why we use a continuous variant of the AO model, similar to Ref. 40. We model the colloid–colloid and colloid–polymer interaction via purely repulsive Weeks–Chandler–Andersen (WCA) interactions,^{42} i.e.,

and

Throughout this work, we use a system of reduced units by defining the energy, length, mass, and time units as *ɛ* = *σ* = *m* = 1 and $\tau =\sigma m/\epsilon $. Furthermore, we set the Boltzmann constant *k*_{B} = 1. (In this section, we will make the choice of the unit system explicit. In the following sections, tables, and figures, we will omit the units.) For all systems, we set *k*_{B}*T* = 1*ɛ*, *σ*_{cc} = 1*σ*, and *σ*_{cp} = 0.5(*σ*_{cc} + *σ*_{pp}). For the four reference systems reported in the main text, we always use a cubic simulation box of length 14 *σ*, yielding a box volume of *V* = 2744 *σ*^{3}, and *n*_{c} = 2000 colloid particles, which amounts to a colloid number density *ρ*_{c} = 0.73 *σ*^{−3}. The colloid bead mass is chosen as *m*_{c} = 10 *m*. We summarize the parameters that are discussed in the main text in Table I, where we also report the volume fractions $\eta c/p=(\pi \sigma cc/pp3/6)\rho c/p$. In total, we studied 19 different systems. The parameters and results for the systems not discussed in detail in the main text are reported in the supplementary material. In the remainder of this paper, we refer to the three bold systems in Table I as the *q* = 0.15-, *q* = 0.4-, and *q* = 0.6-system, respectively.

q
. | n_{c}
. | n_{p}
. | ρ_{c}
. | ρ_{p}
. | η_{c}
. | η_{p}
. | m_{p}
. |
---|---|---|---|---|---|---|---|

0.15 | 2000 | 60 000 | 0.73 | 21.87 | 0.38 | 0.039 | 1 |

0.15 | 2000 | 60 000 | 0.73 | 21.87 | 0.38 | 0.039 | 3 |

0.4 | 2000 | 10 000 | 0.73 | 3.64 | 0.38 | 0.122 | 3 |

0.6 | 2000 | 1 000 | 0.73 | 0.36 | 0.38 | 0.041 | 3 |

q
. | n_{c}
. | n_{p}
. | ρ_{c}
. | ρ_{p}
. | η_{c}
. | η_{p}
. | m_{p}
. |
---|---|---|---|---|---|---|---|

0.15 | 2000 | 60 000 | 0.73 | 21.87 | 0.38 | 0.039 | 1 |

0.15 | 2000 | 60 000 | 0.73 | 21.87 | 0.38 | 0.039 | 3 |

0.4 | 2000 | 10 000 | 0.73 | 3.64 | 0.38 | 0.122 | 3 |

0.6 | 2000 | 1 000 | 0.73 | 0.36 | 0.38 | 0.041 | 3 |

The FG reference simulations were carried out using GROMACS version 2018.2.^{43,44} We used a simulation step of Δ*t* = 0.0005*τ*. Trajectories, which were used for the calculation of radial distribution functions (RDF) and as input for the coarse-graining procedure, were run for 4 · 10^{6} steps, generated using the stochastic dynamics integrator with a time constant of 4 *τ* while snapshots were saved every 1000 steps. For the evaluation of VACFs, simulations were run for 2 · 10^{6} steps using the velocity-Verlet integrator and the Nosé–Hoover thermostat with a time constant of 4 *τ*. Snapshots were taken after every 20 simulation steps. For the application of the BOD-method, we generated five independent trajectories per system with 2 · 10^{5} steps per trajectory and snapshots were taken after every two steps. Memory kernels derived based on the BOD-method were averaged over these five trajectories.

### B. Coarse-grained Asakura–Oosawa model

#### 1. Conservative interactions

The CG AO model can be understood as an implicit solvent model, in which the DoFs of the polymer beads are removed and effective colloid–colloid interactions are employed. Thus, the mapping operator is defined as

with $r={r1,\u2026,rnc,rnc+1,\u2026,rnc+np}$, $R={R1,\u2026,Rnc}$, and *R*_{I} = *r*_{I}. The CG momenta and masses are accordingly equivalent to the FG colloids momenta and masses. While the exact analytical effective interactions for small values of *q* are known for the hard-sphere AO model,^{39} in principle, it should also be possible to describe the MB-PMF for the continuous AO model in terms of pairwise interactions. The CG force field can be derived by any established coarse-graining method for conservative interactions. A natural choice for the system at hand is the force-matching (FM) (also referred to as multi-scale coarse-graining)^{7} method, which optimizes the pairwise interactions such that the MB-PMF is optimally represented.^{7} For small values of *q*, the FM-method should yield exact (within numerical accuracy) CG potentials. We derived effective pair-potentials for the three studied systems, using the FM-method implemented in the VOTCA software package.^{45} For the FM-method, we set the cutoff distance to 1.31, 2.6, and 2.0 for the *q* = 0.15-, *q* = 0.4-, *q* = 0.6-system, respectively.

In Fig. 1, we show the FG colloid–colloid and colloid–polymer interactions, alongside the CG potentials derived via FM. As expected, in the CG picture, the colloid–colloid interactions are effectively attractive.

#### 2. Coarse-grained simulations

To test the CG potentials and the calculation of radial distribution functions, we generated trajectories again using GROMACS version 2018.2.^{43,44} The stochastic dynamics integrator was used with a smaller time constant of 0.625, to efficiently obtain uncorrelated snapshots.

All CG trajectories used to evaluate time-correlation functions were obtained using Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS).^{46} CG-MD trajectories were generated using the Nosé–Hoover thermostat with a time constant of 5. GLE thermostat simulations were carried out using the auxiliary variable GLE thermostat by Ceriotti *et al.*^{28} as implemented in LAMMPS. After equilibration, CG simulations for the calculation of VACFs were carried out for a total simulation time of 1000 with a spacing between frames of 0.01. Additional trajectories were generated to evaluate Δ*K*^{CG}(*t*) based on Eq. (24) with a total simulation time of 100 and a spacing between frames of 0.001.

For the application of the iterative schemes discussed in Sec. IV D, the simulation time was reduced to 500 to save computational costs.

## IV. RESULTS

### A. Accuracy of conservative interactions and structure

In the CG AO model, we assume that the FM-method yields accurate conservative interactions as long as *q* ≤ 0.15. For larger *q* values, multi-body correlations contribute to the MB-PMF, which cannot be captured in terms of pairwise interactions. In Fig. 2, we compare the radial distribution functions *g*(*R*) from FG and CG simulations for the considered systems. For *q* = 0.15, we find that the FM-method yields a potential that perfectly reproduces the function *g*(*R*), as expected. For *q* = 0.4, the structure is also very accurately represented. Only minor deviations can be found. This indicates that for the system with *q* = 0.4, many-body effects might be present, but they still might be comparably weak. While the RDF is also quite well reproduced for the *q* = 0.6-system, we find a slightly smaller first peak in this case. By varying the value of *q*, we can thus control the accuracy of the representation of the MB-PMF, while still reproducing the main structural features, and, thus, test the relevance of the quality of the conservative interactions on the representability of dynamic properties in the CG models.

### B. Memory due to conservative interactions

Here, we want to test the hypothesis that *K*_{C}(*t*) ≈ Δ*K*^{CG}(*t*), to further justify our strategy for the parameterization of $K\u0303(t)$.

In the supporting information of Ref. 14, we have made the equivalent comparison for a star-polymer melt. While we found a qualitatively good agreement, quantitatively, the two functions showed some nontrivial deviations. Accordingly, small but significant deviations have been found in the VACF of the CG model. We suggested that the deviations stem from inaccuracies in the conservative interactions.

The comparison between *K*_{C}(*t*) and Δ*K*^{CG}(*t*) for the systems discussed in Sec. IV A is shown in Fig. 3. Indeed, overall a good agreement between *K*_{C}(*t*) and Δ*K*^{CG}(*t*) is found for all the three systems. In the *q* = 0.15- and the *q* = 0.4-system, the initial decay matches perfectly. For the *q* = 0.6-system, we find the most significant deviations on short time scales. In particular, the mismatch *K*_{C}(0) ≠ Δ*K*^{CG}(0) directly indicates that different probability distributions *ρ*(** R**) are sampled in the FG and CG simulations, which is in line with the small deviations in the radial distribution function found in Fig. 2. From Fig. 3, it can be inferred that the VACF will be well reproduced on short time scales for

*q*= 0.15 and

*q*= 0.4. For

*q*= 0.6, deviations, especially on short time scales, can be expected.

An interesting finding is that in the *q* = 0.15-system, the tail behavior of *K*_{C}(*t*) and Δ*K*^{CG}(*t*) differs noticeably. We will discuss this finding and its implication in more detail in Sec. IV C 1.

### C. Dynamic properties in coarse-grained models

In Sec. IV B, we have shown that the validity of the assumption that *K*_{C}(*t*) ≈ Δ*K*^{CG}(*t*) depends on the accuracy of the conservative interactions and is more generally system dependent on longer time scales. The AO model has some free parameters, which can be independently chosen. A detailed study of the full parameter space is beyond the scope of this study and we only sparsely screen the bead size ratio *q* in the main text. Varying colloid density, polymer density, and mass of the polymer beads are reported in the supplementary material.

#### 1. The *q* = 0.15-system

As a first example, we study a system with accurate CG conservative interactions (*q* = 0.15), high colloid density (*ρ*_{c} = 0.73), high polymer density (*ρ*_{p} = 21.87), and a polymer bead mass of *m*_{p} = 3. For this system, it can be expected that the overall friction on a single colloid particle is dominated neither by the colloid–colloid collisions nor solely by its collision with the polymer beads. This ensures that both the conservative and dissipative interactions in the CG model are relevant to accurately represent dynamics.

In Fig. 4, the full memory kernel *K*(*t*) and its contributions as defined by Eq. (9) are shown. We also show the integrals $G(t)=\u222b0tdsK(s)$ to evaluate the contributions to the total friction. We find that the total memory kernel on very short time scales is dominated by *K*_{δ}(*t*), which is mostly decayed at *t* = 1. While the amplitude of *K*_{C}(*t*) is smaller on short time scales, it is the dominant contribution already at *t* = 0.15. The cross-correlation contribution *K*_{X}(*t*) is the smallest in amplitude. At times *t* < 0.5, it shows some complex behavior, while it exhibits a long-lived negative tail.

One interesting feature of the full memory kernel is that it is purely positive and, thus, retarding on all time scales. Especially in dilute colloid suspensions, this is typically not the case. In Ref. 31, correlations at larger time scales were found for the single-particle memory kernel of colloids, modeled by Lennard-Jones interactions. A negative, persistent tail of the memory kernel in colloids is induced by a backflow effect in the solvent. For the AO model, with the chosen parameters, we do not find this feature in the full memory kernel. The *K*_{δ}(*t*) and *K*_{X}(*t*) contributions indicate that a similar correlation enhancing mechanism is present but counteracted by the retarding effect of colloid–colloid collisions. The fit functions for *K*_{δ}(*t*) and $K\u0303(t)$ are shown as dotted black lines. The integrals of the memory kernel and its contributions are shown in inset. At long times, the integrals represent the effective friction coefficient and its contributions. For this system, the overall contributions to the total friction due to *K*_{δ}(*t*) and *K*_{C}(*t*) are comparable. As also found in Ref. 14, *K*_{X}(*t*) effectively reduces the overall friction.

We have already discussed the physical interpretation of *K*_{C}(*t*). Now, we can try to infer the physical meaning of *K*_{δ}(*t*) and *K*_{X}(*t*) by considering the data shown in Fig. 4, while noting that this can only provide a qualitative picture. *K*_{δ}(*t*) can be straightforwardly interpreted as being the actual solvent contribution, as it is explicitly related to the orthogonal projection of the residual forces *δF*_{I}(*t*) and encodes for the viscoelastic response due to the lost DoFs. The definition of *K*_{X}(*t*) indicates that not only is it physically linked to the lost DoFs, but it also describes the coupling between the irrelevant and the relevant DoFs. The physical interpretation of this coupling somewhat depends on the type of system and the mapping scheme. In the current system, we consider the dynamics of some colloid particle *I*. Its motion will induce a sound wave in the solvent DoFs, which will interact with other colloid particles in the vicinity. On short time scales (but not instantaneous), this solvent mediated colloid–colloid coupling will modulate the dynamics of colloid particle *I* in some complex way. On longer time scales, it will yield persistent coherent collective motion of the colloids and the solvent, effectively increasing the diffusivity of particle *I*. In Fig. 4, we find that *K*_{X}(*t*) on short time scales shows some complex fluctuations, and on long time scales, it ultimately has a long-lived negative tail, which would fit the described physical picture. In our CG model, we do not include solvent mediated hydrodynamic colloid–colloid interactions explicitly, instead the incorporation of *K*_{X}(*t*) in the GLE thermostat accounts for these phenomena in an averaged way.

In Fig. 5, the VACFs from CG simulations and FG reference simulations are shown. We find that the VACF is almost perfectly reproduced when using $K\u0303(t)$ as defined by Eq. (26) in the GLE thermostat. We find that even zooming into the data by about a factor of 10, as found in the inset of Fig. 5(a), does not reveal any significant deviation. The FG VACF shows a pronounced backscattering effect, indicated by a negative region of the VACF. While this behavior is qualitatively captured by all three parameterizations, using *K*_{δ}(*t*), *K*_{δ,X}(*t*) = *K*_{δ}(*t*) + *K*_{X}(*t*), and $K\u0303(t)$, it is best captured with $K\u0303(t)$, demonstrating that the cross-correlation contribution *K*_{X}(*t*) has to be taken into account to capture the VACF accurately. In this system, we see two sign changes in the VACF and, accordingly, two local extrema. In *K*_{X}(*t*), we find similar features, exhibiting first a local maximum, which shifts the local minimum in the VACF to smaller values, and a local minimum, which shifts the local maximum in the VACF to higher values. While the initial decay can be well captured without taking *K*_{X}(*t*) into account, the intermediate fluctuations are modulated through the cross-correlations.

While in our earlier study on a star-polymer melt,^{14} the backscattering effect was not as well reproduced as in the AO model, it was also found that taking *K*_{X}(*t*) into account strongly improves the representation of the backscattering effect and that, in particular, a neglection of *K*_{X}(*t*) leads to too slow dynamics, compared to the FG reference. This was previously indirectly shown in a study by Lei *et al.*^{9} and is now confirmed for the AO model.

As discussed in, for example, Refs. 26, 36, and 47, Eq. (5) can only be an approximate EoM and we do expect to find some discrepancies in the CG dynamics. The inset of Fig. 5(b) shows the data zoomed in by another order of magnitude on a larger time scale. On a time scale on which the VACF is almost fully decayed (by more than 99%), small but persistent deviations can be found in the tail region. In Fig. 5(b), the log–log plot reveals that in fact the hydrodynamic tail is not described accurately in the CG systems.

#### 2. The *q* = 0.4-system

By varying the bead size ratio *q*, multi-body effects can be introduced into the effective colloid–colloid interactions and the quality of the CG potentials based on pair-interactions can be investigated in a controlled way. In this section, we study the dynamic properties of the *q* = 0.4-system. In Fig. 2, we have shown that despite the increased bead size ratio *q*, the RDF can be accurately captured by the FM-potentials. While many-body effects should in theory be present, the volume per polymer bead is still only 6.4% of the volume per colloid bead and the pairwise potential seems to be a good approximation of the MB-PMF.

This is also reflected in Fig. 3, where we have found that Δ*K*^{CG}(*t*) in CG GLE models matches *K*_{C}(*t*) from the reference simulations quite well on all time scales. In particular, we do not see the divergence in the tail region as found in the *q* = 0.15-system and it can be assumed that the CG GLE model parameterized with $K\u0303(t)$ should capture the reference VACF on short and long time scales.

In Fig. 6, we compare the memory kernel *K*(*t*) and its contributions. Qualitatively, the contributions show a comparable behavior as in the *q* = 0.15-system, but we find that *K*_{C}(*t*) yields a more significant contribution to the total friction. This is due to (1) the more attractive and longer ranged CG potential and (2) the reduced number of polymer beads in the reference system. Still, the contribution to the total friction due to the removed DoFs is far from negligible and the reproduction of dynamic properties in CG simulations is nontrivial.

In Fig. 7, we compare the FG and CG VACFs and find, as one would expect, a slower decay in the CG-MD simulations without a GLE thermostat, indicating overall faster dynamics. While the parameterization with *K*_{δ}(*t*), *K*_{δ,X}(*t*), and $K\u0303(t)$ all lead to a significant improvement in the reproduction of the reference VACF, zooming into the first minimum and into the tail region reveals again that neglecting the cross-correlation contribution *K*_{X}(*t*) in the parameterization of the GLE thermostat leads to errors in the representation on intermediate time scales. For this particular system, even the long time tail is very well captured with $K\u0303(t)$, which can be best seen in the log–log representation.

#### 3. The *q* = 0.6-system

In this section, we study the dynamic properties of the *q* = 0.6-system. In Fig. 8, we show the memory kernel *K*(*t*) and its contributions. In this system, the *K*_{C}(*t*)-contribution decays significantly faster, while the *K*_{δ}(*t*)-contribution shows a more persistent tail, compared to the other systems. Overall, *K*_{C}(*t*) accounts for the most part of the total friction.

For this system, the volume of a polymer bead is about 22% of a colloid bead and multi-body effects can be assumed to be more relevant.

As shown in Fig. 2, the radial distribution function is not perfectly matched in CG simulations, which indicates that the MB-PMF is not exactly reproduced in this system. In Fig. 3, we have found that while the overall form of *K*_{C}(*t*) and Δ*K*^{CG}(*t*) qualitatively still match quite well, a discrepancy at time zero is found. As both *K*_{C}(*t*) and Δ*K*^{CG}(*t*) are proportional to the variance of the forces due to the CG potential in the FG reference simulation and the CG simulation, respectively, this, in addition to slight deviations in the RDF, directly indicates that the CG simulations do not sample the correct probability distribution *ρ*(** R**). If the hypothesis we proposed in the Introduction and in Ref. 14 is true, this should introduce some deviations in the VACF in the $K\u0303$-model.

In Fig. 9, we compare the VACF of CG simulations with the FG reference. Even though the overall friction should be dominated by the conservative interactions, we find that the CG-MD simulation still leads to a too slow decay of the VACF in CG simulations, indicating too fast dynamics, compared to the FG reference, and, thus, necessitates the introduction of friction. We again find that all three GLE models lead to an improvement of the representation of the VACF, but for this system, by zooming into the first minimum and the tail, we find that $K\u0303(t)$ cannot fully capture the VACF on short and intermediate time scales. Still the $K\u0303$-model performs significantly better than the *K*_{δ}- and *K*_{δ,X}-model, which show a strong underestimation of the VACF at times *t* > 1, and is, thus, found to be the optimal purely bottom-up informed choice for the parameterization of the isotropic GLE thermostat.

### D. Iterative optimization

Here, we want to apply the three iterative optimization schemes for memory kernels as described in Sec. II D [see Eqs. (28), (30), and (34)]. We start with the first system in Table I. This system is equivalent to the *q* = 0.15-system discussed in Sec. IV C 1, except for the reduced polymer bead mass *m*_{p} = 1. We made this choice as we have found that the deviations in the tail in the VACF are more pronounced with reduced polymer bead mass and, thus, an optimization is more relevant.

In Fig. 10, we show that the first update scheme [Eq. (28)] improves the tail behavior of the VACF, while it maintains the previously already accurate dynamics on shorter time scales. However, we also find that the residual in the effective kernel $GiCG(t)$ is overcompensated in the first steps. The reason for this behavior is found in the fact that Δ*G*^{CG}(*t*) depends on $G\u0303(t)$. To get an intuition on how the effective friction and memory due to conservative interactions is affected by friction added through a thermostat, we can compare the integrals of $G\delta (t)=\u222b0tdsK\delta (s)$, which is related to the friction due to collisions with polymer beads, and $GC(t)=\u222b0tdsKC(s)$, related to the friction due to effective interactions with colloid beads.

In Fig. 11, we plot *G*_{C}(*τ*) against *G*_{δ}(*τ*), with *τ* = 4 for 18 different systems. For otherwise equivalent systems, an increase in the polymer mass naturally leads to an increase in the friction due to colloid–polymer collisions, and, thus, the plateau value of *G*_{δ}(*t*) increases. We find that at the same time, the other contributions to the total memory also get more pronounced. In particular, the plateau values of *G*_{C}(*t*) are increased. This indicates that an increase in the viscosity of the implicit solvent also increases the effective friction exerted due to effective colloid–colloid interactions. This is not surprising as the reduction in the mobility of the colloids leads to a slower decorrelation of the conservative interactions and, thus, increases the friction due to these interactions. The relation between *G*_{C}(*τ*) and *G*_{δ}(*τ*) seems to be almost linear if only the polymer mass is changed. This qualitatively explains the oscillatory nature of the iterative scheme due to Eq. (28). In CG simulations, Δ*G*^{CG}(*t*) will dominantly change in the same direction as the correction applied to $G\u0303(t)$.

In Sec. II D, with Eq. (30), we proposed an update scheme that should be less prone to oscillations. We show the first iterations using Eq. (30) in Fig. 12, and indeed, both the effective memory kernel and the VACF smoothly approach the target.

The results for the third update scheme [Eq. (34)] are found in Fig. 13. Here, the target is approached faster than with Eq. (30) and simultaneously does not show any oscillatory behavior, indicating that Eq. (34) constitutes the most promising optimization scheme.

The choice $G\u03030(t)\u2261G\u0303(t)$ already provides a good initial guess. Jung *et al.*^{31} in their iterative scheme based the initial guess on force-autocorrelation functions. To test our schemes for an initial guess that is not as close to the optimal parameterization, we also started from $G\u03030(t)\u2261\u222b0tdsC\delta F\delta F(s)MIkBT$. The results obtained with this choice are summarized in the supplementary material. While the initial guess in this case is not as close to the optimal choice, all three iteration schemes yield quite good results within the first few steps.

In summary, all three iterative schemes perform quite well for the *q* = 0.15-system with *m*_{p} = 1. As already discussed, this is a system in which the contribution to the total friction due to conservative interactions is highly relevant but not dominant. We would expect to see a stronger difference in the performance of the different schemes in a case in which the conservative interactions yield the dominant part of the overall friction. This is the case in the *q* = 0.6-system. Furthermore, the deviations in the *q* = 0.6 are not limited to the tail region in the VACF, which makes it a good complementary test case. For reference, we show the results for $G\u03030(t)=G\u0303(t)$ as initial guess in the supplementary material. We find that all three update schemes yield good corrections. Yet, as in this system friction due conservative interactions is more dominant, for the scheme of Eq. (30), the target is approached quite slowly, which is not the case for Eqs. (28) and (34). This becomes more evident by again using $G\u03030(t)\u2261\u222b0tdsC\delta F\delta FMIkBT(s)$ as initial guess. The results for optimization following Eqs. (30) and (34) are shown in Figs. 14 and 15, respectively. We find that for Eq. (30), the target is approached only slowly, while Eq. (34) automatically optimizes the step size, such that the target is well reproduced within three to four iterations.

## V. DISCUSSION

In the previous sections, we have demonstrated that the parameterization of a CG model following Eq. (5) based on the splitting of contributions to the single-particle memory kernel as introduced in Ref. 14 and Sec. II C generally yields a good representation of the VACF in coarse-grained implicit solvent continuous AO models.

We have found that the assumption that Δ*K*^{CG}(*t*) ≈ *K*_{C}(*t*) is generally well justified in the studied systems, and this reinforces the interpretation of *K*_{C}(*t*) as the memory and friction introduced by the CG potential in multi-body simulations. The results of the *q* = 0.6-system have shown that indeed this assumption is less accurate if the MB-PMF is not fully captured by the CG potential.

This confirms our interpretation of the data in Ref. 14. An increase in *q* introduces errors in the radial distribution function and accordingly our CG models cannot exactly reproduce FG VACFs on short time scales. The reason lies in the assumption made in Eq. (25). In the evaluation of *K*_{C}(*t*), trajectories that sample the actual MB-PMF are used. When the conservative interactions in the CG model do not reproduce the MB-PMF well enough, the CG model will sample different structures and, subsequently, different force fluctuations. A change in the force fluctuations directly translates to a change in the effective friction introduced by the conservative interactions. While this point is typically not deeply discussed in the literature on CG dissipative models, in other contexts, the importance of accurate CG force fields for dynamical properties is discussed. For example, the relevance of accurate energy barriers in CG models is discussed in Ref. 48. This is particularity crucial, as this fact does not specifically apply to our approach but to any parameterization scheme of dissipative models, which is based on the analysis of mapped reference trajectories, as, for example, in Refs. 11, 12, 16, and 17.

In Ref. 11, the authors derived different types of Markovian DPD models for star-polymer melts with different numbers of beads per molecule. The authors found that the quality of their DPD models depends on the number of beads per molecule. In particular, in these models, the VACF could be very well reproduced for small star-polymers, in which the structural properties were also well reproduced. For a similar star-polymer system, such as that used in our previous study,^{14} the minimum of the VACF could not be matched; and simultaneously, the structural properties were also less accurately reproduced. The authors mentioned the potential relevance of many-body correlations^{11} and our findings reinforce this interpretation.

In Ref. 17, the authors derived Markovian and non-Markovian CG DPD models for fluid blobs with an adaptive mapping scheme. They were able to reproduce the pair-structure quite accurately by employing local-density dependent potentials. While their non-Markovian models reproduce the initial decay and the long time scale of the VACF of the reference model accurately, the backscattering effect could only be reproduced qualitatively.^{17} While the complexity of the model complicates the interpretation of this finding, we propose that our method of comparing *K*_{C}(*t*) and Δ*K*^{CG}(*t*) could in principle be used as an indirect test for the origin of the mismatch in the VACF on intermediate time scales for such models.

The bottom-up derivation of dynamically consistent particle based CG models is thus not only complicated by finding an appropriate, tractable EoM and its formal relation to the FG reference, but in some sense, it is also limited by the quality of the CG potential.

We have found that for the *q* = 0.15-system, the long time tail cannot be fully captured without *a posteriori* optimization of the parameterization, while the *q* = 0.4-system does not suffer from this problem. In the supplementary material, we report the results for a broader spectrum of parameters of the reference AO model. We find that the accuracy of our CG models, in particular with respect to the hydrodynamic tail, is strongly system dependent. In particular, the discrepancy in the tail behavior is less severe for high polymer bead density and for high polymer mass. Therefore, an increase in the viscosity of the “solvent” improves the long tail representability in our CG models. This finding reflects the screening of hydrodynamic interactions: In the FG reference systems, the collision of colloid beads with polymer beads induces a flow of polymer beads, which can subsequently transfer momentum to other colloid beads. These hydrodynamic interactions are not explicitly encoded in our CG models. At high viscosity, these hydrodynamic interactions are dampened and, thus, less relevant.^{49} Both an increase in the polymer bead’s mass and an increase in the polymer bead’s number density lead to an effective increase in viscosity. It can, thus, be assumed that the indirect momentum transfer through hydrodynamic interactions is reduced and the modeling in terms of independent friction and fluctuating forces on the colloid beads is better justified.

We want to stress that the averaged effects of hydrodynamic interactions, which typically lead to an increase in diffusivity through collective motion, are implicitly included in $K\u0303(t)$ via *K*_{X}(*t*). The origin of the mismatch in the tail behavior for some systems comes from the fact that the friction and noise are modeled independently for each bead. This potentially reduces the inter-bead correlations, which leads to a change in the temporal fluctuations in the structure. This, in turn, determines the fluctuations in conservative interactions, and on long time scales, it can lead to errors in the induced contribution to the total effective memory kernel.

Qualitatively the question whether the hydrodynamic tail can be correctly modeled by Eq. (5) following our parameterization scheme boils down to the question whether the colloid–colloid momentum transfer is dominated through the conservative interactions or through colloid–colloid hydrodynamic interactions. If the former is the case, the inter-bead correlations should be less dependent on the choice of the form of the thermostat. If the latter is the case, the loss of cross-correlations will be evident in the tail of the VACF.

To validate this intuition, we can calculate the velocity cross-correlation function $CVVX(t;Rcut)$ between pairs of particles as defined by

In Fig. 16, we compare $CVVx(t;Rcut)$ for *R*_{cut} = 1.2 and *R*_{cut} = 1.8 for all three systems. In the *q* = 0.15-system, we find that in the CG model, the maximal cross-correlations are approximately half of the reference, while in the *q* = 0.4-system and the *q* = 0.6-system, the CG model captures the cross-correlations significantly better. This is not surprising, as the depth and range of the CG conservative potential in the *q* = 0.4- and *q* = 0.6-system are larger, which allows for an increase in correlations between neighboring beads due to conservative forces. Thus, in the *q* = 0.15-system, the colloid–colloid momentum transfer through hydrodynamic interactions is more prominent than in the other systems and cannot be fully captured in CG models without explicit hydrodynamic interactions. Thus, the modeling errors evident in the VACFs in the *q* = 0.15- and the *q* = 0.6-system are of different origin and affect different time scales. In either case, the proposed iterative optimization schemes, especially Eq. (34), can be applied to correct these residual deviations to match the target correlation functions.

## VI. CONCLUSION

In this article, we bottom-up derived CG implicit solvent models for different continuous AO models. For the derivation of CG pair-potentials, we used the FM-method. We applied an isotropic GLE thermostat based on the auxiliary variable approach to correctly represent single-particle time-correlation functions. The parameterization of the GLE thermostat to achieve dynamic consistency is based on the splitting of the effective single-particle memory kernel into different contributions. We have shown that our scheme generally leads to a good representation of single-particle dynamics. By deliberately choosing the parameters in the reference AO models, we have shown that an accurate representation of the MB-PMF in terms of the CG potential is crucial in purely bottom-up derived models, as only then the collision and backscattering dynamics can be accurately captured. We have shown that the application of an isotropic, configuration independent thermostat in many-body simulations can lead to errors in the long tail of VACFs, as such a thermostat does not explicitly consider inter-bead correlations in dissipation and random forces. This implies a lack of inter-bead momentum transfer, which can lead to a reduction in collective motion. Based on inter-bead velocity cross-correlation functions, we have shown that this limitation of isotropic thermostats applies to different extents for different systems and can be expected to be most crucial when the momentum transfer between the CG DoFs in the FG reference models is dominantly mediated through the removed DoFs. In implicit solvent models, as in the studied systems, this is the momentum transfer mediated by an induced flow in the solvent. More generally, one cannot expect to capture all dynamic properties in full detail when using a CG model of the form given by Eq. (5). This is not unique to dynamic coarse-graining, as in structural coarse-graining also, e.g., the reliance on nonbonded pair-potentials limits the accuracy of CG models and the interpretation of results from CG simulations has to always be assessed carefully. For practical purposes, always a balance between feasibility and accuracy has to be found. In particular, in molecular coarse-graining, in which few atomistic degrees of freedom are mapped onto CG beads, and the molecular structure is maintained in the CG representation, a perfect representation of the MB-PMF with simple CG potentials is not possible due to, e.g., the representation of nonsymmetric molecule segments in terms of spherically symmetric interactions. This fact alone poses limitations to the derivation of dynamically consistent coarse-grained models in a purely bottom-up manner. For the case in which single-particle time-correlation functions need to be reproduced accurately despite the intrinsic limitations of the target model, we have proposed three different iterative optimization schemes for the parameterization of the optimal memory kernel. We used the integrated effective single-particle memory kernel, which can be readily obtained from a VACF, as the target property, as this memory kernel fully determines other correlation functions and allows for the usage of very simple update schemes.

In the future, it is of interest to investigate how well the CG models of the form studied in this paper can perform for realistic molecular systems. For example, the speedup upon coarse-gaining in mixtures can lead to inconsistent relative diffusivities in different species,^{50} which could be accounted for by coupling different CG DoFs to distinctly parameterized GLE thermostats. Another example where non-Markovian dissipative models might be particularly useful is the study of the dynamics of polymer chains, as anomalous diffusion found in, e.g., polymer melts is an intrinsically non-Markovian process,^{51} which cannot be fully captured through Markovian models.

## SUPPLEMENTARY MATERIAL

See the supplementary material for complementary data and discussion on additional systems not discussed in the main text. We also provide summaries of Mori’s projection operator, the BOD-method, and the parameterization of the auxiliary variable GLE thermostat.

## ACKNOWLEDGMENTS

Financial support for this work was granted by the Deutsche Forschungsgemeinschaft (DFG) via SFB TRR 146 (Grant No. 233630050, project A2). The authors thank Friederike Schmid for suggesting the Asakura–Oosawa model as an interesting model system and Madhusmita Tripathy for helpful suggestions on the manuscript.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

## DATA AVAILABILITY

The data that support the findings of this study are available within the article and its supplementary material. Input parameters for simulations, analysis scripts, and raw data of the shown figures are available from the corresponding author upon reasonable request.