Empirical force fields employed in molecular dynamics simulations of complex systems are often optimized to reproduce experimentally determined structural and thermodynamic properties. In contrast, experimental knowledge about the interconversion rates between metastable states in such systems is hardly ever incorporated in a force field due to a lack of an efficient approach. Here, we introduce such a framework based on the relationship between dynamical observables, such as rate constants, and the underlying molecular model parameters using the statistical mechanics of trajectories. Given a prior ensemble of molecular dynamics trajectories produced with imperfect force field parameters, the approach allows for the optimal adaption of these parameters such that the imposed constraint of equally predicted and experimental rate constant is obeyed. To do so, the method combines the continuum path ensemble maximum caliber approach with path reweighting methods for stochastic dynamics. When multiple solutions are found, the method selects automatically the combination that corresponds to the smallest perturbation of the entire path ensemble, as required by the maximum entropy principle. To show the validity of the approach, we illustrate the method on simple test systems undergoing rare event dynamics. Next to simple 2D potentials, we explore particle models representing molecular isomerization reactions and protein–ligand unbinding. Besides optimal interaction parameters, the methodology gives physical insights into what parts of the model are most sensitive to the kinetics. We discuss the generality and broad implications of the methodology.

Molecular dynamics (MD) is a powerful tool that can, in principle, rigorously fully characterize molecular structures, thermodynamics, and kinetics. This scope is delimited by the accuracy of the atomistic molecular dynamics force field. Coarse-graining quantum mechanical molecular interactions in force fields enables modeling many-body molecular systems; yet, this process transfers errors associated with the model selection bias and parameterization based on sparse experimental datasets,1,2 such as vaporization data for small fragments.1,3 Some of these limitations are lifted by modern approaches employing open science and machine learning to coarse-grain electronic degrees of freedom4–7 or by constraining to experimental nuclear-magnetic resonance (NMR) data.8–10 

Current force fields can reach experimental accuracy in determining highly populated structures and their thermodynamics,11 but can differ vastly in predictions of kinetic properties.2,12 This systematic error is caused by the fact that during force field parameterization, ensemble averages are very sensitive to the free energy differences of the metastable states emerging from the potential energy function, but not so much to the free energies of the barriers between these states. Thus, by parameterizing against thermodynamic ensemble averages, it is unlikely to obtain kinetically accurate force fields. Several systematic approaches to optimize force field parameters aiming at reproducing thermodynamics have been developed throughout the years, e.g., force field parameterization for large biomolecules against solution NMR data13–15 or against distances between labeled sites.16 However, so far, there is no systematic strategy to optimize the force field parameters for the underlying kinetics such that an improved force field would match the target (experimental) interconversion rates directly, although procedures for model optimization for dynamical trajectories have been recently proposed.17–19 Such force fields, capable of representing experimental kinetics, would have a large impact on molecular dynamics since they would accurately report on transition state structures and populations of processes that are impossible to resolve by experiments,20 thus offering a leverage in, for instance, protein regulation and design (e.g., by mutations) or design of materials with specified kinetic properties.

In this work, we explore an efficient way to infer the relationship between molecular (force field) model parameters and kinetic data using only prior ensembles of reference trajectories and employing techniques to reweight these paths. (Note that in this work, trajectory and path refer to the same concept: an ordered sequence of states representing the dynamical evolution of a system.) This avoids the computationally expensive step of recomputing the (rare event) rate constant and moving randomly in the high-dimensional force field parameter space (e.g., a typical protein force field contains 100’s of parameters). Recently, Donati et al. have explored such path reweighting techniques, which explicitly compute the change in path action based on a force field perturbation.21–23 We combine this path reweighting technique with the continuum path ensemble maximum caliber (CoPE-MaxCal) approach20,24 in order to impose the dynamical constraint and at the same time select the best solution among multiple sets of parameters that all give the correct kinetics. This selection thus corresponds to a minimal perturbation, i.e., the change in the force field that causes the smallest possible perturbation to the entire path ensemble, while still obeying the constraint.20 As the computation of the reference path ensemble (at the current force field) and the rate constant by direct MD is inefficient, we employ trajectory sampling methodology, in particular transition path sampling (TPS),25,26 and its descendent single replica transition interface sampling (SRTIS) to efficiently obtain path ensembles.27 Our method is applicable to any rare event method that gives access to the reference path ensemble (e.g., Refs. 28 and 29).

The aim of this paper is to lay the theoretical foundations for the above-mentioned framework and to illustrate the proof of principle of our methodology on several simple particle-based systems undergoing stochastic rare event dynamics. The remainder of this paper is organized as follows. In Sec. II, we introduce and develop the reweighted estimators for the CoPE-MaxCal approach and show how these equations can be used to optimize the underlying force field parameters. In Sec. III, we describe the model systems and give simulation details. We discuss the results of the optimization procedure in Sec. IV. We end with conclusions in which we give an outlook on future directions.

Consider a system comprising N atoms. xR3N denotes the configurational state of the system, where R3N is the 3 N-dimensional position space. We assume that the system evolves according to the overdamped Langevin dynamics30,31 in a force field −∇V(x), and note that our method can be generalized to underdamped Langevin dynamics23 so that xR6N. We simulate the system using the Euler–Maruyama (EM) integrator32 to obtain time-discretized paths. A path or trajectory is defined as an ordered sequence of frames x = {x0, x1, …xL}, where the subscripts denote the time index. Subsequent frames are separated by the time step of the integration algorithm Δt such that the total duration of the path is T=LΔt. The probability for a trajectory of length T is
(1)
where ρ(x0) denotes the probability density of the initial condition, usually the Boltzmann distribution ρ(x) ∼ exp(−βV(x)), with V(x) being the potential energy of configuration x, β = 1/kBT being the reciprocal temperature, T being the temperature, and kB being Boltzmann’s constant. p(xixi+1) is a short-time Markovian probability representing the dynamical evolution, as given by the integration algorithm, and thus depends on V(x). Z is a normalization constant such that DxP[x]=1. See the supplementary material for details.
Consider next a modified potential
(2)
where V0(x) is the reference potential, and the perturbation U(x;a) can be expressed in terms of m parameters a = (a1, a2am). Consequently, also the path probability density at the modified potential depends on these parameters: P[x;a].
The (relative) path entropy, S or caliber, for this path distribution P[x,a] with respect to the probability P0[x] at V0(x) is given by the Kullback–Leibler divergence,
(3)
The maximum caliber principle33 states that the optimal path probability distribution PMC[x] follows from maximizing the caliber, or path entropy, by varying a while satisfying an external constraint sexp,
(4)
The integral in the first constraint defines a path ensemble average ⟨… ⟩W of a path observable s[x] with respect to the path probability density P[x;a]. The second constraint keeps the probability distribution normalized.
Solving Eq. (4) can be addressed using the method of Lagrange multipliers. The path Lagrange function is
(5)
where the second term imposes the experimental constraint, the final constraint enforces normalization, and μ and ν stand for Lagrange multipliers. L depends on the potential energy function V(x;a) via Eq. (1). The task is now to find the stationary points of the Lagrange function, which constitutes setting to zero the derivatives of L with respect to the adjustable parameters a of the potential energy function.
To be able to estimate the derivative from paths sampled at V0(x), we express P[x,a] using a path reweighting factor,23 
(6)
where W[x;a] depends on the difference of the path actions and is specified in Eq. (16). Z0/Z(a) is the ratio of the partition function at V0(x) and the partition function at V(x, a). Including Z0Z(a) in the definition of the modified path probability density guarantees that the density is normalized: DxP[x;a]=1. The reweighted path ensemble average, then, is
(7)
By inserting Eq. (6) into Eqs. (5) and (3), we can express the Lagrange function as reweighted path ensemble averages,
(8)
(9)
Details on the derivation of these central equations are given in the supplementary material. The terms have the following interpretation: ⟨lnW[x;a]⟩W represents the path action difference, lnZ(a)Z represents the free energy difference, and ⟨s[x]⟩W is the expected value of the experimental observable at the modified potential.
In some cases, it is beneficial to constrain to the logarithm of the path observable. The path Lagrange function then becomes
(10)
The extension to multiple constraints is shown in the supplementary material.

To find the optimal parameters a of the potential energy function for a single constraint, we determine the stationary point of the Lagrange function, i.e., we solve L/ak=0,L/μ=0 for all a1, …am, leading to m + 1 constraint equations.

Obtaining explicit expressions for these constraint equations is easier when defining an auxiliary function w[x;a] = lnW[x;a] so that W[x;a] = exp(w[x;a]) and w[x;a]=akw[x;a]. The derivative of the path Lagrange function [Eq. (10)] with respect to a parameter ak, then, is
(11)
where we define
(12)
The derivative with respect to the Lagrange multiplier is given by the constraint itself,
(13)
Details of the derivation of these equations are provided in the supplementary material.

Since the path expected values in Eqs. (11) and (13) can be estimated from a set of trajectories generated at the reference potential V0(x), these two equations provide the basis for optimizing the parameters of the modified potential V(x, a). So far, the logarithm of the relative path probability w(x, a) = lnW[x, a] and the path observable s(x) have been abstract functions. Next, we derive explicit expressions for constraining the rate constant and expressions for w and w′ for paths generated by the EM integrator.

While the experimental observable in the constraint could be any dynamical observable, such as mobility and viscosity, here, we focus on the kinetic rate constant. In principle, a straightforward MD simulation yields the rate constant as the number of effective transitions per time unit, but for a rare event, this is extremely inefficient. Many enhanced sampling methods exist to make rate constant computations more efficient.34–40 We adopt the framework of transition path sampling25 and transition interface sampling (TIS)41 and, in particular, that of the reweighted path ensemble (RPE).42 Defining (meta)stable states A and B using an order parameter (or collective variable) λ, with λA,B being the boundaries of states A and B, one can compute the rate constant in the RPE framework as
(14)
where ADx denotes a path integral over trajectories that leave A and go over the barrier to B or return to enter A, after which they are terminated (see the supplementary material). The frequency with which these trajectories are sampled is determined by P[x], and such a path ensemble can be obtained by, e.g., TIS.41 
θ(x) is the Heaviside step function, and λmax[x] returns the maximum value of the variable λ along the path. Thus, the fraction of path integrals equals the probability that paths reach the boundary of state B. Multiplying with the flux ϕA through the first interface λ0 for paths leaving state A yields the rate constant. Using the path reweighting of Eq. (6) and setting s[x] = ϕAθ(λmax[x] − λB) and the experimental observable to sexp=kABexp in Eq. (10), the Lagrange function and its derivative are given by
(15)
We used the assumption that ϕA does not depend on a and, thus, cancels in the second term, which is justified if the parameters a do not influence stable state A. In general, the parameters can also affect the stable states, but the effect on the flux is expected to be small, in comparison to the change in rate constant due to the barrier height.

In the constraint term, the path ensemble average wA,W is calculated using the reweighted path probability P[x]=W[x;a]P0[x] (indicated by subscript W) over all trajectories that leave A (indicated by subscript A). The effect of s[x], as defined above, is to select the reactive AB paths, i.e., those that reach B, from the set of all transition paths. With this choice of s[x], Eq. (12) becomes a reweighted path ensemble average over the set of reactive AB trajectories. Thus, we can write wWs=wAB,W, where subscript AB indicates that the path ensemble average is calculated with the path probability P[x]θ(λmax[x]λB) over all trajectories that leave A. Note the similarities to the temperature derivative of the rate constant in, e.g., Refs. 43 and 44.

The term between brackets in Eq. (15) denotes the derivative of the rate constant with respect to the force field parameters and can serve as a first sanity check whether the reweighting approach actually works.

For the EM integrator, the factor W[x;a] is23 
(16)
where U(x;a) is the perturbation to the force field as defined in Eq. (2), κΔt/2kBTξm, ξ is the Langevin collision rate in units of s−1, and m is the particle mass, as defined in the EM integrator. ηi is the random number used in the ith iteration of the EM integrator out of n time steps (frames) and can be recorded during simulation at the reference potential V0(x). Taking the derivative of w[x;a] = ln W[x;a] yields
(17)
The above equations can be used to compute both the caliber and the derivatives of the Lagrange function. While it is possible to compute the derivative with respect to ak on the fly, it is even more efficient to be able to compute these a posteriori. This depends on the precise functional form of U(x;a). In the case of a linear dependence V(x, a) = V0(x) + kakUk(x), the perturbation forces are −akUk(x), and the derivative of ∇U(x, a) with respect to ak is just ∇Uk(x). Hence, it is convenient to store the sums over ηiU(xi, a) and (U(xi;a))2 terms for each path explicitly so that the derivatives can be computed easily for arbitrary values of a. For a non-linear dependence, one can still do so, but it becomes more involved.
We demonstrate our novel methodology on several simple models. First, we consider a 2D triatomic system in which three atoms interacting with Weeks-Chandler-Andersen (WCA) potentials45 are bound by a harmonic potential that has a minimum at a certain equilibrium bond distance: V(r)=12a(rreq)2. This trimer can undergo an isomerization transition where one particle passes between the other two, hence changing from a clockwise to anticlockwise arrangement of the (labeled) particles (see Fig. 1). Note that this system was also studied in Ref. 26. We focus on the isomerization rate constants for this trimer and measure the rate constants and their derivatives as functions of the force constant a and the equilibrium distance req. In fact, since we are interested in the changes Δa, Δr, with respect to a reference system, we define the modified harmonic potential as
(18)
so that the perturbation U(x;a) becomes
(19)
where rj denotes the jth bond vector length in the trimer.
FIG. 1.

Cartoon of a 2D triatomic system held together by harmonic springs. The trimer can isomerize as indicated. For illustrative purposes, particle 1 is colored red. Note that we focus on only one of the three possible equivalent transition channels.

FIG. 1.

Cartoon of a 2D triatomic system held together by harmonic springs. The trimer can isomerize as indicated. For illustrative purposes, particle 1 is colored red. Note that we focus on only one of the three possible equivalent transition channels.

Close modal
We sample the transition over the barrier in the triatomic system using SRTIS.27 The collective variable used to define stable states and interfaces is the shortest distance rperp of the hopping particle to the axis between the two remaining particles. This distance is negative or positive depending on which side the hopping particles is located. The collective variable follows from the (ad hoc) transformation,
For a configuration in the minimum of the potential, this transformation leads to λ=1.5(10.53)0.20. Stable state A is rather strictly defined as λ < λA = 0.0 to avoid too short paths. Stable state B is defined from the other side and is also λ0 < λB = 0.0. Note that since the reference state is different, both definitions refer to different states even if the value of λ is equal. Interfaces were defined at λ = 0.4, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.00, 1.05, 1.10, 1.15, 1.20, 1.25, 1.30, 1.35, 1.40, 1.45, 1.50, 1.55, 1.60, 1.65, 1.70, 1.75, 1.80. The Langevin integrator settings are γ = 2.5 and dt = 0.001. Integration is done via the EM algorithm.

Sampling 10 000 cycles with SRTIS,27 where each cycle consisted of 100 shots, 100 interface exchanges, and 100 state swaps, resulted in a path ensemble for each interface. The measured crossing histograms were joined with the weighted histogram analysis method (WHAM),46 which together with the effective positive flux through the first interface27,39,41 leads to rate constant estimates over the barrier. The WHAM procedure also allowed us to assign a weight to each trajectory in the path ensemble, yielding the RPE. Each of these RPE trajectories can be evaluated in terms of the path action and its derivatives.

As a second illustrative model, we consider a dissociation transition in a cluster of Lennard-Jones (LJ)-like particles. We first consider a seven-particle setup in two dimensions, indicated in Fig. 2. In the simplest instance of this model, all particles are kept fixed except the red particle, which can dissociate into the bulk. This red particle interacts with both the central blue particle (denoted as c) and the green particles via an attractive LJ-like interaction potentials with adjustable depths. The interaction with the gray particles is purely repulsive and given by a standard repulsive WCA potential.45 Setting the particle diameter as the unit of length σ = 1 and denoting the red particle with index 0, the total energy is, thus, given by
(20)
where the (adjustable) potentials Vϵ(r), in units of kBT, are given by
(21)
Here, the constants vs and vϵ shift the potential such that the potential is zero at the cutoff rc = 1.5 and continuous at the minimum r = 21/6. ϵ is the (reduced/dimensionless) depth of the potential, while ϵ0 = 1 is the standard reference value for the potential. For this particular cutoff, it follows that vs=rc6rc12=0.0800841 and vϵ = (1 − ϵ) + 4ϵvs.
FIG. 2.

Cartoon of the 2D dissociation model. The red particle, originally bound to the central (blue) and outer (green) particles, can escape into the bulk. Below the line, the interaction strengths between blue-red and green-red are indicated.

FIG. 2.

Cartoon of the 2D dissociation model. The red particle, originally bound to the central (blue) and outer (green) particles, can escape into the bulk. Below the line, the interaction strengths between blue-red and green-red are indicated.

Close modal

The switching at the minimum of the potential is done to avoid numerical problems with the rather steep repulsive part of the potential arising for the high values of ϵ1 required to bind the particle. Therefore, we chose to keep the repulsive part of the potential equal to the standard WCA potential, i.e., not scale with ϵ. In this way, only when the particles are in the attractive part of the potential, they contribute to the path action derivatives, making the evaluation of the path action more robust. Note that this does not change the generality of our approach.

Subsequently, we allow the other particles to move as well. To keep the cluster together, we apply an additional potential that binds the non-red particles to the central blue particle by an additional strong LJ interaction Vϵ3(r) with ϵ3 = 20. While this keeps the cluster intact, rearrangements are still possible. We avoid these by imposing an additional weak harmonic spring between neighboring particles with a spring constant k = 1 (this, of course, excludes the red particle).

The order parameter λ used for SRTIS is the center-to-center distance r between the red and blue particles. Stable states were defined as r < 21/6 and r > 2.5 for the initial and final states, respectively. (Note that beyond r = 2.5, the ligand cannot yet be considered escaped to the bulk. While it is possible to take this into account, see, e.g., Refs. 47 and 48, we assume here for simplicity that the ligand is dissociated.) Interfaces were positioned at λ = 0.15, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60, 0.70, 0.80, 1.00 with respect to the minimum distance rA = 21/6. We integrate the equations of motion using the EM integrator with a time step of dt = 0.001 and a friction of 2.5.

Finally, in order to show that our methodology easily extends to 3D systems, we consider a 3D version of the LJ-model, now considering 13 LJ particles as depicted in Fig. 3. Here, four outer green particles bind the ligand with an attraction ϵ2.

FIG. 3.

Cartoon of the 3D dissociation model. The escaping outer red particle is initially bound to the central particle (also red) and four outer green particles.

FIG. 3.

Cartoon of the 3D dissociation model. The escaping outer red particle is initially bound to the central particle (also red) and four outer green particles.

Close modal

We illustrate our methodology and system setup for several models, including a bistable potential (see the supplementary material). Here, we explore the 2D molecular isomerization reaction from Fig. 1 and the particle dissociation from a 2D cluster of seven Lennard-Jones (LJ) particles shown in Fig. 2, which can be viewed as analogous to the protein–ligand unbinding, an important biophysical process49,50 or dissociation for small nano-clusters.51–56 In addition, we present and discuss 3D LJ particle systems, which are remarkably similar to the 2D LJ particle systems, showing the robustness of the results.

1. Path ensembles

For the triatom model (see Sec. III), we perform SRTIS with in total 106 shooting and replica exchange moves. Acceptance ratios for the shooting move ranges from 0.4 for the first interface to 0.15 for the last interface. Replica exchange moves were accepted around 50%. Path lengths vary from 50 time steps for the first interface to a few thousand for the last interface. From this reference path ensemble, we estimate the rate constant for the unmodified potential as lnkAB ≈ − 9. (See the supplementary material for more details.)

Before embarking on the full problem of model parameter optimization, we first checked the path reweighting method by computing the derivative of the rate constant, i.e., the term in brackets in Eq. (15). In the supplementary material, we show that the rate constant derivatives are well-predicted by our path reweighting approach for both the triatom and a simpler diatomic system.

2. Optimization of the Lagrange function

Now, we can turn to the optimization of the molecular (force field) model parameters. The trimer system has two parameters that can both be tweaked to reproduce the imposed rate. To optimize the Lagrange function [Eq. (9)], we have to be able to compute it as a function of parameters not only at zero perturbation but also for the reweighted path ensemble W[xa]. Moreover, we need to be able to take its derivatives at nonzero perturbation.

To do so, we keep track of several variables for each path that appear in the polynomial expansion of the path action. To be precise, we compute the path action as
(22)
where the different u and ηf terms refer to the specific contribution to the energy in the first time slice 0 and the random number force (gradients), respectively [see Eq. (17)]. The f2 terms refer to the gradient square terms. Using these quantities, it is easy to compute the derivatives dw[a, ra, Δr]/da and dw[a, ra, Δr]/dr. From these, we compute the path ensemble averages ⟨ww′⟩W, ⟨wW, and ⟨w′⟩W. Finally, we construct the reweighted rate constant lnk[a, ra, Δr], DKL, and the Lagrange function L from Eqs. (11)(15).

For a particular set of values of a = 20 and req = 1.5, we computed the path ensemble using SRTIS. In Fig. 4, we show DKL and log rate constant lnk[a, ra, Δr] on top of each other as a function of Δa and Δr. As expected, DKL is minimal for the reference value Δa = 0, Δr = 0, that is, at zero perturbation.

FIG. 4.

Contour plot of DKL (shaded contour plot) for the triatomic system as a function of Δr, Δa for a reference path ensemble obtained at req = 1.5, a = 20 kBT. Note that the minimum value of DKL is at the origin, as expected. On top of the shaded contour plot is a green line contour plot of the predicted rate constant lnkAB for the settings. Several numerical values of the contours are indicated. The graphical solution to optimization problem is to pick an imposed green contour and minimize the DKL along this contour.

FIG. 4.

Contour plot of DKL (shaded contour plot) for the triatomic system as a function of Δr, Δa for a reference path ensemble obtained at req = 1.5, a = 20 kBT. Note that the minimum value of DKL is at the origin, as expected. On top of the shaded contour plot is a green line contour plot of the predicted rate constant lnkAB for the settings. Several numerical values of the contours are indicated. The graphical solution to optimization problem is to pick an imposed green contour and minimize the DKL along this contour.

Close modal

The reweighted rate constant is identical to the predicted rate constant from the prior path ensemble for Δa = 0, Δr = 0, but clearly varies if the system is perturbed.

Note that the rate contour lines in Fig. 4 are roughly parallel to the Δreq axis, implying that, for a given spring constant a + Δa, the rate is rather insensitive to the choice of equilibrium distance. At the transition state, the bond between atoms 1 and 2 is quite stretched, whereas the other two bonds are compressed, which explains why the spring constant has greater influence on the rate than the equilibrium distance. In addition, DKL is less sensitive to Δa than to Δreq, indicating that it is probably better to adjust a than req.

When we change the rate constant from the observed value of lnkAB = −9 to the new value lnkexp, we need to compute the derivatives and optimize the Lagrange function. To do so, we apply the method of Ref. 57. This iterative method starts at certain initial values and slowly converges to the solution. The solution for the most optimal set of {Δa, Δr} is shown in Table I. In Table I, we show several sets of optimal solutions as a function of imposed kABexp. The first set shows the optimal solution for both Δa and Δr. The other two sets we optimized for only one of the two parameters, and set the other to zero. From the value of DKL (which is identical to the negative of the Lagrange function L when the constraints are obeyed), it is clear that this is always less optimal than the two parameter solution, thus illustrating the need for multi-parameter force field optimization. In addition, note that the further the imposed lnkABexp is from −9, the more L and, thus, DKL deviate.

TABLE I.

Results of the optimization procedure for different imposed log rates. The first set allows both parameters to vary; the second set varies only Δa, and the third varies only Δr.

lnkABexpμΔaΔrDKL
−8 −0.162 56 −3.034 52 −0.002 90 0.083 18 
−12 0.275 89 8.439 88 0.014 61 0.454 86 
−16 0.303 01 19.057 9 0.044 08 1.685 25 
−20 0.280 86 29.106 6 0.066 00 2.879 69 
−8 −0.164 47 −3.064 28 0.084 08 
−12 0.317 65 8.842 44 0.489 06 
−16 1.000 39 19.867 5 2.981 42 
−20 0.572 78 28.773 4 6.958 17 
−12 0.587 05 0.258 99 4.181 48 
−16 0.286 30 0.385 95 5.617 00 
−20 0.547 23 0.490 97 6.965 69 
lnkABexpμΔaΔrDKL
−8 −0.162 56 −3.034 52 −0.002 90 0.083 18 
−12 0.275 89 8.439 88 0.014 61 0.454 86 
−16 0.303 01 19.057 9 0.044 08 1.685 25 
−20 0.280 86 29.106 6 0.066 00 2.879 69 
−8 −0.164 47 −3.064 28 0.084 08 
−12 0.317 65 8.842 44 0.489 06 
−16 1.000 39 19.867 5 2.981 42 
−20 0.572 78 28.773 4 6.958 17 
−12 0.587 05 0.258 99 4.181 48 
−16 0.286 30 0.385 95 5.617 00 
−20 0.547 23 0.490 97 6.965 69 

Having shown that we can apply our framework to model systems for molecular (isomerization) reactions, we explore in this section a slightly more elaborate system, which also has interesting physical properties, namely, particle dissociation from a cluster of LJ particles. Such a process is reminiscent of ligand unbinding, which is an important problem in biophysics.49,50 Moreover, it can be seen as dissociation for small nano-clusters.51–56 

We describe this dissociation transition using a model of Lennard-Jones (LJ)-like particles. We first consider a seven particle setup in two dimensions, indicated in Fig. 2. In the simplest instance of this model, all particles are kept fixed except the red particle, which can dissociate from the cluster and diffuse away.

Here, the only important interactions are that of the red particle with the six other particles. In the second model, we allow the other particles to move as well, resulting in a fluctuating cluster of six particles, which can expel the red particle. During the dissociation, the green particles can move closer to each other, gaining in entropy. We can interpret this simple model as representing a ligand unbinding reaction, e.g., of a protein, in which the protein binding pocket slightly rearranges upon (un)binding. We stress that this is just an analogy.

Finally, in order to show that our methodology easily extends to 3D systems, we consider a 3D version of the model with 13 LJ particles.

1. Path ensembles

We first start exploring the fixed seven-particle model. Setting the interaction with the central particle to ϵ1 = 10 and with the green outer particle to ϵ2 = 1, we sample unbinding transitions using SRTIS (see Sec. III).

In total, we perform 105 shooting and replica exchange moves. Acceptance ratios for the shooting move ranges from 0.4 for the first interface to 0.15 for the last interface. Replica exchange moves were accepted around 50%. Path lengths vary from 50 time steps for the first interface to a few thousand for the last interface. The crossing probability of the final interface (obtained from WHAM) is lnP(λB|λ1) = −6.567. The flux of the first interface is ϕ = 0.003 550. The total rate constant is thus kAB = 5 × 10−6 in units of time steps. Note that, when optimizing the rate constant, we assume that the fluxes are not altered much (as above), and we only have to consider the crossing probability.

We also performed similar runs for the flexible 2D cluster.

We can now use our framework to look for the best set of new parameters ϵ1=ϵ1+Δϵ1 and ϵ2=ϵ2+Δϵ2. The path action is given by
(23)
where u0, ηf-functions, and f-functions involve the potential energy of the first slice and the gradient of the potential energy; cf. Eq. (22). Note that, as above, only the attractive part of the potential has to be taken into account.

2. Caliber/DKL and rate constant predictions

Our framework then gives the rate constant predictions for the altered parameters, as well as the caliber or DKL. Figure 5 represents both predictions as contour plots. The green contours delimit the (log of the) dissociation rate constant (in fact, the crossing probability) predictions, while the blue-ochre contours depict DKL. Several conclusions follow from Fig. 5. The first is that for the fixed cluster, the DKL contours show some anti-correlation in the two parameters. This indicates that the path ensemble is least disturbed when an increase in ϵ2 is compensated by a decrease of ϵ1. The DKL contours also are slightly asymmetric, showing a larger sensitivity to negative values of ϵ2. As this parameter is set to the relatively low value of ϵ2 = 1, reduction below Δϵ2 < −1 will, therefore, reverse the sign of the attractive interaction, which, of course, completely alters the systems. The flexible cluster in Fig. 5(b) shows no such anti-correlation in DKL, indicating that the compensating effect has largely disappeared.

FIG. 5.

Contour plot of DKL (shaded contour plot) for the unbinding transition in the seven-particle system as a function of Δϵ1, Δϵ2 for a reference path ensemble. Note that the minimum value of DKL is at the origin. On top of the shaded contour plot is a green contour plot of the predicted rate constant lnkAB for the settings. Several numerical values of the rate contours are indicated in white. The graphical solution to optimization problem is to pick an imposed green contour and minimize DKL along this contour. The red points depict this optimal solution. (a) Fixed cluster, reference path ensemble obtained at ϵ1 = 10, ϵ2 = 1.0kBT. (b) Flexible cluster, reference path ensemble obtained at ϵ1 = 12, ϵ2 = 2.0kBT. (c) Flexible cluster, reference path ensemble obtained at ϵ1 = 15, ϵ2 = 5.0kBT.

FIG. 5.

Contour plot of DKL (shaded contour plot) for the unbinding transition in the seven-particle system as a function of Δϵ1, Δϵ2 for a reference path ensemble. Note that the minimum value of DKL is at the origin. On top of the shaded contour plot is a green contour plot of the predicted rate constant lnkAB for the settings. Several numerical values of the rate contours are indicated in white. The graphical solution to optimization problem is to pick an imposed green contour and minimize DKL along this contour. The red points depict this optimal solution. (a) Fixed cluster, reference path ensemble obtained at ϵ1 = 10, ϵ2 = 1.0kBT. (b) Flexible cluster, reference path ensemble obtained at ϵ1 = 12, ϵ2 = 2.0kBT. (c) Flexible cluster, reference path ensemble obtained at ϵ1 = 15, ϵ2 = 5.0kBT.

Close modal

The second observation is that the green rate constant contours are roughly linear with a negative slope. This indicates that both increasing ϵ1 and ϵ2 have similar effects for a large variety of values. For a similar decrease in the rate constant, one could choose to increase either ϵ1 or ϵ2. Note that the slope of the contour is roughly −0.5, as there are two outer particles (green) and only one central particle, so changing their interaction strengths ϵ2 has, therefore, twice the effect. This observation indicates that ϵ1 and ϵ2 are more or less interchangeable, and one can choose many combinations for arriving at the same rate constant.

In case of a flexible cluster, as shown in Fig. 5(b), the predicted rate constant contours become nonlinear. In Fig. 5(c), we show a case where the outer and central particles have a stronger attraction.

3. Optimization of the Lagrange function

Next, we would like to find the optimal choice for the force field parameters ϵ1,2. As before, optimizing the parameters for a given rate constant amounts to following a green rate constant contour until DKL is minimal. Using the optimization procedure outlined in Sec. IV A 2, we arrive at a prediction given by the red points in Fig. 5. This prediction corresponds, thus, to the most optimal force field parameters, which minimizes the change in the path ensemble with respect to the original force field.

In all cases in Fig. 5, a trivial observation is that the curve passes through the origin, as there, the original force field reproduces the original rate constant most optimally. For the fixed cluster in Fig. 5(a), the optimal curve is roughly vertical, with only a relatively small deviation in ϵ1. This means that that whether enhancing or reducing the rate, it is always better to change ϵ2 rather than ϵ1. Remarkably, this trend even holds for Δϵ2 < −1, which changes the interaction from attraction to repulsion. This is likely a consequence of the immobility of the particles in this case.

For the flexible cluster cases in Figs. 5(b) and 5(c), the most striking feature is perhaps the L-shaped curve, indicating an asymmetry concerning reducing the rate or enhancing the rate. For a rate change of about one natural log unit, the red curve is roughly diagonal, that is, ϵ1,2 are contributing to the rate constant in equal proportions. However, when reducing the rate constant further, the curve bends over more vertically, indicating that it is better to change ϵ2 instead of ϵ1. In contrast, when increasing the rate, the curve bends over horizontally, indicating that it is now better to change ϵ1 instead of ϵ2. This conclusion also holds for Fig. 5(c), where ϵ1 = 15 and ϵ2 = 5.

We can interpret this behavior as follows. The best parameters are those that perturb the path ensemble as little as possible. When reducing the rate constant, it is better to adjust the ϵ2 parameter than the ϵ1 parameter, even if they both can lead to the same rate constant predictions. This can be interpreted by realizing that changing the central particle interaction ϵ1 will alter the entire reweighted path ensemble: all interface ensembles will be affected. In contrast, the change of ϵ2 will mostly affect only the interface ensembles further out. Since distant interfaces have a (much) lower weight in the ensemble, the perturbation, as measured by DKL/caliber will be smaller. In contrast, for increasing the rate, changing ϵ2 will not get you very far, and a substantial change of ϵ1 is also necessary. These effects can be shown by plotting the total crossing probability P(λ|λ0) for the different settings in Fig. 6. For the curves for Δϵ1 = 4 and Δϵ2 = 2, the final rate constant predictions (crossing probs) are almost equal, but the intermediate crossing probability and, hence, the path ensembles are very different. Clearly, the Δϵ2 = 2 case is much closer to the original dataset (black solid line), especially in the beginning of the crossing probability where the path ensemble is most dominant.

FIG. 6.

Reweighted crossing probabilities for the flexible 2D cluster obtained at ϵ1 = 12, ϵ2 = 2.0kBT. The black solid line is the reference crossing probability. The red and blue curves are reweighed crossing probabilities changing either ϵ1 (solid) or ϵ2 (dashed). Clearly, changing ϵ1 has most effect on the beginning of the crossing probability curves, while changing ϵ2 in the positive direction has more influence on the latter part, also compared to changing in the negative direction, thus explaining the asymmetry in the binding problem.

FIG. 6.

Reweighted crossing probabilities for the flexible 2D cluster obtained at ϵ1 = 12, ϵ2 = 2.0kBT. The black solid line is the reference crossing probability. The red and blue curves are reweighed crossing probabilities changing either ϵ1 (solid) or ϵ2 (dashed). Clearly, changing ϵ1 has most effect on the beginning of the crossing probability curves, while changing ϵ2 in the positive direction has more influence on the latter part, also compared to changing in the negative direction, thus explaining the asymmetry in the binding problem.

Close modal

4. Comparison with rate constant predictions

While the prediction of the optimal parameters is already providing valuable insight, the ultimate goal is to establish a better force field model. To assess the quality of the predictions, we can compare the predicted rate constants with independent calculations at these different force field parameters. Figure 7 shows this comparison for the flexible 2D unbinding. The agreement is good, especially up to two kBT from the reference point (ϵ1 = 12, ϵ2 = 2.0).

FIG. 7.

Contour plot of the (logarithm of the) rate constant predictions (blue contours) based on the reference ensemble of the flexible 2D cluster obtained at ϵ1 = 12, ϵ2 = 2.0kBT (gray point). These blue rate contours, identical to those in Fig. 5(b), are one natural log unit apart. The predicted rate contours are compared with interpolated contours from a set of validated rate constants, computed for parameter settings indicated by the green points in the plot. Note that the red contours trace the blue contours closely, up to variations of roughly 2kBT, thus validating our approach.

FIG. 7.

Contour plot of the (logarithm of the) rate constant predictions (blue contours) based on the reference ensemble of the flexible 2D cluster obtained at ϵ1 = 12, ϵ2 = 2.0kBT (gray point). These blue rate contours, identical to those in Fig. 5(b), are one natural log unit apart. The predicted rate contours are compared with interpolated contours from a set of validated rate constants, computed for parameter settings indicated by the green points in the plot. Note that the red contours trace the blue contours closely, up to variations of roughly 2kBT, thus validating our approach.

Close modal

Note that Fig. 7 compares just the reweighted crossing probability part of the rate to validate the rate prediction of our method. Thus, in this comparison, we have not included possible differences in flux factor. In this particular example, we verified that this difference is relatively small (≲20%), which would translate in a small correction on the logarithmic scale of less than 0.18, but in general, we encourage to recalculate the flux factor through the first interface. This is usually not the most expensive part.

5. Physical insight from the optimization

These results also reveal several physical aspects: While the central particle is more important for increasing the rate, outer particles are more important for decreasing the rate, as the entire path ensemble is mostly influenced by the central particle’s interaction change, whereas outer particles only affect the barrier region. Outer particles, therefore, are prime targets for modulating when reducing the dissociation rate constants. Translating to real proteins this amounts to engineering mutations or post-translational modifications58 of binding pocket residues close to the surface or modulation of the ligand chemistry in order to bind better to encounter complex sites at the surface. Of course, this extrapolation to realistic systems is currently no more than a hypothesis, which requires further testing. Yet, the general principle is likely to be robust.

In Fig. 8, we plot the results for the 3D systems. They are remarkably similar to the 2D systems, showing the robustness of the results. One striking difference is the slope of the green rate constant contours, which is now −0.25 (for the flexible 3D case), as we now have four outer particles. Hence, a change in ϵ2 has a fourfold effect on the rate constant. In addition, the DKL contours appear different and a bit more skewed compared to the more circular ones in the 2D case. Remarkably, the optimal solution for the parameters (the red curves) looks again qualitatively similar to those of the 2D cases. Only for very strong interaction Δϵ2 > 2 of the outer particles, shown in Figs. 8(b) and 8(c), the red curve bends over.

FIG. 8.

Contour plot of DKL (shaded contour plot) for the unbinding transition in the 3D 13-particle system as a function of Δϵ1, Δϵ2 for a reference path ensemble. Note that the minimum value of DKL is at the origin. On top of the shaded contour plot is a green contour plot of the predicted rate lnkAB for the settings. Several numerical values of the contours are indicated in white. The graphical solution to the optimization problem is to pick an imposed green contour and minimize DKL along this contour. The red points depict this optimal solution. (a) Fixed cluster, reference path ensemble obtained at ϵ1 = 12, ϵ2 = 1.0kBT. (b) Flexible cluster, reference path ensemble obtained at ϵ1 = 12, ϵ2 = 2.0kBT. (c) Flexible cluster, reference path ensemble obtained at ϵ1 = 20, ϵ2 = 1.0kBT.

FIG. 8.

Contour plot of DKL (shaded contour plot) for the unbinding transition in the 3D 13-particle system as a function of Δϵ1, Δϵ2 for a reference path ensemble. Note that the minimum value of DKL is at the origin. On top of the shaded contour plot is a green contour plot of the predicted rate lnkAB for the settings. Several numerical values of the contours are indicated in white. The graphical solution to the optimization problem is to pick an imposed green contour and minimize DKL along this contour. The red points depict this optimal solution. (a) Fixed cluster, reference path ensemble obtained at ϵ1 = 12, ϵ2 = 1.0kBT. (b) Flexible cluster, reference path ensemble obtained at ϵ1 = 12, ϵ2 = 2.0kBT. (c) Flexible cluster, reference path ensemble obtained at ϵ1 = 20, ϵ2 = 1.0kBT.

Close modal

In this paper, we have introduced a Continuum Path Ensemble Maximum Caliber (CoPE MaxCal) based method to optimize force field parameters in order to impose an dynamical constraint, in particular a rate constant in a complex molecular transition. Without any computationally expensive recalculation of the kinetics, the method yields the optimal change in the parameters that leads to an imposed rate constant after the path reweighting, while making the least possible perturbation to the prior trajectory ensemble as measured by the caliber or KL divergence.

We show that the path reweighting leads to meaningful prediction of the rate constants, which agrees with the direct calculation for the new force field, even up to more than an order of magnitude suppression of the rate constant in the case of the unbinding model. While in this work we develop the methodology and applied it only to simple models, we expect the method to be generally applicable.

Our method is especially aimed at force fields with multiple parameters. If there are only one or two parameters to change in a system, such as in a monoatomic Lennard-Jones fluid, adding kinetic constraints might overdetermine the parameter space, prohibiting a sensible solution. Instead, we focus on the other extreme, that of underdetermined systems, where the best solution should be found among many candidate parameters.

Besides a corrected force field, we find that the optimization for multiple parameters allows for an interpretation that gives physical insights in which parts of the path ensemble are affected by the parameter changes. Thus, the method provides a powerful tool to inspect rare event trajectory ensembles and extract valuable information from them. Trajectory ensembles provide us not only with mechanisms, rate constants, and transition states for rare event dynamics, but they can also inform on how such properties change with the model parameters. Importantly, they can predict the change in rate constants and, moreover, point us in the direction of parameters that least affect the original path ensembles. This gives the possibility to extend the optimization of force fields that are already optimized for thermodynamics to kinetics and, moreover, gives us pointers to what parts of the systems are most sensitive and thus most sensible to adapt or mutate.

One concern is how altering the potential might change the thermodynamics (or other static properties). When the force field perturbation only affects the barrier region between the two stable states and does influence the stable states themselves, this effect will be minimal (see the trimer example). However, if, for instance, changing force field parameters makes one of the states relatively more stable, this has the (most likely undesired) side-effect of changing the free energy difference. This issue might be resolved by including an additional constraint, e.g., by demanding that the forward and backward rates are modified with the same factor so that the free energy difference is unchanged in the optimized force field. Note that other thermodynamic or dynamic properties not targeted in the optimization might also be affected and might need an additional reevaluation.

Looking toward the future, we consider several directions for further research. First of all, we have used only the simplest of path actions, i.e., the Onsager–Machlup action for the EM integrator. It is known that EM is not a great integrator, and an obvious extension of the methods would be to extend it to underdamped Langevin integrators.23 

Second, we only have considered up to two parameters. We envision that extension to many parameters is, in principle, straightforward, but some bookkeeping issues arise, e.g., keeping track of all the cross terms in the square gradient. Nevertheless, the method should be equally effective for optimization in higher dimensions.

Third, the method can and should be applied in a realistic force field and included in a MD engine. This is non-trivial since the evaluation of the path action should be done in the integrator. Nevertheless, we believe that this is a worthwhile and necessary research direction to make our approach useful. One of the issues is the functional form of the potential perturbation. To start, one could try simple functional forms, such as dihedral terms, and Lennard-Jones-like interactions. In a more advanced exploration, one could try to use machine learning to learn the functional form of the perturbation.

Fourth, the methods can be pushed beyond the standard atomistic force fields into coarse-grained force fields, for which rate constants are notoriously difficult to reproduce. Such an approach would go beyond a simple adjustment of the diffusion constant or friction.

Fifth, the method should be also applicable to other dynamical observables, such as diffusion and viscosity. In fact, all observables based on time-correlation functions of a trajectory ensemble should be treatable.

Sixth, parameters beyond the force field might be optimized, including the diffusion, effective mass, and even molecular topology. The holy grail in chemistry, material science, and biology is to accurately determine the link between microscopic degrees of freedom, such as the chemistry, the microscopic mechanisms, transition rates, and structure populations with macroscopic thermodynamic or kinetic properties related to the function. Having an efficient algorithm that is able to provide the link between kinetic/thermodynamic macroscopic properties and force field parameters, reporting on the underlying physics could be potentially very useful in protein engineering and design of small molecules and of materials with tunable macroscopic properties.49–56 Examples of such design applications are coarse-grained molecular force fields where the sole parameters are the identity of the aminoacid, which are a.o. employed in protein aggregation and protein folding.59,60 For such problems, one could apply our approach to suggest changes in the aminoacid sequence in the direction of promoting or depressing the rate constant of, e.g., aggregation, protein folding, or liquid/liquid phase separation. In this way, one could optimize the design of the novel (bio)material to be further tested in the wet lab, thus bypassing expensive and timely optimization protocols in the wet lab.

Moreover, our method could be particularly useful in chemical reactions by correcting the kinetics of imperfect while computationally cost-effective calculations by reactive force fields61 instead of doing calculations with accurate albeit computationally expensive density functional theory potentials. Another utility is the generality of the method, giving the possibility to combine it with deep learning potentials that are known to be very flexible and have already been used when calibrating force fields for thermodynamics in material science.5,62

Finally, our methodology of constrained optimization based on the maximum entropy principle is not limited to molecular systems and could even potentially contribute to various problems of time-series, where the underlying dynamics can be modeled as stochastic, undergoing rare events and with a potential energy function reporting on interactions of agents, such as in modeling of stock-market, fluid-flows, and health-pandemics.63–65 

The supplementary material comprises an appendix with several theory sections containing detailed explanations and derivations mentioned in the text and results and discussion sections that are provided using the rate constant derivative as a test, applied to a diatomic and triatomic system.

The authors thank Christoph Dellago and Pieter Rein ten Wolde for useful feedback and helpful discussions. B.G.K. acknowledges the funding by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through Grant No. SFB 1449–431232613, sub-project C02. The ideas for this contribution were initiated during the workshop “Accelerating the Understanding of Rare Events” (6-10 September 2021) at the Lorentz Center in Leiden, NL.

The authors have no conflicts to disclose.

Peter G. Bolhuis: Conceptualization (equal); Data curation (lead); Investigation (lead); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). Z. Faidon Brotzakis: Conceptualization (equal); Investigation (supporting); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). Bettina G. Keller: Conceptualization (equal); Investigation (supporting); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
J. A.
Harrison
,
J. D.
Schall
,
S.
Maskey
,
P. T.
Mikulski
,
M. T.
Knippenberg
, and
B. H.
Morrow
,
Appl. Phys. Rev.
5
,
031104
(
2018
).
2.
S.
Piana
,
K.
Lindorff-Larsen
, and
D. E.
Shaw
,
Biophys. J.
100
,
L47
(
2011
).
3.
D.
van der Spoel
,
Curr. Opin. Struct. Biol.
67
,
18
(
2021
).
4.
A.
Tkatchenko
and
M.
Scheffler
,
Phys. Rev. Lett.
102
,
073005
(
2009
).
5.
L.
Zhang
,
J.
Han
,
H.
Wang
,
R.
Car
, and
E.
Weinan
,
Phys. Rev. Let.
120
,
143001
(
2018
).
6.
L.
Bonati
and
M.
Parrinello
,
Phys. Rev. Lett.
121
,
265701
(
2018
).
7.
A.
Singraber
,
J.
Behler
, and
C.
Dellago
,
J. Chem. Theory Comput.
15
,
1827
(
2019
).
8.
J.
Köfinger
,
L. S.
Stelzl
,
K.
Reuter
,
C.
Allande
,
K.
Reichel
, and
G.
Hummer
,
J. Chem. Theory Comput.
15
,
3390
(
2019
).
9.
A.
Varela-Rial
,
I.
Maryanow
,
M.
Majewski
,
S.
Doerr
,
N.
Schapin
,
J.
Jiménez-Luna
, and
G.
De Fabritiis
,
J. Chem. Inf. Model.
62
,
225
(
2022
).
10.
Y.
Qiu
,
D. G. A.
Smith
,
S.
Boothroyd
,
H.
Jang
,
D. F.
Hahn
,
J.
Wagner
,
C. C.
Bannan
,
T.
Gokey
,
V. T.
Lim
,
C. D.
Stern
,
A.
Rizzi
,
B.
Tjanaka
,
G.
Tresadern
,
X.
Lucas
,
M. R.
Shirts
,
M. K.
Gilson
,
J. D.
Chodera
,
C. I.
Bayly
,
D. L.
Mobley
, and
L.-P.
Wang
,
J. Chem. Theory Comput.
17
,
6262
(
2021
).
11.
K.
Lindorff-Larsen
,
P.
Maragakis
,
S.
Piana
,
M. P.
Eastwood
,
R. O.
Dror
, and
D. E.
Shaw
,
PLoS One
7
,
e32131
(
2012
).
12.
F.
Vitalini
,
A. S.
Mey
,
F.
Noé
, and
B. G.
Keller
,
J. Chem. Phys.
142
,
084101
(
2015
).
13.
A.
Cesari
,
S.
Bottaro
,
K.
Lindorff-Larsen
,
P.
Banáš
,
J.
Šponer
, and
G.
Bussi
,
J. Chem. Theory Comput.
15
,
3425
(
2019
).
14.
G.
Tesei
,
T. K.
Schulze
,
R.
Crehuet
, and
K.
Lindorff-Larsen
,
Proc. Natl. Acad. Sci. U. S. A.
118
,
e2111696118
(
2021
).
15.
T.
Fröhlking
,
M.
Bernetti
,
N.
Calonaci
, and
G.
Bussi
,
J. Chem. Phys.
152
,
230902
(
2020
).
16.
J.
Köfinger
and
G.
Hummer
,
Eur. Phys. J. B
94
,
245
(
2021
).
17.
D. C.
Rose
,
J. F.
Mair
, and
J. P.
Garrahan
,
New J. Phys.
23
,
013013
(
2021
).
18.
A.
Das
,
D. C.
Rose
,
J. P.
Garrahan
, and
D. T.
Limmer
,
J. Chem. Phys.
155
,
134105
(
2021
).
19.
A.
Das
,
B.
Kuznets-Speck
, and
D. T.
Limmer
,
Phys. Rev. Lett.
128
,
028005
(
2022
).
20.
Z. F.
Brotzakis
,
M.
Vendruscolo
, and
P. G.
Bolhuis
,
Proc. Natl. Acad. Sci. U. S. A.
118
,
e2012423118
(
2021
).
21.
L.
Donati
,
C.
Hartmann
, and
B. G.
Keller
,
J. Chem. Phys.
146
,
244112
(
2017
).
22.
L.
Donati
and
B. G.
Keller
,
J. Chem. Phys.
149
,
072335
(
2018
).
23.
S.
Kieninger
and
B. G.
Keller
,
J. Chem. Phys.
154
,
094102
(
2021
).
24.
P. G.
Bolhuis
,
Z. F.
Brotzakis
, and
M.
Vendruscolo
,
Eur. Phys. J. B
94
,
188
(
2021
).
25.
P. G.
Bolhuis
,
D.
Chandler
,
C.
Dellago
, and
P. L.
Geissler
,
Annu. Rev. Phys. Chem.
53
,
291
(
2002
).
26.
C.
Dellago
,
P. G.
Bolhuis
, and
P. L.
Geissler
,
Advances in Chemical Physics
(
John Wiley & Sons, Ltd.
,
2002
), Vol.
123
, p.
1
.
27.
W.-N.
Du
and
P. G.
Bolhuis
,
J. Chem. Phys.
139
,
044105
(
2013
).
28.
R. J.
Allen
,
P. B.
Warren
, and
P. R.
Ten Wolde
,
Phys. Rev. Lett.
94
,
018104
(
2005
).
29.
D. M.
Zuckerman
and
L. T.
Chong
,
Annu. Rev. Biophys.
46
,
43
(
2017
).
30.
B.
Leimkuhler
and
C.
Matthews
,
Interdiscip. Appl. Math.
39
,
261
(
2015
).
31.
B.
Øksendal
,
Stochastic Differential Equations
(
Springer
,
2003
), pp.
65
84
.
32.
P. E.
Kloeden
and
E.
Platen
,
Numerical Solution of Stochastic Differential Equations
, 1st ed. (
Springer
,
Berlin, Heidelberg
,
1992
).
33.
M. J.
Hazoglou
,
V.
Walther
,
P. D.
Dixit
, and
K. A.
Dill
,
J. Chem. Phys.
143
,
051104
(
2015
).
34.
D.
Chandler
,
J. Chem. Phys.
68
,
2959
(
1978
).
35.
A. K.
Faradjian
and
R.
Elber
,
J. Chem. Phys.
120
,
10880
(
2004
).
36.
R. J.
Allen
,
D.
Frenkel
, and
P. R.
Ten Wolde
,
J. Chem. Phys.
124
,
024102
(
2006
).
37.
P.
Tiwary
and
M.
Parrinello
,
Phys. Rev. Lett.
111
,
230602
(
2013
).
38.
Z. F.
Brotzakis
and
P. G.
Bolhuis
,
J. Chem. Phys.
151
,
174111
(
2019
).
39.
P. G.
Bolhuis
and
C.
Dellago
,
Rev. Comput. Chem.
27
,
111
(
2010
).
40.
O.
Valsson
,
P.
Tiwary
, and
M.
Parrinello
,
Annu. Rev. Phys. Chem.
67
,
159
(
2016
).
41.
T.
van Erp
,
D.
Moroni
, and
P.
Bolhuis
,
J. Chem. Phys.
118
,
7762
(
2003
).
42.
J.
Rogal
,
W.
Lechner
,
J.
Juraszek
,
B.
Ensing
, and
P. G.
Bolhuis
,
J. Chem. Phys.
133
,
174109
(
2010
).
43.
C.
Dellago
and
P. G.
Bolhuis
,
Mol. Simul.
30
,
795
(
2004
).
44.
P. G.
Bolhuis
and
G.
Csányi
,
Phys. Rev. Lett.
120
,
250601
(
2018
).
45.
J. D.
Weeks
,
D.
Chandler
, and
H. C.
Andersen
,
J. Chem. Phys.
54
,
5237
(
1971
).
46.
A. M.
Ferrenberg
and
R. H.
Swendsen
,
Phys. Rev. Lett.
63
,
1195
(
1989
).
47.
A.
Vijaykumar
,
P. G.
Bolhuis
, and
P. R.
ten Wolde
,
Faraday Discuss.
195
,
421
(
2016
).
48.
A.
Vijaykumar
,
P. R.
ten Wolde
, and
P. G.
Bolhuis
,
J. Chem. Phys.
147
,
184108
(
2017
).
49.
R. A.
Copeland
,
D. L.
Pompliano
, and
T. D.
Meek
,
Nat. Rev. Drug Discovery
5
,
730
(
2006
).
51.
F.
Romano
and
F.
Sciortino
,
Nat. Mater.
10
,
171
(
2011
).
52.
J. S.
Oh
,
S.
Lee
,
S. C.
Glotzer
,
G. R.
Yi
, and
D. J.
Pine
,
Nat. Commun.
10
,
3936
(
2019
).
53.
Z.
Wang
,
Z.
Wang
,
J.
Li
,
S. T. H.
Cheung
,
C.
Tian
,
S. H.
Kim
,
G. R.
Yi
,
E.
Ducrot
, and
Y.
Wang
,
J. Am. Chem. Soc.
141
,
14853
(
2019
).
54.
S.
Zhang
,
R. G.
Alberstein
,
J. J.
De Yoreo
, and
F. A.
Tezcan
,
Nat. Commun.
11
,
3770
(
2020
).
55.
R. S.
Fisher
and
S.
Elbaum-Garfinkle
,
Nat. Commun.
11
,
4628
(
2020
).
56.
M. J.
Mitchell
,
M. M.
Billingsley
,
R. M.
Haley
,
M. E.
Wechsler
,
N. A.
Peppas
, and
R.
Langer
,
Nat. Rev. Drug Discovery
20
,
101
(
2021
).
57.
J. C.
Platt
and
A. H.
Barr
, in
NIPS’87: Proceedings of the 1987 International Conference on Neural Information Processing Systems
(
MIT Press
,
1987
), pp.
612
621
.
58.
D. L.
Winter
,
H.
Iranmanesh
,
D. S.
Clark
, and
D. J.
Glover
,
ACS Synth. Biol.
9
,
2132
(
2020
).
59.
G. L.
Dignon
,
W.
Zheng
,
Y. C.
Kim
,
R. B.
Best
, and
J.
Mittal
,
PLoS Comput. Biol.
14
,
e1005941
(
2018
).
60.
R. B.
Best
and
G.
Hummer
,
Proc. Natl. Acad. Sci. U. S. A.
113
,
3263
(
2016
).
61.
T. P.
Senftle
,
S.
Hong
,
M. M.
Islam
,
S. B.
Kylasa
,
Y.
Zheng
,
Y. K.
Shin
,
C.
Junkermeier
,
R.
Engel-Herbert
,
M. J.
Janik
,
H. M.
Aktulga
,
T.
Verstraelen
,
A.
Grama
, and
A. C.
Van Duin
,
NPJ Comput. Mater.
2
,
15011
(
2016
).
62.
J.
Rogal
,
E.
Schneider
, and
M. E.
Tuckerman
,
Phys. Rev. Lett.
123
,
245701
(
2019
).
63.
J.-P.
Bouchaud
and
R.
Cont
,
Eur. Phys. J. B
6
,
543
(
1998
).
64.
D. C.
Haworth
and
S. B.
Pope
,
Phys. Fluids
29
,
387
(
1986
).
65.
L. D.
Jones
,
M.
Magdon-Ismail
,
L.
Mersini-Houghton
, and
S.
Meshnick
, arXiv:2008.10530 (
2020
), p.
1
.
Published open access through an agreement with Universiteit van Amsterdam Van 't Hoff Institute for Molecular Sciences

Supplementary Material