Empirical force fields employed in molecular dynamics simulations of complex systems are often optimized to reproduce experimentally determined structural and thermodynamic properties. In contrast, experimental knowledge about the interconversion rates between metastable states in such systems is hardly ever incorporated in a force field due to a lack of an efficient approach. Here, we introduce such a framework based on the relationship between dynamical observables, such as rate constants, and the underlying molecular model parameters using the statistical mechanics of trajectories. Given a prior ensemble of molecular dynamics trajectories produced with imperfect force field parameters, the approach allows for the optimal adaption of these parameters such that the imposed constraint of equally predicted and experimental rate constant is obeyed. To do so, the method combines the continuum path ensemble maximum caliber approach with path reweighting methods for stochastic dynamics. When multiple solutions are found, the method selects automatically the combination that corresponds to the smallest perturbation of the entire path ensemble, as required by the maximum entropy principle. To show the validity of the approach, we illustrate the method on simple test systems undergoing rare event dynamics. Next to simple 2D potentials, we explore particle models representing molecular isomerization reactions and protein–ligand unbinding. Besides optimal interaction parameters, the methodology gives physical insights into what parts of the model are most sensitive to the kinetics. We discuss the generality and broad implications of the methodology.

## I. INTRODUCTION

Molecular dynamics (MD) is a powerful tool that can, in principle, rigorously fully characterize molecular structures, thermodynamics, and kinetics. This scope is delimited by the accuracy of the atomistic molecular dynamics force field. Coarse-graining quantum mechanical molecular interactions in force fields enables modeling many-body molecular systems; yet, this process transfers errors associated with the model selection bias and parameterization based on sparse experimental datasets,^{1,2} such as vaporization data for small fragments.^{1,3} Some of these limitations are lifted by modern approaches employing open science and machine learning to coarse-grain electronic degrees of freedom^{4–7} or by constraining to experimental nuclear-magnetic resonance (NMR) data.^{8–10}

Current force fields can reach experimental accuracy in determining highly populated structures and their thermodynamics,^{11} but can differ vastly in predictions of kinetic properties.^{2,12} This systematic error is caused by the fact that during force field parameterization, ensemble averages are very sensitive to the free energy differences of the metastable states emerging from the potential energy function, but not so much to the free energies of the barriers between these states. Thus, by parameterizing against thermodynamic ensemble averages, it is unlikely to obtain kinetically accurate force fields. Several systematic approaches to optimize force field parameters aiming at reproducing thermodynamics have been developed throughout the years, e.g., force field parameterization for large biomolecules against solution NMR data^{13–15} or against distances between labeled sites.^{16} However, so far, there is no systematic strategy to optimize the force field parameters for the underlying kinetics such that an improved force field would match the target (experimental) interconversion rates directly, although procedures for model optimization for dynamical trajectories have been recently proposed.^{17–19} Such force fields, capable of representing experimental kinetics, would have a large impact on molecular dynamics since they would accurately report on transition state structures and populations of processes that are impossible to resolve by experiments,^{20} thus offering a leverage in, for instance, protein regulation and design (e.g., by mutations) or design of materials with specified kinetic properties.

In this work, we explore an efficient way to infer the relationship between molecular (force field) model parameters and kinetic data using only prior ensembles of reference trajectories and employing techniques to reweight these paths. (Note that in this work, *trajectory* and *path* refer to the same concept: an ordered sequence of states representing the dynamical evolution of a system.) This avoids the computationally expensive step of recomputing the (rare event) rate constant and moving randomly in the high-dimensional force field parameter space (e.g., a typical protein force field contains 100’s of parameters). Recently, Donati *et al.* have explored such path reweighting techniques, which explicitly compute the change in path action based on a force field perturbation.^{21–23} We combine this path reweighting technique with the continuum path ensemble maximum caliber (CoPE-MaxCal) approach^{20,24} in order to impose the dynamical constraint and at the same time select the best solution among multiple sets of parameters that all give the correct kinetics. This selection thus corresponds to a minimal perturbation, i.e., the change in the force field that causes the smallest possible perturbation to the entire path ensemble, while still obeying the constraint.^{20} As the computation of the reference path ensemble (at the current force field) and the rate constant by direct MD is inefficient, we employ trajectory sampling methodology, in particular transition path sampling (TPS),^{25,26} and its descendent single replica transition interface sampling (SRTIS) to efficiently obtain path ensembles.^{27} Our method is applicable to any rare event method that gives access to the reference path ensemble (e.g., Refs. 28 and 29).

The aim of this paper is to lay the theoretical foundations for the above-mentioned framework and to illustrate the proof of principle of our methodology on several simple particle-based systems undergoing stochastic rare event dynamics. The remainder of this paper is organized as follows. In Sec. II, we introduce and develop the reweighted estimators for the CoPE-MaxCal approach and show how these equations can be used to optimize the underlying force field parameters. In Sec. III, we describe the model systems and give simulation details. We discuss the results of the optimization procedure in Sec. IV. We end with conclusions in which we give an outlook on future directions.

## II. THEORY

### A. Continuum path ensemble maximum caliber

*N*atoms. $x\u2208R3N$ denotes the configurational state of the system, where $R3N$ is the 3

*N*-dimensional position space. We assume that the system evolves according to the overdamped Langevin dynamics

^{30,31}in a force field −∇

*V*(

*x*), and note that our method can be generalized to underdamped Langevin dynamics

^{23}so that $x\u2208R6N$. We simulate the system using the Euler–Maruyama (EM) integrator

^{32}to obtain time-discretized paths. A path or trajectory is defined as an ordered sequence of frames

**x**= {

*x*

_{0},

*x*

_{1}, …

*x*

_{L}}, where the subscripts denote the time index. Subsequent frames are separated by the time step of the integration algorithm Δ

*t*such that the total duration of the path is $T=L\Delta t$. The probability for a trajectory of length $T$ is

*ρ*(

*x*

_{0}) denotes the probability density of the initial condition, usually the Boltzmann distribution

*ρ*(

*x*) ∼ exp(−

*βV*(

*x*)), with

*V*(

*x*) being the potential energy of configuration

*x*,

*β*= 1/

*k*

_{B}

*T*being the reciprocal temperature,

*T*being the temperature, and

*k*

_{B}being Boltzmann’s constant.

*p*(

*x*

_{i}→

*x*

_{i+1}) is a short-time Markovian probability representing the dynamical evolution, as given by the integration algorithm, and thus depends on

*V*(

*x*). $Z$ is a normalization constant such that $\u222bDxP[x]=1$. See the supplementary material for details.

*V*

^{0}(

*x*) is the reference potential, and the perturbation

*U*(

*x*;

**a**) can be expressed in terms of

*m*parameters

**a**= (

*a*

_{1},

*a*

_{2}…

*a*

_{m}). Consequently, also the path probability density at the modified potential depends on these parameters: $P[x;a]$.

*S*or caliber, for this path distribution $P[x,a]$ with respect to the probability $P0[x]$ at

*V*

^{0}(

*x*) is given by the Kullback–Leibler divergence,

^{33}states that the optimal path probability distribution $PMC[x]$ follows from maximizing the caliber, or path entropy, by varying

**a**while satisfying an external constraint

*s*

^{exp},

_{W}of a path observable

*s*[

**x**] with respect to the path probability density $P[x;a]$. The second constraint keeps the probability distribution normalized.

*μ*and

*ν*stand for Lagrange multipliers. $L$ depends on the potential energy function

*V*(

*x*;

**a**) via Eq. (1). The task is now to find the stationary points of the Lagrange function, which constitutes setting to zero the derivatives of $L$ with respect to the adjustable parameters

**a**of the potential energy function.

### B. The reweighted path Lagrange function $L(a)$

*V*

^{0}(

*x*), we express $P[x,a]$ using a path reweighting factor,

^{23}

*W*[

**x**;

**a**] depends on the difference of the path actions and is specified in Eq. (16). $Z0/Z(a)$ is the ratio of the partition function at

*V*

^{0}(

*x*) and the partition function at

*V*(

*x*,

**a**). Including $Z0Z(a)$ in the definition of the modified path probability density guarantees that the density is normalized: $\u222bDxP[x;a]=1$. The reweighted path ensemble average, then, is

*W*[

**x**;

**a**]⟩

_{W}represents the path action difference, $lnZ(a)Z$ represents the free energy difference, and ⟨

*s*[

**x**]⟩

_{W}is the expected value of the experimental observable at the modified potential.

### C. Derivatives of the Lagrange function $L(a)$

To find the optimal parameters **a** of the potential energy function for a single constraint, we determine the stationary point of the Lagrange function, i.e., we solve $\u2202L/\u2202ak=0,\u2202L/\u2202\mu =0$ for all *a*_{1}, …*a*_{m}, leading to *m* + 1 constraint equations.

*w*[

**x**;

**a**] = ln

*W*[

**x**;

**a**] so that

*W*[

**x**;

**a**] = exp(

*w*[

**x**;

**a**]) and $w\u2032[x;a]=\u2202\u2202akw[x;a]$. The derivative of the path Lagrange function [Eq. (10)] with respect to a parameter

*a*

_{k}, then, is

Since the path expected values in Eqs. (11) and (13) can be estimated from a set of trajectories generated at the reference potential *V*^{0}(*x*), these two equations provide the basis for optimizing the parameters of the modified potential *V*(*x*, **a**). So far, the logarithm of the relative path probability *w*(**x**, **a**) = ln*W*[**x**, **a**] and the path observable *s*(**x**) have been abstract functions. Next, we derive explicit expressions for constraining the rate constant and expressions for *w* and *w*′ for paths generated by the EM integrator.

### D. The rate constant estimate

^{34–40}We adopt the framework of transition path sampling

^{25}and transition interface sampling (TIS)

^{41}and, in particular, that of the reweighted path ensemble (RPE).

^{42}Defining (meta)stable states A and B using an order parameter (or collective variable)

*λ*, with

*λ*

_{A,B}being the boundaries of states A and B, one can compute the rate constant in the RPE framework as

*B*or return to enter

*A*, after which they are terminated (see the supplementary material). The frequency with which these trajectories are sampled is determined by $P[x]$, and such a path ensemble can be obtained by, e.g., TIS.

^{41}

*θ*(

*x*) is the Heaviside step function, and

*λ*

_{max}[

**x**] returns the maximum value of the variable

*λ*along the path. Thus, the fraction of path integrals equals the probability that paths reach the boundary of state B. Multiplying with the flux

*ϕ*

_{A}through the first interface

*λ*

_{0}for paths leaving state A yields the rate constant. Using the path reweighting of Eq. (6) and setting

*s*[

**x**] =

*ϕ*

_{A}

*θ*(

*λ*

_{max}[

**x**] −

*λ*

_{B}) and the experimental observable to $sexp=kABexp$ in Eq. (10), the Lagrange function and its derivative are given by

*ϕ*

_{A}does not depend on

**a**and, thus, cancels in the second term, which is justified if the parameters

**a**do not influence stable state A. In general, the parameters can also affect the stable states, but the effect on the flux is expected to be small, in comparison to the change in rate constant due to the barrier height.

In the constraint term, the path ensemble average $w\u2032A,W$ is calculated using the reweighted path probability $P[x]=W[x;a]P0[x]$ (indicated by subscript *W*) over all trajectories that leave A (indicated by subscript *A*). The effect of *s*[**x**], as defined above, is to select the reactive AB paths, i.e., those that reach B, from the set of all transition paths. With this choice of *s*[**x**], Eq. (12) becomes a reweighted path ensemble average over the set of reactive AB trajectories. Thus, we can write $w\u2032Ws=w\u2032AB,W$, where subscript AB indicates that the path ensemble average is calculated with the path probability $P[x]\theta (\lambda max[x]\u2212\lambda B)$ over all trajectories that leave *A*. Note the similarities to the temperature derivative of the rate constant in, e.g., Refs. 43 and 44.

The term between brackets in Eq. (15) denotes the derivative of the rate constant with respect to the force field parameters and can serve as a first sanity check whether the reweighting approach actually works.

### E. Path reweighting using the adapted force field

*W*[

**x**;

**a**] is

^{23}

*U*(

*x*;

**a**) is the perturbation to the force field as defined in Eq. (2), $\kappa \u2261\Delta t/2kBT\xi m$,

*ξ*is the Langevin collision rate in units of s

^{−1}, and

*m*is the particle mass, as defined in the EM integrator.

*η*

_{i}is the random number used in the

*i*th iteration of the EM integrator out of

*n*time steps (frames) and can be recorded during simulation at the reference potential

*V*

^{0}(

*x*). Taking the derivative of

*w*[

**x**;

**a**] = ln

*W*[

**x**;

**a**] yields

*a*

_{k}on the fly, it is even more efficient to be able to compute these

*a posteriori*. This depends on the precise functional form of

*U*(

*x*;

**a**). In the case of a linear dependence

*V*(

*x*,

**a**) =

*V*

^{0}(

*x*) +

*∑*

_{k}

*a*

_{k}

*U*

_{k}(

*x*), the perturbation forces are −

*a*

_{k}∇

*U*

_{k}(

*x*), and the derivative of ∇

*U*(

*x*,

**a**) with respect to

*a*

_{k}is just ∇

*U*

_{k}(

*x*). Hence, it is convenient to store the sums over

*η*

_{i}∇

*U*(

*x*

_{i},

**a**) and $(\u2207U(xi;a))2$ terms for each path explicitly so that the derivatives can be computed easily for arbitrary values of

**a**. For a non-linear dependence, one can still do so, but it becomes more involved.

## III. SIMULATION METHODS

### A. Triatomic system

^{45}are bound by a harmonic potential that has a minimum at a certain equilibrium bond distance: $V(r)=12a(r\u2212req)2$. This trimer can undergo an isomerization transition where one particle passes between the other two, hence changing from a clockwise to anticlockwise arrangement of the (labeled) particles (see Fig. 1). Note that this system was also studied in Ref. 26. We focus on the isomerization rate constants for this trimer and measure the rate constants and their derivatives as functions of the force constant

*a*and the equilibrium distance

*r*

_{eq}. In fact, since we are interested in the changes Δ

*a*, Δ

*r*, with respect to a reference system, we define the modified harmonic potential as

*U*(

*x*;

**a**) becomes

*r*

_{j}denotes the

*j*th bond vector length in the trimer.

^{27}The collective variable used to define stable states and interfaces is the shortest distance

*r*

_{perp}of the hopping particle to the axis between the two remaining particles. This distance is negative or positive depending on which side the hopping particles is located. The collective variable follows from the (ad hoc) transformation,

*λ*<

*λ*

_{A}= 0.0 to avoid too short paths. Stable state B is defined from the other side and is also

*λ*

_{0}<

*λ*

_{B}= 0.0. Note that since the reference state is different, both definitions refer to different states even if the value of

*λ*is equal. Interfaces were defined at

*λ*= 0.4, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.00, 1.05, 1.10, 1.15, 1.20, 1.25, 1.30, 1.35, 1.40, 1.45, 1.50, 1.55, 1.60, 1.65, 1.70, 1.75, 1.80. The Langevin integrator settings are

*γ*= 2.5 and

*dt*= 0.001. Integration is done via the EM algorithm.

Sampling 10 000 cycles with SRTIS,^{27} where each cycle consisted of 100 shots, 100 interface exchanges, and 100 state swaps, resulted in a path ensemble for each interface. The measured crossing histograms were joined with the weighted histogram analysis method (WHAM),^{46} which together with the effective positive flux through the first interface^{27,39,41} leads to rate constant estimates over the barrier. The WHAM procedure also allowed us to assign a weight to each trajectory in the path ensemble, yielding the RPE. Each of these RPE trajectories can be evaluated in terms of the path action and its derivatives.

### B. Dissociation model

*c*) and the green particles via an attractive LJ-like interaction potentials with adjustable depths. The interaction with the gray particles is purely repulsive and given by a standard repulsive WCA potential.

^{45}Setting the particle diameter as the unit of length

*σ*= 1 and denoting the red particle with index 0, the total energy is, thus, given by

*V*

_{ϵ}(

*r*), in units of

*k*

_{B}

*T*, are given by

*v*

_{s}and

*v*

_{ϵ}shift the potential such that the potential is zero at the cutoff

*r*

_{c}= 1.5 and continuous at the minimum

*r*= 2

^{1/6}.

*ϵ*is the (reduced/dimensionless) depth of the potential, while

*ϵ*

_{0}= 1 is the standard reference value for the potential. For this particular cutoff, it follows that $vs=rc\u22126\u2212rc\u221212=0.0800841$ and

*v*

_{ϵ}= (1 −

*ϵ*) + 4

*ϵv*

_{s}.

The switching at the minimum of the potential is done to avoid numerical problems with the rather steep repulsive part of the potential arising for the high values of *ϵ*_{1} required to bind the particle. Therefore, we chose to keep the repulsive part of the potential equal to the standard WCA potential, i.e., not scale with *ϵ*. In this way, only when the particles are in the attractive part of the potential, they contribute to the path action derivatives, making the evaluation of the path action more robust. Note that this does not change the generality of our approach.

Subsequently, we allow the other particles to move as well. To keep the cluster together, we apply an additional potential that binds the non-red particles to the central blue particle by an additional strong LJ interaction $V\u03f53(r)$ with *ϵ*_{3} = 20. While this keeps the cluster intact, rearrangements are still possible. We avoid these by imposing an additional weak harmonic spring between neighboring particles with a spring constant *k* = 1 (this, of course, excludes the red particle).

The order parameter *λ* used for SRTIS is the center-to-center distance *r* between the red and blue particles. Stable states were defined as *r* < 2^{1/6} and *r* > 2.5 for the initial and final states, respectively. (Note that beyond *r* = 2.5, the ligand cannot yet be considered escaped to the bulk. While it is possible to take this into account, see, e.g., Refs. 47 and 48, we assume here for simplicity that the ligand is dissociated.) Interfaces were positioned at *λ* = 0.15, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60, 0.70, 0.80, 1.00 with respect to the minimum distance *r*_{A} = 2^{1/6}. We integrate the equations of motion using the EM integrator with a time step of *dt* = 0.001 and a friction of 2.5.

Finally, in order to show that our methodology easily extends to 3D systems, we consider a 3D version of the LJ-model, now considering 13 LJ particles as depicted in Fig. 3. Here, four outer green particles bind the ligand with an attraction *ϵ*_{2}.

## IV. RESULTS AND DISCUSSION

We illustrate our methodology and system setup for several models, including a bistable potential (see the supplementary material). Here, we explore the 2D molecular isomerization reaction from Fig. 1 and the particle dissociation from a 2D cluster of seven Lennard-Jones (LJ) particles shown in Fig. 2, which can be viewed as analogous to the protein–ligand unbinding, an important biophysical process^{49,50} or dissociation for small nano-clusters.^{51–56} In addition, we present and discuss 3D LJ particle systems, which are remarkably similar to the 2D LJ particle systems, showing the robustness of the results.

### A. Triatom isomerization

#### 1. Path ensembles

For the triatom model (see Sec. III), we perform SRTIS with in total 10^{6} shooting and replica exchange moves. Acceptance ratios for the shooting move ranges from 0.4 for the first interface to 0.15 for the last interface. Replica exchange moves were accepted around 50%. Path lengths vary from 50 time steps for the first interface to a few thousand for the last interface. From this reference path ensemble, we estimate the rate constant for the unmodified potential as ln*k*_{AB} ≈ − 9. (See the supplementary material for more details.)

Before embarking on the full problem of model parameter optimization, we first checked the path reweighting method by computing the derivative of the rate constant, i.e., the term in brackets in Eq. (15). In the supplementary material, we show that the rate constant derivatives are well-predicted by our path reweighting approach for both the triatom and a simpler diatomic system.

#### 2. Optimization of the Lagrange function

Now, we can turn to the optimization of the molecular (force field) model parameters. The trimer system has two parameters that can both be tweaked to reproduce the imposed rate. To optimize the Lagrange function [Eq. (9)], we have to be able to compute it as a function of parameters not only at zero perturbation but also for the reweighted path ensemble *W*[**x**; *a*]. Moreover, we need to be able to take its derivatives at nonzero perturbation.

*u*and

*η*

*f*terms refer to the specific contribution to the energy in the first time slice 0 and the random number force (gradients), respectively [see Eq. (17)]. The

*f*

^{2}terms refer to the gradient square terms. Using these quantities, it is easy to compute the derivatives

*dw*[

*a*,

*r*;Δ

*a*, Δ

*r*]/

*da*and

*dw*[

*a*,

*r*;Δ

*a*, Δ

*r*]/

*dr*. From these, we compute the path ensemble averages ⟨

*w*′

*w*′⟩

_{W}, ⟨

*w*⟩

_{W}, and ⟨

*w*′⟩

_{W}. Finally, we construct the reweighted rate constant ln

*k*[

*a*,

*r*;Δ

*a*, Δ

*r*],

*D*

_{KL}, and the Lagrange function $L$ from Eqs. (11)–(15).

For a particular set of values of *a* = 20 and *r*_{eq} = 1.5, we computed the path ensemble using SRTIS. In Fig. 4, we show *D*_{KL} and log rate constant ln*k*[*a*, *r*;Δ*a*, Δ*r*] on top of each other as a function of Δ*a* and Δ*r*. As expected, *D*_{KL} is minimal for the reference value Δ*a* = 0, Δ*r* = 0, that is, at zero perturbation.

The reweighted rate constant is identical to the predicted rate constant from the prior path ensemble for Δ*a* = 0, Δ*r* = 0, but clearly varies if the system is perturbed.

Note that the rate contour lines in Fig. 4 are roughly parallel to the Δ*r*_{eq} axis, implying that, for a given spring constant *a* + Δ*a*, the rate is rather insensitive to the choice of equilibrium distance. At the transition state, the bond between atoms 1 and 2 is quite stretched, whereas the other two bonds are compressed, which explains why the spring constant has greater influence on the rate than the equilibrium distance. In addition, *D*_{KL} is less sensitive to Δ*a* than to Δ*r*_{eq}, indicating that it is probably better to adjust *a* than *r*_{eq}.

When we change the rate constant from the observed value of ln*k*_{AB} = −9 to the new value ln*k*^{exp}, we need to compute the derivatives and optimize the Lagrange function. To do so, we apply the method of Ref. 57. This iterative method starts at certain initial values and slowly converges to the solution. The solution for the most optimal set of {Δ*a*, Δ*r*} is shown in Table I. In Table I, we show several sets of optimal solutions as a function of imposed $kABexp$. The first set shows the optimal solution for both Δ*a* and Δ*r*. The other two sets we optimized for only one of the two parameters, and set the other to zero. From the value of *D*_{KL} (which is identical to the negative of the Lagrange function $L$ when the constraints are obeyed), it is clear that this is always less optimal than the two parameter solution, thus illustrating the need for multi-parameter force field optimization. In addition, note that the further the imposed $lnkABexp$ is from −9, the more $L$ and, thus, *D*_{KL} deviate.

$ln\u2061kABexp$ . | μ
. | Δa
. | Δr
. | D_{KL}
. |
---|---|---|---|---|

−8 | −0.162 56 | −3.034 52 | −0.002 90 | 0.083 18 |

−12 | 0.275 89 | 8.439 88 | 0.014 61 | 0.454 86 |

−16 | 0.303 01 | 19.057 9 | 0.044 08 | 1.685 25 |

−20 | 0.280 86 | 29.106 6 | 0.066 00 | 2.879 69 |

−8 | −0.164 47 | −3.064 28 | 0 | 0.084 08 |

−12 | 0.317 65 | 8.842 44 | 0 | 0.489 06 |

−16 | 1.000 39 | 19.867 5 | 0 | 2.981 42 |

−20 | 0.572 78 | 28.773 4 | 0 | 6.958 17 |

−12 | 0.587 05 | 0 | 0.258 99 | 4.181 48 |

−16 | 0.286 30 | 0 | 0.385 95 | 5.617 00 |

−20 | 0.547 23 | 0 | 0.490 97 | 6.965 69 |

$ln\u2061kABexp$ . | μ
. | Δa
. | Δr
. | D_{KL}
. |
---|---|---|---|---|

−8 | −0.162 56 | −3.034 52 | −0.002 90 | 0.083 18 |

−12 | 0.275 89 | 8.439 88 | 0.014 61 | 0.454 86 |

−16 | 0.303 01 | 19.057 9 | 0.044 08 | 1.685 25 |

−20 | 0.280 86 | 29.106 6 | 0.066 00 | 2.879 69 |

−8 | −0.164 47 | −3.064 28 | 0 | 0.084 08 |

−12 | 0.317 65 | 8.842 44 | 0 | 0.489 06 |

−16 | 1.000 39 | 19.867 5 | 0 | 2.981 42 |

−20 | 0.572 78 | 28.773 4 | 0 | 6.958 17 |

−12 | 0.587 05 | 0 | 0.258 99 | 4.181 48 |

−16 | 0.286 30 | 0 | 0.385 95 | 5.617 00 |

−20 | 0.547 23 | 0 | 0.490 97 | 6.965 69 |

### B. Dissociation from a LJ cluster

Having shown that we can apply our framework to model systems for molecular (isomerization) reactions, we explore in this section a slightly more elaborate system, which also has interesting physical properties, namely, particle dissociation from a cluster of LJ particles. Such a process is reminiscent of ligand unbinding, which is an important problem in biophysics.^{49,50} Moreover, it can be seen as dissociation for small nano-clusters.^{51–56}

We describe this dissociation transition using a model of Lennard-Jones (LJ)-like particles. We first consider a seven particle setup in two dimensions, indicated in Fig. 2. In the simplest instance of this model, all particles are kept fixed except the red particle, which can dissociate from the cluster and diffuse away.

Here, the only important interactions are that of the red particle with the six other particles. In the second model, we allow the other particles to move as well, resulting in a fluctuating cluster of six particles, which can expel the red particle. During the dissociation, the green particles can move closer to each other, gaining in entropy. We can interpret this simple model as representing a ligand unbinding reaction, e.g., of a protein, in which the protein binding pocket slightly rearranges upon (un)binding. We stress that this is just an analogy.

Finally, in order to show that our methodology easily extends to 3D systems, we consider a 3D version of the model with 13 LJ particles.

#### 1. Path ensembles

We first start exploring the fixed seven-particle model. Setting the interaction with the central particle to *ϵ*_{1} = 10 and with the green outer particle to *ϵ*_{2} = 1, we sample unbinding transitions using SRTIS (see Sec. III).

In total, we perform 10^{5} shooting and replica exchange moves. Acceptance ratios for the shooting move ranges from 0.4 for the first interface to 0.15 for the last interface. Replica exchange moves were accepted around 50%. Path lengths vary from 50 time steps for the first interface to a few thousand for the last interface. The crossing probability of the final interface (obtained from WHAM) is ln*P*(*λ*_{B}|*λ*_{1}) = −6.567. The flux of the first interface is *ϕ* = 0.003 550. The total rate constant is thus *k*_{AB} = 5 × 10^{−6} in units of time steps. Note that, when optimizing the rate constant, we assume that the fluxes are not altered much (as above), and we only have to consider the crossing probability.

We also performed similar runs for the flexible 2D cluster.

*u*

_{0},

*ηf*-functions, and

*f*-functions involve the potential energy of the first slice and the gradient of the potential energy; cf. Eq. (22). Note that, as above, only the attractive part of the potential has to be taken into account.

#### 2. Caliber/*D*_{KL} and rate constant predictions

Our framework then gives the rate constant predictions for the altered parameters, as well as the caliber or *D*_{KL}. Figure 5 represents both predictions as contour plots. The green contours delimit the (log of the) dissociation rate constant (in fact, the crossing probability) predictions, while the blue-ochre contours depict *D*_{KL}. Several conclusions follow from Fig. 5. The first is that for the fixed cluster, the *D*_{KL} contours show some anti-correlation in the two parameters. This indicates that the path ensemble is least disturbed when an increase in *ϵ*_{2} is compensated by a decrease of *ϵ*_{1}. The *D*_{KL} contours also are slightly asymmetric, showing a larger sensitivity to negative values of *ϵ*_{2}. As this parameter is set to the relatively low value of *ϵ*_{2} = 1, reduction below Δ*ϵ*_{2} < −1 will, therefore, reverse the sign of the attractive interaction, which, of course, completely alters the systems. The flexible cluster in Fig. 5(b) shows no such anti-correlation in *D*_{KL}, indicating that the compensating effect has largely disappeared.

The second observation is that the green rate constant contours are roughly linear with a negative slope. This indicates that both increasing *ϵ*_{1} and *ϵ*_{2} have similar effects for a large variety of values. For a similar decrease in the rate constant, one could choose to increase either *ϵ*_{1} or *ϵ*_{2}. Note that the slope of the contour is roughly −0.5, as there are two outer particles (green) and only one central particle, so changing their interaction strengths *ϵ*_{2} has, therefore, twice the effect. This observation indicates that *ϵ*_{1} and *ϵ*_{2} are more or less interchangeable, and one can choose many combinations for arriving at the same rate constant.

#### 3. Optimization of the Lagrange function

Next, we would like to find the optimal choice for the force field parameters *ϵ*_{1,2}. As before, optimizing the parameters for a given rate constant amounts to following a green rate constant contour until *D*_{KL} is minimal. Using the optimization procedure outlined in Sec. IV A 2, we arrive at a prediction given by the red points in Fig. 5. This prediction corresponds, thus, to the most optimal force field parameters, which minimizes the change in the path ensemble with respect to the original force field.

In all cases in Fig. 5, a trivial observation is that the curve passes through the origin, as there, the original force field reproduces the original rate constant most optimally. For the fixed cluster in Fig. 5(a), the optimal curve is roughly vertical, with only a relatively small deviation in *ϵ*_{1}. This means that that whether enhancing or reducing the rate, it is always better to change *ϵ*_{2} rather than *ϵ*_{1}. Remarkably, this trend even holds for Δ*ϵ*_{2} < −1, which changes the interaction from attraction to repulsion. This is likely a consequence of the immobility of the particles in this case.

For the flexible cluster cases in Figs. 5(b) and 5(c), the most striking feature is perhaps the L-shaped curve, indicating an asymmetry concerning reducing the rate or enhancing the rate. For a rate change of about one natural log unit, the red curve is roughly diagonal, that is, *ϵ*_{1,2} are contributing to the rate constant in equal proportions. However, when reducing the rate constant further, the curve bends over more vertically, indicating that it is better to change *ϵ*_{2} instead of *ϵ*_{1}. In contrast, when increasing the rate, the curve bends over horizontally, indicating that it is now better to change *ϵ*_{1} instead of *ϵ*_{2}. This conclusion also holds for Fig. 5(c), where *ϵ*_{1} = 15 and *ϵ*_{2} = 5.

We can interpret this behavior as follows. The best parameters are those that perturb the path ensemble as little as possible. When reducing the rate constant, it is better to adjust the *ϵ*_{2} parameter than the *ϵ*_{1} parameter, even if they both can lead to the same rate constant predictions. This can be interpreted by realizing that changing the central particle interaction *ϵ*_{1} will alter the entire reweighted path ensemble: all interface ensembles will be affected. In contrast, the change of *ϵ*_{2} will mostly affect only the interface ensembles further out. Since distant interfaces have a (much) lower weight in the ensemble, the perturbation, as measured by *D*_{KL}/caliber will be smaller. In contrast, for increasing the rate, changing *ϵ*_{2} will not get you very far, and a substantial change of *ϵ*_{1} is also necessary. These effects can be shown by plotting the total crossing probability *P*(*λ*|*λ*_{0}) for the different settings in Fig. 6. For the curves for Δ*ϵ*_{1} = 4 and Δ*ϵ*_{2} = 2, the final rate constant predictions (crossing probs) are almost equal, but the intermediate crossing probability and, hence, the path ensembles are very different. Clearly, the Δ*ϵ*_{2} = 2 case is much closer to the original dataset (black solid line), especially in the beginning of the crossing probability where the path ensemble is most dominant.

#### 4. Comparison with rate constant predictions

While the prediction of the optimal parameters is already providing valuable insight, the ultimate goal is to establish a better force field model. To assess the quality of the predictions, we can compare the predicted rate constants with independent calculations at these different force field parameters. Figure 7 shows this comparison for the flexible 2D unbinding. The agreement is good, especially up to two *k*_{B}*T* from the reference point (*ϵ*_{1} = 12, *ϵ*_{2} = 2.0).

Note that Fig. 7 compares just the reweighted crossing probability part of the rate to validate the rate prediction of our method. Thus, in this comparison, we have not included possible differences in flux factor. In this particular example, we verified that this difference is relatively small (≲20%), which would translate in a small correction on the logarithmic scale of less than 0.18, but in general, we encourage to recalculate the flux factor through the first interface. This is usually not the most expensive part.

#### 5. Physical insight from the optimization

These results also reveal several physical aspects: While the central particle is more important for increasing the rate, outer particles are more important for decreasing the rate, as the entire path ensemble is mostly influenced by the central particle’s interaction change, whereas outer particles only affect the barrier region. Outer particles, therefore, are prime targets for modulating when reducing the dissociation rate constants. Translating to real proteins this amounts to engineering mutations or post-translational modifications^{58} of binding pocket residues close to the surface or modulation of the ligand chemistry in order to bind better to encounter complex sites at the surface. Of course, this extrapolation to realistic systems is currently no more than a hypothesis, which requires further testing. Yet, the general principle is likely to be robust.

### C. Dissociation from a 3D LJ cluster

In Fig. 8, we plot the results for the 3D systems. They are remarkably similar to the 2D systems, showing the robustness of the results. One striking difference is the slope of the green rate constant contours, which is now −0.25 (for the flexible 3D case), as we now have four outer particles. Hence, a change in *ϵ*_{2} has a fourfold effect on the rate constant. In addition, the *D*_{KL} contours appear different and a bit more skewed compared to the more circular ones in the 2D case. Remarkably, the optimal solution for the parameters (the red curves) looks again qualitatively similar to those of the 2D cases. Only for very strong interaction Δ*ϵ*_{2} > 2 of the outer particles, shown in Figs. 8(b) and 8(c), the red curve bends over.

## V. CONCLUSION

In this paper, we have introduced a Continuum Path Ensemble Maximum Caliber (CoPE MaxCal) based method to optimize force field parameters in order to impose an dynamical constraint, in particular a rate constant in a complex molecular transition. Without any computationally expensive recalculation of the kinetics, the method yields the optimal change in the parameters that leads to an imposed rate constant after the path reweighting, while making the least possible perturbation to the prior trajectory ensemble as measured by the caliber or KL divergence.

We show that the path reweighting leads to meaningful prediction of the rate constants, which agrees with the direct calculation for the new force field, even up to more than an order of magnitude suppression of the rate constant in the case of the unbinding model. While in this work we develop the methodology and applied it only to simple models, we expect the method to be generally applicable.

Our method is especially aimed at force fields with multiple parameters. If there are only one or two parameters to change in a system, such as in a monoatomic Lennard-Jones fluid, adding kinetic constraints might overdetermine the parameter space, prohibiting a sensible solution. Instead, we focus on the other extreme, that of underdetermined systems, where the best solution should be found among many candidate parameters.

Besides a corrected force field, we find that the optimization for multiple parameters allows for an interpretation that gives physical insights in which parts of the path ensemble are affected by the parameter changes. Thus, the method provides a powerful tool to inspect rare event trajectory ensembles and extract valuable information from them. Trajectory ensembles provide us not only with mechanisms, rate constants, and transition states for rare event dynamics, but they can also inform on how such properties change with the model parameters. Importantly, they can predict the change in rate constants and, moreover, point us in the direction of parameters that least affect the original path ensembles. This gives the possibility to extend the optimization of force fields that are already optimized for thermodynamics to kinetics and, moreover, gives us pointers to what parts of the systems are most sensitive and thus most sensible to adapt or mutate.

One concern is how altering the potential might change the thermodynamics (or other static properties). When the force field perturbation only affects the barrier region between the two stable states and does influence the stable states themselves, this effect will be minimal (see the trimer example). However, if, for instance, changing force field parameters makes one of the states relatively more stable, this has the (most likely undesired) side-effect of changing the free energy difference. This issue might be resolved by including an additional constraint, e.g., by demanding that the forward and backward rates are modified with the same factor so that the free energy difference is unchanged in the optimized force field. Note that other thermodynamic or dynamic properties not targeted in the optimization might also be affected and might need an additional reevaluation.

Looking toward the future, we consider several directions for further research. First of all, we have used only the simplest of path actions, i.e., the Onsager–Machlup action for the EM integrator. It is known that EM is not a great integrator, and an obvious extension of the methods would be to extend it to underdamped Langevin integrators.^{23}

Second, we only have considered up to two parameters. We envision that extension to many parameters is, in principle, straightforward, but some bookkeeping issues arise, e.g., keeping track of all the cross terms in the square gradient. Nevertheless, the method should be equally effective for optimization in higher dimensions.

Third, the method can and should be applied in a realistic force field and included in a MD engine. This is non-trivial since the evaluation of the path action should be done in the integrator. Nevertheless, we believe that this is a worthwhile and necessary research direction to make our approach useful. One of the issues is the functional form of the potential perturbation. To start, one could try simple functional forms, such as dihedral terms, and Lennard-Jones-like interactions. In a more advanced exploration, one could try to use machine learning to learn the functional form of the perturbation.

Fourth, the methods can be pushed beyond the standard atomistic force fields into coarse-grained force fields, for which rate constants are notoriously difficult to reproduce. Such an approach would go beyond a simple adjustment of the diffusion constant or friction.

Fifth, the method should be also applicable to other dynamical observables, such as diffusion and viscosity. In fact, all observables based on time-correlation functions of a trajectory ensemble should be treatable.

Sixth, parameters beyond the force field might be optimized, including the diffusion, effective mass, and even molecular topology. The holy grail in chemistry, material science, and biology is to accurately determine the link between microscopic degrees of freedom, such as the chemistry, the microscopic mechanisms, transition rates, and structure populations with macroscopic thermodynamic or kinetic properties related to the function. Having an efficient algorithm that is able to provide the link between kinetic/thermodynamic macroscopic properties and force field parameters, reporting on the underlying physics could be potentially very useful in protein engineering and design of small molecules and of materials with tunable macroscopic properties.^{49–56} Examples of such design applications are coarse-grained molecular force fields where the sole parameters are the identity of the aminoacid, which are a.o. employed in protein aggregation and protein folding.^{59,60} For such problems, one could apply our approach to suggest changes in the aminoacid sequence in the direction of promoting or depressing the rate constant of, e.g., aggregation, protein folding, or liquid/liquid phase separation. In this way, one could optimize the design of the novel (bio)material to be further tested in the wet lab, thus bypassing expensive and timely optimization protocols in the wet lab.

Moreover, our method could be particularly useful in chemical reactions by correcting the kinetics of imperfect while computationally cost-effective calculations by reactive force fields^{61} instead of doing calculations with accurate albeit computationally expensive density functional theory potentials. Another utility is the generality of the method, giving the possibility to combine it with deep learning potentials that are known to be very flexible and have already been used when calibrating force fields for thermodynamics in material science.^{5,62}

Finally, our methodology of constrained optimization based on the maximum entropy principle is not limited to molecular systems and could even potentially contribute to various problems of time-series, where the underlying dynamics can be modeled as stochastic, undergoing rare events and with a potential energy function reporting on interactions of agents, such as in modeling of stock-market, fluid-flows, and health-pandemics.^{63–65}

## SUPPLEMENTARY MATERIAL

The supplementary material comprises an appendix with several theory sections containing detailed explanations and derivations mentioned in the text and results and discussion sections that are provided using the rate constant derivative as a test, applied to a diatomic and triatomic system.

## ACKNOWLEDGMENTS

The authors thank Christoph Dellago and Pieter Rein ten Wolde for useful feedback and helpful discussions. B.G.K. acknowledges the funding by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through Grant No. SFB 1449–431232613, sub-project C02. The ideas for this contribution were initiated during the workshop “Accelerating the Understanding of Rare Events” (6-10 September 2021) at the Lorentz Center in Leiden, NL.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Peter G. Bolhuis**: Conceptualization (equal); Data curation (lead); Investigation (lead); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **Z. Faidon Brotzakis**: Conceptualization (equal); Investigation (supporting); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **Bettina G. Keller**: Conceptualization (equal); Investigation (supporting); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.

## REFERENCES

*Advances in Chemical Physics*

*Stochastic Differential Equations*

*Numerical Solution of Stochastic Differential Equations*