We introduce a variational algorithm to estimate the likelihood of a rare event within a nonequilibrium molecular dynamics simulation through the evaluation of an optimal control force. Optimization of a control force within a chosen basis is made possible by explicit forms for the gradients of a cost function in terms of the susceptibility of driven trajectories to changes in variational parameters. We consider probabilities of time-integrated dynamical observables as characterized by their large deviation functions and find that in many cases, the variational estimate is quantitatively accurate. Additionally, we provide expressions to exactly correct the variational estimate that can be evaluated directly. We benchmark this algorithm against the numerically exact solution of a model of a driven particle in a periodic potential, where the control force can be represented with a complete basis. We then demonstrate the utility of the algorithm in a model of repulsive particles on a line, which undergo a dynamical phase transition, resulting in singular changes to the form of the optimal control force. In both systems, we find fast convergence and are able to evaluate large deviation functions with significant increases in statistical efficiency over alternative Monte Carlo approaches.

## I. INTRODUCTION

A system kept away from thermal equilibrium by a continuous supply of energy is subject to fewer physical constraints than one evolving within an equilibrium state. As a consequence, the application of external forces or the internal consumption of energy can produce structures and responses without equilibrium equivalent.^{1–3} Advances in the theory and modeling of nonequilibrium steady-states^{4–6} have resulted in an increased interest in trying to understand the behavior in systems out of equilibrium and leverage their versatility to design new functional materials.^{7–13} However, quantifying emergent nonequilibrium behavior with computer simulations is currently hampered by the lack of robust tools to sample the rare fluctuations required to estimate response functions, overcome kinetic bottlenecks, and reach the time scales of experimental relevance. For a generic class of stochastic systems that violate detailed balance, we have developed an algorithm to compute control forces that can be used to enhance the sampling of nonequilibrium steady-states. The control forces we optimize are variational as are the estimates they provide of the likelihood of a rare fluctuation as characterized by large deviation functions. In two paradigmatic models of nonequilibrium systems, we demonstrate that estimating large deviation functions in this way is both accurate and statistically efficient.

Enhanced sampling methods within equilibrium ensembles are standard tools that enable the determination of phase diagrams and the calculation of rates of rare events, through the evaluation of equilibrium free energies.^{14} Free energies characterize the likelihood of configurational fluctuations around an equilibrium state, and the analogous quantity for fluctuations of time integrated observables around nonequilibrium steady-states is large deviation functions.^{5,15,16} Large deviation functions have been used to map regions of stability for nonequilibrium phases,^{17,18} to elucidate complex dynamical behavior,^{19–23} and infer nonlinear and multivariate response.^{24–27} Methods to compute large deviation functions in systems with many degrees of freedom have largely been restricted to Monte Carlo based approaches, including cloning,^{28,29} forward flux sampling,^{30,31} nonequilibrium umbrella sampling,^{32} list-based algorithms,^{33} and transition path sampling.^{34,35} Most current algorithms scale exponentially in computational effort the further the rare fluctuation is from the mean, as most do not employ additional importance sampling apart from stratification or population dynamics.^{36–39} Recent work adding control forces to importance sample trajectory based Monte Carlo has demonstrated that even an approximate force can greatly improve the efficiency of Monte Carlo methods in estimating large deviation functions.^{40–42} Consequently, there has been much work to find approximate control forces analytically or through empirical arguments in both lattice-based and continuous systems^{43–45} and several iterative effective force optimization techniques have been proposed with varying levels of generality or accuracy.^{46–49} The control forces, in general, can have many-body components in interacting particle systems,^{45,50} can be long-ranged in systems with dynamical phase transitions,^{51} and can stabilize otherwise metastable states.^{52}

For Markovian systems, there exists an optimal control force, which is the unique additional force having the smallest contribution to the path ensemble measure that can be added to the system to make a rare fluctuation typical.^{53,54} This optimal control force satisfies several variational identities.^{55} By deriving such a variational principle and explicit forms for the gradients required to optimize it, we develop an algorithm that approximates the control force sufficiently well so as to make quantitatively accurate estimates of the likelihood of rare events within nonequilibrium steady-states. In this way, we generalize previous work on variational control of single particle systems to interacting, continuous force systems, bypassing the need for exponentially scaling Monte Carlo sampling. Our algorithm is similar in strategy to the recent use of thermodynamic variational principles to compute equilibrium free energies^{56} and to the Rayleigh-Ritz variational principle that others have used to nonperturbatively compute effective forces far from equilibrium.^{57} The variational principle that underlies our algorithm is related to minimum-entropy production principles^{58,59} and the Donsker-Varadhan formula in Markov stochastic processes.^{60} While our variational estimate of the large deviation function is subject to errors associated with the representation of the control force, we derive exact corrections that can be evaluated straightforwardly. In the two systems studied, these corrections are easy to evaluate, as our control forces are sufficiently close to the optimal control forces to make these corrections perturbatively small. However, in cases where the corrections are large, we show that using optimized control forces in conjunction with standard Monte Carlo algorithms can increase the statistical efficiency in the estimation of large deviation functions by orders of magnitude. In this way, our algorithm is similar to the use of variationally optimized wavefunctions for quantum Diffusion Monte Carlo calculations.^{61}

## II. ENHANCED SAMPLING FROM OPTIMAL CONTROL FORCES

Our aim is to construct a method by which rare fluctuations within a nonequilibrium steady-state can be sampled. We consider dynamics described by a Langevin equation of the form

where **a** is the vector of all dynamical coordinates, *a*_{i}, which can include the positions and velocities of all particles in the system. Its time derivative, $a\u0307$, depends on the force, **F**, with components, *F*_{i}, that are, in principle, functions of all coordinates, **a**. The Gaussian white noise, ** η**, has components,

*η*

_{i}, that satisfy

where *B*_{i} are diagonal elements of the diffusion constant matrix, **B**. While we have assumed **B** is diagonal and independent of **a** for ease of notation, generalizations for nondiagonal and coordinate-dependent diffusion matrices are straightforward.

For a specific trajectory, *X*(*τ*) = {**a**(0), …, **a**(*τ*)} spanning an observation time, *τ*, we are interested in fluctuations of time-averaged observables *A*_{τ} of the form

where *f* is a scalar function and **g** is a vector function with components, *g*_{i}, with the second term being evaluated in the Ito sense.^{62} Path observables such as the particle density, particle current, and entropy production can all be expressed in this form. We will be interested in the statistics of this observable in the long time limit, *τ* → *∞*.

### A. Nonequilibrium variational principle

We assume that in the long time limit, the probability distribution of *A*_{τ} satisfies a large deviation principle, with a rate function, or log likelihood, *I*(*A*), defined by^{15}

where the angular brackets denote a trajectory average

and *P*[*X*(*τ*)] denotes the path probability associated with trajectory *X*(*τ*). We will consider finite size systems that have exponentially decaying correlation functions and thus are expected to obey the large deviation principle.

The long time behavior of *A*_{τ} can also be characterized by its scaled cumulant generating function (SCGF) defined as

where *λ* is a counting parameter conjugate to *A*_{τ} and denotes the extent of *biasing* or *tilting* on the typical value of *A*_{τ}. Larger positive or negative values of *λ* probe rarer fluctuations. This is clear by noting that the derivatives of *ψ*(*λ*) report on the cumulants of *A*_{τ}. We refer to the rate function, *I*(*A*), and the SCGF, *ψ*(*λ*), collectively as the large deviation functions. When the rate function is convex, it can be obtained from the SCGF using a Legendre-Fenchel transform,

where inf refers to an infimum taken over all possible values of *λ*.

Computing either of the large deviation functions of *A*_{τ} requires sampling exponentially rare fluctuations. These rare fluctuations can, in principle, be made to occur more frequently by introducing a control force into the system as a means of importance sampling. In the presence of a new force, **u**(**a**), replacing the original force, **F**(**a**), the computation of the SCGF can done by changing the path ensemble measure,

where ⟨·⟩_{u} denotes an average in the controlled path ensemble with path probabilities *P*_{u}[*X*(*τ*)] and *O*_{τ}[**u**] can be derived from the difference in Onsager-Machlup path-actions,^{63}

interpreted in the Ito sense. Changing the force for such a Gaussian process does not change the normalization constant associated with the path ensemble in the long time limit where boundary terms from the initial and final configurations can be ignored.

Expanding Eq. (8) in terms of its cumulants and using Jensen’s inequality, we find a variational expression for the SCGF,

in terms of the mean of *O*_{τ}[**u**], within the controlled path ensemble. This expression is identical to previous work by Chetrite and Touchette that was derived using the contraction principle.^{55} Among the forces that make the rare value of the observable statistically typical, the one closest to the original force is the optimal force that realizes the supremum of the inequality. This many-body function can be approximated within a chosen *ansatz* with variationally optimizable parameters {*c*_{n}}. In the limit that {*c*_{n}} represents all possible functional forms of the many-body force, this *ansatz* becomes exact^{55} so that

where the optimal coefficients {*c*_{n}} will, in general, depend on *λ*.

The existence of a control force that saturates the supremum in Eq. (11) follows from the eigenspectrum of the generator of the SCGF,

where we have suppressed the arguments of *F*_{i}, *g*_{i}, and *f* for compactness. This operator satisfies an eigenvalue equation,

where *ψ*(*λ*) and *ϕ*_{λ}(**a**) are, respectively, the largest real eigenvalue and corresponding right eigenvector of *L*_{λ}, which follows from the Perron-Frobenius theorem and the long time limit of the SCGF. The optimal force **u**_{λ} that solves Eq. (11) is related to *ϕ*_{λ} through a Hopf-Cole transform^{53,55,64} defined as

and the controlled dynamics associated with this optimal force can be obtained from a generalized Doob transform of *L*_{λ}.^{53,65} For an interacting many-body system, the dominant eigenvector is a many body state, and therefore the optimal control force is many-bodied. Generally, we will assume that the control force is well approximated by a low rank *ansatz* such as obtained from a low order many-body expansion.

Obtaining the SCGF from directly diagonalizing the tilted generator in many-body systems is prohibitively expensive due to the size of the multidimensional state space over which *L*_{λ} is defined. There have been recent advances to approximate this state space using matrix product states for lattice based models.^{66,67} However, for continuous space systems with many particles, it is expected that Eq. (11) will present a physically motivated way to formulate approximate solutions to the eigenvalue problem and to the computation of *ψ*(*λ*) and subsequently *I*(*A*). It is worth noting that the constrained optimization of a variational expression analogous to (11) can also be directly used to compute *I*(*A*),^{55} with a straightforward extension of the algorithm described below.

### B. Optimization algorithm with explicit gradients

In order to optimize Eq. (11) by gradient descent, we need to calculate derivatives of $\u27e8O\tau [u]\u27e9u$ with respect to the variational parameters {*c*_{n}} in the limit of a large *τ*. Using these explicitly calculated gradients in the optimization algorithm can reduce the noise and numerical instabilities associated with finite difference schemes, which are generally used to empirically estimate the gradients from the optimization trajectory through the parameter space. The explicit gradients that we use have the form of expectation values in the controlled ensemble,

where *c*_{n} is any of the optimizable parameters specifying the control force. While the first term is straightforward to compute, functional forms of *∂* ln *P*_{u}/*∂c*_{n} can be calculated from the normalized path probabilities,

and its fluctuation is defined as $\delta \u022e\tau [u]=\u022e\tau [u]\u2212\u27e8\u022e\tau [u]\u27e9u$. The averages in Eq. (16) can be computed by propagating additional coordinates *y*_{n}(*t*) associated with each variational parameter *c*_{n} as

where the sum has been performed over all dynamical coordinates of the system, and its fluctuation is defined as $\delta y\u0307n(t)=y\u0307n(t)\u2212\u27e8y\u0307n(t)\u27e9u$. These fictitious coordinates are known in the literature as Malliavin weights^{68} and have previously been used to calculate parameter sensitivity of steady-state distributions in Langevin systems.^{69} Provided these averages are evaluated in the steady-state generated by the control force, $a\u0307=u+\eta $, we can invoke time-translational invariance and note that only past noise history correlates with the observable, to simplify Eq. (15),

where in the long time limit, the gradient is proportional to an integrated time correlation function. This is an example of a generalized fluctuation-dissipation formula.^{70} Putting together the two contributions

we arrive at an explicit form for the gradient of our SCGF estimate with respect to the variational parameters that can be estimated as time-averages from a straightforward molecular dynamics trajectory with the control forces. In practice, we will take the integral over the time correlation function in Eq. (20) up to a time Δ*t*. The choice of Δ*t* is discussed in Appendix A.

Using these explicit gradients, an iterative optimization is performed in the parameter space spanned by {*c*_{n}} in order to estimate the SCGF. We use an algorithm called Nesterov’s Accelerated Gradient Descent^{71,72} which shows a superlinear convergence. The learning rate and conjugate momenta are scaled by fixed parameters *μ* and *ν*, respectively. The optimization algorithm is given in Algorithm 1.

1: Begin from a guess for the variational parameters {c_{n}} and |

conjugate momenta {p_{n} = 0}. |

2: After the kth step of the optimization, parametrize the force u^{(k)} |

with parameters ${cn(k)+\nu pn(k)}$. |

3: Propagate an MD trajectory to evaluate the gradients |

$dn(k)=\u2202[O\tau [u]u(k)/\tau ]/\u2202cn$ for a large τ. |

4: Update the momenta as $pnk+1=\nu pn(k)+\mu dn(k)$. |

5: Update the variational parameters as $cn(k+1)=cn(k)+pn(k+1)$. |

6: Repeat steps (2-5) until all $dn(k)$ are less than a tolerance value. |

1: Begin from a guess for the variational parameters {c_{n}} and |

conjugate momenta {p_{n} = 0}. |

2: After the kth step of the optimization, parametrize the force u^{(k)} |

with parameters ${cn(k)+\nu pn(k)}$. |

3: Propagate an MD trajectory to evaluate the gradients |

$dn(k)=\u2202[O\tau [u]u(k)/\tau ]/\u2202cn$ for a large τ. |

4: Update the momenta as $pnk+1=\nu pn(k)+\mu dn(k)$. |

5: Update the variational parameters as $cn(k+1)=cn(k)+pn(k+1)$. |

6: Repeat steps (2-5) until all $dn(k)$ are less than a tolerance value. |

This algorithm converges to a local maximum in the parameter space, which can be different from the global maximum when the variational surface is not convex. For all the models for which we computed the SCGF, we did not obtain evidence of nonconvexity of the variational functional at any point in the parameter space. However, the convergence was significantly slower at values of *λ* near a crossover point or a phase transition. We have illustrated in Appendix B that we often converge to the global maximum in the parameter space smoothly. Nevertheless, in the event that we converge to a local maximum, we incur a systematic error in the SCGF that we discuss how to correct in Sec. II C.

### C. Correcting for systematic errors

In general, the *ansatz* specified by the parameters {*c*_{n}} will not form a complete basis for a many body system. This is because generically the dominant eigenvector of Eq. (13) is a many-body state, containing exponentially many parameters, and not expected to be exactly expressible with a low rank form. Because of this, the variationally converged SCGF *ψ*^{*}(*λ*) obtained from Eq. (11) will have a systematic error. This error and errors associated with convergence to a local maximum can both be corrected, in principle, by computing the remaining terms of the cumulant expansion

where {*κ*_{ℓ}} are the second and higher cumulants in the expansion of $ln\u27e8exp(O\tau [u*])\u27e9u*$ and the force **u**^{*} is the solution of the variational problem in the approximate and incomplete. If the *ansatz* used to express the control force, **u**^{*}, is close enough to the optimal force obtained from the Doob transform, the correction terms are small in magnitude and the series will converge quickly. This will occur when the trajectory distribution generated by the controlled dynamics has significant overlap with the tilted distribution of the original dynamics.

In cases where the *ansatz* is poor and many cumulants are needed, brute force convergence of the correction will be difficult. In such cases, control forces can be used as guiding functions for estimating the SCGF through Monte Carlo based approaches like the cloning algorithm. In the cloning algorithm, an ensemble of *N*_{w} trajectories generated from the ordinary path probabilities *P*[*X*(*τ*)] are branched with corresponding weights of exp(*λτA*_{τ}). However, under the controlled dynamics, following Eq. (8), the weighted path probabilities can be written as^{40,45}

where system evolution under an approximate controlled dynamics is nonconservative and must be accompanied by branching steps with weights given by exp(*O*_{τ}). An estimate of the SCGF is then obtained from the normalization constant of this weight so that in the limit of large *N*_{w},

where $O\tau (j)[u]$ denotes the time-integrated observable for the walker labeled *j*.

When the variationally optimized **u**^{*} is used to generate trajectories and to compute the branching probabilities, the efficiency of the cloning algorithm is improved as the control force samples the rare fluctuations in the observable. When **u**^{*} is actually the optimal force derived from the Doob transform, all trajectories achieve the rare fluctuation as typical behavior, and the weight of each trajectory becomes a constant. In this situation, no trajectories are killed in the branching step of the cloning algorithm, and the sampling is statistically optimal.^{28} However, even with an approximate *ansatz*, the variationally optimized force slows down the rate of death of uncorrelated trajectories with increasing *τ*, as demonstrated in Sec. III B.

The variational algorithm along with the cumulant-correction has improved scaling properties compared to the cloning algorithm. By adopting an approximate *ansatz* for the many-body force containing, for example, one-body and two-body terms, for a system of identical particles, we can exploit their permutation symmetry and optimize a single one-body and two-body force. Hence, the variational algorithm scales linearly with the system size, the computational cost arising only from the propagation of trajectories of interacting particles. This is in contrast to the cloning algorithm, which has an exponential scaling for observables that are system size extensive.^{37} Also, while the cloning algorithm scales exponentially with *λ*, the variational algorithm depends on the bias only through the complexity of the optimal force and scales linearly with the number of variational parameters required to approximate the force. Hence, in cases that the dominant part of the optimal force can be simply expressed within the choice of the *ansatz*, the computational cost for the algorithm to converge does not increase with *λ*. This indicates a resummation of the exponential bias through the modification of the control force. Neither does the algorithm scale with increasing observation time *τ*, as the *τ* → *∞* limit has already been incorporated in the algorithm. Finally, this algorithm can be parallelized trivially by distributing the computation of the expectation values at each step of the iteration to independent trajectories on independent processors.

## III. NUMERICAL ILLUSTRATIONS

To study the accuracy and efficiency of our variational algorithm to compute the SCGF and the optimal force, we apply it to two different continuous time and space systems. The first is a benchmark system where we can test our algorithm against a numerically exact result. This model consists of a driven underdamped particle in a periodic potential, for which we have studied rare fluctuations of the total current. The second system is comprised of multiple repulsive overdamped particles, where we have focused on the fluctuations of the total activity, which measures how much the particles explore configuration space. In this system, we demonstrate the ability of our algorithm to compute the optimal control force even through singular changes in the SCGF across a dynamical phase transition.

### A. Driven underdamped particle in a periodic potential

An underdamped particle being driven on a periodic potential by a constant external force is a simple system with two dynamical coordinates, position and velocity, that can exhibit nontrivial nonequilibrium properties due to competing ballistic and diffusive modes of transport.^{73,74} Large deviation functions for current fluctuations in this model can be obtained by numerically exact diagonalizations of the tilted generator, and the controlled ensemble can show diverse behavior in different parameter regimes.^{75} We consider this model to benchmark our variational optimization algorithm.

Specifically, we consider an underdamped particle of mass *m* moving in a one-dimensional periodic box of length *L* = 2*π*. The forces acting on the particle are derived from a cosine potential, *V*(*x*) = *V*_{0} cos(*x*), where *V*_{0} is the magnitude of the potential, and include a constant external driving force, *F*_{ext}. For the particle in contact with a bath of temperature, *T*, and friction coefficient, *γ*, the equations of motion for the position, *x*, and velocity, *v*, are

where *F*(*x*) = −*V*′(*x*) + *F*_{ext} and *η*(*t*) is a Gaussian white noise with

where *k*_{B} is Boltzmann’s constant. These equations of motion have the form of Eqs. (1) and (2) with two dynamical coordinates and a vanishing noise in position.^{75}

We investigate the statistics of the time-averaged current flowing through the system,

which measures the total displacement of the particle. The SCGF for current is given by

where the path average is obtained from the controlled dynamics

and the optimal force is, in general, a function of both the position and velocity. We expand this force in an *ansatz*

where *c*_{pq} are parameters that can be optimized variationally subject to $c\u2212p,q*=cp,q$, and the number of position and velocity basis functions are (2*M*_{1} + 1) and (*M*_{2} + 1), respectively. The basis is complete in the limit of large *M*_{1} and *M*_{2}. Note that this force incorporates the periodicity of *x* and also allows the external nonequilibrium driving, which is the *p* = *q* = 0 term, to be optimized. In the high friction limit, the dynamics becomes overdamped, and in that limit, the optimal force becomes a function of just the particle position. For small friction, inertia is important and the general form of the optimal force must be considered. We note that this velocity-dependent drift function is a *force* only in a generalized sense.

The SCGF and the optimized control force obtained from the variational algorithm can be compared to numerically exact results obtained by solving the eigenvalue equation for the tilted generator given by^{75}

as in Eq. (13). The exact control force is obtained using the right eigenvector *ϕ*_{λ}(*x*, *v*) corresponding to the largest real eigenvalue, as

where numerical diagonalization of *L*_{λ} can be performed by expressing the right and left eigenvectors over a position-velocity grid and representing the differential operators in *L*_{λ} using a second order finite difference scheme. The boundary conditions are periodic in the position grid and reflective in the velocity grid so that only forward (backward) difference at the minimum (maximum) velocity grid point is used to represent the differential operator.

We have computed the cumulant-corrected large deviation functions in this system and have compared them to the numerically exact results. We have worked with *k*_{B}*T* = 1 and *γ* = 1. These parameters along with the length of the box *L* = 2*π* let us define a natural time unit as *t*^{*} = 4*π*^{2}*k*_{B}*T*/*γL*^{2}. All observables have been reported in dimensionless units following these definitions. We have done our computations at two values of mass, *m*/*γt*^{*} = 1 and *m*/*γt*^{*} → 0. We have also chosen *V*_{0} = 2 and *F*_{ext} = 1. The numerically exact result was obtained with a grid of 140 × 50 points in the position-velocity space. The position points span the box, and the velocity points are centered at (*F*_{ext} + 2*λk*_{B}*T*)/*γ* corresponding to the mean velocity in the *V*_{0} → 0 limit. For all the simulations, the time step was chosen to be 0.001 natural time units. For *m*/*γt*^{*} → 0, an Euler scheme was used to integrate the overdamped equation of motion, while for *m*/*γt*^{*} = 1, a velocity Verlet scheme was used.^{14}

For each iterative step during the optimization, a trajectory of duration 10^{4} units was simulated. During the first half of each trajectory, the system was allowed to come to a steady-state, and the time-averaged gradients were computed only with the second half of the trajectories. For implementing Eq. (20), we integrated the correlation function up to Δ*t* = 100. The size of the basis was *M*_{1} = 3, *M*_{2} = 1 for *m*/*γt*^{*} = 1 and *M*_{1} = 3, *M*_{2} = 0 for *m*/*γt*^{*} → 0, the overdamped limit. The optimization parameters used for the gradient descent were *μ* = 0.5, *ν* = 0.2. Near *λ* = 0, all *c*_{pq} were initialized at zero, and subsequent optimizations with increasing magnitude of *λ* were initialized from a previously optimized set of *c*_{pq} taken from the nearest value of *λ*. In the overdamped limit, an accurate estimate of the SCGF could be obtained with just the variational optimization, with the cumulant correction merely a confirmation of the optimal control forces being correct. However, for *m*/*γt*^{*} = 1, the variational SCGF had to be corrected with cumulants computed with an observation time *τ* = 100 and a total trajectory length 10^{5} units. Following this procedure, we obtain estimates of SCGFs that are in quantitative agreement with the numerically exact results throughout the range of *λ* considered, as shown in Fig. 1(a). We have also calculated the rate functions for the current [Fig. 1(b)] in these two parameter regimes by a numerical Legendre-Fenchel transform of the SCGFs.

The SCGFs in Fig. 1(a) both have a *locked* region where the current changes slowly with *λ* and an *unlocked* region for larger magnitudes of *λ*. Due to the time-reversal properties of *L*_{λ}, the SCGF shows a Gallavotti-Cohen symmetry,^{76}

which is clear through the reflection symmetry about *λ* = −0.5 of the SCGF in Fig. 1(a). Analogously, the rate function obeys a fluctuation theorem symmetry,

indicating the exponentially rare probability of current in the direction opposite to the applied force.

Figure 2(a) shows the position-dependent optimal forces obtained in the overdamped limit, *u*^{(1)}(*x*), overlaid on the numerically exact answers obtained from diagonalization,^{77} for multiple values of *λ*. In the limit of |*λ*| → *∞*, the optimal forces approach the free-diffusion limit, where the majority of contribution comes from a constant nonequilibrium driving. When |*λ*| is of the order |*F*_{ext}|/*k*_{B}*T*, the forces have a nontrivial position dependence. This is manifested in the size of the basis-set, *M*_{1}, required to obtain the optimal control force accurately. Figure 2(b) shows the effect of finite basis size on the error made in estimating *ψ*(*λ*). Increasing *M*_{1} reduces the error, and ultimately the *ansatz* becomes exact when *M*_{1} is large. The error decreases when going to larger |*λ*| as the forces are easier to represent using the first few basis functions. The error bars were computed from 5 independent estimates of the SCGF using independent trajectories.

For the *m*/*γt*^{*} = 1 system, inertial effects are important and the optimal force depends on both position and velocity, and the optimal force has a complicated functional dependency that is difficult to represent using a small number of basis functions. Using a truncated basis to represent the control force leads to a systematic error in the SCGF estimate obtained using Eq. (27) that can be corrected using the cumulant expansion in Eq. (21). Figure 3(a) shows the approximate forces obtained from the variational optimization compared to the numerically exact results. When *λ* is near the Gallavotti-Cohen symmetry point, the average current is small and the optimal control force is a complicated function of both *v* and *x*. Within our *ansatz*, the optimized *u*(*x*, *v*) does not reproduce the exact form of the optimal control force. Nevertheless, these approximate forces recover the majority of the SCGF so that the cumulant expansion converges for all tested *λ* points. Figure 3(a) also contains the optimal force at a larger positive *λ*, where the forces lose their velocity dependence and simplify toward the free-diffusion limit. In this limit, position based forces are sufficient to recover the SCGF quantitatively.

Figure 3(b) shows the convergence of the consecutive terms of the cumulant expansion in Eq. (21) for different values of *λ*. *κ*_{1}, the first cumulant, is identical to *ψ*^{*}(*λ*), the variational estimate. Error bars were calculated using 5 independent trajectories for the estimation of the cumulants. Even though our basis is small and approximate, the cumulants computed from a single trajectory have decreasing amplitudes for various values *λ*, showing that the variational force is accurate enough to approach the force derived from the Doob transform. We note that the sign of the cumulants need not be positive, and therefore the variational structure in the estimate of *ψ*(*λ*) holds only for the first cumulant. Furthermore, the magnitude of the terms in the cumulant expansion need not be strictly decreasing. Figure 3(b) includes an example of a nonmonotonic convergence for *λ* = −1.1. Moreover, the sign of the error of the approximate SCGF at a given truncation of the cumulant expansion can change resulting in the cancellation of errors of two oppositely signed cumulant corrections and an accidental near agreement of the exact SCGF. We have found that by considering the convergence of the consecutive terms of the cumulant expansion, we can reliably determine the accuracy of the approximate SCGF.

### B. Activity fluctuations of overdamped repulsive particles

To study how this algorithm performs in an interacting system, we consider the fluctuations of the activity in a system of overdamped repulsive particles on a line. In both lattice and continuum models of volume excluding particles in one dimension, it has been reported that there are two characteristically distinct types of activity fluctuations, with a dynamical phase transition separating them.^{78} For rare large negative values of the activity, such systems spontaneously phase separate into macroscopically sized clusters, whereas for rare small values of the activity, they form a hyperuniform phase in which long-wavelength density fluctuations are suppressed. This behavior emerges as a singularity in the SCGF and a closing of the gap in the eigenspectrum of the tilted operator, which in the hydrodynamic scaling limit is predicted to occur with a critical point at *λ*_{c} → 0^{−}.^{45,78} This system is thus suitable to test the effectiveness of the variational algorithm in computing rare fluctuations that are collective in origin.

Specifically, we study the fluctuations of dynamical activity in a system of *N* overdamped repulsive particles in a one-dimensional periodic box of length *L*. The equation of motion is

where *F*_{i}(**x**) is the total force felt by the *i*th particle,

where *x*_{ij} = *x*_{i} − *x*_{j} and the force is derived from a Weeks-Chandler-Andersen (WCA) pair potential,

with characteristic energy, *ε*, and length scale, *σ*. The Gaussian white noise, *η*_{i}, is specified by

We work with *k*_{B}*T* = 0.5, *γ* = 1, and *σ* = 1. As before, we define our unit of time for this system as 2*k*_{B}*T*/*γσ*^{2}, and we have reported all observables in dimensionless units. Additionally, we set *ε* = 1 and consider a density of *ρ* = *Nσ*/*L* = 0.5 so that the box is half-filled.

We study a measure of activity derived from the probability that the particles stay in the same state in a short time interval.^{79} This form of the activity,

is also a part of the time-symmetric component of the path-action,^{80} and its long time statistics are similar to other commonly used metrics that count the total number of hops for particles on a lattice.^{81,82} Using Ito’s lemma to simplify the last term in Eq. (9), the variational expression for the SCGF becomes

where in addition to the force, we require the gradient of both the original and the control force.

For this system, the optimal control force **u**(**x**) is, in general, long-range and many-bodied. Previous work on related one-dimensional systems have shown long-range repulsive interactions stabilizing the hyperuniform state for values of activity small in,^{51} and long-range attractive forces acting on the surface of particle clusters that emerge in rare large negative fluctuations of the activity.^{45} For our variational *ansatz*, we have approximated the many-body force as a sum of long-range pairwise interactions. Pair forces are the lowest rank approximation to this system due to its translational invariance. From the Hopf-Cole transform, optimization of a pair force is analogous to optimization of a two-body Jastrow function as used in variational quantum Monte Carlo.^{83}

To represent the control force, we expand it in a basis of Laguerre polynomials *L*_{p} with coefficients *c*_{p} as

where $x\u0303ij=\alpha \u2212\beta |xij|$ is a linear transformation on the distance between particles *i* and *j*. The parameters *α* and *β* can be adjusted to set a scale and a cutoff for where the force smoothly decays to zero, and *M*_{3} determines the size of the basis. The basis is complete for all possible two-body forces in the limit of large *M*_{3}. The exponential factor makes the basis functions orthogonal and aids in the convergence of the optimization. We have used *M*_{3} = 10 for all of our results. We have fixed *β* = 2/*L* and optimized {*c*_{p}} and *α* with starting values of 0 and 1, respectively. In each iteration of the optimization, a trajectory of length 2 × 10^{4} time units is simulated, the first half again reserved for equilibration and the second half being used to compute the gradients. For computing the integrated correlation function in Eq. (20), we have used Δ*t* = 200 units. After obtaining the optimized control force in this *ansatz*, we use it to compute the unbiased SCGF using a cumulant expansion as before, with an observation time *τ* = 10 and a total trajectory length of 5 × 10^{4} units. Across the range of *λ* considered, we find convergence using the first three cumulants to correct the variational result. The SCGF obtained from this cumulant expansion is identical to results obtained using a guided cloning algorithm that has been described later in this section.

In Fig. 4(a), we have plotted the size scaled SCGF and the mean activity for positive and negative values of *λ*. For *λ* > 0, we find the system in a hyperuniform state, where all particles are pushed apart from each other and long-range density fluctuations are suppressed.^{78} The SCGF is size-extensive in this range of *λ*. For *λ* ≪ 0, the particles phase separate, forming a single cluster. In the region where *λ* is negative but small, there is a phase transition to this clustered state accompanied by an inflection point in the mean activity, shown in Fig. 4(b), obtained from taking the numerical derivative of the SCGF, ⟨*K*⟩_{λ} = *ψ*′(*λ*). The extensive scaling regime has been explored systematically in a related model and found to agree well with predictions from macroscopical fluctuation theory.^{45} In our studies, we find it limited to 0 > *λ* > −0.02. For large negative values of *λ*, the cluster is a highly compressed solid with system-spanning correlations that result in the SCGF scaling super-extensively. In this regime of the SCGF, the typical force is on the order of $N$ and can continue to increase with decreasing *λ* because of the soft repulsion of the WCA potential. Inspection of the distribution of mean squared forces reveals that the cluster is not homogeneous, but most compressed in its interior with lower density near the edges, with a system size independent profile; see Appendix C. The phase transition from a disordered state to a clustered state is in accord with previous observations in related systems and result in diverging correlation times rendering the precise study of the critical point difficult.^{45,78} We therefore focus our attention on the two phases on either side of that transition. Error bars were obtained from independent statistics from 3 distinct trajectories.

Figure 5 shows the effective pair-potential, *V*^{(2)}(*r*), derived from the optimal control force at different values of *λ*, for *N* = 40, obtained by the numerical integration of the control force. The potential is long-ranged and repulsive in the hyperuniform phase and long-ranged and attractive in the clustered phase. The long-range potential leads to the observed size scaling in Fig. 4 because it imposes infinite range correlations. We also observe that the depth of the attractive potential for increasingly negative values of *λ* tends to saturate, while the magnitude of the repulsive potential for increasingly positive *λ* does not. This difference arises from the steeply rising WCA forces that can achieve more negative values of ⟨*K*⟩_{λ} with just a slight decrease in the nearest neighbor distance in the controlled system. In the hyperuniform phase, achieving the rarer values of activity implies an exponentially small number of collisions between the particles, which leads to an increasing repulsive control force. These optimal control forces derived from the variational *ansatz* do not contain many-body components unlike analytically derived approximate forces,^{45} yet they achieve the same phenomenology of phase separation and hyperuniformity described previously.

Figure 6(a) characterizes the steady-state radial distribution function *g*(*r*),

obtained in these phases, for a system size of *N* = 80, where *x*_{12} denotes the interparticle distance between each distinct pair of particles. In the phase-separated state, the particles form a solid cluster that has sharp peaks in *g*(*r*) at intervals of *σ*. In the hyperuniform phase, the particles are repelled away from each other and *g*(*r*) has little structure aside from the volume-exclusion. We also characterize the structure of the hyperuniform state through the structure factor, *S*(*q*), as a function of the wavenumber *q*, obtained from

where the averages are computed in the ensemble with the control force. A linear increase in *S*(*q*) from zero at small *q* is a signature of the suppression of long-wavelength density fluctuations in the hyperuniform phase, which we confirm in Fig. 6(b). The spike at *q* = 2*π*/2^{1/6}*σ* results from 2^{1/6}*σ* being the distance of the closest approach of the repulsive particles without experiencing a force.

While we have not investigated the phase transition directly, the disparate behavior of either side of the dynamical phase transition provides a useful test of our ability to obtain control forces as the structure and dynamics of the system in the phase separated and hyperuniform states are very different. Despite their differences in both regimes, we are able to obtain control forces that are near enough to the optimal force to converge the large deviation functions using a brute force evaluation of the remaining cumulant expansion. Nevertheless, we expect this strategy may fail, in general, in which case a more robust means of estimating the remaining contribution must be employed. To explore such alternatives, we apply these control forces as guiding functions within the cloning algorithm.^{40} To quantify the statistical benefit from the control forces, we start with a trajectory ensemble of *N*_{w} = 32 000 walkers and monitor the decay rate in the number of uncorrelated walkers, *N*_{c}, with and without the control forces. The number of uncorrelated walkers is defined as those with a distinct history, having not been previously merged into an existing walker. Figures 7(a) and 7(b) show the statistics of the walkers with respect to observation time, with and without the control forces, in a system with 20 particles, and branching steps taken every 0.5 time units. We have plotted $f\lambda c(\tau )=Nc(\tau )/Nw(\tau )$, where *τ* is the observation time, to represent the growth of correlation in the trajectory ensemble.

In the clustered state, incorporating the control forces improves the number of uncorrelated walkers by multiple orders of magnitude. For larger negative *λ*, an unbiased estimate of the SCGF can be obtained only when the variational control forces are used. The improvement in the statistics of the walkers increases for more negative *λ* because the magnitude of the SCGF grows rapidly, and therefore the weight carried by the branching step increases. We see this effect in the inset, where we show the fraction of uncorrelated walkers left after an observation time and how it varies with *λ*.^{40}

The decay of the walkers depends on the overlap between the tilted trajectory ensemble and that generated from the controlled dynamics. Slower decay will result when the control dynamics generates a trajectory ensemble that is close, in this sense, to the tilted trajectory ensemble. This behavior is analogous to other approximate guiding function based importance sampling, such as that arrived by iterative feedback^{42} or analytical approximation.^{45} These effects are seen in the hyperuniform phase as well, albeit the decay of walkers in the ordinary cloning algorithm is less drastic and so is the improvement by incorporating the guiding forces. The improvement in statistical efficiency upon including the optimized forces is not restricted to the cloning algorithm and could be analogously adopted within transition path sampling^{45} or forward flux sampling.^{31}

## IV. CONCLUSION

We have developed a variational algorithm to compute optimal control forces for Langevin models driven into nonequilibrium steady-states. We have used the control forces to sample rare fluctuations in time integrated dynamical observables such as current and activity, in order to compute large deviation functions, and shown that they can be used to improve the efficiency of the cloning algorithm. Our variational algorithm, along with the correction of the systematic error with the cumulant expansion, has improved scaling properties compared to trajectory ensemble methods and can be useful in dealing with many-particle chemical or biological systems.

Although we worked with Langevin models of structureless particles, the algorithm is straightforward to generalize to higher dimensions, where optimal control forces might have significant rotational components. It can also be extended to lattice models, where the rate matrix has to be expressed in a variational *ansatz*. A system modeled by a different stochastic equation of motion, like that employing an Andersen thermostat^{14} or quantum trajectory-based approaches,^{84,85} can also be treated through this algorithm by changing only the functional forms of the path-actions provided that a Doob transform exists.

The versatility of the variational algorithm allows for its use with different force *ansatz*. In the activity-biased system, using a low-rank approximation for a many-body optimal control force was sufficiently accurate. However, in cases where the control force is not expressible in a simple functional form or even as a many-body expansion, machine learning using artificial neural networks could be used to approximate it. The variational algorithm relies on evaluating functional derivatives of the force with respect to the parameters, which can be automated with autodifferentiation algorithms^{86} as has already been demonstrated in equilibrium free energy calculations.^{87} The use of techniques developed in this paper can aid the formulation of such optimization algorithms in the future. Additionally, this algorithm can be used for model reduction in high-dimensional systems^{88} and hence to extend variational force-matching and ultra-course-graining algorithms^{89–91} out of equilibrium so that biomolecular and other soft matter systems can be simulated over large length and time scales with effective forces in nonequilibrium steady-states.

Finally, this framework of solving the optimal forces can tackle inverse-design problems out of equilibrium. Various inverse-design algorithms have been proposed that can obtain optimal forces to rationalize material design with targeted properties and to guide directed self-assembly of smaller objects.^{92,93} Our variational algorithm can be used to obtain optimal forces suitable for targeted assembly or tailored particle distributions when nonequilibrium driving forces are present and hence can be used to characterize and predict dynamical phases in new functional materials.

## ACKNOWLEDGMENTS

We thank Robert L. Jack and Hugo Touchette for helpful discussions. This work was supported by the UC Berkeley College of Chemistry. We also acknowledge support from the U.S. Department of Energy, Office of Basic Energy Sciences through Award No. DE-SC0019375.

### APPENDIX A: CHOICE OF Δ*t* FOR MALLIAVIN WEIGHTS

The choice of a finite integration limit Δ*t* to compute the integral in Eq. (19) depends on both the intrinsic correlation times of the system and the time scale of the variance of the integrated correlation function to diverge. To illustrate this, we plot

for the system in Sec. III A in the *m*/*γt*^{*} → 0 limit. The ansatz can be written in this limit as

and for Fig. 8, we have chosen *c*_{n} parameter values randomly between −1 and 1, with *λ* = 0.5. We see that even though the correlation function converges for large Δ*t*, the error in the computed gradient increases steadily. For all the results in this paper, Δ*t* was chosen to balance between these two effects so that the computed gradients suffers from no systematic error and minimum statistical error.

### APPENDIX B: CONVERGENCE OF GRADIENT DESCENT

The accelerated gradient descent algorithm converges superlinearly, and in Fig. 9, we have plotted the decrease in the systematic error *δψ*(*λ*) in the current SCGF estimate with optimization steps, for the model system in Sec. III A, in the limit *m*/*γt*^{*} → 0. We also show the simultaneous convergence of the controlled ensemble steady-state density $\rho uss(x)$ to the true biased steady-state density *ρ*_{λ}(*x*) ∝ *χ*_{λ}(*x*)*ϕ*_{λ}(*x*), where *χ*_{λ} and *ϕ*_{λ} are the dominant left and right eigenvectors of the tilted generator (30). We demonstrate this by plotting the relative entropy of the two,

which shows that even as only the current is being optimized to have a nontypical value, the entire trajectory ensemble simultaneously converges to the exact biased ensemble.

### APPENDIX C: ACTIVITY PROFILE IN CLUSTERED STATE

Under large negative activity bias, we find that the overdamped repulsive particles form a highly compressed cluster. This cluster is described by system-spanning correlations. Shown in Fig. 10 is the size-scaled profile for the first term of the collective activity (38), $\u27e8Fi2\u27e9\lambda /4\gamma kBT$, with respect to a size-scaled particle index *N*_{i} = *i* − (*N* + 1)/2. The particles are indexed from one end of the cluster to the other, such that the center of the cluster is indexed at *N*_{i} = 0. The compressed cluster does not break apart during the duration of the trajectories observed so that large |*N*_{i}| unambiguously refers to particles close to the surface of the cluster. The total mean activity ⟨*K*⟩_{λ} is proportional to the total mean squared force appearing in the first term, such that the profile of the second term in the definition looks analogous only with an opposite sign.^{82} The *O*(*N*^{2}) scaling of the mean squared force and its size-invariant parabolic profile explains the super-extensive SCGF scaling and the system spanning correlations in this *λ* regime.