Even though the computation of local properties, such as densities or radial distribution functions, remains one of the most standard goals of molecular simulation, it still largely relies on straightforward histogram-based strategies. Here, we highlight recent developments of alternative approaches leading, from different perspectives, to estimators with a reduced variance compared to conventional binning. They all make use of the force acting on the particles, in addition to their position, and allow us to focus on the non-trivial part of the problem in order to alleviate (or even remove in some cases) the catastrophic behavior of histograms as the bin size decreases. The corresponding computational cost is negligible for molecular dynamics simulations, since the forces are already computed to generate the configurations, and the benefit of reduced-variance estimators is even larger when the cost of generating the latter is high, in particular, with *ab initio* simulations. The force sampling approach may result in spurious residual non-zero values of the density in regions where no particles are present, but strategies are available to mitigate this artifact. We illustrate this approach on number, charge, and polarization densities, radial distribution functions, and local transport coefficients, discuss the connections between the various perspectives, and suggest future challenges for this promising approach.

## I. INTRODUCTION

Molecular Dynamics (MD) and Monte Carlo (MC) algorithms allow us to sample configurations from a statistical ensemble and to numerically compute observable properties as averages over configurations for realistic models of interacting atoms or molecules, when analytical theories are usually limited to cruder models.^{1} Since the early days of simulations, the tremendous development of accurate quantum and classical descriptions and of efficient sampling algorithms, supported by the ever growing availability of computational resources, established molecular simulation as an essential tool and, more fundamentally, a new scientific approach to investigate the properties of matter.^{2}

The determination of the local properties in condensed matter has always been one of the main applications of molecular simulation. To mention but a few examples in biology, chemistry, and physics, one can cite the characterization of the local structure: of water around solutes such as biological molecules by 3D densities,^{3–5} of interfacial number and charge densities at electrochemical interfaces involving aqueous electrolytes or room temperature ionic liquids (RTIL),^{6–9} of the complex solvation structure in RTILs^{10} or of aqueous solutes near an electrode surface,^{11} and of the radial distribution in concentrated electrolytes^{12} in order to understand the intriguing behavior of the correlation length reported in these systems.^{13} These structural properties are also used to infer thermodynamic quantities, such as local solvation entropies, energies, or free energies around proteins,^{14} or Kirkwood–Buff integrals in mixtures.^{15} Finally, the densities or radial distribution functions (RDFs) obtained from molecular simulations are often used as reference data to test and/or parameterize liquid state theories such as 3D-RISM^{16,17} or molecular density functional theory^{18–21} as well as coarse-grained models^{22,23} for mesoscale simulations.

Compared to the development of models to describe the systems and of algorithms to sample statistical ensembles, surprisingly little effort has been devoted to the improvement of estimators to compute local properties such as densities or radial distribution functions from the available configurations. In practice, one generally relies on histograms counting the number of particles in a voxel of finite size *h*^{d} (in *d* = 1, 2, or 3 dimensions) around a given point or of pairs separated by a distance comprised between *r* and *r* + Δ*r*. As discussed in more detail below, these straightforward estimators provide the correct expectation value but behave poorly when increasing the resolution, with a diverging variance as *h* or Δ*r* → 0. The benefit of estimators with a reduced variance would be even larger when the computational cost to generate configurations is high, for example, for *ab initio* MD.

Here, we highlight recent developments of alternative approaches leading, from different perspectives (e.g., inspired from zero-variance estimators developed in quantum Monte Carlo^{24–26} or directly rooted in statistical mechanics^{27}), to estimators with a reduced variance compared to conventional binning for densities or radial distribution functions^{28–32} or even local transport coefficients.^{33} A common point of all the resulting expressions is to use the force acting on the particles, in addition to their position. In Sec. II, we introduce the main problem of histogram-based estimators and the idea behind force sampling strategies. Section III then illustrates the variance reduction obtained from force-based estimators on local number densities in several dimensions, and extensions to charge and polarization densities, or to take into account constraints for rigid molecules. Sections IV and V are devoted to radial distribution functions and local transport coefficients, respectively. The link between these and related approaches is further discussed in Sec. VI, while Sec. VII offers some suggestions of future challenges.

## II. FORCE SAMPLING

### A. What is wrong with binning?

The computation of local densities or radial distribution functions (RDFs) from molecular simulations is based on their statistical mechanical definitions as ensemble averages over microscopic configurations. Let us first consider for simplicity a system of *N* independent atoms, without distance or angular constraints to define rigid molecules (see Sec. III C). A point in phase space is defined by the set of all positions **r**^{N} = {**r**_{1}, …, **r**_{N}} and momenta **p**^{N} = {**p**_{1}, …, **p**_{N}}. The potential energy of the system, *U*(**r**^{N}), includes the interactions between atoms and the effect of external potentials. In the canonical ensemble, with fixed volume *V* and temperature *T*, the local number density in particles of type *a* is defined as^{1}

where the sum runs over atoms of type *a*, *δ* is the (3D) Dirac delta function, and $\cdots \u2009$ denotes an average in the canonical ensemble. Specifically, the average of an observable *f* is

where *β*^{−1} = *k*_{B}*T* is the thermal energy, $H\u2009(rN,pN)\u2009=\u2009U(rN)\u2009+\u2009K(pN)$, with *K*(**p**^{N}) being the kinetic energy, is the Hamiltonian of the system, and $Z$ is the partition function. Expressions similar to Eq. (1) can be written for the 1D- and 2D-densities in cartesian coordinates using, e.g., *δ*(*z*_{i} − *z*) and *δ*(*x*_{i} − *x*)*δ*(*y*_{i} − *y*). In addition, RDFs can be defined as

where *r*_{ij} is the distance between particles *i* and *j* and the prime in the second sum indicates that *j* = *i* should be excluded when *b* = *a*.

These expressions lead straightforwardly to algorithms based on a discretization of space into a grid of voxels with size *h*^{d}, with *h* being the grid spacing and *d* ∈ {1, 2, 3}, for the density and similarly of distances, with a bin width Δ*r*, for the RDFs. Histograms are obtained by simply counting the number of particles in a given voxel (for densities) or of particle pairs separated by a distance comprised between *r* and *r* + Δ*r* (for RDFs) for each configuration, summing over particles or pairs, and averaging over a collection of configurations generated according to their weight in the canonical ensemble using MD or MC simulations.

In order to illustrate the limitations of this histogram approach, which remains the most commonly used to date, let us consider a system of identical non-interacting particles in three dimensions. In the absence of an external potential, one expects the density to be uniform and equal to the average density *ρ* = *N*/*V*. If the bin size *h* is small, each voxel of the grid discretizing space will be either empty or contain one particle, leading to an instantaneous estimate of the density $\rho \u0303$ in this voxel of 0/*h*^{3} or 1/*h*^{3}, respectively. In addition, the fraction of occupied voxels for each configuration, which for a sufficiently large number of configurations is also the fraction of configurations in which a given voxel is occupied, is *α* = *ρh*^{3} ≪ 1. The estimator $\rho \u0303$ of the local density provides the correct average over many configurations, since $\rho \u0303=\alpha \xd7h\u22123+(1\u2212\alpha )\xd70=\rho $. The quality of this estimator can be measured by computing its variance: $\delta \rho \u03032=\alpha \xd7(h\u22123\u2212\rho )2+(1\u2212\alpha )\xd7(0\u2212\rho )2=\alpha (1\u2212\alpha )h\u22126\u2248\rho /h3$. As a result, the variance diverges as the bin size *h* decreases.

This problem of histogram-based estimates of the density is illustrated in Fig. 1: When the bin size is small, most voxels are empty and the estimate of the local density is given by the fraction of configurations in which a given bin is occupied, which is small and therefore requires a large number of configurations to converge. Since the computational cost essentially comes from generating the configurations (and not from the estimate of microscopic properties from these configurations), improved estimators with a reduced variance would decrease the number of samples necessary to converge the results. The advantage would be even greater for computationally intensive approaches, based, e.g., on *ab initio* calculations, with which data are usually scarce.

### B. The idea behind force sampling

The previous discussion shows that the variance issue with histograms arises essentially from the ideal gas contribution. Fortunately, this contribution is known: It is simply the average density *ρ* = *N*/*V*. However, what really matters are the *variations* in density with respect to the position (or in RDFs with distance), which result from the interactions between the particles and from external potentials. The idea behind force-sampling strategies is to sample these variations (the gradients) using estimators involving the force acting on the atoms—a strategy well summarized in the title of Ref. 30: “Better than counting: density profiles from force sampling.” In some cases, determining the density (or RDF) from its gradient numerically requires introducing some discretization. In others, this can be done analytically so that the density or RDF can be determined with arbitrary resolution—in strong contrast with the conventional histogram-based methods. Before turning to the presentation and discussion of force-based estimators, we note that they fundamentally differ from a simple data-smoothing approach (e.g., with splines) of histogram-based ones, as they make use of additional information pertaining to the gradient of the property of interest rather than combining values of this property computed at different positions.

## III. DENSITY

### A. Variance reduction from force sampling

Starting from the definition of the density as an ensemble average [Eq. (1)] and by differentiating with respect to the position **r**, one obtains (omitting the species type *a*)

where we have introduced the force density **F**(**r**) expressed as an ensemble average of the force **f**_{i} acting on the particles, including external potentials and interactions with other particles. The appearance of the force is due to the gradient of the Boltzmann weight, since $\u2207rie\u2212\beta H=e\u2212\beta H(\u2212\beta \u2207riU)=e\u2212\beta H\beta fi$ in Eq. (2) and to the properties of the Dirac delta function. The force sampling approach hence consists in first sampling the force density from the trajectory and then computing the density from its gradient. More precisely, this “inversion of the gradient” provides the density up to a constant, which can be taken as the average density *ρ*_{0} to obtain the excess density Δ*ρ*(**r**) = *ρ*(**r**) − *ρ*_{0}. Such an approach avoids the computation of the ideal gas contribution and focuses on the non-trivial contribution from interactions. In 2 or 3 dimensions, the gradient can formally inverted as^{30}

with *c*_{d} = 4*π* if *d* = 3 and *c*_{d} = 2*π* if *d* = 2. In practice, this integration of the gradient can also be done efficiently in reciprocal space, using Fast Fourier Transforms (FFT), from the force density discretized on a grid (see Refs. 29 and 32).

Figure 2 illustrates the benefit of the force sampling approach on a 2D system of Lennard-Jones particles in an external potential from Ref. 30. Both panels, showing a 2D density map and a 1D cut through the latter, clearly demonstrate that using data from the same configurations from MC simulations and the same grid, the force-based estimator displays much less noise than the direct histogram approach. As expected from the discussion of Sec. II, the error with respect to the converged result with many configurations diverges as the bin size decreases and behaves much better with force scaling (see Ref. 30 for a discussion of the scaling on a 1D example and Ref. 32 on a 3D example). While force sampling provides a clear reduction in variance compared to histograms for small bins, the benefit is less obvious for larger ones (a more precise statement would have to be system specific), in particular, because the numerical integration of the gradient on a coarse grid also introduces some discretization errors.

### B. Pitfalls and mitigation

Another point deserving the attention of the reader is that the force sampling approach may result in non-zero values of the density in regions where no particles are present. This is illustrated on a 1D example in Fig. 3, adapted from the work of Purohit *et al.*,^{31} which compares the conventional binning approach to force sampling as described above and to mapped averaging, which is a more general framework that can be applied to recover force sampling methods for density distributions (see below). All three methods provide similar results everywhere, but the inset shows that they differ in the tails of the distribution. While with histograms, the density falls to zero when the external potential diverges, this is not exactly the case for methods based on the force (which, on the positive side, display a lower variance): by integrating the density gradient as described above, the density starts from 0 on the left but integrates to a finite plateau on the right; mapped averaging provides a symmetrized but non-vanishing result. Fortunately, this spurious residual density, which comes from inaccuracies in the force density, can be reduced simply by increasing the number of configurations used to sample the latter.

Despite the above limitation (which also applies in higher dimensions), the 1D case also offers an example where the gradient can be integrated analytically. The above-mentioned mapped averaging framework provides a way to introduce the approximate theoretical results (such as an *a priori* estimate of a density) and obtain an exact expression of the error in the theory.^{34,35} In the present 1D geometry, introducing a uniform density as a prior guess, Purohit *et al.* obtained the following expression:^{31}

with *L* being the length of the system in the *z* direction, *S* = *V*/*L* being the area of system in the lateral directions, *H* being the Heaviside function, and *f*_{z,i} being the *z*-component of the force acting on particle *i*. It is probably also possible, using a different prior estimate, to recover from this formalism another expression obtained in Ref. 33 from a different perspective, namely, to combine a position-based and force-based estimator. To that end, one can introduce appropriate weights *w*_{N}(*z*) and *w*_{f}(*z*), where the subscripts refer to the number and force, respectively, such that *w*_{f}′(*z*) = *δ*(*z*) − *w*_{N}(*z*) and that *w*_{f}(*z*) vanishes when |*z*| is large, in order to write the 1D number density as

The function *w*_{N}(*z*) can be seen as a coarse-graining kernel for the contribution of each particle to the number density and should therefore vanish beyond a coarse-graining length *ξ*. Families of estimators can be obtained by choosing various forms for *w*_{N}(*z*) and corresponding *w*_{f}(*z*), and varying *ξ* (see Ref. 33 for an example of weight functions and a discussion of the choice of *ξ*). This provides a handle to mitigate the artifact of non-zero density in empty regions predicted by integrating the force density. The possible connection with mapped averaging can be hypothesized from the observation that Eq. (7) reduces to Eq. (6) (which is a particular case of a more general expression^{31}) for *w*_{N}(*z*) = 1/*L* (which strictly speaking does not satisfy the criteria to be interpreted as a coarse-graining kernel, since every particle would contribute to the density everywhere). In practice, the estimator defined in Eq. (7) can be computed efficiently by convoluting *a posteriori* the histogram-based estimators of the number and force densities with their corresponding weight functions.

### C. Extensions: Rigid bodies and generic densities

The force sampling approach to the number density has recently been extended in several directions by Coles *et al.*^{32} From the practical point of view, it is important to consider the case of molecules described as rigid bodies using distance constraints, which includes popular water models such as SPC/E (extended simple point charge model) or the TIP*n*P family. Even though the derivation is not straightforward, the final result is particularly simple,^{32}

where $fi*$ is *the sum of forces acting on particle i and all particles participating in a constraint with i* (in particular, those belonging to the same rigid molecule as *i*) and where the average is made over configuration satisfying all constraints. In practice, the gradient of the water oxygen (respectively, hydrogen) density is computed by assigning the total force acting on each molecule to the position of the O atom (respectively, H atoms).

Another generalization was to consider other local quantities such as the charge or polarization densities. Such an extension is trivial for combinations of the type,

when the microscopic property *a*_{i} does *not* depend on the coordinates **r**^{N}, which is the case for the charge density. Following the same derivation as for the number density, one readily obtains

Unlike the charge density, the polarization density depends on the orientation of the molecules, hence their atomic coordinates. For rigid polar molecules, it is nevertheless possible to use a different set of coordinates, namely, the positions of their centers of mass $RNr$ and orientations $\Omega Nr$, with *N*_{r} being the number of rigid molecules. The result for the *α* component of the polarization density **P**(**r**) is

where the sum runs over molecules *l* and *μ*_{l,α} is the *α* component of their molecular dipole. In that case, there is no need to include constraints explicitly in the average since there are no additional degrees of freedom.

Figure 4 illustrates the results on the 3D organization of water, around a fixed water molecule from simulations with the rigid SPC/E water model, with results from Ref. 32. Panel (a) shows the regions of a large number density of O atoms, computed using Eq. (8) on a grid and integrating the gradient numerically using FFT. One can identify the position of the first solvation shell, with different basins corresponding to molecules donating (top) or receiving (bottom, only one is visible from this angle) H-bonds to/from the central molecule. Panel (b) then shows the *z* component of the polarization density, computed using Eq. (11) and FFT to integrate the gradient. This allows clarifying the orientation of the molecular dipole of water molecules in the different basins of panel (a), consistent with the donation/reception of H-bonds, and shows, in particular, that *P*_{z} vanishes in the plane of the central molecule, as expected from the symmetry of the system.

In order to demonstrate the benefit of force sampling compared to histograms to determine the 3D number and polarization densities, panels (c) and (d) finally show a 1D trace through the 3D polarization density along the blue line shown in panel (a). Both methods display two peaks of opposite sign in the vicinity of the molecule, corresponding to the two basins for H-bond donating molecules of panel (b). For a relatively large bin size, both methods give the comparable results. As the bin size decreases, the quality of the histogram-based estimator deteriorates much faster than the force-based estimator: This is visible in the larger amplitude of the noise far from the solute, as well as near the rising edge close to the latter (see the insets) and the growing asymmetry between the two sides of the molecule.

As a final illustration on a more complex system, Fig. 5, also from Ref. 32, compares the number and charge densities around a small protein, lysozyme. Following Eq. (10), the position and charge *q*_{i} of each atom are considered when computing the gradient, but the relevant force is the total force acting on the rigid molecule. The advantage of force sampling over histograms is again clearly visible in the reduction of the noise away from the solute, as well as in a better resolved solvation structure near its surface. Among other features, the positive and negative lobes of the charge density allow one to identify molecules donating or receiving H-bonds.

## IV. RADIAL DISTRIBUTION FUNCTION

While Sec. III only considered one-body densities, other statistical tools exist to quantify the structure of a system. We now turn to the force sampling approach to radial distribution functions, which reflect correlations between particles.

### A. Virial-like estimator

Inspired by ideas for the electron density in quantum Monte Carlo,^{24–26} Borgis *et al.*^{29} proposed an expression alternative to the definition equation (3). Starting from the latter, they used Poisson’s equation to rewrite $\delta (r)=\u221214\pi \Delta 1r$, with Δ being the Laplace operator. Integration by parts in the canonical average, symmetrization between particles *i* and *j* and finally Gauss’s law then lead to

with *H* being the Heaviside function. The origin of the forces **f**_{i} and **f**_{j} acting on the particles is again the gradient of the Boltzmann weight with respect to the particles positions, which comes from integrating by parts. This expression displays naturally the separation between the ideal gas contribution, which as described in Sec. II A is detrimental to the convergence of the histogram-based estimator and a virial-like correction arising from interactions between particles and external potentials. It is worth emphasizing here that the above expression holds even with many-body potentials, since no assumption of pairwise additivity was needed in the derivation.

One can further note that it involves a Heaviside instead of a Dirac delta in Eq. (3). This results from the integration by parts in the canonical average, which can be seen as a way to “integrate the gradient” of *g* analytically, and has important practical consequences. While in the histogram approach, each pair only contributes to the estimate of the RDF in the bin corresponding to *r* = *r*_{ij}, and symmetrically, the estimate at *r* only benefits from pairs such that *r*_{ij} = *r*; with Eq. (12), each pair contributes to the estimate of the RDF for all values *r* ≤ *r*_{ij}, and symmetrically, the estimate of *g*(*r*) benefits from all pairs separated by a distance larger than *r*. In addition, bins are not necessary anymore, and one can compute the RDF with arbitrary resolution in *r*.

The benefit of the force sampling approach to estimate RDFs is illustrated in Fig. 6, which reproduces the results from Ref. 29 for a Lennard-Jones fluid. Panel (a) shows that the estimate obtained on a single configuration with histograms (using a very small bin width) is much noisier than with force sampling, the latter being indistinguishable from the converged histogram-based estimate from 10^{4} configurations. Panel (b) then shows the variance of the estimator over *N*_{conf} = 10^{3} configurations as a function of *r*, defined by

The shape of *v*(*r*) is the same for all methods, in particular, it vanishes inside the core (where *g* should vanish) and at long distance (where *g* ≈ 1). However, the variance of the force-based estimator is significantly reduced compared to histograms, even more so that the bin size is small for the latter.

### B. Alternative derivations and expressions

As mentioned earlier, Eq. (12) can be seen as an integration of *g*′(*r*) = *dg*/*dr* from *r* → *∞* where *g* = 1, leading to two difficulties. First, in a finite box, the limiting value of *g* is never reached exactly. In fact for simulations in the canonical ensemble, the plateau value differs from 1 by a $O(N\u22121)$ correction.^{36,37} Second, this expression does not guarantee *a priori* that *g* will vanish inside the core, which is the case for *r* = 0 only if

In practice, this is almost the case [hence, it is hard to see the small but finite value in Fig. 6(a)], but only approximately. For systems with sufficiently hard repulsion at short range, one can enforce that *g*(0) = 0 and integrate from *r* = 0 to obtain an alternative expression,

A similar expression was obtained for the two-body density by Purohit *et al.* using mapped averaging (already discussed for the one-body density in Sec. III).^{31} Compared to Eq. (12), each pair now contributes to the estimate of the RDF for all values *r* ≥ *r*_{ij}, and symmetrically, the estimate of *g*(*r*) benefits from all pairs separated by a distance smaller than *r*.

It is worth noting that another similar expression had, in fact, been introduced earlier by Adib and Jarzynski^{28} as an unbiased estimator first for the density around a fixed solute and then for the RDF. Starting from the expression of a local (surface) quantity in terms of a volume average, using Gauss’s theorem and introducing an appropriate vector field **u**, they obtained an expression of the RDF [Eq. (20) of Ref. 28] valid for pairwise additive potentials—while the above expressions do not rely on such an approximation and can also be used with many-body potentials such as in *ab initio* simulations. The vector field **u** introduced in this work is similar to the $\u22071r=\u2212rr3$ in the above expressions but also accounts for the possible presence of a hard sphere solute at the origin and introduces the largest sphere that fits in the simulation box (*R*_{max} = *L*/2 with *L* being the box length). It might therefore be useful to consider a similar quantity in the derivation of Ref. 29.

As a final remark on RDFs, we also mention that the idea of integrating by parts in the canonical average can be pushed even further. In the same study,^{29} Borgis *et al.* obtained by performing a second integration by parts another expression of the RDF,

with $\Phi i=\beta fi2\u2212\Delta riU$. Unlike Eq. (12) in which only the force enters, this one requires the Laplacian of the energy (trace of the force gradient) with respect to **r**_{i}, which is usually not computed in molecular dynamics simulations. In addition, preliminary numerical tests in this study proved disappointing. However, this illustrates the fact that there are many possibilities to obtain alternative estimators (even though not always practically useful, at least immediately).

## V. LOCAL TRANSPORT COEFFICIENTS

Beyond structural properties, the force sampling strategy has also been explored recently for the computation of local transport coefficients in confined fluids,^{33} such as the system illustrated in Fig. 7(a). Due to the external potential from the walls, the fluid adopts a layered structure, shown in Fig. 7(b). In the presence of an external perturbation parallel to the walls, such as a pressure gradient −∇*P* or a chemical potential gradient −∇*μ* (or an electric field −∇*ψ*, e.g., for an electrolyte solution), the various components of the fluid will respond differently and the local steady-state flux of each species will depend on the type of perturbation. For sufficiently small perturbations, the response is linear and fully characterized by a *mobility matrix* $M(z)$ relating local fluxes to the external forces. For the binary fluid of Fig. 7, this can be written as

where *q* and *j*_{A} are the local volume fluxes and solute fluxes, respectively, and $cA*$ is a reference concentration related to the composition in the bulk part of the fluid. For example, the element $M11(z)$ of this 2 × 2 matrix predicts the flux of all species under a pressure gradient (Poiseuille flow), while $M12(z)$ corresponds to diffusio-osmosis.

### A. Mobility profiles from equilibrium molecular dynamics

For illustration purposes, we will only consider here $M11(z)$, but the other cases can be found in Ref. 33. Using linear response theory, it is possible to derive expressions of the mobility profiles $Mij(z)$ as integrals of correlation functions [Green–Kubo approach (GK)],

with $CqQ(t,z)=q(z,t)Q(0)$ is a time-correlation function (for each *z*) between the local volume flux,

with *v*_{x,i} being the *x*-component of the velocity of particle *i*, *N* = *N*_{A} + *N*_{B} being the total number of particles, and *H* being the distance between the wall, and the global flux averaged over the fluid slab,

Alternatively, the integrals can be computed as [Einstein–Helfand approach (EH)]

### B. Force sampling for time correlation functions

Since the time correlations are defined as canonical averages of observables involving a Dirac delta, one can follow the ideas developed for static properties. From the definition,

one can introduce a force-weighted observable,

and form, in the spirit of Eq. (7), the mixed estimator,

where the convolution products are in space only, not time. A similar construction can be followed for the EH approach.

Figure 7(c) compares the predictions from equilibrium MD simulations with the GK and EH approaches, both using the mixed estimators introduced above, for the *M*_{11}(*z*) element of the mobility matrix, corresponding to a Poiseuille flow. The results are found to be in excellent agreement with the more direct non-equilibrium MD approach. The inset further shows that when considering the velocity profile, i.e., the flux divided by the density, one recovers the parabolic shape expected from continuum hydrodynamics. The fact that the NEMD results are recovered validates the relevance of the approach. This is particularly important because the derivation leading to the exact result for the density is complicated for time-correlation functions by the fact that observables are considered at two times: The canonical average corresponds to points in phase space at the initial time 0, where the global flux is also considered, while the local fluxes are computed from the positions and velocities at subsequent time *t* (it is possible to write a symmetric expression). The integration by parts over the initial positions *z*_{i}(0) then introduces an additional term, $Q(0)HN\u2211i=1N\u2202vx,i(t)\u2202zi(0)wf(zi\u2212z)$, which involves the derivative of the *x* component of the velocity at time *t* with respect to the initial position in the *z* direction. Although this term might not vanish in principle, it would be very difficult to evaluate from the trajectories. It was found numerically that it was sufficient to neglect it, but a formal derivation would be desirable.

The improvement of the forced-based estimator compared to histograms is discussed in detail in Ref. 33, together with the results for the other elements of the mobility matrix. We only emphasize here that such an improvement is essential because determining the mobility profiles is computationally much more demanding than static properties: It requires converging, *for all positions*, the time correlation functions with sufficient accuracy to compute the integral in Eq. (18) (GK) or the slope in Eq. (21) (EH). To conclude this discussion of local transport properties, we note that for the determination of a single mobility profile, NEMD remains more efficient than the above equilibrium MD route. However, the latter can provide all elements of the mobility matrix simultaneously, whereas NEMD requires separate simulations with the different perturbations, and to consider different magnitudes in order to check the validity of the linear response. Therefore, one can expect that the equilibrium approach will be particularly helpful for multicomponent systems, e.g., for the response of electrolyte solutions to pressure, chemical potential, or electric potential gradients (hence, a 3 × 3 mobility matrix). The force based estimator, with a reduced variance compared to histograms, will contribute to decreasing the number and length of trajectories necessary to obtain the reliable results.

## VI. DISCUSSION

In Secs. III–V, we have mentioned several approaches arriving from different perspectives to similar expressions for estimators with a reduced variance compared to histograms, which involve the forces acting on the particles. The examples making use of the Poisson equation to rewrite the Dirac delta and/or using Gauss’s theorem involve integration by parts in the definitions as canonical averages, very much in the spirit of the Yvon theorem for averages of the derivative of an arbitrary function of the particle coordinates *A*(**r**^{N}) with respect to the position *z*_{i} of one particle,^{1}

They are also directly related to the notion of potential of mean force (PMF), as discussed for the RDF in Ref. 29 or the derivation of Ref. 32 for the density gradient in the presence of constraints, which makes use of the fundamental result of Ciccotti *et al.* for the mean force associated with generic collective variables,^{38} by choosing the three spatial coordinates as collective variables. The same strategy could be used to extend Eq. (12) in the presence of constraints using the distance between particles as collective variable. One should keep in mind, however, that the force density **F**(**r**) = *k*_{B}*T*∇*ρ*(**r**) [see Eq. (4)] and the mean force, derived from the PMF, differ by a factor of *ρ*(**r**). A more detailed discussion of this difference and the practical consequences for their computation in molecular simulations, in the case of the RDF, can be found in Ref. 29.

The separation between the ideal gas reference and the contribution from interactions and external potentials is natural in the framework of mapped averaging, which provides the exact correction to a theory (e.g., uniform density). This idea of using a reference distribution, leading to estimators involving forces, had also been proposed by Basner and Jarzynski to compute binless PMFs, with an illustration on an angular distribution,^{39} or Zhang and Ma to compute RDFs or angular distributions.^{40} One could envision, for example, to apply it also in the case of the local transport properties illustrated in Sec. V using the theoretical prediction from continuum hydrodynamics as reference.

Since in molecular dynamics simulations, the forces are computed to generate the trajectory, the force-based estimator does not entail a larger computational cost compared to histograms. As a result, one can only recommend the use force-based estimators to compute local densities or radial distribution functions. For Monte Carlo, the additional cost of computing the forces in addition to the energy should be compared to that of generating enough configurations to achieve the same variance reduction with histograms, which depends, in particular, on the bin width. Another issue related to the computation of forces is the case of particles described by hard cores. Indeed, the expressions reported here cannot be used directly. However, the effect of excluded volume can be treated separately from other interactions, which allows us to apply the ideas of force sampling even in this case.^{28,35}

## VII. SUMMARY AND FUTURE CHALLENGES

Even though the determination of local properties, such as densities or radial distribution functions, remains one of the most standard goals of molecular simulation, it seems that the community still essentially relies on histogram-based algorithms to estimate them. The main objective of the present work is to highlight recent developments of alternative approaches leading, from different perspectives, to estimators with a reduced variance compared to conventional binning. They all make use of the force acting on the particles, in addition to their position, and allow us to focus on the non-trivial part of the problem in order to alleviate (or even remove in some cases) the catastrophic behavior of histograms as the bin size decreases. The corresponding computational cost is negligible for molecular dynamics simulations, since the forces are already computed to generate the configurations, and the benefit of reduced-variance estimators will be even larger when the cost of generating the latter is high, in particular, with *ab initio* simulations. The force sampling approach may result in spurious residual non-zero values of the density in regions where no particles are present, but strategies are available to mitigate this artifact.

Up to now, the variance reduction via force sampling has mainly been demonstrated numerically—even though the methods themselves were, of course, derived as rigorously as possible and good reasons were proposed to explain the observations. Since variance reduction is a field of research by itself, computational physicists, chemists, and biologists would certainly benefit from insights from the mathematics community in order to obtain analytical predictions of the gain to be expected with respect to binning, in particular, the scaling with the bin size (when the force density is first sampled on a grid, then integrated numerically) and the number of configurations. Other directions could include improved or optimal methods to “integrate the gradient” in order to reconstruct the density from the force density, in particular in 3D, or the formal derivations in the case of transport (see the discussion of the “missing term” in Sec. V), including the presence of constraints for rigid bodies.^{41}

The potential of these force sampling strategies could also be investigated for other thermodynamic conditions, e.g., the isothermal–isobaric (*NPT*) or the grand-canonical (*μVT*) ensembles, or for quantities derived from the ones considered here, e.g., to improve the convergence of Kirkwood–Buff integrals that involve RDFs.^{42,43} Other local properties might also benefit from the ideas behind force sampling, in particular, the local stress tensor.^{44,45} However, one should keep in mind that this might require quantities that are not computed in typical simulations, such as the force gradient, and that the additional computational cost should not exceed that of reducing the variance by the same amount using histograms simply by generating more configurations. Another possible direction for the short-term development of force sampling, in particular, following the approach of Ref. 32 based on the results of Ref. 38, is the extension to non-cartesian coordinates and/or multidimensional collective variables, which can be used to characterize the 3D structure of liquids (see, e.g., Ref. 46 for a recent example).

Finally, it would be interesting to explore the connections with other simulation contexts where the strategies discussed in the present work could also provide alternative expressions for observables of interest. For example, improved estimators involving the force have already been proposed in path integral molecular dynamics to estimate thermodynamic quantities based on a virial form (see, e.g., Refs. 47 and 48). In the other direction, the ability to reduce errors in the computation of densities/PMFs would also be useful for coarse-graining strategies based, e.g., on Boltzmann inversion^{22} or on the relative entropy^{23} or for simulations with adaptive resolution introducing a thermodynamic force density in the hybrid region.^{49} As for dynamic properties, beyond the equilibrium route presented in Sec. V, it could be useful to extend the force-sampling approach to non-equilibrium steady states, as suggested in Ref. 30, or even to path sampling methods.^{50} We hope that this Perspective will motivate others to embark on this promising path.

## ACKNOWLEDGMENTS

The author wishes to express his gratitude to many colleagues for collaborations, discussions, and suggestions of literature on related work, in particular, Daniel Borgis, Rodolphe Vuilleumier, Roland Assaraf, Samuel Coles, Etienne Mangaud, Daan Frenkel, Antoine Carof, Sara Bonella, Giovanni Ciccotti, David Kofke, Vincent Krakoviack, Thomas F. Miller III, Glen Hocky, and Juan P. Garrahan. He acknowledges financial support from the Ville de Paris (Emergences, project Blue Energy) and the H2020-FETOPEN project NANOPHLOW (Grant Agreement No. 766972) from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 863473).

## DATA AVAILABILITY

Data sharing is not applicable to this article as no new data were created or analyzed in this study.