A configurational sampling algorithm based on nested layerings of Markov chains (Layered Nested Markov Chain Monte Carlo or L-NMCMC) is presented for simulations of systems characterized by rugged free energy landscapes. The layerings are generated using a set of auxiliary potential energy surfaces. The implementation of the method is demonstrated in the context of a rugged, two-dimensional potential energy surface. The versatility of the algorithm is next demonstrated on a simple, many-body system, namely, a canonical Lennard-Jones fluid in the liquid state. In that example, different layering schemes and auxiliary potentials are used, including variable cutoff distances and excluded-volume tempering. In addition to calculating a variety of properties of the system, it is also shown that L-NMCMC, when combined with a free-energy perturbation formalism, provides a straightforward means to construct approximate free-energy surfaces at no additional computational cost using the sampling distributions of each auxiliary Markov chain. The proposed L-NMCMC scheme is general in that it could be complementary to any number of methods that rely on sampling from a target distribution or methods that exploit a hierarchy of time scales and/or length scales through decomposition of the potential energy.

## I. INTRODUCTION

Monte Carlo (MC) techniques provide essential tools for molecular modeling, especially in cases where the inherent system dynamics are unable to adequately explore relevant regions of phase space during the time scale of a simulation. Many powerful MC methods have been developed in attempts to overcome such issues. Some of these methods include configurational bias MC,^{1,2} expanded ensemble MC,^{3} hyper-parallel tempering MC,^{4,5} Hamiltonian Replica Exchange (HREX) MC,^{6,7} extended state space MC,^{8} multicanonical parallel tempering MC,^{9} resolution-exchange MC,^{10} free-energy perturbation Hamiltonian replica exchange,^{11} Hamiltonian switch MC,^{12} and hybrid non-equilibrium molecular dynamics MC,^{13} among many others.^{14–34} Of particular interest to this work is a set of so-called Nested Markov Chain Monte Carlo (NMCMC) methods, exemplified by several pioneering studies,^{35–37} in which a computationally cheaper, approximate potential energy surface (PES) is utilized to sample the distribution of a more expensive, detailed PES.

In the NMCMC schemes of Gelb^{36} and Hetenyi,^{37} a number of “normal” MC moves (e.g., displacements) are made with an auxiliary (typically approximate) PES to generate a Markov chain of configurations. At the conclusion of the auxiliary Markov chain, the entire sequence of events is used as a move proposal in the Markov chain corresponding to the full, “exact” PES; this sequence is then accepted or rejected based on an acceptance criterion that unbiases the Markov chain of the auxiliary potential and enforces detailed balance in the Markov chain of the exact PES. In the canonical ensemble, the acceptance criterion is simply related to the difference between the exact PES and the auxiliary PES. This approach has been used to successfully accelerate simulations by utilizing classical, empirical potentials to propose configurations in *ab initio* simulations,^{35,38,56} as well as reducing the interaction cutoff distance in Lennard-Jones (LJ) systems^{36} and reducing the frequency of long-range electrostatic energy evaluations.^{39}

While NMCMC methods can reduce computational cost considerably, these methods can suffer from issues associated with using non-optimal parameters in the auxiliary Markov chain. Two of the more notable issues concern (i) the number of steps that are taken in the auxiliary Markov chain and (ii) the choice of the thermodynamic parameters (potential energy function, temperature, etc.) for the auxiliary Markov chain. For auxiliary Markov chains of sufficient length, consecutive samples become statistically de-correlated and require fewer samples to produce accurate ensemble averages; however, the length of the auxiliary Markov chain is inversely correlated to the acceptance probability of the overall move, which directly influences the efficiency of phase-space exploration. With regard to choosing thermodynamic parameters of the approximate layer, NMCMC inherits many of the issues of free-energy perturbation theory (FEP)^{36} since effective sampling hinges on the sufficient overlap of the target and approximate distributions. To this end, promising methods have been introduced for optimizing the thermodynamic parameters of the auxiliary Markov chain.^{40–43,55}

Here, we propose a layered NMCMC scheme, referred to as L-NMCMC, that mitigates issues encountered in previous NMCMC formulations while enhancing the benefits. In this approach, multiple Markov chains, each generated using different potential energy functions (one being the exact PES and the remaining being auxiliary potentials), are nested recursively. In comparison to earlier NMCMC schemes, this many-layer approach adds flexibility by permitting the use of multiple auxiliary potentials that can exploit the often hierarchical nature of interactions encountered in molecular simulation and improves efficiency by improving the overlap in configuration space between Markov chains in neighboring layers. To be concrete, in L-NMCMC, the number of “layers” is synonymous with the number of Markov chains used in the MC scheme. One-layer NMCMC is simply considered as a normal Markov chain MC procedure, and two-layer NMCMC is identical to the original NMCMC developments,^{35–37} where a single nesting of Markov chains is used. Including more than two layers encompasses the generalization presented in this Article.

The outline of the paper is as follows. In Sec. II, we review the basic formalism of NMCMC using two layers before extending the formalism to an arbitrary number of layers in Sec. III. The application of FEP theory with L-NMCMC sampling to extract the underlying free-energy surface (FES) is demonstrated in Sec. IV, and technical considerations related to the performance of L-NMCMC are outlined in Sec. V. Subsequently, L-NMCMC is applied to a series of model systems in Secs. VI and VII. Finally, the relationship of L-NMCMC to other methods is briefly discussed in Sec. VIII before concluding in Sec. IX.

## II. TWO-LAYER NMCMC FORMALISM

Our notation and development follows that of Gelb.^{36} We define a Markov chain that satisfies detailed balance (*π*_{i}*p*_{ij} = *π*_{j}*p*_{ji}), where *π*_{i} and *π*_{j} are the probabilities associated with states *i* and *j*, respectively, and *p*_{ij} and *p*_{ji} are the transition probabilities associated with moving from state *i* to *j* and *j* to *i*, respectively. Using the Metropolis acceptance criterion, we define the transition probability *p*_{ij} as the product of an arbitrary transition matrix, *q*_{ij}, and the acceptance criterion, *α*_{ij}, the proper choice of which ensures microscopic reversibility.

In two-layer NMCMC, a Markov chain corresponding to an auxiliary PES is introduced to generate new trial configurations for the Markov chain associated with the exact PES. The arbitrary transition matrix *q*_{ij} of the original Markov chain can be written in terms of a sequence of *M* consecutive MC moves performed with the auxiliary potential to generate an auxiliary Markov chain,

In this notation, the state with *k* = 0 corresponds to the state *i*, and the state with *k* = *M* corresponds to state *j*. Using the Metropolis acceptance criterion for $\alpha ij\u2032$ in the auxiliary layer,

and *α*_{ij} in the exact layer,

the ratio of the transition matrices in the exact layer can be rewritten in terms of the state probabilities in the auxiliary layer

Using Eq. (4), the final acceptance criterion required to satisfy detailed balance in the Markov chain of the exact PES becomes

In the canonical ensemble, Eq. (5) becomes

where Δ*U* is the exact potential energy change associated with the change from state *i* to *j*, Δ*U*′ is the auxiliary potential energy change associated with the same configuration change, and *β* is the inverse temperature of the system. This acceptance criterion is analogous to that governing the exchange of configurations between replicas in HREX in the canonical ensemble.

## III. LAYERED NMCMC FORMALISM

To develop an arbitrarily layered version of NMCMC, *N* + 1 potential energy surfaces {*U*} = {*U*_{0}, *U*_{1}, …, *U*_{N}} are defined, with *U*_{0} representing the exact or target PES. To ensure that each layer obeys microscopic reversibility (i.e., $\pi inpijn=\pi jnpjin$ for the *n*th layer), the transition probability in the *n*th layer is written as

Furthermore, the transition matrix $qijn$ of the *n*th layer is written as a product of *M*_{n+1} transition probabilities for the Markov chain of the (*n* + 1)th auxiliary layer

By recursively substituting this definition, one obtains the result corresponding to *N* + 1 total layers as

In the layered formalism of Eq. (9), it is important to note that changes in configuration space (e.g., displacements) only occur in auxiliary layer *N* (see Fig. 1), and this constitutes the majority of the total number of PES evaluations. PES evaluations in the layers with *n* < *N* are performed as needed according to a schedule set at the beginning of a simulation; the purpose of these PES evaluations is to either accept and propagate a configuration to the next highest layer or to reject and reset the configuration of the lower layer (Fig. 1). After a walk in the (*n* + 1)th layer, the acceptance criterion in the neighboring layer, $\alpha ijn$, shown diagrammatically in Fig. 1, is governed by the following equation:

Equations (9) and (10) form the basis of L-NMCMC and serve to generalize previous two-layered schemes^{35–37} to many-nested Markov chains with an arbitrary number of layers. The practical implementation of L-NMCMC involves selecting a list of steps ({*M*} = [*M*_{0}, *M*_{1}, *M*_{2}, …, *M*_{N}]) and potentials ({*U*} = [*U*_{0}, *U*_{1}, *U*_{2}, …, *U*_{N}]) associated with each layer. To perform a step in layer 0, *M*_{1} steps are performed using the PES for layer 1, with each step consisting of *M*_{2} steps using the PES for layer 2, and so on. At the completion of *M*_{n+1} steps, Eq. (10) is used to decide if the final configuration in the (*n* + 1)th layer is accepted by layer *n*. If the move in the *n*th layer is accepted, the configurations in layers *n* and *n* + 1 are set to the new configuration of the *n*th layer, and the recursion continues. Likewise, if the overall move is rejected by the *n*th layer, the configuration in layer *n* + 1 is reset to the previously accepted configuration of the *n*th layer, and the recursion continues.

## IV. L-NMCMC WITH FREE ENERGY PERTURBATION THEORY

For neighboring layers, the acceptance of a proposed L-NMCMC move depends on the overlap of their associated probability distributions. Since L-NMCMC serves as a move proposal in the layer corresponding to the exact PES, a rejected move corresponds to a possibly significant amount of computational time invested in the auxiliary layers, which is undesirable particularly if MC acceptance probabilities are low. An intriguing possibility toward this end is to apply FEP to the auxiliary sampling distributions from an L-NMCMC simulation to utilize all sampling in the auxiliary layers, irrespective of whether the move is ultimately accepted or rejected.

If the differences in thermodynamic parameters between layers are small enough such that a perturbative criterion holds, FEP can be used to approximate a FES from the statistics of configuration space exploration in the auxiliary layers, even if the overall L-NMCMC move is rejected. More formally, the free-energy difference between the exact layer 0 and the auxiliary layer *N* can be constructed based on the energy difference Δ*U*_{n,n+1} between layers using FEP theory as

where the (*N* + 1)th potential is a theoretical potential such that *U*_{N+1} = 0 for all configurations. Equation (11) converges more quickly as the number of visited configurations increases in the (*n* + 1)th layer and becomes representative of the ensemble in the *n*th layer, or likely when Δ*U*_{n,n+1} is small. A particularly appealing aspect of this approach is that the terms required for the exponential average in Eq. (11) are already computed in the L-NMCMC interlayer check [Eq. (10)]. Consequently, the approximate FES for the exact layer can be computed with no additional computational effort from the L-NMCMC simulation. Importantly, free-energy estimates can be improved with additional layers that enhance the configurational overlap, whereas two-layer schemes will be inherently limited. Methods for statistically optimal combination of statistics across different states, such as weighted-histogram analysis^{44} or multi-state Bennett acceptance ratio,^{45} likely have analogous implementations in the context of L-NMCMC.

## V. PERFORMANCE AND LAYER SELECTION

For accelerating molecular simulations, L-NMCMC is in part analogous to multiple time-stepping molecular dynamics techniques such as reversible reference system propagator algorithms (RESPA)^{46} and in part analogous to hyper-parallel tempering.^{4,5} Specifically, the L-NMCMC layering, i.e., the set of auxiliary potentials and corresponding set of steps, could be selected such that the computational cost of evaluating the auxiliary PES decreases as one progresses from the exact layer to the highest-indexed auxiliary layer, but the layering could also be selected such that the overall acceptance rate (and thus efficiency) of Monte Carlo moves is improved. Each or both of these may be major considerations depending on the application.

To quantify anticipated speedups, we assume a check schedule for L-NMCMC of the form {*M*} = [*M*_{0}, *M*_{1}, *M*_{2}, …, *M*_{N}]; the computational cost of PES evaluations in each layer then follows {$w$} = [$w0$, $w1$, $w2$, …, $wN$]. The computational cost of a single L-NMCMC move proposal in the exact layer can be written as

The ratio of *W* between layers can be used to compare the relative cost of L-NMCMC layerings. As an illustrative example of the use of Eq. (12), consider two different layering schedules—a 3-layer scheme with {*M*} = [1,*M*_{1}, *M*_{2}] and a 2-layer scheme with {*M*} = [1, $M1\u2032$]—with the constraint that the product of the steps in each schedule is constant, i.e., $M1\u2032=M1M2$. Supposing that the relative costs of potential energy calls are given as $w0$ = *c*_{1}$w1$ and $w1$ = *c*_{2}$w2$ and comparing to a $M1\u2032$-step standard MC simulation, the general speedup of the 2-layer L-NMCMC simulation is $M1\u2032/(1+M1\u2032c1\u22121)$, while the speedup of the 3-layer L-NMCMC simulation is $M1\u2032/[1+M1c1\u22121+M1M2(c1c2)\u22121]$. As an illustrative example, assuming relative costs of *c*_{1} = 8 and *c*_{2} = 2, the 2-layer scheme with {*M*} = [1, 100] is expected to be 7.4× faster than 100 moves of standard MC, while a 3-layer scheme with {*M*} = [1, 10, 10] is expected to be 11.8× faster, even with the modest cost reduction between layers 2 and 3. Overall, this leads to a 60% computational speedup of the 3-layer scheme over the 2-layer scheme; if $w2$ could be increased to 8$w3$, this enhancement becomes 354%. Note that using more steps in the higher-indexed and presumably less costly auxiliary potentials generally results in greater simulation acceleration, for example, assuming that $w0$ > $w1$ > $w2$, {*M*} = [10, 10, 20] is computationally more efficient than {*M*} = [10, 20, 10].

Regarding configurational sampling, an advantage of the layering in L-NMCMC is that the layers serve as gatekeepers that typically only propagate configurations from higher-indexed layers when those configurations are representative of the ensemble of the neighboring layer PES (Fig. 1). However, in a strictly two-layer setting, reasonable acceptance rates are obtained only when the approximate PES is sufficiently similar to the exact PES.^{35,36} Here, if an inter-layer acceptance probability is low, extra layers with a PES intermediate between the original layers can be added to improve acceptance rates, analogous to adding more replicas in parallel tempering or replica exchange (REX). Moreover, LNMCMC can likely take advantage of ideas from relative entropy^{18,47} and replica exchange^{43} optimization approaches to improve layer selection. Notably, the cost of adding more layers need not be linear as in other techniques, and this often improves the overall sampling efficiency.

Importantly, L-NMCMC enhances the exploration of configuration space for a given number of potential energy calls. In the limit that {*U*} is picked to induce diffusive motion in all layers, traversal of configuration space will obey random walk statistics in each layer. Assuming that moves proposed in layer *n* + 1 are accepted with probability *α*_{n+1} and that the average displacement size in the (*n* + 1)th layer is *d*_{n+1}, then the root-mean-squared distance moved in the *n*th layer is

which sets the root-mean-squared distance diffused by a single L-NMCMC move to be

While the assumption of diffusive motion in all layers is not generalizable, Eq. (14) provides an upper bound on the extent of configuration space explored using L-NMCMC moves, which can provide guidance with the initial setup of the layering scheme. Specifically, the relative enhancement in configurational exploration that is useful for comparing layering schemes can be obtained by rewriting Eq. (14) using the acceptance ratio of the exact layer, *α*_{0},

The relative enhancement in configuration exploration, *E*_{c}, when considered in a ratio with the computational cost per step, W [Eq. (12)], can be used to determine the efficiency of convergence in the diffusive regime and represents a useful starting point for optimizing layering schemes.

The overall efficiency of a given L-NMCMC layering depends critically on the interplay of the cost of move proposal and the efficiency of accepted moves. These considerations are reflected separately in Eqs. (12) and (15), the former dictating the cost of proposing a trial move and the latter approximating the relative enhancement in configuration space exploration. It is notable that the potential amount of configuration space explored increases geometrically [Eq. (15)] with the addition of more layers, but the overall computational cost could increase simultaneously. As an illustrative example, we provide an assortment of simple layering schemes for the 2D Lennard-Jones (LJ) fluid examined in Sec. VII and include the corresponding acceptance rates, potential energy evaluations per L-NMCMC step, and cost analysis in Table S1 and Fig. S4 of the supplementary material. Because actual computational cost depends strongly on implementation and machine architecture, the number of potential energy evaluations is used as a proxy. The data in Table S1 of the supplementary material provide information relevant for selecting a L-NMCMC layering scheme for this system, and the ratios of Equations *E*_{c} and *W* are shown to explain the rates of convergence observed for the LJ fluid studied in Sec. VII.

However, in general, good layering schemes will depend on the shapes and costs of the PES of the system, as well as the goals for the calculation itself. To this end, we re-emphasize the additional information derived from the use of more layers via the FEP formalism (Sec. IV). As with many existing MC schemes, a reasonable approach is thus to perform short calibration runs with different layerings to understand computational expense and to use tuning phases to adjust acceptance rates for a particular layering choice.

## VI. A TWO-DIMENSIONAL SURFACE WITH MULTIPLE METASTABLE MINIMA

To validate the accuracy and demonstrate the utility of L-NMCMC, we first explore the statistics of a rugged two-dimensional PES with multiple metastable minima^{48} (see the Appendix for details); the exact FES obtained by direct numerical integration of the PES is shown in Fig. 2(a). All simulations use *β* = 1.0, and proposed displacements in the final layer are drawn uniformly from the interval [–0.15, 0.15). The employed auxiliary potentials, {*U*}, are simply obtained by scaling the exact PES by a constant less than 1.0.

Figure 2 illustrates that the L-NMCMC scheme enhances exploration of configuration space and accurately reproduces major features of the free-energy surface using as few as three layers. Using an L-NMCMC simulation with {*U*} = [1.0, 0.5, 0.2], {*M*} = [10^{6}, 10, 10], Fig. 2(b) shows the FES obtained by Boltzmann-inverting the exact layer statistics of the L-NMCMC trajectory, Fig. 2(c) shows the FEP-derived FES using Eq. (11), and Fig. 2(d) illustrates the difference between the exact FES [Fig. 2(a)] and the FEP results [Fig. 2(c)]. Whereas multiple standard MC runs using 10^{7} trial moves failed to explore more than a single PES minimum of Fig. 2(a) due to barrier heights as high as 40 energy units, all four major minima are visited sufficiently often in the exact layer to compute their relative free energies as seen in Fig. 2(b); this demonstrates the ability of L-NMCMC to escape deep minima and explore configuration space. At the same time, little information is obtained regarding the free energy outside of those regions using the direct Boltzmann inversion approach. However, Fig. 2(c) shows that the FEP approach to construct the FES [Eq. (11)] enables estimates of the free energy for the entirety of configuration space and particularly for all low-energy regions. To understand this capability, it is useful to examine a portion of an L-NMCMC trajectory such as that shown in Fig. 3, which illustrates that the 3-layer L-NMCMC walk can escape deep minima with the Markov chain in the exact layer and simultaneously gather statistics in the neighborhood of the wells by sampling in the higher-indexed layers. The primary inaccuracies of the FEP result occur near the top of the barriers because the likelihood of accepting configurations in that region is low for all layers, leading to poorly sampled contributions of Eq. (11); the accuracy in these high energy regions might be improved by either increasing the number of layers or modifying the auxiliary potentials.

To assess the performance of different layering schemes, we additionally compute the percentage deviation of the L-NMCMC average energy from the exact average energy (*f*_{⟨E⟩}) and the root-mean-squared deviation in the L-NMCMC probability distribution function relative to the exact surface given by

which directly quantifies the accuracy of L-NMCMC in exploring high probability regions of configuration space. The results of these calculations are shown in Table I. In all cases, the average energy and *π*_{RMSD}(*x*, *y*) of the L-NMCMC schemes are superior to those of conventional MC. Moreover, there is a trend of increasing accuracy as the L-NMCMC layering is increased; this is due to the growing similarity of neighboring layer potentials, resulting in a more accurate evaluation of Eq. (11). Additionally, the number of PES evaluations (*N*_{eval}) for each L-NMCMC layering is also provided in Table I; the higher accuracy results of the {*M*} = [10^{4}, 2, 2, 2, 10], {*U*} = [1.0, 0.8, 0.6, 0.4, 0.2] scheme utilizes less total potential energy calls than the {*M*} = [10^{5}, 10], {*U*} = [1.0, 0.2] layering. This further demonstrates the capability of L-NMCMC to simultaneously reduce computational cost and enhance configurational sampling.

N_{l}
. | {M}
. | {U}
. | f_{⟨E⟩}
. | Δπ(x, y)^{a}
. | N_{eval}
. |
---|---|---|---|---|---|

1 | [10^{7}] | [1.0] | 1.53 | 1.446 | 10^{7} |

2 | [10^{5}, 10] | [1.0, 0.2] | −0.74 | 0.065 | 1.2 · 10^{6} |

3 | [10^{5}, 4,10] | [1.0, 0.5, 0.2] | −0.38 | 0.052 | 5 · 10^{6} |

4 | [10^{4}, 4,4,10] | [1.0, 0.75, 0.5, 0.25] | −0.25 | 0.047 | 2.0 · 10^{6} |

5 | [10^{4}, 2, 2, 2, 10] | [1.0, 0.8, 0.6, 0.4, 0.2] | −0.22 | 0.049 | 1.1 · 10^{6} |

5 | [10^{4}, 4, 4, 4, 10] | [1.0, 0.8, 0.6, 0.4, 0.2] | −0.22 | 0.041 | 8.1 · 10^{6} |

N_{l}
. | {M}
. | {U}
. | f_{⟨E⟩}
. | Δπ(x, y)^{a}
. | N_{eval}
. |
---|---|---|---|---|---|

1 | [10^{7}] | [1.0] | 1.53 | 1.446 | 10^{7} |

2 | [10^{5}, 10] | [1.0, 0.2] | −0.74 | 0.065 | 1.2 · 10^{6} |

3 | [10^{5}, 4,10] | [1.0, 0.5, 0.2] | −0.38 | 0.052 | 5 · 10^{6} |

4 | [10^{4}, 4,4,10] | [1.0, 0.75, 0.5, 0.25] | −0.25 | 0.047 | 2.0 · 10^{6} |

5 | [10^{4}, 2, 2, 2, 10] | [1.0, 0.8, 0.6, 0.4, 0.2] | −0.22 | 0.049 | 1.1 · 10^{6} |

5 | [10^{4}, 4, 4, 4, 10] | [1.0, 0.8, 0.6, 0.4, 0.2] | −0.22 | 0.041 | 8.1 · 10^{6} |

^{a}

Relative values assuming a null probability surface with $\pi (x,y)=0\u2009\u2200\u2009x,y$.

## VII. EXAMPLES WITH LENNARD-JONES PARTICLES

### A. Layers with different interaction cutoffs

To benchmark L-NMCMC in a molecular context, the average energy per particle and average pressure of a two-dimensional LJ fluid are computed for various L-NMCMC layerings and compared to the results of Rovere *et al.*^{49} Simulations are performed at *T* = 0.9 and *ρ* = 0.4 using 256 particles; the LJ interaction parameter is *ϵ* = 1 for all systems. Auxiliary potentials utilize different values for the interaction cutoff (*r*_{cut}) in each layer. Table II reports results for a variety of layerings. The table indicates that the deviation of the average energy per particle (*f*_{⟨E⟩}) and the average system pressure (*f*_{⟨P⟩}) from conventional 1-layer MC, corroborated by published results,^{49} is ∼1% in all cases. While different layering parameters affect the rate of convergence of L-NMCMC (Figs. S1 and S2 of the supplementary material), these data suggest that L-NMCMC is robust and that requisite accuracy for thermodynamic quantities (Table II) and distribution functions (Fig. S2 of the supplementary material) can be obtained using a variety of layering schemes.

N_{l}
. | {M}
. | {r_{cut}}
. | f_{⟨E⟩}
. | f_{⟨P⟩}
. |
---|---|---|---|---|

1 | [2.56 · 10^{7}] | [2.5] | −0.02 | 2.09 |

2 | [10^{5}, 256] | [2.5, 2.2] | 0.08 | 1.45 |

2 | [10^{5}, 256] | [2.5, 2.0] | 0.02 | 1.57 |

3 | [10^{5}, 16, 16] | [2.5, 2.0, 1.6] | 0.03 | 1.2 |

3 | [10^{5}, 64, 16] | [2.5, 1.8, 1.6] | 0.14 | 2.35 |

4 | [10^{5}, 4, 4, 16] | [2.5, 2.0, 1.8, 1.6] | 0.17 | 1.08 |

N_{l}
. | {M}
. | {r_{cut}}
. | f_{⟨E⟩}
. | f_{⟨P⟩}
. |
---|---|---|---|---|

1 | [2.56 · 10^{7}] | [2.5] | −0.02 | 2.09 |

2 | [10^{5}, 256] | [2.5, 2.2] | 0.08 | 1.45 |

2 | [10^{5}, 256] | [2.5, 2.0] | 0.02 | 1.57 |

3 | [10^{5}, 16, 16] | [2.5, 2.0, 1.6] | 0.03 | 1.2 |

3 | [10^{5}, 64, 16] | [2.5, 1.8, 1.6] | 0.14 | 2.35 |

4 | [10^{5}, 4, 4, 16] | [2.5, 2.0, 1.8, 1.6] | 0.17 | 1.08 |

### B. Excluded volume tempering: Translocation through a pinhole

Finally, L-NMCMC is combined with excluded-volume tempering to examine the FES of a LJ particle diffusing through a pinhole. The pinhole itself is comprised of two sets of four, fixed LJ particles with the coordinates of {(–0.75, –0.75), (−0.25, –0.75), (0.25, –0.75), (0.75, –0.75)} and {(−0.75, 0.75), (−0.25, 0.75), (0.25, 0.75), (0.75, 0.75)}, in units of *σ*. The positioning sets an effective radius for the pinhole of *r*_{p} = 0.75*σ*, such that the act of diffusing through the pinhole is extremely rare given the strongly repulsive interaction of the LJ potential for *r* < *σ*, and a transition through the pinhole is not observed over the course of a standard MC simulation. To keep the lone diffusing LJ particle within reasonable boundaries, a piecewise confining potential is additionally applied,

To enhance configurational sampling, the auxiliary PES are selected to match the form presented by Bunker and Dunweg^{50} in their excluded-volume-tempering simulations. Essentially, a point (*r*_{t}) in the LJ potential energy surface is chosen, and the values at *r* < *r*_{t} are determined by matching the LJ potential and its derivative at *r*_{t} to a quadratic function (Fig. 4). Different auxiliary PES then utilize different values of *r*_{t} in the simulation to soften the repulsive barrier at *r* < *σ*.

Figure 5 depicts the FES for the pinhole system obtained using the excluded-volume tempering L-NMCMC scheme. Figure 5(a) shows the FES obtained via FEP analysis, and Fig. 5(b) provides the FES obtained solely from statistics in the exact layer. As expected, sampling using L-NMCMC is superior to standard MC, which never traverses the pinhole in multiple test runs of 10^{7} MC steps, especially in its ability to approximate the FES within the pinhole region, even when the exact layer never visits these states. FEP analysis of the pinhole region shows that it is about 10*ϵ* higher than regions outside of the pinhole and also reveals the presence of several metastable minima in the pinhole region. These details are entirely obscured if the FEP procedure with L-NMCMC is not used.

## VIII. RELATIONSHIPS TO EXISTING METHODS AND FUTURE OUTLOOK

Given the similarity between the acceptance criterion used in canonical L-NMCMC and that used in extended-ensemble methods like Hamiltonian replica exchange, umbrella-sampling replica exchange,^{6,7} and hyper-parallel tempering,^{4,5,9} it is worth considering the relationship of L-NMCMC to these and other methods. Fundamentally, L-NMCMC is principally a *move proposal* method, whereas replica exchange (REX) is an extended ensemble approach. An essential difference is that all calculations for L-NMCMC occur within a single simulation box, whereas REX methods require calculations in multiple simulation boxes, which may be run simultaneously. Nonetheless, it is therefore conceivable that L-NMCMC could be combined with HREX or other extended ensemble methods since the exploration of configuration space in any given simulation could be accomplished using moves proposed via L-NMCMC without hindering the larger REX scheme. Indeed, L-NMCMC could be complementary to any method that simply requires an engine to generate configurations according to the Boltzmann distribution.

In addition, L-NMCMC seems generally well suited to problems in molecular simulation that feature a hierarchy of interaction styles that act over different time and energy scales. Just as multiple time-stepping techniques such as RESPA^{46} take advantage of a separation of time scales to accelerate simulations, L-NMCMC can take advantage of the decomposition of typical interactions in a simulation (two-body bonds, 3-body angle bends, 4-body torsions, Lennard-Jones interactions, short-range Coulomb force, long-range Ewald sums, etc.) via a many-layer approach where computation of only a subset of these interactions is considered in each layer. Similarly, approximate techniques that reduce the cost of path-integral methods by treating different interactions with different degrees of nuclear quantum resolution, such as ring-polymer contraction^{51,52} or mixed-time slicing,^{26} could be translated to an L-NMCMC scheme to exactly recover the target distribution. Finally, whereas FEP was used in this work to take advantage of sampling distribution in the auxiliary layers, histogram reweighting^{45,53} or thermodynamic integration^{54} approaches could also likely be adapted to this purpose.

## IX. CONCLUSIONS

Advanced MC algorithms are critical to the exploration of molecular systems that do not dynamically explore phase space efficiently. Here, Layered Nested Markov Chain Monte Carlo (L-NMCMC) has been developed as a generalization of existing Nested Markov Chain methods to arbitrarily many potential energy layers. By recursively nesting Markov chains with different auxiliary potential energy surfaces, L-NMCMC provides a platform to simultaneously reduce computational cost and improve configurational sampling in molecular simulations. The incorporation of free-energy perturbation theory increases the information obtainable from L-NMCMC moves, allowing for the estimation of an approximate free-energy surface at no extra cost, even if the full L-NMCMC move is not accepted, and the capability to systematically improve free-energy estimates by the addition of potential energy surface layers is unique to the many-layer formalism. Applications to rugged potential energy surfaces and simple, many-body problems demonstrate that L-NMCMC generates the correct equilibrium Boltzmann distribution and can be used to approximately construct the free-energy landscape. L-NMCMC is well suited to take advantage of the decomposition of distinct time and energy scales in molecular simulation, but the flexibility in the definition of the auxiliary potentials provides interesting possibilities. Future work will explore the incorporation of L-NMCMC into other extended ensemble schemes and enhanced sampling methods, the use of dynamic and adaptive auxiliary potentials, and its application to more complex free-energy landscapes.

## SUPPLEMENTARY MATERIAL

See supplementary material for the parameters utilized for the randomly deposited gaussians in the 2D PES as well as additional information regarding the convergence, acceptance ratios, and computational costs of the LNMCMC algorithm for various layering schemes applied to the LJ fluid.

## ACKNOWLEDGMENTS

This work was supported by the U.S. Department of Energy Office of Science, Program in Basic Energy Sciences, Materials Sciences and Engineering Division, through the Midwest Integrated Center for Computational Materials (MICCoM). N.E.J. thanks the Maria Goeppert Mayer Named Fellowship from Argonne National Laboratory for support.

### APPENDIX: GENERATION OF 2D POTENTIAL ENERGY SURFACE

The two-dimensional PES employed in Sec. VI is generated by the summation of a series of two-dimensional Gaussian functions using a scheme similar to that of Chopra *et al.*^{48} In particular, the PES has two distinct sets of Gaussian functions such that the potential energy *U*(*x*, *y*) for 0 ≤ *x*, *y* ≤ 1 is computed as

with

where *A*_{i}, *μ*_{ix}, *μ*_{iy}, *σ*_{ix}, and *σ*_{iy} are parameters defining the amplitude, location, and width of the various Gaussian functions; outside the domain 0 ≤ *x*, *y* ≤ 1, the potential energy is infinite. In Eq. (A1), the first set of nine Gaussian functions generate the primary features of the PES, i.e., the four observable minima and five obstructing maxima [see Fig. 2(a)]; the corresponding parameters are provided in Table III. The second set of 100 Gaussian functions simply add roughness to the PES; the parameters for these Gaussian functions are generated randomly via a protocol that ensures smaller amplitudes and widths compared to those in Table III. Specifically, for a uniform random variable from *a* to *b*, $U(a,b)$ , the widths *σ*_{jx} and *σ*_{jy} are given as $U(0,0.1)$ , and the amplitudes *A*_{j} are given as $U(\u22125,5)$ . The specific parameters used in this work are provided in the supplementary material.

i
. | A_{i}
. | μ_{ix}
. | μ_{iy}
. | σ_{ix}
. | σ_{iy}
. |
---|---|---|---|---|---|

1 | −20.0 | 0.2 | 0.2 | 0.1 | 0.1 |

2 | −23.0 | 0.8 | 0.8 | 0.1 | 0.1 |

3 | −24.0 | 0.2 | 0.8 | 0.1 | 0.1 |

4 | −24.0 | 0.7 | 0.3 | 0.1 | 0.1 |

5 | 20.0 | 0.5 | 0.5 | 0.1 | 0.1 |

6 | 15.0 | 0.5 | 0.2 | 0.1 | 0.1 |

7 | 10.0 | 0.5 | 0.8 | 0.1 | 0.1 |

8 | 15.0 | 0.8 | 0.5 | 0.1 | 0.1 |

9 | 10.0 | 0.2 | 0.5 | 0.1 | 0.1 |

i
. | A_{i}
. | μ_{ix}
. | μ_{iy}
. | σ_{ix}
. | σ_{iy}
. |
---|---|---|---|---|---|

1 | −20.0 | 0.2 | 0.2 | 0.1 | 0.1 |

2 | −23.0 | 0.8 | 0.8 | 0.1 | 0.1 |

3 | −24.0 | 0.2 | 0.8 | 0.1 | 0.1 |

4 | −24.0 | 0.7 | 0.3 | 0.1 | 0.1 |

5 | 20.0 | 0.5 | 0.5 | 0.1 | 0.1 |

6 | 15.0 | 0.5 | 0.2 | 0.1 | 0.1 |

7 | 10.0 | 0.5 | 0.8 | 0.1 | 0.1 |

8 | 15.0 | 0.8 | 0.5 | 0.1 | 0.1 |

9 | 10.0 | 0.2 | 0.5 | 0.1 | 0.1 |