Combined-resolution simulations are an effective way to study molecular properties across a range of length and time scales. These simulations can benefit from adaptive boundaries that allow the high-resolution region to adapt (change size and/or shape) as the simulation progresses. The number of degrees of freedom required to accurately represent even a simple molecular process can vary by several orders of magnitude throughout the course of a simulation, and adaptive boundaries react to these changes to include an appropriate but not excessive amount of detail. Here, we derive the Hamiltonian and distribution function for such a molecular simulation. We also design an algorithm that can efficiently sample the boundary as a new coordinate of the system. We apply this framework to a mixed explicit/continuum simulation of a peptide in solvent. We use this example to discuss the conditions necessary for a successful implementation of adaptive boundaries that is both efficient and accurate in reproducing molecular properties.
Multiscale models are an effective way to simulate molecular systems. The motivation is clear: a high-resolution model can capture physical detail, while a low-resolution model offers computational efficiency and is sometimes better suited or more easily parameterized for large-scale phenomena.1 There are two strategies for multiscale modeling. In the first, multiple independent simulations are used for different levels of resolution. Combined intelligently, the sum is worth more than the parts: simulations at one level motivate and parameterize simulations at another.2–4 The focus of this work is the second approach, which combines multiple levels of resolution into a single simulation. These simulations use a fine-grained (FG) model for a region of interest and a computationally efficient coarse-grained (CG) model elsewhere. Examples include mixed quantum and molecular mechanical,5,6 mixed all-atom and coarse grained (CG),7–13 and hybrid explicit-continuum solvent models.14–21
Accurately modeling the boundary between high- and low-resolution regions is the crux of a combined-resolution simulation. Even for a homogeneous system like bulk solvent, equilibrium properties emerge from a delicate balance of interactions with the surrounding medium. An improper handling of the boundary will break the natural symmetry, and the resulting structural artifacts can extend well beyond the boundary into other regions of the simulation.15
We have developed a hybrid explicit-continuum solvent model that includes a boundary region over which molecules gradually, rather than abruptly, change resolution. This boundary method avoids the structural artifacts common to hybrid solvent models and accurately reproduces thermodynamic properties throughout the entire explicit domain.15,16 This boundary method is similar to that introduced in the Adaptive Resolution (AdRes) approach,22,23 a method used to couple high resolution particles to a more coarse-grained9,10 or continuum24–26 representation and that can also incorporate regions with some quantum mechanical effects.27–29 AdRes has been successfully applied to a range of molecular systems.9–13,30
Previously,16 we introduced adaptive boundaries into combined-resolution simulations. This allows the high-resolution region to adapt and include an appropriate but not excessive amount of detail as the simulation progresses. Consider protein folding as a simple example. A simulation with a fixed boundary must be large enough to solvate the largest possible protein conformation. As we show below, an infrequently visited extended conformation may require over an order of magnitude more solvent molecules than the predominant collapsed state. By contrast, an adaptive boundary can shrink and expand as the simulation progresses, as shown in Fig. 1 (Multimedia view). Adaptive boundaries would similarly benefit simulations of biomolecular assembly, aggregation, crystallization, and other examples outside of biology.
Simulations with flexible boundaries have been previously implemented with a restraining potential centered on a solute molecule that prevents atomistic solvent from drifting away.31,32 Kreis et al. have implemented an adjustable boundary with AdRes, which does not use a restraining potential and defines the high resolution region by a set of overlapping spheres that can change the relative position.33 This method successfully reproduces thermodynamic properties.
Here, we define the Hamiltonian and derive the distribution function of a combined-resolution simulation that has adaptive boundaries. This approach complements the work of Kreis et al., which does not have a conserved Hamiltonian. Their method could presumably be incorporated with the Hamiltonian-based version of AdRes,34–37 in which case the appropriate distribution function should match that derived here.
This formalism allows us to connect the equilibrium properties of the high resolution region to those expected from a simulation performed in full detail. We build on previous work,16 where we derived adaptive boundaries specifically for a mixed explicit-continuum model with a spherical domain. Our work here extends this theory for general mixed-resolution models and for arbitrarily shaped boundaries. We also design a new algorithm that efficiently samples the boundary as a coordinate of the system. We test this model on a peptide in a mixed explicit/continuum solvent model, as shown in Fig. 1 (Multimedia view), and show that an adaptive boundary severely reduces the number of degrees of freedom in the simulation.
We first outline the theory of combined-resolution models with fixed boundaries following the formalism of Roux and co-workers.14,17,19 Consider a system containing molecule A of interest (e.g., a protein), which is always modeled in high resolution, and N identical solvent molecules. We define fine-grained U and coarse-grained V potential functions. This is a minimal example; the theory is easily extended to multi-component systems or to a molecule A that can change resolution.
We first consider the full system in fine-grained (FG) detail. The configurational probability distribution in the canonical ensemble is
for configuration X, where the subscripts A and N correspond to molecule A and the N solvent molecules. Henceforth, we will drop the coordinate arguments of the Hamiltonian. The Hamiltonian can be written as a sum of intra- and inter-molecular terms; for .
Now consider the same system modeled in combined-resolution with a fixed boundary. We partition the domain into regions of high- and low-resolution, , delineated by the boundary Γ. The representation of molecule i is defined by the scaling function . Molecules in high resolution correspond to λi = 1 for xi ∈ Ωi. Molecules in low resolution correspond to λi = 0. This switch may occur abruptly so that λi = 0 for xi ∈ Ωo, or λi may smoothly interpolate to zero over some transition region.9,10,15,16 The multiscale Hamiltonian is
where is a many-body potential of mean force (PMF). It is implied that V is a compound function that contains a mapping M to low resolution, M: X → Y, where Y is a coarse-grained representation of X. The configurational probability distribution of this combined-resolution system is
where the superscript signals that the boundary is fixed, though HMM depends on its location Γ.
To reproduce the thermodynamic properties of the interior region, we define the marginal probability distribution of the high-resolution system,
and aim to reproduce this function with the analogous marginal distribution of the combined-resolution system. We have separated the potential into inner-inner, , inner-outer, , and outer-outer, , components15 and the subscripts n and N − n denote molecules located within the interior (Ωi) and exterior (Ωo) regions. If we do the same for Eq. (4), we can ensure by imposing
We can now explicitly include adaptive boundaries in the distribution function. We define a normalized joint probability that couples the boundary location to the configuration of molecule A and modify Eq. (4) to include the boundary as an explicit coordinate,
Equation (7) is the distribution function sampled from a multiscale simulation with adaptive boundaries. Consider some molecule or molecules always given in high-resolution (here, molecule A). This distribution will exactly reproduce the thermodynamic properties for these molecules if (1) Eq. (6) is satisfied and (2) is normalized, as we can see by integrating over the boundary Γ and all other degrees of freedom when these conditions are satisfied,
For a static domain, the Γ-dependence of ΔW may be safely ignored. To include adaptive boundaries, it is essential that ΔW accurately capture this dependence. For a hybrid explicit-continuum model, for example, ΔW must have a cavitation term that is very accurate with respect to the shape and size of the explicit domain. Otherwise, the domain of the simulation will tend toward large or small sizes and, through Eq. (7), artificially bias the configuration of molecule A.16
The low-resolution region can be represented through a number of CG models38–47 or with a pure continuum.14,18–20,48 A combination of these two limiting cases works well, and we have developed a model that includes a “flawed region” of solvent molecules that gradually transition from explicit detail to continuum. This flawed region reproduces local interactions and relaxes the complexity of ΔW.16 The scaling function is
where ri is the distance of molecule i from the center of the domain, R is the location of the boundary, and defines the width of the transition region.
In our model, the PMF ΔW includes a position-dependent chemical potential that accounts for transforming molecules from an FG to continuum representation and ensures constant solvent density across the transition region.15,16 The multiscale Hamiltonian, Eq. (3), leads to position-dependent forces that cannot be written as a sum of pairwise antisymmetric terms between molecules. These forces arise from interactions between a molecule and the degrees of freedom that have been “removed” from the system and are also present in other hybrid models that rely on a PMF construction.14,17,19,20 This approach gives a sound thermodynamic formalism14–16,49 and matches the Hamiltonian version of AdRes,34–37 though Newton’s third law is not conserved between molecules.
Alternatively, the force-based version of AdRes is constructed to have forces that are a sum of pairwise antisymmetric terms,22–24,35 important to describe, for example, hydrodynamics at the boundary.24,35,37,50 This method does not have a conserved Hamiltonian but does conserve Newton’s third law between molecules. We do not consider hydrodynamics here since it would not affect thermodynamics and, as we discuss below, does not seem to be necessary for the kinetics of biomolecules in our model. Hydrodynamics can be important for other systems and have been considered in more detail in AdRes24,35,37 and other hybrid models.21,51–53
We now apply this work to simulations of a peptide in a sphere of explicit solvent. We use a boundary Γ = R that defines the inner region Ωi as a sphere of radius R that is coupled to the radius of gyration Rg of the peptide using a Gaussian distribution,
where aR and bR determine the amount of space between the peptide and boundary, and k is the coupling strength.
Our algorithm has two steps: (1) the configuration X is updated with a combination of molecular dynamics and grand canonical Monte Carlo (MC) moves; (2) the boundary is updated with an MC move. Previously, we proposed updating the boundary to discrete positions using MC moves biased by nonequilibrium paths that required hundreds of integration steps per boundary update.16 The following method is more efficient and more easily applied. We instead define the boundary as a continuous variable and modify the Hamiltonian so that the boundary update can be completed without any integration steps. We define two bounding radii, and , for a given discretization Rh with the floor ⌊⌋ and ceiling ⌈⌉ functions. The new Hamiltonian is a linear combination of the energies corresponding to the two bounding radii,
where . The two energies on the right hand side of Eq. (11) can be calculated efficiently and simultaneously as they differ only in their interactions for molecules in the transition region.
For the MC move, we hold all molecules fixed and select a candidate radius R′ from a small uniform window (0.01 nm) centered on the current value R. We accept or reject the move with a Metropolis criterion. If the candidate radius R′ does not cross one of the bounding radii, then and do not change, and the new energy is calculated from Eq. (11) by updating the value of α. This move is of negligible computational expense. Alternatively, the candidate boundary R′ will cross one of the bounding radii. If , we must insert a new shell of molecules. At the moment that , these new molecules are non-interacting and the candidate solvent shell is inserted using the distribution of ideal molecules. Because the candidate R′ will not fall precisely on , this move will have a small energetic contribution calculated from Eq. (11). Similarly, for a move that crosses the lower bounding radius, , we automatically delete the appropriate shell of water molecules. The selection probabilities of the insertion and deletion moves cancel exactly with their contributions to the overall configurational probability distribution.16 This algorithm is our second main result.
This scheme leverages the weak interactions at the boundary to construct a new high-resolution configuration from the previous low-resolution configuration. Because there is a gradual change in resolution, these new coordinates are close to equilibrium and we have high acceptance rates. This would be more difficult with an abrupt change in resolution at the boundary.
We now test the adaptive boundary algorithm on simulations of solvated peptides. We use the MARTINI forcefield,16,54,55 a coarse representation that does not include solvent charged interactions. Simulations are preformed with a Langevin integrator56 and a generalized hybrid MC correction for the finite time step.57 We set the width of the transition region defined in Eq. (9) to = 0.5 nm, roughly the size of a single solvation layer in MARTINI. This value has been found to accurately reproduce thermodynamic properties.15 After testing multiple parameter sets for the boundary location defined in Eq. (10), we set aR = 0.7 nm, bR = 1.8, and k = 2000 kJ/(mol/nm2). These values give a tight shell of 1-3 solvent layers around the peptide throughout the course of the simulation, though different parameters may be suitable for other biomolecules. The bulk PMF ΔW includes a term, , where a = 207.20 kJ/(mol/nm3), b = −9.00 kJ/(mol/nm2), and c = −9.17 kJ/(mol/nm). This term models the change in ΔW as a function of the boundary position and is essential to ensure that the explicit domain is not artificially biased to large or small sizes. Parameterization of this and other contributions to ΔW are described in Ref. 16.
The adaptive boundary method is illustrated in Fig. 1 (Multimedia view) for a simulation of a 50-residue repeat of polyglutamine. The joint distribution of Rg and the sphere radius R is shown in Fig. 2. The number of explicit particles, proportional to volume, spans from 440 to 2800, a sixfold difference over the course of the simulation. Were this simulation performed in atomistic detail, rather than with MARTINI (where one particle corresponds to 4 water molecules), the corresponding range would be 72-fold. While [688, 996] solvent molecules are required for the interquartile range, the infrequently visited states (Rg > 1.72 nm, 2% of the simulation) require more than 1880 molecules. This shows the computational advantage of the method: the size of the high-resolution region can remain small unless more detail is needed.
This model reproduces thermodynamic properties exactly, within statistical errors, as shown in Fig. 3. This is the distribution of Rg compared to a pure explicit simulation for a 30-residue and an 8-residue repeat of polyglutamine. We have obtained this degree of accuracy for multidimensional thermodynamic properties, such as the principal moments of inertia (data not shown).
Our derivation of HMM is committed to reproducing thermodynamic properties but gives no consideration to kinetics. The integrated autocorrelation function for Rg of the polyglutamine octamer shows an increase from 38.33 ps in full explicit detail to 44.34 ps in the adaptive boundary model. The 16% increase is unsurprising: kinetics may be affected both by the continuum representation and lack of hydrodynamics for solvent in the outer domain, and by the introduction of MC boundary updates. We find that the former effect is negligible here by testing the model over larger explicit regions. It may be possible to develop boundary updates with memory in order to reduce this small error in kinetics.
In summary, we have derived the distribution function and have given an example implementation of adaptive boundaries for combined-resolution simulations. In our example, the boundary reacts to a peptide’s radius of gyration to include an appropriate but not excessive amount of detail as the simulation progresses. Adaptive boundaries would similarly benefit simulations of crystallization, aggregation, biomolecular assembly, and others.
Our approach should transfer to other models that use smooth coupling at the boundary. An adaptive boundary simulation requires both a very accurate model of the bulk PMF as a function of the system’s boundary and a way to generate equilibrium high-resolution configurations mapped from low-resolution configurations. Both of these requirements can be met using a smoothly coupled boundary that has a gradual change in resolution.
The example we have used here is simple and changes the size but not the shape of a spherical domain. Kreis et al. have implemented a domain of several overlapping spheres that cannot change size but do adjust shape by changing their relative orientations.33 This approach could be included here to give a more flexible domain that can change both size and shape. With fixed boundaries, this model can include electrostatic interactions using Generalized Born-like terms with volume overlaps58 or a reaction-field term solved for a smoothly varying dielectric.59,60 For adaptive boundaries, the remaining challenge is to parameterize the boundary-dependence of these electrostatic contributions to the PMF.
We are grateful to Sean Seyler for a discussion on the role of hydrodynamic interactions in combined-resolution models. We thank Emiliano Brini and James Dama for helpful comments on this manuscript. This work was supported by NIH Award No. GM62868.