An adaptive modeling method (AMM) that couples a deep neural network potential and a classical force field is introduced to address the accuracy-efficiency dilemma faced by the molecular simulation community. The AMM simulated system is decomposed into three types of regions. The first type captures the important phenomena in the system and requires high accuracy, for which we use the Deep Potential Molecular Dynamics (DeePMD) model in this work. The DeePMD model is trained to accurately reproduce the statistical properties of the ab initio molecular dynamics. The second type does not require high accuracy, and a classical force field is used to describe it in an efficient way. The third type is used for a smooth transition between the first and the second types of regions. By using a force interpolation scheme and imposing a thermodynamics force in the transition region, we make the DeePMD region embedded in the AMM simulated system as if it were embedded in a system that is fully described by the accurate potential. A representative example of the liquid water system is used to show the feasibility and promise of this method.
I. INTRODUCTION
The molecular simulation community is often faced with the accuracy-efficiency dilemma: the atomic interaction in the ab initio molecular dynamics (AIMD)1 is accurately modeled by the density functional theory (DFT),2–4 but the extensive computational cost that typically scales cubically with respect to the number of atoms limits its applications to system size of a few hundreds of atoms and simulation time of a few hundreds of picoseconds. On the other hand, molecular dynamics (MD) with atomic interaction modeled by classical force fields (FFs) can easily scale to millions of atoms, but the accuracy and transferability of classical FFs are often in question.
For a large class of MD applications, people have been addressing this dilemma by multi-scale modeling. In these applications, only the accuracy of part of the system is crucial to the phenomena of interest. Taking the problem of protein folding as an example, the accuracy of modeling the interactions within protein atoms and the interaction between the protein and nearby solvent molecules dominates the conformation of the protein and the folding process. Therefore, a natural idea of saving the computational cost while preserving the accuracy is to model the protein and nearby water molecules by an accurate but presumably expensive model, whereas to model other water molecules by a cheaper model. Methods of particular interest are the hybrid quantum mechanics/molecular mechanics (QM/MM) approach,5,6 which combines QM models and classical molecular models, and the adaptive-resolution-simulation (AdResS) technique,7,8 which combines atomic models and coarse-grained models. It is noted that interpolating the force from different models is commonly used in the multi-scale modeling methods, for example, the force-mixing QM/MM method,9–12 the force interpolation AdResS for classical7,13 or path-integral MD,14,15 and the force-blending atomistic-continuum coupling methods.16–19
Recently, machine learning (ML) methods have brought in another solution to this dilemma.20–28 After fitting the AIMD data, these approaches target at an accurate and much less expansive potential energy surface, thereby eliminating the need of calculating electronic structure information on the fly. A representative example is the Deep Potential Molecular Dynamics method (in abbreviation, DeePMD)27,28 that the authors recently developed with collaborators. In this scheme, the many-body potential energy is a sum of the “atomic energies” associated with the individual atoms in the system. Each one of these “atomic” energies depends analytically, via the deep neural network representation, on the symmetry-preserving coordinates of the atoms belonging to the local environment of each given atom. Upon training, DeePMD faithfully reproduces the distribution information on trajectories from AIMD simulations, with nuclei being treated either classically or quantum mechanically (by path-integral MD).
With the promising features of the ML methods, several problems have motivated us to develop an adaptive method that concurrently combines an ML model with a classical FF. Throughout this paper, we use the DeePMD method as an example. First, accurate training data are expensive and the amount of data is often limited to a small number of atoms and conformations. In addition, for large and complex systems usually we do not have the full QM description but only a part of it. Therefore, a practical expectation on the DeePMD model should be that it is trained to be accurate in the important regions under study, whereas the remaining regions could be described by a more simplified classical model. Second, since the DeePMD model is essentially a many-body potential, the evaluation of energy, force, and virial requires much more floating point operations than classical pairwise potentials. Taking the liquid water system, for example, the DeePMD model is about two orders of magnitudes more expensive than the classical TIP3P29 water model.28 Therefore, given the same computational resource, the maximal system size that is tractable by the DeePMD model is two orders of magnitudes smaller than that is described by the TIP3P model, or the longest simulation time achievable is two orders of magnitudes shorter than that of the TIP3P model. This imposes a limitation on the spatial and temporal scales of the problem if it is only described by the DeePMD model. Finally, the energy decomposition scheme adopted by the DeePMD model and many other ML methods provides a natural way of doing adaptive modeling. This would make the boundary problem in QM/MM due to boundary conditions adopted by electronic computations much less severe. It should be noted that electronic structure information, as a natural output of the QM/MM method, will be missing when we perform MD using only an ML model or a classical FF model. Therefore, we shall limit ourselves to cases that are well described by the potential energy surface and are less sensitive to electronic degrees of freedoms.
In this work, we introduce an adaptive modeling method (AMM) and numerically prove its feasibility in terms of adaptively changing the model for a molecule, depending on its spatial position, from the DeePMD model to a classical model, or vice versa. The system is divided into DeePMD regions and classical regions. Different regions are bridged by transition regions where the model of a molecule changes smoothly. The equilibrium between the regions is ensured by the thermodynamic force applying in the transition region. We demonstrate, by using liquid water as a representative example, that the density profile, the radial distribution functions (RDFs), and angular distribution functions (ADFs) in the DeePMD region are in satisfactory agreement with the corresponding subregion of a full DeePMD reference system. Therefore, the DeePMD region is embedded in the AMM system as if it were embedded in a full DeePMD system. The statistics of the DeePMD region approximates the grand-canonical ensemble in the thermodynamic limit.
II. METHOD
The AMM simulation region is decomposed into three types of non-overlapping regions: DeePMD regions ΩD where the many-body atomistic interactions are modeled by the DeePMD scheme,28 classical regions ΩC where the interactions are modeled by a classical FF model, and transition regions ΩT of uniform thickness dT that bridge the DeePMD regions and the classical regions. See an illustrative example in Fig. 1. Here we only consider one DeePMD region, one classical region, and the transition region between them. The case of multiple DeePMD or classical regions can be generalized without substantial difficulty.
We define a reference system, whose only difference with the AMM system is that it is fully described by the DeePMD model in the whole simulation region. Our goal is to embed the DeePMD region in the classical region as if it were embedded in a system that is fully modeled by the DeePMD. In other words, the equilibrium statistical property of the DeePMD region should mimic that of the corresponding subregion in the reference system. In this sense, since the subregion of the reference system is subject to the grand-canonical ensemble as the size of the system goes to infinity, the statistics in the DeePMD region approximates the grand-canonical ensemble, and the AMM is a grand-canonical-like molecule dynamics simulation.
In the AMM scheme, we use a force scheme to fulfill the goal. We define the force Fi on each atom i as a summation of three components,
where is an interpolated force between the DeePMD and the classical model, is a stochastic force from a Langevin thermostat that controls the canonical distribution, and is a thermodynamic force that balances the density profile of the whole AMM simulation region.
In the following, we define and discuss in more detail the three terms in the definition of Fi. For the interpolated force , we define a position dependent characteristic function (r), which takes a constant value 1 in the DeePMD region and 0 in the classical region and changes smoothly from 1 to 0 in the transition region. The way of defining the characteristic function (r) is not unique, and here we use
where is the closest distance from the position r to the boundary of the DeePMD region. Then is defined as a linear interpolation, through (r), between the DeePMD force and the classical force, i.e.,
where FD and FC denote the DeePMD force and the classical force, respectively, and R(·) denotes the characteristic position of atom i. In general, R(·) can be directly defined as an identity mapping. However, in our test example of the water system, we observe that defining R(·) as a mapping from the atomic position to the molecular center-of-mass (COM) will stabilize the numerical issue caused by the rigidness of the classical water model. Based on similar considerations, for macromolecules, R(·) can be, e.g., a mapping from atomic positions to the residue COM. In addition, we note that instead of a force-interpolation scheme, it is possible to do an energy-interpolation scheme, which conserves the total interpolated energy at the cost of momentum conservation.30 The equivalence of the two approaches in terms of equilibrium statistical properties is extensively discussed in Ref. 31. Here we focus on the force interpolation approach.
The Langevin term is defined as
where pi and mi denote the momentum and mass of atom i, respectively. W denotes the standard Wiener process, the friction γ and the noise magnitude σ are related by the fluctuation-dissipation theorem σ2 = 2γkBT, with kB being the Boltzmann constant and T being the temperature.
The accuracy of the AMM is investigated by comparing the statistical properties of the DeePMD region with those of the corresponding region in the full DeePMD reference system. Due to the difference in the definition of the DeePMD model and the classical model, an imbalance of pressure on the transition region exists, which in general will result in an unphysical gap of the density profile of different regions, and other higher order marginal distributions of the configurational distribution functions. This can be systematically improved by requiring the marginal probability distributions of different orders in the transition region identical to those in the full DeePMD model.32 The first-order marginal distribution, the density profile, is corrected by the one-body thermodynamic force .13 In practice, FT is computed by an iterative scheme,
where ρ denotes the equilibrium number density, ρk(R) denotes the density profile at the kth iteration step, κ denotes the isothermal compressibility, and α denotes a damping prefactor. By using the iterative scheme of Eq. (5), the converged thermodynamic force will lead to a flat density profile in the system, which indicates the equilibration between the DeePMD region and the classical MD region. Higher order corrections in the transition region, e.g., the correction of the radial distribution function (RDF), are possible by using the RDF correction to the transition like that proposed in Ref. 33. In this work, we do not consider the RDF and higher order corrections, and demonstrate, by the numerical example, that the properties in the DeePMD region are satisfactorily accurate by only using the thermodynamic force correction.
III. SIMULATION PROTOCOL
In this work, the AMM scheme is demonstrated and validated by a water system. In total 864 water molecules are simulated in a cubic cell of size 7.4668 nm × 1.8667 nm × 1.8667 nm and subject to periodic boundary conditions. As shown in Fig. 1, the model only changes along the x axis. In a copy of the simulation cell, the DeePMD region of width 2.0 nm locates at the center of the simulation region xc = 3.7334 nm, and the thickness of the transition region is dT = 0.3 nm. It is noted that this thickness is smaller than those usually used in AdResS methods,7,8 i.e., twice of the cutoff radius. Therefore, on average there are roughly 115, 70, and 678 water molecules in the DeePMD region, the transition region, and the classical region, respectively. The damping prefactor and the compressibility in Eq. (5) are set to 0.25 and 4.6 × 10−5 bars−1, respectively. The rest of the system belongs to the classical region, in which the water molecules are modeled by a flexible SPC/E force field.34 The details of the force field are provided in the Appendix. In this work, the atoms are modeled by point-mass particles in both the DeePMD and classical water models. The whole system is coupled to a Langevin thermostat of lag-time 0.1 ps (as a rule of thumb, the friction is set to 10 ps−135) to keep the temperature at 330 K.
One practical but important issue is that the equilibrium covalent bond length of the DeePMD model is not identical to that of the classical force field. When a molecule leaves the DeePMD region and enters the transition region, the classical force field switches on, and the mismatched bond length will lead to a bond force with large magnitude. This may cause the difficulty of equilibrating the bond length in the transition region. One solution is to use a small enough time step, so the prefactor (R(ri)) is small to suppress the large bond force as the molecule enters the transition region. Another solution, which we use in this work for the simulation efficiency, is to slightly modify the force interpolation scheme (3) as
where FC,nb and FC,b are the non-bonded and bonded contributions to the classical force field, respectively. εp is the shape protection parameter, and we take εp = 0.01 in this work, if not stated otherwise. With the shape protection parameter, the force in the DeePMD region is thus modified as , so the molecular shape of the classical force field is partially preserved in the DeePMD region, thus the equilibration of the bond length is much easier when the molecule enters the transition. Since the protection parameter εp is small, the molecular configuration in the DeePMD region is not substantially perturbed. It is noted that in the numerical example, we do not observe any difficulty of equilibrating the covalent bonds when a molecule leaves the classical region and enters the transition region because the intramolecular part of the DeePMD interaction is much softer than the classical force field.
The data for training the DeePMD water model were generated by a 330 K NVT AIMD simulation of a 64-molecule bulk water system with PBE0+TS exchange-correlation functional under periodic boundary condition. The total length of the AIMD simulation is 20 ps, and the information on the system was saved every time step of 0.5 fs, thus 40 000 snapshots of the system is available. Among the data, 95%, i.e., 38 000 snapshots, are used as the training data, while the rest 2000 snapshots are used as the testing data. The cut-off radius of the DeePMD model is 0.6 nm. The descriptors (network input) contain both the angular and radial information on 16 closest oxygen atoms and 32 closest hydrogen atoms. The descriptors contain only the radial information for the rest of neighbors in the cut-off radius. The deep neural network that models the many-body atomic interaction has 5 hidden layer, each of which has 240, 120, 60, 30, 10 neurons from the input side to the output side, respectively. The model is trained using the DeePMD-kit package.36 The detailed description of the training process is available in Ref. 36. At the end of the training, the root-mean-square errors of the energy and force evaluated by the testing set are 0.44 meV (i.e., 0.042 kJ/mol, normalized by the number of molecules) and 2.4 × 10−2 eV/Å [23 kJ/(mol/nm)], respectively.
IV. RESULTS AND DISCUSSION
Since our goal is to make the DeePMD region in the AMM system as if it were embedded in a full DeePMD system, the essential check should be made by comparing the configurational probability density of the DeePMD region of the AMM system with that of the corresponding subregion of a full DeePMD reference system. The high-dimensional configurational probability density defined in the phase space cannot be easily compared, but the marginal probability densities can be directly computed from MD trajectories, and compared with those from the reference system. The agreement of the first-order marginal probability density is checked by the density profile along the x-axis because the system is homogeneous on the y- and z-directions. The agreement in the second-order marginal probability density is checked by comparing the oxygen-oxygen (O–O), oxygen-hydrogen (O–H), and hydrogen-hydrogen (H–H) RDFs. In addition, we check the agreement of the third marginal probability density in terms of the ADFs of oxygen atoms defined up to several cut-off radii. In this work, both the AMM system and the full DeePMD reference system are simulated for 2000 ps. The first 200 ps of the trajectories are discarded, and the rest of the MD trajectories are considered to be fully equilibrated. The configurations of the systems are recorded every 0.1 ps, and the density profile, RDFs, and ADFs are computed from these configurations.
We report the density profile of the AMM system along the x-axis and compare it with the equilibrium density of the DeePMD model (denoted by ρ0) in Fig. 2. The equilibrium profile in the DeePMD region is almost a constant and is in satisfactory agreement with the equilibrium density of the full DeePMD result. The density profiles in the transition regions and in the SPC/E region are also very close to the equilibrium density. The worst-case deviation, with the statistical uncertainty considered, is 4% from the equilibrium density. This indicates that the DeePMD region is embedded in the AMM system with a similar environment to a full DeePMD system, in the sense that the density profile of the environment is close to the equilibrium density of the DeePMD model.
The O–O, O–H, and H–H RDFs of the DeePMD region in the AMM system are reported in the top panel of Fig. 3. All the RDFs are compared with those computed from the corresponding subregion of the reference system, and the differences are presented in the bottom panel of Fig. 3. The AMM scheme reproduces the intermolecular parts of the RDFs with satisfactory accuracy. It is noticed that the O–H RDF at around 0.1 nm, which corresponds to the intramolecular O–H bond length distribution, deviates from the reference system. This deviation is due to the introduction of the shape protection term in the force interpolation (6). In other words, when a water molecule diffuses from the classical region to the DeePMD region, a relaxation time is needed to equilibrate the O–H bond length. As pointed out by the anonymous referee, it is possible to alleviate the problem by using a larger transition region. In this study, we test the case of transition region width dT = 0.9 nm, which allows a protection parameter that is 10 times smaller, viz., 0.001. With this milder shape protection, the O–H bond distribution is restored in a better quality in the DeePMD region (see the inset of the bottom panel of Fig. 3). It is noted that the protection parameter can be further reduced by using a larger transition region; however, the computational cost of the AMM simulation will also increase correspondingly. This issue will be discussed later in this article.
The ADF is defined by
where denotes all the neighboring atoms of i within a cut-off radius rc, θjik denotes the angle formed by the atoms j, i, and k, ⟨·⟩ denotes the ensemble average, and Z denotes the normalization factor so that . In Fig. 4, we report the oxygen ADF of the DeePMD region of the AMM system at various cut-off radii rc = 0.270, 0.370, 0.456, and 0.600 nm. All the results are compared with the reference system, and the differences between the AMM and the reference system are presented in the bottom panel of the figure. The ADFs of the DeePMD region in the AMM system are in satisfactory agreement with the reference system.
In addition, we further investigate the RDF and ADF as a function of positions and plot the error of the RDF and the ADF investigated in different subregions of the DeePMD region. As shown in Fig. 5, the overall deviations of either the RDF or the ADF compared with the benchmarks are very small. However, also as expected, the deviation in the region closer to the transition region (Region II) is larger than the deviation lying inside the DeePMD region (Region I).
The computational cost of AMM simulation is dominated by the computation of the atomic interactions and is estimated by T = TD + TC, where TD and TC denote the computational costs of the DeePMD and classical forces, respectively. Since both the DeePMD and classical forces are linearly scalable, the costs are estimated by TD = CD(ρ)ND and TC = CC(ρ)NC, where ND and NC denote the number of atoms on which the DeePMD and classical forces are computed, respectively. CD(ρ) and CC(ρ) are density dependent parameters, which are independent with the number of molecules. We assume the system is well equilibrated so that the number of atoms is estimated by ND = ρ(VD + VT + VB), where VD and VT denote the volumes of DeePMD and transition regions, respectively. The DeePMD force depends on the network derivatives of the neighbors in the cut-off radius, rc;36 therefore, the network derivatives are evaluated for the atoms in a buffering region of width rc outside the transition region, and the computational cost of this part is estimated by ρVB. Similarly, the computational cost of the classical force evaluation is estimated by NC = ρ(VC + VT). In the end, we have
where and . The constants and can be estimated by short simulations of small DeePMD and classical systems of the same density. It is noted that using a larger transition region improves the accuracy of AMM, but at the same time, the extra computational cost grows linearly with respect to the size of the transition region. Thus, the size of the transition region should be kept as small as possible, as long as the accuracy of AMM is still satisfactory.
The ratio of the computational cost of the AMM over the full DeePMD simulations is
where Vsys = VD + VT + VC is the volume of the whole system. At the limit of infinitely large classical region, the ratio converges to , while at the limit of infinitely fast classical force field evaluation, the ratio converges to (VD + VT + VB)/Vsys. Therefore, and Vsys/(VD + VT + VB) are the highest acceleration ratio that one obtains from using the AMM method, at the corresponding limits.
In our example, the constants and are 7.6 × 10−3 s/nm3/step and 1.2 × 10−3 s/nm3/step on one core of an Intel Xeon Gold 6148 central processing unit (CPU). It is noted that the performance of our in-house code of the classical force field is not optimized. As a comparison, the constant of Gromacs 5.1.437,38 is 7.0 × 10−5 s/(nm3/step), which is 17 times faster than the in-house code. The estimated computational time by Eq. (8) is 0.124 s/step, while the measured wall-time on the same CPU is 0.134 s/step, which validates the estimated (8). The computational cost of a full DeePMD simulation is 0.200 s/step, so the AMM saves 33% [or 38% by estimate (8)] computational cost of the full DeePMD simulation. This may not be significant at the first sight because the volume VD + VT + VB takes 51% of the whole system, then the AMM cannot save more than 49% of the computational cost. Moreover, the sub-optimized force field code also makes the AMM slower. The acceleration of the AMM will be further improved if the DeePMD region becomes smaller, the classical region becomes larger, or the code of the classical force field is further optimized.
V. CONCLUSION AND PERSPECTIVES
In summary, we introduce a promising tool for concurrent coupling of the DeePMD model and a classical force field. It should be clear that the same strategy should also be applicable to general cases, where an expensive model is concurrently simulated with a cheap model. The requirement for the expensive model is that the part dominating the computational expense is computed in a short-range manner.
Future work of this adaptive modeling method is to study biomolecules solvated in water, where one could use DeePMD to only parameterize the potential for the biomolecules and nearby water molecules and couple it to a less expensive water model. It is worth investigating the accuracy in the systems where a long-range electrostatic effect plays an important role. In this situation, the long-range electrostatic of the ML model should be included by, for example, learning the partial charge based on the atomic environment,39 and then the point-charge electrostatic is efficiently computed by fast Ewald algorithms.40,41 It is also of particular interest to investigate the accuracy of the dynamical properties, like auto-correlation functions, upon concurrently coupling of different models.42,43
ACKNOWLEDGMENTS
The work of L.Z. and W.E is supported in part by ONR Grant No. N00014-13-1-0338, DOE Grant Nos. DE-SC0008626 and DE-SC0009248, and NSFC Grant Nos. U1430237 and 91530322. The work of H.W. is supported by the National Science Foundation of China under Grant Nos. 11501039, 11871110, and 91530322, the National Key Research and Development Program of China under Grant Nos. 2016YFB0201200 and 2016YFB0201203, and the Science Challenge Project No. JCKY2016212A502.
APPENDIX: THE FLEXIBLE SPC/E FORCE FIELD
A flexible SPC/E water molecule is modeled by three point-mass particles.34 The interaction is composed by the non-bonded and bonded parts
The non-bonded interaction has the Coulomb and van der Waals contributions
Both the Coulomb and van der Waals contributions are pairwise additive; i.e., they are of the form U = ∑i≠jU(rij), where rij is the distance between atoms i and j. The Coulomb interaction between an oxygen atom and a hydrogen atom of distance r reads
where qO and qH are partial charges of the oxygen and hydrogen atoms, respectively, which are defined in Table I. The Coulomb interaction of oxygen-oxygen and hydrogen-hydrogen atom pairs is defined analogously. The oxygen atoms in the system interact with the Lennard-Jones 6-12 potential
where r is the distance between the oxygens, and C12 and C6 are force field parameters given in Table I. The bonded interaction has the bond stretching and the angle bending contributions
The bond stretching of the O–H covalence bond is modeled by a harmonic potential
where rOH and are the O–H bond length and the equilibrium bond length, respectively, and kOH is the spring constant. The bending of the H–O–H angle is modeled by a harmonic potential of the angle
where θHOH and are the H–O–H angle and the equilibrium angle, respectively, and kHOH is the spring constant. The values of the parameters of the flexible SPC/E model are provided in Table I.
Parameter . | Value . | Unit . |
---|---|---|
C12 | 2.6331 × 10−6 | kJ mol−1 nm12 |
C6 | 2.6171 × 10−3 | kJ mol−1 nm6 |
qH | 0.4238 | e |
qO | −0.8476 | e |
0.1 | nm | |
kOH | 3.45 × 105 | kJ mol−1 nm−2 |
109.47 | deg | |
kHOH | 3.45 × 105 | kJ mol−1 rad−2 |
Parameter . | Value . | Unit . |
---|---|---|
C12 | 2.6331 × 10−6 | kJ mol−1 nm12 |
C6 | 2.6171 × 10−3 | kJ mol−1 nm6 |
qH | 0.4238 | e |
qO | −0.8476 | e |
0.1 | nm | |
kOH | 3.45 × 105 | kJ mol−1 nm−2 |
109.47 | deg | |
kHOH | 3.45 × 105 | kJ mol−1 rad−2 |