An adaptive modeling method (AMM) that couples a deep neural network potential and a classical force field is introduced to address the accuracy-efficiency dilemma faced by the molecular simulation community. The AMM simulated system is decomposed into three types of regions. The first type captures the important phenomena in the system and requires high accuracy, for which we use the Deep Potential Molecular Dynamics (DeePMD) model in this work. The DeePMD model is trained to accurately reproduce the statistical properties of the ab initio molecular dynamics. The second type does not require high accuracy, and a classical force field is used to describe it in an efficient way. The third type is used for a smooth transition between the first and the second types of regions. By using a force interpolation scheme and imposing a thermodynamics force in the transition region, we make the DeePMD region embedded in the AMM simulated system as if it were embedded in a system that is fully described by the accurate potential. A representative example of the liquid water system is used to show the feasibility and promise of this method.

The molecular simulation community is often faced with the accuracy-efficiency dilemma: the atomic interaction in the ab initio molecular dynamics (AIMD)1 is accurately modeled by the density functional theory (DFT),2–4 but the extensive computational cost that typically scales cubically with respect to the number of atoms limits its applications to system size of a few hundreds of atoms and simulation time of a few hundreds of picoseconds. On the other hand, molecular dynamics (MD) with atomic interaction modeled by classical force fields (FFs) can easily scale to millions of atoms, but the accuracy and transferability of classical FFs are often in question.

For a large class of MD applications, people have been addressing this dilemma by multi-scale modeling. In these applications, only the accuracy of part of the system is crucial to the phenomena of interest. Taking the problem of protein folding as an example, the accuracy of modeling the interactions within protein atoms and the interaction between the protein and nearby solvent molecules dominates the conformation of the protein and the folding process. Therefore, a natural idea of saving the computational cost while preserving the accuracy is to model the protein and nearby water molecules by an accurate but presumably expensive model, whereas to model other water molecules by a cheaper model. Methods of particular interest are the hybrid quantum mechanics/molecular mechanics (QM/MM) approach,5,6 which combines QM models and classical molecular models, and the adaptive-resolution-simulation (AdResS) technique,7,8 which combines atomic models and coarse-grained models. It is noted that interpolating the force from different models is commonly used in the multi-scale modeling methods, for example, the force-mixing QM/MM method,9–12 the force interpolation AdResS for classical7,13 or path-integral MD,14,15 and the force-blending atomistic-continuum coupling methods.16–19 

Recently, machine learning (ML) methods have brought in another solution to this dilemma.20–28 After fitting the AIMD data, these approaches target at an accurate and much less expansive potential energy surface, thereby eliminating the need of calculating electronic structure information on the fly. A representative example is the Deep Potential Molecular Dynamics method (in abbreviation, DeePMD)27,28 that the authors recently developed with collaborators. In this scheme, the many-body potential energy is a sum of the “atomic energies” associated with the individual atoms in the system. Each one of these “atomic” energies depends analytically, via the deep neural network representation, on the symmetry-preserving coordinates of the atoms belonging to the local environment of each given atom. Upon training, DeePMD faithfully reproduces the distribution information on trajectories from AIMD simulations, with nuclei being treated either classically or quantum mechanically (by path-integral MD).

With the promising features of the ML methods, several problems have motivated us to develop an adaptive method that concurrently combines an ML model with a classical FF. Throughout this paper, we use the DeePMD method as an example. First, accurate training data are expensive and the amount of data is often limited to a small number of atoms and conformations. In addition, for large and complex systems usually we do not have the full QM description but only a part of it. Therefore, a practical expectation on the DeePMD model should be that it is trained to be accurate in the important regions under study, whereas the remaining regions could be described by a more simplified classical model. Second, since the DeePMD model is essentially a many-body potential, the evaluation of energy, force, and virial requires much more floating point operations than classical pairwise potentials. Taking the liquid water system, for example, the DeePMD model is about two orders of magnitudes more expensive than the classical TIP3P29 water model.28 Therefore, given the same computational resource, the maximal system size that is tractable by the DeePMD model is two orders of magnitudes smaller than that is described by the TIP3P model, or the longest simulation time achievable is two orders of magnitudes shorter than that of the TIP3P model. This imposes a limitation on the spatial and temporal scales of the problem if it is only described by the DeePMD model. Finally, the energy decomposition scheme adopted by the DeePMD model and many other ML methods provides a natural way of doing adaptive modeling. This would make the boundary problem in QM/MM due to boundary conditions adopted by electronic computations much less severe. It should be noted that electronic structure information, as a natural output of the QM/MM method, will be missing when we perform MD using only an ML model or a classical FF model. Therefore, we shall limit ourselves to cases that are well described by the potential energy surface and are less sensitive to electronic degrees of freedoms.

In this work, we introduce an adaptive modeling method (AMM) and numerically prove its feasibility in terms of adaptively changing the model for a molecule, depending on its spatial position, from the DeePMD model to a classical model, or vice versa. The system is divided into DeePMD regions and classical regions. Different regions are bridged by transition regions where the model of a molecule changes smoothly. The equilibrium between the regions is ensured by the thermodynamic force applying in the transition region. We demonstrate, by using liquid water as a representative example, that the density profile, the radial distribution functions (RDFs), and angular distribution functions (ADFs) in the DeePMD region are in satisfactory agreement with the corresponding subregion of a full DeePMD reference system. Therefore, the DeePMD region is embedded in the AMM system as if it were embedded in a full DeePMD system. The statistics of the DeePMD region approximates the grand-canonical ensemble in the thermodynamic limit.

The AMM simulation region is decomposed into three types of non-overlapping regions: DeePMD regions ΩD where the many-body atomistic interactions are modeled by the DeePMD scheme,28 classical regions ΩC where the interactions are modeled by a classical FF model, and transition regions ΩT of uniform thickness dT that bridge the DeePMD regions and the classical regions. See an illustrative example in Fig. 1. Here we only consider one DeePMD region, one classical region, and the transition region between them. The case of multiple DeePMD or classical regions can be generalized without substantial difficulty.

FIG. 1.

Schematic plot of an adaptive model system. From left to right, are the classical, transition, DeePMD, transition, and classical regions. The blue curve presents the shape of the characteristic function w(r).

FIG. 1.

Schematic plot of an adaptive model system. From left to right, are the classical, transition, DeePMD, transition, and classical regions. The blue curve presents the shape of the characteristic function w(r).

Close modal

We define a reference system, whose only difference with the AMM system is that it is fully described by the DeePMD model in the whole simulation region. Our goal is to embed the DeePMD region in the classical region as if it were embedded in a system that is fully modeled by the DeePMD. In other words, the equilibrium statistical property of the DeePMD region should mimic that of the corresponding subregion in the reference system. In this sense, since the subregion of the reference system is subject to the grand-canonical ensemble as the size of the system goes to infinity, the statistics in the DeePMD region approximates the grand-canonical ensemble, and the AMM is a grand-canonical-like molecule dynamics simulation.

In the AMM scheme, we use a force scheme to fulfill the goal. We define the force Fi on each atom i as a summation of three components,

(1)

where FiI is an interpolated force between the DeePMD and the classical model, FiL is a stochastic force from a Langevin thermostat that controls the canonical distribution, and FiT is a thermodynamic force that balances the density profile of the whole AMM simulation region.

In the following, we define and discuss in more detail the three terms in the definition of Fi. For the interpolated force FiI, we define a position dependent characteristic function w(r), which takes a constant value 1 in the DeePMD region and 0 in the classical region and changes smoothly from 1 to 0 in the transition region. The way of defining the characteristic function w(r) is not unique, and here we use

(2)

where d(r,ΩD)=minsΩD|rs| is the closest distance from the position r to the boundary of the DeePMD region. Then FiI is defined as a linear interpolation, through w(r), between the DeePMD force and the classical force, i.e.,

(3)

where FD and FC denote the DeePMD force and the classical force, respectively, and R(·) denotes the characteristic position of atom i. In general, R(·) can be directly defined as an identity mapping. However, in our test example of the water system, we observe that defining R(·) as a mapping from the atomic position to the molecular center-of-mass (COM) will stabilize the numerical issue caused by the rigidness of the classical water model. Based on similar considerations, for macromolecules, R(·) can be, e.g., a mapping from atomic positions to the residue COM. In addition, we note that instead of a force-interpolation scheme, it is possible to do an energy-interpolation scheme, which conserves the total interpolated energy at the cost of momentum conservation.30 The equivalence of the two approaches in terms of equilibrium statistical properties is extensively discussed in Ref. 31. Here we focus on the force interpolation approach.

The Langevin term FiL is defined as

(4)

where pi and mi denote the momentum and mass of atom i, respectively. W denotes the standard Wiener process, the friction γ and the noise magnitude σ are related by the fluctuation-dissipation theorem σ2 = 2γkBT, with kB being the Boltzmann constant and T being the temperature.

The accuracy of the AMM is investigated by comparing the statistical properties of the DeePMD region with those of the corresponding region in the full DeePMD reference system. Due to the difference in the definition of the DeePMD model and the classical model, an imbalance of pressure on the transition region exists, which in general will result in an unphysical gap of the density profile of different regions, and other higher order marginal distributions of the configurational distribution functions. This can be systematically improved by requiring the marginal probability distributions of different orders in the transition region identical to those in the full DeePMD model.32 The first-order marginal distribution, the density profile, is corrected by the one-body thermodynamic force FiT.13 In practice, FT is computed by an iterative scheme,

(5)

where ρ denotes the equilibrium number density, ρk(R) denotes the density profile at the kth iteration step, κ denotes the isothermal compressibility, and α denotes a damping prefactor. By using the iterative scheme of Eq. (5), the converged thermodynamic force will lead to a flat density profile in the system, which indicates the equilibration between the DeePMD region and the classical MD region. Higher order corrections in the transition region, e.g., the correction of the radial distribution function (RDF), are possible by using the RDF correction to the transition like that proposed in Ref. 33. In this work, we do not consider the RDF and higher order corrections, and demonstrate, by the numerical example, that the properties in the DeePMD region are satisfactorily accurate by only using the thermodynamic force correction.

In this work, the AMM scheme is demonstrated and validated by a water system. In total 864 water molecules are simulated in a cubic cell of size 7.4668 nm × 1.8667 nm × 1.8667 nm and subject to periodic boundary conditions. As shown in Fig. 1, the model only changes along the x axis. In a copy of the simulation cell, the DeePMD region of width 2.0 nm locates at the center of the simulation region xc = 3.7334 nm, and the thickness of the transition region is dT = 0.3 nm. It is noted that this thickness is smaller than those usually used in AdResS methods,7,8 i.e., twice of the cutoff radius. Therefore, on average there are roughly 115, 70, and 678 water molecules in the DeePMD region, the transition region, and the classical region, respectively. The damping prefactor and the compressibility in Eq. (5) are set to 0.25 and 4.6 × 10−5 bars−1, respectively. The rest of the system belongs to the classical region, in which the water molecules are modeled by a flexible SPC/E force field.34 The details of the force field are provided in the  Appendix. In this work, the atoms are modeled by point-mass particles in both the DeePMD and classical water models. The whole system is coupled to a Langevin thermostat of lag-time 0.1 ps (as a rule of thumb, the friction is set to 10 ps−135) to keep the temperature at 330 K.

One practical but important issue is that the equilibrium covalent bond length of the DeePMD model is not identical to that of the classical force field. When a molecule leaves the DeePMD region and enters the transition region, the classical force field switches on, and the mismatched bond length will lead to a bond force with large magnitude. This may cause the difficulty of equilibrating the bond length in the transition region. One solution is to use a small enough time step, so the prefactor w(R(ri)) is small to suppress the large bond force as the molecule enters the transition region. Another solution, which we use in this work for the simulation efficiency, is to slightly modify the force interpolation scheme (3) as

(6)

where FC,nb and FC,b are the non-bonded and bonded contributions to the classical force field, respectively. εp is the shape protection parameter, and we take εp = 0.01 in this work, if not stated otherwise. With the shape protection parameter, the force in the DeePMD region is thus modified as FiD+εpFiC,nb, so the molecular shape of the classical force field is partially preserved in the DeePMD region, thus the equilibration of the bond length is much easier when the molecule enters the transition. Since the protection parameter εp is small, the molecular configuration in the DeePMD region is not substantially perturbed. It is noted that in the numerical example, we do not observe any difficulty of equilibrating the covalent bonds when a molecule leaves the classical region and enters the transition region because the intramolecular part of the DeePMD interaction is much softer than the classical force field.

The data for training the DeePMD water model were generated by a 330 K NVT AIMD simulation of a 64-molecule bulk water system with PBE0+TS exchange-correlation functional under periodic boundary condition. The total length of the AIMD simulation is 20 ps, and the information on the system was saved every time step of 0.5 fs, thus 40 000 snapshots of the system is available. Among the data, 95%, i.e., 38 000 snapshots, are used as the training data, while the rest 2000 snapshots are used as the testing data. The cut-off radius of the DeePMD model is 0.6 nm. The descriptors (network input) contain both the angular and radial information on 16 closest oxygen atoms and 32 closest hydrogen atoms. The descriptors contain only the radial information for the rest of neighbors in the cut-off radius. The deep neural network that models the many-body atomic interaction has 5 hidden layer, each of which has 240, 120, 60, 30, 10 neurons from the input side to the output side, respectively. The model is trained using the DeePMD-kit package.36 The detailed description of the training process is available in Ref. 36. At the end of the training, the root-mean-square errors of the energy and force evaluated by the testing set are 0.44 meV (i.e., 0.042 kJ/mol, normalized by the number of molecules) and 2.4 × 10−2 eV/Å [23 kJ/(mol/nm)], respectively.

Since our goal is to make the DeePMD region in the AMM system as if it were embedded in a full DeePMD system, the essential check should be made by comparing the configurational probability density of the DeePMD region of the AMM system with that of the corresponding subregion of a full DeePMD reference system. The high-dimensional configurational probability density defined in the phase space cannot be easily compared, but the marginal probability densities can be directly computed from MD trajectories, and compared with those from the reference system. The agreement of the first-order marginal probability density is checked by the density profile along the x-axis because the system is homogeneous on the y- and z-directions. The agreement in the second-order marginal probability density is checked by comparing the oxygen-oxygen (O–O), oxygen-hydrogen (O–H), and hydrogen-hydrogen (H–H) RDFs. In addition, we check the agreement of the third marginal probability density in terms of the ADFs of oxygen atoms defined up to several cut-off radii. In this work, both the AMM system and the full DeePMD reference system are simulated for 2000 ps. The first 200 ps of the trajectories are discarded, and the rest of the MD trajectories are considered to be fully equilibrated. The configurations of the systems are recorded every 0.1 ps, and the density profile, RDFs, and ADFs are computed from these configurations.

We report the density profile of the AMM system along the x-axis and compare it with the equilibrium density of the DeePMD model (denoted by ρ0) in Fig. 2. The equilibrium profile in the DeePMD region is almost a constant and is in satisfactory agreement with the equilibrium density of the full DeePMD result. The density profiles in the transition regions and in the SPC/E region are also very close to the equilibrium density. The worst-case deviation, with the statistical uncertainty considered, is 4% from the equilibrium density. This indicates that the DeePMD region is embedded in the AMM system with a similar environment to a full DeePMD system, in the sense that the density profile of the environment is close to the equilibrium density of the DeePMD model.

FIG. 2.

The density profile of the AMM simulation. The density is averaged along the y and z directions, and the profile is displayed against the x axis. The solid line is the density profile, and the light shadow denotes the statistical uncertainty of the density profile at the 95% confidence level.

FIG. 2.

The density profile of the AMM simulation. The density is averaged along the y and z directions, and the profile is displayed against the x axis. The solid line is the density profile, and the light shadow denotes the statistical uncertainty of the density profile at the 95% confidence level.

Close modal

The O–O, O–H, and H–H RDFs of the DeePMD region in the AMM system are reported in the top panel of Fig. 3. All the RDFs are compared with those computed from the corresponding subregion of the reference system, and the differences are presented in the bottom panel of Fig. 3. The AMM scheme reproduces the intermolecular parts of the RDFs with satisfactory accuracy. It is noticed that the O–H RDF at around 0.1 nm, which corresponds to the intramolecular O–H bond length distribution, deviates from the reference system. This deviation is due to the introduction of the shape protection term in the force interpolation (6). In other words, when a water molecule diffuses from the classical region to the DeePMD region, a relaxation time is needed to equilibrate the O–H bond length. As pointed out by the anonymous referee, it is possible to alleviate the problem by using a larger transition region. In this study, we test the case of transition region width dT = 0.9 nm, which allows a protection parameter that is 10 times smaller, viz., 0.001. With this milder shape protection, the O–H bond distribution is restored in a better quality in the DeePMD region (see the inset of the bottom panel of Fig. 3). It is noted that the protection parameter can be further reduced by using a larger transition region; however, the computational cost of the AMM simulation will also increase correspondingly. This issue will be discussed later in this article.

FIG. 3.

The RDFs of the DeePMD region of the AMM simulation (solid lines) compared with the RDFs of the reference simulation (DeePMD, dotted lines). The RDFs of the AMM simulation are presented by solid lines, while those of the reference simulation are presented by dotted lines. The inset of the bottom panel presents the error in the O–H bond length distribution. The solid green line shows the case of dT = 0.3 nm and εp = 0.01, while the dashed green line shows the case of dT = 0.9 nm and εp = 0.001.

FIG. 3.

The RDFs of the DeePMD region of the AMM simulation (solid lines) compared with the RDFs of the reference simulation (DeePMD, dotted lines). The RDFs of the AMM simulation are presented by solid lines, while those of the reference simulation are presented by dotted lines. The inset of the bottom panel presents the error in the O–H bond length distribution. The solid green line shows the case of dT = 0.3 nm and εp = 0.01, while the dashed green line shows the case of dT = 0.9 nm and εp = 0.001.

Close modal

The ADF is defined by

(7)

where N(i,rc) denotes all the neighboring atoms of i within a cut-off radius rc, θjik denotes the angle formed by the atoms j, i, and k, ⟨·⟩ denotes the ensemble average, and Z denotes the normalization factor so that Prc(θ)dθ=1. In Fig. 4, we report the oxygen ADF of the DeePMD region of the AMM system at various cut-off radii rc = 0.270, 0.370, 0.456, and 0.600 nm. All the results are compared with the reference system, and the differences between the AMM and the reference system are presented in the bottom panel of the figure. The ADFs of the DeePMD region in the AMM system are in satisfactory agreement with the reference system.

FIG. 4.

The ADFs of the DeePMD region of the AMM simulation (solid lines) compared with the ADFs of the reference simulation (DeePMD, dotted lines). In the top panel, the ADFs of the AMM simulation are presented by solid lines, while those of the reference simulation are presented by dotted lines. In the bottom panel, the difference between the ADFs of the AMM and reference simulations is shown.

FIG. 4.

The ADFs of the DeePMD region of the AMM simulation (solid lines) compared with the ADFs of the reference simulation (DeePMD, dotted lines). In the top panel, the ADFs of the AMM simulation are presented by solid lines, while those of the reference simulation are presented by dotted lines. In the bottom panel, the difference between the ADFs of the AMM and reference simulations is shown.

Close modal

In addition, we further investigate the RDF and ADF as a function of positions and plot the error of the RDF and the ADF investigated in different subregions of the DeePMD region. As shown in Fig. 5, the overall deviations of either the RDF or the ADF compared with the benchmarks are very small. However, also as expected, the deviation in the region closer to the transition region (Region II) is larger than the deviation lying inside the DeePMD region (Region I).

FIG. 5.

The error of the O–O RDF (top panel) and ADF (rc = 0.37 nm, bottom panel) investigated in different subregions of the DeePMD region, i.e., regions I and II. The whole DeePMD region is of width 2.0 nm. The region I is of width 1.0 nm, and the region II is of width 0.5 nm.

FIG. 5.

The error of the O–O RDF (top panel) and ADF (rc = 0.37 nm, bottom panel) investigated in different subregions of the DeePMD region, i.e., regions I and II. The whole DeePMD region is of width 2.0 nm. The region I is of width 1.0 nm, and the region II is of width 0.5 nm.

Close modal

The computational cost of AMM simulation is dominated by the computation of the atomic interactions and is estimated by T = TD + TC, where TD and TC denote the computational costs of the DeePMD and classical forces, respectively. Since both the DeePMD and classical forces are linearly scalable, the costs are estimated by TD = CD(ρ)ND and TC = CC(ρ)NC, where ND and NC denote the number of atoms on which the DeePMD and classical forces are computed, respectively. CD(ρ) and CC(ρ) are density dependent parameters, which are independent with the number of molecules. We assume the system is well equilibrated so that the number of atoms is estimated by ND = ρ(VD + VT + VB), where VD and VT denote the volumes of DeePMD and transition regions, respectively. The DeePMD force depends on the network derivatives of the neighbors in the cut-off radius, rc;36 therefore, the network derivatives are evaluated for the atoms in a buffering region of width rc outside the transition region, and the computational cost of this part is estimated by ρVB. Similarly, the computational cost of the classical force evaluation is estimated by NC = ρ(VC + VT). In the end, we have

(8)

where C̃D(ρ)=ρCD(ρ) and C̃C(ρ)=ρCC(ρ). The constants C̃D(ρ) and C̃C(ρ) can be estimated by short simulations of small DeePMD and classical systems of the same density. It is noted that using a larger transition region improves the accuracy of AMM, but at the same time, the extra computational cost grows linearly with respect to the size of the transition region. Thus, the size of the transition region should be kept as small as possible, as long as the accuracy of AMM is still satisfactory.

The ratio of the computational cost of the AMM over the full DeePMD simulations is

(9)

where Vsys = VD + VT + VC is the volume of the whole system. At the limit of infinitely large classical region, the ratio converges to C̃C(ρ)/C̃D(ρ), while at the limit of infinitely fast classical force field evaluation, the ratio converges to (VD + VT + VB)/Vsys. Therefore, C̃D(ρ)/C̃C(ρ) and Vsys/(VD + VT + VB) are the highest acceleration ratio that one obtains from using the AMM method, at the corresponding limits.

In our example, the constants C̃D(ρ) and C̃C(ρ) are 7.6 × 10−3 s/nm3/step and 1.2 × 10−3 s/nm3/step on one core of an Intel Xeon Gold 6148 central processing unit (CPU). It is noted that the performance of our in-house code of the classical force field is not optimized. As a comparison, the constant of Gromacs 5.1.437,38 is 7.0 × 10−5 s/(nm3/step), which is 17 times faster than the in-house code. The estimated computational time by Eq. (8) is 0.124 s/step, while the measured wall-time on the same CPU is 0.134 s/step, which validates the estimated (8). The computational cost of a full DeePMD simulation is 0.200 s/step, so the AMM saves 33% [or 38% by estimate (8)] computational cost of the full DeePMD simulation. This may not be significant at the first sight because the volume VD + VT + VB takes 51% of the whole system, then the AMM cannot save more than 49% of the computational cost. Moreover, the sub-optimized force field code also makes the AMM slower. The acceleration of the AMM will be further improved if the DeePMD region becomes smaller, the classical region becomes larger, or the code of the classical force field is further optimized.

In summary, we introduce a promising tool for concurrent coupling of the DeePMD model and a classical force field. It should be clear that the same strategy should also be applicable to general cases, where an expensive model is concurrently simulated with a cheap model. The requirement for the expensive model is that the part dominating the computational expense is computed in a short-range manner.

Future work of this adaptive modeling method is to study biomolecules solvated in water, where one could use DeePMD to only parameterize the potential for the biomolecules and nearby water molecules and couple it to a less expensive water model. It is worth investigating the accuracy in the systems where a long-range electrostatic effect plays an important role. In this situation, the long-range electrostatic of the ML model should be included by, for example, learning the partial charge based on the atomic environment,39 and then the point-charge electrostatic is efficiently computed by fast Ewald algorithms.40,41 It is also of particular interest to investigate the accuracy of the dynamical properties, like auto-correlation functions, upon concurrently coupling of different models.42,43

The work of L.Z. and W.E is supported in part by ONR Grant No. N00014-13-1-0338, DOE Grant Nos. DE-SC0008626 and DE-SC0009248, and NSFC Grant Nos. U1430237 and 91530322. The work of H.W. is supported by the National Science Foundation of China under Grant Nos. 11501039, 11871110, and 91530322, the National Key Research and Development Program of China under Grant Nos. 2016YFB0201200 and 2016YFB0201203, and the Science Challenge Project No. JCKY2016212A502.

A flexible SPC/E water molecule is modeled by three point-mass particles.34 The interaction is composed by the non-bonded and bonded parts

(A1)

The non-bonded interaction has the Coulomb and van der Waals contributions

(A2)

Both the Coulomb and van der Waals contributions are pairwise additive; i.e., they are of the form U = ijU(rij), where rij is the distance between atoms i and j. The Coulomb interaction between an oxygen atom and a hydrogen atom of distance r reads

(A3)

where qO and qH are partial charges of the oxygen and hydrogen atoms, respectively, which are defined in Table I. The Coulomb interaction of oxygen-oxygen and hydrogen-hydrogen atom pairs is defined analogously. The oxygen atoms in the system interact with the Lennard-Jones 6-12 potential

(A4)

where r is the distance between the oxygens, and C12 and C6 are force field parameters given in Table I. The bonded interaction has the bond stretching and the angle bending contributions

(A5)

The bond stretching of the O–H covalence bond is modeled by a harmonic potential

(A6)

where rOH and rOH0 are the O–H bond length and the equilibrium bond length, respectively, and kOH is the spring constant. The bending of the H–O–H angle is modeled by a harmonic potential of the angle

(A7)

where θHOH and θHOH0 are the H–O–H angle and the equilibrium angle, respectively, and kHOH is the spring constant. The values of the parameters of the flexible SPC/E model are provided in Table I.

TABLE I.

The parameters used in the flexible SPC/E water model.

ParameterValueUnit
C12 2.6331 × 10−6 kJ mol−1 nm12 
C6 2.6171 × 10−3 kJ mol−1 nm6 
qH 0.4238 
qO −0.8476 
rOH0 0.1 nm 
kOH 3.45 × 105 kJ mol−1 nm−2 
θHOH0 109.47 deg 
kHOH 3.45 × 105 kJ mol−1 rad−2 
ParameterValueUnit
C12 2.6331 × 10−6 kJ mol−1 nm12 
C6 2.6171 × 10−3 kJ mol−1 nm6 
qH 0.4238 
qO −0.8476 
rOH0 0.1 nm 
kOH 3.45 × 105 kJ mol−1 nm−2 
θHOH0 109.47 deg 
kHOH 3.45 × 105 kJ mol−1 rad−2 
1.
D.
Marx
and
J.
Hutter
, “
Ab initio molecular dynamics: Theory and implementation
,” in
Modern Methods and Algorithms of Quantum Chemistry
, NIC Series, Vol. 3 (NIC-Directors,
2000
), pp.
329
477
.
2.
P.
Hohenberg
and
W.
Kohn
, “
Inhomogeneous electron gas
,”
Phys. Rev.
136
(
3B
),
B864
(
1964
).
3.
W.
Kohn
and
L. J.
Sham
, “
Self-consistent equations including exchange and correlation effects
,”
Phys. Rev.
140
(
4A
),
A1133
(
1965
).
4.
R. M.
Martin
,
Electronic Structure: Basic Theory and Practical Methods
(
Cambridge University Press
,
2004
).
5.
A.
Warshel
and
M.
Levitt
, “
Theoretical studies of enzymic reactions: Dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme
,”
J. Mol. Biol.
103
(
2
),
227
249
(
1976
).
6.
H.
Lin
and
D. G.
Truhlar
, “
QM/MM: What have we learned, where are we, and where do we go from here?
,”
Theor. Chem. Acc.
117
(
2
),
185
(
2007
).
7.
M.
Praprotnik
,
L.
Delle Site
, and
K.
Kremer
, “
Adaptive resolution molecular-dynamics simulation: Changing the degrees of freedom on the fly
,”
J. Chem. Phys.
123
(
22
),
224106
(
2005
).
8.
L.
Delle Site
and
M.
Praprotnik
, “
Molecular systems with open boundaries: Theory and simulation
,”
Phys. Rep.
693
,
1
56
(
2017
).
9.
N.
Bernstein
,
J. R.
Kermode
, and
G.
Csányi
, “
Hybrid atomistic simulation methods for materials systems
,”
Rep. Prog. Phys.
72
(
2
),
026501
(
2009
).
10.
N.
Bernstein
,
C.
Várnai
,
I.
Solt
,
S. A.
Winfield
,
M. C.
Payne
,
I.
Simon
,
M.
Fuxreiter
, and
G.
Csányi
, “
QM/MM simulation of liquid water with an adaptive quantum region
,”
Phys. Chem. Chem. Phys.
14
(
2
),
646
656
(
2012
).
11.
C.
Várnai
,
N.
Bernstein
,
L.
Mones
, and
G.
Csányi
, “
Tests of an adaptive QM/MM calculation on free energy profiles of chemical reactions in solution
,”
J. Phys. Chem. B
117
(
40
),
12202
12211
(
2013
).
12.
L.
Mones
,
A.
Jones
,
A. W.
Götz
,
T.
Laino
,
R. C.
Walker
,
B.
Leimkuhler
,
G.
Csányi
, and
N.
Bernstein
, “
The adaptive buffered force QM/MM method in the CP2K and AMBER software packages
,”
J. Comput. Chem.
36
(
9
),
633
648
(
2015
).
13.
S.
Fritsch
,
P.
Simon
,
C.
Junghans
,
G.
Ciccotti
,
L.
Delle Site
, and
K.
Kremer
, “
Adaptive resolution molecular dynamics simulation through coupling to an internal particle reservoir
,”
Phys. Rev. Lett.
108
(
17
),
170602
(
2012
).
14.
A. B.
Poma
and
L.
Delle Site
, “
Classical to path-integral adaptive resolution in molecular simulation: Towards a smooth quantum-classical coupling
,”
Phys. Rev. Lett.
104
(
25
),
250201
(
2010
).
15.
A.
Agarwal
and
L.
Delle Site
, “
Path integral molecular dynamics within the grand canonical-like adaptive resolution technique: Simulation of liquid water
,”
J. Chem. Phys.
143
(
9
),
094102
(
2015
).
16.
X. H.
Li
,
M.
Luskin
, and
C.
Ortner
, “
Positive definiteness of the blended force-based quasicontinuum method
,”
Multiscale Model. Simul.
10
(
3
),
1023
1045
(
2012
).
17.
J.
Lu
and
P.
Ming
, “
Convergence of a force-based hybrid method in three dimensions
,”
Commun. Pure. Appl. Math.
66
(
1
),
83
108
(
2013
).
18.
X. H.
Li
,
M.
Luskin
,
C.
Ortner
, and
A. V.
Shapeev
, “
Theory-based benchmarking of the blended force-based quasicontinuum method
,”
Comput. Methods Appl. Mech. Eng.
268
,
763
781
(
2014
).
19.
H.
Wang
,
S.
Liu
, and
F.
Yang
, “
A posteriori error control for three typical force-based atomistic-to-continuum coupling methods for an atomistic chain
,”
Numer. Math. Theor. Meth. Appl.
12
,
233
264
(
2019
), available at http://www.global-sci.org/intro/10.4208/nmtma.OA-2017-0094.
20.
J.
Behler
and
M.
Parrinello
, “
Generalized neural-network representation of high-dimensional potential-energy surfaces
,”
Phys. Rev. Lett.
98
(
14
),
146401
(
2007
).
21.
T.
Morawietz
,
A.
Singraber
,
C.
Dellago
, and
J.
Behler
, “
How van der Waals interactions determine the unique properties of water
,”
Proc. Natl. Acad. Sci. U. S. A.
113
(
30
),
8368
8373
(
2016
).
22.
A. P.
Bartók
,
M. C.
Payne
,
R.
Kondor
, and
G.
Csányi
, “
Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons
,”
Phys. Rev. Lett.
104
(
13
),
136403
(
2010
).
23.
M.
Rupp
,
T.
Alexandre
,
K.-R.
Müller
, and
O. A.
von Lilienfeld
, “
Fast and accurate modeling of molecular atomization energies with machine learning
,”
Phys. Rev. Lett.
108
(
5
),
058301
(
2012
).
24.
K. T.
Schütt
,
F.
Arbabzadah
,
S.
Chmiela
,
K. R.
Müller
, and
T.
Alexandre
, “
Quantum-chemical insights from deep tensor neural networks
,”
Nat. Commun.
8
,
13890
(
2017
).
25.
S.
Chmiela
,
T.
Alexandre
,
H. E
Sauceda
,
I.
Poltavsky
,
K. T.
Schütt
, and
K.-R.
Müller
, “
Machine learning of accurate energy-conserving molecular force fields
,”
Sci. Adv.
3
(
5
),
e1603015
(
2017
).
26.
J. S.
Smith
,
O.
Isayev
, and
A. E.
Roitberg
, “
ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost
,”
Chem. Sci.
8
(
4
),
3192
3203
(
2017
).
27.
J.
Han
,
L.
Zhang
,
R.
Car
, and
W.
E
, “
Deep potential: A general representation of a many-body potential energy surface
,”
Commun. Comput. Phys.
23
(
3
),
629
639
(
2018
), available at http://www.global-sci.com/issue/v23/n3/pdf/629.pdf.
28.
L.
Zhang
,
J.
Han
,
H.
Wang
,
R.
Car
, and
W.
E
, “
Deep potential molecular dynamics: A scalable model with the accuracy of quantum mechanics
,”
Phys. Rev. Lett.
120
,
143001
(
2018
).
29.
W. L.
Jorgensen
,
J.
Chandrasekhar
,
J. D.
Madura
,
R. W.
Impey
, and
M. L.
Klein
, “
Comparison of simple potential functions for simulating liquid water
,”
J. Chem. Phys.
79
(
2
),
926
935
(
1983
).
30.
L.
Delle Site
, “
Some fundamental problems for an energy-conserving adaptive-resolution molecular dynamics scheme
,”
Phys. Rev. E
76
(
4
),
047701
(
2007
).
31.
H.
Wang
and
A.
Agarwal
, “
Adaptive resolution simulation in equilibrium and beyond
,”
Eur. Phys. J.: Spec. Top.
224
(
12
),
2269
2287
(
2015
).
32.
H.
Wang
,
C.
Hartmann
,
C.
Schütte
, and
L.
Delle Site
, “
Grand-canonical-like molecular-dynamics simulations by using an adaptive-resolution technique
,”
Phys. Rev. X
3
(
1
),
011018
(
2013
).
33.
H.
Wang
,
C.
Schutte
, and
L.
Delle Site
, “
Adaptive resolution simulation (AdResS): A smooth thermodynamic and structural transition from atomistic to coarse grained resolution and vice versa in a grand canonical fashion
,”
J. Chem. Theory Comput.
8
(
8
),
2878
2887
(
2012
).
34.
H. J. C.
Berendsen
,
J. R.
Grigera
, and
T. P.
Straatsma
, “
The missing term in effective pair potentials
,”
J. Phys. Chem.
91
(
24
),
6269
6271
(
1987
).
35.
X.
Gao
,
J.
Fang
, and
H.
Wang
, “
Sampling the isothermal-isobaric ensemble by Langevin dynamics
,”
J. Chem. Phys.
144
(
12
),
124113
(
2016
).
36.
H.
Wang
,
L.
Zhang
,
J.
Han
, and
W.
E
, “
DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics
,”
Comput. Phys. Commun.
228
,
178
(
2018
).
37.
S.
Pronk
,
S.
Páll
,
R.
Schulz
,
P.
Larsson
,
P.
Bjelkmar
,
R.
Apostolov
,
M. R.
Shirts
,
J. C.
Smith
,
P. M.
Kasson
,
D.
van der Spoel
,
B.
Hess
, and
E.
Lindahl
, “
GROMACS 4.5: A high-throughput and highly parallel open source molecular simulation toolkit
,”
Bioinformatics
29
,
845
(
2013
).
38.
M. J.
Abraham
,
T.
Murtola
,
R.
Schulz
,
S.
Páll
,
J. C.
Smith
,
B.
Hess
, and
E.
Lindahl
, “
GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers
,”
SoftwareX
1
,
19
25
(
2015
).
39.
N.
Artrith
,
T.
Morawietz
, and
J.
Behler
, “
High-dimensional neural-network potentials for multicomponent systems: Applications to zinc oxide
,”
Phys. Rev. B
83
(
15
),
153101
(
2011
).
40.
R. W.
Hockney
and
J. W.
Eastwood
,
Computer Simulation Using Particles
(
IOP
,
London
,
1988
).
41.
T.
Darden
,
D.
York
, and
L.
Pedersen
, “
Particle mesh Ewald: An N log(N) method for Ewald sums in large systems
,”
J. Chem. Phys.
98
,
10089
(
1993
).
42.
A.
Agarwal
,
J.
Zhu
,
C.
Hartmann
,
H.
Wang
, and
L.
Delle Site
, “
Molecular dynamics in a grand ensemble: Bergmann–Lebowitz model and adaptive resolution simulation
,”
New J. Phys.
17
(
8
),
083042
(
2015
).
43.
L.
Delle Site
, “
Formulation of Liouville’s theorem for grand ensemble molecular simulations
,”
Phys. Rev. E
93
(
2
),
022130
(
2016
).