Incorporation of quantum mechanical electronic structure data is necessary to properly capture the physics of many chemical processes. Proton hopping in water, which involves rearrangement of chemical and hydrogen bonds, is one such example of an inherently quantum mechanical process. Standard ab initio molecular dynamics (AIMD) methods, however, do not yet accurately predict the structure of water and are therefore less than optimal for developing force fields. We have instead utilized a recently developed method which minimally biases AIMD simulations to match limited experimental data to develop novel multiscale reactive molecular dynamics (MS-RMD) force fields by using relative entropy minimization. In this paper, we present two new MS-RMD models using such a parameterization: one which employs water with harmonic internal vibrations and another which uses anharmonic water. We show that the newly developed MS-RMD models very closely reproduce the solvation structure of the hydrated excess proton in the target AIMD data. We also find that the use of anharmonic water increases proton hopping, thereby increasing the proton diffusion constant.
With the rapid increase in computer power, complemented by more sophisticated and efficient computational algorithms, simulations have become an effective way of understanding and even predicting the behavior of complex chemical systems. Quantum mechanics plays a fundamental role in chemical processes, and for a simulation to be physically meaningful, i.e., for the simulation to accurately describe a given chemical process with the correct underlying physics, it is often necessary to include electronic structure effects. More specifically, the electronic structure of a material can greatly affect its polarization, charge transfer, and chemical reactivity. However, the inclusion of these quantum effects (e.g., chemical reactivity) introduces a significant computational cost which greatly limits the range of applications. It is therefore desirable to use molecular mechanical models, which can increase the breadth and depth of the simulation studies, for applicable systems. Incorporating quantum mechanical effects such as reactivity into such models is a challenging task, especially given the fact that many approximate quantum mechanical methods do not necessarily provide an accurate description of a given system in the first place.
Consider, for example, simulations of liquid water. Ab initio molecular dynamics (AIMD) is the most common method for performing dynamics simulations that explicitly treat electronic degrees of freedom. However, when employing a sufficiently large basis set, AIMD simulations can only handle upwards of several hundred water molecules (or several thousand valence electrons) at the generalized gradient approximation (GGA) level of density functional theory (DFT). Unfortunately, at this particular level of DFT, water is often found to be highly over-structured, with a melting temperature of around 400 K and a self-diffusion constant an order of magnitude smaller than experiment. Indeed, most DFT-based AIMD performs rather poorly in terms of accuracy relative to the experiment for systems of liquid water.
To mitigate these limitations, more sophisticated quantum mechanical techniques have been employed. One recent example is from Willow et al.1 In that work, the authors use an embedded fragment second order Møller-Plesset scheme to perform AIMD. Such simulations promise to yield quite accurate predictions of the water self-diffusion constant, dipole moment, and coordination number relative to the experiment. However, this method is still very expensive and therefore cannot be applied to large systems; simulations were performed on systems of only 32 water molecules.
Another approach utilizes sophisticated quantum mechanical calculations to parameterize classical water models.2,3 In terms of force-matching, Wang and co-workers have developed a four-site water model using forces generated using DFT (including Hartree-Fock exchange) and MP2,4,5 as well as models fit to post Hartree-Fock methods.6,7 Resulting classical models have been shown to reproduce several experimental properties, including density and IR spectra, and improve the water self-diffusion constant relative to AIMD.
As an alternative, recent work in our group8–10 has advanced the use of a direct minimal bias of simulation models with experimental data, called “experiment directed simulation” (EDS). As one key example, rather than employing a more sophisticated electronic structure method to perform the AIMD simulation, a combined EDS-AIMD approach10 utilizes limited experimental data to parameterize a bias of the DFT functional in an AIMD simulation. For the example of bulk liquid water, the results of Ref. 10 show that applying a minimal bias to just the interaction between water oxygen atoms (i.e., the O–O radial distribution function or RDF) results in a substantially less structuring of EDS-AIMD water and also an increase in the water self-diffusion constant (a quantity not biased in the simulation).
While inclusion of electronic degrees of freedom in simulations of water as a baseline allows in principle for a more rigorous description, it is not clearly necessary for accuracy at all: there are several classical empirical water models that can reproduce a number of experimental properties rather well (see, e.g., Refs. 11–15, and in that vein, Ref. 5 represents the current and best state-of-the-art). In the case of hydrated excess proton diffusion, however, a purely classical model is incapable of accurately reproducing the physics of the process. Proton diffusion in water is comprised of both vehicular diffusion and Grotthuss hopping, in which covalent and hydrogen bonds rearrange, significantly increasing the overall proton self-diffusion constant.16–21 Because the Grotthuss process involves chemical reactivity, it is inherently quantum mechanical in nature. Traditional force fields, which have fixed bonding topologies, are therefore inadequate to describe this process. Multistate empirical valence bond (MS-EVB) and more recently multiscale reactive molecular dynamics (MS-RMD) methods have been shown to capture the physics of this process and closely reproduce the solvation structure of a hydrated excess proton in water.19,22–27 Unlike classical force fields, these reactive MD methods explicitly allow for bond breaking and formation and are thus capable of including the essential physics involved in proton transport in water and aqueous systems.
The most sensible development and parameterization of reactive MD force fields utilize explicit quantum mechanical data, given the quantum mechanical nature of reactivity, but this parameterization is highly nontrivial. MS-EVB models were initially fit to reproduce potential energy surfaces along the proton transfer coordinate in small clusters using MP2.19,22–24 Such parameterizations utilized gas-phase data to generate aspects of an MS-EVB model designed for bulk simulations. By contrast, condensed phase force-matching to AIMD data has previously been employed to parameterize MS-RMD models, wherein a variational force residue is minimized.25,26 Both these fitting methods were demonstrably successful but also have their challenges and limitations.
In a preliminary communication, we have more recently shown that relative entropy minimization (REM)28–31 is an effective and efficient method for generating new MS-RMD force fields from AIMD data.27 In that work, we parameterized our model to bulk phase AIMD data for the excess proton in water. Despite the fact that AIMD water is known to be over-structured due to deficiencies in the underlying level of DFT,32,33 the excess proton solvation behavior is actually similar to the experiment for a number of properties,34 and we were able to generate MS-RMD models with REM which faithfully reproduced both the AIMD RDFs and potential of mean force (PMF) along the proton transfer reaction coordinate. Given the recent development of EDS-AIMD,10 in which the water over-structuring is significantly reduced and self-diffusion is increased, in this work, we instead develop MS-RMD models fit to EDS-AIMD data. As such, we have parameterized two new MS-RMD models to the EDS-AIMD data using REM. The primary difference between the two models is the underlying water model employed in the MS-RMD: one uses standard harmonic simple point charge/flexible water (SPC/Fw),14 while the other uses anharmonic aSPC/Fw.35 It is important to note that both the EDS-AIMD and the MS-RMD models in this paper utilize REM in different—and important—ways.
The remainder of this paper is organized as follows: in Sec. II, we describe the MS-RMD force field, REM, and the basic EDS-AIMD scheme. In Sec. III, we discuss the models developed in this paper and give results for the hydrated proton solvation structure, the PMF along the proton transfer coordinate, the excess proton diffusion constant, and the proton hopping behavior. Concluding remarks are provided in Sec. IV.
In MS-RMD, the ground state wavefunction is defined in terms of a linear combination of basis states i, each having a unique bonding topology
The coefficients ci are determined by diagonalizing the RMD Hamiltonian “on the fly” for each configuration of the nuclei x,
where E0(x) is understood here to be a diagonal matrix of eigenvalues as a function of all nuclear coordinates. The diagonal matrix elements of the Hamiltonian matrix H are defined as
The interatomic interactions between the hydronium ion and surrounding waters are defined as
The oxygen and hydrogen atoms denoted with an asterisk are for hydronium oxygen and hydrogen nuclei. The subscript “w” denotes a water molecule. The last two terms in the above equation, and , are added corrections to the diabatic energies of the underlying classical force field and have the effect of correcting for an over-attraction between classical hydronium and water. They are defined as
The off-diagonal matrix elements hij, which describe mixing between states |i〉 and |j〉, are defined by
The lattermost term on the right hand side of the equation, A(ROO, q), is a geometry dependent scaling factor
and is a function of the interatomic distance between the hydronium and water oxygen and q, which is defined by
In Eq. (7), is a constant, and describes the electrostatic interaction between the Zundel complex, H5O2+, and surrounding waters in the complex
For the two MS-RMD models fit in this work, the parameters are given in the supplementary material. We note that the underlying classical force field parameters were not fit and were taken from MS-EVB 3.2.24
In order to parameterize the models presented in this paper, we employed the REM approach. The relative entropy is defined by
where pT(i) and pM(i) are the probabilities of the target and model ensembles being in state i, respectively. In the constant NVT ensemble, this term can be rearranged to give
where U is the potential energy, A is the configurational part of the Helmholtz free energy, and Smap is the mapping entropy which represents entropy arising from degeneracies in the model. The notation 〈…〉 indicates averages over the ensembles, labeled by the subscript. The relative entropy is minimized upon the condition that the average derivative of the model potential energy with respect to model parameters λ is equal over the model and target ensembles,
The simplest implementation of such a minimization is the Newton-Raphson iterative scheme, where λi is updated according to
In this equation, is just a mixing term.
As we noted in Ref. 27, the Newton-Raphson scheme is a local minimization and can therefore result in degenerate solutions, with no clear guarantee that one reaches the global minimum of Srel. In fact, because of this shortcoming, the converged set of parameters may be dependent on the initial parameter set. The primary reason in the present case for there being so many solutions is the large number of parameters, the majority of which are nonlinear, that are being simultaneously fit. Interestingly, the extraordinary flexibility of such a fitting scheme actually appears to be advantageous in that some parameters can effectively compensate for others. On the other hand, a large number of flexible parameters can potentially lead to overfitting. We have found that all models fit using REM have successfully reproduced the proton solvation structure. The location of the proton transfer PMF minimum is also always reproduced, though the barrier height can differ slightly. We also find some variation in the resulting proton self-diffusion constant, though the values are usually within 30% of those reported here. One can certainly imagine a more sophisticated implementation of REM which does not use the Newton-Raphson scheme and is therefore not subject to getting trapped in local minima, but within the context of the work presented here, we find such an implementation to be suitable. In addition, the simplicity of a Newton-Raphson implementation certainly adds to the appeal of REM.
REM is indifferent to target ensemble, so as higher level quantum chemistry methods become computationally feasible for dynamics simulations, REM can be used to further refine models. As an alternative to a higher level (but much more costly) of quantum mechanics in the AIMD, which in theory would more accurately describe the system relative to experiment, we have chosen here to parameterize our model using quantum data generated using the recently developed EDS-AIMD method.10 In EDS-AIMD, the interactions between water oxygen atoms are biased to reproduce the experimental Ow–Ow RDF in AIMD simulations of water. In order to do so, the bias specifically targets the oxygen coordination number and its moments. The bias V(ri) is defined as
where N is the number of oxygen neighbors of given water oxygen, rk is the kth power of distance to said coordinating atoms, rij is the pairwise distance of oxygen neighbors, u is a step function, and r0 is the experimental distance of the first solvation peak of the O–O radial distribution function. The linear bias αk is determined using the EDS algorithm to match experimental data, and is the desired ensemble average of the kth scalar calculated from the experimental RDF. We direct the reader to Ref. 10 for more details on the EDS method used to generate the target ensemble used here.
The target EDS-AIMD configurations in this model utilize the BLYP functional,36,37 with the D3 Grimme dispersion correction38,39 and TZVP basis set. The systems used to parameterize the models consist of 128 water molecules plus an excess proton. Production runs of the MS-RMD models use 256 water molecules plus an excess proton.
III. RESULTS AND DISCUSSION
We begin with a brief discussion of the structural properties of an excess proton in water simulated using EDS-AIMD, particularly in comparison to unbiased AIMD (BLYP+D3 level). The relevant RDFs for EDS-AIMD and AIMD are shown in Fig. 1. Interestingly, despite the significant change in the water Ow–Ow RDFs between EDS-AIMD and AIMD (see Ref. 10), the hydrated proton solvation structure is only marginally altered. First, consider the O*–Ow RDF, which perhaps is the most telling in terms of the proton solvation structure (where the asterisk denotes a hydronium nucleus). The first solvation shell peak for both EDS-AIMD and AIMD is located at 2.53 Å; the first solvation shell peak heights are 4.22 for AIMD and EDS-AIMD. This is consistent with previous results for the hydrated excess proton comparing EDS-AIMD and AIMD,10 which did not include the D3 dispersion correction. Interestingly, the height of the second solvation shell peak in EDS-AIMD is slightly lower in AIMD. This solvation peak is largely dominated by water-water interactions, which means the EDS bias is more able to reduce the over-structuring representative of AIMD water. The H*–Ow RDFs for both systems have very similar peak heights and locations for both the first and second solvation shells. The O*–Hw RDFs showed slightly more variability. While the first solvation shell peak location and height are nearly identical in both systems, the second solvation shell in EDS-AIMD is less pronounced and at a slightly longer distance (5.10 Å compared to 4.94 Å), again a result of reduced water-water over-structuring. Importantly, the so-called pre-solvation shell,24 located at around 2 Å, is less prominent in EDS-AIMD than in AIMD with just BLYP+D3 DFT.
We also compared the PMF along the proton transfer coordinate between the AIMD, EDS-AIMD, and the two RMD models developed in this paper. We define the proton transfer coordinate δ,
where is the distance between hydronium oxygen and hydronium hydrogen, and is the distance between hydronium hydrogen and nearest water oxygen. As was seen in previous work which did not include the D3 dispersion correction to BLYP,10 the AIMD and EDS-AIMD PMFs are virtually identical, both having a minimum at 0.33 Å and a barrier of 0.60 kcal/mol.
We next turn to the hydrated proton solvation structure, comparing the MS-RMD fit models with the target EDS-AIMD results as shown in Fig. 1. First, we compare the O*–Ow RDF, as it is most illustrative of solvation structure around hydrated protons (“hydronium” cation). The MS-RMD models with water solvent having harmonic and anharmonic internal vibrations both reproduce the location of the first solvation peak, at 2.53 Å, with peak heights of 4.23 and 4.40 for MS-RMD models compared to 4.22 for EDS-AIMD. The MS-RMD models do not entirely reproduce the depth of the first valley, though the density in the regions between the first and second solvation shells is similar for EDS-AIMD and RMD. Importantly, there is greater overlap in the second solvation between EDS-AIMD and MS-RMD models. This is actually less a function of a well-fit model than it is an improved AIMD target via EDS-AIMD. As was mentioned earlier, typical GGA DFT-based AIMD water is over-structured, and this second solvation peak is mostly governed by water-water interactions. Because of this, our preliminary efforts27 to parameterize new MS-RMD models to AIMD were unable to completely reproduce the O*–Ow second solvation shell peak. Still, as seen in Fig. 1, there is not perfect overlap of the second solvation shell peaks between the EDS-AIMD and the MS-RMD models.
Comparing the O*–Hw RDFs, the MS-RMD models cannot exactly reproduce the first solvation shell peak of the EDS-AIMD data, but the agreement is still very good. In EDS-AIMD, the first intermolecular solvation peak is centered at 3.05 Å, with a height of 2.13; in MS-RMD, the peak is centered at 3.15 Å with a height of 2.05. When fitting to EDS-AIMD, the second solvation peak of the new MS-RMD models has a greater overlap with the target: EDS-AIMD has a peak centered at 5.00 Å with a height of 1.09, while MS-RMD has a peak centered at 5.12 Å with a height of 1.11. Again, this is due to the improved water-water interactions of the target.
The H*–Ow RDF in Fig. 1 also shows a good overlap between the EDS-AIMD and MS-RMD results. The first solvation shell for EDS-AIMD is centered at 1.50 Å with a height of 2.74; the RMD models are centered at 1.47 Å with a height of 2.88 and 3.30 and for MS-RMD 4 and MS-RMD 5 models, respectively. The second solvation shell for EDS-AIMD is centered at 3.08 Å with a height of 1.75; the MS-RMD models are centered at 3.05 Å with a height of 1.79.
There is also a good overlap of the H*–Hw EDS-AIMD and MS-RMD RDFs; however, there is no real improvement in the overlap between new MS-RMD models with EDS-AIMD compared with previous models.27 This is a result of the weak contact between hydronium hydrogens and water hydrogens, which means such interactions contribute less to the total potential energy and therefore contribute less to the relative entropy.
Next, we compare the PMF along the proton transfer coordinate between EDS-AIMD and MS-RMD 4 and 5 as seen in Fig. 2. The REM generates MS-RMD models that closely reproduce the PMF along the proton transfer coordinate of the target, though the MS-RMD models diverge most significantly in regions where the free energy is higher. In the present work, both MS-RMD 4 and MS-RMD 5 actually quite closely reproduce the high free energy regions where δ is large but do not perfectly reproduce the proton transfer barrier. MS-RMD 4 has a barrier of 0.78 kcal/mol, which is larger but still close to that of EDS-AIMD, 0.60 kcal/mol; the free energy minimum for MS-RMD 4 is located at 0.32 Å, whereas that of EDS-AIMD is 0.34 Å. The free energy barrier of MS-RMD 5, 0.67 kcal/mol, agrees better with that of EDS-AIMD, and the free energy minimum 0.32 Å is the same as that of MS-RMD 4. We have shown in previous work27 that REM is far better at reproducing the region of PMF with lower free energy. This is due to the fact that for a given ensemble, more configurations are from lower free energies, which means that lower free energy regions have a greater contribution to the average potential energy derivatives in Eqs. (12)–(14).
One of the objectives of the present work is to also improve the proton self-diffusion constants of MS-RMD. Previous MS-EVB and MS-RMD models have underestimated the experimental excess proton self-diffusion constant. For example, the proton self-diffusion constant for MS-EVB 3.2 is 0.37 ± 0.01 Å2/ps and for MS-RMD 3 is 0.41 ± 0.05 Å2/ps,27 compared to the experimental value of 0.94 ± 0.01 Å2/ps. This disagreement is likely due to a number of factors: primarily that MS-EVB and MS-RMD do not include nuclear quantum effects; finite size effects can also play a role. EDS-AIMD, on the other hand, has a higher diffusion constant of 0.58 ± 0.02 Å2/ps compared to experiment. The MS-RMD models presented herein differ in their improvement of the calculated self-diffusion constants. For MS-RMD 4, the calculated self-diffusion constant is 0.41 ± 0.07 Å2/ps, which is the same as previously reported models. For MS-RMD 5, however, it is 0.47 ± 0.05 Å2/ps, which makes this model the most accurate MS-RMD hydrated excess proton model to date in terms of the self-diffusion constant. As such, we find that inclusion of anharmonic water, aSPC/Fw, is important for increasing the proton self-diffusion constant. This is also consistent with the work of Paesani and co-workers.35
We also note that accounting for finite size effects and nuclear quantum effects will increase the self-diffusion constant value by as much as a factor of 1.56.24 If we multiply the MS-RMD 5 result by this correction factor, we arrive at an excess proton self-diffusion constant estimate of 0.7 Å2/ps, which is quite close to the experimental value. We note that the EDS-AIMD excess proton diffusion result with this same correction becomes 0.9 Å2/ps, which is in essentially perfect agreement with experiment.
One of the most important properties of the hydrated excess proton in water is that it undergoes Grotthuss hopping; quantifying such hops is therefore an important way of analyzing any excess proton model. We therefore performed an analysis of the concerted hopping behavior.34 The most natural way of quantifying hopping events is to associate them with a time scale τ. We then define a hopping function hn(τ), which is a function of the number of water molecules to which the excess proton is associated; the value n is the number of distinct water molecules that the excess proton is associated with during n intervals of length τ. For example, if an excess proton were initially associated with water A, and after a time τ is still associated with A, then h1(τ) is incremented. If after a second interval τ the proton had hopped to another water B, then h2(τ) would be incremented. In Fig. 3, we show the concerted hopping for a number of models: AIMD, EDS-AIMD, MS-EVB 3.2, MS-RMD 3, MS-RMD 4, and MS-RMD 5. While we do not discuss all of these models in depth in this paper, we show them here in order to emphasize the improvement in hopping in the anharmonic water (MS-RMD 5) model presented herein. First, for n = 1 and n = 2, the concerted hopping across all models is similar. That is, for all models, the likelihood that an excess proton is associated with 1 or with 2 water molecules is nearly identical. The primary differences emerge when one looks at h3(τ). First, for both AIMD and EDS-AIMD, there is significantly more hopping at short times than in any of the EVB/RMD models. This means that concerted and/or rapid sequential hopping events are much more likely to occur in AIMD and EDS-AIMD, which partially explains the larger self-diffusion constants. The concerted hopping for our harmonic water model, MS-RMD 4, is similar to that of previous MS-RMD and MS-EVB models. The anharmonic MS-RMD 5, on the other hand, exceeds those of previous models, which contributes to the enhancement in the self-diffusion constant. We also see that the differences in concerted hopping are most pronounced at short τ values and that as τ gets large enough, all the models are similar. Effectively, the concerted hopping in short time intervals provides an enhancement to the self-diffusion constant, which none of the MS-RMD models completely capture. Future developments in MS-RMD modeling for this problem will likely require a revised functional form over Eqs. (2)–(10) (the MS-RMD “basis set”) in order to better capture this concerted proton hopping behavior. We emphasize, however, that concerted proton hopping is not the dominant effect defining the solvation structure nor the transport of the hydrated excess proton in water.
Finally, we consider the forward hopping.23,24 We define forward hopping as hopping that results in the proton being associated with a new water molecule. Therefore, if a proton is initially associated with water A, and then hops to water B, the forward hopping increases by 1. If the proton subsequently returns to water A, the forward hopping decreases by 1. The forward hopping results for EDS, MS-RMD 4, and MS-RMD 5 are shown in Fig. 4. For the sake of clarity and due to the limited AIMD data, we show only a 20 ps window. First, notice that for EDS-AMD, the forward hopping far surpasses that of MS-RMD 4 and MS-RMD 5. This explains some of the increase in self-diffusion in EDS-AIMD. Also note that in the region around t = 18 ps, forward hopping decreases quickly, followed by a rapid increase. The decrease in forward hopping occurs when the proton hops along a chain of waters across which it had already hopped. Another important property illustrated in this plot is that while forward hopping events occur fairly frequently in the MS-RMD models, they suffer from the phenomenon of “rattling,” which occurs when a proton rapidly hops between the same two water molecules. The rapid rattling processes indeed result in a decrease in the overall diffusion constant, as while proton transfer is occurring, the overall excess proton displacement is not substantial.
IV. CONCLUDING REMARKS
We have developed two new MS-RMD models—models that are fit for the first time to EDS-AIMD data. While the excess proton solvation structure with EDS-AIMD is not substantially altered from more standard AIMD utilizing the BLYP+D3 level of DFT, it does have the advantage of having removed the effects of over-structured water solvent present in virtually all AIMD simulations of water to date. Of the two MS-RMD models we developed in this paper, one uses an underlying harmonic water model (MS-RMD 4), whereas one uses an anharmonic water model (MS-RMD 5). While both models have similar proton solvation structures, the anharmonic model has a slightly lower internal proton transfer barrier and about a 15% higher proton self-diffusion constant. We ascribe the increase in proton self-diffusion to the increase in concerted and forward hopping. This is a reasonably important property of proton transport in water so is therefore valuable to capture it in an MS-RMD model.
As we have previously shown in the preliminary work,27 REM is a very effective fitting scheme in the development of MS-RMD models, and it is capable of capturing some of the chemically reactive nature of proton transfer into a reactive force field (in this case, MS-RMD). The development of such reactive MD force fields then allows one to study proton solvation and transport in systems on length and time scales that are much too large for AIMD methods that explicitly treat electronic degrees of freedom. As such, the work presented in this paper, along with the EDS-AIMD approach, provides a promising route for accurately parameterizing novel reactive force fields from quantum mechanical simulation data.
See supplementary material for parameters of the MS-RMD 4 and MS-RMD 5 models.
This research was supported by the Department of Energy (DOE), Office of Basic Energy Sciences (BES), Division of Chemical Sciences, Geosciences, and Biosciences (Award No. DE-SC0005418). C.C. is grateful for the financial support from National Natural Science Foundation of China (Grant Nos. 21303123 and 21303124) and the China Scholarship Council (No. 201306275019). The computational resources in this work were provided by the University of Chicago Research Computing Center (RCC).