We report here a new force field for water based **solely on quantum mechanics** (QM) calculations with **no empirical data. The QM was at a high level, coupled cluster single double triple**, for all orientations and distances for water dimer plus X3LYP density functional theory (DFT) on 19 larger water clusters. In addition, we included charge and polarization based on the polarizable charge equilibration method and nonbond interactions from DFT-D3 calculations on the H_{2} and O_{2} crystal. This model, denoted as RexPoN, provides quite excellent agreement with experimental (expr) data for the solid and liquid phase of water: *T*_{melt} *=* 273.3 K (expr = 273.15 K) and properties at 298 K: *ΔH*_{vap} = 10.36 kcal/mol (expr = 10.52), density = 0.9965 gr/cm^{3} (expr = 0.9965), entropy = 68.4 (J/mol)/K (expr = 69.9), dielectric constant = 76.1 (expr = 78.4), and ln *D*_{s} (self-diffusion coef) = −10.08 (expr = −11.24). Such an accurate force field for water will, we believe, be useful for full solvent calculations of electrocatalysis, where we can restrict QM water to just the first one or two layers involving reactions, using RexPoN to provide the polarization for a more distant solvent. Also, RexPoN may provide a better description of the solvent for proteins, DNA, polymers, and inorganic systems for applications to biomolecular, pharma, electrocatalysis (fuel cells and water splitting), and batteries where interaction with explicit water molecules plays a significant role.

## I. INTRODUCTION

Because of its central role in life and sustenance, the properties of water have been widely studied. Although chemically simple, it has been hard to model^{1,2} with many anomalies such as the critical point of supercooled water below 219 K,^{3} the maximum density at 4 °C, the high dielectric constant and boiling point, the non-monotonic isothermal compressibility and specific heat,^{4} and the fifteen known crystal structures for ice.^{5,6} Because of the lack of long range order, molecular dynamics (MD) simulations have played the major role in elucidating these properties. These MD studies are nearly all based on empirical force fields (FFs) because quantum mechanics (QM) simulations are limited to 100s of atoms for 10s of picoseconds (ps). Starting with the first model of Bernal and Fowler in 1933 and the first computer simulation of water by Barker and Watts in 1969, hundreds of water models have been developed and applied to the properties of water^{1,2} but there remain many puzzles.

The most popular empirical FFs^{1,2} are the TIPnP (n = 3, 4, 5)^{7–9} and simple point-charge (SPC)^{10} families, which use rigid molecular structures. The parameters in these models are fitted to reproduce one or more macroscopic properties of water but none reproduce all important properties of the solid and liquid phase.

Flexible (e.g., TIP4PF^{11} and SPC/Fw^{12}) and fluctuating charge (e.g., TIP4PQ^{77} and SPC/FQ^{13}) models are used less often because of increased costs with no obvious increase in accuracy.

Polarizable FFs [e.g., Anisotropic Site Potentials (ASP);^{14} Symmetry-adapted Perturbation Theory (SAPT);^{15} and Thole-type Model (TTM) families^{16}] and Drude oscillator methods (e.g., SWM4-DP^{17}) have been used a little, but some exhibit a “polarization catastrophe” at short distances.^{18}

Some water models have been developed from fitting many body potential energy terms to millions of high-quality energy datasets [e.g., Huang, Braams, and Bowman (HBB);^{19} CC-pol;^{20} and MB-pol^{21}]. These models are not easily transferable to other systems since for any new set of interactions (new species), this tedious fitting procedure must be repeated.^{22}

*Ab initio* molecular dynamics (AIMD) methods have been improved over the last two decades to perform at least close to empirical models.^{23} Various computational techniques have been implemented to improve the qualities of water models but not yet successful. For example, machine learning methods have been used to improve the quality of explicit polarization and Coulombic terms in the water models.^{24} More discussion of different types of water models is provided elsewhere.^{1,2}

We decided to develop a FF based fully on the best available QM, with the hope that it would correctly describe the standard properties of water including the melting point (*T*_{m}), density (*ρ*), heat of vaporization (*ΔH*_{vap}), entropy (*S*^{0}), dielectric constant (*ε*), self-diffusion constant (*D*_{s}), oxygen-oxygen radial distribution function (*R*_{OO}), and which might additionally explain the anomalies of water. Our procedure is as follows:

First, establish the

**long range interactions**: the van der Waals nonbond (NB) interactions (London dispersion plus Pauli repulsion) from QM.Describe the electrostatic (polarization and charge) interactions based on our recently developed polarizable charge equilibration (PQEq) method that has been shown to reproduce the QM polarization energies due to bringing point dipole up to various small molecules.

Use Coupled Cluster Single Double Triple, CCSD(T), QM on a water monomer

^{25}(including OH bond dissociation and HOH angle terms) to determine the**bond and angle valence terms**.Use Coupled Cluster Single Double Triple, CCSD(T), QM on a water dimer

^{25}to determine the**hydrogen bond (HB) corrections**.

The final FF is called RexPoN to indicate that it includes charge and polarization (Po) and Nonbond (N) while allowing bond breaking (Rex). RexPoN is aimed at doing both reactive and nonreactive systems. Thus, we use the full RexPoN formalism to define the valence bond and angle terms at the same level as would be used in reactions.

However, in the validations reported here, we use a rigid water model where OH bonds and HOH angles are kept fixed at the values from CCSDT throughout the simulations (see below). There are questions concerning whether the rigid water model describes nuclear quantum effects (NQE) including zero-point energy (ZPE) and tunneling. The hydrogen bonding between water molecules can be affected by ZPE effects for OH bond stretch and HOH bending vibrational modes.^{26} However, it has been shown that previous estimates of NQE effects were greatly overestimated because of the use of harmonic intramolecular interactions, ignoring anharmonicity.^{27} In addition, proton delocalization and tunneling effects are expected to be important only at very low temperatures and high pressures which are not the conditions of simulations here.^{26} Moreover, RexPoN includes the correct anharmonicity at the level of CCSDT since the OH bond and HOH angle curves were fitted to this level. Therefore, flexible RexPoN will likely do well on NQE effects in water.

The rest of this paper is organized as below. Section II describes the model and the physical motivation behind each potential energy term. Section II also discusses the training sets and optimization process of the FF. Section III provides the results of the simulations for several ice Ih and water properties and compares the results with experimental data. The conclusions are given in Sec. IV.

## II. THE MODEL

The total energy of the system is expressed as

where *E*_{nonbond} is the nonbond (NB) energy and *E*_{valence} is the bonded energy terms. *E*_{nonbond} and *E*_{valence} are defined by

where *E*_{elect} is the electrostatic, *E*_{vdW} is the van der Waals (vdW), and *E*_{HB} is the hydrogen bond (HB) energy. *E*_{valence} includes the covalent OH bond (*E*_{bond}) and the HOH angle (*E*_{angle}) energy terms.

### A. Electrostatic energy (*E*_{elect})

We describe the electrostatic energy of the system using our polarizable charge equilibration (PQEq) model.^{28,29} In addition to atomic charges, we use PQEq to include the polarization arising from interatomic interactions and external electric fields. Each atom is considered to have a core including all the mass of the atom and an electron shell that is massless. The core and shell are described by 1s Gaussian shaped electron densities (*ρ*) with a finite distribution of charge on each atom rather than fixed point charges used by many other FFs (see Fig. 1). The core has a variable charge (*q*_{i}) and a fixed charge (+1) part. The shell only has a fixed charge of −1.The dynamic position of the shell is found by balancing the external forces due to other atoms and applied fields with the spring force constant (*K*_{s}) that connects the shell and core of the atom. The electrostatic interaction energy between the electron densities (*ρ*) of atoms *i* and *j* is given by

where *k* and *l* denote the core (c) and shell (s), respectively. The width of the Gaussian distribution (*α*_{ik} and *α*_{ik}) is equal to 0.2341/*R*^{2}, where *R* is the core (*R*_{c}) or shell (*R*_{s}) radius. The electrostatic energy (*E*_{elect}) is defined by

where $\chi i0$ is the Mulliken electronegativity, $\chi i0$ = (IP + EA)/2, $Jii0$ is the idempotential, $Jii0$ = IP − EA, *r*_{ik,jl} is the interatomic distance, IP = atomic ionization potential, and EA = atomic electron affinity. Here *T(r)* is the 7th order taper function [see Eq. (29)]. The second sum computes the electrostatic energy between the core and shells of all atoms in the system. The charge on the core (*q*_{i}) is updated every time step via efficient schemes. The massless shell relaxes instantaneously to align itself along the external force vector with no inertial delay. The complete description of the PQEq model and each term in the energy equation is provided in previous publications.^{28,29} We validated the accuracy of PQEq by comparing with the polarization in QM.^{28} We reoptimized the PQEq parameters of water for two reasons.

First, to ensure that the charge distribution on oxygen and hydrogen atoms is in reasonable ranges when two water molecules get very close to each other. This is important for describing the equation of state (EOS) of water at high pressures. Thus, we computed the QM [electrostatic potential (ESP) and Mulliken] charge distributions on the water dimer over a range of short distances and adjusted the PQEq parameters such that they produce reasonable charges (see Fig. S1 of the supplementary material).

Second, during the optimization of HB attractive and repulsive terms (see below), we optimize the PQEq parameters such that the electrostatics energy provides the most accurate description of the water dimer energy at long distances. The PQEq parameters of H and O are provided in Table I.

Atom . | $\chi i0$ (eV) . | $Jii0$ (eV) . | R_{c} = R_{s} (Å)
. | K_{s} [(kcal/mol)/Å^{2}]
. |
---|---|---|---|---|

H | 4.5280 | 17.9841 | 0.4452 | 2037.2006 |

O | 8.7410 | 13.3640 | 0.8028 | 814.0445 |

Atom . | $\chi i0$ (eV) . | $Jii0$ (eV) . | R_{c} = R_{s} (Å)
. | K_{s} [(kcal/mol)/Å^{2}]
. |
---|---|---|---|---|

H | 4.5280 | 17.9841 | 0.4452 | 2037.2006 |

O | 8.7410 | 13.3640 | 0.8028 | 814.0445 |

### B. van der Waals energy (*E*_{vdW})

To determine the vdW nonbond interactions, we calculated the equation of states (EOS) of solid hydrogen and solid oxygen based on the density functional theory (DFT).^{30,31} For solid hydrogen, we use the Perdew-Burke-Ernzerhof (PBE)^{32} flavor of DFT, while for a solid oxygen crystal, we use Becke three-parameter Lee-Yang-Parr (B3LYP)^{33,34} including the hybrid terms important for describing the O_{2} triplet state^{35,36} (see below).

#### 1. Solid hydrogen

X-ray diffraction shows that the H_{2} crystal has a hexagonal closed packed structure with the *P6*_{3}*/mmc* space group and a *c/a* ratio close to the ideal value of 1.633.^{37} However, since hydrogen has no core electrons, X-ray experiments cannot determine the exact position of the H atom coordinates. Therefore, extensive theoretical studies have been carried out to determine the H_{2} molecule orientation. It was found that the *Pca2*_{1} space group (see Fig. S2 of the supplementary material) containing a hexagonal close packed structure with four molecules per unit cell is the most stable structure at high pressure ranges (∼110–150 GPa).^{38,39} Therefore, we selected a *Pca2*_{1} space group and used the PBE-D3 to compute energy versus volume with single point energy calculations at various volumes. This leads to the PBE-D3 energy curve shown in Fig. 2(a) for densities starting with the volume of 17.440 Å^{3} (ρ = 2.626 cm^{3}/mol) corresponding to the pressure of 120 GPa at 300 K and passing through the volume of 44.082 Å^{3} (6.627 cm^{3}/mol) corresponding to the pressure of 10 GPa at 300 K. We also scale the simulation cell to large volumes as high as 424.572 Å^{3} (63.920 cm^{3}/mol) to capture the long-range interactions. The HH bond distance was fixed at the equilibrium value of 0.74 Å for all volumes to avoid any contribution to the total PBE-D3 curve from changes in the HH bond length. All calculations in this section were performed using the Vienna *Ab initio* simulation package (VASP) (version 5.3.5).^{40} The generalized gradient approximation (GGA) was used for the exchange–correlation energy^{32} with a plane wave energy cutoff of 600 eV. We used the convergence criteria of 10^{−6} eV for energy and 10^{−4} eV Å^{−1} for force. The k-space sampling was the gamma point centered with full space group symmetry.

#### 2. Solid oxygen

For solid oxygen, we selected the stable *α*-phase crystal structure, which according to neutron-diffraction and X-ray measurements has a monoclinic base-center structure with a *C2/m* space group^{35,36} (see Fig. S2 of the supplementary material). For DFT, we used the B3LYP hybrid method. This is because O_{2} has a ^{3}Σ_{g}^{−} triplet ground state that is poorly described by PBE. The ground state of O_{2} crystal has the two O_{2} molecules coupled antiferromagnetically to obtain a singlet spin state. Since the DFT formalism uses a single Slater determinant, it cannot properly describe the single spin state, often leading to severe convergence problems. Instead, we describe the O_{2} crystal assuming a high spin state quintet state (S = 2, with 4 extra up spin orbitals per cell, M_{S} = 2) which is correctly described with a single Slater determinant using a spin unrestricted DFT formalism. Our computed heat of vaporization of solid oxygen using B3LYP-D3 and high spin O_{2} is −2.3 kcal/mol after correcting for zero-point energy, which is very close to the experimental value of −2.1 kcal/mol.^{36} The EOS of solid oxygen was then obtained by performing single point energy calculation at different volumes. This leads to the B3LYP-D3 energy curve shown in Fig. 2(b). The smallest volume of 17.369 Å^{3} (ρ = 5.230 cm^{3}/mol) corresponds to a pressure of 300 GPa at 300 K,^{41,42} the volume of 41.376 Å^{3} (12.459 cm^{3}/mol) corresponds to a pressure of 10 GPa at 10 K,^{43} and the volume 69.439 Å^{3} (20.909 cm^{3}/mol) corresponds to a pressure of 0 GPa at 10 K.^{43} The OO bond distance was fixed at the equilibrium value of 1.207 Å. All calculations in this section were performed using the CRYSTAL package (version 14)^{44} using the modified Gaussian basis set, m-6-311G(2d) for oxygen.^{45} We used the convergence criteria of 10^{−7} a.u. for energy. The k-space sampling was the gamma point centered with full space group symmetry.

We used the CRYSTAL package for solid oxygen because the VASP package is over 1000 times slower than CRYSTAL for B3LYP calculations. However, we used VASP for PBE-D3 calculations of solid hydrogen because it is far easier to use than CRYSTAL and is only a factor of 3 slower.

The hydrogen and oxygen DFT-D3 energy curves were then fitted to obtain the first principles based vdW potential energy curves for H and O. This vdW energy is the sum of a monotonic attractive nonbond term (NBatt) and a monotonic repulsive non-bond term (NBrep). NBatt has the form of −*C*_{6}*/r*^{6} which means that at sufficiently long *r* the wavefunction on separate atoms does not overlap. The NBatt must remain finite as the distances become sufficiently small. Therefore, we define NBatt energy (*E*_{NBatt}) as

which is similar to low gradient formulation^{46,47} that was also used in ReaxFF-*lg*.^{48} Here, *r* is the interatomic distance, *C*_{6} is the dispersion energy correction parameter, and *R*_{vdW} is the equilibrium vdW distance between atoms *i* and *j*. The NBrep energy (*E*_{NBrep}) is purely repulsive (positive) and monotonic which we expand using a group of exponential functions as

where *A*, *α*, *β*, *γ*, *n*, *η*, and *δ* are the parameters.

In developing RexPoN, we want to use generic approaches likely to have the correct systematics as a function of the rows and columns of the periodic table. Thus we used as initial values for *R*_{vdW} and *C*_{6} the values from the universal force field,^{49} which has parameters up to Lw (Z = 103). We expect that the *R*_{vdW} should be reasonably accurate; however, the *C*_{6} was based on calculated atomic polarizabilities and needs to be adjusted. Next, we define the initial parameters of *E*_{NBrep} knowing that *E*_{NBrep}*(r) = E*_{vdW}*(r)-E*_{NBatt}*(r)*. Then, the parameters of *E*_{NBatt} and *E*_{NBrep} are optimized to provide th6e best fit to the *E*_{vdw} curve.

To describe HO vdW interaction, we used the geometric mean average of the two-body vdW potential energy curves of H and O as in the following equations:

Atom i
. | Atom j
. | R_{e} (Å)
. | C_{6} [(kcal Å^{6})/mol]
. |
---|---|---|---|

H | H | 3.2692 | 239.4803 |

O | O | 2.9938 | 634.7066 |

O | H | 3.1285 | 389.8714 |

Atom i
. | Atom j
. | R_{e} (Å)
. | C_{6} [(kcal Å^{6})/mol]
. |
---|---|---|---|

H | H | 3.2692 | 239.4803 |

O | O | 2.9938 | 634.7066 |

O | H | 3.1285 | 389.8714 |

Atom i
. | Atom j
. | A (kcal/mol)
. | α
. | β
. | γ
. | n
. | η
. | δ
. |
---|---|---|---|---|---|---|---|---|

H | H | 0.0212 | −2.4002 | 10.5460 | −0.8113 | 0.0014 | −0.2607 | −0.5640 |

O | O | 0.0272 | −3.6468 | 14.5278 | −0.3235 | 0.0305 | 0.0331 | −0.5688 |

O | H | 0.0906 | −3.1455 | 10.0823 | −0.0007 | −1.0000 | 0.0000 | 0.0000 |

Atom i
. | Atom j
. | A (kcal/mol)
. | α
. | β
. | γ
. | n
. | η
. | δ
. |
---|---|---|---|---|---|---|---|---|

H | H | 0.0212 | −2.4002 | 10.5460 | −0.8113 | 0.0014 | −0.2607 | −0.5640 |

O | O | 0.0272 | −3.6468 | 14.5278 | −0.3235 | 0.0305 | 0.0331 | −0.5688 |

O | H | 0.0906 | −3.1455 | 10.0823 | −0.0007 | −1.0000 | 0.0000 | 0.0000 |

#### 3. Validation of the method

We tested the accuracy of DFT-D3 by comparing to the experimental pressure-volume (*P-V*) EOS of solid hydrogen. We used experimental measurements on solid hydrogen isotopes (H_{2} and D_{2}) within the pressure range of 10–120 GPa at 300 K for these comparisons^{37} (Fig. 3). We first minimized the atomic positions related to this pressure range (i.e., from 17.440 to 44.082 Å^{3}) using PBE-D3 to obtain the total energy at 0 K. Then, we corrected this energy for the zero point energy (ZPE) contributions and also free energy (thermal effects) at 300 K (*G = H-TS*). The total PBE-D3 energy before and after correction at 300 K is shown in Fig. S4 of the supplementary material. Then, the total energy curve at 300 K was fitted to third-order Birch-Murnaghan isothermal EOS,^{50} *E*_{BM} *(V)*. The *P-V* EOS was then obtained by

As shown in Fig. 3, our PBE-D3 predicted EOS is in good agreement with experiment, indicating that DFT-D3 is sufficiently accurate to describe the vdW interactions of dense solid systems.

### C. Covalent bond energy (E_{bond})

#### 1. The bond order (BO) expression

To describe the covalent O–H bond, we used the bond order (BO) concept to capture the dependence of the bonding on the overlap (actually exchange kinetic energy) between orbitals on the bonding atoms. Here we use a sigmoid function as in ReaxFF^{51} to define the bond distance to the bond order relation

where *pbo*_{1} and *pbo*_{2} are the parameters, *R*_{e} is the equilibrium bond distance, and *r* is the bond distance between atom *i* and atom *j*. We choose the BO = 1.0 at *R*_{e} which results in *p*_{bo1} = 2.3026. This allows the formal bond order to approach 1.1 for very small values of *r*. Therefore, the only free parameter that needs to be determined is *p*_{bo2}. To make an initial good guess for *p*_{bo2}, we choose the BO = 0.5 at the inflection point distance (R_{H/2}) of the bond energy curve. The QM bond energy curves (see below) are used to obtain *p*_{bo2} parameters for H_{2}, O_{2}, and OH (in H_{2}O molecule) cases. The final bond order curves are shown in Fig. S5 of the supplementary material, and the BO parameters are summarized in Table IV.

Bond type . | R_{e} (Å)
. | p_{bo1}
. | p_{bo2}
. |
---|---|---|---|

H–H | 0.7450 | 2.3026 | 1.3532 |

O–O | 1.2070 | 2.3026 | 1.6150 |

O–H | 0.9587 | 2.3026 | 1.4428 |

Bond type . | R_{e} (Å)
. | p_{bo1}
. | p_{bo2}
. |
---|---|---|---|

H–H | 0.7450 | 2.3026 | 1.3532 |

O–O | 1.2070 | 2.3026 | 1.6150 |

O–H | 0.9587 | 2.3026 | 1.4428 |

We define the bond energy (*E*_{bond}) to match the bond dissociation curves in H_{2}, O_{2}, and OH (in H_{2}O) molecules for the highest quality QM calculations. Thus, given *E*_{vdw} and *E*_{elect}, we define *E*_{Q2} and *E*_{bond} as

where *E*_{QM} is the total QM energy due to the stretching bond between atom *i* and *j*. *k* and *l* are the atom indices (*k*, *l* ≠ *i*, *j*). *E*_{Q2} is the net two-body QM energy between atom *i* and *j*, which is obtained by subtracting from *E*_{QM} the *E*_{vdW} between all atom pairs except for *i* and *j* and also subtracting the total *E*_{elect} of the system [Eq. (5)]. Therefore, based on *E*_{vdW} and *E*_{electro} which are already established, the *E*_{bond} is defined automatically to match the QM bonding energy curves.

We used the exact *ab initio* singlet energy data for the HH bond scan in a H_{2} molecule^{52} and high quality *ab initio* and spectroscopic energy data for the OO bond scan in an O_{2} molecule.^{53,54} For diatomic H_{2} and O_{2} molecules, the *E*_{vdw}*(r*_{kl}*)* and *E*_{electo}*(r*_{kl}*)* sums in Eq. (13) are zero and therefore *E*_{Q2}*(r*_{ij}*) = E*_{QM}*(r*_{ij}*)*. For the OH bond in the H_{2}O molecule, we used the accurate water monomer potential energy surfaces (Bowman PES) obtained from CCSD(T) calculations.^{25} Here, we started from the equilibrium water structure (OH = 0.9572 Å and HÔH = 104.52°) and stretched one of the OH bonds to infinity while the angle and other bond are fixed. Then, the *E*_{QM} was computed at each distance using Bowman PES. Here, the HH *E*_{vdw} and total *E*_{electro} [Eq. (13)] should be subtracted from *E*_{QM} to get the net OH two-body curve (*E*_{Q2}). The *E*_{Q2} curve is fitted to an extended-Rydberg type function of the form

where *D*_{0}, *a*_{0}, *b*_{0}, and *c*_{0} are the parameters to be determined. Figure 4 shows the energy components and the fit to the *E*_{Q2} curve for H_{2}, O_{2}, and OH bond dissociation. The final parameters of Eq. (15) are given in Table V.

Bond type . | D_{0} (kcal/mol)
. | R_{e} (Å)
. | a_{0}
. | b_{0}
. | c_{0}
. |
---|---|---|---|---|---|

H–H | 109.4853 | 0.7450 | 1.0000 | 4.0000 | 1.0000 |

O–O | 120.4881 | 1.2070 | 1.0000 | 5.2000 | 1.0000 |

O–H | 113.2500 | 0.9587 | 0.5577 | 7.1812 | 1.0000 |

Bond type . | D_{0} (kcal/mol)
. | R_{e} (Å)
. | a_{0}
. | b_{0}
. | c_{0}
. |
---|---|---|---|---|---|

H–H | 109.4853 | 0.7450 | 1.0000 | 4.0000 | 1.0000 |

O–O | 120.4881 | 1.2070 | 1.0000 | 5.2000 | 1.0000 |

O–H | 113.2500 | 0.9587 | 0.5577 | 7.1812 | 1.0000 |

### D. Angle energy (E_{angle})

To define the angle term, we fit a simple cosine angle function to the Bowman PES for the water monomer,^{25}

where *C*, *θ*_{0}, *a*, and *b* are the adjustable parameters. *θ*_{ijk} is the angle between atoms *i*, *j*, and *k*. *F*_{1} and *F*_{2} are the scaling functions. The scaling functions go to zero as OH gets very large and become equal to one at the equilibrium bond length (*r*_{ij}*= R*_{e} or *r*_{jk}*= R*_{e}). The QM energy as a function of *θ*_{HOH} with both bonds fixed at *R*_{e} = 0.9572 Å (i.e., *F*_{1}*= F*_{2} = 1) is shown in Fig. 5. The angle term together with *E*_{vdw} and *E*_{electro} is fitted to the QM curve. The parameters of angle energy term are given in Table VI.

### E. Hydrogen bond energy (*E*_{HB})

The water dimer presents a deep minimum (∼5.0 kcal/mol) at its global minimum which leads to ice structures having tetrahedral arrangements leading to four strong HBs to each water. The HB involves attraction between a partially positive hydrogen of the donor water molecule to the partially negative lone pair of the acceptor water molecule. Similarly liquid water forms an extensive HB network that is responsible for many of its unique properties. This HB depends strongly on the alignment of the donor hydrogen with respect to the lone pair of the acceptor molecule. In addition, the lone pairs on each water molecule repel the lone-pair electrons of nearby water molecules. To describe their roles, we include the role of lone pairs on each water molecule explicitly in our FF.

#### 1. Explicit lone pairs

Although, the lone pair electrons do not present distinct directed electron densities, there are minima in the electrostatic potential surfaces around the isolated water molecule that suggest a tetrahedral structure for the water molecule.^{55} The accurate Bowman PES for a water dimer also confirms two local minima (∼109.5° apart) during the rotation of donor water around the acceptor water molecule. A tetrahedral bonding network is also found in condensed phases of water such as hexagonal ice. Therefore, we define tetrahedrally arranged lone pairs (sp^{3} hybridized with an angle of 109.5° between them) defined with respect to the two OH bonds.

The explicit lone pairs of the acceptor molecule in a water dimer are shown in Fig. 6. The lone-pair unit vectors ($u\u2192\alpha $ and $u\u2192\beta $) are based on the OH bond unit vectors ($u\u21921$ and $u\u21922$) and the *θ*_{L} = 109.5° angle between them. First, the $u\u21921$ and $u\u21922$ vectors are used to define the perpendicular ($u\u2192p$) and bisector ($u\u2192b$) unit vectors. The $u\u2192p$ vector is perpendicular to the plane of H_{2}O, $u\u2192p=(u\u21921\xd7u\u21922)/u\u21921\xd7u\u21922$, and $u\u2192b$ is in the plane of H_{2}O and bisects the HOH angle, $u\u2192b=u\u21921+u\u21922/2$.Then, the vector that connects acceptor oxygen to the hydrogen of the donor is normalized to give the $u\u2192OH$ unit vector. The *α* and *β* angles (shown in Fig. 6) are defined as the angles that $u\u2192OH$ constructs with $u\u2192\alpha $ and $u\u2192\beta $ unit vectors, respectively. Finally, we use the following three conditions to completely define lone-pair vectors ($u\u2192\alpha $ and $u\u2192\beta $):

the angle between lone-pair vectors and $u\u2192b$ is

*θ*_{L}/2 = 54.75° because $u\u2192b$ bisects the angle between $u\u2192\alpha $ and $u\u2192\beta $,the angle between lone-pair vectors and $u\u2192p$ is π/2−

*θ*_{L}/2 = 35.25°, andlone-pair, $u\u2192b$, and $u\u2192p$ vectors are all in the same plane.

Equations (19)–(21) are written based on these conditions. They are solved simultaneously to obtain $u\u2192\alpha $ and $u\u2192\beta $,

The *E*_{HB} is defined as the sum of attractive (*E*_{HBatt}) and repulsive (*E*_{HBrep}) terms

The *E*_{HBatt} term is defined based on lone-pair angles (*α* and *β*) and the distance (*r*_{kj}) between the acceptor oxygen (*k*) and the hydrogen of the donor molecule (*j*). The *E*_{HBatt} is expressed as

where *h*_{b1} to *h*_{b8} are the adjustable parameters. The *E*_{HBrep} term is defined by

Here, *l*_{p1} to *l*_{p6} are the adjustable parameters and *r*_{ik} is the oxygen-oxygen distance. We define angle *θ*_{ijk} to be between atoms *i*, *j*, and *k* atoms (see Fig. 6) as the average position of the lone pairs is positioned on the center of the oxygen atoms.^{55} Since the lone pairs are only defined by vectors, we transfer their force components to the hydrogens of the same molecule via chain derivatives.

For all nonbond energy terms (i.e., *E*_{electro}, *E*_{vdW}, and *E*_{HB}), we use a 7th order taper function to cutoff the forces and energies smoothly at the cutoff distance (matching energy and the first three derivatives at *r* = 0 and *r* = cutoff). It is given by

where *r*_{cut} is a cutoff distance and *Tap*_{7} = 20, *Tap*_{6} = −70, *Tap*_{5} = 84, *Tap*_{4} = −35, *Tap*_{3} = 0, *Tap*_{2} = 0, *Tap*_{1} = 0, and *Tap*_{0} = 1. We used a cutoff distance of 12.0 Å for *E*_{vdW} and *E*_{electro}. W used a cutoff of 4.6 Å for *E*_{HB} to include only the first nearest neighbors in hydrogen bonding.

#### 2. HB training set

To obtain *E*_{HB} parameters, we started with a minimal training set scanning only four dimer cases. These four cases were selected to establish the range of the parameters and to avoid overfitting. The QM energy for all these cases was obtained using the Bowman PES. The dimer scans are shown in Fig. 7.

Scan 1: one of the water molecules (

*D*_{2}*h*symmetry) is scanned toward the other molecule in the*xy*plane. This scan is used to optimize the distance dependent parameters of the*E*_{HBrep}*(r*_{ik}*)*term [Eq. (27)]. Here, it is assumed that there is no angle contribution from*E*_{HBrep}[Eq. (28)] and no contribution from*E*_{HBatt}.Scan 2: the donor water molecule in the most stable water dimer structure is rotated around the

*z*axis with the oxygen of the donor (O_{i}) as the center of the rotation.Scan 3: the donor molecule in the most stable water dimer structure is scanned toward the acceptor with the O

_{i}H_{j}O_{k}angle equal to 180.0° during the scan. Scan 2 and Scan 3 are used to obtain the*E*_{HBrep}*(θ*_{ijk}*)*and*E*_{HBatt}*(r*_{kj}*)*parameters.Scan 4: the donor water molecule in the most stable water dimer is rotating in the

*xy*plane around the oxygen of the acceptor molecule (O_{k}) as the center of rotation. We used this to optimize the*E*_{HBatt}*(α,β)*parameters.

The optimization of the HB parameters for the above four scan cases used conjugate gradients to minimize the root-mean square error (RMSE) of the energy values for each geometry. The energy comparison for the obtained parameter set (FF1) and QM is shown in Fig. 8, which are in very good agreement.

To improve the quality of the force field for different configurations of water dimer interactions, we also included the 10 low-lying stationary point structures (see Fig. S6 of the supplementary material) of the water dimer in our training set. The reference energies and minimized structures are from the Bowman PES. In addition, to account for possible non-additive HB effects that exist in a bulk system, we expanded our training set by including the global minima of water clusters, (H_{2}O)_{n}, containing up to *n* = 19 waters. The reference energies and global minimized structures were taken from X3LYP^{56} hybrid DFT calculations which leads to excellent structures for large water clusters. These calculations were performed using the Jaguar^{57} software. The cluster structures are provided in Fig. S7 of the supplementary material. Our optimization of the HB term results in a RMSE difference of 0.65 kcal/mol/water for the above 10 water dimers and 19 water clusters. The comparison between RexPoN and QM for the 4 scan cases after including all dimers and water clusters in the optimization (FF2) is shown in Fig. 8. The energy components of RexPoN for FF2 are shown in Fig. S8 of the supplementary material. The systematic lower energy of the FF2 with respect to FF1 is due the non-additive HB effects that are not captured completely by considering just water dimers. Because of that at the OO distance of 2.76 Å, the water dimer energy computed by FF2 is equal to 5.7 kcal/mol which is close to the experimental ice HB energy^{58} of 5.5 kcal/mol (OO = 2.76 Å). The comparison between the computed energies of RexPoN and QM for water dimers and clusters is provided in Table S1 of the supplementary material. The *E*_{HBatt} and *E*_{HBrep} optimized parameters [Eqs. (23)–(28)] for FF2 are provided in Tables VII and VIII, respectively.

Parameters . | h_{b1}
. | h_{b2}
. | h_{b3}
. | h_{b4}
. | h_{b5}
. | h_{b6}
. | h_{b7}
. | h_{b8}
. |
---|---|---|---|---|---|---|---|---|

Value | −22.5000 | 12.7434 | 1.1345 | 0.7861 | 3.5071 | 1.1693 | 3.6922 | −0.8921 |

Parameters . | h_{b1}
. | h_{b2}
. | h_{b3}
. | h_{b4}
. | h_{b5}
. | h_{b6}
. | h_{b7}
. | h_{b8}
. |
---|---|---|---|---|---|---|---|---|

Value | −22.5000 | 12.7434 | 1.1345 | 0.7861 | 3.5071 | 1.1693 | 3.6922 | −0.8921 |

## III. RESULTS

To validate the RexPoN FF, we computed several properties of water and ice including the radial distribution function (RDF), heat of vaporization (*ΔH*_{vap}), dielectric constant (ε), and diffusivity of water at 298 K, and melting point of ice Ih. A quick comparison for several properties of liquid water and also the melting point of ice I_{h} between experiment, some of the widely used water models, and RexPoN FF is provided in Table IX.

. | Expt. . | RexPoN . | TIP3P . | TIP4P-2005 . | TIP5P . | SPC/E . | MB-pol . |
---|---|---|---|---|---|---|---|

T_{m} | 273.15 | 273.3 | 146 | 252 | 274 | 215 | 263.5 |

S^{0} | 69.9 | 68.43 | 72.51 | 57.47 | … | 60.30 | … |

ρ | 0.9965 | 0.9965 | 0.98 | 0.993 | 0.979 | 0.994 | 1.007 |

ε | 78.4 | 76.1 | 94 | 58 | 91 | 68 | 68.4 |

∆H_{v} | 10.52 | 10.36 | 10.05 | 11.99 | 10.46 | 11.79 | 10.93 |

ln D_{s} | −11.24 | −10.08 | −10.2 | −11.27 | −11.41 | −11.08 | −10.46 |

. | Expt. . | RexPoN . | TIP3P . | TIP4P-2005 . | TIP5P . | SPC/E . | MB-pol . |
---|---|---|---|---|---|---|---|

T_{m} | 273.15 | 273.3 | 146 | 252 | 274 | 215 | 263.5 |

S^{0} | 69.9 | 68.43 | 72.51 | 57.47 | … | 60.30 | … |

ρ | 0.9965 | 0.9965 | 0.98 | 0.993 | 0.979 | 0.994 | 1.007 |

ε | 78.4 | 76.1 | 94 | 58 | 91 | 68 | 68.4 |

∆H_{v} | 10.52 | 10.36 | 10.05 | 11.99 | 10.46 | 11.79 | 10.93 |

ln D_{s} | −11.24 | −10.08 | −10.2 | −11.27 | −11.41 | −11.08 | −10.46 |

We integrated the RexPoN model in the LAMMPS^{59} MD simulator package and will submit our modifications for inclusion in the official LAMMPS releases. For these water simulations at 298 K, we used a system with 216 molecules per cell, and for ice melting simulations, we used a system with 96 molecules per cell. To control the pressure, we used a barostat with a relaxation time of 1 ps, and to control temperature, we used the Nosé-Hoover thermostat with a damping time of 100 femtosecond (fs).^{60,61} We used a rigid model for the water molecules since most practical simulations use rigid models. The OH bond distance and HOH angle are kept fixed at their equilibrium values (OH = 0.9572 Å and HÔH = 104.52°) using SHAKE methodology.^{62} We used a time step of 1.0 fs here, but time steps of 2.0 fs could have been used. We used a time step of 1.0 fs to ensure the accuracy for all properties. We expect that using time steps of 2.0 fs would lead to similar accuracy, but we have not yet tested the precision of computed properties using 2.0 fs time steps.

### A. Melting point calculation

To determine the melting point of the ice Ih structure, we started with ice Ih at 0 K and increased temperature gradually to the range of 200–320 K during 10 ps followed by 150 ps constant temperature-volume (MD-*NVT*) calculations. Then, the cell was relaxed for 120 ps during constant temperature-pressure (MD-*NPT*) simulations. Then at each T, the system was further relaxed for another 100 ps (5 cycles each 20 ps) using MD-*NVT* for performing the two-phase thermodynamics (2PT) analysis.^{66,67} The 2PT analysis is used to calculate the thermodynamic properties, including entropy and free energy. 2PT determines the vibrational density of states from the Fourier transform of the velocity autocorrelation function after which corrections are made for the diffusional contributions that lead to a finite density of states at υ = 0. 2PT allows the free energy, entropy, and other thermodynamic properties to be determined from short 20 ps MD trajectories. The accuracy of the 2PT method has been validated against expensive *ab initio* methods for the calculation of absolute entropy of liquids.^{64} Figure 9 shows the change of the standard molar entropy (*S*^{0}) and free energy (*A*) of the ice versus temperature. Melting is a first-order transition that results in a discontinuity and sudden change in the slope of the *S*^{0} and *A* curves. RexPoN finds that this sharp discontinuity occurs between 273.0 and 273.5 K, indicating a melting point of ∼273.3 K for ice Ih. This predicted melting point is in excellent agreement with the experimental melting point of 273.15 K at ambient pressure. Also, the free energy curve shows a sharp change at the same temperature.

### B. Liquid density at ambient pressure

We used two methods to evaluate the liquid density of water at 298 K and 1 atm. First, we performed MD simulations in the *NPT* ensemble for 1 nanosecond (ns). Since large fluctuations of the pressure during MD-*NPT* simulations can sometimes skew the *NPT* ensemble averages, we also computed the EOS of the liquid water at 298 K for densities in the range of 0.98 to 1.02 gr/cm^{3}. For each density, we utilized MD-*NVT* simulations at 298 K for 1 ns and averaged the pressure values over the last 0.5 ns. The pressure versus density plot is shown in Fig. 10, leading to a density of 0.9965 gr/cm^{3} at 1 atm in perfect agreement with the experimental density of 0.9965 gr/cm^{3}. Our MD-*NPT* also resulted in a density of 0.9960 gr/cm^{3} also in excellent agreement with experimental density (see Fig. S8 of the supplementary material).

### C. Enthalpy of vaporization

We computed the enthalpy of vaporization (*ΔH*_{vap}*)* as the difference between the gas (*g*) and liquid (*l*) enthalpies,

which can be written in terms of gas and liquid total energies and volumes (*V*_{g} and *V*_{l}),

where *U*_{g}(*T*) and *U*_{l}(*T*) are the total energies of the gas and liquid phase, respectively. Equation (31) is further simplified with the assumptions that (i) $Vl\u2009\u226a\u2009Vg$, (ii) gas phase is ideal (i.e., $pVg=RT$), and (iii) the kinetic energy of water molecules in the liquid and gas phases is equal,

where *E*_{g}(*T*) and *E*_{l}(*T*) are the potential energies of the gas and liquid phase, respectively. It should be noted that Eq. (32) requires no correction for polarization effects since the PQEq model explicitly includes the polarization energy. Using Eq. (32), we computed ΔH_{vap} = 10.36 kcal/mol at 298 K compared to be in excellent agreement with experimental ΔH_{vap} = 10.52 kcal/mol.

### D. Absolute entropy

We computed the *S*^{0} of liquid at 298 K using 2PT methodology. For this calculation, we extracted the last 100 ps trajectory of the 1 ns MD-*NVT* simulation described above. The 2PT analysis was performed for five different 20 ps intervals and averaged over the 5 cycles. This leads to *S*^{0} = 68.43 (J/mol)/K, just 2% below that the experimental value of *S*^{0} = 69.9 (J/mol)/K.^{68} This accurate value from RexPoN for such absolute thermodynamics properties as entropy suggests that this model will reproduce a wide variety of properties of water. The computed *S*^{0} is usually underestimated by other water models. Thus AIMD-PBE (−27%), ReaxFF (−13%), TIP4P-2005 (−18%), TIP4P-ice (−25%), SPC/Fw (−13%), F3C (−12%), effective FF from AIMD (−6%), and TIP4/F (−16%) are compared to experiment.^{64}

### E. Static dielectric constant

We computed the static dielectric constant (*ε*) from dipole moment fluctuations

where *k*_{B} is the Boltzmann constant, *T* is the temperature, *V* is the volume, ** M** is the total dipole moment of the simulation cell, and

*ε*

_{∞}is the high frequency or optical dielectric constant. The PQEq formalism allows the direct calculation of

*ε*

_{∞}from the classical fluctuation of the electrons (shells) as a measure of the total polarizability of the simulation cell.

^{17}It can be calculated from the dipole moment fluctuations due to the shell movement around the cores using

where *M*_{s} indicates the dipole moment resulted from the shell displacement from its core ($Ms=\u2212\u2211iN\Delta ri$). To compute the dielectric constant, we performed = 1 ns MD-*NVT* simulations at 298 K, leading to ε_{0} = 76.1 and $\epsilon \u221e$ = 1.31, in good agreement with the experimental dielectric constant^{69} value of 78.4 and the optical dielectric constant^{70} of 1.79. The change with time of the *ε*, $\u27e8M2\u27e9$, and $\u27e8M\u27e92$ are shown in Fig. 11. As shown in this figure, after 300 ps, the values of $\epsilon $ and $\u27e8M2\u27e9$ reach to a plateau and $\u27e8M\u27e92$ becomes negligible.

### F. Radial distribution function

The molecular distances and orientations in water are mainly determined by the HB network within the first shell of each molecule, making the structural analysis of the first nearest neighbors crucial. We computed the oxygen-oxygen radial distribution function (*g*_{OO}) and coordination number (*N*_{OO}) of water at 298 K and ambient pressure over 1 ns MD-*NVT* simulations. The results together with the experimental data are shown in Fig. 12. The experimental data are based on neutron scattering (Exp1) and combination of X-ray and Neutron experiments (Exp2).^{71,72} The first peak of RexPoN (2.84 Å) is in excellent agreement with Exp1 (2.85 Å) and Exp2 (2.76 Å). Most empirical water models were adjusted to have a first peak at ∼2.76 Å, but they lead to a sharp first peak.^{72–74} Figure 12 includes the RDF for the TIP4P/2005 FF^{74} (a typical empirical FF) and several *ab initio* based models including SCAN,^{75} CC-pol, and MB-pol potentials. All these models lead to a poor fit to the experimental *g*_{OO}. Although the *ab initio* models lead to a much better *g*_{OO} than the empirical FF. By contrast, RexPoN gives a slope and height of the first peak in excellent agreement with experimental data. Our computed second peak (4.2 Å) is also within the experimental range of 4.2 to 4.7 Å. As shown in Fig. 12, RexPoN leads to a first minimum of the g_{oo} at 3.43 Å, leading to *N*_{oo} = 4.7, in excellent agreement with experiment. This value shows more than four water molecules in the first shell of water, suggesting a broken tetrahedral structure for water.

### G. Self-diffusion coefficient

The self-diffusion coefficient (*D*_{s}) was derived from the ionic mean-squared displacement (MSD) curve using the 3D diffusion equation of Einstein

where *r*_{i}*(t*) is the position of particle *i* at time *t* and *t*_{0} is the time origin and could be any time during the simulation. To provide better sampling, we computed an averaged MSD at time *t* with respect to all of the frames smaller than *t*. The values of MSD were computed every 1 ps during 1 ns NVT simulations at 298 K and ambient pressure. The change of MSD with time along with its log-log curve is shown in Fig. 13. The log-log curve shows that by 300 ps, the system has reached the Fickian regime with a slope of 1.0, satisfying $logMSD=logt+log6Ds$. Our computed *D*_{s} = 4.21 × 10^{−5} cm^{2} s^{−1} is higher than the experimental value^{76} of 2.27 × 10^{−5} cm^{2} s^{−1}. From 2PT analysis, we also computed the diffusivity using the density of states at zero frequency, *DoS(0) =* 12 mN *D*_{S}*/(kT),* which leads to *D*_{s} = 4.23 × 10^{−5} cm^{2} s^{−1}. This differs the MSD value by only 0.5%, suggesting that the 2PT analysis can be used to compute the diffusivity of large systems from short simulation times. The larger computed value of diffusivity by RexPoN is puzzling to us. We have done this both for long time scales and from 2PT and obtained the same results. This discrepancy is puzzling because RexPoN leads to good accuracy for the entropy of water [*S*^{0} = 68.43 (J/mol)/K compared to the experimental value of 69.90 (J/mol)/K]. It is known that diffusivity and entropy of a system are directly related to each other. Therefore, one would expect RexPoN to give an accurate value for the diffusivity as it describes the entropy of the system very well.

## IV. CONCLUSIONS

The RexPoN force field based entirely on high quality QM calculations leads to excellent agreement with the experimental properties of the solid and liquid phases of water. This attests to the quality of the QM calculations and to the validity of the RexPoN formalism. We are hopeful that the extension of the RexPoN methodology to other systems, fitting the parameters from QM, may lead to a new generation of FF for very accurate predictions on electrolytes and other materials. We believe that the RexPoN FF for water could provide a significant reduction in the costs for full solvent electrochemical QM calculations by allowing the QM water to be restricted to just a couple of layers.

## SUPPLEMENTARY MATERIAL

The supplementary material contains the comparison between PQEq and QM charge distributions in the water dimer, the crystal structures of solid hydrogen and solid oxygen, two-body vdW energy curves, energy changes with the volume of the hydrogen crystal, bond energy to the bond order relationship, structures of 10 low-lying stationary points of the water dimer, global minimized structures of 19 water clusters, comparison between RexPoN and QM for the 4 scan cases, comparison between RexPoN and QM energies for water dimers and clusters, the liquid density evaluation using MD-*NPT* simulations, and the computational cost of RexPoN.

## ACKNOWLEDGMENTS

This work was supported by the Computational Materials Sciences Program funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, under Award No. DE-SC00014607. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the National Science Foundation Grant No. ACI-1548562. W.A.G. was supported by NSF (CBET 1512759).