A huge amount of water at supercritical conditions exists in Earth’s interior, where its dielectric properties play a critical role in determining how it stores and transports materials. However, it is very challenging to obtain the static dielectric constant of water, *ϵ*_{0}, in a wide pressure–temperature (P–T) range as found in deep Earth either experimentally or by first-principles simulations. Here, we introduce a neural network dipole model, which, combined with molecular dynamics, can be used to compute P–T dependent dielectric properties of water as accurately as first-principles methods but much more efficiently. We found that *ϵ*_{0} may vary by one order of magnitude in Earth’s upper mantle, suggesting that the solvation properties of water change dramatically at different depths. Although *ϵ*_{0} and the molecular dipole moment increase with an increase in pressure along an isotherm, the dipolar angular correlation has its maximum at 5 GPa–7 GPa, which may indicate that hydrogen bonds become weaker at high pressure. We also calculated the frequency-dependent dielectric constant of water in the microwave range, which, to the best of our knowledge, has not been calculated from first principles, and found that temperature affects the dielectric absorption more than pressure. Our results are of great use in many areas, e.g., modeling water–rock interactions in geochemistry. The computational approach introduced here can be readily applied to other molecular fluids.

## INTRODUCTION

Water is arguably the most important solvent on Earth and also largely exists inside Earth’s crust, mantle,^{1} and even toward the core.^{2} The supercritical water at pressure (P) above 22 MPa and temperature (T) higher than 647 K is a major component of Earth’s deep fluids,^{3} storing and transporting many materials, and is also used as working fluids in various industrial areas.^{4} The dielectric constant of supercritical water, *ϵ*_{0}, substantially affects its solvation properties and interactions with minerals and is a key quantity in many science and industry applications.^{4,5}

For more than two decades, experimental data of the static dielectric constant of water have been limited to pressure lower than 0.5 GPa and temperature below 900 K,^{6} which can be extrapolated to ∼1 GPa and ∼1300 K using various models (e.g., Ref. 5). However, these P–T conditions can only be found in the very shallow mantle. Experimentally, we do not know *ϵ*_{0} in most part of Earth’s interior, where many important aqueous reactions are happening.^{7} First-principles molecular dynamics (FPMD) simulations have shown reliable predictions beyond the reach of current experiments, but the computational expense is so high that the previous study reported only five data points.^{8} In molecular dynamics (MD) simulations, the calculation of *ϵ*_{0} requires the variance of the total dipole moment of the simulation box, $M\u2192$, so a large number of uncorrelated configurations are needed.^{9} For example, at ambient conditions, several nanosecond MD simulations are required to get a converged value of *ϵ*_{0}.^{10} In electronic structure calculations with periodic boundary conditions, $M\u2192$ is often calculated by the sum of molecular dipole moments using maximally localized Wannier functions (MLWFs), obtained by minimizing the spread of molecular orbitals from density functional theory (DFT) calculations.^{11}

In recent years, machine learning techniques emerge as an appealing tool to combine the accuracy of first-principles simulations and the efficiency of empirical force fields in atomistic simulations. In many studies, machine learning models were trained using data from first-principles calculations to learn potential energy surfaces, which can be further used in MD or Monte Carlo simulations (e.g., Refs. 12–16). Some electronic properties may also be obtained using machine learning models, e.g., the dielectric constant of a variety of crystals,^{17} but how the dielectric constant varies with environmental factors such as P or T is not known.

Here, we constructed a neural network dipole (NND) model using data from FPMD simulations to calculate the molecular dipole moment of supercritical water in a large P–T range. In combination with MD trajectories obtained by the machine learning force field, we calculated the dielectric constant of water from 1 GPa to 15 GPa and from 800 K to 1400 K, where *ϵ*_{0} changes by one order of magnitude. We also calculated the frequency-dependent dielectric constant of water and found that temperature affects the dielectric absorption peak more than pressure. The accurate and efficient method of calculating the dielectric constant of water in a large P–T range makes it possible to model aqueous solutions and water–rock interactions in a large part of Earth’s interior. Our results have great implications on the solvation properties of aqueous geofluids in deep Earth.

## IMPLEMENTATION DETAILS

In MD simulations with periodic boundary conditions, the dielectric constant of an isotropic and homogeneous fluid can be calculated by^{9,18}

where *ϵ*_{∞} is the electronic dielectric constant, *k*_{B} is the Boltzmann constant, *T* is the temperature, and *V* is the volume of the simulation box. *ϵ*_{∞} can be calculated by density functional perturbation theory,^{19} whose fluctuation is much smaller than that of *ϵ*_{0}, so tens of MD configurations are enough to get converged results.^{20} The calculation of the variance of $M\u2192$ requires a large number of uncorrelated MD configurations, and the simulation box cannot be too small; hence, if we directly build a machine learning model to obtain $M\u2192$,^{21} the training set has to sample the huge configuration space well, which can be very expensive. Instead, here, we constructed the machine learning model to get the molecular dipole moment of water in the liquid or supercritical state. In DFT calculations, there are 64–128 water molecules in one simulation box, so we have 64–128 independent water dipoles from one MD configuration, which makes it easier to have a large training set. The dipole moment of water molecules strongly depends on their local chemical environment,^{8} so we chose the local solvation structure as an input to an end-to-end neural network model, as shown in Fig. 1.

We built a local Cartesian coordinate for each water molecule based on its molecular geometry. The origin is located at the O atom of the center water molecule. The *x* axis is along the bisector of *∠*HOH, and the *z* axis is perpendicular to the plane formed by one O and two H atoms. In this way, the rotational and translational symmetries are preserved. For the center water molecule, the input is the coordinates of its neighboring atoms including its two H atoms $\Theta \u2192={\Theta 1O,\Theta 2O,\u2026,\Theta iO,\u2026;\Theta 1H,\Theta 2H,\u2026,\Theta iH,\u2026}$ within a cutoff radius *R*_{c},

where (*x*_{i}, *y*_{i}, *z*_{i}) is the local Cartesian coordinate of the *i*th neighboring atom, *s* labels the species as either O or H, and $ri=xi2+yi2+zi2$. We sorted the atoms by distance from near to far in the input to preserve the permutational symmetry. We fixed *R*_{c} at 0.6 nm, and $\Theta \u2192$ has the coordinates of 32 water molecules. The cutoff radius is between the first and second solvation shells of water in our study. Considering that the water density varies considerably with P–T conditions, if the number of water molecules within this radius was smaller than 32, we appended (0,0,0) to the tail of $\Theta \u2192$ to keep the size of $\Theta \u2192$ fixed, so the input data to the neural network always have the same size.

In the neural network model, we used four hidden layers, sequentially consisting of 300, 200, 100, and 30 nodes, to connect the input and output layers. We chose the hyperbolic tangent function^{22} as the activation function for all hidden layers. The loss function is defined as

where $\mu \u2192jp$ and $\mu \u2192jt$ are the predicted and inputted dipole moments of the *i*th water molecule, respectively, and *n* is the batch size. We used the Adam optimization algorithm^{23} to minimize the loss function, and the learning rate is 1.0 × 10^{−4}. To avoid overfitting, we added an L2 regularization^{24} to each hidden layer.

Our training set consists of 2 138 880 water dipoles obtained from over 100 ps FPMD simulations at ambient and supercritical conditions up to 11 GPa and 2000 K. Although our current study is focused on supercritical water, we found that the data at ambient conditions could improve the accuracy of the trained NND model.

To calculate the dipole moment of water molecules, we used the Perdew-Burke-Ernzerhof (PBE) xc functional,^{25} which may not be sufficient enough to describe hydrogen bonds in water at ambient conditions,^{26} but our previous studies indicated that it performs better at high P and high T than at ambient conditions,^{27} particularly for dielectric properties.^{8,20}

Note that very recently, Zhang *et al.* introduced a neural network model to obtain the centroid position of MLWF centers,^{28} which can also be used to calculate molecular dipole moments. The model requires two neural networks.

## RESULTS AND DISCUSSION

At each P–T condition, we used half of the MD trajectory to train the NND model and the remaining half to assess results, so there is no overlap between the test and training sets. We compared the distributions of dipole magnitude of water molecules obtained by the NND model and DFT in the test set (see Fig. S1), where the mean absolute error (MAE) is 0.14 D. Interestingly, we also tested our NND model at 30 GPa and 2000 K, which was not included in the training set, and the MAE is 0.038 D. Water ionization happens very frequently at this condition, so we use one O atom and two nearest H atoms to form a water “molecule” to calculate the molecular dipole. The excellent performance at 30 GPa and 2000 K suggests that our NND model can not only cover the P–T range of the training set but may also be extrapolated outside this range.

In Fig. S2, we plotted the distributions of the angle between the dipole moment vectors calculated by the NND model and DFT, and the MAE is 3.86° for individual dipole moments. It is interesting to see that the error of $M\u2192$ is slightly smaller than that of individual dipoles, which may be due to the error cancellation in the vector summation.

After validating water dipole moments obtained by our NND model, we applied them to calculate *ϵ*_{0} at various P–T conditions, as shown in Fig. 2. Overall, the relative error is smaller than 0.6% in test sets, which is much smaller than the standard deviations obtained from our MD simulations, indicating that our NND model is nearly as accurate as first-principles calculations (see Table SI).

After building an excellent model to calculate the molecular dipole moment of water, we may conduct first-principles MD simulations to generate trajectories at various P–T conditions and, then, use the NND model to calculate the fluctuation of $M\u2192$ to get the dielectric constant of water. Using this method, we avoid the step of calculating MLWFs. However, the FPMD simulation is also very expensive, which limits its application to many P–T conditions. In many previous studies on high P–T water, the force fields extended simple point charge (SPC/E)^{29} and TIP4P/2005^{30} were widely used. Although they were mainly designed to simulate ambient water, they seem to work well for some properties at high P and T, e.g., the equation of state.^{31} As for dielectric properties, the main limitation of these two models is their rigidity. The molecular dipole moment of SPC/E is fixed at 2.34 D, while that of TIP4P/2005 is 2.31 D. In fact, both P and T may affect the molecular dipole moment of water considerably. Our previous study shows that at ∼1 GPa and 1000 K, the dielectric constant of water calculated by the SPC/E model is close to the result obtained from FPMD simulations and also the extrapolated experimental value;^{8} however, the difference between the SPC/E result and FPMD simulations becomes larger with an increase in pressure. We found that at ∼1 GPa and 1000 K, the average dipole moment of water molecules, *μ*, calculated by DFT is close to 2.34 D, but *μ* increases with an increase in pressure and decreases with an increase in temperature, which cannot be captured by rigid water models.

Since our NND model gives excellent dipole distributions under various P–T conditions, we applied this model to calculate molecular dipole moments for the trajectories obtained from the SPC/E and TIP4P/2005 simulations, as shown in Fig. 2. The dielectric constants calculated by our NND model are generally larger than those directly from the two water models and become closer to the DFT values, indicating that the polarization of water molecules with pressure and temperature affects the dielectric constant of water.

It is interesting to see that the dielectric constants calculated by the SPC/E model are closer to the DFT values than those by the TIP4P/2005 model, whereas at 1000 K using the NND model, the TIP4P/2005 trajectories give better results than the SPC/E ones. The dielectric constant of water depends on two factors: (1) molecular dipole moment and (2) hydrogen bond network characterized by the Kirkwood g-factor,^{32}

where *N* is the number of molecules in the simulation box. While *μ* is the individual property of water molecules in a chemical environment, *G*_{k} accounts for the angular correlation among water dipoles depending on the structure of hydrogen bond networks. For fully random dipole particles, *G*_{k} is 1, and for ice Ih with an ideal tetrahedral order (the Pauling model), *G*_{k} is 3.^{33} Aragones *et al.* argued that for high pressure ice phases at 243 K, TIP4P/2005 may provide decent *G*_{k}, so after scaling the molecular dipole moment, it gives correct dielectric constants,^{33} which are qualitatively consistent with our finding at 1000 K. However, when increasing T to 2000 K, the SPC/E trajectories again perform better than TIP4P/2005, indicating that the SPC/E model may give more accurate *G*_{k} than TIP4P/2005 at a very high temperature.

The variation of molecular geometry of water cannot be well reproduced by rigid force fields. Here, we also generated MD trajectories using the neural network force field, which was recently developed for high P–T water and ice in the molecular, ionic, and superionic phases up to 70 GPa and 3000 K using the DFT energy potential.^{34} We calculated *ϵ*_{0} using the obtained water trajectories, as shown in Fig. 2. The difference between the NND and DFT values is smaller than 1.7%, which is within error bars in FPMD simulations and overall better than the results obtained from the SPC/E and TIP4P/2005 trajectories.

Using the NND model and the neural network force field, we calculated the dielectric constant of supercritical water from 1 GPa to 15 GPa and from 800 K to 1400 K, corresponding to the P–T conditions found in Earth’s upper mantle. *ϵ*_{0} increases with an increase in P and decreases with an increase in T and varies from ∼7 up to 72 in Fig. 3. Generally, the increase in *ϵ*_{0} becomes smaller with an increase in P along an isotherm. If we treat water as a dielectric medium, the attractive force between a cation and an anion in water is given by $F=q+q\u2212\u03f50r$, where *q*_{+} and *q*_{−} are the charges of the cation and anion, respectively, and *r* is their distance. If *ϵ*_{0} is large, the attractive force *F* is small, so ionic compounds may be easily dissolved by water, whereas if *ϵ*_{0} is small, with increasing *F*, the dissolved ions may precipitate out of water. Since *ϵ*_{0} determines the solvation properties of water, the large variation of *ϵ*_{0} substantially influences how water stores and transports materials with great implications on water–rock interactions in Earth’s interior.

Figure 4 shows *μ* and *G*_{k} as a function of pressure. With an increase in P along an isotherm, *μ* increases monotonically, whereas *G*_{k} increases and, then, decreases; its peak appears at 5 GPa–7 GPa. All the *G*_{k} values are smaller than 3, but larger than 1, indicating that there is still certain dipolar angular correlation in supercritical water, but the hydrogen bond networks do not have a perfect tetrahedral order. With an increase in P from 1 GPa to 5 GPa, the angular correlation among dipoles is enhanced because the molecular interaction becomes stronger. With a further increase in P, however, the hydrogen bonding becomes weaker due to the increased coordination number,^{35} so the angular correlation is not as strong as at low P. There is a subtle interplay between *μ* and *G*_{k} in governing the change in *ϵ*_{0}. The increase in *ϵ*_{0} at low P is driven by the increase in *μ*, *G*_{k}, and water density, while at high P, it is because the increase in *μ* and density exceeds the decrease in *G*_{k}; as a result, the increase in *ϵ*_{0} at high P is generally smaller than that at low P.

The dielectric constant is a function of the electric field frequency, *ϵ*(*ω*), which can be calculated by the Fourier–Laplace transform of the derivative of the normalized autocorrelation function of $M\u2192$, $\Phi (t)=<M\u2192(0)\u22c5M\u2192(t)><M\u21922>$,^{9}

It is difficult to get converged *ϵ*(*ω*) in the microwave range from FPMD simulations. Using the NND model and the neural network force field, we found that the rapid decay part of Φ(*t*) can be well fitted to an exponential function $Ae\u2212t\tau D$, where *τ*_{D} approximates the Debye relaxation time,^{36} with fitting errors smaller than 2.3%. In our calculations, Φ(*t*) contains the raw data until it decreases to 0.2, the tail from the exponential function after Φ(*t*) is smaller than 0.1, and the linear interpolation between the raw and fitted data when Φ(*t*) is between 0.2 and 0.1 to make a smooth connection. Figure 5 shows the real and imaginary parts of the frequency-dependent dielectric constant of supercritical water in the microwave range. The large peaks in Fig. 5(b) correspond to the main dielectric absorption peaks, which are centered between about 600 GHz and 10 THz, much larger than those at ambient conditions, ∼20 GHz. The main absorption peak is upshifted with an increase in temperature, but downshifted with an increase in pressure. The rise in temperature from 1000 K to 2000 K affects the peak position more than the increase in pressure from 1 GPa to 10 GPa. The dielectric absorption of water may be attributed to single molecular or large collective motions, which is still under debate.^{37} For supercritical water studied here, it seems temperature has more obvious effects on the Debye relaxation time than pressure. At a high temperature, the absorption peak has a large width because the lifetime of either single molecular or large collective motions becomes short.

## CONCLUSION

In summary, we built a physics-based neural network dipole model, which combines the accuracy of first-principles calculations and the efficiency of empirical models, to compute the dielectric constant of supercritical water, *ϵ*_{0}, from 1 GPa to 15 GPa and from 800 K to 1400 K. We found that *ϵ*_{0} can vary by one order of magnitude in Earth’s upper mantle, suggesting that solvation properties of water may change dramatically at different depths, so water, as an important mass transfer medium, may dissolve some materials, e.g., carbonates,^{38} and transport and release them in some shallow areas, which connects material reservoirs in Earth’s surface and interior. A subtle interplay between the molecular dipole moment and the dipolar angular correlation governs the increase in *ϵ*_{0} with pressure. We also calculated the dielectric constant as a function of the electric field frequency and found that temperature affects the dielectric absorption peak more than pressure in the P–T range studied here. The accuracy of our method solely depends on the quality of the training data, which can be further improved by using high level theories, e.g., hybrid exchange–correlation functionals in DFT. Although we studied only water here, our method can be readily applied to other molecular fluids.

The dielectric constant of water as a function of P and T plays a key role in the Deep Earth Water (DEW) model developed recently, which can calculate thermodynamic properties of many aqueous species and study water–rock interactions at elevated P–T conditions.^{39} Using the method introduced here, we are able to build a high quality database for *ϵ*_{0} in a large P–T range as found in deep Earth. Ionic minerals tend to dissolve in water with large *ϵ*_{0} and may be transported by aqueous geofluids. When *ϵ*_{0} decreases, dissolved ions may precipitate out of solutions. The obtained data for *ϵ*_{0} have great implications on how water stores and transports materials in Earth’s interior.

## SUPPLEMENTARY MATERIAL

See the supplementary material for further details of the methodology and supporting data.

## DATA AVAILABILITY

The data that support the findings of this study are available within the article and its supplementary material. The trained NND model is openly available at https://github.com/angstrom-group/NN-Dipole-Model.

## ACKNOWLEDGMENTS

The authors thank Giulia Galli and Viktor Rozsa for their helpful discussions and Lin Zhuang and Xin-Zheng Li for helping with the simulations using the neural network force field. D.P. acknowledges support from the Croucher Foundation through the Croucher Innovation Award, the Hong Kong Research Grants Council (Project Nos. ECS-26305017 and GRF-16307618), the National Natural Science Foundation of China (Project No. 11774072), and the Alfred P. Sloan Foundation through the Deep Carbon Observatory. Part of this work was carried out using computational resources from the National Supercomputer Center in Guangzhou, China.