We present a classical induction model to evaluate the three-body ion–water–water (I–W–W) and water–water–water (W–W–W) interactions in aqueous ionic systems. The classical description of the induction energy is based on electrostatic distributed multipoles up to hexadecapole and distributed polarizabilities up to quadrupole–quadrupole on the O and H atoms of water. The monatomic ions were described by a point charge and a dipole–dipole polarizability, while for the polyatomic ions, distributed multipoles up to hexadecapole and distributed polarizabilities up to quadrupole–quadrupole were used. The accuracy of the classical model is benchmarked against an accurate dataset of 936 (I–W–W) and 2184 (W–W–W) three-body terms for 13 different monatomic and polyatomic cation and anion systems. The classical model shows excellent agreement with the reference second order Moller–Plesset and coupled-cluster single double and perturbative triple [CCSD(T)] three-body energies. The Root-Mean-Square-Errors (RMSEs) for monatomic cations, monatomic anions, and polyatomic ions were 0.29, 0.25, and 0.12 kcal/mol, respectively. The corresponding RMSE for 1744 CCSD(T)/aVTZ three-body (W–W–W) energies, used to train MB-pol, was 0.12 kcal/mol. The accuracy of the proposed classical model demonstrates that the three-body term for aqueous ionic systems can be accurately modeled classically. This approach provides a fast, efficient, and as-accurate path toward modeling the three-body term in aqueous ionic systems that is fully transferable across systems with different ions without the need to fit to tens of thousands of *ab initio* calculations for each ion to extend existing many-body force fields to interactions between water and ions.

## I. INTRODUCTION

The strong nonadditive interactions in aqueous systems^{1–7} present a challenge in the development of accurate potential energy surfaces compared to systems exhibiting largely pairwise interactions. The many-body expansion (MBE) for aqueous systems^{2,3,8–25} has manifested itself a valuable guide in fitting *ab initio*-based classical potentials to accurately reproduce the many-body terms of these systems.^{26–33} The MBE partitions the total energy of an *N*-body system as a sum of its constituent fragments, or “bodies,” which refer to the ions and water molecules,

where *E*_{ref} is the energy of the isolated (noninteracting) fragment.

While these explicit many-body potentials are typically quite accurate, some of them usually require tens of thousands of high-level *ab initio* calculations to fit against. Further, this process must be repeated for each interaction type within a new system of interest. For certain terms in the expansion (e.g., the two-body term), this procedure may be necessary for the desired accuracy. However, for higher order terms in the expansion (which are smaller in magnitude), it may be worthwhile to consider alternative classical representations of these interactions. Exploring a classical representation of the higher order terms may lead to a more efficient framework that is transferable across different ionic aqueous systems.

The two-body term is the largest contributor to the total binding energy of aqueous systems and is strongly attractive.^{10,11,22–24} While arguably the most important component of the binding energy, the two-body term is notably challenging to fit due to the sensitivity of its accuracy to the basis set superposition error (BSSE), the basis set size, and the level of electron correlation.^{22} For example, to arrive at an accurate two-body potential for water, tens of thousands of coupled-cluster single double and perturbative triple [CCSD(T)] two-body energies have been used to fit the term using permutationally invariant polynomials (PIPs).^{34} More specifically, the two-body term of the WHBB potential was fit to 30 000 CCSD(T)/aVTZ values^{32} whereas the two-body term of MB-pol was fit to 40 000 CCSD(T)/Complete Basis Set (CBS) values.^{30} The Root-Mean-Square-Errors (RMSEs) for these two-body fitted potentials were determined to be 0.15 and 0.05 kcal/mol, respectively, when tested on 42 394 CCSD(T)/CBS dimer energies.^{35}

The three-body term for aqueous systems is the second largest contributor (only behind the two-body term) to the binding energy, comprising between 2.9% and 35.4% of the binding energy for X^{+/−}(H_{2}O)_{9}^{23,24} and (H_{2}O)_{10}^{22} systems. In pure water systems, the sum of the three-body interactions is energetically favorable (attractive). The three-body term of the WHBB potential was fit to 30 000 second order Moller–Plesset (MP2)/aVTZ calculations^{32} whereas the three-body term of the MB-pol potential was fit to 12 000 CCSD(T)/aVTZ calculations.^{31} In ion–water systems, the three-body interactions are often found to be repulsive.^{23,24} While the two-body terms are sensitive to several factors mentioned earlier, the three-body energy has been shown to contain only a negligible amount of electron correlation energy, as the majority of the electron correlation energy is present in the two-body term;^{22–24} in addition, the three-body term is not sensitive to the basis set and/or BSSE.^{22–24} That said, it may not be necessary to fit the three-body term to numerous three-body energies obtained from high-level electronic structure calculations. Rather, it may be desirable to represent the three-body interaction classically. It has been established that the vast majority of nonadditive interactions in water is due to polarization.^{20} However, the accuracy of a classical representation of the three-body term has not been explored thoroughly for different monatomic and polyatomic ion systems.

The goal of this paper is to examine the accuracy of a classical three-body induction model in reproducing individual *ab initio* three-body terms for a variety of monatomic and polyatomic ion systems. We report the performance of this model on numerous monatomic (cations: Ca^{2+}, Li^{+}, Na^{+}, K^{+}, Rb^{+}, Cs^{+}; anions: F^{−}, Cl^{−}, Br^{−}, I^{−}) and polyatomic ions (cation: $NH4+$, anions: $SO42\u2212$, $ClO4\u2212$) using induction models both without ion polarization and including ion polarization. By examining a variety of ion systems, we test the transferability of this model to differing ion strengths, polarizabilities and the hydrogen bonding arrangements. This work aims to provide an understanding of the underlying physics that governs the dominant nonadditive interactions in aqueous ionic systems and test its performance compared to *ab initio* results for these systems. Such an effort will inform future development of accurate and efficient many-body potentials for aqueous ionic systems.

## II. THEORETICAL AND COMPUTATIONAL DETAILS

### A. The classical three-body induction model

The pair interaction energy or two-body energy correction *U*_{ij} of molecules *i* and *j* can be defined as their energy *E*_{ij} in the geometry in which they occur in the cluster, but in the absence of all other molecules, less the energies of the isolated molecules *i* and *j*. Similarly, the three-body correction for molecules *i*, *j*, and *k* can be defined as the energy of that triple in the absence of all other molecules, less the energies of the isolated molecules and the two-body corrections *U*_{ij}, *U*_{ik}, and *U*_{jk}. This view is equivalent to the definitions in Eqs. (1)–(4).

The principal three-body contributions to the energy of an aqueous ionic hydrogen bonded network arise from the induction energy, that is, from the polarization of each molecule by the electric field of its neighbors. The contribution from three-body dispersion is much smaller.^{36} In the hydrogen bonded network of water, arrangements are favored in which the dipole moment induced in each molecule by the electric field of its neighbors enhances its static dipole moment, which, in turn, enhances its electric field. This effect favors nose-to-tail cyclic arrangements of the hydrogen bonds in small water clusters and leads to a cooperative many-body effect on the induction energy. In the presence of an ion, however, this cooperativity is disrupted. To see how this arises, consider the arrangement in Fig. 1.

Here, we have two water molecules *a* and *b* and an ion *i*. The electric field of the ion induces dipoles *δμ*_{a} and *δμ*_{b}, additional to the static dipole moments, that are not shown. There is a change to the induction energy of molecule *a*, but this is part of the *U*_{ia} pair energy, and likewise, there is a change to the *U*_{ib} pair energy. However, there is a repulsion between the induced dipoles *δμ*_{a} and *δμ*_{b}, and this is a three-body effect. An estimate using reasonable values for the parameters suggests that the energy is positive and of the order of a few kJ mol^{−1} if the molecules and ion are close to contact. Moreover, the field of the ion will tend to orient the static dipole moments of the neighboring water molecules in the direction away from or toward the ion (depending on the ion’s sign) and this will tend to disrupt the cooperative network of hydrogen bonds, reducing the magnitude of the negative three-body energies. This is a greatly oversimplified picture, but it provides a useful qualitative idea of the nature of the ion–water–water (I–W–W) three-body effect. For the calculations described below, a much more detailed and accurate description was used, involving distributed multipoles and distributed polarizabilities.^{36} In this treatment, the change $\Delta Qta$ to multipole moment *t* at site *a* of molecule *A* depends on the electrostatic fields at sites *a*′ on *A* due to the polarized moments *u* on all sites *b* of the other molecules *B*,

where $\alpha tt\u2032aa\u2032$ is the polarizability relating the induced moment *t* at site *a* to the electrostatic field component *t*′ at site *a*′, and $Ttua\u2032b$ is the general interaction function describing the electrostatic energy between multipole moments $Qta\u2032$ and $Qub$. Summation over repeated suffixes is implied. The distributed polarizabilities are calculated by the CamCASP program.^{37} In practice, we use localized polarizabilities, which are nonzero unless *a*′ = *a*; these are derived from the general distributed polarizabilities using the Orient program.^{38} The set of coupled equations in Eq. (5) is solved iteratively to obtain the $\Delta Qta$ for all molecules *A*, and then the induction energy is

which again is calculated by the Orient program. This calculation is carried out for all the dimer and trimer subsystems in the cluster to obtain the three-body energy, and it can be used to obtain four-body and higher terms if required. The calculation is very fast—under a second to obtain the three-body energy for any of the clusters considered here.

### B. Details of the calculations

The optimized ion–water cluster geometries were obtained from previous MP2 calculations.^{23,24} In those works, MP2 calculations were also carried out to evaluate the three-body energies, both for the set of water–water–water (W–W–W) trimer subsystems and for the set of ion–water–water (I–W–W) trimers. These calculations were corrected for basis set superposition error (BSSE) using the counterpoise method of Boys and Bernardi.^{39} For all (W–W–W) and most (I–W–W) calculations, Dunning’s aug-cc-pVDZ basis set^{40,41} was used. The Stuttgart RSC 1997 ECP^{42,43} was used for calculations containing Ca^{2+}, Rb^{+}, and Cs^{+} while the aug-cc-pVDZ-PP basis set^{44,45} was used for all I^{−} calculations. All MP2 calculations were performed with the NWChem 7.0.2 electronic structure suite.^{46}

The induction energy contributions to the three-body energy can be explored accurately and in detail using the Orient program.^{38} The cluster geometries for Orient were derived from the MP2 optimized geometries by replacing each H_{2}O molecule by an H_{2}O with the O in the same position and the H atoms in the same plane but with the bond lengths and bond angle as in the equilibrium isolated molecule.

The detailed classical description of the induction energy uses electrostatic distributed multipoles up to hexadecapole and distributed polarizabilities up to quadrupole–quadrupole on the O and H atoms of water, taken from the ASPW4 intermolecular potential^{47} for the water dimer. The monatomic ions were described by a point charge and by dipole–dipole polarizabilities taken from Li *et al.*,^{48} which were evaluated experimentally for ions in aqueous solution. For the polyatomic ions, distributed multipoles up to hexadecapole and distributed polarizabilities up to quadrupole–quadrupole were calculated using CamCASP^{37} with the aug-cc-pVTZ basis set.^{40,41} The induction interactions were damped at short range by the procedure used for the ASPW4 water dimer potential. The effect of varying the damping parameter away from the value of 1.41 used in the ASPW4 potential has been explored.

Second order Moller–Plesset (MP2) perturbation theory calculations were also used to evaluate the three-body terms for these geometries, using the basis sets listed above, in order to obtain more direct comparisons between the two approaches.

### C. Dataset of (I–W–W) and (W–W–W) trimers

Ion–water–water (I–W–W) trimer subsystems containing F^{−}, Cl^{−}, Br^{−}, I^{−}, Li^{+}, Na^{+}, K^{+}, Rb^{+}, Cs^{+}, Ca^{2+}, $ClO4\u2212$, $SO42\u2212$, or $NH4+$ and water–water–water (W–W–W) trimer subsystems were obtained from ion–(H_{2}O)_{9} clusters from previous works.^{22–24} Two clusters for each ion system were considered: one cluster containing the ion inside the cluster and one containing the ion on the outside of the cluster. In the original study,^{23,24} the three-body terms for each of these subsystems [72 (I–W–W) and 168 (W–W–W) for each ion system] were computed at the cluster optimized geometries at the MP2 level of theory. This amounts to a total of 936 (I–W–W) and 2184 (W–W–W) trimers for all systems combined. To compare the proposed induction model with the *ab initio* benchmarks on an equal footing, MP2 results were evaluated at the frozen monomer geometry (i.e., when the 1–B = 0) for each subsystem, as used in Orient. As an additional test of the (W–W–W) classical model, we have utilized the dataset of 12 347 CCSD(T)/aVTZ 3-B energies used to train MB-pol.^{31} Because the classical induction model considered in this work is implemented as a rigid model, we have focused on the 1744 trimers (out of the 12 347 trimers) for which the monomers have R_{OH} values within 0.04 Å of equilibrium (0.96 Å) and *θ*_{HOH} values within 1.5° of the equilibrium (104.5°) for a more fair comparison.

## III. RESULTS AND DISCUSSION

### A. Establishing accurate *ab initio* reference values for the dataset of (I–W–W) and (W–W–W) trimers

The aqueous clusters that the trimer subsystems were taken from are shown in Fig. 2. The purpose of sampling two different cluster configurations for each ion system is to include a wider range of (I–W–W) trimer arrangements, representing surface-like and bulk-like configurations. Each ionic cluster contains 36 (I–W–W) and 84 (W–W–W) trimers. It has been previously established that the total three-body term in aqueous systems is weakly dependent on the basis set size.^{22} Figure 3 shows a comparison of the MP2/aVDZ results for (Li^{+}–W–W) against those with two larger basis set sizes, namely aug-cc-pVTZ and aug-cc-pVQZ, as well as for (Ca^{2+}–W–W) against those with the aug-cc-pVTZ basis set. The difference between the results obtained with different basis set sizes is small, with Root-Mean-Square-Errors (RMSEs) not exceeding 0.06 kcal/mol for the three-body terms in the Li^{+}(H_{2}O)_{9} or Ca^{2+}(H_{2}O)_{9} systems. This verifies that the individual three-body terms are also insensitive to the basis set size, a result that further validates the use of MP2/aVDZ results for the other ion systems.

Furthermore, the basis set superposition error (BSSE) has been found to affect mainly the two-body interaction.^{22} Correlation plots between the uncorrected and BSSE-corrected MP2/aVDZ results are shown in Fig. 4. The correlation plots for each of the individual ion systems are included in the supplementary material. For the individual halide–water systems, the RMSEs do not exceed 0.04 kcal/mol (0.029 kcal/mol for the halides altogether). Similarly, for the individual polyatomic systems, the RMSEs do not exceed 0.04 kcal/mol (0.032 kcal/mol for the polyatomic ions altogether). The RMSEs for the different cation systems range from 0.04 to 0.28 kcal/mol (0.13 kcal/mol for the cations altogether). This shows that the individual three-body terms are generally insensitive to the BSSE, but this is system-dependent. The largest deviation tends to appear in the strongly repulsive three-body terms for the trimers of Li^{+} and Ca^{2+}. For this reason, the uncorrected MP2/aVDZ values are used for all system calculations except Li^{+}, Ca^{2+}, and Cs^{+}, for which the results were corrected for BSSE. Altogether, these results show that the three-body terms in these systems are insensitive to basis set size and BSSE, thus validating the use of uncorrected MP2/aVDZ three-body calculations for most (except some cation) systems.

### B. Classical induction model for three-body (I–W–W) interactions

Now that the computational protocol used to compute the reference three-body interactions has been validated, let us turn our attention to the performance of the classical induction model in reproducing these interactions. For each set of ions (monatomic cations, monatomic anions, and polyatomic ions), we will discuss the accuracy of the classical induction model in representing these interactions and specifically the role of ion polarizabilities in these interactions. Furthermore, we will provide a comparison of this model to other many-body ion–water potentials in the literature, when available. Note that we will focus on the MP2 results at the frozen (relaxed) intramolecular geometry to compare to the induction model on an equal footing. The plots corresponding to the subsystems at the optimized cluster geometry can be found in the supplementary material.

#### 1. Monatomic cations

Figure 5 shows the accuracy of the classical induction model without ion polarization (left panel), with ion polarization (middle panel), and with ion polarization with increased damping (right panel). The distributions of the errors (*E*_{MP2} − *E*_{model}) are shown below the respective correlation plots. The corresponding RMSEs and mean absolute errors (MAEs) for the individual ion systems are organized in Table I. As expected, there is no significant qualitative difference between the models with and without ion polarization because the cations are not very polarizable. However, by adjusting the damping parameter (1.41 → 1.15), the induction model aligns more closely with the MP2 reference values (RMSE: 0.29 kcal/mol). The largest errors lie with the (Li^{+}–W–W) and (Ca^{2+}–W–W) trimers, for which the RMSE is 0.41 kcal/mol for both systems. This is mainly due to deviations in the repulsive region (positive values) of the correlation plots.

. | Without . | With ion pol. . | With ion pol. . | ||||
---|---|---|---|---|---|---|---|

. | Number of . | ion pol. . | (DF = 1.41) . | (DF = 1.15) . | |||

Trimer . | points . | RMSE . | MAE . | RMSE . | MAE . | RMSE . | MAE . |

Li^{+}–W–W | 72 | 0.76 | 0.40 | 0.80 | 0.42 | 0.41 | 0.22 |

Na^{+}–W–W | 72 | 0.34 | 0.21 | 0.42 | 0.27 | 0.26 | 0.16 |

K^{+}–W–W | 72 | 0.24 | 0.15 | 0.30 | 0.20 | 0.23 | 0.14 |

Rb^{+}–W–W | 72 | 0.22 | 0.13 | 0.28 | 0.19 | 0.22 | 0.14 |

Cs^{+}–W–W | 72 | 0.13 | 0.080 | 0.17 | 0.11 | 0.11 | 0.069 |

Ca^{2+}–W–W | 72 | 0.88 | 0.61 | 1.06 | 0.75 | 0.41 | 0.26 |

Combined | 432 | 0.51 | 0.26 | 0.60 | 0.32 | 0.29 | 0.16 |

. | Without . | With ion pol. . | With ion pol. . | ||||
---|---|---|---|---|---|---|---|

. | Number of . | ion pol. . | (DF = 1.41) . | (DF = 1.15) . | |||

Trimer . | points . | RMSE . | MAE . | RMSE . | MAE . | RMSE . | MAE . |

Li^{+}–W–W | 72 | 0.76 | 0.40 | 0.80 | 0.42 | 0.41 | 0.22 |

Na^{+}–W–W | 72 | 0.34 | 0.21 | 0.42 | 0.27 | 0.26 | 0.16 |

K^{+}–W–W | 72 | 0.24 | 0.15 | 0.30 | 0.20 | 0.23 | 0.14 |

Rb^{+}–W–W | 72 | 0.22 | 0.13 | 0.28 | 0.19 | 0.22 | 0.14 |

Cs^{+}–W–W | 72 | 0.13 | 0.080 | 0.17 | 0.11 | 0.11 | 0.069 |

Ca^{2+}–W–W | 72 | 0.88 | 0.61 | 1.06 | 0.75 | 0.41 | 0.26 |

Combined | 432 | 0.51 | 0.26 | 0.60 | 0.32 | 0.29 | 0.16 |

Table II shows a comparison of the total three-body (I–W–W) energies using the proposed induction model, CCSD(T)(-F12b), i-TTM,^{49} and Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) 2009^{50} for small X^{+}(H_{2}O)_{n} clusters, as published previously.^{49} Note that for the *n* = 2, *n* = 3, and *n* = 4 results, there are 1, 3, and 6 individual (I–W–W) calculations, respectively, that are added together. The i-TTM and AMOEBA 2009 models consider electrostatic, induction, repulsion, and dispersion interactions. The proposed model in this work considers only the induction energy and agrees quite well with the *ab initio* benchmarks.

. | A . | B . | C . | D . | E . | F . | |
---|---|---|---|---|---|---|---|

n | Symmetry | (Li^{+}–W–W) contribution to three-body energy | |||||

2 | D_{2d} | 3.36 | 4.53 | 4.53 | 3.72 | 6.10 | 3.62 |

3 | D_{3} | 11.34 | 16.27 | 16.01 | 13.12 | 20.81 | 12.72 |

4 | S_{4} | 23.90 | 32.61 | 32.09 | 26.44 | 40.18 | 25.30 |

(Na^{+}–W–W) contribution to three-body energy | |||||||

2 | D_{2d} | 1.64 | 1.82 | 2.60 | 2.16 | 2.53 | 1.62 |

3 | D_{3} | 5.46 | 6.77 | 7.96 | 6.85 | 8.41 | 5.37 |

4 | C_{2} | 3.14 | 3.96 | 5.01 | 4.81 | 5.77 | 2.65 |

(K^{+}–W–W) contribution to thee-body energy | |||||||

2 | D_{2d} | 1.38 | 0.80 | 1.69 | 1.45 | 1.47 | 1.26 |

3 | C_{2} | −1.35 | −1.13 | −1.45 | −0.79 | −1.12 | −1.64 |

4 | C_{2} | 1.44 | 0.89 | 1.97 | 2.21 | 2.01 | 0.94 |

(Rb^{+}–W–W) contribution to three-body energy | |||||||

2 | D_{2d} | 1.14 | 0.55 | 1.54 | 1.35 | 1.17 | 1.12 |

3 | C_{2} | −1.63 | −1.27 | −1.78 | −1.13 | −1.44 | −1.84 |

4 | C_{2} | 0.74 | 0.18 | 1.35 | 1.64 | 1.07 | −0.50 |

(Cs^{+}–W–W) contribution to three-body energy | |||||||

2 | C_{s} | 0.76 | 0.81 | 0.48 | 0.55 | 0.81 | 0.73 |

3 | C_{2} | −1.91 | −1.32 | −2.08 | −1.45 | −1.72 | −2.07 |

4 | C_{2} | 0.56 | −0.26 | 0.94 | 1.27 | 0.55 | 0.21 |

. | A . | B . | C . | D . | E . | F . | |
---|---|---|---|---|---|---|---|

n | Symmetry | (Li^{+}–W–W) contribution to three-body energy | |||||

2 | D_{2d} | 3.36 | 4.53 | 4.53 | 3.72 | 6.10 | 3.62 |

3 | D_{3} | 11.34 | 16.27 | 16.01 | 13.12 | 20.81 | 12.72 |

4 | S_{4} | 23.90 | 32.61 | 32.09 | 26.44 | 40.18 | 25.30 |

(Na^{+}–W–W) contribution to three-body energy | |||||||

2 | D_{2d} | 1.64 | 1.82 | 2.60 | 2.16 | 2.53 | 1.62 |

3 | D_{3} | 5.46 | 6.77 | 7.96 | 6.85 | 8.41 | 5.37 |

4 | C_{2} | 3.14 | 3.96 | 5.01 | 4.81 | 5.77 | 2.65 |

(K^{+}–W–W) contribution to thee-body energy | |||||||

2 | D_{2d} | 1.38 | 0.80 | 1.69 | 1.45 | 1.47 | 1.26 |

3 | C_{2} | −1.35 | −1.13 | −1.45 | −0.79 | −1.12 | −1.64 |

4 | C_{2} | 1.44 | 0.89 | 1.97 | 2.21 | 2.01 | 0.94 |

(Rb^{+}–W–W) contribution to three-body energy | |||||||

2 | D_{2d} | 1.14 | 0.55 | 1.54 | 1.35 | 1.17 | 1.12 |

3 | C_{2} | −1.63 | −1.27 | −1.78 | −1.13 | −1.44 | −1.84 |

4 | C_{2} | 0.74 | 0.18 | 1.35 | 1.64 | 1.07 | −0.50 |

(Cs^{+}–W–W) contribution to three-body energy | |||||||

2 | C_{s} | 0.76 | 0.81 | 0.48 | 0.55 | 0.81 | 0.73 |

3 | C_{2} | −1.91 | −1.32 | −2.08 | −1.45 | −1.72 | −2.07 |

4 | C_{2} | 0.56 | −0.26 | 0.94 | 1.27 | 0.55 | 0.21 |

#### 2. Monatomic anions

Figure 6 shows the accuracy of the induction model neglecting ion polarization (left panel) and including ion polarization (middle panel). Without including ion polarization, we see good agreement in the attractive region for the monatomic anions. When ion polarization is included, the induction model overestimates these interactions slightly. In the repulsive region of the plot, we see that the model without ion polarization causes more scatter in the correlation plot. However, when ion polarization is included, we see that the deviation becomes less scattered and we can see a clear trend based on the relative ionic radii. By decreasing the damping parameter from 1.4 to 1.2 (Fig. 6, right panel), thus increasing the damping, we see improvements in both the attractive and the repulsive regions of the correlation plot. The RMSE decreased from 0.38 to 0.25 kcal/mol upon increasing the damping (Table III). However, the classical induction model is still found to overestimate the repulsive interactions to varying degrees. Interestingly, the classical induction model performs best for the monatomic cations and anions with a decreased damping parameter of ∼1.2 (increased damping). For the monatomic anions and cations, we found it necessary to also change the damping parameter to improve the agreement whereas the polyatomic systems performed well without any additional change. In Figs. 5 and 6, the middle panel shows the correlation trend between the model including ion polarization (with the original damping parameter) and the *ab initio* references. In these cases, the error is larger than the leftmost panel (without ion polarization). However, this is due to what appears to be a change in slope in the correlation plot. This is especially noticeable in the attractive region, which is why the rightmost panel shows the correlation after both including ion polarization and tweaking the damping parameter. Regardless, the RMSDs for all systems improve from the leftmost panel (no ion polarization) to the rightmost panel (including ion polarization and tweaking damping parameter if necessary) except for the monatomic anions in which we see nearly the same RMSD. The attractive region is improved upon for the monatomic anions by including ion polarization, but the repulsive region, while forming straighter lines, deviates more from the line. This implies that there may be an additional physical interaction that could be added to the induction model to improve the agreement in the repulsive region for the monatomic anions, specifically.

. | . | Ion pol. . | Ion pol. . | ||||
---|---|---|---|---|---|---|---|

. | No ion pol. . | (DF = 1.41) . | (DF = 1.20) . | ||||

System . | Number of points . | RMSE . | MAE . | RMSE . | MAE . | RMSE . | MAE . |

F^{−}–W–W | 72 | 0.26 | 0.17 | 0.50 | 0.29 | 0.24 | 0.14 |

Cl^{−}–W–W | 72 | 0.22 | 0.14 | 0.31 | 0.18 | 0.22 | 0.12 |

Br^{−}–W–W | 72 | 0.23 | 0.15 | 0.36 | 0.21 | 0.28 | 0.15 |

I^{−}–W–W | 72 | 0.22 | 0.14 | 0.30 | 0.19 | 0.24 | 0.14 |

Combined | 288 | 0.23 | 0.15 | 0.38 | 0.22 | 0.25 | 0.14 |

. | . | Ion pol. . | Ion pol. . | ||||
---|---|---|---|---|---|---|---|

. | No ion pol. . | (DF = 1.41) . | (DF = 1.20) . | ||||

System . | Number of points . | RMSE . | MAE . | RMSE . | MAE . | RMSE . | MAE . |

F^{−}–W–W | 72 | 0.26 | 0.17 | 0.50 | 0.29 | 0.24 | 0.14 |

Cl^{−}–W–W | 72 | 0.22 | 0.14 | 0.31 | 0.18 | 0.22 | 0.12 |

Br^{−}–W–W | 72 | 0.23 | 0.15 | 0.36 | 0.21 | 0.28 | 0.15 |

I^{−}–W–W | 72 | 0.22 | 0.14 | 0.30 | 0.19 | 0.24 | 0.14 |

Combined | 288 | 0.23 | 0.15 | 0.38 | 0.22 | 0.25 | 0.14 |

#### 3. Polyatomic ions

Figure 7 shows the accuracy of the induction model neglecting ion polarization (left panel) and including ion polarization (right panel). While we already see good agreement using the induction model without ion polarization, including ion polarization improves the agreement further. While other ion systems benefited from tweaking the damping, these systems did not require any further adjustment to the model. Table IV outlines the RMSEs and MAEs for the individual ion systems with and without ion polarization included in the model. Further, by examining the error distribution, we see that the maximum error in the three-body term does not exceed 0.5 kcal/mol. This demonstrates the promise of a classical induction model in accurately representing three-body interactions for other aqueous polyatomic ionic systems.

. | . | Ion pol. . | |||
---|---|---|---|---|---|

. | Number of . | No ion pol. . | (DF = 1.41) . | ||

System . | points . | RMSE . | MAE . | RMSE . | MAE . |

$NH4+$–W–W | 72 | 0.30 | 0.16 | 0.12 | 0.074 |

$ClO4\u2212$–W–W | 72 | 0.18 | 0.14 | 0.061 | 0.042 |

$SO42\u2212$–W–W | 72 | 0.33 | 0.29 | 0.15 | 0.12 |

Combined | 216 | 0.28 | 0.20 | 0.12 | 0.078 |

. | . | Ion pol. . | |||
---|---|---|---|---|---|

. | Number of . | No ion pol. . | (DF = 1.41) . | ||

System . | points . | RMSE . | MAE . | RMSE . | MAE . |

$NH4+$–W–W | 72 | 0.30 | 0.16 | 0.12 | 0.074 |

$ClO4\u2212$–W–W | 72 | 0.18 | 0.14 | 0.061 | 0.042 |

$SO42\u2212$–W–W | 72 | 0.33 | 0.29 | 0.15 | 0.12 |

Combined | 216 | 0.28 | 0.20 | 0.12 | 0.078 |

### C. Classical induction model for three-body (W–W–W) interactions

The induction model was tested on two different datasets of three-body interactions: the set of 2184 MP2/aVDZ three-body terms from the ionic and water clusters and the set of 12 347 CCSD(T)/aVTZ three-body terms that were used to train the MB-pol potential (Table V).^{31} Note that the MP2 values calculated in this work are at the frozen monomer geometries, for which the 1-B term is zero. However, this is not the case for the values in the MB-pol training set, so the comparison of the induction model (rigid monomers) to the full dataset may be less direct. The accuracy of the induction model in computing these three-body interactions is shown in Fig. 8 (left panel). Because the MB-pol training set consists of a wide range of monomer geometries, we have also filtered out all geometries for which *R*_{OH} < 0.92 Å, *R*_{OH} > 1.00 Å, *θ*_{HOH} < 103°, or *θ*_{HOH} > 106° to provide a fair comparison to the induction model (rigid). After filtering out more distorted molecules, there are 1744 trimers remaining for the comparison (Fig. 8, middle panel).

. | Ind. model . | Ind. model . | |||
---|---|---|---|---|---|

. | (DF = 1.41) . | (DF = 1.60) . | |||

Dataset . | Number of points . | RMSE . | MAE . | RMSE . | MAE . |

(W–W–W) from clusters | 2 184 | 0.0011 | 0.024 | 0.075 | 0.033 |

Mb-pol train set^{a}–full | 12 347 | 0.22 | 0.12 | 0.19 | 0.11 |

MB-pol train set^{a,b}–subset | 1 744 | 0.20 | 0.10 | 0.12 | 0.074 |

. | Ind. model . | Ind. model . | |||
---|---|---|---|---|---|

. | (DF = 1.41) . | (DF = 1.60) . | |||

Dataset . | Number of points . | RMSE . | MAE . | RMSE . | MAE . |

(W–W–W) from clusters | 2 184 | 0.0011 | 0.024 | 0.075 | 0.033 |

Mb-pol train set^{a}–full | 12 347 | 0.22 | 0.12 | 0.19 | 0.11 |

MB-pol train set^{a,b}–subset | 1 744 | 0.20 | 0.10 | 0.12 | 0.074 |

^{a}

CCSD(T)/aVTZ three-body energies used to train MB-pol, Babin *et al.*^{31}

^{b}

Only trimers for which *R*_{OH} < 0.04 Å and *θ*_{HOH} < 1.5° from equilibrium.

By increasing the damping parameter (decreasing the damping), the fit of the strongly attractive three-body terms improves greatly (Fig. 8, right panel). The resulting RMSE is 0.12 kcal/mol. This accuracy is on par with other explicit many-body potentials fitted to *ab initio* three-body terms, such as HBB2-pol and WHBB. However, because the proposed model is rigid, we are currently limited in our ability to describe three-body terms for which the monomers are significantly distorted from equilibrium monomer geometry. This will be the subject of a future study.

## IV. CONCLUSIONS

In this work, the physical origin of the three-body interaction in aqueous ionic systems was explored using a classical induction model. By modeling the three-body term as an interaction between induced multipoles for various (I–W–W) (13 different ion systems) and (W–W–W) subsystems, we find excellent agreement with the established MP2 and CCSD(T) three-body reference values. More specifically, the RMSEs for monatomic cations, monatomic anions, and polyatomic ions were 0.29, 0.25, and 0.12 kcal/mol, respectively, when compared to a total of 936 MP2/aVDZ reference three-body terms. Further, the classical induction model returned a RMSE of 0.12 kcal/mol on 1744 CCSD(T)/aVTZ three-body (W–W–W) energies used to train MB-pol. Note that this was the subset of the data for which all monomers had *R*_{OH} values within 0.04 Å of equilibrium and *θ*_{HOH} values within 1.5 degrees of equilibrium to compare the performance of the induction model (rigid, at equilibrium geometry) on an equal footing. This is on par with the performance of other explicit many-body potentials fit to *ab initio* data. The success of this model demonstrates that the three-body term for aqueous systems can be accurately modeled classically without fitting to tens of thousands of high-level *ab initio* calculations. More importantly, this work provides a fast, accurate, and efficient method to model three-body effects that is transferable across systems with different ions.

In general, we find that including ion polarization improves the description of the three-body term for all ion systems considered. For the monatomic anions and cations, we found it necessary to also change the damping parameter to improve the agreement whereas the polyatomic systems performed well without any additional change. The only adjustable parameter that was varied in this study was the damping factor. It was found that slightly decreasing the damping led to the best agreement for the monatomic ions systems while slightly increasing the damping led to the best agreement for the pure water systems. The induction model performed well on the polyatomic ion systems with the initial damping factor of 1.41. However, this damping factor was selected based on the performance on 72 (I–W–W) trimers cut from two different hydrogen bonding networks. Further tuning could be performed on a larger set of trimers, if desired. Nonetheless, the small amount of parameterization needed makes this an attractive option for other ion–water systems of interest. The induction damping could benefit from further investigation. The damping parameter is a reciprocal measure of the distance at which the damping becomes significant and can be expected to depend on the nature of the interacting molecules. However, treating it as an adjustable parameter produces satisfactory results.

This model could be further improved upon to allow for flexible monomers (and the associated change in multipoles with different intramolecular geometries). Because the intramolecular geometry of the water molecules affects the resulting interactions, this adjustment would need to be included to accurately predict the three-body energies of systems with highly distorted intramolecular geometries. This induction model could be extended further to include four-body and higher terms through a similar approach. This type of approach would allow all many-body effects to be described classically through induction, greatly simplifying the process of fitting accurate many-body potentials for new systems.

## SUPPLEMENTARY MATERIAL

The supplementary material comprises the distributed multipoles and distributed polarizabilities for H_{2}O, $NH4+$, $SO42\u2212$, and $ClO4\u2212$ correlation plots of the uncorrected MP2/aVDZ vs BSSE-corrected MP2/aVDZ three-body energies for the individual ion systems (plotted separately), and correlation plots of the induction model against the MP2/aVDZ three-body values at the cluster optimized geometries.

## ACKNOWLEDGMENTS

K.M.H. and S.S.X. acknowledge support from the Center for Scalable Predictive methods for Excitations and Correlated phenomena (SPEC), which is funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Chemical Sciences, Geosciences and Biosciences Division as part of the Computational Chemical Sciences (CCS) program at Pacific Northwest National Laboratory. Battelle operates the Pacific Northwest National Laboratory for the U.S. Department of Energy. This research used computer resources provided by the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding authors upon reasonable request.