Harmonic bond force constants and bond lengths are shown to generally obey the simple relationships, (hydrides) and (all other bond types), where ζ is the reduced nuclear charge and Re is the equilibrium bond length. Equally simple power-law relationships are found for higher-order bond force constants. Although not spectroscopically accurate, these models are nonetheless of significant heuristic value for identifying strongly multireference states of diatomic molecules (including electronically coupled excited states ill-suited for inclusion in laser-cooling schemes), rationalizing the observed trends in vibrational frequencies for diatomics and/or local mode oscillators within molecules or complexes and estimating and/or validating covalent bonding parameters within molecular mechanics force fields. Particular advantages of our approach over other bond length-strength scaling relationships proposed in the literature include its simplicity and generality and its appropriate asymptotic behavior. Notably, the relationships derived in this work can be used to predict harmonic and higher-order force constant bonds between any pair of atoms in the Periodic Table (including transition metals and lanthanides) without requiring row- or column-dependent parameterization, to accuracies commensurate with conventional force field transferability errors. We therefore anticipate that they will expedite force field development for metal-containing complexes and materials, which are structurally well-characterized but challenging to parameterize ab initio.
I. INTRODUCTION
Despite the wealth of computational resources available in the modern scientific computing era for predicting the structures and properties of molecules,1 systematic and automated universal force field parameterization remains theoretically and computationally challenging.2–4 Fundamentally, this is due to the fact that force fields are low-order approximations to multidimensional potential energy hypersurfaces of non-trivial topology, whose features depend on the electronic behavior of each molecule’s constituent atoms in a non-trivial way.
One way forward is to embrace the complexity of the problem and train computers to “think” like chemists, albeit in a more scalable and systematic manner, embedding structures to “hold” chemical information within machine-learning models, which are then typically trained to reproduce ab initio energy data.5–7 Although machine learning models are in principle completely general, in practice, their range of applicability may be limited by the completeness of the underlying dataset on which the model is trained and algorithmic choices made in constructing the model.6
An alternative approach is to pursue simple but universal physics-based models using atomic and molecular descriptors that can be readily computed ab initio. For example, long-range intermolecular interactions can be accurately modeled using atom-centered distributed multipole expansions, with relevant parameters (atomic partial charges, dipoles, quadrupoles, etc.) obtained purely by analysis of ab initio charge densities.8–10 Intramolecular interactions, on the other hand, have historically been characterized by sets of bonding parameters obtained by energetic fitting, requiring different parameters for each combination of atom types.2,3,8
This motivates the question—is it possible to predict energetic parameters that characterize the shape of the potential energy surface in the vicinity of the bonding region, based only on local molecular geometric parameters, with minimal additional parameterization?
II. METHODS
All calculations performed in atomic units and model parameters, although themselves dimensionless, pertain to quantities expressed in atomic units. However, for reporting purposes, harmonic frequencies and anharmonicity parameters, ωe and ωexe, are converted to wavenumbers using the conversion factor 1 a.u. = 219 475 cm−1. Similarly, dissociation energies, De, are reported through the more chemically familiar units of kJ/mol (1 Eh = 2625.50 kJ/mol), and bond force constants are converted to millidyne per Ångstrom (1 = 15.5691 mdyn/Å).
III. RESULTS AND DISCUSSION
A. Harmonic frequencies of gas-phase diatomics
Correlations between predicted and experimentally derived harmonic frequencies are illustrated in Fig. 1. Using our simple scaling approach, we were unable to find truly universal relationships between harmonic frequencies and bond lengths, with different sets of parameters required for hydrides vs all other molecules.
Correlation between predicted and experimental harmonic vibrational frequencies across a dataset comprising 41 ground-state hydrides (+), 77 excited-state hydrides (+), 161 other ground-state diatomics (⋄), and 332 other excited-state diatomics (⋄). Only ground-state data were used in the model-fitting process.
Correlation between predicted and experimental harmonic vibrational frequencies across a dataset comprising 41 ground-state hydrides (+), 77 excited-state hydrides (+), 161 other ground-state diatomics (⋄), and 332 other excited-state diatomics (⋄). Only ground-state data were used in the model-fitting process.
Although this appears to contradict literature reports of “near-universal” scaling relationships,15 closer inspection of the data presented in Ref. 15 and wider reading of the literature28 reveal that hydrides consistently obey a different power law relationship from other diatomics, reflecting hydrogen’s unique position on the Periodic Table as the only atom that does not possess core electrons29,30 and whose sole valence electron is both always involved in and exclusively responsible for bonding interactions.30,31
This reveals a fundamental limitation of simple scaling approaches that only account for electronic effects via geometric changes. Hydrides, in particular, show unique changes in electron density distributions upon bond formation, displaying strong charge contraction toward the hydrogen atom nucleus that is not observed for other atoms.29,31 The distinction between hydrides and other diatomics disappears when row- and column-information is incorporated via Nalewajski’s effective nuclear charge model. However, despite containing this extra information, the Nalewajski model proves to be less accurate than our simpler reduced-nuclear-charge scaling approach (Table I). Adjusting the group-dependent parameter, δ, for transition metals does not afford any improvement in accuracy, so we recommend setting δ = 0 for all groups except chalcogens and halides.
Mean absolute errors (MAEs) and mean unsigned percentage errors (%) in the predicted harmonic frequencies, anharmonicity constants, and dissociation energies. GS = ground state; ES = excited state.
. | This work . | Nalewajski . | ||||||
---|---|---|---|---|---|---|---|---|
. | MAE . | % . | MAE . | % . | ||||
. | GS . | ES . | GS . | ES . | GS . | ES . | GS . | ES . |
ωe (cm−1) | 92 | 121 | 11 | 16 | 126 | 161 | 15 | 22 |
ωexe (cm−1) | 1.2 | 4.9 | 16 | 33 | 2.0 | 5.4 | 32 | 34 |
De (kJ mol−1) | 98 | ⋯ | 27 | ⋯ | 124 | ⋯ | 37 | ⋯ |
. | This work . | Nalewajski . | ||||||
---|---|---|---|---|---|---|---|---|
. | MAE . | % . | MAE . | % . | ||||
. | GS . | ES . | GS . | ES . | GS . | ES . | GS . | ES . |
ωe (cm−1) | 92 | 121 | 11 | 16 | 126 | 161 | 15 | 22 |
ωexe (cm−1) | 1.2 | 4.9 | 16 | 33 | 2.0 | 5.4 | 32 | 34 |
De (kJ mol−1) | 98 | ⋯ | 27 | ⋯ | 124 | ⋯ | 37 | ⋯ |
B. Anharmonicity parameters for gas-phase diatomics
Additional similarities and differences between models become apparent when predicting anharmonicity parameters (Fig. 2). In both cases, center-of-charge information is required to accurately predict ground-state anharmonicity constants, a departure from the simple product ansatz used by Nalewajski to predict harmonic force constants. This center-of-effective-charge model is also no longer universal, requiring different parameters for molecular hydrogen (C′ = 1) from all other diatomics .
Correlation between predicted and experimental anharmonicity constants across a dataset comprising 41 ground-state hydrides (+), 77 excited-state hydrides (+), 161 other ground-state diatomics (⋄), and 332 other excited-state diatomics (⋄). Only ground-state data were used in the model-fitting process. Selected excited-state outliers are labeled by molecule and state; */**/*** = first/second/third excited states of the same spin-symmetry as the ground state; # = higher excited states and/or states of different spin-symmetry from the ground state.
Correlation between predicted and experimental anharmonicity constants across a dataset comprising 41 ground-state hydrides (+), 77 excited-state hydrides (+), 161 other ground-state diatomics (⋄), and 332 other excited-state diatomics (⋄). Only ground-state data were used in the model-fitting process. Selected excited-state outliers are labeled by molecule and state; */**/*** = first/second/third excited states of the same spin-symmetry as the ground state; # = higher excited states and/or states of different spin-symmetry from the ground state.
Despite requiring different parameterizations for different data subsets, both models display similar accuracies (Table I). They both seem to struggle with predicting excited-state anharmonicities, but closer inspection of the outliers reveals that these correspond to excited states whose potential energy curves are significantly distorted by strong electronic coupling to other excited states. For example, the first excited 1 state of H2 displays a double minimum arising from an avoided crossing between the (1σg) (2 s) and configurations.32 Similarly, the A1Π (AlH*) and C1Σ+ (AlH***) states of aluminum hydride are strongly multireference in character, displaying a shallow pre-dissociative minimum and double-minimum character, respectively.33 Although relatively low-lying excited states of hydrides comprise the majority of outliers, highly excited states of H2, C2, and N2 also feature primarily because the excited states of these molecules have been extensively characterized.26
From a theoretical perspective, this mismatch between predicted and observed anharmonicities (Δωexe > 50 cm−1) provides a clear indicator that multireference effects and adiabatic couplings must be taken into account when modeling affected excited states.34 From an experimental perspective, it provides an indicator that these states are not suitable for laser cooling schemes,35,36 in which electronic state crossings must be avoided.37
C. Dissociation energies of gas-phase diatomics
Neither model trialed in this work accurately predicts bond dissociation energies (Fig. 3). This finding is consistent with literature analyses which reason that “force constants alone cannot describe bond strengths completely as they lack information about the (electronic) reorganization processes occurring at other points on the potential-energy surface.”38 In other words, it is unreasonable to expect parameters that describe the shape of potential energy curves at or near equilibrium to fully characterize the entire potential energy curve. This is further supported by numerical evidence that at least three experimentally derived parameters are required to accurately recover relationships between spectroscopic constants that together characterize the entire potential energy curve (e.g., rotational and vibrational spectroscopic constants and bond dissociation energies).39
Correlation between predicted and experimentally derived dissociation energies, De, across a dataset comprising 40 ground-state hydrides (+) and 149 other ground-state diatomics (⋄). Reference De values were computed as .
Correlation between predicted and experimentally derived dissociation energies, De, across a dataset comprising 40 ground-state hydrides (+) and 149 other ground-state diatomics (⋄). Reference De values were computed as .
However, we do observe some trends in the nature of the outliers in Fig. 3(a). Our model systematically underestimates bond dissociation energies of species with strong ionic character (NaF, LiF, AlF, and BF) and overestimates bond dissociation energies of diffuse radical oxides (IO, BrO, and PtO). This suggests that systematic improvement of this model may be possible by incorporating electron density information, e.g., bond dipoles, atomic partial charges, or polarizability volumes. This information could perhaps be built into a generalized Morse type model,40,41 in which the long-range behavior of the potential is tuned using an additional third parameter. However, a recent review by Araújo and Ballester suggests that this may be more challenging than it sounds; they conclude that three-parameter potentials cannot accurately reproduce full potential energy curves, even if all three parameters are fully adjustable on a case-by-case basis.23
Empirically derived models for predicting diatomic harmonic frequencies (ωe), anharmonicity parameters (ωexe), and dissociation energies (De) from equilibrium bond lengths (Re), reduced nuclear charges , and reduced masses . All quantities are expressed in atomic units.
. | Hydrides . | Other . |
---|---|---|
μωexe = | ||
De = |
. | Hydrides . | Other . |
---|---|---|
μωexe = | ||
De = |
The extended Nalewajski model, on the other hand, does not generally obey Eq. (24). Although it requires fewer adjustable parameters to model harmonic frequencies and bond force constants, it does not readily generalize to other spectroscopic parameters such as anharmonicity constants or dissociation energies in an internally consistent manner. Therefore, our simple center-of-nuclear-charge model will be used exclusively for the remainder of this work.
D. Bimetallic diatomics-in-complexes
Predicting spectroscopic constants of bimetallic systems ab initio is challenging because of their complicated electronic structure.42,43 However, geometric parameters are generally much less sensitive at the theory level44–46 and are often available experimentally,47,48 which makes predicting force constants from bond lengths a particularly attractive option for these systems. Crystallographic bond lengths and IR frequencies for 100 bimetallics-in-complexes were compiled from the literature,49 along with ten additional sets of gas-phase data. These data are available in the supplementary material.
Our model appears to systematically overestimate harmonic frequencies for weakly bound ( 200 cm−1) bimetallics-in-complexes (Fig. 4), but this is most likely due to crystal packing effects allowing metal atoms to get closer together than they might in the gas phase, especially for complexes with a formal bond order of zero between the metal atoms. In other words, for these same bimetallic bonds to form at these distances in the gas phase, the intrinsic bond strength would need to be greater, as predicted.
Correlation between predicted and experimental harmonic vibrational frequencies for (×) ten gas-phase bimetallic diatomics and (⋄) 100 bimetallic diatomics-in-molecules.
Correlation between predicted and experimental harmonic vibrational frequencies for (×) ten gas-phase bimetallic diatomics and (⋄) 100 bimetallic diatomics-in-molecules.
The two obvious outliers (Δωe > 150 cm−1) in Fig. 4 are the gas-phase diatomics Fe2 and Cr2. It is known that dynamically correlated multireference electronic structure methods are required to appropriately model the multiple near-degenerate electronic configurations that comprise the overall equilibrium ground-state wavefunction of Cr2 and predict harmonic frequencies to even the “ballpark” accuracy (±160 cm−1).50,51 DFT models are less accurate again; commonly used generalized-gradient-approximation (GGA) and hybrid-GGA functionals overestimate ωe by a factor of 1.5–2 (240–480 cm−1), while underestimating bond lengths by 0.03–0.12 Å.52 They also struggle to describe the “shelf” shape of the potential energy curve just past equilibrium, which arises from electronic coupling between ground and excited states.50,53
Equilibrium properties of Fe2 are perhaps even harder to model due to spin-state near-degeneracies, with strong static and dynamic correlation effects throughout.54 The extended active-space CASPT255 or multireference configuration interaction models54 with near-complete atomic orbital basis sets that are required to ascertain spin-state ordering (nonet ground state) and obtain continuous potential energy curves are accurate enough to predict harmonic frequencies to within 25 cm−1 of their experimental values. DFT methods, on the other hand, incorrectly predict a low-spin (septet) ground state.52,54–56 Comparisons between DFT-computed “ground-state” properties and experimental values are therefore misleading, and they result in large (100 cm−1) errors in the predicted harmonic frequencies regardless of the chosen functional.52,56
These case studies again illustrate how our simple empirical model can be used to identify strongly multireference systems, provided local harmonic bond stretching frequencies are available or can be derived experimentally. With that being said, these situations seem to occur less frequently than one may expect. For example, vibrational frequencies of Cr–Cr and Fe–Fe bonds within complexes can be predicted quite accurately despite the significant discrepancies observed in the gas phase. This may explain the success of density functional theories in predicting structures and properties of bimetallic transition-metal complexes,57–60 despite literature reports that such systems are particularly challenging for DFT models.52,61
E. Molecular bond force constants
Having established that our model is just as accurate for diatomics-in-complexes as it is for isolated diatomics (notwithstanding the crystal packing effects in weakly bound complexes), the next logical question to ask is whether it can be meaningfully applied to diatomics-in-molecules, i.e., “conventional” covalent bonds. Although not generally experimentally observable, local-mode bond force constants for diatomics-within-molecules can be inferred computationally62 or semi-experimentally63 using local mode vibrational theory.64 Further evidence for the conceptual utility of the “diatomics-within-molecules” approach lies in the remarkable success of simple, parameterized force fields for approximating energetically accessible regions of multidimensional potential energy hypersurfaces.3,4 Indeed, local mode vibrational theory63,64 may be used to furnish the required (harmonic) bond force constants, although internal coordinate redundancy makes mapping other local mode force constants to other force field parameters more complicated.65 Even so, sets of experimentally derived bond lengths and local-mode bond stretching force constants for gas-phase molecular species are hard to come by. We have therefore chosen to assess our model for consistency with the MM3 force field.27,66 To gain insight into parameterization uncertainties, we have used both the original MM3 parameters66 and a reparameterized “Morse” set developed for reactive force field applications.27
The first category of outliers, highlighted in green (Fig. 5), correspond to single bonds adjacent to neighboring double bonds, most commonly C–O bonds adjacent to carbonyl groups. These are not obviously due to incorrect bond length parameterization; the reported bond lengths in the MM3 and Morse-reparameterized force fields align closely with the expected literature values.67,68 Similarly, close agreement between original and reparameterized bond force constants implies robust and reproducible parameterization processes. However, it is likely that discrepancies between the predicted and parameterized bond force constants arise due to mismatching Re and ke values, which are obtained by independent linear averaging of values computed in a range of different molecular environments.27 However, as we and many others11–17,19,24,69,70 have demonstrated, there is clearly a non-linear relationship between bond lengths and bond force constants. This is consistent with previous observations that bond lengths and strengths are strongly environment-dependent and can vary substantially between molecules used in deriving parameters for each bond type. This is likely to be particularly problematic for bonds that may or may not be incorporated into delocalized π-bonding systems, i.e., single bonds adjacent to double bonds.27
Correlation between empirically predicted and (a) MM3 and (b) Morse-reparameterized harmonic bond force constants. Outliers are circled and labeled according to bond type, indicated by the starred bond on each chemical structure. Force constants for extreme outliers (Calkene–Nazoxy, Pb–Pb, and S–SH) are not shown here but reported explicitly in Table III and discussed separately in the main text.
Correlation between empirically predicted and (a) MM3 and (b) Morse-reparameterized harmonic bond force constants. Outliers are circled and labeled according to bond type, indicated by the starred bond on each chemical structure. Force constants for extreme outliers (Calkene–Nazoxy, Pb–Pb, and S–SH) are not shown here but reported explicitly in Table III and discussed separately in the main text.
A second category of outliers arises when comparing empirically predicted bond force constants to Morse-reparameterized reference values. These cases, highlighted in orange and red in Fig. 5(b), reflect deficiencies in the Morse model parameterization process. Specifically, harmonic bond force constants are not directly fitted to ab initio data but are rather inferred from a global model fit, in which the Morse model parameters α, De, and re are fitted directly and then ke is estimated as 2α2De from the fitted Morse curves. Although this appears to work well for most bond types, in some cases, obtaining the most accurate global fit clearly comes at the cost of accurately describing energetic changes near equilibrium.
Extreme deviations between predicted and parameterized force constants indicate serious parameterization errors (Table III). Cases in which the MM3 and Morse-reparameterized bond force constants agree but are very different from the empirically predicted values are due to errors in bond lengths, which then propagate into large errors in empirically predicted bond force constants. For example, the C–N bond length between an alkene and azoxy group is given as 0.827 Å, which is infeasibly short and well outside the reported range of 1.44–1.47 Å.71 Similarly, the MM3 equilibrium bond length for Pb–Pb bonds is 1.944 Å, which appears to be a typo; a value of 2.944 Å falls within the expected range of 2.696–3.260 Å.72 Our model can also identify errors and resolve discrepancies in parameterized bond force constants. For example, within the Morse-reparameterized force field, dithiol bond force constants are approximately an order of magnitude larger than those in the original MM3 force field and the values predicted by our empirical model. This can be traced back to errors in the underlying ab initio data used in parameterizing the Morse model.27
Harmonic force constants (mdyn/Å) for outliers not shown in Fig. 5.
Bond type . | . | . | . |
---|---|---|---|
Calkene–Nazoxy | 3.8 | 3.7 | 66.3 |
Pb–Pb | 2.1 | 0.9 | 7.7 |
S–SH | 3.9 | 24.6 | 3.4 |
Bond type . | . | . | . |
---|---|---|---|
Calkene–Nazoxy | 3.8 | 3.7 | 66.3 |
Pb–Pb | 2.1 | 0.9 | 7.7 |
S–SH | 3.9 | 24.6 | 3.4 |
Overall, our approach provides a useful “sanity check” for validating force field parameters. In addition, empirical scaling model errors are similar to force field parameterization uncertainties (Table IV), which suggests that our approach could be used to obtain force field parameters for molecules with unusual bonding patterns that are not well-described by existing force fields, e.g., many ligands in biomolecular simulations and organo-metallic and bimetallic complexes.
Mean absolute and percentage deviations between empirically predicted (emp.) and parameterized (original and Morse-reparameterized MM3) bond force constants in units of mdyn/Å (k2), mdyn/Å2 (k3), and mdyn/Å3 (k4).
Model 1 . | Model 2 . | Param. . | MAD . | % . |
---|---|---|---|---|
MM3 | Morse | ke = k2 | 1.2 | 18 |
MM3 | Emp. | ke = k2 | 1.6 | 29 |
Morse | Emp. | ke = k2 | 1.6 | 29 |
Morse | Emp. | −k3 | 10 | 56 |
Morse | Emp. | k4 | 24 | 31 |
Model 1 . | Model 2 . | Param. . | MAD . | % . |
---|---|---|---|---|
MM3 | Morse | ke = k2 | 1.2 | 18 |
MM3 | Emp. | ke = k2 | 1.6 | 29 |
Morse | Emp. | ke = k2 | 1.6 | 29 |
Morse | Emp. | −k3 | 10 | 56 |
Morse | Emp. | k4 | 24 | 31 |
Another advantage of our model is that it can be used to predict higher order bond force constants that are more challenging to parameterize (Fig. 6). The mean absolute percentage deviations between predicted and parameterized cubic bond force constants are significantly higher than their harmonic counterparts, although quartic force constants can be predicted to a similar degree of accuracy. The outlier profile for higher order bond force constants is broadly similar to the harmonic case, suggesting that “errors” in harmonic bond force constants propagate into higher order force constants. This is entirely expected; these parameters are not independent in either our empirical model or the reparameterized Morse approach. It is also promising in terms of developing our empirical scaling relationships into an accurate predictive model because resolving discrepancies between harmonic force constants will improve agreement among higher order force constants, as well.
Correlation between empirically predicted and Morse-reparameterized (a) cubic and (b) quartic bond force constants. Outliers circled in green and orange correspond to the bond types illustrated in Fig. 5.
Correlation between empirically predicted and Morse-reparameterized (a) cubic and (b) quartic bond force constants. Outliers circled in green and orange correspond to the bond types illustrated in Fig. 5.
IV. CONCLUSIONS AND FUTURE WORK
We have discovered and validated simple but powerful new relationships between bond lengths, strengths (harmonic frequencies and force constants), and anharmonicities (anharmonicity parameters and higher order force constants). Although not spectroscopically accurate for gas-phase diatomics and although large deviations between predicted and observed anharmonicity constants can also arise from strong electronic state couplings, most commonly between excited states, our model nonetheless provides a useful “sanity check” for computed and experimentally derived spectroscopic constants.
Our model can also be used to predict molecular force field parameters (bond force constants) from structural data (equilibrium bond lengths). This approach may even be more accurate than conventional force field parameterization in some cases because it is not subject to transferability error. It is also far less computationally intensive than parameterizing molecular force field terms de novo. However, further work is required to investigate discrepancies between predicted and parameterized bond force constants for single bonds adjacent to double bonds, on a molecule-by-molecule basis. Local-mode vibrational theory provides the ideal tool to generate the required reference harmonic bond force constants, in either a fully ab initio or semi-experimental manner.16
There is strong literature evidence to suggest that our approach may generalize to predicting angle-bending force constants; equations similar to Nalewajski’s have been shown to relate angle-bending force constants to geometric parameters for related classes of molecules,73–75 and we expect that these can be improved and generalized by incorporating center-of-nuclear-charge information. This is a particularly promising avenue for future research because angle-bending terms within molecular force fields are both more numerous and more tedious to parameterize than bond-stretching terms.
There is also literature evidence to suggest that environmental effects on covalent bond force constants can also be inferred from geometrically based models.76 The inverse linear relationship between bond lengths and harmonic stretching frequencies for hydrogen-bonded O–H bonds is very different from the inverse power law relationships derived in this work, but this reflects the different physics involved. However, it remains to be determined how such relationships generalize to other types of bonds and/or other types of intermolecular interactions.
SUPPLEMENTARY MATERIAL
Refer to the supplementary material for all benchmarking data used in this work, comprising over 600 sets of spectroscopic constants for diatomics, over 100 spectroscopically derived fundamental frequencies for bimetallic diatomics-in-complexes, and 180 parameterized bond force constants.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
D. L. Crittenden: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Resources (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article and its supplementary material.