In silico property prediction based on density functional theory (DFT) is increasingly performed for crystalline materials. Whether quantitative agreement with experiment can be achieved with current methods is often an unresolved question, and may require detailed examination of physical effects such as electron correlation, reciprocal space sampling, phonon anharmonicity, and nuclear quantum effects (NQE), among others. In this work, we attempt first-principles equation of state prediction for the crystalline materials ScF3 and CaZrF6, which are known to exhibit negative thermal expansion (NTE) over a broad temperature range. We develop neural network (NN) potentials for both ScF3 and CaZrF6 trained to extensive DFT data, and conduct direct molecular dynamics prediction of the equation(s) of state over a broad temperature/pressure range. The NN potentials serve as surrogates of the DFT Hamiltonian with enhanced computational efficiency allowing for simulations with larger supercells and inclusion of NQE utilizing path integral approaches. The conclusion of the study is mixed: while some equation of state behavior is predicted in semiquantitative agreement with experiment, the pressure-induced softening phenomenon observed for ScF3 is not captured in our simulations. We show that NQE have a moderate effect on NTE at low temperature but does not significantly contribute to equation of state predictions at increasing temperature. Overall, while the NN potentials are valuable for property prediction of these NTE (and related) materials, we infer that a higher level of electron correlation, beyond the generalized gradient approximation density functional employed here, is necessary for achieving quantitative agreement with experiment.
I. INTRODUCTION
In silico prediction of the stability and properties of inorganic materials has become possible on a broad scale due largely to density functional theory (DFT) and associated advancements in algorithms, software, and computer hardware.1 With exception, the majority of such efforts focus on crystalline materials, for which the lattice structure is either known or postulated, and entropic contributions to the free energy may be approximated or sometimes neglected. Example efforts include structure prediction,2 prediction of mechanical properties,3 discovery of battery, photovoltaic, or thermoelectric materials,4 stability prediction of alloy structures or polymorphs,5 or screening for stable, ternary inorganic materials.6 In this work, we attempt first-principles equation of state prediction for ScF3 and CaZrF6 inorganic crystals, which fall under the class of negative thermal expansion (NTE) materials.
Negative thermal expansion (NTE) materials are of significant interest due to both their unusual physics and their potential applications. Numerous applications require matching coefficients of thermal expansion between two or more materials, and the discovery of new NTE materials and elucidation of their behavior provides greater tunability in this design space. Examples of possible uses for NTE materials include fuel cells, mirrors in space telescopes, optics, thermoelectric materials, and more.7–9 The discovery of new NTE materials as well as a better fundamental understanding of documented NTE behavior are both important goals. Regarding the latter, predicting/rationalizing NTE behavior from first-principles is often challenging and may serve as a stringent test of solid-state theories, particularly those focused on incorporating phonon anharmonicity.
A variety of mechanisms can cause NTE, and thus it is observed in several classes of compounds, such as ferroelectric materials,10 metal organic frameworks (MOFs),11 Prussian Blue analog,12 and other open-framework materials.7,13 The work of Mary et al.14 on ZrW2O8 documenting NTE over a wide temperature range, launched interest into framework-type materials.15 Due to their simpler structure in comparison to ZrW2O8, ReO3-type materials are of interest for the fundamental understanding of NTE behavior in open framework structures.16–19 Although ReO3 itself only shows small NTE at low temperatures, Greve et al.19 found that ScF3, which possesses a ReO3-type structure, displays strong NTE from 10 K up to K with a coefficient of thermal expansion (CTE) of αL ∼ −14 ppm/K at 100 K. In most metal trifluorides, the rhombohedral phase is more stable than the cubic phase at low temperatures and elevated pressures, which leads to positive thermal expansion; in contrast, for ScF3, the cubic phase is more stable than the rhombohedral at low temperature and ambient pressure, which leads to NTE.19–21 The discovery of strong NTE in ScF3 has led to further research into its other properties, such as pressure-induced softening,22 which causes a material to become more compressible with increasing pressure, and methods for controlling its thermal expansion. Yang et al.23 observed a reduction of NTE upon the formation of ScF3 crystals with grain sizes of 80 nm. Similarly, Hu et al.24 synthesized ∼6 nm ScF3 crystals for which only positive thermal expansion was observed. Further strategies for tuning thermal expansion with ScF3 include the formation of solid solutions through isovalent cation substitution,25–29 redox intercalation of cations into the ScF3 A-sites,30 and the introduction of excess fluoride through aliovalent cation substitution.31,32 CaZrF6 is another open-framework material with ReO3-type structure that displays NTE, and exhibits a more negative CTE (αL ∼ −18 ppm/K at 100 K) than ScF3.33 The NTE behavior of CaZrF6 can be modulated in various ways; for example, incorporation of helium under high-pressure gas has been shown to create a defect perovskite (He2−x□x)(CaZrF6)34 and a stoichiometric hybrid perovskite [He2][CaZrF6],35 with different equations of state.
Computational and theoretical work on NTE materials may be categorized based on goals: The first category consists of theoretical work that seeks to rationalize and develop physical understanding of NTE phenomenon, often through simplified models. The second category is computational work that targets quantitative, in silico property prediction, utilizing, e.g., DFT or higher accuracy electronic structure methods. The present work falls into the latter category, as we target (although do not fully achieve) quantitative, equation of state prediction for ScF3 and CaZrF6 NTE materials utilizing first-principles, computational methods. There is of course synergy between these categories of efforts. In computational predictions, necessary choices about what “physics to include” should be made based on the insight gleaned from simplified models. On the other hand, state–of–the–art computational predictions are important for testing and validating the approximations made in simplified models, and providing benchmarks when experimental data are unavailable.
We first discuss theoretical efforts and simplified models, which have largely provided the physical understanding and insight into NTE behavior.18,36–40 The rigid unit vibrational mode (RUM) model is a representative approach belonging to this category and has provided mechanistic analysis into framework NTE materials.18,37,41–43 In the context of these materials, RUMs are low-frequency modes that engage in cooperative rotations of octahedral units; the motions associated with these modes involve minimal distortion of these octahedral geometries. The RUM model was initially applied to understand phase transitions in silicates and first applied to framework NTE materials with studies of ZrW2O8.42,44,45 In ScF3, the RUMs (along with quasi-RUMS, modes with wavevectors close to the RUMs) have been demonstrated in the literature to be the key modes for NTE in ScF3.46,47 Analysis of the phonon spectrum of CaZrF6 indicates that the RUM model is applicable for this system as well.48
Ab initio methods have been used for both mechanistic analysis and attempted quantitative prediction, and we summarize those studies focusing on ScF3 or CaZrF6. Several works have investigated NTE in ScF3 with DFT-based approaches, utilizing either vibrational free energy predictions or ab initio molecular dynamics (AIMD) simulations. Oba et al.49 evaluated quasiharmonic approximation (QHA)50 free energy expressions for ScF3, and found that QHA predicts qualitatively incorrect behavior. This is likely due to the fundamental importance of anharmonicity in NTE systems, requiring extensions beyond harmonic treatments.51 Methods that go beyond the QHA have been used to study ScF3, such as the self-consistent phonon (SCP) theory, which includes quartic anharmonicity,52,53 and the improved self-consistent (ISC) phonon theory, which additionally treats cubic anharmonicity.54 These procedures require parameterization of high-order force constants from first-principles calculations.55,56 In contrast, AIMD is straightforward and naturally takes into account anharmonicity. AIMD simulations have been conducted for ScF3, providing insight into its NTE behavior.51,57–59 Utilizing AIMD, Lazar, Bučko, and Hafner57 were able to reproduce the NTE effect over a temperature range of 200–800 K using a 2 × 2 × 2 supercell of the cubic phase. They demonstrated the pressure dependence of NTE in ScF3 and related this to the stability of the phase. Bocharov et al.58 investigated the CTE and dynamics of ScF3 at different supercell sizes. They found that at least a 4 × 4 × 4 supercell for ScF3 is needed to converge AIMD predictions with system size, due to the long wavelength phonons with negative Grüneisen parameters accounting primarily for NTE behavior.24,58 As CaZrF6 is a more complicated material, there are fewer ab initio studies in the literature.60,61 Gupta et al.60 used the QHA to study CaZrF6 and related systems, and again demonstrated the importance of anharmonicity for modeling thermal expansion effects.
While AIMD is a powerful technique, its high computational cost makes it of limited use for equation of state prediction on a general scale. As discussed, larger supercells are often necessary to avoid finite size effects associated with long-wavelength/low frequency phonons that make important contributions to NTE, and the computational expense of AIMD makes simulation of such supercells challenging.33 Additionally, nuclear quantum effects (NQE) may be important, requiring computationally expensive path integral simulations. As an example, consider the phonon spectra of CaZrF6 discussed by Hancock et al.33 The high frequency phonons typically have positive Grüneisen parameters (γ), whereas low frequency modes exhibit negative γ, the latter contributing to NTE. While a classical treatment is expected to work well for low frequency (e.g., cm−1) modes, quantization will be important for high frequency modes; a classical treatment will incorrectly assign energy equipartition to these high frequency modes and thus tend to underestimate NTE effects at lower temperatures.49 Combining path integral approaches with AIMD may often be computationally intractable for systems of interest. Alternatives include fitting molecular mechanics force fields to quantum mechanical data.22 For materials of increasing complexity, developing sufficiently accurate force fields is both a challenging and time consuming task.
Because of the discussed limitations, obtaining quantitatively accurate, in silico equation of state predictions may often require methods other than direct DFT-based AIMD. An emerging strategy is to use artificial neural networks (NNs) or other machine learning (ML) methods to act as surrogate Hamiltonians that exhibit DFT-level (or higher) accuracy with substantially enhanced computational efficiency.62,63 Machine learning (ML) methods have been increasingly adopted in the chemistry and materials science community to simulate materials with ab initio accuracy at orders of magnitude lower computational cost as compared with AIMD.64–66 As long as a training set of high-quality reference data is available, usually consisting of ab initio energies/forces for the system of interest, ML techniques such as neural networks, kernel methods, or other tools can be used to construct ML potentials.67 These ML potentials can then be used to run molecular dynamics (MD) or Monte Carlo simulations to predict physical properties of interest. In particular, ML models have been used to study a large variety of materials, such as silicon,68,69 various metal systems,70–72 zeolites,73 MOFs,74 and more.64,75 ML techniques have also been used to screen materials such as MOFs, zeolites, and perovskites for selected applications.76–78 In this work, we develop similar ML potentials for the NTE materials ScF3 and CaZrF6 to predict equations of state from direct MD simulations. These ML potentials allow for efficient MD simulations of large supercells, and additionally enable path integral molecular dynamics (PIMD) simulations that explicitly incorporate NQE.
In this work, we train NN potentials to a training set of DFT energy, force, and stress tensor calculations for ScF3 and CaZrF6 to enable direct MD prediction of the equations of state. We compare predictions for CTE, atomic displacement parameters (ADPs), the bulk modulus, and additional equation of state data against both experiment and prior theoretical predictions. Utilizing PIMD simulations, we additionally quantify the contribution of NQE to the equations of the state of these materials over a wide temperature range. Overall, the predictions from our NN-driven, MD simulations are generally in qualitative to semiquantitative agreement with experimental equation of state data. Our predictions underestimate the extent of negative thermal expansion, even when NQE are fully incorporated, and we speculate that this is likely due to deficiencies in the underlying density functional (training data). We find that NQE modulate the CTE by 50%–100% at low temperature (100 K) for both materials, which is qualitatively consistent with findings from previous studies.49 A particularly apparent discrepancy in our simulations is the missing pressure-induced softening effect for cubic ScF3, which is observed experimentally.22 Overall our study demonstrates both the utility and limitations of MD simulations utilizing NNs with underlying DFT-accuracy for equation of state predictions of NTE materials.
II. METHODS
A. Training set generation and model training
We first describe our procedure for constructing a training dataset of DFT energies, forces and stress tensors for ScF3, and CaZrF6. The training sets for both crystals were generated from ab initio geometry optimizations and AIMD simulations using the Quantum ESPRESSO package.79 We used the Atomic Simulation Environment (ASE) package80 to generate a 4 × 4 × 4 supercell for ScF3 and a 2 × 2 × 2 supercell for CaZrF6 from the cubic unit cells obtained from the Materials Project.1 The Perdew–Burke–Ernzerhof (PBE) density functional and projector augmented-wave (PAW) pseudopotentials were used for all calculations,81,82 with Γ-point sampling of the band structure in all cases. Suitable values for the plane wave cutoffs (kinetic energy/density) were determined from convergence tests of the stress tensor (which is typically harder to converge than energy/forces). These convergence tests are shown in Figs. S1 and S2, and the final cutoff values employed were 180/1152 Ry for the kinetic energy/density for ScF3 and 200/1200 Ry for the kinetic energy/density for CaZrF6.
AIMD simulations were conducted for both materials over a range of temperatures and pressures to generate training data. For ScF3, we ran numerous ∼1 ps NPT simulations of the 4 × 4 × 4 supercell over a temperature range of 300–1600 K and a pressure range of 0–800 MPa.83 The timestep was set to 20 atomic units (∼1 fs). The Parrinello-Rahman method was used for pressure-coupling, with the fictitious mass set to the default value in Quantum ESPRESSO, and velocity rescaling was used for the thermostat. The velocities were rescaled every five timesteps. This resulted in a training set of energies, forces, and stress tensors for ∼8500 ScF3 cubic-phase structures of different coordinates and lattice parameters. We additionally generated training data for the rhombohedral phase of ScF3 to explore how the inclusion of these training data altered the NN predictions (vide infra). We thus ran additional AIMD simulations for a 2 × 2 × 2 supercell of the rhombohedral conventional unit cell structure at 300 K and pressures ranging between 0 and 1000 MPa (the unit cell was obtained from the Materials Project1). This added training data for ∼1200 ScF3 rhombohedral structures of different coordinates and lattice parameters. For CaZrF6, training data were generated for the cubic phase only. AIMD simulations of 2 × 2 × 2 CaZrF6 supercells were run over a temperature range of 300–1400 K and a pressure range of 0–500 MPa. This resulted in a training set of energies, forces, and stress tensors for ∼6000 CaZrF6 cubic structures of different coordinates and lattice parameters.
The DeepMD architecture was utilized for the NN potentials,84,85 and was trained to the energies, forces, and stress tensors comprising the training data for each material. Within the DeepMD architecture, an initial descriptor network converts the local environment of each atom into a set of embeddings that obey translational, rotational, and permutational invariance, and a second network utilizes this embedding to predict the energy, forces, and virial of the system.84 DeepMD’s “se_2_a” descriptor was used for the embedding network, which incorporates both radial and angular information.86 Separate neural networks were trained for both ScF3 and CaZrF6. The descriptor deep neural network was made up of three hidden layers using 25, 50, and 100 neurons. Neighbors within 8 Å were included in the local environment for each atom. For ScF3, the fitting net consisted of three hidden layers with 240 neurons each; for CaZrF6, the fitting net was reduced to 40 neurons each in order to balance accuracy and computational cost. A multi-target loss function balancing energy, force, and stress loss between the reference data and the neural network was used in order to fit all three properties (see Wang et al.84 and the supplementary material). We benchmarked the choice in weights used for each term in the loss function to ensure our results were relatively insensitive; this is also included in the supplementary material. The initial learning rate for both neural networks was set to 1 × 10−3 and ended at 3.5 × 10−8, with 5000 decay steps. A 80:20 split was used to construct a training and validation set from the total dataset; the test set was assembled from classical MD simulations using the final neural networks (the settings used for the simulations are described in Sec. II B). Each model was trained for 106 steps for training and validation. The input files with all hyperparameters used to build the DeepMD models are included in the supplementary material.
Neural network potentials may become unstable during simulations, since spurious forces will be predicted if the system drifts far outside of the configuration space included in the training set.87 Our initial AIMD simulations were only 1 ps in length, which does not allow for sufficiently exploring all regions of phase space of interest in this work. We observed that the initially trained neural networks for both materials were unstable for high temperature (1000 K) classical MD simulations after ∼10 ps. To fix this initial instability, we added additional training data consisting of Quantum ESPRESSO computed energies, forces, and stress tensors for structures taken from the NN simulation snapshots. The DeepMD NN potential was then re-trained to the expanded training set. This procedure was done iteratively until there were no observed instabilities while running MD with the NN potentials. In total, additional training data for ∼1000 structures was added to the original training set for both materials within this iterative procedure. The NN simulation procedure is described in Sec. II B (see the classical simulation details).
The final NN potentials were then tested as follows: MD simulations using these NN potentials were performed over a temperature range of 300–1000 K and a pressure range of 0–300 MPa. A 4 × 4 × 4 supercell was used for the ScF3 simulations and a 2 × 2 × 2 supercell was used for the CaZrF6 simulations. Quantum ESPRESSO was used to compute the DFT energy, forces, and stress tensor for ∼300 snapshots from these simulations. A comparison of the predicted neural network energy vs DFT energy and forces on the test set for both materials is shown in Figs. S4 and S5 of the supplementary material. For ScF3, the energy mean absolute error (MAE) is 0.042 eV (0.1 meV/atom), and the forces MAE is 0.02 eV/Å; to test the accuracy of the stress tensor, we computed the internal pressure from the stress tensor from both DFT and the neural network. The MAE is 6.6 × 10−5 eV/Å3 (∼10 MPa). For CaZrF6, the energy MAE is slightly worse at 0.18 eV (0.7 meV/atom); however, as can be seen in Fig. S4, there is little scatter in the predicted energies. The forces MAE is 0.033 eV/Å and the pressure MAE is 7.2 × 10−5 eV/Å3 (∼11 MPa).
B. Neural network-driven, molecular dynamics simulations
We perform MD simulations with the final NN potentials to predict equations of state for the ScF3 and CaZrF6 materials. Both classical MD and path integral PIMD simulations were run in order to examine the impact of NQE; we explicitly denote which predictions correspond to each simulation type when discussing the results. All simulations were conducted using ASE with the DeepMD calculator.80,84 The DeepMD architecture allows for simulations of supercells of arbitrary size, since total energy of the system is represented as a sum of atomic energies. The majority of our calculations were performed on 5 × 5 × 5 supercells of ScF3 and 3 × 3 × 3 supercells of CaZrF6 (both in the cubic phase). As noted by Dove and Fang,43 simulations of ScF3 with an odd number of unit cells will not entirely capture phonons associated with tilt modes along the M-R branch, which may be important to include. However, we have verified that predictions from our chosen supercells are largely converged with respect to system size, and choice of odd/even number of unit cells. In Fig. S8, simulation predictions of larger supercells 6 × 6 × 6 and 7 × 7 × 7 for ScF3 indicate that predictions from the smaller/odd numbered 5 × 5 × 5 ScF3 supercell are largely converged. Similarly, in Fig. S15, simulation predictions of the larger (and even numbered) 4 × 4 × 4 CaZrF6 supercell are very close to the predictions from the smaller (and odd numbered) 3 × 3 × 3 CaZrF6 supercell. Another factor for our specific choice of supercells (5 × 5 × 5 supercell for ScF3 and 3 × 3 × 3 supercell for CaZrF6) is based on preliminary simulations that employed a neural network trained to a dispersion-corrected density functional; these preliminary predictions are shown in Fig. S17, and show dramatic instabilities for systems of even numbered unit cells. This is discussed in more detail in Sec. III.
For computing the coefficient of thermal expansion (αL), a series of simulations were run in the NPT ensemble using the isotropic Berendsen barostat.88 A Berendsen thermostat was used for the temperature coupling. The simulations were run for ∼4–5 ns at a pressure of 0 MPa and temperatures ranging from 100–1200 K. We benchmark a different thermostat/barostat choice in the supplementary material Fig. S6 using the i-Pi package (Langevin thermostat/Bussi-Zykova-Parrinello barostat);89 we observe converged results with our predictions. A time step of 1.0 fs was used with a barostat time constant of 1.0 ps and a thermostat coupling constant of 1.0 ps. NPT simulations were conducted to compute pressure vs volume curves for ScF3, starting from the 5 × 5 × 5 cubic supercell geometry. These simulations were run in the i-Pi package interfaced with ASE. Each simulation was run for 1–2 ns and performed over a pressure range of 0–500 MPa and temperature range of 55–240 K. A Langevin thermostat was used along with the Martyna-Tobias-Klein (MTK) barostat.90 A time constant of 1 ps was used for both the thermostat and barostat, and the time step was set to 1.0 fs.
Similar NPT simulations were run for CaZrF6. Three main sets of properties were calculated for CaZrF6. Thermal expansion was investigated between 100 and 1000 K at 0 MPa, similar to the ScF3 simulations. Then, volume vs pressure curves were obtained from a set of simulations run at a temperature of 290 K and pressures from 0 to 300 MPa. Finally, the bulk modulus of CaZrF6 was computed over a temperature range of 300–500 K. A set of three simulations from 0 to 200 MPa was performed for each temperature. A linear fit to the average volume vs pressure data from these simulations was used to compute the bulk modulus at each temperature. All CaZrF6 simulations were run for ∼1–2 ns. The Berendsen barostat with a 1.0 ps time constant was used for all simulations.
The i-Pi software package interfaced with ASE was utilized to run the path integral MD simulations.91,92 Isotropic path integral NPT simulations were run for both systems. Simulations were run over a temperature range of 100–1200 K for cubic ScF3 and a temperature range of 100–1000 K for cubic CaZrF6, and the pressure was fixed to 0 MPa. A smaller time step of 0.5 fs was used here. The barostat implementation details can be found in Bussi, Zykova-Timan, and Parrinello89 and Ceriotti, More, and Manolopoulos91 A time constant of 0.2 ps was used for the barostat. A standard Langevin thermostat was used for temperature coupling, with a time constant of 0.2 ps. Depending on temperature, between 5 and 15 beads were utilized for the path integral simulations of both ScF3 and CaZrF6 (with greater number of beads used for lower temperatures). Convergence tests of the lattice parameter with respect to number of beads are shown in Figs. S9 and S16. Each PIMD simulation was run for 300–500 ps.
We briefly comment on the computational cost of our neural network vs Born-Oppenheimer MD simulations performed with Quantum ESPRESSO. For example, ten-time steps of NPT simulation of the 4 × 4 × 4 ScF3 supercell with Quantum ESPRESSO on 96 cores (4 Dual Intel Xeon Gold 6226 processors) took 14 667 s (4 h 4 min 27 s). With the neural network on a NVIDIA Tesla v100 GPU, a ten-time step simulation takes 1.86 s, which is about a 4 orders of magnitude difference in simulation time, demonstrating the utility of the approach for simulating larger crystals.
III. RESULTS
Our initial target is to investigate the extent to which the NN-driven MD simulations accurately predict the CTE for both ScF3 and CaZrF6. Because the NN potentials exhibit essentially DFT-level accuracy (Sec. II A), our results should be interpreted as the accuracy to which the underlying density functional (PBE) describes the material properties. In Fig. 1, we plot the predicted thermal expansion behavior for ScF3 and CaZrF6 as computed from both classical MD (labeled as “NN”) and PIMD simulations (labeled as “PI NN”) utilizing the NN potentials. We plot both the temperature-dependent reduced lattice constant (a/a0) and linear coefficient of thermal expansion: CTE = ; these are plotted in Figs. 1(a) and 1(b) for ScF3 and Figs. 1(c) and 1(d) for CaZrF6. In all cases, we plot corresponding experimental data and prior theoretical/computational predictions for comparison, where available. The curves are third order polynomials fit to the data points, with a0 taken as the extrapolated 0 K value for a from the fitted polynomial for each set of data.
Inspection of Fig. 1 indicates that the NN simulations indeed predict NTE for both systems, in qualitative agreement with experiment. Furthermore, the simulations correctly predict the experimental trend that the CTE is more negative for CaZrF6 than ScF3. Experimentally, the CTE values are α100K,L = −18 ppm/K for CaZrF6 and α100K,L = −14 ppm/K for ScF3 at 100 K. For comparison, the classical MD (“NN”) simulations predict a CTE of α100K,L = −9.8 ppm/K for CaZrF6 and α100K,L = −4.3 ppm/K for ScF3 at 100 K. Incorporation of NQE within the PIMD simulations (“PI NN”) brings the CTE values into closer agreement with experiment, with these simulations predicting α100K,L = −14.3 ppm/K for CaZrF6 and α100K,L = −8.7 ppm/K for ScF3 at 100 K. Thus NQE lower the CTE by ∼50% for CaZrF6 and 100% for ScF3 at 100 K, while the influence of NQE diminishes at higher temperatures and largely disappears by temperatures of 700–800 K. Overall, these results indicate that the MD simulations driven by DFT-trained NNs predict NTE behavior of ScF3 and CaZrF6 in semiquantitative to qualitative agreement with experiment.
Our CTE predictions for CaZrF6 are comparatively better than for ScF3 in that there is closer quantitative agreement with experiment. For CaZrF6 the predicted CTE at 100 K is within a factor of 1.3 of the experimental value when incorporating NQE with the PIMD simulations (and a factor of 1.8 without NQE). There is no obvious reason for why the predictions for CaZrF6 agree somewhat better with experiment compared with the predictions for ScF3 (although as discussed later, the pressure-induced softening effect for ScF3 is also not reproduced by our simulations). The influence of NQE on the CTE is consistent with the trend expected from analysis of the phonon spectrum. For both CaZrF6 and ScF3 materials, low frequency phonon modes exhibit negative Grüneisen parameters associated with NTE, while higher frequency phonons exhibit positive Grüneisen parameters.33,93 A classical treatment will incorrectly assign energy equipartition to these high frequency modes and overpredict their contribution to (positive) thermal expansion at low temperatures.49 Further inconsistencies of our simulation predictions with the experimental NTE behavior are predicted transitions from negative to positive thermal expansion at lower temperatures than those experimentally measured. As seen in Fig. 1, our simulations for ScF3 predict positive thermal expansion above temperatures of 600/700 K for MD/PIMD, respectively, while the experimental crossover temperature is near 1100 K. For CaZrF6, a similar effect is observed, with the transition occurring at 900 K in both the classical MD and PIMD predictions, while solely NTE is experimentally observed over the full characterized temperature range.
We discuss possible reasons for the quantitative discrepancies between our thermal expansion predictions and the experimental data. Important considerations for ab initio equation of state prediction for these materials include (1) phonon anharmonicity; (2) NQE; (3) finite-size effects; and (4) accuracy of the underlying density functional. Regarding (1), predictions from MD simulations explicitly incorporate full anharmonicity of phonon modes, in contrast to theoretical free energy models (vide infra). For (2), we have explicitly evaluated the influence of NQE, and NQE does not account for the remaining discrepancy with experiment; Figs. S9 and S16 show that NQE is largely converged with respect to the number of beads used in the PIMD simulations. Regarding (3) finite-size effects, we have tested convergence of our predictions with respect to simulated supercell size. Figures S8 and S15 demonstrate the convergence of CTE predictions for ScF3 and CaZrF6 using our simulation systems with respect to varying supercell sizes. The remaining consideration is the accuracy of the underlying density functional, which is PBE in this study. In this regard, our predictions for ScF3 thermal expansion are similar to the AIMD results from Bocharov et al.58 that utilize the PBEsol functional (Fig. S10), although predictions with PBEsol show a higher temperature for transition from negative to positive thermal expansion, in better agreement with experiment. Dispersion corrected functionals would potentially be expected to yield better physical predictions. In fact, these were the initial functionals chosen for this study, but our initial NN models trained to dispersion corrected functionals all proved to exhibit dramatic instabilities when utilized in MD simulations (see supplementary material); similar issues have been noted before.94 An interesting direction for future work would be to examine an improved treatment of electron correlation through methods such as the random-phase approximation (RPA),94–96 which may be achieved by either completely rebuilding the neural network/training set or possibly through a transfer learning procedure.97
Despite the quantitative errors, our NN-driven MD simulations represent the “state–of–the–art” in ab initio, equation of state predictions for ScF3 and CaZrF6. Figure 1(b) compares alternative theoretical predictions for the CTE of ScF3. Specifically, we compare to predictions from the SCP and ISC phonon theories from Oba et al.49 The SCP theory goes beyond the quasiharmonic approximation (QHA) by incorporating quartic anharmonicity, and the ISC incorporates cubic anharmonicity. The predictions of SCP and ISC theories are based on quantum mechanical free energy functions, and thus explicitly take into account NQEs.49 Furthermore classical limits can be derived within the SCP framework, and we label such results from Oba et al.49 as “Classical SCP” in Fig. 1(b). As indicated in Fig. 1(b), the shape of the classical SCP CTE curve looks similar to our classical MD and PIMD results; however, the predictions exhibit significant quantitative deviation from the experiment as they display essentially no NTE over the reported temperature range. Comparing the classical and quantum SCP results indicates a very similar contribution of NQE as predicted by our classical MD/PIMD simulations. Of the theoretical models, the ISC predictions exhibit the best quantitative agreement with experiment and show a similar temperature-dependent trend for CTE as compared with experiment. Overall, the ISC predictions and our “PI NN” predictions for CTE exhibit similar quantitative accuracy as compared with experiment. This indicates that including cubic and quartic anharmonicity within the ISC theory provides similar predictive accuracy as PIMD simulations for ScF3 and similar materials. The methods should be viewed as complementary; PIMD provides a straightforward approach for property predictions utilizing standard DFT machinery (energy, forces, and stress tensor), while the ISC phonon theory provides enhanced physical understanding of anharmonicity contributions to the equation of state behavior.49
We next compare atomic displacement parameters (ADPs), which measure the mean-square displacement of an atom from its crystal lattice position.98 There are six unique components to the anisotropic atomic displacement tensor (three diagonal and three off-diagonal).98 The mean-square displacement of each atom is straightforward to compute from simulations, e.g., , where u is an instantaneous atomic displacement from its mean position and x is a Cartesian coordinate. These can be compared with corresponding experimental values as measured by neutron or X-ray diffraction.98 Within the ScF3 crystal lattice, the Sc atoms undergo isotropic displacement, Uiso = U11 = U22 = U33;22 for F atoms, the displacement parameters corresponding to transverse motion (U11 = U22) are different from the displacement parameter corresponding to longitudinal motion (U33). In Fig. 2, we compare the atomic displacement parameters for ScF3 as predicted by our simulations to experimental values.19 We first focus on the scandium atoms [Fig. 2(a)]. At low temperatures, the agreement between simulation and experiment is quite good. At higher temperatures, the simulations predict somewhat larger atomic displacement parameters for Sc than what is observed in the experiment. For the fluorine atoms [Fig. 2(b)], there is very good agreement between the predicted and experimental ADPs over the full temperature range. The transverse U11 parameter for fluorine is much larger, with a stronger temperature dependence compared with the longitudinal U33 parameter. As discussed previously,22 this is the expected behavior for the cubic crystal structure. We note that there is no significant difference between the “NN” and “PI NN” results for all predicted ADP values, indicating a negligible influence of NQE.
The ADPs for CaZrF6, shown in Fig. 3, display similarly good agreement with the experiment.33 In this case, Uiso = U11 = U22 = U33 for both Ca and Zr. The fluorine ADPs are similar to those in ScF3, with two ADPs corresponding to transverse motion and one ADP corresponding to longitudinal motion. The fluorine ADPs within CaZrF6 [Fig. 3(c)] are somewhat larger than the corresponding ADPs in ScF3(note the different temperature scales of Figs. 2 and 3); this correlates with the more substantial NTE in CaZrF6 as compared with ScF3. The agreement between the Ca and Zr ADPs in Figs. 3(a) and 3(b) between simulation and experiment is good, although there is some slight disagreement at 100 K for Ca [Fig. 3(a)]. Analogous to ScF3 there is no significant difference between the classical and path integral ADP predictions for CaZrF6, indicating negligible influence of NQE.
There are three fitted quantities in Eq. (1): B0 is the value of the bulk modulus at zero pressure, V0 is the value of the volume at zero pressure and B′ is the first derivative of the bulk modulus with respect to pressure. A negative value of B′ indicates pressure-induced softening. Figure 4(a) shows the pressure-volume curves from experiment,22 and Fig. 4(b) shows our simulation predictions for the cubic phase of ScF3. Each dataset was subsequently fit to Eq. (1), with the fits corresponding to solid lines in Fig. 4 and the fitted parameters B0 and B′ plotted in Fig. 5. Note that here we only show predictions from classical MD, as we observed little change in the results with PIMD.
We first discuss the predictions from our simulations of the ScF3 cubic phase. As seen in Fig. 4(b), the Birch–Murnaghan equation of state fits our simulation data well; we show the same results plotted with the 0 K DFT result in Fig. S14. The fitted B0 values from the equation of state, shown in Fig. 5, fall within the range of those fit to the experimental data. The simulation predicted volumes are ∼3% smaller than the corresponding experimental volumes, which is largely due to errors in DFT predicted bond lengths. However, there are other non-trivial differences between our simulation curves and the experimental curves. The experimental P–V curves [Fig. 4(a)] display a significant curvature at higher pressure, corresponding to the reported pressure-induced softening effect.22 The pressure-induced softening observed experimentally is most apparent for the low temperature data. In contrast, there are no signatures of pressure-induced softening in our simulations of cubic phase ScF3. As shown in Fig. 4(b), the simulated P–V curves all exhibit a close to linear relationship between pressure and volume, even at lower temperatures. This is reflected in the value of B′ plotted in Fig. 5(b), in which simulation values are all close to 0, reflecting no pressure-induced softening; in contrast, the experimental values are all significantly negative. The high pressure bound of the experimental P–V data in Fig. 4(a) corresponds to a phase transition of ScF3 from the cubic to rhombohedral phase.19 It has thus been hypothesized that the pressure-induced softening effect is related to the proximity to this phase transition, and/or local fluctuations involving ScF6 octahedral rotations that resemble motifs of the rhombohedral phase.22 We thus speculated that the reason simulated P–V curves do not display pressure-induced softening was because the NN was trained solely to ScF3 cubic phase data, and does not “extrapolate” to such structural motifs. We hypothesized that by adding rhombohedral phase structures to the NN training set, the NN would “learn” about such rhombohedral-like, local ScF6 octahedral rotations, possibly improving agreement with the experiment. However, this hypothesis turned out to be false (at least in terms of improving agreement with the experiment). Upon adding significant training data encompassing rhombohedral structures, and retraining the neural network, the simulation predictions were essentially unchanged (Figs. S12 and S13).
At this point, we can only speculate on the discrepancy with the experiment. The simulations of the ScF3 cubic phase may not be capturing local fluctuations involving ScF6 octahedral rotations that resemble motifs of the rhombohedral phase, and are an important mechanism for pressure induced softening. Of course, the (unknown) phase diagram of the DFT Hamiltonian is likely quantitatively different than the physical/experimental phase diagram of ScF3. If indeed the pressure-induced softening is related to close proximity to the phase-transition, then any discrepancy between DFT and experimental phase behavior would affect predictions of this phenomena. For example, if the cubic to rhombohedral transition occurred at much higher pressures on the DFT phase diagram, then the simulated pressures in Fig. 4(b) may be relatively far from the phase-transition, possibly explaining the lack of pressure-induced softening. This is entirely speculative, and would require characterization of the DFT-predicted phase behavior; however, such free energy calculations are beyond the scope of the present work. We note that previous MD simulations utilizing simple bond/angle potentials have predicted pressure-induced softening, in good qualitative agreement with experiment.22 However, the CTE predicted by these simulations exhibited significant quantitative error, in contrast to our present ab initio predictions.
We next discuss equation of state and bulk modulus predictions for CaZrF6. In Fig. 6, we show experimental volume vs pressure data for CaZrF6 at 290 K from Hester et al.,34 as well as our corresponding predictions from the NN-driven MD simulations (only classical MD results are shown, as NQE are minor here). There is very good agreement in the P–V trend as predicted by simulation compared with the experiment. The absolute unit cell volumes from the NN simulations differ from the experimental values by about 3%, again due to corresponding error in DFT predicted bond lengths. Additionally, it is seen that there is essentially no pressure-induced softening for CaZrF6 over this temperature/pressure range. In Fig. 7, we compare simulation predictions of CaZrF6 bulk modulus to experimental data over a wide temperature range.33 At STP, the predicted bulk modulus is 39 GPa, which is in excellent agreement with the experimental value of ∼37 GPa. The experimental bulk modulus indicates some thermal softening at higher temperatures, and our NN predictions follow a similar trend. Overall, the simulation predictions are in good agreement with experiment over the full temperature range.
IV. CONCLUSION
We have demonstrated the accuracy attainable with DFT-trained NNs for predicting equation of state properties of the materials ScF3 and CaZrF6. Our benchmarks have covered a representative set of properties for these systems, including the coefficient of thermal expansion, atomic displacement parameters, equations of state, and the bulk modulus. Our NN-driven, MD simulation predictions largely follow the experimentally observed trends. The advantage of the NN potentials is that their development/parameterization is straightforward utilizing modern software libraries,84,85 and sufficient training data are generated from relatively short AIMD simulations. The enhanced computational efficiency of the NNs enable longer simulations of larger supercells compared with AIMD, and additionally allows application of path integral approaches for incorporating NQE.
Despite the state–of–the–art computational approach, we do not achieve the goal of quantitative accuracy in the equation of state predictions across the full temperature/pressure regime. In general, the NN-driven MD predictions agree better with experiment for pressure-volume properties such as the bulk modulus in comparison to CTE, although we are unable to reproduce the pressure-induced softening effect in ScF3. We have quantified the significant contribution of NQE to NTE at low temperatures; in general, classical simulations underpredict NQE in ScF3 and CaZrF6 at low temperature due to unphysical equipartition in high-frequency phonon modes with positive Grüneisen parameters. The NN potentials correctly predict that CaZrF6 undergoes NTE to a greater extent than ScF3, and overall the predictions exhibit semiquantitative to qualitative agreement with the experiment. However, even with PIMD, the CTE is underestimated in magnitude in comparison to the experimental values, showing that there is other missing physics. In addition, the predicted turnover from negative to positive thermal expansion occurs at too low a temperature for both materials. To improve the predictions, it is likely necessary to go beyond the generalized gradient approximation (GGA) density functional description of electron correlation, e.g., utilizing more accurate RPA or many-body methods.99 An interesting direction for future work would be to compare predictions from a NN potential trained to a database generated from a higher-level of ab initio theory, such as RPA, to investigate whether better agreement with experiment could be attained. This may be most easily achieved through “transfer learning” of the present NN potentials to higher accuracy training data.97 A similar procedure has been shown to work well for material property prediction.100,101
Our approach has utility for further study of NTE materials, despite the noted deficiencies in quantitative predictions. Concerning future outlook, our workflow based on NN potentials provides the opportunity to investigate new materials for pronounced NTE. The results here show that trends in NTE between different compounds are correctly predicted, although this will require further testing and validation. A database of the results and structures from these NN simulations would help identify and analyze the important features necessary for strong NTE, which could also guide experiment. Additionally, it is possible to extend our simulations in order to incorporate defects or impurities into the framework, which has been done experimentally in order to tune NTE behavior.25–27,35,38 Through these efforts, we anticipate that NN-driven, MD simulations will continue to play an increasingly important role in “first-principles” materials property predictions.
SUPPLEMENTARY MATERIAL
See the supplementary material for benchmarks for the kinetic energy cutoff used in the DFT calculations vs internal pressure, a description of the loss function used, benchmarks for the neural network predicted energies and forces vs DFT, benchmarks for the barostat used in this paper vs other barostats, the ScF3 NTE curves calculated with different supercell sizes, a comparison of the ScF3 NTE curve vs AIMD results, ScF3 lattice constant values for classical vs path integral MD, ScF3 pressure-volume equations of state, CaZrF6 NTE curves from simulations of different supercell sizes, CaZrF6 lattice constant values for classical vs path integral MD and ScF3 NN results for various properties when trained with various dispersion corrections. The input files with all hyperparameters used to build the DeepMD models are included as download “deepmd_train_files.zip.” Additionally, all numerical data for manuscript figures are included as download “figures_data.zip.”
ACKNOWLEDGMENTS
A.P.W. acknowledges financial support under NSF Grant No. DMR-2002739. This work used the Hive cluster, which is supported by the National Science Foundation under Grant No. 1828187. This research was supported in part through research cyberinfrastructure resources and services provided by the Partnership for an Advanced Computing Environment (PACE) at the Georgia Institute of Technology, Atlanta, Georgia, USA. We wish to thank an anonymous reviewer for pointing out the fact that simulations with odd number of unit cells for ScF3 may not entirely capture phonons associated with tilt modes along the M-R branch.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
John P. Stoppelman: Data curation (lead); Formal analysis (equal); Investigation (lead); Methodology (equal); Software (lead); Validation (lead); Visualization (lead); Writing – original draft (lead). Angus P. Wilkinson: Conceptualization (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal). Jesse G. McDaniel: Conceptualization (equal); Formal analysis (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.