Machine learning interatomic potentials have emerged as a powerful tool for bypassing the spatiotemporal limitations of ab initio simulations, but major challenges remain in their efficient parameterization. We present AL4GAP, an ensemble active learning software workflow for generating multicomposition Gaussian approximation potentials (GAP) for arbitrary molten salt mixtures. The workflow capabilities include: (1) setting up user-defined combinatorial chemical spaces of charge neutral molten mixtures spanning 11 cations (Li, Na, K, Rb, Cs, Mg, Ca, Sr, Ba, and two heavy species, Nd and Th) and 4 anions (F, Cl, Br, and I), (2) configurational sampling using low-cost empirical parameterizations, (3) active learning for down-selecting configurational samples for single point density functional theory calculations at the level of the Strongly Constrained and Appropriately Normed (SCAN) exchange-correlation functional, and (4) Bayesian optimization for hyperparameter tuning of two-body and many-body GAP models. We apply the AL4GAP workflow to showcase high throughput generation of five independent GAP models for multicomposition binary-mixture melts, each of increasing complexity with respect to charge valency and electronic structure, namely: LiCl–KCl, NaCl–CaCl2, KCl–NdCl3, CaCl2–NdCl3, and KCl–ThCl4. Our results indicate that GAP models can accurately predict the structure of diverse molten salt mixtures with density functional theory (DFT)-SCAN accuracy, capturing the intermediate range ordering characteristic of the multivalent cationic melts.

Tremendous progress has been achieved in the past decade in developing data-driven surrogate models for learning molecular potential energy surfaces (PES).1–3 Data-driven models are functional forms with high dimensionality, typically containing between 10^4 and 10^5 parameters, which are machine-learned from ab initio training datasets. The high dimensionality of data-driven models allows for the capture of complex intermolecular potentials with ab initio accuracy that cannot be fit to low dimensional (∼10^2 parameters) empirical interatomic potentials.4 Furthermore, data-driven models can bypass the spatiotemporal limitations of density functional theory (DFT) based molecular dynamics (MD) simulations by dramatically reducing the associated computational cost of dynamical simulations. Many data-driven model architectures have been developed over the years, with some of the most relevant being the Behler-Parrinello neural network, Gaussian approximation potential (GAP), Spectral Neighbor Analysis Potential (SNAP), moment tensor potential (MTP), ANI, FCHL, SchNet, MBTR, DeepMD, linear atomic cluster expansion, and NequIP.4–15 Taken together, these classes of data-driven intermolecular potentials facilitate a new frontier for ab initio quality dynamical modeling at unprecedented spatiotemporal scales. The bottleneck for developing data-driven intermolecular potentials concerns the computational inefficiencies associated with diverse training database generation and model hyperparameter tuning.
Active learning (AL) has been proposed as an efficient data sampling heuristic and applied to select the most promising training samples from large unlabeled sample pools, which are typically generated from equilibrium MD simulations.16,17 However, equilibrium MD based active learning does not appropriately sample metastable and out-of-equilibrium training sub-regions, which are necessary for avoiding unphysical traps or numerical instability of the MD simulations.18,19 To overcome such limitations of Boltzmann sampling, a number of recent studies have employed active learning in combination with enhanced sampling techniques, such as metadynamics, random structure search, or the direct incorporation of experimentally measured metastable structures to improve the diversity of the training databases.20–23 Even with a diverse training database, model hyperparameters still need to be carefully tuned to arrive at a model that can provide accurate and numerically stable MD simulations. To this end, data-driven model hyperparameter tuning schemes employing Bayesian optimization, particle swarm, and genetic algorithms17,24,25 have been introduced to the community.
Molten salts represent an iconic use-case for machine learning interatomic potentials in which conventional modeling paradigms (e.g., empirical potentials, ab initio dynamics) cannot obtain the necessary balance of accuracy (highly polarizable salts) and accessible spatiotemporal scales (∼tens of nm, ns) necessary to estimate important physical properties. Molten salts have broad applications in concentrated solar power systems, liquid metal batteries, rare earth element (REE) production, and molten salt reactors,26–30 but often molten salt mixtures are required to obtain targeted physicochemical properties. Specifically, to lower the working temperature of molten salts, salts are typically combined to create mixtures of eutectic composition with low melting points.
The complexity of the molten salt mixtures in technological applications is further increased by the inclusion of multivalent ions and radioactive heavy elements. As modeling systems of this level of complexity require the explicit inclusion of many-body effects, such as polarization, that require a fully quantum mechanical treatment, there are, at present, no chemically generalizable approaches for predicting the liquid phase structure of complex molten salt mixtures at the spatiotemporal scales relevant for industrial applications.

In this article, we present an active learning software workflow with data-efficient sampling and hyperparameter tuning for multicomposition Gaussian approximation potentials (GAP) in mixtures of molten salts. This workflow is a culmination of a number of best practices we have gained through our past publications.17,18,21,31,32 We begin by providing a brief overview of the multicomposition active learning workflow for GAP, termed AL4GAP, with an emphasis on the Python classes of AL4GAP. The details of the specific methodologies can be found in our prior publications. The AL4GAP workflow currently supports arbitrary molten mixtures spanning 11 cations (Li, Na, K, Rb, Cs, Mg, Ca, Sr, Ba, and two heavy species, Nd and Th) and 4 anions (F, Cl, Br, and I). We provide an overview of the software framework built around this active learning workflow for sampling arbitrary multicomponent molten salt mixture melts. Each of the individual Python classes is subsequently detailed, with additional details provided in the GitHub repository (https://github.com/pythonpanda2/AL4GAP_JCP).33 In the results section, we showcase the utility of AL4GAP for the high-throughput generation of five independent GAP models for multicomposition binary-mixture melts, each of increasing complexity with respect to charge valency and electronic structure, namely: LiCl–KCl, NaCl–CaCl2, KCl–NdCl3, CaCl2–NdCl3, and KCl–ThCl4. The results section provides two different simulation scenarios for the developed potentials. The first scenario employs molten NaCl–CaCl2 as a model system to understand the liquid structure as a function of composition and temperature. For the second scenario, we discuss the chemical physics insights drawn from melt simulations for all five chemical systems with equal mixture composition at 1200 K.

The goal of the AL4GAP workflow is to accelerate the development of GAP models that use two-body squared-exponential and many-body smooth overlap of atomic positions (SOAP) kernel descriptors.34,35 This workflow builds upon our previous AL approach that consists of an unsupervised clustering algorithm combined with Bayesian optimization for on-the-fly hyperparameter tuning of the GAP model.17,36 A standalone step-by-step tutorial of the AL scheme is provided in the GitHub repository under the subheading “Simple AL4GAP Tutorial” (“Notebook/tutorial.ipynb”) along with an accompanying YouTube video.37 In this step-by-step tutorial, we featurize input trajectory configurations using distance matrices and perform unsupervised clustering with HDBSCAN.38 Training and test configurations are obtained through sequential sampling. Bayesian optimization is used to optimize the Gaussian Approximation Potential (GAP) model.39 The iterative process continues until the target accuracy is achieved. The AL4GAP workflow is illustrated in Fig. 1, which details the actions in the GitHub wrapper script named “driver.py” that deploys compute resources and executes the workflow in ensemble mode using SmartSim.40 In the SmartSim convention, an ensemble refers to a group of compute workloads that are launched simultaneously. In this study, the number of workloads in the group corresponds to the number of compositions. The setup and prerequisites are listed in the GitHub repository. A simplified code example for ensemble launch is discussed in the supplementary material, Sec. A. Each of the execution blocks from Fig. 1 is further discussed below.

FIG. 1.

AL4GAP workflow for active learning over a combinatorial composition map of molten salt mixtures. The composition space is explored and sampled concurrently by providing dedicated compute resources for each composition.

1. Composition space

The workflow takes the user-defined composition space as an input for setting up the sampling task. In the present study, the composition space is defined by the user in a comma separated value (CSV) format. An example is illustrated for the molten CaCl2–NdCl3 melt system (Listing 1), also found as the “density.csv” file in the GitHub repository, where each mixture composition is converted into its elemental composition. This example is provided to the user for a minimal trial run. The first three column headers are abbreviated elemental symbols, and the entries below them are the corresponding atomic fractions. The second-to-last column, labeled “T,” corresponds to the target experimental measurement temperature in kelvin. If the temperature is not known, it can be left as zero because the actual sampling is performed at an elevated temperature. The final column, labeled “rho,” corresponds to the density in units of kg m^−3.

LISTING 1.

CSV formatted input file with three mixture compositions defined.

Ca,Nd,Cl,T,rho
0.0681,0.1989,0.7330,1003,3179.618
0.1577,0.1317,0.7106,1003,2852.936
0.2723,0.0458,0.6819,1093,2321.894

The CSV-formatted input file is imported as a pandas object and passed as input to the “setup_inputs” Python method inside the driver.py script (invoked as “from AL4GAP.setup_inputs import setup_inputs”). The “setup_inputs” method uses the information read from the CSV file to create an atomic coordinate file (“opls.data”) and a corresponding LAMMPS input/force field file (“opls.in”) for each of the composition rows, writing these to a separate directory per composition.
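
As a minimal sanity check on such an input, the sketch below reads the three rows of Listing 1 and verifies that the atomic fractions sum to one and that each row is charge neutral to within rounding. It uses the standard-library csv module rather than pandas; comma delimiters, the helper name check_rows, and the tolerance are assumptions made for illustration and are not part of AL4GAP.

```python
import csv
import io

# Rows copied from Listing 1; comma delimiters are assumed here.
LISTING_1 = """Ca,Nd,Cl,T,rho
0.0681,0.1989,0.7330,1003,3179.618
0.1577,0.1317,0.7106,1003,2852.936
0.2723,0.0458,0.6819,1093,2321.894
"""

# Formal ionic charges (Ca2+, Nd3+, Cl-).
CHARGES = {"Ca": 2, "Nd": 3, "Cl": -1}


def check_rows(csv_text, charges, tol=1e-3):
    """Return one (fraction_sum, net_charge, ok) tuple per composition row."""
    report = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        fracs = {el: float(row[el]) for el in charges}
        total = sum(fracs.values())
        net = sum(charges[el] * x for el, x in fracs.items())
        ok = abs(total - 1.0) < tol and abs(net) < tol
        report.append((total, net, ok))
    return report


for total, net, ok in check_rows(LISTING_1, CHARGES):
    print(f"sum={total:.4f}  net charge={net:+.4f}  ok={ok}")
```

The tolerance absorbs the four-decimal rounding of the tabulated fractions.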

2. Generation of LAMMPS input

The density and composition read from the .csv files are used by “setup_inputs” to randomly pack the atoms into a LAMMPS readable coordinate file. The “ffparam” argument to the “setup_inputs” method generates the force field files needed for performing the LAMMPS simulation. The “ffparam” argument accepts one of two keywords: “TF” (i.e., rigid ion model, RIM) or “OPLS” (Optimized Potentials for Liquid Simulations41). Our prior publications on molten LiCl and LiCl–KCl used RIM-based sampling, invoked with the “TF” keyword; the corresponding parameterization available in our GitHub repository only supports mixing of up to two monovalent salts.18,32 The RIM model generator classes used in prior studies are found in the GitHub repository as “moltensalt_tosifumi_gen.py” and “tosi_fumi_params.py.” The “OPLS” class was added to expand the number of chemical species in the composition space to support multiply charged cationic salt mixtures and is the new default argument. The “setup_inputs” method internally invokes a Python class named “moltensalt_gen” that contains all the coordinate generation and force field parameters for OPLS. A target number of atoms (“set_target_number_of_atoms”), fixed at 64, together with a minimum interatomic distance cutoff of 2 Å (“min_dist”), is used to arrive at the most plausible charge-neutral atomic packing. The AL4GAP workflow through the OPLS sampler currently supports nine cations (Li, Na, K, Rb, Cs, Mg, Ca, Sr, and Ba), four anions (F, Cl, Br, and I), and two heavy species (Nd and Th).
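
The packing step can be illustrated with a small sketch: given target atomic fractions and a 64-atom budget, search for integer ion counts that are exactly charge neutral, then size a box from the tabulated density. The function names, the scoring rule, and the cubic box are assumptions made for this example; the actual logic of “moltensalt_gen” may differ. Atomic masses are standard values in g/mol.

```python
# Formal charges and standard atomic masses (g/mol) for the Listing 1 species.
CHARGES = {"Ca": 2, "Nd": 3, "Cl": -1}
MASSES = {"Ca": 40.078, "Nd": 144.242, "Cl": 35.45}
AVOGADRO = 6.02214076e23


def neutral_counts(x_ca, x_nd, target=64):
    """Brute-force integer (Ca, Nd, Cl) counts that are exactly charge neutral.

    The score (an illustrative choice) penalizes deviation from the target
    cation fractions plus the relative shortfall from the target atom count.
    """
    best = None
    for n_ca in range(target + 1):
        for n_nd in range(target + 1 - n_ca):
            n_cl = 2 * n_ca + 3 * n_nd  # neutrality fixes the chloride count
            total = n_ca + n_nd + n_cl
            if total == 0 or total > target:
                continue
            err = (abs(n_ca / total - x_ca) + abs(n_nd / total - x_nd)
                   + (target - total) / target)
            if best is None or err < best[0]:
                best = (err, {"Ca": n_ca, "Nd": n_nd, "Cl": n_cl})
    return best[1]


def cubic_box_edge(counts, rho_kg_m3):
    """Edge length (in Angstrom) of a cubic box matching the mass density."""
    mass_kg = sum(MASSES[el] * n for el, n in counts.items()) / AVOGADRO * 1e-3
    volume_m3 = mass_kg / rho_kg_m3
    return volume_m3 ** (1.0 / 3.0) * 1e10


# Second row of Listing 1: Ca = 0.1577, Nd = 0.1317, rho = 2852.936 kg m^-3.
counts = neutral_counts(0.1577, 0.1317)
edge = cubic_box_edge(counts, 2852.936)
print(counts, f"box edge = {edge:.2f} Angstrom")
```

Because neutrality is enforced exactly, the realized atom count can fall slightly below the 64-atom target.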

3. Run sampling with LAMMPS

After the input files are generated, the “LAMMPS_ensemble” method embedded in the driver script is invoked to launch ensembles of MD simulations corresponding to the total number of compositions defined in the input. Each of the MD simulations (and subsequent steps) is allocated a dedicated computing resource so that the sampling for all compositions runs concurrently. Since there are three compositions in the present example, three compute nodes are allocated, with an additional compute node for the database task. Further details are provided in the driver submission bash script along with the documentation in the GitHub repository. In our prior studies for LiCl–KCl, the RIM-based MD sampling was performed at 2100 K. With the OPLS-based NVT MD sampling used for all potential generation in this study, the temperature was further elevated to 5000 K.

4. Parse MD trajectories

Following the completion of all MD simulations, the driver script invokes the “AL_ensemble” method, which parses the MD trajectories and writes them in an extended xyz (exyz) format, with a separate file for each composition. Each of the parsed exyz files contains ∼20 000 configurational samples.
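
For readers unfamiliar with the extended xyz layout, the following sketch formats a single configuration as one exyz frame: an atom count, a comment line carrying the cell in a Lattice key, and one line per atom. The function name and the cubic cell are assumptions for illustration; this is not the parser used in AL4GAP.

```python
def format_exyz_frame(symbols, positions, box_edge):
    """Format one cubic-cell configuration as an extended-XYZ frame."""
    # The Lattice key lists the three cell vectors as nine numbers.
    lattice = (f"{box_edge:.6f} 0.0 0.0 "
               f"0.0 {box_edge:.6f} 0.0 "
               f"0.0 0.0 {box_edge:.6f}")
    lines = [
        str(len(symbols)),
        f'Lattice="{lattice}" Properties=species:S:1:pos:R:3 pbc="T T T"',
    ]
    for sym, (x, y, z) in zip(symbols, positions):
        lines.append(f"{sym} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


# Two-atom toy configuration in a 10 Angstrom cubic cell.
frame = format_exyz_frame(["Na", "Cl"], [(0, 0, 0), (1.5, 1.5, 1.5)], 10.0)
print(frame, end="")
```

Concatenating such frames yields a multi-configuration trajectory file.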

5. Run active learning

Once the parsing of all the MD trajectories is completed, the “AL_ensemble” method launches ensembles of active learning for each composition. The active learner acts on each of these compositions independently and searches them for the most informative configurations (similar to the step-by-step tutorial described earlier, except that each composition is now handled concurrently by an independent instance of the active learner).
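
The cluster-based selection can be sketched as follows, assuming cluster labels have already been produced by a clustering step such as HDBSCAN, which marks noise points with the label -1. The function name, the choice to skip noise points, and the fixed number of draws per cluster are illustrative assumptions and not necessarily what AL4GAP does.

```python
import random
from collections import defaultdict


def sample_per_cluster(labels, n_per_cluster, seed=0):
    """Pick up to n_per_cluster configuration indices from each cluster."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for idx, label in enumerate(labels):
        if label != -1:  # -1 is the noise label in HDBSCAN's convention
            members[label].append(idx)
    picked = []
    for label in sorted(members):
        pool = members[label]
        picked.extend(rng.sample(pool, min(n_per_cluster, len(pool))))
    return sorted(picked)


# Toy labels for seven configurations: three clusters and one noise point.
labels = [0, 0, 0, 1, 1, -1, 2]
print(sample_per_cluster(labels, n_per_cluster=2))
```

In an iterative scheme, the number of draws per cluster would be increased until the fitted model reaches the target accuracy on held-out configurations.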

All training and validation configurations selected for labeling by the AL4GAP workflow (Fig. 1) are combined, and single point DFT calculations are performed on them. Here, we describe the generic calculation settings used in this study. Single point DFT calculations are performed using the Strongly Constrained and Appropriately Normed (SCAN) exchange-correlation (XC) functional,42 which shows superior performance compared with generalized gradient approximation (GGA) XC functionals.43,44 The calculations are carried out with the Vienna ab initio simulation package (VASP)45 using the projector-augmented wave method.42,46 A large plane wave cutoff of 700 eV with an electronic convergence criterion of 10^−7 eV is used. A Γ-centered 1 × 1 × 1 k-mesh is used for reciprocal space sampling. Spin polarization is applied to all the chemical mixtures involving heavy elements. The driver script can already handle many ensemble computational tasks, and the DFT step could readily be bundled within that script. We chose not to do so because we want to keep the tutorial modular, and users might prefer a different electronic structure code for their reference calculations. An added benefit of this decoupling is that the sampling performed with the empirical force field is extremely cheap relative to expensive DFT-MD runs, so sanity checks can be done once the sampling is completed and before proceeding with the expensive single point DFT-SCAN calculations.
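
The settings quoted above map onto standard VASP input tags. The sketch below renders them as INCAR and KPOINTS text; it is an assumed minimal input, not the authors' actual files, and tags not mentioned in the text (pseudopotential choices, smearing, and so on) are deliberately left out. NSW = 0 is added only to express that these are single point calculations.

```python
# Settings stated in the text, mapped to standard VASP INCAR tags.
SCAN_SINGLE_POINT = {
    "METAGGA": "SCAN",  # SCAN meta-GGA exchange-correlation functional
    "ENCUT": 700,       # plane-wave cutoff in eV
    "EDIFF": 1e-7,      # electronic convergence criterion in eV
    "NSW": 0,           # no ionic steps: single point calculation
}

# Gamma-centered 1 x 1 x 1 mesh in VASP's automatic KPOINTS format.
KPOINTS = "Gamma-only mesh\n0\nGamma\n1 1 1\n0 0 0\n"


def render_incar(tags, spin_polarized=False):
    """Render INCAR text; ISPIN = 2 is added for spin-polarized runs."""
    tags = dict(tags)
    if spin_polarized:
        tags["ISPIN"] = 2
    return "\n".join(f"{key} = {value}" for key, value in tags.items()) + "\n"


# Mixtures containing heavy elements (e.g., Nd, Th) use spin polarization.
print(render_incar(SCAN_SINGLE_POINT, spin_polarized=True), end="")
```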

Once the DFT-SCAN single point calculations for the AL4GAP training and validation sets are complete, the next step is to use Bayesian optimization (BO) to tune the hyperparameters of the GAP model, which employs two-body (2B) squared-exponential and many-body SOAP descriptors.17,39 Here, we use the DFT-SCAN dataset for molten KCl–ThCl4 corresponding to the AL4GAP run for the composition space defined in Listing 6. The BO code and the datasets can be found in the GitHub folder “HyperparameterOptimization/.” The BO script, which utilizes the GPyOpt library,47 is implemented in “BayesOpt_SOAP.py.” This Python script performs BO to find the optimal values for the following hyperparameters: cutoff, scaling of kernel (2B, SOAP), number of representative sparse points (2B, SOAP), and number of angular and radial basis functions for SOAP. The optimal hyperparameters found in the search range are written to the “hyperparam_quip.json” file. More details can be found in the GitHub repository. The optimal BO hyperparameters are preserved through the iterative retraining discussed in Subsection II D.
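
To make the tuned quantities concrete, the sketch below assembles the KCl–ThCl4 values reported in Table II into a descriptor string of the kind accepted by QUIP's gap_fit program. The helper names are hypothetical, the use of the “distance_2b” descriptor keyword is an assumption, and a working gap_fit command needs further keys (for example the SOAP atom_sigma and the covariance type) that are not reported in this section and are therefore omitted.

```python
def descriptor_block(name, params):
    """Format one descriptor as 'name key=value key=value ...'."""
    return " ".join([name] + [f"{key}={value}" for key, value in params.items()])


def gap_string(two_body, soap):
    """Join a two-body and a SOAP descriptor in gap_fit's 'gap={... : ...}' form."""
    return ("gap={" + descriptor_block("distance_2b", two_body)
            + " : " + descriptor_block("soap", soap) + "}")


# KCl-ThCl4 column of Table II (cutoff in Angstrom, delta in eV).
TWO_BODY = {"cutoff": 6.11, "delta": 11.11, "n_sparse": 55}
SOAP = {"cutoff": 6.11, "delta": 0.89, "n_sparse": 1500, "l_max": 4, "n_max": 9}

print(gap_string(TWO_BODY, SOAP))
```

In a BO loop, each trial would build such a string from the candidate hyperparameters, fit a model, and return the validation error as the objective.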

An initial GAP model is fitted using the optimized hyperparameters for the AL4GAP-generated configurations across multiple compositions [Fig. 1(b)], with “labels” computed using DFT-SCAN calculations. To circumvent the limitations of Boltzmann sampling, we showed in Guo et al. that metadynamics can be used to sample out-of-equilibrium training regions by utilizing the initial GAP model for LiCl–KCl as a model system.18 Here, we briefly summarize the use of well-tempered metadynamics sampling with the newly generated AL4GAP GAP model for an equal fraction molten salt mixture (e.g., 50% molar fractions of LiCl and KCl) to perform configurational sampling with the melting point as the reference temperature, which potentially leads to improved coverage of the training space.48,49 The coordination of unlike ion pairs (e.g., Li–Cl, K–Cl) is chosen as an intuitive collective variable (CV) to drive the exploration. The pair coordination CV is parameterized with the position of the first minimum in the corresponding partial PDF. A Gaussian is deposited every 250 fs with a bias factor equal to 50. The potential is then retrained with actively learned configurations drawn from this metadynamics sampling. A more detailed discussion is available in Guo et al.18
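
Two ingredients of this protocol can be written compactly. The sketch below evaluates a coordination CV with a rational switching function (the functional form and the exponents n = 6, m = 12 are assumptions, following a common convention in metadynamics codes) and the well-tempered rescaling of the Gaussian height, w = w0 exp[−V/(kB ΔT)] with ΔT = (γ − 1)T for bias factor γ. Function names are illustrative.

```python
import math

KB_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def coordination_cv(distances, r0, n=6, m=12):
    """Sum of rational switching functions over unlike-ion pair distances."""
    total = 0.0
    for r in distances:
        x = r / r0
        if abs(x - 1.0) < 1e-12:
            total += n / m  # limit of the rational function at r = r0
        else:
            total += (1.0 - x ** n) / (1.0 - x ** m)
    return total


def well_tempered_height(w0, bias_at_cv, temperature, bias_factor=50.0):
    """Gaussian height after well-tempered rescaling (energies in eV)."""
    delta_t = (bias_factor - 1.0) * temperature
    return w0 * math.exp(-bias_at_cv / (KB_EV * delta_t))


# Toy numbers: three pair distances (Angstrom) and r0 set to a first-minimum value.
print(coordination_cv([2.3, 2.9, 4.0], r0=3.0))
print(well_tempered_height(w0=0.01, bias_at_cv=0.5, temperature=1000.0))
```

A large bias factor such as 50 makes the height decay slowly, so the sampling stays close to standard metadynamics for most of the run.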

We apply the AL4GAP framework to automatically generate GAP models for five independent binary molten salt melts. Here, we carefully design the chemical systems and sampling space in close coordination with experimentalists.50 These systems are LiCl–KCl, NaCl–CaCl2, KCl–NdCl3, CaCl2–NdCl3, and KCl–ThCl4. The CSV formatted composition mapping input file used for AL4GAP for each of these systems can be seen in Appendix A, Listings 2–6. The composition space for LiCl–KCl (monovalent/monovalent cations) in Listing 2 corresponds to the input used in Guo et al.18 The composition space for NaCl–CaCl2 (monovalent/bivalent cations) corresponds to the eight composition rows, including two for pure salt melts, as shown in Listing 3. The composition space for KCl–NdCl3 (monovalent/trivalent cations) corresponds to the six composition rows, including two for pure salt melts, as shown in Listing 4. The composition space for CaCl2–NdCl3 (bivalent/trivalent cations) corresponds to the five composition rows, including two for pure salt melts, as shown in Listing 5. Finally, the composition space for KCl–ThCl4 (monovalent/tetravalent cations) corresponds to the five composition rows, including two for pure salt melts, as shown in Listing 6.

The training databases are summarized in Table I. A more detailed breakdown of the training database is given in the supplementary material, Sec. D. As noted in Subsection III A, the number of input compositions varies between molten salt mixtures. Consequently, the systems with larger input composition spaces have a larger number of actively learned training samples. The BO-optimized hyperparameters used to fit the final GAP models on these training databases are listed in Table II. The atomic forces validated for independent test samples drawn at each composition are visualized in Fig. 2. The GAP models provide excellent force prediction accuracy with respect to DFT-SCAN across different molten salt mixture chemistries, with root mean squared errors (RMSE) ranging from 0.12 to 0.17 eV/Å. The broader distribution of atomic forces in the LiCl–KCl system can be attributed to smaller chemical species having relatively higher diffusivity.
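
A force RMSE of this kind can be computed as in the sketch below, which averages the squared error over all Cartesian force components. Whether the reported values are component-wise or per-atom vector errors is not stated here, so the component-wise choice is an assumption, as is the function name.

```python
import math


def force_rmse(predicted, reference):
    """RMSE over all Cartesian force components (same units as the inputs)."""
    squared_error = 0.0
    count = 0
    for f_pred, f_ref in zip(predicted, reference):
        for a, b in zip(f_pred, f_ref):
            squared_error += (a - b) ** 2
            count += 1
    return math.sqrt(squared_error / count)


# Toy forces (eV/Angstrom) on two atoms: model prediction vs reference.
gap_forces = [(0.10, -0.20, 0.05), (-0.10, 0.22, -0.05)]
dft_forces = [(0.12, -0.25, 0.00), (-0.08, 0.20, -0.02)]
print(f"RMSE = {force_rmse(gap_forces, dft_forces):.3f} eV/Angstrom")
```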

TABLE I.

GAP training database.

System         Training samples
LiCl–KCl       1127
NaCl–CaCl2     801
CaCl2–NdCl3    762
KCl–NdCl3      839
KCl–ThCl4      632

TABLE II.

GAP model hyperparameter.

Parameter name    LiCl–KCl        NaCl–CaCl2      KCl–NdCl3       CaCl2–NdCl3     KCl–ThCl4
                  2B     SOAP     2B     SOAP     2B     SOAP     2B     SOAP     2B     SOAP
Cut off (Å)       5.92   5.92     5.97   5.97     6.074  6.074    5.594  5.594    6.11   6.11
Sparse points     65     1200     65     1100     65     1300     65     1300     55     1500
Delta (eV)        2.74   0.78     2.11   0.65     8.09   0.99     7.31   0.70     11.11  0.89
(lmax, nmax)      ⋯      (4, 8)   ⋯      (4, 8)   ⋯      (6, 9)   ⋯      (4, 9)   ⋯      (4, 9)

FIG. 2.

Validation of GAP model force predictions against reference DFT-SCAN forces computed for test configurations drawn across the diverse compositions.

The final GAP models are used to perform MD simulations with system sizes exceeding ∼1000 atoms. Detailed information on the simulated systems, their number of atoms, and densities estimated from GAP-MD for each composition and temperature is summarized in Appendix B, Table V. We performed GAP MD using the LAMMPS software package compiled with the QUIP pair style.51,52 Each system is initially thermalized at 1500 K in the canonical (NVT) ensemble,53,54 followed by volume relaxation in the isothermal-isobaric (NPT) ensemble at a target pressure of 1 bar.55–57 The temperature is then decreased to the target temperature over 200 ps in the NPT ensemble. At the target temperature, the NPT MD is continued for ∼2 ns with a time step of 0.5 fs, and the last 1 ns is used for computing the structure. A representative set of simulated molten salt chemistries is shown in Fig. 3. The results of these simulations are presented in the next two subsections.

FIG. 3.

MD simulation snapshots of the molten salt mixtures explored as a part of the combinatorial screening. The visualization is performed using OVITO.58 

From the MD simulation trajectory, we calculated the partial pair distribution function (PDF) of NaCl–CaCl2 across multiple compositions from pure CaCl2 (1200 K), NaCl–2CaCl2 (903 K), NaCl–CaCl2 (813 K), and 2NaCl–CaCl2 (923 K) to pure NaCl (1148 K). The simulation temperature and compositions for mixtures were chosen to be consistent with the study of Igarashi et al.59 The partial PDF functions are Fourier transformed to obtain partial structure factors and then weighted according to the Faber-Ziman formalism to obtain the total x-ray structure factor, which was then Fourier transformed to the total pair distribution function.60,61 The total PDF is visualized in Fig. 4. The first peak of the total PDF corresponds to the cation-anion interaction (i.e., Na–Cl and Ca–Cl) in the first coordination shell. As Na–Cl and Ca–Cl bond lengths are very close (i.e., 2.70 vs 2.71 Å), the composition change causes minimal changes in the first peak position of the total PDF. The second peak of the total PDF is mainly from the Cl–Cl interaction, the length of which increases as the NaCl content increases as shown in Fig. 4. This agrees with the x-ray diffraction (XRD) experimental results reported by Igarashi et al.59 This change is further supported by the Cl–Cl partial PDF data (Fig. 5).
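
The Faber-Ziman weighting mentioned above combines the partial structure factors as S(Q) = Σ_αβ w_αβ(Q) S_αβ(Q), with w_αβ = c_α c_β f_α f_β / (Σ_γ c_γ f_γ)^2, where c is the atomic fraction and f the x-ray form factor. The sketch below evaluates the weights for equimolar NaCl–CaCl2 using the electron counts of the ions as Q → 0 stand-ins for the form factors; real form factors are Q dependent and tabulated, so the numbers are illustrative only, and the function names are assumptions.

```python
def faber_ziman_weights(conc, form_factor):
    """Weights w_ab = c_a c_b f_a f_b / <f>^2 for every ordered pair (a, b)."""
    mean_f = sum(conc[a] * form_factor[a] for a in conc)
    return {
        (a, b): conc[a] * conc[b] * form_factor[a] * form_factor[b] / mean_f ** 2
        for a in conc
        for b in conc
    }


def total_structure_factor(weights, partial_sq):
    """S(Q) at a single Q from partials keyed by ordered pair (a, b)."""
    return sum(w * partial_sq[pair] for pair, w in weights.items())


# Equimolar NaCl-CaCl2: one Na, one Ca, and three Cl per formula pair.
conc = {"Na": 0.2, "Ca": 0.2, "Cl": 0.6}
# Electron counts of Na+, Ca2+, and Cl- as Q -> 0 form-factor stand-ins.
electrons = {"Na": 10.0, "Ca": 18.0, "Cl": 18.0}

weights = faber_ziman_weights(conc, electrons)
for pair, w in sorted(weights.items()):
    print(pair, round(w, 4))
```

Because the weights scale with the product of form factors, pairs involving heavy, electron-rich ions dominate the x-ray total even at modest concentrations.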

FIG. 4.

The pair distribution function plot for different compositions of molten NaCl–CaCl2.

FIG. 5.

Cl–Cl partial pair distribution functions for different compositions of molten NaCl–CaCl2.

The coordination number (CN) of each ion pair was calculated by integrating the partial PDF up to its first minimum and compared with previous studies in Table III.31,59,62 The Ca–Cl coordination number in pure CaCl2 and the Na–Cl coordination number in pure NaCl are around 6.33 and 5.40, respectively, which are close to their crystalline state value of 6. Ca2+ is surrounded by more Cl within the first coordination shell in the molten state than in the crystalline state, while Na+ is surrounded by fewer. In the NaCl–CaCl2 mixture, the coordination numbers for both Na–Cl and Ca–Cl decrease as the NaCl content increases.
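
The integration described above is CN_αβ = 4π ρ_β ∫ g_αβ(r) r^2 dr from 0 to the first minimum r_min, where ρ_β is the number density of the neighbor species. A minimal sketch using the trapezoidal rule follows; the function name and the uniform-gas check are illustrative choices.

```python
import math


def coordination_number(r, g, number_density, r_min):
    """CN = 4*pi*rho_beta * integral of g(r) r^2 from 0 to r_min (trapezoids)."""
    integral = 0.0
    for i in range(1, len(r)):
        if r[i] > r_min:
            break
        f_left = g[i - 1] * r[i - 1] ** 2
        f_right = g[i] * r[i] ** 2
        integral += 0.5 * (f_left + f_right) * (r[i] - r[i - 1])
    return 4.0 * math.pi * number_density * integral


# Check against an ideal gas, g(r) = 1, where CN = (4/3) * pi * rho * r_min^3.
r_grid = [i / 100 for i in range(501)]  # 0 to 5 Angstrom in 0.01 steps
g_flat = [1.0] * len(r_grid)
cn = coordination_number(r_grid, g_flat, number_density=0.03, r_min=3.0)
print(round(cn, 3), round(4.0 / 3.0 * math.pi * 0.03 * 27.0, 3))
```

For a real partial PDF, r_min would be taken as the position of the first minimum after the main peak.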

TABLE III.

Bond lengths and coordination numbers of ion pairs in the NaCl–CaCl2 mixture with various compositions. Literature values are also reported when available in their original sources. All new GAP-MD simulations were performed as a part of this study except for molten NaCl. Neutron diffraction with isotope substitution (NDIS). Polarizable ion model (PIM).

                                                    Na–Na         Na–Cl         Na–Ca         Ca–Ca         Ca–Cl         Cl–Cl
Chemistry    Method                Temperature (K)  r (Å)  CN     r (Å)  CN     r (Å)  CN     r (Å)  CN     r (Å)  CN     r (Å)  CN
CaCl2        GAP-MD                1200             ⋯      ⋯      ⋯      ⋯      ⋯      ⋯      4.52   10.75  2.72   6.33   3.68   16.31
             XRD (Ref. 59)         1063             ⋯      ⋯      ⋯      ⋯      ⋯      ⋯      4.60   6.90   2.76   5.20   3.55   7.60
             DFT-MD (Ref. 63)      1200             ⋯      ⋯      ⋯      ⋯      ⋯      ⋯      4.55   ⋯      2.73   6.20   3.66   ⋯
NaCl–2CaCl2  GAP-MD                903              4.16   4.18   2.71   6.35   4.38   8.21   4.53   7.63   2.73   6.06   3.73   16.31
NaCl–CaCl2   GAP-MD                813              4.16   6.26   2.71   6.22   4.46   6.79   4.50   5.96   2.73   6.06   3.79   16.41
             GAP-MD                1200             4.15   6.28   2.71   6.01   4.41   6.38   4.56   5.94   2.71   5.99   3.81   16.06
2NaCl–CaCl2  GAP-MD                923              4.14   9.09   2.69   5.79   4.36   4.49   4.43   4.36   2.72   5.93   3.83   15.44
NaCl         GAP-MD (Ref. 31)      1148             4.03   15.39  2.70   5.40   ⋯      ⋯      ⋯      ⋯      ⋯      ⋯      4.12   14.93
             NDIS + XRD (Ref. 64)  1093             4.01   15.20  2.68   4.70   ⋯      ⋯      ⋯      ⋯      ⋯      ⋯      4.03   15.10
             PIM-MD                1100             4.05   15.40  2.66   5.20   ⋯      ⋯      ⋯      ⋯      ⋯      ⋯      4.07   15.30

The total structure factor data of the NaCl–CaCl2 mixture at various compositions are shown in Fig. 6(a). A pre-peak at ∼1.44 Å^−1 appears in the CaCl2 S(Q) plot, which is not present in the NaCl S(Q) plot [enlarged in Fig. 6(b)]. A peak near 1.04 Å^−1 appears in the structure factor when NaCl is introduced into CaCl2. These two peaks are most likely the result of intermediate range order in CaCl2 and CaCl2–NaCl mixtures, which is also observed in other AX2 systems.65–69 The peak at 1.04 Å^−1 is probably attributable to the periodicity of Cl in adjacent ionic networks (chains), as mentioned by Wu et al.69 It is also worth pointing out that simulations using the RIM model cannot accurately predict the structure of mixtures with multiply charged cations.70,71 By contrast, the ML-MD captures both the short-range structure and the intermediate-range structure of complex molten salt mixtures.

FIG. 6.

(a) Structure factor of NaCl–CaCl2 mixtures at different compositions. (b) Close-up of the pre-peaks in (a).

We also explore the effect of temperature on the structure of molten NaCl–CaCl2. The PDF plots shown in Fig. 7(a) for 50–50 NaCl–CaCl2 at 813.15 and 1200 K indicate that the overall peak intensity decreases as temperature increases, consistent with a decrease in atomic density. This change is also reflected in the coordination numbers (Table III): the Na–Cl, Ca–Cl, and Cl–Cl coordination numbers decrease from 6.22, 6.06, and 16.41 to 6.01, 5.99, and 16.06, respectively. The structure factor [Fig. 7(b)] shows that the pre-peaks at 1.0 and 1.4 Å^−1 grow while the first major peak at 2.48 Å^−1 diminishes as temperature increases from 813 to 1200 K. Similar behavior is observed in the molten MgCl2–KCl system.65,69 This suggests that at higher temperatures, shorter cation-anion-cation chains are favored, which leads to more ions contributing to the inter-chain interaction and, as a result, to the enhanced pre-peak.

FIG. 7.

NaCl–CaCl2 with 50–50 mol. % composition simulated at two different temperatures. (a) Pair distribution function. (b) Structure factor.

Here, we also examine the structure of salt mixtures containing monovalent, bivalent, trivalent, and tetravalent cations. In the total pair distribution function shown in Fig. 8(a), for light elements (i.e., Li, K, Ca, Na), the first peak has contributions from the cation-anion correlations and the second peak is mainly due to the Cl–Cl correlation. However, for mixtures with heavy elements, even though the Cl concentration is higher, the weight factors make the heavy cation-heavy cation correlations (e.g., Th–Th and Nd–Nd) the main contributions to the second and third peaks. The detailed structure information, such as bond length and coordination number for individual pairs, is derived from the partial PDFs of these salt mixtures and listed in Table IV. The average Th–Cl coordination number (CN) in the KCl–ThCl4 mixture was found to be 6.21, which is consistent with prior studies indicating that at high ThCl4 concentrations, the primary species in ThCl4–ACl (A = alkali metal) are 6-coordinated octahedra, including linked ThCl6 − and chained [ThnCl4n+2]2− and [ThnCl4n+2]2+.72,73 Similarly, the average Nd–Cl CN in the KCl–NdCl3 mixture was 6.33, also in agreement with previous research suggesting that a distorted octahedral loose network structure predominates in NdCl3–ACl mixtures with high NdCl3 concentrations (>25 mol. %).74 The structure factors of the different salt mixtures are shown in Fig. 8(b). Salt mixtures with ThCl4 and NdCl3 show very intense pre-peaks at ∼0.8 to ∼1.05 Å^−1, which indicates strong intermediate range ordering.

FIG. 8.

Structural information of multiple salt mixtures at 50–50 mol. % composition at 1200 K. (a) Pair distribution function. (b) Structure factor.

TABLE IV.

Bond lengths and coordination numbers of ion pairs in salt mixtures at 1200 K with 50–50 mol. % compositions.

Chemistry      Ion pair   r (Å)        CN
KCl–ThCl4      K–K        5.16         4.79
KCl–ThCl4      K–Cl       3.20         8.12
KCl–ThCl4      K–Th       5.26         6.28
KCl–ThCl4      Th–Th      4.62/5.42    2.02
KCl–ThCl4      Th–Cl      2.67         6.21
KCl–ThCl4      Cl–Cl      3.70         13.58
KCl–NdCl3      K–K        5.25         5.29
KCl–NdCl3      K–Cl       3.07         6.19
KCl–NdCl3      K–Nd       5.04         6.31
KCl–NdCl3      Nd–Nd      4.21/5.16    2.96
KCl–NdCl3      Nd–Cl      2.69         6.33
KCl–NdCl3      Cl–Cl      3.53         14.46
CaCl2–NdCl3    Ca–Ca      4.86         4.02
CaCl2–NdCl3    Ca–Cl      2.71         6.32
CaCl2–NdCl3    Ca–Nd      4.55         4.09
CaCl2–NdCl3    Nd–Nd      4.52         2.92
CaCl2–NdCl3    Nd–Cl      2.74         6.78
CaCl2–NdCl3    Cl–Cl      3.58         17.15
LiCl–KCl       Li–Li      3.28         4.80
LiCl–KCl       Li–Cl      2.29         4.05
LiCl–KCl       Li–K       3.90         7.96
LiCl–KCl       K–K        4.62         10.35
LiCl–KCl       K–Cl       3.02         6.70
LiCl–KCl       Cl–Cl      3.93         10.07
NaCl–CaCl2     Na–Na      4.15         6.28
NaCl–CaCl2     Na–Cl      2.71         6.01
NaCl–CaCl2     Na–Ca      4.41         6.38
NaCl–CaCl2     Ca–Ca      4.56         5.94
NaCl–CaCl2     Ca–Cl      2.71         5.99
NaCl–CaCl2     Cl–Cl      3.81         16.06
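Coordination numbers such as those in Table IV are conventionally obtained by integrating the partial pair distribution function, CN = 4πρ_j ∫ g_ij(r) r² dr, from zero up to a cutoff that is usually placed at the first minimum of g_ij(r). This section does not spell out the integration procedure or the cutoffs used, so the following is only a minimal sketch of that standard definition. The function name, the grid, and the density value are hypothetical.

```python
import numpy as np

def coordination_number(r, g_ij, rho_j, r_cut):
    """Integrate 4*pi*rho_j*r^2*g_ij(r) over all grid points with r <= r_cut."""
    r = np.asarray(r, dtype=float)
    g_ij = np.asarray(g_ij, dtype=float)
    mask = r <= r_cut
    integrand = 4.0 * np.pi * rho_j * r[mask] ** 2 * g_ij[mask]
    # Trapezoidal rule written out explicitly to avoid NumPy version differences.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r[mask])))

# Sanity check on a structureless fluid (g = 1), where CN = (4/3) pi rho r_cut^3.
r = np.linspace(0.0, 10.0, 2001)
rho_j = 0.02  # number density of species j in atoms per cubic angstrom (made-up value)
cn = coordination_number(r, np.ones_like(r), rho_j, r_cut=3.5)
expected = 4.0 / 3.0 * np.pi * rho_j * 3.5**3
print(cn, expected)
```

In practice g_ij(r) would come from the GAP-MD trajectory and rho_j from the number of atoms of species j divided by the cell volume.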

We have described a software workflow, AL4GAP, for accelerating the development of machine learning interatomic potentials (ML-IP) via active learning over a combinatorial composition space. The workflow provides an easy-to-use interface for setting up efficient sampling over arbitrary mixtures of molten salt chemistries. AL4GAP employs low-cost empirical force fields for sampling, thereby bypassing the need for expensive DFT-MD for training dataset generation. We also provide an easy-to-use interface for the Bayesian optimization of ML-IP model hyperparameters.

We showcase the power of AL4GAP by generating GAP models at the DFT-SCAN level for five different three-component molten salt mixtures. Our results indicate that GAP models can accurately capture the structural characteristics of complex molten salt mixtures. The workflow accelerates the development of DFT-SCAN accurate multi-composition ML-IPs, which can be used for high-accuracy property prediction across compositions and simulation conditions. This opens up the possibility of providing rapid feedback and guidance to experiments in regimes challenged by high corrosion and radiation.

The supplementary material contains a simplified code example for ensemble launch, energy error validation, comparison of GAP-MD computed density with respect to reported experiments, and database summary.

This material was based upon work supported by Laboratory Directed Research and Development (Grant No. LDRD-CLS-1-630) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357. This research was supported in part by the ExaLearn Co-design Center of the Exascale Computing Project (Grant No. 17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.75 Portions of this work were sponsored by the U.S. Department of Energy, Office of Nuclear Energy’s Material Recovery and Wasteform Development Program under Contract No. DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided on Bebop, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory. This research used resources of the Argonne Leadership Computing Facility, a DOE Office of Science User Facility supported under Contract No. DE-AC02-06CH11357. C.B. acknowledges support from the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357. Argonne National Laboratory’s work was supported by the U.S. Department of Energy, Office of Science, under Contract No. DE-AC02-06CH11357. G.S. would like to thank Professor Gabor Csányi for fruitful discussions on GAP model fitting. Los Alamos National Laboratory, an affirmative action/equal opportunity employer, was operated by Triad National Security, LLC, for the National Nuclear Security Administration of the U.S. Department of Energy under Contract No. 89233218CNA000001.

The authors have no conflicts to disclose.

Jicheng Guo: Conceptualization (equal); Data curation (lead); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Vanessa Woo: Conceptualization (supporting); Data curation (equal); Formal analysis (supporting); Funding acquisition (supporting); Investigation (supporting); Methodology (equal); Project administration (supporting); Resources (supporting); Software (equal); Supervision (equal); Validation (supporting); Visualization (equal); Writing – original draft (supporting); Writing – review & editing (supporting). David A. Andersson: Data curation (equal); Formal analysis (supporting); Investigation (supporting); Methodology (equal); Project administration (supporting); Resources (supporting); Software (equal); Supervision (supporting); Validation (supporting); Writing – original draft (supporting); Writing – review & editing (equal). Nathaniel Hoyt: Conceptualization (supporting); Funding acquisition (lead); Investigation (supporting); Methodology (supporting); Project administration (equal); Resources (equal); Supervision (equal); Writing – review & editing (equal). Mark Williamson: Conceptualization (supporting); Funding acquisition (supporting); Project administration (supporting); Resources (equal); Supervision (supporting); Writing – review & editing (equal). Ian Foster: Funding acquisition (lead); Project administration (supporting); Resources (supporting); Supervision (supporting); Writing – review & editing (equal). Chris Benmore: Formal analysis (supporting); Investigation (supporting); Methodology (supporting); Resources (equal); Supervision (supporting); Validation (equal); Writing – review & editing (equal). Nicholas E. Jackson: Conceptualization (equal); Data curation (supporting); Formal analysis (supporting); Funding acquisition (supporting); Investigation (supporting); Methodology (equal); Project administration (supporting); Software (equal); Writing – original draft (supporting); Writing – review & editing (equal). Ganesh Sivaraman: Conceptualization (lead); Data curation (lead); Formal analysis (equal); Funding acquisition (supporting); Investigation (lead); Methodology (lead); Project administration (lead); Resources (lead); Software (lead); Supervision (lead); Validation (lead); Visualization (equal); Writing – original draft (lead); Writing – review & editing (lead).

The AL4GAP workflow is provided under an MIT license at Ref. 33 (https://github.com/pythonpanda2/AL4GAP_JCP). The GAP model, training data, and the MD trajectories have been deposited in Ref. 76.

The CSV formatted input files for LiCl-KCl, NaCl-CaCl2, KCl-NdCl3, CaCl2-NdCl3, and KCl-ThCl4 are listed in Listings 2-6.

LISTING 2.

CSV formatted input file for LiCl–KCl.

Li,K,Cl,T,rho
0.45,0.05,0.5,820,1552
0.4,0.1,0.5,760,1598
0.35,0.15,0.5,740,1622
0.3335,0.1665,0.5,720,1646
0.25,0.25,0.5,780,1627
0.21,0.29,0.5,860,1595
0.165,0.335,0.5,900,1583
0.1,0.4,0.5,980,1548
0.05,0.45,0.5,1020,1534
,0.5,0.5,1060,1520

LISTING 3.

CSV formatted input file for NaCl–CaCl2.

Na,Ca,Cl,T,rho
0.500000,0.000000,0.500000,1090.0000,1554.5
0.425000,0.050000,0.525000,1090.0000,1635.0
0.337500,0.108333,0.554167,1090.0000,1742.4
0.245500,0.169667,0.584833,1060.0000,1854.8
0.179500,0.213667,0.606833,1080.0000,1912.9
0.112500,0.258333,0.629167,1080.0000,1971.9
0.099000,0.267333,0.633667,1040.0000,2002.9
0.000000,0.333333,0.666667,1070.0000,2073.2

LISTING 4.

CSV formatted input file for KCl–NdCl3.

K,Nd,Cl,T,rho
,0.25,0.75,1090,3250.3474
0.03705692803,0.231471536,0.731471536,997,3216.625
0.08927519151,0.2053624042,0.7053624042,1045,2954.175
0.202247191,0.1488764045,0.6488764045,1083,2920.545
0.3680555556,0.06597222222,0.5659722222,1093,1872.805
0.50,,0.50,1122,1493.702

LISTING 5.

CSV formatted input file for CaCl2–NdCl3.

Ca,Nd,Cl,T,rho
0.00,0.2500,0.7500,1270,3082.9222
0.0681,0.1989,0.7330,1003,3179.618
0.1577,0.1317,0.7106,1003,2852.936
0.2723,0.0458,0.6819,1093,2321.894
0.3333,0.00,0.6667,1098,2122.448

LISTING 6.

CSV formatted input file for KCl–ThCl4.

K,Th,Cl,T,rho
,0.20,0.80,1075,3317
0.0771,0.1692,0.7538,883,3188.55
0.1416,0.1433,0.715,748,3171.64
0.3494,0.0602,0.5903,925,2111
0.50,,0.50,1075,1506.375
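Each row of these input files gives atom fractions for the two cations and the anion, followed by T and rho. Since the workflow targets charge-neutral mixtures, the fractions in a row should sum to one and carry no net charge. The sketch below checks both conditions for the rows of Listing 3, written here without digit-group spacing. The function name, the tolerance, the formal charges, and the treatment of an empty field as zero are assumptions of this example and are not part of the AL4GAP interface.

```python
import csv
import io

# Rows copied from Listing 3 (NaCl-CaCl2); columns are atom fractions, T, and rho.
LISTING_3 = """Na,Ca,Cl,T,rho
0.500000,0.000000,0.500000,1090.0000,1554.5
0.425000,0.050000,0.525000,1090.0000,1635.0
0.337500,0.108333,0.554167,1090.0000,1742.4
0.245500,0.169667,0.584833,1060.0000,1854.8
0.179500,0.213667,0.606833,1080.0000,1912.9
0.112500,0.258333,0.629167,1080.0000,1971.9
0.099000,0.267333,0.633667,1040.0000,2002.9
0.000000,0.333333,0.666667,1070.0000,2073.2
"""

# Formal ionic charges (an assumption of this sketch, not read from the file).
CHARGES = {"Na": +1, "Ca": +2, "Cl": -1}

def check_rows(csv_text, charges, tol=1e-5):
    """Return (fraction_sum, net_charge) per row; raise if either is off by more than tol."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # An empty field is treated as zero (an assumption of this sketch).
        x = {el: float(row[el] or 0.0) for el in charges}
        total = sum(x.values())
        net = sum(charges[el] * x[el] for el in charges)
        if abs(total - 1.0) > tol or abs(net) > tol:
            raise ValueError(f"inconsistent row: {row}")
        results.append((total, net))
    return results

results = check_rows(LISTING_3, CHARGES)
print(f"{len(results)} rows pass the normalization and charge-neutrality checks")
```

The same check applies to the other listings once the charge table is swapped, for example K: +1, Nd: +3, Cl: −1 for KCl–NdCl3.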

The GAP-MD simulation setup for the various salt mixtures is listed in Table V.

TABLE V.

GAP-MD simulation setup. The mixture compositions are expressed in mol. %. A 50–50 composition of LiCl:KCl with 256 anion–cation pairs for each salt is equivalent to a 1024-atom system.

Mixture chemistry   Mixture composition   Number of atoms   Temperature (K)   Density from GAP-MD (g cm−3)
LiCl–KCl            50–50                 1024              1200              1.587 (±0.015)
NaCl–CaCl2          33–67                 1024              903.15            1.805 (±0.017)
NaCl–CaCl2          50–50                 1040              813.15            1.799 (±0.013)
NaCl–CaCl2          50–50                 1040              1200              1.690 (±0.015)
NaCl–CaCl2          67–33                 1064              923.15            1.661 (±0.012)
KCl–NdCl3           50–50                 1008              1200              2.354 (±0.028)
CaCl2–NdCl3         50–50                 1008              1200              2.506 (±0.023)
KCl–ThCl4           50–50                 1008              1200              2.551 (±0.021)
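The atom counts in Table V follow from counting atoms per formula unit, as in the caption's LiCl–KCl example. The sketch below reproduces that arithmetic and inverts it for other mixtures. The helper names are hypothetical, reading 33–67 as a 1:2 molar ratio is an assumption, and the formula-unit counts it returns for mixtures other than LiCl–KCl are inferred from the arithmetic and are not stated in the paper.

```python
# Atoms per formula unit for the salts in Table V.
ATOMS_PER_UNIT = {"LiCl": 2, "KCl": 2, "NaCl": 2, "CaCl2": 3, "NdCl3": 4, "ThCl4": 5}

def total_atoms(formula_units):
    """Total atom count for a cell given {salt: number of formula units}."""
    return sum(ATOMS_PER_UNIT[salt] * n for salt, n in formula_units.items())

def units_for_cell(salt_a, salt_b, ratio_a, ratio_b, n_atoms):
    """Formula units of each salt for a molar ratio ratio_a:ratio_b and a target atom count.

    Raises ValueError when the atom count is not an integer multiple of the
    atoms contained in one ratio_a:ratio_b block.
    """
    block = ratio_a * ATOMS_PER_UNIT[salt_a] + ratio_b * ATOMS_PER_UNIT[salt_b]
    if n_atoms % block:
        raise ValueError("atom count does not match the requested ratio")
    k = n_atoms // block
    return {salt_a: ratio_a * k, salt_b: ratio_b * k}

print(total_atoms({"LiCl": 256, "KCl": 256}))  # 1024, the caption's example
print(units_for_cell("KCl", "NdCl3", 1, 1, 1008))
print(units_for_cell("NaCl", "CaCl2", 1, 2, 1024))
```

Every atom count in Table V divides evenly under this reading, which suggests that the quoted compositions correspond to simple integer ratios of formula units.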
1. J. Behler and G. Csányi, “Machine learning potentials for extended systems: A perspective,” Eur. Phys. J. B 94, 142 (2021).
2. V. L. Deringer, A. P. Bartók, N. Bernstein, D. M. Wilkins, M. Ceriotti, and G. Csányi, “Gaussian process regression for materials and molecules,” Chem. Rev. 121, 10073–10141 (2021).
3. M. Ceriotti, “Beyond potentials: Integrated machine learning models for materials,” MRS Bull. 47, 1045–1053 (2022).
4. D. P. Kovács, C. v. d. Oord, J. Kucera, A. E. Allen, D. J. Cole, C. Ortner, and G. Csányi, “Linear atomic cluster expansion force fields for organic molecules: Beyond RMSE,” J. Chem. Theory Comput. 17, 7696–7711 (2021).
5. J. Behler and M. Parrinello, “Generalized neural-network representation of high-dimensional potential-energy surfaces,” Phys. Rev. Lett. 98, 146401 (2007).
6. A. P. Bartók, M. C. Payne, R. Kondor, and G. Csányi, “Gaussian approximation potentials: The accuracy of quantum mechanics, without the electrons,” Phys. Rev. Lett. 104, 136403 (2010).
7. A. P. Thompson, L. P. Swiler, C. R. Trott, S. M. Foiles, and G. J. Tucker, “Spectral neighbor analysis method for automated generation of quantum-accurate interatomic potentials,” J. Comput. Phys. 285, 316–330 (2015).
8. I. S. Novikov, K. Gubaev, E. V. Podryabinkin, and A. V. Shapeev, “The MLIP package: Moment tensor potentials with MPI and active learning,” Mach. Learn.: Sci. Technol. 2, 025002 (2020).
9. J. S. Smith, O. Isayev, and A. E. Roitberg, “ANI-1: An extensible neural network potential with DFT accuracy at force field computational cost,” Chem. Sci. 8, 3192–3203 (2017).
10. K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, and K.-R. Müller, “SchNet–A deep learning architecture for molecules and materials,” J. Chem. Phys. 148, 241722 (2018).
11. A. S. Christensen, L. A. Bratholm, F. A. Faber, and O. Anatole von Lilienfeld, “FCHL revisited: Faster and more accurate quantum machine learning,” J. Chem. Phys. 152, 044107 (2020).
12. H. Huo and M. Rupp, “Unified representation of molecules and crystals for machine learning,” Mach. Learn.: Sci. Technol. 3, 045017 (2022).
13. R. Lot, F. Pellegrini, Y. Shaidu, and E. Küçükbenli, “PANNA: Properties from artificial neural network architectures,” Comput. Phys. Commun. 256, 107402 (2020).
14. A. M. Cooper, J. Kästner, A. Urban, and N. Artrith, “Efficient training of ANN potentials by including atomic forces via Taylor expansion and application to water and a transition-metal oxide,” npj Comput. Mater. 6, 54 (2020).
15. S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt, and B. Kozinsky, “E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials,” Nat. Commun. 13, 2453 (2022).
16. J. Vandermause, S. B. Torrisi, S. Batzner, Y. Xie, L. Sun, A. M. Kolpak, and B. Kozinsky, “On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events,” npj Comput. Mater. 6, 20 (2020).
17. G. Sivaraman, A. N. Krishnamoorthy, M. Baur, C. Holm, M. Stan, G. Csányi, C. Benmore, and Á. Vázquez-Mayagoitia, “Machine-learned interatomic potentials by active learning: Amorphous and liquid hafnium dioxide,” npj Comput. Mater. 6, 104 (2020).
18. J. Guo, L. Ward, Y. Babuji, N. Hoyt, M. Williamson, I. Foster, N. Jackson, C. Benmore, and G. Sivaraman, “Composition-transferable machine learning potential for LiCl-KCl molten salts validated by high-energy x-ray diffraction,” Phys. Rev. B 106, 014209 (2022).
19. D. Montes de Oca Zapiain, M. A. Wood, N. Lubbers, C. Z. Pereyra, A. P. Thompson, and D. Perez, “Training data selection for accuracy and transferability of interatomic potentials,” npj Comput. Mater. 8, 189 (2022).
20. L. Bonati and M. Parrinello, “Silicon liquid structure and crystal nucleation from ab initio deep metadynamics,” Phys. Rev. Lett. 121, 265701 (2018).
21. G. Sivaraman, G. Csanyi, A. Vazquez-Mayagoitia, I. T. Foster, S. K. Wilke, R. Weber, and C. J. Benmore, “A combined machine learning and high-energy x-ray diffraction approach to understanding liquid and amorphous metal oxides,” J. Phys. Soc. Jpn. 91, 091009 (2022).
22. N. Bernstein, G. Csányi, and V. L. Deringer, “De novo exploration and self-guided learning of potential-energy surfaces,” npj Comput. Mater. 5, 99 (2019).
23. C. van der Oord, M. Sachs, D. Kovacs, C. Ortner, and G. Csanyi, “Hyperactive learning for data-driven interatomic potentials,” preprint available at Research Square, https://doi.org/10.21203/rs.3.rs-2248548/v1 (2022).
24. G. C. Sosso et al., “Soap_gas,” https://github.com/gcsosso/SOAP_GAS, 2022.
25. S. K. Natarajan and M. A. Caro, “Particle swarm based hyper-parameter optimization for machine learned interatomic potentials,” arXiv:2101.00049 (2020).
26. J. Dupont, “From molten salts to ionic liquids: A ‘nano’ journey,” Acc. Chem. Res. 44, 1223–1231 (2011).
27. H. Kim, D. A. Boysen, J. M. Newhouse, B. L. Spatocco, B. Chung, P. J. Burke, D. J. Bradwell, K. Jiang, A. A. Tomaszowska, K. Wang et al., “Liquid metal batteries: Past, present, and future,” Chem. Rev. 113, 2075–2099 (2013).
28. H. D. Gougar, D. A. Petti, P. A. Demkowicz, W. E. Windes, G. Strydom, J. C. Kinsey, J. Ortensi, M. Plummer, W. Skerjanc, R. L. Williamson et al., “The US department of energy’s high temperature reactor research and development program—Progress as of 2019,” Nucl. Eng. Des. 358, 110397 (2020).
29. J. Guo, N. Hoyt, and M. Williamson, “Multielectrode array sensors to enable long-duration corrosion monitoring and control of concentrating solar power systems,” J. Electroanal. Chem. 884, 115064 (2021).
30. M. Mehos, C. Turchi, J. Vidal, M. Wagner, Z. Ma, C. Ho, W. Kolb, C. Andraka, and A. Kruizenga, “Concentrating solar power Gen3 demonstration roadmap,” Technical Report No. NREL/TP-5500-67464, National Renewable Energy Lab, Golden, CO, 2017.
31. S. Tovey, A. Narayanan Krishnamoorthy, G. Sivaraman, J. Guo, C. Benmore, A. Heuer, and C. Holm, “DFT accurate interatomic potential for Molten NaCl from machine learning,” J. Phys. Chem. C 124, 25760–25768 (2020).
32. G. Sivaraman, J. Guo, L. Ward, N. Hoyt, M. Williamson, I. Foster, C. Benmore, and N. Jackson, “Automated development of Molten salt machine learning potentials: Application to LiCl,” J. Phys. Chem. Lett. 12, 4278–4285 (2021).
33. V. Woo, N. E. Jackson, G. Sivaraman et al. (2023), “pythonpanda2/al4gap_jcp: Initial release,” Zenodo. https://doi.org/10.5281/zenodo.7916551
34. A. P. Bartók, R. Kondor, and G. Csányi, “On representing chemical environments,” Phys. Rev. B 87, 184115 (2013).
35. V. L. Deringer and G. Csányi, “Machine learning based interatomic potential for amorphous carbon,” Phys. Rev. B 95, 094203 (2017).
36. G. Sivaraman, L. Gallington, A. N. Krishnamoorthy, M. Stan, G. Csányi, Á. Vázquez-Mayagoitia, and C. J. Benmore, “Experimentally driven automated machine-learned interatomic potential for a refractory oxide,” Phys. Rev. Lett. 126, 156002 (2021).
37. G. Sivaraman, “ML-IP 2021, A Psi-K tutorial workshop: From atomistic to coarse grained,” https://youtu.be/yDDMNh2-fbk.
38. L. McInnes, J. Healy, and S. Astels, “HDBSCAN: Hierarchical density based clustering,” J. Open Source Software 2, 205 (2017).
39. B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. De Freitas, “Taking the human out of the loop: A review of Bayesian optimization,” Proc. IEEE 104, 148–175 (2015).
40. S. Partee, M. Ellis, A. Rigazzi, A. E. Shao, S. Bachman, G. Marques, and B. Robbins, “Using machine learning at scale in numerical simulations with SmartSim: An application to ocean climate modeling,” J. Comput. Sci. 62, 101707 (2022).
41. W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives, “Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids,” J. Am. Chem. Soc. 118, 11225–11236 (1996).
42. J. Sun, A. Ruzsinszky, and J. P. Perdew, “Strongly constrained and appropriately normed semilocal density functional,” Phys. Rev. Lett. 115, 036402 (2015).
43. J. Sun, R. C. Remsing, Y. Zhang, Z. Sun, A. Ruzsinszky, H. Peng, Z. Yang, A. Paul, U. Waghmare, X. Wu et al., “Accurate first-principles structures and energies of diversely bonded systems from an efficient density functional,” Nat. Chem. 8, 831–836 (2016).
44. D. Tisi, L. Zhang, R. Bertossa, H. Wang, R. Car, and S. Baroni, “Heat transport in liquid water from first-principles and deep neural network simulations,” Phys. Rev. B 104, 224202 (2021).
45. G. Kresse and J. Furthmüller, “Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set,” Phys. Rev. B 54, 11169 (1996).
46. P. E. Blöchl, “Projector augmented-wave method,” Phys. Rev. B 50, 17953 (1994).
47. The GpyOpt Authors, “GPyOpt: A Bayesian optimization framework in Python,” http://github.com/SheffieldML/GPyOpt (2016).
48. A. Barducci, G. Bussi, and M. Parrinello, “Well-tempered metadynamics: A smoothly converging and tunable free-energy method,” Phys. Rev. Lett. 100, 020603 (2008).
49. A. Laio and M. Parrinello, “Escaping free-energy minima,” Proc. Natl. Acad. Sci. U. S. A. 99, 12562–12566 (2002).
50. G. J. Janz, Molten Salts Handbook (Elsevier, 2013).
51. S. Plimpton, “Fast parallel algorithms for short-range molecular dynamics,” J. Comput. Phys. 117, 1–19 (1995).
52. G. Csányi, S. Winfield, J. R. Kermode, A. De Vita, A. Comisso, N. Bernstein, and M. C. Payne, “Expressive programming for computational physics in Fortran 95+,” in IOP Computational Physics Newsletter (Spring, 2007).
53. S. Nosé, “A unified formulation of the constant temperature molecular dynamics methods,” J. Chem. Phys. 81, 511–519 (1984).
54. W. G. Hoover, “Canonical dynamics: Equilibrium phase-space distributions,” Phys. Rev. A 31, 1695 (1985).
55. M. Parrinello and A. Rahman, “Polymorphic transitions in single crystals: A new molecular dynamics method,” J. Appl. Phys. 52, 7182–7190 (1981).
56. G. J. Martyna, D. J. Tobias, and M. L. Klein, “Constant pressure molecular dynamics algorithms,” J. Chem. Phys. 101, 4177–4189 (1994).
57. W. Shinoda, M. Shiga, and M. Mikami, “Rapid estimation of elastic constants by molecular dynamics simulation under constant stress,” Phys. Rev. B 69, 134103 (2004).
58. A. Stukowski, “Visualization and analysis of atomistic simulation data with OVITO–The open visualization tool,” Modell. Simul. Mater. Sci. Eng. 18, 015012 (2009).
59. K. Igarashi, T. Nijima, and J. Mochinaga, “Structure of molten CaCl2-NaCl mixture,” in Proceedings of the First International Symposium on Molten Salt Chemistry And Technology (Kyoto, Japan, 1983), Vol. 469.
60. T. E. Faber and J. M. Ziman, “A theory of the electrical properties of liquid metals: III. The resistivity of binary alloys,” Philos. Mag. 11, 153–173 (1965).
61. D. A. Keen, “A comparison of various commonly used correlation functions for describing total scattering,” J. Appl. Crystallogr. 34, 172–177 (2001).
62. X. Wei, D. Chen, S. Liu, W. Wang, J. Ding, and J. Lu, “Structure and thermophysical properties of molten calcium-containing multi-component chlorides by using specific BMH potential parameters,” Energies 15, 8878 (2022).
63. M. Bu, W. Liang, G. Lu, and J. Yu, “Static and dynamic ionic structure of molten CaCl2 via first-principles molecular dynamics simulations,” Ionics 27, 771–779 (2021).
64. A. Zeidler, P. S. Salmon, T. Usuki, S. Kohara, H. E. Fischer, and M. Wilson, “Structure of Molten NaCl and the decay of the pair-correlations,” J. Chem. Phys. 157, 094504 (2022).
65. S. E. Day and R. L. McGreevy, “Structure factors of Molten CaCl2 and MgCl2 at low q,” Phys. Chem. Liq. 15, 129–136 (1985).
66. R. McGreevy and L. Pusztai, “The structure of molten salts,” Proc. R. Soc. London, Ser. A 430, 241–261 (1990).
67. P. S. Salmon, “The structure of Molten and Glassy 2: 1 binary systems: An approach using the Bhatia–Thornton formalism,” Proc. R. Soc. London, Ser. A 437, 591–606 (1992).
68. A. Zeidler, P. Chirawatkul, P. S. Salmon, T. Usuki, S. Kohara, H. E. Fischer, and W. S. Howells, “Structure of the network glass-former ZnCl2: From the boiling point to the glass,” J. Non-Cryst. Solids 407, 235–245 (2015).
69. F. Wu, S. Sharma, S. Roy, P. Halstenberg, L. C. Gallington, S. M. Mahurin, S. Dai, V. S. Bryantsev, A. S. Ivanov, and C. J. Margulis, “Temperature dependence of short and intermediate range order in Molten MgCl2 and its mixture with KCl,” J. Phys. Chem. B 124, 2892–2899 (2020).
70. M. Wilson and P. A. Madden, “‘Prepeaks’ and ‘first sharp diffraction peaks’ in computer simulations of strong and fragile ionic liquids,” Phys. Rev. Lett. 72, 3033 (1994).
71. M. Salanne and P. A. Madden, “Polarization effects in ionic solids and melts,” Mol. Phys. 109, 2299–2315 (2011).
72. A.-L. Rollet and M. Salanne, “Studies of the local structures of molten metal halides,” Ann. Rep. Sec. C: Phys. Chem. 107, 88–123 (2011).
73. G. M. Photiadis et al., “Co-ordination of thorium (IV) in molten alkali-metal chlorides and the structure of liquid and glassy thorium (IV) chloride,” J. Chem. Soc., Dalton Trans. 3541–3548 (1999).
74. G. M. Photiadis, B. Børresen, and G. N. Papatheodorou, “Vibrational modes and structures of lanthanide halide–alkali halide binary melts LnBr3–KBr (Ln=La, Nd, Gd) and NdCl3–ACl (A=Li, Na, K, Cs),” J. Chem. Soc., Faraday Trans. 94, 2605–2613 (1998).
75. F. J. Alexander, J. Ang, J. A. Bilbrey, J. Balewski, T. Casey, R. Chard, J. Choi, S. Choudhury, B. Debusschere, A. M. DeGennaro et al., “Co-design center for exascale machine learning technologies (exalearn),” Int. J. High Perform. Comput. Appl. 35, 598–616 (2021).
76. J. Guo, V. Woo, D. Andersson, N. Hoyt, M. Williamson, I. Foster, C. Benmore, N. E. Jackson, and G. Sivaraman (2023). “Metadynamics enhanced training datasets, DFT-SCAN accurate GAP model and MD trajectories for ‘AL4GAP: Active learning workflow for generating DFT-SCAN accurate machine-learning potentials for combinatorial molten salt mixtures,’” Figshare. https://doi.org/10.6084/m9.figshare.22534981.v1
