To design new materials and understand their novel phenomena, it is imperative to predict the structure and properties of materials, a task that often relies on first-principles theory. However, such methods are computationally demanding and limited to small systems. This topical review examines machine learning (ML) approaches, specifically non-parametric sparse Gaussian process regression (SGPR), for modeling the potential energy surface (PES) of materials, starting from the basics of ML methods to give a comprehensive overview. SGPR can efficiently represent the PES with minimal ab initio data, significantly reducing computational costs by bypassing the need to invert massive covariance matrices. The SGPR rank reduction accelerates density functional theory calculations by orders of magnitude, enabling accelerated simulations. An optimal adaptive sampling algorithm is utilized for on-the-fly regression with molecular dynamics, extending to interatomic potentials through a scalable SGPR formalism. By merging quantum mechanics with ML methods, the universal first-principles SGPR-based ML potential can create a digital twin capable of predicting phenomena arising from static and dynamic changes as well as inherent and collective characteristics of materials. These techniques have been applied successfully to materials such as solid electrolytes, lithium-ion batteries, electrocatalysts, solar cells, and macromolecular systems, reproducing their structures, energetics, dynamics, properties, phase changes, materials performance, and device efficiency. This review discusses the built-in-library universal first-principles SGPR-based ML potential, showcasing its applications and successes, and offers insights into the development of future ML potentials and their applications in advanced materials, catering to both introductory and expert readers.
I. INTRODUCTION
Breakthroughs in functional materials are driving innovation across various technological fields, spanning from clean energy to information processing. Historically, the discovery of new materials has been hindered by costly trial-and-error methods. Meanwhile, advancements in deep learning, particularly in language, vision, and biology, have demonstrated remarkable predictive capabilities with the accumulation of data and computational power. Leveraging these trends, researchers have developed various machine learning (ML) methods. In particular, for the development of new functional materials, the structure and properties of the desired materials can be predicted by applying various ML approaches to candidate materials. One noteworthy achievement is the recent success of ML-based design in creating new solid-state materials.1,2
To achieve accurate predictions for diverse material systems, conducting simulations using ab initio calculations such as first-principles theory is the most desirable route. Consequently, the most comprehensive approach to designing new functional materials involves employing simulation methods that range from simple to complex material systems, operating at the quantum mechanical level. However, ab initio calculations demand substantial computational resources and are often constrained to systems comprising only a few hundred atoms. To overcome these limitations, numerous efforts have been directed toward utilizing ML approaches to generate first-principles-level interatomic potentials (IAPs) and forces for materials. These endeavors span a wide array of ML approaches for generating first-principles-level IAPs of large-scale materials.3–35
Diverse ML methods allow us to extend the high quality of ab initio level calculations to large systems and long time periods. Indeed, density functional theory (DFT) combined with ML has enabled fast and realistic computer experiments, which correlate very well with the real world. Notable methods are neural network (NN) representations,36–42 Bayesian linear regression (BLR)43–46 and Gaussian process regression (GPR),25,47–49 gradient-domain50 and symmetrized gradient-domain ML,51 deep potential,52 compressed sensing,53 sparse Gaussian process regression (SGPR),54 etc.
Based on these various ML methods, there have been diverse applications for materials simulations such as inorganic crystals,55–57 metal particles,58,59 solid electrolytes,54,60–63 batteries,64–66 electrocatalysts,67–70 perovskite solar cells,71–73 molecular liquids,74–78 organic/bioorganic molecular systems,79–81 macromolecules,82 supramolecules, and polymers.83 In particular, the SGPR rank reduction84,85 was able to accelerate a DFT calculation of ∼1000 atoms by a factor of 10⁴, resulting in accelerated simulations of silicon crystals, sulfide solid electrolytes as well as others.54
In this context, this topical review outlines ML representations of the potential energy surface (PES) of materials using non-parametric Bayesian and GPR algorithms, which have the advantage of estimating the possible errors in the predicted potential and force fields. These representations can effectively capture the PES with a minimal amount of ab initio data, contrasting with NNs that typically require extensive datasets for training their hyper-parameters. The computational efficiency of the GPR algorithm is primarily determined by the inversion of the covariance matrix between data components, such as energy and forces. Addressing this computational bottleneck, SGPR representations have been developed for PES, wherein large datasets are projected into smaller inducing sets, bypassing the need for inverting massive covariance matrices and significantly reducing computational costs. Several orders of magnitude improvement in computational cost is achieved. Moreover, an optimal adaptive sampling algorithm has been devised to facilitate on-the-fly data generation for regression, in conjunction with molecular dynamics (MD) simulations. Through hierarchical combinations of expert models, universal potentials have been generated, enabling the modeling of large-scale complexity via a combinatorial approach.63
As a demonstration, this review presents examples such as the reproduction of melting, glass-crystallization, and phase-change for silicon materials and solid electrolytes. The ab initio level MD simulations using on-the-fly ML of IAPs have been conducted to survey Li diffusivity in hundreds of crystals, potentially applicable as electrolytes for all-solid-state batteries. The models generated for these crystals can be amalgamated to create more general and transferable models through the hierarchical combination of expert models, which generate universal potentials, facilitating the simulation of all these materials and their analogs. This leads to applications for various systems, including batteries, solar cells, catalysts, and hydrocarbon macromolecular systems, paving the way for the eventual evolution of ab initio-level force fields for general materials systems. These universal potentials can be useful in designing novel functional molecular/materials systems as well as predicting their structures and properties. The continuous accumulation of these universal potentials could eventually evolve toward super-AI through a self-learning process over diverse IAPs, which would lead to the prediction of diverse undiscovered materials. This comprehensive review offers in-depth insights into the evolution of ML potentials and their practical implementations in advanced functional materials. It is crafted to accommodate readers at both introductory and advanced levels of comprehension, ensuring accessibility for a wide range of audiences.
II. THEORETICAL BACKGROUND
A. Introduction to machine learning (ML) methods
Ab initio simulations are an integral part of our understanding of materials physics and chemistry alongside experiments. The major bottleneck for application of ab initio simulations is the computational cost, which grows rapidly, in some methods exponentially, with the size of the system. If N is the number of electrons, the computational cost of different ab initio methods scales as O(N³) or higher powers of N. DFT is one of the most popular and efficient ab initio methods, with a good trade-off between accuracy and computational cost.86 The cost of DFT calculations based on the Kohn–Sham scheme, which requires diagonalization to obtain the Kohn–Sham orbitals, scales as O(N³) unless one uses fragment-based calculations built on divide-and-conquer algorithms.87,88 From a practical point of view, DFT calculations are affordable only for a few hundred atoms with reasonable computational resources. In many cases for simulating large systems, force fields are used, which express the potential energy in terms of distances, bonds, angles, dihedrals, etc. Force fields are especially popular in simulating biological systems, where large macromolecules such as proteins need to be simulated. The parameters in a force field are often set by fitting to limited experimental data or ab initio results. However, these force fields are of varying transferability between different thermodynamic environments and sometimes between the calculation of different properties within the same environment. In many cases, a force field cannot even satisfactorily fit ab initio data because of many-body quantum effects, which cannot be reduced to bonds, angles, etc.89,90
As an alternative to empirically fitting the force fields with simple analytical forms, ML techniques have recently emerged as a powerful tool for representing the mathematical entities used in ab initio modeling.15 Using ML, the potential energy is learned from a set of n ab initio data points. ML reduces the scaling of the computational cost for energy and force calculations to be linear in the number of atoms in the system. This allows us to apply a potential, which may be trained with the data obtained for small systems, to arbitrarily large systems. Currently, the most widely used ML methods are variations of NNs,36 Gaussian approximation potentials (GAPs),47 and BLR methods.43,53,91 In spite of their success stories, each method has shortcomings. NNs and linear methods need large datasets for training. GAPs can be used for robust modeling with small datasets, but their computational cost for training scales as O(n³) with the size n of the data. Here we discuss the SGPR algorithm, which shows very attractive traits in comparison to the previous ML methods. Moreover, we address the issue of optimal data generation, which may be even more important than the regression algorithm. Combined with SGPR, we also introduce an adaptive sampling algorithm that can build robust potentials with a surprisingly small number of DFT data.
The capabilities of this methodology are demonstrated with applications to superionic conducting solid electrolytes. Li diffusivity is studied in dozens of solid electrolytes. For a certain sulfide based electrolyte, the melting, melt-quench, and diffusivity in the glass phase and glass crystallization are simulated, which are found to be consistent with experimental observations. Additionally, since some of the simulated electrolytes had similar chemical elements, the ML models generated for these materials can be combined to give a universal potential for that subset of elements.
In many simulations, even a computationally fast potential is not enough. For instance, slow dynamics or rare events may require inconceivably long molecular dynamics or Monte Carlo simulations in order to achieve statistical certainty. In general, there are two scales that determine the size of the simulation cell and the period of the simulation. The size of the simulation cell must be much larger than the length scale of the spatial correlations. Similarly, the period of the simulation must be much longer than the decay time of the temporal autocorrelation functions. In these situations, enhanced sampling algorithms are necessary in addition to fast potentials. Although the temporal autocorrelation function decays slowly in many cases in chemistry (compared with that of simple liquids92), the spatial correlation length is often small (compared, for example, with the Debye length at typical salt concentrations in biological fluids93). Two exceptions exist where the spatial correlation length is divergent: the first is at the critical point of continuous phase transitions, and the second is when a system exhibits quasi-long-range order. In this regard, fast and accurate ML models for first-principles potential energy are critical for large-scale materials simulations. Thus, in this section, we address the ML representations of the potential energy surface (PES), including NNs, GPR, and BLR, and then discuss SGPR along with the adaptive sampling algorithms.
As for the ML representations of the PES, the inputs are the atomic coordinates and the targets are the potential energy and forces obtained from ab initio calculations such as DFT. A set of (inputs, targets) constitute the data that will be used for modeling. Assembling the data is an important topic by itself. The data largely determine the domain of validity for the generated model and the computational cost. Therefore, the generation of optimal data might be the most important step for wider transferability and acceptance of the simulation results. Here, we will assume that the data are generated beforehand and we wish to model the existing data. We will discuss the general aspects of ML algorithms and briefly introduce the state of the art methods. Optimal generation of the data by algorithms such as on-the-fly sampling or active learning will be discussed later.
B. Notations
The following nomenclature will be used hereafter. It should be noted that distinct symbol styles are used to distinguish functions and kernels from coefficients, vectors, etc.
C. Representation of the potential energy
The central assumption in the above formulation is that of locality. This assumption is exercised in practice by choosing a finite cutoff radius rc. There is a trade-off in choosing the cutoff. Choosing a large cutoff dilutes the features of LCEs and increases the computational cost. On the other hand, by choosing a small cutoff, the effect of atoms beyond the cutoff will appear as random forces that cannot be explained. Therefore, the cutoff should be chosen by balancing the errors of the regression algorithm and the random noise due to the atoms beyond the cutoff. Such a principled approach has been taken for ML of the potential energy for amorphous carbon.94
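As a minimal illustration of this locality assumption, the sketch below collects, for each atom, the displacement vectors to its neighbors within a cutoff rc; descriptors of the local chemical environment (LCE) are then built from these vectors. It is a schematic, non-periodic NumPy example, not code from any of the packages discussed in this review.

```python
import numpy as np

def local_environments(positions, rc):
    """For each atom, collect displacement vectors to all neighbors within rc
    (non-periodic cell; illustrates the locality/cutoff assumption only)."""
    envs = []
    for ri in positions:
        vectors = positions - ri                  # displacements to all atoms
        dists = np.linalg.norm(vectors, axis=1)   # pair distances
        mask = (dists > 0.0) & (dists < rc)       # neighbors inside the cutoff
        envs.append(vectors[mask])
    return envs

# Example: a random 16-atom cluster with rc = 4 Angstrom
rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 8.0, size=(16, 3))
lces = local_environments(positions, rc=4.0)
print([len(env) for env in lces])                 # neighbor count per atom
```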
D. Complexity of machine learning models
E. Neural networks
NNs are especially efficient for big data since the computational cost of training a NN on a dataset of size n scales as O(n). On the other hand, it is not easy to train a NN model using a small training set without over-fitting. Therefore, NNs are very useful when large datasets are already available. However, in the context of ML of the PES, it should be noted that the data are obtained from expensive ab initio calculations and are often scarce.
ML potentials based on NNs have greatly improved due to advancements in graph-based algorithms.1,17,40,95–100 When the predicted energies are constrained to be invariant under rotation, translation, and permutation, and the predicted forces are constrained to be equivariant under rotation as the NN parameters are updated, the overfitting issues are significantly reduced.
Graph neural networks (GNNs) are useful for modeling systems where data can naturally be represented as a graph, such as atomic systems in materials and molecular simulations. In these models, nodes represent atoms, and edges represent interactions like bonds or forces between them. Two important categories of GNNs in this domain are invariant models (M3GNet,96 CHGNet97) and equivariant models (NequIP,40 MACE17). Invariant models ensure the output properties are invariant to the rotation, translation, and reflection of the input system, while equivariant models enforce that the predictions change in a way that follows the symmetry transformations (like rotations) of the input system. Invariant models are typically simpler and faster, suitable for scalar outputs like total energy, while equivariant models that incorporate geometric symmetry are ideal for tasks requiring directional dependencies (such as forces and torques).
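The invariant/equivariant distinction can be verified numerically. The sketch below uses a toy pairwise energy (not any of the GNN models above) to show what the constraints mean in practice: rotating the configuration leaves the total energy unchanged (invariance), while the forces rotate with the same rotation matrix (equivariance).

```python
import numpy as np

def toy_energy_forces(positions):
    """Toy pair energy E = sum_{i<j} exp(-r_ij) with analytic forces,
    used only to illustrate invariance of E and equivariance of forces."""
    n = len(positions)
    energy, forces = 0.0, np.zeros_like(positions)
    for i in range(n):
        for j in range(i + 1, n):
            rij = positions[i] - positions[j]
            r = np.linalg.norm(rij)
            energy += np.exp(-r)
            fij = np.exp(-r) * rij / r        # force on atom i from atom j
            forces[i] += fij
            forces[j] -= fij
    return energy, forces

rng = np.random.default_rng(1)
pos = rng.normal(size=(5, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal transformation
E0, F0 = toy_energy_forces(pos)
E1, F1 = toy_energy_forces(pos @ Q.T)         # transformed configuration
print(np.isclose(E0, E1))                     # True: energy is invariant
print(np.allclose(F1, F0 @ Q.T))              # True: forces are equivariant
```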
F. Gaussian processes
1. Multivariate normal distributions
2. Gaussian process regression
Meanwhile, for GPR, we need to invert the covariance matrix on the training data, which is computationally costly. If the size of the covariance matrix is n, the computational cost for matrix inversion is O(n³). To reduce this unfavorable scaling of the cost, several approximation methods have emerged that bring the scaling down to be linear in n. We will discuss one of them in a later section.
3. Hyper-parameters
If the explicit inverse of the matrix is not required, as when only quantities of the form Γ⁻¹ applied to a vector are needed, we can bypass the matrix inversion and calculate the result α as the solution of the corresponding linear system. If Γ is symmetric and positive definite, the Cholesky decomposition can be used to significantly reduce the computational complexity. An important point is worth noting here about the noise term σ², which is added to the diagonal elements of the covariance matrix. This should not be set to exactly zero, even if our data are noiseless. A small non-zero σ is needed to ensure the numerical stability of the Cholesky factorization and matrix inversion. The reason is that, although the covariance matrix is in principle positive definite, an eigenvalue can become 0 due to numerical errors, which makes the determinant 0 and the inverse non-existent.
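A minimal sketch of this point, with a placeholder squared-exponential kernel and synthetic one-dimensional data: the weights α are obtained by a Cholesky solve of the linear system rather than an explicit inverse, and a small σ² added to the diagonal keeps the factorization numerically stable.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(X1, X2, length=1.0):
    """Squared-exponential covariance (a placeholder kernel choice)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.01 * rng.normal(size=50)     # synthetic targets

sigma = 1e-4                                         # small but non-zero jitter
Gamma = rbf_kernel(X, X) + sigma ** 2 * np.eye(len(X))
c, low = cho_factor(Gamma)                           # stable Cholesky factorization
alpha = cho_solve((c, low), y)                       # solves Gamma @ alpha = y

Xs = np.linspace(-3, 3, 5)[:, None]
print(rbf_kernel(Xs, X) @ alpha)                     # predictive mean at test points
```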
4. Covariance matrix for energies and forces
G. Potential energy inference
H. Bayesian linear regression
I. Compressed sensing
J. Sparse Gaussian process regression (SGPR)
In Secs. II G and II H, we discussed two types of Bayesian regression algorithms: full-rank GPR and BLR methods. In full-rank GPR, the cost for calculation of the covariance matrix scales as O(n²) for a dataset of size n, and the cost for evaluation of the regression weights is dominated by inversion of the covariance matrix, which scales as O(n³). In linear regression methods, both of these costs are reduced to scale linearly with n. However, linear regression methods do not come close in data efficiency: GPR can represent the potential energy surface with much less data at the same accuracy. In this section, we explore a low-rank approximation of the covariance matrix in GPR to reduce the cost scaling to that of the linear regression methods while maintaining the data efficiency and expressiveness of GPR. This algorithm is called SGPR modeling. We also address on-the-fly data sampling methods that are useful to maximize the efficiency of SGPR.
Given that the kernel in SGPR only needs to be calculated between the data and the inducing LCEs, a substantial improvement is achieved compared with full GPR. In a naive comparison, the improvement is a factor of n/m, while a more rigorous analysis shows a much better improvement; for example, with 100 configurations (100 atoms each), the improvement amounts to several orders of magnitude.
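The scaling argument can be made concrete with a small sketch of a subset-of-regressors (Nyström-type) predictive mean, which is the skeleton behind sparse GP approximations: only the n×m kernel block between data and inducing points is built, and the only matrix to invert is m×m. The kernel, the data, and the random choice of inducing points below are placeholders, not the descriptors or selection rules of the SGPR potentials discussed here.

```python
import numpy as np

def rbf(X1, X2, length=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

rng = np.random.default_rng(3)
n, m, sigma = 2000, 50, 1e-2                 # m inducing points, m << n
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + sigma * rng.normal(size=n)
Z = X[rng.choice(n, size=m, replace=False)]  # placeholder inducing set

K_nm = rbf(X, Z)                             # O(n m) kernel evaluations, not O(n^2)
K_mm = rbf(Z, Z)
A = sigma ** 2 * K_mm + K_nm.T @ K_nm        # O(n m^2): the dominant, linear-in-n cost
w = np.linalg.solve(A, K_nm.T @ y)           # O(m^3) instead of O(n^3)

Xs = np.linspace(-3, 3, 5)[:, None]
print(rbf(Xs, Z) @ w)                        # predictive mean at test points
```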
K. Variational lower bound for log-likelihood
L. Adaptive sampling algorithm
The most important step, regardless of the regression algorithm, is the generation of the data. Early methods36,47 prepared the data beforehand, which is not efficient; for instance, they used existing ab initio calculations or generated samples by random displacement of atoms. On-the-fly learning methods43,44,91,101 estimate the accuracy of the potential and use a criterion for data generation during molecular dynamics simulations. In GPR-based algorithms, the predictive variance is a suitable criterion, while in NNs the uncertainty can be emulated by training identical NNs with different initial random seeds. Generally, GPR-based algorithms are more suited for on-the-fly learning since they have only a few hyper-parameters. However, the size of these potentials and their computational cost grow with time; therefore, the data generated earlier need to be dumped to keep the potential affordable. Thus, for globally accurate potentials, it is critical to devise an optimal adaptive sampling algorithm. We tackle this issue by replacing the predictive variance with another criterion, which we call the geometric criterion. This is motivated by the observation that the predictive variance does not take advantage of the energy and forces data. Assume that a SGPR model is generated using n ab initio data X and m inducing LCEs z. Aside from hyper-parameter optimization, the potential can be modified in two ways: adding an LCE to the inducing set z or adding a new ab initio data point to X. Instead of calculating the predictive variance, we calculate the change in the predictive mean of the potential by directly constructing the potential both with and without the candidate LCE (and similarly for a candidate data point). If the change in the predictive mean is larger than a threshold ϵ, the addition is accepted. Inclusion of the forces data in the regression is critical for this algorithm; otherwise, the geometric criterion may become only marginally better than the variance criterion.
For adaptive sampling of the data and inducing LCEs, the model is built on-the-fly with MD, where the system evolves with the SGPR model, and ab initio calculations are carried out actively to correct the model based on the following criterion. The data and unique LCEs in the first step are automatically included in X and z. In the following steps, we try to insert on-the-fly LCEs into z. An LCE is added to z only if the resulting change in the predictive energy exceeds a predetermined threshold ϵ. Since it is too costly to try out all LCEs, in the spirit of importance sampling, we attempt insertion of LCEs based on their spilling factor [see Eq. (79)] in descending order and terminate at the first unsuccessful insertion trial. If at least one LCE is added to z, it means that the system has potentially crossed into an unfamiliar region. At this stage, in one algorithm (FAST), we calculated the exact ab initio energy and forces and included the snapshot in X only if its inclusion changed the predictive mean by more than ϵ. Thus, sometimes the exact ab initio data were regarded as redundant and were rejected by the model. This was devised to keep the number of sampled data (and thus the computational cost in future steps) as small as possible. In another algorithm (ULTRAFAST), in order to preempt unwanted ab initio calculations, predictions of the model were used as fake ab initio data, and if the insertion was accepted, they were corrected by exact calculations. This algorithm is similar to the leave-one-out cross-validation method for active learning.104
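The sketch below caricatures this accept/reject logic on a one-dimensional toy problem: at each "MD step," a candidate LCE and data point are tentatively inserted, and the insertion is kept only if the predictive mean at the current configuration changes by more than ϵ. For brevity it merges the LCE and data tests into one, uses the subset-of-regressors form of the sparse model, and calls the reference energy at every step; the FAST and ULTRAFAST schedules described above are precisely designed to avoid most of those reference calls. It illustrates the geometric criterion only and is not the actual implementation.

```python
import numpy as np

def rbf(a, b, length=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def fit(X, y, Z, sigma=1e-2):
    """Sparse-GP weights for inducing points Z (subset-of-regressors form)."""
    K_nm, K_mm = rbf(X, Z), rbf(Z, Z)
    return np.linalg.solve(sigma ** 2 * K_mm + K_nm.T @ K_nm, K_nm.T @ y)

def true_energy(x):                        # stand-in for an expensive ab initio call
    return np.sin(3 * x) + 0.1 * x ** 2

eps = 1e-2                                 # geometric-criterion threshold
X = np.array([0.0]); y = true_energy(X)    # first snapshot is always included
Z = X.copy()                               # inducing set starts from the first LCE
n_calls = 1

for step in range(1, 201):                 # stand-in for an on-the-fly MD trajectory
    x_new = np.array([0.05 * step])        # current configuration / candidate LCE
    w_old = fit(X, y, Z)
    e_old = rbf(x_new, Z) @ w_old          # prediction of the current model
    e_ref = true_energy(x_new); n_calls += 1
    Z_try = np.concatenate([Z, x_new])     # tentative insertion into z and X
    w_try = fit(np.concatenate([X, x_new]), np.concatenate([y, e_ref]), Z_try)
    e_new = rbf(x_new, Z_try) @ w_try
    # geometric criterion: keep the addition only if the predictive mean moves
    if abs(e_new - e_old)[0] > eps:
        Z, X, y = Z_try, np.concatenate([X, x_new]), np.concatenate([y, e_ref])

print(f"{len(X)} snapshots kept out of {n_calls} reference calls over 200 steps")
```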
M. AutoForce: A SGPR-based ML package for atomistic simulations
We have developed the Python package AutoForce for generating SGPR models and on-the-fly learning of the PES.105 This package is coupled with the atomic simulation environment (ASE)106 and from there it can be coupled to various ab initio software including the Vienna Ab initio Simulation Package (VASP),107 which implements the projector augmented wave (PAW)108 approach to DFT with the Perdew–Burke–Ernzerhof (PBE) generalized-gradient approximation (GGA) functional.109
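As an illustration of how such a package plugs into ASE, the sketch below wraps a reference calculator inside a custom ASE Calculator that falls back to the reference whenever a (here trivial) surrogate model is deemed uncertain, and drives it with velocity-Verlet MD. The class OnTheFlyCalculator, its threshold, and the EMT reference are illustrative placeholders that keep the example runnable without VASP; they do not reproduce the actual AutoForce interface or its coupling to VASP.

```python
from ase import units
from ase.build import bulk
from ase.calculators.calculator import Calculator, all_changes
from ase.calculators.emt import EMT
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from ase.md.verlet import VelocityVerlet

class OnTheFlyCalculator(Calculator):
    """Illustrative on-the-fly wrapper: fall back to a reference calculator
    whenever the surrogate model is uncertain (here: always, since no model
    is attached); a real implementation would also update the ML model."""
    implemented_properties = ['energy', 'forces']

    def __init__(self, reference, threshold=0.05, **kwargs):
        super().__init__(**kwargs)
        self.reference = reference
        self.threshold = threshold               # uncertainty threshold (placeholder)

    def model_uncertainty(self, atoms):
        return float('inf')                      # placeholder: no trained model yet

    def calculate(self, atoms=None, properties=('energy',),
                  system_changes=all_changes):
        super().calculate(atoms, properties, system_changes)
        if self.model_uncertainty(atoms) > self.threshold:
            ref_atoms = atoms.copy()
            ref_atoms.calc = self.reference      # expensive reference call
            self.results['energy'] = ref_atoms.get_potential_energy()
            self.results['forces'] = ref_atoms.get_forces()
            # a real on-the-fly scheme would add this snapshot to the training set

atoms = bulk('Cu', cubic=True) * (2, 2, 2)
atoms.calc = OnTheFlyCalculator(reference=EMT())
MaxwellBoltzmannDistribution(atoms, temperature_K=300)
dyn = VelocityVerlet(atoms, timestep=2.0 * units.fs)
dyn.run(10)
print(atoms.get_potential_energy())
```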
N. Benchmark test of SGPR-based ML: Silicon
For a benchmark simulation, we created a SGPR potential for bulk Si, which has been studied by several other ML methods.36,44,47,91,101 A cubic cell with 64 atoms was chosen, together with a suitable k-point grid for the DFT calculations. The calculations are non-spin-polarized, and a kinetic energy cutoff of 500 eV is applied. The potential is generated by on-the-fly sampling with MD simulations at 300 K (for 20 ps) and 1000 K (for 30 ps). Using the ULTRAFAST algorithm, only 13 DFT calculations were performed for data generation, and 36 LCEs were sampled as the inducing set. The entire simulation took less than 1 h. A total of 100 snapshots were chosen from the trajectories for testing, which resulted in an RMSE of 0.07 eV/Å for the forces. The phonon spectra computed with a supercell are shown in Fig. 5. This performance is better than that of all previous ML methods, which used hundreds to thousands of DFT data points for generating a potential. The closest competition is on-the-fly active learning,44 which reached an RMSE of 0.08 eV/Å with 133 DFT calculations for 10 ps of MD at 620 K. The SGPR algorithm, despite using the much more expensive smooth overlap of atomic positions (SOAP) descriptor and running on smaller computational resources, was found to be more than ∼300 times faster than the GPR algorithm.
The RMSE of an ML potential usually increases with temperature because of larger forces and more disorder. In the Si simulation above, from 300 to 700 K the RMSE increases from 0.03 to 0.07 eV/Å. However, at higher temperatures, the larger RMSE can be tolerated because of random forces due to thermal fluctuations. In Ref. 36, a silicon melt simulation with NN errors well above 0.2 eV/Å accurately reproduced the DFT radial distribution function. A minimal requirement is that the distribution of force errors should be Gaussian and centered at 0; then the errors act as random forces due to thermal coupling with a heat bath and do not significantly alter the thermodynamics.
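This "random force" argument can be checked directly on a test set: the sketch below compares the distribution of force-component errors with a zero-mean Gaussian. The arrays are synthetic placeholders standing in for DFT and ML forces on the same test snapshots.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# placeholders for DFT and ML force components (eV/Angstrom) on 100 snapshots
f_dft = rng.normal(0.0, 1.0, size=(100, 64, 3))
f_ml = f_dft + rng.normal(0.0, 0.07, size=f_dft.shape)

err = (f_ml - f_dft).ravel()
rmse = np.sqrt(np.mean(err ** 2))
# a near-zero mean and an approximately normal error distribution indicate that
# the residuals act like random thermal forces rather than a systematic bias
print(f"RMSE = {rmse:.3f} eV/A, mean error = {err.mean():.1e} eV/A")
print("normality test p-value:", stats.normaltest(err).pvalue)
```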
O. Universal potentials
When we consider materials with various compositions but with the same elements, an "expert" model can be generated for each composition based on the SGPR algorithm. Then, a general expert model can be obtained for a material system having diverse compositions. As an example, let us consider the Li–P–S materials system (LixPySz) having four different compositions (Li3PS4, Li7PS6, Li48P16S61, and Li7P3S11), all composed of the same set of elements Li, P, and S. Then, an expert model is generated for each material. By combining these four expert models, one universal expert model can be generated. Three approaches could be utilized:
- Product of experts (POE): The average of all of the expert models is used for making predictions.
- Merging of experts (MOE): The data and inducing LCEs of all expert models are combined by importance sampling using the geometric criterion.
- Fusion of experts (FOE): The data and inducing LCEs of all expert models are fully utilized, without importance sampling.
As an example, here we provide the coefficients of determination R² for the above combinatorial models, i.e., the aforementioned four different crystals of the ternary (Li–P–S) solid electrolyte with the same set of three elements, as listed in Table I. As expected, POE is poorer than the expert models. It is included here because in most of the non-Bayesian ML methods, averaging over the expert models is the only available mechanism for combining ML models, which is equivalent to POE in Bayesian methods. The current study provides a quantitative comparison. Thus, although POE can provide a rough ML-IAP without any further ML procedures, it does not generalize well; ML models trained for different domains may not be simply averaged. MOE is as good as the expert models, even though its size (data/inducing) = (102/573) is much smaller than the combined size of the experts (181/1418).63 FOE has the same size as the combined size of the experts but in almost every case is better than the experts. Thus, FOE could be used when affordable; otherwise, MOE offers a scalable solution that can handle more complex systems much faster.
TABLE I. Coefficients of determination R² for the expert and combined (POE, MOE, FOE) models of four Li–P–S crystals.

| | Li–P–S | Expert | POE | MOE | FOE |
|---|---|---|---|---|---|
| A | Li3PS4 | 0.972 | 0.962 | 0.973 | 0.980 |
| B | Li7PS6 | 0.972 | 0.813 | 0.966 | 0.978 |
| C | Li48P16S61 | 0.953 | 0.697 | 0.952 | 0.969 |
| D | Li7P3S11 | 0.958 | 0.864 | 0.955 | 0.969 |
Similar combinations can also be made for another ternary system such as the Li–Ge–S solid electrolyte, or the same result can be achieved by training a single model sequentially across all domains. Then, as a more interesting and useful example, by combining the two expert models for the two ternary systems Li–P–S and Li–Ge–S, it is possible to generate an expert model for the quaternary Li–P–Ge–S solid electrolyte, once the cross term between P and Ge, i.e., the expert model for the P–Ge pair, is properly provided. If there is no direct contact between these two elements, the cross term could be reliable enough after adding only the long-range interactions between the two elements; as an approximation, such a small cross-term effect could even be ignored within a certain error range. However, if the two elements are directly in contact, the cross term needs to be taken into account. In the case of MOE, the cross term between two elements can be calculated more easily and simply than in the three-element case. Overall, the MOE approach can be utilized with a minimal loss of accuracy in energy. Thus, the combinatorial approach utilizing MOE opens the door for strategies like "high-level parallelism" or "a library of SGPR potentials," which can be combined by simple cross-covariance calculations to describe increasingly complex materials. Such a combinatorial approach is not trivial with other ML methods.
Indeed, the SGPR models for Li3PS4 and Li4GeS4 were simply combined and tested on Li10Ge(PS6)2 (both configurations in Table II). Due to the presence of the different species P and Ge, the combinatorial model cannot be significantly compressed, and thus MOE and FOE are equivalent. The combined model of the two ternary crystals described the forces of the quaternary crystal with an R² very close to that of the expert model. This is understandable in view of the similarity of the LCEs of the cations P and Ge in these crystals (see Fig. 6). In many promising solid electrolytes, they are present as poly-anionic moieties (XS4 units with X = P, Ge, etc.) in the sea of Li+ ions and do not directly interact with each other. By combination of the ML models for the simpler crystals, we can simulate the composite ones with minimal training or even without any additional training.
TABLE II. ML quality (R²), Li diffusion coefficients at the simulated temperatures, activation energies Ea, and room-temperature conductivities σRT for the surveyed solid electrolytes.

| mp-ID | Formula | ML | Diffusion coefficient (Å²/ps) at given temperature (K) | | | | | | | | Ea | σRT |
| | | R² | 400 | 500 | 600 | 700 | 800 | 900 | 1000 | 1200 | (eV) | (mS/cm) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
mp-1211324 | Li7PS6 | 0.972 | 0.15 | 0.29 | 0.54 | 0.83 | 0.18 | 65.2 | ||||
mp-1097036 | Li3PS4 | 0.972 | 0.068 | 0.14 | 0.28 | 0.46 | 0.23 | 20.2 | ||||
mp-641703 | Li7P3S11 | 0.958 | 0.12 | 0.2 | 0.32 | 0.61 | 0.21 | 20.1 | ||||
mp-768197 | Li5SbS4 | 0.934 | 0.1 | 0.18 | 0.31 | 0.23 | 18.4 | |||||
mp-1147621 | Li3InBr6 | 0.924 | 0.018 | 0.057 | 0.13 | 0.2 | 16.1 | |||||
mp-28327 | LiGaBr3 | 0.903 | 0.049 | 0.11 | 0.21 | 7.6 | ||||||
mp-756427 | Li5SbS4 | 0.933 | 0.086 | 0.19 | 0.26 | 0.37 | 0.26 | 6.0 | ||||
mp-685863 | Li4SiO4 | 0.958 | 0.093 | 0.2 | 0.32 | 0.53 | 0.3 | 3.1 | ||||
mp-1195718 | Li4SnS4 | 0.973 | 0.18 | 0.32 | 0.37 | 0.74 | 0.29 | 2.3 | ||||
mp-766409 | Li3BiS3 | 0.946 | 0.034 | 0.077 | 0.12 | 0.29 | 0.27 | 2.1 | ||||
mp-29009 | Li2MgBr4 | 0.944 | 0.045 | 0.096 | 0.18 | 0.28 | 1.4 | |||||
mp-38684 | Li2MgCl4 | 0.958 | 0.047 | 0.095 | 0.21 | 0.46 | 0.3 | 1.3 | ||||
mp-760415 | Li3SbS4 | 0.966 | 0.2 | 0.57 | 0.98 | 0.33 | 0.83 | |||||
mp-1194700 | Li4SnSe4 | 0.962 | 0.15 | 0.3 | 0.41 | 0.8 | 0.33 | 0.6 | ||||
mp-1222582 | Li4GeS4 | 0.97 | 0.14 | 0.31 | 0.4 | 0.93 | 0.38 | 0.24 | ||||
mp-1001069 | Li48P16S61 | 0.953 | 0.027 | 0.083 | 0.17 | 0.38 | 0.22 | |||||
mp-1177520 | Li3SbS3 | 0.929 | 0.018 | 0.058 | 0.18 | 0.39 | 0.38 | 0.2 | ||||
mp-775806 | Li3SbS3 | 0.944 | 0.084 | 0.16 | 0.27 | 0.49 | 0.36 | 0.19 | ||||
mp-38487 | Li7BiO6 | 0.987 | 0.041 | 0.079 | 0.12 | 0.25 | 0.37 | 0.17 | ||||
mp-753720 | Li3BiS3 | 0.939 | 0.08 | 0.14 | 0.21 | 0.51 | 0.39 | 0.079 | ||||
mp-29008 | Li6MgBr8 | 0.965 | 0.019 | 0.052 | 0.13 | 0.23 | 0.41 | 0.077 | ||||
mp-757150 | Li5BiS4 | 0.929 | 0.02 | 0.12 | 0.2 | 0.41 | 0.46 | 0.049 | ||||
mp-696128 | Li10Ge(PS6)2 | 0.967 | 0.16 | 0.32 | 0.55 | 0.76 | 0.17 | 84.2 | ||||
mp-696138 | Li10Ge(PS6)2 | 0.97 | 0.14 | 0.33 | 0.55 | 0.85 | 0.19 | 46.6 |
In summary, extensive MD simulations can be carried out with on-the-fly ML for the potential energy using the SGPR algorithm. The models generated with kernel-based methods such as SGPR can be easily combined for more generality and transferability. Therefore, using the previously trained models in the database, it is possible to model new crystals without further training. This approach was demonstrated by creating universal potentials for two ternary systems. Additionally, combinations of two ternary crystals resulted in a realistically accurate potential for a quaternary crystal. With the introduced algorithm, the models generated in this study can be reused, with limited or no training at all, to describe more complex systems (such as quaternary electrolytes) by a combinatorial approach.
Recently, universal potentials have been widely developed using graph-based models trained on large databases like the Materials Project.1,17,40,95–100 These models aim to cover all elements in the periodic table and handle various configurations, expanding the term "universal potential" to encompass not only elements but also configurational spaces. While the current review mainly discusses universality over elements (compositions), nothing in the model formulation limits its applicability to different configurations. The uncertainty threshold of the model can be adjusted to accommodate a broader training configuration space, thereby enhancing its predictive capabilities across diverse materials. The term "universal potential" was first used to refer to extended subsystems within typical machine learning potentials (MLPs), which were tailored for specific materials with fixed compositions.34,63 Kernel methods are able to address a wide range of material systems, regardless of differences in phase or stoichiometry. These subsystems can be combined into supersystems, enabling them to handle even more diverse materials. A built-in library system, stored in memory, allows for the selection of appropriate kernels for specific systems. Once fully constructed, this library can support various material systems by incorporating cross terms between expert systems, calculated from local structure descriptors. These cross terms are then added to the library, making it more general and versatile over time. Although the compositionally universal potential is currently somewhat limited, the library is continuously expanding and will eventually provide greater generality and applicability for a wide range of materials.
P. Sparse Bayesian committee machine potential
The SGPR algorithm can have difficulties in dealing with a large number m of inducing points when obtaining the solution w of Eq. (74), which involves the inversion of a kernel matrix, for large training datasets X and a large inducing set z of LCEs. In order to mitigate this challenge, the Bayesian committee machine (BCM)110 is employed to combine SGPR models, each trained independently on a different subset as a local expert (Fig. 7). When building the MLP of ice polymorphs, rather than utilizing all ice phases as a single training dataset, the data are divided into separate training subsets, one for each ice phase. Each local expert model is then trained on its corresponding subset. These individually trained local expert SGPR models are combined into a universal BCM MLP. This approach not only simplifies the training process but also overcomes the computational challenges associated with inverting large kernel matrices, enabling an efficient and accurate universal MLP. The BCM, for instance, combines the predictions from P distinct local expert models, each trained on a different data subset, which reduces the computational scaling of training from O(m³) to roughly O(m³/P²).
The key advantage of using the SGPR-based BCM is that it not only provides sparsification but also allows us to invert an approximately (m/P)-dimensional matrix rather than a full m-dimensional one (m/P ≪ m). This reduces the computational cost of the matrix inversion from O(m³) to O(m³/P²) (P inversions, each of cost O((m/P)³)). Compared to the product of experts (POE), where the predicted potential energy is given by a plain average of the expert predictions, the BCM employs a weighting factor to ensure that models with higher confidence contribute more significantly to the predicted energy than less confident models.
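A minimal sketch of the difference between the two combination rules: POE takes a plain average of the expert predictions, whereas a BCM-style combination weights each expert by its confidence. The inverse-variance weighting below is the generic precision-weighted form; the exact weighting factors used in Ref. 110 and in the BCM potential may differ.

```python
import numpy as np

def poe_mean(means):
    """Plain average of the expert predictions (POE-style combination)."""
    return np.mean(means, axis=0)

def bcm_mean(means, variances):
    """Precision-weighted committee mean: confident experts (small predictive
    variance) dominate the combined prediction (generic BCM-style weighting)."""
    means, precisions = np.asarray(means), 1.0 / np.asarray(variances)
    return (precisions * means).sum(axis=0) / precisions.sum(axis=0)

# three local experts predicting the energy of one test configuration (placeholders)
means = np.array([-3.10, -3.05, -2.60])       # eV
variances = np.array([0.01, 0.02, 0.50])      # expert 3 is far from its training data

print("POE:", poe_mean(means))                # pulled toward the uncertain expert
print("BCM:", bcm_mean(means, variances))     # dominated by the confident experts
```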
The BCM potential has been validated to accurately estimate the melting temperatures of ice II, III, and V under pressures of 0.2–0.6 GPa.112 To achieve this, on-the-fly active-learning SGPR NpT MD simulations were performed to create a BCM MLP for each coexisting system: ice II–water, ice III–water, and ice V–water. The three separately trained SGPR local expert MLP models were combined into one sparse, robust BCM potential. This combined potential was used for NpH MD simulations of the coexisting ice–liquid systems; these non-active NpH MD simulations were performed to determine the melting temperatures of ice II, III, and V. During the MD simulations, the conservation of enthalpy (H) was ensured while monitoring fluctuations in the total energy, which includes both potential and kinetic energy.
Figure 8 illustrates the melting points of ice II, III, and V as determined from our NpH MD simulations using the MB-pol water model. The ice–water phase diagram predicted with the MB-pol potential closely aligns with that obtained from enhanced-coexistence simulations, as reported by Bore and Paesani.113 The BCM MLP stabilizes ice V over ice II, consistent with the phase diagram predicted by Zhang et al.114 However, using a sophisticated deep NN potential, Bore and Paesani113 observed that ice V became a metastable phase, absent from the phase diagram. This discrepancy is attributed to the failure to incorporate high-model-uncertainty configurations of the water–ice system in the training dataset. In contrast, the active BCM potential adeptly learns model uncertainties on-the-fly, enhancing model accuracy and thereby stabilizing ice V in the phase diagram (Fig. 8). However, the melting points predicted by the BCM MLP exhibit a slight upward shift in temperature relative to the benchmark MB-pol potential (Fig. 8). This deviation primarily stems from the noise present in the energy and force predictions of the BCM potential. Notwithstanding the shifts in melting temperatures, we emphasize the BCM potential's capability to accurately replicate the physical and thermodynamic properties of the reference MB-pol water model. While the discrepancies in melting temperatures suggest room for further improvement in model accuracy, the significance lies in the BCM potential's reproduction of the reference phase diagram. This affirms the BCM potential's utility in capturing the essence of water's behavior under various conditions while reducing the computational scaling of training the SGPR MLP for large data.
Q. Comparison of GNNs and SGPR
Figure 9 compares GNNs and SGPR during training and testing. The Materials Project Trajectory (MPTrj) dataset97 serves as an example of a large training dataset, containing about 1,500,000 structures. The NNs95–98 used the entire dataset directly, employing an embedding layer, interaction layers, and a readout layer to predict output values such as energy, with optimization of the network parameters. In contrast, SGPR organizes the data into subsets based on the similarity of LCEs, creating the local expert models. Each local expert is built from its unique LCEs, which serve as an inducing dataset, and a solution (w) used to predict the energy. These local expert models form the basis for built-in SGPR kernels. To estimate the potential energy of a test configuration, NNs rely on the trained parameters, whereas SGPR selects local experts and quantifies uncertainty based on the similarity between the LCEs of the test configuration and the existing inducing LCEs.
For uncertainty quantification (UQ), SGPR estimates uncertainty from the similarity between the LCEs of the test configuration and the existing inducing LCEs, providing UQ values [Eqs. (79) and (89)] that are particularly informative in regions far from the training data. NNs, on the other hand, face difficulties in producing robust UQ estimates.115,116 Methods such as Bayesian NNs, including Bayes by Backprop,118,119 Monte Carlo dropout,120,121 and deep ensembles,122 have been proposed to estimate uncertainties in the predicted total energy, but not for local energy predictions. Other UQ approaches use feature space,123 latent space,124 or GP-based kernels on latent space,125 with the Euclidean distance in latent features showing some success. Recently, UQ could be improved by analyzing variations in predicted forces rather than predicted energies,126 although handling variations in predicted forces remains less reliable in data-sparse regions. In this regard, SGPR stands out for its principled approach, providing closed-form uncertainty estimates, data-dependent variance, and the flexibility of a non-parametric model, which adapts its complexity to the data.
As for scalability, GNNs are able to handle large datasets and many atoms efficiently through local message-passing mechanisms. However, significant disadvantages of GNN models arise from their large training data requirements, the lack of UQ, limited interpretability, and the computational complexity of training. GNNs typically require large datasets to reach optimal performance. If training data are limited, they may not perform as well as GPR-based models, which are better at handling small data regimes. The need for extensive data can be a drawback in situations where high-quality labeled data (e.g., quantum mechanical calculations) are expensive or time-consuming to generate. GNNs are generally deterministic models and do not provide UQ by themselves, which makes them less useful for tasks where knowing the confidence in a prediction is crucial. GNNs are often seen as black-box models, making it difficult to interpret why certain predictions are made. This lack of transparency can be a concern in applications where interpretability is important, and the underlying physics captured by the model can be harder to analyze compared to GPR, where the kernels can often be mapped to physical processes. Although GNNs are scalable during inference, training them can still be computationally expensive, especially for equivariant models like NequIP and MACE that require careful enforcement of rotational symmetries and use large datasets.
GPR models typically rely on SOAP descriptors, which become computationally intensive as the number of elements increases, because the descriptor needs to account for the diverse chemical environments and interactions of each additional element. The computational cost in GPR grows significantly when a greater number of atomic species is involved, making it less scalable for systems with many different elements. GNN models, on the other hand, use graph-based embeddings, which capture structural information about the molecules without scaling as severely with the number of elements, and are therefore more flexible and efficient for large systems owing to their ability to generalize molecular structures through learned embeddings.
On the other hand, GPR is a non-parametric Bayesian approach used for regression tasks. SGPR is a variant of GPR that reduces the computational cost by selecting a subset of inducing points while retaining a natural way to quantify uncertainty in the predictions. This is useful in materials discovery or simulations where understanding the reliability of predictions is critical. The Bayesian nature of SGPR allows for the prediction of both mean and variance, which is a unique strength over many GNNs. SGPR models tend to perform well with relatively small and moderate datasets, as they rely on a kernel-based approach that implicitly embeds prior knowledge about smoothness in the data. SGPR is particularly useful when training data are scarce or expensive to obtain. GPR models are highly interpretable, as the choice of kernels can be directly related to physical principles, and the model predictions can be understood as weighted averages of the training data. The ability to choose specific kernel functions provides flexibility to capture particular characteristics of atomic interactions. The predictions of GPR are generally smooth and conservative, which can be beneficial in certain applications, especially when physical smoothness constraints are essential, as for energy landscapes. In contrast, GPR also has disadvantages. As the number of atoms and interactions increases, even sparse approximations become computationally expensive for very large systems, and GPR relies heavily on the choice of kernel function to model interactions. To mitigate the scaling issue of GPR, feature compression, sparse representations, and partitioning strategies can be exploited. Techniques such as dimensionality reduction on SOAP descriptors (e.g., principal component analysis) can help reduce the complexity of the feature space without losing essential chemical information. Using a sparser representation of the SOAP descriptor can lessen the computational load by focusing on the most relevant atomic environments. Breaking down the chemical space into smaller regions (e.g., clustering molecules based on their element types) may allow the model to focus on subsets of elements at a time.
Overall, GNNs excel in scalability, can model complex interactions, and are better suited for large datasets, but they require more data and computational resources and lack built-in uncertainty quantification. SGPR is an excellent choice for small datasets, interpretable models, and applications where uncertainty quantification is critical. While GPR struggles with scalability and expressiveness in complex systems, SGPR mitigates these issues; furthermore, a built-in-library SGPR potential/force field can maintain the accuracy of complex materials models without significant loss of computational speed, enhance the power to predict the structure, energetics, and properties of unknown novel materials, and help in understanding the underlying physical concepts.
There is no definitive answer for the exact maximum number of elements SGPR can handle, as it depends on the specific implementation and the available computational resources. However, most SGPR models may struggle beyond 6–10 distinct elements, especially when high accuracy and rich descriptors are required. Meanwhile, GNNs are known for their strong extrapolation capabilities due to the generality of their learned embeddings, whereas GPR-based models are generally more interpolative, meaning they excel at making predictions within the bounds of the data they were trained on. While GPR models show limitations in scalability and extrapolation, particularly when the number of elements grows, GNN models provide a more robust framework for handling diverse chemical systems. In contrast, SGPR-based models offer significant advantages for small to moderate-sized, well-defined systems where interpretability and fine-grained control over descriptors like SOAP are essential. Overall, the choice between SGPR and GNN-based methods depends on the specific application, the data size, and the computational resources: SGPR is preferred for smaller datasets requiring interpretable models and theoretically grounded uncertainty estimates, while GNNs are favored for large datasets and scalable predictions, often relying on approximations or ensembles for UQ that may be less reliable in data-sparse regions. In this regard, a scalable built-in-library SGPR could be a useful approach for moderately larger datasets. Future research may explore ways to combine the strengths of both SGPR and GNN approaches for broader applicability, particularly for larger datasets in materials science.
R. Summary of Sec. II
In Sec. II, we discussed ML representations of the ab initio PES. Two of the most important branches are NNs36 and GAPs.47 NNs are computationally efficient because their training cost scales as O(n) with the size n of the data. However, since they contain many hyper-parameters, they are more likely to over-fit the training data and not generalize well; therefore, NNs are often used when large datasets are available. On the other hand, Gaussian process based methods are almost non-parametric; only a few hyper-parameters are present in the covariance kernel. They work well even with small datasets, but their computational cost grows fast with the size of the data: the cost of building the covariance matrix scales as O(n²) and that of its inversion as O(n³). Therefore, approximate inference methods for GPR, which transform these unfavorable cost scalings to be linear in n, are crucial. The BLR method43,91 is one possible solution. However, a better approach than linear regression is the sparse approximation.
Another important issue is on-the-fly modeling and active learning,43,44,91,101 where the data required for building the model are generated alongside molecular dynamics (MD) simulations. For this purpose, Gaussian process based regression methods are much more suitable than NNs because the predicted variance can be used as a proxy for the on-the-fly accuracy of the model and as a criterion for data generation by ab initio calculations. Although the uncertainty can be emulated with NNs by training identical networks with different initial random seeds, updating an existing Gaussian process model is much easier.
Currently, Gaussian process based active learning algorithms use the predictive variance [see Eq. (40)] as a criterion for on-the-fly data generation. A common objective for these methods is to show that their predictive variance correlates well with the actual ML errors. However, we noticed that the predictive variance is blind to the energy and forces data. For this reason, we devised a new criterion that includes these data as well. As a result, during on-the-fly sampling, the model is generated with much less data while maintaining or improving the accuracy, which dramatically increases the efficiency. This is accomplished by a low-rank approximation of the full covariance matrix in normal GPR. Using SGPR, the computational cost is effectively reduced to that of the linear regression methods while maintaining the accuracy. In addition, we discussed a novel adaptive sampling algorithm. The excellent performance of this formalism (SGPR + adaptive sampling) is demonstrated with benchmark simulations. As a benchmark simulation, we applied this algorithm to molecular dynamics simulations of bulk Si. The SGPR potential, which was generated using only 13 ab initio calculations, reproduced phonon spectra almost identical to those from DFT. The combinatorial approach to generate a more universal expert system dealing with more elements from several expert systems dealing with fewer elements opens the door for strategies like "high-level parallelism" or "a library of SGPR potentials," which can be combined by simple cross-covariance calculations (based on the MOE approach) to describe increasingly complex materials. Therefore, this approach can be useful for modeling wide-ranging composite materials for tailoring the performance. Since the computational cost of SGPR is proportional to the number of atoms N in the system, much bigger systems can be efficiently simulated compared with DFT, where the computational cost scales as O(N³). This allows us to study large and complex systems.
III. BASIC FUNCTIONS
The kernel regression methods in Sec. II assumed that a latent energy function of the local descriptor exists for the contribution of each atom i to the potential energy of the system, where di is the descriptor for the local chemical environment of atom i. Then, using an appropriate covariance kernel, representations based on linear regression or Gaussian process regression are found for this function using a set of ab initio data. Here we discuss the kernels and descriptors in more detail.
A. General aspects
In all of the previous derivations, we assumed that the latent energy function (LEF) is related to the potential energy by Eq. (3), where every atom i has its own descriptor di. Although this can be applied everywhere, it is not the only possible approach. For instance, a group of atoms can be described with a single descriptor and the associated LEF. This is especially useful for molecular systems where the coordinates of a group of atoms are strongly correlated. In such cases, the generalized expression in Eq. (1) can be assumed. The regression equations can still be applied with simple adjustments: ρi represents the translationally invariant degrees of freedom in the group of atoms indexed by i.
Ideally, the descriptor should be invariant with respect to the symmetry operations that do not change the potential energy. The symmetries that we are most concerned with are translations, rotations, and permutations of identical atoms. In some cases, certain symmetries can be ignored in the descriptor design and enforced manually. For instance, in a system with strong chemical bonds where permutations of atoms never occur, one can ignore this symmetry; instead, the indices chosen for the atoms should be kept fixed.
B. Combining descriptors
C. Rotational symmetry
D. Two-body and many-body
E. Coulomb matrix
F. Behler–Parrinello symmetry functions
G. Smooth overlap of atomic positions
H. Summary of Sec. III
In this section, we discussed combinations of covariance kernels such as two-body, three-body, and many-body kernels. As examples of many-body descriptors, we presented the Coulomb matrix, Behler–Parrinello symmetry functions, and the smooth overlap of atomic positions (SOAP). We found analytical expressions for the components of the SOAP descriptor using the series expansion of the Bessel-i functions, which can be used for fast numerical evaluation of the descriptors. The SOAP kernel is almost non-parametric: only a few parameters are present in the kernel, and they have simple intuitive interpretations. Therefore, we can often skip iterative refinement using the gradient descent algorithm, which is a computationally demanding task. The SOAP descriptor with the series expansion of the Bessel-i functions (SE-SOAP) is used in this study.
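For reference, the SOAP kernel between two local environments is commonly evaluated as the normalized dot product of their power-spectrum vectors raised to a small integer power ζ; a minimal sketch is given below, assuming the power spectra have already been computed (e.g., with the Bessel-i series expansion mentioned above), with random toy vectors standing in for them.

```python
import numpy as np

def soap_kernel(p, q, zeta=2):
    """Normalized dot-product (SOAP-type) kernel between power spectra p and q."""
    p_hat = p / np.linalg.norm(p)
    q_hat = q / np.linalg.norm(q)
    return float(p_hat @ q_hat) ** zeta

# toy power spectra standing in for real SOAP vectors
rng = np.random.default_rng(0)
p, q = rng.random(30), rng.random(30)
print(soap_kernel(p, q))
```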
IV. DYNAMIC PROPERTIES
In this section, we discuss diffusivity in solids and the methods used to estimate ionic conductivity.
A. Conductivity
B. Ionic conductivity in crystals
C. Diffusion in solids
Nudged elastic band (NEB) calculations are used to calculate the energy barrier along a specific diffusion path if the dominant diffusion pathways are known. For a concerted diffusion mechanism, this becomes even more complicated as the number of possible pathways becomes increasingly large. Ab initio molecular dynamics (AIMD) simulations, on the other hand, are able to realistically determine the diffusion coefficient and the conductivity of materials as well as the diffusion pathways. However, due to the huge computational resources required for the electronic structure calculations, these simulations are often applied only to small systems and short intervals of time. Therefore, classical MD simulations that use empirical potentials or force fields are sometimes utilized to scale up the calculations.
Another important aspect of the study of diffusion in solids is that, due to the Arrhenius-type temperature dependence of the diffusion events, they occur too slowly at room temperature to be sampled with statistical certainty using AIMD,135 even in the fastest ion conductors. Therefore, simulations are often carried out at elevated temperatures, and the diffusion coefficient at room temperature is approximated by extrapolation. However, this is not possible in cases where the system of interest is unstable at higher temperatures. In fact, such a case is encountered in the current study.
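As an illustration of this extrapolation strategy, the sketch below fits ln D against 1/T for diffusivities computed at elevated temperatures and evaluates the Arrhenius fit at room temperature; the temperatures and diffusivities are placeholder numbers, not results from this review.

```python
import numpy as np

kB = 8.617333262e-5                                # Boltzmann constant, eV/K

T = np.array([600.0, 800.0, 1000.0, 1200.0])       # simulation temperatures (K)
D = np.array([2e-6, 2e-5, 9e-5, 3e-4])             # diffusivities (cm^2/s), placeholders

# ln D = intercept + slope * (1/T), with slope = -Ea/kB
slope, intercept = np.polyfit(1.0 / T, np.log(D), 1)
Ea = -slope * kB                                   # activation energy (eV)
D_300K = np.exp(intercept + slope / 300.0)         # extrapolated diffusivity
print(f"Ea = {Ea:.2f} eV, D(300 K) = {D_300K:.2e} cm^2/s")
```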
D. Linear response theory
E. Diffusion constant from molecular dynamics
F. Grain boundaries and long-range diffusivity
In molecular dynamics simulations, often only the tracer diffusivity in a pure crystal is calculated. This can be used to estimate the short-range diffusivity. Grain boundaries and long-range diffusivity are difficult to consider because of the size limitations of ab initio simulations.
G. Diffusivity and activation energy
H. Summary of Sec. IV
Here we have described dynamic properties and the methods used to compute them in materials simulations: ionic conductivity, diffusion in solids, linear response theory, diffusion constants from molecular dynamics, and grain boundaries and long-range diffusivity.
V. APPLICATIONS: SGPR-ASSISTED MATERIALS SIMULATIONS
A. Sulfide solid electrolytes
In this section, we discuss Li diffusivity in sulfide solid electrolytes, which are among the most promising candidates for future battery systems. Li7P3S11 has shown high ionic conductivity, and even more conductive solid electrolytes need to be developed for more promising battery systems. Here we investigate these materials using the SGPR algorithm and study their properties using large-scale ML MD simulations. We also discuss the glass phase obtained with a melt-quench simulation, the Li diffusivity in the glass phase, and the crystallization.
1. Li2S-P2S5 system
We consider sulfide-based superionic conductors in the Li2S-P2S5 system. The most stable compound in this system is considered to be Li3PS4. It exists as γ-Li3PS4 below 195 °C and as β-Li3PS4 at higher temperatures. Depending on the experimental preparation method, notably high ionic conductivities have been reported for this material.138 A much more conductive compound is the Li7P3S11 glass-ceramic, which has a highly conductive phase due to the formation of a metastable crystal structure.139 This crystal structure is found to be a triclinic cell made of PS4 tetrahedra and P2S7 ditetrahedra.140 With heat treatment to reduce the grain boundary resistance, the conductivity increases to 17 mS/cm.141 Some of the other fastest ion conductors are closely related to this system in that they are obtained upon substituting some of the P or S atoms by other atoms (M, for example, Ge), namely, Li2S-P2S5-GeSn, Li10GeP2S12,142 or the more complicated case of Li9.54Si1.74P1.44S11.7Cl0.3.143 Argyrodite-type solid-state electrolytes (SSEs) are another family of electrolytes. Thus, the main degrees of freedom are the choice of the ternary/quaternary system, the composition ratios, and the route to the final product, such as the heat treatment temperature and duration. First-principles calculations have been applied extensively in theoretical studies of solid electrolytes,144 which helped in shedding light on lithium's self-diffusion mechanism in Li10GeP2S12,145,146 the effects of anion/cation substitutions,147 the cause of the different ionic conductivity of tetragonal and cubic Li7La3Zr2O12,148,149 and the effect of composition in the Li3OCl1−xBrx lithium-rich antiperovskite150 and (1-z)Li4SiO4-(z)Li3PO4 solid solutions.151
2. Training
A sparse Gaussian process potential was trained with an on-the-fly adaptive sampling algorithm for the well-known crystal structures of γ-Li3PS4, β-Li3PS4, and Li7P3S11.63 For sampling data, the systems were quenched from ∼1500 to 700 K in ∼5 ps by velocity re-scaling and kept at this temperature for ∼45 ps using a time step of 2 fs. In quenching, there is no time for the crystal to melt (since the atoms do not have time to displace too far, the crystal becomes only locally disordered), but this local distortion feeds a stream of diverse LCEs to the sampling algorithm, which helps in faster stabilization of the model at the beginning of training. In total, only 456 ab initio calculations are performed, from which only 111 are sampled by the model as the data X and 705 LCEs are sampled as the inducing set z.
3. Testing
The model, fit on all 456 ab initio data generated during training, reproduces the ab initio forces with an RMSE of 0.13 eV/Å and a high coefficient of determination (Fig. 10). To verify the model with independent data, ML MD simulations at five temperatures in the range 300–700 K for 0.8–2.2 ns are performed and 103 snapshots are randomly selected. The accuracy of the model is tested with these samples in Fig. 11. For this testing set, we obtain RMSE = 0.14 eV/Å and a similarly high R2. The RMSE of the model is almost the same for the training and testing sets, which shows that the model generalizes very well.
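The force-validation metrics quoted above can be computed as in the following sketch, where f_dft and f_ml stand for the DFT and ML force components collected over the snapshots; the arrays below are toy stand-ins rather than the actual test data.

```python
import numpy as np

def force_metrics(f_dft, f_ml):
    """RMSE and coefficient of determination R^2 between force components."""
    f_dft, f_ml = np.ravel(f_dft), np.ravel(f_ml)
    rmse = np.sqrt(np.mean((f_ml - f_dft) ** 2))
    ss_res = np.sum((f_ml - f_dft) ** 2)
    ss_tot = np.sum((f_dft - f_dft.mean()) ** 2)
    return rmse, 1.0 - ss_res / ss_tot

# toy data: 100 snapshots x 96 atoms x 3 force components
rng = np.random.default_rng(0)
f_dft = rng.normal(scale=1.0, size=(100, 96, 3))
f_ml = f_dft + rng.normal(scale=0.14, size=f_dft.shape)
print(force_metrics(f_dft, f_ml))
```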
In addition, the liquid and glass phases of Li7P3S11 are simulated with the SGPR model. 50 + 70 samples are drawn from machine learning molecular dynamics (MLMD) simulations of liquid Li7P3S11 at 1200 K and of the glass phase (obtained after melt-quenching) at 600 K, each run for ∼0.2 ns. The number of atoms in these samples is 672.
To show that the model is energy-conserving, an MD simulation was carried out in the NVE ensemble. The total energy is conserved up to small per-atom fluctuations, which are related to the finite time step of 2 fs. Since the model is trained at high temperatures, it is intriguing to see how well the model describes the properties close to 0 K. Specifically, NEB simulations are used in many ab initio studies to estimate the activation energy for Li diffusion, since lengthy AIMD simulations are computationally very time consuming. We carried out NEB calculations with both the ML potential and direct DFT calculations; as shown in Fig. 12, the trained potential describes the NEB barrier with acceptable accuracy.
4. Computational performance
For a system with N atoms, the computational cost of predicting the energy and forces with the SGPR model is proportional to N, while ab initio DFT calculations scale as N^3 or worse, unless a fragmental approach is employed. The elapsed time for an ab initio calculation of the energy and forces of a system with 672 atoms was 9718 s using VASP on 80 cores, while the elapsed time for the same calculation with SGPR was 5.3 s using only 20 cores. The cost, defined as the elapsed time multiplied by the number of cores, is about 106 core-seconds for SGPR, better than the ab initio calculation (about 7.8 × 10^5 core-seconds) by a factor of ∼7 × 10^3. This gain grows rapidly as the number of atoms increases.
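The quoted speedup follows directly from the core-time cost defined above; as a small worked example:

```python
# core-time cost comparison for the 672-atom cell, using the timings above
dft_cost = 9718.0 * 80      # seconds x cores for the DFT calculation
sgpr_cost = 5.3 * 20        # seconds x cores for the SGPR prediction
print(f"speedup ~ {dft_cost / sgpr_cost:.0f}x")   # roughly 7e3
```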
5. α and β phases
Large-scale ML MD simulations of Li7P3S11 (672-atom supercell) were performed in the isothermal-isobaric (NPT) ensemble at several temperatures in the range 300–1200 K and ambient external pressure. For this system size, ML MD is faster than DFT MD by orders of magnitude. A phase transition was detected by ML MD, which occurred by rotations of the P2S7 double-tetrahedra into a new orientational order. The initial and the new structures are referred to as the α and β phases. The ground-state energies of the α and β phases obtained from exact DFT calculations are −4.413 and −4.416 eV per atom, respectively, which shows that they are almost iso-energetic and that the β phase is not an artifact of the SGPR potential. Configurations from the β phase are also encountered during training and testing; the lumped RMSE for both phases is 0.14 eV/Å. Two steps appeared in the MSD of S atoms, corresponding to the delayed rotations of P2S7 double-tetrahedra in different layers; the delay was ∼0.5 ns at 450 K for this supercell. Specifically, in the range 450–600 K (Fig. 13), the system remained in the α phase for a few hundred picoseconds before the transition to the β phase, which allows estimation of the Li diffusivity in both phases. In agreement with experimental reports,139,140 at 900 K the P and S atoms also started to diffuse, which indicates melting or decomposition into more stable components.
Assuming an Arrhenius temperature dependence of the diffusivity (Fig. 14), the activation energies Ea in the α and β phases were 0.20 and 0.30 eV, respectively. The room-temperature (300 K) ionic conductivity of the α phase obtained in NPT is in reasonable agreement with previous DFT simulation results obtained in NVT152,153 within a limited box size. For the β phase, the room-temperature conductivity is obtained by extrapolation of D and by approximating V (which is T-dependent) by the volume of the relaxed cell. The resulting conductivity is more than an order of magnitude lower than that of the α phase. Alternatively, an Arrhenius dependence for σT can be assumed; the numerical difference is small, since the volume of the α phase expands by less than 1% (300–600 K) and the expansion of the β phase (450–800 K) is similarly modest.
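For reference, the conversion from diffusivity to ionic conductivity used in such analyses is commonly made with the Nernst–Einstein relation, σ = n q² D / (kB T), with n the Li number density; the sketch below uses placeholder values rather than the simulation results.

```python
kB = 1.380649e-23          # Boltzmann constant, J/K
e = 1.602176634e-19        # elementary charge, C

def conductivity(D_cm2_per_s, n_li, volume_A3, T):
    """Nernst-Einstein estimate of ionic conductivity in S/m."""
    D = D_cm2_per_s * 1e-4                 # cm^2/s -> m^2/s
    n = n_li / (volume_A3 * 1e-30)         # carriers per m^3
    return n * e**2 * D / (kB * T)

print(conductivity(1e-8, 56, 1500.0, 300.0))   # placeholder numbers
```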
The diffusivity was largest along the same crystal axis in both phases. The main difference is that the diffusivities along the other two axes are almost equal in the α phase, while in the β phase this symmetry is clearly broken. In Ref. 152, it was pointed out that the one-dimensional (1D) MSD of Li atoms along one of the axes in Li7P3S11 is nearly a factor of 2 higher than that along the other two axes. This is consistent with our directional MSD results.
6. Glass phase
The Li7P3S11 glass is created by a melt-quench simulation from 1200 to 300 K. The Li conductivity obtained for the glass phase (from ML MD for 3 ns) is higher than that in the β phase. Equilibration of the glass at higher temperatures shows that P atoms start to diffuse at 600 K, which indicates crystallization below this temperature (Fig. 13), in agreement with experiment.139–141 For the liquid and glass phases, the model shows RMSE = 0.21 eV/Å (see Fig. 15).
Experimentally, Li7P3S11 is often prepared in the glass-ceramic phase by controlled crystallization and heat treatment of the glass phase. Depending on the heat treatment conditions, a range of ionic conductivities has been reported.140,141 It was shown that the experimental conductivity is correlated with the degree of crystallization, which depends on the heat treatment method.154 Experimental measurements of the ionic conductivity with techniques such as impedance spectroscopy probe the long-range dynamics of Li, which is influenced by properties such as grain boundaries and the degree of amorphism. Wohlmuth et al.155 probed the short-range diffusivity of Li in glass-ceramic Li7P3S11 using a range of nuclear magnetic resonance (NMR) methods and showed that the activation energy of bulk Li7P3S11 is 0.20 eV, in agreement with our simulation. It was shown in Sec. IV F that grain boundaries modify the activation energy through a term that depends on Eg, the energy barrier at the grain boundary, and l, the grain size, which can often be measured experimentally.
7. Li–X–Y ternary systems
We studied Li diffusivity in Li7P3S11 with extensive MD simulations using an ML potential created with the SGPR algorithm. These simulations described the diffusion properties and the melting temperature in agreement with experimental measurements. We simulated systems containing hundreds of atoms for several nanoseconds. A new crystal structure was found for Li7P3S11, which is referred to as the β phase. The β phase shows much lower ionic conductivity than the α phase, although the two phases are almost isoenergetic according to exact DFT calculations. Therefore, it is important to identify the β phase experimentally and fine-tune the preparation method in order to avoid this phase. Additionally, we studied diffusivity in the glass phase and showed that the ML potential predicts the crystallization temperature consistently with experimental observations.
In this section, we turn our focus to known Li–X–Y ternary systems, which are available online in the Materials Project repository.111 Any crystal with a considerable concentration of Li was considered as a candidate solid electrolyte. For each material that satisfied certain criteria, an ML potential was generated using the SGPR algorithm and its Li diffusivity was calculated. We then discuss how the ML models for similar materials can be combined to generate global/universal potentials.
a. Searching algorithm
For all of the Li–X–Y ternary systems, excluding those containing transition metals, a sparse GP potential was trained using an on-the-fly adaptive sampling algorithm for the known crystal structures in the Materials Project repository.111 In the first filtering step, systems in which lithium makes up less than 10% of the atoms were dropped. As the next filter, systems whose electronic bandgap is less than 1 eV were eliminated. After the first two filters, about 300 structures remained for further investigation. Then, on-the-fly ML MD simulations of the potential energy were triggered for these materials. The size of the supercells for the ML MD simulations was chosen according to a k-spacing criterion in reciprocal space such that only the Γ-point was needed as the k-point grid for the DFT calculations. The number of atoms in the supercell was in the range of 100–300. If the Li diffusivity of a certain material was estimated to be too small (e.g., less than ∼0.1 Å2/ps at 1000 K), the simulation was stopped for that material. The other condition for continuation of the simulation was stability: if a crystal melted at temperatures higher than ∼600 K, the simulation was terminated unless a very high diffusivity was observed at lower temperatures. The ML MD simulations were carried out in the isothermal-isobaric (NPT) ensemble for ∼100 ps. Temperature-dependent diffusivity and room-temperature ionic conductivity were calculated for this set of materials and listed in Table S1 of Ref. 63.
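The screening workflow described above amounts to a simple filtering pipeline; a schematic sketch is given below, where the records and field names are toy stand-ins for Materials Project entries rather than the actual database API or the real filter outcomes.

```python
# toy records standing in for entries pulled from the Materials Project
toy_entries = [
    {"formula": "Li3PS4",  "n_li": 3, "n_atoms": 8, "band_gap": 2.6, "has_tm": False},
    {"formula": "LiFePO4", "n_li": 1, "n_atoms": 7, "band_gap": 3.5, "has_tm": True},
    {"formula": "Li2SnS3", "n_li": 2, "n_atoms": 6, "band_gap": 0.5, "has_tm": False},
]

def passes_filters(entry):
    if entry["n_li"] / entry["n_atoms"] < 0.10:   # keep >= 10% Li
        return False
    if entry["band_gap"] < 1.0:                   # keep electronic insulators
        return False
    if entry["has_tm"]:                           # exclude transition metals
        return False
    return True

candidates = [e["formula"] for e in toy_entries if passes_filters(e)]
print(candidates)   # only the first toy entry survives here
```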
In the end, 22 promising crystals were identified (Table II). As an exemplary case, 4 of the 22 materials had the same set of elements (Li–P–S), for which the training set consisted of crystal structures known for their relevance to Li-ion conductivity, focusing primarily on three materials: γ-Li3PS4, β-Li3PS4, and Li7P3S11.138–140 This sampling procedure was discussed earlier in Sec. V A 2. As such, the combinatorial approach opens the door for strategies like "high level parallelism" or "a library of SGPR potentials," which can be combined by simple cross covariance calculations to describe increasingly complex materials. Such a combinatorial approach is not trivial with other methods.
b. Universal ML interatomic potentials for surveying ternary solid electrolytes
With on-the-fly ML of interatomic potentials using the SGPR algorithm, MD simulations are applied for a survey of Li diffusivity in hundreds of ternary crystals as potential electrolytes for all-solid-state batteries. The models generated for these crystals can be easily combined to create more general and transferable models, which can potentially be used for simulating new materials without further training. As an example, universal potentials are created for the Li–P–S and Li–Ge–S systems by combining the expert models of the crystals that contain the same set of elements. Combinatorial models of different ternary crystals can be directly applied for modeling composite quaternary ones (e.g., Li–Ge–P–S). This hierarchical strategy paves the way for modeling large-scale complexity by a combinatorial approach.63
NNs are suitable for large datasets, while kernel-based methods excel in training with smaller data and in on-the-fly learning.43,44,54,91,101 These methods have found a wide range of applications in physical chemistry. Relating to solid electrolytes (SEs), Li3N has been studied using the electrostatic spectral neighbor analysis potential (eSNAP) method,156 Li3PO4 using NNs,157 and Li7P3S11 using SGPR.54
Important issues for the future of ML IAPs are transferability and systematic model reusability. Currently, models are generated for specific tasks and often discarded afterward. The ability to build on prior work is highly sought after and imperative for efficient exploration of functional materials. Although gradual improvement of trained models is possible with all ML methods, kernel-based methods such as SGPR offer natural frameworks for libraries of ML models, which can be combined with little effort to describe increasingly complex materials. NNs memorize the data in a set of weights of fixed size and thus have to be re-optimized entirely upon adding new data. In contrast, in kernel-based methods, models are represented by a set of LCEs that can be easily categorized and encapsulated in libraries. These ideas can be explored in the context of Li-conducting crystals, which are essential components of all-solid-state storage devices. In an unsupervised fashion, hundreds of ternary crystals are simulated by on-the-fly ML MD and the fastest Li conductors are identified. Later, we discuss combining the SGPR models into more transferable potentials and a combinatorial approach for modeling new crystals.
The Li7P3S11 superionic conductor has been extensively studied using SGPR, and similar simulations are applied here to hundreds of potential SEs. We focus on Li-containing ternary systems (composed of three elements) in the Materials Project repository.111 Crystals containing transition metals are not considered here. Li concentration and bandgap are used as initial filters, since an SE should contain a considerable number of Li atoms and should be an electronic insulator. About 300 crystals passed the initial filters. The quaternary system Li10Ge(PS6)2142 is included in the list as an extension of the ternary systems. Then, NPT-MD simulations with on-the-fly SGPR are performed for ∼100 ps for each crystal at standard pressure. Three or four temperatures are chosen for the MD simulations, depending on the stability against melting and the Li diffusion rate.
Figure 16 and Table II show the Li diffusivity in the 22 fastest Li-conducting crystals. In NPT-MD simulations, the volume expands with increasing temperature. This temperature dependence is ignored in reporting the ionic conductivities in Table II, since it only modifies the values slightly (see Ref. 54). Nevertheless, the effect of the NPT ensemble is directly reflected in the diffusivity, and significantly faster ionic conduction is observed in comparison with the NVT ensemble.
Fast Li diffusion is one of the most important factors for a promising solid electrolyte; stability against Li metal and a wide electrochemical window are other important factors. Additionally, experimental reports often measure the long-range diffusivity, which is affected by grain boundaries, while the values in Table II reflect only the short-range diffusivity. A special technique is required for probing the short-range diffusivity in experimental samples.155 Among the 22 Li-conducting SEs reported in Table II, most are sulfides (14), followed by halides (5), two oxides, and one selenide. Here we briefly discuss the experimental results for selected SEs. The top three fastest ionic conductors belong to the Li2S-P2S5 system. The most stable compound in this system is (γ-/β-) Li3PS4. Li7PS6 is a lithium argyrodite with orthorhombic/cubic (low/high temperature) phases, where the cubic phase has the higher conductivity. The highest experimental conductivity reported in this system is for Li7P3S11, which has a triclinic cell made of PS4 tetrahedra and P2S7 double-tetrahedra.140 Experimental room-temperature ionic conductivities (mS/cm) of 0.16 for Li3PS4,158 0.11 for Li7PS6,159 and 17 for Li7P3S11141 are reported. The latter is one of the highest reported to date for an SE.138 As a halide, a conductivity of ∼4 mS/cm (at 330 K) is reported for Li3InBr6.160 Li4SiO4 has also been considered as a parent for SE solid solutions.151 The experimental conductivities are often improved, even by several orders of magnitude,158 depending on the preparation method. Our simulations suggest that the experimental preparation methods could be improved, especially for Li7PS6.
8. Universal potentials: Application to quaternary systems
In the previous simulations, an "expert" model was generated for the potential energy of each material. In the end, 22 materials showing significant Li diffusivity were selected. For instance, 4 of the 22 materials have the same set of elements (Li–P–S). It is intriguing to combine these four models to create a universal potential for the Li–P–S system. The combinatorial approach based on the MOE model opens the door for strategies like "high level parallelism" or "a library of SGPR potentials," which can be combined by simple cross covariance calculations to describe increasingly complex materials.
As a simpler case, the SGPR models for Li3PS4 and Li4GeS4 are combined and tested on Li10Ge(PS6)2 (both configurations in Table II). Due to the presence of the different species P and Ge, the combined model cannot be significantly compressed. The combined model of the two ternary crystals described the forces of the quaternary crystal with a coefficient of determination very close to that of the expert model. This is understandable in view of the similarity of the local chemical environments of the cations P and Ge in these crystals (see Fig. 6). In many promising solid electrolytes, these cations are present as poly-anionic moieties (with X = P, Ge, etc.) in a sea of Li+ ions and do not directly interact with each other. By combining the ML models for the simpler crystals, we can simulate the composite ones with minimal training or even without any additional training. Thus, by combining the two ternary systems Li3PS4 and Li4GeS4, the quaternary system Li10Ge(PS6)2 was studied, because models generated with kernel-based methods such as SGPR can be easily combined with greater generality and transferability. Using the previously trained models in the database, it is possible to model new crystals without further training. This approach is demonstrated here by creating universal potentials for two ternary systems, whose combination resulted in a realistically accurate potential for a quaternary crystal. With the introduced algorithm, the models generated in this study can be reused, with limited or no additional training, to describe more complex systems (such as quaternary electrolytes) by a combinatorial approach.
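Conceptually, combining two expert models amounts to concatenating their inducing environments and re-solving the SGPR weights with the required cross-covariance blocks. The following simplified sketch (per-environment energy targets only, ignoring forces and any actual library interfaces) illustrates the idea under those assumptions.

```python
import numpy as np

def kernel_matrix(A, B, kernel):
    return np.array([[kernel(a, b) for b in B] for a in A])

def combine_experts(Z1, Z2, X, y, kernel, noise=1e-2):
    """Merge two inducing sets and re-fit SGPR weights on the pooled data."""
    Z = np.vstack([Z1, Z2])                  # merged inducing environments
    Kmm = kernel_matrix(Z, Z, kernel)        # inducing-inducing covariance
    Kmn = kernel_matrix(Z, X, kernel)        # cross covariance with the data
    A = Kmm + Kmn @ Kmn.T / noise            # SGPR normal-equation matrix
    w = np.linalg.solve(A, Kmn @ y / noise)  # weights of the merged model
    return Z, w

# toy usage with random "descriptors" and energies
rng = np.random.default_rng(0)
k = lambda a, b: np.exp(-np.sum((a - b) ** 2))
Z1, Z2 = rng.normal(size=(4, 8)), rng.normal(size=(5, 8))
X, y = rng.normal(size=(20, 8)), rng.normal(size=20)
Z, w = combine_experts(Z1, Z2, X, y, k)
print(Z.shape, w.shape)   # (9, 8) merged inducing set and 9 weights
```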
9. Summary for SGPR simulations of solid electrolytes
The SGPR-based ML formalism was applied to studies of ionic diffusion in dozens of solid electrolytes. Specifically, for Li7P3S11, these simulations described the diffusion properties and the melting temperature in agreement with experiments. We simulated systems containing hundreds of atoms for several nanoseconds, which is unimaginable with direct ab initio molecular dynamics (AIMD). We could easily simulate even larger systems, because the computational cost of calculating the energy and forces is proportional to the number of atoms N, unlike DFT, which scales as N^3 or worse. Using 100 CPUs, it took less than 1 s to predict the ab initio-level potential energy and forces with the SGPR potential, which was faster than direct DFT by orders of magnitude. A new crystal structure was found for Li7P3S11, which is referred to as the β phase. The β phase shows much lower ionic conductivity than the α phase, although the two phases are almost isoenergetic according to exact DFT calculations. Therefore, it is important to identify the β phase experimentally and fine-tune the preparation method in order to avoid this phase. Additionally, we studied diffusivity in the glass phase and showed that the ML potential predicts the crystallization temperature consistently with the experimental observations.
We showed that SGPR models for materials with the same set of elements can be easily merged into a single universal potential. This allows high-level parallelism in generating ML potentials, which is not easily possible with other ML algorithms. That is, instead of training a potential sequentially in different domains, we can train potentials for all of the domains in parallel and merge them into a single global potential at the end. This is a critical step forward for applications of ML potentials in increasingly wide domains and for truly complex materials.
B. Lithium ion batteries
1. Al-doping in Li-ion-battery cathode materials
Al-doping has been employed in the cathode materials of lithium-ion batteries (LIBs) to alter the anionic (oxygen-redox) reactions in order to prevent the decline in energy storage capacity and the drop in voltage. The chemistry involving the transfer of negative charge in cathodes with high lithium content has opened avenues for creating cathode materials with greater energy capacity. Yet, the prolonged reaction of these negatively charged ions leads to a reduction in the battery's storage capacity and a decrease in voltage as the battery is cycled repeatedly. This occurs due to irreversible changes involving oxygen (e.g., O2 release), destabilizing the structure. To enhance the durability and stability of layered oxides and lithium-rich variants such as lithium nickel–cobalt–aluminum oxide (LNCAO, abbreviated as NCA) and lithium nickel–cobalt–manganese oxide (LNCMO, abbreviated as NCM), Al-doping has been applied. However, prior studies on Al-doping have mostly focused on NCA or NCM, which predominantly contain 3d transition-metal (TM) elements. The chemical behavior differs significantly when 4d or 5d elements are utilized, leading to distinct characteristics. Cathodes comprising solely 3d TMs exhibit unique structural arrangements, whereas the addition of 4d or 5d transition metals into oxide-based cathodes introduces novel chemical properties. Due to the intricate behavior of these materials, SGPR-based DFT potentials/forces were employed to identify their most stable structures, presenting a new approach to designing fully reversible layered cathode materials. With this method, the impact of Ni- and Al-doping on the decline in capacity and voltage of high-capacity LIB cathode materials containing 4d elements was investigated. An exhaustive and dependable analysis of this complex system was conducted using SGPR.65 The instability of LRO (Li2RuO3; Li1.33Ru0.67O2) was significantly reduced by Ni-doping (LRNO: Li1.22Ru0.61Ni0.17O2), and stability was further improved through additional Al-doping. The Al-doped LRNO (Al-LRNO: Li1.22Ru0.61Ni0.11Al0.06O2) materials exhibit exceptional stability and high capacity. The structures of LRNO and Al-LRNO were investigated with SGPR-based DFT potentials by incorporating a diverse genetic algorithm based on the known LRO structure. The effects of Ni-doping and Al-doping on the materials' stability under high-voltage conditions were investigated.
2. DFT calculations
To generate training and test sets for the SGPR potential energies, spin-polarized DFT calculations were conducted to obtain potential energies for both the discharged and charged phases of LRNO and Al-LRNO configurations, using a PAW plane-wave basis with an energy cutoff of 600 eV. The method involved the PBE+U functional with the Tkatchenko–Scheffler (TS)161 dispersion correction, where U = 7.5 eV and J = 0.6 eV for Ru and U = 7.7 eV and J = 1.0 eV for Ni. The structures were fully relaxed until the interatomic forces were smaller than 0.03 eV/Å.
3. Structure prediction
To identify the most favorable arrangements of cations in the LRNO and Al-LRNO systems, the lowest-energy structures were searched by altering atomic positions within the unit-cell formula to mirror the effect of doping. A modified genetic algorithm was employed, starting with a handful of randomly generated parent structures. These parents represented changes in atomic positions, swapping three Li/Ru sites within the 3 × 2 × 1 Li2RuO3 supercell with either three Ni atoms or a combination of two Ni and one Al atom, thereby forming the LRNO or Al-LRNO reference systems, respectively. At each step, numerous child structures were randomly generated from each parent, and the lowest-energy structures were selected from the combined parent and child generation for the next iteration. The search continued until minimal changes occurred within the parent structures over a few steps, while multiple independent searches were conducted. To calculate the energy of each structure, the ML potential was generated on-the-fly using the SGPR methodology, employing a variation of SOAP (a many-body descriptor) as the core kernel. During each energy calculation step, if the ML model fell short, precise DFT calculations were performed to update the model. An accuracy of 3 meV per atom was achieved by retaining the structures that could potentially possess the minimum energy within this margin for exact DFT calculations. Geometry optimization was performed to determine the lowest-energy structure. Upon identifying the stable cation ordering in Al-LRNO, a Li vacancy was introduced to balance the additional charge, and its placement was determined by the defect formation energy. To forecast the charging phase, each state of charge was handled separately, thus reducing the number of sampled geometries. Commencing with the pre-established cation ordering, ∼500 random delithiated geometries were generated for each model, with their energies predicted by the ML potential. The SGPR model recognized unique environments from around 11 samples among the prepared structures, learning the atomic potential from the DFT results. Again, the model identified the Li ordering with the lowest energy, which was validated subsequently by DFT. The trained model was validated by comparing the predicted and calculated formation energies of six randomly selected samples not used in training. For the training/test sets, each model exhibited an R2 of 0.96/0.92, RMSE of 10.2/10.4 meV, and MAE of 8.2/6.4 meV (Fig. 17).
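The search loop can be summarized by the highly simplified sketch below, in which a "structure" is an assignment of dopant atoms to lattice sites and energy() is a toy placeholder for the on-the-fly SGPR/DFT evaluation; the site counts and selection sizes are illustrative, not those of the actual study.

```python
import random

N_SITES, N_DOPANTS = 24, 3

def random_structure():
    return tuple(sorted(random.sample(range(N_SITES), N_DOPANTS)))

def energy(structure):                 # placeholder for the ML/DFT energy
    return sum((s - N_SITES / 2) ** 2 for s in structure)

def mutate(structure):
    sites = list(structure)
    sites[random.randrange(N_DOPANTS)] = random.randrange(N_SITES)
    return tuple(sorted(set(sites))) if len(set(sites)) == N_DOPANTS else structure

parents = [random_structure() for _ in range(6)]
for generation in range(50):
    children = [mutate(random.choice(parents)) for _ in range(20)]
    pool = sorted(set(parents + children), key=energy)
    parents = pool[:6]                 # keep the lowest-energy structures
print(parents[0], energy(parents[0]))
```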
4. Structures of LRNO and Al-LRNO
The structure features a honeycomb-like Li-transition metal layer reminiscent of LRO, showcasing an edge-shared section composed of RuO6-NiO6 or RuO6-AlO6 configurations [Fig. 18(a)]. A resemblance is observed in their respective calculated x-ray diffraction (XRD) spectra [Fig. 18(b)]. When comparing the optimized LRO, the predicted LRNO, and the experimental XRD results, no significant disparities were noted.162,163 This alignment significantly bolsters the reliability of the first-principles SGPR-based ML simulations of the structures of LRO, LRNO, and Al-LRNO. A notable observation is that in LRNO, Ni tends to form clusters, whereas in Al-LRNO, Ni and Al coalesce into alloy-type clusters, predominantly bonding with each other rather than exclusively with Ru. This inclination is supported by the cluster expansion coefficients (energies) derived from atomic pairs, obtained through the least absolute shrinkage and selection operator (LASSO) regression model.164 This intriguing pattern can be understood through the concept of immiscibility or miscibility among Al, Ni, and Ru.166,167 DFT calculations on the alloying or mixing energies of these metals suggest that Ni and Ru tend to be immiscible, leading to the clustering of Ni atoms adjacent to Ru in LRNO. Meanwhile, the calculations indicate that Al–Ni mixing is marginally more favorable than Al–Ru mixing, resulting in a preference for Al atoms to bond primarily with Ni atoms, especially when only a small quantity of Al atoms is present.
5. Stability fading of Al-LRNO due to delithiation at high-voltages
The delithiated structures of LRNO and Al-LRNO were investigated to scrutinize the impact of Al-doping on stability, particularly in high-voltage environments. Initially, the fully (100%) delithiated structures of LRNO and Al-LRNO were optimized. Delithiation in LRO triggers significant distortion, indicated by a short O–O distance of approximately 2.39 Å resulting from anionic activity.167 This leads to an essential de-coordination of oxygen atoms, inducing a structural distortion to alleviate strain energy through a reductive coupling mechanism. In LRO, oxygen prompts adjacent oxygen atoms to approach Ru along the O–O axis, causing O–O dimerization and eventual O2 release, contributing to poor anionic cycling stability, as highlighted in prior studies. However, in LRNO, after Ni substitution the distortion is somewhat alleviated, evident from a larger O–O distance (∼2.42–2.58 Å) compared to LRO. This suggests suppression of oxygen oxidation. Subsequent Al-doping in LRNO relaxes the delithiated structure further, resulting in a significantly increased O–O distance (∼2.50–2.62 Å). The O atoms attached to Al, exhibiting strong ionic-covalent character, hinder O–O dimerization, thereby suppressing O2 release. Additionally, the O defect formation energy was calculated to understand the suppression of the structural distortion due to O2 release. The calculations reveal O defect formation energies in the order Al-LRNO > LRNO > LRO at any delithiated state. Recognizing that 100% delithiation during charging is unrealistic, the 50% and 75% delithiated structures of LRNO and Al-LRNO were examined to determine their formation energies (in meV/f.u., where f.u. is formula unit) using the SGPR-based DFT-level potential. The results confirm the higher stability of the delithiated Al-LRNO structures over the corresponding delithiated LRNO structures, evidenced by more negative formation energies [Fig. 19(a)]. Despite similar destabilization from structural reconstruction in both Al-LRNO and LRNO, the heightened stability of delithiated Al-LRNO stems from Al's stronger reduction effect compared to Ni, as validated by the Bader charges. Specifically, the Bader charges (in a.u.) of O around Ni in LRNO for the 50% and 75% delithiated phases are −1.01 and −0.87, respectively. Conversely, around Al in Al-LRNO for the 50% and 75% phases, the values become more negative, registering at −1.23 and −1.12, respectively. These findings imply that Al effectively reinforces the oxygen binding energy in Al-LRNO, leading to greater suppression of oxygen evolution. Furthermore, a notable difference is observed in the delithiation around the Al atom compared to LRNO, where delithiation is significantly curtailed [Figs. 19(b)–19(g)]. At highly delithiated states, Li ions farther from Al tend to diffuse out first, whereas no specific Li position preference is noted around Ni.
6. Enhanced diffusion by Al-doping
Determining the diffusion coefficient via an AIMD study poses challenges due to the fluctuating lithium concentration during the charging and discharging cycles. In particular, at room temperature, the diffusion of Li atoms is an extremely rare event. To expedite the computations, diffusion was simulated at elevated temperatures of 700, 800, 900, 1000, and 1200 K. To assess the impact of Al-doping, the LRNO and Al-LRNO structures were examined employing (2 × 1 × 1) cells with a 50% Li concentration, comprising 122 and 120 atoms in the simulation cells, respectively. The SGPR-based ML DFT potential was trained on-the-fly during the simulations. MD simulations in the NVT ensemble with a Nosé–Hoover thermostat were conducted for 0.5 ns (1 ns at 700 K), using a time step of 2 fs. Each atom's mean squared displacement (MSD) from its initial position surpassed 50 Å2 within the simulation timeframe. Typically, the temperature dependence of the diffusion constant D is approximated using the Arrhenius relation D = D0 exp(−Ea/kBT), where Ea represents the activation energy. The calculated activation energies for LRNO and Al-LRNO with 50% Li concentration are 0.48 and 0.40 eV, respectively (Fig. 20). Simulations based on the SGPR ML DFT potential project a higher diffusion coefficient for the Al-doped material than for the pristine one, suggesting an approximate doubling of the Li+ mobility within the cathode material, especially around ∼700 °C, where Li+ diffusion favors Al-LRNO over LRNO.
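The diffusion coefficients above follow from applying the Einstein relation to the Li MSD; a minimal sketch is given below, with a random-walk toy trajectory standing in for the actual MD data and arbitrary units.

```python
import numpy as np

def diffusion_constant(positions, dt):
    """Einstein relation: D = slope of MSD(t) / 6, assuming a linear regime.

    positions has shape (n_frames, n_atoms, 3); dt is the time between frames.
    """
    disp = positions - positions[0]                  # displacement from t = 0
    msd = np.mean(np.sum(disp ** 2, axis=2), axis=1)
    t = np.arange(len(msd)) * dt
    slope = np.polyfit(t[1:], msd[1:], 1)[0]
    return slope / 6.0

# toy random-walk trajectory: 500 frames, 60 atoms, 3 Cartesian components
rng = np.random.default_rng(1)
traj = np.cumsum(rng.normal(scale=0.1, size=(500, 60, 3)), axis=0)
print(diffusion_constant(traj, dt=0.002))            # units: length^2 / time
```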
7. Electronic origin of stabilization and discussion
In the realm of cathode materials containing 4d-elements, the introduction of Al through doping has shown remarkable efficacy in mitigating capacity and voltage deterioration. This intervention not only addresses persistent challenges related to capacity fading but also curtails the decline in average discharge voltage, thereby paving the way for high-performance LIBs. Al-doping fosters the formation of Ni–Al bonding, bolstering the characteristics of bonding orbitals within Al–O bonds. This fortified Al–O bonding acts as a deterrent to oxygen oxidation, significantly enhancing structural stability and alleviating safety concerns. The capability of Al-doping to suppress capacity fading and voltage decay is pivotal in the development of stable, reversible layered cathode materials. Furthermore, the reported material possesses the potential to serve as a protective coating over cathode materials, substantially enhancing their electrochemical performance by imbuing them with heightened stability. The exceptional performance of this material can be attributed to several factors. The bonding orbital characteristics of Al–O, situated deep below the Fermi level in Al-doped material, effectively restrain excessive oxygen oxidation even during profound delithiation. This phenomenon stabilizes crystal structures by notably suppressing oxygen defect formation compared to LRNO. Al-LRNO demonstrates higher structural stability within the high voltage region due to the less oxidized state of Ni, a characteristic observed in this Al-doped material. In essence, the strategic utilization of Al-doping stands as a promising avenue for addressing key challenges in LIBs, fostering improved stability and performance.
Notably, this SGPR ML-based computational approach underscores the effective suppression of cationic migration within these high-voltage redox systems. Al-doping emerged as a highly impactful strategy, influencing the bonding dynamics to stabilize TM octahedrons within the high-voltage redox framework. The Al-doping not only enhances diffusion rates but also reduces Li-ion charge transfer resistance, bolstering the overall performance. Drawing from these fundamental insights and preceding experimental results, conventional cathode materials such as NCMA could potentially be harnessed for long-term development in high-energy-density batteries. The analogical relationship between Mn and Ru in the periodic table holds promise for this purpose. The exceptional attributes of Al-LRNO, including its high capacity, diminished voltage hysteresis, and remarkable structural integrity, owe credit to the induced Al–O bonding orbital characteristics. The ionic covalent nature of Al–O bonding orbitals, situated significantly below the Fermi level within the doped material, plays a pivotal role in restraining excessive oxygen oxidation, even under deep delithiation conditions, effectively stabilizing the crystal structure. We posit that the reported material holds potential as a protective coating for commercial cathode materials, fostering high-temperature stability and substantially augmenting their electrochemical performance. This strategic use of the material may pave the way for advancements in battery technology.
8. Sodium-substituted disordered rock salt cathode: Reduced energy barrier for oxygen evolution
Cation-disordered rock salt (DRX) cathodes stand as promising candidates for future high-energy-density materials, surpassing conventional layered cathodes in LIB technology. An intriguing discovery emerged when Na-doped DRX exhibited efficient electrocatalytic properties toward the OER. This OER electrocatalyst demonstrated a current density of 10 mA cm−2 at an overpotential (η) of 270 mV, displaying a Tafel slope of 67.5 mV dec−1 and impressive long-term stability exceeding 5.5 days, surpassing the benchmark IrO2 (η = 330 mV and Tafel slope = 74.8 mV dec−1). This intriguing electrochemical behavior finds strong support both experimentally and through ML-based searches for the minimum-energy structure. Na-doping notably facilitates O2 evolution by reducing the oxygen binding energy (OBE) on the surface, which decreases from 6.51 eV in pristine DRX to 5.45 eV with Na-doping. To comprehend Na's role, the scenario where 1/16 of the Li atoms in Li2RuO3 were substituted by Na was examined. Simulation-based relaxations of the atomic positions and cell structures demonstrated that the system's potential energy is minimized when the Na atom occupies the Li layer rather than the Ru layer, due to the compatibility between Na and Li. Subsequent discharging/delithiation was simulated using AIMD. SGPR MD was performed to construct an on-the-fly potential energy surface at the level of spin-polarized PBE+U calculations.66 The ML model accurately predicted the DFT forces with an RMSE of ∼0.2 eV/Å, which can be considered favorable at high temperatures. Simulations were conducted with a time step of 2 fs in the isothermal-isobaric (NPT) ensemble to monitor volume and shape changes during discharging at an elevated temperature of 700 K for enhanced dynamics. Two supercells were created, one with 32 undoped Li atoms and another with two Li atoms replaced by Na atoms in their minimum-energy positions. Both were simulated using MLMD to discern differences. During these simulations, the delithiation potential was calculated based on energy changes and volume variations in the crystal structure. The findings highlighted that while Na's presence did not notably affect the potential, it played a pivotal role in controlling the crystal volume. At lower delithiation stages (<50%), the Na-doped structure exhibited a slightly higher volume than the undoped crystal. However, at higher delithiation percentages (50%–80%), the presence of Na prevented excessive expansion along the stacking direction of the Ru layers, thereby enhancing the system's mechanical stability (Fig. 21).
C. Electrocatalysts
1. High-performing atomic electrocatalysts
There is an urgent need for developing efficient and environmentally friendly electrocatalysts for various electrochemical reactions including hydrogen/oxygen evolution/reduction, carbon dioxide reduction, ammonia reduction, hydroxide evolution, and chlorine evolution. Traditionally, these catalysts relied on expensive metals like platinum, ruthenium, or iridium. Recent efforts focus on using minimal amounts of these costly metals or finding cheaper alternatives, embedding them into graphene or other 2D materials to maximize surface area and catalytic activity. However, while this embedding enhances the active surface area, it can decrease electrical conductivity, impacting the catalyst's efficiency. To optimize these catalysts, researchers turn to high-throughput computer screening, analyzing different metal configurations onto/into 2D materials and their impact on stability, electronic properties, and electrochemical reduction/evolution activities. Their findings highlight specific configurations as displaying superior durability and high conductivity for efficient water splitting, crucial for clean energy applications like hydrogen-powered vehicles and electricity generation. The study also explores ML approaches and AIMD simulations to characterize high-performance catalysts.168–170 Detailed examinations have unveiled that single atom catalysts (SACs)161,172 follow distinct catalytic pathways compared to their nanocluster counterparts, resulting in differing selectivity in catalytic reactions. This means that by precisely adjusting the coordination environment of active sites, specific catalytic reaction pathways and product preferences can be achieved. Notably, many carbon-based materials have proven to be effective hosts for anchoring single transition metal atoms, serving as a favorable environment for these catalysts.67
Lately, there has been a surge of interest in atomically dispersed catalysts featuring isolated transition metal atoms supported on a variety of materials such as metal oxides, borides, carbides, nitrides, graphene, graphitic carbon nitride (g-C3N4), hexagonal boron nitride (h-BN), black phosphorus (BP), layered double hydroxides (LDHs), transition metal dichalcogenides (TMDs), MXenes, and MBenes.173 These catalysts have garnered attention due to their notably high utilization of atoms, exposed active sites, customizable electronic structure properties, and robust interaction between the metal and the supporting material. Known as single atom catalysts (SACs),161,172 catalysts with atoms embedded onto or into 2D materials are being widely explored and recognized as promising candidates for a wide range of catalytic reactions such as hydrogen evolution (HER),165,174–177 oxygen evolution/reduction (OER/ORR),166,178–180 nitrogen reduction (NRR),173,181–184 CO/CO2 reduction (CORR/CO2RR),69 and the chlorine evolution reaction (ChER).185 Ultimately, this research contributes to a clearer understanding of these catalysts, particularly for critical applications in clean-energy environments like hydrogen fuel cells and electric power generation.
2. HER/OER/ORR
Tiny yet mighty, nanoparticle (NP)-sized noble-metal electrocatalysts have been harnessed for a range of electrochemical reactions, particularly water splitting for an environmentally friendly hydrogen-based economy. Researchers have leveraged minuscule quantities of single atoms (SAs) to amplify the active surface area, fine-tuning the catalytic abilities by situating these atoms within defect sites on nitrogen-doped graphene (GN). Developing highly effective catalysts that serve multiple purposes, such as HER, OER, and ORR, has become a focal point in advancing clean and sustainable energy technologies.171 Focusing on these reactions, top-tier performances from 3d-5d TM SACs have been unveiled by employing DFT and ML. The investigation delved into the stability and activity of TM–GN, scrutinizing structure/coordination, formation energy, structural/electrochemical stability, electronic traits, conductivity, and reaction mechanisms. Among the array of –NnCm configurations, the –N2C2 compositions stand out for their ease of formation, showcasing superior electrochemical catalytic performance and prolonged durability, in particular for the HER, notably avoiding aggregation or dissolution, compared with the extensively studied –C4/C3 and –N4/N3 configurations.65
Introducing non-noble TMs such as Fe/Ni/Cu onto the surface of MXenes as individual SAs has demonstrated altered mechanical and electrical traits, albeit rarely observed. Through the use of DFT simulations, various combinations of MXenes are assessed for their stability with intermediates and adsorption free energies to gauge their efficacy as catalysts for these reactions. The majority of the considered MXenes remain thermodynamically stable upon doping with Ni/Cu/Fe, showcasing optimal performance for OER/ORR/HER, respectively. The ML potential was generated using on-the-fly sampling and SGPR algorithm.70 The kinetic stability of the material was confirmed by using SGPR ML potential for the catalyst. Surprisingly minimal overpotentials observed in these reactions highlight these catalysts' appeal for water splitting (overall potential of 0.48 V) and metal–air batteries (overall potential of 0.70 V), surpassing even commercially utilized IrO2(111) and Pt(111) for the OER and HER/ORR, respectively.70,186 The strong correlations observed between intermediates (scaling relations) and their interactions with the catalytic surface predominantly dictate their performance. Doped metals not only serve as active sites but also play a crucial role as conduits for charge transfer between the surface and adsorbate. The notably low reaction barriers of Ni–Cr2ScC2O2 for all three reactions, coupled with robust reaction kinetics, position it as an efficient catalyst serving three functions simultaneously.
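For context, theoretical OER overpotentials of the kind quoted above are usually obtained from the free-energy changes of the four proton-coupled electron-transfer steps within the computational hydrogen electrode scheme; the sketch below uses placeholder ΔG values rather than those of Ref. 70.

```python
# eta_OER = max(dG_i)/e - 1.23 V, with the four dG_i summing to 4.92 eV
dG_steps = [1.50, 1.60, 1.00, 0.82]      # eV per step, placeholder values
eta_oer = max(dG_steps) - 1.23           # theoretical overpotential in volts
print(f"limiting step: {max(dG_steps):.2f} eV, eta_OER = {eta_oer:.2f} V")
```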
Here, we show heterogeneous catalysis dynamics of H/Pt reported by Vandermause et al.45 They validated a ML sparse Gaussian process (SGP) force field model for simulating Pt, H2, and H/Pt systems, comparing its performance with DFT and ReaxFF187,188 calculations. The SGP model was trained on energy, forces, and stress data from MD simulations and tested on various properties not included in the training set. The SGP model predicted the lattice constant within 0.1% and the bulk modulus and elastic constants within 6% of DFT values. It performed significantly better than ReaxFF, particularly for the C44 elastic constant. The model accurately predicted energies and properties for structures not encountered during training, showing low uncertainties and excellent agreement with DFT, especially for bulk Pt equilibrium volume and H2 bond lengths. It generalized well for different Pt surface facets. The SGP model showed higher fidelity to DFT, avoiding non-physical behaviors like surface sublimation of Pt hydride species observed in ReaxFF. The model's predictions of H2 dissociative adsorption and atomic H diffusion on Pt(111) closely matched DFT results. MD simulations using SGP yielded an apparent activation energy and diffusion prefactor consistent with experimental values. Overall, the SGP model demonstrated superior accuracy and reliability in comparison to ReaxFF and showed promising generalization capabilities for unseen configurations (Fig. 22).
3. CO2RR
Carbon capture, storage, and conversion technologies are seen as viable strategies to tackle excessive CO2 emissions. The electrocatalytic reduction of CO2 into carbon-based fuels and chemicals using renewable electricity is a promising method to mitigate CO2 levels in the atmosphere and address energy concerns. While carbon monoxide (CO) and formic acid (HCOOH) are common products of CO2 reduction, further transformations can yield valuable outputs like methane, methanol, ethylene, and ethanol. However, CO2's chemical inertness leads to high overpotentials and low selectivity for desired products, making the process challenging. High voltages may trigger hydrogen evolution and hinder CO2 reduction efficiency. While carbon-based materials effectively host TM SAs, stability during catalysis remains an issue. Efforts are focused on identifying non-carbon 2D materials with strong metal-support interaction and structural tunability. For instance, ultrathin gallium nitride (GaN) monolayers have been synthesized as promising hosts for TM SAs, showcasing exceptional electronic and optical properties. GaN-based SACs have demonstrated efficacy in various catalytic reactions such as nitrogen reduction, oxygen evolution/reduction, and more, showcasing potential multifunctionality through precise coordination modulation of active sites. This advancement highlights the potential of GaN SACs in diverse catalytic reactions by precisely adjusting the coordination environment of active sites.69
MD simulations at the AIMD level, based on the SGPR ML potential at 500 K and using the Nosé–Hoover thermostat, tested the kinetic stability of the proposed structures. The representative examples of (a) Ir-SAC and (b) Os-SAC also showed the dynamic stability of these structures (Fig. 23).
4. NRR
Nitrogen (N2) is abundant but inert, requiring transformation into bio-available forms like ammonia (NH3) and other nitrogen compounds through a process called nitrogen fixation. This process is crucial for sustaining life, impacting agriculture, soil fertility, ecosystem health, and even potential energy sources. NH3, a key product of nitrogen fixation, is vital in fertilizer production, pharmaceuticals, and energy storage due to its high energy density and hydrogen content. As the global population grows, the demand for NH3 escalates, necessitating cost-effective production methods that are environmentally friendly. Current industrial methods, like the Haber–Bosch process, though pivotal for food production, contribute significantly to greenhouse gas emissions. Efforts are underway to explore alternatives to the energy-intensive and environmentally impactful industrial processes. Electrochemical nitrogen reduction (NRR) shows promise as a cleaner, more sustainable method for NH3 synthesis, drawing inspiration from biological nitrogenase enzymes. However, challenges persist in finding efficient catalysts that can enhance the selectivity and yield of NH3 production while competing with hydrogen evolution reactions. Researchers are investigating diverse materials, including transition metals, borides, sulfides, oxides, single-atom catalysts, MBenes, and defect-engineered substrates, to improve NRR efficiency. Various studies aim to elucidate mechanisms, control parameters, and catalyst performance using experimental, theoretical, and computational approaches and to explore innovative solutions to improve electrochemical NRR performance in the coming years.
SACs for an effective nitrogen reduction reaction (NRR), based on various transition metals implanted on boron arsenide (BAs), boron phosphide (BP), and boron antimonide (BSb), were investigated using DFT. The kinetic stability of the catalyst with the lowest potential-determining step (PDS), W-BAs, was examined using simulations based on the SGPR machine-learned potential. W-BAs showed high catalytic activity and excellent selectivity, with an insignificant barrier of only 0.05 eV along the distal pathway and a surmountable kinetic barrier of 0.34 eV. W-BSb and Mo-BSb also exhibit high performance, with limiting potentials of −0.19 and −0.34 V, respectively.
To check the dynamic stability of the single- and double-vacancy systems, MD simulations with the SGPR-based ML DFT potential/forces were performed at 500 K. The simulations indicated the dynamical stability of the (a) Pd@2DCP, (b) Fe@h-BN, and (c) Fe@h-B2N2 structures (Fig. 24),181–183 supporting the stability of mono- and dual-type non-metal-doped TM-SACs that can be utilized for the NRR.
5. ClER
The chlor-alkali process stands as one of the fundamental chemical industries globally, with an annual chlorine production surpassing 75 × 10^6 tons. The production of each ton of chlorine demands approximately 2500 kW h of electricity, contributing to an overall requirement of around 200 TW h annually for the chlor-alkali sector alone. This consumption constitutes roughly 1% of the world's total energy production. Given these substantial energy demands, it becomes paramount to explore novel, energy-efficient, and eco-friendly methods for synthesizing Cl2 at a lower cost. Furthermore, achieving selectivity in the oxygen evolution reaction over the chlorine evolution reaction is crucial for industries like pharmaceuticals, cosmetics, and computing. A promising and cost-effective technique to produce pure Cl2 involves the use of a membrane-separated device within the chlor-alkali process. Presently, electrocatalysts based on RuO2 and IrO2 dominate the chlorine evolution reaction (ClER). However, these metal oxides suffer from inadequate catalytic activity, selectivity, and durability due to concurrent oxygen generation during Cl2 production. For a highly efficient ClER, a catalyst capable of producing Cl2 with minimal overpotential in the chlor-alkali process is imperative. In this regard, ML potential/force-field simulations were carried out in search of more efficient electrocatalysts. The optimal configurations of the complex systems were obtained by a simulation/optimization process utilizing the SGPR-based ML DFT potential. The free energy profiles of the ClER for Pt–C4, Pt–N4, Pt–C2N2, and related Pt–C2-type moieties are shown in Fig. 25. In this context, the Pt–C2N2-moiety-steered electrocatalyst (referred to as Pt-1) exhibits promising ClER features. Comparative studies demonstrate that Pt-1 displays an onset potential of only 5 mV at 80 °C, in line with industrial Cl2 production temperatures. At a current density of 10 mA/cm2, it exhibits a minimal overpotential (η) of only 39 mV, contrasting with ∼100 mV for the best industrial electrocatalyst. These results align closely with the DFT study. Pt-1 showcases superior selectivity in Cl2 production (up to 100%, compared to 90% for industrial electrocatalysts), exceptionally high mass catalytic activity (∼2.12 A/mgPt against 1.49 × 10−5 A/mgIr−Ru at 70 mV overpotential), and long-term durability in acidic electrolytes. Notably, the reported ClER mass activity for Pt-1 in acidic media surpasses that achieved by any other catalyst to date. The efficient Cl2 generation capability of Pt–C2N2 at remarkably low applied voltages in acidic conditions signifies a groundbreaking advancement toward eco-friendly and economical pure Cl2 production, potentially saving substantial energy. The as-prepared Pt–C2N2 catalyst exhibits superior electrochemical performance for chlorine production in acidic electrolytes. These catalysts can be synthesized industrially at low temperatures, demonstrating exceptional activity, durability, and selectivity (100%), outperforming the commercial and membrane catalysts commonly adopted in industry. The industrial electrodes (IEs) comprise substantial amounts of precious metals. Their deactivation arises from weak adhesion of the coating to the substrate, leading to the separation of electrocatalyst species from the electrode surface. Additionally, oxygen penetration through cracks during electrolysis causes electrode passivation.
This process reduces the electroactive sites due to the oxidation of RuO2 and IrO2 to RuO4 and IrO3 at high potentials. However, the Pt-C2N2 species in our electrode material circumvent these issues, resulting in heightened stability (Fig. 25).
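As an illustration of how such free-energy profiles translate into onset potentials and overpotentials, the following sketch applies the standard computational-electrode argument. The two-step (Volmer–Heyrovsky-type) mechanism and the sample ΔG values are illustrative assumptions, not the values computed for Pt-1.

```python
# Hedged sketch: reading a limiting potential and overpotential off a computed
# ClER free-energy profile (computational-electrode approach).  The mechanism
# and the dG values below are illustrative assumptions only.
U_EQ = 1.36  # standard Cl2/Cl- equilibrium potential (V vs. SHE)

def cler_overpotential(dg_steps_eV):
    """dg_steps_eV: free-energy change of each one-electron step at U = 0 (eV)."""
    # At applied potential U each anodic step shifts as dG(U) = dG(0) - eU,
    # so all steps become downhill at the limiting potential max(dG)/e.
    u_limiting = max(dg_steps_eV)        # in volts, since e = 1 in eV/V units
    return u_limiting - U_EQ             # overpotential (eta)

# Example: Cl adsorption and Cl2 release on a hypothetical active site
dg_example = [1.41, 1.36]                # illustrative numbers only (eV)
print(f"eta = {cler_overpotential(dg_example) * 1000:.0f} mV")
```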
D. Perovskite solar cells
For perovskite solar cell systems, a number of theoretical studies combining ML and AIMD simulations have been reported.189,190 The interfaces connecting the perovskite to the charge-transport layers contain significantly more imperfections than the perovskite layer itself, roughly 100 times more. These imperfections, especially the deep-level ones, noticeably reduce the power-conversion efficiency of these devices.72,73,191–193 Recent efforts to reduce these flaws at the interfaces have primarily focused on surface treatment.194 However, treating the surface of the perovskite that contacts the electron-transport layer is challenging, as the agents used for treating the electron-transport layer may dissolve during deposition of the perovskite thin film. Conversely, if a seamless connection between the electron-transport and perovskite layers could be established, concerns about interfacial flaws would become less significant.
The development of an intermediary layer between the SnO2 electron-transport layer and the light-absorbing halide perovskite layer involves combining Cl-bonded SnO2 with a Cl−-containing perovskite precursor. This intermediary layer is coherent at the atomic level, improving the extraction and transport of charges from the perovskite layer while exhibiting fewer interfacial imperfections. The coherent intermediary layer has enabled perovskite solar cells with an efficiency of 25.8% under standard illumination. Furthermore, even after continuous light exposure for 500 h, uncovered devices largely maintained their initial efficiency. These findings provide valuable insights into crafting interfaces that minimize imperfections between metal halide perovskites and electron-transport layers.
The experimental analysis suggested the ability to form stable FA(MA)SnClx structures between the FAPbI3 perovskite and SnO2 without significant distortion in the lattice. However, when TiO2 was utilized as an electrode, establishing a stable interface proved challenging due to the substantial difference in electronegativity between Ti and Sn [Fig. 26(b)]. As per DFT simulations, the interface between the perovskite and TiO2 did not demonstrate stability in the TiO2/FAPbI3 scenario [Fig. 26(c)]. Conversely, the perovskite and SnO2 interface exhibited the creation of FASnCl3 interlayers at the SnO2/FAPbI3 interface [Fig. 26(d)], resulting in a favorable alignment of energy bands between SnO2 and FAPbI3 [Fig. 26(a)]. The conduction band minimum (CBM) of SnO2 was slightly lower (by 0.21 eV) than that of FAPbI3 due to the presence of short strong hydrogen bonding (SSHB) between interfacial oxygen and the FA molecule, stabilizing the dangling bond of the interfacial oxygen.195
E. Universal first-principles force-fields for hydrocarbon systems
Section V E presents the development of universal ML IAPs for a wide range of hydrocarbons, encompassing alkanes, alkenes, and aromatic compounds. Leveraging the SGPR algorithm and on-the-fly adaptive sampling, these potentials were generated at the PBE+D3 level of DFT. The ML models were trained on diverse hydrocarbon molecules and their clusters, encompassing intra- and inter-molecular interactions. Notably, the combined potentials accurately capture the intricate potential energy surfaces of various hydrocarbon systems, demonstrating excellent transferability from small molecules to long alkanes, polyenes, and complex ring–chain molecular systems. Furthermore, the ML-IAPs successfully predicted the structures and energies of the β-carotene monomer and dimer. Simulation outcomes for liquid ethylene and toluene crystals showcased the ability of the ML IAPs to reproduce molecular volumes and the stability differences between crystalline phases. These findings signify the potential of such ab initio-level force-fields to advance toward modeling universal organic, polymeric, and biomolecular systems. Classical force-fields like OPLS,196–200 AMBER,207 GROMACS,202 and CHARMM203 have long served as essential tools for simulating organic, biological, polymeric, and macromolecular systems. Though these force-fields reproduce some experimental data, such as conformations, reasonably well, they fall short in describing non-covalent inter-molecular interactions194–199 as well as helicity.210–214 It is particularly difficult for them to accurately describe both structures and properties.215,216 Furthermore, their reliance on descriptors such as bond stretching and angle bending limits their applicability to non-reactive systems, hindering their ability to capture diverse molecular behaviors accurately.
In contrast, ML IAPs offer non-parametric representations, but their efficacy depends on comprehensive descriptors that account for many-body correlations while remaining invariant to system transformations. ML-IAPs rely on "descriptors" derived from atomic positions for their expression. While bonds, bond angles, and torsional angles can serve as useful descriptors, their utility is limited to non-reactive systems. Consequently, more versatile descriptors are needed that capture many-body correlations while remaining unchanged under translations and rotations of the physical system and permutations of identical atoms.128 Ideally, a seamless correspondence should exist from the coordinate space to the descriptor space. If the transformation to descriptors lacks injectivity, it imposes a lower bound on the achievable accuracy of ML-IAPs.217 Thus, the ongoing challenge involves developing more comprehensive descriptors.
Sampling data for ML-IAPs poses a significant challenge, necessitating training data from various domains of the potential energy surface. Most ML-IAPs are therefore generated on-the-fly with MD simulations.43,44,54,91,101 Active or on-the-fly learning requires assessing the model's uncertainty at given input points. Bayesian inference methods predict uncertainty straightforwardly, while NNs emulate it as the variance among predictions from a committee of NNs trained with different architectures or random seeds. However, generating the initial NNs demands substantial effort. While NNs are advantageous when abundant ab initio data are available, the challenge remains in sampling these data across all relevant domains of the potential energy surface. Conversely, kernel-based or Bayesian methods enable on-the-fly model generation. The difficulties in utilizing NNs for on-the-fly training stem from their lack of uncertainty prediction comparable to Bayesian methods and their extensive hyper-parameterization. Re-optimizing all hyper-parameters upon sampling new data becomes cumbersome in high-dimensional spaces and is less conducive to on-the-fly training. Among Bayesian methods, exact GPR offers optimal accuracy, but its computational cost scales cubically, as $\mathcal{O}(n^3)$, with the training data size $n$. In contrast, SGPR scales more favorably, as $\mathcal{O}(nm^2)$ with $m \ll n$ inducing points, while retaining the favorable characteristics of GPR. Hence, SGPR emerges as an ideal choice for on-the-fly sampling of optimal datasets in generating ML IAPs.
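To make the scaling contrast concrete, the toy sketch below fits the same one-dimensional data with exact GPR and with a projected-process (SGPR-style) approximation. The kernel, data, noise level, and inducing-point count are illustrative assumptions; this is not the AutoForce implementation.

```python
# Minimal sketch: exact GPR (n x n inversion, ~O(n^3)) vs. a projected-process
# sparse GPR (only m x m systems, ~O(n m^2)) on a toy 1D "potential".
import numpy as np

def rbf(X, Y, ell=0.4):
    """Squared-exponential kernel between two sets of 1D points."""
    d = X[:, None] - Y[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

rng = np.random.default_rng(0)
n, m, sigma = 2000, 20, 5e-2                  # n data points, m inducing points
X = rng.uniform(-3, 3, n)                     # toy "configurations"
y = np.sin(2 * X) + 0.5 * X**2 + sigma * rng.standard_normal(n)
Z = np.linspace(-3, 3, m)                     # inducing (sparse) points

# Exact GPR: invert an n x n covariance matrix
K = rbf(X, X) + sigma**2 * np.eye(n)
alpha_exact = np.linalg.solve(K, y)

# Sparse GPR (projected process): only m x m linear algebra
Kmm = rbf(Z, Z) + 1e-8 * np.eye(m)
Kmn = rbf(Z, X)
A = sigma**2 * Kmm + Kmn @ Kmn.T
w = np.linalg.solve(A, Kmn @ y)               # weights on inducing points

Xs = np.linspace(-3, 3, 7)
print("exact :", rbf(Xs, X) @ alpha_exact)
print("sparse:", rbf(Xs, Z) @ w)              # nearly identical predictions
```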
The focus of this study lies in developing ML IAPs using SGPR for a broad spectrum of hydrocarbons, encompassing alkanes and alkenes with different structures and different degrees of unsaturation. Though triple bonds have not yet been included, the work considered not only saturated, olefinic, and aromatic hydrocarbons, including alkane, alkene, alkadiene, and alkatriene molecules (having zero to an arbitrary number of C=C double bonds) with branches, but also cyclic saturated/unsaturated and aromatic hydrocarbons, and further their combined structures.81,82 This research aims to lay the groundwork for a universal ab initio force-field specific to this subset of hydrocarbons. The goal is to bridge the accuracy gap observed in conventional force-fields and create a more versatile model capable of accurately describing diverse organic and bioorganic systems as well as polymers and macromolecular systems.
1. Machine learning potentials
In this work, all first-principles calculations used the Vienna ab initio simulation package (VASP)107 with the Perdew–Burke–Ernzerhof (PBE) generalized-gradient approximation (GGA) functional109 and the correction method of Grimme et al.219 for van der Waals interactions (D3). The Python package AutoForce105 was used for the on-the-fly generation of SGPR models.54 For the accuracy of the ML IAPs, the coefficient of determination $R^2 = 1 - \sum_i |\mathbf{f}_i - \tilde{\mathbf{f}}_i|^2 / \sum_i |\mathbf{f}_i - \bar{\mathbf{f}}|^2$ is reported, where $\mathbf{f}_i$ are the first-principles (PBE+D3) forces, $\bar{\mathbf{f}}$ is their average, and $\tilde{\mathbf{f}}_i$ are the ML forces.
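For clarity, the force metrics quoted in the tables below could be computed as in the following minimal sketch; the array names and shapes are illustrative assumptions.

```python
# Hedged sketch: force MAE and coefficient of determination R^2 from reference
# (DFT) and ML force arrays.  Shapes and values below are placeholders.
import numpy as np

def force_metrics(f_dft, f_ml):
    """f_dft, f_ml: arrays of shape (n_atoms_total, 3) in eV/Angstrom."""
    mae = np.mean(np.abs(f_dft - f_ml))                  # force MAE
    ss_res = np.sum((f_dft - f_ml) ** 2)                 # residual sum of squares
    ss_tot = np.sum((f_dft - f_dft.mean(axis=0)) ** 2)   # total variance
    return mae, 1.0 - ss_res / ss_tot                    # (MAE, R^2)

rng = np.random.default_rng(0)
f_dft = rng.standard_normal((300, 3))                    # placeholder reference forces
f_ml = f_dft + 0.05 * rng.standard_normal((300, 3))      # placeholder ML forces
print(force_metrics(f_dft, f_ml))
```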
Here, training with a particular molecule means MD simulations with an adaptive sampling algorithm similar to that of Ref. 54 for on-the-fly ML of the potential energy surface. Accurate variance prediction is central to efficient on-the-fly training. All available conformers of a given molecule are used for training, and each conformer is simulated for 3–6 ps of MD at 300 K. Molecules are queued for training in order of increasing complexity. Owing to the abundance of relevant molecules, the trained molecules are split into several groups and independent SGPR models are trained in parallel using the AutoForce package. In this study, we considered expert models for linear or branched systems (C1–C8 alkane, C3–C10 isoalkane, C2–C6 alkene, C3–C6 branched alkene, C3–C8 alkadiene, C7–C9 alkatriene), cyclic systems (C3–C10 cycloalkane, C3–C8 cycloalkene, C4–C10 methyl-/dimethyl-cycloalkane, bicycloalkane), and aromatic systems (benzene, toluene, xylene, trimethylbenzene, mesitylene) to build the universal model. To account for intermolecular interactions, expert models for the liquid phases of alkanes (methane, ethane, propane, butane), an alkene (ethylene), and an aromatic molecule (benzene) are also trained and denoted as alkane-inter, alkene-inter, and aromatic-inter, respectively. The alkane model is extended by simulations of small clusters and liquid phases of methane, ethane, propane, and butane to learn inter-molecular interactions; this extended alkane model is distinguished as "alkane†." Later, independent MD simulations are repeated to sample independent testing sets for all conformers of the above molecules.
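The following self-contained toy illustrates the spirit of this adaptive, on-the-fly sampling (it is not the AutoForce/SGPR implementation): a GP surrogate of a one-dimensional toy "PES" is refit during simple dynamics, and the reference function, standing in for DFT, is queried only where the predicted uncertainty exceeds a threshold. The potential, kernel, threshold, and dynamics are all illustrative assumptions.

```python
# Toy on-the-fly (active) learning of a 1D potential during dynamics.
import numpy as np

def true_pes(x):                      # stand-in for an ab initio call
    return np.sin(3 * x) + 0.1 * x**2

def kernel(a, b, ell=0.3):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

X, y = np.array([0.0]), np.array([true_pes(0.0)])   # initial dataset
sigma_n, threshold = 1e-3, 0.2                      # noise and uncertainty threshold
x, v, dt = 0.0, 0.8, 0.05                           # toy "MD" state
n_calls = 1

for step in range(400):
    # Refit the GP on the current data and evaluate it at the current point
    K = kernel(X, X) + sigma_n**2 * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    k = kernel(np.array([x]), X)
    mean = (k @ alpha).item()
    var = 1.0 - (k @ np.linalg.solve(K, k.T)).item()
    if np.sqrt(max(var, 0.0)) > threshold:          # uncertain: query the "DFT" reference
        X, y = np.append(X, x), np.append(y, true_pes(x))
        n_calls += 1
        K = kernel(X, X) + sigma_n**2 * np.eye(len(X))
        alpha = np.linalg.solve(K, y)
        k = kernel(np.array([x]), X)
        mean = (k @ alpha).item()
    # Crude dynamics on the learned surface (numerical gradient of the GP mean)
    eps = 1e-4
    mean_eps = (kernel(np.array([x + eps]), X) @ alpha).item()
    v -= (mean_eps - mean) / eps * dt
    x += v * dt

print(f"reference calls: {n_calls} out of 400 MD steps")
```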
2. Macromolecular systems
a. Expert models
The expert SGPR model for each category of molecules is generated on-the-fly by short MD simulations (3–6 ps) of the associated molecules at 300 K. As an example, training for the elementary alkane molecules is shown in Fig. 27. The model trained after each MD simulation is used as the initial model for the next. In this way, the forces for larger molecules are inferred from the data for smaller ones more frequently and fewer ab initio calculations are performed during on-the-fly training. Expert models are also trained for cyclic hydrocarbons (cycloalkane, cycloalkene, and bicycloalkane) and aromatic systems, and then combined with the previously generated models for alkane, isoalkane, alkene, and alkadiene. Each generated expert model is validated against DFT calculation results. Similar MD simulations are repeated without on-the-fly training to generate independent testing sets. The performance of the expert models on their testing sets is shown in Table III in terms of the mean absolute error (MAE) and R². These SGPR models closely approximate the ab initio PBE+D3 potential energy surface in their respective domains.
Group | Ntest | Energy* MAE (meV) | Force MAE (eV/Å) | Force R² (testing/training)
---|---|---|---|---
Alkane | 448 | 4.9 | 0.076 | 0.978/0.993
Isoalkane | 308 | 3.0 | 0.080 | 0.977/0.989
Alkene-1 | 142 | 8.1 | 0.110 | 0.966/0.985
Alkene-2 | 88 | 5.3 | 0.110 | 0.955/0.98
Alkene-3 | 238 | 4.0 | 0.087 | 0.975/0.991
Alkadiene | 374 | 4.5 | 0.110 | 0.959/0.978
Alkatriene | 238 | 3.2 | 0.108 | 0.992/0.966
Cycloalkane | 253 | 2.3 | 0.074 | 0.983/0.990
Cycloalkene | 77 | 3.4 | 0.110 | 0.963/0.984
Methyl/dimethyl-cycloalkane | 946 | 5.2 | 0.092 | 0.970/0.989
Bicycloalkane | 594 | 2.1 | 0.075 | 0.980/0.987
Aromatic | 83 | 2.1 | 0.074 | 0.997/0.983
b. Long alkane molecules
The alkane SGPR model, trained using only short molecules, fails to reproduce the potential energy surface for longer molecules when they fold and form a coil: upon folding, different parts of the same molecule interact with each other, which cannot be learned from short molecules alone. Instead, the alkane† model, which is extended to inter-molecular interactions using small alkane molecules, is applicable. Here we used this model for MD simulations (without on-the-fly training) of long n-alkane molecules for 100 ps at 300 K. Snapshots of C64H130 are shown in Fig. 28. A testing set is generated by collecting snapshots every 1 ps. The performance of the alkane† model is summarized in Table IV, which shows a high degree of transferability. This table also reconfirms that the computational cost grows steeply (approaching cubic scaling) with system size for DFT but only mildly, nearly linearly, for ML.
Molecule | ML time (s) | DFT time (s) | Energy* MAE (meV) | Force MAE (eV/Å) | R² (testing)
---|---|---|---|---|---
C8H18 | 0.44 | 82.07 | 3.3 | 0.070 | 0.985
C16H34 | 0.65 | 282.36 | 4.7 | 0.085 | 0.977
C32H66 | 1.08 | 1836.88 | 4.2 | 0.091 | 0.973
C64H130 | 1.92 | 12310.9 | 4.8 | 0.095 | 0.970
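As a quick consistency check of this scaling claim, one can fit effective power laws to the per-step timings listed in Table IV; the fit below is only an illustrative estimate based on those four data points.

```python
# Effective scaling exponents from the Table IV timings (values copied from
# the table above); the power-law fit is an illustrative estimate only.
import numpy as np

n_carbons = np.array([8, 16, 32, 64])                   # C8H18 ... C64H130
t_ml = np.array([0.44, 0.65, 1.08, 1.92])               # ML time (s)
t_dft = np.array([82.07, 282.36, 1836.88, 12310.9])     # DFT time (s)

# Fit t ~ N^p on a log-log scale; the slope p is the effective exponent.
p_ml = np.polyfit(np.log(n_carbons), np.log(t_ml), 1)[0]
p_dft = np.polyfit(np.log(n_carbons), np.log(t_dft), 1)[0]
print(f"effective exponents: ML ~ N^{p_ml:.1f}, DFT ~ N^{p_dft:.1f}")
# Output: roughly N^0.7 for ML and N^2.4 for DFT over this size range.
```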
c. Universal model
The SGPR models for alkane, isoalkane, alkene-(1, 2, 3), and alkadiene are combined into a single universal model for linear/branched hydrocarbons with an arbitrary number of double bonds. The model was rebuilt using all of the ab initio data and inducing descriptors sampled during training of the expert models63 (a minimal sketch of this merging step is given after Table V). As shown in Table V, the universal model is more accurate than the expert models even in their own domains. The correlation of the predicted energies with the exact ab initio energies for all testing sets is shown in Fig. 29.
Group | Energy* MAE (meV) | Force MAE (eV/Å) | Force R²
---|---|---|---
Alkane | 3.4 | 0.055 | 0.989
Isoalkane | 3.3 | 0.060 | 0.986
Alkene-1 | 5.6 | 0.075 | 0.98
Alkene-2 | 5.1 | 0.079 | 0.973
Alkene-3 | 4.1 | 0.071 | 0.983
Alkadiene | 4.4 | 0.081 | 0.977
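A minimal sketch of the merging step referenced above is given below, under toy assumptions (one-dimensional data, an RBF kernel, and a projected-process SGPR solve). Real expert models pool ab initio energies/forces and LCE-based inducing descriptors instead of these toy quantities.

```python
# Hedged sketch: rebuild a "universal" sparse model from the pooled data and
# inducing points of two "expert" models on a toy 1D function.
import numpy as np

def rbf(X, Y, ell=0.3):
    d = X[:, None] - Y[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def fit_sgpr(X, y, Z, sigma=5e-2):
    """Return inducing-point weights of a projected-process SGPR model."""
    Kmm = rbf(Z, Z) + 1e-8 * np.eye(len(Z))
    Kmn = rbf(Z, X)
    return np.linalg.solve(sigma**2 * Kmm + Kmn @ Kmn.T, Kmn @ y)

rng = np.random.default_rng(1)
f = lambda x: np.sin(2 * x) + 0.3 * x                 # toy "PES"

# Two "expert" datasets covering different domains of the toy PES
X1, X2 = rng.uniform(-3, 0, 400), rng.uniform(0, 3, 400)
experts = [(X1, f(X1), np.linspace(-3, -0.2, 8)),
           (X2, f(X2), np.linspace(0.2, 3, 8))]

# Universal model: pool all data and all inducing points, then refit once
X_all = np.concatenate([e[0] for e in experts])
y_all = np.concatenate([e[1] for e in experts])
Z_all = np.concatenate([e[2] for e in experts])
w = fit_sgpr(X_all, y_all, Z_all)

x_test = np.linspace(-3, 3, 5)
print("universal prediction:", rbf(x_test, Z_all) @ w)
print("reference values:    ", f(x_test))
```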
For a case study, several linear and branched isomers of C11H18 were considered; these isomers contain three C=C double bonds (see Fig. 30). It is worth noting that the training molecules contained at most two C=C double bonds (alkadiene). These structures were relaxed using the universal ML-IAP model, PBE+D3, B3LYP+D3, and the OPLS-AA force-field by minimizing the forces to below 0.01 eV/Å. B3LYP+D3 is included since it is a popular functional for molecular systems; additionally, it is instructive to check how the error of the ML-IAP compares with the disagreement between two widely used density functionals. The energies of the optimized structures, depending on the method used for relaxation, are shown in Fig. 31. For structures optimized using PBE+D3, single-point ML-IAP and OPLS-AA energies are compared. The energies of the structures optimized using the ML-IAP, PBE+D3, and B3LYP+D3 are in reasonable agreement, while OPLS-AA fails to reproduce the trend. It should be noted that OPLS-AA is an empirical potential designed to statistically reproduce certain experimental observations rather than single-point relaxed structures.
d. Rotation of CH3 group in ethane
The accuracy of the universal ML-IAPs generated by combining expert models is tested first. As the simplest case, a nudged elastic band (NEB) calculation is conducted to predict the rotation barrier of the CH3 group in ethane. The barrier for rotating the CH3 group by 120° is calculated using the universal ML-IAPs and compared with PBE+D3 results. The barrier was predicted to be 0.12 eV, close to the PBE+D3 result of 0.11 eV.
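A hedged sketch of such an NEB setup with ASE is shown below. The endpoint geometry files are assumed inputs, and the EMT calculator is only a stand-in so the script runs; in practice it would be replaced by an ASE wrapper of the universal SGPR ML-IAP.

```python
# Hedged NEB sketch (ASE).  "ethane_staggered.xyz" and the rotated endpoint
# are assumed input files; EMT is a placeholder calculator, not the ML-IAP.
from ase.io import read
from ase.neb import NEB
from ase.optimize import BFGS
from ase.calculators.emt import EMT

initial = read("ethane_staggered.xyz")            # relaxed staggered minimum
final = read("ethane_staggered_rotated.xyz")      # equivalent minimum, one CH3 rotated by 120 deg

n_images = 7
images = [initial] + [initial.copy() for _ in range(n_images - 2)] + [final]
for image in images:
    image.calc = EMT()   # stand-in only; attach the SGPR ML-IAP calculator in practice

band = NEB(images, climb=True)    # climbing-image NEB for a better saddle point
band.interpolate()                # linear interpolation between the endpoints
BFGS(band).run(fmax=0.05)         # relax the band

energies = [image.get_potential_energy() for image in images]
print(f"CH3 rotation barrier: {max(energies) - energies[0]:.2f} eV")
```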
e. Dimerization of β-carotene
The universal ML-IAPs are used to predict the structures of molecular systems containing various hydrocarbon groups. Here, the structures of the β-carotene monomer and dimer are investigated using the universal potential. β-carotene is an organic molecule containing eight isoprene units and two trimethyl-cyclohexene rings. The expert models covering C–C single bonds, C=C double bonds, and cyclic hydrocarbons are used to construct the universal model. The relative energies of 20 different conformers of the β-carotene monomer are predicted close to the PBE+D3 results, with a low RMSE of 0.12 eV.
f. Liquid ethane and ethylene
For liquid ethane, the alkane† model reproduces the experimental data reasonably well:

Property | Units | ML (alkane†) | Experimental
---|---|---|---
vl | Å³ | 90.3(4) | 91.5
 | kcal/mol | 4.01(2) | 3.52
Also, the molecular volume of liquid ethylene is predicted using the ML-IAPs. NPT MD simulations of 34 ethylene molecules are conducted for 30 ps at a constant pressure of 1 bar and a temperature of 169.5 K. The volume per molecule vl is predicted to be 76 ± 6 Å³, in good agreement with the experimental value220 of 81.5 Å³. The remaining discrepancy from experiment seems to arise from the inherent limitations of the PBE+D3 description.
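The molecular volume quoted above could be extracted from the NPT trajectory as in the following sketch; the trajectory file name and the use of ASE for reading it are assumptions, while the molecule count matches the simulation described.

```python
# Hedged sketch: volume per molecule from an NPT trajectory.
import numpy as np
from ase.io import read

frames = read("ethylene_npt.traj", index=":")        # all NPT snapshots (assumed file)
n_molecules = 34
volumes = np.array([atoms.get_volume() for atoms in frames]) / n_molecules

# Discard the first third as equilibration and report mean +/- spread
prod = volumes[len(volumes) // 3:]
print(f"v_l = {prod.mean():.1f} +/- {prod.std():.1f} A^3 per molecule")
```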
g. Toluene crystal
The prediction of crystal structures is accelerated by using the ML potential trained on DFT potential energies/forces. Here, the universal model predicts the stable toluene crystal structure (Fig. 32): the α-phase is predicted to be the most stable crystalline phase, while the β-phase is a metastable state.221
3. Discussion
This study outlines the comprehensive generation of a universal ML-IAP designed to predict molecular and crystal structures of hydrocarbons. The model covers various hydrocarbon types, including alkanes, unsaturated alkenes with diverse numbers of C=C double bonds, and aromatic compounds, in both open-chain and cyclic forms. Using the SGPR algorithm and on-the-fly learning, the model emerges from fewer than 500 atomic configurations of elementary molecules, along with clusters and liquid phases of small molecules to incorporate inter-molecular interactions.
Expert models tailored to distinct hydrocarbon groups are initially developed through on-the-fly ML and then consolidated into the universal ML-IAPs. These universal models, originating from the expert models trained for 90 molecules (350 conformers) and five bulk phases, exhibit exceptional transferability across varied hydrocarbon systems. They effectively handle complex scenarios like the folding and coiling of long alkane molecules, showcasing reliability even under such structural changes. Simulations of liquid ethane at its boiling temperature yield results reasonably aligned with experimental observations.
The model's prowess extends to predicting the minimum energy structure of C11H18, displaying impressive agreement with direct ab initio simulations while outperforming traditional force-fields like OPLS-AA. Although OPLS-AA is empirically successful in replicating certain experimental observations, ML-IAPs, being surrogates for expensive first-principles potentials, provide more adaptable avenues for studying reaction paths involving bond breaking/forming/rearrangement and atom transfer, transition states, and phase transitions. The universal ML-IAPs accurately predict the structure and interaction energies of complex compounds like the β-carotene dimer and effectively describe molecular volumes and phase transitions, as exemplified in the investigation of the toluene crystal's phases.
The significance of this work lies in efficiently generating a universal model from separately trained atomic potentials for various subgroups, enabling its broad applicability across multi-component systems, a feat unattainable for other ML methods. It showcases one of the most extensive explorations of hydrocarbon phase spaces, employing fewer than 1000 configurations as the basis for the SGPR potential. Employing a Bayesian criterion during on-the-fly MD simulations allows configurations with redundant information to be discarded automatically. This emphasizes that data quality, in terms of diversity and completeness, impacts the accuracy and transferability of ML potentials more than data quantity. Moreover, the work explores the potential of more comprehensive descriptor schemes for improved accuracy and introduces intriguing concepts like "recursively embedded atom NNs" for incorporating higher-order correlations. It also highlights the importance of balancing global exploration and local exploitation in data sampling for developing accurate ML-IAPs.
The SGPR algorithm emerges as a robust and rapid tool for on-the-fly ML of potential energy surfaces, offering excellent transferability despite being based on a relatively small number of configurations. It introduces a modular approach through expert models, allowing the combination of diverse models into a singular global model. The study underscores the significance of building efficient yet globally diverse datasets, a crucial step in the automation and optimization of ML-IAPs.
VI. PERSPECTIVES AND CONCLUDING REMARKS
ML has gained recognition for its precision in modeling quantum potential surfaces and forces. While NNs are frequently employed for this purpose, they require frequent updates from numerous trajectories. GPR provides an alternative. However, traditional GPR models still necessitate large datasets for training due to extensive covariance matrix elements. This complicates continuous learning, particularly during phase transitions and chemical reactions. SGPR addresses this issue by significantly reducing the number of parameters, which allows for smaller training datasets. SGPR's ability to perform rank reduction improves the computational performance by several orders of magnitude (see Table VII). Moreover, by utilizing path-independent feature parameters through active learning instead of traditional atomic coordinates, the SGPR-ML based algorithm facilitates the creation of universal quantum mechanical potentials for complex systems, establishing a foundational platform for virtual material modeling and broadening applicability across various quantum mechanical systems. Path-independent parameters can be compiled into a built-in library, enabling their reuse in related systems without the need for re-training. The transferability of SGPR's covariance matrix means it remains effective without updates. This facilitates the creation of a comprehensive covariance matrix library through cumulative learning, enhancing the efficiency and speed of simulations for complex material systems. This makes SGPR versatile for the simulations of a range of small to complex quantum mechanical systems at the levels of advanced theories like coupled cluster theory with perturbative triple excitations [CCSD(T)],222 matrix product states (MPS),223,224 and quantum Monte Carlo (QMC) simulations225,226 alongside DFT.
Cost | GPR | SGPR | GPR/SGPR
---|---|---|---
Matrix inversion | (1 + 3N) | (1 + 3N)nm² | ∼10³
Kernel construction | (1 + 3nc) | (1 + 3nc)Nnm + m² | ∼10³
SGPR's rank reduction, empowered by selecting LCE features via similarity measures in active learning, replaces atomic coordinates with characteristic geometric features. This approach helps achieve chemical accuracy across various spatial configurations of the quantum mechanical potential. By aggregating covariance elements from subsystems and analyzing their interrelationships, SGPR is capable of generating universal quantum mechanical potentials, enabling integrated first-principles calculations and ML simulations on a comprehensive platform for virtual materials reality. This advanced capability is generally unachievable with most traditional ML methods.
Meanwhile, the covariance matrices created through SGPR's rank reduction are transferable, meaning that their updates are not required even during events such as phase transitions, molecular transformations, or chemical bond formation/breaking. This feature allows systems to be combined, with inter-system covariance matrices integrated gradually to build a comprehensive library. This library then facilitates rapid and precise simulations of complex systems by simply retrieving the relevant datasets, effectively realizing the structural and phenomenological characteristics of these systems.
SGPR thus distinguishes itself with several key features: rapid training due to minimal LCE-based rank reduction, high-speed computations suitable for large or complex systems, and the generation of a universal, transferable potential that supports both expansive learning across complex systems and development toward a self-learning super-AI or digital twin. By merging SGPR with active learning, meta-dynamics, and quantum mechanical calculations, it would be possible to develop a self-improving AI and digital twin capable of innovating new materials and devices. SGPR offers a reduction in computational complexity and biases while ensuring quicker, more accurate simulations of sizable and intricate systems. Its unique attributes, such as linear scalability and adaptability to varied systems, represent a substantial leap over conventional ML methodologies, encouraging creative and effective approaches in materials science simulations and design. The strengths of SGPR include (i) fast training due to reduced parameter needs; (ii) high-speed computation suitable for large and complex systems (linear scaling); (iii) chemical accuracy and the ability to produce universal potentials; and (iv) scalability through cumulative learning, forming a base for a self-learning AI or digital twin.
This approach has been demonstrated in the simulation of superionic materials for fast-charging batteries and electro-catalytic reactions, showing significant improvements over traditional methods. SGPR's low-rank approximation of the covariance matrix boosts computational performance dramatically, which we have quantified in comparative studies. Additionally, SGPR integrates seamlessly with on-the-fly ML methods, eliminating the need for constant retraining of ML models during simulations. This makes it particularly suited for extensive materials studies involving solid electrolytes, catalysts, solar cells, and even complex biological molecules, where it enables rapid simulation and exploration of potential materials without the computational overhead typically associated with such tasks. Thus, SGPR represents a transformative step forward in the use of ML for quantum mechanical simulations, providing a robust platform for the development of new materials and the detailed study of existing ones under various conditions, all while reducing the computational expense and increasing the scope of potential applications.
In this work, we reviewed the applications of SGPR to extensive materials, including bulk materials, solid electrolytes, batteries, catalysts, solar cells, and macromolecular systems. Leveraging this tool allows one to bypass traditional computational constraints, enabling extensive exploration of material compositions. This approach is expected to contribute significantly to the development of safer and more efficient solid electrolytes for future energy storage applications. Here, as an exemplary application, we outline a technical strategy and the expected outcome of SGPR as a comprehensive virtual lab capable of simulating and predicting properties of electrolytes across various systems, significantly advancing our capability to design efficient, stable, and safe solid electrolytes for future energy storage solutions.
This innovative approach pushes the boundaries of materials science by integrating advanced ML techniques with traditional chemical physics approaches, which is crucial for designing next-generation functional materials. Its key elements are as follows:
- SGPR-based ab initio molecular dynamics (AIMD): AIMD excels in simulating complicated systems. However, AIMD's computational demands scale cubically with the number of atoms, making it feasible only for smaller systems and shorter durations. SGPR-enhanced AIMD can handle larger atom counts over extended periods, demonstrating its potential for larger, more complex simulations.
- Integration of quantum mechanics and ML: This SGPR-based approach ushers in a new era for modeling, with significant applications such as:
  - Learning from existing data to develop strategies, similar to how AlphaGo learned to master Go toward AlphaGo-0.
  - Generating substantial data on unknown material properties using quantum mechanical computations.
  - Creating a universal quantum mechanics potential that combines quantum mechanics and ML, enabling rapid computations.
  - Exploring structural space and phase transitions through meta-dynamics.
  - Developing self-learning systems that evolve into Super AI and Digital Twins, capable of innovating new materials and devices.
- Challenges with conventional ML methods: Traditional GPR methods require inverting large covariance matrices, which is impractical for systems with thousands of trajectories due to computational constraints. SGPR addresses this by reducing feature dimensions, which simplifies the learning process and enhances the speed and scalability of simulations.
- Key features of SGPR:
  - Fast training and superfast computing capabilities suitable for extensive systems due to linear scaling.
  - Ability to generate transferable universal potentials that facilitate rapid, large-scale quantum calculations.
  - Minimal data requirements for accurate potential generation, leveraging the cumulative learning of key features.
To conclude, by merging computational power with innovative ML strategies, this research stands to significantly impact the field of energy storage, contributing to safer and more reliable battery technologies and other novel functional materials including catalysts and solar cells that are essential for a range of modern applications.
ACKNOWLEDGMENTS
This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT: Ministry of Science and ICT) (No. RS-2024-00346451) and the Program for Cultivating the Next Generation of Outstanding Scientists (No. 1.240033.01) by UNIST Basic Science Institute.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to declare.
Author Contributions
Soohaeng Yoo Willow, Amir Hajibabaei, Miran Ha, David ChangMo Yang, Chang Woo Myung, Seung Kyu Min, and Geunsik Lee contributed equally to this work.
Soohaeng Yoo Willow: Writing – original draft. Amir Hajibabaei: Writing – original draft. Miran Ha: Writing – original draft. David ChangMo Yang: Writing – review & editing. Chang Woo Myung: Writing – review & editing. Seung Kyu Min: Writing – review & editing. Geunsik Lee: Writing – review & editing. Kwang S. Kim: Conceptualization; Funding acquisition; Project administration; Writing – original draft; Writing – review & editing.
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
APPENDIX: MATHEMATICAL BACKGROUND
1. Mathematical expressions
a. Matrix algebra
Matrix inversion lemma (Woodbury identity):
$$(A + UCV)^{-1} = A^{-1} - A^{-1}U\left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1}.$$
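A quick numerical sanity check of the lemma (a minimal NumPy sketch; the matrix sizes are arbitrary illustrative choices):

```python
# Numerical check of the Woodbury (matrix inversion) lemma stated above.
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
A = np.diag(rng.uniform(1, 2, n))          # easy-to-invert diagonal part
U = rng.standard_normal((n, k))
C = np.eye(k)
V = U.T

lhs = np.linalg.inv(A + U @ C @ V)
Ainv = np.linalg.inv(A)
rhs = Ainv - Ainv @ U @ np.linalg.inv(np.linalg.inv(C) + V @ Ainv @ U) @ V @ Ainv
print("max deviation:", np.abs(lhs - rhs).max())   # ~1e-15 (machine precision)
```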
Therefore,