Lattice dynamics (LD) plays a crucial role in investigating thermal transport in terms of not only underlying physics but also novel properties and phenomena. Recently, machine learning interatomic potentials (MLIPs) have emerged as powerful tools in computational physics and chemistry, showing great potential in providing reliable predictions of thermal transport properties with high efficiency. This tutorial provides a comprehensive guideline for MLIPs’ development and how they are used for the computational modeling of thermal transport. Using atomic cluster expansion (ACE) as the paradigmatic potential, we introduce the essential fundamentals of MLIPs, including data construction, model training, and hyperparameter optimization. With the developed ACE potentials, we further showcase their applications in the LD modeling of thermal transport for crystalline silicon and amorphous carbon. The corresponding code implementations for MLIP applications in calculating thermal conductivity are also provided for beginners to follow.
I. INTRODUCTION
Thermal transport is one of the most critical properties of solid materials, playing an essential role in a range of technological applications, such as electrical device design and thermoelectric energy conversion.1–3 Thermal conductivity of materials is significantly related to chemistry compositions, atomic structures, sizes, and temperatures in complex manners, bringing challenges for thermal transport simulation, optimization, and manipulation in practical applications.4–7 For in-depth understanding of thermal transport, there have been extensively developed computational methods, in which the phonon Boltzmann transport equation (PBTE)8,9 combined lattice dynamics (LD) becomes widely adopted. This approach offers indispensable insight with high reliability, as it can derive accurate phonon properties from force constants (FCs) calculated using the quantum-mechanically accurate Density Functional Theory (DFT).10,11 However, the high computational cost of DFT calculations makes first-principles methods quite expensive and limits their applications in complex situations, such as high-order anharmonicity and glasses.12,13 Typically, DFT evaluations are restricted to the system with a few thousand atoms and only hundreds of atoms when conducting “ab initio” simulations.14,15
In recent years, machine learning interatomic potentials (MLIPs) have been attracting researchers’ attention due to their computational efficiency and accuracy as a compelling alternative to DFT. In MLIP models, a potential energy surface (PES) is constructed for the accurate prediction of energies and forces by learning from the DFT reference results. They offer comparable accuracy but several orders of magnitude higher computational efficiency than DFT,16–18 enabling reliable large-scale (up to millions of atoms) atomic simulations19,20 and prohibitive computations for fourth- or higher-order phonon scatterings.21–24 The applications of MLIPs in thermal properties prediction, with advantages in efficiency and accuracy, have been widely reported.25–30 To date, numerous MLIPs have been developed with the corresponding code implementations for computational physics, chemistry, and materials. Representative MLIP models are enumerated in Table I, along with their regression types and supported tools such as the Atomic Simulation Environment (ASE)31,32 in Python and the Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS).33 For thermal transport simulation, developing proper MLIPs and practical workflows remains nontrivial.
List of the representative MLIP models.
Model . | Type . | Supporting codes . | Year . | Reference . |
---|---|---|---|---|
ACE | Kernel Regression | ASE, LAMMPS | 2019 | Drautz34 |
GAP | Kernel Regression | ASE, LAMMPS | 2010 | Bartok35 |
NEP | NN (Neural Net) | ASE, LAMMPS, GPUMD36 | 2021 | Fan37 |
DeepPotential | NN | ASE, LAMMPS | 2018 | Han38 |
MACE | GraphNN | ASE, LAMMPS | 2023 | Batatia39 |
Allegro | GraphNN | ASE, LAMMPS | 2023 | Musaelian20 |
NequIP | GraphNN | ASE, LAMMPS | 2022 | Batzner40 |
Model . | Type . | Supporting codes . | Year . | Reference . |
---|---|---|---|---|
ACE | Kernel Regression | ASE, LAMMPS | 2019 | Drautz34 |
GAP | Kernel Regression | ASE, LAMMPS | 2010 | Bartok35 |
NEP | NN (Neural Net) | ASE, LAMMPS, GPUMD36 | 2021 | Fan37 |
DeepPotential | NN | ASE, LAMMPS | 2018 | Han38 |
MACE | GraphNN | ASE, LAMMPS | 2023 | Batatia39 |
Allegro | GraphNN | ASE, LAMMPS | 2023 | Musaelian20 |
NequIP | GraphNN | ASE, LAMMPS | 2022 | Batzner40 |
Herein, the purpose of this tutorial is to provide guidance for beginners on the development and application of MLIPs for thermal transport calculations. We aim to complement relevant works by others: focus on a step-by-step guide to developing MLIPs from scratch and applying them in thermal calculations using well-established interfaces. Our work strongly emphasizes MLIP applications in LD methods, complementing a related work by Dong et al.,41 which focused on Molecular Dynamics (MD) simulation. For the theoretical details of LD, we refer interested readers to the previous papers.42–45 We present the basic concepts, data construction, model training, and hyperparameters’ selection segments. Additionally, we feature crystalline silicon and amorphous carbon to illustrate the practical implementation of MLIPs and the derived thermal transport calculations. For this tutorial, the atomic cluster expansion (ACE) potential has been selected as the paradigmatic model due to its precision with efficiency on high-performance parallel computation,46,47 meeting a range of computational requirements. It is worth noting that although only the ACE potential is considered, the presented instructions are also applicable to other MLIPs.
The remaining parts of this tutorial are organized as follows. In Sec. II, we introduce the overall procedure of MLIP development, which includes a brief introduction to the ACE potential framework, data construction, potential training and estimation, and the optimization and selection of model hyperparameters. In Sec. III, we bring two applications of thermal calculations using MLIPs for crystalline silicon and amorphous carbon, respectively. Finally, this tutorial is summarized in Sec. IV.
II. MLIP DEVELOPMENT
A. Fundamental concepts and atomic cluster expansion framework
Here, the compressed index is expanded into a list of indices characterizing the chemical species and indicating the types of radial functions and spherical harmonic functions. The radial functions depend on the pairwise distance between atoms i and j, and the spherical harmonic functions depend on the respective direction . This expansion form fulfills the symmetry requirements of permutational and translational invariance.
B. Data collection
Unlike empirically fitted potentials, MLIPs do not impose any prior assumptions on the functional form of the PES. Instead, the shape of the potential is directly learned through a regression process from a large set of reference configurations with their corresponding energies and forces, typically obtained from DFT calculations. Collectively, these elements constitute the dataset for training and validating the ML potential. Given this nonparametric nature of ML potentials, which lack specific physical function forms and involve tens of thousands of parameters to be determined, the selection of reference data is especially critical. Data collection must align with the application purposes of MLIPs: a general potential requires a diverse range of configurations including crystalline, amorphous, and surface structures,48–50 while tasks of LD calculations for specific crystals required more targeted data.25,30 To date, a variety of data construction approaches have been developed. Here, we will present a spectrum of strategies from manually crafted approaches to automated protocols.
1. Hand-built approaches
Directly selecting configurations for the dataset based on physical intuition is a straightforward yet practical and widely adopted method.51,52 The common artificial methods include the perturbation method and Ab Initio Molecular Dynamics (AIMD).53
As illustrated in Fig. 1(a), the perturbation method begins with structures positioned at energy minima and then introduces random displacements to atoms in their supercells to generate reference configurations. This procedure can be conveniently executed with ASE's rattle function, following a specific statistical distribution, most commonly a Gaussian distribution with a predefined standard deviation (e.g., 0.1 Å). However, structures generated by this stochastic approach might sometimes exhibit large interatomic forces due to the possibility of atoms being placed very close together, resulting in significant repulsive forces. To address this issue, constraints on a minimum atomic distance can be added via a Monte Carlo (MC) process, as introduced in Ref. 54, where configurations with short atomic distances below a threshold are only accepted with low probability. Moreover, to ensure that the potential accurately forecasts forces, it might sometimes be necessary to subject the lattice to stress through tension or compression.49 After acquiring these reference structures, they are subsequently employed to produce a reference dataset via DFT calibration.
Approaches for data collection. (a) Perturbation method. (b) AIMD method. (c) Analytical definition of uncertainty in AL. (d) Query-by-Committee definition of uncertainty in AL.
Approaches for data collection. (a) Perturbation method. (b) AIMD method. (c) Analytical definition of uncertainty in AL. (d) Query-by-Committee definition of uncertainty in AL.
Despite its high efficiency, the perturbation method requires initial structures, which cannot be obtained when considering amorphous and liquid materials. For this purpose, the more computationally demanding AIMD method [Fig. 1(b)] provides a distinct advantage in constructing datasets that lack a definable structural prototype. It involves performing an MD simulation for a specified duration, after which snapshots are sampled at regular intervals or using other sampling methods to form a dataset. In practice, these two hand-built approaches can be combined for data collection.52
2. Active learning protocols
Manually constructing datasets demands researchers to have extensive experience, which may inevitably introduce inductive biases. This can lead to high redundancy in structural features and an imbalance in data distribution, resulting in unnecessary increase in data collection efforts and reduced transferability of the potential.17,55–58 Active learning (AL) strategies are employed to address this issue through an iterative procedure to gradually improve the ML model. Starting with a limited collection of labeled configurations, the procedure iteratively performs the ML potential training and appends the newly labeled data until certain stopping criteria are met. Here, selection rules for the appended data are based on predefined metrics centered around uncertainty, which quantifies the expected error of prediction on a given configuration. Two of the most commonly used definitions of uncertainty can be estimated through an analytical method58,59 or Query-by-Committee (QbC),60–63 as depicted in Figs. 1(c) and 1(d).
New configurations with high uncertainty are then added to the training dataset iteratively, similar to the analytical approach. Once the estimated uncertainties on all configurations are below a given threshold or other criteria are met, all data are collected as the final dataset to train a single potential.
Moreover, other AL schemes, including entropy maximization on feature space distance, dropout method, etc., are detailed in their respective literature structures.64,65
C. Training and estimation
With the data prepared, it will be divided into two parts (e.g., 80–20 split), where the majority serves as a training dataset for ML potential’s development and the remainder is used for estimating the performance of the trained potential as the testing dataset. This division can be performed using the random division or stratified sampling techniques, ensuring both the diversity of the training data and the comprehensiveness of the potential's performance assessment across the entire PES landscape.
The trade-off relationship between the accuracy of energy and force in MLIP training.
The trade-off relationship between the accuracy of energy and force in MLIP training.
D. Pareto optimal and hyperparameters selection
Given a specific dataset and model setting, the training of MLIPs determines the predictive accuracy of energy and force. By refining the model's hyperparameters, such as using a larger cutoff radius or increasing the order of many-body terms for higher-order interactions, the accuracy of MLIPs may be enhanced. However, these adjustments can lead to an increase in the computational cost, which is particularly detrimental for MD simulations and high-throughput computations where computational efficiency is as vital as the potential's accuracy itself.
The accuracy and speed of MLIPs typically present a trade-off in the optimization process without a single optimal solution.73,74 Therefore, it is necessary to select from a range of alternatives to balance accuracy and efficiency. As shown in Fig. 3, by performing a grid search on the hyperparameters of ACE including the cutoff radius, order of expansion, and number of basis functions, a series of potentials can be trained, offering a spectrum of efficiency and precision pairs. The vertices of their convex hull form the Pareto optimal,75 signifying that any improvement in accuracy will inevitably accompany an increase in the computational cost. These Pareto optimal points provide optimal solutions for hyperparameters’ selection.
(a) Pareto optimal via grid search. (b) and (d) Physical accuracy exemplification of cell length, phonon cutoff frequency, and thermal conductivity, respectively, on Pareto optimal points.
(a) Pareto optimal via grid search. (b) and (d) Physical accuracy exemplification of cell length, phonon cutoff frequency, and thermal conductivity, respectively, on Pareto optimal points.
In the thermal calculation practice, the physical accuracy of MLIPs requires further consideration beyond mere numerical precision. Employing these Pareto optimal points, properties such as cell length, phonon frequency, and thermal conductivity of crystalline silicon are predicted, as shown in Figs. 3(b)–3(d). Convergence of these properties’ calculations is observed when the computational cost surpasses 8 × 10−3 ms/timestep/atom, with the force root-mean-square error (RMSE) falling below 50 meV/Å. This convergence indicates that the potential's accuracy has met the computational requirements. Therefore, the potential marked on the dashed line can satisfy the computational needs with a balance of performance and efficiency. This convergence behavior of Pareto optimal potentials can assist in the selection of hyperparameters.
III. EXAMPLES FOR APPLICATIONS
State-of-the-art MLIP models, including the ACE potential, offer an interface to the ASE or LAMMPS (as listed in Table I), allowing convenient LD calculations and MD simulations. In the following, we will demonstrate how the ACE is used to calculate the thermal conductivity of crystalline silicon and amorphous carbon. All the data, codes, and scripts necessary to reproduce the results can be accessed at https://github.com/Dio2k/Tutorial-ACE4LDSim.
A. Diamond silicon
LD is a fundamental approach for normal modes, i.e., phonons in crystal, the quantized units of vibrational energy that serve as the dominating heat carriers in insulators and semiconductors. Harmonic LD calculation takes second FCs to generate phonon dispersion, group velocities, and other harmonic or quasi-harmonic phonon properties, while anharmonic LD calculation takes third or higher-order FCs to generate phonon scattering rates, which can provide scattering rates. With these phonon properties, thermal conductivities can be calculated by solving PBTE. Here, we show the workflow with Phonopy and Phono3py for LD calculations and PBTE solving,43,76 as illustrated in Fig. 4, where MLIPs can replace DFT in structural relaxation and the generation of FCs to reduce the computational cost. We will exemplify this process using the calculation of crystalline silicon.
Workflow of ML accelerated force constant calculations in lattice dynamics.
While there have already been several publicly available datasets for crystalline silicon, we elect to construct a new dataset from the ground up for pedagogical purposes. Here, we employ hand-built approaches, incorporating both perturbation and AIMD methods for data construction. First, we generate 300 structures by introducing random displacements with a deviation of 0.03 Å on atomic positions and simultaneously applying cell strains ranging from −2% to +2% for each degree of freedom. This process is conducted using a 2 × 2 × 2 supercell of crystal silicon comprising 64 atoms. These reference configurations are then subjected to single-point DFT calculations using the Perdew–Burke–Ernzerhof functional with the generalized gradient approximation,77 a cutoff energy of 600 eV, and a k-point mesh of 4 × 4 × 4 in Vienna Ab initio Simulation Package (VASP).78,79 Second, we perform ten 10 ps AIMD simulations at a temperature range from 100 to 1000 K in 100 K increments, using a supercell of the same size and a time step of 10 fs. Snapshots are sampled every 0.1 ps, yielding other 1000 data. In total, we obtain 1300 reference configurations for our dataset.
After obtaining the datasets, the ACE potential can be fit by utilizing pacemaker34,46,47 software. For parameterization, the site energy is directly read out from a single atom in a linear form, and the cutoff distance (rcut) is set to 7.0 Å. A total of 700 basis functions are parameterized with a maximum body order corresponding to ν = 6. Ten percent of the total dataset is randomly selected for testing, not involved in the training procedure. Parameter optimization results in a fit with an energy mean-averaged error (MAE) of 2.82 meV/atom and a force root-mean-square error (RMSE) of 12.24 meV/Å on the testing datasets. Note that the RMSE metric is used for force since it is sensitive to undesirable outliers. This provides a more stringent estimation of the model's performance in predicting forces.80
We proceed to demonstrate the application of the trained ACE potential to thermal calculations. Upon completion of the ACE potential training, a YAML file recording all model parameters is automatically created in the training directory. This file can then be used as an ASE calculator, enabling the convenient prediction of structural energies and atomic forces (this interface is similar to the other Python-supported MLIPs listed in Table I). Listing 1 shows an illustrative code extract to perform structural relaxation with the ACE potential in Python, where the PreconLBFGS optimizer81 is employed. The variability of the cell shape and volume during optimization (variable_cell) and the force convergence criterion (fmax) can be further specified. The optimized equilibrium structure is a prerequisite for the subsequent force constant calculations.
Code extract of structural optima with the ACE potential.
from pyace import PyACECalculator |
ase_atoms.calc = PyACECalculator(‘/path/to/potential’) |
from ase.optimize.precon import PreconLBFGS |
optimizer = PreconLBFGS(ase_atoms, variable_cell = True) |
optimizer.run(fmax = 1e-5) |
from pyace import PyACECalculator |
ase_atoms.calc = PyACECalculator(‘/path/to/potential’) |
from ase.optimize.precon import PreconLBFGS |
optimizer = PreconLBFGS(ase_atoms, variable_cell = True) |
optimizer.run(fmax = 1e-5) |
For FC calculations, a practical approach is to apply finite displacements to atoms within a given supercell.43 With a sufficient number of atomic displacement and force pairs, FCs can be approximated mathematically. This can be achieved with the help of the phono3py package as shown in Listing 2. In the first block, calculation settings are specified, which include the size of the supercell (supercell_matrix), the finite displacement distance applied to atoms (distance), and the cutoff distance on displacement pairs to reduce computational budget (cutoff_pair_distance). In the second block, forces on the displaced structures are predicted with the ASE calculator iteratively, and then, FCs are estimated. Here, only second and third FCs are considered.
Code extract of force constant calculations with the ACE potential.
from phono3py import Phono3py |
phono3py = Phono3py(phpy_atoms, supercell_matrix = [3, 3, 3]) |
phono3py.generate_displacements(distance = 0.03, cutoff_pair_distance = 6.21) |
# Adapted to ASE.Atoms formats for ML potentials calculator. |
supercell_fc2 = PhonopyAtoms2AseAtoms(phono3py.phonon_supercell) |
supercell_fc2.calc = PyACECalculator(‘/path/to/potential’) |
for dataset in phono3py.phonon_dataset[‘first_atoms’]: |
supercell_fc2.positions[dataset[‘number’]] + = dataset[‘displacement’] |
dataset[‘forces’] = supercell_fc2.get_forces() |
supercell_fc2.positions[dataset[‘number’]] − = dataset[‘displacement’] |
phono3py.produce_fc2() |
fc2 = phono3py.fc2 |
# the third force constant can be obtained with a similar process. |
fc3 = … |
from phono3py import Phono3py |
phono3py = Phono3py(phpy_atoms, supercell_matrix = [3, 3, 3]) |
phono3py.generate_displacements(distance = 0.03, cutoff_pair_distance = 6.21) |
# Adapted to ASE.Atoms formats for ML potentials calculator. |
supercell_fc2 = PhonopyAtoms2AseAtoms(phono3py.phonon_supercell) |
supercell_fc2.calc = PyACECalculator(‘/path/to/potential’) |
for dataset in phono3py.phonon_dataset[‘first_atoms’]: |
supercell_fc2.positions[dataset[‘number’]] + = dataset[‘displacement’] |
dataset[‘forces’] = supercell_fc2.get_forces() |
supercell_fc2.positions[dataset[‘number’]] − = dataset[‘displacement’] |
phono3py.produce_fc2() |
fc2 = phono3py.fc2 |
# the third force constant can be obtained with a similar process. |
fc3 = … |
With the FCs prepared using the developed ACE model, we proceed with regular phonon computations to derive the thermal conductivity of crystalline silicon. Figure 5 presents the calculated phonon band structure, density of states (DOS), and temperature-dependent thermal conductivity, with comparisons to the DFT results under the same calculation settings. It is evident that the LD computations accelerated by MLIP align with DFT in high agreement, indicating the capability of MLIPs to predict thermal transport properties with comparable accuracy to DFT.
Comparison of DFT and ML predictions on (a) band structure and DOS, and (b) thermal conductivity.
Comparison of DFT and ML predictions on (a) band structure and DOS, and (b) thermal conductivity.
1. Amorphous carbon
Amorphous materials are valuable for applications such as thermal barrier coatings and thermoelectric, where their inherently low thermal conductivity due to disorder is desired.82 However, the complex local environments and interactions in amorphous materials require large atomic structures, which pose significant challenges for ab initio calculations, particularly in thermal computations.13,83 To address this, MLIPs can accurately capture the complex interatomic interactions in amorphous systems, allowing for reliable atomic model generation and force constant calculations. In this case, we use amorphous carbon (a-C) to exemplify the advantage of MLIPs in handling thermal transport calculations for complex systems. As shown in Fig. 6, the melt-quench protocol, structure relaxation, and FC calculations for further thermal transport modeling can all be accelerated by MLIPs.
Since the purpose of the target MLIP is to correctly simulate the melt-quench process and calculate the FCs required for thermal conductivity calculations, the training dataset must cover a wide range of configurations as comprehensively as possible. Manually creating a reasonable dataset in a convincing manner is challenging. Therefore, we opt to utilize the QbC technique from active learning to construct the dataset instead of using hand-built approaches. First, a small initial dataset is built by 500 structures sampled from an AIMD simulation performed using the VASP with the Perdew–Burke–Ernzerhof exchange-correlation functional under the generalized gradient approximation used along with projector-augmented wave (PAW) pseudo-potentials.77,84,85 The simulation starts with a (unstable) simple-cubic lattice of carbon containing 125 atoms with density of 2.5 g/cm3. The system is first held at 8000 K for 3 ps, then maintained in the liquid state at 5000 K for another 3 ps, quenched to 300 K at a constant rate of over 1 ps, and finally, annealed at 300 K for 3 ps. The time step is set to 1 fs along with a k-point grid of 3 × 3 × 3 in all simulations. Second, we use a committee of eight potentials to iteratively select new snapshots to enrich the initial dataset. For each iteration, 50 unselected structures with the highest uncertainties are added to the dataset, and the protocol stops after 10 iterations. This procedure ultimately yields a dataset comprising 1000 structures for training the final ACE potential.
For potential parameterization, the site energy is directly read out from two single atomic properties in the Finnis Sinclair form and the cutoff distance is set as 5.5 Å. A total of 700 basis functions are parameterized with a maximum body order corresponding to ν = 6. Ten percent of the total dataset is randomly selected for testing. Parameter optimization leads to a fit with an energy MAE of 40.03 meV/atom and a force RMSE of 540.07 meV/Å for the testing datasets.
Code extract of ACE potential definition in the LAMMPS.
pair_style pace |
pair_coeff * * /path/to/potential |
pair_style pace |
pair_coeff * * /path/to/potential |
Leveraging the speed advantage of the ACE potential, we generate an a-C sample using a larger cell containing more atoms and a longer melt-quench process.86 This MD simulation is performed in the LAMMPS, with the implementation of ACE potential shown in Listing 3. Using the same time step of 1 fs, we start with a 512-atom orthogonal box held at 8000 K and then cooled down to 5000 K for 100 ps. We then perform a rapid quench from 5000 to 300 K within 1 ps. Finally, the system is equilibrated at 300 K for 100 ps, followed by an energy minimization process for the equilibrium structure. We compare the radial distribution function (RDF) of the final generated structure with those from AIMD and the Tersoff potential.87 As shown in Fig. 7(b), the RDF of ACE shows good agreement with the result of AIMD simulation, whereas the Tersoff potential does not capture the right locations and values of the second and the third RDF peaks. We also calculate the fourfold coordinated atom percentage to be 30.1%, which aligns with the previous works reported in Ref. 88. Additionally, we assess the computational efficiency of the ACE potential. As illustrated in Fig. 7(a), simulations accelerated by MLIPs show a speed of 3 to 4 orders of magnitude faster than AIMD, with linear scaling relative to the system size and the number of CPU cores used in the calculation. In contrast, the computational cost of AIMD grows exponentially with the number of atoms, leading to a sharp increase in computational expense, which restricts its application to large systems (e.g., over 103 atoms).
(a) Comparison of computation efficiency on AIMD and MLIP MD. (b) RDF of a-C. (c) Scattering rate distribution and cumulative thermal conductivity of a-C.
(a) Comparison of computation efficiency on AIMD and MLIP MD. (b) RDF of a-C. (c) Scattering rate distribution and cumulative thermal conductivity of a-C.
With the relaxed atomic model, we calculate the second and third FCs and, subsequently, the thermal conductivity for a-C. The FCs are derived using the PHONON package89 in the LAMMPS. For thermal conductivity, we use the Quasi-harmonic Green's Function (QHGK) method90 implemented in kALDo,44 versatile and scalable open-source software to compute phonon transport in crystalline and amorphous solids. As illustrated in Fig. 7(c), the thermal conductivity of a-C is 1.36 W/m/K, which is close to the value of 1.61 W/m/K using the linear fit to experimental data reported in Ref. 91, demonstrating the capability of MLIPs in thermal calculations for complex systems. We noted that the propagation contribution is likely to be underestimated due to the limitation of the QHGK method and a proper extrapolation is needed for the case in which propagation mechanism matters,83 like a-C with higher density92 and amorphous silicon.93
IV. SUMMARY
In this tutorial, we offer a comprehensive guide to develop MLIPs for modeling thermal transport using the LD method. This guide covers the introduction of basic concepts and the procedure of MLIP development from data construction, potential training to the optimal and the selection of model hyperparameters. We also present two application examples, i.e., crystalline silicon and amorphous carbon, to demonstrate the superiority of speed with the accuracy of MLIPs. In addition, we provide practical code implementations for beginners to follow conveniently. It is noted that while we use ACE potential as the illustrative backbone model in this tutorial, the underlying principle, practical strategies, and guidelines can be easily transferred to the other MLIP models. We believe that our tutorial will promote the extensive applications of MLIPs in thermal transport research, encouraging innovative discoveries and their practical applications in the industry.
ACKNOWLEDGMENTS
This study was financially supported by the National Natural Science Foundation of China (NNSFC) (Nos. 52425601, 52327809, 52250273, U20A20301, and 82361138571), National Key Research and Development Program of China (No. 2023YFB4404104), and Beijing Natural Science Foundation (No. L233022).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Liben Guo: Conceptualization (equal); Investigation (lead); Validation (equal); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Yuanbin Liu: Conceptualization (lead); Investigation (equal); Validation (equal); Visualization (equal); Writing – original draft (supporting); Writing – review & editing (equal). Lei Yang: Conceptualization (equal); Investigation (equal); Validation (equal); Visualization (supporting); Writing – original draft (supporting); Writing – review & editing (equal). Bingyang Cao: Conceptualization (equal); Funding acquisition (lead); Investigation (equal); Supervision (lead); Validation (equal); Visualization (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are openly available at in Github repository Tutorial-ACE4LDSim at https://github.com/Dio2k/Tutorial-ACE4LDSim, Ref. 94.