Calculations of heat transport in crystalline materials have recently become mainstream, thanks to machine-learned interatomic potentials that allow for significant computational cost reductions while maintaining the accuracy of first-principles calculations. Moment tensor potentials (MTPs) are among the most efficient and accurate models in this regard. In this study, we demonstrate the application of MTP to the calculation of the lattice thermal conductivity of and - . Although MTP is commonly employed for lattice thermal conductivity calculations, the advantages of applying the active learning methodology for potential generation are often overlooked. Here, we emphasize its importance and illustrate how it enables the generation of a robust and accurate interatomic potential while maintaining a moderate-sized training dataset.
I. INTRODUCTION
Lattice thermal conductivity (LTC) is an essential quantity for a wide range of technological applications, including thermal management of microelectronic devices,1 thermoelectrics,2 and thermal barrier coating materials.3 Accurate experimental measurement of LTC is challenging and reproducibility between different experimental groups is often not guaranteed.4 Consequently, despite the technological need, reliable LTC data are known for only a small number of materials. At the same time, the theoretical formulation of heat transport constitutes a mature research field.5,6 From a practical perspective, calculations of the temperature-dependent LTC are usually based on a perturbative approach, the Green–Kubo approach (GK),7,8 or non-equilibrium molecular dynamics simulations (NEMD). In a perturbative approach,9 calculations are typically performed within the framework of first-principles (or ab initio) density-functional theory (DFT). In practice, perturbative approaches require calculation of the so-called interatomic force constants (IFCs), which is time-consuming, especially for structures with large unit cells and low symmetry. This is because the number of structures with displaced atoms needed to construct third-order IFCs grows enormously fast with reduction of the crystalline symmetry.10 In practice, each element of the third-order force constants matrix requires four DFT calculations with differently displaced atoms in the supercells—for a unit cell with n atoms and a supercell with N units cells, there are 27n N IFCs. Since forces acting on atoms can be obtained in a single calculation, n N DFT calculations are required. Typically, LTC calculation demands supercells easily exceeding 100 atoms—several thousands of DFT calculations would be needed.
Materials with large anharmonicities are not rare and sometimes accurate evaluation of lattice dynamics of such materials require to consider the inclusion of all orders of anharmonicity.11 Already the influence of four-phonon scattering processes on LTC of solids can have huge impact.12,13 For example, in the case of boron arsenide (BAs) and graphene, better agreement between theory and experiment can only be achieved when four-phonon scattering is taken into account.14–17 In principle, lattice dynamics anharmonicity can be fully and accurately taken into account in ab initio molecular dynamics (AIMD) simulations. Hence, both the GK and NEMD approaches utilize AIMD and allow for a more accurate computation of LTC, as all orders of anharmonicity are taken into account. However, these methods are computationally expensive. The Green–Kubo formula relates instantaneous fluctuations of heat current in terms of the autocorrelation function. While the formulation of GK was done long ago, first-principles calculations of LTC using this approach have only recently appeared.18 The evaluation of LTC using the Green–Kubo approach coupled with first-principles calculations is rare, albeit its profound influence on the understanding of the heat transport.19 However, converging LTC values is a challenging task, as AIMD simulations are severely limited by system size (a few hundred atoms) and timescales (tens of picoseconds). Thus, GK and NEMD approaches are limited in applicability when coupled with first-principles calculations.
Challenges associated with the experimental measurements and theoretical calculations of the LTC lead to the lack of a systematic understanding of the LTC beyond semi-empirical and phenomenological trends in a very limited number of simple materials.20 From the computational perspective, speed up can be obtained by performing molecular dynamics (MD) simulations using empirical potentials, which enables one to efficiently predict properties of both bulk materials and nanostructures, over a wider range of system sizes (up to hundreds of nanometers)21 and timescales (up to tens of nanoseconds).22 However, the prediction accuracy will be highly dependent on the fidelity of the potentials. On the other hand, machine-learning methods (ML) can be used to predict LTC values. In some cases, ML models are trained in a supervised manner,23–25 and in some cases, machine-learned interatomic potentials (MLIPs)26,27 are used to either compute IFCs28–30 or to directly perform MD simulations.28,31–33 Among different MLIP models, the moment tensor potential (MTP)34 showed substantial accuracy keeping the required training data at a minimum size,35 especially once the active learning (sometimes terms active sampling or learning on-the-fly are used) methodology is used.36,37 Despite wide application of different MLIPs, the training process is still frequently done without the active learning scheme. In this case, the potential is usually trained on some AIMD trajectories and its ability to extrapolate is typically limited only to this training set. Recently, we demonstrated that low root-mean-squared errors on the training and test sets do not guarantee the robustness of a potential,38 which we define as an ability to run long MD simulations without failure. This outcome underscores that while low energy and force errors are achieved, applicability and accuracy of such MLIP should be taken with a portion of skepticism, even if the lattice dynamics calculations are performed under the same thermodynamic conditions as those employed during the training data generation.
In this work, we demonstrate how to generate a robust and accurate MTP using the active learning approach. Its employment not only ensures the applicability of MTP for lattice dynamics simulations, but also substantially reduces the number of the required DFT calculations. As a benchmark systems, we used - and - . Both and phases of are wide-bandgap semiconductors of significant technological importance.39–45 Consequently, these materials are well-investigated both experimentally46–49 and theoretically.50,51 Moreover, given the large conventional cell of - containing 20 atoms, the first-principles calculations are highly time-consuming especially when extensive convergence tasks need to be performed, which highlights the role of MLIPs usage as would be demonstrated here.
II. METHODOLOGY
A. Construction of a moment tensor potential
As fundamental symmetry requirements, all descriptors for atomic environment have to be invariant to translation, rotation, and permutation with respect to the atomic indexing. Particularly for structurally intricate polymorphs, the incorporation of many-body interactions is crucial to ensuring the accuracy of the derived potentials. The MTP framework satisfies all aforementioned conditions.34 MTP methodology and tools to utilize it are implemented in the MLIP-2 package,53 which we used in this work.
B. Generation of a training set
We conducted AIMD simulations to collect the reference data (potential energy, atomic forces, atomic coordinates) for MTP training. AIMD simulations are conducted within the DFT framework, employing VASP (Vienna ab initio simulations package)54 with the projector augmented wave method.55 The Perdew–Burke–Ernzerhof generalized gradient approximation (PBE-GGA)56 was employed for the exchange–correlation functional. We used a plane wave basis set cutoff of 550 eV and a Gaussian smearing of 0.1 eV width. The halting criterion for electronic density convergence during self-consistent field calculations is set to eV.
The corundum structure of - has a space group , while - has a monoclinic structure with the space group . The optimized structural parameters obtained in our calculations are shown in Table I. These structural parameters are in a good agreement with the experimental data.57 All lattice dynamics calculations in this work were performed with these structures. The conventional cells was sampled using k-point grids for - and for - , respectively. In case of the lattice dynamics calculations, we used a diagonal transformation matrix to generate a supercell with 120 atoms from the conventional - cell. In the case of - , we generated a supercell with 160 atoms expanding the conventional cell using a diagonal transformation matrix. For the calculations with supercells, the k-point grids were adjusted to keep the k-points’ mesh density as in the calculations with the unit cells, and this leads to the utilization of only one -point in both cases.
PBE-optimized unit cell vectors and angles of the α- and β-Ga2O3 phases.
Phase . | a (Å) . | b (Å) . | c (Å) . | α . | β . | γ . |
---|---|---|---|---|---|---|
α-Ga2O3 | 5.065 | 5.065 | 13.64 | 90° | 90° | 120° |
β-Ga2O3 | 12.469° | 3.087 | 5.882 | 90° | 103.673° | 90° |
Phase . | a (Å) . | b (Å) . | c (Å) . | α . | β . | γ . |
---|---|---|---|---|---|---|
α-Ga2O3 | 5.065 | 5.065 | 13.64 | 90° | 90° | 120° |
β-Ga2O3 | 12.469° | 3.087 | 5.882 | 90° | 103.673° | 90° |
Typically, two main approaches are employed to train a reliable MLIP capable of accurately representing a wide spectrum of local atomic environments. The first one involves the inclusion of various systems, necessitating extensive AIMD simulations. By incorporating a diverse range of systems, the MLIP can effectively learn to capture the nuances of different atomic environments. The second approach is called active learning, and it involves the selective addition of data points to the training set, focusing on structures where the MLIP extrapolates significantly. These are instances where the MLIP’s predictions for energies and forces exhibit high uncertainty. By strategically choosing a minimal yet diverse set of training data in the feature space, this active learning strategy enables effective fitting of the potential and helps overcome extrapolation challenges. In this study, we generate a small dataset to pretrain the MLIP. Subsequently, we employ an active learning strategy to iteratively expand this dataset, ensuring the MLIP’s robustness and accuracy for lattice dynamics calculations. We advocate for the consistent use of the active learning methodology due to its efficacy in enhancing the MLIP’s performance and, thus, revisit it briefly here.
Various active learning schemes are employed in the development of machine-learning potentials.58,59 In our study, we utilized the D-optimality-based active learning procedure developed in Ref. 36 and available in the MLIP-2 package. The initial training set was generated in AIMD simulations with an isothermal–isobaric (NPT) ensemble using the Nosé–Hoover thermostat.60 The initial dataset was generated at T = 700 K for the phase and at T = 1800 K for the phase, respectively. Each AIMD trajectory was simulated for 2 ps with a 1 fs time step. Initial structures for AIMD were prethermalized following the algorithm established in Ref. 61, which allows to equilibrate simulations at finite temperatures faster. The first picosecond of each AIMD simulation was discarded, and the second picosecond was subsampled with the 5 fs time intervals, resulting in a set of 200 snapshots for each structure. These samples were employed to train an initial MTP of level 12 (this MTP contains 127 parameters).
Within the active learning algorithm, we initiate MD calculations (in the NpT ensemble) in LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator)62 using a pretrained MTP of level 12 as the model for interatomic interactions. At each step of the MD simulations, the algorithm assesses the extrapolation grade of the atomic configuration. The D-optimality criterion guides the decision to include a configuration in the training set based solely on atomic coordinates. This unique feature, coupled with the linear form of the potential, allows MTP to learn effectively on-the-fly. Configurations with are added to the preselected set. When exceeds 10, the MD simulation halts, and all sufficiently different configurations from the preselected set are incorporated into the training, followed by the refitting of the potential. This procedure repeats until MD simulations can run without failure for 30 ps. In our case, MTP achieved robustness with a training set comprising 439 samples in the case of - and 403 samples in the case of - . This amount of samples is two times to an order of magnitude smaller than for other machine-learning potentials.63,64 The potential trained in this manner demonstrated the ability to robustly conduct MD simulations for 100 ps, showcasing the effectiveness of the active learning procedure in terms of reducing the necessary DFT data for training and enhancing potential robustness. The training set generated during the active learning procedure was then used to train MTP with level 16 (this MTP contains 222 parameters). Then, we checked that obtained potential is suitable for long (30 ps) MD simulations and that no new structures were selected during this simulation.
The fitting accuracy of obtained potential can be verified by the agreement between MTP predictions and the benchmark AIMD results. We compare the MTP-predicted energies and forces for a separate testing dataset formed by running AIMD for 1 ps with time steps of 1 fs (this dataset contains 1000 snapshots) at T = 700 K for and T = 1800 K for phases, respectively. It is found that our MTP can provide the root-mean-square errors of 0.2 and 0.6 meV/atom for the potential energy of and phases. The RMSEs for the atomic forces are 22 and 46 meV/Å for and phases, respectively. Therefore, MTP can accurately reproduce ab initio energies and forces as also visually demonstrated in Figs. 1(a) and 1(b).
Comparison between DFT- and MTP-calculated (a) energies and (b) forces for both - and - structures. A straight black line represents a perfect linear dependence.
Comparison between DFT- and MTP-calculated (a) energies and (b) forces for both - and - structures. A straight black line represents a perfect linear dependence.
III. RESULTS AND DISCUSSIONS
Harmonic or second-order force constants can be used to evaluate the accuracy of MTP compared to DFT. We thus calculate the phonon dispersion relation along selected high symmetry paths, using the supercell approach and the finite displacement method,65 as implemented in the Phonopy package.66, Figures 2(a) and 2(b) show calculated phonon dispersion relations obtained using MTP and DFT for both polymorphs of , respectively. We note good agreement between both methods, which validates an ability of MTP to describe the lattice dynamics in the harmonic approximation. Notably, existing empirical potentials frequently fail to capture that.67,68 The most observable differences in the phonon spectrum might be seen for high-frequency optical phonon modes [especially at the Z-point in Fig. 2(a)]. This is a usual problem frequently arising in the harmonic constants simulations with MLIPs.28,63 From our experience, it is possible to improve the accuracy of optical phonon modes evaluation by performing simulations at high temperatures for a set of compressed and expanded structures.29,64 In this work, we skipped this procedure since obtaining precise values of LTC is not the primarily goal of this work. Considering the low symmetry and high anisotropy of - , it is reasonable to see that the phonon dispersion profile is much more complex than that of many other wide-bandgap materials, such as GaN and ZnO.69 Due to the versatility stemming from the relatively high complexity of the MTP functional form, harmonic force constants can be accurately reproduced—this precision is crucial as it serves as a fundamental prerequisite for conducting calculations of the LTC.
Phonon dispersion relation of (a) - and (b) - along the selected high symmetry Brillouin zone path. Calculations are done using DFT and MTP.
Phonon dispersion relation of (a) - and (b) - along the selected high symmetry Brillouin zone path. Calculations are done using DFT and MTP.
We next proceed toward LTC calculations, which was evaluated by solving the Boltzmann transport equation for phonons in the relaxation time approximation as implemented in the Phono3py package.10,70,71 Here, we should mention that in the case of - , one has to compute forces acting on atoms in 4087 supercells (with 120 atoms) and in the case of - in 15 225 supercells (with 160 atoms). Albeit the crystalline symmetry is taken into account to reduce the number of calculations, these numbers are tangible even for calculations on modern high-performance clusters. Performing these calculations with MTP as a model for interatomic interactions takes minutes on a personal workstation.
Figures 3(a) and 3(b) present temperature dependence of the LTC for and polymorphs. Since - has a monoclinic structure, its [001] direction does not align with the -axis. Therefore, we calculated its LTC in the [001] direction as was done in previous works:52,63 . Notably, MTP caught the anisotropy in LTC values along different directions in both - and - , which further validates its accuracy. Values of LTC obtained using MTP are in a good agreement with most recent DFT calculations,52 for example, at 300 K LTC values along [100] are identical up to the second decimal. The maximal difference between our results and previous theoretical calculations52 is within 5%–10%, which might be associated with different structural parameters or calculations set up since PBE revisited for solids (PBEsol) was used in Ref. 52. Taken into account tremendous reduction of simulation time, we do not consider this as a significant problem, and moreover, the comparison between calculated LTC at T = 300 K in our study and further literature data (as presented in Table II) shows good agreement.
Temperature dependence of the LTC of (a) - and (b) - for three different directions. LTC calculations with MLIP (moment tensor potential in our case) are compared to the recent calculations, which were done in the DFT framework.52
Temperature dependence of the LTC of (a) - and (b) - for three different directions. LTC calculations with MLIP (moment tensor potential in our case) are compared to the recent calculations, which were done in the DFT framework.52
Lattice thermal conductivity of α- and β-Ga2O3 at 300 K. The units are W/mK.
Polymorph . | . | [100] . | [010] . | [001] . |
---|---|---|---|---|
α-Ga2O3 | This work | 9.05 | 9.05 | 12.46 |
Calculations | 8.9552 | 8.9552 | 12.9552 | |
11.6172 | 9.3872 | 8.9472 | ||
β-Ga2O3 | This work | 9.91 | 17.5 | 12.93 |
Calculations | 10.6352 | 18.9352 | 13.952 | |
10.6863 | 20.7863 | 12.6163 | ||
Measurements | 9.5 ± 1.849 | 22.5 ± 2.549 | 13.3 ± 1.849 |
IV. CONCLUSION
In summary, we developed MTPs for atomistic simulations of and - . Our potentials exhibit good accuracy in reproducing the ab initio PES, attaining a total energy accuracy below 1 meV/atom and force accuracy below 50 meV/Å for both polymorphs. We then compared phonon dispersion and LTC values calculated using MTP with the results of DFT and obtained good agreement, which highlights the applicability of MTP in calculating the lattice dynamics and heat transport of solids. The proper choice of training data to generate an accurate potential using as little data as possible is governed by the utilization of the active learning methodology. Although the possibility of its usage is frequently ignored, the active learning scheme plays a crucial role in enhancing the robustness of the machine-learned interatomic potentials, as demonstrated here.
The accompanying tutorial, which demonstrates how to calculate LTC using Phono3py and MLIP-2 packages, is available on GitLab.73
ACKNOWLEDGMENTS
We acknowledge funding from the Russian Science Foundation (Project No. 23-13-00332).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Nikita Rybin: Conceptualization (lead); Formal analysis (lead); Investigation (lead); Methodology (equal); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Alexander Shapeev: Conceptualization (equal); Formal analysis (equal); Funding acquisition (lead); Methodology (equal); Project administration (lead); Resources (lead); Supervision (lead); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.