Machine learning (ML) methods are of rapidly growing interest for materials modeling, and yet, the use of ML interatomic potentials for new systems is often more demanding than that of established density-functional theory (DFT) packages. Here, we describe computational methodology to combine the CASTEP first-principles simulation software with the on-the-fly fitting and evaluation of ML interatomic potential models. Our approach is based on regular checking against DFT reference data, which provides a direct measure of the accuracy of the evolving ML model. We discuss the general framework and the specific solutions implemented, and we present an example application to high-temperature molecular-dynamics simulations of carbon nanostructures. The code is freely available for academic research.
Molecular dynamics (MD) simulations have become a key research technique in physics, chemistry, and materials science. Quantum-mechanical methods, most commonly density-functional theory (DFT), make it possible to accurately predict forces on atoms, which in turn can be used to integrate Newton’s equation of motion and, therefore, to drive high-fidelity simulations of molecular systems and condensed phases.1 This type of simulation, sketched in Fig. 1(a), is often referred to as first-principles or “ab initio” MD (AIMD): because it is based on the laws of quantum mechanics, no prior knowledge of the system is required as long as a reasonable initial structural model is chosen. Such AIMD simulations can reach system sizes of several hundreds of atoms, but remain computationally highly expensive.
Machine learning (ML) has opened up an increasingly popular alternative route in this regard.2–6 The key idea is to fit ML surrogate models of the potential-energy surface of a given system based on a relatively small number of DFT reference computations. Despite the growing popularity and widespread use, there is still a significant barrier from the users’ perspective compared to how widely and routinely DFT is used day-to-day. One bottleneck has been the development of suitable reference databases, which initially had to be constructed and curated with substantial user input. Examples for ML potentials, fitted to extensive databases of that type, are Gaussian approximation potential (GAP) models for carbon7 or silicon8 or the ANI series of neural-network potentials for organic molecules.9,10 Once trained, these general-purpose models enable predictions outside of the direct scope of their training: for example, resolving the intricate structure of amorphous phosphorus11 without having included it explicitly in the reference database. We also note recent work in the field on creating widely applicable, “universal” pre-trained neural-network potentials, which are beginning to be tested for very different chemistries.12–14 For all those approaches, there is a substantial requirement for human and computational time in the first place, yet a significant possible benefit once the methodology is used in practice.
A somewhat complementary approach is now to use the underlying DFT code not just for labeling the data (i.e., computing reference energies and forces) but also to generate the structural data themselves “on the fly.” In other words, the ML potential fit can be incorporated directly into the AIMD workflow [Fig. 1(b)]—starting an MD simulation with DFT forces, then accelerating it substantially with an on-the-fly fitted ML potential model where this is possible, and updating the model with new reference data as required. This way, there is no requirement for expert knowledge from the user, even for unseen systems. This type of approach was pioneered for the Vienna Ab initio Simulation Package (VASP),15,16 which now offers ML potential model fitting on the fly.17–20 We review emerging applications in more depth below.
In the present work, we describe a computational framework that allows for the direct interfacing of the DFT-based simulation software CASTEP (Ref. 21) with ML potential fitting. The key methodological steps described herein are (i) the integration of relevant functionality into the CASTEP code itself and (ii) a stand-alone decision-making code that connects to external software for the fitting of ML models. The code is designed to be interoperable with different ML potential fitting frameworks, whether already established in the field, newly emerging, or envisioned in the future: the practical aspects will change, but we expect the overarching ideas to remain the same. We illustrate these developments with a case study of a potential that is “learned” by our hybrid CASTEP + ML scheme and then taken out and used for a larger, stand-alone simulation.
A. Machine-learned acceleration for ab initio MD
Figure 1 provides an overview of the methodology: the regular AIMD workflow is shown in simplified terms in Fig. 1(a), and the inclusion of the ML model requires additional actions (in blue) and decision points (magenta), as shown in Fig. 1(b). We note that this flowchart is deliberately generic, and the questions when to “Use ML?” and when to “Refit?” in Fig. 1(b) can be answered in multiple ways—one of which is discussed below (Sec. II C).
Irrespective of how exactly an on-the-fly fitted ML potential is created, there are different principles by which it can be used in computational practice [Fig. 1(c)]. It can be applied directly within the AIMD code to accelerate simulations of systems with typically a few hundred atoms. This direct acceleration is exemplified in Fig. 1(c) by showing a 200-atom disordered carbon structure. However, once an on-the-fly model has been generated, it can also be used subsequently as a stand-alone potential22–24—evaluating it with an external MD engine, rather than within the AIMD code, thereby enabling much larger-scale simulations. Finally, one might expect in future work to take an on-the-fly generated fitting database (yellow) and combine it with other, existing datasets (gray), together yielding a combined ML potential for a more complex system. An example for the second principle is shown in Fig. 1(c) for a 10 000-atom carbon structure; we will discuss this specific example in Sec. V.
On-the-fly ML potential fitting methodology has been implemented in VASP17,18 using kernel-based regression models similar to the ones used in GAP fitting and the associated Bayesian error estimation to define the decision criteria. The capability of the methodology was initially demonstrated for temperature-dependent structural distortions in a hybrid halide perovskite17 and for melting-point predictions,18 respectively. The utility of the approach was then shown for a wider range of research questions: for simulations of thermal properties that would be computationally highly expensive at the full DFT level25,26 and for beyond-DFT computations for bulk and surface systems.27,28 Once the ML potential has been fitted in this way, it can be taken out and used in larger-scale, stand-alone simulations [Fig. 1(c)], as has been demonstrated in Refs. 22–24. We mention in passing that there are analogies to other “on-the-fly” schemes, which similarly aim to accelerate MD simulations, although in different ways: by combining force-field and quantum-mechanical simulations29,30 and by making external DFT calls during an ML-driven simulation (often referred to as “active learning”).31–35 In the following discussion, we take “on-the-fly” to refer to one of the workflows sketched in Figs. 1(b) and 1(c).
B. Gaussian approximation potentials
In the present work, we primarily fit ML potentials using the Gaussian approximation potential (GAP) framework and the QUIP and gap_fit codes. GAPs were initially introduced by Bartók et al. in Ref. 36, and many aspects of the methodology have been established in a series of papers on tungsten,37 carbon,38 and silicon.8 The underlying theory of Gaussian process regression (GPR), as well as the design choices that are relevant in computational practice, has been reviewed in Ref. 39.
The choice of k is, therefore, the first important decision to be made in GAP fitting. k is typically given by the Smooth Overlap of Atomic Position (SOAP) kernel.40,41 The most relevant SOAP hyperparameters are the radial cutoff, rcut, and the smoothness used for the neighbor density, σatom. The convergence of the neighbor-density expansion with respect to its local basis is controlled by setting maximum values for the radial and angular quantum numbers, n and l, respectively.
C. Checking criteria
ML models accelerate AIMD by replacing the expensive ab initio force calculation with faster ML forces when possible. There is an inherent trade-off in accuracy between the two methods, requiring monitoring of the error of the ML model. This gives us two requirements for the design of the code: (1) minimizing the number of ab initio force calls throughout the simulation and (2) keeping the accuracy of the entire simulation within a specified bound. To fulfill these requirements, a decision-making procedure is introduced before force calculations [Fig. 1(b)], determining whether ML or ab initio forces should be used for the next step. We note that this decision can, in principle, be based on any structural or model-derived information, as long as that information is predictive of the error of the evolving ML model.
We have made the design choice to use direct checking against ab initio data to decide whether or not a potential needs to be refitted. This makes the approach applicable to all ML potential fitting frameworks, irrespective of whether they have internal confidence estimates available. At MD steps where checks are performed, both the ab initio and ML forces are evaluated, and the disagreement between the two is checked against user-supplied tolerance values. The new ab initio observation gathered can be used for updating the model either at all such observations or only when tolerances are not met. As there is a non-negligible cost associated with this refitting step, we only perform model training on occasions where the user-defined tolerances are not met. In principle, the model can be updated more often, though this is only economical if an existing model can be extended without needing to re-train from scratch—e.g., for neural-network-based models.
Figures 2(a) and 2(b) illustrate two possible interval modes for checking the accuracy of the model. The basis of both presented methods is a direct comparison between the CASTEP DFT results (which serve as the ground truth), on one hand, and the ML model prediction, on the other hand. The user can set tolerances for energy, force, and stress errors against which the evolving ML potential is assessed. The interface allows for running a number of initial DFT steps to gather data, indicated by blue shading, followed by a fit (“training”) of the ML model to those data. This is useful for starting from scratch, i.e., without any previous ML potential model, as well as for updating a model to the structure/calculation at hand in cases where previously gathered ab initio data are supplied.
The checking then proceeds at specified points during the MD run. The simplest solution is pre-defining a fixed interval for the DFT re-evaluation and running a constant number of ML-driven steps in between. This is illustrated in Fig. 2(a): the only choice to make here is the number of MD time steps per interval, denoted as n_fix in the input file. We implemented an adaptive scheme as well [Fig. 2(b)], in which the number of MD steps between DFT force calculations is adapted according to the accuracy of the model observed: the previous interval is increased when tolerances are met and decreased otherwise, together with refitting the model. We use a scaling factor (keyword factor in the input file) by which to multiply or divide the interval, keeping it between a lower and upper bound of steps to take (keywords n_min and n_max, respectively).
Figure 2(c) illustrates the application of adaptive checking intervals for aluminum oxide, Al2O3. In this simple experiment, we take a 2 × 2 × 1 (120-atom) supercell expansion of the corundum-type unit cell of Al2O3 and remove one Al and one O atom each, as a prototype for more extended simulations of vacancy migration. We plot the energy (top) and force (bottom) errors observed during the first 1 ps of the MD run, indicating points where both DFT checks were passed in blue and those where at least one was not passed in red. Vertical lines indicate the refitting events. Detailed data, including the associated input files and the results of a longer simulation, are provided in the repository accompanying this paper (see the Data availability statement).
In our implementation, we connect CASTEP21 as the AIMD software to a custom and standalone Python program, which we call hybrid-md. Within the present study, we primarily use the GAP code for potential fitting and evaluation (Sec. II B). We emphasize, however, that hybrid-md can be interfaced with other ML potential fitting and evaluation frameworks as well, for example, through a custom Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) interface described below, which we expect to be relevant in the future. We refer to the overall framework as “CASTEP + ML” and to the specific potential models fitted in the present work as “CASTEP-GAP” models.
Figure 3 provides an overview of the implementation, separated according to those tasks that are carried out within CASTEP (light gray), within the decision-making code described here (magenta), and by external ML potential fitting and evaluation software (blue).
A. GAP interface
We provide an interface in the CASTEP Fortran code to evaluate any QUIP-compatible interatomic potential model, including GAP. We designed a simple interface of the Python code as a shell executable that can be used by the CASTEP code via system calls. Decisions are passed as shell exit codes, parsed by the Fortran code. The behavior of the decision-making code is controlled by an input file; we show an example in Listing 1.
The model updates are provided by the Python code, building on the results of the DFT calculations being written to xyz files by CASTEP. The model fitting itself is carried out using the built-in fitting executable of GAP, which fits a new model from scratch each time there is an update in the data. Again, we note that the code is designed to enable extension and future interfaces with other fitting frameworks as well; we show initial examples to this effect below. In Fig. 3, we refer to generic tasks rather than specific ML potential software to emphasize the general nature of the approach.
The Python code is intended to be self-contained and provide the single interface for non-core development. This is favorable from a licensing point of view (the CASTEP license is more restrictive than GAP’s ASL license) as well as for the efficiency of development: if the user wishes, they can change the Python code for the specific research question they are addressing or add a new custom feature, while not having to obtain, change, and recompile the source of the DFT code. The latter would be a clear issue on high-performance computing (HPC) systems, for example, on top of the typically higher barrier of entry into Fortran programming (compared to Python).
B. LAMMPS interface
We further extended CASTEP with an interface to call LAMMPS42 as a force calculator, which provides access to a wider variety of ML models through optional packages in LAMMPS. For this work, we used the ML-PACE and ML-MACE packages in LAMMPS, making it possible to evaluate Atomic Cluster Expansion (ACE) models43 and equivariant message-passing neural-network models based on ACE (MACE),44 respectively.
LAMMPS uses domain decomposition for parallelism, splitting the simulation cell between Message Passing Interface (MPI) processes for neighbor list and force calculations. Our interface implements supplying the entire MPI communicator to LAMMPS or communicating only on a single process. The latter is essential for models using graphics processing unit (GPU) accelerators, such as MACE, where splitting the cell is counter-productive. In the future, extension for supporting further patterns is possible, e.g., one MPI process per GPU for a multi-GPU model evaluation.
C. Model training extensions
User-defined model training logic can be used for training ACE and MACE models. The import path of the required code can be specified in the input file of the acceleration program. ACE model training can be carried out using the ACE1pack Julia library, executing a training script similarly to how the gap_fit program is called for training GAP models. Training is performed from scratch (no re-use of previous model weights) on the central processing unit (CPU). The MACE Python package is used directly for training MACE models on a GPU. Incremental training is possible, where initially a longer optimization of model weights is used, and later, only short fine-tuning is carried out with new observations. We note that while this functionality is provided as part of the current code implementation, further work is required for an in-depth analysis of generally applicable and robust settings. All following discussion, therefore, will be based on the GAP framework, which is more firmly established at the time of this writing.
IV. NUMERICAL PERFORMANCE
We present example runs to assess the numerical performance of the method. We tested the speed of CASTEP MD runs, comparing standard AIMD with GAP-accelerated runs as described above. We chose two chemically rather different systems to provide representative examples of different envisioned use cases: defective crystalline Al2O3 (ordered structure, primarily ionic–covalent bonding) and liquid Si (highly disordered structure, primarily metallic bonding). Both simulations started from scratch with no ML model or data, and both used the adaptive checking interval scheme described in Sec. II C. The results are summarized in Table I.
|.||Crystalline .||Liquid .|
|.||Al2O3 .||Si .|
|Time in SCF||35.3%||2.5%|
|Time in gap_fit||1.0%||3.0%|
|Time in GAP forces||63.7%||94.5%|
|(Incl. system & hybrid-md)|
|CASTEP + ML MD steps||100 000|
|AIMD steps in same time|
|.||Crystalline .||Liquid .|
|.||Al2O3 .||Si .|
|Time in SCF||35.3%||2.5%|
|Time in gap_fit||1.0%||3.0%|
|Time in GAP forces||63.7%||94.5%|
|(Incl. system & hybrid-md)|
|CASTEP + ML MD steps||100 000|
|AIMD steps in same time|
For the example run on Al2O3 characterized in Fig. 2(c), we examined the timing data to assess the numerical performance of the method. In this specific simulation, 35.3% of the overall runtime was required for the CASTEP self-consistent field loop (that is, for the DFT evaluation), 1.0% for running the external gap_fit program, and 63.7% for MD: system, decision making, and GAP model evaluation. A total of 57 DFT evaluations were carried out, and eight of those led to gap_fit calls [four of those during the first 1 ps of the simulation; see the gray horizontal lines in Fig. 2(c)]. Hence, over a total of 100 k MD steps, we had an average of 1750 GAP-MD steps per DFT computation. AIMD on the same system and input structure has been tested (same HPC system and settings): about 200 MD steps can be done within the same time, which means a speedup of about 500× with the ML acceleration.
For the equivalent test for liquid Si, a total of 36 DFT evaluations (2.5% of overall runtime) and 12 gap_fit calls (3.0%) were carried out over 100 k MD steps on a system size of 250 atoms (2563 per DFT computation, 94.5% of total time). ∼5400 AIMD steps can be carried out within the same walltime using the same basis set and k-point sampling (250 eV, Γ point only), which amounts to a speedup of . The liquid Si simulation was completed in two parts of duration 90 and 10 ps (the latter exemplifying the “restart” functionality). In the final 10 ps, one refit was performed. We note that for both test systems, only a small fraction of the computational time was required for the (re-) fitting of the ML models.
V. CASE STUDY: GRAPHITIZATION OF AMORPHOUS CARBON
We present a case study for a complex, disordered material, as an example of a simulation that will ultimately require larger simulation cell sizes than would be amenable to DFT at all. This section, therefore, exemplifies the second “mode of operation” in Fig. 1(c): generating a potential with CASTEP + ML, which is then taken out and used to drive stand-alone simulations. Our application case is amorphous carbon (a-C), whose structures can have pore radii of several nm, and a-C films can be described in deposition simulations.45,46 The ML-driven computational modeling of a-C materials is relevant for diverse applications, e.g., in energy storage: porous carbon structures can host metal ions (for battery anodes), molecular species (for supercapacitors), and so on.47–51
The thermal annealing of a-C structures is a benchmark challenge for empirical and ML interatomic potentials.52–54 Starting from a disordered precursor, these simulations involve the breaking and forming of bonds typically over hundreds of thousands of MD steps, resulting in a gradually more ordered network that tends toward an sp2-rich, graphitic structure at lower densities (1.5–2.0 g cm−3) or an sp3-rich, diamond-like structure at higher densities (3.0–3.5 g cm−3). Meeting this benchmark is challenging for carbon potentials because of the diverse atomic environments in the amorphous state.
As mentioned above, on-the-fly fitting needs a different strategy compared to a “typical” GAP fit to an existing database of structures. Here, rather than starting from the existing C-GAP-17 database,38 we show that it is possible to start with a small amount of fitting data sampled at relatively few sparse points when fitting on-the-fly and yet achieve accuracy to within tens of meV for a specific use-case. While this error would normally be considered relatively high for ML potentials,55 we note that modeling a-C is challenging due to its complex configurational space and that C-GAP-17 has been shown to be a robust model for a-C.54 CASTEP-GAP produces potentials of comparable quality, but will be less transferable due to the much smaller training database.
For this test, we took a 200-atom a-C structure generated by melt-quenching with C-GAP-17 and annealed it at 3000 K for 350 ps using two different methods: (i) direct MD with C-GAP-17 and (ii) CASTEP MD accelerated with on-the-fly GAP fitting. We chose a density of 2.5 g cm−3, slightly higher than that of graphite, but not high enough to form a dense diamond-like structure—it might, therefore, be viewed as one of the more challenging cases. The hybrid-md settings used are those in Listing 1; technical details are given in the Appendix.
Figure 4(a) shows that the reference C-GAP-17 simulation and the new CASTEP + ML run exhibit qualitatively similar trends and physically correct behavior with respect to the fraction of sp2 atoms. The latter increases rapidly during the first 25 ps, followed by a slower increase until converging at ∼180 ps, up to numerical fluctuations. The simulation driven by the original C-GAP-17 model shows a similar profile while reaching convergence faster, at around 60 ps. Figure 4(a) indicates by arrows when the CASTEP-GAP model was refit, with a total of 255 refits out of 404 DFT checks throughout the simulation. Several refitting events occurred near the beginning of the run; they then gradually spread out according to the adaptive refitting scheme, up until around 170 ps where an increasing number of refitting events occurs. After ∼180 ps, the simulation required restarting, and at that point, refitting was manually disabled due to the perceived stability of the potential, allowing for the simulation to continue with the ML potential without further refitting.
To analyze the quality of the resulting potential, we introduce here a systematic benchmark for disordered carbon across the range of densities and degrees of ordering [Fig. 4(b)]. These structures consist of 200 atoms and were generated using C-GAP-17 following the same annealing protocol. Ten independent structures were DFT-labeled per density–temperature run at the end of the annealing process (500 ps) for a total of 250 uncorrelated cells (50 000 atoms) in this benchmark set. The DFT parameters were adjusted for the benchmark set to match that used in C-GAP-1738 for direct comparison; we note that, in future work, we expect to re-label the dataset at higher levels of DFT. The root mean square error (RMSE) against DFT for the energies and forces were computed for both the CASTEP-GAP model [Figs. 4(c) and 4(d), bottom] and for C-GAP-17 (top). At the chosen temperature and density, the energy accuracy of the CASTEP-GAP model is within 30 meV/at., and the force error is below 1 eV/Å, as highlighted by a square in Fig. 4(c) (bottom). C-GAP-17 reaches an energy error of meV/at. and a force error below 0.9 eV/at., being fitted with many more SOAP sparse points and a larger database. As a result of the latter, C-GAP-17 is better suited to capturing a range of amorphous structures as indicated by the lower energy errors across most of the structures with no anomalous results [Fig. 4(c), top]. This is further illustrated by the force errors observed in C-GAP-17 [Fig. 4(d), top] where the majority of errors are below 1 eV/Å, compared with the CASTEP-GAP model [Fig. 4(d), top], which has relatively poor force accuracy since force errors were not accounted for during the refitting criterion (see Listing 1). Changes in temperature and density have a significant impact on the numerical accuracy of the CASTEP-GAP potential. Notably, at densities of 1.5 and 3.5 g cm−3, we observe significant errors. The reason behind these errors is the absence of higher- and lower-density structures in the CASTEP-GAP database. Consequently, diamond-like structures are evaluated with poor numerical accuracy in terms of forces when compared to C-GAP-17. Additionally, structures at lower densities display a greater variety of chemical environments (e.g., chains, pores, and low-coordination environments) that are absent from the CASTEP-GAP database. This lack of representation results in higher energy and force errors.
One centrally important aspect of on-the-fly ML potential fitting is the ability to extract the potential for use in an external MD engine to run larger simulations.22,23 We exemplify this approach here: we scaled up the system size to 10 000 atoms [Figs. 5(a) and 5(b)] and generated an a-C precursor at 2.5 g cm−3 using the same protocol as for the 200-atom one. The CASTEP-GAP potential was then used in LAMMPS42 to anneal the 10 000-atom structure at 3000 K for 350 ps. The final annealed structure is shown in Fig. 5(b), color-coded by SOAP similarity to diamond,40 where a value of 1 indicates a local chemical environment identical to that in bulk diamond. Qualitatively, the structure in Fig. 5(b) shows the expected warped graphitic layers at this density, where it is too dense to form flat, ordered graphitic sheets, but not dense enough to form a bulk tetrahedral (diamond-like) structure. Notably, there are locally two regions with that characteristic, highlighted in yellow in Fig. 5(b). This type of behavior has been observed with C-GAP-17 previously (along with other empirical potentials)54,57 and ascribed to numerical artifacts arising from interactions between the cutoff function and the second-neighbor peak in the radial distribution function.57
For comparison, the same 10 000-atom structure was also annealed using C-GAP-17. The shortest-path ring counts for both structures give nearly identical trends, where the structure generated from the on-the-fly potential has slightly more six-membered rings [Fig. 5(c)]. Both bond angle distributions [Fig. 5(d)] are centered around 120°, emphasizing the presence of sp2 environments; both radial distribution functions [Fig. 5(e)] have a first peak at Å, consistent with the experimental bond length in graphene (1.42 Å),58,59 along with a second, slightly broader peak centered at 2.44 Å, corresponding to next-nearest neighbors, and a smaller peak at 3.77 Å. Finally, an assessment of coordination numbers [Fig. 5(f)] shows that both annealed structures have a high percentage of sp2 environments . However, the structure generated with the on-the-fly potential has fewer sp3 environments compared to the C-GAP-17 one. Note that there is a very small amount of five-coordinated atoms in both structures (0.06% and 0.05%, respectively); for a discussion of five-coordinated atoms in simulations of amorphous carbon, see Ref. 46.
Overall, the on-the-fly ML potential shows good performance in a scaled-up simulation using an external MD engine. As demonstrated in Fig. 5, the potential yields structural metrics that are comparable to the predictions of C-GAP-17 while requiring only a fraction of the fitting cost. Additionally, little human effort was expended for the construction of the training database and the fitting procedure itself. Numerical validation across densities and temperatures [Fig. 4(c)] showed that the generated potential performs similarly to C-GAP-17 for some configurations, but has a higher error for others for which it has not been “trained.” We emphasize that to be considered fully validated (for production simulations), the robustness of the potential would now need to be more comprehensively tested—e.g., by repeating the same structural validation across various densities and temperatures.
Combining DFT-driven molecular dynamics with ML acceleration can, in principle, offer the best of two worlds: accurate and reliable data for chemically complex systems “out of the box” and fast simulations at the same time. We have here described the practical implementation of a hybrid AIMD + ML framework, with the aim to provide a link between the CASTEP first-principles simulation code on one hand and any user-defined fitting method on the other hand. The connection is made using a standalone Python code to facilitate usability and future modifications. The developments described herein are implemented in CASTEP as of version 22.1, which is freely available for academic research.
One major direction for future work in this area relates to the choice of error prediction and uncertainty quantification algorithms. For now, we have focused on using generic energy and force error criteria to determine at which points of the simulation to refit the model. This means that the method can, in principle, be interfaced with any type of ML potential model beyond GAP: one may envision an extension to fast linear-fitted potentials (e.g., Refs. 60–62) or to recently developed equivariant neural-network models,44,63 and interfaces for this purpose have begun to be implemented in the code (Sec. III C). Beyond this scheme, it would be interesting in future work to explore the comparison between the direct DFT evaluation, on one hand, and predictive error estimators and uncertainty quantification, on the other hand—the latter having been successfully used in Refs. 17 and 34, for example.
In addition to providing a reference for the implementation of CASTEP + ML, our work touches on more generally relevant points about the future development and adaptation of ML potential models in the wider community. One point is the democratization of the usage of those models: initially having required substantial expert knowledge, it is now increasingly possible to investigate a given material with the same ease that characterizes popular AIMD codes. A second point is the expected capability of those approaches not “just” to speed up MD but to provide efficient sampling for developing more generic and wider-ranging fitting databases. The ML potentials generated by an on-the-fly scheme can be extracted and then run on their own, with much more efficient simulations and larger system sizes than if they were integrated in the AIMD loop directly. We expect that these overarching directions will be of interest to the materials modeling community in the years ahead.
T.K.S. acknowledges the support from the European Union’s Horizon 2020 Research and Innovation Program under Grant Agreement No. 957189 (BIG-MAP project) and the UKCP Consortium’s Code Development Internship Scheme 2020 and assistance by Dr. William C. Witt with ACE & MACE model compilation and training. We are grateful for the support from the EPSRC Centre for Doctoral Training in Theory and Modeling in Chemical Sciences (TMCS) under Grant No. EP/L015722/1 (Z.E.-M.). This paper conforms to the RCUK data management requirements. G.L. was supported through the EPSRC Centre for Doctoral Training in Graphene Technology, Grant No. EP/L016087/1, and Versarien ltd. J.D.M. acknowledges funding from the EPSRC Centre for Doctoral Training in Inorganic Chemistry for Future Manufacturing (OxICFM), Grant No. EP/S023828/1. A.P.B. and M.I.J.P. acknowledge the support from the CASTEP-USER project, funded by the Engineering and Physical Sciences Research Council under Grant Agreement No. EP/W030438/1. A.P.B. also acknowledges the support from the NOMAD Centre of Excellence (European Commission Grant Agreement No. ID 951786). We are grateful for the computational support from the UK Materials and Molecular Modeling Hub, which is partially funded by EPSRC (Grant Nos. EP/P020194 and EP/T022213), as well as the UK national high performance computing service, ARCHER2, for which access was obtained via the UKCP consortium and funded by EPSRC Grant Reference Nos. EP/P022561/1 and EP/X035891/1.
Conflict of Interest
A.P.B. and G.C. are listed as inventors on a patent filed by Cambridge Enterprise Ltd. related to SOAP and GAP (US patent 8843509, filed on 5 June 2009 and published on 23 September 2014). A.P.B. and M.I.J.P. declare the receipt of income from commercial sales of CASTEP by Biovia Inc. The other authors have no conflicts to disclose.
Tamás K. Stenczel: Investigation (equal); Software (lead); Writing – original draft (equal). Zakariya El-Machachi: Investigation (equal); Validation (lead); Writing – original draft (equal). Guoda Liepuoniute: Investigation (supporting). Joe D. Morrow: Investigation (supporting). Albert P. Bartók: Software (supporting). Matt I. J. Probert: Software (supporting); Writing – review & editing (supporting). Gábor Csányi: Conceptualization (equal); Supervision (equal); Writing – review & editing (supporting). Volker L. Deringer: Conceptualization (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (lead).
The code is available as a part of the current academic release of CASTEP, for which a license may be freely obtained for academic use (http://www.castep.org/). A stand-alone version of the code is available at https://github.com/stenczelt/CASTEP-ML-data. Data supporting the findings of this study are openly available in the same repository.
APPENDIX: TECHNICAL DETAILS
In Table II, we list the hyperparameters that define the structural descriptors and the number of representative points, M (controlled by the keyword n_sparse), for GAP fitting for all three model systems. The full strings (“descriptor_str” keyword; cf. Listing 1) are contained in the input files provided (see the Data availability statement).
|.||.||Al2O3 .||l-Si .||a-C .|
|.||.||Al2O3 .||l-Si .||a-C .|
In carbon graphitization simulations, for the GAP fitting parameters, we set rcut = 3.7 Å for two-body and SOAP descriptors and additionally included a shorter-ranged three-body descriptor, in analogy to the settings used for C-GAP-17.38 We emphasize, however, that we here only used M = 200 for the SOAP kernel—in strong contrast to C-GAP-17 (M = 4030).38 All DFT computations for carbon simulations were performed in the local density approximation.64 For CASTEP + ML, a plane wave cutoff of 550 eV was used with a 2 × 2 × 2 k-point grid and a Gaussian smearing width of 0.1 eV. The SCF halting criterion was ΔE < 10−5 eV at.−1. The DFT parameters for C-GAP-17 can be found in Ref. 38.