A Gaussian approximation machine learning interatomic potential for platinum is presented. It has been trained on density-functional theory (DFT) data computed for bulk, surfaces, and nanostructured platinum, in particular nanoparticles. Across the range of tested properties, which include bulk elasticity, surface energetics, and nanoparticle stability, this potential shows excellent transferability and agreement with DFT, providing state-of-the-art accuracy at a low computational cost. We showcase the possibilities for modeling of Pt systems enabled by this potential with two examples: the pressure–temperature phase diagram of Pt calculated using nested sampling and a study of the spontaneous crystallization of a large Pt nanoparticle based on classical dynamics simulations over several nanoseconds.

## I. INTRODUCTION

Platinum belongs to the noble metal family and is often used in expensive jewelry. However, the wider importance of Pt for the global economy stems from its countless industrial uses, even in elemental crystalline form. Platinum is commonly used as a catalyst for many chemical reactions. For instance, Pt is the best known catalyst for the hydrogen evolution reaction (HER), where it shows an extremely small overpotential.^{1,2} Pt is also one of the few catalysts that can withstand the highly oxidizing environments of the oxygen reduction reaction (ORR) and oxygen evolution reaction (OER).^{3,4} At the same time, Pt is scarce in the Earth’s crust, and its supply for industrial applications is severely limited by cost and availability. Still, for some applications, the use of Pt can be so advantageous compared to the next-best option that it remains in wide use. To reduce the amount of raw Pt that is needed for a given application, Pt thin films or nanoparticles (NPs) can be used instead of the bulk material. The catalytic properties of Pt films are strongly influenced by crystallographic surface orientation;^{2} for NPs, size and shape are the parameters that determine these properties.^{5–7} Understanding the atomic-scale structure of such systems is, therefore, critical for understanding the catalytic properties.

In this article, we introduce and validate a general-purpose machine learning (ML)-based Gaussian approximation potential (GAP)^{8,9} for elemental Pt. This potential offers similar accuracy as density-functional theory (DFT) for a small fraction of the computational cost. Our potential shows extremely good transferability, accurately predicting the interatomic interactions in Pt from bulk to surface through NPs. The paper is organized as follows: we first discuss the GAP theoretical framework and the generation of training data. We, then, benchmark our potential against the prediction of the basic material properties of bulk, surface, and NP platinum. Finally, we use the GAP to compute the pressure–temperature phase diagram of Pt using the nested sampling (NS) method and to study the nucleation of face-centered cubic (FCC) Pt during the solidification of a large Pt NP.

## II. METHODS

### A. Gaussian approximation potentials

*trained*on that data, from which it

*learns*. A GAP prediction is made by comparing the atomic structure for which we seek the prediction to a set of structures in the database. Each of these comparisons yields a

*kernel*, or measure of similarity, which is bounded between 0 (the two structures are nothing alike) and 1 (the structures are identical). Different

*descriptors*, and combinations thereof, of the atomic structure can be used to describe the atomic environments. For instance, in this work, we use a combination of two-body (2b) and many-body (mb) soap_turbo descriptors.

^{10,11}The actual prediction for the local atomic energy of atom

*i*is expressed as

*k*(

*i*,

*s*) is the kernel between the atomic environment of

*i*and the different atomic environments

*s*in the

*sparse set*(a subset of structures in the training database), the

*α*

_{s}are fitting coefficients obtained during training,

*e*

_{0}is a constant energy per atom, and

*δ*gives the energy scale of the model. Forces can be obtained analytically from the derivatives of Eq. (1). More details about the GAP framework and many-body descriptors are given in Refs. 8–11.

We generate training data at the DFT level using the PBE functional approximation^{12} and will denote it as PBE-DFT from now on. We use a highly converged plane-wave basis set with a 520 eV cutoff and an adaptive reciprocal-space integration mesh such that the number of **k** points is given by *N*_{k} = 1000/*N*_{atoms}. The VASP software^{13–15} is used with input options given in the Appendix. The composition and generation of the database are discussed in Sec. II B. The training and validation of the potential are done with the QUIP/GAP codes.^{16} Structure manipulation and database sorting are done with ASE.^{17} Molecular dynamics (MD) simulations are carried out using a Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)^{18,19} and TurboGAP.^{20}

### B. Database generation and accuracy tests

We want to create a robust Pt GAP that can be used safely in exploratory work, e.g., to assess the stability of Pt NPs derived computationally. To this end, the GAP needs to be simultaneously accurate *and* transferable. Within a data-driven approach, it is important to note that prior knowledge of physics and chemistry is not embedded in the form of the potential. That is, a GAP does not “know” about the Schrödinger equation—it only knows about data it has seen during training. Therefore, the training set must be carefully crafted to contain all the relevant configurations. This includes (meta)stable structures, but also, perhaps counterintuitively, high-energy structures. High-energy structures must be present in the database so that the GAP learns that they are, in fact, of high energy; otherwise, the GAP could spuriously predict previously unseen unstable structures to be low in energy.

It is also useful to realize that high-energy observables can be learned with less accuracy than low-energy ones because low-energy structures contribute much more to the partition function of the system at the temperatures of interest and, thus, to the derived thermodynamic properties. This leads to an efficient database construction strategy where a few disordered structures, such as high-temperature liquid or dimers at close range, are added to sample configuration space sparsely but comprehensively. Further to these, many configurations close to the known stable structures, like close-packed crystal lattices and surfaces thereof, are added by “rattling” the atoms around equilibrium and applying small amounts of strain to the periodic cells. This, in turn, begs the question what about the *unknown* stable structures?

To improve a GAP in a yet unknown region of configuration space, a successful strategy is iterative training.^{21} In iterative training, one trains several versions of the GAP, and each time uses the newest GAP to predict stable structures. The energy values and atomic forces for those structures are then computed with PBE-DFT and fed to the next version of the GAP, which will learn from its predecessor’s successes and, especially, failures. This iterative procedure progressively refines the GAP’s accuracy in the region of configuration space where the target structures (e.g., NPs) reside. The advantage is that the computationally demanding procedure, the structure generation, which might require thousands or millions of energy and force evaluations, is performed with the GAP, inexpensively. The PBE-DFT calculations are only carried out for the final structures or, in some cases, a small subset of the structures selected along the path followed in configuration space to generate the final ones.

Figure 1 shows the most commonly used numerical benchmark for machine learning potentials (MLPs), i.e., a scatterplot of predicted vs reference energies for a 20%/80% test/training sets split. That is, out of the entire database of structures, 80% are used to train the GAP, and the 20% unseen structures are used to test the potential outside the training set. The root mean-square error (RMSE) is computed to give a single numerical score for the overall performance of the potential. Our Pt GAP shows remarkably low errors in this simple test, with an RMSE of only 1.6 meV/atom. Application-specific tests of the GAP are presented in Sec. III, more indicative of how this potential performs for large-scale and high-throughput simulations.

In Fig. 1, we also show the predictions of three embedded-atom method (EAM)-type potentials on our test set for comparison: Zhou’s EAM,^{22,23} the Gupta potential,^{24} and Lee’s modified EAM (MEAM).^{25} EAM-type potentials are usually fitted using ground-state (low-temperature) experimental data. For instance, Zhou’s EAM was fitted to reproduce (quoting the authors) “basic material properties such as lattice constants, elastic constants, bulk moduli, vacancy formation energies, sublimation energies, and heats of solution.”^{22} While EAMs can satisfactorily reproduce the energetics of bulk FCC near equilibrium, as expected from the composition of their training database, all other structure types are modeled significantly less accurately. We will show in Sec. III C a further comparison for NPs. In Fig. 2, we show a histogram of force errors. Systematic deviations for the reference potentials (i.e., the error distributions peak above zero) are apparent, also visible in Fig. 1 for high-energy structures. These high-energy structures fall outside the scope of EAM-type potentials. However, they become relevant in low-dimensional systems and, e.g., at high temperature and/or pressure.

Clearly, the improved accuracy of GAP comes at the expense of additional CPU time. For instance, to perform a single-point calculation for a NP with 147 atoms, our GAP requires ∼109 ms of CPU time, whereas an EAM calculation only needs 1.2 ms. The GAP is still significantly faster than PBE-DFT [using Vienna Ab initio Simulation Package (VASP)], for which this calculation requires of the order of 102 CPU hours (i.e., $\u223c3.5\xd7106$ times more expensive than the GAP).

### C. Nested sampling

We use the nested sampling (NS) technique^{26,27} to evaluate the bulk macroscopic thermodynamic properties of the new Pt GAP model. NS samples the entire potential energy surface, starting from high-energy random configurations (representing the gas phase) down to the ground-state structure through a series of nested energy levels, without requiring any advance knowledge of the stable phases.^{28,29} A unique advantage of NS is that it allows the calculation of the partition function as a simple post-processing step. This gives access to thermodynamic properties, such as the heat capacity—which is the second derivative of the partition function with respect to temperature—and, hence, enables us to identify all the phase transitions of the system. In the current work, we perform the NS calculations at constant pressure, to compute the pressure–temperature phase diagram.^{30–33} Simulations were carried out using the pymatnest program package,^{34} using LAMMPS to perform the dynamics.

## III. BENCHMARKS

Our Pt GAP has been designed with the goal of general applicability in mind. In this section, we prove its transferability across a selection of different applications representative of common use cases. We test the GAP for basic bulk properties (equation of state, elasticity, and phonons), surface energetics, and NP formation energies. While avoiding a too detailed examination of each application, which could merit on their own more focused studies, these examples showcase the predictive power of the new GAP. In Sec. IV, we describe two, more detailed, applied studies: the phase diagram and a spontaneous crystal nucleation in nanostructured Pt.

Accuracy tests that are missing from this section are those relevant to nanoscopic processes in surface diffusion, aggregation, nucleation, and, more generally, rare events involving the description of a transition state while crossing an energy barrier. The exploration of the region of configuration space corresponding to these is difficult to automate because transition states contribute much less to the partition function than stable states and will not be sampled by MD in significant proportions. We are currently developing the methodology to automate incorporating transition-state configurations for GAP training and will add these to future versions of our Pt GAP as these developments become available. In the meantime, one should expect inconsistent prediction of transition-state energies with our GAP, and individual calculations should be benchmarked against DFT before trusting the results.

### A. Equation of state, elastic properties, and phonons

The equation-of-state calculation shows the expected minimum for the FCC phase from zero up to very high pressure, with HCP about 60 meV/atom above FCC and body-centered cubic (BCC) slightly above HCP. The simple cubic (SC) phase is significantly higher in energy than FCC, HCP, and BCC, except at large tensile strain, i.e., at large (and unrealistic) negative pressures, where it becomes the stable phase. All phases evolve smoothly as a function of unit cell volume, as shown in Fig. 3.

Our tests for Pt show that phonons and elastic constants can be learned accurately when the training data only contain structures created for this specific purpose. However, the trained potential is then only able to describe those properties and will not have general-purpose applicability. When different structures are added to bring in more general-purpose applicability, the high accuracy on both phonons and elastic constants is sacrificed. Phonons (Fig. 4) are still described reasonably well as compared to the PBE-DFT results as far as the main trends are concerned, except for a systematic deviation at the *W* point. Table I shows the elastic constants computed with GAP and compared to PBE-DFT, as well as to experiment. Overall, the agreement with PBE-DFT (the GAP’s “ground truth”) is good, with a systematic deviation of only +4%. This deviation is smaller than the PBE-DFT error as compared to experiments, highlighting how, in some cases, the overall accuracy of the GAP is more limited by the intrinsic accuracy of the reference method (PBE-DFT in our case) than by the accuracy of the fit. That is, for the specific purpose of calculating elastic constants, the GAP is a better representation of PBE-DFT than PBE-DFT is of reality. For reference, we also provide the elastic constants computed with EAM, which compare favorably to experiment. They are indeed closer to the experimental values than the PBE-DFT results, a consequence of the fact that the EAM was fitted to reproduce the experimental ground-state results, as discussed earlier.

. | Exp.^{36}
. | PBE-DFT . | GAP . | EAM . |
---|---|---|---|---|

C_{11} (GPa) | 373 | 320 (−14%) | 333 (+4%) | 345 (−8%) |

C_{12} (GPa) | 242 | 218 (−10%) | 228 (+5%) | 250 (+3%) |

C_{44} (GPa) | 78 | 77 (−1%) | 80 (+4%) | 76 (−3%) |

. | Exp.^{36}
. | PBE-DFT . | GAP . | EAM . |
---|---|---|---|---|

C_{11} (GPa) | 373 | 320 (−14%) | 333 (+4%) | 345 (−8%) |

C_{12} (GPa) | 242 | 218 (−10%) | 228 (+5%) | 250 (+3%) |

C_{44} (GPa) | 78 | 77 (−1%) | 80 (+4%) | 76 (−3%) |

### B. Surfaces

Platinum is a material widely used in interfacial (electro)catalysis, and thus, it is important to ensure that an interatomic potential for Pt can accurately reproduce surface formation energies. The three surfaces most commonly studied are those defined by the (111), (100), and (110) crystallographic FCC planes.^{2} The (111) surface is the most stable one and the one most often used in electrocatalysis, e.g., for hydrogen production, due to the low overpotential it exhibits for HER.^{1}

A comprehensive study of surface energetics for arbitrary Miller indices (*hkl*) becomes prohibitive for DFT, due to the large number of atoms in the unit cell for large indices. For example, a 7-atom thick Pt slab with (10 1 0) indices already contains 280 atoms in the *primitive* unit cell. With our Pt GAP, studying these surfaces with small tilt angles becomes possible. We, therefore, calculated the surface formation energies, with bulk FCC Pt as reference, for all the symmetry-inequivalent Miller planes that can be constructed in Pt when letting each index run up to 10. To ensure that reconstruction effects beyond the primitive unit cell are considered, we ran the calculation for the primitive unit cell generated with ASE^{17} as implemented by Buus, Howalt, and Bligaard, its Niggli equivalent cell,^{37–39} as well as 2 × 2 supercells built thereof. We included small random initial displacements of the atoms to avoid biasing the geometry optimization due to high-symmetry starting configurations. Altogether, six calculations were carried out for each set of Miller indices and the obtained surface formation energies per atom were always the same (except for negligible numerical differences). This indicates that simple relaxation of the atomic positions takes place as the surfaces are created and that the surfaces have the same periodicity as the primitive unit cell.

Figure 5 (top panel) shows the surface formation energies for varying Miller indices within the triangle enclosed by the (111), (110), and (100) planes as end points. The values predicted for those planes, 0.091, 0.117, and 0.117 eV/Å^{2}, respectively, are in good agreement with our reference PBE-DFT values (0.096, 0.123, and 0.120 eV/Å^{2}, respectively) and with recent values from the literature.^{40} The GAP predicts smooth transitions as the cleaved (and relaxed) crystal facet is tilted between the most common facets. The bottom panel of the figure shows the surprising result that our Pt GAP is more capable of reproducing the PBE-DFT surface energies provided by a “high-quality” PBE-DFT calculation (computed with the same VASP settings as those reported in Sec. II) than another PBE-DFT calculation with “standard” settings. The GAP tends to slightly underestimate the surface formation energies (by about 5%) but the trends, i.e., the relative formation energies, are accurately captured.

### C. GAP accuracy for nanoparticle modeling

We have generated a large database of Pt NPs for this work. This database is divided into the two following subsets, NP-DB01 and NP-DB02. NP-DB01 contains 8000 NPs generated between *N*_{atoms} = 10 and *N*_{atoms} = 349 using an annealing–quenching–relaxation procedure, starting from a highly disordered precursor, where the annealing and quenching steps take 20 ps each and the annealing happens at 1500 K; we call this a “cooking” protocol. NP-DB02 contains 3400 NPs between *N*_{atoms} = 10 and *N*_{atoms} = 349 (10 for each size) where the annealing step of the cooking protocol takes place at the optimal crystallization temperature of 1150 K (see Sec. IV B) but, otherwise, generated in the same way as NP-DB01. This database is freely available from the Zenodo repository^{41} and will be extended in subsequent work, in particular with larger NPs beyond *N*_{atoms} = 349.

To assess the ability of our GAP to accurately model Pt NPs and to compare it to previously available, commonly used, force fields for Pt modeling, we selected NPs from NP-DB01 up to *N*_{atoms} = 150. The energies were computed with our GAP, standard-quality PBE-DFT (300 eV plane-wave cutoff), the Gupta potential,^{24} and the EAM potential.^{22,23} We compare all these numbers to a benchmark-quality PBE-DFT calculation (520 eV plane-wave cutoff). The results of this comparison are shown in Fig. 6. Clearly, the GAP outperforms the other force fields with errors ($\u223c40$ meV/atom) one order of magnitude smaller than Gupta ($\u223c400$ meV/atom) and EAM ($\u223c500$ meV/atom) around and above 50-atom NPs. The GAP errors for these NPs are about five times larger than those obtained from standard-quality PBE-DFT. For very small NPs (<50 atoms), the GAP results are still better than for the other force fields, but the errors are significantly higher than for larger NPs. Since the atomic motifs in small NPs look significantly different from those of bulk and surfaces, it is not surprising that the errors are larger.

The accuracy of GAP can be enhanced specifically for NPs by iteratively training the potential for that purpose. That is, we can improve the accuracy of the GAP in the future by adding (some of) these NPs to the training set and training a new version of the potential, as exemplified in Fig. 7. In that figure, we observe the performance of two versions of the Pt GAP. The first one, GAPv1, is initially used to make two sets of small NPs, with *N*_{atoms} ≤ 50. One of the sets is used to retrain the GAP, giving GAPv2, and the second set is used to test the predictions of both versions vs PBE-DFT. The results are shown in Fig. 7 (left) where GAPv1 is shown to predict too low (i.e., too stable) energies for the smallest NPs in the test set (*N*_{atoms} ≲ 40), whereas GAPv2 correctly orders all of the NPs generated with GAPv1. On the right-hand side of the figure, we show the energy predictions of GAPv1 and GAPv2 for NPs that were generated with GAPv2. There are two features of the GAP accuracy refinement provided by iterative training which are apparent from this right-hand panel. On the one hand, as expected, GAPv2 produces “better” NPs than GAPv1, in the sense that they are lower in energy when looking at the PBE-DFT energy prediction (i.e., the datapoints are shifted horizontally to the left, compared to the left-hand panel), and there is less data scatter. On the other hand, counterintuitively, the GAPv1 predictions for these GAPv2-generated NPs are in better agreement with PBE-DFT than the GAPv2 predictions. While unexpected, this is a typical result for early iterations in GAP iterative training: a given iteration of the potential, used in an application-specific simulation, will favor structures which populate artificially low regions of the PES. As new iterations of the potential add these low-energy structures to the database, the PES is refined and the GAP “unlearns” the spurious minima and the scatterplot converges toward optimal agreement with DFT.

The structure generation strategy that we followed here to augment the GAPv1 database is as follows: we first generate all the regular FCC tetrahedra that can be constructed below 50 atoms, which correspond to 4, 10, 20, and 35 atoms. We, then, start an MD simulation from the ideal (relaxed) structure, quickly (10 ps) heat up to 3000 K and quickly (another 10 ps) quench down to 100 K. From this MD trajectory, we sample 11 equidistant (in time) snapshots, which ensure we incorporate a wide diversity of small nanoclusters, including some that are high in energy: regular (crystal-like), thermally disordered, and quenched structures are added to the training database.

Generally, as new training configurations are generated, we can retrain and refine the accuracy of our GAP. For reference, we provide in the repository^{42} two versions of the GAP: the one used for most of the simulations presented in this article (v1) and the one that contains a small amount of NP-specific iterative training (v2). Any future version of the GAP will be added to this repository together with a note on any further additions to the database, compared to the configurations reported here, with all published versions remaining publicly available. This will ensure that the user base of the potential has easy access to the most accurate (and most recent) Pt GAP while enabling reproducibility of the results produced with all earlier versions. Upcoming work from our group will focus on a detailed study of small Pt NP formation and stability, and we expect to update this repository with a NP-optimized version of the GAP in the near future.

For the sake of clarity, we emphasize here that GAPv1 was used to generate all the results in this paper except for those labeled as GAPv2 results in this subsection. In addition, we note that the linked repository^{42} allows us to browse the full history of GAP versions, even though the latest version is shown by default. Both v1 and v2 can be retrieved from the repository and are listed under “Versions.”

## IV. APPLICATIONS

### A. Pressure–temperature phase diagram

The NS calculations were performed as presented in Ref. 31. The simulations were run at constant pressure in the range of *p* = 0.07–50 GPa, using a simulation cell of variable shape and size, containing 24 atoms. We used 1000 walkers and performed 440 steps (8:1:2:2 ratio of total-energy Hamiltonian Monte Carlo, volume, cell shear, and cell stretch steps) to generate the new configurations during the NS iterations. These parameters ensure convergence of the melting transition within ±40 K. The use of small systems will inevitably cause some finite-size effects, for example, an underestimation of the boiling curve and an overestimation of the melting line as compared to the macroscopic value.^{30} In order to estimate this error, we repeated the simulations with 48 atoms at *p* = 1 GPa and obtained 2.8% lower melting temperature than the 24-atom calculation.

Figure 8 shows the pressure–temperature phase diagram. At low pressure, we observe a heat capacity peak at high temperature corresponding to the boiling curve and its extension to the supercritical region, the Widom line, marked by a shallower and broader peak (shown by the dashed red line in Fig. 8). To locate the critical point in the NS calculations, we drew on the results of Bruce and Wilding^{43} and calculated the density distribution in the temperature region of the peak. With this, we estimate the critical parameters to be *p*_{c} = 0.1–0.2 GPa and *T*_{c} = 9500–10 600 K. The low-pressure melting transition is estimated to be $\u22481650K$, hence underestimating the experimentally determined transition. This inaccuracy could be either due to our GAP or to an inherent error of the PBE functional used to train it. Since the NS calculations at the PBE level are simply intractable, there is no straightforward way to pinpoint the origin of this disagreement with experiment. The NS calculations found the solid structure to be FCC as expected and explored other close-packed stacking variants only in thermodynamically insignificant proportions.

### B. Spontaneous FCC nucleation and crystallization

We also used the Pt GAP to study the spontaneous nucleation of the stable FCC structure and the spontaneous formation of facets in a large NP (16 384 atoms) with MD. Figure 9 shows the sequence from the initial cube carved out of an FCC lattice. This is melted at 3000 K for 40 ps and then the quenching process takes place by cooling the NP from 3000 K down to 300 K over 1 ns using a linear temperature profile, controlled by a Berendsen thermostat with time constant 0.1 ps. The figure also shows a slice through the middle of the NP and, for reference, a periodic solid with the same number of atoms and undergoing the same temperature profile. For the solid, the pressure is controlled with a Berendsen barostat with time constant 1 ps and inverse compressibility equal to 100 times that of water.

To get further insight into the atomistic processes taking place during crystallization, in Fig. 9 we map the similarity of the local atomic structures to reference atomic motifs: bulk FCC and the stable (100), (110), and (111) FCC surface reconstructions. This is done by computing the SOAP descriptors of each atom in the system and calculating the similarity kernel with the SOAP descriptors of the reference motifs. These similarities are indicated by color coding the resulting structures. As expected, toward the end of the quench the interior of the NP (as well as the solid) is FCC-like, and the NP facets are (111)-like. Interestingly, the simulation shows that the formation of the FCC interior is nucleated from the surfaces inward. Therefore, there is grain formation with the (111) direction pointing approximately from the surface toward the center of the NP. For this reason, the resulting NP is polycrystalline, with the grain boundaries indicated by dark-colored atoms. It is clear from the figure that the formation of the FCC interior in the NP happens at a *higher* temperature than in the solid due to the nucleation effect at the (111) facets. A video animation of this process is available.^{46}

To elucidate the role of the quench rate on the results, we monitored the evolution of the NP’s structure as it was cooled down from 3000 to 300 K for additional quench rates corresponding to 2–10 ns simulations, with the same MD settings as before. Figure 10 shows the evolution of the potential energy as a function of temperature in the 1400–800 K temperature window, where most of the FCC nucleation takes place in these simulations (outside of this range the potential energy evolves linearly with temperature, as expected from the virial theorem). According to our MD results, the onset of significant structure rearrangement favorable toward FCC nucleation takes place at around 1200 K and continues down to a temperature, which depends on the quench rate (the slower the rate the higher the final temperature). From these values, we infer an optimal crystallization temperature around 1150 K. This is analogous to the graphitization temperature in carbon materials.^{47,48} We, therefore, repeated the MD simulation starting from the 3000 K-melted NP but fixing the thermostat’s target temperature at 1150 K and annealed for 1 ns (indicated as “Ann.” in the figure). There is a rapid quench from 3000 to 1150 K, and then, the system equilibrates for a few ps, corresponding to the loop seen at high potential energy, before it starts to go down in energy as it crystallizes (the vertical drop in potential energy at 1150 K). Most of the annealing process was completed after 250 ps, with no noticeable further drop in potential energy after 500 ps of MD. After the 1 ns annealing simulation had ended, we further quenched the structure to 300 K over 100 ps using a linear temperature profile. The results showed good agreement with the more computationally demanding slow quenches. This annealing process at 1150 K, thus, allows us to minimize the number of MD steps that are required to generate a reasonably stable NP, generated from a process mimicking spontaneous solidification.

## V. CONCLUSIONS AND OUTLOOK

We have developed a GAP for Pt with state-of-the-art force-field accuracy for the description of bulk, surface, and nanostructured systems. We have benchmarked our GAP against PBE-DFT for general accuracy, elasticity, phonons, surface energetics, and NP formation energies. Except for small NPs (*N*_{atoms} ≲ 40), our GAP shows remarkable agreement with the reference PBE-DFT data. We have, then, proceeded to use the GAP in situations beyond the reach of PBE-DFT calculations. Namely, we have computed the temperature–pressure phase diagram and studied the spontaneous solidification and FCC-motif nucleation in a large NP. The new GAP and several other resources have been made freely available. In the near future, we will further develop our reference database and the potential itself for improved description of NPs and surface dynamics, with the objective to get detailed insight into the atomic-scale phenomena taking place in Pt-based systems of interest in (electro)catalysis.

## ACKNOWLEDGMENTS

J.K. and M.A.C. acknowledge funding from the Academy of Finland under the C1 Value Programme, Project No. 329483. M.A.C. also acknowledges personal funding from the Academy of Finland, Project No. 330488. H.J. acknowledges funding from the Icelandic Research Fund, Project No. 207283-053. L.B.P. acknowledges support from the EPSRC through an Early Career Fellowship (Grant No. EP/T000163/1). Computational resources for this project were obtained from the CSC—IT Center for Science and Aalto University’s Science-IT project.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Jan Kloppenburg**: Conceptualization (supporting); Data curation (lead); Formal analysis (equal); Methodology (equal); Validation (lead); Visualization (equal); Writing – original draft (supporting); Writing – review & editing (equal). **Livia B. Pártay**: Data curation (supporting); Formal analysis (supporting); Funding acquisition (supporting); Methodology (supporting); Supervision (supporting); Validation (supporting); Visualization (supporting); Writing – original draft (supporting); Writing – review & editing (equal). **Hannes Jónsson**: Data curation (supporting); Formal analysis (supporting); Funding acquisition (supporting); Methodology (supporting); Supervision (supporting); Validation (supporting); Visualization (supporting); Writing – original draft (supporting); Writing – review & editing (equal). **Miguel A. Caro**: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Methodology (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

Most of the data that support the findings of this study are available from public online repositories as listed in the references. Any data not publicly shared in this way can be obtained from the corresponding author upon reasonable request.

### APPENDIX: VASP INPUT FILE

The VASP INCAR input file used for the PBE-DFT calculations is given below:

PREC = Accurate

ENCUT = 520

EDIFF = 1.0e-05

ISMEAR = 0; SIGMA = 0.1

ALGO = Normal

LWAVE = .FALSE.

LCHARG = .FALSE.

The *k*-space sampling is not explicitly set in the INCAR file. Instead, **k** points are chosen by homogeneously sampling the first Brillouin zone with the total number of points determined by the relation *n*_{atoms} × *n*_{k} = 1000. To enable high-throughput calculations, the Fireworks framework^{49} was used for task automation and single-point workflows, similar to the implementation in Atomate,^{50} which rely on Custodian^{51} as VASP handler.

## REFERENCES

_{2}formation

*ab initio*total-energy calculations using a plane-wave basis set