Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.


I. INTRODUCTION
The origins of temporal investigations of processes on lattices can be sought in the seminal work by Glauber, who analyzed a Markov process for spin flips in the Ising Model, back in 1963. 1 The subsequent development of kinetic Monte Carlo approaches 2, 3 enabled the study of complex timedependent phenomena in the Ising model such as metastability and dynamic critical phenomena (see for instance Refs.4-7).Application of kinetic Monte Carlo (KMC) to catalysis on surfaces was pioneered by Ziff, Gulari, and Barshad, 8 who investigated kinetic phase transitions on a model for CO oxidation in 1986.][11][12][13][14][15] Since these early studies, significant developments have taken place.Over the past two decades KMC has evolved a) Author to whom correspondence should be addressed.Electronic mail: m.stamatakis@ucl.ac.uk.Tel.: +44-203-108-1128.
to enable the coupling with ab initio calculations for a first-principles based simulation of chemistries on catalytic surfaces [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31] (for reviews, see Refs.32-34).In these firstprinciples KMC frameworks the reaction energies and activation barriers are typically obtained from density functional theory (DFT) calculations, and the kinetic parameters are calculated by employing transition state theory (TST) approximations.Thus, within the level of accuracy of DFT and to the extent of validity of TST, the quality of the predictions obtained by KMC depends on "how well" the ab initio data have been incorporated into the simulation.
In particular, attractive or repulsive interactions between adsorbates on the catalytic surface have been shown to affect the rates of elementary reactions. 16,35 arly first-principles KMC frameworks used DFT and bond-order conservation (BOC) methods to account for such effects. 16,17 n some instances, such interactions were neglected altogether or modeled by pairwise additive nearest neighbor contributions, due to large computational expense needed for implementing more accurate models. 36The most general such models consist of cluster-expansion (CE) Hamiltonians, which can accommodate any level of accuracy by taking into account long-range and many-body contributions to the total energy.9][40][41][42][43][44][45] Moreover, CEs have been employed to investigate adsorption/desorption dynamics [46][47][48] and the diffusion of adatoms on metal systems. 49,50 ly recently however, have cluster expansion Hamiltonians been applied for the estimation of catalytic rates. 51In an elegant study, Schneider and co-workers investigated the kinetics of NO oxidation on Pt(111) by means of equilibrium Monte Carlo (MC) calculations and appropriate averaging of the microscopic kinetic rates over the lattice.The MC calculations employed a cluster expansion for O on Pt(111) 45 and yielded equilibrium structures of the oxygen adlayer.The dissociative adsorption energy of O 2 was then calculated for each pair of empty sites (using the cluster expansion) and mapped to an activation energy for O 2 dissociation thereon via a Brønsted-Evans-Polanyi (BEP) relationship.Thus, a distribution of activation energies was obtained, from which the average rate of O 2 dissociation was evaluated.To make the connection with NO oxidation, it was assumed that the NO/NO 2 mixture was at equilibrium with adsorbed oxygen, thereby computing the chemical potential of O which is the input to the aforementioned MC calculations.
In view of the increasing interest in accurate predictions of catalytic performance, long-range lateral interactions and possibly many-body contributions are expected to form an integral part of kinetic models of catalysts.We have thus developed a general-purpose KMC framework that incorporates detailed cluster expansion models for the adlayer energetics, building on the previously introduced graph-theoretical KMC approach. 52Using this framework, one can define a cluster expansion model for the adlayer energetics, along with the forward activation energies of the elementary reactions for the chemistry under investigation.The reverse activation energies are computed through linear BEP relations thereby ensuring that the simulation exhibits microscopic reversibility.To tackle the high cost of computing the energetics in the presence of long-range interactions we introduce parallelization with OpenMP, and benchmark the performance of the framework by simulating a KMC analogue for the NO oxidation model established by Schneider and co-workers. 51he rest of the paper is organized as follows.In Sec.II we present a brief overview of the graph-theoretical KMC framework, and present our current development work incorporating cluster expansion Hamiltonians in the framework.Subsequently, we discuss the setting up of a benchmark model for NO oxidation on Pt(111) based on the cluster expansions developed by Schneider and co-workers. 45,51 n Sec.III, we validate our computational framework and investigate the behavior of the model under different operating conditions.We further discuss the performance and scalability of the parallel code.In Sec.IV, we highlight our contributions and discuss the significance of this work.

II. METHODOLOGY
The components and procedures of the graph-theoretical KMC implementation are discussed in Ref. 52, whereas   Refs.53-56 demonstrate the use of the method in gaining a fundamental understanding of surface processes in a variety of catalytic systems.For a general overview of KMC approaches and applications in heterogeneous catalysis the reader is referred to Ref. 55. Here, we will briefly review the framework and present in detail the calculation of the energetics which is the focus of the present work.

A. Overview of graph-theoretical KMC
The workload of the KMC algorithm consists of simulating a sequence of elementary events, such as adsorption, desorption, diffusion, reaction, which change the configuration of a lattice representing the catalytic surface.In the graphtheoretical KMC, the lattice, adsorbate configurations, elementary reactions, as well as energetic interaction contributions, are all represented by graphs.The input to a simulation consists of parameters controlling the behavior of the program (for instance, sampling times, stopping criteria, etc.), and the components specifying the physics of the system under investigation.For the latter, one has to provide the conditions (T, P, and gas phase composition), a lattice structure, an initial configuration of the lattice, an energetics model containing at minimum the energies of gas species and the binding configurations of adsorbates, and finally, a mechanistic model containing a list of possible elementary events.Note that in our previous work, no energetic model was specified.The code could only handle pairwise additive interactions and microscopic reversibility was manually implemented by correctly listing all interactions between neighboring species and the initial and final state species of each elementary event.In the current work, we introduce cluster-expansion based models for a general and thermodynamically consistent KMC simulation scheme.
As in previous KMC schemes, a simulation is initialized with the given lattice configuration (possibly an empty lattice), and a queue is generated containing all the possible lattice processes for the given configuration.For instance, if the lattice is initialized with one adsorbed molecule, the possible processes could be: adsorption on all empty sites, desorption of the adsorbate, or diffusional hops to each of the neighboring empty sites.The queue just mentioned also stores randomly generated times in which each process can occur.Under constant conditions, each of these random times follows an exponential distribution with a rate parameter equal to the kinetic constant of that process. 57fter initialization, the algorithm enters a loop at each step of which the most imminent process in the queue (the one with the smallest time of occurrence) is executed.Thus, the reactants are removed from the lattice along with the processes they participate in, and the products are added to the lattice.The latter operation also involves detecting all the possible processes in which the newly added adsorbates can participate and including them in the queue of lattice processes.This detection entails searching the neighborhood of each newly added adsorbate to identify patterns matching an elementary event, e.g., desorption or diffusion.Note that for each of these patterns identified, a rate constant has to be calculated.Thus, one needs to know the activation energy, which in general depends on the local environment due to the lateral interactions.Moreover, the removal of adsorbed reactants from the lattice eliminates energetic interactions, and the addition of adsorbed reaction products introduces new energetic interactions which affect the activation energies and consequently the rates of existing processes.Computing the activation energies can be computationally intensive if the energetics' model contains long-range interactions.Thus, we implemented parallel computing in order to improve the efficiency of these calculations.
The procedures just described occur at every KMC step.By simulating a long sequence of such steps, a KMC trajectory (realization) is generated which can be subsequently post-processed to yield a variety of observables.For instance, counting the number of adsorbates on each site type yields surface coverages from which the most abundant surface intermediate can be found.Counting the number of reactant molecules consumed per site per time allows one to calculate the turnover frequency of the overall reaction.Moreover, the rates of each of the elementary reactions can be calculated by counting the occurrences thereof per unit time.Analysis of this data can reveal the most dominant pathway and sensitivity analysis can elucidate the processes limiting the overall reaction rate, namely the rate determining steps (RDS). 58,59 C simulations thus provide a wealth of information that can be used to elucidate the underlying molecular phenomena giving rise to catalysis.

B. Implementation of detailed models of lateral interactions
Adsorbate lateral interactions have attracted the interest of experimental and theoretical studies for many years. 60In catalysis, interactions between reactant species and spectator species in the neighborhood of a microscopic event can give rise to spatial heterogeneity in chemical reactivity. 61It is thus important to model such effects in order to enable the accurate prediction of catalytic performance.
To this end, it is necessary to formulate a model Hamiltonian, in order to capture the energetics of the adsorbate overlayer.For lattice systems, the most general such Hamiltonian is given by the so-called cluster expansion. 37Mathematicallly, for a specified lattice and a given function that maps the occupancy of the lattice points to a real number, for instance, the energy for a given configuration of adsorbates, one can in principle construct a cluster expansion that represents this function exactly.Of course, for practical purposes one usually truncates the expansion, which introduces error.The basic idea behind this model is to decompose the energy of a lattice configuration into single-body, two-body, and manybody interaction terms represented by clusters (also known as figures).From a technical standpoint, the state of each site (which in our case represents which adsorbate is bound thereon) is given by a spin variable.One defines a set of basis functions for each site, consisting for instance of Chebyshev polynomials if an orthonormal basis is desired.Then, a set of cluster functions are defined as all the possible products of basis functions for each site in the cluster.To compute the en-ergy of the lattice, one evaluates the linear combination of all cluster functions appropriately weighted with the contribution of each one of them to the overall energy of the system. 37The method has recently been expanded to accommodate different site types. 62o implement a general cluster expansion Hamiltonian in the graph theoretical KMC, we adopt a slightly different formulation which is based on representing each cluster as a graph pattern, and counting the occurrences of that pattern by solving subgraph isomorphism problems.This makes it possible to accommodate patterns in which multidentate species participate.In the graph-theoretical KMC, the state of a single site σ i is given by a 3-element vector, 52 where S L denotes the number of sites on the lattice, N S the number of species, maxdent is the maximum number of sites that any species can occupy, for instance if species A binds to one site (monodentate) and species B binds to two (bidentate), maxdent = 2.In vector σ i , the first element gives the index of the entity bound to that site; this is done in order to be able to tell the sites occupied by multidentate species.If only monodentate species are present, the maximum number of bound entities is S L ; yet, in cases where multidentate species are bound on the lattice, this first element may range up to values less than S L .The second element gives the species type, and the third element gives the dentate with which a species is bound to site i.Thus, the state of the overall lattice is given by an S L × 3 array σ , 52 In this setting, a graph pattern representing a cluster in the expansion is defined as where k represents the set of sites in cluster k (vertexes of the graph pattern), and E k the neighboring relations between these sites (edges of the graph).Each of the sites just noted can be assigned one out of the possible S T site types that may exist on the lattice.Thus, if cluster k involves S C,k sites, the type of site k will be where the 0 means that a site k may be of any permitted type.In addition, the coverage pattern of cluster k can be defined by the states of all sites in the graph pattern in line with Eqs.(1) and (2), The above definition allows for the state of site i to be left unspecified, in which case it is denoted by the ampersand symbol (&).This is useful when defining long range interaction patterns in which the intermediate sites can be vacant or occupied by any species.Finally, a set of geometric constraints can be defined by specifying the angles between certain edges in the graph.Thus, for these constraints a set of three sites {s 1 , s 2 , s 3 } in the cluster is defined and the angle between the vectors (x 1 -x 2 , y 1 -y 2 ) and (x 3 -x 2 , y 3 -y 2 ) is specified to the desired value, where the function Angle calculates the angle between the aforementioned vectors, ϕ i is the desired angle for geometric constraint i, and N AC k gives the number of such constraints defined for cluster k.
Similar to the detection of lattice processes in the graphtheoretical KMC, 52 an energetic contribution is a mapping between the vertex sets of C k (the subgraph) and L = (S, E) (the "large" graph representing the lattice with sites S and edges E), This mapping has to satisfy the following conditions: 1. M is a subgraph isomorphism, namely, every pair p, q of neighboring sites in pattern C k is mapped to a pair M(p), M(q) of neighboring sites in the lattice: 2. Wherever specified, the angles between the specified edges of pattern C k are the same as those on the lattice: 3. Wherever specified, the types of sites of the pattern are the same as those on the lattice: 4. There is a mapping between the cluster entities and the lattice entities: such that Thus, the coverage patterns of the elementary step and the lattice match for sites with specified state.
Note that there are as many such mappings M for cluster k (Eq.( 7)) as the number of occurrences of that cluster on the lattice.This allows for a direct enumeration of the possible energetic cluster contributions.
Let NCE k (σ ) denote the number of occurrences of cluster C k for the lattice configuration σ and ECI k the effective cluster interaction parameter, which gives the contribution of this cluster to the total energy.To avoid overestimating the contributions, we define the graph multiplicity of cluster k, GM k , which can be thought of as a symmetry number for that elementary pattern and gives the number of times the exact same pattern will be counted.For instance, pairwise energetic interactions between adsorbates of the same type occupying the same site types will be counted twice; thus, GM = 2.Then, the total energy of the system will be given as Equation ( 13) is the general cluster expansion Hamiltonian giving the energy of the adsorbate overlayer for any configuration.

C. Computing reaction energies and activation barriers
In the graph-theoretical KMC, every elementary event (adsorption, desorption, diffusion, reaction) is represented as a graph pattern with specified initial and final coverages.Due to lateral interactions, the reaction energy depends not only on the coverages of the sites involved in that event, but also on the coverages of sites in the neighborhood thereof.The reaction energy is thus given as where σ and σ refer to the initial and final coverages of the overall lattice, and E gas is the change in the energy of gas species; for instance, if adsorbed CO and O gave rise to CO 2(gas) , then E gas would be equal to the energy of CO 2 , approximately −3.1 eV referenced to CO and 1/2O 2 in the gas phase.In practice, one does not need to recalculate from scratch the energy of the overall lattice in the final state H(σ ).Rather, after computing H(σ ) the energetic contributions of the reactants are subtracted therefrom, the clusters in which the products can participate are detected and the corresponding contributions are added therein to obtain H(σ ).Given the reaction energy, the activation energies of the forward and reverse elementary steps must satisfy the following relation to ensure microscopic reversibility (see Figure 1): The forward activation barrier can be parameterized by means of a BEP relation and ab initio data. 51Thus, for a given configuration, the forward activation energy can be expressed by a linear relation that involves the activation energy at the zero coverage limit, and the difference between the reaction energies at the given configuration and the zero coverage limit: In the equation above, E k, ‡ fwd,0 and E k rxn,0 are the forward activation barrier and reaction energy at the zero coverage limit   14)- (18).The blue solid path pertains to the reaction energy profile in the limit of zero coverage.For the red dotted path in this example we have assumed that repulsive interactions between the reactants and neighboring adsorbates raise the energy of the initial state.
(see Figure 1), and ω is a parameter referred to as the proximity factor. 63The max operator ensures that the activation energy will be at least zero, or equal to the reaction energy if the latter is positive (otherwise the transition state energy would lie between the energies of reactants and products).In order to satisfy Eq. ( 15) the reverse activation energy has to be parameterized as Equations ( 16)-( 18) allow us to compute the activation barriers in a thermodynamically consistent way, as prescribed by the cluster expansion Hamiltonian (13) and the energetics of the gas species.The rate constant for reaction i can be then computed according to transition state theory as 23,32,33,64,65 where Q ‡ and Q R denote the quasi-partition functions of the transition state and reactants, respectively; h denotes Planck's constant, k B Boltzmann's constant, and T the temperature.
For details on the calculation of the quasi-partition functions and the rate constants for particular elementary events (for instance Eley-Rideal reactions) the reader can consult the supplementary material of Ref. 33.Note that fitting the ab initio-computed activation energies to a BEP scaling relation introduces error.Our KMC approach however does not necessarily require the fitting of the activation energies to the BEP: if the activation energies for specific neighboring configurations are available, one can define several elementary reactions in which the arrangements of the spectator species (within the interaction range) and the corresponding activation energies are explicitly defined.However, this procedure is cumbersome from a practical standpoint, and thus, one resorts to an approximation, such as the one provided by the BEP relation.The latter, while not quan-titative, is crucial for the qualitative understanding of the process of interest.

D. Algorithmic implementation
As implied by Eq. ( 13), computing the energy of the lattice necessitates detecting all energetic clusters and summing their contributions to the total energy.For bookkeeping purposes, the graph-theoretical KMC algorithm utilizes a data structure that stores the mapping information for each energetic cluster (namely, the lattice sites involved in the cluster), and also the inverse mapping that gives the indexes of the energetic clusters in which each adsorbed entity participates.This scheme is similar to that used for handling the elementary processes as described in Ref. 52.Thus, at every KMC step one knows the total lattice energy and all the energetic contributions it comprises.This information is used in calculating reaction energies and activation barriers as discussed in the following.
To calculate the activation energy for newly detected lattice processes (for instance an Eley-Rideal reaction step), the algorithm makes a temporary change in the state of the lattice, thereby executing that elementary step.In the final state, it detects the energetic clusters in which the products participate and computes the energy of the lattice from Eq. ( 13).The difference between final and initial state energies, calculated from Eq. ( 14), yields the reaction energy for the given configuration E k rxn (σ ).To find the reaction energy at the zero coverage limit, the algorithm performs a second computation, in which the clusters involving only the reactant entities are taken into account at the initial state, and similarly, the clusters involving only the product entities are considered in the final state, thereby neglecting interactions with neighboring adsorbates.The difference between the two energies thus computed gives E k rxn,0 .Subsequently, the activation barriers can be computed via Eq.( 16) or (17) depending whether the lattice process in discussion is a forward or a reverse step of an elementary event.Note that the user has to supply as input the forward activation energy at the zero coverage limit E k, ‡ fwd,0 , as well as the proximity factor ω. The reverse activation energy at zero coverage E k, ‡ rev,0 is computed from Eq. (18).
The procedures just discussed take place whenever a new lattice process is detected in the course of a KMC simulation.The pseudocode for the whole algorithm has been given in our previous article; here we give a brief outline with specific focus on the calculation of energetics and kinetics.constant, and generate a random time at which it will occur.Put these times in an event-queue.5.While t < t final 5.a.Find the process μ that will occur next and update the time.5.b.Remove the reactants from the lattice.Also remove their energetic contributions from the energetics data-structure and the processes which they participate in from the event-queue.5.c.Add the products of process μ in the lattice.Detect their energetic cluster contributions and store them in the energetics data-structure.Update the lattice energy.5.d.For every product, detect the elementary events in which it can participate.For each of these events: 5.d.i.Call subroutine CalculateKineticConstant to find the rate for the newly detected event.5.d.ii.Given the rate calculated from step 5.d.i, generate a random time for the occurrence of that event and store it in the event-queue.5.e.For each event within the neighborhood of the lattice process μ that just took place: 5.e.i.Call subroutine CalculateKineticConstant to find the current rate in the presence of the new environment.5.e.ii.Given the rate calculated from step 5.e.i, generate a random time for the occurrence of that event and update the corresponding entry in the event-queue.6. Repeat 7. Terminate Subroutine CalculateKineticConstant 1. Calculate the initial state energy at the given configuration: from the energetics data-structure find the clusters in which reactants and neighboring adsorbates participate and sum up the contributions.2. Calculate the initial state energy at the zero coverage limit: repeat the above calculation for clusters in which only reactants participate.3. Make a temporary change in the state of the lattice to represent the final state.4. Calculate the final state energy at the given configuration: detect the energetic clusters in which products and neighboring adsorbates participate and sum up the contributions. 5. Calculate the final state energy at the zero coverage limit: repeat the above calculation for clusters in which only products participate.6. Revert the change in the state of the lattice done in step 3 of this subroutine.7. Calculate the energy of reaction at the given configuration, E k rxn (σ ), and the zero coverage limit, E k rxn,0 , from Eq. ( 14). 8. Calculate the activation energy from Eq. ( 16) if this is a forward step of an elementary event, otherwise use Eqs.( 17) and (18).9. Calculate and return the kinetic constant from Eq. ( 19).

E. Computational implementation and parallelization strategy
The above procedures were coded in our FORTRAN2003 software package, Zacros, which is currently available from the UCL online licensing portal, e-Lucid. 66The package features an easy-to-learn keyword-based language for defining a simulation, and generates detailed output of the lattice state and energetics, reaction occurrence statistics, and numbers of molecules produced or consumed for each of the species participating in the chemistry.In addition, one can run simulations in debugging mode, during which the program provides an account of the changes in the key data structures during the simulation.This information can subsequently be used to efficiently troubleshoot a KMC simulation, if needed.
Using code profiling it was determined that the majority of the computational cost is due to the event time update operations in the loop of step 5.e.Thus, to improve performance, OpenMP parallelization was implemented over that loop.This scheme is applicable to shared memory architectures (for instance a single node in a computational cluster), and is based on the new reaction rate constants being calculated in parallel and stored into thread-private arrays.Subsequently, the random reaction occurrence times are generated serially and inserted into the event-queue, by looping over the threads and the processes assigned to every thread (double loop).This scheme produces simulation results which are identical to those generated by the serial version of the code.

F. Setting up a benchmark model
In a recent article, Schneider and co-workers, presented a set of cluster expansions for atomic oxygen adsorbed on Pt(111). 45Careful error analysis of these expansions showed that for an accurate representation of the ground states of this system one needs to include up to 8 th nearest neighbor pairwise additive interactions and at least two triplets (three-body contributions).Larger expansions were shown to be able to fit the DFT results quantitatively, but this essentially resulted in fitting the DFT error as well (overfitting).
In a subsequent study, Schneider and co-workers used the cluster expansion the strikes the best balance between accuracy and computational efficiency to study NO oxidation on Pt. 51 Thus, they performed equilibrium MC simulations to calculate the distribution of activation energies and the overall rate of O 2 dissociation on the surface, given the chemical potential of O.The latter was calculated from the ratio of NO/NO 2 in the gas phase assuming that this mixture is in quasi-equilibrium with adsorbed O. Thus, the rate of NO oxidation was computed as one-half the rate of O 2 dissociative adsorption.In this work, associative desorption of O adatoms was neglected.
To benchmark our code we set up a KMC model for the NO oxidation using as input the energetic models (cluster expansions) and kinetic parameters by Schneider and coworkers. 45,51,67 T article is copyrighted as indicated in the article.Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.40

Energetics
][73][74] The slab models contained 4 layers of Pt (with only the bottom one fixed), a layer of oxygen adatoms and four vacuum layers, on p(1 × 1) up to p(4 × 4) supercells to investigate a variety of O coverages.This computational scheme was shown to incur a maximum error of 4 meV per adsorbed oxygen.The error due to uncertainty of the O 2 energy calculated from Generalized Gradient Approximation (GGA) was estimated to be 0.2 eV per adsorbed oxygen.For more information about the DFT methodology and its accuracy, the reader is referred to Ref. 45.
Schneider and co-workers 45,51 thus developed four cluster expansion Hamiltonians of increasing accuracy by including 3, 5, 8, and 12 figures (clusters).The parameters reported in that work are for the Ising Hamiltonian formalism, which uses values of ±1 for the state of a single site, denoting spin up or down.For catalytic systems it seems more natural to adopt a lattice-gas Hamiltonian formulation in which the state of a single site takes values of 0 and 1, denoting a vacant or an occupied site accordingly.The mapping from the Ising parameters to the lattice-gas ones is discussed in detail in Sec.I of the supplementary material, 76 along with example calculations, and allows us to calculate the parameters of Table I from those of Table 2 in Ref. 45.These parameters correspond to the energetic contributions of the patterns shown in Figure 2. Note that the triplet 1-2-3b has a top site in the triangle sketched in Figure 2, and therefore, the only acceptable reflections and rotations are the ones drawn therein.Reflections in the horizontal axis would result in an hcp site inside the triangle and the figure would then be the 1-2-3a, which had not been included in this cluster expansion. 45,51

Kinetics
Oxygen adatoms are bound to fcc sites which is the only site type accounted for, in line with Refs.45 and 51.In our model, the following reversible elementary events are explicitly considered: O 2(g) + * where the last reaction denotes diffusion of O between neighboring sites.The graph patterns of these reactions as implemented in the graph-theoretical KMC are shown in Figure 3.
In the following, we summarize the equations giving the rate constants of the above reactions.For a detailed discussion on how these equations were derived, the reader is referred to Sec.II of the supplementary material. 76 Oxidation of NO and reduction of NO 2 .The 1 st reaction (NO oxidation and reduction) was considered to be in quasiequilibrium conditions by Wu et al. 51 In the present work, the quasi-equilibrium conditions are ensured by taking the forward and reverse steps of reaction ( 20) to be fast.Thus, we parameterize the kinetics constants thereof as follows: where A is a parameter that can be given a high value to ensure quasi-equilibration.The reduction and oxidation rates are proportional to the partial pressures of NO 2 and NO, P NO 2 , and P NO , because of the participation of these gaseous species in the corresponding reactions.Moreover, Q surf vib,O denotes the vibrational quasi-partition function of O adatoms (frequencies of 429, 380, 377 cm −1 , calculated from DFT and used in our calculations), 45 and ZPE surf O the zero point energy thereof.Thus, the terms inside the curly brackets constitute the oxidation pre-exponential entering the KMC input.The activation energy at the zero coverage limit for this reaction is assumed to be negligible, and the reaction energy appearing in the exponent of the last term, is automatically handled by the KMC code.This reaction energy has three contributions, namely, the Gibbs free energies of formation of gas NO 2 , f G o NO 2 (T, 1 bar), and NO, f G o NO (T, 1 bar), calculated from the formulas given in the NIST WebBook, 75 and the energy of formation of the O adlayer with respect to gas phase O 2 , FE surf O calculated from the cluster expansions.

b. Adsorption of O 2 . The rate of O 2 adsorption is given as
In the above equation the orientational sticking factor κ ads was arbitrarily taken to 1/6 as in Ref. 51.The activation energy for the adsorption step follows the BEP relationship of Ref. 51: where FE surf 2O is the energy of formation of an O adatom pair on neighboring sites on the lattice.In order to implement this BEP relationship within the KMC framework (Eqs.( 16) and ( 17)) we need to find the activation energy at the zero coverage limit and the proximity factor.The former is E ‡ ads,0 = 2 • ECI O_Point + ECI O_pair_1NN + 2.12 eV, (26)   where ECI O_Point is the 1-body term in the cluster expansion Hamiltonian, and ECI O_pair_1NN is the first nearest neighbor effective cluster interaction.Note that negative barriers are replaced by a zero value in Eq. ( 25).Finally, the proximity factor is equal to unity, which is expected as the transition state is a surface bound O 2 precursor.

c. Desorption of O 2 . The rate constant of O 2 desorption is given as
where the terms in curly brackets will be the pre-exponential entering the KMC input.

d. Diffusion of adsorbed O.
An oxygen adatom can diffuse to a neighboring site with a rate that depends on the neighboring environment of the initial and the final site.Since the diffusion process is symmetric, we only consider the forward rate: To simulate the fast equilibration of the adsorbate overlayer, we adjusted the pre-exponentials for diffusion, A dif , and NO oxidation/NO 2 reduction, A , such that: (i) the rates of these processes are at least 50 times larger than the O 2 dissociative adsorption (or O 2 associative desorption), and (ii) these processes are in partial equilibrium (forward divided by reverse rate is close to unity).As an exemplar parameter set, Table II provides the pre-exponentials and activation energies (at the zero coverage limit) for the aforementioned elementary events at 480 K and ln(P NO2 /P NO ) = −4.0 for the 12-figure cluster expansion.

A. Hull of ground states computed through simulated annealing
To validate the correct implementation of the cluster expansion Hamiltonian in the graph-theoretical KMC algorithm, we first computed the convex hull of the ground states for O on Pt(111).For each of these calculations, only diffusional hops were simulated at a given coverage.The lattice was initialized with a configuration of oxygen adatoms placed randomly on the lattice, such that the probability of occupancy of any site was equal to the coverage.Such configurations obviously have a high energetic cost, as they are far from the ordered adlayer structures observed at equilibrium.Subsequently, the system was allowed to relax while reducing the temperature linearly in time, thereby performing a simulated annealing calculation.After the system relaxed to a final state at a low temperature, we performed a second simulated annealing step, by restarting the system from that final state and the former initial temperature, and letting it relax again.This simulation setup ensures that the configuration minimizing the energy (ground state) will be found if the cooling rate is low enough.
The results of these calculations are shown in Figure 4. Panel (a) shows the temporal evolution of energy for a simulated annealing calculation on a 42 × 42 lattice (1764 sites) at a coverage equal to 0.35 ML.The energy is observed to drop quickly at initial times, as the adlayer relaxes.Moreover, as the temperature decreases linearly in time, the energy fluctuations are shown to drop, until no more events can happen.Figure 4(b) displays the convex hull of ground states, with some prominent structures noted thereon.These results are in agreement with the previously published ones by Schneider and co-workers, 45 thereby validating our methodology.

B. Dynamic simulations of NO oxidation and NO 2 reduction
To further test our methodology against the previously published results, 51 we simulated the full dynamic model (reactions ( 20)-( 22), Figure 3) containing, in addition to O diffusion, NO oxidation, NO 2 reduction, and O 2 dissociative adsorption.The reverse process of the latter was neglected for these comparisons, so that the latter are performed on an equal footing.We will come back to this point later.
The results of these simulations are summarized in Figure 5. Panels (a) and (b) show transients of the oxygen coverage and the number of NO 2 molecules produced or consumed per site.For this simulation, the lattice was initialized with a coverage of 0.25, which is lower than the stationary coverage of approximately 0.31.Thus, NO 2 reduction happens rapidly at initial times leaving oxygen adatoms on the surface.After reaction (20) has reached quasi-equilibrium, O 2 dissociation and slow NO oxidation are observed.A statistical analysis of the event occurrence is shown in the bar graph of Figure 5(c), which depicts the forward and reverse rates for each of the elementary steps taken into account in the model.It is evident that NO oxidation and NO 2 reduction are in partial equilibrium.O diffusion is observed to be in equilibrium as well.Since the diffusion step is symmetric with respect to the renumbering of the two sites, the observed partial equilibrium indicates that the energetic interactions of neighboring adsorbates have been correctly implemented in the forward and reverse diffusion rates.Finally, Figure 5(d) shows the turnover frequencies (TOFs) calculated by counting the NO 2 molecules produced per site per time after discarding the initial transient (see Figure 5(a)).Higher NO 2 /NO ratios result in higher coverages of oxygen on the surface and thus TOFs.The latter are in excellent agreement with previously published results 51 even though they have been computed with different methodologies.As previously mentioned, the simulations of Figure 5 neglect O 2 associative desorption for the comparison with published results to be valid.Moreover, from a physics standpoint, practical NO oxidation occurs at conditions at which O 2 desorption is not important.For high NO 2 /NO ratios, however, it is expected that the rate of this step will be significant and dominate the mechanism, resulting in NO 2 reduction.One can actually compute the NO 2 /NO ratio for which the overall chemistry will be at equilibrium given the molar fraction of O 2 : and therefore, ) where the change in the Gibbs free energy of the overall reac- Simulations in which the O 2 associative desorption is taken into account are shown in Figure 6.Panel (a) shows the TOFs for a range of conditions, with the critical NO 2 to NO ratio computed from Eq. ( 30) noted in the plot.We observed that the KMC simulations correctly capture the transition from  NO oxidation on the left of the vertical lines, to NO 2 reduction on the right of the lines, where the number of gas NO 2 molecules decreases in time.This observation also verifies the correct implementation of the energetics model in the KMC simulation.It is interesting to observe that the net NO 2 reduction rate exhibits a negligible increase for higher NO 2 /NO ratios.This can be attributed to the activation barrier for the associative desorption of oxygen being insensitive to the surface coverage thereof.Indeed, the proximity factor for the reversible O 2 adsorption (reaction ( 21)) is equal to 1 (see Table II), which means that the activation energy for oxygen desorption will be only weakly affected by lateral interactions.In the case of high enough coverages, for which O 2 dissociation is activated, the oxygen desorption barrier will be constant (see Eq. ( 17)).From a physical standpoint, this insensitivity means that the energetic interactions exerted by neighbors to the two O adatoms about to dissociate are similar to those exerted to the transition state (O 2 precursor).The slight increase in the NO 2 reduction rates for NO 2 -richer mixtures is due to the higher numbers of neighboring O pairs on the surface, resulting from the higher oxygen coverages in this regime.
To evaluate the performance of the different CEs in predicting the TOFs we performed a set of calculations with progressively more accurate expansions.In addition to the four CEs already presented, we generated two more, capturing up to 2nd and 5th nearest-neighbor interactions (4-and 7-figure CEs, respectively).The latter CEs included at most 2-body terms and were generated by refitting configurationenergy data generated from the 12-figure CE.The configurations were randomly generated for a 10 × 10 lattice, and coverages uniformly distributed from 0 to 1.The parameters of these CEs are given in Sec.III of the supplementary material. 76igure 6(b) shows the results of simulations of the full NO oxidation NO 2 reduction system at T = 480 K and ln(P NO 2 /P NO ) = −1.0.Simulations were performed for 4 different lattice sizes to exclude the possibility of error originating from small size effects.It is evident that the 3-figure CE, which includes only 1st nearest-neighbor pairwise additive interactions, errs by about 3 orders of magnitude.Inclu-sion of 2nd nearest-neighbor interactions (4-figure CE), while providing some improvement, still underpredicts the TOF by 36-fold.The remaining cluster expansions (5-figure, 7figure, 8-figure, and 12-figure CE) give similar results.These observations underscore the importance of including longrange interactions, for our purposes at least up to 3rd nearestneighbor interactions.Taking into account multi-body terms, while certainly important for the correct prediction of adlayer structures, 43,45 does not seem to be critical for TOF estimations.

C. Performance benchmarks for parallel algorithm
The calculations involving long-range interactions are extremely computationally intensive: apart from the detection of energetic cluster contributions every time a new event is detected (steps 5.d of the pseudocode in Sec.II D), the algorithm also has to deal with the updates of existing lattice processes after the execution of an elementary event.To improve the performance of the algorithm, these updates were processed in parallel (see Sec. II E).To evaluate the performance of the algorithm we performed two benchmarks: (i) for a given number of threads we investigated how the computational time scales with respect to lattice size; (ii) for a given lattice size, we examined the speedup with respect to the number of threads.All benchmarks were performed on the computational cluster Legion@UCL on type U nodes with 16 cores.Furthermore, we allocated a full node independent of the thread count of the job to prevent any backfilling jobs from interfering with the benchmarking.
The results of the first benchmark are shown in Figure 7, which portrays the computational time required to propagate the system for 1 simulated second with respect to the number of sites in the lattice.We see that the computational time scales almost linearly with respect to the number of sites.This is intuitively expected as the number of lattice processes executed per unit time increases linearly with respect to the size of the system.Moreover, the computational expense of incorporating long range interactions into the KMC simulation becomes evident by the computational time for the 12-figure CE being more than 4 orders of magnitude higher than that of the 3-figure CE.Improving the performance of the algorithm is thus of paramount importance.
Thus, Figure 7(b) shows the results of the second benchmark pertinent to the achieved speedup for the parallelized simulations.We observe that the improvement in the performance reaches a plateau as the number of threads increases.To understand why this is the case, we need to recall how our parallelization strategy works: the processes to be updated are first identified serially.Then, for each of these processes, the new activation energy and rate constant are computed; these calculations are the most intense and are performed in parallel.Subsequently, all new rate constants are collected from the different threads; this introduces a synchronization overhead.Finally, the new event occurrence times are computed serially and the queue of lattice processes is updated.If the number of threads is small with respect to the number of processes, the time required for the new rate constant calculations (in parallel) dominates, and the addition of more threads increases the speedup.On the contrary, if the number of threads is large, the synchronization when collecting the new rate constants is the "bottleneck" and the efficiency through parallelization is overshadowed by this synchronization overhead.Increasing the number of threads has a small effect on the speedup, which explains the plateaus observed.This plateau behavior is reached "more quickly" (i.e., at lower number of threads) for the smaller CEs.This effect can be explained by the fact that the number of lattice processes in need of update after a KMC step increases for longer range interactions.In this case, the workload distribution via parallelization results in high efficiency, as opposed to when the number of updated processes is low, in which case the aforementioned communication overhead is significant.

IV. SUMMARY AND CONCLUSIONS
In view of the increasing interest in accurate predictions of reactions on catalytic surfaces, there is a need for general, flexible, and accurate kinetic modeling approaches.The graph-theoretical KMC method previously developed by Stamatakis and Vlachos 52 allows for the direct incorporation of structurally complex elementary reactions computed by ab initio methods into KMC simulation.Thus, steric exclusions due to multidentate binding configurations, as well as intricate neighboring patterns of reactants and products were treated in a natural way within that formalism.The energetic interactions of the adlayer were constrained however to pairwise interactions and detailed balance was cumbersome to implement.
Cluster expansion Hamiltonians can be used to overcome both challenges with the construction of general models of adsorbate energetics, which may include many-body and longrange lateral interactions.Such Hamiltonians have been used extensively in the past to investigate the thermodynamics of adsorbate overlayers but there is only a limited number of studies focusing on the kinetics of reactions occurring in such environments.In this work, we incorporated general cluster expansion Hamiltonians in the graph-theoretical KMC framework, thereby enabling the detail treatment of adlayer ener-getics in KMC simulation at any desired level of accuracy.Moreover, detailed balance is naturally incorporated into the framework, as the forward and reverse activation energies of a particular elementary step are contained to the thermodynamics of that step for the given neighboring environment.We have implemented this framework in Zacros, a computational KMC package written in FORTRAN2003, available through the UCL online licensing portal e-Lucid. 66To improve the efficiency of the calculations for energetic models with long range interactions we explored parallelization with OpenMP.
To validate and benchmark our framework we set up a kinetic model for NO oxidation based on previous work by Schneider and co-workers. 51We performed simulated annealing calculations to verify that KMC correctly predicts the convex hull of ground states, and by means of dynamic simulations at a range of conditions we showed that our calculated TOFs agree with the ones found in the literature.Moreover, we demonstrated the effect of O 2 associative desorption, which was not accounted for in the original model.In this case, the NO 2 /NO ratio in which the NO oxidation switches to NO 2 reduction was predicted by the KMC and was found to be in agreement with the ratio predicted purely by the thermodynamics of the gas phase reaction.We finally, showed that for accurate predictions of the TOF one needs to include long range interactions, while the 3-body terms are not critical.
It thus becomes evident that a KMC scheme incorporating detailed models for the adlayer energetics is a powerful tool for investigating surface kinetics.As massively parallel computer architectures become widely available, cluster expansion models with quantitative accuracy with respect to DFT calculations will be easily handled in KMC simulations.This will enable computational studies of the molecular-scale processes on catalytic surfaces at an unprecedented level of detail, unraveling new phenomena for chemistries of theoretical and practical importance.

FIG. 1 .
FIG.1.Energy with respect to reaction coordinate showing the energeticsand kinetics-related quantities entering Eqs.(14)-(18).The blue solid path pertains to the reaction energy profile in the limit of zero coverage.For the red dotted path in this example we have assumed that repulsive interactions between the reactants and neighboring adsorbates raise the energy of the initial state.

FIG. 2 . 43 NOFIG. 3 .
FIG.2.Energetic interaction patterns taken into account in the cluster expansion Hamiltonians.Filled blue circles represent sites occupied by an oxygen adatom.Open blue circles represent sites that may or may not be occupied (their state is represented by the symbol & in Eq. (5)).The graph multiplicity for each of the pairwise interaction patterns is equal to 2. For the triplets, the graph multiplicity of the 1-1-3 pattern is also equal to 2, whereas the 1-2-3b patterns have multiplicities equal to 1.This article is copyrighted as indicated in the article.Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.40.74.17 On: Thu, 12 Dec 2013 16:04:43

FIG. 4 . 43 FIG. 5 .
FIG. 4. Results of the simulated annealing calculations.(a) Energy versus time for θ = 0.35, T ini = 100.5K, T ramp = −2 K/s.The energy fluctuations drop over time and the ground state configuration is reached.(b) Convex hull of ground states.Prominent ground state structures are noted.This article is copyrighted as indicated in the article.Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.40.74.17 On: Thu, 12 Dec 2013 16:04:43

FIG. 6 .FIG. 7 .
FIG. 6. Simulation results for the full NO oxidation model.(a) The vertical dashed lines correspond to the equilibrium NO 2 /NO ratios for y O 2 = 0.1 and the three different temperatures (480 K, 580 K, and 680 K from right to left).On the left side of each line, NO oxidation is the net reaction, whereas on the right side, NO 2 reduction.(b) Turnover frequency predictions for six cluster expansions and progressively larger lattices.The conditions for these simulations were as follows: T = 480 K, P = 1 bar, log(y NO 2 /y NO ) = −1.0.The lattice sizes used are noted in the legend, with the largest lattice (84 × 84) having 7056 sites.This article is copyrighted as indicated in the article.Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.40.74.17 On: Thu, 12 Dec 2013 16:04:43

0 .
Start 1. Define simulation lattice, conditions, participating species, elementary steps, energies of gas species, and cluster expansion Hamiltonian.2. Initialize the lattice state and set the time clock to t = 0. 3. Detect and store all energetic cluster contributions in the energetics data-structure.Sum them up to obtain the lattice energy.
4. Detect all elementary events that can happen.For each such event, compute the activation energy and kinetic This article is copyrighted as indicated in the article.Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.40.74.17 On: Thu, 12 Dec 2013 16:04:43

TABLE I .
Energetic contribution parameters (in meV) entering the latticegas Hamiltonian for O on Pt(111).

TABLE II .
Exemplar kinetic parameters for the NO oxidation KMC model at 480 K and ln(P NO 2 /P NO ) = −4.0 for the 12-figure CE.