Hydrocarbon pyrolysis is a complex process involving large numbers of chemical species and types of chemical reactions. Its quantitative description is important for planetary sciences, in particular, for understanding the processes occurring in the interior of icy planets, such as Uranus and Neptune, where small hydrocarbons are subjected to high temperature and pressure. We propose a computationally cheap methodology based on an originally developed ten-reaction model and the configurational model from random graph theory. This methodology generates accurate predictions for molecule size distributions for a variety of initial chemical compositions and temperatures ranging from 3200 to 5000 K. Specifically, we show that the size distribution of small molecules is particularly well predicted, and the size of the largest molecule can be accurately predicted provided that this molecule is not too large.

Complex chemical systems, in which a large number of chemical species are generated via numerous types of chemical reactions, play a key role in a variety of applications, such as combustion,1,2 astrophysics,3–5 or battery interfaces.6 Understanding the kinetic and thermodynamic forces involved is crucial for making accurate predictions of the evolution of these systems. However, estimating equilibrium chemical composition for such systems is extremely complicated. A common way to obtain the equilibrium molecular composition is by running atomistic simulations.7–9 While these simulations eventually equilibrate, the equilibration typically takes a long time, which can be measured in weeks or even months depending on the conditions. Furthermore, each modification of the conditions, such as the temperature and the initial molecular composition, requires a brand-new simulation.

An example of a chemical system facing these challenges is the pyrolysis of hydrocarbons at high temperature and pressure,10–14 which we consider in this work. Hydrocarbon pyrolysis occurs in the interior of icy giant planets, such as Neptune and Uranus, where small hydrocarbons, such as methane, are subject to high temperatures, typically on the order of several thousand kelvins, and high pressure, typically on the order of tens of gigapascals.3–5 Small hydrocarbons react to form long carbon structures and these long structures can undergo phase transition to a solid phase, such as diamond. The conditions where diamond is formed have been studied in several experimental works.3,15,16 However, these experiments do not always agree with each other, even within conditions of temperature and pressure that are similar. These disagreements can be explained by the fact that the experimental set-ups are different between these experiments. However, a lack of understanding of the precise mechanism of hydrocarbon pyrolysis prevents researchers from having a clear understanding of the forces that are in play and their effect on the kinetic and thermodynamic equilibrium of this system. Indeed, as these experiments are run in extreme conditions on small time scales, it is very challenging to obtain precise concentrations of the different structures that could be used to obtain these mechanisms.

As experimentally studying the hydrocarbon pyrolysis mechanism is intricate, many researchers have studied it via modeling and simulation. Many widely used combustion kinetic models17,18 can make predictions for hydrocarbon pyrolysis; however, these models assume a gaseous phase, whereas in the conditions of the interior of Neptune, hydrocarbons are in a liquid phase. In addition, these kinetic models only consider species with less than 20 atoms and hence are not suitable for the study of the growth of large carbon structures that occur in this system. Ancilotto et al.19 instead performed ab initio Molecular Dynamics (MD) simulations to observe the evolution of methane pyrolysis at high pressure and temperature. They observed some dissociation of methane below 100 GPa and 5000 K and the formation of diamond above 300 GPa. However, due to the high computational cost of ab initio MD simulations, the simulations were only run for 2 ps with 16 CH4 molecules. At these scales, it can be difficult to know if a system is at equilibrium or if it needs longer than a few picoseconds to react. It can also be challenging to observe the different reaction pathways that methane can undergo during its pyrolysis.

As an alternative to the high computational cost of ab initio MD simulations, classical MD simulations relying on force fields can be performed. Particularly, the ReaxFF force field20–23 is an alternative to study hydrocarbon pyrolysis simulations in conditions close to the ones in Neptune or Uranus. Indeed, the parameters of this force field developed for combustion and pyrolysis of hydrocarbons have shown several successes in reproducing experimental Arrhenius parameters,7,23–26 DFT-calculated energies,25,27 and experimentally observed reaction pathways23,28,29 in conditions of temperatures between 2000 and 4000 K. In addition, these parameters were shown to be relevant for the study of condensed carbon phases by Orekhov et al.31 where they validated Ashraf et al. parameters23 with DFT calculations. The calculations by the ReaxFF forcefields20–23 developed for pyrolysis and combustion have not yet been validated with DFT calculations for the conditions of pressures in Neptune and Uranus (tens of gigapascals), which is the range of pressures we study in this work.

Although ReaxFF had several successes, several of its limitations have also been reported: for example, it failed to reproduce some ab initio energies,32 some common reactions were not observed in ReaxFF simulations,32,33 and excited electronic states are not taken into account. However, due to the prohibitive computational cost of ab initio MD simulations and the intricacy of doing experiments, ReaxFF is a sensible alternative to study hydrocarbon pyrolysis in the conditions studied here, with reasonable time scales and system sizes. Unfortunately, ReaxFF simulations can still take several days to run, which can make it time-consuming to know the evolution of hydrocarbon pyrolysis in different conditions.

In the works of Yang et al.10,12–14 and Chen et al.,11 kinetic models for hydrocarbon pyrolysis were built from ReaxFF MD simulations data. These models were able to reproduce the concentrations of small molecules as well as the bulk counts for larger structures arising in the MD simulations, but no information on the size of the larger structure was given. Another kinetic model for the growth of polycyclic aromatic compounds has been developed by Wang et al.;34 nonetheless, this model does not study structures that could lead to diamond formation.

In the recent work of Reed’s group,35,36 a “local” kinetic model capable of predicting the dynamics of counts of small and large carbon molecules was developed. In this model, trained on ReaxFF MD simulations, the reactions were described in terms of local changes at the reactive site. The local kinetic model was able to accurately reproduce the time series of the training and test MD data of counts for the largest hydrocarbons in the system. Remarkably, the initial composition and the temperature of the training set was different from that of the test set. Nevertheless, the local kinetic model suffers from a major drawback: full trajectories, using kinetic Monte Carlo simulations,37,38 need to be run in order to obtain the equilibrium composition. Therefore, obtaining equilibrium molecular compositions in many different conditions would require significant time. In addition, this time would scale with the size of the system making it impractical to study larger size systems.

One broad goal of the scientific community studying hydrocarbon pyrolysis is to predict the structures and sizes of the products of hydrocarbon pyrolysis from any initial conditions. Of particular interest are the conditions of the interior of Neptune and Uranus where this process occurs continuously. As these structures are difficult to observe experimentally or in ab initio simulations, we chose to work with the structures observed in classical MD simulations run with the ReaxFF force field. Therefore, the goal of this work is to develop an easy-to-implement and fast-to-run model for hydrocarbon pyrolysis that predicts the equilibrium distribution observed in ReaxFF MD simulations of molecule sizes measured in the numbers of carbon atoms per molecule for any given initial temperature within the range of 3200–5000 K and any initial composition of hydrocarbons. The specific objective of this work is to develop an approach based on random graph theory that can quickly replicate the results of MD simulations at various temperatures and initial molecular compositions using a small amount of MD data for training.

The proposed methodology consists of two major components.

  • We introduce the ten-reaction model that predicts the degree distribution given the temperature and the initial composition. This model accounts only for local reactions of the form

CiH+CjHCi+1Cj+1+HH,

where the subscripts indicate the number of carbon–carbon bonds (the degree) that the reacting carbon atoms have. The number of the reactions, ten, is determined by the fact that the number of carbon–carbon bonds per carbon atom takes values only between 0 and 4. The model is calibrated on MD data with initial composition consisting of 64 C4H10 molecules at various temperatures.

  • We use the configuration model from random graph theory39–43 to predict molecule size distribution given the degree distribution of a system, calculated with the ten-reaction model, or directly extracted from MD data. The approach based on generating functions introduced by Newman, Strogatz, and Watts41 is employed.

All the simulations studied here are at a pressure of 40 GPa, and no pressure extrapolation was investigated in this work. Indeed, contrary to temperature extrapolation where the Arrhenius model is available, there are no readily available model for pressure extrapolation. Developing such a model was considered out of the scope of this work and is left to future work.

Knowing only the temperature and initial composition, we can use the ten-reaction model to estimate the degree distribution from which the Newman–Strogatz–Watts generating function formalism allows us to calculate the equilibrium distribution of hydrocarbon sizes we desire. The need for any ordinary differential equation (ODE) solver, such as kinetic Monte Carlo, is completely eliminated, as these models are analytical or based on random sampling, which makes them computationally cheap. Furthermore, one can predict the histograms of small molecule counts and the histogram for the largest molecule size by sampling random graphs for the configuration model. The flowchart of the proposed methodology is shown in Fig. 1.

FIG. 1.

The flowchart of the proposed methodology.

FIG. 1.

The flowchart of the proposed methodology.

Close modal

The raw data used for training and testing the model developed in this work were extracted from ReaxFF MD simulations. All these data were extracted when the simulations were at equilibrium. Equilibrium was assumed to be reached once the counts of each of the most common small molecules, CH4, H2, and C2H6, and the size of the largest molecule converged to a steady state value. These extracted data include

  1. the time-averaged distribution for the numbers of carbon–carbon bonds per carbon atom, or, in other words, the time-averaged degree distribution for the graphs consisting of the union of the carbon skeletons of all molecules present at each time step;

  2. the time-averaged counts of molecules of each size, i.e., the molecule size distribution.

The MD simulations were run using the LAMMPS software44,45 and the KOKKOS package.46 The ReaxFF potential20,47,48 with the parameters described by Ashraf et al.23 was used as the interatomic potential. The initial molecules present in the simulations were exclusively C4H10, C2H6, CH4, C8H18, or a mix of molecules. The numbers of molecules for each simulation were chosen to have a total number of atoms between 800 and 1000 in each simulation. The simulations were run for a duration allowing the system to reach an equilibrium and remain in it for a sufficient period of time (see the supplementary material, Sec. S1 and Fig. S1, for the definition of equilibrium). The simulation was between 40 and 1000 ps. To take into account the variance between MD replicates, two MD simulations for each set of conditions were run. The simulations details, the conditions of each of the simulations, and the duration during which the simulations are at equilibrium can be found in the supplementary material, Sec. S1 and Table S1. The computation time to run the MD simulations was around 1 week on 40 CPUs for 500 ps of simulation.

An important aspect of extracting data from MD simulations is the criterion for when two atoms should be treated as bonded. We used the bond duration and bond length criterion proposed in Ref. 11: two atoms are considered to be bonded if the interatomic distance has been less than a threshold distance λ for a threshold duration τ. The values for λ were taken from Chen et al.:11λ = 1.98 Å for C–C bonds, λ = 1.57 Å for C–H bonds, and λ = 1.09 Å for H–H bonds. The bond duration was studied at different temperatures using the protocol developed by Chen et al.11 (see the supplementary material, Sec. S2), and the value τ = 0.096 ps was set for all temperatures.

The proposed methodology consists of two steps. Step one predicts the degree distribution for the random graph comprised by carbon skeletons of all molecules with the aid of the originally developed ten-reaction model. Step two calculates the molecule size distributions from the degree distribution using the generating function approach41 or a random graph sampling approach. Below we detail each of these steps.

In each MD simulation, tens of thousands of chemical reactions are observed. These reactions can be described at the molecular scale, where the whole molecules involved are described, or at the atomic scale, where only atoms surrounding the reactive site are described. A comparison of these two descriptions in the case of hydrocarbon pyrolysis was performed previously.35 Here, we wish to obtain the degree distribution of the carbons, which is an atomic property; therefore, we will describe reactions at the atomic scale. When observing the reactions at the atomic level, these reactions can be divided into three groups according to the change in the total number of the carbon–carbon bonds (CC-bonds) in the system:

  1. there is no change in the CC-bonds;

  2. a single CC-bond is added or removed; and

  3. more than one CC-bonds are added or removed.

The histogram of changes in the CC-bonds in the reactions observed in the MD simulation at T = 3300 K initialized with 64 molecules C4H10 is shown in Fig. S3. (Here and further the symbol S indicates that the referred object is found in the supplementary material.) In this case, ∼50% of all reactions observed did not change the total number of CC-bonds. Respectively, 24.25% and 24.32% of the reactions removed or added a single CC-bond. The remaining 1.4% of the reactions added or removed two or more CC-bonds. The reactions that do not lead to changes in CC-bonds do not affect the molecule size distribution measured in the number of carbon atoms in the molecules. Over 97% of the remaining reactions either add or remove exactly one CC-bond. These observations motivate our first simplifying assumption:

Assumption 1.Each reaction changes exactly one CC-bond.

As we are studying the equilibrium degree distribution, we consider only the molecules that are stable. This means that we ignore radicals as these molecules are only transition species and do not remain for a long time in this state. We also ignore double bonds. The reason for this assumption is that they are not prevalent. Except at temperatures above 4500 K more than 90% of carbons are bonded to four atoms, as can be observed in Fig. S4. The rest of the carbons are mostly bonded to three other atoms. These carbons bonded to three other atoms are either radicals or forming a double bonds and their total percentages represent less than 10% of all the carbons. Ignoring radicals and double bonds brings us to our second simplifying assumption:

Assumption 2.All carbons have four single bonds.

Using Assumptions 1 and 2, we obtain the following set of possible reactions:

CiH+CjHKijCi+1Cj+1+HH.
(1)

The subscripts of the carbons indicate their degrees, i.e., Ck is bonded to k other carbon atoms. Since the valence of carbon is 4, the degree k can be 0, 1, 2, 3, or 4, resulting in ten possible reactions. We note that these reactions are not elementary reactions and would likely need several elementary steps to occur. However, these reactions are considered to govern the degree distribution of the carbons at equilibrium.

Using this simplified mechanism, the equilibrium constants Kijs can be fitted to the data obtained with the MD simulations. The Kijs are fitted at different temperatures to MD simulations starting with C4H10. A least squares fit to the Arrhenius model is used to obtain the Kijs as a function of temperature.

Once the parameters for the Arrhenius models for Kijs are obtained, the degree distribution at equilibrium can be readily calculated given the temperature and the initial chemical composition of the system. The calculation consists in solving a system of nonlinear algebraic equations. The details of the fitting of the equilibrium constants Kijs, finding the degree distribution, and the uncertainty quantification for the degree distribution are provided in the supplementary material, Sec. S3.

The collection of carbon skeletons of hydrocarbon molecules at each step of an MD simulation can be viewed as the realization of a random graph whose vertices are the carbon atoms (C-atoms) and edges are the carbon–carbon bonds (CC-bonds). We remind the reader that a graph consists of sets of nodes and edges. In undirected graphs, each edge connects an unordered pair of nodes. The hydrogens are not represented in the graphs. The degree of a node is the number of edges that this node forms. In our case, the degree of a C-atom represents the number of CC-bonds that this carbon forms.

The degree distribution for this random graph can be extracted from the MD simulation or predicted as described in Sec. III A. Therefore, this random graph is an instance of the so-called configuration model39–41 briefly described below. A detailed derivation of this model is given in Ref. 41. Its adaptation for the case of hydrocarbons is presented in the supplementary material, Sec. S4.

Our goal is to obtain the probabilities πs, sN, that a randomly picked molecule has s C-atoms, using the degree distribution pk. For this purpose, we define the probabilities Ps, sN, that a randomly chosen C-atom belongs to a molecule containing s C-atoms. The distribution {πs} is readily obtained from the distribution {Ps},

πs=Ps/sjNPj/j,sN.
(2)

To obtain the distribution Ps from the degree distribution pk, we use the generating functions of these distributions.49 For any discrete probability mass function (PMF) {ak}, the generating function is defined by G(x)≔kakxk. Generating functions possess a number of useful properties. For example, since any PMF sums up to 1, we have G(1) = 1. The definition of the mean implies that ⟨k⟩ = G′(1). We denote the generating functions of the distributions pk and Ps by G0(x) and H0(x), respectively. While G0(x) can be readily written out (G0(x) = p0 + p1x + p2x2 + p3x3 + p4x4), finding H0(x) is nontrivial and requires more work.

To find H0(x), we need to define the so-called excess degree distribution {qk} as follows: When an edge is randomly picked and then one of its endpoints v is chosen, the probability that v has k other edges incident to it is

qk=(k+1)pk+1G0(1).
(3)

The generating function G1(x) of the excess degree distribution is readily found: G1(x)=G0(x)/G0(1). Using G1(x), we can obtain the generating function H1(x) for the probability distribution for the component sizes hanging off from one of the ends of a randomly picked edge. The calculation of H1(x) stems from the following two ideas. The first idea is the use of the power property of any generating function: if G(x) generates the distribution for some quantity associated with an object, then [G(x)]m generates the distribution for the sum of these quantities of m independent samples of the object. The second idea is that H1(x) must satisfy the following self-consistency relationship:

H1(x)=xG1(H1(x)).
(4)

An insightful graphical explanation for (4) was offered by Newman, Strogatz, and Watts.41 Figure 2 (top) displays a version of it adapted for our case where the excess degree distribution has at most four nonzero components: q0, q1, q2, and q3. Since G1(x) is a cubic polynomial, Eq. (4) with respect to H1 has three roots. Only one of them does not blow up as x → 0, allowing us to obtain a unique solution for H1(x).

FIG. 2.

A graphical explanation for the self-consistency relationship (4) (top) and Eq. (5) (bottom) for the case relevant to random graphs representing carbon skeletons of hydrocarbons where the functions G1 and G0 are cubic and quadratic polynomials, respectively. The symbols ■ and ★ denote the distributions H1 and H0, respectively. The circles represent vertices of the graph. Top: The size distribution H1 of the connected component that can be reached by picking a random edge and a random direction along it is a sum of the following four cases. Let v be the vertex defined by this random edge and this random direction. Case 0: the vertex v has excess degree 0 with probability q0. Case 1: v has excess degree 1 with probability q1; in this case, the size distribution of the connected component hanging off the other edge incident to v is again generated by H1. In Cases 2 and 3, v has two and three other edges incident to it with probabilities q2 and q3, respectively. The size distributions of the connected components hanging off each of those edges are generated by H1 as well. Bottom: The size distribution of a connected component containing a randomly picked vertex v is generated by H0, which adds up from the following five cases. Case 0: v has degree zero with probability p0. Cases 1–4: v has degree k with probability pk, k = 1, 2, 3, 4. In these cases, the size distributions of the connected components hanging off each edge incident to v are generated by H1. This figure is adapted from Ref. 42.

FIG. 2.

A graphical explanation for the self-consistency relationship (4) (top) and Eq. (5) (bottom) for the case relevant to random graphs representing carbon skeletons of hydrocarbons where the functions G1 and G0 are cubic and quadratic polynomials, respectively. The symbols ■ and ★ denote the distributions H1 and H0, respectively. The circles represent vertices of the graph. Top: The size distribution H1 of the connected component that can be reached by picking a random edge and a random direction along it is a sum of the following four cases. Let v be the vertex defined by this random edge and this random direction. Case 0: the vertex v has excess degree 0 with probability q0. Case 1: v has excess degree 1 with probability q1; in this case, the size distribution of the connected component hanging off the other edge incident to v is again generated by H1. In Cases 2 and 3, v has two and three other edges incident to it with probabilities q2 and q3, respectively. The size distributions of the connected components hanging off each of those edges are generated by H1 as well. Bottom: The size distribution of a connected component containing a randomly picked vertex v is generated by H0, which adds up from the following five cases. Case 0: v has degree zero with probability p0. Cases 1–4: v has degree k with probability pk, k = 1, 2, 3, 4. In these cases, the size distributions of the connected components hanging off each edge incident to v are generated by H1. This figure is adapted from Ref. 42.

Close modal

The generating function H0(x) can be obtained from H1(x) using the formula [see Fig. 2 (bottom)]

H0(x)=xG0(H1(x)).
(5)

The distribution {Ps} can be extracted from its generating function H0 using Cauchy’s integral formula [Eq. (S18)]. Then, the probabilities πs that a randomly picked molecule has s C-atoms can be obtained from {Ps} using Eq. (2).

Two different regimes can be observed for the molecule size distribution: either all molecules remain small (subcritical regime) or one molecule, called the giant component, and contains a substantial fraction of carbons at all times (supercritical regime). These two regimes are separated by the criterion39,41

k(k2)pk<0,subcritical regime>0,supercritical regime.
(6)

The distributions H0 and H1 account only for small tree-like connected components of a random graph. In the subcritical regime where there is no giant component, these distributions sum up to 1. If there is a giant component, H0 sums up to the fraction of vertices not lying in the giant component. Hence, when the system is in the supercritical phase, one can estimate the fraction S of vertices in the giant component using the following relationship:41 

H0(1)=G0(H1(1))=1S.
(7)

Therefore, the expected number of carbons in the largest hydrocarbon molecule is NC(1 − H0(1)), where NC is the number of carbons in the system. Equation (4) and Fig. 2 (top) imply that H1(1) = u < 1 in the supercritical phase, where u is the smallest positive solution to u = G1(u).

The model described in this section will be referred to as the “RGT” model in the rest of this paper. RGT stands for Random Graph Theory.

The algorithm for obtaining the molecule size distribution {πs} and the size of the giant component (if any) summarized in Sec. III B is extremely cheap: it runs in less than a second on a single central processing unit (CPU). However, it is based on the assumption that the number of vertices (carbons) is very large. An alternative way to predict molecule size distribution from the degree distribution is by sampling random graphs for the configuration model and applying standard graph algorithms, such as depth-first search,50 to find all connected components. Random graph sampling has several advantages. First, it allows us to fix the number of vertices to match the number of carbons in the considered system and hence eliminate the assumption that the system is large. Second, it offers a direct way to approximate any distribution we want. In particular, we are interested in the size distribution for the largest molecule. This distribution depends on the size of the system and hence cannot be obtained via the generating function approach. On the downside, it takes about a minute to generate and process a large enough collection of random graph samples on a regular modern laptop. Moreover, all estimates obtained using random graph sampling unavoidably contain statistical errors.

FIG. 3.

A comparison between the degree distributions predicted by the ten-reaction model and extracted from the MD simulations. Top: The degrees error Eq. (8) as a function of temperature and the H/C ratio is showed using color coding. Bottom: Examples of small E = 0.05 (left, green frame), medium E = 0.09 (middle, purple frame) and relatively large E = 0.13 (right, orange frame) errors arising in three different systems. The corresponding marks in the top plot are surrounded with squares of the corresponding colors.

FIG. 3.

A comparison between the degree distributions predicted by the ten-reaction model and extracted from the MD simulations. Top: The degrees error Eq. (8) as a function of temperature and the H/C ratio is showed using color coding. Bottom: Examples of small E = 0.05 (left, green frame), medium E = 0.09 (middle, purple frame) and relatively large E = 0.13 (right, orange frame) errors arising in three different systems. The corresponding marks in the top plot are surrounded with squares of the corresponding colors.

Close modal

Given the degree distribution and the total number of vertices NC (carbon atoms), we can create a collection of random graphs as outlined in Ref. 41. The main idea of this procedure is to generate a collection of vertices with “stubs” (i.e., edges sticking out of them) whose number obeys the degree distribution and then randomly pair the stubs:

  1. The carbon atoms are enumerated from 1 to NC. The first round(p0NC) carbons get zero stubs. The next round(p1NC) carbons get one stub each. The next round(p2NC) carbons get two stubs each, and so on.

  2. The total number of edges is one-half of the number of the stubs. If the number of stubs turns out to be odd, the last atom with zero stubs is given one stub. Then, the stubs are matched as follows: First, the stubs are enumerated. Next, a random permutation of the stubs is generated. Then, the stubs from the top half of this permutation are paired with the stubs of its bottom half. As a result, a realization of the random graph is generated.

  3. Finally, the realization of the random graph is postprocessed by removing self-loops and repeated edges.

This random graph sampling model will be referred to as the “RGS” model in the rest of the paper.

In this section, we apply our proposed methodology to predict the molecule size distribution for hydrocarbon pyrolysis. Since our methodology involves several components, we test each of them separately to achieve a more detailed assessment of its strengths and limitations.

We have assumed that the equilibrium constants Kij obey the Arrhenius law for any hydrocarbon composition. The validity of this assumption is demonstrated in Fig. S5 displaying Arrhenius plots for Kijs for a collection of training and test data for temperatures in the range 3200 K T 5000 K.

To compare the degree distributions predicted by the ten-reaction model and extracted from MD simulations, we define the following error metric:

E=i=04NCiNC10RMNCiNCMD,
(8)

where the superscripts 10RM and MD mean that the corresponding data come, respectively, from the ten-reaction model and from the average of two MD simulations.

Figure 3 visualizes this comparison. It can be observed that even for the highest error, the predicted degree distribution matches well the degree distribution extracted from the MD data.

FIG. 4.

Left and Middle Columns: The predicted molecule size distributions for hydrocarbons with at most 20 carbon atoms are compared to the ones extracted from MD simulations (solid black dots). The black error bars are equal to two standard deviations. The molecule size is measured by the numbers of carbon atoms s. The dashed blue curves and the solid red curves correspond to predicted size distribution by the random graph theory model (RGT). The dashed blue curves with error bars represent the results obtained using the degree distributions extracted from the MD simulations, while the red solid curves represent the results obtained using the degree distribution generated using the ten-reaction model (10RM). Right Column: The Wasserstein-1 distance W1(P, Q), where P is the molecule size distribution extracted from MD simulation and Q is the predicted molecule size distribution by the “RGT” model (top) and the “10RM + RGT” model (bottom) plotted as a function of the H/C ratio. The initial composition is indicated for each mark with circles indicating the data used for training the 10RM and stars indicate data used for testing. The temperature is shown with color coding.

FIG. 4.

Left and Middle Columns: The predicted molecule size distributions for hydrocarbons with at most 20 carbon atoms are compared to the ones extracted from MD simulations (solid black dots). The black error bars are equal to two standard deviations. The molecule size is measured by the numbers of carbon atoms s. The dashed blue curves and the solid red curves correspond to predicted size distribution by the random graph theory model (RGT). The dashed blue curves with error bars represent the results obtained using the degree distributions extracted from the MD simulations, while the red solid curves represent the results obtained using the degree distribution generated using the ten-reaction model (10RM). Right Column: The Wasserstein-1 distance W1(P, Q), where P is the molecule size distribution extracted from MD simulation and Q is the predicted molecule size distribution by the “RGT” model (top) and the “10RM + RGT” model (bottom) plotted as a function of the H/C ratio. The initial composition is indicated for each mark with circles indicating the data used for training the 10RM and stars indicate data used for testing. The temperature is shown with color coding.

Close modal

While the error in the predicted degree distribution is fairly small for all initial compositions and temperatures that we have considered, there is an apparent trend observed for C4H10 data: the error increases as the temperature grows. This trend and its consequences will be discussed in Sec. V.

Hydrocarbon pyrolysis usually generates a mix of small molecules, and one much larger molecule, if the H/C ratio is low enough. In order to accurately predict the equilibrium conditions of hydrocarbon pyrolysis we need to correctly describe both of these components of the system. The algorithm summarized in Sec. III B predicts the size distribution for small molecules as well as the mean size of the largest carbon agglomeration, called the giant component in the random graph theory framework. Here, we start by studying the small molecules size distribution. We chose 20 carbons in a molecule as a reasonable size cut-off for “small” molecules.

For each initial composition and temperature reported in Table S1, the following three small molecule size distributions were compared:

  • The molecule size distribution extracted from MD simulations.

  • The molecule size distribution predicted by the configuration model described in Sec. III B using the degree distribution extracted from the MD simulations (“RGT” model).

  • The molecule size distribution predicted by the configuration model described in Sec. III B using the degree distribution predicted by the ten-reaction model described in Sec. III A (“10RM + RGT” model).

Figure 4 (left and middle columns) illustrates four samples of predicted and extracted from MD simulations molecule size distributions. These samples are chosen representatively with respect to initial compositions, temperatures, and discrepancies between the distributions extracted from MD simulations and the ones calculated via the random graph theory. More plots of this kind are found in Fig. S6 in the supplementary material. We observe that the predictions of the RGT and 10RM + RGT models match generally well the values observed in the MD simulations, especially for the smallest molecules (containing less than five carbons). We can also note that the RGT and 10RM + RGT model predictions are very close to each other.

In order to compare the accuracy of the prediction for the small molecule size distribution for different temperatures and initial chemical compositions, we use the Wasserstein W1 distance, a standard measure for comparing probability distributions51 

W1(P,Q)=sj=1sPjj=1sQj.
(9)

The molecule size distributions extracted from MD simulations have been used as the distribution P, while the predicted molecule size distributions have been used as Q. In both cases, we truncated these distributions at size s = 20 and renormalized them so that they sum up to 1. The values of the Wasserstein W1 distance obtained for various initial compositions and various temperatures are visualized in Fig. 4 (right column). This figure allows us to make the following observations.

  • The Wasserstein W1 distance tends to increase as the H/C ratio decreases. This phenomenon is related to the presence of large hydrocarbons. We will discuss it in Sec. V.

  • The temperature dependence observed for C4H10 data is nonmonotone.

  • The difference between the accuracy of RGT and 10RM + RGT for small molecule size distributions is insignificant.

The RGT model is derived under the assumption that the number of vertices in the graph is very large. Hence, we expect that the predictions by RGT will become more accurate as the number of carbons in the system increases. This effect is illustrated in Sec. S6 of the supplementary material.

The random graph sampling (RGS) model can also be used to study the small molecule size distribution. In particular, this method can be used to obtain a histogram of the counts of molecules of a particular size at equilibrium. More details and figures are given in Sec. S7 of the supplementary material.

There are two ways to predict the size of the largest molecule given the degree distribution. If the degree distribution implies the supercritical phase and the number of carbons is large enough, the size of the giant component can be estimated by NC(1 − H0(1)) (RGT, see the end of Sec. III B). Alternatively, for a system of any size and independent of phase, the size distribution of the largest molecule can be obtained using random graph sampling (RGS) (see Sec. III C). The results of both of these approaches are presented here. The degree distribution is obtained by the ten-reaction model (10RM).

The existence of the giant component was predicted for initial compositions consisting of C4H10 molecules and C8H18 molecules, while for the mix and the initial compositions consisting of C2H6 and CH4, the fraction of carbons in the giant component S was found to be zero. We report the numbers of carbons in the giant component S · NC in Table I. We can see that the size of the giant component is generally over-estimated by the RGT model. This observation will be discussed in Sec. V.

TABLE I.

The numbers of carbon atoms in the largest molecule. The column RGT shows the predictions by 10RM + RGT: the expected number of carbons in the giant component is the product of S and the number of carbons NC. The last two columns display mean ± standarddeviation for the size of the largest molecule extracted from generated by 10RM + RGS and MD simulations, respectively.

Init. Comp.Temp. (K)NCRGTRGSMD
C4H10 3200 256 79 112 ± 25 57 ± 18 
C4H10 3300 256 81 110 ± 26 49 ± 17 
C4H10 3400 256 115 113 ± 26 70 ± 28 
C4H10 3500 256 112 113 ± 27 64 ± 21 
C4H10 3600 256 125 117 ± 26 76 ± 24 
C4H10 4000 256 92 129 ± 25 57 ± 22 
C4H10 4500 256 105 150 ± 22 58 ± 21 
C4H10 5000 256 96 171 ± 18 54 ± 19 
C8H18 3300 288 195 208 ± 13 127 ± 36 
C2H6 3300 250 ⋯ 24 ± 8 20 ± 6 
C2H6 4000 250 ⋯ 31 ± 11 20 ± 5 
CH4 3600 160 ⋯ 6 ± 1 6 ± 1 
CH4 4500 160 ⋯ 10 ± 3 10 ± 3 
Mix 3500 240 ⋯ 21 ± 7 18 ± 3 
Init. Comp.Temp. (K)NCRGTRGSMD
C4H10 3200 256 79 112 ± 25 57 ± 18 
C4H10 3300 256 81 110 ± 26 49 ± 17 
C4H10 3400 256 115 113 ± 26 70 ± 28 
C4H10 3500 256 112 113 ± 27 64 ± 21 
C4H10 3600 256 125 117 ± 26 76 ± 24 
C4H10 4000 256 92 129 ± 25 57 ± 22 
C4H10 4500 256 105 150 ± 22 58 ± 21 
C4H10 5000 256 96 171 ± 18 54 ± 19 
C8H18 3300 288 195 208 ± 13 127 ± 36 
C2H6 3300 250 ⋯ 24 ± 8 20 ± 6 
C2H6 4000 250 ⋯ 31 ± 11 20 ± 5 
CH4 3600 160 ⋯ 6 ± 1 6 ± 1 
CH4 4500 160 ⋯ 10 ± 3 10 ± 3 
Mix 3500 240 ⋯ 21 ± 7 18 ± 3 

Random graph sampling allows us to estimate the probability ζs that the size of the largest molecule is s. We used the 10 000 samples of random graphs to obtain the probability distribution for size of the largest component and compared them to the probability distribution for the largest molecule size extracted from the MD data. This was done for each initial composition and temperature listed in Table S1. The agreement between these distributions is reasonably good for the cases with high H/C ratio where the giant component does not exist, i.e., for the mix and the initial compositions consisting of only C2H6 or CH4 (see the last five rows in Table I). All cases where the giant component was predicted exhibited poor agreement. Examples of good and bad agreements are shown in Figs. 5(a)5(c), respectively.

FIG. 5.

(a)–(c) Distributions of the size of the largest molecule extracted from MD data (blue plots) and obtained using the ten-reaction model and the random graph sampling (10RM + RGS) (red plots). (d) The Wasserstein-1 distance W1(P, Q) as a function of H/C ratio. The distributions P for the largest molecule size are obtained using the 10RM + RGS model, while the distributions Q for the largest molecule size are extracted from MD simulations. The data are color coded by the temperature of the system. Circles indicate the data used for training the 10RM, and stars indicate data used for testing.

FIG. 5.

(a)–(c) Distributions of the size of the largest molecule extracted from MD data (blue plots) and obtained using the ten-reaction model and the random graph sampling (10RM + RGS) (red plots). (d) The Wasserstein-1 distance W1(P, Q) as a function of H/C ratio. The distributions P for the largest molecule size are obtained using the 10RM + RGS model, while the distributions Q for the largest molecule size are extracted from MD simulations. The data are color coded by the temperature of the system. Circles indicate the data used for training the 10RM, and stars indicate data used for testing.

Close modal

We would like to comment on the highly noisy distributions for the size of the largest molecule extracted from MD simulations. First, various MD simulations stayed at equilibrium for different duration ranging from 25% to 67% of simulation time. Second, the largest molecule size distribution is naturally very noisy: for example, if a bond is broken in the middle of the largest molecule, its size is divided roughly by two.

The Wasserstein W1 distance between the size distributions for the largest molecule obtained using random graph sampling and extracted from MD data is plotted in Fig. 5(d) as a function of H/C ratio. The temperature is shown using color coding.

Figure 5(d) suggests that the prediction for the distribution of the largest molecule size becomes less accurate in terms of the Wasserstein W1 distance as the H/C ratio decreases and the temperature increases.

As shown in Sec. IV A, the prediction error for the degree distribution by the ten-reaction model increases with temperature. This behavior of the error can be explained by the growth of the number of radicals and double carbon–carbon bonds with temperature. In Fig. S4, we have plotted the probability to observe a carbon bonded to a total of k other atoms, hydrogens and carbons, vs temperature. The ten-reaction model relies on Assumption 2 that all carbons are bonded to four other atoms. This assumption becomes less accurate as the temperature increases. At T = 3, 200 K, over 96% of carbons are bonded to four other atoms. At T = 3, 600 K, T = 4, 000 K, and T = 5, 000 K, this percentage drops down to 91%, 87%, and 74%, respectively, due to the increase of the number of double bonds and radicals.

To determine if the errors due to the ten-reaction model are important or not, we compare the small molecule size distributions predicted by the RGT model and the 10RM + RGT model in Fig. 4 (right column). Remarkably, these predictions are very close to each other. In particular, the discrepancy between these predictions is very small for the system initially composed of C4H10 at 5,000 K. Note that the error in the degree distribution predicted by the ten-reaction model for this system was the largest of all systems considered here. This means that the errors observed in Sec. IV A are low enough and are not a limiting factor for our method in the case of small molecules.

However, in Fig. 5(d), there is a clear trend that the 10RM + RGS model performs worse for the largest molecule size prediction when the temperature increases. In this model, temperature only affects the 10RM part through the calculation of the Kijs. Therefore, for the largest molecule size distribution, it looks like the prediction error from the ten-reaction model is critical and affects the performance of the model. This phenomenon is explained by the fact that errors in the probabilities p3 and p4 for a carbon to have degree three and four, respectively, affect the relative error in size distribution for large molecules the most. To improve the ten-reaction model in future work, assumption 2 could be relaxed and double bonds or radicals could be taken into account.

In Sec. IV, we have observed that the RGT and RGS models accurately predict size distributions for small molecules (Fig. 4). When no giant component exists, as in the case of systems starting with CH4, C2H6 or the mix, the size of the largest molecules is also well-predicted [see the last five rows of Table I and Figs. 5(a), 5(b), and 5(d)]. Importantly, the RGT not only predicts the molecule size distribution but also explains it. The distribution is completely determined by three assumptions: (i) the degree distribution, (ii) a large number of carbons, and (iii) equal probability for any pair of carbons to be bonded. These three assumptions imply that small molecules are tree-like with high probability, and hence the component size distribution H0 is given by Eqs. (4) and (5).

However, when the random graph theory [specifically, Eq. (6)] predicts that there is a giant component, the performance of these models worsens, especially for the prediction of the largest molecule [Table I, Figs. 5(c) and 5(d)]. Therefore, the larger the giant component of the system is the worse these random graph models perform. This observation explains why the model consistently performs worse when the H/C ratio decreases. When there are fewer hydrogens per carbon in the system, larger giant components tend to form, resulting in worse predictions by the random graph theory and sampling.

The fact that both the RGT and the RGS model result in a poor prediction for the size of the largest molecule is due to the fact that both of these approaches ignore geometric aspects of the graphs. This means that some large connected components of graphs predicted by the random graph models may be nonphysical, i.e., impossible to realize in 3D when taking into account the geometric restriction for bond lengths, bond angles, and minimal inter-atomic distance of unbonded atoms. Without these geometric constrains, the molecules predicted by the RGT and RGS models tend to be larger than in the MD simulations, resulting in the poor predictions by the RGT and RGS models. One possible approach to tackle this issue is to learn physical molecular substructures such as branching patterns, cycles, combinations of cycles, etc. from data and reject molecules generated by random graph sampling with nonphysical substructures. Another possible approach is to develop a model for hydrocarbons based on hypergraphs rather than graphs. Both of these approaches require major developments and are left to future work.

Finally, we would like to make a few remarks on computational cost, the cost of the prediction of molecule size distribution by 10RM + RGT is independent of the size of the system and involves only a solution of a small system of nonlinear equations and function evaluations. Runtimes are about one second. The cost of 10RM + RGS is determined by the number of samples and scales with the number of carbons NC as the cost of the depth-first search algorithm: O(NC).50 The runtimes for the systems studied in this work are about one minute. It is important to note that 10RM + RGS has an advantage over kinetic Monte Carlo (KMC) hypothetically used with the ten-reaction model that the former would immediately sample random graphs from the invariant distribution while the latter would require some equilibration period. The accuracy of these two techniques would be comparable. Kinetic Monte Carlo combined with a more sophisticated atomic–level reaction model can give accurate predictions even for the distribution of the largest molecule in the system,35,36 but would take more time to run (10 s of minutes).

We have proposed a methodology to predict the molecule size distribution for hydrocarbon pyrolysis suitable for various initial molecular compositions and various temperatures, at least in the range 3200–5000 K. The methodology involves two steps. First, the ten-reaction model analytically gives the degree distribution for the desired initial composition and temperature. Second, given this degree distribution, we predict the molecule size distribution using two random graph theory models. These models can estimate the size distribution of small molecules and of the largest molecule. These two steps are executed within seconds to minutes on a consumer-grade laptop, which is several orders of magnitude faster than what is possible by running MD simulations. Remarkably, the ten-reaction model and the RGT model runtimes do not depend on the size of the system and can easily be computed for any system size. The predictions for small molecule size distributions are remarkably accurate and the predictions for the largest molecule size distributions are good if there are no giant components. Further improvements of the predicting power of the random graph approaches, particularly for the largest molecules, will be performed in future work. Additionally, with more accurate MD data from improved methods or more powerful computers, this methodology can take advantage of the improved data and provide even better predictions.

Our method can also be applied to other systems where the size distribution is of interest. From the first conclusions of our work, our methods can be applied to systems in regimes where no structural effects (cycles, etc.) are observed. If molecules become large but no structural effects are observed (only branching), we foresee that our method could be applied successfully as our method assumes that molecules are tree-like. For example, our method could be applied to many polymerization processes.

See the supplementary material for the following information: GitHub link to code, molecular dynamics simulations details, bond duration definition, ten-reaction model details, Arrhenius parameters and plots, generating molecule size distribution from degree distribution (detailed version), additional plots for the size distribution for small molecules, system size effect, and prediction of histograms for particular molecule sizes from random graph sampling.

This work was partially supported by the Air Force Office of Scientific Research under Award No. FA9550-20-1-0397.

The authors have no conflicts to disclose.

Vincent Dufour-Décieux: Conceptualization (equal); Data curation (lead); Formal analysis (equal); Investigation (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). Christopher Moakler: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). Evan J. Reed: Conceptualization (supporting); Funding acquisition (equal); Supervision (equal); Writing – review & editing (supporting). Maria Cameron: Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Supervision (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
M. R.
Harper
,
K. M.
Van Geem
,
S. P.
Pyl
,
G. B.
Marin
, and
W. H.
Green
, “
Comprehensive reaction mechanism for n-butanol pyrolysis and combustion
,”
Combust. Flame
158
(
1
),
16
41
(
2011
).
2.
C. K.
Westbrook
,
W. J.
Pitz
,
O.
Herbinet
,
H. J.
Curran
, and
E. J.
Silke
, “
A comprehensive detailed chemical kinetic reaction mechanism for combustion of n-alkane hydrocarbons from n-octane to n-hexadecane
,”
Combust. Flame
156
(
1
),
181
199
(
2009
).
3.
D.
Kraus
,
J.
Vorberger
,
A.
Pak
,
N. J.
Pak
,
L. B.
Fletcher
,
S.
Frydrych
,
E.
Galtier
,
E. J.
Gamboa
,
D. O.
Gericke
,
S. H.
Glenzer
 et al, “
Formation of diamonds in laser-compressed hydrocarbons at planetary interior conditions
,”
Nat. Astron.
1
(
9
),
606
611
(
2017
).
4.
H.
Hirai
,
K.
Konagai
,
T.
Kawamura
,
Y.
Yamamoto
, and
T.
Yagi
, “
Polymerization and diamond formation from melting methane and their implications in ice layer of giant planets
,”
Phys. Earth Planet. Inter.
174
(
1–4
),
242
246
(
2009
).
5.
M.
Ross
, “
The ice layer in Uranus and Neptune—Diamonds in the sky?
,”
Nature
292
(
5822
),
435
436
(
1981
).
6.
A.
Wang
,
S.
Kadam
,
H.
Li
,
S.
Shi
, and
Y.
Qi
, “
Review on modeling of the anode solid electrolyte interphase (SEI) for lithium-ion batteries
,”
npj Comput. Mater.
4
(
1
),
15
(
2018
).
7.
X.-M.
Cheng
,
Q.-D.
Wang
,
J.-Q.
Li
,
J.-B.
Wang
, and
X.-Y.
Li
, “
ReaxFF molecular dynamics simulations of oxidation of toluene at high temperatures
,”
J. Phys. Chem. A
116
(
40
),
9811
9818
(
2012
).
8.
Z.
He
,
X.-B.
Li
,
L.-M.
Liu
, and
W.
Zhu
, “
The intrinsic mechanism of methane oxidation under explosion condition: A combined ReaxFF and DFT study
,”
Fuel
124
,
85
90
(
2014
).
9.
G.
Yan
,
Z.
Zhang
, and
K.
Yan
, “
Reactive molecular dynamics simulations of the initial stage of brown coal oxidation at high temperatures
,”
Mol. Phys.
111
(
1
),
147
156
(
2013
).
10.
Q.
Yang
,
C. A.
Sing-Long
, and
E. J.
Reed
, “
Learning reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics
,”
Chem. Sci.
8
,
5781
5796
(
2017
).
11.
E.
Chen
,
Q.
Yang
,
V.
Dufour-Décieux
,
C. A.
Sing-Long
,
R.
Freitas
, and
E. J.
Reed
, “
Transferable kinetic Monte Carlo models with thousands of reactions learned from molecular dynamics simulations
,”
J. Phys. Chem. A
123
(
9
),
1874
1881
(
2019
).
12.
Q.
Yang
,
C. A.
Sing-Long
, and
E. J.
Reed
, “
L1 regularization-based model reduction of complex chemistry molecular dynamics for statistical learning of kinetic Monte Carlo models
,”
MRS Adv.
1
(
24
),
1767
1772
(
2016
).
13.
Q.
Yang
,
C. A.
Sing-Long
,
E.
Chen
, and
E. J.
Reed
, “
Data-driven methods for building reduced kinetic Monte Carlo models of complex chemistry from molecular dynamics simulations
,” in
Computational Approaches for Chemistry under Extreme Conditions
(
Springer
,
2019
), pp.
209
227
.
14.
Q.
Yang
,
C. A.
Sing-Long
, and
E. J.
Reed
, “
Rapid data-driven model reduction of nonlinear dynamical systems including chemical reaction networks using ℓ1-regularization
,”
Chaos
30
(
5
),
053122
(
2020
).
15.
S. S.
Lobanov
,
P. N.
Chen
,
X. J.
Chen
,
C. S.
Zha
,
K. D.
Litasov
,
H. K.
Mao
, and
A. F.
Goncharov
, “
Carbon precipitation from heavy hydrocarbon fluid in deep planetary interiors
,”
Nat. Commun.
4
(
1
),
2446
2448
(
2013
).
16.
A.
Zerr
,
G.
Serghiou
,
R.
Boehler
, and
M.
Ross
, “
Decomposition of alkanes at high pressures and temperatures
,”
High Pressure Res.
26
(
1
),
23
32
(
2006
).
17.
H.
Wang
,
X.
You
,
A. V.
Joshi
,
S. G.
Davis
,
A.
Laskin
,
F.
Egolfopoulos
,
C. K.
Law
, and
USC Mech Version II
, “
High-temperature combustion reaction model of H2
,” in
High-temperature Combustion Reaction Model of H2/CO/C1-C4 Compounds
, (
2007
); available at http://ignis.usc.edu/USC_Mech_II.htm..
18.
C.-W.
Zhou
,
L.
Yang
,
U.
Burke
,
C.
Banyon
,
K. P.
Somers
,
S.
Ding
,
S.
Khan
,
J. W.
Hargis
,
T.
Sikes
,
O.
Mathieu
 et al, “
An experimental and chemical kinetic modeling study of 1,3-butadiene combustion: Ignition delay time and laminar flame speed measurements
,”
Combust. Flame
197
,
423
438
(
2018
).
19.
F.
Ancilotto
,
G. L.
Chiarotti
,
S.
Scandolo
, and
E.
Tosatti
, “
Dissociation of methane into hydrocarbons at extreme (planetary) pressure and temperature
,”
Science
275
(
5304
),
1288
1290
(
1997
).
20.
A. C. T.
van Duin
,
S.
Dasgupta
,
F.
Lorant
, and
W. A.
Goddard
, “
ReaxFF: A reactive force field for hydrocarbons
,”
J. Phys. Chem. A
105
(
41
),
9396
9409
(
2001
).
21.
K.
Chenoweth
,
A. C. T.
van Duin
, and
W. A.
Goddard
, “
ReaxFF reactive force field for molecular dynamics simulations of hydrocarbon oxidation
,”
J. Phys. Chem. A
112
(
5
),
1040
1053
(
2008
).
22.
S. G.
Srinivasan
,
A. C. T.
van Duin
, and
P.
Ganesh
, “
Development of a ReaxFF potential for carbon condensed phases and its application to the thermal fragmentation of a large fullerene
,”
J. Phys. Chem. A
119
(
4
),
571
580
(
2015
).
23.
C.
Ashraf
and
A. C. T.
van Duin
, “
Extension of the ReaxFF combustion force field toward syngas combustion and initial oxidation kinetics
,”
J. Phys. Chem. A
121
(
5
),
1051
1068
(
2017
).
24.
K.
Chenoweth
,
A. C. T.
van Duin
,
S.
Dasgupta
, and
W. A.
Goddard
 III
, “
Initiation mechanisms and kinetics of pyrolysis and combustion of JP-10 hydrocarbon jet fuel
,”
J. Phys. Chem. A
113
(
9
),
1740
1746
(
2009
).
25.
Q.-D.
Wang
,
J.-B.
Wang
,
J.-Q.
Li
,
N.-X.
Tan
, and
X.-Y.
Li
, “
Reactive molecular dynamics simulation and chemical kinetic modeling of pyrolysis and combustion of n-dodecane
,”
Combust. Flame
158
(
2
),
217
226
(
2011
).
26.
M.
Döntgen
,
M.-D.
Przybylski-Freund
,
L. C.
Kröger
,
W. A.
Kopp
,
A. E.
Ismail
, and
K.
Leonhard
, “
Automated discovery of reaction pathways, rate constants, and transition states using reactive molecular dynamics simulations
,”
J. Chem. Theory Comput.
11
(
6
),
2517
2524
(
2015
).
27.
H.-J.
Qian
,
A. C. T.
van Duin
,
K.
Morokuma
, and
S.
Irle
, “
Reactive molecular dynamics simulation of fullerene combustion synthesis: ReaxFF vs DFTB potentials
,”
J. Chem. Theory Comput.
7
(
7
),
2040
2048
(
2011
).
28.
A.
Beste
, “
ReaxFF study of the oxidation of lignin model compounds for the most common linkages in softwood in view of carbon fiber production
,”
J. Phys. Chem. A
118
(
5
),
803
814
(
2014
).
29.
A.
Bharti
and
T.
Banerjee
, “
Reactive force field simulation studies on the combustion behavior of n-octanol
,”
Fuel Process. Technol.
152
,
132
139
(
2016
).
30.
N.
Orekhov
,
G.
Ostroumova
, and
V.
Stegailov
, “
High temperature pure carbon nanoparticle formation: Validation of AIREBO and ReaxFF reactive molecular dynamics
,”
Carbon
170
,
606
620
(
2020
).
31.
J.
Ford
,
S.
Seritan
,
X.
Zhu
,
M. N.
Sakano
,
M. M.
Islam
,
A.
Strachan
, and
T. J.
Martínez
, “
Nitromethane decomposition via automated reaction discovery and an ab initio corrected kinetic model
,”
J. Phys. Chem. A
125
(
7
),
1447
1460
(
2021
).
32.
C. W.
Bauschlicher
, Jr.
,
T.
Qi
,
E. J.
Reed
,
A.
Lenfant
,
J. W.
Lawson
, and
T. G.
Desai
, “
Comparison of ReaxFF, DFTB, and DFT for phenolic pyrolysis. 2. Elementary reaction paths
,”
J. Phys. Chem. A
117
(
44
),
11126
11135
(
2013
).
33.
L. W.
Bertels
,
L. B.
Newcomb
,
M.
Alaghemandi
,
J. R.
Green
, and
M.
Head-Gordon
, “
Benchmarking the performance of the ReaxFF reactive force field on hydrogen combustion systems
,”
J. Phys. Chem. A
124
(
27
),
5631
5645
(
2020
).
34.
Q.
Wang
,
J. C.
Saldinger
,
P.
Elvati
, and
A.
Violi
, “
Molecular structures in flames: A comparison between SNapS2 and recent AFM results
,”
Proc. Combust. Inst.
38
(
1
),
1133
1141
(
2021
).
35.
V.
Dufour-Décieux
,
R.
Freitas
, and
E. J.
Reed
, “
Atomic-level features for kinetic Monte Carlo models of complex chemistry from molecular dynamics simulations
,”
J. Phys. Chem. A
125
(
19
),
4233
4244
(
2021
).
36.
V.
Dufour-Décieux
,
A. D.
Sendek
,
B.
Ransom
,
R.
Freitas
,
J.
Blanchet
, and
E. J.
Reed
, “
Temperature extrapolation of molecular dynamics simulations of complex chemistry to microsecond timescales using kinetic models: Applications to hydrocarbon pyrolysis
,”
J. Chem. Theory Comput.
,
18
(
12
),
7496
7509
(
2022
).
37.
D. T.
Gillespie
, “
A general method for numerically simulating the stochastic time evolution of coupled chemical reactions
,”
J. Comput. Phys.
22
(
4
),
403
434
(
1976
).
38.
D. J.
Higham
, “
Modeling and simulating chemical reactions
,”
SIAM Rev.
50
(
2
),
347
368
(
2008
).
39.
M.
Molloy
and
B.
Reed
, “
A critical point for random graphs with a given degree sequence
,”
Random Struct. Algorithms
6
,
161
179
(
1995
).
40.
M.
Molloy
and
B.
Reed
, “
The size of the giant component of a random graph with a given degree sequence
,”
Combinatorics Probab. Comput.
7
(
3
),
295
305
(
1998
).
41.
M. E. J.
Newman
,
S. H.
Strogatz
, and
D. J.
Watts
, “
Random graphs with arbitrary degree distributions and their applications
,”
Phys. Rev. E
64
,
026118
(
2001
).
42.
R.
van der Hofstad
.
Random Graphs and Complex Networks
(
Cambridge Series in Statistical and Probabilistic Mathematics
,
2017
), Vol. 1.
43.
R.
van der Hofstad
.
Random Graphs and Complex Networks
(
Cambridge Series in Statistical and Probabilistic Mathematics
,
2021
), Vol. 2.
44.
S.
Plimpton
, “
Fast parallel algorithms for short-range molecular dynamics
,”
J. Comput. Phys.
117
(
1
),
1
19
(
1995
).
45.
A. P.
Thompson
,
H.
Metin Aktulga
,
R.
Berger
,
D. S.
Bolintineanu
,
W.
Michael
Brown
,
P. S.
Crozier
,
P. J.
in’t Veld
,
A.
Kohlmeyer
,
S. G.
Moore
,
T. D.
Nguyen
 et al, “
LAMMPS - A flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales
,”
Comput. Phys. Commun.
271
,
108171
(
2022
).
46.
H.
Carter Edwards
,
C. R.
Trott
, and
D.
Sunderland
, “
Kokkos: Enabling manycore performance portability through polymorphic memory access patterns
,”
J. Parallel Distrib. Comput.
74
(
12
),
3202
3216
(
2014
).
47.
H. M.
Aktulga
,
C.
Knight
,
P.
Coffman
,
K. A.
O’Hearn
,
T.-R.
Shan
, and
W.
Jiang
, “
Optimizing the performance of reactive molecular dynamics simulations for many-core architectures
,”
Int. J. High Perform. Comput. Appl.
33
(
2
),
304
321
(
2019
).
48.
H. M.
Aktulga
,
J. C.
Fogarty
,
S. A.
Pandit
, and
A. Y.
Grama
, “
Parallel reactive molecular dynamics: Numerical methods and algorithmic techniques
,”
Parallel Comput.
38
(
4
),
245
259
(
2012
).
49.
H. S.
Wilf
,
Generatingfunctionology
, 2nd ed. (
Academic Press
,
London
,
1994
).
50.
T. H.
Cormen
,
C. E.
Leiserson
,
R. L.
Rivest
, and
C.
Stein
,
Introduction to Algorithms
, 3rd ed. (
The MIT Press
,
2009
).
51.
S.
Kolouri
,
S. R.
Park
,
M.
Thorpe
,
D.
Slepcev
, and
G. K.
Rohde
, “
Optimal mass transport: Signal processing and machine-learning applications
,”
IEEE Signal Process. Mag.
34
,
43
59
(
2017
).

Supplementary Material