Equations of state (EOSs) are typically represented as physics-informed models with tunable parameters that are adjusted to replicate calibration data as closely as possible. Uncertainty quantification (UQ) allows for the development of an ensemble of EOS parameters that are consistent with the calibration data instead of a single EOS. In this work, we perform UQ for the reactant and product EOSs for a variety of high explosives (HEs). In doing so, we demonstrate a strategy for dealing with heterogeneous (both experimental and calculated) data. We also use a statistical distance metric to quantify the differences between the various HEs using the UQ results.

## I. INTRODUCTION

Equations of state (EOSs) relate the conditions of a material to its properties. In hydrodynamic simulations of multi-scale phenomena, the EOSs of the constituent materials prescribe the material behavior as the simulation evolves. Therefore, the accuracy of the underlying EOSs is critical to the fidelity of the hydrodynamic simulations. Though the functional form of the EOS models is generally informed by physics, such models typically contain adjustable parameters that must be calibrated to data so that the EOS accurately reproduces the known (either via experiment or calculation) material properties. Historically, this tuning procedure has been done by hand, but in recent years, several automated optimization strategies have been developed to facilitate this process.^{1–7}

The hydrodynamic simulation of high explosive (HE) detonation requires a theoretical description of the transition from the unreacted condensed-phase state composed of larger molecules at ambient conditions to a high-temperature, high-pressure gaseous state composed of smaller molecules like water and carbon dioxide. A core requirement of such simulations is the need for both a reactant and product EOS. At present, single deterministic calibrations for HEs are standard in the field; however, increasing computational power warrants the use of an ensemble of statistically relevant calibrations that convey the inherent limitations in both data quantity and data certainty. The strategy outlined in this work, while applied only to ideal HE models, lays the foundation for incorporating the burn models required by non-ideal theories of detonation into an uncertainty-aware calibration framework.

In this work, we calibrate reactant and product EOS models to a heterogeneous dataset that includes experimental data; such measurements carry uncertainties. Additionally, EOS models are often quasi-degenerate, meaning that multiple parameter sets can produce practically equivalent predictions for a finite-sized calibration dataset. Uncertainty quantification (UQ) provides a mechanism to produce an ensemble of EOS parameter sets that captures both of these effects. In principle, this ensemble of EOSs can then be propagated downstream to produce a range of predictions from hydrodynamic simulations. Prior UQ work on EOSs and other hydrodynamic models underscores the growing importance of UQ in the field.^{8–21}

In general, fitting HE EOSs to data presents a variety of challenges. The first is the high dimensionality of the problem, which involves both a reactant and a product EOS (in later work, we will also include burn models). A large number of degrees of freedom can be problematic for standard UQ strategies such as Markov Chain Monte Carlo (MCMC). We ameliorate this problem by untangling the reactant and product EOSs, allowing us to treat them nearly independently. The second is that data are often conflicting, old, or very difficult to obtain. Conflicting data sources are natural to include in a UQ framework, as we show in this work. To deal with the other data challenges, we use a composite dataset composed of both experimental and computed calibration data, and we explore creative ways to cast the calibration data. For the reasons that we describe below, it is expedient to use a Bayesian approach for the reactants and a bootstrap approach for the products. We note that the reactant data are often reported as fit coefficients instead of as individual measurements. We discuss strategies for the inclusion of this type of data in a UQ analysis, with an emphasis on their uncertainties. We then use a machine-learning classifier to unite the reactant and product EOS parameters.

We perform UQ for various HE formulations. Some of these HEs might be expected to behave relatively similarly, e.g., the plastic-bonded explosives (PBXs) 9501, 9404, and 9011 all have the same HE material (1,3,5,7-tetranitro-1,3,5,7-tetrazocane or HMX) with different percentages of HE and different binders.^{22,23} We will show in this work that the ensembles of parameter combinations that result from UQ can allow for a more in-depth understanding of relative similarities and differences between various HE formulations by way of statistical distances.

The remainder of this work is organized as follows. In Sec. II, we describe the essential details of the UQ methodology: the model, the calibration data, and the UQ sampling strategies employed. We also cover the statistical distances used to quantify the differences between the various HEs explored in this work. In Sec. III, we employ the above strategy for the reactant EOSs first, and we compare differences in HEs as well as differences between different data sources. Then, we fold in the product calibration data to provide a complete comparison of reactant and product EOSs. Finally, in Sec. IV, we conclude and comment on future directions.

## II. COMPUTATIONAL METHODS

### A. Davis EOS models

The Davis reactant and product EOS models^{24–26} are of Mie–Grüneisen form, meaning that pressure and specific energy are linearly related and pressure and volume are non-linearly related. Each EOS is based on a reference isentrope^{27} as

$$p(v, e) = p_s(v) + \frac{\Gamma(v)}{v}\,\bigl[e - e_s(v)\bigr],$$

where $p_s(v)$ and $e_s(v)$ are the pressure and specific energy along the reference isentrope and $\Gamma(v)$ is the Grüneisen function.^{24–26}

In this work, the product reference isentrope is taken to be the release isentrope from the Chapman–Jouguet (CJ) state. The reference state for the product EOS has a temperature of 298.15 K, and the reference volume is taken from the reactant EOS. The reactant EOS has six adjustable parameters [$A$, $B$, $C$, $C_v^{r0}$, $\alpha_{ST}$, $\Gamma_{r0}$]; the product EOS has seven adjustable parameters [$a$, $k$, $v_c$, $p_c$, $n$, $b$, $C_v^{p}$].

The Davis EOS models are popular choices for modeling the behavior of HEs; however, they have known limitations. For example, the reactant EOS does not obey the Dulong–Petit limit for the isochoric heat capacity $C_v$ as a function of $T$. In the UQ parlance, such systematic defects in the physics model are referred to as model form uncertainty. We will not explicitly consider model form uncertainty in this work.

### B. Calibration data

#### 1. Reactants

For the HE reactants, we use two HE data compendiums, one from Los Alamos Scientific Laboratory [Gibbs & Popolato 1980, Ref. 22] and the other from Lawrence Livermore Laboratory [Dobratz 1972, Ref. 23]. These sources were selected so that the calibration data would be maximally internally consistent within a single data source: we want comparisons between HEs to reflect the intrinsic differences between the HEs and not discrepancies in experimental equipment or data processing. While both of these data sources sometimes cite external data, we can (perhaps naïvely) hope that the data curation choices were made in an internally consistent fashion. For the purposes of this study, we are essentially outsourcing the data curation step to the two compendiums of HE data, as these sources are fairly standard references in HE research. However, we note that data curation is, in general, a part of the UQ process, and the UQ field as a whole would stand to benefit from increased collaboration between physical scientists and statisticians in this area.

For the reactant calibration data, we use isobaric heat capacity ($C_p$ vs $T$), isobaric linear thermal expansion ($\alpha$ vs $T$), and shock Hugoniot (shock speed $U_s$ vs particle speed $U_p$) data. We selected the eight HEs for which all of these data are available in both Gibbs & Popolato 1980 and Dobratz 1972: PBX 9501, PBX 9404, PBX 9011, 2,4,6-trinitrotoluene (TNT), pentaerythritol tetranitrate (PETN), Baratol, Composition B-3 (Comp B), and the extrudable explosive XTX 8003.

Both of these data sources report the experimental data as polynomial fit coefficients instead of raw data. For example, the isobaric heat capacity of solid TNT is reported as $C_p\,[\mathrm{cal/g/^{\circ}C}] = 0.254 + 7.5 \times 10^{-4}\, T\,[^{\circ}\mathrm{C}]$ over the range $T = [17, 67]\,^{\circ}\mathrm{C}$ in Gibbs & Popolato 1980. It is tempting to use the preceding equation to generate synthetic point-wise data over the relevant temperature range and input these data points into the UQ process. However, this approach is problematic when we do not have information about the magnitude of the uncertainty in the underlying data. For reasons discussed below, we will not infer the data uncertainties in this work, and so we are required to make reasonable, non-arbitrary choices for these values.

Some uncertainties are reported for the fit parameters in Ref. 22, and they tend to be about 10% of the value of the coefficient. Therefore, we can use the fit coefficients themselves as data and make informed choices about their uncertainty. For a given EOS parameter set, we generate, e.g., $C_p$ vs $T$ data points over the relevant temperature range, fit these data points to a polynomial of the appropriate order (a linear fit in the TNT example above), and then compare the fit coefficients to the experimental data. We assume that the magnitude of the uncertainty in the fit coefficients is constant across all HEs. Since our assumed uncertainty in the fit coefficients is approximate, we perform the UQ with two different sets of uncertainties, corresponding to uncertainties of about 5% and 10%, respectively, of the average value of the given coefficient.
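As a concrete sketch of this coefficient-matching step, the fragment below generates model predictions over the reported temperature range, refits them to a line, and compares the refit coefficients against the published TNT values. The linear `model_cp` stand-in and the sampling density are illustrative assumptions, not the actual Davis-model evaluation.

```python
import numpy as np

# Hypothetical stand-in for the EOS-derived Cp(T) prediction; the real
# generative model would evaluate the Davis reactant EOS for parameters theta.
def model_cp(T, theta):
    return theta[0] + theta[1] * T

# Published fit for solid TNT (Ref. 22): Cp = 0.254 + 7.5e-4 * T.
data_coeffs = np.array([0.254, 7.5e-4])  # [intercept, slope]

def predicted_coeffs(theta, T_lo=17.0, T_hi=67.0, n=50):
    """Sample model predictions over the reported range and refit a line,
    so model and experiment are compared coefficient to coefficient."""
    T = np.linspace(T_lo, T_hi, n)
    slope, intercept = np.polyfit(T, model_cp(T, theta), 1)
    return np.array([intercept, slope])

# Residual between predicted and published coefficients for a trial theta.
resid = predicted_coeffs(np.array([0.26, 7.0e-4])) - data_coeffs
```

Because the comparison happens in coefficient space, the number of synthetic points `n` only affects the numerical refit, not the information content of the data, which is the point made below.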

Additionally, the above approach circumvents the need to arbitrarily select how many synthetic data points are generated from the polynomial fit coefficients for a given experiment. This is important because, when model form uncertainty is omitted, the posterior variance in the inferred physical parameters will continue to decrease with the number of points sampled.^{12} However, since the calibration data are a polynomial fit, altering the number of synthetic points does not increase the information content and, therefore, should not affect the variance of the physical parameters. Use of the fit coefficients as data avoids this problem.

We chose to use a constant uncertainty for each HE so that the differences between the resulting posterior distributions would reflect inherent differences between the HEs and not differences in measurement precision. In principle, we could have inferred the magnitude of the constant error in the Bayesian approach described below. However, this would require performing the UQ on all HEs simultaneously, vastly increasing the dimensionality of the parameter space. For fixed uncertainty, we can perform the UQ for each HE independently, limiting the number of variables for each UQ problem. We explore the effect of this approximation by varying the uncertainty in Sec. III.

#### 2. Products

For the product calibration data, we chose overdriven shock Hugoniot data, release isentrope data, and the following properties of the CJ state: detonation velocity ($D_{CJ}$), pressure ($P_{CJ}$), and temperature ($T_{CJ}$). These data are not all available in the two compendiums used for the reactants, and data involving temperature are very difficult to evaluate experimentally. Therefore, we used thermochemical predictions of the above quantities from the Los Alamos National Laboratory thermochemical code magpie^{28} as calibration data. For a given list of products, magpie minimizes the free energy for a mixture of atoms whose stoichiometry is derived from the reactant HE. The product list used for this work includes H_{2}O, H_{2}, O_{2}, N_{2}, NO, CO, CO_{2}, NH_{3}, HCOOH, HNCO, CH_{4}, C_{2}H_{2}, C_{2}H_{6}, C_{2}N_{2}, NO_{2}, N_{2}O, diamond, graphite, and liquid carbon. The free energies of the constituent products are described by the Ree-Ross liquid perturbation theory, a molecular interaction model that approximates molecules as isotropic and uses a pairwise potential to describe particle interactions.^{29} The molecular interactions are described by $\varphi(r) = \epsilon\, U(r/r^{*})$, where $\epsilon$ and $r^{*}$ are the characteristic energy and length scales, respectively, and $U$ is a universal function. There are several sets of Ree-Ross parameters available in the literature, differing in the choice of universal function, $U$, and the particular calibration of the parameters $\epsilon$ and $r^{*}$.

We use magpie to generate $P$–$V$ and $T$–$V$ data along both the shock Hugoniot and the release isentrope, in addition to the desired properties of the CJ state. In order to perform UQ, we need to include the uncertainty on the data in some way. It would, in principle, be possible to assign “error bars” to the calculated data; however, similar to the reactants above, it is unclear what the uncertainties should be. We can re-cast the problem more appropriately as follows. The magpie calculations can be seen as a (complicated and non-analytical) functional form to which the Ree-Ross parameters are input. Therefore, the core data for the product calibration are the Ree-Ross parameters. The uncertainty on the data is derived from the multiple sets of Ree-Ross parameters available in the literature,^{30–34} as described in Sec. II C 2.

### C. Uncertainty quantification

#### 1. Reactants

The Bayesian approach characterizes the posterior distribution of the model parameters $\theta$ given the calibration data $D$ via Bayes' theorem,^{35}

$$P(\theta | D) = \frac{P(D | \theta)\, P(\theta)}{P(D)},$$

where $P(\theta)$ is the prior, $P(D|\theta)$ is the likelihood, and $P(D)$ is the evidence. Note that the evidence is $\theta$-independent and can be discarded in the Markov Chain Monte Carlo (MCMC) sampling described below.

The prior beliefs are typically simple analytic expressions. For this work, we use relatively broad Gaussian functions for the priors on $A$ and $B$. Expert knowledge tells us that reasonable values for the sound speed ($A$) and its derivative as $U_p$ goes to zero ($B$) are in the ballpark of 2 or 3 for many HEs; therefore, we use a Gaussian with a mean of 2 and a standard deviation of 0.5 for the priors on those two parameters to lightly constrain $\theta$ toward physically reasonable values. For the other parameters, we use flat priors. The priors for all parameters had a minimum value of 0.0 and a maximum value of 10.

Evaluation of the likelihood term, $P(D|\theta )$, requires a generative model to predict the calibration data for a given parameter combination $\theta $. The generative model encompasses both the physics model prediction step and the application of the uncertainty associated with the experimental data. In this work, we assume that the experimental data, i.e., the polynomial fit coefficients, are corrupted by Gaussian-distributed noise only (no systematic biases), and that the physics model is not systematically deficient in its prediction of the fit coefficients.
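A minimal sketch of the resulting log-posterior, assuming (as stated above) independent Gaussian noise on the fit coefficients, Gaussian priors on $A$ and $B$, and flat priors on $[0, 10]$ elsewhere. The parameter ordering and the `predict_coeffs` callable are illustrative placeholders for the Davis generative model:

```python
import numpy as np

BOUNDS = (0.0, 10.0)   # flat-prior support for all parameters

def log_prior(theta):
    """Gaussian priors on A and B (mean 2, sd 0.5), flat elsewhere,
    with all parameters restricted to [0, 10]."""
    theta = np.asarray(theta, dtype=float)
    if np.any(theta < BOUNDS[0]) or np.any(theta > BOUNDS[1]):
        return -np.inf
    a_b = theta[:2]   # assume A and B are the first two entries (illustrative)
    return -0.5 * np.sum(((a_b - 2.0) / 0.5) ** 2)

def log_likelihood(theta, data_coeffs, sigma, predict_coeffs):
    """Independent Gaussian noise on each published fit coefficient;
    predict_coeffs(theta) stands in for the EOS generative model."""
    resid = predict_coeffs(theta) - data_coeffs
    return -0.5 * np.sum((resid / sigma) ** 2)

def log_posterior(theta, data_coeffs, sigma, predict_coeffs):
    lp = log_prior(theta)
    if not np.isfinite(lp):
        return -np.inf   # reject out-of-bounds proposals immediately
    return lp + log_likelihood(theta, data_coeffs, sigma, predict_coeffs)
```

The normalizing constants of the Gaussians are dropped, which is harmless for MCMC since only log-posterior differences enter the acceptance criterion.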

We characterize the above posterior distribution via MCMC sampling, using the emcee^{36} and ptemcee^{37} codes as described in Ref. 8. The MCMC sampling proposes trial $\theta $ parameter combinations and accepts or rejects these moves according to the Metropolis–Hastings criterion. All MCMC simulations were run with 48 walkers. In brief, we use ptemcee to perform parallel tempering MCMC (PT-MCMC) simulations for 1000 steps at 8 temperatures. Then, we fit the final 500 steps of the lowest-temperature walkers to a multi-variate Gaussian that we re-sample to get a good starting point for the single-temperature emcee calculations. The above intermediate fitting step ensures that all walkers are initialized in a distribution that is both smooth and close to the true posterior distribution, improving the equilibration of the walkers in the MCMC simulation. From the single-temperature MCMC run, the first 500 of 2500 total steps were discarded, which was more than sufficient to reach equilibrium. We used the LANL UQ code UNIT^{10} to organize the data and interface with emcee and ptemcee.
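The two-stage initialization can be illustrated with a self-contained random-walk Metropolis sampler standing in for the ptemcee/emcee machinery; the real workflow uses 48 walkers, parallel tempering at 8 temperatures, and the UNIT driver, and the toy posterior below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis(log_prob, x0, n_steps, step=0.3):
    """Minimal random-walk Metropolis chain (stand-in for emcee sampling)."""
    x = np.array(x0, dtype=float)
    lp = log_prob(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        prop = x + step * rng.normal(size=x.size)
        lp_prop = log_prob(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis acceptance
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Toy 2D Gaussian posterior standing in for the EOS posterior.
log_post = lambda t: -0.5 * np.sum((np.asarray(t) - 2.0) ** 2)

# Stage 1: exploratory run (parallel tempering at 8 temperatures in the paper).
warm = metropolis(log_post, [0.0, 0.0], 1000)

# Fit the final 500 steps to a multivariate Gaussian and re-sample it to
# initialize the 48 production walkers close to the posterior.
tail = warm[-500:]
mu, cov = tail.mean(axis=0), np.cov(tail.T)
walkers0 = rng.multivariate_normal(mu, cov, size=48)

# Stage 2: production run; the first 500 of 2500 steps are burn-in.
prod = metropolis(log_post, walkers0[0], 2500)[500:]
```

The intermediate Gaussian fit plays the same role as in the text: it smooths the tempered samples into a distribution that can be re-sampled for well-placed starting points.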

#### 2. Products

Physically, the product EOS would seem to be inextricably linked to the reactant EOS. However, for our choice of calibration data and a judicious definition of the reference state, we can nearly untangle the reactant and product EOSs. If all of the product calculations in the calibration dataset are initiated from a reference state where $T = 298.15\,\mathrm{K}$ (the reference temperature chosen in the Davis reactant model) and the volume is the reference volume, then the reactant EOS has an internal energy and pressure of zero by construction, regardless of the six adjustable input parameters to the reactant EOS. We then need only ensure that the reactant and product Hugoniots are physically reasonable (i.e., that detonation causes expansion); see Sec. II D.

Since we define the product calibration data to be the Ree-Ross parameters, MCMC sampling would require generating physics predictions for a given trial $\theta $ and then fitting those predictions to an optimal set of Ree-Ross parameters that best describe those data. Unlike the reactants, this fitting step is incredibly challenging because the “functional form” encoded by magpie is a constrained, free energy minimization requiring optimization of approximately 50 degrees of freedom (there are either three or four adjustable parameters per molecule in the product library). Another complicating factor is that there are a relatively small number of Ree-Ross parameter sets to use as data, and the Ree-Ross parameters within a given set will have non-trivial correlations with each other. As a result, it is not clear how to express the uncertainty in a way compatible with Eq. (3).

Bootstrapping has been referred to as the “poor man’s” posterior and, thus, can provide an estimate for Eq. (2) in a totally different fashion than that outlined in Sec. II C 1.^{38} The “poor man’s” descriptor refers to the ease with which bootstrapping can be implemented as compared with parametric approaches such as Bayesian MCMC. In Bayesian approaches, we are required to make assumptions about the nature of the uncertainty present in the calibration data, i.e., we are assuming a statistical distribution for the underlying process that generated the calibration data. By contrast, bootstrapping does not require any such assumptions. We only assume that there is some underlying distribution (about which we are not required to have knowledge) that has been sampled to generate the calibration data (here, the five sets of Ree-Ross parameters). Bootstrapping then enables estimation of the posterior distribution directly from these samples. The methodology approximates a Bayesian analysis in the absence of prior beliefs about $\theta $.^{38}

The first step in bootstrapping is sampling the calibration data with replacement. In this case, we use five different sets of Ree-Ross parameters available in magpie (labeled as full, JCZS, RUS, REE, and CARTE in magpie). Having only five datasets to choose from can result in artifacts in the resulting bootstrapped distributions. Therefore, we use Bayesian bootstrapping to smooth out the resulting distribution. That is, instead of sampling with replacement (equivalent to weighting the samples with multinomial weights), the weights are drawn from the uniform Dirichlet distribution (i.e., all weights are non-negative and sum to 1).^{39}
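A sketch of the Bayesian-bootstrap re-weighting, with hypothetical predictions standing in for the magpie outputs from the five Ree-Ross parameter sets; each Dirichlet draw defines one weighted target dataset that would then be fit for Davis product parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical predictions of some calibration quantity from the five
# Ree-Ross parameter sets (rows: datasets, columns: data points).
SETS = ["full", "JCZS", "RUS", "REE", "CARTE"]
preds = rng.normal(loc=5.0, scale=0.2, size=(len(SETS), 10))

def bayesian_bootstrap_targets(preds, n_replicates):
    """Each replicate draws Dirichlet(1,...,1) weights over the datasets and
    forms weighted target data; fitting Davis parameters to each target
    (done by magpie in the paper) yields one posterior sample."""
    targets = []
    for _ in range(n_replicates):
        w = rng.dirichlet(np.ones(len(SETS)))   # non-negative, sums to 1
        targets.append(w @ preds)               # weighted combination of rows
    return np.array(targets)

targets = bayesian_bootstrap_targets(preds, 1000)
```

Relative to the ordinary bootstrap, the continuous Dirichlet weights avoid the discreteness artifacts that five resampled datasets would otherwise produce.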

For each of the five sets of Ree-Ross parameters, we can generate physics predictions for the quantities that we wish to include in the calibration (see Sec. II B 2). If a product species is not present in a given Ree-Ross parameter set, we default to the “full” Ree-Ross set in magpie, which contains all product species. magpie has built-in capabilities to fit the Davis product parameters to multiple datasets with weights. By repeatedly sampling different weights for the physics predictions associated with the Ree-Ross parameters, we obtain different realizations of the Davis product parameters that estimate the desired posterior distributions. The exact details of the fitting procedure that transforms the bootstrapped data into an optimal set of Davis parameters are somewhat arbitrary. We optimize the Davis parameters such that they minimize the root mean squared error between the magpie thermochemical predictions and the Davis model physics predictions.

### D. Machine learning

Subsection II C outlined the UQ strategies for treating the reactants and products independently. However, we need to filter out parameter combinations that result in the unphysical outcome of densification upon detonation. That is, the reactant Hugoniot should be denser than the product Hugoniot at any pressure that is accessible to the reactant state. In order to do this filtering efficiently (as we must combinatorially check the samples comprising the reactant and product posterior distributions), we train a machine-learning (ML) classifier to predict whether a given combination of reactant and product EOSs is valid.

For each HE, we fit both the reactant MCMC samples and the bootstrapped product samples to kernel density estimates (KDEs). We then pull 50 000 samples from the two distributions, compute the Davis model predictions for the Hugoniot curves, and label each combination as either valid or invalid. While not required, modeling the posterior samples with KDEs allows us to sample the posterior more densely and concentrate the training data on the portions of phase space where we know that we need to make accurate predictions. 70% of these data formed the training set, 15% formed the validation set, and the remaining 15% were held out to form the test set.

scikit-learn^{40} was used for the implementation of the ML model. Three different classifiers were originally tested on the Hugoniot data: Gaussian Naïve Bayes, Random Forest, and Gradient-Boosted Trees. Preliminary work indicated that the Gradient-Boosted Trees performed the best. Then, the hyperparameters^{41} for the Gradient-Boosted Trees algorithm were tuned by fivefold cross-validation. During training with the optimal hyperparameters, the validation set was used to determine the stopping criterion. Over the six HEs, the resulting models (one per HE) were 96.5%–98.7% accurate on the test sets containing 7500 samples that were completely held out of the ML training.
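The classifier pipeline can be sketched with scikit-learn as follows; the synthetic features, the linear stand-in for the Hugoniot-validity rule, and the specific hyperparameters are illustrative assumptions, not the tuned values from the paper:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

# Hypothetical joint (reactant, product) parameter samples; a synthetic
# linear rule stands in for the Hugoniot-ordering validity label.
X = rng.normal(size=(5000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] < X[:, 2]).astype(int)

# 70/15/15 train/validation/test split, as in the text.
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)

# Gradient-boosted trees with early stopping on an internal validation split.
clf = GradientBoostingClassifier(
    n_estimators=100, n_iter_no_change=10, validation_fraction=0.15, random_state=0
)
clf.fit(X_tr, y_tr)
val_acc = clf.score(X_val, y_val)   # monitored during model development
test_acc = clf.score(X_te, y_te)    # fully held-out accuracy
```

In the paper, the features would be the concatenated reactant and product Davis parameters and the labels would come from explicit Hugoniot evaluations.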

### E. Statistical distances

The above UQ analysis allows us to make pairwise statistical comparisons of different HE formulations on the basis of the two posterior distributions, $p(\theta)$ and $q(\theta)$. Statistical distances are used to compare probability distributions. For this work, we employ the Bhattacharyya distance,

$$D_{BC}(p, q) = -\ln \int \sqrt{p(\theta)\, q(\theta)}\, d\theta,$$

which has the advantage of being symmetric: it yields the same value regardless of which HE posterior distribution is defined to be $p(\theta)$ and which is defined as $q(\theta)$.

Operationally, $D_{BC}$ between two HEs is computed by modeling $p(\theta)$ and $q(\theta)$ as KDEs and pulling 10 000 samples from $p(\theta)$ at which the right-hand side of Eq. (5) can be evaluated. We can estimate the uncertainty in the $D_{BC}$ values (largely derived from the KDE modeling step) by switching the definitions of $p(\theta)$ and $q(\theta)$ in Eq. (5); perfect estimation of the posteriors in the infinite-sample limit would yield exactly the same value of $D_{BC}$ for either definition of $p(\theta)$ and $q(\theta)$. The average discrepancy between the $D_{BC}$ values is less than 5%, with nearly all distances having an uncertainty of less than 10%. We report the average value of $D_{BC}$ derived from the above two calculations in the paper.
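A sketch of the KDE-based Monte Carlo estimate of $D_{BC}$, including the symmetrization used to gauge the KDE modeling error; the toy Gaussian "posteriors" are illustrative stand-ins for the HE parameter ensembles:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

def bhattacharyya(samples_p, samples_q, n=10_000, seed=0):
    """Monte Carlo estimate of D_BC = -ln E_p[sqrt(q/p)] using Gaussian
    KDE models of the two posteriors (samples have shape ndim x nsamples)."""
    p = gaussian_kde(samples_p)
    q = gaussian_kde(samples_q)
    x = p.resample(n, seed=seed)          # draw from the KDE model of p
    ratio = np.sqrt(q(x) / p(x))          # importance-style integrand
    return -np.log(np.mean(ratio))

# Two toy "posteriors"; averaging over the (p, q) ordering symmetrizes the
# estimate and the spread gauges the KDE modeling error, as in the text.
a = rng.normal(0.0, 1.0, size=(2, 4000))
b = rng.normal(0.5, 1.0, size=(2, 4000))
d_ab, d_ba = bhattacharyya(a, b), bhattacharyya(b, a)
d_bc = 0.5 * (d_ab + d_ba)
```

Since the integrand is evaluated only at samples drawn from $p$, regions where $p$ is vanishingly small do not destabilize the estimate.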

## III. RESULTS AND DISCUSSION

### A. Reactants only

Before showing UQ results, we recall that we have defined the polynomial fit coefficients reported in Refs. 22 and 23 as the reactant calibration data, and we have assumed that the uncertainties on the fit coefficients are uncorrelated and Gaussian with fixed standard deviations. Data are often reported in such a fashion, where the underlying measurements are distilled into fit coefficients with (what is presumably) a standard deviation associated with each fit coefficient. (See, for example, the linear fits to the Hugoniot measurements in Ref. 22.)

On the one hand, the assumptions described above are simple and easily compatible with the log-likelihood expressed in Eq. (4). On the other hand, this formulation is unlikely to be consistent with the underlying data that generated the fit. Consider a linear fit to a monotonically increasing dataset in the first quadrant (e.g., all of the reactant data in this work). If the stochastic errors are such that the fitted slope is steeper than the “true” slope, then the fitted intercept will be lower than the “true” intercept and vice versa. Therefore, the slope and intercept should be anti-correlated for the data in this work.

If one knew the exact details of the fitting procedure (including where the data points were measured and how the measurement uncertainty varied over the data points), it would be possible to infer the co-variance terms. However, if one has sufficient information to definitively infer the co-variances, one probably has sufficient information to simply use the data directly. In the absence of the co-variance information, the most conservative (maximum entropy) choice is to assume that all co-variances are zero, knowing that there is some loss of information content relative to the original (unknown) measured data. A related approach to neglecting the co-variance terms is inflating experimental uncertainties; both approaches increase the uncertainty of the data. The latter is a zeroth-order strategy to ameliorate the effects of model discrepancy^{12,42,43} and thus could be justifiable in a qualitative sense for this work since we are not otherwise accounting for model form uncertainty.
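The slope-intercept anti-correlation is easy to demonstrate numerically: repeated linear fits to noisy realizations of a line over a first-quadrant domain (the TNT $C_p$ coefficients from above are borrowed purely for scale; the noise level is an arbitrary assumption) yield strongly anti-correlated coefficients.

```python
import numpy as np

rng = np.random.default_rng(4)

# Repeatedly fit noisy data drawn from a line in the first quadrant and
# examine how the fitted slope and intercept co-vary across realizations.
T = np.linspace(17.0, 67.0, 20)              # e.g., a Cp(T) temperature range
true_intercept, true_slope = 0.254, 7.5e-4

coeffs = []
for _ in range(2000):
    y = true_intercept + true_slope * T + rng.normal(0.0, 0.01, size=T.size)
    slope, intercept = np.polyfit(T, y, 1)   # highest power first
    coeffs.append((slope, intercept))
coeffs = np.array(coeffs)

# Pearson correlation between fitted slope and intercept: strongly negative.
r = np.corrcoef(coeffs[:, 0], coeffs[:, 1])[0, 1]
```

For ordinary least squares, this correlation is $-\bar{T}/\sqrt{\overline{T^2}}$, so it is large and negative whenever the data sit far from the origin, as here.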

With the above caveat in mind, we performed UQ for the reactant EOS for the eight HEs listed in Sec. II B 1. The marginalized single variable distributions for the Davis adjustable parameters in the case of TNT are shown in Fig. 1 for the Gibbs & Popolato 1980 data (blue dotted-dashed lines), Dobratz 1972 (red dashed lines), and a combined dataset using both sources (purple solid lines). Some variables (e.g., $C_v^{r0}$) are essentially insensitive to the data source used. However, other variables differ more noticeably between the two data sources. $C$, in particular, varies quite strongly; $C$ reflects the high-pressure regime of the EOS, where there is typically little constraining data.

We can then use these distributions to produce ensembles of predictions for various physical quantities of interest. In Fig. 2, we show the predictions from 480 samples for the calibration data, where the median prediction along with 95% confidence intervals (CIs) is compared to the experimental calibration data for TNT. Despite the differences in the single variable distributions, the ranges of predictions over the calibration dataset are relatively similar, with only small differences in the medians and CIs. However, predictions that are not included in the calibration dataset (e.g., downstream hydrodynamic calculations) may vary more strongly.^{8} While the calibration data are in reasonable qualitative agreement with the EOS model predictions, we observe some discrepancy between the two datasets, indicating that the physics model is unable to simultaneously reproduce all of the calibration data. This could be due to deficiencies in the physics model, systematic errors in the calibration data, or a combination of the two effects.

While one of the purposes of performing UQ is to pass these ensembles of EOSs in some form to hydrodynamic simulations to generate a range of outcomes consistent with the calibration data, another possible use for these ensembles is to answer statistical questions about similarities and differences between EOSs. Here, we compute the Bhattacharyya distance to qualitatively compare the ensembles for the eight HEs studied in this work.

In Fig. 3, we show the Bhattacharyya distances between various ensembles of reactant EOSs. The diagonal blocks indicate the differences between the Gibbs & Popolato 1980 UQ and the Dobratz 1972 UQ for the same HE. The off-diagonal elements are the statistical differences between different HEs within the same dataset (lower left is Dobratz 1972 and upper right is Gibbs & Popolato 1980). Note that many of the diagonal terms are larger in magnitude than some of the off-diagonal terms, indicating that two different HEs can be more similar to each other within a single data source than the same HE is across the two compendiums. For instance, PBX 9501 and PBX 9404 are closer if a single data source is used ($D_{BC} = 0.6$ or $0.8$) than the two PBX 9404 ensembles derived from the two datasets ($D_{BC} = 1.1$). There is also variability in terms of the disparities between the datasets: Baratol is the least affected by choice of data source ($D_{BC} = 0.2$), and PETN is the most affected ($D_{BC} = 2.1$). To underscore just how significant the differences between the two data sources can be, we note that reactant PETN is more similar to PBX 9501 (a very different HE) within the same data source ($D_{BC} = 1.7$ or $1.4$) than PETN is to itself between the two data sources. As another example, XTX 8003 and PBX 9011 are hardly distinguishable ($D_{BC} = 0.3$) within Dobratz 1972; however, those same two HEs are much more distinct ($D_{BC} = 3.2$) within Gibbs & Popolato 1980. The differences shown in Figs. 1 and 2 for TNT correspond to an HE that is comparatively insensitive to data source ($D_{BC} = 0.3$ in Fig. 3).

In the absence of a reason to trust one dataset over the other, we combine both datasets and report those Bhattacharyya distances in Fig. 4. Here, we also explore the effect of varying the uncertainty from 5% to 10%. The diagonal terms quantify the differences in the same HE between the two choices for uncertainty. The off diagonal terms compare different HEs with an uncertainty of 5% for the lower left triangle and 10% for the upper right triangle.

While the qualitative trends are preserved between the two choices for uncertainty, the distances shrink for the larger uncertainty value because the larger uncertainty results in less sharply peaked distributions. Some of the trends between HEs in Fig. 4 are as expected. For instance, PBX 9501 and PBX 9404 are relatively similar, and both Baratol and XTX 8003 are not particularly similar to any of the other HEs (though XTX 8003, which contains PETN, is most similar to PETN). However, other trends are counter-intuitive. For instance, PBX 9501 and Comp B are extremely close together in a statistical sense. PBX 9404 and TNT are also highly similar. These latter two results are not at all expected based on the known behavior of the HEs. It is important to resolve these oddities in order to demonstrate that this approach to quantifying differences between parameter ensembles is robust. Therefore, we must also perform UQ on the product EOSs since these, too, inform the behavior of HEs.

### B. Inclusion of products

As described in Sec. II C 2, we use the Bayesian bootstrap to repeatedly re-weight the five Ree-Ross parameter sets available in the literature in order to approximate the posterior distribution for the product EOSs. Figure 5 shows the spread in the single variable distributions for the adjustable parameters of the product Davis model after the bootstrapping in the case of TNT. This analysis is performed for only six HEs, with Baratol and XTX 8003 omitted because magpie cannot currently treat compounds containing barium or silicon.

Figure 6 shows the ensemble of predictions compared to the calibration data for TNT. On the scale furnished by the range of specific volumes in the data plotted in panels (a) and (b), the spread is not visually obvious, but the Davis predictions are in reasonable agreement with the magpie data. The properties of the CJ state fall within the ranges of the magpie predictions as well. The Davis EOS model does seem to struggle to capture the temperature dependence as a function of specific volume along the overdriven Hugoniot and release isentrope. This is perhaps unsurprising since this type of detailed temperature data is generally not available from experiment, and the product Davis model was not developed with these data in mind.

Because of the nature of the product calibration data, we have performed UQ on the reactant and product EOSs separately to this point. However, one final physical consideration affects both the reactant EOS and the product EOS: in pressure–volume space, the product EOS should always lie at higher specific volume than the reactant EOS at a fixed pressure, i.e., materials should expand upon detonation. We treat this constraint as a step function; parameter combinations that violate it have a probability of zero. We have independent samples for the reactant and product EOS parameters (from the MCMC sampling and the bootstrap, respectively) that we can combine into a single distribution by checking whether the above constraint is obeyed; if so, the reactant and product EOS parameters are included in the final joint distribution. We trained an ML model for each HE to predict whether a given joint parameter combination is valid. We sub-sampled 5000 samples from the MCMC, used all 5000 product combinations from the bootstrapping, and combinatorially checked each parameter combination, for a total of $2.5 \times 10^{7}$ evaluations.
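The step-function constraint and the combinatorial pairing can be sketched as follows. All curves and names here are hypothetical stand-ins (simple illustrative v(P) functions rather than actual Davis EOS evaluations), and the sketch omits the ML surrogate used in practice to accelerate the check:

```python
import numpy as np

pressures = np.linspace(1.0, 50.0, 64)  # illustrative shared pressure grid

def expansion_ok(v_reactant, v_product):
    """Step-function prior: the product curve must sit at larger
    specific volume than the reactant curve at every grid pressure."""
    return bool(np.all(v_product > v_reactant))

def filter_joint(reactant_samples, product_samples):
    """Combinatorially pair reactant and product samples, keeping
    only pairs that satisfy the expansion constraint."""
    return [(vr, vp)
            for vr in reactant_samples
            for vp in product_samples
            if expansion_ok(vr, vp)]

# Hypothetical stand-in curves for one reactant / product sample pair.
v_r = 1.0 / (1.0 + 0.02 * pressures)
v_p = 1.15 / (1.0 + 0.02 * pressures)
```

With 5000 reactant samples and 5000 product samples, this nested loop is what produces the $2.5 \times 10^{7}$ pairings, motivating the ML classifier as a fast stand-in for the full EOS evaluations.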

The changes to the TNT reactant distributions are shown in Fig. 7, where the distributions before and after the ML filtering are plotted. (The product distributions were largely unaffected and are therefore not shown.) Most of the single-variable distributions are similar in both cases, but the $C$ parameter is shifted to lower values. The Hugoniot constraint provides additional information about the high-pressure behavior that influences the admissible values for $C$. In the reactant EOS distribution, $B$ and $C$ have a Pearson correlation coefficient ($R$) of $-0.8$, indicating a strong anti-correlation, and so $B$ commensurately shifts to slightly higher values. By contrast, $A$ and $\Gamma_0^r$ are directly correlated with $C$ ($R$ values of 0.25 and 0.37, respectively) and so shift to lower values along with $C$.
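These compensating shifts follow directly from the sample correlations in the ensemble. A minimal sketch with synthetic stand-in samples (the actual TNT ensembles are not reproduced here) of how such a coefficient is computed and what a strong anti-correlation looks like:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for ensemble samples of B and C,
# constructed to have a Pearson R near -0.8.
n = 50_000
C = rng.normal(size=n)
B = -0.8 * C + np.sqrt(1.0 - 0.8**2) * rng.normal(size=n)

r = np.corrcoef(B, C)[0, 1]  # sample Pearson correlation coefficient
```

When a constraint pushes $C$ downward, any parameter with $R < 0$ relative to $C$ (here, $B$) must shift the opposite way to keep the joint distribution consistent, which is exactly the behavior seen in Fig. 7.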

Given the final statistical descriptions of the six HEs, we can provide an updated matrix of Bhattacharyya distances in Fig. 8. The counter-intuitive results that occurred when only the reactant EOS was included have been alleviated. In particular, the pairs (PBX 9404, TNT), (PBX 9404, PETN), and (PBX 9501, Comp B) were all more similar than (PBX 9501, PBX 9404) in Fig. 4. When both reactant and product data are included, however, (PBX 9501, PBX 9404) remain similar to each other, while the other pairs become more disparate.

Interestingly, despite also having HMX as the energetic material, PBX 9011 is more distinct from PBX 9501 and PBX 9404. In the original calibration data, the coefficient of thermal expansion for PBX 9011 is quite different from that of the other two PBXs, so the distances in Fig. 8 are consistent with the underlying data. The thermal expansion coefficient measurements in the two compendia are rather discrepant from one another, so it is difficult to speculate on the underlying physical reason why PBX 9011 deviates from PBX 9404 and PBX 9501. Nonetheless, the statistical distances reflect these differences in calibration data in a distilled fashion, propagating them through to their effect on the parameters.

## IV. CONCLUSIONS

In this study, we have demonstrated a composite UQ strategy for heterogeneous datasets, and we have used the resulting UQ to compute statistical distances that rank the similarities and differences among HE formulations. The ability to fully quantify the uncertainty in the reactant and product EOSs for HEs is an essential step toward propagating small-scale experimental data up to multi-physics experiments.

In addition to the usual applications of uncertainty propagation, the ability to compare similarities and differences between EOS distributions could be relevant to answering many questions. In this work, we have ranked the HEs in terms of consistency between two separate data sources, but there are other possible applications. One could rank the degree to which new pieces of data influence the UQ; i.e., a larger $D_{BC}$ would indicate a particularly impactful piece of data. Similarly, for a newly developed material (e.g., a novel HE), one could rank which existing materials are most similar or different in order to make further predictions about performance.

## ACKNOWLEDGMENTS

This work was supported by the U.S. Department of Energy through the Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218NCA000001).

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Beth A. Lindquist:** Conceptualization (lead); Data curation (equal); Formal analysis (equal); Methodology (lead); Supervision (equal); Writing – original draft (lead). **Ryan B. Jadrich:** Conceptualization (supporting); Methodology (supporting); Software (lead); Supervision (equal); Writing – original draft (supporting). **Juampablo E. Heras Rivera:** Data curation (equal); Formal analysis (equal); Writing – original draft (supporting). **Lucia I. Rondini:** Data curation (equal); Formal analysis (equal); Writing – original draft (supporting).

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.

## REFERENCES

*ASME 2019 Verification and Validation Symposium* (ASME, 2019).

*LASL Explosive Property Data*, Los Alamos Series on Dynamic Material Properties, edited by T. R. Gibbs and A. Popolato (University of California Press, Berkeley, CA, 1980).

*Properties of Chemical Explosives and Explosive Simulants*, edited by B. M. Dobratz (Lawrence Livermore Laboratory, Livermore, CA, 1972), Vol. 1, UCRL-51319 Rev. 1.

*12th Symposium (International) on Detonation* (U.S. Naval Research Office, San Diego, CA, 2002).

*Bayesian Reasoning and Machine Learning*

*The Elements of Statistical Learning*, Springer Series in Statistics (Springer, New York, 2001).