We present a deep learning based framework for real-time analysis of a differential filter based x-ray spectrometer that is common on short-pulse laser experiments. The analysis framework was trained with a large repository of synthetic data to retrieve key experimental metrics, such as slope temperature. With traditional analysis methods, these quantities would have to be extracted from data using a time-intensive and manual analysis. This framework was developed for a specific diagnostic, but may be applicable to a wide variety of diagnostics common to laser experiments and thus will be especially crucial to the development of high-repetition rate (HRR) diagnostics for HRR laser systems that are coming online.
I. INTRODUCTION
The development of high-intensity high-repetition rate (HRR) lasers, with intensities $>10^{18}$ W/cm$^2$ that can operate on the order of 1 Hz or faster, is a quickly maturing field, and a number of laser facilities around the world are already operating in this regime.1 The use of these HRR systems will be particularly significant for laser-driven acceleration research, which relies on these high-intensity lasers impinging on matter to generate beam-like high-energy particle and photon sources.
Much of the research in laser-driven acceleration has focused on understanding the relationship between laser parameters, such as laser energy, pulse duration, and laser spot size, and the characteristics of the resulting high-energy sources, such as maximum energy, beam laminarity, and total particle dose. With HRR technology, one could imagine spanning a large laser-parameter space more quickly, thus generating an unprecedented amount of data, which could then be the foundation of more robust models of laser-driven accelerators. In the future, this may enable the creation of tunable laser-driven accelerators that could have applications ranging from fusion to material studies to medical therapies.
While there has been considerable development in the area of HRR targets,2 such as continuous target-tape drives,3 gas targets,4 and liquid film targets,5 there is not yet a wide suite of HRR target diagnostics that can measure and analyze the laser-particle acceleration process at the rate that the laser can fire. In addition, while there exists detection media technology to collect particle and photon traces at high repetition rates, such as ultra-fast high-resolution cameras,6 computational frameworks that can analyze these traces in real time at HRR are still lacking. Real-time analysis of diagnostic outputs will allow for experiments with active feedback, where data are used to inform the laser inputs that are required to generate the desired characteristics of a laser-driven particle source. Therefore, diagnostics and analysis frameworks remain the missing component in utilizing the full potential of HRR systems for laser-driven acceleration experiments.
Machine learning (ML), in particular deep learning (DL), has long been an excellent tool for recognizing patterns and features in data.7–9 Therefore, utilizing these tools in data analysis for short-pulse laser experiments benefits from the large body of research on deep learning for image recognition and classification. In this work, we describe the development of automated data analysis based on deep neural networks for a synthetic step-filter-based x-ray spectrometer, called the Livermore Tantalum Step Filter (LTSF). The LTSF often serves as one of the workhorse diagnostics on such laser-driven experiments since it both diagnoses the characteristics of emitted x-ray radiation and allows for the inference of key information about the internal electron distribution of irradiated targets.10 Given the simplicity of the LTSF diagnostic, it is a good first candidate to study the feasibility, strengths, and weaknesses of a machine-learning based automated analysis. While this framework was developed for a specific x-ray diagnostic, these machine learning techniques are applicable to a variety of diagnostics common on short-pulse laser acceleration experiments.
II. LTSF DIAGNOSTIC
One important thrust of research in high-intensity laser interactions with solid-density targets is the development of high-energy x-ray sources. As the laser interacts with solid-density materials, typically micrometer thick metallic foils, high energy electrons (>1 MeV) are produced via a variety of mechanisms including the ponderomotive force,11 where electrons are accelerated and oscillated by the laser field. These electrons can then drive the generation of secondary sources of radiation, such as a continuum of x-ray bremsstrahlung photons. Recent results have shown the potential of using these x-ray sources to perform MeV-radiography of dense objects relevant to high-energy density physics and inertial confinement fusion.12,13
To measure and characterize the spectra of these x rays, a differential-filter-based spectrometer called the LTSF has been developed, which has been frequently used across many short-pulse laser platforms.10 This LTSF spectrometer is composed of 36 channels of different thicknesses of tantalum followed by a FujiFilm BA-SR image plate to collect the x-ray signal. A diagram of the diagnostic filters and their thicknesses as well as example image plate data are shown in Fig. 1. Using the known transmission of x rays through thicknesses of tantalum and the GEANT4 modeled sensitivity of the image plate, one can build a synthetic detector response of the full LTSF system. This detector response encodes information about the transmission of a particular energy photon through each channel and the image plate sensitivity. This can be represented by a matrix, R, which has elements $R_{ij}$ that describe the detector response for photons of energy $E_j$ that have interacted with channel $i$.10 The resulting measured LTSF data (D) are then a matrix multiplication of the detector response and the assumed x-ray spectrum [S(E)], such that
$$D_i = \sum_j R_{ij}\, S(E_j). \qquad (1)$$
Example Livermore tantalum step filter (LTSF) image plate data are shown from a typical experiment along with a cartoon of the LTSF. The thickness of each tantalum channel is shown on the right in units of micrometers in bold. The LTSF is also covered with one 23.5 μm thick aluminum filter that covers all channels.
In a traditional analysis scheme, the measured data are analyzed using a forward-fit routine. An assumed one- or two-temperature spectrum of the form ∼A0 exp(−E/T), where A0 is the amplitude of the spectrum, E is the energy, and T is the electron temperature, is used to generate synthetic data using the matrix equation [Eq. (1)], and a routine then iterates until the error between the synthetic data and the background-subtracted real data is minimized.10 This scheme is therefore limited in that it relies on an assumed functional form for the spectrum.
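As a rough illustration of the forward-fit idea described above, the following minimal sketch generates data from a one-temperature spectrum through a response matrix and then recovers the amplitude and temperature. Everything specific here is an assumption made for illustration: the response matrix is a toy exponential-attenuation model (not the GEANT4-based LTSF response), the energy grid and six-channel layout are hypothetical, and a simple grid search over T stands in for whatever iterative routine is used in practice.

```python
import math

# Hypothetical energy grid (keV) and channel count; NOT the real LTSF configuration.
ENERGIES = [50.0 * (j + 1) for j in range(40)]          # 50 ... 2000 keV
N_CHANNELS = 6

def toy_response(n_channels=N_CHANNELS, energies=ENERGIES):
    """Toy R[i][j]: transmission exp(-thickness_i * mu_j), with mu_j ~ 1/E_j."""
    rows = []
    for i in range(n_channels):
        thickness = 0.5 * (i + 1)                        # arbitrary units
        rows.append([math.exp(-thickness * 100.0 / e) for e in energies])
    return rows

def spectrum(amplitude, temperature, energies=ENERGIES):
    """One-temperature spectrum S(E) = A0 * exp(-E / T)."""
    return [amplitude * math.exp(-e / temperature) for e in energies]

def forward(response, spec):
    """Forward model in the spirit of Eq. (1): D_i = sum_j R_ij * S(E_j)."""
    return [sum(r * s for r, s in zip(row, spec)) for row in response]

def forward_fit(data, response, t_grid):
    """Grid search over T; for each T the best A0 follows from linear least squares."""
    best = None
    for t in t_grid:
        unit = forward(response, spectrum(1.0, t))       # model for A0 = 1
        a0 = sum(d * u for d, u in zip(data, unit)) / sum(u * u for u in unit)
        err = sum((d - a0 * u) ** 2 for d, u in zip(data, unit))
        if best is None or err < best[0]:
            best = (err, a0, t)
    return best[1], best[2]

R = toy_response()
measured = forward(R, spectrum(1.0e5, 400.0))            # noise-free synthetic "data"
a0_fit, t_fit = forward_fit(measured, R, [150.0 + 10.0 * k for k in range(66)])
```

Because the amplitude enters linearly, only T needs to be searched; a two-temperature fit would add further parameters and would likely call for a proper optimizer rather than a grid.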
The key characteristics of the spectrum can be distilled into three main measurement outputs: (1) amplitude of the spectrum, (2) slope of the spectrum (which is a proxy for the temperature of the electrons generating the resulting x rays), and (3) total particle dose. Traditional analysis is therefore a multi-step process where the spectrum is first extracted from the image data and then these data are further processed to extract these key experimental metrics. While this is conceptually simple, it is time-intensive and requires some initial assumptions and guesses at the spectrum, and furthermore, it cannot be completed at HRR (i.e., at rates comparable to the laser shot rate of 1 Hz or faster). In contrast, the deep learning framework presented here is designed to extract these key metrics from the data directly and automatically.
III. DEEP LEARNING BASED AUTOMATED DATA ANALYSIS
Extracting features from images is a common task for machine learning and more specifically, deep learning (DL) algorithms. A typical test case for machine learning algorithms is to identify images of handwritten digits from the MNIST database.14 In many ways, the extraction of spectral features from x-ray spectrometer data is not a dissimilar task and thus benefits from the active research and resources in deep learning for image classification.
There are many benefits of a deep learning based automated analysis that extend beyond speed when compared to traditional analysis techniques. For instance, the probabilistic nature of deep learning architectures makes uncertainty quantification of results more robust compared to traditional analysis, which often relies on tolerance analysis or is subject to systematic errors introduced by the human operator. In addition, DL-based automated analysis makes it possible to integrate multiple streams of data dynamically. For example, as described previously, short-pulse laser experiments typically produce multiple types of secondary photons and charged particles. Using these DL frameworks, multiple streams of data from these different particle and photon sources could be integrated “on-the-fly” in order to describe the experiment more comprehensively. Recently, integrated data analysis (IDA) techniques have been utilized throughout the plasma science and fusion community to help constrain plasma parameters in magnetic fusion and inertial confinement fusion experiments using Bayesian inference.15,16 Using DL-based automated analysis, one could quickly analyze multiple streams of multi-modal data with high fidelity, which could then be the basis for an integrated data analysis model in a short-pulse laser experiment context.
A. Description of LTSF training data
Regardless of the machine learning architecture, training an effective model relies on a large repository of training examples. A synthetic dataset (N = 47 845 samples) was produced assuming simple spectra of the form of a one-temperature Maxwellian distribution following the functional form (A0/E) exp(−E/T). In preparing the synthetic dataset, amplitudes were sampled from a uniform distribution between $10^4$ and $10^6$ photons/keV. Similarly, temperatures were sampled from a uniform distribution between 150 and 800 keV. These parameter ranges were chosen to span typical operation of the LTSF diagnostic. Synthetic LTSF data are then produced using Eq. (1), where this simple spectrum is multiplied by the detector response matrix to produce the LTSF image plate data in units of photostimulated luminescence (PSL), which is a unit related to the stored energy in the image plate phosphor. In addition, Gaussian noise was added to the synthetic data images, with both the mean and the standard deviation set to $1 \times 10^{-4}$ of the maximum PSL value across the image. This was important both to mimic the background and to keep the machine learning model from being over-constrained. The set of synthetic data was further pruned to exclude examples with low signal across the image plate (i.e., fewer than four channels with signals above 0.1 PSL) and examples with high image plate values (i.e., saturated channels with PSL values above 250).
While an image plate is an appropriate detection medium for lower-repetition rate platforms, since slower shot cycles allow for manual scanning of the image plates between shots, in a high-repetition rate scheme the LTSF would need to be redesigned to utilize electronic detection, such as scintillators coupled to cameras. Therefore, the machine learning model presented here would need to be updated to reflect these different detection media, but the general framework and procedure of generating a large set of synthetic data for training of these models would remain unchanged. In this way, this model represents a proof of principle for this type of analysis scheme and provides a means to investigate the strengths and weaknesses of machine learning based analysis.
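The synthetic-data recipe described in this section (uniform sampling of amplitude and temperature, application of the detector response, additive Gaussian noise, and pruning) can be sketched as follows. This is only a sketch under stated assumptions: the response matrix is a hypothetical toy stand-in for the GEANT4-based LTSF response, the energy grid is invented, the noise is applied per channel rather than per image-plate pixel, and all function names are illustrative.

```python
import math
import random

ENERGIES = [50.0 * (j + 1) for j in range(40)]   # hypothetical grid, keV
N_CHANNELS = 36

def toy_response():
    """Hypothetical stand-in for the LTSF response matrix (not the GEANT4 model)."""
    return [[1.0e-4 * math.exp(-0.1 * (i + 1) * 100.0 / e) for e in ENERGIES]
            for i in range(N_CHANNELS)]

def make_example(response, rng):
    """Draw (A0, T), build the spectrum, apply the response, and add noise."""
    a0 = rng.uniform(1.0e4, 1.0e6)               # photons/keV
    temp = rng.uniform(150.0, 800.0)             # keV
    spec = [(a0 / e) * math.exp(-e / temp) for e in ENERGIES]
    clean = [sum(r * s for r, s in zip(row, spec)) for row in response]
    scale = 1.0e-4 * max(clean)                  # noise mean and std, as in the text
    noisy = [c + rng.gauss(scale, scale) for c in clean]
    return a0, temp, noisy

def keep(channels, low=0.1, min_channels=4, saturation=250.0):
    """Pruning rule: enough channels above `low` PSL and none above `saturation`."""
    enough_signal = sum(c > low for c in channels) >= min_channels
    unsaturated = max(channels) <= saturation
    return enough_signal and unsaturated

rng = random.Random(0)
R = toy_response()
dataset = [ex for ex in (make_example(R, rng) for _ in range(200)) if keep(ex[2])]
```

Note that pruning removes examples preferentially at low amplitude and low temperature, so the retained dataset is no longer uniform over the sampled parameter box.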
B. Description of model
A fully connected feed-forward model was developed to extract the key experimental metrics, and the model architecture is described in Fig. 2. The network was developed using TensorFlow 2.0.17 The 36 channels of the LTSF data, unfolded into a 1D vector, are the input to the model. The value for each channel is the total PSL over the image plate region corresponding to that channel with the mean local background subtracted. The model was compiled using a mean-squared-error loss function and the Adam optimizer18 with a learning rate of 0.001. The batch size was 256, and the model was trained over 100 epochs. 20% of the data examples were kept aside for model validation. Efficient training of machine learning models relies in part on pre-processing the input data and features before training. The input LTSF data were pre-processed by taking the decadic logarithm of all training data examples. Similarly, the amplitude metric was pre-processed by taking the decadic logarithm of the amplitude values, and the temperature and dose metrics were scaled by taking the natural logarithm of each variable. In addition, the metrics were further pre-processed by standardization, that is, subtracting the mean and scaling the metrics such that they are between zero and unity. In this preliminary model, the aim was to output the metrics of x-ray slope temperature (T), x-ray dose (D), and spectrum amplitude (A0). To quantify the model uncertainty, an ensemble of 20 models with identical architectures was trained on the same subset of training data, but with the model weights initialized randomly. This is a common methodology in machine learning to estimate the uncertainty on model outputs.
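The pre-processing steps described above can be illustrated with a short sketch. The text leaves the exact scaling open, so this example assumes a simple min-max scaling of the log-transformed metrics onto the interval from zero to unity; the helper names and the sample temperatures are hypothetical. The same (min, max) pair must be stored so that network outputs can be mapped back to physical units.

```python
import math

def log_scale(values, base10=True):
    """Decadic (or natural) logarithm of each strictly positive value."""
    return [math.log10(v) if base10 else math.log(v) for v in values]

def fit_minmax(values):
    """Return (min, max) so the same scaling can be reused at inference time."""
    return min(values), max(values)

def apply_minmax(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]

def invert_minmax(scaled, lo, hi):
    """Undo apply_minmax, returning values in the log domain."""
    return [s * (hi - lo) + lo for s in scaled]

# Hypothetical training targets: slope temperatures in keV.
temps_kev = [150.0, 300.0, 475.0, 800.0]
log_t = log_scale(temps_kev, base10=False)       # natural log, as for T and dose
lo, hi = fit_minmax(log_t)
scaled_t = apply_minmax(log_t, lo, hi)           # network targets in [0, 1]

# Round trip: scaled network output -> log domain -> physical units.
recovered = [math.exp(v) for v in invert_minmax(scaled_t, lo, hi)]
```

The logarithm requires strictly positive values, so background-subtracted channels that are zero or negative would need separate handling before this step.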
A fully connected feed-forward model was used. The synthetic LTSF data were flattened and input as a linear vector of 36 channels. Three hidden layers were used with 64, 128, and 64 neurons, respectively, each with a rectified linear unit (ReLU) activation function. A dropout layer with a dropout rate of 0.2 was utilized after the hidden layers in order to prevent over-fitting of the model. The output layer is composed of three neurons corresponding to the values of the three extracted experimental metrics used to define the spectrum.
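To make the layer sizes in this architecture concrete, the following dependency-free sketch implements the forward pass of a 36-64-128-64-3 fully connected network. It is not the TensorFlow model used in this work: the weights are random placeholders rather than trained values, the single dropout step after the last hidden layer and the linear output layer are assumptions, and all function names are illustrative.

```python
import random

LAYER_SIZES = [36, 64, 128, 64, 3]   # input, three hidden layers, three outputs

def init_layer(n_in, n_out, rng):
    """Random small weights and zero biases (placeholder for trained values)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, weights, biases):
    """Affine map: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(x):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in x]

def dropout(x, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    return [0.0 if rng.random() < rate else v / (1.0 - rate) for v in x]

def predict(x, layers, rng=None, rate=0.2):
    """Forward pass; dropout is applied only when an rng is given (training mode)."""
    h = x
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    if rng is not None:
        h = dropout(h, rate, rng)
    weights, biases = layers[-1]
    return dense(h, weights, biases)             # linear output: three metrics

rng = random.Random(0)
layers = [init_layer(a, b, rng) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]
outputs = predict([0.5] * 36, layers)
```

With these layer sizes the network has 19 139 trainable parameters, which is small enough that a single evaluation is inexpensive.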
Figure 3 shows the results from evaluating the ensemble of models on the test dataset. For each test data example, the predicted value from the machine learning model was averaged over the ensemble to produce a mean output value for each of the three metrics previously described. The standard deviation of the ensemble of output values is taken as the model uncertainty. Figures 3(a)–3(c) show the true metric on the x axis and the ML-predicted metric on the y axis for each of the three metrics. The model uncertainty is shown in shaded blue on each of the three panels. The model predicts all three metrics with high accuracy (∼95% across all three metrics).
The results from evaluating the ensemble of models on the test dataset. In each plot, the true value is shown on the x axis and the predicted value using the machine learning based analysis is shown on the y axis. Shaded blue represents the standard deviation for the ensemble’s output. The dashed black line in each plot shows the y = x line. (a) shows the true and predicted results for the dose metric (D). Similarly, (b) and (c) show the results for the amplitude (A0) and temperature (T), respectively.
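The ensemble averaging described above amounts to taking, for each metric, the mean and standard deviation over the outputs of the individual models. A minimal sketch follows; the four-member ensemble and its output values are hypothetical numbers chosen for illustration, and the use of the population standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics

def ensemble_summary(predictions):
    """predictions[m][k]: output k (e.g., D, A0, T) from ensemble member m.

    Returns per-metric (means, standard deviations) across the ensemble.
    """
    per_metric = list(zip(*predictions))         # transpose: metrics x members
    means = [statistics.mean(vals) for vals in per_metric]
    stds = [statistics.pstdev(vals) for vals in per_metric]
    return means, stds

# Hypothetical outputs of a 4-member ensemble for one test example:
# (dose, amplitude in photons/keV, temperature in keV).
member_outputs = [
    [1.0, 2.0e5, 390.0],
    [1.2, 2.2e5, 410.0],
    [0.8, 1.8e5, 400.0],
    [1.0, 2.0e5, 400.0],
]
means, stds = ensemble_summary(member_outputs)
```

In this example the mean temperature is 400 keV and the spread among members serves as the model uncertainty for that prediction.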
In addition to model uncertainty, we can also quantify the prediction error percentage, which we define here as $100 \times (p - t)/t$, where t is the true value and p is the average predicted value from the ensemble of ML models. This prediction error is shown for all three metrics in Fig. 4. In Figs. 4(a)–4(c), the true values of the amplitude, dose, and temperature are plotted for each test data case and each point is colored by the prediction error of a specific metric. These maps therefore allow one to identify the “edge cases” in parameter space where the model does poorly. For example, in Fig. 4(a), each of the points is colored by the prediction error in the dose. In this example, we see that the ML model most severely over-predicts the dose in cases where the true temperature of the spectrum is low. In general, for all three plots in Fig. 4, the model performs poorly at the extremes in feature space, which may be due to the fact that the data were pruned in order to eliminate data instances with too little signal or saturated signal. The result is that there are fewer examples at these extrema for the model to learn from and differentiate.
In (a)–(c), the true amplitude, true dose, and true temperature are plotted in three dimensions. In (a), the points are colored by the prediction error in the dose, in (b) the points are colored by the prediction error in the amplitude, and in (c) the points are colored by the prediction error in the temperature.
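The prediction error defined above is straightforward to compute per test example, and thresholding it gives one simple way to list the edge cases. In the sketch below, the 10% threshold, the function names, and the sample temperatures are all hypothetical and chosen only for illustration.

```python
def prediction_error_percent(predicted, true):
    """Signed prediction error in percent: 100 * (p - t) / t."""
    return 100.0 * (predicted - true) / true

def flag_edge_cases(true_values, predicted_values, threshold=10.0):
    """Indices where |error| exceeds `threshold` percent (threshold is hypothetical)."""
    return [i for i, (t, p) in enumerate(zip(true_values, predicted_values))
            if abs(prediction_error_percent(p, t)) > threshold]

# Hypothetical true and ensemble-mean temperatures (keV).
true_t = [160.0, 400.0, 790.0]
pred_t = [200.0, 404.0, 780.0]
errors = [prediction_error_percent(p, t) for p, t in zip(pred_t, true_t)]
edge_cases = flag_edge_cases(true_t, pred_t)
```

A positive error corresponds to over-prediction and a negative error to under-prediction, which matches the sign convention of the definition in the text.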
C. Application of model to real data
As a preliminary test of this methodology, this machine learning model was applied to example real data from a recent campaign at the OMEGA-EP facility at the Laboratory for Laser Energetics. In this campaign, thin (5 μm-thick) gold foils were irradiated with laser intensities ranging from $0.3 \times 10^{18}$ to $2.9 \times 10^{18}$ W/cm$^2$. Electron spectra results from this experiment were published showing a population of super-ponderomotive electron temperatures at sub-relativistic laser intensities.19 In addition to the electron measurements, LTSF x-ray data were also collected. These LTSF data were analyzed using the traditional analysis methodology, where a one-temperature fit is assumed and a forward-fit routine is used to infer the temperature. The LTSF data were also evaluated using this machine learning model that had been pre-trained on synthetic data to extract the temperature.
After training, evaluation of all experimental metrics from data using the machine learning model takes roughly 0.3 ms, which is a time scale well-suited for HRR operation. The results of applying the model to the OMEGA-EP data are shown in Fig. 5. Uncertainty in the traditional analysis is assumed to be 20%, given a previous analysis of the uncertainty in this forward-fit model described by Williams et al.10 The uncertainty in the metrics predicted by the machine learning model is given by the standard deviation of the predicted results from the ensemble of models. In Fig. 5(a), the machine learning model matches the traditional analysis amplitude within error bars for all of the shots. In Fig. 5(b), the machine learning model tends to over-predict the temperature when compared to the traditional analysis, especially for the two points around 400 keV. From the inset in Fig. 5(b), we see that those OMEGA-EP data points lie in a region of parameter space (based on synthetic data) where there are few or no training examples. This highlights a weakness of machine learning models: they typically excel only at interpolation between data instances that the model has seen before. Improvements to the model must include adding more training examples from synthetic or real data sources in order to increase fidelity in these regions of high error.
The ensemble of models trained only on synthetic LTSF data was applied to real data examples from OMEGA-EP. In (a), the amplitude derived from the traditional analysis is shown on the x axis and the amplitude predicted from the machine learning model is shown on the y axis. Similarly, in (b), temperature derived from the traditional analysis is shown on the x axis and the temperature predicted from the machine learning model is shown on the y axis. The inset is a replication of Fig. 4(c) with the points from the campaign at OMEGA-EP overlaid. Here, the traditional analysis amplitude, dose, and temperature are plotted for each of the OMEGA-EP points and they are colored with their prediction error.
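The comparison between the two analyses can be phrased as a simple overlap test between error bars, together with a coarse check of whether a point lies inside the sampled training ranges. The sketch below assumes a 20% relative uncertainty on the traditional result and one ensemble standard deviation on the ML result, as described in the text; the overlap criterion itself, the function names, and the shot values are hypothetical and are not taken from the OMEGA-EP data.

```python
def agrees_within_errors(trad, ml_mean, ml_std, trad_rel_unc=0.20):
    """True if the intervals trad*(1 -/+ 20%) and ml_mean -/+ ml_std overlap."""
    trad_lo, trad_hi = trad * (1.0 - trad_rel_unc), trad * (1.0 + trad_rel_unc)
    ml_lo, ml_hi = ml_mean - ml_std, ml_mean + ml_std
    return ml_lo <= trad_hi and trad_lo <= ml_hi

def inside_training_range(amplitude, temperature,
                          a_range=(1.0e4, 1.0e6), t_range=(150.0, 800.0)):
    """True if (A0, T) lies inside the sampled synthetic-training box."""
    return (a_range[0] <= amplitude <= a_range[1]
            and t_range[0] <= temperature <= t_range[1])

# Hypothetical shots: (traditional T, ML mean T, ML std T), all in keV.
shots = [(400.0, 520.0, 15.0), (600.0, 640.0, 30.0)]
agreement = [agrees_within_errors(*s) for s in shots]
```

The range check is deliberately coarse: a point can sit inside the sampled box and still fall in a region that pruning has left sparsely populated, which is the situation the inset of Fig. 5(b) illustrates.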
IV. FUTURE WORK
While the results shown in Fig. 3 provide confidence that these frameworks can work on simplistic synthetic data, ultimately it must be demonstrated that these frameworks can work robustly on real data. In order to better adapt the simplistic model shown previously to real data examples, we may in the future exploit the concept of hierarchical transfer learning (HTL). This framework was described recently in a plasma physics context by Humbird et al.20 for the purposes of predicting plasma parameters in inertial confinement fusion implosions. HTL, in effect, allows one to stitch together learning on data of increasing levels of fidelity. In the context of the work presented here, there are many examples in our simplistic training dataset, which have low fidelity but are “cheap” in a computational sense to produce. However, higher-fidelity data generated using more computationally intensive methods, like particle-in-cell (PIC) codes coupled with a Monte Carlo physics package like GEANT4, could be utilized in training through this HTL scheme. Combining learning from these two different datasets with HTL may allow for improvement in the model evaluation on real data. Even further, using this more realistic set of spectra to generate data could allow for a new deep learning based model to extract the full spectrum from LTSF data without the assumption of a spectral form.
V. CONCLUSIONS
A new deep learning based analysis framework was developed for a differential x-ray step filter diagnostic called the LTSF. Training of these models on synthetic data demonstrates the ability to extract key experimental metrics. Machine learning analysis is still in early phases and at each step requires a confidence check against traditional analysis schemes, but as we build up more synthetic data, and more experimental data to train on, the model should continue to improve.
ACKNOWLEDGMENTS
This work was completed under the auspices of the U.S. DOE LLNL under Contract No. DE-AC52-07NA27344 with funding support from the Laboratory Directed Research and Development Program under tracking code Grant Nos. 21-ERD-015 and 20-ERD-048, the DOE Office of Science ECRP under Grant No. SCW1651, and the DOE NNSA Laboratory Residency Graduate Fellowship program, which is provided under Grant No. DE-NA0003960. Neither the United States government nor LLNL, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.