Betatron radiation produced from a laser-wakefield accelerator is a broadband, hard x-ray (>1 keV) source that has been used in a variety of applications in medicine, engineering, and fundamental science. Further development and optimization of stable, high repetition rate (HRR) (>1 Hz) betatron sources will provide a means to extend their application base to include single-shot dynamical measurements of ultrafast processes or dense materials. Recent advances in the laser technology used in such experiments have enabled increases in shot rate and system stability, providing improved statistical analysis and detailed parameter scans. However, unique challenges exist at high repetition rate, where data throughput and source optimization are now limited by diagnostic acquisition rates and analysis. Here, we present the development of a machine-learning algorithm for the real-time analysis of betatron radiation. We report on the fielding of this deep learning algorithm for online source characterization at the Institut National de la Recherche Scientifique's Advanced Laser Light Source. By fine-tuning an algorithm originally trained on a fully synthetic dataset using a subset of experimental data, the algorithm can predict the betatron critical energy with a percent error of 7.2% and a reconstruction time of 1.5 ms, providing a valuable tool for real-time, multi-objective optimization at HRR.
I. INTRODUCTION
The generation of compact, coherent x-ray beams was revolutionized through the development of tabletop, high-power laser systems via chirped pulse amplification (CPA).1 A stand-out technology for particle acceleration is laser wakefield acceleration (LWFA).2 During LWFA, electrons become trapped and “surf” waves produced in the wake of a laser propagating through a low-density plasma. These waves can accelerate electrons to relativistic megaelectronvolt to gigaelectronvolt (MeV–GeV) energies over millimeter to centimeter length scales. During LWFA, trapped electrons can oscillate transversely to their direction of motion and emit x rays known as betatron radiation.
Betatron x rays produced during LWFA have micrometer-scale source size,3,4 femtosecond duration,5 and low divergence6 with a broadband, synchrotron-like spectrum characterized by a critical energy.7,8 These properties position them as promising probes of high energy density environments, with applications such as radiography of inertial confinement fuel capsules,9,10 time-resolved absorption spectroscopy of laser-produced plasmas,11,12 and high resolution imaging of laser-driven shocks.13 Phase-contrast imaging enabled by betatron sources can act as a nondestructive probe of biological samples14–17 and industrial materials.18–20
While betatron x rays have shown tremendous promise, many high repetition rate (HRR) applications require the source to be better controlled and optimized, e.g., high flux for single-shot imaging or a spectrum maximized around absorption features of interest. High-power, CPA laser systems are now capable of driving LWFA with parameters ranging from Joule-class at a few hertz7,21 to multi-millijoule at many hertz to kilohertz.22–24 These systems enable a many-fold increase in data collection for improved statistics and modeling. However, to meet the requirements of HRR applications, i.e., shot rates >1 Hz,25,26 next-generation laser–plasma-driven sources necessitate the development of efficient, real-time data analysis to complement the higher data throughput. Such integrated analysis provides a method of directly measuring and optimizing physics quantities that are typically not retrieved until after an experiment.
One potential class of solutions that can provide HRR analysis tools is machine learning (ML) techniques.27 ML techniques have already found significant utility in the field of laser–plasma physics for data analysis as well as for automated optimization schemes.28–33 In particular, multi-layer neural networks (NNs), used in a subset of ML known as deep learning (DL), can learn the complex relationships between input and output variables by training a collection of connected nodes, or neurons, with nonlinear activation functions. NNs offer a versatile framework that combines multiple inputs and outputs with large training sets for fitting arbitrary functions.27 NNs are widely used for predictive modeling, image classification, and other applications where they can be trained on a large dataset. DL algorithms have already found use as surrogates for computationally demanding atomic physics simulations29 and in HRR laser–plasma diagnostics.28,30
In this article, we present the development of a DL-assisted x-ray spectrometer designed to reconstruct spectral characteristics of betatron sources. A NN model was used to extract the critical energy and source amplitude from the data in real time, with an average reconstruction time of ∼4 ms, a roughly 20× increase in speed over the traditional analysis discussed in detail later in the manuscript. This spectrometer was fielded at the Advanced Laser Light Source (ALLS) facility at the Institut National de la Recherche Scientifique – Énergie, Matériaux et Télécommunications (INRS-EMT), where spectral characterization was performed at a laser repetition rate of 0.5 Hz. These results provide a methodology for characterizing betatron radiation features that can be coupled with active feedback mechanisms for real-time source optimization at repetition rates exceeding 10 Hz.
II. X-RAY FILTER PACK SPECTROMETER
A standard LWFA experiment (see Fig. 1) consists of multiple diagnostics to characterize the acceleration process and to detect the accelerated electrons and the betatron radiation generated in the interaction. A high intensity laser is focused into a gas target, where it drives the LWFA process before being depleted or exiting the interaction region. The laser light is then blocked using materials such as aluminum or glass, which have high transmission for photon energies above 5 keV. The electron beam is swept out of the beam path and characterized using a dipole magnet spectrometer. The betatron radiation that exits the interaction is incident on an x-ray CCD. This detector can be coupled with different types of materials to act as a diagnostic of the betatron source.
One class of diagnostics widely used for betatron characterization is the x-ray filter pack spectrometer (XFP). These spectrometers measure the transmission of x rays through an array of different filters.3,7,13,34 The filter arrays are made of multiple materials and thicknesses, each with a unique x-ray transmission profile, and can be easily tailored to the spectral range of interest by simply modifying the array elements. The filtering techniques are broadly classified as either (i) step wedge,28 which uses a single material in different thicknesses to reconstruct broadband spectral features, or (ii) Ross pairs,35,36 which use a differential transmission technique to obtain higher spectral resolution.
A step wedge filter spectrometer, which is suitable for reconstructing broadband x-ray sources such as betatron radiation, was chosen as the diagnostic for this work. The filter pack consists of an array of eight filters: three aluminum foils with thicknesses of 25, 50, and 75 μm and five copper foils with thicknesses of 25, 50, 75, 100, and 125 μm, all provided by Goodfellow. The x-ray CCD used at ALLS for detection of the betatron radiation is an Andor iKon-M SO vacuum CCD. The camera chip is in an isolated vacuum environment behind a beryllium (Be) window, while the betatron radiation exits the interaction chamber through a second Be window. The filter pack was designed to sit in air between the source and detector windows. The transmission values of the filters and of the other materials along the beam path (see Fig. 2) were calculated using the NIST X-Ray Mass Attenuation Coefficients database.37
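As an illustration of this step, a minimal sketch of such a transmission calculation is given below, using the Beer–Lambert law with mass attenuation coefficients of the kind tabulated in the NIST database; the numerical values and function names are illustrative rather than taken from the actual beamline model.

```python
import numpy as np

def transmission(mu_rho_cm2_g, density_g_cm3, thickness_um):
    """Beer-Lambert transmission T(E) = exp[-(mu/rho) * rho * t] of a single foil,
    evaluated at the photon energies where mu/rho is tabulated."""
    thickness_cm = thickness_um * 1e-4
    return np.exp(-np.asarray(mu_rho_cm2_g) * density_g_cm3 * thickness_cm)

# Example: a 25 um copper foil at three photon energies. The mass attenuation
# coefficients below are approximate NIST values, given for illustration only.
energies_keV = np.array([10.0, 20.0, 30.0])
mu_rho_cu = np.array([215.9, 33.8, 10.9])   # cm^2/g (approximate)
print(transmission(mu_rho_cu, 8.96, 25.0))
# The total transmission along the beam path is the product of such factors
# over all filters, windows, and air gaps between the source and the detector.
```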
Typically, characterization of betatron spectra is performed by acquiring XFP images during a campaign and then reconstructing the spectra after the experiment. However, this process is performed offline and requires assumptions about the shape of the betatron spectrum. This is disadvantageous to experimental campaigns that would benefit from active feedback of the spectral parameters for optimization and tuning of the source to meet experiment goals, e.g., tuning the spectral peak to a material absorption edge. The NN model presented here is designed to extract these key metrics from the data in real time at the laser repetition rate.
III. DEEP LEARNING MODEL
A. Synthetic dataset generation
Neural networks typically require large datasets to create an effective model for accurate predictions; however, due to the significant time required to collect such data experimentally, the NN model was trained on a purely synthetic dataset. The dataset (N = 12 500 samples) was produced assuming a simple synchrotron spectrum [Eq. (2)]. The metric parameter space of the synthetic spectra was chosen to match typical experimental betatron source characteristics as measured at ALLS. Accordingly, the amplitude was sampled from a uniform distribution ranging from to photons/sr/0.1% BW and the critical energy was sampled from a uniform distribution ranging from 5 to 50 keV. Synthetic images were generated using Eq. (1), combining the known detector response curve of the Andor iKon-M, the synchrotron source for a given amplitude A and critical energy Ec, and the known transmission values of the XFP filters and of the materials along the x-ray beam path at ALLS. Once generated, Gaussian noise was applied to the images with a mean and standard deviation of 5% of the maximum value. The counts from each region corresponding to a unique x-ray filter were spatially averaged and background subtracted using the average counts within the lead (Pb) region, which has minimal transmission below 50 keV. Figure 2 shows example experimental data alongside synthetic images with comparable critical energies.
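Equations (1) and (2) are not reproduced in this excerpt. The sketch below illustrates the generation procedure under the common assumption of an on-axis synchrotron-like spectral shape, S(E) ∝ (E/Ec)^2 K_{2/3}[E/(2Ec)]^2; the transmission and response curves are placeholders, the function names are hypothetical, and the noise is applied to the averaged filter signals rather than to full synthetic images.

```python
import numpy as np
from scipy.special import kv  # modified Bessel function of the second kind

rng = np.random.default_rng(0)

def synchrotron_spectrum(E_keV, E_c_keV, amplitude):
    """Assumed stand-in for Eq. (2): A * (E/E_c)^2 * K_{2/3}[E/(2 E_c)]^2."""
    x = E_keV / E_c_keV
    return amplitude * x**2 * kv(2.0 / 3.0, x / 2.0) ** 2

def synthetic_filter_signals(E_keV, E_c_keV, amplitude, filter_T, detector_R):
    """Stand-in for Eq. (1): source x filter transmission x detector response,
    integrated over photon energy for each of the eight filter regions."""
    S = synchrotron_spectrum(E_keV, E_c_keV, amplitude)
    signals = np.trapz(S[None, :] * filter_T * detector_R[None, :], E_keV, axis=1)
    # Gaussian noise with mean and standard deviation of 5% of the maximum value.
    level = 0.05 * signals.max()
    return signals + rng.normal(level, level, size=signals.shape)

# Placeholder curves; the real ones come from NIST tables and the camera response.
E = np.linspace(1.0, 100.0, 500)                          # photon energy (keV)
filter_T = np.exp(-np.outer(np.arange(1, 9), 10.0 / E))   # eight dummy filters
detector_R = np.exp(-E / 30.0)
E_c = rng.uniform(5.0, 50.0)        # critical energy range quoted in the text
A = 10.0 ** rng.uniform(8.0, 10.0)  # amplitude range placeholder (not given here)
print(synthetic_filter_signals(E, E_c, A, filter_T, detector_R))
```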
The NN was implemented using PyTorch.42 The fully connected model took the spatially averaged and background-subtracted transmission of the eight filters as inputs. Three hidden layers of 32, 64, and 32 nodes with rectified linear unit (ReLU) activation functions were used. The output layer consisted of two nodes for the critical energy and amplitude. Regularization with a dropout rate of 10% was applied after the second hidden layer to prevent overfitting of the network. The learning rate was set to 0.001 and reduced by a factor of ten upon the loss reaching a plateau. Model training was performed using an Adam optimizer, with performance evaluated by a mean-squared-error loss on the two metrics. The training set consisted of 80% of the original data, with 20% withheld for final model validation.
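A minimal PyTorch sketch of a network matching this description is given below; it is not the authors' implementation, and details such as weight initialization and the exact scheduler settings are assumptions.

```python
import torch
import torch.nn as nn

class BetatronNet(nn.Module):
    """Fully connected model: eight filter signals in, (critical energy, amplitude) out."""
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(8, 32), nn.ReLU(),
            nn.Linear(32, 64), nn.ReLU(),
            nn.Dropout(p=0.1),          # 10% dropout after the second hidden layer
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 2),           # outputs: critical energy and amplitude
        )

    def forward(self, x):
        return self.layers(x)

model = BetatronNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Reduce the learning rate by a factor of ten when the monitored loss plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1)
loss_fn = nn.MSELoss()
```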
As generated, the values of the two metrics and of the eight transmission values (features) span several orders of magnitude. This difference in scale can lead to slower convergence or to overemphasis of one metric or feature. Therefore, feature scaling using standardization, or Z-score normalization, of the data was employed to improve the efficiency of the training. The amplitude metric was scaled by the base-10 logarithm, and the critical energy metric was scaled by the natural logarithm. The metrics and features were then standardized by subtracting the mean and dividing by the standard deviation of the training set. The training employed tenfold cross-validation using scikit-learn.43 After this training procedure was established, ten models of identical architecture with randomly initialized weights were trained. These models were then used to predict the metrics of the withheld validation dataset in order to quantify the prediction uncertainty of the model for previously unseen data.
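The scaling step might look like the following sketch; the array names and random seed are hypothetical, and only the log scaling, Z-score standardization, and tenfold split follow the description above.

```python
import numpy as np
from sklearn.model_selection import KFold

def standardize(x, mean=None, std=None):
    """Z-score normalization; statistics should be computed on the training set only."""
    if mean is None:
        mean, std = x.mean(axis=0), x.std(axis=0)
    return (x - mean) / std, mean, std

# metrics: (N, 2) array with columns [critical energy (keV), amplitude].
# The critical energy is scaled by the natural log and the amplitude by log10
# before standardization, as described in the text.
def scale_metrics(metrics):
    logged = np.column_stack([np.log(metrics[:, 0]), np.log10(metrics[:, 1])])
    return standardize(logged)

kfold = KFold(n_splits=10, shuffle=True, random_state=0)  # tenfold cross-validation
```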
B. Model training results
Figure 3 shows the ensemble training evaluation. For each test data point, the predicted values from the ML models were averaged to produce the mean output value, and the model uncertainty was taken to be the standard deviation. Figures 3(a) and 3(b) show the performance of the model with the true value on the x axis and the ML-predicted value (blue) on the y axis. The model uncertainty is plotted for both σ (shaded blue region) and 2σ (shaded light blue region). The mean values (red) of a random sample of 100 data points are plotted with error bars corresponding to their individual standard deviations. Figures 3(c) and 3(d) show histograms of the percent error, 100 × |p_i − t_i|/t_i, where p_i is the averaged predicted value and t_i is the true value for the ith test sample. The model is capable of prediction with high accuracy across the full parameter space, with a mean absolute error, defined as (1/N) Σ_i |p_i − t_i|, of 0.8 keV and photons/sr/0.1% BW for the critical energy and amplitude, respectively. The average percent error values are 2.6% and 4.5% for the critical energy and amplitude, respectively.
IV. REAL-TIME MACHINE LEARNING ASSISTED EXPERIMENTAL ANALYSIS
A. Experiment details
The NN was applied and tested on data acquired from an experimental campaign at the INRS-EMT ALLS facility using their dedicated betatron beamline.14 The laser delivered 3.2 J on target with a pulse duration of 22 fs at a maximum repetition rate of 2.5 Hz. The linearly polarized laser was focused using an f/15 off-axis parabola to produce a FWHM spot size of 15 μm with an energy concentration of 80% at 1/e^2, corresponding to a laser intensity of 4.6 × 10^19 W/cm^2 and a normalized vector potential of .10 A deformable mirror and Dazzler were employed to optimize the laser wavefront and pulse duration for LWFA and subsequent betatron radiation.
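For orientation, the quoted intensity can be related to the normalized vector potential through the standard expression below; the central wavelength is not quoted in this section, so a Ti:sapphire value of approximately 0.8 μm is assumed here.

```latex
% Standard relation between peak intensity I, central wavelength \lambda,
% and normalized vector potential a_0, evaluated assuming \lambda ~ 0.8 um:
a_0 \simeq 0.855\,\lambda[\mu\mathrm{m}]\,
      \sqrt{\frac{I}{10^{18}\,\mathrm{W\,cm^{-2}}}}
    \approx 0.855 \times 0.8 \times \sqrt{46} \approx 4.6
```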
The gas targetry consisted of either a supersonic gas jet with a nozzle diameter of 7 mm or a single-stage gas cell with an inner cell width of 7 mm. The performance of the supersonic gas jet has been well characterized at ALLS, and it is routinely used in betatron imaging experiments.10,41 The gas cells, which have previously been shown to produce significantly higher energy electron beams under comparable conditions,44 were 3D printed using an Anycubic Photon M3 resin printer. The gas types included helium (He), nitrogen (N2), and mixed gases consisting of He doped with a small fraction of a secondary higher-Z species to facilitate ionization injection.45,46 The dopant mixtures were He–N2 (99.5% He + 0.5% N2 or 99% He + 1% N2) and He–Ar (99% He + 1% Ar).
B. Experimental results
During the experiment, the XFP spectrometer was placed in the betatron beam path while the betatron source was being optimized, providing as wide a range of amplitudes and critical energies as possible for evaluating the model. A Python program was developed to acquire x-ray images and predict the x-ray critical energy in real time. Critical energies were read out at the laser repetition rate of 0.5 Hz, which was the limit for single-shot acquisitions set by the readout speed of the Andor iKon-M. After the experiment was completed, the x-ray images were analyzed using the standard analysis method, in which the least squares error was minimized assuming the source takes the form of Eq. (2). The standard analysis accounts for a potential 15% uncertainty in the individual filter thicknesses, as quoted by Goodfellow. A total of 541 XFP images acquired at various plasma densities, focal spot positions, and second-order spectral dispersions were analyzed for comparison with the model performance.
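A minimal sketch of such a least-squares reconstruction is shown below, assuming the same synchrotron-like spectral form and placeholder transmission and response curves as in the synthetic-data sketch above; the bounds and starting guess are illustrative, and the filter-thickness uncertainty treatment is not included.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import kv

def forward_model(params, E_keV, filter_T, detector_R):
    """Expected signal behind each filter for a synchrotron-like source with
    parameters (critical energy in keV, log10 amplitude)."""
    E_c, log10_A = params
    S = (10.0 ** log10_A) * (E_keV / E_c) ** 2 * kv(2.0 / 3.0, E_keV / (2.0 * E_c)) ** 2
    return np.trapz(S[None, :] * filter_T * detector_R[None, :], E_keV, axis=1)

def fit_spectrum(measured, E_keV, filter_T, detector_R, x0=(20.0, 9.0)):
    """Reconstruct (E_c, amplitude) by minimizing the least-squares error between
    the measured and modeled filter signals."""
    result = least_squares(
        lambda p: forward_model(p, E_keV, filter_T, detector_R) - measured,
        x0=np.asarray(x0, dtype=float),
        bounds=([1.0, 0.0], [100.0, 15.0]),  # illustrative bounds on E_c and log10(A)
    )
    return result.x[0], 10.0 ** result.x[1]
```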
Figure 4(a) shows the results of the standard analysis and of the ensemble of NN models in calculating the critical energy for two types of gas targets and various gas mixtures. The error bars on the standard predictions correspond to the 15% filter thickness uncertainty. The ML model predictions are plotted on the y axis, where the error bars correspond to the ensemble model prediction standard deviation added in quadrature with the 15% filter thickness uncertainty. The ML model is able to predict the experimental critical energy values with a mean absolute error between the model predictions and the standard analysis of 3.2 keV and a standard deviation of 2.1 keV for data points above 5 keV. The average relative error between the neural network prediction and the standard analysis for this same range is 13.6%.
The predicted critical energy remains within the error bars of the standard analysis across 5–50 keV. However, approximately 10% of the entire dataset corresponds to low critical energies (<5 keV), obtained when the parameters of the accelerator were not tuned for efficient bubble-regime acceleration. These critical energies fall below the 5 keV lower limit of the training set described above, and data below that limit can negatively impact the network's predictive capability. The increased error could also be attributed to the poor transmission of <5 keV x rays through the set of filters and, therefore, to the small amount of signal captured by the x-ray detector. With the inclusion of the data points below 5 keV, the mean absolute error increases to 3.7 ± 2.8 keV with an average percent error of 34.1%.
The photon amplitude metric was also evaluated by the standard method, but the large size of the XFP filters compared to the CCD field of view, as well as the small spacing between filters, made the interpolation of the x-ray intensity profile potentially inaccurate. An error analysis comparing the neural network and the standard analysis for this metric is therefore inconclusive. Future XFPs can be designed with this knowledge, reducing the size of the individual filters to improve the analysis of the photon amplitude.
V. MODEL FINE TUNING
To improve the model performance, network fine tuning was performed using a subset of the experimental data. This technique is useful for transferring a NN trained on a fully synthetic dataset to experimental data, where it can then perform with better accuracy. A subset of 100 randomly selected experimental images with critical energies above 10 keV was used to further train the neural network. The network layers were frozen such that only the weights of the first layer of the network could be adjusted. The photon amplitude metric was not adjusted during the fine-tuning process.
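In PyTorch, this freezing step might look like the sketch below, reusing the model sketch given earlier; the fine-tuning learning rate is an assumption, as it is not stated in the text.

```python
import torch

# Freeze every parameter except those of the first linear layer, then continue
# training on the experimental subset (the 100 images with E_c > 10 keV).
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("layers.0")   # first Linear(8, 32) only

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)  # illustrative lr
```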
The updated model's performance can be seen in Fig. 5. Network fine tuning improves the model's performance at predicting experimental data, with a mean absolute error between the model predictions and the standard analysis of 1.2 keV and a standard deviation of 1.3 keV for data points above 5 keV. The average percent error between the NN prediction and the standard analysis for this same range is 7.2%. By fine tuning the NN model with a subset of experimental data, the network percent error was thus decreased by a factor of two compared to a network trained purely on synthetic data.
To demonstrate the reduction in analysis time provided by the NN, a random selection of 500 samples was analyzed by both the NN and the standard analysis. This analysis involves loading an experimental image, extracting the corresponding filter transmissions from the image, and then passing the transmission values into the NN or the least squares function. Pre-processing steps used only by the least squares function, such as loading the known transmission values of the filters, were not included but would further increase the time of that reconstruction loop. The average reconstruction time over the 500 iterations is 17.2 ± 5.5 ms for the least-squares analysis and 1.5 ± 0.6 ms for the NN. While both methods are suitable for hertz-level operation, >10 Hz HRR systems with the more involved programmatic loops necessary for real-time, multi-objective optimization would significantly benefit from the ∼1 ms reconstruction time provided by the neural network.
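A simple way to perform such a timing comparison is sketched below; the callables and the sample format are hypothetical.

```python
import time
import numpy as np

def benchmark(reconstruct, samples):
    """Average wall-clock reconstruction time per sample for a given method."""
    durations = []
    for filter_signals in samples:
        start = time.perf_counter()
        reconstruct(filter_signals)
        durations.append(time.perf_counter() - start)
    return np.mean(durations), np.std(durations)

# Usage with hypothetical callables wrapping the NN and the least-squares fit:
# print(benchmark(neural_network_predict, samples))
# print(benchmark(least_squares_reconstruct, samples))
```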
VI. CONCLUSION
In this work, we report on a deep neural network model that was developed for step-wedge x-ray filter spectrometers to characterize betatron radiation properties. This analysis tool was deployed during an experimental campaign, where the x-ray critical energy and amplitude were reconstructed in real time at 0.5 Hz. The predicted betatron critical energy was in good agreement with standard analysis techniques, with a millisecond-scale reconstruction time for the two metrics. While this network assumed an ideal synchrotron spectrum as the shape of the betatron spectrum, the technique could be extended to include additional broadband functions such as bremsstrahlung, which is commonly produced by the stopping of electrons in chamber walls and appears as an additional signal on the x-ray detector. This analysis is fast enough to support 10 Hz laser operation and can be coupled with other diagnostics or feedback algorithms for experiment optimization. Indeed, this diagnostic would be ideally suited for fielding in experimental campaigns focused on real-time optimization of betatron radiation through machine-learning-assisted techniques such as Bayesian optimization.
ACKNOWLEDGMENTS
This research is supported by the U.S. Department of Energy Fusion Energy Sciences Postdoctoral Research Program administered by the Oak Ridge Institute for Science and Education (ORISE) for the DOE, the NSERC Alliance – Alberta Innovates Advance Program (Agreement Nos. 212201089 and 222302077), the Natural Sciences and Engineering Research Council of Canada (Grant No. RGPIN-2021-04373), and the Canada Foundation for Innovation: Major Science Initiatives and the Ministère de l'Économie, de l'Innovation et de l'Énergie du Québec. This research was undertaken, in part, thanks to funding from the Canada Research Chairs Program. ORISE is managed by Oak Ridge Associated Universities (ORAU) under DOE Contract No. DE-SC0014664. All opinions expressed in this article are the authors' and do not necessarily reflect the policies and views of DOE, ORAU, or ORISE. The authors would like to thank Joël Maltais and Stéphane Payeur from ALLS for technical support, and Derek Mariscal and Matt Hill for helpful discussions and insight.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
N. F. Beier: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Resources (equal); Software (equal); Supervision (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). V. Senthilkumaran: Formal analysis (equal); Investigation (equal). E. Kriz: Investigation (equal). S. Fourmaux: Data curation (equal); Investigation (equal); Resources (equal); Supervision (equal). F. Légaré: Funding acquisition (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – review & editing (supporting). T. Ma: Conceptualization (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – review & editing (supporting). A. E. Hussein: Conceptualization (equal); Data curation (equal); Funding acquisition (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.