Fusion power production in tokamaks uses discharge configurations that risk producing strong type I edge localized modes. The largest of these modes are likely to increase impurities in the plasma and potentially damage plasma-facing components, such as the protective heat and particle divertor. Machine learning-based prediction and control may enable the automatic detection and mitigation of these damaging modes before they grow too large to suppress. To that end, large labeled datasets are required for the supervised training of machine learning models. We present an algorithm that achieves 97.7% precision when automatically labeling edge localized modes in the large DIII-D tokamak discharge database. The algorithm has no user-controlled parameters and is largely robust to changes in tokamak and plasma configuration. This automatically labeled database of events can subsequently feed future training of machine learning models aimed at autonomous edge localized mode control and suppression.
I. INTRODUCTION
High confinement (H-mode) operation is the regime of choice for fusion power production in future tokamak fusion systems.1 This regime is targeted for the “highest performance” operation at ITER.2 Unfortunately, these operational modes also produce so-called Edge Localized Modes (ELMs) at the plasma boundary that result in damaging heat loads on the divertor and other plasma-facing components.3,4 These modes also increase plasma impurity content through contamination from the erosion of those components.1 The management of ELMs has been named a “critical issue for ITER,”5 where fusion power production requires plasma density levels that produce type I ELMs.1 These ELMs are expected to occur up to once per second at ITER.3
On the other hand, the ELMs in a so-called “ELMy” H-mode also provide a mechanism for density and impurity control of the core plasma.6,7 For this reason, it is desirable to control the ELMs during H-mode operation rather than strictly eliminate them. Studies exist on the control of ELMs that primarily focus on adapting plasma parameters and system design at time scales covering many ELMs to reduce the harmful effects of the worst of them.5–18
There is mounting optimism in an alternative to plasma parameter selection or design; real-time predictive plasma control systems could preemptively identify that an ELM will occur within the next 100 ms and autonomously prescribe corrective actions to ameliorate the damaging consequences. Rich with sensor systems that have well defined semantic meaning, tokamaks are good candidates for monitoring and active control by the data-driven models of machine learning. For example, a neural network was used to predict the disruption of the ASDEX Upgrade tokamak “far away from the desired operational space (H-mode, high-β).”19 There is a growing literature on machine learning-based plasma control.20–28
A critical difficulty with training supervised machine learning models in the context of tokamak reactors is the lack of labeled ELM events. Herein, we address this difficulty by describing an algorithm for autonomously identifying the ELM events in a large dataset of shots from the DIII-D tokamak.29 The performance of this algorithm is evaluated against a set of ELMs that were hand-labeled by human experts, and the algorithm demonstrates near-perfect precision and recall. Deployed on shots that were not selected by experts for ELM-potential, where recall is much more difficult to evaluate, the algorithm demonstrated a precision of 97.7%.
A shot is a plasma discharge lasting for many seconds in which there may be thousands of ELM events for a typical H-mode plasma. At DIII-D, ELMs that occur during an “ELMy” H-mode discharge typically occur with a period of tens of milliseconds, grow from the background in a few microseconds or less, and last for a few hundred microseconds.
A recent study uses an ELM detector based exclusively on a single filterscope signal,18 further described elsewhere.30 The authors state that “the ELM detector [just referenced] should be replaced by a more robust ELM detector that does not rely on calibration-specific tuning.”18 Here, we use additional signals, but no user-controlled parameters, to create just such an ELM detector, one that maintains high precision when shown more than 5 years of shots recorded at DIII-D. This system can be used to create a training dataset for machine learning models to detect, classify, and ultimately predict ELMs in real time based on time series streaming data from tokamak diagnostic sensors.
Another recent, complementary study has trained a U-Net for ELM discovery at the KSTAR tokamak.31 Therein, the authors train the neural network on a single filterscope signal using ten shots for training (and validation) and two shots for testing and compare the performance of their model to several others. The model does not contain any user-controlled parameters. We call this related work complementary because it is an example of how labeled data can be used to create high quality models for plasma science. We will compare the reported performance of this U-Net with the performance of our model after describing the latter.
II. ELM OBSERVER (ELM-O): AN ALGORITHM FOR LABELING ELMs IN TIME SERIES DATA
To locate the ELMs in time, we use diagnostic signals from the tokamak: two interferometric measurements of the plasma density (interferometers),32 three measurements of atomic emission lines (filterscopes),33 and 64 channels from the beam emission spectroscopy system (BES).34 These three systems are sensitive to ELMs of all types, not just type I ELMs. All three instrument types have different sampling rates, as shown in Table I. We consider time periods within a shot when all the instruments are reporting data; we call this the concurrent sampling window. The data are processed by the ELM-O algorithm to produce labeled ELMs. The algorithm is described diagrammatically in Fig. 1.
Sampling rates for the instruments used in this work.
| Instrument | Sample rate (kS/s) |
| --- | --- |
| Filterscope | 50 |
| Interferometer | 100 |
| BES | 1000 |
The factor of 20 difference in sampling rate between the filterscopes and the BES means that we have to “align” the data during the concurrent sampling window. To do this, we perform a discrete cosine transform (DCT)35 on the interferometer and filterscope data, append zeros to the end of the transformed vector, and inverse transform (IDCT) back to the time domain. This operation produces “spectrum preserving” interpolation to the same time-domain sample points as the high-sampling BES signal. We note that this approach could also be used to decimate the BES data; e.g., one could effectively choose a middle point in sampling frequency between the disparate sources.
Typically, we perform this “up-sampling” routine on time series that are 200 ms long, or 200k points in the BES signal. The spectra of some example signals to be up-sampled are shown in Fig. 2. An example of the signals before and after this transformation is shown in Fig. 3. Despite the sharp cutoff where the spectrum goes to zero, the “spectrum preserving” up-sampled signals are quite faithful to the originals. The largest change appears to be that the up-sampled filterscope signals are delayed by ∼10 µs.
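This up-sampling step can be sketched with SciPy's `dct`/`idct` routines (a sketch, not the authors' implementation; the orthonormal transforms pick up a length-dependent scale that must be restored):

```python
import numpy as np
from scipy.fft import dct, idct


def dct_upsample(signal: np.ndarray, factor: int) -> np.ndarray:
    """Spectrum-preserving up-sampling: DCT, zero-pad, inverse DCT.

    Every frequency component above the original band edge is
    identically zero, as described in the text.
    """
    n = len(signal)
    m = n * factor
    spectrum = dct(signal, norm="ortho")   # forward DCT (type II)
    padded = np.zeros(m)
    padded[:n] = spectrum                  # append zeros to the spectrum
    # The orthonormal transforms scale with length, so rescale
    # to preserve the original signal amplitude.
    return idct(padded, norm="ortho") * np.sqrt(m / n)


# e.g., a 200 ms filterscope window at 50 kS/s -> 1 MS/s is factor=20
```

A filterscope window would be up-sampled by a factor of 20 and an interferometer window by a factor of 10 to reach the 1 MS/s BES grid.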
Spectrum of a 200 ms window of the interferometer and filterscope signals. The dashed vertical line shows the highest frequency component of the DCT that exists; all higher frequency components are set identically to zero.
0.5 ms time series from a tokamak shot showing an ELM. The top two rows are interferometer signals sampled at 100 kS/s; the bottom three rows are filterscope signals sampled at 50 kS/s. The red circles show the measurements before “up-sampling” to 1 MS/s, and the blue lines show the signals after “up-sampling.”
The BES data are reduced to a single time series. First, because the latter 32 channels have half the range of the former 32 channels, the signals from those channels are scaled by 2. Second, all 64 channels are averaged together. Combining this single BES series with the two interferometer channels and the three filterscope channels gives six channels for detecting ELMs.
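The channel reduction can be sketched as follows (a sketch, assuming the BES data arrive as a 64 × N array of channels by samples):

```python
import numpy as np


def reduce_bes(bes: np.ndarray) -> np.ndarray:
    """Collapse the 64 BES channels into one time series.

    The last 32 channels have half the range of the first 32,
    so they are scaled by 2 before averaging.
    """
    scaled = bes.astype(float).copy()
    scaled[32:] *= 2.0          # compensate for the halved range
    return scaled.mean(axis=0)  # average across channels
```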
The principle behind the detection algorithm is that, because each of the different diagnostics detects different reactions of the plasma to ELMs, they should all be treated separately and then combined at the end. In the BES signal, candidates are labeled via a simple threshold. If the signal is above the threshold t (out of ∼10 V), the point is labeled an ELM candidate. The interferometer and filterscope signals are differentiated numerically using a first-order time difference and then the absolute value is taken. From each differentiated signal, the largest 1 − η fraction of points is labeled as candidates, where the same percentile threshold η is used for all five signals.
All six signals are now binary time series, in which a one indicates an ELM candidate in that signal. The signals are individually convolved with a window function 100 µs long to prevent a “near miss” between signals. This also helps produce fewer, more uniform regions (in time) of candidates instead of a greater number of smaller regions. A candidate is declared in the interferometer signals if either signal contains a candidate. A candidate is declared in the filterscope signals if any two of the three signals contain a candidate. A candidate is declared in the BES signal directly, based on the average of the channels as described above. Finally, any point that has a candidate in all three diagnostics is declared an ELM.
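The per-signal candidate labeling and voting rule can be sketched as follows (a sketch under assumed array conventions, not the authors' code; the 100-sample window corresponds to 100 µs at 1 MS/s):

```python
import numpy as np


def derivative_candidates(signal: np.ndarray, eta: float) -> np.ndarray:
    """Mark the largest (1 - eta) fraction of |first difference| points."""
    d = np.abs(np.diff(signal, prepend=signal[0]))
    return d > np.quantile(d, eta)


def dilate(mask: np.ndarray, width: int = 100) -> np.ndarray:
    """Convolve a binary mask with a boxcar to prevent 'near misses'."""
    return np.convolve(mask.astype(float), np.ones(width), mode="same") > 0


def elm_mask(bes, interferometers, filterscopes, t=1.0, eta=0.997):
    """Combine per-diagnostic candidate masks by the voting rule:
    BES above threshold t, AND either interferometer,
    AND any two of the three filterscopes."""
    bes_c = dilate(bes > t)
    itf_c = [dilate(derivative_candidates(s, eta)) for s in interferometers]
    fs_c = [dilate(derivative_candidates(s, eta)) for s in filterscopes]
    itf_any = np.logical_or.reduce(itf_c)   # either interferometer
    fs_two = np.sum(fs_c, axis=0) >= 2      # any two of three filterscopes
    return bes_c & itf_any & fs_two
```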
To tune the values of t and η, we use a database of 972 ELMs that were hand labeled. This dataset consists of manually trimmed regions that should contain only one ELM and a smaller region labeled to be the ELM. An example of the data used to detect ELMs is shown in Fig. 4. The trimmed regions are of varying lengths, from a few milliseconds to tens of milliseconds. In order to rank the thresholds, t and η, we use the Area Under the Curve (AUC) metric of the precision–recall curve.36 A plot of the precision–recall curve as a function of η and t is shown in Fig. 5. Precision is defined as the fraction of ELMs identified by the ELM-O algorithm that are also labeled as ELMs by the expert. Recall is defined as the fraction of ELMs labeled by the expert that are also identified by the ELM-O algorithm.
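The AUC used to rank (t, η) settings can be computed from sampled precision–recall points by trapezoidal integration; a minimal sketch (the authors' actual computation may use a library routine):

```python
import numpy as np


def pr_auc(recalls, precisions) -> float:
    """Area under a sampled precision-recall curve via the
    trapezoidal rule, after sorting the points by recall."""
    order = np.argsort(recalls)
    r = np.asarray(recalls, dtype=float)[order]
    p = np.asarray(precisions, dtype=float)[order]
    # sum of trapezoid areas between consecutive recall points
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0))
```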
Example of one of the hand labeled ELMs. The top row shows the interferometer signals, the middle row shows the filterscope signals, and the bottom row shows the mean of the BES signals. The shaded region in the bottom row is the region hand-labeled as an ELM.
Precision–recall curve for the ELM-O algorithm. The maximum AUC when t = 1 V is 0.971 and occurs for a percentile-threshold of 0.997.
We define true positives, false positives, and false negatives in the following way. For each distinct time span the algorithm labels as an ELM, a true positive is scored if it has any overlap with the hand-labeled ELM, and a false positive is scored if it does not. If no algorithm-labeled region overlaps the hand-labeled ELM, a false negative is scored. Note that it is possible, in a single example, for the algorithm to both miss the true ELM (false negative) and label a non-ELM as an ELM (false positive); indeed, many false positives can occur in a single example. In practice, this is only a problem with very permissive labeling of ELMs.
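For a single hand-labeled example, this scoring rule can be sketched as follows (intervals as (start, end) pairs in samples; these helper names are illustrative, not the authors' code):

```python
def overlaps(a, b) -> bool:
    """True if half-open intervals a and b share any points."""
    return a[0] < b[1] and b[0] < a[1]


def score_example(pred_spans, true_span):
    """Score one example: each predicted span overlapping the
    hand-labeled span is a true positive, each non-overlapping
    predicted span is a false positive, and a false negative is
    scored if no predicted span overlaps the hand label."""
    tp = sum(overlaps(p, true_span) for p in pred_spans)
    fp = len(pred_spans) - tp
    fn = int(tp == 0)
    return tp, fp, fn
```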
Table II shows the highest performing settings of t and η. For t of 2 V or below, the performance is insensitive to the exact value. We selected t = 1 V for convenience.
Algorithm performance on the hand-labeled dataset. Only the highest AUC η is shown for each threshold.
| t (V) | η | P | R | AUC |
| --- | --- | --- | --- | --- |
| 0.5 | 0.998 | 0.996 | 0.974 | 0.970 |
| 1.0 | 0.997 | 0.995 | 0.976 | 0.971 |
| 2.0 | 0.997 | 0.997 | 0.975 | 0.972 |
| 5.0 | 0.200 | 0.998 | 0.961 | 0.959 |
With a percentile-threshold of η = 0.997 (t = 1 V), the algorithm returns 5 false positives and 23 false negatives, with an AUC of 0.971. An inspection of these errors reveals that the five false positives are all cases where the human labeler did not label a second ELM in the time window; i.e., they are true positives. Of the false negatives, 18 come from two shots where the timing system failed to appropriately line up the signals from all three diagnostic types. Thus, the precision of the algorithm is perfect, while the recall depends on the functioning of the signal digitizer’s time registration. Assuming improved inter-diagnostic timing registration, we expect that the algorithm will produce nearly ideal detection of ELMs similar to those labeled by hand.
We emphasize here that the percentile setting is not a user-controlled parameter. It has been set via tuning the algorithm’s performance on our test set, and any changes made to this parameter will require re-evaluation of the algorithm performance on a different hand-labeled set. As we shall see shortly, these settings (and the overall algorithm design) show a similar performance on seven years’ worth of data at DIII-D. Hand tuning of these values on a per-shot basis is not feasible, as our goal is to discover, en masse, ELMs in a large dataset of tens of thousands of shots at DIII-D.
Hereafter, we focus on precision as our evaluation metric for two reasons. First, when creating a database of ELM events, our primary concern is making sure that the database contains only ELMs. Second, it is far easier for an expert to review potential ELMs that have already been labeled than it is to comb through the raw data looking for ELMs. Potential ELM discovery is, in fact, the process we wish to automate with this work.
To further evaluate the performance of the algorithm, we ran ELM-O on randomly sampled time windows from a number of shots. The shots were divided into two groups: shots (but not ELM events) that were previously used to produce the 972 hand-labeled ELMs on which ELM-O’s parameters were tuned (group 1) and shots disjoint from these shots (group 2). The group 1 dataset contains 49 shots numbered within the range from 166 433 (April 4, 2016) to 173 224 (October 12, 2017). These shots had been selected prior to the work described here because the human expert expected to see type I ELMs in these time series. For each shot, ELM-O was shown a random 200 ms window and labeled all the ELMs in that window. This random window was selected from the entire time series recorded by the control system, excluding 200 ms at the beginning and at the end of the shot. We did not otherwise perform any selection for plasma status. This procedure produced 299 ELMs. An expert review of these ELMs found only four of them to be false positives. These four came from two shots and form pairs, separated by 200–300 µs, that label the same event, described by an expert as “possibly a disruption of the plasma.” Nevertheless, the precision of ELM-O on this dataset is 98.7%.
Group 2 ELMs are composed as follows: first, we identified a corpus of 9941 shots by filtering the entire DIII-D database with the following criteria: (1) the operational mode of the tokamak is set to “plasma,” (2) the discharge lasts for more than 1 s, and (3) the diagnostics we use in the algorithm show data in the database. Two hundred shots were randomly selected from this set, spanning shot numbers 156 562 (April 7, 2014) to 187 328 (June 24, 2021); after excluding the entire range of shot numbers covered by the original 972 hand-labeled ELMs, 76 shots remained, 58 before the group 1 shots and 18 after. A 200 ms window was randomly selected from each of these shots in the same way as for group 1. ELM-O found 392 ELM candidates in this dataset, and an expert review found that 97.7% (383/392) are indeed ELMs.
An inspection of the results on signals that contain smaller ELMs suggests that a different method is needed to detect all the ELMs in a data series. In Fig. 6, the filterscope shows some lower intensity ELMs between the larger ELMs. They also appear in the BES signal. However, these smaller ELMs are not large enough to exceed the percentile-threshold, so they are not labeled as ELMs. Identifying these smaller ELMs automatically is the subject of ongoing work.
100 ms window from a period with both large and small ELMs. The top row shows the interferometer signals, the middle row shows the filterscope signals, and the bottom row shows the mean BES signal. The vertical regions in green show the parts of each signal labeled as ELMs, whereas the red stars in the bottom panel show the ELMs found by the ELM-O algorithm.
Finally, we make a brief comparison between ELM-O and the U-Net model trained on data from KSTAR.31 Table III shows the values of precision and recall for both models (reported in the other work as positive prediction rate and true positive rate, respectively). Because we lack estimates for recall on our test data (as described earlier), we limit our comments to precision and simply note that ELM-O has a smaller reduction in precision when deployed on unseen data. We do not make much of this difference because the U-Net was trained to identify smaller ELMs that we know ELM-O does not identify. Once again, we highlight the complementary nature of the two ELM detection methods: ELM-O uses multiple signals to identify ELMs, whereas the U-Net uses a single filterscope signal and identifies ELMs “not [by] the intensity but the shape of the [ELM] peaks.”31
Comparison between the performance of ELM-O and the reported performance of the U-Net described elsewhere.31 The values for the test set (shots 18 396 and 29 487) were interpolated from Figs. 9(a) and 10(a), respectively.
| Model | Data | P | R |
| --- | --- | --- | --- |
| ELM-O | Training | 0.995 | 0.976 |
| ELM-O | Group 1 | 0.987 | ⋯ |
| ELM-O | Group 2 | 0.977 | ⋯ |
| U-Net | Training | 0.924 | 0.935 |
| U-Net | 18 396 | 0.84 | 0.96 |
| U-Net | 29 487 | 0.87 | 0.88 |
III. CONCLUSIONS AND FUTURE WORK
“‘Big Data’ has made machine learning much easier,”37 and we expect that “machine learning” will do the same for plasma science but only if the inevitable hunger for labels can be fed. In this article, we have shown that the algorithm ELM-O is an effective way to automatically label ELMs in shots at DIII-D with no user-selected parameters. Testing the algorithm on shots spanning more than half a decade and with little regard for plasma parameter settings indicated that the false positive rate is less than 3%. A lower false positive rate, less than 2%, was achieved when a more careful selection of shots was made. Therefore, we conclude that ELM-O is a good candidate for generating massive datasets of autonomously labeled ELMs, numbering in perhaps the millions. This is a dramatic improvement over current datasets that are painstakingly generated by human eye and number only in the thousands.
Since ELM-O relies on percentile-thresholds, it is robust to scale changes in the diagnostics, i.e., the gain setting of the filterscopes. However, this also means that ELM-O will not label smaller ELMs in the presence of larger ELMs. See Fig. 6 for an example. Using the ELM-O-derived data, we are currently working on unsupervised machine learning methods for detecting ELMs of all shapes and sizes. In addition, we are working on training a U-Net model with the large labeled datasets we can generate using ELM-O.
ACKNOWLEDGMENTS
The authors would like to thank R. Shousha and E. Kolemen for useful discussions. This work was supported by the Department of Energy, Office of Fusion Energy Science, under Field Work Proposal No. 100636 “Machine Learning for Real-time Fusion Plasma Behavior Prediction and Manipulation” and Department of Energy Grant No. DE-SC0021157. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility, under Award No. DE-FC02-04ER54698.
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Finn H. O’Shea: Conceptualization (equal); Data curation (lead); Formal analysis (lead); Investigation (lead); Methodology (lead); Software (lead); Supervision (equal); Validation (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead). Semin Joung: Data curation (supporting); Writing – review & editing (equal). David R. Smith: Data curation (supporting); Funding acquisition (equal); Methodology (supporting); Writing – review & editing (equal). Ryan Coffee: Conceptualization (equal); Funding acquisition (equal); Methodology (supporting); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study. The code used to label ELMs is available at https://github.com/finnoshea/PublicELMO.