We introduce EFIT-Prime, a novel machine learning surrogate model for EFIT (Equilibrium FIT) that integrates probabilistic and physics-informed methodologies to overcome typical limitations associated with deterministic and ad hoc neural network architectures. EFIT-Prime utilizes a neural architecture search-based deep ensemble for robust uncertainty quantification, providing scalable and efficient neural architectures that comprehensively quantify both data and model uncertainties. Physically informed by the Grad–Shafranov equation, EFIT-Prime applies a constraint on the current density and a smoothness constraint on the first derivative of the poloidal flux, ensuring physically plausible solutions. Furthermore, the spatial location of the diagnostics is explicitly incorporated in the inputs to account for their spatial correlation. Extensive evaluations demonstrate EFIT-Prime's accuracy and robustness across diverse scenarios, most notably showing good generalization on negative-triangularity discharges that were excluded from training. Timing studies indicate an ensemble inference time of 15 ms for predicting a new equilibrium, offering the possibility of plasma control in real-time, if the model is optimized for speed.
I. INTRODUCTION
Reconstruction of magnetohydrodynamic (MHD) equilibria from a series of external and internal diagnostic measurements is a crucial aspect of tokamak research and operations worldwide. It provides essential information on the magnetic geometry, current, and pressure profiles that are necessary for tokamak data analysis and interpretation, plasma stability and control, and code and physics model validation. The EFIT code1,2 has been extensively used in many tokamaks around the world to reconstruct MHD equilibria based on experimental constraints derived from measurements. Due to its widespread application, there exists a broad body of experimental equilibrium reconstruction data, from machines such as DIII-D,3 EAST,4 JET,5 KSTAR,6 and NSTX.7 A more comprehensive list of references for experiments using EFIT can be found in Ref. 8.
EFIT has three general modes of operation corresponding to its applications: real-time operation for use in the plasma control system (PCS), between-shot analysis for experimental planning, and post-processing analysis for experimental interpretation and theory validation. The real-time mode, known as RT-EFIT, which uses either external magnetics data or external magnetics with motional Stark effect (MSE) diagnostic data, is a reduced version of EFIT that does not completely numerically solve the equilibrium equation9 because of the time spent in inverting the Grad–Shafranov operator. For each use, the inputs used to constrain the equilibrium can vary. All the modes of operation use magnetic inputs. Between-shot analysis typically uses measurements based on the spectroscopic analysis of neutral beams and ionization. The spectroscopic analysis uses MSE to provide internal measurements of the magnetic field pitch angle for improved accuracy. Measurements of the species densities and temperatures, known as the kinetic profiles, can be used for additional constraints.
The reconstruction problem can be rewritten as $y = F(d)$, where $y$ is the vector of EFIT outputs that encompasses fundamental quantities like the poloidal flux $\psi$ and derived quantities like the internal inductance $\ell_i$ and normalized beta $\beta_N$, $d$ represents the input data, and $F$ is the observed data-generating function of EFIT. The input vector $d$ consists of many different types of internal and external measurements. However, since our aim in this work is to build a surrogate for magnetic EFIT, only the external magnetic data are considered here, thus $d = d_{\rm mag}$. The goal of this surrogate is then to fit the equation $y = F(d_{\rm mag})$ with a neural network function $F_{\rm NN}$.
This work presents a reduced-order or surrogate model of the magnetics-only EFIT reconstructions using DIII-D data. There has been a multitude of recent work on reduced-order models based on artificial neural networks for equilibrium reconstruction in various experiments around the world. A table summarizing surrogates for EFIT specifically is shown in Table I. Outside EFIT, there have also been parallel efforts recently to build surrogate models for the variational moments equilibrium code (VMEC).20 In addition, our focus is developing a surrogate model for the EFIT inverse problem of predicting the poloidal flux, while works such as Ref. 21 build surrogates of the forward problem governed by the Grad–Shafranov equation. One of the primary interests in a machine learning surrogate is to replace RT-EFIT, because the PCS requires fast feedback and it is possible to have a surrogate that is more accurate than RT-EFIT given its limitations. The PCS does not necessarily need a complete representation of EFIT, hence the interest in using derived quantities, such as the location of the last closed-flux surface (LCFS).
Several papers have used neural networks to create a surrogate model for EFIT. These papers differ not only in the techniques used to determine the neural net, but also in the inputs, outputs, and experimental data used. A drop-in replacement for EFIT would require outputting the poloidal flux at a minimum, but often derived quantities, including the last closed-flux surface (LCFS), are the most useful for a plasma control system (PCS). Our goal is to build upon prior work to produce a version that is faster, more accurate, and more robust than prior versions.
Summary of machine-learning surrogate models for EFIT, with their inputs, outputs, and source experiments.

| Paper | Inputs | Outputs | Experiment |
|---|---|---|---|
| This paper | Magnetics data | $\psi$, $J_\phi$ | DIII-D |
| Joung12 | Magnetics data | $\psi$ | KSTAR |
| Wai13 | Magnetics data | $\psi$ | NSTX-U |
| Joung14 | Magnetics data | $\psi$ | KSTAR |
| Lu15 | Magnetics data | $\psi$, LCFS | KSTAR |
| Shousha16 | Magnetics, MSE, TS, CER, RT-EFIT | $\psi$ | DIII-D |
| Wan17,18 | Magnetics data | $\psi$ | EAST |
| Wei19 | Magnetics data | $\psi$ | DIII-D |
Prior surrogate models summarized in Table I often relied on low-resolution data, exhibited normalization inconsistencies that do not align with the physics, utilized deterministic frameworks that are not equipped to capture the underlying uncertainties, and employed ad hoc neural network architectures that are not optimized for the task. To address these issues and develop an accurate and robust model, we introduce EFIT-Prime, a novel machine learning surrogate model for EFIT that integrates probabilistic and physics-informed methodologies to overcome the typical limitations associated with deterministic and ad hoc neural network architectures. The probabilistic approach allows one to quantify the uncertainty in a model's prediction, in addition to the point estimate typically seen with deterministic models. We adopt a deep-ensembles-based uncertainty quantification approach, where a neural architecture search22,23 is used to obtain a diverse set of ensembles of models in a scalable way by utilizing leadership-class computing systems. Deep ensembles are currently the state of the art for large neural network models, compared to variational inference and other probabilistic modeling approaches, performing close to the gold standard of Markov chain Monte Carlo Bayesian methods for larger models in both in-distribution and out-of-distribution prediction and generalization contexts, as highlighted in previous works.24 Deep ensembles allow for reliable quantification of both aleatory (irreducible) and epistemic (reducible) uncertainties. We also introduce physics constraints by using a multi-model neural architecture search, focusing on the magnetic flux ($\psi$) and the toroidal current density ($J_\phi$) learned in tandem. This hybrid architecture-penalty constraint approach concurrently learns $\psi$ along with $J_\phi$ while enforcing axisymmetry and Ampère's law.
An important property related to the accuracy and robustness of a machine learning (ML) model is generalizability. A model is said to be generalizable if it gives accurate predictions and uncertainties even in regimes that were not included in its training. As we move into the burning plasma era, where we will have fewer diagnostics and less control over plasmas, this becomes increasingly important. One of the unique features of this work is our rigorous test of generalizability: probing the ability to predict an extreme plasma shape, negative-triangularity (NT) plasmas, even when they are not included in the training set.
The rest of this paper is organized as follows. The neural architecture search (NAS) framework and overall methodology are described in Sec. II. The database creation and the aggregation of the NN input and target vectors are covered in Sec. III. The results of the final model EFIT-Prime are presented in Sec. IV. Variations (or ablations) of EFIT-Prime to study the impacts of the physics-informed components and other modeling strategies are presented in Sec. V, culminating in a rigorous test of generalizability using NT discharges from DIII-D in Sec. V D. A summary of the manuscript, additional discussion, and future work are presented in Sec. VI.
II. EFIT-PRIME SURROGATE MODELING
A. Probabilistic modeling using NAS-based deep ensembles
Most machine learning-based surrogate models are developed using standard neural architectures or ad hoc network choices. This includes similar work on EFIT-based neural nets,12,14,25 which, while versatile, may not be specifically tuned to a problem's unique aspects. They often miss out on quantifying prediction uncertainties, which is crucial for reliable decision-making and risk assessment. Here, we describe the neural architecture search (NAS)26 adopted in this work to optimize both the architecture (e.g., recurrent neural network, convolutional neural network, multi-layer perceptron) and the model hyperparameters (e.g., number of layers, weights) together.
Neural architecture search is formulated as a bi-level optimization: the outer level for architecture parameters and the inner level for model parameters, based on the chosen architecture. This approach ensures a thorough exploration of both parameter spaces. As described in Sec. I, our goal is to train over a large dataset with approximately 180 000 equilibria. To meet the computational challenge of searching for the optimal architecture and framework, leadership-class facilities are used to provide the computational power needed. Even with these facilities, it is important to choose the correct optimization techniques for an efficient search. We use a framework combining Aging Evolution (AgE), rooted in the paradigm of evolutionary algorithms,27,28 for the outer loop with asynchronous Bayesian Optimization (BO)29 for the inner loop, together giving rise to the AgEBO technique.
The outer loop method, Aging Evolution (AgE), creates various neural architectures and enables concurrent training across multiple nodes. AgE follows an iterative mutation process: at each iteration $t$, it mutates existing architectures, leading to new candidates. This process is described as $P_{t+1} = \text{Mutate}(P_t, M_t)$, where $P_t$ and $P_{t+1}$ are the architecture populations at iterations $t$ and $t+1$, Mutate is the mutation operation, and $M_t$ are the mutation rules at iteration $t$. The method parallelizes efficiently.
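For concreteness, a minimal sketch of such an aging-evolution loop is given below in Python; the tournament sampling, mutation operator, and fitness function are illustrative placeholders, not the DeepHyper implementation, which additionally trains candidates concurrently across nodes.

```python
import random
from collections import deque

def aging_evolution(init_population, mutate, evaluate, iterations, sample_size):
    """Minimal AgE loop: sample a tournament, mutate its best member,
    and age out the oldest architecture in the population."""
    population = deque(init_population)          # FIFO: oldest architecture first
    history = list(init_population)
    for _ in range(iterations):
        tournament = random.sample(list(population), sample_size)
        parent = max(tournament, key=evaluate)   # best candidate in the tournament
        child = mutate(parent)                   # apply a mutation rule to the parent
        population.append(child)                 # grow the population ...
        population.popleft()                     # ... and retire ("age out") the oldest
        history.append(child)
    return max(history, key=evaluate)            # best architecture seen overall
```

In the actual framework, the evaluation of a candidate corresponds to training it (with BO tuning its hyperparameters in the inner loop), and evaluations run concurrently across compute nodes.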
The implementation of our NAS method is done with DeepHyper.23 It employs a modular and flexible framework for defining the space of neural architectures, thus facilitating a comprehensive exploration of various architectural configurations. It encapsulates the architectural space through a high-level abstraction, which allows specifying ranges and types of architectural components, such as layers, activation functions, and other hyperparameters. This formulation is amenable to a variety of search algorithms to navigate the architectural landscape efficiently. Leveraging the capabilities of Ray,31 a distributed computing library, DeepHyper can parallelize the architecture search and training processes across multiple compute nodes or cores. This parallelization is instrumental in substantially accelerating the search, especially when navigating through a vast space of possible architectures.
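The parallelization pattern can be illustrated with Ray's basic task primitives alone; the sketch below is schematic (the toy fitness stands in for an actual training run) and does not reproduce DeepHyper's internals.

```python
import random
import ray

ray.init()  # local machine by default; pass a cluster address in production

@ray.remote
def train_and_score(architecture):
    # Placeholder for the real work: build the candidate network from its
    # description, train it, and return a validation score.
    return -abs(architecture["n_layers"] - 4) + random.random()  # toy fitness

candidates = [{"n_layers": n} for n in range(2, 10)]
futures = [train_and_score.remote(c) for c in candidates]  # launched concurrently
scores = ray.get(futures)                                  # gather when finished
best = max(zip(scores, candidates), key=lambda t: t[0])[1]
```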
The comprehensive NAS combined with the AgEBO approach as formulated here then has the following advantages: (1) automatically optimized neural models can be found using leadership class facilities, (2) improved robustness enabled by an ensemble approach, and (3) uncertainty quantification, with the categorization of two types of uncertainty, in those predictions. These are crucial features needed to build a robust and accurate surrogate that may not be attained by an un-optimized multi-layer perceptron (MLP) NN, even those that may perform similarly on an in-distribution test set. Such naive approaches often perform poorly when inferring on out-of-distribution data.
B. Incorporating physics constraints in NAS-based deep ensembles
Up to this point, we have described a general method for forming a performant, probabilistic neural network that is independent of the specific physics that we wish to study. Here, we describe two physics-based constraints that we wish to impose and study their effects on prediction accuracy. The first is a smoothness constraint on the first derivative of the poloidal flux, imposed through a smoothing loss term ($S_{\rm loss}$) added to the $\psi$ loss in Eq. (10).
The next constraint that we wish to enforce is the Grad–Shafranov equation itself, referred to as the $J_\phi$ constraint henceforth, which follows the design philosophy of physics-informed neural networks.35,36 To do so, we introduce a second model that predicts $J_\phi$ from the right side of the Grad–Shafranov equation, given the $\psi$ predicted by the first model. The right side is valid only inside the separatrix, and our goal is to include only those contributions. That is, while our flux inference covers both the plasma and coil contributions, the constraint should only constrain the current inside the separatrix. This is because calculating $J_\phi$ from $\psi$ is inaccurate for the coil currents, which EFIT computes using Green's functions. There are two possible methods for including the current inside the separatrix: (1) differentiate the predicted $\psi$ to arrive at $J_\phi$ via automatic differentiation, as was carried out in Ref. 12; after this is calculated, an additional step would be needed to retain only the values inside the separatrix. (2) Learn the right side directly via training. The latter approach is not only less computationally intensive, but using the right side of the Grad–Shafranov equation requires one less differentiation and thus gives a $J_\phi$ that is smoother. This yields an indirect constraint on $\psi$, but works well, as seen in Sec. IV. An explanation for how our approach yields a $J_\phi$ consistent with $\psi$ is given in Appendix A.
The loss function for this additional constraint is given by Eq. (4), with $J_\phi$ as the target quantity.
To apply this hybrid-physics constraint, we adopt a dual-model setup where the first model is an MLP that learns $\psi$ as a function of the external magnetic data $d_{\rm mag}$, and the second model is another MLP that learns $J_\phi$ as a function of $\psi$. In other words, these two models are linked to enforce the Grad–Shafranov equation.
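A minimal sketch of such a dual-model setup is shown below in PyTorch; the layer sizes, grid size, and plain mean-squared-error losses are illustrative stand-ins for the NAS-derived architectures and the likelihood-based losses of Eqs. (4) and (10).

```python
import torch
import torch.nn as nn

class DualModel(nn.Module):
    """Two linked MLPs: model_psi maps magnetic inputs to the flux, and
    model_j maps the *predicted* flux to the current density, so the two
    targets are learned in tandem."""
    def __init__(self, n_mag=30, n_grid=65 * 65, n_j=300):
        super().__init__()
        self.model_psi = nn.Sequential(
            nn.Linear(n_mag, 512), nn.ReLU(), nn.Linear(512, n_grid))
        self.model_j = nn.Sequential(
            nn.Linear(n_grid, 512), nn.ReLU(), nn.Linear(512, n_j))

    def forward(self, d_mag):
        psi = self.model_psi(d_mag)   # flux from the magnetics
        j_phi = self.model_j(psi)     # current density from the predicted flux
        return psi, j_phi

model, mse = DualModel(), nn.MSELoss()
d_mag = torch.randn(8, 30)                       # toy batch of magnetic inputs
psi_true = torch.randn(8, 65 * 65)
j_true = torch.randn(8, 300)
psi_pred, j_pred = model(d_mag)
loss = mse(psi_pred, psi_true) + mse(j_pred, j_true)  # joint loss couples both
loss.backward()
```

Because the second model consumes the first model's output, gradients of the $J_\phi$ loss flow back into the $\psi$ model, which is what couples the two targets.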
The smoothing loss function described above was also necessary to mitigate the noise that arises in $\psi$ due to the $J_\phi$ constraint. Since the two targets are learned concurrently in our setup, the strong coupling between $\psi$ and $J_\phi$ feeds noise back into $\psi$. This effect is strongest in the plasma core, where the current density peaks. By adjusting the amplitude of the $S_{\rm loss}$ term in Eq. (10), we were able to restore the smoothness in $\psi$. However, finite-difference calculations of $J_\phi$ indicate that a higher-order continuity, i.e., smoothness of the second spatial derivative of $\psi$, may be needed.
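A generic finite-difference version of such a smoothness penalty, here penalizing jumps in the first derivative of the predicted flux (i.e., its second differences), might look as follows; this is a sketch, not the exact $S_{\rm loss}$ term of Eq. (10), and the amplitude is an illustrative choice.

```python
import torch

def smoothness_loss(psi):
    """Penalize jumps in the first spatial derivatives of the flux by
    taking second finite differences along R and Z on a (batch, nz, nr) grid."""
    d2_r = psi[:, :, 2:] - 2 * psi[:, :, 1:-1] + psi[:, :, :-2]
    d2_z = psi[:, 2:, :] - 2 * psi[:, 1:-1, :] + psi[:, :-2, :]
    return (d2_r ** 2).mean() + (d2_z ** 2).mean()

psi = torch.randn(8, 65, 65, requires_grad=True)  # toy predicted flux maps
(0.1 * smoothness_loss(psi)).backward()           # 0.1 is an illustrative amplitude
```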
This summarizes the physics constraints, and how they lead to a dual-model setup with a joint loss function. Another important physics-informed training method is how the magnetic data are treated. Specifically, in Sec. III, we discuss how to explicitly encode the correlations inherent in the magnetic signals.
III. TRAINING DATA
To train our EFIT-Prime model, a dataset was created using approximately 180 000 magnetically constrained equilibria from approximately 800 discharges from the 2019 DIII-D campaign. This dataset features a diverse array of plasma conditions and includes the ramp-up/down (or pre/post-flat top) stages to ensure comprehensive coverage and model robustness.37 We split these data into 80% for training (145 701 samples), 10% for validation (18 214 samples), and 10% for testing (18 212 samples). Preparing the data in this fashion erases the individual identity of a single discharge, i.e., time slices from a single discharge can easily be in training, validation, and test datasets all at once.
The creation of this dataset was greatly facilitated by the development of workflow tools and data standards. Specifically, OMFIT38 was used to generate the bulk of the data. To ensure data quality, we implemented an equilibrium quality check and discarded any magnetic equilibrium that fails a set of threshold checks on equilibrium-quality metrics, which usually indicate "poor" conditions for the plasma to remain in 2D force-balance. An additional filter that removes the equilibria from disrupting plasmas is also applied.
A second dataset, which excludes all negative triangularity (NT) discharges from the above dataset, containing approximately 178 000 magnetic equilibria, was also created. These data are used to train different variations of EFIT-Prime pursued under the ablation studies described in Sec. V. Withholding NT from the training of these other models gives us a stringent test of generalizability by testing each model on out-of-distribution NT discharges, specifically DIII-D discharges 180526–180528 and 180533, containing 957 equilibria. Note that the NT equilibria contained in the EFIT-Prime dataset have no overlap with the NT equilibria that come from these four NT discharges. These additional datasets are also summarized in Table II. We next discuss the preparation of the magnetics data, poloidal magnetic flux, and toroidal current density that are used to train the EFIT-Prime model.
Summary of data sets used in training and later stress-testing the EFIT-Prime model, where the negative-triangularity discharges are withheld from training but used in inference. While the EFIT-Prime dataset contains some NT equilibria, none of the NT data comes from the four NT discharges used to create the NT inference set shown in row 3.
| Data set type | Total number of equilibria | Training | Validation | Test |
|---|---|---|---|---|
| EFIT-Prime dataset | 182 122 | 145 701 | 18 214 | 18 212 |
| NT-withheld dataset | 178 282 | 142 624 | 17 829 | 17 829 |
| NT inference | 957 | ⋯ | ⋯ | 957 |
A. The NN input vector: Experimental magnetics data
As discussed in the introduction, Sec. I, the magnetic signals form the basis of equilibrium reconstruction in this work. For DIII-D, these magnetic signals consist of the measurements of the poloidal magnetic field by an array of 76 3-axis magnetic probes (MP), the poloidal magnetic flux picked up by 44 flux loops (FL), the absolute value of the plasma current $I_p$, which is a scalar quantity, the electrical currents in the 18 external poloidal field coils (FC) used in shaping the plasma, and the currents in the 6 Ohmic coils (EC). These measurements from the magnetic sensors and currents in the coils, which add up to 145 signals in total, are used to construct the input vector $d_{\rm mag}$. The 145 magnetic signals are summarized in Table III, and their locations (not including $I_p$) in the DIII-D poloidal cross section are shown in Fig. 1.
Summary of the magnetic signals that are used in the neural net input vector.
| Input | Definition | Data size |
|---|---|---|
| MP | Poloidal magnetic field | 76 |
| FL | Flux loops | 44 |
| $I_p$ | Plasma current | 1 |
| EC | Ohmic coils | 6 |
| FC | Poloidal field coils | 18 |
| Total | | 145 |
A cross section of the DIII-D tokamak with all of the external diagnostics and PF coils that enter magnetic EFIT as least-squares constraints. The red line segments represent the 76 magnetic probes, the blue circles the 44 flux loops, and the orange blocks the 18 poloidal field (PF) coils and 6 Ohmic coils. The gray corresponds to the vacuum vessel wall.
For training of the EFIT-Prime model, all 145 input features are normalized by various combinations of the time-varying (toroidal) vacuum magnetic field $B_{T0}$ and the major and (average) minor radii of DIII-D, $R_0 = 1.67$ m and $a = 0.67$ m (and the vacuum permeability $\mu_0$ for the coil currents), to bring the scale of the inputs approximately to within the $[-1, 1]$ range. For example, the magnetic-probe measurements are scaled by a factor that is approximately the poloidal equivalent of the vacuum magnetic field. An example of normalized inputs for DIII-D discharge 180087 is shown in Fig. 2. The arrows correspond to the mean signal at that particular location over all time slices within the discharge. The range of the vertical bar represents three standard deviations of each measurement/current. Not all of the 145 signals are necessarily active during a discharge. Often a few of the magnetic-probe measurements are discarded for various reasons, such as poor measurement quality, calibration issues, or other data-quality factors determined during the experimental run. The present implementation treats the missing signals by replacing them with their synthetic counterparts reconstructed by EFIT.
Mean (arrows) normalized magnetic diagnostics and coil currents and their standard deviations (vertical bars) for DIII-D discharge 180087. All 145 input features are normalized by various combinations of the time-varying (toroidal) vacuum magnetic field $B_{T0}$ and the major and (average) minor radii of DIII-D, $R_0 = 1.67$ m and $a = 0.67$ m (and the vacuum permeability $\mu_0$ for the coil currents), to bring the scale of the inputs approximately within the $[-1, 1]$ range.
The ability of ML models to generalize, that is, to give good predictions in regimes not included in the training, is highly dependent on the adequacy of the training data to represent the underlying phase-space features and constraints. One of our goals is to investigate an important factor of the training data: the format of the data itself. Traditional methods of ML model training in plasma physics input the data as a 1D vector, i.e., in a tabulated fashion similar to what is seen in Fig. 2, without attempting to explicitly encode spatial correlations between the different sensors/coils. That is, the spatial locations are excluded from the inputs. To overcome this limitation, we embed the magnetic sensor measurements and coil currents in a 2D coordinate plane.
This spatial embedding procedure is as follows: first, the spatial coordinates of the magnetic sensors and PF/Ohmic coils are mapped to the corresponding indices on the 2D image canvas. If a sensor maps to an index, its measurement value is directly assigned to the pixel at that index on the image. In cases where multiple sensors map to the same index, the mean value of the measurements is assigned to the pixel at that index. Finally, we choose to embed $I_p$ at the center of the grid, as it is a scalar quantity. Another option could be to distribute it poloidally into many filaments whose total current would add up to $I_p$.
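A schematic version of this embedding is sketched below; the grid extents and resolution are illustrative choices, not the exact canvas used in this work.

```python
import numpy as np

def embed_signals(values, coords, r_grid, z_grid):
    """Place each measurement at the pixel nearest its (R, Z) location,
    averaging values that land on the same pixel."""
    canvas = np.zeros((len(z_grid), len(r_grid)))
    counts = np.zeros_like(canvas)
    for v, (r, z) in zip(values, coords):
        i = np.argmin(np.abs(z_grid - z))   # nearest pixel row (Z index)
        j = np.argmin(np.abs(r_grid - r))   # nearest pixel column (R index)
        canvas[i, j] += v
        counts[i, j] += 1
    np.divide(canvas, counts, out=canvas, where=counts > 0)  # mean where shared
    return canvas

r_grid = np.linspace(0.8, 2.5, 128)   # illustrative DIII-D-like extents
z_grid = np.linspace(-1.6, 1.6, 128)
canvas = embed_signals([0.1, -0.2], [(1.0, 0.5), (2.3, -1.1)], r_grid, z_grid)
canvas[len(z_grid) // 2, len(r_grid) // 2] = 1.0  # scalar I_p at the grid center
```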
Because the embedding inflates the size of the input vector by roughly two orders of magnitude, from 145 features to the full 2D image grid, a second step entailing the compression of the input data is carried out. This compression uses principal component analysis (PCA), where only the first 30 principal components are retained; these contain more than 99% of the information in the embedded magnetic inputs, as indicated by Fig. 3. This is a compression of the input data by at least a factor of 100, which provides large savings in the computational cost of training otherwise immense NNs, whose inputs (because of the embedding) and outputs would both contain as many features as the full 2D grid.
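In code, this compression step can be expressed with a standard PCA implementation; the random images below merely stand in for the embedded magnetic inputs.

```python
import numpy as np
from sklearn.decomposition import PCA

images = np.random.rand(1000, 128, 128)      # stand-ins for embedded input images
X = images.reshape(len(images), -1)          # flatten each 2D canvas to a vector

pca = PCA(n_components=30).fit(X)            # fit on the training set only
X_compressed = pca.transform(X)              # (1000, 30) features fed to the NN
print(pca.explained_variance_ratio_.sum())   # fraction of variance retained
```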
Explained variance showing the amount of compressed information in the first 30 PCA components for a 2D image of embedded magnetic inputs.
Next, we describe the targets of the NN training.
B. The NN targets: The poloidal magnetic flux and toroidal current density
During supervised training, the EFIT-Prime model was supplied with true values of a quantity that it attempted to reproduce in its output. These true values are the target vector of the model. For the present applications, the target vector is the poloidal magnetic flux $\psi$ and the toroidal current density $J_\phi$ on the uniform EFIT mesh. The flux is first normalized to lie over the range [0, 1] within the plasma via $\psi_N = (\psi - \psi_{\rm axis})/(\psi_{\rm bdry} - \psi_{\rm axis})$, using the flux at the magnetic axis $\psi_{\rm axis}$ and at the plasma boundary $\psi_{\rm bdry}$.39 It is then flattened to form a 1D array, yielding one feature per mesh point per sample.
The toroidal current is first de-dimensionalized, then compressed into 300 principal components to accelerate the NN training. These coefficients are further scaled with a standard scaler to have zero mean and unit variance before the training. The NNs are tasked with learning the (scaled) coefficients of these principal components. A similar dimensional reduction of NN targets was carried out in Refs. 40 and 41. Note that the principal components are extracted from the training set alone, representing 80% of the data. It is assumed that the resulting set of basis vectors forms a complete set for the $J_\phi$ seen in most of the DIII-D scenarios and discharges. The first four of these basis vectors, i.e., principal components, are shown in Fig. 4. The dominant structure of $J_\phi$, with its current centroid, appears in the first component; the second and third components correspond to radial and axial shifts, resulting in an inward and downward displacement of the current centroid. The fourth component results in a similar shift of the current centroid as the second and third components.
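The target pipeline can be sketched as follows; the grid size is illustrative, and the de-dimensionalization factor is omitted since its exact form is not reproduced above (the array is assumed to be already dimensionless).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

j_phi = np.random.rand(1000, 65 * 65)        # stand-in dimensionless J_phi maps

pca = PCA(n_components=300).fit(j_phi)       # basis from the training set alone
scaler = StandardScaler().fit(pca.transform(j_phi))
targets = scaler.transform(pca.transform(j_phi))  # what the NN actually learns

# At inference, a predicted coefficient vector is mapped back to the grid:
j_grid = pca.inverse_transform(scaler.inverse_transform(targets[:1]))
```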
The first four PCA components of the toroidal current density for the dataset used in training EFIT-Prime.
IV. TRAINING RESULTS OF EFIT-PRIME
In this section, the results obtained with the proposed EFIT-Prime model are discussed in detail.
The two metrics used here, the coefficient of determination ($R^2$) and the structural similarity index (SSIM), calculated for each $\psi$ and $J_\phi$ sample over the mesh, assess accuracy and similarity to observed data, respectively. We then aggregate these individual metrics over the entire test set to form histograms that visualize the overall performance distribution. Furthermore, a visual inspection of the predicted vs observed $\psi$ and $J_\phi$ for the best, median, and worst predictions is provided, giving us a layered understanding of the model's accuracy in predicting the magnetic equilibria.
For this model, an ensemble is formed out of the top five NN configurations, which are then utilized to quantify the overall uncertainty, consisting of both the aleatory and epistemic uncertainties, for the poloidal flux and the toroidal current density across the 2D grid. Examining the relative uncertainty magnitudes in different regions and under various plasma scenarios gives us a measure of confidence, highlighting the model's predictive reliability and how uncertainty varies across the plasma environment. Hereafter, the ensemble mean prediction is referred to as the default EFIT-Prime prediction.
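Assuming, as in standard deep ensembles, that each member predicts a mean and a variance at every grid point, the decomposition described later in this section (mean of the member variances for the aleatory part, variance of the member means for the epistemic part, per Eqs. (8) and (9)) can be computed as in the sketch below.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Deep-ensemble split: aleatory = mean of the member variances,
    epistemic = variance of the member means, per grid point."""
    aleatory = variances.mean(axis=0)     # irreducible (data) uncertainty
    epistemic = means.var(axis=0)         # reducible (model) uncertainty
    return means.mean(axis=0), aleatory, epistemic

# Toy example: 5 ensemble members, each predicting a 65x65 flux map
# together with a per-point predictive variance.
means = np.random.rand(5, 65, 65)
variances = 1e-4 * np.random.rand(5, 65, 65)
ensemble_mean, au, eu = decompose_uncertainty(means, variances)
```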
The overall performance of the EFIT-Prime model can be assessed from the $R^2$ and SSIM distributions of the prediction of $\psi$, shown in Fig. 5(a), and of $J_\phi$, shown in Fig. 6(a). EFIT-Prime produces good accuracy in its prediction of $\psi$, as evidenced by the tight clustering of both the $R^2$ (blue) and SSIM (orange) distributions toward the right boundary, suggesting nearly perfect agreement between the mean predictions and the truth. The $R^2$ distribution is clustered tightly toward the right boundary, with more than 99.5% of the samples having $R^2 > 0.995$, indicating an accurate and robust prediction by the ensemble. Both the mean and median $R^2$ are 1.0 to within three significant digits. A similar conclusion also holds for SSIM, indicating that EFIT-Prime successfully reproduces the flux surfaces.
The mean prediction of the poloidal flux by the EFIT-Prime model, given the principal components of the external magnetic measurements and coil currents embedded in a 2D map: Shown are (a) the $R^2$ (blue) and SSIM (orange) distributions of the predicted flux for the approximately 18 000 test samples, (b)–(d) the overlay of the true flux surfaces (black) against the NN-predicted flux surfaces (red dashed) for three samples with the worst, median, and best $R^2$, (e)–(g) aleatory, and (h)–(j) epistemic uncertainties in the flux prediction for the same three samples.
The mean prediction of the toroidal current density by the EFIT-Prime model, given the principal components of the external magnetic measurements and coil currents embedded in a 2D map. Shown are (a) the $R^2$ (blue) and SSIM (orange) distributions of $J_\phi$ for the approximately 18 000 test samples, (b)–(d) the overlay of the true $J_\phi$ (black) against the NN-predicted $J_\phi$ (red dashed) for three samples with the worst, median, and best $R^2$, (e)–(g) aleatory, and (h)–(j) epistemic uncertainties in the $J_\phi$ prediction for the same three samples.
To gain further insight, the samples with the worst (left), median (middle), and best (right) $R^2$ are identified, for which we carry out a visual comparison of the poloidal flux predicted by EFIT-Prime (appearing as dashed red contours) against the true flux (appearing as solid black contours), shown in Figs. 5(b)–5(d). We observe that the flux overlay for the best and median samples is nearly perfect, making it virtually impossible to distinguish the predicted flux surfaces from the true ones. The sample with the worst $R^2$ appears to be somewhat of an outlier and could be from the early start-up phase or the end of a discharge. However, even in this case, the predicted flux surfaces do not deviate too much from the true flux surfaces.
The aleatory (AU) and epistemic (EU) uncertainties are shown in the third [(e)–(g)] and fourth [(h)–(j)] rows of Fig. 5 for the same samples mentioned above. The uncertainties are given on the same scale as the normalized flux and normalized $J_\phi$; thus, they are dimensionless and stated as variances ($\sigma^2$), in accordance with Eqs. (8) and (9). For the outlier sample with the lowest $R^2$, the two types of uncertainty are relatively large. For the best and median predictions, the uncertainties are small, nearly four-to-five orders of magnitude smaller than the normalized flux, amounting to an error of less than $10^{-4}$, attesting to the accuracy of the model. For comparison, magnetic EFITs themselves carry a finite convergence error and root-mean-square Grad–Shafranov residual. For the best and median samples, the AU is larger than the EU, whereas for the worst sample, the EU is larger. This is not an established pattern, however, and in general, our observations of EFIT-Prime and other models' results indicate the AU to be on average larger than the EU. The regions that show the largest AU are at the center of the grid, the mid-core, and the vicinity of the shaping coils on the outboard side of the plasma. The EU is also largest at the center of the grid. The elevated AU around the outboard-side coils is likely due to the variation in the coil currents associated with controlling the plasma shape, which would further increase to generate and control NT plasmas. As the AU is the mean of the variances produced by each NN configuration within the ensemble, its irreducible nature becomes apparent here: each NN configuration picks up some inherent uncertainty from the shaping coils, thereby producing a non-vanishing mean of those uncertainties. The EU, however, shows no such error around the shaping coils: since all five configurations in EFIT-Prime produce a similar level of uncertainty there, the spread among their mean predictions vanishes. This translates to a vanishing model uncertainty, as calculated by Eq. (9), showcasing the reducible nature of this type of uncertainty.
Next, we assess the quality of the EFIT-Prime prediction of $J_\phi$ on the same test set. The $R^2$ and SSIM distributions for $J_\phi$ are shown in Fig. 6(a), displaying a wider spread than the distribution for $\psi$, suggesting that the accuracy of the model in learning $J_\phi$ drops slightly compared to that for $\psi$. In fact, in this case, almost half of the test samples have a visibly lower $R^2$ (with about 3% of the test samples having a markedly poor $R^2$). This slight drop in performance is also indicated by the lower median and mean $R^2$. A comparison of the true to predicted $J_\phi$ for the same three samples is shown in Figs. 6(b)–6(d). A similar conclusion follows here as well: the worst sample appears to be an outlier. There is good qualitative agreement between the truth and the model's predictions for the median and best samples, as expected. The peak AU for the $J_\phi$ predictions appears as large as the predicted $J_\phi$ itself and orders of magnitude larger than the AU for $\psi$. This is also true for the EU. The two types of uncertainty are heavily concentrated in the plasma core. This, and the fact that the uncertainty is not localized in any of the regions that would be clear indicators of numerical error, is thought to be a consequence of the fundamental limitation of magnetic EFITs, which lack internal constraints. This effect is especially pronounced in the plasma core and should affect $J_\phi$ more strongly than $\psi$, since $J_\phi$ is a second-order spatial derivative of $\psi$. This is consistent with the well-known and documented limitations of external-magnetics-only reconstructions.1,43,44 That our framework can quantify this uncertainty is a feature, not a bug, of our modeling approach, and thus one of the highlights of the present work.
The learning history, i.e., the evolution of the loss function for the best configuration within EFIT-Prime, is shown in Fig. 7, which tracks the training and validation $\psi$ loss (black and red) given by Eq. (10) and the training and validation $J_\phi$ loss (blue and orange) given by Eq. (4), as a function of the NN epochs. In this case, the training continued until terminated by the early-stopping criterion. As expected, the validation error is slightly larger than the training error and undergoes a noisier evolution for both $\psi$ and $J_\phi$, because the validation data are never used to adjust the NN weights. The sudden drop that occurs in the loss function is due to the piecewise-constant learning-rate schedule, which drops the learning rate by a pre-defined factor after a given number of epochs. Also note that the two losses evolve at different rates: one drops several orders of magnitude rapidly, while the other evolves more gradually but still undergoes the same scheduled drop in the learning rate before epoch 50.
The training and validation loss for the targeted quantities, $\psi$ and the PCA coefficients of $J_\phi$, as a function of NN epochs for the best-performing NN model from the ensemble forming EFIT-Prime. Under the maximum-likelihood assumption, the loss function becomes the negative natural logarithm of the likelihood function, which for $\psi$ is given by Eq. (10) (including the smoothing loss described in Sec. II B) and for $J_\phi$ is given by Eq. (4) applied to its PCA coefficients.
The neural architecture corresponding to the top-performing configuration of EFIT-Prime is shown and discussed in Appendix B. NAS produces large configurations with many connections and approximately $10\times10^6$ model parameters (weights and biases); a standard deterministic MLP would also have nearly as many model parameters to optimize. The total inference time (over the five configurations that make up the EFIT-Prime ensemble) has been clocked at 15 ms, with 7 ms for $\psi$ and 7.5 ms for $J_\phi$ prediction, the latter taking as much time because of the operations that undo the scaling and PCA to map the prediction back to $J_\phi$ on the grid. This is without any parallelization or optimization; each configuration's prediction is executed sequentially and without any optimization in the Python model evaluation for inference or in the NN architecture itself. The latter can be rebuilt under NAS with an architectural sparsity constraint to speed up the inference time, which will be the topic of subsequent work.
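Such a timing measurement amounts to averaging the wall-clock time of the sequential five-member prediction loop, as in the sketch below (the lambdas are stand-ins for the actual NN configurations).

```python
import time

def time_ensemble(models, x, n_repeats=100):
    """Average wall-clock time of one sequential ensemble prediction."""
    start = time.perf_counter()
    for _ in range(n_repeats):
        _ = [m(x) for m in models]           # one unoptimized call per member
    return (time.perf_counter() - start) / n_repeats

models = [lambda x: x for _ in range(5)]     # stand-ins for the five NN configs
print(f"{1e3 * time_ensemble(models, 0.0):.3f} ms per equilibrium")
```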
V. IMPACTS OF MODEL COMPONENTS ON GENERALIZABILITY
We have pursued several, possibly advantageous, modeling strategies in the data preparation and formulation of physics constraints for building the EFIT-Prime model. To understand the contribution of each strategy to the final model, we perform ablation studies where a certain strategy, be it a model component or a different way to prepare the input vector, is removed and the model performance is then reevaluated after retraining. We pursue two main approaches of surrogate modeling in this study, starting with the removal of the $J_\phi$ constraint from the models. In the second approach, we undo the spatial embedding of the input vector and instead use tabulated magnetic inputs, again without the $J_\phi$ constraint.
For each approach, a NAS is carried out to determine an ensemble of best-performing models. A crucial aspect of this study is the training of the ablated models on a curated dataset that excludes all negative-triangularity (NT) discharges (discussed in Sec. III). To establish a baseline for the ablation study, we first retrain the EFIT-Prime model on this dataset devoid of NT. Altogether, we present three modeling approaches in this section: first, a baseline representing EFIT-Prime without NT data, followed by the two aforementioned ablative approaches that are also ignorant of NT. These three approaches are summarized as follows:

- EFIT-Prime without negative triangularity,
- $\psi$-only with spatially embedded magnetic inputs,
- $\psi$-only with tabular magnetic inputs.
The performance of models built under the three approaches is first gauged on in-distribution No-NT data in Secs. V A–V C, and then on out-of-distribution data consisting of NT discharges, presented in Sec. V D. This “handicap” of withholding NT from the training provides an excellent platform for assessing the role of each component in improving the generalizability of our models when inferring on out-of-distribution NT discharges in Sec. V D.
A. Baseline: EFIT-Prime without negative triangularity
This is the same approach as that used in the construction of the final model, EFIT-Prime. It still includes the two crucial modeling components, the spatial embedding of the inputs and the $J_\phi$ constraint, that we wish to investigate for improved generalizability, except the models, in this case, are trained on a special dataset devoid of NT discharges, shown in the second row of Table II. This is done to establish a baseline for EFIT-Prime against which we can compare the ablated models of Secs. V B and V C.
Similarly to Sec. IV, we use the $R^2$ and SSIM distributions of the ensemble mean prediction of $\psi$ and $J_\phi$, and their aleatory and epistemic uncertainties, to assess the performance of the model. We find that the $R^2$ and SSIM distributional metrics for $\psi$ are similar to those reported for EFIT-Prime in Sec. IV, where the NT data were part of the training, as shown in Fig. 8(a). Both the mean and median $R^2$ are 1.0 to within three significant digits, as was observed in the EFIT-Prime results.
The mean prediction of $\psi$ by the version of EFIT-Prime that is ignorant of negative triangularity (NT), given the principal components of the external magnetic measurements and coil currents embedded in a 2D map. Shown are (a) the $R^2$ (blue) and SSIM (orange) distributions of the predicted flux for the approximately 18 000 test samples, (b)–(d) the overlay of the true flux (black) against the NN-predicted flux (red dashed) for three samples with the worst, median, and best $R^2$, (e)–(g) aleatory, and (h)–(j) epistemic uncertainties in the flux prediction for the same three samples.
The $R^2$ and SSIM distributions for $J_\phi$ are shown in Fig. 9(a). These distributions and the mean/median $R^2$ for $J_\phi$ show a slight improvement over the results from EFIT-Prime. However, this is likely a consequence of the reduced variance in the training data used here due to the lack of NT equilibria; EFIT-Prime has to additionally accommodate the NT shape, unlike the model studied here. To put it in other terms, the present model might fit its dataset better, but likely at the expense of generalizing to out-of-distribution data like NT plasmas.
The mean prediction of the toroidal current density by the version of EFIT-Prime that is ignorant of NT, given the principal components of the external magnetic measurements and coil currents embedded in a 2D map. Shown are (a) the $R^2$ (blue) and SSIM (orange) distributions of the predicted $J_\phi$ for the approximately 18 000 test samples, (b)–(d) the overlay of the true $J_\phi$ (black) against the NN-predicted $J_\phi$ (red dashed) for three samples with the worst, median, and best $R^2$, (e)–(g) aleatory, and (h)–(j) epistemic uncertainties in the $J_\phi$ prediction for the same three samples.
The qualitative comparisons for the samples with the worst (left), median (middle), and best (right) $R^2$ are shown for $\psi$ in Figs. 8(b)–8(d) and for $J_\phi$ in Figs. 9(b)–9(d). Here too, we find that the flux overlay for the best and median samples is nearly perfect, making it virtually impossible to distinguish the predicted flux surfaces from the true ones. However, we note that for $J_\phi$, there are three samples tied for the worst $R^2$, and the illustrated sample appears to undergo a vertical displacement event, as indicated by the strong axial shift of the flux surfaces. When comparing the aleatory and epistemic uncertainties, we find that the magnitude and locations of high relative uncertainty for $\psi$ remain similar to those of the EFIT-Prime results discussed in Sec. IV. For $J_\phi$, we find a lower magnitude for the AU overall, but the locations of high uncertainty remain somewhat consistent. For the EU, which represents the (reducible) model-form uncertainty, the present case without NT also evinces slightly lower values for the displayed samples. This conclusion is also supported by the mean uncertainties taken over all the test samples and over all values on the grid for each sample.
B. $\psi$-only with spatially embedded magnetic inputs
Here, we further remove the $J_\phi$ constraint to study its effect on the predictions. In this approach, we only learn $\psi$, given the spatially embedded magnetic data, using the same approach to learning the model and obtaining the ensembles for uncertainty quantification, with the exception that it is a single-model setup, as described in Sec. II A. Without the $J_\phi$ constraint, the computational cost goes down somewhat, with the number of model parameters decreasing to approximately $7\times10^6$ (from $10\times10^6$) per NN configuration.
In line with the model discussed in Sec. IV, our ensemble mean prediction of $\psi$ shows a highly concentrated $R^2$ distribution near 1, with over 99.5% of samples exceeding 0.995 and both mean and median $R^2$ and SSIM values reaching 1.0 to within three significant digits, indicating good agreement with the true flux surfaces (see Fig. 10). The analysis of the aleatory (AU) and epistemic (EU) uncertainties reveals them to be significantly smaller for both the best and median predictions, nearly five-to-six orders of magnitude less than the normalized flux, with the AU consistently larger than the EU by a factor of 2–3. Again, the maximum AU is localized near the shaping coils on the plasma's outboard side and in the mid-core region, mirroring observations from Sec. IV, with negligible uncertainty in the vacuum region. The structure of the EU looks somewhat different compared to that for EFIT-Prime and EFIT-Prime without NT.
The mean prediction of the poloidal flux by the ensemble of the five best-performing MLP configurations determined by NAS, given the principal components of the external magnetic measurements and coil currents embedded in a 2D map. Shown are (a) the $R^2$ and SSIM distributions of the predicted flux aggregated over the entire test set of approximately 18 000 samples, (b)–(d) the overlay of the true flux (black) against the NN-predicted flux (red dashed) for three samples with the worst, median, and best $R^2$, (e)–(g) aleatory, and (h)–(j) epistemic uncertainties in the flux prediction for the same three samples.
We note that removing the $J_\phi$ constraint has negligible impact on the prediction accuracy for in-distribution data, showcasing the model's robustness within known scenarios. It remains to be seen whether the models without the constraint will perform as well on out-of-distribution data, of which NT is an example. This is the topic of Sec. V D.
C. $\psi$-only learning with tabulated magnetic inputs without the spatial embedding
This can be considered the least sophisticated and perhaps the baseline approach for building our surrogates. It uses a simplistic approach where the magnetic measurements and coil currents are input into the model as flat, structured data, forming a tabular representation of the inputs. Given these tabular inputs, an MLP configuration is again used in NAS to learn the map from the inputs to $\psi$.
This strategy does not explicitly encode any spatial correlation and instead relies on the underlying spatial correlations present in the measurements to deduce relevant physical reconstructions. For example, the measurements from the magnetic probes contain correlations based on the underlying plasma generating the magnetic field, which is contained in the measurement value but otherwise not explicitly encoded. While this approach is computationally less demanding, similar to the model of Sec. V B, with only about $7\times10^6$ model parameters for each NN configuration in the deep ensemble, it might fall short in capturing the spatial dynamics crucial for accurate prediction of plasma scenarios, especially those with intricate dependencies on the spatial configuration of the magnetic measurements.
We find that the removal of the spatial correlation from the magnetic inputs and of the $J_\phi$ constraint does not significantly impact predictions for in-distribution data encompassing only positive-triangularity equilibria. This observation is supported by the distributions of the $R^2$ and SSIM metrics for the $\psi$ predictions over the entire test set, by the mean and median $R^2$, which are again 1.0 to within three significant digits, and by the relative magnitude and location of the aleatory (AU) and epistemic (EU) uncertainties. The summary figure for this model is omitted, as it shows results that are very similar to those of Sec. V B.
While the ensemble of models built under this approach seems to perform well on its in-distribution test set, we will see shortly that the exclusion of NT training data and spatial embedding, alongside the absence of the $J_\phi$ constraint, can profoundly affect predictions for out-of-distribution scenarios. This impact is thoroughly explored in Sec. V D.
D. Inference on negative-triangularity plasmas to test the generalizability of the models
We have discussed throughout this manuscript the importance of building models with the ability to generalize to cases that may lie out of distribution with respect to their training set. Here, we carry out an inference study on such a case, four DIII-D discharges with negative-triangularity (NT) plasma shape, to rigorously test the modeling strategies (including input preparation), studied under ablation in Sec. V, that have led to the final surrogate model, EFIT-Prime. It was noted in Sec. V that assessing these ablated models' performance on the test set alone did not yield any conclusive evidence for determining the winning and losing modeling strategies. This is why we turn to the NT scenario here: it embodies the unique characteristic of a "flipped" plasma shape, providing a stringent test of the models' adaptability, especially for those models that have never seen NT in their training. This method of stress-testing the models aims to unveil any potential biases and assess robustness against overfitting to specific plasma scenarios. The final model EFIT-Prime is also tested on the same NT discharges to provide a baseline of expected performance.
The inference set consists of approximately one thousand magnetic equilibria (samples or time slices) from four NT DIII-D discharges (180526–180528 and 180533) performed during the 2019 campaign (see the bottom row of Table II). EFIT-Prime and its ablated "cousins" are tasked with predicting the flux surfaces for all the time slices contained within this inference set. The average triangularity $\delta$ for each time slice is then calculated from the predicted poloidal flux and compared against the "true" triangularity calculated from the true poloidal flux extracted from magnetic EFIT reconstructions of the four NT discharges.
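The average triangularity can be computed from the boundary of the predicted flux with the standard geometric definition, as in the sketch below; it assumes the boundary points have already been extracted, and the exact formula used in this work may differ in detail.

```python
import numpy as np

def average_triangularity(r_b, z_b):
    """Average of upper and lower triangularity from LCFS boundary points:
    delta = (R_geo - R at the top/bottom of the boundary) / a."""
    r_geo = 0.5 * (r_b.max() + r_b.min())    # geometric center
    a = 0.5 * (r_b.max() - r_b.min())        # minor radius
    r_upper = r_b[np.argmax(z_b)]            # R of the highest boundary point
    r_lower = r_b[np.argmin(z_b)]            # R of the lowest boundary point
    return 0.5 * ((r_geo - r_upper) + (r_geo - r_lower)) / a

theta = np.linspace(0.0, 2.0 * np.pi, 200)
r_b = 1.67 + 0.6 * np.cos(theta - 0.4 * np.sin(theta))  # toy NT-like boundary
z_b = 1.0 * np.sin(theta)
print(average_triangularity(r_b, z_b))       # negative for this shape
```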
Adhering to the analysis presented in Secs. IV and V, we begin with the $R^2$ and SSIM distributions of the prediction of $\psi$ by each of the ablated models (a)–(c) as well as EFIT-Prime. The median and mean $R^2$ for each approach are also displayed in the upper left corner of each subfigure in Fig. 11. The "naive" approach of Sec. V C, which uses tabulated magnetic inputs to predict only $\psi$, performs poorly, producing no samples with acceptable accuracy. In fact, this case produces many samples with negative $R^2$, including the peak of its distribution, but the minimum $R^2$ of the axis range in Fig. 11 excludes many of them. Upon changing the way the magnetic inputs are fed into the NNs, i.e., switching to the embedded inputs (and their principal components), we see a notable improvement in the prediction accuracy, with the peak of the distribution shifting to approximately 0.5 [from the negative peak observed in Fig. 11(a)]. However, this is still far from the kind of accuracy demanded of a reliable surrogate, producing only two predictions with an acceptable $R^2$. The inclusion of the $J_\phi$ constraint improves the predictive capability slightly, shifting the distribution's peak higher still. This conclusion is also supported by the more-than-10% improvement in the median and mean $R^2$ displayed in each subfigure of Fig. 11. Finally, the EFIT-Prime model shows accurate predictions of the flux, with more than 80% of the samples achieving a high $R^2$ and correspondingly high median/mean values. Interestingly, for all four cases, the SSIM paints a more optimistic picture than $R^2$, but not so much as to contradict the conclusions based on $R^2$.
The $R^2$ and SSIM distributions for the magnetic flux surfaces predicted by each ablated model (a)–(c) and EFIT-Prime (d), aggregated over four NT DIII-D discharges.
Shown in Fig. 12 is the average triangularity $\delta$ calculated from the poloidal flux predicted by EFIT-Prime and the ablated models, representing the three different modeling approaches. The results are shown over the same aforementioned four NT discharges, which are stitched together end to beginning to form the inference set. The true $\delta$ is shown as the black trace and is negative except at the beginning and end of the discharges (the abrupt jumps occurring around time slices 220, 600, and 900 correspond to the end of one shot and the beginning of the next). The most naive approach (purple trace) is unable to sense NT, consistently yielding positive triangularities on average. The approach that learns only $\psi$ from the spatially embedded magnetic inputs (green trace) shows a remarkable improvement in the prediction, yielding consistently negative, albeit underestimated, triangularity on average. Therefore, it stands to reason that the spatial embedding of the magnetic inputs provides crucial physics information to the NN surrogates about the plasma shape. This embedding partially restores the fact that external magnetics are sufficient to determine the plasma shape. The next approach is the version of EFIT-Prime without NT (blue). It yields similar, perhaps slightly degraded, predictions for $\delta$ compared with the model with the embedding but without the $J_\phi$ constraint. These results suggest that the $J_\phi$ constraint does not add further physics information that can dramatically increase the models' predictive capability on NT. However, this is not surprising, since the PCA basis functions were created out of only positive-triangularity (PT) equilibria. Thus, there is no way to completely represent NT by a linear combination of any of the 300 PCA basis vectors used here; i.e., some NT basis functions must lie in a space that is orthogonal to the PT basis functions. A similar observation was reported in Ref. 41. Finally, the triangularity calculated from EFIT-Prime's poloidal flux predictions is shown as the red trace, which tracks the true $\delta$ closely, indicating that the model's predictive power improves dramatically when a relatively small set of NT equilibria is included in its training set (in this case, about 5%–7% of the training data had NT). In other words, EFIT-Prime's performance on NT could be further improved if NT had increased representation in the training data.
FIG. 12. The average triangularity $\bar{\delta}$ calculated from the poloidal flux predicted by EFIT-Prime (red) and by the models built under the ablation approaches, plotted against the true $\bar{\delta}$.
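For reference, $\bar{\delta}$ follows from the standard geometric definitions once the boundary is known. The minimal Python sketch below is our own helper, not the paper's code; `rb` and `zb` are hypothetical arrays of boundary coordinates extracted from the predicted $\psi$.

```python
import numpy as np

def average_triangularity(rb, zb):
    """Average triangularity from plasma-boundary points.

    rb, zb: 1D arrays of R and Z coordinates along the boundary.
    Standard definitions: R_geo = (Rmax + Rmin)/2, a = (Rmax - Rmin)/2,
    delta_upper = (R_geo - R at Zmax)/a, delta_lower = (R_geo - R at Zmin)/a.
    """
    rmax, rmin = rb.max(), rb.min()
    rgeo, a = 0.5 * (rmax + rmin), 0.5 * (rmax - rmin)
    delta_upper = (rgeo - rb[np.argmax(zb)]) / a  # triangularity of the top point
    delta_lower = (rgeo - rb[np.argmin(zb)]) / a  # triangularity of the bottom point
    return 0.5 * (delta_upper + delta_lower)
```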
We next provide a qualitative comparison of the plasma boundary calculated from the predicted poloidal flux against the true boundary calculated from the true poloidal flux. This comparison is carried out for many samples drawn randomly from the inference set; however, only a single time slice, marked by the vertical dashed line in Fig. 12, is illustrated in Fig. 13. The same coloring scheme as in Fig. 12 is used here to delineate the different approaches. The true boundary, calculated from the true $\psi$ (in the same way as the boundaries from the NN models), is shown as the solid black curve. We use the same method as EFIT to find the plasma boundary: a binary search that checks whether a flux surface is closed or open (or hits the wall), iterating until the last closed-flux surface is found (sketched in code below). The particular sample displayed in Fig. 13 shows a lower-diverted plasma with strong NT in the upper half. Of the four cases shown, only EFIT-Prime [Fig. 13(d)] correctly captures the upper (negative) triangularity of this time slice, showing the best agreement with the truth, while the other approaches [Figs. 13(a)–13(c)] all struggle with the plasma shape to varying degrees. Of course, any conclusion about model performance should be based on Fig. 12, which shows the overall trend of the models' performance on out-of-distribution NT data, whereas Fig. 13 offers only a glimpse of a single time slice.
FIG. 13. The plasma boundary, i.e., the last closed-flux surface, shown for one time slice out of the 957 aggregated from four DIII-D NT discharges for the NT inference test. The boundary is calculated from the $\psi$ predicted by (a) $\psi$-only learning with tabular inputs, (b) $\psi$-only learning with embedded inputs, (c) EFIT-Prime with no NT in its training (concurrent learning of $\psi$ and $J_\phi$ with embedded inputs), and (d) EFIT-Prime.
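The boundary search described above can be sketched as follows. This is a minimal illustration of a flux-level binary search, not the EFIT implementation; the use of scikit-image for contour extraction, the monotonic axis-to-edge flux convention, and all function names are our own assumptions.

```python
import numpy as np
from matplotlib.path import Path
from skimage import measure

def _is_closed_and_inside(psi, R, Z, level, wall_path):
    """True if the contour psi = level is closed and lies inside the wall."""
    for c in measure.find_contours(psi, level):     # contours in (row, col) indices
        if not np.allclose(c[0], c[-1]):
            continue                                # open contour: skip
        # Map (row, col) indices back to (R, Z) coordinates.
        r = np.interp(c[:, 1], np.arange(len(R)), R)
        z = np.interp(c[:, 0], np.arange(len(Z)), Z)
        if wall_path.contains_points(np.column_stack([r, z])).all():
            return True
    return False

def last_closed_flux_surface(psi, R, Z, psi_axis, psi_edge, wall, n_iter=40):
    """Binary-search the flux level of the last closed-flux surface (LCFS).

    psi is the 2D flux map on the (Z, R) mesh, assumed monotonic from the
    magnetic axis value psi_axis to the edge value psi_edge; wall is a
    polygon of limiter points.
    """
    wall_path = Path(wall)
    lo, hi = psi_axis, psi_edge          # closed near the axis, open at the edge
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if _is_closed_and_inside(psi, R, Z, mid, wall_path):
            lo = mid                     # still closed and inside: move outward
        else:
            hi = mid                     # open or wall-limited: move back inward
    return lo
```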
VI. SUMMARY AND DISCUSSION
EFIT-Prime represents a novel approach to creating an accurate and robust neural network surrogate for EFIT, aiming to fully replace the traditional EFIT workflow. The surrogate model focuses on minimizing errors in the calculated poloidal flux and toroidal current, specifically targeting the reduction of the Grad–Shafranov residual error. Key to EFIT-Prime are its probabilistic nature and its incorporation of physics constraints, which together enhance its accuracy and robustness.
At the heart of EFIT-Prime's probabilistic approach is the integration of Bayesian optimization (BO) with a neural architecture search (NAS) based on an aging evolution (AgE) algorithm. This method, termed AgEBO, systematically explores and optimizes multiple models, iteratively converging to high-performing architectures. The highest-ranked models from this search are then combined into a deep ensemble, enhancing the robustness of the predictions and enabling a nuanced quantification of uncertainties. Specifically, the ensemble approach allows the separation of uncertainties into aleatory (irreducible) and epistemic (reducible) types. Moreover, EFIT-Prime applies a multi-model neural architecture search to enforce physics constraints directly within the neural network architecture, focusing on the relationship between the magnetic flux ($\psi$) and the toroidal current density ($J_\phi$). This strategy not only leverages the strengths of deep learning but also ensures that the model adheres closely to physical laws, significantly boosting its accuracy, generalizability, and robustness.
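For a deep ensemble whose members each predict a mean and standard deviation, the decomposition admits the standard split below (a sketch under our own notational assumptions, not the authors' code): the aleatory variance is the ensemble average of the members' predicted variances, and the epistemic variance is the spread of the member means.

```python
import numpy as np

def ensemble_uncertainty(means, sigmas):
    """Decompose the predictive uncertainty of a deep ensemble.

    means, sigmas: arrays of shape (n_models, ...) holding each member's
    predicted mean and standard deviation for the same inputs.
    """
    aleatory = np.mean(sigmas**2, axis=0)   # mean of predicted variances (irreducible)
    epistemic = np.var(means, axis=0)       # variance of predicted means (reducible)
    total = aleatory + epistemic
    return np.mean(means, axis=0), aleatory, epistemic, total
```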
An additional physics-informed modeling strategy is the embedding of the magnetic sensors and coils in a 2D coordinate plane to explicitly retain the spatial correlations between the different magnetic sensors. This contrasts with the traditional way of inputting the data as a 1D vector, i.e., in tabulated fashion. The information in the resulting 2D maps is compressed into 30 principal components (PCs), and the coefficients of these 30 PCs form the input vector fed to EFIT-Prime for each time slice. This embedding of the magnetic inputs has proven to be a winning strategy as far as model generalizability is concerned, as explained further below.
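A minimal sketch of this compression step follows, assuming the embedded maps are stored as a stack of 2D arrays; the mesh size, the sample count, and the use of scikit-learn are our assumptions, while the 30 retained PCs come from the text.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical example: a stack of embedded 2D magnetics maps
# (65 x 65 is a placeholder mesh size; the paper does not specify it here).
rng = np.random.default_rng(0)
n_slices, nz, nr = 1_000, 65, 65
maps = rng.normal(size=(n_slices, nz, nr))   # stand-in for the embedded maps

pca = PCA(n_components=30)                   # 30 retained PCs, as in the text
coeffs = pca.fit_transform(maps.reshape(n_slices, -1))
print(coeffs.shape)                          # (1000, 30): per-slice input vector
```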
Approximately 180 000 magnetically constrained equilibria from the DIII-D 2019 campaign are used to train, validate, and test (with an 80-10-10 split) the many different NN configurations built under NAS. The top five performing models are then used to form the deep ensemble that constitutes the EFIT-Prime model. The performance metrics, namely, the distributions of the coefficient of determination $R^2$ and the structural similarity index (SSIM) aggregated over the 18 000 test samples, indicate high reconstruction accuracy of the poloidal flux, with low epistemic and aleatory uncertainties for more than 99.5% of the samples in the test set. The key result is shown in Fig. 5, where high accuracy is seen for all of the test cases. The worst case, which has the lowest $R^2$, has much larger uncertainties than all of the other cases. That is, our model not only has high predictive value but also provides a measure of confidence with each of its predictions.
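Both metrics are standard and available off the shelf; a per-sample evaluation might look like the following sketch (the library choices are ours).

```python
import numpy as np
from sklearn.metrics import r2_score
from skimage.metrics import structural_similarity

def score_sample(psi_true, psi_pred):
    """R^2 and SSIM for a single predicted flux map (2D arrays on the mesh)."""
    r2 = r2_score(psi_true.ravel(), psi_pred.ravel())
    ssim = structural_similarity(
        psi_true, psi_pred,
        data_range=float(psi_true.max() - psi_true.min()),
    )
    return r2, ssim
```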
To understand the contribution of each strategy to EFIT-Prime, we performed ablation studies by removing a particular strategy, be it a model component or a nonstandard way to formulate the NN input vector, and then reevaluating the ablated models' performance after training. We deliberately withheld negative-triangularity (NT) discharges from the training of these models and then tested the ablated models on a special inference set consisting of four NT DIII-D discharges. Withholding a particular scenario from a model during training and then testing the model on that previously withheld scenario is an ideal way to gauge the generalizability of our models and to identify the winning modeling strategies. Our results indicate that the spatial embedding of the magnetic inputs significantly improves the generalizability of the models. The contribution of the $J_\phi$ constraint to generalizability remains inconclusive insofar as the results of our NT inference test go. It will be interesting to compare with alternative approaches such as that of Ref. 41, where the contributions to the poloidal flux from the plasma current and the coil currents are separated in the same way that EFIT separates them and the NNs are trained on the plasma contribution.
As part of future work, we plan to extend the training to other DIII-D campaigns from 2018 to 2022 and to perform inference across different years to further assess the generalizability of the models. We also plan to add the magnetic pitch angle measurements from the DIII-D motional Stark effect diagnostic (MSEd) to the input vector used for EFIT-Prime. Training new models on externally (with the magnetics) and internally (with MSEd) constrained EFIT equilibria under NAS could yield improved predictions of $\psi$ with smaller uncertainties in the plasma core. We will also expand the set of quantities learned under the NAS approach to include the plasma boundary, certain profiles, and derived discharge parameters such as the internal inductance, the normalized beta, and the plasma volume, thereby building on the work of Ref. 1 and laying the foundation for a drop-in surrogate for EFIT. In addition, we aim to reduce the inference time from the current 15 ms per time slice to approximately the millisecond timescale for real-time deployment of EFIT-Prime alongside real-time EFIT. We will also consider other ways to reconstruct $\psi$, as well as models that exploit the temporal correlations in the training data.
Furthermore, the comprehensive probabilistic approach under the NAS framework offers the possibility of moving to dynamic, real-time, full kinetic equilibrium reconstructions, which are currently limited to offline analysis because of their computational expense. To this end, we will pursue the application of the NAS framework to kinetic surrogate modeling as key follow-up work. Indeed, we have performed a scoping study of kinetic surrogate modeling and its sensitivity to different sets of diagnostic data,45 which will form the basis for extending EFIT-Prime to incorporate kinetic data for flux prediction.
ACKNOWLEDGMENTS
This work is supported by the U.S. Department of Energy, Office of Fusion Energy Sciences (Award Nos. DE-AC02-06CH11357, DE-SC0021203, DE-FG02-95ER54309, and DE-SC0021380). The authors acknowledge the computational resources of the Argonne Leadership Computing Facility, a DOE Office of Science User Facility supported under Contract No. DE-AC02-06CH11357, and of the Laboratory Computing Resource Center (LCRC) at Argonne National Laboratory. The data used in this work are based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility (Award No. DE-FC02-04ER54698).
The authors thank Erik Olofsson for his insightful comments to improve the manuscript.
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
S. Madireddy: Conceptualization (lead); Formal analysis (lead); Funding acquisition (lead); Investigation (equal); Methodology (lead); Project administration (lead); Resources (lead); Software (lead); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). C. Akcay: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (supporting); Resources (supporting); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (lead); Writing – review & editing (lead). S. E. Kruger: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (lead); Investigation (supporting); Project administration (lead); Resources (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal). T. Bechtel Amara: Conceptualization (equal); Data curation (lead); Formal analysis (supporting); Investigation (supporting); Resources (lead); Validation (equal); Visualization (equal); Writing – original draft (supporting); Writing – review & editing (supporting). X. Sun: Data curation (supporting); Investigation (supporting); Methodology (supporting); Writing – original draft (supporting); Writing – review & editing (supporting). J. McClenaghan: Data curation (lead); Investigation (supporting); Resources (equal); Writing – original draft (supporting); Writing – review & editing (supporting). J. Koo: Conceptualization (equal); Formal analysis (supporting); Investigation (supporting); Software (supporting). A. Samaddar: Methodology (supporting); Software (supporting). Y. Liu: Conceptualization (supporting); Funding acquisition (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal). P. Balaprakash: Conceptualization (supporting); Project administration (lead); Supervision (lead). L. L. Lao: Conceptualization (lead); Formal analysis (supporting); Funding acquisition (lead); Investigation (equal); Project administration (lead); Resources (supporting); Software (equal); Supervision (lead); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
APPENDIX A: ENCODING FURTHER PHYSICS INFORMATION INTO THE NNs WITH THE $J_\phi$ CONSTRAINT
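As a purely illustrative sketch of how a current-density constraint and a smoothness penalty on the first derivative of $\psi$ might be combined into a composite training loss (the weights, the finite-difference form, and all function names are our own assumptions, based only on the constraints described earlier in the paper):

```python
import torch
import torch.nn.functional as F

def physics_informed_loss(psi_pred, psi_true, jphi_pred, jphi_true,
                          lam_j=1.0, lam_s=0.1):
    """Illustrative composite loss: data misfit on psi, a J_phi constraint,
    and a smoothness penalty on the first derivative of psi.

    psi tensors have shape (batch, nZ, nR); jphi tensors hold the J_phi
    representation. lam_j and lam_s are arbitrary placeholder weights.
    """
    data_term = F.mse_loss(psi_pred, psi_true)
    current_term = F.mse_loss(jphi_pred, jphi_true)      # J_phi constraint
    # Penalize the variation of the first finite differences of psi in R
    # and Z (i.e., second differences), keeping the gradient of psi smooth.
    d2r = psi_pred.diff(dim=-1).diff(dim=-1)
    d2z = psi_pred.diff(dim=-2).diff(dim=-2)
    smooth_term = (d2r**2).mean() + (d2z**2).mean()
    return data_term + lam_j * current_term + lam_s * smooth_term
```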
APPENDIX B: EFIT-PRIME TOP-MODEL ARCHITECTURE
The neural architecture corresponding to the best configuration of EFIT-Prime is shown in Figs. 14 and 15. The model that predicts $\psi$ from the external magnetic data is shown in Fig. 14, and the model that predicts $J_\phi$ from $\psi$ is shown in Fig. 15. The architecture displays several counter-intuitive feed-forward and skip connections, with some nodes having concurrent connections to multiple nodes. The final layer produces the mean and standard deviation predictions for each ensemble member, while the left branch in Fig. 14 shows a sampling layer that draws from the distribution defined by the predicted mean and standard deviation of $\psi$; this sample is the input to the $J_\phi$ architecture shown in Fig. 15. The depicted NN configuration contains approximately $10\times10^{6}$ model parameters, most of which (approximately $7\times10^{6}$) belong to the $\psi$ model, since $\psi$ is learned on the entire EFIT mesh, whereas $J_\phi$ is represented by only 300 PC coefficients. Note that any garden-variety MLP would also produce approximately the same number of parameters because of the immense size of the output vector, which is dominated by the high-resolution $\psi$.
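The two-stage structure can be summarized schematically as follows. This PyTorch sketch is only illustrative: the actual models are NAS-discovered architectures with skip connections, and the layer widths and the 65 × 65 mesh size are our assumptions; only the mean/standard-deviation head, the sampling layer, and the 300 $J_\phi$ PC coefficients come from the text.

```python
import torch
import torch.nn as nn

class PsiModel(nn.Module):
    """First stage: magnetics PC coefficients -> psi on the full mesh,
    with a Gaussian head (mean and std per mesh point) and a sampling layer."""
    def __init__(self, n_pc_in=30, mesh=65 * 65, width=512):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_pc_in, width), nn.ReLU(),
                                  nn.Linear(width, width), nn.ReLU())
        self.mu = nn.Linear(width, mesh)
        self.log_sigma = nn.Linear(width, mesh)

    def forward(self, x):
        h = self.body(x)
        mu, sigma = self.mu(h), torch.exp(self.log_sigma(h))
        psi_sample = mu + sigma * torch.randn_like(sigma)  # sampling layer
        return mu, sigma, psi_sample

class JphiModel(nn.Module):
    """Second stage: sampled psi map -> 300 PC coefficients of J_phi."""
    def __init__(self, mesh=65 * 65, n_pc_out=300, width=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(mesh, width), nn.ReLU(),
                                 nn.Linear(width, n_pc_out))

    def forward(self, psi_sample):
        return self.net(psi_sample)
```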
FIG. 14. The architecture of the best-performing configuration from the deep ensemble used in constructing EFIT-Prime, described in Sec. II. Only the architecture of the model learning $\psi$, containing approximately $14\times10^{6}$ parameters that the NN optimizes, is shown. A box can represent a dense layer, an activation, or an addition (for residual layers).
FIG. 15. The architecture of the best-performing configuration from the deep ensemble used in constructing EFIT-Prime, described in Sec. II. Only the architecture of the second model learning $J_\phi(\psi)$, containing approximately $2\times10^{6}$ parameters, is shown. A box can represent a dense layer, an activation, or an addition (for residual layers).