This work presents a method for predicting plasma equilibria in tokamak fusion experiments and reactors. The approach represents the plasma toroidal current density (Jt) as a linear combination of basis functions obtained from a principal component analysis of Jt profiles in the EFIT-AI equilibrium database. Using EFIT's Green's function tables, corresponding basis functions are then created for the poloidal flux (ψ) and for the diagnostics generated by Jt. Similar in spirit to a physics-informed neural network (NN), this construction physically enforces consistency between ψ, Jt, and the synthetic diagnostics. First, a least squares technique that minimizes the error on the synthetic diagnostics is used for prediction. The results show that this method achieves high accuracy in predicting ψ and moderate accuracy in predicting Jt, with median R2 = 0.9993 and R2 = 0.978, respectively. A comprehensive NN obtained through a neural architecture search is also employed to predict the coefficients of the basis functions. The NN demonstrates significantly better performance than the least squares method, with median R2 = 0.9997 and 0.9916 for ψ and Jt, respectively. The robustness of the method to missing or incorrect data is evaluated by filling missing inputs with the least squares predictions, which shows that the NN prediction remains strong even with a reduced number of diagnostics. Additionally, the method is tested on plasmas outside of the training range, showing reasonable results.

Reconstructing magnetohydrodynamic equilibria from a series of diagnostic measurements is crucial in tokamak research and operations. This process yields invaluable insights into magnetic geometry, current, and pressure profiles, all of which are essential for tokamak data analysis, plasma stability, and control, as well as for validating codes and physics models. The EFIT code, widely utilized across various tokamaks, serves as a powerful tool for reconstructing equilibria. Consequently, a wealth of experimental equilibrium reconstruction data exists due to its extensive usage, with notable examples including DIII-D,1 EAST,2 JET,3 KSTAR,4 and NSTX.5 However, this method of equilibrium reconstruction is computationally intensive, making it infeasible for real-time operation. Recently, the application of machine learning techniques in equilibrium reconstruction has yielded promising results. By training machine learning models on a subset of data, it is now possible to accurately predict equilibria in a given system, thereby significantly reducing the computational cost associated with the reconstruction process. These techniques enable fast and precise equilibrium reconstruction, even for complex and high-dimensional systems.

Initial efforts to solve the equilibrium Grad–Shafranov (GS) equation (1) with machine learning utilized neural networks (NNs), as seen in the work by van Milligen et al.6 Early attempts for equilibrium reconstruction shape parameters with neural networks were performed on DIII-D7 and ASDEX.8 More recent endeavors have involved employing deep neural networks to solve the equilibrium GS equation using measured external magnetic signals in the KSTAR tokamak, building upon the offline EFIT9 equilibrium reconstruction results.10,11 Neural networks have also been trained for the DIII-D tokamak, showcasing promising predictive capabilities.12 Notably, there have been substantial refinements and improvements made to the equilibrium surrogates13 using a comprehensive neural architecture search (NAS).14 However, thus far, these studies have not leveraged the separation of the known external contributions and unknown internal contributions to the Grad–Shafranov equation, with the exception of recent work examining equilibrium in the NSTX tokamak.15 Considering the variability of external coils across different tokamaks, this step holds significant potential for training a robust, general neural network for the Grad–Shafranov equation.

In this paper, we explore leveraging machine learning to speed up the solution of the Grad–Shafranov equation, taking advantage of the separation of the toroidal plasma current from the externally generated coil currents. Section II formulates the representation of the toroidal current as a linear combination of basis functions. Utilizing EFIT's Green's function tables, corresponding basis functions are generated for the poloidal flux and plasma diagnostics, ensuring that these quantities are consistent with each other. A linear least-squares minimization of the synthetic diagnostics is used to predict ψ in Sec. III. Section IV shows that neural networks trained with NAS to learn the equilibrium basis functions give a more accurate equilibrium prediction than the least-squares minimization when the equilibrium is in a well-trained parameter space. However, the trained neural network's predictive capability relies on there being no missing input data, which can be achieved by filling missing data with synthetic diagnostic data predicted from least squares. Neural network predictions outside the training set are examined in Sec. V, showing reasonable predictions that are slightly worse than those of the least-squares strategy. Conclusions are given in Sec. VI.

In this section, we represent the plasma toroidal current (Jt) as a linear combination of basis functions. This facilitates a rapid computation of the poloidal flux (ψ), as corresponding basis functions can be generated for ψ and plasma diagnostics generated by Jt. The Grad–Shafranov equation is expressed as follows:
Δ*ψ ≡ R ∂/∂R (1/R ∂ψ/∂R) + ∂²ψ/∂Z² = −μ0 R Jt,   (1)
Jt = R dP/dψ + F/(μ0 R) dF/dψ,   (2)
where R is the major radius, P is the plasma pressure, and F is the poloidal current. EFIT splits the poloidal flux into the internal contribution generated from the plasma and the external contribution generated from the poloidal field coils.
ψ(r) = Σn G(r, re,n) Ie,n + ∫Ω G(r, r′) Jt(r′) dr′ ≡ ψext + ψpla,   (3)
where Ω is the plasma volume, and Ie,n is the current of the nth external shaping coil, ohmic heating coil, or passive structure located at re,n. There are a total of ne = 24 + 24 poloidal field coils and passive structures for DIII-D. The G(r, r′) are the toroidal Green's functions, which are pre-computed and stored as Fortran binary tables that both EFIT and real-time EFIT16 use. For DIII-D with a grid size of (nw = 129) × (nh = 129), these tables have size [48, 129²] for G(r, re,n) and [129, 129²] for G(r, r′). Note that Z-symmetry reduces the array size of G(r, r′) from [129², 129²] to [129, 129²].
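The role of these tables can be sketched as dense linear algebra: once G(r, re,n) and G(r, r′) are tabulated, both contributions to the poloidal flux reduce to matrix products. The sketch below uses a reduced 33 × 33 grid and random stand-in arrays rather than the actual EFIT tables, so only the shapes and the matrix structure reflect the text.

```python
import numpy as np

# Random stand-ins for the pre-computed Green's function tables.
# DIII-D uses a 129 x 129 grid; a 33 x 33 grid keeps the sketch small.
rng = np.random.default_rng(0)
nw = nh = 33
ne = 48                                      # external coils + passive structures

G_ext = rng.normal(size=(ne, nw * nh))       # stand-in for G(r, re,n)
G_pla = rng.normal(size=(nw * nh, nw * nh))  # stand-in for G(r, r')

I_e = rng.normal(size=ne)                    # external coil/vessel currents
Jt = rng.normal(size=nw * nh)                # toroidal current on the grid

# Both contributions to psi become matrix products once the tables exist.
psi_ext = G_ext.T @ I_e                      # external contribution
psi_pla = G_pla @ Jt                         # plasma contribution
psi = psi_ext + psi_pla
```

The plasma term is the expensive one in practice, since the real G(r, r′) table couples every grid point to every other.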
As with ψ, the Green's function tables are also used to compute the synthetic diagnostics for the magnetic probes and flux loops in EFIT, which are given by
Cm = Σn Gm(re,n) Ie,n + ∫Ω Gm(r′) Jt(r′) dr′.   (4)
For DIII-D, there are 76 magnetic probes, 44 poloidal flux loops, and a Rogowski coil measurement of the plasma current. Thus, 121 total features make up the input vector, which can be used to predict an equilibrium.

The external contribution to ψ is easily calculated with a simple matrix multiplication. The computation of the plasma portion of ψ requires an integral over the plasma volume and thus takes, by far, the longest amount of time to compute. Thus, if Jt could be pre-computed as a set of basis functions, then the corresponding basis functions of ψ, along with the magnetic probe and flux loop signals that Jt generates, could also be pre-computed. Using EFIT's Green's function tables in this way, the corresponding ψ and Cm share the same basis representation. This ensures that ψpla, Cm, and Jt are all consistent with each other.

By removing the external contributions to ψ, the prediction of ψ becomes simpler and could potentially enable better generalization in solving the Grad–Shafranov equation. To highlight this, Fig. 1 shows example flux contours of a positive triangularity and a negative triangularity plasma ψ and ψpla. Despite vastly different plasma shapes in ψ, the produced flux patterns of ψpla are nearly indistinguishable by eye. However, ψpla is still an integral quantity of Jt defined by the second term of Eq. (3). Thus, there is still some dependence on shape, especially in more complicated equilibria with transport barriers not in the DIII-D magnetics database.

FIG. 1.

Example positive-δ (top) and negative-δ (bottom) contours of ψ (left) and ψpla (right).

To take advantage of this similarity in ψpla, we represent Jt as a linear combination of basis functions:
Jt(r) = Σi=1..n ai Jt,i(r),   (5)
then ψpla is a linear combination with the same coefficients
ψpla(r) = Σi=1..n ai ψpla,i(r),   with ψpla,i(r) = ∫Ω G(r, r′) Jt,i(r′) dr′,   (6)
and the magnetic probes and flux loops can also be described as a linear combination with the same coefficients:
Cm,pla = Σi=1..n ai Cm,i,   with Cm,i = ∫Ω Gm(r′) Jt,i(r′) dr′.   (7)

To obtain the Jt,i functions, we utilize the toroidal currents from 54 755 equilibria from the EFIT-AI magnetics database.12 A matrix of dimensions 129² × 54 755 is constructed and subjected to a principal component analysis (PCA) using the sklearn.decomposition.PCA library, retaining the first n components. Additionally, there is a test database consisting of 14 093 equilibria, which are withheld from the PCA and NN training. As an example of what the PCA components look like, the first four components are plotted in Fig. 2.
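The paper uses sklearn.decomposition.PCA; the numpy-only sketch below performs the equivalent decomposition (PCA via SVD of the mean-centered data matrix) on a small random stand-in for the 54 755-equilibrium database, so the shapes here are illustrative only.

```python
import numpy as np

# Small random stand-in for the [n_equilibria, n_grid] matrix of
# flattened Jt profiles (the real matrix is 54 755 x 129^2).
rng = np.random.default_rng(2)
neq, ngrid, ncomp = 500, 200, 8
X = rng.normal(size=(neq, ngrid))

Xc = X - X.mean(axis=0)                  # centre the data, as sklearn's PCA does
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:ncomp]                  # the Jt,i basis functions (orthonormal rows)

coeffs = Xc @ components.T               # coefficients a_i for each equilibrium
X_rec = coeffs @ components + X.mean(axis=0)   # rank-ncomp reconstruction
```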

FIG. 2.

The first four Jt PCA components. Solid lines denote positive values and dashed lines negative values.

Representing the equilibrium with a finite number of possible Jt profiles naturally limits the predictive capability of the solution, so the maximum predictive capability of this method is assessed by computing the median equilibrium R2 of Jt and ψ over the database, where
R² = 1 − Σj (yj − ŷj)² / Σj (yj − ȳ)².   (8)
The variable y is ψ or Jt at a grid point. Figure 3 shows the PCA reconstruction R2 for Jt and ψ keeping between one and 32 components. Even keeping only a single component predicts surprisingly well, with ψ R2 = 0.985 and Jt R2 = 0.9. Each additional component improves the possible predictive capability, with ψ R2 = 0.9998 and Jt R2 = 0.9957 at eight components, and ψ R2 = 0.999994 and Jt R2 = 0.9976 at 32 components. Thus, if the first 32 PCA components could be perfectly predicted by a neural network, then ψ and Jt would be well predicted up to the previously mentioned R2 values.
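Equation (8), applied per equilibrium with the median then taken over the database, can be sketched as follows; the function names are chosen here for illustration.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination over all grid points of one equilibrium."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def median_r2(Y_true, Y_pred):
    """Median equilibrium R2 over a database (rows = equilibria)."""
    return np.median([r2(t, p) for t, p in zip(Y_true, Y_pred)])
```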
FIG. 3.

Test database median R2 between the true and PCA reconstruction of ψ (blue) and Jt (green) plotted against the number of PCA components retained (i.e., the maximum possible R2 for the number of components).


To gain an intuitive understanding of the accuracy of a solution for a given R2, we present an example of Jt from DIII-D discharge 179 425 at 5640 ms plotted along with reconstructions retaining 1, 2, 8, and 16 components, as shown in Fig. 4. The first PCA component yields an R2=0.95, and while the overall shape is roughly matched, the contours are visibly incorrect in both the core and near the edge of the plasma. At R2=0.971 with two PCA components, the core is much more accurately represented, but the edge contour, which has a significantly lower current density, remains quite different from the equilibrium. With eight PCA components and an R2=0.999, both the core and edge are better predicted, while with 16 PCA components and an R2=0.9999, the core is nearly indistinguishable from the original equilibrium and the edge contour is only slightly off near the separatrix. The jaggedness of the equilibrium Jt could potentially limit the utility of a fast neural prediction with this method. To assess this, the β-limit of a PCA decomposed equilibrium is explored in Appendix A.

FIG. 4.

Plotted in red is the reconstructed equilibrium Jt retaining 1 (top left), 2 (top right), 8 (bottom left), and 16 (bottom right) PCA components for the DIII-D discharge 179 425 at 5640 ms. Plotted in black is the original Jt.

Since Jt is calculated from a linear combination of basis functions, and each eigenvector has an associated magnetic probe signal, flux loop signal, and Ip, the Grad–Shafranov solution can be predicted by least squares with L2 regularization, minimizing the error between the measured and predicted diagnostics of Eq. (7) using sklearn.linear_model.Ridge with the L2 multiplier set to unity. The coefficient matrix is
Ami = wm Cm,i,   (9)
where wm is the EFIT weight array, which is an array of ones where data are valid and zeros where diagnostic data are missing or bad. The target vector is the magnetic probe, flux loop, and Ip signals with the external coil contributions subtracted off, multiplied by wm:
bm = wm (Mm − Σn Gm(re,n) Ie,n),   (10)
where Mm is the diagnostic measurement. In order to account for uncertainties in the experimental measurements, including those from the external shaping coil currents and synthetic measurements produced by EFIT, two sets of Mm are considered. The first set consists of EFIT's reconstructed measurements, which align with the known equilibria, serving as an ideal reference. The second set encompasses the actual experimental measurements, which include the effects of uncertainty.
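A minimal numpy sketch of this weighted, L2-regularized solve; the closed-form ridge solution stands in for sklearn.linear_model.Ridge, and the arrays are random stand-ins sized to the DIII-D feature count. Zeroing entries of wm drops the corresponding channels from the fit.

```python
import numpy as np

rng = np.random.default_rng(3)
nm, ncomp = 121, 8                       # diagnostic channels, basis functions
C = rng.normal(size=(nm, ncomp))         # Cm,i: diagnostic response of each basis function
a_true = rng.normal(size=ncomp)
M_pla = C @ a_true                       # measurements with external part already removed
wm = np.ones(nm)
wm[:10] = 0.0                            # flag the first ten channels as missing/bad

A = wm[:, None] * C                      # weighted coefficient matrix
b = wm * M_pla                           # weighted target vector
lam = 1.0                                # L2 multiplier, unity as in the text
a_fit = np.linalg.solve(A.T @ A + lam * np.eye(ncomp), A.T @ b)
```

With many valid channels and mild regularization, a_fit recovers the true coefficients up to a small ridge-induced shrinkage.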

The predictive capability of this least squares method is found to be limited in the number of PCA components that can be kept. The median R2 of ψ and Jt predicted by the least squares vs the number of PCA components of Jt and ψ is plotted in Fig. 5. The R2 values improve up to ten PCA components; using additional components leads to a plateau in predictive capability, with median R2=0.9993 and R2=0.978 for ψ and Jt, respectively, at 32 components.

FIG. 5.

Least squares predicted test database median ψ R2 (blue) and Jt R2 (green) vs the number of PCA components retained.


Utilizing this least squares prediction with six PCA components of Jt and ψ on the test database of 14 093 equilibria gives excellent predictions of ψ and moderate predictions of Jt. Figure 6 shows histograms of R2 for individual equilibrium Jt and ψ. Using the experimental diagnostic signals, we see a median ψ R2=0.998 and a median Jt R2=0.947. The predictive capability of this approach increases moderately using the reconstructed signals, with a median ψ R2=0.9996 and a median Jt R2=0.977.

FIG. 6.

Histograms depicting the R2 values for the predicted ψ (upper panel) and Jt (lower panel) obtained from the least squares, using 6 PCA components to represent Jt on a dataset of 14 093 equilibria extracted from the DIII-D 2019 campaign. Blue bars depict predictions derived from experimental signal inputs and green bars depict predictions based on inputs reconstructed using EFIT.


To build fast, accurate, and robust NN surrogates, we use a comprehensive NAS to carry out a bi-level optimization, with the architecture at the outer level and the hyperparameters at the inner level, based on the chosen general structure to predict the 32 basis coefficients of Jt, with the loss being the mean squared error of the coefficients. The search space includes the maximum number of layers, the type of operation each layer executes, and the hyperparameters associated with each operation, such as the initial learning rate and optimizer. Training of each architecture is performed with tensorflow. This approach ensures a thorough exploration of both the architecture and hyperparameter spaces. For large datasets, NAS makes use of AgEBO, a framework combining Aging Evolution (AgE), rooted in the paradigm of evolutionary algorithms, with asynchronous Bayesian Optimization (BO). Using NAS, we train a multitude of neural network configurations for 24 h on 4 GPU nodes on the Argonne National Laboratory cluster Swing to find the best models tailored to the reconstruction problem outlined above. A deep ensemble is then formed from the top five models; the final prediction is the mean of the predictions made by each of the five members.
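The deep-ensemble step can be sketched as below; the five "models" here are random linear maps standing in for the trained NAS networks, so only the mean-over-members logic (with the member spread as a simple uncertainty proxy) reflects the text.

```python
import numpy as np

# Five stand-in "models": each maps the 121 diagnostic inputs to the
# 32 basis coefficients. Random linear maps replace the trained networks.
rng = np.random.default_rng(4)
models = [rng.normal(size=(32, 121)) for _ in range(5)]

def ensemble_predict(x):
    """Deep-ensemble prediction: mean (and spread) over the member models."""
    preds = np.stack([W @ x for W in models])      # shape (5, 32)
    return preds.mean(axis=0), preds.std(axis=0)

a_mean, a_std = ensemble_predict(rng.normal(size=121))
```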

In this section, we use NAS to predict the coefficients ai of the PCA components for Jt as an alternative to the least squares prediction. The results show a significant performance improvement compared to the least squares method. Scanning the number of PCA components kept in the NN training, it is found that the NN prediction does not degrade with increasing components as the least squares method did. Figure 7 shows the ψ R2 and Jt R2 of the test database plotted vs the number of PCA components. To avoid the complication of missing or poor quality diagnostic data, any data with the EFIT weight wm set to zero are replaced with the EFIT reconstructed values for this first test. The best performance, at 32 components, has a median ψ R2=0.99956 and a median predicted Jt R2=0.992. The predicted R2s increase with each additional component that is retained; however, the improvement beyond eight components is weak and remains well below the maximum possible R2 of Fig. 3. Details of the top performing model are shown in Appendix B. Other NAS-predicted models have architectures similar to the top performing model.

FIG. 7.

The median R2 for the NN-predicted ψ R2 (blue) and Jt R2 (green) vs the number of PCA components retained, aggregated over a test set containing approximately 14 000 equilibria.


The histograms of Fig. 8 illustrate the predicted ψ R2 and Jt R2 values for the test database. Training a NAS NN solely on the reconstructed signals shows a strong improvement in prediction compared to the experimental signals, with a median ψ R2=0.99994 and a median predicted Jt R2=0.995 on the test database. To make the neural network robust against missing data, one possibility is to drop out signals during training, as was done in Ref. 15. In this work, the least squares prediction of the Grad–Shafranov solution is robust to missing data, as Eqs. (9) and (10) are multiplied by wm, which can be set to zero where data are missing. This allows for another way to handle missing data: utilizing the synthetic diagnostic signals generated from the least squares method as inputs to the NN. The orange bars in Fig. 8 show the binned R2 predicted from the neural net. Doing this gives a median predicted ψ R2=0.99954 and a median predicted Jt R2=0.9916, only slightly lower than the R2s from the neural net trained with experimental signals filled with the EFIT reconstructed signals where data were missing.
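The filling strategy reduces to a channel-wise selection between the measured and the least-squares synthetic signals; a small sketch (function name is illustrative):

```python
import numpy as np

def fill_missing(M, wm, M_synth):
    """Replace channels flagged missing/bad (wm == 0) with least-squares
    synthetic signals, producing a complete input vector for the NN."""
    M = np.asarray(M, dtype=float)
    M_synth = np.asarray(M_synth, dtype=float)
    return np.where(np.asarray(wm) > 0, M, M_synth)
```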

FIG. 8.

Histograms depicting the R2 values obtained from the 30 PCA component single layer neural net for variables ψ (upper panel) and Jt (lower panel) in the DIII-D test database, encompassing a dataset of 14 093 equilibria. Blue bars depict predictions derived from experimental signal inputs, which encompass zeroed channels that were unavailable during prediction. Green bars depict predictions based on inputs reconstructed using EFIT, and orange bars depict predictions using experimental signals with zeroed channels filled by the least squares predicted inputs.


Next, we evaluate the accuracy of the models by comparing the synthetic signals predicted by the models with those from the magnetic diagnostics. Using Green's function tables to compute synthetic signals, we then calculate the total χ2 of the diagnostic signals, a measure utilized in EFIT for assessing the accuracy of an equilibrium reconstruction.9  Figure 9 shows the median χ2 of the test database plotted against the number of retained PCA components. The NAS NN predicts an average χ2 of a similar magnitude as that obtained from the PCA of the Jt solution. In contrast, the least squares method predicts a lower χ2 compared to the PCA decomposition, confirming overfitting to the experimental diagnostics beyond six PCA components.
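Assuming the usual definition of χ² as the squared, uncertainty-normalized misfit between synthetic and measured diagnostic signals (the exact weighting EFIT uses is not restated here), a minimal sketch:

```python
import numpy as np

def chi2(C_pred, M, sigma):
    """Total chi-squared: squared misfit between predicted synthetic
    diagnostics and measurements, normalized by measurement uncertainty."""
    C_pred, M, sigma = map(np.asarray, (C_pred, M, sigma))
    return float(np.sum(((C_pred - M) / sigma) ** 2))
```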

FIG. 9.

Median predicted χ2 plotted against the number of PCA components. The PCA decomposition calculation is in black, least squares method in blue, and NAS NN in green.


The predictive capability of the least-squares-filled NN remains strong even when the number of magnetic probes and flux loops is reduced. To assess this, we randomly replace the diagnostic signals with the least squares predicted synthetic diagnostic signals to determine the point at which the neural net prediction starts to degrade. Figure 10 illustrates the median predicted R2 value as a function of the number of removed diagnostics replaced with the least squares predicted synthetic diagnostics. When half of the diagnostics are removed, the median predicted ψ exhibits an R2 value of 0.999, while the median predicted Jt shows an R2 value of 0.98. It is only when the number of diagnostics is reduced to 10 out of the original 120 that the predictive capability significantly deteriorates, resulting in ψ R2=0.986 and a median predicted Jt R2=0.81.

FIG. 10.

NAS NN prediction of ψ and Jt characterized by their median R2 aggregated over the test set as a function of the number of diagnostics.


Going outside of the training set distribution, the trained NN in this paper can still predict a reasonable ψ; however, the least squares prediction performs slightly better. To highlight this, negative triangularity data are withheld from the training set: the PCA components are re-computed and the NAS NN is retrained on a curated training set that holds out all shots with negative triangularity. Then, we carry out inference on a test set comprising 7011 negative triangularity equilibria to predict ψ and Jt for these negative triangularity plasmas. The NN gives a reasonable prediction with a median ψ R2=0.969 and Jt R2=0.56. Figure 11 shows the average plasma triangularity calculated from the predicted vs the EFIT-reconstructed last closed flux surface. The NN can predict the triangularity well at weakly negative δ ≈ −0.2, but as the triangularity is decreased further to δ ≈ −0.35, the NN begins predicting stronger negative δ than the experiment. Since most of the structure of the poloidal flux in negative triangularity is due to the poloidal field coils, as noted in Fig. 1, the NN can predict the triangularity reasonably well even with moderate predictive capability for Jt. However, the least squares prediction alone, using the negative-triangularity-withheld PCA components, matches the EFIT δ better, with a median ψ R2=0.990 and a median Jt R2=0.792. Interestingly, both methods tend to under-predict the triangularity at δ ≈ −0.5 and over-predict it at δ ≈ −0.1 on average. One might naively expect the predictions to worsen as triangularity becomes more negative. A potential reason for this discrepancy is that plasmas with more negative δ are not necessarily much farther from typical positive triangularity plasmas: many of the negative triangularity discharges from 2019 have nearly zero lower triangularity, as they were diverted onto the DIII-D shelf.17

FIG. 11.

The average triangularity of the last closed flux surfaces calculated from the least squares ψ (top) and NN ψ (bottom) vs that calculated from the EFIT-reconstructed (true) ψ aggregated over 7011 negative triangularity equilibria.


Another advantage of separating out the external coils is that the PCA basis functions can be used to train a neural network to predict ψ for another tokamak. For example, a neural network can be trained to predict ψ for ITER using only the PCA components of Jt from DIII-D. To do this, care must be taken to construct a set of Green's function tables with a grid size such that the ITER plasma shape is well aligned with the database plasma shapes. A new training database is created using ITER Green's function tables to generate the synthetic ITER magnetic probe, flux loop, and Ip signal data that come from Jt. Additionally, new basis functions for ψpla and Cm are computed based on ITER's Green's function tables. As ITER is not yet online, the EFIT reconstructed external coil currents are used for isolating the plasma contributions to each diagnostic signal. A neural network can then be trained to predict the strength of each component from given probe, loop, and Ip signals from the plasma. Figure 12 shows the least squares and NN predicted contours of ψ and Jt for an ITER 15 MA L-mode equilibrium, along with the original equilibrium quantities. The prediction is reasonable but still requires further improvement. The NN gives a median ψ R2=0.979 and a median Jt R2=0.884. However, as in the negative-triangularity-withheld case, the least squares strategy yields better predictions, with the last closed flux surface better matched and a median ψ R2=0.997 and Jt R2=0.957. While the NN performs significantly better within the training set, the least squares method is more resilient outside of it, as it is directly constrained by the diagnostic signals. Additionally, we note that while the components are principal to Jt, they are not necessarily principal to ψpla and Cm, which could be more sensitive to the higher-order components. This suggests that a possible way to improve the robustness of the NN may be to add the error of the synthetic signals to the loss function.

FIG. 12.

Predicted (red) and original (black) equilibrium contours of ψ (left) and Jt (right) for an ITER 15 MA L-mode equilibrium. The prediction from least-squares is on top, and the NN prediction is on the bottom.


This paper presents a method for predicting the plasma current and poloidal magnetic flux in a tokamak fusion device using a combination of least squares and neural network techniques. The approach presented here exploits the separation of the total poloidal flux into a known external contribution due to the poloidal field coils and an unknown internal contribution due to the plasma current. The external contribution can be computed using the pre-computed Green's function tables. This leaves the remaining task of determining the response due to the unknown plasma current, which here is represented as a linear combination of basis functions, allowing for rapid computation of the poloidal flux.

The results show that the least squares method provides excellent predictions of the poloidal magnetic flux and moderate predictions of the plasma current. The method is robust to missing or incorrect data and can handle uncertainties in experimental measurements. However, its predictive capability is limited. The neural network approach using NAS, on the other hand, shows a performance improvement compared to the least squares method, particularly when trained on reconstructed signals. The neural network demonstrates high predictive capability for ψ and moderate predictive capability for Jt. However, the neural network has not been trained to handle missing or incorrect data.

By combining the least squares method and the neural network, it is possible to fill in missing data and enhance the robustness of the prediction in a real-time application. The least squares prediction can be used to generate synthetic diagnostic signals, which can then be used as inputs to the neural network. This approach provides strong predictive capabilities even with a reduced number of magnetic probes and flux loops.

Furthermore, the paper examines the extrapolation of the neural network prediction outside the training set and compares it to the least squares prediction. The results show that the least squares prediction performs better when extrapolating to new plasma configurations, such as from positive triangularity to negative triangularity and from DIII-D to ITER.

The authors would like to thank C. K. Lau for his insights and stimulating discussions. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility, under Awards Nos. DE-SC0021203, DE-FC02-04ER54698, and DE-FG02-95ER54309. Part of the data analysis was performed using the OMFIT integrated modeling framework.18 

The authors have no conflicts to disclose.

J. McClenaghan: Conceptualization (lead); Data curation (equal); Investigation (lead); Methodology (lead); Writing – original draft (lead). C. Akcay: Investigation (supporting); Methodology (supporting); Writing – review & editing (supporting). T. B. Amara: Data curation (lead); Investigation (supporting); Methodology (supporting); Writing – review & editing (supporting). X. Sun: Investigation (supporting); Methodology (supporting). S. Madireddy: Investigation (supporting); Methodology (supporting). L. L. Lao: Investigation (supporting); Methodology (supporting); Project administration (lead); Writing – review & editing (supporting). S. E. Kruger: Investigation (supporting); Methodology (supporting). O. M. Meneghini: Investigation (supporting); Methodology (supporting); Supervision (supporting).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

To explore the utility of a predicted plasma retaining only a finite number of PCA components of Jt, an initial equilibrium from an example discharge, 180 631 at 4000 ms, is examined. A scan in βN = β a BT/Ip is performed holding the flux surface average of Jt fixed using the OMFIT18 EFIT module. Next, a linear ordinary least squares fit to the PCA components is performed. The poloidal flux is then reconstructed retaining a given set of PCA components and ψin. Using the OMFIT fluxSurfaces class, quantities derived from ψ, such as the safety factor and normalized toroidal flux, are computed, and an equilibrium gEQDSK file is generated. Ideal MHD stability is computed using DCON19 at the no-wall limit. A summary of the simulations is shown in Fig. 13. The original equilibrium has a predicted stability limit of βN=2.8. With fewer than eight components, the β-limit is significantly under-predicted at βN≈2. From 8 to 24 components, the stability limit is slightly over-predicted, with βN=3.3 at eight components and βN=2.95 at 16 components. At 24 PCA components and beyond, the predicted β-limit agrees with the full equilibrium solution.

FIG. 13.

DCON predicted no-wall MHD stability metric δW for toroidal mode number n=1. The solid black line is δW=0 and represents the transition from stability to instability.


This appendix details the top-performing multilayer perceptron model identified through NAS. NAS generated a total of 450 configurations for the experimental diagnostic dataset. The optimizers tested in the search included sgd, rmsprop, adagrad, adam, adadelta, adamax, and nadam. Initial learning rates ranged from 1×10⁻⁴ to 0.1 on a logarithmic scale. The best model utilizes the adam optimizer and contains 26 896 weights and biases. The configuration of this model is illustrated in Fig. 14. The diagram shows multiple sequential dense layers with one branching point. Training this model took approximately 700 s. It employs an adaptive learning rate, starting at 0.0015 and decreasing to 1.5×10⁻⁹. The model predicts the 32 basis coefficients of Jt. Each node in the graph represents a layer in the neural network. The architecture includes a skip connection following the first dense layer. The final layer provides the mean and standard deviation predictions for each ensemble member.
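The shape of such a network can be sketched as below. This is an illustrative forward pass only, not the NAS-selected model itself: the layer widths, weights, and activation choices are invented, but it reproduces the structural features described above (a skip connection after an early dense layer, and a head that outputs a mean and a positive standard deviation for each of the 32 basis coefficients).

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 100, 64, 32   # hypothetical widths; n_out = 32 coefficients

# Randomly initialized weights stand in for trained parameters.
params = {
    "w1": rng.standard_normal((n_in, n_hidden)) * 0.05,
    "b1": np.zeros(n_hidden),
    "w2": rng.standard_normal((n_hidden, n_hidden)) * 0.05,
    "b2": np.zeros(n_hidden),
    "w_mu": rng.standard_normal((n_hidden, n_out)) * 0.05,
    "b_mu": np.zeros(n_out),
    "w_sig": rng.standard_normal((n_hidden, n_out)) * 0.05,
    "b_sig": np.zeros(n_out),
}

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, p):
    h1 = relu(x @ p["w1"] + p["b1"])
    # Skip connection: the first hidden activation is added back in.
    h2 = relu(h1 @ p["w2"] + p["b2"]) + h1
    mu = h2 @ p["w_mu"] + p["b_mu"]                 # predicted coefficient means
    sigma = np.exp(h2 @ p["w_sig"] + p["b_sig"])    # exp keeps std dev positive
    return mu, sigma

diagnostics = rng.standard_normal(n_in)   # stand-in diagnostic input vector
mu, sigma = forward(diagnostics, params)
```

In a deep ensemble, several such networks are trained independently and their per-coefficient means and standard deviations are combined to quantify prediction uncertainty.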

FIG. 14.

The architecture of the best-performing configuration from the deep ensemble, used to predict the 32 basis coefficients.

1. L. L. Lao, H. E. S. John, Q. Peng, J. R. Ferron, E. J. Strait, T. S. Taylor, W. H. Meyer, C. Zhang, and K. I. You, "MHD equilibrium reconstruction in the DIII-D tokamak," Fusion Sci. Technol. 48(2), 968–977 (2005).
2. Q. Jinping, W. Baonian, L. L. Lao, S. Biao, S. A. Sabbagh, S. Youwen, L. Dongmei, X. Bingjia, R. Qilong, G. Xianzu, and L. Jiangang, "Equilibrium reconstruction in EAST tokamak," Plasma Sci. Technol. 11(2), 142–145 (2009).
3. D. P. O'Brien, L. L. Lao, E. R. Solano, M. Garribba, T. S. Taylor, J. G. Cordey, and J. J. Ellis, "Equilibrium analysis of iron core tokamaks using a full domain method," Nucl. Fusion 32(8), 1351–1360 (1992).
4. Y. S. Park, S. A. Sabbagh, J. W. Berkery, J. M. Bialek, Y. M. Jeon, S. H. Hahn, N. Eidietis, T. E. Evans, S. W. Yoon, J.-W. Ahn, J. Kim, H. L. Yang, K. I. You, Y. S. Bae, J. Chung, M. Kwon, Y. K. Oh, W. C. Kim, J. Y. Kim, S. G. Lee, H. K. Park, H. Reimerdes, J. Leuer, and M. Walker, "KSTAR equilibrium operating space and projected stabilization at high normalized beta," Nucl. Fusion 51(5), 053001 (2011).
5. S. A. Sabbagh, S. M. Kaye, J. Menard, F. Paoletti, M. Bell, R. E. Bell, J. M. Bialek, M. Bitter, E. D. Fredrickson, D. A. Gates, A. H. Glasser, H. Kugel, L. L. Lao, B. P. LeBlanc, R. Maingi, R. J. Maqueda, E. Mazzucato, D. Mueller, M. Ono, S. F. Paul, M. Peng, C. H. Skinner, D. Stutman, G. A. Wurden, W. Zhu, and NSTX Research Team, "Equilibrium properties of spherical torus plasmas in NSTX," Nucl. Fusion 41(11), 1601–1611 (2001).
6. B. P. van Milligen, V. Tribaldos, and J. A. Jiménez, "Neural network differential equation and plasma equilibrium solver," Phys. Rev. Lett. 75, 3594–3597 (1995).
7. J. B. Lister and H. Schnurrenberger, "Fast non-linear extraction of plasma equilibrium parameters using a neural network mapping," Nucl. Fusion 31(7), 1291 (1991).
8. E. Coccorese, C. Morabito, and R. Martone, "Identification of noncircular plasma equilibria using a neural network approach," Nucl. Fusion 34(10), 1349 (1994).
9. L. L. Lao, H. St. John, R. D. Stambaugh, A. G. Kellman, and W. Pfeiffer, "Reconstruction of current profile parameters and plasma shapes in tokamaks," Nucl. Fusion 25, 1611 (1985).
10. S. Joung, J. Kim, S. Kwak, J. G. Bak, S. G. Lee, H. S. Han, H. S. Kim, G. Lee, D. Kwon, and Y.-C. Ghim, "Deep neural network Grad–Shafranov solver constrained with measured magnetic signals," Nucl. Fusion 60(1), 016034 (2020).
11. S. Joung, Y.-C. Ghim, J. Kim, S. Kwak, D. Kwon, C. Sung, D. Kim, H.-S. Kim, J. G. Bak, and S. W. Yoon, "GS-DeepNet: Mastering tokamak plasma equilibria with deep neural networks and the Grad–Shafranov equation," Sci. Rep. 13(1), 15799 (2023).
12. L. L. Lao, S. Kruger, C. Akcay, P. Balaprakash, T. A. Bechtel, E. Howell, J. Koo, J. Leddy, M. Leinhauser, Y. Q. Liu, S. Madireddy, J. McClenaghan, D. Orozco, A. Pankin, D. Schissel, S. Smith, X. Sun, and S. Williams, "Application of machine learning and artificial intelligence to extend EFIT equilibrium reconstruction," Plasma Phys. Controlled Fusion 64(7), 074001 (2022).
13. S. Madireddy, C. Akcay, S. E. Kruger, T. Bechtel Amara, X. Sun, J. McClenaghan, J. Koo, A. Samaddar, Y. Liu, P. Balaprakash, and L. L. Lao, "EFIT-PRIME: Probabilistic and physics-constrained reduced-order neural network model for equilibrium reconstruction in DIII-D," Nucl. Fusion 64, 074001 (2024).
14. T. Elsken, J. H. Metzen, and F. Hutter, "Neural architecture search: A survey," arXiv:1808.05377 (2019).
15. J. T. Wai, M. D. Boyer, and E. Kolemen, "Neural net modeling of equilibria in NSTX-U," Nucl. Fusion 62(8), 086042 (2022).
16. J. R. Ferron, M. L. Walker, L. L. Lao, H. E. St. John, D. A. Humphreys, and J. A. Leuer, "Real time equilibrium reconstruction for tokamak discharge control," Nucl. Fusion 38(7), 1055 (1998).
17. A. Marinoni, M. E. Austin, A. W. Hyatt, S. Saarelma, F. Scotti, Z. Yan, C. Chrystal, S. Coda, F. Glass, J. M. Hanson, A. G. McLean, D. C. Pace, C. Paz-Soldan, C. C. Petty, M. Porkolab, L. Schmitz, F. Sciortino, S. P. Smith, K. E. Thome, F. Turco, and the DIII-D Team, "Diverted negative triangularity plasmas on DIII-D: The benefit of high confinement without the liability of an edge pedestal," Nucl. Fusion 61(11), 116010 (2021).
18. O. Meneghini, S. P. Smith, L. L. Lao, O. Izacard, Q. Ren, J. M. Park, J. Candy, Z. Wang, C. J. Luna, V. A. Izzo, B. A. Grierson, P. B. Snyder, C. Holland, J. Penna, G. Lu, P. Raum, A. McCubbin, D. M. Orlov, E. A. Belli, N. M. Ferraro, R. Prater, T. H. Osborne, A. D. Turnbull, G. M. Staebler, and The AToM Team, "Integrated modeling applications for tokamak experiments with OMFIT," Nucl. Fusion 55, 083008 (2015).
19. A. H. Glasser and M. S. Chance, "Determination of free boundary ideal MHD stability with DCON and VACUUM," Paper No. dMopP102 (1997).