Many scientific problems can be formulated as sparse regression, i.e., regression onto a set of parameters when there is a desire or expectation that some of the parameters are exactly zero or do not substantially contribute. This includes many problems in signal and image processing, system identification, optimization, and parameter estimation methods such as Gaussian process regression. Sparsity facilitates exploring high-dimensional spaces while finding parsimonious and interpretable solutions. In the present work, we illustrate some of the important ways in which sparse regression appears in plasma physics and point out recent contributions and remaining challenges to solving these problems in this field. A brief review is provided for the optimization problem and the state-of-the-art solvers, especially for constrained and high-dimensional sparse regression.

Despite spanning an enormous range of scales and dynamical regimes, plasmas exhibit a number of common challenges to their diagnosis and analysis. Almost all plasmas are numerically intensive to simulate, challenging to diagnose, and present significant barriers to open design or closed-loop control strategies in experiments. Many of these challenges are common with fluid dynamics, but understandably the solutions may be quite different for plasmas.

For instance, tokamak fusion reactors will require many active and robust control schemes to ensure safe and reliable operation. This includes feedback control for stabilizing the vertical position, shape control, current control, position control, neutral beam control, controlling core parameters such as the line-averaged density and plasma pressure, active control of the divertor particle and energy fluxes, control for energetic particles, and, perhaps most importantly, prediction and avoidance or mitigation of disruptions.1–3 In the stellarator, the significant challenges of real-time tokamak control are traded for a high-dimensional and sensitive engineering design space: stellarators require solving a very complex two-stage design problem subject to both physics and engineering constraints.4–6 Similarly, inertial confinement fusion poses a very challenging design problem. The recent extraordinary successes at the National Ignition Facility come in part from improved symmetry control and laser pulse design.7 

In principle, researchers would like to perform full-fidelity simulations of realistic and dynamical plasmas with multi-species kinetic codes, but these simulations can require months of computational run-time despite modern advances in supercomputing. Partially reduced-order models such as gyrokinetics or magnetohydrodynamics (MHD) are also computationally intensive and apply only in parameter regimes where their simplifying assumptions are valid.

Sensing is also very challenging, and subsequently sparse measurements are a ubiquitous feature in the literature. Space plasmas are primarily sensed with expensive spacecraft that take point measurements along a flight path. By their very nature, bulk heliophysics measurements are limited to the plasma near the surface of stars. Future fusion devices must have core plasmas with temperatures on the order of 100 × 10⁶ K and high neutron fluxes that necessitate a large neutron-absorbing blanket, other shielding, and subsequently very limited diagnostic access. Diagnostics in these devices are often limited to a set of spatial line-of-sight measurements in a single poloidal cross section, from which practitioners often attempt to accurately infer quantities that may in principle vary in two or even all three dimensions. Variants of this tomographic inversion problem8 have been studied in this field for decades and have some interesting parallels with coil optimization for stellarators, e.g., both are ill-posed inverse problems.

Common to all of these challenges is some form of parameter estimation with a number of free parameters. This problem is not unique to the field of plasma physics. Regression models are ubiquitous in science and engineering, and it is often very useful to distinguish a relatively small number of variables in a regression that dominantly contribute to the identified model. This utility increases as the data and number of regression variables become large, i.e., in a high-dimensional optimization space; using a large number of regression variables can easily lead to overfitting the data, but using a small number of variables limits the expressivity of the model. Sparse regression facilitates a compromise by using many variables in the optimization, but returning a sparse solution representing the subset of the most important parameters.

The balance between sparse and parsimonious models and models with high prediction accuracy maps out a well-known curve called the Pareto-front.9 In order to produce a priori parsimonious models from regression, optimization problems must be solved that include a loss term that promotes some sense of sparsity in the solution. One of the most important sparsity priors is the l0 quasi-norm ‖x‖₀, a function that counts the number of nonzero terms in the vector x. When added to a linear least squares regression, we have in total the following optimization problem:

min_x ‖Ax − b‖₂² + α‖x‖₀.  (1)

The linear combination of x parameters in Ax is chosen to best fit the data in b, subject to a requirement of sparsity. The α hyperparameter controls this balance between accuracy and sparsity, so that scanning through values of α will generate the Pareto-front. The optimal value of α in the Pareto-front, corresponding to the model that best satisfies this trade-off between accuracy and sparsity, should be determined by an information-theoretic metric such as the Akaike information criterion (AIC).10 The optimization problem in (1) can easily be extended for regression onto more general convex functions, rather than simple least squares. However, it follows from the nonsmoothness of the l0 term that typical gradient- or Hessian-based optimizers such as the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm will not usually work. In the present review, we will use the term sparse regression to indicate an optimization problem consisting of a convex loss term and a sparsity prior, but will occasionally point out more general opportunities for sparsity-promotion.
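To make (1) concrete, the following minimal sketch (our own illustration; the synthetic data, the sequentially thresholded least squares solver, and the threshold grid are all assumptions, not from any cited work) scans a sparsity threshold and selects the model on the resulting Pareto-front with the AIC:

```python
# Sketch of Eq. (1): minimize ||Ax - b||_2^2 + alpha*||x||_0, solved
# approximately by sequentially thresholded least squares, with the
# model complexity chosen by the Akaike information criterion (AIC).
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_true = np.zeros(10)
x_true[[1, 4, 7]] = [1.5, -2.0, 0.7]     # sparse ground truth
b = A @ x_true + 0.01 * rng.normal(size=200)

def thresholded_lstsq(A, b, threshold, n_iter=10):
    """Iteratively zero out small coefficients and refit (an l0 surrogate)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(x) < threshold
        x[small] = 0.0
        big = ~small
        if big.any():
            x[big] = np.linalg.lstsq(A[:, big], b, rcond=None)[0]
    return x

best = None
for thresh in [0.01, 0.1, 0.5, 1.0]:
    x = thresholded_lstsq(A, b, thresh)
    k = np.count_nonzero(x)                        # model complexity
    rss = np.sum((A @ x - b) ** 2)
    aic = 2 * k + len(b) * np.log(rss / len(b))    # AIC for Gaussian errors
    if best is None or aic < best[0]:
        best = (aic, x)

print(np.flatnonzero(best[1]))   # indices of the selected terms
```

Scanning the threshold plays the role of scanning α here; each threshold yields one point on the accuracy–sparsity Pareto-front, and the AIC picks the compromise.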

To address the issue of nonsmoothness, a large literature exists on using different sparsity priors in place of l0, including the l1 norm,11 the lp norm12(0<p<1), and variations that allow for sparsity within pre-defined groups, such as the group l1 norm.13 The l1-regularized regression problem is typically called the Lasso or basis pursuit denoising. These “relaxations” of the l0 term are often used because they have advantageous properties with respect to optimization, notably improved smoothness, convexity, and/or theoretical bounds on convergence given some constraints on the data properties. However, unlike the l0 norm, these relaxations are not guaranteed to recover the l0-optimal support and they bias the surviving coefficients; depending on the solver, components may be small but not exactly zero, which is problematic when exact sparsity is required. For instance, theoretical l1 recovery of the l0-optimal sparse solution is guaranteed only under quite strong assumptions on the data.14 Lasso often struggles to select the most important parameters without also including many weak and relatively superfluous parameters.15 Tikhonov regularization,16 or the l2 norm, is also common for inverse and other ill-posed problems but lacks any substantial relationship to the l0 problem beyond the fact that both terms regularize the primary objective. Tikhonov regularization is a useful tool for ill-posed problems such as the tomographic inversion relevant to many plasma diagnostics,17 but the solution variables will generally all be nonzero.
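The behavior of the l1 relaxation can be seen in a few lines (our own toy example; the ISTA solver, data, and α value are assumptions, not from any cited work): proximal gradient descent with soft thresholding zeros out the weak features but also shrinks the surviving coefficients below their true magnitudes.

```python
# ISTA (iterative soft thresholding) for the l1-relaxed problem
# min 0.5*||Ax - b||^2 + alpha*||x||_1, illustrating the l1 bias.
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1: exact zeros for entries below t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, b, alpha, n_iter=500):
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - step * (A.T @ (A @ x - b)), step * alpha)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 8))
x_true = np.zeros(8)
x_true[[0, 3]] = [2.0, -1.0]
b = A @ x_true                    # noise-free data for clarity
x_l1 = ista(A, b, alpha=5.0)
# The support {0, 3} is recovered, but both surviving coefficients are
# shrunk toward zero relative to their true values (the l1 bias).
```

For larger α, the shrinkage grows and can drop true features entirely, which is one source of the support-recovery failures discussed above.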

Subsequently, the sparse regression literature addressing the l0 norm optimization in (1) is prolific.18–24 Despite the fact that the problem is NP-hard, there has been very recent work in mixed-integer (discrete) optimization showing that l0-based sparse regression can actually be solved to optimality with reasonable computational efficiency for low-dimensional problems24 and even some high-dimensional problems.25 Two caveats apply to exact methods for high-dimensional l0-based sparse regression. First, it is unclear whether these methods can easily incorporate additional optimization constraints. For instance, the MIOSR algorithm for system identification allows for convex constraints and scales well with the problem dimension, but scales poorly with the difficulty of the optimization problem,24 so it is not clear that this algorithm is suitable for difficult problems in high-dimensional spaces. Second, there is limited open-source code availability to perform general or sparse-regression-specific high-dimensional mixed-integer optimization. Nonetheless, this is an exciting development for future progress in high-dimensional sparse regression.
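For intuition about what these exact methods compute, here is a brute-force illustration (our own sketch, not the mixed-integer approach of the cited work) that solves (1) exactly by enumerating every support, which is only feasible in low dimension:

```python
# Exact l0 regression by exhaustive search over all supports.
from itertools import combinations
import numpy as np

def exact_l0(A, b, alpha):
    """Globally minimize ||Ax - b||_2^2 + alpha*||x||_0 over all supports."""
    n = A.shape[1]
    best_obj, best_x = np.sum(b ** 2), np.zeros(n)   # empty-support baseline
    for k in range(1, n + 1):
        for support in combinations(range(n), k):
            cols = list(support)
            coef = np.linalg.lstsq(A[:, cols], b, rcond=None)[0]
            obj = np.sum((A[:, cols] @ coef - b) ** 2) + alpha * k
            if obj < best_obj:
                best_obj, best_x = obj, np.zeros(n)
                best_x[cols] = coef
    return best_x

rng = np.random.default_rng(5)
A = rng.normal(size=(40, 8))
x_true = np.zeros(8)
x_true[[2, 5]] = [1.0, -1.5]
b = A @ x_true + 0.01 * rng.normal(size=40)
x_hat = exact_l0(A, b, alpha=0.5)   # 2^8 - 1 = 255 supports enumerated
```

The 2^n enumeration is exact but exponential; branch-and-bound mixed-integer solvers prune this search tree, which is what makes optimality reachable at much larger n.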

In most previous research using high-dimensional and constrained sparse regression, greedy algorithms26 are used that iteratively select nonzero features in x. Unfortunately, greedy algorithms can fail spectacularly, so there has been substantial research into deriving performance guarantees for greedy algorithms, given some reasonable properties of the underlying optimization problem. The most substantial theoretical performance guarantees for greedy algorithms come from formulating an objective function that is submodular. A review of submodularity can be found in Bilmes,27 and recent work has also found theoretical performance bounds for greedy optimization of some non-submodular functions,28 including the mean-squared error.29 This literature provides a strong motivation for researchers to choose submodular or similar optimization objective functions.
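The basic structure of such greedy methods can be sketched as follows (our own minimal forward-selection example in the style of orthogonal matching pursuit; the data and target sparsity are assumptions):

```python
# Greedy forward selection: at each step, add the feature most
# correlated with the residual, then refit on the chosen support.
import numpy as np

def greedy_forward_selection(A, b, k):
    support = []
    residual = b.copy()
    for _ in range(k):
        scores = np.abs(A.T @ residual)
        scores[support] = -np.inf            # never re-select a feature
        support.append(int(np.argmax(scores)))
        coef = np.linalg.lstsq(A[:, support], b, rcond=None)[0]
        residual = b - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(7)
A = rng.normal(size=(150, 12))
x_true = np.zeros(12)
x_true[[1, 6, 9]] = [1.0, -2.0, 0.5]
b = A @ x_true + 0.01 * rng.normal(size=150)
x_greedy = greedy_forward_selection(A, b, k=3)
```

Each iteration makes a locally optimal choice and never revisits it; the submodularity-style guarantees discussed above bound how far such a sequence of local choices can fall from the global optimum.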

Sparsity-promotion is used for nonconvex problems too. Unfortunately, these problems are very challenging to solve because there are now multiple nonconvex terms and the l0 norm is nonsmooth. Once again, the problem is typically addressed by convex relaxations of the l0 or by greedy algorithms. Two examples in the literature are sparsifying deep neural networks to avoid overfitting, reduce memory, and enforce group structure30 (though the l1 norm is used there), and system identification using the multi-time step dynamical prediction error as a loss term31 (though greedy techniques are used). Despite the challenges, high-dimensional sparse regression results for neural network compression indicate that greedy magnitude pruning often performs quite well.32 Finally, note that most of the preceding discussion generalizes to the Bayesian inference setting,33,34 where it is roughly the case that the l1 maps to the Laplace prior, the l0 maps to the spike-and-slab prior, and relaxations of the spike-and-slab are common.35 

The primary point for the present review is that a litany of algorithms is now available to solve (1) in high-dimensional spaces and with Bayesian inference, convex constraints, variations of the sparsity prior, and some theoretical performance guarantees. Therefore, we move on now to a discussion of the plasma physics problems that take an equivalent or similar form to (1). Further details regarding optimization considerations can be found in the review from Bertsimas et al.36 

This work is intended to be a concise and useful manual for plasma physics practitioners to quickly understand the opportunities, optimization problems, and algorithms appearing in sparse regression, which typically appears only briefly (and sometimes disjointly because of the diverse applications) in larger reviews on data-driven methods for plasma physics.37,38 We highlight new work, by ourselves and others, in this field which demonstrates that an understanding of new innovations and remaining challenges in sparse regression can lead to novel solutions for a broad array of useful plasma physics problems.

Notions of sparsity occur very naturally in a number of plasma physics applications ranging from system identification to signal processing. To some extent, the plasma physics community has addressed the ubiquity of sparse and limited plasma measurements by numerous innovations in diagnostic techniques, installation of many diagnostics per device, the operation of such diagnostics over years, and intensive research into high-fidelity numerical simulations. Thus, despite the limited spatial resolution of plasma measurements in experiments, the modern era has actually seen the proliferation of enormous volumes of experimental and simulated plasma physics data, which can be used to discover new data-driven models across the large parameter regimes spanned by different subfields of plasma physics. Toward that goal, we focus this review first on the task of system identification.

System identification, or model discovery, is the subfield of identifying a physical or reduced-order model from data, and it plays an increasingly important role in plasma physics. Identifying or approximating the true underlying dynamics of a plasma system has obvious importance to scientists, including for understanding dynamical behavior, estimating physical parameters such as the viscosity or resistivity, improving simulations, forecasting, and much more. In particular, reduced-order modeling is the process of finding low-dimensional models that approximate the true high-dimensional system, which is often restricted to a low-dimensional manifold. Identifying reduced-order models can be useful for a number of tasks, since they can be computed more efficiently than the high-dimensional models, coupled with control strategies, and investigated for insights into the dynamical behavior.

Traditional system identification typically assumes that the underlying data can be fit reasonably well with a linear model; this assumption is often not well-justified physically, but is suitable for highly resolved data and particularly useful for applying real-time control. However, nonlinear models coupled with constrained model predictive control (MPC) appear feasible for a number of scenarios.39 These nonlinear schemes may be required for complex modeling scenarios, such as the dynamics of detachment in the tokamak divertor, as we highlight in Sec. III A.

Since system identification refers to any method that builds a model from data, it encompasses an enormous swath of literature across scientific disciplines. A large number of varied approaches have been developed in recent years, such as the dynamic mode decomposition,40,41 Koopman theory,42 nonlinear autoregressive algorithms,43 neural networks,44–46 methods based on Gaussian process regression,47 operator inference and reduced-order modeling,48–50 genetic programming,51,52 divide-and-conquer strategies,53 and sparse regression. We focus on system identification based on sparse regression because it produces an interpretable set of equations that can be scientifically analyzed. It can also be computed very efficiently, and subsequently such methods have seen prolific use across scientific fields. One of the most common such methods is called the sparse identification of nonlinear dynamics (SINDy).18 

Within the field of plasma physics, scientists have recently attempted to discover nonlinear SINDy models from data, including models for anode-glow oscillations,54 simplified models for the L–H transition in tokamaks,55 reduced-order models for magnetohydrodynamic (MHD) simulations,56 hierarchical plasma models,57 more accurate plasma closures,58,59 turbulence modeling,60 and models for the tokamak divertor.61 

It is a reasonable expectation that nonlinear models of explicit equations can represent nonlinear systems more faithfully than linear models, but there has also been substantial progress building useful linear models by embedding nonlinear systems in much higher-dimensional spaces. For a review on this Koopman theory perspective, see, e.g., Brunton et al.42 In addition, interpretable nonlinear models obtained from system identification can come with some caveats, which we highlight now in order to warn practitioners and point to potential solutions.

For instance, even if the equations extracted from the data have all of the correct dynamical terms, the coefficients on these terms will inevitably have errors from finite sampling rates, numerical precision limits, or noise in the dataset. This is problematic for the purpose of forecasting (if the practitioner is interested in the equations for other reasons, the following discussion is irrelevant), because very small deviations from the true dynamical equations can result in models that are unstable for some subset of initial conditions. This can be resolved straightforwardly for linear data-driven systems of arbitrary state dimension: during the optimization, require that the matrix C in the fit ẋ=Cx be negative semidefinite, as in, e.g., Pan and Duraisamy.62 Stability analysis is much more challenging for nonlinear systems, but there are some methods for guaranteeing local or global stability for quadratically nonlinear models, which have relevance for many fluid and magnetohydrodynamic models.63–65 To illustrate, Fig. 1 shows the correct simulation of new trajectories from a provably stable, nonlinear, and data-driven model trained on a noisy trajectory of the Lorenz63 system.66 Finally, recent work has utilized the multi-time step dynamical prediction error as a loss term,31,67 sacrificing convexity for much improved (but not guaranteed) model stability. This is a worthwhile trade for a number of applications.
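The linear case can be sketched in a few lines (our own simplified illustration; the cited references enforce the constraint during the optimization, whereas here, for brevity, an unconstrained fit is projected afterward). If the symmetric part of C is negative semidefinite, then V(x) = ‖x‖² is a Lyapunov function and every trajectory of ẋ = Cx is bounded:

```python
# Fit xdot = C x by least squares, then project the symmetric part of
# C onto the negative semidefinite cone, certifying stability via the
# Lyapunov function V(x) = ||x||^2.
import numpy as np

def fit_stable_linear_model(X, Xdot):
    C = np.linalg.lstsq(X, Xdot, rcond=None)[0].T     # Xdot ≈ X @ C.T
    S, W = (C + C.T) / 2, (C - C.T) / 2               # symmetric/antisymmetric
    evals, evecs = np.linalg.eigh(S)
    S = evecs @ np.diag(np.minimum(evals, 0.0)) @ evecs.T  # clip evals to <= 0
    return S + W

# Demo on a damped oscillator, with exact derivatives at sampled states.
C_true = np.array([[-0.1, 1.0], [-1.0, -0.1]])
rng = np.random.default_rng(6)
X = rng.normal(size=(500, 2))
Xdot = X @ C_true.T
C_fit = fit_stable_linear_model(X, C_true @ X.T @ np.eye(500) @ np.ones((500, 500)) if False else Xdot)
# Eigenvalues of (C_fit + C_fit.T)/2 are all <= 0 by construction, so
# every trajectory of the identified model is bounded.
```

Here the oscillator's symmetric part is already negative definite, so the projection leaves the fit unchanged; for a noisy fit with a weakly unstable symmetric part, the projection is what restores boundedness.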

FIG. 1.

System identification coupled with a nonlinear stability analysis and robust techniques can produce models that are provably globally stable for any initial condition.65 In this illustration, we use the weak formulation with the trapping (global stability) variant of SINDy, trained on (b) a Lorenz63 trajectory of 5000 points in time. Zero-mean Gaussian noise, with standard deviation equal to 20% of the root mean square of the training data, has been added to every point in the training data. The identified model is used to predict trajectories starting from 20 new initial conditions randomly selected in [−150, 150]³. The model is analytically globally stable, and in (a), we show for visualization's sake that the model produces only bounded trajectories that settle on the true attractor. A zoomed view of the attractor is shown in (c).


Moreover, there have been recent and substantial improvements in the robustness of sparse regression for system identification. SINDy has been extended to handle more complex modeling scenarios, such as partial differential equations (PDEs),19,68 delay equations,69 stochastic differential equations,70–73 Bayesian approaches,35 systems with inputs or control,39,74,75 systems with implicit dynamics,76,77 hybrid systems,78,79 to enforce physical constraints,21,56,80 to incorporate information theory81 or group sparsity82 or global stability,65 to identify models from corrupt or limited data,83–86 to identify models with partial measurements of the state space,67,87–89 to identify models with clever subsampling strategies90 and ensembles of initial conditions,91 to perform cross-validation with ensemble methods,92 and to extend the formulation to include weak or integral terms,57,93–98 tensor representations,99,100 and stochastic forcing.101 In particular, the weak formulation drastically improves performance when noise is present in the data, and we recommend this formulation for essentially all applications.
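The weak formulation's advantage is easy to see on a toy problem (our own example on ẋ = ax; the test functions and window layout are arbitrary choices): multiplying by a test function φ that vanishes at the window endpoints and integrating by parts replaces the noisy derivative ẋ with an integral of x itself, −∫φ′x dt = a∫φx dt.

```python
# Weak-form regression for xdot = a*x: no derivative of the noisy
# signal is ever computed; only integrals of x against test functions.
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 5, 1001)
dt = t[1] - t[0]
x = np.exp(-t) + 0.01 * rng.normal(size=t.size)   # noisy samples, true a = -1

# Polynomial bump test functions phi = (s - s0)^2 (s1 - s)^2 on sliding
# windows; phi vanishes at both endpoints, so the boundary term drops out.
lhs, rhs = [], []
for start in range(0, 801, 50):
    idx = np.arange(start, start + 200)
    s = t[idx]
    phi = (s - s[0]) ** 2 * (s[-1] - s) ** 2
    dphi = 2 * (s - s[0]) * (s[-1] - s) ** 2 - 2 * (s - s[0]) ** 2 * (s[-1] - s)
    lhs.append(-np.sum(dphi * x[idx]) * dt)   # -∫ phi' x dt  (= ∫ phi xdot dt)
    rhs.append(np.sum(phi * x[idx]) * dt)     #  ∫ phi  x dt
a_weak = np.linalg.lstsq(np.array(rhs)[:, None], np.array(lhs), rcond=None)[0][0]
# a_weak recovers a ≈ -1 despite never differentiating the noisy signal.
```

Finite differencing the same noisy signal would amplify the noise by a factor of order 1/dt; the integrals instead average it away, which is the essence of the weak SINDy formulations cited above.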

The enormous number of modifications to this method can be overwhelming for practitioners, and which advanced techniques to utilize will depend on the purpose of the system identification. For general use, many of the aforementioned sparse-regression-based system identification advances have been implemented in the open-source PySINDy code.102,103 The technicalities are mostly hidden from the user interface, and a large set of examples and YouTube tutorials are available. Moreover, large-scale datasets have recently been incorporated in PySINDy for standardized benchmarks of the code.104 
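As a minimal sketch of the core computation that PySINDy wraps (our own toy example with exact derivative data; real workflows estimate derivatives from time series or use the weak form), a polynomial candidate library plus sequentially thresholded least squares recovers ẋ = x − x³:

```python
# Toy SINDy-style identification of xdot = x - x^3 from a candidate
# library, using sequentially thresholded least squares (STLSQ).
import numpy as np

# "Measurements" of the state and its derivative.
x = np.linspace(-2, 2, 400)
xdot = x - x ** 3

# Candidate library Theta(x) = [1, x, x^2, x^3].
Theta = np.column_stack([np.ones_like(x), x, x ** 2, x ** 3])

xi = np.linalg.lstsq(Theta, xdot, rcond=None)[0]
for _ in range(10):                       # STLSQ iterations
    small = np.abs(xi) < 0.1
    xi[small] = 0.0
    keep = ~small
    xi[keep] = np.linalg.lstsq(Theta[:, keep], xdot, rcond=None)[0]
# xi ≈ [0, 1, 0, -1]: only the terms x and -x^3 survive thresholding.
```

In PySINDy, this corresponds roughly to `pysindy.SINDy(optimizer=pysindy.STLSQ(threshold=0.1), feature_library=pysindy.PolynomialLibrary(degree=3))` followed by `.fit(...)`, with the library construction, differentiation, and thresholding handled internally.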

Sparse sensing (also called optimal sensor placement)105 is the subfield of maximizing the information gleaned from a sparse set of measurements. Sparse sensing usually refers to choosing the most informative points for sampling data and applying actuation.106 Since plasmas are inherently hard to diagnose, we expect that research in sparse sensing could produce significant results, especially for extremely sparse diagnostic environments, such as nuclear fusion devices, spacecraft, and space weather.107 These methods can be extended to incorporate cost constraints,108,109 representing anything from literal financial costs to electricity requirements to varying levels of diagnostic accessibility.

Sparse sensing and compressed sensing110 are closely related. Compressed sensing relies on the fact that signals can often be sparsely represented in some universal basis such as a Fourier basis. Compressed sensing is clearly related to optimal sensor placement because one can phrase the problem as selecting a measurement matrix that facilitates maximal compression. The primary difference is that optimal sensor placement algorithms typically rely on "tailored" bases rather than universal ones, i.e., using the basis of biorthogonal modes111,112 computed from a training dataset, rather than a Fourier or other universal basis. Furthermore, compressed sensing has seen some use in the plasma physics field,113–117 not to select optimal measurement points, but rather to avoid overfitting, provide data compression, or address ill-posedness. More sophisticated nonlinear methods of sparse sensing are also increasingly available.118 As with many high-dimensional l0 optimization problems, most methods of sparse sensing rely on greedy algorithms with submodular objectives. As far as the authors are aware, there has been little consideration in the plasma physics field of using these methods to choose diagnostic measurement locations in simulations or real experiments, and we consider this a useful opportunity for future work.

As a simple illustration, a particle-in-cell code is used to simulate the 1D two-stream instability and a biorthogonal basis is learned from the data. This basis is used, alongside an initial set of 20 pointwise sensor locations, by a greedy sensor placement algorithm105 in order to discover the most informative locations in the phase space. Figure 2 shows the phase space reconstruction errors during another simulation of the two-stream instability, at a single snapshot in time, for 20 randomly placed sensors vs the 20 sensors that were greedily optimized. The optimized sensors clearly pick out the important peaks and troughs where the particles tend to bunch up in the phase space, resulting in much better reconstruction errors. Greedy algorithms are useful here because one is choosing 20 points to sample from the set of all points in the full phase space grid. Even for this simple problem, the set of all gridpoints is a high-dimensional space.
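The greedy placement step itself can be sketched in a few lines (our own synthetic 1D example; Manohar et al.105 describe the full method, which here reduces to QR factorization with column pivoting on the learned modes):

```python
# Basis-driven greedy sensor placement: learn r modes from training
# snapshots via the SVD, then pick sensor locations by QR column
# pivoting on the mode matrix.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(2)
n_grid, n_snapshots, r = 200, 50, 5
grid = np.linspace(0, 2 * np.pi, n_grid)

# Synthetic training snapshots: random mixtures of a few smooth modes.
modes_true = np.column_stack([np.sin((k + 1) * grid) for k in range(r)])
X_train = modes_true @ rng.normal(size=(r, n_snapshots))

# Tailored basis from the training data (leading left singular vectors).
Psi_r = np.linalg.svd(X_train, full_matrices=False)[0][:, :r]

# Greedy selection: QR with column pivoting on Psi_r^T picks the r grid
# points that best condition the point-measurement matrix.
_, _, pivots = qr(Psi_r.T, pivoting=True)
sensors = pivots[:r]

# Reconstruct an unseen snapshot from only r point measurements.
x_new = modes_true @ rng.normal(size=r)
coeffs = np.linalg.lstsq(Psi_r[sensors], x_new[sensors], rcond=None)[0]
x_rec = Psi_r @ coeffs
rel_err = np.linalg.norm(x_rec - x_new) / np.linalg.norm(x_new)
```

Because the new snapshot lies in the span of the learned modes, r well-placed point sensors suffice for essentially exact reconstruction; with noisy or out-of-basis data, the same pipeline degrades gracefully.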

FIG. 2.

A demonstration of sparse sensing on a particle-in-cell simulation of the 1D two-stream instability. After learning a useful basis for reconstruction from a training set, the algorithm from Manohar et al.105 learns to place 20 informative sensors. The algorithm reconstruction errors of a new test simulation (right) compare favorably with the errors from using 20 randomly placed sensors (left).


Gaussian process regression is now a common tool across the plasma physics field. For instance, it has been used for fitting diagnostic profiles in fusion devices,119 accelerating the convergence of global transport simulations,120 solar wind event classification,121 and laser pulse shape optimization in laser-wakefield accelerators.122 There are a number of excellent introductions to Gaussian process regression available in the literature.123 For our purposes, it matters only that Gaussian process regression is regression, i.e., that a set of outputs, in a Bayesian framework, are modeled with a set of Gaussian processes with free parameters. The resulting parameter estimation is typically done with maximum likelihood with a penalization for the complexity (in other words, a reward for sparsity) of the parameters.123 The sparsity accounts for outliers and promotes parsimonious models. Common distributions used for sparsity promotion include Student's t, logistic, Laplace, spike-and-slab, and horseshoe distributions. Student's t was recently used for tokamak profile fits,124 and the sparse GP technique from Almosallam et al.125 was used for predicting implosion yields in inertial confinement fusion.126 To see how this relates back to sparse regression as formulated above, note that maximum a posteriori (MAP) estimation using the Laplace distribution corresponds to regression with the l1 norm.127 We expect sparse Gaussian process techniques to expand to new and high-dimensional applications in the plasma physics field, and the connection with sparse regression is fundamental.
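This correspondence is easy to verify numerically (our own illustration; the matrices and scales are arbitrary): with a Gaussian likelihood of scale σ and independent Laplace priors of scale λ on the weights, σ² times the negative log posterior is exactly the l1-regularized least-squares objective with α = σ²/λ, so the MAP estimate and the l1 minimizer coincide.

```python
# Numerical check that MAP estimation with a Laplace prior is
# equivalent to l1-regularized least squares (constants dropped).
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(30, 4))
b = rng.normal(size=30)
sigma, lam = 0.8, 0.7          # Gaussian noise scale, Laplace prior scale

def neg_log_posterior(x):
    nll = np.sum((A @ x - b) ** 2) / (2 * sigma ** 2)  # Gaussian likelihood
    nlp = np.sum(np.abs(x)) / lam                      # Laplace prior
    return nll + nlp                                   # normalization dropped

def l1_objective(x, alpha):
    return 0.5 * np.sum((A @ x - b) ** 2) + alpha * np.sum(np.abs(x))

# Scaling an objective by sigma^2 > 0 does not move its minimizer, and
# the scaled posterior equals the l1 objective with alpha = sigma^2/lam.
x0 = rng.normal(size=4)
lhs = sigma ** 2 * neg_log_posterior(x0)
rhs = l1_objective(x0, sigma ** 2 / lam)
```

The same algebra with a spike-and-slab prior recovers an l0-type penalty, which is the Bayesian side of the correspondence noted earlier.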

We have discussed three subfields of sparse regression for which there are a myriad of applications in the plasma physics field. A number of notable examples in the literature were pointed out, but the examples in the present work have so far been limited to illustrative toy problems. We now transition to demonstrating a few concrete and realistic applications of these methods in the literature, which are focused on fusion experiments because this is the present authors' subfield of mutual expertise. Section III B will dive considerably deeper into a discussion of sparse regression for stellarators, where we see significant opportunities for sparsity-promotion across a wide range of optimization problems.

As we indicated in the introduction, the real-time control needs of the tokamak fusion concept are very significant. One of the most important challenges is divertor detachment control,128–130 since this strongly affects the large heat fluxes to the divertor and elsewhere. Accurately modeling divertor detachment requires a boundary transport plasma model, such as the coupled 2D fluid plasma and kinetic neutral transport code SOLPS-ITER.131 These codes are too computationally intensive to include in a real-time control loop, so reduced-order data-driven models are needed that both approximate the SOLPS-ITER solution and can be computed fast enough to be integrated into the closed loop. We have highlighted this problem because it illustrates an important use case for data-driven modeling and sparse system identification in the fusion community.

To begin to address this challenging problem, recent work61 reduces divertor detachment control using extrinsic impurity seeding to the more straightforward problem of controlling two pointwise plasma quantities using main ion gas puff actuation. The SINDy method is used to generate linear and nonlinear models for the evolution of pointwise measurements of the electron density (upstream) and temperature (downstream). The models are computationally efficient enough to be both incorporated into a model-predictive control (MPC) loop and automatically retrained if model errors become greater than a preset threshold. Figure 3 illustrates the geometry of the tokamak boundary, alongside a baseline setpoint scenario where a linear SINDy model is shown to accurately track the full SOLPS-ITER evolution of the pointwise measurements. In more advanced scenarios, the authors show that MPC with nonlinear models is required for tracking the true evolution of the pointwise measurements.

FIG. 3.

Left: Mesh of the 2D tokamak geometry used in SOLPS-ITER simulations, with locations labeled for the electron density (outboard midplane) and temperature (outer divertor) measurements. Right: A control scenario where the gas puff (bottom row) is modulated with changing waveforms. In this case, the pointwise measurements from concatenated SOLPS-ITER simulations can be tracked accurately by linear SINDy models (dashed black).


A particularly interesting opportunity for sparse regression comes in the form of the many variants of stellarator optimization. Stellarator optimization is typically divided into two stages. The first is a configuration optimization using fixed-boundary MHD equilibrium codes to obtain MHD equilibria with desirable physics properties.132–134 After obtaining the optimal magnetic field in this first stage, complex coils must be designed to produce these fields135–137 and this complexity raises the cost and difficulty of manufacturing. Both stages of stellarator optimization can be performed with a large number of degrees of freedom, and sparsity can be useful in various scenarios. Stage-2 optimization with coils and permanent magnets is particularly appealing for sparsity-promoting regularizations, since an optimization using the Biot–Savart law as the cost function will always be ill-posed.5 

Recently, we have reformulated permanent magnet optimization for stellarators as sparse regression and provided new algorithms for solving this problem.138 Thanks to the connections with sparse regression, we were able to design greedy algorithms that can compete with the state of the art, while being significantly simpler in design, faster to compute, and guaranteed to generate solutions with advantageous engineering properties (binary, grid-aligned magnets).139 Figure 4 illustrates a novel permanent magnet solution for a four-period quasi-helically symmetric stellarator in Landreman and Paul,140 scaled to the 0.15 T on-axis magnetic field strength of the MUSE permanent magnet experiment.141 A similar approach appears feasible with other problems appearing in the stellarator field, including winding-surface optimization135 and superconducting tile optimization.142 

FIG. 4.

Overhead and side views of a permanent magnet solution (discrete points in red, white, and blue) found for the four-period quasi-helically symmetric plasma surface described in Landreman et al.140 The plasma is visualized by plotting the B·n̂ errors on the surface; minimizing B·n̂ is the goal of stage-2 optimization. The magnet solution is very sparse; most of the grid is left unfilled (white). Only magnets at ±1 times the maximum strength are placed (red and blue). The planar toroidal field coils used for this optimization are not pictured for visualization's sake.


Stage-1 stellarator optimization for the plasma boundary may also benefit from promoting sparsity; the space of Fourier modes describing the boundary can be made quite large. Additionally, there may be interesting and useful stellarator shapes that are dominantly described by relatively high-order Fourier harmonics. In practice, stage-1 convergence is quite challenging if one uses many Fourier modes (i.e., many sub-optimal local minima can appear), so often only the first 3–4 Fourier modes are considered in each angular direction. Furthermore, the optimization is frequently performed in multiple stages, starting with just a few modes, converging that solution, and using it as an initial condition for another optimization over additional Fourier degrees of freedom.143 By using sparsity-promotion to regularize the problem, we may be able to optimize directly in the higher-dimensional space of Fourier modes, finding new and improved stellarators while retaining some parsimony in the final configuration.
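As a toy illustration of how sparsity-promotion keeps a Fourier parametrization parsimonious (a hypothetical 2D curve, not a real stellarator boundary; sequentially thresholded least squares stands in for the regularized optimization), a large candidate basis can be pruned down to the few harmonics that actually matter:

```python
import numpy as np

# Fit a curve R(theta) with 30 candidate cosine harmonics, then prune
# negligible coefficients and refit, so only a few modes survive.
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
R = 1.0 + 0.3 * np.cos(theta) + 0.05 * np.cos(7 * theta)   # three active modes

n_modes = 30
Phi = np.column_stack([np.cos(m * theta) for m in range(n_modes)])

c = np.linalg.lstsq(Phi, R, rcond=None)[0]
for _ in range(5):                           # sequential thresholding passes
    small = np.abs(c) < 0.01                 # prune negligible harmonics
    c[small] = 0.0
    keep = ~small
    c[keep] = np.linalg.lstsq(Phi[:, keep], R, rcond=None)[0]

print(np.flatnonzero(c))                     # -> [0 1 7]
```

The fit over all 30 modes is replaced by a description using only the three harmonics that generated the curve, including the high-order m = 7 mode, mirroring the hope that sparse regularization could retain important high-order boundary harmonics without sacrificing convergence.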

Finally, even more exotic stellarator optimization problems could be attempted with sparsity-promotion. For instance, a stage-1 problem could be formulated so that alpha particles are lost only at a sparse set of locations, corresponding to positions for liquid metal collection or divertors, instead of having alpha losses that are uniformly nonzero or large in certain unfavorable directions.

Sparse regression appears across science, including increasingly in plasma physics, and it is often indispensable for producing high-quality and interpretable results from high-dimensional optimization and parameter estimation. Recent work across the scientific community continues to improve the robustness of these methods for tasks such as system identification, compressed sensing, and other optimization problems. New algorithms increasingly address more advanced scenarios, such as constrained or high-dimensional regression.

Plasma physics and its subfields, such as nuclear fusion, laser wakefield acceleration, plasma propulsion, and heliophysics, regularly encounter challenges that can be addressed by formulating the problem as sparse regression. We posit that there are significant opportunities for future work in stellarator optimization, compressed sensing, and general parameter estimation, including Gaussian process regression.

The authors thank Eduardo Paulo Alves for providing the two-stream instability simulation data. This work was supported by the U.S. Department of Energy under Award Nos. DE-FG02-93ER54197 and DE-AC05-00OR22725, through a grant from the Simons Foundation under Award No. 560651, and by the National Science Foundation under Grant No. PHY-2108384.

The authors have no conflicts to disclose.

Alan Ali Kaptanoglu: Conceptualization (lead); Investigation (lead); Writing – original draft (lead). Christopher Hansen: Conceptualization (equal); Supervision (equal); Writing – review & editing (equal). Jeremy D. Lore: Investigation (supporting); Methodology (supporting). Matt Landreman: Conceptualization (equal); Software (equal); Supervision (equal). Steven L. Brunton: Conceptualization (equal); Supervision (equal); Writing – review & editing (supporting).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1. M. L. Walker, E. Schuster, D. Mazon, and D. Moreau, "Open and emerging control problems in tokamak plasma control," in 47th IEEE Conference on Decision and Control (IEEE, 2008), pp. 3125–3132.
2. F. Felici, "Real-time control of tokamak plasmas: From control of physics to physics-based control," Ph.D. thesis (École Polytechnique Fédérale de Lausanne, 2011).
3. M. L. Walker, P. De Vries, F. Felici, and E. Schuster, "Introduction to tokamak plasma control," in American Control Conference (ACC) (IEEE, 2020), pp. 2901–2918.
4. G. Grieger, W. Lotz, P. Merkel, J. Nührenberg, J. Sapper, E. Strumberger, H. Wobig, R. Burhenn, V. Erckmann, U. Gasparino et al., "Physics optimization of stellarators," Phys. Fluids B 4, 2081–2091 (1992).
5. L.-M. Imbert-Gerard, E. J. Paul, and A. M. Wright, "An introduction to stellarators: From magnetic fields to symmetries and optimization," arXiv:1908.05360 (2019).
6. C. Hegna, D. Anderson, A. Bader, T. Bechtel, A. Bhattacharjee, M. Cole, M. Drevlak, J. Duff, B. Faber, S. Hudson et al., "Improving the stellarator through advances in plasma theory," Nucl. Fusion 62, 042012 (2022).
7. H. Abu-Shawareb, R. Acree, P. Adams, J. Adams, B. Addis, R. Aden, P. Adrian, B. Afeyan, M. Aggleton, L. Aghaian et al., "Lawson criterion for ignition exceeded in an inertial fusion experiment," Phys. Rev. Lett. 129, 075001 (2022).
8. J. Mlynar, T. Craciunescu, D. R. Ferreira, P. Carvalho, O. Ficker, O. Grover, M. Imrisek, J. Svoboda, and JET contributors, "Current research into applications of tomography for fusion diagnostics," J. Fusion Energy 38, 458–466 (2019).
9. X. Blasco, J. M. Herrero, J. Sanchis, and M. Martínez, "A new graphical visualization of n-dimensional Pareto front for decision-making in multiobjective optimization," Inf. Sci. 178, 3908–3924 (2008).
10. H. Akaike, "A new look at the statistical model identification," IEEE Trans. Autom. Control 19, 716–723 (1974).
11. R. Tibshirani, M. Wainwright, and T. Hastie, Statistical Learning with Sparsity: The Lasso and Generalizations (Chapman and Hall/CRC, 2015).
12. S. Guo, Z. Wang, and Q. Ruan, "Enhancing sparsity via lp (0<p<1) minimization for robust face recognition," Neurocomputing 99, 592–602 (2013).
13. J. Huang and T. Zhang, "The benefit of group sparsity," Ann. Stat. 38, 1978–2004 (2010).
14. M. J. Wainwright, "Sharp thresholds for high-dimensional and noisy sparsity recovery using l1-constrained quadratic programming (Lasso)," IEEE Trans. Inf. Theory 55, 2183–2202 (2009).
15. W. Su, M. Bogdan, and E. Candes, "False discoveries occur early on the Lasso path," Ann. Stat. 45, 2133–2150 (2017).
16. G. H. Golub, P. C. Hansen, and D. P. O'Leary, "Tikhonov regularization and total least squares," SIAM J. Matrix Anal. Appl. 21, 185–194 (1999).
17. T. Odstrčil, T. Pütterich, M. Odstrčil, A. Gude, V. Igochine, U. Stroth, and ASDEX Upgrade Team, "Optimized tomography methods for plasma emissivity reconstruction at the ASDEX upgrade tokamak," Rev. Sci. Instrum. 87, 123505 (2016).
18. S. L. Brunton, J. L. Proctor, and J. N. Kutz, "Discovering governing equations from data by sparse identification of nonlinear dynamical systems," Proc. Natl. Acad. Sci. 113, 3932–3937 (2016).
19. S. H. Rudy, S. L. Brunton, J. L. Proctor, and J. N. Kutz, "Data-driven discovery of partial differential equations," Sci. Adv. 3, e1602614 (2017).
20. P. Zheng, T. Askham, S. L. Brunton, J. N. Kutz, and A. Y. Aravkin, "A unified framework for sparse relaxed regularized regression: SR3," IEEE Access 7, 1404–1423 (2019).
21. K. Champion, P. Zheng, A. Y. Aravkin, S. L. Brunton, and J. N. Kutz, "A unified sparse optimization framework to learn parsimonious physics-informed models from data," IEEE Access 8, 169259–169271 (2020).
22. P. Zheng and A. Aravkin, "Relax-and-split method for nonconvex inverse problems," Inverse Probl. 36, 095013 (2020).
23. L. Liu, Y. Shen, T. Li, and C. Caramanis, "High dimensional robust sparse regression," in International Conference on Artificial Intelligence and Statistics (PMLR, 2020), pp. 411–421.
24. D. Bertsimas and W. Gurnee, "Learning sparse nonlinear dynamics via mixed-integer optimization," Nonlinear Dyn. 111, 6585–6604 (2023).
25. D. Bertsimas and B. Van Parys, "Sparse high-dimensional regression: Exact scalable algorithms and phase transitions," Ann. Stat. 48, 300–323 (2020).
26. T. Zhang, "Adaptive forward-backward greedy algorithm for learning sparse representations," IEEE Trans. Inf. Theory 57, 4689–4708 (2011).
27. J. Bilmes, "Submodularity in machine learning and artificial intelligence," arXiv:2202.00132 (2022).
28. A. A. Bian, J. M. Buhmann, A. Krause, and S. Tschiatschek, "Guarantees for greedy maximization of non-submodular functions with applications," in International Conference on Machine Learning (PMLR, 2017), pp. 498–507.
29. A. Kohara, K. Okano, K. Hirata, and Y. Nakamura, "Sensor placement minimizing the state estimation mean square error: Performance guarantees of greedy solutions," in 59th IEEE Conference on Decision and Control (CDC) (IEEE, 2020), pp. 1706–1711.
30. R. Ma, J. Miao, L. Niu, and P. Zhang, "Transformed l1 regularization for learning sparse deep neural networks," Neural Networks 119, 286–298 (2019).
31. K. Kaheman, S. L. Brunton, and J. N. Kutz, "Automatic differentiation to simultaneously identify nonlinear dynamics and extract noise probability distributions from data," Mach. Learn.: Sci. Technol. 3, 015031 (2022).
32. T. Gale, E. Elsen, and S. Hooker, "The state of sparsity in deep neural networks," arXiv:1902.09574 (2019).
33. S. Ji, Y. Xue, and L. Carin, "Bayesian compressive sensing," IEEE Trans. Signal Process. 56, 2346–2356 (2008).
34. I. Castillo, J. Schmidt-Hieber, and A. van der Vaart, "Bayesian linear regression with sparse priors," Ann. Stat. 43, 1986–2018 (2015).
35. S. M. Hirsh, D. A. Barajas-Solano, and J. N. Kutz, "Sparsifying priors for Bayesian uncertainty quantification in model discovery," R. Soc. Open Sci. 9, 211823 (2022).
36. D. Bertsimas, J. Pauphilet, and B. Van Parys, "Sparse regression: Scalable algorithms and empirical performance," Stat. Sci. 35, 555–578 (2020).
37. A. Döpp, C. Eberle, S. Howard, F. Irshad, J. Lin, and M. Streeter, "Data-driven science and machine learning methods in laser-plasma physics," arXiv:2212.00026 (2022).
38. R. Anirudh, R. Archibald, M. S. Asif, M. M. Becker, S. Benkadda, P.-T. Bremer, R. H. Budé, C. Chang, L. Chen, R. Churchill et al., "2022 review of data-driven plasma science," arXiv:2205.15832 (2022).
39. E. Kaiser, J. N. Kutz, and S. L. Brunton, "Sparse identification of nonlinear dynamics for model predictive control in the low-data limit," Proc. R. Soc. A 474, 20180335 (2018).
40. P. J. Schmid, "Dynamic mode decomposition of numerical and experimental data," J. Fluid Mech. 656, 5–28 (2010).
41. J. N. Kutz, S. L. Brunton, B. W. Brunton, and J. L. Proctor, Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems (SIAM, 2016).
42. S. L. Brunton, M. Budišić, E. Kaiser, and J. N. Kutz, "Modern Koopman theory for dynamical systems," SIAM Rev. 64, 229–340 (2022).
43. S. A. Billings, Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal Domains (John Wiley & Sons, 2013).
44. J. Pathak, B. Hunt, M. Girvan, Z. Lu, and E. Ott, "Model-free prediction of large spatiotemporally chaotic systems from data: A reservoir computing approach," Phys. Rev. Lett. 120, 024102 (2018).
45. P. R. Vlachas, W. Byeon, Z. Y. Wan, T. P. Sapsis, and P. Koumoutsakos, "Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks," Proc. R. Soc. A 474, 20170844 (2018).
46. M. Raissi, P. Perdikaris, and G. Karniadakis, "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations," J. Comput. Phys. 378, 686–707 (2019).
47. M. Raissi, P. Perdikaris, and G. E. Karniadakis, "Machine learning of linear differential equations using Gaussian processes," J. Comput. Phys. 348, 683–693 (2017).
48. P. Benner, S. Gugercin, and K. Willcox, "A survey of projection-based model reduction methods for parametric dynamical systems," SIAM Rev. 57, 483–531 (2015).
49. B. Peherstorfer and K. Willcox, "Data-driven operator inference for nonintrusive projection-based model reduction," Comput. Methods Appl. Mech. Eng. 306, 196–215 (2016).
50. E. Qian, B. Kramer, B. Peherstorfer, and K. Willcox, "Lift & Learn: Physics-informed machine learning for large-scale nonlinear dynamical systems," Physica D 406, 132401 (2020).
51. J. Bongard and H. Lipson, "Automated reverse engineering of nonlinear dynamical systems," Proc. Natl. Acad. Sci. 104, 9943–9948 (2007).
52. M. Schmidt and H. Lipson, "Distilling free-form natural laws from experimental data," Science 324, 81–85 (2009).
53. S.-M. Udrescu and M. Tegmark, "AI Feynman: A physics-inspired method for symbolic regression," Sci. Adv. 6, eaay2631 (2020).
54. B. Thakur, A. Sen, and N. Chaubey, "Data driven discovery of a model equation for anode-glow oscillations in a low pressure plasma discharge," Phys. Plasmas 29, 042112 (2022).
55. M. Dam, M. Brøns, J. Juul Rasmussen, V. Naulin, and J. S. Hesthaven, "Sparse identification of a predator-prey system from simulation data of a convection model," Phys. Plasmas 24, 022310 (2017).
56. A. A. Kaptanoglu, K. D. Morgan, C. J. Hansen, and S. L. Brunton, "Physics-constrained, low-dimensional models for magnetohydrodynamics: First-principles and data-driven approaches," Phys. Rev. E 104, 015206 (2021).
57. E. P. Alves and F. Fiuza, "Data-driven discovery of reduced plasma physics models from fully kinetic simulations," Phys. Rev. Res. 4, 033192 (2022).
58. J. Donaghy and K. Germaschewski, "In search of a data driven symbolic multi-fluid closure," arXiv:2207.06241 (2022).
59. S. Thévenin, N. Valade, B.-J. Gréa, G. Kluth, and O. Soulard, "Modeling compressed turbulent plasma with rapid viscosity variations," Phys. Plasmas 29, 112310 (2022).
60. I. Abramovic, E. Alves, and M. Greenwald, "Data-driven model discovery for plasma turbulence modelling," J. Plasma Phys. 88, 895880604 (2022).
61. J. D. Lore, S. De Pascuale, M. P. Laiu, B. Russo, J.-S. Park, J. M. Park, S. Brunton, J. N. Kutz, and A. A. Kaptanoglu, "Time-dependent SOLPS-ITER simulations of the tokamak plasma boundary for model predictive control using SINDy," Nucl. Fusion 63, 046015 (2023).
62. S. Pan and K. Duraisamy, "Physics-informed probabilistic learning of linear embeddings of nonlinear dynamics with guaranteed stability," SIAM J. Appl. Dyn. Syst. 19, 480–509 (2020).
63. B. Kramer, "Stability domains for quadratic-bilinear reduced-order models," SIAM J. Appl. Dyn. Syst. 20, 981–996 (2021).
64. N. Sawant, B. Kramer, and B. Peherstorfer, "Physics-informed regularization and structure preservation for learning stable reduced models from data with operator inference," Comput. Methods Appl. Mech. Eng. 404, 115836 (2023).
65. A. A. Kaptanoglu, J. L. Callaham, A. Aravkin, C. J. Hansen, and S. L. Brunton, "Promoting global stability in data-driven models of quadratic nonlinear dynamics," Phys. Rev. Fluids 6, 094401 (2021).
66. E. N. Lorenz, "Deterministic nonperiodic flow," J. Atmos. Sci. 20, 130–141 (1963).
67. J. Bakarji, K. Champion, J. N. Kutz, and S. L. Brunton, "Discovering governing equations from partial measurements with deep delay autoencoders," arXiv:2201.05136 (2022).
68. H. Schaeffer, "Learning partial differential equations via data discovery and sparse optimization," Proc. R. Soc. A 473, 20160446 (2017).
69. A. Sandoz, V. Ducret, G. A. Gottwald, G. Vilmart, and K. Perron, "SINDy for delay-differential equations: Application to model bacterial zinc response," Proc. R. Soc. A 479, 20220556 (2023).
70. A. Klimovskaia, S. Ganscha, and M. Claassen, "Sparse regression based structure learning of stochastic reaction networks from single cell snapshot time series," PLoS Comput. Biol. 12, e1005234 (2016).
71. D. B. Brückner, P. Ronceray, and C. P. Broedersz, "Inferring the dynamics of underdamped stochastic systems," Phys. Rev. Lett. 125, 058103 (2020).
72. M. Dai, T. Gao, Y. Lu, Y. Zheng, and J. Duan, "Detecting the maximum likelihood transition path from data of stochastic dynamical systems," Chaos 30, 113124 (2020).
73. J. L. Callaham, J.-C. Loiseau, G. Rigas, and S. L. Brunton, "Nonlinear stochastic modelling with Langevin regression," Proc. R. Soc. A 477, 20210092 (2021).
74. E. Kaiser, J. N. Kutz, and S. L. Brunton, "Discovering conservation laws from data for control," in IEEE Conference on Decision and Control (CDC) (IEEE, 2018), pp. 6415–6421.
75. U. Fasel, E. Kaiser, J. N. Kutz, B. W. Brunton, and S. L. Brunton, "SINDy with control: A tutorial," in 60th IEEE Conference on Decision and Control (CDC) (IEEE, 2021), pp. 16–21.
76. N. M. Mangan, S. L. Brunton, J. L. Proctor, and J. N. Kutz, "Inferring biological networks by sparse identification of nonlinear dynamics," IEEE Trans. Mol. Biol. Multi-Scale Commun. 2, 52–63 (2016).
77. K. Kaheman, J. N. Kutz, and S. L. Brunton, "SINDy-PI: A robust algorithm for parallel implicit sparse identification of nonlinear dynamics," Proc. R. Soc. A 476, 20200279 (2020).
78. N. M. Mangan, T. Askham, S. L. Brunton, J. N. Kutz, and J. L. Proctor, "Model selection for hybrid dynamical systems via sparse regression," Proc. R. Soc. A 475, 20180534 (2019).
79. G. Thiele, A. Fey, D. Sommer, and J. Krüger, "System identification of a hysteresis-controlled pump system using SINDy," in 24th International Conference on System Theory, Control and Computing (ICSTCC) (IEEE, 2020), pp. 457–464.
80. J.-C. Loiseau and S. L. Brunton, "Constrained sparse Galerkin regression," J. Fluid Mech. 838, 42–67 (2018).
81. N. M. Mangan, J. N. Kutz, S. L. Brunton, and J. L. Proctor, "Model selection for dynamical systems via sparse regression and information criteria," Proc. R. Soc. A 473, 20170009 (2017).
82. X. Dong, Y.-L. Bai, Y. Lu, and M. Fan, "An improved sparse identification of nonlinear dynamics with Akaike information criterion and group sparsity," Nonlinear Dyn. 111, 1485 (2023).
83. G. Tran and R. Ward, "Exact recovery of chaotic systems from highly corrupted data," Multiscale Model. Simul. 15, 1108–1129 (2017).
84. H. Schaeffer, G. Tran, and R. Ward, "Extracting sparse high-dimensional dynamics from limited data," SIAM J. Appl. Math. 78, 3279–3295 (2018).
85. C. B. Delahunt and J. N. Kutz, "A toolkit for data-driven discovery of governing equations in high-noise regimes," IEEE Access 10, 31210–31234 (2022).
86. J. Wentz and A. Doostan, "Derivative-based SINDy (DSINDy): Addressing the challenge of discovering governing equations from noisy data," arXiv:2211.05918 (2022).
87. A. Somacal, Y. Barrera, L. Boechi, M. Jonckheere, V. Lefieux, D. Picard, and E. Smucler, "Uncovering differential equations from data with hidden variables," Phys. Rev. E 105, 054209 (2022).
88. P. Conti, G. Gobat, S. Fresca, A. Manzoni, and A. Frangi, "Reduced order modeling of parametrized systems through autoencoders and SINDy approach: Continuation of periodic solutions," arXiv:2211.06786 (2022).
89. L. Gao and J. N. Kutz, "Bayesian autoencoders for data-driven discovery of coordinates, governing equations and fundamental constants," arXiv:2211.10575 (2022).
90. Z. Zhao and Q. Li, "Adaptive sampling methods for learning dynamical systems," in Mathematical and Scientific Machine Learning (PMLR, 2022), pp. 335–350.
91. K. Wu and D. Xiu, "Numerical aspects for approximating governing equations using data," J. Comput. Phys. 384, 200–221 (2019).
92. U. Fasel, J. N. Kutz, B. W. Brunton, and S. L. Brunton, "Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control," Proc. R. Soc. A 478, 20210904 (2022).
93. H. Schaeffer and S. G. McCalla, "Sparse model selection via integral terms," Phys. Rev. E 96, 023302 (2017).
94. P. A. Reinbold, D. R. Gurevich, and R. O. Grigoriev, "Using noisy or incomplete data to discover models of spatiotemporal dynamics," Phys. Rev. E 101, 010203 (2020).
95. D. A. Messenger and D. M. Bortz, "Weak SINDy for partial differential equations," J. Comput. Phys. 443, 110525 (2021).
96. P. A. Reinbold, L. M. Kageorge, M. F. Schatz, and R. O. Grigoriev, "Robust learning from noisy, incomplete, high-dimensional experimental data via physically constrained symbolic regression," Nat. Commun. 12, 3219 (2021).
97. B. Russo and M. P. Laiu, "Convergence of weak-SINDy surrogate models," arXiv:2209.15573 (2022).
98. D. A. Messenger and D. M. Bortz, "Asymptotic consistency of the WSINDy algorithm in the limit of continuum data," arXiv:2211.16000 (2022).
99. P. Gelß, S. Klus, J. Eisert, and C. Schütte, "Multidimensional approximation of nonlinear dynamical systems," J. Comput. Nonlinear Dyn. 14, 061006 (2019).
100. A. Goeßmann, M. Götte, I. Roth, R. Sweke, G. Kutyniok, and J. Eisert, "Tensor network approaches for learning non-linear dynamical laws," arXiv:2002.12388 (2020).
101. L. Boninsegna, F. Nüske, and C. Clementi, "Sparse learning of stochastic dynamical equations," J. Chem. Phys. 148, 241723 (2018).
102. B. de Silva, K. Champion, M. Quade, J.-C. Loiseau, J. N. Kutz, and S. Brunton, "PySINDy: A Python package for the sparse identification of nonlinear dynamical systems from data," J. Open Source Software 5, 2104 (2020).
103. A. A. Kaptanoglu, B. M. de Silva, U. Fasel, K. Kaheman, A. J. Goldschmidt, J. Callaham, C. B. Delahunt, Z. G. Nicolaou, K. Champion, J.-C. Loiseau, J. N. Kutz, and S. L. Brunton, "PySINDy: A comprehensive Python package for robust sparse system identification," J. Open Source Software 7, 3994 (2022).
104. A. A. Kaptanoglu, L. Zhang, Z. G. Nicolaou, U. Fasel, and S. L. Brunton, "Benchmarking sparse system identification with low-dimensional chaos," arXiv:2302.10787 (2023).
105. K. Manohar, B. W. Brunton, J. N. Kutz, and S. L. Brunton, "Data-driven sparse sensor placement for reconstruction: Demonstrating the benefits of exploiting known patterns," IEEE Control Syst. Mag. 38, 63–86 (2018).
106. K. Manohar, J. N. Kutz, and S. L. Brunton, "Optimal sensor and actuator selection using balanced model reduction," IEEE Trans. Autom. Control 67, 2108–2115 (2021).
107. C. J. Schrijver, K. Kauristie, A. D. Aylward, C. M. Denardini, S. E. Gibson, A. Glover, N. Gopalswamy, M. Grande, M. Hapgood, D. Heynderickx et al., "Understanding space weather to shield society: A global road map for 2015–2025 commissioned by COSPAR and ILWS," Adv. Space Res. 55, 2745–2807 (2015).
108. E. Clark, T. Askham, S. L. Brunton, and J. N. Kutz, "Greedy sensor placement with cost constraints," IEEE Sens. J. 19, 2642–2656 (2018).
109. E. Clark, J. N. Kutz, and S. L. Brunton, "Sensor selection with cost constraints for dynamically relevant bases," IEEE Sens. J. 20, 11674–11687 (2020).
110. D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory 52, 1289–1306 (2006).
111. J. Levesque, N. Rath, D. Shiraki, S. Angelini, J. Bialek, P. Byrne, B. DeBono, P. Hughes, M. Mauel, G. Navratil et al., "Multimode observations and 3D magnetic control of the boundary of a tokamak plasma," Nucl. Fusion 53, 073037 (2013).
112. C. Hansen, B. Victor, K. Morgan, T. Jarboe, A. Hossack, G. Marklin, B. Nelson, and D. Sutherland, "Numerical studies and metric development for validation of magnetohydrodynamic models on the HIT-SI experiment," Phys. Plasmas 22, 056105 (2015).
113. Y. Huang, S. Jiang, H. Li, Q. Wang, and L. Chen, "Compressive analysis applied to radiation symmetry evaluation and optimization for laser-driven inertial confinement fusion," Comput. Phys. Commun. 185, 459–471 (2014).
114. Y. J. Fan and C. Kamath, "A comparison of compressed sensing and sparse recovery algorithms applied to simulation data," Stat., Optim. Inf. Comput. 4, 194–213 (2016).
115. N. Xia, Y. Huang, H. Li, P. Li, K. Wang, and F. Wang, "A novel recovery method of soft x-ray spectrum unfolding based on compressive sensing," Sensors 18, 3725 (2018).
116. M. C. Cheung, B. De Pontieu, J. Martínez-Sykora, P. Testa, A. R. Winebarger, A. Daw, V. Hansteen, P. Antolin, T. D. Tarbell, J.-P. Wuelser et al., "Multi-component decomposition of astronomical spectra by compressed sensing," Astrophys. J. 882, 13 (2019).
117. M. K. Georgoulis, D. S. Bloomfield, M. Piana, A. M. Massone, M. Soldati, P. T. Gallagher, E. Pariat, N. Vilmer, E. Buchlin, F. Baudin et al., "The flare likelihood and region eruption forecasting (FLARECAST) project: Flare forecasting in the big data & machine learning era," J. Space Weather Space Clim. 11, 39 (2021).
118. S. E. Otto and C. W. Rowley, "Inadequacy of linear methods for minimal sensor placement and feature selection in nonlinear systems: A new approach using secants," J. Nonlinear Sci. 32, 69 (2022).
119. M. Chilenski, M. Greenwald, Y. Marzouk, N. Howard, A. White, J. Rice, and J. Walk, "Improved profile fitting and quantification of uncertainty in experimental measurements of impurity transport coefficients using Gaussian process regression," Nucl. Fusion 55, 023012 (2015).
120. P. Rodriguez-Fernandez, N. Howard, and J. Candy, "Nonlinear gyrokinetic predictions of SPARC burning plasma profiles enabled by surrogate modeling," Nucl. Fusion 62, 076036 (2022).
121. E. Camporeale, A. Carè, and J. E. Borovsky, "Classification of solar wind with machine learning," J. Geophys. Res.: Space Phys. 122, 10,910–10,920 (2017).
122. R. Shalloo, S. Dann, J.-N. Gruse, C. Underwood, A. Antoine, C. Arran, M. Backhouse, C. Baird, M. Balcazar, N. Bourgeois et al., "Automation and control of laser wakefield accelerators using Bayesian optimization," Nat. Commun. 11, 6355 (2020).
123. E. Schulz, M. Speekenbrink, and A. Krause, "A tutorial on Gaussian process regression: Modelling, exploring, and exploiting functions," J. Math. Psychol. 85, 1–16 (2018).
124. J. Leddy, S. Madireddy, E. Howell, and S. Kruger, "Single Gaussian process method for arbitrary tokamak regimes with a statistical analysis," Plasma Phys. Controlled Fusion 64, 104005 (2022).
125. I. A. Almosallam, M. J. Jarvis, and S. J. Roberts, "GPz: Non-stationary sparse Gaussian processes for heteroscedastic uncertainty estimation in photometric redshifts," Mon. Not. R. Astron. Soc. 462, 726–739 (2016).
126. P. Hatfield, S. Rose, R. Scott, I. Almosallam, S. Roberts, and M. Jarvis, "Using sparse Gaussian processes for predicting robust inertial confinement fusion implosion yields," IEEE Trans. Plasma Sci. 48, 14–21 (2019).
127. T. Park and G. Casella, "The Bayesian Lasso," J. Am. Stat. Assoc. 103, 681–686 (2008).
128. C. Guillemaut, M. Lennholm, J. Harrison, I. Carvalho, D. Valcarcel, R. Felton, S. Griph, C. Hogben, R. Lucock, G. F. Matthews, C. P. Von Thun, R. A. Pitts, and S. Wiesen, "Real-time control of divertor detachment in H-mode with impurity seeding using Langmuir probe feedback in JET-ITER-like wall," Plasma Phys. Controlled Fusion 59, 045001 (2017).
129. D. Eldon, H. Wang, L. Wang, J. Barr, S. Ding, A. Garofalo, X. Gong, H. Guo, A. Järvinen, K. Li, J. McClenaghan, A. McLean, C. Samuell, J. Watkins, D. Weisberg, and Q. Yuan, "An analysis of controlled detachment by seeding various impurity species in high performance scenarios on DIII-D and EAST," Nucl. Mater. Energy 27, 100963 (2021).
130. T. Ravensbergen, M. van Berkel, A. Perek, C. Galperti, B. P. Duval, O. Février, R. J. R. van Kampen, F. Felici, J. T. Lammers, C. Theiler, J. Schoukens, B. Linehan, M. Komm, S. Henderson, D. Brida, and M. R. de Baar, "Real-time feedback control of the impurity emission front in tokamak divertor plasmas," Nat. Commun. 12, 1105 (2021).
131. X. Bonnin, W. Dekeyser, R. Pitts, and D. Coster, "Presentation of the new SOLPS-ITER code package for tokamak plasma edge modelling," Plasma Fusion Res. 11, 1403102 (2016).
132. M. Drevlak, C. Beidler, J. Geiger, P. Helander, and Y. Turkin, "Optimisation of stellarator equilibria with ROSE," Nucl. Fusion 59, 016010 (2018).
133. S. P. Hirshman, D. A. Spong, J. C. Whitson, V. E. Lynch, D. B. Batchelor, B. A. Carreras, and J. A. Rome, "Transport optimization and MHD stability of a small aspect ratio toroidal hybrid stellarator," Phys. Rev. Lett. 80, 528 (1998).
134. M. Landreman, B. Medasani, F. Wechsung, A. Giuliani, R. Jorge, and C. Zhu, "SIMSOPT: A flexible framework for stellarator optimization," J. Open Source Software 6, 3525 (2021).
135. M. Landreman, "An improved current potential method for fast computation of stellarator coil shapes," Nucl. Fusion 57, 046003 (2017).
136. C. Zhu, S. R. Hudson, Y. Song, and Y. Wan, "New method to design stellarator coils without the winding surface," Nucl. Fusion 58, 016008 (2017).
137. E. Paul, M. Landreman, A. Bader, and W. Dorland, "An adjoint method for gradient-based optimization of stellarator coil shapes," Nucl. Fusion 58, 076015 (2018).
138. A. A. Kaptanoglu, T. Qian, F. Wechsung, and M. Landreman, "Permanent-magnet optimization for stellarators as sparse regression," Phys. Rev. Appl. 18, 044006 (2022).
139. A. A. Kaptanoglu, R. Conlin, and M. Landreman, "Greedy permanent magnet optimization," Nucl. Fusion 63, 036016 (2023).
140. M. Landreman and E. Paul, "Magnetic fields with precise quasisymmetry for plasma confinement," Phys. Rev. Lett. 128, 035001 (2022).
141. T. Qian, M. C. Zarnstorff, D. Bishop, A. Chambliss, A. Dominguez, C. Pagano, D. Patch, and C. Zhu, "Simpler optimized stellarators using permanent magnets," Nucl. Fusion 62, 084001 (2022).
142. G. Neilson, T. Brown, D. Gates, K. Lu, M. Zarnstorff, A. Boozer, J. Harris, O. Meneghini, H. Mynick, N. Pomphrey et al., "Progress toward attractive stellarators," Report No. PPPL-4589 (Princeton Plasma Physics Lab., Princeton, NJ, 2011).
143. M. Landreman, B. Medasani, and C. Zhu, "Stellarator optimization for good magnetic surfaces at the same time as quasisymmetry," Phys. Plasmas 28, 092505 (2021).