This work extends the adjoint-deep learning framework for runaway electron (RE) evolution, developed by McDevitt et al. [Phys. Plasmas 32, 042503 (2025)], to account for large-angle collisions. By incorporating large-angle collisions, the framework allows the avalanche of REs to be captured, an essential component of RE dynamics. This extension is accomplished by using a Rosenbluth–Putvinski approximation to estimate the distribution of secondary electrons generated by large-angle collisions. By evolving both the primary and multiple generations of secondary electrons, the present formulation can capture both the detailed temporal evolution of a RE population beginning from an arbitrary initial momentum space distribution, along with providing approximations to the saturated growth and decay rates of the RE population. Predictions of the adjoint-deep learning framework are verified against a traditional RE solver, with good agreement present across a broad range of parameters.

The description of runaway electron (RE) formation and evolution in magnetized fusion devices poses a substantial scientific challenge. While reduced analytic,2–6 semi-analytic,7–9 or machine learning10–13 models are often used to infer RE generation rates, threshold electric fields, or the number of seed REs, such models are unable to recover the detailed dynamics of RE evolution. Instead, the description of RE dynamics is often performed using continuum14–18 or particle-based RE solvers.19–22 In Paper I,1 the decay rate of a population of primary electrons was evaluated using a parametric solution to the adjoint of the relativistic Fokker–Planck equation. By targeting the adjoint to the relativistic Fokker–Planck equation, a single solution to this equation for a given set of parameters allowed the RE density to be evolved forward in time beginning from an arbitrary initial momentum space distribution of electrons. The framework was further generalized by utilizing a physics-informed neural network (PINN) to identify a parametric solution to the adjoint of the relativistic Fokker–Planck equation. Hence, once trained, the combined adjoint-deep learning framework enabled the dynamics of the RE density to be inferred beginning from an arbitrary initial momentum space distribution for a range of physics parameters. While the offline training time of the PINN was substantially greater than the solution time of a traditional RE solver, its online inference time was orders of magnitude faster, suggesting its potential as a surrogate model for RE evolution in integrated descriptions of tokamak disruptions.

Reference 1, however, did not describe the exponential growth of the RE population due to the avalanche mechanism,23 an essential component when describing the conversion of Ohmic to RE current in tokamak devices.24,25 The present paper seeks to remove this limitation by extending the adjoint-deep learning framework to incorporate large-angle collisions, thus allowing for the amplification of an initial seed population. A second generalization of the framework will be achieved by training across a broader range of physics parameters compared to Ref. 1. This is accomplished by introducing a normalized time and energy coordinate that allows features of the runaway probability function (RPF) to be accurately tracked while using a comparable number of training points as the narrow parameter range employed in Ref. 1. The derived adjoint-deep learning approach will thus provide an efficient framework through which RE dynamics can be accurately described.

The present paper is not the first to evaluate the avalanche growth rate of REs using an adjoint approach,26 nor the first to use a PINN for evaluating the avalanche growth rate.13 With regard to Ref. 13, that paper employed a PINN to solve the steady state adjoint of the relativistic Fokker–Planck equation. Using that solution, the saturated avalanche growth rate of the RE population was inferred. A limitation of Ref. 13, however, was that it computed the rate that secondary electrons are generated and run away, without accounting for the decay of the primary distribution. Thus, when below the threshold, Ref. 13 simply predicted a negligibly small avalanche growth, without accounting for the decay of the overall RE population. This resulted in a slight ambiguity in the determination of the RE avalanche threshold, together with the inability to correctly describe RE decay when below the threshold. Furthermore, Ref. 13 focused on the saturated growth rate of REs, and thus did not treat the transient evolution of the RE population. Results from the present model will fill these gaps by providing an efficient means of evaluating the temporal evolution of the RE density both above and below the threshold. The near and below threshold limits are of particular importance during the RE plateau, where accounting for the decay of the RE distribution is essential for accurately predicting the dissipation of the RE beam. While analytic approximations to this rate are available, see for example, Ref. 4, such formulas are highly inaccurate when below the threshold.1 When carrying out these extensions, we will consider the limit of a completely screened plasma for a constant electric field. A more comprehensive treatment of REs during a disruption will require including a temporally varying electric field and partial screening corrections,27 which will be left to a future work.

The remainder of this paper is organized as follows. Section II discusses an extension of the adjoint-deep learning formulation of the relativistic Fokker–Planck equation to include a secondary source term. Physics-informed neural networks are briefly described in Sec. III. The temporal evolution of the number density of an avalanching RE population both above and below the threshold is given in Sec. IV. Section V evaluates the avalanche growth and decay rates across a broad range of plasma conditions and compares with a traditional RE solver. A brief discussion along with conclusions are given in Sec. VI.

This section will provide a brief description of an adjoint formulation for evolving the RE density beginning from an arbitrary initial momentum space distribution. A more detailed description is given in Ref. 1. The adjoint of the relativistic Fokker–Planck equation can be written as11,26,28,29
(1)
(1a)
with the adjoint operators defined by
(1b)
(1c)
(1d)
Here, we have defined the electron's pitch by ξp||/p[1,1], time is normalized as tt/τc, with τc4πϵ02me2c3/(e4nelnΛ) and Coulomb logarithm lnΛ, momentum as pp/(mec), the electric field as E||E||/Ec, where Ecmec/(eτc) is the Connor–Hastie electric field,3 the Lorentz factor is defined as γ=1+p2, ατc/τs, and τs6πϵ0me3c3/(e4B2) is the timescale associated with synchrotron radiation, and since Eq. (1) does not include toroidal geometry, magnetic trapping effects on RE generation are not included. The coefficients for the collisional drag CF and pitch-angle scattering CB in Eq. (1c) are defined by
(2a)
(2b)
An adjoint problem for the runaway probability function (RPF) P can be defined by enforcing the terminal condition and momentum boundary conditions,
(3)
(3a)
(3b)
(3c)
where Θ(x) is a Heaviside function, and Up is defined by
(4)
Here, pRE indicates the momentum above which an electron will be counted as a RE, whose specific value will be discussed below. Once the adjoint problem defined by Eq. (1) with the terminal and boundary conditions defined by Eq. (3) has been solved, the RE density at a time between 0ttfinal is given by1 
(5)
where τtfinalt. This latter formulation will allow the time evolution of the density moment of an arbitrary initial momentum space distribution of REs fe(init)(p,ξ) to be projected to any time between 0 and tfinal. It does not, however, include large-angle collisions, which are essential when describing RE populations for the large electric fields expected in tokamak disruptions. The extension to include large-angle collisions is discussed in Sec. II B as follows.
Once the time evolution of the RE density is inferred, this can be used to evaluate the rate that a given seed population of REs will be enhanced by the avalanche mechanism. From Eq. (5), the evolution of the RE seed is given by
(6)
This seed population will generate additional REs via large-angle collisions. By introducing a temporal grid from t=0 to t=tfinal, in increments of Δtavtfinal/Nav, where Nav is the number of avalanche time steps, the distribution of secondary electrons after one time step can be approximated by
(7)
where ti=iΔtav, fsec(1) indicates the first generation of secondary electrons, and S(p,ξ) is the source of secondary electrons defined by
(8)
Here, fe(p,ξ,t1) is the seed electron population evaluated after one time step, and S0(p,ξ,p,ξ) is given by
(9)
where re=e2/(4πϵ0mec2) is the classical electron radius, dσM/dp is the Møller cross section,30,31 Π(p,ξ,p;ξ) describes the pitch-angle dependence of secondary electron generation (see Ref. 32 for an explicit expression), d3p=2πp2dpdξ, all variables have been dedimensionalized according to pp/mec, vv/c, and σMσM/re2. Since the adjoint formulation does not immediately give us the time evolution of the seed electron distribution fe(p,ξ,t), only its density moment, we will need to introduce a closure. The simplest closure available, introduced in Ref. 4, takes the limit whereby the energy of the primary electrons is assumed to be infinite p, and the pitch is taken to be aligned with the magnetic field ξ=1. Taking these limits, Eq. (8) reduces to
(10)
where ξ1 is defined by
(11)
Substituting Eq. (10) into (7) yields an explicit expression for the momentum space distribution of the first generation of secondaries, i.e.,
(12)
where the RE density at t=t1=Δtav will be given by the number of seed electrons nRE(t1)=nseed(t1), which can be evaluated from Eq. (6). The evolution of the number density of the first generation of secondary REs can then be expressed as
(13)
where the RPF is now evaluated at a time τ+t1=tfinal(tt1), since the first generation of secondaries is born at t=t1=Δtav. Substituting Eq. (12) into (13) yields an explicit expression for the time evolution of the number density of the first generation of secondary electrons, i.e.,
(14)
Similarly, the momentum space distribution of the second generation of secondaries is given by
(15)
where nRE(t2)=nseed(t2)+nsec(1)(t2). An explicit expression for the next generation of secondary electrons can then be written as
(16)
where we have used Eq. (15) in the second line. The ith generation of secondaries is then given by
(17)
(17a)
The density of REs is
(17b)
By evaluating each successive generation of secondaries, together with the initial evolution of the seed RE population, this will enable the time history of an avalanching RE distribution to be determined. We note that while the present adjoint formulation only predicts the number density of REs, the RPF incorporates fully kinetic physics into the evolution of the RE population. The main approximation is due to the use of a Rosenbluth–Putvinski source [Eq. (10)], which does not account for the energy or pitch distribution of the RE population.
While a range of approaches to the numerical solution of Eq. (1a) could be applied, we will utilize a PINN33,34 in the present manuscript. In so doing, this will allow us to identify the parametric solution to Eq. (1a) such that once trained, we will be able to evaluate the RPF for a range of physics parameters. The specifics of our implementation of a PINN can be found in Ref. 1, however, here, we will provide the essential formulas. In order to enforce several properties of the solution as hard constraints, we will introduce a physics layer of the form
(18)
where λ is a vector containing the physics parameters (E||,Zeff,α), and the terminal condition is defined by
(19)
where the final RPF is computed by passing Eq. (18) through
(20)
Here, PNN represents the output of the hidden layers of the neural network (NN), Δp/pmax and ΔP are hyperparameters whose value should be set to be much less than one, and λ is a vector representing the physics parameters, which are given by λ=(E||,Zeff,α). It can be verified that for small Δp/pmax and ΔP, the physics layer enforces (i) the terminal condition given by Eq. (3a), (ii) lower momentum boundary condition [Eq. (3b)], and (iii) constrains the RPF to have a range between zero and one. With this physics layer, only the residual of the PDE and the boundary condition on the upper momentum boundary will be included in the loss function, such that we will minimize a loss of the form
(21)
where
(22)
where Nbdy is the number of points on the upper momentum boundary that will be sampled, and R is the residual of Eq. (1a). The role of the factors G(p) and pi2/(1+pi2) are to remove large contributions to the residual that emerge at either high or low momenta boundaries, as discussed further in Ref. 1. The Python script used for training the PINN is written using the DeepXDE library,35 with TensorFlow36 as the backend, and will be made available upon acceptance at https://github.com/cmcdevitt2/RunAwayPINNs.
This section will utilize the algorithm described in Sec. II B to describe the temporal evolution of a RE population including large-angle collisions. We will assume a seed electron population defined by
(23)
where the evolution of the number density of seed electrons will be evolved using Eq. (6), with each generation of secondary electrons described by Eq. (17). The momentum above which an electron will be counted as a RE is taken to be pRE=pmax/4, which corresponds to an energy of roughly 1 MeV. While the use of a Rosenbluth–Putvinski secondary source term will result in a weak dependence on the choice of pRE, such that modestly improved agreement with first-principle Monte Carlo simulations could be achieved by tuning this parameter, we have opted not to do this since such tuning may be parameter dependent. We have set the hyperparameters to ΔP=0.15, Δp=0.1pmax, Δpmax=0.05pmax, and wPDE=10 when training the model. The resulting loss history for the PINN is shown in Fig. 1. Here, we have trained the PINN for electric fields in the range E||(0,3), effective charges Zeff(1,5), and synchrotron radiation α(0,0.2), a somewhat broader range of parameters compared to Ref. 1, though at the cost of a much larger number of training points. A residual-based adaptive sampling scheme is employed,37 leading to periodic spikes in the training loss.
FIG. 1.

Training and test loss history. The solid blue curve indicates the training loss of the PDE, the solid red curve indicates the training loss for the boundary condition, and the “×” markers indicate the test loss (the test and training losses are the same for the boundary term). 3 000 000 training and testing points are used. The training points initially obey a Hammersley sequence, with the test distribution being uniform random. The training points are resampled using the residual-based adaptive distribution described in Ref. 37, every 50 000 epochs of L-BFGS.

FIG. 1.

Training and test loss history. The solid blue curve indicates the training loss of the PDE, the solid red curve indicates the training loss for the boundary condition, and the “×” markers indicate the test loss (the test and training losses are the same for the boundary term). 3 000 000 training and testing points are used. The training points initially obey a Hammersley sequence, with the test distribution being uniform random. The training points are resampled using the residual-based adaptive distribution described in Ref. 37, every 50 000 epochs of L-BFGS.

Close modal

Two example time histories are shown in Fig. 2. Here, the time evolution of the total number of REs (solid blue curve), initial seed population (dashed black curve), and several generations of secondary REs (multicolored dashed curves) are shown. For panel (a) of Fig. 2, the electric field was chosen to be above the threshold (E||=3), with the total number of REs increasing exponentially (solid blue curve). Considering each generation, the seed electron population is normalized to have a value of one at t=0, where its value remains largely unchanged since E||>Eav (Eav1.8 for these parameters). For each avalanche time step, a small number of secondary electrons are born, where each generation grows rapidly until plateauing later in time. The initial rapid increase in the secondary electron population is due to the energy and pitch distribution of secondaries immediately after they are born obeying Eq. (10), where the majority of these electrons will have energies below the 1 MeV threshold energy for an electron to be counted as a RE. For cases above the threshold, a large number of the secondary electrons will be quickly accelerated above 1 MeV, resulting in an increase in the total number of REs. This process is repeated for each generation of secondaries, with each generation having a larger magnitude due to the number of REs increasing according to Eq. (17b). Summing together the number of primary electrons with all generations of secondary electrons results in an exponentially increasing number of REs, as indicated by the solid blue curve in Fig. 2. Turning to Fig. 2(b), here, the electric field was chosen to be below the threshold (E||=1.5), with the total number of REs initially increasing slightly before decaying slowly in time. Considering first the seed runaway population, the number of seed electrons initially remains approximately constant, but then begins to decay at a later time. The finite number of seed electrons leads to the generation of secondary electrons, and hence an initial increase in the total number of REs. However, since each secondary generation decays exponentially in time, this does not lead to the long-term growth of the overall population. At late times, the total number of REs decays exponentially at a rate that is similar, though not identical, to the decay of the seed RE population.

FIG. 2.

Runaway electron evolution over time for two different values of the electric field. The number of total REs is shown by the blue curves; the seed population is indicated by the black dashed curve; and each generation of secondaries are indicated by the multicolor dashed curves. Panel (a) is for E||=3 (above marginality), whereas panel (b) is for E||=1.5 (below marginality). The other parameters are given by Zeff=1, α=0.1, and lnΛ=15, with 19 generations of secondaries included (i.e., 20 time steps). The seed electron distribution was taken to be Eq. (23) with (p0=3pmax/4,Δp=pmax/10,ξ0=1,Δξ=0.1).

FIG. 2.

Runaway electron evolution over time for two different values of the electric field. The number of total REs is shown by the blue curves; the seed population is indicated by the black dashed curve; and each generation of secondaries are indicated by the multicolor dashed curves. Panel (a) is for E||=3 (above marginality), whereas panel (b) is for E||=1.5 (below marginality). The other parameters are given by Zeff=1, α=0.1, and lnΛ=15, with 19 generations of secondaries included (i.e., 20 time steps). The seed electron distribution was taken to be Eq. (23) with (p0=3pmax/4,Δp=pmax/10,ξ0=1,Δξ=0.1).

Close modal

The RE density histories for several different electric fields are shown in Fig. 3, where the PINN's predictions (blue curves) are compared with the RE solver RAMc (black curves). RAMc is a particle-based RE solver21 that evolves guiding center orbits together with small and large angle collisions, as well as synchrotron radiation. For the present case, the electrons were initialized on the magnetic axis for a large ITER-like device, such that neoclassical effects arising from magnetic trapping4,38–40 along with spatial transport41 are negligible. This will allow for comparisons with the present formulation, which does not account for toroidal geometry or spatial transport. Furthermore, RAMc uses a Møller source [Eqs. (8) and (9)] to describe secondary electron generation, a higher physics fidelity treatment of large-angle collisions compared to the Rosenbluth–Putvinski source used in the adjoint-deep learning approach. The present comparison will thus test the accuracy of the adjoint-deep learning framework along with regimes where a Rosenbluth–Putvinski source provides an adequate approximation to the large-angle collision operator. In all cases, the electron distribution was initialized to be a block of particles at high energies with pitch near ξ1 [see Fig. 3(f)] to be consistent with the initial RE distribution implemented in the RAMc code. From Figs. 3(a) and 3(b), it is evident that when well above the threshold, the two approaches are in good agreement. However, when approaching the threshold [panels (c) and (d) of Fig. 3], more substantial differences between the two approaches are evident, with the PINN systematically overpredicting the RE density. The cause of this discrepancy can be linked to the Rosenbluth–Putvinski secondary source, which assumes the primary electrons to have asymptotically large energy. While very high energy primary electrons will be present when the system is above marginality, no such electrons are expected to be present when at or below marginality. As a result, the Rosenbluth–Putvinski source will prove inaccurate when near the threshold. Comparing the time evolution of the primary electron population [dashed curves in Figs. 3(d) and 3(e)] with the result from RAMc, it is evident that the decay rate of the primary electron population is in better agreement with the results of RAMc. The reason for this is that RAMc employs a Møller secondary source term that accounts for the energy and pitch of the primary electrons. When below the threshold, the energy of the primary distribution will be modest, such that the secondary electrons generated by the Møller source will have little impact on the RE number density, with the majority of them being born with too low an energy to run away. Thus, as the system drops further below marginality, neglecting the Rosenbluth–Putvinski source in the PINN leads to an increasingly good approximation to the RE density evolution.

FIG. 3.

Comparison of RE density evolution with the Monte Carlo code RAMc for electric fields E||=3 [panel (a)], E||=2.5 [panel (b)], E||=2 [panel (c)], E||=1.5 [panel (d)], and E||=1 [panel (e)]. The initial electron distribution assumed for both RAMc and the PINN is shown in panel (f), where a rectangular block of electrons were inserted. The other parameters were taken to be Zeff=1, α=0.1, and lnΛ=15.

FIG. 3.

Comparison of RE density evolution with the Monte Carlo code RAMc for electric fields E||=3 [panel (a)], E||=2.5 [panel (b)], E||=2 [panel (c)], E||=1.5 [panel (d)], and E||=1 [panel (e)]. The initial electron distribution assumed for both RAMc and the PINN is shown in panel (f), where a rectangular block of electrons were inserted. The other parameters were taken to be Zeff=1, α=0.1, and lnΛ=15.

Close modal
Our aim in the present section will be to train the PINN across a broader range of physics parameters. The treatment of a broader range of parameters will pose a substantial challenge to training an accurate PINN. Specifically, as the size of the parameter space is increased, the number of training points required to densely sample the space will grow substantially, thus requiring a longer period of offline training along with increasing memory usage. In addition, the location of the transition region of the RPF (i.e., where the RPF transitions between zero and one) and the time to reach a steady state solution will traverse a broad range of energy and time scales, further complicating capturing the RPF with a single PINN. To accommodate both of these challenges, we will introduce a normalized time and energy coordinate that will allow the PINN to adapt the time period and the energy range that must be captured in order to accurately infer the RPF evolution. Such an adaptive approach will allow us to capture a broader range of physics parameters for a given number of training points. To normalize the time period over which the RPF is evolved, we will utilize the Rosenbluth–Putvinski growth rate as a rough estimate of the rate that the RE density changes, i.e.,4 
(24)
where the parallel electric field is normalized to the Connor–Hastie threshold Ec. The time period over which the RPF will be evolved, denoted by t¯final, will be taken to have the form
(25)
where Nav is the number of avalanche timescales that will be simulated and tfinal provides an upper bound on the time period of the simulation. It may be readily verified that near the threshold where tfinalγav/Nav1, Eq. (25) asymptotes to t¯finaltfinal, whereas when far from the threshold where tfinalγav/Nav1, t¯final asymptotes to t¯finalNav/γav. Equation (25) thus allows for relatively long simulations to be carried out when the system is near the threshold, but substantially shrinks the time domain for scenarios far above the threshold where the RPF quickly reaches a quasi-steady state (see Fig. 6 below).
The upper bound in the momentum domain will be chosen to be larger than the momentum, where the electric field acceleration and drag balance for an electron with ξ=1, which defines a critical momentum given by pcrit=1/E||1. To avoid the singularity at E||=1, we will approximate this critical momentum by pcrit1/E||+δ, where δ is a small value that removes the divergence at E||=0, taken to be δ=0.01 in the present study. The upper momentum bound will then be taken to have the form
(26)
where Np is a factor that sets how far above pcrit the high momentum boundary is set, and pmax limits the upper momentum boundary for the case of a weak electric field. For pmax2/(Nppcrit)21 (i.e., a weak electric field), the upper momentum boundary is approximately given by p¯maxpmax, whereas for a strong electric field pmax2/(Nppcrit)21, Eq. (26) asymptotes to p¯maxNppcrit. An upper momentum bound given by Eq. (26) thus allows a high momentum boundary of roughly pmax to be used when near or below marginality, but will substantially shrink the momentum domain for large electric fields where the transition region is located at low momenta.
Utilizing these adaptive time and momentum ranges, the loss history and final training point distribution are shown in Fig. 4. Here, we have trained the PINN to learn the RPF for electric fields in the range E||(0,10), Zeff(1,10), and α(0,0.2) for the hyperparameters ΔP=0.2, Δp¯=0.1pmax, Δpmax=0.05p¯max, and wPDE=10. The time and momentum domains are defined using Eqs. (24)–(26), with the values tmax=20, Nav=1/2, Np=10, and a pmax is chosen consistent with 10MeV. The low momentum boundary is taken to be consistent with 10keV, and pRE is set to be pRE=p¯max/4. We also slightly modified the loss function to
(27)
FIG. 4.

(a) Training and test loss history. The solid blue curve indicates the training loss of the PDE, the solid red curve indicates the training loss for the boundary condition, and the “×” markers indicate the test loss (the test and training losses are the same for the boundary term). 1 000 000 training and test points were used. (b) Final training point distribution.

FIG. 4.

(a) Training and test loss history. The solid blue curve indicates the training loss of the PDE, the solid red curve indicates the training loss for the boundary condition, and the “×” markers indicate the test loss (the test and training losses are the same for the boundary term). 1 000 000 training and test points were used. (b) Final training point distribution.

Close modal
where we have included the additional factor Eavg/(1+E||) in front of the residual, where Eavg=(Emin+Emax)/2, due to the magnitude of the residual of Eq. (1a) increasing with the electric field strength. Dividing by the electric field in the loss, this prevents the training of the PINN from being unduly biased toward cases with large electric fields. A residual-based adaptive training point sampling scheme37 has been adopted, where from Fig. 4(b), it is evident that the density of training points is highest near p/p¯max0.2 and at the high momentum boundary near ξ±1. The concentration of training points near p/p¯max0.2 is due to the value of pRE being chosen to be pRE/p¯max=0.25, whereas the concentration of points near the upper momentum boundary with ξ±1 often contains the maximum value of the RPF when below the threshold (see the second column of Fig. 5 below).
FIG. 5.

Time slices of the RPF evolution (top row) along with associated residuals (bottom row) for E||=0.8 and pRE=p¯max/4. The bottom row indicates the residual to Eq. (1a) multiplied by p2/(1+p2), such that the low energy divergence is removed. The first column indicates the terminal condition at t19.36, the second column t17.43 and the last column t=0. The dashed white curves are the location of secondary injection ξ1 defined by Eq. (11). The other parameters are given by Zeff=5 and α=0.1, and we chose tfinal=20, Np=10, Nav=0.5, along with a pmax consistent with 10MeV.

FIG. 5.

Time slices of the RPF evolution (top row) along with associated residuals (bottom row) for E||=0.8 and pRE=p¯max/4. The bottom row indicates the residual to Eq. (1a) multiplied by p2/(1+p2), such that the low energy divergence is removed. The first column indicates the terminal condition at t19.36, the second column t17.43 and the last column t=0. The dashed white curves are the location of secondary injection ξ1 defined by Eq. (11). The other parameters are given by Zeff=5 and α=0.1, and we chose tfinal=20, Np=10, Nav=0.5, along with a pmax consistent with 10MeV.

Close modal
Two example RPFs for both weak and large electric fields are shown in Figs. 5 and 6, respectively. For the case of a weak electric field (E||=0.8, Zeff=5, α=0.1), a large energy domain is used, with an upper bound of 4.5MeV along with an integration time of nearly twenty τc, such that the relatively slow evolution of the RPF can be captured. Here, the terminal condition is chosen such that pRE=p¯max/4, which for the weak electric field case corresponds to an energy slightly below 1MeV. As time evolves, the transition region shifts to higher energy, with maxima near ξ±1 at t17.43 (middle column in Fig. 5). Here, these maxima at high energy are due to synchrotron radiation preferentially slowing down electrons with large quantities of perpendicular energy (i.e., with ξ0). By t=0, the RPF is nearly zero everywhere, except near the low energy boundary where it has a magnitude of 106. The reason the RPF does not vanish identically at the low energy boundary is due to the physics layer defined by Eqs. (18)–(20) not precisely vanishing at p=pmin. In particular, at p=pmin, the RPF reduces to [see Eq. (20)]
FIG. 6.

Time slices of the RPF evolution (top row) along with the associated residual (bottom row) for E||=8 and pRE=p¯max/4. The bottom row indicates the residual to Eq. (1a) multiplied by p2/(1+p2), such that the low energy divergence is removed. The first column indicates the terminal condition at t2.19, the second column t1.65, and the last column t=0. The dashed white curves are the location of secondary injection ξ1 defined by Eq. (11). The other parameters are given by Zeff=5 and α=0.1, and we chose tfinal=20, Np=10, Nav=0.5, along with a pmax consistent with 10MeV.

FIG. 6.

Time slices of the RPF evolution (top row) along with the associated residual (bottom row) for E||=8 and pRE=p¯max/4. The bottom row indicates the residual to Eq. (1a) multiplied by p2/(1+p2), such that the low energy divergence is removed. The first column indicates the terminal condition at t2.19, the second column t1.65, and the last column t=0. The dashed white curves are the location of secondary injection ξ1 defined by Eq. (11). The other parameters are given by Zeff=5 and α=0.1, and we chose tfinal=20, Np=10, Nav=0.5, along with a pmax consistent with 10MeV.

Close modal
which for ΔP=0.2 and Δp=0.1p¯max does not precisely vanish. This small inaccuracy will have a negligible impact for parameters far above the threshold, and will only impact predictions on the PINN at late times when well below marginality. Furthermore, since the spurious contribution to the RPF is located at the low energy boundary, it can be removed via a post processing routine as discussed below.

Considering now a case well above marginality (E||=8, Zeff=5, α=0.1, see Fig. 6), the RPF quickly reaches a steady state, with the RPF at t1.65 having nearly the same form as at t=0, with the total time of the simulation being 2.19. The steady state solution has a contour of roughly one half at an energy of 60keV, suggesting that the maximum energy used in this case of 1MeV is sufficient to capture the critical portion of the RPF.

Using the PINN described in Figs. 4–6, we will be interested in comparing predictions of the RE avalanche growth or decay rate with Monte Carlo simulations carried out with RAMc. Before carrying out this comparison, we note that the use of a Rosenbluth–Putvinski source to describe large-angle collisions introduces an ambiguity in the evaluation of the avalanche growth rate. In particular, since the Rosenbluth–Putvinski source only depends on the number of REs, and not on the energy distribution, it is necessary to introduce a criterion for classifying which electrons should be counted as REs. In the present formulation, the quantity pRE defined by Eq. (3) acts as the effective definition of which electrons are counted as REs. In Fig. 7, this value is roughly 2.5pcrit for the hyperparameters selected, where pcrit is the critical energy to run away given by pcrit1/E||. This definition ensures that only electrons located substantially above the critical energy to run away are included in the large-angle collision operator.

FIG. 7.

Comparison of avalanche growth rates between the Monte Carlo code RAMc with a Møller secondary source and predictions of the PINN. The solid blue curve includes large-angle collisions, whereas the solid red curve includes large-angle collisions when above the threshold, but uses the primary decay rate when below the threshold, and the black “×” markers are values from RAMc. The other parameters were Zeff=5, α=0.1, and lnΛ=16.15. The initial electron distribution was defined by Eq. (23) with (p0=3p¯max/4,Δp=p¯max/10,ξ0=1,Δξ=0.1).

FIG. 7.

Comparison of avalanche growth rates between the Monte Carlo code RAMc with a Møller secondary source and predictions of the PINN. The solid blue curve includes large-angle collisions, whereas the solid red curve includes large-angle collisions when above the threshold, but uses the primary decay rate when below the threshold, and the black “×” markers are values from RAMc. The other parameters were Zeff=5, α=0.1, and lnΛ=16.15. The initial electron distribution was defined by Eq. (23) with (p0=3p¯max/4,Δp=p¯max/10,ξ0=1,Δξ=0.1).

Close modal

An additional subtlety that arises when using an adaptive energy region is that, for the case of a large electric field, the modest value of the upper momentum boundary results in the integral used to evaluate the number of secondary electrons [Eq. (17a)] being prematurely cutoff. This can result in underestimating the number of secondary electrons generated. To account for this, we extended the energy integral in Eq. (17a) to infinity, where we take the value of the RPF to be P(p¯max,ξ1) for the extended region of the integral. For cases well above threshold, we will have P(p¯max,ξ1)1 (see the dashed white curves in Fig. 6, for example), and we would expect this quantity to remain near unity for p>p¯max. In contrast, when near or below the threshold, P(p¯max,ξ1) will no longer be unity, and will likely vary significantly at higher momentum. For this latter case, however, the high energy boundary will be located at several MeV (see Fig. 5, for example), where at such high energies, the Rosenbluth–Putvinski source has a negligibly small value. In this limit, the contribution from the high energy extension of the integral defined by Eq. (17a) will make a negligible contribution to the rate of RE growth or decay, such that our extension of the integral defined in Eq. (17a) past the p¯max boundary will have little impact on the predicted number of secondary electrons.

Finally, as evident in Fig. 5(c), when well below the threshold, a spurious contribution to the RPF can emerge at the low energy boundary. To remove this contribution, we will note two momenta. The first is the critical momentum for an electron to overcome drag, defined by pc1/E||1, for E||1, or pc otherwise. The second is a hybrid momentum formed by taking the weighted average of pmin and pRE, i.e., pref=(pRE+3pmin)/4. To remove the spurious contribution to the RPF near pmin, we will multiply the integrand appearing in Eq. (17a) by a Heaviside function Θ(ppc) for pc<pref, thus removing electrons below the critical momentum, or Θ(ppref) otherwise. In so doing, contributions from momenta below pref will be removed when near or below the threshold, but when far above the threshold (i.e., when pc<pref), only momenta below pc are removed. Thus, in both limits, the spurious contribution to the RPF near p=pmin is removed, but when well above the threshold, only very small values of momenta are impacted.

Using the above algorithm, the avalanche growth or decay of an initial seed RE population has been evaluated for cases both well above and below the threshold, with the result compared with RAMc simulations that employ a Møller secondary source (see Fig. 7). Good agreement is found both for a strongly decaying RE distribution along with a RE distribution that is rapidly growing. Considering a broader comparison between predictions from the PINN and RAMc, 50 values of the parameters (E||,Zeff,α) were randomly sampled over the range of parameters the PINN was trained across. Cases that decayed too rapidly for a good exponential fit to be identified, which corresponded to cases far below the threshold as discussed in Ref. 1, were discarded leaving 44 cases that will be compared with PINN predictions. The resulting comparison between the two approaches is shown in Fig. 8(a). Generally, good agreement is evident, suggesting that the PINN is able to describe the growth or decay of a RE population across a range of plasma conditions. We note that the rapid increase in the magnitude of the RE decay rate as the electric field drops below the threshold (i.e., the steep drop off evident in Fig. 7) resulted in only a few data points that were substantially below the threshold, but not so far below the threshold to prevent an exponential fit, being present in our randomly sampled dataset.

FIG. 8.

Comparison of avalanche growth and decay rates between the Monte Carlo code RAMc and the PINN over a broad range of plasma conditions. Panel (a) includes a Rosenbluth–Putvinski secondary source term for all the PINN's predictions, whereas panel (b) uses the decay rate of the primary distribution when the PINN predicts RE decay. These latter points are highlighted in red. The electric field, effective charge Zeff, and synchrotron radiation α were selected randomly over the region the PINN was trained. The Coulomb logarithm was taken to be lnΛ=9.998 in this scan. The initial electron distribution was defined by Eq. (23) with (p0=3p¯max/4,Δp=p¯max/10,ξ0=1,Δξ=0.1).

FIG. 8.

Comparison of avalanche growth and decay rates between the Monte Carlo code RAMc and the PINN over a broad range of plasma conditions. Panel (a) includes a Rosenbluth–Putvinski secondary source term for all the PINN's predictions, whereas panel (b) uses the decay rate of the primary distribution when the PINN predicts RE decay. These latter points are highlighted in red. The electric field, effective charge Zeff, and synchrotron radiation α were selected randomly over the region the PINN was trained. The Coulomb logarithm was taken to be lnΛ=9.998 in this scan. The initial electron distribution was defined by Eq. (23) with (p0=3p¯max/4,Δp=p¯max/10,ξ0=1,Δξ=0.1).

Close modal

The largest disagreement between the predictions of the PINN and RAMc occurs when below the threshold. As noted in Fig. 3, and the ensuing discussion, when below the threshold, the Rosenbluth–Putvinski source overpredicts the impact of secondary electron generation on the RE density, where a better approximation can be achieved by neglecting the secondary source and computing the decay rate of the primary distribution. This is done in Fig. 8(b), where the decay rate of the values (red “×” markers) that the PINN predicted were below the threshold, is now computed using the decay rate of the primary distribution, yielding improved agreement with predictions from RAMc, particularly for the strongest decaying cases.

An adjoint-deep learning framework for evaluating the time evolution of the RE density incorporating large-angle collisions was derived. By utilizing a PINN to identify the parametric solution to the adjoint problem, this enables the time evolution of the RE density to be rapidly inferred across a broad range of parameters for an arbitrary initial momentum space distribution. This framework was applied to evaluate time histories of the RE density, along with saturated growth and decay rates across a broad range of physics parameters. Excellent agreement was found between predictions of the adjoint-deep learning framework and a traditional RE solver for the saturated avalanche growth rate. It was found that near and below marginality, the Rosenbluth–Putvinski approximation to the secondary source term lead to quantitatively inaccurate time histories of the RE density, though the qualitative behavior was captured. Furthermore, it was found that when below marginality, the decay rate of the RE distribution is well approximated by the decay rate of the primary distribution function (i.e., neglecting large-angle collisions). We anticipate that extending the present framework to include a more comprehensive set of physics, including time varying electric fields, temporally varying plasma compositions, and partial screening corrections to collisional coefficients, will allow the present approach to provide an efficient, yet accurate tool for evaluating saturated RE growth and decay rates, as well as describing RE dynamics during rapid variations in the background plasma. Finally, the present approach to describing RE generation can be extended to describe other RE generation mechanisms including nuclear seeds arising from tritium decay or Compton scattering,6 thus, providing a comprehensive treatment of RE formation across a range of plasma scenarios. Such extensions will be the topic of future work.

This work was supported by the Department of Energy, Office of Fusion Energy Sciences, at the University of Florida under Award Nos. DE-SC0024649 and DE-SC0024634, and at Los Alamos National Laboratory (LANL) under Contract No. 89233218CNA000001. The authors acknowledge the University of Florida Research Computing for providing computational resources that have contributed to the research results reported in this publication. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a Department of Energy Office of Science User Facility, using NERSC Award No. FES-ERCAP0028155.

The authors have no conflicts to disclose.

Christopher J. McDevitt: Conceptualization (lead); Data curation (equal); Formal analysis (lead); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Visualization (equal); Writing – original draft (lead); Writing – review & editing (equal). Jonathan S. Arnaud: Conceptualization (equal); Data curation (lead); Software (equal); Visualization (equal); Writing – review & editing (equal). Xian-Zhu Tang: Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Methodology (equal); Project administration (equal); Writing – review & editing (equal).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1.
C. J.
McDevitt
et al, “
A physics-constrained deep learning treatment of runaway electron dynamics
,”
Phys. Plasmas
32
,
042503
(
2025
).
2.
M.
Kruskal
and
I.
Bernstein
, Report No. Matt-Q-20 (
Princeton Plasma Physics Laboratory
,
1962
).
3.
J.
Connor
and
R.
Hastie
,
Nucl. Fusion
15
,
415
(
1975
).
4.
M.
Rosenbluth
and
S.
Putvinski
,
Nucl. Fusion
37
,
1355
(
1997
).
5.
H. M.
Smith
and
E.
Verwichte
,
Phys. Plasmas
15
,
072502
(
2008
).
6.
J.
Martín-Solís
,
A.
Loarte
, and
M.
Lehnen
,
Nucl. Fusion
57
,
066025
(
2017
).
7.
P.
Aleynikov
and
B. N.
Breizman
,
Phys. Rev. Lett.
114
,
155001
(
2015
).
8.
L.
Hesslow
,
O.
Embréus
,
O.
Vallhagen
, and
T.
Fülöp
,
Nucl. Fusion
59
,
084004
(
2019a
).
9.
C. J.
McDevitt
,
Z.
Guo
, and
X.-Z.
Tang
,
Plasma Phys. Controlled Fusion
60
,
024004
(
2018
).
10.
L.
Hesslow
,
L.
Unnerfelt
,
O.
Vallhagen
,
O.
Embréus
,
M.
Hoppe
,
G.
Papp
, and
T.
Fülöp
,
J. Plasma Phys.
85
,
475850601
(
2019
).
11.
C. J.
McDevitt
,
Phys. Plasmas
30
,
092501
(
2023
).
12.
M.
Yang
,
P.
Wang
,
D.
del-Castillo-Negrete
,
Y.
Cao
, and
G.
Zhang
,
SIAM J. Sci. Comput.
46
,
C508
(
2024
).
13.
J. S.
Arnaud
,
T. B.
Mark
, and
C. J.
McDevitt
,
J. Plasma Phys.
90
,
905900409
(
2024
).
14.
R. W.
Harvey
,
V. S.
Chan
,
S. C.
Chiu
,
T. E.
Evans
,
M. N.
Rosenbluth
, and
D. G.
Whyte
,
Phys. Plasmas
7
,
4590
(
2000
).
15.
E.
Nilsson
,
J.
Decker
,
Y.
Peysson
,
R. S.
Granetz
,
F.
Saint-Laurent
, and
M.
Vlainic
,
Plasma Phys. Controlled Fusion
57
,
095006
(
2015
).
16.
Z.
Guo
,
C.
Mcdevitt
, and
X.
Tang
,
Phys. Plasmas
26
,
082503
(
2019
).
17.
M.
Hoppe
,
O.
Embreus
, and
T.
Fülöp
,
Comput. Phys. Commun.
268
,
108098
(
2021
).
18.
J.
Rudi
,
M.
Heldman
,
E. M.
Constantinescu
,
Q.
Tang
, and
X.-Z.
Tang
,
J. Comput. Phys.
507
,
112954
(
2024
).
19.
L.-G.
Eriksson
and
P.
Helander
,
Comput. Phys. Commun.
154
,
175
(
2003
).
20.
C.
Sommariva
,
E.
Nardon
,
P.
Beyer
,
M.
Hoelzl
,
G. T. A.
Huijsmans
,
D.
van Vugt
, and
J.
Contributors
,
Nucl. Fusion
58
,
016043
(
2018
).
21.
C. J.
McDevitt
,
Z.
Guo
, and
X.-Z.
Tang
,
Plasma Phys. Controlled Fusion
61
,
054008
(
2019
).
22.
M. T.
Beidler
,
D.
del-Castillo-Negrete
,
D.
Shiraki
,
L. R.
Baylor
,
E. M.
Hollmann
, and
C. J.
Lasnier
,
Nucl. Fusion
64
,
076038
(
2024
).
23.
I.
Sokolov
,
JETP Lett.
29
,
218
(
1979
).
24.
T. C.
Hender
,
J. C.
Wesley
,
J.
Bialek
,
A.
Bondeson
,
A. H.
Boozer
,
R. J.
Buttery
,
A.
Garofalo
,
T. P.
Goodman
,
R. S.
Granetz
,
Y.
Gribov
et al,
Nucl. Fusion
47
,
S128
(
2007
).
25.
B. N.
Breizman
,
P.
Aleynikov
,
E. M.
Hollmann
, and
M.
Lehnen
,
Nucl. Fusion
59
,
083001
(
2019
).
26.
C.
Liu
,
D. P.
Brennan
,
A. H.
Boozer
, and
A.
Bhattacharjee
,
Plasma Phys. Controlled Fusion
59
,
024003
(
2017
).
27.
L.
Hesslow
,
O.
Embréus
,
A.
Stahl
,
T. C.
DuBois
,
G.
Papp
,
S. L.
Newton
, and
T.
Fülöp
,
Phys. Rev. Lett.
118
,
255001
(
2017
).
28.
C. F.
Karney
and
N. J.
Fisch
,
Phys. Fluids
29
,
180
(
1986
).
29.
G.
Zhang
and
D.
del-Castillo-Negrete
,
Phys. Plasmas
24
,
092511
(
2017
).
31.
A.
Ashkin
,
L. A.
Page
, and
W.
Woodward
,
Phys. Rev.
94
,
357
(
1954
).
32.
A. H.
Boozer
,
Phys. Plasmas
22
,
032504
(
2015
).
33.
M.
Raissi
,
P.
Perdikaris
, and
G. E.
Karniadakis
,
J. Comput. Phys.
378
,
686
(
2019
).
34.
G. E.
Karniadakis
,
I. G.
Kevrekidis
,
L.
Lu
,
P.
Perdikaris
,
S.
Wang
, and
L.
Yang
,
Nat. Rev. Phys.
3
,
422
(
2021
).
35.
L.
Lu
,
X.
Meng
,
Z.
Mao
, and
G. E.
Karniadakis
,
SIAM Rev.
63
,
208
(
2021
).
36.
M.
Abadi
,
P.
Barham
,
J.
Chen
,
Z.
Chen
,
A.
Davis
,
J.
Dean
,
M.
Devin
,
S.
Ghemawat
,
G.
Irving
,
M.
Isard
et al, in
12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ‘16
) (
USENIX
,
2016
), pp.
265
283
.
37.
C.
Wu
,
M.
Zhu
,
Q.
Tan
,
Y.
Kartha
, and
L.
Lu
,
Comput. Methods Appl. Mech. Eng.
403
,
115671
(
2023
).
38.
S.
Chiu
,
M.
Rosenbluth
,
R.
Harvey
, and
V.
Chan
,
Nucl. Fusion
38
,
1711
(
1998
).
39.
C.
McDevitt
and
X.-Z.
Tang
,
Europhys. Lett.
127
,
45001
(
2019
).
40.
J. S.
Arnaud
and
C. J.
McDevitt
,
Phys. Plasmas
31
,
062509
(
2024
).
41.
C. J.
McDevitt
,
Z.
Guo
, and
X.-Z.
Tang
,
Plasma Phys. Controlled Fusion
61
,
024004
(
2019
).