A real-time control system has been developed to control the amplitude, phase, and offset of bulk plasma parameters inside an oscillating magnetic helicity injector. Control software running entirely on an Nvidia Tesla P40 graphical processing unit is able to receive digitizer inputs and send response patterns to a Pulse Width Modulation (PWM) controller with a minimum control loop period of 12.8 µs. With an input digitization rate of 10 MS/s, a three-parameter proportional integral differential controller is shown to be sufficient to inform the PWM controller to drive the desired oscillating plasma waveform with a frequency of 16.6 kHz that is located near the resonance of a coupled RLC circuit. In particular, the temporal phase of the injector waveform is held within 10° of the target value. Control is demonstrated over the toroidal modal structure of the imposed magnetic perturbations of the helicity injection system, allowing a new class of discharges to be studied.
I. INTRODUCTION
An active problem in the development of magnetic confinement fusion involves the real-time control of nonlinear plasma dynamics. A variety of systems have been implemented on tokamak devices1 to perform real-time prediction and control equilibrium profiles2 and instability mitigation.3
The HIT-SI3 device4,5 forms and sustains spheromak plasmas through the use of three inductive magnetic helicity injectors. A magnetic flux conserving volume, made of chromium copper, is coated with an insulating layer of aluminum oxide to prevent plasma currents from connecting with the vessel walls. Each injector on HIT-SI3 is a semi-toroid, attached to the top of the central chamber, which forms a helical magnetic field structure by driving a toroidal magnetic flux and plasma current with two orthogonal coils, as shown in Fig. 1. The purely inductive nature of the injectors precludes the ability to operate the injectors with time-independent currents, so the injector voltage (Vinj) and flux (ψinj) are oscillated in phase with one another at the operating frequency (finj), driving a plasma current that is parallel with the magnetic flux. Operation of an injector is analogous to that of a tokamak device operating in an oscillatory mode, with the two coils, voltage and flux, being equivalent to the central solenoid and the toroidal field coils, respectively.
The evolution of HIT-SI3 discharges is largely determined by the rate of injection of magnetic helicity balanced against resistive dissipation.6 Each injector injects helicity at a rate of
which when the flux and voltage are driven as sine waves with constant amplitude and phase becomes
To provide constant helicity injection, the three injectors are operated temporally out of phase with either 60° or 120° relative phasing. The total helicity injection rate is given as
where ϕA,B,C is the absolute temporal phase of the waveform for the injectors named A, B, and C. The motivation for the development of this feedback control system is to actively control the temporal waveform of Vinj and ψinj on each injector with focus on the amplitude and temporal phase. Previous studies5,7 of the role of the injector phase on discharge evolution found that the toroidal Fourier mode spectrum of the applied perturbation fields was primarily dictated by the relative temporal phases of the three injectors. The implementation of a real-time injector phase controller allows for a wider range of experimental studies to be performed with focus on the role of the interaction of the non-axisymmetric perturbation fields with the axisymmetric spheromak.
Prior to this work, each of the three injectors was controlled by a set of three Blackfin BF537 microcontrollers, with the entire system using nine microcontrollers. While the microcontroller had digital signal processing capabilities, a simple pre-programmed drive waveform that based timing on the detection of plasma breakdown was typically used. An active proportional integral differential (PID) control implementation was successful at operating the prior HIT-SI experiment; however, that implementation was not successful with the more closely coupled injectors of HIT-SI3. The prior system has several key limitations that justified the development of a new implementation:
Each microcontroller has only one digitizer channel and very low bandwidth communication between units, limiting the control algorithms available for implementation.
The realized digitizer sampling rate (∼400 kHz) was insufficient to properly track the temporal phase of the injector waveforms at frequencies above 10 kHz.
A Graphical Processing Unit (GPU)-based system was selected as a replacement to centralize the control system implementation into a single unit with increased computational power, to expand the available control algorithms that can be implemented with the system, and to update the digitizer hardware being used for a higher sampling rate.
This paper begins by describing the hardware of both the magnetic helicity injectors and the feedback control system that has been developed. Next, a description of the software implementation is presented along with a description of the initial PID controller algorithm that has been applied. Following this, experimental results featuring both a demonstration of the control system and a plasma physics result are presented and discussed. Finally, a conclusion summarizes the results and next steps in the development of this control system.
II. SYSTEM IMPLEMENTATION
Each helicity injector is inductively driven by two orthogonal coil-sets shown in Fig. 1(b), named the voltage (V) and flux (ψ) coils. These circuits consist of a set of H-bridge Switching Power Amplifiers (SPAs) that drive an RLC circuit near resonance. The resonant frequency of the RLC circuit is roughly tuned through selection of a capacitor to match the coil inductance; then, fine tuning is performed via an inductor in series with the H-bridge supply. The general layout of the two circuits is identical, as shown in Fig. 2, differing in the values of the individual elements while tuned to an identical resonant frequency. Additionally, different values of LS and C can be incorporated, with previous experimental work demonstrating operations in the range of finj = 5–70 kHz. It is important to note that these circuits are well understood in the “vacuum” case, when no plasma exists in the injector, but the interaction with a nonlinear plasma9 provides the justification for the development of this control system.
Each SPA operates an H-bridge supply, which drives positive and negative pulses onto the coil circuit. The typical output voltage waveform is a three-step square signal and is parameterized by a period length and the duty cycle of the positive and negative pulses. This waveform is produced via the switching of four Insulated-gate Bipolar Transistors (IGBTs)s in the SPA, which are referred to as S1–4. Figure 2 shows the current paths during the positive and negative pulses, with every other pattern either being zero-voltage or forbidden (shorting the energy storage capacitors).
The control hardware consists of three main pieces, visualized in Fig. 3:
A host-PC with a Graphical Processing Unit (GPU, Nvidia Tesla P40) and a communication card (D-tAcq AFHBA404).
A digitizer control module (D-tAcq ACQ2106).
A collection of electronic-to-fiber converters to propagate the signal to the SPAs.
The digitizer control module built by D-tAcq has three main components:
ACQ482: a 16-channel analog-to-digital converter (ADC) sampling at 10 MS/s.
DIO482: a 32-channel Digital Input/Output (DIO) module providing digital output with a 25 MHz clock.
A Xilinx Z7030 SOC that handles data transfer with the GPU and interprets Pulse Width Modulation (PWM) instructions.
The hardware used is similar to the systems implemented on the High Beta Tokamak-Extended Pulse (HBT-EP) device at Columbia University10,11 and the DIII-D device at General Atomics,12 both of which used a GPU for control calculations, and a system used on the Madison Symmetric Torus (MST) device at the University of Wisconsin,13 which used a central processing unit (CPU) for control calculations. While the digitizer sampling rate of the signals supplied to the GPU in this case is much larger than those used in the other applications, the Peripheral Component Interconnect-express (PCI-e) transfer rates and latency impose a similar bottleneck of ∼10 µs control loop duration and ∼15–30 µs latency on the system.
The host-PC is equipped with two PCI-e devices, a Tesla P40 GPU and a D-tAcq communication card (AFHBA404). Device drivers incorporating the GPUDirect RDMA feature of CUDA14 enable the AFHBA404 card to write directly to GPU memory. Fiber optics are used for the communication between the ACQ2106 device and the host-PC. The digitized data are transferred to the GPU in an alternating two-buffer approach, with the GPU control loop being synchronized to the filling of each buffer, allowing the digitizer sampling rate to exceed the control loop rate. In this specific application, the control loop operates at 78.125 kHz, with 128 time-samples of data being processed in each loop, with an input data transfer rate of 320 MB/s.
The ACQ2106 device’s FPGA includes a PWM controller for communicating the output waveforms to the SPAs. A 32-bit PWM instruction is communicated to each of the 32 channels (24 used on HIT-SI3) during each control loop. The PWM instruction contains four input parameters to define two switches in a single PWM period: an 11-bit cycle period, two 10-bit switching times, and a 1-bit initial state. At the end of each period, the PWM controller loads the current pattern in the buffer to begin the next cycle, leading to patterns being repeated until modifications are made. The DIO482 output clock can operate up to 125 MHz; however, the limitation that the cycle period is represented as an 11-bit number requires that drive frequencies slower than 61 kHz use a slower clock speed. The application presented in this paper uses 25 MHz as the output clock speed to operate the injectors at 16.6 kHz. An output data transfer rate from the GPU is calculated to be 10 MB/s, leading to a total 330 MB/s transfer rate to/from the GPU, which is impressive for a data stream originating external to the host-PC.
To generate a set of four patterns compatible with the SPAs, as shown in Fig. 4, a cycle is decomposed into three terms: the cycle period (T), the positive duty cycle (d+), and the negative duty cycle (d−). The two duty cycles can be written in terms of the amount of time spent being on as . Additionally, a short waiting period δ = 1 µs applied at each switch allows a finite time for the IGBTs to act.
The digital output of the ACQ2106 device is communicated to the SPAs through a set of electronics converting the electrical signal to a fiber-optic signal and a set of fiber-optic fan-out modules. Logic is included in the electronics to prevent output signals that would damage the SPA modules, such as S1 and S2 being simultaneously triggered, shorting the energy storage capacitors.
The software implementation of the control system runs on a set of seven GPU kernels, one primary kernel and a set of six kernels that handle individual injector circuits, as shown in Fig. 5. When the master experimental control system initializes a discharge, a Python script is launched on the host-PC that writes the discharge settings from the MDSplus data tree to the input data files for the control program. Following this, the Python script launches the CPU program and waits for the program to conclude. The CPU portion of the program, written in C++, initializes the communication drivers for data transfer to the GPU and loads the input variables into GPU memory before launching the primary CUDA kernel. After launching the primary kernel, the CPU then waits for a synchronization event from the CUDA device, indicating that the GPU has finished operating, before loading the output values from GPU memory and writing the data to disk. The Python routine then sends the output data of the control system to the data tree.
The GPU portion of the program is initialized by the primary kernel and run with 128 CUDA threads, which largely handles digitizer storage tasks. In order to detect a new set of data that have been written to the digitizer data buffer, a value outside of the digitizer range is written to the final 32 bits of the data buffer and a loop is run to watch for that value to change. The primary kernel transfers the contents of the data buffer to a global storage array, which contains the complete time history of the digitizer data for the discharge, and resets the end-point value of each loop. Parallel optimization of the memory copy is required to achieve the speed of the overall loop, and the use of 128 threads and double-packing the data (each 32-bit instruction copies two 16-bit digitizer values) allow the copying process to take place in roughly 2 µs. A set of sub-kernels are additionally launched to parallelize the operation of each individual injector circuit. Each sub-kernel can be launched with a different control algorithm to allow for more flexibility in operations and development.
The CUDA kernel for each individual circuit runs a loop that is tasked with computation of the desired output waveforms for the four SPA inputs and writing of the result to the output buffer. This kernel works similar to the primary kernel, with each loop beginning at the detection of new data in the input buffer. In the control algorithm implemented, described in Sec. III, an additional sub-kernel is launched by each of these kernels to allow the GPU to do matrix computations in parallel with the memory transfers needed for the output buffer. To allow this to occur, a buffer of global GPU memory is shared between all active kernels to pass messages. Due to the start-up cost of launching a CUDA kernel, all kernels that run during the discharge are launched during initialization and use the same looping structure to synchronize time.
An important consideration in the program construction of the GPU control loop is that communication with the CPU during operation is not possible and the entire control loop is run on the GPU due to timing requirements. Traditionally, CUDA programs utilize the GPU for “single-shot” use cases, where the CPU transfers inputs to GPU memory, the GPU runs a subroutine, and then the CPU copies the outputs and continues the program. Because the control loop implemented here is constrained only to the GPU, common libraries for various tasks, such as the Gauss–Newton fitting used in the state estimation, are not available for our specific implementation. To work around this and achieve the desired computation speed, optimized versions of various subroutines were written for this specific application.
Each kernel is run with 128 CUDA threads to restrict the kernel to running on one single GPU streaming multiprocessor (SM). The advantage of this multi-kernel structure is that it allows the separation of the source code between individual SMs on the GPU, allowing different tasks to be performed simultaneously. Running kernels with larger thread counts than a single SM is possible with this multi-kernel approach; however, the specifics of the HIT-SI3 injectors encourage us to divide the used GPU resources into batches of six for each power circuit. With 13 kernels running simultaneously, the primary and two for each of the six injectors, this means that a total of 1664 CUDA threads are run concurrently, which is a little less than one half of the selected GPU’s total compute capability. Future work to be performed involves testing optimization based on increasing thread counts balanced with the limitations of memory transfer speeds. In particular, as development progresses to more sophisticated control algorithms than the PID implementation detailed here, sub-kernels can be initiated for specific matrix operations involved in the calculations.
III. CONTROL ALGORITHM
The control algorithm for determining the PWM patterns for a single injector circuit consists of three independent PID controllers, which calculate the parameters of the output pattern. The GPU is operated with a control loop period dt = 12.8 µs. Using the formulas from Table I, the control algorithm requires the calculation of the positive duty cycle (t+), the negative duty cycle (t−), and the duration of the specific period (T).
Channel . | IS . | IC . | OC . | GP . |
---|---|---|---|---|
S1 | 0 | T | ||
S2 | 1 | T | ||
S3 | 0 | T | ||
S4 | 1 | T |
Channel . | IS . | IC . | OC . | GP . |
---|---|---|---|---|
S1 | 0 | T | ||
S2 | 1 | T | ||
S3 | 0 | T | ||
S4 | 1 | T |
The output waveform of each injector circuit (ψinj, Vinj) is assumed to be of the form
where the amplitude, phase, and offset (A, ϕ, C) are the control variables, with ω being straightforward to control with the periodicity of the SPA inputs. A fast parallel implementation of the Gauss–Newton least squares fitting algorithm is used to obtain the values for A, ϕ, and C for each circuit during each injector period. Due to computation speed limitations, only a single iteration of the fitting algorithm can be performed in each control loop, with the results from the prior loop being used in the next iteration. In practice, we find that evaluating an iteration every three control loops produces a better fit by binning the data from a larger portion of the injector period. Comparisons of the fit results obtained during the control loop are shown with the coefficients obtained from a post-processing analysis in Fig. 6, where it is seen that the coefficients are sufficiently slowly changing for the fit to be acceptable. The error can be quantified as the root-mean-square error with the mean taken over the injector period as
where N is the number of digitizer samples taken during the sum, ψf is the evaluated fit, and ψm is the measured signal. We see that during the majority of the discharge period, the error values σrms are typically around 2% of the amplitude. The largest error values are seen during the plasma breakdown event, which is the most chaotic period during a discharge.
Using the output of the curve fit, we can recast the amplitude and offset in terms of the positive and negative extrema of the sine wave as
Two independent PID controllers are then employed to calculate the positive and negative duty cycles based on the current extrema values. The duty cycles for the next cycle are calculated as
where the error ϵ on cycle i is given by
and D[+,−] is the demand amplitude and offset converted into positive and negative extrema. Gain scheduling is performed by adjusting the gain values KP, KI, and KD when the system transitions from the vacuum to plasma state to account for the large differences in circuit impedance between the two cases.
Phase control is accomplished through a biased proportional controller applied to the cycle period T. The period of the next cycle T is calculated from
where T0 is the injector operation period, Kϕ is the gain factor, and δϕ = ϕD − ϕm is the error in the injector phase, with ϕD being the demand and ϕm being the measured value.
Prior to the implementation of the GPU system, a set of microcontrollers were used to independently drive each injector circuit on the HIT-SI device.15 While a similar two-term PID controller, using the positive and negative sums of the waveform as the control values, was demonstrated to control amplitude and offset with the HIT-SI injectors, the digitization rate was insufficient to achieve phase control. Additionally, the implementation of the microcontroller system on HIT-SI3 encountered difficulties caused by increased coupling between the injectors.
This algorithm is implemented through the use of two CUDA kernels: the first kernel performs the fitting function, while the second performs the PID algorithm. Each iteration of the loop performing the real-time fit takes roughly 8 µs per cycle of the control loop and keeps the current result of the fit stored in global memory for the PID kernel to utilize. The PID kernel runs mostly serial operations, with most time being consumed by memory transfer operations, spending only about 5 µs for determining and then writing the next PWM instructions to the output buffer.
IV. RESULTS AND DISCUSSION
A set of discharges were taken with the control system to determine the effectiveness and to demonstrate new modes of operation that the system allows. For these discharges, the injector power circuits seen in Fig. 2 were tuned to a vacuum resonance frequency of fres = 16.75 kHz and the control system was driven slightly below that resonance at finj = 16.6 kHz.
The accuracy of the sine-fit routine can be seen in Fig. 6. We see that while the fit error appears to struggle during the plasma breakdown event, the fit generally performs well during the bulk of the discharge, despite being one injector period behind in time. The majority of the fitting error seen during the discharge can be attributed to the high-frequency activity in the signal, as shown in Fig. 7. This noise was found to correlate strongly with the nearby control electronics, in particular, the output PWM signals generated to drive the SPAs, and was not observed when the signal was measured with a digitizer located further away from the control system. While the sine function is a good approximation of the injector voltage and flux waveforms, it has additionally been observed that the injector current sees significant deviations from that form. This provides justification for the selection of controlling the voltage and flux with this algorithm, instead of, as an alternative, the current and flux.
The initial set of discharges have been taken to demonstrate the performance of both the hardware and the control algorithm. Data from discharges that maintain constant amplitude waveforms on the injectors are shown in Fig. 8. These discharges were previously difficult to produce as a combination of changing plasma parameters and power supply droop requires a non-trivial time evolution of the duty cycle of the SPAs. An important part of the duty-cycle evolution for the voltage circuits is the large increase that is seen during the plasma-breakdown event, where the majority of the energy stored in the electric fields of the voltage circuit is expended on ionizing the neutral gas. Typical operations before this control system had difficulty in maintaining a constant Vinj through this event, leading to a reduction in injected power during the spheromak relaxation event. Figure 9 shows the duty cycles that the controller used to achieve the waveform.
The implemented phase control method has been demonstrated to keep the error in the temporal phase below 10° for the duration of the discharge. In Fig. 10, we see that when the phase control is turned off in the middle of a discharge, equivalent to keeping the PWM period constant between injector cycles, the injector phase error δϕ increases roughly 5° per millisecond. This natural phase drift is caused by the resonance of the driving circuit, which is dependent on the plasma parameters inside the injector, varying as the discharge progresses while the drive frequency remains constant.
The relative temporal phase of the injectors has previously been shown to play an important role in the 3D spatial structure of the imposed perturbation fields.5,7 While it is still unclear how much of a role the specific toroidal structure of the perturbation plays in the current drive mechanisms, other than the requirement of being non-axisymmetric, being able to control this structure throughout the discharge allows deeper study to be done. Figure 11 shows two examples of this new capability during a discharge. We see that a smooth transition is able to be performed between two different operating states while sustaining an axisymmetric equilibrium. Particularly, we see that the sign of the axisymmetric toroidal current reverses during the transition from 120° phase separation to the injectors operating in phase. Because the device has no external coils on the confinement volume, spheromak equilibria with both dipole orientations (positive and negative in the vertical direction) are able to be formed. This control system serves as a demonstration that one of the primary inputs to the orientation of the HIT-SI3 equilibrium is the injector system. Future work will seek to explore this in depth to characterize the plasma state dependence on the relative injector phases.
The latency of the communication systems was tested by having the GPU compute the “zero-latency” expected waveform of an SPA and comparing that to a signal digitized by the digitizer, with a result of 18 µs being obtained. This result was a pleasant surprise since the hardware itself imposes 12.8 µs latency on the update time of the PWM controller, indicating that the remaining components only contribute 5 µs to the latency of the system. Figures 10 and 11 show a roughly 250 µs lag between the setting of a demand value and the response of the system. This latency is a result of the PID algorithm being reactive to the demand values, and it can take up to two injector periods, in this case 120 µs, before the first PWM pattern, accounting for the modified demand sent out to the system. The remainder of the latency in the modified demand showing up on the waveform comes from the power circuit response time to a change in the input voltage pattern.
V. CONCLUSIONS AND FUTURE WORK
A GPU-based feedback controller has successfully demonstrated the usage of a PID controller on the operation of the HIT-SI3 device. The system is shown to be capable of controlling a set of oscillating magnetic helicity injectors that operate at 16.6 kHz while making measurements at 10 MHz sampling frequency. Demonstrations of perturbation spectrum control during experimental discharges have been performed, opening up new avenues of device operation and areas of study. These additional types of discharges involve control over the time evolution of the helicity injection parameters (λinj) and the 3D spatial structure of the injected perturbation.
The next steps for development of the system include the application of more sophisticated control algorithms. A linear model was previously developed to describe the time evolution of the injector plasma and power systems,15 and as such, physics-informed control schemes are viable. Control using a model capable of state prediction should allow for a significant reduction in the time-delay response of the system to new demand set-points. Beyond controlling the injector waveforms, the long term goal of this system additionally requires the determination of the required injector waveforms to produce the desired spheromak response. The work by Kaptanoglu16 has laid a foundation for a physics-based model reduction through the use of dynamic mode decomposition and machine learning techniques, which have been shown to accurately approximate HIT-SI state evolution. Finally, enhancements are in progress to include the power circuit evolution in extended magnetohydrodynamic models of the device, which is expected to both improve agreement with simulated results and provide a new test bed for prototyping feedback algorithm implementations.
ACKNOWLEDGMENTS
This work was made significantly easier with the technical support of the team at D-tAcq, who provided the expertise for making the communication channels work properly. The authors thank the members of the HIT research group for valuable discussions on the development and implementation of this control system. This work was funded by the U.S. Department of Energy Phase 1 SBIR program under Award No. SC-0018844 and by CTFusion, Inc., the primary recipient of ARPA-E Award No. DE-AR0001098.
K.D.M. and A.C.H. have a financial interest in the prime awardee and funder of this work, CTFusion, Inc.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.