We present a design for a superconducting nanowire binary shift register, which stores digital states in the form of circulating supercurrents in high-kinetic-inductance loops. Adjacent superconducting loops are connected with nanocryotrons, three-terminal electrothermal switches, and fed with an alternating two-phase clock to synchronously transfer the digital state between the loops. A two-loop serial-input shift register was fabricated with thin-film NbN, and a bit error rate of less than 10⁻⁴ was achieved when operated at a maximum clock frequency of 83 MHz and in an out-of-plane magnetic field of up to 6 mT. A shift register based on this technology offers an integrated solution for low-power readout of superconducting nanowire single photon detector arrays and is capable of interfacing directly with room-temperature electronics and operating unshielded in high magnetic field environments.
Superconducting nanowires are interesting candidates for cryogenic data processing and storage, particularly for readout of superconducting nanowire single photon detector (SNSPD) arrays. The high kinetic inductance of thin film superconductors allows them to store data in compact loops,1 and the existence of nanocryotrons (nTrons), three-terminal electrothermal switches,2 enables the creation of low-power digital logic and memory elements.3 In addition, superconducting nanowires can operate in harsh environments. NbN is radiation hard,4 and SNSPDs have been shown to operate under high magnetic fields—both in-plane up to and out-of-plane up to .5 This makes nanowires an interesting candidate for applications in which SNSPD readout electronics must be able to withstand strong ambient magnetic fields or radiation, such as high energy physics and space exploration. Furthermore, the shared technology platform with SNSPDs and ability to drive high-impedance loads2 is a strong motivator for direct integration of nanowire electronics with SNSPD arrays. Dedicated readout electronics are necessary to address the thermal and mechanical challenges of scaling SNSPD imagers beyond 1 kilopixel,6 and low-power electronic devices that operate in extreme environments and can be fabricated adjacent to superconducting detectors are an attractive choice over Josephson junction logic and CMOS. Previous works7,8 have used the high kinetic inductance of superconducting nanowires to make analog delay-line imagers, which offer high pixel counts and preserve the picosecond timing resolution of the SNSPDs. Row column multiplexing has also been shown as an effective technique for reducing cable counts;9 however, a more aggressive reduction in cable count will be required for megapixel arrays. Inspired by the operation of a semiconductor CCD, serial readout of SNSPD arrays could be performed by a superconducting nanowire binary shift register. 
Serial readout may enable higher count rates than delay-line techniques by shortening dead time, but more importantly, it simplifies the interface to conventional CMOS readout electronics by removing the need for high-resolution, low-jitter time-to-digital converters.
In this work, we demonstrate a proof-of-concept for a superconducting nanowire binary shift register, which encodes digital states with dissipationless circulating current in superconducting loops. As shown in Fig. 1, each loop is formed by a kinetic inductor Lk and two nTrons, U1 and U2. The presence of a circulating current flowing through into the gate of U2 encodes a binary “1,” and the absence of current is used to represent a “0.” The shift register is designed to use circulating currents on the order of ; therefore, small (μA) fluctuations in loop current (e.g., due to thermally activated phase slips that change the stored flux in each loop by ) are not expected to impact the binary state. A substantial environmental disturbance that makes the film resistive (e.g., ) would be necessary to destroy the state stored in the shift register. The state of the shift register is only altered under the application of a clock, when the combination of the circulating current and clock pulse exceeds the critical current density in the nTron channel, causing it to switch from superconductive to resistive, diverting the clock pulse into the next loop. This process forms a new circulating current conditional on the presence of current in the previous loop. A two-phase clock is used to guarantee that the diverted current always has a superconducting path to ground [as shown in Fig. 1(c)]. In comparison to the original nTron design,2 which acts like an amplifier, the loops are connected with wide-gate nTrons, where the width of the gate constriction is comparable or equal to that of the channel constriction. A wide-gate nTron is crucial for the shift register: because the output of one nTron becomes the input of another, the current levels for the input and output should be equal. The additional readout nTron shown in Fig. 1(f) uses a standard nTron with a small choke. 
It terminates the final loop of the shift register to destroy any circulating current present at the end of each clock cycle. The readout nTron serves two purposes: (1) to reset the final loop of the shift register and (2) to generate an output voltage signal, which can be sent to off-chip readout electronics or cascaded through a resistor to other nTron logic.
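The clocking scheme described above can be illustrated with a purely logical toy model. This is a sketch, not the LTSpice model used in this work: it ignores all analog behavior (current levels, hotspot dynamics, timing), and it assumes that phase 0 clocks the nTrons terminating even-numbered loops, that phase 1 clocks those terminating odd-numbered loops, and that the readout nTron is clocked on the phase matching the final loop. The function name and loop indexing are hypothetical.

```python
from typing import List, Tuple


def apply_clock_phase(loops: List[bool], phase: int) -> Tuple[List[bool], bool]:
    """Apply one clock phase to a toy shift register.

    loops[i] is True when loop i holds a circulating current (a stored "1").
    Phase 0 clocks the nTrons that terminate even-numbered loops and phase 1
    clocks those that terminate odd-numbered loops (an assumed indexing), so
    a clocked nTron always diverts into a loop whose far-side nTron is idle.
    Returns the new loop states and whether the readout produced an output.
    """
    if phase not in (0, 1):
        raise ValueError("phase must be 0 or 1")
    new_loops = list(loops)
    output = False
    last = len(loops) - 1
    for i in range(phase, len(loops), 2):
        if not loops[i]:
            continue  # clock alone stays below the switching current
        new_loops[i] = False  # switching destroys the stored current
        if i == last:
            output = True  # readout nTron resets the final loop, emits a pulse
        else:
            new_loops[i + 1] = True  # diverted clock charges the next loop
    return new_loops, output


if __name__ == "__main__":
    state = [True, False, False, False]  # a "1" stored in the first loop
    for step in range(4):
        state, out = apply_clock_phase(state, step % 2)
        print(step, state, out)
```

In this model a stored "1" advances by one loop per clock phase, and clocking a loop that holds no current leaves the state unchanged, mirroring the conditional switching described in the text.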
FIG. 1. (a)–(f) Principle of operation of the shift register, which uses the presence or absence of circulating current to encode digital states. (g) The results of a transient simulation in LTSpice of a four-loop shift register, including noise and parasitics from the packaging and experimental apparatus. (a) A shift register with an initial circulating current in the loop formed by the kinetic inductor Lk and nTrons U1, U2. The corresponding time a in the simulation is indicated in (g). A two-phase clock ( ) is used to transfer the digital state between adjacent loops; the first phase is applied in (b). In (c), the summation of the clock and circulating currents exceeds the switching current of U2's channel, forming a resistive hotspot and diverting the clock into the loop formed by U2 and U3. The hotspot creates a voltage spike shown in the lower panel of (g) at time c. By the time the clock is turned off in (d), the channel of U2 has healed and a circulating current is present in the loop between U2 and U3. The process continues in (e) when the second clock phase is applied. Two clock phases are needed to ensure a zero-resistance path to ground for the diverted clock, for example, the path through U3 as shown in (c). The readout nTron Uro in (f) is used to reset the state of the final loop and generate an output voltage conditional on the presence of a circulating current.
The shift register was fabricated on a -thick layer of NbN, deposited with an AJA sputtering system onto an Si wafer with -thick SiO2 thermal oxide. The circuit geometry was patterned on the NbN layer with electron-beam lithography using ZEP530A resist and CF4 reactive ion etching. The wide-gate nTron channel constriction widths were designed to be (with an equal-sized gate choke), and the readout nTron channel width was designed to be , with a gate choke width of . Figure 2(a) shows an electron micrograph of a wide-gate nTron patterned on thin-film NbN. Figure 2(b) is an electron micrograph of the experimental two-loop shift register circuit, and the equivalent circuit model is shown in Fig. 2(c). The loop kinetic inductors were designed to be ; the estimated inductance came out to ( per square) based on a room temperature sheet resistance measurement of per square. The finished chip was wirebonded to a printed circuit board with off-chip current bias and shunt resistors, which was mounted to a custom dip probe10 and cooled to in a dewar of liquid helium. The bias resistors were used as approximate current sources to convert an applied voltage to a current through the nanowire. The hotspot resistance of the switching nTron is small compared to , so the amount of current through the nanowire given some applied voltage stays roughly constant regardless of the nanowire state. The nTron dimensions, inductor sizes, and resistor values were selected through LTSpice simulation,11 the results of which are shown in Fig. 1(g). The bit error rate of the shift register model under high levels of noise (e.g., variation in clock amplitude) was used to guide selection of component properties. Eight different shift register circuits were fabricated on a single chip. Two circuits were tested: the circuit presented in this Letter, which used a wide-gate nTron to connect adjacent loops, and a shift register with a different switch geometry. 
The alternative design used current summation into a single two-terminal constriction as a switch. It performed worse than the design based on the wide-gate nTron, likely because leakage current could flow between loops unimpeded regardless of the switch state. The results presented in this Letter are from the circuit that used wide-gate nTrons.
FIG. 2. (a) and (b) Electron micrographs of the fabricated wide-gate nTron and two-loop shift register. The large meanders in (b) are kinetic inductors. (c) An equivalent circuit model of the experimental circuit. The current pulses are provided by a voltage source in series with resistors mounted off-chip on a printed circuit board.
The circuit was characterized with clock rates from 10 to 100 MHz and under magnetic fields from ±1 to ±6 mT, applied orthogonal to the chip surface by a superconducting magnet mounted on the end of the dip probe. A Keysight PXIe M3202A (arbitrary waveform generator) and M3102A (digitizer) were used to verify correct operation of the shift register over a range of signal amplitudes. This was done by generating multiple -long pseudorandom binary sequences of voltage pulses and measuring the circuit response. The data and clock input signals encoded digital “1”s with low-duty-cycle FWHM voltage pulses, as can be seen in the top panel of Fig. 3(c). The PXIe chassis controller swept the amplitude of the shift and readout clock pulses and measured the bit error rate in near real time for each set of clock amplitudes by comparing the device output with the input sequence. Each spike of the output waveform was thresholded and digitized, and the result was compared with a copy of the input signal delayed by a clock period. Each instance where the input and digitized output differed incremented the total error count. A sample waveform used to calculate the bit error rate is shown in Fig. 3(c).
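The error-counting procedure described above can be sketched as follows. This is an illustrative reconstruction, not the code that ran on the PXIe controller: the function name, the representation of the output as one peak voltage per clock period, and the threshold value are all assumptions.

```python
from typing import Sequence, Tuple


def count_bit_errors(
    sent_bits: Sequence[int],
    output_peaks: Sequence[float],
    threshold: float,
    delay_bits: int = 1,
) -> Tuple[int, int]:
    """Count mismatches between the digitized output and the delayed input.

    sent_bits    : input sequence of 0s and 1s.
    output_peaks : one peak voltage per clock period (assumed representation).
    threshold    : voltage above which a peak is digitized as a "1".
    delay_bits   : delay, in clock periods, between input and output.
    Returns (number of errors, number of bits compared).
    """
    digitized = [1 if v > threshold else 0 for v in output_peaks]
    errors = 0
    compared = 0
    # The first delay_bits outputs have no corresponding input and are skipped.
    for k in range(delay_bits, min(len(digitized), len(sent_bits) + delay_bits)):
        compared += 1
        if digitized[k] != sent_bits[k - delay_bits]:
            errors += 1
    return errors, compared


if __name__ == "__main__":
    sent = [1, 0, 1, 1, 0]
    peaks = [0.0, 0.9, 0.1, 0.8, 0.7]  # hypothetical peak voltages
    errors, compared = count_bit_errors(sent, peaks, threshold=0.5)
    print(errors, compared, errors / compared)
```

The bit error rate for one pair of clock amplitudes is then the error count divided by the number of bits compared, and the sweep repeats this for every amplitude pair.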
FIG. 3. (a) and (b) Bias margin plots, which show the bit error rate of the shift register (number of errors out of a random bit sequence) as a function of shift and readout clock amplitude. The black regions represent correct operation, with a bit error rate below . The input clock amplitude was fixed at a level that gave optimal margins. (c) An example trace of the transient response of the circuit with a clock. The voltage spikes on Vshunt1 indicate the storage of a circulating current in the first loop, and spikes on Vshunt2 indicate transfer of state between adjacent loops. Traces are vertically offset for clarity.
The plots in Fig. 3(a) are bias margin plots, which show the bit error rate as a function of clock pulse amplitude for various clock rates. The dark regions indicate no measured errors for the sequence, and the width of the dark regions gives the bias margins, defined as the amount of variation in clock amplitude that is acceptable before the circuit begins to function incorrectly. The device performed correctly up to a maximum clock rate of , with the bias margins steadily shrinking for increasing clock frequency. The bias margins of the shift clock were at MHz, but only ±7% for MHz. Margins for the readout clock shrank even more, from at MHz to ±5% at MHz. As shown in Fig. 3(b), the introduction of a field did not dramatically reduce the margins of the shift clock: for +1 mT and ±20% for −1 mT. The readout clock margins were unaffected. However, the introduction of a + field reduced the margins of the shift clock to , and a −6 mT field (not shown) prevented the device from working with a bit error rate below 10⁻³.
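One way to extract a ± percentage margin from a one-dimensional amplitude sweep is sketched below. It assumes a common convention in which the margin is the half-width of the widest error-free region divided by its midpoint; the text does not state the exact formula used, so this convention, the function name, and the sample data are assumptions.

```python
from typing import Optional, Sequence, Tuple


def bias_margin(
    amplitudes: Sequence[float], error_counts: Sequence[int]
) -> Optional[Tuple[float, float]]:
    """Return (midpoint, fractional margin) of the widest error-free region.

    amplitudes   : clock amplitudes in ascending order, all positive.
    error_counts : measured error count at each amplitude.
    The fractional margin is (hi - lo) / (hi + lo), i.e. the half-width of
    the region relative to its midpoint (an assumed convention).
    Returns None if no amplitude was error-free.
    """
    best = None
    start = None
    # Append a non-zero sentinel so a region reaching the last point is closed.
    for i, errors in enumerate(list(error_counts) + [1]):
        if errors == 0 and start is None:
            start = i
        elif errors != 0 and start is not None:
            lo, hi = amplitudes[start], amplitudes[i - 1]
            if best is None or (hi - lo) > (best[1] - best[0]):
                best = (lo, hi)
            start = None
    if best is None:
        return None
    lo, hi = best
    return (lo + hi) / 2.0, (hi - lo) / (hi + lo)


if __name__ == "__main__":
    amps = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3]  # hypothetical amplitudes (a.u.)
    errs = [5, 0, 0, 0, 0, 3]              # hypothetical error counts
    print(bias_margin(amps, errs))
```

Applied to one row or column of a two-dimensional bias margin plot, this gives the margin of one clock with the other clock amplitude held fixed.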
The lower half of each bias margin plot exhibits a downwards slope due to the transfer characteristics of the readout nTron: for a larger gate current, the required channel current to switch the nTron is lower. Therefore, for a larger readout clock, the required loop current (and, thus, shift clock amplitude) is lower. The abrupt change in bit error rate for readout clock amplitudes below occurred because the readout clock was not strong enough to switch the readout nTron. If the final loop current is left circulating, it prevents the middle nTron from switching again when a shift clock is applied. The optimal bias region slopes upwards for high readout clock currents, possibly because of current injection from the readout clock, which would create a reverse circulating current in the final shift register loop. This would require the amplitude of the shift clock to be larger to leave a net-forward circulating current in the final loop that was large enough for the readout nTron to switch when clocked.
As the frequency of the clock increased, the bias margins for the shift clock shrank from both sides, and the maximum acceptable readout clock amplitude decreased dramatically. The L/R time constant to charge a loop with a circulating current depends on the loop kinetic inductance and the total shunt resistance. It is plausible that, for higher clock frequencies, the circulating current does not reach a stable level in the half-period between the two clock phases, thus producing incorrect behavior. Further characterization with various shunt resistor and kinetic inductor sizes should be performed to verify that the decrease in margins is due to this electrical time constant, and not a thermal process or some other unconsidered effect. One possible explanation for the large decrease in the bias margins of the readout clock could be slow thermal reset of the readout nTron gate choke. The designed critical current of the choke was only , and overdriving the readout clock significantly above that (e.g., ) would generate a considerable amount of heat. Residual heat from a readout clock with phase would suppress the critical current of the channel, potentially causing the readout nTron to switch on phase if it had not cooled sufficiently. Shunting the gate with a small resistor could limit the heating of the choke, potentially restoring the bias margin range of the readout clock for high clock frequencies.
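The L/R argument above can be made concrete with a single-pole estimate. This sketch assumes the loop current approaches its final value exponentially with time constant L/R and that the available settling time is half a clock period; both are simplifications, and the component values in the usage example are placeholders rather than the values of the fabricated device.

```python
import math


def settled_fraction(l_loop: float, r_shunt: float, f_clk: float) -> float:
    """Fraction of the final loop current reached within half a clock period.

    l_loop  : loop kinetic inductance in henries.
    r_shunt : total shunt resistance in ohms.
    f_clk   : clock frequency in hertz.
    Assumes a first-order response, I(t) = I_final * (1 - exp(-t * R / L)).
    """
    tau = l_loop / r_shunt
    half_period = 1.0 / (2.0 * f_clk)
    return 1.0 - math.exp(-half_period / tau)


if __name__ == "__main__":
    # Placeholder values chosen only to show the trend with clock frequency.
    for f_clk in (10e6, 50e6, 100e6):
        print(f_clk, settled_fraction(l_loop=100e-9, r_shunt=50.0, f_clk=f_clk))
```

Under this model, raising the clock frequency or the loop inductance lowers the settled fraction, which is consistent with the shrinking shift clock margins but does not rule out a thermal origin.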
The observed shift in bias margins of due to the external magnetic field [Fig. 3(b)] agrees with the expected loop current induced by the Meissner effect. However, enhancement of current crowding around constrictions [such as the sharp corners in the nTron channel, as can be seen in Fig. 2(a)] due to the Lorentz force is potentially a more plausible explanation, so further work must be done to understand the mechanism by which the external field affects the bias margins of the circuit. If the Meissner effect is the dominant mechanism, reducing the size of the loop inductor may help improve resilience against out-of-plane magnetic fields. If, instead, the mechanism is current crowding enhanced by the Lorentz force, then the nTron geometry would need to be modified to mitigate this effect.
The total energy consumption of any cryogenic electronics system will be dominated by the cryocooler, which can consume on the order of to supply tens of milliwatts of cooling power at .12 Unless the design of the shift register presented in this work is modified, SNSPD arrays using shift register readout are limited to the kilopixel regime by cryostat cooling power. The energy consumption of the shift register is estimated to be per shift operation and is dominated by the clocking: each clock phase dissipates through for . When the shift register stores a “1,” approximately of energy is stored ( in a loop). Each shifting operation destroys this circulating current, dissipating the stored energy through the resistive hotspot in the nTron channel. Shift register readout of a 1 kilopixel array clocked at would dissipate about . Reducing the clock impedance by a factor of 20 from to and the operating current from to would lower the power dissipation of the 1 kilopixel array to , making a megapixel array feasible from a power perspective.
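The scaling behind these estimates can be sketched with three standard expressions: I²Rt for a current pulse through a resistance, ½LI² for the energy stored in an inductor, and energy per operation times operation rate for power. The assumption that array power scales as (number of stages) × (clock frequency) × (energy per shift) is a simplification introduced here, and every numeric value in the usage example is a placeholder, not a value from this work.

```python
def clock_energy_per_phase(i_clk: float, r_clk: float, t_pulse: float) -> float:
    """Energy (J) dissipated by a clock current i_clk (A) flowing through a
    resistance r_clk (ohm) for a duration t_pulse (s): E = I^2 * R * t."""
    return i_clk ** 2 * r_clk * t_pulse


def stored_loop_energy(l_loop: float, i_loop: float) -> float:
    """Energy (J) stored by a circulating current i_loop (A) in a loop of
    inductance l_loop (H): E = L * I^2 / 2."""
    return 0.5 * l_loop * i_loop ** 2


def readout_power(n_stages: int, f_clk: float, e_per_shift: float) -> float:
    """Average power (W), assuming every stage dissipates e_per_shift (J)
    once per clock cycle (a simplifying assumption)."""
    return n_stages * f_clk * e_per_shift


if __name__ == "__main__":
    # Placeholder values for illustration only.
    e_clk = clock_energy_per_phase(i_clk=100e-6, r_clk=1e3, t_pulse=1e-9)
    e_loop = stored_loop_energy(l_loop=100e-9, i_loop=100e-6)
    print(e_clk, e_loop, readout_power(1000, 10e6, e_clk + e_loop))
```

Because the clock term scales as I²R, lowering the clock impedance and the operating current together reduces the dissipation by the product of the impedance ratio and the square of the current ratio.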
Decreasing the size of the loop inductor will enable faster, more compact shift registers due to a reduced kinetic inductance and, therefore, smaller L/R loop current time constant. The speed of the device is fundamentally limited by the hotspot thermal relaxation time, since the nTron channel must cool between the two clock phases; otherwise, there will not be a superconducting path for the diverted clock if the previous shift register stage switches. For example, as shown in Fig. 1(c), U3 must be superconducting during the application of clock . An nTron fabricated with NbN on SiO2 thermal oxide has achieved a thermally limited switching speed of , with an estimated thermal relaxation time of .13 Based on this, a conservative estimate for the thermal-reset-limited clock frequency of the shift register is about , allowing for a two-phase clock. At this clock rate, a 1 megapixel array could be read out on two wires at a frame rate of , for a maximum photon count rate of . More thermally conductive substrates can speed up thermal relaxation,14 potentially offering further speed improvements to nanowire logic.
Due to the small feature size of the nTron constriction, fabrication variations may pose a challenge when drastically reducing feature sizes, especially for shift registers with many nTrons. In order to minimize cable count, the same clock signal must be shared between multiple nTrons for any practical shift register. Therefore, all nTrons will receive the same amplitude clock signal, so if there is substantial variation in the switching current of the nTrons, then some loops may not function correctly for a clock amplitude which works for other loops. The bias margins of each nTron in a large shift register will have roughly the same shape, with variations in the midpoint of the optimal bias region due to edge roughness altering the constriction widths. Film thickness also plays a role, but edge roughness should be the dominant factor in switching current variations. Based on Fig. 3(a), the allowable variation in switching current is for a clock rate of 83 MHz. This is equivalent to variation in nTron width for the -wide nTrons. A nanowire fabrication process using ma-N demonstrated 36 nTrons with a mean gate width of and standard deviation of across a chip area.15 With ±7 standard deviations of allowable variation in width, a shift register with millions of nTrons should be feasible. However, scaling down to smaller nTron widths may still pose a challenge, as the relative variation in nTron switching current is larger.
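The feasibility claim for ±7 standard deviations can be checked with a short yield estimate. This sketch assumes that nTron widths are independent and normally distributed and that a device fails only when its width falls outside the allowed window; neither assumption is verified here, and the function names are hypothetical.

```python
import math


def out_of_tolerance_probability(n_sigma: float) -> float:
    """Two-sided probability that a normally distributed width deviates from
    its mean by more than n_sigma standard deviations."""
    return math.erfc(n_sigma / math.sqrt(2.0))


def expected_yield(n_sigma: float, n_ntrons: int) -> float:
    """Probability that all n_ntrons independent nTrons are within tolerance.

    Uses log1p so the result stays accurate when the per-device failure
    probability is extremely small.
    """
    p_fail = out_of_tolerance_probability(n_sigma)
    return math.exp(n_ntrons * math.log1p(-p_fail))


if __name__ == "__main__":
    print(out_of_tolerance_probability(7.0))
    print(expected_yield(7.0, 1_000_000))
```

Under these assumptions the per-device failure probability at ±7 standard deviations is small enough that the width tolerance alone would not limit a million-nTron register, although real distributions may have heavier tails than a Gaussian.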
Because the device we fabricated only accepts serial inputs, it would provide little practical benefit for large SNSPD arrays, as it is incapable of reducing wire count. However, modifications to the circuit design can be incorporated to load data from an entire row of pixels in parallel into the shift register, as shown in Fig. 4. This proposed modification was designed and simulated in LTSpice. A simple pixel and destructive-readout memory can be implemented with an inductively shunted SNSPD and nTron. A second nTron is used to store a current in the shift register when the pixel is read out, conditional on the presence of a current in the pixel inductor. Using this technique, data from all pixels could be loaded simultaneously into the shift register. Since the readout of the pixels is destructive, the bias current through the SNSPD is restored, so the pixels can still detect photons after the pixel data are loaded into the shift register. There is still per-pixel dead time set by the frame rate of the imager, since each pixel can only detect a single photon before it is reset again, but there is no imager-wide dead time like in a delay-line readout approach.
FIG. 4. Proposed parallel readout scheme and transient simulation results. (a) Modifications to the original shift register (gray) in black. Each pixel consists of an inductively shunted SNSPD which is read out with an nTron. When the bias is enabled, photon arrivals divert the bias current into the right branch of the pixel. An additional bias current is applied to the clock input of each shift register stage. If the SNSPD bias current is diverted, a pulse of current applied to the gate of nTron U1 will cause it to switch, sending a pulse of current to the gate of nTron U2. The additional bias on the input will be diverted, forming a circulating current in the shift register. In (b), photon arrivals create circulating currents in each pixel. After the application of the load pulse , the pixel states are loaded into the shift register and shifted out.
In addition to performing detector readout, the simplicity of a shift register makes it a useful test structure, which could be used to characterize process yield, as has been done in the past with SFQ logic to evaluate yield for Josephson junction processes.16 More generally, the inherent ability of shift registers to serialize and deserialize data makes them a critical function of any large-scale digital system. A superconducting shift register could help increase the capacity of links between room temperature and superconducting electronics, and with the introduction of digital logic, push even more computing into the fridge and enable larger scale superconducting systems based on nanowires.
The initial stages of this work were sponsored by the Army Research Office (ARO) under Cooperative Agreement No. W911NF-21-2-0041. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. The completion of the data analysis and presentation was funded by the DOE under the National Laboratory LAB 21-2491 Microelectronics grant. The authors would like to thank Kyle Richards and Teja Kothamasu for assistance with setting up and using the Keysight PXIe system.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Reed A. Foster: Conceptualization (lead); Data curation (lead); Formal analysis (lead); Investigation (equal); Methodology (equal); Software (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead). Matteo Castellani: Conceptualization (supporting); Formal analysis (supporting); Investigation (equal); Methodology (equal); Resources (equal); Software (supporting); Writing – original draft (supporting); Writing – review & editing (equal). Alessandro Buzzi: Formal analysis (supporting); Investigation (supporting); Software (supporting); Writing – review & editing (equal). Owen Medeiros: Formal analysis (supporting); Investigation (supporting); Methodology (supporting); Resources (supporting); Software (supporting); Writing – review & editing (equal). Marco Colangelo: Formal analysis (supporting); Writing – original draft (supporting); Writing – review & editing (equal). Karl K. Berggren: Funding acquisition (lead); Project administration (lead); Supervision (lead); Writing – original draft (supporting); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.