Josephson junctions and single flux quantum (SFQ) circuits form a natural neuromorphic technology, with SFQ pulses and superconducting transmission lines simulating action potentials and axons. Josephson junctions consist of superconducting electrodes with nanoscale barriers that modulate the coupling of the complex superconducting order parameter across the junction. When the order parameter undergoes a 2π phase jump, the junction emits a voltage pulse with an integrated amplitude of a flux quantum, φ0 = h/(2e) = 2.068 × 10⁻¹⁵ V s. The coupling across a junction can be controlled and modulated by incorporating nanoscale magnetic structure in the barrier. The magnetic state of embedded nanoclusters can be changed by applying small current or field pulses, enabling both unsupervised and supervised learning. The advantage of this magnetic/superconducting technology is that it combines natural spiking behavior and plasticity in a single nanoscale device and is orders of magnitude faster and lower energy than other technologies. Maximum operating frequencies are above 100 GHz, while spiking and training energies are ∼10⁻²⁰ J and ∼10⁻¹⁸ J, respectively. This technology can operate close to the thermal limit, which at 4 K is considerably lower in energy than in a human brain. The transition from deterministic to stochastic behavior can be studied with small temperature modifications. Here, we present a tutorial on the spiking behavior of Josephson junctions; the use of nanoscale magnetic structure to modulate the coupling across the junction; the design and operation of magnetic Josephson junctions, device models, and simulation of magnetic Josephson junction neuromorphic circuits; and potential neuromorphic architectures based on hybrid superconducting/magnetic technology.

Deep neural nets have been successful in many tasks,1 including image recognition/classification, language translation, speech recognition, medical image reconstruction, medical diagnosis, and robotics. Current implementations are mostly software based and are limited by software algorithms for training and by the inherent serial nature of the complementary metal–oxide–semiconductor (CMOS) processors used to implement the algorithms. Developing neuromorphic hardware has been a longstanding challenge.2,3 Biological neural systems are characterized by complex spike-based communication, co-location of memory and processing, plasticity, and extreme complexity. Extreme complexity includes a large number of neurons (10¹¹) and synapses (10¹⁵), large fan-out (>1000), 3-dimensional connectivity, a large number of physical parameters and mechanisms working together, and chemistry/physics-based processing where all components evolve simultaneously.4

Many neuromorphic systems have been proposed and developed to various stages of complexity.5,6 The most advanced are CMOS based systems, such as TrueNorth,7 SpiNNaker,8 Neurogrid,9 and the BrainScaleS machine.10 Most of these systems imitate neural function with mixed analog and digital techniques using conventional silicon device structures. Non-CMOS neuromorphic devices have been proposed, including memristors,11–13 nanowires,14 photonics,15,16 and spin based systems.17–20 Most of these technologies do not have low energy plasticity combined with natural spiking at the device level. While there is considerable debate about the added advantage and necessity of spiking systems,21,22 when operating at extreme speeds (>100 GHz) and ultra-low energy (<10⁻²⁰ J/spike), systems that use asynchronous spikes for data transmission and unsupervised learning are at present the most accessible.

Here, we focus on hybrid magnetic superconducting devices that naturally output voltage spikes and have plasticity due to embedded nanoscale magnetic structures. Superconducting and magnetic device technologies are relatively mature, and large scale integrated systems have been successfully deployed. On the magnetic device side, large non-volatile memories based on magnetic tunnel junctions, using spin-polarized currents to write, have been commercialized.23 Magnetic devices are uniquely suited for nanoscale memory applications. On the superconducting device side, high speed microprocessors and communication systems have been developed, based mostly on superconducting tunnel junctions.24 One successful application is the Josephson voltage standard that uses large arrays of Josephson junction oscillators to phase lock on an external microwave signal and output a precise voltage that is related to the flux quantum.25 Superconducting devices are being used for quantum computation and information processing applications.26 Neuromorphic applications of these devices use completely different design concepts than all of these previous applications. For nanoscale digital memory, systems require high uniformity and discrete well-defined states, while neuromorphic systems embrace nonuniformity and a large number of analog states. In quantum computation and information processing, interactions with the environment are minimized while neuromorphic systems embrace thermal fluctuations and interactions with their environment.

Both the superconducting and the magnetic systems can be described by macroscopic order parameters that characterize a coherent state of electrons, which occurs due to electron–electron interactions. Superconducting systems are characterized, in the simplest case, by a complex order parameter ψ(r, t) = √n e^(iθ), which is derived from a two-particle correlation function that describes a many-body coherence of the quantum state. For a standard s-wave superconductor, an attractive interaction leads to correlations of a pair of spin-up and spin-down electrons near the Fermi surface.27,28 The two-body correlation can be thought of as a superconducting electron pair (Cooper pair), which has a wave function similar to positronium and an extent given by the coherence length ξ0 = ℏvF/(πΔsc), where vF is the Fermi velocity and Δsc is the superconducting energy gap. For Nb, one of the commonly used materials in superconducting circuits, the coherence length is ξ0 = 38 nm.29 The superconducting energy gap, Δsc, is a measure of the minimum energy to create an electronic excitation and of the decrease in energy going into the correlated state, Usc = (1/2)N(0)Δsc², where N(0) is the density of states at the Fermi surface.27 For conventional superconductors, the wave function is built up from momentum states near the Fermi surface and has strong charge modulation at wavelengths comparable to lattice spacings. These charge modulations couple with the lattice to cause distortions (phonons) that lower the overall energy of the electron pair. The superconducting pairs are highly overlapping, so the macroscopic charge density is uniform. The square of the magnitude of the order parameter, n(r, t), can be viewed as a local density of superconducting pairs, while the phase θ(r, t) is related to the motion and flow of the superconducting pairs.
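As a quick numerical check of the coherence-length expression above, the snippet below evaluates ξ0 = ℏvF/(πΔsc) for Nb. The Fermi velocity and gap values are illustrative assumptions (an effective band-structure vF ≈ 2.7 × 10⁵ m/s and Δsc ≈ 1.5 meV), not values quoted in the text.

```python
import math

# Evaluate the coherence length xi0 = hbar * vF / (pi * Delta_sc) for Nb.
# Assumed illustrative inputs (not quoted in the text): an effective Fermi
# velocity vF ~ 2.7e5 m/s and a superconducting gap of ~1.5 meV.
hbar = 1.0545718e-34  # J s
e = 1.602176634e-19   # C

vF = 2.7e5             # m/s, assumed effective Fermi velocity
delta_sc = 1.5e-3 * e  # J, assumed superconducting gap

xi0 = hbar * vF / (math.pi * delta_sc)
print(f"xi0 = {xi0 * 1e9:.1f} nm")  # close to the 38 nm quoted for Nb
```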

For magnetic systems, there is a strong spin-dependent interaction between electrons due to the antisymmetry of the wave function. This asymmetry requires that two electrons of identical spin cannot occupy the same spatial location, which leads to strong correlations in the spin states of neighboring electrons. For the case where neighboring spins tend to align, the order parameter can be expressed as the local magnetization vector, M ( r ) , which is the local magnetic moment per unit volume. For both superconducting and magnetic systems, especially when they are combined, the order parameters may have a more complex structure.

The devices considered here, shown in Fig. 1, consist of two superconducting electrodes separated by a barrier that suppresses the superconducting order parameter. The barrier can be an insulator, a normal metal, a magnetic metal or insulator, a constriction, or a nanowire. Figure 1 shows a special type of barrier that has embedded magnetic nanoclusters in a dielectric barrier of thickness ∼5 nm. If the barrier is sufficiently thin, there will be coupling across the barrier and a dissipationless pair current (shown schematically as the coupled electrons in Fig. 1) can flow up to a critical current Ic. These devices all fall under the category of Josephson junctions, named after Brian Josephson who first predicted the effect in tunnel junctions.30 The dynamics of the superconducting order parameter in a Josephson junction, in the simplest case, can be described by a set of coupled equations31 
iℏ ∂ψ1/∂t = μ1ψ1 + kψ2,  iℏ ∂ψ2/∂t = μ2ψ2 + kψ1,  (1)
where μ1 and μ2 are the pair energy levels on the two sides of the junction and k is a coupling energy. Writing ψi = √ni e^(iθi), the voltage V (half of the energy to transfer a pair) and the current I across the device are given by (where ni is now the number of Cooper pairs on each electrode)
V = (μ1 − μ2)/(2e),  (2)
I = 2e dn2/dt = −2e dn1/dt.  (3)
FIG. 1.

Josephson junction with two superconducting electrodes separated by an insulating barrier with embedded magnetic nanoclusters (shown in red). The suppression of the superconducting order parameter ψ is shown schematically on the right as a function of vertical z-position. Quasiparticle (Iqp) and superconducting pair (Isp) currents are shown schematically.

By equating the real and imaginary parts of these equations, one can derive the Josephson relations, which relate the current and the voltage to the phase difference θ = θ2 − θ1 and to the time dependence of the phase, respectively,
I = Ic sin θ,  (4)
V = (φ0/2π)(dθ/dt),  (5)
where φ0 = h/(2e) = 2.067 833 831(13) × 10⁻¹⁵ V s is the superconducting magnetic flux quantum and Ic is the junction critical current.

A steady-state pair current can flow with no voltage (no dissipation) and is a periodic function of the order parameter phase difference θ, with a maximum value of Ic. Ic is a function of the pair density in the electrodes, which has a strong temperature dependence and goes to zero near the superconducting transition temperature Tc. Ic is also a function of the coupling energy k, which is a sensitive function of the embedded magnetic order. As the pair current and phase difference increase, a voltage is generated proportional to dI/dt and reversible work is done, leading to a Josephson energy Uj = ∫IV(t)dt = ∫Ic sin θ (dθ/dt)(φ0/2π)dt = (Icφ0/2π)[1 − cos θ]. Using the standard definition of an inductor, V = L(dI/dt), a Josephson device can be considered as a nonlinear inductor with Lj = φ0/(2πIc cos θ), with the inductance inversely proportional to the critical current. If the phase θ undergoes a 2π phase jump, then a voltage pulse will be generated with an integrated amplitude of ∫V(t)dt = φ0. These pulses are referred to as single flux quantum (SFQ) pulses and have typical amplitudes of 1 mV and durations of 2 ps.
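The pulse and energy scales quoted above follow directly from φ0. The short sketch below recovers the ~1 mV amplitude implied by a 2 ps pulse and evaluates the Josephson energy barrier and zero-phase inductance; the critical current of 100 μA is an assumed illustrative value, not a device parameter from this work.

```python
import math

PHI0 = 2.067833831e-15  # V s, magnetic flux quantum h/(2e)

# An SFQ pulse has fixed area: integral of V dt = PHI0. For an assumed 2 ps
# duration, the implied average amplitude is ~1 mV, as stated in the text.
duration = 2e-12              # s, assumed pulse duration
amplitude = PHI0 / duration   # V, average amplitude
print(f"average SFQ amplitude ~ {amplitude * 1e3:.2f} mV")

# Josephson energy barrier and zero-phase inductance for an illustrative
# (assumed) critical current Ic = 100 uA.
Ic = 100e-6                       # A
Uj_full = Ic * PHI0 / math.pi     # J, Uj at theta = pi: (Ic*phi0/2pi)*2
Lj0 = PHI0 / (2 * math.pi * Ic)   # H, Lj = phi0/(2*pi*Ic*cos(theta)) at theta = 0
print(f"Uj(pi) = {Uj_full:.2e} J, Lj(0) = {Lj0 * 1e12:.2f} pH")
```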

In addition to the pair current, there is a current carried by thermal single-particle excitations, quasiparticles, that, at higher energies, are similar to single particle excitations in normal metals. The superconducting energy gap Δ s c defines the minimum energy of a quasiparticle excitation, leading to an exponential decrease in the quasiparticle density at low temperatures and nonlinear conductances across certain junctions, especially those whose transport is dominated by tunneling. The current through the superconducting electrodes can be considered to be the sum of two independent components, the superconducting pair and quasiparticle currents, which constitutes a two-fluid model. In conventional tunnel barriers, the transport through the barrier can also be assumed to be due to independent pair and quasiparticle currents. This may not be the case in more complex barriers that have magnetic structures.

A Josephson junction can be viewed as an ideal Josephson device, which transports superconducting pairs, in parallel with a capacitor and a dissipative element. This model, referred to as the resistively capacitively shunted junction (RCSJ) model,29,32 is shown in Fig. 2. The dissipative element can be either the quasiparticle current or a separate resistive shunt. Here, we take the resistance Rn to be independent of voltage, which adequately models the junctions used in this work.

FIG. 2.

Resistively capacitively shunted junction (RCSJ) model of a Josephson junction. An × is used as the symbol for a Josephson junction.

The circuit model describes a resonant LCR circuit with a nonlinear inductor describing the supercurrents. Summing the currents and setting them equal to the applied current yields the RCSJ dynamical equation29,32
Cj(φ0/2π)(d²θ/dt²) + (φ0/2πRn)(dθ/dt) + In(t) + Ic sin θ = I(t),  (6)
where the current noise In(t) is a random variable usually assumed to have a Gaussian distribution with a standard deviation of Inrms = √(4kBT/(Rnτ)). Here, kB is Boltzmann's constant, T is the temperature of the junction, and τ is the time over which the random current is applied, which must be much less than dynamical oscillation times.
The dynamics are characterized by two parameters: the plasma frequency ωp, defined as the oscillation frequency at the bottom of the potential well, and the unitless McCumber parameter βc, which determines the strength of the damping,
ωp = √(2πIc/(φ0Cj)),  (7)
βc = (2π/φ0)IcRn²Cj.  (8)

Typical values of the plasma frequency for the junctions considered here are 100 to 400 GHz.33 The McCumber parameter is related to the quality factor of the oscillator, Q = RnCjωp = √βc, and can be chosen to vary over a wide range. Highly damped junctions, βc ≪ 1, are used for Josephson voltage standard arrays, while underdamped junctions, βc ≫ 1, have been used for latching logic. Here, we will restrict ourselves to moderately damped junctions with βc close to 1, which is close to critical damping.
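Equations (7) and (8) can be evaluated for representative junction parameters. The values of Ic, Cj, and Rn below are assumptions chosen to land in the quoted 100–400 GHz plasma-frequency range with βc near 1; they are not taken from a specific device in this work.

```python
import math

PHI0 = 2.067833831e-15  # V s

# Illustrative junction parameters (assumed, not from a specific device),
# chosen so that beta_c is near 1 and f_p falls in the quoted 100-400 GHz range.
Ic = 100e-6   # A, critical current
Cj = 0.5e-12  # F, junction capacitance
Rn = 2.6      # ohm, shunt/normal-state resistance

omega_p = math.sqrt(2 * math.pi * Ic / (PHI0 * Cj))  # Eq. (7), rad/s
beta_c = (2 * math.pi / PHI0) * Ic * Rn**2 * Cj      # Eq. (8), unitless
Q = Rn * Cj * omega_p                                # quality factor = sqrt(beta_c)

print(f"f_p = {omega_p / (2 * math.pi) / 1e9:.0f} GHz, "
      f"beta_c = {beta_c:.2f}, Q = {Q:.2f}")
```

Note that Q = RnCjωp reduces algebraically to √βc, which the printed values confirm.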

The circuit model yields a dynamical equation that is identical to that of a forced damped pendulum, where the first term gives the kinetic energy with a mass proportional to Cj, and the second and third terms are the damping term and the associated fluctuations, respectively. The fourth term is the torque due to a "gravitational" potential with the same form as the Josephson energy, U = −(Icφ0/2π) cos θ. The final term is the external torque corresponding to the applied current.

The dynamics can be visualized using the damped pendulum analogy. When a small torque (or current) is applied, the order parameter phase θ will increase and reach a static value. When the phase gets close to π, there will be a probability that the pendulum will go over the potential energy maximum and emit an SFQ pulse. When the applied current is larger than Ic, the torque is sufficient to continually drive the pendulum over the potential energy maximum, outputting a series of SFQ pulses. At very high applied currents, the pendulum will undergo rapid oscillations with a rate of phase advance proportional to the applied current, corresponding to a linear increase in the average voltage; here, the dissipative current is much larger than the supercurrent. This trend is seen in Fig. 3, which is a simulation of the RCSJ equation with a current ramp of 3.0 μA/ns, T = 4.0 K, Ic = 1.5 μA, and βc = 0.5. The critical current is sufficiently low that the thermal energy is comparable to the Josephson energy, leading to the observed fluctuations.
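The stochastic RCSJ dynamics just described can be reproduced qualitatively with a few lines of explicit Euler integration of Eq. (6). The sketch below uses the ramp rate, temperature, Ic, and βc of the Fig. 3 simulation, but the junction capacitance Cj (and hence Rn, fixed by βc) is an assumed value; no claim is made that it matches the simulation shown in the figure.

```python
import math
import random

PHI0 = 2.067833831e-15  # V s, magnetic flux quantum
KB = 1.380649e-23       # J/K, Boltzmann constant

# Parameters echoing the Fig. 3 simulation: ramp 3.0 uA/ns, T = 4.0 K,
# Ic = 1.5 uA, beta_c = 0.5. The capacitance Cj is an assumed value; Rn is
# then fixed by the chosen beta_c via Eq. (8).
Ic, T, beta_c = 1.5e-6, 4.0, 0.5
Cj = 5e-14                                               # F, assumed
Rn = math.sqrt(beta_c * PHI0 / (2 * math.pi * Ic * Cj))  # ohm, from Eq. (8)
ramp = 3.0e-6 / 1e-9                                     # A/s, current ramp

a = PHI0 / (2 * math.pi)     # convenient prefactor phi0/2pi
dt, n_steps = 5e-14, 40000   # 2 ns of simulated time
theta = dtheta = 0.0         # phase and its time derivative
slips = 0
random.seed(1)

for step in range(n_steps):
    bias = ramp * step * dt                                     # ramped current
    noise = random.gauss(0.0, math.sqrt(4 * KB * T / (Rn * dt)))
    # Eq. (6): Cj*a*theta'' + (a/Rn)*theta' + In + Ic*sin(theta) = I(t)
    ddtheta = (bias - noise - Ic * math.sin(theta) - a * dtheta / Rn) / (Cj * a)
    dtheta += ddtheta * dt
    theta += dtheta * dt
    if theta > 2 * math.pi * (slips + 1):  # each 2*pi advance is an SFQ pulse
        slips += 1

print(f"phase advanced {theta / (2 * math.pi):.0f} x 2pi ({slips} SFQ-like slips)")
```

As the ramp crosses Ic near 0.5 ns, thermally assisted phase slips appear and then give way to continuous phase running, mirroring the three regimes discussed in the text.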

FIG. 3.

Stochastic RCSJ simulation for a junction showing the phase and voltage as the current is ramped at 3 μA/ns. The temperature is 4.0 K which, for this small critical current, gives rise to significant thermal fluctuations of the phase.


Experimentally measured time-averaged voltages as a function of applied current, for arrays of Josephson junctions designed for SFQ applications, are shown in Fig. 4. Here, Ic = 0.7 mA. The three regimes are labeled: the low-current dissipationless regime, the SFQ pulsing regime, and the high-current dissipative regime.

FIG. 4.

Measured average voltage vs. applied current for Josephson junction arrays designed for SFQ applications. Figure courtesy of P. Dresselhaus and D. Olaya.


As important as efficient spiking and synaptic plasticity is the ability of a neuromorphic architecture to transport spikes with little dispersion over long distances and with large fan-out. Superconducting transmission lines have low loss and low dispersion up to frequencies comparable to the gap frequency, hfg = 2Δsc, above which quasiparticle pairs can be created.34 For Nb transmission lines, which are used in most current SFQ circuits, fg = 650 GHz.35 SFQ pulses of 1 ps duration can propagate 10 mm without significant loss and dispersion.34
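As a sanity check on the quoted cutoff, the snippet below evaluates hfg = 2Δsc using an assumed Nb gap of 1.35 meV, an illustrative number consistent with the ~650 GHz figure in the text.

```python
# Transmission-line cutoff: h * f_g = 2 * Delta_sc.
# The Nb gap of ~1.35 meV is an assumed illustrative value.
h = 6.62607015e-34      # J s, Planck constant
e = 1.602176634e-19     # C

delta_sc = 1.35e-3 * e  # J, assumed Nb superconducting gap
f_g = 2 * delta_sc / h  # Hz, gap frequency
print(f"f_g = {f_g / 1e9:.0f} GHz")  # ~650 GHz, as quoted for Nb
```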

A superconducting transmission line can function as a long dispersionless axon. However, synaptic connections will cause attenuation of the pulse. Several circuits have been demonstrated using Josephson transmission lines [an example is shown in Fig. 5(b)] and SFQ pulse splitters that can be used to add energy and regenerate quantized SFQ pulses.36 Using these techniques, a single spike can be sent to a very large number of synaptic taps.

FIG. 5.

Comparison of biological and magnetic Josephson junction neural systems. (a) Numerical simulations of action potentials using the Hodgkin–Huxley model and SFQ pulses using the RCSJ model. (b) Schematic of an axon, which transmits action-potential pulses long distances without degradation, and a Josephson transmission line, which transmits SFQ pulses long distances without degradation. The axon schematic is from https://en.wikipedia.org/wiki/Axon. (c) A diffusion MRI-based tractography map of a human brain from www.humanconectomeproject.org illustrating complex 3-dimensional connectivity, and a state-of-the-art 8-layer SFQ circuit showing present-day connectivity in microfabricated circuits.24 (d) A biological synapse (https://en.wikipedia.org/wiki/Synapse) with complex neurotransmitters and ion channels, and a magnetic Josephson junction synapse with connectivity modulated by local spin clusters. [(e), (f), and (g)] Number of neurons and fan-out possible at present, speed of the neural systems measured in pulses per neuron per second, and the power consumed by a system with the above parameters. The power listed for a typical SFQ neuromorphic circuit does not include cooling overhead.


When operating in a low current regime near I c , Josephson junctions are naturally spiking devices. These pulses form the basis of several high-speed logic families.37 In addition, Josephson junctions have been demonstrated as biologically realistic neurons with two Josephson junctions per cell.38,39 They have been proposed for use in stochastic neural networks,40 spiking neural networks, and have been used to implement a sigmoid transfer function for rate encoding.41 Figure 5 shows the correspondence between this superconducting technology and biological neural systems. Figure 5(a) shows a comparison of numerically calculated action potentials using a Hodgkin-Huxley model42 and SFQ pulses calculated using the RCSJ model. SFQ pulses are an order of magnitude smaller in voltage and nine orders of magnitude shorter duration. Both neural and SFQ pulses can propagate long distances with little dispersion. SFQ pulses can propagate on passive superconducting transmission lines or active Josephson transmission lines, shown in Fig. 5(b). Neurons also use a combination of regions with the myelinated regions being more passive and the nodes of Ranvier [Fig. 5(b)] corresponding to more active regions. Active lines, where there is energy input along the transmission path, are required for high fan-out to maintain the pulse energy. Biological neural systems have an inherent high-complexity 3-dimensional structure, as indicated schematically by a white matter tractography map for a human brain in Fig. 5(c), while the magnetic/superconducting systems are still limited by current microfabrication technology. Figure 5(c) shows state-of-the-art SFQ technology with 8-metal layers for Josephson junctions, high speed interconnects, resistive, capacitive, and inductive components.24 While technically advanced, this microfabricated technology cannot match the fan-out possible in biological systems. 
In SPICE (Simulation Program with Integrated Circuit Emphasis, used to simulate many types of electronic devices and circuits) simulations, we have been able to implement fan-out of 1 to 3 without additional junctions. If we use this as a worst case, implementing all-to-all connections in a feed-forward network between two 100-node layers would require roughly 3900 additional junctions. This is a major challenge, and any larger fan-out is difficult without a further advance in the microfabrication technology. Fan-in, which is more promising, has been simulated at 9 to 1 and would follow a similar approach. We have thus indicated 100 fan-out as the maximum feasible number with the current technology in Fig. 5(c), though even this level will clearly be challenging. The speed of Josephson based neurons is much faster than biological, with ≈10¹⁰ pulses/neuron/s possible. The neuromorphic synapse discussed in this paper is shown schematically in Fig. 5(d) along with a biological synapse. The biological synapse is exceedingly complex, relying on many different neurotransmitters, ion transport mechanisms, and intracellular structures. The neuromorphic synapse relies on the complex interaction of the superconducting order parameter with the embedded magnetic nanostructure. The magnetic nanostructure forms a memory that can modulate the superconducting spiking behavior. The modulation depends on the relative orientations of the spin clusters, which can be modified using low energy current pulses. Finally, the human brain shows remarkable energy efficiency, operating at 20 W. The superconducting circuits take considerably less energy, excluding the costs of the cryogenic system. In large scale applications, where the power consumption is already greater than 10 kW, the cooling overhead is roughly 1000 W of wall power to cool 1 W at 4 K.43

Neuromorphic technologies have many potential advantages over corresponding digital technologies when going to extreme speeds and low energies. Digital technologies require global high-speed clocks for synchronization, which is challenging above 10 GHz, while neuromorphic systems are asynchronous. Digital SFQ systems require precise pulse timing and synchronization of pulse arrival at subsequent gates, while neuromorphic systems embrace pulse time variations as an important form of information transfer. Digital systems require high uniformity of devices, while neuromorphic systems, given that the devices are inherently plastic, embrace nonuniformity. Digital systems require a memory hierarchy with memory located at sequentially farther distances from the information processing core; this separation leads to both higher power and lower speed. Neuromorphic systems, such as the one discussed here, have the memory intrinsically embedded in the information processing core, collocated at the nanoscale. Digital systems require deterministic response and cannot tolerate stochasticity, which necessarily occurs when switching energies are close to the thermal energy. Neuromorphic systems embrace stochasticity44 as an important method of information transfer and as a method to explore options that have not been explicitly preprogrammed.

Superconductivity and ferromagnetism have a fundamental competition in their ground states, which makes the combination of the two systems quite interesting. The singlet ground state in a superconductor has Cooper pairs with opposite spin orientation, whereas the ferromagnetic exchange interaction tends to align spins parallel. Vitaly Ginzburg in 1956 was the first to posit the problem of the coexistence of singlet superconductivity with ferromagnetism from the viewpoint of orbital interactions.45 The advent of BCS theory27 soon after led to the realization that the problem could also be explained in terms of a competition between the opposite spin orientation in superconducting pairs and the parallel spin orientation favored by the ferromagnetic exchange interaction.46 The effect of the exchange field can lead to a nonuniform superconducting state with the pairs acquiring a non-zero center of mass momentum, which was predicted in 1964 by Fulde, Ferrell, Larkin, and Ovchinnikov (FFLO).47,48 While these theoretical works were done with respect to continuous ferromagnetic superconductors, the behavior that is observed in superconducting-ferromagnetic heterostructures is often analogous. This analogous behavior has led to many exciting recent developments in nanometer scale superconducting ferromagnetic hybrid devices.49–53

Among the most promising devices for high performance computing applications are magnetic Josephson junctions (MJJs). In these devices, the Cooper pair wavefunction extends into the ferromagnetic layer with a damped oscillatory behavior. Leveraging the work of the hard disk drive industry, the magnetic random access memory industry, and the great progress in superconducting thin films, hybrid superconducting-ferromagnetic devices can be made with precise control of the ferromagnetic layer. This has led to the development of many interesting devices that exploit the physics of the interacting order parameters. Among these are π Josephson junctions,54 spin triplet Josephson junctions,49,55 and pseudo spin-valve Josephson junctions.56 What has become clear is that there is a strong interaction between the state of the ferromagnetic barrier in an MJJ and the superconducting order parameter. These effects can be used to modulate the phase and/or amplitude of the Josephson critical current Ic. More specifically, the exchange field in the ferromagnet (F) results in a superconducting spin pair (↑, ↓) that occupies a spin-split Fermi level in the ferromagnet. The resulting center-of-mass momentum of the spin pair is nonzero, |k↑ − k↓| ≡ δk, and is directed perpendicular to the interface. In a typical superconductor, the ground state is occupied by Cooper pairs in the singlet state ↑↓ − ↓↑. Once the spin splitting from the exchange field is introduced, the pair amplitude is proportional to ↑↓ e^(iδk·R) − ↓↑ e^(−iδk·R), in analogy to the FFLO phase in a ferromagnetic superconductor. This leads to an oscillatory pair phase φ(x) that modulates with distance from the superconductor-ferromagnet interface and is accompanied by decoherence as the pairs move away from the superconducting interface.47,48,51,57 This effect can be observed in the Josephson coupling, which leads to an Ic that oscillates and decays as a function of ferromagnetic thickness.
The oscillation period with ferromagnetic thickness is set by the spin splitting at the Fermi level and is ∼π/δk.49,58–60 The ability to tune the phase and the phase dependent Josephson critical current by changing the magnetic properties allows one to exploit the work that has gone into a large range of spintronics devices to control the superconducting properties of a hybrid MJJ.
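To get a feel for the ∼π/δk period, the sketch below evaluates it for illustrative (assumed) values of the exchange splitting and ferromagnet Fermi velocity, approximating δk ≈ 2Eex/(ℏvF); neither value is from this work, and real periods depend on band structure and disorder.

```python
import math

hbar = 1.0545718e-34  # J s
e = 1.602176634e-19   # C

# Illustrative (assumed) ferromagnet parameters; real periods depend on band
# structure and disorder. Here delta_k is approximated as 2*E_ex/(hbar*vF).
E_ex = 0.05 * e  # J, assumed exchange splitting (~50 meV)
vF = 2.8e5       # m/s, assumed Fermi velocity in the ferromagnet

delta_k = 2 * E_ex / (hbar * vF)  # 1/m, center-of-mass momentum mismatch
period = math.pi / delta_k        # m, Ic oscillation period vs F-layer thickness
print(f"oscillation period ~ {period * 1e9:.1f} nm")
```

With these assumed inputs, the period comes out at the few-nanometer scale, consistent with the nanometer-thick ferromagnetic layers discussed in the text.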

Significant work has gone into exploring ferromagnetic semiconductors for spintronics applications.61,62 One of the unintended effects that was discovered was the tendency for Mn to cluster during annealing when doping semiconductors.63,64 The size and energy barrier of the Mn clusters can be tuned by varying the annealing conditions, leading to a highly tunable materials system.65 This is particularly useful for Nb based Josephson junctions, as it enables easy access to clusters that are thermally stable at the operating temperature of 4 K. In addition, the clusters can be physically very small, with cluster sizes estimated from magnetometry to be around 4 nm in diameter and densities of roughly 20 000 clusters per square micron. Being able to make MJJs with this large number of clusters allows us to control the critical current of the junction in a near analog way over two orders of magnitude. This flexible tunability is ideal for enacting the synaptic functions that are needed in SFQ neuromorphic computing.33

Figure 6 shows voltage versus applied current data from a 2.5 × 5 μm elliptical MJJ with the SiMn cluster barrier. In the magnetically disordered state, where there is dissipationless transport, the critical current is roughly 130 μA. In the magnetically ordered state, the critical current is near the noise limit of this measurement, at roughly 10 μA or less. The magnetic structure in these junctions can be toggled between ordered and disordered with 50 ± 5 fJ, 300 ± 20 ps electrical pulses: in the presence of a 20 mT external magnetic field, a pulse causes the clusters to order, while in the absence of an applied field, a pulse causes the clusters to disorder. If the field direction is reversed and the junction is pulsed with lower amplitude pulses, the MJJ will transition first to a higher Ic state and then back down to a lower Ic state, indicating that the ordered MJJ clusters point along the direction of the field.

FIG. 6.

Voltage versus applied bias current for 2.5 × 5 μm elliptical clustered MJJ in the magnetically disordered (left) and magnetically ordered (right) states.


The strong modulation of the Josephson critical current at zero field, seen in Fig. 6, suggests that the effect is due to the interacting order parameters. However, it is also known that the magnetically ordered state will produce a dipole field that can shift the maximum critical current away from zero applied field.66 To confirm that the observed change in Ic is not due to this dipolar field effect, we can measure Ic as a function of the applied magnetic field. Figure 7 shows the characteristic voltage IcRn for a 10 μm circular junction in the magnetically disordered (blue) and ordered (red) states. The classic Fraunhofer pattern29 is observed for the magnetically disordered state and is partially fit by the Airy function. The discrepancy of the Airy function fit on the right-hand lobe illustrates the complexity of the interaction of the superconducting order parameter with the magnetic clusters, the details of which are currently under investigation. In the magnetically ordered state (red in Fig. 7), we did not fit the data to the Airy function because Ic was at the noise level of the measurement and no clear pattern was observed for Ic as a function of applied field.

FIG. 7.

Josephson Ic as a function of applied field for a 10 μm circular clustered MJJ. Blue data are taken in the magnetically disordered state and fit to the Airy function, and red data are in the magnetically ordered state and were at the instrument resolution and therefore not fit to the Airy function.


Figure 8 shows the Josephson Ic as a function of the number of electrical pulses used to order the junction.33 The data demonstrate the ability to tune the order in a quasi-analog way between the fully magnetically ordered and disordered states, which is critical for synaptic functionality. The clusters are easily reset into the disordered state by pulsing the MJJ in the absence of an external magnetic field. These data were taken on a 10 μm circular clustered MJJ, with the electrical pulses applied in a 20 mT external field. The measurements of Ic were taken in zero applied field, and it was confirmed that sweeping the field between zero and 20 mT had no effect on Ic without the additional electrical pulses. The pulse voltage was chosen to be a factor of 10 less than the voltage required to fully order the clusters with a single pulse; this indicates that the amount of order induced with each electrical pulse can be modulated by changing the pulse amplitude. This is an important tunability that can be used in the design of self-training synaptic circuits.

FIG. 8.

Josephson critical current vs. the number of applied electrical pulses demonstrating the ability to tune order in small increments for synaptic functionality.


Figure 9 shows that the pulse energies scale well with device size.33 The red squares are the energy in the electrical pulse required to fully order the junction in an applied 20 mT external magnetic field. It was confirmed that the magnetic field alone was sufficiently small that it did not affect the magnetic order without the addition of an electrical pulse. It is worth noting that pulses many times below this energy also affected the magnetic order, as seen in Fig. 8, which allows for near-analog control of the magnetic order, and that the minimum training spike voltage was roughly 700 μV. The blue circles are the approximate spiking energy for the MJJ as a function of size. The energy value is calculated as IcΦ0 and is close to the energy dissipated during a spiking event in the MJJ synapse during inference. If the observed scaling trends continue, then junctions with approximately 1 μm2 area will be in the self-training regime. In this limit, the magnetic order can interact directly with the SFQ pulses, and self-learning circuits based on Hebbian learning rules can be designed. If the size scaling fails to continue, we should still be able to reduce the required training energy by reducing the Mn cluster size, which can likely be achieved with a lower annealing temperature.

FIG. 9.

Pulse energy scaling as a function of the MJJ device area. The red squares represent the electrical pulse energy required to fully order the magnetic state in a 20 mT applied external field. The blue circles are the approximate energy dissipated during a 2π spiking event of the Josephson junction and are calculated as IcΦ0.


In this section, we present simulations of spiking Josephson junctions based on the RCSJ model. Sections III A and III B present stochastic models of a single Josephson junction driven by thermal noise. Section III C extends the RCSJ model to include a magnetic degree of freedom that can serve as a synaptic weight.

When the Josephson energy becomes comparable to kBT, thermal fluctuations become important. The thermal stability factor is given by
Δth = IcΦ0/(2πkBT).
(9)
Here, kB is the Boltzmann constant and T is the temperature in kelvin. When Δth is large, the dynamics are deterministic, whereas when Δth < 6, there is a significant stochastic component. Given the large nonlinear variation of Ic with temperature near the superconducting critical temperature Tc, we can vary Δth over a wide range by adjusting the temperature over just a few degrees, thereby controlling the amount of stochasticity in the neural circuit.
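Equation (9) can be checked numerically in a few lines. The sketch below (Python; the constants are standard values, and the junction parameters are those used for the Fig. 10 simulations) shows why the simulated 1.0 μA and 0.5 μA junctions sit in the stochastic regime at 4.2 K:

```python
import math

PHI0 = 2.068e-15   # magnetic flux quantum, V*s
KB = 1.380649e-23  # Boltzmann constant, J/K

def thermal_stability(ic_amps, temp_kelvin):
    """Eq. (9): ratio of the Josephson coupling energy Ic*Phi0/(2*pi) to kB*T."""
    return ic_amps * PHI0 / (2 * math.pi * KB * temp_kelvin)

# Junctions of Fig. 10 at 4.2 K: both fall below the Delta_th ~ 6 boundary.
print(round(thermal_stability(1.0e-6, 4.2), 1))  # 5.7
print(round(thermal_stability(0.5e-6, 4.2), 1))  # 2.8
```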

Figure 10 shows a numerical simulation of the RCSJ equation for junctions with different critical currents and thermal stability factors, showing a transition from small to extensive stochastic spiking as the critical current is lowered from 1.0 μA to 0.5 μA. Here, a small external current bias, Ib = 0.3 μA, is applied, which sets the polarity of the spiking and provides energy input. The energy input is given by the product of Ib and the time-integrated voltage, which is then nIbϕ0, where n is the number of 2π phase slips that occur. The external energy required to generate a spike is 11 kBT. This low spiking energy is an important advantage of operating in the stochastic regime. A second advantage is the exponential dependence of the spiking rate on device parameters: by dynamically changing the critical current by a factor of 2, the physics of which was described in Sec. II, a large increase in spiking can be obtained. Without any external energy input in the stochastic limit, the system will undergo random positive and negative spikes as energy is transferred in and out of the thermal bath, which is an inherent component of the dissipative element.
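Simulations of this kind integrate the RCSJ equation of motion, C(Φ0/2π)φ̈ + (Φ0/2π)φ̇/R + Ic sin φ = Ib + In(t), where In is the Johnson-noise current of the shunt resistor. A minimal Euler–Maruyama sketch follows; the capacitance and time step are illustrative assumptions, not values taken from the text:

```python
import math, random

PHI0 = 2.068e-15   # magnetic flux quantum, V*s
KB = 1.380649e-23  # Boltzmann constant, J/K

def rcsj_phase_slips(ic, ib, r, c, temp, t_total=2e-9, dt=5e-14, seed=0):
    """Count 2*pi phase slips of an RCSJ junction under a DC bias ib.
    Thermal noise enters as a Gaussian current with variance 2*kB*T/(R*dt)."""
    rng = random.Random(seed)
    i_noise_std = math.sqrt(2 * KB * temp / (r * dt)) if temp > 0 else 0.0
    k = 2 * math.pi / (PHI0 * c)   # converts net current to phase acceleration
    phi, v = 0.0, 0.0              # phase and its time derivative
    for _ in range(int(t_total / dt)):
        i_n = rng.gauss(0.0, i_noise_std)
        a = k * (ib + i_n - ic * math.sin(phi)) - v / (r * c)
        v += a * dt
        phi += v * dt
    return int(phi // (2 * math.pi))

# Deterministic (T = 0) limits: trapped below Ic, free-running above it.
print(rcsj_phase_slips(ic=0.5e-6, ib=0.3e-6, r=400.0, c=1e-14, temp=0.0))      # 0
print(rcsj_phase_slips(ic=0.5e-6, ib=1.0e-6, r=400.0, c=1e-14, temp=0.0) > 0)  # True
```

With temp=4.2, the same sub-critical junction emits occasional phase slips whose polarity follows the sign of ib, which is the stochastic spiking of Fig. 10.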

FIG. 10.

Simulations of spiking of a 0.5 μm diameter 400 Ω junction at 4.2 K for critical currents of (a) 1.0 μA and (b) 0.5 μA and a bias current of 0.3 μA. The junction with Ic = 0.5 μA shows spikes corresponding to 1, 2, and 3 flux quanta.


All neuromorphic systems have spike timing effects due to their intrinsic dynamics. For SFQ systems, the dynamics are set by ωp and βc, as well as by stochastic effects caused by temperature. Figure 11 shows an RCSJ simulation for a series of current-pulse pairs input into a 1 μm diameter, 200 Ω junction. The current-pulse pairs have temporal separations ranging from 100 ps to 0 ps and are each 5 ps in duration. For pulse separations ≥50 ps, no SFQ pulses are emitted. As the pulse separation is reduced below 20 ps, an increasing number of SFQ pulses is emitted, showing a strong spike timing dependence. While there is a transient response for pulse separations of 50 ps and 100 ps, those output voltage pulses integrate precisely to zero. The integrated amplitudes of all output voltage pulses are quantized in units of ϕ0 and range from 0ϕ0 to 8ϕ0. This spike timing dependence is due to the energy relaxation time of the junction, which depends on βc. As seen in the inset to Fig. 11(c), the relaxation time for this junction is on the order of 20 ps. The oscillations in the junction, which occur at the plasma frequency ωp, will lead to fine-scale structure in the spiking probabilities on short time scales if βc > 1.

FIG. 11.

Simulations of spike timing effects. A pair of 5 ps wide, 10 μA amplitude Gaussian input pulses is applied to a 1 μm diameter 200 Ω junction with pulse separations ranging from 100 ps to 0 ps. (a) Input current spikes, (b) output voltage spikes, (c) phase progression with the left axis being the number of 2π phase jumps. (d) A close-up of a voltage spike and phase jump in which the inherent dynamics, corresponding to a damped oscillator, can be seen.

In addition to the standard RCSJ model that is available as open source in WRspice, we have developed a Verilog model of our magnetic Josephson junctions and integrated it into WRspice. In this model, the magnetic order parameter m is expressed by its effect on the junction critical current (IC), according to the relationship
IC = [ICV(1 − m) + ICM][1 − (T/TC)2],
(10)
where T is the temperature, TC is the critical temperature of the superconducting material, m is the magnetic order parameter (the normalized total magnetic moment in the barrier), and the sum ICV + ICM is the maximum critical current, achieved when m = 0 (the magnetically disordered state) and T = 0. Equation (10) is a simple phenomenological model which coarsely mimics the observed behavior that IC changes with each voltage pulse and that IC varies with the observed power law behavior 1 − (T/TC)2. The magnetic order parameter varies with the integrated junction voltage according to dm = kV(t)dt between m = 0 and m = 1; in other words, a voltage pulse across the junction causes the order parameter to increase and the critical current to decrease, until saturation at m = 1 and IC = ICM. The proportionality constant k is a simulation input parameter; physically, this parameter varies based on fabrication parameters and external field strength. To simulate the neural network in situ self-training mode, k can be set so that m varies from 0 to 1 within ∼100 SFQ voltage pulses. For simulation of network operation after the MJJ ICs have converged to their optimal values, k is set so that m is essentially invariant in normal circuit operation.
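As a sketch, Eq. (10) and the update rule dm = kV(t)dt amount to a few lines of code. The parameter values below are assumptions chosen to roughly match the ∼130 μA disordered and ∼10 μA ordered critical currents of Fig. 6; they are not fitted model values:

```python
def mjj_ic(m, t, tc, icv, icm):
    """Eq. (10): critical current vs. order parameter m (0..1) and temperature t."""
    return (icv * (1.0 - m) + icm) * (1.0 - (t / tc) ** 2)

def update_m(m, v, dt, k):
    """dm = k*V(t)*dt, clamped to [0, 1] so the order parameter saturates."""
    return min(1.0, max(0.0, m + k * v * dt))

ICV, ICM, TC = 120e-6, 10e-6, 9.2  # assumed values; TC is roughly that of Nb
print(round(mjj_ic(0.0, 0.0, TC, ICV, ICM), 9))  # 0.00013  (max Ic, disordered, T = 0)
print(round(mjj_ic(1.0, 0.0, TC, ICV, ICM), 9))  # 1e-05    (fully ordered, Ic = ICM)
```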
Josephson junctions and MJJs can be built into simple superconducting circuit elements that emulate the synapses and neurons of a neural network. A simple 3-layer feed forward network is shown schematically in Fig. 12, with neurons and synapses labeled n and s, respectively. These elements can be used to realize the layers of a standard feed-forward neural network and perform, in hardware, the computation
y = f(Wx + b),
(11)
where x and y are the vectors of network layer inputs and outputs, W is a matrix of weight factors, b is a vector of offset bias values, and f(z) is a nonlinear function, applied element-wise to each element of z, which is the weighted sum of the inputs and biases. W and b are variables, and these are modified, in either a supervised or unsupervised manner, to train the circuit for a particular function.
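Equation (11) is the standard dense-layer computation of a feed-forward network. A pure-Python sketch with a hard threshold for f, standing in for the pulse/no-pulse output of a spiking neuron (the weight and bias values are illustrative, not trained values):

```python
def layer(x, w, b, f):
    """Eq. (11): y = f(W.x + b), with f applied element-wise."""
    return [f(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def step(z):
    return 1 if z > 0 else 0  # SFQ pulse emitted / not emitted

w = [[0.5, -0.3], [0.2, 0.4]]  # illustrative weights
b = [0.0, -0.7]                # illustrative biases
print(layer([1, 1], w, b, step))  # [1, 0]
```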
FIG. 12.

Schematic of a simple spiking feed forward network. n, s, and b represent neurons, synapses, and biases, respectively. Not all connections are shown.


For a software-based neural net, x, y, W, and b would consist of integers or floating-point numbers. For a neural net implemented in analog, silicon-based circuitry, x, y, and b would be continuous voltages and W would be linear amplification stages. For the neuromorphic spiking systems discussed here, x and y represent complex functions of the input and output current spike trains (e.g., the number of spikes or spike rate), W characterizes the complex processing of the incoming spikes by the Josephson circuit elements, and b comprises external bias currents. Hence, in our discussion, we refer to “synapses” as the circuit elements that implement the weight matrix elements wij and “neurons” as the circuit elements that perform the summation of multiple synaptic inputs, add a bias, apply a nonlinear function, and produce an output spike train.

Synapses are represented numerically by the elements wij of a matrix of weights W. In a superconducting neuromorphic circuit with tunable ICs, the synapse circuit elements are designed such that the IC of the (i,j)th synaptic MJJ maps to the weight element wij. Ideally, the mapping from IC to weight wij would be unique and monotonic. In addition, due to the large number of synapses in a feed-forward network, a simple synapse architecture with small circuit area is ideal.

A proposed synapse circuit structure is shown in Fig. 13. This structure uses a single MJJ with tunable IC, in parallel with a fixed inductor Lfix. In a small-signal linear model, the Josephson junction is modeled as an inductor LJ with inductance LJ = Φ0/(2πIC), in parallel with a resistance Rn. For pulses much slower than the plasma frequency of the MJJ (100 GHz as demonstrated33), the input current splits between the two inductive branches in a ratio that depends on the MJJ IC. The current through Lfix is used as the synapse output signal. Given an input current Iin to this synapse structure, the current through Lfix is given by
ILfix = wijIin,
(12)
where wij = LJ/(LJ + Lfix).
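For slowly varying inputs, this weight follows directly from the two-inductor current divider. The sketch below uses the Fig. 14 circuit value Lfix = 4 pH; note that this linear estimate reproduces the high-IC end of Fig. 14 (∼0.3 at 200 μA, ∼0.04 at 2 mA) but not the 0.95 value at 20 μA, where the large input drive takes the junction beyond the small-signal model:

```python
import math

PHI0 = 2.068e-15  # magnetic flux quantum, V*s

def synapse_weight(ic, l_fix=4e-12):
    """Small-signal weight wij = LJ/(LJ + Lfix), with LJ = Phi0/(2*pi*Ic)."""
    l_j = PHI0 / (2 * math.pi * ic)
    return l_j / (l_j + l_fix)

print(round(synapse_weight(200e-6), 2))  # 0.29
print(round(synapse_weight(2e-3), 2))    # 0.04
```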
FIG. 13.

Synapse circuit element consisting of a Josephson junction in parallel with a fixed inductor, Lfix, which is inductively coupled to the output. The symbol for a MJJ is similar to that of a Josephson junction, except there is an additional arrow to indicate that the junction properties can be varied during circuit operation.


A SPICE simulation of the IC versus wij relationship is shown in Fig. 14. This simulation uses circuit parameters Lfix = 4 pH, Rn = 1 Ω, and IC varying from 20 μA to 200 μA (corresponding to a junction characteristic frequency varying from 9.6 GHz to 96 GHz). In this simulation, the input current Iin is ramped from 0 to 150 μA, and back to 0 μA, over a range of ramp times. The synapse weight wij is calculated as the ratio of the peak current through Lfix to the peak input current of 150 μA. The solid curve corresponds to the slowly varying input case (ramp time >150 ps). The value of wij varies from 0.95 to 0.3 over the range of ICs simulated. To achieve smaller magnitudes of the weights wij, the maximum MJJ IC could be further increased. For example, increasing IC to 2 mA would decrease the synaptic weight factor to ∼0.04 in the slowly varying input case. The fixed inductance Lfix could also be increased to achieve smaller wij, although this would result in larger circuit areas (with current NIST fabrication technology, Lfix = 4 pH corresponds to a synapse circuit area of ∼10 μm × 15 μm). Alternatively, each synaptic weight wij could be physically represented by two synapse structures with independent MJJ ICs, one with a positive contribution and one with a negative contribution, so that wij(net) = wij+ − wij−. This scheme will be further explained in the discussion of output neurons.

FIG. 14.

SPICE simulation of synaptic weight vs. Josephson junction critical current for varying ramp times.


The mapping between IC and wij deviates from the outlined model when either the input current changes too quickly or the magnitude of the input current becomes too high. In particular, as the inverse of the input signal ramp time approaches the junction characteristic frequency, the junction resistance Rn shunts an increasing portion of the input current, lowering the current through Lfix and limiting the range of possible variation in wij. This effect is shown by the dashed curves in Fig. 14. For the junction parameters simulated in this example, the range of possible wij begins to drop significantly for input ramp times <30 ps. To achieve a higher maximum operating speed while maintaining full sensitivity of wij to IC, the MJJ Rn could be increased. Finally, when the input current magnitude increases such that IinLfix ∼ 0.5Φ0 or larger, the small-signal model for the Josephson inductance LJ is no longer valid. Instead, the MJJ outputs an SFQ pulse for sufficiently large Iin, causing an abrupt change in the effective wij. The proposed synapses are not intended for operation in this large-signal regime.

The input current signals to the synapses could be either analog (i.e., a current pulse from an external source with programmable amplitude or the scaled output from a previous network layer) or digital and binary (a “1” or “0” represented by the current, or lack thereof, from an SFQ pulse). An advantage of SFQ pulses is that the input bits could be stored prior to operation in an on-chip SFQ memory bank and then clocked at speed to the network input with a single universal clock signal. For long input vectors x and networks that require synchronous operation, this scheme is more realistic than separately driving each input channel in real time.

1. Overview

We envision using standard Josephson junctions, e.g., Nb/AlOx/Nb, to provide the functionality of neurons. These will serve as the spiking elements and have already been well developed for digital SFQ circuits. As previously described, the role of output neuron j is to sum the synaptic inputs and apply a nonlinear function
yj = f(∑i wijxi + bj).
(13)
One proposed circuit structure that implements this functionality is shown in Fig. 15. The current through Lfix of each synapse is inductively coupled to an output loop that consists of a large inductor in series with two Josephson junctions. This output circuit structure is a superconducting quantum interference device (SQUID).29,67 The SQUID is characterized by its total inductance LSQ, the coupled magnetic flux through the SQUID loop Φcpl, the DC bias current Ibias, and the ICs of the component Josephson junctions (which could be either fixed or, if MJJs are used, tunable). Here, it is assumed that Φcpl arises from current through the inductively coupled synaptic inputs.
FIG. 15.

Schematic of multiple synapses inductively coupled to the “jth” output neuron.


The total magnetic flux through the SQUID (Φtot = Φcpl + LSQIcir) is proportional to the difference in the gauge-invariant phase φ between the two Josephson junctions: Φtot = Φ0(φ1 − φ2)/2π. A further constraint on φ1 and φ2 is that the total current through the Josephson junctions must sum to Ibias in the absence of any coupled flux. Whenever Ibias is not split evenly between the two Josephson junctions, a nonzero circulating current Icir is present. Depending on the SQUID parameters and input values, a steady-state solution for the distribution of Ibias between the two Josephson junctions may exist (meaning that dφ/dt = 0 for both Josephson junctions), and the SQUID remains in the “zero-voltage state;” if a solution does not exist and dφ/dt ≠ 0, the SQUID is in the “voltage state.”

In the zero-voltage state, as the coupled flux Φcpl increases, the current through each Josephson junction (and the value of Icir) varies smoothly until the current through one of the Josephson junctions reaches its IC threshold. At this point, the Josephson junction at threshold outputs an SFQ pulse, the circulating current Icir increments or decrements by a discrete value, and the SQUID again evolves smoothly along a new solution space as Φcpl increases further. If Φcpl subsequently decreases to zero or its original value, the second Josephson junction outputs an SFQ pulse, returning Icir to its original value.

Figure 16 shows the results of a SPICE simulation of a SQUID while ramping the coupled flux. One of the SQUID Josephson junctions outputs an SFQ pulse when Φcpl increases above a threshold, and the second outputs an SFQ pulse as Φcpl decreases. In the proposed neuromorphic SFQ circuit architecture, Φcpl is provided by the summation of the coupled flux from all input synapses: Φcpl,j = ∑i M·ILfix,ij, where M is the mutual inductance between the synapses and the neuron and ILfix,ij = wijIin,i. As shown in Fig. 15, each synapse is physically represented by two loops that couple flux of opposite polarity to the output SQUID, corresponding to the weight factors wij+ and wij−.
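The flux summation Φcpl,j = ∑i M·ILfix,ij with paired positive and negative loops is simple arithmetic; a sketch follows (the mutual inductance value and input currents are illustrative assumptions):

```python
def coupled_flux(i_in, w_pos, w_neg, m_cpl=1e-12):
    """Net flux coupled into neuron j from its synapse pairs:
    Phi_cpl,j = sum_i M * (wij+ - wij-) * I_in,i."""
    return sum(m_cpl * (wp - wn) * ii
               for ii, wp, wn in zip(i_in, w_pos, w_neg))

# Two 50 uA inputs; the second synapse pair is balanced and contributes zero.
phi = coupled_flux([50e-6, 50e-6], [0.6, 0.3], [0.2, 0.3])
print(round(phi / 2.068e-15, 4))  # 0.0097, i.e., about 0.01 flux quanta
```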

FIG. 16.

SPICE simulations of a SQUID with coupled flux ramp. When the coupled flux reaches a certain threshold, one of the SQUID Josephson junctions outputs an SFQ pulse and the circulating current jumps in magnitude.


The inductively coupled scheme for transmitting synapse signals to the output neuron was chosen to minimize unwanted cross-talk between circuit structures. The Josephson junctions are two-terminal devices with low output impedance, which makes it difficult to achieve isolation between inputs and output, especially when designing directly coupled circuits with large fan-in. For example, consider N MJJ synapses that are directly connected to the bias input of a neuron SQUID loop. An SFQ voltage pulse from one synapse could split and travel back through the other N − 1 synapses rather than forward to the output neuron. Additional circuitry can be added to block this back-propagation,37 but our simulations of directly coupled neuromorphic circuits still reveal limited scalability and poor circuit operating margins, which is why we propose an inductively coupled scheme.

Cross-talk between synapses in the inductively coupled scheme results only from second-order effects, as long as the parasitic mutual inductance between nearby synapses is kept small. The second-order effect arises because an incremental addition to Φcpl by one synapse slightly changes the output neuron Icir, which in turn changes the amount of magnetic flux back-coupled from the output neuron to the other synapses. The back-coupled flux through the synapse loops slightly changes the current through Lfix and the effective synapse weight wij. However, these incremental changes are negligible, and even the large-signal flux back-coupled to the synapses from the total Icir in the neuron is small. For example, assuming a synapse Lfix of 4 pH, with each synapse coupled to a 1 pH section of the output neuron LSQ with coupling coefficient 0.3, and an Icir magnitude of 80 μA or less, the magnetic flux back-coupled to the synapses is at most 0.02Φ0. Simulations show that Φcpl = 0.02Φ0 through the synapses has negligible impact on wij. As network fan-in grows, the neuron LSQ will increase to accommodate the larger number of synapse inputs, while the neuron ICs and Icir will decrease. Therefore, unlike in the directly coupled architecture, unwanted cross-talk between circuit structures will decrease with larger fan-in for the inductively coupled architecture.

2. Neuron nonlinearity implementation

To fully implement the feed-forward neural network layer model in Eq. (11), the output SQUID neuron should apply a nonlinear function f(z) to the summation of coupled synapse fluxes.

In a binary-output neuron model, the nonlinear function f(z) is simply a threshold function. The SQUID Josephson junctions each output an SFQ pulse if the value of Φcpl exceeds a threshold value and do not output an SFQ pulse otherwise. This threshold (ΦcplT/Φ0) is tunable based on the parameters of the output SQUID, including IC and LSQ. The value of ΦcplT/Φ0 is shown in Fig. 17 as a function of IC and LSQ; the figure shows that ΦcplT/Φ0 depends mainly on the product ICLSQ. Figure 17 is generated from a SPICE simulation in which Ibias was initially 0.5IC for each Josephson junction in the SQUID. The effective neural network offset bias bj of neuron j can thus be set by tuning the ICLSQ value, which tunes the amount of coupled flux Φcpl needed to trigger a binary “1” output. The binary-output neuron scheme is attractive from an energy and speed standpoint because the output from each neuron is encoded in a single pulse with sub-attojoule energy and only tens of picoseconds in duration. The tradeoff is that more information can be encoded per neuron if analog output signals are used.

FIG. 17.

Results of SPICE simulations showing the coupled flux ΦcplT/Φ0 required to create a neuron SQUID output pulse as a function of IC and LSQ. Lines of constant ICLSQ are shown.


In one scheme to achieve analog neuron output signals, the SQUID bias current Ibias is set at a sufficiently high value that when Φcpl exceeds a threshold ΦcplT/Φ0, there is no possible zero-voltage SQUID configuration, and the SQUID enters the voltage state. The time-averaged voltage (measured across either Josephson junction) depends on Φcpl, as shown in Fig. 18. For this figure, the SQUID output voltage was averaged over 1 ns. Although the time-averaged voltage does not depend monotonically on Φcpl, there is a regime of Φcpl in which the SQUID behaves approximately as a rectified linear unit (ReLU) neuron, with zero output up to an input threshold ΦcplT/Φ0 and steadily increasing output above this threshold. Unlike a software-defined ReLU neuron, the output eventually saturates and then decreases with increasing input.

FIG. 18.

Simulation of a neuron SQUID operated as an analog-output neuron, with zero output voltage up to a threshold ΦcplT/Φ0 and a subsequent monotonic increase in Vavg with Φcpl, within a limited range of Φcpl.


3. Trainability

The ICs of the MJJs can be tuned in either direction depending on the presence or absence of a small magnetic field of ∼1 mT. The magnetic field can be generated either globally or locally by a write line. In the absence of any other stimulus, the magnetic clusters in the MJJs remain stable in this small magnetic field and do not change. However, any MJJ that receives a voltage spike while the magnetic field is applied will increase in magnetic order, which reduces its IC value. Conversely, in the absence of an applied magnetic field, a voltage spike will reduce the magnetic order of any MJJ that receives one of these training spikes. The details of the implementation of such training methods are an area of active investigation.

In one example of a simple training algorithm for a supervised learning task, consider a training set of binary-valued input vectors x and their corresponding binary-valued known outputs y. For each training example, the activated neurons in output vector ym are biased so that their circulating current, Icir, couples significant magnetic flux through the input synapses (0.3Φ0) but does not induce an SFQ pulse from the synapse MJJs. The non-activated output neurons are biased so that Icir is negative, as is the magnetic flux coupled to the input synapses. A positive Iin is then applied to the activated inputs of xm. When bias and input amplitudes are chosen correctly, this scheme will cause an SFQ pulse only from the MJJs in wij+ synapses that connect an activated input to an activated output, or in wij− synapses that connect an activated input to a non-activated output. A schematic of this supervised learning scheme is shown for a simple 3-to-2 network in Fig. 19. It is desirable to be able to supply a localized magnetic field to target MJJ synapses, which can be achieved with a dedicated field write line; in this case, we would need to reduce the field required for ordering.
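The synapse-selection rule in this scheme reduces to simple logic per (input, output) pair. A sketch with pulse thresholds and bias margins abstracted away (the function name and the tuple encoding are ours, not from the text):

```python
def pulsed_mjjs(inputs, targets):
    """MJJs that emit an SFQ pulse (and, in an applied field, strengthen)
    for one training example: ('+', i, j) when activated input i drives an
    activated output j; ('-', i, j) when output j is non-activated."""
    pulses = set()
    for i, xi in enumerate(inputs):
        if not xi:
            continue  # non-activated inputs drive no synapse current
        for j, yj in enumerate(targets):
            pulses.add(('+' if yj else '-', i, j))
    return pulses

# The 3-to-2 network of Fig. 19: inputs (1, 0, 1), desired output (1, 0).
print(sorted(pulsed_mjjs([1, 0, 1], [1, 0])))
# [('+', 0, 0), ('+', 2, 0), ('-', 0, 1), ('-', 2, 1)]
```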

FIG. 19.

Proposed on-chip supervised learning scheme for an MJJ neural network. For each training example, input pulses, shown as 1s at the top of the figure, are applied to the activated input neurons, which produce currents in the MJJs of the synapses connected to the active inputs. In addition, a positive training current, shown as a clockwise green arrow, is applied to the desired active output neuron, labeled 1, and a negative training current is applied to the desired non-activated output neurons, labeled 0. These applied currents at the outputs induce currents in the MJJs of the connected synapses. The active input and training currents add together and exceed the MJJ cluster ordering threshold and, in the presence of an applied field, strengthen the weight of the connected excitatory part of the synapse, while they tend to cancel each other in the MJJ of the inhibitory part of the synapse. Similarly, the currents from an active input and a non-active output neuron add in the inhibitory part of the synapse (strengthening the inhibitory weight) and tend to cancel each other in the excitatory part of the connected synapses. The MJJs whose weights will be strengthened are circled in red.


The combination of clustered MJJs that can mimic the functionality of the synapse and normal Josephson junctions that can mimic the functionality of a neuron allows for a powerful device set for implementing neuromorphic circuits. Neuromorphic systems based on this hardware platform have clear advantages in speed and power consumption compared to other systems. In addition, the substantial amount of work that has been done for digital superconducting circuit design and fabrication can also be leveraged to greatly accelerate the time from proof of concept to functioning circuits. The previous modeling section has illustrated the ability to model the functionality of the various circuit elements that will be needed to construct a larger scale neuromorphic system. The fact that the Josephson junction models used in the SPICE simulations have already been proven to be accurate for digital circuit design is a substantial benefit.

Building a truly neuromorphic system that takes advantage of the spike-timing-dependent dynamics and the plasticity demonstrated by the clustered MJJs is the ultimate goal; demonstrating functional circuits composed of these devices is the first step in that direction. We can leverage existing simulation tools to explore circuit architectures much more rapidly than by testing designs only after fabrication. Given circuit elements that mimic the neuron and the synapse, we can build circuits that perform the general artificial neural network functionality of Eq. (11). This functionality is at the heart of the software-based neural networks that are now best in class for applications such as image recognition1 and internet search,68 and that are starting to show great promise for medical diagnostics.69 We have shown in SPICE simulations that we can implement a feed-forward neural network directly with these devices in a compact and efficient design.70 
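Eq. (11) is not reproduced in this section; assuming it has the standard weighted-sum-and-activation form y_j = f(Σ_i w_ji x_i + b_j), the functionality being implemented in hardware reads, in software terms:

```python
import math

def neuron_layer(x, w, b):
    """One feed-forward layer: y[j] = f(sum_i w[j][i] * x[i] + b[j]).

    In the MJJ hardware, the weights play the role of the synaptic
    critical currents and f of the thresholding Josephson junction
    neuron; the sigmoid here is purely illustrative.
    """
    return [1.0 / (1.0 + math.exp(-(sum(wji * xi for wji, xi in zip(wj, x)) + bj)))
            for wj, bj in zip(w, b)]
```

Layers compose by feeding each output list in as the next layer's x, which is how deeper feed-forward architectures are built from this single step.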

In the example network shown in Fig. 20, we implement a 9-pixel, 3-class feed-forward network.71 We trained this network on the example problem previously demonstrated with a memristor implementation, which was used to recognize three distinct letters (z, v, or n) with any one pixel of input noise.12 Since the example has no separate training and test data, 100% accuracy is easily achievable. To test our implementation, we first trained a standard software neural network in Python using back-propagation techniques.72 We then used a linear mapping of the weight values in the Python network onto the Ic values of the MJJs in our SPICE simulation. With this mapping, we achieved 100% accuracy on the full dataset of noisy 9-pixel letters. Furthermore, because the propagation speed of SFQ pulses is roughly 1/3 the speed of light in vacuum, the network can respond very quickly: in our unoptimized simulations, we could input a new image vector every 3 ns.70 
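The weight transfer amounts to an affine rescaling from the trained software weights onto the device critical currents. A minimal sketch, with a hypothetical Ic range standing in for the actual device values used in the simulations:

```python
def weights_to_ic(weights, ic_min=10e-6, ic_max=100e-6):
    """Linearly map trained software weights onto MJJ critical currents (A).

    The 10-100 uA range is a placeholder, not the range used in the
    actual SPICE simulations; only the affine mapping is the point.
    """
    w_lo, w_hi = min(weights), max(weights)
    span = (w_hi - w_lo) or 1.0  # avoid division by zero for constant weights
    return [ic_min + (w - w_lo) / span * (ic_max - ic_min) for w in weights]
```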

FIG. 20.

Schematic of the feed forward neural network that was implemented in SPICE simulations to solve a 9-pixel three class example; this network requires 12 Josephson junctions and 27 MJJs.


Because of the structure used, the output neurons of the model can be used as inputs to a subsequent layer; in general, we can directly implement conventional deep learning architectures in this hardware platform. The key challenge in mapping software neural networks onto this hardware will be the large fan-in and fan-out that is often required. However, a feed-forward architecture is naturally separated into layers, which greatly simplifies the connectivity problem. Fan-out in these circuits can be implemented with pulse-repeating stages in architectures where the neurons have a fixed spike amplitude. We have simulated fan-out of 1 to 3, which is typical even in digital superconducting circuits; as discussed above, nesting such fan-out stages to reach larger fan-out would require substantial overhead. We have also simulated fan-in of 9 to 1, which is quite promising and implies that this side of the inter-layer connectivity will be less of an issue. This matters because the fan-in from synaptic layers carries weighted spikes that would otherwise require weighted repeating nodes. Given the structure of modern feed-forward neural networks, the additional overhead in Josephson junction count and wiring complexity will be one of the main challenges for this technology for the foreseeable future.
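As a rough illustration of that fan-out overhead, suppose large fan-out is built from a tree of 1-to-2 SFQ pulse splitters (a standard SFQ technique, workable here because the neurons spike with fixed amplitude). This is a back-of-the-envelope sketch, not a description of our simulated circuits:

```python
import math

def splitter_tree_cost(fan_out):
    """Junction-count and latency overhead of fan-out via binary splitters.

    A tree driving `fan_out` targets needs fan_out - 1 one-to-two
    splitters, while the added latency grows only with the tree depth,
    ceil(log2(fan_out)).
    """
    splitters = max(fan_out - 1, 0)
    depth = math.ceil(math.log2(fan_out)) if fan_out > 1 else 0
    return splitters, depth
```

For the biological fan-out of >1000 mentioned in the introduction, this already implies on the order of a thousand splitters per neuron, which is the substantial overhead referred to above.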

Deep neural nets have been successful in many tasks. However, current implementations are largely limited by the software algorithms used for training and by the inherent serial nature of the CMOS processors used to implement the algorithms. Here, we have focused on hybrid magnetic/superconducting devices that naturally output voltage spikes and exhibit plasticity due to embedded nanoscale magnetic structures. These devices have the advantage of very low energy (<10−20 J/spike) and very fast operating speeds (>100 GHz) while naturally utilizing low-loss (negligible dispersion up to 1 cm), high-speed (∼108 m/s) superconducting communication.

Both the magnetic and superconducting technologies have already been proven to scale well by leveraging standard semiconductor fabrication techniques. On the magnetic device side, large non-volatile memories based on magnetic tunnel junctions have been commercialized. On the superconducting device side, high speed microprocessors and communication systems have been developed, and large arrays of Josephson junctions are used to create the Josephson voltage standard. The scalability in size and energy and the proven manufacturability of these devices are a large advantage compared to many other emerging devices.

We have demonstrated nanoclustered magnetic Josephson junctions with low-energy, high-speed plasticity, which in turn enables a natively neuromorphic Josephson-junction-based technology. We have used proven simulation tools to model networks of these devices and have mapped them onto a standard neural network architecture. In this context, they have the potential to be faster and more energy efficient than the same network implemented in software, even when run on specialized CMOS hardware such as a tensor processing unit. In addition, because of the high speed and low energy consumption, extending these devices to large, truly complex spiking networks has a promising outlook. Next-generation spiking neural networks can take advantage of spike-timing dependence and the dynamics of the plasticity, which may enable new functionality beyond that currently associated with software neural networks. This neuromorphic technology may be well suited to much more complex, highly recurrent networks whose learned intelligence goes well beyond currently utilized feed-forward networks.

We acknowledge the technical and intellectual contributions from B. Baek, M. R. Pufall, P. F. Hopkins, P. D. Dresselhaus, S. P. Benz, and W. H. Rippard. We acknowledge NIST for funding this research and the IARPA C3 program for initial funding of the clustered MJJ work. Contribution of the National Institute of Standards and Technology, not subject to copyright.

1. Y. LeCun, Y. Bengio, and G. Hinton, Nature 521(7553), 436–444 (2015).
2. C. Mead, Analog VLSI and Neural Systems (Addison-Wesley, Reading, MA, 1989).
4. P. Sterling and S. Laughlin, Principles of Neural Design (MIT Press, Cambridge, MA, 2015).
5. C. Diorio, P. Hasler, B. A. Minch, and C. Mead, in Neuromorphic Systems Engineering: Neural Networks in Silicon, edited by T. S. Lande (Springer US, Boston, MA, 1998), pp. 315–337.
6. F. D. Broccard, S. Joshi, J. Wang, and G. Cauwenberghs, J. Neural Eng. 14(4), 041002 (2017).
7. P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, B. Brezzo, I. Vo, S. K. Esser, R. Appuswamy, B. Taba, A. Amir, M. D. Flickner, W. P. Risk, R. Manohar, and D. S. Modha, Science 345(6197), 668 (2014).
8. S. B. Furber, F. Galluppi, S. Temple, and L. A. Plana, Proc. IEEE 102(5), 652–665 (2014).
9. B. V. Benjamin, P. Gao, E. McQuinn, S. Choudhary, A. R. Chandrasekaran, J. M. Bussat, R. Alvarez-Icaza, J. V. Arthur, P. A. Merolla, and K. Boahen, Proc. IEEE 102(5), 699–716 (2014).
10. J. Schemmel, D. Brüderle, A. Grübl, M. Hock, K. Meier, and S. Millner, in Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS), 30 May–2 June 2010 (IEEE, Paris, France), pp. 1947–1950.
11. M. D. Pickett, G. Medeiros-Ribeiro, and R. S. Williams, Nat. Mater. 12(2), 114–117 (2013).
12. M. Prezioso, F. Merrikh-Bayat, B. D. Hoskins, G. C. Adam, K. K. Likharev, and D. B. Strukov, Nature 521(7550), 61–64 (2015).
13. J. J. Yang, M. X. Zhang, J. P. Strachan, F. Miao, M. D. Pickett, R. D. Kelley, G. Medeiros-Ribeiro, and R. S. Williams, Appl. Phys. Lett. 97(23), 232102 (2010).
14. W. Xu, S. Y. Min, H. Hwang, and T. W. Lee, Sci. Adv. 2(6), e1501326 (2016).
15. Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, and M. Soljačić, Nat. Photonics 11, 441 (2017).
16. A. N. Tait, T. F. de Lima, E. Zhou, A. X. Wu, M. A. Nahmias, B. J. Shastri, and P. R. Prucnal, Sci. Rep. 7(1), 7430 (2017).
17. K. Y. Camsari, R. Faria, B. M. Sutton, and S. Datta, Phys. Rev. X 7(3), 031014 (2017).
18. J. Torrejon, M. Riou, F. A. Araujo, S. Tsunegi, G. Khalsa, D. Querlioz, P. Bortolotti, V. Cros, K. Yakushiji, A. Fukushima, H. Kubota, S. Yuasa, M. D. Stiles, and J. Grollier, Nature 547(7664), 428–431 (2017).
19. A. Mizrahi, T. Hirtzlin, A. Fukushima, H. Kubota, S. Yuasa, J. Grollier, and D. Querlioz, Nat. Commun. 9(1), 1533 (2018).
20. A. Sengupta and K. Roy, Appl. Phys. Express 11(3), 030101 (2018).
21. C. D. Schuman, in 2017 International Joint Conference on Neural Networks (IJCNN), 19 April 2017 (IEEE, Anchorage, AK, USA, 2017), pp. 2636–2643.
22. M. Davies, N. Srinivasa, T. H. Lin, G. Chinya, Y. Cao, S. H. Choday, G. Dimou, P. Joshi, N. Imam, S. Jain, Y. Liao, C. K. Lin, A. Lines, R. Liu, D. Mathaikutty, S. McCoy, A. Paul, J. Tse, G. Venkataramanan, Y. H. Weng, A. Wild, Y. Yang, and H. Wang, IEEE Micro 38(1), 82–99 (2018).
23. D. Apalkov, B. Dieny, and J. M. Slaughter, Proc. IEEE 104(10), 1796–1830 (2016).
24. S. K. Tolpygo, Low Temp. Phys. 42(5), 361–379 (2016).
25. S. P. Benz and C. A. Hamilton, Proc. IEEE 92(10), 1617–1629 (2004).
26. G. Wendin, Rep. Prog. Phys. 80(10), 106001 (2017).
27. J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108(5), 1175–1204 (1957).
28. M. Tinkham, Introduction to Superconductivity (Dover, New York, 1996).
29. T. Van Duzer and C. W. Turner, Principles of Superconductive Devices and Circuits, 2nd ed. (Prentice Hall, Upper Saddle River, NJ, 1999).
30. B. D. Josephson, Phys. Lett. 1(7), 251–253 (1962).
31. R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, Volume III: Quantum Mechanics (Addison-Wesley, New York, 1965).
32. J. A. Blackburn, M. Cirillo, and N. Grønbech-Jensen, Phys. Rep. 611, 1–33 (2016).
33. M. L. Schneider, C. A. Donnelly, S. E. Russek, B. Baek, M. R. Pufall, P. F. Hopkins, P. D. Dresselhaus, S. P. Benz, and W. H. Rippard, Sci. Adv. 4(1), e1701329 (2018).
34. R. L. Kautz, J. Appl. Phys. 49(1), 308–314 (1978).
35. M. M. T. M. Dierichs, B. J. Feenstra, A. Skalare, C. E. Honingh, J. Mees, H. V. D. Stadt, and T. D. Graauw, Appl. Phys. Lett. 63(2), 249–251 (1993).
36. P. Bunyk, K. Likharev, and D. Zinoviev, Int. J. High Speed Electron. Syst. 11(1), 257–305 (2001).
37. K. K. Likharev and V. K. Semenov, IEEE Trans. Appl. Supercond. 1(1), 3–28 (1991).
38. K. Segall, M. LeGro, S. Kaplan, O. Svitelskiy, S. Khadka, P. Crotty, and D. Schult, Phys. Rev. E 95(3), 032220 (2017).
39. P. Crotty, D. Schult, and K. Segall, Phys. Rev. E 82(1), 8 (2010).
40. T. Onomi and K. Nakajima, in 11th European Conference on Applied Superconductivity, edited by S. Farinon, I. Pallecchi, A. Malagoli, and G. Lamura (IOP Publishing Ltd, Bristol, 2014), Vol. 507.
41. Y. Yamanashi, K. Umeda, and N. Yoshikawa, IEEE Trans. Appl. Supercond. 23(3), 1701004 (2013).
42. A. L. Hodgkin and A. F. Huxley, J. Physiol. 117(4), 500–544 (1952).
43. S. Holmes, A. L. Ripple, and M. A. Manheimer, IEEE Trans. Appl. Supercond. 23(3), 10 (2013).
44. E. O. Neftci, B. U. Pedroni, S. Joshi, M. Al-Shedivat, and G. Cauwenberghs, Front. Neurosci. 10, 241 (2016).
45. V. Ginzburg, Sov. Phys. JETP 4(2), 153–160 (1957) [Zh. Eksp. Teor. Fiz. 31, 202–214 (1956)].
46. B. Matthias, H. Suhl, and E. Corenzwit, Phys. Rev. Lett. 1(3), 92 (1958).
47. P. Fulde and R. A. Ferrell, Phys. Rev. 135(3A), A550 (1964).
48. A. Larkin and Y. N. Ovchinnikov, Sov. Phys. JETP 20, 762–769 (1965) [Zh. Eksp. Teor. Fiz. 47, 1136–1146 (1964)].
49. A. I. Buzdin, Rev. Mod. Phys. 77(3), 935–976 (2005).
50. F. Bergeret, A. L. Yeyati, and A. Martin-Rodero, Phys. Rev. B 72(6), 064524 (2005).
51. M. Eschrig, Phys. Today 64(1), 43 (2011).
52. M. G. Blamire and J. W. Robinson, J. Phys.: Condens. Matter 26(45), 453201 (2014).
53. J. Linder and J. W. Robinson, Nat. Phys. 11(4), 307 (2015).
54. A. Feofanov, V. Oboznov, V. Bol’ginov, J. Lisenfeld, S. Poletto, V. Ryazanov, A. Rossolenko, M. Khabipov, D. Balashov, and A. Zorin, Nat. Phys. 6(8), 593 (2010).
55. F. Bergeret, A. F. Volkov, and K. B. Efetov, Rev. Mod. Phys. 77(4), 1321 (2005).
56. C. Bell, G. Burnell, C. W. Leung, E. J. Tarte, D. J. Kang, and M. G. Blamire, Appl. Phys. Lett. 84(7), 1153–1155 (2004).
57. E. A. Demler, G. B. Arnold, and M. R. Beasley, Phys. Rev. B 55(22), 15174–15182 (1997).
58. A. I. Buzdin, L. Bulaevskii, and S. Panyukov, Pis’ma Zh. Eksp. Teor. Fiz. 35(4), 147 (1982) [JETP Lett. 35, 178–180 (1982)].
59. V. V. Ryazanov, V. A. Oboznov, A. Y. Rusanov, A. V. Veretennikov, A. A. Golubov, and J. Aarts, Phys. Rev. Lett. 86(11), 2427–2430 (2001).
60. T. Kontos, M. Aprili, J. Lesueur, F. Genet, B. Stephanidis, and R. Boursier, Phys. Rev. Lett. 89(13), 137007 (2002).
61. I. Žutić, J. Fabian, and S. Das Sarma, Rev. Mod. Phys. 76(2), 323–410 (2004).
62. T. Dietl and H. Ohno, Rev. Mod. Phys. 86(1), 187–251 (2014).
63. T. Dietl and H. Ohno, Mater. Today 9(11), 18–26 (2006).
64. V. Ko, K. Teo, T. Liew, T. Chong, M. MacKenzie, I. MacLaren, and J. Chapman, J. Appl. Phys. 104(3), 033912 (2008).
65. S. E. Russek, C. A. Donnelly, M. L. Schneider, B. Baek, M. R. Pufall, W. H. Rippard, P. F. Hopkins, P. D. Dresselhaus, and S. P. Benz, in 2016 IEEE International Conference on Rebooting Computing (ICRC), 17–19 Oct. 2016 (IEEE, San Diego, CA, 2016), pp. 1–5.
66. B. Baek, W. H. Rippard, S. P. Benz, S. E. Russek, and P. D. Dresselhaus, Nat. Commun. 5, 3888 (2014).
67. C. D. Tesche and J. Clarke, J. Low Temp. Phys. 29(3–4), 301–331 (1977).
68. J. Clark, Bloomberg News (2015).
69. A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, Nature 542(7639), 115 (2017).
70. M. L. Schneider, S. E. Russek, W. H. Rippard, and M. R. Pufall, U.S. patent application 15/722,508 (2017).
71. M. L. Schneider, C. A. Donnelly, S. E. Russek, B. Baek, M. R. Pufall, P. F. Hopkins, and W. H. Rippard, in 2017 IEEE International Conference on Rebooting Computing (ICRC), 8–9 Nov. 2017 (IEEE, Washington, DC, 2017), pp. 1–4.
72. M. A. Nielsen, Neural Networks and Deep Learning (Determination Press, 2015).