Neural networks have revolutionized the area of artificial intelligence and introduced transformative applications to almost every scientific field and industry. However, this success comes at a great price; the energy requirements for training advanced models are unsustainable. One promising way to address this pressing issue is by developing low-energy neuromorphic hardware that directly supports the algorithm's requirements. The intrinsic non-volatility, non-linearity, and memory of spintronic devices make them appealing candidates for neuromorphic devices. Here, we focus on the reservoir computing paradigm, a recurrent network with a simple training algorithm suitable for computation with spintronic devices since they can provide the properties of non-linearity and memory. We review technologies and methods for developing neuromorphic spintronic devices and conclude with critical open issues to address before such devices become widely used.
I. INTRODUCTION
Neural networks are widely used across various sectors to perform challenging data analysis tasks, but the high energy cost of training increasingly complex models is an escalating problem. For example, training a state-of-the-art Transformer model with 213M parameters (including neural architecture search) produced CO2 emissions of 626 155 lbs, whereas driving a car with average fuel consumption for one lifetime produces only 126 000 lbs.1 One solution to the energy issue is to create new hardware platforms for neuromorphic computation using functional materials that intrinsically perform the required computation, potentially achieving greater efficiency than conventional CMOS approaches that merely simulate it. Recurrent neural networks (RNNs) are inspired by the high interconnectivity of biological systems and are a potent tool for tasks involving complex temporal data sequences. However, their temporal interconnectivity requires complex training methods. Such methods are computationally expensive and challenging to implement on hardware. The reservoir computing (RC) paradigm provides a solution using an RNN with fixed, random synaptic weights (the reservoir), which transforms inputs into higher dimensional representations before passing them to a single feed-forward output layer. The weights of this output layer can be calculated by minimizing an error function defined, for instance, as the squared difference between the desired and the predicted output. The output layer contains no temporal dependencies, and thus, training becomes relatively trivial. Ultimately, the reservoir does not need to be a neural network; it can be any suitable non-linear system that exhibits hysteresis. RC is particularly well-suited to neuromorphic hardware-based implementations.
Since the learning process does not interfere with the reservoir dynamics, we may use any material device that provides appropriately complex dynamics and memory in the place of a neural network reservoir.
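To make the paradigm concrete, the following is a minimal software sketch of a reservoir with fixed, random weights. It is an illustration only and is not taken from any of the works cited here: the tanh non-linearity, the spectral-radius rescaling, the network size, and the function name `run_reservoir` are all assumptions chosen for brevity.

```python
import numpy as np

def run_reservoir(inputs, n_nodes=50, spectral_radius=0.9, seed=0):
    """Drive a fixed random recurrent network and return its state history."""
    rng = np.random.default_rng(seed)
    # Fixed random input and recurrent weights: these are never trained.
    w_in = rng.uniform(-1.0, 1.0, size=n_nodes)
    w_res = rng.normal(size=(n_nodes, n_nodes))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius,
    # a common heuristic (assumed here) for obtaining fading memory.
    w_res *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w_res)))

    state = np.zeros(n_nodes)
    states = np.empty((len(inputs), n_nodes))
    for t, u in enumerate(inputs):
        # Non-linear update mixing the new input with the previous state.
        state = np.tanh(w_res @ state + w_in * u)
        states[t] = state
    return states

# Example: a scalar sine input is expanded into a 50-dimensional state.
u = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))
states = run_reservoir(u)
```

Only the mapping from `states` to the desired output is learned; in a material reservoir, the loop body would be replaced by the physical device's response.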
There are explorative reservoirs from different technologies, including photonic,2 mechanical,3 and memristive4 systems. Nanomagnetic systems have properties that make them particularly well-suited to act as reservoirs. For example, the magnetic hysteresis loop depicted in Fig. 1 shows a non-linear response (the net magnetization) to a stimulus (the applied field). Bistable remanent magnetization states, shown schematically in Fig. 1(a), can be the basis for the system's memory. Furthermore, in extended systems, interactions between moments give rise to a wealth of magnetization textures with complex dynamics that provide a rich playground to explore novel devices. Some example textures are shown in Figs. 1(b)–1(d), showing magnetic domain wall (DW), skyrmion, and artificial spin ice (ASI) systems. Short-range exchange and longer-range magnetostatic interactions offer in materia pathways to creating reservoirs with multiple physical nodes without the need for complex material synapses between nodes. The historical use of magnetic materials in hard-disk drives, sensors, and random access memories means integration with CMOS and techniques for reading (e.g., magnetoresistance effects) and writing data (e.g., magnetic fields and spin torque effects) are also well-established.
Magnetic hysteresis loop and complex magnetic textures. (a) Bistable hysteresis loop where the applied magnetic field (H) controls the orientation of the magnetization (M). (b)–(d) Examples of complex magnetic textures: domain walls, skyrmions, and artificial spin ice.
In this perspective, we will first review the approaches for creating in materia reservoirs using nanomagnetic materials and their various strengths and weaknesses. We will discuss the most common training methods that map their physical behaviors into meaningful data outputs. Next, we will discuss simulation tools that can assist in exploring the feasibility of reservoir computing with different magnetic systems and the characterization methods and benchmark problems commonly used to establish computational capability. Finally, we present some key challenges in the field and potential approaches to address these.
II. MATERIALS AND DEVICES
Due to their attractive properties, several nanomagnetic systems have been deemed suitable as reservoirs. These systems include spin torque oscillators (STOs),5–9 spin ice arrays,10–15 skyrmion textures,9,16,17 super-paramagnetic arrays,18 magnonic systems,19 and domain wall devices.20,21 Most studies are in simulations, although some demonstrations of RC with real devices have been performed, providing important evidence of real-world feasibility.5,15,19
In general, nanomagnetic reservoirs can be classified based on several characteristics (e.g., energy consumption, operating speed, and device size). Here, we introduce a taxonomy that classifies proposed devices by (a) input/output dimensionality (IOD) and (b) dynamical response (DR) (Fig. 2).
Classification of magnetic reservoir proposals by input/output dimensionality (IOD) and Dynamical Response (DR). IOD: IOD-1D—single dynamical node and IOD-N—multiple spatial dimensions. DR: DR-D—driven by external clock stimulus, DR-T—dynamics governed by thermal activation/relaxation, and DR-LLG—dynamics governed by LLG equation. References in bold red text are experimental demonstrations; all other references are simulation-based demonstrations. Red arrows represent systems where the state-of-the-art is an IOD-1D demonstration, but there are clear in materia approaches available to create IOD-N reservoirs. Key: STOs—spin torque oscillators, DWO—domain wall oscillators, SkyrOsc—skyrmion oscillator, SPE—super-paramagnet ensemble, NRE—nanoring ensemble, Magnonic—magnonic reservoir, SkyTex—skyrmion texture, and ASI—artificial spin ice.
For IOD, RC requires multiple outputs from the reservoir (i.e., simultaneous measures of reservoir state) and benefits from multiple, simultaneous data inputs. Many devices proposed for use in RC are simple dynamical nodes with only a single input and output (IOD-1D). To use IOD-1D as reservoirs, we must expand the dimensionality of input and output data by using time-multiplexing techniques,22 an approach often referred to as “delay line” RC. However, other proposed devices consist of many spatially distributed, interacting elements/regions. These naturally possess N dimensional state vectors and, thus, offer an in materia pathway to defining multiple input and output dimensions (IOD-N). Reservoirs containing multiple non-interacting devices can also be powerful, providing that each device offers a different non-linear mapping of input signals.8
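The time-multiplexing idea can be sketched in a few lines. In the sketch below, each input sample is multiplied by a fixed mask and fed serially to a single non-linear node, whose responses in successive time slots act as "virtual nodes". The node model (a leaky tanh element with delayed feedback), the binary mask, and all parameter values are hypothetical stand-ins for a real device and are not drawn from Ref. 22.

```python
import numpy as np

def time_multiplex(inputs, n_virtual=20, feedback=0.5, leak=0.3, seed=0):
    """Expand a scalar input stream into n_virtual outputs using one node."""
    rng = np.random.default_rng(seed)
    mask = rng.choice([-1.0, 1.0], size=n_virtual)  # fixed input mask
    states = np.zeros((len(inputs), n_virtual))
    prev_period = np.zeros(n_virtual)  # node responses one delay period ago
    x = 0.0                            # state of the single physical node
    for k, u in enumerate(inputs):
        for i in range(n_virtual):
            # Masked input plus delayed feedback from the same time slot.
            drive = mask[i] * u + feedback * prev_period[i]
            # The leak gives the node inertia, coupling neighbouring slots.
            x = leak * x + (1.0 - leak) * np.tanh(drive)
            states[k, i] = x
        prev_period = states[k].copy()
    return states

u = np.sin(np.linspace(0.0, 4.0 * np.pi, 100))
virtual_states = time_multiplex(u)
```

Each row of `virtual_states` plays the role of an N-dimensional reservoir state, at the cost of processing each input sample `n_virtual` times.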
For DR, many proposed magnetic reservoirs exploit the damped, oscillatory motion of individual magnetic moments, as described by the Landau–Lifshitz–Gilbert (LLG) equation of motion (DR-LLG). These dynamics have high MHz–THz frequencies and ns decay times for ferromagnetic materials, making them well-suited to high-speed data processing applications. RC is also ideal for real-time signal processing, where reservoir timescales must match external signals with low or high frequencies. However, as the dynamics of DR-LLG systems occur on nanosecond timescales, they are too fast for many real-time tasks; one must use external electronics to “speed-up” data input or improve long-term dependencies via delay lines. Effectively, we treat the magnetic devices as non-linear activation functions with short-term temporal dependencies.6
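For reference, the LLG equation is commonly written in the Gilbert form below. The symbols are not defined in the text above and are introduced here as assumptions of notation: m is the unit magnetization vector, H_eff the effective field, γ the gyromagnetic ratio, μ0 the vacuum permeability, and α the dimensionless Gilbert damping parameter.

```latex
\frac{\partial \mathbf{m}}{\partial t}
  = -\gamma \mu_0 \, \mathbf{m} \times \mathbf{H}_{\mathrm{eff}}
  + \alpha \, \mathbf{m} \times \frac{\partial \mathbf{m}}{\partial t}
```

The first term describes precession of the moment about the effective field, and the second describes the damping that relaxes it toward the field direction, which together give the damped, oscillatory motion discussed above.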
Other magnetic devices do not naturally relax their state without applied stimuli; external clocking stimuli determine the timescales of these dynamically driven (DR-D) systems. By choosing the clock frequency, these systems can operate at any timescale longer than intrinsic magnetization dynamics. Therefore, they are naturally well-suited to real-time data analysis but may be less energy efficient than DR-LLG devices. A final class of reservoirs directly exploits thermally activated magnetization dynamics to provide transitions between magnetic states (DR-T). These are interesting as they directly exploit aggregated thermal effects to increase energy efficiency, whereas, in most device proposals, thermal effects introduce stochasticity, reducing performance in computational tasks. Furthermore, as the timescales of thermal activation can be changed dramatically (down to ∼tens of nanoseconds23) by changing the size of the systems' energy barriers, it should be possible to tune these systems' dynamics to be compatible with a variety of real-time tasks. However, their stability to variations in operating temperature requires careful exploration.
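The sensitivity of thermally activated timescales to barrier height can be illustrated with the standard Néel-Arrhenius law, τ = τ0 exp(ΔE / kB T). The sketch below is a worked example under assumptions that do not come from the text: an attempt time τ0 of 1 ns and barrier heights chosen purely for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def relaxation_time(barrier, temperature, attempt_time=1e-9):
    """Neel-Arrhenius mean switching time in seconds.

    barrier is the energy barrier in joules; attempt_time (tau_0) is an
    assumed value, since it depends on the material and geometry.
    """
    return attempt_time * math.exp(barrier / (K_B * temperature))

T = 300.0  # operating temperature in kelvin (assumed)
# Barriers expressed as multiples of the thermal energy k_B * T.
times = {n: relaxation_time(n * K_B * T, T) for n in (3, 20, 40)}
```

With these assumed values, a barrier of 3 kB T gives a timescale of roughly 20 ns and a barrier of 20 kB T roughly 0.5 s. The same exponential dependence means that a change in operating temperature at fixed barrier height shifts the timescale strongly, which is why temperature stability needs careful study.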
A. Nanomagnetic oscillators
Spin torque oscillators24 (STOs) (IOD-1D, DR-LLG) use the same magnetic tunnel junction (MTJ) technology that forms the basis of contemporary magnetic random-access memory (MRAM) devices.25 At the most basic level, MTJs consist of two thin ferromagnetic layers separated by a thin insulating barrier in a “spin valve” configuration. One of the ferromagnetic layers is free to change its magnetization direction (free layer). The other is “pinned” into a fixed state (pinned layer) by an adjacent antiferromagnetic layer. Passing a DC electrical current through the multilayer excites oscillation of the magnetization direction of the free layer due to spin torque effects,26,27 with frequencies in the range of hundreds of MHz to tens of GHz, depending on the details of the oscillator's design and the stimuli applied to it. When the free layer magnetization oscillates, it produces oscillations in the electrical resistance of the MTJ via the tunnel magnetoresistance (TMR) effect. These oscillations can be detected as voltage signals with amplitudes as large as tens of mV.28 The amplitude of STO oscillations varies non-linearly with current and typically decays over timescales of ∼hundreds of ns.5
Torrejon et al.5 demonstrated RC experimentally with a single sub-micrometer STO device, following the time-multiplexed approach of Appeltant et al.22 Input signals are given to the STO by modulating the amplitude of the DC driving current, with the readout being the power output of the STO. Using this approach, the authors achieved state-of-the-art performance when classifying spoken digits from the TI-46 database.29 Alternative input and output approaches (e.g., frequency modulated input and phase modulated output) can also create richer reservoir transformations and improve performance in tasks.7
STOs have many attractive properties. Foremost among these is that MTJs are a well-established commercial technology and are fully compatible with conventional CMOS platforms, providing a clear path to the realization of devices. Furthermore, they can be scaled down substantially from the sub-micrometer dimensions studied by Torrejon et al. to ∼10 nm, creating device designs that are both dense and energy efficient (∼1 μW per STO).
While recent demonstrations have focused on time-multiplexed RC schemes, interconnections between STOs allow them to couple to each other,30,31 potentially facilitating N-dimensional reservoirs. Current approaches to neuromorphic computation with STOs have used external electrical interconnects to achieve this.32 Still, STOs can interact/synchronize via magnetic interactions,31,33 allowing for simpler and more elegant device designs.
Other types of magnetic oscillators can also be used as reservoirs. Ababei et al. used simulations to show that a single magnetic domain wall (DW) oscillating within a geometrically defined potential well in a nickel nanowire can create a reservoir capable of classifying a variety of different signals21 (IOD-1D and DR-LLG). In this approach, the DW's dynamics are dictated by device geometry and, therefore, should be highly tunable. Furthermore, DWs naturally produce monopole-like magnetic fields,34,35 allowing inter-device interactions to expand reservoir dimensionality. In a similar modeling study, Jiang et al. use the dynamics of a single magnetic skyrmion (i.e., a topologically protected “bubble” of non-uniform magnetization) within a geometrically defined potential to make an effective reservoir9 (IOD-1D and DR-LLG).
B. Magnonic systems
When driven at microwave frequencies, magnetic materials exhibit phase-coherent collective excitations known as spin waves (SWs), the quasiparticle of which is the magnon. The frequencies of SWs depend strongly on both material properties and induced magnetic anisotropies imposed by the system's geometry. The magnetic damping parameter of a material quantifies how efficiently SWs dissipate into the lattice and must be minimized by using materials such as Permalloy (NiFe) or yttrium iron garnet (YIG)36 to reduce losses. Boundaries and interfaces within a material allow for complex SW interference patterns to form, akin to reservoir work involving the pattern of water waves in a bucket.37 This high degree of tunability provides a rich parameter space for useful computation. At the same time, the intrinsic spatial variation of interference effects makes spin waves an ideal phenomenon for developing IOD-N reservoirs. As these approaches directly exploit magnetization dynamics, they all have class DR-LLG.
Papp et al.38 used micromagnetic simulations to characterize the computational potential of a simulated SW reservoir based on a film of YIG (IOD-N, DR-LLG) using task agnostic metrics. Modulating an RF excitation from a waveguide on one side of the film provides the input. The output is the time-averaged signal response at points across the system. Patterned dots of material with perpendicular magnetic anisotropy (PMA) on the surface of the YIG provided a non-uniform magnetic field, which locally altered the SW dispersion, resulting in a non-linear response. The system's response strongly depends on the regime at which the SWs were driven. For example, too high an input excitation would drive the system toward chaos. Nakane et al. suggest that magnetoelastic effects in multiferroic systems could provide energy-efficient excitation of spin wave reservoirs.39–41
In another simulation-based study, Dale et al. explored the limits of magnonic RC42 by considering thin films of Co, Fe, and Ni with ∼100 nm lateral dimensions (IOD-N and DR-LLG). These were split into a regular grid of up to 900 nodes, each 5 × 5 nm² in size, which were excited with local magnetic fields for data input, with the local 3D magnetization state of each node providing the output. SWs reflect from edges, forming interference patterns that provide a complex, transient transformation of input data. For larger numbers of nodes at 0 K, the system achieves impressive task-agnostic metric scores (see Sec. V) and an error of about 1% for a NARMA-30 task. As expected, the introduction of temperature to the simulation drastically reduced performance. Experimental realization of an equivalent device would be highly challenging, and cooling devices to cryogenic temperatures is unlikely to be energy efficient. Hence, further work is required to explore device designs that are feasible to fabricate and robust to higher temperatures.
Physical devices based on SWs are challenging to realize, partly due to devices operating at non-zero temperatures, which can alter magnon behavior.43 Watt et al. experimentally demonstrated an SW-based system with a time-multiplexed active ring resonator approach19,44,45 (IOD-1D and DR-LLG). The system consists of two antennas on each side of a strip of YIG: one to excite SWs and the other to detect them. The amplified microwave output signal is fed back into the input antenna to shift the phase of the frequencies within the YIG. An increase in gain stabilizes the SWs, until the threshold at which chaotic behavior occurs. This time-delayed transition to a steady-state condition acts as a fading memory within the system without needing external time-delayed input.44
Magnonic systems provide a potential platform for fast, low-power reservoir computing. However, they require high-quality growth of insulating magnetic films such as YIG and may show the best performance at low temperatures. Further work, particularly on experimental SW-based devices, is needed to explore their potential fully.
C. Artificial spin ice systems
Artificial spin ice (ASI) arrays consist of magnetically bistable nanoscale islands of soft magnetic materials (e.g., permalloy) arranged into tightly spaced, periodic lattices of various geometries.46 Magnetostatic fields created by the elements in these lattices mean that any given nanomagnet's free energy depends strongly on its magnetization's direction relative to its neighbors. Thus, the physics of ASIs is emergent, with complex collective behaviors deriving from simple interactions at an array's vertices. They provide a rich playground to explore various physical phenomena, including phase transitions, emergent magnetic monopoles, and magnetic frustration. Dynamics in these experiments are typically driven by applying external magnetic fields or directly heating the arrays. Studies have explored a wide range of geometries, including, for example, square lattices,47 kagome lattices,48 and pinwheel lattices.49 Fully connected ASIs can also be created where exchange interactions mediate interactions between vertices, and switching occurs by the propagation of DWs.50
ASIs are particularly effective systems for RC. They consist of large numbers of spatially distributed elements that interact strongly with their neighbors without the need for layers of interconnects, offering a natural platform for realizing IOD-N reservoirs. Their complex and highly tunable dynamics (e.g., via their large geometric phase space) promise a wealth of non-linear transforms of input data. Their dynamics are typically “clocked” by external stimuli, making them examples of DR-D systems.
Initial simulation-based studies by Jensen et al. show that the large binary state space of ASIs can be fully exploited computationally10 and that even subsampled representations of the magnetic state retain substantial computational power when used as outputs11 (IOD-N and DR-D). Other simulation studies have demonstrated that data can be input using the configurations of individual islands or small groups of islands.12–14 These studies provide strong evidence that the large numbers of interacting, binary degrees of freedom in ASIs are a genuine asset for creating IOD-N RC platforms.
There are substantial challenges to experimentally demonstrating the computational abilities of ASIs. While it is possible to envision ASIs constructed from dense arrays of individually addressable MTJs that would facilitate data input and output, the fabrication of such devices is beyond what is achievable in most research laboratories. Thus, alternative methods must be used to determine how the microstates of ASIs vary when subjected to complex field sequences. Gartside et al. have used ferromagnetic resonance measurements to “fingerprint” the microstates of an ASI.15 Their approach led to the first experimental demonstration of RC using an ASI to perform signal reconstruction and time series prediction tasks (IOD-N and DR-D). Globally applied magnetic fields were used to “clock” the ASI-based reservoirs. Still, such fields would likely be energy intensive for device-level implementations, and alternative clocking methods, e.g., spin or spin–orbit torque effects, will be necessary.
While the potential strength of ASIs as reservoirs stems from interactions between elements, Welbourne et al. have shown that collections of magnetic islands are capable of computation even in the non-interacting limit.18 In a simulation study, the authors used ensembles of voltage-controlled super-paramagnetic islands as time-multiplexed reservoirs, demonstrating high performance in both chaotic series prediction and spoken digit recognition tasks (IOD-1D and DR-T). Energy consumption was estimated to be ∼24 fJ per input, which makes the proposed devices attractive for edge-computing applications where low power consumption is vital. However, RC systems contain multiple components beyond the reservoir material itself. Further research is needed to understand how the total power consumption is related to that of the reservoir itself.
D. Skyrmion and domain wall ensembles
Magnetic nanostructures can support a variety of stable, non-uniform magnetization textures. Examples of such textures are domain walls and magnetic skyrmions that exhibit complex dynamics and strong interactions when placed in close proximity.
Skyrmions are topologically protected bubble-like magnetization textures stabilized in magnetic materials that exhibit strong Dzyaloshinskii–Moriya interactions.51 These can be found in single crystal bulk magnetic materials with non-centrosymmetric lattices (e.g., MnSi52) or in thin film systems that lack inversion symmetry (e.g., Pt/Co/Ir multilayers53). Skyrmions can be displaced at relatively low current densities using spin–orbit torques and produce unique electrical signatures via the topological Hall effect.51 In extended systems, skyrmion textures/fabrics can be formed; these interpolate between particle-like individual skyrmions and complex domain structures bounded by chiral domain walls.
Pinna et al. have studied the feasibility of reservoir computing with skyrmion textures using micromagnetic simulations16 (IOD-N and DR-LLG). These were excited using spin torque effects by passing current between two electrical contacts. The readout could be either (i) a time-multiplexed sampling of the device's anisotropic magnetoresistance (AMR) or (ii) multiple spatially resolved samples of the textures' magnetization configurations. The authors showed that the device could classify sine and square waves within random sequences, provided that the dynamics of the input signals were well-matched to those of the skyrmions, which were in the GHz regime. However, there are a variety of hurdles still to be overcome for experimental realizations. Chief among these is that for temperatures above T = 100 K, thermal noise obscures the AMR signals,16 indicating a need for alternative readout mechanisms.
Dawidek et al. have proposed an alternative reservoir design that exploits stochastic interactions between domain walls in a patterned array of interconnected, micron-scale Ni80Fe20 rings.20 At remanence, each ring in the array typically contained two 180° DWs, which could be driven continuously around the rings' tracks by applying rotating magnetic fields.54 Stochastic interactions between DWs at the array's junctions led both to DWs being annihilated from the array and to new DW pairs being nucleated, with the balance of these mechanisms depending strongly on the amplitude of the rotating applied field. Thus, the array exhibited a field-dependent emergent response similar to that observed in ASIs. Averaging magnetic behavior over many rings transformed the individual rings' stochastic response into a rich, non-linear, and deterministic aggregate response.
Dawidek et al. first used a range of experimental techniques to demonstrate that the ring arrays had the basic physical properties required for reservoir computing. They then used a phenomenological model of their dynamics to demonstrate the classification of digits from the TI-46 database of spoken digits via a time-multiplexed approach, with data being input to the array using the amplitude of a continuously rotating applied field54,55 (IOD-1D and DR-D). A recent study by the same team has provided an experimental demonstration of RC with an electrically contacted ring array,56 where AMR measurements probed the states of the rings.
Interconnected ring arrays have several features that make them highly attractive as reservoirs. Like ASIs, they have numerous geometrical parameters that could tune their dynamic responses. Furthermore, as they consist of many interacting magnetic elements, they offer obvious routes to creating IOD-N reservoirs. However, data input by rotating magnetic fields is unlikely to be energy efficient, and alternative approaches exploiting, e.g., spin–orbit torques, will need to be explored.57
III. RESERVOIR TRAINING METHODS
In Sec. II, we covered a range of nanomagnetic systems suitable for reservoir computing. Here, we discuss how to train the output layer that receives the reservoir activity to solve various tasks. We present the most popular reservoir training method, known as ridge regression, which requires accumulating all training data and fitting the output weights in one step. We also mention a recent technique applicable in an “online learning” setup, where the algorithm progressively adapts its parameters as new data are collected. This technique enables the reservoir to learn tasks sequentially, which may allow its usage in lifelong learning situations.
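As an illustration of one-step training, the sketch below fits output weights with the standard closed-form ridge regression solution, W = (XᵀX + λI)⁻¹XᵀY, where X holds the recorded reservoir states and Y the targets. The function names, the added bias column, the regularization strength, and the toy data are assumptions made for this example rather than details of any cited work.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Closed-form ridge regression for the output weights."""
    # Append a constant column so the readout can learn a bias term.
    x = np.hstack([states, np.ones((states.shape[0], 1))])
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    # Solve the regularized normal equations instead of inverting gram.
    return np.linalg.solve(gram, x.T @ targets)

def predict(states, weights):
    """Apply trained output weights to reservoir states."""
    x = np.hstack([states, np.ones((states.shape[0], 1))])
    return x @ weights

# Toy data standing in for recorded reservoir states and desired outputs.
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 10))
true_w = rng.normal(size=(10, 1))
targets = states @ true_w + 0.5

weights = train_readout(states, targets)
error = np.mean((predict(states, weights) - targets) ** 2)
```

Because the reservoir itself is untouched, the same routine applies whether `states` comes from a simulated network or from measurements of a physical device.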
More recently, a sparse online learning algorithm (SpaRCe) has been proposed.60 SpaRCe introduces one threshold per neuron, which is learnable by minimizing the same cost function for the output weights. SpaRCe boosts the performance of online learning in reservoirs applied to classification problems while alleviating the issue of catastrophic forgetting. The latter is a fundamental problem in machine learning; new knowledge overrides older memories when the algorithm learns tasks sequentially. Catastrophic forgetting imposes additional challenges when considering the application of machine learning in lifelong learning scenarios and is a particularly significant problem for recurrent networks. SpaRCe performs exceptionally well in cases where the reservoir measurements are highly correlated. Since this method does not affect the reservoir dynamics, it synergizes well with in materia reservoirs. Although more time-consuming than the one-step regression, it may enable functionalities that are not possible otherwise, as it improves performance over standard “online methods” in classification problems.
Despite the recent advances in training methods, and while we consider reservoir computing a promising paradigm for in materia computing, we do not expect that single reservoirs will be able to compete with more complex structures in general. However, it is possible to achieve competitive performance for specific problems. In a comparative study60 between hierarchical reservoirs and a well-established recurrent network architecture known as long short-term memory (LSTM) with the same number of learnable parameters, the reservoirs achieved better performance in the permuted sequential MNIST task. The reservoir learning rule does not need to unravel dependencies in time when finding gradients, reducing the algorithmic complexity by a factor of T compared to the LSTM, where T is the length of the input signals (here 784). These advantages in terms of complexity are expected to translate to reduced energy costs.
IV. SIMULATION TOOLS
Many tools are available to model nanoscale magnetic systems, ranging from general-purpose, full-physics simulators to high-level, special-purpose phenomenological models. These tools are essential to developing magnetic RC platforms; experimental demonstrations require challenging device fabrication and subsequent high-throughput characterization of the devices' responses to large quantities of input data. Given these challenges and the wealth of systems and phenomena useful for RC, simulation-based approaches are attractive for scoping functionality.
However, simulations of RC also have their challenges. RC requires modeled devices to receive extended streams of input stimuli over long timescales, which incurs a high computational expense. Furthermore, there is usually a trade-off between the accuracy with which a simulation approach replicates physical phenomena (e.g., magnetization dynamics, the effects of temperature, and materials defects) and its computational cost. We will briefly review the different simulation approaches used to model RC in magnetic materials and discuss where they are best applied.
A. General purpose physical simulators
General-purpose physical simulators are powerful modeling software packages that can model a diverse range of nanomagnetic systems.
Atomistic solvers, such as Vampire,61 allow atomic scale simulation of magnetic materials. Magnetic moments are assumed to be localized to atomic sites, and their dynamics are modeled classically via the LLG equation. Modeling materials with this exquisite fidelity allows physically accurate simulations of thermal effects, defects, interfacial interactions, and non-uniform spin textures but at a very high computational cost; it is prohibitively costly to simulate devices with dimensions >100 nm. Consequently, atomistic models are generally poorly suited to exploring RC unless the systems in question are smaller than those we could typically study experimentally.42
Micromagnetic solvers, such as OOMMF,62 NMag,63 and MuMax3,64 model magnetization as a continuous vector field, using finite difference or finite element numerical methods. Typically, a model is discretized into individual cells smaller than the exchange length (i.e., the characteristic length scale of a domain wall). Within these cells, the magnetization is considered to be uniform. Cells are usually a few nanometers in size and, thus, represent the magnetic moments of several hundred atoms each. As in atomistic solvers, the dynamics are modeled with the classical LLG equation. Thermal effects may be introduced by including a thermal noise term, resulting in a Langevin thermostat for the system.65 Since we assume that each cell has a fixed magnetic moment, this approach is limited to temperatures far from the Curie temperature, close to which we expect large fluctuations in the length of the moment.
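The dynamics that such solvers integrate in every cell can be illustrated with a single-cell ("macrospin") sketch. The code below integrates the explicit Landau–Lifshitz form of the LLG equation for one unit moment in a static field at zero temperature. It is not how OOMMF, NMag, or MuMax3 are implemented: the forward Euler scheme, the damping value, the field strength, and the time step are assumptions chosen for readability, and exchange, demagnetizing, and thermal fields are omitted.

```python
import numpy as np

GAMMA_MU0 = 2.211e5  # gyromagnetic ratio times mu_0, in m/(A s)

def llg_rhs(m, h_eff, alpha):
    """Explicit (Landau-Lifshitz) form of the LLG equation for unit vector m."""
    m_cross_h = np.cross(m, h_eff)                # precession term
    damping = alpha * np.cross(m, m_cross_h)      # relaxation toward h_eff
    return -GAMMA_MU0 / (1.0 + alpha**2) * (m_cross_h + damping)

def relax(m0, h_eff, alpha=0.1, dt=2e-13, n_steps=10_000):
    """Integrate one macrospin and return its trajectory."""
    m = np.array(m0, dtype=float)
    m /= np.linalg.norm(m)
    trajectory = np.empty((n_steps, 3))
    for i in range(n_steps):
        m = m + dt * llg_rhs(m, h_eff, alpha)  # forward Euler step
        m /= np.linalg.norm(m)                 # restore unit length
        trajectory[i] = m
    return trajectory

# A moment starting along x spirals toward a field of 8e4 A/m along z.
h = np.array([0.0, 0.0, 8.0e4])
trajectory = relax([1.0, 0.0, 0.0], h)
```

The trajectory shows the damped precession described earlier: the in-plane components oscillate at GHz frequencies while the z component relaxes toward the field direction over a few nanoseconds. Production solvers use higher-order adaptive integrators and couple many such cells through exchange and magnetostatic fields.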
The cells in micromagnetic approaches are typically two orders of magnitude larger than those in atomistic simulations. Therefore, they are substantially less computationally expensive to run. Systems with lateral dimensions ∼μm are easily accessible, especially when using GPU-accelerated packages such as MuMax3. While these can be used to model RC in modestly sized systems,16,21 the sheer amount of input data required for training can present computational challenges. They are also poorly suited to simulating large systems such as large ASIs or interconnected ring ensembles. Micromagnetic simulations are often best suited to validating the outputs of higher level simulators or training fast, machine learning-based models of system behavior.66
B. Special-purpose phenomenological simulators
The limited applicability of general-purpose simulators to modeling RC stems from the many degrees of freedom they must model. However, simulators specialized to systems of a given class can describe the basic physical behaviors with substantially fewer degrees of freedom.
For example, each island in a typical ASI would consist of ∼2000 cells with 2 degrees of freedom each if simulated within a micromagnetic framework. At a phenomenological level, it could be represented by a single bistable vector within an Ising model. The GPU-accelerated flatspin simulator67 takes this approach. The simulator has been designed to simulate the dynamics of ASIs as collections of bistable nano-magnets arranged on a lattice, approximated as point dipoles interacting through dipole–dipole coupling. With these approximations, it is possible to model systems comprising millions of islands. Model predictions were validated against experimental results and other models, and the simulator has been used to demonstrate the applicability of ASIs to RC at modest computational cost.10
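The reduction in degrees of freedom can be illustrated with a short sketch in which each island is a point dipole with a fixed easy axis and a single binary state. This is not flatspin's code or API: the function names, the reduced units (the dipolar prefactor is set to 1), and the simple coercive-threshold switching rule are all assumptions made for illustration.

```python
import numpy as np

def dipolar_fields(pos, axes, states):
    """Dipolar field at each island from all others (reduced units).

    pos:    (N, 3) island positions
    axes:   (N, 3) unit vectors along each island's easy axis
    states: (N,)   +1 or -1, the single degree of freedom per island
    """
    moments = states[:, None] * axes                  # (N, 3)
    r = pos[:, None, :] - pos[None, :, :]             # r[i, j] = pos_i - pos_j
    dist = np.linalg.norm(r, axis=-1)
    np.fill_diagonal(dist, np.inf)                    # removes self-interaction
    r_hat = r / dist[..., None]
    m_dot_r = np.einsum("jk,ijk->ij", moments, r_hat)
    h = (3.0 * m_dot_r[..., None] * r_hat - moments[None, :, :]) / dist[..., None] ** 3
    return h.sum(axis=1)                              # sum over source islands j

def update_states(pos, axes, states, h_ext, h_c):
    """Flip every island whose field component opposing it exceeds h_c.

    The threshold rule and synchronous update are simplifying assumptions.
    """
    h_total = dipolar_fields(pos, axes, states) + h_ext
    h_along_m = np.einsum("ik,ik->i", h_total, axes) * states
    return np.where(h_along_m < -h_c, -states, states)

# Two head-to-tail islands stabilize each other (field parallel to the moments).
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
axes = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
states = np.array([1.0, 1.0])
fields = dipolar_fields(pos, axes, states)
```

Here the whole array state is a vector of N signs, compared with thousands of cell orientations per island in a micromagnetic description.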
RingSim,20,68 a simulator designed to predict the behaviors of interconnected nanoring ensembles (NRE), takes a similar phenomenological approach. The simulator follows agent-based modeling principles: the active agents are domain walls that are instanced into the model and interact stochastically with a rotating field and other DWs situated in neighboring rings. With this model, it was possible to demonstrate the feasibility of performing RC with a system that would be entirely inaccessible using standard micromagnetic approaches.20,68
Simple phenomenological models have been used to model a range of other systems, including STOs,8 DW Oscillators,21 and super-paramagnet ensembles (SPEs).18 These models are similar in that they sacrifice the detail and accuracy of their descriptions of physics to reduce computational expense. These are appropriate tradeoffs for studies aiming to demonstrate the basic feasibility of RC with a given system as a stepping stone to experimental studies; even predictions from highly detailed atomistic or micromagnetic models are expected to show some variance from real-world devices.
V. CHARACTERIZATION BEYOND BENCHMARK TASKS
The suitability of nanomagnetic systems for RC is usually established by performing standard benchmark tasks such as time series prediction or speech recognition (for a review of some key benchmarks, see the supplementary material). Evaluating reservoirs in this way provides limited characterization; different tasks require different computational properties. Thus, strong performance in a single task guarantees neither broader usefulness as a reservoir nor scalability to more complex problems.
In principle, one may achieve a better understanding by measuring task-agnostic reservoir metrics, which characterize a reservoir's computational properties beyond specific benchmarks. Three commonly used metrics are kernel rank (KR),69 generalization rank (GR),69 and linear memory capacity (MC).70,71 KR measures the ability of a reservoir to separate different inputs to different reservoir states. GR is the ability of a reservoir to generalize similar inputs to the same reservoir states, and MC is the amount of linear memory within the system. Other metrics have also been proposed,72 and careful research will be required to establish which groupings offer the most informative characterizations of a reservoir's computational properties.
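As an illustration of how such metrics can be estimated, the sketch below computes them for a small software echo state network standing in for a physical reservoir. The details are assumptions rather than the definitions used in the cited works: KR and GR are both taken as the effective rank of the matrix of final reservoir states (singular values above 1% of the largest), evaluated on distinct and on nearly identical input streams, respectively, and MC is the sum of squared correlations between delayed inputs and in-sample linear readouts. All function names and parameter values are hypothetical.

```python
import numpy as np

def make_esn(n=50, spectral_radius=0.9, input_scale=0.5, seed=0):
    """Random recurrent weights rescaled to a chosen spectral radius."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n, n))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    w_in = input_scale * rng.uniform(-1, 1, size=n)
    return w, w_in

def run(w, w_in, u):
    """Drive the reservoir with a scalar sequence u; return all states."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(u), w.shape[0]))
    for t, u_t in enumerate(u):
        x = np.tanh(w @ x + w_in * u_t)
        states[t] = x
    return states

def state_rank(w, w_in, inputs, rel_tol=1e-2):
    """Effective rank of the final states reached from each input stream."""
    finals = np.array([run(w, w_in, u)[-1] for u in inputs])
    s = np.linalg.svd(finals, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

def memory_capacity(w, w_in, length=2000, washout=100, max_delay=40, seed=1):
    """Sum over delays k of r^2 between u(t-k) and its best linear readout."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1, 1, size=length)
    x = run(w, w_in, u)[washout:]
    x = np.hstack([x, np.ones((len(x), 1))])          # bias column
    mc = 0.0
    for k in range(1, max_delay + 1):
        target = u[washout - k:length - k]            # input delayed by k steps
        coef, *_ = np.linalg.lstsq(x, target, rcond=None)
        mc += np.corrcoef(x @ coef, target)[0, 1] ** 2
    return mc

w, w_in = make_esn()
rng = np.random.default_rng(2)
distinct = rng.uniform(-1, 1, size=(30, 50))          # 30 unrelated streams
base = rng.uniform(-1, 1, size=50)
similar = base + 1e-4 * rng.uniform(-1, 1, size=(30, 50))  # noisy copies
kr = state_rank(w, w_in, distinct)
gr = state_rank(w, w_in, similar)
mc = memory_capacity(w, w_in)
```

For a physical device, `run` would be replaced by measured or simulated device responses; the rank threshold and the in-sample readout fit both influence the numerical values and should be reported alongside them.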
The optimal values of metrics are highly task-dependent. For example, a system with a high GR is susceptible to noisy inputs, whereas a system with a low GR is less sensitive. Depending on the task, these may reflect a desired or undesired property; a noisy input would benefit from a low GR, but a precise and sensitive input would benefit from a high GR. Nonetheless, knowledge of the metrics can help optimize reservoir design for a specific problem. For instance, if a task requires a particular memory length, knowing which device designs provide the appropriate timescales would lead to a more efficient design process than fabricating several reservoirs and testing them on the specific task.
A step in this direction is CHARC,73 a framework for exploring the behavior spaces of families of dynamical systems. Traditional search-based methods search for reasonable solutions to a given problem. Instead, CHARC explores the entire behavior space to characterize how well a given set of systems (such as the nanomagnetic systems in this paper) exhibits various dynamical properties usable for solving specific problems. CHARC defines the space of behaviors by a set of n user-supplied metrics that define an n-dimensional behavior space.
It then explores the input parameters to determine the range of behaviors accessible in this space. Using a range is more appropriate for characterizing a system's overall potential than optimizing the parameters for some specific behavior. CHARC uses a novelty search algorithm,74,75 a purely explorative evolutionary algorithm, to find sets of input parameters that result in relatively uniformly distributed behaviors over the behavior space. The system is characterized by the volume of behavior space it can access.
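The idea of novelty search can be sketched as follows. This is a generic, simplified steady-state variant and not CHARC's implementation: the tournament scheme, the k-nearest-neighbor novelty score, the grid-based coverage measure, and the toy map `toy_behavior` from two parameters to two "metrics" are all assumptions introduced for illustration.

```python
import numpy as np

def knn_novelty(b, archive, k=5):
    """Mean distance from b to its k nearest archive neighbors.

    b is assumed to be stored in the archive, so its zero self-distance
    (the first sorted entry) is skipped.
    """
    d = np.sort(np.linalg.norm(archive - b, axis=1))
    return d[1:k + 1].mean()

def novelty_search(behavior_fn, n_params, generations=300, pop_size=20,
                   sigma=0.1, k=5, seed=0):
    """Explore behavior space; selection uses novelty only, never task fitness."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.0, 1.0, size=(pop_size, n_params))
    pop_b = np.array([behavior_fn(p) for p in pop])
    archive = [b.copy() for b in pop_b]
    for _ in range(generations):
        i, j = rng.choice(pop_size, size=2, replace=False)
        arch = np.array(archive)
        if knn_novelty(pop_b[i], arch, k) < knn_novelty(pop_b[j], arch, k):
            i, j = j, i                                # i is the more novel one
        child = np.clip(pop[i] + sigma * rng.normal(size=n_params), 0.0, 1.0)
        pop[j], pop_b[j] = child, behavior_fn(child)   # child replaces the loser
        archive.append(pop_b[j].copy())
    return np.array(archive)

def coverage(behaviors, bins=10):
    """Number of occupied grid cells in the unit hypercube of behaviors."""
    ranges = [(0.0, 1.0)] * behaviors.shape[1]
    hist, _ = np.histogramdd(behaviors, bins=bins, range=ranges)
    return int(np.count_nonzero(hist))

def toy_behavior(p):
    """Hypothetical nonlinear map from device parameters to two metrics."""
    return np.array([p[0] ** 3, p[0] * p[1]])

archive = novelty_search(toy_behavior, n_params=2)
occupied_cells = coverage(archive)
```

In a real study, `toy_behavior` would be replaced by a simulation or measurement that returns metrics such as KR, GR, and MC for a given set of device parameters, and the occupied-cell count would serve as a proxy for the accessible volume of behavior space.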
CHARC is typically applied to a three-dimensional behavior space defined by KR, GR, and MC, but it also allows the configuration of alternative measures; there is no claim by the authors of CHARC that these measures are the best for mapping a behavior space.73 Given a sufficiently fast and accurate simulator, CHARC can be used to find potentially compelling phenomena to then test in hardware experiments. The results of these experiments can then refine the simulator, creating a closed software improvement loop.
VI. CHALLENGES AND OUTLOOK
Experimental realizations. Thus far, most studies have only explored nanomagnetic RC in simulation. It is now critical that the most promising proposals are transferred to experimental demonstrations. The challenges here are not a lack of methods to input signals into materials or measure well-established materials' responses but the complexity of the proposed devices and the measurement infrastructure required for proof-of-principle experiments. The latter needs to apply and measure signal trains in substrate-compatible formats at speeds up to GHz. While these challenges are substantial, robust functionality can be demonstrated only via these experimental prototypes under real-world conditions and constraints. While we expect a system computing using material dynamics to be inherently more efficient, such prototypes will allow an accurate measurement of energy consumption76 and drive future device improvements.
Scalability. Once experiments demonstrate basic functionality, it is essential to examine the scalability of proposed RC systems. For example, for simple IOD-1D, time-multiplexed implementations of RC, it will be essential to examine how computational power is enriched if these devices create IOD-N networks, either via external interconnects or via in materia interactions. One needs to explore how computational power scales as the size and complexity of systems increase. Computational power will be particularly critical when exploiting in materia interactions, as these will have natural length scales beyond which individual inputs and outputs of a reservoir will not directly interact. Meta-reservoirs, i.e., systems consisting of multiple interconnected reservoirs with different computational properties, should also be explored. Such architectures are likely to have substantially greater power than their constituent parts.77 In all of these cases, simulations will be an essential tool for exploration. These allow evaluation of the ultimate computational potential of a material system by ignoring the physical confines of interfacing in the first instance.
Algorithms. The simplicity of the training algorithms RC uses is another critical element for the popularity of RC in the spintronics community. However, this simplicity also has drawbacks; training RC online with the simplest algorithms was challenging until recent methods60 improved its performance by efficiently increasing algorithmic complexity. We pay a small price for improving learning speed and resilience to catastrophic forgetting. Similarly, to achieve scalability, we need to optimize the interconnectivity between the reservoirs or their timescales.77 Typically, however, techniques for finding appropriate parameters require precise mathematical reservoir models, and in spintronic devices, such models may only sometimes be available. Techniques that allow for automated tuning of the parameters of mathematically agnostic reservoirs will be transformative.
Evaluation. Task-agnostic metrics offer a powerful platform for understanding the computational properties of potential reservoirs. With the wealth of nanomagnetic systems available for this purpose, careful evaluation of these metrics will be essential for understanding their relative strengths and weaknesses. We do not believe such evaluation will reveal a single system as inherently superior. A wealth of factors must be considered, including power consumption, operating speed, and production cost. More likely, a thorough evaluation of device concepts will reveal what applications they would best suit, whether in lower power edge-computing systems or high-throughput data co-processors, and how nanomagnetic RC systems compare to other competitor technologies. In all cases, it will be essential to recognize the heterotic nature of RC, i.e., conventional electronic systems must augment the reservoir to create input and output layers, all with their constraints and overheads.
Applications. So far, reservoir-based spintronic devices have solved simple benchmark problems. While this is inevitable at the earlier stages of research, such toy problems serve only as proof of concept. They are inappropriate for the evaluation of the reservoirs and for attracting a more general interest in the technology. Identifying more challenging tasks within application areas where spintronic devices may be transformative is necessary. At this stage, it is hard to imagine that spintronic-based RC will serve as general-purpose devices; we expect that there are particular niche areas for which they may be suited. For instance, in the context of edge computing, a promising direction may be that of smart sensors, where we would like to offload low-energy preprocessing on the chip. Generally, RC may also boost existing methods where additional memory is helpful by adding only a small overhead. For instance, in robotics, the advantages of augmenting existing architectures with a reservoir are demonstrated in the problem of visual place recognition.78 For this, interfacing spintronics technology with other hardware may be crucial for the further development of the devices.
SUPPLEMENTARY MATERIAL
See the supplementary material for the Echo State Network, a fundamental neural network reservoir model, and some typical benchmarks used in reservoir computing.
ACKNOWLEDGMENTS
D.A.A., T.J.H., L.M., C.S., I.V., and E.V. acknowledge funding from the EPSRC MARCH Project No. EP/V006339/1. D.G., M.F.KH.M., S.O.K., S.S., and M.A.T. acknowledge funding from the EPSRC MARCH Project No. EP/V006029/1; S.O.K., S.S., and M.A.T. also acknowledge partial funding from the EPSRC SpInspired Project No. EP/R032823/1. D.A.A., T.J.H., M.O.A.E., and E.V. also acknowledge funding from the EPSRC Project No. EP/S009647/1. D.A.A., T.J.H., and G.V. acknowledge Horizon 2020 FET-Open SpinEngine (Agreement No. 861618). I.V. acknowledges a DTA-funded Ph.D. studentship from EPSRC. C.W. acknowledges doctoral funding from the Department of Computer Science, University of York. T.J.H. acknowledges funding from The Leverhulme Trust under grant RPG-2019-097.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Dan A. Allwood: Writing – original draft (supporting). Martin A. Trefzer: Writing – original draft (supporting). Eleni Vasilaki: Writing – original draft (equal); Writing – review & editing (equal). Guru Venkat: Writing – original draft (supporting); Writing – review & editing (supporting). Ian Thomas Vidamour: Writing – original draft (supporting); Writing – review & editing (supporting). Chester Wringe: Writing – original draft (supporting). Matthew O. A. Ellis: Writing – original draft (supporting); Writing – review & editing (supporting). David Griffin: Writing – original draft (supporting). Tom James Hayward: Writing – original draft (equal); Writing – review & editing (equal). Luca Manneschi: Writing – original draft (supporting). Mohammad FKH Musameh: Writing – original draft (supporting). Simon O'Keefe: Writing – original draft (supporting). Susan Stepney: Writing – original draft (supporting); Writing – review & editing (supporting). Charles Swindells: Writing – original draft (supporting); Writing – review & editing (supporting).
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.