The term “neuromorphic” was originally introduced by Mead in the late 1980s,1 referring to devices and systems that imitate certain elements of biological neural systems. Today, however, the interpretation of the term has diverged across research communities. Broadly speaking, “neuromorphic” refers to computing approaches that are inspired by the brain. Such approaches include, but are not limited to, in-memory computing, continuous learning in hardware, spike-based processing, fine-grained parallelism, reduced-precision computing, and asynchronous computing. Many of these concepts are explored independently, without necessarily being linked to neuromorphic technologies. For example, a comprehensive review of in-memory computing using emerging technologies is provided by Mannocci et al. in our first issue.2
There are distinct approaches to neuromorphic research, set apart by their primary goals. One goal is to provide efficient hardware platforms that deepen our understanding of biological nervous systems (e.g., the human brain). A different objective is to harness brain-inspired principles for efficient computing applications, without being confined by the need for biological realism. These goals of brain modeling and efficient computing begin to intersect when more biologically plausible mechanisms are incorporated, such as learning processes that can potentially exceed the capabilities of existing machine learning and deep learning paradigms.3 In this context, it is of paramount importance to establish a standardized and unbiased method of evaluating the strengths of neuromorphic methods relative to conventional deep learning-based techniques.4
The neuromorphic community is incredibly diverse and spans all layers of the computational abstraction stack. At the top layer, neuromorphic algorithms most commonly refer to spiking neural networks (SNNs)5,6 or to training models via biologically plausible learning rules. What does it mean for a learning rule to be biologically plausible? A common criterion is locality in both space and time. While error backpropagation routes a huge number of gradient signals to each model parameter [Fig. 1(a)], synapses in the brain are thought to update only on the basis of signals that are immediately available to them. For example, the work by McCaughan et al. published in this issue of APL Machine Learning derives a method to broadcast a global loss signal to all parameters, where each weight in a network selectively “extracts” the component relevant to itself7 [Fig. 1(b)]. While the technique is likened to wireless communication, it also bears similarities to global mechanisms in the brain, such as dopamine release. Fundamentally, the potential power savings from reduced data routing are vast.
FIG. 1. Computational graphs of various modes of training. (a) Error backpropagation in a standard neural network. (b) Error broadcasting reduces the cost of gradient routing. (c) Backpropagation through time in sequential neural networks. (d) Real-time recurrent learning pushes gradients forward in time.
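To make the contrast concrete, the sketch below shows one simple instance of the broadcast idea, direct feedback alignment, in which the global output error is delivered to the hidden layer through a fixed random matrix rather than being routed back through the transposed weights. It is a minimal illustration of broadcast-style credit assignment, not the specific method of Ref. 7; the layer sizes, random feedback matrix, and squared-error loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network: x -> h -> y
W1 = rng.normal(0.0, 0.1, (100, 20))   # input -> hidden weights
W2 = rng.normal(0.0, 0.1, (20, 5))     # hidden -> output weights
B  = rng.normal(0.0, 0.1, (5, 20))     # fixed random "broadcast" matrix (not W2.T)

x = rng.normal(size=100)
target = rng.normal(size=5)
lr = 1e-2

# Forward pass
h = np.maximum(x @ W1, 0.0)            # ReLU hidden layer
y = h @ W2
e = y - target                         # global error signal (squared-error loss)

# Exact backprop would route e backward through W2.T; here the same global
# error is broadcast to the hidden layer through the fixed random matrix B,
# and each weight combines it with activity that is locally available to it.
W2 -= lr * np.outer(h, e)
W1 -= lr * np.outer(x, (e @ B) * (h > 0.0))
```

Because the feedback pathway is fixed and shared, each weight update depends only on the broadcast error and on pre- and post-synaptic activity local to that synapse, which is the property that makes such schemes attractive for low-power hardware.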
In the past several years, an emerging trend in the neuromorphic algorithms community has been to hack apart error backpropagation so that machine learning models can adapt and learn in real time. Almost all of these techniques stem from the original real-time recurrent learning (RTRL) algorithm proposed back in 1989 by Williams and Zipser.8 When training recurrent models with backpropagation through time, gradient-based updates require storing the state of all neurons across the entire sequence history [Fig. 1(c)], and the memory demands become prohibitively expensive for training continuously running models. RTRL instead pushes gradient signals forward in time, eliminating the need for a historical trace of each neuron [Fig. 1(d)]. Doing so means the memory cost of training SNNs is no longer tied to sequence length, and the computational resources that once had to be reserved for lengthy sequences can be reallocated to scaling up the SNNs we train. This trend toward large-scale SNNs was catapulted forward by the recent emergence of the first neuromorphic language model, SpikeGPT.9
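As a deliberately minimal illustration of the forward-mode idea, the sketch below applies the RTRL recursion to a single leaky recurrent unit with a per-step squared-error loss. The scalar dynamics, targets, and learning rate are illustrative assumptions rather than anything prescribed by Ref. 8; the point is only that the sensitivity of the state with respect to the weight is carried forward alongside the state, so memory does not grow with sequence length.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000                       # arbitrarily long input stream
xs = rng.normal(size=T)        # inputs
ys = rng.normal(size=T)        # per-step targets

alpha, w, lr = 0.9, 0.5, 1e-3  # leak factor, input weight, learning rate
h, s = 0.0, 0.0                # hidden state and its sensitivity dh/dw

for t in range(T):
    # Forward dynamics of a leaky recurrent unit
    h = alpha * h + w * xs[t]
    # RTRL: push the sensitivity forward in time alongside the state,
    # instead of storing the whole history for backprop through time
    s = alpha * s + xs[t]
    # Per-step gradient of the squared error, available immediately
    grad = (h - ys[t]) * s
    w -= lr * grad             # online update; memory cost stays O(1) in T
```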
Techniques such as e-prop approximate RTRL by dropping gradient terms that are routed through recurrent connections.10 Deep continuous local learning (DECOLLE) applies local losses to each layer, addressing the spatial locality of error signals in addition to temporal locality.11 Forward propagation through time applies RTRL to time-varying losses with regularization to promote training stability.12 The general trend is to endow SNN learning with the benefits of RTRL and to stress-test how tolerant these models are to approximations and alternative loss metrics.
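The sketch below gives a rough flavor of this family of approximations for a single leaky integrate-and-fire neuron: each synapse keeps a local eligibility trace (filtered presynaptic activity scaled by a surrogate derivative of the spiking nonlinearity), which is combined with a broadcast learning signal. It is loosely inspired by e-prop rather than a faithful implementation of any of the cited methods; the rate-matching learning signal, surrogate-derivative shape, and neuron parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_in = 200, 10
spikes_in = (rng.random((T, n_in)) < 0.1).astype(float)  # random input spike trains
target_rate = 0.05                                        # desired output firing rate

beta, thr, lr = 0.9, 1.0, 1e-2   # membrane decay, threshold, learning rate
w = rng.normal(0.0, 0.5, n_in)   # input weights
v = 0.0                          # membrane potential
trace = np.zeros(n_in)           # eligibility traces, one per synapse
dw = np.zeros(n_in)              # accumulated weight update

for t in range(T):
    v = beta * v + spikes_in[t] @ w
    spike = float(v >= thr)

    # Surrogate derivative of the spike nonlinearity at the membrane potential
    surrogate = 1.0 / (1.0 + np.abs(v - thr)) ** 2

    # Local eligibility trace: filtered presynaptic activity times the surrogate
    trace = beta * trace + spikes_in[t]
    eligibility = surrogate * trace

    # Learning signal: deviation from a target firing rate, broadcast
    # identically to every synapse of this neuron
    learning_signal = spike - target_rate
    dw += learning_signal * eligibility

    v -= spike * thr             # soft reset after the update bookkeeping

w -= lr * dw
```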
While these advances are made at the software layer, the hardware and devices community directly reaps the benefits. Frenkel and Indiveri integrated forward-mode learning into a CMOS integrated circuit to show that SNNs can be trained at ultra-low power on a variety of dynamical tasks.13 To date, this is the only silicon demonstration of a task-agnostic chip that can learn continuously across a range of data modalities. This family of high-performance, lightweight learning algorithms signals a shift away from low-level learning rules, such as spike-timing-dependent plasticity, as we develop deeper insight into how local learning rules become competitive with their non-local counterparts, especially when modulated by system-level objectives.
The scarcity of silicon chips in the online learning space is offset when one digs deeper into the abstraction layers. The past decade of work on neuromorphic devices and materials that adapt in real time to input stimuli has opened up a breadth of research aiming to link dynamics to functionality.14,15 The development of “neuromorphic devices” might be the area closest to the AIPP readership. It involves leveraging the physical mechanisms of electronic, magnetic, and photonic materials to create efficient, novel nanodevices that mimic biological functionalities, such as those of neurons and synapses. The primary distinction from neuromorphic engineering is the objective of realizing these functionalities through the inherent physics of materials, rather than relying on individual devices such as transistors, while aiming for the most compact implementations (e.g., nanoscale devices). Multiple examples can be found across the AIPP journals, including recent work on memristive devices for neuromorphic applications,16 optoelectronic devices,17,18 magnetic devices,19 ferroelectric devices,20 and photonic devices,21 as well as roadmaps and special issues.22–24 APL Machine Learning aspires to be the premier and preferred platform for all of the aforementioned research. This falls under what we term “Applied Physics for Machine Learning” (AP for ML), which is a central part of the journal’s scope.
Ultimately, neuromorphic computing is a natural stage for blending these various abstractions. The brain is a physical manifestation of the neural code, and building the most powerful computers will rely on researchers treating the link between physics and applications, and everything in between, in much the same way. If the above resonates with you and your research, please consider submitting your work to APL Machine Learning. More information about the journal’s scope can be found on our website, as well as in our first editorial.25