Artificial neural networks (ANNs) have emerged as an essential tool in machine learning, achieving remarkable success across diverse domains, including image and speech generation, game playing, and robotics. However, there exist fundamental differences between ANNs’ operating mechanisms and those of the biological brain, particularly concerning learning processes. This paper presents a comprehensive review of current brain-inspired learning representations in artificial neural networks. We investigate the integration of more biologically plausible mechanisms, such as synaptic plasticity, to improve these networks’ capabilities. Moreover, we delve into the potential advantages and challenges accompanying this approach. In this review, we pinpoint promising avenues for future research in this rapidly advancing field, which could bring us closer to understanding the essence of intelligence.
INTRODUCTION
The dynamic interrelationship between memory and learning is a fundamental hallmark of intelligent biological systems. It empowers organisms to not only assimilate new knowledge but also to continuously refine their existing abilities, enabling them to adeptly respond to changing environmental conditions. This adaptive characteristic is relevant on various time scales, encompassing both long-term learning and rapid short-term learning via short-term plasticity mechanisms, highlighting the complexity and adaptability of biological neural systems.1–3 The development of artificial systems that draw high-level, hierarchical inspiration from the brain has been a long-standing scientific pursuit spanning several decades. While earlier attempts were met with limited success, the most recent generation of artificial intelligence (AI) algorithms has achieved significant breakthroughs in many challenging tasks. These tasks include, but are not limited to, the generation of images and text from human-provided prompts,4–7 the control of complex robotic systems,8–10 the mastery of strategy games such as Chess and Go,11 and a multimodal amalgamation of these.12
While ANNs have made significant advancements in various fields, there are still major limitations in their ability to continuously learn and adapt as biological brains do.13–15 Unlike current models of machine intelligence, animals can learn throughout their entire lifespan, which is essential for stable adaptation to changing environments. This ability, known as lifelong learning, remains a significant challenge for artificial intelligence, which is primarily optimized on fixed, labeled datasets and therefore struggles to generalize to new tasks or to retain information across repeated learning iterations.14 Addressing this challenge is an active area of research, and the potential implications of developing AI with lifelong learning abilities could have far-reaching impacts across multiple domains.
In this paper, we offer a unique review that seeks to identify the mechanisms of learning in the brain that have inspired current artificial intelligence algorithms. The scope of this review covers algorithms that modify the parameters of a neural network, such as synaptic plasticity rules, and how such algorithms relate to the brain. To better understand the biological processes underlying natural intelligence, the first section will explore the low-level mechanisms that shape learning in the brain, from synaptic plasticity and neuromodulation to the local and global dynamics that shape neural activity. This will be related back to ANNs in the third section, where we compare and contrast ANNs with biological neural systems. This will give us a logical basis for why the brain has more to offer AI beyond what current artificial models have already inherited. Following that, we will delve into algorithms for artificial learning that emulate these processes to improve the capabilities of AI systems. Finally, we will discuss various applications of these AI techniques in real-world scenarios, highlighting their potential impact on fields such as robotics, lifelong learning, and neuromorphic computing. By doing so, we aim to provide a comprehensive understanding of the interplay between learning mechanisms in the biological brain and artificial intelligence, highlighting the potential benefits that can arise from this synergistic relationship. We hope our findings will encourage a new generation of brain-inspired learning algorithms.
PROCESSES THAT SUPPORT LEARNING IN THE BRAIN
A grand effort in neuroscience aims at identifying the underlying processes of learning in the brain. Several mechanisms have been proposed to explain the biological basis of learning at varying levels of granularity—from the synapse to population-level activity. However, the vast majority of biologically plausible models of learning are characterized by plasticity that emerges from the interaction between local and global events.16 Below, we introduce various forms of plasticity and how these processes interact in more detail.
Synaptic plasticity: Plasticity in the brain refers to the capacity of experience to modify the function of neural circuits. The plasticity of synapses specifically refers to the modification of the strength of synaptic transmission based on activity and is currently the most widely investigated mechanism by which the brain adapts to new information.17,18 There are two broader classes of synaptic plasticity: short- and long-term plasticity. Short-term plasticity acts on the scale of tens of milliseconds to minutes and has an important role in short-term adaptation to sensory stimuli and short-lasting memory formation.19 Long-term plasticity acts on the scale of minutes and longer and is thought to be one of the primary processes underlying long-term behavioral changes and memory storage.20
Neuromodulation: In addition to the plasticity of synapses, another important mechanism by which the brain adapts to new information is through neuromodulation.3,21,22 Neuromodulation refers to the regulation of neural activity by chemical signaling molecules, often referred to as neurotransmitters or hormones. These signaling molecules can alter the excitability of neural circuits and the strength of synapses and can have both short- and long-term effects on neural function. Several neuromodulators have been identified, including acetylcholine, dopamine, and serotonin, each of which has been linked to various functions such as attention, learning, and emotion.23 Neuromodulation has been suggested to play a role in various forms of plasticity, including short-19 and long-term plasticity.22
Metaplasticity: The ability of neurons to modify both their function and structure based on activity is what characterizes synaptic plasticity. These modifications that occur at the synapse must be precisely organized so that changes occur at the right time and in the right quantity. This regulation of plasticity is referred to as metaplasticity, or the “plasticity of synaptic plasticity,” and plays a vital role in safeguarding the constantly changing brain from its own saturation.24–26 Essentially, metaplasticity alters the ability of synapses to generate plasticity by inducing a change in the physiological state of neurons or synapses. Metaplasticity has been proposed as a fundamental mechanism in memory stability, learning, and regulating neural excitability. While similar, metaplasticity can be distinguished from neuromodulation, with metaplastic and neuromodulatory events often overlapping in time during the modification of a synapse.
Neurogenesis: The process by which newly formed neurons are integrated into existing neural circuits is referred to as neurogenesis. Neurogenesis is most active during embryonic development but is also known to occur throughout the adult lifetime, particularly in the subventricular zone of the lateral ventricles,27 the amygdala,28 and in the dentate gyrus of the hippocampal formation.29 In adult mice, neurogenesis has been demonstrated to increase when living in enriched environments vs in standard laboratory conditions.30 In addition, many environmental factors, such as exercise31,32 and stress,33,34 have been demonstrated to change the rate of neurogenesis in the rodent hippocampus. Overall, while the role of neurogenesis in learning is not fully understood, it is believed to play an important role in supporting learning in the brain.
Glial cells: Glial cells, or neuroglia, play a vital role in supporting learning and memory by modulating neurotransmitter signaling at synapses, the small gaps between neurons where neurotransmitters are released and received.35 Astrocytes, one type of glial cell, can release and reuptake neurotransmitters, as well as metabolize and detoxify them. This helps to regulate the balance and availability of neurotransmitters in the brain, which is essential for normal brain function and learning.36 Microglia, another type of glial cell, can also modulate neurotransmitter signaling and participate in the repair and regeneration of damaged tissue, which is important for learning and memory.37 In addition to repair and modulation, structural changes in synaptic strength require the involvement of different types of glial cells, with the most notable influence coming from astrocytes.36 However, despite their crucial involvement, we have yet to fully understand the role of glial cells. Understanding the mechanisms by which glial cells support learning at synapses is an important area of ongoing research.
DEEP NEURAL NETWORKS AND PLASTICITY
Artificial and spiking neural networks
Artificial neural networks have played a vital role in machine learning over the past several decades. These networks have seen tremendous progress toward solving a variety of challenging problems. Many of the most impressive accomplishments in AI have been realized through the use of large ANNs trained on tremendous amounts of data. While there have been many technical advancements, many of the accomplishments in AI can be explained by innovations in computing technology, such as large-scale GPU accelerators and the accessibility of data. While the application of large-scale ANNs has led to major innovations, there are still many challenges ahead. A few of the most pressing practical limitations of ANNs are that they are not efficient in terms of power consumption and they are not very good at processing dynamic and noisy data. In addition, ANNs are not able to learn beyond their training period (e.g., during deployment), and their training assumes data that are independent and identically distributed (IID) and without temporal structure, which does not reflect physical reality, where information is highly temporally and spatially correlated. These limitations have led to their application requiring vast amounts of energy when deployed in large-scale settings38 and have also presented challenges toward integration into edge computing devices, such as robotics and wearable devices.39
Graphical depiction of long-term potentiation (LTP) and depression (LTD) at the synapse of biological neurons. (a) Synaptically connected pre- and post-synaptic neurons. (b) Synaptic terminal, the connection point between neurons. (c) Synaptic growth (LTP) and synaptic weakening (LTD). (d) (Top) Membrane potential dynamics in the axon hillock of the neuron. (Bottom) Pre- and post-synaptic spikes. (e) Spike-timing dependent plasticity curve depicting experimental recordings of LTP and LTD.
Homeostatic regulation is an additional process that maintains the stability of internal conditions against external changes. This regulation is achieved through feedback mechanisms that adjust physiological processes, ensuring optimal functioning and equilibrium within the organism. In spiking neural networks (SNNs), homeostatic regulation also involves adjusting the spike threshold of neurons to stabilize network activity. This adjustment can be mathematically described as V_th(t + 1) = V_th(t) + β · (A_target − A_actual), where V_th represents the neuron’s threshold potential, β is a modulation parameter, and A_target and A_actual denote the target and actual firing rates. Homeostatic regulation has been shown to be useful in learning applications of SNNs.43–45
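To make the threshold-adaptation rule above concrete, the following NumPy sketch applies it to a toy population of threshold units; the random input drive, the running firing-rate estimate, and all parameter values are illustrative assumptions rather than part of any published model. The update is written with the stabilizing sign, so the threshold rises when activity exceeds the target (equivalently, the β · (A_target − A_actual) form with β taken negative).

```python
import numpy as np

rng = np.random.default_rng(0)

n_neurons = 100
v_th = np.full(n_neurons, 1.0)       # per-neuron spike thresholds V_th
beta = 0.01                          # adaptation rate (illustrative)
a_target = 0.05                      # target firing rate, in spikes per step

rate_estimate = np.zeros(n_neurons)  # running estimate of A_actual
tau = 0.99                           # smoothing factor for the rate estimate

for step in range(2000):
    # Stand-in for membrane dynamics: a random drive elicits a spike whenever
    # it exceeds the neuron's current threshold.
    drive = rng.random(n_neurons)
    spikes = (drive > v_th).astype(float)

    # Exponential moving average of each neuron's firing rate (A_actual).
    rate_estimate = tau * rate_estimate + (1 - tau) * spikes

    # Homeostatic threshold update: the threshold rises when the neuron fires
    # above target and falls when it is too silent.
    v_th += beta * (rate_estimate - a_target)

print("mean firing rate:", rate_estimate.mean())
```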
Despite these potential advantages, SNNs are still in the early stages of development, and there are several challenges that need to be addressed before they can be used more widely. One of the most pressing challenges concerns how to optimize the synaptic weights of these models, as traditional backpropagation-based methods from ANNs fail because spiking activity is discrete, sparse, and non-differentiable. Despite these challenges, some works push the boundaries of what was thought possible with modern spiking networks, such as large spike-based transformer models.46 Spiking models are of great importance for this review since they form the basis of many brain-inspired learning algorithms.
Hebbian and spike-timing dependent plasticity
Hebbian and spike-timing dependent plasticity (STDP) are two prominent models of synaptic plasticity that play important roles in shaping neural circuitry and behavior. The Hebbian learning rule, first proposed by Hebb in 1949,47 posits that synapses between neurons are strengthened when they are coactive, such that the activation of one neuron causally leads to the activation of another. STDP, on the other hand, is a more recently proposed model of synaptic plasticity that takes into account the precise timing of pre- and post-synaptic spikes48 to determine synaptic strengthening or weakening. It is widely believed that STDP plays a key role in the formation and refinement of neural circuits during development and in the ongoing adaptation of circuits in response to experience. In the following subsection, we will provide an overview of the basic principles of Hebbian learning and STDP.
Hebbian learning: Hebbian learning is based on the idea that the synaptic strength between two neurons should be increased if they are both active at the same time and decreased if they are not. Hebb suggested that this increase should occur when one cell “repeatedly or persistently takes part in firing” another cell (with causal implications). However, this principle is often expressed correlatively, as in the famous aphorism “cells that fire together, wire together” (variously attributed to Löwel49 or Shatz50).
Hebbian learning is often used as an unsupervised learning algorithm, where the goal is to identify patterns in the input data without explicit feedback.51 An example of this process is the Hopfield network, in which large binary patterns are easily stored in a fully connected recurrent network by applying a Hebbian rule to the (symmetric) weights.52 It can also be adapted for use in supervised learning algorithms, where the rule is modified to take into account the desired output of the network. In this case, the Hebbian learning rule is combined with a teaching signal that indicates the correct output for a given input.
One potential drawback of the basic Hebbian rule is its instability. For example, if x_i and x_j are initially weakly positively correlated, this rule will increase the weight between the two, which will in turn reinforce the correlation, leading to even larger weight increases, etc. Thus, some form of stabilization is needed. This can be done simply by bounding the weights or by more complex rules that take into account additional factors such as the history of the pre- and post-synaptic activity or the influence of other neurons in the network (see Ref. 53 for a practical review of many such rules).
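As an illustration of this stability issue and one common remedy, the following NumPy sketch compares a plain Hebbian update with Oja’s rule, which adds a decay term that bounds the weight norm; the toy input statistics and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
eta = 0.01
n_inputs = 10

# Toy input stream with correlated components.
cov = 0.5 * np.ones((n_inputs, n_inputs)) + 0.5 * np.eye(n_inputs)
xs = rng.multivariate_normal(np.zeros(n_inputs), cov, size=500)

w_hebb = rng.normal(scale=0.1, size=n_inputs)  # plain Hebbian weights
w_oja = w_hebb.copy()                          # Oja-stabilized weights

for x in xs:
    y_hebb = w_hebb @ x
    y_oja = w_oja @ x

    # Plain Hebbian rule: dw = eta * pre * post. The weight norm grows without bound.
    w_hebb += eta * y_hebb * x

    # Oja's rule adds a decay term proportional to post^2, bounding the norm of w.
    w_oja += eta * y_oja * (x - y_oja * w_oja)

print("plain Hebbian weight norm:", np.linalg.norm(w_hebb))
print("Oja-stabilized weight norm:", np.linalg.norm(w_oja))
```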
Hebbian rules can also be combined with a reward signal R so that weight changes are proportional to the product of pre-synaptic activity, post-synaptic activity, and reward, x_i x_j R. More formally, as pointed out by Frémaux et al.,54 to properly track the actual covariance between inputs, outputs, and rewards, at least one of the terms in the x_i x_j R product must be centered, that is, replaced by zero-mean fluctuations around its expected value. One possible solution is to center the rewards by subtracting a baseline from R, generally equal to the expected value of R for this trial. While helpful, in practice, this solution is generally insufficient.
A more effective solution is to perturb the post-synaptic activity with zero-mean noise Δx_j and make the weight change proportional to x_i Δx_j R. This is the so-called “node perturbation” rule proposed by Fiete and Seung.55,56 Intuitively, notice that the effect of the x_i Δx_j increment is to push future x_j responses (when encountering the same x_i input) in the direction of the perturbation: larger if the perturbation was positive, smaller if the perturbation was negative. Multiplying this shift by R results in pushing future responses toward the perturbation if R is positive and away from it if R is negative. Even if R is not zero-mean, the net effect (in expectation) will still be to drive w_ij toward a higher R, although the variance will be higher.
This rule turns out to implement the REINFORCE algorithm (Williams’ original paper57 actually proposes an algorithm that is exactly node perturbation for spiking stochastic neurons) and, thus, estimates the theoretical gradient of R with respect to w_ij. It can also be implemented in a biologically plausible manner, allowing recurrent networks to learn non-trivial cognitive or motor tasks from sparse, delayed rewards.58
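The following NumPy sketch illustrates the node-perturbation idea on a single linear layer trained to match a target mapping from a scalar reward; the task, the baseline choice, and the hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 5, 3
w = np.zeros((n_out, n_in))                 # weights being learned
w_target = rng.normal(size=(n_out, n_in))   # unknown mapping defining the reward
eta, sigma = 0.05, 0.1                      # learning rate and perturbation size

for step in range(5000):
    x = rng.normal(size=n_in)
    perturbation = sigma * rng.normal(size=n_out)  # exploratory noise on the outputs

    y = w @ x + perturbation                 # perturbed response
    y_star = w_target @ x                    # desired response
    reward = -np.sum((y - y_star) ** 2)      # scalar reward R (higher is better)

    # Baseline: the reward the unperturbed response would have received
    # (reduces the variance of the update).
    baseline = -np.sum((w @ x - y_star) ** 2)

    # Node-perturbation update: dw_ij proportional to (R - baseline) * dx_j * x_i.
    w += eta * (reward - baseline) * np.outer(perturbation, x)

print("distance to target weights:", np.linalg.norm(w - w_target))
```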
Spike-timing dependent plasticity: Spike-timing dependent plasticity (STDP) is a theoretical model of synaptic plasticity that allows the strength of connections between neurons to be modified based on the relative timing of their spikes. Unlike the Hebbian learning rule, which relies on the simultaneous activation of pre- and post-synaptic neurons, STDP takes into account the precise timing of the pre- and post-synaptic spikes. In particular, STDP suggests that if a presynaptic neuron fires just before a post-synaptic neuron, the connection between them should be strengthened. Conversely, if the post-synaptic neuron fires just before the presynaptic neuron, the connection should be weakened.
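A common way to model pair-based STDP is with exponential windows for potentiation and depression, as in the following sketch; the amplitudes and time constants are illustrative values rather than experimentally fitted ones.

```python
import numpy as np

# Pair-based STDP window: potentiation when the presynaptic spike precedes the
# post-synaptic spike (delta_t > 0), depression when it follows (delta_t < 0).
a_plus, a_minus = 0.01, 0.012      # LTP and LTD amplitudes (illustrative)
tau_plus, tau_minus = 20.0, 20.0   # decay time constants in ms (illustrative)

def stdp_dw(delta_t):
    """Weight change for a single pre/post spike pair; delta_t = t_post - t_pre (ms)."""
    if delta_t > 0:
        return a_plus * np.exp(-delta_t / tau_plus)    # pre before post -> LTP
    return -a_minus * np.exp(delta_t / tau_minus)      # post before pre -> LTD

for dt in (-40, -10, -1, 1, 10, 40):
    print(f"delta_t = {dt:>4} ms -> dw = {stdp_dw(dt):+.5f}")
```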
STDP has been observed in a variety of biological systems, including the neocortex, hippocampus, and cerebellum. The rule has been shown to play a crucial role in the development and plasticity of neural circuits, including learning and memory processes. STDP has also been used as a basis for the development of artificial neural networks, which are designed to mimic the structure and function of the brain.
PROCESSES THAT SUPPORT LEARNING IN ARTIFICIAL NEURAL NETWORKS
There are two primary approaches for weight optimization in artificial neural networks: error-driven global learning and brain-inspired local learning. In the first approach, the network weights are modified by driving a global error to its minimum value. This is achieved by assigning a portion of the error to each weight and coordinating the modifications across weights. In contrast, brain-inspired local learning algorithms aim to learn in a more biologically plausible manner by modifying weights from dynamical equations using locally available information. Both optimization approaches have unique benefits and drawbacks. In the following sections, we will discuss the most commonly utilized form of error-driven global learning, backpropagation, followed by in-depth discussions of brain-inspired local algorithms. It is worth mentioning that these two approaches are not mutually exclusive and will often be integrated in order to complement their respective strengths.59–62
Backpropagation
Backpropagation is a powerful error-driven global learning method that changes the weights of connections between neurons in a neural network to produce a desired target behavior.63 This is accomplished through the use of a quantitative metric (an objective function) that describes the quality of behavior given sensory information (e.g., visual input, written text, and robotic joint positions). The backpropagation algorithm consists of two phases: the forward pass and the backward pass. In the forward pass, the input is propagated through the network, and the output is calculated. During the backward pass, the error between the predicted output and the “true” output is computed, and the gradients of the loss function with respect to the weights of the network are obtained by propagating this error backward through the network. These gradients are then used to update the weights of the network using an optimization algorithm such as stochastic gradient descent. This process is repeated for many iterations until the weights converge to a set of values that minimize the loss function.
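The two phases described above can be written out explicitly for a small two-layer network, as in the following NumPy sketch; the architecture, toy data, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data.
x = rng.normal(size=(256, 4))
y = np.sin(x.sum(axis=1, keepdims=True))

# Two-layer network parameters and learning rate.
w1 = rng.normal(scale=0.5, size=(4, 16))
w2 = rng.normal(scale=0.5, size=(16, 1))
lr = 0.05

for epoch in range(500):
    # Forward pass: propagate the input and evaluate the objective (loss).
    h = np.tanh(x @ w1)
    y_hat = h @ w2
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: propagate the error and compute gradients of the loss
    # with respect to every weight matrix.
    d_yhat = 2 * (y_hat - y) / len(x)       # dL/dy_hat
    grad_w2 = h.T @ d_yhat                  # dL/dw2
    d_h = (d_yhat @ w2.T) * (1 - h ** 2)    # error propagated back through tanh
    grad_w1 = x.T @ d_h                     # dL/dw1

    # Gradient descent step on the weights.
    w1 -= lr * grad_w1
    w2 -= lr * grad_w2

print("final loss:", loss)
```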
The impressive accomplishments of backpropagation have led neuroscientists to investigate whether it can provide a better understanding of learning in the brain. While it remains debated as to whether backpropagation variants could occur in the brain,65,66 it is clear that backpropagation in its current formulation is biologically implausible. Alternative theories suggest complex feedback circuits or the interaction of local activity and top-down signals (a “third-factor”) could support a similar form of backprop-like learning.65
Despite its impressive performance, there are still fundamental algorithmic challenges that follow from repeatedly applying backpropagation to network weights. One such challenge is a phenomenon known as catastrophic forgetting, where a neural network forgets previously learned information when training on new data.13 This can occur when the network is fine-tuned on new data or when the network is trained on a sequence of tasks without retaining the knowledge learned from previous tasks. Catastrophic forgetting is a significant hurdle for developing neural networks that can continuously learn from diverse and changing environments. Another challenge is that backpropagation requires backpropagating information through all the layers of the network, which can be computationally expensive and time-consuming, especially for very deep networks. This can limit the scalability of deep learning algorithms and make it difficult to train large models on limited computing resources. Nonetheless, backpropagation has remained the most widely used and successful algorithm for applications involving artificial neural networks.
Evolutionary and genetic algorithms
Another class of global learning algorithms that has gained significant attention in recent years is evolutionary and genetic algorithms. These algorithms are inspired by the process of natural selection and, in the context of ANNs, aim to optimize the weights of a neural network by mimicking the evolutionary process. In genetic algorithms,67 a population of neural networks is initialized with random weights, and each network is evaluated on a specific task or problem. The networks that perform better on the task are then selected for reproduction, whereby they produce offspring with slight variations in their weights. This process is repeated over several generations so that well-performing behaviors become increasingly prevalent in the population. Evolutionary algorithms operate similarly to genetic algorithms but take a different approach, approximating a stochastic gradient.68,69 This is accomplished by perturbing the weights and combining the resulting objective-function evaluations to update the parameters. This results in a more global search of the weight space, which can be more efficient at finding optimal solutions compared to local search methods such as backpropagation.70
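The following sketch illustrates how an evolutionary-strategy-style update can approximate a stochastic gradient by perturbing parameters and combining the resulting objective values; the toy fitness function and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(params):
    # Toy objective: higher is better as params approach a fixed target vector.
    target = np.linspace(-1.0, 1.0, params.size)
    return -np.sum((params - target) ** 2)

params = np.zeros(20)
pop_size, sigma, lr = 50, 0.1, 0.05

for generation in range(300):
    # Evaluate a population of randomly perturbed parameter vectors.
    eps = rng.normal(size=(pop_size, params.size))
    rewards = np.array([fitness(params + sigma * e) for e in eps])

    # Combine the perturbations, weighted by normalized reward, into a
    # stochastic gradient estimate and take a step along it.
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad_estimate = (advantages[:, None] * eps).mean(axis=0) / sigma
    params += lr * grad_estimate

print("final fitness:", fitness(params))
```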
One advantage of these algorithms is their ability to search a vast parameter space efficiently, making them suitable for problems with large numbers of parameters or complex search spaces. In addition, they do not require a differentiable objective function, which can be useful in scenarios where the objective function is difficult to define or calculate (e.g., spiking neural networks). However, these algorithms also have some drawbacks. One major limitation is the high computational cost required to evaluate and evolve a large population of networks. Another challenge is the potential for the algorithm to become stuck in local optima or to converge too quickly, resulting in suboptimal solutions. In addition, the use of random mutations can lead to instability and unpredictability in the learning process.
Regardless, evolutionary and genetic algorithms have shown promising results in various applications, particularly when optimizing non-differentiable and non-trivial parameter spaces. Ongoing research is focused on improving the efficiency and scalability of these algorithms, as well as discovering where and when it makes sense to use these approaches instead of gradient descent.
BRAIN-INSPIRED REPRESENTATIONS OF LEARNING IN ARTIFICIAL NEURAL NETWORKS
Local learning algorithms
Unlike global learning algorithms such as backpropagation, which require information to be propagated through the entire network, local learning algorithms focus on updating synaptic weights based on local information from nearby or synaptically connected neurons (Fig. 2). These approaches are often strongly inspired by the plasticity of biological synapses. As will be seen, by leveraging local learning algorithms, ANNs can learn more efficiently and adapt to changing input distributions, making them better suited for real-world applications. In this section, we will review recent advances in brain-inspired local learning algorithms and their potential for improving the performance and robustness of ANNs (see Fig. 2).
Feedforward neural network computes an output given an input by propagating the input information downstream. The precise value of the output is determined by the weight of synaptic coefficients. To improve the output for a task given an input, the synaptic weights are modified. Synaptic plasticity algorithms represent computational models that emulate the brain’s ability to strengthen or weaken synapses—connections between neurons—based on their activity, thereby facilitating learning and memory formation. Three-factor plasticity refers to a model of synaptic plasticity in which changes to the strength of neural connections are determined by three factors: presynaptic activity, post-synaptic activity, and a modulatory signal, facilitating more nuanced and adaptive learning processes. The feedback alignment algorithm is a learning technique in which artificial neural networks are trained using random, fixed feedback connections rather than symmetric weight matrices, demonstrating that successful learning can occur without precise backpropagation. Backpropagation is a fundamental algorithm in machine learning and artificial intelligence used to train neural networks by calculating the gradient of the loss function with respect to the weights in the network.
Backpropagation-derived local learning
Backpropagation-derived local learning algorithms are a class of local learning algorithms that attempt to emulate the mathematical properties of backpropagation. Unlike the traditional backpropagation algorithm, which involves propagating error signals back through the entire network, backpropagation-derived local learning algorithms update synaptic weights based on local error gradients computed using backpropagation. This approach is computationally efficient and allows for online learning, making it suitable for applications where training data are continually arriving.
One prominent example of backpropagation-derived local learning algorithms is the Feedback Alignment (FA) algorithm,71,72 which replaces the weight-transport matrix used in backpropagation with a fixed random matrix, allowing the error signal to reach earlier layers without being backpropagated through the transpose of the forward weights. A brief mathematical description of feedback alignment is as follows: let w_out be the weight matrix connecting the last layer of the network to the output, and w_in be the weight matrix connecting the input to the first layer. In feedback alignment, the error signal is propagated from the output to the input using a fixed random matrix B rather than the transpose of w_out. The weight updates are then computed using the product of the input and the error signal, Δw_in = −η x z, where x is the input, η is the learning rate, and z is the error signal propagated backward through the network, similar to traditional backpropagation.
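A minimal sketch of feedback alignment for a two-layer network is given below; it contrasts the fixed random feedback matrix B with the weight transpose that backpropagation would use. The teacher-generated task and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task generated by a fixed "teacher" mapping.
x = rng.normal(size=(512, 10))
teacher = rng.normal(size=(10, 2))
y = np.tanh(x @ teacher)

w_in = rng.normal(scale=0.1, size=(10, 32))   # input -> hidden weights
w_out = rng.normal(scale=0.1, size=(32, 2))   # hidden -> output weights
b = rng.normal(scale=0.1, size=(2, 32))       # fixed random feedback matrix B
lr = 0.05

for step in range(2000):
    # Forward pass.
    h = np.tanh(x @ w_in)
    y_hat = h @ w_out
    err = y_hat - y                            # output error

    # Feedback alignment: the error reaches the hidden layer through the fixed
    # random matrix B instead of w_out.T, which backpropagation would require.
    z = (err @ b) * (1 - h ** 2)

    # Weight updates from the products of layer inputs and error signals.
    w_out -= lr * h.T @ err / len(x)
    w_in -= lr * x.T @ z / len(x)

print("final mean squared error:", np.mean(err ** 2))
```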
Direct Feedback Alignment72 (DFA) simplifies the weight-transport chain compared with FA by directly connecting the output-layer error to each hidden layer. The Sign-Symmetry (SS) algorithm is similar to FA except that the feedback weights symmetrically share signs with the forward weights. Recent progress in feedback alignment explores incorporating backprojections with cortical hierarchies73 and learning without distinct forward and backward phases. While FA has exhibited impressive results on small datasets such as MNIST and CIFAR, its performance on larger datasets such as ImageNet is often suboptimal.74 On the other hand, recent studies have shown that the SS algorithm is capable of achieving comparable performance to backpropagation, even on large-scale datasets.75
Eligibility propagation60,76 (e-prop) extends the idea of feedback alignment to spiking neural networks, combining the advantages of both traditional error backpropagation and biologically plausible learning rules, such as spike-timing-dependent plasticity (STDP). For each synapse, the e-prop algorithm computes and maintains an eligibility trace e_ij(t), derived based on real-time recurrent learning.77,78 Eligibility traces measure the total contribution of this synapse to the neuron’s current output, taking into account all past inputs.3 This can be computed and updated in a purely forward manner, without backward passes. The eligibility trace is then multiplied by an estimate L_j(t) of the gradient of the error with respect to the neuron’s output to obtain the actual weight gradient. L_j(t) itself is computed from the error at the output neurons, either by using symmetric feedback weights or by using fixed feedback weights, as in feedback alignment. A possible drawback of e-prop is that it requires a real-time error signal L_j(t) at each point in time since it only takes into account past events and is blind to future errors. In particular, it cannot learn from delayed error signals that extend beyond the time scales of individual neurons (including short-term adaptation),60 in contrast with methods such as REINFORCE and node perturbation. In addition, the weight update is an approximation of the true gradient, which can lead to difficulties in spatial scaling.
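The following is a heavily simplified, illustrative sketch of an eligibility-trace update in the spirit of e-prop: a filtered presynaptic trace is gated by a surrogate derivative of the post-synaptic neuron and combined with an online learning signal. The neuron model, the surrogate derivative, the rate-based learning signal, and all parameters are assumptions made for illustration and omit many details of the published algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pre, n_post, t_steps = 20, 5, 200
w = rng.normal(scale=0.1, size=(n_post, n_pre))
eta, alpha, kappa = 0.01, 0.9, 0.9        # learning rate, membrane and trace decays

pre_spikes = (rng.random((t_steps, n_pre)) < 0.1).astype(float)
target_rate = 0.05                        # used to build a toy online learning signal

v = np.zeros(n_post)                      # membrane potentials
filtered_pre = np.zeros(n_pre)            # low-pass filtered presynaptic spikes
eligibility = np.zeros((n_post, n_pre))   # eligibility traces e_ij(t)
dw = np.zeros_like(w)

for t in range(t_steps):
    x = pre_spikes[t]
    v = alpha * v + w @ x
    spikes = (v > 1.0).astype(float)
    v -= spikes                           # soft reset after a spike

    # Surrogate derivative of the spike nonlinearity around the threshold.
    psi = np.maximum(0.0, 1.0 - np.abs(v - 1.0))

    # Forward-computed eligibility trace: filtered presynaptic activity gated
    # by the post-synaptic surrogate derivative, then low-pass filtered.
    filtered_pre = alpha * filtered_pre + x
    eligibility = kappa * eligibility + np.outer(psi, filtered_pre)

    # Toy online learning signal L_j(t): deviation from a target firing rate.
    learning_signal = spikes - target_rate

    # Accumulate the weight gradient from the product L_j(t) * e_ij(t).
    dw += learning_signal[:, None] * eligibility

w -= eta * dw
print("mean absolute weight update:", np.abs(eta * dw).mean())
```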
In the work of Refs. 79 and 80, a normative theory for synaptic learning based on recent genetic findings81 of neuronal signaling architectures is demonstrated. They propose that neurons communicate their contribution to the learning outcome to nearby neurons via cell-type-specific local neuromodulation and that neuron-type diversity and neuron-type-specific local neuromodulation may be critical pieces of the biological credit-assignment puzzle. In this work, the authors instantiate a simplified computational model based on eligibility propagation to explore this theory and show that their model, which includes both dopamine-like temporal difference and neuropeptide-like local modulatory signaling, leads to improvements over previous methods such as e-prop and feedback alignment.
Generalization properties: Techniques in deep learning have made tremendous strides toward understanding the generalization of their learning algorithms. A particularly useful discovery was that flat minima tend to lead to better generalization.82 What is meant by this is that, given a perturbation ɛ in the parameter space (synaptic weight values), more significant performance degradation is observed around narrower minima. Learning algorithms that find flatter minima in parameter space ultimately lead to better generalization.
Recent work has explored the generalization properties exhibited by (brain-inspired) backpropagation-derived local learning rules.83 Compared with backpropagation through time, backpropagation-derived local learning rules exhibit worse and more variable generalization, which does not improve by scaling the step size due to the gradient approximation being poorly aligned with the true gradient. While it is perhaps unsurprising that local approximations of an optimization process are going to have worse generalization properties than their complete counterpart, this work opens the door toward asking new questions about what the best approach toward designing brain-inspired learning algorithms is. It also opens the question as to whether backpropagation-derived local learning rules are even worth exploring given that they are fundamentally going to exhibit sub-par generalization.
In conclusion, while backpropagation-derived local learning rules present themselves as a promising approach to designing brain-inspired learning algorithms, they come with limitations that must be addressed. The poor generalization of these algorithms highlights the need for further research to improve their performance and to explore alternative brain-inspired learning rules.
Meta-optimized plasticity rules
Meta-optimized plasticity rules offer an effective balance between error-driven global learning and brain-inspired local learning. Meta-learning can be defined as the automation of the search for learning algorithms themselves, where, instead of relying on human engineering to describe a learning algorithm, a search process to find that algorithm is employed.84 The idea of meta-learning naturally extends to brain-inspired learning algorithms, such that the brain-inspired mechanism of learning itself can be optimized, thereby allowing for the discovery of more efficient learning without manual tuning of the rule. In the following section, we discuss various aspects of this research, starting with differentiably optimized synaptic plasticity rules.
Differentiable plasticity: One instantiation of this principle in the literature is differentiable plasticity, a framework that optimizes synaptic plasticity rules in neural networks through gradient descent.85,86 In this framework, the plasticity rules are parameterized such that the terms governing their dynamics are differentiable, allowing backpropagation to be used for meta-optimization of the plasticity-rule parameters (e.g., the η term in the simple Hebbian rule or the A_+ term in the STDP rule). This allows the weight dynamics to precisely solve a task that requires the weights to be optimized during execution time, referred to as intra-lifetime learning.
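The following PyTorch sketch illustrates the idea of differentiable plasticity on a toy pattern-completion task: a Hebbian trace evolves during each episode, and the baseline weights, per-synapse plasticity coefficients, and plasticity rate are meta-optimized by backpropagating through those updates. The task, architecture, and parameterization are illustrative assumptions rather than a reproduction of any specific published model.

```python
import torch

torch.manual_seed(0)

n = 20                                               # layer size
w = (0.01 * torch.randn(n, n)).requires_grad_()      # baseline weights (meta-learned)
alpha = (0.01 * torch.randn(n, n)).requires_grad_()  # per-synapse plasticity (meta-learned)
eta = torch.tensor(0.1, requires_grad=True)          # plasticity rate (meta-learned)
opt = torch.optim.Adam([w, alpha, eta], lr=1e-2)

for episode in range(200):
    # One episode: store a random bipolar pattern via Hebbian updates, then
    # recall it from a partial cue (a simple pattern-completion task).
    pattern = torch.sign(torch.randn(1, n))
    hebb = torch.zeros(n, n)

    # Intra-lifetime phase: present the pattern and update the Hebbian trace.
    for _ in range(5):
        pre = pattern
        post = torch.tanh(pre @ (w + alpha * hebb))
        hebb = (1 - eta) * hebb + eta * pre.t() @ post

    # Recall from a degraded cue using the plastic weights.
    cue = pattern.clone()
    cue[:, n // 2:] = 0.0
    out = torch.tanh(cue @ (w + alpha * hebb))
    loss = torch.mean((out - pattern) ** 2)

    # Meta-optimization: backpropagate through the plastic updates to tune
    # the baseline weights, plasticity coefficients, and plasticity rate.
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final recall loss:", loss.item())
```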
Differentiable plasticity rules are also capable of the differentiable optimization of neuromodulatory dynamics.61,86 This framework includes two main variants of neuromodulation: global neuromodulation, where the direction and magnitude of weight changes are controlled by a network-output-dependent global parameter, and retroactive neuromodulation, where the effect of past activity is modulated by a dopamine-like signal within a short time window. This is enabled by the use of eligibility traces, which are used to keep track of which synapses contributed to recent activity, and the dopamine signal modulates the transformation of these traces into actual plastic changes.
Methods involving differentiable plasticity have seen improvements in a wide range of applications, from sequential associative tasks87 to familiarity detection88 to robotic noise adaptation.61 This method has also been used to optimize short-term plasticity rules,88,89 which exhibit improved performance in reinforcement and temporal supervised learning problems. While these methods show much promise, differentiable plasticity approaches take a tremendous amount of memory, as backpropagation is used to optimize multiple parameters for each synapse over time. Practical advancements with these methods will likely require parameter sharing90 or a more memory-efficient form of backpropagation.91
Plasticity with spiking neurons: Recent advances in backpropagating through the non-differentiable part of spiking neurons with surrogate gradients have allowed for differentiable plasticity to be used to optimize plasticity rules in spiking neural networks.61 In Ref. 62, the capability of this optimization paradigm is demonstrated through the use of a differentiable spike-timing dependent plasticity rule to enable “learning to learn” on an online one-shot continual learning problem and an online one-shot image class recognition problem. A similar method was used to optimize the third-factor signal using the gradient approximation of e-prop as the plasticity rule, introducing a meta-optimization form of e-prop.92 Recurrent neural networks tuned by evolution can also be used for meta-optimized learning rules. Evolvable Neural Units93 (ENUs) introduce a gating structure that controls how the input is processed and stored, and dynamic parameters are updated. This work demonstrates the evolution of individual somatic and synaptic compartment models of neurons and shows that a network of ENUs can learn to solve a T-maze environment task, independently discovering spiking dynamics and reinforcement-type learning rules. Meta-learning has also been introduced to optimize the natural physical structure of spiking reservoir systems to determine the optimal initialization before a task is learned.94
Plasticity in RNNs and Transformers: Independent of research aiming at learning plasticity using update rules, Transformers have recently been shown to be good intra-lifetime learners.5,95,96 The process of in-context learning works not through the update of synaptic weights but purely within the network activations. As in Transformers, this process can also happen in recurrent neural networks.97 While in-context learning initially appears to be a different mechanism from synaptic plasticity, these processes have been demonstrated to exhibit a strong relationship. One exciting connection discussed in the literature is the realization that parameter-sharing by the meta-learner often leads to the interpretation of activations as weights.98 This demonstrates that, while these models may have fixed weights, they exhibit some of the same learning capabilities as models with plastic weights. Another connection is that self-attention in the Transformer involves outer and inner products that can be cast as learned weight updates99 that can even implement gradient descent.100,101
Evolutionary and genetic meta-optimization: Much like differentiable plasticity, evolutionary and genetic algorithms have been used to optimize the parameters of plasticity rules in a variety of applications,102 including adaptation to limb damage on robotic systems.103,104 Recent work has also enabled the optimization of both plasticity coefficients and plasticity rule equations through the use of Cartesian genetic programming,105 presenting an automated approach for discovering biologically plausible plasticity rules based on the specific task being solved. In these methods, the genetic or evolutionary optimization process acts similarly to the differentiable process in that it optimizes the plasticity parameters in an outer-loop process, while the plasticity rule optimizes the reward in an inner-loop process. These methods are appealing since they have a much lower memory footprint compared to differentiable methods since they do not require backpropagating errors over time. However, while memory efficient, they often require a tremendous amount of data to get comparable performance to gradient-based methods.106
Self-referential meta-learning: While synaptic plasticity has two levels of learning, the meta-learner and the discovered learning rule, self-referential meta-learning107,108 extends this hierarchy. In plasticity approaches, only a subset of the network parameters are updated (e.g., the synaptic weights), whereas the meta-learned update rule remains fixed after meta-optimization. Self-referential architectures enable a neural network to modify all of its parameters in a recursive fashion. Thus, the learner can also modify the meta-learner. This, in principle, allows arbitrary levels of learning, meta-learning, meta–meta-learning, etc. Some approaches meta-learn the parameter initialization of such a system.107,109 Finding this initialization still requires a hardwired meta-learner. In other works, the network self-modifies in a way that eliminates even this meta-learner.108,110 Sometimes the learning rule to be discovered has structural search-space restrictions that simplify self-improvement, where a gradient-based optimizer can discover itself111 or an evolutionary algorithm can optimize itself.112 Despite their differences, both synaptic plasticity and self-referential approaches aim to achieve self-improvement and adaptation in neural networks.
Generalization of meta-optimized learning rules: The extent to which discovered learning rules generalize to a wide range of tasks is a significant open question—in particular, when should they replace manually derived general-purpose learning rules such as backpropagation? A particular observation that poses a challenge to these methods is that when the search space is large and few restrictions are put on the learning mechanism,97,113,114 generalization is shown to become more difficult. However, toward amending this, in variable shared meta-learning,98 flexible learning rules were parameterized by parameter-shared recurrent neural networks that locally exchange information to implement learning algorithms that generalize across classification problems not seen during meta-optimization. Similar results have also been shown for the discovery of reinforcement learning algorithms.115
APPLICATIONS OF BRAIN-INSPIRED LEARNING
Neuromorphic computing: Neuromorphic computing represents a paradigm shift in the design of computing systems, with the goal of creating hardware that mimics the structure and functionality of the biological brain.42,116,117 This approach seeks to develop artificial neural networks that not only replicate the brain’s learning capabilities but also its energy efficiency and inherent parallelism. Neuromorphic computing systems often incorporate specialized hardware, such as neuromorphic chips or memristive devices, to enable the efficient execution of brain-inspired learning algorithms.117,118 These systems have the potential to drastically improve the performance of machine learning applications, particularly in edge computing and real-time processing scenarios.
A key aspect of neuromorphic computing lies in the development of specialized hardware architectures that facilitate the implementation of spiking neural networks, which more closely resemble the information processing mechanisms of biological neurons. Neuromorphic systems operate based on the principle of brain-inspired local learning, which allows them to achieve high energy efficiency, low-latency processing, and robustness against noise, which are critical for real-world applications.119 The integration of brain-inspired learning techniques with neuromorphic hardware is vital for the successful application of this technology.
In recent years, advances in neuromorphic computing have led to the development of various platforms, such as Intel’s Loihi,120 IBM’s TrueNorth,121 and SpiNNaker,122 which offer specialized hardware architectures for implementing SNNs and brain-inspired learning algorithms. These platforms provide a foundation for further exploration of neuromorphic computing systems, enabling researchers to design, simulate, and evaluate novel neural network architectures and learning rules. As neuromorphic computing continues to progress, it is expected to play a pivotal role in the future of artificial intelligence, driving innovation and enabling the development of more efficient, versatile, and biologically plausible learning systems.123,124
Robotic learning: Brain-inspired learning in neural networks has the potential to overcome many of the current challenges present in the field of robotics by enabling robots to learn and adapt to their environment in a more flexible way.125,126 Traditional robotics systems rely on pre-programmed behaviors, which are limited in their ability to adapt to changing conditions. In contrast, as we have shown in this review, neural networks can be trained to adapt to new situations by adjusting their internal parameters based on the data they receive.
Because of this natural relationship, brain-inspired learning algorithms have a long history in robotics.125 Toward this end, synaptic plasticity rules have been introduced for adapting robotic behavior to domain shifts such as motor gains and rough terrain,61,127–129 as well as for obstacle avoidance130–132 and articulated (arm) control.133,134 Brain-inspired learning rules have also been used to explore how learning occurs in the insect brain using robotic systems as an embodied medium.135–138
Deep reinforcement learning (DRL) represents a significant success of brain-inspired learning algorithms, combining the strengths of neural networks with the theory of reinforcement learning in the brain to create autonomous agents capable of learning complex behaviors through interaction with their environment.139–141 By utilizing a reward-driven learning process emulating the activity of dopamine neurons142 as opposed to the minimization of, e.g., classification or regression error, DRL algorithms guide robots toward learning optimal strategies to achieve their goals, even in highly dynamic and uncertain environments.143,144 This powerful approach has been demonstrated in a variety of robotic applications, including dexterous manipulation, robotic locomotion,145 and multi-agent coordination.146
Lifelong and online learning: Lifelong and online learning are essential applications of brain-inspired learning in artificial intelligence, as they enable systems to adapt to changing environments and continuously acquire new skills and knowledge.14 Traditional machine learning approaches, in contrast, are typically trained on a fixed dataset and lack the ability to adapt to new information or changing environments. The mature brain is an incredible medium for lifelong learning, as it is constantly learning while remaining relatively fixed in size across the span of a lifetime.147 As this review has demonstrated, neural networks endowed with brain-inspired learning mechanisms, similar to the brain, can be trained to learn and adapt continuously, improving their performance over time.
The development of brain-inspired learning algorithms that enable artificial systems to exhibit this capability has the potential to significantly enhance their performance and capabilities and has wide-ranging implications for a variety of applications. These applications are particularly useful in situations where data are scarce or expensive to collect, such as in robotics148 or autonomous systems,149 as they allow the system to learn and adapt in real-time rather than requiring large amounts of data to be collected and processed before learning can occur.
One of the primary objectives in the field of lifelong learning is to alleviate a major issue associated with the continuous application of backpropagation on ANNs, a phenomenon known as catastrophic forgetting.13 Catastrophic forgetting refers to the tendency of an ANN to abruptly forget previously learned information upon learning new data. This happens because the weights in the network that were initially optimized for earlier tasks are drastically altered to accommodate the new learning, thereby erasing or overwriting the previous information; the backpropagation algorithm does not inherently factor in the need to preserve previously acquired information while facilitating new learning. Solving this problem has remained a significant hurdle in AI for decades. We posit that by employing brain-inspired learning algorithms that emulate the dynamic learning mechanisms of the brain, we may be able to capitalize on the proficient problem-solving strategies inherent to biological organisms.
Toward understanding the brain: The worlds of artificial intelligence and neuroscience have been greatly benefiting from each other. Deep neural networks, specially tailored for certain tasks, show striking similarities to the human brain in how they handle spatial150–152 and visual153–155 information. This overlap hints at the potential of artificial neural networks (ANNs) as useful models in our efforts to better understand the brain’s complex mechanics. A new movement referred to as the neuroconnectionist research program156 embodies this combined approach, using ANNs as a computational language to form and test ideas about how the brain computes. This perspective brings together different research efforts, offering a common computational framework and tools to test specific theories about the brain.
While this review highlights a range of algorithms that imitate the brain’s functions, we still have a substantial amount of work to do to fully grasp how learning actually happens in the brain. The use of backpropagation and backpropagation-like local learning rules to train large neural networks may provide a good starting point for modeling brain function. Much productive investigation has occurred to see what processes in the brain may operate similarly to backpropagation,65 leading to new perspectives and theories in neuroscience. Even though backpropagation in its current form might not occur in the brain, the idea that the brain might develop similar internal representations to ANNs despite such different mechanisms of learning is an exciting open question that may lead to a deeper understanding of the brain and AI.
Explorations are now extending beyond static network dynamics to the networks that unravel as a function of time, much like the brain. As we further develop algorithms for continual and lifelong learning, it may become clear that our models need to reflect the learning mechanisms observed in nature more closely. This shift in focus calls for the integration of local learning rules—those that mirror the brain’s own methods—into ANNs.
We are convinced that adopting more biologically authentic learning rules within ANNs will not only yield the aforementioned benefits, but it will also serve to point neuroscience researchers in the right direction. In other words, it is a strategy with a two-fold benefit: not only does it promise to invigorate innovation in engineering, but it also brings us closer to unraveling the intricate processes at play within the brain. With more realistic models, we can probe deeper into the complexities of brain computation from the novel perspective of artificial intelligence.
CONCLUSION
In this review, we investigate the integration of more biologically plausible learning mechanisms into ANNs. This further integration presents itself as an important step for both neuroscience and artificial intelligence. This is particularly relevant amid the tremendous progress that has been made in artificial intelligence with large language models and embedded systems, which are in critical need of more energy-efficient approaches for learning and execution. In addition, while ANNs are making great strides in these applications, there are still major limitations in their ability to adapt as biological brains do, and we see addressing this gap as a primary application of brain-inspired learning mechanisms.
As we strategize for future collaboration between neuroscience and AI toward more detailed brain-inspired learning algorithms, it is important to acknowledge that the past influences of neuroscience on AI have seldom been about a straightforward application of ready-made solutions to machines.157 More often than not, neuroscience has stimulated AI researchers by posing intriguing algorithmic-level questions about aspects of animal learning and intelligence. It has provided preliminary guidance toward vital mechanisms that support learning. Our perspective is that by harnessing the insights drawn from neuroscience, we can significantly accelerate advancements in the learning mechanisms used in ANNs. Similarly, experiments using brain-like learning algorithms in AI can accelerate our understanding of neuroscience.
ACKNOWLEDGMENTS
We thank the OpenBioML collaborative workspace, to which several of the authors of this work were connected. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship for Comp/IS/Eng—Robotics under Grant Nos. DGE2139757 and 2235440.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Samuel Schmidgall: Conceptualization (lead); Investigation (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead). Rojin Ziaei: Conceptualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Jascha Achterberg: Writing – review & editing (equal). Louis Kirsch: Writing – original draft (supporting); Writing – review & editing (supporting). S. Pardis Hajiseyedrazi: Visualization (equal); Writing – review & editing (equal). Jason Eshraghian: Supervision (equal); Writing – original draft (supporting); Writing – review & editing (equal).
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.