The elementary basis of intelligence in organisms with a central nervous system includes neurons and synapses and their complex interconnections forming neural circuits. In non-neural organisms such as the slime mold, with its gel-like body, viscosity modulation enables adaptation to changing environments. At a larger scale, collective intelligence emerges via social interactions and feedback in animal colonies. Learning and memory are therefore multi-scale features that evolve as a result of constant interactions with the environment. There is growing interest in emulating such features of intelligence in computing machines and autonomous systems. Materials that can respond to their environment in a manner similar to organisms (referred to as “organismic materials”) may therefore be of interest as hardware components in artificial intelligence machines. In this brief review, we present a class of semiconductors called correlated oxides as candidates for learning machines. The term “correlated” refers to the fact that electrons in such lattices strongly interact and the ground state is not what is predicted by classical band theory. Such materials can undergo insulator–metal transitions at near-ambient conditions under external stimuli such as thermal or electrical fields, strain, and chemical doping. Depending on the mechanism driving the transition, intermediate states can be metastable with different volatilities, and the time scales of phase change can be controlled over many orders of magnitude. The change in electronic properties can be sharp or gradual, leading to digital or analog behavior. These properties enable the realization of artificial neurons and synapses and emulate the associative and non-associative learning characteristics found in various organisms.
We examine microscopic properties concerning electronic and structural transitions leading to collective behavior and theoretical treatments of the ground state and dynamical response, showcasing VO2 as a model system. Next, we briefly review algorithms inspired by the plasticity demonstrated by phase-changing systems. We conclude the brief review with suggestions for future research toward realizing non-von Neumann machines.
I. ORGANISMIC INTELLIGENCE IN NATURE AND EMULATION IN SYNTHETIC MATTER
The manifestation of intelligence in nature appears at multiple length scales. Within the brain, signaling relies on chemical neurotransmitters whose concentrations are regulated by neurons and synapses, as well as the associated circuitry. Neurons and synapses are the components that transmit and modulate signals across specialized circuits. In this system, the parameters are dynamically adjusted for different tasks, as represented by signals being given different synaptic “weights” (different signal transmissivities) between pre- and post-neurons. Electrochemical pumps acting across cell membranes control the ion concentration gradients, e.g., Na+ and K+ voltage-gated ion channels that lead to the generation of signals (action potentials). The axons of neurons are encased in myelin sheaths that mitigate leakage of the signals propagating to the synaptic junctions, where the strength of the electrical impulses is amplified or diminished. Furthermore, the frequency of the impulses carries information about the stimulus and the decisions made in response to it. While our understanding of behavior at the level of single neurons has improved significantly over the last few decades, understanding how behavior emerges at the network level remains a highly active research area in neuroscience. At even larger scales, intelligence is known to be emergent in animal colonies via social learning, feedback, and adaptation. Intelligence manifested across scales in biological systems is summarized in Fig. 1.
Interestingly, a plethora of recent studies have found that intelligence can be present in organisms that do not possess a central nervous system (CNS). Perhaps the most famous example is Physarum polycephalum, also known as slime mold.1–5 For instance, Boisseau et al. reported that slime mold can habituate to caffeine and quinine, which demonstrates non-associative learning in a simple organism.5 The work of Saigusa and co-workers demonstrated that slime mold can predict unnatural events, such as local environmental changes induced in a laboratory, that the organism does not normally encounter in nature.4 Such capabilities are remarkable because slime mold must rely on external spatial memory. Experiments on the slime mold point to a type of intelligence that existed prior to the development of complex organs such as the brain. In the case of slime mold, it has been suggested that its degree of viscosity can be adapted to the environment through frequency-tunable oscillatory networks built into its body.4 More recent work suggests that these oscillators adjust the structure of gel sheets in slime molds so that they can memorize periodic changes between favorable and unfavorable culture conditions. The appropriate response is determined after adaptively adjusting the gel sheet configuration in the body in different environments.6 A common characteristic of the nervous system and the slime gel is the change in internal chemical dynamics (i.e., the neurotransmitter concentration and gel sheet configuration) in response to external stimuli; see Figs. 2(a) and 2(b). Figure 3 depicts a few recent results from studies on slime mold, which demonstrate remarkable skills concerning foraging, path optimization, event prediction, and habituation. They illustrate how slime mold can employ external spatial memory to successfully find food and optimize its path.
These observations reveal the primitive origins of intelligence enabling the survival of species and predating the evolution of the central nervous system.
The development of intelligence in organisms that possess a central nervous system is currently a subject of intense study. It is known that the ability to learn continuously forms an important part of an individual organism's capacity to adapt to changing environments (aside from evolutionary knowledge imprinted at the DNA level), to develop defense mechanisms against predators, and so forth. There are many examples of collective intelligence, from the feeding behavior of bottlenose dolphins, synchronized birthing in the African mongoose, the flocking of birds, and schooling in cave fish to the foraging characteristics of ant colonies. The overarching question for the physical sciences is whether some aspects of evolutionary knowledge, gained and refined over hundreds of millions of years in nature, can be adapted in hardware to design future machines. One goal of artificial intelligence is to build machines that incorporate features found in organisms, such as learning. This represents a fundamental departure from devices based on the von Neumann architecture currently used to design microprocessors. Organism-inspired algorithms—including various associative and non-associative learning mechanisms, adaptation, and responsivity to dynamic environments—are already being implemented in software development. While these bio-mimetic features are helpful, the cost of mimicking them in Si-based devices is high and will increase significantly as the device structure becomes more complicated. Can machines be designed to possess lifelong learning ability (similar to the animal brain) and evolve via environmental feedback to outperform existing computers for solving certain problems? Can drones and other autonomous systems benefit from neuromorphic hardware to solve problems that otherwise would be impossible or too energy-consuming with existing von Neumann architectures?
These are some of the questions that have stimulated fundamental research into adaptive matter for non-von Neumann machine intelligence. Unlike modern PCs (i.e., von Neumann machines), where the computation unit and the memory unit are two separate entities (leading to a large amount of memory traffic for data retrieval and storage), computation in the brain is distributed and is, in some ways, performed in the memory itself. Such non-von Neumann processing with collocated memory and computation has the advantage of reduced latency and energy consumption. It is worth noting that some computer scientists have already considered this question and made significant progress in algorithm development. Swarm intelligence and bio-inspired programming have emerged as major research tools for solving a variety of practical problems.7–9 A remaining open question is whether adaptive hardware can augment such software to build better machines, with “better” implying greater energy efficiency for performing certain tasks, on-the-fly learning, or simplified circuit and software design.
In this review, we examine the use of correlated oxides, a class of quantum materials, as potential neuromorphic hardware. The term “correlated” refers to the fact that the electrons in these semiconductors strongly interact and the ground state is not metallic, as one would expect from the partially filled orbitals. Such semiconductors can further undergo a transition from an insulating state into a metallic state under external stimuli such as thermal fields, electrical fields, or chemical pressure. The intermediate pathways and the dynamics can be controlled by the actuation mechanism. Consequently, these oxides have the potential to exhibit the adaptive behaviors seen in both neural systems and slime molds, as shown in Fig. 2(c). Such adaptive properties are key to memory circuits in which computing and storage functions may be embedded in the same device and in which multiple memory states can be utilized. Correlated oxides can act as neurons, synapses, and even interconnects. Some exhibit insulator–metal transitions (IMTs) in which insulating behavior is observed below a critical temperature (TC) due to electron localization and metallic behavior above it.10 Since the IMT is typically a first-order transition, these materials exhibit hysteretic properties near TC. These systems are often broadly classified as Mott insulators, and many of them are transition metal oxides (TMOs).
We note that several review articles on the more general topic of phase change materials have appeared within the last decade, illustrating their growing importance in the condensed matter and device science communities.11–15 This is not particularly surprising, as Si-based complementary metal-oxide-semiconductor (CMOS) scaling is nearing its end, with the benefits of continued scaling diminishing due to excessive cost and power dissipation. The electronics industry that once motivated developments following Moore's law is now at risk of becoming a sustaining operation. Hence, solid-state physicists, electrical engineers, materials and chemical engineers, etc., must adapt their skill sets to stay relevant and impact the growing disciplines of autonomous intelligence, robotics, and machine learning. This presents an opportunity for researchers to think more broadly about electronics and computer hardware, without the need to follow narrowly defined research directions for the miniaturization of transistor/interconnect nodes.
Here, we discuss topics not extensively covered in other reviews and also present some material that may overlap topically with other publications for the sake of completeness. Our focus is on oxide materials that exhibit strong electron correlations and related systems in which structural and electronic phase changes can occur concurrently. While electronic transitions can occur on femtosecond to nanosecond time scales, it is also possible to continuously tune the time scale from microseconds to seconds or even hours through thermal/electrochemical stimulation, as shown in Fig. 4, with VO2 as a model system. While the traditional mantra for electronic switch design is “smaller, faster, and better,” in neuromorphic computing, it is not yet clear what timescales are most appropriate. For instance, the brain can outperform traditional computers in a wide variety of tasks while consuming far less power. Chemical signaling in the brain occurs at millisecond timescales and is far slower than CMOS switches. Therefore, it is not wise to ignore phenomena that offer interesting capabilities for analog signal processing even if they do not operate at the same time scale as a sub-10-nm node field effect transistor (FET). In fact, processing and storing information in the brain that is crucial for decision making and forecasting relies on events taking place in the chemical and electrical synapses over multiple timescales. Similarly, phase transitions in strongly correlated materials can be volatile, stable, or multistate, making them ideal for simulating complex information processing in learning (Figs. 5 and 6). We will explicate these emerging fields in the sections below, emphasizing the relevant materials physics.
Our review includes discussions of materials properties, operando characterization, atomistic simulations, and computing applications, with an emphasis on the strongly correlated oxides and perovskite halides, where the importance of electron correlations is still being elucidated. We note here the pioneering contributions of Leon Chua, who identified several analogies between information processing units in the neural system and artificial devices referred to as “memristors.”16,17 It is reasonable to suggest that synthetic matter that can adaptively respond to external stimuli opens a pathway to emulate biological behavior in a well-defined manner.
II. PHYSICS OF THE IMT IN STRONGLY CORRELATED OXIDES
A. Characteristics of the IMT and concomitant phase transitions
Correlated semiconductors possess ground states that are often insulating despite having cations with partially filled orbitals. This is due to the strong interactions between electrons arising from Coulomb repulsion. Such systems can undergo insulator–metal transitions (IMTs) triggered by a variety of external stimuli such as temperature, hydrostatic or chemical pressure, and electrical fields. Furthermore, a range of intermediate states can exist between the initial and final stable states depending on how the transition is driven. This can lead to large, tunable responses in the electrical conductivity and optical reflectance, allowing their use in adaptive hardware that can mimic the organismic response to environmental changes. While in memristors made of hafnia, titania, etc., one can create local filaments by introducing a high density of defects such as oxygen vacancies,18,19 in IMT systems, the entire volume of the device can undergo phase changes due to the collapse or opening of a bandgap. Intermediate states at different levels of electron filling may give rise to continuously tunable conductivity or, in other words, analog behavior. Hence, correlated, narrow gap semiconductors offer a complementary approach to conventional band insulators in the emerging fields of neuromorphic hardware design. Similar to other electron-correlated systems that require advanced fabrication approaches,20–24 the IMT is very sensitive to structural dynamics and can be controlled through the use of various parameters including pressure,25–29 strain,30,31 chemical doping,28,32–36 electrical gating,37,38 ion irradiation,39 electromagnetic waves,40,41 and temperature.10 Perhaps the most studied Mott insulator is VO2, which exhibits orders of magnitude change in resistance at the transition temperature near 335 K. The convenience of this temperature makes it a promising candidate for a variety of electronic applications. 
In many cases, the IMT is accompanied by some form of spin or charge ordering and a structural phase transition (SPT). For instance, the high-temperature metallic state of VO2 has a rutile structure, while the low-temperature insulating state is monoclinic with dimerized vanadium atoms having electrons in a spin-singlet state.42 There are many examples of concomitant phase transitions in Mott materials, including several vanadates (VO2, VnO2n−1, and La1−xSrxVO3),43–45 the rare-earth nickelates (ReNiO3),40–42 La1−xSrxFeO3,46–49 and Fe3O4,50 to name a few; for a more comprehensive list, see Ref. 10. The involvement of several degrees of freedom adds significant complexity to the IMT but possibly also offers a route to tune material properties. It is thus important to understand whether these phase transitions can be decoupled from one another under certain circumstances. For example, in rare-earth nickelates, the coupling between charge ordering and the antiferromagnetic transition depends on the identity of the rare-earth element.51 In other materials, some studies suggest that the SPT may become decoupled from the electronic IMT52–62 and may even be avoided altogether.63 This is especially important to understand, as the electronic degrees of freedom typically have much lower heat capacity and shorter characteristic time scales than structural ones, such that a purely electronic IMT should be far less energy- and time-consuming than one combined with an SPT.64 An additional advantage of a purely electronic IMT would be the elimination of any mechanical wear associated with recurring switching events. Determining whether the IMT and SPT are coupled is very challenging since it requires appropriate comparison of different techniques to measure the structural and electronic states. There are several contradicting reports, but for V2O3, it seems that the SPT and IMT are strongly coupled,65 whereas in VO2, the situation is less clear.52–62
B. Electrical switching: Thermal vs non-thermal IMT
Changes in resistance of several orders of magnitude can be achieved in a Mott insulator by triggering the IMT using current/voltage pulses. As will be discussed in Sec. III, this effect is of great importance since it can be used to emulate neuronal functionality. Despite this, the mechanism behind the electrically driven IMT and the energy consumption associated with it remain unclear. One intensely debated issue is whether the IMT can be triggered by a large electric field without heating the Mott insulator to TC. Despite some controversy, for VO2 devices, many reports indicate that the main effect of applying an electric field is Joule heating.66–75 Here, the IMT is triggered only when sufficient Joule heating drives the device to TC. The energy required to achieve thermal switching depends on the input power, heat dissipation rate, thermal capacitance, and latent heat associated with the IMT. Recent reports suggest that isostructural or non-thermally triggered IMTs in VO2 are possible.63,76,77 Mixed results have also been found in V2O3, with some claiming Joule heating to be the main driving mechanism for the IMT,78 while others suggest that a non-thermal mechanism associated with the electric field takes place.67 The energy required for a non-thermal IMT in this system is unknown but may potentially be orders of magnitude smaller than for a thermal IMT. This prospect has far-reaching implications for device performance. Reconciling the different behaviors observed in the various experiments is key to solving this fundamental physics problem of the IMT.
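The competition between Joule heating and heat dissipation described above can be captured in a minimal lumped-element model. The sketch below integrates C_th dT/dt = V^2/R_ins − (T − T_amb)/R_th with a simple forward-Euler loop and reports the time at which the film reaches TC, if it does. All component values are illustrative assumptions chosen for a readable example, not measured properties of any VO2 device.

```python
# Lumped-element sketch of Joule-heating-driven (thermal) switching in a
# VO2-like device. All parameter values are illustrative assumptions.

def time_to_switch(V, t_end=2e-6, dt=1e-9,
                   R_ins=1e5,      # insulating-state resistance (ohm)
                   C_th=1e-12,     # thermal capacitance (J/K)
                   R_th=1e5,       # thermal resistance to substrate (K/W)
                   T_amb=300.0,    # ambient temperature (K)
                   T_c=340.0):     # transition temperature (K)
    """Integrate C_th*dT/dt = V**2/R_ins - (T - T_amb)/R_th.
    Returns the time at which T reaches T_c (thermally triggered IMT),
    or None if the steady state stays below T_c."""
    T, t = T_amb, 0.0
    while t < t_end:
        P = V**2 / R_ins                          # Joule heating power
        T += dt * (P - (T - T_amb) / R_th) / C_th # forward-Euler step
        t += dt
        if T >= T_c:
            return t
    return None

# Low bias: steady-state temperature rise is P*R_th ~ 0.25 K, no switching.
# High bias: P*R_th ~ 100 K, so the film crosses T_c within ~50 ns.
low_bias = time_to_switch(0.5)    # None: device stays insulating
high_bias = time_to_switch(10.0)  # switching time on the order of tens of ns
```

In this toy model, the threshold bias simply corresponds to the voltage at which the steady-state temperature rise V^2 R_th / R_ins first exceeds TC − T_amb; the latent heat of the transition, noted above as a contribution to the switching energy, is neglected here for simplicity.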
C. Dynamics of electrical switching and relaxation
To determine the speed at which Mott insulator-based devices can operate, several time scales have to be considered. Consider a simple device in which a material exhibiting an IMT, such as VO2, is connected in series with a resistor Rseries79 [see Fig. 7(a)] and operated at T < TC, so that VO2 is initially in the high resistance (Rhigh) insulating state. Current pulses that emulate the neural input signals charge the capacitor with a time scale of (Rhigh + Rseries)C. A series of frequent pulses will successively charge the capacitor. If the voltage on VO2 surpasses a certain threshold (Vth), the electric field will trigger the IMT and the material will assume the low resistance (Rlow) metallic state. Due to the large resistance drop, the current through the device will surge. The role of Rseries is to lower the voltage and power on VO2 after switching (Rlow < Rseries < Rhigh). This allows the device to cool and return to the insulating state even while current is still driven through the device. This cycle accomplishes the leaky integrate-and-fire (LIF) functionality required of a neuron. If a constant current is applied instead of pulses, the device will fire repetitively through successive charging-firing-discharging cycles. The experimental response of such a circuit to a constant input current is shown in Figs. 7(b) and 7(c). As recently shown, more complex neural functionalities can be achieved using several similar devices coupled together.79–81
The time it takes to attain Vth depends on the applied voltage, the charging time (RhighC), and the time scale associated with the IMT (τIMT). The current spike duration will be determined by the discharge time of the capacitor (RlowC), the thermal relaxation time constant (τth), and the time scale associated with the metal–insulator transition (τMIT). τth is determined by the conduction of heat from the device to the substrate and is typically in the range of tens of nanoseconds. This, however, will vary depending on the choice of materials. The RC time scales will depend on the device properties and geometry. The most fundamental time scales are τIMT and τMIT. Pump-probe experiments show that τIMT may be as short as several picoseconds or even in the sub-picosecond regime.82–86 τMIT is much less studied and is difficult to disentangle from τth since the device is typically above TC after switching and must cool down before returning to the insulating state. If non-thermal transitions can be achieved, this would allow one to disentangle τMIT from τth. One of the few studies of these relaxation time scales in Mott materials has shown that τMIT diverges close to the phase transition in VO2 and may reach microseconds or more.87
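The charge-fire-discharge cycle and the RC time scales discussed above can be illustrated with a minimal simulation of the Fig. 7 circuit. The sketch below treats the IMT as an instantaneous, hysteretic resistance change (switching metallic at Vth and re-insulating at a lower hold voltage Vh), i.e., it lumps τIMT, τMIT, and τth into the two threshold voltages. All component values are illustrative assumptions, not parameters of a specific device.

```python
# Sketch of the LIF cycle of Fig. 7: a constant input current charges a
# capacitor shunted by a VO2 element in series with Rseries. The film is
# modeled as a hysteretic two-state resistor: it turns metallic when the
# capacitor voltage reaches V_th and re-insulates below a hold value V_h.
# All parameter values are illustrative assumptions.

def lif_spike_times(I_in, t_end=1e-3, dt=1e-7,
                    C=1e-9,          # capacitance (F)
                    R_series=1e4,    # series resistor (ohm)
                    R_high=1e5,      # VO2 insulating resistance (ohm)
                    R_low=1e3,       # VO2 metallic resistance (ohm)
                    V_th=2.0,        # switching (firing) threshold (V)
                    V_h=1.0):        # hold voltage for re-insulation (V)
    V, metallic = 0.0, False
    spikes, t = [], 0.0
    while t < t_end:
        R = (R_low if metallic else R_high) + R_series
        V += dt * (I_in - V / R) / C       # KCL at the capacitor node
        if not metallic and V >= V_th:     # IMT triggered: the neuron fires
            metallic = True
            spikes.append(t)
        elif metallic and V <= V_h:        # relaxation back to the insulator
            metallic = False
        t += dt
    return spikes

# A constant input current produces repetitive firing, and a larger input
# current charges the capacitor faster, raising the firing rate.
slow = lif_spike_times(40e-6)
fast = lif_spike_times(75e-6)
```

With R_low < R_series < R_high, as in the text, the charging time constant (R_high + R_series)C greatly exceeds the discharge time constant (R_low + R_series)C, which is what gives the output its spike-like shape.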
D. Controlling the transition temperature
Another important parameter to consider is the IMT temperature, TC. Operating at a temperature close to TC is beneficial since it generally reduces the field and power required to electrically trigger the transition. Also, from a design standpoint, TC should be tuned relative to the operating temperature of the device. One way of achieving this is by varying the strain transmitted into the Mott material through the substrate.65,88–91 The strain couples to the structural degrees of freedom, thereby affecting the SPT and consequently the concurrent IMT, such as shown in the example presented in Fig. 8(a). Effects on the IMT may also arise through strain-induced changes to the bandwidth and electron interactions. Intriguingly, it has been found that the microstructure may also play a crucial role in determining TC in a Mott insulator.92 Thus, not only lattice mismatch but also the substrate morphology can have dramatic effects on TC and other properties of the IMT. Strain can also be induced in ferroelectric/Mott insulator hybrids. Biasing the ferroelectric induces tunable and reversible strain in the Mott material, which can be used to tune the transition temperature dynamically and spatially.93–98
Another way of varying TC is through ion irradiation. In V2O3, a dramatic change of ∼60 K was observed upon ion irradiation [Fig. 8(b)],39 whereas an ∼40 K change in TC was observed in VO2.99,100 The nature of this effect is as yet speculative but may be related to the influence of disorder on different Mott insulators. Nevertheless, the effect could be especially useful since it allows for spatial tuning of TC using a focused ion beam or a mask to vary the irradiation dose. This facilitates the use of the same TMO for implementing both volatile (neuronal) and non-volatile (synaptic) resistive switching (RS). This is a challenging problem since the two types of switching require different temperatures below TC. For volatile switching, the material should be close to TC to be able to switch with low voltage/power. On the other hand, non-volatile switching requires high fields to induce oxygen migration, which cannot be achieved close to TC, since the IMT will be triggered before the high fields required for oxygen migration are attained.101 Using ion irradiation, both neural and synaptic functionalities may be realized in different parts of the sample simply by tuning the local irradiation dose.
III. APPLICATIONS FOR ELECTRICALLY DRIVEN IMT
A. Neuron-like applications
The growing needs of data-intensive analytical computing are driving computational hardware research toward a new paradigm. There has been significant research interest in the hardware implementation of spiking neural networks (SNNs), such as TrueNorth,102 Rolls,103 and others.104–108 SNNs are based upon a threshold-activated, integrate-and-fire neuron as the main device element, where an integration of current or charge within the device triggers a spiking response.109,110 Chipsets like TrueNorth are currently designed and built on silicon CMOS technology, which lacks features such as positive feedback and results in complicated integrated circuit designs containing six or more transistors.13,102 Strongly correlated oxides, on the other hand, are organismic materials that can inherently trigger positive feedback when subjected to an electrical stimulus; thermal runaway is one example of such positive feedback.
A simple VO2 neuronal circuit, consisting of a capacitor, an IMT element, and an optional resistor, is shown in Fig. 7.111 The working principle of this device is discussed in Sec. II C. Tuning the high and low resistance states of the VO2 thin films through engineering of the oxygen stoichiometry is critical to neuronal circuit operation. A large resistivity ratio between the high resistance state (HRS) and the low resistance state (LRS) is desirable because it enables a wide design space for the VO2 neuron. Through VO2 resistivity engineering, one can therefore modify the firing threshold, spiking pattern, and integration time, among other characteristics. Artificial neuron devices have been experimentally demonstrated to operate at a low voltage of ∼0.5 V.111
Although VO2 naturally exhibits the dynamical properties of neurons, VO2-based devices may be passive and thus incapable of amplifying the input. A simple VO2 neuron consisting of only passive elements (capacitor and resistor) has limitations: the output node of a neuron in a neural network is usually connected to a large number of fan-outs (e.g., 1-to-1000). Co-designing a VO2 element with a silicon transistor can provide the signal gain required for a large fan-out. For example, VO2 can be connected with a silicon MOSFET in a common-source amplifier configuration. The spiking neuron constructed in this way has been shown to implement the “winner takes all” max-pooling layers employed in image processing pipelines.112
The coupled oscillator array (COA) is an oscillatory network coupling multiple oscillator stages. The COA can be used as an efficient method to solve computationally intense problems in computer vision. Unlike conventional image processing performed with Boolean algorithms, the COA encodes information in analog quantities: image patterns can be encoded in the relative phase differences and frequency shifts of the oscillators. A COA can be constructed using conventional silicon-based technology, but an implementation using the native properties of organismic materials can offer better efficiency. VO2 relaxation oscillators have been studied both experimentally and by modeling.113–115 A coupled oscillator network with multiple VO2 oscillator stages has also been demonstrated to be effective for vision applications.116–118 Depending on the number of stages and the topology, there can be many configurations of an oscillatory network. One type of relaxation oscillator network has been made by capacitively coupling all the oscillator output nodes to a common output node,119 and it was shown that a network of six coupled oscillators is able to perform non-Boolean image processing functions, including saliency detection, color interpretation, and morphological operations. The authors demonstrated a factor of ten improvement in the power efficiency of the VO2 coupled oscillator network with respect to the CMOS implementation.119
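The phase-encoding idea behind the COA can be illustrated with a generic coupled-phase-oscillator (Kuramoto-type) abstraction rather than a circuit-level VO2 model: each oscillator's natural frequency is set by an input (e.g., a pixel intensity), and coupling pulls oscillators with similar inputs into phase lock, so the degree of phase coherence reports pattern similarity. The model, coupling constant, and input values below are illustrative assumptions for the sketch, not taken from the cited implementations.

```python
import math

# Kuramoto-type sketch of phase-based pattern comparison in a coupled
# oscillator array. Frequencies, coupling, and time steps are illustrative.

def settle(freqs, K=2.0, t_end=50.0, dt=0.01):
    """Integrate d(theta_i)/dt = w_i + (K/N) * sum_j sin(theta_j - theta_i)
    and return the time-averaged phase-coherence order parameter
    r = |sum_j exp(i*theta_j)| / N over the settled second half of the run
    (r ~ 1 means the array has phase-locked)."""
    n = len(freqs)
    theta = [0.1 * i for i in range(n)]   # slightly spread initial phases
    steps = int(t_end / dt)
    r_sum, r_cnt = 0.0, 0
    for step in range(steps):
        new = []
        for i in range(n):
            coupling = sum(math.sin(theta[j] - theta[i])
                           for j in range(n)) / n
            new.append(theta[i] + dt * (freqs[i] + K * coupling))
        theta = new
        if step >= steps // 2:            # average r after transients decay
            re = sum(math.cos(th) for th in theta) / n
            im = sum(math.sin(th) for th in theta) / n
            r_sum += math.hypot(re, im)
            r_cnt += 1
    return r_sum / r_cnt

# Similar inputs (nearby frequencies) phase-lock; dissimilar inputs do not,
# so the coherence r distinguishes matching from non-matching patterns.
r_match = settle([1.00, 1.02, 0.98, 1.01])
r_mismatch = settle([1.0, 3.5, 6.0, 8.5])
```

In hardware, the role of the sinusoidal coupling term is played by the capacitive (or resistive) links between oscillator output nodes, and the readout of r corresponds to measuring the signal at the common output node.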
The resistivity ratio between the HRS and LRS of VO2 in an artificial neuronal circuit usually ranges from 10 to 10^5. For this magnitude of resistivity change, the fastest electrical transient response occurs within a fraction of a nanosecond,78,120 which can be measured in two-terminal VO2 devices with channel lengths on the sub-100 nm scale. This transient response results from the interplay of device geometry, material composition, and other device parameters, and the full transient consists of three phases.120 Based on recent experimental and modeling studies in other material systems, researchers have also noted the importance of thermal effects during resistive switching in TaOx121 and NbO2.122 A reliable physical model for new materials is crucial to the construction of reliable neuromorphic circuitry. When thermal runaway is the origin of the phase transition, the circuit needs to follow certain design rules to protect the element from overheating.123 Considering the maturity of VO2 processing, it remains an ideal vehicle for research into new neuromorphic applications using strongly correlated materials. However, it is worth pointing out that the IMT of VO2 occurs at ∼340 K, which is significantly below the normal operating temperature of integrated circuits. This may pose fundamental limits on monolithic integration with CMOS; other materials with higher IMT temperatures, such as SmNiO3 (SNO) and NbO2, may become more relevant for applications where monolithic integration with conventional integrated circuits is needed.
B. Use of artificial neurons in synthetic neuroscience
An emerging target application for this class of IMT materials is synthetic neuroscience.79 Using the same neuronal configuration as shown in Fig. 7, a strongly correlated VO2 artificial neuron can naturally undergo an electrically driven insulator–metal transition akin to the excitable membrane of a biological neuron, as shown in Fig. 9(a). In biological neural systems, pathologically altered spike timing in a leaky membrane is linked to a number of degenerative diseases, such as neuromuscular disorders and motor neuron disease (MND), as illustrated in Fig. 10. For example, the myelin sheath can undergo electrical breakdown, which leads to a leaky membrane and weaker muscles. A healthy neuron subject to a constant input current will generate a periodic output spike (action potential), as shown in Figs. 9(b) and 9(c). In Figs. 9(d)–9(f), we show the solid-state analog of a healthy neuron. A neuron will fail to fire when the membrane conductance is significantly increased, and this increase can be mimicked in a solid-state circuit by electrical-stress-induced resistance degradation in VO2. Further research in this area can help create artificial systems and generate knowledge regarding the onset thresholds for brain disorders due to neuronal malfunction.79 Such circuits can serve as synthetic platforms to inform neuroscientists about the precise electrical properties needed for high-fidelity signal transmission through neuronal circuits.
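The healthy-vs-leaky contrast of Figs. 9 and 10 can be captured with a textbook leaky integrate-and-fire membrane model, dV/dt = (I − g_leak V)/C with a firing threshold V_th: increasing the leak conductance (the analog of a degraded myelin sheath, or of stress-degraded VO2 resistance) lowers the steady-state voltage I/g_leak below threshold, so the neuron falls silent. All parameter values are illustrative assumptions.

```python
# Leaky integrate-and-fire sketch of the healthy vs leaky neuron contrast:
# a leakier membrane (larger g_leak) may never integrate up to threshold.
# All parameter values are illustrative assumptions.

def fires(I_in, g_leak, C=1e-9, V_th=2.0, t_end=1e-3, dt=1e-7):
    """Integrate dV/dt = (I_in - g_leak*V)/C and report whether the
    membrane voltage reaches the firing threshold V_th."""
    V, t = 0.0, 0.0
    while t < t_end:
        V += dt * (I_in - g_leak * V) / C
        if V >= V_th:
            return True       # threshold reached: an action potential fires
        t += dt
    return False              # leak dominates: the neuron stays silent

# Healthy membrane: steady-state V = I/g = 5.0 V > V_th, so it fires.
healthy = fires(I_in=50e-6, g_leak=1e-5)
# Leaky membrane: steady-state V = 0.5 V < V_th, so it never fires.
leaky = fires(I_in=50e-6, g_leak=1e-4)
```

The same qualitative failure mode is what electrical-stress-induced resistance degradation produces in the VO2 circuit: a lower insulating-state resistance acts as an increased leak in parallel with the integrating capacitor.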
IV. CHARACTERIZATION OF THE SWITCHING DYNAMICS
Tunable electronic states under external stimuli are one principal mechanism enabling adaptive behavior in correlated semiconductors. The states can be modified by defect migration under electric fields, structural modifications due to Joule heating, and other processes. Imaging the material as it undergoes switching under operating conditions is therefore of importance. While in operando studies are quite challenging to perform, synchrotron beamlines are well positioned to take on this problem. In this section, we discuss a few examples from the literature on high-resolution characterization of oxides undergoing phase transitions due to external stimuli. Although the imaging of single point defects is difficult, changes in neuromorphic devices typically involve a collection of defects forming a channel from one electrode to another. In the case of oxygen vacancies in the transition metal oxides, the vacancies typically produce changes one can easily detect with synchrotron x-ray methods, especially if defect injection can be performed in operando. If so, one can directly correlate changes in the resistive state with changes in the local composition and structure.
Liu et al.18 reported a study on the memristive properties of WO3−δ, which is normally a wide-bandgap insulator but can be transformed into a metal by the injection of oxygen vacancies. The experimental setup is shown in Fig. 11(a), where synchrotron x-rays were focused with an x-ray zone plate to a ∼200 nm spot on a region of WO3 between two platinum electrodes. While the scattered x-rays were collected with a pixel array detector, an energy-resolving detector was used to measure changes in both the W fluorescence and the W oxidation state, with the latter providing information about changes in the oxygen composition of WO3−δ.
The results are shown in Fig. 12. The left column shows the W fluorescence (a), oxygen stoichiometry (b), and integrated intensity of the thickness-normalized 003 reflection (c) after the application of a voltage sufficient to produce the electroformed state. Liu et al. discovered that memristive behavior requires the initial WO3−δ to have the “proper” concentration of oxygen vacancies to start with; too few or too many lead to irreversible, non-switchable electrical behavior.18
As shown in Figs. 12(a)–12(c), the application of this field leads to considerable local disorder. The W fluorescence in Fig. 12(a) shows that after the application of >32 V, W ions migrate away from the central region of the exposed area and toward the electrodes. Simultaneously, the oxygen ions redistribute [Fig. 12(b)], diffusing toward the anode; the positively charged oxygen vacancies thus migrate toward the cathode. This leads to a loss of crystallinity in the region, as shown in Fig. 12(c).
The electrical results in Figs. 11(b) and 11(c) show that after the field is removed, the device is in the low resistance state (LRS), as shown by the red curve in Fig. 11(c). Figures 12(d)–12(f) show that the W fluorescence, oxygen stoichiometry, and 003 intensity maps have also changed, with the conducting channel comprising oxygen vacancies, as shown in Fig. 12(e). The encircled red region contains more oxygen than its surroundings and is correlated with greater 003 intensity [Fig. 12(f)], indicating that the degree of crystallinity correlates with the oxygen concentration.
Once the field is reversed [at +4 V in Fig. 11(c)], the device returns to the high resistance state (HRS). The corresponding structural and chemical changes can be observed in Figs. 12(g)–12(i). As depicted in Fig. 12(h), the conducting channel from the anode to the cathode is absent, thus yielding greater resistance. Again, the region with a larger oxygen concentration [the encircled red region in Fig. 12(h)] corresponds to the region with greater 003 intensity [Fig. 12(i)].
From microdiffraction scans carried out across the device (not shown), one even observes that the WO3−δ lattice expands beneath the Pt electrodes in the LRS. In the HRS, most of the lattice expansion is gone, suggesting that much of the oxygen has been reincorporated.
One can also explore probes in the soft x-ray region of the spectrum (<1500 eV). Soft x-ray spectroscopies such as x-ray absorption spectroscopy (XAS) are advantageous in that they give clear insights into the electronic and magnetic structure of the transition metal and into how these are connected within the transition metal–oxygen framework.124 This is particularly useful for the transition metal oxides, where the L-edges provide not only a measure of valence but also a local picture of how the d-states are organized in specific orbitals. This can be especially important in the area of electrochemical processes, where valence and orbital states are central to controlling chemical potentials.125,126 While soft XAS has been applied to a variety of memristor systems,127–129 here we give an example for the rare-earth nickelate RNiO3 thin films, as shown in Fig. 13.130
The high-temperature metallic phase of SmNiO3 (SNO) converts into a charge-ordered insulating phase just above room temperature, and it has been shown that this is a viable platform for biologically inspired sensing.36 Using either Li or H doping, the underlying properties of SNO can be reversibly and efficiently manipulated.131–133 To go beyond the changes seen in the electrical response, XAS was utilized to identify fingerprint changes due to doping. In this case, the starting Ni valence is near 3+, consistent with the expected value,134 and evolves to 2+ as the charge of H+ is compensated, resulting in a shift of the Ni L3-edge to lower energy (see the left panel in Fig. 13). The opposite change is seen at the oxygen K-edge, which directly reflects how the unoccupied density of states changes with doping; this is a powerful method for tracking changes in the electronic structure that can be compared directly with electronic structure calculations.135,136 While not directly relevant to the case of SNO under ambient conditions, polarization-dependent XAS can be extended to probe element-resolved magnetic properties as well.124 Although this is not discussed here, it can be very relevant to exploring the control of neuromorphic systems based on spintronics.137,138
These measurements, however, are all spatially integrated and can be extended to microscopic probes to gain further information on the inhomogeneous nature of the phase changes in many of these materials. Recent work with soft transmission x-ray microscopy (TXM)139–141 and photoelectron emission microscopy (PEEM)142 has shown the ability to track changes in working devices as they switch between different electronic configurations. In combination with in situ electrical measurements, this is a powerful way to understand device operation. In the future, one can look toward moving these measurements into the time domain to directly follow time-resolved dynamics, which is currently explored only through transport measurements. One should then be able to track electronic, magnetic, and structural degrees of freedom during changes in the device response to understand what fundamental limitations exist.
V. COMPUTATIONAL MODELING OF STATIC AND DYNAMICAL BEHAVIOR
A. Multiscale modeling of neuromorphic behavior
To model the neural functionalities described above, a variety of computational techniques spanning a broad range of length and time scales have been utilized. We present a brief overview of some of the methods commonly employed to model synaptic behavior, as summarized in Fig. 14.
1. Electronic structure and atomistic scale modeling
At the atomistic scale, molecular dynamics (MD) simulations are commonly employed to study the dynamical evolution of neuromorphic systems, i.e., the switching between the high resistance and low resistance states. There are various flavors of molecular dynamics: although ab initio MD (AIMD) captures the underlying physical processes, it is limited by the cost of the dense linear algebra underlying density functional theory (DFT), which makes scaling beyond a few hundred atoms a major hurdle.143 While it is not possible to simulate the entire switching process with these simulations, their high accuracy is often advantageous in addressing individual problems such as computing the electronic structure in the ON and OFF states or ionic migration barriers.144,145 In general, these models have been used to understand the various stages of the switching phenomena, including the possible atomistic configurations that lead to the LRS and HRS, the nucleation of stable phases and their subsequent growth, the formation energies of redox reactions, and the energy barriers for ionic migration.146–148 Most studies have focused on understanding synaptic functionalities at the atomistic scale; the general idea is to understand the sudden collapse of resistance under the action of an electrical stimulus and the spontaneous return to the initial state. This is observed in systems that undergo metal–insulator transitions, such as Mott insulators. Ab initio calculations for such systems (e.g., VOx) are aimed at obtaining the non-equilibrium electronic structure using state-of-the-art DFT+U or by combining density functional theory with dynamical mean-field theory (DFT+DMFT).149,150 In their current implementations, these approaches are able to provide a unified picture of the Mott–Hubbard transition and the coupled periodic lattice distortion.
Classical MD, within the Born–Oppenheimer framework, allows simulations of phenomena spanning nanoseconds in time and tens of nanometers in length, although it is often limited by the accuracy of the interatomic potentials.151,152 The lower computational cost enables larger ensembles to be simulated, and such simulations have been used to model the nanosecond switching behavior arising from the formation of filaments or from beyond-filamentary mechanisms (e.g., based on oxygen stoichiometry changes in metal-oxide-based resistive devices).153,154
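As a minimal sketch of the classical MD machinery referenced here, the velocity-Verlet integrator below evolves a single particle in an illustrative harmonic "bond" potential; a real oxide simulation would substitute a fitted interatomic potential and many thousands of atoms:

```python
import numpy as np

def velocity_verlet(x, v, force, m=1.0, dt=1e-3, n_steps=10000):
    """Integrate Newton's equations with the velocity-Verlet scheme,
    the standard time stepper in classical MD codes."""
    f = force(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * (f / m) * dt**2   # position update
        f_new = force(x)
        v = v + 0.5 * (f + f_new) / m * dt       # velocity update with averaged force
        f = f_new
    return x, v

# Illustrative harmonic 'bond' with spring constant k = 1; not a real force field.
k = 1.0
force = lambda x: -k * x
x, v = velocity_verlet(np.array([1.0]), np.array([0.0]), force)
# Total energy should be conserved close to its initial value of 0.5
energy = 0.5 * v[0]**2 + 0.5 * k * x[0]**2
```

The near-exact energy conservation of this symplectic integrator is what makes long nanosecond trajectories feasible.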
2. Mesoscale modeling
Kinetic Monte Carlo (KMC) simulations have been used to simulate mesoscopic behavior and to link the atomistic simulations with continuum methods.155–157 KMC simulations are stochastic in nature and model the physical processes based on a predefined rate catalog. For each physical process, one computes the occurrence probability from knowledge of the activation barriers and attempt frequencies. The system is then propagated by selecting the process that occurs using a random number, and subsequently the selection probabilities are updated according to the new system configuration.158,159 Owing to its stochastic nature, the KMC method is well suited to studying the variability of resistive switching and its failure mechanisms.
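The rate-catalog selection step described above can be sketched as follows; the two-process catalog is purely hypothetical, standing in for rates computed from activation barriers and attempt frequencies:

```python
import math
import random

def kmc_step(rate_catalog, rng):
    """One kinetic Monte Carlo step.  rate_catalog maps each physical
    process to its rate, e.g., nu * exp(-E_a / kT) from an attempt
    frequency nu and activation barrier E_a.  Returns the process
    selected in proportion to its rate and the elapsed time."""
    total_rate = sum(rate_catalog.values())
    r = rng.random() * total_rate
    acc = 0.0
    for process, rate in rate_catalog.items():
        acc += rate
        if r < acc:
            break
    # Residence time is exponentially distributed with mean 1 / total_rate
    dt = -math.log(1.0 - rng.random()) / total_rate
    return process, dt

# Hypothetical rate catalog for a vacancy hop (illustrative rates, 1/s)
rng = random.Random(0)
rates = {"hop_left": 1.0, "hop_right": 3.0}
counts = {"hop_left": 0, "hop_right": 0}
t = 0.0
for _ in range(10000):
    process, dt = kmc_step(rates, rng)
    counts[process] += 1
    t += dt
# "hop_right" is selected roughly three times as often as "hop_left"
```

In a full simulation, the rate catalog would be rebuilt after each step to reflect the new defect configuration.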
3. Continuum scale models
At the continuum scale, one typically solves the partial differential equations that describe the behavior of the entire device.160,161 The electrical behavior is modeled using the Poisson equation coupled to the drift–diffusion equations of the charge carriers and/or the current continuity equation. Heat transfer equations with the electrical power as a source term are used to model the accompanying local Joule heating effects. These partial differential equations are solved by discretizing the computational domain using finite element, finite difference, or finite volume methods. Continuum models rely on assumed material parameters to simulate the complete device behavior and do not include information about the actual atomistic configurations or their changes at the nanoscale. For instance, the numbers of electronic and ionic carriers are described by macroscopic continuous variables such as number densities or concentrations. This allows one to understand the effect of physical or device parameters on, for example, the switching time and switching voltage. While the details of the nanoscopic processes are not included in such models, their key advantage is that they allow direct comparison of model predictions with experimental observations.162
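A minimal sketch of such a continuum treatment, here just the 1D heat equation with a Joule-heating source term and electrode boundary nodes clamped at ambient temperature; all parameter values are illustrative rather than taken from a specific device:

```python
import numpy as np

def joule_heating_1d(n=51, length=1e-6, kappa=5.0, rho_c=3e6,
                     power_density=1e15, t_ambient=300.0,
                     dt=1e-12, n_steps=20000):
    """Explicit finite-difference solve of the 1D heat equation with a
    Joule-heating source localized in the middle of the channel:
        rho*c * dT/dt = kappa * d2T/dx2 + P(x)
    Boundary nodes are clamped at ambient temperature (the electrodes
    act as heat sinks)."""
    dx = length / (n - 1)
    T = np.full(n, t_ambient)
    P = np.zeros(n)
    P[n // 3: 2 * n // 3] = power_density   # heated filament region
    for _ in range(n_steps):
        lap = np.zeros(n)
        lap[1:-1] = (T[2:] - 2 * T[1:-1] + T[:-2]) / dx**2
        T[1:-1] += dt * (kappa * lap[1:-1] + P[1:-1]) / rho_c
        T[0] = T[-1] = t_ambient            # isothermal electrode contacts
    return T

T = joule_heating_1d()
# the filament region warms above ambient while the contacts stay at 300 K
```

The explicit time step is kept well below the stability limit dx²/(2·κ/ρc); implicit or finite element discretizations relax this constraint in production solvers.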
B. DFT and beyond DFT methods for strongly correlated materials
The perovskite rare-earth nickelates display a metal-to-insulator transition driven by the strong interplay of structural changes and electronic correlation effects.149 While DFT allows us to compute total energies with high fidelity, it performs poorly when predicting the properties of strongly correlated materials. Predictions of exotic phenomena such as the “site-selective Mott transition” in rare-earth nickelates therefore necessitate DFT+DMFT.149 Previous work by Park and co-workers suggests that a subtle Ni–O bond disproportionation plays an important role: Ni d electrons on the sublattice with longer Ni–O bonds become localized, as in a strongly correlated Mott insulator, while Ni d electrons on the sublattice with shorter Ni–O bonds become strongly hybridized with the surrounding O p electrons.163,164 Conventional DFT methods cannot capture the energetics of this Ni–O bond order due to their poor description of the correlation energy. These subtle changes of the Ni–O bond lengths in nickelates are triggered by the size of the A-site cation, strain effects, temperature, negative pressure, etc., and have been successfully described using total energy calculations within DFT+DMFT (see Fig. 15).165 As a result, the strongly correlated perovskite nickelates display a rather complex interplay among their spin, charge, orbital, and lattice degrees of freedom.166,167
This strongly correlated physics and the strong coupling of electronic and vibrational degrees of freedom open new opportunities to modify electronic properties through chemical doping, defect engineering (e.g., oxygen vacancies), and strain engineering of thin films.167 For example, hydrogen doping in nickelates leads to an insulating phase and a ten-order-of-magnitude change in resistivity.168 Going forward, improved understanding and control of such phenomena require a quantitative, unified description of the electronic and lattice degrees of freedom across their characteristic time scales, from the femtosecond onset of electron correlations to the nanosecond scale associated with ionic diffusion, while also accounting for the strongly correlated nature of the Ni d electrons.169
Gaining a fundamental understanding of the complex interplay between dynamical events, charge transfer, structural transitions (e.g., periodic lattice distortions), and electronic phase changes (e.g., MIT) over ultrafast timescales requires high-fidelity dynamical simulations. To investigate such processes over a few nanometers, ab initio molecular dynamics (AIMD) simulations using DFT+U have been successfully employed;131,168,170 in this technique, electron localization is treated using Hubbard U corrections. AIMD simulations can, for instance, help us understand the dynamic interactions between water and SmNiO3, the facile transport of protons within electron-doped SmNiO3, and the structural distortions and lattice strain induced by electron doping [Fig. 16(a)].168 Such AIMD simulations can be used to investigate the impact of (a) the A-site cation, (b) the concentration of dopants or defects, and (c) strain on ion-transport pathways, defect coalescence, and local structural changes induced by dopants/defects in rare-earth nickelates. Climbing-image nudged elastic band calculations within the framework of DFT are typically used to gain insights into the kinetic barriers [Fig. 16(b)] associated with the key ion-transport pathways identified by AIMD.171 Additionally, the atomic structures obtained from AIMD simulations can be used as input for DMFT calculations to identify accurate correlations between structural variations, electronic structures, and charge transfer. Combining DMFT energies (computed as a function of atomic positions) with molecular dynamics to treat the ions and electrons simultaneously has remained a challenge. In the future, successful combinations will allow us to precisely understand both the dynamics of electronic structure changes in the doped nickelates and the proton-induced structural transition, along with the associated electric-field effects that drive phase transitions in such quantum materials.
VI. ORGANOMETALLIC PEROVSKITES FOR NEUROMORPHIC COMPUTING
Although the basic science of electron correlations in organometallic halide perovskites (OHPs) and the role of correlations in electrical and optical properties are still under intensive investigation,172,173 fingerprints of electron correlations (e.g., spectral weight transfer) have been observed in OHPs,173 which may enable the use of OHPs in applications similar to those of the strongly correlated oxides. Transition metal doping may further open up approaches to engineer strong correlations. Researchers have already discussed the impact of slow transient capacitive currents, trapping and detrapping processes, ion migration, and ferroelectric polarization on the I–V hysteresis behavior of OHP-based solar cells.174 In particular, ion migration in OHPs can be important to the I–V curve hysteresis.174 As the calculated migration energy of charged vacancies in OHPs is much lower than that of native defects in the metal oxides,175,176 OHPs can be excellent candidates for resistive switching (RS) memories and logic devices. From the device viewpoint, the basic operating principle of OHP-based memory is analogous to that of conventional flash memory.177 The device switches from the HRS (analogous to the un-programmed state in flash memory) to the LRS (analogous to the programmed state) when a write voltage is applied; the voltage at which this occurs is called the SET voltage, and the process is known as the SET process. When an erase voltage, known as the RESET voltage, is applied, the device switches from the LRS back to the HRS, which is termed the RESET process. Yoo et al. demonstrated OHP-based RS memory by taking advantage of its hysteretic I–V loop.178 As shown in Fig. 17(a), by employing a metal-insulator-metal (MIM) device configuration, they demonstrated RS characteristics in a simple, low-temperature-processable device with the architecture Au/CH3NH3PbI3−xClx/FTO (fluorine-doped tin oxide) and typical bipolar RS behavior. A low SET voltage of 0.8 V and a RESET voltage of −0.6 V, with an endurance of >100 cycles and a retention time of 10⁴ s, were demonstrated. Gu and Lee demonstrated flexible nonvolatile memory devices using CH3NH3PbI3 OHP as the RS layer on plastic substrates [Fig. 17(b)].179 The flexible perovskite ReRAM displayed long data retention over 10⁴ s, a low operation voltage of 0.7 V, endurance over 400 cycles, and good mechanical flexibility. Choi et al. further improved the performance of OHP ReRAM.148 They demonstrated a Pt/CH3NH3PbI3/Ag device [Fig. 17(c)] showing ON/OFF ratios greater than 10⁶, SET voltages less than 0.15 V, and multilevel resistance states. To expand the structural design of OHP ReRAM, various strategies have been attempted. Xiao et al. reported OHP-based memory using both a vertical device configuration of Au/MAPbI3/PEDOT:PSS/ITO and a lateral device configuration of Au/MAPbI3/Au [Fig. 17(d)].180 The vertical device displayed large I–V hysteresis, and an ON/OFF ratio >10³ at an applied voltage of 1 V was demonstrated. Hwang and Lee reported high-density memory cells with nanoscale dimensions for each device.181 They utilized a sequential vapor deposition technique to fabricate CH3NH3PbI3-based nanoscale ReRAM and designed a cross-point array structure to demonstrate the feasibility of high memory density. The resultant ReRAM showed a fast switching speed of 200 ns, good endurance over 500 cycles, low operation voltage, and long data retention over 10⁵ s. Table I compares the device performance of different metal oxide and OHP RS materials.
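The SET/RESET operation described above can be captured with a toy bipolar switching model; the SET (0.8 V) and RESET (−0.6 V) thresholds follow the values reported by Yoo et al., while the HRS/LRS resistances are illustrative:

```python
import numpy as np

def bipolar_rs_sweep(voltages, v_set=0.8, v_reset=-0.6,
                     r_hrs=1e6, r_lrs=1e3):
    """Toy bipolar resistive switch: the device starts in the high
    resistance state (HRS), switches to the LRS once V >= V_SET, and
    back to the HRS once V <= V_RESET.  Returns the current at each
    point of the sweep."""
    r = r_hrs
    currents = []
    for v in voltages:
        if v >= v_set:
            r = r_lrs        # SET: HRS -> LRS ("programming")
        elif v <= v_reset:
            r = r_hrs        # RESET: LRS -> HRS ("erasing")
        currents.append(v / r)
    return currents

# Triangular sweep 0 -> +1 V -> -1 V -> 0 traces out the hysteretic I-V loop
sweep = np.concatenate([np.linspace(0, 1, 50), np.linspace(1, -1, 100),
                        np.linspace(-1, 0, 50)])
i = bipolar_rs_sweep(sweep)
# at ~0.5 V the up-sweep current (HRS) is far smaller than the down-sweep current (LRS)
```

Real OHP devices add filament stochasticity and state relaxation on top of this ideal threshold behavior.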
In comparison to metal oxides, OHPs offer some advantages, especially in synthesis, as they rely on low-cost solution processing. Nevertheless, most 3D OHPs suffer from poor stability, and an uncontrollable conducting pathway in the 3D perovskite can lead to poor retention and endurance. Two-dimensional (2D) OHPs have gained attention recently due to their better stability and anisotropic properties. Figures 18(a) and 18(b) compare the crystal structures of 2D and 3D OHP materials. In comparison to 3D OHPs, the 2D-layered perovskites exhibit hierarchical layered structures that reduce the random migration of charged defects. This is expected to be beneficial for the formation and rupture of conductive filaments.182 Seo et al. investigated the RS behavior of (C4H9NH3)2(CH3NH3)n−1PbnI3n+1 with n = 1–3 and n = ∞ and found that the 2D OHP material was more reliable and stable during switching than its quasi-2D or 3D counterparts.182 The 2D features of these RS materials are key to the device performance, which requires the whole film to have a specific orientation. In this regard, bottom-up synthesis from single-crystalline 2D OHPs is very promising, as single crystals offer advantages such as fewer disordered states and the absence of grain boundaries compared to their polycrystalline counterparts.183 Recently, a method of synthesizing inch-scale 2D OHP single-crystal membranes at the water–air interface has been developed.184 As shown in Fig. 18(c), the insulating organic spacer separates the conductive [PbI4] octahedra, forming a natural multiple-quantum-well (MQW) structure. The probability of charge tunneling through the organic layer can be adjusted by changing the chemical composition of the layer. As a result, quantum tunneling in the out-of-plane direction offers a large operational window for constructing low-operating-current ReRAM. Tian et al. conceptualized a memory device using 2D organometallic perovskite single crystals.185 Filaments with ∼20 nm diameters were visualized, and the corresponding RS memories exhibited a low program current of ∼10 pA, which is an order of magnitude lower than that of conventional memories. However, the stability of these materials, along with the reliability of their switching characteristics, needs to be improved significantly to have any real impact in electronic devices; research efforts to improve the devices, such as by oxide capping, are therefore highly desired.186 Given that switches in computers must toggle states at least several tens of millions of cycles over their lifetime, addressing the stability of OHPs is important for future applications in computing.
TABLE I. Comparison of device performance between different metal oxide and OHP RS materials.

| RS materials | | Deposition technique | Write/erase voltage (V) | ON/OFF ratio | Retention (s) | Endurance (cycles) | References |
|---|---|---|---|---|---|---|---|
| MOs | HfO2 | Atomic layer deposition | +1/−1.5 | >10² | 10⁴ | 10¹⁰ | 187 |
| | ZnO | Atomic layer deposition | +1/−1 | >10² | 10⁴ | 10² | 188 |
| | CH3NH3PbI3 (400 nm) | Solution-processed | +0.13/−0.13 | >10⁶ | 350 | … | 148 |
| | CH3NH3PbI3 (1015 nm) | Solution-processed | +0.7/−1.0 | >10³ | 750 | … | 180 |
| 2D OMP | BA2PbI4 | Spin-coating | +0.6/−0.6 | 10⁷ | 10³ | 250 | 182 |
VII. ADAPTIVE PLASTICITY INSPIRED BY CORRELATED OXIDES
The adaptivity of artificial intelligence systems to continuous streams of input information is fundamental to the construction of autonomous and computationally viable learning systems. Continual or lifelong learning is an ability that humans and animals possess naturally; it helps each species learn and acquire skills throughout their lifespan and even transfer knowledge to the next generation. Lifelong learning in machine learning or neural network models (state-of-the-art computational systems) remains a pressing problem, as they are prone to catastrophic forgetting or catastrophic interference. Catastrophic forgetting refers to the issue whereby training a neural network with new information interferes with previously learned knowledge. In the present era of the Internet of Things (IoT), we are surrounded by an explosion of digital data, and computing platforms across the spectrum (from mobile phones and wearable devices to massive data servers) will increasingly need to acquire, process, and analyze new data. Computational systems must then continually extract structures, patterns, and meaning from raw, unstructured datasets in real time in unsupervised, dynamically changing environments. Catastrophic forgetting, however, restricts the energy-efficient implementation of intelligent systems in such continual information-streaming environments. Today, we rely on expensive retraining procedures, in which old and new data are presented together to an intelligent system during training, to address catastrophic interference.
While advances in deep learning and other machine learning techniques have led to systems able to match or even surpass human performance in several tasks (e.g., recognition, analytics, and inference), the learning underlying such tasks is still static. That is, the learning methods use data points from past experience to build a predictor (a classifier, regression model, or recurrent time series model) for processing future behavior. The predictor does not adjust itself (self-correct or adapt) as new events continually occur over a lifetime. To build computational models capable of learning in a stable-plastic manner and adapting to a flood of new data, we ask the following questions: What changes in the brain when it encounters new information to learn? And once something is learned, how is that information retained? There is growing biological evidence that long-term potentiation, which is involved in maintaining memory, is not permanent and eventually decays, leading to “forgetting.”194,195 At a basic level, learning in the brain involves making synaptic connections and reinforcing them with repeated exposure to a given input stimulus. However, due to limited capacity, the brain gradually forgets already learned connections in order to accommodate new data. Based on this, the authors in Refs. 168 and 196 introduced a novel, adaptive “Learning to Forget” scheme called adaptive synaptic plasticity (ASP) that forgets (or weakens) already learned connections to make room for new information and thereby adapt to continuously changing inputs.
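A minimal sketch of this "learning to forget" idea (not the exact ASP rule of Refs. 168 and 196): every synapse slowly decays, while synapses that keep receiving their input pattern are reinforced, so unused connections fade and free capacity for new inputs:

```python
import numpy as np

def asp_weight_update(w, pre_active, a_plus=0.5, decay=0.05, w_max=1.0):
    """One 'learning to forget'-style update (illustrative only).

    All weights leak toward zero (slow forgetting); synapses whose
    pre-neuron is currently active are pushed toward w_max (learning).
    """
    w = w * (1.0 - decay)                    # slow decay of every synapse
    w = np.where(pre_active,
                 np.minimum(w + a_plus * (w_max - w), w_max),  # reinforce active inputs
                 w)
    return w

w = np.array([0.8, 0.8])   # two synapses, both initially well learned
for _ in range(100):
    # Only synapse 0 keeps receiving its input pattern; synapse 1 does not
    w = asp_weight_update(w, pre_active=np.array([True, False]))
# synapse 0 stays strong; synapse 1 has been "forgotten"
```

The reinforced weight settles at a fixed point near w_max, while the idle weight decays geometrically toward zero, mimicking the gradual reuse of neurons for new patterns.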
Driven by brain-like asynchronous, event-based computation, the computational effort in spiking neural networks (SNNs) focuses on the currently active parts of the network, effectively saving power on the remaining idle parts. One can therefore achieve orders of magnitude lower power consumption compared to deep learning counterparts. In addition, SNNs are equipped with self-learning capabilities such as Spike Timing Dependent Plasticity (STDP),197 which further makes them desirable for adaptive, on-chip implementations. STDP is an associative form of synaptic plasticity that uses the temporal correlation between the spiking patterns of pre- and post-neuron pairs to perform weight updates. This timing-based correlation rule is in stark contrast to batch-mode learning such as Stochastic Gradient Descent (SGD).198 STDP is a local learning rule that uses local feedback from neighboring neurons to modulate the synaptic strength and hence can be totally unsupervised. In contrast, SGD uses a global error/loss function to modulate the weights of a network, which requires a supervisory class label/ground truth to create the error signals. From an optimization viewpoint, global update-based learning constrains a model's capacity to adapt to new events, since the optimization landscape (or decision boundary) is governed by the class labels/inputs shown during training. An unsupervised model of learning, on the other hand, broadens the boundary points, thereby generalizing across a larger optimization landscape and allowing a predictor to self-learn or adapt in response to dynamically changing inputs. The authors in Ref. 199 proposed an elastic weight consolidation (EWC) algorithm for artificial neural networks (ANNs), which slows down learning on certain weights based on how important they are to previously seen tasks.
They show how EWC can be used in supervised learning and reinforcement learning problems in the ANN domain to train several tasks sequentially without forgetting older ones, in marked contrast to the ASP approach of forgetting previously learned tasks to perform sequential learning. It is worth mentioning here that ASP enables an SNN to dynamically reallocate its resources as new data are presented to the network; the data or task presentation need not follow a particular order. EWC, on the other hand, performs structured sequential task learning, in which the order of the tasks shown affects the final accuracy or performance. It remains to be seen whether ASP can be modified to perform EWC-like behavior, in which the plasticity of synapses vital to previously learned tasks is reduced, in order to scale up to larger classification tasks.
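The pair-based exponential STDP rule contrasted with SGD above can be sketched as follows, with illustrative amplitudes and time constant:

```python
import math

def stdp_dw(delta_t, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Pair-based exponential STDP weight update.

    delta_t = t_post - t_pre (ms): pre-before-post (delta_t > 0)
    potentiates the synapse; post-before-pre depresses it.  The update
    magnitude falls off exponentially with the spike-time difference.
    """
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau)    # potentiation (LTP)
    else:
        return -a_minus * math.exp(delta_t / tau)   # depression (LTD)
```

Note that the update depends only on locally available spike times, which is what makes STDP amenable to fully unsupervised, on-chip learning.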
From the perspective of hardware implementation, the power consumption of large-scale SNNs presents a daunting challenge. This issue is further aggravated when taking into account the physical realization of advanced on-chip learning algorithms. The computing primitives and architectures needed for the majority of neural computing paradigms are very different from von Neumann architectures. Despite this mismatch, SNN algorithms are commonly implemented in software running on multi-core processors or on analog, mixed-signal, or digital CMOS hardware, leading to large gaps in efficiency. There is thus a pressing need to explore new technologies that simultaneously meet the low-power and adaptive-plasticity needs of future neuromorphic computing platforms, and bridging the efficiency gap necessitates the co-exploration of devices, circuits, and architectures that are better matched to the model of computation. The computational units that constitute the proposed neural computing platforms are functionally quite different from standard CMOS transistors (which are switches). This raises the question of whether there are emerging devices that naturally match the functionality required for neural computing. Post-CMOS technologies (such as spin devices,200 memristors,15 and resistive RAMs201) have shown promise in mitigating the limitations of CMOS due to their inherent neuro-mimetic behavior.
Through cation doping and band-filling control in correlated oxides such as perovskite nickelates, localization of the itinerant electrons by strong electron repulsion can be achieved. A perovskite device can be doped by hydrogen with the help of a Pt catalyst and shows habituation and relaxation (forgetting) of the doping-induced resistance change (Fig. 19). Based on the resistance kinetics, which emulate ASP-like forgetting behavior, SNN strategies with ASP-like properties were proposed. ASP was evaluated for unsupervised digit recognition in a dynamic environment, wherein the training instances of digits “0” through “2” were presented sequentially with no reinforcement, i.e., no training image was re-shown to the network. Figure 20 depicts the representations learned in a fixed-size SNN (with nine excitatory neurons) with traditional exponential STDP learning compared with ASP learning. As the network is shown digit 1, the ASP-learned SNN forgets the already learned connections for 0 and learns the new input. Additionally, ASP enables the SNN to learn more stably, as some neurons corresponding to the older pattern 0 are retained while learning 1. When the last digit, 2, is presented to the ASP-learned SNN, the connections to the excitatory neurons that have learned digit 0 are forgotten in order to learn 2, while the connections (or neurons) corresponding to the recently learned digit 1 remain intact. This is consistent with the significance- and recency-driven forgetting mechanism (incorporated in the decay phase of ASP), wherein older digits are forgotten to learn new digits. Note that, for standard STDP-learned SNNs, the representations overlap, thereby rendering the network useless. For further details and a discussion of other analyses conducted with ASP, we direct the reader to Refs. 168 and 196.
We would like to point out that, to prevent catastrophic forgetting with STDP learning in SNNs (or standard SGD training in deep learning models), the network must be presented with the already learned old information along with the new data whenever it has to learn a new class. However, storing all old data samples for retraining is a major drawback for implementing on-line, real-time learning. ASP offers a promising solution for real-time dynamic learning without this retraining procedure. To establish the efficacy of ASP on complex recognition problems, larger or deeper networks need to be investigated. The large-scale implementation of SNNs poses many challenges, the major one being the inefficiency of local learning in training all layers in a cohesive manner. Today, learning efficiently and appropriately is key to achieving a functional cognitive system. While deep learning has no doubt paved the way for learning appropriately, the efficiency and computing power associated with learning remain a concern. To that effect, the “learning to forget” behavior realized with ASP (inspired by biological principles) and its future implications for large-scale networks provide a potentially exciting and promising direction toward improved and efficient representation learning within the emerging computing paradigm of SNNs. We also note that several proposals have been made in the artificial neural network (ANN) domain to address catastrophic forgetting with continual learning; most optimize the SGD mechanism with extra regularization terms to improve the continual learning ability.202
As shown in Fig. 21, intelligence and cognitive capabilities generally scale with the complexity of the nervous system, which includes the neurons, synapses, and various circuits responsible for learning, memory, and the sensory interfaces needed to interact effectively with the surrounding environment. However, intelligent behavior is clearly also present at the primitive level, such as in amoebas, slime molds, and sea slugs. Hence, aspects of learning and intelligence, while generally observed in complex neural systems, can in principle be transferred to much simpler systems. This offers a reductionist approach to break down the problem at hand and identify model systems for emulation. Intelligence can emerge if structural and chemical dynamics respond to external stimuli in a way that helps the organism function better in a given environment, e.g., predicting periodic events or habituating to previous stimuli to better adapt in dynamic, i.e., continuously changing, environments. In this paper, we have discussed strongly correlated materials that show this essential characteristic for intelligence, featuring the electron-interaction-induced insulator–metal transition. Artificial neurons can be fabricated, and degradation-induced defects can be utilized to simulate neural disorders in a malfunctioning spiking neural network. In addition, correlated systems can show classical conditioning, habituation, and dynamic relaxation, which can be integrated into modern adaptive synaptic plasticity algorithm designs.
Meanwhile, further studies are needed to understand the physics of stimuli-perturbed strongly correlated systems. While traditional work on strongly correlated systems focuses on a relatively static picture, i.e., the DC limit, adaptive behavior requires a thorough investigation of electron–electron interactions in a dynamic environment. The "dynamic environment" could mean stimuli that change rapidly with time or multiple stimuli presented to the device in no particular order. For this purpose, advances in both experimental and theoretical knowledge are required, which will not only enable better control of the insulator–metal transition, the transition pathways, and the stability of transient states but also provide guidance in selecting the right materials for a given function or circuit design. In other words, a complete picture of the energy landscape of order parameter evolution and its competing mechanisms is desirable. The traditional approach to designing volatile vs non-volatile memory is insufficient to mimic the brain. Not all information presented to the brain is important, and the brain has a natural way of identifying critical information and retaining it, while less relevant information is stored only on very short time scales. In other words, memory states in synthetic circuits should be allowed to decay depending on their relevance in certain implementations of neural networks. It is therefore important to understand the fundamental mechanisms that impart memory to different electronic or structural phases in correlated systems and how their dynamics can be controlled or manipulated. Experimental probes that can track the real-time evolution of the electronic or crystal phase under stimulus are essential to form a microscopic understanding of the overall response.
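The notion of relevance-dependent retention can be made concrete with a toy model (our own construction, purely illustrative): let each memory state decay exponentially, but with a time constant that lengthens each time the state is reinforced, so rarely seen inputs stay volatile while repeatedly reinforced ones become quasi non-volatile.

```python
import math

# Toy model (our own construction, illustrative only) of relevance-dependent
# memory decay: the retention time constant grows with the number of times a
# state has been reinforced, interpolating between volatile and non-volatile.

def retention_time(n_reinforcements, tau0=1.0, gain=2.0):
    """Base time constant tau0 (arbitrary units), lengthened multiplicatively
    with each reinforcement; tau0 and gain are assumed parameters."""
    return tau0 * gain ** n_reinforcements

def state_after(t, n_reinforcements, s0=1.0):
    """Remaining memory amplitude after time t of undisturbed decay."""
    return s0 * math.exp(-t / retention_time(n_reinforcements))

# A rarely seen input fades quickly; a repeatedly reinforced one persists.
fleeting = state_after(t=5.0, n_reinforcements=0)   # tau = 1
durable  = state_after(t=5.0, n_reinforcements=5)   # tau = 32
```

In a correlated-oxide device, the analog of `n_reinforcements` would be the stimulus history that stabilizes a metastable phase; mapping such history-dependent volatility microscopically is exactly the open question raised above.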
The spatially resolved evolution of an electronic phase and its propagation front under a stimulus, such as an electric field or a chemical concentration gradient, is needed to formulate percolation-based transport models for conductance and capacitance. From a theoretical perspective, the ability to predict transport gaps while incorporating electric fields and other chemical potentials into first-principles models is essential. Beyond understanding the basic principles, collaboration between physicists, materials scientists, and biologists is necessary to explore how these physical observations can better mimic learning. Much as flocking in birds and schooling in fish have inspired the design of swarm intelligence algorithms, perhaps the collective behavior that emerges in correlated materials under external stimuli can lead to better machine intelligence. Beyond the fundamental science, efforts are required to scale up organismic-matter-based devices to chip-scale circuits and to integrate these adaptive materials with workhorse silicon CMOS technology.
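The simplest version of the percolation picture asks at what metallic phase fraction a random film first supports a spanning metallic path. The sketch below is a toy site-percolation model, not a quantitative transport calculation; grid size, trial count, and the left-to-right spanning criterion are assumptions for illustration.

```python
import random

# Toy site-percolation sketch (illustrative only): as the metallic phase
# fraction p grows, check whether metallic sites form a cluster spanning the
# film from left to right, which is the simplest picture behind
# percolation-based conductance models of the insulator-metal transition.

def spans(grid):
    """Depth-first search from the left edge through metallic (True) sites
    via 4-neighbors; returns True if the right edge is reached."""
    n = len(grid)
    seen = set((r, 0) for r in range(n) if grid[r][0])
    frontier = list(seen)
    while frontier:
        r, c = frontier.pop()
        if c == n - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n and grid[rr][cc] and (rr, cc) not in seen:
                seen.add((rr, cc))
                frontier.append((rr, cc))
    return False

def spanning_probability(p, n=20, trials=100, seed=0):
    """Fraction of random n x n samples at metallic fraction p that span."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        grid = [[rng.random() < p for _ in range(n)] for _ in range(n)]
        hits += spans(grid)
    return hits / trials
```

Well below the 2D site-percolation threshold (about 0.59 for a square lattice), essentially no sample spans; well above it, essentially all do. Spatially resolved imaging of the real phase-front geometry is what would replace the random-site assumption with a physically grounded one.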
While exploring organismic features in correlated materials for fast and power-efficient non-von Neumann machines is one promising direction for future research, other areas of scholarship should not be overlooked. The adaptive behavior found in strongly correlated materials shares roots with other features of intelligence in nature, from simple slime molds to more complex nervous systems. Whether we can use these materials as a complementary tool to study fundamental problems in biology and neuroscience remains an open question. It may be easier to start at a simple scale, such as the emulation of sensory circuits that involve only one to a few synapses. At a larger scale, learning and adaptive behavior emerge naturally when different individuals in a colony or society interact. It will be interesting to see the possibilities in circuits comprising millions of adaptive-device-based building blocks and how this might translate into better or more capable autonomous machines in the future. With increasing access to semiconductor foundries, this may soon become a reality for academic researchers.
We thank AFOSR (No. FA9550-16-1-0159), ARO (No. W911NF-16-1-0289), and Gilbreth Fellowship for their support. This research was funded in part by the Vannevar Bush Faculty Fellowship program, the Center for Brain-Inspired Computing (C-BRIC), and a DARPA/SRC funded JUMP (Joint University Microelectronic Program). Y.K. and I.K.S. would like to acknowledge the funding from the Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award No. DE-SC0019273. D.D.F. was supported by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES), Materials Science and Engineering Division. J.W.F. thanks the Advanced Photon Source, which is a DOE Office of Science User Facility, and was supported by the U.S. DOE, Office of Science, under Contract No. DE-AC02-06CH11357. S.K.R.S.S. thanks the Center for Nanoscale Materials, an Office of Science user facility, and was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357.