Resistive random-access memories are promising analog synaptic devices for efficient bio-inspired neuromorphic computing arrays. Here we first describe working principles for phase-change random-access memory, oxide random-access memory, and conductive-bridging random-access memory for artificial synapses. These devices could allow for dense and efficient storage of analog synapse connections between CMOS neuron circuits. We also discuss challenges and opportunities for analog synaptic devices toward the goal of realizing passive neuromorphic computing arrays. Finally, we focus on reducing spatial and temporal variations, which is critical to experimentally realize powerful and efficient neuromorphic computing systems.
Recently, artificial intelligence (AI) has allowed for significant technological advancements in image classification,1–3 speech recognition,4–6 strategic gaming,7 and decision-making.8–12 However, artificial neural networks (ANNs) require large amounts of computing power, especially for deep learning.13 Because of this, there is significant demand for more efficient hardware to accelerate ANN training and improve classification accuracies for recognition tasks. For example, wider artificial neural network layers can better handle more complex datasets.14 Increasing the number of computing cores has been the primary method of handling larger artificial neural networks. Hence, Graphics Processing Units (GPUs) have become the hardware of choice for most AI workloads. GPUs handle many operations in parallel by processing data using a centralized control of parallel arithmetic logic units (ALUs) that fetch data from a memory hierarchy.15,16 Compared to GPUs, application-specific integrated circuit (ASIC) accelerators17–23 and field-programmable gate array (FPGA) accelerators24–27 have demonstrated computing efficiency improvements. However, the performance of these complementary metal-oxide semiconductor (CMOS) based systems is limited by the large footprint of synaptic cells and frequent access to external memory.28 This has motivated the exploration of bio-inspired neuromorphic systems.29–31 Figure 1 illustrates the trade-off between technologically mature CMOS-based hardware and ideal neuromorphic systems (Table I).
|Material|Strategy for uniformity|Temporal variation (%)|Spatial variation (%)|References|
|ZrO2|Implanting Ti ions|35|…|128|
|ZnO w/Ti|Stack modification|∼10|∼25|117|
In contrast to electronic hardware, the human brain is efficient wetware consisting of roughly 80 × 10⁹ neuronal cells and 150 × 10¹² synapses.32 A working hypothesis suggests that learning, in part, occurs when synapse connection strengths are re-programmed to process electrical signals differently than they did previously.33 Memory and other cognitive processes emerge from information encoded in molecules, atoms, and ions within the cellular network.34 While there is still much to learn about neurophysiological phenomena, discoveries in neuroscience35–40 illuminate a path toward developing neuromorphic computing for more efficient AI.
In the late 1980s, Mead first described neuromorphic computing as a concept involving large integrated electronic analog systems that mimic biological neural networks.41 Neuromorphic computing has since evolved into two categories: (1) bio-plausible: focused on mimicking dynamics of biological synapses and (2) bio-inspired: focused on developing electronic systems (loosely) inspired by the brain for more efficient artificial neural networks. In the first category, research has focused on experimentally demonstrating biologically observed phenomena, such as spike-timing-dependent plasticity (STDP), with artificial synapses.42–44 However, how to efficiently utilize local updating rules with spikes remains an important question to tackle.45 By contrast, bio-inspired neuromorphic computing focuses on implementing artificial neural network hardware built for learning algorithms that are well-defined mathematically.
This perspective focuses on recent progress toward achieving bio-inspired neuromorphic computing with analog arrays of artificial synapses,46,47 as illustrated in Figs. 2(a) and 2(b). Like biological synapses, synaptic devices can reconfigure constituent ions and atoms to change the material conductivity between two electrode terminals. The conductance values, G, are analog weights used for passive weighted summations in crossbar arrays, as illustrated in Fig. 3(a). As voltage biases, Vi, are applied to M input rows, the output currents, Ij, for all N output columns are weighted summations given by

Ij = Σi GijVi, (1)
where Gij are the synapse conductance values, as illustrated in Fig. 3(b). Passive arrays can execute weighted summations for all columns in parallel according to Ohm’s law and Kirchhoff’s current law. Each array thus implements vector-matrix multiplications given by

I = GᵀV, (2)

where V is the vector of input voltages and I is the vector of output currents.
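As a concrete illustration, the passive weighted summation can be sketched in a few lines of Python; the conductances and voltages below are made-up values, not measured device data:

```python
import numpy as np

# Hypothetical conductance matrix for an M x N crossbar (siemens).
# Rows are inputs, columns are outputs, mirroring Gij in the text.
M, N = 4, 3
rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(M, N))  # analog synaptic weights

V = np.array([0.1, 0.2, 0.0, 0.3])  # input row voltages (volts)

# Ohm's law gives a current G[i, j] * V[i] at each cross-point;
# Kirchhoff's current law sums those currents along each output column:
I = G.T @ V  # shape (N,): Ij = sum_i Gij * Vi

# Equivalent explicit loop, term by term:
I_loop = np.array([sum(G[i, j] * V[i] for i in range(M)) for j in range(N)])
assert np.allclose(I, I_loop)
```

In a physical array, all N column currents settle simultaneously; the matrix product here only emulates that parallelism.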
Increasing and decreasing the conductance values of synaptic devices are referred to as potentiation and depression, respectively. These operations are executed by applying electrical signals that exceed the activation thresholds, VP and VD, respectively. After potentiation, the pre-synaptic input is more strongly associated with the post-synaptic output, i.e., the artificial synapse conductance increases, as illustrated in Fig. 4(a). After depression, an artificial synapse becomes more resistive, as illustrated in Fig. 4(b). Electrical signals weaker than the threshold voltages rarely elicit a conductance change. This allows for negligible perturbation of conductance values while weighted summations are computed.
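A toy device model illustrates this threshold behavior: sub-threshold signals leave the conductance untouched, while signals beyond VP or VD potentiate or depress it. All threshold, step, and range values below are hypothetical:

```python
# Toy artificial-synapse model: conductance only changes when the
# applied voltage exceeds the potentiation (v_p) or depression (v_d)
# activation threshold. All parameter values are illustrative.
class Synapse:
    def __init__(self, g=1e-5, v_p=1.0, v_d=-1.0, dg=1e-6,
                 g_min=1e-6, g_max=1e-4):
        self.g, self.v_p, self.v_d = g, v_p, v_d
        self.dg, self.g_min, self.g_max = dg, g_min, g_max

    def apply(self, v):
        if v >= self.v_p:        # potentiation: conductance increases
            self.g = min(self.g + self.dg, self.g_max)
        elif v <= self.v_d:      # depression: conductance decreases
            self.g = max(self.g - self.dg, self.g_min)
        # sub-threshold voltages (e.g., read biases) leave g unchanged
        return self.g

s = Synapse()
s.apply(0.3)        # read-like bias: no change
g_read = s.g
s.apply(1.2)        # above v_p: potentiate
assert s.g > g_read
```

The sub-threshold branch is what permits non-destructive weighted summations: read voltages stay well below VP and VD.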
During online learning, artificial neural network training involves potentiation and depression of connection strengths between ensembles of neurons to minimize the global error over a training dataset. Of the many existing learning algorithms,33,48–51 stochastic gradient descent using backpropagation is the most widely used method to update synaptic weights.52–55 During training, the partial derivative of the error, δi, for each array output, i, is calculated by using CMOS neuron circuits, and the conductance, Gij,old, can be updated to Gij,new by the delta rule,52

Gij,new = Gij,old + ΔG, ΔG = f(η, xj, δi), (3)
where ΔG is the amount of conductance change, η is the learning rate parameter, and xj is the activity at input j. The function f(η, xj, δi) is implemented in circuitry to physically manifest the synaptic weight change (computed by CMOS neuron circuits) by reconfiguring the material structure within an artificial synapse. For example, voltage pulses exceeding the threshold values can be applied to potentiate or depress artificial synapses according to Eq. (3), as illustrated in Figs. 4(a) and 4(b). Protection voltages are commonly used to prevent unintentional conductivity changes in unselected devices. Similar schemes could be accomplished with circuitry comprised of integrators and comparators.56–65
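A minimal sketch of this update scheme, assuming for illustration that each conductance change is delivered as an integer number of fixed-size pulses and clipped to the device's conductance range (the array here uses an output-by-input indexing convention, and all numbers are invented):

```python
import numpy as np

# Sketch of the delta rule of Eq. (3) applied to a crossbar array.
# Assumption: updates are quantized into discrete potentiation or
# depression pulses of fixed conductance step dg_pulse.
rng = np.random.default_rng(1)
n_out, n_in = 3, 4
G = rng.uniform(1e-6, 1e-4, size=(n_out, n_in))  # current conductances
x = rng.uniform(0.0, 1.0, size=n_in)             # input activities xj
delta = rng.normal(0.0, 0.5, size=n_out)         # output errors, delta_i
eta = 1e-6                                       # learning rate (illustrative)

dG_ideal = eta * np.outer(delta, x)              # ideal analog update

# Quantize into whole pulses, then clip to the physical conductance range:
dg_pulse = 5e-8
n_pulses = np.round(dG_ideal / dg_pulse)         # signed pulse counts
G_new = np.clip(G + n_pulses * dg_pulse, 1e-6, 1e-4)
```

Positive pulse counts correspond to potentiation pulses above VP; negative counts to depression pulses below VD.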
CMOS circuits based on flash memory have been demonstrated as synaptic devices.66,67 However, flash technologies are not ideal for online learning due to their slow write speed, low endurance, and the requirement to erase an entire block at a time. In addition to flash memory, three-terminal emergent non-volatile memories, such as organic transistors,68 spin-transfer-torque magnetic random-access memory (STT-MRAM),69 and ferroelectric devices,70 also show promise for multi-level storage of synaptic weights. These types of devices, however, are still in early stages of development.
Resistive random-access memories (RRAMs) are two-terminal emergent non-volatile memories that are suitable for artificial synapses since they are re-programmable and have analog conductance states. Device performance characteristics such as uniform switching, symmetric and linear potentiation/depression, many multi-level states, low operational power, fast switching speed, and good scaling are simultaneously desired for large-scale neuromorphic arrays. The three main types of RRAM are phase-change random-access memory (PCRAM), oxide random-access memory (OxRAM), and conductive-bridging random-access memory (CBRAM). Many of these synaptic devices have demonstrated ideal characteristics for bio-inspired neuromorphic computing. To start, high signal contrast along with multi-level state accessibility is necessary to store precise synaptic weight values. HfOx-based CBRAM has a very large ON/OFF ratio, indicating that good signal contrast could be possible with these synaptic devices.43 Also, Mo/TiOx/TiN-based OxRAM remains stable at as many as 64 analog conductance levels.71 For inference engines and other applications that rely on long-term memory, retention of conductance states is crucially important. Good cycling endurance is also desired to allow online training of synapses.45 Excellent endurance and retention have been observed in Ta2O5−x/TaO2−x bilayer OxRAM: these devices exhibit switching for >10¹² cycles with over ten years of expected retention at room temperature.72 Other metrics are the switching speed, scaling limitations, and linearity or symmetry of potentiation and depression. Sub-nanosecond switching speed has been observed in Ta-based OxRAM.73 Scaling down to ∼2 nm device size has been demonstrated using TiOx/HfO2-based OxRAM.74 Linear analog potentiation and depression have been demonstrated for a-Si-based and crystalline-SiGe-based CBRAM.42,75
Many review articles have surveyed RRAM devices.45,76–83 Chen and Lin provided a review with focus on resistance variation for non-volatile memory or programmable logic applications.84 In addition to resistance variation, switching threshold variation is also important since it correlates with repeatability and controllability of filament geometries and material configurations underlying the conductance change phenomena. This perspective focuses on switching threshold variations of artificial synapses for bio-inspired neuromorphic computing.
Section II of this article will introduce basic working principles for PCRAM, OxRAM, and CBRAM artificial synapses. After their operation mechanisms are discussed, Sec. III will highlight key challenges for these devices with a focus on strategies to reduce temporal and spatial variations of activation thresholds. By achieving better reliability and control of synapse performance, large-scale bio-inspired neuromorphic systems can potentially attain unprecedented computing efficiencies for advanced AI.
II. A BRIEF INTRODUCTION TO PCRAM, OxRAM, AND CBRAM
Two-terminal emergent non-volatile memories are promising synapses for analog arrays since they have a lithographic footprint of only 4F² in crossbar arrays and can take advantage of multi-level states and 3D architectures for further enhanced memory density.85–88 Crossbar architectures have full connectivity for the passive weighted summations that repeat frequently during artificial neural network operation. In this section, we discuss the working principles of PCRAM, OxRAM, and CBRAM artificial synapses.42,43,62,68,72,89,90
Phase-change random-access memory (PCRAM), also known as phase-change memory (PCM), is a relatively mature RRAM technology.91 As illustrated in Figs. 5(a) and 5(b), these devices are based on the reversible crystallization and amorphization of chalcogenide materials. Potentiation and depression in PCRAM occur by altering the crystalline-to-amorphous volume ratio. The crystalline phase typically has a dense, long-range-ordered structure and high conductivity, whereas the amorphous phase is short-range-ordered and porous with low conductivity.64,65 Generally, resistive heater elements are used as bottom electrodes to localize the Joule heating that induces material phase changes. Crystallization in the vicinity of the bottom interface depends on the temperature and the previous material structure. When the temperature of the chalcogenide near the heater element rises above the glass transition temperature (Tg) while remaining below the melting temperature (Tm), the volume ratio of crystalline-to-amorphous material increases, resulting in potentiation, as shown in Fig. 5(c).64 During potentiation, the atomic density increases, which increases the conductance state of the PCRAM artificial synapse.92 Depression occurs when the chalcogenide is rapidly quenched from a molten state, resulting in a larger volume fraction of amorphous material, as shown in Fig. 5(d).
Compared to other two-terminal emergent non-volatile memories, PCRAM is the most mature technology. Products based on PCRAM have been on the market since 2009, whereas OxRAM and CBRAM are still in earlier stages of development.93 Modern PCRAM is scalable (device area smaller than 100 nm²) and possesses a large conductance range (analog ON/OFF ratio of ∼2000) due to the large resistance contrast between amorphous and crystalline phases.93
However, despite the maturity of PCRAM, there remain several opportunities to improve its analog synapse performance. In the low-conductance state, the amorphous structure is the dominant phase, and PCRAM synaptic devices are inherently unstable because of the thermodynamic instability of this amorphous phase. An amorphous phase has a larger specific volume than the crystalline phase, as illustrated in Fig. 5(e).94,95 To form a more thermodynamically stable structure, structural rearrangement occurs in the amorphous phase even at temperatures well below the crystallization temperature, a process known as structural relaxation (SR). During SR, the electrical conductivity of the amorphous phase decreases continuously,95 as shown in Fig. 5(f). The decaying conductivity could be caused by a reduced dangling-bond density or the release of residual stress.
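The continuous conductivity decay during SR is often described by an empirical power-law resistance drift model, R(t) = R0(t/t0)^ν. The sketch below uses an illustrative drift exponent ν, not a value from the cited studies:

```python
import numpy as np

# Empirical power-law drift model commonly used for the amorphous
# phase of PCRAM: R(t) = R0 * (t/t0)**nu, so the conductance
# G(t) = 1/R(t) decays continuously in time. The drift exponent nu
# here is an illustrative assumption (reported values are often on
# the order of 0.01-0.1 for amorphous chalcogenides).
R0, t0, nu = 1e6, 1.0, 0.05   # ohms, seconds, dimensionless

def conductance(t):
    """Amorphous-phase conductance at time t (seconds)."""
    return 1.0 / (R0 * (t / t0) ** nu)

times = np.logspace(0, 7, 8)       # 1 s to ~4 months
g = conductance(times)
assert np.all(np.diff(g) < 0)      # conductance decays monotonically
```

Because the decay is logarithmically slow, a conductance read shortly after programming can differ measurably from one taken days later, which matters for stored analog weights.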
Although gradual potentiation is possible by inducing the spontaneous nucleation and growth of the crystalline region,93,96 gradual depression is difficult since amorphization involves rapid quenching, which generates drastic changes in conductivity, as shown in Figs. 6(a) and 6(b). Additional circuit components could help to compensate for the drastic conductance change during depression.97 However, temporal (cycle-to-cycle) variations are still a significant issue for PCRAM since Joule heating accelerates electromigration, which changes the chalcogenide stoichiometry by inducing phase segregation and void formation.98–100
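This asymmetry can be captured with a simple phenomenological model: potentiation follows a saturating (gradual but nonlinear) curve, while a depression pulse resets the device abruptly. All parameters below are assumptions for illustration, not fitted device data:

```python
import numpy as np

# Illustrative pulse-response model contrasting gradual PCRAM
# potentiation (incremental crystallization) with abrupt depression
# (melt-quench amorphization). Parameter values are assumptions.
g_min, g_max, n_states = 1e-6, 1e-4, 50
A = 5.0  # nonlinearity factor: larger -> stronger saturation

def potentiation(n):
    """Conductance after n identical potentiation pulses (gradual)."""
    return g_min + (g_max - g_min) * (1 - np.exp(-A * n / n_states)) / (1 - np.exp(-A))

def depression(g):
    """A single melt-quench pulse resets conductance near g_min (abrupt)."""
    return g_min

g_curve = potentiation(np.arange(n_states + 1))
steps = np.diff(g_curve)
# Step size shrinks with each pulse: the update is gradual but nonlinear.
assert np.all(steps > 0) and np.all(np.diff(steps) < 0)
```

The shrinking step size is itself a non-ideality: identical pulses produce unequal conductance increments, which complicates linear weight updates.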
Oxide random-access memory (OxRAM), also known as valence change memory (VCM), is another class of two-terminal synapses that is suitable for neuromorphic computing. Analog conductance states are accessible in many binary metal-oxide materials, such as HfOx, TaOx, TiOx, AlOx, ZrOx, SiOx, WOx, ZnOx, and many more.72,101–104 In contrast to PCRAM, OxRAM relies on the formation and rupture of localized conduction channels that result from the movement of oxygen vacancies.72 After years of research, OxRAM has demonstrated promising properties such as record-high endurance (>10¹² cycles),72 long data retention (>10 years at 85 °C),105 good scalability (<10 nm),106 fast switching speed (<1 ns),73 and back-end-of-line (BEOL)/3D compatibility.85–88
Although demonstrations show promise, there are still challenges for OxRAM. First, temporal and spatial variations are unavoidable due to the stochasticity of conductive filament formation and rupture. Non-uniformity complicates the design of CMOS neuron circuits, decreases the efficiency of programming device conductance states, and results in reduced classification accuracies.45 Furthermore, consistent conductance change step-size is hard to achieve in OxRAM since conductance has an exponential dependence on the conductive filament size and geometry.107 As a result, OxRAM has been challenging to implement without additional CMOS at each synapse. Figure 6(c) shows spatial variation of potentiation and depression measured from Ta2O5-based OxRAM. Examples of current-voltage (I-V) cycles for Ta2O5-based OxRAM are shown in Fig. 6(d).
Conductive-bridging random-access memory (CBRAM), also known as electrochemical metallization memory (ECM) or programmable metallization cell (PMC) memory, is also based on the formation and rupture of conduction channels. In contrast to OxRAM, whose conductive filaments consist of oxygen vacancies, the conduction channels in CBRAM are composed of metal, usually originating from one of the electrodes. CBRAM consists of an active metal moving through an amorphous solid electrolyte42,43,108–111 or a single-crystalline epitaxial film with dislocations.75 During potentiation, a positive voltage is applied to the active electrode to oxidize metal into ions and electrons. Ions drift through the resistive medium and are reduced within the conduction channel, increasing the terminal-to-terminal conductivity. During depression, a negative potential is applied to the active electrode, and the metal conduction channel destabilizes and ruptures. Metal clusters have been observed forming and dissolving within a-Si and SiO2.109,112 Because CBRAM utilizes metal filaments rather than oxygen vacancies, its conductance range can be much larger than that of OxRAM.43,77,112,113 Figure 6(e) shows potentiation and depression of epitaxial SiGe-based CBRAM. I-V cycles for epitaxial SiGe-based CBRAM are shown in Fig. 6(f).
In CBRAM, mid-level conductance states initially tend to decay and stabilize at lower values due to metal clustering.43 Stabilizing filaments by confining metal within dislocations could improve retention.75 However, like PCRAM and OxRAM, CBRAM also suffers from large variations due to the stochasticity associated with conduction channel formation and rupture. Achieving better uniformity for two-terminal emergent non-volatile memories is a critical challenge. Section III of this perspective focuses on strategies to minimize variations for PCRAM, OxRAM, and CBRAM artificial synapses.
III. STRATEGIES TO MINIMIZE VARIATIONS
Although stand-alone synapses show promise, array demonstrations have been limited to small-scale systems or require transistors to regulate each synapse individually. A major bottleneck limiting large-scale passive arrays is temporal (cycle-to-cycle) and spatial (device-to-device) variations, which result from the stochasticity of filament formation and rupture.84,108,109,112,114 The more the conductance change per pulse fluctuates from synapse to synapse and in response to repetitive pulsing, the more difficult it becomes to implement conductance-updating schemes. Furthermore, simulations accounting for variation predict that non-uniformity degrades classification accuracy.45 As a general guideline, temporal variation above 2% will likely cause severe degradation of learning accuracy.45 Arrays are fairly tolerant to spatial variations, but high yield (∼100%) is essential since defective devices could permit excessive leakage currents.45
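To see why per-pulse fluctuations matter, a small Monte Carlo sketch (with invented pulse parameters) estimates how the relative spread of a programmed weight grows with the cycle-to-cycle coefficient of variation; since the n pulse increments average out, the relative error scales roughly as CV/√n:

```python
import numpy as np

# Monte Carlo sketch of how temporal (cycle-to-cycle) variation limits
# weight-programming precision. Each pulse changes conductance by a
# nominal step dg perturbed by Gaussian noise with the given
# coefficient of variation (cv). All numbers are illustrative.
rng = np.random.default_rng(42)

def program_error(cv, n_pulses=100, dg=1e-7, trials=2000):
    """Relative spread of the final conductance over many write attempts."""
    steps = rng.normal(dg, cv * dg, size=(trials, n_pulses))
    g_final = steps.sum(axis=1)
    target = n_pulses * dg
    return np.std(g_final) / target

for cv in (0.02, 0.10, 0.50):
    print(f"CV = {cv:4.0%}: relative weight error = {program_error(cv):.3%}")
```

This simple model ignores nonlinearity and spatial variation, so it understates the problem; it only isolates the effect of cycle-to-cycle noise on a single device.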
The potentiation voltage threshold is a useful metric for assessing uniformity since it is closely related to the conduction channel geometry. Consistent spatial (device-to-device) and temporal (cycle-to-cycle) responses to identical voltage signals could allow for fast and low-power bio-inspired neuromorphic systems.31,58 Conventionally, the percentage of variation is reported as the coefficient of variation (standard deviation over the mean). An alternative metric for characterizing uniformity is the variation of conductance states.84 In this perspective, we evaluate synaptic device variations by comparing potentiation thresholds measured in DC current-voltage sweeps. It is worth noting, however, that thresholds under pulsing conditions differ slightly from DC measurements: varying the pulse duration alters the probability that a synapse will potentiate or depress in response.75
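The coefficient of variation used throughout Sec. III is straightforward to compute; the threshold voltages below are invented for illustration:

```python
import numpy as np

# Uniformity metric used in Sec. III: the coefficient of variation
# (standard deviation over the mean), usually reported as a percentage.
def coefficient_of_variation(samples):
    samples = np.asarray(samples, dtype=float)
    return np.std(samples) / np.mean(samples)

# e.g., hypothetical potentiation thresholds (volts) from repeated
# DC I-V sweeps of one device (temporal variation):
v_th = [1.02, 0.98, 1.05, 0.95, 1.00]
cv_percent = 100 * coefficient_of_variation(v_th)
```

Applied to thresholds from repeated sweeps of one device, this gives temporal variation; applied to thresholds from many devices, it gives spatial variation.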
Consistent resistive switching fundamentally relies on controlling the evolution of the conduction channel. However, phase transitions in PCRAM and filament evolution in OxRAM and CBRAM are inherently stochastic processes.115 While randomness cannot be entirely eliminated, stochasticity can be minimized by scaling, modifying device structures, embedding nanoparticles, and pre-defining filament geometries.
A. Scaling down device size

Numerous candidate filament pathways exist within a large device area. Because of this, reducing the electrode area can effectively reduce stochasticity. Kim et al. reported a spatial variation of only 3% for 50 nm-node a-Si-based CBRAM, as shown in Fig. 7(a).108 Also, Lee et al. reported that shrinking ZrOx/HfOx OxRAM to 50 nm reduces temporal variation to 3%.116 Scaling down PCRAM device size can allow crystallization and amorphization across the entire volume to improve temporal variation.93 However, this limits synapses to binary conductance values. Furthermore, as PCRAM device size decreases, potentiation thresholds become more sensitive to small differences in crystal grains and stoichiometry.100
B. Structure modification
Modifying the device structure is a useful strategy to stabilize the formation and rupture of a conduction channel. For example, uniformity is improved for ZnO-based OxRAM by adding a TiOx layer, resulting in about 25% spatial variation and 10% temporal variation.117 In stacked crossbars, layered TiN/TiO2−x/Al2O3 OxRAM between Pt electrodes has only 16% spatial variation.86 Including a Ge-Sb-Te interface layer between Al and CuxO helps to isolate OxRAM filament formation for as low as 13% temporal variation.118 Adding a layer of IrO2 could help to stabilize oxygen migrations in NiO-based OxRAM, allowing for only 11% temporal variation.119 Finally, embedding Al layers in HfOx CBRAM can reduce spatial variation to 10%.120
Nanostructures have been employed to direct conductive pathways for improved uniformity. For example, nanopores in graphene could reduce variation in Ta/G/Ta2O5 OxRAM by acting as a diffusion barrier, as shown in Fig. 7(b).121 Nanocones, as shown in Fig. 7(c), are capable of localizing Ag filaments in SiO2 to result in 34% spatial variation and 32% temporal variation.122 Uniformity improvement using pyramids has also been observed for Al2O3-based CBRAM,123,124 and GST-based PCRAM, as shown in Fig. 7(d). Self-assembled SiOx nanostructures in the inert electrode can help localize oxide-RRAM conductive filaments for about 30% spatial variation and 11% temporal variation.125 Also, vertically aligned ZnO nanorods, as shown in Fig. 7(e), can also enhance uniformity by confining oxide-RRAM conductive filaments to nanorod surfaces for as low as 6% temporal variation.126
C. Embedding nanoparticles
Embedding nanoparticles is another effective method to reduce variation since nanoparticles can help to guide the formation of conductive filaments. Liu showed that adding nanocrystals to ZrO2-based OxRAM results in 50% temporal variation.127 Alternatively, Ti ions can be implanted into ZrO2-based oxide-RRAM to allow for 35% temporal variation.128 Lee et al. showed that annealing to agglomerate conductive filaments in CuC-based CBRAM allows for about 30% temporal variation.129 In TiO2-based OxRAM, embedding Ru nanodots allows for about 30% variation,130 whereas embedding Pt nanocrystals, as shown in Fig. 7(f), isolates conductive filament formation, resulting in only 9% temporal variation.131 Finally, in SiO2-based OxRAM, embedding Pt allows for up to 1000 uniform cycles.132
D. Defining channels in single crystals
Pre-defining conduction channels using threading dislocations in a single-crystalline material is an effective technique for controlling spatial and temporal conductive filament dynamics. Potentiation and depression have been demonstrated using single-crystalline STO-based OxRAM,133 as well as for crystalline-SiGe-based CBRAM.75 Although ions are typically less mobile in defect-free single-crystals compared to amorphous materials, threading dislocations allow conduction channels to form with as low as 1% temporal variation and only 4.9% spatial variation,75 as shown in Fig. 8. Under pulsing conditions, the potentiation threshold voltage shows 4.8% spatial variation and 3.9% temporal variation, while the depression threshold voltage shows 5.3% spatial variation and 4.8% temporal variation. These devices also demonstrate large conductance range, good endurance, and long retention time at high conductance states.
IV. CONCLUDING REMARKS
Bio-inspired neuromorphic computing arrays with PCRAM, OxRAM, and CBRAM artificial synapses have demonstrated pattern classification. To create robust arrays with non-uniform emergent non-volatile memories, transistors can help to compensate for variability and other device non-idealities. For example, with one transistor at each synaptic device, HfO2/Al2O3-based OxRAM in a 128 × 8 array demonstrated facial recognition.134 A 500 × 661 PCRAM array with one transistor at each artificial synapse has also demonstrated pattern classification.62 Adding additional CMOS circuits to each artificial synapse can further enhance computing capabilities. For example, two PCRAMs, two transistors, and one capacitor per artificial synapse can execute training and image classification on the MNIST, MNIST-back-rand, CIFAR-10, and CIFAR-100 datasets.97
Passive arrays without transistors regulating each synapse promise an ultimate reduction in footprint size and power consumption. Transistor-free Al2O3/TiO2-based OxRAM demonstrated classification of 3 × 3-black/white pixel images using a passive 12 × 12 array.57 Al2O3/TiO2−x-based OxRAM in two 20 × 20 passive arrays demonstrated one-hidden layer perceptron classification.135 WOx-based OxRAM in a 32 × 32 crossbar array demonstrated offline learning for image analysis.136
Although these demonstrations are exciting, device variations impose limitations on classification accuracy for more complex databases. For example, a color image in the CIFAR-10 and CIFAR-100 datasets consists of 32 × 32 × 3 inputs, which is currently too large for passive arrays to handle. Large arrays require complicated CMOS neuron control circuits to deal with variations. Robust and reliable artificial synapses could greatly simplify these peripheral access circuits and enable larger neuromorphic systems.
High-performance synaptic devices with minimal variations are critical to demonstrate passive arrays of analog artificial synapses. Spatial and temporal uniformity can be improved by shrinking device size, modifying device structure, embedding nanoparticles, or defining channels in single-crystals. Although some device non-idealities are tolerable, better control of conductance change is essential to realize wider arrays for powerful and efficient neuromorphic computing.
Several additional physical effects also influence device behavior and properties. For example, the nanobattery effect caused by inhomogeneous ion distribution in the switching medium results in stored charges.137,138 At metal/insulator interfaces, native oxides degrade device reliability.139 Changes in moisture alter device properties since water molecules can be incorporated into defect sites.140,141 Nano-scale variations in local (short-range) material structure and density can also cause significant switching variations.113,142 Furthermore, the interplay between cation/anion mobility and oxidation/reduction reaction rates governs the growth direction and shape of conductive filaments.109
Based on measured synaptic device characteristics, there is also significant opportunity to improve circuitry and algorithms to map calculated weight values to artificial synapses. Circuit implementation of Eq. (3) requires thoughtful design to efficiently apply voltage signals, read weighted summations, calculate and backpropagate partial derivatives of error, and modify artificial synapse conductance values. Mapping algorithms could also help to tolerate device non-idealities and variations. Highly integrated CMOS neuron circuits will be important to expand AI system capacities for more complicated tasks.
Accurate and detailed modeling of the dynamic processes that give rise to conductance change could help guide both device engineering and algorithm development. Advanced microscopic imaging, such as in situ transmission electron microscopy (TEM), may help reveal new strategies for controlling potentiation and depression.
In addition to improving emergent non-volatile memories, bio-inspired neuromorphic systems also could benefit from high-performance selector devices.143–145 Selectors introduce non-linearity to suppress leakage currents at sub-threshold voltages. Two-terminal architectures allow for vertical integration with emergent non-volatile memories.
In summary, PCRAM, OxRAM, and CBRAM are promising emergent non-volatile memories for analog neuromorphic arrays. Additional transistors or CMOS circuitry can compensate for spatial and temporal non-uniformity and allow pattern classification with accuracies comparable to those of conventional hardware. Passive analog artificial synapse arrays are capable of dense and efficient bio-inspired neuromorphic computing, and large-scale passive arrays could be achieved by reducing variations to better control conductance change without requiring transistors. Improving the uniformity of artificial synapses can thus allow for efficient and powerful hardware specialized for AI.