In neuromorphic circuits, the stochasticity observed in the cortex can be mapped onto either the synaptic or the neuronal components. The hardware emulation of such stochastic neural networks is currently being studied extensively using resistive memories, or memristors. The ionic processes underlying the switching behavior of memristive elements are considered the main source of stochasticity in their operation. Building on this inherent variability, the memristor is incorporated into abstract models of stochastic neurons and synapses. Two approaches to stochastic neural networks are investigated. Aside from size and area, the main points of comparison between the two approaches are their impact on system performance, in terms of accuracy, recognition rates, and learning, and where the memristor fits best.
Conventional computing systems are facing severe limitations due to the increasing demands on data and processing capabilities.1 Hence, the current challenge is to realize a new computing paradigm that meets the stringent requirements of today’s applications, particularly in terms of interpreting perceived data and learning from it to solve unfamiliar problems. Inspired by the human brain in its dimensionality, energy consumption, and underlying functionality, neuromorphic systems build upon circuit elements to mimic neurobiological activity.2 In abstract terms, the brain is mapped onto its building blocks of neurons and synapses along with their interactions. Several circuit designs are available, acting either as a close resemblance to the physiological elements and their behavior or as a mere abstraction of the operating principles in an event-based manner.3
However, given the size and energy constraints, along with the extensive scaling witnessed in CMOS technology, emerging non-volatile memory technologies are rising as suitable alternatives to conventional designs.4–7 Resistive random access memory (ReRAM), in particular, is a prominent candidate and has been integrated into several neuromorphic platforms. In its original form of operation, with a continuous change of resistance upon input excitation, it has been used as analog synapses8,9 and neuronal elements.10 With threshold-based devices, on the other hand, more abstract implementations are available that benefit mainly from binary operation between the high (ROFF) and low (RON) resistance values.
Nonetheless, an intriguing feature of the brain is the variability apparent in its operation, where biological noise has been shown to be beneficial for learning, information processing, and decision making.11–13 Incorporating this stochasticity into neural network operation mainly comes down to having either the neuron or the synapse behave in a nondeterministic manner. Traditionally, stochasticity is added to neural networks by injecting background noise into the circuit.3 However, recent studies on the material characteristics of memristive elements have shown a strong predisposition toward variable behavior, which allows their integration into neuromorphic circuits to inherently induce the required stochastic behavior. A recent study on phase change memories also builds on the inherent variability to mimic the neural activity of the integrate-and-fire neuron.14
In this paper, memristor-based stochastic neural networks are investigated. The memristor dynamics and the underlying switching principles are discussed. The origins of the memristor’s variability are then elaborated, and two approaches to stochastic winner-take-all networks are presented: stochastic neurons and, alternatively, stochastic synapses, both tested on MNIST data for classification. A comparative analysis of the neuronal and synaptic stochasticity models is presented, along with concluding remarks on the overall integration approaches.
II. MEMRISTIVE NEURAL NETWORKS
In its deterministic form, the memristor integrated within neural network platforms has shown performance similar to that of conventional CMOS architectures, with added savings in area and power. Adding the stochastic feature provides a further avenue for design savings and resilience to inevitable hardware variations. The design principles for the stochastic neuronal and synaptic elements are elaborated in this section.
A. Stochasticity origins
Memristors are two-terminal devices that exhibit a resistance change under an input bias excitation.15 Their internal material composition, an insulating layer sandwiched between two metal electrodes, governs the corresponding operation. The switching mechanism is abstracted into the formation and rupture of a single dominant conducting filament.16 Oxidation, ion transport, and reduction are the underlying processes involved in the filament formation.17 Depending on the internal chemical interactions, one of these processes is the rate-limiting process; in other words, it determines the rate of electron hopping and of filament formation. Moreover, as these processes are driven thermodynamically, the switching mechanism depends on thermal activation over a dominant energy barrier, which is stochastic in nature. This leads to the stochastic formation of a dominant conducting filament and, subsequently, to the variable operation of the memristor.18 As the input bias drives the metal particles through the insulating layer, Joule heating comes into play, adding variability to the switching behavior of the device. The wait time for the device to switch is therefore variable. This offers a direct relation between the applied voltage level and the activation energy (Γ) for the electrons to move into the gaps, as shown in the following equation.
where KB, T, and v correspond to the Boltzmann constant, the temperature, and the applied voltage, respectively.
Depending on the material characteristics and the interaction between the metal and insulating layers, the switching behavior of the device differs as well. Diverse statistical distributions have been suggested based on experimental fitting of fabricated devices, such as Poisson,18 log-normal,19 and Gaussian.20 However, the consensus among them is the linear relationship between the time it takes for the device to switch and the applied voltage. For threshold-based devices, where no change in resistance is seen below a certain set threshold, there is a probabilistic behavior that allows a switching event to occur in the sub-threshold region. This behavior is time and voltage dependent: the closer the input voltage is to the threshold, the shorter the time it takes for the memristor to change state. Conversely, a longer time is needed for a switching event to occur under a smaller input voltage, as shown in the following equation relating the average time to switch and the applied bias V.
where τ0 and V0 are fitting parameters that depend on the underlying material and fabrication process.21
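As a minimal sketch of this voltage dependence, the snippet below assumes an exponential form, tau(V) = tau0 * exp(-V / V0), under which the logarithm of the mean switching time falls linearly with the applied bias. The functional form, the function name `mean_switch_time`, and the numerical values of `tau0` and `v0` are illustrative assumptions, not fitted device parameters.

```python
import math


def mean_switch_time(v, tau0=1.0, v0=0.1):
    """Mean switching time in seconds for an applied bias v in volts.

    Assumes tau(V) = tau0 * exp(-V / V0); tau0 and v0 are placeholder
    fitting parameters, not values taken from a measured device.
    """
    return tau0 * math.exp(-v / v0)


# A larger bias gives a shorter average wait before the device switches.
for v in (0.2, 0.4, 0.6):
    print(f"V = {v:.1f} V -> mean switching time = {mean_switch_time(v):.3e} s")
```

Under this assumption, each increase of the bias by `v0` shortens the mean switching time by a factor of e.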
B. Neuronal stochasticity
The neuron is the main processing component within the network, with electrical impulses being the means of communication and information processing.2 A simplified abstraction of the neuron behavior is the integrate-and-fire (I&F) model. The incoming charges to the neuron are accumulated, and once the voltage across the membrane reaches a certain threshold, a spike is generated. A conventional resistor-capacitor (RC) circuit is used to model this behavior with an added threshold condition.
With the time constant of the RC circuit, the charge accumulation in the neuron is modeled as
The threshold condition supplements the above equation in order to model the spike emission behavior of the neuron. Firing occurs at a time t(f), once the voltage V(t) reaches the threshold Vth, after which the potential is reset to vr, as defined in the following equation.
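The integrate-and-fire behavior described above can be sketched numerically. The example assumes the standard leaky form tau_m * dV/dt = -V(t) + R * I(t), with a reset to `v_reset` whenever V reaches `v_th`. The function name, the forward-Euler integration, and all parameter values are assumptions chosen for illustration.

```python
def simulate_lif(current, dt=1e-4, tau_m=0.02, r=1.0, v_th=1.0, v_reset=0.0):
    """Return the spike times (s) of a leaky integrate-and-fire neuron.

    current: sequence of input current samples, one per time step dt.
    All default parameter values are hypothetical.
    """
    v = v_reset
    spike_times = []
    for step, i_in in enumerate(current):
        # Forward-Euler step of tau_m * dV/dt = -V + R * I
        v += dt / tau_m * (-v + r * i_in)
        if v >= v_th:
            # Threshold condition: emit a spike, then reset the potential
            spike_times.append((step + 1) * dt)
            v = v_reset
    return spike_times


strong = simulate_lif([1.5] * 2000)  # R * I settles above the threshold
weak = simulate_lif([0.5] * 2000)    # R * I settles below the threshold
print(len(strong), "spikes for the strong input;", len(weak), "for the weak input")
```

With a constant input, the neuron fires periodically only when R * I exceeds the threshold. Otherwise the membrane potential settles below it and no spike is produced.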
However, in order to induce stochasticity in the operation of the neuron, a random noise generator is conventionally used: noise is added to the membrane potential, thereby randomizing the spike generation process, as shown in Figure 1a. This technique requires extra hardware to be added to the circuit and degrades the energy efficiency of the neuronal elements. Alternatively, we proposed the use of the stochastic memristor as an inherent source of variability in the neuron that allows it to produce spikes stochastically.23 The memristor is placed in parallel with the original neuron circuit, and its variable threshold randomizes the firing threshold of the neuron and, consequently, the spiking behavior, as shown in Figure 1b. From the circuit perspective, the memristor, with its variable threshold, acts as a stochastic comparator and allows for a more area- and power-efficient implementation of the neuron. It replaces both the random number generator for the injected noise and the operational amplifier used in earlier designs.
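To illustrate the idea of a stochastic comparator, the sketch below redraws the firing threshold at every comparison instead of adding noise to the membrane potential. The Gaussian threshold distribution, its mean and spread, and the function names are assumptions for illustration only. This is a behavioral abstraction, not a model of the circuit in Figure 1b.

```python
import random


def stochastic_fire(v_membrane, v_th_mean=1.0, v_th_sigma=0.1, rng=random):
    """Compare the membrane potential against a freshly drawn threshold.

    The Gaussian spread stands in for the device's variable threshold
    (an assumed distribution, not one taken from measurements).
    """
    v_th = rng.gauss(v_th_mean, v_th_sigma)
    return v_membrane >= v_th


def firing_fraction(v_membrane, trials=10000, seed=0):
    """Fraction of trials in which the comparator fires at a fixed potential."""
    rng = random.Random(seed)
    fired = sum(stochastic_fire(v_membrane, rng=rng) for _ in range(trials))
    return fired / trials


for v in (0.8, 1.0, 1.2):
    print(f"V = {v:.1f} -> firing fraction = {firing_fraction(v):.3f}")
```

The firing decision becomes a smooth, probabilistic function of the membrane potential rather than a hard step, which is the behavior that injected noise would otherwise provide.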
C. Synaptic stochasticity
An alternative approach to inducing stochasticity in the neural network is through stochastic synapses,24 where the communication channels between the neurons do not behave in a deterministic manner. We have also investigated this approach, showing preliminary results, by incorporating stochastic memristors in a crossbar structure and building on the random switching between their two binary values.25 The synaptic weight was thus variably set to either an ON or an OFF resistance state according to the corresponding value of the memristor. Figure 2 shows the memristor crossbar incorporating stochastic memristive elements, where the rows and columns are connected to the input and output neurons, respectively. The switching of the memristive elements is time and voltage dependent: the time it takes for a device to switch is random, with a distribution set by the applied voltage. Operation is pulse based, with the duration and amplitude of each pulse tuning the probabilistic operation of the memristors and thereby shaping the synaptic values to either a high or a low value. In this paper, more elaborate simulations and test analyses are shown. Larger network sizes and input patterns are used to quantify the performance and the corresponding metrics.
III. SIMULATION AND DISCUSSION
A fully connected winner-take-all (WTA) network structure was adopted for the simulations and tests of the two stochastic approaches. MNIST handwritten digits were used for training and classification.
A. Simulation platform
In Ref. 21, a generic model of the memristor was introduced in which stochasticity can be added to the threshold of the device. The technique is based on varying the threshold voltage while preserving the kinetics of the switching. To that end, the abstract switching probability following the Poisson distribution was adopted in these simulations, as it is akin to the neuronal behavior and can be adapted to the synaptic components as well. Hence, given the variability present in its resistance state switching and its threshold-based operation, the memristor can be utilized to induce inherent stochasticity in the neuronal and synaptic components of neural networks.
For the system-level simulation, a neural network platform was established. Fully connected layers of input and output neurons are set up as shown in Figure 3. The neurons are based on the integrate-and-fire model, with the input layer responsible for encoding the supplied images or patterns. Each digit was represented as a 28 × 28 pixel image, requiring 784 encoding neurons. A spike-based approach was used for the neuronal stochasticity, where high- and low-frequency spikes encode the light and dark pixels, respectively. The stochastic synaptic crossbar, on the other hand, required a pulse-based approach: pulses of +0.6 V and +0.1 V were used to represent white and black pixels, respectively. The platform accommodates different simulation parameters and offers flexibility in the training and recognition phases, with diverse data sets, test sets, network sizes, and learning principles.
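The pulse-based input encoding for the synaptic crossbar can be sketched as follows. The +0.6 V and +0.1 V amplitudes come from the text. The binarization threshold of 128, the function name, and the synthetic test image are assumptions made for this example.

```python
V_WHITE = 0.6  # pulse amplitude (V) for a white pixel, as stated in the text
V_BLACK = 0.1  # pulse amplitude (V) for a black pixel, as stated in the text


def encode_image(pixels, threshold=128):
    """Flatten a 28x28 grayscale image (values 0-255) into 784 pulse amplitudes.

    The binarization threshold is a hypothetical choice for this sketch.
    """
    flat = [p for row in pixels for p in row]
    return [V_WHITE if p >= threshold else V_BLACK for p in flat]


# Synthetic image: a white vertical bar, 8 pixels wide, on a black background.
image = [[255 if 10 <= col < 18 else 0 for col in range(28)] for _ in range(28)]
pulses = encode_image(image)
print(len(pulses), "input pulses;", pulses.count(V_WHITE), "at", V_WHITE, "V")
```

Each entry of `pulses` corresponds to one row of the crossbar, so a 28 × 28 image drives all 784 input lines at once.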
For stochastic synaptic learning, with a fully connected network, each memristor in the crossbar acted as a synaptic connection with stochastic plasticity. The output neurons are connected in a winner-take-all fashion, where the first neuron to spike inhibits the remaining neurons in the layer from spiking. Once any of the output neurons reaches the spiking threshold, a spike is generated and sent back across the same column as a short negative pulse of -3.6 V followed by a short positive pulse of 3.6 V. The memristor values are thus reinforced with the pixel values of the supplied image. The pulse values were chosen to give a 50% probability of a resistance state change. The probability distribution for the Poisson switching is calculated as P(t), with t being the time interval over which the voltage is applied and τ the average switching time.
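As a hedged illustration of how a pulse can be tuned for a 50% switching probability, the sketch below assumes the usual Poisson-process form P(t) = 1 - exp(-t / tau), where tau is the average switching time at the applied voltage. The function names and the value of `tau` are hypothetical.

```python
import math


def switching_probability(t, tau):
    """Probability that a device switches within a pulse of duration t.

    Assumes Poisson switching, P(t) = 1 - exp(-t / tau).
    """
    return 1.0 - math.exp(-t / tau)


def pulse_width_for(p, tau):
    """Pulse duration giving switching probability p, from inverting P(t)."""
    return -tau * math.log(1.0 - p)


tau = 1e-3  # hypothetical average switching time (s) at the programming voltage
t_half = pulse_width_for(0.5, tau)
print(f"pulse width for 50% switching: {t_half:.3e} s")
print(f"check: P(t_half) = {switching_probability(t_half, tau):.3f}")
```

Under this assumption, a 50% switching probability corresponds to a pulse width of tau * ln 2, so either the pulse duration or, through tau, the pulse amplitude can be adjusted to reach it.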
The resistance values of the memristors in the crossbar were recorded after the entire data set had been supplied, along with the winning neurons for each class. The classification phase was then initiated. Random patterns were shown to the network, and the winning neuron for each supplied pattern was recorded and compared to the trained set of neurons. If the winning neuron did not belong to the set of neurons trained for the class of the pattern, an error was recorded. The overall accuracy of the classification is calculated as follows
where Ntest corresponds to the number of test images used for the classification.
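The error-counting procedure can be sketched in a few lines. The example assumes that the accuracy is the percentage of test patterns without a recorded error, that is, 100 * (Ntest - Nerr) / Ntest. This form, the function name, and the toy data are assumptions for illustration.

```python
def classification_accuracy(winners, labels, neuron_class):
    """Percentage of test patterns whose winning neuron matches the true class.

    winners: index of the winning output neuron for each test pattern.
    labels: true class of each test pattern.
    neuron_class: mapping from neuron index to the class it won in training.
    """
    if len(winners) != len(labels):
        raise ValueError("winners and labels must have the same length")
    n_test = len(labels)
    # An error is recorded when the winner was trained on a different class.
    n_err = sum(1 for w, y in zip(winners, labels) if neuron_class[w] != y)
    return 100.0 * (n_test - n_err) / n_test


# Toy example with four output neurons and five test patterns.
neuron_class = {0: 0, 1: 1, 2: 1, 3: 4}
winners = [0, 1, 3, 2, 0]
labels = [0, 1, 4, 2, 1]
accuracy = classification_accuracy(winners, labels, neuron_class)
print(f"accuracy = {accuracy:.1f}%")
```

Several neurons may map to the same class, which matches a network with more output neurons than classes.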
B. Performance analysis
To test the network performance under different settings, the training phase of the network involved presenting digit patterns from the training dataset, for digits 0 to 4, in random order. For the neuronal stochasticity, the stochastic memristive neurons were used in the input and output layers. The network showed high accuracy and robustness to variations in several characteristic metrics.22 The spike-timing-dependent plasticity (STDP) learning rule was applied in the simulations, allowing the synapses to adapt in a more analog manner to the input images, as shown in Figure 4a. For the synaptic stochasticity, in contrast, a stochastic learning rule was applied with binary synaptic weights confined to two levels only, as depicted in the black-and-white images in Figure 4b.
Nonetheless, the binary abstraction did not degrade the accuracy of operation. On the contrary, for networks of 16 and 32 output neurons, the synaptic approach reached an accuracy of about 64%, compared with roughly 59% to 61% for the neuronal model, so the two behave very similarly. Table I lists the accuracy of the neuronal and synaptic approaches. This supports the interchangeability of the two ways of inducing stochasticity in terms of learning potential and adaptation to the supplied input patterns. It is also worth noting that the performance of the network depends strongly on the number of neurons and the corresponding synaptic connections. Studies with 10 to 50 neurons9 could only achieve rates between 50% and 75%, which agrees with our synaptic and neuronal results. Reaching recognition rates greater than 90%,26 however, requires at least 300 neurons and 235,200 synapses.
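The quoted synapse count follows from full connectivity between the 784 input neurons and the output layer. The short check below makes the scaling explicit; the function name is hypothetical.

```python
N_INPUTS = 28 * 28  # one encoding neuron per pixel of a 28 x 28 digit


def synapse_count(n_outputs, n_inputs=N_INPUTS):
    """Number of synapses in a fully connected input-to-output crossbar."""
    return n_inputs * n_outputs


for n_out in (16, 32, 300):
    print(f"{n_out} output neurons -> {synapse_count(n_out)} synapses")
```

The networks simulated here, with 16 and 32 output neurons, therefore use roughly a tenth or less of the synapses of a 300-neuron network.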
| Network size | Neuronal stochasticity | Synaptic stochasticity |
|---|---|---|
| 16 output neurons | 58.9% | 64.2% |
| 32 output neurons | 60.5% | 63.8% |
Moreover, in terms of stochasticity emulation, the stochastic synaptic crossbar was more area efficient, as a single device can stand in for a complete CMOS synaptic circuit, hence offering orders of magnitude in area savings. In addition, with the inherent stochasticity, the applied voltage level is much lower than in the deterministic case, and since the synapses far outnumber the neurons, further power savings are attained.