The commercial introduction of a novel electronic device is often preceded by a lengthy material optimization phase devoted to the suppression of device noise as much as possible. The emergence of novel computing architectures, however, triggers a paradigm shift in noise engineering, demonstrating that non-suppressed but properly tailored noise can be harvested as a computational resource in probabilistic computing schemes. Such a strategy was recently realized on the hardware level in memristive Hopfield neural networks, delivering fast and highly energy efficient optimization performance. Inspired by these achievements, we perform a thorough analysis of simulated memristive Hopfield neural networks relying on realistic noise characteristics acquired on various memristive devices. These characteristics highlight the possibility of orders of magnitude variations in the noise level depending on the material choice as well as on the resistance state (and the corresponding active region volume) of the devices. Our simulations separate the effects of various device non-idealities on the operation of the Hopfield neural network by investigating the role of the programming accuracy as well as the noise-type and noise amplitude of the ON and OFF states. Relying on these results, we propose optimized noise tailoring and noise annealing strategies, comparing the impact of internal noise to the effect of external perturbation injection schemes.

Memristive crossbar arrays are promising candidates as the hardware components of artificial neural networks,1–5 including advanced applications in feed-forward neural networks,6–9 convolutional layers,10,11 3D architectures,12–15 unsupervised neural networks,16–18 and recurrent neural networks.11,19 In these applications, the tunable conductance of a memristor unit encodes a synaptic weight in the network, and once the properly trained weights are programmed to each memristor cell, the memristive crossbar array is able to perform the vector-matrix multiplication, i.e., the key mathematical operation of the network inference, in a single time-step.1,2,6 This equips the artificial neural networks with a highly energy efficient hardware component compared to software solutions, where the evaluation of the input vector at a layer with N neurons requires N^2 multiplication operations. In most neural network applications, the highest possible resolution of the memristive synaptic weights is desirable,20 and, therefore, the memristor non-idealities, such as their programming inaccuracy or their stochastic noise properties, should be eliminated as much as possible. A special class of memristive networks, however, relies on probabilistic optimization,21–25 where it is well known that tunable stochasticity, such as customizable device noise, is not a disturbing factor but a useful computational resource. A similar strategy was recently experimentally realized in 60 × 60 memristive Hopfield neural networks (HNNs),24 demonstrating the efficient solution of max-cut graph segmentation problems and delivering over four orders of magnitude higher solution throughput per power consumption than digital or quantum annealing approaches.

Inspired by these achievements, we perform a thorough analysis of simulated memristive Hopfield neural networks, putting a key emphasis on the effect of device noise on the network’s operation. To this end, first, the realistic noise characteristics of memristive devices26–29 are discussed (Sec. III), and a general noise model describing the conductance-dependent noise characteristics in the filamentary and broken filamentary regimes is proposed. Afterward, various benchmark max-cut problems are solved by simulated memristive HNNs (Sec. IV), relying on the proposed noise model. These simulations demonstrate rather well-defined relative noise values at which the network operation is optimized, regardless of the network size and the type of noise spectrum. We also demonstrate a simplified, easily implementable double-step noise annealing scheme (Sec. IV C), which further enhances the convergence probability of the network. These optimized noise levels, however, are at the top border of the experimentally observed noise amplitudes, which raises the need for an external injection of stochasticity (Sec. IV F). For the latter, two strategies are tested: external noise injection and the introduction of chaotic behavior through the self-feedback of the neurons in the network.30 Finally, further device non-idealities are tested separately (Sec. IV E), analyzing the role of the programming inaccuracy and the finite OFF-state conductance. The presentation of all these results is preceded by a brief overview of Hopfield neural networks and their implementation by memristive crossbar arrays (Sec. II).

The Hopfield Neural Network (HNN), introduced by Hopfield,31 was shown to be capable of solving complex problems by Hopfield and Tank32 and has been used for optimization ever since.33 The Hopfield network’s main allure is its simplicity and immense power to provide reasonable solutions for high complexity problems. The network consists of fully connected binary neurons (without self-connections); the W̲̲ synaptic weight matrix encodes the optimization problem; and the x̲ state of the neurons represents the possible states of the system, including the desired solution(s) [see Fig. 1(a)]. The network is operated in an iterative fashion: at each time-step t the activation,
$$\underline{a}(t) = \underline{\underline{W}}\,\underline{x}(t)$$
(1)
is calculated. Then an index j is picked at random, and a single neuron is updated according to the rule,
$$x_j(t+1) = \begin{cases} +1 & \text{if } a_j(t) \geq \theta_j, \\ -1 & \text{if } a_j(t) < \theta_j, \end{cases}$$
(2)
where θj is the jth component of the predefined threshold vector θ̲. It can be shown that this simple update rule decreases the energy function,
$$E(t) \equiv E\big(\underline{x}(t);\,\underline{\underline{W}},\,\underline{\theta}\big) = -\frac{1}{2}\,\underline{x}(t)^{T}\,\underline{\underline{W}}\,\underline{x}(t) + \underline{\theta}\cdot\underline{x}(t)$$
(3)
in every iteration step (E(t+1) ≤ E(t)). Due to this property, Hopfield neural networks are widely used to solve complex problems, which can be encoded in the form of the effective energy function in Eq. (3). This can be applied in an associative memory scheme,31 where W̲̲ and θ̲ are set such that each local minimum of E(x̲) encodes a predefined pattern (e.g., images), and the update rule drives the system from an arbitrary initial state x̲(0) to the closest local minimum, i.e., the network finds the predefined pattern most similar to the initial state. Alternatively, the Hopfield neural network may find the global solution to a complex problem, such as the max-cut graph segmentation problem.
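As an illustration of Eqs. (1)–(3), a minimal NumPy sketch of the asynchronous update and the energy evaluation could look as follows (this is our own illustrative example, not the implementation used for the simulations; x, W, and theta are assumed to be NumPy arrays of matching dimensions):

```python
import numpy as np

def hopfield_step(x, W, theta, rng):
    """One asynchronous update of a randomly chosen neuron, Eqs. (1) and (2)."""
    j = rng.integers(len(x))             # pick neuron j at random
    a_j = W[j] @ x                       # activation a_j(t), Eq. (1)
    x[j] = 1 if a_j >= theta[j] else -1  # threshold update, Eq. (2)
    return x

def hopfield_energy(x, W, theta):
    """Energy function of Eq. (3); it never increases under the update rule."""
    return -0.5 * x @ W @ x + theta @ x
```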
FIG. 1.

(a) Illustration of a Hopfield neural network with five neurons. The orange (yellow) circles illustrate the “+1” (“−1”) binary states, whereas the lines represent the synaptic weights between the neurons. A HNN excludes self-connections; however, a self-connection with negative weight (dark green arrow) introduces a chaotic nature to the network operation, which helps to find the global solution. Stochasticity can also be introduced by the temporal variation (noise) of the synaptic weights as well as by external noise injection (light green arrow). (b) Illustration of the max-cut problem: the goal is to find a partitioning of the vertices into two adjacent sets so that the total number of crossing edges between the two sets (red lines with dashed-line cut) is maximal. (c) Illustration of the energy landscape of a HNN. A noiseless operation with the conventional update rule may yield dead ends in local minima (gray line). Properly tailored stochasticity, however, helps escape from the local minima and find the global solution (blue line). (d) Experimental realization of the discrete HNN by a memristor crossbar array. The Vi = ±|V| voltage inputs at the horizontal lines represent the states of the neurons, which are updated according to the Ij current outputs at the vertical lines, the latter representing the aj(t) activation. The synaptic weights are encoded in the Gi,j conductance matrix of the memristors in the crossbar. In a conventional HNN, the lack of self-connections is represented by the 0 conductance values at the diagonal of the crossbar (dark green memristors). The light green arrow illustrates the possibility of external noise injection. (e) In a memristive HNN, the “1” and “0” synaptic weights are encoded in GON and GOFF conductance values. These, however, exhibit device-to-device variations described by the ΔGstatic variance. (f) The stochastic temporal variation (i.e., the noise) of the GON and GOFF conductance values also introduces a device non-ideality described by the ΔGdynamic variance. The proper tailoring of the noise, however, aids the network’s operation.

The NP-hard max-cut problem is formulated for an arbitrary undirected G(V, E) graph with V vertices and E edges. The goal is to find a partitioning of V into two disjoint sets S and K so that the total weight of the crossing edges between the two sets is maximal [see Fig. 1(b)].34 Although abstract at first sight, many practical problems can be mapped to the max-cut, such as the conflict-graph formulation of the layer assignment problem in very large scale integration (VLSI) design, where the position of functional blocks is optimized in a multilayer chip.35 If a graph is given by its adjacency matrix A̲̲, a cut can be encoded by x̲, simply as
$$x_k = \begin{cases} +1 & \text{if } V_k \in S, \\ -1 & \text{if } V_k \in K, \end{cases}$$
(4)
and the maximum cut can be found by minimizing the E(x̲;A̲̲,0̲) energy function [Eq. (3)]. For an unweighted graph, the absolute value of the energy is proportional to the number of edges running between S and K, so the problem is directly addressable by a HNN.34 In that case, however, the global minimum of the energy function is to be found, while the conventional operation of a Hopfield neural network would yield dead ends in the local minima of the energy landscape. This problem can be eliminated by introducing proper stochasticity to the network, such as a finite noise, which helps escape from the local minima and is reduced as the states evolve toward the global solution [see the illustration in Fig. 1(c)].
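For completeness, the C number of cuts used as a figure of merit later in the paper can be evaluated directly from the adjacency matrix and the state vector of Eq. (4). The short sketch below is our own illustration (the sign convention of the weight matrix has to be chosen such that minimizing Eq. (3) indeed maximizes this quantity):

```python
import numpy as np

def cut_size(x, A):
    """Number of edges crossing the partition encoded by x (elements +1/-1).

    An edge (i, j) is cut iff x_i != x_j, i.e., (1 - x_i*x_j)/2 = 1;
    A is the symmetric, unweighted adjacency matrix of the graph.
    """
    return int(np.sum(np.triu(A, k=1) * (1 - np.outer(x, x))) / 2)
```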

The so-called crossbar structure is a popular scheme for physical matrix realization via memristors.2,36 As shown in Fig. 1(d), it is essentially a set of horizontal and vertical wires (word and bit lines) with a memristor placed at each meeting point of the lines. Operating this arrangement in the linear, subthreshold regime of the memristive units, the output current vector at the bit lines is obtained as the product of the input voltage vector at the word lines and the conductance matrix of the memristors at the crosspoints, Ij = ∑iGi,j · Vi. Once the proper conductance weights are programmed to the crossbar, the vector-matrix multiplication is performed on the hardware level within a single clock-cycle. This scheme is also implementable for Hopfield neural networks, where the diagonal values of the Gi,j conductance matrix are zero due to the lack of self-connections in the HNN.

The special case of the max-cut problem is mathematically formulated by a weight matrix, which is “1” or “0” if the proper vertices are connected or non-connected. This problem can be mapped to a memristive HNN by setting a constant GON conductance and a GOFF ≪ GON conductance instead of the 1 and 0 values, respectively. The xi binary state vector elements are represented by Vi(t) = xi(t)·|V| input voltages at the crossbar word lines. This scheme was experimentally realized in Ref. 24 using memristive HNNs up to 60 × 60 matrix sizes.
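A possible sketch of this mapping is given below; the helper name and the 1T1R-based exclusion of the diagonal are our own illustrative assumptions:

```python
import numpy as np

def conductance_matrix(W_binary, G_on, G_off):
    """Map a binary {0, 1} weight matrix to memristor conductances.

    '1' -> G_on, '0' -> G_off; the diagonal (no self-connections) is forced
    to zero, e.g., by leaving those cells unformed or by switching off
    their selector transistor in a 1T1R arrangement.
    """
    G = np.where(W_binary == 1, G_on, G_off).astype(float)
    np.fill_diagonal(G, 0.0)
    return G

# bit-line currents (activations) for state vector x and read voltage |V|:
# I_j = sum_i G_ij * V_i  ->  I = G.T @ (x * V_read)
```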

Figures 1(e) and 1(f) demonstrate the key device non-idealities in a memristive Hopfield neural network: the temporal stochastic variation (noise) of the programmed GON and GOFF conductances (panel f), as well as the programming inaccuracy, i.e., the device-to-device variation of the time-averaged Ḡi,j(t) conductances for the memristor cells programmed to the same ON/OFF binary state. These non-idealities are measured by the temporal variance (ΔGdynamic) and the device-to-device variance (ΔGstatic) around the ideal GON and GOFF values. It is noted that GON can be chosen arbitrarily in the mapping of the “1” and “0” synaptic weights to the GON and GOFF conductances; however, a finite GOFF already represents a device non-ideality, which may modify the operation of the network. Furthermore, finite wire resistances or nonlinear I(V) characteristics may also be considered non-idealities;24 however, these two non-idealities are not considered in our following analysis.

Among these non-idealities, noise plays a distinguished role, as it does not necessarily hamper network operation; on the contrary, a properly tailored noise may help find the global solution. However, the injection of external stochasticity might become necessary once the internal noise of the memristor elements in the crossbar array is not large enough for optimal network operation. These possibilities are illustrated by the light green arrows in Figs. 1(a) and 1(d), representing external noise injection, and by the dark green arrow in Fig. 1(a), representing the introduction of chaotic behavior via negative self-feedback of the neurons.24,30

In the following, we analyze the realistic noise characteristics of various memristive systems (Fig. 2), which are considered key ingredients of the memristive HNNs’ operation.

FIG. 2.

(a) Relative conductance noise as a function of device conductance for a variety of memristive materials. The data for Ag2S (green), Ta2O5 (red), and Nb2O5 (orange) memristive devices are reproduced from Refs. 26–28, recalculating the integrated noise amplitudes for the [fA, fB] = [10 Hz, 250 kHz] band. For these material systems, the validity of the diffusive noise model with volume-distributed fluctuators was verified in the high conductance regime; the lines with the corresponding colors represent the best fitting trends with the corresponding γ = 3/2 exponent.26–28 At somewhat smaller conductances, a rather narrow ballistic conductance region is observed with a significantly shallower γ = 1/4 exponent.26–28 Finally, in the sub-conductance-quantum interval, a broken filamentary regime is observed with constant relative noise level, which is best resolved for the Ag2S system.27 The blue data represent new measurements on graphene-SiOx-graphene lateral devices, using the sample preparation protocol as in Refs. 29 and 37. Here, the switching relies on the voltage-controlled transitions between well-conducting crystalline and poorly conducting amorphous regions.38,39 The low conductance constant and high conductance ∼G^(−γ), γ = 1.13 dependencies are clearly seen for this system, spanning a 5 orders of magnitude (3 orders of magnitude) range along the conductance (relative noise) axis. The GC crossover conductance is well below G0, indicating a barrier-like component even in the metallic regime. As a reference, the black solid and dashed lines illustrate the integrated contribution of the thermal noise background at V = 1 V and fB = 250 kHz and fB = 1 GHz, respectively. (b) Illustrative Lorentzian (blue), 1/f-type (pink), and mixed (purple) noise spectra measured on Ta2O5 devices. The bottom curve is on true scale, while the middle and top curves are artificially shifted upward by one and two orders of magnitude. (c) Proposed noise model: with constant relative noise in the barrier-like regime (G < GC) and ∼G^(−γ) relative noise in the metallic nanojunction regime (G > GC), the GOFF (red circle) and GON (blue circle) conductances can be set to arbitrary positions along the noise model.

The SI(f) spectral density of the current noise is defined as the (ΔI)^2|f0,Δf mean squared deviation of the current within a narrow Δf band around the central frequency f0, normalized to the bandwidth, SI(f0) = (ΔI)^2|f0,Δf / Δf, but practically SI(f) is calculated from the absolute value squared of the Fourier transform of the measured I(t) fluctuating current signal.27 In memristive devices, SI(f) typically exhibits a Lorentzian spectrum [blue curve in Fig. 2(b)], a 1/f-type spectrum [pink curve in Fig. 2(b)], or a mixture of these two [purple curve in Fig. 2(b)]. In the former case, the noise is dominated by a single fluctuator introducing a steady-state resistance fluctuation with a typical time constant τ0, yielding a spectrum that is constant at 2πf < 1/τ0 and decays as 1/f^2 at 2πf > 1/τ0.27 If multiple fluctuators with different time constants contribute to the device noise, the Lorentzian spectra of the individual fluctuators sum up to a spectrum with SI ∝ f^(−β), where β is usually close to unity [pink noise, pink curve in Fig. 2(b)].27 Alternatively, a single fluctuator positioned at the device bottleneck may dominate the device noise, but a larger ensemble of more remote fluctuators may also make a significant contribution.27 This situation yields a mixture of Lorentzian and 1/f-type noise [purple curve in Fig. 2(b)]. Without any steady-state resistance fluctuations, a finite thermal noise is still observed, the latter exhibiting a constant (frequency-independent) spectrum (white noise) with SI = 4kBTG, where kB is Boltzmann’s constant and T is the absolute temperature. Integrating the current noise for the frequency band of the measurement, the mean squared deviation of the current is obtained in this band, (ΔI)^2 = ∫_{fA}^{fB} SI(f) df.

At low enough subthreshold voltages, the memristive conductances exhibit steady-state fluctuations, i.e., the applied voltage is only used for the readout of the noise, but it does not excite any fluctuations. In this case, (ΔI)^2 = (ΔG)^2 · V^2 holds according to Ohm’s law, i.e., the voltage-dependent current fluctuation is not a good measure of the noise properties. The ΔI/I relative current fluctuation, however, is already a voltage-independent metric of the fluctuations, which equals the relative fluctuation of the conductance or the resistance in the linear regime, ΔI/I = ΔG/G = ΔR/R.27 This metric will be used throughout the paper to describe the noise characteristics, where (ΔG/G)dynamic describes the relative temporal fluctuations of a certain element of the memristor conductance matrix Gi,j(t). It is noted that (ΔG/G)dynamic depends on the bandwidth. The high-frequency cutoff is determined by the integration time of the current readout (τreadout = 2 µs in our simulation, yielding fB = 1/(2τreadout) = 250 kHz), whereas the fA = 10 Hz bottom end of the frequency band is determined by the time-period for which the network is operated (0.1 s in our simulation, corresponding to 10 000 iteration steps and a 4τreadout waiting time between the current readout events, simulating the finite time of the neural updates and the multiplexing). Increasing the number of iteration steps would naturally increase (ΔG/G)dynamic, but the strength of this dependence is determined by the nature of the noise spectrum. In the case of a Lorentzian spectrum, the noise amplitude does not really depend on the bandwidth once the 1/(2πτ0) characteristic frequency of the fluctuator is well inside the band. This is consistent with the (ΔG)^2 ∝ arctan(2πfτ0)|_{fA}^{fB} relation for the Lorentzian spectrum. The other experimentally relevant 1/f-type spectrum yields (ΔG)^2 ∼ ln(fB/fA), which is also a very weak dependence on the bandwidth, yielding only a 30% increase in (ΔG/G)dynamic once the number of iteration steps is increased from 10^4 to 10^7, i.e., the above bandwidth is increased by three orders of magnitude. According to these considerations, the results of our following simulations are rather weakly dependent on our specific choice of bandwidth.
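The quoted 30% figure follows directly from the (ΔG)^2 ∼ ln(fB/fA) relation, as the short check below illustrates (assuming that extending the run from 10^4 to 10^7 iterations lowers fA from 10 Hz to 0.01 Hz while fB stays at 250 kHz):

```python
import numpy as np

# (Delta G)^2 ~ ln(f_B / f_A) for a 1/f-type spectrum
f_B = 250e3                        # high-frequency cutoff (Hz)
f_A_short, f_A_long = 10.0, 0.01   # 1e4 vs. 1e7 iteration steps
growth = np.sqrt(np.log(f_B / f_A_long) / np.log(f_B / f_A_short))
print(f"(dG/G)_dynamic grows by a factor of {growth:.2f}")  # ~1.30, i.e., ~30%
```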

We also note that the above operation bandwidth was chosen to overlap the frequency range of the noise measurements. In principle, a significantly faster operation is also realistic, where fB approaches GHz frequencies.24 Considering this enhanced bandwidth but keeping the number of 10 000 iteration steps fixed, we arrive at a frequency range where noise data are not available. In this range, it is hard to predict the noise properties for a memristive junction with Lorentzian character in the original frequency interval, as additional high frequency fluctuators might become relevant and a 1/f-type contribution might dominate over the Lorentzian spectrum above a certain corner frequency. For a 1/f-type spectrum, however, it is reasonable to extrapolate the 1/f-type character to significantly higher frequencies. With this assumption, the (ΔG)^2 ∼ ln(fB/fA) relation shows that (ΔG/G)dynamic does not depend on the high-frequency cutoff but only on the fB/fA ratio, i.e., a GHz-range network would suffer from the same noise levels as the much slower simulated network (fB = 250 kHz) once the same number of iterations is applied. On the other hand, the integrated contribution of the thermal noise background becomes more relevant at increasing frequencies: (ΔG)^2 = (ΔI)^2 / V^2 ≈ 4kBTG · fB / V^2. This relation yields ΔG/G = (1/V)·√(4kBTfB/G), which is plotted as a reference in Fig. 2(a) using V = 1 V and fB = 250 kHz (black solid line) and fB = 1 GHz (black dashed line). This comparison shows that in the original frequency range, the thermal noise background is orders of magnitude below the typical 1/f-type noise levels, and even at the envisioned GHz frequencies, the 1/f-type noise would slightly dominate over the thermal background once its extrapolation to the high-frequency range is valid.

Several studies have pointed out that the relative noise amplitude of a memristive device exhibits a strong and specific dependence on the device conductance, i.e., the multilevel programmability is accompanied by the tuning of the relative noise level.26,28,40–48 Figure 2(a) shows four examples of this behavior, demonstrating the conductance-dependent noise characteristics of Ag2S (green),27,28 Ta2O5 (red),26 Nb2O5 (orange),26 and SiOx (blue) memristive units integrated for the same 10 Hz–250 kHz frequency band. It is clear that the overall noise amplitude, the characteristic conductance range of the operation, as well as the dependence of ΔG/G on the conductance, are a kind of device fingerprint, exhibiting significant differences between various material systems. However, a rather general trend of the noise characteristics can be identified: in the low-conductance region of the operation regime, ΔG/G is very weakly dependent on the conductance, whereas in the high conductance region, a strong ΔG/G ∝ G^(−γ) power-law dependence is typical.

In the latter case, metallic filamentary conduction is envisioned, where the relative noise amplitude obviously increases as the filament diameter is reduced.27 A rather generally observed tendency is related to the volume-distributed fluctuators in a diffusive filament, where ΔG/G ∝ G^(−3/2) was obtained from theoretical considerations.26,28 This is also confirmed by the experimental data in Fig. 2(a), where the validity of the γ = 3/2 exponent was verified for the Ag2S, Ta2O5, and Nb2O5 systems,26,28 whereas our new data on SiOx memristors exhibit a somewhat shallower dependence with γ = 1.13 (see the dashed lines representing the best fitting tendencies with the given γ exponents). It is noted, however, that the γ exponent may depend on the transport mechanism, the device geometry, the dimensionality (2D/3D devices), as well as the distribution of the fluctuators (single or multiple fluctuators, surface or volume distributed fluctuators, etc.).27

In contrast, the saturated noise characteristics in the low conductance regime are attributed to broken filaments, where a barrier-like transport is envisioned. In the simplest case of a tunnel barrier, the G = A · exp(−α · d) relation yields a conductance-independent ΔG/G = α · Δd relative conductance noise for a constant Δd fluctuation of the barrier width.27 More complex transport phenomena, such as the Frenkel–Poole mechanism49 or a hopping-type transport,50 require more sophisticated descriptions, but the overall trend, i.e., the independence or the very weak dependence of ΔG/G on G, is left unchanged due to the exponential dependence of the conductance on a relevant fluctuating parameter.

According to these considerations, in the following simulations we rely on a simplified noise model [see Fig. 2(c)], where ΔG/G is constant below a certain threshold conductance GC [see the red barrier-like regime in Fig. 2(c)], whereas a general ΔG/G ∝ G^(−γ) power-law dependence is considered at G > GC [see the blue metallic nanojunction regime in Fig. 2(c)]. The GOFF and GON conductances of the memristive HNN can be fixed at arbitrary positions along this noise model, as demonstrated by the red and blue circles in Fig. 2(c). This simplified model has three free parameters: the GC threshold, the γ slope, and the ΔG/G relative fluctuation in the barrier-like regime. Note that according to the experimental results in Fig. 2(a), the latter can reach a few tens of percent; GC is not necessarily but reasonably close to the G0 = 2e^2/h conductance quantum unit, whereas the variation of ΔG/G can span up to three orders of magnitude in the metallic nanojunction regime. For the exponents, γ = 1.13–1.5 values are observed; however, we emphasize that fundamentally different slopes are also possible.27
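The noise model can be summarized in a few lines; the sketch below is our own illustration of the three-parameter model with assumed example values (it is not taken from the original simulation code):

```python
import numpy as np

G0 = 7.748e-5  # conductance quantum 2e^2/h in siemens

def relative_noise(G, G_C=G0, dG_over_G_barrier=0.3, gamma=1.5):
    """Simplified noise model of Fig. 2(c).

    Constant relative noise below G_C (barrier-like regime) and a
    ~G^(-gamma) power law above G_C (metallic nanojunction regime).
    The default parameter values are illustrative assumptions.
    """
    G = np.asarray(G, dtype=float)
    return np.where(G < G_C,
                    dG_over_G_barrier,
                    dG_over_G_barrier * (G / G_C) ** (-gamma))
```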

We have simulated memristive HNNs with realistic noise characteristics relying on the standardized Biq Mac Library,51 which provides exact globally optimal energies for Max-Cut instances of undirected and unweighted graphs of sizes n ∈ {60, 80, 100}. Following the results of Ref. 24, we studied Erdős–Rényi graphs with a 50% connection probability between the vertices. We have simulated the HNNs starting from K = 200 randomly picked initial state vectors and performing N = 10 000 iterations for each epoch. The neurons were iterated in a predetermined random order. The K runs are evaluated according to two figures of merit: the proportion of runs where the network was in the globally optimal state at the Nth step (Pconv convergence probability), and the number of edges between the two subsets (i.e., the C number of cuts) after N iteration steps averaged over the K random initial vectors, C̄ = (1/K) Σ_{i=1}^{K} C(x̲_i(N)).

Instead of the ideal “1” and “0” values of the Wi,j weight matrix, realistic Gi,j conductances of the memristive HNN were used in the simulations. At the matrix positions with a value of “1” in the original problem, an average conductance of GON was applied, considering both (ΔG/G)static device-to-device variations and (ΔG/G)dynamic temporal fluctuations around this mean value. To simulate the latter, independent G(t) time traces were generated for all the memristive elements in the ON state using either the Lorentzian, pink, or white noise spectrum. Carson’s theorem and method were applied52 to generate the G(t) temporal noise traces (i.e., temporal conductance variations) from the chosen SG(f) noise spectrum.
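The noise-trace generation can also be sketched with a generic frequency-domain shaping approach (shown below as our own simplified illustration; the actual simulations used the Carson-method generation cited above, and the 10 µs sampling interval per iteration follows from the 0.1 s / 10^4 steps timing discussed in Sec. III):

```python
import numpy as np

def colored_noise_trace(n_samples, dt, spectrum, rng):
    """Zero-mean, unit-variance noise trace whose PSD follows `spectrum(f)`.

    Simplified frequency-domain shaping with random phases; the trace is
    later rescaled to the desired (dG/G)_dynamic times the mean conductance.
    """
    freqs = np.fft.rfftfreq(n_samples, dt)
    amplitude = np.zeros_like(freqs)
    amplitude[1:] = np.sqrt(spectrum(freqs[1:]))   # skip the f = 0 bin
    phases = rng.uniform(0.0, 2.0 * np.pi, len(freqs))
    trace = np.fft.irfft(amplitude * np.exp(1j * phases), n=n_samples)
    return trace / np.std(trace)

rng = np.random.default_rng(0)
dt = 10e-6                                         # one sample per iteration
pink = colored_noise_trace(10_000, dt, lambda f: 1.0 / f, rng)
lorentzian = colored_noise_trace(
    10_000, dt, lambda f, tau0=1e-3: 1.0 / (1 + (2 * np.pi * f * tau0) ** 2), rng)
```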

According to the HNN scheme, the diagonal elements of the crossbar matrix (self-connections) are set to exactly zero. This can be physically implemented by omitting the memristors at the diagonal positions, either by switching off their transistor in a 1T1R arrangement53–55 or by omitting their electroforming procedure. In Sec. IV F, however, we also discuss the case of finite self-feedback, which introduces a chaotic nature to the network.

The off-diagonal elements of the weight matrix with values of “0” are generally represented by the GOFF conductance with the corresponding relative conductance noise. In some of the simulations, however, GOFF = 0 is applied according to the following considerations.

Finally, we note that our simulations place emphasis on the noise properties of the network, and otherwise the network operation is simulated as an ideal mathematical HNN according to the update and activation rules in Eqs. (1) and (2), using θi = 0 for the max-cut problem. This means that the xi neural state vector is represented by the Vi input voltage vector of the memristive HNN, and the ai activation vector is represented by the output current vector, which is the product of the input voltage vector and the Gi,j conductance matrix. The latter includes the above discussed realistic conductance and noise properties. In the simulations, numerical input voltage levels of ± 1 V are applied; however, we note that both the relative noise levels and the results of the HNN operation are invariant to the applied voltage once the external parameters (external noise amplitude, strength of the self-feedback, and the θj threshold values) are properly scaled with the voltage. This idealized network operation with realistic conductance and noise characteristics can be considered an idealized circuit model, where the wire resistances as well as the resistances of the selector elements and the output and input resistances of the word line drivers and current sensing circuits at the bit lines are neglected, and a fully linear operation is considered for all the components of the memristive crossbar array. We believe that this choice of simplification facilitates the fundamental understanding of the role of internal noise in memristive HNNs; nevertheless, the proposed simplified noise model can also be implemented in more realistic circuit simulations as long as steady state noise in the linear transport regime is considered. Outside of this regime, i.e., if either significant nonlinearities appear in the I(V) curves of the memristive elements at the applied voltages or if the applied voltages excite additional fluctuations compared to the steady-state noise, nonlinear noise spectroscopy measurements would be required to realistically describe the voltage dependent noise properties.27,29

In the following, we provide simple considerations on the relative current and current fluctuation contributions of the ON and OFF state elements, which help identify the most relevant contributions.

  • Relative current contribution of OFF-state memristive elements in the crossbar. For bit line j, the number of “1” values in the original weight matrix is denoted by dj, yielding an average value of d̄j = (n − 1)·pc according to the random connection probability between the n vertices, for which pc = 0.5 is applied in the following. The current contribution of the ON and OFF state elements in a certain bit line j, however, also depends on the distribution of the “+1” and “−1” values in the xi state vector, which varies along the operation. For the ensemble average of the adjacency matrices with the same random connection probability, however, an ensemble-averaged current can be calculated
    $$\bar{I}_j = \underbrace{\sum_i x_i\,|V|\,p_c\,G_{ON}}_{\bar{I}_{ON,j}} + \underbrace{\sum_i x_i\,|V|\,(1-p_c)\,G_{OFF}}_{\bar{I}_{OFF,j}},$$
    (5)
    from which the ĪOFF,j/ĪON,j = (GOFF/GON)·(1 − pc)/pc ratio gives an indication of the OFF and ON state memristors’ relative current contribution in column j. For the special case of pc = 0.5, this simplifies to ĪOFF/ĪON = GOFF/GON. This demonstrates that at a large enough ON/OFF conductance ratio (e.g., GON/GOFF > 100), the replacement of GOFF by zero is a reasonable simplification for a densely connected graph. Later on (Sec. IV E 2), we numerically analyze how a non-zero GOFF value modifies the network operation at moderate ON/OFF conductance ratios.
  • Relative noise contribution of OFF-state memristive elements in the crossbar. Whereas the current in a certain bit line strongly depends on the actual xi state vector values, the mean squared deviation of the current is independent of that and can be exactly deduced once the dj number of ON-state elements in column j is known
    $$(\Delta I)_j^2 = \sum_{i\,(i\neq j)} (\Delta G)_{j,i}^2\,|V_i|^2 = \underbrace{(\Delta G)_{OFF}^2\,(n-d_j-1)\,|V|^2}_{(\Delta I)_{OFF,j}^2} + \underbrace{(\Delta G)_{ON}^2\,d_j\,|V|^2}_{(\Delta I)_{ON,j}^2}.$$
    (6)
    From this, the relative noise contributions of the OFF and ON state elements in bit line j can be calculated considering our simplified noise model [Fig. 2(c)]. First, we treat the mixed barrier-like and metallic regime, where GON > GC > GOFF, yielding
    $$\frac{\Delta I_{OFF,j}}{\Delta I_{ON,j}} = \frac{G_{OFF}}{G_C}\left(\frac{G_C}{G_{ON}}\right)^{1-\gamma}\sqrt{\frac{n-d_j-1}{d_j}}.$$
    (7)
    Note that the square-root term gives unity once dj is replaced by its average value at 50% connection probability. This formula yields negligible OFF-state noise contribution for arbitrary γ once GOFF is chosen deep in the barrier-like regime (GOFF/GC ≪ 1), whereas GON remains reasonably close to GC.

It is worth discussing another limit as well, where the entire crossbar is operated in the metallic nanojunction regime [see Fig. 2(c)], i.e., GON > GOFF > GC (pure metallic regime). In the metallic nanojunction regime, the memristive elements exhibit much more linear subthreshold I(V) characteristics than in the barrier-like regime, which is a favorable property for the high-precision vector-matrix multiplication operation of the memristive crossbar. This limit yields
$$\frac{\Delta I_{OFF,j}}{\Delta I_{ON,j}} = \left(\frac{G_{OFF}}{G_{ON}}\right)^{1-\gamma}\sqrt{\frac{n-d_j-1}{d_j}},$$
(8)
emphasizing the dominance of the OFF-state elements’ noise contribution at any γ > 1 value, i.e., for all the memristive units demonstrated in Fig. 2(a). Furthermore, in this pure metallic operation regime, the GON/GOFF conductance ratio is restricted to rather limited values, spanning one order of magnitude in the diffusive regime of Ag2S, Ta2O5, and Nb2O5 memristors and less than two orders of magnitude in SiOx memristors [see Fig. 2(a)], which may distort the network operation compared to networks operated with orders of magnitude larger GON/GOFF ratios in the mixed barrier-like and metallic regime [Eq. (7)]. The choice of the operation regime is thus a trade-off between high-precision linearity and the proper representation of the “0” values in the weight matrix. In the following subsections, we discuss the results of our simulations for both operation regimes using max-cut benchmarks with 50% connection probability and a noise model with γ = 3/2. Note, however, that the above formulas are also appropriate to discuss more general situations, including arbitrary conductances and γ scaling exponents. Furthermore, arbitrary graphs (i.e., any dj and n values) can also be analyzed, where the replacement of the applied dense graph with a sparse graph (dj/n ≪ 1/2) would yield a further enhancement of the OFF noise contribution.
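The two limits of Eqs. (7) and (8) can be evaluated with the simplified noise model as sketched below (our own illustrative helper, covering only the two regimes discussed above):

```python
import numpy as np

def off_on_noise_ratio(G_off, G_on, G_C, gamma, n, d_j):
    """Delta I_OFF,j / Delta I_ON,j for bit line j, Eqs. (7) and (8)."""
    size_factor = np.sqrt((n - d_j - 1) / d_j)
    if G_off < G_C <= G_on:                       # mixed regime, Eq. (7)
        return (G_off / G_C) * (G_C / G_on) ** (1 - gamma) * size_factor
    if G_C <= G_off <= G_on:                      # pure metallic regime, Eq. (8)
        return (G_off / G_on) ** (1 - gamma) * size_factor
    raise ValueError("purely barrier-like case (G_on < G_C) is not treated here")
```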

We have simulated max-cut problems using graphs with 50% connection probability and different sizes (n = 60, 80, 100, 300). First, the mixed barrier-like and metallic operation regime is analyzed [Eq. (7)] with GOFF/GON ≪ 1, and according to the above considerations, GOFF = 0 was chosen, whereas a finite GON value with variable noise was applied. The time-averaged conductance was the same for all the ON elements, i.e., (ΔG/G)static = 0 was applied. Simulations were run with three different noise types: Lorentzian noise (blue symbols in Fig. 3), 1/f noise (pink symbols), and white noise (gray symbols). The related noise spectra and time traces generated from these spectra are shown in Figs. 3(c) and 3(d). For all three spectra, the (ΔG/G)dynamic metric was used to measure the relative noise.

FIG. 3.

Simulation results for test problem g05_60.051 with different dynamic noise spectra at various constant noise levels. (a) Convergence probability as a function of dynamic noise level for the three noise types. White noise, pink noise, and Lorentzian noise are, respectively, marked by gray, pink, and blue symbols in all panels. (b) C̄ as a function of dynamic noise level for the three noise types. (c) and (d) Noise spectra and example G(t) traces generated from these spectra for the three noise types. (e) Optimal noise level for various max-cut problems using graphs with randomly generated 50% connection probabilities and sizes of 60 × 60 (circles), 80 × 80 (pentagons), 100 × 100 (stars), and 300 × 300 (plus symbol).

Figures 3(a) and 3(b) demonstrate the Pconv convergence probability and the C̄ average number of cuts after N iteration steps for the 60 × 60 benchmark max-cut problem also applied in Ref. 24. The red line in panel (b) shows the maximum number of cuts, i.e., the global solution to the problem.

It is clear that at zero noise level, the convergence probability is poor (Pconv ≈ 1.5%), and the achieved number of cuts is far from the global solution. As the relative amplitude of the dynamic noise is increased, the convergence probability [Fig. 3(a)] exhibits a stochastic resonance phenomenon similar to the results of Ref. 24: irrespective of the noise color, Pconv shows a peak at (ΔG/G)dynamic ≈ 13.8%, leading to a Pconv ≈ 40%–50% chance of convergence. This implies that at a lower noise level, the system sticks to local minima, which prevents the convergence to the global solution, whereas at a high noise level, the system is able to escape from the global minimum, which also hampers the convergence. As an interesting conclusion, however, the results of the simulation are very similar for the different noise colors, i.e., the temporal correlations in the noise spectra are irrelevant, and (ΔG/G)dynamic seems a proper, noise-type independent metric to find the optimal noise level. This also allows the simplification of the simulations by easily generating white noise spectra. Furthermore, it is emphasized that the best (ΔG/G)dynamic ≈ 13.8% noise level corresponds to the top end of the experimentally observed relative noise values [Fig. 2(a)], i.e., the experimentally relevant noise levels do not hamper the network operation; on the contrary, they might not even be sufficient to realize the optimal noise level if stochasticity is solely introduced by the noise of the memristor elements of the crossbar matrix.

We have repeated these simulations for numerous benchmark problems from the Biq Mac library spanning matrix sizes of 60 × 60 [circles in Fig. 3(e)], 80 × 80 (pentagons), and 100 × 100 (stars), using white, pink, and Lorentzian spectra (gray, pink, and blue symbols). At the larger matrix sizes, only white noise was applied. For these problems, the symbols in Fig. 3(e) represent the relative noise values where the convergence probability is maximal. Furthermore, we have generated an even larger weight matrix (300 × 300, “+” symbol in the last column). Here, the global solution is not known; therefore, the symbol represents the noise value where C̄ is maximal. Whereas the convergence probability and C̄ strongly vary for the different problems, the optimal noise level scatters around a common ΔG/G = 13.2% average value (horizontal solid line) with a small variance of 2.6% (horizontal dashed lines). This analysis does not show any systematic tendencies as a function of the matrix size; even the largest matrix with 90 000 memristor elements exhibits optimal operation close to this average value. We note that the system size dependence of the optimal noise level was analyzed in Ref. 24 as well. However, in the latter analysis, the current noise of the entire bit line was considered. The system-size independent optimal noise level of the individual devices [Fig. 3(e)] yields a bit line current variance scaling with the square root of the array size due to the ΔIj ∝ √dj relation [see Eq. (6)], i.e., the results of Ref. 24 on the optimal noise level [Fig. 5(c) in Ref. 24] are consistent with our observations.

The greatest challenge with randomization algorithms is that stochasticity helps to escape local minima, but there is no guarantee that the system will stay at the global minimum once it is first reached. A common approach to overcome this difficulty is to “cool” the system, i.e., gradually decrease the extent of stochastic behavior during the optimization process. For an experimentally realized HNN, the straightforward method is to harvest and tune the inherent device noise utilizing the multilevel programmability of the conductance states.

According to the work of Cai et al.,24 the optimal trend for the cooling process in a HNN is superlinear. We have implemented this cooling scheme in our simulations, applying a parameterless superlinear annealing protocol to the stochastic variation of the conductance,
$$G_{anneal}(t) = \log_{10}\!\left(10 - \frac{9t}{N}\right)\,G(t),$$
(9)
where the G(t) noise signal is generated according to the chosen spectrum and the initial ΔG/G value, and the noise signal is accordingly attenuated as the iterations evolve. The resulting temporal decrease in the noise amplitude is illustrated by the pink curve in Fig. 4(d).
FIG. 4.

Simulation results for test problem g05_60.051 using annealed pink dynamic noise. (a) Convergence probability as a function of dynamic noise level for the different schemes: constant noise (pink line), continuous annealing (pink circles), and double-step annealing (orange squares). (b) C̄ as a function of the dynamic noise level for the different schemes [same colors as in (a)]. (c) Operation scheme with annealing. A memristor has two operational regimes based on dynamic noise. The matrix elements representing zero are set to the far OFF state, giving an essentially zero contribution, whereas matrix elements representing one are programmed to an ON state at the desired initial noise level. During the N = 10^4 steps for each of the K = 200 starting vectors, the system’s ON state is gradually reprogrammed to a lower dynamic noise level. (d) Example Ganneal(t) noise signals for the continuous logarithmic and the discrete double-step annealing schemes.

As illustrated in Fig. 4(c), the OFF-state conductance (red dot) is chosen deep in the barrier-like regime (and accordingly, GOFF = 0 is applied), while the ON-state (blue dot) is prepared in the metallic regime with non-zero dynamic noise. During the N steps, the blue dot is moved toward higher conductances so that the relative dynamic noise gradually decreases [Fig. 4(d)]. All simulations were made using experimentally motivated pink noise.

The results achieved by this continuous annealing protocol are shown as pink circles in Figs. 4(a) and 4(b). Here, the horizontal axis represents the initial noise value. To compare this annealing scheme to the network operation with constant noise, the corresponding results from Figs. 3(a) and 3(b) are reproduced as pink lines. It is clear that the annealing procedure started at a high enough noise level delivers significantly better convergence probability than the constant noise simulation using the optimal noise level, which is consistent with the observations in Ref. 24. However, the results plotted in Figs. 4(a) and 4(b) demonstrate an unexpected phenomenon: if the annealing is started from a noise level at or below the optimal 13.8% constant noise level, the convergence probability no longer shows any improvement compared to the corresponding constant noise simulation. This implies an important conclusion: it is not vital to decrease the noise level well below the optimal noise level during the annealing process; however, it is beneficial if the annealing is started at a higher noise level than the optimal constant noise level. In other words, the optimal constant noise level does not cause a significant escape probability from the global solution; however, an initially higher noise level helps to escape from the local minima, driving the system more efficiently toward the global solution.

Utilizing this finding, we propose a highly simplified double-step annealing protocol [orange illustrations in panels (c) and (d)], where the noise level is decreased to 2/3 and then to 1/3 of its initial value after one third and two thirds of the iteration steps, respectively. According to panels (a) and (b), this simplified annealing protocol (orange symbols) delivers similar results to continuous annealing. This is highly beneficial for the network operation, as continuous noise annealing would be a demanding task due to the frequent reprogramming of all memristive cells. The double reprogramming along all the iteration steps is a reasonable trade-off between the time-consuming continuous annealing and the constant noise operation, where the convergence probability is worse and it is unrealistic to precisely know the optimal noise level in advance.
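The two attenuation schedules can be compared with the short sketch below (our own illustration; the continuous factor assumes the log10(10 − 9t/N) form of Eq. (9)):

```python
import numpy as np

def continuous_anneal(t, N):
    """Superlinear attenuation factor of Eq. (9): 1 at t = 0, 0 at t = N."""
    return np.log10(10 - 9 * t / N)

def double_step_anneal(t, N):
    """Simplified double-step protocol: the initial noise amplitude is kept
    in the first third of the run, then reduced to 2/3 and 1/3 of its value."""
    if t < N / 3:
        return 1.0
    if t < 2 * N / 3:
        return 2.0 / 3.0
    return 1.0 / 3.0
```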

Finally, we note that a single weight update in a 60 × 60 crossbar array requires 3600 write operations, which is expected to take as much time as, or even more than, the overall 10 000 iteration steps of the network operation. This means that any annealing protocol requiring many more weight-update steps than the above simplified double-step annealing would spend significantly more time on the write operations of the weight updates than on all the current readout and neural state update operations, i.e., on the actual execution.

In the discussion of Eq. (8), we have seen that the noise of the OFF-state elements dominates the network once both the OFF and the ON states are positioned in the metallic nanojunction regime, which is described by a γ > 1 exponent. Next, we analyze the network operation in this pure metallic regime by varying the dominant ΔGOFF/GOFF relative noise level using GON/GOFF = 10 [purple symbols in Figs. 5(a) and 5(b)] and GON/GOFF = 100 [orange symbols in Figs. 5(a) and 5(b)] conductance ratios and γ = 3/2. Here, the ON state noise level is also simulated according to the scaling in Eq. (7). In this case, the convergence probability and C̄ [Figs. 5(a) and 5(b)] exhibit significantly worse results even at the highest 30% relative noise than the optimal network operation in Figs. 3(a) and 3(b) at the 13.8% relative ON state noise level. This result, however, is obvious from Eqs. (6) and (7). According to these formulas, arbitrary OFF and ON state noise levels along the noise model can be converted to an equivalent situation where the OFF elements are noiseless but the equivalent relative ON-state noise, (ΔGON/GON)equivalent, is set such that the overall current noise of the given bit line remains the same. According to Eq. (7), the γ = 3/2 and pc = 0.5 parameters yield (ΔGON/GON)equivalent = (ΔGOFF/GOFF)/√(1 + GON/GOFF). In Figs. 5(c) and 5(d), the results in Figs. 3(a), 3(b), 5(a), and 5(b) are plotted as a function of the equivalent ON-state noise level, demonstrating that the curves indeed follow the same tendency. From this, we can conclude that the pure metallic nanojunction regime yields the dominance of the OFF-state elements’ noise; however, a given OFF-state noise corresponds to a significantly smaller equivalent ON-state noise, i.e., even the largest 30% OFF-state noise is too small to reach optimal operation.
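The conversion can be checked numerically against the values quoted in the caption of Fig. 5 (a minimal sketch, assuming the formula above with γ = 3/2 and pc = 0.5):

```python
import numpy as np

def equivalent_on_noise(rel_noise_off, G_on_over_G_off):
    """Equivalent ON-state relative noise for gamma = 3/2 and p_c = 0.5."""
    return rel_noise_off / np.sqrt(1 + G_on_over_G_off)

# 30% OFF-state noise maps to ~0.09 and ~0.03 equivalent ON-state noise
# for G_ON/G_OFF = 10 and 100, respectively:
print(equivalent_on_noise(0.3, 10), equivalent_on_noise(0.3, 100))
```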

FIG. 5.

Simulation results for test problem g05_60.051 operating the noise model in the pure metallic regime and using pink noise. (a) and (b) Convergence probability and C̄ as a function of the relative dynamic OFF-state noise level (i.e., the dominant noise contribution) using GON/GOFF = 10 and GON/GOFF = 100 conductance ratios (pink and orange). (c) and (d) The same data rescaled to the equivalent relative ON-state noise level (see the text). The pink line reproduces the simulation using solely ON-state noise in the mixed barrier-like and metallic regime [pink data in Figs. 3(a) and 3(b)], which actually represents the GON/GOFF = ∞ limit. Note that the largest 30% relative OFF-state noise levels in panels (a) and (b) correspond to equivalent noise values of 0.1 and 0.03 for the GON/GOFF = 10 and GON/GOFF = 100 conductance ratios, respectively.

After a detailed analysis of the memristive HNNs’ noise properties and their impact on the network operation, we analyze the role of further device-non-idealities, such as the programming inaccuracy and the finite GOFF conductance.

1. Programming inaccuracy

In Figs. 6(a) and 6(b), we analyze the role of the (ΔG/G)static measure of the programming inaccuracy, i.e., the device-to-device variance of the time-averaged conductance normalized to the average conductance. Here, we also consider the mixed barrier-like and metallic regimes using the approximation of GOFF = 0, i.e., solely analyzing the programming inaccuracy of the ON-state conductances. The network’s operation for an increasing (ΔG/G)static is demonstrated in Figs. 6(a) and 6(b) with no dynamical noise (pink line) and constant pink noise at optimal amplitude (pink circles).

FIG. 6.

Convergence probability (a) and (c) and C̄ (b) and (d) as a function of the (ΔG/G)static device-to-device conductance variations (a) and (b) and the GOFF/GON conductance ratio at finite OFF conductance (c) and (d). Pink circles (lines) represent the results for the optimal equivalent dynamic noise level (zero dynamic noise level). Device-to-device variations are modeled by a Gaussian conductance distribution.

In the noiseless network, device-to-device variations up to 15% leave the poor noiseless network performance practically unchanged, whereas larger (ΔG/G)static makes the network operation even worse. Here, it is to be emphasized that static device-to-device variations seemingly produce a stochastic deviation of the bit line currents from the expected values of the original ideal HNN along the temporal evolution of the neural states.24 However, a finite (ΔG/G)static only deforms the weight matrix of the HNN; an ideal noiseless HNN is still realized. This means that a finite (ΔG/G)static with (ΔG/G)dynamic = 0 yields a modified ideal HNN, where the energy can only be reduced along the operation, yielding similar dead ends in the local minima as the original noiseless HNN. Therefore, device-to-device variations cannot serve as a resource for performance enhancement in the memristive HNN; either true stochasticity (noise) is required or a non-ideal HNN with somewhat chaotic energy trajectories should be realized. The latter is possible by the introduction of diagonal feedback (see Sec. IV F 2), and presumably nonlinear device characteristics also yield similar non-ideal chaotic behavior; the latter, however, is not analyzed in this paper.

It is also interesting to analyze the role of programming inaccuracy when it is accompanied by optimal dynamical noise characteristics. According to the pink circles in Fig. 6(a), already (ΔG/G)static > 0.025 yields a sharp decrease in the convergence probability. An annealed network would show a very similar decay of the convergence probability (not shown). Accordingly, proper programming accuracy is vital to the network. Such accuracy has already been experimentally demonstrated in memristors with 2048 distinct conductance levels (corresponding to 11-bit resolution), where a special denoising process was applied to maximize the programming accuracy.20 The states were programmed between 50 and 4144 μS with a 2 μS resolution, which roughly corresponds to ΔG/G ≈ 0.0005–0.04 (2 μS/4144 μS ≈ 0.0005 at the high-conductance end and 2 μS/50 μS = 0.04 at the low-conductance end), i.e., if the network is operated at the high-conductance end of this regime, the envisioned (ΔG/G)static < 0.025 condition [see Fig. 6(a)] is easily satisfied even at a much coarser conductance resolution.
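The quoted resolution estimate can be reproduced in a few lines; the numbers below simply restate the 2 μS programming resolution relative to the two ends of the 50–4144 μS conductance range reported in Ref. 20.

```python
# Quick check of the quoted relative-resolution estimate (numbers from Ref. 20).
resolution_uS = 2.0                    # programming resolution in microsiemens
g_min_uS, g_max_uS = 50.0, 4144.0      # programmed conductance range in microsiemens
print(resolution_uS / g_max_uS)        # ~0.0005 relative error at the high-conductance end
print(resolution_uS / g_min_uS)        # 0.04 relative error at the low-conductance end
```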

2. Finite OFF conductance

We have seen that, from the noise perspective, one can always find an equivalent picture where the OFF state is noiseless, i.e., in this sense, the partitioning of the noise between the ON and OFF elements is irrelevant; only the overall noise matters. However, even at zero noise, a finite OFF-state conductance may modify the network operation due to the imperfect representation of the zero states. To analyze this, simulations were run at different GOFF/GON values, ranging from 0 to 0.3, both with no dynamical noise and with constant pink noise at the optimal equivalent amplitude. The results are shown in Figs. 6(c) and 6(d). No significant change is detected in the noiseless network (pink line), but in the network with optimal equivalent noise, GOFF/GON > 0.1 already yields a moderate but significant reduction in the convergence probability (pink circles).
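A minimal sketch of how the imperfect zero states can be included in such a simulation is given below: binary weights are mapped to GON and GOFF = (GOFF/GON)·GON, so that the zero weights contribute a finite conductance to the bit-line currents. The mapping and the variable names are our assumptions, not the original simulation code.

```python
import numpy as np

def weights_to_conductances(W_binary, g_on=1.0, g_off_ratio=0.1):
    """Map binary weights to conductances: '1' -> G_ON, '0' -> G_OFF = g_off_ratio * G_ON."""
    return np.where(W_binary > 0, g_on, g_off_ratio * g_on)

def bitline_currents(G, x):
    """One-step vector-matrix multiplication of the conductance map with the neural state x."""
    return G @ x
```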

1. External injection of current noise

The above considerations have demonstrated that rather large, 11%–16% relative equivalent ON-state noise levels are required for the best network operation, which can be further boosted by annealing the noise from an even higher initial level. These noise levels are already at the border of the experimentally observed noise values, and especially in the pure metallic regime, it is hardly possible to reach the optimal noise level in the network. On the other hand, this also means that the network is easily set to an operation regime where the overall noise is definitely smaller than the optimal level, i.e., in this regime, external noise injection can be applied to optimize the stochastic operation. This scheme is demonstrated in Figs. 7(a) and 7(b) [see also Figs. 1(a) and 1(d)], where the light green arrows illustrate noise injection to the bit-line current from an external tunable noise source. Mathematically, this is represented by the proper modification of the update rule [Eq. (2)],
x_j(t+1) = \begin{cases} +1 & \text{if } a_j(t) \ge \theta_j + \xi_j(t), \\ -1 & \text{if } a_j(t) < \theta_j + \xi_j(t), \end{cases} \tag{10}
where ξj(t) is the stochastic variable with σj variance representing the external noise injection (note that for the max-cut problem, θj = 0 applies). As proposed in Ref. 56, an additional memristive crossbar line with high-amplitude tunable noise characteristics could be applied to tailor the noise level in the bit lines separately. Here, we apply an even simpler scheme with a single external memristive (or non-memristive) tunable noise source representing the σ variance of the ξ random variables. Along with the multiplexing, this external noise is added to the randomly chosen bit line. The green line (green symbols) in Figs. 7(a) and 7(b) represents the convergence probability and C̄ as a function of σ for constant (annealed) external noise amplitudes. These curves highly resemble the results where a constant (annealed) internal noise of the crossbar elements was applied [Figs. 3(a), 3(b), 4(a), and 4(b)]. This is, however, an easily deducible correspondence, as the σj variance of the external noise can be converted to an equivalent ON-state noise level, (ΔGON/GON)equivalent,j = σj/dj, in bit line j, considering the mixed barrier-like and metallic regime with neglected OFF-state noise. Using this conversion with σj = σ and replacing dj with its average value, we can plot the results of Figs. 3(a), 3(b), 4(a), and 4(b) in Fig. 7 (see the rescaled top horizontal axis). With this rescaling, the constant and annealed internal memristor noise (purple line and purple circles in Fig. 7) indeed exhibit a highly similar effect on the network performance as the equivalent externally injected constant or annealed noise (green line and green circles). This also means that our above results on internal noise contributions are all directly transferable to the case of external noise injection. Furthermore, in the case of external noise injection, the noise annealing is substantially less demanding, as it does not require the reprogramming of all the memristive elements that annealing the internal noise would.
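A minimal sketch of this update scheme, under the stated θj = 0 condition, is given below: the single tunable noise source is emulated by one random sample per update that shifts the threshold of the randomly selected bit line. The Gaussian noise choice and the variable names are our assumptions.

```python
import numpy as np

def update_with_external_noise(G, x, sigma, rng):
    """Thresholded update of Eq. (10) for one randomly selected bit line (theta_j = 0)."""
    j = rng.integers(len(x))                # bit line selected by the multiplexer
    a_j = G[j] @ x                          # activation of neuron j
    xi_j = sigma * rng.standard_normal()    # sample from the external noise source
    x[j] = 1 if a_j >= xi_j else -1         # threshold shifted by the injected noise
    return x
```

Annealing the injected noise then amounts to simply reducing σ during the run, instead of reprogramming the crossbar elements.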
FIG. 7.

(a) and (b) Effect of external noise injection on the performance of the memristive HNN using the simulation results for the g05_60.0 benchmark problem.51 The green line (circles) represents the convergence probability (a) and C̄ (b) as a function of the external noise’s variance, σ. As a reference, Pconv and C̄ are also plotted for the case of constant and annealed internal noise (purple line and circles) using the rescaled, equivalent ON-state noise axis (see top axes). For the external noise injection, white noise was applied, and the annealing protocol (circles) followed the continuous annealing scheme [Eq. (9)]. (c) and (d) The effect of diagonal self-feedback on the performance of the same memristive HNN using finite negative w values and zero noise. Pconv and C̄ are plotted as a function of |w| for constant w (blue lines) and for continuously annealed w [see Eq. (9)]. The insets in (b) and (d) illustrate the external noise injection and diagonal feedback schemes similarly to Fig. 1(a).


2. Chaotic annealing through diagonal feedback

In addition to a well-tailored internal or external noise injection, there are other, fundamentally different strategies to help find the optimal solution in HNN-based problem-solving. An alternative approach relies on the introduction of negative diagonal feedback, which was shown to introduce chaotic behavior into the network.25,30,57 The latter can also be utilized to escape from local minima in the energy landscape. This diagonal feedback strategy was applied in various studies, including the demonstration of quantum-inspired annealing schemes in memristive crossbar arrays58,59 or the fine control of the perturbations using three-terminal synaptic circuit elements.59 Furthermore, it was shown that the internal noise of the network can be attenuated or magnified by a tunable hysteretic threshold circuitry, which also induces self-feedback in the network.24 As a basis for comparison with the main topic of our article, i.e., the role of experimentally relevant noise characteristics of the memristive elements, here we also briefly summarize and simulate the diagonal-feedback-based chaotic annealing scheme.

An ideal HNN lacks the self-connection of the neurons, i.e., the diagonal elements of the weight matrix are zero. A finite self-connection of the neurons with uniform w strength [see the dark green arrow in Fig. 7(c) and also in Fig. 1(a)] modifies the update rule as
x_j(t+1) = \begin{cases} +1 & \text{if } a_j(t) + w\,x_j(t) \ge \theta_j, \\ -1 & \text{if } a_j(t) + w\,x_j(t) < \theta_j, \end{cases} \tag{11}
where aj(t) is the activation without self-feedback, and w is the tunable strength of the self-feedback. Positive w values drive the network toward the stabilization of the actual states, i.e., a neuron that would change its state at w = 0 may keep its state at w > 0: for a +1 (−1) neural state, the state change would require a negative (positive) activation, but the self-feedback shifts the threshold toward positive (negative) values, which works against the change of the actual state. With a similar argumentation, a negative w yields neural updates in situations where the ideal Hopfield network (w = 0) would not change the neural state. This simple argumentation also demonstrates that positive (negative) self-feedback attenuates (magnifies) the role of the internal noise in a memristive HNN, as demonstrated and derived in detail in Ref. 24.

Here, we wish to investigate the effect of the self-feedback separately from the effect of internal noise, i.e., we have performed simulations for a noiseless Hopfield network using either constant [blue lines in Figs. 7(c) and 7(d)] or annealed [blue circles in Figs. 7(c) and 7(d)] negative w values. The negative feedback introduces chaotic behavior, which can be suppressed by gradually decreasing |w|. We note that in a memristive crossbar array, the introduction of negative weights would require the application of differential memristor pairs,7 i.e., it is more reasonable to keep the memristive matrix positive valued with zero diagonal elements and to introduce the self-feedback as an offset along the neural updates.
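The sketch below illustrates such a noiseless simulation with annealed negative self-feedback following the update rule of Eq. (11); the linear annealing ramp is a simplification of the continuous scheme of Eq. (9), and the variable names are our assumptions.

```python
import numpy as np

def chaotic_annealing(G, x, w0=-0.5, n_steps=20000, rng=None):
    """Noiseless updates of Eq. (11) with a negative, gradually annealed self-feedback w."""
    rng = rng or np.random.default_rng()
    for step in range(n_steps):
        w = w0 * (1.0 - step / n_steps)          # |w| is ramped down toward zero
        j = rng.integers(len(x))                 # randomly selected neuron
        a_j = G[j] @ x                           # activation without self-feedback
        x[j] = 1 if a_j + w * x[j] >= 0 else -1  # theta_j = 0 for the max-cut problem
    return x
```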

The results of the simulations with zero noise but finite negative diagonal feedback are shown in Figs. 7(c) and 7(d). Once w is annealed, the convergence probabilities [blue circles in Fig. 7(c)] are similar to those obtained with annealed internal or externally injected noise [purple and green circles in Fig. 7(a)]. At constant self-feedback amplitude [blue line in Fig. 7(c)], however, no optimal w is found that would yield a similarly good convergence probability (45%) as the optimal constant internal or external noise level [purple and green lines in Fig. 7(a)]; rather, the convergence probability remains below 20% regardless of the w value. Accordingly, annealing is vital in the case of self-feedback. Finally, we note that chaotic dynamics is not only available through negative self-feedback: nonlinear memristive elements with chaotic dynamics, such as nanoscale NbO2 memristors, can also be applied to externally inject chaotic behavior into the network.60

In conclusion, we simulated probabilistic optimization schemes inspired by the memristive HNNs experimentally realized in Refs. 24 and 56. These works demonstrated that memristive HNNs are not only efficient hardware accelerators for complex optimization tasks thanks to their single-step matrix–vector multiplication capability, but the intrinsic noise of the memristive elements can also be exploited as a hardware resource, introducing proper stochasticity to the network. As a main focus, we simulated the operation of memristive HNNs relying on experimentally deduced, realistic noise characteristics. Based on a broad range of conductance-dependent noise characteristics in various memristive systems, we proposed a noise model describing the typical noise evolution along the variation of the conductance states. Relying on this model, we demonstrated distinct operation regimes where either the ON-state or the OFF-state noise provides the dominant contribution. We also demonstrated that the relative conductance variation is not only a good measure of the noise amplitude but also a highly relevant parameter describing the operation of memristive HNNs. According to our simulations, the relative noise level required for optimal network operation is found to be in the range of ΔG/G ≈ 11%–16%, regardless of the color of the noise spectrum (white, pink, or Lorentzian noise) or the size of the problem (60 × 60 to 300 × 300). We have shown that further performance enhancement can be achieved by noise annealing, where a highly simplified and easily implemented double-step noise annealing scheme provides similar performance to the more refined but extremely time-consuming continuous superlinear noise annealing scheme. It is also found that the optimal noise level is at the top edge of the experimentally achievable relative noise levels, which means that the network is easily tuned to an operation regime with a suboptimal relative noise level, where the optimal operation can be set either by external noise injection or by negative diagonal feedback, the latter introducing chaotic network behavior. Finally, we have explored the effects of further non-idealities, such as the limited programming accuracy and the finite OFF-state conductance of the memristors. We have argued that any static non-ideality that deforms the weight matrix but still implements a noise-free HNN can only lead to a degradation of the network performance, i.e., for performance enhancement, either true stochasticity (noise) or a non-ideal HNN with somewhat chaotic energy trajectories is required. It was also found that static non-idealities, especially device-to-device variations with (ΔG/G)static > 0.025, cause severe performance degradation in networks with optimized dynamical noise levels. All these results were presented on the benchmark unweighted max-cut problem, where binary weights are used. We note, however, that the discussed conductance-dependent noise model also provides a guideline for dealing with the intrinsic noise of multilevel weights in memristive HNNs, which can be simulated using a very similar approach. The latter non-binary networks are relevant for more complex optimization problems, such as weighted max-cut,59,61 or for situations where continuous weight annealing is used.23

The rapidly growing field of memristor research is expected to deliver radically new IT solutions in the near future. We believe that our results contribute to this field by exploring the prospects of fully connected memristive networks utilizing the inherent stochasticity of memristors for probabilistic optimization algorithms.

This research was supported by the Ministry of Culture and Innovation and the National Research, Development and Innovation Office within the Quantum Information National Laboratory of Hungary (Grant No. 2022-2.1.1-NL-2022-00004), the New National Excellence Program of the Ministry for Culture and Innovation from the source of the National Research, Development and Innovation Fund (Grant Nos. ÚNKP-22-2-I-BME-73 and ÚNKP-22-5-BME-288), and NKFI Grant Nos. K143169 and K143282. Project No. 963575 has been implemented with the support provided by the Ministry of Culture and Innovation of Hungary from the National Research, Development and Innovation Fund, financed under the KDP-2020 funding scheme. Z.B. acknowledges the support of the Bolyai János Research Scholarship of the Hungarian Academy of Sciences. The authors are grateful to Dávid Krisztián and Péter Balázs for their contribution to the noise measurements on SiOx resistive switches.

The authors have no conflicts to disclose.

János Gergő Fehérvári: Data curation (lead); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (lead); Validation (equal); Visualization (lead); Writing – original draft (equal); Writing – review & editing (equal). Zoltán Balogh: Formal analysis (equal); Investigation (equal); Methodology (equal); Resources (equal); Supervision (supporting); Validation (equal); Visualization (supporting); Writing – review & editing (supporting). Tímea Nóra Török: Data curation (supporting); Formal analysis (supporting); Software (supporting); Writing – review & editing (supporting). András Halbritter: Conceptualization (lead); Formal analysis (equal); Funding acquisition (lead); Investigation (equal); Methodology (equal); Project administration (lead); Resources (equal); Supervision (lead); Validation (equal); Visualization (supporting); Writing – original draft (equal); Writing – review & editing (equal).

The data that support the findings of this study are openly available in the figshare repository at https://figshare.com/projects/Noise_tailoring_noise_annealing_and_external_noise_injection_strategies_in_memristive_Hopfield_neural_networks/176478 (Ref. 62).

1. M. A. Zidan, J. P. Strachan, and W. D. Lu, “The future of electronics based on memristive systems,” Nat. Electron. 1, 22–29 (2018).
2. Q. Xia and J. J. Yang, “Memristive crossbar arrays for brain-inspired computing,” Nat. Mater. 18, 309–323 (2019).
3. A. Mehonic, A. Sebastian, B. Rajendran, O. Simeone, E. Vasilaki, and A. J. Kenyon, “Memristors—From in-memory computing, deep learning acceleration, and spiking neural networks to the future of neuromorphic and bio-inspired computing,” Adv. Intell. Syst. 2, 2000085 (2020).
4. A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, “Memory devices and applications for in-memory computing,” Nat. Nanotechnol. 15, 529–544 (2020).
5. P. Mannocci, M. Farronato, N. Lepri, L. Cattaneo, A. Glukhov, Z. Sun, and D. Ielmini, “In-memory computing with emerging memory devices: Status and outlook,” APL Mach. Learn. 1, 010902 (2023).
6. S. Ambrogio, P. Narayanan, H. Tsai, R. M. Shelby, I. Boybat, C. di Nolfo, S. Sidler, M. Giordano, M. Bodini, N. C. P. Farinha, B. Killeen, C. Cheng, Y. Jaoudi, and G. W. Burr, “Equivalent-accuracy accelerated neural-network training using analogue memory,” Nature 558, 60 (2018).
7. C. Li, D. Belkin, Y. Li, P. Yan, M. Hu, N. Ge, H. Jiang, E. Montgomery, P. Lin, Z. Wang, W. Song, J. P. Strachan, M. Barnell, Q. Wu, R. S. Williams, J. J. Yang, and Q. Xia, “Efficient and self-adaptive in-situ learning in multilayer memristor neural networks,” Nat. Commun. 9, 2385 (2018).
8. Z. Wang, C. Li, W. Song, M. Rao, D. Belkin, Y. Li, P. Yan, H. Jiang, P. Lin, M. Hu, J. P. Strachan, N. Ge, M. Barnell, Q. Wu, A. G. Barto, Q. Qiu, R. S. Williams, Q. Xia, and J. J. Yang, “Reinforcement learning with analogue memristor arrays,” Nat. Electron. 2, 115–124 (2019).
9. C. Mackin, H. Tsai, S. Ambrogio, P. Narayanan, A. Chen, and G. W. Burr, “Weight programming in DNN analog hardware accelerators in the presence of NVM variability,” Adv. Electron. Mater. 5, 1900026 (2019).
10. C. Li, M. Hu, Y. Li, H. Jiang, N. Ge, E. Montgomery, J. Zhang, W. Song, N. Dávila, C. E. Graves, Z. Li, J. P. Strachan, P. Lin, Z. Wang, M. Barnell, Q. Wu, R. S. Williams, J. J. Yang, and Q. Xia, “Analogue signal and image processing with large memristor crossbars,” Nat. Electron. 1, 52–59 (2018).
11. Z. Wang, C. Li, P. Lin, M. Rao, Y. Nie, W. Song, Q. Qiu, Y. Li, P. Yan, J. P. Strachan, N. Ge, N. McDonald, Q. Wu, M. Hu, H. Wu, R. S. Williams, Q. Xia, and J. J. Yang, “In situ training of feed-forward and recurrent convolutional memristor networks,” Nat. Mach. Intell. 1, 434–442 (2019).
12. J. Y. Seok, S. J. Song, J. H. Yoon, K. J. Yoon, T. H. Park, D. E. Kwon, H. Lim, G. H. Kim, D. S. Jeong, and C. S. Hwang, “A review of three-dimensional resistive switching cross-bar array memories from the integration and materials property points of view,” Adv. Funct. Mater. 24, 5316–5339 (2014).
13. C. Wu, T. W. Kim, H. Y. Choi, D. B. Strukov, and J. J. Yang, “Flexible three-dimensional artificial synapse networks with correlated learning and trainable memory capability,” Nat. Commun. 8, 752 (2017).
14. C. Li, L. Han, H. Jiang, M.-H. Jang, P. Lin, Q. Wu, M. Barnell, J. J. Yang, H. L. Xin, and Q. Xia, “Three-dimensional crossbar arrays of self-rectifying Si/SiO2/Si memristors,” Nat. Commun. 8, 15666 (2017).
15. P. Lin, C. Li, Z. Wang, Y. Li, H. Jiang, W. Song, M. Rao, Y. Zhuo, N. Upadhyay, M. Barnell, Q. Wu, J. J. Yang, and Q. Xia, “Three-dimensional memristor circuits as complex neural networks,” Nat. Electron. 3, 225–232 (2020).
16. A. Serb, J. Bill, A. Khiat, R. Berdan, R. Legenstein, and T. Prodromakis, “Unsupervised learning in probabilistic neural networks with multi-state metal-oxide memristive synapses,” Nat. Commun. 7, 12611 (2016).
17. S. Choi, J. H. Shin, J. Lee, P. Sheridan, and W. D. Lu, “Experimental demonstration of feature extraction and dimensionality reduction using memristor networks,” Nano Lett. 17, 3113–3118 (2017).
18. Z. Wang, S. Joshi, S. Savel’ev, W. Song, R. Midya, Y. Li, M. Rao, P. Yan, S. Asapu, Y. Zhuo, H. Jiang, P. Lin, C. Li, J. H. Yoon, N. K. Upadhyay, J. Zhang, M. Hu, J. P. Strachan, M. Barnell, Q. Wu, H. Wu, R. S. Williams, Q. Xia, and J. J. Yang, “Fully memristive neural networks for pattern classification with unsupervised learning,” Nat. Electron. 1, 137–145 (2018).
19. C. Li, Z. Wang, M. Rao, D. Belkin, W. Song, H. Jiang, P. Yan, Y. Li, P. Lin, M. Hu, N. Ge, J. P. Strachan, M. Barnell, Q. Wu, R. S. Williams, J. J. Yang, and Q. Xia, “Long short-term memory networks in memristor crossbar arrays,” Nat. Mach. Intell. 1, 49–57 (2019).
20. M. Rao, H. Tang, J. Wu, W. Song, M. Zhang, W. Yin, Y. Zhuo, F. Kiani, B. Chen, X. Jiang, H. Liu, H.-Y. Chen, R. Midya, F. Ye, H. Jiang, Z. Wang, M. Wu, M. Hu, H. Wang, Q. Xia, N. Ge, J. Li, and J. J. Yang, “Thousands of conductance levels in memristors integrated on CMOS,” Nature 615, 823–829 (2023).
21. H. Kim, M. Kim, A. Lee, H.-L. Park, J. Jang, J.-H. Bae, I. M. Kang, E.-S. Kim, and S.-H. Lee, “Organic memristor-based flexible neural networks with bio-realistic synaptic plasticity for complex combinatorial optimization,” Adv. Sci. 10, 2300659 (2023).
22. M. Jiang, K. Shan, C. He, and C. Li, “Efficient combinatorial optimization by quantum-inspired parallel annealing in analogue memristor crossbar,” Nat. Commun. 14, 5927 (2023).
23. Z. Fahimi, M. R. Mahmoodi, H. Nili, V. Polishchuk, and D. B. Strukov, “Combinatorial optimization by weight annealing in memristive Hopfield networks,” Sci. Rep. 11, 16383 (2021).
24. F. Cai, S. Kumar, T. Van Vaerenbergh, X. Sheng, R. Liu, C. Li, Z. Liu, M. Foltin, S. Yu, Q. Xia, J. J. Yang, R. Beausoleil, W. D. Lu, and J. P. Strachan, “Power-efficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks,” Nat. Electron. 3, 409–418 (2020).
25. M. R. Mahmoodi, M. Prezioso, and D. B. Strukov, “Versatile stochastic dot product circuits based on nonvolatile memories for high performance neurocomputing and neurooptimization,” Nat. Commun. 10, 5113 (2019).
26. B. Sánta, Z. Balogh, L. Pósa, D. Krisztián, T. N. Török, D. Molnár, C. Sinkó, R. Hauert, M. Csontos, and A. Halbritter, “Noise tailoring in memristive filaments,” ACS Appl. Mater. Interfaces 13, 7453–7460 (2021).
27. Z. Balogh, G. Mezei, L. Pósa, B. Sánta, A. Magyarkuti, and A. Halbritter, “1/f noise spectroscopy and noise tailoring of nanoelectronic devices,” Nano Futures 5, 042002 (2021).
28. B. Sánta, Z. Balogh, A. Gubicza, L. Pósa, D. Krisztián, G. Mihály, M. Csontos, and A. Halbritter, “Universal 1/f type current noise of Ag filaments in redox-based memristive nanojunctions,” Nanoscale 11, 4719–4725 (2019).
29. L. Pósa, Z. Balogh, D. Krisztián, P. Balázs, B. Sánta, R. Furrer, M. Csontos, and A. Halbritter, “Noise diagnostics of graphene interconnects for atomic-scale electronics,” npj 2D Mater. Appl. 5, 57 (2021).
30. K. Yang, Q. Duan, Y. Wang, T. Zhang, Y. Yang, and R. Huang, “Transiently chaotic simulated annealing based on intrinsic nonlinearity of memristors for efficient solution of optimization problems,” Sci. Adv. 6, eaba9901 (2020).
31. J. J. Hopfield, “Neural networks and physical systems with emergent collective computational abilities,” Proc. Natl. Acad. Sci. U. S. A. 79, 2554–2558 (1982).
32. J. J. Hopfield and D. W. Tank, “Computing with neural circuits: A model,” Science 233, 625–633 (1986).
33. K. A. Smith, “Neural networks for combinatorial optimization: A review of more than a decade of research,” INFORMS J. Comput. 11, 15–34 (1999).
34. L.-Y. Wu, X.-S. Zhang, and J.-L. Zhang, “Application of discrete Hopfield-type neural network for max-cut problem,” in Proceedings of ICONIP (2001), pp. 1439–1444.
35. F. Liers, T. Nieberg, and G. Pardella, “Via minimization in VLSI chip design—Application of a planar max-cut algorithm,” Working Paper, University of Köln, 2011, https://kups.ub.uni-koeln.de/55018/.
36. H. Li, S. Wang, X. Zhang, W. Wang, R. Yang, Z. Sun, W. Feng, P. Lin, Z. Wang, L. Sun, and Y. Yao, “Memristive crossbar arrays for storage and computing applications,” Adv. Intell. Syst. 3, 2100017 (2021).
37. L. Pósa, M. El Abbassi, P. Makk, B. Sánta, C. Nef, M. Csontos, M. Calame, and A. Halbritter, “Multiple physical time scales and dead time rule in few-nanometers sized graphene–SiOx-graphene memristors,” Nano Lett. 17, 6783–6789 (2017).
38. T. N. Török, J. G. Fehérvári, G. Mészáros, L. Pósa, and A. Halbritter, “Tunable, nucleation-driven stochasticity in nanoscale silicon oxide resistive switching memory devices,” ACS Appl. Nano Mater. 5, 6691–6698 (2022).
39. J. Yao, L. Zhong, D. Natelson, and J. M. Tour, “In situ imaging of the conducting filament in a silicon oxide resistive switch,” Sci. Rep. 2, 242 (2012).
40. D. Ielmini, F. Nardi, and C. Cagli, “Resistance-dependent amplitude of random telegraph-signal noise in resistive switching memories,” Appl. Phys. Lett. 96, 053503 (2010).
41. R. Soni, P. Meuffels, A. Petraru, M. Weides, C. Kügeler, R. Waser, and H. Kohlstedt, “Probing Cu doped Ge0.3Se0.7 based resistance switching memory devices with random telegraph noise,” J. Appl. Phys. 107, 024517 (2010).
42. Z. Fang, H. Y. Yu, W. J. Fan, G. Ghibaudo, J. Buckley, B. DeSalvo, X. Li, X. P. Wang, G. Q. Lo, and D. L. Kwong, “Current conduction model for oxide-based resistive random access memory verified by low-frequency noise analysis,” IEEE Trans. Electron Devices 60, 1272–1275 (2013).
43. S. Ambrogio, S. Balatti, A. Cubeta, A. Calderoni, N. Ramaswamy, and D. Ielmini, “Statistical fluctuations in HfOx resistive-switching memory: Part II—Random telegraph noise,” IEEE Trans. Electron Devices 61, 2920–2927 (2014).
44. S. Ambrogio, S. Balatti, V. McCaffrey, D. C. Wang, and D. Ielmini, “Noise-induced resistance broadening in resistive switching memory—Part I: Intrinsic cell behavior,” IEEE Trans. Electron Devices 62, 3805–3811 (2015).
45. W. Yi, S. E. Savel’ev, G. Medeiros-Ribeiro, F. Miao, M.-X. Zhang, J. J. Yang, A. M. Bratkovsky, and R. S. Williams, “Quantized conductance coincides with state instability and excess noise in tantalum oxide memristors,” Nat. Commun. 7, 11142 (2016).
46. F. M. Puglisi, N. Zagni, L. Larcher, and P. Pavan, “Random telegraph noise in resistive random access memories: Compact modeling and advanced circuit design,” IEEE Trans. Electron Devices 65, 2964–2972 (2018).
47. E. Piros, M. Lonsky, S. Petzold, A. Zintler, S. Sharath, T. Vogel, N. Kaiser, R. Eilhardt, L. Molina-Luna, C. Wenger, J. Müller, and L. Alff, “Role of oxygen defects in conductive-filament formation in Y2O3-based analog RRAM devices as revealed by fluctuation spectroscopy,” Phys. Rev. Appl. 14, 034029 (2020).
48. J.-K. Lee and S. Kim, “Comparative analysis of low-frequency noise based resistive switching phenomenon for filamentary and interfacial RRAM devices,” Chaos, Solitons Fractals 173, 113633 (2023).
49. S. Slesazeck, H. Mähne, H. Wylezich, A. Wachowiak, J. Radhakrishnan, A. Ascoli, R. Tetzlaff, and T. Mikolajick, “Physical model of threshold switching in NbO2 based memristors,” RSC Adv. 5, 102318–102322 (2015).
50. S. Gao, F. Zeng, F. Li, M. Wang, H. Mao, G. Wang, C. Song, and F. Pan, “Forming-free and self-rectifying resistive switching of the simple Pt/TaOx/n-Si structure for access device-free high-density memory application,” Nanoscale 7, 6031–6038 (2015).
51. A. Wiegele, “Biq Mac library—A collection of max-cut and quadratic 0-1 programming instances of medium size,” Technical Report, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria, 2007.
52. M. Carrettoni and O. Cremonesi, “Generation of noise time series with arbitrary power spectrum,” Comput. Phys. Commun. 181, 1982–1985 (2010).
53. F. Cai, J. M. Correll, S. H. Lee, Y. Lim, V. Bothra, Z. Zhang, M. P. Flynn, and W. D. Lu, “A fully integrated reprogrammable memristor–CMOS system for efficient multiply–accumulate operations,” Nat. Electron. 2, 290–299 (2019).
54. M. Hu, C. E. Graves, C. Li, Y. Li, N. Ge, E. Montgomery, N. Davila, H. Jiang, R. S. Williams, J. J. Yang, Q. Xia, and J. P. Strachan, “Memristor-based analog computation and neural network classification with a dot product engine,” Adv. Mater. 30, 1705914 (2018).
55. H. Jiang, C. Li, R. Zhang, P. Yan, P. Lin, Y. Li, J. J. Yang, D. Holcomb, and Q. Xia, “A provable key destruction scheme based on memristive crossbar arrays,” Nat. Electron. 1, 548–554 (2018).
56. F. Cai, S. Kumar, T. V. Vaerenbergh, R. Liu, C. Li, S. Yu, Q. Xia, J. J. Yang, R. Beausoleil, W. Lu, and J. P. Strachan, “Harnessing intrinsic noise in memristor Hopfield neural networks for combinatorial optimization,” arXiv:1903.11194 [cs.ET] (2019).
57. L. Chen and K. Aihara, “Chaotic simulated annealing by a neural network model with transient chaos,” Neural Networks 8, 915–930 (1995).
58. S. Kumar, T. Van Vaerenbergh, and J. P. Strachan, “Classical adiabatic annealing in memristor Hopfield neural networks for combinatorial optimization,” in 2020 International Conference on Rebooting Computing (ICRC) (IEEE, 2020), pp. 76–79.
59. S.-I. Yi, S. Kumar, and R. S. Williams, “Improved Hopfield network optimization using manufacturable three-terminal electronic synapses,” IEEE Trans. Circuits Syst. I 68, 4970–4978 (2021).
60. S. Kumar, J. P. Strachan, and R. S. Williams, “Chaotic dynamics in nanoscale NbO2 Mott memristors for analogue computing,” Nature 548, 318–321 (2017).
61. R. Shaydulin, P. C. Lotshaw, J. Larson, J. Ostrowski, and T. S. Humble, “Parameter transfer for quantum approximate optimization of weighted maxcut,” ACM Trans. Quantum Comput. 4, 1 (2023).
62. J. G. Fehérvári, Z. Balogh, T. N. Török, and A. Halbritter, “Simulation codes and datasets,” figshare (2023), https://figshare.com/projects/Noise_tailoring_noise_annealing_and_external_noise_injection_strategies_in_memristive_Hopfield_neural_networks/176478.