The commercial introduction of a novel electronic device is often preceded by a lengthy material optimization phase devoted to the suppression of device noise as much as possible. The emergence of novel computing architectures, however, triggers a paradigm shift in noise engineering, demonstrating that non-suppressed but properly tailored noise can be harvested as a computational resource in probabilistic computing schemes. Such a strategy was recently realized on the hardware level in memristive Hopfield neural networks, delivering fast and highly energy efficient optimization performance. Inspired by these achievements, we perform a thorough analysis of simulated memristive Hopfield neural networks relying on realistic noise characteristics acquired on various memristive devices. These characteristics highlight the possibility of orders of magnitude variations in the noise level depending on the material choice as well as on the resistance state (and the corresponding active region volume) of the devices. Our simulations separate the effects of various device non-idealities on the operation of the Hopfield neural network by investigating the role of the programming accuracy as well as the noise-type and noise amplitude of the ON and OFF states. Relying on these results, we propose optimized noise tailoring and noise annealing strategies, comparing the impact of internal noise to the effect of external perturbation injection schemes.

## I. INTRODUCTION

Memristive crossbar arrays are promising candidates as the hardware components of artificial neural networks,^{1–5} including advanced applications in feed-forward neural networks,^{6–9} convolutional layers,^{10,11} 3D architectures,^{12–15} unsupervised neural networks,^{16–18} and recurrent neural networks.^{11,19} In these applications, the tunable conductance of a memristor unit encodes a synaptic weight in the network, and once the properly trained weights are programmed to each memristor cell, the memristive crossbar array is able to perform the vector-matrix multiplication, i.e., the key mathematical operation of the network inference, in a single time-step.^{1,2,6} This equips the artificial neural networks with a highly energy efficient hardware component compared to software solutions, where the evaluation of the input vector at a layer with *N* neurons requires *N*^{2} multiplication operations. In most neural network applications, the highest resolution of the memristive synaptic weights is desirable,^{20} and, therefore, the memristor non-idealities, such as their programming inaccuracy or their stochastic noise properties, should be eliminated as much as possible. A special class of memristive networks, however, relies on probabilistic optimization,^{21–25} where it is well known that tunable stochasticity, such as customizable device noise, is not a disturbing factor but a useful computational resource. A similar strategy was recently experimentally realized in 60 × 60 memristive Hopfield neural networks (HNNs),^{24} demonstrating the efficient solution of max-cut graph segmentation problems and delivering over four orders of magnitude higher solution throughput per power consumption than digital or quantum annealing approaches.

Inspired by these achievements, we perform a thorough analysis of simulated memristive Hopfield neural networks, putting a key emphasis on the effect of device noise on the network’s operation. To this end, first, the realistic noise characteristics of memristive devices^{26–29} are discussed (Sec. III), and a general noise model describing the conductance-dependent noise characteristics in the filamentary and broken filamentary regimes is proposed. Afterward, various benchmark max-cut problems are solved by simulated memristive HNNs (Sec. IV), relying on the proposed noise model. These simulations demonstrate rather well-defined relative noise values at which the network operation is optimized, regardless of the network size and the type of noise spectrum. We also demonstrate a simplified, easily implementable double-step noise annealing scheme (Sec. IV C), which further enhances the convergence probability of the network. These optimized noise levels, however, are at the top border of the experimentally observed noise amplitudes, which raises the need for an external injection of stochasticity (Sec. IV F). For the latter, two strategies are tested, including external noise injection and the introduction of chaotic behavior through the self-feedback of the neurons in the network.^{30} Finally, the effect of further device non-idealities is tested separately (Sec. IV E), analyzing the effect of the programming inaccuracy and the finite OFF-state conductance. The presentation of all these results is preceded by a brief overview of Hopfield neural networks and their implementation by memristive crossbar arrays (Sec. II).

## II. MEMRISTIVE HOPFIELD NEURAL NETWORKS

### A. Hopfield networks

The Hopfield neural network was introduced by Hopfield,^{31} was shown to be capable of solving complex problems by Hopfield and Tank,^{32} and has been used for optimization ever since.^{33} The Hopfield network’s main allure is its simplicity and immense power to provide reasonable solutions for high complexity problems. The network consists of fully connected binary neurons (without self-connections); the $\underline{\underline{W}}$ synaptic weight matrix encodes the optimization problem; and the $\underline{x}$ state of the neurons represents the possible states of the system, including the desired solution(s) [see Fig. 1(a)]. The network is operated in an iterative fashion: at each time-step *t*, a neuron *j* is picked at random, its activation is computed as

$$a_j^{(t)} = \sum_i W_{j,i} \cdot x_i^{(t)}, \tag{1}$$

and the single neuron is updated according to the rule

$$x_j^{(t+1)} = \begin{cases} +1 & \text{if } a_j^{(t)} \geq \theta_j, \\ -1 & \text{otherwise}, \end{cases} \tag{2}$$

where *θ*_{j} is the *j*th component of a predefined threshold vector $\underline{\theta}$. It can be shown that this simple update rule decreases the energy function

$$E(\underline{x}) = -\frac{1}{2} \sum_{i,j} W_{i,j} \, x_i x_j + \sum_j \theta_j x_j \tag{3}$$

(*E*^{(t+1)} ≤ *E*^{(t)}). Due to this property, Hopfield neural networks are widely used to solve complex problems, which can be encoded in the form of the effective energy function in Eq. (3). This can be applied in an associative memory scheme,^{31} where $\underline{\underline{W}}$ and $\underline{\theta}$ are set such that each local minimum of $E(\underline{x})$ encodes a predefined pattern (e.g., images), and the update rule drives the system from an arbitrary initial state $\underline{x}^{(0)}$ to the closest local minimum, i.e., the network finds the predefined pattern most similar to the initial state. Alternatively, the Hopfield neural network may find the global solution to a complex problem, such as the max-cut graph segmentation problem.
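A minimal Python sketch (illustrative, not the authors' code) of the asynchronous update rule and the resulting monotonic energy decrease; the random symmetric weights and the network size are arbitrary choices:

```python
import numpy as np

def energy(W, theta, x):
    """Hopfield energy, Eq. (3): E = -1/2 sum_ij W_ij x_i x_j + sum_j theta_j x_j."""
    return -0.5 * x @ W @ x + theta @ x

def update_neuron(W, theta, x, j):
    """Asynchronous update, Eqs. (1) and (2): x_j <- +1 if a_j >= theta_j, else -1."""
    a_j = W[j] @ x  # activation; the self-term vanishes since W_jj = 0
    x[j] = 1.0 if a_j >= theta[j] else -1.0

rng = np.random.default_rng(0)
n = 20
W = rng.standard_normal((n, n))
W = (W + W.T) / 2            # symmetric weights
np.fill_diagonal(W, 0.0)     # no self-connections
theta = np.zeros(n)

x = rng.choice([-1.0, 1.0], size=n)
E0 = energy(W, theta, x)
E_prev = E0
for _ in range(1000):
    update_neuron(W, theta, x, rng.integers(n))
    E = energy(W, theta, x)
    assert E <= E_prev + 1e-9   # the update never increases the energy
    E_prev = E
```

The assertion inside the loop checks the E^{(t+1)} ≤ E^{(t)} property for every single-neuron update.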

### B. Max-cut problem

The max-cut problem is defined on a *G*(*V*, *E*) graph with *V* vertices and *E* edges. The goal is to find a partitioning of *V* into two disjoint sets *S* and *K* so that the total weight of the crossing edges between the two sets is maximal [see Fig. 1(b)].^{34} Although abstract at first sight, many practical problems can be mapped to the max-cut, such as the conflict-graph formulation of the layer assignment problem in very large scale integration (VLSI) design, where the position of functional blocks is optimized in a multilayer chip.^{35} If a graph is given by its adjacency matrix $\underline{\underline{A}}$, a cut can be encoded by $\underline{x}$, simply as *x*_{i} = +1 or *x*_{i} = −1 for the vertices belonging to *S* and *K*, respectively, so the problem is directly addressable by a HNN.^{34} In that case, however, the global minimum of the energy function is to be found, while the conventional operation of a Hopfield neural network would yield dead ends in each local minimum of the energy landscape. This problem can be eliminated by introducing proper stochasticity to the network, such as a finite noise, which helps escape from the local minima and is reduced as the states evolve toward the global solution [see the illustration in Fig. 1(c)].
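A short sketch (with hypothetical variable names) of how a ±1 state vector encodes a cut and how the cut size follows from the adjacency matrix; the toy graph and the brute-force check are purely illustrative:

```python
import numpy as np

def cut_size(A, x):
    """Number of cut edges for the partition encoded by x (x_i = ±1):
    edge (i, j) is cut when x_i != x_j, i.e. (1 - x_i*x_j)/2 = 1.
    The double sum counts every edge twice, hence the factor 1/4."""
    return 0.25 * np.sum(A * (1 - np.outer(x, x)))

rng = np.random.default_rng(1)
n = 8
A = np.triu((rng.random((n, n)) < 0.5).astype(float), 1)
A = A + A.T                          # symmetric 0/1 adjacency, zero diagonal

# brute-force global max-cut for this toy size (2^8 states)
states = [np.array([1.0 if (m >> k) & 1 else -1.0 for k in range(n)])
          for m in range(2 ** n)]
best_cut = max(cut_size(A, x) for x in states)
# with the common sign convention W = -A and theta = 0, the Hopfield
# energy minimum corresponds to this maximum cut
```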

### C. Hardware implementation of HNNs by memristive crossbar arrays

The so-called crossbar structure is a popular scheme for physical matrix realization via memristors.^{2,36} As shown in Fig. 1(d), it is essentially a set of horizontal and vertical wires (word and bit lines) with a memristor placed at each meeting point of the lines. Operating this arrangement in the linear, subthreshold regime of the memristive units, the output current vector at the bit lines is obtained as the product of the input voltage vector at the word lines and the conductance matrix of the memristors at the crosspoints, *I*_{j} = ∑_{i}*G*_{i,j} · *V*_{i}. Once the proper conductance weights are programmed to the crossbar, the vector-matrix multiplication is performed on the hardware level within a single clock-cycle. This scheme is also implementable for Hopfield neural networks, where the diagonal values of the *G*_{i,j} conductance matrix are zero due to the lack of self-connections in the HNN.

The special case of the max-cut problem is mathematically formulated by a weight matrix, which is “1” or “0” if the proper vertices are connected or non-connected. This problem can be mapped to a memristive HNN by setting a constant *G*_{ON} conductance and a *G*_{OFF} ≪ *G*_{ON} conductance instead of the 1 and 0 values, respectively. The *x*_{i} binary state vector elements are represented by the $V_i(t) = x_i(t) \cdot |V|$ input voltages at the crossbar word lines. This scheme was experimentally realized in Ref. 24 using memristive HNNs up to 60 × 60 matrix sizes.
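The mapping and the single-step vector-matrix multiplication can be sketched as follows; the conductance values and the array size are illustrative assumptions:

```python
import numpy as np

# Hypothetical conductance levels for illustration
G_ON, G_OFF = 1e-4, 1e-6     # siemens, G_OFF << G_ON
V_read = 1.0                 # volt, |V|

rng = np.random.default_rng(2)
n = 60
W = (rng.random((n, n)) < 0.5).astype(float)   # "1"/"0" weight matrix
np.fill_diagonal(W, 0.0)

G = np.where(W == 1.0, G_ON, G_OFF)  # map "1" -> G_ON, "0" -> G_OFF
np.fill_diagonal(G, 0.0)             # no self-connections in the HNN

x = rng.choice([-1.0, 1.0], size=n)
V = x * V_read                       # word-line voltages V_i = x_i * |V|
I = V @ G                            # bit-line currents I_j = sum_i G_ij * V_i
```

The single matrix product stands in for what the crossbar computes in one clock-cycle on the hardware level.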

### D. Non-idealities and stochasticity in memristive HNNs

Figures 1(e) and 1(f) demonstrate the key device non-idealities in a memristive Hopfield neural network: the temporal stochastic variation (noise) of the programmed *G*_{ON} and *G*_{OFF} conductances (panel f), as well as the programming inaccuracy, i.e., the device-to-device variation of the time-averaged $\overline{G_{i,j}(t)}$ conductances for the memristor cells programmed to the same ON/OFF binary state. These non-idealities are measured by the temporal variance (Δ*G*_{dynamic}) and the device-to-device variance (Δ*G*_{static}) around the ideal *G*_{ON} and *G*_{OFF} values. It is noted that *G*_{ON} can be chosen arbitrarily in the mapping of the “1” and “0” synaptic weights to the *G*_{ON} and *G*_{OFF} conductances; however, a finite *G*_{OFF} already represents a device non-ideality, which may modify the operation of the network. Furthermore, finite wire resistances or nonlinear *I*(*V*) characteristics may also be considered non-idealities;^{24} however, these two non-idealities are not considered in our following analysis.

Among these non-idealities, noise plays a distinguished role, as it does not necessarily hamper network operation; rather, a properly tailored noise may help find the global solution. However, the injection of external stochasticity might also become necessary once the internal noise of the memristor elements in the crossbar array is not large enough for optimal network operation. The latter possibility is illustrated by the light green arrows in Figs. 1(a) and 1(d) (external noise injection) and by the dark green arrow in Fig. 1(a) (introduction of chaotic behavior via negative self-feedback of the neurons).^{24,30}

## III. REALISTIC NOISE PROPERTIES OF MEMRISTIVE DEVICES

In the following, we analyze the realistic noise characteristics of various memristive systems (Fig. 2), which are considered key ingredients of the memristive HNNs’ operation.

### A. Typical noise spectra of memristive units

The *S*_{I}(*f*) spectral density of the current noise is defined as the $(\Delta I)^2|_{f_0,\Delta f}$ mean squared deviation of the current within a narrow Δ*f* band around the central frequency *f*_{0} normalized to the bandwidth, $S_I(f_0) = (\Delta I)^2|_{f_0,\Delta f}/\Delta f$; in practice, *S*_{I}(*f*) is calculated from the absolute value squared of the Fourier transform of the measured *I*(*t*) fluctuating current signal.^{27} In memristive devices, *S*_{I}(*f*) typically exhibits a Lorentzian spectrum [blue curve in Fig. 2(b)], a 1/*f*-type spectrum [pink curve in Fig. 2(b)], or a mixture of these two [purple curve in Fig. 2(b)]. In the former case, the noise is dominated by a single fluctuator introducing a steady-state resistance fluctuation with a typical time constant *τ*_{0}, yielding a spectrum that is constant at $2\pi f < \tau_0^{-1}$ and decays with 1/*f*^{2} at $2\pi f > \tau_0^{-1}$.^{27} If multiple fluctuators with different time constants contribute to the device noise, the Lorentzian spectra of the individual fluctuators sum up to a spectrum with *S*_{I} ∼ *f*^{−β}, where *β* is usually close to unity [pink noise, pink curve in Fig. 2(b)].^{27} Alternatively, a single fluctuator positioned at the device bottleneck may dominate the device noise, but a larger ensemble of more remote fluctuators may also make a significant contribution;^{27} this situation yields a mixture of Lorentzian and 1/*f*-type noise [purple curve in Fig. 2(b)]. Without any steady-state resistance fluctuations, a finite thermal noise is still observed, the latter exhibiting a constant (frequency-independent) spectrum (white noise) with *S*_{I} = 4*k*_{B}*TG*, where *k*_{B} is Boltzmann’s constant and *T* is the absolute temperature. Integrating the current noise over the frequency band of the measurement, the mean squared deviation of the current is obtained in this band, $(\Delta I)^2 = \int_{f_A}^{f_B} S_I(f)\,df$.
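As a numeric illustration of these definitions, the following sketch integrates a Lorentzian spectrum over a measurement band and checks the result against the closed form; the amplitude and time constant are arbitrary illustrative values:

```python
import numpy as np

def lorentzian_S_I(f, S0, tau0):
    """Lorentzian spectrum of a single fluctuator: flat for 2*pi*f < 1/tau0,
    decaying as 1/f^2 for 2*pi*f > 1/tau0."""
    return S0 / (1.0 + (2.0 * np.pi * f * tau0) ** 2)

# integrate S_I over the band [fA, fB] to obtain (dI)^2 in that band
fA, fB = 10.0, 250e3                     # Hz, the band used later in the paper
S0, tau0 = 1e-18, 1e-3                   # illustrative amplitude and time constant
f = np.logspace(np.log10(fA), np.log10(fB), 200_001)
y = lorentzian_S_I(f, S0, tau0)
dI2_numeric = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(f))   # trapezoid rule

# closed form: int S0/(1+(2 pi f tau0)^2) df = S0/(2 pi tau0) * arctan(2 pi f tau0)
dI2_analytic = S0 / (2 * np.pi * tau0) * (
    np.arctan(2 * np.pi * fB * tau0) - np.arctan(2 * np.pi * fA * tau0))
```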

### B. Proper metrics of the noise characteristics

At low enough subthreshold voltages, the memristive conductances exhibit steady-state fluctuations, i.e., the applied voltage is only used for the readout of the noise, but it does not excite any fluctuations. In this case, (Δ*I*)^{2} = (Δ*G*)^{2} · *V*^{2} holds according to Ohm’s law, i.e., the voltage-dependent current fluctuation is not a good measure of the noise properties. The Δ*I*/*I* relative current fluctuation, however, is already a voltage-independent metric of the fluctuations, which equals the relative fluctuation of the conductance or the resistance in the linear regime, Δ*I*/*I* = Δ*G*/*G* = Δ*R*/*R*.^{27} This metric will be used throughout the paper to describe the noise characteristics, where (Δ*G*/*G*)_{dynamic} describes the relative temporal fluctuations of a certain element of the memristor conductance matrix *G*_{i,j}(*t*). It is noted that (Δ*G*/*G*)_{dynamic} depends on the bandwidth. The high-frequency cutoff is determined by the integration time of the current readout (*τ*_{readout} = 2 *µ*s in our simulation, yielding *f*_{B} = 1/(2*τ*_{readout}) = 250 kHz), whereas the *f*_{A} = 10 Hz bottom end of the frequency band is determined by the time-period for which the network is operated (0.1 s in our simulation, corresponding to 10 000 iteration steps and a 4*τ*_{readout} waiting time between the current readout events, simulating the finite time of the neural updates and the multiplexing). Increasing the number of iteration steps would naturally increase (Δ*G*/*G*)_{dynamic}, but this dependence is characteristic of the nature of the noise spectrum. In the case of a Lorentzian spectrum, the noise amplitude hardly depends on the bandwidth once the $(2\pi\tau_0)^{-1}$ characteristic frequency of the fluctuator is well inside the band. This is consistent with the $(\Delta G)^2 \sim \arctan(2\pi f \tau_0)\big|_{f_A}^{f_B}$ relation for the Lorentzian spectrum. The other experimentally relevant 1/*f*-type spectrum yields (Δ*G*)^{2} ∼ ln(*f*_{B}/*f*_{A}), which is also a very weak dependence on the bandwidth, yielding only a ≈30% increase in (Δ*G*/*G*)_{dynamic} once the number of iteration steps is increased from 10^{4} to 10^{7}, i.e., the above bandwidth is increased by three orders of magnitude. According to these considerations, the results of our following simulations are only weakly dependent on our specific choice of bandwidth.
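The quoted ≈30% figure can be reproduced from the (Δ*G*)² ∼ ln(*f*_{B}/*f*_{A}) relation (a quick check, using the bandwidth values from the text):

```python
import numpy as np

# (dG)^2 ~ ln(fB/fA) for a 1/f spectrum: increasing the iteration count from
# 1e4 to 1e7 extends the run time 1000x, lowering fA from 10 Hz to 0.01 Hz.
fB = 250e3                  # Hz, fixed readout bandwidth
fA_short, fA_long = 10.0, 0.01

increase = np.sqrt(np.log(fB / fA_long) / np.log(fB / fA_short))
# increase is about 1.30, i.e. only a ~30% growth of (dG/G)_dynamic
```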

We also note that the above operation bandwidth was chosen to overlap the frequency range of the noise measurements. In principle, a significantly faster operation is also realistic, where *f*_{B} approaches GHz frequencies.^{24} Considering this enhanced bandwidth but keeping the number of 10 000 iteration steps fixed, we arrive at a frequency range where noise data are not available. In this range, it is hard to predict the noise properties for a memristive junction with Lorentzian character in the original frequency interval, as additional high-frequency fluctuators might become relevant and a 1/*f*-type contribution might dominate over the Lorentzian spectrum above a certain corner frequency. For a 1/*f*-type spectrum, however, it is reasonable to extrapolate the 1/*f*-type character to significantly higher frequencies. With this assumption, the (Δ*G*)^{2} ∼ ln(*f*_{B}/*f*_{A}) relation shows that (Δ*G*/*G*)_{dynamic} does not depend on the high-frequency cutoff alone but only on the *f*_{B}/*f*_{A} ratio, i.e., a GHz-range network would suffer from the same noise levels as the much slower simulated network (*f*_{B} = 250 kHz) once the same number of iterations is applied. On the other hand, the integrated contribution of the thermal noise background becomes more relevant at increasing frequencies: (Δ*G*)^{2} = (Δ*I*)^{2}/*V*^{2} ≈ 4*k*_{B}*TG* · *f*_{B}/*V*^{2}. This relation yields $\Delta G/G = \sqrt{4 k_B T f_B/G}\,/\,V$, which is plotted as a reference in Fig. 2(a) using *V* = 1 V and *f*_{B} = 250 kHz (black solid line) and *f*_{B} = 1 GHz (black dashed line). This comparison shows that in the original frequency range, the thermal noise background is orders of magnitude below the typical 1/*f*-type noise levels, and even at the envisioned GHz frequencies, the 1/*f*-type noise would slightly dominate over the thermal background once its extrapolation to the high-frequency range is valid.
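The thermal-noise floor estimate can be sketched as follows (room temperature assumed; the expression follows from ΔI/I with I = G·V):

```python
import numpy as np

kB = 1.380649e-23   # J/K, Boltzmann constant

def thermal_dG_over_G(G, fB, V=1.0, T=300.0):
    """Integrated thermal-noise floor: (dI)^2 = 4*kB*T*G*fB and I = G*V give
    dG/G = dI/I = sqrt(4*kB*T*fB/G) / V. Room temperature is assumed."""
    return np.sqrt(4.0 * kB * T * fB / G) / V

G0 = 7.748e-5                            # S, conductance quantum 2e^2/h
slow = thermal_dG_over_G(G0, 250e3)      # ~7e-6 at fB = 250 kHz
fast = thermal_dG_over_G(G0, 1e9)        # ~5e-4 at fB = 1 GHz
```

Both values sit well below the percent-level 1/*f* noise amplitudes discussed above, in line with the comparison in Fig. 2(a).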

### C. Variation of the noise with the device conductance

Several studies have pointed out that the relative noise amplitude of a memristive device exhibits a strong and specific dependence on the device conductance, i.e., the multilevel programmability is accompanied by the tuning of the relative noise level.^{26,28,40–48} Figure 2(a) shows four examples of this behavior, demonstrating the conductance-dependent noise characteristics of Ag_{2}S (green),^{27,28} Ta_{2}O_{5} (red),^{26} Nb_{2}O_{5} (orange),^{26} and SiO_{x} (blue) memristive units integrated for the same 10 Hz–250 kHz frequency band. It is clear that the overall noise amplitude, the characteristic conductance range of the operation, as well as the dependence of Δ*G*/*G* on the conductance, are a kind of device fingerprint, exhibiting significant differences between various material systems. However, a rather general trend of the noise characteristics can be identified: in the low-conductance region of the operation regime, Δ*G*/*G* is very weakly dependent on the conductance, whereas in the high conductance region, a strong Δ*G*/*G* ∼ *G*^{−γ} power-law dependence is typical.

In the latter case, metallic filamentary conduction is envisioned, where the relative noise amplitude obviously increases as the filament diameter is reduced.^{27} A rather generally observed tendency is related to the volume-distributed fluctuators in a diffusive filament, where Δ*G*/*G* ∼ *G*^{−3/2} was obtained from theoretical considerations.^{26,28} This is also confirmed by the experimental data in Fig. 2(a), where the validity of the *γ* = 3/2 exponent was verified for the Ag_{2}S, Ta_{2}O_{5}, and Nb_{2}O_{5} systems,^{26,28} whereas our new data on SiO_{x} memristors exhibit a somewhat shallower dependence with *γ* = 1.13 (see the dashed lines representing the best fitting tendencies with the given *γ* exponents). It is noted, however, that the *γ* exponent may depend on the transport mechanism, the device geometry, the dimensionality (2D/3D devices), as well as the distribution of the fluctuators (single or multiple fluctuators, surface or volume distributed fluctuators, etc.).^{27}

In contrast, the saturated noise characteristics in the low conductance regime are attributed to broken filaments, where a barrier-like transport is envisioned. In the simplest case of a tunnel barrier, the *G* = *A* · exp(−*α* · *d*) relation yields a conductance-independent Δ*G*/*G* = *α* · Δ*d* relative conductance noise for a constant Δ*d* fluctuation of the barrier width.^{27} More complex transport phenomena, such as the Frenkel–Poole mechanism^{49} or a hopping-type transport,^{50} require more sophisticated descriptions, but the overall trend, i.e., the independence or the very weak dependence of Δ*G*/*G* on *G*, is left unchanged due to the exponential dependence of the conductance on a relevant fluctuating parameter.

According to these considerations, in the following simulations we rely on a simplified noise model [see Fig. 2(c)], where Δ*G*/*G* is constant below a certain threshold conductance *G*_{C} [see the red barrier-like regime in Fig. 2(c)], whereas a general Δ*G*/*G* ∼ *G*^{−γ} power-law dependence is considered at *G* > *G*_{C} [see the blue metallic nanojunction regime in Fig. 2(c)]. The *G*_{OFF} and *G*_{ON} conductances of the memristive HNN can be fixed at arbitrary positions along this noise model, as demonstrated by the red and blue circles in Fig. 2(c). This simplified model has three free parameters: the *G*_{C} threshold, the *γ* slope, and the Δ*G*/*G* relative fluctuation in the barrier-like regime. Note that according to the experimental results in Fig. 2(a), the latter can reach a few tens of percent; *G*_{C} is not necessarily, but reasonably, close to the *G*_{0} = 2*e*^{2}/*h* conductance quantum unit, whereas Δ*G*/*G* can vary by up to three orders of magnitude in the metallic nanojunction regime. For the exponents, *γ* = 1.13–1.5 values are observed; however, we emphasize that fundamentally different slopes are also possible.^{27}
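The simplified noise model can be written as a single piecewise function; the parameter values below (*G*_{C} at the conductance quantum, *γ* = 3/2, 30% barrier-regime noise) are illustrative choices within the experimentally observed ranges:

```python
import numpy as np

def relative_noise(G, G_C=7.748e-5, gamma=1.5, dGG_barrier=0.3):
    """Simplified noise model of Fig. 2(c): constant dG/G below G_C (barrier-like
    regime), dG/G ~ G^(-gamma) above G_C (metallic nanojunction regime), matched
    continuously at G_C. The parameter values here are illustrative."""
    G = np.asarray(G, dtype=float)
    return np.where(G <= G_C, dGG_barrier, dGG_barrier * (G / G_C) ** (-gamma))
```

For *γ* = 3/2, the model decays by three decades of Δ*G*/*G* per two decades of conductance above *G*_{C}, matching the span seen in Fig. 2(a).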

## IV. SIMULATION OF MEMRISTIVE HNNS WITH REALISTIC NOISE CHARACTERISTICS

We have simulated memristive HNNs with realistic noise characteristics relying on the standardized Biq Mac Library,^{51} which provides exact globally optimal energies for max-cut instances of undirected and unweighted graphs of sizes *n* ∈ {60, 80, 100}. Following Ref. 24, we studied Erdős–Rényi graphs with a 50% connection probability between the vertices. We have simulated the HNNs starting from *K* = 200 randomly picked initial state vectors and performing *N* = 10 000 iterations in each run. The neurons were iterated in a predetermined random order. The *K* runs are evaluated according to two figures of merit: the proportion of runs where the network was in the globally optimal state at the *N*th step (the $P_{\text{conv}}$ convergence probability) and the number of edges between the two subsets (i.e., the *C* number of cuts) after *N* iteration steps averaged over the *K* random initial vectors, $\overline{C} = \frac{1}{K}\sum_{i=1}^{K} C\big(\underline{x}_i^{(N)}\big)$.

Instead of the ideal “1” and “0” values of the *W*_{i,j} weight matrix, realistic *G*_{i,j} conductances of the memristive HNN were used in the simulations. At the matrix positions with a value of “1” in the original problem, an average conductance of *G*_{ON} was applied, considering both (Δ*G*/*G*)_{static} device-to-device variations and (Δ*G*/*G*)_{dynamic} temporal fluctuations around this mean value. To simulate the latter, independent *G*(*t*) time traces were generated for all the memristive elements in the ON state using either the Lorentzian, pink, or white noise spectrum. Carson’s theorem and method were applied^{52} to generate the *G*(*t*) temporal noise traces (i.e., temporal conductance variations) from the chosen *S*_{G}(*f*) noise spectrum.
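The paper generates the *G*(*t*) traces with Carson's method;^{52} as a stand-in, the sketch below uses a common random-phase spectral-synthesis approach, which likewise produces a trace with a prescribed spectrum (function names and parameter values are hypothetical):

```python
import numpy as np

def colored_noise_trace(S_of_f, n_samples, fs, rng):
    """Zero-mean time trace whose one-sided PSD follows S_of_f, generated by
    random-phase spectral synthesis (a stand-in for Carson's method used in
    the paper). |X_k|^2 = S(f_k) * fs * n / 2 matches the periodogram norm."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    amp = np.zeros_like(freqs)
    amp[1:] = np.sqrt(S_of_f(freqs[1:]) * fs * n_samples / 2.0)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    spectrum = amp * np.exp(1j * phases)
    spectrum[0] = 0.0                       # enforce zero mean
    return np.fft.irfft(spectrum, n=n_samples)

rng = np.random.default_rng(3)
fs, n = 500e3, 2 ** 16                      # 500 kS/s sampling -> fB = 250 kHz
pink = colored_noise_trace(lambda f: 1e-6 / f, n, fs, rng)   # 1/f spectrum
G_ON = 1e-4
G_trace = G_ON * (1.0 + pink)               # fluctuating ON-state conductance
```

Lorentzian or white traces follow by swapping the `S_of_f` argument for the corresponding spectral shape.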

According to the HNN scheme, the diagonal elements of the crossbar matrix (self-connections) are set to exactly zero. This can be physically implemented by omitting the memristors at the diagonal positions, either by switching off their transistor in a 1T1R arrangement^{53–55} or by omitting their electroforming procedure. In Sec. IV F, however, we also discuss the case of finite self-feedback, which introduces a chaotic nature to the network.

The off-diagonal elements of the weight matrix with values of “0” are generally represented by the *G*_{OFF} conductance with the corresponding relative conductance noise. In some of the simulations, however, *G*_{OFF} = 0 is applied according to the following considerations.

Finally, we note that our simulations place emphasis on the noise properties of the network, and otherwise the network operation is simulated as an ideal mathematical HNN according to the update and activation rules in Eqs. (1) and (2), using *θ*_{i} = 0 for the max-cut problem. This means that the *x*_{i} neural state vector is represented by the *V*_{i} input voltage vector of the memristive HNN, and the *a*_{i} activation vector is represented by the output current vector, which is the product of the input voltage vector and the *G*_{i,j} conductance matrix. The latter includes the above discussed realistic conductance and noise properties. In the simulations, numerical input voltage levels of ± 1 V are applied; however, we note that both the relative noise levels and the results of the HNN operation are invariant to the applied voltage once the external parameters (external noise amplitude, strength of the self-feedback, and the *θ*_{j} threshold values) are properly scaled with the voltage. This idealized network operation with realistic conductance and noise characteristics can be considered an idealized circuit model, where the wire resistances as well as the resistances of the selector elements and the output and input resistances of the word line drivers and current sensing circuits at the bit lines are neglected, and a fully linear operation is considered for all the components of the memristive crossbar array. We believe that this choice of simplification facilitates the fundamental understanding of the role of internal noise in memristive HNNs; nevertheless, the proposed simplified noise model can also be implemented in more realistic circuit simulations as long as steady state noise in the linear transport regime is considered. 
Outside of this regime, i.e., if either significant nonlinearities appear in the *I*(*V*) curves of the memristive elements at the applied voltages or if the applied voltages excite additional fluctuations compared to the steady-state noise, nonlinear noise spectroscopy measurements would be required to realistically describe the voltage dependent noise properties.^{27,29}
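Putting the above ingredients together, an idealized iteration loop can be sketched as follows; *G*_{ON} = 1 in arbitrary units, *G*_{OFF} = 0, white multiplicative conductance noise, and the sign convention *W* = −*A* for the max-cut encoding are assumptions of this sketch, not the authors' exact implementation:

```python
import numpy as np

def simulate_hnn_maxcut(A, n_steps=10_000, dGG=0.138, seed=0):
    """Idealized memristive HNN for max-cut with G_OFF = 0, G_ON = 1 (arbitrary
    units), white multiplicative conductance noise of relative amplitude dGG,
    and theta = 0. The sign inversion in the update implements W = -A, so
    minimizing the Hopfield energy maximizes the cut."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    G = A.astype(float)                      # "1" -> G_ON = 1, "0" -> G_OFF = 0
    np.fill_diagonal(G, 0.0)                 # no self-connections
    x = rng.choice([-1.0, 1.0], size=n)
    for _ in range(n_steps):
        j = rng.integers(n)
        G_col = G[:, j] * (1.0 + dGG * rng.standard_normal(n))  # noisy column read
        a_j = G_col @ x                      # bit-line current ~ activation
        x[j] = 1.0 if a_j <= 0.0 else -1.0   # anti-align: W = -A convention
    cut = 0.25 * np.sum(A * (1 - np.outer(x, x)))
    return x, cut

rng = np.random.default_rng(4)
n = 30
A = np.triu((rng.random((n, n)) < 0.5).astype(float), 1)
A = A + A.T
x_final, cut = simulate_hnn_maxcut(A)
```

The default `dGG = 0.138` mirrors the ≈13.8% optimal relative noise level reported later in the text.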

### A. Contribution of the OFF and ON state elements to the current output and the noise of the memristive HNN

In the following, we provide simple considerations on the relative current and current-fluctuation contributions of the ON and OFF state elements, which help identify the most relevant contributions.

*Relative current contribution of OFF-state memristive elements in the crossbar.* For bit line *j*, the number of “1” values in the original weight matrix is denoted by *d*_{j}, yielding an average value of $\overline{d_j} = (n-1) \cdot p_c$ according to the random connection probability between the *n* vertices, for which *p*_{c} = 0.5 is applied in the following. The current contribution of the ON and OFF state elements in a certain bit line *j*, however, also depends on the distribution of the “+1” and “−1” values in the *x*_{i} state vector, which varies along the operation. For the ensemble average of the adjacency matrices with the same random connection probability, however, an ensemble-averaged current can be calculated,

$$\overline{I_j} = \underbrace{\sum_i x_i \cdot |V| \cdot p_c \cdot G_{\mathrm{ON}}}_{\overline{I_{\mathrm{ON},j}}} + \underbrace{\sum_i x_i \cdot |V| \cdot (1-p_c) \cdot G_{\mathrm{OFF}}}_{\overline{I_{\mathrm{OFF},j}}}, \tag{5}$$

from which the $\overline{I_{\mathrm{OFF},j}}/\overline{I_{\mathrm{ON},j}} = (G_{\mathrm{OFF}}/G_{\mathrm{ON}}) \cdot (1-p_c)/p_c$ ratio gives an indication of the OFF and ON state memristors’ relative current contribution in column *j*. For the special case of *p*_{c} = 0.5, this simplifies to $\overline{I_{\mathrm{OFF}}}/\overline{I_{\mathrm{ON}}} = G_{\mathrm{OFF}}/G_{\mathrm{ON}}$. This demonstrates that at a large enough ON/OFF conductance ratio (e.g., *G*_{ON}/*G*_{OFF} > 100), the replacement of *G*_{OFF} by zero is a reasonable simplification for a densely connected graph. Later on (Sec. IV E 2), we numerically analyze how a non-zero *G*_{OFF} value modifies the network operation at moderate ON/OFF conductance ratios.

*Relative noise contribution of OFF-state memristive elements in the crossbar.* Whereas the current in a certain bit line strongly depends on the actual *x*_{i} state vector values, the mean squared deviation of the current is independent of that and can be exactly deduced once the *d*_{j} number of ON-state elements in column *j* is known,

$$(\Delta I)_j^2 = \sum_{i\,(i\neq j)} (\Delta G)_{j,i}^2 \cdot |V_i|^2 = \underbrace{(\Delta G)_{\mathrm{OFF}}^2 \cdot (n-d_j-1) \cdot |V|^2}_{(\Delta I)_{\mathrm{OFF},j}^2} + \underbrace{(\Delta G)_{\mathrm{ON}}^2 \cdot d_j \cdot |V|^2}_{(\Delta I)_{\mathrm{ON},j}^2}. \tag{6}$$

From this, the relative noise contributions of the OFF and ON state elements in bit line *j* can be calculated considering our simplified noise model [Fig. 2(c)]. First, we treat the *mixed barrier-like and metallic* regime, where *G*_{ON} > *G*_{C} > *G*_{OFF}, yielding

$$\frac{\Delta I_{\mathrm{OFF},j}}{\Delta I_{\mathrm{ON},j}} = \frac{G_{\mathrm{OFF}}}{G_C}\left(\frac{G_C}{G_{\mathrm{ON}}}\right)^{1-\gamma} \cdot \sqrt{\frac{n-d_j-1}{d_j}}. \tag{7}$$

Note that the square-root term gives unity once *d*_{j} is replaced by its average value at 50% connection probability. This formula yields a negligible OFF-state noise contribution for arbitrary *γ* once *G*_{OFF} is chosen deep in the barrier-like regime (*G*_{OFF}/*G*_{C} ≪ 1), whereas *G*_{ON} remains reasonably close to *G*_{C}.
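Equation (7) can be cross-checked numerically against the piecewise noise model; all parameter values below are illustrative:

```python
import numpy as np

# Numeric consistency check of Eq. (7) in the mixed regime, using the piecewise
# noise model dG/G = c for G <= G_C and c*(G/G_C)^(-gamma) for G > G_C.
c, gamma = 0.3, 1.5             # illustrative barrier-regime noise and exponent
G_C = 7.748e-5                  # S
G_OFF, G_ON = 1e-8, 3e-4        # OFF deep in the barrier-like regime, ON metallic
n, d_j = 100, 50                # 50% connection probability -> sqrt term ~ 1

dG_OFF = c * G_OFF                                # barrier-like regime
dG_ON = c * (G_ON / G_C) ** (-gamma) * G_ON       # metallic regime
ratio_direct = dG_OFF / dG_ON * np.sqrt((n - d_j - 1) / d_j)

ratio_eq7 = (G_OFF / G_C) * (G_C / G_ON) ** (1.0 - gamma) * np.sqrt((n - d_j - 1) / d_j)
# ratio << 1: negligible OFF-state noise contribution in the mixed regime
```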

The situation changes for *G*_{ON} > *G*_{OFF} > *G*_{C} (*pure metallic regime*). In the metallic nanojunction regime, the memristive elements exhibit much more linear subthreshold *I*(*V*) characteristics than in the barrier-like regime, which is a favorable property for the high-precision vector-matrix multiplication operation of the memristive crossbar. This limit yields

$$\frac{\Delta I_{\mathrm{OFF},j}}{\Delta I_{\mathrm{ON},j}} = \left(\frac{G_{\mathrm{ON}}}{G_{\mathrm{OFF}}}\right)^{\gamma-1} \cdot \sqrt{\frac{n-d_j-1}{d_j}}, \tag{8}$$

i.e., the OFF-state elements dominate the noise contribution for any *γ* > 1 value, i.e., for all the memristive units demonstrated in Fig. 2(a). Furthermore, in this pure metallic operation regime, the *G*_{ON}/*G*_{OFF} conductance ratio is restricted to rather limited values, spanning one order of magnitude in the diffusive regime of Ag_{2}S, Ta_{2}O_{5}, and Nb_{2}O_{5} memristors and less than two orders of magnitude in SiO_{x} memristors [see Fig. 2(a)], which may distort the network operation compared to networks operated with orders of magnitude larger *G*_{ON}/*G*_{OFF} ratios in the mixed barrier-like and metallic regime [Eq. (7)], i.e., the choice of the operation regime is a trade-off between high-precision linearity and the proper representation of the “0” values in the weight matrix. In the following subsections, we discuss the results of our simulations for both operation regimes using max-cut benchmarks with 50% connection probability and a noise model with *γ* = 3/2. Note, however, that the above formulas are also appropriate to discuss more general situations, including arbitrary conductances and *γ* scaling exponents. Furthermore, arbitrary graphs (i.e., any *d*_{j} and *n* values) can also be analyzed, where the replacement of the applied dense graph with a sparse graph (*d*_{j}/*n* ≪ 1/2) would yield further enhancement of the OFF noise contribution.

### B. Optimal noise level and the role of the noise color

We have simulated max-cut problems using graphs with 50% connection probability and different sizes (*n* = 60, 80, 100, 300). First, the mixed barrier-like and metallic operation regime is analyzed [Eq. (7)] with *G*_{OFF}/*G*_{ON} ≪ 1, and according to the above considerations, *G*_{OFF} = 0 was chosen, whereas a finite *G*_{ON} value with variable noise was applied. The time-averaged conductance was the same for all the ON elements, i.e., (Δ*G*/*G*)_{static} = 0 was applied. Simulations were run with three different noise types: Lorentzian noise (blue symbols in Fig. 3), 1/*f* noise (pink symbols), and white noise (gray symbols). The related noise spectra and time traces generated from these spectra are shown in Figs. 3(c) and 3(d). For all three spectra, the (Δ*G*/*G*)_{dynamic} metric was used to measure the relative noise.

Figures 3(a) and 3(b) demonstrate the $P_{conv}$ convergence probability and the $\bar{C}$ average number of cuts after *N* iteration steps for the 60 × 60 benchmark max-cut problem also applied in Ref. 24. The red line in panel (b) shows the maximum number of cuts, i.e., the global solution of the problem.

It is clear that at zero noise level, the convergence probability is poor ($P_{conv} = 1.5\%$), and the achieved number of cuts is far from the global solution. As the relative amplitude of the dynamic noise is increased, the convergence probability [Fig. 3(a)] exhibits a stochastic resonance phenomenon similar to the results of Ref. 24: irrespective of the noise color, $P_{conv}$ shows a peak at (Δ*G*/*G*)_{dynamic} ≈ 13.8%, leading to a $P_{conv} = 40{-}50\%$ chance of convergence. This implies that at a lower noise level, the system sticks to local minima, which prevents convergence to the global solution, whereas at a high noise level, the system is able to escape from the global minimum, which also hampers the convergence. Interestingly, the results of the simulation are very similar for the different noise colors, i.e., the temporal correlations in the noise spectra are irrelevant, and (Δ*G*/*G*)_{dynamic} appears to be a proper, noise-type independent metric to find the optimal noise level. This also allows the simplification of the simulations by generating easily computed white noise spectra. Furthermore, we emphasize that the best (Δ*G*/*G*)_{dynamic} ≈ 13.8% noise level corresponds to the top end of the experimentally observed relative noise values [Fig. 2(a)], i.e., the experimentally relevant noise levels do not hamper the network operation; on the contrary, they might not even be sufficient to realize the optimal noise level if stochasticity is solely introduced by the noise of the memristor elements of the crossbar matrix.
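The stochastic resonance described above can be reproduced qualitatively with a toy model: an asynchronous Hopfield iteration for max-cut in which each readout of a weight row is perturbed by multiplicative Gaussian noise of relative amplitude (Δ*G*/*G*)_{dynamic}. This is a hedged sketch under our own simplifying assumptions (white noise, a random dense graph, our own sign conventions), not the paper's simulation code:

```python
import numpy as np

def cut_value(W, s):
    """Number of cut edges for a +/-1 partition s of the graph W."""
    return int(np.sum(W * (1 - np.outer(s, s))) // 4)

def maxcut_hnn(W, n_steps=2000, noise_rel=0.138, rng=None):
    """Asynchronous Hopfield iteration minimizing s^T W s (i.e., maximizing
    the cut). Each readout of a weight row is perturbed multiplicatively,
    mimicking the (dG/G)_dynamic conductance noise of the ON cells.
    Returns the best cut found along the iteration."""
    rng = rng or np.random.default_rng()
    n = W.shape[0]
    s = rng.choice([-1, 1], size=n)
    best = cut_value(W, s)
    for _ in range(n_steps):
        i = rng.integers(n)                      # random neuron update
        row = W[i] * (1 + noise_rel * rng.standard_normal(n))
        s[i] = 1 if -(row @ s) >= 0 else -1      # threshold update
        best = max(best, cut_value(W, s))
    return best

# random dense graph with 50% connection probability
rng = np.random.default_rng(1)
n = 30
A = np.triu(rng.random((n, n)) < 0.5, k=1).astype(int)
W = A + A.T

noisy = maxcut_hnn(W, noise_rel=0.14, rng=np.random.default_rng(2))
quiet = maxcut_hnn(W, noise_rel=0.0, rng=np.random.default_rng(2))
```

With the noise switched on, the iteration can keep improving the best-found cut after the noiseless dynamics has frozen into a local minimum.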

We have repeated these simulations for numerous benchmark problems from the Biq Mac library spanning matrix sizes of 60 × 60 [circles in Fig. 3(e)], 80 × 80 (pentagons), and 100 × 100 (stars), using white, pink, and Lorentzian spectra (gray, pink, and blue symbols). At the larger matrix sizes, only white noise was applied. For these problems, the symbols in Fig. 3(e) represent the relative noise values where the convergence probability is maximal. Furthermore, we have generated an even larger weight matrix (300 × 300, "+" symbol in the last column). Here, the global solution is not known; therefore, the symbol represents the noise value where $\bar{C}$ is maximal. Whereas the convergence probability and $\bar{C}$ strongly vary for the different problems, the optimal noise level scatters around a common Δ*G*/*G* = 13.2% average value (horizontal solid line) with a small variance of 2.6% (horizontal dashed lines). This analysis does not show any systematic tendency as a function of the matrix size; even the largest matrix with 90 000 memristor elements exhibits optimal operation close to this average value. We note that the system-size dependence of the optimal noise level was analyzed in Ref. 24 as well. However, in the latter analysis, the current noise of the entire bit line was considered. The system-size independent optimal noise level of the individual devices [Fig. 3(e)] yields a bit line current variance scaling with the square root of the array size due to the $\Delta I_j \sim \sqrt{d_j}$ relation [see Eq. (6)], i.e., the results of Ref. 24 on the optimal noise level [Fig. 5(c) in Ref. 24] are consistent with our observations.

### C. Annealing schemes

The greatest challenge with randomization algorithms is that stochasticity helps to escape local minima, but there is no guarantee that the system will stay at the global minimum once it is first reached. A common approach to overcome this difficulty is to “cool” the system, i.e., gradually decrease the extent of stochastic behavior during the optimization process. For an experimentally realized HNN, the straightforward method is to harvest and tune the inherent device noise utilizing the multilevel programmability of the conductance states.

As demonstrated in Ref. 24, the optimal trend for the cooling process in a HNN is superlinear. We have implemented this cooling scheme in our simulations, applying a parameterless superlinear annealing protocol to the stochastic variation of the conductance: the Δ*G*(*t*) noise signal is generated according to the chosen spectrum and the initial Δ*G*/*G* value, and the noise signal is attenuated accordingly as the iterations evolve. The resulting temporal decrease in the noise amplitude is illustrated by the pink curve in Fig. 4(d).

As illustrated in Fig. 4(c), the OFF-state conductance (red dot) is chosen deep in the barrier-like regime (and accordingly, *G*_{OFF} = 0 is applied), while the ON-state (blue dot) is prepared in the metallic regime with non-zero dynamic noise. During the *N* steps, the blue dot is moved toward higher conductances so that the relative dynamic noise gradually decreases [Fig. 4(d)]. All simulations were performed using experimentally motivated pink noise.

The results achieved by this continuous annealing protocol are shown as pink circles in Figs. 4(a) and 4(b). Here, the horizontal axis represents the initial noise value. To compare this annealing scheme to the network operation with constant noise, the corresponding results from Figs. 3(a) and 3(b) are reproduced as pink lines. It is clear that an annealing procedure started at a high enough noise level delivers significantly better convergence probability than the constant noise simulation using the optimal noise level, which is consistent with the observations in Ref. 24. However, the results plotted in Figs. 4(a) and 4(b) demonstrate an unexpected phenomenon: if the annealing is started from a noise level at or below the optimal 13.8% constant noise level, the convergence probability does not show any improvement compared to the related constant noise simulation. This implies an important conclusion: it is not vital to decrease the noise level well below the optimal noise level during the annealing process; however, it is beneficial if the annealing is started at a higher noise level than the optimal constant noise level. In other words, the optimal constant noise level does not cause a significant escape probability from the global solution, whereas an initially higher noise level helps to escape from the local minima, driving the system more efficiently toward the global solution.

Utilizing this finding, we propose a highly simplified double-step annealing protocol [orange illustrations in panels (c) and (d)], where the noise level is decreased to 2/3 and 1/3 of its initial value at 1/3 and 2/3 of the iteration steps, respectively. According to panels (a) and (b), this simplified annealing protocol (orange symbols) delivers results similar to those of continuous annealing. This is highly beneficial for the network operation, as continuous noise annealing would be a demanding task due to the frequent reprogramming of all memristive cells. The two reprogramming events along all the iteration steps are a reasonable trade-off between the time-consuming continuous annealing and the constant noise operation, where the convergence probability is worse and it is unrealistic to precisely know the optimal noise level in advance.
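The two cooling protocols can be expressed as noise-amplitude schedules. In the sketch below, a quadratic decay stands in for the parameterless superlinear protocol (the paper's exact functional form is not reproduced here), while the double-step schedule follows the 2/3–1/3 rule described above:

```python
import numpy as np

def superlinear_schedule(n_steps, a0):
    """Illustrative superlinear cooling: the amplitude falls faster than
    linearly (a quadratic decay stands in for the paper's protocol)."""
    k = np.arange(n_steps)
    return a0 * (1 - k / n_steps) ** 2

def double_step_schedule(n_steps, a0):
    """Simplified protocol: the amplitude drops to 2/3 and 1/3 of its
    initial value at 1/3 and 2/3 of the iteration steps, i.e., only two
    reprogramming events are needed."""
    k = np.arange(n_steps)
    return a0 * np.where(k < n_steps // 3, 1.0,
                         np.where(k < 2 * n_steps // 3, 2 / 3, 1 / 3))

cont = superlinear_schedule(9000, a0=0.2)
step = double_step_schedule(9000, a0=0.2)
```

The double-step array takes only three distinct values, which is exactly why it is cheap to realize in hardware compared to a per-iteration reprogramming of the crossbar.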

Finally, we note that a single weight update in a 60 × 60 crossbar array requires 3600 write operations, which is expected to take as much or even more time than the overall 10 000 iteration steps of the network operation. This means that any annealing protocol requiring many more weight-update steps than the above simplified double-step annealing would yield a performance where the write operations of the weight updates require significantly more time than all the current readout and neural state update operations, i.e., the actual execution time.

### D. Memristive HNN operated in the metallic nanojunction regime

In the discussion of Eq. (8), we have seen that the noise of the OFF-state elements dominates the network once both the OFF and the ON states are positioned in the metallic nanojunction regime, which is described by a *γ* > 1 exponent. Next, we analyze the network operation in this *pure metallic regime* by varying the dominant Δ*G*_{OFF}/*G*_{OFF} relative noise level using *G*_{ON}/*G*_{OFF} = 10 [purple symbols in Figs. 5(a) and 5(b)] and *G*_{ON}/*G*_{OFF} = 100 [orange symbols in Figs. 5(a) and 5(b)] conductance ratios and *γ* = 3/2. Here, the ON state noise level is also simulated according to the scaling in Eq. (7). In this case, the convergence probability and $\bar{C}$ [Figs. 5(a) and 5(b)] exhibit significantly worse results even at the highest 30% relative noise than the optimal network operation in Figs. 3(a) and 3(b) at the 13.8% relative ON state noise level. This result, however, is obvious from Eqs. (6) and (7). According to these formulas, arbitrary OFF and ON state noise levels along the noise model can be converted to an equivalent situation where the OFF elements are noiseless but the equivalent relative ON-state noise, $(\Delta G_{ON}/G_{ON})_{equivalent}$, is set such that the overall current noise of the given bit line remains the same. According to Eq. (7), the *γ* = 3/2 and *p*_{c} = 0.5 parameters yield $(\Delta G_{ON}/G_{ON})_{equivalent} = (\Delta G_{OFF}/G_{OFF})/\sqrt{1+G_{ON}/G_{OFF}}$. In Figs. 5(c) and 5(d), the results in Figs. 3(a), 3(b), 5(a), and 5(b) are plotted as a function of the equivalent ON-state noise level, demonstrating that the curves indeed follow the same tendency. From this, we can conclude that the pure metallic nanojunction regime yields the dominance of the OFF-state elements' noise; however, a given OFF-state noise corresponds to a significantly smaller equivalent ON-state noise, i.e., even the largest 30% OFF-state noise is too small to reach optimal operation.
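This conversion is easy to evaluate numerically. In the sketch below, the square-root form of the denominator is our reading of the conversion formula, with the *γ* = 3/2 and *p*_{c} = 0.5 parameters quoted in the text:

```python
import math

def equivalent_on_noise(dg_off_rel, g_on_over_g_off):
    """Equivalent ON-state relative noise for a given OFF-state relative
    noise (gamma = 3/2, p_c = 0.5; square-root form of the denominator
    assumed)."""
    return dg_off_rel / math.sqrt(1.0 + g_on_over_g_off)

# even the largest 30% OFF-state noise maps to a small equivalent ON noise
for ratio in (10, 100):
    print(ratio, equivalent_on_noise(0.30, ratio))
```

Both conductance ratios map even the largest 30% OFF-state noise well below the ≈13.8% optimum, in line with the tendency shown in Figs. 5(c) and 5(d).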

### E. Further non-idealities

After a detailed analysis of the memristive HNNs' noise properties and their impact on the network operation, we analyze the role of further device non-idealities, such as the programming inaccuracy and the finite *G*_{OFF} conductance.

#### 1. Programming inaccuracy

In Figs. 6(a) and 6(b), we analyze the role of the (Δ*G*/*G*)_{static} measure of the programming inaccuracy, i.e., the device-to-device variance of the time-averaged conductance normalized to the average conductance. Here, we also consider the mixed barrier-like and metallic regimes using the approximation of *G*_{OFF} = 0, i.e., solely analyzing the programming inaccuracy of the ON-state conductances. The network’s operation for an increasing (Δ*G*/*G*)_{static} is demonstrated in Figs. 6(a) and 6(b) with no dynamical noise (pink line) and constant pink noise at optimal amplitude (pink circles).

In the noiseless network, device-to-device variations up to ≈15% leave the poor noiseless network performance practically unchanged, whereas larger (Δ*G*/*G*)_{static} makes the network operation even worse. Here, it is to be emphasized that static device-to-device variations seemingly produce a stochastic deviation of the bit line currents from the expected values of the original ideal HNN along the temporal evolution of the neural states.^{24} However, a finite (Δ*G*/*G*)_{static} only deforms the weight matrix of the HNN, and still an ideal noiseless HNN is realized. This means that a finite (Δ*G*/*G*)_{static} and (Δ*G*/*G*)_{dynamic} = 0 yield a modified ideal HNN, where the energy can only be reduced along the operation, yielding similar dead ends in the local minima as the original noiseless HNN. Therefore, device-to-device variations are not suitable for performance enhancement in the memristive HNN: either true stochasticity (noise) is required, or a non-ideal HNN with somewhat chaotic energy trajectories should be realized. The latter is possible by the introduction of diagonal feedback (see Sec. IV F 2), and presumably nonlinear device characteristics also yield similar non-ideal chaotic behavior; the latter, however, is not analyzed in this paper.

It is also interesting to analyze the role of programming inaccuracy if it is accompanied by optimal dynamical noise characteristics. According to the pink circles in Fig. 6(a), already (Δ*G*/*G*)_{static} > 0.025 values yield a sharp decrease in the convergence probability. An annealed network would show a very similar decay of the convergence probability (not shown). Accordingly, proper programming accuracy is vital for the network. Such accuracy has already been experimentally demonstrated in memristors with 2048 distinct conductance levels (corresponding to 11-bit resolution), where a special denoising process was applied to maximize programming accuracy.^{20} The states were programmed between 50 and 4144 *μ*S with a 2 *μ*S resolution, which roughly corresponds to Δ*G*/*G* ≈ 0.0005 − 0.04, i.e., if the network is operated at the high conductance end of this conductance regime, the envisioned (Δ*G*/*G*)_{static} < 0.025 condition [see Fig. 6(a)] is easily satisfied even at a much worse conductance resolution.
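The quoted Δ*G*/*G* range follows from simple arithmetic on the 2 *μ*S resolution and the 50–4144 *μ*S programming window:

```python
# relative programming accuracy from a fixed 2 uS conductance resolution
resolution_uS = 2.0
g_low_uS, g_high_uS = 50.0, 4144.0

worst = resolution_uS / g_low_uS     # low-conductance end of the window
best = resolution_uS / g_high_uS     # high-conductance end of the window
print(f"dG/G from {best:.4f} to {worst:.4f}")
```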

#### 2. Finite OFF conductance

We have seen that from the noise perspective, one can always find an equivalent picture where the OFF state is noiseless, i.e., in this sense, the partitioning of the noise between the ON and OFF elements is irrelevant; just the overall noise matters. However, even at zero noise, a finite OFF-state conductance may modify the network operation due to the imperfect representation of zero states. To analyze this, simulations were run at different *G*_{OFF}/*G*_{ON} values—ranging from 0 to 0.3—with no dynamical noise and constant pink noise with an optimal equivalent amplitude. The results can be seen in Figs. 6(c) and 6(d). No significant change is detected in a noiseless network (pink line), but in a network with optimal equivalent noise, the finite (*G*_{OFF}/*G*_{ON}) > 0.1 values already yield a modest but significant reduction in the convergence probability (pink circles).

### F. Externally induced stochasticity in memristive HNNs with suboptimal internal noise level

#### 1. External injection of current noise

The external noise injection scheme supplements the activation of neuron *j* with a zero-mean *ξ*_{j} random variable of *σ*_{j} variance representing the external noise injection (note that for the max-cut problem, *θ*_{j} = 0 applies). As proposed in Ref. 56, an additional memristive crossbar line with high amplitude tunable noise characteristics could be applied to tailor the noise level in the bit lines separately. Here, we apply an even simpler scheme with a single external memristive (or non-memristive) tunable noise source representing the *σ* variance of the *ξ* random variables. Along with the multiplexing, this external noise is added to the randomly chosen bit line. The green line (green symbols) in Figs. 7(a) and 7(b) represents the convergence probability and $\bar{C}$ as a function of *σ* for constant (annealed) external noise amplitudes. These curves highly resemble the results where a constant (annealed) internal noise of the crossbar elements was applied [Figs. 3(a), 3(b), 4(a), and 4(b)]. This is, however, an easily deducible correspondence, as the *σ*_{j} variance of the external noise can be converted to an $(\Delta G_{ON}/G_{ON})_{equivalent,j} = \sigma_j/\sqrt{d_j}$ ON-state equivalent noise level in bit line *j*, considering the mixed barrier-like and metallic regime with neglected OFF-state noise. Using this conversion factor with *σ*_{j} = *σ* and replacing *d*_{j} with its average value, we can plot the results of Figs. 3(a), 3(b), 4(a), and 4(b) in Fig. 7 (see the rescaled top horizontal axis). With this rescaling, the constant and annealed internal memristor noise (purple line and purple circles in Fig. 7) indeed exhibit a highly similar effect on the network performance as the equivalent externally injected constant or annealed noise (green line and green circles). This also means that our above results on internal noise contributions are all directly transferable to the case of external noise injection. Furthermore, in the case of external noise injection, the noise annealing is substantially less demanding than having to reprogram all the memristive elements to anneal the internal noise.
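The rescaling of the top horizontal axis of Fig. 7 can be sketched as follows; the mean degree of a 60-node graph with 50% connection probability is used as an illustrative stand-in for the average *d*_{j}, and the square-root form of the conversion is assumed analogously to the bit-line noise scaling:

```python
import math

def equivalent_on_noise_from_injection(sigma, mean_degree):
    """Convert an injected bit-line noise variance sigma to an equivalent
    ON-state relative conductance noise (square-root scaling assumed)."""
    return sigma / math.sqrt(mean_degree)

mean_degree = 0.5 * (60 - 1)                # ~50% connectivity, n = 60
sigma_opt = 0.138 * math.sqrt(mean_degree)  # sigma matching the ~13.8% optimum
```

Inverting the conversion this way gives the external noise amplitude that should reproduce the optimal internal noise level of the crossbar elements.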

#### 2. Chaotic annealing through diagonal feedback

In addition to a well-tailored internal or external noise injection, there are other, fundamentally different strategies to help find the optimal solution in HNN-based problem-solving. An alternative approach relies on the introduction of negative diagonal feedback, which was shown to introduce chaotic behavior to the network.^{25,30,57} The latter can also be utilized to escape from local minima in the energy landscape. This diagonal feedback strategy was applied in various studies, including the demonstration of quantum-inspired annealing schemes in memristive crossbar arrays^{58,59} and the fine control of the perturbations using three-terminal synaptic circuit elements.^{59} Furthermore, it was shown that the internal noise of the network can be attenuated or magnified by a tunable hysteretic threshold circuitry, which also induces self-feedback to the network.^{24} As a basis for comparison with the main topic of our article, i.e., the role of experimentally relevant noise characteristics of the memristive elements, here we also briefly summarize and simulate the diagonal-feedback-based chaotic annealing scheme.

The introduction of a diagonal self-feedback of *w* strength [see the dark green arrow in Fig. 7(c) and also in Fig. 1(a)] modifies the update rule by adding a *w* · *x*_{j} self-feedback term to the activation of neuron *j*, where *w* is the tunable strength of the self-feedback. Positive *w* values drive the network toward the stabilization of the actual states, i.e., if a certain neuron would change its state at *w* = 0, it is possible that at *w* > 0 it does not change its state: for a +1 (−1) neural state, the state-change would require negative (positive) activation, but the self-feedback shifts the threshold toward positive (negative) values, which works against the change of the actual state. By similar argumentation, a negative *w* yields neural updates in situations where the ideal Hopfield network (*w* = 0) would not yield a change in the neural state. This simple argumentation also demonstrates that positive (negative) self-feedback attenuates (magnifies) the role of the internal noise in a memristive HNN, which is demonstrated and derived in detail in Ref. 24.
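The threshold-shifting argument can be captured in a few lines. The sign conventions below (minimizing *s*^{T}*Ws*, feedback term *w* · *s*_{i} added to the activation) are our own illustrative choices, not the paper's exact update rule:

```python
import numpy as np

def update_with_feedback(W, s, i, w):
    """Threshold update of neuron i for the max-cut Hopfield network
    (minimizing s^T W s), with a diagonal self-feedback term of strength w
    added to the activation."""
    activation = -(W[i] @ s) + w * s[i]
    return 1 if activation >= 0 else -1

rng = np.random.default_rng(3)
n = 20
A = np.triu(rng.random((n, n)) < 0.5, k=1).astype(int)
W = A + A.T
s = rng.choice([-1, 1], size=n)

kept = update_with_feedback(W, s, 0, w=100.0)    # dominant positive feedback
flip = update_with_feedback(W, s, 0, w=-100.0)   # dominant negative feedback
```

With a dominant positive *w* the neuron keeps its state, while a dominant negative *w* forces a flip, which is the mechanism that injects chaotic dynamics for *w* < 0.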

Here, we wish to investigate the effect of the self-feedback separately from the effect of internal noise, i.e., we have performed simulations for a noiseless Hopfield network using either constant [blue lines in Figs. 7(c) and 7(d)] or annealed [blue circles in Figs. 7(c) and 7(d)] negative *w* values. The negative feedback introduces chaotic behavior, which can be suppressed by gradually decreasing *w*. We note that in a memristive crossbar array, the introduction of negative weights would require the application of differential memristor pairs,^{7} i.e., it is more reasonable to keep the memristive matrix positive valued with zero diagonal elements and to introduce the self-feedback as an offset along the neural updates.

The results of the simulations with zero noise but finite negative diagonal feedback are demonstrated in Figs. 7(c) and 7(d). Once *w* is annealed, the convergence probabilities [blue circles in Fig. 7(c)] are similar to the case of annealed internal or externally injected noise [purple and green circles in Fig. 7(a)]. However, at constant self-feedback amplitude [blue line in Fig. 7(c)], no optimal *w* is found that would yield a similarly good convergence probability (≈45%) as the optimal constant internal or external noise level [purple and green lines in Fig. 7(a)]; rather, the convergence probability remains below ≈20% regardless of the *w* value. Accordingly, we can state that annealing is vital in the case of self-feedback. Finally, we note that chaotic dynamics is not only available through negative self-feedback: nonlinear memristive elements with chaotic dynamics, such as nanoscale NbO_{2} memristors, can also be applied to externally inject chaotic behavior into the network.^{60}

## V. CONCLUSIONS

In conclusion, we simulated probabilistic optimization schemes inspired by the memristive HNNs experimentally realized in Refs. 24 and 56. These works demonstrated that memristive HNNs are not only efficient hardware accelerators for complex optimization tasks thanks to their single-step matrix–vector multiplication capability, but the intrinsic noise of the memristive elements can be exploited as a hardware resource, introducing proper stochasticity to the network. As a main focus, we simulated the operation of the memristive HNNs, relying on experimentally deduced, realistic noise characteristics. Based on a broad range of conductance dependent noise characteristics in various memristive systems, we proposed a noise model describing the typical noise evolution along the variation of the conductance states. Relying on this model, we demonstrated distinct operation regimes where either the ON-state or the OFF-state noise provides the dominant contribution. We also demonstrated that the relative conductance variation is not only a good measure of the noise amplitude but is also a highly relevant parameter describing the operation of the memristive HNNs. According to our simulations, the relative noise level required for the optimal network operation is found to be in the range of Δ*G*/*G* ≈ 11 − 16% regardless of the color of the noise spectrum (white, pink, or Lorentzian noise) or the size of the problem (60 × 60 − 300 × 300). We have shown that further performance enhancement can be achieved by noise annealing; moreover, a highly simplified and easily implemented double-step noise annealing scheme provides similar performance to the more refined but extremely time-consuming continuous superlinear noise annealing scheme.
It is also found that the optimal noise level is at the top edge of the experimentally achievable relative noise levels, which means that the network is easily tuned to an operation regime with a suboptimal relative noise level, where the optimal operation can be either set by external noise injection or by negative diagonal feedback, introducing chaotic network behavior. Finally, we have explored the effects of further non-idealities, such as the limited programming accuracy and the finite OFF-state conductance of the memristors. We have argued that any static non-ideality that deforms the weight matrix but still implements a noise-free HNN can only lead to a degradation of the network performance, i.e., for performance enhancement, either true stochasticity (noise) or a non-ideal HNN with somewhat chaotic energy trajectories is required. It was, however, also found that static non-idealities, especially the device-to-device variations with (Δ*G*/*G*)_{static} > 0.025, cause severe performance degradation in networks with optimized dynamical noise levels. All these results are presented on the benchmark unweighted max-cut problem, where binary weights are used. We note, however, that the discussed conductance-dependent noise model is also a guideline for dealing with the intrinsic noise of multilevel weights in memristive HNNs, which can be simulated using a very similar approach. The latter non-binary networks are relevant for more complex optimization problems, such as weighted max-cut,^{59,61} or for situations where continuous weight annealing is used.^{23}

The rapidly growing field of memristor research is expected to deliver radically new IT solutions in the near future. We believe that our results contribute to this field by exploring the prospects of fully connected memristive networks utilizing the inherent stochasticity of memristors for probabilistic optimization algorithms.

## ACKNOWLEDGMENTS

This research was supported by the Ministry of Culture and Innovation and the National Research, Development and Innovation Office within the Quantum Information National Laboratory of Hungary (Grant No. 2022-2.1.1-NL-2022-00004), the New National Excellence Program of the Ministry for Culture and Innovation from the source of the National Research, Development and Innovation Fund (Grant Nos. ÚNKP-22-2-I-BME-73 and ÚNKP-22-5-BME-288), and NKFI Grant Nos. K143169 and K143282. Project No. 963575 has been implemented with the support provided by the Ministry of Culture and Innovation of Hungary from the National Research, Development and Innovation Fund, financed under the KDP-2020 funding scheme. Z.B. acknowledges the support of the Bolyai János Research Scholarship of the Hungarian Academy of Sciences. The authors are grateful to Dávid Krisztián and Péter Balázs for their contribution to the noise measurements on SiO_{x} resistive switches.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**János Gergő Fehérvári**: Data curation (lead); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (lead); Validation (equal); Visualization (lead); Writing – original draft (equal); Writing – review & editing (equal). **Zoltán Balogh**: Formal analysis (equal); Investigation (equal); Methodology (equal); Resources (equal); Supervision (supporting); Validation (equal); Visualization (supporting); Writing – review & editing (supporting). **Tímea Nóra Török**: Data curation (supporting); Formal analysis (supporting); Software (supporting); Writing – review & editing (supporting). **András Halbritter**: Conceptualization (lead); Formal analysis (equal); Funding acquisition (lead); Investigation (equal); Methodology (equal); Project administration (lead); Resources (equal); Supervision (lead); Validation (equal); Visualization (supporting); Writing – original draft (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data that support the findings of this study are openly available in figshare repository at https://figshare.com/projects/Noise_tailoring_noise_annealing_and_external_noise_injection_strategies_in_memristive_Hopfield_neural_networks/176478.^{62}

## REFERENCES
