Extracting information from radio-frequency (RF) signals using artificial neural networks at low energy cost is a critical need for a wide range of applications, from radar to healthcare. These RF inputs are composed of multiple frequencies. Here, we show that magnetic tunnel junctions can process analog RF inputs containing multiple frequencies in parallel and perform synaptic operations. Using a backpropagation-free method called extreme learning, we classify noisy images encoded by RF signals, using experimental data from magnetic tunnel junctions functioning as both synapses and neurons. We achieve the same accuracy as an equivalent software neural network. These results are a key step toward embedded RF artificial intelligence.
Analyzing radio-frequency (RF) signals is essential in various applications, such as connected objects, radar technology, gesture sensing, and biomedical devices.1–8 For many signal classification tasks, such as emitter type identification9,10 and RF fingerprinting,5,11 artificial neural networks have proven to perform better than standard methods and show superior robustness to noise and defects.1 However, running neural networks on conventional computing hardware can be time-consuming and energy-intensive, which makes it challenging to integrate this capability into embedded systems.12,13 This issue is amplified in the case of RF signals because they require signal digitization before being processed by the neural network.
A promising path to reduce the energy consumption of artificial intelligence is to build physical neural networks using emerging technology.14 For this goal, spintronic nano-devices have key advantages, including their multifunctionality, fast dynamics, small size, low power consumption, high cyclability, high reliability, and CMOS compatibility.15,16 Furthermore, the high-speed dynamics of spintronic devices provides them with key features for the emission, reception, and processing of RF signals.17–23 Several studies have shown their potential for building hardware neural networks.14,24–28 In particular, it was recently proposed to use the flagship devices of spintronics, magnetic tunnel junctions (MTJs), as synapses taking RF signals as inputs,29–31 and as neurons emitting RF signals at their output.26
In this study, we first experimentally demonstrate that MTJs can perform synaptic weighted sums on RF signals containing multiple frequencies, similar to real-life RF data. Next, to showcase the potential of MTJs in RF signal classification, we construct a neural network using experimental data from MTJs acting as both synapses and neurons. We employ a backpropagation-free method known as extreme learning32,33 to train this network, integrating both experimental results and software processing. In extreme learning, the network is composed of a first random layer of synapses connecting the inputs to a hidden layer of neurons, and of a second layer of synapses that connects the neurons to the outputs and is trained. In this regard, extreme learning is the static equivalent of reservoir computing.34 We classify analog RF signals encoding noisy four-pixel images into three classes with 99.7% accuracy and into six classes with 93.2% accuracy, which is as good as the same network implemented in software. These results open the path to embedded systems performing artificial intelligence at low energy cost and high speed on complex RF signals, without digitization.
ANALOG PROCESSING OF MULTIPLE RADIOFREQUENCY SIGNALS IN PARALLEL
We process analog RF signals by leveraging the intrinsic fast dynamics of nanodevices called magnetic tunnel junctions (MTJs). These devices are nanopillars composed of two ferromagnetic layers separated by a tunnel barrier. When an RF current is injected into an MTJ, as depicted in Fig. 1, the magnetization of one layer enters into resonance with the input signal and, by magnetoresistive effect, a direct voltage is generated.35 This phenomenon, called the spin-diode effect, is frequency selective: the output voltage is only generated when the input signal is close to the resonance frequency of the device. We first illustrate this effect by sending single-frequency RF signals to the input of our junctions. Figure 1 depicts how four MTJs with different resonance frequencies each process the spectrum from 100 to 800 MHz. All four devices are from a material stack of SiO2//5 Ta/50 CuN/5 Ta/50 CuN/5 Ta/5 Ru/6 IrMn/2.0 Co70Fe30/0.7 Ru/2.6 Co40Fe40B20/MgO/2.0 Co40Fe40B20/0.5 Ta/7 NiFe/10 Ta/30 CuN/7 Ru, where thicknesses are indicated in nm, and have diameters ranging from 250 to 450 nm. The typical resistance-area product of the devices is 8 Ω µm2. The MTJs are all in a stable vortex state. Using individual magnetic fields, we can fine-tune the frequencies of the devices, as shown in different colors in Fig. 1. Here, the magnetic field is applied using an electromagnet and the devices were measured sequentially. The magnetic field modifies the rotation of the vortex. Here, the vortex motion is dominated by magnetic pinning, which leads to different field-dependent behaviors of the frequency in different pinning sites and devices. However, future efficient integrated systems will use amorphous magnetic materials, in which the dynamics is not dominated by pinning, as demonstrated in Ref. 36. Furthermore, non-volatile tuning of the resonance frequency can be achieved through control of the magnetic anisotropy, as demonstrated in Ref. 37.
The total voltage V is thus a weighted sum of the input powers by tunable weights, as desired. Note that each synaptic weight Wk is encoded by all MTJs simultaneously, although the main contribution comes from the device whose resonance frequency is closest to the input frequency. Figure 3(c) shows good agreement (the normalized root-mean-square error is 4.4% of the range) between the measured and expected summed voltage V when the four sets of input powers and weights are varied. We observe that the agreement is better for the sum of all outputs than for the individual device outputs, because errors on the individual voltages are averaged out by the sum. As only the result of the sum is meaningful for the neural network, this is promising for the scalability of the system. These results demonstrate that arrays of MTJs can process multiple RF inputs in parallel over a wide frequency range. Although the sum of the voltage outputs from each junction has been performed numerically here for practicality, it can be achieved in a compact way on chip in the future by simply connecting the junctions electrically.29
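As an illustration, the frequency-selective spin-diode response and the resulting weighted sum can be sketched numerically. In the minimal Python sketch below, the Lorentzian lineshape, the linewidth, and all numerical values are illustrative placeholders, not parameters extracted from the experiment:

```python
import numpy as np

def spin_diode_weight(f_in, f_res, linewidth=20e6):
    """Illustrative Lorentzian rectification weight of one MTJ:
    largest when the input frequency matches the resonance frequency."""
    return 1.0 / (1.0 + ((f_in - f_res) / linewidth) ** 2)

def weighted_sum(powers, f_inputs, f_resonances):
    """Total DC voltage summed over junctions: each input power P_k is
    weighted by the response of every MTJ at the input frequency f_k."""
    V = 0.0
    for f_res in f_resonances:
        for P, f_in in zip(powers, f_inputs):
            V += P * spin_diode_weight(f_in, f_res)
    return V

# Four input tones and four MTJs tuned near those tones (arbitrary values)
f_in = [150e6, 300e6, 450e6, 600e6]
f_res = [160e6, 310e6, 440e6, 610e6]
P = [1e-6, 3e-6, 7e-6, 9e-6]  # input powers in W
print(weighted_sum(P, f_in, f_res))
```

Because each Lorentzian is narrow compared to the tone spacing, each weight is dominated by the MTJ closest in frequency to its input tone, as in the measurement.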
EXPERIMENTAL CLASSIFICATION OF RF SIGNALS
We now use the RF synaptic weighted sums to perform a classification task. Here, we choose a method sometimes called "extreme learning,"32,33 depicted in Fig. 4. The neural network is composed of two fully connected layers, separated by a hidden layer of neurons. The first layer of synapses, described by a matrix W(1)—here implemented in hardware—has random weights and is not tuned during training. The second layer of synapses, described by a matrix W(2)—here implemented in software—is trained through a simple matrix inversion, as detailed below. Extreme learning is similar to reservoir computing34 in the sense that random hardware weights are left untrained while a software layer of weights is trained. However, in contrast to the reservoir approach, in extreme learning the network is feedforward and static, meaning there are no recurrences or connections between the neurons of the hidden layer. This method has the advantage of eliminating the need for backpropagation, enabling classification without adjusting the hardware weights. Although extreme learning may not be suitable for tackling complex, state-of-the-art tasks, it is a good benchmark for artificial neural networks implemented with emerging hardware.
Here, we perform a time-multiplexed experiment, as it enables us to implement a large neural network with only a few physical devices: the different components are measured sequentially. We compose the first fully connected layer [i.e., W(1) × P in Eq. (1)] using the weighted sums of physical MTJ synapses. As there are three measured weights for each of MTJ 1 and MTJ 2, and four measured weights for each of MTJ 3 and MTJ 4, using all possible combinations of the weights of the four MTJs yields 3 × 3 × 4 × 4 = 144 pseudo-random weighted sums. We measure each weighted sum sequentially.
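The construction of the 144 pseudo-random weighted sums from a few measured weights can be sketched as follows; the weight values here are random placeholders standing in for the measured ones:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
# Placeholder measured weights: 3 per MTJ for MTJs 1-2, 4 per MTJ for MTJs 3-4
weights_mtj = [rng.uniform(0, 1, n).tolist() for n in (3, 3, 4, 4)]

# Every combination of one weight per MTJ defines one fixed random weighted sum
W1 = np.array(list(itertools.product(*weights_mtj)))  # shape (144, 4)
print(W1.shape)  # 3 * 3 * 4 * 4 = 144 rows, one per hidden neuron

def first_layer(P):
    """Apply the 144 fixed weighted sums to a 4-dim input power vector."""
    return W1 @ np.asarray(P)
```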
To implement the hidden layer of neurons [i.e., a(·) in Eq. (1)], we use magnetic tunnel junctions as spin-torque nano-oscillators. Indeed, as shown in Ref. 26, we can use the non-linear relationship between their output RF power and input DC current as an activation function. As our experiment is time-multiplexed, we could, in principle, use a single physical MTJ neuron to implement 144 virtual neurons (akin to virtual nodes in reservoir computing). However, our outlook is toward a fully parallel system where each neuron would be implemented by one individual MTJ. In such a system, device-to-device variability is unavoidable. To mimic such variability and assess the robustness of our system, we employ two physical MTJ neurons, measured in different conditions, to obtain 144 different activation functions. In practice, we inject a DC current in a strip line above each MTJ neuron. This current generates a local Oersted field that modifies the vortex dynamics and thus the shape of the activation function. The current in the strip line of MTJ neuron 1 was varied from 5 to 11 mA in steps of 0.05 mA, and the current in the strip line of MTJ neuron 2 from 2 to 12 mA in steps of 0.05 mA. Each of the resulting 144 neurons receives as input the output of one weighted sum, thus performing the non-linear activation a(·) of Eq. (1).
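The hidden layer can be emulated as one distinct non-linear activation per neuron. In the sketch below, a saturating tanh-based function whose threshold shifts with a per-neuron control current stands in for the measured power-versus-current curves of Ref. 26; the functional form and the current values are assumptions made for illustration:

```python
import numpy as np

def mtj_neuron(x, i_ctrl):
    """Illustrative activation: the emitted RF power rises non-linearly
    with the input above a threshold set by the control current i_ctrl.
    Stand-in for the measured activation curves, not a device model."""
    return np.maximum(0.0, np.tanh(x - 0.1 * i_ctrl))

# One control current per neuron -> 144 slightly different activation
# functions, emulating device-to-device variability (values are placeholders)
i_ctrl = np.linspace(5.0, 12.0, 144)

def hidden_layer(sums):
    """Apply neuron k's own activation to the k-th weighted sum."""
    return mtj_neuron(np.asarray(sums), i_ctrl)
```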
Then, the second synaptic layer W(2) is implemented in software and trained by matrix inversion on a computer.
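Training the second layer reduces to a linear least-squares problem: with hidden-layer activations H (samples × neurons) and one-hot targets Y, the output weights W(2) follow from a regularized matrix inversion. The sketch below is a generic implementation of this step, not the authors' exact code; the ridge term is added for numerical stability:

```python
import numpy as np

def train_output_layer(H, Y, ridge=1e-6):
    """Solve W2 = argmin ||H W2 - Y||^2 by regularized least squares:
    W2 = (H^T H + ridge * I)^(-1) H^T Y."""
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + ridge * np.eye(n), H.T @ Y)

def classify(H, W2):
    """Predicted class = index of the largest output neuron."""
    return np.argmax(H @ W2, axis=1)

# Toy check on random hidden states and random labels
rng = np.random.default_rng(1)
H = rng.normal(size=(60, 144))      # 60 samples, 144 hidden neurons
labels = rng.integers(0, 3, size=60)
Y = np.eye(3)[labels]               # one-hot targets for 3 classes
W2 = train_output_layer(H, Y)
print((classify(H, W2) == labels).mean())
```

With more hidden neurons than training samples, the least-squares solution interpolates the training set, which is why extreme learning needs no backpropagation.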
We compose a dataset of analog RF signals as follows: each sample of the dataset is a four-pixel image, like the ones shown in Fig. 5(a). Each pixel corresponds to one of four input frequencies, and the intensity of the pixel is encoded by the RF power at that frequency. In order to emulate noise in the inputs, we randomly assign a power of either 1 or 3 µW to each gray pixel and of either 7 or 9 µW to each black pixel. In consequence, each class is composed of images with different noise configurations.
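A dataset of this kind can be generated as in the sketch below. The gray and black power levels follow the text; the class patterns (here, one class per placement of two black pixels among the four) are an illustrative assumption, as the exact patterns of Fig. 5(a) are not specified in the text:

```python
import itertools
import numpy as np

GRAY = (1e-6, 3e-6)    # possible powers for a gray pixel (W)
BLACK = (7e-6, 9e-6)   # possible powers for a black pixel (W)

def make_samples(black_pixels, n, rng):
    """Draw n noisy four-pixel images for the class whose black pixels
    sit at the given indices; all other pixels are gray."""
    samples = np.empty((n, 4))
    for j in range(4):
        levels = BLACK if j in black_pixels else GRAY
        samples[:, j] = rng.choice(levels, size=n)
    return samples

# Assumed six-class task: one class per choice of two black pixels among four
rng = np.random.default_rng(0)
classes = list(itertools.combinations(range(4), 2))   # 6 pixel patterns
X = np.vstack([make_samples(c, 24, rng) for c in classes])
y = np.repeat(np.arange(6), 24)
print(X.shape, y.shape)
```

Each power vector X[i] is the per-frequency input power of one analog RF sample; the randomness of the gray/black levels emulates the input noise described above.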
To benchmark our experimental network, we also perform classification with a purely software network. This software network has the same architecture as the physical network: four inputs, a hidden layer of 144 neurons and three outputs (respectively, six outputs) for the three-class (respectively, six-class) task. The first fully connected layer of the software network is composed of ideal weighted sums with weights extracted from the experiment, and the neurons are conventional rectified linear units. In consequence, the differences between the physical network and the software network are two-fold. First, the weighted sums from the physical synapses are noisy while the weighted sums of the software synapses are ideal. Second, the activation functions of the physical neurons are noisy and exhibit variability while the activation functions of the software neurons are ideal and identical. Comparing the classification accuracy of the physical and software networks thus provides an estimation of the impact of the physical network non-idealities (noise and device variability).
We first evaluate the ability of the network to discriminate the three classes shown in Fig. 5(a). We perform 100 runs, with four randomly chosen samples per class for the training set and 20 randomly chosen samples per class for the test set. We obtain a test accuracy of 99.7% for the experimental network and 96.2% for the equivalent software network, with standard deviations of 0.9% and 7.2%, respectively. If we increase the difficulty of the task to six classes (all possible combinations of two chosen pixels among the four), the test accuracy becomes 93.2% for the experimental network and 94.9% for the equivalent software network, with standard deviations of 3.6% and 5.2%, respectively. Figures 5(b)–5(e) show the corresponding confusion matrices for both tasks. The way we generate the images (i.e., with noise in the input RF powers) emulates input noise. However, noise in the input frequencies—which is likely to be present when dealing with real-world inputs—would be equivalent to noise on the synaptic weights and would affect the performance of the network. The precise impact of such noise has yet to be investigated and will greatly vary depending on the task. These results demonstrate that a network composed of experimental RF MTJ data can classify raw analog RF signals with accuracy as high as that of a software network.
TOWARD LARGE-SCALE NETWORKS
In future systems, the network could be fully parallel (no time-multiplexing) and fully physical (no software layer) to achieve high speed and energy efficiency. This requires several improvements. First, the synapses need to be optimized. To reduce pinning effects, we can, for instance, use amorphous materials as proposed in Ref. 36. To achieve non-volatile weights, we need novel resonance frequency tuning schemes, for instance, as proposed in Ref. 37. Second, we need MTJs functioning over wide ranges of frequencies. This gives the network large expressivity, as the resonance frequencies control the weights. In particular, if the frequencies are too close, it is difficult to tune the weights independently.39 While the resonance frequency can be tuned by several GHz after fabrication,21,37 its value is coarsely set by materials and geometry. Therefore, we envision large arrays of devices of various sizes and shapes. Reaching higher frequencies is critical to obtain a large total frequency range, as well as to process diverse real-world inputs. Spin-diode behavior in the 10 GHz range37 is similar to that of the devices presented here in the 100 MHz range, and there are prospects for even higher frequencies.40 Alternatively, input signals can be down-converted to lower frequencies using analog electronics, which comes with an energy overhead but might be relevant for some applications. Third, we need to stack layers into deep networks, which requires feeding the outputs of MTJ neurons to MTJ synapses. This has been explored in Ref. 31 with two neurons and two synapses but requires further optimization of the neurons, as proposed in Ref. 41. Finally, in order to tackle state-of-the-art classification tasks, we need to move from extreme learning to more complex algorithms where all weights are trained.
While Leroux et al.30,31,42 have shown through simulations the possibility to train similar networks through backpropagation, this remains to be demonstrated experimentally on a large network, and how to perform on-chip training is an open question.
We have leveraged the dynamics of magnetic tunnel junctions to perform synaptic operations and have performed weighted sums on several analog RF signals in parallel. In the future, by choosing the materials, shape, and size of the devices, their frequency can be engineered from 50 MHz to 50 GHz.40 As a consequence, an array of magnetic tunnel junctions could process RF signals over this whole frequency range in parallel, without digitization. This removes the need for multiple local oscillators or high-speed ADCs.43 Using experimental data from RF MTJs functioning as both synapses and neurons, we have composed a neural network and demonstrated classification of RF signals through extreme learning. The achieved accuracy is on par with that of an equivalent software network. These results open the path toward large neural networks able to perform artificial intelligence tasks on raw RF signals, without digitization, at low energy cost and small size.
See the supplementary material for the data of Figs. 1–4 and the Python code for the RF classification task, accompanied by the relevant experimental data as well as results files.
This work has received funding from the European Union under Grant No. PADR—886555-2—SPINAR.
Conflict of Interest
The authors declare the following competing interest: Patents FR 1800805-6-7, held by CNRS and Thales, on which J.G. is an inventor, cover the RF synapses and neurons proposed in this work.
Nathan Leroux: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Danijela Marković: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Supervision (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). Dédalo Sanz-Hernández: Formal analysis (equal); Investigation (equal); Methodology (equal); Writing – review & editing (equal). Juan Trastoy: Investigation (equal); Methodology (equal); Writing – review & editing (equal). Paolo Bortolotti: Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal). Alejandro Schulman: Investigation (equal); Methodology (equal); Resources (equal); Writing – review & editing (equal). Luana Benetti: Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – review & editing (equal). Alex Jenkins: Funding acquisition (equal); Investigation (equal); Methodology (equal); Resources (equal); Supervision (equal); Writing – review & editing (equal). Ricardo Ferreira: Funding acquisition (equal); Investigation (equal); Methodology (equal); Resources (equal); Supervision (equal); Writing – review & editing (equal). Julie Grollier: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Validation (equal); Writing – original draft (equal); Writing – review & editing (equal). 
Frank Alice Mizrahi: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
The data that support the findings of this study are available within the article and its supplementary material.