We study the validity of implementing MoS2 multilevel memories in future neuromorphic networks. This validity is determined by the number of available states per memory and their retention characteristics within the nominal computing duration. Our work shows that MoS2 memories have at least 3-bit and 4.7-bit resolvable states suitable for hour-scale and minute-scale computing processes, respectively. The simulated neural network conceptually constructed on the basis of such memory states predicts a high learning accuracy of 90.9% for handwritten digit datasets. This work indicates that multilevel MoS2 transistors could be exploited as valid and reliable nodes for constructing neuromorphic networks.
The increasing demand for storage and analysis of huge amounts of data has led to a surge of interest in developing next-generation processor and memory systems.1,2 In handling diverse big data problems, in areas ranging from the internet of things (IoT) and artificial intelligence (AI) to cloud computing, the conventional von Neumann computing architecture encounters serious challenges such as the deceleration of transistor scaling, high fabrication cost, and unacceptable power consumption for transferring data between processors and memories.3,4 New device and system architectures need to be created to deal with such challenges.2 Non-von Neumann architectures, in which processors and memories merge together, have been proposed to leverage parallel computing capability and overcome the data transfer restriction imposed by processor-memory bandwidth. In such a system, each memory cell serves as a synapse for dynamically recording and processing information. In comparison with binary cells, multilevel cells (MLCs) or other memory devices with multiple storage states can provide higher data storage density and processing accuracy, which makes them desirable candidates for the construction of next-generation memory-based computing systems.
Recently, various types of multibit memories have been proposed. Some of them are two-terminal devices, such as phase-change memory (PCRAM), magnetic memory (MRAM), and resistive memory (ReRAM).5–7 These devices can be integrated into a crossbar architecture to achieve a high cell density. MRAM and PCRAM cells have exhibited very good reliability, and ReRAM cells have enabled an energy-efficient operation scheme in the femtojoule regime.8,9 However, such two-terminal devices, when operated in a crossbar architecture, suffer from cell-to-cell crosstalk. The current methods for addressing the crosstalk issue significantly increase the complexity of circuit design and operation.10,11
Field effect transistor (FET) memories can also enable a high integration density as well as reliable isolation among cells arranged in networks. Several types of transistor memories, such as redox-transistor memories, ionic floating-gate memories, and ferroelectric transistors, have been proposed and studied.12–17 The practical high-density implementation of these transistor memories still needs additional in-depth investigation, especially research seeking to address the challenges associated with complex architectures and operation schemes, high fabrication cost, unknown scaling behavior, and poor compatibility with current circuit platforms.12–17 Therefore, it is highly desirable to explore multibit transistor memories made from other emerging nanomaterials that could result in simplified device architectures and low fabrication cost. Chen et al. reported multibit transistor memories made from plasma-treated and mechanically twisted layered transition metal dichalcogenides (TMDCs).18,19 Such devices could be easily manufactured at a low cost and enable simple high-density implementation. However, it is still not clear whether the multiple data storage states set in such TMDC transistors are suitable for neuromorphic computing. In particular, it is still unknown whether such transistors can provide a sufficient number of resolvable memory states to meet the computing accuracy requirement within a given period.
In this Letter, we present an experiment/simulation-integrated investigation on the validity of implementing plasma-treated MoS2-based multilevel transistor memories for neuromorphic computing applications. In this work, multiple data storage states are set in MoS2 transistors, and the retention behaviors of these states are characterized within given time durations. This work shows that a plasma-treated MoS2 transistor can be programed into at least 3-bit resolvable states that are suitable for hour-scale computing applications or 4.7-bit states for minute-scale computing. A neural network is conceptually constructed and simulated on the basis of the data levels (or conductance memory states) experimentally acquired from our MoS2 multilevel transistor memories. This simulated neural network exhibits a high learning accuracy of 90.9% for the modified National Institute of Standards and Technology (MNIST) handwritten digits dataset.
Figure 1(a) schematically illustrates a typical top-gated Ar-plasma-treated few-layer MoS2 multilevel transistor. Ar plasma treatment is applied to introduce sulfur vacancies at the bottom layers of the flakes.20–22 Figure 1(b) displays the optical micrograph (OM) of a representative multilevel transistor with a few-layer MoS2 channel sandwiched between two metal electrodes (50 nm Au/5 nm Ti) and covered by a top gate dielectric (40 nm Al2O3) and a gate electrode (50 nm Au/5 nm Ti). In the fabrication of such transistor memories, nanoimprint-assisted shear exfoliation (NASE) is exploited to fabricate few-layer MoS2 channels with a small device-to-device variation in channel dimensions.22–24 All as-fabricated MoS2 memory channels have similar dimensions (channel length ∼5 μm, width ∼2 μm, and thickness ∼15 nm). Other details of device fabrication are presented in the supplementary material.
Retention characteristics of the multilevel memory (or conductance) states set in our MoS2 transistors are investigated. The details about the setting and measurement of such states are presented in the supplementary material. Figure 2(a) displays four selected retention characteristic curves measured from a representative transistor. It is observed that the four states (i.e., S1–S4) are well distinguishable from each other for thousands of seconds after setting. Afterwards, these states tend to relax and converge toward the neutral state. A more detailed analysis of the retention characteristics of a representative memory state is described in Fig. S1 in the supplementary material. To make quasi-analogue MLCs suitable for serving as neuromorphic network nodes, the number of resolvable memory states in a transistor for a given computing duration should be clearly justified and quantified. This demands an understanding of the relationship between the number of resolvable multilevel states and their relaxation behaviors within nominal computing durations. To answer this device physics question, we propose an evaluation scheme to distinguish two adjacent states. Figure 2(b) displays two adjacent states S1 and S2. For a given nominal time duration t, the confidence interval (CI12) between the two adjacent states is defined as the gap between the lowest current of the state with the higher average conductance and the highest current of the state with the lower average conductance, as denoted by the dashed lines in Fig. 2(b). The confidence interval ratio is further defined as the ratio of the confidence interval to the initial current gap between the two states (at t = 0). Based on previous works, such adjacent states are regarded as distinguishable when their confidence interval ratio is at least 2/3 within the nominal operation duration.25
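The evaluation scheme described above can be sketched in a few lines of Python. This is a minimal illustration of the stated definition, not the analysis code used for the measurements; the function names and the synthetic current traces are hypothetical, and each trace is assumed to start at t = 0 and cover the nominal duration t.

```python
import numpy as np

def confidence_interval_ratio(trace_a, trace_b):
    """Confidence interval ratio of two adjacent states.

    Each trace is a sequence of currents sampled from t = 0 up to the
    nominal duration (assumed sampling; the units only need to match).
    """
    a = np.asarray(trace_a, dtype=float)
    b = np.asarray(trace_b, dtype=float)
    # Identify the state with the higher average conductance (current).
    high, low = (a, b) if a.mean() >= b.mean() else (b, a)
    # Gap between the lowest current of the higher state and the
    # highest current of the lower state.
    ci = high.min() - low.max()
    # Initial current gap between the two states at t = 0.
    initial_gap = high[0] - low[0]
    return ci / initial_gap

def is_distinguishable(trace_a, trace_b, threshold=2.0 / 3.0):
    """Apply the 2/3 criterion quoted in the text."""
    return confidence_interval_ratio(trace_a, trace_b) >= threshold

# Synthetic traces (arbitrary units) relaxing toward each other.
s_high = [10.0, 9.8, 9.6, 9.5]
s_low = [4.0, 4.2, 4.3, 4.5]
ratio = confidence_interval_ratio(s_high, s_low)
print(round(ratio, 3))                    # 0.833
print(is_distinguishable(s_high, s_low))  # True
```

In this synthetic example the confidence interval is 9.5 − 4.5 = 5.0 and the initial gap is 6.0, so the ratio of about 0.83 exceeds the 2/3 threshold.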
Figure 3(a) displays the scatterplot of a set of confidence interval ratio data measured from 66 pairs of adjacent memory states programed in a representative transistor memory and retained for 1000 s. These confidence interval ratio data are plotted as a function of normalized initial conductance [defined in Eq. (S2) in the supplementary material] and normalized initial conductance gap [defined in Eq. (S3)] values. Figures 3(b) and 3(c) plot the corresponding confidence interval ratio data as a function of normalized initial conductance and normalized initial conductance gap values, respectively. Figure 3(b) shows a convex curve correlating confidence interval ratio and normalized conductance values. This observation, in combination with the previously identified bipolar charging/discharging properties of MoS2, implies that the memory states that are farther away from the neutral state (i.e., the state with a normalized conductance of ∼0.5) tend to relax more rapidly toward the neutral state.18 On the other hand, Fig. 3(c) shows that the confidence interval ratio decreases with the reduction of the normalized conductance gap. This indicates that the normalized initial conductance gap between two states needs to be >0.09 to assure a confidence interval ratio >2/3 for a computing period of 1000 s. Figures 3(d)–3(i) plot the counterpart results acquired with shorter nominal computing durations of 100 s and 10 s. For such shorter time durations, confidence interval ratio values are ∼1 and exhibit very weak dependences on initial conductance gaps. This indicates that multilevel MoS2 transistors can provide more resolvable memory states for shorter operation time durations. More specifically, when the required computing time is reduced to 100 s or 10 s, the normalized initial conductance gap between adjacent states can be set to be as small as 0.03 [i.e., the smallest initial conductance gap shown in Fig. 3(f), which can still result in a high confidence interval ratio of ∼0.94 after 100 s]. This can result in ∼33 (i.e., 1/0.03) resolvable states for each neuromorphic network node.
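The state-count estimate above is simple arithmetic, sketched here under the assumption that states are evenly spaced across the full normalized conductance range of 0 to 1. The function name is hypothetical, and the result is only a rough upper bound on the number of states that fit at a given minimum gap.

```python
import math

def max_resolvable_states(min_normalized_gap):
    """Rough count of evenly spaced states that fit in a unit range.

    Assumes (for illustration) that adjacent states are separated by
    exactly the minimum normalized initial conductance gap.
    """
    return math.floor(1.0 / min_normalized_gap)

print(max_resolvable_states(0.03))  # 33
print(max_resolvable_states(0.09))  # 11
```

With the 0.03 gap quoted for the 100 s and 10 s durations this gives about 33 states, while the 0.09 gap required for 1000 s gives about 11 under the same even-spacing assumption.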
To further support the aforementioned conclusion, we set as many conductance states as possible in a transistor and evaluated their resolvability during a retention period of t = 100 s. Figure 4(a) shows retention characteristics of 26 conductance states set in the same transistor discussed above. Table S2 lists the applied programing voltage pulses for each state. Figure 4(b) shows the endurance characteristics of the same 26 states subjected to 25 consecutive cycles. In a cycle, the device is sequentially programed into states S1–S26. These results indicate that our MoS2 transistors can provide 4.7-bit resolvable data levels with good retention and endurance properties, suitable for minute-scale operations. The long-term retention and endurance characteristics (up to 10^5 s) are also measured and presented in the supplementary material (Fig. S2). Such long-term results show that MoS2 memory cells can provide at least three-bit resolvable states suitable for hour-scale computing processes.
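The bit resolutions quoted above follow from taking the base-2 logarithm of the number of resolvable states. A brief check, with a hypothetical helper name:

```python
import math

def bits_from_states(n_states):
    """Equivalent bit resolution of a cell with n_states resolvable levels."""
    return math.log2(n_states)

print(round(bits_from_states(26), 1))  # 4.7
print(bits_from_states(8))             # 3.0
```

Twenty-six states therefore correspond to about 4.7 bits, and a 3-bit cell requires eight resolvable states.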
To preliminarily evaluate the validity of our multilevel MoS2 transistors for future all-hardware neuromorphic computing, we construct a software-based multilayer perceptron (MLP) model based on the measured current levels of our transistors' data states to perform supervised online learning on the MNIST handwritten digits dataset.26 Figure 5(a) displays the MLP structure, which consists of three neural layers with 400 input neurons, 100 hidden neurons, and 10 output neurons. The pre-processed input MNIST digit images are centered and resized to 20 × 20, and each pixel is fed into an input neuron to perform supervised learning. Previous works have implied that to obtain a high learning accuracy, each node in such an MLP network needs to have at least four-bit reliable weight values (or resolvable states) within the computing period (i.e., 100 s for this simulation).27,28 The transistor memory with 26 reproducible conductance states (or current levels), as shown in Fig. 4, could serve as such a required network node. The detailed weight update scheme of the simulated MLP nodes on the basis of the measured current levels is described in the supplementary material.28,29 Figure 5(b) shows the evolution of the simulated learning accuracy with increasing training epoch number. Figure S4 plots the processing latency as a function of the training epoch number. The online learning processing latency for the MNIST dataset (with 60k training images and 10k testing images) is 92.7 s for 100 training epochs, which is within the pre-set processing time and assures a sufficient number of resolvable data states during the whole computing period. These simulation results predict that a neural network based on MoS2 multilevel transistors could enable a high learning accuracy of ∼90.9% for minute-scale parallel computing tasks.
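To make the idea of a 400-100-10 network with a limited number of weight states concrete, the following is a minimal NumPy sketch of a forward pass in which every weight is restricted to one of 26 levels. It is not the simulation or weight update scheme used in this work (that is described in the supplementary material). The evenly spaced levels in [-1, 1], the sigmoid activation, the random weights, and all function names are assumptions chosen for illustration; in practice the levels would be derived from the measured current levels.

```python
import numpy as np

N_LEVELS = 26  # conductance states reported for the 100 s window

def make_levels(n_levels=N_LEVELS, w_min=-1.0, w_max=1.0):
    # Assumption: evenly spaced weight levels standing in for the
    # measured (generally non-uniform) current levels.
    return np.linspace(w_min, w_max, n_levels)

def quantize(weights, levels):
    # Snap each weight to the nearest allowed level.
    idx = np.abs(weights[..., None] - levels).argmin(axis=-1)
    return levels[idx]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, w1, w2):
    # 400 inputs -> 100 hidden neurons -> 10 outputs.
    hidden = sigmoid(x @ w1)
    return sigmoid(hidden @ w2)

rng = np.random.default_rng(0)
levels = make_levels()
# Random weights are placeholders; no training is performed here.
w1 = quantize(rng.uniform(-1.0, 1.0, size=(400, 100)), levels)
w2 = quantize(rng.uniform(-1.0, 1.0, size=(100, 10)), levels)

# Stand-in for one centered, flattened 20 x 20 digit image.
x = rng.uniform(0.0, 1.0, size=(1, 400))
out = forward(x, w1, w2)
print(out.shape)  # (1, 10)
```

Snapping weights to a fixed set of levels after each update is one simple way to emulate multilevel cells in software; a device-aware simulation would also model the nonlinearity and asymmetry of the programing response.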
In summary, we evaluate the validity of the implementation of MoS2 multilevel transistor memories for neuromorphic computing applications. This work shows that such a memory can be programed into at least 3-bit resolvable states suitable for hour-scale computing applications or 4.7-bit states for minute-scale computing. The simulation of a three-layer neural network conceptually based on the data states of our MoS2 memories predicts a high learning accuracy of 90.9% for MNIST handwritten digits datasets.
See the supplementary material for the fabrication of MoS2 multilevel transistors, the setting and measurement of each conductive state, analysis of the retention characteristics, and the construction of MLP network simulation.
This work was supported in part by NSF ECCS-1452916. The authors would like to thank the staff of the University of Michigan's Lurie Nanofabrication Facility for providing the support of device fabrication.
The data that support the findings of this study are available from the corresponding author upon reasonable request.