Neuromorphic computation is the principle whereby certain aspects of the human brain are replicated in hardware, with the aim of outperforming conventional computers at machine-learning tasks. While great progress has been made in this field in recent years, almost all input signals provided to neuromorphic processors are still designed for traditional (von Neumann) computer architectures. Here, we show that a simple photosensitive capacitor inherently reproduces certain aspects of biological retinas. We found that capacitors based on metal halide perovskites output a brief voltage spike in response to changes in incident light intensity but output zero voltage under constant illumination. Such a sensor is not only optimized for use with spiking neuromorphic processors but is also anticipated to have broad appeal in fields such as light detection and ranging, autonomous vehicles, facial recognition, navigation, and robotics.
Traditional computers are designed to handle instructions sequentially, in what is called a von Neumann architecture.1 Neuromorphic computation,2 however, is the principle whereby information is processed in a manner analogous to that of the mammalian brain.3 Often implemented using massively parallel resistive switching elements4–7 and/or spiking nodes,8–12 neuromorphic processors13,14 are predicted to outperform von Neumann machines by several orders of magnitude on certain machine-learning tasks, in terms of both speed and power consumption. While great progress has been made in the processing of information based on neuromorphic principles,15 stimuli provided to this hardware still take a form designed for von Neumann architectures, for example, an array of pixels sampled at fixed intervals in time.
Compared to artificial imaging systems, such as those based on complementary metal oxide semiconductor (CMOS) or charge-coupled device (CCD) sensors, the human eye is spectacularly complex.16,17 The retina has on the order of 108 photoreceptors (rods and cones), but the optical nerve transmits only ∼106 signals to the primary visual cortex for processing. A one-to-one mapping between photoreceptors and neurons in the brain, therefore, does not exist, and instead, the retina preprocesses information before it is transmitted. Cell activity is found to spike strongly when presented with spatially or temporally varying signals but remains relatively inactive in response to static images.18–21 Hence, by transmitting signals only in response to changes, the retina is able to compress the information it receives significantly.
Retinomorphic22 is a term used to describe an event-driven optical sensor that is designed to emulate this visual pre-processing in some way.23 While accurate replication of the retina has been achieved using very large scale integration (VLSI)-based systems,24,25 the complexity of such circuits suggests that they are more applicable as intraocular prostheses than as mass-produced general vision sensors (c.f. photodiodes/CCD elements). Thin-film semiconductors such as metal halide perovskites (henceforth referred to as just perovskites for brevity) or organic semiconductors, however, have tunable absorption properties26,27 and compatibility with facile fabrication techniques,28 making them good candidates for simple and inexpensive vision sensors based on single circuit elements.29 While optical sensors with the geometry of the eye have been demonstrated using perovskites,30 and vision has been restored to blind mice by replacing photoreceptors with semiconducting nanowires,31 the fundamental operating principles of most sensors still remain tailored toward von Neumann processing. Artificial synapses that employ photonic stimulation32–35 have also been demonstrated using perovskites36 but are, in general, designed for transmitting and processing information rather than for use as optical sensors.
Here, we have fabricated a simple photosensitive capacitor and characterized its response to optical stimuli. The structure is shown in Fig. 1(a) and is based on a bilayer dielectric. The bottom dielectric, silicon dioxide (SiO2), is expected to be highly insulating and largely unresponsive to light. The top dielectric is the prototypical perovskite methylammonium lead iodide (MAPbI3), a compound that is known to have a large photoconductive response37 and a dielectric constant that changes significantly under illumination.38 These properties make MAPbI3 an ideal candidate for a dielectric medium whose capacitance changes under illumination. The bottom electrode is highly doped silicon, which acts as both a substrate and an electrode. The top electrode is 15 nm of gold, which was deposited by thermal evaporation and is designed to be sufficiently thin to be partially transparent to optical illumination while still being conductive, albeit with a large contact resistance. Because a change in illumination changes the capacitance, charge must flow onto or off the capacitor to reach the new equilibrium. When placed in series with an external resistor, the voltage dropped across the resistor will, therefore, spike temporarily, as the capacitor charges/discharges, before returning to its equilibrium value. The result is a sensor that spikes in response to changes in illumination but otherwise outputs zero voltage.
The photosensitive capacitor was placed in series with a conventional resistor of resistance R (1 MΩ for all data obtained), as shown in Fig. 1(b), and a voltage (V0) was applied across the structure. The voltage drop across the external resistor is defined as VR, and the voltage drop across the capacitor is defined as VC. The device was held in the dark before being illuminated with constant illumination from a green (peak wavelength: 525 nm) light emitting diode, with an optical power density (P) of 60 mW/cm2, as shown in Fig. 1(c). Figure 1(d) shows the mean (dark line) and standard deviation (gray) of VR for five different devices measured experimentally under identical conditions.
It is believed that upon constant illumination, free charges are generated in MAPbI3,39 as the exciton binding energy is below the thermal energy (kBT) in these systems at room temperature.40 In the steady state,37,41 it is expected that the relative concentration of holes vs electrons in MAPbI3 will depend on both the respective carrier lifetimes42 and details of trap states in the film.43 Because the top contact was held at a positive voltage relative to the grounded bottom contact, it is anticipated that electrons will drift to the gold top contact, while holes will accumulate at the interface between the semiconductor and the dielectric. It is known that bare SiO2 can lead to trapping when used as a dielectric in thin-film transistors (TFTs),44 and it is likely that passivated SiO2 or alternative dielectrics45 such as polymers46 or high-κ oxides47 will lead to greater charge accumulation and a faster response, something which will be considered in future device designs.
The data presented in Fig. 1(d) show that our devices produce the desired behavior: a spike in voltage in response to a change in illumination, but a low voltage under constant illumination. It is important to emphasize that the photosensitive capacitor demonstrated here is dissimilar from previously reported photocapacitors48,49 in both design and intended use. Photocapacitors are designed to store energy from solar irradiation, while our sensor is designed to detect changes in optical stimuli for neuromorphic computation.
The peak voltage was approximated for a single device as a function of the applied voltage (V0) and optical power density (P), as shown in Figs. 1(e) and 1(f), respectively, both exhibiting a roughly linear relationship over the ranges studied. The RC circuit shown in Fig. 1(b) can be modeled using Kirchhoff's laws, where we explicitly state that the capacitance has some time dependence: C = C(t). The derivation is provided in supplementary material Sec. S1 and must be solved numerically for the specified function P(t), which can take any form. Rc is a constant contact resistance term invoked in the model. It is presumed that the largest contribution to Rc is the poor conductivity of the 15 nm electrode itself and its contact with the probe-station probe needles; however, the interface50 between MAPbI3 and Au will likely also contribute non-negligibly to Rc. The capacitance of the photosensitive capacitor is modeled using the following equation:

C(t) = C0 + ΓP(t).
Here, C0 is the capacitance of the bilayer capacitor in the dark, approximated using known parameters,51 as described in supplementary material Sec. S4. Γ is a proportionality constant relating P to the change in capacitance under illumination. As a first approximation, we have taken this relationship to be linear, based empirically on the data presented in Fig. 1(f). By modeling P(t) as a step function, as shown in Fig. 1(c), the model can be solved numerically and fitted to experimental data with Rc and Γ as fitting parameters, as described in supplementary material Sec. S5. Figure 1(g) shows an example of such a fit, and supplementary material Fig. S4 shows this and four other fits to similar data from other devices. The extracted parameters were Rc = 32 ± 10 MΩ and Γ = 0.014 ± 0.004 F m2/W, where ± denotes the standard deviation in extracted parameters between devices.
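As a concrete illustration, the circuit model above can be integrated numerically with a simple forward-Euler scheme. This is a minimal sketch, not the fitting code used in this work: the dark capacitance C0 is an assumed illustrative value, and the capacitance-illumination coefficient GAMMA is rescaled to an illustrative magnitude so that the dynamics resolve on a seconds timescale; V0, R, and Rc follow the values quoted in the text.

```python
V0 = 5.0         # applied bias (V)
R = 1e6          # external resistor (ohm), as used for all measurements
RC = 32e6        # contact resistance Rc (ohm), the fitted mean from the text
C0 = 1e-9        # dark capacitance (F): illustrative assumption, not from the text
GAMMA = 1.5e-11  # capacitance-illumination coefficient (F m^2/W): illustrative

def P(t):
    """Optical power density (W/m^2): 60 mW/cm^2 = 600 W/m^2, on at 1 s, off at 2 s."""
    return 600.0 if 1.0 <= t < 2.0 else 0.0

def simulate(t_end=3.0, dt=1e-3):
    """Integrate the series circuit V0 -> Rc -> C(t) -> R to ground.

    Tracking the capacitor charge q directly handles the VC*dC/dt term of the
    derivation implicitly: i = (V0 - q/C) / (R + Rc) and dq/dt = i.
    """
    q = C0 * V0           # dark steady state: no current, all of V0 across C
    t, ts, vrs = 0.0, [], []
    while t < t_end:
        cap = C0 + GAMMA * P(t)          # C(t) = C0 + Gamma * P(t)
        i = (V0 - q / cap) / (R + RC)    # series current
        q += dt * i                      # forward-Euler charge update
        ts.append(t)
        vrs.append(i * R)                # VR, the measured output
        t += dt
    return ts, vrs
```

Because q cannot change instantaneously, the step change in C(t) at light-on (light-off) forces VC = q/C below (above) its equilibrium value, producing the positive and negative spikes in VR that then decay back to zero.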
It should be noted that all experimental device measurements presented in this manuscript were obtained by applying illumination to the device 30 s after the application of V0. It was observed that the delay between the application of V0 and the optical stimulus does affect the device response (see supplementary material Fig. S5), something not encapsulated by our simple model as described in Sec. S2. This phenomenon is discussed in more detail in supplementary material Sec. S6 but is attributed to either the MAPbI3 layer acting as a large shunt resistor in the dark or significant ionic motion,52 which inhibits the accumulation of electronic charge, in a manner analogous to that observed in TFTs.53,54
Using the parameters extracted from fits to experimental data and by approximating the noise measured on VR, we are then able to simulate the anticipated behavior of many devices. Figure 1(h) shows the average (red) and standard deviation (gray) of VR as a function of time for five devices simulated using our model. The parameters Rc and Γ for each simulated device were assigned randomly from a Gaussian distribution with the same mean and standard deviation as evaluated experimentally. Gaussian noise was then added to VR with the same magnitude as observed experimentally. While we have only experimentally measured one device at a time, the similarity between Figs. 1(d) and 1(h) demonstrates that our parametrization strategy enables us to accurately simulate the behavior of an ensemble of many devices as produced in our lab.
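The ensemble simulation can be sketched in the same spirit: per-device values of Rc and Γ are drawn from Gaussian distributions, and Gaussian noise is added to each VR sample. The Rc spread follows the quoted 32 ± 10 MΩ; Γ and its spread are rescaled to illustrative magnitudes, and C0 and the noise amplitude are assumptions.

```python
import random

V0, R, C0 = 5.0, 1e6, 1e-9   # bias (V), load (ohm), illustrative dark capacitance (F)

def device_vr(rc, gamma, t_on=1.0, t_end=2.0, dt=1e-3, noise=0.002, rng=None):
    """VR trace of one device; light steps from 0 to 600 W/m^2 at t_on."""
    rng = rng or random.Random(0)
    q, t, vrs = C0 * V0, 0.0, []         # start in the dark steady state
    while t < t_end:
        cap = C0 + gamma * (600.0 if t >= t_on else 0.0)
        i = (V0 - q / cap) / (R + rc)    # series current through R and Rc
        q += dt * i
        vrs.append(i * R + rng.gauss(0.0, noise))   # VR plus measurement noise
        t += dt
    return vrs

def ensemble(n=5, seed=1):
    """Draw Rc and Gamma per device from Gaussians and simulate each trace."""
    rng = random.Random(seed)
    traces = []
    for _ in range(n):
        rc = rng.gauss(32e6, 10e6)               # 32 +/- 10 MOhm, as extracted
        gamma = rng.gauss(1.5e-11, 0.4e-11)      # illustrative rescaling of 0.014 +/- 0.004
        traces.append(device_vr(rc, gamma, rng=rng))
    return traces
```

Averaging the traces reproduces the shape of Fig. 1(h): a noisy baseline near zero in the dark and a spread of spike amplitudes at light-on set by the device-to-device parameter variation.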
To illustrate the utility of these retinomorphic sensors, large arrays were simulated in an identical manner as in Fig. 1(h), but with a video used as the optical stimulus. Details of the process are provided in supplementary material Sec. S7, but a brief description is provided here for completeness. An array of retinomorphic sensors with the same dimensions as the input video resolution was defined in software, with randomly assigned values of Rc and Γ. The input video was converted to grayscale, with brighter regions assigned a larger P and darker regions a smaller P, regardless of their color, in the range of 0 to 100 mW/cm2. While the absorption coefficient of MAPbI3 is wavelength dependent,55 color selectivity was not considered in this simulation. Initial conditions were defined, and VC and VR were then updated at every time step Δt, for every pixel, based on the changing values of P. VR was then calculated as a function of time for every pixel and output as a new video. In this output video, black regions indicate a low VR and white regions a high VR. Example frames from these videos are shown in Fig. 2.
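A minimal sketch of this array simulation, assuming each pixel of a grayscale "video" (a list of 2D brightness frames, in W/m2) is an independent retinomorphic sensor integrated with the same circuit model: static pixels output ~0 V while pixels that change brightness spike, mirroring Fig. 2. The frame duration, C0, and the rescaled Γ are illustrative assumptions, and parameter spread and noise are omitted for brevity.

```python
V0, R, RC, C0, GAMMA = 5.0, 1e6, 32e6, 1e-9, 1.5e-11  # C0 and GAMMA illustrative

def sensor_video(frames, dt=0.05, substeps=50):
    """Return one VR frame per input brightness frame, per-pixel RC model."""
    rows, cols = len(frames[0]), len(frames[0][0])
    q = [[C0 * V0] * cols for _ in range(rows)]   # dark steady state everywhere
    out = []
    for frame in frames:
        vr_frame = [[0.0] * cols for _ in range(rows)]
        for _ in range(substeps):                 # integrate within the frame window
            for r in range(rows):
                for c in range(cols):
                    cap = C0 + GAMMA * frame[r][c]        # C = C0 + Gamma * P
                    i = (V0 - q[r][c] / cap) / (R + RC)   # series current
                    q[r][c] += (dt / substeps) * i
                    vr_frame[r][c] = i * R                # VR at end of frame
        out.append(vr_frame)
    return out
```

Rendering each output frame with black for VR ≈ 0 and white for large VR reproduces the behavior described above: only pixels whose brightness changed between frames light up.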
As expected, stationary parts of each video remain dark in the output video (corresponding to a low VR), while movement appears white (high VR). There exists a “ghosting” effect in white regions, as VR slowly decays in the absence of illumination. In keeping with the design of the biological retina, stationary parts of the stimulus are of less interest to the mammal/artificial neural network than moving parts. Hence, one can interpret these data as a map of valuable information, where regions of high VR are of interest for further processing, while regions of low VR are generally not.
While a stronger response to dynamic images is an important component of simulating the retina,21 this is just one aspect of what biological vision systems do. In the 1950s and 1960s, Hubel and Wiesel showed that certain neurons in cats fire strongly in response to optical stimuli aligned at certain angles, but not stimuli aligned at others.56–58 This is understood to occur via a specific arrangement (in real space) of some cells in the retina that hyperpolarize, and some cells that depolarize, in response to optical stimuli. Taking an average of these cell responses results in a stronger firing rate in response to stimuli of certain orientations than others. This can be considered the first stage in the complex process59 by which mammals process visual information.
This on-center off-surround strategy, as adopted by nature, turns out to be well-suited for the retinomorphic sensors described here. Because our sensors consist of a photosensitive capacitor in series with a resistor, one can either choose to measure the voltage drop across the capacitor, as shown in Fig. 3(a), or across the resistor, as shown in Fig. 3(c). By defining these measurement configurations as “C” sensors and “R” sensors, respectively, one can emulate an on-center off-surround sensor layout, as depicted in Fig. 3(b).
The voltage of the sensor depicted in Fig. 3(b) was simulated using the mean parameters observed experimentally (V0 = 5 V, Rc = 32 MΩ, and Γ = 0.014 F m2/W), but with no spread or noise in this first case. The optical stimulus incident on the sensor was 0 mW/cm2 at all locations until t = 1 s. After t = 1 s, a constant illumination with a maximum of 100 mW/cm2 was applied, but only in selected regions, as illustrated in white in the examples in Fig. 3(d). The average voltage as a function of the angle of the stimulus, relative to the vertical, is shown in Fig. 3(e), illustrating that such a sensor would indeed respond more strongly to a stimulus of a certain orientation. It is easy to conceive how such a sensor could be modified to be selective to objects of any desired orientation by rearranging the components of Fig. 3(b). Figure 3(f) demonstrates how the peak change in voltage (ΔV) relative to 2.5 V is determined by the angle of the stimulus for both a vertically selective sensor, as shown in Fig. 3(b), and a horizontally selective sensor (achieved by rotating the sensor array by 90°). The peaks do not appear at exact multiples of 90°, and ΔV is negative at some angles, because the 8 × 8 brightness arrays are digitized from a higher-resolution image, and total brightness is therefore not conserved as a function of angle.
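The orientation-selective readout can be sketched as follows, assuming a simplified 4 × 4 layout (the text uses 8 × 8) with "C" sensors in the two center columns and "R" sensors elsewhere. An R sensor reports VR (a positive spike at light-on), while a C sensor reports the change VC − V0 (a larger negative dip). Peak responses immediately after a light step are estimated analytically from the model, using the fact that the capacitor charge is conserved across the step; C0 and the rescaled Γ are illustrative assumptions.

```python
V0, R, RC, C0, GAMMA = 5.0, 1e6, 32e6, 1e-9, 1.5e-11  # C0 and GAMMA illustrative

def peak_response(p, kind):
    """Peak signal of one pixel right after illumination steps from 0 to p (W/m^2)."""
    cap = C0 + GAMMA * p
    vc = V0 * C0 / cap               # charge C0*V0 is conserved across the step
    i = (V0 - vc) / (R + RC)         # transient series current
    return i * R if kind == "R" else vc - V0   # R sensor: +spike; C sensor: -dip

def array_response(lit):
    """Mean peak voltage of a 4x4 array whose center two columns are C sensors."""
    total = 0.0
    for r in range(4):
        for c in range(4):
            kind = "C" if c in (1, 2) else "R"
            total += peak_response(600.0 if lit[r][c] else 0.0, kind)
    return total / 16
```

A vertical bar lights only the C-sensor columns, so the large negative dips dominate the average; a horizontal bar splits its illumination between sensor types and partially cancels, giving a smaller mean response, as in Fig. 3(e).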
The simulated data presented in Fig. 3 demonstrate how one can construct an angle-dependent sensor based on the retinomorphic sensors we experimentally measured in Fig. 1. It is important to emphasize that this is a sensor that would not require any complex readout algorithm or any significant post-processing. It is a sensor that directly outputs a voltage that depends on the orientation of the object it is directed at. The output from such a sensor could, for example, be provided to a single neuron in an artificial neural network, greatly reducing the power and time required for certain image processing vs a traditional video input.
To demonstrate more explicitly how a sensor such as that depicted in Fig. 3 would perceive moving images, we have simulated two-dimensional arrays of angle-dependent sensors, in a manner analogous to Fig. 2. Because each angle-dependent sensor is itself made up of an array (in this case 10 × 10) of retinomorphic sensors, the resolution of an angle-sensitive image sensor array will be significantly reduced compared to the input. The results are presented for some example input data in Figs. 3(g)–3(j). As with the data presented in Fig. 2, each pixel in the simulated array was assigned random values of Rc and Γ based on the experimentally observed means and standard deviations, under the assumption that the spread in parameters can be roughly represented by a Gaussian distribution. Random noise was then added to the voltage of each pixel, again at the same level as observed experimentally.
While the amplitude of the detectable voltage is an order of magnitude lower and the resolution is significantly reduced compared to the images produced by a simple angle-insensitive retinomorphic sensor, it is clear that the simulated device acts as expected. The vertically selective sensor [Fig. 3(i)] is good at identifying vertical objects (such as edges of buildings), and the horizontally selective sensor [Fig. 3(j)] is good at identifying horizontal objects (such as horizontal road markings).
In summary, we have taken inspiration from biology21 and experimentally demonstrated a simple yet unique optical sensor that provides a spiking voltage in response to changes in illumination, but not under constant illumination. This design, hence, inherently filters out non-pertinent information such as static images, providing a voltage only in response to movement. Using a simple model based on Kirchhoff's laws, we are able to parameterize this device and accurately reproduce its behavior in simulations. Because this paradigm for optical sensing is a significant deviation from the traditional method by which optical information is acquired, we have spent a substantial portion of the report demonstrating the applications of such a sensor when combined in arrays. The simplicity of these devices additionally enables us to emulate the on-center off-surround design found in nature,56 allowing us to simulate sensors that output a voltage only in response to objects of a particular orientation.
See the supplementary material for the experimental methods, derivation of the quantitative model for the sensor response, dependence of the sensor response on the load resistor, description of the fitting strategy employed, dependence of the sensor response on delay between the application of voltage and optical stimulus, and a description of how the retinomorphic sensor arrays were simulated.
The authors thank the National Science Foundation for financial support (Award No. 1942558). Part of this research was conducted at the Northwest Nanotechnology Infrastructure, a National Nanotechnology Coordinated Infrastructure site at Oregon State University, supported in part by the National Science Foundation (Grant No. NNCI-2025489) and Oregon State University. The authors would like to thank Alyson Joos for assisting with recording the input videos used in the simulation part of this work. J.G.L. would like to thank Massachusetts Institute of Technology (MIT) OpenCourseWare and Nancy Kanwisher, in particular, for providing inspiration for this work through her lecture series “The Human Brain.”
The data that support the findings of this study are available from the corresponding author upon reasonable request.