Recent advances in neuromorphic computing have established a computational framework that removes the processor-memory bottleneck evident in traditional von Neumann computing. Moreover, contemporary photonic circuits have addressed the limitations of electrical computational platforms by offering energy-efficient and parallel interconnects independently of the distance. When employed as synaptic interconnects with reconfigurable photonic elements, they can offer an analog platform capable of arbitrary linear matrix operations, including multiply–accumulate operations and convolution, at extremely high speed and energy efficiency. Both all-optical and optoelectronic nonlinear transfer functions have been investigated for realizing neurons with photonic signals. A number of research efforts have reported orders-of-magnitude improvements estimated for computational throughput and energy efficiency. Compared to biological neural systems, however, achieving high scalability and density remains challenging for such photonic neuromorphic systems. Recently developed tensor-train-decomposition methods and three-dimensional photonic integration technologies can potentially address both algorithmic and architectural scalability. This tutorial covers architectures, technologies, learning algorithms, and benchmarking for photonic and optoelectronic neuromorphic computers.

Artificial Intelligence (AI) and Machine Learning (ML) have transformed our everyday lives—everything from scientific computing to shopping and entertainment. The intelligence of such artificial systems lies primarily in data centers or warehouse-scale computing systems and has been shown to surpass the ability of human brains in some tasks, including the highly complex game of Go. However, today’s data centers consume megawatts of power [Google’s AlphaGo utilized 1202 central processing units (CPUs) and 176 graphics processing units (GPUs)1], and current deep neural network algorithms require labor-intensive hand labeling of large datasets. Furthermore, the early conceptualization of “a-machine” by Turing in 19362 (also called the Turing machine) proved the existence of fundamental limitations on the power of mechanical computation, even with powerful mathematical models and algorithms executed on a processing unit (e.g., a CPU) as is done today. In addition, modern computers utilize random-access memory instead of an infinite memory tape divided into discrete cells.

In his “First draft of a report on the EDVAC,” in 1945,3 John von Neumann articulated what is considered the first general-purpose computing architecture based on memory, processing units, and networks (interconnects). Fascinatingly, von Neumann utilized synapses, neurons, and neural networks in this 1945 report to explain his proposed architecture and then predicted its limitations—now called the von Neumann bottleneck3—by stating that “the main bottleneck of an automatic very high-speed computing device lies: At the memory.” Because of this limitation, relatively simple tasks, such as learning and pattern recognition, require a large amount of data movement (including moving the weight values) between the processor and the memory (across the bottleneck). Thus, the energy efficiency and the throughput of such computing tasks are fundamentally limited in von Neumann computing as was already predicted in 1945.4 

Despite increases in computing speed and the development of memory hierarchies, a fundamental separation between memory and computation remains, limiting data processing speeds regardless of the total availability of memory resources. Neuromorphic computers, in contrast, perform computation through directed graphs that are much better suited for the collocation of computing units and memory. Such a model has persistent or non-volatile memory in the form of synaptic weights uniquely associated with each pair of nodes in the graph. This locality of information allows neuromorphic architectures to avoid the bottleneck between processing and memory entirely. Each node is an individual computing unit with its own dedicated memory such that multiple pieces of information can be processed completely asynchronously and in parallel much like the human brain.

A human brain recognizes features from partial and conflicting information at ∼20 W power levels.5 At each moment, the brain is bombarded with a vast amount of sensory information, but somehow, the brain makes sense of this data stream, even if it contains imperfect and inconsistent data elements, by extracting the forms of the spatiotemporal structure embedded in it. From this, it builds meaningful representations of objects, sounds, surface textures, and so forth through parallel distributed processing. In a human brain, each neuron may be connected to up to ∼10 000 other neurons, passing signals via as many as 164 × 10¹² synaptic connections,6 equivalent by some estimates to a computer with a 1 × 10¹² bit per second processor. The neurons communicate with each other with extremely high energy efficiency. For example, in Ref. 7, Attwell and Laughlin observed that the energetic cost of information transmission through synapses is extremely efficient at ∼20 500 ATP/bit, corresponding to 1.04 fJ/bit at 32 bit/s. In the nervous system, firing a spike costs a neuron 10⁶–10⁷ ATP/bit7 or 50–500 fJ/bit, an amount of energy proportional to how far the spike must travel down the axon because the transmission medium is dispersive and lossy. Furthermore, dendrites of neurons contribute immensely to the energy efficiency and the density of computing in the brain by providing nano-/micro-scale neural networks inside the neuron itself, which is part of a larger neural network. The massively parallel yet hierarchical nature of learning and inference processes in the brain has been intriguing but not fully understood.

Is it possible to bring such brain-inspired capabilities to artificial machines with similar energy efficiency and scalability? Can we replicate the brain’s remarkable capabilities by constructing synapses and neurons using artificial materials and devices? There have been decades of effort in this area of neuromorphic computing, and none have come close to demonstrating the full capability of the brain. Turing in 1950 proposed a test (now known as the Turing test) to replace the question “Can machines think?”8,9 Despite decades of effort by many, even if a machine could get close to passing the Turing test, it is extremely unlikely, or at least challenging, for it to achieve the energy efficiency and the computing capacity of the brain in such a small volume and weight. The seminal work carried out by Mead at Caltech in the late 1980s10 emphasized a million-fold improvement in power efficiency. The subsequent work of Boahen’s Neurogrid,11 Heidelberg’s BrainScaleS,12 IBM’s TrueNorth,13 Intel’s Loihi,14 Manchester’s SpiNNaker machine,15 Cauwenberghs’ Hierarchical Address Event Representation (HiAER) communications fabric,16 and Mitra and Wong’s N3XT17 all demonstrated far better energy efficiency than conventional von Neumann computing for relatively simple example tasks.

There are challenges in scaling these electronic neuromorphic computing platforms to very large scales. Electronic solutions typically include long electrical wires with large capacitance values, leading to high interconnect energy consumption. Their interconnect topologies typically connect in only four directions and require many repeaters for multi-hop connections to non-neighboring nodes. For instance, the TrueNorth chip runs at a slow clock speed of 1 kHz, communicates with an energy efficiency of 2.3 pJ/bit plus an additional 3 pJ/bit for every cm of transmission, and requires a 256 × 256 cross-bar network that selectively connects incoming neural spike events to outgoing neurons.13 The recently emerging nanoelectronic neuromorphic computing systems also suffer from similar communication challenges in achieving appreciable repeaterless distances, especially at high speeds.18 In the 1980s, optical neural networks became a very active area of study for achieving massively parallel brain-like computing at the speed of light.19–24 However, the pioneer himself, Psaltis, declared in the 1990s that he was abandoning optical neuromorphic computing for two reasons: (1) the lack of practical devices that could be integrated and (2) insufficient knowledge of complex neural networks. Fast forward to 2021, three decades later, and we are now witnessing three major changes countering the two reasons for that abandonment. First, machine learning algorithms utilizing deep neural networks have advanced so much that an artificial machine with a single night of training can beat the human world champion of 33 years in the game of Go1—Lee Sedol referred to AI as “an entity that cannot be defeated.” Second, the rate of increase in component integration in silicon photonics25–27 is now twice as fast as that of electronic integration (electronic Moore’s law28). Thus, we now find silicon photonic integrated circuits with ten thousand photonic components on a die manufactured on 300-mm silicon photonic wafers from several foundries. Third, while Moore’s law barely maintains its trend of continuing increases in transistor density—going from 5, 4, 3, and possibly down to 2 nm and below—the slowing of this trend is evident, and Dennard’s law,29,30 which governs energy efficiency, has been stalled since 2005. Hence, electronics alone cannot sustain the exponential increase in data processing, especially with the requirement that von Neumann computing architectures move data across a bottleneck. The natural conclusion from these three major changes points to photonic neuromorphic computing as the key solution to future computing.

Nonetheless, there are significant challenges in scaling analog photonic networks while maintaining high accuracy. Wetzstein et al.31 summarized these as arising from three main reasons: (1) the advantages (power and speed) of analog accelerators are useful only for very large networks, (2) the technology for the optoelectronic implementation of the nonlinear activation function remains immature, and (3) the difficulty of controlling analog weights makes it hard to reliably control large optical networks. Hybrid or heterogeneously integrated photonic–electronic neural networks offer practical solutions to these challenges.

Many other reviews have been written on the topic of photonic neuromorphic computing systems. Some place heavier emphasis on a specific aspect of the system, such as material choice32,33 or nonlinearity,34 on specific structures such as reservoir computers,35,36 or on the relationship between photonic neuromorphic neural networks and machine learning,37 deep learning,38 or artificial intelligence.39 Others more broadly discuss interconnect technology, network topology, neuron design, and algorithm choices at differing levels of depth.40–44 This tutorial aims to concisely and comprehensively unify each of the aforementioned aspects of photonic neuromorphic design and cover them at their most fundamental level before describing how they relate to the computational abilities of the system; references to other reviews will be given for implementation and other details not fully addressed here. The tutorial is structured as follows: Sec. II A argues the rationale for photonic and optoelectronic neuromorphic computing. Section II B covers the general system architecture of the neuromorphic computer. Section II C details the individual building blocks, followed by the learning models in Sec. II D. Section III addresses the critical topic of achieving scalability in algorithms and physical systems. Section IV surveys benchmarking and highlights the challenges of benchmarking such a nascent area of computing. Finally, Sec. V summarizes the tutorial by addressing future directions.

As mentioned in Sec. II, the primary motivation for neuromorphic computing arises from the brain’s capability to learn and infer with remarkable energy efficiency, throughput, and scalability. Figure 1 illustrates the rationale for pursuing optoelectronic neuromorphic computing. Figure 2(a) shows one of the early drawings by Cajal45 depicting the complex network of neurons and synapses. Each neuron consists of thousands of dendrites, a soma, and an axon with thousands of axon terminals. Each neuron connects with thousands (∼7000 on average6) of other neurons at the synapses interfacing the axon terminals of the upstream neurons (presynaptic neurons) to the downstream neurons (postsynaptic neurons). The plasticity of the synapses through the history of experiences allows learning (or training)—discussed more in Sec. II D. Hence, our efforts to construct a bio-derived or a bio-inspired (close-imitation or inspired-reconstruction of the brain functionality) neuromorphic computer start from the simple mathematical diagram of Fig. 2(b) of a (deep) neural network consisting of an array of neurons in each layer and weighted synaptic interconnections between the layers. The output of neurons at each layer [e.g., x_i^{l-1} at layer (l − 1)] is related to the output of the neurons at the next layer (x_j^{l} at layer l) via the relationship

\[ x_j^{l} = \theta\left(s_j^{l}\right), \qquad s_j^{l} = \sum_i w_{ij}^{l}\, x_i^{l-1}, \]

where w_{ij}^{l} is the weight, s_j^{l} is the weighted sum, and θ is the nonlinear transfer function. Figure 2(c) depicts this again from the perspective of a single neuron, where each dendrite receives synaptically weighted inputs that are summed at the soma to generate an output according to a nonlinear transfer function—like the sigmoid activation function θ_s(s_j^{l}) shown in Fig. 2(d).
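To make the layer relationship above concrete, the following minimal NumPy sketch evaluates one fully connected layer with a sigmoid transfer function; the layer sizes, random weights, and slope parameter are illustrative assumptions rather than values from any demonstrated system.

```python
import numpy as np

def sigmoid(s, a=1.0):
    """Sigmoid activation theta_s = 1 / (1 + exp(-a s)) with slope parameter a."""
    return 1.0 / (1.0 + np.exp(-a * s))

rng = np.random.default_rng(0)
n_prev, n_next = 4, 3                    # illustrative layer widths
x_prev = rng.random(n_prev)              # outputs x_i^(l-1) of layer l-1
W = rng.normal(size=(n_next, n_prev))    # synaptic weights w_ij^l

s = W @ x_prev                           # weighted sums s_j^l formed at each soma
x_next = sigmoid(s, a=2.0)               # outputs x_j^l of layer l
print(x_next)
```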
FIG. 1.

Comparisons of a biological cognitive system (human’s cortex), CMOS-based electronic neuromorphic computing (Ex. IBM TrueNorth13 and Intel Loihi14,46), and nanoscale photonic–electronic neuromorphic computing. In Ref. 47, Miller mentioned 1 fJ/b at 10 Gb/s for point to point and 10 fJ/b for 19 dB loss budget (∼80×) when quantum impedance conversion is utilized. WDM (90 wavelengths) and SDM (90-way splitter that corresponds to ∼19 dB loss per branch47).

FIG. 2.

(a) Drawing by Cajal45 (public domain). (b) A two-layer neural network with synaptic weights. (c) A simple example of a nonlinear neuron that includes (i) synapses, (ii) weighted addition, and (iii) nonlinear activation function. (d) A nonlinear activation function (e.g., sigmoid function) for five different slope parameters a: θ_s = 1/(1 + e^{−as}).


It is not a coincidence that the biological system chose to utilize spiking neural networks (SNNs) instead of non-spiking ones after millions of years of evolution. The energy efficiency of the brain is crucial; although a human brain represents only ∼2% of the body weight, it consumes ∼20% of the oxygen and calories. Information transfer and processing utilizing spikes—based on event-driven communication and processing in a massively parallel system—are orders of magnitude more energy-efficient than non-spiking counterparts that require constant energy consumption even when communication or computation is unnecessary.

Another important challenge is the implementation of high-throughput and scalable neuromorphic computers that maintain this energy efficiency. Even for bio-derived neuromorphic computing, it is difficult to exactly replicate the wet-electro-chemical systems of ion channels and ATP/ADP conversions. Commonly used electrical wires are too power-hungry and noisy due to electromagnetic impedance, electromagnetic interference, and Johnson thermal noise. For instance, IBM’s TrueNorth system13 included repeaters consuming 3 pJ/bit for every cm to overcome dispersion limitations and assure signal integrity. Similarly, Intel’s first iteration of the Loihi chip is organized into a grid of 128 neurocores communicating through a network-on-chip (NoC) with an energy cost of 3–4 pJ for each hop.14 

Even more serious energy limitations arise when large-scale synaptic networks need to be deployed utilizing electrical mesh networks whose capacitance and energy consumption scale quadratically. Photonics, on the other hand, does not suffer from the same types of limitations due to impedance, interference, thermal noise, and RC latency. Photonic meshes can achieve the matrix multiplications of Figs. 2(b) and 2(c) simply by propagating light through the mesh, which can be made lossless in a unitary photonic mesh configuration. Photonic interconnects can achieve low-energy (∼1 fJ/b),47 low-loss (<0.1 dB/cm),48,49 wavelength-parallel (many wavelengths), and high-speed (>10 Gb/s) transmission independently of the distance.

It is admittedly difficult to fully compare photonic and electronic approaches without the availability of equivalently functional systems; the same challenge has been previously identified for comparisons between two or more fully electrical solutions.50 Nonetheless, this tutorial hopes to convince the reader that photonic approaches to neuromorphic computing can offer the following unique advantages compared to electronic counterparts:

  • Massive photonic parallelism achievable in wavelength, time, and space domains,

  • absence of electromagnetic interference,

  • extremely low noise (negligible thermal Johnson noise and possible shot-noise performance),

  • information transmission at 1 fJ/b independently of the distance,

  • matrix-multiplication achievable by simple propagation of light through the photonic mesh, in principle, with zero energy loss,

  • bidirectional photonic synapses and meshes achievable for forward/backward propagation training,

  • fast optical barrier synchronization, and

  • sparse processing overcoming the poor locality of data and information far beyond the reach of electronic neural networks.

On the other hand, photonic neuromorphic computing faces the following challenges:

  • Photonic components are relatively large (typical dimensions of wavelength/refractive index, on the order of ∼1 μm) compared to electronic components (∼10 nm). Therefore, wavelength- and time-domain multiplexing is necessary to achieve a density comparable to that of electronic and biological neural systems.

  • All-optical nonlinear transfer functions are difficult to achieve without relatively high optical power. For this reason, optoelectronic neural networks incorporating optical–electrical–optical (O/E/O) conversion have been proposed and demonstrated.51

Photonic neuromorphic computing almost always needs electronics for power distribution, control, and signaling. Section II B details the high-level architectural decisions involved in designing photonic neuromorphic computers and describes existing devices and methods for building optoelectronic and photonic neuromorphic hardware while highlighting these advantages and addressing the challenges.

1. Spiking vs non-spiking photonic neural networks

As previously discussed, event-driven spiking neural networks are far more energy-efficient than their non-spiking counterparts. Additionally, spiking units offer new ways to represent information that may be more natural for specific classes of computation, such as graph algorithms,52 quadratic programming,53 and other parallelizable computation algorithms.54 However, spiking networks have additional complexity in that they require mechanisms for integration over time. Neurons must integrate their inputs over a time window at least an order of magnitude greater than the received pulse widths to meaningfully process new aggregate information from their upstream neurons; otherwise, the activity of deeper layers in the network can merely encode a rough thresholding of activity from previous layers. The additional complexity of including dendritic delays—which simulate the effect of spatially distributed networks—allows downstream neurons to encode and process information about the timing patterns of their upstream inputs and thus provides another dimension of processing to the network.

On the other hand, mathematical and experimental implementations of non-spiking artificial neural networks (ANNs) are much more accessible than those of spiking neural networks. Moreover, the choice of a non-spiking model can more easily leverage the many developments of deep learning on traditional computing platforms and more naturally apply gradient-based learning rules (discussed more in Sec. II D) that have been well proven in many application spaces. Many non-spiking photonic matrix multipliers55–61 and photonic neural networks (PNNs)62–65 have been proposed or demonstrated. See Ref. 66 for a taxonomy of photonic neural network approaches and a more in-depth review of existing approaches.

2. Out-of-plane vs in-plane photonic neural networks

Out-of-Plane Photonic Neural Networks. As introduced in Sec. I, the first optical neural networks were developed in the 1980s incorporating optical planes (pixels) with photonic signals propagating vertically. Such out-of-plane implementations of photonic neural networks remain an active research area (a) because they can utilize many optically resolvable elements (pixels) simultaneously in parallel and (b) because Fourier optics and optical convolution can be achieved easily by incorporating lenses. The architecture demonstrated in 1985 has been commercialized in the Optalysys system, where a PC has been used to provide the necessary gain, feedback, and thresholding indicated in Ref. 67. Interestingly, the optical feedback scheme utilizes O/E/O conversion consisting of arrays of photodetectors (PD) and light-emitting diodes (LEDs). Both schemes utilize electronic amplification to overcome optical diffractive losses and electronically incorporate the nonlinear transfer function (no all-optical neural transfer function).

More recently, out-of-plane photonic neural networks have been implemented in the form of Diffractive Deep Neural Networks (D2NNs), which use a cascade of passive diffractive media to implement a synaptic strength matrix between layers in the network.68 A thin optical element is designed with variable thicknesses at the “resolution” of the network (number of neurons in the layer) and controls the complex-valued transmission and reflection coefficients at each point. Mathematically, each point is considered a secondary source for the incoming coherent light signal that acts as a neuron in a fully connected neural network layer. Weight matrices implemented by the network are fixed, with the diffractive medium being fabricated as a passive optical element by 3D printing or photolithography. Inputs to the network can be encoded on the amplitude or phase of incoming coherent light before the network output is measured by an array of photodetectors at the output plane [as in Fig. 3(a)]. The demonstration of the above D2NN did not experimentally incorporate nonlinear optical neuronal transfer functions or synaptic reconfiguration. The optical losses per layer (51% average power attenuation per layer reported in Ref. 68) and relatively high optical intensity levels required to drive optical nonlinear transfer functions may limit the scalability and practicality of this method. However, this does not represent a fundamental limit of the technique and may be reduced in the future by improved diffractive surface design.
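To illustrate the D2NN principle numerically, the sketch below propagates a field through a single phase-only diffractive layer using scalar angular-spectrum propagation; the grid size, wavelength, pixel pitch, propagation distance, and random mask are illustrative terahertz-scale assumptions and are not taken from Ref. 68.

```python
import numpy as np

def angular_spectrum(field, wavelength, dx, z):
    """Propagate a 2D complex field a distance z using the angular-spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    # evanescent components are suppressed by clipping the argument of the square root
    kz = 2 * np.pi * np.sqrt(np.maximum(0.0, 1.0 / wavelength**2 - FX**2 - FY**2))
    return np.fft.ifft2(np.fft.fft2(field) * np.exp(1j * kz * z))

rng = np.random.default_rng(1)
n, wavelength, dx, z = 64, 750e-6, 400e-6, 3e-2            # illustrative terahertz-scale values
phase_mask = np.exp(1j * 2 * np.pi * rng.random((n, n)))   # one "layer" of variable thicknesses

field_in = np.zeros((n, n), dtype=complex)
field_in[24:40, 24:40] = 1.0                               # amplitude-encoded input object
field_out = angular_spectrum(field_in * phase_mask, wavelength, dx, z)
intensity = np.abs(field_out) ** 2                         # measured by the output detector array
print(intensity.sum())
```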

FIG. 3.

Example schematic of (a) the out-of-plane photonic neural network based on the deep diffractive neural networks found in the study by Lin et al.68 and (b) an in-plane photonic neural network schematic based on the MZI mesh and opto-electronic neurons.


In-Plane Photonic Neural Networks. As opposed to the out-of-plane PNNs, in-plane photonic neural networks implement all interconnected photonic synapses and photonic (or optoelectronic) neurons on planar photonic integrated circuits and offer a more robust realization, especially when utilizing silicon photonic technologies. Despite lacking the lenses used in the out-of-plane approach, unitary photonic mesh networks consisting of many (unitary) 2 × 2 optical couplers can perform arbitrary matrix operations, including convolution and Fourier transforms. Furthermore, reconfiguring the photonic mesh can be achieved through the individual 2 × 2 optical couplers, each of which can be considered a photonic synapse in the synaptic network.

Miller proposed a method to implement arbitrary weighted connections from optical inputs to a set of optical outputs. This method relies on a “universal linear optical component”69 comprising a network of 2 × 2 Mach–Zehnder interferometer blocks—connected in a mesh as illustrated in Fig. 3(b)—which was proven capable of implementing any linear transformation from its inputs to its outputs (i.e., from preceding neurons to subsequent layers).69,70 In addition, other reconfigurable photonic structures can also implement matrix transformations, such as crossbar networks and micro-ring resonator banks, discussed in Sec. II C. By incorporating nonlinearity in the form of photonic or optoelectronic neurons between each layer (as shown in Fig. 4), a multi-layer neural network can be constructed.

FIG. 4.

Photonic neural network (two-layer example) with optoelectronic neurons (inset diagram on the bottom) at each node and the photonic neural network with 2 × 2 MZI synapses between each layer.


Aside from in-plane and out-of-plane networks, it is worth mentioning that there are examples of optical neural networks designed using a combination of optical fibers, large-scale laser sources, and various other off-the-shelf modulators and components that can be purchased from telecommunications suppliers.71–76 Given that such fiber-based or free-space components can be easily moved, swapped, and otherwise manipulated, it is far easier to develop and prototype neural networks using these technologies and architectures. In fairness—and not to diminish work in this space as merely prototype networks—it is speculatively conceivable that such neural networks can be used to simultaneously communicate and process information collected over broad distances much faster than would be possible for any exclusively localized system, photonic or electric. Examples of such fiber-based optical networks include optical reservoir computers71–75 (discussed more in Subsection II B 3), which have long exceeded a data processing speed of >1 GB/s75 for such tasks as chaotic time series prediction (over 10⁷ points per second) with an error rate of about 10% (compared to contemporary electronic approaches at 1%). Rafayelyan et al.,73 in more recent work, reported processing speeds on the order of 10¹⁴ operations per second for multi-dimensional chaotic system prediction—compared to the 10¹⁵–10¹⁷ operations per second possible on supercomputers for similar tasks.73 In these examples, laser and amplifier feedback is manipulated to substantially increase the parallelization, with neurons communicating on various parallel wavelengths or modes in the system.

Despite the computational efficiency and promise of out-of-plane and fiber-based approaches, the remainder of this tutorial focuses on the construction of in-plane, integrated photonic neural networks that more closely approximate the physical scales of biological systems.

3. Network topology

The physical structure or topology of the neural network can determine the possible transformations of data between successive layers or ensembles of neurons. Various neural network structures exist, with some derived from biological connectivity patterns and others deduced from the functions intended to be computed. In the design of a neuromorphic computer, the topology of networks supported depends on the capabilities of the hardware structures employed (discussed more in Sec. II C) and the learning rules most suitable for a given application. Neural network topologies are often divided between feedforward and recurrent approaches though brain-inspired structures typically fall into the latter category. Figure 5 summarizes the common topological structures found in neuromorphic computing.

FIG. 5.

Diagrams of different neural network topologies classified in two main categories of feedforward and recurrent neural networks.


Feedforward neural networks are the simplest network topology to implement and are useful in situations with clear mappings between input and output data as in the prevalent MNIST handwriting classification task. As shown in Fig. 5(a), feedforward refers to the flow of information exclusively from input to output. In general, restricting the flow of information allows the network to be trained more easily by backpropagation and equivalent supervised learning algorithms (discussed more in Sec. II D). Often, synaptic interconnections are fully or densely connected, meaning that each sending neuron is connected to each receiving neuron. This all-to-all pattern of connectivity provides the most flexibility for learnable patterns though at the cost of increased parameters (synaptic weight strengths) and computation for training those parameters. Various photonic structures can be used to implement these weighted connections based on the passive propagation of light—for example, phase-change materials (PCMs), Mach-Zehnder Interferometers (MZIs), or micro-ring resonators (MRRs) (discussed more in Sec. II C 1).

Fully connected feedforward networks require many parameters, so multiple strategies have been developed in traditional ANNs to reduce the number of parameters. Convolutional neural networks (CNNs) are the most popular remedy, which combines weight sharing and sparse connectivity to perform pattern recognition with far fewer weights than fully connected networks. As shown in Fig. 5(b), a small weight pattern (called a kernel) is swept across different positions of the input layer such that the subsequent layer reflects the strength of that pattern at each position. Photonic neural networks can take advantage of wavelength-division multiplexing (WDM)60,65,77 and time-division multiplexing (TDM)78 in addition to space-division-multiplexing (SDM) and other forms of parallelism58 to repeatedly apply the same kernel over multiple positions of an input vector.
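The kernel-sweeping operation described above is mathematically just a sequence of multiply–accumulate operations with shared weights; the short sketch below shows the serial arithmetic (it does not model the WDM/TDM parallelization, and the input and kernel values are illustrative).

```python
import numpy as np

def conv1d_by_kernel_sweep(x, kernel):
    """Sweep a shared kernel across the input: each output element is one
    multiply-accumulate of the same weights against a shifted input window."""
    k, n = len(kernel), len(x)
    return np.array([np.dot(kernel, x[i:i + k]) for i in range(n - k + 1)])

x = np.array([0.1, 0.4, 0.3, 0.9, 0.2, 0.7])
kernel = np.array([0.5, -1.0, 0.5])                 # shared weights (the kernel)
print(conv1d_by_kernel_sweep(x, kernel))
print(np.convolve(x, kernel[::-1], mode="valid"))   # cross-check: same result
```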

Recurrent neural networks deviate from this exclusively “forward” propagation of information and incorporate cyclical and lateral pathways—see Fig. 5(c)—with varying degrees of connectivity. The broader range of topologies provides additional mechanisms for information processing in addition to the input–output mappings of feedforward networks. There are various forms of recurrent networks that appear in neuromorphic computing that use recurrent connections to add a persistent state (as in working memory) and dynamical properties to the behavior of the network.

In reservoir computing, a fixed (non-reconfigurable) recurrent neural network is sandwiched between two feedforward layers, as shown in Fig. 5(d), and only the output feedforward layers are reconfigured. The recurrent part of the network is called the “reservoir” and is defined with lateral connections (between neurons in the same layer or ensemble) of random strengths. The first feedforward layer also contains randomly selected weights and provides input to the reservoir, while the final feedforward layer is trained to “read out” the activity of the reservoir.79 The random flow of activity through the reservoir behaves like a dynamical system and enables the learning of temporal dynamics without the complicated training schemes required to train an entire network in other recurrent structures.80 To summarize, the reservoir is often said to transform the input from a temporal representation into a higher dimensional spatial representation that a linear classifier or predictor can more easily interpret; this behavior has been useful in applications such as digital signal processing, speech recognition, and general modeling of dynamical systems.81 

Photonic and optical implementations of reservoir computing vary in architecture, modulation, and signal generation. For example, Duport et al.82 reported a fully analog variant of a popular optical reservoir computing system; it represents the neural activity of the reservoir as the modulated intensity of an external laser. Each neuron's activity is time-multiplexed on a long delay line (optical fiber spool) with a period roughly equal to the round-trip time of the line (∼8.4 µs). Other groups have demonstrated reservoir computing on CMOS-compatible integrated silicon photonics chips. Vandoorne et al.,83 for example, experimentally demonstrated a passive chip that uses only waveguides, splitters, and combiners, while the nonlinearity of the network is handled at signal detection and in the “read out” layer of the network. This approach was evaluated with multiple 2-bit binary logical tasks, with an error rate as low as 10−4 for the exclusive or (XOR) logical operation.
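The reservoir principle can be summarized in a few lines of code. The sketch below is an echo-state-style software model, assuming a fixed random reservoir with tanh nodes and a least-squares-trained linear readout; the dimensions and the delayed-recall toy task are illustrative and are not taken from the cited demonstrations.

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_res, T = 1, 100, 2000

# fixed random input and reservoir (lateral) weights; only the readout is trained
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W_res = rng.normal(size=(n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))    # scale spectral radius below 1

u = rng.uniform(0.0, 0.5, size=(T, n_in))                  # input sequence
y_target = np.roll(u[:, 0], 5)                             # toy task: recall the input 5 steps back

states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in @ u[t] + W_res @ x)                   # reservoir dynamics
    states[t] = x

W_out, *_ = np.linalg.lstsq(states[100:], y_target[100:], rcond=None)  # train linear readout
y_pred = states[100:] @ W_out
print("NRMSE:", np.sqrt(np.mean((y_pred - y_target[100:])**2)) / np.std(y_target[100:]))
```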

Winner-Take-All (WTA) networks are a biologically inspired topology in which recurrent or lateral inhibitory connections between excitatory neurons and inhibitory neurons—depicted in Fig. 5(e)—enforce a limit on the total activity of the layer. Strengths of incoming excitatory connections, lateral inhibitory connections, and bias currents are balanced to create the desired selectivity (softmax-like or hardmax-like transformation) of incoming information, much like that seen in the feature maps of the cerebral cortex.84 

Zhang et al.85 recently demonstrated a simulated WTA mechanism using the inhibitory dynamics of vertical-cavity surface-emitting lasers with saturable absorption (VCSEL-SA). Each VCSEL-SA in the circuit acts as a spiking neuron whose output is interpreted from a spike of intensity in the X-polarization (XP) mode; the Y-polarization (YP) mode is sent to the competing neuron (VSCEL-SA) and induces an YP spike that pushes the neuron into a refractory period that temporarily prevents XP spikes from being generated. Zhang et al. also used simulations to show that this bio-inspired mechanism can be used to implement the max-pooling operation required by many CNNs found in traditional machine learning. Max-pooling layers have been shown to increase the accuracy of deep CNNs by a factor of nearly three86 without the need for additional learnable parameters (synaptic weights).
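A minimal rate-based sketch of the winner-take-all principle is given below, assuming a single shared inhibitory unit and illustrative parameters; it demonstrates the selectivity mechanism only and is not a model of the VCSEL-SA dynamics of Ref. 85.

```python
import numpy as np

def winner_take_all(inputs, w_inh=2.0, steps=50, dt=0.1):
    """Rate-based WTA: each excitatory unit is driven by its input and suppressed
    by a shared inhibitory unit that tracks the total excitatory activity."""
    r = np.zeros_like(inputs, dtype=float)      # excitatory firing rates
    inh = 0.0                                   # shared inhibitory unit
    for _ in range(steps):
        inh += dt * (r.sum() - inh)             # inhibition follows the total activity
        r += dt * (np.maximum(inputs - w_inh * inh, 0.0) - r)
    return r

print(winner_take_all(np.array([0.8, 1.0, 0.3])))   # the strongest input dominates the output
```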

Hopfield networks are another common recurrent network topology with well-defined dynamics; they are formed by selecting excitatory and inhibitory weights between neurons such that specific states (patterns of firing) are stable, while others are unstable—see Fig. 5(f). This connectivity pattern forms a simple content-based memory such that partial or distorted input patterns similar to a “memorized” state will evolve toward the memorized stable state—effectively completing the memory.87 Marquez et al.88 used a thermo-optically tuned micro-ring resonator bank (described in Sec. II C 1) to experimentally demonstrate the pattern reconstruction capability of the Hopfield network for three small 4 × 4 patterns. A flattened memory pattern, x, was stored (in an off-chip computer) in the network by calculating the recurrent weights according to
\[ W_{ij} = x_i x_j, \]

where W_ii = 0 is imposed to prevent neurons from firing all the time. The micro-ring resonators were tuned to apply the modulation of x according to W one column and row at a time. Components of the vector x were represented by the intensity of light at multiple wavelength channels on the same input waveguide. An offline computer was used to interpret results (as opposed to on-chip neurons) though the demonstration provides proof-of-concept for Hopfield networks implemented on micro-ring resonator banks.
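The storage and recall behavior can be sketched in software as follows, assuming bipolar (±1) patterns, the Hebbian rule above with a zeroed diagonal, and synchronous sign updates; the on-chip column-by-column modulation is abstracted here to a single matrix-vector product, and the pattern is illustrative.

```python
import numpy as np

def store(patterns):
    """Hebbian weights W_ij = sum_p x_i^p x_j^p with the diagonal W_ii set to zero."""
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, x, steps=10):
    """Synchronous update x <- sign(W x) until the state settles on a stored pattern."""
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1
    return x

pattern = np.array([1, -1, 1, -1,  1, -1, 1, -1,  -1, 1, -1, 1,  -1, 1, -1, 1])  # flattened 4 x 4
W = store([pattern])
noisy = pattern.copy()
noisy[:3] *= -1                                    # distort three pixels
print(np.array_equal(recall(W, noisy), pattern))   # True: the memory is completed
```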

As discussed in Sec. II B 2, PNNs consist of an interconnection network analogous to the axonal and dendritic projections of the neuron and a photonic or optoelectronic nonlinearity that corresponds to the excitability behavior of the neuron. Table I summarizes biological neural network components and examples of equivalent optical and electro-optic device components that emulate the biological component mechanism.

TABLE I.

Summary of building blocks for biological neural networks and in-plane photonic neural networks.

Biological component | Biological constituent | Biological function | Equivalent photonic component | Equivalent photonic function
Synapse | Presynaptic terminal | Electrical signal (action potential) conversion to chemical signal (neurotransmitter release) | Photonic synapses or photonic mesh | Photonic couplers with variable coupling strengths to reflect synaptic weights
Synapse | Postsynaptic terminal | Receives neurotransmitters at the receptors | Photonic synapses or photonic mesh | Photonic couplers with variable coupling strengths to reflect synaptic weights
Neuron | Dendrites | Spatiotemporal dendritic computing and summation | All-optical or optoelectronic neurons | Photonic dendrites or input optical couplers (with fan-in)
Neuron | Soma | Integration and spike generation | All-optical or optoelectronic neurons | Photonic somas: nonlinear transfer function achieved by all-optical or optoelectronic devices
Neuron | Axon and axon terminals | Signal transmission | All-optical or optoelectronic neurons | Photonic axon and axon terminals: optical waveguides and output optical couplers (with fan-out)

Because biological neurons connect to thousands of other neurons via thousands of synapses, dendrites, and axons, it is difficult for bio-derived or bio-inspired artificial neural networks to achieve equivalent levels of connectivity. Biological systems can more easily achieve such high degrees of interconnection because of the extremely small (∼10 nm) size of synapses and the even smaller neurotransmitters and receptors. In addition, the unique electro-chemical dynamics allow tree-structure branching of dendrites and axon terminals to achieve the broadcast (at the axon terminals) and summation (at the dendrites) of the neurotransmitter signals. While it is, in principle, possible to construct such a synaptic network for each neuron, it is simpler to construct a mathematically equivalent mesh of synaptic interconnects, such as a crossbar, a mesh of 2 × 2 couplers, or another parallelized structure.

Electronic neuromorphic computers typically consist of electronic crossbars with memristive synapses at each crosspoint (N² synapses for an N × N crossbar) and electronic neurons at each end. Likewise, photonic neural networks comprise photonic synaptic meshes consisting of photonic couplers and photonic or optoelectronic neurons that provide nonlinearity. In both cases, the number of synaptic elements scales as O(N²) for an N × N neural network. Unlike biological synaptic interconnections, the synaptic couplers in these meshes do not directly and individually set the corresponding weight values. Instead, the collection of photonic synaptic coupling coefficients collectively applies the weight matrix values w_ij between the ith presynaptic neuron and the jth postsynaptic neuron.

Sections II C 1–II C 3 discuss the working principles of several photonic components that provide the reader with the fundamental background to construct photonic neural networks while referring to other reviews for further reading.

1. Forming reconfigurable optical synapses

Photonic matrix multipliers passively couple light from a set of N input ports to a set of M output ports according to a unitary weight matrix U. It is possible to remove the unitary restriction by the inclusion of active gain media. However, due to the preference for energy efficiency in neuromorphic computing, the remainder of this section assumes that passive weighting is sufficient for computation and all signal gains (if any) are handled by the neuron nonlinearity. The amount of light coupled from one input to one output is determined by one or more reconfigurable photonic elements that form the optical synapse. Manipulation of one or more material properties allows for the reconfiguration of these elements.

The refractive index is one such property of a material and can be tuned by thermo-optic, electro-optic, phase change, or photo-ionic effects89 to produce a change in the optical length and thus modulate the phase shift of a signal due to propagation through the material. The former two are considered volatile reconfiguration mechanisms and require constant external biasing or power supply, while the latter two are non-volatile and persist after the power supply is removed. The application of these effects can result in the signal phase change, Δϕ, that is proportional to the refractive index change Δn and the propagation constant,

\[ \Delta\phi = \frac{2\pi}{\lambda_0}\, \Delta n\, L, \]
where L is the length of the material undergoing the index change and λ0 is the free-space wavelength of the signal.
Thermo-optic effects describe how the temperature of a waveguide—often controlled by Ohmic heating elements that run current through a known resistance—changes the refractive index according to the materials’ thermo-optic coefficients at a given temperature and wavelength. Thermo-optic coefficients are nonlinearly dependent on both the wavelength and temperature and can be found in the literature where semiempirical formulas can be fit to measurements of a given material over some temperature range.90–93 The change in the refractive index can then be calculated as

\[ \Delta n = \beta_1\,\Delta T + \beta_2\,\Delta T^2 + \beta_3\,\Delta T^3 + \cdots, \]
though it is often treated linearly for small changes in the refractive index where the linear coefficient, β1, is often written as dn/dT and given in units of inverse Kelvin (1/K). In integrated photonics, the behavior of structures can be simplified in terms of an effective index that depends on the refractive index of two or more materials. In such a case, the thermo-optic coefficient can be written according to the sum of contributions from each index. For example, in a waveguide where the effective index is dependent on the core index ncore and cladding index nclad, the thermo-optic coefficient for the effective index can be written as

\[ \frac{dn_{\mathrm{eff}}}{dT} = \frac{\partial n_{\mathrm{eff}}}{\partial n_{\mathrm{core}}}\frac{dn_{\mathrm{core}}}{dT} + \frac{\partial n_{\mathrm{eff}}}{\partial n_{\mathrm{clad}}}\frac{dn_{\mathrm{clad}}}{dT}. \]
Because thermal heating tends to spread across a photonic chip, thermal crosstalk must be appropriately considered and accounted for in chip design.94 
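As a short numerical example of the relations above, the sketch below evaluates the linearized index change and the resulting phase shift; the thermo-optic coefficient (a typical literature value for silicon near 1550 nm), temperature rise, and heated length are assumptions for illustration.

```python
import numpy as np

dn_dT = 1.8e-4        # 1/K, typical thermo-optic coefficient of silicon near 1550 nm (assumed)
delta_T = 10.0        # K, heater-induced temperature change
L = 100e-6            # m, heated waveguide length
wavelength = 1.55e-6  # m, free-space wavelength

delta_n = dn_dT * delta_T                           # linearized index change, beta_1 * delta_T
delta_phi = 2 * np.pi * delta_n * L / wavelength    # phase shift from the relation above
print(f"delta_phi = {delta_phi:.3f} rad ({delta_phi / np.pi:.3f} pi)")
```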

The electro-optical tuning of the refractive index requires materials with significant electric-field-induced optical index changes through the Pockels effect, the Kerr effect, field-induced carrier density changes, or other mechanisms. The magnitude of these index changes varies considerably for differing materials.95 The relationship between the electric field and changes in the index is more complex than for thermo-optic effects and ranges from simple linear relationships in the case of the Pockels effect to more complicated charge carrier distributions through the material for carrier effects. Photo-ionic effects are similar to electro-optical effects except that applied electric fields can physically displace ions to drive them in or out of waveguide materials (e.g., polymers) to semi-permanently change the optical index.96,97

Optical Phase Change Materials (PCMs) achieve phase changes from one material structure to another (e.g., crystalline to amorphous and vice versa), often by external heating—rapid heating to high temperature and cooling vs slow heating and cooling—to induce changes in the refractive index and loss. Heating is often achieved electrically by incorporating pulsed Ohmic heaters as described previously though it is also possible to achieve such heating by optical pulses themselves, thus realizing “all-optical” reconfiguration. However, due to the typical optical power levels required for such an optical reconfiguration of PCMs, optically tuning the PCM can be restrictive compared to other PCM reconfiguration approaches, such as electronic or thermal reconfiguration.98 The non-volatile synaptic reconfiguration achievable by PCMs and photo-ionic materials89 is attractive because no static power consumption is required to maintain the induced changes in material properties.

The modulation of the index of refraction can also be used to modulate the coupling coefficients of directional and bidirectional couplers. Coupling occurs due to the overlap of evanescent fields in the modes of two waveguides; its strength decreases with increasing separation or decreased contrast between core and cladding indices of refraction. The mode-coupling coefficient is often denoted by κ and given in units of inverse length as the amount of coupling between the waveguides depends on the length of the interaction. Mode coupling between waveguides can be derived using coupled-mode theory; for two rectangular waveguides of the same core index, n1, and shared cladding index, n0, the coupling coefficient can be shown to obey the following proportionality:

\[ \kappa \propto e^{-\gamma (D - a)}, \]

where D is the distance between waveguide centers, a is the width of the waveguides in the plane of coupling [see Fig. 6(a)], and γ is the decay constant of the evanescent field in the cladding. Assuming no loss, we can describe coupling in a two-arm coupler—as shown in Fig. 6(b)—from a single incident wave in one arm to the two output arms according to the following (see Ref. 99 for more details on the derivation and nature of waveguide coupling):

\[ B_1 = A_1 \cos(\kappa l), \qquad B_2 = -j A_1 \sin(\kappa l), \]

where l is the length of interaction between the two waveguides and A1 is the normalized incident field amplitude such that the input power is P = |A1|², and likewise for output fields B1 and B2. It is important to note that the coupling coefficient is wavelength dependent, and if multiple wavelengths are used, an appropriate selection of the coupling length must be determined to allow each wavelength to follow the desired coupling between ports.
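The lossless coupled-waveguide solution above can be evaluated directly; the sketch below sweeps the interaction length for an assumed illustrative coupling coefficient and checks that the total power is conserved.

```python
import numpy as np

kappa = 2 * np.pi / 200e-6             # 1/m, illustrative coupling coefficient
A1 = 1.0                               # normalized incident amplitude in the launch arm
lengths = np.linspace(0, 200e-6, 201)  # interaction length swept from 0 to 200 um

B1 = A1 * np.cos(kappa * lengths)         # field remaining in the launch arm
B2 = -1j * A1 * np.sin(kappa * lengths)   # field coupled into the neighboring arm

# power is exchanged periodically between the arms but conserved overall
print(np.allclose(np.abs(B1)**2 + np.abs(B2)**2, A1**2))
print("complete transfer at l =", np.pi / (2 * kappa), "m")
```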
FIG. 6.

Depiction of two coupled rectangular waveguides from two simplified perspectives: (a) a close-up of the waveguide dimensions in the coupling region and (b) a view from above the plane of the coupler with the labeled input port, A1, and output ports, B1 and B2.

Mach–Zehnder Interferometers (MZIs), as depicted in the dashed rectangle of Fig. 7(b), are 2 × 2 reconfigurable photonic couplers that make use of two pairs of phase shifters and bidirectional couplers to implement a 2 × 2 unitary weight matrix, U. If we represent the input as a vector, A, whose elements correspond to the normalized incident field amplitudes, then the normalized field amplitudes at the output are given by B = UA. The pair of phase shifters can be arranged on any two arms (straight waveguide regions) of the MZI though the configuration shown in Fig. 7(b) allows for one of the phase shifters to control the relative phase of the two input signal components in each output. Assuming coherent inputs, perfect 50:50 couplers, and two phase shifters, φ and θ, arranged as shown in Fig. 7(b), the output amplitudes can be described by the following unitary matrix multiplication:

\[ \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = i e^{i\theta/2} \begin{pmatrix} e^{i\varphi}\sin(\theta/2) & \cos(\theta/2) \\ e^{i\varphi}\cos(\theta/2) & -\sin(\theta/2) \end{pmatrix} \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}. \]
As optical input signals propagate through the MZI, the light from each input port interferes with itself and with the light from the neighboring arm, determining what proportion of the input power from each input is coupled to each output port; the reader can verify that the total power is conserved in the multiplication above. In the proper arrangement—as discussed in Subsection II C 2—meshes of MZI units can form unitary matrices of any dimension to behave as a synaptic weight matrix between layers in a neural network.
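The following sketch evaluates the 2 × 2 MZI transfer matrix in the convention written above and verifies unitarity and power conservation; the phase settings and input amplitudes are illustrative.

```python
import numpy as np

def mzi(theta, phi):
    """2 x 2 MZI transfer matrix (internal phase theta, external phase phi)
    in the convention used in the text above."""
    return 1j * np.exp(1j * theta / 2) * np.array([
        [np.exp(1j * phi) * np.sin(theta / 2), np.cos(theta / 2)],
        [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)],
    ])

U = mzi(theta=np.pi / 3, phi=np.pi / 4)
A = np.array([1.0, 0.5 * np.exp(1j * 0.2)])                    # coherent inputs A1 and A2
B = U @ A
print(np.allclose(U.conj().T @ U, np.eye(2)))                  # the transfer matrix is unitary
print(np.isclose(np.sum(np.abs(B)**2), np.sum(np.abs(A)**2)))  # total power is conserved
```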
FIG. 7.

Three common photonic structures for synaptic interconnects: (a) crossbar network, (b) MZI mesh, and (c) MRR bank.

Micro-Ring Resonators (MRRs) are another reconfigurable technology that can be arranged to compute a matrix multiplication. In contrast to the MZI unit, MRR units are often used with wavelength-division multiplexing schemes to implement a “broadcast and weight” architecture.100 Under this scheme, input vectors are encoded as the modulated light intensities of multiple wavelength channels, while each MRR unit acts as a tunable filter to selectively apply attenuation to a specific input wavelength. The micro-ring itself is a waveguide in the shape of a circle placed within an evanescent coupling distance of one or more straight waveguides [see Fig. 7(c)]. For matrix multiplication, it is typical to use two waveguides such that the intensity at one of the two possible output ports—often called through and drop ports—can be modulated according to the desired multiplication by attenuation. These rings form a resonant cavity though other closed-loop waveguide paths can also form the MRR cavity. The length of the path (e.g., circumference in the case of a ring) determines the resonant condition and is tuned by incorporating a phase shifter. Rather than defining the usual coupling coefficient per unit length, analyzing the response of a ring resonator is usually done by defining the power splitting ratios, k_t² and r_t², which are known as the cross-coupling and self-coupling coefficients and correspond to the fraction of input power coupled into the ring or remaining in the through port, respectively. We can also define equivalent coupling coefficients k_d² and r_d² for the interaction between the resonant cavity and the opposing drop waveguide. Assuming no coupling loss (where k² + r² = 1 for both coupling regions) or attenuation in the waveguide, we can calculate the power transmission to each port as

\[ T_{\mathrm{through}} = \frac{r_t^2 + r_d^2 - 2 r_t r_d \cos\phi}{1 + r_t^2 r_d^2 - 2 r_t r_d \cos\phi}, \qquad T_{\mathrm{drop}} = \frac{k_t^2 k_d^2}{1 + r_t^2 r_d^2 - 2 r_t r_d \cos\phi}, \qquad \phi = \frac{2\pi n L}{\lambda_0}, \]

where φ is the round-trip phase, n is the refractive index of all waveguides, L corresponds to the length of the cavity, and λ0 is the free-space wavelength of the incoming signal. We can see that a resonant condition occurs for any input where an integer number of wavelengths fits within the round-trip length of the cavity (see Ref. 101 for the derivation and review of other MRR properties),

\[ m\,\lambda_{\mathrm{res}} = n L, \qquad m \in \mathbb{Z}^{+}. \]

Tuning the resonant frequency and coupling coefficients allows the modulation of the transmission to each port.102 It should be noted that, in contrast to the MZI unit, there will be a loss in transmission to the unused output port of the MRR (except for the case of total transmission at the desired wavelength channel). As such, the multiplication is not unitary as in the case of the MZI. However, as discussed in Sec. II C 2, a matrix multiplier can still be constructed from this unit by assembly into banks.
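The lossless add-drop expressions above can be explored with the short sketch below; the effective index, ring circumference, and coupling ratios are illustrative assumptions.

```python
import numpy as np

n, L = 2.4, 2 * np.pi * 10e-6          # effective index and ring circumference (10 um radius, assumed)
rt2, rd2 = 0.9, 0.9                    # self-coupling power ratios (illustrative)
rt, rd = np.sqrt(rt2), np.sqrt(rd2)
kt2, kd2 = 1 - rt2, 1 - rd2            # lossless coupling regions: k^2 + r^2 = 1

wavelengths = np.linspace(1.53e-6, 1.57e-6, 4001)
phi = 2 * np.pi * n * L / wavelengths                      # round-trip phase
den = 1 + (rt * rd)**2 - 2 * rt * rd * np.cos(phi)
T_through = (rt2 + rd2 - 2 * rt * rd * np.cos(phi)) / den
T_drop = (kt2 * kd2) / den

# on resonance the drop-port transmission approaches 1 for symmetric lossless coupling
print(wavelengths[np.argmax(T_drop)], T_drop.max())
print(np.allclose(T_through + T_drop, 1.0))                # lossless power conservation
```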

2. Assembling photonic synaptic meshes

The reconfigurable photonic elements described above can be combined in many ways to construct a photonic matrix multiplier of a given number of input and output dimensions. Each method requires a different number of tunable elements and involves different design considerations. These methods can be categorized into three prevalent technologies: cross-bar networks, MZI meshes, and MRR banks.

Cross-bar networks, as shown in Fig. 7(a), are the simplest approach, typically aligning incoming signals along one direction (i.e., east-west) and outgoing signals along the other (north-south). Reconfigurable materials, such as PCMs or optical memristive materials, allow incoming light to be coupled into the output waveguides according to synaptic strength. MRRs can also be used to couple light from input to output ports; however, this has not been demonstrated for matrix multiplication to our knowledge. Feldmann et al.60 demonstrated a PCM crossbar for parallel matrix multiplication in a convolutional network and reported 10¹² MAC operations per second for a CNN accuracy of 95.3% compared to 96.1% for the equivalent CNN on a traditional computer. Crossbar networks consist of N inputs and M outputs with N × M connections, allowing for all-to-all connectivity at the cost of N × M reconfigurable coupling elements. Crossbar networks can implement rectangular or square matrices but require careful design to ensure that crossings farther from the input receive sufficient optical power.

MZI meshes are another example of a photonic matrix multiplier and use collections of MZIs as an optical linear unit (OLU) that can perform calculations as a facet of their respective transfer matrix, as shown in Fig. 7(b). Utilizing this structure for the OLU, one can build an arbitrary N × N unitary matrix that consists of MZIs arranged in various mesh topologies, of which the most common are triangular,69 rectangular,103 and diamond104 as depicted in Figs. 8(a)–8(c), respectively. Gu et al. also presented a more complex butterfly topology that utilizes waveguide crossings and reduces the total number of MZI units in comparison to the former three topologies; see Ref. 105 for more details. For the former three topologies, the total number of MZI units for an N × N matrix is exactly N(N − 1)/2 though each MZI is composed of two reconfigurable elements for a total of N(N − 1) controllable parameters.

FIG. 8.

Illustration of the most common MZI mesh topologies for implementing unitary matrix transformations: (a) triangular,69 (b) rectangular,103 and (c) diamond.104 


The rectangular mesh, simply put, connects MZI units side by side, as shown in Fig. 8(b): the upper arm of the previous MZI unit connects to the lower arm of the next MZI unit. The advantage of the rectangular mesh is that its compact arrangement has the minimum optical depth among the configurations mentioned.103 A non-square matrix must be formed by constructing a square matrix with the largest dimension and leaving the additional input or output ports unused. The butterfly mesh is based on the structure of the rectangular mesh with some MZI units pruned to reduce the number of MZI units needed at the cost of some reconfigurability (i.e., not all unitary matrices can be represented).105 Triangular meshes follow a similar connection rule to the rectangular mesh but start with only the two bottom ports and increase the coupling to additional ports in a diagonal line as depicted in Fig. 8(a). Triangular meshes require a higher optical depth and more chip space but support self-configuration mechanisms as demonstrated in Ref. 69 with the same number of parameters as the rectangular mesh. The diamond mesh is a modified version of the triangular mesh; it adds (N − 1)(N − 2)/2 additional MZI units that vertically mirror the shape of a triangular mesh as seen in Fig. 8(c). Shokraneh et al.104 showed that this symmetric topology can provide additional degrees of freedom for weight matrix optimization in the backpropagation training process while also improving the tolerance to fabrication errors.
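To connect the MZI unit to the mesh topologies above, the sketch below composes 2 × 2 MZI blocks in the rectangular (Clements-style) layout and checks that the resulting N × N matrix is unitary; it builds a mesh from random phases rather than decomposing a target matrix, and the MZI convention follows the earlier expression.

```python
import numpy as np

def mzi(theta, phi):
    """2 x 2 MZI block (same convention as earlier in this tutorial)."""
    s, c = np.sin(theta / 2), np.cos(theta / 2)
    return 1j * np.exp(1j * theta / 2) * np.array([[np.exp(1j * phi) * s, c],
                                                   [np.exp(1j * phi) * c, -s]])

def rectangular_mesh(N, thetas, phis):
    """Embed N(N-1)/2 MZIs column by column in the rectangular (Clements) layout."""
    U = np.eye(N, dtype=complex)
    idx = 0
    for col in range(N):
        start = col % 2                            # alternate between even and odd waveguide pairs
        for row in range(start, N - 1, 2):
            T = np.eye(N, dtype=complex)
            T[row:row + 2, row:row + 2] = mzi(thetas[idx], phis[idx])
            U = T @ U
            idx += 1
    return U

N = 4
rng = np.random.default_rng(3)
n_mzi = N * (N - 1) // 2
U = rectangular_mesh(N, rng.uniform(0, 2 * np.pi, n_mzi), rng.uniform(0, 2 * np.pi, n_mzi))
print(np.allclose(U.conj().T @ U, np.eye(N)))      # the composed mesh is unitary
```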

MRR banks, as previously mentioned, take advantage of WDM to broadcast spiking signals widely and weight them with selective filters at the receiving neuron.106 Banks of MRRs are formed as shown in Fig. 7(c) by aligning them along a shared pair of waveguides while varying the radius of each ring sufficiently to avoid wavelength collision. Each receiving neuron has its own dedicated MRR bank to implement the incoming synaptic strengths of each wavelength before a balanced detector can measure the overall incoming signal intensity. The number of MRRs in each bank matches the number of sending neurons in the network, while the number of banks matches the number of receiving neurons. This means that a total of N × M MRRs will be needed to implement a fully connected network between a layer of N sending neurons and M receiving neurons.
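
Functionally, each MRR bank followed by a balanced detector computes a signed weighted sum over the wavelength channels. A minimal sketch of this behavior (assuming ideal, collision-free filters and a drop-port power fraction w per ring) is shown below.

```python
import numpy as np

# WDM broadcast-and-weight: each of N wavelengths carries one presynaptic
# signal; the MRR tuned to wavelength i routes a fraction w[i] of that power
# to the drop port and (1 - w[i]) to the through port. A balanced detector
# takes drop minus through, yielding an effective signed weight 2*w - 1.
rng = np.random.default_rng(2)
N = 8
p = rng.uniform(0.0, 1.0, size=N)          # per-wavelength input powers
w = rng.uniform(0.0, 1.0, size=N)          # MRR drop-port fractions

i_drop = np.sum(w * p)
i_through = np.sum((1.0 - w) * p)
i_net = i_drop - i_through                 # weighted sum with signed weights
print(i_net, np.sum((2.0 * w - 1.0) * p))  # identical by construction
```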

3. Photonic and optoelectronic nonlinear neurons

After receiving sufficiently strong stimuli, biological neurons emit electrical pulses known as action potentials or spikes. Encoding of information in the form of spike timing (temporal coding) or the spike rate (rate coding) has been a subject of active research. In designing nanophotonic spiking neural networks, the three essential elements—the neuron, the synapses, and the coding scheme—should be designed together to have the following attributes:51 

  • weighted addition: the ability to sum weighted inputs,

  • integration: the ability to integrate the weighted sum over time,

  • thresholding: the ability to decide whether or not to send a spike (all-or-none),

  • reset: the ability to have a refractory period during which no firing can occur immediately after a spike is released, and

  • pulse generation: the ability to generate new pulses.

Biological neurons consist of three primary structures: dendrites, soma, and axon.80,81 The neuron body, or soma, performs the thresholding function, accumulating input currents from dendritic trees until the internal voltage meets the condition for spike generation. Exact mechanisms for this spike generation vary in biological realism and complexity and are reviewed in Refs. 107 and 108. Photonic implementations of this function can be broadly classified into all-optical (or photonic) and electro-optic approaches.

All-optical neurons tend to use more simplified neuron models due to the difficulties in implementing optical nonlinearity. One of the simplest approaches uses traditional ANN activation functions, such as the sigmoid function, and maps this onto spiking hardware as in a rate-encoded ANN translation.109 Another common choice is the leaky-integrate-and-fire (LIF) model,110 in which an internal state variable—representing the membrane potential in biology—constantly decays exponentially toward some equilibrium value. Incoming spikes increment the membrane potential according to synaptic strength, and the neuron fires if the potential surpasses some threshold before decaying back to its resting potential. Choices of nonlinearity to implement these models include the use of semiconductor optical amplifiers (SOAs),109,110 vertical-cavity surface-emitting lasers (VCSELs),111–113 saturable absorption,43,114–117 and more recently passive micro-resonators.118 These optical neurons have not yet been demonstrated in large-scale networks, and more work is needed to establish their computational abilities.
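
As a point of reference, the short sketch below implements a discrete-time LIF neuron exhibiting the attributes listed above (weighted addition, integration, thresholding, reset/refractory period, and pulse generation). All parameter values are illustrative and are not tied to any particular photonic device.

```python
import numpy as np

def lif(inputs, weights, tau=20.0, v_th=1.0, v_rest=0.0, refractory=3, dt=1.0):
    """Minimal discrete-time leaky integrate-and-fire neuron."""
    v, refr, spikes = v_rest, 0, []
    for x in inputs:                        # x: vector of presynaptic events
        if refr > 0:                        # refractory period: ignore inputs
            refr -= 1
            spikes.append(0)
            continue
        v += (-(v - v_rest) / tau) * dt     # leak toward the resting potential
        v += float(np.dot(weights, x))      # weighted addition of inputs
        if v >= v_th:                       # threshold crossing -> emit a pulse
            spikes.append(1)
            v, refr = v_rest, refractory    # reset and start refractory period
        else:
            spikes.append(0)
    return spikes

rng = np.random.default_rng(3)
events = rng.integers(0, 2, size=(50, 4))   # 50 time steps, 4 input channels
print(lif(events, weights=np.array([0.3, 0.2, 0.25, 0.15])))
```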

Optoelectronic neurons, the alternative, combine the advantage of well-studied electronic nonlinearities with the fast, nearly lossless transmission119 and zero-energy weighting provided by the photonic devices discussed in Sec. II C 2. O/E/O conversion uses photodetectors to generate an electrical current in proportion to the received optical power, thus converting the aggregated optical inputs from the synaptic mesh into electrical currents. The electrical circuit, in turn, can use any nonlinear circuit element to implement the spiking function and generate an optical output using semiconductor lasers. Nozaki et al.120 demonstrated that close integration between the photodetector and the modulator reduced the integrated capacitance to 2 fF and that non-spiking neural nonlinearity can be achieved at an extremely low energy consumption of 4.8 fJ/bit at a speed of 10 Gbit/s. However, such a neuron architecture requires a constantly powered laser source; the continuous currents supplying the lasers described in Ref. 120 consume a significant amount of energy even when the neurons are idle. Spiking dynamics, in contrast, can be implemented by closely integrating CMOS transistors with photodiodes and electro-optic modulators, and, as discussed in Sec. II B 1, spiking neural networks offer far better overall energy efficiency due to the sparse, event-driven nature of communication in neuromorphic computing. Recently, a low-power, 6-transistor soma design has been demonstrated for spiking optical neural networks.121 This soma design consumes 21.09 fJ/spike with a maximum spiking rate of 10 GHz in 90 nm CMOS.
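
As a rough sanity check only, a simple switching-energy estimate of E ≈ CV² with the reported 2 fF capacitance and an assumed drive voltage near 1.5 V gives about 2 fF × (1.5 V)² ≈ 4.5 fJ per transition, the same order as the reported 4.8 fJ/bit; both the E ≈ CV² form and the drive voltage are assumptions made here for illustration.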

Similar O/E/O circuit designs can potentially allow the use of more biologically accurate neuron models by replacing the transistor circuit with any desirable analog electrical neuron nonlinearity, such as that presented in the study by Farquhar and Hasler.122 Unlike all-optical neurons, O/E/O conversion limits the response speed of a single optoelectronic neuron because of the analog bandwidth limitation of the electronics. However, network-level throughput still benefits from optical parallelism in the wavelength and spatial domains. Miscuglio et al.123 argued that, for CMOS-compatible integrated neuromorphic devices, >25 GHz operation is possible with a relatively low energy consumption of <10 fJ/b. Careful benchmarking of system-wide throughput, energy consumption, and latency for given workloads is necessary to correctly compare neuromorphic computing systems across the various optical, electrical, and optoelectronic technologies.

Gradient descent algorithms, such as backpropagation and its many variants, are often used to train ANNs and have recently been applied to PNNs. A cost or loss function associated with the neural network numerically penalizes differences between the outputs of the ANN and their respective target values. The matrix elements representing the weights of the synaptic interconnections within the network are tuned to minimize the cost function over the input space based on computed gradients with respect to each weight. Backpropagation is a class of gradient descent algorithms that extends this procedure by sequentially applying the chain rule to calculate gradients from the output to the input layer. Recurrent layers within networks can apply backpropagation through time by unrolling the network activity into time steps. Various other forms of backpropagation can be found in the literature; see Ref. 124 for a recent review of these techniques in the context of deep neural networks.
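
For readers less familiar with the procedure, the toy example below runs plain gradient descent with backpropagation on a tiny two-layer network; it is a generic ANN training loop, not a model of any of the photonic implementations discussed here.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(32, 3))                # 32 samples, 3 input features
Y = rng.normal(size=(32, 2))                # 2 regression targets
W1, W2 = rng.normal(size=(3, 8)) * 0.1, rng.normal(size=(8, 2)) * 0.1
lr = 0.05

for step in range(200):
    H = np.tanh(X @ W1)                     # hidden-layer activation
    P = H @ W2                              # network output
    loss = np.mean((P - Y) ** 2)            # cost function to be minimized

    # Backpropagation: chain rule applied from the output toward the input layer.
    dP = 2.0 * (P - Y) / len(X)
    dW2 = H.T @ dP
    dH = (dP @ W2.T) * (1.0 - H ** 2)       # tanh derivative
    dW1 = X.T @ dH

    W1 -= lr * dW1                          # gradient-descent weight updates
    W2 -= lr * dW2
print(f"final loss: {loss:.4f}")
```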

Recent work125 used photonics and adjoint variable methods (AVMs) to derive a photonic implementation of backpropagation for training PNNs. AVMs use the field solutions of two electromagnetic simulations, referred to as the adjoint field and the original field, to obtain the gradient of the cost function of the photonic neural network. This gradient can be expressed compactly in terms of quantities measurable within the network and allows parallel gradient computation for all phase shifters within an MZI mesh. The gradient computation can then be completed using intensity measurements of the adjoint field, which physically corresponds to a backward-propagating waveguide mode sent into the system through the output ports. Such a backpropagation implementation is robust under noise and scaling while allowing in situ gradient computation for photonic neural networks.

Hebbian learning is a more biologically derived class of algorithms based on Hebb’s original postulate that, in general, synaptic connections are strengthened between neurons whose activity is correlated.126 The exact weight-update rule varies from one mathematical formulation to another; it is often based either on the activity of neurons—such as the average firing rate of spiking neurons or the activation function of non-spiking neurons—or on the timing differences between pre- and post-synaptic firing in explicitly spiking neuron models.127 Two- and three-factor learning rules128 are an example of the former set of algorithms: spike trains are smoothed by convolution with a kernel (such as an exponential decay filter) and used to numerically represent the activity of pre- and post-synaptic neurons; see Ref. 127 for a deeper review of the mathematical formulations of Hebbian learning. The latter subclass of Hebbian learning is called Spike-Timing-Dependent Plasticity (STDP), in which weight updates are calculated from a nonlinear function of the time interval, ΔT, between pre- and post-synaptic spikes.69
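
The window function itself is model dependent; one commonly used exponential form, shown here purely as an illustrative example rather than as the specific rule of the cited works, is

Δw(ΔT) = A₊ exp(−ΔT/τ₊) for ΔT > 0, and Δw(ΔT) = −A₋ exp(ΔT/τ₋) for ΔT < 0,

where ΔT is the post-synaptic minus pre-synaptic spike time, and A₊, A₋, τ₊, and τ₋ set the magnitudes and time scales of potentiation and depression.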

Various photonic synapses have demonstrated STDP behavior by modulating the gain of SOAs,129,130 by changing the transmission levels of phase-change memory materials,131,132 and by other photoelectric device mechanisms.131 STDP supports both supervised133 and unsupervised134 learning rules, which have been demonstrated in simulation by Xiang and colleagues. In Ref. 133, a VCSEL-SA structure is simulated to implement a supervised XOR learning task with an STDP learning rule. The VCSEL-SA is chosen because its inhibitory dynamics can serve as inhibitory weights, as seen in Ref. 85. The XOR task was solved with supervised training in which the weight update follows a photonic STDP learning rule derived in Ref. 135, with the sign of the update inverted when the output does not match the target value. Training was shown to converge completely after 40 simulated epochs. Xiang and colleagues also demonstrated an unsupervised pattern recognition task with similar components. In both works, a controller circuit calculates the STDP weight updates used to program the synaptic weights.

Amato et al.136 compared multiple hybrid networks composed of Hebbian-trained and gradient-descent-trained layers to isolate the respective advantages of each technique in the context of a classification task using feedforward convolutional neural networks (on a traditional computing platform). Their analysis empirically showed that Hebbian learning falls behind gradient descent for deep networks, with evidence that intermediate layers are not trained as effectively as layers near the beginning or end of the network. It was also found that Hebbian learning requires far fewer epochs (iterations over the training set) than gradient descent, converging after 2 epochs vs 20 epochs, respectively.

Hebbian learning rules have the advantage of requiring only the information that is local to a given synapse. For example, two-factor learning rules compute weight updates from traces (filtered spike trains), representing average pre- and post-synaptic activities over a given time scale. As such, it is conceivable to implement an architecture that calculates all weight updates entirely in parallel, as opposed to gradient descent algorithms, which must sequentially propagate the credit of error backward through the network layer by layer and thus fundamentally require some amount of serial computation. This advantage, in turn, also allows Hebbian learning to be easily applied in online learning tasks wherein learning is performed whenever new information is made available. Gradient descent algorithms, in contrast, are more often used with batch training where the error is minimized across multiple samples simultaneously. As such, Hebbian learning is advantageous in settings with real-time learning requirements, while gradient descent is more amenable to applications in which many novel input samples are generated nearly simultaneously. See Ref. 137 for more on the advantages and formulations for online learning in Hebbian and spiking neural networks.
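
A minimal sketch of such a local, trace-based (two-factor) update is given below; every synapse uses only its own pre- and post-synaptic traces, so the entire weight matrix can be updated in one parallel step per time step. The time constants and learning rate are illustrative assumptions.

```python
import numpy as np

# Local two-factor Hebbian update using exponentially filtered spike trains
# ("traces"). Each synapse depends only on its own pre/post traces, so the
# outer-product update below touches all weights in parallel.
rng = np.random.default_rng(5)
T, n_pre, n_post = 200, 6, 3
pre = rng.integers(0, 2, size=(T, n_pre))    # presynaptic spike trains
post = rng.integers(0, 2, size=(T, n_post))  # postsynaptic spike trains

W = np.zeros((n_pre, n_post))
x_pre, x_post = np.zeros(n_pre), np.zeros(n_post)
alpha_pre, alpha_post, eta = 0.9, 0.9, 1e-3

for t in range(T):
    x_pre = alpha_pre * x_pre + pre[t]       # decaying trace of pre activity
    x_post = alpha_post * x_post + post[t]   # decaying trace of post activity
    W += eta * np.outer(x_pre, x_post)       # correlate traces, all at once
print(W.round(3))
```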

Table II tabulates and compares existing architectures for photonic neuromorphic processing developed by well-known institutions and organizations in the field. The table lists the technologies used for synaptic meshes, the neural nonlinearities, and any reported performance metrics, where applicable. Synaptic meshes are distinguished by topology, the learning mechanism used for the network, the demonstrated network size (experimental or simulated), and the reconfigurable elements employed. Neuron models are distinguished by the functional form of the nonlinearity and by the use of optoelectronic or fully photonic devices. Finally, the performance of each architecture is reported as presented in the original work, given the lack of a single established performance metric agreed upon within the field (discussed further in Sec. IV). As such, this table aims to summarize and contextualize existing approaches to photonic neuromorphic computing.

TABLE II. Comparison of approaches to photonic neuromorphic computing.a

Research group | Synaptic network architecture | Synaptic reconfiguration mechanism | Neural network topology | Demonstrated network size | Learning | Reported performance | Multiplexing | Neurons/nonlinearity | References
UC Davis | Rectangular and triangular MZI meshes | Thermo-optic (SOI) | Feed-forward TNN | 1024 × 1024 (sim) | EC standard BP and TT decomposition | MAC/J | WDM and SDM | Optoelectronic Izhikevich spiking neuron | 121, 138, and 139
Princeton | MRR banks | Thermo-optic (SOI with Ti/Q heaters) | Feed-forward | 2 × 3 w/0–250 broadcast nodes (sim/exp) | NR | Extinction ratio > 13 dB | WDM | Optoelectronic leaky integrate-and-fire spiking neuron | 100, 106, and 140
Monash University | Rectangular MZI mesh | Electro-optical phase shifter | Convolutional (3 × 3) | 900 × 10 (sim/exp) | EC standard BP | 11.321 TOPS | WDM | EC sigmoid non-spiking neuron | 58 and 64
Aristotle University of Thessaloniki (Pleros) | Rectangular MZI mesh | Electro-optical (SOA) | Recurrent | 32 × 3 (sim), 4 × 4 (exp) | EC standard BP | SOA1 = 180 pJ/symbol and SOA2 = 300 pJ/symbol | WDM | SOA-based sigmoid non-spiking nonlinearity | 141–143
Stanford | Rectangular MZI mesh | Electro-optical phase shifter | Feedforward (FT pre-processed) | 16 × 10 (sim) | EC custom BP | 94% classification accuracy (MNIST); 7.7 × 10¹² MAC/s | WDM | Custom optoelectronic non-spiking nonlinearity | 144–146
McGill University | Diamond MZI mesh | Electro-optical phase shifter | Feedforward | 4 × 4 mesh (sim/exp) | EC custom algorithm | 98.9% accuracy (0 dB MZI loss) and 75% accuracy (0.5 dB MZI loss) | WDM | NA | 147–149 and 104
MIT | Diamond MZI mesh | Thermo-optic (SiPh + PCM) | Feedforward | 4 × 4 mesh (sim) | EC standard BP | ∼100 pJ/FLOP | TDM | NA (sug. bistable nonlinear photonic crystals) | 150
Ghent University | Crossbar network | Electro-optical (PCM) | Passive reservoir computing | 4 × 4 (sim) | EC complex-valued ridge regression | Minimum error rate = 10⁻³ | TDM | Photodetector non-spiking nonlinearity | 151–156
NTT | Crossbar network | Electro-optical (cross-gain-modulated SOA) | Reservoir computing (RNN) | Single device | NR | 43 mW consumed, normalized mean square error (NMSE) ∼ 0.112 | TDM | Custom SOA non-spiking nonlinearity | 74
NIST | Rectangular MZI mesh | Electro-optical (superconducting-nanowire single-photon detectors) | Spiking feedforward | 49 SNSPDs (exp) | NR | NR | TDM | Optoelectronic integrate-and-fire spiking neuron | 157–160
University of Washington | Crossbar network | All-optical (PCM [GST]) | Convolutional (2 × 2) | 256 × 256 input (sim/exp) | EC standard BP | 25 TOPS/mm² | WDM | NA | 78 and 161–163
George Washington University | Rectangular MZI mesh | Thermo-optic (PCM [GSST-Si]) | Feedforward (FF); convolutional | FF: 784 × 100 × 10 (sim); CNN: NR | EC standard BP | 93% inference accuracy (MNIST) | WDM | Custom sigmoidal non-spiking nonlinearity | 77 and 164
University of Paris-Saclay | MRR banks (arranged in crossbar-like topology) | All-optical (SOI) | Swirl reservoir computing | 4 × 4 (sim) | EC ridge regression | XOR task at 20 Gb/s with BER < 10⁻³ and injection power < 2.5 mW | WDM | Custom MRR non-spiking nonlinearity | 83, 165, and 166

a NA: not applicable, NR: not reported, EC: externally computed, BP: backpropagation, FOM: figure of merit, sim: simulated, exp: experimentally demonstrated, SOI: silicon on insulator, SiPh: silicon photonics, SOA: semiconductor optical amplifier, PCM: phase change materials, GST: GeSbTe, and GSST: GeSeSbTe.

One of the major remaining challenges of both electronic and photonic neuromorphic computing is the physical composition of large-scale neural networks. The photonic (and general) neuromorphic technologies and methods discussed thus far have addressed scalability in several complementary ways: by reducing algorithmic complexity,52,53 by increasing parallelism through multiplexing (WDM,60,65,77 TDM,78 and SDM58), and, in a general sense, by decreasing the energy consumption of photonic components.47,58,121 Nonetheless, as the availability of data increases and the demand for computing resources rises to match, so will the demand for large-scale neural networks with high neural and synaptic densities. In such cases, the number of distinguishable modes or wavelength channels in photonic networks may become a barrier to further increases in parallelism. As such, other methods must be developed to increase neural and synaptic densities to meet the needs of large-scale networks in reasonably sized form factors.

The remainder of this section introduces and discusses two promising photonic technologies under active research that may improve the physical scalability of future integrated photonic and optoelectronic neural network architectures. The first is a recently developed technique, called tensor-train decomposition,167 that reduces the structure of the neural network to only the fundamental elements (called “tensor cores”) required for fast, accurate training. The second uses new fabrication techniques168,169 to reorganize the floor plan of fabricated circuits so that the third dimension is used more efficiently, thus realizing a 3D neuromorphic system much like the biological brain. Future work is needed to apply these algorithmic and manufacturing innovations to novel photonic neuromorphic computers.

Tensor-train decomposition (TT decomposition)167 is a decomposition algorithm that represents each element of a d-dimensional tensor as a product of d three-dimensional tensor cores,

A(i_1, i_2, …, i_d) = G_1[i_1] G_2[i_2] ⋯ G_d[i_d],   (3.1)

where G_k[i_k] is an r_{k−1} × r_k matrix. The product results in an r_0 × r_d matrix; hence, the condition r_0 = r_d = 1 is imposed to match the input and output dimensionality. This notion of rank differs from the canonical tensor rank in that the ranks r_k can be calculated as the ranks of known auxiliary matrices. The tensor cores G_k in Eq. (3.1) are three-dimensional arrays G_k(α_{k−1}, i_k, α_k), which can be treated as r_{k−1} × n_k × r_k arrays with G_k(α_{k−1}, i_k, α_k) = (G_k[i_k])_{α_{k−1}, α_k}.167
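
The sketch below evaluates a single tensor element directly from Eq. (3.1) by multiplying the corresponding core slices; the mode sizes and TT ranks are arbitrary assumptions chosen only to demonstrate the bookkeeping.

```python
import numpy as np

# Eq. (3.1): an element of a d-dimensional tensor is the product of d core
# matrices G_k[i_k] of size r_{k-1} x r_k, with r_0 = r_d = 1.
n = [4, 5, 3, 6]                 # mode sizes n_1..n_d (assumed)
r = [1, 2, 3, 2, 1]              # TT ranks r_0..r_d with r_0 = r_d = 1 (assumed)
rng = np.random.default_rng(6)
cores = [rng.normal(size=(r[k], n[k], r[k + 1])) for k in range(len(n))]

def tt_element(cores, idx):
    """Evaluate A(i_1, ..., i_d) = G_1[i_1] G_2[i_2] ... G_d[i_d]."""
    out = np.eye(1)
    for G, i in zip(cores, idx):
        out = out @ G[:, i, :]   # multiply by the r_{k-1} x r_k slice
    return out.item()            # final r_0 x r_d = 1 x 1 result

print(tt_element(cores, (1, 3, 0, 2)))
# Storage scales as sum_k r_{k-1} n_k r_k parameters instead of prod_k n_k.
```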

Shallow networks with large fully connected layers can achieve almost the same accuracy as an ensemble of CNNs.170 Therefore, it is highly desirable to implement high-radix (e.g., 1024 × 1024) photonic synaptic interconnections. However, an N × N MZI mesh representing a unitary weight matrix requires a minimum of N(N − 1) reconfigurable elements and N cascaded stages,103 limiting mesh scalability for high-radix interconnections. Tensor-train decomposition offers increased scalability to PNNs by reducing the number of elements in the network (i.e., fewer MZI units). High-radix meshes are formed by cascading smaller-radix meshes called photonic tensor-train cores.138 Other benefits include the resulting reduction in optical insertion loss and the decreased chip size for a given network.
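
For scale, a single 1024 × 1024 unitary mesh would require 1024 × 1023/2 ≈ 5.2 × 10⁵ MZIs (roughly 1.05 × 10⁶ reconfigurable phase elements) and 1024 cascaded stages, which illustrates why decomposing high-radix interconnections into small tensor-train cores is attractive.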

Tensorized PNNs were proposed in Ref. 138 for deep feed-forward neural networks with a rectangular MZI mesh. In principle, however, the architecture can also be applied to spiking neural networks and recurrent neural networks. The proposed network utilizes a discrete-time representation of input signals, where each input sample is amplitude modulated onto a continuous-wave optical carrier at specific time intervals. A diagram of the architecture for a conventional PNN and a tensorized PNN is shown in Fig. 9, detailing the architectural differences that arise from utilizing tensor-train decomposition in the training process. By adding parallelism in both the wavelength domain using WDM technology and the space domain using 3D photonics, the proposed tensorized PNNs maintain all the benefits of conventional PNNs while reducing the insertion loss by 171.8 dB and the number of MZIs by a factor of 582.

FIG. 9. Diagram detailing the architecture for (a) a conventional PNN and (b) a tensorized PNN.

A simulation demonstration has been completed that accounts for hardware implementation challenges, such as phase-shifter variations and beam splitter power imbalances.171 The simulation uses cross-entropy loss to benchmark the tensorized PNNs against conventional PNNs and Fourier-transform-preprocessed PNNs. In particular, the accuracy of the different network models was studied for handwritten digit recognition on the MNIST dataset. The simulation results demonstrate that the TNNs are robust against phase-shifter variations and beam splitter power imbalances in terms of the overall accuracy of the trained TNN model on the MNIST dataset. Furthermore, the photonic TNN implementation can achieve >90% classification accuracy while using 33.6× fewer MZIs than a conventional ANN, which achieves only 71.6% accuracy under the practical hardware imprecisions studied. Further implications of the architecture for scalability, accuracy, learning, and hardware implementation are under active study.138,139,171

3D integration is essential for practical photonic neuromorphic computing since typical photonic devices are many wavelengths in size. Together with control electronics, 3D electronic-photonic integrated circuits (3D EPICs) must be considered. At the heart of the 3D EPIC is a through-silicon optical via (TSOV) with silicon photonic vertical reflectors. Recently, a UC Davis team experimentally demonstrated a 90° vertical coupler, illustrated in Fig. 10, which consists of a silicon photonic vertical via and a 45° reflector attached to a waveguide end.168,169 The interlayer connection loss is 1.3 dB (or 0.65 dB per coupling)169,172 and is limited by the mode matching between the lateral and vertical waveguides for the given 220-nm-thick silicon rib waveguides.169,172 For thicker silicon rib waveguides (e.g., 500 nm thick), the loss can be reduced to 0.5 dB per interlayer connection. This vertical coupler can also be used for interlayer coupling in a multi-layer silicon photonic 3D integrated circuit by placing a matching vertical coupler face-to-face. For coupling between the silicon photonic waveguide layer and a silicon nitride layer, inverse-tapered couplers can be utilized, where UC Davis and other groups have already demonstrated interlayer coupling losses of ∼0.01 dB.173,174
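
To put the coupling loss in perspective, a path that traverses, for example, four interlayer transitions would accumulate roughly 4 × 0.65 dB ≈ 2.6 dB with the demonstrated couplers, or about 2 dB with the projected 0.5 dB-per-connection couplers; the number of transitions is a hypothetical choice used only for illustration.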

FIG. 10. Schematic of a 3D optoelectronic neuromorphic computing platform. Each wafer can be 3D stacked utilizing a 3D vertical optical via using the 3D silicon photonic fabrication technique including 45° reflectors.80,81

Successfully comparing any two things requires a system that can meaningfully establish their value. At the lowest level in the field of computing, value is placed on computational ability as measured by quantities such as latency, throughput, accuracy, and energy efficiency. A complication arises because any of these metrics may change in the context of a specific application. This challenge is already present for conventional general-purpose computing architectures, resulting in competing standards for benchmarking CPUs.175,176 It is further exacerbated by the more open-ended and varied goals of neuromorphic computing architectures and by the ambiguity of what a biological brain “does” and, consequently, what a brain-like or brain-inspired architecture should also “do.” In traditional computing, different architectures may emphasize optimizing the aforementioned values for different computational units based on the needs of the design—integer operations vs floating-point operations, for example. In the neuromorphic case, these operations may take the form of individual neural state updates, processing of spike traffic, or synaptic weight updates, each of which may be broken down into further suboperations. In summary, a good choice of benchmark addresses two questions: (1) how to establish value and (2) how to establish fairness. The remainder of this section describes commonly referenced metrics of comparison for photonic neuromorphic devices before describing a more general approach to benchmarking that can be applied to neuromorphic computers across electronic and photonic architectures.

Multiple photonic devices of interest58,60,121,177 have reported their value in terms of the energy efficiency and throughput of multiply–accumulate (MAC) operations. Since MAC operations require many parallel memory accesses for large networks, they tend to contribute to the bottleneck for network-based computation on von Neumann architectures. While it is undoubtedly true that the MAC operation is a significant component of network computation, a singular focus on the performance of this operation would be far too myopic, as the behavior of biological networks and real-time interactions may drastically affect performance. For example, the performance of the 11 tera-operations per second (TOPS)58 photonic convolutional accelerator is only realized for workloads that actually demand that many operations in a given amount of time. In contrast, the real-time applications that biological brains are best suited for may not produce enough data for these processing speeds to be relevant.

Some have attempted to account for the device footprint in relation to the improvements in MAC throughput and energy.177 While this approach more fairly compares the performance of matrix multipliers across photonic and electronic platforms, it does not address the other factors involved in neural network processing, which many contemporary photonic devices offload to post-processing on traditional computers. Additionally, the SNN architecture—considered favorable by many in the context of neuromorphics—relies on an accumulate-and-fire operation in which the membrane potential is a continuous state variable that undergoes continuous update in the ideal (analog) case, at which point the definition of a single MAC operation may not be clear.

Furthermore, Cole178 suggested that when programmability and data transfer are considered, the energy consumption of the computational elements is negligible for both electronic and photonic approaches. Cole suggested that when the energy consumption of optical receivers is considered, there is no advantage to photonic computation over a fully optimized electronic computer. Instead, Cole claimed that energy-reduction efforts should focus on the adoption of optical data transfer rather than optical computation. It is important to note that the computation considered is binary and that representations in neuromorphic computation will not necessarily take this form. Nonetheless, this result demonstrates the importance of comparing neuromorphic processors in their entirety rather than considering only the consumption of particular computational elements.

More mature neural network processors—for example, the electronic neuromorphic devices TrueNorth,13 Loihi,14 and Neurogrid11—have instead reported their achievements in terms of energy consumption per spike or energy per bit. Such metrics can neglect the question of fairness, as changes in workload or architecture can drastically change them. When reporting the energy per spiking event, for example, it is unclear whether the operations contributing to the membrane potential updates should be included. For a different workload, the number of events before the neuron reaches threshold and fires may vary, resulting in an inconsistent metric. If subthreshold operations are included, a digital architecture with discrete timesteps might require more energy per spike for workloads with longer gaps between spikes. If subthreshold operations are not included, then an architecture that chooses the smallest possible spiking energy may appear more efficient than another architecture that compensates with nearly passive energy costs for incoming spike accumulation. In such an architecture, spike energy may be less important; after all, sparsity in time is a major advantage of spiking networks. Furthermore, metrics involving units of bits are specific to a given architecture in that architectural choices determine what role these bits play and whether the bit width is flexible, thus making it more difficult to establish fair comparisons between widely different architectures. It has even been argued that bit precision is not significant in neuromorphic computing, given that one of the goals of the field is to perform computation with low-precision elements.11
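
As a purely hypothetical illustration, a neuron that integrates 100 subthreshold events at 1 fJ each before emitting a single spike whose generation costs 10 fJ could be reported as either 10 fJ/spike or 110 fJ/spike depending on whether subthreshold operations are counted, an order-of-magnitude spread arising from accounting choices alone.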

Proper benchmarking of neuromorphic computers should take inspiration from solutions developed in traditional computing; various benchmarking standards have been proposed, such as SPEC175 and MLPerf,176 which attempt to fairly discriminate the advantages and disadvantages of various architectures in different contexts. Mike Davies, Director of Intel’s Neuromorphic Computing Lab, has suggested that the field of neuromorphics has not yet matured enough to establish the specific operations that a fully qualified neuromorphic computer should support, yet he proposes a benchmarking suite known as SpikeMark.50 In this suite, various workloads, such as spoken keyword classification or hand gesture recognition, would be used to determine an architecture’s feature set and flexibility in various contexts while providing a standard for comparing energy efficiency and performance. As the name implies, SpikeMark focuses on spiking network workloads, though it is important to note that spiking behavior may not be necessary for useful neuromorphic devices. In the book How to Build a Brain,179 Eliasmith describes a set of “Core Cognitive Criteria” that attempts to answer the question of what a brain “does” and can also act as a framework for designing neuromorphic benchmarks. The criteria are broken down into three categories—representational structure, performance concerns, and scientific merit—which are agnostic to the choice of spiking or non-spiking neuron model and have been applied to the design of a large-scale network model known as SPAUN.180

Further work is still needed to resolve the ambiguity of what workloads should be considered fundamental to a neuromorphic processor and establish an official benchmarking standard. Regardless, the device footprint, energy efficiency, and relevant processing speeds should be considered jointly across multiple minimally overlapping tasks representing the desirable computational characteristics that researchers seek to borrow from biological brains.

Recent advances in photonic circuits have led to theoretical studies and experimental demonstrations of synaptic interconnects with reconfigurable photonic elements capable of arbitrary linear matrix operations—including MAC operations and convolution—at extremely high speed and energy efficiency. Both all-optical and optoelectronic neurons with nonlinear transfer functions have also been investigated. A number of research efforts have reported orders-of-magnitude estimated improvements in computational throughput and energy efficiency. While photonic technologies are relatively immature compared to their electronic counterparts, silicon photonics has emerged as a viable platform for integrated photonic neuromorphic circuits. However, substantial challenges remain in several areas: (a) cross-layer co-design of algorithms, architectures, circuits (photonic and electronic), training/inference, and benchmarking; (b) heterogeneous integration of dissimilar materials if non-volatile synaptic reconfigurability is to be incorporated; (c) realizing high-density 3D integration; and (d) implementing scalable learning and inference in the resulting system. With rapid and accelerating progress in this nascent area of research and development, we expect to see key advances in addressing each of the above challenges toward viable photonic and optoelectronic neuromorphic computing in the future.

The authors acknowledge enlightening discussions with Professor David A. B. Miller. This work was supported, in part, by AFOSR under Grant No. FA9550-18-1-0186.

The authors have no conflicts to disclose.

The data that support the findings of this study are available from the corresponding authors upon reasonable request.

1.
D.
Silver
et al, “
Mastering the game of Go with deep neural networks and tree search
,”
Nature
529
(
7587
),
484
489
(
2016
).
2.
A. M.
Turing
, “
On computable numbers, with an application to the Entscheidungsproblem. A correction
,”
Proc. London Math. Soc.
s2-43
(
1
),
544
546
(
1938
).
3.
J.
Backus
, “
Can programming be liberated from the von Neumann style?
,”
Commun. ACM
21
(
8
),
613
641
(
1978
).
4.
A. A.
Demkov
,
C.
Bajaj
,
J. G.
Ekerdt
,
C. J.
Palmstrøm
, and
S. J. B.
Yoo
, “
Materials for emergent silicon-integrated optical computing
,”
J. Appl. Phys.
130
(
7
),
070907
(
2021
).
5.
J.
Hsu
, “
IBM’s new brain [News]
,”
IEEE Spectrum
51
(
10
),
17
19
(
2014
).
6.
T.
Nguyen
, “
Total number of synapses in the adult human neocortex
,”
Undergrad. J. Math. Modell. One + Two
3
(
1
),
26
(
2010
).
7.
D.
Attwell
and
S. B.
Laughlin
, “
An energy budget for signaling in the grey matter of the brain
,”
J. Cereb. Blood Flow Metab.
21
(
10
),
1133
1145
(
2001
).
8.
B. J.
Copeland
, “
The turing test*
,”
Mind Mach.
10
(
4
),
519
539
(
2000
).
9.
A.
Pinar Saygin
,
I.
Cicekli
, and
V.
Akman
, “
Turing test: 50 years later
,”
Mind Mach.
10
(
4
),
463
518
(
2000
).
10.
M. A. C.
Maher
,
S. P.
DeWeerth
,
M. A.
Mahowald
, and
C. A.
Mead
, “
Implementing neural architectures using analog VLSI circuits
,”
IEEE Trans. Circuits Syst.
36
(
5
),
643
652
(
1989
).
11.
B. V.
Benjamin
,
P.
Gao
,
E.
McQuinn
,
S.
Choudhary
,
A. R.
Chandrasekaran
,
J.-M.
Bussat
,
R.
Alvarez-Icaza
,
J. V.
Arthur
,
P. A.
Merolla
, and
K.
Boahen
, “
Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations
,”
Proc. IEEE
102
(
5
),
699
716
(
2014
).
12.
J.
Schemmel
,
D.
Brüderle
,
A.
Grübl
,
M.
Hock
,
K.
Meier
, and
S.
Millner
, “
A wafer-scale neuromorphic hardware system for large-scale neural modeling
,” in
ISCAS 2010: 2010 IEEE International Symposium on Circuits and Systems, Nano-Bio Circuit Fabrics and Systems
(
IEEE
,
2010
), pp.
1947
1950
.
13.
P. A.
Merolla
et al, “
A million spiking-neuron integrated circuit with a scalable communication network and interface
,”
Science
345
,
668
6197
(
2014
).
14.
M.
Davies
et al, “
Loihi: A neuromorphic manycore processor with on-chip learning
,”
IEEE Micro
38
(
1
),
82
99
(
2018
).
15.
E.
Painkras
,
L. A.
Plana
,
J.
Garside
,
S.
Temple
,
S.
Davidson
,
J.
Pepper
,
D.
Clark
,
C.
Patterson
, and
S.
Furber
, “
SpiNNaker: A multi-core system-on-chip for massively-parallel neural net simulation
,” in
Proceedings of IEEE Custom Integrated Circuits Conference
(
IEEE
,
2012
).
16.
J.
Park
,
T.
Yu
,
S.
Joshi
,
C.
Maier
, and
G.
Cauwenberghs
, “
Hierarchical address event routing for reconfigurable large-scale neuromorphic systems
,”
IEEE Trans. Neural Networks Learn. Syst.
28
(
10
),
2408
2422
(
2017
).
17.
D.
Rich
,
A.
Bartolo
,
C.
Gilardo
,
B.
Le
,
H.
Li
,
R.
Park
,
R. M.
Radway
,
M. M.
Sabry Aly
,
H.-S. P.
Wong
, and
S.
Mitra
, “
Heterogeneous 3D nano-systems: The N3XT approach?
,” in
NANO-CHIPS 2030
(Springer, Cham,
2020
), pp.
127
151
.
18.
R. G.
Beausoleil
,
P. J.
Kuekes
,
G. S.
Snider
,
S. Y.
Wang
, and
R. S.
Williams
, “
Nanoelectronic and nanophotonic interconnect
,”
Proc. IEEE
96
(
2
),
230
247
(
2008
).
19.
J.
Mumbru
,
G.
Panotopoulos
,
D.
Psaltis
,
X.
An
,
F. H.
Mok
,
S. U.
Ay
,
S. L.
Barna
, and
E. R.
Fossum
, “
Optically programmable gate array
,”
Proc. SPIE
4089
(
24
),
763
771
(
2000
)
20.
E. G.
Paek
and
D.
Psaltis
, “
Optical associative memory using Fourier transform holograms
,”
Opt. Eng.
26
(
5
),
265428
(
1987
).
21.
D.
Psaltis
and
R.
Athale
, “
Optical computing: Past and future
,”
Opt. Photonics News
27
(
6
),
32
39
(
2016
).
22.
D.
Psaltis
and
R. A.
Athale
, “
High accuracy computation with linear analog optical systems: A critical study
,”
Appl. Opt.
25
(
18
),
3071
3077
(
1986
).
23.
D.
Psaltis
,
C. H.
Park
, and
J.
Hong
, “
Higher order associative memories and their optical implementations
,”
Neural Networks
1
(
2
),
149
163
(
1988
).
24.
D.
Brady
,
D.
Psaltis
, and
K.
Wagner
, “
Adaptive optical networks using photorefractive crystals
,”
Appl. Opt.
27
(
9
),
1752
1759
(
1988
).
25.
M. J. R.
Heck
,
M. L.
Davenport
, and
J. E.
Bowers
, “
Progress in hybrid-silicon photonic integrated circuit technology
,”
SPIE
(
2013
).
26.
M.
Smit
,
J.
van der Tol
, and
M.
Hill
, “
Moore’s law in photonics
,”
Laser Photonics Rev.
6
(
1
),
1
13
(
2012
).
27.
L.
Thylén
,
S.
He
,
L.
Wosinski
, and
D.
Dai
, “
The Moore’s law for photonic integrated circuits
,”
J. Zhejiang Univ., Sci., A
7
(
12
),
1961
1967
(
2006
).
28.
G.
Moore
, “
Cramming more components onto integrated circuits, Reprinted from Electronics, volume 38, number 8, April 19, 1965, pp.114 ff.
,”
IEEE Solid-State Circuits Society Newsletter
11
(
3
),
33
35
(
2006
).
29.
W.
Haensch
,
E. J.
Nowak
,
R. H.
Dennard
,
P. M.
Solomon
,
A.
Bryant
,
O. H.
Dokumaci
,
A.
Kumar
,
X.
Wang
,
J. B.
Johnson
, and
M. V.
Fischetti
, “
Silicon CMOS devices beyond scaling
,”
IBM J. Res. Dev.
50
(
4–5
),
339
361
(
2006
).
30.
R. H.
Dennard
,
F. H.
Gaensslen
,
H.-N.
Yu
,
V. L.
Rideout
,
E.
Bassous
, and
A. R.
Leblanc
, “
Design of ion-implanted MOSFET’s with very small physical dimensions
,”
IEEE J. Solid-State Circuits
9
(
5
),
256
268
(
1974
).
31.
G.
Wetzstein
,
A.
Ozcan
,
S.
Gigan
,
S.
Fan
,
D.
Englund
,
M.
Soljačić
,
C.
Denz
,
D. A. B.
Miller
, and
D.
Psaltis
, “
Inference in artificial intelligence with deep optics and photonics
,”
Nature
588
(
7836
),
39
47
(
2020
).
32.
M.
Xu
et al, “
Recent advances on neuromorphic devices based on chalcogenide phase-change materials
,”
Adv. Funct. Mater.
30
(
50
),
2003419
(
2020
).
33.
S.
Song
,
J.
Kim
,
S. M.
Kwon
,
J.-W.
Jo
,
S. K.
Park
, and
Y.-H.
Kim
, “
Recent progress of optoelectronic and all-optical neuromorphic devices: A comprehensive review of device structures, materials, and applications
,”
Adv. Intell. Syst.
3
(
4
),
2000119
(
2020
); accessed November 27, 2021.
34.
A.
Jha
,
C.
Huang
, and
P. R.
Prucnal
, “
Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics
,”
Opt. Lett.
45
(
17
),
4819
4822
(
2020
).
35.
A.
Lugnan
,
A.
Katumba
,
F.
Laporte
,
M.
Freiberger
,
S.
Sackesyn
,
C.
Ma
,
E.
Gooskens
,
J.
Dambre
, and
P.
Bienstman
, “
Photonic neuromorphic information processing and reservoir computing
,”
APL Photonics
5
(
2
),
020901
(
2020
).
36.
D.
Brunner
,
B.
Penkovsky
,
B. A.
Marquez
,
M.
Jacquot
,
I.
Fischer
, and
L.
Larger
, “
Tutorial: Photonic neural networks in delay systems
,”
J. Appl. Phys.
124
(
15
),
152004
(
2018
).
37.
T. F.
De Lima
,
H. T.
Peng
,
A. N.
Tait
,
M. A.
Nahmias
,
H. B.
Miller
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Machine learning with neuromorphic photonics
,”
J. Lightwave Technol.
37
(
5
),
1515
1534
(
2019
).
38.
V.
Bangari
,
B. A.
Marquez
,
A. N.
Tait
,
M. A.
Nahmias
,
T. F.
De Lima
,
H. T.
Peng
,
P. R.
Prucnal
, and
B. J.
Shastri
, “
Neuromorphic photonics for deep learning
,” in
2019 IEEE Photonics Conference (IPC)
(
IEEE
,
2019
).
39.
B. J.
Shastri
,
A. N.
Tait
,
T.
Ferreira de Lima
,
W. H. P.
Pernice
,
H.
Bhaskaran
,
C. D.
Wright
, and
P. R.
Prucnal
, “
Photonics for artificial intelligence and neuromorphic computing
,”
Nat. Photonics
15
(
2
),
102
114
(
2021
).
40.
T.
Ferreira De Lima
,
B. J.
Shastri
,
A. N.
Tait
,
M. A.
Nahmias
, and
P. R.
Prucnal
, “
Progress in neuromorphic photonics
,”
Nanophotonics
6
(
3
),
577
599
(
2017
).
41.
H.-T.
Peng
,
M. A.
Nahmias
,
T. F.
de Lima
,
A. N.
Tait
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Neuromorphic photonic integrated circuits
,”
IEEE J. Sel. Top. Quantum Electron.
24
(
6
),
1
15
(
2018
).
42.
S.
Xiang
et al, “
A review: Photonics devices, architectures, and algorithms for optical neural computing
,”
J. Semicond.
42
(
2
),
023105
(
2021
).
43.
P. R.
Prucnal
,
B. J.
Shastri
, and
M. C.
Teich
,
Neuromorphic Photonics
(
CRC Press
,
2017
), pp.
1
412
.
44.
G.
Mourgias-Alexandris
,
A.
Totovic
,
N.
Passalis
,
G.
Dabos
,
A.
Tefas
, and
N.
Pleros
, “
Neuromorphic computing through photonic integrated circuits
,”
Proc. SPIE
11284
,
1128403
(
2020
).
45.
S. R. Y.
Cajal
,
Comparative Study of the Sensory Areas of the Human Cortex
(
Clark University
,
1899
).
46.
D. A. B.
Miller
, “
Attojoule optoelectronics for low-energy information processing and communications
,”
J. Lightwave Technol.
35
(
3
),
346
396
(
2017
).
47.
S. B.
Laughlin
and
T. J.
Sejnowski
, “
Communication in neuronal networks
,”
Science
301
(
5641
),
1870
1874
(
2003
).
48.
M.
Tran
,
D.
Huang
,
T.
Komljenovic
,
J.
Peters
,
A.
Malik
, and
J.
Bowers
, “
Ultra-low-loss silicon waveguides for heterogeneously integrated silicon/III-V photonics
,”
Appl. Sci.
8
(
7
),
1139
(
2018
).
49.
M.
Davies
, “
Benchmarks for progress in neuromorphic computing
,”
Nat. Mach. Intell.
1
(
9
),
386
388
(
2019
).
50.
P. R.
Prucnal
,
B. J.
Shastri
,
T.
Ferreira de Lima
,
M. A.
Nahmias
, and
A. N.
Tait
, “
Recent progress in semiconductor excitable lasers for photonic spike processing
,”
Adv. Opt. Photonics
8
(
2
),
228
(
2016
).
51.
J. B.
Aimone
,
Y.
Ho
,
O.
Parekh
,
C. A.
Phillips
,
A.
Pinar
,
W.
Severa
, and
Y.
Wang
, “
Provable advantages for graph algorithms in spiking neural networks
,” in
Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures
(
2021
).
52.
C.-N.
Chou
,
K.-M.
Chung
, and
C.-J.
Lu
, “
On the algorithmic power of spiking neural networks
,” in
Proceedings of the 10th Innovations in Theoretical Computer Science Conference
(
2019
), pp.
26:1
26.20
.
53.
S. J.
Verzi
,
F.
Rothganger
,
O. D.
Parekh
,
T.-T.
Quach
,
N. E.
Miner
,
C. M.
Vineyard
,
C. D.
James
, and
J. B.
Aimone
, “
Computing with spikes: The advantage of fine-grained timing
,”
Neural Comput.
30
(
10
),
2660
2690
(
2018
).
54.
X.
Li
,
N.
Youngblood
,
W.
Zhou
,
J.
Feldmann
,
J.
Swett
,
S.
Aggarwal
,
A.
Sebastian
,
C. D.
Wright
,
W.
Pernice
, and
H.
Bhaskaran
, “
On-chip phase change optical matrix multiplication core
,” in
Technical Digest–2020 IEEE International Electron Devices Meeting (IEDM)
(
IEEE
,
2020
), pp.
7.5.1
7.5.4
.
55.
C.
Ríos
,
N.
Youngblood
,
Z.
Cheng
,
M.
Le Gallo
,
W. H. P.
Pernice
,
C. D.
Wright
,
A.
Sebastian
, and
H.
Bhaskaran
, “
In-memory computing on a photonic platform
,”
Sci. Adv.
5
(
2
),
eaau5759
(
2019
).
56.
M. A.
Nahmias
,
T. F.
De Lima
,
A. N.
Tait
,
H.-T.
Peng
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Photonic multiply-accumulate operations for neural networks
,”
IEEE J. Sel. Top. Quantum Electron.
26
(
1
),
7701518
(
2020
).
57.
X.
Xu
,
M.
Tan
,
B.
Corcoran
,
J.
Wu
,
A.
Boes
,
T. G.
Nguyen
,
S. T.
Chu
,
B. E.
Little
,
D. G.
Hicks
,
R.
Morandotti
,
A.
Mitchell
, and
D. J.
Moss
, “
11 TOPS photonic convolutional accelerator for optical neural networks
,”
Nature
589
(
7840
),
44
51
(
2021
).
58.
M.
Miscuglio
and
V. J.
Sorger
, “
Photonic tensor cores for machine learning
,”
Appl. Phys. Rev.
7
(
3
),
031404
(
2020
).
59.
J.
Feldmann
,
N.
Youngblood
,
M.
Karpov
,
H.
Gehring
,
X.
Li
,
M.
Stappers
,
M.
Le Gallo
,
X.
Fu
,
A.
Lukashchuk
,
A. S.
Raja
,
J.
Liu
,
C. D.
Wright
,
A.
Sebastian
,
T. J.
Kippenberg
,
W. H. P.
Pernice
, and
H.
Bhaskaran
, “
Parallel convolutional processing using an integrated photonic tensor core
,”
Nature
589
(
7840
),
52
58
(
2021
).
60.
J.
Cheng
,
H.
Zhou
,
J.
Dong
,
C.
Alberto Alonso-Ramos
, and
L.
Zhou
, “
Photonic matrix computing: From fundamentals to applications
,”
Nanomaterials
11
(
7
),
1683
(
2021
).
61.
X.
Lin
,
Y.
Rivenson
,
N. T.
Yardimci
,
M.
Veli
,
Y.
Luo
,
M.
Jarrahi
, and
A.
Ozcan
, “
All-optical machine learning using diffractive deep neural networks
,”
Science
361
(
6406
),
1004
1008
(
2018
).
62.
A. N.
Tait
,
T. F.
de Lima
,
E.
Zhou
,
A. X.
Wu
,
M. A.
Nahmias
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Neuromorphic photonic networks using silicon photonic weight banks
,”
Sci. Rep.
7
(
1
),
1
10
(
2017
).
63.
X.
Xu
,
M.
Tan
,
B.
Corcoran
,
J.
Wu
,
T. G.
Nguyen
,
A.
Boes
,
S. T.
Chu
,
B. E.
Little
,
R.
Morandotti
,
A.
Mitchell
,
D. G.
Hicks
, and
D. J.
Moss
, “
Photonic perceptron based on a Kerr microcomb for high-speed, scalable, optical neural networks
,”
Laser Photonics Rev.
14
(
10
),
2000070
(
2020
).
64.
V.
Bangari
,
B. A.
Marquez
,
H.
Miller
,
A. N.
Tait
,
M. A.
Nahmias
,
T. F.
De Lima
,
H.-T.
Peng
,
P. R.
Prucnal
, and
B. J.
Shastri
, “
Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs)
,”
IEEE J. Sel. Top. Quantum Electron.
26
(
1
),
7701213
(
2020
).
65.
L.
De Marinis
,
M.
Cococcioni
,
P.
Castoldi
, and
N.
Andriolli
, “
Photonic neural networks: A survey
,”
IEEE Access
7
,
175827
175841
(
2019
).
66.
D.
Psaltis
and
N.
Farhat
, “
Optical information processing based on an associative-memory model of neural nets with thresholding and feedback
,”
Opt. Lett.
10
(
2
),
98
100
(
1985
).
67.
X.
Lin
,
Y.
Rivenson
,
N. T.
Yardimci
,
M.
Veli
,
Y.
Luo
,
M.
Jarrahi
, and
A.
Ozcan
, “
All-optical machine learning using diffractive deep neural networks
,”
Science
361
(
6406
),
1004
1008
(
2018
).
68.
D. A. B.
Miller
, “
Setting up meshes of interferometers-reversed local light interference method
,”
Opt. Express
25
(
23
),
29233
29248
(
2017
).
69.
D. A. B.
Miller
, “
Self-configuring universal linear optical component [invited]
,”
Photonics Res.
1
(
1
),
1
15
(
2013
).
70.
Y.
Paquot
,
J.
Dambre
,
B.
Schrauwen
,
M.
Haelterman
, and
S.
Massar
, “
Reservoir computing: A photonic neural network for information processing
,”
Nonlinear Opt. Appl. IV
7728
(
4
),
77280B
(
2010
).
71.
F.
Duport
,
B.
Schneider
,
A.
Smerieri
,
M.
Haelterman
,
S.
Massar
,
K.
Vandoorne
,
W.
Dierckx
,
B.
Schrauwen
,
D.
Verstraeten
,
R.
Baets
,
P.
Bienstman
, and
J.
Van Campenhout
, “
All-optical reservoir computing
,”
Opt. Express
20
(
20
),
22783
22795
(
2012
).
72.
M.
Rafayelyan
,
J.
Dong
,
Y.
Tan
,
F.
Krzakala
, and
S.
Gigan
, “
Large-scale optical reservoir computing for spatiotemporal chaotic systems prediction
,”
Phys. Rev. X
10
(
4
),
041037
(
2020
).
73.
T.
Tsurugaya
,
T.
Hiraki
,
M.
Nakajima
, and
T.
Aihara
, “
Reservoir computing with low-power-consumption all- optical nonlinear activation using membrane SOA on Si
,”
In Conference on Lasers and Electro-Optics (CLEO)
2021
,
1
2
(
2021
).
74.
D.
Brunner
,
M. C.
Soriano
,
C. R.
Mirasso
, and
I.
Fischer
, “
Parallel photonic information processing at gigabyte per second data rates using transient states
,”
Nat. Commun.
4
(
1
),
1
7
(
2013
).
75.
U.
Teğin
,
M.
Yıldırım
,
İ.
Oğuz
,
C.
Moser
, and
D.
Psaltis
, “
Scalable optical learning operator
,”
Nat. Comput. Sci.
1
(
8
),
542
549
(
2021
).
76.
A.
Mehrabian
,
Y.
Al-Kabani
,
V. J.
Sorger
, and
T.
El-Ghazawi
, “
PCNNA: A photonic convolutional neural network accelerator
,” in
2018 31st IEEE International System-on-Chip Conference (SOCC)
(
IEEE
,
2018
), pp.
169
173
.
77.
C.
Wu
,
H.
Yu
,
S.
Lee
,
R.
Peng
,
I.
Takeuchi
, and
M.
Li
, “
Programmable phase-change metasurfaces on waveguides for multimode photonic convolutional neural network
,”
Nat. Commun.
12
(
1
),
1
8
(
2021
).
78.
G.
Tanaka
,
T.
Yamane
,
J. B.
Héroux
,
R.
Nakane
,
N.
Kanazawa
,
S.
Takeda
,
H.
Numata
,
D.
Nakano
, and
A.
Hirose
, “
Recent advances in physical reservoir computing: A review
,”
Neural Networks
115
,
100
123
(
2019
).
79.
D.
Verstraeten
and
B.
Schrauwen
, “
On the quantification of dynamics in reservoir computing
,”
Lect. Notes Comput. Sci.
5768
,
985
994
(
2009
).
80.
B.
Schrauwen
,
J.
Defour
,
D.
Verstraeten
, and
J.
Van Campenhout
, “
An overview of reservoir computing: Theory, applications and implementations
,” in
Proceedings of the 15th European Symposium on Artificial Neural Networks
(
2007
), pp.
471
482
; accessed September 15, 2021; available online: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.155.2814.
81.
F.
Duport
,
A.
Smerieri
,
A.
Akrout
,
M.
Haelterman
, and
S.
Massar
, “
Fully analogue photonic reservoir computer
,”
Sci. Rep.
6
,
1
12
(
2016
).
82.
K.
Vandoorne
,
P.
Mechet
,
T.
Van Vaerenbergh
,
M.
Fiers
,
G.
Morthier
,
D.
Verstraeten
,
B.
Schrauwen
,
J.
Dambre
, and
P.
Bienstman
, “
Experimental demonstration of reservoir computing on a silicon photonics chip
,”
Nat. Commun.
5
(
1
),
3541
(
2014
).
83.
Y.
Chen
, “
Mechanisms of winner-take-all and group selection in neuronal spiking networks
,”
Front. Comput. Neurosci.
11
,
20
(
2017
).
84.
Y.
Zhang
,
S.
Xiang
,
X.
Guo
,
A.
Wen
, and
Y.
Hao
, “
The winner-take-all mechanism for all-optical systems of pattern recognition and max-pooling operation
,”
J. Lightwave Technol.
38
(
18
),
5071
5077
(
2020
).
85.
J.
Nagi
,
F.
Ducatelle
,
G. A.
Di Caro
,
D.
Cireşan
,
U.
Meier
,
A.
Giusti
,
F.
Nagi
,
J.
Schmidhuber
, and
L. M.
Gambardella
, “
Max-pooling convolutional neural networks for vision-based hand gesture recognition
,” in
2011 IEEE International Conference on Signal and Image Processing Applications (ICSIPA 2011)
(
IEEE
,
2011
), pp.
342
347
.
86.
J. J.
Hopfield
, “
Neural networks and physical systems with emergent collective computational abilities
,”
Proc. Natl. Acad. Sci. U. S. A.
79
(
8
),
2554
2558
(
1982
).
87.
B. A.
Marquez
,
Z.
Guo
,
H.
Morison
,
S.
Shekhar
,
L.
Chrostowski
,
P.
Prucnal
, and
B. J.
Shastri
, “
Photonic pattern reconstruction enabled by on-chip online learning and inference
,”
J. Phys.: Photonics
3
(
2
),
024006
(
2021
); accessed September 13, 2021.
88.
X.
Hu
,
J.
Huang
,
W.
Zhang
,
M.
Li
,
C.
Tao
, and
G.
Li
, “
Photonic ionic liquids polymer for naked-eye detection of anions
,”
Adv. Mater.
20
(
21
),
4074
4078
(
2008
).
89.
A.
Ferrari
,
A.
Pikhtin
,
A.
Jascow
,
L.
Schirone
,
M.
Bertolotti
,
N.
Nazorova
, and
V.
Bogdanov
, “
Temperature dependence of the refractive index in semiconductors
,”
J. Opt. Soc. Am. B
7
(
6
),
918
922
(
1990
).
90.
J.
Matsuoka
,
N.
Kitamura
,
S.
Fujinaga
,
T.
Kitaoka
, and
H.
Yamashita
, “
Temperature dependence of refractive index of SiO2 glass
,”
J. Non- Cryst. Solids
135
(
1
),
86
89
(
1991
).
91.
W.
Wang
,
Y.
Yu
,
Y.
Geng
, and
X.
Li
, “
Measurements of thermo-optic coefficient of standard single mode fiber in large temperature range
,”
Proc. SPIE
9620
(
10
),
96200Y
(
2015
).
92.
G.
Ghosh
,
Thermo-optic coefficients, Handbook of Optical Constants of Solids: Handbook of Thermo-Optic Coefficients of Optical Materials with Applications
(
Academic Press
,
1998
), pp.
115
261
.
93.
S.
De
,
R.
Das
,
R. K.
Varshney
, and
T.
Schneider
, “
Design and simulation of thermo-optic phase shifters with low thermal crosstalk for dense photonic integration
,”
IEEE Access
8
,
141632
141640
(
2020
).
94.
K.
Liu
,
C. R.
Ye
,
S.
Khan
, and
V. J.
Sorger
, “
Review and perspective on ultrafast wavelength-size electro-optic modulators
,”
Laser Photonics Rev.
9
(
2
),
172
194
(
2015
).
95.
Y.
Van De Burgt
,
E.
Lubberman
,
E. J.
Fuller
,
S. T.
Keene
,
G. C.
Faria
,
S.
Agarwal
,
M. J.
Marinella
,
A.
Alec Talin
, and
A.
Salleo
, “
A non-volatile organic electrochemical device as a low-voltage artificial synapse for neuromorphic computing
,”
Nat. Mater.
16
,
414
418
(
2017
).
96.
A.
Melianas
,
T. J.
Quill
,
G.
LeCroy
,
Y.
Tuchman
,
H. V.
Loo
,
S. T.
Keene
,
A.
Giovannitti
,
H. R.
Lee
,
I. P.
Maria
,
I.
McCulloch
, and
A.
Salleo
, “
Temperature-resilient solid-state organic artificial synapses for neuromorphic computing
,”
Sci. Adv.
6
(
27
),
eabb2958
(
2020
).
97.
N.
Dhingra
,
J.
Song
,
G. J.
Saxena
,
E. K.
Sharma
, and
B. M. A.
Rahman
, “
Design of a compact low-loss phase shifter based on optical phase change material
,”
IEEE Photonics Technol. Lett.
31
(
21
),
1757
1760
(
2019
).
98.
K.
Okamoto
,
Fundamentals of Optical Waveguides
(
Elsevier
,
2006
).
99.
A. N.
Tait
,
M. A.
Nahmias
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Broadcast-and-weight interconnects for integrated distributed processing systems
,” in
2014 Optical Interconnects Conference
(
IEEE
,
2014
), pp.
108
109
.
100.
W.
Bogaerts
,
P.
de Heyn
,
T.
van Vaerenbergh
,
K.
de Vos
,
S.
Kumar Selvaraja
,
T.
Claes
,
P.
Dumon
,
P.
Bienstman
,
D.
van Thourhout
, and
R.
Baets
, “
Silicon microring resonators
,”
Laser Photonics Rev.
6
(
1
),
47
73
(
2012
).
101.
B. E.
Little
,
S. T.
Chu
,
H. A.
Haus
,
J.
Foresi
, and
J.-P.
Laine
, “
Microring resonator channel dropping filters
,”
J. Lightwave Technol.
15
(
6
),
998
1005
(
1997
).
102. B. J. Metcalf, I. A. Walmsley, P. C. Humphreys, W. S. Kolthammer, and W. R. Clements, “Optimal design for universal multiport interferometers,” Optica 3(12), 1460–1465 (2016).
103. F. Shokraneh, O. Liboiron-Ladouceur, and S. Geoffroy-Gagnon, “The diamond mesh, a phase-error- and loss-tolerant field-programmable MZI-based optical processor for optical neural networks,” Opt. Express 28(16), 23495–23508 (2020).
104. J. Gu, Z. Zhao, C. Feng, Z. Ying, M. Liu, R. T. Chen, and D. Z. Pan, “Toward hardware-efficient optical neural networks: Beyond FFT architecture via joint learnability,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 40(9), 1796–1809 (2021).
105. A. N. Tait, B. J. Shastri, M. A. Nahmias, and P. R. Prucnal, “Broadcast and weight: An integrated network for scalable photonic spike processing,” J. Lightwave Technol. 32(21), 4029–4041 (2014); accessed September 08, 2021; available online: https://www.osapublishing.org/abstract.cfm?uri=jlt-32-21-3427.
106. W. Gerstner and W. M. Kistler, Spiking Neuron Models: Single Neurons, Populations, Plasticity (Cambridge University Press, 2002).
107. S. Ostojic and N. Brunel, “From spiking neuron models to linear-nonlinear models,” PLoS Comput. Biol. 7(1), e1001056 (2011).
108. A. Tefas, A. Tsakyridis, G. Mourgias-Alexandris, K. Vyrsokinos, N. Passalis, and N. Pleros, “An all-optical neuron with sigmoid activation function,” Opt. Express 27(7), 9620–9630 (2019).
109. D. Rosenbluth, K. Kravtsov, M. P. Fok, and P. R. Prucnal, “Ultrafast all-optical implementation of a leaky integrate-and-fire neuron,” Opt. Express 19(3), 2133–2147 (2011).
110. A. Wen, S. Xiang, X. Guo, Y. Hao, and Y. Zhang, “All-optical inhibitory dynamics in photonic neuron based on polarization mode competition in a VCSEL with an embedded saturable absorber,” Opt. Lett. 44(7), 1548–1551 (2019).
111. Y. Zhang, S. Xiang, X. Cao, S. Zhao, X. Guo, A. Wen, and Y. Hao, “Experimental demonstration of pyramidal neuron-like dynamics dominated by dendritic action potentials based on a VCSEL for all-optical XOR classification task,” Photonics Res. 9(6), 1055 (2021).
112. M. A. Nahmias, B. J. Shastri, A. N. Tait, and P. R. Prucnal, “A leaky integrate-and-fire laser neuron for ultrafast cognitive computing,” IEEE J. Sel. Top. Quantum Electron. 19(5), 1800212 (2013).
113. Y. Shen, S. Skirlo, N. C. Harris, D. Englund, and M. Soljacic, “On-chip optical neuromorphic computing,” in 2016 Conference on Lasers and Electro-Optics (Optica Publishing Group, 2016), p. FW5D.3.
114. A. Smerieri, A. Dejonckheere, F. Duport, J.-L. Oudar, L. Fang, M. Haelterman, and S. Massar, “All-optical reservoir computer based on saturation of absorption,” Opt. Express 22(9), 10868–10881 (2014).
115. A. C. Selden, “Pulse transmission through a saturable absorber,” Br. J. Appl. Phys. 18(6), 743 (1967).
116. F. Selmi, R. Braive, G. Beaudoin, I. Sagnes, R. Kuszelewicz, and S. Barbay, “Relative refractory period in an excitable semiconductor laser,” Phys. Rev. Lett. 112(18), 183902 (2014).
117. J. Xiang, A. Torchy, X. Guo, and Y. Su, “All-optical spiking neuron based on passive micro-resonator,” J. Lightwave Technol. 38(15), 4019–4029 (2020).
118. W. Stutius and W. Streifer, “Silicon nitride films on silicon for optical waveguides,” Appl. Opt. 16(12), 3218 (1977).
119. K. Nozaki, S. Matsuo, T. Fujii, K. Takeda, A. Shinya, E. Kuramochi, and M. Notomi, “Femtofarad optoelectronic integration demonstrating energy-saving signal conversion and nonlinear functions,” Nat. Photonics 13(7), 454–459 (2019).
120. Y.-J. Lee, M. B. On, X. Xiao, and S. J. Ben Yoo, “Energy-efficient photonic spiking neural network on a monolithic silicon CMOS photonic platform,” in Optical Fiber Communication Conference (OFC) 2021 (Optical Society of America, 2021), p. Tu5H.5; available online: http://www.osapublishing.org/abstract.cfm?URI=OFC-2021-Tu5H.5.
121. E. Farquhar and P. Hasler, “A bio-physically inspired silicon neuron,” IEEE Trans. Circuits Syst. I: Regul. Pap. 52(3), 477–488 (2005).
122. M. Miscuglio, G. C. Adam, D. Kuzum, and V. J. Sorger, “Roadmap on material-function mapping for photonic-electronic hybrid neural networks,” APL Mater. 7(10), 100903 (2019).
123. A. Shrestha and A. Mahmood, “Review of deep learning algorithms and architectures,” IEEE Access 7, 53040–53065 (2019).
124. S. Pai, I. A. D. Williamson, T. W. Hughes, M. Minkov, O. Solgaard, S. Fan, and D. A. B. Miller, “Parallel programming of an arbitrary feedforward photonic network,” IEEE J. Sel. Top. Quantum Electron. 26(5), 6100813 (2020).
125. G.-q. Bi and M.-m. Poo, “Synaptic modification by correlated activity: Hebb’s postulate revisited,” Annu. Rev. Neurosci. 24, 139–166 (2001).
126. W. Gerstner and W. M. Kistler, “Mathematical formulations of Hebbian learning,” Biol. Cybern. 87(5–6), 404–415 (2002).
127. W. Gerstner, M. Lehmann, V. Liakoni, D. Corneil, and J. Brea, “Eligibility traces and plasticity on behavioral time scales: Experimental support of NeoHebbian three-factor learning rules,” Front. Neural Circuits 12, 53 (2018).
128. R. Toole, A. N. Tait, T. F. de Lima, M. A. Nahmias, B. J. Shastri, P. R. Prucnal, and M. P. Fok, “Photonic implementation of spike-timing-dependent plasticity and learning algorithms of biological neural systems,” J. Lightwave Technol. 34(2), 470–476 (2016).
129. Q. Li, Z. Wang, Y. Le, C. Sun, X. Song, and C. Wu, “Optical implementation of neural learning algorithms based on cross-gain modulation in a semiconductor optical amplifier,” Proc. SPIE 10019, 100190E (2016).
130. Z. Cheng, C. Ríos, W. H. P. Pernice, C. D. Wright, and H. Bhaskaran, “On-chip photonic synapse,” Sci. Adv. 3(9), e1700160 (2017).
131. B. Gholipour, P. Bastock, C. Craig, K. Khan, D. Hewak, and C. Soci, “Amorphous metal-sulphide microfibers enable photonic synapses for brain-like computing,” Adv. Opt. Mater. 3(5), 635–641 (2015).
132. S. Xiang, Z. Ren, Y. Zhang, Z. Song, X. Guo, G. Han, and Y. Hao, “Training a multi-layer photonic spiking neural network with modified supervised learning algorithm based on photonic STDP,” IEEE J. Sel. Top. Quantum Electron. 27(2), 7500109 (2021).
133. S. Xiang, Y. Zhang, J. Gong, X. Guo, L. Lin, and Y. Hao, “STDP-based unsupervised spike pattern learning in a photonic spiking neural network with VCSELs and VCSOAs,” IEEE J. Sel. Top. Quantum Electron. 25(6), 1–9 (2019).
134. S. Xiang, J. Gong, Y. Zhang, X. Guo, Y. Han, A. Wen, and Y. Hao, “Numerical implementation of wavelength-dependent photonic spike timing dependent plasticity based on VCSOA,” IEEE J. Quantum Electron. 54(6), 8100107 (2018).
135. G. Amato, F. Carrara, F. Falchi, C. Gennaro, and G. Lagani, “Hebbian learning meets deep convolutional neural networks,” Lect. Notes Comput. Sci. 11751, 324–334 (2019).
136. J. Wang, A. Belatreche, L. Maguire, and M. McGinnity, “Online versus offline learning for spiking neural networks: A review and new strategies,” in Proceedings of the 9th IEEE International Conference on Cybernetic Intelligent Systems (CIS 2010) (IEEE, 2010).
137. I. V. Oseledets, “Tensor-train decomposition,” SIAM J. Sci. Comput. 33(5), 2295–2317 (2011).
138. Y. Zhang, Y.-C. Ling, Y. Zhang, K. Shang, and S. J. B. Yoo, “High-density wafer-scale 3-D silicon-photonic integrated circuits,” IEEE J. Sel. Top. Quantum Electron. 24(6), 1–10 (2018).
139. Y. Zhang, A. Samanta, K. Shang, and S. J. B. Yoo, “Scalable 3D silicon photonic electronic integrated circuits and their applications,” IEEE J. Sel. Top. Quantum Electron. 26(2), 8201510 (2020).
140. L. J. Ba and R. Caruana, “Do deep nets really need to be deep?,” Adv. Neural Inf. Process. Syst. 3, 2654–2662 (2013); accessed September 05, 2021; available online: https://arxiv.org/abs/1312.6184v7.
141. X. Xiao and S. J. Ben Yoo, “Scalable and compact 3D tensorized photonic neural networks,” in 2021 Optical Fiber Communications Conference and Exhibition, OFC 2021 (IEEE, 2021), p. 3; available online: https://ieeexplore.ieee.org/document/9489610.
142. M. B. On, Y.-J. Lee, X. Xiao, R. Proietti, and S. J. B. Yoo, “Analysis of the hardware imprecisions for scalable and compact photonic tensorized neural networks,” in 2021 European Conference on Optical Communication (ECOC) (IEEE, 2021), pp. 1–4.
143. X. Xiao, M. B. On, T. Van Vaerenbergh, D. Liang, R. G. Beausoleil, and S. J. B. Yoo, “Large-scale and energy-efficient tensorized optical neural networks on III–V-on-silicon MOSCAP platform,” APL Photonics 6(12), 126107 (2021).
144. Y. Zhang, C. Qin, K. Shang, G. Liu, G. Liu, and S. J. B. Yoo, “Sub-wavelength spacing optical phase array nanoantenna emitter with vertical silicon photonic vias,” in Optical Fiber Communication Conference and Exhibition (OFC) (IEEE, 2018), Vol. Part F84-OFC 2018, pp. 1–3.
145. K. Shang, S. Pathak, G. Liu, S. Feng, S. Li, W. Lai, and S. J. B. Yoo, “Silicon nitride tri-layer vertical Y-junction and 3D couplers with arbitrary splitting ratio for photonic integrated circuits,” Opt. Express 25(9), 10474 (2017).
146. K. Shang et al., “Low-loss compact multilayer silicon nitride platform for 3D photonic integrated circuits,” Opt. Express 23(16), 21334–21342 (2015).
147. J. L. Henning, “SPEC CPU2000: Measuring CPU performance in the new millennium,” Computer 33(7), 28–35 (2000).
148. P. Mattson, H. Tang, G. Y. Wei, C. J. Wu, V. J. Reddi, C. Cheng, C. Coleman, G. Diamos, D. Kanter, P. Micikevicius, D. Patterson, and G. Schmuelling, “MLPerf: An industry standard benchmark suite for machine learning performance,” IEEE Micro 40(2), 8–16 (2020).
149. A. R. Totovic, G. Dabos, N. Passalis, A. Tefas, and N. Pleros, “Femtojoule per MAC neuromorphic photonics: An energy and technology roadmap,” IEEE J. Sel. Top. Quantum Electron. 26(5), 1–15 (2020).
150. C. Cole, “Optical and electrical programmable computing energy use comparison,” Opt. Express 29(9), 13153–13170 (2021).
151. C. Eliasmith, How to Build a Brain (Oxford University Press, 2014).
152. T. Stewart, F.-X. Choo, and C. Eliasmith, “Spaun: A perception-cognition-action model using spiking neurons,” Proc. Annu. Meet. Cognit. Sci. Soc. 34(34), 34 (2012).
153. M. Davies, A. Wild, G. Orchard, Y. Sandamirskaya, G. A. F. Guerra, P. Joshi, P. Plank, and S. R. Risbud, “Advancing neuromorphic computing with Loihi: A survey of results and outlook,” Proc. IEEE 109(5), 911–934 (2021).
154. M. A. Nahmias, A. N. Tait, B. J. Shastri, and P. R. Prucnal, “An evanescent hybrid silicon laser neuron,” in 2013 IEEE Photonics Conference (IEEE, 2013), pp. 93–94.
155. G. Mourgias-Alexandris, G. Dabos, N. Passalis, A. Totovic, A. Tefas, and N. Pleros, “All-optical WDM recurrent neural networks with gating,” IEEE J. Sel. Top. Quantum Electron. 26(5), 6100907 (2020).
156. G. Mourgias-Alexandris, A. Tsakyridis, T. Alexoudi, K. Vyrsokinos, and N. Pleros, “Optical thresholding device with a sigmoidal transfer function,” in Proceedings of the 2018 Photonics in Switching and Computing (PSC 2018), September 2018.
157. T. Alexoudi, D. Fitsios, A. Bazin, P. Monnier, R. Raj, A. Miliou, G. T. Kanellos, N. Pleros, and F. Raineri, “III-V-on-Si photonic crystal nanocavity laser technology for optical static random access memories,” IEEE J. Sel. Top. Quantum Electron. 22(6), 295–304 (2016).
158. I. A. D. Williamson, T. W. Hughes, M. Minkov, B. Bartlett, S. Pai, and S. Fan, “Reprogrammable electro-optic nonlinear activation functions for optical neural networks,” IEEE J. Sel. Top. Quantum Electron. 26(1), 1–12 (2020).
159. T. W. Hughes, M. Minkov, Y. Shi, and S. Fan, “Training of photonic neural networks through in situ backpropagation and gradient measurement,” Optica 5(7), 864–871 (2018).