The computing industry is rapidly moving from a programming to a learning era, with the long reign of the von Neumann architecture starting to fade. New non-von Neumann computing paradigms have begun driving the development of emerging artificial neural network (ANN)-based analog electronic artificial intelligence (AI) chipsets with remarkable energy efficiency. However, the size and energy advantages of electronic processing elements are naturally counteracted by the speed and power limits of the electronic interconnects inside the circuits, caused by resistor-capacitor (RC) parasitic effects. Neuromorphic photonics has come forward as a new research field that aims to transfer the well-known high-bandwidth and low-energy interconnect credentials of photonic circuitry to the area of neuromorphic platforms. The high potential of neuromorphic photonics and its well-established promise of fJ/MAC (Multiply-ACcumulate) energy efficiencies at orders-of-magnitude higher neuron densities require a number of breakthroughs along the entire technology stack, the foremost being the selection of best-in-class photonic material platforms for weighting and activation functions and their transformation into co-integrated photonic computational engines. In this paper, we analyze the current status of neuromorphic computing and of available photonic integrated technologies and propose a novel three-dimensional computational unit which, with its compactness, ultrahigh efficiency, and lossless interconnectivity, is foreseen to enable scalable AI chipsets that outperform electronics in computational speed and energy efficiency, shaping the future of neuromorphic computing.

Every day, we create nearly half as much information as is contained in the human genomes of everyone on this planet, i.e., about 2.5 Exabytes of data, and every year we generate more than we have created in all of the past. The total annual global IP traffic is estimated to reach 4.8 Zettabytes per year by 2022.1 This exponential increase in data generation is leading to new paradigms in data processing, analytics, exploration, and utilization, and is now heavily supported by artificial intelligence. While conventional computing technologies and architectures have been successful for decades, they are now limiting advances and sustainability in a compute-hungry world.

Device integration in microprocessor chips has made steady progress at the pace of Moore's law: the number of transistors per chip has doubled every two years on average over the last four decades. But Moore's law is coming to an end. The clock frequency of processors leveled off after 2006, when Dennard's scaling law appeared to break down: while Dennard anticipated that voltage and current would scale proportionally with feature size, or equivalently that power would scale proportionally with area, leading to a doubling of performance per watt about every two years, at small feature sizes current leakage heats the chip up and prevents any further increase of the clock frequency. Multiple processors then helped sustain steady performance gains through parallel computation, but at the expense of energy consumption. Very soon, however, this new technological trend faced another issue: the throughput limitation governed by Amdahl's law. In a multi-core setting, only a fraction of the integrated circuit can be active at any given time without violating power constraints, which means that no major effective gains in performance have been reached since then (see Fig. 1), creating the so-called "computational gap" between the amount of data generated and the resources actually available to compute them.2 This urges us to reinvent our core technologies, from the fundamental level through to the architectural level, to finally deliver an abundance of computational power.
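Amdahl's throughput ceiling is easy to make concrete: if only a fraction p of a workload can be parallelized, the speedup on n cores is bounded by 1/((1 − p) + p/n). A minimal sketch (the 95% parallel fraction is purely illustrative):

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Maximum speedup of a workload in which only a fraction is parallelizable."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

# Even a 95%-parallel workload saturates quickly: extra cores stop paying off.
for cores in (2, 16, 256, 4096):
    print(cores, round(amdahl_speedup(0.95, cores), 1))
```

Even with thousands of cores, the speedup saturates at 1/(1 − p) = 20× in this example, which is why adding cores alone cannot close the computational gap.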

FIG. 1.

Highlight of the increased demand for compute power, which has not kept up with the trend of data generated since 2006. This "computing gap" is not being solved even if Moore's law (number of transistors per area) follows its trend, which calls for disruptive new technologies. Solid circles are data, while the dashed lines are fits (on a logarithmic scale). Reproduced with permission from Kendall and Kumar, Appl. Phys. Rev. 7, 011305 (2020). Copyright 2020 AIP Publishing LLC.


There is strong motivation in the microelectronics community to explore novel technology frontiers in the "post-Moore era."3 The International Technology Roadmap for Semiconductors (ITRS) has already emphasized that the trend toward increased performance via "miniaturization" will continue (the More Moore approach), eventually trading performance against power, sustained by the incorporation of new materials and new transistor concepts. At the same time, a functional diversification of semiconductor-based devices will be pursued, adding non-digital functionalities (the More than Moore approach) to facilitate interaction with the external world and to power the overall system, complementing the digital processing and storage functionalities. This approach will contribute further to the miniaturization of electronic systems, even if not at the same rate as before. The co-integration of different technologies to achieve the desired performance gain sees, among other technologies, integrated photonics playing a major role within this roadmap.4 Alternative solutions are also sought to extend the CMOS roadmap, where nanotechnology is exploited to replace conventional planar MOSFETs5 and supplement 3D stacked nanophotonic deployments.6 In this case, we talk about the Beyond Moore strategy, which also includes purpose-built processors that accelerate application-specific tasks (such as Google's Tensor Processing Unit, Graphics Processing Units, FPGAs, ASICs, etc.).7 These accelerators rely on a very large number of smaller processors across which workloads can be broken down and parallelized to run computationally demanding applications.
However, in microelectronics, the achievable data throughput is ultimately limited by the electrical interconnections themselves, as the skin effect, dielectric loss, and wiring density exacerbate power dissipation.8 This, in turn, forces most of the hardware to stay in idle mode, waiting for data to be fetched from memory, in order to minimize the memory–processor communication energy costs. There is no doubt that any revolutionary chip technology that supplants conventional silicon electronics will have to be interconnect- and memory-centric.9 In this paper, we first review electronic computing hardware and its state-of-the-art, highlighting major achievements and open issues at the same time (Sec. II). We motivate the need to move to neuromorphic photonic structures (Sec. III) according to the different approaches exploited. In Sec. IV, we propose a novel 3D basic element, realized by stacking three different functional layers (synaptic, routing, and nonlinear functions), and present how these 3D elements are predicted to generate a powerful computational unit when interconnected according to the most recent mesh-based architectures (Sec. V). In that section, based on computational metric maps, we identify where the optimal performance lies and suggest the best-in-class technology suited to the concept of 3D-based neuromorphic photonics.

With the continuous advances in microelectronics, supercomputers—the core of information processing—are now able to execute hundreds of petaFLOPS (floating-point operations per second),10 but at the impractical cost of tens of megawatts. Our brain, in comparison, is able to perform about the same order of operations while consuming only 20 W.11 Designing future hardware circuitry inspired by brain connectivity promises a real opportunity to overcome the limitations of conventional electronics. New computing paradigms of non-von Neumann architectures have, therefore, begun to unfold, attracting renewed interest during the last decade and leading to the development of a plethora of machines based on novel computing architectures,2 such as neuromorphic or other biologically inspired architectures. Figure 2 shows the different architectural approaches, highlighting also the major differences between them.

FIG. 2.

Computing architectures and paradigms. (a) Schematic of the von Neumann architecture, where the bottleneck between memory (gray areas) and the processing unit (yellow area) is highlighted. (b) The in-memory computing architecture paradigm, where electronic synapses allow connections between multiple processing areas with localized memories for the realization of sparse multiple neurons: here, the electronic interconnections represent the bottleneck. (c) Biological neural networks, with a multitude of interconnected neurons where both memory and processing are present (gray/yellow spheres): after integrating the spiking signals coming from the other neurons to which it is connected, a neuron reaches a certain threshold and "fires."


With the diminishing returns on Moore's law and the rise of deep learning (DL) to prominence in 2012, the computing industry has started to move rapidly from a programming era to a learning era.12 Spurred by the digital energy efficiency wall and following the neuroscience-focused approach, which aims to replicate fundamentals of biological neural circuits in dedicated hardware, important advances have been made in neuromorphic and DL accelerator ASICs. Some notable examples are IBM's TrueNorth,13 SpiNNaker,14 Intel's Loihi,15 Neurogrid,16 and HiCANN.17 These have delivered important reported advancements in power efficiency, down to a few pJ (picojoules) per MAC (Multiply-ACcumulate) operation. These large neuromorphic machines are based on the spiking architectural model: they typically perform quite simple tasks and use asynchronous communication to route spikes between neurons and synapses, a model that can, however, hardly keep up with the staggering advances released by powerful deep learning models.18

Deep-learning-focused approaches, on the other hand, begin with an end application in mind and aim to construct hardware that efficiently realizes a solution, while eliminating as much of the complexity of biological neural networks as possible. Deep learning, in particular, has attracted much attention because it is particularly good at learning from unstructured data by using artificial layered neural networks, which makes it potentially very useful for real-world applications. Computational architectures based on the interconnectivity of multiple neurons are called artificial neural networks. Deep neural networks are the quintessential deep learning models. These are feedforward models, in the sense that information flows through layers of artificial neurons in one unique direction, from the input toward the output, with the outputs of one layer fed to the next layer as inputs; once trained, they are able to run inference. In Fig. 3, a comparison between a biological neuron and an artificial neuron is shown. The same neuron is then embedded with many others to construct a feedforward artificial neural network.

FIG. 3.

General scheme of a deep neural network made of layers of artificial neurons. The basic model of an artificial neuron (yellow inset) is made of two distinct parts: a linear part (Σ) and a nonlinear part (Φ). The black blocks are components for broadcast functionality (e.g., optical splitters). The inset at the bottom right shows in detail the fundamental operations in an artificial neuron: the weighted addition (or synaptic operation) and the threshold function. At the top right, the parallel between a real neuron and artificial neuron is shown.


An artificial neuron represents the basic operation unit in a neural network. Referring to the basic McCulloch–Pitts neuron model,19 the operation executed by a neuron is modeled so that its output can be written as y = φ(∑i wi xi + b), where φ is an activation (nonlinear) function, xi is the ith element of the input vector x, wi is the weight factor for the input value xi, and b is a bias (see Fig. 3). In particular, we call the linear term ∑i wi xi the weighted addition. For a layer of M interconnected neurons, the output of these neurons can be expressed in vector form as y = φ(Wx + b), where x is an input vector with N elements, W is the M × N weight matrix, b is a bias vector with M elements, and y is the vector of M outputs.
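The layer equation y = φ(Wx + b) maps directly to a few lines of code; a minimal NumPy sketch, with a sigmoid standing in for the activation φ (the sizes N = 4 and M = 3 are illustrative):

```python
import numpy as np

def neuron_layer(x, W, b):
    """y = phi(W @ x + b) for a layer of M neurons fed by N inputs.

    x: (N,) input vector; W: (M, N) weight matrix; b: (M,) bias vector.
    """
    z = W @ x + b                     # weighted addition (synaptic operation)
    return 1.0 / (1.0 + np.exp(-z))   # nonlinear activation phi (sigmoid here)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                # N = 4 inputs
W = rng.normal(size=(3, 4))           # M = 3 neurons
b = np.zeros(3)
y = neuron_layer(x, W, b)             # (3,) vector of neuron outputs in (0, 1)
```

Stacking such layers, each consuming the previous layer's output, gives exactly the feedforward deep network described above.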

Among the most powerful DL hardware, we note the GPU-based DL accelerators, favored for their high compute and memory density as well as for an established hardware and software ecosystem,20,21 which have at the same time allowed footprint-energy efficiency improvements of ∼7 orders of magnitude on average.22,23 In parallel to the digital approach, there is a constantly growing number of emerging artificial neural network (ANN)-based analog electronic artificial intelligence (AI) chipsets, which, following the technology trends in neuromorphic computing, tend to collocate processing and memory to minimize the memory–processor communication energy costs. The most exciting emerging AI hardware architectures, born to avoid digital bottlenecks, are the analog crossbar approaches,24–26 since they achieve parallelism, in-memory computing, and analog computing at the same time: Mythic's architecture,27 for example, can deliver high-accuracy inference applications within a remarkable energy efficiency of just 0.5 pJ/MAC. These advances all follow the plethora of extensively investigated memristive devices, based on a variety of physical processes ranging from phase change to spin-torque transfer, which have yielded a number of interesting approaches for high-density storage and computing.28–30
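The analog crossbar idea behind such chipsets can be sketched numerically: weights are stored as device conductances, inputs are applied as row voltages, and Kirchhoff's current law sums the per-cell products on each column, so an entire matrix-vector multiply happens in one step. A minimal sketch (all device values are illustrative, not those of any specific chip):

```python
import numpy as np

def crossbar_mvm(V, G):
    """Analog matrix-vector multiply: column currents I_j = sum_i V_i * G_ij.

    V: (N,) input voltages on the rows; G: (N, M) memristor conductances.
    Returns the (M,) vector of column currents (one MAC sum per column).
    """
    return V @ G  # Ohm's law per cell + Kirchhoff current summation per column

V = np.array([0.2, 0.5, 0.1])           # row voltages (V), illustrative
G = np.array([[1.0, 0.5],
              [0.2, 0.3],
              [0.4, 0.8]]) * 1e-3       # conductances (S), illustrative
I = crossbar_mvm(V, G)                  # column currents (A)
```

The physics performs the multiply-accumulate "for free"; the digital cost is confined to the DACs and ADCs at the array boundary.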

Even if the implementation of the neuromorphic approach is visibly delivering some outstanding record energy efficiencies and computation speeds, neuromorphic electronics is already struggling to offer the desired data throughput at the neuron level: artificial neural networks rely on dense interconnectivity between neurons, but closely spaced wires suffer from bandwidth-distance trade-offs due to resistor-capacitor (RC) parasitic effects, with current machines hardly exceeding GHz clock frequencies, so that further scaling in computational speed and throughput will come with side effects such as huge energy consumption.31,32 Neuromorphic processing for high-bandwidth applications requires roughly GHz operation per neuron, which calls for a fundamentally different technology approach.33 A future-proof solution that could dominate this landscape for many years should obviously rely on the best-performing and most efficient mixture of technology and architecture that can support the complete DL learning model portfolio.
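The RC interconnect penalty can be quantified with the first-order cutoff f3dB = 1/(2πRC); a minimal sketch with illustrative (not measured) wire values:

```python
import math

def rc_cutoff_hz(resistance_ohm: float, capacitance_f: float) -> float:
    """First-order 3 dB bandwidth of an RC-limited electrical interconnect."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)

# Illustrative dense on-chip wire: 1 kOhm series resistance, 100 fF load.
f3db = rc_cutoff_hz(1e3, 100e-15)  # on the order of a GHz
```

With these illustrative values the cutoff lands around 1.6 GHz, consistent with the GHz-class clock ceiling noted above; denser (thinner, more capacitive) wiring only pushes it lower.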

Companies are now racing to penetrate a fast-rising market, with recent roadmaps30,34 strongly suggesting migration to photonic transport solutions for their natural bandwidth advantage.35 Following the successful example of replacing electronic with optical interconnect technologies in today's telecom and datacom industries, neuromorphic photonics can take advantage of the inherently analog nature of light to introduce neuron connectivity at clock rates of several tens of GHz. In the following, we introduce neuromorphic integrated photonics and its state-of-the-art, highlighting strong advantages but also points for improvement, and suggesting a new paradigm for ultra-compact, efficient, and fast optical engines, where the key factor is the combination of best-in-class technologies in a 3D planar integration technology.

Neuromorphic photonics has risen as a new research field,36 which aims to transfer the well-known high-bandwidth and low-energy interconnect credentials of photonic circuitry to the area of neuromorphic platforms. In contrast to electronics, there is negligible energy overhead for moving light-encoded information around, which enables unprecedented circuit interconnectivity and speed. Moreover, photonic engines are bit-rate agnostic, offering the right credentials for bit-rate-transparent operation that can relax the delicate trade-off between speed and power consumption. On top of that, photonic integration technology has now reached a maturity level where sophisticated high-performance integrated circuits are available.37,38 Only the combination of the complementary advantages of photonics and electronics, and their synergic co-design, will enable processing systems with high efficiency, high interconnectivity, and extremely high bandwidth, postulating the new field of neuromorphic photonics at the nexus between photonics and neural network processing models.

Breakthrough proof-of-concept experimental photonic AI platforms39–41 have recently started appearing, initially exploiting the maturity of the CMOS silicon photonics industry and shaping a potential path toward integrating multiple photonic neurons on the same silicon chip. Building on these principles, neuromorphic photonics has come to the fore, making headlines through a few new-born start-up companies42–45 and raising expectations of orders-of-magnitude higher energy and size efficiencies compared to state-of-the-art electronic AI platforms.27,46–48

It is of paramount importance to review and categorize the most notable examples of neuromorphic photonic demonstrations in order to understand platform and architecture limitations and to foresee long-term developments. While the neuroscience-focused approach has also been applied to photonics,36,49–52 in this paper we concentrate on deep learning photonic integrated approaches and emphasize the main strategies and architectural approaches for linear neurons, as the synaptic operation is the most computationally expensive. Two main architectural approaches are used to realize linear neurons and the connectivity between them when moving from one layer to the next: the coherent and non-coherent approaches, shown in Figs. 4(a)–4(f). Coherent layouts rely on interferometric arrangements that require just a single wavelength for their optical input signal. These layouts can provide addition or subtraction while the weighted signals are still in the optical domain, via constructive or destructive interference of the optical beams, respectively, allowing in this way the representation of signed values via encoding in the phase of the optical carrier. Non-coherent configurations, on the other hand, utilize multiple wavelength channels as their input signals, relying on wavelength-selective filtering elements for the weighting function and on optical power addition at the photodetector. This obviously requires the sign information to be encoded in the wavelength domain as well, and the final summation to be carried out at the output of a balanced photodetection scheme, where the optical powers of the wavelengths carrying the positive and the negative values, respectively, are converted into electrical currents and subsequently subtracted. These architectural approaches are categorized in the next subsections according to the integrated photonic technology of choice (CMOS-compatible and III/V-based) and the memory-processing co-location.
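The balanced-photodetection bookkeeping for signed weights can be sketched in a few lines: each signed weight is split into two non-negative optical powers on the "positive" and "negative" wavelength channels, and the photodiode pair subtracts the resulting currents. A minimal sketch (the unit responsivity is an illustrative assumption):

```python
def balanced_weight(p_plus: float, p_minus: float, responsivity: float = 1.0) -> float:
    """Signed value from two optical powers via balanced photodetection.

    p_plus / p_minus: optical powers (W) routed to the 'positive' and
    'negative' wavelength channels; each photodiode converts power to
    current, and the balanced output is their difference: w = w+ - w-.
    """
    return responsivity * (p_plus - p_minus)

# A weight of -0.3 applied to a 1 mW input: route 0 W to the positive
# branch and 0.3 mW to the negative branch.
i_out = balanced_weight(p_plus=0.0, p_minus=0.3e-3)  # negative photocurrent
```

Since optical power is non-negative, the sign lives entirely in which channel carries the power; the subtraction happens electrically at the detector pair.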

FIG. 4.

(Top) Linear photonic neuron implementations based on WDM optical power addition: (a) optical neuron for a cross-connect neural layer, (b) a bank of micro-ring resonators for the broadcast and select (B&S) layer scheme, and (c) the crossbar scheme for the parallel computing using PCM loaded micro-bends. (Bottom) Linear photonic neuron implementations based on coherent electric field summation: (d) the optical interference unit with single stage based on two phase shifters MZI, (e) the IQ modulator scheme on InP, and (f) the summation using the concept of the quantum photoelectric multiplier.


Shen et al. have recently proposed a coherent approach [Fig. 4(d)] using a Mach–Zehnder interferometer (MZI)-based optical interference unit (OIU) for matrix multiplication, combined with software-implemented saturable absorbers, to form two-layer feedforward neural networks on silicon on insulator (SOI).40 Though this is a promising approach, also extendable to quantum applications, the presence of multiple MZI stages for implementing a single weight increases phase noise accumulation, reducing the extinction ratio and preventing scalability. Moreover, the weight is set via a thermo-optical mechanism, which increases power consumption. An optical neural network accelerator based on time-multiplexing and coherent (homodyne) detection has been proposed,53 which promises to be scalable to large networks without any error propagation issue [Fig. 4(f)]. Many-mode ONN operation has been demonstrated in a free-space system using spatial light modulators,54 but faster operation is anticipated by employing silicon photonic integrated chips. With the addition of the wavelength domain, the micro-ring-resonator (MRR)-based weighting bank facilitates scalability with easy implementation of neurons and interconnections,55 based on wavelength division multiplexing (WDM) optical power addition [Fig. 4(b)], showing a computational speed of sub-Tera MACs/s, a computing density of a few TMACs/s/mm2, and an energy efficiency of already 0.52 pJ/MAC, similar to the top electronic AI chips.26 While this weighting scheme has been demonstrated to achieve up to 6 bit precision,55 this comes at the cost of complicated calibration schemes. In all these cases, while the routing happens via the low-loss SOI waveguides, the nonlinear functions are realized in software, off-chip, or via a balanced photodiode (PD) scheme. However, an all-optical (AO) implementation of the neural network calls for the complete removal of E/O conversions.
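The MRR weight-bank principle can be illustrated with an idealized model: thermally detuning a ring's resonance relative to its input wavelength sets a continuous power transmission between ~0 and 1, which acts as the weight. A minimal sketch using a Lorentzian line-shape approximation (not a rigorous ring model; the 10 GHz linewidth is an illustrative assumption):

```python
def mrr_weight(detuning_ghz: float, fwhm_ghz: float = 10.0) -> float:
    """Idealized power transmission of a notch-type micro-ring weight.

    At zero detuning the carrier is fully dropped (weight ~0); far from
    resonance it passes untouched (weight ~1). Lorentzian approximation.
    """
    half = fwhm_ghz / 2.0
    notch = half**2 / (detuning_ghz**2 + half**2)  # Lorentzian dip, 1 at resonance
    return 1.0 - notch

# Tuning the heater shifts the resonance, sweeping the weight continuously.
weights = [mrr_weight(d) for d in (0.0, 5.0, 50.0)]
```

In a WDM weight bank, one such ring per wavelength channel applies its weight independently, and the photodetector performs the summation over all channels.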

With respect to the previous implementations on silicon platforms, two other relevant examples emerge in which the indium phosphide (InP) material platform is considered as an alternative, as it allows the co-integration of active and passive components without a loss in performance and opens a path to scalability. Based on the in-phase and quadrature (IQ) modulator scheme, a coherent optical linear unit has been demonstrated [Fig. 4(e)] with a computational speed of 0.32 Tera MACs/s and an energy efficiency of 1.5 pJ/MAC.56 This architecture has been exploited for MNIST digit recognition, obtaining an average accuracy as high as 97.24% when combined with the empirical transfer function of the MZI-semiconductor optical amplifier (SOA) optical nonlinear function scheme.57 In addition, an InP integrated optical cross-connect has been exploited to demonstrate an all-optical neuron and a two-layer neural network by using SOAs as single-stage weight elements as well as SOA-based wavelength converters,58–60 as shown in Fig. 4(a), while trading dynamic range (up to 9 bit resolution) and scalability for energy efficiency (∼tens of pJ/MAC).

We recognize that, similarly to the approach used in analog electronic AI processors, in-memory computing may be the key to breakthrough achievements in energy efficiency via the use of memory elements based on phase change materials (PCMs). Photonic non-volatile PCMs have lately emerged as promising materials with zero-power reconfigurability,61 synaptic programmability,62 and self-learning capabilities.52 However, these examples mostly exploit GST-based compounds that tune the transmitted light amplitude only via absorption,61 limiting their use for large-scale circuitry. In fact, it is the optical refractive index that is tuned here, which affects the optical absorption levels and therefore the transmitted light: this implies that the number of operations we can perform in-line is ultimately limited by the available power budget. The network in Ref. 62 has been simulated using PCMs like GSST (Ge–Sb–Se–Te) on SOI for MNIST handwritten digit classification, resulting in a very high accuracy (92.3%) and ultra-low power consumption at the same time. However, the tuning mechanism (heating, in this case) needs quite some extra space, which reduces the computing density (∼TMACs/s/mm2) and also increases the insertion loss per memory element. Therefore, while the approach of using PCMs on integrated photonics appears to be the most promising in terms of computing density and computational energy efficiency, it remains to be understood how to do so in an ultra-compact and scalable way. A photonic tensor core has recently been demonstrated using phase change memory crossbar arrays [Fig. 4(c)] and photonic chip-based optical frequency combs, where the computation is based on measuring the optical transmission of reconfigurable and non-resonant passive components.63 These results provide an impressively credible path toward full CMOS wafer-scale integration of photonic tensor cores operating at TMAC/s speeds, although they do not yet include the nonlinearities.

All these recent planar integrated optical engine developments suggest that there is still no single technology able to outperform the others in all aspects for the implementation of photonic neural networks (PNNs): the silicon-based platform, for example, lacks the possibility to co-integrate the passive synaptic circuitry with active elements; the InP generic technology opens a path to scalability but is neither compact nor energy efficient; finally, synaptic reconfigurability via photonic memristors, while very energy efficient, is not yet ready to scale. A photonic integration technology that is at the same time (a) scalable, i.e., allows data routing and synaptic operation with the lowest possible optical losses, (b) inclusive of both active and passive elements for a fully all-optical neural network implementation, and (c) able to process information with the lowest energy consumption is highly desirable. Technology and energy requirements for neuromorphic photonic solutions let us foresee that the synergic combination of non-volatile photonic memories with ultra-low-loss passive and ultra-compact active nanophotonics will most likely unlock huge energy and area savings. In the following, we analyze the diverse functionalities needed to allow breakthrough advances in AI platforms, compared to today's performance, and identify the key elements for the most efficient neural implementation, which may enable a disruptive AI photonic revolution.

We believe that key approaches for next-generation optical engines will include the mixing of the best-in-class material and device platforms for both passive and active components, together with the promotion of in-memory processing, where the collocation of photonic memories with high-speed photonic MAC operations is pursued. We argue that the right combination of the best-in-class technologies will have to be conceived via a 3D stack integration approach (see Fig. 5), which sees the co-integration of zero-power synaptic operation that can seamlessly interact with the overlying electronics (top layer), of the lowest-loss routing functionality (middle layer), and of the most compact nonlinear functions for light generation, activation, and detection (bottom layer), enabling neuromorphic photonic processors that are scalable, ultra-compact, and energy efficient. Figure 5 shows a schematic of the envisioned 3D ultra-compact neuron, where the vertical stacking of the three functional layers is underlined. The reason for suggesting a planar 3D integration scheme is to avoid any complex and unreliable hybrid integration scheme and to offer instead flexibility in the choice of materials as well as ease of integration. In the following sections, we analyze the downsides of current technologies and identify the desired performance for each layer in order to propose the most suitable technologies, whose combination promises disruptive deployment of optical computing on chip.

FIG. 5.

Schematic of the envisioned photonic neuron in a 3D fashion, showing the different layers with different functionalities: the top layer (yellow) is dedicated to the synaptic operation in combination with the low-loss routing layer (red), while the nonlinear function layer (blue) is at the bottom. Together, the three layers form a complete, highly compact, energy efficient neuron.


In every synaptic connection, input data get weighted prior to reaching the nonlinear activation unit, meaning that an optical carrier or a modulated signal needs to be attenuated or delayed by an amount dictated by a weight value. Photonic interferometers and resonating structures are mostly employed to realize synaptic connections, leveraging various technologies based on thermo-optic, electro-optic, non-volatile, and plasmonically enhanced modulation of amplitude or phase attributes (see Fig. 6). When mapping the dynamics of the employed technology, integrated photonic weighting elements can be classified into fast and slow elements. Fast weighting solutions become essential in training routines where weights are updated on-line, while slow variants suffice for layouts targeting inference tasks, where weight assignment is carried out off-line.

FIG. 6.

Perspective performance for the identified technology for the 3D co-integrated linear neuron (pink ball) based on Sb-loaded SiN waveguide and comparison with the state-of-the-art.


Thermo-optic (TO) phase-shifter-loaded Mach–Zehnder interferometers (MZIs) or micro-ring resonators (MRRs) constitute the easiest way to realize on-chip weighting elements, changing the index of the propagating modes through resistive metal wires lying on top of the waveguides. In most cases, thermo-optic phase shifters are power inefficient and occupy a large footprint.64 However, advanced fabrication techniques can lead to TO phase shifters with efficiencies of tens of microwatts per π phase shift,65 while a collaborative research effort within the European H2020 project PlasmoniAC attempts to deliver sub-mW and ultra-compact thermo-optic synaptic elements via plasmonics-on-SiN structures.66 Alternatives to TO solutions are electro-refractive or electro-absorptive modulators, where physical mechanisms such as the Pockels effect, the Kerr effect, the quantum-confined Stark effect (QCSE), and free-carrier modulation67–71 are exploited to realize fast and energy-efficient weighting devices; however, a straightforward comparison between modulation principles cannot be pursued without reference to application requirements. Ultrahigh-speed modulation, for instance, can be achieved with mm-long LiNbO3-based MZI modulators, but their very large footprint raises concerns about scalability and yield.72 At the other extreme, ultra-compact plasmonic modulators based on nonlinear polymers have showcased modulation rates beyond 100 Gbaud, but still face the challenge of reducing excessive optical losses.73 In fact, research groups are striving to engineer hybrid modulators employing graphene plasmonics,74 InP membranes,75 and other emerging 2D materials76 in an attempt to improve all performance metrics of both electro-optic and electro-absorptive modulators. Recently, indium-tin-oxide (ITO) has appeared compelling, exhibiting strong tunable absorption and unity refractive index change, with demonstrated low-loss operation and multi-GHz potential.77

We may therefore argue that the aforementioned modulation technologies are well suited for volatile weighting functions, but severe challenges associated with thermal drifts and sub-optimal retention of weight values during inference need to be overcome. Endeavors to conceive and develop photonic memristor devices have therefore gathered enormous attention, with the prospect of realizing all-optical, fast, and zero-power nonlinear responses with long-term information retention. In particular, chalcogenide phase-change materials (PCMs) exhibit strong modulation in a static, self-holding fashion, enabling ultra-compact and highly energy-efficient operation. State-of-the-art demonstrations have revealed memristive behavior for chalcogenide PCMs in spiking neural networks78 with self-holding properties and ultra-compact footprint: GST (Ge2Sb2Te5) islands deposited on top of Si3N4/SiO2 waveguides formed a 15-μm-long synaptic element with 3-bit precision and without any static power consumption, controlled via the repetition rate of an optical pulse. Recently, electrical switching of GST-based optical attenuators with external heaters79,80 and with on-chip integrated PIN heaters81 has shown promising results, however incurring large insertion losses, due to the use of ITO heaters or uniformly doped silicon heaters, and a number of switching cycles limited to ∼5–50. Ultra-compact hybrid Ge–Sb–Se–Te (GSST)–silicon Mach–Zehnder modulators employing an optimized electro-thermal switching mechanism62 have also been reported.

A zero-power memristive weighting structure must be developed to eliminate the energy cost of photonic linear neuron operations. However, the transmitted light amplitude should not be tuned via absorption but through a phase change, to remove insertion losses. Here, we propose some very promising novel antimony (Sb)-based compounds, which allow tuning of the optical refractive index without affecting optical absorption levels. Recent fabrication–ellipsometry results82 on the non-volatile refractive index change in Sb2S3 (antimony trisulfide) and Sb2Se3 (antimony triselenide) already show a distinctively large index change of Δn = 0.77, without notable increase of absorption in either the crystalline or the amorphous state across a broad optical spectrum of >800 nm, and a switching extinction ratio supporting up to 5-bit resolution.
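As a sanity check of the quoted π-shift length, a back-of-the-envelope estimate follows (our own sketch; the 10% modal-overlap figure is an assumption for illustration, not a measured value):

```python
def pi_length_um(delta_n_eff, wavelength_nm=1550.0):
    """Waveguide length needed for a pi phase shift:
    phase = 2*pi * delta_n_eff * L / lambda  =>  L_pi = lambda / (2 * delta_n_eff).
    Returns the length in micrometers."""
    return wavelength_nm / (2.0 * delta_n_eff) / 1000.0

# Full material index change of the Sb compound (upper bound on efficiency)
print(round(pi_length_um(0.77), 2))   # ~1.01 um
# With only ~10% of the mode overlapping the thin Sb film (assumed figure),
# the effective index change drops and L_pi grows accordingly
print(round(pi_length_um(0.077), 1))  # ~10.1 um, consistent with the <15 um claim
```

Even with a modest modal overlap, the Δn = 0.77 material change supports π phase shifts well within the <15 μm waveguide section quoted above.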

The right choice for the routing layer (middle layer in Fig. 5) is strictly connected to the architectural choice. Multiple synaptic connections, together with the fan-in, i.e., all the inputs/outputs (I/Os) of the optical engines, constitute a linear neuron. Linear neuron implementations fall into two main categories: coherent and incoherent. Coherent linear neurons have so far relied mostly on interferometers, or any type of photonic device that resembles a beam splitter, arranged in mesh topologies to perform the MAC operation using optoelectronics (OEs). In this case, single laser sources have been used, and fruitful interferometric layouts supporting different cell symmetries have been in the spotlight,37 resulting in multipath interference patterns that aim at unity fidelity by minimizing the number of active components and mitigating phase errors. Although remarkable results have been reported, coherent approaches still struggle to adopt WDM functionalities and hence do not fully exploit the benefits of inherently parallelized photons, which would improve scalability, enable all-to-all interconnectivity, and reduce form factor. On the other hand, the first neuromorphic photonic layouts relied on incoherent deployments that inherently employ WDM-enabled MAC operations, with the most recent on-chip implementations promoting the use of multiwavelength or multiple on-chip laser sources and creating expectations of skyrocketing computational speed for next-generation photonic processors.
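As a toy illustration of the coherent approach (our own sketch; this is one common parameterization of the mesh unit cell, not taken from the cited works), each node of an interferometric mesh can be described by a lossless 2 × 2 unitary transfer matrix:

```python
import numpy as np

def mzi(theta, phi):
    """Ideal 2x2 Mach-Zehnder transfer matrix (internal phase theta,
    external phase phi) -- the unit cell of coherent interferometric
    meshes used for matrix-vector multiplication."""
    return np.exp(1j * theta / 2) * np.array([
        [np.exp(1j * phi) * np.sin(theta / 2), np.exp(1j * phi) * np.cos(theta / 2)],
        [np.cos(theta / 2), -np.sin(theta / 2)],
    ])

U = mzi(0.7, 1.3)
# Any lossless MZI setting is unitary: U @ U^dagger = identity,
# which is why phase errors, rather than loss, dominate mesh fidelity
print(np.allclose(U @ U.conj().T, np.eye(2)))  # True
```

Cascading such cells in a triangular or rectangular mesh realizes an arbitrary unitary weight matrix, which is the principle behind the coherent layouts discussed above.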

In this context, various incoherent architectures have been demonstrated, spanning from crossbar arrays63 to specialized broadcast-and-weight layouts on SOI83 as well as cross-connect schemes on InP58 using WDM sources. In practice, though, both coherent and incoherent layouts face a multitude of performance trade-offs associated with thermal stability, increased channel crosstalk, and excessive insertion losses.

Putting facts in perspective, those trade-offs originate mostly from the waveguide platform of choice and its constituent materials. Current implementations rely heavily on SOI wafers with crystalline Si waveguides to implement weighting functions and hybrid-assembled fan-in structures, realized through multi-project wafer services or in-house fabrication facilities. The high index contrast of silicon waveguides indeed allows for compact layouts, at the expense of increased optical scattering and susceptibility to phase errors along the direction of propagation. Propagation losses in active–passive InP wafers, on the other hand, are mainly due to the p-doped cladding layer and would require ad hoc process development to guarantee competitive waveguide losses.84 In stark contrast, moderate-index-contrast platforms such as silicon nitride have propelled the deployment of photonic devices with higher immunity to temperature drifts, lower optical losses, improved crosstalk values, and wider wavelength transparency.85 In addition, silicon-rich nitride platforms have emerged as a means to tailor the index contrast of photonic devices to application requirements where increased index contrast is imperative, such as low-loss (<1 dB) fiber-to-chip coupling using grating couplers.86 Multi-layer silicon nitride cross-connects, standing out as 10 × 100 any-to-any as well as feedforward networks, have also been demonstrated, pointing out the increased degree of freedom in designing scalable linear neurons on low-loss SiN platforms.87 Nevertheless, hybrid integration of active devices on SiN platforms is still in its infancy, impeding wide adoption in practical applications as opposed to the more mature co-integration of CMOS electronics on SOI. For these reasons, while we foresee that the SiN platform will play a key role as the routing layer in neuromorphic photonics, we also suggest that only a 3D integration scheme merging SiN with best-in-class active technologies bears promise for a revolutionary neuromorphic photonic platform.

After identifying the synaptic weight and linear neuron technologies (top and middle layers), we reach the bottom layer of the envisioned 3D integrated approach in Fig. 5, where all the active functions (generation, detection, and activation) must be placed and interfaced to the linear neuron to realize complete neural units.

Nonlinear activation function (NLAF) elements can be classified as optoelectronic (OE) or all-optical (AO). In OE NLAFs, the nonlinear optical transfer function is mostly mediated by an electrical signal used to convert the optical input signals at the output;41,88–90 however, the employed O/E conversion can increase inference latency91 or require efficient and low-power optoelectronic devices leveraging advanced fabrication methods.92 In contrast to O/E-based NLAFs, all-optical variants are highly anticipated to revolutionize neuromorphic photonic circuits by providing time-of-flight latencies, fully exploiting the available optical bandwidth, and consuming little power. All-optical SOA-MZI wavelength converters57 have been shown to exhibit a sigmoid-like transfer function with an extinction ratio of 11 dB at 10 GHz, yet in a bulky and power-hungry deployment. On the other hand, power-efficient Fano-based MRR-MZI structures52,93–96 can operate as optical thresholders achieving energy efficiencies down to ∼13.3 pJ/Op; however, such schemes face hurdles in handling WDM inputs and increasing the aggregated bandwidth. For example, an optical nonlinear function has been reported in Ref. 52 with PCM GST loaded on an MRR and a probe laser coupled into the MRR to assist in generating the output pulses; however, the pulse width limits the maximum operation speed. Plasmon-assisted localization in combination with highly nonlinear materials such as CdSe quantum dots93 allows for a transfer function resembling a reversed sigmoid, with a stunning compute efficiency of 1 Tera Op/s, though with pitfalls regarding high loss and low contrast. Without moving toward more exotic implementations, a more robust and suitable technology for the active layer may rely on the resonantly enhanced nonlinear response of passive semiconductors, which can be configured to generate a tailored nonlinear function, as, for instance, in Ref. 94, where photonic crystal Fano structures have been used to determine the optimal ratio between energy per bit and speed. Superior performance can be achieved by using photonic crystal nonlinear resonators made of III-V materials, whose parameters can be optimized to favor speed and/or energy efficiency97 for integration on a silicon photonic circuit.98
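A phenomenological model helps make the sigmoid-like transfer functions above concrete (our own sketch; the half-power point and steepness are hypothetical parameters, and only the 11 dB extinction ratio comes from the cited SOA-MZI demonstration):

```python
import numpy as np

def optical_sigmoid(p_in, p_half=0.5, steepness=10.0, er_db=11.0):
    """Phenomenological sigmoid-like optical transfer function with a
    finite extinction ratio: the 'off' level floors at P_max / ER,
    so the output never reaches true zero."""
    floor = 10 ** (-er_db / 10.0)  # 11 dB ER -> off state at ~7.9% of max
    s = 1.0 / (1.0 + np.exp(-steepness * (p_in - p_half)))
    return floor + (1.0 - floor) * s

print(round(float(optical_sigmoid(0.0)), 3))  # 0.086: near the ER-limited floor
print(round(float(optical_sigmoid(1.0)), 3))  # 0.994: near full transmission
```

The finite floor is what distinguishes a physical optical thresholder from an ideal mathematical sigmoid: the extinction ratio directly bounds how well the "off" state suppresses noise accumulation across layers.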

So far, mainly monolithic integration, where light travels in a single guiding plane (2D), or butt coupling of diverse monolithically integrated chips has been exploited to combine diverse functionalities (e.g., linear and nonlinear functions). Butt coupling of diverse photonic material platforms overcomes the lack of on-chip gain in fully passive platforms,37 but at the expense of a complex and unreliable coupling scheme; monolithic integration offers process robustness, yet prevents further scalability because of the high passive excess losses in active–passive platforms.38

We suggest that a more interesting approach is 3D hybrid planar integration, made of multiple guiding planes stacked in a 3D fashion, each with a different functionality: a bottom guiding layer is patterned and planarized, a next layer is deposited or bonded on top, which is in turn patterned and planarized, and so forth. We envision that such an integration scheme may provide the most compact optical neurons. Specifically, InP-over-SOI hybrid technology has previously been demonstrated as an extremely promising solution for future photonic circuits, as it combines CMOS compatibility with the optoelectronic properties of III-V materials.99 The two layers are separated by a low-refractive-index bonding layer constituted of benzocyclobutene (BCB) and SiO2. Sub-100 nm alignment of the III-V nanocavities on top of the underlying SOI waveguides is possible via multi-layer overlay and high-precision reference markers fabricated at the silicon waveguide level, which guarantees high optical coupling between the layers in the vertical direction. Moreover, the vertical evanescent coupling between the underlying Si waveguides and the nanocavities on top is a very compelling method, whose strength can be tuned at will by controlling the transverse overlap of the electromagnetic fields in each level, for example by changing the SOI waveguide width and/or the SiO2 intermediate spacer layer.99

Moreover, electrically powered InP photonic crystal (PhC) nanostructures on silicon-on-insulator (SOI) waveguide platforms have confirmed their superior compactness and energy benefits over the entire spectrum of optical active elements in a rich functional suite, including nanolasers,99 nanomodulators,100 nanophotodetectors,101 and bistable nanomemories.102 These nanocavities are manufactured on top of the SOI passives through a top-down approach that bypasses sophisticated III-V regrowth steps and annealing at temperatures beyond those compatible with CMOS back-end-of-line processing.

As a result, this work paves the way to a 3D hybrid technology that enables the convergence of microelectronics and photonics into a new generation of complex optoelectronic circuits serving the new paradigm of neuromorphic photonics. Transferring the InP PhC technology onto the SiN waveguide platform and engineering its rich nonlinear characteristics into a complete set of ultra-low-power and ultra-small-footprint computational elements will allow verifying the potential of photonic crystal technology in the context of neuromorphic computing, once interfaced to memory-loaded linear neurons.

The integration of non-volatile Sb-based layers, possibly electrically controlled via localized thermal heaters or PIN electrical junctions, onto low-loss SiN waveguides is expected to produce record-low-loss (<0.15 dB) non-volatile photonic structures, allowing full π-phase shifts over a very short waveguide section (<15 μm). Sb materials are expected to yield identical weight resolution performance when operated in standalone waveguide configurations, but to allow >8-bit weight resolution when brought into an engineered interferometric layout for the highest linear neuron accuracy. On the other hand, InP nanophotonic crystals (PhCs) are envisioned for ultra-low-energy, ultra-fast (>40 GHz) photonic fan-in, gating, and activation function units. This InP PhC technology may be transferred under (or onto) the SiN Sb-loaded waveguide platform, providing the most compact and complete functional set required for neuromorphic computation, which, interconnected into programmable meshes,37 is foreseen to release sub-fJ/MAC energy efficiencies.

The adoption of this technology will release a neuromorphic photonic platform that can be benchmarked along all relevant metrics against the electronic AI chip state-of-the-art (see Fig. 7 and the corresponding tabulated metrics in Table I). Assuming a basic 16 × 16 neuron architecture for this 3D photonic neural network (3D PNN), we highlight how it can scale to multi-neuron photonic processors with breakthrough performance in the following metrics. The operating frequency of the 3D PNN can reach 50 GHz, about 50 times higher than that of current digital GPU, TPU, and spiking engines, leveraging WDM techniques and wavelength parallelization as an extra acceleration factor. The computational power in MAC/s (number of MACs times frequency), in the case of a typical N×N crossbar architecture with up to 16 different wavelength channels, will reach tens of TMAC/s, with the total area of a four-channel N×N configuration calculated at a footprint of ∼1 mm2. The full-scale 3D PNN processor could encompass ∼400 cores within a standard silicon die area, yielding a total computational power of the order of tens of PMAC/s, i.e., >4× higher than the rack-scale HICANN and ∼140× higher than Google's Cloud TPU v3 boards. Considering the power consumption of N InP PhC PDs, N InP lasers, and N InP modulators including drivers, together with the total crossbar insertion loss, the total power required for a single channel operating at 50 GHz is calculated to be tens of mW. This suggests an energy efficiency that scales linearly with N, reaching values as low as ∼63 fJ/bit, i.e., an order of magnitude better than all available digital and analog neuromorphic engines, and a total footprint efficiency that surpasses all available digital, analog, and spiking engines by 2 orders of magnitude.66 Latency will obviously benefit from the use of photons as the data-carrying medium, with time-of-flight latencies below a few hundred ps even for the longest route within a 20 × 20 mm2 silicon die. Figure 7 depicts the 3D PNN's breakthroughs, simultaneously achieving top-notch compute power and footprint efficiency compared to state-of-the-art machines, highlighting its long-term vision and future-proof technology.
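The throughput arithmetic above can be reproduced in a few lines (our own restatement of the quoted figures: N = 16, four wavelength channels per ∼1 mm² core, a 50 GHz clock, and ∼400 cores per die):

```python
def crossbar_macs_per_s(n=16, wavelengths=4, clock_hz=50e9):
    """MAC throughput of an N x N crossbar: all N*N weights are exercised
    every clock cycle, multiplied by the number of WDM channels running
    in parallel through the same physical mesh."""
    return n * n * wavelengths * clock_hz

core = crossbar_macs_per_s()      # one 16x16, four-wavelength core
print(core / 1e12)                # 51.2 TMAC/s per ~1 mm^2 core ("tens of TMAC/s")
print(400 * core / 1e15)          # 20.48 PMAC/s for ~400 cores per die
```

The ∼20.5 PMAC/s die-level figure matches the 20.5 × 10³ TMAC/s entry for the 3D PNN in Table I, with wavelength parallelism supplying a 4× multiplier on top of the spatial N² parallelism.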

FIG. 7.

Projected performance for our 3D photonic neural network (PNN) and comparison with state-of-the-art machines in terms of footprint efficiency and compute power vs energy per MAC performance.

TABLE I.

State-of-the-art machines, with performance expressed in terms of energy per MAC, footprint efficiency, and compute power, alongside the projected performance of our 3D PNN.

| Machine | Typology | Energy per MAC (pJ) | Footprint efficiency (MMAC/s/mm²) | Compute power (TMAC/s) |
|---|---|---|---|---|
| TrueNorth | Spiking | 0.27 | 2.0 | 0.017 |
| Loihi | Spiking | 23.6 | 0.81 | 0.38 × 10⁻⁶ |
| HiCANN | Spiking | 198.0 | 130.0 | 5.0 × 10⁻³ |
| Neurogrid | Spiking | 119.0 | 1.40 | 0.23 × 10⁻⁶ |
| Cambricon | Spiking | 0.625 | 2 × 10⁶ | 128.0 |
| Google TPU | GPU-based | 0.8 | 2.78 × 10⁵ | 1.4 × 10⁻⁶ |
| Nvidia | GPU-based | 1.2 | 3.07 × 10⁵ | 1.56 |
| Graphcore | GPU-based | 0.6 | 3.1 × 10⁵ | 0.206 |
| Groq | GPU-based | 0.36 | 1.13 × 10⁶ | 820.0 |
| SiPho | Si photonics | 1.45 | 4.72 × 10⁵ | 50.0 |
| Mythic | Analog | 0.5 | 2.0 × 10⁶ | 100.0 |
| 3D PNN | 3D photonics | 0.0625 | 3.5 × 10⁷ | 20.5 × 10³ |

With this perspective paper, we argue that a new topology of photonic neural network can achieve top-notch performance in the characteristic computational metrics of speed, energy efficiency, and footprint. Investment in novel PCM photonic memristors and their 3D co-integration with ultra-compact and energy-efficient InP nanophotonics, working together to promote in-memory processing, is a key approach for next-generation optical engines. The right combination of best-in-class technologies, conceived in a 3D stack approach, will enable fully programmable neuromorphic photonic processors with zero-power synaptic operation, ultra-low-loss optical routing, and a complete suite of ultra-compact nonlinear circuitry for light generation, activation, and detection, enabling the disruptive deployment of optical computing. Finally, the development of these technologies into robust, ultra-compact, and efficient computing hardware must be pursued together with the development of the mathematical framework for neurophotonic deep learning architectures and training models, in order to demonstrate clear advantages in replacing or complementing state-of-the-art conventional computational approaches.

The work of B. Shi, Dr. N. Calabretta, and Dr. Stabile was financially supported by the Netherlands Organization of Scientific Research (NWO) under the Zwaartekracht programma “Research Centre for Integrated Nanophotonics.” The work of Dr. C. Vagionas, Dr. G. Dabos, and Dr. N. Pleros was supported by the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the “First Call for H.F.R.I. Research Projects to support Faculty members and Researchers and the procurement of high-cost research equipment grant” (DeepLight, Project Number: 4233) and by the EC through H2020 Project ICT-PLASMONIAC (Contract 871391).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

2.
J. D.
Kendall
and
S.
Kumar
, “
The building blocks of a brain-inspired computer
,”
Appl. Phys. Rev.
7
,
011305
(
2020
).
3.
A. O.
Riordan
,
G.
Fagas
,
B.
O’Flynn
,
J.
Rohan
,
P.
Galvin
, and
C. Ó.
Mathúna
, More Than Moore, International Roadmap for Devices and Systems (IRDS) white paper.
4.
K.
Kitayama
,
M.
Notomi
,
M.
Naruse
,
K.
Inoue
,
S.
Kawakami
, and
A.
Uchida
, “
Novel frontier of photonics for data processing—Photonic accelerator
,”
APL Photonics
4
,
090901
(
2019
).
5.
J.
Wu
,
Y.-L.
Shen
,
K.
Reinhardt
,
H.
Szu
, and
B.
Dong
, “
A nanotechnology enhancement to Moore’s law
,”
Appl. Comput. Intell. Soft Comput.
2013
,
426962
(
2013
).
6.
S. J.
Ben Yoo
and
D. A. B.
Miller
, “
Nanophotonic computing: Scalable and energy-efficient computing with attojoule nanophotonics
,” in
2017 IEEE Photonics Society Summer Topical Meeting Series (SUM), San Juan
(
IEEE
,
2017
), pp.
1
2
.
7.
B.
Marr
,
B.
Degnan
,
P.
Hasler
, and
D.
Anderson
, “
Scaling energy per operation via an asynchronous pipeline
,”
IEEE Trans. Very Large Scale Integr. Syst.
21
(
1
),
147
151
(
2013
).
8.
Y.
Shen
et al, “
Silicon photonics for extreme scale systems
,”
J. Lightwave Technol.
37
(
2
),
245
259
(
2019
).
9.
J. D.
Meindl
, “
Beyond Moore’s law: The interconnect era
,”
IEEE Comput. Sci. Eng.
5
(
1
),
20
24
(
2003
).
10.
“Top 500 List - June 2020,” TOP500, June 2020, available at https://www.top500.org/lists/top500/list/2020/06/.
11.
J.
Hasler
and
B.
Marr
,
Front. Neurosci.
7
,
118
(
2013
).
12.
A.
Szalay
and
J.
Gray
, “
Science in an exponential world
,”
Nature
440
(
7083
),
413
414
(
2006
).
13.
F.
Akopyan
et al, “
TrueNorth: Design and tool flow of a 65 mW 1 million neuron programmable neurosynaptic chip
,”
IEEE Trans. Comput. Aided Design Integr. Circuits Syst.
34
(
10
),
1537
(
2015
).
16.
B. V.
Benjamin
et al, “
Neurogrid: A mixed-analog-digital multichip system for large-scale neural simulations
,”
Proc. IEEE
102
(
5
),
699
716
(
2014
).
18.
A.
Tavanaei
et al, “
Deep learning in spiking neural networks
,”
Neural Networks
111
,
47
63
(
2019
).
19.
W. S.
McCulloch
and
W.
Pitts
, “
A logical calculus of the ideas immanent in nervous activity
,”
Bull. Math. Biophys.
5
(
4
),
115
133
(
1943
).
20.
E.
Nurvitadhi
,
D.
Sheffield
,
J.
Sim
,
A.
Mishra
,
G.
Venkatesh
, and
D.
Marr
, “Accelerating binarized neural networks: Comparison of FPGA, CPU, GPU, and ASIC,” in 2016 International Conference on Field-Programmable Technology (FPT) (IEEE, 2016), pp. 77–84.
21.
V.
Gupta
,
A.
Gavrilovska
,
K.
Schwan
,
H.
Kharche
,
N.
Tolia
,
V.
Talwar
, and
P.
Ranganathan
, “GViM: GPU-accelerated virtual machines,” in Proceedings of the 3rd ACM Workshop on System-Level Virtualization for High Performance Computing (Association for Computing Machinery, New York, NY, 2009), pp. 17–24.
23.
P.
Kennedy
, See https://www.servethehome.com/hands-on-with-a-graphcorec2-ipu-pcie-card-at-dell-tech-world/ for “Graphcore” (last accessed October 15, 2019).
24.
See http://www.tinymlsummit.org/syntiant_7-25_meetup.pdf for “Syntiant” (last accessed October 15, 2019).
25.
26.
See https://www.mythic-ai.com/technology/ for “Mythic” (last accessed October 15, 2019).
28.
A.
Makarov
,
V.
Sverdlov
, and
S.
Selberherr
, “
Emerging memory technologies: Trends, challenges, and modeling methods
,”
Microelectron. Reliab.
52
,
628
634
(
2012
).
29.
A.
Chen
, “
A review of emerging non-volatile memory (NVM) technologies and applications
,”
Solid-State Electron.
125
,
25
38
(
2016
).
30.
C.
Sung
,
H.
Hwang
, and
I. K.
Yoo
, “
Perspective: A review on memristive hardware for neuromorphic computation
,”
J. Appl. Phys.
124
(
15
),
151903
(
2018
).
31.
E.
Kadric
,
D.
Lakata
, and
A.
Dehon
, “
Impact of parallelism and memory architecture on FPGA communication energy
,”
ACM Trans. Reconfigurable Technol. Syst.
9
(
4
),
30
(
2016
).
32.
M.
Horowitz
, “1.1 Computing’s energy problem (and what we can do about it),” in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC) (IEEE, 2014), pp. 10–14.
33.
M. A.
Nahmias
,
T.
Ferreira de Lima
,
A. N.
Tait
,
H.-T.
Peng
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Photonic multiply-accumulate operations for neural networks
,”
IEEE J. Sel. Top. Quantum Electron.
26
(
1
),
7701518
(
2019
).
34.
G.
Tanaka
et al, “Recent advances in physical reservoir computing: A review,” arXiv:1808.04962 (2018).
35.
Y.
Shen
et al., “Silicon photonics for extreme scale systems,”
J. Lightw. Technol.
37
(
2
),
245
259
(
2019
).
36.
Neuromorphic Photonics
, edited by
P. R.
Prucnal
and
B. J.
Shastri
(
CRC Press
,
2017
).
37.
W.
Bogaerts
,
D.
Pérez
,
J.
Capmany
,
D. A. B.
Miller
,
J.
Poon
,
D.
Englund
,
F.
Morichetti
, and
A.
Melloni
, “
Programmable photonic circuits
,”
Nature
586
,
207
216
(
2020
).
38.
M.
Smit
et al, “
An introduction to InP-based generic integrated technology
,”
IOP Semicond. Sci. Technol.
29
(
8
),
083001
(
2014
).
39.
C.
Huang
et al,
Demonstration of Photonic Neural Network for Fiber Nonlinearity Compensation in Long-Haul Transmission Systems
(
OFC
,
San Diego
,
2020
), Postdeadline Th4C.6.
40.
Y.
Shen
et al, “
Deep learning with coherent nanophotonic circuits
,”
Nat. Photonics
11
,
441
446
(
2017
).
41.
A. N.
Tait
et al, “
Silicon photonic modulator neuron
,”
Phys. Rev. Appl.
11
(
6
),
064043
(
2019
).
42.
See https://lightmatter.co/ for Lightmatter.
43.
See https://www.lightelligence.ai/ for Lightelligence.
44.
See https://luminous.co/ for Luminous.
45.
See https://www.lumiphase.com/ for Lumiphase.
46.
See http://www.Groq.com for Groq.
49.
G.
Sarantoglou
,
M.
Skontranis
, and
C.
Mesaritakis
, “
All optical integrate and fire neuromorphic node based on single section, quantum dot laser
,”
IEEE J. Sel. Top. Quantum Electron.
26
,
1900310
(
2019
).
50.
J.
Robertson
,
E.
Wade
,
Y.
Kopp
,
J.
Bueno
, and
A.
Hurtado
, “
Towards neuromorphic photonic networks of ultrafast spiking laser neurons
,”
IEEE J. Sel. Top. Quantum Electron.
26
,
7700715
(
2019
).
51.
S.
Xiang
,
Z.
Ren
,
Z.
Song
,
Y.
Zhang
,
X.
Guo
,
G.
Han
, and
Y.
Hao
, “
Computing primitive of fully VCSEL-based All-optical spiking neural network for supervised learning and pattern classification
,”
IEEE Trans. Neural Netw. Learn. Syst.
1
12
(
2020
).
52.
J.
Feldmann
,
N.
Youngblood
,
C. D.
Wright
,
H.
Bhaskaran
, and
W. H. P.
Pernice
, “
All-optical spiking neurosynaptic networks with self-learning capabilities
,”
Nature
569
(
7755
),
208
214
(
2019
).
53.
R.
Hamerly
,
L.
Bernstein
,
A.
Sludds
,
M.
Soljačić
, and
D.
Englund
, “
Large-scale optical neural networks based on photoelectric multiplication
,”
Phys. Rev. X
9
(
2
),
021032
(
2019
).
54.
L.
Bernstein
,
A.
Sludds
,
R.
Hamerly
,
V.
Sze
,
J.
Emer
, and
D.
Englund
, “Freely scalable and reconfigurable optical hardware for deep learning,” arXiv:2006.13926.
55.
H.
Chaoran
,
S.
Bilodeau
,
T.
Ferreira de Lima
,
A. N.
Tait
,
P. Y.
Ma
,
E. C.
Blow
,
A.
Jha
,
H.-T.
Peng
,
B. J.
Shastri
, and
P. R.
Prucnal
, “
Demonstration of scalable microring weight bank control for large-scale photonic integrated circuits
,”
APL Photonics
5
(
4
),
040803
(
2020
).
56.
G.
Mourgias-Alexandris
et al, “
Neuromorphic photonics with coherent linear neurons using dual-IQ modulation cells
,”
J. Lightwave Technol.
38
(
4
),
811
819
(
2020
).
57.
G.
Mourgias-Alexandris
,
A.
Tsakyridis
,
N.
Passalis
,
A.
Tefas
,
K.
Vyrsokinos
, and
N.
Pleros
, “
An all-optical neuron with sigmoid activation function
,”
Opt. Express
27
,
9620
9630
(
2019
).
58.
B.
Shi
,
N.
Calabretta
, and
R.
Stabile
, “
Deep neural network through an InP SOA-based photonic integrated cross-connect
,”
IEEE J. Sel. Top. Quantum Electron.
26
(
1
),
7701111
(
2020
).
59.
B.
Shi
,
K.
Prifti
,
E.
Magalhães
,
N.
Calabretta
, and
R.
Stabile
, “Lossless monolithically integrated photonic InP neuron for all-optical computation,” in Optical Fiber Communication Conference (Optical Society of America, 2020), p. W2A-12.
60.
B.
Shi
,
N.
Calabretta
, and
R.
Stabile
, “First demonstration of a two-layer all-optical neural network by using photonic integrated chips and SOAs,” in 45th European Conference on Optical Communication (ECOC 2019) (IEEE, 2019), pp. 1–4.
61.
A.
Manolis
et al, “Non-volatile integrated photonic memory using GST phase change material on a fully etched Si3N4/SiO2 waveguide,” in CLEO: Science and Innovations, San Jose, CA (Optical Society of America, 2020).
62.
M.
Miscuglio
,
J.
Meng
,
O.
Yesiliurt
,
Y.
Zhang
,
L. J.
Prokopeva
,
A.
Mehrabian
,
J.
Hu
,
A. V.
Kildishev
, and
V. J.
Sorger
, “Artificial synapse with mnemonic functionality using GSST-based photonic integrated memory,” arXiv:1912.02221 (2019).
63.
J.
Feldmann
,
N.
Youngblood
,
M.
Karpov
,
H.
Gehring
,
X.
Li
,
M.
Stappers
,
M.
Le Gallo
,
X.
Fu
,
A.
Lukashchuk
,
A.
Raja
,
J.
Liu
,
D.
Wright
,
A.
Sebastian
,
T.
Kippenberg
,
W.
Pernice
, and
H.
Bhaskaran
, “Parallel convolution processing using an integrated photonic tensor core,” arXiv:2002.00281 (2020).
64.
N. C.
Harris
et al, “
Efficient, compact and low loss thermo-optic phase shifter in silicon
,”
Opt. Express
22
,
10487
10493
(
2014
).
65.
Z.
Lu
,
K.
Murray
,
H.
Jayatilleka
, and
L.
Chrostowski
, “
Michelson interferometer thermo-optic switch on SOI with a 50-μW power consumption
,” in
2016 IEEE Photonics Conference (IPC), Waikoloa, HI
(
IEEE
,
2016
), pp.
107
110
.
66.
A.
Totovic
et al, “
Femtojoule per MAC neuromorphic photonics: An energy and technology roadmap
,”
IEEE J. Sel. Top. Quantum Electron.
26
(
5
),
8800115
(
2020
).
67. C. Wang, M. Zhang, M. Yu, R. Zhu, H. Hu, and M. Loncar, Nat. Commun. 10, 978 (2019).
68. I. Bar-Joseph, C. Klingshirn, D. A. B. Miller, D. S. Chemla, U. Koren, and B. I. Miller, Appl. Phys. Lett. 50, 1010 (1987).
69. Y. Kuo, Y. K. Lee, Y. Ge, S. Ren, J. E. Roth, T. I. Kamins, D. A. B. Miller, and J. S. Harris, IEEE J. Sel. Top. Quantum Electron. 12, 1503 (2006).
70. M. R. Billah, M. Blaicher, T. Hoose, P.-I. Dietrich, P. Marin-Palomo, N. Lindenmann, A. Nesic, A. Hofmann, U. Troppenz, M. Moehrle, S. Randel, W. Freude, and C. Koos, Optica 5, 876 (2018).
71. R. Amin, R. Maiti, C. Carfano, Z. Ma, M. H. Tahersima, Y. Lilach, D. Ratnayake, H. Dalir, and V. J. Sorger, APL Photonics 3, 126104 (2018).
72. C. Wang, M. Zhang, X. Chen, M. Bertrand, A. Shams-Ansari, S. Chandrasekhar, P. Winzer, and M. Lončar, Nature 562, 101 (2018).
73. C. Haffner, D. Chelladurai, Y. Fedoryshyn, A. Josten, B. Baeuerle, W. Heni, T. Watanabe, T. Cui, B. Cheng, S. Saha, D. L. Elder, L. R. Dalton, A. Boltasseva, V. M. Shalaev, N. Kinsey, and J. Leuthold, Nature 556, 483 (2018).
74. A. Grigorenko, M. Polini, and K. Novoselov, "Graphene plasmonics," Nat. Photonics 6, 749–758 (2012).
75. S. Ohno, Q. Li, N. Sekine, J. Fujikata, M. Noguchi, S. Takahashi, K. Toprasertpong, S. Takagi, and M. Takenaka, "Taper-less III-V/Si hybrid MOS optical phase shifter using ultrathin InP membrane," in Optical Fiber Communication Conference (OFC) 2020, OSA Technical Digest (Optical Society of America, 2020), p. M2B.6.
76. N. Youngblood and M. Li, "Integration of 2D materials on a silicon photonics platform for optoelectronics applications," Nanophotonics 6(6), 1205–1218 (2016).
77. R. Amin, R. Maiti, Y. Gui, C. Suer, M. Miscuglio, E. Heidari, R. T. Chen, H. Dalir, and V. J. Sorger, "Sub-wavelength GHz-fast broadband ITO Mach–Zehnder modulator on silicon photonics," Optica 7, 333–335 (2020).
78. K. J. Laboy-Juárez, S. Ahn, and D. E. Feldman, "A normalized template matching method for improving spike detection in extracellular voltage recordings," Sci. Rep. 9, 12087 (2019).
79. K. Kato, M. Kuwahara, H. Kawashima, T. Tsuruoka, and H. Tsuda, "Current-driven phase-change optical gate switch using indium-tin-oxide heater," Appl. Phys. Express 10(7), 072201 (2017).
80. H. Zhang et al., "Miniature multilevel optical memristive switch using phase change material," ACS Photonics 6(9), 2205–2212 (2019).
81. J. Zheng et al., "Nonvolatile electrically reconfigurable integrated photonic switch enabled by a silicon PIN diode heater," Adv. Mater. 32, 2001218 (2020).
82. M. Delaney, I. Zeimpekis, D. Lawson, D. W. Hewak, and O. L. Muskens, "A new family of ultralow loss reversible phase-change materials for photonic integrated circuits: Sb2S3 and Sb2Se3," Adv. Funct. Mater. 30, 2002447 (2020).
83. A. N. Tait et al., "Multi-channel microring weight bank control for reconfigurable analog photonic networks," in 2016 IEEE Optical Interconnects Conference (OI) (IEEE, 2016), pp. 104–105.
84. D. Dagostino, G. Carnicella, C. Ciminelli et al., "Low loss waveguides for standardized InP integration processes," in Proceedings of the 18th Annual Symposium of the IEEE Photonics Society Benelux Chapter (Technische Universiteit Eindhoven, 2013).
85. T. D. Bucio et al., "Silicon nitride photonics for the near-infrared," IEEE J. Sel. Top. Quantum Electron. 26(2), 8200613 (2019).
86. T. D. Bucio, A. Z. Khokhar, G. Z. Mashanovich, and F. Y. Gardes, "N-rich silicon nitride angled MMI for coarse wavelength division (de)multiplexing in the O-band," Opt. Lett. 43(6), 1251 (2018).
87. J. Chiles, S. M. Buckley, S. W. Nam, R. P. Mirin, and J. M. Shainline, "Design, fabrication, and metrology of 10 × 100 multi-planar integrated photonic routing manifolds for neural networks," APL Photonics 3, 106101 (2018).
88. T. F. de Lima, A. N. Tait, H. Saeidi, M. A. Nahmias, H. Peng, S. Abbaslou, B. J. Shastri, and P. R. Prucnal, "Noise analysis of photonic modulator neurons," IEEE J. Sel. Top. Quantum Electron. 26(1), 7600109 (2020).
89. R. Amin et al., "ITO-based electro-absorption modulator for photonic neural activation function," APL Mater. 7(8), 081112 (2019).
90. M. M. P. Fard, I. A. D. Williamson, M. Edwards, K. Liu, S. Pai, B. Bartlett, M. Minkov, T. W. Hughes, S. Fan, and T.-A. Nguyen, "Experimental realization of arbitrary activation functions for optical neural networks," Opt. Express 28, 12138–12148 (2020).
91. M. Miscuglio, A. Mehrabian, Z. Hu, S. I. Azzam, J. George, A. V. Kildishev, M. Pelton, and V. J. Sorger, "All-optical nonlinear activation function for photonic neural networks [invited]," Opt. Mater. Express 8, 3851–3863 (2018).
92. K. Nozaki et al., "Ultracompact O-E-O converter based on fF-capacitance nanophotonic integration," in 2018 Conference on Lasers and Electro-Optics (CLEO), San Jose, CA (IEEE, 2018), pp. 1–2.
93. M. Miscuglio et al., "Artificial synapse with mnemonic functionality using GSST-based photonic integrated memory," arXiv:1912.02221 (2019).
94. D. A. Bekele et al., "Signal reshaping and noise suppression using photonic crystal Fano structures," Opt. Express 26, 19596 (2018).
95. A. Jha, C. Huang, and P. R. Prucnal, "Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics," Opt. Lett. 45(17), 4819–4822 (2020).
96. A. Jha, C. Huang, T. Ferreira de Lima, and P. R. Prucnal, "High-speed all-optical thresholding via carrier lifetime tunability," Opt. Lett. 45(8), 2287–2290 (2020).
97. M. Ono, M. Hata, M. Tsunekawa et al., "Ultrafast and energy-efficient all-optical switching with graphene-loaded deep-subwavelength plasmonic waveguides," Nat. Photonics 14, 37–43 (2020).
98. A. Bazin, P. Monnier, X. Lafosse, G. Beaudoin, R. Braive, I. Sagnes, R. Raj, and F. Raineri, "Thermal management in hybrid InP/silicon photonic crystal nanobeam laser," Opt. Express 22, 10570–10578 (2014).
99. G. Crosnier, D. Sanchez, S. Bouchoule, P. Monnier, G. Beaudoin, I. Sagnes, R. Raj, and F. Raineri, "Hybrid indium phosphide-on-silicon nanolaser diode," Nat. Photonics 11(5), 297 (2017).
100. K. Lengle et al., "Modulation contrast optimization for wavelength conversion of a 20 Gbit/s data signal in hybrid InP/SOI photonic crystal nanocavity," Opt. Lett. 39(8), 2298 (2014).
101. K. Nozaki et al., "Photonic-crystal nano-photodetector with ultrasmall capacitance for on-chip light-to-voltage conversion without an amplifier," Optica 3, 483–492 (2016).
102. T. Alexoudi et al., "III–V-on-Si photonic crystal nanocavity laser technology for optical static random access memories," IEEE J. Sel. Top. Quantum Electron. 22(6), 295–304 (2016).