Recently, integrated optics has become a functional platform for implementing machine learning algorithms and, in particular, neural networks. Photonic integrated circuits can straightforwardly perform vector-matrix multiplications with high efficiency and low power consumption by using a weighting mechanism based on linear optics. However, this cannot be said for the activation function, i.e., the “threshold,” which requires either nonlinear optics or an electro-optic module with an appropriate dynamic range. Even though all-optical nonlinearity is potentially faster, its integration is currently challenging and rather inefficient. Here, we demonstrate an electroabsorption modulator based on an indium tin oxide layer monolithically integrated into silicon photonic waveguides, whose dynamic range is used as the nonlinear activation function of a photonic neuron. The thresholding mechanism is based on a photodiode, which integrates the weighted products and whose photovoltage drives the electroabsorption modulator. The synapse and neuron circuit is then constructed to execute a 200-node MNIST classification neural network, which is used to benchmark the nonlinear activation function and is compared with an equivalent electronic module.
I. INTRODUCTION
With the ongoing advancement of neural network systems, there is a pressing demand for new technological paradigms that can perform advanced artificial intelligence tasks without trading off throughput (operations/s) and power dissipation. Photonics based artificial neurons can potentially provide the archetypal solution for this specific challenge.1–3 The main advantage of photonics over current digital electronics implementations is that distinct signals due to their wave-nature can be straightforwardly and efficiently combined exploiting attojoule efficient electro-optic (EO) modulators,4–7 phase shifters, and combiners, simplifying essential operations such as weighted sum or addition, vector matrix multiplications, or convolutions.8 Moreover, photonics enables high parallelism,9 hence a higher baud-rate, since multiple wavelengths can travel in the same physical channel by exploiting wavelength-division multiplexing (WDM).8,10–13 Additionally, photonics is marked by a further degree of freedom in modulating the information carried by the optical waves since a signal can be modulated by altering its phase, amplitude, or polarization.
In a photonic neural network, a photonic neuron performs the following operations: it receives multiple input signals (fan-in), weights each by a coefficient and sums them (weighted sum or addition), applies a nonlinear process (nonlinear activation function), and sends its output to other neurons or nodes (fan-out). The nonlinear activation function (NL AF) plays a key role in the network: it is pivotal to the convergence into final states because it discriminates data and suppresses noise. Different NL AFs with significantly different ranges and trends14 have been proposed and extensively investigated, each offering advantages for particular applications. In more detail, experimental and numerical all-optical nonlinear modules based on saturable and reverse absorption,15,16 graphene excitable lasers,17,18 two-section distributed-feedback (DFB) lasers,19 quantum dots,20 disk lasers,21,22 induced transparency in quantum assembly,15 etc., have recently been reported and have shown promising efficiency and throughput for different kinds of neural networks and applications, ranging from convolutional and spiking neural networks to reservoir computing. However, a more straightforward implementation is currently attained by exploiting electro-optically tuned nonlinear materials23,24 or an absorptive modulator directly connected to a photodiode, as shown in Refs. 14 and 25–27. In this case, the photogenerated current, proportional to the detected optical power at the weighted addition, alters the voltage drop across the active material, thus changing its carrier concentration and consequently the effective modal index of the propagating waveguide mode. This approach is affected by RC latency and by the electro-optic conversion; therefore, it trades off baud rate against energy efficiency.28 Nonetheless, it still provides higher controllability, reconfigurability, reliability, and ease of integration than more exotic design choices.
In this case, the adoption of a specific modulator type will directly impact the shape of the nonlinear activation function and, subsequently, the inference mechanism and ultimately the performance of the neural network. Indeed, the performance of a modulator is directly related to the underlying physics of the entirety of the optoelectronic devices, including active material and device configuration. Therefore, for achieving efficient modulation and, consequently, promising performance for the neural network, one should concurrently engineer both. On the device engineering side, maximizing the fraction of the optical field in the mode overlapping with the active material, namely, intensifying the modal confinement factor, is a suitable and accessible strategy for enhancing the modulation efficiency and the dynamic range of an electro-optic modulator embedded in a photonic integrated circuit. For achieving that, current schemes consist of using subdiffraction limited plasmonic structures29–36 or photonic cavities35 aiming to maximize the light matter interaction in order to achieve a rather high modulation performance and low energy-per-compute surpassing electronic efficiency, while compensating in terms of insertion losses (optical mode hybridization) due to the plasmonic nature of the mode. Another characteristic to consider when engineering an electroabsorption (EA) modulator is the modulation bandwidth, which is the result of material choices and device configuration and, if well engineered, can enable high-throughput communication links. Though necessary, a thorough device configuration might not be enough for achieving an efficient complex index modulation due to inefficient active material choices. 
The main aspect to consider when designing and engineering an effective electroabsorption modulator (EAM) is, in fact, the variation of the complex refractive index under applied bias (i.e., carrier tunability), which is inherent to the selected active material.26 Silicon (Si) is the conventional material choice because fabrication facilities benefit tremendously from the mature Si process, but the inherently low tunability of Si under electrical bias leads to inadequate performance at scale, as longer modulators must be employed to achieve the desired dynamic range for the NL AF.
Moreover, to respond to the demand for densely integrated neural networks, the material of choice for the EA modulator needs to be easily and consistently integrated in a photonic platform. Preferably, the selected active material should be CMOS compatible, thus allowing integration with on-chip memory and converters such as the digital-to-analog converter (DAC) and analog-to-digital converter (ADC), enabling a large-scale functional photonic network-on-chip. One class of materials that meets these requirements is transparent conductive oxides (TCOs), to which indium tin oxide (ITO) belongs. The advantages of using ITO as an active material in EA modulators are manifold: ITO films are able to deliver unity-strong index modulation when placed in a plasmonic cavity configuration30,37–39 and also show strong modulation in the epsilon-near-zero (ENZ) regime40 in the telecommunication frequency band,41 supporting both strong index modulation and slow-light effects.42 In recent years, the massive industrial demand for ITO43 and research-level advancements have favored the rapid development of controllable and tunable processes for ITO deposition, by radio-frequency (RF)6,44–47 and DC sputtering48 and by evaporation,49 which have enabled a high-yield, reliable process and monolithic hybrid integration.50
In this paper, we start by analyzing the free carrier absorption dynamics in ITO (Sec. II A) and then build a model of an amplifier-coupled electro-optic neuron based on the electroabsorption modulator (Sec. II B). Subsequently, we derive its cascaded signal-to-noise ratio (SNR) as a function of its length and laser power. We compare the resulting model with a silicon-based modulator, which is still widely employed with successful results33,51–55 (Sec. III A). As a subcase of our study, we also experimentally demonstrate an ITO-based absorption modulator which, owing to intrinsic nonidealities such as initial doping and asymmetric behavior, outperforms the theoretical model initially described. Finally, we evaluate and benchmark the performance of the experimentally fabricated ITO modulator as an NL AF on a toy network, an MNIST feed-forward neural network with two layers of 100 nodes each, using optimized parameters from the SNR analysis. The accuracy of the trained network, including noise, reached 98%, also exploiting the nonideal and asymmetric behavior of the transfer function, discussed in Sec. III B. In conclusion, our results show that ITO-based electroabsorption modulators offer a significantly higher modulation dynamic range and efficiency than similar silicon-based devices and hence are suitable for implementing the nonlinear activation functions in electro-optic neural networks, as they require only a small fraction of the power and footprint. We believe that this work could constitute a viable approach toward the implementation of a photonic perceptron mechanism based on ITO modulators, both for weighting the inputs and for implementing nonlinear activation functions, and could also serve as a comprehensive guide for integrating other materials in future neural network implementations.
II. METHODS
Initially, we study the electro-optical absorption dynamics in ITO dictated by the free carrier mechanism, which is essential for determining the model for the absorption as a function of the injected carriers and bias voltage. We compare these results with silicon carrier dynamics, as the underlying modulation mechanism is the same for both Si and ITO: free carrier absorption arising from accumulation/depletion of the carriers.
A. Free carrier absorption dynamics
Let us consider a free electron gas in the conduction band of a semiconductor such as Si, ITO, or any other similar material, with a carrier density of Nc(x, y). A schematic of a generic waveguide modulator employing such active materials is shown in Fig. 1(a). Here, we derive the expression for the absorption cross section of free electrons in such materials. The relative permittivity of the aforementioned free electron plasma in the Drude approximation can be written as
ε(ω) = ε∞ − q²Nc(x, y)/[ε0meff(ω² + iωγ)], (1)
where ε∞ is the dielectric constant of the undoped semiconductor also referred to as the high-frequency or “background” permittivity, q is the electronic charge, meff is the conduction effective mass, ω is the angular frequency, and γ is the collision frequency of the free carriers quantified by the detuning from any interband resonances thereof. The rate of absorption of electromagnetic energy is proportional to the imaginary part of the dielectric constant as
where the effective electric field, Eeff, is the same as the total field, E. Now, let us consider an electromagnetic wave in a waveguide with width W, propagating along the z-direction [Fig. 1(a)] with the propagation constant β, expressed as E = E(x, y)exp[i(βz − ωt)]. We can evaluate the Poynting vector as Sz(x) = EyHx. According to Maxwell's equations, |Hx| = β|Ey|/(ωμ0) = neff|Ey|/η0, with μ0 being the vacuum permeability. Thus, the time-averaged Poynting vector magnitude is ⟨Sz⟩ = neff|Ey|²/(2η0), where η0 is the impedance of free space (≃377 Ω) and the effective index has been introduced as neff = βc/ω. Integrating over the waveguide cross section [Fig. 1(a)], we obtain
The total power flow then becomes
Here, N2D is defined as the 2-dimensional planar carrier concentration and the effective thickness is
For a very narrow active layer, this approximation can be inferred; otherwise, it is slightly smaller. This definition of the effective thickness may differ from others in the literature, but the difference is not significant and usually lies only in the arrangement of the various indices. Note that this definition accounts for the fact that the active layer is not necessarily at the center of the waveguide, i.e., Ea0 need not be the peak electric field. The effective thickness can also be thought of as a relative measure of the confinement of the optical mode to the active layer, through the confinement factor Γ ≃ da/teff. It is worth mentioning that the only waveguide geometrical parameter distinguishing the modulator’s performance under various optical waveguide modes is the effective thickness, teff; i.e., different modes can have different teff values, and their corresponding performance metrics scale with this parameter. As such, we carry out our entire performance analysis in units of teff to isolate the material contributions from any modal or geometric engineering. Now, we can obtain from (3) as
The absorption coefficient can be found as α(ω) = σabs(ω)N2D/teff, where the absorption cross section is expressed as
Here, α0 is the fine structure constant, and L(ω) = γ²/(ω² + γ²) represents a Lorentzian line shape that has a maximum value of unity at resonance. The effective detuning then becomes γeff = γ + ω²/γ. Note that the maximum absorption is achieved at γ = ω, where γeff = 2ω, and
Evidently, this expression is similar to the absorption cross section of semiconductor quantum dots (QDs) with one major difference—due to nonresonant character of the free carrier absorption, the cross section for the free carriers is orders of magnitude lower than that for the resonant QDs.28 As one can see, the absorption cross section does not depend on too many things—at the resonance, it depends only on the resonance width, or, better Q = ω/γ and the optical dipole transition rate, r12. We can invoke the oscillator sum rule (assuming all the dipoles along the axis parallel to the electric field are lined up) which states . Therefore, we can introduce the oscillator strength as and consequentially get
If we now assume that the material used is a semiconductor with a bandgap, Eg, and the momentum matrix element quantified by Pcv, we can actually relate the effective mass of electron to the wavelength
where , for a wide range of semiconductors. Therefore, we obtain (assuming that the bandgap is close to the photon energy)
Thus, while modulators based on two-level systems inherently benefit from a small γ, the opposite holds for free carrier absorption based modulators, which benefit from a large γ. Also, we see that for wavelengths in the telecom range the cross section actually scales with the broadening γ caused by scattering. Therefore, ITO intrinsically provides more modulation per unit charge than high-quality silicon. We found in our previous work that all materials operating with Pauli blocking (saturable absorption schemes) show comparable values of the absorption cross section (σ ∼ 10⁻¹⁴ cm² at room temperature),28 while free carriers offer worse performance due to the nonresonant character of the absorption, since according to (8) γeff = γ + ω²/γ ≥ 2ω for free carriers. We further analyze our perturbative results by calculating the dependence of the absorption coefficient on the injected (gate-induced) carriers, N2D, the free carriers responsible for absorption tuning in electrostatic modulators, as
Here, γ = 1/τ is the carrier scattering rate, i.e., the collision frequency, with τ being the mean scattering time corresponding to the mean free path between collisions of the free electrons in the plasma. The electron mobility μ and τ are related by μ = |q|τ/meff. We compare two free carrier absorption based materials for the absorption modulators employed in the following neuromorphic study, namely, silicon and ITO. For Si, the conductivity effective mass, meff, is taken as 0.26m0, where m0 is the free electron rest mass,56 and μ is taken as 1100 cm² V⁻¹ s⁻¹ at a carrier concentration of 10¹⁶ cm⁻³ (i.e., electrons for silicon).57 We used the dielectric constant of undoped silicon, ε∞ = 11.66 (Ref. 58). Unlike doped silicon, ITO, whose chemical composition is usually given as In2O3:SnO2, can be considered an alloy, as the concentration of Sn relative to In can be as high as 10%. Several previous studies have calculated the permittivity of ITO using the experimentally measured reflectance and transmittance, and we chose the fitting result of Michelotti et al. Since γ depends on the deposition conditions, defect states, film thickness, etc., in our analysis we have taken γ = 1.8 × 10¹⁴ rad s⁻¹, a background permittivity ε∞ = 3.9, and meff = 0.35m0 (Refs. 38 and 59–61). In the near-infrared, ITO is quasimetallic since its free electrons dictate its optical response. In fact, ITO and related transparent conducting oxides have recently been explored as plasmonic materials in the optical frequency range.60,62
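The parameter values quoted above can be turned into numbers with a short script. The following is a minimal sketch, not the authors' code: it converts the Si mobility into a scattering rate through μ = |q|τ/meff, evaluates γeff = γ + ω²/γ at 1550 nm for both materials, and compares the cross sections under the assumption that σabs scales as 1/(meff γeff), which follows from the imaginary part of a Drude permittivity when refractive-index prefactors are ignored. The function names are hypothetical.

```python
import math

Q_E = 1.602176634e-19      # elementary charge (C)
M_0 = 9.1093837015e-31     # free electron rest mass (kg)
C_0 = 2.99792458e8         # speed of light in vacuum (m/s)


def scattering_rate_from_mobility(mobility_cm2_per_vs, m_eff_rel):
    """Return gamma = 1/tau using mu = |q| * tau / m_eff."""
    mobility_si = mobility_cm2_per_vs * 1e-4   # cm^2/(V s) -> m^2/(V s)
    return Q_E / (mobility_si * m_eff_rel * M_0)


def effective_detuning(gamma, omega):
    """Return gamma_eff = gamma + omega^2 / gamma, as defined in the text."""
    return gamma + omega ** 2 / gamma


# Angular frequency at the telecom wavelength of 1550 nm.
omega = 2.0 * math.pi * C_0 / 1550e-9

# Material parameters quoted in the text.
gamma_si = scattering_rate_from_mobility(1100.0, 0.26)
gamma_ito = 1.8e14

geff_si = effective_detuning(gamma_si, omega)
geff_ito = effective_detuning(gamma_ito, omega)

# Assumption: sigma_abs ~ 1 / (m_eff * gamma_eff); index factors are ignored.
sigma_ratio_ito_to_si = (0.26 * geff_si) / (0.35 * geff_ito)

print(f"gamma (Si)      = {gamma_si:.3e} rad/s")
print(f"gamma_eff (Si)  = {geff_si:.3e} rad/s")
print(f"gamma_eff (ITO) = {geff_ito:.3e} rad/s")
print(f"sigma_ITO / sigma_Si ~ {sigma_ratio_ito_to_si:.1f}")
```

Under these assumptions the ratio comes out at roughly a factor of twenty, the same order as the ratio of the Si and ITO device lengths quoted later in this section for a 10 dB dynamic range.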
FIG. 1. Free carrier based absorption modulators in silicon (Si) and indium tin oxide (ITO). (a) Schematic of a generic waveguide modulator showing the cross-sectional structure (width, W, and active layer thickness, da) and the modal effective thickness, teff. The mode inside the structure propagates in the z-direction with propagation constant βeff; L is the length of the modulator; Vd and Id are the drive voltage and drive current, respectively, responsible for the injected charge accumulation/depletion (±Qinj) at the active layer/gate interface. (b) Absorption, α, in units of 1/teff (mm⁻¹) for the cross-sectional geometry in (a) vs 2-dimensional planar carrier concentration, N2D (cm⁻²). (c) Absorption, α (dB), vs charge injected per unit cross-sectional area, Qinj (pC μm⁻²). A 10 dB ceiling for the attainable extinction ratio (ER) was chosen arbitrarily. (d) Absorption, α (dB), vs drive voltage, Vd (V), corresponding to electrostatic gating in the presence of a gate dielectric with εeff = 10 and thickness dspacer = 5 nm.
The inherently low broadening parameter γ of Si limits the attainable absorption for a given change in carrier density, even at higher concentrations. More importantly, the slope of the absorption vs carrier density characteristic determines the modulator performance [i.e., slope steepness ∼ extinction ratio (ER)/Vbias], and here ITO stands out, as the ITO curve is considerably steeper than the Si one [Fig. 1(b)]. Additionally, we evaluate the total absorption as a function of the injected charge per unit waveguide cross-sectional area, Qinj = qN2DL/teff, setting a 10 dB ER ceiling [Fig. 1(c)]. In accordance with our perturbative estimation, the injected charge follows, as expected, Qinj ∼ 2.2q/σ.28 Furthermore, the drive voltage can be obtained via Vd = Qinj/Cg, where Cg is the gate capacitance [Fig. 1(d)]. A dielectric spacer, such as an oxide layer that facilitates gating, is considered in this analytical formalism. A gate oxide with a relative dielectric constant εeff of 10 and a gate spacer thickness dspacer of 5 nm was chosen for these results. As the drive voltage is increased, free carriers leading to dispersive effects are induced and a corresponding net increase occurs in the carrier concentration. Our analytic results give the device length required for the aforementioned 10 dB modulation dynamic range as 33 905 for Si and 1768 for ITO, in units of teff.
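To make the gating relation Vd = Qinj/Cg concrete, the sketch below evaluates the parallel-plate capacitance per unit area of the gate stack quoted above (εeff = 10, dspacer = 5 nm) and the voltage needed to induce a given sheet carrier density. It assumes an ideal parallel-plate capacitor and treats the injected charge as a sheet charge on the gate area; the example density of 10¹³ cm⁻² is a hypothetical value chosen only for illustration, and the function names are not from the original work.

```python
EPS_0 = 8.8541878128e-12   # vacuum permittivity (F/m)
Q_E = 1.602176634e-19      # elementary charge (C)


def gate_capacitance_per_area(eps_rel, thickness_m):
    """Ideal parallel-plate capacitance per unit area in F/m^2."""
    return EPS_0 * eps_rel / thickness_m


def drive_voltage(sheet_density_cm2, cap_per_area):
    """Voltage that induces a sheet carrier density (cm^-2) on the gate."""
    sheet_density_m2 = sheet_density_cm2 * 1e4    # cm^-2 -> m^-2
    return Q_E * sheet_density_m2 / cap_per_area


# Gate stack quoted in the text: eps_eff = 10, d_spacer = 5 nm.
c_gate = gate_capacitance_per_area(10.0, 5e-9)
c_gate_fF_per_um2 = c_gate * 1e3                  # 1 F/m^2 = 1e3 fF/um^2

# Hypothetical induced sheet density, for illustration only.
v_example = drive_voltage(1e13, c_gate)

print(f"C_g = {c_gate_fF_per_um2:.2f} fF/um^2")
print(f"V_d for 1e13 cm^-2 = {v_example:.3f} V")
```

A thinner or higher-permittivity spacer raises Cg and lowers the drive voltage for the same induced charge, which is why the gate dielectric enters the voltage axis of Fig. 1(d).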
While both ITO and Si are capable of tuning the index via the free carrier dependent Drude model, ITO exhibits several significant advantages over Si with respect to the permittivity variation. First, the carrier concentration in ITO can exceed that of Si by at least a couple of orders of magnitude; the concentration of indium atoms in ITO reaches a few percent, which is significantly beyond the attainable donor concentration in doped Si.63 Second, the effect of a carrier concentration change on the refractive index is more dramatic in ITO than in Si. This can be attributed to the higher bandgap and consequently lower refractive index of ITO compared to Si. The higher bandgap allows the density of states (DoS) near the band edge to be strongly confined, leading to a higher free carrier concentration necessary to change the index. For a discrete spectrum, the DoS consists of a number of delta peaks at the energy levels of the system, weighted by the degeneracy of each level. If a change of the carrier concentration ∂N2D (e.g., due to an applied bias) causes a change in the relative permittivity (dielectric constant) ∂ε, the corresponding change in the refractive index can be written as ∂n = ∂(√ε) ∼ ∂ε/(2√ε), and hence, the refractive index change is greatly enhanced when the permittivity ε is small.49,62 Third, the presence of an epsilon-near-zero (ENZ) region in the tolerable carrier concentration range within electrostatic gating constraints makes ITO a promising substitute for conventional Si. Operation in the vicinity of the ε ∼ 0 (ENZ) condition can result in stronger modulation effects, as the optical mode experiences gradually intensifying slow-light effects closer to the ENZ region as a result of increasing light-matter interaction.
The inherently low background permittivity of ITO places the ENZ carrier density at the telecommunication wavelength of 1550 nm within an experimentally attainable range [≃(6–7) × 10²⁰ cm⁻³].49,62,63 Interestingly, the Si ENZ condition can be approached at a carrier concentration of about 3.38 × 10¹⁹ cm⁻³ when operating at a wavelength of 10 μm, in the IR region away from our region of interest.
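The ENZ carrier densities quoted above can be checked against the material parameters of Sec. II A. The sketch below assumes a Drude permittivity of the form ε(ω) = ε∞ − q²N/[ε0 meff(ω² + iωγ)] and solves Re ε = 0 for the carrier density N. The Si scattering rate used here is derived from the low-doping mobility quoted earlier, which is an assumption (it has a negligible effect at 10 μm); the function name is hypothetical.

```python
import math

Q_E = 1.602176634e-19      # elementary charge (C)
M_0 = 9.1093837015e-31     # free electron rest mass (kg)
EPS_0 = 8.8541878128e-12   # vacuum permittivity (F/m)
C_0 = 2.99792458e8         # speed of light in vacuum (m/s)


def enz_carrier_density_cm3(eps_inf, m_eff_rel, gamma, wavelength_m):
    """Carrier density (cm^-3) at which Re(eps) = 0 for a Drude plasma.

    Re(eps) = eps_inf - q^2 N / (eps_0 m_eff (omega^2 + gamma^2)) = 0
    gives N = eps_0 m_eff eps_inf (omega^2 + gamma^2) / q^2.
    """
    omega = 2.0 * math.pi * C_0 / wavelength_m
    n_m3 = EPS_0 * m_eff_rel * M_0 * eps_inf * (omega ** 2 + gamma ** 2) / Q_E ** 2
    return n_m3 * 1e-6                             # m^-3 -> cm^-3


# ITO at 1550 nm with the parameters quoted in Sec. II A.
n_enz_ito = enz_carrier_density_cm3(3.9, 0.35, 1.8e14, 1550e-9)

# Si at 10 um; gamma from mu = 1100 cm^2/(V s) and m_eff = 0.26 m_0 (assumed).
gamma_si = Q_E / (1100.0e-4 * 0.26 * M_0)
n_enz_si = enz_carrier_density_cm3(11.66, 0.26, gamma_si, 10e-6)

print(f"ITO ENZ density at 1550 nm: {n_enz_ito:.2e} cm^-3")
print(f"Si  ENZ density at 10 um  : {n_enz_si:.2e} cm^-3")
```

With these inputs the ITO value falls inside the (6–7) × 10²⁰ cm⁻³ window and the Si value lands close to 3.38 × 10¹⁹ cm⁻³, consistent with the figures stated in the text.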
B. Neural network
The model of the ITO absorption mechanism is then incorporated in the NL AF of a neural network. A standard broadcast-and-weight feed-forward network architecture, in which connections are configured by microring weight banks19 and which comprises two layers of 100 nodes each, is shown in Fig. 2. The broadcast-and-weight network shown in Fig. 2(b) is used to implement a fully connected neural network whose topology is shown in Fig. 2(a). The network and its functioning have been previously discussed in Refs. 13 and 64 and present similarities with fiber networking broadcasting techniques. It consists of a group of nodes (N = 100) sharing a common medium, namely, the broadcast loop (BL). The node is the fundamental unit cell of the network and performs both physical and logical functions, as required for broadcast-and-weight networking and neuromorphic processing, respectively. Each node comprises a tunable spectral filter bank,65 which can be implemented by a series of microring resonators. The resonance of the filters can be thermally or electronically controlled, allowing a continuously tunable portion of the corresponding wavelength channel to be dropped, which enables neural weighting. The broadcast loop, which could be implemented by a circular fully multiplexed waveguide, provides an all-to-all interconnection, supporting up to N² potential connections between participating units (not all of which need to be used). The network can be constructed [Fig. 2(b)] using interfacial photonic neurons that connect between broadcast loops (waveguides). The network architecture footprint can be computed from metrics of the underlying building blocks, as fabricated by accredited foundries and reported in recent literature. To a first approximation, the overall footprint of a neuron bank is estimated to be 1600 μm² (), N being the number of neurons.
Assuming that every connection has a dedicated tunable microring resonator (MRR) filter, that all filters are critically coupled to the bus waveguide, and that they are single-pole, the N-to-N network footprint is 100 × 1600 μm² = 0.16 mm², which is still chip-scale. In our adaptive architecture, which considers the reuse of the bandwidth, the broadcast loop can be at least 4 cm long (4 μm × 100²). On the other hand, the power budget and noise analysis of the network cannot be mapped trivially (e.g., cross talk arises if thermal tuning is used to control the weights), and the restrictions might be application dependent and generically loose due to the statistical and intrinsically noisy nature of neuromorphic algorithms. As a toy example, the network is trained to classify the 10 digits in a set of grayscale 28 × 28 pixel images of handwritten digits (MNIST dataset66). Note that each pixel is encoded on a unique wavelength carrier in a wavelength-division multiplexed (WDM) broadcast scheme. Subsequently, the network inference performance is benchmarked. The simulation environment was developed in Python using Keras67 and TensorFlow.68 The activation function was replaced with a custom nonlinear transfer function resulting from our ab initio ITO based electroabsorption modulator model. The weighting mechanism relies on microring resonator banks, as proposed by Tait et al.,13,25 which can potentially support more than 100 channels.13 In this case, the weights were bounded between −1 and 1 to simulate input optical weighting by rings in a push-pull configuration.
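The signal path of a single photonic neuron described above (bounded weights, summation at the photodiode, and an electroabsorption modulator acting as the nonlinearity) can be sketched in a few lines. This is only an illustration under stated assumptions: the transfer function below is a smooth placeholder with a 10 dB extinction ceiling, not the ITO model of Eq. (12), and all function and parameter names (`eam_transmission`, `v_half`, `v_scale`) are hypothetical.

```python
import math


def clip_weight(w):
    """Bound a weight to [-1, 1], mimicking push-pull microring weighting."""
    return max(-1.0, min(1.0, w))


def weighted_sum(inputs, weights):
    """Net photodiode signal: sum of w_i * x_i over the WDM channels."""
    return sum(clip_weight(w) * x for w, x in zip(weights, inputs))


def eam_transmission(v_drive, v_half=0.0, v_scale=1.0, er_db=10.0):
    """Placeholder modulator transfer function (not the model of Eq. (12)).

    The absorption-length product rises smoothly from 0 to the value that
    yields `er_db` of extinction, and the transmission is exp(-alpha * L).
    """
    alpha_l_max = er_db * math.log(10.0) / 10.0
    # tanh form of the logistic function; numerically safe for large |v|.
    turn_on = 0.5 * (1.0 + math.tanh((v_drive - v_half) / (2.0 * v_scale)))
    return math.exp(-alpha_l_max * turn_on)


def neuron_output(inputs, weights, laser_power=1.0):
    """Laser power transmitted by the modulator driven by the weighted sum."""
    return laser_power * eam_transmission(weighted_sum(inputs, weights))


# Example with three hypothetical channel powers and weights.
example = neuron_output([0.2, 0.8, 0.5], [0.9, -0.4, 1.7])
print(f"neuron output = {example:.4f}")
```

In a Keras model the same idea would be expressed as a custom activation applied after a dense layer with a weight constraint; the plain-Python form is used here only to keep the sketch self-contained.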
FIG. 2. (a) A topological representation of a fully connected neural network and (b) a photonic feedforward neural network implementation using the broadcast-and-weight protocol. Here, feedforward networks can be constructed using interfacial photonic neurons that connect between broadcast loops (BLs) (waveguides).19 (c) Photonic neuron and perceptron mechanism. WDM inputs are weighted through tunable microring resonators (MRRs). The optical power is accumulated and detected by a balanced photodiode. The photovoltage drives the electroabsorption modulator, which nonlinearly modulates the laser power, mimicking an activation function.
Noise is modeled in Keras as additive Gaussian noise with a standard deviation proportional to the root-mean-square (rms) noise power. Shot noise at the photodiodes is modeled as ⟨ish²⟩ = 2q(Is + Id)Δf, where Is is the signal current, Id is the dark current taken as 0.05 nA, and Δf is the bandwidth. In addition, thermal noise is modeled as ⟨ith²⟩ = 4kBTΔf/Req, where kB is the Boltzmann constant, T is taken as the room temperature of 300 K, and Req is the equivalent resistance of 500 Ω.
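For orientation, the noise levels implied by these parameters can be estimated with a short script. The sketch assumes the standard textbook expressions for shot noise, 2q(Is + Id)Δf, and thermal (Johnson) noise, 4kBTΔf/Req, in mean-square current form; the 1 GHz bandwidth is the simulated bandwidth stated in Sec. III A, while the 1 mA signal current is a hypothetical value chosen only for illustration.

```python
import math
import random

Q_E = 1.602176634e-19      # elementary charge (C)
K_B = 1.380649e-23         # Boltzmann constant (J/K)


def shot_noise_rms(i_signal, i_dark, bandwidth):
    """rms shot-noise current: sqrt(2 q (I_s + I_d) df)."""
    return math.sqrt(2.0 * Q_E * (i_signal + i_dark) * bandwidth)


def thermal_noise_rms(temperature, resistance, bandwidth):
    """rms thermal-noise current: sqrt(4 k_B T df / R_eq)."""
    return math.sqrt(4.0 * K_B * temperature * bandwidth / resistance)


def noisy_current(i_signal, sigma, rng):
    """Signal current with additive zero-mean Gaussian noise."""
    return i_signal + rng.gauss(0.0, sigma)


bandwidth = 1e9            # Hz, simulated bandwidth from Sec. III A
i_dark = 0.05e-9           # A, dark current from the text
i_signal = 1e-3            # A, hypothetical signal current

sigma_shot = shot_noise_rms(i_signal, i_dark, bandwidth)
sigma_thermal = thermal_noise_rms(300.0, 500.0, bandwidth)
# Independent noise sources add in quadrature.
sigma_total = math.hypot(sigma_shot, sigma_thermal)

rng = random.Random(0)     # seeded for repeatability
sample = noisy_current(i_signal, sigma_total, rng)

print(f"shot noise    = {sigma_shot:.3e} A rms")
print(f"thermal noise = {sigma_thermal:.3e} A rms")
print(f"total noise   = {sigma_total:.3e} A rms")
```

At this hypothetical signal level the shot-noise term dominates; at much lower photocurrents the thermal term, which does not depend on the signal, takes over.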
1. Training
The neural network models are trained in Keras67 with the Adagrad69 method using a categorical cross-entropy loss function, a learning rate of 0.005, zero decay, 500 training cycles (epochs), and a batch size of 1024. The amplitude of the noise power model is reduced during training to 10% of the final inference noise power to allow for training convergence while keeping the model from overfitting to a noiseless activation function.
2. Parameter optimization
The photonic neural network differs from the digital neural network in three ways. First, the noise of the photonic neural network is analog, originating from shot and thermal noise, and, unlike the digital neural network, it has no quantization noise. Second, the photonic neural network, as an analog system, cascades noise. However, unlike an analog repeater, the analog neural network acts as a signal regenerator when the activation function is pushed into saturation, making the analog neural network a quasidigital system in certain configurations. Finally, without parameter optimization, the bias and dynamic range of the photonic neural network shift with depth into the network.
To counteract the shifting bias and dynamic range, an optimization procedure was designed to locate the center and range of the activation function at each layer of the network, as follows. The activation function is swept over 200 input voltage points with no noise, and the point with the greatest slope is found. A range is then built by adding points to the left and right of this starting point until the slope drops below 0.05. After the center of the dynamic range has been identified, a DC bias is applied to the modulator to move the minimum input to the center of the identified range.
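The range-finding procedure can be sketched as follows. This is one possible reading of the description, not the authors' implementation: the slope is taken as the magnitude of a finite difference between neighboring sweep points, the threshold of 0.05 is applied to that absolute slope, and the DC bias is computed as the offset that maps the minimum input onto the center of the range. The function names and the `tanh` test curve are hypothetical.

```python
import math


def find_dynamic_range(activation, v_min, v_max, n_points=200, slope_floor=0.05):
    """Locate the high-slope region of a noiseless activation function.

    Returns (v_low, v_high, v_center) of the contiguous interval around the
    steepest point in which the finite-difference slope stays >= slope_floor.
    """
    step = (v_max - v_min) / (n_points - 1)
    volts = [v_min + i * step for i in range(n_points)]
    values = [activation(v) for v in volts]
    # Slope magnitude on each of the n_points - 1 sweep intervals.
    slopes = [abs(values[i + 1] - values[i]) / step for i in range(n_points - 1)]

    # Start from the interval with the greatest slope.
    start = max(range(len(slopes)), key=slopes.__getitem__)
    low = high = start
    # Grow the range to the left and right until the slope drops too low.
    while low > 0 and slopes[low - 1] >= slope_floor:
        low -= 1
    while high < len(slopes) - 1 and slopes[high + 1] >= slope_floor:
        high += 1

    v_low, v_high = volts[low], volts[high + 1]
    return v_low, v_high, 0.5 * (v_low + v_high)


def dc_bias(input_min, v_center):
    """Offset that moves the minimum input to the center of the range."""
    return v_center - input_min


# Hypothetical test curve standing in for the modulator transfer function.
v_low, v_high, v_center = find_dynamic_range(math.tanh, -5.0, 5.0)
bias = dc_bias(-1.0, v_center)
print(f"range = [{v_low:.2f}, {v_high:.2f}] V, center = {v_center:.2f} V")
```

Running the search once per layer, on the transfer function as seen at that layer, is what keeps the operating point from drifting with depth.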
III. RESULTS AND DISCUSSION
A. ITO electroabsorption activation function: Inference results
An activation function derived from Eq. (12) for both ITO and Si based absorption modulators, coupled to a photodiode [Fig. 2(c)] through an ideal transimpedance amplifier, was trained in Keras67 at a simulated bandwidth of 1 GHz. The length of both modulators was set to the 10 dB dynamic range point for ITO, 1768 in units of the effective thickness teff, to demonstrate reduced-footprint operation in both materials. We keep the scaling independent of the modal structure (only material dependent, i.e., Si/ITO) by using the effective thickness as the underlying unit for the device length: teff can differ between modes, but the physical length required for modulation relates back to teff because L/teff depends only on material constraints (i.e., maximum attainable absorption).28 The simulation resulted in an accuracy greater than 90% for laser powers above 5 mW (Fig. 3) for the ITO modulator, while the Si modulator never exceeded 30% accuracy, even at the final sweep power of 15 mW. Note that here we compare Si and ITO modulators with the same physical footprint, given by the modulator length in units of the effective thickness (∼L/teff) needed to achieve 10 dB modulation in the ITO device. Moreover, as discussed in Sec. II A, this effect is due to the lower modulation range (ER/Vbias) of Si compared to ITO, ascribable to the inherently low broadening parameter γ of Si, which drastically limits the attainable absorption even at higher carrier density levels. As such, a similar accuracy level in neural network tasks can be obtained with Si based modulators only at a substantially larger linear footprint than ITO-based electroabsorption modulators, when used as nonlinear activation functions.
Even with the enlarged footprint, Si based devices could be susceptible to accuracy challenges, as opposed to their highly scalable ITO counterparts, because of the steeper dynamic response of the latter [Fig. 1(d)].
Simulated MNIST accuracy results for ITO and Si modulator activation functions swept over laser power for a fully connected neural network of two layers, each with 100 nodes.
B. Experimental characterization of an ITO based electroabsorption modulator device
Using the same approach presented in Ref. 30, we fabricated two EA modulators of different lengths (5 and 20 μm). The device consists of a Si waveguide (800 nm × 340 nm) with a stack placed on top, which comprises a 10 nm ITO layer, a gate oxide (SiO2, tox = 20 nm), and a metallic gold pad (Fig. 4). The stack was fabricated using electron-beam lithography to define the pattern, followed by electron-beam evaporation and lift-off processing. In this capacitor configuration, the applied potential modulates the carrier density of the ITO film, thus tuning the portion of the electric field absorbed by the thin layer. A 1550 nm TM mode traveling in the waveguide undergoes a shift in its mode profile due to the presence of the plasmonic stack, which enhances the mode overlap with the active ITO layer. Increasing the carrier concentration through active electrostatic gating strengthens the modulation effect because of the increased free carrier absorption of the optical mode, as previously discussed. The characterized electroabsorption modulators display extinction ratios of 5 dB and more than 20 dB for device lengths of 5 and 20 μm, respectively, when a voltage bias is applied between the ITO and the metal contact. The insertion losses are relatively low, approximately 1 dB for the 5 μm device. Going beyond the comparison between the proposed device and free carrier absorption in Si modulators, in Table I we compare our ITO based EAM with the state-of-the-art silicon and LiNbO3 based traveling-wave modulators currently employed in integrated photonic circuits. The demonstrated EAM exhibits a competitive figure of merit (FOM): it achieves its modulation range in a rather compact linear footprint (∼1 dB/μm) with considerably low insertion losses (1 dB). This ITO based EAM therefore represents a practical alternative to other, lower-performing, state-of-the-art Si and LiNbO3 EO modulators; it does not require cavity feedback (e.g., microring resonators) and is, therefore, characterized by a broadband response. In our view, given the advantageous ER/IL ratio (= 5) and compact footprint, this EAM is particularly suitable for implementing the optical module of a neuron activation function, especially for neurons whose weighting scheme relies on WDM, because these EAMs are spectrally broadband (no resonance used).
[(a) and (b)] Optical image of the electroabsorption device and schematic of the cross section (c). A 10 nm ITO film is deposited on a silicon waveguide (800 nm × 340 nm) covered by a thin SiO2 spacer and a 40 nm Au metal contact. The scale bar is 50 μm. (d) Numerical simulation (FEM) of the normalized electric field distribution for the OFF and ON state, considering a variation of refractive index of the ITO active layer from 1.96 + 0.002i to 1.04 + 0.28i, respectively. (e) Experimentally measured transmission as function of bias voltage (−4 V to 4 V) of a 5 (blue, rhombus) and 20 (yellow, circles) μm ITO based EA modulator. The inset was taken from Ref. 30 and shows the broadband response of similar EAM for different bias voltages between −2 and 2 V.
Comparison with state-of-the-art silicon and LiNbO3 based traveling-wave modulators currently employed. The devices are compared in terms of the main figures of merit: insertion losses (IL) (dB), linear footprint (μm), speed (Gbit/s), energy consumption (fJ/bit), and extinction ratio (ER) (dB). Considering the capacitive effect of our device, we found around 300 GHz for a resistance of R = 500 Ω and a 5 μm long device.
Material | Device type | IL (dB) | Linear footprint (μm) | Speed (Gbit/s) | Energy (fJ/bit) | ER (dB) |
---|---|---|---|---|---|---|
LiNbO370 | MZM | 2 | 7000 | 60 | 1700 | 8 |
Si71 | MZM | 10 | 10 500 | 4 | … | … |
Si72 | MZM | 2 | 3000 | 50 | 450 | 3 |
Si73 | MRR | 3 | 10 | 40 | 80 | 7 |
Si74 | MRR | 3 | 10 | 56 | 45 | 4 |
Si75 | MRR | 1 | 5 | 40 | 4 | 8 |
This work | EAM | 1 | 5 | Lit. 2576 | 60 | 5 |
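As a quick cross-check of the figures of merit quoted in the text, the ER/IL ratio and the extinction ratio per unit length can be computed directly from the rows of Table I. The sketch below simply copies the tabulated values; the Si Ref. 71 row is left out because its extinction ratio is not listed, and the function name is a hypothetical helper. For the resonant (MRR) devices, ER per micrometer is only a compactness indicator, not a length-scaling law.

```python
# (label, device type, IL in dB, linear footprint in um, ER in dB), copied from Table I.
ROWS = [
    ("LiNbO3 (Ref. 70)", "MZM", 2.0, 7000.0, 8.0),
    ("Si (Ref. 72)", "MZM", 2.0, 3000.0, 3.0),
    ("Si (Ref. 73)", "MRR", 3.0, 10.0, 7.0),
    ("Si (Ref. 74)", "MRR", 3.0, 10.0, 4.0),
    ("Si (Ref. 75)", "MRR", 1.0, 5.0, 8.0),
    ("This work", "EAM", 1.0, 5.0, 5.0),
]


def figures_of_merit(il_db, footprint_um, er_db):
    """Return (ER/IL ratio, ER per micrometer of linear footprint)."""
    return er_db / il_db, er_db / footprint_um


for label, kind, il_db, footprint_um, er_db in ROWS:
    er_over_il, er_per_um = figures_of_merit(il_db, footprint_um, er_db)
    print(f"{label:18s} {kind}  ER/IL = {er_over_il:.2f}  ER/length = {er_per_um:.4f} dB/um")
```

For the device in this work, the calculation reproduces the ER/IL ratio of 5 and the roughly 1 dB/μm modulation strength stated in the text.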
Proceeding further with our analysis, we introduce the experimental EA modulator performance as a component of the NL activation function in the MNIST classifier of handwritten digits, as previously done in our theoretical study. We fit a 10th-degree polynomial with saturation to the experimental data (Fig. 5). In this case, given the greater performance of the plasmonic mode, we modeled an activation function with the modulators coupled to the photodiodes through TIAs.14 The results show that the 5 μm ITO modulator outperforms the 20 μm ITO modulator for laser power below 5 mW, while both modulators asymptotically converge to an accuracy greater than 97% as the laser power is increased to 15 mW.
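The fitting step can be sketched as follows. This is a minimal illustration, assuming NumPy; the transmission data are synthetic (a smooth logistic curve standing in for the measured points, which are not reproduced here), and clipping both the input voltage and the output transmission is one possible reading of "polynomial with saturation".

```python
import numpy as np
from numpy.polynomial import Polynomial

# Synthetic stand-in for measured transmission vs. bias voltage (NOT the paper's data).
bias_v = np.linspace(-4.0, 4.0, 81)
transmission = 0.30 + 0.50 / (1.0 + np.exp(bias_v))

# Degree-10 least-squares fit; Polynomial.fit rescales the voltage axis internally,
# which keeps the high-order fit numerically well conditioned.
poly = Polynomial.fit(bias_v, transmission, deg=10)

V_MIN, V_MAX = float(bias_v.min()), float(bias_v.max())
T_MIN, T_MAX = float(transmission.min()), float(transmission.max())


def clipped_poly_activation(voltage):
    """Evaluate the fitted polynomial with saturation outside the measured range."""
    v = np.clip(voltage, V_MIN, V_MAX)          # saturate the drive voltage
    return np.clip(poly(v), T_MIN, T_MAX)       # keep transmission within observed bounds


print(clipped_poly_activation(np.array([-10.0, 0.0, 10.0])))
```

Clipping prevents the high-order polynomial from diverging for drive voltages beyond the measured bias range, which would otherwise produce unphysical transmission values during training.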
Simulated MNIST accuracy for activation functions fit to experimental voltage absorption data from two ITO modulators, 5 μm and 20 μm.30 A 10th-degree clipped polynomial was fitted to the transmittance vs voltage data from each modulator and used as the nonlinear activation function in the simulation environment built in TensorFlow. Laser power was swept from 1 mW to 15 mW over the fully connected neural network of two layers, each with 100 nodes, with both modulator types achieving greater than 90% accuracy for laser power ≥5 mW.
IV. CONCLUSIONS
In conclusion, we described the free carrier absorption dynamics in ITO. The model of the ITO absorption mechanism was then incorporated as the nonlinear activation function of a neural network comprising two layers with 100 nodes each. After an accurate multiparameter optimization, the proposed noise-material capacitor-circuit model shows that the ITO based modulator, used as a component of the NL activation mechanism, can provide high inference accuracy (up to 97%) in neural networks that implement handwritten digit classification tasks characterized by low latency and a power budget on the order of 10 W or less. For the optical platform alone, in a trained network, considering a capacitance of just 7 fF for a 5 μm long device () and an applied bias change of 4 V, the consumption of one electroabsorption module during the inference phase is estimated to be approximately 60 fJ/bit (). At the system level, neglecting I/O and laser power, this translates into 12 pJ/bit (200 NL AF × 60 fJ/bit) consumed by the NL activation functions. Moreover, we incorporated into our photonic TensorFlow model experimental data from a fabricated ITO based EA modulator, which shows even higher performance than that predicted by the model (the model does not account for device asymmetries and material nonidealities). In our view, these results make ITO a very competitive material for electro-optic neural networks, particularly for implementing nonlinear activation functions in low latency and power demanding applications such as communications and adaptive control of multiantenna systems in LiDAR.
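The energy figures above can be checked with a back-of-the-envelope estimate. The sketch below is only a plausibility check under stated assumptions: a parallel-plate capacitor whose area is the device length times the 800 nm waveguide width, a relative permittivity of 3.9 for SiO2, and a switching energy of CV²/2. The expressions actually used by the authors are not reproduced in the text, so these formulas are assumptions rather than the authors' derivation.

```python
EPS0_F_PER_M = 8.854e-12      # vacuum permittivity
EPS_R_SIO2 = 3.9              # assumed relative permittivity of the SiO2 gate oxide

LENGTH_M = 5e-6               # device length (5 um device)
WIDTH_M = 800e-9              # assumed capacitor width = waveguide width
T_OX_M = 20e-9                # gate oxide thickness
DELTA_V = 4.0                 # applied bias change in volts
N_ACTIVATIONS = 200           # nonlinear activation functions in the network

# Parallel-plate estimate of the gate capacitance (assumption).
capacitance_f = EPS0_F_PER_M * EPS_R_SIO2 * LENGTH_M * WIDTH_M / T_OX_M

# Capacitive switching energy per bit, assuming E = C * V^2 / 2.
energy_per_bit_j = 0.5 * capacitance_f * DELTA_V ** 2

# System-level figure using the rounded 60 fJ/bit value quoted in the text.
system_energy_j = N_ACTIVATIONS * 60e-15

print(f"C ~ {capacitance_f * 1e15:.1f} fF")
print(f"E ~ {energy_per_bit_j * 1e15:.0f} fJ/bit per modulator")
print(f"System ~ {system_energy_j * 1e12:.0f} pJ/bit for {N_ACTIVATIONS} activations")
```

Under these assumptions, the estimate lands close to the quoted 7 fF and roughly 60 fJ/bit, and multiplying the rounded per-device value by 200 activation functions gives the 12 pJ/bit system figure.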
SUPPLEMENTARY MATERIAL
See supplementary material for details on the optical properties of ITO and figures of merit of the neural network.
ACKNOWLEDGMENTS
V.S., T.E., and P.P. are supported via the E2CDA program: V.S., T.E., and P.P. under NSF Grant No. 1740262, V.S. and T.E. under SRC nCORE Grant No. 1740235, and P.P. under SRC nCORE Grant No. 2018-NC-2763-A.
NOMENCLATURE
- ADC: analog to digital converter
- AF: activation function
- BL: broadcast loop
- DAC: digital to analog converter
- DFB: distributed feedback laser
- EA: electroabsorption
- EAM: electroabsorption modulator
- ENZ: epsilon-near-zero
- EO: electro-optic
- EOM: electro-optic modulator
- ER: extinction ratio
- DoS: density of states
- FEM: finite element modeling
- FOM: figure of merit
- I/O: input/output
- IL: insertion losses
- ITO: indium tin oxide
- LiDAR: light detection and ranging
- MRR: microring resonator
- MZM: Mach-Zehnder modulator
- NL: nonlinear
- QD: quantum dot
- RF: radio frequency
- rms: root mean square
- SNR: signal to noise ratio
- TCO: transparent conductive-oxide
- WDM: wavelength division multiplexing