Photonic reservoir computing (PRC) is a hardware implementation of a recurrent neural network that features fast training and low training cost. This work presents a wavelength-multiplexing PRC architecture that exploits the numerous longitudinal modes of a Fabry–Perot (FP) semiconductor laser. These modes form connected physical neurons in parallel, while an optical feedback loop provides interacting virtual neurons in series. We experimentally demonstrate a four-channel wavelength-multiplexing PRC architecture with a total of 80 neurons. The clock rate of the multiplexing PRC reaches 1.0 GHz, which is four times higher than that of the single-channel case. In addition, the multiplexing PRC is shown to perform better on the task of signal equalization in an optical fiber communication link. This improvement is attributed to the rich neuron interconnections both in parallel and in series. The scheme is also highly scalable owing to the abundant mode resources in FP lasers.
I. INTRODUCTION
The rapid development of artificial intelligence requires an enormous amount of computational power, which is very challenging for traditional computers based on the von Neumann architecture. Photonic computing is a promising approach to significantly raise the computational power, owing to the fast speed, low latency, and high energy efficiency of light.1–3 Photonic reservoir computing (PRC) is a special recurrent neural network, where the neurons are connected with multiple feedback loops.4,5 In contrast to common recurrent neural networks, weights in the input layer and in the hidden reservoir layers of PRCs are fixed, while only weights in the readout layer require training. Therefore, the training speed of PRCs is high and the training cost is low. One implementation approach of PRCs is connecting the physical neurons with optical waveguides on a single chip.6 The operation speed of this type of PRC is high, and the clock rate reaches more than 10 GHz.
However, the integration of nonlinear neurons is technically challenging7 and the scale is limited due to the transmission loss of light in optical waveguides.8 Another approach is employing a time-delay loop together with one physical neuron to produce a large number of virtual neurons.9 The time-delay PRC architecture is usually implemented by using a semiconductor laser with an optical feedback loop10 or by using an optical modulator with an optoelectronic feedback loop.11,12 This time-delay scheme significantly eases the requirement of massive hardware. Nevertheless, the clock rate of the system is inversely proportional to the number of virtual neurons. Consequently, the clock rate of time-delay PRCs is usually limited to tens of MHz.13,14 In addition to the above two approaches, there are various other implementation schemes,15 such as using coupled VCSEL arrays16 or using a spatial light modulator.17
In contrast to electronics, photonics offers multiple multiplexing dimensions, including wavelength, polarization, space, and orbital angular momentum.18 In the framework of PRCs, Vatin et al. demonstrated a polarization-multiplexing PRC based on the dual-polarization dynamics of a VCSEL, which could process two tasks in parallel.19 Sunada and Uchida presented a space-multiplexing PRC based on the complex speckle field in a multimode waveguide.20 Butschek et al. reported a frequency-multiplexing PRC, in which 25 comb lines produced by phase modulation were used as neurons.21 In addition, Nguimdo et al. numerically proposed a parallel PRC based on the two directional modes in a ring laser.22 Among these dimensions, wavelength division multiplexing (WDM) is the most attractive, thanks to the broad optical bandwidth of optoelectronic devices. Indeed, the ITU-standard dense WDM grid with 50 GHz spacing includes as many as 80 channels. The WDM dimension has been employed in various photonic computing networks, such as the spiking neural network,23 the convolutional neural network,24 and the multilayer perceptron.25 Surprisingly, to the best of our knowledge, the deployment of WDM in PRCs has only been discussed in simulations.26–29
This work experimentally presents a wavelength-multiplexing PRC, by using the multiple longitudinal modes in a Fabry–Perot (FP) semiconductor laser. All the modes act as physical neurons, which are connected in parallel through the common gain medium. Meanwhile, an optical delay loop is used to produce virtual neurons, which are connected in series. We demonstrate that the four-channel PRC runs four times faster than the single-channel case. In addition, the parallel PRC exhibits a better performance on the task of signal equalization in an optical fiber communication link, thanks to the rich neuron connections both in parallel and in series.
II. WAVELENGTH-MULTIPLEXING SCHEME AND EXPERIMENTAL SETUP
Figure 1 shows the experimental setup for the wavelength-multiplexing PRC architecture. An FP laser with tens of longitudinal modes is used as the slave laser. The modes interact with each other through the common gain medium. All the laser modes are subjected to an optical feedback loop consisting of an optical circulator and two couplers (with ratios of 80:20 and 90:10, respectively). The feedback light is coupled with the laser emission, which can significantly alter the laser dynamics. The laser dynamics is mainly determined by the feedback ratio (the ratio of the feedback light power to the laser emission power) and the feedback delay time. The time-delay loop provides a large number of virtual neurons and constructs the hidden reservoir layer of the PRC.9,10 The feedback strength is tuned by an optical attenuator. Four single-mode external-cavity lasers are used as the master lasers, and the wavelength of each master laser (λ1–4) is finely tuned to align with one longitudinal mode of the slave laser. All the single-mode master lasers are unidirectionally injected into the slave laser through the optical circulator. Similar to optical feedback, the optical injection can substantially change the laser dynamics as well. Under optical injection, the laser dynamics is primarily governed by the injection ratio (the power ratio of the master laser to the slave laser) and the detuning frequency (the lasing frequency difference between the master laser and the slave laser). The optical injection is operated in the stable locking regime, which is bounded by the Hopf bifurcation and the saddle-node bifurcation.30,31 In this regime, the lasing frequency of the slave laser is locked to that of the master laser, and the phase difference between the two lasers is locked as well.
After power amplification with an erbium-doped fiber amplifier, the polarization of each master laser is aligned with that of the Mach–Zehnder intensity modulator (EOSPACE, 40 GHz bandwidth) by using a polarization controller. Then, the polarization of the modulated light is re-adjusted to align with that of the slave laser. Every symbol of the input signal under test is first multiplied with a mask, which consists of a random binary sequence of {0, 1}.32 The mask plays a crucial role in maintaining the transient state of the nonlinear laser system, which is the fundamental requirement of time-delay PRCs. In addition, the duration of each mask bit determines the temporal interval of the virtual neurons. The preprocessed signal is produced by the arbitrary waveform generator (AWG, Keysight, 25 GHz bandwidth). This radio-frequency signal is amplified before driving the intensity modulator. In this way, the signal at the input layer of the PRC is injected into the slave laser at the hidden reservoir layer for nonlinear processing. The optical spectrum of the output signal is measured by an optical spectrum analyzer (OSA, Yokogawa, 0.02 nm resolution bandwidth). At the output layer, the light is split into two branches by using a 50:50 splitter. Each branch selects one longitudinal mode using a bandpass filter (0.95 nm bandwidth). The two optical signals are detected in parallel by high-speed photodiodes (PDs, 25 GHz bandwidth and 50 GHz bandwidth, respectively). After power amplification, the temporal waveforms of both channels are recorded on the digital oscilloscope (OSC, Keysight, 59 GHz bandwidth) simultaneously. The two modes with wavelengths of λ1,2 are recorded first, while the other two modes with wavelengths of λ3,4 are recorded in a second-round measurement. It is worth pointing out that the four modes can be recorded simultaneously if a suitable wavelength demultiplexer is employed.
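The masking step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' preprocessing code: the helper names `make_mask` and `apply_mask`, the random seed, the choice of 20 mask bits, and the input symbol values are all hypothetical.

```python
import random

def make_mask(n_neurons, seed=0):
    """Random binary mask of {0, 1}, one bit per virtual neuron."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n_neurons)]

def apply_mask(symbols, mask):
    """Multiply every input symbol by the whole mask.

    Each symbol is held for len(mask) time slots, so the output
    contains len(symbols) * len(mask) samples.
    """
    return [s * m for s in symbols for m in mask]

mask = make_mask(n_neurons=20)      # assumed: 20 virtual neurons in one channel
symbols = [0.2, 0.9, 0.5]           # hypothetical input symbols
drive = apply_mask(symbols, mask)   # sequence that would drive the modulator
print(len(drive))                   # number of symbols times number of mask bits
```

Each mask bit occupies one neuron interval, so the length of the mask fixes how many virtual neurons respond to every input symbol.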
Experimental setup for wavelength-multiplexing PRC. AWG: arbitrary waveform generator; OSA: optical spectrum analyzer; OSC: oscilloscope; and PD: photodiode.
In the experiment, the delay time of the optical feedback loop is measured to be about τ = 65.3 ns. The time interval of the virtual neurons in the reservoir is θ = 0.05 ns, which is set by the modulation rate of the modulator at 20 Gbps. The total number of virtual neurons is set as N = 80 throughout the experiment. The weights of the output layer in the PRC are trained with ridge regression.32 The sampling rate of the AWG is set at 60 GSa/s, and the sampling rate of the oscilloscope is set at 80 GSa/s. For the single-channel PRC with only one master laser, the clock cycle of the system is Tc = 4.0 ns, which is determined by the formula Tc = θ × N. When the WDM scheme with multiple master lasers is employed, the neuron number of each channel is inversely proportional to the channel number m, that is, N/m. Consequently, the clock cycle of the system scales down with the channel number as Tc = θ × N/m. For the four-channel PRC in Fig. 1, the clock cycle reduces to Tc = 1.0 ns, so the system runs four times faster than the one-channel case. It is stressed that the clock cycle Tc of the PRC in Fig. 1 is significantly shorter than the delay time τ, which is different from the common synchronous time-delay PRCs. Our recent work has shown that this asynchronous architecture is beneficial for improving the performance of PRCs.33 This is because the detrimental resonance effect in the synchronous architecture reduces the neuron interconnections and thereby the memory capacity of PRCs.34
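The readout training mentioned above can be illustrated with a closed-form ridge regression. The sketch below is only an assumption-laden example: the function names, the regularization value `ridge=1e-6`, the added bias column, and the synthetic state matrix are hypothetical and do not come from the experiment.

```python
import numpy as np

def train_readout(states, targets, ridge=1e-6):
    """Ridge regression: W = (X^T X + ridge * I)^-1 X^T y.

    states  : (n_samples, n_neurons) matrix of virtual-neuron responses
    targets : (n_samples,) vector of desired outputs
    """
    # Append a constant column so the readout can learn an offset (assumed).
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def predict(states, weights):
    """Apply trained readout weights to a state matrix."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return X @ weights

# Synthetic stand-in for recorded neuron states (NOT experimental data).
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 80))   # 80 virtual neurons, as in the text
true_w = rng.normal(size=80)
targets = states @ true_w + 0.3
weights = train_readout(states, targets)
error = np.max(np.abs(predict(states, weights) - targets))
print(weights.shape, error)
```

Only this linear readout is trained; the input mask and the reservoir itself stay fixed, which is why training is cheap.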
III. EXPERIMENTAL RESULTS
In the experiment, the slave FP laser exhibits a lasing threshold of Ith = 8.0 mA at the operation temperature of 20 °C. The laser is biased at 3.5 × Ith with an output power of 3.2 mW, unless stated otherwise. The resonance frequency of the laser is 4.7 GHz, resulting in a characteristic time of 0.21 ns. The peak of the optical spectrum in Fig. 2(a) is located at around 1545 nm, with a free spectral range of about 1.23 nm. When the slave laser is subject to the optical injection from one master laser in Fig. 2(b), only the injected mode keeps lasing, while the other longitudinal modes are suppressed due to the gain reduction in the laser medium.31 When two master lasers each lock one mode in Fig. 2(c), only the two injected modes remain lasing. In the same way, Fig. 2(d) shows that four modes keep lasing when the four master lasers inject into the slave laser simultaneously. It is noted that all the modes in the FP laser interact with each other through the cross-gain coupling effect, rather than emitting independently.29 Therefore, the four modes in Fig. 2(d) act as connected physical neurons in parallel. In addition, every mode subject to the feedback loop in Fig. 1 produces a reservoir of virtual neurons. As a result, the four modes generate four connected reservoirs, and each reservoir consists of a large number of virtual neurons. All the virtual neurons are connected not only in series but also in parallel, which substantially enriches the synaptic interconnections of the PRC. The mode coupling strength can be quantitatively described by the cross-gain saturation coefficient.29 A coefficient of 0 means that the modes run independently, while a value of 1 means that the cross-gain saturation effect is as strong as the self-gain saturation effect. Our previous work showed theoretically that strong coupling between the modes is favorable for improving the PRC performance on the benchmark task of nonlinear channel equalization.29
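The laser parameters quoted above can be cross-checked with simple arithmetic. The sketch below is an illustration only: the function name `mode_spacing_ghz` is hypothetical, and the frequency spacing of the modes is not stated in the text but is estimated here from the standard relation Δf = cΔλ/λ², which gives roughly 154 GHz for the quoted values.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def mode_spacing_ghz(center_nm, fsr_nm):
    """Convert a wavelength spacing to a frequency spacing: df = c * dlambda / lambda^2."""
    center_m = center_nm * 1e-9
    fsr_m = fsr_nm * 1e-9
    return C * fsr_m / center_m**2 / 1e9

bias_ma = 3.5 * 8.0                           # 3.5 x Ith with Ith = 8.0 mA
char_time_ns = 1.0 / 4.7                      # inverse of the 4.7 GHz resonance frequency
spacing_ghz = mode_spacing_ghz(1545.0, 1.23)  # free spectral range in frequency units

print(f"bias current        : {bias_ma:.1f} mA")
print(f"characteristic time : {char_time_ns:.2f} ns")
print(f"mode spacing        : {spacing_ghz:.0f} GHz")
```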
Optical spectra of the slave FP laser under the operation of (a) free running, (b) one-channel injection, (c) two-channel injection, and (d) four-channel injection. The label of the channels is marked in (d).
We first investigate the performance of the single-channel PRC. Figure 3(a) shows that the bit error rate (BER) declines nonlinearly with the injection ratio, owing to the increased signal-to-noise ratio of the PRC system. In addition, increasing the detuning frequency in Fig. 3(b) from the side of the saddle-node bifurcation to the side of the Hopf bifurcation reduces the BER. That is, the optimal PRC performance is achieved in the vicinity of the Hopf bifurcation. This is because the positive frequency detuning of optical injection reduces the damping factor but enhances the resonance frequency of the slave laser, leading to richer dynamics of the virtual neurons.39,40 Therefore, we believe that this positive detuning condition is also helpful for improving the PRC performance on other tasks, such as the Mackey–Glass chaos prediction.41 Figure 3(c) shows that the BER of the signal is insensitive to the optical feedback ratio. Note that the PRC is always operated in the stable regime of optical feedback. The upper limit of the stable regime is bounded by the critical feedback level, beyond which the slave laser becomes unstable.30,42 The critical feedback level of the slave laser without the optical injection is measured to be about −19.3 dB. However, our recent work has found that the optical injection significantly raises the critical feedback level of the slave laser.33 Figure 3(d) shows that the BER first decreases with increasing pump current and then saturates at around 0.026 when the pump current is larger than 2.0 × Ith. Interestingly, the impacts of the above four operation parameters on the signal equalization task are similar to those on the prediction task of Santa Fe chaos.33
Performance of the single-channel PRC. Effects of (a) the injection ratio Rinj, (b) the detuning frequency Δfinj, (c) the feedback ratio Rext, and (d) the normalized pump current I/Ith. The default operation conditions are Rinj = 4.0; Δfinj: near Hopf bifurcation; Rext = −30.3 dB; and I/Ith = 3.5. The error bar stands for the standard deviation of the four-round measurements.
Figure 4 compares the performance of PRCs with different numbers of channels. All three PRCs are insensitive to the launch power of the transmitted signal as long as the power is less than 5.0 mW. This is because the small launch power does not stimulate strong Kerr nonlinearity, and hence, the chromatic dispersion dominates the signal distortion. The average BER of the one-channel PRC (squares) is 0.028. In comparison, the average BER of both the two-channel PRC (triangles) and the four-channel PRC (dots) is 0.024, which is 14% smaller than that of the one-channel case. That is, the WDM scheme improves the PRC performance on the signal equalization task. This is because the laser mode interaction provides parallel connections of the virtual neurons, in addition to the series connections arising from the optical feedback loop. In addition, we recall that the clock rate of the four-channel PRC (1.0 GHz) is four times higher than that of the one-channel case (0.25 GHz), thanks to the WDM architecture. Based on the trend in Fig. 3(a), the performance of the four-channel PRC can be further improved by raising the injection ratio to 4.0 (as in the one- and two-channel cases) instead of only 1.0.
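The clock rates and the relative BER improvement quoted above follow directly from the numbers in the text. The short sketch below reproduces them; the function name `clock_cycle_ns` is a hypothetical helper introduced only for this illustration.

```python
def clock_cycle_ns(theta_ns, n_neurons, n_channels=1):
    """Clock cycle Tc = theta * N / m, as given in the text."""
    return theta_ns * n_neurons / n_channels

theta_ns, n_total = 0.05, 80
for m in (1, 2, 4):
    tc = clock_cycle_ns(theta_ns, n_total, m)
    print(f"m = {m}: {n_total // m} neurons per channel, "
          f"Tc = {tc:.2f} ns, clock rate = {1.0 / tc:.2f} GHz")

# Average BERs quoted for the one-channel and multi-channel PRCs.
ber_single, ber_multi = 0.028, 0.024
improvement = (ber_single - ber_multi) / ber_single
print(f"relative BER reduction: {improvement:.0%}")
```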
Performance of the wavelength-multiplexing PRCs vs the launch power of the transmitted signal. The feedback ratio is Rext = −30.3 dB, and the pump current is I/Ith = 3.5. The optical injection conditions are Rinj = 4.0 and Δfinj = −20.1 GHz for the one-channel case (squares); Rinj = 4.0 and Δfinj = −34.8 GHz and −32.3 GHz for the two channels of the two-channel case (triangles); and Rinj = 1.0 and Δfinj = −31.1 GHz, −23.6 GHz, −36.0 GHz, and −77.0 GHz for the four channels of the four-channel case (dots). The error bar stands for the standard deviation of the four-round measurements.
In order to investigate the ability of the PRC to compensate fiber nonlinearity, we manually raise the launch power of the transmitted signal up to 50 mW, as shown in Fig. 5. The BER of the four-channel PRC increases nonlinearly with the launch power from 0.024 at 1.0 mW to 0.064 at 50 mW. In addition, the BER of every individual channel (open symbols) with 20 neurons rises nonlinearly as well. The four-channel PRC with a total of 80 neurons performs better than each individual channel, because more neuron dynamics are involved. The performance of the PRC is compared with the feedforward equalizer, which is a transversal filter that linearly combines the received symbol and its neighbors.35 That is, the feedforward equalizer only compensates the chromatic dispersion effect. Figure 5 shows that the BER of the feedforward equalizer (with 5 taps) increases almost linearly with the launch power from 0.042 at 1.0 mW to 0.085 at 50 mW. In comparison, the PRC exhibits a better performance at both low and high launch powers. On the one hand, this is because the PRC has a fading memory effect owing to the nature of recurrent neural networks. Therefore, the PRC can better compensate the distortion of chromatic dispersion. On the other hand, the PRC is a typical nonlinear system and can thereby compensate the distortion of Kerr nonlinearity as well. This comparison result is in agreement with those observed in the literature.43–45 Indeed, Argyris et al. achieved a BER of 10⁻³ for the equalization of a 25 Gbps NRZ signal at 51 km transmission.43 Vatin et al. reported a BER of 3 × 10⁻² for the equalization of a 25 Gbps NRZ signal at 50 km transmission.19 Sackesyn et al. reported an integrated PRC, which compensated a 32 Gbps NRZ signal at 25 km transmission with a BER below 2 × 10⁻⁴.46 Ranzini et al. used an optoelectronic RC to compensate the distortion of a 32 Gbps NRZ signal at 80 km transmission, and the BER reached down to 2.2 × 10⁻⁴.44 Estebanez et al. recovered the data of a 56 Gbaud PAM-4 100 km transmission link with a BER below 3.8 × 10⁻³.45 In addition, the signal equalization using PRCs has been widely discussed in simulations as well, where the noise effect was usually neglected.47,48 However, both the noise level and the signal-to-noise ratio of the system substantially affect the PRC performance in practice.29
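For readers unfamiliar with the feedforward equalizer used as the baseline above, the following sketch shows a 5-tap transversal filter trained by least squares. It is a hedged illustration only: the helper names, the toy three-tap channel, the random bit sequence, and the 0.5 decision threshold are assumptions and do not represent the fiber link or the data of the experiment.

```python
import numpy as np

def tap_matrix(received, n_taps=5):
    """Stack each received sample with its neighbours (zero-padded at the edges)."""
    half = n_taps // 2
    padded = np.pad(received, half)
    return np.stack([padded[i:i + len(received)] for i in range(n_taps)], axis=1)

def train_ffe(received, sent, n_taps=5):
    """Least-squares tap weights of a linear transversal filter."""
    taps, *_ = np.linalg.lstsq(tap_matrix(received, n_taps), sent, rcond=None)
    return taps

def equalize(received, taps):
    """Linearly combine each received sample and its neighbours."""
    return tap_matrix(received, len(taps)) @ taps

# Toy linear channel with inter-symbol interference (NOT the fiber link of the text).
rng = np.random.default_rng(1)
sent = rng.integers(0, 2, size=2000).astype(float)
received = np.convolve(sent, [0.2, 1.0, 0.3], mode="same")
taps = train_ffe(received, sent)
decided = (equalize(received, taps) > 0.5).astype(float)
ber = np.mean(decided != sent)
print(taps, ber)
```

Because the filter output is a purely linear combination of neighbouring samples, it can undo linear distortion such as this toy interference but has no mechanism for nonlinear distortion, which is the limitation the comparison in the text relies on.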
Performance comparison between the four-channel PRC (dots) and the feedforward equalizer (squares) for a broad range of launch power. The open symbols represent the BERs of each channel.
IV. DISCUSSION
In the above experiment, the delay time of the feedback loop is fixed at τ = 65.3 ns without any optimization. Although the optimization of the PRC performance is beyond the scope of this work, this section discusses the effect of the delay time in simulations so as to provide some insights into future experiment designs. The slave laser is described by the rate equation approach, which takes into account the dynamics of the carriers, the photons, and the phase of the electric field.29 The optical feedback effect and the optical injection effect are described by the classical Lang–Kobayashi model.49,50 In the simulation, the neuron number of the single-channel PRC is set as 80. The neuron interval is set at 0.01 ns, and hence, the clock cycle is Tc = 0.8 ns. The simulated BER in Fig. 6 first goes down with the normalized delay time starting from τ/Tc = 0.1. The optimal PRC performance is achieved within the normalized time range of 0.5 to 2.0, and the best BER is around 0.010. In comparison, the optimal time range for the prediction task of Santa Fe chaos is from 2.0 to 4.0.33 Interestingly, the BER jumps up to 0.011 at τ/Tc = 1.0, where the delay time is synchronous with the clock cycle. The performance degradation is attributed to the detrimental resonance effect, which reduces the local memory capacity of the PRC system.33,34 Generally, the memory capacity first increases to a maximum value and then decreases with increasing delay time.34 For τ/Tc > 2.0, the memory capacity deviates from the optimal value, and hence, the BER rises almost linearly with increasing delay time. However, Fig. 6 does not show an obvious performance deterioration at high-order resonances, where τ/Tc = m with m being an integer and m ≥ 2. This is because the local memory capacity only slightly reduces at high-order resonances. In the experimental setup shown in Fig. 1, nevertheless, the feedback delay time of the four-channel PRC is more than 65 times longer than the clock cycle, which is far away from the optimal value. Consequently, future experiments will optimize the delay time to achieve the best signal equalization performance.
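The gap between the experimental operating point and the simulated optimum can be made explicit with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and applying the single-channel simulated range of 0.5 to 2.0 to the four-channel experiment is an assumption made here purely for comparison.

```python
def normalized_delay(tau_ns, theta_ns, n_neurons, n_channels=1):
    """Return tau / Tc with Tc = theta * N / m."""
    return tau_ns / (theta_ns * n_neurons / n_channels)

def in_optimal_range(ratio, low=0.5, high=2.0):
    """Simulated low-BER range, excluding the synchronous resonance at tau/Tc = 1."""
    return low <= ratio <= high and abs(ratio - 1.0) > 1e-9

experiment = normalized_delay(65.3, 0.05, 80, n_channels=4)
print(f"experimental tau/Tc = {experiment:.1f}")
print("inside simulated optimum:", in_optimal_range(experiment))
```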
Simulated effect of the normalized delay time τ/Tc on the PRC performance.
It has been well established that the spacing of the virtual neurons substantially affects the PRC performance and must be sufficiently smaller than the characteristic time of the system.32 The neuron spacing in Fig. 1 is fixed at 0.05 ns, which is about a quarter of the characteristic time (0.21 ns). Our recent work demonstrated that increasing this neuron spacing deteriorates the PRC performance due to the reduced transient dynamics.33 Finally, we remark that the parallel PRC scheme is not only able to solve the signal equalization task discussed in this work but can also be generalized to other tasks. The supplementary material discusses its applications in the benchmark task of nonlinear channel equalization and in the benchmark task of Santa Fe chaos prediction, both of which are commonly used in the PRC community.51 The merits of the parallel scheme, including the clock rate acceleration and the performance improvement, remain unaffected, since both advantages are task independent.
V. CONCLUSION
In summary, we have experimentally demonstrated a wavelength-multiplexing PRC architecture based on the numerous longitudinal modes in an FP laser. The modes play the role of connected physical neurons in parallel. Meanwhile, an optical feedback loop produces virtual neurons, which are connected in series through the time-multiplexing effect. The four-channel PRC runs four times faster than the single-channel case, and the clock rate reaches 1.0 GHz. The four-channel PRC also exhibits a superior performance on the signal equalization task, owing to the interaction of neurons both in parallel and in series. The proposed WDM scheme is highly scalable owing to the rich mode resources in FP lasers. Future work will scale up the number of WDM channels and further raise the clock rate of the PRC. Note that the four-wave mixing effect may limit the maximum number of achievable channels. In order to avoid its impact on the PRC performance, only one channel out of every three ITU-standard dense WDM channels will be used to inject data. In this way, it is possible to reach a maximum of 26 channels in the WDM PRC system.
SUPPLEMENTARY MATERIAL
See the supplementary material for the parallel PRC performance on the benchmark task of nonlinear channel equalization and on the benchmark task of Santa Fe chaos prediction.
ACKNOWLEDGMENTS
This work was funded by the Shanghai Natural Science Foundation (Grant No. 20ZR1436500).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Rui-Qian Li and Yi-Wei Shen contributed equally to this work.
Rui-Qian Li: Data curation (equal); Investigation (equal); Validation (equal); Visualization (equal); Writing – original draft (equal). Yi-Wei Shen: Data curation (equal); Investigation (equal); Validation (equal); Visualization (equal); Writing – original draft (equal). Bao-De Lin: Data curation (equal); Investigation (equal); Validation (equal); Visualization (equal); Writing – original draft (equal). Jingyi Yu: Methodology (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal). Xuming He: Methodology (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal). Cheng Wang: Conceptualization (lead); Funding acquisition (lead); Methodology (lead); Project administration (lead); Supervision (lead); Writing – review & editing (lead).
DATA AVAILABILITY
The data that support the findings of this study are openly available in https://zenodo.org/record/7961785#.ZGx_FXZByHu.52