Neural systems are well known for their ability to learn and store information as memories. Even more impressive is their ability to abstract these memories to create complex internal representations, enabling advanced functions such as the spatial manipulation of mental representations. While recurrent neural networks (RNNs) are capable of representing complex information, the exact mechanisms by which dynamical neural systems perform abstraction are still not well understood, thereby hindering the development of more advanced functions. Here, we train a 1000-neuron RNN—a reservoir computer (RC)—to abstract a continuous dynamical attractor memory from isolated examples of dynamical attractor memories. Furthermore, we explain the abstraction mechanism with a new theory. By training the RC on isolated and shifted examples of either stable limit cycles or chaotic Lorenz attractors, the RC learns a continuum of attractors as quantified by an extra Lyapunov exponent equal to zero. We propose a theoretical mechanism of this abstraction by combining ideas from differentiable generalized synchronization and feedback dynamics. Our results quantify abstraction in simple neural systems, enabling us to design artificial RNNs for abstraction and leading us toward a neural basis of abstraction.
Neural systems learn and store information as memories and can even create abstract representations from these memories, such as how the human brain can change the pitch of a song or predict different trajectories of a moving object. Because we do not know how neurons work together to generate abstractions, we are unable to optimize artificial neural networks for abstraction or directly measure abstraction in biological neural networks. Our goal is to provide a theory for how a simple neural network learns to abstract information from its inputs. We demonstrate that abstraction is possible using a simple neural network and that abstraction can be quantified and measured using existing tools. Furthermore, we provide a new mathematical mechanism for abstraction in artificial neural networks, allowing for future research applications in neuroscience and machine learning.
I. INTRODUCTION
Biological and artificial neural networks have the ability to make generalizations from only a few examples.1–7 For instance, both types of networks demonstrate object invariance: the ability to recognize an object even after it has undergone translation or transformation.8,9 What is surprising about this invariance is not that neural systems can map a set of inputs to the same output. Rather, what is surprising is that they can first sustain internal representations of objects and then abstract these representations to include translations and transformations. Hence, beyond simply memorizing static, discrete examples,10 neural systems have the ability to abstract their memories along a continuum of information by observing isolated examples.11 However, the precise mechanisms of such abstraction remain unknown, limiting the principled design and training of neural systems.
To make matters worse, much of the information represented by neural networks is not static but dynamic. As a biological example, a songbird’s representation of song is inherently time-varying and can be continuously sped up and slowed down through external perturbations.12 In artificial networks, recurrent neural networks (RNNs) can store a history of temporal information such as language,13 dynamical trajectories,14,15 and climate16 to more accurately classify and predict future events. To harness the power of RNNs for processing temporal information, efforts have focused on developing powerful training algorithms, such as backpropagation through time (BPTT)17 and neural architectures such as long short-term memory (LSTM) networks,18 alongside physical realizations in neuromorphic computing chips.19 Unfortunately, the dramatic increase in computational capability is accompanied by a similarly dramatic increase in the difficulty of understanding such systems, severely limiting their designability and generalizability beyond specific datasets.
To better understand the mechanisms behind neural representations of temporal information, the field has turned to dynamical systems. Starting with theories of synchronization between coupled dynamical systems,20,21 theories of generalized synchronization22 and invertible generalized synchronization23 provide intuition and conditions for when a neural network uniquely represents the temporal trajectory of its inputs, and when this representation can recover the original inputs to recurrently store them as memories.24 These theories hinge on important ideas and tools such as delay embedding,25 Lyapunov exponents (LEs),26 and dimensionality,27,28 which quantify crucial properties of time-varying representations. However, it is not yet known precisely how neural systems abstract such time-varying representations. Accordingly, the field is limited in its understanding of abstraction and meta-learning in existing neural systems29–32 and restricted in its ability to design neural systems for abstraction.
Here, we address this knowledge gap by providing a mechanism for the abstraction of time-varying attractor memories in a reservoir computer (RC), which is a type of RNN.33 In this work, we define abstraction as the process of the RC using its internal nonlinear dynamics to map several low-dimensional inputs to a single higher-dimensional output. First, we demonstrate that a neural network can observe low-dimensional inputs and create higher-dimensional abstractions, thereby learning a continuum of representations from a few examples. Then, we develop a new theory to explain the mechanism of this abstraction by extending prior work:34 we explicitly write the differential response of the RC to a differential change in the input, thereby giving a quantitative form to ideas of differentiable generalized synchronization.35 We quantify this abstraction by demonstrating that successful abstraction is driven by the acquisition of an additional Lyapunov exponent equal to zero in the RC’s dynamics, and we study the role of the RC’s spectral radius and time constant in its ability to abstract dynamics. These results enable the development of more interpretable and designable methods in machine learning and provide a quantitative hypothesis and measure of abstraction from neural dynamics.
II. MATHEMATICAL FRAMEWORK
To study the ability of neural networks to process and represent time-varying information as memories, we use a simple nonlinear dynamical system from reservoir computing,
$$\frac{1}{\gamma}\,\dot{\mathbf{r}}(t) = -\mathbf{r}(t) + g\big(A\mathbf{r}(t) + B\mathbf{x}(t) + \mathbf{d}\big). \qquad (1)$$
Here, $\mathbf{r}(t)$ is a vector that represents the state of the reservoir neurons, $\mathbf{x}(t)$ is the vector of inputs into the reservoir, $\mathbf{d}$ is a constant vector of bias terms, $A$ is the matrix of connections between neurons, $B$ is the matrix of weights mapping inputs to neurons, $g$ is a sigmoidal function that we take to be the hyperbolic tangent, $g(\cdot)=\tanh(\cdot)$, and $\gamma$ is a time constant.
Throughout the results, we use an $N = 1000$-neuron network such that $\mathbf{r}(t) \in \mathbb{R}^{1000}$. We set $A$ to be sparse (2% dense), where each non-zero entry of $A$ is a random number from $-1$ to $1$, and we then scale $A$ such that the absolute value of its largest eigenvalue is $\rho$, the spectral radius of the network. The spectral radius controls the excitability of the reservoir; the magnitude of the largest eigenvalue of $A$ is set to $\rho$ numerically in MATLAB. In general, each entry of $B$ was drawn randomly from $-1$ to $1$ and multiplied by a scalar coefficient set to 0.1; the one exception is the parameter sweep in Subsection IV A, where this scalar coefficient was varied systematically. Each entry of the bias term $\mathbf{d}$ was drawn randomly from $-1$ to $1$ and multiplied by a bias amplification constant, which was set to 10 in all cases except in the parameter sweep in Subsection IV A.
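For concreteness, this construction can be sketched in a few lines of Python/NumPy (the authors report working in MATLAB; the library calls, random seed, and variable names such as N, rho, and n_in are our illustrative choices, and the 2% density follows the Appendix):

```python
# Sketch of the reservoir setup described above (Python/NumPy for illustration;
# the original work used MATLAB). N, rho, and the random seed are placeholders.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigs

rng = np.random.default_rng(0)
N = 1000            # number of reservoir neurons
n_in = 3            # input dimension (2 for the limit cycle, 3 for the Lorenz input)
rho = 1.0           # target spectral radius (swept in Sec. IV A)

# Sparse connectivity with nonzero entries uniform in [-1, 1], rescaled so that
# the largest-magnitude eigenvalue of A equals rho.
A = sparse.random(N, N, density=0.02, random_state=rng,
                  data_rvs=lambda size: rng.uniform(-1.0, 1.0, size)).tocsr()
lam_max = np.abs(eigs(A, k=1, return_eigenvectors=False)[0])
A = (rho / lam_max) * A

B = 0.1 * rng.uniform(-1.0, 1.0, (N, n_in))   # input weights, scaled by 0.1
d = 10.0 * rng.uniform(-1.0, 1.0, N)          # bias terms, amplified by 10
```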
To study the ability of the reservoir to form representations and abstractions of temporal data, we must define the data to be learned. Following prior work in teaching reservoirs to represent temporal information,14,15,24,34 we will use dynamical attractors as the memories. The first memory that we use is a stable limit cycle that evolves according to
To test the reservoir’s ability to learn and abstract more complex memories, the second memory that we use is the chaotic Lorenz attractor36 that evolves according to
$$\dot{x} = \sigma(y - x), \qquad \dot{y} = x(\rho_{\mathrm{L}} - z) - y, \qquad \dot{z} = xy - \beta z, \qquad (3)$$
where $\sigma$, $\rho_{\mathrm{L}}$, and $\beta$ are the Lorenz parameters (we write $\rho_{\mathrm{L}}$ to distinguish this parameter from the spectral radius $\rho$).
By driving the reservoir in Eq. (1) with the time series generated from either the stable limit cycle in Eq. (2) or the chaotic Lorenz system in Eq. (3), the response of the reservoir neurons is given by the trajectory $\mathbf{r}(t)$. In our experiments, we drive the reservoir and evolve the input memory for 50 time units to create a transient phase, which we discard, allowing the RC and the input memory to evolve far enough away from the randomly chosen initial conditions. Then, we drive the reservoir and the input memory together for 100 time units to create a learning phase. Because we use a time step of $\Delta t = 0.001$, this process creates a learning-phase time series of 100 000 points.
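A minimal sketch of this open-loop (driven) phase, assuming the leaky-tanh reservoir of Eq. (1), a simple Euler integrator at the stated time step, and the textbook Lorenz system as the input (the specific Lorenz parameter values are our assumption); A, B, d, and gamma are taken from the setup sketch above:

```python
# Open-loop driving sketch: Euler integration of the input and of Eq. (1),
# with a discarded 50-unit transient followed by a 100-unit learning phase.
import numpy as np

def lorenz_rhs(s, sigma=10.0, r=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return np.array([sigma * (y - x), x * (r - z) - y, x * y - beta * z])

def drive(r_state, s, A, B, d, gamma, T, dt=1e-3):
    """Co-evolve the input s(t) and the reservoir r(t) for T time units."""
    n_steps = int(round(T / dt))
    R = np.empty((n_steps, r_state.size))
    X = np.empty((n_steps, s.size))
    for k in range(n_steps):
        s = s + dt * lorenz_rhs(s)
        r_state = r_state + dt * gamma * (-r_state + np.tanh(A @ r_state + B @ s + d))
        R[k], X[k] = r_state, s
    return r_state, s, R, X

# Transient phase (discarded), then the learning phase used for training.
# r0 = np.zeros(1000); s0 = np.array([1.0, 1.0, 20.0]); gamma = 10.0   # placeholders
# r0, s0, _, _ = drive(r0, s0, A, B, d, gamma, T=50.0)
# _, _, R, X   = drive(r0, s0, A, B, d, gamma, T=100.0)
```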
To store the attractor time series as memories, prior work in reservoir computing has demonstrated that it is sufficient to first train an output matrix $W$ that maps reservoir states to copy the input according to the least squares norm minimization
$$W = \operatorname*{arg\,min}_{\tilde{W}} \big\lVert \tilde{W}\mathbf{r}(t) - \mathbf{x}(t) \big\rVert \qquad (4)$$
and then perform feedback by replacing the inputs with the output of the reservoir, $\mathbf{x}(t) \rightarrow W\mathbf{r}(t)$. This feedback generates a new system that evolves autonomously according to
$$\frac{1}{\gamma}\,\dot{\mathbf{r}}(t) = -\mathbf{r}(t) + g\big(A\mathbf{r}(t) + BW\mathbf{r}(t) + \mathbf{d}\big). \qquad (5)$$
We evolve this new system for 500 time units to create a prediction phase.
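In the same Python setting, Eqs. (4) and (5) reduce to an ordinary least-squares solve over the learning-phase histories R and X from the driving sketch above, followed by closed-loop integration (a ridge-regularized solve is a common alternative not used here); r0 denotes the final learning-phase reservoir state:

```python
# Readout training (Eq. (4)) and closed-loop feedback (Eq. (5)).
import numpy as np

# Least squares: find W such that W r(t) ~ x(t) over the learning phase.
# R has shape (timesteps, N); X has shape (timesteps, n_in).
W = np.linalg.lstsq(R, X, rcond=None)[0].T        # shape (n_in, N)

def closed_loop_step(r_state, A, B, d, gamma, W, dt=1e-3):
    # Eq. (5): the input x(t) is replaced by the readout W r(t).
    return r_state + dt * gamma * (-r_state + np.tanh(A @ r_state + B @ (W @ r_state) + d))

# Prediction phase: evolve autonomously for 500 time units from the final
# learning-phase state r0 and read out the stored memory as W r(t).
dt = 1e-3
n_steps = int(round(500.0 / dt))
out = np.empty((n_steps, W.shape[0]))
r_state = r0.copy()
for k in range(n_steps):
    r_state = closed_loop_step(r_state, A, B, d, gamma, W, dt)
    out[k] = W @ r_state
```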
As a demonstration of this process, we show a schematic of the reservoir being driven by the stable limit cycle input $\mathbf{x}(t)$ [Fig. 1(a), blue], thereby generating the reservoir time series $\mathbf{r}(t)$ [Fig. 1(a), gold], which is subsequently used to train a matrix $W$ such that $W\mathbf{r}(t)$ copies the input [Fig. 1(a), red]. The training input, $\mathbf{x}(t)$ [Fig. 1(b), blue], and the training output, $W\mathbf{r}(t)$ [Fig. 1(b), red], are plotted together and are indistinguishable. After the training, we perform feedback by replacing the reservoir inputs, $\mathbf{x}(t)$, with the outputs, $W\mathbf{r}(t)$ [Fig. 1(c)], and observe that the output of the autonomous reservoir remains a limit cycle [Fig. 1(d)]. Can this simple process be used not only to store memories but also to abstract memories? If so, by what mechanism?
In what follows, we answer these questions by extending the framework to multiple isolated inputs. Specifically, rather than using only one attractor time series $\mathbf{x}(t)$, we will use a finite number of translated attractor time series
$$\mathbf{x}_k(t) = \mathbf{x}(t) + k\,\Delta\mathbf{x} \qquad (6)$$
for $k = 1, \dots, K$, where $\Delta\mathbf{x}$ is a constant vector. Further discussion about the vector $\Delta\mathbf{x}$ can be found in the Appendix. We will use these time series to drive the reservoir to generate a finite number of neural responses $\mathbf{r}_k(t)$. By concatenating all of the inputs and reservoir states along the time dimension into a single time series, $\mathbf{x}(t)$ and $\mathbf{r}(t)$, respectively, we train an output matrix $W$ according to Eq. (4) that maps all of the reservoir states to all of the translated inputs. Finally, using $W$, we perform feedback according to Eq. (5).
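The translation-and-concatenation procedure might look as follows for the two-dimensional limit-cycle case, where X is the untranslated input time series (rows indexed by time) and drive_with_series, K, and dx are our own illustrative names rather than the authors' code:

```python
# Sketch of Eq. (6) and the concatenated training set: drive the reservoir with
# K translated copies of the input, stack states and targets along time, and
# solve Eq. (4) once on the stacked data.
import numpy as np

def drive_with_series(r_state, X_series, A, B, d, gamma, dt=1e-3):
    """Drive the reservoir with a precomputed input time series (rows = time)."""
    R = np.empty((len(X_series), r_state.size))
    for k, x in enumerate(X_series):
        r_state = r_state + dt * gamma * (-r_state + np.tanh(A @ r_state + B @ x + d))
        R[k] = r_state
    return R

K = 5                                   # number of translated examples
dx = 0.01 * np.ones(2)                  # constant shift vector along the x = y line

X_all = [X + k * dx for k in range(K)]                               # Eq. (6)
R_all = [drive_with_series(r0, Xk, A, B, d, gamma) for Xk in X_all]  # per-example transients omitted

R_cat = np.vstack(R_all)                # reservoir states, concatenated in time
X_cat = np.vstack(X_all)                # translated inputs, concatenated in time
W = np.linalg.lstsq(R_cat, X_cat, rcond=None)[0].T                   # Eq. (4) on the stacked data
```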
III. DIFFERENTIAL LEARNING DRIVES ABSTRACTION
To teach the reservoir to generate higher-dimensional representations of isolated inputs, we train it to copy translations of an attractor memory. First, we consider the time series of a stable limit cycle generated by Eq. (2), $\mathbf{x}(t)$, and we create shifted time series, $\mathbf{x}_k(t)$, for $k = 1, \dots, 5$ according to Eq. (6) [Fig. 2(a)]. We then use these time series to drive the reservoir according to Eq. (1) to generate the reservoir time series $\mathbf{r}_k(t)$ for $k = 1, \dots, 5$, concatenate the time series into $\mathbf{x}(t)$ and $\mathbf{r}(t)$, respectively, and train the output matrix $W$ according to Eq. (4) to generate the autonomous feedback reservoir that evolves according to Eq. (5).
To test whether the reservoir has learned a higher-dimensional continuum of limit cycles vs the five isolated examples, we evolve the autonomous reservoir at intermediate values of the translation variable $k$. Specifically, we first prepare the reservoir state by driving the non-autonomous reservoir in Eq. (1) with limit cycles at intermediate translations [i.e., $\mathbf{x}_k(t)$ for non-integer values of $k$ between the training examples] for 50 time units until any transient dynamics from the initial reservoir state have decayed, thereby generating a set of final reservoir states $\mathbf{r}_k$. We then use these final reservoir states as the initial state for the autonomous feedback reservoir in Eq. (5). Finally, we evolve the autonomous reservoir and plot the outputs in green in Fig. 2(b). As can be seen, the autonomous reservoir whose initial state has been prepared at intermediate translations in position continues to evolve about a stable limit cycle at that shift.
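In the same sketch, testing at an intermediate translation amounts to preparing the state in open loop and then releasing it into the closed-loop system of Eq. (5); k_test and the helper names below follow the earlier sketches and are our own conventions:

```python
# Prepare the reservoir at an intermediate (untrained) shift, e.g. halfway
# between two training examples, then evolve the autonomous system of Eq. (5).
import numpy as np

k_test = 1.5                                          # between training shifts k = 1 and k = 2
X_test = X + k_test * dx                              # intermediate translation of the input
r_state = drive_with_series(r0, X_test, A, B, d, gamma)[-1]   # keep only the final (prepared) state

dt = 1e-3
outputs = []
for _ in range(int(round(500.0 / dt))):
    r_state = r_state + dt * gamma * (-r_state + np.tanh(A @ r_state + B @ (W @ r_state) + d))
    outputs.append(W @ r_state)
# If abstraction succeeded, np.array(outputs) traces a limit cycle at the intermediate shift.
```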
A. Differential mechanism of learning
Now that we have numerically demonstrated the higher-dimensional abstraction of lower-dimensional attractors, we will uncover the underlying theoretical mechanism first by studying the response of the reservoir to different inputs and then by studying the consequence of the training process.
First, we compute perturbations of the reservoir state, $\delta\mathbf{r}(t)$, in response to perturbations of the input, $\delta\mathbf{x}(t)$, by linearizing the dynamics about the trajectories $\mathbf{r}(t)$ and $\mathbf{x}(t)$ to yield
$$\frac{1}{\gamma}\,\delta\dot{\mathbf{r}}(t) = -\delta\mathbf{r}(t) + g'\big(A\mathbf{r}(t) + B\mathbf{x}(t) + \mathbf{d}\big) \odot \big(A\,\delta\mathbf{r}(t) + B\,\delta\mathbf{x}(t)\big), \qquad (7)$$
where $g'$ is the derivative of $g$ evaluated at $A\mathbf{r}(t) + B\mathbf{x}(t) + \mathbf{d}$, and $\odot$ is the element-wise product of the $i$th element of $g'$ and the $i$th row of either matrix $A$ or $B$. We are guaranteed by differentiable generalized synchronization35 that if $\delta\mathbf{x}(t)$ is infinitesimal and constant, then $\delta\mathbf{r}(t)$ is also infinitesimal and evolves according to Eq. (7). Fortuitously, the differential change in the input across the translated examples is precisely infinitesimal and constant and is given by the derivative of Eq. (6) to yield $\delta\mathbf{x}(t) = \Delta\mathbf{x}$. We substitute this derivative into Eq. (7) to yield
$$\frac{1}{\gamma}\,\delta\dot{\mathbf{r}}(t) = -\delta\mathbf{r}(t) + g'\big(A\mathbf{r}(t) + B\mathbf{x}(t) + \mathbf{d}\big) \odot \big(A\,\delta\mathbf{r}(t) + B\,\Delta\mathbf{x}\big). \qquad (8)$$
Crucially, this system is linear such that if a shift of $\Delta\mathbf{x}$ yields a perturbed reservoir trajectory of $\delta\mathbf{r}(t)$, then a shift of $c\,\Delta\mathbf{x}$ yields a perturbed reservoir trajectory of $c\,\delta\mathbf{r}(t)$ for any scalar $c$. Hence, we can already begin to see the mechanism of abstraction: any scalar multiple of the differential input, $\Delta\mathbf{x}$, yields a scalar multiple of the trajectory $\delta\mathbf{r}(t)$ as a valid perturbed trajectory.
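To make the linearized response concrete, the variational system of Eq. (8) can be co-integrated with the driven reservoir of Eq. (1); the sketch below uses the fact that for g = tanh the derivative is g'(u) = 1 - tanh^2(u), and the function and variable names are ours:

```python
# Co-integrate the driven reservoir (Eq. (1)) and its linearized response to a
# constant input shift dx (Eq. (8)) with a simple Euler step.
import numpy as np

def step_linearized(r_state, dr, x, dx, A, B, d, gamma, dt=1e-3):
    u = A @ r_state + B @ x + d
    g = np.tanh(u)
    gprime = 1.0 - g ** 2                                            # derivative of tanh at the operating point
    r_next = r_state + dt * gamma * (-r_state + g)                   # Eq. (1)
    dr_next = dr + dt * gamma * (-dr + gprime * (A @ dr + B @ dx))   # Eq. (8)
    return r_next, dr_next

# Doubling dx doubles dr(t) at every step, since Eq. (8) is linear in (dr, dx):
# this is the scalar-multiple property used in the abstraction argument.
```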
To complete the abstraction mechanism, we note that the trained output matrix $W$ precisely learns the inverse map. If Eq. (8) maps scalar multiples of $\Delta\mathbf{x}$ to scalar multiples of $\delta\mathbf{r}(t)$, then the trained output matrix maps scalar multiples of $\delta\mathbf{r}(t)$ back to scalar multiples of $\Delta\mathbf{x}$. To learn this inverse map, notice that our five training examples are spaced closely together [Fig. 2(a)], which allows the trained output matrix to map differential changes in $\mathbf{r}(t)$ to differential changes in $\mathbf{x}(t)$. Hence, not only does $W\mathbf{r}(t) = \mathbf{x}(t)$, but $W$ also learns
$$W\,\delta\mathbf{r}(t) = \Delta\mathbf{x}. \qquad (9)$$
The consequence of this differential learning is seen in the evolution of the perturbation of the autonomous feedback reservoir by substituting Eq. (9) into Eq. (8) to obtain
$$\frac{1}{\gamma}\,\delta\dot{\mathbf{r}}(t) = -\delta\mathbf{r}(t) + g'\big(A\mathbf{r}(t) + B\mathbf{x}(t) + \mathbf{d}\big) \odot \big(A\,\delta\mathbf{r}(t) + BW\,\delta\mathbf{r}(t)\big). \qquad (10)$$
If the training examples are close enough to learn the differential relation in Eq. (9), then any perturbed trajectory, $\delta\mathbf{r}(t)$, generated by Eq. (8) is a valid trajectory in the feedback system to linear order. Further, any scalar multiple of $\delta\mathbf{r}(t)$ is also a valid perturbed trajectory in the feedback system.
Hence, by training the output matrix to copy nearby examples—thereby learning the differential relation between $\delta\mathbf{r}(t)$ and $\Delta\mathbf{x}$—we encode scalar multiples of $\delta\mathbf{r}(t)$ as a linear subspace of valid perturbation trajectories. It is precisely this encoded subspace of valid perturbation trajectories that we call the higher-dimensional abstraction of the lower-dimensional input; in addition to the two-dimensional limit cycle input, the reservoir encodes the subspace comprising scalar multiples of the perturbation trajectory $\delta\mathbf{r}(t)$ as a third dimension. To visually represent this third dimension, we take the average of the perturbation vector across time as
$$\bar{\delta\mathbf{r}} = \frac{1}{T}\int_{0}^{T} \delta\mathbf{r}(t)\,\mathrm{d}t \qquad (11)$$
and project all of the autonomous reservoir trajectories along this vector. We then remove the projection of $\bar{\delta\mathbf{r}}$ from the autonomous reservoir time series, compute the first two principal components of this modified trajectory, and plot the projection against these two principal components, shown in Fig. 2(c). As can be seen, the shift in the limit cycle is encoded along the $\bar{\delta\mathbf{r}}$ direction. Graphically and numerically, we have confirmed our theoretical mechanism of abstraction using a continuous limit cycle memory.
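This visualization step can be sketched as a projection followed by a principal component analysis of the residual; here dR is a time-by-neuron array holding the perturbation trajectory and R_auto holds the autonomous reservoir time series (both names are ours):

```python
# Project the autonomous trajectory onto the time-averaged perturbation direction
# (Eq. (11)) and describe the remainder with its first two principal components.
import numpy as np

v = dR.mean(axis=0)                          # Eq. (11): time average of dr(t)
v = v / np.linalg.norm(v)                    # unit vector along the abstraction direction
proj = R_auto @ v                            # coordinate along that direction
resid = R_auto - np.outer(proj, v)           # remove the projection
resid = resid - resid.mean(axis=0)           # center before PCA
_, _, Vt = np.linalg.svd(resid, full_matrices=False)
pc = resid @ Vt[:2].T                        # first two principal components
# Plotting pc[:, 0], pc[:, 1] against proj gives a Fig. 2(c)-style view in which
# the translation of the limit cycle is encoded along the proj axis.
```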
IV. ABSTRACTION AS THE ACQUISITION OF A LYAPUNOV EXPONENT EQUAL TO ZERO
Now that we have a mechanism of abstraction, we seek a simple method to quantify this abstraction in higher-dimensional systems that do not permit an intuitive graphical representation [Fig. 2(c)]. In the chaotic Lorenz system with a fractal orbit [Fig. 3(a)], it can be difficult to visually determine whether the prediction output is part of a given input example or in between two input examples. Hence, we would like a measure of the presence of perturbations along a trajectory, $\delta\mathbf{r}(t)$, that neither grow nor shrink along the direction of linearly scaled perturbations. If these perturbations neither grow nor shrink, they represent a stable trajectory that neither collapses into another trajectory nor devolves into chaos.
To measure this abstraction, we compute the Lyapunov spectrum of the RC. Conceptually, the Lyapunov spectrum measures the stability of different trajectories along an attractor. It is computed by first generating an orbit along the attractor of a $D$-dimensional dynamical system, $\mathbf{x}(t)$, then by evaluating the Jacobian at every point along the orbit, $J(\mathbf{x}(t))$, and finally, by evolving orbits of infinitesimal perturbations, $\delta\mathbf{x}_i(t)$, along the time-varying Jacobian as
$$\delta\dot{\mathbf{x}}_i(t) = J(\mathbf{x}(t))\,\delta\mathbf{x}_i(t). \qquad (12)$$
Along these orbits, the direction and magnitude of $\delta\mathbf{x}_i(t)$ will change based on the linearly stable and unstable directions of the Jacobian $J$. To capture these changes along orthogonal directions, after each time step of evolution along Eq. (12), we order the perturbation vectors into a matrix $[\delta\mathbf{x}_1, \dots, \delta\mathbf{x}_D]$ and perform a Gram–Schmidt orthonormalization to obtain an orthonormal basis of perturbation vectors $\hat{\delta\mathbf{x}}_1, \dots, \hat{\delta\mathbf{x}}_D$. In this way, $\hat{\delta\mathbf{x}}_1$ eventually points along the least stable direction, $\hat{\delta\mathbf{x}}_2$ along the second least stable direction, and $\hat{\delta\mathbf{x}}_D$ along the most stable direction. The evolution of the three perturbation vectors of the Lorenz system is shown in Fig. 3(a).
To calculate the Lyapunov spectrum, we compute the projection of each normalized perturbation vector along the Jacobian as the Lyapunov exponent (LE) over time,37
$$\lambda_i(t) = \hat{\delta\mathbf{x}}_i(t)^{\top} J(\mathbf{x}(t))\,\hat{\delta\mathbf{x}}_i(t), \qquad (13)$$
and the final LE is given by the time average, $\lambda_i = \langle \lambda_i(t) \rangle_t$ [Fig. 3(b)]. Every continuous-time dynamical system with bounded, non-fixed-point dynamics has at least one zero Lyapunov exponent corresponding to a perturbation that neither grows nor shrinks on average [Fig. 3(b), red]. In a chaotic system like the Lorenz system, there is also a positive LE corresponding to an orthogonal perturbation that grows on average [Fig. 3(b), blue]. Finally, a negative LE corresponds to an orthogonal perturbation that decays on average [Fig. 3(b), yellow]. As can be seen in the plot of trajectories, the orbit of the negative LE is directed transverse to the plane that roughly defines the “wings” of the attractor such that any deviation from the plane of the wings quickly collapses back onto the wings [Fig. 3(a)].
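As a sketch of this procedure, the routine below evolves an orthonormal set of perturbations under the Jacobian [Eq. (12)], re-orthonormalizes with a QR (Gram-Schmidt) step after every time step, and time-averages the projections of Eq. (13). It is shown for the Lorenz system with textbook parameters (our assumption), but the same routine applies to the closed-loop reservoir once its Jacobian is supplied; it is a sketch, not the authors' exact code:

```python
# Lyapunov-spectrum sketch (Benettin-style): Eqs. (12) and (13) with Euler steps
# and QR re-orthonormalization after every step.
import numpy as np

def lorenz_rhs(s, sigma=10.0, r=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return np.array([sigma * (y - x), x * (r - z) - y, x * y - beta * z])

def lorenz_jac(s, sigma=10.0, r=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return np.array([[-sigma, sigma, 0.0],
                     [r - z, -1.0, -x],
                     [y, x, -beta]])

def lyapunov_spectrum(s, rhs, jac, n_exp, dt=1e-3, T=200.0, seed=0):
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((s.size, n_exp)))   # orthonormal perturbations
    lam_sum = np.zeros(n_exp)
    n_steps = int(round(T / dt))
    for _ in range(n_steps):
        J = jac(s)
        lam_sum += np.diag(Q.T @ (J @ Q))     # Eq. (13): instantaneous exponents
        s = s + dt * rhs(s)                   # advance the orbit (Euler, for brevity)
        Q = Q + dt * (J @ Q)                  # Eq. (12): advance the perturbations
        Q, _ = np.linalg.qr(Q)                # Gram-Schmidt re-orthonormalization
    return lam_sum / n_steps                  # time-averaged Lyapunov exponents

# Example: lyapunov_spectrum(np.array([1.0, 1.0, 20.0]), lorenz_rhs, lorenz_jac, 3)
# returns one positive, one near-zero, and one negative exponent, as in Fig. 3(b).
```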
Using the Lyapunov spectrum, we hypothesize that the reservoir’s abstraction of an attractor memory will appear as an additional LE equal to zero. This is because, through the training of nearby examples, the reservoir acquires a perturbation direction that neither grows nor decays on average, as all scalar multiples of $\delta\mathbf{r}(t)$ are valid perturbation trajectories to linear order according to Eq. (10). Hence, the acquisition of such a perturbation direction that neither grows nor decays should present itself as an additional LE equal to zero.
A. Abstraction depends on spectral radius and time constant
With this mechanism of abstraction in RCs, we provide a concrete implementation of our theory and study its limits. Our RCs in Eq. (1) and in Eq. (5) depend on several parameters; the spectral radius $\rho$ (the magnitude of the largest eigenvalue of $A$), the time constant $\gamma$, the bias term $\mathbf{d}$, the weighting of the input matrix $B$, and the number and spacing of the training examples all impact whether abstraction can successfully occur. We quantify the effect of varying these parameters on the RC’s ability to abstract different inputs via the Lyapunov spectrum analysis.
We focus on the parameters that determine the internal dynamics of the RC: $\gamma$ and $\rho$. The RC is a carefully balanced system whose internal speed is set by $\gamma$. If $\gamma$ is too small, the system is too slow to react to the inputs. Conversely, if $\gamma$ is too large, the system responds too quickly to retain a history of the input. Thus, we hypothesize that an intermediate $\gamma$ will yield optimal abstraction and that the optimal range of $\gamma$ will vary depending upon the time scale of the input. Similar to the time constant, the spectral radius $\rho$ is known to impact the success of the learning as it controls the excitability of the RC.15 For abstraction to succeed, the RC needs an intermediate $\gamma$ and $\rho$ to learn the input signals with an excitability and reaction speed suited for the input attractor memory.
To find the ideal parameter regime for abstracting a limit cycle attractor memory, we performed a parameter sweep on $\gamma$ from 2.5 to 25.0 in increments of 2.5 and on $\rho$ from 0.2 to 2.0 in increments of 0.2. All other parameters in the closed- and open-loop reservoir equations were held constant. To measure the success of the abstraction, we calculated the first four LEs of the RC, looking for values of $\lambda_1$ and $\lambda_2$ equal to 0 and values of $\lambda_3$ and $\lambda_4$ that are negative. For this continuous limit cycle memory, we report the best-performing combinations of $\gamma$ and $\rho$ in Fig. 4. Then, we tested the weighting of the input matrix, $B$, while holding $\gamma$, $\rho$, and all other parameters constant. We found that an optimal scaling of $B$ is between 0.001 and 0.1. We performed a similar test for the bias term, $\mathbf{d}$, resulting in an optimal scaling between 1.0 and 20.0. These parameter ranges demonstrate that a careful balance of $\gamma$ and $\rho$, along with $B$ and $\mathbf{d}$, is necessary to successfully achieve abstraction. More generally, our approach to defining these parameter ranges provides a principled method for future RCs to learn different attractor memories.
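Such a sweep can be organized as a simple grid loop; in the sketch below, build_reservoir, train_on_shifted_examples, and closed_loop_lyapunov are hypothetical wrappers around the steps sketched in Secs. II and III and the Lyapunov routine above, not functions from this work:

```python
# Grid sweep over the time constant gamma and the spectral radius rho; for each
# pair, rebuild and retrain the reservoir and record the leading Lyapunov
# exponents of the closed-loop system (abstraction for the limit cycle: two
# exponents near zero, the remainder negative).
import itertools
import numpy as np

gammas = np.arange(2.5, 25.0 + 1e-9, 2.5)
rhos = np.arange(0.2, 2.0 + 1e-9, 0.2)

results = {}
for gamma, rho in itertools.product(gammas, rhos):
    A, B, d = build_reservoir(rho)                           # hypothetical helper
    W = train_on_shifted_examples(A, B, d, gamma)            # hypothetical helper
    les = closed_loop_lyapunov(A, B, d, gamma, W, n_exp=4)   # hypothetical helper
    results[(gamma, rho)] = np.sort(les)[::-1]               # store LEs, largest first
```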
B. Abstraction of chaotic memories
While limit cycles provide an intuitive conceptual demonstration of abstraction, real neural networks such as the human brain learn more complex memories that involve a larger number of parameters, including natural and chaotic attractors such as weather phenomena36 or diffusion.38 Chaotic attractors pose a more complex memory for the reservoir to learn, so it is nontrivial to show that the reservoir is able to abstract from several chaotic attractors to learn a single continuous chaotic attractor. By again analyzing the Lyapunov spectrum, we can quantify successful abstraction. As seen in Fig. 3, a chaotic dynamical system is characterized by positive Lyapunov exponents. In the case of the Lorenz attractor, the first Lyapunov exponent is positive, the second is equal to zero, and the third is negative. Hence, when the RC learns a single Lorenz attractor, the first LE is positive, the second LE is zero, and the rest of the spectrum is increasingly negative. In the case of the successful and continuous abstraction of the Lorenz attractor, we expect to see that the first LE is positive, followed by not one, but two LEs equal to zero, followed by increasingly negative LEs.
To test this acquisition of an LE equal to zero, we trained the RC to learn multiple chaotic attractor memories, focusing on the Lorenz attractor. To find the ideal parameter regime for learning a continuous Lorenz attractor memory from many discrete examples, we again performed a parameter sweep on $\gamma$ from 2.5 to 25.0 in increments of 2.5 and on $\rho$ from 0.2 to 2.0 in increments of 0.2. All other parameters in the closed- and open-loop reservoir equations were held constant. To measure the success of abstraction, we calculated the first five LEs of the RC, looking for the value of $\lambda_1$ to be positive, the values of $\lambda_2$ and $\lambda_3$ to equal zero, and the values of $\lambda_4$ and $\lambda_5$ to be increasingly negative. For this continuous Lorenz attractor memory, we found that the best parameter regime used values of $\rho$ from 0.6 to 1.2 and a $\gamma$ of 25, as seen in Fig. 5. Hence, we demonstrate that in addition to simple limit cycle attractors, RCs can successfully abstract much more complex and unstable chaotic attractor memories, demonstrating the generalizability of our theory.
V. DISCUSSION
Reservoir computing has been gaining substantial traction, and significant advances have been made in many domains of application. Among them are numerical advances in adaptive rules for training reservoirs using evolutionary algorithms39,40 and neurobiologically inspired spike-timing-dependent plasticity.41 In tandem, physical implementations of reservoir computing in photonic,42–44 memristive,45 and neuromorphic46 systems provide low-power alternatives to traditional computing hardware. Each application is accompanied by its own unique set of theoretical considerations and limitations,47 thereby emphasizing the need for underlying analytical mechanisms that support meaningful generalizations across such a wide range of systems.
In this work, we provide such a mechanism for the abstraction of a continuum of attractor memories from discrete examples and put forth the acquisition of an additional zero Lyapunov exponent as a quantitative measure of success. Compared to prior work,34 we remove the external control parameter, thereby moving beyond the translation of a $d$-dimensional attractor to generating a higher, $(d+1)$-dimensional representation of the attractor from $d$-dimensional inputs. Moreover, the method can be applied to the learning of any chaotic attractor memory due to the generality of the differential mechanism of learning we uncover. While our investigation simplifies the complexity of the network used and the memories learned, we show that the underlying mechanism of abstraction remains the same as we increase the complexity of the memory learned (e.g., discrete to continuous and non-chaotic to chaotic).
Our work motivates several new avenues of inquiry. First, it would be of interest to examine the theoretical and numerical mechanism for abstracting more complex transformations. Second, it would be of interest to embark on a systematic study of the spacing between the discrete examples that is necessary to learn a differential attractor vs discrete attractors and the phase transition of abstraction. Third and finally, ongoing and future efforts could seek to determine the role of noise in both the RC and input dynamics for abstracting high-dimensional continuous attractors from scattered low-dimensional and discrete attractors. Because different RNNs are better suited to learn and abstract different inputs, we expect that this work will shed light on studies that reveal how one can design specialized RNNs for better abstraction on particular dynamical attractors.
VI. CONCLUSION
Here, we show that an RC can successfully learn time-varying attractor memories. We demonstrate this process with both limit cycle and Lorenz attractor inputs. We then show the RC several discrete examples of these attractors, translated from each other by a small distance. We find that the neural network is able to abstract to a higher dimension and learn a continuous attractor memory that connects all of the discrete examples together. This process of abstraction can be quantified by the acquisition of an additional exponent equal to zero in the Lyapunov spectrum of the RC’s dynamics. Our discovery has important implications for future improvements in the algorithms and methods used in machine learning, due specifically to the understanding gained from using this simpler model. More broadly, our findings provide new hypotheses regarding how humans construct abstractions from real-world inputs to their neural networks.
ACKNOWLEDGMENTS
L.M.S. acknowledges support from the University Scholars Program at the University of Pennsylvania. J.Z.K. acknowledges support from the NIH (No. T32-EB020087), PD: Felix W. Wehrli, and the National Science Foundation Graduate Research Fellowship (No. DGE-1321851). D.S.B. acknowledges support from the NSF through the University of Pennsylvania Materials Research Science and Engineering Center (MRSEC) (No. DMR-1720530), as well as the Paul G. Allen Family Foundation, and a grant from the Army Research Office (No. W911NF-16-1-0474). The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts of interest to disclose.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
CITATION DIVERSITY STATEMENT
We would like to include a citation diversity statement following a recent proposal.48 Recent work in several fields of science has identified a bias in citation practices such that papers from women and other minority scholars are under-cited relative to the number of such papers in the field.49–57 Here, we sought to proactively consider choosing references that reflect the diversity of the field in thought, form of contribution, gender, race, ethnicity, and other factors. First, we obtained the predicted gender of the first and last author of each reference by using databases that store the probability of a first name being carried by a woman.53,58 By this measure (and excluding self-citations to the first and last authors of our current paper), our references contain 6.67% woman(first)/woman(last), 16.4% man/woman, 13.33% woman/man, and 63.6% man/man. This method is limited in that (a) names, pronouns, and social media profiles used to construct the databases may not, in every case, be indicative of gender identity and (b) it cannot account for intersex, non-binary, or transgender people. Second, we obtained predicted racial/ethnic category of the first and last author of each reference by databases that store the probability of a first and last name being carried by an author of color.59,60 By this measure (and excluding self-citations), our references contain 15.94% author of color (first)/author of color (last), 20.06% white author/author of color, 20.17% author of color/white author, and 43.83% white author/white author. This method is limited in that (a) names and Florida Voter Data to make the predictions may not be indicative of racial/ethnic identity, and (b) it cannot account for Indigenous and mixed-race authors, or those who may face differential biases due to the ambiguous racialization or ethnicization of their names. We look forward to future work that could help us to better understand how to support equitable practices in science.
APPENDIX: SHIFT MAGNITUDE
In this work, we used a shift vector $\Delta\mathbf{x}$ in Eq. (6) to control where the translated examples that were shown to the RC were located. We used a magnitude of 0.001 for the vector $\Delta\mathbf{x}$ in order to produce both continuous LC and Lorenz attractors for Figs. 4 and 5. For Fig. 2, we used a magnitude of 0.01 for the continuous LC attractor. For the continuous LC, the direction of $\Delta\mathbf{x}$ was along the $x = y$ line, so each example shown to the reservoir was translated by the shift magnitude in both the $x$ and $y$ directions. For the continuous Lorenz attractor, a similar procedure was used but extended to the additional third dimension; that is, the direction of $\Delta\mathbf{x}$ was along the $x = y = z$ line such that each example shown to the reservoir was translated by the shift magnitude in the $x$, $y$, and $z$ directions.
Figure 6 shows the relationship between the magnitude of $\Delta\mathbf{x}$ and the second LE, which for successful abstraction should be as close to 0 as possible. This figure was created by training on 21 LC examples, each shifted by the indicated shift magnitude, with fixed $\gamma$ and $\rho$; the LE spectrum was then calculated after the prediction phase. We also note that, due to the reservoir’s need to learn the differential relationship between multiple training examples, as outlined in the section entitled “Differential mechanism of learning,” the magnitude of $\Delta\mathbf{x}$ must be small for best results.
Here, in Fig. 7, we show a comparison of different values of the spectral radius $\rho$ at the network sparsity used in this work (i.e., 2% dense) and its effect on the Lyapunov exponents of the reservoir. The time constant $\gamma$ was set to a fixed value, and the reservoir was evolved for 50 time units as a transient period, ensuring that the reservoir moved far enough away from the random initial condition that transient orbits could be ignored. Then, the reservoir was evolved for 100 time units, and its Lyapunov exponents were calculated.