We articulate the design imperatives for machine learning based digital twins for nonlinear dynamical systems, which can be used to monitor the "health" of the system and anticipate future collapse. The fundamental requirement for digital twins of nonlinear dynamical systems is dynamical evolution: the digital twin must be able to evolve its dynamical state at the present time to the next time step without further state input—a requirement that reservoir computing naturally meets. We conduct extensive tests using prototypical systems from optics, ecology, and climate, where the respective specific examples are a chaotic laser system, a model of phytoplankton subject to seasonality, and the Lorenz-96 climate network. We demonstrate that, with a single or parallel reservoir computer, the digital twins are capable of a variety of challenging forecasting and monitoring tasks. Our digital twin has the following capabilities: (1) extrapolating the dynamics of the target system to predict how it may respond to a changing dynamical environment, e.g., a driving signal that it has never experienced before, (2) performing continual forecasting and monitoring with sparse real-time updates under non-stationary external driving, (3) inferring hidden variables in the target system and accurately reproducing/predicting their dynamical evolution, (4) adapting to external driving of different waveforms, and (5) extrapolating the global bifurcation behaviors to network systems of different sizes. These features make our digital twins appealing in applications, such as monitoring the health of critical systems and forecasting their potential collapse induced by environmental changes or perturbations. Such a system can be an infrastructure, an ecosystem, or a regional climate system.
Digital twins have attracted much attention recently in many fields. This article develops machine learning based digital twins for nonlinear dynamical systems subject to external forcing. There are two assumptions underlying the situation of interest. First, the governing equations are not known, and only measured time series from the system are available. Second, none of the currently available sparse optimization methods for discovering the system equations from data is applicable. The goal is to generate a replica or twin of the system with reservoir computing machines, a class of recurrent neural networks. Considering that, in nonlinear dynamical systems, various bifurcations leading to chaos and system collapse can take place, a basic requirement for the digital twin is its ability to generate statistically correct evolution of the system and accurate prediction of the bifurcations, especially the critical bifurcations that are associated with catastrophic behaviors. A digital twin so designed can have significant applications, such as monitoring the “health” of the target system in real time and providing early warnings of critical bifurcations or events. In terms of predictive problem solving, successful forecasting of a system collapse in the future by the digital twin makes it possible to devise an optimal control strategy as an early intervention to prevent the collapse. What machine-learning scheme can be exploited for constructing digital twins for nonlinear dynamical systems? Recall the basic property of any dynamical system: its state naturally evolves forward in time according to a set of rules which, mathematically, can be described by a set of differential equations or discrete-time maps. The basic requirement for a digital twin is then that it must be able to evolve forward in time without any state input. 
Reservoir computing, because of its ability to execute closed-loop, dynamical self-evolution with memory, stands out as the suitable choice for meeting this requirement. This article demonstrates that, with a single or parallel reservoir computer, the digital twin is capable of challenging forecasting and monitoring tasks for prototypical systems from climate, optics, and ecology. It is also shown that the digital twin is capable of the following tasks: extrapolating the dynamics of the target system to external driving signals that it has never experienced before, making continual forecasting/monitoring with sparse real-time updates under nonstationary external driving, inferring hidden variables and accurately predicting their dynamical evolution, adapting to different forms of external driving, and extrapolating the global bifurcations to systems of different sizes. These features make the reservoir computing based digital twins appealing in significant applications, such as monitoring the health of critical systems and forecasting their potential collapse induced by environmental changes.
I. INTRODUCTION
The concept of digital twins originated from aerospace engineering for aircraft structural life prediction.1 In general, a digital twin can be used for predicting and monitoring dynamical systems and for devising solutions to emergent behaviors that can potentially be catastrophic.2 Digital twins have attracted much attention from a wide range of fields,3 including medicine and health care.4,5 For example, the idea of developing medical digital twins in viral infection through a combination of mechanistic knowledge, observational data, medical histories, and artificial intelligence has been proposed recently,6 which can potentially lead to a powerful addition to the existing tools to combat future pandemics. In a more dramatic development, the European Union plans to fund the development of digital twins of Earth for its green transition.7,8
The physical world is nonlinear. Many engineering systems, such as complex infrastructural systems, are governed by nonlinear dynamical rules too. In nonlinear dynamics, various bifurcations leading to chaos and system collapse can take place.9 For example, in ecology, environmental deterioration caused by global warming can lead to slow parameter drift toward chaos and species extinction.10,11 In an electrical power system, voltage collapse can occur after a parameter shift that lands the system in transient chaos.12 Various climate systems in different geographic regions of the world are also nonlinear and the emergent catastrophic behaviors as a result of increasing human activities are of grave concern. In all these cases, it is of interest to develop a digital twin of the system of interest to monitor its “health” in real time as well as for predictive problem solving in the sense that, if the digital twin indicates a possible system collapse in the future, proper control strategies should and can be devised and executed in time to prevent the collapse.
What does it take to create a digital twin for a nonlinear dynamical system? For natural and engineering systems, there are two general approaches: one is based on mechanistic knowledge and another is based on observational data. In principle, if the detailed physics of the system is well understood, it should be possible to construct a digital twin through mathematical modeling. However, there are two difficulties associated with this modeling approach. First, a real-world system can be high-dimensional and complex, preventing the rules governing its dynamical evolution from being known at a sufficiently detailed level. Second, the hallmark of chaos is sensitive dependence on initial conditions. Because no mathematical model of the underlying physical system can be perfect, the small deviations and high dimensionality of the system coupled with environmental disturbances can cause the model predictions of the future state of the system to be inaccurate and completely irrelevant.13,14 These difficulties motivate the proposition that the data-based approach can have advantages in many realistic scenarios and a viable method to develop a digital twin is through data. While in certain cases, approximate system equations can be found from data through sparse optimization,15–17 the same difficulties with the modeling approach arise. These considerations have led us to exploit machine learning to create digital twins for nonlinear dynamical systems.
What machine-learning scheme is suitable for digital twins of nonlinear dynamical systems? The very basic characteristic of any dynamical system is its ability to evolve over time. That is, from an initial state, the system evolves in time by generating the state at the next time step according to a set of dynamical rules. Mathematically, these rules can be described by a set of differential equations in continuous time or maps in discrete time. Given a nonlinear dynamical system, the basic requirement is then that its digital twin must also be a dynamical system capable of self-evolution with memory. Reservoir computing,18–20 a kind of recurrent neural network, naturally stands out as a suitable candidate, which in recent years has been extensively studied for predicting chaotic systems.21–42 More specifically, with observational data as input, a reservoir computer can be trained. The training phase corresponds to open-loop operation because of the external input. After training, the output of the reservoir computer is connected to its input, creating a closed loop that enables the neural network to update or evolve its state in time, without input data. A properly trained reservoir computer can then follow the evolution of the target dynamical system, acting as its digital twin. Another advantage of reservoir computing is that no backpropagation is needed for optimizing the parameters—only a linear regression is required in the training, so it is computationally efficient. A common situation is that the target system is subject to external driving, such as a driven laser, a regional climate system, or an ecosystem under external environmental disturbances. Accordingly, the digital twin must accommodate a mechanism to control or steer the dynamics of the neural network to account for external driving. Introducing a control mechanism distinguishes our work from existing ones in the literature of reservoir computing as applied to nonlinear dynamical systems. 
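The open-loop training and closed-loop self-evolution described above can be sketched in a few lines of Python. This is a minimal illustration only: the dimensions, hyperparameter values, and sinusoidal stand-in data below are placeholders, not the optimized configurations used in this work.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Toy dimensions and hyperparameters (illustrative, not optimized values).
N_r, D_in = 300, 3          # reservoir size, input dimension
alpha, beta = 0.3, 1e-6     # leakage factor, ridge regularization

# Random, fixed matrices: reservoir, input, and control couplings.
W_r = rng.uniform(-1, 1, (N_r, N_r)) * (rng.random((N_r, N_r)) < 0.02)
W_r *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_r)))   # set spectral radius
W_in = rng.uniform(-1, 1, (N_r, D_in))
W_c = rng.uniform(-1, 1, (N_r, 1))

def step(r, u, f):
    """One leaky-tanh update driven by state input u and control signal f."""
    return (1 - alpha) * r + alpha * np.tanh(W_r @ r + W_in @ u + W_c @ f)

# Open-loop (training) pass: the measured time series drives the reservoir.
T = 2000
u_train = np.sin(np.linspace(0, 40, T))[:, None] * np.ones((1, D_in))  # stand-in data
f_train = np.ones((T, 1))                                              # constant driving
r = np.zeros(N_r)
states = np.empty((T - 1, N_r))
targets = np.empty((T - 1, D_in))
for t in range(T - 1):
    r = step(r, u_train[t], f_train[t])
    states[t] = r
    targets[t] = u_train[t + 1]          # one-step-ahead prediction target

# Ridge regression determines the only trained matrix, W_out.
W_out = np.linalg.solve(states.T @ states + beta * np.eye(N_r),
                        states.T @ targets).T

# Closed loop (digital twin): feed the output back as the next state input,
# keeping the external control channel f as the only remaining input.
u = u_train[-1]
for t in range(100):
    r = step(r, u, f_train[-1])
    u = W_out @ r                        # self-evolution, no state input
```

The key structural point is visible in the last loop: after training, the state-input channel is replaced by the network's own output, while the control channel remains external.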
Of particular interest is whether the collapse of the target chaotic system can be anticipated from the digital twin. The purpose of this paper is to demonstrate that the digital twin so created can accurately produce the bifurcation diagram of the target system and faithfully mimic its dynamical evolution from a statistical point of view. The digital twin can then be used to monitor the present and future “health” of the system. More importantly, with proper training from observational data, the twin can reliably anticipate system collapses, providing early warnings of potentially catastrophic failures of the system.
More specifically, using three prototypical systems from optics, ecology, and climate, respectively, we demonstrate that the reservoir computing based digital twins developed in this paper solve the following challenging problems: (1) extrapolation of the dynamical evolution of the target system into certain "uncharted territories" of the environmental conditions, i.e., driving signals that it has never experienced before, (2) long-term continual forecasting of nonlinear dynamical systems subject to non-stationary external driving with sparse state updates, (3) inference of hidden variables in the system and accurate prediction of their dynamical evolution into the future, (4) adaptation to external driving of different waveforms, and (5) extrapolation of the global bifurcation behaviors to network systems of different sizes. These features make our digital twins appealing in applications.
II. PRINCIPLE OF RESERVOIR COMPUTING BASED DIGITAL TWIN
The basic construction of the digital twin of a nonlinear dynamical system44 is illustrated in Fig. 1. It is essentially a recurrent reservoir computing neural network with a control mechanism, which requires two types of input signals: the observational time series u(t) for training and the control signal f(t) that remains in both the training and self-evolving phases. The hidden layer hosts a random or complex network of artificial neurons. During the training, the hidden recurrent layer is driven by both the input signal u(t) and the control signal f(t). The neurons in the hidden layer generate a high-dimensional nonlinear response signal. Linearly combining all the responses of these hidden neurons with a set of trainable and optimizable parameters yields the output signal. Specifically, the digital twin consists of four components: (i) an input subsystem that maps the low-dimensional (D_in) input signal into the high-dimensional (N_r) state space of the reservoir through the weighted input matrix W_in, (ii) a reservoir network of N_r neurons characterized by W_r, a weighted network matrix of dimension N_r × N_r, where N_r ≫ D_in, (iii) a readout subsystem that converts the N_r-dimensional signal from the reservoir network into a D_out-dimensional output signal through the output weighted matrix W_out, and (iv) a controller with the matrix W_c. The matrix W_r defines the structure of the reservoir neural network in the hidden layer, where the dynamics of each node are described by an internal state and a nonlinear hyperbolic tangent activation function.
The matrices W_in, W_r, and W_c are generated randomly prior to training, whereas all elements of W_out are to be determined through training. Specifically, the state updating equations for the training and self-evolving phases are, respectively,

r(t + Δt) = (1 − α) r(t) + α tanh[W_r r(t) + W_in u(t) + W_c f(t)],  (1)

r(t + Δt) = (1 − α) r(t) + α tanh[W_r r(t) + W_in W_out r'(t) + W_c f(t)],  (2)

where r(t) is the hidden state, u(t) is the vector of input training data, Δt is the time step, the vector r'(t) is defined to be r' = [r_1, r_2^2, r_3, r_4^2, ...]^T for a vector r = [r_1, r_2, r_3, r_4, ...]^T, and α is the leakage factor. During the training, several trials of data are typically used under different driving signals so that the digital twin can "sense, learn, and mingle" the responses of the target system to gain the ability to extrapolate a response to a new driving signal that has never been encountered before. We input these trials of training data, i.e., a few pairs of u(t) and the associated driving f(t), through the matrices W_in and W_c sequentially. Then, we record the state vector of the neural network during the entire training phase as a matrix R. We also record all the desired outputs, which are the one-step prediction results v(t) = u(t + Δt), as a matrix V. To make the readout nonlinear and to avoid unnecessary symmetries in the system,24,45 we change the matrix R into R' by squaring the entries of even dimensions in the states of the hidden layer. [The vector r'(t) in Eq. (2) is defined in a similar way.] We carry out a linear regression between V and R', with an l2 regularization coefficient β, to determine the readout matrix,

W_out = V R'^T (R' R'^T + β I)^(-1),  (3)

where I is the identity matrix.
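The even-dimension squaring and the regularized regression for the readout amount to only a few lines of linear algebra. The following sketch uses arbitrary placeholder dimensions and random stand-in data for the recorded state matrix R and target matrix V; only the algebraic structure matters here.

```python
import numpy as np

rng = np.random.default_rng(0)
N_r, T, D_out = 50, 400, 2
R = rng.standard_normal((N_r, T))     # recorded hidden states (columns = time)
V = rng.standard_normal((D_out, T))   # desired one-step-ahead outputs
beta = 1e-4                           # l2 regularization coefficient

# Break unwanted symmetry: square the even dimensions of R to obtain R'.
# With 1-based indexing of dimensions, these are Python rows 1, 3, 5, ...
Rp = R.copy()
Rp[1::2, :] = Rp[1::2, :] ** 2

# Ridge (Tikhonov-regularized) regression:
# W_out = V R'^T (R' R'^T + beta I)^(-1)
W_out = V @ Rp.T @ np.linalg.inv(Rp @ Rp.T + beta * np.eye(N_r))
```

In practice one would use a linear solve rather than an explicit inverse, but the formula above mirrors the regression expression directly.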
To achieve acceptable learning performance, optimization of the hyperparameters is necessary. Four widely used global optimization methods are the genetic algorithm,46–48 particle swarm optimization,49,50 Bayesian optimization,51,52 and surrogate optimization.53–55 We use surrogate optimization (the algorithm surrogateopt in Matlab). The hyperparameters that are optimized include d (the average degree of the recurrent network in the hidden layer), λ (the spectral radius of the recurrent network), k_in (the scaling factor of W_in), k_c (the scaling factor of W_c), α (the leakage factor), and β (the l2 regularization coefficient). The neural network is validated using the same driving settings as in the training phase, but driving signals with different amplitudes and frequencies are used in the testing phase. Prior to making predictions, the neural network is initialized using random short segments of the training data, so no data from the target system under the testing driving signals are required. To produce the bifurcation diagram, sufficiently long transients in the dynamical evolution of the neural network are disregarded.
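The overall shape of such a hyperparameter search can be sketched as follows. This is not the surrogateopt algorithm itself: random search stands in for surrogate optimization, and the objective below is a placeholder; a real objective would run the open-loop training and closed-loop validation described above and return the validation error.

```python
import numpy as np

rng = np.random.default_rng(7)

# Search ranges for the six hyperparameters (illustrative bounds only).
bounds = {
    "d": (2, 10),         # average degree of the reservoir network
    "lam": (0.2, 1.5),    # spectral radius
    "k_in": (0.1, 2.0),   # scaling of W_in
    "k_c": (0.1, 2.0),    # scaling of W_c
    "alpha": (0.05, 1.0), # leakage factor
    "log_beta": (-8, -2), # log10 of the l2 regularization coefficient
}

def validation_error(hp):
    """Placeholder objective: in a real run, train a twin with
    hyperparameters hp and return its closed-loop validation error."""
    return sum((v - 0.5 * (lo + hi)) ** 2
               for (lo, hi), v in zip(bounds.values(), hp.values()))

# Random search stands in for surrogate optimization here.
best_hp, best_err = None, np.inf
for _ in range(200):
    hp = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
    err = validation_error(hp)
    if err < best_err:
        best_hp, best_err = hp, err
```

Surrogate optimization differs from this sketch in that it fits a cheap model of the objective to propose promising candidates, which matters when each evaluation requires a full training run.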
III. RESULTS
For clarity, we present results on the digital twin for a prototypical nonlinear dynamical system with an adjustable phase-space dimension: the Lorenz-96 climate network model.56 In the Appendixes, we present two additional examples: a chaotic laser (Appendix A) and a driven ecological system (Appendix B), together with a number of pertinent issues.
A. Low-dimensional Lorenz-96 climate network and its digital twin
The Lorenz-96 system56 is an idealized atmospheric climate model. Mathematically, the toy climate system is described by m coupled first-order nonlinear differential equations subject to external periodic driving f(t):

dx_i/dt = x_{i−1} (x_{i+1} − x_{i−2}) − x_i + f(t),
where i = 1, ..., m is the spatial index. Under the periodic boundary condition, the m nodes constitute a ring network, where each node is coupled to three neighboring nodes. To be concrete, we first treat a low-dimensional network (more complex high-dimensional cases are treated below). The driving force is sinusoidal with a bias F: f(t) = A sin(ωt) + F. We fix ω and F and use the forcing amplitude A as the bifurcation parameter. For relatively large values of A, the system exhibits chaotic behaviors, as exemplified in Fig. 2(a1). Quasi-periodic dynamics arise for smaller values of A, as exemplified in Fig. 2(a2). As A decreases from a large value, a critical transition from chaos to quasi-periodicity occurs at a critical value A_c. We train the digital twin with time series from four values of A, all in the chaotic regime. For each value of A in the training set, the validation length corresponds to approximately five Lyapunov times; the size of the random reservoir network, the training, validation, and warming-up lengths, the time step Δt of the reservoir dynamical evolution, and the hyperparameter values (please refer to Sec. II for their meanings) are set through the optimization procedure of Sec. II. Our computations reveal that, for the deterministic version of the Lorenz-96 model, it is difficult to reduce the validation error below a small threshold. However, adding an appropriate amount of noise into the training time series18 can lead to smaller validation errors. We add additive Gaussian noise to each input data channel of the reservoir network [including the driving channel f(t)], treating the noise amplitude σ_noise as an additional hyperparameter to be optimized. For the toy climate system, we test several noise levels and find the optimal level that gives the best validation performance.
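For readers who wish to generate ground-truth trajectories, the driven Lorenz-96 ring can be integrated with a standard fourth-order Runge-Kutta scheme. The parameter values below (m, A, F, ω, time step) are illustrative placeholders, not the exact settings used to produce the figures.

```python
import numpy as np

def lorenz96_rhs(x, f_t):
    """dx_i/dt = x_{i-1} (x_{i+1} - x_{i-2}) - x_i + f(t), ring topology."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + f_t

def simulate(m=6, A=2.5, F=2.0, omega=2.0, dt=0.01, steps=20000):
    """RK4 integration of the sinusoidally driven Lorenz-96 ring."""
    x = np.random.default_rng(2).standard_normal(m) * 0.1
    traj = np.empty((steps, m))
    f = lambda s: A * np.sin(omega * s) + F    # driving f(t) = A sin(wt) + F
    for n in range(steps):
        t = n * dt
        k1 = lorenz96_rhs(x, f(t))
        k2 = lorenz96_rhs(x + 0.5 * dt * k1, f(t + 0.5 * dt))
        k3 = lorenz96_rhs(x + 0.5 * dt * k2, f(t + 0.5 * dt))
        k4 = lorenz96_rhs(x + dt * k3, f(t + dt))
        x = x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[n] = x
    return traj
```

The np.roll construction implements the periodic boundary condition directly, so the same code serves the low- and high-dimensional cases by changing m.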
Figures 2(b1) and 2(b2) show the dynamical behaviors generated by the digital twin for the same values of A as in Figs. 2(a1) and 2(a2), respectively. Not only does the digital twin produce the correct dynamical behavior in the same chaotic regime where the training is carried out, it can also extrapolate beyond the training parameter regime to correctly predict the unseen system dynamics there (quasiperiodicity in this case). To provide support over a broader parameter range, we calculate the true bifurcation diagram, as shown in Fig. 2(c), where the four vertical dashed lines indicate the four training values of A. The bifurcation diagram generated by the digital twin is shown in Fig. 2(d), which agrees reasonably well with the true diagram. Note that the digital twin fails to predict one periodic window, due to its high period (period 21; see Appendix H for a discussion). To quantify the prediction performance, we examine the smallest simply connected region that encloses the entire attractor (the spanned region) and calculate the overlapping ratio of the true and predicted spanned regions. Figure 2(e) shows the relative error of the spanned regions (RESR) vs A, where the spanned regions are calculated from a two-dimensional projection of the attractor. Except at the locations of two periodic windows, the RESR is within 4%. When the testing values of A are farther away from the training values, the RESR tends to increase.
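One plausible way to realize such a spanned-region comparison (the exact construction used for the figures may differ in detail) is to rasterize the two-dimensional attractor projections into boolean occupancy grids and measure their mismatch:

```python
import numpy as np

def spanned_region(xy, bins=50, box=((-5, 10), (-5, 10))):
    """Boolean occupancy grid of a 2D attractor projection."""
    H, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=bins, range=box)
    return H > 0

def resr(xy_true, xy_pred, **kw):
    """Relative error of spanned regions:
    symmetric-difference area divided by the true area."""
    S_true = spanned_region(xy_true, **kw)
    S_pred = spanned_region(xy_pred, **kw)
    return np.logical_xor(S_true, S_pred).sum() / S_true.sum()
```

Identical attractors give a value of zero, and completely disjoint ones give a value of one (or larger, if the predicted region is much bigger than the true one), so small values indicate a good geometric match.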
Previously, it was suggested that reservoir computing can have a certain degree of extrapolability.34–39 Figure 2 represents an example where the target system’s response is extrapolated to external sinusoidal driving with unseen amplitudes. In general, extrapolation is a difficult problem. Some limitations of the extrapolability with respect to the external driving signal are discussed in Appendix A, where the digital twin can predict the crisis point but cannot extrapolate the asymptotic behavior after the crisis.
In the following, we systematically study the applicability of the digital twin in solving forecasting problems in more complicated situations than the basic settings demonstrated in Fig. 2. The issues to be addressed are high dimensionality, the effect of the waveform of the driving on forecasting, and the generalizability across Lorenz-96 networks of different sizes. Results of continual forecasting and inferring hidden dynamical variables using only rare updates of the observable are presented in Appendixes C and D.
B. Digital twins of parallel reservoir-computing neural networks for high-dimensional Lorenz-96 climate networks
We extend the methodology of digital twins to high-dimensional Lorenz-96 climate networks, e.g., networks of m = 20 nodes. To deal with such a high-dimensional target system, if a single reservoir computer were used, the required size of the neural network in the hidden layer would be too large to be computationally efficient. We, thus, turn to the parallel configuration25 that consists of many small neural networks, each "responsible" for a small part of the target system. For the Lorenz-96 network with m = 20 coupled nodes, our digital twin consists of ten parallel neural networks, each monitoring and forecasting the dynamical evolution of two nodes. Because each node in the Lorenz-96 network is coupled to three nearby nodes, each neural network also receives the states of the neighboring nodes of the pair it predicts, ensuring that sufficient information is supplied to it.
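The bookkeeping for the parallel configuration, i.e., which node states feed each small reservoir, can be sketched as follows. The overlap widths (two nodes to the left, one to the right) are chosen here to cover the x_{i−2}, x_{i−1}, and x_{i+1} dependencies of the Lorenz-96 coupling; they are an assumption of this sketch rather than a quoted setting.

```python
def input_groups(m=20, n_per=2, n_left=2, n_right=1):
    """For each parallel reservoir responsible for n_per consecutive nodes
    on the ring, collect those nodes plus the neighboring nodes whose
    states enter their equations of motion.

    Returns a list of (own_nodes, input_nodes) index pairs."""
    groups = []
    for start in range(0, m, n_per):
        own = [(start + j) % m for j in range(n_per)]
        left = [(start - j) % m for j in range(1, n_left + 1)]
        right = [(start + n_per - 1 + j) % m for j in range(1, n_right + 1)]
        groups.append((own, sorted(set(own + left + right))))
    return groups
```

Each parallel reservoir then predicts only its own nodes while being driven by the slightly larger input group, and the external control channel f(t) is shared by all of them.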
The specific parameters of the digital twin, namely, the size of the recurrent layer, the training, validation, and "warming-up" lengths for each training value of the forcing amplitude A, the time step of the dynamical evolution of the digital twin, and the hyperparameter values, are optimized in the same way as described in Sec. II.
The periodic signal used to drive the Lorenz-96 climate network of 20 nodes is again of the form f(t) = A sin(ωt) + F with fixed ω and F. The structure of the digital twin, consisting of ten small neural networks, is illustrated in Fig. 3(a). Figures 3(b1) and 3(b2) show a chaotic and a periodic attractor, respectively, projected onto a plane of two dynamical variables. Training of the digital twin is conducted by using four time series from four different values of A, all in the chaotic regime. The attractors generated by the digital twin for the same two values of A are shown in Figs. 3(c1) and 3(c2), respectively, which agree well with the ground truth. Figure 3(d) shows the bifurcation diagram of the target system (the ground truth), where the four training values of A, including 2.2, 2.6, and 3.0, are indicated by the four respective vertical dashed lines. The bifurcation diagram generated by the digital twin is shown in Fig. 3(e), which agrees well with the ground truth in Fig. 3(d). Figure 3(f) shows the relative error RESR vs A, where a peak occurs at the mismatched ending point of the large periodic window.
C. Digital twins under external driving with varied waveform
The external driving signal is an essential ingredient in our articulation of the digital twin, and it is particularly relevant to critical systems of interest such as climate systems. In applications, the mathematical form of the driving signal may change with time. Can a digital twin produce the correct system behavior under a driving signal that is different from the one it has "seen" during the training phase? Note that, in the examples treated so far, it has been demonstrated that our digital twin can extrapolate the dynamical behavior of a target system under a driving signal of the same mathematical form but with a different amplitude. Here, the task is more challenging because the form of the driving signal has changed.
As a concrete example, we consider the low-dimensional Lorenz-96 climate network studied in Sec. III A, where a digital twin is trained with a purely sinusoidal signal f(t) = A sin(ωt) + F, as illustrated in the left column of Fig. 4(a). During the testing phase, the driving signal has the form of the sum of two sinusoidal signals with different frequencies, as illustrated in the right panel of Fig. 4(a). We fix all parameters of the testing driving signal except one amplitude, which serves as the bifurcation parameter. The parameter setting for reservoir computing is the same as that in Fig. 2. We find that, with training data from the purely deterministic system, the digital twin fails to generate an accurate bifurcation diagram, but a small amount of dynamical noise applied to the target system improves its performance. To demonstrate this, we apply an additive noise term to the driving signal in the training phase: f(t) → f(t) + ε ξ(t), where ξ(t) is a Gaussian white noise of zero mean and unit variance and ε is the noise amplitude. We use the second-order Heun method57 to solve the stochastic differential equations describing the target Lorenz-96 system. Intuitively, the noise serves to excite different modes of the target system, instilling richer information into the training time series and making the process of learning the target dynamics more effective. Figures 4(b) and 4(c) show the actual and the digital-twin generated bifurcation diagrams, respectively. Although the digital twin encountered driving signals in a completely "uncharted territory," it is still able to generate the bifurcation diagram with reasonable accuracy. The added dynamical noise creates small fluctuations in the driving signal f(t), which can excite richer dynamical features of the target system in the training data set; the neural network can learn these features, helping it adapt to different waveforms in the testing phase. Additional results with varied testing waveforms are presented in Appendix E.
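A generic one-step Heun (predictor-corrector) update for a system with an additive stochastic term can be sketched as follows. In our setting the noise enters through the driving signal f(t), i.e., the same increment shifts every node; the sketch below allows an independent increment per component and is meant only to illustrate the structure of the scheme, in particular that the same Brownian increment is used in the predictor and the corrector.

```python
import numpy as np

def heun_sde_step(x, t, dt, drift, eps, rng):
    """One Heun step for dx = drift(x, t) dt + eps dW.

    The Brownian increment dW is drawn once and reused in both the
    predictor and the corrector stages, as required by the scheme."""
    dW = rng.standard_normal(x.shape) * np.sqrt(dt)
    x_pred = x + drift(x, t) * dt + eps * dW                     # predictor
    return x + 0.5 * (drift(x, t) + drift(x_pred, t + dt)) * dt + eps * dW
```

With eps = 0 the scheme reduces to the deterministic second-order Heun method, which is a convenient consistency check.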
D. Extrapolability of digital twin with respect to system size
In the examples studied so far, it has been demonstrated that our reservoir computing based digital twin has a strong extrapolability in certain dimensions of the parameter space. Specifically, the digital twin trained with time series data from one parameter region can follow the dynamical evolution of the target system in a different parameter regime. One question is whether the digital twin possesses a certain extrapolability in the system size. For example, consider the Lorenz-96 climate network of size m. In Fig. 3, we use an array of parallel neural networks to construct a digital twin for the climate network of a fixed size, where the number of parallel reservoir computers is n = m/2 (assuming that m is even), and training and testing/monitoring are carried out for the same system size. We ask, if a digital twin is trained for climate networks of certain sizes, will it have the ability to generate the correct dynamical behaviors for climate networks of different sizes? If yes, we say that the digital twin has extrapolability with respect to the system size.
As an example, we create a digital twin with time series data from Lorenz-96 climate networks of two training sizes, as shown in Fig. 5(a). For each system size, four values of the forcing amplitude A are used to generate the training time series: 1.5, 2.0, 2.5, and 3.0, as marked by the vertical orange dashed lines in Figs. 5(a) and 5(b). As in Fig. 3, the digital twin consists of parallel neural networks, each responsible for two nodes, with the hyperparameter values optimized as before. Then, we consider climate networks of two different, previously unseen sizes and test if the trained digital twin can be adapted to the new systems. For the smaller network, we keep only two of the parallel neural networks for the digital twin. For the larger network, we add one additional neural network to the trained digital twin, so the new twin consists of six parallel neural networks with identical matrices. The true bifurcation diagrams for the two climate systems are shown in Fig. 5(b) (the left and right panels, respectively). The corresponding bifurcation diagrams generated by the adapted digital twins are shown in Fig. 5(c), which agree with the ground truth reasonably well, demonstrating that our reservoir computing based digital twin possesses a certain extrapolability in system size.
IV. DISCUSSION
We have articulated the principle of creating digital twins for nonlinear dynamical systems based on reservoir computers, a class of recurrent neural networks. The basic consideration leading us to choose reservoir computing as the suitable machine-learning scheme is its ability to execute dynamical self-evolution in closed-loop operation. In general, reservoir computing is a powerful neural network framework that requires only a linear regression, rather than backpropagation, during training. This feature makes the development of digital twins based on reservoir computing computationally efficient. We have demonstrated that a well-trained reservoir computer is able to serve as a digital twin for systems subject to external, time-varying driving. The twin can be used to anticipate possible critical transitions or regime shifts in the target system as the driving force changes, thereby providing early warnings for a potential catastrophic collapse of the system. We have used a variety of examples from different fields to demonstrate the workings and the anticipating power of the digital twin, including the Lorenz-96 climate networks of different sizes (in the main text), a driven chaotic laser system (Appendix A), and an ecological system (Appendix B). For low-dimensional nonlinear dynamical systems, a single neural network is sufficient for the digital twin. For high-dimensional systems such as the climate network of a relatively large size, parallel neural networks can be integrated to construct the digital twin. At the level of the detailed state evolution, our recurrent neural network based digital twin is essentially a dynamical twin system that evolves in parallel to the real system, and the evolution of the digital twin can be corrected from time to time using sparse feedback of data from the target system (Appendix C).
In certain circumstances, continual forecasting in the presence of hidden dynamical variables is possible (Appendix D), and digital twins under external driving with varied waveforms can be constructed (Sec. III C and Appendix E). In applications where direct measurements of the target system are not feasible or are too costly, the digital twin provides a way to assess the dynamical evolution of the target system. Qualitatively, the digital twin can faithfully reproduce the attractors of the target system, e.g., chaotic, periodic, or quasiperiodic, without the need for state updating. In addition, we show that the digital twin is able to accurately predict a critical bifurcation point and the average lifetime of the transient chaos that occurs after the bifurcation, even under a driving signal that is different from the one used during the training (Appendix F). The issues of robustness against dynamical and observational noises in the training data (Appendix G) and the effect of long transients in periodic windows of a high period (Appendix H) have also been treated.
To summarize, our reservoir computing based digital twins are capable of performing the following tasks: (1) extrapolating certain dynamical evolution of the target system under external driving conditions unseen during training, (2) making long-term continual forecasts of nonlinear dynamical systems under nonstationary external driving with sparse state updates, (3) inferring the existence of hidden variables in the system and reproducing/predicting their dynamical evolution, (4) adapting to external driving of different waveforms, and (5) extrapolating the global bifurcation behaviors to systems of different sizes.
Our design of the digital twins for nonlinear dynamical systems can be extended in a number of ways.
1. Online learning
Online or continual learning is a recent trend in machine-learning research. Unlike batch learning, where one gathers all the training data in one place and trains on the entire data set (the way training is conducted in our work), in an online learning environment, one evolves the machine-learning model incrementally with the flow of data. At each training step, only the newly arrived training data are used to update the machine-learning model. When a new data set becomes available, it is not necessary to train the model again on the entire data set accumulated so far, but only on the new set. This can result in a significant reduction in computational complexity. Previously, an online learning approach to reservoir computing known as FORCE learning was developed.58 An attempt to deal with the key problem of online learning termed "catastrophic forgetting" was made in the context of reservoir computing.59 Further investigation is required to determine whether these methods can be exploited for creating digital twins through online learning.
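To convey the flavor of such methods, the core recursive-least-squares update underlying FORCE-style training can be written in a few lines. This is a generic textbook form of the update, not the exact algorithm of Refs. 58 and 59: each arriving reservoir state and target produces a rank-1 correction to the readout, with no retraining on past data.

```python
import numpy as np

def rls_update(W_out, P, r, target, lam=1.0):
    """One recursive-least-squares step: update the readout W_out and the
    inverse-correlation matrix P from a single reservoir state r and its
    desired output. lam is an optional forgetting factor (1 = none)."""
    Pr = P @ r
    k = Pr / (lam + r @ Pr)              # gain vector
    err = W_out @ r - target             # a priori output error
    W_out = W_out - np.outer(err, k)     # rank-1 readout correction
    P = (P - np.outer(k, Pr)) / lam      # update inverse correlation
    return W_out, P
```

Run on a stream of (state, target) pairs, this converges to the same regularized least-squares readout that the batch regression would produce, which is what makes it a candidate mechanism for online digital-twin training.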
2. Beyond reservoir computing
The potential power of digital twins based on recurrent neural networks may be further enhanced by using more sophisticated recurrent neural network models, depending on the target problem. We use reservoir computing because it is relatively simple yet powerful enough for both low- and high-dimensional dynamical systems. Schemes such as knowledge-based hybrid reservoir computing60 or ODE-nets61 are worth investigating.
3. Reinforcement learning
Is it possible to use digital twins to make reinforcement learning feasible in situations where the target system cannot be “disturbed”? In particular, reinforcement learning requires constant interaction with the target system during training so that the machine can learn from its mistakes and successes. However, for a real-world system, these interactions may be harmful, uncontrollable, and irreversible. As a result, reinforcement learning algorithms are rarely applied to safety-critical systems.62 In this case, digital twins can be beneficial. With a digital twin, the reinforcement learning model need not interact with the real system, but instead with its simulated replica, for efficient training. This line of research is called model-based reinforcement learning.63
4. Potential benefits of noise
A phenomenon uncovered in our study is the beneficial role of dynamical noise in the target system. As briefly discussed in connection with Fig. 4, adding dynamical noise to the training data set enhances the digital twin’s ability to extrapolate the dynamics of the target system under driving signals of different waveforms. Intuitively, noise can facilitate the exploration of the phase space of the target nonlinear system. A systematic study of the interplay between dynamical noise and the performance of the digital twin is worth pursuing.
5. Extrapolability
The demonstrated extrapolability of our digital twin, albeit limited, may open the door to forecasting the behavior of large systems using twins trained on small systems. Much research is needed to address this issue.
6. Spatiotemporal dynamical systems with multistability
We have considered digital twins for a class of coupled dynamical systems: the Lorenz-96 climate model. When developing digital twins for spatiotemporal dynamical systems, two issues can arise. One is the computational complexity associated with such high-dimensional systems. We have demonstrated that parallel reservoir computing provides a viable solution. Another issue is multistability. Spatiotemporal dynamical systems, in general, exhibit extremely rich dynamical behaviors such as chimera states.64–72 Developing digital twins of spatiotemporal dynamical systems with multiple coexisting states requires that the underlying recurrent neural networks possess certain memory capabilities. Developing methods to incorporate memories into digital twins is a problem of current interest.
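As a sketch of the parallel reservoir scheme mentioned above, the code below assigns each small reservoir a group of lattice sites plus a one-site halo of neighbors on each side (sufficient because the coupling is local) and trains each readout independently by ridge regression. A coupled logistic-map lattice stands in here for the Lorenz-96 network, and all sizes and hyperparameter values are illustrative assumptions.

```python
import numpy as np

# Parallel reservoir sketch for a spatially extended system: each reservoir
# predicts `group` lattice sites and receives those sites plus an `overlap`
# halo of neighbors as input. The lattice dynamics below are a stand-in.

rng = np.random.default_rng(5)

L, T, eps = 8, 2000, 0.1            # lattice size, length, coupling strength
X = rng.random((T, L))
for t in range(T - 1):
    f = 3.9 * X[t] * (1 - X[t])     # local logistic dynamics
    X[t + 1] = (1 - eps) * f + 0.5 * eps * (np.roll(f, 1) + np.roll(f, -1))

group, overlap, n = 2, 1, 100       # sites per reservoir, halo, reservoir size
readouts = []                       # training RMSE of each parallel reservoir
for g in range(0, L, group):
    sites = [(g + k) % L for k in range(-overlap, group + overlap)]
    W = rng.uniform(-1, 1, (n, n))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius
    W_in = rng.uniform(-0.5, 0.5, (n, len(sites)))
    r = np.zeros(n)
    R = np.empty((T - 1, n))
    for t in range(T - 1):
        r = np.tanh(W @ r + W_in @ X[t, sites])
        R[t] = r
    A = R[100:]                                       # discard warm-up
    Y = X[101:, [(g + k) % L for k in range(group)]]  # next-step targets
    W_out = np.linalg.solve(A.T @ A + 1e-6 * np.eye(n), A.T @ Y)
    readouts.append(float(np.sqrt(np.mean((A @ W_out - Y) ** 2))))

print(readouts)
```

Because each reservoir only sees a fixed-size neighborhood, the number of reservoirs, not their individual size, grows with the lattice, which is what keeps the computational cost manageable for high-dimensional systems.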
ACKNOWLEDGMENTS
We thank Z.-M. Zhai and A. Flynn for discussions. This work was supported by the Army Research Office through Grant No. W911NF-21-2-0055 and by the U.S.-Israel Energy Center managed by the Israel-U.S. Binational Industrial Research and Development (BIRD) Foundation.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Ling-Wei Kong: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Validation (equal); Visualization (equal); Writing – original draft (equal). Yang Weng: Conceptualization (supporting); Funding acquisition (supporting); Writing – review & editing (supporting). Bryan Glaz: Conceptualization (supporting); Writing – review & editing (supporting). Mulugeta Haile: Conceptualization (supporting); Writing – review & editing (supporting). Ying-Cheng Lai: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Supervision (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
APPENDIX A: A DRIVEN CHAOTIC LASER SYSTEM
We consider the single-mode, class B, driven chaotic laser system73–76 described by
where the dynamical variables and are proportional to the normalized intensity and the population inversion, is the external sinusoidal driving signal of amplitude and frequency , and and are two parameters. Chaos is common in this laser system.73,74,76 For example, for , , and , there is a chaotic attractor for , as shown by a sustained chaotic time series in Fig. 6(a1). The chaotic attractor is destroyed by a boundary crisis77 at . For , there is transient chaos, after which the system settles into periodic oscillations, as exemplified in Fig. 6(a2). Suppose chaotic motion is desired. The crisis bifurcation at can then be regarded as a kind of system collapse.
To build a digital twin for the chaotic laser system, we use the external driving signal as the natural control signal for the reservoir-computing neural network. Different from the examples in the main text, here the driving frequency , instead of the driving amplitude , serves as the bifurcation parameter. Assuming observational data in the form of time series are available for several values of in the regime of a chaotic attractor, we train the neural network using chaotic time series collected from four values of : , 0.84, 0.87, and 0.90. The training parameter setting is as follows. For each value in the training set, the training and validation lengths are and , respectively, where the latter corresponds to approximately five Lyapunov times. The “warming-up” length is . The time step of the reservoir system is . The size of the random neural network is . The optimal hyperparameter values are determined to be , , , , , and .
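As a concrete illustration of this training scheme, the sketch below drives a small random reservoir with both an observed scalar time series and the known sinusoidal control signal, fits the readout by ridge regression, and then runs the twin in closed loop with the drive still supplied externally. The surrogate one-dimensional forced map, the network size, and all hyperparameter values are illustrative stand-ins, not the optimized settings reported above.

```python
import numpy as np

# Minimal sketch of training a reservoir twin with a control channel: the
# input layer receives the observed state and the external driving signal,
# and the readout is trained to predict the next state.

rng = np.random.default_rng(1)

def drive(t, amp=0.3, freq=0.9):
    return amp * np.sin(freq * t)

# Surrogate driven dynamics: a one-dimensional forced map (hypothetical).
T = 2000
x = np.zeros(T)
for t in range(T - 1):
    x[t + 1] = 0.9 * np.sin(2.5 * x[t]) + drive(t * 0.1)
u = np.array([drive(t * 0.1) for t in range(T)])

n = 200
W = rng.uniform(-1, 1, (n, n)) * (rng.random((n, n)) < 0.05)  # sparse net
W *= 0.8 / np.max(np.abs(np.linalg.eigvals(W)))               # spectral radius
W_in = rng.uniform(-0.5, 0.5, (n, 2))                         # state, drive

r = np.zeros(n)
R = np.zeros((T - 1, n))
for t in range(T - 1):
    r = np.tanh(W @ r + W_in @ np.array([x[t], u[t]]))
    R[t] = r

warm, beta = 100, 1e-6
A, y = R[warm:], x[warm + 1:]
W_out = np.linalg.solve(A.T @ A + beta * np.eye(n), A.T @ y)

# Closed loop: the state input comes from the twin's own output, while the
# (known) driving signal is still supplied externally as the control channel.
r_pred, x_hat, preds = R[-1].copy(), x[T - 1], []
for t in range(T - 1, T + 49):
    r_pred = np.tanh(W @ r_pred + W_in @ np.array([x_hat, drive(t * 0.1)]))
    x_hat = W_out @ r_pred
    preds.append(x_hat)

print(len(preds))
```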
Figures 6(a1) and 6(a2) show two representative time series from the laser model (the ground truth) for and , respectively. The one in panel (a1) is associated with sustained chaos (pre-critical) and the other in panel (a2) is characteristic of transient chaos with a final periodic attractor (post-critical). The corresponding time series generated by the digital twin are shown in Figs. 6(b1) and 6(b2), respectively. It can be seen that the training aided by the control signal enables the digital twin to correctly capture the dynamical climate of the target system, e.g., sustained or transient chaos. The true return maps in the pre-critical and post-critical regimes are shown in Figs. 6(c1) and 6(c2), respectively, and the corresponding maps generated by the digital twin are shown in Figs. 6(d1) and 6(d2). In the pre-critical regime, an invariant region (the green dashed square) exists on the return map in which the trajectories are confined, leading to sustained chaotic motion, as shown in Figs. 6(c1) and 6(d1). Within the invariant region in which the chaotic attractor lives, the digital twin captures the essential dynamical features of the attractor. Because the training data are from the chaotic attractor of the target system, the digital twin fails to generate the portion of the real return map that lies outside the invariant region, which is expected because the digital twin has never been exposed to the dynamical behaviors that are not on the chaotic attractor. In the post-critical regime, a “leaky” region emerges, as indicated by the red arrows in Figs. 6(c2) and 6(d2), which destroys the invariant region and leads to transient chaos. The remarkable feature is that the digital twin correctly assesses the existence of the leaky region, even when no such information is fed into the twin during training. From the point of view of predicting system collapse, the digital twin is able to anticipate the occurrence of the crisis and transient chaos. 
A quantitative result of these predictions is demonstrated in Appendix F.
As indicated by the predicted return maps in Figs. 6(d1) and 6(d2), the digital twin is unable to give the final state after the transient, because such a state necessarily lies outside the invariant region from which the training data originate. In particular, the digital twin is trained with time series data from the chaotic attractors prior to the crisis. With respect to Figs. 6(d1) and 6(d2), the digital twin can learn the dynamics within the dashed green box in the plotted return maps but is unable to predict the dynamics outside the box, as it has never been exposed to these dynamics.
A comparison of the real and predicted bifurcation diagrams is shown in Fig. 7. The strong resemblance between them indicates the power of the digital twin in extrapolating the correct global behavior of the target system. Moreover, this demonstrates that the approach can extrapolate not only with various driving amplitudes (as demonstrated in the main text) but also when the driving frequency is varied.
APPENDIX B: A DRIVEN CHAOTIC ECOLOGICAL SYSTEM
We study a chaotic driven ecological system that models the annual blooms of phytoplankton under seasonal driving.78 Seasonality plays a crucial role in ecological systems and epidemic spreading of infectious diseases,79 which is usually modeled as a simple periodic driving force on the system. The dynamical equations of this model in the dimensionless form are78
where represents the level of the nutrients, is the biomass of the phytoplankton, the Lotka–Volterra term models the phytoplankton uptake of the nutrients, represents a small and constant nutrient flow from external sources, is the sinking rate of the nutrients to the lower level of the water unavailable to the phytoplankton, and is the seasonality term: . The parameter values are:78 , , and .
Climate change can dramatically alter the dynamics of this ecosystem.80 We consider the task of forecasting how the system behaves if the climate change causes the seasonal fluctuation to be more extreme. In particular, suppose the training data are measured from the system when it behaves normally under a driving signal of relatively small amplitude, and we wish to predict the dynamical behaviors of the system in the future when the amplitude of the driving signal becomes larger (due to climate change). The training parameter setting is as follows. The size of the neural network is with . The time step of the evolution of the network dynamics is . The training and validation lengths for each value of the driving amplitude in the training are and , respectively. The optimized hyperparameters of the reservoir computer are , , , , , .
Figure 8 shows the results of our digital twin approach to this ecological model to learn from the dynamics under a few different values of the driving amplitude to generate the correct response of the system to a driving signal of a larger amplitude. In particular, the training data are collected with the driving amplitude and , all in the chaotic regions. Figures 8(a1) and 8(a2) show the true attractors of the system for and , respectively, where the attractor is chaotic in the former case (within the training parameter regime) and periodic in the latter (outside the training regime). The corresponding attractors generated by the digital twin are shown in Figs. 8(b1) and 8(b2). The digital twin can not only replicate the chaotic behavior in the training data [Fig. 8(b1)] but also predict the transition to a periodic attractor under a driving signal with larger amplitudes (more extreme seasonality), as shown in Fig. 8(b2). In fact, the digital twin can faithfully produce the global dynamical behavior of the system, both inside and outside the training regime, as can be seen from the nice agreement between the ground-truth bifurcation diagram in Fig. 8(c) and the diagram generated by the digital twin in Fig. 8(d).
APPENDIX C: CONTINUAL FORECASTING UNDER NON-STATIONARY EXTERNAL DRIVING WITH SPARSE REAL-TIME DATA
The three examples (Lorenz-96 climate network in the main text, the driven laser, and the ecological system) have demonstrated that our reservoir computing based digital twin is capable of extrapolating and generating the correct statistical features of the dynamical trajectories of the target system such as the attractor and bifurcation diagram. That is, the digital twin can be regarded as a “twin” of the target system only in a statistical sense. In particular, from random initial conditions, the digital twin can generate an ensemble of trajectories, and the statistics calculated from the ensemble agree with those of the original system. At the level of individual trajectories, if a target system and its digital twin start from the same initial condition, the trajectory generated by the twin can stay close to the true trajectory only for a short period of time (due to chaos). However, with infrequent state updates, the trajectory generated by the twin can shadow the true trajectory (in principle) for an arbitrarily long period of time,32 realizing continual forecasting of the state evolution of the target system.
In data assimilation for numerical weather forecasting, the state of the model system needs to be updated from time to time.81–83 This idea has recently been exploited to realize long-term prediction of the state evolution of chaotic systems using reservoir computing.32 Here, we demonstrate that, even when the driving signal is non-stationary, the digital twin can still generate the correct state evolution of the target system. As a specific example, we use the chaotic ecosystem in Eqs. (B1) and (B2) with the same neural network trained in Appendix B. Figure 9(a) shows the non-stationary external driving whose amplitude increases linearly from to in the time interval . Figure 9(b) shows the true (blue) and digital twin generated (red) time evolution of the nutrient abundance. Due to chaos, without state updates, the two trajectories diverge from each other after a few cycles of oscillation. However, even with rare state updates, the two trajectories can stay close to each other for any arbitrarily long time, as shown in Fig. 9(c). In particular, there are 800 time steps involved in the time interval and the state of the digital twin is updated 20 times, i.e., of the available time series data. We will discuss the results further in Appendix D.
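A minimal sketch of this sparse-update mechanism is given below, with the logistic map standing in for the ecosystem and a slightly perturbed parameter playing the role of the imperfectly learned dynamics (both are illustrative assumptions). Without updates, the twin's trajectory diverges from the truth due to chaos; resetting its state to the measured value every few steps keeps it shadowing the true trajectory.

```python
import numpy as np

# Continual forecasting with sparse state updates: a "twin" with a small
# model error runs in closed loop, and its state is replaced by the true
# (measured) state once every `gap` steps.

def true_map(x):
    return 3.9 * x * (1.0 - x)

def twin_map(x):
    return 3.90001 * x * (1.0 - x)   # imperfect learned dynamics

T, gap = 400, 5
x_true = np.empty(T)
x_free = np.empty(T)     # twin, never updated
x_upd = np.empty(T)      # twin, updated every `gap` steps
x_true[0] = x_free[0] = x_upd[0] = 0.3

for t in range(T - 1):
    x_true[t + 1] = true_map(x_true[t])
    x_free[t + 1] = twin_map(x_free[t])
    src = x_true[t] if t % gap == 0 else x_upd[t]   # sparse measurement
    x_upd[t + 1] = twin_map(src)

print(np.mean(np.abs(x_true - x_upd)), np.mean(np.abs(x_true - x_free)))
```

The free-running twin decorrelates from the truth after a few dozen steps, while the sparsely updated twin stays close for the entire run, mirroring the behavior in Fig. 9(c).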
APPENDIX D: CONTINUAL FORECASTING WITH HIDDEN DYNAMICAL VARIABLES
In real-world scenarios, usually not all the dynamical variables of a target system are accessible. It is often the case that only a subset of the dynamical variables can be measured, while the remaining variables are inaccessible or hidden from the outside world. Can a digital twin still perform continual forecasting in the presence of hidden variables based on the time series data from the accessible variables? Also, can the digital twin do this without knowing, before training, that hidden variables exist? In general, when there are hidden variables, the reservoir network needs to sense their existence, encode them in the hidden state of the recurrent layer, and constantly update them. As such, the recurrent structure of reservoir computing is necessary, because there must be a place for the machine to store and restore the implicit information that it has learned from the data. Compared with the cases where complete information about the dynamical evolution of all the observables is available, when there are hidden variables it is significantly more challenging to predict the evolution of a target system driven by a non-stationary external signal using sparse observations of the accessible variables.
As an illustrative example, we again consider the ecosystem described by Eqs. (B1) and (B2). We assume that the dynamical variable (the abundance of the nutrients) is hidden and that , the biomass of the phytoplankton, is externally accessible. Even so, we assume that the accessible variable can be measured only occasionally; that is, only sparsely updated data of the variable are available. It is necessary for the digital twin to learn some equivalent of the hidden variable , because the time evolution of depends on its value, and to encode this equivalent in the reservoir network. In an actual application, when the digital twin is deployed, knowledge about the existence of such a hidden variable is not required.
Figure 10 presents a representative trial, where Fig. 10(a) shows the non-stationary external driving signal [the same as the one in Fig. 9(a)]. Figure 10(b) shows that, when the observable is not updated with the real data, the predicted time series (red) diverges from the true time series (blue) after about a dozen oscillations. However, if is updated to the digital twin with the true values at the times indicated by the purple vertical lines in Fig. 10(c), the predicted time series matches the ground truth for a much longer time. The results suggest that the existence of the hidden variable does not significantly impede the performance of continual forecasting.
The results in Fig. 10 motivate the following questions. First, has the reservoir network encoded information about the hidden variable? Second, suppose it is known that there is a hidden variable and the training dataset contains this variable, can its evolution be inferred with only rare updates of the observable during continual forecasting? Previous results24,28,84 suggested that reservoir computing can be used to infer the hidden variables in a nonlinear dynamical system. Here, we show that, with a segment of the time series of used only for training an additional readout layer, our digital twin can forecast with only occasional inputs of the observable time series . In particular, the additional readout layer for is used only for extracting information about from the reservoir network and its output is never injected back into the reservoir. Consequently, whether this additional task of inferring is included or not, the trained output layer for and the forecasting results of are not altered.
Figure 10(d) shows that, when the observable is not updated with the real data, the digital twin can infer the hidden variable for several oscillations. If is updated with the true value at the times indicated by the purple vertical lines in Fig. 10(c), the dynamical evolution of the hidden variable can also be accurately predicted for a much longer period of time, as shown in Fig. 10(e). It is worth emphasizing that during the whole process of forecasting and monitoring, no information about the hidden variable is required—only sparse data points of the observable are used.
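The additional-readout idea can be sketched as follows: a reservoir is driven by the observable of a two-variable system (the Henon map stands in here for the ecosystem), one ridge-regression readout predicts the next observable, and a second, separate readout is trained on the same reservoir states to reconstruct the hidden variable. The hidden-variable readout only reads from the reservoir and is never fed back, matching the scheme described above; the map, network size, and hyperparameters are illustrative.

```python
import numpy as np

# Inferring a hidden variable with an extra readout layer. The reservoir
# sees only x; the y readout extracts the hidden variable encoded in the
# reservoir state and is never injected back into the network.

rng = np.random.default_rng(7)

# Henon map: x is observable, y is hidden.
T = 3000
x, y = np.empty(T), np.empty(T)
x[0], y[0] = 0.0, 0.0
for t in range(T - 1):
    x[t + 1] = 1.0 - 1.4 * x[t] ** 2 + y[t]
    y[t + 1] = 0.3 * x[t]

n = 100
W = rng.uniform(-1, 1, (n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius
w_in = rng.uniform(-1, 1, n)

r = np.zeros(n)
R = np.empty((T, n))
for t in range(T):
    r = np.tanh(W @ r + w_in * x[t])
    R[t] = r

warm, beta = 100, 1e-8
A = R[warm:-1]
G = A.T @ A + beta * np.eye(n)
w_x = np.linalg.solve(G, A.T @ x[warm + 1:])   # readout for the observable
w_y = np.linalg.solve(G, A.T @ y[warm + 1:])   # extra readout: hidden variable

rmse_x = np.sqrt(np.mean((A @ w_x - x[warm + 1:]) ** 2))
rmse_y = np.sqrt(np.mean((A @ w_y - y[warm + 1:]) ** 2))
print(rmse_x, rmse_y)
```

Because the y readout is purely an extra linear projection of the reservoir state, including it changes neither the trained x readout nor the forecasting of x, consistent with the point made above.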
The training and testing settings of the digital twin for the task involving a hidden variable are as follows. The input dimension of the reservoir is because there is a single observable . The output dimension is with one dimension of the observable in addition to one dimension of the hidden variable . Because of the higher memory requirement in dealing with a hidden variable, a somewhat larger reservoir network is needed, so we use . The time step of the dynamical evolution of the neural network is . The training and validating lengths for each value of the driving amplitude in the training are and , respectively. Other optimized hyperparameters of the reservoir are , , , , , , and .
It is also worth noting that Figs. 9 and 10 have demonstrated the ability of the digital twin to extrapolate beyond the parameter regime of the target system from which the training data are obtained. In particular, the digital twin was trained only with time series under stationary external driving of the amplitude and . During the testing phase associated with both Figs. 9 and 10, the external driving is non-stationary with its amplitude linearly increasing from to . The second half of the time series and in Figs. 9 and 10 are, thus, beyond the training parameter regime.
The results in Figs. 9 and 10 help legitimize the terminology “digital twin,” as the reservoir computers subject to the external driving are dynamical twin systems that evolve “in parallel” to the corresponding real systems. Even when the target system is only partially observable, the digital twin contains both the observable and hidden variables, whose dynamical evolution is encoded in the recurrent neural network in the hidden layer. The dynamical evolution of the output is constantly (albeit infrequently) corrected by sparse feedback from the real system, so the output trajectory of the digital twin shadows the true trajectory of the target system. If one wishes to monitor a variable in the target system, it is only necessary to read it from the digital twin instead of making additional (possibly costly) measurements on the real system.
APPENDIX E: DIGITAL TWINS UNDER EXTERNAL DRIVING WITH VARIED WAVEFORM
In the main text, it is demonstrated that dynamical noise added to the driving signal during training can be beneficial. Figure 11 presents a comparison between noiseless training and training with dynamical noise of strength . The ground-truth bifurcation diagram is shown in Fig. 11(a), and three examples with different reservoir neural networks are shown for the noiseless [(b1)–(b3)] and noisy [(c1)–(c3)] training schemes. All the settings other than the noise level are the same as those in Fig. 4. Although fluctuations remain in the predicted results, adding dynamical noise to the training data produces bifurcation diagrams that are, in general, closer to the ground truth than those obtained without noise.
To further demonstrate the beneficial role of noise, we test the additive training noise scheme using the ecological system. The training process and hyperparameter values of the digital twin are identical to those in Appendix B. A dynamical noise of amplitude is added to the driving signal during training in the same way as in Fig. 4. During testing, the driving signals are altered to
where . Two sets of testing signals are used, with and 0.4, respectively. Figure 12 shows the true and predicted bifurcation diagrams of vs for (left column) and (right column). It can be seen that the bifurcation diagrams generated by the digital twin with the aid of training noise are remarkably accurate. We also find that, for this ecological system, the amplitude of the dynamical noise during training does not have a significant effect on the predicted bifurcation diagram. A plausible reason is that the driving signal is a multiplicative term in the system equations. Further investigation is required to determine the role of dynamical noise in these tasks with varied driving waveforms.
APPENDIX F: QUANTITATIVE CHARACTERIZATION OF THE PERFORMANCE OF DIGITAL TWINS
In the main text, a quantitative measure of the overlapping rate between the target and predicted spanning regions is introduced to characterize the performance of the digital twins, where a spanning region is the smallest simply connected region that encloses the entire attractor. In a two-dimensional projection, we divide a large reference plane into pixels of size . All the pixels through which the system trajectory passes, together with those enclosed by the trajectory, belong to the spanning region, and the regions covering the true attractor of the target system and the predicted attractor can then be compared. In particular, all the pixels that belong to one spanned region but not to the other are counted, and this number is divided by the total number of pixels in the spanned region of the true attractor. This gives RESR, the relative error of the spanned regions, as described in the main text. While this measure is effective in most cases, near a bifurcation (e.g., near the boundary of a periodic window), large errors can arise because a small parameter mismatch can lead to a characteristically different attractor. To reduce this error, we test the attractors at three nearby parameter values, e.g., and with , and choose the smallest RESR value among the three.
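The pixel-counting step can be sketched as follows: rasterize each trajectory on a common grid, take the set of occupied pixels, and divide the size of the symmetric difference by the true attractor's pixel count. For simplicity, this sketch counts only pixels visited by the trajectory and omits the filling of enclosed interior pixels described above; the grid bounds and resolution are illustrative.

```python
import numpy as np

# Simplified RESR: symmetric difference of occupied pixels, normalized by
# the pixel count of the true attractor's region (visited pixels only).

def spanned_pixels(traj, lo=-2.0, hi=2.0, n_pix=100):
    """Map a (T, 2) trajectory to the set of grid cells it visits."""
    idx = np.floor((traj - lo) / (hi - lo) * n_pix).astype(int)
    idx = np.clip(idx, 0, n_pix - 1)
    return set(map(tuple, idx))

def resr(true_traj, pred_traj):
    a = spanned_pixels(true_traj)
    b = spanned_pixels(pred_traj)
    return len(a ^ b) / len(a)   # mismatched pixels over true pixels

# Identical trajectories give RESR = 0; a shifted copy gives RESR > 0.
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, (5000, 1))
circle = np.hstack([np.cos(t), np.sin(t)])
print(resr(circle, circle), resr(circle, circle + 0.05))
```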
A direct comparison between the predicted bifurcation diagram and the ground truth is difficult, given the rich information a bifurcation diagram contains. To better quantify the performance of the digital twins, we thus employ another measure (besides RESR). In particular, for a bifurcation diagram, the parameter values at which the various bifurcations occur are of great interest, as they define the critical points at which characteristic changes in the system occur. To be concrete, we focus on the crisis point at which sustained chaotic motion on an attractor is destroyed and replaced by transient chaos. To characterize the performance of the digital twins in extrapolating the dynamics of the target system, we examine the errors in the predicted critical bifurcation point and in the average lifetime of the chaotic transient after the bifurcation.
As an illustrative example, we take the driven chaotic laser system in Appendix A, where a crisis bifurcation occurs at the critical driving frequency at which the chaotic attractor of the system is destroyed and replaced by a non-attracting chaotic invariant set leading to transient chaos. We test whether the digital twin can faithfully predict the crisis point based only on training data from the parameter regime of a chaotic attractor. Let be the critical point predicted by the digital twin. Figure 13(a) shows the distribution of obtained from 200 random realizations of the reservoir neural network. Despite the fluctuations in the predicted values, their average is , which is close to the true value . A relative error of can be defined as
where denotes the minimal distance from to the set of training parameter points , i.e., the difference between and the closest training point. For the driven laser system, we have .
The second quantity is the lifetime of transient chaos after the crisis bifurcation,35,39 as shown in Fig. 13(b). The average transient lifetime is the inverse of the slope of the linear regression of the predicted data points in Fig. 13(b), which is . Compared with the true value , we see that the digital twin is able to predict the average chaotic transient lifetime to within the same order of magnitude. Considering that the key to the transient dynamics is the small escaping region in Fig. 6(d2), which is sensitive to inevitable training errors, the performance can be deemed satisfactory.
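The fitting procedure behind Fig. 13(b) can be sketched as follows: transient lifetimes are exponentially distributed, so the number of ensemble members still chaotic at time t decays as exp(-t/tau), and tau is the negative inverse slope of a linear fit to the log survival count. The synthetic lifetimes below are illustrative stand-ins for measured escape times.

```python
import numpy as np

# Estimate the average chaotic-transient lifetime tau from an ensemble of
# escape times via the exponential survival law N(t) ~ N0 * exp(-t / tau).

rng = np.random.default_rng(3)
tau_true = 50.0
lifetimes = rng.exponential(tau_true, 5000)   # stand-in for measured escapes

t_grid = np.linspace(0, 150, 30)
survivors = np.array([(lifetimes > t).sum() for t in t_grid])
mask = survivors > 50                  # keep bins with adequate statistics
slope, intercept = np.polyfit(t_grid[mask], np.log(survivors[mask]), 1)
tau_est = -1.0 / slope                 # tau is the negative inverse slope
print(tau_est)
```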
APPENDIX G: ROBUSTNESS OF DIGITAL TWIN AGAINST COMBINED DYNAMICAL/OBSERVATIONAL NOISES
Can our reservoir computing based digital twins withstand different types of noise? To address this question, we introduce dynamical and observational noises into the training data, modeled as additive Gaussian noise. Take the six-dimensional Lorenz-96 system as an example. Figure 14(a) shows the true bifurcation diagram under different amplitudes of the external driving, where the vertical dashed lines specify the training points. Figures 14(b1) and 14(b2) show two realizations of the bifurcation diagram generated by the digital twin under both dynamical and observational noises of amplitudes and . Two bifurcation diagrams for noise amplitudes an order of magnitude larger, and , are shown in Figs. 14(c1) and 14(c2). It can be seen that the added noise has little effect on the performance of the digital twin in generating the bifurcation diagram.
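The distinction between the two noise types can be sketched as follows: dynamical noise is injected inside the evolution loop, so it feeds back into the dynamics, whereas observational noise is added only to the recorded time series. The logistic map and the noise amplitudes below are illustrative, not the Lorenz-96 settings used above.

```python
import numpy as np

# Two types of additive Gaussian noise in the training data: dynamical
# noise perturbs the state inside the loop, observational noise only
# corrupts the recorded measurements.

rng = np.random.default_rng(11)
T = 1000
sigma_dyn, sigma_obs = 1e-3, 1e-3   # illustrative noise amplitudes

x = np.empty(T)
x[0] = 0.3
for t in range(T - 1):
    x[t + 1] = 3.9 * x[t] * (1 - x[t]) + sigma_dyn * rng.standard_normal()
    x[t + 1] = np.clip(x[t + 1], 0.0, 1.0)   # keep the noisy state bounded

x_observed = x + sigma_obs * rng.standard_normal(T)   # training time series
print(x_observed[:3])
```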
APPENDIX H: PERIODIC WINDOWS OF A HIGH PERIOD: EFFECT OF LONG TRANSIENTS
Figure 2 demonstrates that the digital twin is able to predict many details of a bifurcation diagram but it fails to generate a relatively large periodic window of about . A closer examination of the dynamics of the target Lorenz-96 system reveals that the periodic attractor in the window has period 21 with a rather complicated structure, as shown in Fig. 15(a) in a two-dimensional projection. The digital twin predicts a chaotic attractor, as shown in Fig. 15(b). The reason that the digital twin fails to predict the periodic attractor lies in the long transient of the trajectory before it reaches the final attractor, as shown in Fig. 15(c). A comparison between Figs. 15(b) and 15(c) indicates that what the digital twin has predicted is, in fact, the transient behavior in the periodic window. The implication is that the digital twin has, in fact, faithfully captured the dynamical climate of the target system.