Big data have become a critically enabling component of emerging mathematical methods aimed at the automated discovery of dynamical systems, where first principles modeling may be intractable. However, in many engineering systems, abrupt changes must be rapidly characterized based on limited, incomplete, and noisy data. Many leading automated learning techniques rely on unrealistically large data sets, and it is unclear how to leverage prior knowledge effectively to re-identify a model after an abrupt change. In this work, we propose a conceptual framework to recover parsimonious models of a system in response to abrupt changes in the low-data limit. First, the abrupt change is detected by comparing the estimated Lyapunov time of the data with the model prediction. Next, we apply the sparse identification of nonlinear dynamics (SINDy) regression to update a previously identified model with the fewest changes, either by addition, deletion, or modification of existing model terms. We demonstrate this sparse model recovery on several examples for abrupt system change detection in periodic and chaotic dynamical systems. Our examples show that sparse updates to a previously identified model perform better with less data, have lower runtime complexity, and are less sensitive to noise than identifying an entirely new model. The proposed abrupt-SINDy architecture provides a new paradigm for the rapid and efficient recovery of a system model after abrupt changes.

Dynamical systems modeling is a cornerstone of modern mathematical physics and engineering. The dynamics of many complex systems (e.g., neuroscience, climate, and epidemiology) may not have first-principles derivations, and researchers are increasingly using data-driven methods for system identification and the discovery of dynamics. Related to the discovery of dynamical systems models from data is the *recovery* of these models following abrupt changes to the system dynamics. In many domains, such as aviation, model recovery is mission critical, and must be achieved rapidly and with limited noisy data. This paper leverages recent advances in sparse optimization to identify the fewest terms required to recover a model, introducing the concept of *parsimony of change*. In other words, many abrupt system changes, even catastrophic bifurcations, may be characterized with relatively few changes to the terms in the underlying model. In this work, we show that sparse optimization enables rapid model recovery that is faster, requires less data, is more accurate, and has higher noise robustness than the alternative approach of re-characterizing a model from scratch.

## I. INTRODUCTION

The data-driven discovery of physical laws and dynamical systems is poised to revolutionize how we model, predict, and control physical systems. Advances are driven by the confluence of big data, machine learning, and modern perspectives on dynamics and control. However, many modern techniques in machine learning (e.g., neural networks) often rely on access to massive data sets, have limited ability to generalize beyond the attractor where data are collected, and do not readily incorporate known physical constraints. These limitations are framing many state-of-the-art research efforts around learning algorithms,^{1} especially as they pertain to generalizability, limited data, and *one-shot learning*.^{2–4} Such limitations also frame the primary challenges associated with data-driven discovery for the real-time control of strongly nonlinear, high-dimensional, multi-scale systems with abrupt changes in the dynamics. Whereas traditional methods often require unrealistic amounts of training data to produce a viable model, this work focuses on methods that take advantage of prior experience and knowledge of physics to dramatically reduce the data and time required to characterize dynamics. Our methodology is similar in philosophy to the machine learning technique of *transfer learning*,^{5} which allows networks trained on one task to be efficiently adapted to another task. Our architecture is designed around the goal of rapidly extracting parsimonious, nonlinear dynamical models that identify only the fewest important interaction terms so as to avoid overfitting.

There are many important open challenges associated with the data-driven discovery of dynamical systems for real-time tracking and control. When abrupt changes occur in the system dynamics, an effective controller must rapidly characterize and compensate for the new dynamics, leaving little time for recovery based on limited data.^{6} The primary challenge in real-time model discovery is the reliance on large quantities of training data. A secondary challenge is the ability of models to generalize beyond the training data, which is related to the ability to incorporate new information and quickly modify the model. Machine learning algorithms often suffer from overfitting and a lack of interpretability, although the application of these algorithms to physical systems offers a unique opportunity to enforce known symmetries and physical constraints (e.g., conservation of mass). Inspired by biological systems, which are capable of extremely fast adaptation and learning based on very few trials of new information,^{7–9} we propose model discovery techniques that leverage an *experiential framework*, where known physics, symmetries, and conservation laws are used to rapidly infer model changes with limited data.

### A. Previous work in system identification

There is a wealth of regression techniques for the characterization of system dynamics from data, with varying degrees of generality, accuracy, data requirements, and computational complexity. Classical linear model identification algorithms include Kalman filters,^{10–12} the eigensystem realization algorithm (ERA),^{13} dynamic mode decomposition (DMD),^{14–17} and autoregressive moving average (ARMA) models,^{18,19} to name only a few. The resulting linear models are ideal for control design, but are unable to capture the underlying nonlinear dynamics or structural changes. Increasingly, machine learning is being used for nonlinear model discovery. Neural networks have been used for decades to identify nonlinear systems,^{20} and are experiencing renewed interest because of the ability to train deeper networks with more data^{1,21–25} and the promise of transformations that linearize dynamics via the Koopman operator.^{26,27} Neural networks show good capacity to recover the dynamics in a so-called "model-free" way.^{28,29} These methods are also known as "reservoir computers," "liquid state machines," or "echo state networks," depending on the context. However, real-time application is unrealistic, and the output is generally not analytically interpretable. In another significant vein of research, genetic programming^{30,31} is a powerful bio-inspired method that has successfully been applied to system identification,^{32–35} time-series prediction,^{36,37} and control.^{38,39} However, evolutionary methods in their pure form, including genetic programming, are computationally complex and thus are not suitable for real-time tracking.

Recently, *interpretability* and *parsimony* have become important themes in nonlinear system identification.^{32,33} A common goal now is to identify the fewest terms required to represent the nonlinear structure of a dynamical system model while avoiding overfitting.^{40} Symbolic regression methods^{33,40–42} are generally appealing for the identification of structural changes, although they may need to be adapted to the low-data limit and for faster processing time. Nonparametric additive regression models^{43–45} allow general transformations but require a backfitting loop, which may be prohibitively slow for real-time applications. Generalized linear regression methods are slightly less general but admit fast evaluation and sparse representations.^{40,42} These leading approaches to identifying dynamical equations from data usually rely on past data and aim at a reliable reproduction of a stationary system, i.e., one whose underlying equations do not change in the course of time.^{33,40,45}

### B. Contributions of this work

In this work, we develop an adaptive modification of the sparse identification of nonlinear dynamics (SINDy) algorithm^{40} for real-time recovery of a model following abrupt changes to the system dynamics. We refer to this modeling framework as *abrupt-SINDy*. Although this is not the only approach for real-time change detection and recovery, parsimony and sparsity are natural concepts to track abrupt changes, focusing on the fewest modifications to an existing model. SINDy already requires relatively small amounts of data,^{46} is based on fast regression techniques, and has been extended to identify partial differential equations (PDEs),^{47,48} to include known constraints and symmetries,^{49} to work with limited measurements^{50} and highly corrupted and noisy data,^{51,52} to include control inputs,^{46,53} and to incorporate information criteria to assess the model quality,^{54} which will be useful in abrupt model recovery.

Here, we demonstrate that the abrupt-SINDy architecture is capable of rapidly identifying sparse changes to an existing model to recover the new dynamics following an abrupt change to the system. The first step in the adaptive identification process is to detect a system change using the divergence of the prediction from measurements. Next, an existing model is updated with sparse corrections, including parameter variations, deletions, and additions of terms. We show that identifying sparse changes to an existing model requires less data and less computation, and is more robust to noise, than identifying a new model from scratch. Further, we maintain a critical attitude, noting the limitations of the proposed approach, highlighting when it can break down, and suggesting directions for further investigation. The overarching framework is illustrated in Fig. 1.

## II. STATE OF THE ART

Recently, sparse regression in a library of candidate nonlinear functions has been used for sparse identification of nonlinear dynamics (SINDy) to efficiently identify a sparse model structure from data.^{40} The SINDy architecture bypasses an intractable brute-force search through all possible models, leveraging the fact that many dynamical systems of the form

$$\dot{x}(t) = f(x(t)) \tag{1}$$

have dynamics $f$ that are sparse in the state variable $x \in \mathbb{R}^n$. Such models may be identified using a sparsity-promoting regression^{55–57} that penalizes the number of nonzero terms $\xi_{jk}$ in a generalized linear model

$$\dot{x}_k = f_k(x) = \sum_{j=1}^{p} \theta_j(x)\,\xi_{jk}, \tag{2}$$

where the $\theta_j(x)$ form a set of nonlinear candidate functions. The candidate functions may be chosen to be polynomials, trigonometric functions, or a more general set of functions.^{40,42} With a poor choice of the candidate functions $\theta_j$, i.e., if the library functions are non-orthogonal and/or overdetermined, the SINDy approach may fail to identify the correct model.

Sparse models may be identified from time-series data, which are collected and formed into the data matrix

$$X = \begin{bmatrix} x^T(t_1) \\ x^T(t_2) \\ \vdots \\ x^T(t_m) \end{bmatrix}.$$

We estimate the time derivatives using a simple forward Euler finite-difference scheme, i.e., the difference of two consecutive data points divided by the time difference:

$$\dot{x}(t_i) \approx \frac{x(t_{i+1}) - x(t_i)}{t_{i+1} - t_i}.$$
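
As a concrete illustration, the forward-difference estimate can be written in a few lines of NumPy. This is our own minimal sketch (the function name is ours, not from the paper's code), verified on a signal with a known derivative:

```python
import numpy as np

def forward_difference(X, dt):
    """First-order forward-difference estimate of the time derivative.

    X has time along axis 0; the last sample has no successor and is dropped.
    """
    return (X[1:] - X[:-1]) / dt

# Sanity check on a signal with a known derivative: d/dt sin(t) = cos(t).
dt = 1e-3
t = np.arange(0.0, 1.0, dt)
X = np.sin(t)[:, None]
Xdot = forward_difference(X, dt)
max_err = np.max(np.abs(Xdot[:, 0] - np.cos(t[:-1])))
print(max_err)  # first-order accurate: the error is on the order of dt
```

As the surrounding text notes, this estimate amplifies measurement noise, which is why noise-robust differentiation matters in practice.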

This estimation procedure is numerically ill-conditioned if the data are noisy, although there are many methods to handle noise that work very well if used correctly.^{58,59} Noise-robust derivatives were investigated in the original SINDy algorithm.^{40} Next, we consider a library of candidate nonlinear functions $\Theta(X)$ of the form

$$\Theta(X) = \begin{bmatrix} \mathbf{1} & X & X^2 & \cdots & X^d \end{bmatrix}.$$

Here, the matrix $X^d$ denotes a matrix whose column vectors are given by all possible time series of $d$-th degree polynomials in the state $x$. The terms in $\Theta$ can be functional forms motivated by knowledge of the physics; within the proposed work, they may parameterize a piecewise-affine dynamical model. Following best practices of statistical learning,^{56} we preprocess by mean-subtracting and normalizing each column of $\Theta$ to have unit variance. The dynamical system can now be represented in terms of the data matrices as

$$\dot{X} = \Theta(X)\,\Xi.$$
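
A polynomial library and the column normalization described above can be sketched as follows (our own illustrative implementation, not the paper's code):

```python
import numpy as np
from itertools import combinations_with_replacement

def polynomial_library(X, degree=2):
    """Candidate library Theta(X): a constant column plus all monomials
    of the state variables up to the given degree."""
    n_samples, n_state = X.shape
    columns = [np.ones(n_samples)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n_state), d):
            columns.append(np.prod(X[:, idx], axis=1))
    return np.column_stack(columns)

def normalize(Theta):
    """Mean-subtract and scale each column to unit variance;
    zero-variance columns (the constant) are not rescaled."""
    mu, sigma = Theta.mean(axis=0), Theta.std(axis=0)
    sigma[sigma == 0] = 1.0
    return (Theta - mu) / sigma

X = np.random.randn(100, 3)             # 100 snapshots of a 3-state system
Theta = polynomial_library(X, degree=2)
print(Theta.shape)                      # (100, 10): 1, x, y, z, and 6 quadratics
```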

The coefficients in the column $\Xi_k$ of $\Xi$ determine the active terms in the $k$-th row of Eq. (2). A parsimonious model has the fewest terms in $\Xi$ required to explain the data. One option to obtain a sparse model is via convex $\ell_1$-regularized regression

$$\Xi_k = \underset{\Xi_k'}{\operatorname{argmin}}\; \|\dot{X}_k - \Theta(X)\,\Xi_k'\|_2^2 + \gamma\,\|\Xi_k'\|_1.$$

The hyperparameter *γ* balances complexity and sparsity of the solution. Sparse regression, such as the least absolute shrinkage and selection operator (LASSO)^{55} and sequential thresholded least squares,^{40} improves the robustness of identification for noisy overdetermined data, in contrast to earlier methods^{60} using compressed sensing.^{61,62} Other regularization schemes, such as the elastic net,^{63} may be used to improve performance.

In this paper, we use sequentially thresholded ridge regression,^{47} which iteratively solves the ridge regression

$$\Xi = \underset{\Xi'}{\operatorname{argmin}}\; \|\dot{X} - \Theta(X)\,\Xi'\|_2^2 + \alpha\,\|\Xi'\|_2^2$$

and then thresholds any coefficient that is smaller than *γ*. The procedure is repeated on the non-zero entries of $\Xi $ until the model converges. The convergence of the SINDy architecture has been discussed in Ref. 64. After a sparse model structure has been identified in normalized coordinates, it is necessary to regress onto this sparse structure in the original unnormalized coordinates. Otherwise, non-physical constant terms appear when transforming back from normalized coordinates due to the mean-subtraction.
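
The sequentially thresholded ridge regression can be sketched compactly. The following is our own minimal implementation on a toy problem (the paper's version lives in the linked `sparsereg` package):

```python
import numpy as np

def stridge(Theta, dxdt, alpha=1e-3, gamma=0.1, max_iter=10):
    """Sequentially thresholded ridge regression: solve a ridge problem,
    zero coefficients with magnitude below gamma, and repeat on the
    surviving terms until the active set stops changing."""
    n_features = Theta.shape[1]
    active = np.ones(n_features, dtype=bool)
    xi = np.zeros(n_features)
    for _ in range(max_iter):
        A = Theta[:, active]
        # Ridge solution on the active columns: (A^T A + alpha I)^{-1} A^T dxdt
        coef = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ dxdt)
        xi[:] = 0.0
        xi[active] = coef
        new_active = np.abs(xi) >= gamma
        if np.array_equal(new_active, active):
            break
        active = new_active
    return xi

# Toy problem: the derivative depends on only two of five candidate features.
rng = np.random.default_rng(0)
Theta = rng.normal(size=(200, 5))
xi_true = np.array([0.0, 2.0, 0.0, -1.5, 0.0])
dxdt = Theta @ xi_true + 0.01 * rng.normal(size=200)
xi = stridge(Theta, dxdt)
print(np.nonzero(xi)[0])  # → [1 3]
```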

In La Cava *et al.*,^{35} the authors pursue a complementary, although more computationally intensive, idea of adaptive modeling in the context of generalized linear models. Starting from an initial guess for the model, a brute-force search is conducted to scan a larger set of candidate functions $\theta \rightarrow \theta\,{\theta'}^{\gamma}$, where the $\theta'$ are multiplicative extensions of the initial set of candidate functions and the $\gamma$ are real-valued exponents. The intended use of this method is the refinement of first-principles models by the discovery of coupling terms. It is possible to combine this refinement with our proposed scheme for dealing with abrupt changes. In addition, sparse sensors^{65} and randomized algorithms^{66} may improve speed.

## III. METHODS

The viewpoint of sparsity extends beyond model discovery, and we propose to extend SINDy to identify systems undergoing abrupt changes. It may be the case that abrupt model changes will only involve the addition, deletion, or modification of a few terms in the model. This is a statement of the *parsimony of change*, and indicates that we can use sparse regression to efficiently identify the new or missing terms with considerably less data than required to identify a new model from scratch. In general, each additional term that must be identified requires additional training data to distinguish between joint effects. Thus, having only a few changes reduces the amount of data required, making the model recovery more rapid. This section will describe a procedure that extends SINDy to handle three basic types of model changes:

- **Variation of a term.** If the structure of the model is unchanged and only the parameters vary, we perform least-squares regression on the known structure to identify the new parameters. This is computationally fast, and it is easy to check whether the model explains the new dynamics or whether it is necessary to explore additions or deletions of terms.
- **Deletion of a term.** If the model changes by the removal of a few terms, then SINDy regression can be applied to the sparse coefficients in order to identify which terms have dropped out.
- **Addition of a term.** If a term is added, then SINDy regression will find the sparsest combination of inactive terms that explains the model error. Since least-squares regression scales asymptotically as $\mathcal{O}(p^3)$, with *p* the number of columns in the library, this is computationally less expensive than regression in the entire library.

Combinations of these changes, such as a simultaneous addition and deletion, are more challenging and will also be explored. This approach is known as *abrupt-SINDy*, and it is depicted schematically in Fig. 2.

### A. Baseline model

First, we must identify a baseline SINDy model, using a grid search to determine the optimal hyperparameters. In a grid search, all combinations of hyperparameters are tested and the best-performing set is selected. This search is performed only once, locking in the hyperparameters for future updates. The baseline model is characterized by the sparse coefficients $\Xi_0$.
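
A toy version of such a hyperparameter search is sketched below. This is a simplified stand-in (a single held-out split rather than the five-fold cross-validation used in the paper, and a one-pass thresholded ridge fit), intended only to show the selection logic; all names and data are illustrative:

```python
import numpy as np
from itertools import product

def fit_thresholded_ridge(Theta, y, alpha, gamma):
    """One ridge regression pass followed by hard thresholding at gamma."""
    xi = np.linalg.solve(Theta.T @ Theta + alpha * np.eye(Theta.shape[1]), Theta.T @ y)
    xi[np.abs(xi) < gamma] = 0.0
    return xi

def explained_variance(y_true, y_pred):
    """Explained variance score: 1 is perfect, 0 is no better than the mean."""
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

# Synthetic data; (alpha, gamma) are chosen by score on a held-out split.
rng = np.random.default_rng(1)
Theta = rng.normal(size=(300, 6))
y = Theta @ np.array([0.0, 1.0, 0.0, 0.0, -2.0, 0.0]) + 0.05 * rng.normal(size=300)
train, val = slice(0, 200), slice(200, 300)

best = max(
    product([1e-4, 1e-2, 1.0], [0.05, 0.1, 0.5]),
    key=lambda p: explained_variance(
        y[val], Theta[val] @ fit_thresholded_ridge(Theta[train], y[train], *p)),
)
print(best)
```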

### B. Detecting model divergence

It is essential to rapidly detect any change in the model, and we employ a classical predictor-corrector scheme.^{12} The predictor step is performed over a time $\tau_{\text{pred}}$ in the interval $[t, t+\tau_{\text{pred}}]$ using the model valid at time *t*. The divergence of the predicted and measured state is computed at $t+\tau$ as $\|\Delta x\| = \|\hat{x}(t+\tau) - x(t+\tau)\|$, where $\hat{x}$ is the prediction and $x$ is the measurement. The idea is to identify when the model and the measurement diverge faster than predicted by the dynamics of the system. For a chaotic system, the divergence of a trajectory is measured by the largest Lyapunov exponent of the system,^{67} although a wealth of similar measures has been suggested.^{68} The Lyapunov exponent is defined as

$$\lambda = \lim_{\tau \to \infty} \; \lim_{\|\Delta x(t_0)\| \to 0} \frac{1}{\tau} \ln \frac{\|\Delta x(t_0 + \tau)\|}{\|\Delta x(t_0)\|},$$

and its inverse sets the fastest timescale. Here, the analogy between ensemble and time averages is used, more precisely its local, finite-time equivalent.^{69,70} An improvement can be achieved by exploiting an ensemble, e.g., by adding noise to the state $x$ corresponding to the given measurement accuracy. Since we know the dynamical system for the prediction step, the Lyapunov exponent is determined by evolving the tangent space with the system.^{71,72}

In our detection algorithm, we fix a fluctuation tolerance $\Delta x$ and measure whether the divergence time we find deviates from the expectation. If the data are noisy, this tolerance must be significantly larger than the typical fluctuation scale of the noise. Formally, the model and measurements diverge if the timescales given by the local Lyapunov exponent and the prediction horizon disagree. The local Lyapunov exponent is computed directly from the eigenvalues of the dynamical system.^{71,72} The prediction horizon *T*(*t*) is the first-passage time at which the prediction $\hat{x}(t+\Delta t)$ with initial condition $x(t)$ and the measurement $x(t+\Delta t)$ differ by more than $\Delta x$:

$$T(t) = \min\,\{\Delta t : \|\hat{x}(t+\Delta t) - x(t+\Delta t)\| > \Delta x\}. \tag{10}$$

Analogous to the local Lyapunov exponent, we compute the ratio $\log\left(\|\Delta x(t_0+\tau)\| / \|\Delta x(t_0)\|\right)$ as a measure of the divergence based on the measurement. For the model, we compute the local Lyapunov exponent as the average maximum eigenvalue $\bar{\lambda}(t) = \langle \lambda(t') \rangle_{t' \in [t, t+T]}$, with $\lambda(t) = \max_i \lambda_i(t)$ and $\lambda_i v_i(t) = \left.\partial f_j / \partial x_k \right|_{x(t)} v_i(t)$. Thus, we compare the expected and observed trajectory divergence. Model and measurement have diverged at time *t* if the model timescale and the measured one differ:

$$\frac{1}{T(t)} \log \frac{\|\Delta x(t+T)\|}{\|\Delta x(t)\|} > \alpha\,\bar{\lambda}(t). \tag{11}$$

If the model is not chaotic, but the measurement is chaotic, one must invert the inequality, as in Fig. 3. The empirical factor α accounts for finite-time statistics.

This method depends heavily on the particular system under investigation, including the dynamics, timescales, and sampling rate. In a practical implementation, these considerations must be handled carefully and automatically. It is important to note that we are able to formulate the divergence in terms of dynamical systems theory because our model *is* a dynamical system; for other model classes, such as artificial neural networks, this is not possible because they lack such a mathematical framework.
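
The first-passage prediction horizon of Eq. (10) is straightforward to compute from a predicted and a measured trajectory. The sketch below is our own toy illustration on scalar exponential dynamics, where the "measurement" uses a changed parameter:

```python
import numpy as np

def prediction_horizon(model_traj, measured_traj, dt, delta_x):
    """First-passage time at which the model prediction and the measurement,
    started from the same state, differ by more than delta_x;
    np.inf if they never do within the window."""
    err = np.linalg.norm(model_traj - measured_traj, axis=-1)
    crossed = np.nonzero(err > delta_x)[0]
    return crossed[0] * dt if crossed.size else np.inf

# Toy example: x' = a*x with the model believing a = 0.1. Before the
# abrupt change the measurement agrees; afterwards it uses a = 0.3,
# so the prediction horizon becomes short and finite.
dt = 1e-3
t = np.arange(0.0, 5.0, dt)
model = np.exp(0.1 * t)[:, None]
before = np.exp(0.1 * t)[:, None]
after = np.exp(0.3 * t)[:, None]

print(prediction_horizon(model, before, dt, delta_x=0.05))  # → inf
print(prediction_horizon(model, after, dt, delta_x=0.05))   # short, finite horizon
```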

### C. Adaptive model fitting

After a change is detected, the following procedure is implemented to rapidly recover the model:

First, the new data are regressed onto the existing sparse structure $\Xi_0$ to identify varying parameters.

Next, we identify deletions of terms by performing the sparse regression on the sparse columns of $\Theta$ that correspond to nonzero rows in $\Xi_0$. This is more efficient than identifying a new model, as we only seek to delete existing terms from the model.

Finally, if there is still a residual error, then a sparse model is fit for this error in the inactive columns of $\Theta$ that correspond to zero rows in $\Xi_0$. In this way, new terms may be added to the model.

If the residual is sufficiently small after any step, the procedure ends. Alternatively, the procedure may be iterated until convergence. We solve smaller regression problems by restricting our attention to subsets of the columns of $\Theta$. These smaller regressions require less data and are less computationally expensive^{63} than fitting a new model. The deletion-addition procedure is performed after a model divergence is detected, using new transient data collected in an interval of size $t_{\text{update}}$.
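
The three-stage update can be sketched as follows. This is our own simplified single-pass version (one least-squares fit plus hard thresholding per stage, rather than the full iterated SINDy regression), on a toy problem with one deleted and one added term:

```python
import numpy as np

def threshold_lstsq(A, y, gamma):
    """Least squares followed by hard thresholding at gamma (one pass)."""
    xi, *_ = np.linalg.lstsq(A, y, rcond=None)
    xi[np.abs(xi) < gamma] = 0.0
    return xi

def abrupt_update(Theta, dxdt, xi_old, gamma=0.2, tol=1e-3):
    """Sketch of the three-stage update: re-fit parameters on the old
    support (variation), drop terms that fall below the threshold
    (deletion), then fit the residual on the previously inactive
    columns (addition)."""
    active = xi_old != 0
    xi = np.zeros_like(xi_old)
    xi[active] = threshold_lstsq(Theta[:, active], dxdt, gamma)
    residual = dxdt - Theta @ xi
    if np.mean(residual**2) > tol:
        inactive = ~active
        xi[inactive] = threshold_lstsq(Theta[:, inactive], residual, gamma)
    return xi

rng = np.random.default_rng(2)
Theta = rng.normal(size=(200, 6))
xi_old = np.array([1.0, 0.0, -2.0, 0.0, 0.0, 0.0])      # previously identified model
xi_new_true = np.array([1.0, 0.0, 0.0, 0.0, 0.5, 0.0])  # term 2 deleted, term 4 added
dxdt = Theta @ xi_new_true + 0.01 * rng.normal(size=200)
xi = abrupt_update(Theta, dxdt, xi_old)
print(np.nonzero(xi)[0])  # → [0 4]: the deletion and the addition are recovered
```

Restricting each stage to a subset of the library columns is what keeps the regression problems small, which is the source of the data and runtime savings discussed above.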

## IV. RESULTS

In this section, we describe the results of the abrupt-SINDy framework on dynamical systems with abrupt changes, including parameter variation, deletion of terms, and addition of terms. The proposed algorithm is compared against the original SINDy algorithm, which is used to identify a new model from scratch, in terms of data required, computational time, and model accuracy.

In each case, we begin by running a grid-search algorithm^{73,74} to identify the main hyperparameters: *α*, the ridge regression regularization parameter; *γ*, the thresholding parameter; $n_{\text{degree}}$, the maximum degree of the polynomial feature transformation; and $n_{\text{fold}}$, the number of cross-validation runs. For scoring, we use the explained variance score and conduct a five-fold cross validation for each point in the $(\alpha, \gamma, n_{\text{degree}})$ parameter grid. Parameters are listed in Table I.

### A. Lorenz system

The Lorenz system is a well-studied, and highly simplified, conceptual model for atmospheric circulation^{75}

$$\dot{x} = \sigma\,(y - x), \qquad \dot{y} = x\,(\rho - z) - y, \qquad \dot{z} = xy - \beta z,$$

where the parameter *ρ* represents the heating of the atmosphere, corresponding to the Rayleigh number, *σ* corresponds to the Prandtl number, and *β* to the aspect ratio.^{76} The parameters are set to *ρ* = 28, $\beta = 8/3$, and *σ* = 10 (Table I).

In the following, we integrate the system numerically to produce a reference data set. We deliberately change the parameter *ρ* to *ρ* = 15 at *t* = 40 and back to *ρ* = 28 at *t* = 80, as shown in Figs. 4 and 5. These parametric changes lead to a bifurcation in the dynamics, and they are detected quickly. The adapted parameters are subsequently recovered accurately to two digits, as shown in Table II. Because we identify the sparse model structure on a normalized library $\Theta$, with zero mean and unit variance, we must de-bias the parameter estimates by computing a least-squares regression onto the sparse model structure in the original unnormalized variables. Otherwise, computing the least-squares regression in the normalized library, as is typically recommended in machine learning, would result in non-physical constant terms in the original unnormalized coordinates.
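
For concreteness, a reference data set of this kind might be generated as below (our own sketch: fixed-step RK4 with a step size and initial condition chosen by us, not taken from the paper):

```python
import numpy as np

def lorenz(state, sigma=10.0, beta=8.0 / 3.0, rho=28.0):
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4(f, x0, dt, n_steps, **params):
    """Fixed-step fourth-order Runge-Kutta integration."""
    X = np.empty((n_steps + 1, len(x0)))
    X[0] = x0
    for i in range(n_steps):
        k1 = f(X[i], **params)
        k2 = f(X[i] + 0.5 * dt * k1, **params)
        k3 = f(X[i] + 0.5 * dt * k2, **params)
        k4 = f(X[i] + dt * k3, **params)
        X[i + 1] = X[i] + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return X

# rho = 28 for t in [0, 40), then an abrupt change to rho = 15 for t in [40, 80].
dt = 0.01
X1 = rk4(lorenz, np.array([1.0, 1.0, 1.0]), dt, 4000, rho=28.0)
X2 = rk4(lorenz, X1[-1], dt, 4000, rho=15.0)
data = np.vstack([X1[:-1], X2])
print(data.shape)  # → (8001, 3)
```

For ρ = 15, the trajectory settles onto a stable fixed point with z = ρ − 1 = 14, which is the bifurcation the detection step must catch.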

**Table II.** Models identified for the Lorenz system before and after the abrupt changes in *ρ*.

| $t_{\text{detected}}$ | $t_{\text{update}}$ | Equations |
|---|---|---|
| 0.00 | 10.0 | $\dot{x} = -10.0x + 10.0y$; $\dot{y} = 27.96x - 0.99y - 1.0xz$; $\dot{z} = -2.67z + 1.0xy$ |
| 40.01 | 41.0 | $\dot{x} = -10.0x + 10.0y$; $\dot{y} = 15.0x - 1.0y - 1.0xz$; $\dot{z} = -2.67z + 1.0xy$ |
| 80.02 | 81.0 | $\dot{x} = -10.0x + 10.0y$; $\dot{y} = 27.98x - 1.0y - 1.0xz$; $\dot{z} = -2.67z + 1.0xy$ |


Abrupt changes to the system parameters are detected using the prediction horizon from Eq. (10). When the system changes, the prediction horizon should decrease, with a smaller horizon corresponding to a more serious change. Conversely, the inverse time, corresponding to the Lyapunov exponent, should diverge. Figure 6 exhibits this expected behavior. After a change is detected, the model is rapidly recovered, as shown in Table II. It is important to confirm that the updated model accurately represents the structure of the true dynamics. Figure 6 shows the norm of the model coefficients, $\|\xi - \hat{\xi}\|$, which is a measure of the distance between the estimated and true systems. Except for a short time ($t_{\text{update}} = 1$) after the abrupt change, the identified model closely agrees with the true model.

#### 1. Effects of noise and data volume

An important set of practical considerations includes how noise and the amount of data influence the speed of change detection and the accuracy of subsequent model recovery. Both the noise robustness and the amount of data required will change for a new problem, and here we report trends for this specific case. In addition, the amount of data required is also related to the sampling rate, which is the subject of ongoing investigation; in some cases, a higher sampling rate may even degrade model performance due to numerical effects.^{58}

Figure 7 shows the model fit following the abrupt change, comparing both the abrupt-SINDy method, which uses information about the existing model structure, and the standard SINDy method, which re-identifies the model from scratch following a detected change. In this figure, the model quality is shown as a function of the amount of data collected after the change.

The abrupt-SINDy method is able to identify more accurate models in a very short amount of time, given by $t_{\text{update}} \approx 0.1$. At this point, the standard SINDy method shows comparable error; however, for even shorter times, the data are no longer sufficient for the conventional method. Since the adaptive method starts near the optimal solution, larger data sets do not degrade the model, which was an unexpected additional advantage.

Figure 8 explores the effect of additive noise on the derivative for both the abrupt-SINDy and standard SINDy algorithms. Note that in practice, noise will typically corrupt the measurement of $x$, as in the original SINDy algorithm,^{40} requiring a denoising derivative;^{58,59} however, simple additive noise on the derivative is useful for investigating the robustness of the regression procedure. Abrupt-SINDy has considerably higher noise tolerance than the standard algorithm, as it must identify fewer unknown coefficients. In fact, it is able to handle approximately an order of magnitude more noise before failing to identify a model. Generally, increasing the volume of data collected improves the model. The critical point in the abrupt-SINDy curves corresponds to small but dynamically important terms being misidentified as a result of an insufficient signal-to-noise ratio. Although the noise and the chaotic signal cannot be easily distinguished at small signal-to-noise ratios, it may be possible to separate them using spectral analysis, since chaos yields red noise in contrast to the white additive noise.

### B. Van der Pol

As a second example, we consider the famous nonlinear Van der Pol oscillator.^{77} We include additional quadratic nonlinearities $\alpha x^2$ and $\alpha y^2$ to study the ability of our method to capture structural changes when these terms are added and removed abruptly. This example focuses on the important class of periodic phenomena, in contrast to the chaotic Lorenz dynamics. The modified Van der Pol oscillator is described by the following equations:

$$\dot{x} = y + \alpha y^2, \qquad \dot{y} = \mu\,(1 - x^2)\,y - x + \alpha x^2,$$

where $\mu > 0$ is a parameter controlling the nonlinear damping, and *α* parameterizes the additional quadratic nonlinearity. The reference data set is shown in Fig. 9, with $\mu = 7.5$ and *α* = 0 for $t \in [0, 100]$, which results in a canonical periodic orbit. At *t* = 100, we introduce a structural change, switching on the quadratic nonlinearity ($\alpha = -0.25$) and driving the system to a stable fixed point. We also modify the parameter *μ*, setting it to $\mu = 6.0$. Finally, at *t* = 200, we switch off the additional nonlinearity (*α* = 0) and keep *μ* = 6.
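
The baseline stage (μ = 7.5, α = 0) can be simulated as below. The placement and sign of the quadratic terms in `vdp` are our assumption for illustration only; the amplitude check applies to the unmodified (α = 0) oscillator, whose limit cycle has amplitude near 2:

```python
import numpy as np

def vdp(state, mu=7.5, alpha=0.0):
    """Van der Pol oscillator; the quadratic extensions alpha*y^2 and
    alpha*x^2 are placed here by assumption, for illustration only."""
    x, y = state
    return np.array([y + alpha * y**2, mu * (1.0 - x**2) * y - x + alpha * x**2])

def rk4(f, x0, dt, n_steps, **params):
    """Fixed-step fourth-order Runge-Kutta integration."""
    X = np.empty((n_steps + 1, len(x0)))
    X[0] = x0
    for i in range(n_steps):
        k1 = f(X[i], **params)
        k2 = f(X[i] + 0.5 * dt * k1, **params)
        k3 = f(X[i] + 0.5 * dt * k2, **params)
        k4 = f(X[i] + dt * k3, **params)
        X[i + 1] = X[i] + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return X

# Baseline stage: mu = 7.5, alpha = 0 on t in [0, 100].
dt = 0.005
X = rk4(vdp, np.array([0.5, 0.0]), dt, 20000, mu=7.5, alpha=0.0)
amplitude = np.max(np.abs(X[10000:, 0]))  # after transients have decayed
print(amplitude)  # the canonical limit cycle has amplitude near 2
```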

Table III shows the corresponding models recovered using the abrupt-SINDy method. The change is detected using the Lyapunov time defined in Eq. (11), as shown in Fig. 10. Again, the estimated Lyapunov time [Fig. 10(b)] captures the changes in the model, which correspond to peaks in the structural model error [Fig. 10(a)]. While the first and third stages are indeed identified correctly, the term $-1.25x$ is preferred over $-x - 0.25x^2$ in the sparse estimate for $\dot{y}$ in the orange trajectory. However, since both terms look similar near the fixed point at $x \sim 1$, this describes the dynamics well. This type of misidentification often occurs in data mining when features are highly correlated^{63} and is related more to sparse regression in general than to the proposed abrupt-SINDy method. For dynamical system identification, the correct nonlinearity could be resolved by obtaining more transient data, i.e., by perturbing the system through actuation. However, this model may be sufficient for control while a more accurate model is identified.

**Table III.** Models identified for the modified Van der Pol system.

| $t_{\text{detected}}$ | $t_{\text{update}}$ | Equations |
|---|---|---|
| 0.00 | 20.01 | $\dot{x} = 1.0y$; $\dot{y} = -1.0x + 4.99y - 4.99x^2y$ |
| 106.39 | 116.00 | $\dot{x} = 0.99y + 0.25y^2$; $\dot{y} = -1.26x + 7.46y - 7.46x^2y$ |
| 200.12 | 210.00 | $\dot{x} = 1.0y$; $\dot{y} = -1.0x + 5.98y - 5.98x^2y$ |


## V. CONCLUSIONS

In this work, we develop an adaptive nonlinear system identification strategy designed to rapidly recover nonlinear models from limited data following an abrupt change to the system dynamics. The sparse identification of nonlinear dynamics (SINDy) framework is ideal for change detection and model recovery, as it relies on parsimony to select the fewest active terms required to model the dynamics. In our adaptive abrupt-SINDy method, we rely on previously identified models to identify the fewest *changes* required to recover the model. This modified algorithm is shown to be highly effective at model recovery following an abrupt change, requiring less data and less computation time, and having improved noise robustness, compared with identifying a new model from scratch. The abrupt-SINDy method is demonstrated on several numerical examples exhibiting chaotic and periodic dynamics, as well as parametric and structural changes, enabling real-time model recovery.

There are limitations of the method, which may be addressed by several promising directions for improving the abrupt-SINDy framework:

- **Fallback models:** In the current implementation, after a change has been detected, the old model is used until enough data are collected to identify a new model. The dynamic mode decomposition^{17} provides an alternative fallback model that may be identified rapidly with even less data. Additionally, instead of relying on a sparse update to the current model, it is sensible to also maintain a library of past models for rapid characterization.^{78}
- **Hyperparameterization:** In the initial prototype, the hyperparameters $\Delta x$ and $t_{\text{update}}$ are fixed. Over time, an improved algorithm may learn and adapt optimal hyperparameters.
- **Comprehensive Lyapunov time estimation:** According to Eq. (10), the Lyapunov time $T(t\,|\,\Delta x)$ is estimated for a fixed $\Delta x$. Estimating the time for a range of values, i.e., $\Delta x \in (0, \Delta x_{\max}]$, would be more robust and may provide a richer analysis without requiring additional data. Further investigation is needed for chaotic systems, where the numerical calculation of the Lyapunov exponent may fail to reveal divergence because of simple averaging over time. Given the importance of detecting model divergence, this is a particularly important area of future research.
- **Advanced optimization and objectives:** Looking forward, advanced optimization techniques may be used to further improve the adaptation to system changes. Depending on the system, other objectives may be optimized, either by including regularization or in a multi-objective optimization.

The proposed abrupt-SINDy framework is promising for the real-time recovery of nonlinear models following abrupt changes. It will be interesting to compare with other recent algorithms that learn local dynamics for control in response to abrupt changes.^{79} Future work will be required to demonstrate this method on more sophisticated engineering problems and to incorporate it in controllers. To understand the limitations for practical use, many further studies are needed, and it will be particularly useful to test this method on a real experiment. The abrupt-SINDy modeling framework may also help inform current rapid learning strategies in neural network architectures,^{2–4} potentially allowing dynamical systems methods to inform rapid training paradigms in deep learning.

## ACKNOWLEDGMENTS

M.Q. was supported by a fellowship within the FITweltweit program of the German Academic Exchange Service (DAAD). M.Q. and M.A. acknowledge support by the European Erasmus SME/HPC project (588372-EPP-1-2017-1-IE-EPPKA2-KA). S.L.B. acknowledges support from the ARO and AFOSR Young Investigator Programs (ARO W911NF-17-1-0422 and AFOSR FA9550-18-1-0200). S.L.B. and J.N.K. acknowledge support from DARPA (HR0011-16-C-0016). We would like to thank the anonymous reviewers for their comments which helped to improve this manuscript. We also acknowledge valuable discussions related to abrupt model recovery and programming wisdom with Dennis Bernstein, Karthik Duraisamy, Thomas Isele, Eurika Kaiser, and Hod Lipson.

All code for this work is publicly available on Github: https://github.com/Ohjeah/sparsereg.