A general, variational approach to derive low-order reduced models from possibly non-autonomous systems is presented. The approach is based on the concept of optimal parameterizing manifold (OPM), which substitutes for more classical notions of invariant or slow manifolds when the breakdown of “slaving” occurs, i.e., when the unresolved variables can no longer be expressed as an exact functional of the resolved ones. The OPM provides, within a given class of parameterizations of the unresolved variables, the manifold that optimally averages out these variables as conditioned on the resolved ones. The class of parameterizations retained here is that of continuous deformations of parameterizations rigorously valid near the onset of instability. These deformations are produced through the integration of auxiliary backward–forward systems built from the model’s equations and lead to analytic formulas for parameterizations. In this *modus operandi*, the backward integration time is the key parameter to select, per scale/variable to parameterize, in order to derive the relevant parameterizations, which are bound to be no longer exact away from the instability onset due to the breakdown of slaving typically encountered, e.g., in chaotic regimes. The selection is then made through a data-informed minimization of a least-squares parameterization defect. It is thus shown, through optimization of the backward integration time per scale/variable to parameterize, that skillful OPM reduced systems can be derived for predicting with accuracy higher-order critical transitions or catastrophic tipping phenomena, while the training of our parameterization formulas takes place for regimes prior to these transitions.

We introduce a framework for model reduction that produces analytic formulas to parameterize the neglected variables. These parameterizations are built from the model’s equations, in which only one scalar is left to calibrate per scale/variable to parameterize. This calibration is accomplished through a data-informed minimization of a least-squares parameterization defect. By their analytic fabric, the resulting parameterizations benefit from physical interpretability. Furthermore, our hybrid framework—analytic and data-informed—enables us to bypass the so-called extrapolation problem, known to be an important issue for purely data-driven, machine-learned parameterizations. Here, by training our parameterizations prior to transitions, we are able to perform accurate predictions of these transitions via the corresponding reduced systems.

## I. INTRODUCTION

Here, $A$ is a linear operator,^{1} which includes dissipative effects, and $G$ denotes a nonlinear operator, which saturates the possible unstable growth due to unstable modes and may account for a loss of regularity (such as for nonlinear advection) in the infinite-dimensional setting. Such systems arise in a broad range of applications; see, e.g., Refs. 2–11.

The framework adopted is that of Ref. 12, which allows for deriving analytic parameterizations of the neglected scales that represent efficiently the nonlinear interactions with the retained, resolved variables. The originality of the approach of Ref. 12 is that the parameterizations are hybrid in their nature in the sense that they are both model-adaptive, based on the dynamical equations, and data-informed by high-resolution simulations.

For a given system, the optimization of these parameterizations thus benefits from their analytical origin, resulting in only a few parameters to learn over short training intervals, mainly one scalar parameter per scale to parameterize. Their analytic fabric contributes to their physical interpretability compared to, e.g., parameterizations that would be machine-learned. The approach is applicable to deterministic systems, finite- or infinite-dimensional, and is based on the concept of optimal parameterizing manifold (OPM), which substitutes for the more classical notion of slow or invariant manifolds when a breakdown of “slaving” relationships between the resolved and unresolved variables takes place,^{12} i.e., when the latter can no longer be expressed as an exact functional of the former.

By construction, the OPM takes its origin in a variational approach. The OPM is the manifold that optimally averages out the neglected variables as conditioned on the current state of the resolved ones (Ref. 12, Theorem 4). The OPM allows for computing approximations of the conditional expectation term arising in the Mori–Zwanzig approach to stochastic modeling of neglected variables;^{13–17} see also Theorem 5 in Ref. 12 as well as Refs. 18 and 19.

The approach introduced in Ref. 12 to determine OPMs in practice consists of first deriving analytic parametric formulas that match rigorous leading approximations of unstable/center manifolds or slow manifolds near, e.g., the onset of instability (Ref. 12, Sec. 2), and then performing a data-driven optimization of the manifold formulas’ parameters to handle regimes further away from that instability onset (Ref. 12, Sec. 4). In other words, efficient parameterizations away from the onset are obtained as continuous deformations of parameterizations near the onset; these deformations are optimized by minimizing cost functionals that are tailored to the dynamics and measure the defect of parameterization.

The optimization stage allows for alleviating the small-denominator problems rooted in small spectral gaps and for improving the parameterization of small-energy but dynamically important variables. Thereby, relevant parameterizations are derived in regimes where constraining spectral gap or timescale separation conditions are responsible for the well-known failure of standard invariant/inertial or slow manifolds.^{12,20–25} As a result, the OPM approach (i) provides a natural remedy to the excessive backscatter transfer of energy to the low modes classically encountered in turbulence (Ref. 12, Sec. 6) and (ii) allows for deriving optimal models of the slow motion for fast-slow systems, not necessarily in the presence of timescale separation.^{18,19} Due to their optimal nature, OPMs also provide accurate parameterizations of dynamically important small-energy variables, a well-known issue encountered in the closure of chaotic dynamics and related to (i).

This work examines the ability of the theory-guided and data-informed parameterization approach of Ref. 12 to derive reduced systems able to predict higher-order transitions or catastrophic tipping phenomena when the original, full system is possibly subject to time-dependent perturbations. From a data-driven perspective, this problem is tied to the so-called extrapolation problem, known, for instance, to be an important issue in machine learning, requiring more advanced methods such as transfer learning.^{26} While the past few decades have witnessed a surge of and advances in many data-driven reduced-order modeling methodologies,^{27,28} the prediction of non-trivial dynamics for parameter regimes not included in the training dataset remains a challenging task. Here, the OPM approach, by its hybrid framework—analytic and data-informed—allows us to bypass this extrapolation problem at a minimal cost in terms of learning efforts, as illustrated in Secs. IV and V. As shown below, training OPMs at parameter values prior to the transitions taking place is sufficient to perform accurate predictions of these transitions via OPM reduced systems.

The remainder of this paper is organized as follows. We first survey in Sec. II the OPM approach and provide the general variational parameterization formulas for model reduction in the presence of a time-dependent forcing. We then expand in Sec. III on the backward–forward (BF) systems method^{12,29} to derive these formulas, clarifying fundamental relationships with homological equations arising in normal form and invariant manifold theories.^{12,30–32} Section III C completes this analysis with analytic versions of these formulas in view of applications. The ability to predict noise-induced transitions and catastrophic tipping phenomena^{33,34} through OPM reduced systems is illustrated in Sec. IV for a system arising in the study of thermohaline circulation. Successes in predicting higher-order transitions such as period-doubling and chaos are reported in Sec. V for a Rayleigh–Bénard problem, and contrasted with the difficulties encountered by standard manifold parameterization approaches in Appendix D. We then summarize the findings of this article with some concluding remarks in Sec. VI.

## II. VARIATIONAL PARAMETERIZATIONS FOR MODEL REDUCTION

We summarize in this section the notion of variational parameterizations introduced in Ref. 12. The state space is decomposed as the sum of the subspace $H_c$ of resolved variables (coarse scales) and the subspace $H_s$ of unresolved variables (small scales). In practice, $H_c$ is spanned by the first few eigenmodes of $A$ with dominant real parts (e.g., unstable) and $H_s$ by the rest of the modes, typically stable.

The *modus operandi* to provide such continuous deformations is detailed in Sec. III, based on the method of backward–forward systems introduced in Ref. 29, Chap. 4, in a stochastic context. In the case where $G(y) = G_k(y) + \text{h.o.t.}$, with $G_k$ a homogeneous polynomial of degree $k$, this approach gives analytic formulas of parameterizations given by (see Sec. III A)

*parameterization defect*:

Thus, minimizing each $Q_n(\tau_n, T)$ (in the $\tau_n$-variable) is a natural idea to enforce closeness of $y(t)$, in a least-squares sense, to the manifold $M_\tau(t)$. Panel (a) in Fig. 1 illustrates (5) for the $y_n$-component: the optimal parameterization $\Phi_n(\tau_n^*, X, t)$ minimizing (4) is shown for a case where the dynamics is transverse to it (e.g., in the absence of slaving); $\Phi_n(\tau_n^*, X, t)$ nevertheless provides the best parameterization in a least-squares sense.

In practice, the *normalized parameterization defect*, $\widetilde{Q}_n$, is often used to compare different parameterizations. It is defined as $\widetilde{Q}_n(\tau, T) = Q_n(\tau, T)/\overline{|y_n|^2}$. For instance, the flat manifold corresponding to no parameterization ($\tau_n = 0$) of the neglected variables (Galerkin approximations) comes with $\widetilde{Q}_n = 1$ for all $n$, while a manifold corresponding to a perfect slaving relationship between $y_c(t)$ and the $y_n(t)$’s comes with $\widetilde{Q}_n = 0$. When $0 < \widetilde{Q}_n < 1$ for all $n$, the underlying manifold $M_\tau$ will be referred to as a *parameterizing manifold (PM)*. Once the parameters $\tau_n$ are optimized by minimization of (4), the resulting manifold will be referred to as the optimal parameterizing manifold (OPM).^{35}
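In code, the normalized defect reduces to a ratio of two Riemann sums. The following minimal Python sketch (the function names and the uniform-sampling convention are ours, not from Ref. 12) evaluates the defect and its normalized version from a sampled high-mode trajectory $y_n(t)$ and its parameterized surrogate $\Phi_n(t)$:

```python
import numpy as np

def parameterization_defect(y_n, phi_n, dt):
    """Time-averaged squared misfit (1/T) * int_0^T |y_n(t) - Phi_n(t)|^2 dt,
    approximated by a Riemann sum over a trajectory sampled every dt."""
    T = dt * len(y_n)
    return dt * np.sum(np.abs(y_n - phi_n) ** 2) / T

def normalized_defect(y_n, phi_n, dt):
    """Defect divided by the time-averaged energy of y_n: equals 1 for the
    flat (Galerkin, Phi_n = 0) parameterization and 0 under perfect slaving."""
    return parameterization_defect(y_n, phi_n, dt) / parameterization_defect(
        y_n, np.zeros_like(y_n), dt
    )
```

With this convention, the two limiting values quoted above (1 for the flat manifold, 0 for exact slaving) follow directly from the definition.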

We conclude this section with a few words on practical considerations. As documented in Ref. 12, Secs. 5 and 6, the amount of training data required to reach a robust estimation of the optimal backward integration time $\tau_n^*$ is often comparable to the time horizon needed to resolve the decay of correlations of the high-mode amplitude $y_n(t)$. For multiscale systems, one thus often needs a training dataset large enough to resolve the slowest decay of temporal correlations among the scales to parameterize. On the other hand, by benefiting from their dynamical origin, i.e., the model’s equations, the parameterization formulas employed in this study (see Sec. III) often allow, in practice, for reaching satisfactory OPM approximations when optimized over training intervals shorter than these characteristic timescales.

When these conditions are met, the minimization of the parameterization defect (4) is performed by a simple gradient-descent algorithm (Ref. 12, Appendix). The first local minimum that one reaches often corresponds to the OPM; see Secs. IV and V and Ref. 12, Secs. 5 and 6. In the rare occasions where the parameterization defect exhibits more than one local minimum and the corresponding local minima are close to each other, criteria involving colinearity between the features to parameterize and the parameterization itself can be designed to further assist the OPM selection. Section V D illustrates this point with the notion of parameterization correlation.
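As a rough surrogate for this procedure, one can scan the defect over a grid of candidate $\tau$ values and stop at the first local minimum, mimicking the “first local minimum reached” selection rule above. This is a hedged Python sketch (a grid scan in place of the actual gradient descent of Ref. 12, Appendix; names are ours):

```python
import numpy as np

def first_local_min(Q, taus):
    """Scan the defect Q over a grid of candidate backward integration times
    and return the first local minimizer encountered, mimicking a descent
    that stops at the first local minimum reached."""
    vals = np.array([Q(t) for t in taus])
    for i in range(1, len(vals) - 1):
        if vals[i] <= vals[i - 1] and vals[i] <= vals[i + 1]:
            return taus[i], vals[i]
    # defect monotonically decreasing over the grid: best value at the edge
    return taus[-1], vals[-1]
```

A monotonically decreasing defect over the scanned grid (the `return` at the end) is consistent with the $\tau^* = \infty$ case encountered in Sec. IV B.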

## III. VARIATIONAL PARAMETERIZATIONS: EXPLICIT FORMULAS

### A. Homological equation and backward–forward systems: Time-dependent forcing

^{7}It provides the leading-order approximation of the underlying invariant manifold function if a suitable spectral gap condition is satisfied and solves the homological equation

Another key element was pointed out in Ref. 29, Chap. 4, in the quest for new insights about invariant manifolds in general, and more specifically about the approximation of stochastic invariant manifolds of stochastic PDEs,^{36} along with the rigorous low-dimensional approximation of their dynamics.^{37} It consists of the method of backward–forward (BF) systems, thereafter revealed in Ref. 12 as a key tool to produce parameterizations (based on the model’s equations) that are relevant beyond the domain of applicability of invariant manifold theory, i.e., away from the onset of instability.

To gain interpretability of such parameterizations in the context of forced-dissipative systems such as Eq. (1), we clarify the fundamental equation satisfied by our parameterizations; the goal here is to characterize the analog of (7) in this non-autonomous context. To simplify, we restrict to the case $\Pi_c F = 0$. The next lemma provides the sought equation.

The proof of this lemma is found in Appendix A.

As will become apparent below, the $\tau$-dependence of the terms in (I) is meant to control the small denominators that arise in the presence of a small spectral gap between the spectra of $A_c$ and $A_s$, which typically leads to over-parameterization when standard invariant/inertial manifold theory is applied in such situations; see Remark III.1 and Ref. 12, Sec. 6.

The terms in (II) are responsible for the presence of exogenous memory terms in the solution to the homological equation Eq. (16), i.e., in the parameterization $\Phi (X,t)$; see Eq. (31).

^{38}and $\Re\sigma(A_s) < 0$, Eq. (16) reduces, in the (formal) limit $\tau \to \infty$, to

^{31,39}see also Ref. 40. In the case where $\Pi_s F$ is $T$-periodic, one can seek an approximate solution in a Fourier expansion form (Ref. 41, Sec. 5.2), leading to useful insights. For instance, solutions to (18), when $G$ is quadratic, exist if the following non-resonance condition is satisfied

Thus, in view of Lemma III.1 and what precedes, the small-scale parameterizations (15) obtained by solving the BF system (11) over finite time intervals can be conceptualized as perturbed solutions to the homological equation Eq. (18) arising in the computation of invariant manifolds and normal forms of non-autonomous dynamical systems (Ref. 31, Sec. 2.2). The perturbative terms brought by the $\tau$-dependence play an essential role in covering situations beyond the domain of application of normal form and invariant manifold theories. As explained in Sec. III C and illustrated in Secs. IV and V, these terms can be optimized to ensure skillful parameterizations for predicting, by reduced systems, higher-order transitions escaping the domain of validity of these theories.

### B. Homological equation and backward–forward systems: Constant forcing

Assume that $F$ is constant in time and that $A$ is diagonal in its eigenbasis $\{e_j \in \mathbb{C}^N \,|\, j = 1, \ldots, N\}$. Assume furthermore that $\Re(\lambda_{m_c+1}) < \Re(\lambda_{m_c})$ and $\Re\sigma(A_s) < 0$, i.e., $H_s$ contains only stable eigenmodes.

Conditions similar to (24) arise in the smooth linearization of dynamical systems near an equilibrium.^{42} Here, condition (24) implies that the eigenvalues of the stable part satisfy a Sternberg condition of order $k$^{42} with respect to the eigenvalues associated with the modes spanning the reduced state space $ H c$.

This theorem is essentially a consequence of Ref. 12, Theorems 1 and 2, in which condition (24) is a stronger version of that used for Ref. 12, Theorem 2; see also Ref. 12, Remark 1 (iv). This condition is necessary and sufficient here for $\int_{-\infty}^{0} e^{-sA_s}\,\Pi_s G_k(p(s))\,\mathrm{d}s$ to be well defined. The convergence of $\int_{-\infty}^{0} e^{-sA_s}\,\Pi_s F\,\mathrm{d}s$ is straightforward since $\Re\sigma(A_s) < 0$ by assumption and $F$ is constant. The derivation of (25) follows the same lines as that of Ref. 12, Eq. (4.6).
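For the constant-forcing integral, the convergence claim can be made explicit by a one-line computation (our own elementary check, using only $\Re\sigma(A_s) < 0$ so that $e^{-sA_s} \to 0$ as $s \to -\infty$):

```latex
\int_{-\infty}^{0} e^{-sA_s}\,\Pi_s F\,\mathrm{d}s
  \;=\; \Big[\, -A_s^{-1}\,e^{-sA_s}\,\Pi_s F \,\Big]_{s=-\infty}^{s=0}
  \;=\; -A_s^{-1}\,\Pi_s F,
```

since $\tfrac{\mathrm{d}}{\mathrm{d}s}\,e^{-sA_s} = -A_s\,e^{-sA_s}$ and the boundary term at $s = -\infty$ vanishes. The integral thus recovers the stationary balance $-A_s^{-1}\Pi_s F$ of the stable modes.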

One can also generalize Theorem III.1 to the case where $F$ is time-dependent, provided that $F$ satisfies suitable conditions ensuring that $J_F$ given by (22) is well defined and that the non-resonance condition (24) is suitably augmented. We leave the precise statement of such a generalization for future work. For the applications considered in later sections, the forcing term $F$ is either constant or has sublinear growth. In such cases, $J_F$ is always well defined under the assumptions of Theorem III.1. We now turn to the explicit formulas of the parameterizations based on the BF system (11).

### C. Explicit formulas for variational parameterizations

We provide in this section closed-form formulas of parameterizations for forced-dissipative systems such as Eq. (1). We consider first the case where $F$ is constant in time and then deal with the time-dependent forcing case.

#### 1. The constant forcing case

In the case where the nonlinearity $G(y)$ is quadratic, denoted by $B(y,y)$, such formulas are easily accessible since (27) is integrable. It is actually integrable in the presence of higher-order terms as well, but we provide the details here for the quadratic case, leaving the extension to the interested reader.

**(OPM balances small spectral gaps)**

Theorem III.1 teaches us that when $\delta_{ij}^n > 0$, not only does $D_{ij}^n(\tau, \lambda)$ defined in (29) converge toward a well-defined quantity as $\tau_n \to \infty$, but so do the coefficients involved in $R_n(F, \lambda, \tau, X)$ [see (B1)–(B3)], in the case of existence of an invariant manifold. For parameter regimes where the latter fails to exist, some of the $\delta_{ij}^n$ or the $\lambda_j - \lambda_n$ involved in (B2) and (B3) can become small, leading to the well-known small spectral gap issue^{25} manifested typically by over-parameterization of the $e_n$-mode amplitude when the Lyapunov–Perron parameterization (22) is employed; see Ref. 12, Sec. 6. The presence of the $\tau_n$ through the exponential terms in (29), (B2), and (B3) allows for balancing these small spectral gaps after optimization and notably improves the parameterization and closure skills; see Sec. V and Appendix D.
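To convey the balancing mechanism, consider a coefficient of the generic exponential form below; the exact coefficients are those of (29) and (B1)–(B3), and this sketch only assumes the structure named in the Remark:

```latex
D_{ij}^{n}(\tau_n,\lambda) \;=\; \frac{1 - e^{-\delta_{ij}^{n}\,\tau_n}}{\delta_{ij}^{n}},
\qquad
\lim_{\tau_n \to \infty} D_{ij}^{n}(\tau_n,\lambda) \;=\; \frac{1}{\delta_{ij}^{n}},
\qquad
\lim_{\delta_{ij}^{n} \to 0^{+}} D_{ij}^{n}(\tau_n,\lambda) \;=\; \tau_n .
```

The Lyapunov–Perron limit $\tau_n \to \infty$ produces the small denominator $1/\delta_{ij}^n$, which blows up as the spectral gap closes; at finite $\tau_n$, the coefficient stays bounded by $\tau_n$, a quantity that the defect minimization can then tune.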

^{44}in the context of non-normal modes initialization

^{45–47}are also tied to leading-order approximations of invariant manifolds since $J(X)$ given by (6) satisfies

^{48,49}but in the presence of small spectral gaps, such an operation may also become less successful. When such formulas arising in AIM theory are inserted within a proper data-informed variational approach, as done in Ref. 12, Sec. 4.4, their optimized versions can also handle regimes in which the spectral gap becomes small, as demonstrated in Ref. 12, Sec. 6.

#### 2. The time-dependent forcing case

This formula for $\Phi_n$ gives the solution to the homological equation Eq. (16) of Lemma III.1 in which $\Pi_s$ therein is replaced by $\Pi_n$, the projector onto the mode $e_n$ whose amplitude is parameterized by $\Phi_n$. Clearly, the terms in (I) are produced by the time-dependent forcing. They are functionals of the past $f_n$ and thus convey *exogenous* memory effects.

This latter computational aspect is important not only for simulation purposes, when the corresponding OPM reduced system is run online, but also for training purposes, in the search for the optimal $\tau$ during the offline minimization stage of the parameterization defect.

If time-dependent forcing terms are present in the reduced state space $H_c$, then the BF system (27) can still be solved analytically, albeit leading to more involved integral terms than those in (I) in the corresponding parameterization. This aspect will be detailed elsewhere.

#### 3. OPM reduced systems

We emphasize finally that, from a data-driven perspective, the OPM reduced system benefits from its dynamical origin. By construction, only a scalar parameter $\tau$ is optimized per mode to parameterize. This parameter furthermore carries a dynamical interpretation since it plays a key role in balancing the small spectral gaps known to be the main issue in applications of invariant or inertial manifold theory in practice;^{25} see Remark III.1.

## IV. PREDICTING TIPPING POINTS

### A. The Stommel–Cessi model and tipping points

A simple model for oceanic circulation exhibiting bistability is Stommel’s box model,^{50} in which the ocean is represented by two boxes: a low-latitude box with temperature $T_1$ and salinity $S_1$, and a high-latitude box with temperature $T_2$ and salinity $S_2$; see Ref. 51, Fig. 1. The Stommel model can be viewed as the simplest “thermodynamic” version of the Atlantic Meridional Overturning Circulation (AMOC) (Ref. 52, Chap. 6), a major ocean current system transporting warm surface waters toward the northern Atlantic that constitutes an important tipping element of the climate system; see Refs. 53 and 54.

Cessi in Ref. 51 proposed a variation of this model, based on the box model of Ref. 55, consisting of an ODE system describing the evolution of the differences $\Delta T = T_1 - T_2$ and $\Delta S = S_1 - S_2$; see Ref. 51, Eq. (2.3). The Cessi model replaces the absolute-value functions involved in the original Stommel model with polynomial relations more amenable to analysis. The dynamics of $\Delta T$ and $\Delta S$ are coupled via the density difference $\Delta\rho$, approximated by the relation $\Delta\rho = \alpha_S \Delta S - \alpha_T \Delta T$, which induces an exchange of mass $Q$ between the boxes given by $Q = 1/\tau_d + (q/V)\Delta\rho^2$ according to Cessi’s formulation, where $q$ denotes the Poiseuille transport coefficient, $V$ is the volume of a box, and $\tau_d$ is the diffusion timescale. The coefficient $\alpha_S$ is inversely proportional to the practical salinity unit, i.e., a unit based on the conductivity properties of sea water, while $\alpha_T$ is in $^\circ\mathrm{C}^{-1}$; see Ref. 51. In this simple model, $\Delta T$ relaxes at a rate $\tau_r$ to a reference temperature $\theta$ (with $T_1 = \theta/2$ and $T_2 = -\theta/2$) in the absence of coupling between the boxes.

^{56},

The nonlinear exchange of mass between the boxes is reflected by the nonlinear coupling terms in Eq. (36). These nonlinear terms lead to multiple equilibria in certain parts of the parameter space, in particular when $F$ is varied over a certain range $[F_{c_1}, F_{c_2}]$ while $\mu$ and $\epsilon$ are fixed. One can even prove that Eq. (36) experiences two saddle-node bifurcations,^{57} leading to a typical S-shaped bifurcation diagram; see Fig. 2(a).

S-shaped bifurcation diagrams occur in oceanic models that go well beyond Eq. (36), such as those based on the hydrostatic primitive equations or Boussinesq equations; see, e.g., Refs. 58–61. More generally, S-shaped bifurcation diagrams and more complex multiplicity diagrams are known to occur in a broad class of nonlinear problems^{62–64} that include energy balance climate models,^{65–68} population dynamics models,^{69,70} vegetation pattern models,^{71,72} combustion models,^{73–76} and many other fields.^{77}

The very presence of such S-shaped bifurcation diagrams provides the skeleton for tipping point phenomena to take place when such models are subject to appropriate stochastic disturbances and parameter drift, causing the system to “tip” or move away from one branch of attractors to another; see Refs. 33 and 34. From an observational viewpoint, the study of tipping points has gained considerable attention due to their role in climate change, as a few components of the climate system (e.g., the Amazon forest and the AMOC) have been identified as candidates for experiencing such critical transitions if forced beyond the point of no return.^{53,54,78}

Whatever the context, tipping phenomena are due to a subtle interplay between nonlinearity, slow parameter drift, and fast disturbances. Understanding how these interactions cooperate to produce a tipping phenomenon could help improve the design of early warning signals. While we will not focus on this latter point *per se* in this study, we show below, on the Cessi model, that the OPM framework provides useful insights in that perspective by demonstrating its ability to derive reduced models that accurately predict the crossing of a tipping point; see Sec. IV C.

### B. OPM results for a fixed *F* value: Noise-induced transitions

We report in this section on the skills of the OPM framework in deriving accurate reduced models that reproduce the noise-induced transitions experienced by the Cessi model (36) when subject to fast disturbances for a fixed value of $F$. The training of the OPM operated here serves as a basis for the harder tipping point prediction problem dealt with in Sec. IV C, where $F$ is allowed to drift slowly through the critical value $F_{c_2}$ at which the lower branch of steady states experiences a saddle-node bifurcation manifested by a turning point; see Fig. 2(a) again. Recall that $F_{c_1}$ denotes here the $F$-value corresponding to the turning point experienced by the upper branch.

**Reformulation of the Cessi model (36).** The 1D OPM reduced equation is obtained as follows. First, we fix an arbitrary value of $F$ in $[F_{c_1}, F_{c_2}]$, denoted by $F_{\mathrm{ref}}$ and marked by the green vertical dashed line in Fig. 2(a). The system (36) is then rewritten for the fluctuation variables $y'(t) = y(t) - \bar{y}$ and $z'(t) = z(t) - \bar{z}$, where $\bar{X} = (\bar{y}, \bar{z})$ denotes the steady state of Eq. (36) on the lower branch when $F = F_{\mathrm{ref}}$.

The eigenvalues of $A$ have negative real parts since $\bar{X}$ is locally stable. We assume that the matrix $A$ has two distinct real eigenvalues, which turns out to be the case for a broad range of parameter regimes. As in Sec. III, the spectral elements of the matrix $A$ (respectively, $A^*$) are denoted by $(\lambda_j, e_j)$ [respectively, $(\lambda_j^*, e_j^*)$], for $j = 1, 2$. These eigenmodes are normalized to satisfy $\langle e_j, e_j^* \rangle = 1$ and are bi-orthogonal otherwise.
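As an illustration, the bi-orthonormalization $\langle e_j, e_j^* \rangle = 1$ can be set up numerically as follows. This is a generic NumPy sketch (the function name is ours and the matrix is a placeholder, not the linearized Cessi operator), assuming, as in the text, distinct real eigenvalues:

```python
import numpy as np

def biorthogonal_eigenbasis(A):
    """Eigenmodes of A and of its adjoint A*, normalized so that
    <e_j, e_j*> = 1.  Assumes A is real with distinct real eigenvalues,
    as for the linearization considered in the text."""
    lam, E = np.linalg.eig(A)          # columns of E: eigenvectors e_j of A
    lam_star, Es = np.linalg.eig(A.T)  # eigenvectors of the adjoint A*
    # match adjoint eigenvectors to the same eigenvalue ordering
    order = [int(np.argmin(np.abs(lam_star - l))) for l in lam]
    Es = Es[:, order]
    # enforce the bi-orthonormality <e_j, e_j*> = 1
    for j in range(A.shape[0]):
        Es[:, j] = Es[:, j] / (E[:, j] @ Es[:, j])
    return lam, E, Es
```

Bi-orthogonality across distinct eigenvalues ($\langle e_i, e_j^* \rangle = 0$ for $i \neq j$) then holds automatically, so `Es.T @ E` is the identity up to round-off.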

**Derivation of the OPM reduced equation.** We can now use the formulas of Sec. III C to obtain an explicit variational parameterization of the most stable direction, carrying the variable $\delta_2'$ here, in terms of the least stable one, carrying $\delta_1'$.

For Eq. (41), both of the forcing terms appearing in the $p$- and $q$-equations of the corresponding BF system (27) are stochastic (and thus time-dependent). To simplify, we omit the stochastic forcing in Eq. (27a) and work with the OPM formula given by (31) which, as shown below, is sufficient for deriving an efficient reduced system.

**Numerical results.** The OPM reduced Eq. (45) is able to reproduce the dynamics of $\delta_1'$ for a wide range of parameter regimes. We show in Fig. 2 a typical example of skills for parameter values of $\mu$, $\epsilon$, $\sigma$, and $F = F_{\mathrm{ref}}$ as listed in Table I. Since $F_{\mathrm{ref}}$ lies in $[F_{c_1}, F_{c_2}]$ (see Table I), the Cessi model (36) admits three steady states, among which two are locally stable (lower and upper branches) and the other is unstable (middle branch); see Fig. 2(a) again. For this choice of $F_{\mathrm{ref}}$, the steady state corresponding to the lower branch is $\bar{X} = (\bar{y}, \bar{z})$ with $\bar{y} = 0.4130$, $\bar{z} = 0.8285$, and the eigenvalues of $A$ are $\lambda_1 = -0.5168$ and $\lambda_2 = -15.7650$.

| $\epsilon$ | $\sigma$ | $\mu$ | $F_{c_1}$ | $F_{c_2}$ | $F_{\mathrm{ref}}$ |
|---|---|---|---|---|---|
| 0.1 | $\epsilon$ | 6.2 | 0.8513 | 0.8821 | 0.855 |

The offline trajectory $\delta'(t)$ used as input for training $\Phi_2$ to find the optimal $\tau$ is taken as driven by an arbitrary Brownian path from Eq. (41), for $t$ in the time interval $[20, 80]$. The resulting offline skills of the OPM, $\Phi_2(\tau^*, \delta_1'(t), t)$, are shown as the blue curve in Fig. 2(b), while the original targeted time series $\delta_2'(t)$ to parameterize is shown in black. The optimal $\tau$ that minimizes the normalized parameterization defect $Q$ turns out to be $\infty$ for the considered regime, as shown in Fig. 2(c). One observes that the OPM captures, on average, the fluctuations of $\delta_2'(t)$; compare the blue curve with the black curve.

The skills of the corresponding OPM reduced Eq. (45) are shown in Fig. 2(d), after converting back to the original $(y,z)$-variables using (46). The results are shown out of sample, i.e., for another noise path and over a time interval different from the one used for training. The 1D OPM reduced Eq. (45) is able to capture the multiple noise-induced transitions occurring across the two equilibria (marked by the cyan lines); compare the red and black curves in Fig. 2(d). Both the Cessi model (40) and the reduced OPM equation are numerically integrated using the Euler–Maruyama scheme with time step $\delta t = 10^{-3}$.
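For reference, the Euler–Maruyama step used for both integrations takes the generic form below. This is a minimal sketch for a scalar SDE $dY = f(Y,t)\,dt + \sigma\,dW$, in which the drift $f$ is a placeholder, not the actual right-hand side of (40) or (45):

```python
import numpy as np

def euler_maruyama(f, sigma, y0, dt, n_steps, rng=None):
    """Integrate dY = f(Y, t) dt + sigma dW with the Euler-Maruyama scheme:
    Y_{k+1} = Y_k + f(Y_k, t_k) dt + sigma * sqrt(dt) * xi_k,  xi_k ~ N(0, 1)."""
    rng = rng or np.random.default_rng(0)
    y = np.empty(n_steps + 1)
    y[0] = y0
    for k in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()
        y[k + 1] = y[k] + f(y[k], k * dt) * dt + sigma * dW
    return y
```

With $\sigma = 0$ the scheme reduces to forward Euler, which is a convenient sanity check before turning the noise on.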

Note that we chose here the numerical value of $F_{\mathrm{ref}}$ to be closer to $F_{c_1}$ than to $F_{c_2}$ to make the tipping point prediction experiment conducted in Sec. IV C more challenging. There, we indeed train the OPM for $F = F_{\mathrm{ref}}$ while aiming at predicting the tipping phenomenon as $F$ drifts slowly through $F_{c_2}$ (located thus further away) as time evolves.

### C. Predicting the crossing of tipping points

**OPM reduced Eq. (45) in the original coordinates.** For better interpretability, we first rewrite the OPM reduced Eq. (45) in the original coordinates in which the Cessi model (36) is formulated. For that purpose, we exploit the components of the eigenvectors of $A$ given by (38) (for $F = F_{\mathrm{ref}}$), which we write as $e_1 = (e_{11}, e_{12})^T$ and $e_2 = (e_{21}, e_{22})^T$. Then, from (46), the expression of $y_{\mathrm{app}}$ can be rewritten as

**Prediction results.** We thus aim at predicting, with our OPM reduced model, the tipping phenomenon experienced by the full model when $F$ is subject to a slow drift in time via

We now set $F_0 = 0.85$, $\kappa = 2 \times 10^{-4}$, and $\sigma = \epsilon/50$, while $\mu = 6.2$ and $\epsilon = 0.1$ are kept as in Table I. Note that, compared to the results shown in Fig. 2, the noise level is here substantially reduced to focus on the tipping phenomenon occurring in the vicinity of the turning point $F = F_{c_2}$. The goal is thus to compare the behavior of $y(t)$ solving the forced Cessi model (58) with that of the solution $Y(t)$ of the (forced) OPM reduced Eq. (59) as $F(t)$ crosses the critical value $F_{c_2}$.

^{56,79}good reduction skills from the “slow” reduced equation

The striking prediction results of Fig. 3 are shown for one noise realization. We now explore the accuracy in predicting such a tipping phenomenon by the OPM reduced Eq. (59) when the noise realization is varied.

To do so, we estimate the statistical distribution of the $F$-value at which tipping takes place, denoted by $F_{\mathrm{transition}}$, for both the Cessi model (58) and Eq. (59). These distributions are estimated as follows. We denote by $\bar{y}_c$ the $y$-component of the steady state at which the saddle-node bifurcation occurs on the lower branch for $F = F_{c_2}$. As noise is turned on and $F(t)$ evolves slowly through $F_{c_2}$, the solution path $y(t)$ of Eq. (58) increases on average while fluctuating around the $\bar{y}_c$-value (due to small noise) before shooting off to the upper branch as one nears $F_{c_2}$. During this process, there is a time instant, denoted by $t_{\mathrm{transition}}$, such that for all $t > t_{\mathrm{transition}}$, $y(t)$ stays above $\bar{y}_c$. We denote the $F$-value corresponding to $t_{\mathrm{transition}}$ by $F_{\mathrm{transition}}$, according to (57). Whatever the noise realization, the solution to the OPM reduced Eq. (59) experiences the same phenomenon, leading to its own $F_{\mathrm{transition}}$ for each noise realization. As shown by the histograms in Fig. 4, the distribution of $F_{\mathrm{transition}}$ predicted by the OPM reduced Eq. (59) (blue curve) follows closely that computed from the full system (58) (orange bars). Thus, not only is the OPM reduced Eq. (59) qualitatively able to reproduce the passage through a tipping point (as shown in Fig. 3), but it is also able to accurately predict the critical $F$-value (or time instant) at which the tipping phenomenon takes place, with an overall success rate over 99%.
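The detection rule just described can be sketched as follows, assuming the linear ramp $F(t) = F_0 + \kappa t$ with the values of $F_0$ and $\kappa$ given earlier (the function name and the discretized-path convention are ours):

```python
import numpy as np

def f_transition(t, y, y_bar_c, F0=0.85, kappa=2e-4):
    """Last time the path y(t) rises above y_bar_c and stays there,
    mapped to an F-value through the (assumed linear) ramp F(t) = F0 + kappa*t."""
    below = np.where(y <= y_bar_c)[0]
    if len(below) == 0:              # path never dips below the critical level
        t_transition = t[0]
    elif below[-1] == len(y) - 1:
        return None                  # no tipping within the record
    else:
        t_transition = t[below[-1] + 1]
    return F0 + kappa * t_transition
```

Collecting the returned values over an ensemble of noise realizations, for both the full and reduced models, yields histograms of $F_{\mathrm{transition}}$ of the kind compared in Fig. 4.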

## V. PREDICTING HIGHER-ORDER CRITICAL TRANSITIONS

### A. Problem formulation

The Prandtl number is chosen to be $\sigma = 0.5$ in the experiments performed below, which is the same value as used in Ref. 80. The reduced Rayleigh number $r$ is varied according to these experiments; see Table II.

| | $m_c$ | $r_D$ | $r^*$ | $r_P$ |
|---|---|---|---|---|
| Experiment I (period-doubling) | 3 | 13.91 | 13.99 | 14 |
| Experiment II (chaos) | 5 | 14.10 | 14.17 | 14.22 |

Our goal is to assess the ability of our variational parameterization framework to predict higher-order transitions arising in (61), by training the OPM only with data prior to the transition we aim at predicting. The Rayleigh number $r$ is our control parameter. As it increases, Eq. (67) undergoes several critical transitions/bifurcations, leading to chaos via a period-doubling cascade.^{80} We focus on the prediction of two transition scenarios beyond the Hopf bifurcation: (I) the period-doubling bifurcation and (II) the transition from a period-doubling regime to chaos. Noteworthy are the failures that standard invariant manifold theory or AIMs encounter in predicting these transitions; see Appendix D.

### B. Predicting higher-order transitions via OPM: Method

Thus, we aim at predicting for Eq. (67) two types of transition: (I) the period-doubling bifurcation and (II) the transition from period-doubling to chaos, referred to as Experiments I and II, respectively. For each of these experiments, we are given a reduced state space $H_c = \mathrm{span}(e_1, \ldots, e_{m_c})$ with $m_c$ as indicated in Table II, depending on the parameter regime. The eigenmodes are here ranked by the real part of the corresponding eigenvalues, $e_1$ corresponding to the eigenvalue with the largest real part. The goal is to derive an OPM reduced system (34) able to predict such transitions. The challenge lies in the optimization stage of the OPM: due to the prediction constraint, one is prevented from using data from the full model at the parameter value $r = r_P$ at which one desires to predict the dynamics. Only data prior to the critical value $r^*$ at which the concerned transition, either period-doubling or chaos, takes place are allowed here.
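The ranking of eigenmodes defining $H_c$ can be sketched as follows; `leading_modes` is a hypothetical helper for illustration, assuming the linearized operator is available as a matrix:

```python
import numpy as np

def leading_modes(A, m_c):
    """Rank the eigenmodes of A by decreasing real part of the eigenvalues
    and return the m_c leading ones, which span the reduced space H_c."""
    eigvals, eigvecs = np.linalg.eig(A)
    order = np.argsort(-eigvals.real)          # most unstable mode first
    return eigvals[order[:m_c]], eigvecs[:, order[:m_c]]
```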

Due to the dependence on $\bar{C}_r$ of $\lambda$, as well as of the interaction coefficients $B_{k\ell}^j$ and forcing terms $F_j$, particular attention must be paid to this dependence. Indeed, recall that the parameterizations $\Phi_n$ given by the explicit formula (28) depend heavily here on the spectral elements of the linearized operator $A$ at $\bar{C}_r$, and so does their optimization. Since the goal is to eventually have at our disposal an OPM reduced system able to predict the dynamical behavior at $r = r_P > r^*$, one cannot rely on data from the full model at $r = r_P$ and thus cannot exploit, in particular, the knowledge of the mean state $\bar{C}_r$ for $r = r_P$. We are thus forced to estimate the latter for our purpose.

To do so, we estimate the dependence on $r$ of $\bar{C}_r$ on an interval $I_r = [r_0, r_D]$ such that $r_0 < r_D < r^*$ (see Table II), and use this estimation to extrapolate the value of $\bar{C}_r$ at $r = r_P$, which we denote by $\bar{C}_{r_P}^{\mathrm{ext}}$. For both experiments, it turned out that a linear extrapolation is sufficient. This extrapolated value $\bar{C}_{r_P}^{\mathrm{ext}}$ allows us to compute the spectral elements of the operator $A$ given by (65), in which $\bar{C}_{r_P}^{\mathrm{ext}}$ replaces the genuine mean state $\bar{C}_r$. Obviously, the better the approximation of $\bar{C}_r$ by $\bar{C}_{r_P}^{\mathrm{ext}}$, the better the approximation of $\lambda(\bar{C}_{r_P})$ by $\lambda(\bar{C}_{r_P}^{\mathrm{ext}})$, along with the corresponding eigenspaces.
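The linear extrapolation of the mean state can be sketched as below; the sampling of $\bar{C}_r$ over $[r_0, r_D]$ is assumed to be given, and the function name is our own:

```python
import numpy as np

def extrapolate_mean_state(r_samples, C_samples, r_P):
    """Fit each component of the mean state linearly in r over the training
    window [r_0, r_D] and evaluate the fit at r = r_P (extrapolation)."""
    # polyfit handles 2-D data column-wise: one line per state component
    slope, intercept = np.polyfit(r_samples, np.asarray(C_samples), deg=1)
    return slope * r_P + intercept             # extrapolated mean state
```

A higher polynomial degree could be substituted when the parameter dependence is visibly nonlinear over the training window.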

We can then summarize our approach for predicting higher-order transitions via OPM reduced systems as follows:

1. Extrapolate $\bar{C}_r$ at $r = r_P$ (the parameter at which one desires to predict the dynamics), yielding $\bar{C}_{r_P}^{\mathrm{ext}}$.

2. Compute the spectral elements of the linearized operator $A$ at $r = r_P$ by replacing $\bar{C}_r$ with $\bar{C}_{r_P}^{\mathrm{ext}}$ in (65).

3. Train the OPM using the spectral elements of Step 2 and data of the full model for $r = r_D < r_P$.

4. Run the OPM reduced system (70) to predict the dynamics at $r = r_P$.

We mention that the minimization of certain parameterization defects may require special care, such as for $J_6$ in Experiment II. Due to the presence of nearby local minima [see red curve in Fig. 5(e)], the analysis of the optimal value of $\tau$ to select for calibrating an optimal parameterization of $y_6(t)$ is more subtle and actually exploits a complementary metric known as the parameterization correlation; see Sec. V D.

Obviously, the accuracy in approximating the genuine mean state $\bar{C}_{r_P}$ by $\bar{C}_{r_P}^{\mathrm{ext}}$ is a determining factor in the transition prediction procedure described in Steps 1–4 above. Here, the relative error in approximating $\|\bar{C}_{r_P}^i\|$ ($i = 1, 2$) is $0.03\%$ for Experiment I and $0.27\%$ for Experiment II. For the latter case, although the parameter dependence is rough beyond $r_2^*$ (see Panel D), there are no abrupt local variations of relatively large magnitude as identified for other nonlinear systems.^{18,81} Systems for which a linear response to parameter variation is a valid assumption^{82,83} thus seemingly constitute favorable ground for applying the proposed transition prediction procedure.

### C. Prediction of higher-order transitions by OPM reduced systems

As summarized in Figs. 5(a) and 5(b), for each prediction experiment of Table II, the OPM reduced system (70) not only successfully predicts the occurrence of the targeted transition at either $r = r_{P_1}$ or $r = r_{P_2}$ [$P_1$ and $P_2$ in Fig. 5(d)], but also approximates with good accuracy the embedded (global) attractor as well as key statistics of the dynamics such as the power spectral density (PSD). Note that we choose the reduced dimension to be $m_c = 3$ for Experiment I (period-doubling) and $m_c = 5$ for Experiment II (transition to chaos). For Experiment I, the global minimizers of $J_n(\tau)$ given by (69) are retained to build up the corresponding OPM. In Experiment II, all the global minimizers are also retained, except for $J_6(\tau)$, for which the second minimizer is used in the construction of the OPM.

As a baseline, the skills of the OPM reduced system (70) are compared to those of reduced systems employing parameterizations from invariant manifold theory, such as (6), or from inertial manifold theory, such as (30); see Theorem 2 in Ref. 12 and Remark III.2. The details and results are discussed in Appendix D. The main message is that, compared to the OPM reduced system (70), the reduced systems based on these traditional invariant/inertial manifold theories fail in predicting the correct dynamics. A main player in this failure lies in the inability of these standard parameterizations to accurately approximate small-energy variables that are dynamically important; see Appendix D. The OPM parameterization, by its variational nature, enables us to fix this parameterization issue here.

### D. Distinguishing between close local minima: Parameterization correlation analysis

Since the coefficients of $\Phi_n$ depend nonlinearly on $\tau$ [see Eqs. (28), (29), and (B1)], the parameterization defects $J_n(\tau)$ defined in (69) are also highly nonlinear and may fail to be convex in the $\tau$-variable, as shown for certain variables in panels (c) and (e) of Fig. 5. A minimization algorithm that most often reaches the global minimum is nevertheless detailed in Ref. 12, Appendix A, and is not limited to the RB system analyzed here. On certain rare occasions, a local minimum may be an acceptable halting point, with an online performance slightly improved compared to that of the global minimum. In such a case, one discriminates between a local minimum and the global one by inspecting another metric offline: the correlation angle, which essentially measures the collinearity between the actual high-mode dynamics and its parameterization. Here, such a situation occurs for Experiment II; see $J_6$ in Fig. 5(e).
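A brute-force version of the defect minimization can be sketched as follows; the defect here is the normalized least-square misfit in the spirit of (69), with `phi_n` a placeholder for the parameterization (28), and a coarse grid search standing in for the algorithm of Ref. 12, Appendix A:

```python
import numpy as np

def defect(tau, X, y_n, phi_n):
    """Normalized least-square parameterization defect J_n(tau): time-mean
    of |y_n - Phi_n(tau, X)|^2 divided by the time-mean of |y_n|^2."""
    err = y_n - phi_n(tau, X)
    return np.mean(np.abs(err) ** 2) / np.mean(np.abs(y_n) ** 2)

def minimize_defect(tau_grid, X, y_n, phi_n):
    """Coarse grid search over the backward-integration time tau; a local
    refinement around the best grid point could follow."""
    J = np.array([defect(tau, X, y_n, phi_n) for tau in tau_grid])
    return tau_grid[np.argmin(J)], J
```

Inspecting the whole curve $\tau \mapsto J_n(\tau)$, rather than only its argmin, is what reveals the nearby local minima discussed for $J_6$.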

Following Ref. 12, Sec. 3.1.2, we thus recall a simple criterion to assist the selection of an OPM when the parameterization defect displays more than one local minimum and the corresponding local minimal values are close to each other.

This criterion is based on the *parameterization correlation*, which measures the collinearity between the parameterized and the actual unresolved variables.

We illustrate this point on $J_6(\tau)$ shown in Fig. 5(e). The defect $J_6$ exhibits two close minima, $J_6 \approx 0.1535$ and $J_6 \approx 0.1720$, occurring at $\tau \approx 0.65$ and $\tau \approx 1.80$, respectively. Thus, the parameterization defect alone does not provide a satisfactory discriminatory diagnosis between these two minimizers. On the contrary, the parameterization correlation shown in Fig. 6 allows for diagnosing more clearly that the OPM associated with the local minimizer has a clear advantage over that associated with the global minimizer.

As explained in Ref. 12, Sec. 3.1.2, this parameterization correlation criterion is also useful to assist the selection of the dimension of the reduced state space. Indeed, once an OPM has been determined, the dimension $m_c$ of the reduced system should be chosen so that the parameterization correlation is sufficiently close to unity, as measured, for instance, in the $L^2$-norm over the training time window. The basic idea is that one should not only parameterize properly the statistical effects of the neglected variables but also avoid losing their phase relationships with the resolved variables.^{12,84} For instance, for predicting the transition to chaos, we observe that an OPM comes with a parameterization correlation much closer to unity in the case $m_c = 5$ than $m_c = 3$ (not shown).
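The parameterization correlation used above can be sketched as a time-resolved cosine between the parameterized and the actual unresolved states; the array layout and function name are our own conventions:

```python
import numpy as np

def parameterization_correlation(Y_param, Y_true):
    """Cosine of the angle, at each time instant, between the parameterized
    and actual unresolved states (rows = time, columns = unresolved modes)."""
    num = np.sum(np.real(np.conj(Y_param) * Y_true), axis=1)
    den = np.linalg.norm(Y_param, axis=1) * np.linalg.norm(Y_true, axis=1)
    return num / den                           # values in [-1, 1]
```

Its closeness to unity, e.g. in a time-averaged sense, is the criterion invoked for selecting $m_c$.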

## VI. CONCLUDING REMARKS

In this article, we have described a general framework, based on the OPM approach introduced in Ref. 12 for autonomous systems, to derive effective (OPM) reduced systems able to predict either higher-order transitions caused by a parameter regime shift, or tipping phenomena caused by stochastic disturbances and a slow parameter drift. In each case, the OPMs are sought as continuous deformations of classical invariant manifolds in order to handle parameter regimes away from bifurcation onsets while keeping the reduced state space relatively low-dimensional. The underlying OPM parameterizations are derived analytically from the model's equations and constructed as explicit solutions to auxiliary BF systems such as Eq. (11), whose backward integration time is optimized per unresolved mode. This optimization involves the minimization of natural cost functionals tailored upon the full dynamics at parameter values prior to the transitions taking place; see (4). In each case, whether for the prediction of higher-order transitions or of tipping points, we presented compelling evidence of the success of the OPM approach in addressing such extrapolation problems, which are typically difficult to handle for data-driven reduction methods.

As reviewed in Sec. III, the BF systems such as Eq. (11) allow for drawing insightful relationships with the approximation theory of classical invariant/inertial manifolds (see Ref. 12, Sec. 2, and Refs. 31, 40, and 85) when the backward integration time $\tau$ in Eq. (11) is sent to $\infty$; see Theorem III.1. Once the invariant/inertial manifolds fail to provide legitimate parameterizations, the optimization of the backward integration time may lead to local minima of the parameterization defect at finite $\tau$, which, if sufficiently small, give access to skillful parameterizations to predict, e.g., higher-order transitions. This way, as illustrated in Sec. V (Experiment II), the OPM approach allows us to bypass the well-known stumbling blocks tied to the presence of small spectral gaps, such as those encountered in inertial manifold theory,^{25} and tied to the accurate parameterization of dynamically important small-energy variables; see also Remark III.1, Appendix D, and Ref. 12, Sec. 6.

To understand the dynamical mechanisms at play behind a tipping phenomenon and to predict its occurrence are of utmost importance, but this task is hindered by the often high-dimensional nature of the underlying physical system. Devising accurate, analytically reduced models able to predict tipping phenomena in complex systems is thus of prime importance to serve this understanding. The OPM results of Sec. IV indicate great promise for the OPM approach to tackle this important task for more complex systems. Other reduction methods to handle the prediction of tipping point dynamics have been proposed recently in the literature, but mainly for mutualistic networks.^{86–88} The OPM formulas presented here are not limited to this kind of network and can be directly applied to a broad class of spatially extended systems governed by (stochastic) PDEs, or to nonlinear time-delay systems by adopting the framework of Ref. 89 for the latter to devise center manifold parameterizations and their OPM generalization; see Refs. 90 and 91.

The OPM approach could also be informative for designing, for such systems nearing a tipping point, early warning signal (EWS) indicators from multivariate time series. The extension of EWS techniques to multivariate data is an active field of research, with methods ranging from empirical orthogonal function reduction^{92} to methods exploiting the system's Jacobian matrix and its relationships with the cross-correlation matrix,^{93} or exploiting the detection of the spatial location of “hotspots” of stability loss.^{94,95} By their nonlinear *modus operandi* for reduction, OPM parameterizations identify a subgroup of essential variables characterizing a tipping phenomenon, which, in turn, could be very useful to construct the relevant observables of the system for the design of useful EWS indicators.

Thus, since the examples of Secs. IV and V are representative of more general problems of prime importance, the successes demonstrated by the OPM approach on these examples invite further investigations for more complex systems.

In that perspective, we would like to point out another important aspect. For spatially extended systems, the modes typically involve wavenumbers that can help interpret certain primary physical patterns. Formulas such as (28) and (31) arising in OPM parameterizations involve a rebalancing of the interactions $B_{ij}^n$ among such modes by means of the coefficients $D_{ij}^n(\tau, \lambda)$; see Remark III.1. Thus, an OPM parameterization, when skillful in deriving efficient reduced systems, may provide useful insights into explaining emergent patterns due to certain nonlinear interactions of wavenumbers for regimes beyond the instability onset. For more complex systems than those dealt with here, it is known already that, near the instability onset of primary bifurcations, center manifold theory provides such physical insights; see, e.g., Refs. 96–102.

By the approach proposed here, relying on OPMs obtained as continuous deformations of invariant/center manifolds, one can thus track the interactions that survive or emerge between certain wavenumbers as one progresses through higher-order transitions. Such insights bring new elements to potentially explain the low-frequency variability of recurrent large-scale patterns typically observed, e.g., in oceanic models,^{103,104} offering at least new lines of thought for the dynamical systems approach proposed in previous studies.^{104–107} In that perspective, the analytic expression (B1) of terms such as $R_n(F, \lambda, \tau, X)$ in (28) also brings new elements to break down the nonlinear interactions between the forcing of certain modes compared to others. Formulas such as (B1) extend to the case of time-dependent or stochastic forcing of the coarse-scale modes, whose exact expressions will be communicated elsewhere. As for the case of noise-induced transitions reported in Fig. 2(d), these generalized formulas are expected to provide, in particular, new insights into regime shifts not involving the crossing of a bifurcation point but tied to other mechanisms, such as slow–fast cyclic transitions or stochastic resonances.^{108}

Yet another important practical aspect that deserves in-depth investigation is the robustness of the OPM approach subject to model errors. Such errors can arise from multiple sources, including, e.g., imperfections of the originally utilized high-fidelity model in capturing the true physics, and noise contamination of the solution data used to train the OPM. For the examples of Secs. IV and V, since we train the OPM in a parameter regime prior to the occurrence of the concerned transitions, the utilized training data contain, *de facto*, model errors. The results reported in these sections show that the OPM is robust in providing effective reduced models subject to such model errors. Nevertheless, it would provide useful insights if one could systematically quantify the uncertainties of OPM reduced models subject to various types of model errors. In that respect, several approaches could be beneficial for stochastic systems, such as those based on the information-theoretic framework of Ref. 109 or the perturbation theory of ergodic Markov chains and linear response theory,^{110} as well as methods based on data-assimilation techniques.^{111}

## ACKNOWLEDGMENTS

We are grateful to the reviewers for their constructive comments, which helped enrich the discussion and presentation. This work has been partially supported by the Office of Naval Research (ONR) Multidisciplinary University Research Initiative (MURI) Grant (No. N00014-20-1-2023), the National Science Foundation Grant (No. DMS-2108856), and the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 810370). We also acknowledge the computational resources provided by Advanced Research Computing at Virginia Tech for the results shown in Fig. 4.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Mickaël D. Chekroun:** Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Writing – original draft (equal). **Honghu Liu:** Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Writing – original draft (equal). **James C. McWilliams:** Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (supporting); Methodology (supporting); Writing – original draft (supporting).

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.

### APPENDIX A: PROOF OF LEMMA III.1

### APPENDIX B: THE LOW-ORDER TERM IN THE OPM PARAMETERIZATION

For the application to the Rayleigh–Bénard system considered in Sec. V, the forcing term $F$ is produced after rewriting the original unforced system in terms of the fluctuation variable with respect to the mean state. For this problem, the eigenvalues $\lambda$ and the interaction coefficients $B_{ij}^n$ both depend on the mean state.

### APPENDIX C: NUMERICAL ASPECTS

In the numerical experiments of Sec. V, the full RB system (61) as well as the OPM reduced system (70) are numerically integrated using a standard fourth-order Runge–Kutta (RK4) method with a time-step size $\delta t = 5\times 10^{-3}$. Note that since the eigenvalues of $A$ are typically complex-valued, some care is needed when integrating (70). Indeed, since complex eigenmodes of $A$ must appear in complex conjugate pairs, the corresponding components of $x$ in (70) also form complex conjugate pairs. To prevent round-off errors that may disrupt the underlying complex conjugacy, after each RK4 time step, we enforce complex conjugacy as follows. Assuming that $x_j$ and $x_{j+1}$ form a conjugate pair and that, after an RK4 step, $x_j = a_1 + i b_1$ and $x_{j+1} = a_2 - i b_2$ ($i^2 = -1$), where $a_1$, $a_2$, $b_1$, and $b_2$ are real-valued with $a_1 \approx a_2$ and $b_1 \approx b_2$, we redefine $x_j$ and $x_{j+1}$ to be, respectively, $x_j = (a_1 + a_2)/2 + i(b_1 + b_2)/2$ and $x_{j+1} = (a_1 + a_2)/2 - i(b_1 + b_2)/2$. For each component corresponding to a real eigenmode, after each RK4 time step, we simply redefine it to be its real part and ignore the small imaginary part that may also arise from round-off errors. The same numerical procedure is adopted to integrate the reduced systems obtained from invariant manifold reduction as well as the FMT parameterization used in Appendix D.
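The conjugacy-enforcement step described above amounts to the following post-processing of the state vector after each RK4 step; the index bookkeeping (`conj_pairs`, `real_modes`) is our own illustration:

```python
import numpy as np

def enforce_conjugacy(x, conj_pairs, real_modes):
    """Restore exact complex conjugacy after an RK4 step: average each
    conjugate pair (j, k) and drop spurious imaginary parts on real modes."""
    x = np.asarray(x, dtype=complex).copy()
    for j, k in conj_pairs:                    # x[k] should equal conj(x[j])
        avg = 0.5 * (x[j] + np.conj(x[k]))     # (a1+a2)/2 + i(b1+b2)/2
        x[j], x[k] = avg, np.conj(avg)
    for j in real_modes:
        x[j] = x[j].real                       # discard round-off imag part
    return x
```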

### APPENDIX D: FAILURE FROM STANDARD MANIFOLD PARAMETERIZATIONS

In this section, we briefly report on the reduction skills achieved for the chaotic case of Sec. V C through the application of invariant manifold theory or standard formulas used in inertial manifold theory. Invariant manifold theory is usually applied near a steady state. To work within a favorable ground for these theories, we thus chose the steady state $\bar{Y}_r$ that is closest to the mean state $\bar{C}_r$ at the parameter value $r$ at which we want to approximate the chaotic dynamics via a reduced system. This way, the chaotic dynamics is located near this steady state.


The predicted orbits obtained from reduced systems built either on the IM parameterization (D3) or the FMT one (D5) are shown in the top row of Fig. 7 as blue and magenta curves, respectively. Both reduced systems lead to periodic dynamics and thus fail dramatically in predicting the chaotic dynamics. The small spectral gap issue mentioned in Remark III.1 plays a role in explaining this failure, but it is not the only culprit. Another fundamental issue for the closure of chaotic dynamics lies in the difficulty of providing accurate parameterizations of small-energy but dynamically important variables; see, e.g., Ref. 12, Sec. 6. This issue is encountered here, as some of the variables to parameterize for Experiment II contain only from $0.23\%$ to $2.1\%$ of the total energy.

Replacing $\bar{Y}_r$ by the genuine mean state $\bar{C}_r$ does not fix this issue, as some of the variables to parameterize still contain a small fraction of the total energy, from $0.36\%$ to $1.5\%$. The IM parameterization (D3) and the FMT one (D5), whose coefficients are now determined from the spectral elements of $\tilde{A}$ in which $\bar{C}_r$ replaces $\bar{Y}_r$ in (D1), still fail in accurately parameterizing such small-energy variables; see the bottom row of Fig. 7.^{112} By comparison, the OPM succeeds in accurately parameterizing these small-energy variables. For Experiment I, inaccurate predictions are also observed from the reduced systems built either on the IM parameterization (D3) or the FMT one (D5) (not shown).

## REFERENCES

*Lecture Notes in Mathematics*(Springer-Verlag, Berlin, 1981), Vol. 840.

*Topics in Geophysical Fluid Dynamics: Atmospheric Dynamics, Dynamo Theory and Climate Dynamics*

*Mathematical Surveys and Monographs*(American Mathematical Society, Providence, RI, 1988), Vol. 25, pp. x+198.

*Applied Mathematical Sciences*(Springer-Verlag, New York, 2002), Vol. 143, pp. xiv+670.

*World Scientific Series on Nonlinear Science. Series A: Monographs and Treatises*(World Scientific Publishing, Hackensack, NJ, 2005), Vol. 53.

*Mathematical Surveys and Monographs*(American Mathematical Society, Providence, RI, 2011), Vol. 176.

*Infinite-Dimensional Dynamical Systems in Mechanics and Physics*

*Stochastic Climate Theory*

*Integral Manifolds and Inertial Manifolds for Dissipative Partial Differential Equations*

*Stochastic Parameterizing Manifolds and Non-Markovian Reduced Equations: Stochastic Manifolds for Nonlinear SPDEs II*

*The Parameterization Method for Invariant Manifolds:From Rigorous Results to Effective Computations*

*Approximation of Stochastic Invariant Manifolds: Stochastic Manifolds for Nonlinear SPDEs I*

*Geometrical Methods in the Theory of Ordinary Differential Equations*

*Bifurcation and Chaos in Complex Systems*

*Patterns and Dynamics in Reactive Media*, edited by R. Aris, D. G. Aronson, and H. L. Swinney (Springer, 1991), pp. 11–31.

*Nonlinear Climate Dynamics*

*Natural Climate Variability on Decade-to-Century Time Scales*(National Academy Press, 1995), pp. 355–364.

*Applied Mathematical Sciences*(Springer-Verlag, New York, 2004), Vol. 112, pp. xxii+631.

*Bifurcation Theory: An Introduction with Applications to Partial Differential Equations*

*Climatic Variations and Variability: Facts and Theories*(Springer, 1981), pp. 461–480.

*The Mathematics of models for Climatology and Environment*(Springer, 1997), pp. 253–287.

*et al.*,

*Mathematical Problems from Combustion Theory*

*Diffusion and Heat Exchange in Chemical Kinetics*