Statistical response theory provides an effective tool for the analysis and statistical prediction of high-dimensional complex turbulent systems involving a large number of unresolved unstable modes, for example, in climate change science. Recently, the linear and nonlinear response theories have shown promising developments in overcoming the curse-of-dimensionality in uncertain quantification and statistical control of turbulent systems by identifying the most sensitive response directions. We offer an extensive illustration of using the statistical response theory for a wide variety of challenging problems under a hierarchy of prototype models ranging from simple solvable equations to anisotropic geophysical turbulence. Directly applying the linear response operator for statistical responses is shown to only have limited skill for small perturbation ranges. For stronger nonlinearity and perturbations, a nonlinear reduced-order statistical model reduction strategy guaranteeing model fidelity and sensitivity provides a systematic framework to recover the multiscale variability in leading order statistics. The linear response operator is applied in the training phase for the optimal nonlinear model responses requiring only the unperturbed equilibrium statistics. The statistical response theory is further applied to the statistical control of inherently high-dimensional systems. The statistical response in the mean offers an efficient way to recover the control forcing from the statistical energy equation without the need to run the expensive model. Among all the testing examples, the statistical response strategy displays uniform robust skill in various dynamical regimes with distinct statistical features. Further applications of the statistical response theory include the prediction of extreme events and intermittency in turbulent passive transport and a rigorous saturation bound governing the total statistical growth from initial and external uncertainties.

We discuss several useful perspectives of the linear and nonlinear response theories and their applications in various problems and challenges in high-dimensional turbulent dynamical systems, including statistical model reduction, uncertainty quantification (UQ), and control of complex systems. Especially, the representative properties of the strategies are illustrated by a hierarchy of simplified prototype models ranging from exactly solvable equations to geophysical turbulence, where the dominant statistical features in the models are separated and analyzed extensively. The statistical response theory displays a promising role in overcoming the curse-of-dimensionality by identifying the most sensitive response directions. Uniform robust skill for the statistical response strategies is displayed in the testing examples in various dynamical regimes and showing potential for further development in more realistic turbulent dynamical systems.

## I. INTRODUCTION

Complex turbulent dynamical systems appear in a wide variety of areas in science and technology. Typical examples can be found in interacting physical, biological, and social processes across a huge range of time and spatial scales of overwhelming complexity.^{1–5} Such turbulent systems are usually characterized by an irreducible high-dimensional phase space and a large number of instabilities that move energy across the scales. Instabilities due to the positive Lyapunov exponents as well as the external forcing excitations lead to rapid growth of small perturbations in the high dimensional space.^{1,6,7} A statistical description with model reduction strategies for the state variables becomes necessary. One example of practical significance^{8–11} is the prediction of climate change from external forcing perturbations such as the impact from increased carbon dioxide. Mitigation of such effects from the climate change also requires effective control strategies^{12,13} which form another set of challenging problems.

The linear response theory together with the fluctuation dissipation theorem (FDT) offers a promising tool for the statistical quantification and prediction of turbulent systems.^{14,15} It predicts statistical averages of the functionals of interest (the “response”) requiring only the unperturbed equilibrium statistical information (the “climate”). The potential validity of the linear response theory has been widely investigated in both theories^{16–19} and numerical algorithms^{20–23} through a variety of test models and techniques. In the study of coarse-grained large-scale variability in climate change, the FDT using linear response operators provides a direct leading-order approximation of the statistical responses in a unified fashion valid for multiple climate change scenarios with different forcing, dissipation, and other parameters without the need to repeatedly run the complex climate model with various perturbations.^{15,24}

Despite many success in the linear response theory, this method is significantly hampered by the fundamental constraint within small perturbation amplitudes. It is shown that the linear statistical predictions are usually accurate for the mean responses with near-Gaussian statistics, while the performance for predicting the variance responses can quickly deteriorate in some systems as the forcing perturbation as well as the non-Gaussianity grows large.^{19,25,26} To overcome the limited range of skill in the linear response prediction, new statistical nonlinear response theories and methods have been proposed and become one main theme in the contemporary research.^{1,21,27–29} A systematic methodology has been proposed^{7,26} for capturing the principal statistical nonlinear responses to changes in forcing using statistical reduced-order models. A precise strategy to calibrate the imperfect model error and improve the prediction skill in response to anisotropic perturbations (such as the climate change) is developed relying on the information-theoretic framework^{14,30,31} so that the overwhelming computational demands in Monte-Carlo simulations in a high-dimensional space are avoided.

In this paper, we review the linear and nonlinear statistical response theories with a focus on the variety of important and effective applications in uncertainty quantification (UQ) and control of turbulent complex systems. A hierarchy of dynamical models with increasing complexity (see in Sec. II B) is proposed to illustrate the representative features of the statistical response theory and the related information-theoretic framework. The key problems to be discussed include the following interrelated topics which have crucial applications to many realistic systems from nature.

### A. Approximating linear response operators with Gaussian and non-Gaussian equilibrium statistics

The practical computation of the linear response operator for FDT requires proper characterization for the usually inaccessible equilibrium statistical distribution of the unperturbed system. A first approximation is through the quasi-Gaussian closure (qG-FDT) proposed first by Leith^{15} and has been generalized and improved for both mean and variance responses with many successful applications.^{24,30,32} To compute the linear response operators from a strongly non-Gaussian equilibrium state, a kicked response of the unperturbed system is proposed,^{14,26} where the higher moment feedbacks are better incorporated in an efficient way. The extent of skills in the linear response theory is illustrated under the explicit solvable model^{24} and the two-layer Lorenz ’96 system^{33} (in Sec. III B).

### B. Improving the prediction skill of nonlinear responses through the information-theoretic framework

The nonlinear response theory aims to obtain the statistical estimates of changes in mean and variance for key quantities in the nonlinear responses to external forcing or uncertain initial state. Reduced-order statistical models are constructed for the most sensitive subspace. The linear response operator offers a precise systematic framework to improve imperfect model sensitivity through measuring the information error in the training phase using only unperturbed statistics. Empirical information theory^{14,31} offers an unbiased invariant metric for calibrating the model error in the response operators and systematically improves model sensitivity in a unified way with model fidelity.^{34,35} Besides, a statistical energy principle^{36} is adopted to improve the imperfect model prediction skill by offering the total energy profile without increasing the computational cost too much.

The basic ideas in the construction of the reduced-order statistical model are first illustrated using the triad system with quadratic energy transfer (in Sec. III C). The improved skill in the nonlinear prediction is discussed in detail using the prototype topographic barotropic model describing the interactive motion of large-scale mean flow and small-scale vortical modes^{37} (in Sec. IV). The model displays a rapid change of directions in the large scale jets sensitive to external perturbations. It is shown that the principal statistics in first and second moments including the shifting jet directions can be captured with the desirable accuracy and efficiency using a reduced-order model with only three resolved modes. Besides, a rigorous saturation bound^{38} is derived for quantifying the total changes in the mean fluctuation and variance due to initial and external perturbations.

### C. Controlling inhomogeneous responses using statistical functional responses in high dimension

The control of turbulent systems concerns the optimal course of actions to drive the states of interest to a desired final condition with a minimum cost. Solving the control problem in highly turbulent systems directly is always challenging due to the large instabilities and strong nonlinear exchange of energy among different scales.^{13,39} Statistical response theory provides an alternative approach for controlling the statistical scalar energy,^{40,41} where the instability due to the symmetric nonlinear interactions can be circumvented. Simple autocorrelation functions can be approximated in an explicit form to represent the linear mean response so that the specific control forcing is determined offline without the need to run the complex high-dimensional system to get the responses. The topographic barotropic flow is used again to demonstrate the control skill in geophysical turbulence with non-Gaussian statistics and instabilities (in Sec. V).

### D. Predicting extreme events and tracer intermittency in passive scalar turbulence

Extreme events represented by non-Gaussian fat-tails in passive tracer distributions are a universal feature in observation, experiments, and numerical simulations.^{42–44} As another effective application of the response theory, the capability of using stochastic reduced-order models to capture crucial tracer statistical features is achieved using more tractable linear Gaussian stochastic models with optimization for its linear response functionals.^{45,46} The same systematic model reduction procedure for the statistical response is adopted. The passive scalar algorithm is tested using a two-layer baroclinic turbulent flow, which can generate various representative atmosphere and ocean regimes with jets and vortices (in Sec. VI).

In the structure of this paper, a general overview of the linear and nonlinear statistical theories is given in Sec. II in company with the general model formulation. Using simple prototype models, the accuracy of various statistical response functionals under the several proposed approximation methods is compared in Sec. III. The linear and nonlinear response theories are applied to a more practical geophysical turbulence model in Sec. IV to show their practical skills. Further important applications of the statistical response theory for effective statistical control of turbulent system and prediction of extreme events in passive tracer turbulence are discussed in Secs. V and VI separately. A summary is given in Sec. VII discussing the accuracy and extent of applications of the linear and nonlinear response theories.

## II. STATISTICAL RESPONSE THEORY AND ALGORITHMS FOR COMPUTING THE LINEAR RESPONSE OPERATORS

In this section, we review the statistical formulation for the general complex turbulent systems, and display the main conclusions from the linear response theory. Efficient algorithms to compute the response operators are also described.

### A. A review of the general complex turbulent systems

One representative feature in complex turbulent dynamical systems from nature is the nonlinear energy conserving interaction that transports energy from the unstable subspace to stable one where the energy is strongly dissipated, resulting in a final statistical steady state in equilibrium. The general abstract formulation of the turbulent dynamical systems is proposed^{1,7} for the state variable of interest $u\u2208RN$ in a high-dimensional phase space,

On the right hand side of Eq. (1), the first two components, $(L+D)u$, represent linear dispersion and dissipation effects, where $L\u2217=\u2212L$ is an energy-conserving skew-symmetric operator and $D<0$ is a negative definite operator. The nonlinear effect in the dynamical system is introduced through a quadratic form, $B(u,u)$, that satisfies the conservation law, so that

Besides, the system is subject to external forcing effects that are decomposed into a deterministic component, $F(t)$, and a stochastic component represented by a Gaussian random process, $\sigma (t)W\u02d9(t;\omega )$. It needs to be emphasized that, in many situations, $F$ might be inhomogeneous and introduce anisotropic structures into the system, and $\sigma (t)W\u02d9(t;\omega )$ might further alter the energy structure in the fluctuation modes.

The internal instability and the stochastic random forcing in the turbulent system (1) require a statistical description for the state variable $u$. Assuming that the stochastic white noise is written in the Ito sense, the probability density function $p(u,t)$ can be found directly from the solution of the associated Fokker-Planck equation (FPE),

The unperturbed climate is associated with the equilibrium density function $peq(u)$ that satisfies the invariant equation $LFPp=0$. Usually, we are interested in the statistics in some functional $A(u)$ defined by the expectation according to the density function,

where $\u27e8A\u27e9p$ is the general expectation, while $\u27e8A\u27e9eq$ gives the unperturbed equilibrium statistics. It is generally a challenging task for directly solving the FPE system (3), especially when the dimensionality of the system becomes large. However, there are important turbulent dynamical systems where the FPE can be solved even with large dimension and highly non-Gaussian statistics.^{47,48}

#### 1. A statistical formulation of general turbulent systems

Due to the turbulent features of the general systems (1) and the high computational cost to solve the Fokker-Planck equation (3), it is often useful to derive a dynamical formulation for the evolution of key statistical moments. To achieve this, we rewrite the stochastic field $u$ in a finite-dimensional representation consisting of a fixed-in-time, $N$-dimensional, orthonormal basis ${ei}i=1N$

where $u\xaf(t)=\u27e8u(t)\u27e9$ represents the statistical expectation of the model state mean response (we use the angled bracket $\u27e8\u22c5\u27e9$ to represent the ensemble average), i.e., the mean field; and ${Zi(t;\omega )}$ are stochastic coefficients (with $\omega $ to denote a random process) measuring the fluctuation processes along each model direction $ei$. In most of the situations, we are interested in the time evolution of the mean state $u\xaf$ and the covariance $R$ characterizing the variability among different modes.

By taking the statistical (ensemble) average over the original Eq. (1) and using the mean-fluctuation decomposition (4), the evolution equations of the mean state $u\xaf$ and the covariance matrix $R$ are given by the following dynamical equations:

with $R=\u27e8ZZ\u2217\u27e9$ being the second-order covariance matrix of the stochastic coefficients $Z=[Z1,\u2026,ZN]T$. The detailed derivation of the above equations follows that standard procedure and can be found in Refs. 7 and 27. The explicit forms of the operators in the covariance equation are summarized as follows organized from different order statistics:

- the quasilinear operator $Lu(u\xaf)$ expresses the linear coupling of non-normal dynamics (effect due to $L$) and energy dissipation (effect due to $D$), as well as the energy transfers between the mean field and the stochastic modes (effect due to $B$)(6a)${Lu}ij=[(L+D)ej+B(u\xaf,ej)+B(ej,u\xaf)]\u22c5ei,$
- the positive definite operator $Q\sigma $ expresses energy injection from the external stochastic forcing(6b)${Q\sigma}ij=\u2211k(ei\u22c5\sigma k)(\sigma k\u22c5ej),$
- the nonlinear energy flux $QF$ expresses nonlinear energy exchanges between different fluctuation modes due to non-Gaussian statistics (or nonlinear terms) modeled through third-order moments(6c)${QF}ij=\u2211m,n\u27e8ZmZnZj\u27e9B(em,en)\u22c5ei+\u27e8ZmZnZi\u27e9B(em,en)\u22c5ej.$

We stress that the energy conservation property of the quadratic operator $B$ from (2) is inherited in the statistical equations by the matrix $QF$ since the energy conservation property for the fluctuation state $u\u2032$ yields

The above exact statistical equations for the state of the mean (5a) and covariance matrix (5b) will be the starting point for the developments in the reduced-order models on UQ methods.

Note that the above procedure by adding higher and higher order moment equations can never construct a closed system. The statistical dynamics for the mean (5a) and covariance (5b) are not closed due to the inclusion of third-order moments through the nonlinear interactions in $QF$ in (6c). The basic idea in developing reduced-order schemes concerns proper approximation of this energy flux term $QF$ in a simple and efficient manner so that the energy mechanism can be properly modeled for both the transient state and the final equilibrium.^{1,7}

#### 2. A statistical energy identity for turbulent systems

Despite the fact that the exact equations for the statistical mean (5a) and the covariance fluctuations (5b) are not a closed system, there exists a suitable statistical symmetry so that the total energy in the mean plus the total variance satisfies an energy conservation principle even with the general deterministic and random forcing. Here, we briefly describe the result in the statistical theory developed in Refs. 1 and 36 about a statistical energy functional for the abstract system (1). As we will show in the later sections, this total statistical energy offers a general description about the total responses in the perturbed system and will play a crucial role in the construction of various nonlinear response and control theories.

For characterizing the total statistical variability in the turbulent system, we consider the statistical mean energy, $E\xaf=12|u\xaf|2=12u\xaf\u22c5u\xaf$, for the total energy in the mean; and the statistical fluctuation energy, $E\u2032=12\u27e8u\u2032\u22c5u\u2032\u27e9=12trR$, for the total variability in the system. Furthermore, the following symmetries are assumed^{36} involving the nonlinear interaction operator $B$ under the orthonormal basis ${ei}$:

- The self interactions vanish in the quadratic interaction,(7a)$B(ei,ei)\u22610,1\u2264i\u2264N,$
- The dyad interaction coefficients vanish through the symmetry,(7b)$ei\u22c5[B(ej,ei)+B(ei,ej)]=0foranyi,j.$

The above detailed triad symmetry guarantees that the nonlinear interaction $B(u,u)$ will not alter the total statistical energy structure in the system [though the state of the mean and covariance may both changed due to the nonlinear terms in Eqs. (5a) and (5b)]. It can be found satisfied among many systems including the examples shown next. Therefore, we have the following theorem for general turbulent systems:^{1,36}

*Statistical Energy Conservation Principle*$)$ Under the structural assumptions (7a)$,$ (7b) on the basis $ei,$ for any turbulent dynamical systems (1)$,$ the total statistical energy, $\u27e8E\u27e9=E\xaf+E\u2032=12u\xaf\u22c5u\xaf+12trR,$ satisfies the scalar dynamical equation,

Especially under homogeneous structures, $D=\u2212dI$ and $F=fI$, the right hand side of the statistical energy equation (8) becomes a linear damping on the total energy, $\u2212d\u27e8E\u27e9$, plus the deterministic forcing applying on the homogeneous mean state, $u\xaff$, and the stochastic forcing contribution. This implies that the total energy structure (and thus the second-order variance in all the modes) can be determined purely from the first-order mean state by solving a simple scalar equation.

### B. Prototype models for complex turbulent dynamical systems

Many complex turbulent dynamical systems can be categorized into the previous abstract mathematical structure (1) satisfying the structural property (2). A series of typical prototype models can be found for the investigation of complex turbulent dynamical systems. These qualitative and quantitative models with increasing complexity form a desirable hierarchical set of testing models for various applications in prediction with model error, uncertainty quantification, and state estimation.^{1,7,49}

*The triad system with explicit solution*. The triad model^{1,50}offers an elementary building block of complex turbulent systems with exact solvable solutions. It is a 3-dimensional ODE system with inhomogeneous damping and both deterministic and stochastic forcing terms,The strategies for constructing reduced-order models are built on this triad model without scale separation $\u03f5=1$ in Ref. 14. Another interesting set of problems is described in Ref. 1 for testing the multiscale dynamics with $\u03f5\u226a1$ and no nonlinearity $B1=B2=B3=0$. The triad nonlinear stochastic model also offers an unambiguous testbed for the statistical responses in a time-periodic setting.(9a)$du1dt=\u2212d1u1\u2212L3u2+L2u3+B1u2u3+F1+\sigma 1W\u02d91,du2dt=L3u1\u2212d2u2\u2212L1\u03f5u3+B2u3u1+F2+\sigma 2W\u02d92,du3dt=\u2212L2u1+L1\u03f5u2\u2212d3u3+B3u1u2+F3+\sigma 3W3\u02d9.$^{32}*The one and two-layer Lorenz ’96 model*. The original Lorenz ’96 (L-96) model is a 40-dimensional ODE system introduced first by Lorenz,^{51}^{,}It is defined with periodic boundary conditions mimicking geophysical waves in the midlatitude atmosphere. Various representative statistical features can be generated by changing the external forcing values in $F$.(9b)$dujdt=(uj+1\u2212uj\u22122)uj\u22121\u2212duj+Fj,j=1,\u2026,J=40.$^{1,14,24}In addition, a two-layer generalization^{24,33}is introduced by coupling the original L-96 model (9b) with a second level dynamics,where $\u03f5>0$ is the time scale separation parameter and $\lambda >0$ is the nonlinear coupling parameter. The subindexes $j=1,\u2026,J$ and $k=1,\u2026,K$ are for the first and second level slow and fast processes. The two-layer L-96 model is used as a simple model for testing multiscale flow turbulence simulating the typical features of geophysical flows.(9c)$dujdt=(uj+1\u2212uj\u22122)uj\u22121\u2212duj+Fj\u2212\lambda \u2211kYj,k,dvj,kdt=1\u03f5vj,k+1(vj,k\u22121\u2212vj,k+2)\u2212dj,kvj,k+\lambda uj,$^{33,50}*The one-layer barotropic model with topography*. The one-layer barotropic system^{1,10,37}is a basic geophysical model for the atmosphere or ocean with the essential geophysical effects of rotation, topography, and both deterministic and random forcing. It is defined based on the potential vorticity $q=\Delta \psi +h+\beta y$ with $\omega =\Delta \psi $ being the relative vorticity, $h$ being the bottom topography, and $\beta $ being the rotational effect,This is one of the simplest models for geophysical turbulent, while still maintaining the skill to generate many interesting realistic features, for example, the change of strength and direction in the large-scale zonal jets.(9d)$\u2202q\u2202t+\u2207\u22a5\psi \u22c5\u2207q+U\u2202q\u2202x=\u2212D(\Delta )\psi +F(x,t)+\Sigma (x)W\u02d9(t),dUdt+\u222b\u2202h\u2202x\psi (t)=\u2212D0U+F0(t)+\Sigma 0W\u02d90(t).$^{37,38}*The two-layer quasigeostrophic model with baroclinic instability*. The two-layer quasigeostrophic model with baroclinic instability^{3,8,52}is one fully nonlinear fluid model defined in a two-dimensional periodic domainIt is capable in capturing the essential physics of the relevant internal variability including internal baroclinic instability. This model is useful for the investigation of climate change problems, such as the change of jet direction, the responses in the zonal mean jet strength, and the poleward heat transport in the atmosphere and ocean regimes.(9e)$\u2202q\psi \u2202t+J(\psi ,q\psi )+J(\tau ,q\tau )+\beta \u2202\psi \u2202x+U\u2202\u2202x\Delta \tau =\u2212\kappa 2\Delta (\psi \u2212\tau )+F\psi (x,t),\u2202q\tau \u2202t+J(\psi ,q\tau )+J(\tau ,q\psi )+\beta \u2202\tau \u2202x+U\u2202\u2202x(\Delta \psi +kd2\psi )=\kappa 2\Delta (\psi \u2212\tau )+F\tau (x,t).$^{52}

We will use the above models next to illustrate the various features of linear and nonlinear statistical response theory and its applications to various problems.

^{48,50}provides a rich group of models in understanding and predicting complex multiscale turbulent dynamics. The conditional Gaussian models take the general form of

^{48}Modified versions of the previous examples can be embedded into this conditional Gaussian structure (10) with suitably added noises.

### C. Statistical linear response theory for turbulent systems

One important problem in the complex turbulence system (1) concerns about the accurate calibration of the model sensitivity to perturbations. Detecting the optimal response in the high-dimensional phase space is crucial for the climate change science.^{9,21,22} Direct solving the statistical equations (5) according to various climate change perturbations becomes expensive and mostly impractical. Statistical linear response theory^{7,14,24,53} provides an attractive alternative by expressing the statistical quantities subjected to small perturbations in terms of statistics of an unperturbed state. Here, we give a brief description for the theories and strategies in predicting model statistical responses.

#### 1. Linear response theory for general perturbations

The statistical response theory and fluctuation-dissipation theorem (FDT) offer a convenient way to compute leading-order statistical approximation about model responses to various external perturbations.^{14,23,35,54} The general unperturbed system (1), $\delta F=0$, gives an invariant equilibrium measure $peq(u)$. The external forcing perturbation is written in separation with temporal and spatial variables,

where $w(u)$ defines the spatial dependence and $f(t)$ defines the temporal variability of the forcing perturbations. The resulting perturbed probability density function $p\delta $ due to the general forcing perturbation $\delta F$ then can be asymptotically expanded according to the equilibrium and the correction to perturbation as

The statistical expectation of any functional $A(u)$ to the perturbation is formulated accordingly under the equilibrium measure and leading-order correction, $\u27e8A(u)\u27e9=\u27e8A\u27e9eq+\delta \u27e8A\u27e9(t)+O(\delta 2)$, using the measure asymptotic decomposition (11) with $\u27e8A\u27e9eq=\u222bA(u)peq(u)$ being the expectation of $A$ according to equilibrium distribution $peq$, and $\delta \u27e8A\u27e9=\u222bA(u)\delta p\u2032(u)$ according to the perturbation correction $\delta p\u2032$. FDT gives the leading order prediction for the statistical response functional computed from the convolution with the linear response operator,

The derivation of the above formula is concluded from separating the leading order dynamics of the Fokker-Planck equation for the perturbed probability function.^{14} The pointed-bracket above denotes the statistical average under the solution $p$ from the Fokker-Planck equation. $RA(t)$ is called the linear response operator corresponding to the functional $A$, which is calculated through correlation functions in the unperturbed statistical equilibrium only,

The noise term $\sigma W\u02d9$ is used to generate smooth equilibrium measure $peq$ for computing the derivatives in the linear response operator $RA$. The FDT formula for statistical responses (12) is shown to have high skill for the mean response and some skill for the variance response for a wide variety of turbulent dynamical systems even with nonlinearity.^{20,22,24,30,55,56} In addition, rigorous theories^{16,17} have been proposed for the validity of the linear response theory.

Still, the linear response operator $RA$ is difficult to calculate by directly using (13) for general systems considering the complicated and unaccessible equilibrium distribution $peq$. A variety of Gaussian approximations for $peq$ and improved algorithms have been developed for computing the linear response operators.^{14,15,17,30} In many situations, it is found that a Gaussian PDF from the Gibbs invariant measure can offer a quite accurate characterization of the unperturbed distribution of the system (see explicit examples from Secs. III and IV). In this way, the ‘g‘quasi-Gaussian” (qG) closure, $peq\u223cpeqG$, provides a desirable approximation of the equilibrium measure. Then, the linear response operator (13) can be computed directly from the autocorrection functions. The qG FDT has shown effective skill in predicting the statistical responses especially in the mean state but less feasible for the higher-order moment predictions (see examples in Refs. 24 and 30 and in Sec. III).

#### 2. Computing response operators with non-Gaussian statistics through kicked initial states

The difficulty in calculating the linear response operator using (13) is the expensive computational cost for general systems with non-Gaussian features in the equilibrium distribution $peq$ in a high-dimensional phase space. One strategy with higher accuracy (than using the quasi-Gaussian closure) to approximate the linear response operator avoiding direct evaluation of $peq$ but still include important non-Gaussian statistics is through the kicked response by converting an unperturbed system to a perturbation $\delta u$ of the initial state from the equilibrium measure. The idea is to set the initial distribution with the same equilibrium variance but a kick in the mean state so that

One important advantage of adopting this kicked response strategy is that higher-order statistics due to nonlinear dynamics will not be ignored (compared with the other linearized strategy using only Gaussian statistics^{24}). The validity of the kicked response strategy is guaranteed by the following proposition^{14,57} for calculating the linear response operator:

From the formula (15), the response operator for the mean and variance can be achieved from the perturbation part of the probability density $\delta p\u2032$. The proof of the above proposition is a direct application of Duhamel’s principle to the corresponding Fokker-Planck equation with the forcing perturbation.^{14} The variability in the external forcing is transferred to the perturbation in initial values. More importantly, the kicked response formulation (15) with initial mean state perturbation (14) is independent of the specific perturbation forms. Thus, the operator $RA$ describes the inherent dynamical mechanisms of the system.

##### 1. Practical numerical strategies to compute the kicked response operators

The statistical kicked response strategy offers a practical method to compute the linear response operators especially for the unaccessible non-Gaussian equilibrium distributions. From the formula in (15), the response operators for the mean and variance can be achieved from the perturbation part of the probability density $\delta p\u2032$. In practice, $\delta p\u2032$ can be either estimated from a direct Monte-Carlo simulation or from a statistical model solution. Below, we describe the numerical procedure to get this distribution function $\delta p\u2032$ for general turbulent systems. See Refs. 7 and 14 for a similar version of this algorithm.

*Kicked response from direct model simulations*: A direct way to get the kicked PDF response $\delta p\u2032$ is through a Monte-Carlo simulation of (1) using a large enough ensemble size to capture the leading statistical responses. The initial ensemble for the model simulation is sampled from the equilibrium distribution. For the kicked response to the mean, a constant small perturbation of the equilibrium mean state (for example, $\delta u=0.1u\xafeq$) is added to each initial ensemble members, while the initial variance of the ensemble is kept unchanged. The response distribution $\delta p\u2032$ then is achieved by monitoring the decay of the ensemble mean back to equilibrium under unperturbed dynamics and uniformly perturbed initial value.*Kicked response from statistical model*s: Another way to get the kicked response statistics is through the statistical model equations (5) to get the responses for the mean and variance. The initial mean state is taken from the equilibrium distribution with a perturbation (for example, $\delta u=0.1u\xafeq$). The initial value for the variance is taken the same as the equilibrium state value and kept unperturbed. Then, starting from this initial mean and variance, the statistical model offers the kicked response solution to monitor the decay of the mean and variance back to equilibrium.

The first strategy above for computing the kicked response requires a one-time direct Monte-Carlo simulation. Still, this is only required for one time computation in advance in the training phase, and the optimized model can be efficiently applied to various perturbations in general. In contrast, the qG closure using the Gaussian approximation of the equilibrium measure could be more efficient in computational cost. However, the kicked response gives a better quantification of the higher order statistics of the system since the nonlinear dynamics is better characterized. Furthermore, with the information metric, the errors in the response operator are calibrated in a precise way for long time prediction rather than the usual $L1$ error distance only valid for short time responses. Therefore, the kicked response has more feasible prediction for systems with stronger nonlinearity and generates better results for higher order moments.

## III. STATISTICALLY ACCURATE AND INACCURATE RESPONSE FUNCTIONALS AND REDUCED-ORDER MODELS FOR NONLINEAR RESPONSES

In this section, we display the skills in the statistical linear and nonlinear response theories under the framework of prototype models. With these simplified models, the basic properties of the statistical response functionals with accurate and inaccurate prediction skills can be characterized in a precise way. More realistic applications for the two-dimensional barotropic turbulence are then carried out next in Sec. IV.

Low-frequency stochastic models offer an attractive first testing framework for understanding the basic properties in the statistical response theory. A one-dimensional exactly solvable model with additive and multiplicative noises

has been first used to show the typical features in FDT.^{24,58} This model provides the simplest tool in the investigation of low-frequency variability in the long time atmospheric or climate sensitivity studies. Especially, explicit unperturbed distribution solution can be found for the model (16) by directly solving the associated Fokker-Planck equation. The explicit formulation of the stochastic model enables us to calculate the response operators based on different response functionals $A(u)$ using Gaussian (and conditional Gaussian) statistics. Another desirable property of the stochastic low-frequency model (16) is that it generates a wide range of different statistical regimes from uni-modal near-Gaussian PDFs to skewed non-Gaussian distributions by changing the parameter values. This is also similar to the features in the general atmospheric circulation models.

Next, we compare the accuracy in the Gaussian and quasi-Gaussian response theory with kicked initial state using nonlinear prototype models without explicit solvable expressions.

### A. Empirical information theory for measuring imperfect model errors

In most practical cases, the exact equilibrium distribution $peq$ is unavailable due to the complex nonlinear coupling between scales. Therefore, imperfect approximation and calibration of the model invariant distributions with error is the first problem to consider. The information-theoretic framework^{14,59} provides an unbiased metric to quantify the imperfect model sensitivity by building the least biased probability measure consistent with the true system. Information theory is often used in statistical science for imperfect model selection.^{60} A natural way to measure the lack of information in one probability density with approximation, $pM$, compared with the true probability density, $p$, is through the relative entropy (or information distance),^{14,31} given by

The relative entropy guarantees the following two crucial features to serve as a metric: (i) $P(p,pM)\u22650$, and the equality holds if and only if $p=pM$; (ii) it is invariant under any invertible change of variables. Despite the lack of symmetry in its arguments (that is, $P(p1,p2)\u2260P(p2,p1)$ in general), the relative entropy, $P(p,pM)$ provides an attractive framework for assessing model error and sensitivity like a probabilistic metric.

Usually, the first entry $p$ is used to denote the probability distribution of the perfect model, which is unknown. Nevertheless, we can construct an estimate distribution $pL$ of the perfect model using first $L$ measurements of the true system. Consider the imperfect model prediction with its associated probability density $pLM$, the definition of relative entropy (17) facilitates the practical calculation,^{1,35,57,61,62}

The entropy difference $P(p,pL)$ precisely measures an intrinsic error from $L$ measurements of the perfect system, and this offers a simple example of an information barrier for any imperfect model based on $L$ measurements for calibration. The most practical setup for utilizing the framework of empirical information theory arises when only the Gaussian statistics (that is, the first two moments) of the distributions are computed. With the measurements $L=2$ representing the first two moments, a Gaussian approximation can be used to estimate the information error $P(pL,pLM)$ considering only the first $L=2$ statistical measurements. By only comparing the first two moments of the density functions, we get the following fact:^{14}

The first term on the right hand side of (18) is called the “signal,” reflecting the model error in the mean and weighted by the inverse of the model variance $RM$, whereas the second term is the “dispersion,” including only the model error covariance ratio $RRM\u22121$, measuring the differences in the covariance matrices.

For measuring the model sensitivity in the linear response statistics (11), we have $p\delta =peq+\delta p$ for the true model and $p\delta M=peqM+\delta pM$ for the imperfect model approximation. In the case with equilibrium consistency, that is, the imperfect model gives the exact unperturbed invariant measure $P(peq,peqM)=0$, the relative entropy in (18) between perturbed model density $p\delta M$ and the true perturbed density $p\delta $ with small perturbation $\delta $ can be expanded in the leading order terms as

Above, the first part $P(pL,\delta ,p\delta )$ is the intrinsic error from only first $L$ measurements of the system. The second quadratic term is called the Fisher information metric.^{17,50,63} Under Gaussian statistical closure in the two distributions $p=pG$ and $pM=pGM$, the Fisher information can be expanded componentwise in the perturbation of the mean and variance as

Here, $rk$ is the equilibrium variance in the $k$th component, and $\delta u\xafk$ and $\delta rk$ are the linear response operators for the mean and variance in the $k$th component. The information theory plays an important role in quantifying model error, model sensitivity, and information barrier. Next, we illustrate the calibration of model error in the response theories using explicit examples with multiscale structures.

### B. Linear response functionals using the two-layer Lorenz ’96 model

We first use the inviscid two-layer L-96 model introduced in (9c) to display the validity of linear response theory for systems with multiscale structures,

The two-layer L-96 model is characterized by two sets of slow-fast processes $(uj,vj,k)$. The variable $uj$ with $1\u2264j\u2264J$ represents the slow varying climate, while $vj,k$ with $1\u2264k\u2264K$ according to each slow process $uj$ gives the set of fast varying weather with the total size $JK$. The parameter $\lambda $ defines the coupling coefficient between the two scales $uj$ and $vj,k$, $\u03f5>0$ defines the time scale separation between the two processes. In this test case, we assume zero external forcing $Fj\u22610$ in the unperturbed equilibrium. More discussion about detailed properties and physical interpretations of the two-layer L96 can be found in Refs. 33 and. 48. See Ref. 24 for other interesting phenomena in FDT with this model.

#### 1. Explicit linear response operators from the Gibbs invariant measure

The total energy $E(u,v)$ combining both slow and fast variables of the two-layer L-96 model is conserved for the inviscid flow without any damping and forcing effects. This implies the classical Gibbs invariant measure with zero mean and equipartition of energy at each grid defined as the climate,

where $\sigma eq2$ gives the homogeneous variance at each grid.^{10,14} Therefore, the linear response operator $RA$ in (13) can be computed directly using the explicit formulation of $peq$ from the Gibbs invariant measure. To test the accuracy in the linear responses with perturbations in large and small scale variables, we consider homogeneous perturbations added to the large-scale state $Fj=\delta F$. This forcing perturbation can be viewed at the climate change effect introduced to the slow-varying climate. Still both large and small scale variables will be affected by this large scale climate changing forcing. As a direct conclusion using the statistical energy principle in (8), the total statistical energy in the system $\u27e8E\u27e9=\u222bE(u,v)p(u,v)dudv$ about the perturbed probability distribution $p$ can be found only dependent on the forced response mean state

This implies that the total statistical energy of the system is determined only by the large-scale homogeneous mean state $u\xafj\u2261u\xaf.$ Note that the statistical energy $\u27e8uj2\u27e9,\u27e8vj,k2\u27e9$ contain the variabilities in both energy in the mean, $u\xaf2$, and the variance, $u\u20322\xaf$. The total variance in the inviscid system will keep increasing due to the constant forcing effect $\delta F\u22c5u\xaf$ with a final saturated mean state $u\xaf$.

Next, we compute the detailed statistical responses for the mean and variance according to the homogeneous large scale perturbation $Fj=\delta F$. By directly applying the formula (13), the corresponding operators $Gu$ using the quasi-Gaussian (qG) Gibbs measure (21) can be computed explicitly as

Again, due to the homogeneous statistics in the unperturbed equilibrium measure of the model, the cross-covariances between different grids vanish in the final state, $\u27e8uis(t)uj(0)\u27e9eq=0$ for $i\u2260j$. The explicit formulas for the qG response operators for the mean and variance of the large and small scale homogeneous variables can be computed as

Here, $Ru$ is the response operator for the homogeneous large-scale mean $\u27e8uj\u27e9\u2261u\xaf$, and $Ru2$ is the response operator for the homogeneous large-scale variance $\u27e8(uj\u2212u\xaf)2\u27e9\u2261r$. A similar case applies for the small scale variables $vj,k$. The linear response operators are computed under the statistical average about the equilibrium measure. From (23) using the Gaussian equilibrium distribution (21), the response function from the large scale perturbation is the autocorrelation functions, while the response function for the small scales comes from the lagged cross-correlation between the small-large scale states at some grid, $uj,vj,k$.

Finally, using the response operators (23), the statistical responses in mean and variance subject to the constant in time large scale perturbation $\delta F$ can be computed directly from the convolution formula (12) as an integration of the linear response operators,

Here, $Ru\xaf(t)$ and $Rr(t)$ are the response operators corresponding to the forcing perturbation. This gives the FDT prediction of the mean and variance responses in both large and small scales. Due to the homogeneous statistics in the equilibrium measure as well as the forcing perturbation forms, the statistical response stays uniform at each grid point. Finally, the accuracy of the statistical predictions can be measured under the information metric constraint on the first two moments using the formula (20).

#### 2. Numerical tests for the accurate and inaccurate linear response predictions

In the numerical tests, we take the model dimensions $J=8$ and $K=8$ (that is, with 8 variables in the slow state $u$ and in total 64 variable in the fast state $v$). The value of the coupling parameter is taken as $\lambda =1$ and the time scale separation is set as $\u03f5=0.1$. Zero external forcing $F\u22610$ is assumed for the unperturbed equilibrium state of the system. Similar reference parameters are adopted in Ref. 24. Figure 1 offers the direct simulation result for the unperturbed two-layer L-96 model PDFs. Both the slow and fast variables $(u,v)$ agree with the Gaussian distribution predicted in the Gibbs invariant measure with unit equilibrium variance $\sigma eq=1$. The time series show a more rapid time scale in the fast variable $v$ than the slow variable $u$.

Then, we compare the accuracy using the quasi-Gaussian (qG) FDT and kicked response FDT in this two-layer L-96 system, and the errors are measured in the information metrics. Particularly, we test the skills in the prediction of large and small scale responses in their mean and variance with only homogeneous large scale perturbation $Fj=\delta F$. First, Fig. 2 displays the linear response operators for the mean and variance responses in large scale $u$ and small scale $v$. The results from the two strategies using the kicked response and the qG closure are compared. For the response in the mean state, the two methods offers similar approximation for the linear response operator. This implies comparable skill in predicting mean responses using both approximations. For the response in the variance, however, the kicked response generates non-zero response in the long time limit, while the qG closure just captures the transient changes in variance and decays to zero at the long time limit from the autocorrelation functions. This difference shows the less accuracy in the qG approximation for predicting second order moments. Clearly from the total statistical energy equation (22), the variance should keep increasing in the inviscid model with a non-zero mean generated.

At last, we show the FDT prediction skill using the linear response operators for the statistical response in mean and variance. Figure 3 plots the statistical response predictions in mean and variance using both kicked FDT and qG FDT approximations. Two different homogeneous constant forcing perturbation amplitudes $\delta F=1$ and $\delta F=0.5$ are also compared for testing the accuracy with different nonlinearities. The accuracy of the predictions is calibrated using the relative entropy metric (20) as a balanced combination of the error in the mean (signal error) and the error in the variance (dispersion error). First in the prediction for the mean responses, both kicked response and qG closure do a good job especially in capturing the starting transient state responses. The prediction for the large scale $u$ is also easier with higher accuracy than that of the small scale $v$. With a large perturbation amplitude $\delta F=1$, stronger nonlinear effect gradually takes over. The statistical response theory fails to capture the long time drop in the mean, especially for the small scale variables $v$. In signal, error shows the high skills in both methods.

Second in the variance response, the inviscid model with constant forcing leads to a persistent growth in the variance. The qG FDT fails in this case to predict the long time variance response and cannot capture the persistent growth. The kicked FDT offers a better prediction for the increasing trend in the variance in both large and small scales. Still, the response theory lacks the skill to get the exact rate of increasing with accuracy in the second moments. The dispersion error offers a more precise calibration of the errors in the two methods. It can be seen that the qG closure gives a better estimation in the short time transient state for the variance response in both small and large scales. At the long time limit, the kicked response becomes more skilled at getting the growing trend in the variance, while the qG closure simply goes to zero leading to larger errors. In general, the large scale variability is easier to capture than the rapidly changing small scale responses.

### C. Strategies in predicting nonlinear responses using reduced-order statistical models

The previous example with the multiscale L-96 system displays explicitly the extent of skill of the linear response theory to predict the statistical responses. Using FDT with the linear response operators, the short-time transient changes with small amplitude perturbations can be captured with the desirable accuracy in the first-order moment. On the other hand, when it comes to regimes with strong nonlinearity and higher order moments (including the variance), the linear response theory considering only the leading order expansion becomes less sufficient in obtaining the statistical variability with accuracy.

Therefore, more detailed calibration for the turbulent system becomes necessary when we want to capture the nonlinear responses in a wider range of perturbations. Here, we present the basic ideas in the statistically accurate reduced-order models as a more effective strategy in predicting the nonlinear responses.^{1,7} Before preceding to the applications in a more realistic high-dimensional turbulent system in Sec. IV, the basic ideas in developing the reduced-order statistical model framework are offered using the triad model (9a) as a special case of the general formulation (1) by letting

#### 1. Ideas in calibrating and predicting the nonlinear statistical responses

In general, the reduced-order statistical modeling strategy is carried out in three stages: (i) imperfect model selection according to the complexity of the problem; (ii) model calibration in a training phase using equilibrium data; and (iii) model prediction with the optimized model parameters for various responses to external perturbations. Compared with the linear response theory, the nonlinear dynamics are modeled explicitly in the reduced-order models. Specifically in the reduced-order models to be developed here, we will consider additional damping and random noise corrections for the unresolved higher-order statistics. The equilibrium invariant measure and ergodic theory^{27,64} can help determine the existence of a smooth invariant measure for the reduced-order model. The linear response operators from the unperturbed equilibrium measure in turn help improve the nonlinear response prediction through the model calibration.

Generally, the statistical model for the leading two moments can be derived from the full statistical model (5) with a large scale projection,

where $u\xafM=PMu\xaf$ is the model approximated mean and $RM$ is the reduced order covariance matrix about the fluctuation state variable $u\u2032$. The most expensive but crucial part comes from the nonlinear flux term $QF$ where important third-order moments are included representing the nonlinear interactions between different modes. A hierarchical statistical model reductions are constructed with a judicious estimation about this nonlinear interaction term,

through a series of constructions with increasing complexity.^{7,26,37,52} The negative-definite $QF\u2212=\u2212DM,k(R)RM\u2212RMDM,k\u2217(R)$ represents the additional damping effect to stabilize the unstable modes with positive Lyapunov coefficients, while $QF+=\Sigma M,k(R)$ is the positive-definite additional noise to compensate for the overdamped modes. In the model parameters, $(DM,\Sigma M)$ replaces the original nonlinear unstable and stable effects from the original dynamics. The approximation error in the imperfect model is calibrated through the linear response theory. Two crucial components in the construction of (25) are: first, equilibrium statistical fidelity using the unperturbed equilibrium statistics $(QF,eq+,QF,eq\u2212)$, which can be computed easily from the steady state equations (5) using lower order statistics; and second, the improvement of model sensitivity using the total statistical energy $E$ by solving the additional scalar equation (8). The detailed discussion for the step-by-step construction of the reduced model parameters can be found in Refs. 1 and 7.

##### 1. Reduced model calibration using statistical response theory

It is important to realize that equilibrium fidelity is a necessary but not sufficient condition for model predictions.^{1,7,35} Accurate forecast of the nonlinear forced responses requires the correct representation of model’s “memory” to its previous history. In the linear response theory (12), the linear response operator $RA$ provides a unified way to quantify the model sensitivity in the imperfect models. The imperfect models can maintain high performance for various kinds of external perturbations if they are able to recover the response operator with the desirable accuracy. Thus, the optimal model parameter can be selected through minimizing the information distance in the linear response operators in (20) between the imperfect closure models and the truth.

The model calibration procedure is usually carried out in a training phase before the prediction, so that the optimal imperfect model parameters can be achieved through a careful calibration about the true higher-order statistics. The ideal way is to find a unified systematic strategy where various external perturbations can be predicted from the same set of optimal parameters through this training phase. To achieve this, various statistical theories and numerical strategies need to be blended together in a judicious fashion. Most importantly, we need to consider the linear statistical response theory to calibrate the model responses in mean and variances;^{14,16,17} and use empirical information theory^{10,35,57} to get a balanced measure for the error in the leading order moments. In the final model prediction stage, the optimal imperfect model parameters are applied for the forecast of various model responses to perturbations.

Next, we illustrate the skills of the reduced statistical model in capturing nonlinear statistical sensitivity using the simple triad model.

#### 2. Predicting statistical responses in the triad system

The triad system (9a) where three modes interact through the energy-conserving coupling term form the building block for more complex turbulent flows. It provides a simple first test case to show the model reduction strategy in the first stage. This model mimics the nonlinear interaction of two Rossby waves forced by baroclinic processes with a zonal jet forced by a polar temperature gradient. By changing the model parameters, a wide range of statistics are generated from near-Gaussian to highly non-Gaussian PDFs. The typical performance using the statistical response operators for effective model reduction is discussed in depth in Refs. 1 and 7.

##### 1. Model calibration in the training stage

In the tests using the triad model, we pick the dual energy cascade regime where $d1=1,d2=2,d3=2$, $\sigma 12=10,\sigma 22=0.01,\sigma 32=0.01$, and the energy-conserving nonlinear coupling coefficients $B1=2,B2=B3=\u22121$ and the skew-symmetric interactions as $L1=0.09,L2=0.06,L3=\u22120.03$. In this parameter regime, energy cascades in both directions between the “large scale” $u1$ and “small scales” $u2$ and $u3$, leading to a highly non-Gaussian equilibrium distribution (see Ref. 7 for the PDFs). The linear response operators for the mean and variance are used to tune the parameters in the nonlinear statistical model. Through the discussions about linear response theory in Sec. II C, the true response operator is approximated from the kicked response strategy. The optimal parameter values in the reduced statistical model (24) are achieved through minimizing the distance in the model linear response operators under the information metric. Figure 4 shows the linear response operators in mean and variance due to deterministic and stochastic perturbations. Using the optimal parameters, the nonlinear reduced model gives good recovery of the true linear response effect in both mean and variance. Similar with the previous tests, the qG closure lacks the skill in capturing the sensitivity in the variance response when the non-Gaussian statistics of the system become important.

##### 2. Model prediction to perturbations

In showing the reduced model skill in capturing the nonlinear responses to forcing perturbations, periodic perturbation is added to both the deterministic and random components of the external forcing in the triad system. The numerical results are summarized in Fig. 5. The nonlinear reduced order model prediction is compared with the linear response theory results using the qG closure. Consistent with the previous discussions in Sec. III B, the qG FDT maintains the skill in capturing the responses in the mean with agreeable accuracy. From the signal errors for the mean, the qG FDT is just a little worse than the reduced model results with agreeable accuracy. On the other hand, the linear response theory fails to capture the responses in the higher order variance accurately especially for the two fast processes. In contrast, the reduced statistical model gives accurate predictions in the statistical responses in both mean and variance in slow and fast variables. The improvement is obvious from the comparison of the information errors.

## IV. LINEAR AND NONLINEAR STATISTICAL RESPONSES FOR BAROTROPIC FLOWS WITH TOPOGRAPHY

To extend the validity of the statistical response theory to problems in a more realistic geophysical setting, the topographic barotropic system^{10,37} serves as a stringent model with higher complexity. This model offers a detailed interacting mechanism between small-scale eddies and a dynamically evolving large-scale flow. Here, we show that a reduced-order model in only 3-dimensional resolved subspace is capable of capturing the principal statistics in first and second moments with the desirable accuracy and efficiency. Despite the unavoidable model errors, the low-order approximation model displays uniform ability in predicting strong nonlinear responses to various external perturbations through the statistical response framework. In addition, a rigorous statistical saturation bound can be found for constraining the statistical range of the model responses.

### A. Topographic barotropic system with interacting zonal mean flows

A basic group of simple models that incorporate the essential structures in geostrophic turbulence is the ideal barotropic quasigeostrophic equations with a large-scale zonal mean flow and topographic effect.^{2,10,65} The governing equations for the topographic barotropic turbulence are formulated on a rotational $\beta $-plane with doubly periodic boundary condition on $[\u2212\pi ,\pi ]\xd7[\u2212\pi ,\pi ]$ as

where $q=\omega +h$, $\omega =\Delta \psi $ are the potential vorticity and relative vorticity, respectively; and $\psi $ is the stream function, $h$ is the topographic structure, while $q$ is advected by the velocity field, $v=\u2207\u22a5\psi \u2261(\u2212\u2202y\psi ,\u2202x\psi )$. The interacting topographic Rossby waves from small-scale vortices $\omega $ are coupled with a large scale zonal jet $U$ through the topography $h$. General dissipation and deterministic and random forcing are added on both small and large scale states. The system (26) creates anisotropic flows with a strong meandering zonal jets qualitative resembling to those observed in midlatitude ocean/atmosphere and laboratory experiments on topographic blocking.^{66,67}

The dynamical equations without forcing and dissipation conserve both the “kinetic energy” $E$ and the “large-scale enstrophy” $W$^{10} defined as

It shows that these quadratic invariants have a crucial role in the prediction of nonlinear responses and the analysis of nonlinear stability theory.^{1,65} In the application to the statistical response theory, next we introduce the statistical characterization of the system in unperturbed equilibrium and perturbed states.

#### 1. Equilibrium invariant measure for the unperturbed barotropic flow

In geophysical turbulence, it is reasonable to assume that the unperturbed long-time averaged state (that is, the climate) is known with accuracy.^{10,65} The central quantity of interest becomes the fluctuations about the climate state. A special form of the exact steady state solution with linear dependence between the stream function and potential vorticity is often assumed^{10} as

The steady state is defined by the parameter $\mu $. The linearly dependent solution is also predicted as the most probable state from the maximum entropy principle where the entropy is maximized with constant energy with $\mu $ as the Lagrangian multiplier.^{10,65} We focus on the responses in statistics in the fluctuation components $\psi ~\mu =\psi \u2212\Psi eq(\mu )$ and $U~\mu =U\u2212Ueq(\mu )$ about the climate state. For simplification in representations, the “tildes” for the fluctuations are neglected in the rest part of this section.

The general statistical theories (8) can be applied to the fluctuation states of the barotropic flow. The system is projected on a finite dimensional subspace through a Galerkin truncation of high wavenumber $N$ under standard Fourier basis $e{k}=exp\u2061(ik\u22c5x)$

where $\psi N$ is the truncated stream function, $\omega N=\u22072\psi N$ is the truncated small-scale relative vorticity, and $k=|k|$ is the wavenumber amplitude. A Gibbs invariant measure thus can be constructed^{37} as a Gaussian distribution based on the conserved quantities (27)

Note that above $(U,\omega )$ are the fluctuation components centered at the climate state $(U\xafeq,\omega \xafeq)$. This Gibbs measure is given by the conserved energy and enstrophy constraints.^{10} One obvious drawback in the above invariant measure (29) is that it becomes unrealizable when $\mu <0$. We will address this issue later in Sec. IV D.

In fact, it can be confirmed that $peq(U,\omega )$ with the realizable parameter $\mu >0$ is indeed the invariant measure in equilibrium through direct substitution into the Fokker-Planck equation.^{1} Proper damping and forcing operators are required on the right hand side of (26). The admissible damping and forcing consistent with the Gaussian invariant measure (29) can be constructed^{37} as

with $k=0$ for the large scale forcing for $U$ and $k\u22651$ being the vortical modes. Specific amount of noise $\sigma 0=\sigma |\mu |\u22121/2,\sigma k=\sigma |1+\mu k\u22122|\u22121/2$ is required with the parameter $\sigma $; $\lambda k$ is an additional scaling parameter. For simplicity with $\lambda k\u22611$, the dissipation effect acts as the linear Ekman damping, $\u2212d\omega $. Together, the two model parameters $(d,\sigma )$ define the equilibrium variance $\sigma eq2$. The deterministic forcing is required to be zero, $Fk\u22610$ to reach the invariant measure.

#### 2. Statistical energy principle for the barotropic flow system

With non-zero external forcing perturbation $\delta F\u22600$, the perturbed fluctuations $(U,\omega )$ begin to deviate from the Gaussian invariant measure $peq$. Still, the statistical energy principle (8) can be first used to describe the dynamics of total statistical energy through solving a simple scalar equation. To achieve this, we further introduce the statistical ensemble mean and disturbance about the mean state of the fluctuation variables

The statistical energy considers the combined energy in the mean and the variance in the disturbance state. Note that the above ensemble mean states $(U\xaf,\omega \xafk)$ characterize the statistical responses in the mean due to the deterministic forcing different from the unperturbed time-averaged climate in equilibrium. The required triad symmetry (7) is guaranteed since the barotropic triads will only include components, $\psi k,\psi m,\psi n$, with the selection rule, $k+m+n=0$.^{3} Thus, the nonlinear terms make no contributions in the changes of total statistical energy.

Unfortunately, neither the energy $E~$ nor the enstrophy $W~$ in fluctuation $(U~,\omega ~)$ stays conserved due to the exchange of energy between the steady mean $(Ueq,Qeq)$. Still one important conserved quantity can be constructed as the “pseudoenergy” $E\mu $ through a linear combination of the fluctuation energy and enstrophy $E\mu \u2261W~+\mu E~$. The pseudoenergy $E\mu $ is often more useful since it combines the two important conserved quantities (see Ref. 10). It is useful to investigate the ensemble statistics in the first two moments rather than a single trajectory realization since they not only characterize the deviations from the steady state mean but also illustrate the evolution of uncertainty (variance) for this mean estimation. Finally we can define the “total statistical energy in fluctuations” through the original pseudoenergy as a combination of mean and variance,

with $\u27e8EU\u27e9=U\xaf2+U\u20322\xaf$ being the large-scale statistical energy and $\u27e8Ek\u27e9\u2261|\omega \xafk|2+|\omega k\u2032|2\xaf$ being the energy in fluctuation modes.

Using the conservation law for the pseudoenergy $E\mu $, we are able to derive the statistical energy dynamics for the total energy $\u27e8E\mu \u27e9$. Accordingly, the statistical energy equation with the prescribed Ekman damping and stochastic forcing in (30) and the non-zero deterministic forcing perturbation $\delta F$ becomes

In the above scalar equation, the inner product is defined through the metric in the pseudoenergy (31) according to each perturbed mean, and the entire contribution from the stochastic white noises forcing is applied,

As a find comment, only the statistical mean is included in the contribution to the total statistical energy change due to the exerted external forcing. Thus, the dynamics of the total statistical energy combining mean and variance in (32) is determined through only the change in first order mean state together with the specific forcing and dissipation effects. One advantage of the statistical energy equations is that we can predict the total responses of second-order moments from $\u27e8E\mu \u27e9$ requiring only knowledge of first-order moments.

#### 3. Numerical illustration of the flow structures

We first provide an illustration about the typical barotropic flow structures with the topographic stress. In the direct numerical simulations, the wavenumber truncation is kept relatively small as $N=12$ so that the major large-scale variability is maintained. This also enables us to run direct ensemble simulations for comparison with the true statistics. The topography is taken as a zonal structure on the largest scale mode with perturbations added in smaller scales such that

In the simulations, we take the topographic strength $H=32/4$ and uniform phase shift $\theta k=\pi 4$. Besides, the beta-effect is fixed as $\beta =1$ and the linear damping uses $d=0.1$. In the unperturbed equation, no deterministic forcing is used for all the scales $\delta F0=\delta Fk=0$. A pseudospectral method is applied and the 4th order Runge-Kutta scheme is used for the time integration. The random noise part is added through a standard Euler-Maruyama scheme.

Figure 6 plots the typical flow field snapshots with different forcing perturbations $\delta F=0,0.5,1$. A dominant eastward zonal flow with $U>0$ is generated in the first case. As the forcing perturbation $\delta F$ changes on the largest scale, the eastward flow gets blocked and then the jet direction gets switched to the west (left) direction with $U<0$. The abrupt change of direction in the zonal mean jet is an important observation related with more realistic geophysical flows. More discussions about the stability analysis due to topographic stress can be found in Refs. 10 and 38 and next in Sec. IV D.

### B. Linear response operators according to Gaussian invariant measure

The question in the statistical response theory asks how the distribution function $p\delta $ differs from the unperturbed invariant density $peq$ when inhomogeneous perturbations in different scales, $\delta F0,\delta Fk$, are exerted. Large non-Gaussian higher order statistics could be induced even with a small perturbation forcing due to the nonlinear effects. Next, we test the extent of skills in the linear and nonlinear response predictions using information only in the unperturbed climate $peq$.

The linear response theory in (12) offers a convenient leading order approximation of the responses with the explicit formulation of the invariant measure (29). We focus on the first two moments of the state variables in calculating the linear response operator $RA$. Following the standard steps described in Sec. II C using the linear response formula (13) and the Gaussian invariant measure (29) as the unperturbed equilibrium, the linear response operators for the mean state can be computed from the autocorrelation functions,

where $RA$ is the mean response for the large-scale state with $A=U$ and for the small-scale vortical mode with $A=\omega k$, subject to the general deterministic perturbation $(\delta F0,\delta Fk)$. The total response includes contributions from all cross-correlations between different scales. In practice, we only need to focus on the perturbed directions that dominate the statistical response functionals. The linear response operators for the covariances of $A=U,\omega k$ can be discovered in a similar way as the lagged-in-time third order correlations between large and small scales,

Still, here for capturing the higher order responses in this more complex system, it is necessary to include the feedbacks from cross-correlations between different scales. Even though the cross-correlations between $U$ and $\omega $ get decoupled in the statistical steady state, the non-zero transient response cannot be ignored due to the nonlinear dynamics.

Given hyper-ellipticity, the above system with stochastic forcing is provably geometric ergodic.^{64} The statistical average under equilibrium measure can be computed through temporal average along a single trajectory, that is,

Exploiting this ergodicity is an effective way to reduced the computational cost. With the linear response operator $RA$ available, the leading order statistical response prediction can be computed using the FDT formulation (12). For simplicity, we focus on the responses to constant perturbation, $\delta F=a\delta f$ with the scalar $a$ defining the amplitude of the perturbation. We only need to calculate the time-integrated value $R$ of the linear response operator to get the linear prediction,

Using the above formula (35) for constant forcing perturbation, the leading order responses can be calculated easily and the predictions can be exact if the system is linear. On the other hand, many important nonlinear information in the integrated form of the linear response operator $RA$ is neglected. This will become a serious problem as the nonlinear interactions between modes grow large.

As a first test for the prediction skill of the linear response theory, deterministic forcing perturbations are added on the first two largest scale modes $\delta FU$ and $\delta F1$. The perturbation is tested on a wide range with $a\u2208[\u22121.5,1.5]$. The results for the responses in the mean and variance are shown in Fig. 7. First for the true statistical responses, nonlinearity becomes dominant in both mean and variance even with small perturbation amplitudes. The linear response prediction to the constant perturbation gives a tangent line at the zero perturbation point $\delta F=0$. It offers the infinitesimal rate of change near the origin but fails to capture the nonlinear responses as the perturbation grows large.

### C. Nonlinear response functionals from reduced-order statistical methods

In the prediction of nonlinear responses, reduced-order statistical models are required in correctly representing the nonlinear higher order energy transfer mechanism. Here for the topographic barotropic flow, we focus on the responses in the large scale flow $U$ and the largest scale vortical mode $\omega 1$ due to perturbations. Thus, only the first three modes are resolved compared with the original truncated system. Following the general modeling steps in Sec. III C, the optimal model parameters are first calibrated in the training phase, then the nonlinear responses are predicted in the forecast stage.

#### 1. Reduced-order model for the barotropic system with topography

The same model reduction strategy in (24) is applied here on this barotropic system. In the reduced-order statistical dynamics, we focus on the prediction in the mean $(U\xaf,\omega \xaf)$, variance $(rU,r\omega )$, and the cross-covariance $c$. Explicitly, the dynamical equations for the statistical mean and variance in the leading modes can be derived for the imperfect reduced-order model:

*Mean equations*:(36a)$d\omega \xaf1dt+Lm=\u2212d\omega \xaf1+\delta F1\u2212(dM,1+i\omega M,1)\omega \xaf1+GM,dU\xafdt\u2212ih^1\u2217\psi \xaf1=\u2212dU\xaf+\delta F0\u2212dM,0U\xaf+HM.$*Covariance equations*:(36b)$dr\omega ,1dt+2Lr=\u22122dr\omega ,1+k\u22122\sigma 12\u22122dM,1r\omega ,1+\sigma M,12,drUdt+2h^1\u2217c1=\u22122drU+\mu \u22121\sigma 2\u22122dM,0rU+\sigma M,02,dc1dt+Lc=\u22122dc1\u2212(dM,0+dM,1+i\omega M,1)c1.$

In the mean equations, $Lm$ represents the quasilinear part as well as the large-small scale interaction, $GM$ and $HM$ add the correction from the unresolved subspace using equilibrium statistics. In the variance equations, $Lr$ and $Lc$ represent the quasilinear effects between the small and large scale variables. The last terms on the right hand sides of the equations offer the imperfect model estimation of the unresolved higher order feedbacks. Detailed derivation and expressions of the above statistical equations can be found in Ref. 37.

The same model reduction strategy in (24) is carried out for this barotropic system. Feedbacks from higher-order statistics through nonlinear triad interactions are replaced by the corrections as in (25). Imperfect model parameters $(dM,\omega M,\sigma M)$ are introduced for each scale to control the model performance. The information-theoretic framework (20) is applied in the training stage for the optimal model parameters. In addition, the model sensitivity is improved by scaling the parameters with the total statistical energy $\u27e8E\u27e9$ by solving the statistical energy equation (32) requiring only the statistical mean state with perturbation. The total energy scaling factor has been shown crucial for the accurate prediction of model responses.

#### 2. Reduced model performance in predicting nonlinear responses

In the training phase, the imperfect model parameters are determined based on first equilibrium consistency to recover the exact equilibrium statistics in the unperturbed climate; and then model sensitivity in reproducing the linear response operators using the reduced statistical model. First, the equilibrium consistency is guaranteed using the steady state mean and variance to compute the equilibrium higher order moment feedbacks. This offers the first constraint on the model parameters. Next, the model sensitivity is improved by tuning the rest model parameters according to the linear response operators $R$ for the mean and variance in (34a) and (34b). The relative entropy distance offers an unbiased metric to calibrate the model error combining the mean and variance.

The optimal tuning results for the linear response operators $RA(t)$ for the mean and variance in the largest scale states are compared in the first part of Fig. 8. The optimized reduced-order model shows high skill in recovering the linear response operators for both the mean and variance. Next in the second part of Fig. 8, the predictions for the nontrivial nonlinear responses in mean and variance are shown. The same constant forcing perturbations with different amplitudes as in Fig. 7 are tested again for the efficient reduced order model. In this case, the strong nonlinear responses appearing in both mean and variance are captured with the desirable accuracy in the nonlinear statistical model. This example together with the previous linear response prediction with much larger errors emphasizes the crucial role in modeling the important higher order moment feedbacks in achieving accurate prediction for the nonlinear statistical responses.

### D. Rigorous statistical bounds for the total statistical responses

Using the statistical reduced-order models, the nonlinear responses along the most important model directions can be predicted with accuracy. From another perspective, it is also useful to investigate the bounds in the total statistical variability of the turbulent system. This is related with the statistical saturation bounds concerning the nonlinear stability. In this last section, we display the major conclusions developed in Ref. 38 for the rigorous statistical bounds in the barotropic model with topography.

In the previous discussions for the linear response theory, the fluctuation components about the steady state solution $(Ueq[\mu ),\Psi eq(\mu )]$ are characterized by the Gaussian invariant measure (29). This invariant distribution becomes unrealizable when the linear dependence parameter $\mu $ changes to negative. In this scenario, one natural question is how the system responds to such uncertainties in the unstable regimes $\mu <0$. Instead of the classical deterministic stability analysis which seeks the maximum amplitude of the growing disturbance in one single perturbed flow trajectory near the stationary steady state, the statistical saturation bounds investigate the growth in ensemble predictions during the evolution of the probability distribution of the states.

#### 1. Statistical energy saturation bounds in response to forcing and dissipation

The statistical saturation bounds in general are subject to two classes of flow perturbations: the perturbations in the initial distribution configuration; and the external forcing and dissipation effect. The two sources of uncertainties are unified under the same analytical framework using the conservation of statistical pseudoenergy (31) in Ref. 38. Here, we stick to the previous model setup with the forcing and damping forms (30) and provide the saturation bounds in fluctuation responses subject to external perturbations.

The total statistical energy $\u27e8E\mu \u27e9$ in fluctuation due to the Ekman damping and forcing effects follows the dynamical equation (32),

with the deterministic part applying on the statistical mean state and the stochastic part offering the combined contribution through $Q\sigma ,\mu $. To find the proper bound of the total statistical energy due to the external forcing, we separate the interaction terms with the statistical mean by applying Cauchy’s inequality

with parameter $\lambda >0$. Using “Grönwall’s inequality” to the above relations, the saturation bound for the total statistical energy $\u27e8E\mu \u27e9$ due to the effect of damping and external forcing can be found as

with the statistical bound $C\mu =\mu 2({(d\u22121F0)}2+d\u22121\sigma 02)+12\u2211(1+\mu k\u22122)(|d\u22121Fk|2+d\u22121\sigma k2)$. The result in (37) offers a desirable bound for the stable regime $\mu >0$. However with $\mu <0$, one obvious difficulty is that the components in the statistical energy $\u27e8E\mu \u27e9$ are not all positive. Thus, (37) is no longer a saturation bound for the total energy, but a balance between the positive and negative subspaces that separate the total $\u27e8E\mu \u27e9$.

The trick in Ref. 38 is to propose “reference states” with parameter $\alpha >0$ but with the real forcing in the form with parameter $\mu <0$. The previous bound (37) is still valid according to the reference state energy $\u27e8E\alpha \u27e9$, which remains positive definite in all coefficients of $\u27e8U2\u27e9,\u27e8|\omega k|2\u27e9$. Additional “reference forcing” effects need to be separated for the dynamical equation of $E\alpha $ based on the reference state with parameter $\alpha >0$ in the mean flow and vortical modes. Then, the final saturation bound for the unstable regime $\mu <0$ is achieved by minimizing among all the bounds in (37) with the reference states $\alpha >0$. We state the final conclusion for the saturation bounds with forcing and dissipation effects in the following theorem:

*Saturation bound for statistical mean and variance with damping and random forcing*$)$ With the linear damping and forcing as in (30)$,$ the combined statistical mean fluctuation and variance$,$ $\theta Em+Ev,$ with the ratio parameter $0\u2264\theta <1$ can be controlled as in

#### 2. Numerical verification of the saturation bounds

The saturation bound in (38) can be verified through simple numerical simulations. The basic model setup is kept the same as before. The forcing perturbation is added according to the reference steady state $(Ueq,\Psi eq)$ defined by $\mu $. The maximum statistical growth in mean perturbation and variance is tested in the statistically unstable range $\mu \u2208(\u22122,\u22121)$. Figure 9 compares the total statistical energy in the mean and variance fluctuation through direct numerical simulations and the theoretical saturation bound (38) in the total variability of the statistics. The saturation bound offers a tight upper limit for the maximum growth in the total statistics in both the mean fluctuation and the variance. Especially, for the combined mean fluctuation and variance bounds with two ratio values $\theta $, the upper bound estimation is extremely close to the numerical results, offering an accurate prediction for the maximum increase in the total statistical energy. A more complete discussion with many different forcing scenarios for the statistical saturation bounds according to both initial uncertainty and external forcing excitations can be found in Ref. 38.

## V. BEATING THE CURSE OF DIMENSIONALITY USING STATISTICAL CONTROL THEORY FOR COMPLEX SYSTEMS

The statistical control of inhomogeneous turbulent systems offers another set of problems where the statistical response theory can play an important role. The efficient control of complex systems remains a challenging problem due to the inherent nonlinearity, high-dimensionality, and strong exchange of energy between scales.^{4,9,12,13} The major goal of the control theory can be summarized as to construct an optimal course of action (the “control”) to drive the perturbed statistical solution of a complex turbulent system back to the small neighborhood of the original unperturbed target statistical equilibrium state.

One successful approach for controlling flows in the transition to turbulence is dynamical programing by solving the Hamilton-Jacobi-Bellman (HJB) equation.^{68,69} However, the curse of dimensionality appears during directly solving this HJB PDE when the system becomes genuinely high dimensional and fully turbulent.^{1} We have shown previously in Sec. IV that the statistical response theory offers an efficient way to characterize the model sensitivity to perturbations with robust performance so that it enables us to construct an alternative approach for the specific control without the need to run the detailed complex system. In this section, we illustrate the ideas presented in Refs. 40 and 41 for the effective control combining the statistical energy principle and statistical response theory. The topographic barotropic model (26) is used as a typical example to demonstrate the novel control strategy and its skill.

### A. Efficient control of total statistical energy with mean response

To avoid directly controlling the high dimensional system (1), we first seek to control the scalar statistical energy (8) through manipulating the mean responses $\delta u\xaf=u\xaf\u2212u\xafeq$ subject to various perturbation forcing $\delta F$,

In the control problem, the focus is on the statistical energy fluctuation $E\u2032=E\u2212Eeq$ removing the unperturbed equilibrium $Eeq$ with $\delta F\u22610$. Instability will not appear in this statistical energy equation due to the symmetry in the nonlinear interactions.^{1,36} Thus, the major difficulty in stabilizing many directions of instability in controlling the original equation is circumvented.

Next, the (deterministic) control forcing and the equilibrium statistics are projected to the subspace for control spanned by a truncated orthonormal basis ${ek}k=1N$,

where the control forcing $\kappa \u2192(t)$ applies on the selected $M<N$ leading modes. Especially in typical engineering applications, the control is typically restricted in a specific subspace and the basis can be selected accordingly inside the restricted subspace. The statistical energy fluctuation equation with leading order responses in the mean can be written with respect to the forcing perturbation along the orthonormal modes ${ek}k=1M$

Above and later, the primes for the fluctuation components are dropped in the statistical energy fluctuation $E\u2032$, and $E0$ is the initial total statistical energy perturbation to be controlled. Only leading order terms $O(\delta )$ remains in the dynamics by assuming that the perturbation is kept relatively small.

At last, we link the mean responses $\delta u\xaf$ with the control forcing perturbation $\kappa $ using the linear response theory. Through the linear response formula in (12), the response in the leading order mean state, $\delta u\xaf$, subject to the external forcing $\kappa \u2192(t)$ can be estimated using the mean response operator $Ru$,

Above, we assume that the responses from diagonal components for the variance along each direction $ek$ is dominant, and cross-covariances between different mode can be neglected. The diagonal model with autocorrelation functions as the linear response operator for the mean $Ru$ can be shown as an attractive option from judicious simple linear regression approximations.^{30,45,46}

Using the linearization in both the statistical energy equation and the leading order mean response, the first control problem for the total statistical energy fluctuation can be formulated using the mean response operator $Ru(t)$. The “statistical control equation in leading order responses” can be rewritten from the original statistical dynamics as

The control system (40) is linear but involves nonlinear statistical functionals as coefficient from the target unperturbed equilibrium measure. “Nonlocal non-Markovian forcing” is introduced through the linear response operator $Rk$.

### B. Effective statistical control strategy in two steps

The task now is to find proper formula for the ideal “optimal control forcing” $\kappa \u2192(t)$ on the mean state to drive the perturbed energy fluctuation $E$ in (40) back to zero (or close to zero) efficiently with minimum cost. A “local Markovian statistical control” $C(t)$ is then introduced as a nonlocal functional of the scalar control forcing $\kappa \u2192(t)$ containing all the perturbed feedbacks in the statistical energy fluctuation. We first construct the linear statistical control problem by proposing a proper cost function to optimize. With the statistical control $C(t)=\u2211kCk$ defined with a combination of the contributions from each mode $ek$, the control system (40) is converted to the linear scalar dynamical equation between time interval $[t,T]$,

regardless of the specific structure in the functional $C(t)$. We attempt to minimize the “cost function” as a quadratic combination of the energy fluctuation $E$ and control functional in each mode $Ck$,

The weighting parameter $\alpha k>0$ is introduced to add a balancing factor between the control efficiency (due to $E$) and the control expense (due to $Ck$). In this way, we need to just focus on this statistical control operator $C(t)$ for total statistical energy $E(t)$ in the first step regardless of the explicit forms of the mean state and specific forcing perturbations or exact equilibrium statistics. Then in the second step, the task is to invert the (nonlocal) functional of $C(t)$ to get the explicit forcing control strategy for $\kappa (t)$ using the leading order mean responses.

#### 1. Optimal control on statistical energy identity

In the first step of constructing statistical control $C(t)$ in (41), this linear statistical control problem can be directly solved by the HJB equation through “dynamic programing.”^{68} It solves the backward equation in time about the optimal value function $v(x,t)=minCF\alpha [C(\u22c5)]$. The HJB equation becomes the Riccati equation^{40,70} that is easy to solve for the explicit optimal solution. The optimal feedback control $Ck\u2217(t)$ together with the optimal control statistical equation for $E\u2217$ can be found by the following system:

with the final time value $K(T)=kT$ and initial fluctuation energy $E\u2217(0)=E0$.

#### 2. Attribution of the optimal statistical control

The second step of the problem is to find the optimal control forcing $\kappa k(t)$ from the optimal statistical control $C\u2217(t)$ achieved. Through the explicit formula (40), the attribution of the optimal statistical control forcing $\kappa $ becomes the nonlinear inversion using the linear response factor $Ru$,

Combining the above equality (44) with the previous explicitly optimal statistical control (43), it requires an explicit expression for the mean response operator $Ru$ so that the nonlocal operator can be inverted. The mean response function can be simply approximated by the autocorrelation using a linear regression model,

The imaginary parameter $\omega M$ is used to approximate the oscillatory structure that is common in autocorrelations. The information-theoretic framework^{45,46} again offers an efficient way to reach the optimal model parameters with high accuracy. More details about the strategy in finding the optimal model parameters in (45) in Sec. VI A.

With the approximation in the mean response operator (45), the explicit control forcing in (44) can be recovered by solving the following equation

By solving the above equation, the proposed control $\kappa (t)$ is constructed in explicit form to force the perturbed system back to the neighborhood of initial unperturbed climate. The detailed derivation of the above steps can be found in Ref. 41.

In summary, the two-step strategy of the statistical control has the advantage of only requiring equilibrium information *a priori* without running the detailed high-dimensional complex turbulent system with perturbations. The linear response operator $Ru$ for the mean can be approximated using the autocorrelation function in the most sensitive direction. Therefore the entire control strategy can be carried out easily despite the highly complicated original turbulent dynamical system in a large phase space with instability.

### C. Statistical control of the topographic barotropic flow

The statistical control strategy is displayed under the topographic barotropic turbulent model (26) with the same model setup. In climate systems, the mitigation of climate change effects requires effective control strategies to guide the perturbed state efficiently back to the equilibrium climate. The barotropic flow serves as a simple system to check the control skill to climate change forcing with representative reference to the real atmosphere and ocean. As shown in the various tests in Sec. IV, the flows show distinct inhomogeneous divergence from the equilibrium state with changing directions as forcing perturbation is exerted. In controlling the barotropic flow, we assume that the equilibrium state $(Ueq,\omega eq)$ (i.e., the climate) is known with reasonable accuracy, and the goal is to find the optimal control (as one external forcing) that can drive the perturbed states (say, due to unknown climate change perturbation $\delta F$) back to the unperturbed equilibrium at a fast rate.

According to the general framework, we begin with the scalar statistical dynamical equation to derive a optimal statistical control $C(t)$. Then, the optimal control forcing $\kappa $ is found by solving a scalar Riccati equation regardless of the complicated nonlinear effects. The same statistical energy equation (32) is adopted for the construction of statistical control $C(t)$ as in (41). As in the previous simulations, the perturbation forcing is only exerted on the large scale flow $U$ and the largest topographical modes $\omega 1$. These are the most sensitive directions to perturbations according to the linear response analysis in Sec. IV B. Therefore, we just need to control on the total statistical energy fluctuation using control $C0$ on the mean flow $U$ and $C1$ on the ground vortical mode $\omega 1$,

The optimal energy control can be found using the solution in (43),

The values of $\alpha 0,\alpha 1$ can be determined by the energy in mean flow $U$ and mode $\omega 1$.

#### 1. Linear approximation for the statistical responses in perturbed modes

The next task is to recover the explicit control forcing $\kappa (t)$ with the inversion relation through the statistical control solutions $C0,C1$. The first-order response is approximated by the autocorrelation functions so that

with $u=U$ controlling the large-scale flow and $u=\omega $ controlling the small-scale vortical modes. As a further approximation for practical computation, a linear regression model is used to fit the autocorrelations, that is, $RUM(t)=exp\u2061(\u2212\gamma 0Mt)$ and $R\omega M(t)=exp\u2061(\u2212(\gamma 1M\u2212i\omega 1M)t)$. The first row of Fig. 10 gives the linear model fitting for the autocorrelation functions. Good agreement is reached using the linear explicit fitting functions. This guarantees the feasibility to recover the control forcing form by solving the inversion problem (44). More applications of this linear fitting of the response operator will be shown next in Sec. VI for the prediction of intermittency.

#### 2. Verification of the statistical control in the perturbed barotropic flow

In the numerical test, the same setup as in Sec. IV A is adopted again for the control tests. The unperturbed equilibrium is reached with no additional forcing $\delta F0=\delta F1=0$, while perturbations are added on the largest scales, $\delta F0=dF,\delta F1=dF$. Still, all the rest uncontrolled modes in smaller scales count for large percentage of the total energy, thus are crucial in the final flow structure.

In the verification step, the skill of the optimal control forcing $(\kappa 0,\kappa 1)$ is applied to the perturbed flow at time $t=5$. The lower panel of Fig. 10 gives the control results compared with uncontrolled case where the forcing perturbation is simply removed. The mean and variance responses in the large scale $U$ and first mode $\omega 1$ both converge to the unperturbed state. Compared with the uncontrolled case, the control forcing $\kappa $ effectively speeds up the convergence process to a much faster rate. This illustrates the effectiveness of the efficient statistical control strategy to propose the explicit control forcing valid for high-dimensional turbulent systems with many degrees of internal instability and uncertainty.

The success in the control of the stringent barotropic model presented here suggests further research to understand these model errors in controlling other more complex turbulent dynamical systems. While the L-96 model with $F=6$ already in Ref. 41 has significant non-Gaussian statistics, more investigation is needed to determine the skill and improve algorithms for other important non-Gaussian systems and engineering control. In particular, the role of multiplicative noise is another significant source of non-Gaussianity.

## VI. PREDICTING EXTREME EVENTS AND TRACER STATISTICS IN THE PASSIVE SCALAR TURBULENCE

Turbulent diffusion and intermittency form another rich group of highly nontrivial multiscale problems requiring strategies using the statistical response theory.^{32,46,57,71} Intermittent phenomena in passive scalar turbulence refer to the bursts of extreme values followed by low-amplitude phases that are representative in the observational data.^{2,6,43} Even with no positive Lyapunov exponents, exactly solvable models have shown to exhibit intermittent behaviors from rigorous analysis.^{72}

Linear and nonlinear response theories provide effective calibration strategies for the statistically accurate prediction of these extreme events. Here, we illustrate the application of statistical response theory developed in Refs. 72 and 73 for the prediction of intermittency under the framework of the two-layer baroclinic model (9e). The more complicated two-layer model is picked here due to its wider range of representative features for typical intermittency in ocean and atmosphere regimes.^{3,74,75}

### A. Stochastic formulations for the passive scalar turbulence

The scalar field $T(x,t)$ is characterized by the concentration of the passive tracers which immerse in the turbulent fluid with velocity $v$ but do not inversely influence the dynamics of the fluid. The evolution of the scalar field is described through the joint effect of turbulent advection and diffusion,^{3,44,76}

The dissipation is set as a linear damping and diffusion, where $dT$ has a stronger effect on the large-scale modes and $\kappa T$ applies to the small-scale modes for the diffusion. The statistical description of tracer transport in turbulent flows is a prevalent concern in atmospheric and oceanic fluid dynamics when the advection flow $v$ becomes turbulent.^{52,76} There exists well developed statistical-dynamical understanding for the above problem but with much simpler master equations for the advection flow.^{72,77} The dynamics of the passive tracers can be also investigated via the Lagrangian approach tracking the tracer trajectories.^{44,78}

#### 1. Equations of the passive tracer turbulence with mean gradient

The existence of a background mean gradient for the tracer field has been shown responsible for many important phenomena.^{79,80} A background mean gradient varying along the $y$ direction is then introduced in the tracer field,

The fluctuation part of the tracer $T\u2032$ satisfies the following dynamics by substituting the decomposition (48) into the original tracer equation (47),

The turbulence flow $v=(u(x,t),v(x,t))$ is the solution of the two-layer equations (9e).

Equation (49) offers a detailed illustration about the energy exchange mechanism between the tracer and underlying flow field. Explicit solutions for the mean, variance, and cross-covariance between tracer and flow modes can be derived explicitly and intermittency can be predicted through the random resonance between modes of the turbulent velocity and the passive tracer.^{72} The system (49) advected by the two-layer flow contains more realistic features with inherent internal instability and strong non-Gaussian statistics. The reduced-order model is then constructed with a careful calibration according to the true model nonlinear responses.

#### 2. Reduced-order stochastic model for the passive tracer turbulence

Usually high computational cost is required to resolve the tracer equation (49) as well as the complex turbulent advection flow (9e) through direct methods. As a continuation of the idea for model reduction in Sec. III C and Ref. 7, we adopt the model reduction procedure with linear corrections to approximate the advection flow field in the leading $M$ most energetic modes so that the principal variability can be captured.

First, the flow potential vorticity $qj$ and passive tracer disturbance $Tj\u2032$ can be written in the expansion under modes as

It has been shown that important non-Gaussian and intermittent structures in the tracer field can be generated from a simplified linear Gaussian advection flow. Therefore, the reduced-order advection flow equations can be proposed using the imperfect model reduction idea as in Secs. III C and IV to replace the complex quadratic interaction with equivalent linear damping and noise,

The turbulent velocity field can be achieved from the streamfunction $vM=\u2207\u22a5\psi M$. Only the first $M\u226aN$ large-scale modes, $1\u2264|k|\u2264M$, are resolved in the reduced-order formulation in (50). $(\gamma q,k,\omega q,k)$ are the linear dissipation and dispersion operators. Additional damping and noise $(DqM,\Sigma qM)$ are introduced to correct model errors due to the neglected nonlinear interactions in the flow equations so that consistent equilibrium statistics in first two moments and optimal linear response functional can be recovered.

Next, the tracer dynamics is modeled directly without additional model calibrations about the tracer field statistics in the case of over fitting about data,

Consistent with the advection flow model (50), only the first leading modes $|k|\u2264M\u226aN$ are resolved in the tracer approximation model. The advection flow solution $vM=\u2207\u22a5\psi M$ is solved from the Gaussian linear model solution in (50), The damping and dispersion operators are $\gamma T,k=dT+\kappa T|k|2$, and $\omega T,k=kxU$ due to the dissipation and advection in large-scale zonal mean flow.

In this modeling process, we only consider the calibration about the advection flow field, and the flow solution is applied to the tracer equations for predicting tracer statistics directly. One crude approximation in flow Eq. 50) is to replace the nonlinear advection with linear damping and noise $(DqM,\Sigma qM)$ as discussed in Sec. IV and previous works.^{45,72} Using the statistical formulation (24), the additional damping and noise corrections for the reduced-order flow vorticity model (50) is proposed in the following form

Above $QF,eqq$ is the nonlinear flux (25) from the equilibrium statistics, and the parameters $(Dq,kM,\Sigma q,kM)$ are additional corrections. Comparing with the exact true system, the reduced-order approximation is equivalent to replacing the nonlinear interaction terms with the judiciously calibrated damping and noise (52) in consideration with both the equilibrium energy transfer mechanism and further sensitivity correction. Still, this approximation may introduce large model errors due to neglecting the feedbacks from the unresolved smaller scale fluctuations. Next, the imperfect model error is calibrated and minimized using the information-theoretical framework by calibrating the linear response functionals.

### B. Predicting extreme events using the statistical response theory

#### 1. Training phase for optimal linear response with information theory

Now, we describe the model calibrations for consistency in equilibrium moments and model optimization using the linear response theory under a systematic information-theoretic framework. This information-theoretic framework has been utilized to systematically improve model fidelity and sensitivity and to make an empirical link between model fidelity and forecasting skill. Using the statistical response theory, proper imperfect model parameters can be proposed according to the detailed model calibration using equilibrium statistics.

Especially in the case for capturing model intermittency with fat-tailed distributions in the leading modes, accurate characterization about the transient state with instability becomes crucial. It is realized that with only the corrections from the equilibrium nonlinear interactions, large errors may still exist in the reduced-order model predictions since the autocorrelation functions are not well characterized). As illustrated in the linear response theory (12), the mean sensitivity in the stochastic process for $q(t)$ can be approximated by its autocorrelation functions,

Next, we follow the general strategy introduced in Refs. 45 and 46 for measuring autocorrelation functions with the help of information theory.

The problem here is about to find a proper metric to measure the errors in the imperfect model autocorrelation functions in the training phase in a balanced way. A preferable measure to offer an unbiased metric for the imperfect model probability distribution $pM$ from the truth $p$ is to use the relative entropy (20) concentrating on the leading order moments. However, one additional difficulty in tuning the autocorrelation functions under the information metric is that the functions given in (53) are not actually a probability distribution function to measure in the relative entropy. The problem can be solved by instead considering the spectral representation of the random process $q(t)$, from the “theory of spectral representation” of stationary random fields.^{81} It is proved by Khinchin’s formula^{81} that if the autocorrelation function $R(t)$ makes smooth and rapid-decay, a positive definite matrix $E(\lambda )>0$ can be constructed so that the “spectral representations” of the autocorrelation $R(t)$ and stationary process $q$ become

where the spectral random measure $Z^(d\lambda )$ has independent increments and energy spectrum measured by $dF(\lambda )=E(\lambda )d\lambda =E|Z^(d\lambda )Z^\u2217(d\lambda )|$. Applying the theory for spectral representation of stationary processes, we find the one-to-one correspondence between the autocorrelation function $R(t)$ and positive-definite energy spectra $E(\lambda )$, together with the spectral representation $Z^(d\lambda )$ of the stochastic process $q(t)$. We seek a spectral representation about the autocorrelation function like

where $D(x)\u2261\u2212log\u2061detx+trx\u22122$ is the Gaussian relative entropy with a zero mean state. Also, since here we only concentrate on the leading order statistics (that is, mean and variance), thus this representation is enough. We can find the optimal model parameter $\theta \u2217=(dM,\omega M)$ by minimizing the information metric defined in (55),

Note that, in the reduced-order model in (50), the dynamics in each mode is set to be simple and linear, thus the model is efficient to run and even affordable for the ensemble method to run a group of particles to capture the statistics in important spectral modes. The detailed description of the above strategy can be found in Refs. 45 and 46.

#### 2. Prediction of extreme events in tracers in the baroclinic turbulence

We test the reduced-order models described in the framework of the two-layer baroclinic model (9e). In numerical simulations, the true statistics is calculated by a pseudospectra code by resolving the two-layer equations with 64 spectral modes zonally and meridionally, corresponding to $128\xd7128\xd72$ grid points in total. The time integration is through the standard 4th-order Runge-Kutta methods. The tracer field is simulated directly using the two-layer flow solution. In the reduced-order models, we only compute the large truncation $|k|\u2264M=10$ in largest scale modes, compared with the true system resolution $|k|\u2264N=64$.

As a typical example, Fig. 11 displays the representative flow and tracer structures from direction numerical simulations. The time series of the heat flux $Hf=\u222bv\tau =\u222b\psi x\tau $ illustrate the heat transfer mechanism. In the unblocked regime with strong zonal flow, the meridional heat flux is suppressed, while in the blocked regime, strong meridional heat flux can be observed showing a strong meridional transport of the heat. The source of intermittency in passive scalar tracer is described qualitatively in Ref. 80 under an elementary model. Strong tracer intermittent bursts are always corresponding to the strong meridional heat transport in blocked regime leading to high mixing rate in tracer modes; the quite regime with little meridional heat flux in the unblocked regime also shows little intermittency of the tracer field in small amplitude.

The above on and off intermittent mechanism detailed in Refs. 44 and 80 explains the turbulent mixing and intermittency via streamlines blocking and opening through simple elementary model with no instability and turbulence, while similar phenomena are also observed in the numerical simulations through much more complex two-layer baroclinic model in turbulence here. Even though simple and qualitative, the intuitive reasoning hints that the first two most energetic leading modes (0,1) and (1,0) which represent the competitive unblocked and blocked regime might be crucial for modeling the intermittent structures in the tracer field.

Still, the two-layer turbulent model includes much more complicated non-Gaussian nonlinear features. So next, we test the performance with combination of the reduced-order models (50) and (51), and check how the tuning procedure in training phase can improve the imperfect model prediction skill. Figure 12 plots the time series and PDFs of the leading tracer modes with strong intermittent structures. First qualitatively, the reduced model captures the occurrence of extreme events with much less computation cost. Next quantitatively in the PDFs, the fat tails representing the more frequent occurrence of extreme values are also captured from the reduced order model. Note here in the reduced-order approximation, only the largest scale modes $|k|\u226410$ are resolved compared with the full model with a resolution of 128 grid points along each horizontal direction. The model efficiency can be effectively improved by adopting the low-order model. A complete comparison of the skills of the reduced-order stochastic model for the recovery of tracer energy spectra, eddy diffusivity, and various fat-tailed distributions in different ocean/atmosphere regimes is shown in Ref. 46.

## VII. SUMMARIZING DISCUSSION

Linear and nonlinear statistical response theory provides an effective tool to analyze the various problems and challenges in high-dimensional turbulent dynamical systems.^{1,7,53,56} In this paper, we summarize several useful perspectives of the linear and nonlinear response theory with applications to sensitivity analysis, uncertainty quantification, and control of complex turbulent systems.^{1,7,37,40} The effectiveness of the various strategies described in the main text is displayed under a hierarchy of simplified prototype models ranging from exactly solvable equations to geophysical turbulence with anisotropic dynamics. These successful test cases imply the potential of the strategies with statistical response theory to be further generalized to many more realistic high-dimensional systems with similar nonlinear structures. Immediate examples to be considered next include the Boussinesq models with stratification^{3,8} in geophysical flows and a set of new models^{82,83} for characterizing plasma edge turbulence with collision effects.

We first provide a brief introduction to the general statistical response theory in Sec. II. The practical skill in predicting multiscale variability in mean and variance using the linear response operators with various approximation strategies are discussed in Sec. III using the two-layer L-96 model showing nontrivial nonlinear responses. The accuracy with the different equilibrium statistical approximations is calibrated under the unbiased information metric.^{14} Already under this simple framework, it shows the importance of considering the higher order non-Gaussian statistics even for the accurate prediction in the second order variance responses. The extent of the prediction skill in the linear response theory is limited in the small perturbation regime. This shortcoming becomes more obvious in the anisotropic barotropic turbulence in Sec. IV, where the nonlinear response to the external forcing perturbations is particularly strong.

A systematic information-theoretic framework^{1,7,57} is shown effective to improve the prediction skill and capture the nonlinear response trend. Without dependence on the specific perturbation form, a statistical model reduction strategy in general can be applied to any turbulent dynamic systems with model errors for predicting any perturbed responses. It is required for the models to recover the crucial features of the natural system in statistical equilibrium for climate fidelity, and improve the model sensitivity in response to various external perturbations like climate change and mitigation scenarios. Linear response theory, in turn, provides important guideline for the tuning of optimal imperfect model parameters in the training phase in a systematic fashion. The validity of the reduced-order statistical models is displayed first in the simplest three-dimensional triad model for characterizing the typical features in the strategy, and then to the more complex topographic barotropic flows including multiscale flow interactions with detailed investigation in Sec. IV. The model error is effective reduced by combining the linear response theory with the efficient information theory calibration. Besides the turbulent flow field, the dynamics of passive tracers advected by the turbulent flow also contain many distinctive features, such as, the occurrence of extreme events characterized by fat-tails in the tracer density distributions. The same model reduction idea is applied to the construction of stochastic models^{45,46} aiming to capture such extreme events. In Sec. VI, the reduced-order stochastic model displays uniform high skill in capturing fat-tails in the leading resolved modes using the complex two-layer baroclinic model.

The statistical response theory also displays a promising role in the development of effective control of high-dimensional dynamical systems.^{40,41} Instead of a direct control on the state variables in a high-dimensional phase space, the simple scalar dynamical equation makes it possible for the control of total statistical energy in an efficient way. Using the total statistical energy equation^{40} avoids treating the complicated higher order nonlinear interactions with instabilities. The statistical responses in the mean subject to external forcing is approximated using linear response theory without the need to run the explicit model. The control performance is illustrated using the same topographic barotropic model in Sec. V. Despite the model errors introduced through the leading order expansion in both the statistical equation and linear prediction in mean response, the statistical control methods have uniform robust skill for controlling the homogeneously and inhomogeneously perturbed system in the same fashion. Furthermore, this statistical control strategy shows potential for extremely high-dimensional problems in combination with turbulent flows and passive turbulence.

## ACKNOWLEDGMENTS

The research of A.J.M. is partially supported by the Office of Naval Research (ONR) (No. N00014-19-S-B001). D.Q. is supported as a postdoctoral fellow on the grant.

## REFERENCES

*Proceedings Seminar on Predictability, September 1996*(Cambridge University Press, 2006), Vol. 1, No. 1.

*An Introduction to Mathematical Optimal Control Theory*, Lecture Notes (University of California, Department of Mathematics, Berkeley, 2005).