Harmonic decompositions of multivariate time series are considered, for which we adopt an integral operator approach with periodic semigroup kernels. Spectral decomposition theorems are derived that cover the important case of two-time statistics drawn from a mixing invariant measure. The corresponding eigenvalues can be grouped per Fourier frequency and are actually given, at each frequency, as the singular values of a cross-spectral matrix depending on the data. These eigenvalues obey, furthermore, a variational principle that allows us to define naturally a multidimensional power spectrum. The eigenmodes, for their part, exhibit a data-adaptive character manifested in their phase, which allows us in turn to define a multidimensional phase spectrum. The resulting data-adaptive harmonic (DAH) modes allow for reducing the data-driven modeling effort to elemental models stacked per frequency, coupled across frequencies only by the same noise realization. In particular, the DAH decomposition extracts time-dependent coefficients stacked by Fourier frequency which can be efficiently modeled (provided the decay of temporal correlations is sufficiently well-resolved) within a class of multilayer stochastic models (MSMs) tailored here on stochastic Stuart–Landau oscillators. Applications to the Lorenz 96 model and to a stochastic heat equation driven by a space-time white noise are considered. In both cases, the DAH decomposition allows for an extraction of spatiotemporal modes revealing key features of the dynamics in the embedded phase space. The multilayer Stuart–Landau models (MSLMs) are shown to successfully model the typical patterns of the corresponding time-evolving fields, as well as their statistics of occurrence.
This article introduces a new approach for the stochastic inverse modeling of multivariate time series issued from a mixing dynamical system. For such time series, a new framework, relying on semigroup and integral operator theory, is first presented, and data-adaptive spectral theorems are derived. This framework allows us in turn to extract, for a broad class of time-evolving datasets, reduction coordinates that can be efficiently modeled within a family of stochastic differential equations (SDEs) built from a fixed set of predictor functions. The data-driven modeling effort is thus reduced to elemental stochastic models identified as multilayer Stuart–Landau (SL) oscillators.
I. INTRODUCTION
The success of machine learning algorithms generally depends on data representation, and it is often hypothesized that this is because different representations can entangle and hide more or less the different explanatory factors of variations behind the data.^{1} To be reasonably addressed, the problem of finding relevant data-adaptive basis functions for a successful inverse modeling must be considered within a specific domain of applications.^{2} This article is concerned with addressing this problem for time-evolving datasets produced by a dynamical system, either deterministic or stochastic.
Among the numerous techniques useful for the data representation or decomposition of time-evolving datasets, we may distinguish several seminal ones: techniques based on variance decomposition, such as principal component analysis (PCA)^{3} and its probabilistic formulation^{4} or nonlinear extensions;^{5–7} techniques using eigenfunctions of Markov matrices reflecting the (local) geometry and density of the data;^{8–11} and other recent approaches exploiting Koopman operator theory.^{12–15}
These methods, often tied to dimensionality reduction, extract data-adaptive modes that come with reduction coordinates, which, for datasets evolving in time, correspond to time series known as principal components (PCs) and the like. The effective derivation, based on these coordinates, of reduced evolution equations that characterize the long-term dynamics of the underlying dynamical system^{16} is then a central issue.
For some systems, the form of the master equations, prior experience, or some underlying physical intuition may help to determine good reduced systems. In some special cases, their exact form can even be rigorously found by adopting, for instance, the Mori–Zwanzig (MZ) projection approach^{17–22} or techniques rooted in the approximation theory of (stochastic) local invariant manifolds; see Refs. 23, 24, and references therein. In more complex cases where a rigorous derivation of the reduced dynamics is mathematically intractable, or when the master equations are not even known, a data-driven inverse modeling approach is often adopted.
Many methods have been proposed in the literature to address this task, such as those using nonlinear autoregression moving average with exogenous inputs (NARMAX), artificial neural networks (ANNs), stochastic linear inverse models (LIMs), empirical model reduction (EMR), and many others; see Refs. 25–33 and references therein. In a data-driven context, the MZ formalism also provides theoretical guidance for the determination of nonlinear ingredients and other dependences on the past; see Ref. 34. Nevertheless, whether the retained approach uses an a priori set of predictor functions or allows for dictionaries of such functions,^{35} the modeler is often left either with an explanatory deficit or with a selection problem of the appropriate (class of) predictors for a reliable emulation of the dynamics. In the latter case, a certain level of familiarity with the dataset is typically required to envision the right form of the predictors.
This article proposes to overcome such hurdles for inverse modeling by introducing a data-adaptive spectral representation of the original data, aimed at providing, for a broad class of time-evolving datasets, reduction coordinates that can be efficiently modeled within a family of stochastic differential equations (SDEs) constituted from a fixed set of predictor functions; see Secs. VIII and IX. Roughly speaking, the proposed modeling approach operates well for finite multivariate time series for which decays of correlations are sufficiently well-resolved for the amount of available data. It thus covers a broad class of datasets issued from dynamical systems that are mixing.
The approach adopted in this article for the data representation of time-evolving datasets relies on the spectral analysis of integral operators $L$ whose kernels are built from periodic semigroups applied to the data's correlations. The resulting class of operators constitutes one of the two main contributions of this article, and a substantial portion of the article is devoted to the analysis of their spectral properties; see Secs. II–V.
The other main contribution is concerned with the aforementioned class of SDEs, aimed at the modeling of the reduction coordinates obtained as projections of the original dataset onto the eigenfunctions of $L$. These SDEs fall into the class of networks of linearly coupled Stuart–Landau oscillators,^{36} which may be embedded within the class of multilayer stochastic models (MSMs),^{34} thus allowing for the inclusion of memory effects in their formulation; see Sec. VIII.
Note that readers who are more motivated by the applications may skip Secs. II–V on a first reading and focus on Secs. VI–IX; the essential material from the theoretical results of Secs. II–V is recalled as needed in Secs. VI–IX.
Another originality of the framework lies in its harmonic flavor. Indeed, the framework allows for extracting modes naturally ranked per Fourier frequency, with reduction coordinates that are, for mixing dynamical systems, typically narrowband and modulated in amplitude; see Sec. VIII A. The corresponding reduced SDEs are also organized per Fourier frequency, and the modeling of a specific frequency band is thus naturally made available and facilitated by the resulting framework.
The article is organized as follows. In Sec. II, we first recall basic elements of the spectral theory of periodic semigroups on a Banach space. In Sec. III, this spectral theory is applied to analyze the spectral properties of a class of integral operators $L_\varphi$ with periodic kernels built from periodic semigroups acting on a given (one-dimensional) data function ϕ; see Theorem III.1. The case of kernels built from left circular translation groups, important for applications, is analyzed in Sec. IV. Here, one recovers the Fourier spectrum (power and phase) from the spectral elements of $L_\varphi$, for which the eigenfunctions are phase-shifted sinusoids whose phase relates to the phase spectrum of ϕ; see Theorem IV.1. This representation of the Fourier spectrum via the operator $L_\varphi$ allows us to propose in turn an innovative generalization for multidimensional signals in Sec. V. There we show, for multidimensional data, that the eigenvalues of an operator matrix generalization, $L$, can be grouped per Fourier frequency and are actually given, at each frequency, as the singular values of a cross-spectral matrix depending on the data; see Theorem V.1. These eigenvalues obey furthermore a variational principle that allows us to define naturally a multidimensional power spectrum; see Remark V.1–(ii). The eigenmodes, for their part, exhibit a data-adaptive character manifested in their phase, which allows us in turn to define a multidimensional phase spectrum; see (80).
In Sec. VI, we then consider the case for which ϕ is, or a collection of such ϕ's are, obtained from lagged correlations estimated from time series issued from a mixing dynamical system. After discussing a natural periodization to apply the theory (see Sec. VI B), we introduce, in Secs. VI C and VI D, the Hankel and block-Hankel matrices resulting, respectively, from the discretization of $L_\varphi$ in the one-dimensional case and of $L$ in the multidimensional case. First applications of the resulting data-adaptive harmonic (DAH) methodology are then discussed for analytical examples of modulated and traveling waves; see Sec. VII.
In Sec. VIII, we introduce the class of MSMs tailored on stochastic Stuart–Landau oscillators, aimed at the modeling of the aforementioned reduction coordinates, corresponding here to the projection of the dataset onto the DAH modes (DAHMs); see Secs. VIII A and VIII B. These elemental models, the multilayer Stuart–Landau models (MSLMs), are stacked per frequency and are coupled across frequencies only by the same noise realization; see Sec. VIII C. In Sec. IX, we finally show the flexibility of the DAH–MSLM modeling approach through its ability to provide skillful stochastic inverse models for data issued either from the nonlinear chaotic Lorenz 96 model or from a stochastic heat equation driven by a space-time white noise. Section X discusses directions for future research.
II. SPECTRAL THEORY OF PERIODIC SEMIGROUPS
Let us first recall the definition of a strongly continuous semigroup, also known as a C_0-semigroup; see Ref. 37 (p. 36). For an introduction to semigroup theory, we refer to Refs. 38 and 39. More advanced treatments and applications of semigroup theory can be found in, e.g., Refs. 37 and 40–42.
Definition II.1. A family $(T(t))_{t \ge 0}$ of bounded linear operators on a Banach space $W$ is called a strongly continuous (one-parameter) semigroup if
$$T(0) = \mathrm{Id}, \qquad T(t+s) = T(t)\,T(s) \quad \text{for all } t, s \ge 0,$$
and for every $\phi$ in $W$, the orbit
$$t \mapsto T(t)\phi$$
is continuous.
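In the discrete setting used for the numerical experiments later in the article, these axioms can be sketched with circular shifts of a sampled signal (a minimal illustration of ours, not from the article; `T`, `N`, and `x` are hypothetical names, and the shift plays the role of the left circular translation group met in Sec. IV):

```python
import numpy as np

# Discrete stand-in for a translation (semi)group acting on sampled signals:
# [T(t) phi](s) = phi(s + t), with t given as m grid steps out of N per period.
def T(m):
    return lambda x: np.roll(x, -m)

N = 8
x = np.arange(N, dtype=float)

# T(0) = Id and the semigroup law T(t + s) = T(t) T(s)
assert np.array_equal(T(0)(x), x)
assert np.array_equal(T(3)(T(2)(x)), T(5)(x))
```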
The generator $A$ of a strongly continuous semigroup $(T(t))_{t \ge 0}$ is then defined as the operator
$$A : D(A) \subset W \to W,$$
such that
$$A\phi = \lim_{t \downarrow 0} \frac{1}{t}\big(T(t)\phi - \phi\big),$$
for every $\phi$ in the domain
$$D(A) = \Big\{ \phi \in W \,:\, \lim_{t \downarrow 0} \frac{1}{t}\big(T(t)\phi - \phi\big) \text{ exists} \Big\}.$$
Our approach relies on the following important characterization of the spectral properties of periodic semigroups; see, e.g., Ref. 37 (Chap. IV, Sec. 2.25). Recall that a strongly continuous semigroup $(T(t))_{t \ge 0}$ is periodic if there exists $t_0 > 0$ such that $T(t_0) = \mathrm{Id}$. The period τ of the semigroup is then
$$\tau := \inf\{ t > 0 \,:\, T(t) = \mathrm{Id} \}.$$
Periodic semigroups are always groups, with inverses $T(t)^{-1} = T(n\tau - t)$ for $0 \le t \le n\tau$, and the spectrum of their generators always lies on the imaginary axis; more precisely, we have
Theorem II.1. [Ref. 37 (Chap. IV, Lemma 2.25)] Let T(t) be a strongly continuous τ-periodic semigroup on a Banach space $W$, and A its generator; then
$$\sigma(A) \subseteq \frac{2\pi i}{\tau}\,\mathbb{Z},$$
and the resolvent $R(\mu, A)$ ($\mu \in \mathbb{C} \setminus \sigma(A)$) of A is given by
$$R(\mu, A)\phi = \big(1 - e^{-\mu\tau}\big)^{-1} \int_0^{\tau} e^{-\mu s}\, T(s)\phi \,\mathrm{d}s, \qquad \phi \in W.$$
The above representation of the resolvent shows that the resolvent of the generator of a τ-periodic semigroup is a meromorphic function having only poles of order one, with poles μ_n given by
$$\mu_n = \frac{2\pi i n}{\tau}, \qquad n \in \mathbb{Z}.$$
The residue of R(μ,A) at the pole μ_n is given by the bounded linear operator of $W$ [Ref. 37 (p. 267)]
$$P_n \phi = \frac{1}{\tau} \int_0^{\tau} e^{-\mu_n s}\, T(s)\phi \,\mathrm{d}s, \qquad \phi \in W.$$
Hereafter, we denote by $\mathcal{L}(W)$ the space of bounded linear operators of the Banach space $W$.
From Ref. 37 (Chap. IV, Sec. 1.17), it follows furthermore that P_n coincides with Π_n, the Riesz projector associated with the pole μ_n, namely, with
$$\Pi_n = \frac{1}{2\pi i} \int_{\Gamma_c} R(\mu, A) \,\mathrm{d}\mu,$$
where Γ_c denotes a rectifiable closed curve in the complex plane that separates μ_n from the rest of the poles. Furthermore, the range of Π_n, $\mathrm{rg}\,\Pi_n$, satisfies
$$\mathrm{rg}\,\Pi_n = \ker(A - \mu_n \mathrm{Id}),$$
i.e., the algebraic multiplicity is equal to the geometric multiplicity. In particular, this implies that the spectrum $\sigma(A)$ of A consists of eigenvalues only (i.e., no continuous spectrum) and that
$$\sigma(A) = \big\{ \mu_n \,:\, n \in \mathbb{Z},\ P_n \neq 0 \big\}.$$
Recall that for any strongly continuous semigroup T(t) with generator A, for every λ in $\mathbb{C}$, t > 0, and $\phi$ in D(A), the following identity holds [Ref. 37 (Chap. II, Lem. 1.9)]:
$$e^{-\lambda t}\, T(t)\phi - \phi = \int_0^t e^{-\lambda s}\, T(s)\,(A - \lambda)\phi \,\mathrm{d}s,$$
which here due to (8) leads to
Moreover,
With these properties at hand, it is then not difficult to infer the following useful decomposition result for any strongly continuous τ-periodic semigroup and its generator; see Ref. 37 (Chap. IV, Thm. 2.27).
Theorem II.2. Let T(t) be a strongly continuous τ-periodic semigroup on a Banach space $W$, and A its generator; then, for every $\phi$ in $W$,
$$T(t)\phi = \sum_{n \in \mathbb{Z}} e^{\mu_n t}\, P_n \phi, \qquad t \ge 0,$$
and, for every $\phi$ in $D(A)$,
$$A\phi = \sum_{n \in \mathbb{Z}} \mu_n\, P_n \phi.$$
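For the left circular translation group encountered in Sec. IV, the decomposition of Theorem II.2 reduces to the familiar fact that translating a signal multiplies its n-th Fourier coefficient by $e^{\mu_n t}$. A quick numerical check of ours (NumPy's FFT sign convention; grid of N points over one period, so that t = m/N):

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 64, 5                     # grid size and shift in grid steps (tau = 1)
x = rng.standard_normal(N)

# T(t)x: left circular translation by m grid points
Tx = np.roll(x, -m)

# Theorem II.2 for this group: the n-th spectral projection P_n picks out the
# n-th Fourier coefficient, which T(t) multiplies by exp(2*pi*i*n*m/N)
n = np.arange(N)
assert np.allclose(np.fft.fft(Tx), np.fft.fft(x) * np.exp(2j * np.pi * n * m / N))
```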
III. SPECTRAL THEORY OF INTEGRAL OPERATORS WITH PERIODIC SEMIGROUP KERNELS
Let us introduce the one-dimensional torus $\mathbb{T} := \mathbb{R}/(\tau\mathbb{Z})$, which we define by identifying points in $\mathbb{R}$ that differ by $n\tau$ for some n in $\mathbb{Z}$. We denote by $L^2(\mathbb{T})$ the space of complex-valued measurable functions on $\mathbb{R}$ that are τ-periodic and square-integrable with respect to an arbitrary reference interval of length τ, $I = [\alpha, \alpha + \tau]$ for some α in $\mathbb{R}$. We endow $L^2(\mathbb{T})$ with the natural (normalized) inner product
$$\langle f, g \rangle := \frac{1}{\tau} \int_{\mathbb{T}} f(r)\, \overline{g(r)} \,\mathrm{d}r.$$
Here, the integral over $\mathbb{T}$ is understood as the integral with respect to r taken over any interval of length τ. In what follows, we often identify $\mathbb{T}$ with its interval representation I, making the distinction precise when necessary.
Given a τ-periodic C_0-semigroup T(t) acting on a function space $W$ intersecting the set of τ-periodic functions, and a τ-periodic function ϕ in $W$, our goal is to analyze the spectral properties of the following operator:
$$L_\varphi[\psi](s) := \frac{1}{\tau} \int_{\mathbb{T}} [T(r)\varphi](s)\, \psi(r) \,\mathrm{d}r, \qquad \psi \in W.$$
In practice, the space $W$ may differ from $L^2(\mathbb{T})$ (by requiring, for instance, additional regularity on ϕ), but to simplify the presentation, we restrict ourselves to the case
$$W = L^2(\mathbb{T}).$$
The following lemma results directly from the theory of integral operators:
Lemma III.1. Assume ϕ lies in $L^2(\mathbb{T})$ and let T(t) be a τ-periodic C_0-semigroup on $L^2(\mathbb{T})$. If

(A1) $(r, s) \mapsto [T(r)\varphi](s)$ belongs to $L^2(\mathbb{T} \times \mathbb{T})$,

then the operator $L_\varphi$ defined by (15) maps $L^2(\mathbb{T})$ into $L^2(\mathbb{T})$ and is compact. If, furthermore,

(A2) $\overline{[T(r)\varphi](s)} = [T(s)\varphi](r)$,

then the operator $L_\varphi$ is self-adjoint.
Proof. The proof is standard and consists of noting that, under the integrability condition (A1), the operator $L_\varphi$ is a Hilbert–Schmidt operator on $L^2(\mathbb{T})$ and thus compact; see, e.g., Ref. 43 (Thm. 6.12) or Ref. 44 (Chap. 16, Theorem 3). Finally, condition (A2) allows us to write
$$\overline{\langle L_\varphi[\Psi], \Psi \rangle} = \frac{1}{\tau^2} \int_{\mathbb{T}} \int_{\mathbb{T}} \overline{[T(r)\varphi](s)}\, \overline{\Psi(r)}\, \Psi(s) \,\mathrm{d}r \,\mathrm{d}s = \langle L_\varphi[\Psi], \Psi \rangle,$$
after using (A2) and exchanging the roles of r and s, which shows that $\langle L_\varphi[\Psi], \Psi \rangle$ lies in $\mathbb{R}$ for any $\Psi$ in $L^2(\mathbb{T})$, and thus $L_\varphi$ is self-adjoint. ◼
As a consequence of the spectral theorem for self-adjoint compact operators, we have the following theorem, in which either $J = \{1, \ldots, N\}$ with $N = \dim(\mathrm{rg}(L_\varphi))$ if $L_\varphi$ has finite rank, or $J = \mathbb{N}$ otherwise.
Theorem III.1. Let $L_\varphi$ be defined in (15), with T(t) and ϕ satisfying the assumptions of Lemma III.1, including (A1) and (A2). Then, 0 is an eigenvalue of $L_\varphi$ and there exist:

countably many nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or such that $\lambda_n \to 0$ if infinitely many, and

an orthonormal basis $\{E_n\}_{n \in J}$ of $\mathrm{cl}(\mathrm{rg}(L_\varphi))$, the topological closure of the range of $L_\varphi$ in $L^2(\mathbb{T})$,

for which
$$L_\varphi[\Psi] = \sum_{n \in J} \lambda_n\, \langle \Psi, E_n \rangle\, E_n, \qquad \Psi \in L^2(\mathbb{T}).$$
Each λ_n is an eigenvalue and each E_n is an eigenfunction.
Due to the convergence to zero, we usually order the eigenvalues λ_n in decreasing order of absolute value,
$$|\lambda_1| \ge |\lambda_2| \ge \cdots$$
The multiplicity of a given eigenvalue λ is the number of times it is repeated in the list of eigenvalues given above or, equivalently, the dimension of the λ-eigenspace $\ker(L_\varphi - \lambda\,\mathrm{Id})$. The multiplicity of any given nonzero eigenvalue is finite.
So far, we have applied classical results from the theory of integral operators to deal with the spectral properties of $L_\varphi$. We can furthermore derive a key identity satisfied by the eigenfunctions of $L_\varphi$ [see (21)] by exploiting the spectral theory of periodic semigroups recalled in Sec. II and requiring more regularity on ϕ.
For that purpose, we focus here on the spectral properties of $L_\varphi$ in $L^2_{\mathbb{R}}(\mathbb{T})$, i.e., the subspace of $L^2(\mathbb{T})$ constituted by real-valued functions. We then have the following result concerning the spectral elements of $L_\varphi$ in $L^2_{\mathbb{R}}(\mathbb{T})$, for which we make use of the following version of the Fourier transform $\hat{\Psi}$ of a function $\Psi$ in $L^2(\mathbb{T})$:
$$\hat{\Psi}(f) := \frac{1}{\tau} \int_{\mathbb{T}} \Psi(t)\, e^{2\pi i f t} \,\mathrm{d}t.$$
Theorem III.2. Assume ϕ lies in D(A), where A denotes the generator of a τ-periodic C_0-semigroup T(t) on $L^2(\mathbb{T})$. Assume that (A1) and (A2) hold. Assume
(A3) there exists an orthonormal basis $\{e_n\}$ of $L^2(\mathbb{T})$ for which P_n defined in (6) is rank-one, i.e., $P_n f = \langle f, e_n \rangle\, e_n$.
Let us denote by $E_n$ an eigenfunction of $L_\varphi$ in $L^2_{\mathbb{R}}(\mathbb{T})$ associated with an eigenvalue $\lambda_n$; then, for any frequency $f_k = k/\tau$, we have
Proof. Let us consider an eigenfunction E_{n} associated with an eigenvalue λ_{n} of the operator $ L \varphi $ given in (15). Then, the identity (12) of Theorem II.2 gives
which, due to (5), leads, by integration in the r-variable, to
The latter identity gives (21) by applying (19) with $ f = k / \tau $, combined with Assumption (A3) and the definition of the inner product $ \u3008 \xb7 , \xb7 \u3009 $ in (14), recalling that E_{n} is realvalued. ◼
Remark III.1. Under the conditions of Theorem III.1, one has furthermore
with $W = L^2(\mathbb{T})$; see, e.g., Ref. 43 (Prop. 6.9).
We actually have a variational characterization of any eigenvalue of the self-adjoint compact operator $L_\varphi$, by relying on the Courant–Fischer min-max principle; see Ref. 43 (Problem 37) and Ref. 45. However, since the operator $L_\varphi$ is not positive, the Courant–Fischer min-max principle needs to be amended. The positive eigenvalues of $L_\varphi$, listed in decreasing order and denoted by $0 \le \cdots \le \lambda_2^+ \le \lambda_1^+$, thus satisfy the relationships
That is, the max is taken over a linear subspace $V \subset W$ of codimension k, and the min is taken over all such subspaces. Moreover, the minimum is attained on the subspace
where $E_j^+$ denotes the eigenfunction associated with a positive eigenvalue $\lambda_j^+$.
Similarly, for $0 \ge \cdots \ge \lambda_2^- \ge \lambda_1^-$, we have
with the maximum attained for
where $E_j^-$ denotes the eigenfunction associated with a negative eigenvalue $\lambda_j^-$. Such min-max characterizations of the eigenvalues of $L_\varphi$-like operators are made more explicit in Sec. V for the multidimensional case; see Remark V.1–(ii).
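As a sanity check of ours (not from the article), the amended Courant–Fischer principle can be illustrated on a finite symmetric indefinite matrix playing the role of a discretized $L_\varphi$: Rayleigh quotients of unit vectors are pinched between the extreme negative and positive eigenvalues, and the extremes are attained at the corresponding eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
K = rng.standard_normal((n, n))
K = 0.5 * (K + K.T)                   # symmetric, indefinite, like L_phi

eigvals, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order

# Rayleigh quotients of unit vectors lie between lambda_1^- and lambda_1^+ ...
for _ in range(100):
    psi = rng.standard_normal(n)
    psi /= np.linalg.norm(psi)
    q = psi @ K @ psi
    assert eigvals[0] - 1e-10 <= q <= eigvals[-1] + 1e-10
# ... and the extremes are attained at the corresponding eigenvectors
assert np.isclose(eigvecs[:, -1] @ K @ eigvecs[:, -1], eigvals[-1])
assert np.isclose(eigvecs[:, 0] @ K @ eigvecs[:, 0], eigvals[0])
```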
IV. THE CASE OF KERNELS FROM LEFT CIRCULAR TRANSLATION GROUPS
Let g be a τ-periodic continuously differentiable function such that $g'/g \in L^\infty$, and let ϕ be in $L^2(\mathbb{T})$. Let us consider
whose domain is given by
where $AC(\mathbb{T})$ denotes the collection of absolutely continuous functions on $\mathbb{T}$.
Then, an adaptation of the arguments of Ref. 46 (Lem. 3.5) shows that A generates a weighted left translation semigroup S(t) on $L^2(\mathbb{T})$. The corresponding operator $L_\varphi$ is then given by
By the assumptions on g and ϕ, assumption (A1) is satisfied and $L_\varphi$ is compact; see Lemma III.1. If g and ϕ are furthermore real-valued, then assumption (A2) is satisfied and $L_\varphi$ is thus self-adjoint as well; see again Lemma III.1.
An analysis of the resolvent of A reveals that the corresponding operators P_n defined in (6) are each rank-one and that the E_n's are real. Theorem III.2 can thus be applied here when ϕ lies in D(A). Instead of providing a proof of this statement in this rather general setting, we focus on an important special case for applications, namely, the case in which $g \equiv 1$, i.e., the case of the left circular translation group T(t) defined by
$$[T(t)\varphi](s) = \varphi(s + t), \qquad s \in \mathbb{T},\ t \in \mathbb{R}.$$
We have in this case
$$L_\varphi[\psi](s) = \frac{1}{\tau} \int_{\mathbb{T}} \varphi(s + r)\, \psi(r) \,\mathrm{d}r,$$
and the corresponding Fredholm integral equation of the second kind^{47} possesses explicit solutions, described in the next theorem, which actually provide a spectral representation of the Fourier transform in terms of both its power and phase spectra.
Theorem IV.1. Let A be the generator of the left circular group $(T(t))_{t \in \mathbb{R}}$ on $L^2(\mathbb{T})$ defined by (31). Assume ϕ to be a real-valued function that lies in D(A) given by (30) with $g \equiv 1$. Then, the eigenvalues of the operator $L_\varphi$ are real-valued and form a discrete set $\{\lambda_k\}$, where k in $\mathbb{Z}$ labels the Fourier frequency $\omega_k = 2\pi k/\tau$. More precisely, to each frequency $f_k = k/\tau$ corresponds an eigenvalue λ_k such that
$$|\lambda_k| = |\hat{\varphi}(f_k)|.$$
Furthermore, the corresponding eigenfunctions in $L^2_{\mathbb{R}}(\mathbb{T})$ are given by
$$E_k(t) = \sqrt{2}\, \cos(\omega_k t - \theta_k),$$
with
$$\theta_k = \frac{1}{2} \arg \hat{\varphi}(f_k),$$
where $\arg(z)$ denotes the principal value^{48} of the argument of the complex number z.
Remark IV.1. The eigenfunctions are thus phase-shifted sinusoids, for which the phase θ_k relates to the phase spectrum of the data ϕ, while the eigenvalues provide the power spectrum. Thus, for a one-dimensional signal ϕ, one can recover the Fourier spectrum (power and phase) from the spectral elements of $L_\varphi$. This operator representation of the Fourier spectrum via $L_\varphi$ allows us to propose an innovative generalization for multidimensional signals (see Sec. V), which turns out to be particularly useful for the inverse stochastic-dynamic modeling of spatiotemporal scalar fields; see Secs. VIII and IX.
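A minimal numerical sketch of this correspondence (our discretization, not from the article): with τ = 1 and N grid points, the kernel $[T(r)\varphi](s) = \varphi(s + r)$ discretizes into the symmetric matrix $K_{ij} = \varphi((i+j) \bmod N)$, and the common normalization factor between the discrete operator and NumPy's unnormalized FFT drops out of the comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
phi = rng.standard_normal(N)          # one period of a real signal, tau = 1

# Discretized kernel of L_phi for the left circular translation group:
# [T(r) phi](s) = phi(s + r)  ->  K[i, j] = phi[(i + j) % N]
i, j = np.indices((N, N))
K = phi[(i + j) % N]                  # real symmetric (Hankel-type) matrix

eigvals = np.linalg.eigvalsh(K)
power = np.abs(np.fft.fft(phi))       # |phi_hat(f_k)| on the same grid

# Up to sign, the eigenvalues of L_phi reproduce the Fourier amplitudes of phi
assert np.allclose(np.sort(np.abs(eigvals)), np.sort(power))
```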
Proof. As explained earlier, assumptions (A1) and (A2) are satisfied. Let us now check assumption (A3) of Theorem III.2. For that, let us determine the resolvent of the generator A of the left circular group $(T(t))_{t \in \mathbb{R}}$ defined by (31). Note that the operator A is the differentiation operator defined in (29) with $g \equiv 1$.
The resolvent $R(\lambda, A)$ of A exists for all $\lambda \notin \{0, \pm i\omega_1, \pm i\omega_2, \ldots\}$, with $\omega_k = 2\pi k/\tau$, and is obtained by solving the differential equation
$$\lambda u - u' = \psi.$$
The variation-of-constants formula gives
$$u(s) = e^{\lambda(s - t)}\, u(t) - \int_t^{s} e^{\lambda(s - r)}\, \psi(r) \,\mathrm{d}r.$$
Taking $s = t + \tau$ and requiring $u(t + \tau) = u(t)$, we get
$$\big(1 - e^{\lambda\tau}\big)\, u(t) = -\int_t^{t+\tau} e^{\lambda(t + \tau - r)}\, \psi(r) \,\mathrm{d}r.$$
The factor on the left-hand side is invertible if and only if λ does not coincide with some $2\pi i n/\tau$, n in $\mathbb{Z}$. In such a case,
$$u(t) = R(\lambda, A)\psi(t) = \big(e^{\lambda\tau} - 1\big)^{-1} \int_t^{t+\tau} e^{\lambda(t + \tau - r)}\, \psi(r) \,\mathrm{d}r.$$
The right-hand side of this formula maps $L^2(\mathbb{T})$ into the Sobolev space $W^{1,2}(\mathbb{T})$.
Now let p be arbitrary in $\mathbb{Z}$, and let us take $\psi(r) := e^{i\omega_p r}$. Then
$$R(\lambda, A)\psi(t) = \frac{e^{i\omega_p t}}{\lambda - i\omega_p},$$
since $\omega_p \tau \in 2\pi\mathbb{Z}$ by definition. Thus, by expanding now any arbitrary $\Psi$ in $L^2(\mathbb{T})$ into its Fourier series, we get
$$R(\lambda, A)\Psi(t) = \sum_{p \in \mathbb{Z}} \hat{\Psi}(f_p)\, \frac{e^{i\omega_p t}}{\lambda - i\omega_p},$$
with $f_p = p/\tau$.
Therefore, the residue P_k at the pole $\mu_k = i\omega_k$ [see (6)] is given, for any k in $\mathbb{Z}$, by
$$P_k \Psi = \hat{\Psi}(f_k)\, e^{i\omega_k t} = \langle \Psi, e_k \rangle\, e_k,$$
and the assumption (A3) of Theorem III.2 is satisfied with $e_k(t) = e^{i\omega_k t}$.
The identity (21) becomes in this case
$$\lambda_k\, \hat{E}_k(f_k) = \hat{\varphi}(f_k)\, \overline{\hat{E}_k(f_k)}.$$
Taking the modulus on both sides of (43) gives
$$|\lambda_k| = |\hat{\varphi}(f_k)|.$$
Let us seek solutions to (43) under the form
$$E_k(t) = \sqrt{2}\, \cos(2\pi f_k t - \theta_k).$$
Now, by using the Fourier time-shift property, i.e., that the Fourier transform of $x(t - \theta_k)$ is given by $\hat{x}(f)\, e^{2\pi i f \theta_k}$, we obtain from the Fourier transform of $\cos(2\pi f_k t)$ the expression of $\hat{E}_k(f_k)$, and that of its conjugate $\overline{\hat{E}_k(f_k)}$.
With this expression substituted into (43), one finds
$$\lambda_k = \hat{\varphi}(f_k)\, e^{-2 i \theta_k},$$
and since λ_k is real, necessarily
$$\theta_k = \frac{1}{2} \arg \hat{\varphi}(f_k), \qquad \mathrm{mod}\ \pi.$$
Finally, the factor $\sqrt{2}$ comes as a normalization constant of E_k, for the norm subordinated to the inner product (14). ◼
Remark IV.2.
 The negative part of the spectrum of $L_\varphi$ comes with eigenmodes of sine type, namely, if $\lambda_k < 0$, then
$$E_k(t) = \cos\!\Big(\omega_k t - \frac{1}{2}\arg\hat{\varphi}(f_k) - \frac{\pi}{2}\Big) = \sin\!\Big(\omega_k t - \frac{1}{2}\arg\hat{\varphi}(f_k)\Big).$$
 Sometimes, instead of (35), we will make use of the following equivalent expression of the phase spectrum in terms of the eigenmodes of $L_\varphi$, namely,
$$\arg\hat{\varphi}(f_k) = \arg\big(\lambda_k\, \hat{E}_k(f_k)\big) - \arg\big(\overline{\hat{E}_k(f_k)}\big), \qquad \mathrm{mod}\ (2\pi).$$
The latter identity is a direct consequence of Eq. (43).
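The phase part of the spectrum can be checked numerically as well. For a single harmonic $\varphi(t) = \cos(2\pi t + \alpha)$, the eigenmode attached to the largest eigenvalue of the discretized $L_\varphi$ is a phase-shifted sinusoid carrying α/2, so that doubling its phase recovers α regardless of the sign ambiguity of eigenvectors (a sketch of ours; NumPy's FFT conventions, which may differ from the article's by a conjugation):

```python
import numpy as np

N, alpha = 64, 0.7
t = np.arange(N) / N                       # one period, tau = 1
phi = np.cos(2 * np.pi * t + alpha)        # arg of fft(phi)[1] equals alpha

i, j = np.indices((N, N))
K = phi[(i + j) % N]                       # discretized L_phi (symmetric)

eigvals, eigvecs = np.linalg.eigh(K)       # ascending order
E = eigvecs[:, -1]                         # eigenmode of the largest eigenvalue

# The top eigenvalue matches the Fourier amplitude at f_1 ...
assert np.isclose(eigvals[-1], np.abs(np.fft.fft(phi)[1]))
# ... and the eigenmode is a phase-shifted sinusoid whose doubled phase
# recovers alpha (a sign flip of E shifts the angle by pi, which drops out)
recovered = (2 * np.angle(np.fft.fft(E)[1])) % (2 * np.pi)
assert np.isclose(recovered, alpha)
```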
V. MULTIDIMENSIONAL GENERALIZATION
In this section, we consider a block operator matrix generalization of what precedes. We assume that we are given a collection $\{\varphi_{i,j}(t),\ 1 \le i, j \le d\}$ of τ-periodic functions in $L^2(\mathbb{T})$. Examples of such functions for the spectral analysis of, e.g., spatiotemporal fields, are discussed in Sec. VI D.
For the moment, let us consider the following operator matrix $L$ defined for each u in $(L^2(\mathbb{T}))^d$ by
$$L u := \big(M_1[u], \ldots, M_d[u]\big),$$
with
$$M_p[u] := \sum_{q=1}^{d} L_{\varphi_{p,q}}[u_q], \qquad u = (u_1, \ldots, u_d),$$
where each $L_{\varphi_{i,j}}$ is given by (32). We then have the following lemma.
Lemma V.1. The operator $L$ given by (52) is self-adjoint and compact on $X := (L^2(\mathbb{T}))^d$ endowed with the inner product given, for any $u = (\Psi_1, \ldots, \Psi_d)$ and $v = (v_1, \ldots, v_d)$ in $X$, by
$$\langle u, v \rangle_X := \sum_{p=1}^{d} \langle \Psi_p, v_p \rangle.$$
Proof. Since for each p, q in $\{1, \ldots, d\}$ the function $\varphi_{p,q}$ lies in $L^2(\mathbb{T})$ by assumption, each operator $L_{\varphi_{p,q}}$ given by (32) is compact (see Sec. IV), and thus each operator $M_p$ is a (real-valued) compact operator on $L^2(\mathbb{T})$, as a finite sum of compact operators. The operator $L$ is then compact on $X$ as a finite Cartesian product of compact operators, for $X$ endowed with the product topology induced by the inner product (54).
The self-adjointness of $L$ results from its definition. Indeed,
which, after manipulations similar to those in (16), leads to
◼
We are now in a position to formulate the main result, which extends Theorem IV.1 to the multidimensional case. For that purpose, we introduce, for each frequency $f_k = k/\tau$, the d × d symmetrized cross-spectral matrix $S(f_k)$ whose entries are given by
$$S_{p,q}(f_k) := \frac{1}{2}\big(\hat{\varphi}_{p,q}(f_k) + \hat{\varphi}_{q,p}(f_k)\big), \qquad 1 \le p, q \le d.$$
The next theorem shows that the spectrum of the operator $L$ relates naturally to the cross-spectral matrix $S(f)$. At every frequency line, the singular values of $S(f)$ actually give d pairs of eigenvalues of $L$, and as f is varied, one recovers the full spectrum of $L$.^{49}
Theorem V.1. Let A be the generator of the left circular group $(T(t))_{t \in \mathbb{R}}$ on $L^2(\mathbb{T})$ defined by (31). Assume that each $\varphi_{p,q}$ is a real-valued function lying in D(A) given by (30) with $g \equiv 1$. Then, the eigenvalues of the operator $L$ defined in (52) are real-valued and form a discrete set $\{\lambda_n\}$ that possesses the following characterization.

For each frequency $f_k \neq 0$, one can extract 2d eigenvalues (counting multiplicities) from the set $\{\lambda_n\}$, d positive and d negative ones, that satisfy the following property.

 (S) For each pair of negative-positive eigenvalues, denoted by $(\lambda_-^p(f_k), \lambda_+^p(f_k))$, there exists a singular value $\sigma_p(f_k)$ of $S(f_k)$ such that
$$\lambda_+^p(f_k) = -\lambda_-^p(f_k) = \sigma_p(f_k), \qquad 1 \le p \le d.$$

Furthermore, the eigenfunctions W_k of $L$ in $(L^2_{\mathbb{R}}(\mathbb{T}))^d$ possess the following representation:
$$W_k = \big(E_1^k, \ldots, E_d^k\big),$$
with
$$E_p^k(t) = B_p^k\, \cos\big(\omega_k t - \theta_p^k\big).$$
Finally, introducing the notations
the amplitudes $B_p^k$ and angles $\theta_p^k$ satisfy, for each frequency f_k and each p in $\{1, \ldots, d\}$, the relation
Proof. Step 1. We start with the derivation of an analogue of Eq. (43) in the multidimensional case considered here. The resulting extension of (43) is key to the derivation of the main properties of the spectral elements of the block operator matrix $L$ defined by (52) and (53).
For that purpose, let us consider W_k in $(L^2_{\mathbb{R}}(\mathbb{T}))^d$ under the form given in (59), with the $E_p^k$ to be determined. Due to the operator matrix form of $L$ given by (52) and (53), the eigenvalue problem $L W_n = \lambda W_n$ can be equivalently rewritten as
$$\sum_{q=1}^{d} L_{\varphi_{p,q}}\big[E_q^n\big] = \lambda\, E_p^n, \qquad 1 \le p \le d.$$
Let p be fixed in $\{1, \ldots, d\}$ and k be in $\mathbb{Z}$. By reproducing the arguments of the proof of Theorem III.2, we have
Here again, P_k denotes the residue of the resolvent $R(\xi, A)$ of A at the pole $\mu_k = 2\pi i k/\tau$ and is thus given by
$$P_k = \lim_{\xi \to \mu_k} (\xi - \mu_k)\, R(\xi, A).$$
Now let us recall the expression of the resolvent $R(\lambda, A)$ derived in the proof of Theorem IV.1. Namely, $R(\xi, A)$ exists for all $\xi \notin \{0, \pm i\omega_1, \pm i\omega_2, \ldots\}$, with $\omega_k = 2\pi k/\tau$, and is given for ψ in $L^2(\mathbb{T})$ by
$$R(\xi, A)\psi(t) = \big(e^{\xi\tau} - 1\big)^{-1} \int_t^{t+\tau} e^{\xi(t + \tau - r)}\, \psi(r) \,\mathrm{d}r.$$
Therefore, P_k is given by
$$P_k \psi(t) = \frac{1}{\tau} \int_t^{t+\tau} e^{i\omega_k(t - r)}\, \psi(r) \,\mathrm{d}r = \langle \psi, e_k \rangle\, e_k(t), \qquad e_k(t) = e^{i\omega_k t},$$
and thus from (64) we obtain, since $E_q^n(t)$ is a real-valued function [W_n lying in $(L^2_{\mathbb{R}}(\mathbb{T}))^d$],
i.e., the aforementioned extension of (43) to the multidimensional case.
[see also (46)], we conclude that
We thus consider k = n hereafter. Replacing $\lambda_+^p$ by λ in Eq. (62) for a moment, it is easy to see that Eq. (62) is a direct consequence of Eq. (67). The relation (62) is indeed obtained by taking the real and imaginary parts of Eq. (67) [in which $E_p^k$ is given by (60)] and forming the complex exponentials by summation. To complete the proof, we are thus left with the proof of property (S), which we deal with next.
Step 2. Note first that the symmetrized cross-spectral matrix $S(f_k)$, whose entries are defined in (57), is complex and symmetric. The Autonne–Takagi factorization [see Ref. 50 (Chap. VIII, Théorème 92) and Ref. 51] applied to $S(f_k)$ thus ensures the existence of a unitary matrix U_k such that
$$S(f_k) = U_k\, \Sigma_k\, U_k^{T},$$
where $\Sigma_k = \mathrm{diag}\{\sigma_1(f_k), \ldots, \sigma_d(f_k)\}$ is a diagonal matrix with real and nonnegative entries. These entries are the singular values of $S(f)$ at the frequency f = f_k.
In particular, denoting by $U_k^{*}$ the conjugate transpose of $U_k$, we have
$$S(f_k)\, \overline{S(f_k)} = U_k\, \Sigma_k^{2}\, U_k^{*},$$
and therefore the matrix $S(f_k)\overline{S(f_k)}$ is positive semidefinite, and its eigenvalues are given by the $\sigma_j(f_k)^2$, for $1 \le j \le d$.
which leads to
and thus $\lambda^2$ is an eigenvalue of $S(f_k)\overline{S(f_k)}$.
To summarize, λ is (up to a sign) the square root of an eigenvalue of $S(f_k)\overline{S(f_k)}$, and is thus also a singular value $\sigma_j(f_k)$ of $S(f_k)$, up to a sign. Now, since k = n in Eq. (67) (see Step 1), there are, for each frequency f_k, d relations (67), one for each p. Considering the positive and negative eigenvalues that coexist at each frequency $f_k \neq 0$, we thus infer (58). ◼
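Property (S) can be illustrated numerically (a sketch of ours, not from the article): we take a collection that is already symmetric, $\varphi_{p,q} = \varphi_{q,p}$, so that the symmetrization in (57) is immaterial, discretize each block of $L$ as in the one-dimensional case, and compare the moduli of the eigenvalues of the resulting block matrix with the singular values of the cross-spectral matrices $S(f_k)$ computed via the FFT.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 32, 3
# Symmetric collection phi[p, q] = phi[q, p] of real signals (one period each)
phi = rng.standard_normal((d, d, N))
phi = 0.5 * (phi + phi.transpose(1, 0, 2))

# Block operator matrix: block (p, q) discretizes L_{phi_{p,q}}
i, j = np.indices((N, N))
B = np.block([[phi[p, q][(i + j) % N] for q in range(d)] for p in range(d)])

eigvals = np.linalg.eigvalsh(B)            # B is real symmetric

# Cross-spectral matrices S(f_k), with entries phi_hat_{p,q}(f_k)
S = np.fft.fft(phi, axis=-1)               # shape (d, d, N)
sing = np.concatenate([np.linalg.svd(S[:, :, k], compute_uv=False)
                       for k in range(N)])

# Property (S): frequency by frequency, the eigenvalues of L come in pairs
# +/- sigma_p(f_k); globally, |eigenvalues| and singular values coincide
assert np.allclose(np.sort(np.abs(eigvals)), np.sort(sing))
```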
Remark V.1.

It should be noted that a singular value decomposition (SVD) of $S(f_k)$ is not automatically its Autonne–Takagi factorization. The proof of the Autonne–Takagi factorization provides the details of how an SVD of a complex symmetric matrix can be turned into a symmetric SVD; see the proof of Ref. 52 (Theorem 3.1).
(ii) It is also worth mentioning that, due to a general result dealing with the characterization of singular values of complex symmetric matrices,^{53} the singular values of $S(f_k)$ possess a variational characterization in terms of the quadratic form
$$Q(x) := \mathrm{Re}\,\big(x^{T} S(f_k)\, x\big), \qquad x \in \mathbb{C}^{d}. \tag{74}$$
In particular, by application of Ref. 53 (Corollary 5) and Theorem V.1, we obtain for $0 \le p \le d-1$ the following variational characterization by maximizing over real subspaces, namely:
$$\max_{V\,:\, \dim_{\mathbb{R}}(V) = p+1}\;\; \min_{x \in V,\, \|x\| = 1} \mathrm{Re}\,\big(x^{T} S(f_k)\, x\big) = \lambda^{+}_{1+p}(f_k), \tag{75}$$
and
$$\max_{V\,:\, \mathrm{codim}_{\mathbb{R}}(V) = p}\;\; \min_{x \in V,\, \|x\| = 1} \mathrm{Re}\,\big(x^{T} S(f_k)\, x\big) = \lambda^{-}_{1+p}(f_k). \tag{76}$$
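To make the remark on the Autonne–Takagi factorization concrete, the following sketch (ours, using NumPy; the function name `takagi` is our own, and the construction assumes the singular values of $S$ are distinct) turns an ordinary SVD of a complex symmetric matrix into a symmetric SVD $S = U \Sigma U^{T}$:

```python
import numpy as np

def takagi(S):
    """Symmetric SVD (Autonne-Takagi factorization) S = U @ diag(sigma) @ U.T
    of a complex symmetric matrix S, assuming its singular values are distinct."""
    V, sigma, Wh = np.linalg.svd(S)      # ordinary SVD: S = V @ diag(sigma) @ Wh
    W = Wh.conj().T
    # For symmetric S with distinct singular values, D = V^H @ conj(W) is a
    # diagonal unitary matrix; absorbing a square root of D into V symmetrizes the SVD.
    D = np.diag(np.diag(V.conj().T @ W.conj()))
    U = V @ np.sqrt(D)
    return U, sigma

# Random complex symmetric test matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
S = A + A.T                               # complex symmetric: S == S.T
U, sigma = takagi(S)
err = np.linalg.norm(U @ np.diag(sigma) @ U.T - S)
unit = np.linalg.norm(U.conj().T @ U - np.eye(4))
```

For degenerate singular values, the diagonal structure of $D$ fails blockwise and a more careful construction is needed; the cited proofs handle the general case.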
Theorem V.1 and Remark V.1–(ii) ensure that the eigenvalues of the operator $L$ (at the frequency $f = f_k$) relate naturally to an energy via the quadratic form $Q$ given in (74). Contrary to the one-dimensional case [see (35)], and as (62) shows, the $\theta_p^k$'s do not relate trivially to the multidimensional phase information contained in the symmetrized cross-spectral matrix $S(f_k)$.
Nevertheless, the relation (67) allows us to extract useful phase information from $S(f_k)$ by noting that (provided that $B_p^k \neq 0$)
with
and $\gamma_{pq}^{k} := \widehat{E_q^k}(f_k)\,\overline{\big(\widehat{E_p^k}(f_k)\big)^{-1}}$. Obviously, a similar relation holds for $\lambda^{-}_{p}(f_k)$ (involving the appropriate $E_p^k$).
By taking the argument on both sides of (77), we obtain (up to a multiple of $2\pi$)
We denote by $\eta^{+}_{p}(f_k)$ the LHS of (79).
On the other hand, (68) gives $\widehat{E_p^k}(f_k) = \exp(-i\theta_p^k)/2$, which leads to
We thus understand that $\eta^{+}_{p}(f_k)$ gives (twice) the phase shift, $\theta_p^k$, contained in the eigenfunctions [see (60)], and also provides a measure of the perturbation brought by the "cross-functions" $\varphi_{p,q}$ ($p \neq q$), as encapsulated in $R_p(f_k)$, to the phase spectrum associated with $\varphi_{p,p}$ when considered in isolation [i.e., $\arg(\widehat{\varphi_{p,p}}(f_k))$].
This perturbation depends on the eigenfunctions of the operator $L$, as the expression of $R_p(f_k)$ given in (78) shows. When $R_p(f_k) = 0$, one recovers the phase spectrum of $\varphi_{p,p}$ as in the unidimensional case; see Remark IV.2–(iii). In what follows, we will look at the quantities $\lambda^{+}_{p}(f_k)$ and $\eta^{+}_{p}(f_k)$ [resp. $\lambda^{-}_{p}(f_k)$ and $\eta^{-}_{p}(f_k)$] as $f_k$ is varied; the former providing, in the multidimensional case, a notion of power spectrum (as associated with the form $Q$) and the latter a notion of (perturbed) phase spectrum.
Section VI below focuses on the important case for applications in which the functions $ \varphi p , q $ are built from correlation functions. The concepts of DAH power and DAH phase spectra are further detailed in Sec. VI, addressing their numerical computation, while the determination and visualization of the underlying modes are illustrated by analytical examples in Sec. VII.
VI. APPLICATION TO TIME SERIES: DATA-ADAPTIVE HARMONIC (DAH) SPECTRA
To simplify the presentation, we consider throughout this section a finitedimensional Euclidean space $H$.
A. Correlation functions
In the case of a deterministic flow $(S_t)_{t \ge 0}$ acting on $H$ and having an ergodic invariant measure μ supported by an attractor $A$,^{54,55} let us recall that the correlation function $\rho_{f,g}(t)$ associated with two (sufficiently smooth) observables $f, g : H \to \mathbb{R}$ is given by
In the case of an ergodic stochastic system, the correlation function is given by
where $(P_t)_{t \ge 0}$ denotes the Markov semigroup possessing μ as its (unique) ergodic invariant measure.^{56} In each case, the function $\rho_{f,g}(t)$ is called a correlation function subordinated to the observables f and g. Depending on these observables, various higher-order moment statistics can be considered.
In the deterministic setting, if μ is a Sinaï–Ruelle–Bowen measure,^{54,55,57} $\rho_{f,g}(t)$ corresponds to the more familiar cross-correlation coefficient at lag t (for the observables f and g), which then takes the equivalent form^{58}
for almost every x in the basin of attraction of $A$.^{55}
For an ergodic stochastic system, it takes the form
$\mathbb{P}$-almost surely and for every x in $H$, when, e.g., the Markov semigroup $P_t$ associated with the Markov process $X_t^x$ is strong Feller and irreducible; see Ref. 56.
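In practice, such correlation functions are estimated by time averages along a single long trajectory. The sketch below (our toy example, not from the text) does so for an ergodic AR(1) process, whose stationary autocorrelation at lag k is known in closed form to be $a^{|k|}$:

```python
import numpy as np

# AR(1) process X_{n+1} = a X_n + eps_n: a simple ergodic surrogate whose
# stationary autocorrelation at lag k equals a**k.
a, n = 0.8, 200_000
rng = np.random.default_rng(3)
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = a * x[i] + eps[i]

def acf(series, lag):
    """Time-average estimate of the lag-`lag` autocorrelation (f = g = identity)."""
    series = series - series.mean()
    return np.dot(series[:-lag], series[lag:]) / np.dot(series, series)

est = acf(x, 5)     # time-average estimate of rho(5)
exact = a ** 5      # closed-form value 0.32768
```

The estimate converges to the exact value as the trajectory length grows, which is precisely the ergodic-average statement above.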
B. Periodized correlation functions and associated semigroup kernels
We place ourselves in one of the working frameworks described in Sec. VI A above, i.e., either stochastic or deterministic, and apply the theory of Sec. IV to $\varphi(t) = \rho_{f,g}(t)$ for a choice of two observables, f and g. Note that if one chooses f = g, one recovers the familiar notion of lagged autocorrelation of, e.g., the univariate time series $t \mapsto f(S_t x)$. For this reason, we will sometimes, when $f \neq g$, refer to $\rho_{f,g}(t)$ as a cross-correlation function, to emphasize the difference with the notion of autocorrelation.
Obviously, the main assumption on which the theory in Secs. III and IV relies is the periodicity of $\varphi(t)$, which is not guaranteed in practice, except of course for periodic trajectories. However, motivated by physical applications, we are interested in mixing systems, i.e., systems for which a decay of correlations takes place. For stochastic (resp. deterministic) mixing systems, see, e.g., Ref. 56 (resp. Refs. 55 and 58) and references therein.
For such systems, one should, in practice, choose τ large enough so that the decay of correlations has been sufficiently well resolved. To apply the theory, one then needs to periodize $\varphi(t)$, and the resulting periodization $\phi$ plays a role in the definition of the operator $L_\phi$. To understand this role, let us consider an interval $I = [\alpha, \alpha+\tau]$ with α in $\mathbb{R}$. We define
The left circular shift T(t) defined in (31), applied to $\phi$, thus gives the following expression in terms of $\varphi$, for r lying in the interval I:
In other words, the kernel of the integral operator $L_\phi$ is modified compared with its expression given in (32).
As a choice of the base interval I representing $\mathbb{T}$, a reasonable one consists of taking $\alpha = -\tau/2$, i.e.,
This choice as a representation of $\mathbb{T}$ relies in part on the symmetry of $\varphi(t) = \rho_{f,g}(t)$, i.e., $\rho_{f,g}(t) = \rho_{f,g}(-t)$; thus, $\varphi(-\tau/2) = \varphi(\tau/2)$. Other choices, like α = 0, result, due to the assumed decay of correlations, in "jumps" occurring at the junction points in the periodization $\phi$ given in (85).
These jumps do not, however, constitute an obstruction to applying the theory. Theorem IV.1 indeed allows for a discontinuity of this type for the $\phi$ used in the definition of $L_\phi$. The reason is tied to the characterization of D(A) given in (30) (with $g \equiv 1$), which shows that a function $\phi$ with a discontinuity is allowed, as long as the $\phi$ defined in (85) is absolutely continuous and $\phi'$ lies in $L^2(\mathbb{T})$.
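The effect of the choice of α on the junction of the periodization can be illustrated with a toy decaying correlation function (our example): the symmetric base interval produces a continuous gluing, while α = 0 produces a jump of order ρ(0).

```python
import numpy as np

tau = 10.0
# Toy even, decaying correlation function (a stand-in for rho_{f,g}).
rho = lambda t: np.exp(-np.abs(t)) * np.cos(2.0 * t)

# Jump at the junction point of the tau-periodization, for two base intervals:
# alpha = 0 glues rho(tau) back to rho(0); the decay of correlations makes
# rho(tau) nearly zero while rho(0) = 1, hence an O(1) jump.
jump_alpha0 = abs(rho(0.0) - rho(tau))
# alpha = -tau/2 glues rho(tau/2) back to rho(-tau/2); by evenness this vanishes.
jump_symmetric = abs(rho(-tau / 2) - rho(tau / 2))
```

The symmetric choice thus yields a continuous periodization, in line with the discussion above.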
In Sec. VI D, we consider the choice of the base interval I given by (87) in the case of a multidimensional DAH spectrum obtained from a collection of correlation functions. We turn next to the case $I = [0, \tau]$, i.e., α = 0, for which the framework adopted in this article allows us to recover instructive results from the literature on Hankel matrices. In that respect, we first address the discretization of the operator $L_\phi$.
C. DAH spectrum from a single correlation function: Discretization
Given a uniform partition $t_0 = 0 < \cdots < t_{N-1} < \tau = t_N$, we consider the family of functions
We simply denote by ρ the correlation function $\rho_{f,g}$. By the definition (86) of the left circular shift acting on the periodization $\phi$ in (85) (with α = 0), we have from (32)
Now let us approximate ρ by a piecewise constant function f, as follows:
Take $t_j = j\,\delta\tau$ with $\delta\tau = \tau/N$; then
Denoting $\rho(l\,\delta\tau)$ by $\rho_l$, the operator $L_\phi$ can thus be approximated by the following $N \times N$ Hankel matrix:
Unsurprisingly, H is a left circulant matrix, which we sometimes denote as
or, in other words, the rows of H are obtained by successive shifts to the left by one position, starting from the first row $r = (\rho_0, \rho_1, \ldots, \rho_{N-1})$.
Let T be the right circulant matrix that has the same first row as the left circulant matrix H. Then, T is Toeplitz and $H = PT$ for the permutation matrix $P = 1 \oplus J_{N-1}$, where $J_{N-1}$ is the reversal $(N-1)\times(N-1)$ matrix obtained by flipping the identity matrix $I_{N-1}$ from left to right. Hence, H and T have identical singular values. Moreover, circulant matrices and real anticirculant matrices (which are also real symmetric) are normal matrices, so they can be unitarily diagonalized and their singular values are the moduli of their eigenvalues. Therefore, the eigenvalues of H and T have identical moduli. Finally, since real anticirculant matrices are real symmetric, H has real eigenvalues.
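These structural relations can be verified directly. The following sketch (ours) builds H and T from a common first row, forms $P = 1 \oplus J_{N-1}$, and checks $H = PT$ together with the symmetry of H:

```python
import numpy as np

N = 8
rng = np.random.default_rng(1)
r = rng.standard_normal(N)                    # first row (rho_0, ..., rho_{N-1})

# Left circulant: each row is the previous one shifted left by one position.
H = np.array([np.roll(r, -i) for i in range(N)])
# Right circulant (circulant Toeplitz): each row shifted right by one position.
T = np.array([np.roll(r, i) for i in range(N)])

# Permutation P = 1 (+) J_{N-1}, with J_{N-1} the left-right flipped identity.
P = np.zeros((N, N))
P[0, 0] = 1.0
P[1:, 1:] = np.fliplr(np.eye(N - 1))

eig_H = np.linalg.eigvals(H)                  # real, since H is real symmetric
sv_H = np.linalg.svd(H, compute_uv=False)
sv_T = np.linalg.svd(T, compute_uv=False)
```

One can check that `H == P @ T`, that H equals its transpose, and that the eigenvalue moduli of H coincide with its singular values, which in turn coincide with those of T.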
The spectral theory of circulant matrices is well established, although somewhat scattered in the literature; see, e.g., Refs. 59–61. The inherent periodicity of circulant matrices relates them to Fourier analysis and group theory. In contrast to the aforementioned standard circulant Toeplitz matrix T arising in Fourier analysis [see, e.g., Ref. 62 (Def. 5.12)], the left-circulant matrix H is not diagonalized by the unitary Fourier matrix [Ref. 61 (Theorem 3.5)] and, as Proposition VI.1 summarizes, its eigenvectors actually depend on the data and, more exactly, relate to the phase spectrum, as a counterpart of Theorem IV.1 and a consequence of formula (43).
Let us introduce the Fourier frequencies $\omega_k = 2\pi k/N$ and the periodogram of $\{\rho_j\}$, namely,
with $i^2 = -1$.
We have then the following:
Proposition VI.1. The eigenvalues $\{\lambda_k\}_{k \in \{0, \ldots, N-1\}}$ of the Hankel matrix H given by (92) can be arranged in terms of Fourier frequencies so that
where $[x]$ denotes the largest integer less than or equal to x.
Furthermore, for each pair $(\lambda_k, \lambda_{N-k})$, thus corresponding to a Fourier frequency $\omega_k = 2\pi k/N$, the associated pair of eigenvectors $(v_k, v_{N-k})$ satisfies
and where $\widehat{u}(\omega)$ denotes the discrete Fourier transform, evaluated at the frequency ω, of a vector u.
Proof. The identity (96) results from (33) of Theorem IV.1. For a direct proof of (95) and (96), we refer to Ref. 60 (Lemma 1). The identity (97) is a consequence of (51) in the discrete setting. ◼
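Numerically, the arrangement of the eigenvalues per Fourier frequency can be observed by comparing the spectrum of H with the discrete Fourier transform of its first row (our sketch; we only check the eigenvalue moduli and the zero-frequency eigenvalue, which is the row sum):

```python
import numpy as np

N = 9
rng = np.random.default_rng(2)
rho = rng.standard_normal(N)

# Left circulant Hankel matrix built from the correlation sequence rho.
H = np.array([np.roll(rho, -i) for i in range(N)])

eig = np.linalg.eigvalsh(H)       # real eigenvalues (H is real symmetric)
dft = np.fft.fft(rho)             # \hat{rho}(omega_k), omega_k = 2*pi*k/N

# For a real sequence rho, |dft[k]| = |dft[N-k]|: each Fourier frequency
# omega_k (1 <= k <= [(N-1)/2]) contributes a pair of eigenvalues of equal
# modulus, while omega_0 contributes the single eigenvalue sum(rho).
```

The eigenvalue moduli coincide with the DFT moduli of the first row, and the constant vector is an eigenvector of H with eigenvalue $\sum_j \rho_j$.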
Remark VI.1. Note that to the best of the authors' knowledge, the identity (97) has not been previously derived in the literature concerned with Hankel matrices.
In what follows, the eigenvalues of the matrix H, when ranked from the lowest to the highest resolved frequency, form what we call the DAH power spectrum. The DAH phase spectrum is formed by the phases of the corresponding eigenvectors, also ranked per frequency. Due to Proposition VI.1, these concepts coincide, for one-dimensional signals, with the standard notions of power and phase spectra [see also Remark IV.2–(ii)], whereas their multidimensional generalization following Sec. V, and its application to a collection of cross-correlation functions, leads to more subtle but still useful interpretations. This generalization is discussed in further detail in Sec. VI D.
Remark VI.2. Let us introduce the standard cyclic permutation matrix
Then, H is naturally associated with the Krylov space $\mathcal{K}(K, \rho)$ generated by the matrix K given by (98) and the vector $\rho = (\rho_0, \ldots, \rho_{N-1})^{\mathrm{tr}}$, namely,
Indeed, the jth column of H is exactly given by $K^{j-1}\rho$.
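This characterization is immediate to check (our sketch; the shift convention for K below is chosen to match the left circulant construction of H and may differ from (98) by a transpose):

```python
import numpy as np

N = 6
rng = np.random.default_rng(4)
rho = rng.standard_normal(N)

# Cyclic permutation matrix K with K[i, j] = 1 iff j = i+1 (mod N),
# so that K @ rho shifts the entries of rho up (i.e., left) by one position.
K = np.roll(np.eye(N), -1, axis=0)

# Left circulant H whose i-th row is rho shifted left i times.
H = np.array([np.roll(rho, -i) for i in range(N)])

# The j-th column of H equals K^{j-1} @ rho (0-based: column j is K^j @ rho).
cols = np.column_stack([np.linalg.matrix_power(K, j) @ rho for j in range(N)])
```

The columns of H thus span the Krylov space generated by K and ρ.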
Remark VI.3.

(i) Proposition VI.1 remains valid when base intervals I other than $[0, \tau]$ are used. The modification essentially consists of replacing accordingly the summation bounds in the definition of the periodogram $I_N$ given by (94).

(ii) Let $t \mapsto X_t^x$ denote a stochastic process emanating from x in $H := \mathbb{R}^q\ (q > 1)$ and generated by a system of SDEs. Given two observables $f, g : H \to \mathbb{R}$, and recalling that ρ denotes the correlation function $\rho_{f,g}$ given by (82), we infer from formula (96) that the $\lambda_k$'s provide the (one-sided) cross-power spectrum, $\Gamma_{f,g}$, of the real-valued stochastic processes $t \mapsto f(X_t^x)$ and $t \mapsto g(X_t^x)$.
The DAH eigenvectors furthermore provide the (one-sided) cross-phase spectrum via formula (97), in the sense that
$$\frac{\Gamma_{f,g}(\omega_k)}{N\,|\lambda_j|} = \exp\Big( \arg\big(\lambda_j\,\widehat{v_j}(\omega_k)\big) - \arg\big(\overline{\widehat{v_j}(\omega_k)}\big) \Big), \quad \text{with } j \in \{k, N-k\},\ 1 \le k \le \Big[\frac{N-1}{2}\Big].$$
D. Multidimensional DAH spectra from a collection of cross-correlation functions
Given a base interval $I = [\alpha, \alpha+\tau]$, one considers the periodization $\phi_{f,g} = \phi$ given in (85) for $\varphi = \rho_{f,g}$, with $\rho_{f,g}$ denoting a correlation function given either by (83) or by (84), depending on the context.
The operator $L_\varphi$ defined in (32) then becomes
where $\Psi$ is an arbitrary function in $L^2(\mathbb{T})$.
By choosing $I = [-\tau/2, \tau/2]$ (i.e., $\alpha = -\tau/2$), a uniform partition
and elementary functions such as in (88), each operator $L_{\phi_{i,j}}$ is approximated by the following $(2M-1)\times(2M-1)$ Hankel matrix:
The operator $L$ defined in (52)–(53) [in which each operator $L_{\varphi_{p,q}}$ therein is replaced by $L_{\phi_{p,q}}$ given in (101) above] is then approximated by the following block-Hankel matrix $C$, formed of $d^2$ blocks, $C(i,j)$, each of size $(2M-1)\times(2M-1)$ and given as follows:
for (i, j) in $\{1, \ldots, d\}^2$.
Hereafter, we use $M' = 2M-1$ for concision, possibly reindexing from 1 to $M'$ the string $\{-M+1, \ldots, M-1\}$. By construction, and in agreement with Lemma V.1, the matrix $C$ is real symmetric and not circulant, with each block being Hankel.
Since the matrix $C$ is the discretization of the operator $L$ [defined in (52) and (53), and in which each operator $L_{\varphi_{p,q}}$ therein is replaced by $L_{\phi_{p,q}}$ given in (101)], Theorem V.1 is enlightening concerning the spectrum of $C$. In particular, the eigenvalues of $C$ relate, according to (58), to the singular values of the symmetrized cross-spectral matrix $S_\delta(f_k)$, whose entries are still given by (57), except that $\widehat{\varphi_{p,q}}(f_k)$ is obtained here as the discrete Fourier transform (evaluated at the frequency $f_k$) of the correlation function $\rho_{p,q}(t)$, after its periodization following Sec. VI B.
As noted earlier for the operator $L$, and applied here to the matrix $C$, Theorem V.1 and Remark V.1–(ii) ensure that the eigenvalues of $C$ relate naturally to the quadratic form $Q_\delta$ given by
via the variational principle given in (75)–(76), in which $S_\delta(f_k)$ replaces $S(f_k)$. The eigenvalues of $C$ thus account for an energy distribution per frequency, as measured through the quadratic form $Q_\delta$.
Grouping per frequency the absolute values of the eigenvalues of the matrix $C$, and displaying those eigenvalues from the lowest to the highest resolved frequency, we obtain what we call the multidimensional DAH power spectrum, omitting the adjective "multidimensional" when this is clear from the context.
Due to Theorem V.1, the eigenvalues of the grand matrix $C$ come in pairs organized per (nonzero) frequency according to (58). However, formula (58) is not used in practice; instead, the eigenvalues of the $dM' \times dM'$ matrix $C$ are computed directly and listed as the set of eigenvalues $(\lambda_j)_{1 \le j \le dM'}$. To group these eigenvalues per Fourier frequency and to determine the DAH power spectrum, we then proceed as follows:

1. Compute the eigenvalues of $C$, listed as $(\lambda_j)_{1 \le j \le dM'}$, and the associated eigenvectors $W_j$.

2. Theorem V.1 ensures that for each j in $\{1, \ldots, dM'\}$, the eigenvector $W_j$ possesses the representation (59), where each $E_p^j$ is an $M'$-dimensional row vector that is explicitly associated with a Fourier frequency f according to (60), which translates in its discrete form as
$$E_p^j(s) = B_p^j \cos(f s + \theta_p^j), \qquad 1 \le s \le M',$$
and where p is in $\{1, \ldots, d\}$.
Note that f in the above formula takes values in the following set of Fourier frequencies:
$$f_\ell = \frac{2\pi(\ell-1)}{M'-1}, \qquad \ell = 1, \ldots, \frac{M'+1}{2}.$$

3. Compute the discrete Fourier transform $\widehat{E}_p^j$ of $E_p^j$ and group the j's for which the average power spectral density, $\overline{\Gamma}_j := d^{-1}\sum_{p=1}^{d} |\widehat{E}_p^j|^2$, exhibits a dominant peak at a frequency $f_\ell$.

4. From step 3 above, we have thus formed, for each Fourier frequency $f_\ell$, the following subset of indices in $\{1, \ldots, dM'\}$:
$$J(f_\ell) := \big\{\, j \ : \ \overline{\Gamma}_j \text{ is peaked at } f = f_\ell \,\big\}.$$

5. Form, for each $\ell$ ranging from 1 to $(M'+1)/2$, the discrete set
$$P_\ell := \big\{\, |\lambda_j| \ : \ j \in J(f_\ell) \,\big\}.$$
The collection of the $P_\ell$, for $\ell$ ranging from 1 to $(M'+1)/2$, constitutes the DAH power spectrum.
Note that, due to (58), $J(f_\ell)$ is composed of 2d indices when $\ell \neq 1$ and of d indices when $\ell = 1$, so that the $J(f_\ell)$'s form a partition of the total set of indices, $\{1, \ldots, dM'\}$.
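The grouping procedure can be sketched as follows (a minimal implementation of ours on a synthetic real symmetric block-Hankel matrix; the reshape of each eigenvector into d snippets assumes the channel-stacked layout of (59), and the toy correlation sequences merely stand in for those of (101)):

```python
import numpy as np

d, Mp = 2, 15                                 # d channels, snippet length M' (odd)

# Synthetic symmetric block-Hankel matrix C (dM' x dM'): block (i, j) is the
# Hankel matrix of a toy decaying cross-correlation sequence of length 2M'-1.
C = np.zeros((d * Mp, d * Mp))
lags = np.arange(-(Mp - 1), Mp)
for i in range(d):
    for j in range(i, d):
        seq = np.exp(-0.1 * np.abs(lags)) * np.cos(0.5 * lags + 0.3 * (i + j))
        block = np.array([[seq[l + m] for m in range(Mp)] for l in range(Mp)])
        C[i * Mp:(i + 1) * Mp, j * Mp:(j + 1) * Mp] = block
        C[j * Mp:(j + 1) * Mp, i * Mp:(i + 1) * Mp] = block.T

lam, W = np.linalg.eigh(C)                    # step 1: eigenpairs of C

# Steps 2-3: cut each eigenvector into d snippets of length M', average their
# power spectral densities, and locate the dominant Fourier frequency.
n_freq = (Mp + 1) // 2
J = {l: [] for l in range(n_freq)}            # step 4: index sets J(f_l)
for jdx in range(d * Mp):
    snippets = W[:, jdx].reshape(d, Mp)
    psd = np.mean(np.abs(np.fft.rfft(snippets, axis=1)) ** 2, axis=0)
    J[int(np.argmax(psd[:n_freq]))].append(jdx)

# Step 5: DAH power spectrum, one discrete set P_l of |eigenvalues| per frequency.
P = {l: sorted(abs(lam[j]) for j in J[l]) for l in range(n_freq)}
```

By construction, the index sets partition $\{1, \ldots, dM'\}$; for actual correlation data, each nonzero frequency collects 2d indices, as stated above.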
In applications, the eigenvectors of $C$ are called the DAH modes (DAHMs), and the $E_p^j$ are called DAHM snippets.
The multidimensional DAH phase spectrum is obtained by grouping per frequency the $\eta_p^{+}(f_k)$ [resp. $\eta_p^{-}(f_k)$], as discussed at the end of Sec. V. Note that, in practice, the $\eta_p^{+}(f_k)$ [resp. $\eta_p^{-}(f_k)$] are calculated by evaluating the LHS of (79).
More precisely, we form
where $\widehat{E}_k^j$ and $\overline{\widehat{E}_k^j}$ denote the discrete Fourier transform of the DAHM snippet, $E_k^j$, and its conjugate, respectively. The DAH phase spectrum is then obtained as the collection of the following discrete sets:
as $\ell$ varies from 1 to $(M'+1)/2$.
Finally, given a d-dimensional time series $X(t)$ from which the matrix $C$ is formed, we describe below another natural byproduct that can be computed from the spectral elements of $C$, namely, the DAH coefficients (DAHCs), obtained by projection of $X(t)$ onto the orthogonal set formed by the DAHMs.
More precisely, a DAHC is a time series, $\xi_j(t)$, obtained according to the formula
As shown in Theorem V.1, to each frequency $f \neq 0$ correspond d pairs of DAH eigenvalues; see (58). With each such pair, one can associate a pair of DAHCs that is, as explained in Sec. VIII A for the continuous-time setting, constituted of time series that are narrowband about the frequency f. As we will see in Sec. VIII, this property, shared by the DAHCs, plays a key role in the inverse modeling approach proposed in this article. The fact that the DAHCs constitute a key ingredient for modeling relies also on their use in natural reconstruction formulas allowing for a reconstruction of the full dataset, or of a part of it, possibly over a targeted frequency band; see the Appendix.
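A DAHC of this kind can be sketched as a sliding-window projection of the multichannel series onto a DAHM (our reading of the projection formula; the function `dahc` below is a hypothetical stand-in, not the article's exact expression):

```python
import numpy as np

def dahc(X, W, Mp):
    """Sketch of a DAH coefficient: project the d-channel series X (shape (d, T))
    onto a dM'-dimensional DAHM W by sliding an M'-long window along time.
    The channel-stacked layout of W follows the representation (59)."""
    d, T = X.shape
    snippets = W.reshape(d, Mp)               # one snippet E_p per channel
    return np.array([np.sum(X[:, t:t + Mp] * snippets)
                     for t in range(T - Mp + 1)])

# Toy usage: a 3-channel signal projected onto a (random, unit-norm) mode.
rng = np.random.default_rng(6)
d, Mp, T = 3, 11, 200
X = rng.standard_normal((d, T))
W = rng.standard_normal(d * Mp)
W /= np.linalg.norm(W)
xi = dahc(X, W, Mp)
```

With an actual DAHM pair at a frequency f, the two resulting coefficient series are narrowband about f, as discussed in Sec. VIII A.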
The modeling aspects exploiting DAHCs will be discussed later. For the moment, we turn to basic examples aimed at providing first intuitions about DAH power and phase spectra, as well as about the related DAHMs and DAHCs.
VII. FIRST APPLICATIONS: DAH SPECTRA, DAHMs, AND DAHCs OF TRAVELING WAVES
We consider the following one-dimensional time-periodic scalar field, given for x in $[0, L]$ and t > 0 by
with a wavenumber k(x) depending on the spatial location. Given a uniform discretization of $[0, L]$ by $N_x$ points with mesh size δx, we choose hereafter k(x) to be given according to
Here, p denotes the integer for which the quotient $x/(\alpha\,\delta x)$ is within roundoff error of that integer. The parameter $\omega_c$ is the Fourier frequency given by
with $K' = 2K-1$ and $1 \le l \le M'$. For the experiments reported below, we set $L = 5\pi$, $N_x = 42$, $K = 15$, $l = 3$, and $m = 19$. For simplicity, a time step $\delta t = 1$ is chosen to discretize the time variable t in (113). The resulting patterns are shown in Fig. 1(a). As a benchmark, a traveling wave with k(x) replaced by its mean value $\bar{k}$ is shown in Fig. 1(b). The effect of k(x) is thus to distort a spatial cosine wave, $\cos(\bar{k}x - \omega_c t)$, by introducing local oscillations occurring at scales smaller than $1/\bar{k}$.
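A field of this type can be generated as follows (our sketch; the piecewise-constant wavenumber profile below is an illustrative stand-in for the profile defined above, not its exact coefficients):

```python
import numpy as np

L, Nx, Nt = 5 * np.pi, 42, 3000
omega_c = 2 * np.pi * 0.071                   # characteristic frequency f_c ~ 0.071
x = np.arange(Nx) * L / Nx
t = np.arange(Nt)                             # time step delta_t = 1

# Piecewise-constant wavenumber profile: alternates about a mean value k_bar,
# so as to distort the pure cosine wave locally (stand-in for the text's k(x)).
k_bar = 2.0
k = k_bar + 0.5 * np.where((x // (L / 6)).astype(int) % 2 == 0, 1.0, -1.0)

# Modulated traveling wave and its pure-wave benchmark [cf. Figs. 1(a) and 1(b)].
q_mod = np.cos(k[:, None] * x[:, None] - omega_c * t[None, :])
q_pure = np.cos(k_bar * x[:, None] - omega_c * t[None, :])
```

Plotting `q_mod` against `q_pure` reproduces the qualitative contrast discussed next: some channels retain traveling-wave behavior while others exhibit standing-like patterns.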
Compared to the case of a pure traveling wave shown in Fig. 1(b), some channels in Fig. 1(a) still collectively exhibit traveling-wave patterns (between, e.g., $x \approx 14$ and $x = 5\pi$), while others now exhibit standing-like wave patterns, as can be observed between $x \approx 4$ and $x \approx 7$.
The corresponding DAH power spectrum and DAH phase spectrum are shown in Figs. 2(a) and 2(b), respectively. They were computed by diagonalizing the matrix $C$ given in (104) and following the procedure described in Sec. VI D, in which the cross-correlations are estimated from $q(x_j, t_n)$ obtained from (113), with $x_j = (j-1)L/N_x$, $1 \le j \le N_x$, and $t_n = n$, for $0 \le n \le 3\times 10^3$. The corresponding DAHMs and DAHCs are shown in Fig. 4.
A. DAH power spectrum, DAHMs, and DAHCs
We first comment on the DAH power spectrum and the DAHMs. The main noticeable feature of the DAH power spectrum shown in Fig. 2(a) [resp. Fig. 3(a)] is the presence of a peak located at the (temporal) frequency close to $f_c = \omega_c/(2\pi)$ (due to our discretization), the characteristic frequency of the spatiotemporal field q(x, t) given by (113) [resp. (113) with k(x) replaced by $\bar{k}$]. Recalling the variational characterization of the eigenvalues of $C$, one thus has here an illustration of the relevance of the energy $Q_\delta$ given by (105) for indicating the presence of a dominant oscillation, the latter being expressed by the presence of a dominant pair of eigenvalues [i.e., with the highest max-min for $Q_\delta$; see Remark V.1–(ii)] at that frequency.
This dominant pair of eigenvalues comes with an energetic oscillatory pair of DAHCs, as shown in the panel located at the intersection of the bottom row and third column of Fig. 4 (resp. Fig. 5) for $f = 0.071 \approx \omega_c/(2\pi)$; compare with the DAHCs of the third column for other frequencies.
Visualization of DAHMs: The eigenvectors of the grand matrix $C$ (i.e., the DAHMs) are visualized by exploiting the representation (59). More precisely, given a frequency f of interest and a DAHM, $W_j$, associated with this frequency, we extract d DAHM snippets, $E_p^j$, each of length $M'$, and we form the array in which each row, for $1 \le p \le d$, is given by $E_p^j(s)$ as s varies from 1 to $M'$; the index p refers to channels and s to (embedding) time. The resulting "spatiotemporal" array, with d rows and $M'$ columns, thus gives a natural way to visualize a $dM'$-dimensional DAHM. Such a visualization of DAHMs is adopted for the case of the modulated and pure traveling waves considered here, as shown in the first and second columns of Figs. 4 and 5. This representation of DAHMs is not limited to this example and is adopted throughout this article.
By adopting this representation, the resulting DAHM pattern provides information on, e.g., any out-of-phase relationship or strong collinearity that may exist between the different channels composing the original dataset, at a given frequency. Note that, due to the explicit form of the DAHM snippets given in (106), this information is encapsulated in the $\theta_p^j$'s, and each DAHM visualized this way thus provides quantitative information about the phase shifts between channels, as associated with a DAH eigenvalue.
For the modulated-wave case, the mixture of traveling/standing-like wave patterns, as well as the local small-scale oscillations, are outstandingly well captured, at each frequency $f \neq 0$, by the pair of DAHMs associated with the pair of dominant eigenvalues; see the first and second columns of Fig. 4. We emphasize indeed that this mixture of patterns, present in the field q(x, t) shown in Fig. 1(a), is not only found for the (few) mode(s) associated with the DAHCs of largest variance(s), but rather is distributed across a broad band of Fourier frequencies f.
Noticeably, these patterns scale with the frequency: at each frequency f, the pair of dominant DAHMs exhibits the same patterns, repeating with a $1/f$ temporal periodicity; see the first and second columns of Fig. 4. The same elemental spatial pattern is thus repeated, for a pair of (dominant) DAHMs, across the frequencies.
The same feature is observed in the case of the pure traveling wave shown in Fig. 1(b). Here again, the (dominant) DAHMs capture the dominant pattern displayed by the field q(x, t) across a broad range of frequencies; see the first and second columns of Fig. 5.
Such a scaling property (in time) of the principal pattern contained in the original dataset, as revealed by the dominant pairs of DAHMs across the frequencies, is in sharp contrast with the other modes that would be obtained by techniques relying on a decomposition of the variance.
Such methods applied to q(x, t) would obviously identify another type of leading modes that would capture most of the variance, without, however, displaying a scaled version of the dominant pattern across the frequencies, in particular for the modes that capture a small fraction of the variance. The third column of Figs. 4 and 5 shows, to the contrary, that even for pairs of DAHMs associated with DAHCs of small amplitude, these modes exhibit the dominant pattern, repeated at the associated frequency f; see the cases f = 0.024 and f = 0.048 in the first and second columns of Figs. 4 or 5.
B. DAH phase spectrum
We now comment on the DAH phase spectra shown in Figs. 2(b) and 3(b). In both cases, the most striking feature of these phase spectra is the distinction appearing in their shape when comparing the frequency band $0 < f < f_c$ with $f > f_c$. For $f > f_c$, one can indeed observe that all the lines converge towards a single point as one approaches the Nyquist frequency $f_N$.
Relying on the discussion conducted at the end of Sec. V, this convergence corresponds to a convergence of the $\eta_p^{+}(f)$ towards the same value, thus expressing a form of phase synchronization between the channels of a DAHM as $f \to f_N$; recall (80) and (60).
One could think that such a phase synchronization revealed by the DAH phase spectrum is limited to the case of pure or modulated traveling waves, but, actually, several experiments conducted on other spatiotemporal fields issued from various dynamical systems (not shown) indicate that this feature occurs for a broad set of deterministic dynamics, including chaotic ones; see also Sec. IX A 2. Roughly speaking, the poor resolution that comes with the Nyquist frequency explains why the differences between the phases across the channels, subtle by nature, must vanish as $f \to f_N$. Herein lies, however, perhaps the most useful aspect of the DAH phase spectrum. As we will see in Sec. IX B 2, a dataset issued from a stochastic system does not share this convergence property of the phases across the channels of a DAHM as $f \to f_N$. The DAH phase spectrum thus appears to be a potentially useful diagnostic tool to distinguish between deterministic and stochastic spatiotemporal dynamics.
VIII. INVERSE STOCHASTIC MULTILAYER STUART-LANDAU MODELS
A. Preliminaries
Let $X(t) = (X_1(t), \ldots, X_d(t))$ be a d-dimensional time series from which the operator $L$ defined in (52)–(53) is built from the cross-correlation functions $\rho_{i,j}(t)$, after (possible) periodization as described in Sec. VI B. We adopt the notations of Theorem V.1 and form the following time-continuous version of the DAHC associated with an eigenfunction $W_k$, which, in the continuous-time setting adopted here, gives
i.e., the continuous counterpart of (112).
If X_{p} lies in D(A) given in (30) (with g = 1), then
see e.g., Ref. 39 (Chap. II, Lemma 1.3).
In particular,
As for (49), one can prove, due to (58), that to a pair of DAH eigenvalues corresponds a pair of DAHMs whose phases differ by $\pi/2$; i.e., in each DAHM pair, the modes are shifted by one fourth of the period.
As a consequence, the DAHC, $\tilde{\xi}_k$, associated with the mode in phase quadrature with $W_k$ is given by
Forming the complex time series $z_k(t) = \xi_k(t) + i\,\tilde{\xi}_k(t)$, where $i^2 = -1$, we obtain
An integration by parts gives
that is
supplemented by the initial condition
If $X_p$ is not in the domain D(A) [such as for an $X(t)$ issued from a mixing dynamical system] and the base interval I for $\mathbb{T}$ is taken to be $[-\tau/2, \tau/2]$, then boundary terms appear and we are left (formally) with
An integration of (123) gives
and thus
The boundary terms introduce exogenous forcing from the time series components themselves. Since the convolution term in the RHS of (125) involves a cosine function oscillating at the frequency $\omega_k$, the frequencies $g \neq \omega_k$ contained in the time series $X_p(t+\tau/2) - X_p(t-\tau/2)$ are filtered out, and the resulting DAHC, $\xi_k(t)$, is narrowband about the frequency $\omega_k$. This property is shared by all the DAHCs associated with the frequency $\omega_k$. The resulting DAHC pair $(\xi_k(t), \tilde{\xi}_k(t))$ thus exhibits narrowband oscillations (about the frequency $\omega_k$) that are superimposed on the pure oscillation $\mathrm{Re}(z_k(0)e^{-i\omega_k t})$.
Although the RHS of (125) provides an exact representation of a DAHC (in the case of an infinite time series), its use in practice is limited, since it requires knowledge of the original dataset at future time instants, which is incompatible with, e.g., prediction purposes. Furthermore, we desire a model of the $\xi_k(t)$'s that avoids an explicit dependence on the data, i.e., on the $X_p$'s appearing in (125).
Since the oscillatory and modulated behavior of time series that are nearly in phase quadrature and concentrated about one dominant frequency may be reasonably emulated by Stuart–Landau (SL) oscillators,^{63} we propose closures of Eq. (123) built from SL oscillators. By closure, we mean that we want to mimic the effects of the term
in Eq. (123) by means of equations involving only the $z_k(t)$-variables, plus stochastic ingredients to be determined. In Sec. VIII B below, we propose a solution to this problem, exploiting both the properties of DAH spectra (as summarized by Theorem V.1) and ideas borrowed from the theory of multilayer stochastic models (MSMs) used for the inverse modeling of time-evolving datasets.^{34}
B. Multilayer Stuart-Landau models
We turn now to the use of DAHMs and DAHCs for deriving inverse stochastic-dynamic models. The purpose is to show that these objects allow for a reduction of the data-driven modeling effort to the learning of elemental MSMs^{34} stacked per frequency. These elemental models fall into the class of networks of linearly coupled Stuart-Landau oscillators^{36} that may include memory terms as described below; see also Ref. 64. In a certain sense, the goal is to show that, given a sequence of (possibly partial) observations issued from a dynamical model, the DAHCs allow for recasting these observations so that they can be modeled within a universal parametric family of simple stochastic models, provided that, as mentioned earlier, the decay of correlations is sufficiently well resolved.^{65} We refer to Sec. IX for examples supporting this statement, and describe hereafter the aforementioned elemental models.
Let $(x_j, y_j)$ be a pair of DAHCs such as shown, for instance, in the third column of Fig. 9 or 12. Except at the zero frequency, and as motivated in Sec. VIII A, DAHCs are typically oscillatory time series that are narrowband about a dominant frequency $f$ and whose oscillations are modulated in time. Such a pair of time series can reasonably be expected to be modeled via nonlinear oscillating systems near a Hopf bifurcation, to capture the frequency $f$, and subject to noise, to capture the amplitude modulations; see Ref. 63. We assume that we have a collection of $d$ such pairs, each associated with the same frequency $f$ and exhibiting modulations occurring at different time locations.
If the signal is made of a concatenation of $d$ different time series, as for the L96 flow, such a pair has a simple interpretation: it provides the manifestation, in the time domain, of the frequency $f$ contained in the system's flow, as captured by the corresponding pair of DAHMs.
For a narrowband pair of DAHCs, $(x_j(t), y_j(t))$, given at a frequency $f \neq 0$, we thus envision its modeling by some stochastic perturbation of a Stuart-Landau model, namely, by
where $z(t) = x_j(t) + i y_j(t)$, and $\mu$, $\gamma$, and $\beta$ are real parameters to be estimated. The system is here driven by some "reddish" noise, $\epsilon_t = (\epsilon_j^x, \epsilon_j^y)$, to be estimated, also, from the time history of $z(t)$. The issue is that the pairs, when modeled this way, would fail to reproduce the phase coherences across the channels contained in the signal, such as in the case of wave propagation, for instance. The estimation and simulation of $\epsilon_t$ thus requires special care, due to the presence of typically nonzero cross correlations among the channels.
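As a rough illustration of the kind of behavior Eq. (126) is meant to capture, the sketch below integrates a stochastically perturbed Stuart-Landau oscillator by Euler-Maruyama. All parameter values, and the choice of an isotropic complex white noise in place of the "reddish" noise $\epsilon_t$, are hypothetical illustrations, not taken from the text.

```python
import numpy as np

def simulate_sl(mu, gamma, beta, sigma, dt=1e-2, n_steps=20000, seed=0):
    """Euler-Maruyama integration of a stochastically perturbed
    Stuart-Landau oscillator,
        dz = [(mu + i*gamma) z - (1 + i*beta) |z|^2 z] dt + sigma dW,
    with dW a complex Wiener increment (independent real and imaginary
    parts, each of variance dt)."""
    rng = np.random.default_rng(seed)
    z = np.empty(n_steps, dtype=complex)
    z[0] = 0.1 + 0.0j
    sq = np.sqrt(dt)
    for n in range(n_steps - 1):
        drift = (mu + 1j * gamma) * z[n] - (1 + 1j * beta) * abs(z[n]) ** 2 * z[n]
        dW = rng.normal(0.0, sq) + 1j * rng.normal(0.0, sq)
        z[n + 1] = z[n] + drift * dt + sigma * dW
    return z

# Above the Hopf bifurcation (mu > 0), the deterministic amplitude settles
# near sqrt(mu); weak noise then modulates the oscillation in amplitude,
# mimicking the modulated, narrowband character of a DAHC pair.
z = simulate_sl(mu=0.5, gamma=2.0, beta=0.3, sigma=0.05)
mean_amp = np.abs(z[5000:]).mean()
```

The real and imaginary parts of $z$ then play the roles of $x_j(t)$ and $y_j(t)$: two modulated time series in near phase quadrature about the frequency $\gamma/2\pi$.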
The MSM framework^{34} allows us to address this issue, while keeping the advantages of the modeling framework provided by Eq. (126). As we will see in applications (see Sec. IX), it provides stochastic-dynamic inverse models able to respect the phase coherence of the collective behavior of the DAHC pairs $(x_j(t), y_j(t))$ at a given frequency $f \neq 0$. In that respect, one proposes to model each such DAHC pair by the following type of MSM, which in the case of two layers used to model the main noise residual, $\epsilon_t$, can be written as the following system of SDEs:
Such a model is called a multilayer stochastic Stuart-Landau model (MSLM). In (MSLM), the set $J_d(f)$ denotes the subset of indices $i$ in $J(f)$ [see (108)] that correspond to $d$ distinct pairs of DAH eigenvalues associated with the frequency $f \neq 0$. Recall that the existence of such pairs when $f \neq 0$ is ensured by Theorem V.1 and, in particular, (58).
The $B_k^i$'s, with $k$ in $\{1, 2\}$ and $i$ in $J_d(f)$, form $2d$ independent Brownian motions. At a given frequency $f$, the $d$ pairs are linearly coupled, as indicated by the summation terms appearing in the $x_j$- and $y_j$-equations. As depicted in the schematic of Fig. 6, the main and last layers (here the $x_j$-, $y_j$-, and $\zeta_j^{x/y}$-equations) are the only layers in which these pairs are explicitly coupled. Without these coupling terms, we are left with a collection of $d$ uncoupled normal forms of a Hopf bifurcation perturbed by a "reddish" noise, which are the real-domain analogues of the aforementioned Stuart-Landau model (126). The linear coupling at the main level aims here to enhance coherence between the stochastic nonlinear oscillations.
The noise residual $(\epsilon_j^x, \epsilon_j^y)$ in the main layer of (MSLM) is modeled in the intermediate layer by means of linear dependences involving only $(\epsilon_j^x, \epsilon_j^y)$ on one hand, and the $j$th pair $(x_j, y_j)$ on the other. The noise residual $(\zeta_j^x, \zeta_j^y)$ of that intermediate layer is then modeled, still via linear dependences, that now include this noise itself, the noise residual of the previous layer, and the DAHC pair $(x_j, y_j)$. The terms in the summations appearing in the $\zeta_j^x$- and $\zeta_j^y$-equations of the last layer involve the other pairs, as mentioned earlier. The presence of such terms aims to take into account, at the noise level, the cross-correlations between the $d$ DAHC pairs corresponding to a same frequency. In practice, the corresponding coefficients $Q_{lk}^j$ ($l, k \in \{1, 2\}$, $j \in \{1, \ldots, d\}$) are estimated from the vector of size $2d$ formed by concatenating the last-level residuals of each pair.
Note that for the frequency f = 0, and as mentioned in Sec. VI D, there are exactly d DAHMs that are not paired, and they are modeled by a linear MSM, following Ref. 34; see also Ref. 66 [Eq. (9)].
An MSLM is not limited to the two-layer case shown earlier for the modeling of the main noise residual $(\epsilon_j^x, \epsilon_j^y)$. Following Ref. 34, extra layers may be added, with intermediate layers that only involve a linear dependence on the $j$th DAHC pair $(x_j, y_j)$ and on the corresponding noise residuals from the previous layers. In other words, the extra layers must respect the overall structure of (MSLM), with coupling between the other pairs (associated with the same frequency $f$) only present in the main and last layers. The addition of an extra layer would thus correspond to the addition of another intermediate plane in the schematic of Fig. 6, with only vertical connections to the rest of the structure.
The procedure of adding layers is controlled by a stopping criterion described in Ref. 34 (Appendix A), which thus allows, in principle, for a different number of layers per frequency. In general, the layers with their surrogate variables, here $(\epsilon_j^x, \epsilon_j^y)$ and $(\zeta_j^x, \zeta_j^y)$, allow for a learning of the temporal correlations contained in the main noise residual $\epsilon_t$, as well as of its dependences on the history of $(x_j(t), y_j(t))$, often required for an appropriate modeling of $\epsilon_t$; see Ref. 34 for a general discussion of this topic. Typically, the more layers, the more complex are these correlations and memory dependences; see Ref. 34 (Proposition 3.3), Ref. 34 (Sec. 7), and Ref. 66 for examples. Sections IX A 3 and IX B 3 also illustrate this point.
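The layering logic can be sketched in a few lines. The version below is a deliberately simplified, linear-only caricature of the MSM procedure of Ref. 34: the actual procedure uses the structured regressions of (MSLM) and the stopping criterion mentioned above, whereas here each layer is a plain least-squares fit, and all names and the toy series are hypothetical.

```python
import numpy as np

def fit_msm_layers(x, n_extra_layers=2):
    """Simplified, linear-only sketch of MSM layering: the main layer
    regresses the increments of x on x itself; each extra layer regresses
    the increments of the previous residual on x and on all residuals
    computed so far, mimicking the cascade structure of (MSLM)."""
    preds = [x]                 # predictors available to every layer
    y = np.diff(x)              # increments explained at the main layer
    coeffs = []
    for _ in range(n_extra_layers + 1):
        P = np.column_stack([p[:len(y)] for p in preds])
        c, *_ = np.linalg.lstsq(P, y, rcond=None)
        r = y - P @ c           # residual time series of this layer
        coeffs.append(c)
        preds.append(r)         # the next layer also sees this residual
        y = np.diff(r)          # ... and explains its increments
    return coeffs, r

# toy red-noise (AR(1)-like) series standing in for a DAHC component
rng = np.random.default_rng(0)
eps = rng.normal(size=3000)
x = np.empty(3000)
x[0] = 0.0
for n in range(1, 3000):
    x[n] = 0.95 * x[n - 1] + eps[n]
coeffs, last_res = fit_msm_layers(x)
```

In the actual MSLM, the last-level residuals of all $d$ pairs at a frequency $f$ would additionally be concatenated to estimate the cross-covariance coefficients $Q_{lk}^j$.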
C. Coupling across the frequencies and practical considerations
As emphasized in (MSLM), the coefficients to be learned depend on the frequency $f$. An MSLM seeks to model the DAHC pairs $(x_j, y_j)_{j \in J(f)}$ associated with the frequency $f$, as extracted from the original (or preprocessed^{67}) dataset by application of the DAH decomposition. Once this extraction is computed, an MSLM can be learned in parallel for each frequency of interest, by application of, e.g., simple successive regressions. In practice, linear constraints must be incorporated into these regressions, such as $\sigma_j(f) \leq 0$ to ensure stability, as well as antisymmetry constraints to estimate the coefficients $\beta_j(f)$ and $\alpha_j(f)$.
Once all the resulting (few) MSLM coefficients have been estimated, the parallelization advantages of the MSLMs across the frequencies carry over to their simulation: no extra coupling across the frequencies is needed other than running the MSLMs with the same noise realization for each frequency. The simulation procedure can thus also be parallelized at the convenience of the modeler, to obtain the collective behavior of the simulated DAHCs. These simulated DAHCs can then be multiplied by the corresponding DAHMs, as described in the Appendix, to obtain various possible frequency-band models of the original dataset.
Of course, one condition for the success of the resulting modeling approach is the ability of the DAH decomposition to extract modulated DAHC pairs that come nearly in phase quadrature and that are narrowband in the frequency domain, an MSLM being naturally dedicated to the modeling of such time series. The trade-off between a good resolution of the decay of temporal correlations and the temporal length of the available dataset seems to play a key role in that respect. Various conducted experiments (not shown) have indeed corroborated the idea that the DAHC pairs are expected to be narrowband and to come nearly in phase quadrature when the decay of temporal correlations has been sufficiently resolved.
In what follows, the combination of (i) the extraction of DAHCs by the DAH method of Sec. VI D and (ii) their modeling according to the MSLM approach described earlier is named DAH-MSLM modeling. We present in Sec. IX below applications to examples for which this criterion is met, and a successful DAH-MSLM modeling is obtained in each case.
IX. APPLICATIONS TO INVERSE STOCHASTIC-DYNAMIC MODELING
We apply hereafter the DAH-MSLM inverse modeling approach to two complex flows, obtained, respectively, from a chaotic ODE system and a stochastic partial differential equation (SPDE) with a linear drift.
A. Applications to the Lorenz 96 model
1. The dynamical model
In this section, we apply the DAH approach to the so-called Lorenz 96 (L96) model.^{68} The evolution equations of the L96 model, in its simplified version,^{69} are written as
supplemented with the boundary conditions
In the absence of forcing and dissipation, the sum of the squares of the variables (the energy of the system) is conserved. This model has become popular as a toy model of the atmosphere for testing various methods in weather and climate sciences, as it displays a large set of chaotic behaviors with various mixing properties as $K$ and $F_1$ are varied.
2. DAH spectra, DAHMs, and DAHCs
The model is set to a standard chaotic regime corresponding to $K = 40$ and $F_1 = 6$, and is integrated using a 4th-order Runge-Kutta scheme with a time step $\delta t = 5 \times 10^{-3}$. Excluding transient behavior and subsampling every 10th time step, we store $N = 11\,900$ time instances of the $K$-dimensional vector $X(t) = (X_1(t), \ldots, X_K(t))$.
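For reference, a minimal integration along these lines can be written as follows. The cyclic right-hand side is the standard L96 form; the initial condition, number of steps, and length of the discarded transient are illustrative choices, not those of the text.

```python
import numpy as np

def l96_rhs(x, F):
    """Right-hand side of the Lorenz 96 model,
        dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F,
    with the cyclic boundary conditions handled by np.roll."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def integrate_l96(K=40, F=6.0, dt=5e-3, n_steps=4000, seed=0):
    """4th-order Runge-Kutta integration from a small random perturbation
    of the unstable fixed point X_k = F."""
    rng = np.random.default_rng(seed)
    x = F + 1e-3 * rng.normal(size=K)
    traj = np.empty((n_steps, K))
    for n in range(n_steps):
        k1 = l96_rhs(x, F)
        k2 = l96_rhs(x + 0.5 * dt * k1, F)
        k3 = l96_rhs(x + 0.5 * dt * k2, F)
        k4 = l96_rhs(x + dt * k3, F)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        traj[n] = x
    return traj

traj = integrate_l96()
X = traj[2000::10]   # discard a transient, subsample every 10th step
```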
As shown in Fig. 7(a), the L96 flow is dominated by irregular spatiotemporal traveling-wave-like patterns. We estimated the cross-correlations $\rho_{i,j}(t)$ between the variables $X_i(t)$ and $X_j(t)$ for $i, j$ in $\{1, \ldots, 40\}$, and we formed the matrix $C$ given in (104), built from $(2M-1) \times (2M-1)$ Hankel blocks, with $M = 160$. The absolute values of its eigenvalues, ranked per frequency and forming the corresponding DAH power spectrum, are shown in Fig. 8(a). The corresponding DAH phase spectrum is shown in Fig. 8(b).
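The construction of $C$ and of the DAH power spectrum can be sketched as follows. The Hankel indexing of the blocks (lagged cross-correlations $|t| \leq 2M-2$ read along antidiagonals) is our assumption about (103) and (104), not copied from the text, and the toy three-channel series is purely illustrative.

```python
import numpy as np

def dah_matrix(X, M):
    """Sketch of the grand matrix C: for each channel pair (i, j), a
    (2M-1) x (2M-1) Hankel block built from the lagged cross-correlations
    rho_{i,j}(t), |t| <= 2M-2. The exact indexing convention of
    (103)-(104) is assumed, not taken from the text."""
    N, d = X.shape
    Xc = (X - X.mean(0)) / X.std(0)
    m = 2 * M - 1
    idx = np.add.outer(np.arange(m), np.arange(m))   # Hankel index a + b
    C = np.empty((d * m, d * m))
    for i in range(d):
        for j in range(d):
            # rho_{i,j}(t) for t = -(m-1), ..., m-1
            rho = np.array([
                np.dot(Xc[max(0, -t):N - max(0, t), i],
                       Xc[max(0, t):N - max(0, -t), j]) / N
                for t in range(-(m - 1), m)])
            C[i * m:(i + 1) * m, j * m:(j + 1) * m] = rho[idx]
    return C

rng = np.random.default_rng(0)
t = np.arange(4000) * 0.1
# toy 3-channel series sharing one oscillation with different phases
X = np.stack([np.sin(t + p) + 0.3 * rng.normal(size=t.size)
              for p in (0.0, 1.0, 2.0)], axis=1)
C = dah_matrix(X, M=8)
dah_power = np.abs(np.linalg.eigvals(C))  # |eigenvalues|, as in Fig. 8(a)
```

Ranking these $|{\rm eigenvalues}|$ per Fourier frequency, as in Sec. VI D, would then yield the DAH power spectrum.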
Interestingly, these spectra, as well as the underlying DAHMs, share some common features with those displayed in the case of a pure traveling wave in Secs. VII A and VII B. First, the DAHMs shown in the 1st and 2nd columns of Fig. 9 clearly share common patterns with those shown in the respective columns of Fig. 5.
Second, a phase synchronization between the channels of a DAHM as f approaches the Nyquist frequency is also observed in Fig. 8(b) as in Fig. 3(b).
Third, this synchronization is in contrast with the disordered distribution of phases shown in the frequency band $0 < f < f_c$, with $f_c \approx 0.1$: the frequency band in which the most energetic pairs of DAH eigenvalues (for the quadratic form $Q_\delta$) are located; see Fig. 8(a).
Nevertheless, the DAH power spectrum exhibits much broader peaks in the case of the L96 flow than in the case of a pure traveling wave; compare Fig. 8(a) with Fig. 3(a).
Finally, the most noticeable difference is shown by the DAHCs. Whereas the latter were all periodic in the case of the traveling wave, as expected [see (121)], they are, as predicted by (125),^{70} clearly modulated in amplitude and narrowband in frequency in the case of the L96 flow; compare the 3rd column of Fig. 5 with that of Fig. 9. Furthermore, the pairs of DAHCs are typically constituted (for $f \neq 0$) of time series that are nearly in phase quadrature; see the panels corresponding to $f = 0.016$, $f = 0.031$, and $f = 0.047$ in the 3rd column of Fig. 12. We are thus in the situation described in Sec. VIII B. Section IX A 3 below addresses the inverse-modeling problem of the L96 flow, following the approach described in Sec. VIII B.
3. DAHMSLM modeling of the Lorenz 96 model
In this section, we report on the inverse modeling skills obtained by application of the DAH-MSLM approach described in Sec. VIII. For that purpose, the original field obtained by integrating Eq. (127), as mentioned earlier, has been sampled every 10 time steps. The frequency band that is modeled extends from $f = 0$ up to $f = 6.29 \times 10^{-2}$, which represents a fraction of the variance of $86\%$ and 20 Fourier frequencies, according to the resolution set by $M' = 2 \times 160 - 1 = 319$.
The corresponding DAHCs are modeled for each of these frequencies according to an MSLM of the form described in Sec. VIII B, in which the number of layers per frequency is 11, including the main one. After multiplication by the corresponding DAHMs, the simulated DAHCs allow us, as described in the Appendix, to form a simulated DAH-MSLM field. The resulting field is shown in Fig. 10 and exhibits striking common features with the original field, shown in the same figure for a better comparison.
To assess the ability of the simulated DAH-MSLM field to reproduce the time variability of the L96 flow, we computed for each of these fields the autocorrelation function (ACF) (in time) of each channel, followed by an averaging of the resulting individual ACFs over the channels. The result is shown in Fig. 15(a); as one can see, the ability of the simulated DAH-MSLM field to reproduce the time variability of the L96 flow is very satisfactory, according to this metric.
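The channel-averaged ACF used as a metric here can be computed as in the following sketch, demonstrated on a hypothetical AR(1) field rather than on the L96 output.

```python
import numpy as np

def mean_acf(field, max_lag):
    """Channel-averaged autocorrelation function: the ACF of each channel
    (column) is computed up to max_lag, then averaged over the channels."""
    field = field - field.mean(0)
    N, d = field.shape
    acfs = np.empty((max_lag + 1, d))
    for j in range(d):
        x = field[:, j]
        denom = np.dot(x, x)
        acfs[:, j] = [np.dot(x[:N - k], x[k:]) / denom
                      for k in range(max_lag + 1)]
    return acfs.mean(axis=1)

# toy AR(1) field with 5 channels; lag-1 autocorrelation should be near 0.9
rng = np.random.default_rng(0)
phi = 0.9
eps = rng.normal(size=(3000, 5))
X = np.empty_like(eps)
X[0] = eps[0]
for n in range(1, 3000):
    X[n] = phi * X[n - 1] + eps[n]
acf = mean_acf(X, 20)
```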
It should be emphasized that, beyond the embedding window size, here $M' = 319$, these reproduction skills of the (mean) ACF deteriorate, due in particular to the periodization involved in the determination of the matrix $C$, as in any other method exploiting time-embedding techniques.
So far, our applications have dealt with deterministic systems. Section IX B illustrates that striking inverse modeling skills can also be achieved by applying the DAH-MSLM approach to the inverse modeling of stochastic systems, still with the same elemental MSLMs described in Sec. VIII B.
B. Applications to a stochastic heat equation
1. The dynamical model
The dataset analyzed hereafter is produced from a simple stochastic partial differential equation (SPDE), namely, the following stochastic heat equation posed on the interval $[0, L]$ ($L > 0$) with periodic boundary conditions:
Here, $\lambda$ and $D$ are positive parameters, and $dW$ denotes a "space-time" white noise process on the circle $\mathbb{T}_L$, roughly speaking a Gaussian random field with spatiotemporal correlations, for $t, s \geq 0$, given by
There are (at least) two approaches to frame mathematically the notion of space-time white noise and related concepts of solutions; see, e.g., Ref. 71 for a comparison. Both the random-field approach based on Green functions^{72} and the functional-analysis approach^{73,74} give the same notion of weak/mild solutions, and stochastic heat equations such as Eq. (130) serve as fundamental objects for studying various properties of, and formulas for, such solutions; see, e.g., Refs. 75-78.
The functional-analysis approach teaches us that the stochastic process $u(t, \cdot)$ solving Eq. (129) is actually an infinite-dimensional Ornstein-Uhlenbeck (OU) process in the Hilbert space $H = L^2(\mathbb{T}_L)$; see Ref. 75. Our goal is to analyze the DAH power and phase spectra of the corresponding (random) spatiotemporal field, $u(t, x)$, and to see whether these spectra provide an intuitive characterization of such an OU process.
For that purpose, we first simulate Eq. (129) as follows. Over a grid of mesh size $\delta x = L/N_x$, the discrete approximation $u_j^n$ of $u(n\delta t, j\delta x)$ is obtained by finite differences, for which the noise term is approximated at each time step $n\delta t$ (via an explicit Euler-Maruyama scheme) by $\xi^n \sqrt{\delta t}$, where $\xi^n = (\xi_j^n)_{j=1}^{N_x}$ denotes an $N_x$-dimensional vector of random variables, each drawn independently with respect to $1 \leq j \leq N_x$ and $n$ (except $\xi_1^n = \xi_{N_x}^n$) from the standard normal distribution $N(0, 1)$.
We chose $N_x = 2^7$, $L = 2\pi$, $D = 2 \times 10^{-1}$, and $\delta t = 10^{-3}$ for our simulations. The initial data are taken to be
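A simulation of this kind can be sketched as follows. The drift form $D\,u_{xx} - \lambda u$ is our reading of Eq. (129) and may place $\lambda$ and $D$ differently than the text; the damping value `lam`, the noise normalization $\sqrt{\delta t/\delta x}$, and the sinusoidal initial datum are likewise assumptions, since they are not fully specified in this excerpt.

```python
import numpy as np

def simulate_heat_spde(Nx=128, L=2 * np.pi, D=0.2, lam=1.0,
                       dt=1e-3, n_steps=5000, seed=0):
    """Explicit Euler-Maruyama / finite-difference sketch of a damped
    stochastic heat equation on the circle,
        du = (D u_xx - lam * u) dt + dW,
    with space-time white noise discretized as sqrt(dt/dx) * N(0,1) per
    grid cell (assumed normalization). Periodic BCs via np.roll."""
    rng = np.random.default_rng(seed)
    dx = L / Nx
    x = np.arange(Nx) * dx
    u = np.sin(x)                    # smooth initial datum (hypothetical)
    out = np.empty((n_steps, Nx))
    noise_amp = np.sqrt(dt / dx)
    for n in range(n_steps):
        lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
        u = u + (D * lap - lam * u) * dt + noise_amp * rng.normal(size=Nx)
        out[n] = u
    return out

u_field = simulate_heat_spde()
```

Note that the explicit scheme is stable here since $D\,\delta t/\delta x^2 \approx 0.08 < 1/2$.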
2. DAH spectra, DAHMs, and DAHCs
Excluding transient behavior and subsampling every $8\delta t$, we store $N = 37\,500$ time instances of the $d$-dimensional vector $X(t) = (X_1(t), \ldots, X_d(t))$, in which the $X_p$'s are also obtained by subsampling the original field $(u_j^n)$ every $4\delta x$, i.e., here $d = 32$. The resulting coarse-grained SPDE field is shown in the top panel of Fig. 13. We built the matrix $C$ (104), whose Hankel blocks are given in (103), with $M = 80$ and $1 \leq i, j \leq d$. Diagonalizing $C$ gives the multidimensional DAH spectra, as explained in Sec. VI D. The corresponding DAH power (resp. phase) spectrum for the coarse-grained SPDE is shown in Fig. 11(a) [resp. Fig. 11(b)].
As can be observed by comparison with Fig. 8(a), the DAH power spectrum shown in Fig. 11(a) does not exhibit any broadband peak, and is actually composed of curves of DAH eigenvalues, each of which exhibits a decay from low to high frequencies reminiscent of that of a scalar OU process. What is remarkable is that the DAH power spectrum provides here a multidimensional picture of what one should expect for an OU process obtained from a standard 1D Langevin equation, showing thus here again the relevance of the quadratic form $Q_\delta$ as an energy characterizing the distribution of eigenvalues of $C$ at a given frequency; see (105) and Remark V.1(ii).
As another noticeable feature, the phase synchronization observed in the DAH phase spectra of the L96 flow and the traveling-wave examples is no longer present for this stochastic example; it is instead replaced in Fig. 11(b) by the presence of some (vaguely visible) bands within a diffuse background. Going back to the interpretation of the phase spectrum discussed at the end of Sec. V in relation with (80), the presence of such a diffuse background, even for $f$ approaching the Nyquist frequency $f_N$, can be attributed to the presence of nonvanishing cross-spectral terms $\widehat{\rho}_{p,q}(f)$ leading to a non-negligible $R_p(f)$ given in (78). This feature is consistent with the intuitive idea that the noise term in Eq. (130) is white and thus forces the resolved frequency band evenly. It should be emphasized that such a feature has also been observed in other conducted experiments on datasets issued from various stochastic systems driven by white additive noise.
Finally, each pair of DAHCs associated with a frequency $f$ (with $f \neq 0$) is, as for the nonlinear example of Sec. IX A, constituted of time series modulated in amplitude, narrowband about the frequency $f$, and nearly in phase quadrature; see the panels corresponding to $f = 0.013$, $f = 0.025$, and $f = 0.038$ in the 3rd column of Fig. 12. We are thus here again in a favorable situation to apply the MSLM modeling framework of Sec. VIII B to these DAHCs. Section IX B 3 below examines the corresponding modeling skills.
3. Coarsegrained DAHMSLM modeling
Recall that the original SPDE field has been sampled every 8th time step and every 4th spatial increment $\delta x$; the resulting coarse-grained field is shown in the top panel of Fig. 13. The frequency band that is modeled extends from $f = 0$ up to $f = 6.33 \times 10^{-2}$, which represents a fraction of the variance of $86\%$ and corresponds to the first 10 Fourier frequencies, according to the resolution set by $M' = 2 \times 80 - 1 = 159$.
The corresponding DAHCs are here again modeled, for each of these frequencies, according to an MSLM of the form described in Sec. VIII B, in which the number of layers per frequency is 4, including the main one. The resulting simulated DAH-MSLM field is shown in Fig. 13 and exhibits striking common features with the coarse-grained SPDE field shown in the same figure.
At a statistical level, the simulated DAH-MSLM field (red curve in Fig. 14) captures remarkably well the distribution of energy across the spatial scales, when compared with the coarse-grained version of the SPDE field (blue curve in Fig. 14) from which the parameters of the DAH-MSLM have been estimated. Note that the blue curve in Fig. 14 shows this energy distribution for the coarse-grained SPDE field as projected onto the first pairs of DAHMs corresponding to the first 10 Fourier frequencies. The black curve in the same figure shows the energy distribution for the full coarse-grained SPDE field.
Finally, as in the case of the L96 flow, in order to assess the ability of the simulated DAH-MSLM field to reproduce the time variability contained in the coarse-grained SPDE field, we computed for each of these fields the ACFs of each channel, followed by an averaging of the resulting individual ACFs over the channels. The result is shown in Fig. 15(b); as one can see, the ability of the simulated DAH-MSLM field to reproduce the (average) decay of correlations is very satisfactory. Noteworthy is the nearly exponential decay of the latter, as intuitively expected from an infinite-dimensional OU process.
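This near-exponential decay can be anticipated with a short computation, assuming Eq. (129) has the damped form $du = (D\,\partial_x^2 u - \lambda u)\,dt + dW$ (our reading of the equation; the placement of $\lambda$ and $D$ is not fully specified in this excerpt):

```latex
% Each spatial Fourier mode u_k of the damped stochastic heat equation is
% a scalar OU process,
du_k = -(\lambda + D k^2)\, u_k \, dt + dW_k ,
% whose stationary autocorrelation is a pure exponential,
\operatorname{Corr}\big(u_k(t),\, u_k(t+\tau)\big) = e^{-(\lambda + D k^2)\,\tau}.
% A spatial channel u(t, x_p) mixes these modes, so its ACF is the
% variance-weighted sum
\rho_p(\tau) \;=\; \sum_k w_k\, e^{-(\lambda + D k^2)\,\tau},
\qquad
w_k \;=\; \frac{1/(\lambda + D k^2)}{\sum_l 1/(\lambda + D l^2)},
% which is dominated at moderate lags by the slowest (small-k) modes and
% hence decays nearly exponentially, consistent with Fig. 15(b).
```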
To summarize, the DAH-MSLM inverse modeling framework provides here again efficient emulators of the observed dynamics for various metrics. The most striking feature of the approach is that, whether for the linear stochastic example examined here or the nonlinear chaotic one of Sec. IX A, the predictor functions stay the same, namely, the monomials used for the class of MSLMs described in Sec. VIII B. The data-adaptive character of the DAHMs, manifested in the capture of the phase information contained^{79} in the multivariate signal to be modeled, constitutes a key element for successful modeling. It indeed allows for breaking the signal into elementary bricks, here the DAHCs, that fall naturally within the MSLM class of models. In other words, the DAHMs provide a change of basis that systematically transforms the original dataset into time series generated by a same class of SDEs, provided the corresponding DAHCs are modulated in amplitude and narrowband (for $f \neq 0$). Of course, many questions arise, as they did for many novel mathematical concepts and tools when first introduced. Section X formulates some of these.
X. CONCLUSION
Thus, by means of rigorous harmonic analysis tools, this article provides a natural framework for the representation of complex time-evolving fields by a combination of narrowband time series, the DAHCs, suitable for their modeling within a universal class of models, namely, the MSLMs. The framework opens up several possible directions for future research. We outline some of these directions below and comment on recent achievements obtained on non-synthetic datasets.

The modeling approach described in this article may benefit from several natural developments. First, terms of the form $\gamma_j(f)\, y_j (x_j^2 + y_j^2)$ and $\gamma_j(f)\, x_j (x_j^2 + y_j^2)$ in the $x_j$- and $y_j$-equations, respectively, could be incorporated into the formulation of an MSLM. The introduction of such nonlinear "twist" terms is in part motivated by the analysis conducted in Ref. 63. There, it has indeed been shown that such terms are responsible for certain interactions between the noise and the nonlinear terms (associated with hypoellipticity) that may be important to resolve. The inclusion of such twist terms may thus be relevant for the proper modeling of a DAHC pair, for certain applications.
Also, the driving noise of an MSLM need not belong to the class of correlated white noise processes^{80} as in Eq. (MSLM). For instance, an MSLM driven by Lévy $\alpha$-stable processes could be relevant for certain applications, such as the modeling of precipitation in climate^{81,82} and other climate fields.^{83}

The use of MSLMs need not be limited to the modeling of pairs of DAHCs as extracted by DAH. For instance, other multivariate time series decomposition techniques, such as MSSA,^{84} extract analogous time series known as ST-PCs, which, when coming in pairs that are nearly in phase quadrature, could also be efficiently modeled by an MSLM, provided the corresponding pairs are modulated in amplitude while narrowband. We mention that a similar approach was adopted in Ref. 66 for the stochastic modeling of decadal variability in ocean gyres. There, however, the nonlinear coupling within a frequency bin was absent, while linear couplings across the frequencies (falling within a targeted frequency band) were used to model the ST-PCs from the dataset considered in Ref. 66.

Similarly, couplings across frequencies that go beyond driving by the same noise realization may also be useful for building MSLMs, in certain cases. The inclusion into an MSLM of the contribution of a frequency $g$ other than the frequency $f$ may indeed be easily achieved by passing the corresponding terms associated with $g$ into (MSLM). Of course, as long as the pairs of DAHCs are sufficiently narrowband, such a coupling may be superfluous; in such a case, a "same noise realization" coupling, which is cheaper in terms of coefficients to be estimated,^{85} may be amply sufficient to reach satisfactory modeling skills, as shown for the time-evolving fields of Sec. IX.

Going beyond the synthetic datasets of Sec. IX, the DAH-MSLM approach has been applied to various geophysical multivariate datasets exhibiting complex patterns, from "real-world applications." In that respect, the DAH-MSLM modeling of Arctic sea ice concentration has been successfully addressed in Ref. 86, providing, in particular, inverse models that remain stable over time intervals much longer than their training period (1979–2014), beyond the end of the 21st century. Interestingly, these models exhibit an interdecadal variability consistent with proxy historical records, while reproducing accurately the (nonlinear) trend contained in the observations; see Ref. 86 (Fig. 13).
Prediction experiments using the DAH-MSLM framework have also been conducted recently in Ref. 87 for the Arctic sea ice extent (SIE). Here, the forecasts, as would be the case for any other dataset, are made for the DAHCs using the best MSLMs learned during the training period. As a result, retrospective forecasts show that the resulting DAH-MSLM modeling is quite skillful in predicting the September SIE, i.e., the most challenging SIE value to forecast. Noteworthy are the real-time forecasts conducted in 2016, which have shown very competitive skills, as documented at https://www.arcus.org/sipn/seaiceoutlook/2016/postseason; see Ref. 87 for more details.
Finally, we mention that the DAH-MSLM framework has been shown to be useful for the data-driven study of the coupling between the solar wind and the magnetosphere, as discussed in Ref. 88. This example shows, in particular, the relevance of the DAH-MSLM framework for providing inverse models of externally forced systems, another important theme for geophysical applications.^{89,90}
ACKNOWLEDGMENTS
The authors would like to thank the anonymous reviewers for their useful comments that helped improve the presentation of the manuscript.
The authors also thank Georg Gottwald for the inspiring discussions at the start of this project. The authors are grateful to Henk Dijstra, David Neelin, and Alexis Tantet for fruitful and inspiring collaborations about the stochastic Hopf bifurcation. Constructive comments by Pavel Berloff, Sergey Kravtsov, Honghu Liu, and Eric Simonnet in early stages of this project are also most appreciated. The authors are also grateful to James McWilliams for several inspiring discussions and constructive comments about this work. Finally, Michael Ghil has the full gratitude of the authors for the numerous stimulating discussions over the years as well as for his unflinching support.
This research was supported by Grant No. N000141612073 from the Multidisciplinary University Research Initiative (MURI) of the Office of Naval Research, and by the National Science Foundation Grant Nos. OCE1243175, OCE1658357, and DMS1616981.
APPENDIX: FREQUENCY BAND RECONSTRUCTION
Let us consider a $d$-dimensional time series $X(t) = (X_1(t), \ldots, X_d(t))$ from which the cross-correlation coefficients $\rho_{i,j}$ between the $i$th and the $j$th channels have been estimated. Once the DAHMs have been determined as eigenvectors of the matrix $C$ defined in (104) (see Sec. VI D), and the DAHCs according to (112), one can use them to extract from $X(t)$ the signal that corresponds to a certain frequency band $[f_s, f_e]$. We describe hereafter how to proceed.
First, given an $M'$-dimensional DAHM snippet, $E_p^j(s)$ ($1 \leq s \leq M'$), and its corresponding DAHC, $t \mapsto \xi_j(t)$, form the reconstructed component (RC), $R_p^j(t)$, given at time $t$ by
The values of the normalization factor $M_t$, as well as of the lower and upper bounds of summation, $L_t$ and $U_t$, differ between the central part of the time series and its endpoints, and are given here as for MSSA; see Ref. 84 [Eq. (12)]. Forming the corresponding reconstructed multivariate time series $R^j(t) = (R_1^j(t), \ldots, R_d^j(t))$ and summing over all of the (resolved) $j$'s allows us to recover the original multivariate time series $X(t)$. A summation over a subset of $\{1, \ldots, dM'\}$ gives a partial reconstruction of $X(t)$.
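The boundary bookkeeping of $M_t$, $L_t$, and $U_t$ can be sketched as follows; the 0-based indexing and the uniform-weight diagonal averaging are assumptions carried over from standard MSSA practice, not copied from Eq. (12) of Ref. 84.

```python
import numpy as np

def reconstruct_component(xi, E, Mp):
    """MSSA-style diagonal averaging: reconstruct one channel from a
    single DAHC xi (length N - Mp + 1) and its Mp-long DAHM snippet E, via
        R(t) = (1/M_t) * sum_{s=L_t}^{U_t} xi(t - s) E(s),
    where the s-range, and hence the count M_t, shrinks near the
    endpoints (0-based indexing; assumed convention)."""
    N = len(xi) + Mp - 1                  # length of the original series
    R = np.zeros(N)
    for t in range(N):
        lo = max(0, t - (len(xi) - 1))    # L_t
        hi = min(Mp - 1, t)               # U_t
        s = np.arange(lo, hi + 1)
        R[t] = np.mean(xi[t - s] * E[s])  # mean implements the 1/M_t factor
    return R

# Completeness check: with the trivial orthonormal basis E^j = e_j, the
# "DAHCs" are xi_j(t) = X(t + j), and the summed RCs recover X exactly.
rng = np.random.default_rng(0)
N, Mp = 50, 3
Xs = rng.normal(size=N)
xis = [Xs[j:N - Mp + 1 + j] for j in range(Mp)]
recon = sum(reconstruct_component(xis[j], np.eye(Mp)[j], Mp)
            for j in range(Mp))
```

The same routine, applied per channel $p$ with the actual DAHM/DAHC pairs and summed over a subset of indices $j$, yields the partial reconstructions discussed above.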
Unlike for MSSA^{84} or PCA, such a partial reconstruction is not necessarily guided by a certain targeted percentage of the variance of $X(t)$ to be captured, but can also be performed over a targeted frequency band. Due to its harmonic flavor, the DAH approach indeed allows us to consider a notion of harmonic reconstructed component (HRC), which consists, for a given channel $p$, of summing the RCs corresponding to a given frequency $f_\ell$, namely,
where $J(f_\ell)$ corresponds to the set of indices given by (108).
In applications, the DAH approach thus offers a way to determine the contribution, within a component $X_p(t)$ [of the multivariate signal $X(t)$], of a particular frequency band $I_f := [f_s, f_e]$, by simply adding up the HRCs over the frequencies $f_\ell$ lying within $I_f$.