We first review the way in which Hasselmann’s paradigm, introduced in 1976 and recently honored with the Nobel Prize, can, like many key innovations in complexity science, be understood on several different levels. It can be seen as a way to add variability into the pioneering energy balance models (EBMs) of Budyko and Sellers. On a more abstract level, however, it used the original stochastic mathematical model of Brownian motion to provide a conceptual superstructure to link slow climate variability to fast weather fluctuations, in a context broader than EBMs, and led Hasselmann to posit a need for negative feedback in climate modeling. Hasselmann’s paradigm still has much to offer us, but naturally, since the 1970s, a number of newer developments have built on his pioneering ideas. One important one has been the development of a rigorous mathematical hierarchy that embeds Hasselmann-type models in the more comprehensive Mori–Zwanzig generalized Langevin equation (GLE) framework. Another has been the interest in stochastic EBMs with a memory that has slower decay and, thus, longer range than the exponential form seen in his EBMs. In this paper, we argue that the Mori–Kubo overdamped GLE, as widely used in statistical mechanics, suggests the form of a relatively simple stochastic EBM with memory for the global temperature anomaly. We also explore how this EBM relates to Lovejoy *et al.*’s fractional energy balance equation.

Hasselmann’s stochastic energy balance model embodies a long-standing paradigm in climate science. Recent evidence that climate may show long-range memory has, thus, motivated newer stochastic models such as Lovejoy *et al.*’s fractional energy balance equation (FEBE). Physical arguments for extending the Hasselmann formalism are more general than this, however, and are motivated by many other well-known features of the system, including periodicity. We, thus, need a model that allows a more general range of dependency structures than either just the shortest possible or the longest possible ranged behaviors while allowing both as limiting cases. In this article, we propose such a formalism, the stochastic Mori–Kubo generalized Langevin equation (GLE), to extend Hasselmann’s EBM.

## I. INTRODUCTION

The value of climate science would be substantial even without anthropogenic interference in the climate system. The presence of such an interference, however, raises a number of categorically new challenges. The context of human-induced climate change has made it important to quantify the observed changes, understand their origin, and, as far as possible,^{1,2} predict their future scope, in the presence of the nontrivial intrinsic variability that is already present (see, e.g., Refs. 3 and 4). The development of this need has coincided with the rise of the science of complexity, which has itself both pioneered and promoted new approaches to similar problems across a very wide range of topical application areas.^{5}

Several key achievements in the study of fluctuations, at the interface^{6–8} of climate modeling and the fast-developing sciences of complexity, have recently been recognized by the award of the 2021 Nobel Prize for Physics to Hasselmann, Manabe, and Parisi. The Nobel Committee has given a very useful and accessible short introduction^{9} to these branches of science, but two more recent articles have between them demonstrated an emerging dichotomy in responses to the Prize. Our paper seeks to reconcile them.

In Ref. 10, Franzke *et al.* have shown how “the Hasselmann program,” first enunciated in Ref. 11, of explaining how slow climate arises from fast weather using stochastic EBMs, has given rise to decades of productive and insightful research. The original stochastic EBMs were Markovian, depending only on the current values of their (stochastic) variables, and by construction assumed strongly separated fast and slow time scales. A reader might easily infer from this excellent review, however, that the broader “Hasselmann program” built around such stochastic EBMs, their developments, and his subsequent spatiotemporal fingerprinting methods has fully solved the problems of realistic climate modeling. This impression is reinforced by the fact that the paper does not delve into more recent, and maximally non-Markovian, EBMs that exhibit “long-range memory” (LRM; though see a complementary review of this by Franzke *et al.*,^{12} and recent work by him with his colleagues^{13}).

In contrast, Lovejoy^{14} has stressed the differences between Hasselmann’s Markovian paradigm and more recent developments in complexity science, especially those related to LRM and fractional responses. He has articulated^{15–17} the view that only models incorporating LRM can fully describe the already known properties of climate data. Arguments for the relevance of LRM have at times been received with skepticism by some climate scientists,^{18} while others (e.g., Ref. 19), who do acknowledge its importance, have nevertheless emphasized the deterministic behavior present in the climate system because of the predictability that it affords. Recently, however, the Nobel Committee has remarked^{9} on an increasing awareness that *“on decadal timescales of relevance to humans, evidence is consistent with the memory becoming effectively infinite ….”*

An AGU editor highlight (Ana Barros, “Eminently Complex – Climate Science and the 2021 Nobel Prize,” January 1, 2023) described Ref. 14 as making *“the case that for the last 100 years there has been little communication between atmospheric sciences and nonlinear geophysics, and this must change to improve useful predictability (i.e., lead time) of climate models.”* In this paper, we address the need for communication between these two fields in a number of ways.

(i) First, by reviewing the topic, we will show that the situation is in some ways already better than it may appear: there has in fact been work, notably that of Ref. 20, reviewed in Ref. 8, which, while proceeding directly from the Hasselmann paradigm, has allowed the construction of models with a non-Markovian structure via the use of memory kernels. This work seems not always to have been very accessible to a wider audience, perhaps because it exploits the Mori–Zwanzig formalism,^{21} well known in applied mathematics, data assimilation, and statistical mechanics, to decompose the effect of fast degrees of freedom into a memory term and a noise-like term. The fundamental (and nonlinear) equation that expresses the Mori–Zwanzig decomposition is an identity, and deterministic, but, confusingly for those already familiar with the linear stochastic equation first studied by Langevin in 1908, it is sometimes^{8,10,20–23} referred to as the generalized Langevin equation (GLE). This Mori–Zwanzig GLE is intractable without further approximations^{21} or other modifications.^{8}

(ii) In view of this intractability, our paper takes an alternative, more direct stochastic approach to an EBM with memory, based on an already existing extension of Langevin’s original equation^{24} of 1908. We advocate for the use of Mori’s and Kubo’s stochastic GLE, widely known and applied since the 1960s in condensed matter theory, quantum optics, and physical chemistry.^{25–29} We show how the form of the Mori–Kubo GLE, occasionally known as the non-Markovian Langevin equation,^{30} suggests an analogous generalized Hasselmann stochastic EBM for the global temperature anomaly. Our paper expands on and revises our contribution to the 2020 International Conference on Complex Systems,^{31} which was itself expanded as Ref. 32.

(iii) We go on to show that Hasselmann’s model is one limiting case of our proposed EBM, that of short-ranged memory, while Lovejoy’s fractional model, FEBE,^{16,33} is closely related to the opposite special case, long-ranged memory. FEBE differs only by using a Riemann–Liouville fractional derivative instead of a Caputo one. We, thus, bridge these two seemingly opposed perspectives. We further note that an EBM based on the Mori–Kubo GLE can handle not just the two extremes of delta-correlated (shortest ranged) or power law decaying (longest ranged) memory kernels but also those with intermediate-ranged memory such as that which arises from a few widely separated timescales. In addition, modifications of such an EBM could, in principle, accommodate many types of periodic (or quasiperiodic) determinism that are present, such as ENSO. As early as 1976, Mitchell^{34} asserted that *“there are no known deterministic mechanisms of climatic variability that are internal to the atmosphere … or any other part of the climatic system.”* From this viewpoint, the emergence of any deterministic mode(s), even if nonlinear and only quasiperiodic, is, thus, crucial to predictability^{35} and its inclusion essential in any compact description.

(iv) We conclude by noting that an EBM based on the Mori–Kubo GLE would have the option of embodying a fluctuation–dissipation relation (FDR) *by construction*, because this was the main original purpose of their stochastic GLE.

We hope our contribution will open a new channel of communication between two disparate worlds and will facilitate the construction of a new class of EBMs, which will, in turn, improve the study of climate predictability.

In Sec. II, we will first briefly recap the basic notions of EBMs and then discuss some ways in which nonlinearity and stochasticity have been incorporated into them.

## II. ENERGY BALANCE MODELS

### A. Nonlinearity in EBMs

The pioneering EBMs^{36–38} of Budyko and Sellers were deterministic. As the name implies, they captured how the earth’s surface temperature $T$ will, after a perturbation, evolve in time $t$ to restore the balance between the incoming $R_{\downarrow}$ and outgoing $R_{\uparrow}$ heat fluxes at the earth’s surface, expressed by the following equation:

where $\epsilon$ is the earth’s emissivity and $\sigma_{SB}$ is the Stefan–Boltzmann constant.^{36}

Since the 1960s, EBMs have been explored and extended in many directions. In particular, much work has been done on nonlinear and stochastic climate dynamics in this framework, one topic of the recently edited book by Franzke and O’Kane.^{6}


This term^{36} gives the radiation incident at the earth’s surface, and the factor of a quarter results from averaging over the planet’s surface.

### B. Stochasticity in EBMs

Stochasticity can be introduced into nonlinear EBMs in a number of ways; for example, Sutera^{40} added white noise $\xi (t)$ with a controllable variance $\sigma^2$ as an extra term to Eq. (3). This changes it from an ordinary differential equation (ODE) to a stochastic differential equation (SDE), an important subtlety.

The Fraedrich–Sutera model is Markovian because its deterministic part depends only on the current value of its single variable, the temperature $T(t)$, while its stochastic part, Gaussian white noise, is delta-correlated and, thus, also has no memory of its previous state.

The marginal distributions of such a Markovian climate model need not be Gaussian, however.^{41} Reference 42 is a comprehensive study of the effects of $\alpha$-stable Lévy noise in the more recent Ghil–Sellers extension of the original EBMs. The PDF of this noise is heavy-tailed and has infinite variance.

### C. Memory in EBMs

The Markovian restriction on the earliest EBMs has since been relaxed in several different ways. A pioneering approach was that of Bhattacharya *et al.*,^{43} who introduced memory into the deterministic part via a modification of the albedo term. The resulting work has been based on delay differential equations.^{44}

Multi-box (and, thus, multivariate) stochastic models have incorporated memory effects through the presence of a range of feedback time scales in their different “boxes.”^{45–47}

Most of the remainder of this paper concerns memory in stochastic EBMs, but in Sec. III, we first examine the simplest linear Markovian stochastic models in more detail. In particular, we consider the univariate autonomous and non-autonomous versions in which they have been used to model the earth’s global temperature anomaly.

## III. THE HASSELMANN PROGRAM I: LINEARIZED MARKOVIAN STOCHASTIC EBMs

Linear Markovian stochastic EBMs were explored in detail in the second of Hasselmann’s classic mid-1970s papers,^{48} complementing the treatment in the first paper,^{11} which was largely in PDE (Fokker–Planck) rather than SDE (Langevin) terms. They can be motivated in several ways.

The first is an ansatz and was essentially the approach followed by Mitchell in the 1960s (Sec. III A). It recognizes that the discrete time autoregressive first-order Markov process, AR(1), and the continuous time Langevin Eq. (9), whose solution is known as the Ornstein–Uhlenbeck (O–U) process, are the simplest possible linear stochastic models of the response to perturbation.^{49}

A more physically intuitive argument, summarized in Sec. III B, and given more fully by Lemke in Ref. 50, comes from linearizing deterministic EBMs of the Budyko–Sellers type and adding white noise. At this level of abstraction, it is already apparent that the result is what we will refer to as Hasselmann’s autonomous SDE. However, Hasselmann’s SDE is isomorphic to not one but two such equations arising in Brownian motion (BM); so Sec. III C describes the dilemma posed by the choice of possible mappings.

Hasselmann’s mathematical analogy with Brownian motion was not limited simply to mapping an EBM to a Langevin equation. The most sophisticated and abstract level of his program, proposed first in Ref. 11, replaced Mitchell’s ansatz, which has sometimes been seen as being purely statistical, with a postulate, inspired by statistical physics: that of the separation of fast and slow timescales, and the identification of the former with weather and the latter with climate.

It is very important to realize that Hasselmann’s postulate is potentially quite generally applicable to climate variables and not just to the global mean temperature (GMT) anomaly—it was not simply a way to make a stochastic EBM, but to kick off a research program.

### A. Mitchell’s stochastic models of 1966

Mitchell’s pioneering chapter,^{51} in a RAND Corporation conference proceedings volume from December 1966, proposed two stochastic climate models.

Mitchell’s first model focused *“primarily on the origins of relatively rapid fluctuations, as, for example, those reflected in the interannual variability of temperature observed at a wide variety of locations during the past century.”* He postulated that the air temperature anomaly in a marine environment was composed of two additive parts, one proportional to the sea surface temperature (SST) anomaly and one independent of it. He then gave arguments suggesting that each component’s data, comprising monthly values, would follow first-order Markov, or “AR(1),” stochastic processes. An AR(1) process is defined at discrete times, in this case the set $t, t+dt, t+2dt, \ldots$, and takes the following form:

Mitchell derived the standard results for the autocorrelation function and power spectrum of the resulting compound first-order linear Markov process and pointed out that the measurements of SST implied that its $\varphi $ would be in the range $0.5$– $0.9$, giving a red noise behavior. He also fitted the compound model to the famous 269-year (1689–1957) Manley series of monthly temperatures for central England, illustrating how the two component model’s Autocorrelation Function (ACF) gave a good description of the observed one. Hasselmann was made aware of Mitchell’s work by a referee and took care to credit it, both in Ref. 11 and subsequently, e.g., the oral history of Ref. 52.
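Mitchell’s red-noise reasoning is easy to verify numerically. The following minimal sketch (our illustration; the function names and parameter values are ours, not Mitchell’s) simulates an AR(1) process with $\varphi$ in his estimated SST range and checks that its sample ACF decays as $\varphi^{\mathrm{lag}}$:

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1(phi, n, sigma=1.0):
    """Simulate the AR(1) recursion T_{k+1} = phi * T_k + sigma * eps_k."""
    T = np.zeros(n)
    eps = rng.normal(size=n)
    for k in range(n - 1):
        T[k + 1] = phi * T[k] + sigma * eps[k]
    return T

def acf(x, max_lag):
    """Sample autocorrelation function at lags 0..max_lag."""
    x = x - x.mean()
    c0 = np.dot(x, x) / len(x)
    return np.array([np.dot(x[: len(x) - lag], x[lag:]) / (len(x) * c0)
                     for lag in range(max_lag + 1)])

T = ar1(phi=0.8, n=100_000)
rho = acf(T, max_lag=5)   # AR(1) theory: rho[lag] = phi**lag, i.e., red noise
```

For $\varphi = 0.8$, the sample ACF reproduces the geometric decay $0.8, 0.64, \ldots$ to within sampling error, the discrete-time analog of the exponential ACF discussed in Sec. VI.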

Mitchell’s second model operated on longer time scales than his first and is less directly relevant to Hasselmann’s work. He viewed it as bearing *“contrastingly on the origin of much longer fluctuations, and”* thought that it might *“have some unexpected relevance to the origins of the glacial-interglacial succession of the Pleistocene ice age.”* In a sense, it anticipated the previous section’s Fraedrich–Sutera model and is further evidence of how far Mitchell was ahead of his time.

### B. Adding noise to a linearized Budyko–Sellers model

This SDE is Markovian in the sense that it depends only on the current value of its variable, which is now the global mean temperature (GMT) anomaly $\Delta T$. This SDE is a Langevin equation governing the O–U process $\Delta T(t)$. It expresses the competition between the Gaussian fluctuations and the mean-reverting term $-\Lambda \Delta T$, which produces a dynamic steady state.
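The dynamic steady state can be seen in a minimal Euler–Maruyama integration of an SDE of this type, $d\Delta T/dt = -\Lambda \Delta T + \sigma \xi(t)$; the stationary variance approaches the O–U value $\sigma^2/2\Lambda$. The parameter values below are illustrative only, not calibrated to climate data:

```python
import numpy as np

rng = np.random.default_rng(1)

Lam, sig = 0.5, 1.0     # illustrative damping rate and noise amplitude
dt, n = 0.01, 1_000_000

dT = np.zeros(n)        # anomaly Delta T, started from zero
noise = rng.normal(scale=np.sqrt(dt), size=n)   # Wiener increments
for k in range(n - 1):
    dT[k + 1] = dT[k] - Lam * dT[k] * dt + sig * noise[k]

var_est = dT[n // 10:].var()       # discard the initial transient
var_theory = sig**2 / (2 * Lam)    # stationary O-U variance
```

The mean-reverting term prevents the variance from growing without bound, in contrast to a pure random walk; refining $dt$ reduces the small discretization bias.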

Equations (6) and (7) are formal and, like all SDEs, contain objects whose meanings must be made explicit and rigorous. For white noise-driven SDEs, this process has taken decades of mathematical work in developing the now mature field of stochastic calculus.^{21,53} In that Markovian case, there is a widely used convention to write SDEs as differentials rather than derivatives. We do not use this notation here because many of the equations we study have colored or even fractional noise driving terms for which there is less of a notational consensus.

### C. Relation of EBM to the equations of Brownian motion

Hasselmann was directly inspired by a conceptual and mathematical analogy with Brownian motion. As he observed in an AIP oral history interview,^{52} “my stochastic model is an application of the concept of Brownian motion … One of the simplest stochastic processes. The idea that one could explain long-term climate variability very simply by the short term fluctuations of the atmosphere in analogy with Brownian motion, is really rather obvious and I thought I would write it up somewhere in a little note.”

His paper explored an important distinction between the mathematical and physical usages of the term Brownian motion (BM). For mathematicians, the Wiener process defines BM, while for physicists, the underdamped LE defines it. Hasselmann showed that the former already added insight to the then-current paradigms for fluctuations in climate but that the latter was necessary to create a power spectrum that more closely resembled those already being observed.
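The spectral point can be made concrete with the Lorentzian power spectrum of the O–U process, $S(f) = \sigma^2/[\Lambda^2 + (2\pi f)^2]$ (a standard result, stated here in our notation): at frequencies well above $\Lambda$ it falls as $f^{-2}$, like the Wiener-process spectrum, but the damping flattens it to a plateau at low frequencies:

```python
import numpy as np

Lam, sig = 0.1, 1.0   # illustrative damping rate and noise amplitude

def S(f):
    """Lorentzian spectrum of the O-U process."""
    return sig**2 / (Lam**2 + (2 * np.pi * f)**2)

def loglog_slope(f1, f2):
    """Log-log slope of S between two frequencies."""
    return (np.log(S(f2)) - np.log(S(f1))) / (np.log(f2) - np.log(f1))

hi = loglog_slope(10.0, 100.0)   # f >> Lam: slope tends to -2
lo = loglog_slope(1e-5, 1e-4)    # f << Lam: slope tends to 0 (plateau)
```

The low-frequency plateau is the spectral signature of the damping (negative feedback) that Hasselmann argued was necessary.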

Interestingly, however, there are not one but two possible SDEs in physical BM to choose from, the underdamped and overdamped Langevin equations. If Hasselmann had proceeded purely by mapping one of these onto an EBM, he could ostensibly have used either, as we now show. While this ambiguity was relatively unimportant to his work, we will then argue that it becomes significant when thinking about extensions to the Hasselmann EBM.

#### 1. Underdamped Langevin equation

The underdamped Langevin equation describes the acceleration $dv/dt$, where velocity $v=dy/dt$, of a particle of mass $M$, which experiences a deterministic force due to a potential $V(y)$, a stochastic force $\sigma \xi (t)$, and a linear damping force $-\gamma v$:

$$M \frac{dv}{dt} = -\frac{dV}{dy} - \gamma v + \sigma \xi(t).$$

If we (i) map velocity into the GMT anomaly and mass into heat capacity $C_P$, (ii) ignore the question of the mapping and meaning of the position variable $y$, and (iii) replace the derivative of a potential by a deterministic forcing, we obtain the non-autonomous version (7) of the Hasselmann EBM.

#### 2. Overdamped Langevin equation

Although mathematically identical to the unforced version of (7), rather than the mass $M$ being mapped to the heat capacity $C_P$, it is now the friction constant $\gamma$ which is so mapped. This is why, although $\gamma$ is usually absorbed into $\bar{k}$ and $\bar{\sigma}$, we did not do so in Eq. (9). Another conceptual difference is that the linear term in $\Delta T$ of Eq. (10) arises from mapping a linearized potential rather than a friction coefficient.

The overdamped Langevin equation can be modified to study problems with additional deterministic forcing of the velocity, such as the ion trap modeled by Ref. 55, and so its analogous Hasselmann EBM could also have an additional deterministic forcing term.

#### 3. Which of the underdamped or overdamped Langevin equations is closer to Hasselmann’s EBM?

The conceptual differences between the underdamped and overdamped Langevin equations start to become important when we try to derive EBMs from first principles. This is even more true if we want to extend them to include effects such as long-range memory and fluctuation–dissipation relations. We illustrate this in Sec. IV by considering Hasselmann’s own motivation^{11} of his stochastic model embodied in Eq. (10), and the outline derivation of it offered by the Nobel Committee.^{9} We will find that the latter, in particular, has more in common with the overdamped than the underdamped Langevin equation.

## IV. THE HASSELMANN PROGRAM II: FLUCTUATIONS, FAST AND SLOW

### A. Assumptions of the Hasselmann program

#### 1. Fast and slow variables


#### 2. Conditional average over fast dynamics

The Nobel Committee’s derivation^{9} writes the dynamics of the fast variables in terms of the slow ones and is simplified, without loss of generality, by considering scalar $x$ and $y$.
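A toy two-timescale system (our construction, not Hasselmann’s or the Committee’s own equations) illustrates the postulate: a slow variable $x$ forced by a fast O–U variable $y$ behaves, as the timescale ratio $\epsilon$ shrinks, like an O–U process driven by effectively white noise:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative two-timescale system:
#   slow ("climate"):  dx/dt = -lam * x + y
#   fast ("weather"):  dy/dt = -y/eps + sqrt(2*D)/eps * xi(t)
# As eps -> 0, y acts on x like white noise of intensity 2*D, so x tends
# to an O-U process with stationary variance D/lam.
lam, D, eps = 1.0, 1.0, 0.1
dt, n = 1e-3, 2_000_000

x = y = 0.0
xs = np.empty(n)
amp = np.sqrt(2 * D) / eps * np.sqrt(dt)
w = rng.normal(size=n)
for k in range(n):
    x += (-lam * x + y) * dt
    y += -y / eps * dt + amp * w[k]
    xs[k] = x

var_x = xs[n // 10:].var()                        # discard transient
var_exact = (D / eps) / (lam * (lam + 1 / eps))   # -> D/lam as eps -> 0
```

The exact stationary variance of $x$ for this linear cascade, $\mathrm{Var}(x) = \mathrm{Var}(y)/[\lambda(\lambda + 1/\epsilon)]$, approaches the white-noise-limit value $D/\lambda$ as the timescales separate, which is the content of the averaging step.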

### B. The Nobel Committee’s derivation

#### 1. Drift force and multiplicative noise

At this stage, the Nobel Committee^{9} gives a simpler but more explicit argument for a Langevin-based EBM than Hasselmann’s own. His original paper, in contrast, dealt with both Langevin and Fokker–Planck approaches.

#### 2. Linearizing the drift force and assuming additive noise

The result can be compared to the autonomous Hasselmann model (10), which is in $\Delta T$, not $T$.

## V. STOCHASTIC EBMs AND IAMs

Economic assessments of climate change typically rely on a suite of integrated assessment models (IAMs) descended from the work of another Nobel laureate, Nordhaus.^{56,57} These models have now been refined and extended in many directions—from increasing their spatial and sectoral resolutions^{58–60} to explicitly modeling technological choices^{61} to incorporating agent-based and non-equilibrium models of the economy.^{62–64} Yet, although Hasselmann himself contributed to the early development of IAMs,^{65,66} the climate modules of IAMs are generally based on deterministic EBMs.^{67}

Climate stochasticity, when it is acknowledged at all, most commonly enters in the form of uncertain temperature thresholds (effectively “tipping points”) at which the model’s parameter values suddenly change.^{68}

One strand of economic research has added noise to the temperature dynamics directly.^{69–71} The main objective has been to study how fast we can learn the model’s parameter values—the equilibrium climate sensitivity, in particular—when we observe a noisy temperature time series. Anticipated learning tends to reduce the effect of parameter uncertainty on the optimal mitigation policy, but there is no agreement on whether this anticipated-learning channel is quantitatively meaningful.^{72}

One possible reason that the research on IAMs has paid so little attention to temperature stochasticity is that this source of uncertainty does not affect the *marginal* warming resulting from an additional tonne of carbon emissions. As a consequence, temperature stochasticity itself has very little effect on the optimal mitigation policy.^{73,74} However, a recent paper by Calel *et al.*,^{73} which perhaps offers the first example of a Hasselmann EBM coupled to an economic damage function, points out that temperature variability could still greatly affect optimal investment in adaptation. They conclude that a forward-looking social planner would be willing to make substantial investments to avoid or reduce the effects of temperature fluctuations. Evidently, Hasselmann’s “obvious” idea still has much to teach us.

## VI. BUT SHOULD STOCHASTIC CLIMATE MODELS REALLY BE MARKOVIAN?

### A. Leith’s “infrared climate problem”

In addition to Mitchell and Hasselmann, a third great climate pioneer, Leith, argued for the use of Langevin models in the mid-1970s. In his case, this was explicitly because of their connection with fluctuation–dissipation relations. As he put it at an NCAR conference on statistics and climate in 1994,^{75} *“a stochastic climate model that incorporates the fluctuation–dissipation relation can be devised and would provide a crude estimate of climate sensitivity. This model would be of the Langevin type.”*

However, by then, he had identified a problem of enduring relevance that *“there is evidence that this kind of model, would not on its own be satisfactory to capture some of the low frequency phenomena observed in the atmosphere.”* He referred to this, in a phrase that has sadly not gained currency, as the *“infrared climate problem.”* This was not a literal reference to thermal radiation but rather to an “infrared catastrophe” in fluctuations in temperature, i.e., the *“piling up of extra variance at low frequencies.”*

The Nobel Committee’s article^{9} illustrates and acknowledges the continuing relevance of Leith’s worry. It points out that the Markovian nature of Hasselmann-type models results in an exponential decay of their autocorrelation functions, of the form $C(t) \propto e^{-\Lambda t}$.

### B. Phenomenological evidence for LRM in climate

The simplest stochastic process with memory, the AR(1) process of Eq. (4), is Markovian: it is formulated only in terms of the instantaneous value of its variable, so its memory extends back just one step. It can additionally be given random initial conditions, although this is not always the relevant case for climate, where, for example, the initial temperature anomaly may be set to zero.

The emergent physics of the earth system on larger scales need not be similarly memoryless, though, and in fact, the Nobel Committee’s summary states that in contrast to the atmosphere *“the ocean has a very long memory of events, at least in the hundreds … 1000 year range.”* As we noted in our introduction, the committee then remarks on the increasing awareness of effectively infinite memory and that *“this suggests a potential self-similarity or fractal character,”* citing Moon *et al.*’s study^{76} in support of their statement. Moon *et al.* in their turn refer to the by now quite extensive literature on long-range dependence in climate, from several research teams. We note, in particular, the groups at Tromsø (Refs. 47, 77, and 78) and Montreal,^{16,33,79} and the recent PAGES CVAS group.^{12}

In Ref. 14, Lovejoy addressed the issue of LRM directly. He compared Frankignoul and Hasselmann’s original Markovian power spectral fit^{48} to 16 years of sea surface temperature data from the Ocean Weather Ship India, with the data now available from 452 stations over about 100 years. He argued that a compound power law fit, $S(f) \sim f^{-0.6}$ at low frequencies and $\sim f^{-1.8}$ at higher frequencies, is much more satisfactory.

### C. Theoretical approaches to LRM in climate

In other papers,^{16,33} Lovejoy and colleagues have argued for a fractional energy balance equation (FEBE) as a method for incorporating long-range memory into EBMs.

In Sec. VII, we will emphasize that there was already a well-studied and standard way, the Mori–Kubo GLE,^{80–83} to incorporate memory in a Langevin equation. The Mori–Kubo GLE is suitable for use as a univariate stochastic equation and, thus, for adding memory to EBMs of the Hasselmann type. We show that FEBE, but with a Caputo fractional derivative rather than the Riemann–Liouville type used by Lovejoy *et al.*, corresponds to one particular limit of this equation.

## VII. WHAT IS A GENERALIZED LANGEVIN EQUATION?

As frequently happens when mathematics and physics intersect, the terminology used in this subject varies, which has sometimes caused confusion.

An example is the otherwise excellent Nobel Committee article,^{9} which describes the SDE corresponding to the O–U process as a “generalized Langevin equation.” It is literally so, but only in the relatively limited sense that both the underdamped and overdamped Langevin equations have the same mathematical form, and that any other linearly damped model with additive white noise will also obey an SDE of this same form, whether physical in origin or not.

Two other equations that much more substantially merit the description “generalized” are

(i) the equation which has been widely studied in applied mathematics as a concise statement of the Mori–Zwanzig decomposition, and which is occasionally referred to as the GLE,^{8,22} and

(ii) the stochastic Mori–Kubo GLE, which has found widespread application in statistical physics.

The Mori–Zwanzig decomposition has played an important conceptual role in formalizing an approach to modeling the dynamics of a few preferred degrees of freedom in the presence of many other unobserved ones in many areas of science, notably quantum mechanics. The resulting equations are sometimes directly derived from Hamiltonian equations of motion for the whole system, for example, in a reservoir plus subsystem model.^{22} The technique is not restricted to such cases, though, and it has also become arguably the most influential paradigm for thinking about how to make the Hasselmann program mathematically rigorous and has been employed in climate science to order empirical modes^{20} (see also Ref. 8).

The more common usage in statistical physics^{25–29} is to refer to the linear integrodifferential Langevin equation of Mori and Kubo, with a memory kernel $\gamma(t-t')$ on its velocity term, as the GLE:

$$M \frac{dv}{dt} = -\int_{0}^{t} \gamma(t-t') \, v(t') \, dt' + \nu(t).$$

## VIII. THE MORI–KUBO GENERALIZED LANGEVIN EQUATION: A NEW APPROACH TO THE INFRARED CLIMATE PROBLEM?

### A. The overdamped Mori–Kubo GLE

The overdamped Mori–Kubo GLE takes the form

$$\int_{0}^{t} \gamma(t-t') \, \frac{dy}{dt'} \, dt' = -\frac{dV}{dy} + \nu(t).$$

We note that the memory kernel is on the left hand side of this equation, on the term in $dy/dt$. We also remark that the same FDR as that for the underdamped Mori–Kubo GLE also applies here. The power spectrum of the noise $\nu (t)$ is matched to that of the kernel $\gamma (t-t')$, and so, as above, $\nu (t)$ is no longer white noise.

It is important to realize that while mathematically this is adding memory to the Langevin model, physically it is simply making explicit something that was already known, the presence of finite collision times in physical Brownian motion.

### B. An EBM with memory

We remark that an integral over a time-dependent heat capacity $C_P(t-t')$ has replaced the previous constant $C$. We do not prescribe the form of the noise, so the notation $\nu (t)$ does not presume that a fluctuation–dissipation relation exists.
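One possible discretization of such a memory-kernel EBM (our sketch; the exponential kernel and all parameter values are illustrative assumptions, not prescriptions) solves the convolution equation $\int_0^t C_P(t-t')\, (d\Delta T/dt')\, dt' = -\lambda \Delta T + \nu(t)$ step by step for the rate of change, here with $\nu = 0$ so that an initial anomaly simply relaxes:

```python
import numpy as np

# Exponential kernel C_P(s) = (C/tau) * exp(-s/tau), which integrates to the
# constant C and recovers C * dDeltaT/dt = -lam * DeltaT as tau -> 0.
C, lam, tau = 1.0, 1.0, 0.5
dt, n = 0.005, 2000

K = (C / tau) * np.exp(-np.arange(n) * dt / tau)  # kernel at lags 0, dt, ...
T = np.empty(n + 1)
u = np.zeros(n)        # u[k] approximates dDeltaT/dt at step k
T[0] = 1.0             # initial temperature anomaly
for k in range(n):
    # rectangle-rule convolution over the past, then solve for u[k]
    past = np.dot(K[1:k + 1][::-1], u[:k]) * dt if k else 0.0
    u[k] = (-lam * T[k] - past) / (K[0] * dt)
    T[k + 1] = T[k] + u[k] * dt
```

The anomaly relaxes toward zero at the modified rate $\lambda/(C + \lambda\tau)$; the crude rectangle rule produces some ringing over the first few steps, which a smoother quadrature would damp.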

### C. A Mori–Kubo GLE with power law kernel: The fractional Langevin equation

The resulting equation is known as the fractional Langevin equation (FLE), which has been studied by many authors^{84–87} following the pioneering investigation by Mainardi and Pironi^{88} of the $\alpha =3/2$ case.

The notation $\nu_{\alpha}$ here acknowledges the presence of a relationship, the FDR,^{89,90} between the noise and the damping kernel.

### D. Fractional derivatives

The integrodifferential form of the FLE is sometimes simplified by the use of a fractional derivative. Fractional calculus forms a large and growing field of research in itself (see, e.g., Refs. 91–93), so to aid the reader, we have included the Appendix to summarize those aspects we have used here. We again emphasize that the FLE is only one special case of the more general Mori–Kubo GLE, and that we remain “agnostic” about its relevance to the climate system, preferring to suggest a model with a more flexible memory kernel.

The Caputo fractional derivative is defined^{94,95} for $0<\alpha <1$ as

$${}_{a}^{C}D_{t}^{\alpha} f(t) = \frac{1}{\Gamma(1-\alpha)} \int_{a}^{t} \frac{f'(t')}{(t-t')^{\alpha}} \, dt'.$$

As well as its order $\alpha$, the Caputo derivative has a second parameter, the lower limit $a$ of the integral in its definition. The Caputo fractional derivative has physically more intuitive initial conditions than the most widely used alternative to it, the Riemann–Liouville derivative, because they are specified in terms of integer-order rather than fractional-order derivatives of physical quantities. However, as noted by Wei *et al*.,^{95} this problem with the Riemann–Liouville derivative can be circumvented by the Schneider–Weiss formulation.
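The Caputo definition can be checked numerically with the standard L1 discretization (our sketch, not taken from Refs. 94 and 95): for $f(t)=t$ and $a=0$, the Caputo derivative of order $\alpha$ is $t^{1-\alpha}/\Gamma(2-\alpha)$, and the L1 scheme reproduces it essentially exactly for this linear test function:

```python
import math
import numpy as np

def caputo_l1(f, alpha, dt):
    """L1 approximation of the Caputo derivative of order 0 < alpha < 1,
    evaluated at the final grid point, for samples f[0..n] on a uniform grid."""
    n = len(f) - 1
    j = np.arange(n)
    b = (j + 1) ** (1 - alpha) - j ** (1 - alpha)   # L1 weights
    df = np.diff(f)[::-1]                           # increments, newest first
    return np.dot(b, df) / (math.gamma(2 - alpha) * dt ** alpha)

alpha, dt, n = 0.5, 0.001, 1000
t = np.arange(n + 1) * dt           # grid on [0, 1], lower limit a = 0
approx = caputo_l1(t, alpha, dt)    # Caputo derivative of f(t) = t at t = 1
exact = 1.0 ** (1 - alpha) / math.gamma(2 - alpha)
```

For $\alpha = 1/2$ the exact value at $t=1$ is $1/\Gamma(3/2) = 2/\sqrt{\pi}$; the scheme is exact here because the increments of $f(t)=t$ are constant.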

The stochastic FLE has also been studied in the mathematics literature,^{96–100} typically under the name “Caputo fractional SDE,” though usually specialized to the case of white noise driving.

### E. The LRM EBM special case and its relation to FEBE

We note that although the integral in the Caputo derivative is frequently started at $t_m=0$ (including, for example, in its current implementation in Mathematica), its definition allows it to start anywhere on the real line.^{101} It, thus, naturally encompasses the $t_m=-\infty$, “Weyl” case to which Lovejoy^{33} has drawn attention in the climate context, without additional effort.

## IX. SOLVING THE MORI–KUBO GLE AND THE FLE ANALYTICALLY

In this section and the following one, we have used the words “analytic,” “numerical,” and “solution” as is commonly done in the physics literature. The mathematics literature has different conventions and the reader is advised to beware.

### A. General solution of the overdamped Mori–Kubo GLE

There are by now numerous sources in the literature for analytic solutions of the Mori–Kubo GLE, starting from near-contemporary work such as that of Fox.^{83} They typically use Laplace transform methods, e.g., those of Ref. 86.

The cases treated in Ref. 55 were relatively tractable, so they should, in principle, also allow some special cases of our proposed stochastic EBM to be solved analytically, a problem to which we expect to return in future work.

### B. Solving the FLE by Green’s functions

#### 1. The overdamped FLE

For now, though, we will specialize to the fractional special case of the overdamped Mori–Kubo GLE, i.e., the overdamped FLE. We recall that this will specialize further for $\alpha =1$ to the familiar exponential solutions of the Hasselmann model.

The overdamped FLE, both with and without deterministic forcing, has been studied in the mathematics and statistical physics literature, e.g., in 2017 by Li *et al.* in Ref. 102. With a Riemann–Liouville derivative rather than the Caputo type, it has been studied by Lovejoy and co-workers as a stochastic climate model since 2019, and dubbed “the fractional energy balance equation.”

#### 2. The delta function-forced Caputo FDE and Green’s function of the FLE

With an appropriate initial condition, this gives the impulse response of the fractional operator, which can then be used to construct the Green’s function that solves either the deterministic or stochastic versions of the inhomogeneous equation.

Figure 1 plots $x(t)$ for values of $\alpha $ from $1$ to $1/2$, and $k=1$. On the semilog axes chosen, we can see that the $\alpha =1$ case is exponential, while smaller values of $\alpha $ correspond to more slowly decaying tails in the Mittag–Leffler function $E_{\alpha ,\alpha }(t)$ and to an increasingly singular “spike” at zero from the power law prefactor. It may be compared with Figs. 2 and 3 of Ref. 16, which used a Riemann–Liouville derivative.
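The behavior described above can be reproduced directly from the series definition of the two-parameter Mittag–Leffler function, $E_{a,b}(z)=\sum_{j\geq 0}z^{j}/\Gamma (aj+b)$. The sketch below (our own illustration, with arbitrarily chosen evaluation points) computes the impulse response $g(t)=t^{\alpha -1}E_{\alpha ,\alpha }(-kt^{\alpha })$ and confirms that the $\alpha =1$ case collapses to the exponential $e^{-kt}$, while smaller $\alpha $ gives a heavier tail:

```python
import math

def mittag_leffler(a, b, z, terms=120):
    """Two-parameter Mittag-Leffler function via its power series;
    adequate for the moderate |z| used here."""
    return sum(z ** j / math.gamma(a * j + b) for j in range(terms))

def impulse_response(t, alpha, k):
    """Green's function t^(alpha-1) E_{alpha,alpha}(-k t^alpha)."""
    return t ** (alpha - 1) * mittag_leffler(alpha, alpha, -k * t ** alpha)

# alpha = 1 recovers the exponential (Hasselmann/OU) case
for t in (0.5, 1.0, 2.0):
    print(impulse_response(t, 1.0, 1.0), math.exp(-t))

# smaller alpha gives a more slowly decaying tail at larger t
print(impulse_response(5.0, 0.5, 1.0), impulse_response(5.0, 1.0, 1.0))
```

The truncated power series is only reliable for moderate arguments; production codes (such as the mlf routine used later in the paper) switch to asymptotic or integral representations for large $|z|$.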

#### 3. Step function forcing of the Caputo FDE and FLE

This function is illustrated in Fig. 2, which plots $x(t)$, again for values of $\alpha $ from 1 to 0.5, but now for $k=0.1$. The topmost (blue) line is the $\alpha =1$, exponential, case, and $x_0=0$.

The effect of memory for the cases where $\alpha <1$ is clear in lengthening the time the solution takes to reach its asymptotic value. In the simplest forcing experiment for a stochastic EBM, such as a doubling of $\mathrm{CO}_2$, this would correspond to the approach to a new equilibrium value of the GMT anomaly.
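For a unit Heaviside step switched on at $t=0$ with $x_0=0$, the Caputo relaxation equation has the standard closed-form response $x(t)=(1-E_{\alpha }(-kt^{\alpha }))/k$, where $E_{\alpha }=E_{\alpha ,1}$ is the one-parameter Mittag–Leffler function; we restate this textbook result here for orientation, with illustrative parameter values of our own choosing. A minimal check that $\alpha =1$ recovers the Hasselmann-type exponential approach $(1-e^{-kt})/k$, and that memory slows the late-time approach to the asymptote $1/k$:

```python
import math

def ml(a, b, z, terms=100):
    # two-parameter Mittag-Leffler series (moderate |z| only)
    return sum(z ** j / math.gamma(a * j + b) for j in range(terms))

def step_response(t, alpha, k):
    """Response of the Caputo relaxation equation to a unit Heaviside step."""
    return (1.0 - ml(alpha, 1.0, -k * t ** alpha)) / k

k = 0.1
print(step_response(10.0, 1.0, k), (1 - math.exp(-k * 10.0)) / k)
# memory (alpha < 1) lies below the exponential case at this late time
print(step_response(10.0, 0.7, k), step_response(10.0, 1.0, k))
```

Note that at very early times the fractional response actually rises *faster* than the exponential one, because of the singular kernel; the memory-induced slowing refers to the long-time approach to equilibrium.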

#### 4. The overdamped FLE with arbitrary deterministic and stochastic forcing

We finally come to the full overdamped forced FLE (31), again with $t_m=0$, when the stochastic term has also been included. This is too general to have a useful analytic solution, but we can make some comments to help elucidate its meaning.

The above stochastic integral is best understood in the $H=1/2$ case of white noise when it can be described by Itô calculus, although there is by now also a substantial mathematical literature concerning integrals with respect to fractional Brownian motion^{94,104} (fBm).

## X. NUMERICALLY EVALUATING TRAJECTORIES OF THE MORI–KUBO GLE AND FLE

As is so often the case, analytical intractability of the most physically interesting and relevant SDEs means that we must evaluate them numerically. We discuss this below, first for the Mori–Kubo GLE and then for its fractional special case.

### A. Numerics for the Mori–Kubo GLE

There are by now several modern sources of numerical algorithms for the stochastic Mori–Kubo GLE, e.g., Refs. 28, 29, 105, and 106.

The algorithm of Lim^{28} requires that the system of SDEs to be solved must be “Bohl,” i.e., that the matrix elements of the memory kernel $K(t)$ are finite linear combinations of functions of the form $t^{k}e^{\beta t}\cos (\omega t)$ and $t^{k}e^{\beta t}\sin (\omega t)$, where $k$ is an integer and $\beta $ and $\omega $ are real numbers. As Lim^{28} notes, when the kernel instead decays as a power law, for example, the resulting fractional Mori–Kubo GLE cannot be studied as a finite-dimensional SDE system. Although one can work formally in an infinite-dimensional setting, numerics will typically resort (e.g., Ref. 107) to approximating the power law, frequently by a weighted sum of exponentials.
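As a sketch of the kind of approximation involved (our own illustration; Ref. 107 should be consulted for the schemes actually used there), a power law kernel can be written as a weighted sum of exponentials by discretizing its Laplace-integral representation, here $t^{-1/2}=\Gamma (1/2)^{-1}\int_0^{\infty }u^{-1/2}e^{-ut}\,du$, on a logarithmic grid:

```python
import math
import numpy as np

# Approximate K(t) = t^(-1/2) by sum_j w_j exp(-r_j t), via trapezoidal
# quadrature of its Laplace representation with the substitution u = e^x.
h = 0.25
x = np.arange(-20.0, 15.0 + h / 2, h)          # log-spaced quadrature nodes
rates = np.exp(x)                               # decay rates r_j
weights = h * np.exp(x / 2) / math.gamma(0.5)   # quadrature weights w_j

t = np.logspace(-1, 1, 50)                      # test range 0.1 <= t <= 10
approx = (weights[:, None] * np.exp(-np.outer(rates, t))).sum(axis=0)
rel_err = np.max(np.abs(approx - t ** -0.5) / t ** -0.5)
print(len(rates), "modes, max relative error:", rel_err)
```

The number and placement of the exponential modes control both the time window over which the power law is reproduced and the size of the resulting finite-dimensional (Markovian) system.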

### B. Numerics for the overdamped FLE

The overdamped FLE can be solved in a number of ways. Lovejoy *et al.*’s work^{15,16} on FEBE has so far used Fourier methods similar to those widely used to simulate fBm and its increment process, fGn, and has focused on an initial condition specified at $t_m=-\infty $. In our paper, we will instead consider finite $t_m$, which allows the infinitely distant case as a limit, and we wish to use a non-Fourier based algorithm for its flexibility in potential future developments.

Our approach is illustrative, rather than rigorous. We use the explicit solution of the overdamped FLE and evaluate it at each required value of $t$. We do this by discretizing $t$ using the definitions of the stochastic and deterministic integrals as the limits of sums. However, we must stress that even in the O–U case, it is known (e.g., Ref. 24) that this algorithm will **not** ensure that a solution will have the required autocorrelation properties. This is why our results must be considered illustrative rather than definitive for now.

In our future work, in order to achieve the required ACF, a time evolution equation rather than an integral will be needed. A prototype of such algorithms was derived by Gillespie^{24,108} for the O–U case. Candidates include the spectral method of Ref. 98, and the recently developed fast Euler–Maruyama algorithm given in Refs. 109 and 110.

Here, we also restrict ourselves to the white noise-forced case, but with the additional deterministic forcing term, and intend to explore colored noise forcing in future work.

In Fig. 3, we illustrate the behavior of the FLE by plotting 100 sample paths obtained this way for forcing by a Heaviside step function $\Theta (t)$ and a stochastic term of amplitude $\bar{\sigma }=0.16$. We used MATLAB’s function boxplot, where for each box the central mark indicates the median, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points not considered outliers, and the outliers are plotted individually using the “+” marker symbol. Parameters were $t_m=0$, $x(t_m)=0$, $k=0.1$, and $\alpha =0.7$. The mean of the paths is overplotted and is seen to compare well with its known analytic result.
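A minimal Python analog of this experiment (illustrative only, like the MATLAB version above, and with the Mittag–Leffler function evaluated from its truncated series rather than by mlf) builds each sample path as the analytic step response plus a left-point (Itô) discretization of the stochastic integral of the Green's function against white noise increments, and checks that the ensemble mean stays close to the known deterministic solution:

```python
import math
import numpy as np

def ml(a, b, z, terms=100):
    # two-parameter Mittag-Leffler series (moderate |z| only)
    return sum(z ** j / math.gamma(a * j + b) for j in range(terms))

alpha, k, sigma = 0.7, 0.1, 0.16
T, n, n_paths = 20.0, 400, 500
dt = T / n
t = dt * np.arange(1, n + 1)

# Green's function g(t) = t^(alpha-1) E_{alpha,alpha}(-k t^alpha)
g = np.array([s ** (alpha - 1) * ml(alpha, alpha, -k * s ** alpha) for s in t])
# deterministic response at time T to a unit Heaviside step, x0 = 0
x_det = (1.0 - ml(alpha, 1.0, -k * T ** alpha)) / k

rng = np.random.default_rng(1)
dW = rng.normal(0.0, math.sqrt(dt), size=(n_paths, n))
# x(T) = x_det + sigma * sum_j g(T - t_j) dW_j  (left-point / Ito sum)
x_T = x_det + sigma * dW @ g[::-1]
print("ensemble mean:", x_T.mean(), "analytic:", x_det)
```

As stressed in the text, such a direct discretization of the integral solution is illustrative: it does not by itself guarantee the correct autocorrelation structure of the paths, for which a time-evolution algorithm is needed.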

### C. Numerics for our proposed Mori–Kubo GLE-type EBM driven by RCP 8.5

We now illustrate our proposed EBM by driving it with RCP 8.5,^{111} a (worst case) model of anthropogenic forcing. We first write the fractional EBM, with its parameters and scalings made more explicit, as

where (see, e.g., Ref. 33) $\tau_H=C(\alpha )/\lambda $ is the Hasselmann time, and the noise term has variance $\bar{\sigma }^{2}$. We used MATLAB's quadrature function trapz to speed up the deterministic part of the integration, a step size of 0.1 years, and, as in Sec. X B, used Podlubny and Kacenak's function mlf from the MATLAB contributed toolbox to evaluate the Mittag–Leffler functions. Figure 4 is a box plot of 64 paths for $\Delta T$, again with the, at present arbitrary, choice $\alpha =0.7$, and with representative values $\lambda =1.23$, $\tau_H=25.78$ years, and $\bar{\sigma }=0.1$. The appropriate choice of parameter values for such models remains an area of active research.^{16} Future, more detailed investigations will, rather than quadrature, use a more accurate SDE solver suitable for all GLE-derived EBMs, not just fractional ones, as discussed above.

## XI. ARE THERE ALTERNATIVE STOCHASTIC GLE-BASED EBMs?

Although this paper is focused on a stochastic EBM inspired by a mapping to the overdamped Mori–Kubo Langevin equation, other options exist—some of which have been explored so far in the literature.

### A. Mori–Kubo GLE-based EBMs and other intrinsically stochastic approaches

In addition to Lovejoy *et al.*’s FEBE discussed in some detail above, and the overdamped GLE-based EBM we have proposed here, there have been several other proposals that extend the Hasselmann formalism to include multiscale memory. We enumerate some below. As noted in the introduction, Ghil *et al.*^{19} have argued that a key requirement of any such model is the ability to add quasiperiodic determinism as well as stochasticity.

#### 1. Modified fractional Brownian motion

#### 2. Direct mapping to underdamped generalized Langevin equation rather than overdamped GLE

In our earlier conference proceedings chapter^{31} and subsequent preprint,^{32} we considered a different GLE-based EBM. Here, the mapping was made from the underdamped Langevin equation rather than the overdamped Langevin equation.

This results in a mapping from velocity onto GMT anomaly, rather than from position. We erroneously thought that in its fractional special case, our suggested stochastic EBM was isomorphic to FEBE after suitable conversions between their parameters, but in fact, the comparisons made in Sec. VIII of the present paper show that FEBE is much closer to the overdamped Langevin equation and, thus, to our newer proposal.

#### 3. Two box models

The two box EBM is an interesting alternative which may be useful in some cases. It replaces the single decay scale of Hasselmann’s model with two. This allows, for example, the atmosphere and ocean timescales to be explicitly represented.

As shown in Ref. 45, the two box model results in a second order SDE with constant coefficients, isomorphic to the underdamped Langevin equation for position.

#### 4. Multibox models

The above methods can be extended to the case of many relaxation time scales, an approach studied extensively for SDEs in the applied mathematics literature and also in climate science.^{22,46} Eigenvalue analysis can be used to predict some of the spectral properties of such multibox models, as done, for example, by Fredriksen and Rypdal.^{47}
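For a linear two box model, for example, the relaxation timescales follow directly from the eigenvalues of the system matrix, in the spirit of the eigenvalue analysis cited above. The sketch below is our own illustration, with arbitrary parameter values (heat capacities $C_1$, $C_2$, feedback $\lambda $, and box coupling $\gamma $); the generic outcome is two distinct real decay scales, one fast and one slow:

```python
import numpy as np

# Two box EBM: C1 dT1/dt = -lam*T1 - gam*(T1 - T2),  C2 dT2/dt = gam*(T1 - T2)
# (illustrative parameter values only)
C1, C2, lam, gam = 8.0, 100.0, 1.2, 0.7
A = np.array([[-(lam + gam) / C1, gam / C1],
              [gam / C2, -gam / C2]])
eigvals = np.linalg.eigvals(A)
timescales = np.sort(-1.0 / eigvals.real)
print("relaxation timescales:", timescales)
```

Because $A=-C^{-1}K$ with $K$ symmetric positive definite, the eigenvalues are guaranteed real and negative, so the response is a sum of two decaying exponentials rather than an oscillation.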

#### 5. The fractional Ornstein–Uhlenbeck equation

The Langevin equation can be modified to be driven by fractional Gaussian noise rather than white noise while still keeping the friction term constant. This has been studied in the mathematical finance literature extensively^{94} and has been suggested in the climate context.^{113} These authors also develop a time series model^{113,114} based on the discrete time autoregressive fractionally integrated moving average ARFIMA process, which generalizes AR(1).
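As a small illustration of the fractional Gaussian noise driver involved (our own sketch, not the cited authors' code), fGn can be sampled exactly from its autocovariance $r(k)=\tfrac{1}{2}(|k+1|^{2H}-2|k|^{2H}+|k-1|^{2H})$ by a Cholesky factorization, with Hurst exponent $H>1/2$ giving the persistent, long-range dependent case:

```python
import numpy as np

def fgn_cov(k, H):
    """Autocovariance of unit-variance fractional Gaussian noise at lag k."""
    return 0.5 * (abs(k + 1) ** (2 * H) - 2 * abs(k) ** (2 * H)
                  + abs(k - 1) ** (2 * H))

def sample_fgn(n, H, rng):
    """Exact fGn sample of length n via Cholesky factorization of its covariance."""
    cov = np.array([[fgn_cov(i - j, H) for j in range(n)] for i in range(n)])
    L = np.linalg.cholesky(cov)  # covariance is positive definite
    return L @ rng.standard_normal(n)

rng = np.random.default_rng(0)
noise = sample_fgn(256, 0.7, rng)
# H > 1/2: positive lag-1 autocovariance (persistence); H < 1/2: negative
print(fgn_cov(1, 0.7), fgn_cov(1, 0.3))
```

The $O(n^2)$ Cholesky approach is exact but slow; the Fourier (circulant embedding) methods mentioned in Sec. X B are the standard fast alternative for long series.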

#### 6. Yuan *et al.*’s generalized stochastic time series model

A fractional time series model, not simply isomorphic to ARFIMA or other existing models, has been proposed^{13} to systematically separate the “forcing-induced direct” and the “memory-induced indirect” trends in data.

### B. Other stochastic Mori–Zwanzig GLE-based EBMs

As noted in our introduction, the Mori–Kubo GLE is in principle a subset of the more complete Mori–Zwanzig formalism, which has at its heart a deterministic identity expressing the division of a system into observed and unobserved degrees of freedom.^{27} Space does not permit a full discussion of this here, but we refer the reader to a short introduction,^{23} and reviews of the method^{21} and its applications to climate.^{8}

## XII. FLUCTUATION–DISSIPATION RELATIONS

### A. Emergent parameters in the OU process

The meaning of the phrase “fluctuation–dissipation relation” can sometimes be as simple as its usage by the Nobel Committee,^{9} who noted that the variance $\langle y^{2}\rangle $ of the OU process, which solves the overdamped Langevin equation (9), is proportional to the ratio of the noise amplitude $\bar{\sigma }^{2}$ to the mean reversion term $\bar{\gamma }$. The Nobel document appears to us to be stressing that **any** system to which the O–U process has been applied must in consequence obey this limited kind of FDR, because there will then always be a ratio $\bar{\sigma }^{2}/2\bar{\gamma }$, and it will empirically determine the value of $\langle y^{2}\rangle $. This is not a trivial observation in the climate context because either the value of $\langle y^{2}\rangle $ or the noise amplitude $\bar{\sigma }$ are in this sense “emergent” parameters whose values are unknown *a priori*, cf. the FDR-like relation found in Ref. 54.
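This limited FDR is straightforward to verify numerically. The sketch below (ours, with arbitrary values of $\bar{\gamma }$ and $\bar{\sigma }$) simulates the O–U process with Gillespie's exact update, of the kind cited earlier in Sec. X, and compares the sample variance with the prediction $\bar{\sigma }^{2}/2\bar{\gamma }$:

```python
import math
import numpy as np

gamma_, sigma_ = 1.0, 0.5          # mean reversion and noise amplitude
dt, n = 0.1, 200_000
rng = np.random.default_rng(0)

# Gillespie's exact O-U update: y_{n+1} = y_n e^{-gamma dt} + s * xi_n,
# with s chosen so the update reproduces the exact transition density
decay = math.exp(-gamma_ * dt)
s = math.sqrt(sigma_ ** 2 / (2 * gamma_) * (1 - decay ** 2))
y = np.empty(n)
y[0] = 0.0
xi = rng.standard_normal(n - 1)
for i in range(n - 1):
    y[i + 1] = y[i] * decay + s * xi[i]

var_pred = sigma_ ** 2 / (2 * gamma_)     # FDR-type prediction for <y^2>
print(y[n // 10:].var(), var_pred)        # discard the initial transient
```

The point of the exercise is the one made in the text: whichever two of $\langle y^{2}\rangle $, $\bar{\sigma }$, and $\bar{\gamma }$ are fixed empirically, the third is then determined.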

### B. The FDR in the Brownian motion

In the physical context (the motion of a tracer particle in a fluid) from which the underdamped and overdamped Langevin equations originated,^{4,24} however, this ratio was not a free or empirically determined parameter. It was instead constrained to be proportional to the temperature $T$ of the surrounding heat bath. A proportionality to $T$ is seen both in the Brownian motion special case of Kubo’s “FDR2,”^{89,90,116–118} i.e., the ratio of the noise amplitude to the mean reversion (friction) term in the underdamped Langevin equation, and in a corresponding expression arising from the overdamped Langevin equation. Importantly, these fluctuations are entirely “internal,” and the fast and slow terms in the Langevin equation arose from the same physics. This is not straightforwardly the case when an EBM is written down or derived for the climate problem, and extensive debates have occurred about the applicability or otherwise of the FDR in a climate context.^{3,4}

### C. The FDR as seen in the Mori–Kubo GLE

If we naively take this functional form of the GLE over to the climate context, however, the integral operator is doing a physically different job. It is now an integral over a heat capacity $C_{P}(t-t')$, and so an “FDR” of the classic GLE type would relate the covariance structure of the heat capacity to that of the noise $\nu (t)$.

This would still be doing essentially the same thing in the sense of balancing the noise and the kernel, scale by scale, in order to allow the system to relax to a steady state, but further investigation would be necessary to establish whether it made physical sense.

## XIII. CONCLUSIONS

Hasselmann’s realization that the Brownian motion could be used as a mathematical superstructure to organize fast weather and slow climate fluctuations was a very powerful one, and his recent Nobel Prize provided a timely context for our review. As we have shown, recent progress in going beyond his paradigm has generated what at first sight seem to be divergent positions, exemplified by the recent reviews of Franzke *et al.* in Ref. 10 and Lovejoy in Ref. 14.

In this paper, we have sought to facilitate an improved dialogue between these views, and have found them to have more points of contact than was initially apparent. We first drew attention to the existing climate work using the Mori–Zwanzig approach and proposed using the stochastic GLE of Mori and Kubo as a framework to study memory in one-dimensional stochastic EBMs. We then related the stochastic GLE-based EBM to Lovejoy’s fractional energy balance equation and concluded with some speculations on fluctuation–dissipation relations.

## ACKNOWLEDGMENTS

It is a pleasure to dedicate this paper to Professor Juergen Kurths on the occasion of his 70th birthday and to thank him for his interest in and contributions to this topic. We are also particularly grateful to Ian Ford for many valuable discussions and to Nick Moloney for a thorough reading of the manuscript. We acknowledge valuable interactions with Lenin del Rio Amador, Peter Cox, Holger Kantz, Paul Kushner, Valerio Lembo, Shaun Lovejoy, Valerio Lucarini, Lauren Padilla, Cecile Penland, Sid Redner, Kira Rehfeld, Kristoffer Rypdal, Martin Rypdal, Jan Sieber, Geoff Vallis, and Dmitry Vyushin. N.W.W. and S.C.C. acknowledge AFOSR (Grant No. FA9550-17-1-0054) and STFC (Grant No. ST/T000252/1) and, with D.A.S. and R.C., the Warwick International Partnership Fund. N.W.W. and S.C.C. acknowledge the hospitality of the Max Planck Institute for the Physics of Complex Systems, the University of Potsdam, the Potsdam Institute (PIK), the Santa Fe Institute, the International Space Science Institute, and the CliMA project at Caltech, during the development of this work. N.W.W. also acknowledges past support from the London Mathematical Laboratory, KLIMAFORSK (Project No. 229754), and ONR (Grant No. NICOP-N6209-15-1-N143). S.C.C. also acknowledges an ISSI Johannes Geiss Fellowship. A.V.C. acknowledges support of the Polish National Agency for Academic Exchange (NAWA). D.A.S. acknowledges support from the Grantham Research Institute on Climate Change and the Environment at the London School of Economics, the ESRC Centre for Climate Change Economics and Policy (CCCEP; Ref. ES/R009708/1), and the Natural Environment Research Council through Optimising the Design of Ensembles to Support Science and Society (ODESSS; Ref. NE/V011790/1). N.W.W. and R.K. thank the Isaac Newton Institute Cambridge for participation at the program on Mathematics for the Fluid Earth in 2013.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Nicholas W. Watkins:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **Raphael Calel:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **Sandra C. Chapman:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **Aleksei Chechkin:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **Rainer Klages:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). **David A. Stainforth:** Conceptualization (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

### APPENDIX: SUMMARY OF FRACTIONAL CALCULUS

Because of their relationship to long-range dependence in time series and heavy-tailed amplitude distributions, fractional differential equations (FDEs) and fractional stochastic differential equations (FSDEs), such as those in the present paper, have also been extensively studied in complexity science. In order to make the paper more accessible to readers who have had little need to deal with fractional integration and differentiation, we give here a very brief summary of the results which we use. Fractional derivatives and the associated FDEs are now implemented directly in some numerical packages, such as Mathematica, whose demonstrations we encourage the reader to explore.

#### 1. Fractional integral

The form of the right hand side of this expression shows why we can then allow $n$ to be replaced by a non-integer value $\alpha $, thus defining an integral of fractional order. In defining a fractional integral, we use the relationship between the factorial and the gamma function, $\Gamma (n)=(n-1)!$, the discovery of which was a consequence of the earliest work on fractional calculus.^{123}

Note that this definition requires that $\alpha >0$. Typically, $\alpha <1$ is also assumed, but this is not obligatory.^{101} Other notations exist for this integral, e.g., $(I_{a+}^{\alpha }f)(x)$, used widely in the mathematics literature.^{91}
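The fractional integral lends itself to a simple numerical check. The sketch below (ours) evaluates the Riemann–Liouville integral $(I^{\alpha }f)(t)=\Gamma (\alpha )^{-1}\int_0^{t}(t-s)^{\alpha -1}f(s)\,ds$ by product integration, treating the power law kernel exactly on each subinterval, and compares the result with the known formula $I^{\alpha }t=t^{1+\alpha }/\Gamma (2+\alpha )$:

```python
import math
import numpy as np

def frac_integral(f, t, alpha, n=1000):
    """Riemann-Liouville fractional integral of order alpha at time t,
    by midpoint product integration with the kernel integrated exactly."""
    s = np.linspace(0.0, t, n + 1)
    mid = 0.5 * (s[:-1] + s[1:])
    # exact integral of (t - s)^(alpha - 1) over each subinterval
    kernel = ((t - s[:-1]) ** alpha - (t - s[1:]) ** alpha) / alpha
    return (f(mid) * kernel).sum() / math.gamma(alpha)

alpha, t = 0.5, 1.0
approx = frac_integral(lambda s: s, t, alpha)
exact = t ** (1 + alpha) / math.gamma(2 + alpha)
print(approx, exact)
```

Integrating the singular kernel exactly, rather than sampling it, is what keeps the quadrature stable near the endpoint $s=t$.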

#### 2. Riemann–Liouville fractional derivative

We can now define fractional differentiation as the inverse of fractional integration. In practice, this is done by taking a fractional integral of order 1 higher than the desired order of the derivative, and then using the usual integer derivative to lower its order by one. Because fractional integration and differentiation do not commute, two possible fractional derivatives exist, depending on whether the fractional integral or the integer-order derivative of $f(t)$ is taken first; these are the Riemann–Liouville and Caputo cases, respectively.

The lower limit of this integral is usually taken to be $0$, though this is not, in fact, essential.^{101} As in the case of the fractional integral, we can define the integral in the derivative over the whole real line, so $a$ can be $-\infty $. This case is sometimes referred to as the Weyl or Riemann–Weyl derivative.

#### 3. Caputo fractional derivative

## REFERENCES

*The Climate Demon: Past, Present, and Future of Climate Prediction*

*Predicting Our Climate Future: What We Know, What We Don’t Know, and What We Can’t Know*

*Nonlinear and Stochastic Climate Dynamics*, edited by C. Franzke and T. O’Kane (Cambridge University Press, Cambridge, 2017).

*Stochastic Physics and Climate Modelling*, edited by T. Palmer and P. Williams (Cambridge University Press, Cambridge, 2010).

*Weather, Macroweather, and the Climate: Our Random Yet Predictable Atmosphere*

*Stochastic Tools in Mathematics and Science*

*Stochastic Processes and Applications: Diffusion Processes, The Fokker-Planck and Langevin Equations*

*An Introduction to Stochastic Processes in Physics*

*Nonequilibrium Statistical Mechanics*

*The Langevin and Generalised Langevin Approach to the Dynamics of Atomic, Polymeric and Colloidal Systems*

*Unifying Themes in Complex Systems X*, Springer Proceedings in Complexity, edited by D. Braha, M. A. M. de Aguiar, C. Gershenson, A. J. Morales, L. Kaufman, E. N. Naumova, A. A. Minai, and Y. Bar-Yam (Springer, 2021).

*The Climate Modelling Primer*

*Energy Balance Climate Models*

*Handbook of Stochastic Methods: For Physics, Chemistry and the Natural Sciences*

*Introduction to Stochastic Calculus with Applications*

*Handbook of Climate Change Mitigation and Adaptation*, 3rd ed., edited by M. Lackner, B. Sajjadi, and W. Y. Chen (Springer Nature, 2022), pp. vii–ix.

*Applications of Statistics to Modeling the Earth’s Climate System*, edited by R. A. Madden, R. W. Katz (NCAR, Colorado, 1994), Technical Note NCAR/TN-409+PROC.

*Nonequilibrium Statistical Physics*

*The Langevin Equation: With Applications to Stochastic Problems in Physics, Chemistry and Electrical Engineering*

*The Fractional Calculus: Theory and Applications of Differentiation and Integration to Arbitrary Order*

*Fractional Differential Equations: An Introduction to Fractional Derivatives, Fractional Differential Equations, to Methods of Their Solution and Some of Their Applications*, Mathematics in Science and Engineering No. 198 (Academic Press, 1999).

*Stochastic Models for Fractional Calculus*

*Stochastic Calculus for Fractional Brownian Motion and Related Processes*

*Markov Processes for Stochastic Modelling*

*Stochastic Calculus for Fractional Brownian Motion and Applications*

*Nonequilibrium Statistical Physics of Small Systems*, edited by R. Klages, W. Just, and C. Jarzynski (Wiley-VCH, Weinheim, 2013), pp. 259–282.

*First Steps in Random Walks: From Tools to Applications*

*Fractional Integrals and Derivatives—Theory and Applications*

*Fractals and Fractional Calculus in Continuum Mechanics*, edited by A. Carpinteri and F. Mainardi (Springer-Verlag Wien GmbH, 1997), pp. 223–276.

*Anomalous Transport: Foundations and Applications*, edited by R. Klages, G. Radons, and I. Sokolov (Wiley-VCH, Weinheim, 2008), pp. 17–XX.