We extend the scope of the dynamical theory of extreme values to include phenomena that do not happen instantaneously but evolve over a finite, albeit unknown at the onset, time interval. We consider complex dynamical systems composed of many individual subsystems linked by a network of interactions. As a specific example of the general theory, a model of a neural network, previously introduced by other authors to describe the electrical activity of the cerebral cortex, is analyzed in detail. On the basis of this analysis, we propose a novel definition of a neuronal cascade, a physiological phenomenon of primary importance. We derive extreme value laws for the statistics of these cascades, both from the point of view of exceedances (that satisfy critical scaling theory in a certain regime) and of block maxima.

Whenever a time-record of the intensity of a physical phenomenon exists—be it, for example, rainfall, temperature, or solar activity—one can single out particularly important *extreme events* and, by accumulating data over long time periods, describe them statistically. These statistics have been formalized in a *probabilistic extreme value theory*.^{1} Much more recently, a *dynamical* approach has been adopted:^{2} here, the intensity of a phenomenon is a function defined on the phase-space of a dynamical system, where motions are generated by *deterministic* evolution so that the aforementioned record can be constructed by following this evolution as it unfolds over time. It is the purpose of this paper to extend this theory in two respects. First, we consider many-dimensional, complex systems; second, the functions that gauge the intensity of a phenomenon depend on the behavior of the system in a sequence of times and not only at a particular instant. A dynamical model of the electrical activity of the cerebral cortex is studied thoroughly as an example of this theory, focusing on the phenomenon of neuronal cascades. These are collective excitations of many neurons that are neither synchronous nor uncorrelated. We propose a dynamical definition of these cascades that avoids some problems of conventional usage and can be applied to similar phenomena. Using theoretical analysis and numerical simulations, we obtain scaling relations for the statistics of these cascades that agree with the probabilistic formalism and might be tested in real laboratory experiments.

## I. INTRODUCTION

The dynamical theory of extreme values, well described in a series of works (see, e.g., the review book^{2} and references therein), has so far mainly considered low-dimensional systems in a very simple setting that we will describe momentarily. By contrast, its probabilistic ancestor, the so-called extreme value theory (EVT),^{1} has been fruitfully applied to phenomena such as earthquakes, floods, and epidemics,^{3,4} which all involve complex, many-dimensional interactions that operate for a certain period of time and are otherwise silent. It is the purpose of this paper to develop a theoretical framework to extend the dynamical treatment of extreme events to these kinds of systems. In this endeavor, we will not derive new laws, in addition to those of the probabilistic theory, but we shall describe how they arise from deterministic dynamical interactions. We shall do this by focusing on a specific example, a model of a neural network that has been introduced to describe various phenomena occurring in the electrical activity of the cerebral cortex.^{5–8}

This model draws its origin, object, and motivation from a vast body of existing research involving mathematics, physics, and physiology. Nonetheless, the goal of this work is not to reveal new phenomena in the latter field—although we will propose a new conceptual approach to a much studied problem: the statistical description of neuronal avalanches.^{9–13} Rather, the model is particularly suited to introduce a formal advancement in the dynamical theory of extreme values, which is relevant to the neuronal case, among many others. In fact, in contrast with the standard theory, we will consider extreme events that occur in complex, many-dimensional systems and that are not restricted to a specific instant in time but evolve during a finite time interval.

In the remainder of this introduction, we first outline the basic tenets of dynamical extreme value theory and the generalization that we aim to develop; next, we introduce the phenomenon of neuronal avalanches and our definition of a firing cascade based on a dynamically motivated causality relation. Finally, we summarize the results of this paper, and we sketch a map of what is to be found in different sections.

### A. Dynamical extreme value theory

The dynamical theory of extreme values considers a phase space $Z$, in which motions are induced by the repeated action of a deterministic map $\phi : Z \to Z$. An *intensity function* $I : Z \to \mathbb{R}$ defined on the phase space gauges the *value*, i.e., the intensity $I(z)$ of a phenomenon, an *event*, which occurs at the precise instant of time when the motion visits the point $z \in Z$. Extreme value theory is the study of the statistics of particularly intense events, i.e., large values of $I$, accumulated in time along trajectories of the motion. Without loss of generality, we shall assume that time is discrete and denote it by the letter $n$. Trajectories of the system will be written as $z_n = \phi^n(z_0)$, and $I(n, z_0)$ will denote the intensity of the event occurring at time $n$ after the motion started at the point $z_0$: $I(n, z_0) = I(\phi^n(z_0))$.
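To make this setting concrete, here is a minimal numerical sketch of such a time record, with the logistic map standing in for $\phi$ and a distance-based intensity function; the map, the point $z^*$, and the form of $I$ are illustrative choices of this sketch, not taken from the text:

```python
import numpy as np

def trajectory_record(phi, I, z0, n_steps):
    """Record I(phi^n(z0)) for n = 0, ..., n_steps - 1 along the
    deterministic evolution z_{n+1} = phi(z_n)."""
    record = np.empty(n_steps)
    z = z0
    for n in range(n_steps):
        record[n] = I(z)
        z = phi(z)
    return record

# Illustrative choices: the chaotic logistic map and an intensity that
# grows as the orbit approaches an (arbitrary) point z*.
phi = lambda z: 4.0 * z * (1.0 - z)
z_star = 1.0 / np.sqrt(2.0)
I = lambda z: -np.log(abs(z - z_star))

record = trajectory_record(phi, I, z0=0.123456789, n_steps=10_000)
# Extreme events: exceedances of a high threshold along the record.
threshold = np.quantile(record, 0.99)
extremes = record[record > threshold]
```

Accumulating such records over long trajectories is precisely how the statistics of extremes are built in the dynamical approach.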

Originally,^{2} very simple intensity functions $I$ have been investigated, which depend only on the Euclidean distance of $z$ from a single point $z^*$ and increase as this distance diminishes. In this simplified setting, extreme events are close approaches of the motion to such a location. This situation was widely generalized by one of the present authors in Ref. 14, by introducing intensity functions that depend on the distance from an uncountable set of points populating a fractal set $K \subset Z$. This analysis has recently been confirmed in full mathematical rigor.^{15}

In addition, recall that the phase space $Z$ and map $\phi$ are complemented by an invariant measure $\mu$ to fully define a dynamical system. The measure $\mu$ determines the frequency with which the dynamics visits different regions of the phase space. The mutual action of this measure and of the geometry of $K$ in determining extreme value statistics was also investigated in Ref. 14, revealing the rôle of fractal quantities such as Minkowski dimension and Minkowski content, which were extended to the case of general (i.e., not necessarily Lebesgue and possibly singular) invariant measures. This is the extent of existing theory.

Suppose now that the intensity of a phenomenon can*not* be determined *instantly* at time $n$, when the dynamics visits the phase-space point $z_n$, because this phenomenon develops over a certain time span so that its true intensity can only be assessed when it terminates. This is the case of neuronal avalanches, in which the electrical activity of the brain is sustained for a few milliseconds. However, to fix ideas, it is perhaps easier to consider a geophysical analogy,^{16} to which the theory presented in this paper can be equally applied.

An earthquake can be triggered by a single seismic event, a rupture, which propagates and multiplies in space and time. The total energy released by the quake is the intensity of the event, which can be assessed only after the shock terminates. Notice at this point that even if this event takes place continuously over a finite time span, it is usually ascribed in geophysical catalogs to the moment of the first, triggering rupture. To put this observation into the theoretical framework of dynamical extreme values, the intensity function must take the form $I : Z^m \to \mathbb{R}$, in the sense that the event at time $n$ has an intensity $I(n, z_n)$ of the form $I(n, z_n) = I(z_n, z_{n+1}, \ldots, z_{n+m-1})$, where $m$ is the duration of the phenomenon. This generalization of the conventional theory is studied in this paper.

An important observation must be made at this point. Since we are in the presence of a deterministic dynamics, $z_{n+j} = \phi^j(z_n)$, the intensity function takes on the special form
$$I(n, z_n) = I\left(z_n, \phi(z_n), \ldots, \phi^{m-1}(z_n)\right) = H(z_n), \qquad (1)$$
so that it can be thought of as depending only on the initial point $z_n$ via a global function $H : Z \to \mathbb{R}$. This might lead one to think that the proposed approach is void of any generalization. As a matter of fact, because of the dependence of $I$ on future points of the evolution, the function $H$ is presumably very complex and its *phase-space portrait*, i.e., its level sets, might feature a fractal, hierarchical organization, thereby providing a concrete dynamical example of the theory of Ref. 14 mentioned before. Moreover, when the dynamical system is chaotic, the function $H$ in formula (1) admits no more efficient computational scheme than what is expressed by the central expression in Eq. (1): the best one can do is to follow the dynamics as it evolves and record its data. Think again of the earthquake example: while it develops, it is *practically impossible* to forecast its intensity. In a more theoretical vein, recall the aphorism by Joe Ford: *a chaotic system is its own fastest computer and most concise description*.^{17}

There is a further consideration that renders the generalization (1) far from trivial: the time span of the phenomenon, $m$, is neither known *a priori* nor is it the same for different events: the integer $m$ in Eq. (1) *depends* on $z_n$. For the reasons just outlined, there is no significantly better way to compute $m$ than to follow the dynamics until $m$ reveals itself, that is, until the phenomenon under investigation—such as, for instance, the quake—terminates.
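The point of Eq. (1) and of the variable duration $m$ can be illustrated by a toy computation; the map and the "activity" criterion below are hypothetical stand-ins, not the model of this paper. The only way to obtain $H(z_n)$ and $m$ is to run the dynamics until the event ends:

```python
# Toy illustration: an "event" starts at z, lasts while an activity
# criterion holds, and its intensity H(z) and duration m are known
# only once the orbit leaves the active region.

def phi(z):
    return 4.0 * z * (1.0 - z)      # logistic map as a chaotic example

def event_intensity(z, active=lambda z: z > 0.9, max_steps=10_000):
    """Follow the orbit from z while active(z) holds; return the total
    accumulated intensity H(z) and the a-priori-unknown duration m."""
    H, m = 0.0, 0
    while active(z) and m < max_steps:
        H += z                      # accumulate intensity while it lasts
        z = phi(z)
        m += 1
    return H, m

H, m = event_intensity(0.95)        # the duration m "reveals itself"
```

Note that `event_intensity` has no shortcut: it is literally the dynamics, run forward, that computes both $H$ and $m$.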

### B. Neuronal avalanches: Trees of causal excitations

It is now possible to return from the geophysical analogy to the paradigmatic example described in this paper: *neuronal avalanches*. It is observed that electrical stimuli originate (in the presence or absence of a sensorial stimulus) and propagate in the cerebral cortex to involve the *firing* of thousands of neurons: in fact, a neuron may either be silent or *fire*, which means to go through the depolarization–polarization process described in Sec. II and send an electric stimulus via synaptic connections to other neurons. There is evidence that this activity constitutes a fundamental part of the computation performed by the brain.^{18–20} It can be recorded by suitable devices,^{11} and it appears as bursts of firings, at times well separated from each other, but more often adjacent. In certain instances, this activity may be due to synchronized oscillations of a large number of neurons,^{21} but different patterns are frequently observed, intermediate between the former and unstructured, uncorrelated firings of individual neurons. A substantial amount of the literature (see, e.g., Refs. 8–13) has shown that this regime can be described as *critical*, characterized by power laws and scaling relations. In this paper, we will focus our attention on both the critical and the subcritical regime, but we will introduce a seemingly new operational definition of avalanche.

This definition is akin to what Rangan and Young^{22,23} have termed *multiple firing events*: *the maximal sequence of firings that are caused by a single neuron*. By simulating each neuron in a network (typically, several thousand of them are considered) as characterized by microscopic variables (voltages and conductances) evolving according to coupled ODEs, and by the crucial technical expedient of setting certain synaptic time constants to zero, Rangan and Young have been able to determine exactly which neuron causes the firing (also called the spike) of a post-synaptic neuron.

The situation in the model that we will analyze is different and more general, in that the combined electrical activity of many different pre-synaptic neurons can cause the spike of a neuron. Nonetheless, by studying the detailed dynamics of a single, unperturbed neuron—in particular, the *basin of attraction* of the spiking event—we will be able to introduce a *rule* according to which the firing of a particular neuron can be identified as the *precipitating cause* of the spike of another. We shall then define a *firing cascade* as the *tree that mathematically encodes such a chain of excitations*.

We believe that this approach is general enough to be applicable to a wide set of complex systems, composed of many interacting components. At the time of the final revision of this paper, it is hard not to think of the analogy of firing cascades to the *contagion trees* spreading viruses throughout the world. Although difficult to detect (and unfortunately, to stop), it is plausible that these trees, too, feature the statistical distributions studied in this paper.

Typical characteristics of firing trees/cascades, such as their intensity (number of nodes/number of firings), duration, and degree (number of levels in the tree), provide intensity functions $H$ to which the extreme value theory can be applied, both from the point of view of exceedances and of block maxima (concepts to be made precise below), thereby providing a concrete example of the abstract theory described in Sec. I A.

This analysis will produce scaling relations that also make it possible to compute the probability that all cascades in a certain time interval have intensities smaller than a given threshold. Extreme value laws of this kind for the size of neuronal cascades have not been investigated before, to the best of our knowledge—see, however, Ref. 24 for a discussion of the relevance of extreme value theory to sensorial perception.

### C. Contents of the paper

The plan for this paper is the following. We first describe in Sec. II the dynamical model of a neural network that we adopt, borrowed from Ref. 8. It consists of a collection of two-dimensional maps, each representing an individual neuron, initially proposed by Rulkov.^{5,6} Readers not interested in the dynamical details of this neural network can skip Sec. II and proceed directly to Sec. III. In Secs. II A and II C, the theory follows closely, with minor modifications, Refs. 5–8. This part of the paper serves to set notation and to investigate a few details of dynamical import that were not considered in the original references: in particular, the *basin of attraction of the spiking event*, defined in Sec. II B, is instrumental to define causality interactions. Neurons do not work in isolation but are linked in a sparse network, where different neurons interact only when the spiking state is reached. This is described in Sec. II C, following Refs. 7 and 8.

In Sec. III, we study the collective dynamics that takes place on the network. We start in Sec. III A from the problem of the existence of an invariant measure that we heuristically replace by Birkhoff time averages, having nonetheless in mind that this procedure might be dubious due to possible long transients in the dynamics.^{25} Section III B contains the fundamental definition of *firing cascades as trees of causal excitations*, as well as its practical implementation, which is based on the concept of the basin of attraction of the spiking event. This permits us to define and measure important quantities such as the intensity $S$ of a cascade, its generation level $G$, and its time span $T$.

The statistics of these quantities is analyzed in Sec. IV, showing a transition to critical behavior as the intensity of coupling among different neurons is increased; the exponents of the power laws describing the above quantities obey a critical scaling theory^{12,26} and agree with previous experimental findings^{9,13} and theoretical research.^{27,28} From the point of view of extreme value theory, this analysis can be classified as the study of exceedances.^{2} We have not investigated the region of much larger coupling intensity where synchronization phenomena and different statistical behaviors take place.^{8,29}

Finally, in Sec. V, we further widen the scope of our investigation, and we go beyond the results of Refs. 7 and 8 and typical results for this kind of neural network: we show the emergence of extreme value laws fully analogous to those of the probabilistic theory.^{1} That is, we derive scaling relations for the statistical distribution of the value of the *largest* event in a certain time interval, of variable duration, both in the critical and in the sub-critical regime. Numerical experiments confirm the results of the formal theory.

The conclusions briefly discuss the relevance of this work to a broad class of physical systems, including the neuronal case taken as an example herein. The Appendix describes in an elementary way the stability regions, in the parameter space, of the fixed point of Rulkov maps.

## II. THE DYNAMICAL MODEL: A SPARSE NETWORK OF RULKOV MAPS

A variety of models have been proposed to study neuron dynamics. Many of these are random. Purely dynamical models, on the other hand, try to reproduce physiological phenomena more closely. They are often continuous-time models, involving systems of differential equations.^{30} These are able to describe a large variety of observed biological patterns^{22,23,31} but are expensive from a computational point of view. On the contrary, the Rulkov 2D map^{5,6} balances the capacity to reproduce real biological behaviors well—such as irregular spiking and bursts of spikes^{6,7}—with fast computation, two features that are crucial for analysis. In this section, we briefly describe the dynamics of a single neuron, which is effected by the Rulkov map. We follow the implementation presented in Refs. 7 and 8, which also contain illustrative pictures. We need only mention that, unlike these works, we do *not* allow a random external input: our dynamical system is fully deterministic.

The Rulkov map describes the evolution of a fast variable $x_n$ ($n$ is the integer time) that models the membrane potential of a neuron, and of a slow control variable $y_n$ that, although void of physiological meaning, is required to pilot the evolution of the former, so as to reproduce some patterns observed in real experiments. This pair of variables describes the state of every neuron in the system. We label neurons by the index $i = 1, \ldots, N$ so that the phase space of the system is $Z = \{(x^i, y^i, I^i),\ i = 1, \ldots, N\}$. Recall that $n$ labels time; the Rulkov map is then defined as follows:
$$x_{n+1}^i = F\left(x_n^i, x_{n-1}^i, y_n^i, I_n^i\right), \qquad y_{n+1}^i = y_n^i - \mu\left(x_n^i + 1\right) + \mu\,\sigma. \qquad (2)$$
Notice that a memory effect is present in the above definition, via the value of the variable $x$ at time $n-1$. The synaptic input $I^i$ on neuron $i$, defined in Eq. (6), also has a dynamical evolution. We will nonetheless use the sole $(x^i, y^i)$ coordinates as observables in our system.

The positive constants $\mu$ and $\sigma$ and the function $F$ will be defined in Subsection II A. Let us remark here that different neurons are coupled only through the variable $I^i$, the synaptic input on neuron $i$ from the network. Physiologically, $I^i$ is a *current*, and it is defined below in Eq. (6). It is convenient to first discuss the dynamics when this external input is set to a constant (that may be taken to be zero, for the sake of definiteness), and each neuron evolves independently of the others.

### A. Unperturbed dynamics of the single neuron

In this subsection, since neurons do not interact, we drop the neuron index $i$ for clarity. The function $F$ in Eq. (2) is defined as follows:
$$F(x_n, x_{n-1}, y_n, I_n) = \begin{cases} \dfrac{\alpha}{1 - x_n} + y_n + \beta I_n & \text{if } x_n \le 0, \\[4pt] \alpha + y_n + \beta I_n & \text{if } 0 < x_n < \alpha + y_n + \beta I_n \text{ and } x_{n-1} \le 0, \\[4pt] -1 & \text{otherwise.} \end{cases} \qquad (3)$$
The parameters $\alpha, \beta, \mu, \sigma$ appearing in Eqs. (2) and (3) are all positive. Let us describe them. The parameter $\beta$ modulates the relative influence of the synaptic input $I_n$ with respect to $y_n$ in Eq. (3). We choose $\beta = 0.133$. The parameter $\mu$ is taken to be small ($\mu = 10^{-3}$) so that the variable $y$ evolves slowly with respect to time, following the second equation in (2), and its values cover a limited range. The combination of $\mu$ with $\alpha$ and $\sigma$ determines the structure of motions, as discussed in detail in Refs. 5 and 6. In the Appendix, we sketch a simple analysis of the stability of the fixed point located at $\left(\sigma - 1,\ \sigma - 1 + \alpha(\sigma - 2)^{-1}\right)$. Kanders *et al.*^{8} choose $\alpha = 3.6$ so that only $\sigma$ is left free to vary. As shown in the Appendix, under these conditions, the map admits a bifurcation at $\sigma = \sigma_{\rm cr} = 2 - \sqrt{\alpha/(1-\mu)}$. The parameter space is accordingly divided into four regions, I–IV, plotted in Fig. 13 of the Appendix. Choosing $\sigma = 0.09$ puts the parameter-space point in region II, implying an attracting fixed point, while $\sigma = 0.103$ leads to region III, where the fixed point is unstable.

Stability analysis, therefore, permits us to define two kinds of neurons: Intrinsically Non-Spiking Neurons (INSNs) (parameters in region II, motion depicted in the left frame of Fig. 1) and Intrinsically Spiking Neurons (ISNs) (region III, motion in the right frame of Fig. 1). The latter are the initiators of the dynamics of the network. The different behaviors are apparent in Fig. 1, where we plot motions in the single-neuron phase space (as in Fig. 1 of Ref. 8, $y$ is plotted on the abscissa and $x$ on the ordinate).

A neuron is said to *fire* when its membrane potential $x_n$ reaches a local positive maximum (depolarization) and then suddenly drops to the hyperpolarized state $x = -1$: in physiology, one talks of increasing polarization when the potential becomes more negative. At this moment, a signal is transmitted to the other neurons, as described in Secs. II B and II C. It is instructive to understand how this cycle develops in the dynamics of Eqs. (2) and (3). To do this, follow the motion in the right panel of Fig. 1. Three *cycles* are plotted (two are almost coincident and require attention to be distinguished), each of which contains a spike. Start from the rightmost part of a cycle and follow points rotating counterclockwise so that $y_n$ initially decreases and $x_n$ increases. This happens because of the first alternative in Eq. (3). The map (2) is defined in such a way that when $x$ becomes positive at time $n$, $x_{n-1}$ being negative, either $x_n$ is larger than $\alpha + y_n + \beta I_n$ so that at the next step, the membrane potential drops to the negative value $x_{n+1} = -1$ (third alternative) or else it further increases to the value $x_{n+1} = \alpha + y_n + \beta I_n$ (second alternative), before achieving the value $x_{n+2} = -1$ at the following iteration, again by virtue of the third alternative. This motivates the introduction of the spike variable $\xi_n$, which is always null, except at those times when the neuron potential suddenly drops to the hyperpolarized state $x_n = -1$, in which case it takes the value one. It can be formally defined as follows—compare with Eq. (3):
$$\xi_{n+1} = \begin{cases} 1 & \text{if } x_n > 0 \text{ and } \left(x_n \ge \alpha + y_n + \beta I_n \text{ or } x_{n-1} > 0\right), \\ 0 & \text{otherwise.} \end{cases} \qquad (4)$$
The spike variable $\xi_n$ will be used in Sec. II C to define the dynamical action of synaptic coupling.
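In code, one iteration of the single-neuron map together with the spike variable can be sketched as follows (a minimal sketch assuming the standard piecewise form of $F$ with the three alternatives described in the text; parameter values are those quoted above, with $\sigma = 0.103$ selecting an ISN):

```python
ALPHA, MU, SIGMA, BETA = 3.6, 1e-3, 0.103, 0.133   # sigma in region III: ISN

def rulkov_step(x, x_prev, y, I=0.0):
    """One iteration of the single-neuron map; returns (x_next, y_next, spike).
    The three branches are the three alternatives of F discussed above."""
    u = y + BETA * I
    spike = 0
    if x <= 0.0:
        x_next = ALPHA / (1.0 - x) + u          # first alternative
    elif x < ALPHA + u and x_prev <= 0.0:
        x_next = ALPHA + u                      # second alternative
    else:
        x_next = -1.0                           # third alternative: the
        spike = 1                               # neuron fires (xi = 1)
    y_next = y - MU * (x + 1.0) + MU * SIGMA    # slow variable
    return x_next, y_next, spike

# An isolated ISN (I_n = 0) settles into regular spiking after a transient.
x_prev, x, y = -1.0, -1.0, -2.9
n_spikes = 0
for n in range(20_000):
    x_new, y, xi = rulkov_step(x, x_prev, y)
    x_prev, x = x, x_new
    n_spikes += xi
```

Counting the spikes over a long run recovers the regular firing regime of the ISN described above.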

### B. Perturbed dynamics of the single neuron: Basin of attraction of the firing event

The synaptic input $I_n$ received by a neuron can be either stochastic or deterministic, as it is in our case. How the latter is generated in our model is inessential for the present discussion and will be explained in Subsection II C; it suffices here to observe that $I_n$ enters Eq. (3) linearly. In particular, a positive current $I_n$ increases the potential $x_n$ (hence, it depolarizes the neuron) and can, therefore, be associated with an *excitatory* action, while the reverse is true for a negative current $I_n$. Notice, however, that this rule of thumb has non-trivial exceptions that we will describe momentarily.

Figure 2 shows how the dynamics of ISNs and INSNs observed in Fig. 1 is perturbed by $I_n$. Typical trajectories are shown, in which sharp angles appear when $I_n$ is non-null. Notice, however, that time being discrete, lines in the figure join distinct points, albeit typically very close ones, so as to simulate a continuous trajectory that does not exist. In the left frame, one can observe how an INSN, silent when isolated, can be triggered to spike by the input $I_n$. The dynamics of an ISN (right panel) can also be perturbed, altering its regular spiking regime. In fact, consider Fig. 3, left panel, where the potential $x_n$ of an ISN (red curve) is plotted vs time $n$, together with the synaptic input $I_n$ it receives (green curve). The synaptic current $I_n$ is positive, stemming from an excitatory action. Three excitations are visible in the range displayed. Prior to these, the ISN spikes regularly with a period of about 242 iterations. The first excitation has the effect of anticipating the successive spike by 35 iterations. In the right panel, which shows the same motion in the single-neuron phase space, the regular oscillations appear as a limit cycle. The first spike is evident in the thread that emerges from the cycle at the right of the picture and initially moves increasing both $x_n$ and $y_n$, consistent with the fact that the excitation $I_n$ is positive: see Eq. (3). Returning to the left panel, we notice two more excitatory inputs to the neuron that have the paradoxical effect of suppressing further spikes! This can be explained by detecting, in the right panel, two more sharp angles in the trajectory, the first starting from the bottom left of the picture. In both cases, $x_n$ and $y_n$ increase, but this has the effect of pushing the trajectory toward the fixed point of the map, which is unstable but from whose neighborhood the motion requires a long time to escape.

We have thus discovered that in this dynamical model, excitatory actions can sometimes produce an inhibition of spikes, due to the particular nature of the phase space of the system. For a similar reason, inhibitory actions can sometimes induce spikes. This paradox leads to a deeper understanding of the dynamics of this system.

In fact, consider an INSN, like those pictured in the left panels of Figs. 1 and 2. Let $(x, y)$ be the initial condition of the motion and let $I_n = 0$ (isolated neuron). Even if finally attracted by the fixed point, some trajectories can spike before that happens. In Fig. 4, we color in red such initial conditions and in green those that head directly to the fixed point, whose position is marked by a blue asterisk (again, the variable $y$ is plotted on the horizontal axis in these figures). A limit cycle separates the two regions. In Fig. 5, the phase-space portrait with trajectories near the stable fixed point is displayed. It is evident from these graphs that a system orbiting on one of the stable curves, when subject to an excitatory action (increasing both $x_n$ and $y_n$ in a single step), can make a transition from the green to the red region, i.e., from the basin of the fixed point to that of the spike. The opposite transition is also possible but only in very particular circumstances.

Moreover, as can be seen in the figures, when the potential $x$ of an INSN is larger than a certain threshold $x_{\rm th}$, such a paradoxical outcome is not possible. In fact, all initial conditions with $x$ larger than $x_{\rm th}$ lie in the red region and lead to firing, even in the absence of further excitations: the effect of a current $I_n$ that puts an INSN in the red region can, therefore, only be reversed by an inhibitory input. This fact will be of paramount importance in the following. From the analysis of Fig. 4, we deduce that, in the range of parameters of our experiments (see the discussion of Fig. 1 in Sec. II A), we can safely take $x_{\rm th} = -0.7$.

### C. Network topology and synaptic coupling

According to the way in which they act upon others, neurons can be classified as either excitatory or *inhibitory*. Following Ref. 8, we consider a network of size $N$ with a ratio of inhibitory to excitatory neurons $r$ so that the first $n_{\rm ex} = \lfloor Nr \rfloor$ neurons are excitatory and the remaining $N - n_{\rm ex} = n_{\rm in}$ are inhibitory. Furthermore, as described in Sec. II B, neurons can either be intrinsically spiking or not so that each group (excitatory and inhibitory) can also be split in two: out of the $n_{\rm ex}$ ($n_{\rm in}$) excitatory (inhibitory) neurons, $m_{\rm ex}$ ($m_{\rm in}$) are chosen to be intrinsically spiking.

Neurons are connected so that each of them is acted upon by $p_{\rm ex}$ excitatory neurons and $p_{\rm in}$ inhibitory neurons, chosen at random in the network—with no regard to whether they are ISNs or INSNs, with the only constraint that no neuron can act upon itself. The six integer numbers $n_{\rm ex}$, $m_{\rm ex}$, $n_{\rm in}$, $m_{\rm in}$, $p_{\rm ex}$, $p_{\rm in}$, plus the connection diagram, fully specify the network. Analytically, this is formalized by the adjacency matrix $\omega_{ij}$, defined by
$$\omega_{ij} = \begin{cases} 1 & \text{if neuron } j \text{ is excitatory and presynaptic to neuron } i, \\ \psi & \text{if neuron } j \text{ is inhibitory and presynaptic to neuron } i, \\ 0 & \text{otherwise.} \end{cases} \qquad (5)$$
The factor $\psi >0$ serves to differentiate the intensity of inhibition with respect to excitatory action. We choose $\psi =3$.
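The wiring just described can be sketched as follows; the network sizes used in the example are illustrative, and the convention that $\omega_{ij}$ stays positive for inhibitory synapses (with the inhibitory sign carried by the reversal potentials in Eq. (6)) is the assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def build_adjacency(N, n_ex, p_ex, p_in, psi=3.0):
    """Sparse random wiring: each neuron i receives exactly p_ex excitatory
    and p_in inhibitory presynaptic inputs, drawn at random with no
    self-connections.  Neurons 0..n_ex-1 are excitatory (weight 1), the
    rest inhibitory (weight psi)."""
    omega = np.zeros((N, N))
    for i in range(N):
        ex_pool = [j for j in range(n_ex) if j != i]
        in_pool = [j for j in range(n_ex, N) if j != i]
        omega[i, rng.choice(ex_pool, size=p_ex, replace=False)] = 1.0
        omega[i, rng.choice(in_pool, size=p_in, replace=False)] = psi
    return omega

# Illustrative sizes (not the paper's): N = 128 with 96 excitatory neurons.
omega = build_adjacency(N=128, n_ex=96, p_ex=8, p_in=4)
```

Every row of `omega` then has exactly $p_{\rm ex} + p_{\rm in}$ non-zero entries, one per presynaptic neuron.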

Coupling in the network is driven by action potentials generated by firing neurons. Following Ref. 8 again, the spike of a presynaptic neuron $j$ determines the input $I_n^i$ received by neuron $i$ according to the linear difference equation,
$$I_{n+1}^i = \eta\, I_n^i + W \sum_{j=1}^{N} \omega_{ij} \left(\chi_{ij} - x_n^i\right) \xi_n^j. \qquad (6)$$
Let us now define and discuss all terms in this equation, which also take on a physiological meaning. $W$ is an intensity parameter, by varying which we can increase the effect of synaptic excitations: recall that $I_n^i$ enters the dynamical equation of evolution of neuron $i$ [Eqs. (2) and (3)]. The homogeneous equation (with null r.h.s.) implies an exponential decay, $I_n \sim \eta^n$, so that $\eta$ measures the persistence of an excitation: we choose $\eta = 0.75$. The spike variables $\xi_n^j$ on the r.h.s. are the source of the currents $I_n^i$ through the connection matrix $\omega_{ij}$ defined before. Finally, notice that when the synaptic reversal potential $\chi_{ij}$ is larger than the potential $x_n^i$, its activation (which happens when $\xi_n^j = 1$) increases the synaptic input $I_n^i$ and also, because of Eq. (3), the membrane potential $x_{n+1}^i$. This is equivalent to a depolarization of the neuron, and, therefore, the synapse acts as excitatory. In the opposite case, the activation polarizes the neuron and the action is inhibitory. The numerical values of the above parameters are chosen as in Refs. 7 and 8: the reversal potential is null for excitatory connections (i.e., $\chi_{ij} = 0$ when neuron $j$ is excitatory) and equal to $\chi_{ij} = -1.1$ when neuron $j$ is inhibitory, for all $i$. This ends the description of the *laws of motion* of the network.
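One step of the synaptic-current dynamics of Eq. (6) can be sketched in vectorized form as follows (the toy network and the value of $W$ are illustrative assumptions of this sketch; $\eta$ and the reversal potentials are those quoted in the text):

```python
import numpy as np

ETA, W, PSI = 0.75, 0.1, 3.0     # eta and psi as in the text; W illustrative

def synaptic_update(I, x, xi, omega, chi):
    """One step of Eq. (6):
    I_{n+1}^i = eta*I_n^i + W * sum_j omega_ij * (chi_ij - x_n^i) * xi_n^j."""
    return ETA * I + W * (omega * (chi - x[:, None])) @ xi

# Toy network of 3 neurons; neuron 2 is inhibitory (reversal potential -1.1).
omega = np.array([[0., 1., PSI],
                  [1., 0., PSI],
                  [1., 1., 0.]])
chi = np.array([[0., 0., -1.1]] * 3)      # chi_ij depends on presynaptic j
x = np.array([-1.0, -1.0, -1.0])          # hyperpolarized membrane potentials
I = np.zeros(3)
xi = np.array([0., 1., 0.])               # neuron 1 has just spiked
I = synaptic_update(I, x, xi, omega, chi) # positive drive: excitatory action
```

With no further spikes, repeated calls simply shrink the current by the factor $\eta$ per step, the exponential decay noted above.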

## III. NETWORK DYNAMICS

We can now study the dynamical behavior of the network defined above. It will be evident that the discussion is quite general and it applies to a large family of similar networks so that it can be followed without a detailed understanding of Sec. II. We start by discussing a basic problem: the existence of an invariant measure. Then, we turn to the crucial definition of firing cascades.

### A. Invariant measure

Equations (2)–(6) define a multi-dimensional dynamical system in the phase space $Z = \mathbb{R}^{3N}$ with variables $(x^i, y^i, I^i),\ i = 1, \ldots, N$. Let $z = (x, y, I)$ be such a $3N$-component phase-space vector and let $\phi$ be the dynamical evolution operator. A fundamental ingredient of a dynamical theory is the so-called invariant measure, that is, a positive Borel measure that is preserved by the dynamical evolution. Lacking theoretical control of these measures in many-dimensional systems such as the one under consideration, we replace phase-space averages by Birkhoff sums over trajectories. In measure-preserving systems, the two coincide almost everywhere (with respect to the initial point of a trajectory, under the invariant measure).

More precisely, having chosen an intensity function $H(z)$, we construct its distribution function $F_H(h)$ as follows. We start from a random point $z^* \in Z$. We iterate the dynamics $\phi$ for a number of iterations $n_0$ sufficiently large to let the system reach a sort of dynamical equilibrium. At that moment, we let $z_0 = \phi^{n_0}(z^*)$ be the initial point of a trajectory, and we compute the sample values $\{H(\phi^n(z_0)),\ n = 0, \ldots, P-1\}$, by which the distribution function $F_H$ is

Clearly, in numerical experiments, the limit is replaced by the value obtained for a large but finite sample length $P$. We shall also use the complementary distribution function $E_H(h)=1-F_H(h)$: this is the measure of the tail of the distribution of values of $H$,

$$E_H(h)=\mu(\{z\in Z:\ H(z)>h\}). \tag{8}$$
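As an illustration of this construction, the empirical distribution and its tail can be sketched in a few lines; the one-dimensional logistic map and the identity intensity function are hypothetical stand-ins for the evolution operator $\phi$ and for $H$:

```python
import numpy as np

def birkhoff_distribution(phi, H, z_star, n0, P):
    """Empirical F_H and E_H sampled along a trajectory of phi.

    phi    : one-step evolution operator
    H      : intensity function on phase space
    z_star : random initial point; n0 : discarded transient; P : sample length
    """
    z = z_star
    for _ in range(n0):          # relax toward dynamical equilibrium
        z = phi(z)
    samples = np.empty(P)
    for n in range(P):           # record H along the trajectory
        samples[n] = H(z)
        z = phi(z)
    samples.sort()
    def F(h):                    # fraction of sample values <= h
        return np.searchsorted(samples, h, side="right") / P
    def E(h):                    # complementary (tail) distribution
        return 1.0 - F(h)
    return F, E

# toy example: chaotic logistic map with intensity H(z) = z
F, E = birkhoff_distribution(lambda z: 4.0 * z * (1.0 - z),
                             lambda z: z, z_star=0.3, n0=1000, P=100_000)
```

Replacing the toy map by one step of Eqs. (2)–(6) would give the quantities used in the remainder of the paper.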

The study of coupled map lattices has revealed puzzling phenomena such as stable motions with exponentially long chaotic transients^{25,32} so that the numerical procedure just explained might be theoretically flawed. Nonetheless, the results of Ref. 8 on small networks ($N=128$), which reveal a transition to positive Lyapunov exponents when the coupling parameter $W$ increases, lead us to believe that convergence in Eq. (7) can be numerically achieved.

In Sec. V, we also consider the function $H_L(z)$, defined in Eq. (15). We still use the Birkhoff technique, adapted to the so-called block maxima approach, in which sample points on a trajectory are spaced by $L$ instants of time,

$$F_{H_L}(h)=\lim_{Q\to\infty}\frac{1}{Q}\,\#\{0\le q<Q:\ H_L(\phi^{qL}(z_0))\le h\}. \tag{9}$$
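The block maxima sampler differs from the previous one only in that each sample is the maximum of $H$ over a window of $L$ consecutive iterates; the logistic map is again a hypothetical stand-in:

```python
import numpy as np

def block_maxima_distribution(phi, H, z0, L, Q):
    """Empirical distribution of H_L: maxima of H over windows of L steps,
    sampled at the Q window starting points z0, phi^L(z0), phi^{2L}(z0), ..."""
    z = z0
    maxima = np.empty(Q)
    for q in range(Q):
        best = -np.inf
        for _ in range(L):       # maximum of H over one window of length L
            best = max(best, H(z))
            z = phi(z)
        maxima[q] = best
    maxima.sort()
    return lambda h: np.searchsorted(maxima, h, side="right") / Q

F_HL = block_maxima_distribution(lambda z: 4.0 * z * (1.0 - z),
                                 lambda z: z, z0=0.3, L=50, Q=2000)
```

As expected, the distribution of the window maxima concentrates near the upper end of the range of $H$ as $L$ grows.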

### B. Firing cascades as trees of causal excitations

A crucial difference between conventional coupled map lattices and the network under study is that, in our case, neurons interact only at specific instants of time, while for most of the time individual systems evolve independently of each other. When it comes to physiological networks, the first and foremost consequence of this fact is that information can be stored, elaborated, and transmitted.^{20} From a dynamical perspective, this calls attention to collective behaviors such as synchronization.^{21,33} Here, we focus on a more elusive set of phenomena: neuronal avalanches, which we prefer to call firing cascades.

Most of the studies on the collective behavior of neural networks define avalanches as the uninterrupted activity of the network during a certain time interval.^{9,13} Statistical properties of these collective events have been extensively studied in a large series of works, both experimental, see, e.g., Refs. 9–13, and theoretical, with the aid of model systems,^{18,19,22,23,28,31,34–37} but statistical analysis alone does not clarify the dynamical details of the network activity and, in particular, of impulse propagation. Moreover, it suffers from a degree of arbitrariness in the definition of an avalanche. In fact, coarse-graining of time is usually employed, for spiking events appear at particular instants of time and must be grouped to define the quiescence periods separating different avalanches. Figure 6 shows a conventional *raster plot* of spikes, depicted as red points at coordinates $(n,i)$, where $n$ is the time at which neuron $i$ spikes. One observes regions with different densities of spikes. The usual procedure^{9} is to partition the time axis into bins, so that a sequence of adjacent, non-empty bins constitutes a cascade. The arbitrariness in the choice of the bin length is apparent, and its influence on the resulting statistical properties has been well described.^{9} Moreover, different regions of the network, even if uncorrelated, may be classified in the same cascade if they happen to spike at close instants. In short, the true dynamics of the network is overlooked in this approach. This is justified in the analysis of experimental data, when no access to the network of connections is available.
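For comparison with our approach, the conventional binning procedure just described can be sketched as follows (spike times and bin lengths are illustrative); note how the same spike train decomposes into different avalanches as the bin length varies:

```python
def binned_avalanches(spike_times, bin_len):
    """Group spikes into avalanches: sequences of adjacent non-empty bins.

    spike_times : iterable of integer spike times (all neurons pooled)
    bin_len     : arbitrary bin length, the source of the ambiguity
    Returns a list of avalanche sizes (number of spikes per avalanche).
    """
    from collections import Counter
    counts = Counter(t // bin_len for t in spike_times)  # spikes per bin
    sizes, current = [], 0
    for b in range(min(counts), max(counts) + 2):
        if counts.get(b, 0) > 0:
            current += counts[b]          # extend the ongoing avalanche
        elif current > 0:
            sizes.append(current)         # an empty bin closes the avalanche
            current = 0
    return sizes

# same spike train, two bin lengths, two different decompositions
spikes = [1, 2, 3, 10, 11, 30]
print(binned_avalanches(spikes, 2), binned_avalanches(spikes, 8))  # [3, 2, 1] [5, 1]
```

The dependence of the output on `bin_len` is precisely the arbitrariness discussed in the text.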

We pursue here a different approach, based upon the study of the dynamics presented in Sec. II. We have remarked that different types of neurons are present in the network: in particular, excitatory ISNs are the origin of propagating excitations. When one of these neurons spikes, say neuron $j$, it acts upon its post-synaptic neurons via Eq. (6). As observed in Sec. II, if such a post-synaptic neuron is also an ISN, this action can alter its dynamics, anticipating or delaying its next spike. Otherwise, if the post-synaptic neuron $i$ is of the NISN kind, this excitation might eventually drive it to spike, this occurrence following from the dynamics generated by Eq. (2). The induced spiking, when it takes place, is not instantaneous: according to the coordinates $(x_n^i,y_n^i)$ at the time of an excitation, the motion requires a certain amount of time to exit the region close to the fixed point and spike. It is nonetheless possible to predict whether this event will take place, using the notion of the basin of attraction of the spiking event discussed in Sec. II. Recall that this is defined as the set of initial conditions of the *isolated non-spiking* neuron that lead to a spike, before returning to approach the fixed point. When $x_n$ is pushed by synaptic excitations into this basin (red in Fig. 4), the NISN will fire, regardless of further excitatory actions: such a spike can only be avoided by inhibitory actions.

This reasoning permits us to draw a *causal* relationship between the spiking of a NISN and *a single originating excitation* from another neuron. Even if it is clear that, in a complex network, many pre-synaptic neurons can contribute to set the dynamics of $i$ on course to firing, we stipulate that the cause of the spike of the non-intrinsically spiking neuron $i$ be ascribed to a single excitatory neuron $i'$: the last one to act on $i$, at time $n'$, before it enters the basin of attraction of the spiking event, to eventually fire at time $n$.

Excitations following this time are not considered in this count because they are not necessary to induce a spike, which would happen nonetheless. Equally not considered are impulses preceding such a precipitating event—the straw that broke the camel's back—because, if taken alone, they might not suffice to cause firing. To obtain a simpler rule, one that can also be applied in experiments, we approximate membership in the basin of attraction by the condition that $x^i$ exceeds the threshold value $x_{th}$. This procedure can then be formulated as follows: we say that *the spike of the excitatory neuron $i'$ at time $n'$ causes the spike of the NISN $i$ at a later time $n$, and we write $(n',i')\to(n,i)$, if*

The event that two neurons spike simultaneously and excite the *same* post-synaptic neuron is statistically negligible and can be disposed of by an *ad hoc* prescription. Following rule (10), we can therefore construct trees of excitations, which start at the firing of a single excitatory ISN and propagate to NISNs through the network. Any such tree is *defined to be a firing cascade*. Return, for an illustration of this construction, to Fig. 6. Green lines $(n',i')\to(n,i)$ join causally related spikes (blue symbols), on the background of red dots that mark all spiking events. The picture shows how the initial firing of an excitatory, intrinsically spiking neuron ($i'=7$ at $n'=811$) causes non-intrinsically spiking neurons to fire ($i=772$ at $n=820$, $i=235$ and $i=869$ at $n=823$, and $i=2032$ at $n=827$), which, in their turn, stimulate additional NISNs (for a total of 31 spikes in the figure). Clearly, not all *physical* connections are *dynamically* activated following a spike, for otherwise the cascade would have no end. In the example, neuron $i'=7$ is also presynaptic to $i=908$, $i=1115$, and $i=2677$, which remain silent.

It is apparent that this definition of cascade, although presented in detail for the neuron dynamics, can be extended to any directed network composed of individual systems with an excited state. Armed with this tool, we can now move to statistical analysis, following the theoretical scheme of Sec. I A.

## IV. STATISTICAL ANALYSIS OF FIRING CASCADES

The dynamical approach described in Sec. III permits us to define precisely, and to measure accurately, various quantities usually defined via time-binning. These are as follows:

$S$, the size of a cascade, that is, the total number of spikes in a cascade—equally, the number of nodes in a causal tree;

$T$, the time span of a cascade: the length of the time interval between the first spike and the last; and

$G$, the generation number, which is the number of levels of the tree—also, the length of the longest chain of successive excitations.
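A sketch of how these three quantities can be computed from the causal links of rule (10), using the cascade of Fig. 6 as input; only its first generation is listed here, so the counts refer to this subtree, not to the full 31-spike cascade:

```python
def cascade_statistics(root, edges):
    """Size S, time span T, and generation number G of one firing cascade.

    root  : (n, i) spike that starts the cascade (an excitatory ISN firing)
    edges : causal links ((n1, i1), (n2, i2)) meaning that the spike
            (n1, i1) caused the spike (n2, i2), as in rule (10)
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    S, last_time, G = 0, root[0], 1
    stack = [(root, 1)]                    # depth-first walk of the tree
    while stack:
        (n, i), level = stack.pop()
        S += 1                             # one node of the tree = one spike
        last_time = max(last_time, n)
        G = max(G, level)
        for child in children.get((n, i), []):
            stack.append((child, level + 1))
    T = last_time - root[0]
    return S, T, G

# first generation of the cascade of Fig. 6: neuron 7 fires at time 811
edges = [((811, 7), (820, 772)), ((811, 7), (823, 235)),
         ((811, 7), (823, 869)), ((811, 7), (827, 2032))]
print(cascade_statistics((811, 7), edges))   # -> (5, 16, 2)
```

Here $S=5$ spikes, $T=827-811=16$ time steps, and $G=2$ generations for this subtree.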

We are now equipped with the tools to analyze the statistics of cascades in the neuronal network: the function $H$ considered in Sec. I A will be any one of $S,T,G$. As mentioned in Sec. III A, we replace phase-space averages by Birkhoff sums. We consider time-segments of length $P=4\times 10^6$ iterations, following an initial transient (discarded) consisting of $6000$ iterations. Occasionally, we raise this number to account for the fact that a long transient can be observed in extended systems.^{25} Using a parallel architecture, we accumulate the results of $32$ realizations of the initial conditions. We performed an extensive investigation, of which we report here a typical instance.

*Example 1*: $N=3000$ neurons, with $n_{ex}=2400$, $m_{ex}=2$, $n_{in}=600$, $m_{in}=1$, $p_{ex}=4$, $p_{in}=2$. Physical parameters are $\beta=0.133$, $\alpha=3.6$, $\sigma_{NISN}=0.09$, $\sigma_{ISN}=0.103$, $\mu=10^{-3}$, $\psi=3$, $\chi_{exc}=0$, $\chi_{inh}=-1.1$.

Since this model includes neither pruning nor adaptation, to achieve criticality one needs to tune the parameters of the network. Following Ref. 8, we vary the coupling intensity $W$ in Eq. (6). We first compute the *average number of spikes per unit time*, $\tau^{-1}$, observing a rather sharp transition as we increase $W$: when $W=0.084$, one has $\tau^{-1}=3.6\times 10^{-2}$, which increases to $\tau^{-1}=5.68\times 10^{-1}$ for $W=0.087$ and to $\tau^{-1}=5.65$ for $W=0.09$. At the largest coupling, about five neurons (out of the total 3000) spike on average at each single (discrete) time. In analogy with the theory of crackling noise,^{12,26} we expect this transition to be reflected in the statistics of cascades, leading to power-law distributions and to the absence of a typical scale.

This is confirmed by numerical experiments, in which we find two different behaviors: in Fig. 7, we plot the fraction $\rho_S(S)$ of firing cascades with $S$ spikes and its (complementary) distribution function $E_S(S)$ defined in Eq. (8). Two values of the coupling constant $W$ yield the data in Fig. 7: $W=0.087$ and $W=0.09$. The former is visibly an under-threshold case, while the latter displays a full-fledged power-law distribution that extends over orders of magnitude. In the first case, the distribution $E_S(S)$ is well fitted by a stretched exponential,

$$E_S(S)\simeq e^{-A_S S^{\alpha_S}}, \tag{11}$$

with $A_S\simeq 0.1694$ and $\alpha_S\simeq 0.497$ (red curve). In the second, the power law

$$E_S(S)\simeq B_S\,S^{1-\beta_S} \tag{12}$$

is well satisfied, with the exponent $\beta_S=1.5$ ($B_S=0.75$, green line). Also plotted in the figure are the densities $\rho_S(S)$ in the two cases, the second following the behavior $\rho_S(S)\simeq B_S' S^{-\beta_S}$.
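Both functional forms become linear under suitable changes of coordinates, so their constants can be extracted by least squares; a sketch, with synthetic tail data standing in for the measured $E_S(S)$:

```python
import numpy as np

def fit_tail(S, E):
    """Least-squares fits of a tail E(S) to the two candidate forms.

    Stretched exponential exp(-A * S**a): log(-log E) is linear in log S.
    Power law            B * S**(-b)   : log E      is linear in log S.
    Returns ((A, a), (B, b)), the fitted parameters of each form.
    """
    logS = np.log(S)
    a, logA = np.polyfit(logS, np.log(-np.log(E)), 1)   # stretched exponential
    negb, logB = np.polyfit(logS, np.log(E), 1)         # power law
    return (np.exp(logA), a), (np.exp(logB), -negb)

# synthetic power-law tail with decay exponent 0.5 and prefactor 0.75
S = np.arange(10, 1000, 10, dtype=float)
E = 0.75 * S ** -0.5
(_, _), (B, b) = fit_tail(S, E)
```

On real data one would restrict the fit to the scaling region and compare the residuals of the two forms, as done graphically in Fig. 7.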

We study the densities $\rho_G(G)$ of generation levels $G$ and $\rho_T(T)$ of the time span $T$ in an analogous way. They are plotted in Fig. 8 for the case of large coupling, $W=0.09$. Both of these quantities follow a power-law decay, with exponents $\beta_G$ and $\beta_T$ close to two. In the same figure, we also plot (its inverse, for graphical convenience) the quantity $\bar S(G)$, the average number of spikes in a cascade with $G$ levels, which approximately follows a power law with the same exponent: $\bar S(G)\sim G^\gamma$, $\gamma\simeq 2$. We shall use this result momentarily.

The observed exponents $\beta_S=3/2$ and $\beta_T=2$ are, to a good approximation, equal to those appearing in physiological experiments^{9,13} and in various models of neuronal networks.^{27,28} On the contrary, these values differ from those found in Ref. 8 for the very system that we are investigating, which were, however, obtained using the conventional definition of a cascade. It is to be remarked, though, that the general picture presented in Ref. 8 is confirmed by our analysis. Also worth quoting is the measurement of these exponents in high-resolution experimental data,^{12} following the time-binning approach.

The critical scaling theory predicts^{12,26} that $\beta_T-1=(\beta_S-1)\gamma'$, where $\gamma'$ is defined in terms of the scaling of the average of $S$ at a fixed time length $T$, vs $T$: $\bar S(T)\sim T^{\gamma'}$. The exponents derived in Ref. 8 verify this scaling relation. In our model, the time span of a cascade $T$ and its generation level $G$ can be assumed to be proportional, because the dynamics described in Sec. II B reveals that a typical time is required to perform an excitation cycle. As a consequence, we can identify the exponents $\gamma$ and $\gamma'$. This leads to the relation

$$\beta_T-1=(\beta_S-1)\gamma. \tag{13}$$
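With the measured values quoted above, the scaling relation reproduces the observed time exponent; a one-line check:

```python
# crackling-noise scaling relation: beta_T - 1 = (beta_S - 1) * gamma
beta_S, gamma = 1.5, 2.0          # measured: size exponent and S(G) ~ G**gamma
beta_T = 1.0 + (beta_S - 1.0) * gamma
print(beta_T)                      # consistent with the observed value of two
```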

## V. STATISTICS OF EXTREME EVENTS IN A FINITE TIME INTERVAL

The excitation of an intrinsically spiking neuron sometimes propagates to a large number of neurons and generates a cascade. In Sec. IV, we have studied the relative frequency of these cascades vs their intensity, over a theoretically infinite time interval. It is now interesting to consider the probability that none of these cascades, within a certain *finite* observational interval, exceeds a certain intensity threshold—equivalently, the probability that the *largest* cascade in a time interval is smaller than such a threshold. This investigation belongs to the realm of extreme value theory and goes under the name of statistics of *block maxima*.^{2} In Sec. I A, we have claimed that dynamical EVT can be extended to these phenomena. We therefore expect that scaling relations can emerge in these statistics.

Let again $n$ be the time at which an excitatory ISN fires and generates a cascade (possibly, and frequently, limited to that single spike). To this value, we associate an intensity, $I(n)$, which can be chosen to be the size $S$ of the cascade, its level $G$, or its time span $T$. If more than one cascade originates at time $n$, $I(n)$ is the maximum of the values corresponding to such cascades. On the contrary, if no cascade starts at time $n$ (i.e., no excitatory ISN spikes at time $n$), $I(n)$ is null. Since the system is deterministic, $I(n)$ is actually also a function of the phase-space point $z$ at which the trajectory is located when one starts recording these intensities, as described in Sec. I A,

Following standard usage, we then define the block maxima function $H_L(z)$, for various values of $L$, as the largest intensity of a cascade originating in the time window $n=0,\ldots,L-1$,

$$H_L(z)=\max\{I(n):\ n=0,\ldots,L-1\}. \tag{15}$$

The cascade achieving maximum intensity is, therefore, appropriately called the *extreme event* in the time window under consideration; the wider the time window, the larger, presumably, its intensity. An *Extreme Value Law* for $H_L(z)$ is an asymptotic form, for large $L$, of the statistical distribution of this variable. Consider, therefore, an invariant measure $\mu$ and the function

$$F_{H_L}(h)=\mu(\{z\in Z:\ H_L(z)\le h\}). \tag{16}$$

As discussed in Sec. III A, Eq. (9), we assume that it can be obtained numerically by Birkhoff time averages over trajectories of the system.

Suppose that different cascades arise independently of each other, that is, that the correlation of the values of $H(\phi^j(z))$ for different $j$ decays quickly as their time difference increases. Under this assumption, the probability that $H_L(z)$ is less than a certain value $h$, which is clearly equal to the probability that the intensity $H(\phi^n(z))$ of *all* cascades in the time interval $n\in[0,L-1]$ is less than $h$, becomes a *product* of the probabilities of these individual events. If we let $\lambda$ be the density of cascades per unit time (not to be confused with the density of spikes per unit time), we expect a number $\lambda L$ of cascades to originate in a time window of length $L$, so that

$$F_{H_L}(h)\simeq [F_H(h)]^{\lambda L}. \tag{17}$$

Next, taking logarithms and letting $h$ grow, so that $F_H(h)$ approaches one, we obtain

$$F_{H_L}(h)\sim \exp\{-\lambda L\,E_H(h)\}. \tag{18}$$

Clearly, this is a heuristic estimate, but it yields the same results predicted by the rigorous theory, which requires appropriate conditions^{2} to dispose of the correlations of the motion that might render the independence assumption incorrect. We can verify *a posteriori* that these correlations fade in the limit of large $L$ by a direct evaluation of Eq. (18).
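Such an evaluation can be mimicked with a minimal Monte Carlo model in which cascades are generated independently at rate $\lambda$, with a hypothetical Pareto intensity tail standing in for $E_H$; the quantity $-\log(F_{H_L}(h))/L$ then collapses onto $\lambda E_H(h)$, as in Eq. (18):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, L, trials = 0.05, 2000, 4000    # cascade rate, window length, samples

def window_maximum():
    """Largest intensity among the cascades born in one window of length L.

    Cascades occur independently with probability lam per time step; each
    carries an i.i.d. Pareto intensity, so that E_H(h) = h**-1.5 for h >= 1.
    """
    k = rng.binomial(L, lam)                      # cascades in this window
    if k == 0:
        return 0.0
    return rng.pareto(1.5, size=k).max() + 1.0    # shift Lomax to Pareto

maxima = np.sort([window_maximum() for _ in range(trials)])

h = 50.0
F_HL = np.searchsorted(maxima, h, side="right") / trials   # P(H_L <= h)
lhs = -np.log(F_HL) / L                   # left side of the collapse plot
rhs = lam * h ** -1.5                     # lambda * E_H(h), Eq. (18)
```

Within sampling error, `lhs` and `rhs` agree, which is the synthetic analog of the data collapse of Fig. 9.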

To do this, let us consider again the case of Example 1 with $W=0.09$. Dividing the total number of cascades by the total time span of the computation (averaged over $32$ different runs) yields the value $\lambda=8\times 10^{-3}$, which is much smaller than $\tau^{-1}=5.65$ and reveals that spikes are organized in cascades of average size $(\lambda\tau)^{-1}$, that is, roughly $22$ spikes per cascade, in this case. In Fig. 9, we plot the quantity $-\log(F_{H_L}(h))/L$ vs $h$, for various values of $L$, for the three functions $H=G,S,T$. In all three cases, data points collapse on a curve coinciding with $\lambda E_H(h)$, as predicted by Eq. (18). Observe that the value of $\lambda$ employed in the figure does not come from a fit of the data: it is the one just obtained by counting the number of cascades. The extreme value law given by Eq. (18) appears, therefore, to apply.

We can perform the same analysis in the subcritical case, Example 1 with $W=0.087$. Here, we are outside the domain of the critical scaling theory, but we nonetheless expect a dynamical extreme value theory to apply, with different asymptotic laws that can be deduced from the previous analysis and the formulae of Sec. IV. This is confirmed numerically: in Fig. 10, which is the analog of Fig. 9 for this case, we plot again $-\log(F_{H_L}(h))/L$ and $\lambda E_H(h)$, for $H=G,S,T$. Collapse of the data points is again observed, but for larger values of $h$ than in the previous case.

It is to be noted that, in certain cases, an additional positive factor $\theta$, smaller than or equal to one, appears in front of $\lambda E_H(h)$. This factor is called the extremal index: it implies a slower decay of $F_{H_L}(h)$, originating from the tendency of extreme events to appear in clusters. A rigorous theory has studied this phenomenon in great detail in simple systems,^{38} and it has also been detected experimentally in many-dimensional systems.^{39} The data of Figs. 9 and 10 indicate a unit value of this parameter.

Combining Eq. (18) with Eqs. (11) and (12), we can therefore deduce the asymptotic law for the distribution of the variable $H_L(z)$, for values of $W$ smaller and larger than, or equal to, a critical threshold $W_{cr}$,

$$F_{H_L}(h)\sim\begin{cases} \exp\{-\lambda L\,e^{-A_H h^{\alpha_H}}\}, & W<W_{cr},\\ \exp\{-\lambda L\,B_H h^{1-\beta_H}\}, & W\gtrsim W_{cr}, \end{cases} \tag{19}$$

where the constants $A_H$, $B_H$ and the exponents $\alpha_H$, $\beta_H$ characterize the distribution $E_H(h)$ as described in Sec. IV, Eqs. (11) and (12), now for the generic function $H=S,G,T$.

The experimental function $F_{H_L}(h)$ for $H=S$ is plotted vs $L$ and $h$ in the left frame of Fig. 11, in the case of Example 1 with $W=0.087$. The variables $L$ and $h$ range over orders of magnitude. The difference from the theoretical expression (19) is plotted in the right frame. As expected from Fig. 10, convergence improves as $L$ and $h$ increase. Figure 12 displays the analogous data for the case $W=0.09$. In this case, the difference between the numerical data and the analytical relation (19) is remarkably small over all the ranges considered.

We can now start drawing conclusions. Within the framework of extreme value theory, the two cases just discussed, critical and subcritical, differ because they imply different *normalizing* sequences $h_L(t)$, which we now define, while providing the same *universal* limiting distribution $F(t)=e^{-t}$. In fact, start from $F_{H_L}(h)\sim\exp\{-\lambda L\,E_H(h)\}$. Let $t$ be a positive, real variable. The sequence of levels $h_L(t)$, $L=1,2,\ldots$, parameterized by $t$, is defined so as to satisfy the limit behavior,

$$\lambda L\,E_H(h_L(t))\to t \quad \text{as } L\to\infty. \tag{20}$$

As a consequence of the above,

$$\lim_{L\to\infty}F_{H_L}(h_L(t))=F(t)=e^{-t}. \tag{21}$$

The two previous equations are the core results of EVT, when added to the fact that *three* different limiting distributions $F(t)$ can be achieved after further rescaling, as we will show momentarily. They imply that $h_L(t)$ is a quantile of the asymptotic distribution of the statistical variable $H_L(z)$, whose analytical form involves the ratio $t/L$ and the inverse of the complementary distribution of the function $H$,

$$h_L(t)=E_H^{-1}\!\left(\frac{t}{\lambda L}\right). \tag{22}$$

In the subcritical case, these quantiles grow as a power of the logarithm of the ratio $L/t$,

$$h_L(t)\simeq\left[\frac{1}{A_H}\log\left(\frac{\lambda L}{t}\right)\right]^{1/\alpha_H}. \tag{23}$$

At criticality, a power-law behavior is present,

$$h_L(t)\simeq\left(\frac{B_H\lambda L}{t}\right)^{1/(\beta_H-1)}. \tag{24}$$
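Assuming the tail forms used in Sec. IV, a stretched exponential off criticality and a power law at criticality, the quantile $h_L(t)=E_H^{-1}(t/(\lambda L))$ can be evaluated in closed form; a sketch, with illustrative parameter values:

```python
import numpy as np

lam, t = 8e-3, 1.0                     # cascade rate and quantile parameter

def h_L_subcritical(L, A=0.17, alpha=0.5):
    """Invert E_H(h) = exp(-A * h**alpha) at level t/(lam*L):
    quantiles grow as a power of log(L/t)."""
    return (np.log(lam * L / t) / A) ** (1.0 / alpha)

def h_L_critical(L, B=0.75, beta=1.5):
    """Invert E_H(h) = B * h**(1-beta) at level t/(lam*L):
    quantiles grow as a power of L/t itself."""
    return (B * lam * L / t) ** (1.0 / (beta - 1.0))

for L in (10**4, 10**6, 10**8):
    print(L, h_L_subcritical(L), h_L_critical(L))
```

Over four decades of $L$, the subcritical quantile grows by a modest factor, while the critical one grows by many orders of magnitude: the two regimes are easily told apart.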

Let us finally show how the previous relations can be put in one of the three standard forms.^{1} In fact, algebraic manipulations permit us to rewrite

as $L\to\infty$ for $W<W_{cr}$, which is the expression of a Gumbel distribution, and

as $L\to\infty$ for $W\gtrsim W_{cr}$, which corresponds to a Fréchet distribution.

## VI. CONCLUSIONS

We have described in this paper an extension of the dynamical theory of extreme events to time-evolving phenomena in complex, multi-dimensional systems. We have applied it successfully to a model of a neuronal network, which was studied in full detail. The analysis attains wide generality because this example is characterized by features that are frequently encountered: it is composed of many almost identical subsystems, linked in a sparse, directed network; each individual system integrates the input from the network and in turn returns a signal to it; and signal propagation can develop a full chain of excitations. This has led us to introduce a novel approach for defining neuronal avalanches, whose statistical characteristics were studied within the framework of extreme value theory. This definition of avalanche, and a similar analysis, can obviously be applied to other systems in the class just described.

As discussed above, the presented results are in line with previous theoretical models and experimental findings on neuronal networks. Nonetheless, since the focus of this work is on the dynamical theory, we did not investigate various phenomena of physiological relevance, such as the balance between excitatory and inhibitory neurons (which are nonetheless both present in our model) and the role of pruning and adaptation in sustaining criticality.^{37} Also, we did not perform the rescaling of the shape of cascades at criticality that reveals a single curve, as in crackling noise, although it is highly likely that this happens here, as shown in Ref. 8. A further topic that could be addressed is the computation of the branching ratio in the chain of excitations, which is clearly possible with the aid of our definition of cascades. Finally, we did not compare different network topologies.

While the setup that we propose to extend the dynamical extreme value theory to complex, time-evolving phenomena is fully rigorous, our results are formal, even if confirmed by numerical experiments. This poses a new challenge for a rigorous mathematical analysis, while at the same time offering a paradigm for experimental studies in various physical systems.

## ACKNOWLEDGMENTS

This material is partly based upon work supported by the National Science Foundation (NSF) under Grant No. DMS-1439786 while the author was in residence at the Institute for Computational and Experimental Research in Mathematics in Providence, RI, during the “Point Configurations in Geometry, Physics and Computer Science” program. The final redaction of the paper was performed under Grant No. PRIN 2017S35EHN_007, “Regular and stochastic behavior in dynamical systems.” Numerical computations were performed on the Oscar cluster at Brown University, Providence, and on the Zefiro cluster at the INFN computer center in Pisa. Enrico Mazzoni is warmly thanked for assistance. We finally thank the reviewers for their careful reading of the full manuscript, a most welcome rarity.

### APPENDIX: STABILITY OF THE FIXED POINT OF RULKOV MAPS

It is easy to derive the fixed point of Rulkov maps and its stability. Consider again an isolated neuron and assume that the first alternative of Eq. (3) holds, which yields the map

From this, the unique fixed point $(\bar x,\bar y)$ easily follows:

$$\bar x=\sigma-1,\qquad \bar y=\bar x-\frac{\alpha}{1-\bar x}=\sigma-1-\frac{\alpha}{2-\sigma}.$$

This result is consistent with Eq. (3) provided that $\bar x<0$, that is, $\sigma<1$.

The differential of the transformation, i.e., the linear map from the tangent space at the fixed point to itself, is

$$J=\begin{pmatrix}\dfrac{\alpha}{(2-\sigma)^2} & 1\\ -\mu & 1\end{pmatrix}.$$

Putting for simplicity $A(\alpha,\sigma)=\alpha/(2-\sigma)^2$, the linear map $J$ has real eigenvalues provided that $(1-A(\alpha,\sigma))^2\ge 4\mu$. In the opposite case, complex conjugate pairs appear. In this case, the square modulus of the eigenvalues equals the determinant of $J$ and is

$$|\Lambda|^2=\det J=A(\alpha,\sigma)+\mu.$$

This situation is pictured in Fig. 13: $A(\alpha,\sigma)$ and $\mu$ are always positive and are taken as coordinates in the $(A,\mu)$ plane. When the corresponding point lies below the parabola $(1-A(\alpha,\sigma))^2=4\mu$, $J$ has two real eigenvalues. If, moreover, $A<1$ (region I), their absolute values are less than one and the (hyperbolic) fixed point is attractive, while it is repulsive when $A>1$ (region IV). In the region above the parabola and below the line $A(\alpha,\sigma)+\mu=1$ (region II), $J$ has a pair of complex conjugate eigenvalues of modulus less than one, so that the fixed point is still attractive and trajectories spiral down to it. When, finally, the point $(A(\alpha,\sigma),\mu)$ lies above both the parabola and the straight line (region III), two eigenvalues of modulus larger than one appear: the fixed point is unstable and trajectories spiral out of it.
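This classification is easy to check numerically. Assuming the standard non-spiking branch of the Rulkov map, $x_{n+1}=\alpha/(1-x_n)+y_n$, $y_{n+1}=y_n-\mu(x_n+1-\sigma)$, the parameters of Example 1 place the NISN value of $\sigma$ in region II and the ISN value in region III, consistently with their dynamical roles:

```python
import numpy as np

def classify(alpha, sigma, mu):
    """Stability region of the Rulkov fixed point (Appendix, Fig. 13).

    Jacobian at the fixed point x = sigma - 1 of the non-spiking branch
    x' = alpha/(1-x) + y,  y' = y - mu*(x + 1 - sigma).
    """
    A = alpha / (2.0 - sigma) ** 2          # A(alpha, sigma)
    J = np.array([[A, 1.0], [-mu, 1.0]])
    eig = np.linalg.eigvals(J)
    real = (1.0 - A) ** 2 >= 4.0 * mu       # below the parabola: real spectrum
    stable = np.all(np.abs(eig) < 1.0)
    if real:
        return "I" if stable else "IV"      # real eigenvalues
    return "II" if stable else "III"        # complex-conjugate pair

# parameters of Example 1: NISNs sit near a stable focus, ISNs do not
print(classify(3.6, 0.09, 1e-3), classify(3.6, 0.103, 1e-3))   # -> II III
```

Only the value $A(\alpha,\sigma)=\alpha/(2-\sigma)^2$ and the Jacobian structure are taken from the Appendix; the eigenvalues are computed directly, so the parabola criterion serves as a cross-check.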

## REFERENCES

V. Lucarini, D. Faranda, A. C. M. Freitas, J. M. Freitas, M. Holland, T. Kuna, M. Nicol, M. Todd, and S. Vaienti, *Extremes and Recurrence in Dynamical Systems*, Pure and Applied Mathematics: A Wiley Series of Texts, Monographs, and Tracts (Wiley, 2016).


*Emergent Complexity from Nonlinearity, in Physics, Engineering and the Life Sciences*, edited by G. Mantica, R. Stoop, and S. Stramaglia, Springer Proceedings in Physics, Vol. 191 (Springer, 2017).

*Directions in Chaos*, edited by H. Bai-lin (World Scientific, Singapore, 1987).