There are indications that for optimizing neural computation, neural networks may operate at criticality. Previous approaches have used distinct fingerprints of criticality, leaving open the question whether the different notions would necessarily reflect different aspects of one and the same instance of criticality, or whether they could potentially refer to distinct instances of criticality. In this work, we choose avalanche criticality and edge-of-chaos criticality and demonstrate for a recurrent spiking neural network that avalanche criticality does not necessarily entrain dynamical edge-of-chaos criticality. This suggests that the different fingerprints may pertain to distinct phenomena.
In biological neural networks, scale-free avalanches of neuronal firing events have suggested that such networks might preferably operate at criticality, in particular, since theoretical studies of artificial neural networks and of cellular automata have highlighted some potential computational benefits of such a state. In these studies, notions of either edge-of-chaos criticality or avalanche criticality were adhered to. Here, using a recurrent neural network of more realistic neurons compared with what has been considered previously, we scrutinize whether these two manifestations of criticality are necessarily co-occurring. Based on a realistic paradigm of neural networks, we show that a positive largest Lyapunov exponent—indicating chaotic dynamics of the network—is conserved as we tune the network from subcritical to critical and to supercritical avalanche behavior. This demonstrates that avalanche criticality does not necessarily co-occur with edge-of-chaos criticality.
In the endeavor to understand how the brain functions, the hypothesis has emerged that biological neural networks might operate at criticality.1–3 The promise of this hypothesis is that, at the critical point, the particular details of the system's individual elements and of their interaction laws cease to matter.4 In this case, the phase transition itself dominates the behavior of the system, so that many astounding anatomical and biophysical details of neural circuits would give way to a few very generic network properties. These would allow the fundamentals of the ongoing information processing and computation to be described in a simple way, at least for this case. Several computational advantages of criticality that render such a state particularly attractive have also been demonstrated, among them optimized information transmission and capacity and increased flexibility of responses.5,6
A fingerprint of criticality is a power-law distribution of the properties exhibited by local descriptors when evaluated across the ensemble. Such distributions were found in the statistics of spontaneous avalanches in cortical tissue recorded with multi-electrode arrays1,7,8 and, more recently, in the auditory system.9 Alongside this evidence, however, alternative mechanisms for the emergence of power-law-like distributions have been suggested,10,11 and the emergence of power-law avalanche distributions has even been questioned in some experiments.12 As a result, the avalanche criticality hypothesis1,13 has remained controversial.
In recurrent neural networks, edge-of-chaos criticality14 has also been studied, mostly in the context of reservoir computing.15,16 For best task performance (in the edge-of-chaos sense of “computation”), a network requires properties somewhat analogous to those ascribed to avalanche criticality: flexibility to represent spatiotemporally diverse inputs, while essentially preserving distance relationships (i.e., similar inputs should trigger similar responses). This is indeed the case at the system's transition from stable to chaotic dynamics, which is characterized by a vanishing largest Lyapunov exponent. In the sense of computation as a reduction of the complexity of prediction,17 however, such a “reservoir” is not actually computing; rather, it serves as a high-dimensional representation space of spatiotemporal input patterns, from which the readout neurons can sample and perform the computation. This notion of “computation” as “the ability to transmit, store and modify information”14 (which has been claimed to be optimal at the edge of chaos) differs from computation seen as the step of simplifying, i.e., destroying, information.17
Despite the links that have previously been drawn between edge-of-chaos and avalanche criticality,18,19 the precise relationship between the two is still not settled. A few studies have exhibited the simultaneous occurrence of both phase transitions,6,20 and we are not aware of contradicting evidence; these studies were, however, based on rather simple network models, with nodes having no intrinsic dynamics. In our contribution, we examine whether avalanche criticality in fact needs to co-occur with edge-of-chaos criticality in a recurrent spiking neural network model with realistic, non-trivial node dynamics. We will show that this is not the case.
II. NEURAL NETWORK MODEL
Our recurrent neural network model is based on “Rulkov neurons”21 with dynamical synapses as the nodes on a directed graph (see Fig. 1). Its architecture reflects the general features ascribed to cortical networks: it consists of both excitatory (80%) and inhibitory (20%) neurons, its connectivity is sparse (connection probability ), and inhibitory synapses are three times stronger than the excitatory ones. The number of neurons in the network is set to N = 128, with excitatory neurons and inhibitory neurons (rounding to the nearest integer). This network size is chosen to trade off between obtaining enough statistics for the avalanche size distributions and a reasonable expense for calculating network Lyapunov exponents. To reflect the origin and target of the chemical signaling between neurons, the edges between our network nodes are directed. Each neuron i in the network has a number of “presynaptic” neurons j that impinge on it, and each neuron is “postsynaptic” from the perspective of the “presynaptic” neuron; the full relationship is defined by the network's weight matrix , where , see Fig. 1(b). Specifically, the number of presynaptic excitatory and inhibitory nodes chosen in this work is set to and (rounded to the nearest integer). For every neuron in the network, from the pool of excitatory and inhibitory neurons, respectively, and nodes are selected at random as presynaptic neighbors. By setting the diagonal elements of w to zero, self-connections are eliminated. By construction, network nodes have an in-degree k_in of 5 or 4 (the latter case due to the elimination of self-connections). Out-degrees vary more substantially, owing to the described selection process. In this simple and controllable way, heterogeneity is introduced in the network, where neurons with higher out-degree dominate the network activity.
The obtained topology could, in a sense, be seen as a simple controllable approximation to in vitro dissociated neural cultures22 that so far have provided the strongest evidence for critical avalanches of spike events.
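The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the simulations: the function name, the index layout (excitatory neurons first), and in particular the split `k_exc=4`, `k_inh=1` are assumptions made here so that the in-degree comes out as 5 or 4; the text only fixes the 80%/20% ratio, N = 128, and the weights 0.6 and 1.8.

```python
import random

def build_weight_matrix(n=128, frac_exc=0.8, k_exc=4, k_inh=1,
                        w_exc=0.6, w_inh=1.8, seed=0):
    """Return (w, n_exc); w[i][j] is the weight from presynaptic j to postsynaptic i.

    k_exc and k_inh are illustrative assumptions, not values from the paper.
    Inhibitory weights are stored as positive magnitudes; their sign is
    assumed to be handled elsewhere (e.g., via reversal potentials).
    """
    rng = random.Random(seed)
    n_exc = round(frac_exc * n)            # indices 0..n_exc-1 are excitatory
    exc = list(range(n_exc))
    inh = list(range(n_exc, n))
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # draw presynaptic neighbors at random from each pool
        for j in rng.sample(exc, k_exc):
            w[i][j] = w_exc
        for j in rng.sample(inh, k_inh):
            w[i][j] = w_inh
        w[i][i] = 0.0                      # eliminate a self-connection, if drawn
    return w, n_exc

w, n_exc = build_weight_matrix()
in_degree = [sum(1 for x in row if x != 0.0) for row in w]
out_degree = [sum(1 for i in range(len(w)) if w[i][j] != 0.0) for j in range(len(w))]
```

Because every row receives a fixed number of draws while columns are hit at random, `in_degree` is nearly constant and `out_degree` is spread out, which is the source of heterogeneity mentioned above.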
To model the dynamics exhibited by a neuron of the network labeled by index i, we use Rulkov's two-dimensional iterative map,21,23 where we denote the iteration step by index n:
where . models the neuron's membrane voltage, whereas describes a regulatory subsystem able to turn the firing on and off (Fig. 1(c)). contains the postsynaptic input to neuron i (see below). Parameter σ controls the state of the map (Fig. 1(d)), where at the bifurcation point. The parameter values for excitatory and inhibitory neurons are identical: . For these values, at , Rulkov's map undergoes a subcritical Neimark–Sacker bifurcation from silent to spiking behavior, corresponding to Class II neuron behavior.24 Rulkov neurons can reproduce essentially all experimentally observed spike patterns,23 and even finer neurobiological details, such as phase response curves of biological neurons.25 In this respect, our network model differs substantially from previous efforts to link avalanche and edge-of-chaos criticality, which were based on probabilistic binary units6 or analog rate neurons.20
Rulkov neurons interact by means of synapses that are attached at the message-receiving side of the neuron. Occasionally, synapses receive input from other neurons in the form of spikes. A spike variable carries the value 1 if at iteration n, neuron i has generated a spike, and a value 0 otherwise:
i.e., neuron i is firing at iteration n, if attains the maximum value (red horizontal line in Fig. 1(c)). Synapses have their own dynamics that are modeled by an exponential decay and step-like increase upon presynaptic spike events as
Here, η controls the decay rate of the synaptic current and and are the reversal potentials for excitatory and inhibitory synapses, respectively. When there is no connection between the neurons i and j ( , where i is the index of the postsynaptic neuron and j is the index of the presynaptic neuron, see Fig. 1(b)), or if there has not been a presynaptic spike event ( = 0), the corresponding entry in the sum vanishes. The decay parameter is chosen as , the reversal potentials as and , and the external input weight as wext = 0.6. Internal excitatory connections have a weight of wij = 0.6 as well, whereas inhibitory connections have a tripled weight of wij = 1.8. By joining the role of σ, can push intrinsically silent Rulkov neurons to emit spikes (Figs. 1(d) and 1(e)). W is a connectivity-scaling factor; increasing it increases the coupling among the neurons without changing architecture otherwise. This is in contrast to processes like Hebbian learning or mechanisms of synaptic plasticity used in other approaches. By changing W, diverse activity states can be accessed.
Being composed, so far, of intrinsically non-spiking neurons, to become excited our network requires internal or external excitatory sources. In our approach, we implemented both. The internal one is a localized source of activity representing a so-called “nucleation site”26 or a “leader neuron.”27 This is implemented by putting one of the network's excitatory neurons above firing threshold (from to ), which causes this neuron to fire in a subtly chaotic manner. This firing behavior is then additionally modified by means of recurrent connections from the network (cf. Figs. 1(a), 1(d), and 1(f)). The external source of activity captures the influence of noise on neuronal activity (such as by spontaneous neurotransmitter vesicle release or quickly changing external stimulation). This aspect is modeled by an individual excitatory Poisson spike train input to each neuron, represented by an external spike variable having value 1 when the i-th neuron receives an external spike and 0 otherwise
where p is a random number drawn at each time step from a uniform distribution in the open interval (0,1). Choosing renders the external input temporally sparse.
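A minimal sketch of such an external input is given below. It assumes that an external spike is emitted whenever the uniform random number p falls below a per-step rate, which is the usual discrete-time approximation of a Poisson spike train; the function name and the `rate` argument are placeholders introduced here, and no particular rate value is implied.

```python
import random

def external_spikes(n_neurons, n_steps, rate, seed=0):
    """Return spikes[n][i] = 1 if neuron i receives an external spike at step n.

    Assumption: a spike occurs when a uniform random number p is below `rate`.
    random.random() samples [0, 1), used here in place of the open interval (0, 1).
    """
    rng = random.Random(seed)
    return [[1 if rng.random() < rate else 0 for _ in range(n_neurons)]
            for _ in range(n_steps)]

# Independent spike trains for 10 neurons over 2000 steps (illustrative rate).
spikes = external_spikes(n_neurons=10, n_steps=2000, rate=0.05, seed=1)
```

A small `rate` makes the input temporally sparse, since the expected number of external spikes per neuron and step equals `rate`.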
A single simulation covered time steps, of which the first 5000 steps were discarded. For each value of W, we picked 50 simulations that exhibited a requested level of activity (at the critical point we occasionally increased this number). The precise requirement was that the average inter-event interval (i.e., the average time between two subsequent spikes in the network) of a simulation run should fall into the interval , where and are the mean and standard deviation of the distribution across all network simulations at a particular W. Such sampling ensures that the results reflect typical network realizations, and it permits pooling the results from different simulations.
IV. RESULTS: AVALANCHE CRITICALITY
Following the approach taken in experimental investigations,1 we chose a binning of time of width and defined neuronal avalanches as the maximal extensions of nonempty adjacent bins. Experimental investigations originally focused on avalanches in population activity inferred from local field potential recordings,1,5 but individual neuron action potentials were later considered as well.7,8,28–30 Avalanche size S was measured as the total number of spikes within the avalanche; avalanche lifetime T was measured by the number of time bins an avalanche spans. Our exponents characterizing the avalanche size and lifetime distributions are maximum likelihood estimates, and the goodness of fit was evaluated following Ref. 31 (see the Appendix).
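The avalanche definition above translates directly into a short routine. The following sketch assumes that all spike times of the network have been pooled into one list; the function name and the argument `bin_width` are chosen here for illustration.

```python
def avalanches(spike_times, bin_width):
    """Return (sizes, lifetimes) of avalanches in a pooled list of spike times.

    An avalanche is a maximal run of adjacent nonempty time bins.
    Size = number of spikes in the run, lifetime = number of bins in the run.
    """
    counts = {}
    for t in spike_times:
        b = int(t // bin_width)            # index of the bin containing t
        counts[b] = counts.get(b, 0) + 1
    sizes, lifetimes = [], []
    prev = None
    for b in sorted(counts):
        if prev is not None and b == prev + 1:
            sizes[-1] += counts[b]         # bin continues the current avalanche
            lifetimes[-1] += 1
        else:
            sizes.append(counts[b])        # an empty bin preceded: new avalanche
            lifetimes.append(1)
        prev = b
    return sizes, lifetimes

sizes, lifetimes = avalanches([0, 1, 2, 5, 12, 13, 30], bin_width=5)
# sizes == [6, 1], lifetimes == [3, 1]
```

In the example, bins 0 to 2 are nonempty and adjacent and therefore form one avalanche of six spikes; the isolated spike at t = 30 forms a second avalanche of size one.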
A weight scaling exhibited subcritical, critical, and supercritical avalanche behavior, respectively (Fig. 2). The increase of W led to an overall increase in network activity (Fig. 2(a)), resulting in -values of 110, 48, and 8 Rulkov time steps (rounded), for the subcritical, critical, and supercritical networks, respectively. The avalanche size distribution of the critical network follows a power law, , with exponent (Fig. 2(b)). Because of the network's finite size of N = 128 elements, a noisy cut-off after must be expected.
In the subcritical case, the avalanches are generally small and their size distribution decays exponentially, while in the supercritical case, the increased number of large avalanches results in a characteristic hump toward the end of the distribution.32 A similar metamorphosis of the distribution shape is observed for avalanche lifetimes. At critical avalanche behavior, the lifetime distribution also follows a power law (exponent , Fig. 2(c)). The fit is, however, somewhat less convincing than the one obtained for the size distribution, which is a commonly observed phenomenon in electrophysiological experiments.1,28,29
Power law distributions can, in principle, be caused by several mechanisms. To confirm that the network is at (approximate) criticality, we performed the following tests. For genuine scale-free behavior, the choice of the temporal bin size, , should not affect the avalanche size distribution (cf. Ref. 33). Fig. 3(a) exhibits that in the critical case, choosing different binnings only mildly affects the distribution, whereas the effects are markedly stronger in the subcritical and the supercritical cases.
We moreover checked whether all avalanches of the critical state would collapse to one characteristic shape, upon a corresponding rescaling of time.30,34 To this end, the shape V(T, t) of an avalanche of lifetime T is calculated. V(T, t) expresses the temporal evolution (time variable t) of its “shape” measured by the number of spikes emitted in the temporal bin around time t. For each T, the average avalanche shape is calculated. From the scaling ansatz between the mean size of avalanches and their lifetime T, , the critical exponent γ is obtained. From this, the universal scaling function representing the characteristic shape of all avalanches, emerges as . For our “critical” network, indeed follows a power law, with exponent (Fig. 3(b)). In the case of our supercritical network, a smaller range of the function also follows a power law, which permits the comparison between the self-similarities of the avalanche shapes of the critical and the supercritical state. For the critical network, we observe a noisy collapse of the avalanche shapes of duration (Fig. 3(c)), which is not the case for the supercritical network (Fig. 3(d)). Because of the strong influence of the intrinsically spiking neuron (for further evidence, see the Appendix), avalanche shapes of short lifetimes (T < 25) fail to exhibit a nice collapse. Generally, universal scaling at shortest length scales should not be expected, as individual system part behavior can be stronger than the collective behavior at these scales.34 As a final test we examined whether the crackling noise relationship between critical exponents, , would hold,30,34 for our critical network. The critical exponents of avalanche lifetime distribution ( ), avalanche size distribution ( ), and the function of the mean avalanche size depending on the lifetime ( ) fulfill the required relation remarkably well. 
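The two scaling checks described in this paragraph can be illustrated as follows. The sketch uses the standard form of the crackling noise relation, in which γ equals (lifetime exponent − 1)/(size exponent − 1), and the standard collapse ansatz in which V(T, t) is divided by T^(γ−1) and plotted against t/T. Function names, the tolerance, and the bin-midpoint convention are assumptions made here; the numbers in the example are the textbook mean-field exponents, not the values measured for our network.

```python
def predicted_gamma(size_exponent, lifetime_exponent):
    """Gamma implied by the crackling noise relation (standard form)."""
    return (lifetime_exponent - 1.0) / (size_exponent - 1.0)

def crackling_relation_holds(size_exponent, lifetime_exponent, gamma, tol=0.1):
    """Compare the fitted gamma with the predicted one (tol is illustrative)."""
    return abs(predicted_gamma(size_exponent, lifetime_exponent) - gamma) <= tol

def rescale_shape(shape, gamma):
    """Rescale a mean avalanche shape of lifetime T = len(shape).

    Returns (t/T, V/T**(gamma-1)) pairs; bin midpoints are used for t.
    Shapes of different T should fall on one curve at criticality.
    """
    T = len(shape)
    scale = T ** (gamma - 1.0)
    return [((k + 0.5) / T, v / scale) for k, v in enumerate(shape)]

# Mean-field example: size exponent 1.5, lifetime exponent 2 imply gamma = 2.
gamma_mf = predicted_gamma(1.5, 2.0)
collapsed = rescale_shape([3.0, 6.0, 3.0], gamma_mf)
```

Plotting `collapsed` for many lifetimes T on common axes gives the kind of collapse test shown in Figs. 3(c) and 3(d).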
Taken together, power law distributions, self-similarity of avalanche shapes, and an excellent fulfillment of the fundamental relation between critical exponents strongly suggest that our “critical” network is indeed from the close vicinity of a critical network state.
V. RESULTS: LYAPUNOV SPECTRA
To determine whether avalanche criticality is confined to the edge-of-chaos, we calculated the Lyapunov spectrum for the subcritical, the critical, and the supercritical cases, by using the Jacobian matrix evaluated at points along the trajectory of the network's state vector35 (see the Appendix for further details). For the two notions of criticality to co-occur, we would expect the largest Lyapunov exponent λ1 to be negative, vanishing, and positive, for the three cases, respectively. λ1, however, is positive in all of the three cases (Fig. 4). Across the full parameter neighborhood of avalanche criticality considered ( ), its numerical value is essentially unchanged ( for the subcritical and critical network and for the supercritical network). If one Rulkov iteration is identified with a duration of , which is sometimes done to facilitate biological interpretation,23 values around would be obtained.
Lyapunov spectra can, moreover, provide a deeper insight into the observed phenomenon. Upon increasing the synaptic strength, the total number of positive Lyapunov exponents increases as well. Every positive Lyapunov exponent amplifies perturbations of the microstate to an observable macrostate change. The sum of all positive λd gives the total average rate of this amplification, . This sum is also known as the upper bound of the Kolmogorov–Sinai entropy,36 and can be interpreted as an entropy production rate. H increases with stronger synaptic coupling, from 0.014 ± 0.003 (mean ± standard deviation) for the subcritical case, via 0.023 ± 0.006 for the critical case, to 0.044 ± 0.027 for the supercritical case. Therefore, although the supercritical case has a slightly smaller largest Lyapunov exponent, it loses information about a past state at a faster rate.
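As a small worked example, the entropy production rate H can be computed from a Lyapunov spectrum by summing its positive entries. The spectrum below is made up for illustration and is not taken from our simulations.

```python
def entropy_production_rate(spectrum):
    """Sum of the positive Lyapunov exponents (upper bound of the KS entropy)."""
    return sum(lam for lam in spectrum if lam > 0.0)

def count_positive(spectrum):
    """Number of expanding directions in the tangent space."""
    return sum(1 for lam in spectrum if lam > 0.0)

# Hypothetical spectrum, sorted from largest to smallest exponent.
spectrum = [0.03, 0.01, 0.0, -0.2]
H = entropy_production_rate(spectrum)     # 0.03 + 0.01
n_pos = count_positive(spectrum)          # 2
```

The example also shows why H and λ1 can move in opposite directions: adding further small positive exponents raises H even if the largest exponent decreases slightly.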
The majority of the studies on dynamical stability in neural networks have used the perturbation method; i.e., every simulation run of the network activity was repeated after adding a random perturbation δ0 to the state vector, so that the largest Lyapunov exponent could be assessed from the evolution of the distance between the network's unperturbed and perturbed trajectories.6,15,16,37 This approach yields only an estimate of λ1 and does not provide information about the rest of the Lyapunov exponents. Moreover, such perturbations may be far away from the perturbation limit inherent to the definition of Lyapunov exponents, so that it cannot be excluded that the largest Lyapunov exponent obtained following such approaches depends on the size of the perturbation. Then, instead of a positive value for the first Lyapunov exponent, a negative value could emerge.38 The method employed here does not suffer from such potential shortcomings.
Chaotic dynamics could be a collective effect of the network interactions, or arise simply because nodes themselves have chaotic dynamics. To scrutinize this, we measured the largest Lyapunov exponent of the intrinsically spiking neuron in the absence of network input and found it to be . In the presence of external input, the neuron is occasionally silenced (Fig. 1(f)). This behavior is well-known from Class II neurons24 in the vicinity of an Andronov–Hopf bifurcation.39 Because the intrinsically spiking Rulkov neuron used in our simulations is close to the Neimark–Sacker bifurcation, some perturbations are able to push the neuron's state variable close to the unstable fixed point. During the time it takes to escape from the fixed point, the neuron does not fire. As a result of this occasional silencing, the long-time average of the neuron's largest Lyapunov exponent drops, when the neuron is embedded in the network, to , which is in close agreement with the value of λ1 found in our network. This suggests that the largest Lyapunov exponent of the network essentially captures the dynamics of the intrinsically spiking neuron. In the subcritical and critical cases we find generally between 3 and 4 more positive Lyapunov exponents, which is close to the number of neurons that receive direct inputs from the intrinsically spiking neuron. Therefore, the chaos in our networks may originate from this single neuron's dynamics; an increase of coupling transmits its behavior more efficiently into the network.
In the analysis of local field potential avalanches in Ref. 1 and, more recently, also for cochlear activation networks,9 exponents were measured. Mostly, experimental settings have yielded avalanche size distribution exponents .7,8 Against this background, the value of exhibited by our biologically plausible network may seem high; but similar values ( ) were observed in simulations of bursting recurrent networks (for “background” avalanches26 that have been linked to critical percolation on a Cayley tree-like network yielding the same value40). Even though strong qualitative changes in the topological network state were enforced, our networks exhibited chaotic dynamics. This demonstrates that avalanche criticality does not necessarily co-occur with edge-of-chaos criticality. Rather, this suggests that in neural networks with non-trivial node dynamics, two separate phase transitions may occur. The high variability of exponent α may express the different network and dynamical conditions under which avalanche criticality is possible, and may point to a link between avalanche and edge-of-chaos criticality, albeit of a much weaker form than is usually assumed (generally, we expect that higher values of this exponent will be related to stronger computational performance).
Our findings suggest that a full analysis of artificial and simulated neuronal networks should cover the dynamical state in addition to the avalanche behavior: results regarding avalanche criticality obtained for a non-chaotic network might not be relevant for a chaotic network with unpredictable patterns of activity. In addition, our study highlights a “paradox” that may be of importance for understanding biological network behavior: upon an increase of the synaptic coupling, chaos may intensify in the sense of a larger entropy production rate, while losing coherence, as indicated by a decrease of the largest Lyapunov exponent. As the overall coupling value W is increased beyond avalanche criticality in our network, a distinct, substantial maximum of the entropy production emerges, close to but clearly distinguishable from the avalanche critical point. For what class of networks such an observation holds more generally is a question that merits an investigation of its own.
This work was supported by the Swiss National Science Foundation Grant (No. 200021 153542/1), an internal grant of ETHZ (ETH-37 152), and a Swiss-Korea collaboration grant (IZKS2_162190).
APPENDIX: METHOD DETAILS
1. Distribution parameter estimation
The theoretical fits to the avalanche size and lifetime distributions were found following the guidelines in Ref. 31. We assume that the observations o (avalanche size or lifetime) were sampled independently from a distribution parametrized by α. The likelihood of the parameter α is given by the probability of the observations o, given α
where M is the number of samples. In practice, we used the logarithm of the likelihood, , which allows the product to be replaced with a sum. The log-likelihood has its maximum at the same α as the likelihood, owing to the monotonic nature of the logarithm function. Thus, the maximum likelihood estimator of the parameter α is
In the case of a discrete, truncated power law distribution of o with scaling exponent α within the bounds and , the probability of om is
which gives the log-likelihood that needs to be maximized as
Similarly, we can derive the log-likelihood of the exponential decay constant, ε, of a discrete, truncated exponential distribution of o as
To determine the range of our fits, we fitted each distribution over a range of , where b was a conveniently chosen cutoff, and followed the procedure outlined in Ref. 31. The maximum likelihood fit was used to generate 1000 surrogate datasets. The surrogates were then fit, in turn, using maximum likelihood, and for each fit, the Kolmogorov–Smirnov distance was calculated. As a measure of plausibility of a fit for a given distribution type, p-values were calculated as the fraction of surrogate data sets with a higher Kolmogorov–Smirnov distance (a worse fit) than the corresponding experimental data fit. The value of a was chosen as the lowest value for which the p-value exceeded 0.05.
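The fitting procedure of this subsection can be sketched as follows, assuming the standard form of a discrete power law truncated to a ≤ o ≤ b, p(o) = o^(−α) / Σ_{k=a..b} k^(−α), so that ln L(α) = −α Σ_m ln o_m − M ln Σ_{k=a..b} k^(−α). The golden-section search, its interval [1, 4], and all function names are choices made for this illustration; they are not taken from Ref. 31 or from the original analysis code.

```python
import math
import random

def log_likelihood(alpha, data, a, b):
    """ln L(alpha) for a discrete power law truncated to a <= o <= b."""
    norm = sum(k ** -alpha for k in range(a, b + 1))
    return -alpha * sum(math.log(o) for o in data) - len(data) * math.log(norm)

def fit_alpha(data, a, b, lo=1.0, hi=4.0, tol=1e-6):
    """Maximum likelihood exponent via golden-section search on [lo, hi]."""
    data = [o for o in data if a <= o <= b]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if log_likelihood(c, data, a, b) > log_likelihood(d, data, a, b):
            hi = d                         # maximum lies in [lo, d]
        else:
            lo = c                         # maximum lies in [c, hi]
    return 0.5 * (lo + hi)

def ks_distance(data, alpha, a, b):
    """Kolmogorov-Smirnov distance between empirical and model CDF on [a, b]."""
    data = [o for o in data if a <= o <= b]
    norm = sum(k ** -alpha for k in range(a, b + 1))
    counts = {}
    for o in data:
        counts[o] = counts.get(o, 0) + 1
    cdf_model = cdf_emp = dist = 0.0
    for k in range(a, b + 1):
        cdf_model += k ** -alpha / norm
        cdf_emp += counts.get(k, 0) / len(data)
        dist = max(dist, abs(cdf_emp - cdf_model))
    return dist

def sample(alpha, a, b, size, rng):
    """Draw surrogate data from the truncated power law."""
    values = list(range(a, b + 1))
    return rng.choices(values, weights=[k ** -alpha for k in values], k=size)

def p_value(data, a, b, n_surrogates=1000, seed=0):
    """Fraction of surrogates whose own fit has a larger KS distance than the data."""
    rng = random.Random(seed)
    data = [o for o in data if a <= o <= b]
    alpha = fit_alpha(data, a, b)
    d_obs = ks_distance(data, alpha, a, b)
    worse = 0
    for _ in range(n_surrogates):
        s = sample(alpha, a, b, len(data), rng)
        if ks_distance(s, fit_alpha(s, a, b), a, b) > d_obs:
            worse += 1
    return alpha, worse / n_surrogates
```

To select the lower bound a as described, one would call `p_value` for increasing a and keep the lowest a whose p-value exceeds 0.05. The exponential case follows the same pattern with `k ** -alpha` replaced by `math.exp(-eps * k)`.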
2. Avalanche shapes
The obtained individual avalanche shapes V(T, t) are highly variable. For the mean avalanche shape calculation, to improve the statistics, also avalanches of lifetimes were included (Fig. 5). The avalanche statistics for each lifetime T thus covered samples, the smaller sample numbers accounting for larger avalanches.
In the case of the critical network, the periodicity of the intrinsically spiking neuron is roughly 240 Rulkov time steps (jittering around 235–247 time steps with the mode of the distribution at 237), which is equivalent to about 5 temporal bins. The peaks in the average avalanche shapes are at and thus are each separated by one mean interspike interval of the intrinsically firing neuron. For larger t, the peaks are less prominent and their spacing varies more, so that this relationship is less apparent.
3. Calculation of Lyapunov spectra
The largest Lyapunov exponent λ1, describing the time-averaged rate of the strongest exponential separation of system trajectories in the tangent bundle, is used to determine whether a dynamical system is stable or chaotic. For λ1 < 0, nearby trajectories converge, while λ1 > 0 implies divergence of nearby trajectories and is the hallmark of chaos. At the critical point, λ1 = 0; in its neighborhood, the system experiences a critical slowing down of the dynamics, where small perturbations can have long-lasting effects. We numerically determined λ1 using the local linearization along the system's trajectory, i.e., the Jacobian matrix of the neural network.35,36 This powerful method not only yields λ1, but provides the whole Lyapunov spectrum, i.e., all Lyapunov exponents of the system.
Every neuron lives in a three-dimensional state space: the two state variables and , and the synaptic input variable , into which also the temporally sparse external inputs are incorporated. The Jacobian matrix for a single neuron has the form
where . The extension of the single-neuron Jacobian matrix to the full network is straightforward. Because the interaction is only through spike events, the state variables of a neuron do not directly depend on the state variables of other neurons. We can therefore write the Jacobian of the full network, , as a block diagonal matrix with the Jacobians of the individual neurons on the diagonal and all other elements equal to 0.
Lyapunov exponents are obtained by following how a unit sphere On is transformed by the Rulkov Jacobian, into an ellipsoid . A one-step growth rate of a unit length base vector is thus given by the length of the mapped vector . By applying a Gram–Schmidt orthonormalization procedure, a new unit sphere is reconstructed (with generally rotated base vectors) and the growth rates into their directions are determined anew. Maintaining the initial indexing of the unit base vectors and repeating this procedure, after n iterations the separation of trajectories into the direction described by index d is . Owing to the Gram–Schmidt procedure, for n large, index d = 1 describes the largest and index the smallest, separation in the tangent bundle. If we are interested in exponential separation , the long-time behavior of the system is described by the Lyapunov exponents
where the sign of the first exponent provides the information whether the system is “chaotic” or not . The orthonormalization procedure is computationally expensive, which makes the calculation of Lyapunov exponents for large networks slow. We calculated the Lyapunov exponents of the subcritical, critical, and supercritical networks for 10 random configurations out of the 50 that we used to obtain the avalanche size and lifetime distributions. The simulation length for the calculations was kept at time steps. The Lyapunov exponents converged well, but, to take care of potential fluctuations, the final value of λd was obtained by averaging over the last 5000 steps.
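The procedure can be illustrated on a toy system. The sketch below implements the Jacobian-based method with Gram–Schmidt reorthonormalization in plain Python and applies it to the two-dimensional cat map, used here only as a stand-in because its exponents are known analytically (±ln((3+√5)/2)); it is not the Rulkov network Jacobian, and all function names are introduced for this example.

```python
import math

def gram_schmidt(vectors):
    """Orthonormalize vectors in order; return (basis, lengths before normalizing)."""
    basis, norms = [], []
    for v in vectors:
        u = list(v)
        for e in basis:                    # remove components along earlier directions
            proj = sum(ui * ei for ui, ei in zip(u, e))
            u = [ui - proj * ei for ui, ei in zip(u, e)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        norms.append(norm)
        basis.append([ui / norm for ui in u])
    return basis, norms

def lyapunov_spectrum(jacobian_at, step, x0, n_steps):
    """Average the log growth rates of an orthonormal frame along a trajectory."""
    dim = len(x0)
    basis = [[1.0 if r == c else 0.0 for r in range(dim)] for c in range(dim)]
    log_sums = [0.0] * dim
    x = list(x0)
    for _ in range(n_steps):
        jac = jacobian_at(x)
        # map every base vector with the local Jacobian
        mapped = [[sum(jac[r][k] * e[k] for k in range(dim)) for r in range(dim)]
                  for e in basis]
        basis, norms = gram_schmidt(mapped)
        for d in range(dim):
            log_sums[d] += math.log(norms[d])
        x = step(x)
    return [s / n_steps for s in log_sums]

def cat_step(x):
    """Toy system: Arnold's cat map on the unit square."""
    return [(2.0 * x[0] + x[1]) % 1.0, (x[0] + x[1]) % 1.0]

def cat_jacobian(x):
    """The cat map is piecewise linear, so its Jacobian is constant."""
    return [[2.0, 1.0], [1.0, 1.0]]

spectrum = lyapunov_spectrum(cat_jacobian, cat_step, [0.1, 0.2], 2000)
```

For this toy map the two exponents sum to zero because the Jacobian has unit determinant. For a network, `jacobian_at` would return the block diagonal matrix described above, and the cost of the orthonormalization grows quickly with the dimension, which is the expense referred to in the text.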