We review an empirically grounded approach to studying the emergence of collective properties from individual interactions in social dynamics. When individual decision-making rules, *strategies*, can be extracted from the time-series data, these can be used to construct *adaptive social circuits*. Social circuits provide a compact description of collective effects by mapping rules at the individual level to statistical properties of aggregates. This defines a simple form of social computation. We consider the properties that complexity measures would need to have to best capture regularities at different levels of analysis, from individual rules to circuits to population statistics. One obvious benefit of using the properties and structure of biological and social systems to guide the development of complexity measures is that it is more likely to produce measures that can be applied to data. Principled but pragmatic measures would allow for a rigorous investigation of the relationship between adaptive features at the micro, meso, and macro scales, a long-standing goal of evolutionary theory. A second benefit is that empirically grounded complexity measures would facilitate quantitative comparisons of strategies, circuits, and aggregate properties across social systems.

The origins of social complexity have long fascinated anthropologists, sociologists, and biologists. Attempts to classify social structures by their complexity and to study how different social structures arise remain largely qualitative. There has been, however, enormous progress in physics and computer science in the development of rigorous methods for measuring complexity and structure more generally. There has also been substantial progress within biology accounting for the origins and the maintenance of collective properties from component interactions in adaptive systems. Here, we use a case study from an animal society to illustrate how the emergence of collective properties from component interactions can be quantitatively studied in adaptive social systems as a computational process. Our results and those of others suggest three natural levels of analysis: the rules individuals use to make strategic decisions, the social circuits that describe how the rules are collectively implemented, and the aggregate features—social structure—that result from this collective implementation of the rules. A central challenge is establishing the relationship between complexity at these different scales. We discuss candidates for complexity measures at each level, and challenges for implementation given finite population size, collective behavior, stochasticity, and other features common to biological and social systems.

## I. INTRODUCTION

In the last twenty or so years, there has been significant quantitative progress made in our understanding of how systems with functional aggregate properties arise from interactions among more basic components (e.g., Refs. 43, 47, 58, 11, 29, 32, 17, 42, and 33). In the biological domain, examples of functional aggregates include phenotypic features, like spines on the surface of a sea urchin, numbers of cells in the endomesoderm, signal transduction pathways within a cell, and the ability to metabolize essential substrates. These features arise out of the coordinated interactions among a multiplicity of lower-level elements. These elements include genes, proteins, and tissues interacting according to a set of regulatory rules characterized in terms of causal networks or regulatory circuits.^{42}

In social systems, analogs to phenotypic features include properties that arise out of the behavior of individuals interacting in space^{49,50,13,12,48,51} or in social networks.^{25,27–29} Social aggregates range from spatial trajectories and changing volumes of schools of fish,^{13,1} the distribution of fight sizes in a social group,^{22} the distribution of power in animal societies^{25,8} to the degree of assortativity or reciprocity characterizing social interactions. Recent results suggest that the dynamics generating these properties can in some cases be characterized in terms of causal, regulatory networks^{22} much the way they are represented in studies of the phenotype.

In this paper, we pose the question, what kind of complexity measures will be required to describe collective behavior in social systems? Attempts to classify social structures by their complexity remain largely qualitative (e.g., Refs. 10, 20, 31, 9, and 35). There has been, however, enormous progress in physics and computer science in the development of rigorous methods for measuring complexity and structure more generally (see papers in this special issue). Our aim is to suggest some properties that formal complexity measures will need to have to be useful measures of social complexity, given the following ubiquitous characteristics of social systems (see Crutchfield and Machta, Introduction to Focus Issue^{16}):

- Social systems are typically hierarchical and require descriptions at multiple spatial scales.
- Social systems often have functional or adaptive consequences for the individuals comprising them, and successive organizational levels manifest distinct functional capabilities.
- Social systems are typically the aggregates of components in direct conflict or imperfectly coordinated.
- Components of social systems are cognitive entities with significant constraints and biases in their behavior.
- The components of social systems turn over far more quickly than the whole, and multiple time scales of activity play a crucial role in adaptive behavior.
- Social systems undergo fusion and fission, meaning that social systems are not perfectly individuated collectives.
- Decision-making and strategy use are often probabilistic, and there can be finite size effects.

To ground our discussion, we use a case study from an animal society. We begin by describing the character of the raw data and the functionally relevant aggregate properties encoded in these data. In the context of this case study, we briefly review the methods for extracting decision-making rules, or strategies, from time-series data. We then characterize the causal networks that arise from the collective implementation of rules. We discuss the value of these causal social networks for mapping the decision-making rules to aggregate statistical properties. We interpret these mappings as social computations in order to emphasize that the aggregate consequences of strategy use can be as important to individuals as the direct consequences of their decisions, and hence, as with strategies at the individual level, we should expect (and find) that the aggregate consequences are tuned through collective behavior.

The case study is of a carefully investigated system (e.g., Refs. 22, 21, 27, 28, and 26), with data collected at multiple spatial and temporal scales, and so makes the challenges for complexity theory explicit. We detail the properties complexity measures should have given the character of the organizational structure present at the micro, meso, and macro scales and the quality of data present at each scale. We are keen to avoid generic measures of complexity, which fail to address biologically interesting properties of societies (given in the list in the Introduction) and that add little to our understanding other than a few numerical outputs of questionable value.

## II. COLLECTIVE SOCIAL COMPUTATION

In this section, we consider what constitutes an input to a social computation, what statistical features of social data constitute the output of a computation, and the validity of a termination criterion in the context of social computation. We leave the mechanisms in support of the algorithm mapping inputs to outputs to the end, as this requires the most involved description. Our objective is to demonstrate that each stage of the computation corresponds to an increasingly coarse-grained level of description and that these coarse-grainings are useful to the system components. We further suggest that each level lends itself to description with a different complexity formalism.

### A. Conflict data

The raw data describe a conflict time-series in a primate social group. The time-series has the following structure: a fight *f*_{i} occurs at time *t*_{i,s} and terminates at time *t*_{i,t}. Fights are separated from subsequent fights by a peaceful period of duration *p*_{i}. The number, identity, and actions of the individuals in each fight are represented in an annotated vector, *X*_{i}. Hence, we have a discrete series of measurements. The basic structure of the time-series is shown in Figure 1. To ground the discussion, we begin with a brief description of the study system.

The data were collected from a large group of captive pigtailed macaques (*Macaca nemestrina*) socially housed at the Yerkes National Primate Center in Lawrenceville, Georgia. Data on social dynamics and conflict were collected from this group over a stable four month period.

The study group contained 48 socially mature individuals and 84 individuals in total. Conflicts are typically expressed as fights in this system. Fights involve two or more individuals and are separated by peaceful periods—defined as the absence of fights among any of the group members. Briefly, a “fight” here includes any interaction in which one individual threatens or aggresses a second individual. A conflict was considered terminated if no aggression or withdrawal responses (fleeing, crouching, screaming, running away, submission signals) were exhibited by any of the conflict participants for 2 min from the last such event. Fights can involve multiple individuals, with the observed size in the present data set ranging from 2 to 28 individuals. Fights can be conceived of as small networks that grow and shrink as pair-wise and triadic interactions (which occur when fight participants redirect aggression to third parties, or third parties intervene into an ongoing fight) become active or terminate, until there are no more individuals fighting under the two-minute-resolution criterion.
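The time-series structure just described can be sketched as a simple data container. This is an illustrative sketch only; the class and field names and the toy values are our own, not the dataset's.

```python
from dataclasses import dataclass

# Sketch of the conflict time-series structure: a discrete series of fights
# f_i, each with onset t_{i,s}, termination t_{i,t}, and a participant set
# standing in for the annotated vector X_i. Values are illustrative.
@dataclass(frozen=True)
class Fight:
    start: float              # t_{i,s}: time at which fight f_i begins
    end: float                # t_{i,t}: termination under the 2-min criterion
    participants: frozenset   # identities of the individuals involved

    @property
    def size(self) -> int:
        return len(self.participants)

def peaceful_periods(fights):
    """Durations p_i of the peaceful periods between consecutive fights."""
    ordered = sorted(fights, key=lambda f: f.start)
    return [nxt.start - cur.end for cur, nxt in zip(ordered, ordered[1:])]

series = [
    Fight(0.0, 3.0, frozenset({"A", "B"})),
    Fight(10.0, 12.5, frozenset({"A", "C", "D"})),
    Fight(30.0, 31.0, frozenset({"B", "E"})),
]
assert [f.size for f in series] == [2, 3, 2]
assert peaceful_periods(series) == [7.0, 17.5]
```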

### B. The input to computation

The input states in a social system can include the number of individuals, demographic factors such as the age-sex composition of a group, and social and neuro-physiological states. Examples of social states include relative power,^{25} influence, and “likability.” Neuro-physiological states can include levels of neurotransmitters (e.g., serotonin) and steroids (e.g., cortisol) or other physiological variables. The extent to which the input states are chosen to be coarse demographic factors, fine-grained behavioral or social states, or very fine-grained physiological data will depend on the computation being performed—in particular, its spatial scale (is the statistic computed over the micro-dynamics of subgroups of individuals, the whole group, etc.) and the timescale on which the input data change.

In the present case study, only a relatively simple type of input state data is considered: the identities of the conflict participants, which gives a vector with 48 elements. Figure 2 shows the character of the time-series data under this input constraint.

### C. The termination of computation

The termination of computation idea derives from computer science, where loosely an algorithm is defined as a procedure or set of steps for performing a mapping from a set of inputs to a functional set of outputs in finite time.^{19} In order for an algorithm to halt—to produce some output—it must be able to evaluate whether it has reached some desired state. This problem, known as the halting problem,^{55} is distinct from whether the algorithm can generate a “correct” answer. In biological systems, the concept of termination is perhaps less useful, as it is only necessary for an output state to persist for long enough for benefits to be accrued. This is potentially achieved through a separation of timescales between the aggregate level and the faster microscopic dynamics. A slow aggregate property can provide a sufficiently stable background for individuals to tune their decision-making adaptively. Of course, individuals adapt and shape collective features through a process of social niche construction,^{38,27,8,23} and so there is no fixed termination point, only a quasi-stationary equilibrium.

Although a separation of timescales could provide the basis for a quasi-termination mechanism, the problem nonetheless has a dynamic character. In computer science, formal approaches to termination require presentation of the entire input and, hence, apply best to static systems. Not all computer science problems have this character, however; databases, for example, are typically updated continuously. This recognition has prompted efforts to develop computational complexity measures that can be applied when input is continuous. The goal of these measures (e.g., Refs. 39 and 57) is to quantify how much computational work is required to keep the computation current.

In addition to permitting the study of computation in a more diverse set of systems, this development is important because of its implications for how much computational work the system has to perform. It has been shown that the quantity of computational work required to make incremental updates is less than what would be required if the system had to restart from scratch each time it received a new bit of information (personal communication, J. Machta, 2011). Hence, dynamic assessments can make for more efficient computations.
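As a toy illustration of this point (our example, not drawn from the cited work), consider maintaining a distribution of fight sizes as new observations stream in: an incremental update does constant work per observation, whereas restarting from scratch recomputes over the entire history each time.

```python
from collections import Counter

# Toy contrast between restart-from-scratch and incremental computation of
# a running fight-size distribution. The data are illustrative.
def recompute(all_sizes):
    """Restart-from-scratch: O(n) work every time a new observation arrives."""
    return Counter(all_sizes)

def update(counts, new_size):
    """Incremental update: O(1) work per new observation."""
    counts[new_size] += 1
    return counts

stream = [2, 3, 2, 5, 2]
incremental = Counter()
for s in stream:
    incremental = update(incremental, s)

# Both routes agree; only the work required to stay current differs.
assert incremental == recompute(stream)
```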

### D. The output of computation

We define an aggregate statistical property as the output of a collective social computation if it can take on values that are of adaptive value at the group or individual level, is the result of a distributed and coordinated sequence of individual interactions under the operation of a strategy set, and is a stable output on which the input values converge (the computation terminates) in biologically relevant time. This is fairly consistent with the concept of a computation, or effective calculation, in computer science, albeit in a finite system.^{19,18} Hence, for an aggregate property of social dynamics to be considered a computation, it must minimally possess functional consequences either at the aggregate level—influencing how the target group interacts with other groups—or at the level of the system components—directly influencing their behavioral patterns—and it must be tunable.

Statistical properties of the time-series at an aggregate level include the average size of cascades (sequences of fights spanning several bouts), cascade propensity, the cumulative distribution of fight sizes, and the distribution of fight durations. For the sake of simplicity, we focus on the distribution of fight sizes. In previous work,^{22} we referred to this distribution as the *long fraction*. The long fraction is a distribution of fight sizes: the number of fights of size *b*, divided by the total number of fights larger than two,

*P*(*b*) = *N*(*b*)/*c*, 3 ≤ *b* ≤ 48,

where *N*(*b*) is the number of fights of size *b*, the maximum fight size of 48 comes from the total number of socially mature individuals in the group, and *c* is the total number of fights larger than size two. The long fraction is a simple aggregate or population property from which we can extract functional statistics.
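The long fraction can be computed in a few lines. The function name and the toy fight sizes below are our own illustrations, not from Ref. 22.

```python
from collections import Counter

# Minimal sketch of the "long fraction": N(b)/c, the fraction of fights of
# size b among all fights larger than two. Fight sizes are illustrative.
def long_fraction(fight_sizes, max_size=48):
    big = [b for b in fight_sizes if b > 2]
    c = len(big)                      # c: total number of fights larger than two
    n = Counter(big)                  # N(b): number of fights of size b
    return {b: n[b] / c for b in range(3, max_size + 1) if n[b]}

sizes = [2, 2, 3, 3, 3, 4, 5, 2, 4]
lf = long_fraction(sizes)
assert lf[3] == 3 / 6                 # three fights of size 3, six fights > 2
assert abs(sum(lf.values()) - 1.0) < 1e-9
```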

The data indicate that as the number of large fights increases, the cost (measured as aggression received per individual) individuals pay when they or others in the group engage in conflict increases. This is because individuals are more likely to redirect aggression and use severe aggression during large fights. The long fraction feeds down to influence individual behavior by influencing the cost individuals pay for conflict. Individuals can change the long fraction by changing their decision-making rules (see Sec. II E). We have studied elsewhere (e.g., Refs. 23 and 25–27) the causes of more complicated aggregate statistical properties and their feedback consequences to individuals.

### E. Algorithm for computation

In previous work,^{22} we developed a technique, called *Inductive Game Theory* (IGT), for extracting strategies from time-series data. We refer the reader to that paper for details of these methods and provide here an abbreviated description of the IGT approach, explained in the context of extracting conflict decision-making strategies from the conflict time-series described above.

First, we determine the space of strategies yielding the aggregate output. In the case of the long fraction, the strategies concern how individuals decide to join fights, as this decision should account for fight growth and hence variability in fight size. The space of strategies is potentially large and can include factors such as who is involved, relative power of individuals, and number of allies. However, because we have limited our input state data to individual identity data (Sec. II B), the space of strategies we consider is substantially circumscribed. It only includes decision-making rules defined with respect to the identity of conflict participants.

Two parameters are important: the identity of fight participants in bout *f*_{i} and the identity of fight participants in bout *f*_{i}_{+1}. Individuals can, in principle, base their decision to fight on the appearance of single individuals in a previous fight, the appearance of two individuals, or three, and so forth. We can make this general by using the parameter *n* to represent individuals appearing at *f*_{i} and *m* to represent individuals appearing at *f*_{i}_{+1}. The rule specifies that the cause, *c*, of *m* in *f*_{i}_{+1} is the appearance of *n* in *f*_{i}. The space of strategies can be written generically as *c*(*n, m*), where *n* ≥ 0 and *m* ≥ 1. Colloquially, if Mary joins a fight because Jack appeared in the previous fight, this would be an instantiation of *c*(1,1), whereas if Mary joins a fight because Jack and Joe appeared in the previous fight, this would be an instantiation of *c*(2,1).

This formalism captures fights separated by only one peaceful period and corresponds to a first-order Markov process (except in the case of *n* = 0, which is a 0th-order Markov process). We capture longer strings of fights, additional types of social interactions, or higher-order Markov processes by extending *c*(*n, m*) to *c*(*n, m, p*). When *n* = *m* = 1, the strategy can be said to be of a pair-wise nature. When *n* or *m* is greater than 1, the strategy is higher-order.

A principled way to constrain the magnitude of *n* and *m* and to limit the number of steps in *c*(*n, m*,…) is to consider the cognitive capacity (e.g., constraints on memory for number of participants, identity, etc.) of the individuals making the decisions.^{22,30} In this case study, only first-order Markov rules were considered, and *n* and *m* were restricted to values ≤2. Results from our own work as well as results reported in the literature indicate that monkeys commonly form triadic relationships and engage in triadic interactions,^{2,20} suggesting that for the study system these cut-offs are sensible.

We present in Fig. 3, a computational resource lattice for Markovian agents modified from Ref. 22. This describes the full space of join-fight strategies for our system restricted to those strategies that are cognitively plausible. With the space of strategies defined, we are in a position to determine whether any of the posited strategies is causal; in other words, whether any of the alternative *c*(*n, m*) strategies is a generative model that, when implemented collectively, reproduces statistical features observed in the data.

We reduce the number of generative models (strategies) by asking whether there is correlational evidence for any of the strategies in the strategy space, keeping only those strategies with significant (above a null model) associations. To do so, we rewrite the causal models as correlations, such that *c*(1,1) corresponds to a simple pair-wise correlation between the appearance of *A* in *f*_{i} and the appearance of *B* in fight *f*_{i}_{+1}, which can be written as *P*(*A* → *B*) and so forth. In the case study, significant correlations were found for *P*(*A* → *B*), *P*(*AB* → *C*), and *P*(*A* → *BC*).
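The screening step can be sketched as follows. The permutation null model, threshold, and toy series below are illustrative assumptions of ours, not the paper's exact significance procedure.

```python
import random

# Sketch of screening a c(1,1) strategy: estimate P(A -> B), the frequency
# with which B appears in the fight following one containing A, and compare
# it with a permutation null that destroys temporal order. Illustrative only.
def p_follow(series, a, b):
    trials = [b in nxt for cur, nxt in zip(series, series[1:]) if a in cur]
    return sum(trials) / len(trials) if trials else 0.0

def null_quantile(series, a, b, reps=1000, q=0.95, seed=0):
    rng = random.Random(seed)
    vals = []
    for _ in range(reps):
        shuffled = series[:]
        rng.shuffle(shuffled)   # keeps fight composition, destroys order
        vals.append(p_follow(shuffled, a, b))
    return sorted(vals)[int(q * reps)]

fights = [{"A"}, {"B"}, {"A", "C"}, {"B"}, {"A"}, {"B", "C"}, {"C"}, {"A"}, {"B"}]
obs = p_follow(fights, "A", "B")
assert obs == 1.0               # in this toy series B follows every fight with A
# keep the A -> B edge only if the observation exceeds the null quantile:
significant = obs > null_quantile(fights, "A", "B")
```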

The significant *P*(*A* → *B*) correlations can be used to construct a *P*(*A* → *B*) correlation network, in which nodes correspond to individuals and the presence of an edge indicates a significant above null model correlation. Bipartite correlation networks can be constructed for triadic associations. These correlation networks, with their respective topologies, are a first step towards a description of how the strategies, when collectively implemented, generate an aggregate statistical property.

#### 1. Constructing causal networks from strategies

We can intervene into the system,^{41} using *clamping* or *knockout techniques* to manipulate the probabilities associated with specific pairs and triplets to determine the consequences for the long fraction. This will tell us when a correlation is causal. This kind of experiment is used to study the gene regulatory networks that govern gene and protein interactions in the production of phenotypic features like the skin.^{42} Intervention is, however, often impractical for social systems.

A second approach is to build a simulation parameterized by the correlations extracted from the time-series.^{22} In our case study, a simulation can be built for each of the *c*(*n, m*) strategies. In each simulation, only individuals with edges (above null correlations) in the correlation network are allowed to play the strategy. Consider strategies of the form, *c*(1,1). If an above null correlation was found for individuals *A* and *B*, of the form, *P*(*A* → *B*), we accord *B* the possibility of fighting in the simulation at *f*_{i}_{+1} if *A* was fighting in the simulation at *f*_{i}.

This is a social group, so in principle many individuals can be playing a given strategy. Hence, it is not a single strategy that matters but their *collective implementation*. If the collective implementation of a strategy in simulation can recover the target statistical property and alternative strategies cannot, we infer that the successful strategy is the dominant strategy played (with respect to the decision to join fights).

The set of pair-wise (or triadic) above null correlations given by our correlation network is not enough to build the causal network associated with a given strategy. The causal network must further specify how an individual integrates over strategies to make a decision, as fights in social systems can involve multiple players interacting simultaneously. Consider *c*(1,1). Individual *B* can have many above null correlations with other individuals and pairs, some of which are excitatory and some of which are inhibitory. Assume that individuals *D*, *E*, and *F* fighting at *f*_{i} each independently can trigger the appearance of *B* at *f*_{i}_{+1}, but *G* inhibits *B* at *f*_{i}_{+1}. If all of these individuals appear together in a fight, how is *B* to respond? Three individuals recommend “fight” and one recommends “do not fight.” *B* must combine these “recommendations,” each of which corresponds to one pair-wise *c*(1,1) strategy, in order to make a decision. The causal network must include an operator (called a “combinator” in Ref. 22) that specifies how multiple inputs to a node, some of which might be conflicting, are combined to make a decision.

In the case study, this was accomplished by introducing an *AND*/*OR* Boolean gate as a third parameter, giving generative models of the form, *c*(*n, m*) + *AND*/*OR*. In the case of *c*(*n, m*) + *AND*, an individual fights only if all of the individuals fighting in the previous time-step who can trigger it recommend “fight.” In the case of *c*(*n, m*) + *OR*, only one individual (or pair) in the previous fight who can trigger it has to recommend “fight” for it to fight in the next time-step. The combination of the AND/OR gate with the three *c*(*n, m*) strategies gives six alternative causal networks, each of which serves as a potential generative model for the long fraction. As illustrated in Fig. 4, strategies and operators reside at the microscopic scale, causal networks at the mesoscopic scale, and aggregate statistical properties, like the long fraction, at the macroscopic scale. Of course, many other types of rules are possible, including rules that are modulated by a knob (e.g., additive) instead of a binary operator. These issues and the social scenarios warranting more complicated rules are developed in Ref. 24.
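A highly simplified generative model in the spirit of *c*(1,1) + *AND*/*OR* can be sketched as follows. The trigger sets, joining probability, and gate implementation are our own illustrative assumptions, not the paper's exact simulation.

```python
import random

# Toy c(1,1) + AND/OR simulation: each individual has a set of triggers,
# the individuals whose presence in the previous fight can recruit it.
def next_fight(current, triggers, gate, p_join, rng):
    """Compute the set of individuals joining fight f_{i+1}."""
    nxt = set()
    for ind, its_triggers in triggers.items():
        active = its_triggers & current      # triggering individuals present now
        if not active:
            continue
        if gate == "OR":
            fires = True                     # any one trigger suffices
        else:                                # "AND": all triggers must be present
            fires = active == its_triggers
        if fires and rng.random() < p_join:
            nxt.add(ind)
    return nxt

# B is triggered by A and C, C by A, D by B (illustrative circuit):
triggers = {"B": {"A", "C"}, "C": {"A"}, "D": {"B"}}
rng = random.Random(0)

# With deterministic joining (p_join = 1.0), the two gates disagree:
assert next_fight({"A"}, triggers, "OR", 1.0, rng) == {"B", "C"}
assert next_fight({"A"}, triggers, "AND", 1.0, rng) == {"C"}

# With probabilistic joining, cascades die out, a natural termination:
fight, rounds = {"A", "C"}, 0
while fight and rounds < 100:
    fight = next_fight(fight, triggers, "AND", 0.5, rng)
    rounds += 1
assert not fight
```

Note how, under the AND gate, propagation requires coordinated appearances, which is one intuition for why some strategies need large fights or highly connected circuits to keep generating conflict.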

The edges and operators in the causal networks determine changes to the states of the nodes. The edges in causal networks are probabilistic. Only the major causes, with respect to the strategy space explored and the target aggregate property, are specified. We refer to this kind of causal network as a *compressed causal network* or a *social circuit*.

This kind of probabilistic causal network is consistent with existing theories of causality for directed acyclic networks, as developed by Pearl^{40,41} and extended by Ay^{4,5,3,6} to recurrent graphs. One potentially important difference is that although node state changes are induced probabilistically, the probabilities are filtered through a Boolean gate. However, our strategies and operators together could be said to comprise the “mechanisms” that, in the formulations of Pearl and Ay, specify how the nodes in the causal network function.

We know from simulation results that the six alternative causal networks generate very different distributions of fight sizes. Strategies *c*(1,1) + *AND* and *c*(1,1) + *OR* both generate only relatively small fights, whereas *c*(2,1) + *OR* and *c*(1,2) + *AND*/*OR* generate many large and many small fights. Strategy *c*(2,1) + *AND*, on the other hand, generates many small fights and some large fights, as is illustrated in Fig. 4. Both the strategy and the operator are important, although the relative contributions of each to the long fraction remain unquantified. Also unknown is how the topology of the causal networks influences the long fraction. This includes factors like the average degree of a node, the average local clustering coefficient, and the average reach in *n* steps. Fig. 5 gives an intuition for how conflict propagates under the different strategies, assuming the AND operator. As illustrated in Fig. 5, some strategies, such as *c*(2,1), require either large fights, additional rules, spontaneous involvement, or highly connected causal networks to continue to generate fights. This suggests that topological network features, such as connectivity, degree, and reach, could play a role in collective computation of aggregate properties.

Another way to reduce the size and complexity of a causal network is to ask whether the input data (Sec. II B) are at an appropriate resolution. How finely or coarsely resolved is the information used by individuals to make decisions? In the case study, we explored whether the individuals in our study system are using all identity information or are coarse-graining individuals into demographic and social classes, such as “male,” “female,” “powerful,” “not powerful,” and so forth.^{22} This is both a model selection problem and, simultaneously, a cognitive question concerning how much information components in the system are using to make decisions.
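The coarse-graining step itself is simple to sketch: map individual identities to class labels and re-express fights over classes before re-estimating correlations. The class assignments below are hypothetical.

```python
# Illustrative coarse-graining of identity data into demographic classes.
# The mapping is invented for this sketch; counts are discarded in the toy.
classes = {"A": "male", "B": "female", "C": "male", "D": "female"}

def coarse_grain(fight, classes):
    """Re-express a fight (a set of identities) as a set of classes."""
    return {classes[ind] for ind in fight}

fights = [{"A", "B"}, {"C"}, {"B", "D"}]
coarse = [coarse_grain(f, classes) for f in fights]
assert coarse == [{"male", "female"}, {"male"}, {"female"}]
```

Whether the coarse-grained or the full-identity model better predicts the data is then a standard model selection question.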

Finally, it is worth noting that strategies, and hence the social circuits and population statistics that arise from their collective implementation, can change as individuals learn or as the distribution of opportunities to use the strategies changes. In the above discussion, we ignored this complexity as we were dealing with aggregated time-series data. However, one can subdivide the time-series and look at strategy dynamics. It is important to keep in mind that strategy fluctuation does not imply learning or a non-stationary distribution of opportunities to use a strategy. There are several explanations for fluctuations that first need to be ruled out. Among these, two of the most practically important are (a) a weak signal in the data due to noise and (b) an inappropriate choice of the measure of association used to extract strategies.

## III. LEVELS OF ANALYSIS AND COMPLEXITY MEASURES

A central challenge is establishing the relationship between complexity at the micro, meso, and macro scales. Can we say something general about the way individual rules produce regulatory circuits, and about whether circuits with a particular topology (or complexity, see Sec. III B) will tend to favor one distribution over another? Do complex rules produce complex circuits, and how should we measure complexity at each level? We briefly discuss candidates for complexity measures and some challenges for implementation. Perhaps it is not surprising that at the lowest levels measures of structure and computation are more appealing, whereas at the aggregate level statistical measures are likely to prove of greater utility.

### A. The space of strategies

We seek to represent pigtailed-macaque-conflict-specific rules in a generic form. A useful starting point might be to conceive of strategies like *c*(*n, m*) as behavioral production rules. These would be the analogs of linguistic production rules that produce strings within a formal language. Here, strategies and operators, when implemented collectively, produce strings of fights.

It is well understood that sequences of words in sentences that exhibit constituent structure can be generated by simple rules. In 1956, Chomsky introduced a classification of production rules for generating an infinite set of finite-length sequences of symbols. Production rules specify simple read-write operations that input a starting symbol and, through repeated application, generate sentences that belong to a given grammar. Hence, we might have a rule *A* → *AB*, which starting from *A* would generate iteratively the strings, *AB*, *ABB*, *ABBB*, etc. A grammar (*G*) is defined as a function of four arguments: an alphabet of symbols (e.g., *A*, *B*), a set of non-terminal symbols in the alphabet, a finite set of production rules, and a start symbol. Terminal symbols are elements of the alphabet that cannot be taken as inputs to production rules and, hence, terminate string generation. Chomsky defined four classes, or types, of production rule, each of which generates sentences with increasingly embedded structure and requires increasing computational resources. These classes are the regular languages (type 3), the context-free languages (type 2), the context-sensitive languages (type 1), and the unrestricted languages (type 0). These can be summarized in terms of read-write operations on a string. If we use the convention that non-terminal symbols are upper case, terminal symbols lower case, and strings Greek letters:

Type 3: *A* → *a* or *A* → *aB*

Type 2: *A* → α

Type 1: α*A*δ → αβδ

Type 0: α → β

The type-3 rules, or regular grammars, replace one symbol with another and are capable of simple concatenations of symbols. Type-2 rules can replace symbols with strings and generate embedded or recursive sequences. These are described as context-free because adjacent symbols do not influence the replacement. Type-1 rules replace symbols with strings (which can of course be symbols) as a function of context. Type-0 rules have no restrictions and can generate any string based on arbitrary input strings. If we assume a potentially infinite number of applications of the rules, each rule type requires more memory and more computational power to be executed than the last. Type-3 rules can be generated with a simple Markov process, whereas type-0 rules require an infinite-memory Turing machine.
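The iterative application of a production rule, such as the *A* → *AB* example above, can be sketched in a few lines. This is a leftmost-derivation toy for a single rule, not a general grammar engine.

```python
# Toy production-rule application: repeatedly rewriting the leftmost
# occurrence of A with AB, starting from A, yields AB, ABB, ABBB, ...
def apply_rule(string, lhs, rhs):
    """Replace the first (leftmost) occurrence of lhs with rhs."""
    return string.replace(lhs, rhs, 1)

def derive(start, lhs, rhs, steps):
    out, s = [], start
    for _ in range(steps):
        s = apply_rule(s, lhs, rhs)
        out.append(s)
    return out

assert derive("A", "A", "AB", 3) == ["AB", "ABB", "ABBB"]
```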

When there are only a finite number of instantiations, all of these mappings can be implemented with a simple regular grammar of arbitrary order. Not all will be as easily learnt, however. Hence, even if all rules can in principle be implemented with a first-order Markov process, or using a look-up table, this does not imply that the individuals are using this approach. It is important to keep in mind that it is not known whether large look-up tables, with their large memory costs, are more or less cognitively challenging than rules, which require capacities of compression and generalization. In other words, we do not know if the strategic brain is good at model selection.

Below we enumerate some features of our strategies in relation to production rules for formal languages:

1. Superficially, *c*(1,1) is of type 3, *c*(1,2) of type 2, and *c*(2,1) of type 1. These rules define mappings between successive sets, not linear strings of symbols.
2. Context-sensitivity in *c*(2,1) and, more generally, in *c*(*n*, *m*) when *n* > 1, derives from the dependence of behavior at *t*_{n+1} on coordinated behavior at *t*_{n}. The context-sensitivity of *c*(2,1) is relatively trivial, whereas that of strategies defined over multiple bouts, *c*(*n*, *m*, *p*, …), is not.
3. In the social context, individual identity and state are important variables. Without some additional coarse-graining of identity data into demographic classes, individual identities—the symbols in our rewrite system—cannot appear more than once on the left- or right-hand side of a production rule. This is not a problem if the input data are actions rather than individuals.

The behavioral production rules we have discussed are probabilistic. It is very likely that probabilistic rules are a generic feature of all social systems and many biological systems. In addition to population size effects, the extent to which a rule is probabilistic will depend in part on the role of context-dependence in system dynamics. Stochastic transitions provide a natural termination mechanism as any sequence of mappings will eventually come to an end.
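A sketch of stochastic termination (ours, with arbitrary probabilities): the probabilistic regular rule A → aA, applied with probability 1 − p, and A → a, applied with probability p, yields geometrically distributed derivation lengths, so every derivation ends with probability 1:

```python
import random

def sample_length(p_stop, rng):
    """Length of a derivation under A -> aA (prob 1 - p_stop) | A -> a
    (prob p_stop). Each step terminates independently, so lengths are
    geometric and every derivation ends with probability 1."""
    n = 1
    while rng.random() > p_stop:
        n += 1
    return n

rng = random.Random(0)
lengths = [sample_length(0.3, rng) for _ in range(100000)]
print(sum(lengths) / len(lengths))   # close to the geometric mean 1/0.3
```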

In social and biological systems, the output—social structure described in terms of some aggregate statistical property—results from multiple individuals or components implementing production rules asynchronously. Hence, the problem of computation in social dynamics fundamentally has a collective character. This is not explicitly the case for production rules for formal languages.

As a result of the probabilistic nature of the rules, constraints on memory, and, most importantly, the fact that population sizes are constant (a finite number of symbols), temporally varying social structures will have finite depth. Technically, this implies that finite automata could be employed to generate a finite number of instances from a context-sensitive grammar. This finite-automaton representation would, however, likely require greater computational resources, principally in the form of memory.

For individual (and subgroup) decisions, production rules seem to be a natural fit and have the additional value of relating sequences of behavior to sequences of words generated during communication. This opens up the possibility of a grammar of behavior and conflict, and comparing behavioral competence to linguistic competence.

### B. The space of networks

We would also like to be able to compare the complexity of causal networks that map decision-making strategies (production rules) used by individuals onto aggregate features, and to ask how complexity at the network level is correlated with complexity at the rule level. This is an open challenge, and there is as yet nothing like a Chomsky hierarchy for functional or computational networks (for a review of these issues, see Ref. 15 and also Crutchfield, this issue). Measures of network complexity can be roughly classified as informational, functional, and topological.

Informational approaches to complexity include many measures developed in the context of time-series data (e.g., mutual information, statistical complexity,^{14,15} and predictive information^{7}) and are often presented in terms of networks. One measure of great interest is the network complexity measure developed by Tononi, Sporns, and Edelman^{53,52,54,6} to quantify neural complexity, which can, however, be applied to any physical interaction network or causal network. This is an extensive measure based on multi-information, in which the mutual information of all bipartitions of a network is computed. The distribution of multi-information scores across different scales (bipartitions) of the system provides a measure of complexity. An indicator of low complexity is a linear distribution, which arises when components are tightly coupled and the dynamics across scales are homogeneous. Deviations from linearity, and hence increasing complexity, are characteristic of heterogeneity, modularity, and hierarchy.
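A highly simplified sketch of the bipartition idea (our own reduction of the Tononi–Sporns–Edelman construction, which in full also involves normalization across subset sizes): for a small joint distribution over binary components, compute I(A; B) = H(A) + H(B) − H(X) for each bipartition and average within each scale:

```python
from itertools import combinations
from math import log2

def entropy(p):
    """Shannon entropy (bits) of a distribution given as a dict."""
    return -sum(v * log2(v) for v in p.values() if v > 0)

def marginal(joint, idx):
    """Marginal distribution over the components listed in idx."""
    m = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in idx)
        m[key] = m.get(key, 0.0) + p
    return m

def bipartition_mi(joint, n):
    """Average I(A;B) = H(A) + H(B) - H(X) over bipartitions, by |A|."""
    H_total = entropy(joint)
    scores = {}
    for k in range(1, n // 2 + 1):
        mis = []
        for A in combinations(range(n), k):
            B = tuple(i for i in range(n) if i not in A)
            mis.append(entropy(marginal(joint, A))
                       + entropy(marginal(joint, B)) - H_total)
        scores[k] = sum(mis) / len(mis)
    return scores

# Toy system: 4 perfectly coupled binary units (all-0 or all-1), so
# every bipartition shares the single system-wide bit at every scale.
joint = {(0, 0, 0, 0): 0.5, (1, 1, 1, 1): 0.5}
print(bipartition_mi(joint, 4))   # 1 bit at every scale
```

A fully independent system would instead give zero at every scale; heterogeneous couplings produce the intermediate, nonlinear profiles that the measure associates with high complexity.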

Functional or computational approaches to network complexity are based on the theory of computational complexity as it has developed in theoretical computer science. Here, the emphasis is on Boolean functions and on measuring the size and depth of the circuits required to compute some function.^{56} The depth is the length of the longest path through the circuit from an input gate to an output gate; the size is the number of non-input gates in the circuit. Most of these measures are well understood for directed, acyclic graphs with binary gates. However, networks in social and some biological systems, as discussed in Sec. ???, are often recurrent and, in many cases, connections are probabilistic. As discussed in Sec. II B, if individual identity, rather than action, is the input to a computation, circuit size will be finite. These two features of social problems suggest that complexity measures for stochastic circuits will be required. It is beyond our ability to suggest principled decomposition algorithms for stochastic Boolean circuits, but see DeDeo (this issue).
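Size and depth can be stated concretely for a toy circuit (the gate names and wiring are our invention); the DAG below computes NOT((x0 AND x1) OR x2):

```python
# Toy Boolean circuit as a DAG: gate -> (operation, input wires).
circuit = {
    "g1": ("AND", ["x0", "x1"]),
    "g2": ("OR",  ["g1", "x2"]),
    "g3": ("NOT", ["g2"]),
}

def depth(node):
    """Longest path from a primary input to this node; inputs count 0."""
    if node not in circuit:          # x0, x1, x2 are primary inputs
        return 0
    _, inputs = circuit[node]
    return 1 + max(depth(w) for w in inputs)

size = len(circuit)                  # number of non-input gates
print(size, depth("g3"))             # -> 3 3
```

The recursion assumes an acyclic graph; for the recurrent, probabilistic circuits discussed above, neither quantity is well defined without further assumptions.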

Topological approaches quantify network structure. How topological measures—mean degree, local clustering, assortativity, reach, network centrality, etc. (see Ref. 37 for a review)—relate to complexity is not yet known. However, our results and those of many other studies (e.g., Refs. 45, 46, 44, and 36) suggest that topological features of circuits, beyond size and depth, can be important factors influencing the output of a computation. Comparing topological features of networks that vary in size and connectivity, and which do not have well-defined wiring rules (e.g., exponential, scale-free, small-world, etc.), remains difficult.
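Two of the listed topological measures can be sketched directly (the toy undirected network is our own):

```python
def mean_degree(adj):
    """Average number of neighbors per node (undirected adjacency sets)."""
    return sum(len(v) for v in adj.values()) / len(adj)

def clustering(adj, node):
    """Local clustering: fraction of a node's neighbor pairs that are linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2 * links / (k * (k - 1))

# Toy undirected network: a triangle (0, 1, 2) plus a pendant node 3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(mean_degree(adj), clustering(adj, 2))
```

As the text notes, the open problem is not computing such statistics but relating them, in a principled way, to the complexity of the computation the network performs.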

### C. The space of aggregate properties or distributions

It would be desirable to be able to specify the algorithmic or computational complexity^{34} required to take us from production rules to statistical distributions or some observable group-level property. For our purposes, what might be needed is a procedure for quantifying the number of steps required to converge on an aggregate feature of adaptive value. In our case study, this could be something like the number of steps required, in a system of a given size, to construct a causal network generating a distribution of fight sizes that maximizes individual benefits and minimizes the costs individuals pay from large fights. This is a very hard problem and perhaps an unsolvable one. We nonetheless mention it as a desirable goal. If an algorithmic complexity measure for behavior could be constructed, it would bring us closer to a general theory of social dynamics.

## IV. BROADER EVOLUTIONARY ISSUES

The causes of an aggregate property, e.g., the distribution of fight sizes, cannot be reduced to a simple, single factor such as the distribution of resources over time. To explain a higher-level feature, we need to analyze lower-level interactions. This approach is somewhat neglected in many evolutionary treatments of behavior, which emphasize functional explanations for decision-making patterns. If we aim to be predictive, we need to consider the strategies individuals employ, the rules for decision-making given conflicting strategic recommendations, and the topological features of the corresponding causal networks that arise from the collective implementation of rules. This shift in emphasis from the functional “why” question to a more deeply mechanistic “how” question touches on many important issues.^{33,23} How important is each level? How much stochasticity at any one level can be tolerated before we observe changes to higher-level properties? How degenerate is an aggregate property given the underlying generative rules, or the underlying causal networks and space of strategies? A better understanding of the mechanics through which aggregate properties arise from microscopic processes should allow us to address the “why” questions with greater power. For example, can selection shape topological features of the networks, as well as the strategies used by components? This kind of question is the focus of studies of the evolution of development, which aim to explain the genotype-phenotype map. To understand the origins of social complexity, and structure more generally, we need to ask similar questions of social systems.

## V. CONCLUSION

Deriving measures of complexity for social dynamics could provide a natural basis for comparison across groups and species. An open challenge is to ground measures of complexity in mechanisms of behavior while preserving their generality. We have presented data and analysis of a social system that performs a simple form of social computation. This computation can be decomposed into inputs, algorithms for strategic decision making, causal circuits, and gross statistical outputs. We suggest that each of these stages of computation, or levels of behavioral observation, makes distinct demands on measures of complexity, and it is unlikely that we shall be able to profitably compare the complexity of social systems without attending to variation over these scales. We have shown that for some levels we have precedent measures, such as the use of formal languages and automata, whereas other levels remain wide open. There has been great progress in the mathematical and computational analysis of measures of complexity. There has been more limited progress in applying these measures to data in a way that convinces empiricists of their value. We suggest that some of this resistance could be overcome if the development of complexity measures were more informed by the canonical properties of biological and social systems.

## ACKNOWLEDGMENTS

We acknowledge NSF Grant No. BCS-0904863 for the support during this project. We thank Simon DeDeo, Bryan Daniels, Eric Smith, and Mark Johnson for helpful discussion over the course of this project and Jon Machta for comments on an early draft of this manuscript.

## REFERENCES

*Cooperation and Complexity*, edited by