Detecting causality between variables in a time series is a challenge, particularly when the relationship is nonlinear and the dataset is noisy. Here, we present a novel tool for detecting causality that leverages the properties of symmetry transformations. The aim is to develop an algorithm with the potential to detect both unidirectional and bidirectional coupling for nonlinear systems in the presence of significant sampling noise. Most of the existing tools for detecting causality can make determinations of directionality, but those determinations are relatively fragile in the presence of noise. The novel algorithm developed in the present study is robust and very conservative in that it reliably detects causal structure with a very low rate of error even in the presence of high sampling noise. We demonstrate the performance of our algorithm and compare it with two popular model-free methods, namely transfer entropy and convergent cross map. This first implementation of the method of symmetry transformations is limited in that it applies only to first-order autonomous systems.
Real-world systems are composed of a number of interacting units. Identifying causality refers to detecting how these units are coupled and quantifying the strength of that coupling from time series data. Detecting causality is particularly important in understanding real-world systems, and hence has attracted researchers from different disciplines to develop a number of causality-detecting tools. Here, we present a novel causality-detecting algorithm that utilizes the property of dynamical symmetry. The application of the present version of the algorithm is limited to first-order autonomous systems. The performance of the algorithm is demonstrated with toy models from ecology and a real-world electronic system and is found to be very reliable in correctly identifying the coupling between variables. In other words, it successfully discovers the presence and direction of causal edges. We further discuss the scope for improvement of the present version of the algorithm, which may find application in more general scenarios.
I. INTRODUCTION
Detecting causality between observed time series refers to the identification of the dominant coupling direction and the quantification of coupling strengths between the interacting time series variables. Identifying such causal relationships from observational or experimental data is also often referred to as “causal discovery.”1 Identifying causality is of great importance as it has applications in unraveling various real-world phenomena, ranging from climate networks2 to brain networks.3 The challenge of learning causal relations for such a diversity of systems has attracted researchers from various disciplines, and a variety of methods have been proposed. One early method is Granger causality,4 which relies on the idea of separability. This refers to the removal of information about the causative factors from the effects.5,6 Originally, the application of Granger causality was limited because it assumed linear dependence,3 whereas most real-world systems are non-linear. But recently, Granger causality has been extended to nonlinear systems.7 Another approach for detecting causality in linear systems is by computing the cross-correlation8,9 between time series for a range of time lags, and identifying the maximum in correlation value at some non-zero lag. The leading time series variable is then inferred to be the cause.
To infer causality in non-linear systems, model-free methods (that do not assume a background model), such as transfer entropy,10 convergent cross map,5 recurrence plots,3,11 causation entropy,12 and phase dynamics13,14 are becoming increasingly popular. A tool to detect the direction of coupling in interacting oscillators is developed in Refs. 13 and 14 and works by observing an asymmetric failure of invariance in the time evolution of oscillator phases. Roughly, if oscillator 1 drives oscillator 2, then the change in phase of oscillator 2 over a fixed interval of time will depend upon the phase of oscillator 1 at the outset, but not vice versa. On the other hand, transfer entropy (TE) is based on an information-theoretic concept and employs the idea that causes predict their effects. In particular, it quantifies the information flow between time series variables and identifies the direction of causation via the dominant direction of information flow. Transfer entropy has already found applications in neuroscience,15 finance,16 social media,17 animal behavior,8,18,19 and public policy.20 The main limitation of TE is that the method requires long time series which may not be available in practical situations.
A relatively recent but popular method is the convergent cross map (CCM),5 which is based on a state space reconstruction technique. CCM introduces a new criterion for detecting causality. In this case, the effects cross-estimate the causes contrary to the Granger and TE sense.21 In Ref. 5, CCM is identified as a superior method to both TE and Granger causality in correctly detecting causality.5 CCM has already found application in diverse settings.5,22–25 The major limitation of CCM is that it can only detect low to moderate coupling and fails in the presence of strong coupling and for noisy data.26
Despite their broad applicability, TE and CCM continue to struggle with noisy data or small datasets. We present a first version of an algorithm that, though of limited range of application, is robust for small, noisy datasets. This method detects the direction(s) of causal influence by utilizing the concept of dynamical symmetries.27
The paper is organized as follows. In Sec. II, we describe dynamical symmetries, and how properties of these symmetries can be exploited in detecting causality. In Sec. III, we present the algorithm. In Sec. IV, we explain TE and CCM. In Sec. V, we introduce Lotka-Volterra models as toy target systems and demonstrate the performance of our algorithm in detecting unidirectional and bidirectional causality. We further compare the performance of our algorithm with that of TE and CCM on real world systems. Finally, we present the limitations of our algorithm and the scope of improvement in Sec. VI.
II. Method
The concept of dynamical symmetry was first introduced in Ref. 28 and has proved to be a powerful tool for classifying dynamical systems according to higher-order properties of their causal structure.27 For purposes of time series analysis, we use the following definition of the concept:
Dynamical symmetry: Let $\Omega$ be the set of states of a system of variables, and let $\Lambda_t$ be the state evolution operator that takes the system from some state $\omega \in \Omega$ at time $t_0$ to the corresponding evolved state $\Lambda_t(\omega)$ at $t_0 + t$. The transformation $\sigma : \Omega \to \Omega$ is a dynamical symmetry with respect to time if and only if it has the following property: $\sigma(\Lambda_t(\omega)) = \Lambda_t(\sigma(\omega))$. In other words, the set of dynamical symmetries is the set of transformations of the system that commute with its time evolution.
For deterministic dynamics described by a system of differential equations, dynamical symmetries correspond to automorphisms on the space of solutions, which in the case of dynamical systems are trajectories. Consider two arbitrary trajectories such that at $t_0$, the system is in state $\omega_1$ on trajectory 1 and state $\omega_2$ on trajectory 2. Then the map $\sigma$ that pairs states $\Lambda_t(\omega_1)$ with $\Lambda_t(\omega_2)$ for any $t$ is equivalent to a dynamical symmetry. To see this, consider any paired states $\Lambda_t(\omega_1)$ and $\Lambda_t(\omega_2)$. If one evolves the state $\Lambda_t(\omega_1)$ for time $\Delta t$ and then applies the map, one obtains $\Lambda_{t+\Delta t}(\omega_2)$, exactly as if one had applied the map at $\Lambda_t(\omega_1)$ and then evolved through an interval of $\Delta t$. The map in question is also a function (generally a vector function) since, under the assumption that the system is deterministic, no trajectory can repeat a state (i.e., no trajectories intersect with themselves or others).
The set of dynamical symmetries of a system is determined by its causal structure. Furthermore, dynamical symmetries can be directly estimated from data without first learning a dynamical model.27 For these reasons, dynamical symmetry can be used to probe causal structure.
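The commutation property in the definition above is easy to check numerically. The following sketch uses a toy flow of our own choosing (exponential growth, not a system from the paper), for which every scaling transformation is a dynamical symmetry:

```python
import numpy as np

# Toy example (ours): for dx/dt = r*x the evolution operator is
# Lambda_t(x) = x * exp(r*t), and any scaling sigma(x) = c*x commutes
# with it, so sigma is a dynamical symmetry of this flow.
r, t = 0.7, 1.3

def evolve(x):
    """Time evolution operator Lambda_t for dx/dt = r*x."""
    return x * np.exp(r * t)

def sigma(x, c=2.5):
    """A candidate symmetry: scaling by a constant c."""
    return c * x

x0 = 1.1
lhs = sigma(evolve(x0))  # evolve first, then transform
rhs = evolve(sigma(x0))  # transform first, then evolve
# For a dynamical symmetry the two orders of operation agree.
```

For a transformation that is not a symmetry of this flow (e.g., adding a constant), the two orders would disagree.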
In the present study, we exploit the nature of dynamical symmetries to identify the causal influences in a system composed of two time-dependent variables, $x$ and $y$, that may be coupled in either or both directions. To illustrate our procedure, consider a unidirectionally coupled, autonomous, first-order system of the form:

$$\frac{dx}{dt} = f(x), \qquad \frac{dy}{dt} = g(x, y). \tag{1}$$
A common representational tool in causal discovery is the structural equation model (SEM)29 (Chaps. 1 and 5). If a system such as Eq. (1) has solutions for the given $f$ and $g$, then we can equivalently represent it with a structural equation model.27 This is because one can express a flow of the dynamical system as a function of time and the present and past values of the dynamical variables. In this case, we have

$$x(t + \Delta t) = F(x(t), \Delta t), \qquad y(t + \Delta t) = G(x(t), y(t), \Delta t). \tag{2}$$
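The one-way dependence that licenses this SEM form can be verified numerically: because the flow map for $x$ takes no $y$ argument, changing $y(0)$ cannot alter the $x$ trajectory. The toy $f$ and $g$ below are our own illustrative choices, not a system from the paper:

```python
# Toy unidirectional system (ours): dx/dt = f(x), dy/dt = g(x, y),
# integrated with a simple Euler scheme. Since x's flow map F takes
# no y argument, varying y(0) cannot change the x trajectory.
def simulate(x0, y0, dt=0.01, n=500):
    x, y = x0, y0
    xs, ys = [x], [y]
    for _ in range(n):
        x, y = (x + dt * 0.5 * x * (1 - x),               # f(x): logistic growth
                y + dt * (0.3 * y * (1 - y) + 0.2 * x))   # g(x, y): x drives y
        xs.append(x)
        ys.append(y)
    return xs, ys

xs_a, ys_a = simulate(0.1, 0.2)
xs_b, ys_b = simulate(0.1, 0.8)  # same x(0), different y(0)
x_gap = max(abs(a - b) for a, b in zip(xs_a, xs_b))  # zero: x ignores y(0)
y_gap = max(abs(a - b) for a, b in zip(ys_a, ys_b))  # large: y remembers y(0)
```

This asymmetry under changes of initial condition is exactly the signature the method exploits below.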
Our method depends upon examining the dynamical symmetries of a system. More specifically, we look at differences in their one-variable projections. Given any two trajectories of a deterministic system of two dynamical variables ($x$ and $y$), there is (at least one) dynamical symmetry that maps one into the other. In general, these symmetries can be represented by two-component vector functions of system states. Each component of such a vector function can be viewed as a surface whose distance above or below a point in the phase plane is the value to which that component of the dynamical symmetry maps the point. Sequences of points in the phase plane are thus mapped by a dynamical symmetry to curves on this surface. Our method relies in effect on detecting differences in the shape of these surfaces along different basis directions. We do so by looking at the projection of curves on these surfaces onto the plane corresponding to one or the other dynamical variable. For example, the $y$-component of the vector function representing a symmetry may be written $\sigma_y(x, y)$. Consider a curve in the phase plane given by $(x(t), y(t))$. This is mapped by a dynamical symmetry to $\sigma_y(x(t), y(t))$, a curve on the surface dictated by $\sigma_y$. If we project this curve onto the plane containing the $y$-axis, we obtain a new curve, denoted $\bar{\sigma}_y$. In this case, when parameterized by time, this curve is given by $(y(t), \sigma_y(x(t), y(t)))$. In some cases, the projected curve is in fact a proper function of $y$ and thus represents a dynamical symmetry of the system that involves intervention on only the variable $y$. However, it is often the case that $\bar{\sigma}_y$ is not a function. For this reason, we call these projections pseudo-symmetries.
To derive implicit equations for the pseudo-symmetries involving only $y$ [denoted $\bar{\sigma}_y$] and those involving only $x$ [denoted $\bar{\sigma}_x$] given the above assumptions, consider the series of operations summarized in Tables I and II, respectively. To find $\bar{\sigma}_y$, we start from the initial state $(x_0, y_0)$, then apply the transformation on $y$ followed by time evolution. Each row of the center column of the table indicates the state of the system after each of these operations as dictated by Eq. (2). Next, we start from $(x_0, y_0)$, but change the order of time evolution and the transformation on $y$. The resulting system states are shown in the right-hand column. This procedure yields a condition that must be satisfied for any transformation, $\bar{\sigma}_y$, to qualify as a pseudo-symmetry. From the final row of Table I, we can see that any transformation of $y$ that is a pseudo-symmetry must satisfy

$$G(x_0, \bar{\sigma}_y(y_0), \Delta t) = \bar{\sigma}_y(G(x_0, y_0, \Delta t)). \tag{3}$$
Importantly, this condition entails that any such symmetry is dependent upon both initial conditions, $x_0$ and $y_0$. That is, the set of pseudo-symmetries of $y$ is determined in part by the initial value of $x$.
We perform a similar exercise to obtain the condition that $\bar{\sigma}_x$ must satisfy. From Table II, it is clear that the pseudo-symmetries involving $x$ alone must satisfy

$$F(\bar{\sigma}_x(x_0), \Delta t) = \bar{\sigma}_x(F(x_0, \Delta t)), \tag{4}$$
which is independent of the initial condition for $y$. The reason $\bar{\sigma}_x$ is independent of $y_0$ is that the evolution of $x$ as given in Eq. (2) is not influenced by $y$. A similar argument to the above shows that if the causal influence is bidirectional—if $x$ influences $y$ and vice versa—then $\bar{\sigma}_x$ would depend upon the initial value of $y$ and $\bar{\sigma}_y$ would depend upon the initial value of $x$.
The central idea of the present study is to exploit this pattern of dependence on initial conditions to infer the direction(s) of causal influence. By estimating $\bar{\sigma}_y$ for two different initial conditions of $x$, one can ascertain whether $\bar{\sigma}_y$ depends upon $x_0$. If it does, then $x$ is a cause of $y$, as in the example above. Conversely, if $\bar{\sigma}_x$ is found to depend upon $y_0$, then $y$ is a cause of $x$. And if both one-variable pseudo-symmetries depend upon the initial values of the other variable, then the causal influence is mutual (i.e., $x$ and $y$ influence each other). Note, however, that this decision procedure does not indicate the degree to which one variable influences the other.
This approach depends upon comparing pairs of pseudo-symmetries, $\bar{\sigma}_y$ and $\bar{\sigma}_x$, for the dynamical system of interest. In practice, the information to do so can be gleaned from a single time series of sufficient length, but we assume here that multiple replicates of a dynamical system are used, as depicted in Fig. 1. In the first (System 1), we start by choosing two initial conditions for $y$ ($y_{0,a}$ and $y_{0,b}$) while keeping the initial condition for $x$ fixed at $x_{0,1}$. System 2 is set up similarly, except that $x(0)$ is fixed at $x_{0,2}$. The pseudo-symmetries of interest are the functions that map values of $y$ in one time series to the corresponding (same time) values in the other time series vector of the same system; the superscript indicates the system, and thus the initial value of $x$, for each pair of trajectories. These two functions are denoted $\bar{\sigma}_y^{(1)}$ and $\bar{\sigma}_y^{(2)}$ in Fig. 1(a). Similarly, we set up two additional systems, System 3 and System 4, to produce two time series for the variable $x$, each starting from a different initial condition, while the initial value of $y$ is set to either $y_{0,3}$ (for System 3) or $y_{0,4}$ (for System 4). The relevant pseudo-symmetries of $x$ in this case are the functions denoted $\bar{\sigma}_x^{(3)}$ and $\bar{\sigma}_x^{(4)}$ in Fig. 1(b). Table III presents a complete notation for the respective trajectories.
The schematic shows how dynamical symmetries can be used to identify causal direction. In this example, the variable $x$ is driving $y$. Each pane depicts a pair of trajectories for four different replicates of a dynamical system. In Systems 1 and 2, the initial condition for $x$ is fixed within each system, and $\bar{\sigma}_y^{(1)}$ and $\bar{\sigma}_y^{(2)}$ are the functions that map one $y$ time series vector to the other within System 1 and System 2, respectively. Similarly, for Systems 3 and 4, $\bar{\sigma}_x^{(3)}$ and $\bar{\sigma}_x^{(4)}$ are the functions connecting trajectories of $x$ for which the initial values of $y$ are set to $y_{0,3}$ and $y_{0,4}$, respectively. The functions $\bar{\sigma}_y^{(1)}$ and $\bar{\sigma}_y^{(2)}$ depend on the initial values of $x$, and hence $\bar{\sigma}_y^{(1)} \neq \bar{\sigma}_y^{(2)}$. However, since the variable $x$ is independent of $y$, $\bar{\sigma}_x^{(3)} = \bar{\sigma}_x^{(4)}$.
III. The algorithm
The algorithm is based on a technique that was first introduced in Ref. 27, where the dynamical symmetries of two systems are compared to determine whether the systems share similar dynamics. In case they do not share the same symmetry transformations, they are confidently determined to belong to “different kinds”; otherwise, they are provisionally taken to belong to the “same kind.” In the present study, a similar procedure is adopted to compare pairs of pseudo-symmetries and use the pattern of results to detect causal influences. To detect causality in this way, we proceed in four steps: (i) sampling, (ii) data transformation, (iii) comparison (of $\bar{\sigma}_y^{(1)}$ with $\bar{\sigma}_y^{(2)}$, and $\bar{\sigma}_x^{(3)}$ with $\bar{\sigma}_x^{(4)}$), and (iv) decision.
In step (i), we sample from four systems, where the governing equations and the parameters remain the same, but the initial conditions vary as depicted in Fig. 1. For System 1, two time series are measured using the same initial condition for the variable $x$ ($x_{0,1}$) and different initial conditions for $y$, namely $y_{0,a}$ and $y_{0,b}$. Similarly, for System 2, we set $x(0) = x_{0,2}$ and choose two initial conditions for $y$, $y_{0,a}$ and $y_{0,b}$. The trajectories are obtained by measuring the values of the variables at times $t_1, t_2, \ldots, t_n$ for each of the systems. Similarly, time series are acquired from System 3 and System 4. In this case, the initial value of $y$ is held constant within each system, and measurements are made on each system for two different initial values of $x$. In all cases, measurements are made at the same set of times, $t_1, t_2, \ldots, t_n$. We choose segments such that $x$ and $y$ vary monotonically. This guarantees that the pseudo-symmetries are functions over at least this interval, which allows us to use the comparative tools described below.
In step (ii), we combine pairs of time series acquired in step (i) to generate a sample of each of the functions $\bar{\sigma}_y^{(1)}$, $\bar{\sigma}_y^{(2)}$, $\bar{\sigma}_x^{(3)}$, and $\bar{\sigma}_x^{(4)}$. Specifically, for System 1 we pair the vector of $y$-values for the time series beginning with $y_{0,a}$ with the vector of $y$-values from the time series beginning with $y_{0,b}$. The result is a two-column matrix (represented in Algorithm 1 as input $D_1$) for which the $i$-th row contains elements from each of the two time series at the time $t_i$. The function mapping the values in the first column to those of the second is the pseudo-symmetry $\bar{\sigma}_y^{(1)}$. We similarly pair the time series vectors of $y$ or $x$ variables for Systems 2–4.
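Step (ii) thus reduces to pairing same-time samples. In the sketch below, the helper name `pair_trajectories` and the exponential example are our own; for two exponential trajectories the pseudo-symmetry is exactly a linear scaling, so the paired matrix samples a straight line:

```python
import numpy as np

# Step (ii) sketch (our naming): pair same-time samples from two replicate
# trajectories into a two-column matrix such as D1.
def pair_trajectories(series_a, series_b):
    """Row i holds (value on trajectory a at t_i, value on trajectory b at t_i)."""
    return np.column_stack([series_a, series_b])

# For x(t) = x0 * exp(r*t), the map between two trajectories is the scaling
# v -> (x0_b / x0_a) * v, so the paired data fall on a line through the
# origin with slope 1.2 / 0.5 = 2.4.
t = np.linspace(0.0, 1.0, 50)
xa = 0.5 * np.exp(0.8 * t)
xb = 1.2 * np.exp(0.8 * t)
D1 = pair_trajectories(xa, xb)
slope, intercept = np.polyfit(D1[:, 0], D1[:, 1], 1)
```

In general the pseudo-symmetry is a nonlinear function, which is why the algorithm below fits polynomials of data-selected order rather than straight lines.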
Algorithm 1. Comparison of $\bar{\sigma}_y^{(1)}$ with $\bar{\sigma}_y^{(2)}$ and comparison of $\bar{\sigma}_x^{(3)}$ with $\bar{\sigma}_x^{(4)}$

Input: $D_1$—for System 1, an $n \times 2$ matrix where each row contains one element of the first $y$ time series and the corresponding (same time) element of the second $y$ time series to which $\bar{\sigma}_y^{(1)}$ maps the system state.
$D_2$—for System 2, an $n \times 2$ matrix defined analogously, sampling $\bar{\sigma}_y^{(2)}$.
$D_3$—for System 3, an $n \times 2$ matrix where each row contains one element of the first $x$ time series and the corresponding (same time) element of the second $x$ time series to which $\bar{\sigma}_x^{(3)}$ maps the system state.
$D_4$—for System 4, an $n \times 2$ matrix defined analogously, sampling $\bar{\sigma}_x^{(4)}$.
Output: $b_y$—a boolean value indicating whether $\bar{\sigma}_y^{(1)}$ and $\bar{\sigma}_y^{(2)}$ are identical.
$b_x$—a boolean value indicating whether $\bar{\sigma}_x^{(3)}$ and $\bar{\sigma}_x^{(4)}$ are identical.
1: randomize the rows of $D_1$, $D_2$, $D_3$, and $D_4$
2: divide the rows of $D_1$ into 10 segments
3: divide the rows of $D_2$ into 10 segments
4: divide the rows of $D_3$ into 10 segments
5: divide the rows of $D_4$ into 10 segments
6: for $i = 1$ to 10 do
7: for $j = 1$ to 4 do
8: hold out segment $i$ of $D_j$ as a test set and pool its remaining segments as a training set
9: end for
10: for $k = 1$ to 2 do {pair $k = 1$ is $(D_1, D_2)$; pair $k = 2$ is $(D_3, D_4)$}
11: fit a polynomial (via FitPolynomial) to the training set of the first matrix of pair $k$
12: fit a polynomial to the training set of the second matrix of pair $k$
13: fit a polynomial to the combined training sets of pair $k$ (the joint model)
14: concatenate($E_{\mathrm{sep},k}$, squared errors of the two separate fits with respect to their test sets)
15: concatenate($E_{\mathrm{joint},k}$, squared errors of the joint fit with respect to both test sets)
16: end for
17: end for
18: $\mathrm{MSE}_{\mathrm{sep},k} \leftarrow \mathrm{mean}(E_{\mathrm{sep},k})$ and $\mathrm{MSE}_{\mathrm{joint},k} \leftarrow \mathrm{mean}(E_{\mathrm{joint},k})$ for $k = 1, 2$
19: $\delta_1, \delta_2 \leftarrow$ chance-level gaps $|\mathrm{MSE}_{\mathrm{sep}} - \mathrm{MSE}_{\mathrm{joint}}|$ computed from two subsystems of $D_1$ and of $D_2$, respectively
20: $\delta_3, \delta_4 \leftarrow$ the corresponding chance-level gaps for $D_3$ and $D_4$
21: if $|\mathrm{MSE}_{\mathrm{sep},1} - \mathrm{MSE}_{\mathrm{joint},1}| > \delta_1 + \delta_2$ then
22: $b_y \leftarrow$ false
23: else
24: $b_y \leftarrow$ true
25: end if
26: if $|\mathrm{MSE}_{\mathrm{sep},2} - \mathrm{MSE}_{\mathrm{joint},2}| > \delta_3 + \delta_4$ then
27: $b_x \leftarrow$ false
28: else
29: $b_x \leftarrow$ true
30: end if
31: return $b_y$ and $b_x$
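The core sep-versus-joint comparison might be implemented as below. This is a simplified sketch: the function names, the fixed polynomial order, and the rule combining the chance-level gaps are our assumptions, and the full procedure (including FitPolynomial's data-driven order selection) follows Ref. 27:

```python
import numpy as np

def _fold_mse(D, order, folds=10, seed=0):
    """Average held-out MSE of a polynomial fit of given order over k folds of D."""
    rng = np.random.default_rng(seed)
    D = D[rng.permutation(len(D))]
    chunks = np.array_split(D, folds)
    errs = []
    for i in range(folds):
        test = chunks[i]
        train = np.vstack([c for j, c in enumerate(chunks) if j != i])
        coef = np.polyfit(train[:, 0], train[:, 1], order)
        errs.append(np.mean((np.polyval(coef, test[:, 0]) - test[:, 1]) ** 2))
    return float(np.mean(errs))

def same_pseudo_symmetry(Da, Db, order=3, seed=0):
    """True if Da and Db appear to sample the same function (sep vs joint test)."""
    rng = np.random.default_rng(seed)
    Da = Da[rng.permutation(len(Da))]
    Db = Db[rng.permutation(len(Db))]
    sep = 0.5 * (_fold_mse(Da, order) + _fold_mse(Db, order))
    joint = _fold_mse(np.vstack([Da, Db]), order)
    # Chance-level gap: repeat the sep/joint comparison on two halves of each
    # dataset, where the "two functions" are known to be identical.
    deltas = []
    for D in (Da, Db):
        h = len(D) // 2
        s = 0.5 * (_fold_mse(D[:h], order) + _fold_mse(D[h:], order))
        deltas.append(abs(s - _fold_mse(D, order)))
    return abs(sep - joint) <= sum(deltas)
```

Here `sum(deltas)` plays the role of the threshold $\delta_1 + \delta_2$ in the text: the sep/joint gap must exceed what arises by chance before the two pseudo-symmetries are declared different.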
In step (iii), we determine whether $\bar{\sigma}_y^{(1)} = \bar{\sigma}_y^{(2)}$ and whether $\bar{\sigma}_x^{(3)} = \bar{\sigma}_x^{(4)}$. To detect whether the pseudo-symmetries differ between Systems 1 and 2, we estimate the error of two competing models, sep and joint, using 10-fold cross-validation. In the sep model, data from System 1 and System 2 are treated separately, whereas for the joint model, the data from System 1 and System 2 are combined and treated as a single sample from the same function. In either case, we fit polynomial functions to estimate the pseudo-symmetries. For a given dataset, the order of the polynomial functions to be fit is determined by an additional 10-fold cross-validation (represented by FitPolynomial in Algorithm 1), as was done in Ref. 27. This function starts by considering a polynomial of order 1 and computes the mean-squared error of fitting a first-order polynomial over 10 folds of the data. The function continues fitting the data with higher-order polynomials and compares the mean-squared error with that of the preceding order. If the error increases, the function terminates and returns the order that minimized the mean-squared error. The polynomial coefficients are then evaluated by fitting the entire data set with the order returned by FitPolynomial.
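The order-selection routine just described can be sketched as follows (our implementation of the idea, not the reference code from Ref. 27):

```python
import numpy as np

def fit_polynomial(x, y, folds=10, max_order=10, seed=0):
    """Pick a polynomial order by 10-fold CV, then refit on all the data.

    Starts at order 1 and increases the order until the cross-validated
    MSE stops improving, mirroring the FitPolynomial routine in the text.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    x, y = np.asarray(x)[idx], np.asarray(y)[idx]

    def cv_mse(order):
        errs = []
        for test in np.array_split(np.arange(len(x)), folds):
            train = np.setdiff1d(np.arange(len(x)), test)
            c = np.polyfit(x[train], y[train], order)
            errs.append(np.mean((np.polyval(c, x[test]) - y[test]) ** 2))
        return float(np.mean(errs))

    best, best_err = 1, cv_mse(1)
    for order in range(2, max_order + 1):
        err = cv_mse(order)
        if err > best_err:   # error went up: stop and keep the best order
            break
        best, best_err = order, err
    return np.polyfit(x, y, best), best
```

The early-stopping rule keeps the fitted order conservative, which matters because an overfit polynomial would inflate the sep-versus-joint error gap.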
It is important to note that mean squared errors computed for the sep and joint models cannot be directly compared, because the error depends on how the data are partitioned and on the amount of observation noise present in the system. Following the procedure in Ref. 27, we quantify this background contribution to the mean squared errors of the sep and joint models by dividing each of the systems into two subsystems and using the same 10-fold cross-validation routine on each pair to identify the difference of the mean squared errors between sep and joint models that one should expect purely by chance. For example, System 1 is divided into two subsystems, and mean squared errors are computed for joint and separate models fitted to these subsystems; the absolute difference in these errors is represented by $\delta_1$. Similarly, $\delta_2$ refers to the absolute difference in mean squared errors between sep and joint models built from System 2. In Algorithm 1, the hypothesis that $\bar{\sigma}_y^{(1)}$ and $\bar{\sigma}_y^{(2)}$ have the same functional form is rejected if the condition $|\mathrm{MSE}_{\mathrm{sep}} - \mathrm{MSE}_{\mathrm{joint}}| > \delta_1 + \delta_2$ holds, and hence we infer $\bar{\sigma}_y^{(1)} \neq \bar{\sigma}_y^{(2)}$. The same calculations are done with the data from Systems 3 and 4 to decide whether $\bar{\sigma}_x^{(3)} = \bar{\sigma}_x^{(4)}$.
In the final step (iv), we reach a decision regarding causal influence. Specifically, if we detect $\bar{\sigma}_y^{(1)} \neq \bar{\sigma}_y^{(2)}$ and $\bar{\sigma}_x^{(3)} = \bar{\sigma}_x^{(4)}$, we identify $x$ as the cause of $y$. Conversely, if we detect $\bar{\sigma}_y^{(1)} = \bar{\sigma}_y^{(2)}$ and $\bar{\sigma}_x^{(3)} \neq \bar{\sigma}_x^{(4)}$, we identify $y$ as the cause of $x$. If both pairs are unequal, this suggests mutual causal influence, and if both pairs are identical, then $x$ does not appear to influence $y$ or vice versa.
Note that the algorithm relies on randomizing the rows of the input matrices $D_1, \ldots, D_4$, and hence the decision may vary across runs on the same data. Thus, for a given time series, we need to run Algorithms 1 and 2 multiple times and select the majority decision.
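The majority vote over repeated runs can be written directly (the verdict label strings here are our naming, not the paper's):

```python
from collections import Counter

# Majority decision across repeated runs of Algorithms 1 and 2.
def majority_decision(runs):
    """Return the most frequent verdict among the per-run decisions."""
    (label, _count), = Counter(runs).most_common(1)
    return label

verdict = majority_decision(["x->y", "x->y", "x<->y", "x->y", "x->y"])
```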
Algorithm 2. Decision

Input: $b_y$ and $b_x$ from Algorithm 1.
Output: Value "$x \to y$" if $x$ causes $y$, value "$y \to x$" if $y$ causes $x$, value "$x \leftrightarrow y$" if the coupling is bidirectional, and value "none" indicating no causal relationship detected.
1: if $b_y$ is false and $b_x$ is true then
2: return $x \to y$
3: else if $b_y$ is true and $b_x$ is false then
4: return $y \to x$
5: else if $b_y$ is false and $b_x$ is false then
6: return $x \leftrightarrow y$
7: else
8: return none
9: end if
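The decision table of Algorithm 2 amounts to a four-way branch on the two booleans. In this sketch the verdict strings are our naming:

```python
# Algorithm 2 as a function of Algorithm 1's outputs.
# b_y: are the two y pseudo-symmetries identical?
# b_x: are the two x pseudo-symmetries identical?
def decide(b_y, b_x):
    if not b_y and b_x:
        return "x -> y"    # sigma_y pair differs, sigma_x pair agrees
    if b_y and not b_x:
        return "y -> x"    # sigma_x pair differs, sigma_y pair agrees
    if not b_y and not b_x:
        return "x <-> y"   # both pairs differ: bidirectional coupling
    return "none"          # both pairs agree: no causal relation detected
```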
IV. Other causality detecting tools: Transfer Entropy (TE) and Convergent Cross Map (CCM)
In our present study, we choose two model-free methods, namely TE and CCM, to provide a benchmark for the performance of our algorithm in detecting causality. In drawing comparisons, it is important to note that the method introduced here requires replicate time series starting from four different pairs of initial conditions, whereas both TE and CCM require only a single time series, with no prerequisite on the number of initial conditions.
A. Transfer entropy
The idea of TE was first formalized in Ref. 10 to measure information flow between two time series variables. TE uses the concept of Shannon entropy. Shannon entropy quantifies the expected value of the information associated with the occurrence of an event and is defined as follows:

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x),$$
where $p(x)$ is the probability mass function for a time series variable $X$ taking the value $x$, and $\mathcal{X}$ refers to the set containing all possible realizations of $X$. Now, given two time series variables $X$ and $Y$, the TE from $Y$ to $X$ ($T_{Y \to X}$) quantifies information transfer from $Y$ to $X$ by measuring the reduction in entropy of $X$ when conditioned on $Y$. Hence, transfer entropy from $Y$ to $X$ is defined as

$$T_{Y \to X} = \sum p(x_{t+1}, x_t, y_t) \log_2 \frac{p(x_{t+1} \mid x_t, y_t)}{p(x_{t+1} \mid x_t)}, \tag{5}$$
where $p(x_{t+1} \mid x_t)$ and $p(x_{t+1} \mid x_t, y_t)$ denote the probability of $x_{t+1}$ conditioned on $x_t$ alone, and on both $x_t$ and $y_t$, respectively; $p(x_{t+1}, x_t, y_t)$ denotes the joint probability. If $y_t$ does not provide any additional information about $x_{t+1}$, then $p(x_{t+1} \mid x_t, y_t) = p(x_{t+1} \mid x_t)$, and hence $T_{Y \to X}$ equals zero. In general, transfer entropy is an asymmetric quantity, and comparing $T_{Y \to X}$ with $T_{X \to Y}$ detects the dominant direction of information flow, thus identifying the direction of causation.
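For intuition, the defining formula can be evaluated with a simple plug-in (histogram) estimator. Note that this is only an illustration of Eq. (5): the study itself uses the KSG nearest-neighbor estimator via JIDT, not a histogram:

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y, bins=4):
    """Plug-in estimate of T_{Y->X} in bits for coarsely discretized series."""
    def discretize(v):
        edges = np.linspace(np.min(v), np.max(v), bins + 1)[1:-1]
        return np.digitize(v, edges)
    xb, yb = discretize(x), discretize(y)
    n = len(xb) - 1
    c_xxy = Counter(zip(xb[1:], xb[:-1], yb[:-1]))  # (x_{t+1}, x_t, y_t)
    c_xx = Counter(zip(xb[1:], xb[:-1]))            # (x_{t+1}, x_t)
    c_xy = Counter(zip(xb[:-1], yb[:-1]))           # (x_t, y_t)
    c_x = Counter(xb[:-1])                          # x_t
    te = 0.0
    for (x1, x0, y0), cnt in c_xxy.items():
        # p * log2[ p(x1|x0,y0) / p(x1|x0) ] expressed via raw counts
        te += (cnt / n) * np.log2(cnt * c_x[x0] / (c_xx[(x1, x0)] * c_xy[(x0, y0)]))
    return te

# Synthetic check (ours): x is a lagged, noisy copy of y, so information
# flows from y to x and T_{Y->X} should dominate T_{X->Y}.
rng = np.random.default_rng(0)
y = rng.standard_normal(3000)
x = np.concatenate([[0.0], y[:-1]]) + 0.1 * rng.standard_normal(3000)
```

Histogram estimators of this kind are known to be biased for short series, which is one reason the study uses the KSG estimator instead.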
We implement TE using the open-source JIDT (Java Information Dynamics Toolkit),30 which implements various information-theoretic measures. We used the Kraskov, Stögbauer, and Grassberger (KSG) method to calculate the probability mass functions (PMFs) used in the above definitions because it uses a dynamically altered kernel width in terms of nearest neighbors ($K$), which has been shown to decrease errors in PMF estimation.31
B. Convergent cross map
CCM, first introduced in Ref. 5, is a technique for determining causality between sets of time series using a neighborhood method. It relies on the idea that if time series $X$ causally influences $Y$, then signatures (or imprints) of $X$ inherently exist in $Y$, and hence historical records of $Y$ can be used to estimate the current state of $X$. The estimated state of $X$ is denoted by $\hat{X} \mid M_Y$, where $M_Y$ refers to a shadow version of the original manifold reconstructed from the projection of time series $Y$. Finally, the accuracy of that estimation is evaluated using the correlation coefficient, calculated between the original time series and their cross-map estimates. If the correlation coefficient between the true values ($X$) and the estimated values ($\hat{X} \mid M_Y$) is high, then it is concluded that there is “enough” information stored in $Y$ that came from $X$ to indicate that $X$ is among the causes of $Y$. However, if $X$ does not influence $Y$, then the estimate of $Y$ from $M_X$, denoted by $\hat{Y} \mid M_X$, will be inferior. A more detailed description of the algorithm can be found in Ref. 32, and in the supplementary materials of Ref. 5.
The CCM estimates are unitless correlation coefficients and thus range between $-1$ and $1$. The cross-map estimation is evaluated at different library sizes, where library size refers to the length of the time series used. The cross-map estimate converges to a constant value with increasing library size, and the comparison of the relative skill of the cross-map estimates indicates the direction of causation. As real-world datasets possess observational and process noise and may lack sufficient data, the asymptotic value of the cross-map estimate may not converge to a value of one.5 The parameters associated with CCM are the lag, which is used for the construction of the shadow manifold, and the optimal embedding dimension. In the present study, we implement CCM using the rEDM package33 in R, where the lag was set to its default value, and the optimal embedding dimension was evaluated following the simplex projection method.34
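A stripped-down cross-map skill computation can illustrate the criterion. This is our simplified sketch, not rEDM's implementation (which additionally handles library subsampling and embedding selection); the coupled logistic maps are the standard example from Ref. 5:

```python
import numpy as np

def ccm_skill(source, target, E=2, tau=1):
    """Cross-estimate `source` from the shadow manifold of `target`.

    High skill indicates that `source` causally influences `target`
    (Sugihara's criterion: effects encode their causes).
    """
    n = len(target) - (E - 1) * tau
    # Shadow manifold of the target: rows are delay vectors.
    M = np.column_stack([target[i * tau : i * tau + n] for i in range(E)])
    truth = source[(E - 1) * tau :]
    est = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(M - M[i], axis=1)
        d[i] = np.inf                   # exclude the point itself
        nn = np.argsort(d)[: E + 1]     # E+1 nearest neighbors
        w = np.exp(-d[nn] / max(d[nn[0]], 1e-12))
        w /= w.sum()
        est[i] = w @ truth[nn]          # weighted same-time source values
    return np.corrcoef(truth, est)[0, 1]

# Unidirectionally coupled logistic maps: x drives y, y does not drive x.
N = 1000
x = np.empty(N); y = np.empty(N)
x[0], y[0] = 0.4, 0.2
for t in range(N - 1):
    x[t + 1] = x[t] * (3.8 - 3.8 * x[t])
    y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])
skill_x_to_y = ccm_skill(x, y)  # estimate x from M_y: should be high
skill_y_to_x = ccm_skill(y, x)  # estimate y from M_x: should be low
```

The asymmetry in skill, rather than either value alone, is what identifies the coupling direction.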
V. Results
A. Detecting unidirectional coupling in competitive Lotka-Volterra model
To evaluate the performance of our algorithm, we chose to implement a two-species competitive Lotka-Volterra model. Nonlinear models of this type are used to describe the population dynamics of two species competing for resources. The governing equations are given as follows:

$$\frac{dx}{dt} = r_1 x \left(1 - \frac{x + \alpha_{12} y}{K_1}\right), \qquad \frac{dy}{dt} = r_2 y \left(1 - \frac{y + \alpha_{21} x}{K_2}\right), \tag{6}$$
where $x$ and $y$ denote the population sizes, $r_1$ and $r_2$ the growth rates, and $K_1$ and $K_2$ the carrying capacities of species 1 and species 2, respectively. The parameter $\alpha_{12}$ represents the influence of species 2 on species 1, and similarly, $\alpha_{21}$ represents the influence of species 1 on species 2.
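Equation (6) can be integrated directly to produce the kind of time series used below; the parameter values in this sketch are illustrative stand-ins, not the paper's settings:

```python
import numpy as np

# Competitive Lotka-Volterra, Eq. (6), with alpha_12 = 0 so that species 1
# (x) influences species 2 (y) but not vice versa. Parameter values are
# illustrative. Fixed-step RK4 integration.
def lv_rhs(s, r1=1.0, r2=1.0, K1=1.0, K2=1.0, a12=0.0, a21=0.5):
    x, y = s
    return np.array([r1 * x * (1 - (x + a12 * y) / K1),
                     r2 * y * (1 - (y + a21 * x) / K2)])

def simulate_lv(x0, y0, dt=0.01, n=2000):
    traj = np.empty((n + 1, 2))
    traj[0] = (x0, y0)
    for k in range(n):
        s = traj[k]
        k1 = lv_rhs(s)
        k2 = lv_rhs(s + 0.5 * dt * k1)
        k3 = lv_rhs(s + 0.5 * dt * k2)
        k4 = lv_rhs(s + dt * k3)
        traj[k + 1] = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

traj = simulate_lv(0.1, 0.1)
# With a12 = 0, x relaxes to K1 = 1 and y to K2 - a21 * K1 = 0.5.
```

Adding Gaussian observation noise to trajectories such as `traj` reproduces the sampling setup used in the tests below.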
For the first set of tests, we set $\alpha_{12} = 0$ and vary $\alpha_{21}$ over a range of positive values. By setting $\alpha_{12} = 0$, we ensure that $y$ has no influence on $x$, while increasing values of $\alpha_{21}$ correspond to an increasing strength of influence from $x$ to $y$. Next, we implement our algorithm on the time series data generated by solving Eq. (6) using ode45 in Matlab. To implement our algorithm, we need four pairs of initial conditions for $x$ and $y$, chosen in accordance with the scheme of Fig. 1. We generate time series data of fixed length and further add Gaussian observation noise with mean $0$ and a range of standard deviations.
We choose different coupling strengths and standard deviations of observation noise to verify the robustness of our algorithm in correctly detecting $x$ as the cause of $y$. Figure 2(a) demonstrates the results, where we vary the coupling strength ($\alpha_{21}$) and, for each choice of $\alpha_{21}$, we choose six different amounts of observation noise. Since our algorithm relies on shuffling the data while evaluating $b_y$ and $b_x$, the results obtained in the decision phase are not deterministic. Hence, we run the algorithm ten times on the same data and count the number of times $y$ is incorrectly detected as the cause of $x$, the number of times $x$ is correctly detected as the cause of $y$, and the number of times the algorithm indicates bidirectional causality. We plot the results as percentages in Fig. 2(a). Though some runs on a given data set incorrectly indicate bidirectional causation, we note that the majority decision is accurate for every combination of parameter values tested. Furthermore, in no case does it miss the influence of $x$ on $y$.
(a) Results demonstrating the performance of our algorithm in detecting unidirectional causality. We set $\alpha_{12} = 0$ and varied the coupling from $x$ to $y$ by varying $\alpha_{21}$. For each value of $\alpha_{21}$, the six bars correspond to increasing Gaussian measurement noise with mean $0$ and increasing standard deviation. For each choice of $\alpha_{21}$ and each choice of standard deviation, we run our algorithm ten times on a single sample and count the number of times $x$ is correctly detected as the sole cause (colored in yellow), the number of times $y$ is falsely detected as the sole cause (colored in green), the number of times the algorithm does not detect a causal relation (colored in light blue), and the number of times the algorithm detects bidirectional causation (colored in deep blue). The results are plotted as percentages. Interestingly, the algorithm never misses the influence of $x$ on $y$, but at times additionally detects influence of $y$ on $x$, inferring bidirectional coupling. (b) Results demonstrating the performance of TE in detecting $x$ as the cause of $y$. (c) Results demonstrating the performance of CCM in detecting $x$ as the cause of $y$. Results show that at times both TE and CCM fail to identify $x$ as the sole cause.
Next, to provide a benchmark for this performance, we implement TE and CCM. Since TE and CCM do not require four different choices of initial conditions, we reuse the data set generated above with initial conditions . We evaluate TE for values ranging from to , where is the parameter used in estimating the PMF. However, the resolution width used in estimating TE should be neither too large nor too small,31 so we fix to . We find in Fig. 2(b) that TE often falsely identifies as the sole cause, particularly as the coupling strength and measurement noise increase. Next, we implement CCM, which relies on the asymptotic value of convergence at the maximum library size; Fig. 2(c) presents the causal direction it identifies. The results demonstrate that CCM also performs poorly, as it too falsely identifies as a cause of for a number of values of and noise.
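To make the TE benchmark concrete, the following is a minimal plug-in (histogram) transfer entropy estimator of the kind described above, where the number of bins plays the role of the resolution parameter used in estimating the PMF. This is an illustrative sketch, not the implementation used in the study; the function name and bin count are our own choices.

```python
import numpy as np

def transfer_entropy(x, y, n_bins=8):
    """Plug-in estimate of TE from x to y using binned probabilities.

    TE_{X->Y} = sum_{a,b,c} p(y_{t+1}=a, y_t=b, x_t=c)
                * log2[ p(a,b,c) p(b) / (p(b,c) p(a,b)) ]
    which is the conditional mutual information I(y_{t+1}; x_t | y_t).
    """
    def discretize(s):
        # Equal-width bins over the observed range; clip so the maximum
        # falls in the last bin rather than an overflow bin.
        edges = np.linspace(s.min(), s.max(), n_bins + 1)
        return np.clip(np.digitize(s, edges) - 1, 0, n_bins - 1)

    xd, yd = discretize(np.asarray(x)), discretize(np.asarray(y))
    y_next, y_now, x_now = yd[1:], yd[:-1], xd[:-1]

    # Joint counts over (y_{t+1}, y_t, x_t), normalized to probabilities.
    joint = np.zeros((n_bins, n_bins, n_bins))
    for a, b, c in zip(y_next, y_now, x_now):
        joint[a, b, c] += 1
    p_abc = joint / joint.sum()

    p_bc = p_abc.sum(axis=0)       # p(y_t, x_t)
    p_ab = p_abc.sum(axis=2)       # p(y_{t+1}, y_t)
    p_b = p_abc.sum(axis=(0, 2))   # p(y_t)

    te = 0.0
    for a in range(n_bins):
        for b in range(n_bins):
            for c in range(n_bins):
                if p_abc[a, b, c] > 0 and p_bc[b, c] > 0 and p_ab[a, b] > 0:
                    te += p_abc[a, b, c] * np.log2(
                        p_abc[a, b, c] * p_b[b] / (p_bc[b, c] * p_ab[a, b]))
    return te
```

On a series where x drives y, one expects `transfer_entropy(x, y)` to exceed `transfer_entropy(y, x)`, with the caveat (as the results above show) that finite samples and noise can blur this asymmetry.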
1. Bidirectional causality with equal coupling strength
So far, we have only considered the presence of unidirectional coupling. Next, we investigate bidirectional coupling, beginning with the special case where the coupling strength in either direction is the same. Again we model a competitive Lotka-Volterra system and keep all parameter values the same, but set . We vary the coupling strength from to , and for each choice of coupling strength we vary the standard deviation of the Gaussian noise from to . Next, we implement our algorithm on the time series data generated by solving Eq. (6) using ode45 in Matlab. We keep the choice of initial conditions fixed as in the earlier example. Further, we compare the results of our algorithm with those of TE and CCM.
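A data-generation step of this kind can be sketched as follows, here in Python with SciPy's `solve_ivp` standing in for the paper's Matlab ode45 runs. The parameter values and function names are illustrative, not those used in the study; setting one coupling constant to zero recovers the unidirectional case.

```python
import numpy as np
from scipy.integrate import solve_ivp

def simulate_clv(r=(1.0, 1.0), a=(1.0, 1.0), c_xy=0.5, c_yx=0.5,
                 z0=(0.5, 0.5), t_max=20.0, n_samples=200,
                 noise_sd=0.0, seed=0):
    """Two-species competitive Lotka-Volterra system, noisily sampled.

    dx/dt = x (r1 - a1 x - c_xy y)
    dy/dt = y (r2 - a2 y - c_yx x)

    c_xy = 0 removes y's influence on x (unidirectional, x -> y only);
    Gaussian measurement noise is added to the sampled trajectories.
    """
    def rhs(t, z):
        x, y = z
        return [x * (r[0] - a[0] * x - c_xy * y),
                y * (r[1] - a[1] * y - c_yx * x)]

    t_eval = np.linspace(0.0, t_max, n_samples)
    sol = solve_ivp(rhs, (0.0, t_max), list(z0), t_eval=t_eval, rtol=1e-8)
    rng = np.random.default_rng(seed)
    noisy = sol.y + rng.normal(0.0, noise_sd, size=sol.y.shape)
    return t_eval, noisy[0], noisy[1]
```

Repeating the call with different `z0` values yields the four initial-condition pairs the algorithm requires.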
The performance of all three algorithms is presented in Fig. 3(a), where we observe that our algorithm reliably detects the bidirectional coupling for all choices of coupling strength and increasing measurement noise. In this case, the output of all 10 runs was uniform for every combination of parameter values. The performance of TE and CCM in detecting the bidirectional coupling is presented in Figs. 3(b) and 3(c), respectively. This case is clearly more challenging for TE and CCM. TE correctly identifies bidirectional causation in at least a minority of cases, but CCM never does. Note, however, that these methods output continuous measures (information exchange in the case of TE, degree of correlation in the case of CCM) for which there is no universal decision threshold. It is likely that adjusting these thresholds would improve performance, but any such threshold would have to be justifiable in the absence of information about the system or the nature of the sampling noise. It is an advantage of the method presented here that a natural threshold is built into the binary decision process, requiring no foreknowledge of the system.
(a) Results demonstrating the performance of our algorithm in detecting bidirectional causality. We varied from to and set . For each value of the six bars correspond to increasing Gaussian measurement noise with mean and standard deviation ranging from to in increments of . For each choice of coupling strength and each choice of standard deviation, we run our algorithm ten times on the same sample and count the number of times is detected as the sole cause (colored in yellow), the number of times is detected as the sole cause (colored in green), the number of times the algorithm does not detect a causal relation (colored in light blue), and the number of times bidirectional coupling is detected (colored in deep blue). The results are plotted as percentages. The algorithm uniformly detects bidirectional causality over the entire range of coupling strength and noise. (b) Results demonstrating the performance of TE in detecting bidirectional causality. (c) Results demonstrating the performance of CCM in detecting bidirectional causality. Results show that TE misses one or the other direction of causal influence, while CCM never detects bidirectional causality.
For reference, we provide sample trajectories for both the unidirectional and bidirectional cases in Fig. 4.
Sample trajectories of the Lotka-Volterra model used in the analysis. (a) Unidirectional case, with and a Gaussian noise standard deviation of . (b) Bidirectional case, with and a Gaussian noise standard deviation of .
2. Bidirectional causality with unequal coupling strength
Finally, we investigate the general case of bidirectional causality, in which both variables influence each other but not necessarily with equal coupling strength. We again choose the competitive Lotka-Volterra equations, fix the coupling strength from to by setting , and set the standard deviation of the Gaussian noise to . Next, we vary the coupling strength from to over values to , where small values of correspond to weak coupling. To measure performance, we run our algorithm 10 times for each choice of (again, on the identical data set) and count the number of times and are each identified as a causal driver. Bidirectional coupling is inferred when the algorithm detects both and as causal drivers. We present the results as percentages in Table IV. We notice that the algorithm detects the presence of bidirectional coupling for all choices of except when . In other words, when the value of is relatively small in comparison to , the algorithm correctly infers as a cause of but misses the relatively weak influence of on .
B. Accuracy
So far, we have found that the proposed algorithm performs better than both TE and CCM on a single noisy dataset. As noted above, the algorithm is nondeterministic for a given dataset; the output may depend upon the shuffling that takes place in cross-validation. As suggested above, this can be mitigated when using the algorithm to decide causal structure by running it multiple times on a given dataset and then taking a consensus of the results. To characterize the accuracy of such a procedure, we accept the majority output as definitive and test against a sample of datasets drawn from the same Lotka-Volterra system. We repeat these tests for a range of systems and sampling noises. Specifically, we vary the standard deviation of the Gaussian sampling noise from to . For a fixed value of noise, we vary the coupling strength from to . We set to zero for the unidirectional case and vary both and for the bidirectional case, where is set equal to . For each set of system and noise parameters, we conduct ten trials, generating a distinct time series for each. Within each trial, we run our algorithm ten times and identify the causal direction by majority vote. We define the accuracy as the ratio of the number of times the algorithm correctly identifies all and only the correct causal edges to the total number of trials, which is ten. We present the accuracy results in Fig. 5. We find that the accuracy decays as the noise increases; however, the effect of noise on performance diminishes when the coupling strength is also increased. In other words, for relatively weak coupling the noise dominates the performance, but performance improves if the coupling strength is increased proportionally. This pattern is prominent in the bidirectional case [Fig. 5(b)] and weakly present in the unidirectional case [Fig. 5(a)].
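The majority-vote consensus and the resulting accuracy measure can be sketched in a few lines. The verdict labels below are illustrative placeholders, not the algorithm's actual output format; ties in the vote are broken arbitrarily by `Counter`.

```python
from collections import Counter

def majority_decision(outputs):
    """Consensus causal verdict from repeated runs of a stochastic detector.

    `outputs` is a list of per-run verdicts such as "x->y", "y->x",
    "bidirectional", or "none"; the most frequent verdict wins
    (ties broken arbitrarily).
    """
    verdict, _ = Counter(outputs).most_common(1)[0]
    return verdict

def accuracy(trial_outputs, truth):
    """Fraction of trials whose majority verdict matches the ground truth.

    `trial_outputs` is a list of lists: one list of run verdicts per trial,
    matching the ten-runs-per-trial protocol described above.
    """
    hits = sum(majority_decision(runs) == truth for runs in trial_outputs)
    return hits / len(trial_outputs)
```

For example, a trial whose ten runs split 6-4 in favor of "bidirectional" counts as a bidirectional verdict, and accuracy is the fraction of such trials whose verdict matches the true causal structure.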
The performance of the algorithm measured in terms of accuracy (ranging from to , as indicated by color) for both the (a) unidirectional and (b) bidirectional cases.
C. Detecting bidirectional coupling in a Dixon system
In this section, we evaluate the performance of our algorithm by implementing it on a system that demonstrates chaos-like behavior. According to the Poincaré-Bendixson theorem, the attractor of any smooth, bounded, two-dimensional continuous-time autonomous system is either a stable equilibrium or a limit cycle.36 Since our algorithm is limited to first-order two-dimensional continuous-time autonomous systems, we cannot apply it to most of the well-known chaotic systems. However, there exists a system, derived by Dixon et al., that does not satisfy the smoothness condition, since it has a singularity at the origin. The governing equations are given as follows:
We choose the parameters and , for which the system displays “chaotic” behavior, more precisely referred to as “S-chaos” (singularity-chaos).36,37 Next, we implement our algorithm on the time series data generated by solving Eq. (7) using ode45 in Matlab. We choose four pairs of initial conditions for and as follows: and . We further add Gaussian observation noise with mean and standard deviation . We then run our algorithm ten times and note the dominant causal direction it detects. Interestingly, the algorithm detected the correct causal direction in each run, indicating bidirectional coupling.
D. Real-world system
To evaluate the performance of our algorithm on a real-world system, we built a nonlinear circuit in which an LC oscillator is used to drive an inverting amplifier. The detailed circuit diagram is provided in Fig. 6(a). We treated the output of the LC oscillator and the output of the inverting amplifier as our two dynamical variables. The signals recorded from these two channels are referred to as and in Fig. 6(a). The input voltage VCC was used to intervene in the system in order to attain the desired initial conditions. We chose four pairs of initial conditions for and as follows: and . We recorded a sample of size at an interval of s. In this experimental setup, drives . The signals are plotted in Fig. 6(b) for reference. Next, we implemented our algorithm, ran it ten times, and noted the dominant causal direction it identified. Interestingly, the algorithm detected the correct causal direction in each run, identifying as the cause. We further implemented TE and CCM to compare their performance with that of our tool. TE correctly detected the coupling direction from , whereas CCM incorrectly inferred bidirectional coupling.
(a) Circuit diagram used for generating the real-world data. An LC op-amp oscillator is used to drive the inverting amplifier. (b) The two signals, and , recorded simultaneously at four different initial conditions.
VI. Discussion and future scope
In the present work, we use the concept of pseudo-symmetries to detect unidirectional and bidirectional causality. We compare the performance of our algorithm to that of TE and CCM when applied to noisily sampled two-species competitive Lotka-Volterra systems. In this context, both TE and CCM can in principle detect the causal influences. However, our tests demonstrate that both tools often fail in the presence of significant noise, particularly with bidirectional coupling. The proposed algorithm based on comparisons of pseudo-symmetries, on the other hand, is accurate across all of the conditions examined. To demonstrate the application of the present approach to real-world systems, we constructed a circuit in which one voltage drives another; the algorithm identified the correct causal direction. The symmetry approach is therefore a viable alternative to existing tools when dealing with noisy real-world data that may exhibit any combination of causal influences between dynamical variables.
The present approach relies upon observing behavior for four pairs of distinct initial conditions. The simplest way to achieve this is by intervention on the system. But, given the possibility of intervention, an alternative approach to identifying causal direction suggests itself, namely a direct comparison of trajectories. For example, comparing with and with as in Figs. 1(a) and 1(b) would enable one to detect that drives . The method we use to compare portions of pseudo-symmetries (which can be represented by polynomials) could be used to compare trajectories directly. Differences in, e.g., for changes in the initial value of would then be indicative of causal direction (in this case, from to ). However, that method fails when we do not have exact control over initial conditions. To test this assertion, we compared the present method with such a trajectory-comparison method by adding Gaussian noise to the initial conditions and then letting a Lotka-Volterra system evolve in time. Under these conditions, the symmetry method performs better at detecting the causal direction than a direct comparison of trajectories. Further, we considered the chaotic Dixon system to verify performance under noisy initial conditions; the symmetry method detected the correct causal direction. Although intervention is simpler, it is often not necessary. For a sufficiently long time series, a purely observational dataset can often be segmented in such a way that, when each segment is treated as a separate time series, the requisite set of initial conditions is at least approximately obtained. This is sufficient for applying the algorithm, albeit at some cost to accuracy.
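The segmentation idea can be sketched as follows: walk a long observational record in fixed-length windows and keep only those windows whose starting states are mutually distinct, so each segment can stand in for a separately intervened initial condition. The selection rule and threshold below are our own illustrative choices, not the procedure used in the study.

```python
import numpy as np

def segment_series(x, y, n_segments=4, min_separation=0.1):
    """Split one long observational record into non-overlapping sub-series
    whose starting states differ by at least `min_separation` (Euclidean),
    approximating distinct initial conditions without intervention.
    """
    x, y = np.asarray(x), np.asarray(y)
    seg_len = len(x) // n_segments
    segments, starts = [], []
    # Candidate windows begin at multiples of the segment length.
    for i in range(0, len(x) - seg_len + 1, seg_len):
        p = np.array([x[i], y[i]])
        # Keep the window only if its start is far from all kept starts.
        if all(np.linalg.norm(p - s) >= min_separation for s in starts):
            starts.append(p)
            segments.append((x[i:i + seg_len], y[i:i + seg_len]))
    return segments
```

A record that explores the state space (e.g., a slow transient toward equilibrium) yields the requisite distinct starting points; a record stuck near a fixed point would not, which is one source of the accuracy cost noted above.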
This first version of the algorithm is, however, limited to first-order autonomous systems. It is also applicable only to systems having two time-dependent variables. The most straightforward extension of this method (or of the direct comparison of trajectories mentioned above) to systems with more variables would require a geometric increase in the number of initial conditions as a function of the number of variables. We see two approaches to generalizing our method that avoid this problem. On the one hand, it is possible to assess directional gradients of the surfaces that correspond to components of a dynamical symmetry in ways that do not involve comparing pseudo-symmetries. If this can be done accurately from typical time series data, then in principle we would not be limited by the large number of initial conditions demanded by a straightforward generalization of the current approach. On the other hand, it is possible to preserve causal information while reducing many-variable systems to approximate two-variable systems.38 Both approaches are left for future work.
Finally, although our algorithm reliably detects bidirectional influence, it yields only a binary decision; it does not indicate the relative degree of causal influence in either direction. Such information can, however, be gleaned after a reliable detection is made by applying one or another continuous measure of the degree of causal influence.
Despite its limits, the algorithm presented here demonstrates the potential of symmetry methods for the reliable detection of causality when data are noisy, causation may be bidirectional, and coupling strengths vary. This work opens a wide scope for improvement, which is left for future studies.
Acknowledgments
This work was supported by the National Science Foundation under Grant No. 1454190. We would like to thank Naveed Jantzen for his assistance in collecting data while conducting the experiment.