We propose a mutual information statistic to quantify the information encoded by a partition of the state space of a dynamical system. We measure the mutual information between each point’s symbolic trajectory history under a coarse partition (one with few unique symbols) and its partition assignment under a fine partition (one with many unique symbols). When applied to a set of test cases, this statistic demonstrates predictable and consistent behavior. Empirical results and the statistic’s formulation suggest that partitions based on trajectory history, such as the ordinal partition, perform best. As an application, we introduce the weighted ordinal partition, an extension of the popular ordinal partition with parameters that can be optimized using the mutual information statistic, and demonstrate improvements over the ordinal partition in time series analysis. We also demonstrate the weighted ordinal partition’s applicability to real experimental datasets.
Symbolic dynamics is a powerful tool in nonlinear time series analysis. Given a multidimensional time series, we partition its state space into disjoint sets, each represented by a unique symbol. Analyzing the symbolic representations of trajectories can offer useful insights. How we choose to partition the state space is critical; the aim is to select partitions that produce symbolic representations reflecting the true dynamics of a system. We introduce a mutual information statistic as a novel method for assessing how well a partition achieves this aim. The statistic measures the mutual information between a point’s symbolic trajectory history and its current location. We demonstrate greater suitability on test cases when compared with existing statistics and use the statistic to propose an improvement on the popular ordinal partition.
I. BACKGROUND AND MOTIVATION
Symbolic dynamics is the modeling of a dynamical system by discretizing its state space into a set of partitions, with applications in nonlinear time series analysis.1 Symbolic dynamics analysis requires the application of a partitioning scheme, which is the process of assigning a symbol to each point in the data based on which partition it falls into. A partitioning scheme can be considered a map , where represents the state space of a dynamical system and represents the set of symbols assigned uniquely to each partition. Good partitions are those that produce a symbol sequence retaining a large amount of information of the original system, termed “high-information;” finding high-information partitions is, therefore, a problem of interest.
A generating partition is a particular high-information partition where there is a one-to-one correspondence between each trajectory and its infinite symbolic sequence.2 Finding generating partitions is difficult, and no general method exists. An existing body of work attempts to estimate generating partitions. While these methods are theoretically applicable to higher-dimensional continuous systems, implementation has proven challenging; hence, applications have focused on two-dimensional systems such as the Hénon and Ikeda maps.2–6
Some works that estimate generating partitions directly from time series have introduced statistics to assess how close to generating a partition is and attempt to find partitions optimizing their values. These statistics follow the principle that under a generating partition, observations that are neighbors in symbol space should also be neighbors in state space; see, for example, Kennel and Buhl’s symbolic false nearest neighbors3 or symbolic shadowing by Hirata et al.2,5 These statistics encourage partitions that are contiguous or otherwise localized in the state space. However, we suggest that partitions that are considered close to generating in this manner are not necessarily well suited for time series analysis tasks. The ordinal partition, introduced by Bandt and Pompe,7 does not necessarily generate contiguous partitions but has been widely used with success in time series analysis tasks.
The ordinal partition is particularly popular for its computational ease.1,8 However, superior partitioning methods allowing for greater information retention may exist. Furthermore, aside from these generating partition statistics, there is currently no method for assessing and comparing the performance of different partitioning methods. Thus, in this paper, we have two aims: first, to develop a statistic that can be used to quantitatively assess the performance of a partition on a particular system and second, to use this statistic to identify superior alternatives to ordinal partitioning. The statistic we propose allows for the quantitative comparison of partitioning schemes on various systems, and we demonstrate its potential to identify better partitions for time-series analysis.
In Sec. II, we introduce a mutual information statistic as a method for assessing partitions. In Sec. III, the statistic is applied to a set of test partitions and systems to highlight differences with generating partition statistics and to ensure that it correctly ranks the partitions’ quality in accordance with their performance in time series analysis. In Sec. IV, we introduce the weighted ordinal partition as an application of the statistic and demonstrate improvements over the ordinal partition in time series analysis tasks. In Sec. V, we apply the weighted ordinal partition to two real experimental datasets.
II. MUTUAL INFORMATION STATISTIC
Suppose that we take a single low-precision observation of an orbit on a hyperbolic set. There are likely several trajectories passing through this point, and we cannot be sure which our observation comes from. Now, let us take a sequence in time of similarly low-precision observations from the same orbit. The shadowing lemma guarantees that there will be at least one trajectory that passes through these points.9 Knowledge that the trajectory we have observed passes through not only our initial observation but also all of these observations provides us with more information about the trajectory and allows us to identify it with greater precision. In effect, multiple low-precision observations have provided us with the same information as a single high-precision observation would have; this principle is illustrated in Fig. 1. This demonstrates a general property of chaotic dynamical systems; a single high-precision observation provides the same information as a sequence of low-precision ones.
In (a), we take a single low-precision observation of a chaotic dynamical system and observe three trajectories passing through it; we cannot be sure which we have observed. In (b), we take a sequence in time of low-precision observations from our system, allowing us to identify the red trajectory as the one we have observed. In (c), we identify this trajectory by taking a single high-precision observation. This simple example illustrates a general property of dynamical systems, and consequence of the Shadowing Lemma: a single high-precision observation provides the same information as a sequence of low-precision ones. Mutual information measures how well symbol sequences resulting from a partition preserve this property.
Formally, suppose is a partitioning scheme that can be applied to produce a variable number of partitions. Let be the partition resulting from applying to produce partitions. The process for assessing on a trajectory is as follows:
Algorithm 1
1. Apply coarse partition to .
2. Assign to each point the coarse history symbol sequence , where is the symbol for under , and and are parameters called history delay and history length, respectively.
3. For some , apply fine partition to and assign to each point its corresponding fine partition symbol.
4. Produce the joint probability distribution , where is the random variable representing fine partition symbols and is the random variable representing coarse symbol history sequences.
5. Calculate as given in Eq. (1).
6. Repeat steps 3 to 5 for various values of .
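For reference, the standard definition of mutual information between two discrete random variables is written out below. The symbols F (fine partition symbol) and H (coarse symbol history sequence) are placeholder names assumed here for illustration, and the logarithm base is left unspecified; the notation of Eq. (1) may differ.

```latex
I(F;H) \;=\; \sum_{f}\sum_{h} p(f,h)\,\log\frac{p(f,h)}{p(f)\,p(h)}
```

The quantity is zero when F and H are independent and is bounded above by the entropy of either variable.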
We expect good partitioning schemes to exhibit high values of mutual information and increasing mutual information with , and bad partitioning schemes to exhibit low mutual information and no strong increase in mutual information with . We thus expect that vs graphs should be able to differentiate between good and bad partitioning schemes; in Sec. III, we verify that this is indeed the case.
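The procedure in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the plug-in (frequency-count) probability estimate, the base-2 logarithm, and the convention that a point's history consists of the `length` coarse symbols preceding it (spaced `delay` steps apart) are all assumptions made here.

```python
from collections import Counter
import math


def history_sequences(symbols, delay, length):
    """Return (histories, start): for each index t >= start, the tuple of
    `length` earlier symbols symbols[t - delay], symbols[t - 2*delay], ...
    ASSUMPTION: the history excludes the current symbol."""
    start = delay * length
    histories = [
        tuple(symbols[t - delay * j] for j in range(1, length + 1))
        for t in range(start, len(symbols))
    ]
    return histories, start


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired samples xs, ys."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) / (p(x) p(y)) = (c/n) / ((px/n) (py/n)) = c*n / (px*py)
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def partition_statistic(coarse, fine, delay=1, length=2):
    """Mutual information between fine symbols and coarse symbol histories."""
    histories, start = history_sequences(coarse, delay, length)
    return mutual_information(fine[start:], histories)
```

Evaluating `partition_statistic` for several fine partitions of increasing size gives the points of a mutual-information curve of the kind described above.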
III. RESULTS ON TEST CASES
In this section, the mutual information statistic is used to assess four partitioning schemes on three test systems (Duffing oscillator, Lorenz system, and i.i.d. noise). Two good partitioning methods (ordinal partition and k-means clustering) as well as two poor methods (slice and random partitions) are selected to test that the statistic is able to differentiate between them. The ordinal partition is well established in the literature; see Bandt and Pompe for details.7 In this paper, and refer to the embedding dimension and time-delay parameters respectively for the ordinal partition. The time delay is chosen to be one quarter of the period of the orbit,10,11 while is varied to produce different ; note that is the number of unique ordinal sequences of length present in the time series and cannot be controlled explicitly. Clustering is implemented using k-means clustering, an unsupervised machine learning method that separates points into clusters and aims to minimize the sum of the squared distances between all points and their cluster means.8 Parameter defines the number of clusters; we set . The slice partition is constructed by selecting one dimension of a trajectory to produce a scalar time-series , then defining evenly sized bins between and . The random partition is constructed by randomly assigning each point in a trajectory one of symbols from the discrete uniform distribution . If the mutual information statistic is able to produce assessments of each partition that accurately reflect their performance in time series analysis, this will suggest that it is a suitable statistic for assessing the quality of a partitioning scheme.
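The two deliberately poor partitions are simple enough to sketch directly. The snippet below is an illustration under stated assumptions: the function names are hypothetical, the bins are taken to span the minimum to the maximum of the series, and symbols are labeled 0 to k-1.

```python
import numpy as np


def slice_partition(series, k):
    """Assign each value of a scalar series to one of k evenly sized bins.
    ASSUMPTION: bins span [min(series), max(series)]; symbols are 0..k-1."""
    series = np.asarray(series, dtype=float)
    edges = np.linspace(series.min(), series.max(), k + 1)
    # Only the k-1 interior edges are used, so the maximum falls in bin k-1.
    return np.digitize(series, edges[1:-1])


def random_partition(n_points, k, seed=None):
    """Assign each of n_points a symbol drawn uniformly from 0..k-1,
    independently of the point's position in state space."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, k, size=n_points)
```

The slice partition depends only on one coordinate of each point, and the random partition depends on nothing at all, which is why both are expected to carry little information about trajectory history.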
A. Duffing oscillator
The statistic correctly identifies the random and slice partitions as performing worse than the k-means and ordinal partitions, as shown in Fig. 2. The statistic detects that increasing the noise parameter results in decreasing the partition quality, also aligning with expectations. The exception is the random partition, where adding noise to the system does not affect results as partitions are assigned randomly anyway. Results also suggest that the ordinal partition is more resilient to noise than the other partitioning methods; this observation agrees with previous results.7
Mutual information vs on the Duffing oscillator for each partition with various noise parameters . As percentages of the standard deviation of the oscillator, correspond to , and , respectively. The (a) random and (b) slice partitions perform worse than the (c) k-means and (d) ordinal ( ) partitions, correctly reflecting their respective performances in time series analysis. Additionally, the statistic is able to detect that increasing noise reduces partition quality, and that the ordinal partition is more resilient to noise than the other partitions. In all cases, the statistic displays the predicted behavior. The random and k-means partitions are non-deterministic; central lines are a mean of 100 trials, with standard deviation shown above and below in the shaded regions. Note that for the ordinal partition, we use for , and for ; the different ordinal patterns present in each case result in different for each , accounting for the different domain in Fig. 2(d).
B. Lorenz system
We choose parameter values . The statistic correctly identifies the random and slice partitions as performing worse than the k-means and ordinal partitions, as shown in Fig. 3. The -slice and -embedding result in significantly higher mutual information than other dimensions. This is because the Lorenz system is symmetric under inversion through the -axis, so the dimension does not offer full observability of the system. The -embedded orbit is, therefore, less complex than the original orbit, with the attractor only exhibiting a single lobe, resulting in greater predictability and, therefore, higher mutual information.
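The symmetry argument can be checked numerically. The sketch below uses the standard Lorenz equations, whose vector field is unchanged in form under the map (x, y, z) → (−x, −y, z). The parameter values in the code (σ = 10, ρ = 28, β = 8/3) are the commonly used ones and are an assumption here, not necessarily the values chosen in the text; the symmetry holds for any parameter values.

```python
import numpy as np


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the standard Lorenz equations.
    ASSUMPTION: default parameters are the commonly used values."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])


# Symmetry map S: (x, y, z) -> (-x, -y, z), applied element-wise.
S = np.array([-1.0, -1.0, 1.0])


def is_equivariant(state):
    """True if applying S before the vector field equals applying it after."""
    state = np.asarray(state, dtype=float)
    return bool(np.allclose(lorenz(S * state), S * lorenz(state)))
```

Because the third coordinate is unchanged by the symmetry, a time series of that coordinate alone cannot tell the two lobes apart, which is consistent with the reduced observability described above.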
Mutual information vs for the Lorenz system. The (a) random partition performs the worst, followed by the (b) slice partition, then (c) k-means, then the (d) ordinal partition ( ), correctly reflecting their respective performances in time series analysis. The symmetry of the Lorenz system under -inversion means that the -embedded orbit is simpler, generally resulting in higher mutual information. In all cases, the statistic displays the predicted behavior. The random and k-means partitions are non-deterministic; central lines are a mean of 100 trials, with standard deviation shown above and below in the shaded regions. Note that for the ordinal partition, we use for the and embeddings and for the embedding; the different ordinal patterns present in each case result in different for each , accounting for the different domain in Fig. 3(d).
The exception to this is the ordinal partition, where the -embedding only slightly outperforms the other dimensions. This is because the partitions in the coarse -embedded ordinal partition are not contiguous, as shown in Fig. 4. This means that transitions between symbols are more difficult to associate with a specific location on the trajectory, resulting in lower mutual information between fine partitions and coarse partition history.
Coarse ordinal partitions ( ) on the Lorenz system. Partitions are contiguous on the (a) -embedded and (b) -embedded partitions but not the (c) -embedded partition. This means transitions between symbols for the -embedded ordinal partition are more difficult to attribute to a specific location on the attractor, resulting in lower mutual information for a given partition.
For the Lorenz system, the statistic produces reliable results reflecting both partition quality and the way the partitions interact with the geometry of the system. These examples demonstrate that besides the quality of the partition itself, choices around its implementation, such as embedding dimension, are also crucial for obtaining a good partition.
C. Independent and identically distributed noise
We generate i.i.d. noise from time series of the Lorenz system by randomly selecting points with replacement, preserving the distribution of the data but removing temporal correlation between points. We anticipate that partitions should perform poorly. The k-means partition in the state space, shown in Fig. 5(a), does indeed perform poorly. However, the k-means partition on embedded orbits and the ordinal partition in Figs. 5(b) and 5(c), respectively, show significant mutual information between fine partitions and coarse partition history. This is because in both cases, a time-delay embedding step is carried out. The time-delay embedding process introduces correlation between successive points in the embedded orbit, so each point contains information about trajectory history, resulting in a higher mutual information statistic value being measured. The ordinal partition is explicitly dependent upon the relationship between successive points, while the k-means partition only geometrically partitions the embedded orbit; this accounts for the ordinal partition’s better performance, with higher mutual information and a consistent increase with . Note that unlike on the original Lorenz system, partitions on the -embedding of the i.i.d. noise do not result in higher mutual information. This reflects the fact that here, mutual information results only from the time-delay embedding, not from system dynamics. These results suggest that in general, partitioning methods that utilize trajectory history, including the ordinal partition, will perform better under the mutual information statistic.
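The resampling step and the effect of time-delay embedding can be illustrated with a short sketch. The function names are hypothetical, and the forward embedding convention (each embedded point collects the current value and later delayed values) is an assumption for illustration.

```python
import numpy as np


def iid_surrogate(trajectory, seed=None):
    """Resample points with replacement: the distribution of values is
    preserved, but the temporal order of the original data is destroyed."""
    trajectory = np.asarray(trajectory)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(trajectory), size=len(trajectory))
    return trajectory[idx]


def delay_embed(series, dim, delay):
    """Time-delay embedding of a scalar series.
    ASSUMPTION: row t is (s[t], s[t+delay], ..., s[t+(dim-1)*delay])."""
    series = np.asarray(series)
    n = len(series) - (dim - 1) * delay
    return np.stack(
        [series[i * delay : i * delay + n] for i in range(dim)], axis=1
    )
```

Even for an uncorrelated series, row `t + delay` of the embedded orbit shares components with row `t`, so nearby embedded points carry information about one another. This overlap is the source of the correlation described above.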
Mutual information vs for i.i.d. noise generated from the Lorenz system. (b) k-means on the embedded system demonstrates meaningful mutual information; (a) the same partition in state space does not; (c) the ordinal partition ( ) demonstrates even higher mutual information as well as an increase in mutual information with increasing . Even though the noise retains no temporal correlation between observations, the process of time-delay embedding introduces correlation between successive observations. These results suggest that, in general, partitioning methods that utilize trajectory history will perform better under the statistic. The k-means partitions are non-deterministic; central lines are a mean of 100 trials, with standard deviation shown above and below in the shaded regions. Note that for the ordinal partition, the different ordinal patterns present in each embedding dimension result in different for each , accounting for the different domain in Fig. 5(c).
D. Comparison of mutual information and generating partition statistics
In this section, we highlight differences between our proposed mutual information statistic and two existing statistics used to assess candidate generating partitions: symbolic false nearest neighbors and symbolic shadowing.
Kennel and Buhl introduce symbolic false nearest neighbors3 to measure how close a partition is to generating. The statistic is built upon the principle that neighbors in symbol space should also be neighbors in the state space. Their algorithm is outlined below for a time series with symbol sequence :
Algorithm 2
1. Embed in the unit square, where is the number of unique symbols in the symbol sequence.
2. For each , find its nearest Euclidean neighbor in the unit square and call the nearest neighbor index .
3. Define .
4. Define as the percentile rank of among all pairs of points in .
5. Calculate the proportion of values below .
We compare how our proposed mutual information statistic, the symbolic false nearest neighbors statistic and the symbolic shadowing statistic (with ) assess the same system and partitions. Using the Lorenz system with partitions, we apply the random partition, slice partition ( slice), k-means partition, and ordinal partition ( , -embedded). Results are shown in Table I.
Comparing assessments of four k = 27 partitions on the Lorenz system offered by the mutual information statistic (MI), symbolic false nearest neighbors (SFNN), and symbolic shadowing (SS). The mutual information statistic offers an assessment of each partition that accurately reflects their performance in time series analysis. Note that for SS (n = 5), under the random partition, every observation has a unique surrounding symbol sequence, so mean distances are 0.
| Statistic | Random | Slice | K-means | Ordinal |
|---|---|---|---|---|
| MI | 0.05 | 1.08 | 1.26 | 2.52 |
| SFNN | 0.009 | 0.988 | 0.934 | 0.526 |
| SS (n = 1) | 50.7 | 21.2 | 7.2 | 118.4 |
| SS (n = 5) | 0.0 | 2.7 | 2.4 | 60.0 |
The mutual information statistic offers an assessment of each partition that accurately reflects their performance in time series analysis. It scores the ordinal partition higher than the k-means partition, which itself scores higher than the slice partition and then the random partition. Symbol sequences from the slice and k-means partitions offer strong state space localization and are rated well using symbolic false nearest neighbors and symbolic shadowing. However, disagreement between the three statistics demonstrates that state space localization does not necessarily result in high-information partitions as defined by the mutual information statistic. Given these results, we suggest that the mutual information statistic’s definition of “high-information” partitions is more appropriate for determining good partitions for time series analysis; we demonstrate such an application in Sec. IV. We also note that computation times for the proposed mutual information statistic are significantly lower than those for the existing generating partition statistics.
IV. WEIGHTED ORDINAL PARTITION
Empirical results and the mutual information statistic’s formulation suggest that the ordinal partition is a good partition as it utilizes trajectory history. Attempting to improve upon the ordinal partition, we introduce the weighted ordinal partition as an application of the mutual information statistic.
Let be a scalar time series from a one-dimensional observation of a system. Consider a point with -dimensional embedding . In an ordinal partition, we assign symbols based on the rank order of the amplitudes of components of (in the literature, time ordering is sometimes used rather than rank ordering; both generate equivalent partitions). In a weighted ordinal partition, we first apply an element-wise weighting, , by calculating , where denotes the element-wise product. We then assign a symbol based on the rank order of the amplitudes of components of . The conventional/unweighted ordinal partition is the case . Because the ordinal partition is scale invariant, we can fix .
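A minimal sketch of weighted ordinal symbol assignment is given below. It is an illustration only: the function name is hypothetical, the forward embedding convention is an assumption, and each symbol is represented by the permutation of indices that sorts the weighted components (one of the two equivalent conventions mentioned above).

```python
import numpy as np


def weighted_ordinal_symbols(series, weights, delay=1):
    """Weighted ordinal symbols for a scalar series.
    ASSUMPTION: the embedded point at time t is
    (s[t], s[t+delay], ..., s[t+(m-1)*delay]) with m = len(weights)."""
    w = np.asarray(weights, dtype=float)
    series = np.asarray(series, dtype=float)
    m = len(w)
    n = len(series) - (m - 1) * delay
    emb = np.stack(
        [series[i * delay : i * delay + n] for i in range(m)], axis=1
    )
    # Element-wise weighting, then the index order that sorts each row.
    order = np.argsort(emb * w, axis=1, kind="stable")
    return [tuple(row) for row in order.tolist()]
```

With all weights equal to one, this reduces to the conventional ordinal partition. Multiplying every weight by the same positive constant leaves the symbols unchanged, which is why one component of the weighting can be fixed.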
Each selection of will result in a different partition with a different value under the mutual information statistic. We propose that a partition can be optimized by selecting to maximize the value of the mutual information statistic, and that the resulting optimized weighted ordinal partition should outperform the conventional ordinal partition (also referred to as just the “ordinal partition”). An algorithm for selecting the optimal for scalar time series is as follows:
Algorithm 3
1. Select and to produce the embedded orbit . Note that by selecting we also select the number of symbols, .
2. Select a range of values for to be tested, contained in the set . In this paper, each component of (besides ) is varied in the range with step size .
3. For each , generate the -dimensional fine symbol sequence by finding the ordinal sequence of for all .
4. For each , generate the coarse symbol sequence in a similar manner, using truncated weighting and embedding .
5. For each , follow steps 4 and 5 of Algorithm 1 to calculate the mutual information statistic between the coarse and fine partitions.
6. Select the weighting that results in the partition with the highest mutual information value.
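The search in Algorithm 3 can be sketched as a brute-force grid search. Several details below are assumptions made for illustration and are not taken from the text: the first weight is fixed to 1, the coarse partition uses the first m-1 weights with embedding dimension m-1, the history consists of the preceding coarse symbols, and mutual information is estimated from frequency counts in bits. All function names are hypothetical.

```python
import itertools
import math
from collections import Counter

import numpy as np


def ordinal_symbols(series, weights, delay):
    """Weighted ordinal symbols using a forward delay embedding (assumed)."""
    w = np.asarray(weights, dtype=float)
    n = len(series) - (len(w) - 1) * delay
    emb = np.stack(
        [series[i * delay : i * delay + n] for i in range(len(w))], axis=1
    )
    return [tuple(r) for r in np.argsort(emb * w, axis=1, kind="stable").tolist()]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired samples."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        c / n * math.log2(c * n / (px[x] * py[y])) for (x, y), c in joint.items()
    )


def score(series, weights, delay, hist_delay, hist_len):
    """MI between fine symbols and coarse symbol histories for one weighting."""
    fine = ordinal_symbols(series, weights, delay)
    # ASSUMPTION: coarse partition = truncated weighting, dimension m-1.
    coarse = ordinal_symbols(series, weights[:-1], delay)
    start = hist_delay * hist_len
    hist = [
        tuple(coarse[t - hist_delay * j] for j in range(1, hist_len + 1))
        for t in range(start, len(fine))
    ]
    return mutual_information(fine[start:], hist)


def best_weighting(series, dim, delay, grid, hist_delay=1, hist_len=2):
    """Return the weighting on the grid with the highest MI score."""
    series = np.asarray(series, dtype=float)
    candidates = [(1.0,) + rest for rest in itertools.product(grid, repeat=dim - 1)]
    return max(candidates, key=lambda w: score(series, w, delay, hist_delay, hist_len))
```

The cost grows as the grid size raised to the power m-1, so an exhaustive search of this kind is only practical for small embedding dimensions.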
A. Application to the Lorenz system
We apply a weighted ordinal partition with parameters to an -observation of the Lorenz system as defined in Eq. (2). We apply Algorithm 3 with two modifications: in step 2, is varied in the range and in the range as these are the values of producing interesting behavior; and we remove step 6, as in this section, we are interested in observing behavior for a variety of values of rather than selecting the optimal one. Results are shown in Fig. 6. A region of high mutual information is seen along the parabola . Partitions along the parabola appear to meet at the center of each lobe of the attractor, while partitions adjacent to the parabola meet outside the center of each lobe. Partitions meeting at the center of a lobe form “wedges,” resulting in regular and predictable symbol sequences as a trajectory traverses each lobe and thus higher mutual information.
A range of values of are tested for the weighted ordinal partition ( ) on an -observation of the Lorenz system. (a) Mutual information values for each and . Points on the parabola have high mutual information due to partitions that meet at the center of each lobe, forming “wedges” that the symbol sequence predictably cycles through; compare (b) on the parabola to (c) adjacent to the parabola. Two regions in (a) show higher mutual information values than the conventional ordinal partition: (d) , which separates the lobes along the plane, and (e) , which more evenly distributes points between symbols. Contiguous partitions in (d) and (e) may also contribute to higher mutual information.
The standard ordinal partition with has the highest mutual information in the parabolic region with . There are two regions where higher mutual information is observed: the line , with giving , and a region below the parabola, with giving .
Along the line , ordinal sequences are generated from the points . This means that ordinal sequences and only occur when the Lorenz system crosses the plane . This plane represents an important switching between the two lobes of the system; applying this weighting, therefore, captures an important system behavior and results in a good symbolic representation of system dynamics with high mutual information.
When compared to the standard , the weighting parameters result in a more even distribution of points among the partitions, as Fig. 7 illustrates. Although many factors determine partition quality, given similar partitioning schemes, evenly distributing points between partitions should provide more information per symbol by maximizing the entropy of the symbol distribution; this can account for the higher mutual information observed for the weighting . It is also noted that the parameter choices and both result in contiguous partitions; this may also increase mutual information.
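The entropy argument can be made concrete with a small sketch. The function name is hypothetical and the base-2 logarithm is an assumption; the symbol sequences in the usage note are invented purely for illustration and are not data from the paper.

```python
import math
from collections import Counter


def symbol_entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())
```

For four symbols, a perfectly even sequence such as `[0, 1, 2, 3] * 25` attains the maximum of 2 bits, while a sequence dominated by one symbol, such as `[0] * 97 + [1, 2, 3]`, has much lower entropy. Since mutual information cannot exceed the entropy of the fine symbol distribution, a more even distribution raises the ceiling on the statistic.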
Weighted ordinal partitions ( ) on the Lorenz system for weightings (a) and (b) . According to the mutual information statistic, offers an improvement over the conventional ordinal partition. This is due to a more even distribution of points between symbols, resulting in higher entropy and therefore mutual information; see histograms for symbol distributions for (c) and (d) .
These results show that when optimized using the mutual information statistic, the weighted ordinal partition can offer improvements over the ordinal partition by identifying partitions that more effectively exploit system dynamics.
B. Weighted ordinal vs conventional ordinal partition
Under a good partitioning scheme, changes in the regime of a system’s behavior should be tracked by symbol sequence complexity measures. We measure the largest Lyapunov exponent (LLE)12 of each system as a proxy for chaos to characterize system behavior, and permutation entropy (PE)7 and Lempel–Ziv complexity (LZC)13 as measures of symbol sequence complexity. On both systems, an ordinal partition and a weighted ordinal partition are each used to generate symbol sequences from which we calculate PE and LZC. To assess partition quality, we compare how well PE and LZC from both partitions track LLE as bifurcation parameters are varied.
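As an illustration of a symbol sequence complexity measure, the sketch below counts the phrases in the Lempel–Ziv (1976) exhaustive parsing of a sequence. The function names are hypothetical, and the choice of this unnormalized variant is an assumption; the text does not state which form or normalization of LZC is used. The search is deliberately simple rather than fast.

```python
def _occurs(needle, haystack):
    """True if the list `needle` appears as a contiguous run in `haystack`."""
    m = len(needle)
    return any(haystack[j : j + m] == needle for j in range(len(haystack) - m + 1))


def lempel_ziv_complexity(seq):
    """Number of phrases in the LZ76 parsing of `seq`.
    Each phrase is the shortest block starting at the current position that
    cannot be copied from the text seen so far (overlapping copies allowed);
    a final block cut short by the end of the sequence counts as a phrase."""
    s = list(seq)
    n = len(s)
    i = 0
    count = 0
    while i < n:
        length = 1
        # Grow the candidate while it can still be reproduced from the
        # sequence up to, but not including, the candidate's last symbol.
        while i + length <= n and _occurs(s[i : i + length], s[: i + length - 1]):
            length += 1
        count += 1
        i += length
    return count
```

Regular sequences parse into few long phrases and irregular sequences into many short ones, so the phrase count rises with the irregularity of the symbol sequence. Permutation entropy, by contrast, is the Shannon entropy of the distribution of ordinal symbols.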
1. Logistic map
Both the weighted ordinal and ordinal partitions use embedding parameters , . Points of interest for the logistic map are the onset of chaos at , and periodic windows at . As Fig. 8 shows, LZC from the weighted ordinal partition is able to detect the onset of chaos slightly more accurately than LZC from the ordinal partition, while PE from both partitions is inaccurate, with the weighted ordinal partition increasing prematurely at and the ordinal partition delayed at . Additionally, while both LZC statistics detect all three periodic windows, LZC for the weighted ordinal partition is higher in the chaotic regime, meaning the periodic windows are more distinct from the surrounding chaos. Similarly, PE from the weighted ordinal partition detects all three periodic windows, while the ordinal partition demonstrates smaller decreases in PE and fails to detect entirely.
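For a one-dimensional map, the largest Lyapunov exponent can be estimated as the orbit average of the logarithm of the absolute derivative. The sketch below applies this to the logistic map x → r x (1 − x), whose derivative is r (1 − 2x). It is offered as a reference calculation under stated assumptions: the function name, initial condition, and iteration counts are choices made here, and this is not necessarily the estimator of Ref. 12.

```python
import math


def logistic_lle(r, n=5000, burn_in=500, x0=0.4):
    """Estimate the Lyapunov exponent of the logistic map at parameter r.
    ASSUMPTION: x0, burn_in and n are illustrative choices."""
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(n):
        # Floor at a tiny positive value to avoid log(0) if x hits 0.5 exactly.
        total += math.log(max(abs(r * (1 - 2 * x)), 1e-300))
        x = r * x * (1 - x)
    return total / n
```

A positive value indicates chaos and a negative value indicates a stable periodic orbit, so sweeping `r` produces a reference curve against which complexity measures of the symbol sequences can be compared.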
LLE tracking results for the logistic map ( ). Both (a) LZC and (b) PE for the weighted ordinal partition outperform their ordinal partition counterparts in their ability to track LLE. Note, in particular: (1) the larger decreases in LZC and PE for the weighted ordinal partition at periodic windows ; (2) the failure of PE from the ordinal partition to detect the periodic window; and (3) the accurate detection of the onset of chaos at in LZC from the weighted ordinal partition.
Table II contains short sections of the symbol sequences generated by both partitions at different values of and offers some insight into the reason for the weighted ordinal partition better distinguishing chaos from periodicity. During chaos at , the ordinal partition’s symbol sequence can be split into recurring “motifs” of 2403 and 13. This is because at this parameter value, despite being chaotic, the logistic map produces short sequences of points with recurring ordinal patterns; see Fig. 9(a). Although the order of the ordinal partition’s motifs is chaotic, their presence means that the symbol sequence can effectively be reduced to two symbols, lowering sequence complexity measures. On the other hand, no such motifs are identifiable in the weighted ordinal partition’s sequence. At , the optimal weighting is . Figure 9(b) shows length segments at various points along the time series after is applied. Segments that had the same ordinal sequence under the ordinal partition can be distinguished under the weighted ordinal partition. This eliminates the presence of recurring motifs in the symbol sequence, making periodic windows more distinguishable from chaos than in the conventional ordinal partition. At the periodic window, both the ordinal and weighted ordinal sequences fall into a six-periodic cycle, before returning to their respective chaotic behaviors at .
(a) Segment of the logistic map at . Points alternate above and below the horizontal line at . The vertical lines separate segments with recurring ordinal patterns and symbol assignments, under an ordinal partition. Segments of length 4 (grey background) correspond to the symbol sequence motif 2403, and segments of length 6 correspond to the motif 240 313. The order these motifs appear in is chaotic, but their presence reduces complexity measures. (b) Length segments of the time series at after applying optimal weighting . These segments have identical ordinal patterns (3241) under the conventional ordinal partition but can be distinguished after the weighting is applied. This leads to higher complexity measures for the weighted ordinal partition, making chaos more distinct from periodicity.
Symbol sequences for weighted ordinal and ordinal partitions (m = 4) on the logistic map in the chaotic regime (r = 3.62, 3.64) and six-periodic window (r = 3.63). Even in the chaotic regime, the ordinal partition’s symbol sequence can be split into recurring motifs, resulting in periodic windows being less distinguishable in complexity measures. Where present, motifs and recurring symbol segments are highlighted by alternating between bold and regular text.
r | Ordinal symbol sequence | Weighted ordinal symbol sequence |
---|---|---|
3.62 | 1,3,2,4,0,3,2,4,0,3,2,4,0,3,2,4,0,3,2,4,0,3,2,4,0,3,1,3 | 10,9,6,2,4,7,10,9,6,2,5,7,3,4,0,2,10,9,6,2,5,7,3,4,0,2 |
3.63 | 2,4,0,3,1,3,2,4,0,3,1,3,2,4,0,3,1,3,2,4,0,3,1,3 | 5,4,2,3,0,1,5,4,2,3,0,1,5,4,2,3,0,1,5,4,2,3,0,1 |
3.64 | 2,4,0,3,2,4,0,3,1,3,2,4,0,3,2,4,0,3,2,4,0,3,2,4,0,3,1,3 | 0,6,5,7,0,6,5,7,0,6,5,7,0,1,3,6,5,7,0,6,5,10,0,8,5,7 |
2. Lorenz system
Results also demonstrate a slight advantage for the weighted ordinal partition for the Lorenz system with , shown in Fig. 10. PE for the weighted ordinal partition accurately detects the onset of chaos at , whereas PE for the ordinal partition decreases, while in both PE and LZC, periodic windows at are more distinct for the weighted ordinal partition. The exception to this is PE for both partitions failing to detect the periodic windows at . The advantage that the weighted ordinal partition holds largely disappears when the embedding dimension is increased to , shown in Fig. 11; at a high enough dimension, both symbol sequences contain enough information to track LLE very accurately. This result suggests that using the mutual information statistic to optimize partition assignment for a lower number of symbols has a similar effect to increasing the number of symbols; that is, increasing the information conveyed per symbol.
LLE tracking results for the Lorenz system with embedding parameters . Both (a) LZC and (b) PE for the weighted ordinal partition slightly outperform their ordinal partition counterparts in their ability to track LLE. Note in particular: (1) the larger decreases in LZC and PE for the weighted ordinal partition at periodic windows , and (2) the accurate detection of the onset of chaos at in PE from the weighted ordinal partition.
weighted ordinal and conventional ordinal partitions demonstrate a similar ability to track LLE on the Lorenz system for both (a) LZC and (b) PE. At this sufficiently high dimension, both symbol sequences contain enough information to track LLE very accurately, so the advantage of the weighted ordinal partition is no longer present.
These results demonstrate that the weighted ordinal partition can offer improvements over the ordinal partition. The ability of complexity measures from both partitions to track LLE is largely similar, as they are conceptually identical. However, the mutual information statistic is successful in maximizing the amount of information per symbol given a partitioning scheme and a limited number of symbols; this is the cause of the greater visibility of periodic windows and the more accurate detection of the initial transitions from periodicity to chaos.
V. APPLICATIONS TO EXPERIMENTAL DATA
In this section, we apply the mutual information statistic and the weighted ordinal partition to two sets of experimental data: a laser time series from the Santa Fe time series competition,14 and a set of data derived from ECG measurements known as the Fantasia dataset, originally recorded and studied by Iyengar et al.15 and made publicly available on PhysioBank.16
The laser time series consists of 9093 observations representing the intensity of a far-infrared laser in a chaotic state. In windows of length 100 observations, overlapping and separated by step size 10, we apply Algorithm 3 to find an optimal weighted partition, measure LZC from the partition’s symbol sequence, and track how it changes over time. Results are shown on a segment of the time series in Fig. 12. The time series exhibits oscillations that increase in magnitude until they “collapse,” returning to small-magnitude oscillations that steadily increase in magnitude again. LZC values appear to increase as the magnitude of the oscillations increases, then suddenly spike downward as the collapse occurs, suggesting the system exhibits more chaotic behavior immediately prior to these transitions. LZC from the weighted ordinal partition captures these patterns better than LZC from the ordinal partition; note in the former the correct detection of the upward spike at index 6400 and a downward spike at index 6850. In LZC from the ordinal partition, the small upward spike at index 6400 has the same magnitude as another spike about 100 observations earlier, while the trough at index 6850 extends past index 7000.
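The windowing scheme described above can be sketched as follows. This is an illustration under stated assumptions: only full windows are kept, and `statistic` is a hypothetical callable standing in for the whole per-window pipeline (optimizing the weighting with Algorithm 3, symbolizing the window, and measuring LZC), none of which is reproduced here.

```python
def window_starts(n_obs, width=100, step=10):
    """Start indices of all full windows of `width` samples, `step` apart."""
    return list(range(0, n_obs - width + 1, step))


def windowed_statistic(series, statistic, width=100, step=10):
    """Evaluate `statistic` on each overlapping window.

    Returns a list of (start_index, value) pairs, one per window.
    """
    return [(s, statistic(series[s:s + width]))
            for s in window_starts(len(series), width, step)]
```

Under these assumptions, a series of 9093 observations with window length 100 and step size 10 yields 900 windows, the last starting at index 8990. Plotting each value against its window's start index (or midpoint, which is a presentation choice) gives a curve of complexity over time.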
LZC from weighted ordinal and ordinal partitions on a segment of the laser dataset, with a window size of 100 observations. LZC values increase prior to collapses from large to small magnitude oscillations, suggesting the system exhibits more chaotic behavior immediately prior to these transitions. LZC from the weighted ordinal partition outperforms LZC from the ordinal partition; note in the former the detection of the upward spike at index 6400 and a downward spike at index 6850.
Additionally, we analyze a subset of the Fantasia dataset. The data consist of ten time series containing between 4936 and 8708 observations, representing interbeat time intervals from ten subjects recorded over two hours. Five of the time series are from elderly subjects aged 68–85, and five are from younger subjects aged 21–43. The authors of the original paper15 attempted to isolate age as the only experimental variable, with the aim being to discriminate between the two age groups based on the time series alone. In previous work, other authors have shown that measurements of permutation entropy are unable to do so.15,17 Applying Algorithm 3 to the time series, we generate the optimal weighted ordinal partition with the aim of identifying improvements over the standard ordinal partition in the ability to discriminate based on permutation entropy. However, we find that for embedding parameters and observations, the optimal weighting is always . This means that the standard ordinal partition is already optimal, so the weighted ordinal partition is unable to offer any improvements in this case.
VI. CONCLUSIONS
We have introduced a mutual information statistic to assess partitions of the state space of chaotic dynamical systems. Compared with existing generating partition statistics, this mutual information statistic produces assessments of partitions that better reflect their performance in time series analysis. We, therefore, suggest that the statistic’s mechanism, measuring mutual information between the trajectory history and current location, defines a more accurate notion of “high-information” partitions in the context of time series analysis.
The statistic’s formulation and the empirical results from its application both indicate that partitions utilizing trajectory history, such as the ordinal partition, perform well. We can, therefore, offer an account of the already popular ordinal partition’s usefulness and provide evidence supporting its continued use.
As an extension to the ordinal partition, we introduce the weighted ordinal partition. Optimizing the weighted ordinal partition’s parameters using the mutual information statistic produces partitions that exploit system dynamics more effectively than the conventional ordinal partition. The weighted ordinal partition demonstrates improvements over the ordinal partition in time series analysis, particularly in distinguishing chaos from periodicity when using a small number of unique symbols. The weighted ordinal partition can also be applied to real-world datasets, as demonstrated on two experimental time series.
ACKNOWLEDGMENTS
J.L. was supported by the Australian Mathematical Sciences Institute through a 2023–2024 Vacation Research Scholarship.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Jason Lu: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Visualization (equal); Writing – original draft (equal). Michael Small: Conceptualization (equal); Methodology (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
Example code implementing an algorithm to calculate the mutual information statistic is openly available in the mutual-information-statistic repository at https://github.com/jason-luuuuu/mutual-information-statistic, Ref. 18.