Networks and graphs have emerged as powerful tools to model and analyze nonlinear dynamical systems. By constructing an adjacency matrix from recurrence networks, it is possible to capture critical structural and geometric information about the underlying dynamics of a time series. However, randomization of data often raises concerns about the potential loss of deterministic relationships. Here, using the spring-electrical-force model, we demonstrate that by optimizing the distances between randomized points through minimizing an entropy-related energy measure, the deterministic structure of the original system is not destroyed. This process allows us to approximate the shape of the time series and correct its phase, effectively reconstructing the initial invariant set and attracting dynamics of the system. Our approach highlights the importance of adjacency matrices derived from recurrence plots, which preserve crucial information about the nonlinear dynamics. By using the entropy of the diagonal line lengths of recurrence plots and leveraging the Kullback–Leibler divergence as a relative entropic measure, we fine-tune the parameters and initial conditions for recurrence plots, ensuring an optimal representation of the system's dynamics. Through the integration of network geometry and energy minimization, we show that data-driven graphs can self-organize to retain and regenerate the fundamental features of the time series, including its phase space structures. This study underscores the robustness of recurrence networks as a tool for analyzing nonlinear systems and demonstrates that randomization, when guided by informed optimization, does not erase deterministic relationships, opening new avenues for reconstructing dynamical systems from observational data.
I. INTRODUCTION
The analysis of networks and graphs has seen substantial advancements in recent years.1–3 Various methods exist for constructing and visualizing networks and graphs, such as ordinal networks,4,5 visibility graphs,6,7 and k-nearest neighbor networks.8 A recurrence network can be established from a recurrence matrix, which also serves as an adjacency matrix, enabling the extraction of quantification measures related to the underlying dynamical system.9,10 Recurrence plots and quantification measures make use of recurrence as a key property of dynamical systems,11 with growing interest in recent years in analyzing nonlinear systems.9,12
Eckmann et al.13 introduced recurrence plots (RPs) as a method for visualizing recurrence matrices, which encapsulate the time-invariant properties of a dynamical system derived from observed time series data.14 RPs are generated through phase-space reconstruction, commonly applied to time-varying systems.15–21 Creating a recurrence network involves deriving an adjacency matrix, which may be generated from a distance matrix or from the RP's weighted similarity matrix, and which captures the nodal connections within the network.22 Networks have proven useful for estimating dynamic transitions, quantification measures,9,23 and invariants, as highlighted by recent studies such as those by Haehne et al.24 and Börner et al.25 As opposed to recurrence plots, which require embedded systems, networks belong to the non-embedded methods.26 Notably, using an adjacency matrix in conjunction with the spring-electrical-force (SEF) model has shown promise in qualitatively reproducing attractors akin to the original phase-space representation.10 This potentially has the advantage that invariant sets may be reconstructed from data alone, provided an optimal reconstruction process using the SEF model can be found. To enable the reconstruction of an attractor, the time series needs to be at steady state (after transients have dissipated) and stationary (its joint probability distribution is invariant under time shifts).27 Also, noise plays an important role, and the extent to which an attractor reconstruction depends on the degree of external perturbation remains unanswered. Some early research on image reconstruction deals with perturbations and reprojection errors using spring-embedder algorithms28,29 and, later, the force-directed algorithm.30,31
A physically meaningful recurrence matrix depends on the proper embedding of the system’s dynamics, requiring estimates for parameters such as embedding dimension,32,33 time delay,14,34 and recurrence threshold criteria.19,35–48 Standard methods include estimating embedding dimension using global false nearest neighbors19,32 and time delay using average auto-mutual information.18,34 Sophisticated techniques for optimizing recurrence thresholds have been explored,22,37,38,40,49 but simpler approaches such as scaling phase-space diameter or examining RP diagonal structures are often used.11,15,35,36,50
Accurate embedding ensures that the RP retains the dynamical system’s essential information, including static and dynamic invariants.14,19,27 Entropy-based measures like Kullback–Leibler divergence51 or generalized forms (e.g., Tsallis or Rényi divergences52) are suitable for optimization, as they provide insights into hidden states.53–55
Yang and Liu10 used the spring-electrical force (SEF) model10,56 to reconstruct phase space from network data in a self-organized manner. This model, which uses springs to simulate attraction and electrical forces for repulsion, allows intrinsic geometry to emerge organically from random initial points.57 The approach effectively reveals the hidden dynamics of the system.58 Self-organized systems are theoretically inherently self-optimized and do not require further improvement. However, in numerical implementations and artificial self-organized models in particular, and depending on the constraints and the cost function, optimization might be required and can be applied to achieve further improvements.59,60
Applying the force-directed algorithm (spring-electrical force model) with an arbitrary set of random initial conditions, as described by Yang and Liu,10 may not lead to an acceptable reconstructed attractor. To overcome this, we have added an optimization process to their method, based on an optimized threshold search and on iterating backward over random initial conditions while minimizing the Kullback–Leibler divergence51 as a geometric measure of relative entropy. Further, we have projected the time series twice, thereby deteriorating the signal similarly to an observable, to approximate the effects of noise and to simulate the data-driven reconstruction of unknown systems. The procedure developed in this paper thus aims to further optimize the self-organized process of the spring-electrical force model, which is inherently optimized but usually employed in different applications and known to suffer from some deficiencies.
II. METHODS
Figure 1 schematizes the different steps of how to regenerate an attractor from an observable: reconstruction in phase space,14 visualization through a complex network (e.g., as a horizontal visibility graph26), application of the spring-electrical force model, and the optimization procedure for obtaining the ideal set of initial conditions. Table I lists the main variables used in this paper.
Conceptual diagram of how to recover the attractor from a complex graph. Algorithms developed and provided in this paper are highlighted as Algorithm 1 (main paper) and Algorithms 1 and 2 (supplementary material II and III). After (a) phase space reconstruction (using an optimized threshold) and (b) extracting the RP and its adjacency matrix, (c) a complex network as an undirected graph G (with V, E being the vertices and edges) is formed, visualized here following Ref. 26, and is iteratively updated using optimized initial conditions by minimizing the Kullback–Leibler divergence. Then, (d) the time series and the attractor based on an observable can be regenerated.
Summary of Notations.
Notation | Definition
---|---
d_m; τ | embedding dimension; time delay
distENTR_m | mth order ENTR deviation distance
y | initial coordinate set
A | adjacency matrix of G
C, p | SEF model parameters
CRP | cross-recurrence plot
D_KL | the Kullback–Leibler divergence
ENTR | Shannon entropy of the frequency distribution of the diagonal line lengths
G | an unweighted and undirected graph
K | natural spring length
P | end displacement of force-directed algorithm
R | the recurrence matrix of X
RTE | recurrence time entropy
RP | recurrence plot
S | the embedded time series of X
X | a time series (abbrv. TS)
δ | position updating magnitude
η | learning rate
ε | the recurrence threshold
– | recurrence threshold set
λ | the KL divergence threshold
θ | force-directed algorithm stopping threshold
A. Basic notations
Let X = {x_1, x_2, …, x_N} be a time series with N points, where x_i ∈ ℝ and i = 1, …, N. Using fixed values of embedding dimension d_m and time delay τ, we can reconstruct the time series into a set S with Ns = N − (d_m − 1)τ vector points, denoted as S = {s_1, …, s_Ns}. Each s_i is represented by s_i = (x_i, x_{i+τ}, …, x_{i+(d_m−1)τ}), where i = 1, …, Ns. The embedding dimension and time delay are estimated with the global false nearest neighbor algorithm (FNN)19,32 and the average auto-mutual information (MI),34,50 respectively. Specifically, the FNN is applied by requiring the fraction of false nearest neighbors to fall below a prescribed level, and the MI is applied by choosing the first minimum of the MI values.
Based on the definition in Ref. 17, the recurrence plot of S can be expressed as R_{i,j}(ε) = Θ(ε − ||s_i − s_j||), i, j = 1, …, Ns, where Θ(·) is the Heaviside function and ε is the recurrence threshold, i.e., the cut-off distance.15
By selecting an appropriate embedding, as prescribed by Takens' embedding theorem,18,27 the time series is reconstructed into phase space,14 enabling the analysis of the observed dynamics. This results in the construction of an Ns × Ns binary recurrence matrix R, where each element R_{i,j} reflects the recurrence status of the states at times i and j. Specifically, R_{i,j} = 1 indicates that the distance between the states at times i and j is no greater than ε, while R_{i,j} = 0 indicates that the distance exceeds this threshold.
In contrast to an RP, a cross-recurrence plot (CRP) compares the states of two different systems and is defined as CR_{i,j}(ε) = Θ(ε − ||s_i − s̃_j||), where s_i and s̃_j are states from two different systems in their respective reconstructed phase spaces. The key difference is that the RP focuses on self-similarity within a single system, while the CRP examines shared dynamics or interactions between two systems. We will use the CRP to compare the reconstructed attractors in phase space, contrasting the method of Yang and Liu10 with our proposed procedure, in which we aim to optimize the self-organized process of the spring-electrical force mechanism.
Donner et al.9 and Marwan et al.23 constructed undirected, unweighted networks G = (V, E)62 from the binary recurrence matrices by representing the embedded points s_i as nodes v_i and the recurrence relationships as edges. We use V (respectively, E) to denote the node set (respectively, the edge/link set). According to Lacasa et al.,26 graphs can be represented in different ways, as illustrated for the Rössler system in Fig. 1(c).
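To make this construction concrete, a minimal Python sketch of the delay embedding, the binary recurrence matrix, and the derived undirected graph is given below. Function names and the example parameters (dim, tau, eps) are illustrative only and are not the settings used in the paper; removing the main diagonal of R to obtain the adjacency matrix A follows the common recurrence-network convention.

```python
# Sketch of Sec. II A: delay embedding, recurrence matrix R, and the
# corresponding unweighted, undirected graph G. Names and parameter values
# are illustrative, not the paper's settings.
import numpy as np
import networkx as nx

def delay_embed(x, dim, tau):
    """Time-delay embedding of a 1-D series x with dimension dim and delay tau."""
    n_vectors = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n_vectors] for i in range(dim)])

def recurrence_matrix(S, eps):
    """Binary recurrence matrix R_ij = 1 if ||s_i - s_j|| <= eps (Euclidean norm)."""
    dists = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    return (dists <= eps).astype(int)

x = np.sin(2 * np.pi * 0.02 * np.arange(500))   # example observable
S = delay_embed(x, dim=2, tau=12)               # embedded time series S
R = recurrence_matrix(S, eps=0.2)               # recurrence matrix R
A = R - np.eye(len(S), dtype=int)               # drop self-loops (main diagonal)
G = nx.from_numpy_array(A)                      # undirected, unweighted graph G
```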
B. Optimal recurrence threshold selection
We present the method employed to optimize the recurrence threshold for the RP. As seen in Eq. (2), if the recurrence threshold is set too high, most points in the dataset will appear recurrent. Conversely, if is chosen to be too small, only a minimal number of points will be classified as recurrent.
To determine the optimal threshold for the time series with fixed embedding dimension d_m and time delay τ, we define a set of recurrence threshold values {ε_1, ε_2, …, ε_n} with a fixed step size Δε, where ε_{k+1} = ε_k + Δε. The total number of thresholds used is n.
In this context, distENTR_m represents the difference between the ENTR values [Eq. (4)] of recurrence plots computed with successive recurrence thresholds ε. Consequently, two recurrence plots can be considered similar when this deviation is small. Therefore, the recurrence threshold corresponding to the minimum value of distENTR_m can be selected as the optimized recurrence threshold. A brief numerical example illustrating this process is provided in supplementary material I. It should be noted that this choice of threshold differs from the optimal threshold used for the ideal reconstruction of the dynamics. This was necessary to examine the maximum possible change and the information content required for the SEF model to reconstruct the attractor with acceptable quality.
The optimal threshold is determined using the algorithm outlined in SI 1, which identifies the minimum distENTR_m value within the threshold set.
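A condensed sketch of this threshold scan is given below. The ENTR computation follows the standard RQA definition (Shannon entropy of the diagonal line length histogram with minimum length 2); since Eq. (5) is not reproduced here, an m-th order finite difference of ENTR over successive thresholds is used only as a stand-in for distENTR_m. The helper recurrence_matrix is the one sketched in Sec. II A.

```python
# Sketch of Sec. II B: scan a set of recurrence thresholds and pick the one
# minimizing an assumed ENTR deviation distance (Eq. (5) not reproduced).
import numpy as np

def diagonal_line_lengths(R, l_min=2):
    """Lengths of diagonal lines (runs of 1s parallel to the main diagonal)."""
    n = R.shape[0]
    lengths = []
    for k in range(1, n):                      # off-diagonals only (exclude LOI)
        diag = np.diagonal(R, offset=k)
        run = 0
        for v in np.append(diag, 0):           # trailing 0 flushes the last run
            if v == 1:
                run += 1
            else:
                if run >= l_min:
                    lengths.append(run)
                run = 0
    return np.asarray(lengths)

def entr(R):
    """Shannon entropy of the diagonal line length frequency distribution."""
    lengths = diagonal_line_lengths(R)
    if lengths.size == 0:
        return 0.0
    _, counts = np.unique(lengths, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def optimal_threshold(S, eps_set, m=1):
    """Pick eps minimizing an m-th order ENTR difference (stand-in for distENTR_m)."""
    entrs = np.array([entr(recurrence_matrix(S, e)) for e in eps_set])
    dist = np.abs(entrs[m:] - entrs[:-m])
    return eps_set[int(np.argmin(dist))]
```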
C. Optimal initial displacement selection
Here, we introduce the methods we used to optimize the initial displacements for the self-organizing process.10
1. Force-directed recurrence networks
For a given time series X, Yang and Liu10 proposed a self-organizing process to derive its geometric structure from the corresponding network. This approach utilizes the force-directed algorithm, also known as the spring-electrical-force algorithm.56
Force-directed algorithm. For a graph G = (V, E) with adjacency matrix A, we use P to denote the coordinates (displacement or layout64) of the nodes in V, i.e., for a node v_i ∈ V, its coordinate is p_i.
To visualize the graph layout corresponding to A, the analogy of the spring-electrical model is employed, where the edges are treated as springs and the nodes are modeled as electrically charged particles.10,56 This principle is implemented in the force-directed algorithm. The goal of this algorithm is to minimize the energy of the network, mimicking the behavior of a physical system in equilibrium. As a result, the algorithm iteratively updates the nodes' coordinates, with the direction and magnitude of each node's movement determined by the net force acting on it.
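A minimal sketch of the resulting net force on a single node is shown below, assuming the standard spring-electrical formulation of Ref. 56 (attractive spring force d²/K between adjacent nodes, repulsive electrical force −C·K^(1+p)/d^p between all node pairs). The concrete values of C, p, and K used in the paper are not assumed here.

```python
# Net spring-electrical force on node i (Sec. II C 1), assuming the standard
# formulation of Ref. 56. C, p, K are the model parameters of Table I.
import numpy as np

def combined_force(i, coords, A, C, p, K):
    """Sum of repulsive (all pairs) and attractive (adjacent nodes) forces on node i."""
    f = np.zeros(coords.shape[1])
    for j in range(len(coords)):
        if j == i:
            continue
        diff = coords[j] - coords[i]
        d = np.linalg.norm(diff) + 1e-12        # avoid division by zero
        unit = diff / d
        f += -C * K ** (1 + p) / d ** p * unit  # repulsive (electrical) term
        if A[i, j]:
            f += d ** 2 / K * unit              # attractive (spring) term
    return f
```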
2. Updated stop criteria
In Algorithm 1, we use A to represent the adjacency matrix generated from a known time series X. The variable δ denotes the node-updating magnitude, also referred to as the cooling factor,56 which is updated according to different scenarios10 (lines 27–39). The stopping threshold is represented by θ, and the procedure terminates when the corresponding energy-change condition is met (line 22). Additionally, based on Ref. 56, we extend the stopping criterion (line 22) by the minimum difference between two subsequent layouts, so that the procedure concludes only when both conditions are satisfied. Here, y represents the initial coordinate set of each node in V (line 1). The force model parameters C, p, and K are assigned fixed values (lines 3 and 4).10
The system's energy, represented by Eq. (7), is initialized in line 6. In lines 7 and 21, we introduce and then update a variable that tracks the direction of change in the system's energy. This value serves as an input parameter for controlling the node-updating magnitude δ (lines 27–39). In line 8, a counter variable is introduced to manage the recursive procedure (lines 8–26); it is initialized in line 5.
In each iteration (lines 9–25), the system's energy and the node coordinates are updated based on the combined force [Eq. (6)] acting on each node. At the start of each iteration, the system energy from the previous iteration is stored (line 9), which is later used to check the stop criterion (line 22) and to adjust the updating magnitude (line 21). The system's energy for the current iteration is re-initialized (line 9). Before starting the force computation, the current coordinate set is saved (line 9), and the combined force acting on each node is reset (line 10). In lines 12–17, the forces on each node in V are computed using Eq. (6); the coordinates are then updated (line 18), and the system's energy is recalculated using Eq. (7) (line 19). Based on the conditions in lines 27–39, the magnitude δ is then updated (line 21).
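The following condensed sketch mirrors this update loop only qualitatively: it accumulates the squared force norms as the system energy, applies a simplified version of the adaptive cooling schedule of Ref. 56, and stops once the layout difference falls below the threshold θ (scaled here, as an assumption, by the natural spring length K). It is not a line-by-line transcription of Algorithm 1 and relies on combined_force from the previous sketch.

```python
# Condensed, qualitative sketch of the force-directed update loop (Algorithm 1).
import numpy as np

def force_directed_layout(A, y0, C, p, K, theta, delta=None, max_iter=5000, t=0.9):
    coords = y0.copy()
    delta = K if delta is None else delta               # initial updating magnitude
    energy = np.inf
    progress = 0
    for _ in range(max_iter):
        prev_coords, prev_energy, energy = coords.copy(), energy, 0.0
        for i in range(len(coords)):
            f = combined_force(i, coords, A, C, p, K)
            fn = np.linalg.norm(f)
            if fn > 0:
                coords[i] = coords[i] + delta * f / fn  # move along the net force
            energy += fn ** 2                           # system energy, cf. Eq. (7)
        if energy < prev_energy:                        # adaptive cooling (Ref. 56)
            progress += 1
            if progress >= 5:
                progress, delta = 0, delta / t
        else:
            progress, delta = 0, t * delta
        if np.linalg.norm(coords - prev_coords) < theta * K:   # layout-difference stop
            break
    return coords
```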
Current problems of the self-organizing process. We identified problems when extracting the steady geometric structure from a network using the self-organizing process of Ref. 10, and these are addressed as follows:
Optimal coordinates evaluation: In line 40 of Algorithm 1, the updated coordinates are returned once the energy-change condition (line 22) is satisfied, but there is no evaluation of the optimality of the end displacement P. However, the ideal displacement should be identical to the embedded time series of X66 (i.e., S, Sec. II A), meaning P ≈ S. To address this, we apply the Kullback–Leibler divergence (KL divergence)67 to measure the divergence between the displacement returned by the algorithm, P, and the ideal displacement values, S. A small KL divergence value indicates closeness between the two.
Initial coordinates generation: The initial coordinate set y of Algorithm 1 is randomly distributed in a high-dimensional space (of embedding dimension d_m, Sec. II A).10 The KL divergence value (i.e., the divergence between the returned P and S) can vary significantly with different sets of initial coordinates. To find the displacement closest to S, the initial coordinates need to be optimized, as described in Sec. II D.
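Since Eq. (8) is not reproduced here, the following sketch assumes one plausible realization of the KL-divergence check: per-dimension histograms of P and S on a common grid, compared with the discrete KL divergence and averaged over the embedding dimensions.

```python
# Assumed sketch of the KL-divergence check between the end displacement P
# and the embedded time series S (histogram-based stand-in for Eq. (8)).
import numpy as np
from scipy.stats import entropy

def kl_divergence(P, S, bins=50, eps=1e-12):
    """Average discrete KL divergence D_KL(P || S) over all embedding dimensions."""
    d_kl = []
    for dim in range(S.shape[1]):
        lo = min(P[:, dim].min(), S[:, dim].min())
        hi = max(P[:, dim].max(), S[:, dim].max())
        p_hist, _ = np.histogram(P[:, dim], bins=bins, range=(lo, hi), density=True)
        s_hist, _ = np.histogram(S[:, dim], bins=bins, range=(lo, hi), density=True)
        d_kl.append(entropy(p_hist + eps, s_hist + eps))   # scipy: KL(pk || qk)
    return float(np.mean(d_kl))
```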
D. Iterative learning optimization process
We propose an iterative learning optimization process for generating the initial coordinates. Given an embedded time series S, our goal is to derive a displacement P that is closest to S. This corresponds to the smallest KL divergence value between P and S, leading to the optimal initial coordinate set. Note that P is returned by Algorithm 1, whose input adjacency matrix A is generated from X and whose input y represents the corresponding initial coordinates. During the optimization, the learning rate η is adapted as follows (a schematic sketch is given after the list):
(i) If the KL divergence value remains nearly constant over 50 iterations (indicating that the optimization procedure has reached a critical point), the learning rate η is decreased. A smaller η refines the optimization process.
(ii) If the KL divergence value increases within one iteration, the learning rate η is increased. This adjustment accelerates the updating of the coordinates toward their optimal position.
(iii) In any other case, the current learning rate is maintained.
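The following schematic illustrates rules (i)–(iii). The actual coordinate update of Algorithm SI 2 is not reproduced; as a labeled stand-in, the initial coordinates y are nudged toward the embedded time series S with step size η, and the constants (halving/doubling of η, patience of 50 iterations) merely exemplify the rules above. It relies on force_directed_layout and kl_divergence from the previous sketches.

```python
# Schematic of the adaptive learning-rate rules (i)-(iii); the coordinate
# update below is a stand-in, not the rule of Algorithm SI 2.
import numpy as np

def optimize_initial_coordinates(A, y, S, eta, lam, C, p, K, theta, max_rounds=500):
    best_y, best_kl = y.copy(), np.inf
    stagnant, prev_kl = 0, np.inf
    for _ in range(max_rounds):
        P = force_directed_layout(A, y, C, p, K, theta)   # Algorithm 1
        kl = kl_divergence(P, S)
        if kl < best_kl:
            best_y, best_kl = y.copy(), kl
        if best_kl < lam:                                 # KL divergence threshold
            break
        if abs(kl - prev_kl) < 1e-4:                      # rule (i): plateau
            stagnant += 1
            if stagnant >= 50:
                eta, stagnant = 0.5 * eta, 0
        elif kl > prev_kl:                                # rule (ii): divergence grew
            eta = 2.0 * eta
        # rule (iii): otherwise eta is kept unchanged
        prev_kl = kl
        y = y + eta * (S - y)                             # stand-in update of the ICs
    return best_y, best_kl
```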
III. EXPERIMENTAL EVALUATION
We evaluate the proposed algorithms on simple sinusoidal motion and the chaotic motion of the Rössler system.
Algorithms. We implement and evaluate the two optimization algorithms, for the recurrence threshold (Algorithm SI 1) and for the initial coordinates (Algorithm SI 2), respectively.
Settings. All recurrence plot-related computations are performed with the "Commandline Recurrence Plots" software.69 All parameters use their default values, except for the embedding dimension, time delay, and neighborhood criterion, which are crucial parameters for embedding18,70 and are calculated using the Time Series Analysis package (TISEAN).33 The order m of the mth order deviation distance [Eq. (5)] is kept fixed.
For the computation of the repulsive and attractive forces [Eq. (6)], throughout this paper we fix the SEF model parameters C and p as well as the natural spring length K, where K is the average length of an edge in the physical force model, as explained in Ref. 56.
For the initial coordinates optimization procedure, unless otherwise specified, the parameters of the procedure, such as the learning rate η and the KL divergence threshold λ, are kept at fixed values.
All programs are implemented in Python and run under Linux. All experiments are performed on a machine with an AMD EPYC 2.4 GHz CPU running Red Hat Linux.
A. Application to sine function
For the recurrence threshold optimization, we select a range of thresholds with a fixed step size Δε. Since ε = 0 is meaningless, we choose the first value of the threshold set to be equal to the step size, i.e., ε_1 = Δε.
1. Evaluation of selecting optimal recurrence threshold
In constructing the recurrence plots, we use the estimated time delay τ and embedding dimension d_m.
Evaluation of ENTR.
Figures 2(a)–2(d) show the recurrence plots for different thresholds ε. With increasing threshold, more neighbors (black dots) become visible [Eq. (2)], i.e., the number of recurrences increases (Fig. 2), and an increase in the change of information can be measured, which is why the entropy grows [Eq. (4) and Fig. 3].
The recurrence plots of the sine function associated with different thresholds; panels (a)–(d) correspond to increasing threshold values.
For large enough ε, the number of recurrences increases only slowly, which affects the trend of ENTR over different ε: ENTR increases substantially at the beginning and then changes steadily. Figure 3 shows that ENTR can be used to detect the difference between recurrence plots computed with various recurrence thresholds.
Evaluation of the mth Order Deviation Distance. Figure 4 depicts the selection of the optimal recurrence threshold from the threshold set for the sine function by Algorithm SI 1, where the mth order deviation distances are calculated for several orders m.
mth order deviation distance (distENTR_m) against the threshold for the sine time series.
The plots of distENTR_m exhibit similar behavior for all orders m. At lower threshold values, the sensitivity is higher: a small increase in the threshold can capture many more recurrences, leading to a larger variance (indicated by stronger oscillations) for all distENTR_m values calculated.
As the threshold increases, the changes in ENTR become relatively small (Fig. 3). We select the threshold that minimizes distENTR_m as the optimal choice. This threshold is then used for subsequent computations, including the optimization of the initial coordinates (Sec. III A 2).
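For illustration, a hypothetical call of the threshold scan from Sec. II B on a sine series could look as follows; all numerical values are placeholders rather than the settings used for Figs. 2–4, and delay_embed and optimal_threshold are the helpers sketched above.

```python
# Illustrative call of the threshold scan on a sine example (placeholder values).
import numpy as np

x = np.sin(2 * np.pi * 0.02 * np.arange(1000))
S = delay_embed(x, dim=2, tau=12)
eps_set = np.arange(0.05, 1.0, 0.05)        # threshold set with fixed step size
eps_opt = optimal_threshold(S, eps_set, m=1)
```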
2. Evaluation of initial coordinates optimization
We demonstrate the validity of the displacement evaluation measure (i.e., KL divergence) and the optimization procedure for the initial coordinates (abbrv. ICs).
Evaluation of End Displacement with KL Divergence. Figure 5 illustrates the KL divergence distribution for all dimensions, based on independent randomized initial coordinate sets. Each KL divergence value is computed from the corresponding end displacement of an initial coordinate set using Algorithm 1 and Eq. (8).
KL divergence distribution over independently randomized initial coordinate sets. IC 30 has the smallest divergence, while IC 61 shows the largest. Examples of their performance are shown in the supplementary material.
For clarity, we also report the labels of the initial coordinate sets with the minimum and maximum KL divergence values, respectively. For all dimensions (over-embedding), the KL divergence values are generally grouped into two clusters, referred to as Group-A and Group-B; in Group-A, all KL divergence values exceed a certain level. We demonstrate in supplementary material IV to VI that not every initial coordinate set is capable of generating a displacement close to the time trace of the embedded time series.
Figures 5 and S1 (in supplementary material IV) illustrate that displacements with smaller KL divergence values are closer to their optimal displacements. This demonstrates that the KL divergence can effectively be used to assess the proximity of the end displacement P to the ideal displacement S.
Evaluation of Initial Coordinates Optimization. We present the results of the initial coordinate (abbrv. IC) optimization [Algorithm SI 2] for dimension 1 of initial coordinate set No. 30 generated in the previous section.
Figure 6 illustrates the iterative optimization process for dimension 1 of initial coordinate set No. 30 on the sine function. Starting from a random layout of 500 nodes in the phase space, shown in Fig. 6(a), it takes approximately 1700 iterations to achieve the stable structure, at which point the KL divergence remains steady. Figures 6(a) and 6(b) compare the end displacement with the embedded time series at various stages of the iteration. As the KL divergence reduces to around 0.06, the final displacement closely approximates S, achieving a small divergence. When the two distributions (P and S) are sufficiently close, the KL divergence values converge, indicating that the final displacement is nearly identical to the optimal displacement for the current IC.
Different iterations of the end displacement of IC No. 30 vs the embedded time series for the sine wave; after about 1757 iterations the solution has converged. Panels (a) and (b) correspond to different iteration counts.
B. Application to Rössler model
We observed one coordinate of the Rössler system at regular sampling intervals, generating a time series consisting of 2000 sample points.
Figure 7 displays the time series of the observed coordinate along with the corresponding attractor. For the recurrence threshold optimization, we choose a range of thresholds with a fixed step size, analogous to the sine example.
1. Evaluation of selecting optimal recurrence threshold
In constructing the recurrence plots, we use the estimated time delay τ and embedding dimension d_m for the observed coordinate of the above sample.72
Evaluation of ENTR. Figure 8 illustrates the effect of the recurrence threshold ε on ENTR [Eq. (4)] for the Rössler time series. As ε increases, more recurrences are detected [see also Figs. 9(a)–9(d)]. When ε becomes sufficiently large, the relative number of recurrences increases only slightly. Together, these two factors influence the behavior of ENTR across different values of ε: ENTR rises sharply at first and then levels off, changing more gradually.
Evaluation of the mth order deviation distance. Figure 10 shows the selection of the optimal recurrence threshold from the threshold set for the Rössler time series using Algorithm SI 1, where the mth order deviation distances are calculated for several orders m. The Rössler model exhibits similar behavior to the sinusoidal dynamics: (i) when ε is small, the distENTR_m values oscillate; (ii) when ε is large enough, the values stabilize and change steadily. The optimal threshold is selected as the one with the minimum value of distENTR_m. This setting is used for the subsequent computations, i.e., the optimization of the initial coordinates (Sec. III B 2).
The recurrence plots of the Rössler model associated with different thresholds; panels (a)–(d) correspond to increasing threshold values.
mth order deviation distance (distENTR_m) against the threshold for the Rössler system.
2. Evaluation of initial coordinates optimization
In this section, we demonstrate the correctness of the displacement evaluation measure (i.e., the KL divergence) and the optimization procedure for the initial coordinates (abbreviated as ICs). The optimal threshold identified in Sec. III B 1 is used for the matrix generation (Algorithms 1 and 2 in supplementary material III).
Evaluation of End Displacement with KL Divergence. Figure 11 displays the KL divergence distribution for all dimensions, computed over independent randomized initial coordinate sets. Each KL divergence value is derived from the corresponding end displacement of an initial coordinate set using Algorithm 1 and Eq. (8). For convenience, we provide the labels of the initial coordinate sets with the minimum and maximum KL divergence values, respectively.
KL divergence distribution over independently randomized initial coordinate sets, using the third dimension of the Rössler system. IC 72 shows the largest density, while the results for IC 85 are less clear-cut. A comparison of the performance of these ICs is presented in the supplementary material.
Figure S2 in the supplementary material compares the end displacement and the embedded time series for each dimension, highlighting that not all initial coordinates generate a displacement that provides a good candidate for the time series.
Both Figs. 11 and S2 (in supplementary material V) demonstrate that displacements with smaller KL divergence values are closer to the optimal displacement; however, the result for No. 72 does not perform as well.
Evaluation of Initial Coordinates Optimization. We report the results of optimizing the initial coordinates (abbrv. IC) using Algorithm SI 2 for one of the initial coordinate sets generated previously, exemplified for one dimension of the Rössler system; see Fig. 12. Starting from a random initial layout of the nodes, it takes a number of iterations to reach the stable structure, i.e., until the KL divergence remains stable. Figures 12(a) and 12(b) show the structure of the final displacement (blue line) compared to the reference time series (black line) for different iteration counts. Similar to the sine function results, the final displacement approximates S once the KL divergence has decreased sufficiently.
Evaluation of the overall performance using cross-recurrence quantification measures. In the following, we use cross-recurrence plots and cross-recurrence quantification measures on the Rössler attractor to compare the quality of the time series regenerated through the method suggested by Yang and Liu10 and through the optimization procedure outlined in this paper (Fig. 1). A cross-recurrence plot compares two separate phase spaces. As embedding parameters, the average cross-mutual information between the pairwise time series is used, comparing the original time series against the time series generated with the method of Ref. 10 and against the one generated with our method. In both cases, the embedding dimension is estimated using the global false nearest neighbor algorithm.19

Figure 13 compares the original Rössler system (observed coordinate) and the reconstructed system as time series (a1, b1) and as attractors (a2, b2) using delay embedding, following (a) the procedure of Ref. 10 and (b) the optimization of initial conditions of the current study. Using a cross-recurrence quantification analysis (CRQA), Fig. 13(c) shows the recurrence time entropy and Fig. 13(d) the clustering coefficient of the trajectories shown in phase space in (a2) and (b2) plotted against each other. While the time series in both cases look acceptable (cf. a1 and b1), high-frequency components (noise) have been introduced. The attractors can be reconstructed in their basic shape (cf. a2 and b2), yet the parts that relate to the diverging trajectory cannot be fully recovered. Using the procedure of the current study, the phase and the size (the phase space diameter) of the attractor, as shown in Fig. 13(b2), could be corrected compared to that shown in Fig. 13(a2). However, due to the rescaling procedure, a shift has been introduced, which becomes apparent by comparing the ordinate value at the beginning of the time series for Figs. 13(a2) and 13(b2).

Recurrence time entropy (RTE) in CRQA measures the variability in the timing of interactions between two time series. A higher RTE indicates greater diversity and unpredictability in the recurrence times, reflecting more complex interacting dynamics; conversely, a lower RTE suggests more regularity and predictability in their interactions.17 As seen in Fig. 13(c), the RTE between the original Rössler system and the method of Yang and Liu10 is higher than that between the original Rössler system and our modified method. In CRQA, the clustering coefficient [depicted in Fig. 13(d)] measures the degree of local interconnectedness of recurrence points, expressing the coherence, stability, and structured nature of the interactions between time series. A high clustering coefficient indicates strong, localized synchronization, deterministic dynamics, and a better correlation between the two time series, while a low value reflects weaker or more random coupling, which is the case for the original system compared to Ref. 10. Overall, the results in Fig. 13 suggest that the optimization approach introduced here constitutes an improvement relative to a mere self-organization process.
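For completeness, a minimal sketch of a cross-recurrence matrix between two embedded trajectories is given below; the CRQA measures reported in Fig. 13 (recurrence time entropy, clustering coefficient) were obtained with dedicated recurrence software and are not re-implemented here.

```python
# Cross-recurrence matrix between two embedded trajectories S1 and S2
# (cf. the CRP definition in Sec. II A); illustrative helper only.
import numpy as np

def cross_recurrence_matrix(S1, S2, eps):
    """CR_ij = 1 if ||s_i - s~_j|| <= eps for states of two different systems."""
    dists = np.linalg.norm(S1[:, None, :] - S2[None, :, :], axis=-1)
    return (dists <= eps).astype(int)
```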
Different iterations of the end displacement (blue line) vs the embedded time series (black line) for the Rössler system. Panels (a) and (b) correspond to different iteration counts.
Comparison of the chaotic Rössler attractor: (a) original system vs Yang and Liu,10 (b) original system vs the reconstructed attractor of the current study; (a1), (b1) time series; (a2), (b2) reconstructed phase space (FAN threshold criterion, window size 150, and window shift 5); and cross-recurrence quantification analysis with (c) recurrence time entropy and (d) clustering coefficient.
IV. DISCUSSIONS
Research in networks and graphs has received significant attention in recent years; many quantifiers and metrics exist, especially for complex social networks.2,9 Incorporating these measures into physically explainable nonlinear time series analysis provides increased potential for data interpretation in complex dynamics.22,42
We found that the force-directed model10,56 alone does not guarantee an optimal reconstruction of the attractor and that its performance strongly depends on how the attractor is visualized in phase space. We therefore reshaped the network iteratively by minimizing the Kullback–Leibler divergence, as a relative entropic measure (i.e., a measure dependent on relative information generation), to optimize the initial conditions. The adaptation of the iterations is achieved by using a variable learning rate to control the updating magnitude, which shows good agreement. The embedding is based on an optimized threshold and relies on the calculation of the mth order entropy deviation.
The application of a cross-recurrence quantification analysis to compare the original Rössler attractor with the method proposed by Yang and Liu10 and our new procedure shows that the new method produces a more similar attractor with less uncertainty, based on a network being clustered more closely relative to the original network of the Rössler system. The results here suggest that the optimization approach to initial conditions significantly improves both the robustness and quality of the attractor reconstruction method proposed by Yang and Liu.10
While the results indicate that there theoretically exists an equivalence relation between the attractor in phase space generated from a time series and its graph, it is yet unclear how this bijective mapping can be explicitly and accurately described so that the invariant set and its invariant measures remain preserved. For the results of Ref. 10, but also for our method presented here, this is not yet the case. Using a component of an embedded time series twice is like forming a derivative (i.e., a projection) twice, which resembles introducing noise (losing an order of magnitude in accuracy).73 Noise, especially high-frequency noise as commonly found in channel noise of real-life measurements, increases the phase space dimension of a low-dimensional dynamical system.19,27 While not tested here specifically on the two benchmark systems, similar to attractor reconstruction in phase space, the self-organization effect of the spring-electrical force (SEF) model is affected by noise and low-dimensional warping effects.28,29
While the attractor is not fully recovered, the SEF method still works with some degradation but requires a different threshold value than needed for conventional time-delay embedding, indicating that more information is required to reconstruct the attractor while fulfilling the KLD optimization criterion. Our proposed method corrects the phase and slightly reduces the noise, especially in the divergent part of the dynamics (where the positive Lyapunov exponents become dominant). However, the non-uniform vertex density at the edges leads to warping effects (similar to signal distortion but originating from the deterministic transformation process of attractive and repulsive forces) when applying the SEF model; one way to overcome this could be to combine the spring-electrical force model with the Kamada–Kawai stress model.74 The reason why the ideal threshold differs from the one chosen for normal delay embedding may lie in these edge effects and in the information required by the SEF model to reconstruct the attractor and to reach equilibrium in the presence of these perturbations. More research is required to establish whether the threshold value changes if the Kamada–Kawai stress model is employed. While the repeated application of the embedding theorems (trying to recover the attractor from one component of a time-delayed time series) somewhat mimics the features of an observable time series (i.e., introduces noise), the exact trajectory can only be approximated. A similar problem exists when embedding a time series,75 and it can be assumed that the extraction of the attractor follows similar approximations.
We investigated the impact of the recurrence threshold ε by varying it within a practical range and observing its effect on the recurrence matrix connectivity and the ENTR deviation distance. Our findings indicate that extremely small or large values of ε can lead to suboptimal representations: a small ε tends to under-connect nodes and lose global dynamic patterns, while a large ε over-connects nodes and introduces noise. However, the results remain robust within a moderate range of ε, where the reconstructed attractor reliably preserves the system's topology. These observations underline the importance of selecting an appropriate ε tailored to the system under study.
For the initial conditions in the spring-electrical force (SEF) model, we conducted multiple optimization runs with randomized ICs to assess their impact. While ICs can cause slight variations in individual node positions, the overall attractor topology and network metrics (e.g., ENTR deviation distance) remain consistent when averaged over multiple runs. This consistency demonstrates that the method is robust to IC variability, ensuring reliable results even when ICs are not precisely optimized.
Regarding the delay time τ, we determined its value using the (average) auto-mutual information method. This approach ensures that the reconstructed dimensions are minimally dependent on one another while retaining meaningful temporal correlations, which is essential for accurately capturing the system's dynamics.
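As an illustration of this criterion, a simple histogram-based estimate of the auto-mutual information and its first minimum is sketched below; the paper itself relies on TISEAN33 for this step, so the code is only a stand-in with illustrative defaults.

```python
# Histogram-based auto-mutual information and first-minimum delay selection
# (illustrative stand-in for the TISEAN computation used in the paper).
import numpy as np

def auto_mutual_information(x, tau, bins=32):
    """Mutual information between x(t) and x(t + tau) from a 2-D histogram."""
    counts, _, _ = np.histogram2d(x[:-tau], x[tau:], bins=bins)
    pxy = counts / counts.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())

def first_minimum_delay(x, max_tau=100):
    """Return the delay tau at the first local minimum of the auto-MI curve."""
    mi = [auto_mutual_information(x, t) for t in range(1, max_tau + 1)]
    for t in range(1, len(mi) - 1):
        if mi[t] < mi[t - 1] and mi[t] < mi[t + 1]:
            return t + 1                       # mi[0] corresponds to tau = 1
    return int(np.argmin(mi)) + 1              # fallback: global minimum
```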
It remains to be shown in the future how accurately reconstructed invariant sets allow estimating invariant measures, which should then be compared with static and dynamic invariants, such as correlation dimension estimates or estimates of Rényi entropies for further verification.
While the mth order entropy deviation is used here as the optimal threshold criterion, it should be tested whether the results of attractor reconstruction can be further improved by measuring statistical complexity and determining the degree of irrelevance43 (similar to anti-optimization), including border effect corrections to obtain noise-robust diagonal line entropy values,47 or by employing Monte Carlo decision tree search for optimal embedding.48 Future research directions should also include the study of real-life systems and the systematic exploration of how features are preserved in the presence of noise, using networked rather than just individual systems. By doing so, network clusters/aggregation, coherence, or chimera states in complex systems could be reconstructed using the self-organization property of the force-directed algorithm.
The optimization procedure, including the search for optimal initial conditions, requires substantial computational resources even for small systems. The computational complexity of the spring-electrical-force model differs between sparse and dense matrix solutions. A major computational challenge lies in the calculation of pairwise distances during phase-space reconstruction, a critical step in generating the recurrence matrix. This process has a theoretical time complexity of O(N²), where N is the number of data points. For large datasets, this quadratic scaling can quickly become a bottleneck in terms of both time and memory usage.
To mitigate this issue, we implemented several optimization techniques, such as reducing redundant calculations, leveraging vectorized operations, and introducing parallel processing (refer to lines 5–8 of Algorithm 1). Although the theoretical complexity remains O(N²), these practical optimizations significantly reduced the actual computational time and resource consumption, particularly for large datasets. By combining these techniques, we maintained high reconstruction accuracy while improving the feasibility of handling larger datasets.
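As an example of the vectorization mentioned above, the pairwise distances can be delegated to SciPy; the asymptotic complexity remains O(N²), but the constant factors drop substantially compared to explicit Python loops. This is an illustrative alternative, not the paper's implementation.

```python
# Vectorized pairwise distances for the recurrence matrix (illustrative).
import numpy as np
from scipy.spatial.distance import pdist, squareform

def recurrence_matrix_fast(S, eps):
    """Recurrence matrix via SciPy's condensed pairwise distance computation."""
    dists = squareform(pdist(S))        # condensed vector -> full N x N matrix
    return (dists <= eps).astype(int)
```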
In addition, the iterative optimization steps in the spring-electrical-force (SEF) model, which refine the node positions by minimizing the system energy, also contribute to the computational burden, with the cost again differing between sparse and dense matrix solutions. Each iteration involves repeated updates to the node positions based on their connections, with the computational cost scaling with the number of nodes and edges in the network. To balance displacement optimization and computational efficiency, especially for large datasets, we introduced stopping criteria based on both iteration count and runtime. Specifically, the optimization stops and returns the best node positions obtained so far if either the number of iterations exceeds a predefined threshold or the runtime surpasses a set time constraint. By carefully selecting these thresholds, we achieved an effective trade-off between computational efficiency and the accuracy of the reconstructed network.
However, further research is needed to explore networked systems and real-world measurements. For practical applications, it will be essential to develop a more efficient computational framework that leverages software engineering (e.g., heuristics for efficient neighborhood search), diverse hardware architectures, and distributed systems (e.g., parallelization through field-programmable gate arrays).
SUPPLEMENTARY MATERIAL
See the supplementary material (I–VII), provided as a separate document, for additional algorithms and examples illustrating the optimization and self-organization process for the sine and the Rössler system.
ACKNOWLEDGMENTS
This research was supported under the Australian Research Council's (ARC) Discovery Projects funding scheme (Project Nos. DP200100358 and DP240101536).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Conggai Li: Data curation (lead); Formal analysis (lead); Investigation (lead); Methodology (equal); Software (lead); Validation (equal); Visualization (equal); Writing – original draft (lead). Joseph C. S. Lai: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal). Sebastian Oberst: Conceptualization (lead); Data curation (supporting); Formal analysis (supporting); Funding acquisition (lead); Investigation (equal); Methodology (equal); Project administration (lead); Resources (lead); Supervision (equal); Validation (equal); Visualization (supporting); Writing – original draft (supporting); Writing – review & editing (equal).
DATA AVAILABILITY
Pseudo-code and all model details are contained within this paper. Additional information is available from the corresponding author upon reasonable request.