The recurrence plot and the recurrence quantification analysis (RQA) are well-established methods for the analysis of data from complex systems. They provide important insights into the nature of the dynamics, periodicity, regime changes, and many more. These methods are used in different fields of research, such as finance, engineering, life, and earth science. To use them, the data have usually to be uniformly sampled, posing difficulties in investigations that provide non-uniformly sampled data, as typical in medical data (e.g., heart-beat based measurements), paleoclimate archives (such as sediment cores or stalagmites), or astrophysics (supernova or pulsar observations). One frequently used solution is interpolation to generate uniform time series. However, this preprocessing step can introduce bias to the RQA measures, particularly those that rely on the diagonal or vertical line structure in the recurrence plot. Using prototypical model systems, we systematically analyze differences in the RQA measure average diagonal line length for data with different sampling and interpolation. For real data, we show that the course of this measure strongly depends on the choice of the sampling rate for interpolation. Furthermore, we suggest a correction scheme, which is capable of correcting the bias introduced by the prepossessing step if the interpolation ratio is an integer.

Almost all natural systems are non-linear systems, often with multiple dimensions. The analysis of the possibly rich dynamics of such systems requires advanced methods. One of these is the recurrence plot and the associated recurrence quantification analysis, which are, among other things, used to investigate the nature of the dynamics, periodicity, or to detect regime transitions. As with most other methods, this method was developed for uniformly sampled data. This, however, restricts the use of the method. Some data from, for example, astrophysics or paleoclimate cannot be analyzed straightforwardly. To circumvent this restriction, one used approach is to interpolate the data to a constant sampling rate. We show that the differences in the sampling rate together with the subsequent interpolation can lead to strong deviations, and it is not recommendable to take this approach without further consideration. We, thus, propose a correction scheme that can, with some limitations, correct these deviations.

The analysis of data from complex real-world systems creates the basis for many different fields in science, such as finance, engineering, life, and earth science. For many kinds of data, standard measures, such as mean, standard deviation, or higher moments, are not sufficient to capture the details of their dynamics. For this purpose, methods from complex system science are more appropriate, such as Lyapunov exponents,1 complex networks,2 symbolic dynamics,3 or recurrence analysis.4 Recurrence analysis provides a set of tools, e.g., for studying synchronization, classifying different types of dynamics, or detecting regime transitions.5 The increasing popularity of this framework is reflected by the lively methodical development and the growing number of applications in many scientific disciplines.6 

In several fields, data are only available on a non-equidistant time axis; for instance, measurements based on heartbeats are timed by the rhythm of the heart,7,8 which gives a natural varying timescale, and this leads to the necessity of interpolation when compared with other variables that form the cardiorespiratory system, such as the blood pressure.9 Time series of pulsars and rotating stars are obtained through non-equidistant observations,10,11 and paleoclimate data are hampered by the non-constant sedimentation rate, resulting in non-equidistant sampling.12 Such unevenly sampled data are a challenge for most time series analysis techniques, which usually require equidistant sampling points. One solution is to transform the time series to an equidistantly sampled one using interpolation or other techniques (e.g., transformation cost13). Other approaches try to modify the time series analysis tools to be directly applicable to uneven time series, such as Lomb–Scargle periodogram,14 kernel-based correlation,15,16 or edit distance-based recurrence analysis.17 However, interpolation is still a widely used technique, although it can cause serious bias in the results (overemphasizing the lower frequencies). Although this interpolation effect is known for several time series analysis techniques,18 it is not yet systematically considered for recurrence analysis and, thus, can lead to wrong interpretations and conclusions.

Our focus in this study is the effect of non-uniform sampling in paleoclimate time series on the results of recurrence analysis, as it has recently gained attention for its potential to address various research questions in geosciences.19–24 A prominent example is the investigation of transitions in the paleoclimate.25–29 Changes in the recurrence properties, mainly based on changes in the distribution of the diagonal line structures in recurrence plots (related to determinism or predictability of the underlying dynamics), can be used to identify regime changes. Paleoclimate data are usually retrieved from specific geological archives, e.g., marine and lake sediments, tree rings, ice cores, or stalagmites. The climate information is stored during the growth (deposition) of these archives. Since the growth or sedimentation rate can differ over time, but the sampling procedure of such archives usually uses an equidistant sampling scheme, and the final time series are usually unevenly sampled in time.12 Using interpolation without due consideration before conducting recurrence analysis can lead to bias in the distribution of diagonal line structures in the recurrence plots, finally resulting in erroneous conclusions.

In this work, we show the effects of different and changing sampling times and subsequent interpolation on the recurrence quantification analysis using different types of model systems as well as a real paleoclimate example.

We further suggest an approach to estimate quantitatively these effects and provide a correction scheme. Although we focus here on applications in paleoclimate research, the correction scheme can be analogously applied on other research questions, e.g., in astrophysics or physiology.

This work is structured as follows: In Sec. II, the recurrence plot and recurrence plot measures are introduced. In Sec. III, the used models and methods are given. Then, we present the measured effects for the simulated systems. Afterward, in Sec. IV, we present the derivation as well as the evaluation of our correction scheme. In Sec. V, a real-world example is considered. We conclude our findings in Sec. VI.

The recurrence plot (RP) is based on the fundamental principle that a dynamical system will always return to a state arbitrarily close to its initial or any other state.4 This fact is used to simplify the generally multidimensional phase space trajectory to a matrix containing only zeros and ones. This method can be applied to any series of states in a given phase space { x i | i = 1 , , N }, where x i is the phase space vector at the time step i (corresponding to time t = i d t and d t the sampling time) and N the number of considered states (or the length of the time series). If we do not have access to the full phase space vector, it can be reconstructed using time delay embedding.30,31 To create a matrix representation of the recurrences of the data, the phase space distance from all pairs of data points is calculated using a suitable norm, represented by the distance matrix
D i , j = x i x j , i , j = 1 , , N .
(1)
If not stated otherwise, we use the Euler norm. A recurrence is finally defined as having the pairwise distance D i , j between the states smaller than a specified recurrence threshold ε; i.e., the recurrence matrix is derived from the distance matrix D by applying the threshold ε, leading to the binary recurrence matrix R with its elements
R i , j ( ε ) = Θ ( ε D i , j ) ,
(2)
with Θ being the Heaviside function and ε being the recurrence threshold.4 

The recurrence matrix R can be displayed as a plot, where all the ones ( R i , j = 1) are marked by points, and such an entry is, therefore, called a “recurrence point” or a “1-point.” In contrast, the points R i , j = 0 are called a “non-recurrence point” or a “0-point.” This plot is then called a recurrence plot (RP) and can be visually interpreted because it expresses rich patterns characteristic of specific dynamics.4,32

Although the first visual impression gives some important hints, further (quantitative) analysis is often useful. For this purpose, the recurrence quantification analysis (RQA) was introduced.7,33,34 The recurrence rate R R, the fraction of recurrence points in the RP, gives a quantification of how often the system returns to the same region in the phase space. Most of the other measures in RQA rely mainly on the length distribution P l ( l ) of either diagonal or vertical lines (formed by the recurrence points) visible in the RP. Such a diagonal line is denoted here as a “recurrence line” or a “1-line.” The idea behind the study of diagonal recurrence lines is that their length corresponds to the time the system evolves similarly compared to the other times the system visited the same region in the phase space. Therefore, the fraction of 1-points that form diagonal 1-lines in the RP is a heuristic measure of how deterministic the system is,
D E T = l = l min N l P l ( l ) l = 1 N l P l ( l ) ,
(3)
with P l ( l ) being the probability to find a 1-line with exact length l, l min a chosen minimal line length, and N the maximal line length (equal to data length). This measure is called determinism and ranges between 0 and 1. It becomes 1 if all recurrence points belong to diagonal lines equal or longer than l min and 0 if no recurrence point does. This measure is widely used to identify regime transitions, e.g., in paleoclimate studies.27–29,35,36
The calculation of D E T is based on the distribution of diagonal line lengths. For the sake of simplicity, here, we focus on the average diagonal length,
L = l = 1 N l P l ( l ) .
(4)
In contrast to the standard definition of this measure,4 we consider here all line lengths, including such of length 1. This simplification allows us to construct correction schema for possible sampling and interpolation effects. This measure is related to the prediction time. However, it depends on the temporal resolution of the system. For Gaussian white noise, L is 1 / ( 1 R R ) and, therefore, for small R R close to one, whereas for perfectly periodic systems, it should theoretically be infinite but is limited by the RP size and boundary.37  L is very sensitive to noise because noise causes random interruptions in the diagonal lines. In paleoclimate studies, measures quantifying the diagonal lines are often interpreted in terms of the predictability of the climate.29,35,36

In many cases, the temporal variation of the RQA measures is of interest as they can reveal changes or transitions in the system’s dynamics, such as when the system is approaching a tipping point. To determine such changes, the sliding window approach is used. The time series is partitioned into smaller segments (windows) of a predetermined length, which may overlap with one another. Then, the RP and the RQA measures are calculated for every window separately. The distance between the start points of two consecutive windows is called the window step size, and the size of a window is the window size. It is important to determine which time point is assigned to each window. In this study, we just take the starting point of the window so that all points used to calculate the measure are in the interval after this time point.

To demonstrate the interpolation effect, we use two model systems to generate prototypical data. We use an autoregressive model, where the course of the data is stochastically driven, and a Roessler38 system, which is a typical non-linear system, fully described by three non-linear ordinary differential equations.

To quantify the deviation in the L measure due to the difference in the sampling rate and subsequent interpolation, we compare L ref calculated from a reference series without interpolation with L int calculated from the interpolated series with the same temporal resolution as the reference series by taking the ratio L int / L ref. Ratios greater than 1 indicate an increase of the measure due to the interpolation and ratios smaller than 1 a decrease. This could then either lead to an over- or underestimation of an L ref value when analyzing the interpolated series.

The effect depends on the considered system, the interpolation ratio, and the offset. The interpolation ratio r is the ratio of the sampling time of the time series before and after interpolation. The offset is the time difference between the first value in the time series before and after interpolation (Fig. 1).

FIG. 1.

Schema illustrating the different series as well as the offset and interpolation.

FIG. 1.

Schema illustrating the different series as well as the offset and interpolation.

Close modal

An interpolation ratio greater than 1 increases the total number of points. For integer interpolation ratios, every rth point in the interpolated series lines up with the underlying series only if the offset is zero.

We use the two different model systems for our study and generate the data using the following equations. To remove transient behavior from the initial conditions to some kind of stable dynamics, the first part of every time series is discarded.

1. Autoregressive model

The time series is generated iteratively according to the rule
x i + 1 = a x i + σ ξ i ,
(5)
where ξ is Gaussian white noise (mean zero and standard deviation one) and a and σ are free parameters, defining the auto-correlation (memory) and the impact of the noise. We used a parameter setting of a = 0.99 and σ = 1. Three series are generated for the analysis with each 2501 data points. The initial conditions are drawn from a normal distribution, and the first 100 points are discarded. The noise and the initial conditions are generated with a random number generator with three different seeds to make the results reproducible.

2. Roessler model

The Roessler system is fully described by the differential equations38 
x ˙ = y z , y ˙ = x + a y , z ˙ = b + z ( x c ) .
(6)
We use the standard parameters a = 0.15, b = 0.2, c = 10 and a temporal resolution of d t = 0.01. To approximately solve these equations, we used the Euler method and the initial condition ( x = 15 , y = 0 , z = 0 ). The obtained series has 12 505 data points, and the first 500 points are discarded. Afterward, only every fifth data point is used so that the series of both systems have 2401 points.
To mimic the effect of different sampling times and subsequent interpolation, we consider high-resolution series, representing the “true” dynamics. To obtain series with different temporal resolution, multiple downsampled versions are constructed. One downsampled time series is chosen as reference series and analyzed without interpolation so that the results are unchanged and describe the dynamics of the system. All other downsampled series are called the test series (see Fig. 2). In the analysis, the offset for the reference series is chosen to be 0 because only the difference in the offset between the reference and the test series is important. This gives the following series:
High-resolution series: { x i | i = 0 , 1 , 2 , , N hr 1 } , Reference series: { x i | i = 0 , 1 d ref , 2 d ref , , ( N ref 1 ) d ref } , Test series: { x i | i = k test , n , 1 d test , n + k test , n , 2 d test , n + k test , n , , ( N test , n 1 ) d test , n + k test , n } ,
where N hr, N ref, and N test , n are the lengths of the different series, d ref is the downsampling factor of the reference series, and d test , n and k test , n are the downsampling factors and offsets of the different test series, where n numbers the different test series. The corresponding times are given by t i = i d t. We generate interpolation functions for every test series using linear, quadratic spline, cubic spline, and pchip interpolation. To obtain the interpolated series, we evaluated these interpolation functions at the same time points as the reference series (see Fig. 1).
FIG. 2.

Scheme illustrating the method used to investigate the change in the RQA measures due to the difference in the sampling time, offset, and interpolation.

FIG. 2.

Scheme illustrating the method used to investigate the change in the RQA measures due to the difference in the sampling time, offset, and interpolation.

Close modal

Using the reference instead of the high-resolution time series as a comparison, it is possible to study non-integer interpolation ratios, offsets, and also interpolation ratios smaller than one.

The interpolation ratios are given as
r n = d t test , n d t int = d t test , n d t ref = d test , n d t hr d ref d t hr = d test , n d ref ,
(7)
where d t hr, d t ref, and d t test , n are the different sampling times from the series defined above and d t int is the sampling time of the interpolated series.

The RP and the RQA measure L are calculated for the reference and for every interpolated series using the same recurrence threshold ε. L ref is the measure obtained for the reference series and L int for the interpolated series. The differences in the RQA measure are due to the different sampling times and the interpolation. We can compare the results from all interpolated series to the reference series by computing all ratios L int / L ref.

Using this procedure, we mimic the typical sampling bias, e.g., in paleoclimate studies, where sliding windows with different sampling are all interpolated to the same reference sampling and the hidden “true” dynamics is the continuous nature and is not accessible.

To illustrate the deviations in the RP between an interpolated series and a reference series, we consider a short time series of an arbitrary autoregressive process and downsample it by a factor of four before interpolating it linearly back to the original size. This procedure leads to an interpolation ratio of r = 4 and no offset. Here, the high-resolution time series corresponds to the reference series. Comparing the RP of the interpolated time series with the RP of the reference time series, we find some differences (Fig. 3). Points that are 1-points in both RPs are black, points that are only 1-points in the reference RP are red, and points that are only 1-points in the interpolated RP blue. The coarse structure is preserved under the downsampling and interpolation; however, on a smaller scale, many short 1-lines are missing in the interpolated RP, while other 1-lines are merged into greater recurrence structures. Both effects lead to an increased average 1-line length.

FIG. 3.

Difference of interpolated and reference recurrence plots of an arbitrary AR process. Black pixel mark 1-points in both RPs; red pixel mark 1-points in reference RP, which 0-points in interpolated RP; and blue pixel mark 0-points in reference RP, which are 1-points in interpolated RP. The interpolation ratio is r = 4 and offset 0.

FIG. 3.

Difference of interpolated and reference recurrence plots of an arbitrary AR process. Black pixel mark 1-points in both RPs; red pixel mark 1-points in reference RP, which 0-points in interpolated RP; and blue pixel mark 0-points in reference RP, which are 1-points in interpolated RP. The interpolation ratio is r = 4 and offset 0.

Close modal

To quantitatively study the effects on the average 1-line length, we use the method described in Sec. III B. For every model system, the high-resolution series is created according to Sec. III A. From this one, reference and multiple test series are created by downsampling with different offsets and downsampling factors d test , n. We use downsampling factors between 1 and 50 and for every downsampling factor five different offsets, if possible. For series with a downsampling factor smaller than five, the number of different offsets is limited by the downsampling factor.

The reference series are obtained with a downsampling factor d ref = 5 and no offset. The high-resolution series have a length of 2401. The reference series have, therefore, 481 data points. The recurrence thresholds for the systems are chosen so that the RP of the reference series has a recurrence rate of 10%. The recurrence thresholds as well as the L ref measure are given for the Roessler and the three autoregressive systems in Table I. The number in the name of the autoregressive systems (AR-42, AR-43, and AR-44) states the used seed for the random module of the Python library NumPy. With the three different autoregressive systems, we can check whether the results are changed for other realizations of the noise. For every system, the ratio L int / L ref is calculated for every interpolation ratio, interpolation method, and offset separately. To simplify the data, the mean and the standard deviation are calculated over the different offsets, and for the autoregressive systems, also over the three different realizations. The result is, therefore, only dependent on the kind of the system, the interpolation ratio, and the interpolation method.

TABLE I.

Properties of the four different reference series.

System name Recurrence threshold ε Reference Lref
Roessler  5.06  24.6 
AR-42  1.13  1.44 
AR-43  1.05  1.36 
AR-44  1.26  1.46 
System name Recurrence threshold ε Reference Lref
Roessler  5.06  24.6 
AR-42  1.13  1.44 
AR-43  1.05  1.36 
AR-44  1.26  1.46 

1. Roessler system

For the Roessler system, L ref and L int values are almost equal for interpolation ratios smaller than 1.5 (Fig. 4 left). For a larger interpolation ratio, the L int values for the different interpolated series are smaller than L ref and decrease with an increasing interpolation ratio. The quadratic and the cubic spline interpolation lead to very similar results, whereas the linear and the pchip interpolation create the strongest deviation. The differences caused by the offset values are negligible for the linear interpolation, except at the integer ratios 2, 3, and 4. An explanation for this behavior is given in Sec. IV. Additionally, the same analysis was done with the maximum norm instead of the Euler norm. The results are qualitatively similar (see  Appendix B).

FIG. 4.

Relative deviation of L between interpolated and reference series for the Roessler system (left) and three autoregressive systems (AR) (right). The data points represent the mean over the different systems (only for AR) and offsets, and the error bars give the standard deviation.

FIG. 4.

Relative deviation of L between interpolated and reference series for the Roessler system (left) and three autoregressive systems (AR) (right). The data points represent the mean over the different systems (only for AR) and offsets, and the error bars give the standard deviation.

Close modal

2. Autoregressive process

For the interpolated series from the autoregressive systems, L int increases with growing interpolation ratios for all methods (Fig. 4, right). L int calculated from the linearly interpolated series shows the strongest deviation from the reference series. The effect of different realizations and offsets is similar for all considered interpolation methods.

To investigate the deviation caused by different offsets, L int / L ref is calculated without offset and compared with the mean over all offsets. We find that the deviation is only different for integer interpolation ratios (Fig. 5). This is as expected because only for integer interpolation ratios, an offset determines whether all time points of the test series are aligned (without offset) or misaligned (with offset) to the interpolation time points (see Fig. 1). The L int / L ref value is smaller without an offset, which means that the deviation of L int from L ref is smaller.

FIG. 5.

Relative deviation of L between interpolated and reference series for three autoregressive systems (same as Fig. 4, right). The data points represent the mean over the different systems with (blue) and without (red) different offsets, and the error bars give the standard deviation.

FIG. 5.

Relative deviation of L between interpolated and reference series for three autoregressive systems (same as Fig. 4, right). The data points represent the mean over the different systems with (blue) and without (red) different offsets, and the error bars give the standard deviation.

Close modal

This result shows that interpolation can have a remarkable impact even if the interpolation ratio is 1 if there is some time offset between the points of the interpolated and test series. This situation commonly arises when the sampling time varies, causing misalignment between the interpolated series and the data. Even in portions of the data where the sampling time in the interpolated series is equal to the data’s sampling time, the exact timings of the different series can be misaligned.

Knowing the specific effect of interpolation on RQA measure L, it becomes apparent that a correction scheme aimed at mitigating this interpolation effect would be both desirable and feasible.

To construct a correction scheme, which models the deviation in the L measure between an interpolated series to the reference series, it is necessary to account for the influences of interpolation and the differences in the sampling rate.

L can be calculated using the total number of diagonal 1-lines (i.e., lines consisting of values 1),
L = N r N l = N 2 R R N l ,
(8)
where N r is the total number of 1-points in the RP, N l is the number of 1-lines (including the single points, i.e., lines with l = 1), N is the length of the series, and, therefore, N 2 is the total number of points in the RP, and R R is the recurrence rate.

The total number of points N 2 in the RP generated from the reference series ( R ref ) and the interpolated one ( R int ) is equal. In all investigated examples, the recurrence rate R R is also very similar; therefore, the deviation in L can be derived from the difference in the number of 1-lines N l. As we will show, the number can differ, because

  1. separated 1-lines in R ref can be connected in R int,

  2. 1-lines in R ref can be missing in R int, and

  3. parts of 0-lines in R ref can form 1-lines in R int.

(1) and (2) lead to fewer 1-lines in R int and a greater L compared to R ref. (3) leads to more 1-lines and a smaller L.

To tackle the problem, we restrict ourselves to the following case:

  1. There is an integer interpolation ratio between the test series and the interpolation series and no offset.

This means that the test series consists of every rth data point from the reference series and is, therefore, a downsampled version of it. The downsampling results in a RP ( R test ) consisting of every rth point in every rth row of the reference RP ( R ref ) (with r being the interpolation ratio). The same is true for the diagonals: The ith diagonal in R test consists of every rth point from the ( r i )th diagonal of R ref. We call these points anchor points and these diagonals anchor diagonals (Fig. 6). To calculate the relative change in the number of 1-lines, we restrict ourselves to these diagonals.

FIG. 6.

Recurrence plots of some arbitrary reference signal (left), the test signal (middle), and the interpolated signal (right). The red boxes indicate the anchor points, which are identical in all plots. Blue boxes indicate all points belonging to the anchor diagonals. The interpolation ratio r is 3.

FIG. 6.

Recurrence plots of some arbitrary reference signal (left), the test signal (middle), and the interpolated signal (right). The red boxes indicate the anchor points, which are identical in all plots. Blue boxes indicate all points belonging to the anchor diagonals. The interpolation ratio r is 3.

Close modal
At first, we calculate the difference in the number of 1-lines, due to the different sampling, by comparing the structures in R ref and R test. There are more 1-lines in R ref if there are more 1-lines starting between two anchor points than depicted by them. We refer to the sequences of 0- and 1-points between two anchor points as intervals. Only if the first anchor point is a 0-point and the second one is a 1-point, there is a starting point in R test; all further starting points in R ref are additional. To quantify this effect, we can calculate the mean number of 1-line starting points. This is calculated separately for intervals following a 1- or a 0-anchor point,
Δ N l 1 = Δ N l = 0 r Δ N l P Δ N l , 1 ( Δ N l ) ,
(9)
Δ N l 0 = Δ N l = 0 r Δ N l P Δ N l , 0 ( Δ N l ) ,
(10)
where Δ N l 1 is the mean number of additional 1-line starting points per interval, for intervals following 1-points, and Δ N l 0 for intervals following 0-points. P Δ N l , 1 ( Δ N l ) and P Δ N l , 0 ( Δ N l ) are the probabilities to find intervals with Δ N l 1-lines starting points, starting on either 1-points or 0-points.
To find these probabilities, we calculate the length distribution for sequences of alternating 0- and 1-lines, which have Δ N l 1-lines starting points and calculate the probability that they fit between two anchor points. Such a sequence between two 1-points has a length between
l Δ N l , 1 , 1 = l s + i = 1 Δ N d i + i = 1 Δ N 1 l i and
(11)
l Δ N l , 1 , 1 = l s + i = 1 Δ N ( d i + l i ) .
(12)
l s is the start length, which is the number of 1-points after a random point on a random 1-line and d i and l i are the length of random 0-lines and 1-lines. Here, the second equation includes the 1-line containing the second 1-point and the first does not [example for Δ N l = 2 in Fig. 7(a)]. A sequence between a 1-point and a 0-point, which has Δ N l 1-line starting points, has a length between [example for Δ N l = 1 in Fig. 7(b)]
l Δ N l , 1 , 0 = l s + i = 1 Δ N ( d i + l i )
(13)
and
l Δ N l , 1 , 0 = l s + i = 1 Δ N + 1 d i + i = 1 Δ N l i .
(14)
To get the probability that such sequence fits between a 1-anchor point and a random second anchor point, we add up probabilities for a 1- and a 0-point as a second anchor. In both cases, the probability is derived from the probability that the sequence is shorter than the interval length, multiplied by the probability of reaching the next anchor point when considering the next line together. The second factor is always a conditional probability because both lengths are not independent of each other. The distance between two anchor points and, therefore, the interval length is the same as the interpolation ratio r. The underlying probability distributions are given later. In total, we get
P Δ N l , 1 ( Δ N l ) = P ( l Δ N l , 1 , 1 < r ) P ( l Δ N l , 1 , 1 r | l Δ N l , 1 , 1 < r ) + P ( l Δ N l , 1 , 0 < r ) P ( l Δ N l , 1 , 0 r | l Δ N l , 1 , 0 < r ) ,
(15)
where r is the interpolation ratio and P ( X < r ) are cumulative probabilities, which can be calculated from the probability distributions: P ( X < r ) = X = 0 r 1 P X ( X ), where P X ( X ) are the probability distributions of the different sequence lengths.
FIG. 7.

Example intervals (a) between two 1-anchor points, showing one possible configuration with two additional 1-line starting points. (b) Between one 1-anchor points and one 0-anchor point, showing one possible configuration with one additional 1-line starting point.

FIG. 7.

Example intervals (a) between two 1-anchor points, showing one possible configuration with two additional 1-line starting points. (b) Between one 1-anchor points and one 0-anchor point, showing one possible configuration with one additional 1-line starting point.

Close modal

The first two rows give the probability that if the first anchor point is a 1-point that there are Δ N 0-lines and ( Δ N 1 ) 1-lines before the next anchor point, which is also a 1-point. The third and the fourth row calculate the probability that there are Δ N 0-lines and Δ N 1-lines before the next anchor point, which is a 0-point. In both cases, the total interval has Δ N more 1-lines than the interval downsampled to the two anchor points. This is explicitly given for all possible configurations for one example interval in  Appendix C.

The first and third row are calculated from the probability distribution of the sum of the given random variables [Eqs. (11) and (13)]. For our calculation, we assume that the probabilities for the length of the consecutive 1- and 0-lines are independent. This gives
P l Δ N l , 1 , 1 ( l Δ N l , 1 , 1 ) = ( P l s [ P d P l ] x ( Δ N l 1 ) P d ) ( l Δ N l , 1 , 1 ) ,
(16)
P l Δ N l , 1 , 0 ( l Δ N l , 1 , 0 ) = ( P l s [ P d P l ] x ( Δ N l ) ) ( l Δ N l , 1 , 1 ) ,
(17)
where represents a convolution and [ P d P l ] x ( M ) indicates that this part of the equation is repeated M times. P l s ( l s ), P d ( d ), and P l ( l ) are the probability distributions for the start length, the 0-lines, and the 1-lines. P d ( d ) and P l ( l ) have to be known and are only accessible from the 0- and 1-line length histograms of the reference recurrence plot. How to estimate these when the reference series is not known is discussed later. The probability distribution for the start length l s is (see  Appendix A)
P l s ( l s ) = l = l s + 1 P l ( l ) l = 1 l P l ( l ) ;
(18)
for the conditional probabilities in row two, we first need to modify the probability from before so that the condition is fulfilled,
P l Δ N l , 1 , 1 ( l Δ N l , 1 , 1 ) = 1 l Δ N l , 1 , 1 = 0 r 1 P l Δ N l , 1 , 1 ( l Δ N l , 1 , 1 ) × { P l Δ N l , 1 , 1 ( l Δ N l , 1 , 1 ) l Δ N l , 1 , 1 < r 0 else ,
(19)
where the part in front of the curly bracket is the normalization factor. This can now be used to calculate
P ( l Δ N l , 1 , 1 | l Δ N l , 1 , 1 < r ) = ( P l Δ N l , 1 , 1 P l ) ( l Δ N l , 1 , 1 )
(20)
and equivalent for row four. P Δ N l , 0 ( Δ N l ) can be calculated in the same way by interchanging all d i and l i and calculating l s with P d ( d ) instead of P l ( l ).
We now calculate the difference between R int and R test. Afterward, we can add up the changes. When interpolating the test series, the RP ( R int ) regains the size of the reference RP ( R ref ), and every anchor point is equal in both plots (Fig. 6),
R i , j ref = R i / r , j / r test = R i , j int for i , j [ 0 , r , 2 r , ] .
(21)

If we make the assumption that the anchor diagonals have the same statistics for the length of 0- and 1-diagonal lines compared to all diagonals in the interpolated RP ( R int ), we only have to investigate the intervals between anchor points. For linear interpolation, there are four possibilities in R int. For the formal derivation, see  Appendix B.

  1. In intervals lying between two 1-anchor points, all points are 1; therefore, there is no 1-line starting point.

  2. In intervals lying between two 0-anchor points, there is at most one 1-line in between; therefore, there is either one or zero 1-lines starting point.

  3. In intervals lying between one 1- and one 0-anchor point, then there is one 1-0 transition and therefore, no start of a 1-line.

  4. In intervals lying between one 0- and 1-anchor point, then there is one 0-1 transition and, therefore, one start of a 1-line.

Only the second point causes differences in the number of 1-lines between the plots because there is no start of a 1-line in R test, but there might be in R int. If there is a 1-line, we call this case a jump. The total number of jumps is N j and can be directly measured by counting the number of intervals with jumps in R int. From this, we can follow that there is always a bigger or equal amount of 1-lines on anchor diagonals in the interpolated RP ( R int ) compared to the test RP ( R test ).

The estimated total difference Δ N l of 1-lines lying on anchor diagonals between R int and R ref can now be calculated as the sum of, minus the number of intervals following a 1-anchor point times the mean number of 1-line starting-points in such an interval minus the number of intervals following a 0-anchor point times the mean number of 1-line starting points in such an interval plus the number of jumps,
Δ N l = N d r R R Δ N l 1 N d r ( 1 R R ) Δ N l 0 + N j ,
(22)
with N d being the total number of points on the anchor diagonals, r is the interpolation ratio, and R R is the recurrence rate. N d r R R is the number of intervals following a 1-anchor point, and N d r ( 1 R R ) is the number of intervals following an 0-anchor point. The first two summands quantify the difference between R test and R ref, and Δ N l 1 and Δ N l 0 are calculated using Eqs. (9) and (10). The last summand quantifies the difference between R int and R test.
To get the estimated deviation of the average diagonal line length from this consideration, we first use Eq. (8) to write
L int L est = N l int Δ N l N l int = 1 Δ N l N l int ,
(23)
where L int is the average line length measured in the interpolated RP ( R int ) and L est is our estimation of L ref measured in the reference RP ( R ref ). N l int is the number of 1-lines in the interpolated RP ( R int ) on the anchor diagonals, and Δ N l is the estimated difference of the 1-lines between the reference and the interpolated RP on these diagonals.
After inserting Eq. (8) with N r = N d R R, which is the number of 1-points on the anchor diagonals, and Eq. (22), this can be written as
L int L est = 1 + ( Δ N l 1 r + Δ N l 0 r ( 1 R R 1 ) N j N d R R ) L int .
(24)
The total number of jumps N j can also be rewritten as a probability n j that a random interval contains a jump by dividing the number of jumps N j by the number of intervals N d / r,
n j = N j r N d .
(25)
Inserted in Eq. (24), this gives the final result
L int L est = 1 + ( Δ N l 1 r + Δ N l 0 r ( 1 R R 1 ) n j r R R ) L int .
(26)

This equation can be used to estimate the correction for the effects of the different sampling and the interpolation under the conditions that there is an integer interpolation ratio r and no offset between, the reference series and the test series. Furthermore, anchor and not anchor diagonals in the interpolated RP must have similar probability distributions for the 1- and 0-lines ( P l ( l ) and P d ( d )), and the probabilities for consecutive lines have to be independent. In order to calculate our estimate L est from a measured L int from the interpolated RP, it is necessary to know the statistics for the 1- and 0-lines [ P l ( l ) and P d ( d )] from the underlying dynamics. This might for a real-world example be accessible from a period in the data with a high sampling rate. The recurrence rate R R as well as the jumping probability n j can be directly computed from the recurrence plot of the interpolated series. The recurrence rate is just the fraction of 1-points, and n j is calculated by counting the intervals that contain a jump and dividing the results by the number of intervals.

In Sec. IV B, the use of this correction scheme is demonstrated on a simulated data processing example, where everything is known about the true underlying dynamics. Afterward, the scheme is used to correct for sampling and interpolation-induced biases in real-world data.

To evaluate the correction scheme, L int / L ref is calculated for the first autoregressive system and the Roessler system in the same way as before (Sec. III) and compared to the correction L int / L est, which is calculated with the described scheme. r being the interpolation ratio is varied, and while n j and R R are obtained from the interpolated series and the corresponding recurrence plot ( R int ), P l ( l ) and P d ( d ) are obtained from the reference RP ( R ref ).

The estimated L int / L est for the autoregressive process follows the increase in L int / L ref with increasing interpolation ratio in the real data very closely if the interpolation ratio is smaller than 5 [Fig. 8(b)]. For higher values, there are some deviations caused by the small size of the test series. It has less than 100 data points for an interpolation ratio greater than 5. For the Roessler system, the correction only qualitatively captures the decrease, but with an underestimation of the effect [Fig. 8(a)].

FIG. 8.

Estimation and measured relative deviations of the L measure between the interpolated and reference series for the Roessler (top left) and autoregressive system (top right) (the same as in Fig. 4) for integer interpolation ratios. Equation (26) is used to calculate the correction. The bottom row shows the same systems, but the L-measure for the calculation of the change as well as for the calculation of the correction is obtained only from the anchor diagonals (Fig. 6).

FIG. 8.

Estimation and measured relative deviations of the L measure between the interpolated and reference series for the Roessler (top left) and autoregressive system (top right) (the same as in Fig. 4) for integer interpolation ratios. Equation (26) is used to calculate the correction. The bottom row shows the same systems, but the L-measure for the calculation of the change as well as for the calculation of the correction is obtained only from the anchor diagonals (Fig. 6).

Close modal

This deviation is caused by the fact that the following assumptions are not true for the Roessler system. On the one hand, the statistics of the diagonal lines are very different when looking at all diagonals compared to the anchor diagonals. If only these diagonals are considered, the correction for the Roessler system captures the effect also quite well [Fig. 8(c)]. On the other hand, the assumption that the length of consecutive 1- and 0-lines is independent of each other is not true for a deterministic system, such as the Roessler system.

Using consideration from Sec. IV A, it is possible to understand the reasons for the differences between the systems. For the autoregressive process, the dominating effect is that there are fewer short 1- and 0-lines in R int compared to R ref; therefore, the total number of lines is smaller and L int is greater than L ref. The dominating effect for the Roessler system is that there are more short 1-lines between two anchor points in R int than in R ref; therefore, the total number of lines is greater and L int is smaller than L ref.

In this section, we use the insights from the theoretical considerations and simulations above to investigate real-world data. We use the described correction scheme to reduce the influence of the interpolation and the different sampling times.

For this purpose, we use the data from a 290 m long composite core from Chew Bahir in southern Ethiopia. The core was created by combining the two 280 m long parallel lacustrine sediment cores CHB14-2A and 2B, which were drilled within the Chew Bahir Drilling Project (CBDP) in 2014.39 The aim of the CBDP was to establish a high-resolution environmental record spanning an important time interval of human evolution in eastern Africa. The potassium (K) concentration in the sediment is a proxy for the aridity in the region during the time of deposition.40 It was determined by micro x-ray fluorescence ( μXRF) scanning, which was carried out with a spatial resolution of 5 mm. To attribute time points to the measured K concentrations, the age-depth model RRMarch202141 is used, which follows a direct-dating approach and uses multiple chronometers, including AMS 14C dating, optically stimulated luminescence (OSL) dating, argon-argon ( 40Ar/ 39Ar) dating, and tephrochronological data. Details about the age-depth model can be found in Roberts et al.41 In the following, we use this environmental record to show sampling and interpolation effects, which are not dependent on the accuracy of the age-depth model. A detailed RQA and corresponding interpretation has been performed in Trauth et al.42 

The age-depth model reveals a non-constant sedimentation rate; i.e., the data points in this time series are not sampled equally in time. We investigate the temporal change of the mean diagonal line length L using the sliding window approach with a window size of 3079 yr. Therefore, we get a measure of the temporal evolution of the predictability of climate. The mean sampling time changes between these windows. In the oldest part from 616 787 to 434 038 yrBP, the sampling time is around 15 yr, and from 434 038 to 148 149 yrBP, there is a sampling time of around 10 yr. In the newest part of the data from 148,149 yr BP until present, the sampling time changes a lot, from a minimum of around 5.36 to 47.27 yr [Fig. 9(a)]. The strong changes in the sampling time result from the increased number of age measurements available for more recent times. The long periods of uniform sampling times arise due to the limited availability of datable material within the cores during those time intervals.

FIG. 9.

(a) Mean sampling times for every window (red) and constant sampling times with which the interpolation function is evaluated (blue, green). The window step size is 3079 yr, and the window size is 6153 yr. (b) and (c) L for sliding windows, with d t = 47.2 yr (blue) and 5.36 yr (green). With fixed ε = 250. (d) Corrected data with a described scheme.

FIG. 9.

(a) Mean sampling times for every window (red) and constant sampling times with which the interpolation function is evaluated (blue, green). The window step size is 3079 yr, and the window size is 6153 yr. (b) and (c) L for sliding windows, with d t = 47.2 yr (blue) and 5.36 yr (green). With fixed ε = 250. (d) Corrected data with a described scheme.

Close modal

To analyze the data, we use the measured potassium concentration together with the corresponding time points from the age-depth model as our time series. We interpolate this series linearly and evaluate the interpolation function at time points with a fixed sampling time. To see the effect of a different choice of this sampling time, two time series are created, one with a sampling time of d t = 5.36 yr and one with d t = 47.2 yr. These sampling times correspond to the lowest and highest sampling times in the data. To avoid effects from the interpolation over long time periods with missing data (hiatuses), we exclude all points of the interpolated time series, where the temporal distance between the two neighboring measured data points exceeds 50 yr. For windows where more than 40% of the points are excluded, we do not calculate the L measure. This leads to some gaps in the L-series (see Fig. 9).

When comparing the L-curves calculated from the interpolated series with d t = 5.36 y r and d t = 47.2 yr, significant differences emerge. First, the first peak at approximately 10 000 yrBP is considerably stronger in the interpolated series with d t = 5.36 yr. Additionally, the amplitudes of the structures from 616 787 to 434 038 yrBP show an increase compared to the amplitudes of the structures between 434 038 to 148 149 yrBP [Figs. 9(b) and 9(c)]. The sampling time in the time intervals of increased amplitudes is lower than in the other parts, which leads to the hypothesis that the difference in the course of the data is due to the different interpolation. As we can see in Sec. III C, the deviations caused by an interpolation ratio smaller than 1 seem to be small compared to the effect of a larger interpolation ratio. This leads to the conjecture that the results from the interpolated series with d t = 47.2 are closer to the truth and the results of the series with d t = 5.36 yr are altered by the interpolation. To investigate this further, we apply our correction scheme to the L int values. To use this method, the original distributions P l ( l ) and P d ( d ) have to be known. We assume some stationarity; i.e., these distributions should not change too much within the course of the data. We, therefore, use the histograms of 0- and 1-line lengths from the interval where the original d t is 5.36 yr. This is the case between 36 132 and 45 501 yrBP. R R, L int, and n j are calculated for every window. The correction method can only be used for integer ratios; therefore, the interpolation ratio is calculated for every window by dividing the mean sampling time in the data inside the window by the interpolation sampling time of 5.36 yr and rounded to the next integer to calculate the correction.

After applying the correction, the first peak is reduced in its height, and also, the changes of amplitudes in the older part of the data match the results of the interpolated series with d t = 47.2 yr [Fig. 9(d)]. This is an indication that the correction provides a close approximation of the true effect, which leads to the deviations in the L measure and helps to avoid misinterpretation due to the interpolation.

Recurrence plots and recurrence quantification analysis are useful tools for the analysis of data from non-linear systems. They can give valuable insights into the changes in the dynamics and indicate critical regime changes. Applying these methods to paleoclimate data give additional and complementary valuable insights, which are usually not accessible with standard linear methods.

When using RPs for paleoclimate data, we have to take into account that the data from paleoclimate archives are usually not uniformly sampled in time. One commonly used method to tackle this is to generate an interpolation function and evaluate it at equally spaced time points. This, however, can create some bias and lead to wrong conclusions.

In this work, we have demonstrated that, depending on the dynamics, the average diagonal recurrence line length L calculated from the RP of an interpolated signal can be greater or smaller compared to the true L derived from the RP of the raw signal with the same temporal resolution. We have shown this by comparing data with different sampling rates, which are interpolated to the time points of the reference time series.

For a Roessler system, L calculated from the interpolated series is smaller than the one calculated from the reference series for all considered interpolation methods. For autoregressive processes, on the other hand, L is bigger. Furthermore, we have explained that for an integer interpolation ratio, a possible offset can also change the result, and this is even true if the temporal resolution is not changed.

We identified three main reasons for the difference between the series with different sampling and interpolation: (1) distinct recurrence lines in the reference RP are merged in the interpolated recurrence plot, (2) short recurrence lines in the reference RP are missing in the interpolated one, and (3) there are short recurrence lines in the interpolated RP, which are not present in the reference one.

Using these insights, we developed a correction scheme and discussed its capabilities and limitations. For the autoregressive process, it can predict the difference very precisely, as long as the interpolation ratio is not too large and the downsampled data, therefore, not too short. For the Roessler system, on the other hand, it can capture the decrease but underestimates the effect up to a factor of two. Here, the main reason for this deviation seems to be that the correction is only true for the diagonal lines in the interpolated plot, which correspond to a diagonal in the downsampled one.

In the last part of our work, we have investigated paleoclimate data from a lake sediment core. First, we showed that the course of the L measure, when calculated in sliding windows, strongly depends on the choice of sampling time when evaluating the interpolation function. We then applied our proposed correction scheme to produce an approximation of the actual course.

This study shows that there are potentially big biases, when interpolating data, before applying the RP method. It also shows that the effect is strongly system dependent, and a simple correction might not be possible. The proposed correction scheme provides an intuition on which system shows which effects and also provides an approximate correction. The correction scheme is only valid for integer interpolation values without offset, which is rarely the case for real data. Nevertheless, this can give an approximation of the real effect. Further research is required to enhance our understanding of the impact of offset and noninteger interpolation ratios. Another limitation of this study is that the series with different temporal resolutions are obtained by downsampling, which is only similar to the measurement process if the measure yields the value of some quantity at one time point. In reality, measurements often do some kind of averaging over a finite period of time. To understand to what extent this changes the result of this study and how to change the correction scheme, further research is required.

We thank Dr. Tobias Braun and Vanessa Skiba for fruitful discussions.

The authors have no conflicts to disclose.

Nils Antary: Software (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Martin H. Trauth: Writing – review & editing (equal). Norbert Marwan: Supervision (lead); Writing – review & editing (equal).

Code and data used for the research in this study are available in Zenodo at https://doi.org/10.5281/zenodo.8123086.

When starting on a random 1-point in a recurrence plot, l s is the remaining length on the 1-line, where this 1-point is a part of (see Fig. 7). To calculate the probability distribution of this length P l s ( l s ), we have to account for every combination of 1-line length together with the position of the point on this line, which creates a remaining line of the length l s; for example, if we start on 1-line with length 6 on the fourth position, the remaining length l s is 2. The probability to hit a 1-line with a certain length is the probability to find such a line in the RP times the length of the line and this normalized,
P ( l ) = l P l ( l ) l = 1 l P l ( l ) .
(A1)
When hitting a 1-line of a certain length l, the probability that the remaining length is l s is 0 if the length l is not longer than l s. Otherwise, the probability is 1 / l. Summed over all possibilities, which are not zero, this gives in total
P l s ( l s ) = l = l s + 1 P l ( l ) l = 1 l P l ( l ) .
(A2)

Here, we show the mathematical proof for the statements made about the 1-points on anchor diagonals.

  1. In intervals lying between two 1-anchor points, all points are 1.

    1. The distance matrix for an interpolated time series is given by
      D i , j = I ( t i ) I ( t j ) ,
      (B1)
      where I ( t ) is the linear interpolation function. When looking at interpolation with integer r and without offset, we have anchor diagonals. At the anchor diagonals, the following equation holds:
      t i t j = z r d t ref = z d t test z Z ,
      (B2)
      with z being an integer and d t ref and d t test are the sampling times of the reference and test series. The interpolation function is a linear function for t k < t < t k + 1, where t k are the sampling points of the test series. Therefore, I ( t i ) I ( t j ) is the difference between two linear functions for t k i < t i < t k i + 1 and t k j < t j < t k j + 1. The second equation can be rewritten as t k j < t i z d t test < t k j + 1, which is the same as t k j + z d t test < t i < t k j + 1 + z d t test, which is the same as the first conditions. This shows that the difference itself is a linear function between two anchor points.
      For the linear function, which goes through the points A and B, we can show that if A and B are smaller than ε, then all points in between have a norm smaller than ε. It follows that if our interpolation function is a linear function between two anchor points and both anchor points are recurrent and therefore, their norm is smaller than ε, all points on the interpolation function in between have a norm smaller than ε and are, therefore, also recurrent.
      A < ε and B < ε , C = A + z ( B A ) 0 z 1 , C = A + z ( B A ) = ( 1 z ) A + z B ( 1 z ) A + z B max ( A , B ) ε .
      (B3)

  2. In intervals lying between two 0-anchor points, there is at most one 1-line in between.

    1. Two 1-lines would violate the first point because there would be 0-points between 1-points.

  3. In intervals lying between one 1- and one 0-anchor point, then there is one 1-0 transition.

    1. There cannot be an additional 0-line before the last 1-point, as shown in the first point.

  4. In intervals lying between one 0- and 1-anchor point, then there is one 0-1 transition.

    1. There cannot be an additional 0-line after the first 1-point, as shown in the first point.

To illustrate changes to different intervals on an anchor diagonal, we show in Table II the effect of downsampling and interpolation. Furthermore, we give the probability of finding such an interval in the original recurrence plot.

TABLE II.

Example of different intervals on the anchor diagonal of a recurrence plot and how they are changed when the underlying data are downsampled and linearly interpolated with an interpolation ratio of r = 4. The first and the last point are always the anchor points, as described in Sec. IV A.

Original Downsampled Interpolated 1-line-starts Probability
11111  11  11111  P(ls > r
11000  10  11100  P(ls < r) ⋅ P(ls + d1 > r | ls < r
10011  11  11111  P(ls + d1 < r) ⋅ P(ls + d1 + l1 > r | ls + d1 < r
10110  10  11100  P(ls + d1 + l1 < r) ⋅ P(ls + d1 + l1 + d2 > r | ls + d1 + l1 < r
10101  11  11111  P(ls + d1 + l1 + d2 < r) ⋅ P(ls + d1 + l1 + d2 + l2 > r | ls + d1 + l1 + d2 < r
Original Downsampled Interpolated 1-line-starts Probability
11111  11  11111  P(ls > r
11000  10  11100  P(ls < r) ⋅ P(ls + d1 > r | ls < r
10011  11  11111  P(ls + d1 < r) ⋅ P(ls + d1 + l1 > r | ls + d1 < r
10110  10  11100  P(ls + d1 + l1 < r) ⋅ P(ls + d1 + l1 + d2 > r | ls + d1 + l1 < r
10101  11  11111  P(ls + d1 + l1 + d2 < r) ⋅ P(ls + d1 + l1 + d2 + l2 > r | ls + d1 + l1 + d2 < r

To show that the effect seen for the Euler norm is also present when using the maximum norm, we performed the analysis from Fig. 4 (left) again, but using the maximum norm instead. Here, we also see that the L-measure obtained from the interpolated recurrence plot is decreased compared to the reference recurrence plot and the deviation is present for all interpolation methods and increases with the interpolation ratio. The linear and pchip interpolation leads to a similar deviation, which is qualitatively different from the deviation after using one of the spline interpolations (Fig. 10).

FIG. 10.

Relative difference of L int between interpolated and reference series. For a Roessler system, L ref = 8.77, ε = 4.45, and the maximum norm is used.

FIG. 10.

Relative difference of L int between interpolated and reference series. For a Roessler system, L ref = 8.77, ε = 4.45, and the maximum norm is used.

Close modal
1.
A.
Wolf
,
J. B.
Swift
,
H. L.
Swinney
, and
J. A.
Vastano
, “
Determining Lyapunov exponents from a time series
,”
Physica D
16
,
285
317
(
1985
).
2.
X.
Xu
,
J.
Zhang
, and
M.
Small
, “
Superfamily phenomena and motifs of networks induced from time series
,”
Proc. Natl. Acad. Sci. U.S.A.
105
,
19601
19605
(
2008
).
3.
M.
Martini
,
T. A.
Kranz
,
T.
Wagner
, and
K.
Lehnertz
, “
Inferring directional interactions from transient signals with symbolic transfer entropy
,”
Phys. Rev. E
83
,
011919
(
2011
).
4.
N.
Marwan
,
M. C.
Romano
,
M.
Thiel
, and
J.
Kurths
, “
Recurrence plots for the analysis of complex systems
,”
Phys. Rep.
438
,
237
329
(
2007
).
5.
N.
Marwan
, “
A historical review of recurrence plots
,”
Eur. Phys. J. Spec. Top.
164
,
3
12
(
2008
).
6.
N.
Marwan
and
K. H.
Kraemer
, “
Trends in recurrence analysis of dynamical systems
,”
Eur. Phys. J. Spec. Top.
232
,
5
27
(
2023
).
7.
N.
Marwan
,
N.
Wessel
,
U.
Meyerfeldt
,
A.
Schirdewan
, and
J.
Kurths
, “
Recurrence plot based measures of complexity and its application to heart rate variability data
,”
Phys. Rev. E
66
,
026702
(
2002
).
8.
U.
Parlitz
,
S.
Berg
,
S.
Luther
,
A.
Schirdewan
,
J.
Kurths
, and
N.
Wessel
, “
Classifying cardiac biosignals using ordinal pattern statistics and symbolic dynamics
,”
Comput. Biol. Med.
42
,
319
327
(
2012
).
9.
N.
Marwan
,
Y.
Zou
,
N.
Wessel
,
M.
Riedl
, and
J.
Kurths
, “
Estimating coupling directions in the cardio-respiratory system using recurrence properties
,”
Philos. Trans. R. Soc. A
371
,
20110624
(
2013
).
10.
N. R.
Lomb
, “
Least-squares frequency analysis of unequally spaced data
,”
Astrophys. Space Sci.
39
,
447
462
(
1976
).
11.
J.
Ge
,
J.
van Eyken
,
S.
Mahadevan
,
C.
DeWitt
,
S. R.
Kane
,
R.
Cohen
,
A.
Vanden Heuvel
,
S. W.
Fleming
,
P.
Guo
,
G. W.
Henry
,
D. P.
Schneider
,
L. W.
Ramsey
,
R. A.
Wittenmyer
,
M.
Endl
,
W. D.
Cochran
,
E. B.
Ford
,
E. L.
Martin
,
G.
Israelian
,
J.
Valenti
, and
D.
Montes
, “
The first extrasolar planet discovered with a new-generation high-throughput Doppler instrument
,”
Astrophys. J.
648
,
683
695
(
2006
).
12.
S. F. M.
Breitenbach
,
K.
Rehfeld
,
B.
Goswami
,
J. U. L.
Baldini
,
H. E.
Ridley
,
D.
Kennett
,
K.
Prufer
,
V. V.
Aquino
,
Y.
Asmerom
,
V. J.
Polyak
,
H.
Cheng
,
J.
Kurths
, and
N.
Marwan
, “
COnstructing proxy-record age models (COPRA)
,”
Clim. Past
8
,
1765
1779
(
2012
).
13.
I.
Ozken
,
D.
Eroglu
,
T.
Stemler
,
N.
Marwan
,
G. B.
Bagci
, and
J.
Kurths
, “
Transformation-cost time-series method for analyzing irregularly sampled data
,”
Phys. Rev. E
91
,
062911
(
2015
).
14.
J. D.
Scargle
, “
Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data
,”
Astrophys. J.
263
,
835
(
1982
).
15.
P.
Stoica
and
N.
Sandgren
, “
Spectral analysis of irregularly-sampled data: Paralleling the regularly-sampled data approaches
,”
Digit. Signal Process.
16
,
712
734
(
2006
).
16.
K.
Rehfeld
,
N.
Marwan
,
J.
Heitzig
, and
J.
Kurths
, “
Comparison of correlation analysis techniques for irregularly sampled time series
,”
Nonlinear Process. Geophys.
18
,
389
404
(
2011
).
17.
I.
Ozken
,
D.
Eroglu
,
S. F. M.
Breitenbach
,
N.
Marwan
,
L.
Tan
,
U.
Tirnakli
, and
J.
Kurths
, “
Recurrence plot analysis of irregularly sampled data
,”
Phys. Rev. E
98
,
052215
(
2018
).
18.
R.
Hébert
,
K.
Rehfeld
, and
T.
Laepple
, “
Comparing estimation techniques for temporal scaling in palaeoclimate time series
,”
Nonlinear Process. Geophys.
28
,
311
328
(
2021
).
19.
N.
Marwan
,
M.
Thiel
, and
N. R.
Nowaczyk
, “
Cross recurrence plot based synchronization of time series
,”
Nonlinear Process. Geophys.
9
,
325
331
(
2002
).
20.
T.
Chelidze
and
T.
Matcharashvili
, “Dynamical patterns in seismology,” in Recurrence Quantification Analysis—Theory and Best Practices, edited by C. L. Webber, Jr. and N. Marwan (Springer, Cham, 2015), pp. 291–334.
21.
S.
Oberst
,
R. K.
Niven
,
D. R.
Lester
,
A.
Ord
,
B.
Hobbs
, and
N. P.
Hoffmann
, “
Detection of unstable periodic orbits in mineralising geological systems
,”
Chaos
28
,
085711
(
2018
).
22.
A.
Spiridonov
,
L.
Balakauskas
,
R.
Stankevic
,
G.
Kluczynska
,
L.
Gedminiene
, and
M.
Stancikaite
, “
Holocene vegetation patterns in southern Lithuania indicate astronomical forcing on the millennial and centennial time scales
,”
Sci. Rep.
9
,
14711
(
2019
).
23.
A.
Zaitouny
,
M.
Small
,
J.
Hill
,
I.
Emelyanova
, and
M. B.
Clennell
, “
Fast automatic detection of geological boundaries from multivariate log data using recurrence
,”
Comput. Geosci.
135
,
104362
(
2020
).
24.
S.
Radzevicius
,
R.
Stankevic
,
R.
Budginas
,
A.
Cichon-Pupienis
,
A.
Venckute-Aleksiene
,
T.
Meidla
,
L.
Ainsaar
, and
A.
Spiridonov
, “
Integrated stratigraphy of the Ludlow (Silurian) of the Baubliai-2 core (Western Lithuania) and the record of delta O-18 and delta C-13 climatically driven co-variability
,”
Newsl. Stratigr.
56
,
75
88
(
2023
).
25.
J. F.
Donges
,
R. V.
Donner
,
M. H.
Trauth
,
N.
Marwan
,
H. J.
Schellnhuber
, and
J.
Kurths
, “
Nonlinear detection of paleoclimate-variability transitions possibly related to human evolution
,”
Proc. Natl. Acad. Sci. U.S.A.
108
,
20422
20427
(
2011
).
26.
N.
Marwan
and
J.
Kurths
, “
Complex network based techniques to identify extreme events and (sudden) transitions in spatio-temporal systems
,”
Chaos
25
,
097609
(
2015
).
27.
D.
Eroglu
,
F. H.
McRobie
,
I.
Ozken
,
T.
Stemler
,
K.-H.
Wyrwoll
,
S. F. M.
Breitenbach
,
N.
Marwan
, and
J.
Kurths
, “
See-saw relationship of the Holocene East Asian-Australian summer monsoon
,”
Nat. Commun.
7
,
12929
(
2016
).
28.
M. H.
Trauth
,
A.
Asrat
,
W.
Duesing
,
V.
Foerster
,
K. H.
Kraemer
,
N.
Marwan
,
M. A.
Maslin
, and
F.
Schaebitz
, “
Classifying past climate change in the Chew Bahir basin, Southern Ethiopia, using recurrence quantification analysis
,”
Clim. Dyn.
53
,
2557
2572
(
2019
).
29.
T.
Westerhold
,
N.
Marwan
,
A. J.
Drury
,
D.
Liebrand
,
C.
Agnini
,
E.
Anagnostou
,
J. S. K.
Barnet
,
S. M.
Bohaty
,
D.
De Vleeschouwer
,
F.
Florindo
,
T.
Frederichs
,
D. A.
Hodell
,
A. E.
Holbourn
,
D.
Kroon
,
V.
Lauretano
,
K.
Littler
,
L. J.
Lourens
,
M.
Lyle
,
H.
Pälike
,
U.
Röhl
,
J.
Tian
,
R. H.
Wilkens
,
P. A.
Wilson
, and
J. C.
Zachos
, “
An astronomically dated record of Earth’s climate and its predictability over the last 66 million years
,”
Science
369
,
1383
1387
(
2020
).
30.
N. H.
Packard
,
J. P.
Crutchfield
,
J. D.
Farmer
, and
R. S.
Shaw
, “
Geometry from a time series
,”
Phys. Rev. Lett.
45
,
712
716
(
1980
).
31.
K. H.
Kraemer
,
G.
Datseris
,
J.
Kurths
,
I. Z.
Kiss
,
J. L.
Ocampo-Espindola
, and
N.
Marwan
, “
A unified and automated approach to attractor reconstruction
,”
New J. Phys.
23
,
033017
(
2021
).
32.
J.-P.
Eckmann
,
S.
Oliffson Kamphorst
, and
D.
Ruelle
, “
Recurrence plots of dynamical systems
,”
Europhys. Lett.
4
,
973
977
(
1987
).
33.
J. P.
Zbilut
and
C. L.
Webber
, Jr.
, “
Embeddings and delays as derived from quantification of recurrence plots
,”
Phys. Lett. A
171
,
199
203
(
1992
).
34.
C. L.
Webber
, Jr.
and
J. P.
Zbilut
, “
Dynamical assessment of physiological systems and states using recurrence plot strategies
,”
J. Appl. Physiol.
76
,
965
973
(
1994
).
35.
N.
Marwan
,
J. F.
Donges
,
R. V.
Donner
, and
D.
Eroglu
, “
Nonlinear time series analysis of palaeoclimate proxy records
,”
Quat. Sci. Rev.
274
,
107245
(
2021
).
36.
T.
Braun
,
S. F. M.
Breitenbach
,
V.
Skiba
,
F. A.
Lechleitner
,
E. E.
Ray
,
L. M.
Baldini
,
V. J.
Polyak
,
J. U. L.
Baldini
,
D. J.
Kennett
,
K. M.
Prufer
, and
N.
Marwan
, “
Decline in seasonal predictability potentially destabilized Classic Maya societies
,”
Commun. Earth Environ.
4
,
82
(
2023
).
37.
K. H.
Kraemer
and
N.
Marwan
, “
Border effect corrections for diagonal line based recurrence quantification analysis measures
,”
Phys. Lett. A
383
,
125977
(
2019
).
38.
O. E.
Rössler
, “
An equation for continuous chaos
,”
Phys. Lett. A
57
,
397
398
(
1976
).
39.
V.
Foerster
,
A.
Asrat
,
C. B.
Ramsey
,
E. T.
Brown
,
M. S.
Chapot
,
A.
Deino
,
W.
Duesing
,
M.
Grove
,
A.
Hahn
,
A.
Junginger
,
S.
Kaboth-Bahr
,
C. S.
Lane
,
S.
Opitz
,
A.
Noren
,
H. M.
Roberts
,
M.
Stockhecke
,
R.
Tiedemann
,
C. M.
Vidal
,
R.
Vogelsang
,
A. S.
Cohen
,
H. F.
Lamb
,
F.
Schaebitz
, and
M. H.
Trauth
, “
Pleistocene climate variability in Eastern Africa influenced hominin evolution
,”
Nat. Geosci.
15
,
805
811
(
2022
).
40.
V.
Foerster
,
A.
Junginger
,
O.
Langkamp
,
T.
Gebru
,
A.
Asrat
,
M.
Umer
,
H. F.
Lamb
,
V.
Wennrich
,
J.
Rethemeyer
,
N.
Nowaczyk
,
M. H.
Trauth
, and
F.
Schaebitz
, “
Climatic change recorded in the sediments of the Chew Bahir basin, southern Ethiopia, during the last 45,000 years
,”
Quat. Int.
274
,
25
37
(
2012
).
41.
H. M.
Roberts
,
C. B.
Ramsey
,
M. S.
Chapot
,
A. L.
Deino
,
C. S.
Lane
,
C.
Vidal
,
A.
Asrat
,
A.
Cohen
,
V.
Foerster
,
H. F.
Lamb
,
F.
Schäbitz
,
M. H.
Trauth
, and
F. A.
Viehberg
, “
Using multiple chronometers to establish a long, directly-dated lacustrine record: Constraining > 600 , 000 years of environmental change at Chew Bahir, Ethiopia
,”
Quat. Sci. Rev.
266
,
107025
(
2021
).
42.
M. H.
Trauth
,
A.
Asrat
,
A. S.
Cohen
,
W.
Duesing
,
V.
Foerster
,
S.
Kaboth-Bahr
,
K. H.
Kraemer
,
H. F.
Lamb
,
N.
Marwan
,
M. A.
Maslin
, and
F.
Schäbitz
, “
Recurring types of variability and transitions in the 620 kyr record of climate change from the Chew Bahir basin, Southern Ethiopia
,”
Quat. Sci. Rev.
266
,
106777
(
2021
).
Published open access through an agreement with Potsdam-Institut fur Klimafolgenforschung (PIK) e V