Forecasting complex systems is important for understanding and predicting phenomena. Due to the complexity and error sensitivity inherent in predictive models, forecasting proves challenging. This paper presents a novel approach that assimilates system observations into predictive models. The approach makes use of a recursive partitioning algorithm both to facilitate the computation of local sets of model corrections and to provide a data structure for traversing the model space. These local sets of corrections act as a sample from a piecewise stochastic process. Appending these corrections to the predictive model incorporates hidden residual dynamics, resulting in improved forecasting performance. Numerical experiments demonstrate that this approach improves forecasting for the Lorenz 1963 model. In addition, comparisons are made between two types of corrections, Vector Difference and Gaussian, with Vector Difference corrections providing the best computational efficiency and forecasting performance. To further justify the effectiveness of this approach, it is successfully applied to more complex systems: the Lorenz system under various chaotic parameterizations, coupled Lorenz, and cubic Lorenz.

Predictive models of complex systems play a crucial role in understanding phenomena. These models capture the underlying dynamics of phenomena and facilitate predictions. Despite computational advancements, modeling and forecasting these systems prove challenging. By making use of data assimilation and forecasting techniques, model inadequacies can be addressed and effective forecasts made; in some cases, however, such techniques are insufficient. Regardless, it is advantageous to have longer forecast horizons with greater confidence. In this paper, a novel approach to forecasting complex systems is introduced through the use of stochastic forcing, where the forcing is a piecewise stochastic process that approximately maps the model state to its associated model error. Applying the forcing to the state's evolution equation as an additional term results in improved forecasting performance.

Outside of trivial systems, the perfect model scenario, where model predictions are equivalent to system outcomes, is a fiction. This is due to the inability to perfectly capture the dynamics that contribute to a system's properties: either the dynamics are imperceptible, introducing them into the model is computationally intractable,1 or a combination of the two. In the Imperfect Model Scenario (IPMS), which is reality, models are unable to account for system dynamics in their entirety. The IPMS poses challenges for forecasting, especially when the system in question displays Sensitive Dependence on Initial Conditions (SDIC), where a small perturbation to the initial condition results in an exponential divergence between the perturbed and unperturbed trajectories. SDIC leads to nonlinear scaling of model error, shortening the forecast horizon (the period of time over which the model can adequately predict the future state of the system). Despite the limitations posed by the IPMS, predictive models provide useful forecasts, informing both policy makers and scientists and enabling decision making. Given that systems of interest can display SDIC, it is important to develop methodologies that extend their forecast horizons. Systems such as those found in oceanography and meteorology2–8 are examples of phenomena that display SDIC and are of interest.

Previous research has made use of data assimilation to incorporate observations of the system into the predictive model, thus encoding missing dynamics into it. Duffin et al. take a Bayesian approach to the problem of determining model parameters given observations of the system.9,10 Through the statFEM method, originally introduced by Girolami et al.,11 Duffin et al. use Gaussian Processes12 as model parameters, which they then fit to the data for use in the finite element method, allowing for improved forecasts with uncertainty quantification. Another approach is theoretically outlined by Judd and Smith.13,14 Their approach takes a deterministic model and appends a stochastic process to account for residual dynamics; the stochastic process acts as an approximation of the map between the model state and the associated model error. By appending the stochastic process to the predictive model, model error is reduced.

This paper addresses the problem of model error in chaotic systems by appending a piecewise stochastic process to the model, in line with the theory outlined by Judd and Smith.14 In the context of chaotic systems, the definition of model error by Judd et al. is utilized: a combination of projection and direction error,15 where projection error occurs when the model's attractor is in the wrong location and direction error occurs when the trajectories of the model move in the wrong direction compared to the projection of reality into model space. To develop the methodology, the Lorenz '63 system16 serves as a toy model: the standard-parameter Lorenz system serves as the predictive model, and a misspecified Lorenz system serves as the system to forecast. The methodology is then tested against varying chaotic parameterizations of the Lorenz, coupled Lorenz, and cubic Lorenz systems.

Through the application and development of a recursive partitioning algorithm, the model space is partitioned, and through these partitions a piecewise stochastic process is generated. The contributions of this paper are as follows: a recursive data structure to facilitate data assimilation, a piecewise stochastic process utilizing the recursive data structure, a comparison between the performance of two approximations of model error, and an assessment of the robustness of this approach to varying forecasting scenarios.

The paper is organized as follows: Sec. II introduces the theory and background. Section III introduces the Lorenz system16 and the process of assembling the piecewise stochastic process using recursive partitioning. Section IV introduces both Vector Difference corrections and Gaussian corrections. Section V introduces the metric to compare the two types of model corrections, the implementation of the forecasting scheme, and an analysis of the robustness of this approach to varying forecasting scenarios. Finally, Sec. VI summarizes the findings and suggests directions for future research.

This paper shows that Vector Difference corrections outperform Gaussian corrections in terms of effective forecast duration and computational efficiency, and demonstrates the effectiveness of appending a piecewise stochastic process using both correction types when compared to the original model. Through simulations of the model using Vector Difference corrections, it is found that increasing the resolution of partitions leads to the greatest increase in the forecast horizon, as opposed to increasing the quantity of corrections associated with any given partition, whose additional contributions to the correction samples quickly become negligible. Finally, this method performs effectively on more complex examples using Vector Difference corrections.

This outline of the theory summarizes the complete outline given by Judd and Smith.14 Let $\dot{y} = G(y)$ be a system with $y \in \mathbb{R}^n$, $n > 0$, and $G(\cdot)$ a continuous vector field $G: \mathbb{R}^n \to \mathbb{R}^n$. To model this system, the predictive model $\dot{x} = F(x)$ with $x \in \mathbb{R}^n$ and $F: \mathbb{R}^n \to \mathbb{R}^n$ is used. From Judd and Smith, it is known that appending a stochastic process that accounts for residual dynamics to the predictive model $F(\cdot)$ reduces model error.14 This paper uses Judd and Smith's findings to construct samples of a piecewise stochastic process to act as corrections for the predictive model, thus improving forecasting skill. Specifically,
$x(\tau + \Delta t) = \phi^{M}_{\Delta t}(x(\tau)),$ (1)

$y(\tau + \Delta t) = \phi^{S}_{\Delta t}(y(\tau)),$ (2)

$z(\tau + \Delta t) = \phi^{M}_{\Delta t}(z(\tau)) + \Gamma(z(\tau)),$ (3)

where $\tau = t_0 + (m-1)\Delta t$, $t_0$ is the initialization time, the LHS of Eq. (3) is the forecast at time $T = m\Delta t + t_0$ with corrections applied every $\Delta t$ time steps, and $\phi^{M}_{\Delta t}$, $\phi^{S}_{\Delta t}$ are evolution operators of the model and system, respectively.17 Equations (1) and (2) provide time realizations of the model and system, respectively. $\Gamma(\cdot)$ is a function that outputs the model error for any given model state, where $\Gamma: W \to V$, $W, V \subseteq \mathbb{R}^n$, and $W$ and $V$ are the model space and correction space, respectively. In practice, the future state is unknown; therefore, $\Gamma(\cdot)$ from Eq. (3) is approximated with $\xi(\cdot)$. To apply the theory outlined above, it is necessary to construct $\xi(\cdot)$, which is built as a piecewise stochastic process, where $\xi(\cdot) \sim \rho(\xi \,|\, x \pm \Delta x)$. The piecewise stochastic process is generated using a recursive partitioning algorithm inspired by Carroll and Byers,18 allowing for the storage and computation of local sets of corrections. Specifically,
$\xi(x) = X_i \quad \text{if } x \in P_i,$ (4)

where $X_i$ is a random variable corresponding to the $i$th partition $P_i$ of the $N_p$ partitions covering the model space,
$X_i = \dfrac{1}{N_E} \displaystyle\sum_{j=1}^{N_E} c_j, \qquad c_j \sim \mathcal{U}(C_i),$ (5)

where $C_i$ is the set of corrections associated with partition $P_i$ and $N_E$ is the number of samples taken from the set $C_i$. This is a form of ensemble forecasting. Given that the model will not perfectly match any state of the system, it is impossible to apply the perfect correction. Ensemble forecasting allows information from nearby trajectories to inform the model correction,19–22 thus improving forecasting.
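As a concrete illustration, a minimal sketch of evaluating $\xi(\cdot)$ per Eqs. (4) and (5) might look as follows; the names `find_partition` and `corrections` are hypothetical stand-ins for the recursive structure and per-partition correction sets developed below, not the paper's actual code.

```python
# Minimal sketch of Eqs. (4) and (5): evaluating the piecewise stochastic
# process xi at a model state z. `find_partition` and `corrections` are
# hypothetical stand-ins for structures built later in the paper.
import numpy as np

rng = np.random.default_rng(0)

def xi(z, find_partition, corrections, n_ensemble):
    """Return a draw of X_i for the partition P_i containing z: the mean of
    N_E corrections sampled uniformly (with replacement) from C_i."""
    leaf = find_partition(z)                # partition P_i containing z
    C_i = corrections[leaf]                 # corrections array, shape (|C_i|, dim)
    draws = C_i[rng.integers(len(C_i), size=n_ensemble)]
    return draws.mean(axis=0)               # ensemble-mean correction
```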

Figure 1 shows how corrections improve the predictive model, as well as the variability in model error over the model space. It is because of this variability in model error that $\xi(\cdot)$ is piecewise, allowing local errors to inform local trajectories and ensuring that the corrections applied are relevant to the model state.

FIG. 1. The process of applying corrections from $\Gamma(\cdot)$ to the predictive model trajectory every $\Delta t$ time steps, resulting in the model shadowing the system.
The proposed approach is tested using the Lorenz '63 system.16 The Lorenz system is defined as

$\dot{x} = \sigma(y - x), \qquad \dot{y} = x(\rho - z) - y, \qquad \dot{z} = xy - \beta z.$ (6)

The predictive model $\dot{x} = F(x)$ is the Lorenz system with standard parameterization ($\sigma = 10$, $\rho = 28$, $\beta = 8/3$), and the system $\dot{y} = F_{\rho=42}(y)$ is the Lorenz system with standard parameterization except $\rho = 42$. Before partitioning the model space, the system is projected from the system space to the model space by taking $N$ samples of the model $X \in \mathbb{R}^{3 \times N}$ and the system $\hat{Y} \in \mathbb{R}^{3 \times N}$ and scaling as follows:

$Y = \hat{Y}\,\dfrac{\mathrm{SD}(X)}{\mathrm{SD}(\hat{Y})}.$ (7)

By multiplying $\hat{Y}$ by the ratio of the sample model standard deviation to the sample system standard deviation, the sample variance of the system observations is made equal to that of the model observations. In cases where the data are misaligned with the predictive model, a projection can be applied to the data to reduce the misalignment.

Figure 2(a) shows the difference in scale between the two realizations. By scaling $\hat{Y}$ to $Y$, the error resulting from the difference in scale is mitigated. Figure 2(b) shows the result of scaling $\hat{Y}$ to $Y$, with the predictive model and the scaled system having the same sample variance. With $\hat{Y}$ projected into the model space, the next step is to partition the model space. Carroll and Byers utilized a grid-based partitioning algorithm for the purpose of characterizing invariant properties of chaotic attractors.18 In this paper, grid-based partitioning is used to assemble a piecewise stochastic process in line with Eq. (4).
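For concreteness, a minimal sketch of generating the two realizations and applying the scaling of Eq. (7) is given below; the Euler integrator follows the paper, while the step size, initial conditions, and sample counts are illustrative assumptions.

```python
# Minimal sketch: generate model (standard Lorenz) and system (rho = 42)
# realizations, then scale the system per Eq. (7). dt = 0.01, the initial
# conditions, and n_steps are assumptions for illustration.
import numpy as np

def lorenz(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def realization(u0, rho, n_steps, dt=0.01):
    """Euler-integrate a Lorenz trajectory; one state per row."""
    traj = np.empty((n_steps, 3))
    u = np.asarray(u0, dtype=float)
    for k in range(n_steps):
        traj[k] = u
        u = u + dt * lorenz(u, rho=rho)
    return traj

X = realization([1.0, 1.0, 1.0], rho=28.0, n_steps=100_000)      # model samples
Y_hat = realization([1.0, 1.0, 1.0], rho=42.0, n_steps=100_000)  # system samples

# Eq. (7): match the system's coordinate-wise sample variance to the model's.
Y = Y_hat * (X.std(axis=0) / Y_hat.std(axis=0))
```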

FIG. 2. Model and system realizations before and after scaling. (a) The system realization prior to scaling. (b) The result of the transformation of $\hat{Y}$ to $Y$, where $Y = \hat{Y}\,\mathrm{SD}(X)/\mathrm{SD}(\hat{Y})$.

The partitioning algorithm begins with a single cube that covers $Y$ in the model space. This cube is split into $2^D$ equally sized cubes, with $D$ being the dimension. Cubes continue to split recursively, and empty cubes are discarded. The stopping criterion for the partitioning algorithm is met when a cube contains a certain number of points from $Y$; here, the stopping criterion is set to 5, so once a cube has 5 or fewer points, it ceases to split. This value was selected through trial and error: 5 provided effective results and high-resolution partitions to address SDIC. The question of optimal parameters for partitioning and how to obtain them is outside the scope of this paper. There are, however, indicators that the stopping criterion is too high. For example, if the corrections stored by a partition are highly contradictory, differing greatly in direction and magnitude, then the partition is too big and thus too general, in which case a lower stopping criterion should be set. Although the stopping criterion can be set, the partitioning algorithm cannot be finely tuned, due to computational costs and the general unpredictability inherent in its operations.
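A minimal sketch of this recursive splitting is given below, assuming a simple object per cube; the class and attribute names (`Cube`, `STOP`, `parent`) are illustrative rather than taken from the paper's implementation.

```python
# Minimal sketch of the recursive grid partitioning: split a cube into 2^D
# equal sub-cubes, discard empty ones, and stop when a cube holds <= 5 points.
import numpy as np

STOP = 5  # stopping criterion: a cube with <= 5 points becomes a partition

class Cube:
    def __init__(self, lo, hi, points, parent=None):
        self.lo, self.hi = lo, hi          # opposite corners of the cube
        self.points = points               # points of Y covered by this cube
        self.parent = parent               # link upward through the structure
        self.children = []                 # empty for leaves (partitions)
        if len(points) > STOP:
            self._split()

    def _split(self):
        mid = 0.5 * (self.lo + self.hi)
        dim = len(self.lo)
        for corner in range(2 ** dim):     # enumerate the 2^D sub-cubes
            bits = [(corner >> d) & 1 for d in range(dim)]
            lo = np.where(bits, mid, self.lo)
            hi = np.where(bits, self.hi, mid)
            inside = np.all((self.points >= lo) & (self.points < hi), axis=1)
            if inside.any():               # discard empty sub-cubes
                self.children.append(Cube(lo, hi, self.points[inside], self))
        self.points = None                 # interior nodes hold no points

def build_tree(Y):
    """Root box centered on the data with side lengths 4*SD(Y), as in Fig. 3."""
    center, half = Y.mean(axis=0), 2.0 * Y.std(axis=0)
    return Cube(center - half, center + half, Y)
```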

Figures 3(a) and 3(b) demonstrate the algorithm for one level of recursion. Figure 3(c) shows the end result of the algorithm for $|Y| = 10^5$, where $|\cdot|$ denotes cardinality. Partitioning until a partition contains a fixed number of points allows the resolution of the partitioning algorithm to adjust to the density of points; Fig. 3(c) demonstrates this through high-resolution partitions occurring around regions of high density, such as the separatrix. The number of partitions $N_p$ resulting from this algorithm depends on the stopping criterion, $|Y|$, and the properties of the time series. As a result, $N_p$ is only known after partitioning.

FIG. 3. Recursive partitioning algorithm for one level of recursion and the final result of the algorithm. (a) $Y$ contained by a box with side lengths equal to $4\,\mathrm{SD}(Y)$. (b) The box in (a) split into eight equally sized cubes with side lengths $2\,\mathrm{SD}(Y)$. This process of splitting cubes into eight equally sized cubes continues; when a cube is split, sub-cubes that do not contain points are removed. The process repeats until each cube contains at least one and at most five points, at which point the cubes are referred to as partitions. (c) The final result of the partitioning algorithm.

With the model space partitioned, the partitions are populated with additional scaled observations $Y_L$. By sorting additional points into the partitions, the sets of corrections become denser and thus contain more information about the dynamics. The sorting of these additional points is facilitated through the existing recursive structure. Here, $Y_L \cap Y = \emptyset$, and $Y_L$ is scaled by the same factor as in Eq. (7).

Figure 4 demonstrates how points in $Y_L$ are sorted into partitions. This approach exploits the fact that subsequent elements of $Y \cup Y_L$ are nearby. Initially, the partition that contains the starting point $y \in Y \cup Y_L$ is unknown, so the algorithm starts at the top of the recursive structure and moves down through the partitions that contain $y$ until it finds either a partition at the bottom or the closest candidate at the bottom. The partition found then serves as the starting point for locating the subsequent point in $Y \cup Y_L$: the algorithm moves up the recursive structure until it reaches a partition that contains the point, then, as before, moves down the recursive structure until it either finds the bottom-level partition containing the point or the closest candidate.
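A minimal sketch of this up-then-down traversal, under the same illustrative `Cube` structure as above (including its `parent` reference), might look as follows.

```python
# Minimal sketch of sorting the additional observations Y_L into existing
# partitions. Exploits that consecutive points of the time series are close:
# each search restarts from the previous leaf, walking up only as far as needed.
import numpy as np

def contains(node, p):
    return bool(np.all((p >= node.lo) & (p < node.hi)))

def descend(node, p):
    """Move down from `node` to the leaf containing p (or nearest candidate)."""
    while node.children:
        child = next((c for c in node.children if contains(c, p)), None)
        if child is None:                       # p falls in a discarded cube:
            child = min(node.children,          # take the closest candidate
                        key=lambda c: np.linalg.norm(0.5 * (c.lo + c.hi) - p))
        node = child
    return node

def sort_points(root, Y_L):
    """Return, for each point of Y_L in time order, the leaf (partition)
    it belongs to."""
    leaf = descend(root, Y_L[0])                # first point: start at the top
    leaves = [leaf]
    for p in Y_L[1:]:
        node = leaf
        while node is not root and not contains(node, p):
            node = node.parent                  # move up the recursive structure
        leaf = descend(node, p)                 # then back down to a leaf
        leaves.append(leaf)
    return leaves
```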

FIG. 4. $y_1, y_2 \in Y_L$, where $y_1$ comes before $y_2$ in $Y_L$. To find the partition containing $y_2$, given that partition $P$ contains $y_1$, first find the smallest partition $P'$ containing both $y_1$ and $y_2$. Then follow the sub-partition of $P'$ that contains $y_2$: if it is at the bottom of the recursive structure, it is the partition that contains $y_2$; otherwise, continue down through its sub-partitions.

With the partitions constructed, corrections can be generated for each of the partitions, beginning with corrections derived from vector differences between the model prediction and the system outcome. Specifically, for each partition $P_i$,

$C_i = \left\{ \phi^{S}_{\Delta t}(v) - \phi^{M}_{\Delta t}(v) \;:\; v \in V_i \right\},$ (8)

where $V_i \subset Y \cup Y_L$ is the set of points covered by $P_i$.
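In practice, $\phi^{S}_{\Delta t}(v)$ for an observed point $v$ is simply the next observation in the time series, so the sets $C_i$ can be assembled in one pass. A minimal sketch, using a hypothetical one-step model integrator `step_model` and the leaf lookup from the sketches above, is:

```python
# Minimal sketch of Eq. (8): each correction is the difference between where
# the system actually went (the next observation) and where the model predicts
# it should go. `step_model` stands in for phi^M over Delta t; `leaves[k]` is
# the partition containing points[k] (e.g., from sort_points above).
import numpy as np
from collections import defaultdict

def build_corrections(points, leaves, step_model):
    """points: scaled observations in time order; returns a map
    partition -> array of vector-difference corrections. The seam between
    Y and Y_L is ignored in this sketch."""
    corrections = defaultdict(list)
    for k in range(len(points) - 1):
        predicted = step_model(points[k])    # phi^M_dt(v)
        observed = points[k + 1]             # phi^S_dt(v), read off the data
        corrections[leaves[k]].append(observed - predicted)
    return {leaf: np.asarray(c) for leaf, c in corrections.items()}
```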

Figure 5 demonstrates the application of Vector Difference corrections when forecasting, with the process repeating every $\Delta t$ time steps. Since the true correction for $z(\tau)$ is unknown, it is approximated by adding $\mathrm{E}(C_i)$ to $\phi^{M}_{\Delta t}(z(\tau))$, allowing nearby information to inform the correction; however, if data are sparse, there is more uncertainty in the corrections. To account for the additional uncertainty, Gaussian corrections are introduced, which make use of multivariate normals to produce model corrections.

FIG. 5. $z(\tau)$ is the current state of the stochastically forced predictive model and is contained by partition $P_i$. Evolving $z(\tau)$ to $\phi^{M}_{\Delta t}(z(\tau))$ and then applying the correction from $\xi(z(\tau))$ results in $z(\tau + \Delta t)$. The corrections are sampled uniformly from $C_i$ in line with Eq. (5). Corrections $c \in C_i$ are generated using Eq. (8).

The implementation of Gaussian corrections is inspired by, and borrows from, Particle Filters.23–25 There are two implementations of Gaussian corrections: one with locally defined parameters and the other with globally defined parameters.

Inspired by Judd and Stemler's implementation of the Particle Filter,24 $V \subset Y \cup Y_L$ is defined as the points contained by partition $P$. Then, the time-evolved points in $V$ are assigned a uniform weighting,

$w = \dfrac{1}{|V|},$ (9)
where $|V|$ denotes the cardinality of $V$. Samples are taken from multivariate normals centered at observations from $F$. Like Particle Filters, the samples are accepted or rejected by some criterion. The parameters that determine the multivariate normal and the acceptance criterion are taken from Schenk, Potthast, and Rojahn's implementation of a Particle Filter for Lorenz.26 Specifically, $c$ is sampled from $\mathcal{N}(y(t + \Delta t), \Sigma)$ with $y(t + \Delta t)$ sampled uniformly from $F$, and $\rho(\|s - c\|_2) > r$ is the acceptance criterion for $c$, where

$s = \phi^{M}_{\Delta t}(z(\tau)), \qquad \rho \sim \mathcal{N}(\varphi, \lambda^2), \qquad r \sim \mathcal{U}(0, 1),$ (10)
and $\|\cdot\|_2$ is the $L_2$ norm. If sample $c$ is accepted, then it is in $\hat{C}$ with weighting $w\,\rho(\|s - c\|_2)$, where $w$ is the original uniform weighting assigned in Eq. (9), resulting in

$\hat{F} = \big\{ \big(c,\; w\,\rho(\|s - c\|_2)\big) \;:\; c \in \hat{C} \big\}.$ (11)
After 200 samples are drawn, $N_E$ samples are taken from the ensemble $\hat{F}$ and a weighted average is computed; this weighted average becomes the new state. In the case of locally defined parameters, local properties are used to inform the parameters: when the model is in partition $P_i$, it is parameterized using $V_i \subset Y \cup Y_L$, the set of points covered by $P_i$ (in particular, the covariance is computed from the points in $V_i$). This parameterization includes information within partitions, allowing the parameters to vary depending on the model state. Model-state-dependent parameters allow the forecasting scheme to better account for the variance in model error throughout the model space.
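A minimal sketch of one Gaussian-correction forecast step is given below. The unnormalized Gaussian form of the acceptance kernel $\rho$, the fallback to $s$ when no sample is accepted, and the symbolic parameters ($\Sigma$, $\varphi$, $\lambda$) are assumptions of this sketch, not details fixed by the paper.

```python
# Minimal sketch of one Gaussian-correction update, following the
# accept/reject scheme described above. Kernel form and fallback are
# assumptions; parameter values are left symbolic.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_step(s, evolved_pts, Sigma, phi, lam, n_draws=200, n_ensemble=50):
    """s: model prediction phi^M_dt(z(tau)); evolved_pts: time-evolved points
    of V, each carrying the uniform weight w = 1/|V| from Eq. (9)."""
    w0 = 1.0 / len(evolved_pts)
    accepted, weights = [], []
    for _ in range(n_draws):
        center = evolved_pts[rng.integers(len(evolved_pts))]  # uniform draw
        c = rng.multivariate_normal(center, Sigma)            # candidate sample
        d = np.linalg.norm(s - c)
        score = np.exp(-0.5 * ((d - phi) / lam) ** 2)         # rho(||s - c||_2)
        if score > rng.uniform():                             # accept/reject vs r
            accepted.append(c)
            weights.append(w0 * score)                        # re-weight, Eq. (11)
    if not accepted:
        return s                                              # fallback assumption
    accepted, weights = np.array(accepted), np.array(weights)
    idx = rng.integers(len(accepted), size=n_ensemble)        # N_E resamples
    w = weights[idx] / weights[idx].sum()
    return w @ accepted[idx]                                  # weighted average
```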

Figure 6 shows the process of applying Gaussian corrections when forecasting, with the process repeating every $\Delta t$ time steps. Having introduced both types of corrections and their implementations, the two are now compared to assess their overall effectiveness at mitigating model error.

FIG. 6. Given $z(\tau)$ is in partition $P_i$, evolve $z(\tau)$ to $s$, where $s$ [Eq. (10)] is the prediction the model gives for the next state in the forecast. $c_1$ and $c_2$ are samples taken from the distributions $\mathcal{N}(\phi^{D}_{\Delta t}(y_1), \Sigma)$ and $\mathcal{N}(\phi^{D}_{\Delta t}(y_2), \Sigma)$, respectively. $c_i$ is accepted if $\rho(\|s - c_i\|_2) > r$, where $\rho \sim \mathcal{N}(\varphi, \lambda^2)$ and $r \sim \mathcal{U}(0, 1)$. The weighted sample mean of a random subset of the accepted samples is computed and becomes the point $z(\tau + \Delta t)$.

To assess the performance of Vector Difference corrections and Gaussian corrections, a metric from Stemler and Judd27 called Separation Time is used. Separation Time is the maximum amount of time for which the Euclidean distance between the true state and the model state remains below some threshold. It is defined as

$e_i(t) = \| z(t_i + t) - y(t_i + t) \|_2,$ (12)

$T_m(t_i) = \max\{ T \ge 0 : e_i(t) < \gamma \;\; \forall\, t \in [0, T] \},$ (13)

$\bar{T}_m = \dfrac{1}{N} \displaystyle\sum_{i=1}^{N} T_m(t_i).$ (14)

In the case of Gaussian corrections, $\xi(y_n) = z(\tau + \Delta t) - s$, where $s$ is defined in Eq. (10). In line with Stemler and Judd, $\gamma = 2.5$, and $z(t_i)$ corresponds to the $i$th initial condition for a forecast, with both system and model initialized the same.27 Figure 7 shows the Mean Separation Times ($\bar{T}_m$) as defined by Eqs. (12)–(14). For each forecasting scheme with varying $N_E$, each point in Fig. 7 corresponds to $\bar{T}_m$ over 10 000 simulations with 10 000 different initial conditions $u_i \in W \setminus (Y \cup Y_L)$ occurring at times $t_i$, using a partition structure generated with $|Y| = 10^5$ and $|Y_L| = 7.5 \times 10^6$. It is evident from Fig. 7 that $N_E$ has a limited effect on all forecasting schemes, excepting the Global parameter Gaussian corrections.
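A minimal sketch of computing Separation Time and its mean over forecasts, per Eqs. (12)–(14), is given below; trajectory layout (one state per row, shared initial condition) is an assumption of the sketch.

```python
# Minimal sketch of Eqs. (12)-(14): the longest time the forecast stays
# within gamma of the truth, averaged over many initial conditions.
import numpy as np

def separation_time(z_traj, y_traj, dt, gamma=2.5):
    """Time of first exceedance of gamma by the Euclidean distance between
    forecast z and truth y; both start from the same initial condition."""
    dist = np.linalg.norm(z_traj - y_traj, axis=1)
    exceeded = np.nonzero(dist >= gamma)[0]
    return len(dist) * dt if len(exceeded) == 0 else exceeded[0] * dt

def mean_separation_time(pairs, dt, gamma=2.5):
    """pairs: iterable of (forecast, truth) trajectory pairs, one per initial
    condition; returns the Mean Separation Time, T-bar_m."""
    return float(np.mean([separation_time(z, y, dt, gamma) for z, y in pairs]))
```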

Figure 8 shows that sorting additional points into partitions has a negligible effect on performance; it is better to increase the resolution of the partitions instead. By using more data to construct the partitions, the stopping criterion for the algorithm is met for smaller partitions, resulting in higher-resolution partitioning. For high-resolution partitions, the variance of corrections is small. As a result, including additional observations in a given partition does not meaningfully improve the effectiveness of corrections. Despite high resolution providing the most significant performance gains, the improvements afforded by increasing the partition resolution eventually plateau around $|Y| = 2 \times 10^5$, where $\bar{T}_m$ approaches 4.4 s.

FIG. 7. For larger ensemble sizes, performance improved marginally for the Local parameter Gaussian corrections and Vector Difference corrections, while the Global parameter Gaussian corrections improved significantly. The Vector Difference corrections outperformed both parameterizations of Gaussian corrections, more than doubling each of their Separation Times. Because a covariance matrix is computed for Local parameter Gaussian corrections, all partitions that contain only one point are removed.

FIG. 8. The Mean Separation Time for 156 forecasts using Vector Difference corrections, an ensemble size of 50, and varying allocations of data. The heatmap results from linear splines.

To further justify its effectiveness, the approach is applied to the Lorenz system for varying chaotic parameterizations and structural changes. A set of parameterizations is selected by visually inspecting the resulting attractors. It is important to ensure the parameterizations are chaotic: periodic parameterizations result in seemingly indefinite Separation Times, which would bias the results, and collapsed attractors cause the partitioning algorithm to fail by reaching the maximum recursion depth. Visual inspection of the qualitative properties of each Lorenz system is used over numerical methods such as Lyapunov exponents because it requires no computation time and the set of parameterizations to inspect is small. Whether a parameterization of Lorenz is chaotic can only be determined after generating realizations;28 accordingly, sampled parameterizations are tested and discarded if they do not produce chaotic realizations. The structural changes to the Lorenz system are described in the following systems of ODEs:

Cubic Lorenz:
(15)
Coupled Lorenz:
(16)
(17)
(18)

Equation (15) defines a cubic Lorenz system and Eqs. (16)–(18) define two coupled Lorenz systems introduced by Just et al.29 

Figure 9 illustrates the Lorenz system perturbed by structural error or misparameterization. Such parameterizations and structural changes have a notable effect on the dynamics of each realization. Mean Separation Time is then computed for a set of parameterizations of each system, beginning with misparameterizations of the standard Lorenz system.

Figure 10 shows the Mean Separation Times for 10 000 simulations over 67 parameterizations. The more complex and contrary the dynamics are with respect to the predictive model, the shorter the Separation Time. The variability in Separation Times results from this variance in dynamics between parameterizations; similar parameterizations can, in some cases, have dissimilar Separation Times. To further generalize the performance of this approach, Separation Times are generated for the cubic Lorenz system [Eq. (15)] and the coupled Lorenz system [Eqs. (16)–(18)].

Figure 11 shows the Mean Separation Times for cubic Lorenz systems and coupled Lorenz systems. Figure 11(a) shows the Mean Separation Time in response to an increase in the contribution of the cubic component of Eq. (15). Using the Euler method of integration, chaotic attractors are observed for $0 \le \epsilon < 3.07 \times 10^{-4}$. Similarly, in Fig. 11(b), chaotic attractors are observed for a subset of parameterizations of $\zeta$ and $K$. Outside of these parameterizations, the realizations collapse to a fixed point or the numerical integration is unstable. As the contribution of the cubic component of Eq. (15) increases through $\epsilon$, the model misspecification increases along with its complexity. Similarly, increasing $\zeta$ in Eq. (17) and increasing the coupling through $K$ decrease the Separation Time.

FIG. 9. To-scale realizations of misparameterized Lorenz [Eq. (6)], cubic Lorenz [Eq. (15)], coupled Lorenz [Eqs. (16)–(18)], and standard Lorenz [Eq. (6)].
FIG. 10. Mean Separation Times for misparameterized Lorenz systems over 10 000 simulations. (a) Separation Times for $\sigma = 10$; (b) Separation Times for $\rho = 28$; (c) Separation Times for $\beta = 8/3$.
FIG. 11. Mean Separation Times for cubic Lorenz systems and coupled Lorenz systems over 10 000 simulations. (a) Separation Times for cubic Lorenz; (b) Separation Times for coupled Lorenz.

The approach put forth by this paper makes use of a piecewise stochastic process to account for model error in the predictive model. It is shown that effective forecast horizons are achievable using this approach and that Vector Difference corrections are superior to Gaussian corrections in both efficacy and computational efficiency. The effectiveness of recursive partitioning in generating the piecewise stochastic process is also demonstrated: not only does it account for varying data densities, but it does so efficiently, operating in $O(N \log N)$ time for assembling the structure and $O(H \log N)$ for sorting additional points, where $N$ is the number of observations of the system used to build the partition structure and $H$ is the number of additional system observations sorted into the partitions. Considering Fig. 8, sorting additional points into the partitions is unnecessary; the methodology can therefore be streamlined by its omission. In addition, the design of the data structure facilitates efficient serialization,30,31 allowing predictive models to be efficiently stored and accessed, which aids the practicality of the method. This paper makes use of noiseless data in the IPMS. In reality, data are rarely, if ever, noiseless; as a result, there are research opportunities in applying this method to chaotic systems using noisy data. In addition to testing the method using noisy data, there are opportunities to investigate different approximations of $\Gamma$ from Eq. (3). Currently, the piecewise structure of $\xi(\cdot)$ is inflexible because the recursive partitioning algorithm cannot be precisely tuned: it determines the partitioning structure solely from the number of observations and the stopping criterion, parameters that control only general properties rather than providing fine control. As a result, there are research opportunities in replacing it, possibly with a Gaussian Process, which has historically performed well as a function approximator.12 Overall, this work offers a novel approach to the problem of model error in forecasting and opens new avenues for research. To recreate the results, the code is available on GitHub at https://github.com/marcus-dyson/RPS-Forecasting.

All authors are supported by the ARC ITRH for Transforming Energy Infrastructure through Digital Engineering (TIDE), Grant No. IH200100009. This research is also supported by an Australian Government Research Training Program (RTP) Scholarship.

The authors have no conflicts to disclose.

M. Dyson: Conceptualization (equal); Methodology (lead); Software (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead). T. Stemler: Conceptualization (equal); Methodology (supporting); Project administration (lead); Resources (lead); Supervision (lead); Writing – review & editing (supporting).

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

1. J. O. Berger and L. A. Smith, "On the statistical formalism of uncertainty quantification," Annu. Rev. Stat. Appl. 6, 433–460 (2019).
2. M.-R. Alam, "Predictability horizon of oceanic rogue waves," Geophys. Res. Lett. 41, 8477–8485, https://doi.org/10.1002/2014GL061214 (2014).
3. M. Verlaan and A. W. Heemink, "Tidal flow forecasting using reduced rank square root filters," Stoch. Hydrol. Hydraul. 11, 349–368 (1997).
4. K. Judd, "Forecasting with imperfect models, dynamically constrained inverse problems, and gradient descent algorithms," Physica D 237, 216–232 (2008).
5. A. Robock, "Volcanic eruptions and climate," Rev. Geophys. 38, 191–219, https://doi.org/10.1029/1998RG000054 (2000).
6. L. Astfalck, E. Cripps, J. Gosling, and I. Milne, "Emulation of vessel motion simulators for computationally efficient uncertainty quantification," Ocean Eng. 172, 726–736 (2019).
7. L. Astfalck, M. Bertolacci, and E. Cripps, "Evaluating probabilistic forecasts for maritime engineering operations," Data-Centric Eng. 4, e15 (2023).
8. L. Astfalck, E. Cripps, J. Gosling, M. Hodkiewicz, and I. Milne, "Expert elicitation of directional metocean parameters," Ocean Eng. 161, 268–276 (2018).
9. C. Duffin, E. Cripps, T. Stemler, and M. Girolami, "Low-rank statistical finite elements for scalable model-data synthesis," J. Comput. Phys. 463, 111261 (2022).
10. C. Duffin, E. Cripps, T. Stemler, and M. Girolami, "Statistical finite elements for misspecified models," Proc. Natl. Acad. Sci. 118, e2015006118 (2021).
11. M. Girolami, E. Febrianto, G. Yin, and F. Cirak, "The statistical finite element method (statFEM) for coherent synthesis of observation data and model predictions," Comput. Methods Appl. Mech. Eng. 375, 113533 (2021).
12. C. K. Williams and C. E. Rasmussen, Gaussian Processes for Machine Learning (MIT Press, Cambridge, MA, 2006), Vol. 2.
13. K. Judd and L. Smith, "Indistinguishable states: I. Perfect model scenario," Physica D 151, 125–141 (2001).
14. K. Judd and L. Smith, "Indistinguishable states II: The imperfect model scenario," Physica D 196, 224–242 (2004).
15. K. Judd, C. A. Reynolds, T. E. Rosmond, and L. A. Smith, "The geometry of model error," J. Atmos. Sci. 65, 1749–1772 (2008).
16. E. N. Lorenz, "Deterministic nonperiodic flow," J. Atmos. Sci. 20, 130–141 (1963).
17. J. Guckenheimer and P. Holmes, "Nonlinear oscillations, dynamical systems, and bifurcations of vector fields," in Applied Mathematical Sciences (Springer, New York, 2013).
18. T. L. Carroll and J. M. Byers, "Grid-based partitioning for comparing attractors," Phys. Rev. E 93, 042206 (2016).
19. M. Leutbecher and T. Palmer, "Ensemble forecasting," J. Comput. Phys. 227, 3515–3539 (2008).
20. Z. Toth and E. Kalnay, "Ensemble forecasting at NMC: The generation of perturbations," Bull. Am. Meteorol. Soc. 74, 2317–2330 (1993).
21. J. M. Lewis, "Roots of ensemble forecasting," Mon. Weather Rev. 133, 1865–1885 (2005).
22. S. Hallerberg, D. Pazó, J. M. López, and M. A. Rodríguez, "Logarithmic bred vectors in spatiotemporal chaos: Structure and growth," Phys. Rev. E 81, 066204 (2010).
23. A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," in Readings in Unobserved Components Models, edited by A. C. Harvey and T. Proietti (Oxford University Press, Oxford, 2005), pp. 418–441.
24. K. Judd and T. Stemler, "Failures of sequential Bayesian filters and the successes of shadowing filters in tracking of nonlinear deterministic and stochastic systems," Phys. Rev. E 79, 066206 (2009).
25. V. Elvira, J. Miguez, and P. M. Djurić, "Adapting the number of particles in sequential Monte Carlo methods through an online scheme for convergence assessment," IEEE Trans. Signal Process. 65, 1781–1794 (2017).
26. N. Schenk, R. Potthast, and A. Rojahn, "On two localized particle filter methods for Lorenz 1963 and 1996 models," Front. Appl. Math. Stat. 8, 920186 (2022).
27. T. Stemler and K. Judd, "A guide to using shadowing filters for forecasting and state estimation," Physica D 238, 1260–1273 (2009).
28. C. Sparrow, "6. The Lorenz equations," in Chaos, edited by A. V. Holden (Princeton University Press, Princeton, 1986), pp. 111–134.
29. W. Just, H. Kantz, C. Rödenbeck, and M. Helm, "Stochastic modelling: Replacing fast degrees of freedom by noise," J. Phys. A: Math. Gen. 34, 3199 (2001).
30. M. McKerns and M. A. G. Aivazis, "Pathos: A framework for heterogeneous computing," http://uqfoundation.github.io/project/pathos (2010).
31. M. M. McKerns, L. Strand, T. Sullivan, A. Fang, and M. A. G. Aivazis, "Building a framework for predictive science," arXiv:1202.1056 [cs] (2012).