A semiparametric methodology for reconstructing Markovian and non-Markovian Langevin equations from time series data using unscented Kalman filtering is introduced and explored. The drift function and the logarithm of the diffusion function are expanded into sets of polynomial basis functions. In contrast to the more common state augmentation approach, the Kalman filter is here used only for state estimation and propagation; the model parameters are determined by maximum likelihood based on the predictive distribution generated by the Kalman filter. Model selection regarding the number of included drift and diffusion basis functions is performed with the Bayesian information criterion. The method is successfully validated on various simulated datasets with known system dynamics; it achieves accurate identification of drift and diffusion functions, also outside the prescribed model class, from datasets of moderate length with medium computational cost.
The oldest and most established scientific modeling strategy is the physics-based or forward or direct modeling approach. Dynamical models are built from the known governing equations of motion of the system derived from first principles. A second and complementary approach is offered by data-driven or inverse modeling; here, dynamical models are extracted from datasets of the underlying system using statistical techniques. Data-driven modeling is useful if the equations of motion of the system are not readily known and, in particular, if some sort of coarse-graining is involved, that is, if the goal is to derive reduced or effective equations modeling particular macroscopic variables or modes of the system. Such equations will then naturally have a stochastic nature and incorporate memory effects.
I. INTRODUCTION
The reconstruction of the drift and diffusion functions in Eq. (1) from data is a non-trivial estimation problem. Possible approaches include the direct estimation method,4,5 maximum likelihood estimation,6,7 Kalman filtering8,9 and Markov chain Monte Carlo methods10 in a Bayesian framework. All these techniques have advantages and disadvantages regarding issues such as finite sampling interval, partial observation of the system, observational noise, and computation time.
Estimating the drift and diffusion functions in a non-Markovian Langevin equation from time series data of only is an even more challenging task than reconstructing a Markovian Langevin equation. An extension of the direct estimation method was proposed;13 it is based on a perturbation approach, admits a non-zero correlation time of the driving noise, but still needs it to be much smaller than the correlation time of . A direct estimation method without this restriction was introduced recently.14 However, the proposed optimization procedure does not appear to be straightforward; a complicated regularization guided by trial and error for each example system is necessary. Moreover, very large datasets are required for the method to be applicable.
This study introduces and explores a methodology for reconstructing drift and diffusion functions in Markovian and non-Markovian SDEs based on the unscented Kalman filter. The estimation is semiparametric as the drift and diffusion functions are expanded into nested sets of basis functions of increasing complexity. Unlike previous work using the state augmentation technique for parameter estimation,8,9,15,16 the Kalman filter is here only used for state estimation and propagation of the probability density of the state. The model parameters are determined by maximizing the likelihood of the observations based on the predictive probability density generated by the Kalman filter.
II. RECONSTRUCTION METHODOLOGY
A. Semiparametric approach
Depending on the domain of the system, the method could be extended by using also other types of basis functions such as spline, trigonometric, logarithmic, or rational functions which would then also be included in the orthonormalization, but this will not be discussed here. A similar strategy of combining different types of basis functions was recently applied in the context of maximum likelihood probability density estimation.17
In the non-Markovian case, the correlation time parameter is also to be estimated. It is represented as to guarantee positivity. The total parameter vector of the Langevin model is in the white noise case and in the correlated noise case. There are no constraints on any of the parameters. When quoting values of expansion coefficients, we always refer to the parameters and of the monomial expansions in Eqs. (11) and (12), which are linked by a linear transformation to the parameters and .
B. The unscented Kalman filter
The unscented Kalman filter18,19 (UKF) is used for state estimation and propagation of state probability densities in the Langevin models. The UKF allows for recursive estimation of unobserved states in stochastic nonlinear systems from incomplete observations. It keeps the full nonlinear system dynamics rather than linearizing it but truncates the filter probability density to a Gaussian in each step by only propagating means and covariances.
C. The likelihood function and model selection
D. Parameter uncertainty and ill-conditioning
An ensemble of size of parameter vectors consistent with the parameter uncertainty is generated as , where are column vectors of length of independent standard normal random variables and is obtained from the Cholesky decomposition . This ensemble converts to an ensemble of drift and diffusion functions from which confidence intervals can be constructed. In this study, ensembles of size are used to calculate 90% confidence intervals, delimited by the 5% and 95% quantiles.
A possible ill-conditioning or ill-posedness of the parameter estimation problem can be monitored with the (spectral) condition number of given as , where and are the largest and smallest eigenvalues of the symmetric and positive definite matrix .
III. EXAMPLE SYSTEMS
The estimation methodology is explored and validated on a couple of simulated datasets from prototype example systems with known dynamics of increasing complexity. Similar systems were discussed in Ref. 14.
For each system, first the model selection is performed by increasing the number of drift and diffusion basis functions in steps of one and finding the model with the smallest BIC. Then, the reconstructed drift and diffusion functions for the chosen model are examined. Additionally, we look at some important summary statistics in long-term integrations of the true and the reconstructed model: the (stationary) probability density function of the process , the probability density function of the increment process , and the autocorrelation function of . These summary statistics are calculated from very long time series of data points at a sampling interval of 0.025. The probability densities are estimated using a recently proposed semiparametric maximum likelihood estimator17 unless they are available analytically. The correlation times of the systems are estimated by evaluating the integral of Eq. (4) from the sample autocorrelation function using the (composite) trapezoidal quadrature rule. In practice, a finite upper limit of the integral has to be adopted. As the sample autocorrelation function is rather noisy at large time lags, even if calculated from a very long time series, we monitor the correlation time as a function of the upper limit and look for a plateau.
A. Linear model with multiplicative noise
We first consider a linear model with strongly multiplicative noise.
1. White noise
The parameter values , , , , and are chosen here. The system exhibits a markedly positively skewed probability density; the mode of the distribution is below the mean value, and there is a long right tail in the region of high noise level. The correlation time of the system is . The reconstruction is based on a time series of length with sampling interval .
2. Correlated noise
B. Bistable model
Next, we examine a nonlinear bistable system with parabolic diffusion function. The drift function has odd and the diffusion function has even symmetry.
1. White noise
The parameter setting here is , , , and . The dynamics are bistable with stable fixed points at ; the probability density is bimodal with the modes at . The correlation time is . A time series of length at a sampling interval of is used for the reconstruction.
2. Correlated noise
C. Rayleigh process
Finally, we consider the Rayleigh process, here extended to a setting with multiplicative and correlated noise. This system has a non-polynomial drift function.
1. White noise
The parameter setting , , and is adopted here. The system has a stable fixed point at ; the mode of the probability density is at . The correlation time of the system is . A dataset of length with sampling interval is used for the estimation.
2. Correlated noise
IV. RESULTS AND DISCUSSION
A. Model selection
The model selection is illustrated in Tables I–VIII. The log-likelihood function at the maximum likelihood solution and the BIC are listed when successively increasing the number of drift and diffusion basis functions. The values are rounded to integers as more precision is not needed here. For all example systems, the sampling interval is divided into subintervals for propagating the probability densities of the states with the UKF (see Sec. II B).
For all example systems with a polynomial drift function (A.1, A.2, B.1, and B.2) the correct drift model is clearly chosen. The rational drift function of the Rayleigh process is approximated with a third-order polynomial in the white noise case and a fourth-order polynomial in the correlated noise case.
For all example systems, the true diffusion function lies outside the model classes covered by semiparametric modeling. For systems A.1, A.2, B.1, and B.2, the logarithm of the diffusion function can be well approximated on the data range by a quadratic polynomial; for system C.1 a fourth-order, and for system C.2, a fifth-order polynomial is chosen.
In the examples B.1 and B.2, the pattern of symmetry in both drift and diffusion is clearly visible (Tables III and V). A significant model improvement is only achieved when adding a basis function with the correct symmetry. This suggests rerunning the model selection with only odd drift and even diffusion basis functions, which leads to more parsimonious models (Tables IV and VI).
Model selection for example system A.1: Log-likelihood function and Bayesian information criterion (BIC) for various numbers of drift and diffusion basis functions. The model with minimum BIC is indicated in bold.
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | 818/−1619 | 1124/−2222 | 1143/−2252 | 1145/−2247 | 1145/−2239 |
{0, 1} | 942/−1858 | 1221/−2409 | 1259/−2476 | 1260/−2470 | 1260/−2461 |
{0, 1, 2} | 982/−1931 | 1221/−2400 | 1259/−2468 | 1260/−2461 | 1260/−2453 |
{0, 1, 2, 3} | 1003/−1963 | 1234/−2416 | 1260/−2460 | 1260/−2453 | 1261/−2444 |
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | 818/−1619 | 1124/−2222 | 1143/−2252 | 1145/−2247 | 1145/−2239 |
{0, 1} | 942/−1858 | 1221/−2409 | 1259/−2476 | 1260/−2470 | 1260/−2461 |
{0, 1, 2} | 982/−1931 | 1221/−2400 | 1259/−2468 | 1260/−2461 | 1260/−2453 |
{0, 1, 2, 3} | 1003/−1963 | 1234/−2416 | 1260/−2460 | 1260/−2453 | 1261/−2444 |
As Table I but for example system A.2.
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | 11 187/−22 349 | 11 456/−22 877 | 11 474/−22 906 | 11 485/−22 918 | 11 486/−22 912 |
{0, 1} | 11 243/−22 453 | 11 503/−22 964 | 11 539/−23 026 | 11 539/−23 019 | 11 539/−23 011 |
{0, 1, 2} | 11 333/−22 623 | 11 504/−22 956 | 11 539/−23 018 | 11 539/−23 011 | 11 539/−23 002 |
{0, 1, 2, 3} | 11 340/−22 628 | 11 527/−22 995 | 11 539/−23 010 | 11 540/−23 003 | 11 540/−22 994 |
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | 11 187/−22 349 | 11 456/−22 877 | 11 474/−22 906 | 11 485/−22 918 | 11 486/−22 912 |
{0, 1} | 11 243/−22 453 | 11 503/−22 964 | 11 539/−23 026 | 11 539/−23 019 | 11 539/−23 011 |
{0, 1, 2} | 11 333/−22 623 | 11 504/−22 956 | 11 539/−23 018 | 11 539/−23 011 | 11 539/−23 002 |
{0, 1, 2, 3} | 11 340/−22 628 | 11 527/−22 995 | 11 539/−23 010 | 11 540/−23 003 | 11 540/−22 994 |
As Table I but for example system B.1.
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | −5365/10 749 | −5364/10 756 | −5341/10 718 | −5340/10 726 | −5308/10 672 |
{0, 1} | −5100/10 228 | −5099/10 235 | −5091/10 229 | −5090/10 236 | −5060/10 184 |
{0, 1, 2} | −5098/10 233 | −5097/10 241 | −5089/10 234 | −5089/10 243 | −5057/10 188 |
{0, 1, 2, 3} | −4769/9584 | −4769/9593 | −4744/9552 | −4744/9561 | −4742/9568 |
{0, 1, 2, 3, 4} | −4768/9592 | −4768/9601 | −4743/9560 | −4743/9569 | −4742/9576 |
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | −5365/10 749 | −5364/10 756 | −5341/10 718 | −5340/10 726 | −5308/10 672 |
{0, 1} | −5100/10 228 | −5099/10 235 | −5091/10 229 | −5090/10 236 | −5060/10 184 |
{0, 1, 2} | −5098/10 233 | −5097/10 241 | −5089/10 234 | −5089/10 243 | −5057/10 188 |
{0, 1, 2, 3} | −4769/9584 | −4769/9593 | −4744/9552 | −4744/9561 | −4742/9568 |
{0, 1, 2, 3, 4} | −4768/9592 | −4768/9601 | −4743/9560 | −4743/9569 | −4742/9576 |
As Table III but with only odd drift basis functions and only even diffusion basis functions.
. | . | ||
---|---|---|---|
. | {0} . | {0, 2} . | {0, 2, 4} . |
{1} | −5102/10 222 | −5094/10 215 | −5063/10 164 |
{1, 3} | −4771/9569 | −4745/9527 | −4744/9534 |
{1, 3, 5} | −4770/9577 | −4744/9534 | −4744/9543 |
. | . | ||
---|---|---|---|
. | {0} . | {0, 2} . | {0, 2, 4} . |
{1} | −5102/10 222 | −5094/10 215 | −5063/10 164 |
{1, 3} | −4771/9569 | −4745/9527 | −4744/9534 |
{1, 3, 5} | −4770/9577 | −4744/9534 | −4744/9543 |
As Table I but for example system B.2.
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | 10 635/−21 243 | 10 636/−21 235 | 10 662/−21 278 | 10 662/−21 269 | 10 665/−21 267 |
{0, 1} | 10 746/−21 455 | 10 747/−21 448 | 10 906/−21 757 | 10 907/−21 750 | 10 940/−21 806 |
{0, 1, 2} | 10 751/−21 455 | 10 751/−21 447 | 10 908/−21 752 | 10 908/−21 743 | 10 942/−21 801 |
{0, 1, 2, 3} | 11 476/−22 897 | 11 476/−22 888 | 11 520/−22 966 | 11 520/−22 957 | 11 520/−22 948 |
{0, 1, 2, 3, 4} | 11 476/−22 888 | 11 476/−22 878 | 11 520/−22 957 | 11 520/−22 948 | 11 520/−22 939 |
. | . | ||||
---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . |
{0} | 10 635/−21 243 | 10 636/−21 235 | 10 662/−21 278 | 10 662/−21 269 | 10 665/−21 267 |
{0, 1} | 10 746/−21 455 | 10 747/−21 448 | 10 906/−21 757 | 10 907/−21 750 | 10 940/−21 806 |
{0, 1, 2} | 10 751/−21 455 | 10 751/−21 447 | 10 908/−21 752 | 10 908/−21 743 | 10 942/−21 801 |
{0, 1, 2, 3} | 11 476/−22 897 | 11 476/−22 888 | 11 520/−22 966 | 11 520/−22 957 | 11 520/−22 948 |
{0, 1, 2, 3, 4} | 11 476/−22 888 | 11 476/−22 878 | 11 520/−22 957 | 11 520/−22 948 | 11 520/−22 939 |
As Table V but with only odd drift basis functions and only even diffusion basis functions.
. | . | ||
---|---|---|---|
. | {0} . | {0, 2} . | {0, 2, 4} . |
{1} | 10 745/−21 462 | 10 906/−21 774 | 10 940/−21 833 |
{1, 3} | 11 475/−22 913 | 11 518/−22 990 | 11 518/−22 982 |
{1, 3, 5} | 11 481/−22 916 | 11 519/−22 983 | 11 519/−22 974 |
. | . | ||
---|---|---|---|
. | {0} . | {0, 2} . | {0, 2, 4} . |
{1} | 10 745/−21 462 | 10 906/−21 774 | 10 940/−21 833 |
{1, 3} | 11 475/−22 913 | 11 518/−22 990 | 11 518/−22 982 |
{1, 3, 5} | 11 481/−22 916 | 11 519/−22 983 | 11 519/−22 974 |
As Table I but for example system C.1.
. | . | |||||
---|---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . | {0, 1, 2, 3, 4, 5} . |
{0} | 394/−769 | 959/−1891 | 984/−1930 | 985/−1923 | 985/−1914 | 988/−1910 |
{0, 1} | 605/−1183 | 1162/−2288 | 1205/−2365 | 1210/−2365 | 1215/−2365 | 1215/−2357 |
{0, 1, 2} | 710/−1384 | 1170/−2294 | 1215/−2375 | 1223/−2381 | 1227/−2380 | 1227/−2371 |
{0, 1, 2, 3} | 770/−1495 | 1193/−2332 | 1221/−2377 | 1228/−2382 | 1234/−2386 | 1234/−2377 |
{0, 1, 2, 3, 4} | 786/−1516 | 1197/−2329 | 1222/−2371 | 1228/−2373 | 1234/−2376 | 1234/−2368 |
. | . | |||||
---|---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . | {0, 1, 2, 3, 4, 5} . |
{0} | 394/−769 | 959/−1891 | 984/−1930 | 985/−1923 | 985/−1914 | 988/−1910 |
{0, 1} | 605/−1183 | 1162/−2288 | 1205/−2365 | 1210/−2365 | 1215/−2365 | 1215/−2357 |
{0, 1, 2} | 710/−1384 | 1170/−2294 | 1215/−2375 | 1223/−2381 | 1227/−2380 | 1227/−2371 |
{0, 1, 2, 3} | 770/−1495 | 1193/−2332 | 1221/−2377 | 1228/−2382 | 1234/−2386 | 1234/−2377 |
{0, 1, 2, 3, 4} | 786/−1516 | 1197/−2329 | 1222/−2371 | 1228/−2373 | 1234/−2376 | 1234/−2368 |
As Table I but for example system C.2.
. | . | ||||||
---|---|---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . | {0, 1, 2, 3, 4, 5} . | {0, 1, 2, 3, 4, 5, 6} . |
{0} | 12 807/−25 586 | 13 720/−27 403 | 13 821/−27 595 | 13 848/−27 641 | 13 865/−27 665 | 13 870/−27 667 | 13 871/−27 659 |
{0, 1} | 13 003/−25 969 | 13 986/−27 926 | 14 089/−28 123 | 14 121/−28 177 | 14 142/−28 211 | 14 165/−28 248 | 14 171/−28 249 |
{0, 1, 2} | 13 326/−26 605 | 14 018/−27 982 | 14 133/−28 202 | 14 159/−28 244 | 14 178/−28 274 | 14 196/−28 299 | 14 198/−28 296 |
{0, 1, 2, 3} | 13 561/−27 066 | 14 114/−28 163 | 14 142/−28 211 | 14 184/−28 284 | 14 200/−28 307 | 14 213/−28 324 | 14 214/−28 318 |
{0, 1, 2, 3, 4} | 13 695/−27 326 | 14 130/−28 186 | 14 172/−28 262 | 14 186/−28 279 | 14 208/−28 314 | 14 219/−28 327 | 14 219/−28 319 |
{0, 1, 2, 3, 4, 5} | 13 730/−27 386 | 14 156/−28 230 | 14 186/−28 279 | 14 201/−28 302 | 14 208/−28 305 | 14 221/−28 321 | 14 221/−28 313 |
. | . | ||||||
---|---|---|---|---|---|---|---|
. | {0} . | {0, 1} . | {0, 1, 2} . | {0, 1, 2, 3} . | {0, 1, 2, 3, 4} . | {0, 1, 2, 3, 4, 5} . | {0, 1, 2, 3, 4, 5, 6} . |
{0} | 12 807/−25 586 | 13 720/−27 403 | 13 821/−27 595 | 13 848/−27 641 | 13 865/−27 665 | 13 870/−27 667 | 13 871/−27 659 |
{0, 1} | 13 003/−25 969 | 13 986/−27 926 | 14 089/−28 123 | 14 121/−28 177 | 14 142/−28 211 | 14 165/−28 248 | 14 171/−28 249 |
{0, 1, 2} | 13 326/−26 605 | 14 018/−27 982 | 14 133/−28 202 | 14 159/−28 244 | 14 178/−28 274 | 14 196/−28 299 | 14 198/−28 296 |
{0, 1, 2, 3} | 13 561/−27 066 | 14 114/−28 163 | 14 142/−28 211 | 14 184/−28 284 | 14 200/−28 307 | 14 213/−28 324 | 14 214/−28 318 |
{0, 1, 2, 3, 4} | 13 695/−27 326 | 14 130/−28 186 | 14 172/−28 262 | 14 186/−28 279 | 14 208/−28 314 | 14 219/−28 327 | 14 219/−28 319 |
{0, 1, 2, 3, 4, 5} | 13 730/−27 386 | 14 156/−28 230 | 14 186/−28 279 | 14 201/−28 302 | 14 208/−28 305 | 14 221/−28 321 | 14 221/−28 313 |
A comprehensive investigation of the model selection would involve running ensembles of realizations of the example systems and monitoring how often which model is chosen in the reconstruction as a function of the length of the data set and the sampling interval . We leave this to future work.
B. Reconstruction of drift and diffusion functions
The reconstructed drift and diffusion functions of the various example systems are displayed in Figs. 1 and 2 together with pointwise 90% confidence intervals. A canonical choice for the block length in the uncertainty estimates (see Sec. II D) is , the largest integer smaller or equal to , that is, the number of data points per correlation time. Outside the range of the learning dataset, a linear extrapolation is applied to the drift functions and a constant extrapolation to the diffusion functions. For the systems driven by correlated noise (A.2, B.2 and C.2), the estimates of the correlation time parameter with their uncertainties are listed in Table IX.
Estimation of the correlation time parameter γ for the systems driven by correlated noise (A.2, B.2, and C.2): True and reconstructed values with 90% confidence intervals.
System . | True . | Reconstructed . |
---|---|---|
A.2 | 0.200 | 0.189 ([0.153, 0.234]) |
B.2 | 1.000 | 0.950 ([0.869, 1.042]) |
C.2 | 0.500 | 0.559 ([0.491, 0.635]) |
System . | True . | Reconstructed . |
---|---|---|
A.2 | 0.200 | 0.189 ([0.153, 0.234]) |
B.2 | 1.000 | 0.950 ([0.869, 1.042]) |
C.2 | 0.500 | 0.559 ([0.491, 0.635]) |
Within the range of the learning data sets both the drift and diffusion functions are very accurately identified. In the long tails of the linear models and the Rayleigh processes the errors and the uncertainties are somewhat larger. The correlation time parameter is also reasonably well estimated in all cases. The uncertainties of the diffusion function estimates are clearly smaller for the white noise models than for the correlated noise models, whereas the drift functions are slightly more accurately estimated in the correlated noise models. The reason for this is that in the white noise case process increments are in lowest order determined by the noise contribution and the drift enters only in the next order, whereas in the correlated noise case, drift and diffusion contribute in the same order.
Systems driven by white noise: True and reconstructed drift function (top) and diffusion function (bottom) with pointwise 90% confidence intervals. From left to right: linear model (A.1), bistable model (B.1), and Rayleigh process (C.1). Dotted vertical lines indicate the range of the learning dataset.
Systems driven by white noise: True and reconstructed drift function (top) and diffusion function (bottom) with pointwise 90% confidence intervals. From left to right: linear model (A.1), bistable model (B.1), and Rayleigh process (C.1). Dotted vertical lines indicate the range of the learning dataset.
As Fig. 1 but for systems driven by correlated noise (A.2, B.2, and C.2).
Bimodality of the likelihood function for system A.2: True and estimated system parameters as well as log-likelihood function and Bayesian information criterion for the two maxima.
. | α0 . | α1 . | β0 . | β1 . | β2 . | γ . | /BIC . |
---|---|---|---|---|---|---|---|
True | 2.000 | −2.000 | 0.200 | ||||
Global maximum | 1.997 | −1.958 | −1.568 | 1.524 | −0.377 | 0.189 | 11 539/−23 026 |
Secondary local maximum | 0.274 | −0.265 | −2.487 | 1.161 | −0.228 | 1.751 | 11 481/−22 911 |
. | α0 . | α1 . | β0 . | β1 . | β2 . | γ . | /BIC . |
---|---|---|---|---|---|---|---|
True | 2.000 | −2.000 | 0.200 | ||||
Global maximum | 1.997 | −1.958 | −1.568 | 1.524 | −0.377 | 0.189 | 11 539/−23 026 |
Secondary local maximum | 0.274 | −0.265 | −2.487 | 1.161 | −0.228 | 1.751 | 11 481/−22 911 |
C. Summary statistics
Due to the symmetries of the drift and diffusion functions, the probability density in the systems B.1 and B.2 is symmetric about zero. The same holds for the reconstructed system preserving the symmetries of the drift and diffusion functions. The symmetry of the probability density is prescribed in the semiparametric estimator17 as there may be an imbalance between the two states due to their metastability even in very long time series.
For example systems C.1 and C.2, we apply a reflecting boundary condition at zero when integrating the reconstructed models. Otherwise, there would be a finite probability of reaching negative values of as the singularity of the drift function at zero cannot be captured in the reconstruction. This probability is very small, though, and the boundary at zero is actually never hit in our integrations. We here also apply a reflecting boundary condition at the largest data point in the learning dataset. The estimate of the drift function in the right tail of the data set is very uncertain due to the high noise level there and the extrapolation may lead to a positive drift for very large and, thus, the escape of trajectories to infinity.
The summary statistics as estimated from a long-term integration of the true and reconstructed models are displayed in Figs. 3 and 4. We observe excellent agreement in all these statistics between the true and the reconstructed systems with only very small errors. Even details of heavy-tailed increment probability densities are very well captured.
Summary statistics for true and reconstructed systems driven by white noise: probability density (top), increment probability density (middle), and autocorrelation function (bottom). From left to right: linear model (A.1), bistable model (B.1), and Rayleigh process (C.1).
Summary statistics for true and reconstructed systems driven by white noise: probability density (top), increment probability density (middle), and autocorrelation function (bottom). From left to right: linear model (A.1), bistable model (B.1), and Rayleigh process (C.1).
As Fig. 3 but for systems driven by correlated noise (A.2, B.2, and C.2).
D. Data requirements
The time series here extend over between 185 and 870 times the correlation time of and between 10 and 27 samples per correlation time are used. The method is fairly robust with respect to these choices. It still provides virtually unbiased estimates of the drift and diffusion functions from shorter time series; obviously, the uncertainty increases with decreasing time series length . Practically useful estimates can be obtained down to a length of about 50 times the correlation time. If the sampling interval becomes too large, the model parameter estimates are biased as the Gaussian approximation of the UKF is no longer sufficiently valid. When and how strongly this effect shows up depends on the system under consideration, in particular, on the strength of the nonlinearity and the state-dependence of the driving noise.
V. CONCLUSIONS AND OUTLOOK
A methodology for reconstructing Markovian and non-Markovian SDEs from time series data is introduced and discussed. It makes use of the unscented Kalman filter to estimate states and propagate their probability densities and then employs the filter predictions to estimate model parameters, leading to a principled maximum likelihood technique that naturally comes with uncertainty estimates. The drift and diffusion functions are represented in a semiparametric manner.
The technique is validated on a couple of synthetic time series datasets with known underlying dynamics ranging from linear systems, to nonlinear bistable systems, to intermittent systems. The maximization of the likelihood function is robust and fast. Drift and diffusion functions are accurately reconstructed, even if they lie outside of the prescribed polynomial model class.
The computational effort of the present method is considerably higher than that of direct estimation methods,13,14 but it is still not too onerous and much lower than the computational cost of Markov chain Monte Carlo techniques.10 The direct estimation method of Ref. 13 has the restriction that the correlation time of the driving noise is required to be much shorter than that of the variable . The system reconstruction with the present method is very accurate for datasets of moderate length, whereas direct estimation methods13,14 require extremely long datasets to produce any reasonable results. Furthermore, the present method provides a straightforward optimization procedure without the need to tune hyperparameters by hand for each example system and it comes naturally with uncertainty estimates for the drift and diffusion functions, both of which is not the case with direct estimation methods.13,14 In summary, the Kalman filter maximum likelihood approach to identification of nonlinear stochastic systems appears to offer a very good compromise in terms of computation time, accuracy, and data requirements.
A detailed study of the model selection and reconstruction accuracy depending on the length of the time series and the sampling interval using multiple realizations of example systems is left for future work.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
F. Kwasniok: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Writing – original draft (equal); Writing – review and editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.