This study presents a general framework, namely, Sparse Spatiotemporal System Discovery (S3d), for discovering dynamical models given by partial differential equations (PDEs) from spatiotemporal data. S3d builds on recent developments in sparse Bayesian learning, which enforce sparsity in the estimated PDEs. This approach balances model complexity against fitting error with theoretical guarantees. The proposed framework integrates Bayesian inference and a sparse prior distribution with the sparse regression method. It also introduces a principled iterative re-weighted algorithm to select dominant features in PDEs and solve for the sparse coefficients. We demonstrate the discovery of the complex Ginzburg–Landau equation from a traveling-wave convection experiment, as well as several other PDEs, including the important cases of the Navier–Stokes and sine-Gordon equations, from simulated data.
Data-driven methods provide a viable alternative to more traditional approaches, such as first-principles derivation, for accelerating the discovery of physical laws. This work develops a sparse Bayesian learning-based framework for modeling spatiotemporal dynamical systems described by partial differential equations (PDEs). The proposed framework, based on sparse Bayesian learning (SBL), can reconstruct a variety of prototypical PDEs, including the Navier–Stokes equation, the FitzHugh–Nagumo equation for nerve conduction, and the Schrödinger equation in quantum mechanics, using a small number of samples compared to state-of-the-art (SOTA) methods. In particular, we demonstrate that the data-driven discovery method can automatically extract the first-principles-based coupled complex Ginzburg–Landau equations from data collected in a convection experiment.
I. INTRODUCTION
Recently, there has been a surge of interest in the data-driven discovery of physical laws, which applies computational methods to discover representations from data. There have been reports of many successes using data-driven methods, such as learning a nonlinear mapping from time series,1 predicting catastrophes in nonlinear systems,2 creating autonomous smart systems capable of self-adaption and self-regulation for short- and long-term production management,3 equation-free modeling of multi-scale/complex systems,4 inferring models of nonlinear coupled dynamical systems,5 reconstructing networks with stochastic dynamical processes,6 and extracting Koopman modes from data.7–10
In the system identification community, a standard approach is to transform the model discovery problem into a regression problem and to design a good optimization algorithm to solve it.11,12 Several classes of numerical algorithms exist for such problems. The first builds on ordinary least-squares regression; examples include orthogonal reduction,13,14 the backward elimination scheme,15,16 the least-squares estimator,17 and the alternating conditional expectation algorithm.18–20 These approaches achieve good fitting accuracy, as described in Ref. 21, but require a relatively large amount of data to train. The second type is the sparse regression approach, in which a sparsity prior on the underlying governing equations is incorporated to tackle the outstanding issue of overfitting. By converting the discovery of differential equations into a sparse regression problem, Ref. 22 and, later, Brunton et al.23 developed the sparse identification of nonlinear dynamics framework to extract ODEs from noisy measurement data. Rudy et al.24 and Schaeffer25 subsequently extended the method of Ref. 23 to recover PDEs from spatiotemporal data.
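The sparse-regression route of Refs. 23 and 24 can be sketched as sequential thresholded least squares (a minimal illustration in our own notation, not the exact PDE-FIND implementation):

```python
import numpy as np

def stls(Theta, ut, threshold=0.05, max_iter=10):
    """Sequential thresholded least squares, SINDy/PDE-FIND style.

    Theta : (m, p) dictionary matrix of candidate terms
    ut    : (m,)   time-derivative observations
    Returns a sparse coefficient vector w with Theta @ w ~= ut.
    """
    w = np.linalg.lstsq(Theta, ut, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(w) < threshold          # prune weak terms
        w[small] = 0.0
        big = ~small
        if not big.any():
            break
        # re-fit least squares on the surviving support only
        w[big] = np.linalg.lstsq(Theta[:, big], ut, rcond=None)[0]
    return w
```

On a noiseless synthetic problem with two active terms, this loop recovers the exact support after a single thresholding pass.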
Regarding the development of sparse optimization, the sparse Bayesian learning (SBL) method, originally proposed in Ref. 26 and further developed in Refs. 27–29, adopts a Bayesian probabilistic framework to find sparse solutions. By treating the parameter as a random variable with a sparsity-inducing prior distribution determined by a set of hyperparameters and then calculating the posterior distribution of the to-be-estimated parameter given the data, SBL has been shown to be more general and powerful in finding a maximally sparse solution. For instance, maximum a posteriori (MAP) estimation with a fixed weight prior is a special case;28 the Lasso problem can be interpreted as the explicit MAP estimate with a Laplace prior.30,31 Specifically, Wipf et al.32 established the equivalence of the SBL cost function and the objectives of canonical MAP-based sparse methods (e.g., the Lasso) via a dual-space analysis.
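The Lasso-as-MAP connection can be made explicit. Assuming a Gaussian likelihood for y = Φw + e with e ~ N(0, σ²I) and an i.i.d. Laplace prior p(w_i) ∝ exp(−|w_i|/b) (our notation, a standard derivation),

```latex
\hat{w}_{\mathrm{MAP}}
  = \arg\max_{w}\; p(y \mid w)\,p(w)
  = \arg\min_{w}\; \frac{1}{2\sigma^{2}}\,\| y - \Phi w \|_{2}^{2}
    + \frac{1}{b}\,\| w \|_{1},
```

which is exactly the Lasso objective with regularization parameter λ = 2σ²/b.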
In this work, we propose an SBL framework for data-driven PDE discovery. This framework integrates Bayesian inference and a sparse prior distribution with the sparse regression method. In our sparse Bayesian derivation, an iterative re-weighted ℓ1-minimization algorithm is derived to approximate the original ℓ0-minimization problem. Compared with PDE-FIND with ridge regression, which uses a sequential thresholded least-squares algorithm, the re-weighted ℓ1-minimization algorithm is known to pick the sparser solution and, therefore, better approximates the solution to the original problem. Additionally, we analyze spatiotemporal data observed from an experiment on convection in an ethanol/water mixture in a long, narrow, annular container heated from below, and demonstrate that the proposed data-driven discovery method automatically extracts the governing PDEs from spatiotemporal binary-fluid convection data. This is distinct from previous works,24,33 in which the data used for discovering the PDEs were generated from numerical solutions of known PDEs. Our key empirical findings are the following:
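A minimal sketch of the iterative re-weighted ℓ1 scheme follows (this is an illustration, not the paper's exact CCP-based SBL update; the step size, weight rule 1/(|w| + ε), and solver are our choices). Each outer iteration solves a weighted Lasso by proximal gradient descent (ISTA), with per-coefficient weights taken from the previous iterate:

```python
import numpy as np

def weighted_lasso_ista(Phi, y, u, lam=0.1, n_steps=500):
    """Solve min_w 0.5*||y - Phi w||^2 + lam * sum_i u_i*|w_i| via ISTA."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2      # 1/L, L = Lipschitz constant
    w = np.zeros(Phi.shape[1])
    for _ in range(n_steps):
        g = Phi.T @ (Phi @ w - y)                 # gradient of the smooth part
        z = w - step * g
        t = step * lam * u                        # per-coefficient threshold
        w = np.sign(z) * np.maximum(np.abs(z) - t, 0.0)   # soft-thresholding
    return w

def reweighted_l1(Phi, y, lam=0.1, eps=1e-3, n_outer=5):
    """Iteratively re-weighted l1: weights 1/(|w|+eps) sharpen sparsity."""
    u = np.ones(Phi.shape[1])
    w = np.zeros(Phi.shape[1])
    for _ in range(n_outer):
        w = weighted_lasso_ista(Phi, y, u, lam=lam)
        u = 1.0 / (np.abs(w) + eps)
    return w
```

The re-weighting makes the effective penalty small on large coefficients and very large on near-zero ones, which is why it approximates ℓ0-minimization better than a single Lasso pass.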
We demonstrate that an interpretable first-principles model can be discovered from inherently noisy experimental data. The proposed SBL method discovers a coupled complex Ginzburg–Landau equation (CGLE) from spatiotemporal binary-fluid convection data. Furthermore, the discovered values of the coefficients in the CGLE, such as the group velocity, the dispersion coefficients, the saturation parameter, and the coupling coefficient, agree well with the theoretical and experimental values provided in Refs. 19, 34, and 35. Moreover, the numerical solution of the discovered CGLE matches the observed experimental phenomenon.
We demonstrate that the proposed SBL framework is capable of reconstructing a variety of prototypical PDEs, including the Navier–Stokes equation in fluid mechanics,36 the FitzHugh–Nagumo equation for nerve conduction,37 the Schrödinger equation in quantum mechanics,38 the sine-Gordon equation, and the Korteweg–de Vries equation. Compared with the state-of-the-art PDE-FIND method in Ref. 24 and the Douglas–Rachford method in Ref. 25, the proposed algorithm is robust to measurement noise and is able to discover the underlying PDEs using a much smaller number of samples.
The rest of this paper is organized as follows: In Sec. II, a Bayesian framework for discovering PDEs from spatiotemporal data is presented. In Secs. III and IV, the proposed framework is applied to the discovery of PDE models using both experimental and synthetic data. Conclusions and perspectives are drawn in Sec. V.
Notation: Throughout this paper, we denote vectors by lowercase letters, e.g., x, and matrices by uppercase letters, e.g., X. For a vector x and a matrix X, we denote their transposes by x^T and X^T, respectively. Furthermore, x_i refers to the ith element of x. ||x||_2 and ||x||_1 denote the Euclidean norm and the l1 norm, respectively. For a differentiable function f, we call f' the derivative of f and the kth partial derivative of f with respect to x its kth spatial derivative. For a convex function g, we say a vector v is a subgradient of g at x if, for all y, g(y) ≥ g(x) + v^T(y − x). The set of all subgradients at x is called the subdifferential at x and is denoted ∂g(x).
II. S3d: A BAYESIAN FRAMEWORK TO DISCOVER PDES FROM SPATIOTEMPORAL DATA
Schematics of the method used to infer the CGLE from data. (a) Data collection. Data are collected from experiments on a real system (or from a numerical solution of a PDE). The collected data are formed into a matrix. (b) Model specification. A large library of candidate terms including the potential features for the to-be-identified PDE is constructed. The candidate terms include the state, its spatial derivatives up to second order, and nonlinear coupling terms between them. The time derivative is regarded as the model output. The collected data are used to approximate the time derivative of the state and all the candidate terms. A set of linear equations is then defined to represent the underlying PDE. (c) Sparse Bayesian learning. The sparse regression problem is solved using an iterative re-weighted ℓ1-minimization algorithm.
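The library construction in panel (b) can be sketched in a one-dimensional setting (a hypothetical minimal version; the actual feature set, derivative scheme, and variable names are application-specific):

```python
import numpy as np

def build_library(U, dx, dt):
    """Build u_t and a dictionary of candidate terms from a space-time grid.

    U : (nx, nt) array of state samples u(x_i, t_j).
    Returns (ut, Theta, names); rows index grid points, flattened row-major.
    """
    ut = np.gradient(U, dt, axis=1)          # time derivative (finite diff.)
    ux = np.gradient(U, dx, axis=0)          # first spatial derivative
    uxx = np.gradient(ux, dx, axis=0)        # second spatial derivative
    feats = {
        "1": np.ones_like(U),
        "u": U,
        "u^2": U**2,
        "u_x": ux,
        "u_xx": uxx,
        "u*u_x": U * ux,
    }
    Theta = np.column_stack([f.ravel() for f in feats.values()])
    return ut.ravel(), Theta, list(feats.keys())
```

A sparse solver applied to Theta w ≈ ut then selects the few active terms of the governing PDE.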
A. Model specification
B. Sparse Bayesian learning for S3d
C. Convergence guarantee of the sparse Bayesian learning algorithm
SBL algorithm based on CCP.
Input: the design matrix, the observation vector, and a tolerance.
Output: the estimated sparse coefficient vector.
The algorithm initializes the estimate and the weights and then repeats the re-weighted update steps until the change between successive iterates falls below the tolerance.
III. DISCOVERY OF COMPLEX GINZBURG–LANDAU EQUATIONS FROM BINARY-FLUID CONVECTION EXPERIMENT
In this section, we apply the proposed method to discover the complex Ginzburg–Landau equation (CGLE) from a binary-fluid convection experiment. Traveling-wave (TW) convection in binary fluids is a well-known, paradigmatic system for studying the physical mechanisms of non-equilibrium pattern formation.44 Substantial experiments on TW convection have unfolded many different dynamical states of TW. A major challenge in the scientific study of TW is quantitatively understanding and predicting the underlying dynamics. First-principles models based on the CGLE and its variants45,46 have long been used to quantitatively explain experimental observations.
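For orientation, the coupled CGLEs for the right- and left-going TW amplitudes A_R and A_L are commonly written as follows (our transcription of the standard form; coefficient names and sign conventions vary across Refs. 45 and 46):

```latex
\tau_{0}\left(\partial_{t} A_{R} + s\,\partial_{x} A_{R}\right)
  = \varepsilon\,(1 + i c_{0})\,A_{R}
  + \xi_{0}^{2}\,(1 + i c_{1})\,\partial_{x}^{2} A_{R}
  - g\,(1 - i c_{2})\,\lvert A_{R}\rvert^{2} A_{R}
  - h\,(1 - i c_{3})\,\lvert A_{L}\rvert^{2} A_{R},
```

with the mirror equation for A_L obtained by x → −x. Here, s is the group velocity, ξ0 and c1 set the dispersion, g is the saturation parameter, and h is the coupling coefficient between the counterpropagating waves.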
A. Experimental data description
The experimental data are collected from an experiment on convection in an ethanol/water mixture in a long, narrow, annular container heated from below.19,34 The objective of the experiment is to probe the counterpropagating wave-packet formation regime. As reported in Refs. 34, 47, and 48, a regular, small-amplitude traveling-wave (TW) state is observed in this experiment, consisting of pairs of quasilinear wave packets that propagate around the annular cell in opposite directions, referred to as "left" and "right." In Fig. 2, we show the dynamics of the TW state for the bifurcation parameter (scaled by the characteristic time defined below), where the left-going TW rapidly moves to the left with small amplitude and collides periodically with the right-going TW.
The dynamics of the observed TW state. The top row shows the change in the amplitudes of the left-going (blue solid and dashed lines) and right-going (red solid and dashed lines) wave components at a particular time and at the subsequent sampling time. (a) The behavior of the left-going amplitudes; (b) the behavior of the right-going amplitudes; (c) the behavior of the sum of the left- and right-going amplitudes.
Overview of the TW data in the designed experiment. The labels "cal" and "car" represent the left-going and right-going waves, respectively. The files record the experimental data. Seven bifurcation parameters, scaled by the characteristic time τ0, represent the same experiment conducted with different bifurcation parameters. All samples used for analysis are truncated from the 21st to the 780th sampling point,19 so that a total of 180 × 760 data points is used for the analysis. We then form the snapshot matrices.
Experiment label | Bifurcation parameter | Sample number | Sampling interval | Space interval | Number of points after truncation
---|---|---|---|---|---
cal06172/car06172 | γ = 9.32 × 10−3 | 988 × 180 | 1.5625 | 0.458 167 | 228 |
cal06182/car06182 | γ = 4.22 × 10−3 | 800 × 180 | 1.5625 | 0.458 167 | 40 |
cal06192/car06192 | γ = 1.77 × 10−3 | 1000 × 180 | 1.5625 | 0.458 167 | 240 |
cal06212/car06212 | γ = 6.38 × 10−3 | 1000 × 180 | 1.5625 | 0.458 167 | 240 |
cal06222/car06222 | γ = 12.07 × 10−3 | 1031 × 180 | 1.5625 | 0.458 167 | 271 |
cal06242/car06242 | γ = 14.03 × 10−3 | 1000 × 180 | 1.5625 | 0.458 167 | 240 |
cal06252/car06252 | γ = 16.28 × 10−3 | 1000 × 180 | 1.5625 | 0.458 167 | 240 |
B. Discovery of CGLE
We proceed in two steps: first, the proposed method determines the dominant physical terms from a large library of candidate terms. We construct the function dictionary using basic domain knowledge from fluid dynamics theory. Possible terms include those appearing in the Navier–Stokes equations or those appearing in the first-principles-derived complex Ginzburg–Landau equations (such as the higher-order nonlinear terms). In general, the amplitude itself and its derivatives up to some chosen order should be included in order to properly model the physical system. Furthermore, to correctly model states of counterpropagating TW, coupling terms must be added to account for the interactions between the left- and right-going TW. In this example, we take the third-order Volterra expansions of the two amplitudes and use the fourth-order compact Padé scheme50 for the derivative estimation. The Padé scheme is much more robust to noise than the explicit fourth-order difference scheme. We investigate the performance of the method on the last three sets of data in Table I, which have a larger signal-to-noise ratio. The second step is to identify the coefficient of each identified dominant term, where all the data displayed in Table I are used for the coefficient estimation. The candidate terms used for the right-going TW include the amplitudes, their spatial derivatives up to second order, and the third-order nonlinear and coupling terms between them.
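The fourth-order compact Padé first-derivative scheme can be sketched for periodic data, which is natural for the annular cell (a minimal dense-solve version of the standard scheme; a production code would use a cyclic tridiagonal solver):

```python
import numpy as np

def pade_first_derivative_periodic(f, h):
    """Fourth-order compact (Pade) first derivative on a periodic grid.

    Solves  (1/4) f'_{i-1} + f'_i + (1/4) f'_{i+1}
          = 3/(4h) * (f_{i+1} - f_{i-1})   for all i, indices mod N.
    """
    n = len(f)
    A = np.eye(n) + 0.25 * np.eye(n, k=1) + 0.25 * np.eye(n, k=-1)
    A[0, -1] = A[-1, 0] = 0.25               # periodic wrap-around entries
    rhs = 3.0 / (4.0 * h) * (np.roll(f, -1) - np.roll(f, 1))
    return np.linalg.solve(A, rhs)
```

Because each derivative value is implicitly coupled to its neighbors, the scheme achieves fourth-order accuracy with a three-point stencil, which helps explain its robustness to noise relative to explicit wide-stencil differences.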
The growth of the fitting error as the number of feature terms changes. We empirically adjust the regularization parameter while observing the change in the fitting error. Three sets of data were chosen for the test: car/cal06242, car/cal06222, and car/cal06252, from top to bottom. For each dataset, the fitting error slowly increases and tends to a relatively stable value [the five features in Eq. (3.1) emerge]; the error then begins to grow rapidly.
Among all the candidate terms used for the experimental data labeled car/cal06242, the feature terms identified by the proposed method under different regularization parameters.
Based on the analysis above, we further apply the proposed method to fine-tune the coefficients of the CGLE. For TW convection, most parameters in the CGLE (the coupling coefficient is an exception) have been measured in Ref. 35. Here, to present our identified results in an accessible form, we adopt the leave-one-out idea from machine learning to choose the regularization parameters and obtain the best-fit coefficients: we accept the parameters that lead to the smallest fitting error on the test sets. In Table II, we summarize the identified coefficients, from which it can be seen that the estimated coefficients mostly resemble the theoretical and experimental values (Refs. 19, 34, and 35 and references therein), except for some parameters that remain obscure in theory and experiment. In particular, we observe that for smaller bifurcation parameters the estimated coupling coefficients share the same characteristic, namely, that the coefficient for the right-going TW component is smaller than that for the left-going one. For the three datasets with the highest signal-to-noise ratio, however, we find that the estimated coefficients are unstable, which may arise from poor estimates of the derivatives. Additionally, it is interesting to note that all seven analyzed datasets have one thing in common, namely, the negativity of the coupling coefficients, which indicates a stabilizing nonlinear competition between the two TW components and a slowing down of the TW group velocity; this is similar to Ref. 19 and agrees with the experimental observation.34
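The hold-out selection of the regularization parameter can be sketched generically (a hypothetical helper in our notation; `fit` stands for any sparse solver that returns a coefficient vector):

```python
import numpy as np

def select_lambda(Theta, ut, lambdas, fit, n_train):
    """Pick the regularization parameter with the smallest held-out error.

    fit(Theta, ut, lam) -> coefficient vector; the first n_train rows are
    used for fitting and the remaining rows as the held-out test set.
    """
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        w = fit(Theta[:n_train], ut[:n_train], lam)
        err = np.linalg.norm(Theta[n_train:] @ w - ut[n_train:])
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err
```

With noiseless data and a well-conditioned dictionary, the smallest candidate parameter wins, as expected; with noisy data, the held-out error trades off sparsity against fit.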
Summary of the identified parameters of seven sets of real data: The data are labeled according to “right”(car) and “left”(cal). Each row represents the identified coefficients by S3d.
Coefficients in the discovered complex Ginzburg–Landau equation
Experiment label | s | | | | | | | |
---|---|---|---|---|---|---|---|---|---
cal06172cb | 0.5895 | 0.0065 | 0.3392 | −17.1619 | −6.4595 | 0.0443 | −0.2185 | −89.0851 | −139.7500 |
car06172cb | 0.5982 | 0.0079 | 0.9210 | 17.7800 | −25.2965 | 0.0640 | −0.1243 | −206.3442 | −150.6041 |
cal06182cb | 0.6268 | 0.0057 | 0.0000 | −78.0836 | −0.8076 | −0.0094 | −0.3774 | 0.0000 | −154.4844 |
car06182cb | 0.6308 | 0.0081 | 0.0935 | −90.2362 | −35.0632 | −0.0059 | −0.4397 | −55.8157 | −157.3323 |
cal06192cb | 0.5930 | 0.0054 | 0.0148 | −139.4494 | −2.5230 | −0.0066 | −0.6457 | −37.6346 | −152.7833 |
car06192cb | 0.5916 | 0.0087 | 0.0158 | −198.6799 | −34.9115 | −0.0066 | −0.9189 | −68.1182 | −141.3105 |
cal06212cb | 0.6063 | 0.0057 | 0.0639 | −45.0095 | −7.9602 | 0.0033 | −0.2732 | −35.3580 | −154.5495 |
car06212cb | 0.6131 | 0.0042 | 0.0202 | −26.1553 | −33.0555 | 0.0019 | −0.1489 | −17.7879 | −168.7832 |
cal06222cb | 0.5741 | 0.0173 | 1.7492 | 40.1657 | −15.1882 | 0.0281 | −0.0424 | −379.5853 | −136.5903 |
car06222cb | 0.5771 | 0.0127 | 1.3000 | 29.9928 | −16.6945 | 0.0114 | −0.0894 | −272.3181 | −161.5083 |
cal06242cb | 0.5509 | 0.0138 | 1.0701 | 16.2302 | −17.3846 | 0.0555 | −0.1505 | −234.8376 | −144.0718 |
car06242cb | 0.5597 | 0.0150 | 1.1368 | 13.1359 | −4.5687 | 0.0567 | −0.2108 | −237.9784 | −149.1559 |
cal06252cb | 0.5634 | 0.0199 | 1.1690 | 2.1061 | −5.2084 | 0.0867 | −0.2690 | −267.3739 | −169.0035 |
car06252cb | 0.5481 | 0.0131 | 1.1075 | 22.3429 | −3.9619 | 0.0794 | −0.1245 | −234.6182 | −179.6777 |
C. Model validation
Comparison of the spatiotemporal evolution of the superposition of the left- and right-TW amplitudes between the simulation and the experiment. Different initial values, selected columns of the snapshot matrices for the data labeled "cal/car06172," "cal/car06192," "cal/car06212," and "cal/car06182," are used for simulation (first column). The first 400 (or 200) snapshots of the real data and the simulation data are shown (columns 2 and 3).
Comparison of the spatiotemporal evolution of the superposition of the left- and right-TW amplitudes between the simulation and the experiment. Using the extended time and initial value (a selected column of the snapshot matrix), a regular burst of TW is seen.
IV. DISCOVERY OF PROTOTYPICAL PDES
Table III demonstrates that the S3d method has successfully discovered the sine-Gordon equation, the Korteweg–de Vries equation, the FitzHugh–Nagumo equation, the Schrödinger equation, and the Navier–Stokes equation. Due to the page limit, we only present the discovery results for the Navier–Stokes equation.
Summary of the discovery results for the sine-Gordon equation, the Korteweg–de Vries equation, the FitzHugh–Nagumo equation, the Schrödinger equation, and the NS equation.
PDE | Noise case | Points | Identified PDE | Mean(err) ± STD(err)
---|---|---|---|---
Sine-Gordon equation | No noise | 10 | utt = 0.9999uxx − 0.9986sin (u) | |
With noise | 50 | utt = 0.9934uxx − 0.9986sin (u) | ||
Korteweg–de Vries equation | No noise | 10 000 | ut = −0.000 484uxxx − 0.999 246uux | |
With noise | 18 750 | ut = −0.000 474uxxx − 0.974 795uux | ||
FitzHugh–Nagumo equation | No noise | |||
With noise | ||||
Schrödinger’s equation | No noise | 10 000 | ||
With noise | 10 000 | |||
Navier–Stokes equation | No noise | 10 000 | ||
With noise | 20 000 |
A. Navier–Stokes equation
Flow field for the 2D NS equation (4.1) on the torus, depicted at several time instants.
The dictionary matrix consists of the basis functions. The underlying dynamics is described by three state variables. Thus, our dictionary of basis functions includes linear combinations of the variables. With the collected snapshots, we use an explicit difference scheme to approximate the first- and second-order derivatives in the noise-free case. The discovery results show that our method performs strongly on the noise-free dataset, with a small MSE and a small STD. We enumerate the re-sampling region in Table III.
In the noisy case, we use the fourth-order compact Padé scheme with the method. We add Gaussian noise to the original snapshots and compute the resulting RMSE for each state variable. We apply the POD method used in Refs. 57 and 58 to de-noise our data: we reshape the snapshots of each state variable into a matrix and, by projecting the data onto 33 POD modes obtained using a fixed energy threshold of 99.99%, generate new de-noised snapshots from the POD modes and the corresponding POD coefficients; we then reverse the reshaping to recover the snapshots. We pay special attention to the Padé scheme: with the de-noised dataset, we approximate the first-order derivatives at each grid point and then compute the second-order derivatives using the Padé scheme. After computing the dictionary matrix, we correctly discover the NS equation, as shown in Table III.
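The POD de-noising step can be sketched with an SVD (a minimal version; the 99.99% energy threshold follows the text, while the function shape and names are our choices):

```python
import numpy as np

def pod_denoise(X, energy=0.9999):
    """Project the snapshot matrix X (space x time) onto leading POD modes.

    Keeps the smallest number of modes whose cumulative squared singular
    values capture `energy` of the total, then rebuilds a de-noised X.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cum, energy)) + 1    # number of retained modes
    return (U[:, :r] * s[:r]) @ Vt[:r], r
```

Truncating to the dominant modes filters the small-singular-value directions where white noise concentrates, which is what makes the subsequent derivative estimates usable.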
B. Comparison with SOTA methods on the prototypical examples from PDE-FIND24
Comparison of the identified results using the PDE-FIND and S3d methods on the original datasets from Ref. 24.
Example / Identified results | S3d: No. of samples | S3d: MSE(err)-STD(err) | PDE-FIND: No. of samples | PDE-FIND: MSE(err)-STD(err)
---|---|---|---|---
Korteweg–de Vries (single-soliton) | ||||
750 | 12 800 | |||
ut = −6.0011uux − 1.0005uxxx | 1500 | 25 600 | ||
Korteweg–de Vries (double-soliton) | ||||
ut = −5.9909uux − 0.9995uxxx | 9000 | 102 912 | ||
ut = −5.9250uux − 0.9867uxxx | 16 600 | 85 432 | ||
Burgers’ equation | ||||
ut = −0.9999uux + 0.1000uxx | 2000 | 25 856 | ||
ut = −1.0011uux + 0.1001uxx | 5000 | 19 116 | ||
Quantum harmonic oscillator | ||||
2000 | 205 312 | |||
2000 | 187 452 | |||
Nonlinear Schrödinger equation | ||||
ut = 0.4998iuxx + 1.0000i|u|2u | 4500 | 256 512 | ||
ut = 0.5034iuxx + 0.9987i|u|2u | 7200 | 236 652 |
V. CONCLUSION AND DISCUSSION
This work proposes a framework to discover PDEs from experimental observations and synthetic spatiotemporal data. The proposed method discovers the dynamics underlying a state of traveling-wave convection from experimental data, namely, the simplest coupled complex Ginzburg–Landau equations. Additionally, many canonical PDEs are re-discovered from synthetic spatiotemporal data. The merit of the proposed method is its ability to freely construct a model class from candidate functions and to automatically select the key ones that reproduce the observed spatiotemporal patterns. Benefiting from sparsity, the inferred PDEs are parsimonious and accurate, enabling interpretability. Across the various examples, we observe that the method robustly handles different types of noise and measurement data of small sample size. We expect the method to be useful for modeling spatiotemporal dynamics from experimental data. This framework, as demonstrated through numerous examples, could potentially accelerate the discovery of new laws and stimulate physical explanations of the discovered equations, leading to the discovery of the underlying mechanisms.
Although the dynamics we refer to in this work are Eulerian dynamics described by PDEs, S3d, as a general method, is also able to infer ODEs and static functional relations using the datasets in Refs. 24 and 59. S3d thus unifies results for the discovery of natural laws.5,59 Future work will focus on three important extensions of our method for a wider range of practical applications. First, the currently proposed method relies on estimating temporal and spatial derivatives of the measured state variables from data, while there are cases in which the system variables are not necessarily observable. Second, the current method can reconstruct equations that are linear in the parameters; we will extend it to cases that are nonlinear in the parameters. Finally, we assume that the constructed library is large enough to contain a sparse representation of the dynamics underlying the dataset. However, when there is no prior knowledge of the to-be-identified governing equation, the constructed library may be insufficient. In such cases, genetic symbolic regression works much better; we may integrate genetic symbolic regression and sparse regression into a unified framework to overcome this limitation.
ACKNOWLEDGMENTS
The authors acknowledge financial support from the National Natural Science Foundation of China under Grant Nos. 92167201 and 61903148.
We wish to thank Dr. Paul Kolodner for useful discussion and allowing us to use the experimental data.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Ye Yuan: Conceptualization (lead); Data curation (equal); Formal analysis (lead); Funding acquisition (lead); Investigation (lead); Methodology (lead); Project administration (lead); Resources (equal); Software (equal); Supervision (lead); Validation (equal); Visualization (equal); Writing – original draft (lead); Writing – review & editing (lead). Xiuting Li: Data curation (equal); Formal analysis (lead); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Writing – review & editing (equal). Liang Li: Formal analysis (equal); Investigation (equal); Software (equal); Validation (equal). Frank Jiang: Formal analysis (equal); Investigation (equal); Software (equal); Validation (equal). Xiuchuan Tang: Formal analysis (equal); Investigation (equal); Validation (equal). Fumin Zhang: Formal analysis (equal); Project administration (equal); Supervision (equal). Jorge Goncalves: Formal analysis (equal); Project administration (equal); Writing – review & editing (equal). Henning U. Voss: Data curation (equal); Investigation (equal); Supervision (equal). Han Ding: Funding acquisition (equal); Project administration (equal); Supervision (equal). Jürgen Kurths: Data curation (equal); Methodology (equal); Project administration (equal); Resources (equal); Writing – review & editing (equal).
DATA AVAILABILITY
All synthetic data and codes used in this paper are publicly available in GitHub at https://github.com/HAIRLAB/S3d, Ref. 60.