Parametric Learning of Time-Advancement Operators for Unstable Flame Evolution

This study investigates the application of machine learning, specifically Fourier Neural Operator (FNO) and Convolutional Neural Network (CNN), to learn time-advancement operators for parametric partial differential equations (PDEs). Our focus is on extending existing operator learning methods to handle additional inputs representing PDE parameters. The goal is to create a unified learning approach that accurately predicts short-term solutions and provides robust long-term statistics under diverse parameter conditions, facilitating computational cost savings and accelerating development in engineering simulations. We develop and compare parametric learning methods based on FNO and CNN, evaluating their effectiveness in learning parametric-dependent solution time-advancement operators for one-dimensional PDEs and realistic flame front evolution data obtained from direct numerical simulations of the Navier-Stokes equations.


I. INTRODUCTION
Resolving complex science and engineering problems often entails tackling nonlinear partial differential equations (PDEs), historically managed through numerical methods such as finite difference (FD), finite element (FE), or finite volume (FV) approaches. However, the computational demands of these methods have spurred the exploration of more efficient alternatives.
Recent advancements in machine learning and deep neural network methods, especially those adept at learning PDE solution operators, present promising solutions to these computational challenges. In the context of PDE solutions, operators play a crucial role, mapping one function to another. The focus of operator learning lies in methods capable of robustly learning PDE operators, providing excellent generalization performance.
Various operator learning methods have emerged in recent literature. An early notable method involves using deep convolutional neural networks (CNNs) [1][2][3][4][5][6][7], inspired by techniques from computer vision. These CNNs employ a finite-dimensional parameterization of the PDE operator, effectively mapping discretely represented functions between images. More recent advancements include a class of neural operator methods 8,9 capable of learning infinite-dimensional operators, exemplified by models like DeepOnet 10 and the Fourier Neural Operator (FNO) 11. The theoretical underpinnings of these approaches have garnered support [12][13][14], and both have demonstrated proficiency across a wide array of benchmark problems 15,16. Recent advancements have seen the further extension of neural operators, drawing inspiration from wavelet methods 17,18, and adapting methods for complex domains 19.
Of particular interest in the landscape of time-dependent PDE learning is the solution time-advancement operator. A proficiently learned operator in this regard can predict diverse instances of PDE solutions' evolution over extended durations, finding applications in tasks demanding detailed descriptions of turbulent flow dynamics. Examples range from weather forecasting 20 to designing devices involving complex reacting flows.
While classical computational methods for turbulent problems are computationally intensive, the introduction of an operator learning method trained on the solution time-advancement operator holds the promise of enabling rapid predictions. This learned operator is expected not only to make accurate short-term predictions from varying initial conditions but also to project solutions over extended periods. However, the challenge of accurate long-term predictions in chaotic systems with nonlinear dynamics, sensitive to initial conditions, prompts an exploration of feasible alternatives. This often involves imposing weaker constraints, aiming for learned models that reproduce long-term statistics akin to ground truths.
In our recent study 21, three methods (CNN, FNO, and DeepOnet) were applied to learn the underlying operator governing the nonlinear development of unstable flame fronts in channels. Through recurrent training optimizing models for multiple consecutive predictions from a single input solution, it was found that FNO and CNN could effectively learn the front evolution. Specifically, FNO exhibited superior performance in capturing the intricate flame evolution in a wide channel, where the front evolves into cellular, fractal structures. On the other hand, CNN demonstrated better performance in predicting simpler flame evolution in a narrow channel, where the front evolves into a steady cusp shape. The channel-dependent front behavior is governed by PDEs, with the channel width as a parameter.
In the current work, the emphasis is on extending methods for learning time-advancement operators to include additional inputs of PDE parameters, capturing channel-dependent flame front evolution using a single neural network. The goal is to create a unified learning method capable of covering a spectrum of parameter values, providing accurate short-term solutions, and offering robust long-term statistics under varied parameter conditions. Given that engineering tasks often involve computational simulations over diverse parameter conditions, the development of a parametric operator learning method facilitates computational cost savings and accelerates development. This work will concentrate on developing parametric learning methods based on two approaches, FNO and CNN. Integrating additional inputs into existing operator learning methods while retaining key mathematical structures presents a non-trivial challenge.
Notably, the impact of added parameters on both short- and long-term solutions must be incorporated to ensure better generalization of operator learning performance. It is worth noting that the influence of parameters on nonlinear systems may extend to bifurcation phenomena. For a detailed exploration of bifurcation behavior in machine learning studies, readers are referred to a paper 22 where a neural network archetype is constructed based on theories of the center manifold and normal forms.
The paper follows a structured organization: we commence with a description of the problem setup, followed by the presentation of parametric learning methods based on FNO and CNN. These methods will be compared in the context of learning parametric-dependent solution time-advancement operators for two one-dimensional PDEs that model unstable front evolution due to distinct mechanisms of flame instability. Additionally, the methods will be showcased in their capacity to learn from realistic flame data obtained through direct numerical simulations of the Navier-Stokes equations. A summary and conclusion will be provided at the end.

II. LEARNING PROBLEM SETUP
In this section, we describe the problem setup for learning a PDE operator, followed by the recurrent training methods.
Consider a system described by PDEs, which is often represented by multiple functions and maps between these functions. Here we consider a parametric operator Ĝ which maps an input tuple of a function v(x) and parameters γ ∈ R^dγ into another function v′(x), where v ∈ V, with V = V(D; R^dv) a function space with domain D ⊂ R^d and codomain R^dv, and v′ ∈ V′ = V′(D′; R^dv′). In this study, our main interest is in the solution time-advancement operator with parametric dependence, i.e., Ĝ : (ϕ(x; t), γ) → ϕ(x; t + 1), where ϕ(x; t) is the solution to a PDE under certain parameters γ. Here t = t/∆_t ∈ R is time normalized using a positive time increment ∆_t (assumed to be small). For simplicity, we consider autonomous problems and let v′ and v share the same domain and codomain, i.e., V′ = V and D′ = D. Furthermore, we assume simple (periodic) boundary conditions on D that do not vary with time.
Consider the approximation of the mapping Ĝ using neural network methods. Let Θ be the space of all trainable parameters in the neural network. A neural network can then be defined as a map G : V × R^dγ × Θ → V; (v, γ, θ) → G_θ(v, γ). Training the neural network corresponds to finding a suitable choice of θ* ∈ Θ such that G_θ* approximates Ĝ.
Starting from an initial solution function ϕ(x; t_0) and under fixed parameter values γ, the operator G_θ,γ := G_θ(·, γ) can be applied recurrently by letting the input function be its output from the previous prediction; this rolls out predicted solutions of arbitrary length. To learn a PDE solution advancement operator Ĝ_γ := Ĝ(·, γ), a well-trained neural network is expected to make accurate short-term predictions; while the long-term predicted solutions may not be precise when the PDE admits chaotic solutions, it is still preferable for the predicted solutions to share long-term statistics similar to those of the PDE solutions.
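The recurrent roll-out described above can be sketched in a few lines (an illustrative snippet; `step` is a placeholder for any trained one-step operator G_θ, not the actual network used in this work):

```python
import numpy as np

def rollout(step, phi0, gamma, n_steps):
    """Recurrently apply a learned one-step operator.

    step  : callable (phi, gamma) -> phi_next, standing in for G_theta
    phi0  : initial solution, shape (N,)
    gamma : PDE parameter(s), held fixed during the roll-out
    Returns the predicted trajectory of shape (n_steps + 1, N).
    """
    traj = [phi0]
    for _ in range(n_steps):
        traj.append(step(traj[-1], gamma))  # feed previous output back in
    return np.stack(traj)

# usage with a dummy "operator" (linear damping, stands in for a trained network)
step = lambda phi, gamma: (1.0 - gamma) * phi
traj = rollout(step, np.ones(64), 0.1, n_steps=5)
```

The same loop applies unchanged whether `step` is a pCNN, pFNO, or pFNO* network; only the fixed γ argument selects the parameter regime.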
Following our previous study 21, let all observation data be arranged in 1-to-n pairs (v_j, (Ĝ¹_γi(v_j), ..., Ĝⁿ_γi(v_j))), where Ĝ^k_γi denotes the k-fold composition of Ĝ_γi, and v_j ∼ χ and γ_i ∼ χ′ are sequences drawn from two independent probability measures χ and χ′, respectively; the total number of input/output training pairs is Z = Z_i × Z_j. Then, training the network G_θ to approximate Ĝ amounts to minimizing the expected cost over all pairs, where C : V^n × V^n → R is a cost function defined as a relative mean square error (MSE) and V^n denotes the Cartesian product of n copies of V.
It is noteworthy that the aforementioned one-to-many training setup is crucial for ensuring numerical stability in the learned solution advancement operator. As demonstrated in paper 21, deploying the same operator network but trained in a one-to-one setup frequently leads to divergent predictions upon repeated applications, attributed to unbounded error growth.
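The 1-to-n training cost can be sketched as follows (a minimal reading of Eq. (7); the exact normalization of the relative MSE used in this work may differ, and `step` again stands in for the network):

```python
import numpy as np

def relative_mse(pred, ref, eps=1e-12):
    """Relative mean-square error ||pred - ref||^2 / ||ref||^2 for one snapshot."""
    return np.sum((pred - ref) ** 2) / (np.sum(ref ** 2) + eps)

def one_to_n_cost(step, v, gamma, targets):
    """Cost of n recurrent predictions from a single input v against
    targets[0..n-1], the next n ground-truth snapshots."""
    cost, phi = 0.0, v
    for ref in targets:
        phi = step(phi, gamma)          # recurrent prediction
        cost += relative_mse(phi, ref)  # accumulate per-step relative MSE
    return cost / len(targets)

# usage: an identity "operator" scored against constant targets gives zero cost
step = lambda phi, gamma: phi
cost = one_to_n_cost(step, np.ones(8), 0.1, [np.ones(8)] * 3)
```

Because each predicted step feeds the next, gradients of this cost penalize error growth over the whole n-step horizon, which is what suppresses the divergence seen in one-to-one training.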

III. PARAMETRIC OPERATOR LEARNING METHODS
In this section, we propose two models for learning the parametric operator Ĝ. They are developed from two baseline methods, namely the Convolutional Neural Network (CNN) and the Fourier Neural Operator (FNO), designed to learn the non-parametric operator Ĝ_γ.

A. Parametric convolutional neural network (pCNN)
The solution advancement operator Ĝ_γ, when discretized on an equispaced mesh, assumes the form of an image-to-image map. Image learning, akin to a computer vision task, is often accomplished through the application of deep CNNs [1][2][3][4][5]. The baseline architecture is a convolutional auto-encoder with skip connections, reminiscent of those employed in ConvPDE-UQ 7 and U-Net 6. This auto-encoder comprises an encoder block and a decoder block, with the input data undergoing successive transformations through a series of simple convolutional layers.
In this study, we propose a parametric CNN that expands the encoder block to account for parameter influence, as illustrated in Fig. 1, where the red lines highlight the new extensions. Let e⁺_0 denote the input function v(x_j) represented on an x-mesh; the encoder block is an iterative update procedure (e⁺_0, γ) → e⁺_1, (e⁺_1, γ) → e⁺_2, ..., producing a sequence of encoded images e⁺_l, where the image at level l has c_l channels and N_l,(k) pixels in the k-th dimension. With increased encoder level, images are recommended to increase in channel number (i.e., c_l ≤ c_l+1, at least for small l ≤ 3) but shrink in size (N_l,(k) = 2N_l+1,(k), except at the first level where the size does not change, i.e., N_0,(k) = N_1,(k)). The update e⁺_l → (e_l+1, e*_l+1) is decomposed into two sub-maps e⁺_l → e_l+1 and e⁺_l → e*_l+1; each of these two sub-maps is implemented by a size-2 max-pooling layer (to halve the image size, not needed at the first level l = 0) followed by a few standard convolution layers (or replaced by an Inception layer 23 for improved performance). Each of the above layers uses a filter size of 3, periodic padding, and stride 1, followed by a ReLU activation.

Let e′_L be the last encoded image e⁺_L; the decoder block is a reversed update procedure (e′_L, e⁺_L-1) → e′_L-1, ..., (e′_2, e⁺_1) → e′_1, where e′_l is a sequence of decoded images. Each decoder update (e′_l+1, e⁺_l) → e′_l is implemented by passing e′_l+1 through an up-sampling layer to double its image size, then concatenating with e⁺_l along the channel dimension (i.e., the 'skip' connections shown by the blue lines in Fig. 1), followed by a few convolution layers. A nonlinear ReLU activation is applied at each l except at the last level l = 1, giving the final output v′(x_j) = e′_1.

B. Parametric Fourier Neural Operator (pFNO)
In this section we describe the method for parametric extension of Fourier Neural Operator, starting from the baseline method.
1. Baseline FNO

FNO 11 is developed based on an early paper on neural operators 8 in which an infinite-dimensional operator is approximated by composing nonlinear activations and a class of integral kernel operators. In FNO the integral kernel is parameterized in Fourier space.
Similar to the pseudo-spectral method for solving nonlinear PDEs, FNO involves intermediate data transformations that alternate between Fourier space and physical space, as illustrated in Fig. 2.
The main architecture of FNO is an iterative update procedure ε_0 → ε_1 → ... → ε_L, where ε_l : R^d → R^dε; x → ε_l(x) for l = 1, ..., L is a sequence of functions. Here ε_l(x) may be viewed as an 'image' having d_ε (≥ 1) 'color' channels if the x domain is discretized on an equispaced mesh. Each update ε_l → ε_l+1 can be achieved by a Fourier layer ε_l+1(x) = σ(W_l(ε_l(x)) + (F⁻¹(R_l · F{ε_l}))(x)), where σ is a component-wise nonlinear activation function (e.g., ReLU) and the function W_l : R^dε → R^dε performs a channel-wise linear transformation which can be parameterized by a single convolutional neural network layer with kernel size 1.
The remaining operation in Eq. (8) starts with the Fast Fourier Transform (FFT) F and ends with its inverse F⁻¹. First, on the complex-valued Fourier modes F{ε_l} we truncate frequencies higher than κ_max,(i), which denotes the number of modes kept in the i-th dimension. Then, these truncated modes are linearly transformed by a function R_l, where R_l ∈ C^κmax×dε×dε is a trainable weight tensor. Lastly, the inverse Fourier transform F⁻¹ brings the modified Fourier modes back to physical space.
For the input function v(x) ∈ R^dv to be fed into the above Fourier layers, a local transformation P : R^dv → R^dε is required to lift the input to a higher dimension (i.e., ε_0(x) = P(v(x))); the final output is obtained through another local transformation Q : R^dε → R^dv′. Both P and Q can be parameterized using simple networks such as multilayer perceptrons (MLPs).
FNO has several pleasant features: (i) FNO is mesh-invariant and can be trained at low resolution and then evaluated at high resolutions. (ii) FNO can handle different boundary conditions through W_l; in particular, with periodic conditions FNO is equivariant to a translation in the input function, i.e., G_γ;θ(v(x + a))(x) = G_γ;θ(v(x))(x + a) for any a ∈ D. (iii) FNO can also be promoted to learn the higher-dimensional map Ĝ* where both input and output functions are extended to have a time dimension. One restriction of FNO stems from the usage of the FFT: both the input and output functions v and v′ should be discretized on an equispaced mesh. A theoretical analysis of error bounds for the FNO method is given in paper 12.
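A single Fourier layer of the form in Eq. (8) can be sketched as follows (random, untrained weights; illustrative only). The snippet also exercises the periodic translation property noted in (ii):

```python
import numpy as np

rng = np.random.default_rng(0)
d_eps, N, k_max = 4, 64, 12   # channels, grid size, retained Fourier modes

# trainable weights (random here): per-mode complex tensor R and pointwise W
R = rng.standard_normal((k_max, d_eps, d_eps)) \
    + 1j * rng.standard_normal((k_max, d_eps, d_eps))
W = rng.standard_normal((d_eps, d_eps))

def fourier_layer(eps):
    """One Fourier layer: eps -> relu(W eps + F^{-1}(R . F eps)), eps: (d_eps, N)."""
    eh = np.fft.rfft(eps, axis=-1)                # to Fourier space
    out_h = np.zeros_like(eh)
    # per-mode linear map on the lowest k_max modes; higher modes truncated
    out_h[:, :k_max] = np.einsum("kio,ik->ok", R, eh[:, :k_max])
    spectral = np.fft.irfft(out_h, n=N, axis=-1)  # back to physical space
    return np.maximum(0.0, W @ eps + spectral)    # add pointwise branch, ReLU

eps = rng.standard_normal((d_eps, N))
# shifting the (periodic) input shifts the output by the same amount
shifted = fourier_layer(np.roll(eps, 5, axis=-1))
```

Because both the spectral branch (per-mode multiplication) and the pointwise branch commute with circular shifts, the equivariance holds exactly up to floating-point error.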

2. Parametric extension
To incorporate parametric influence into FNO, we enhance the baseline Fourier layer by introducing additional learnable complex weights that encompass parametric influences across different ranges of wave numbers. The primary modifications are illustrated by the components connected by red solid lines in Fig. 2.
Given parameters γ, the Fourier layer is modified into a parametric-dependent update procedure (ε_l, γ) → ε_l+1. This can be achieved by Eq. (8) through replacing R_l with a parameter-dependent combination of R_l and D*_l(γ)-rescaled R*_l, where R*_l ∈ C^κmax×dε×dε is an additional trainable complex weight tensor and D*_l : R^dγ → R^κmax is a function which maps the parameters γ to κ_max positive ratios (for rescaling R*_l). D*_l can be realized by composing two functions: the first maps γ to N_γ values and is parameterized by a shallow MLP; the second performs the mapping R^Nγ → R^κmax. Following a similar approach to constructing pCNN, this second mapping function is implemented hierarchically, enabling the distribution of parameter influence across different ranges of wave numbers.
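A sketch of the parametric spectral weights is given below. The additive combination R_l + D*_l(γ)·R*_l and the dyadic-band realization of the hierarchical second map are assumptions made for illustration; the exact construction may differ:

```python
import numpy as np

rng = np.random.default_rng(1)
d_eps, k_max, n_gamma = 4, 16, 3

# baseline and additional (parametric) complex spectral weights, random here
R      = rng.standard_normal((k_max, d_eps, d_eps)) \
         + 1j * rng.standard_normal((k_max, d_eps, d_eps))
R_star = rng.standard_normal((k_max, d_eps, d_eps)) \
         + 1j * rng.standard_normal((k_max, d_eps, d_eps))
# a one-layer stand-in for the shallow MLP mapping gamma -> n_gamma values
A, b = rng.standard_normal((n_gamma, 1)), rng.standard_normal(n_gamma)

def d_star(gamma):
    """Map gamma to k_max positive ratios: shallow 'MLP' to n_gamma values,
    then a hierarchical spread over dyadic wavenumber bands (assumed form)."""
    h = np.log1p(np.exp(A @ np.atleast_1d(gamma) + b))  # softplus keeps ratios positive
    bands = np.minimum(np.log2(np.arange(k_max) + 1).astype(int), n_gamma - 1)
    return h[bands]                                     # band j reuses value h[j]

def parametric_R(gamma):
    """Effective spectral weights: baseline R plus gamma-rescaled R*."""
    return R + d_star(gamma)[:, None, None] * R_star

R_eff = parametric_R(0.5)
```

The band structure lets a few learned scalars modulate low and high wavenumbers differently, mirroring how the parameter shifts the spectrum of unstable modes.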

3. Variants (pFNO*)
An alternative approach to incorporate parameter influence into the baseline FNO method involves appending each parameter value γ ∈ R dγ to the codomain of the input function v(x).
This modification results in a different input function v* : R^d → R^dv+dγ (in contrast to letting v*(x) = v(x) in the previous section), represented by the red dashed line in Figure 2. This modification is a simple extension to the baseline FNO and will be referred to as pFNO* for future reference.
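On a discretized input, the pFNO* augmentation amounts to appending each parameter as a constant extra channel, e.g. (illustrative helper name):

```python
import numpy as np

def append_parameters(v, gamma):
    """pFNO*-style input augmentation.

    v     : (d_v, N) discretized input function
    gamma : (d_gamma,) parameter vector
    returns (d_v + d_gamma, N) augmented input v*
    """
    const = np.repeat(np.atleast_1d(gamma)[:, None], v.shape[-1], axis=1)
    return np.concatenate([v, const], axis=0)

v_star = append_parameters(np.zeros((2, 8)), np.array([0.07]))
```

The lifting layer P then sees γ at every grid point, so no change to the Fourier layers themselves is needed.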

IV. RESULTS AND DISCUSSIONS
The two models, namely pFNO and pCNN, as detailed in Section III, are deployed to learn the parametric-dependent operator map governing the evolution of unstable flame fronts. Two distinct datasets are utilized to train these models. The first dataset is derived from 1D solutions of two modeling equations: the 1D Michelson-Sivashinsky (MS) equation 24,25 and the 1D Kuramoto-Sivashinsky (KS) equation 25,26. These equations capture intrinsic flame instability arising from Darrieus-Landau 27,28 and diffusive-thermal 29,30 mechanisms, respectively. Although the front solutions from these equations are confined to 1D functions (dependent along the channel wall-normal direction), a substantial number of solution sequences can be generated, varying initial conditions and parametric values.
The second dataset encompasses 2D front solutions 31 obtained from direct numerical simulations (DNS) of reacting Navier-Stokes equations. The DNS provides more realistic and intricate front solutions that cannot be represented by 1D functions. However, due to computational demands, the 2D dataset is limited in the number of solutions available for training.
In the subsequent sections, we evaluate the performance of our proposed models for parametric operator learning. Initially, we compare their effectiveness using the more extensive 1D dataset. Subsequently, we showcase the utilization of the 2D-version models for training on the DNS dataset.

A. 1D governing equations and training dataset
Consider modeling unstable flame front development in a periodic channel. Let x represent a normalized spatial coordinate along the channel's wall-normal direction, i.e., x ∈ D = [-π, π), and let t denote time. Introduce a displacement function ϕ(x, t) : R × R → R describing the stream-wise coordinate of a zero-thickness flame front undergoing Darrieus-Landau instability. This evolution can be captured by the Michelson-Sivashinsky (MS) equation 24,25:

∂_t ϕ + (1/2)(∂_x ϕ)² = ν ∂_xx ϕ + Γ{ϕ}.    (12)

Similarly, the flame front evolution due to diffusive-thermal (DT) instability is modeled by the Kuramoto-Sivashinsky (KS) equation 25,26:

∂_t ϕ + (1/2)(∂_x ϕ)² = -∂_xx ϕ - (1/β) ∂_xxxx ϕ.    (13)

Both equations have periodic boundary conditions and initial conditions ϕ(x, 0) = ϕ_0 ∈ L²_per([-π, π)). In Eq. (12), Γ : ϕ(x) → -H(∂_x ϕ) is a linear singular non-local operator defined using the Hilbert transform H, or, in terms of the spatial Fourier transform F_k(ϕ(x)) and its inverse F⁻¹,

Γ{ϕ} = F⁻¹(|k| F_k(ϕ)).    (14)

Equations (12) and (13) include positive parameters ν and β, where ν depends on channel width and gas thermal expansion, while β depends on the disparity between reactant mass and heat diffusion. The KS equation is known to exhibit chaotic solutions at large β and is often used as a benchmark case for PDE learning studies. The MS equation, less known outside the flame instability community, can be exactly solved using a pole-decomposition technique 32, transforming it into a set of ODEs with finitely many degrees of freedom. Additionally, at large ν, the MS equation admits a steady solution in the form of a giant cusp front. However, at smaller ν, the equation becomes sensitive to noise, resulting in unrest solutions with everlasting small wrinkles atop a giant cusp. Further details on the known theory can be found in 21,[33][34][35][36][37][38][39][40].
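The non-local operator Γ is convenient to apply in Fourier space, where it simply multiplies mode k by |k|. The sketch below implements Γ and an MS right-hand side evaluation following the form of Eq. (12) as written above; it is illustrative and not the solver used to generate the dataset:

```python
import numpy as np

N = 128
x = np.linspace(-np.pi, np.pi, N, endpoint=False)
# integer wavenumbers 0..N/2 on the periodic domain [-pi, pi)
k = np.fft.rfftfreq(N, d=2 * np.pi / N) * 2 * np.pi

def Gamma(phi):
    """Nonlocal DL operator: multiply Fourier modes by |k|, as in Eq. (14)."""
    return np.fft.irfft(np.abs(k) * np.fft.rfft(phi), n=N)

def ms_rhs(phi, nu):
    """MS right-hand side: d phi/dt = -1/2 phi_x^2 + nu phi_xx + Gamma(phi)."""
    ph = np.fft.rfft(phi)
    phi_x = np.fft.irfft(1j * k * ph, n=N)
    phi_xx = np.fft.irfft(-(k ** 2) * ph, n=N)
    return -0.5 * phi_x ** 2 + nu * phi_xx + Gamma(phi)

# single-mode check: Gamma(cos(3x)) = 3 cos(3x)
g = Gamma(np.cos(3 * x))
rhs = ms_rhs(np.cos(x), nu=0.1)
```

On a single mode the linearized growth rate is |k| - νk², so modes with k < 1/ν are unstable, which is why smaller ν (wider channels) yields richer dynamics.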
The 1D equations (12) and (13) are solved using a pseudo-spectral approach along with a Runge-Kutta (4,5) time integration method. All numerical solutions are obtained on an equispaced mesh.

B. Learning 1D flame evolution
The training datasets described in the previous section are employed to learn two parametric solution advancement operators, denoted Ĝ : (ϕ(x; t), γ) → ϕ(x; t + ∆_t). These operators correspond to the MS equation (12) with γ = ν and the KS equation (13) with γ = β, representing unstable front evolution due to the DL and DT instability mechanisms, respectively. Subsequently, these operators are referred to as Ĝ^DL_ν and Ĝ^DT_β. In the current study, the two parametric learning models pFNO and pCNN, introduced in Section III, are applied to learn Ĝ^DL_ν and Ĝ^DT_β. For comparison, we also include a third model, pFNO* (a simple variant of FNO obtained by appending the parameter γ to the codomain of the input function, described in Section III B 3). Given a fixed parameter value, the learned operator is expected to make recurrent predictions of solutions over an extended period.
The training for such operators aims not only for accurate short-term predictions but also for robust predictions of long-term solutions with statistics similar to the ground truth.
As demonstrated in a previous study 21, achieving this involves organizing the training data in 1-to-n pairs (n = 20 for the 1D dataset), as expressed in Eq. (7), optimizing models for accurately predicting 20 successive steps of outputs from a single input over a range of parameter values.

Table I presents the training/validation (relative L2) errors for the obtained models on the two equations; additional details on training and model hyper-parameters are provided in Appendix A. For learning Ĝ^DL_ν, Fig. 3 compares two extended sequences of front displacements predicted by all models against the reference ones. Figs. 4 and 5 illustrate analogous comparisons for the front slope ϕ_x and the normalized total front length ∫_D (ϕ_x² + 1)^{1/2} dx / ∫_D dx, respectively, while Fig. 6 portrays accumulated model errors through recurrent prediction steps. To quantify the statistics of long-term predicted solutions, an auto-correlation function of ϕ*(x) is introduced, where ϕ*(x) denotes a predicted solution obtained after a sufficiently long time. Fig. 7 compares the auto-correlation function obtained by all models with the reference one. For learning the other operator Ĝ^DT_β, similar plots are presented for front displacement (Fig. 8), front slope (Fig. 9), total flame front length (Fig. 10), accumulated error (Fig. 11), and auto-correlation (Fig. 12).
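A common normalized spatial auto-correlation, computed via the Wiener-Khinchin relation under periodic boundary conditions, can be sketched as follows (the exact normalization used for the figures may differ):

```python
import numpy as np

def autocorrelation(phi):
    """Normalized spatial auto-correlation of a (mean-removed) periodic signal.

    Returns C(r) for integer grid shifts r = 0..N-1, with C(0) = 1.
    """
    f = phi - phi.mean()
    # Wiener-Khinchin: circular correlation = inverse FFT of the power spectrum
    corr = np.fft.irfft(np.abs(np.fft.rfft(f)) ** 2, n=f.size)
    return corr / corr[0]

x = np.linspace(-np.pi, np.pi, 256, endpoint=False)
C = autocorrelation(np.cos(4 * x))
```

For a pure mode cos(4x) the result is cos(4r), so the shift of a quarter period (index 32 here) gives C = -1; for chaotic fronts the decay of C(r) summarizes the dominant wrinkle scales.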
The findings can be summarized as follows: 1. All three models (pFNO, pFNO*, and pCNN) perform generally well in learning the two parametric front evolution operators Ĝ^DL_ν and Ĝ^DT_β.
When learning the DL-destabilized front evolution at varying parameter values ν, all models demonstrate proficiency in making reasonably accurate short-term predictions (t/∆_t ≤ 50). This is evident from the small relative errors (less than 0.02, as shown in Table I and Fig. 6), as well as from the predicted front displacements, front slopes, and total front lengths illustrated in Figs. 3, 4, and 5, respectively. The long-term predictions (t/∆_t ≥ 500) by all models align with the characteristic reference solution pattern, specifically the persistent noise-affected single-cusp fronts depicted in Figs. 3 and 4, and approximate the auto-correlation functions shown in Fig. 7.
Similarly, on learning the evolution of DT-fronts at different parameter values β, all three models excel in predicting short-term solutions, as indicated by the errors shown in Table I and Fig. 11.

However, when faced with the challenge of learning the long-term evolution of DL-fronts, whose characteristics depend on the parameter value ν and are prone to noise, pFNO is outperformed by pFNO* and pCNN. This is evident in the comparison of auto-correlation functions shown in Fig. 7. In particular, all three models tend to overestimate the impact of noise-induced front wrinkles at larger values of ν ≥ 0.07, as observed in Figs. 3 and 4. Here, the predicted front displacement and slope remain frequently disturbed, while the reference solutions evolve to become steady at ν = 0.15. Consequently, all models also overpredict the total front length at large ν, as shown in Fig. 5.
Moreover, Fig. 5 reveals that pFNO* and pFNO tend to predict a higher total front length than pCNN, especially at small parameter values ν ≤ 0.035. This behavior may be attributed to the Fourier operator-based methods' tendency to fit noise signals through certain high-frequency representations.
It is noteworthy that in the previous study 21, while the (non-parametric) baseline CNN method demonstrated the ability to learn and reproduce the reference steady DL-front solution at large ν, when trained separately to learn the solutions at small ν it tended to predict significant artifacts. Interestingly, our proposed parametric CNN model, represented by a single network, now yields good predictions for the solution at small ν, albeit with less accuracy in predicting the steady solution at large ν.

C. 2D training dataset
The 2D training dataset utilized in this study is derived from DNS presented in a previous work 31. These simulations focus on the nonlinear evolution of an unstable premixed flame due to the DL instability. The governing system comprises reactive Navier-Stokes equations with a low Mach number approximation, assuming a one-step Arrhenius reaction and a unity Lewis number 31. The DL instability is maintained by sustaining an above-unity density ratio (Ξ) between the cold fresh gas and hot burned gas on either side of the flame front. High-order numerical methods [41][42][43] are employed for computational solutions. The study explores the free propagation of an initially planar flame into a quiescent fresh reactant within a 2D periodic channel of width Λ. The computational domain is a rectangular window region [0, Λ] × [0, 3Λ], moving with the propagating flame 31,[44][45][46][47].
In the aforementioned DNS study 31, comprehensive simulations were conducted covering relevant parameter ranges for different channel widths (Λ) and various flames characterized by the density ratio (Ξ). Each DNS is a single-instance run commencing from an initially perturbed planar flame, and the simulation extends to a sufficiently long time to allow for the nonlinear development of flame instability. Each DNS run outputs a sequence of 2D flame fronts.
Here, a '2D flame front' ∂Ω refers to the boundary of the 'burned' domain Ω(t) = {x | Y(x; t) > 0.6}, where Y is the fuel mass fraction, being 0 in the fresh reactant and 1 in the product. In other words, 2D flame fronts are a collection of iso-scalar lines of Y(x, t) = 0.6, numerically extracted from a x_j-mesh representation of Y(x_j, t) using the marching squares method.
For the operator learning methods to describe the 2D flame evolution, the iso-line representation of zero-thickness flame fronts is extended to a functional representation Φ(x, t) (Eq. (16)). Here, tanh represents the hyperbolic tangent and ∆* is a factor allowing the front to have a finite apparent thickness. The training dataset comprises elements obtained from these DNS runs. The 2D operator-learning networks take inputs as a discrete representation of the function Φ(x_j, t), evaluated on a uniform mesh x_j discretizing D with a mesh size of 256 × 256. All cases adopt the same thickness factor ∆* = Λ/128.

FIG. 8. Long-term solutions of the 1D KS equation (13) at four different parameters β ∈ [24, 18, 9, 6] (from top to bottom row), obtained by high-order numerical methods as a reference, are compared against predictions by pFNO and pCNN. Other details remain consistent with those in Figure 3.
Illustrations of the functional representation Φ(x) can be seen for the 2D-pCNN depicted in Figure 1, where its input and output functions v(x) and v′(x) are shown. Here, Φ(x) takes values of either -1 (yellow color) or +1 (blue color) due to the tanh function in Eq. (16), and takes intermediate values near the zero iso-lines Φ(x, t) = 0.
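One plausible realization of this representation, assuming Φ is a tanh profile of a signed distance d(x) to the front with d > 0 inside the burned domain (a modeling assumption; Eq. (16) may differ in detail), is:

```python
import numpy as np

Lambda, Nm = 1.0, 256          # channel width and mesh points (illustrative values)
delta_star = Lambda / 128      # thickness factor, as in the text
xs = np.linspace(0.0, Lambda, Nm, endpoint=False)
X, Y = np.meshgrid(xs, xs, indexing="ij")

def front_function(signed_dist, delta):
    """Smoothed front representation Phi = tanh(d / delta); d is a signed
    distance assumed positive inside the burned domain (assumed convention)."""
    return np.tanh(signed_dist / delta)

# example: a circular burned region of radius 0.25 centred in the window
d = 0.25 - np.sqrt((X - 0.5 * Lambda) ** 2 + (Y - 0.5 * Lambda) ** 2)
Phi = front_function(d, delta_star)
```

Away from the front Φ saturates to ±1, and the zero level set Φ = 0 recovers the flame front, which is what the networks advance in time.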
Given that the models have learned the underlying flame front evolution operators, they demonstrate the ability to predict new instance solutions starting from random initial conditions, as depicted in Figure 13.

V. SUMMARY AND CONCLUSION
In this study, we apply parametric learning methods, specifically the Fourier Neural Operator (FNO) and the Convolutional Neural Network (CNN), to forecast the evolution of unstable flame fronts in both one and two dimensions. These methods are extended to assimilate additional input parameters, allowing a single network to cover the varied conditions governing flame evolution.
The proposed methods, namely pCNN and pFNO, are applied to model the flame front evolution described by two 1D parametric partial differential equations: the Michelson-Sivashinsky (MS) and Kuramoto-Sivashinsky (KS) equations, corresponding to the Darrieus-Landau (DL) and diffusive-thermal (DT) instabilities, respectively. Across the relevant parameter range, both models exhibit proficiency in accurate short-term predictions for both PDEs. When learning solutions of the KS equation, both models adeptly capture the statistical characteristics of long-term chaotic solutions, faithfully reproducing auto-correlation functions comparable to reference solutions.
In the context of long-term solutions to the MS equations, both models capture the peculiar phenomenon of parameter-dependent increasing noise disrupting a stationary solution.
However, they tend to overestimate noise-induced unresting front wrinkles when the reference solution maintains a stable state against noise disturbance. Notably, when the reference long-term solutions exhibit highly disturbed noisy patterns, pCNN outperforms pFNO in predicting the front length accurately.
Expanding our study to 2D flame evolution, we leverage direct numerical simulations (DNS) for training data. Our proposed parametric learning models, 2D-pFNO and 2D-pCNN, adeptly capture the channel width-dependent front evolution. In larger channels, both models reproduce intricate long-term front structures, while in smaller channels, they predict a smoother evolution with a slight overestimation of noisy wrinkling.

FIG. 1. The parametric CNN is derived from the convolutional auto-encoder archetype by extending its encoder block, with the added components highlighted by red lines. The sketch is shown for a 2D input discretized as an image (v(x_j) ∈ R^1×256×256) with L = 6 levels of encoding. The channel number c_l is shown on top in brackets and the image size N_l at the bottom. Dashed lines refer to skip connections. Max-pooling and up-sampling are used to shrink and increase the image size, respectively. All gray blocks are implemented using a standard convolution layer (of filter size 3, stride 1, and periodic padding, which enforces the periodic boundary condition), and the magenta block is implemented by an Inception layer 23.

FIG. 2. Illustration of the parametric extension of the Fourier Neural Operator (pFNO), highlighting the extended components with red solid lines connecting to the original FNO parts. The top-right inset provides a zoomed view of the second map inside the function D*.

FIG. 3. Long-term solutions of the 1D MS equation (12) for front displacement ϕ(x, t) at four different parameters ν ∈ [0.025, 0.035, 0.07, 0.15] (from top to bottom row). Reference solutions obtained using high-order numerical methods are represented by black dashed lines, while predictions from the parametric operator learning methods pFNO* and pCNN are denoted by red solid and cyan long-dash lines, respectively. Each pair of solution sequences, displayed in the left and right columns, starts from differently randomized initial fronts. The sequences include eleven snapshots of ϕ(x, t_j) with t_j = j∆_t at j ∈ [0, 50, 125, 250, 500, 750, 1000, 1250, 1500, 1750, 2000] and a fixed time interval ∆_t = 0.015. A time shift (t/100) is applied to the displayed fronts to reduce overlap.

FIG. 9. Comparison of front slopes between reference solutions to the 1D KS equation at four parameters β = 24, 18, 12, 6 and predictions by pFNO*, pFNO, and pCNN. The format and details remain consistent with those presented in Figure 4.

Figure 13 illustrates the flame fronts extracted from two reference DNS cases with ν values of 0.033 and 0.011, comparing them to the predictions generated by two parametric operator learning models, 2D-pFNO and 2D-pCNN. Both models successfully capture the general trend of ν-dependent front evolution. In the case of flame development in a larger channel (characterized by ν = 0.011), both models replicate long-term intricate front structures, featuring frequent noisy wrinkles atop cellular shapes of varying sizes. Conversely, for the flame in the smaller channel (ν = 0.033), both models predict a smoother evolution with a slight overestimation of noisy wrinkling. Both 2D models successfully learn the underlying flame front evolution operators, enabling predictions of new flame solutions from random initial conditions.

This work highlights the effectiveness of parametric learning methods in predicting the evolution of unstable flame fronts. The models excel in capturing short-term dynamics and replicating long-term statistical characteristics. However, challenges emerge in learning the long-term evolution of DL-fronts, particularly in the presence of noise. The study offers valuable insights into the capabilities and limitations of parametric learning methods in the realm of flame dynamics, paving the way for further advancements in predicting complex physical phenomena.

ACKNOWLEDGMENTS

The author gratefully acknowledges the financial support of the Swedish Research Council (VR-2019-05648). The simulations were performed using the computer facilities provided by the Swedish National Infrastructure for Computing (SNIC) at PDC, HPC2N, and ALVIS. This work benefited from the preliminary study conducted during Ludvig Nobel's master's thesis at Lund University in 2022.

Appendix A: Training and model hyper-parameters

The 1D networks are trained with a batch size of 800. During model training, the maximum norm of gradients is clipped above 50 to stabilize the training process. The 2D-pFNO networks are trained in a 1-to-10 manner with a batch size of 40, while the 2D-pCNN networks are trained similarly in a 1-to-10 manner with a batch size of 28. These batch sizes are determined by the available GPU memory (NVIDIA Tesla A40). All trainings are performed using a single GPU; training a 1D-pFNO model takes around 20 hours, while training a 1D-pCNN takes 40 hours. Training the 2D-pFNO and 2D-pCNN models takes around 52 hours and 48 hours, respectively. All 1D-pFNO networks utilize L = 4 levels of parametric Fourier layers and d ε = 30 channels. For learning the MS equation, two hyperparameters, κ max = 64 and N γ = 6, are employed, while for the
KS equation, these are slightly adjusted to κ max = 128 and N γ = 5. The 2D-pFNO network adopts hyperparameters with L = 4, d ε = 20, κ (1),max = κ (2),max = 64, and N γ = 6. All pFNO methods are implemented by sharing most of the trainable parameters within a single Fourier layer (Eq. (10)) across all layers l = 0, ..., L − 1, except for those used to parameterize the function D l (γ). This adjustment significantly reduces the model size with minimal impact on performance, making it a standard practice across all pFNO models. Additionally, a skip connection may be added to the Fourier layer as

ε l+1 = ε l + σ( F −1 {R * l (F{ε l }, γ)} + W l (ε l ) ).    (A1)

Adding such a skip connection has been found to facilitate model training for 2D-pFNO. The 1D-pCNN shares a similar architecture with the 2D-pCNN illustrated in Figure 1, except for a few differences: (i) all images (e, e * , e + , and e ′ ) become 1D, necessitating the adaptation of convolution layers, up-sampling layers, and max-pooling layers to their corresponding 1D versions; (ii) the parameter influence is activated only for the first 4 encoder levels, effectively setting the function D l = 0 for l ≥ 4; (iii) the channel number after the first encoder layer (c 1 ) changes to 16.
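A minimal NumPy sketch of the skip-connected Fourier layer of Eq. (A1), with tanh standing in for σ and the γ-dependence assumed folded into pre-built mode weights R for brevity:

```python
import numpy as np

def fourier_layer_skip(eps, R, W, k_max, sigma=np.tanh):
    # eps_{l+1} = eps_l + sigma(F^{-1}{R * F{eps_l}} + W @ eps_l)
    d, N = eps.shape
    e_hat = np.fft.rfft(eps, axis=-1)
    out_hat = np.zeros_like(e_hat)
    for k in range(min(k_max, e_hat.shape[-1])):
        out_hat[:, k] = R[k] @ e_hat[:, k]
    return eps + sigma(np.fft.irfft(out_hat, n=N, axis=-1) + W @ eps)

# Sharing the trainable parameters across layers l = 0..L-1, as described
# above, then amounts to reusing the same R and W in the loop:
#   for l in range(L):
#       eps = fourier_layer_skip(eps, R_shared, W_shared, k_max)
```

The additive skip path means a layer with zero weights reduces to the identity, which is one plausible reason such connections ease training of the deeper 2D-pFNO.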
D l (γ), where the last function D l : R dγ → R converts the PDE parameters γ into a scaling ratio. Here e l , e * l , and e + l ∈ R c l ×N l are three sequences of encoded 'images'; those images have c l channels and a total pixel number N l .
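The map D l : R dγ → R can be realized by any small network ending in a scalar. The sketch below uses a hypothetical one-hidden-layer MLP with a sigmoid output so the scaling ratio lies in (0, 1); the actual parameterization used in the paper is not reproduced here.

```python
import numpy as np

def D_l(gamma, W1, b1, w2, b2):
    # gamma: (d_gamma,) PDE parameters -> scalar scaling ratio in (0, 1)
    h = np.tanh(W1 @ gamma + b1)        # hidden layer
    z = float(w2 @ h + b2)              # scalar pre-activation
    return 1.0 / (1.0 + np.exp(-z))     # sigmoid keeps the ratio in (0, 1)
```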

TABLE I. Relative L 2 train/validation errors for all 1D operator networks. Columns: Model | Governing Eq. | Parameter range | Train L 2 | Valid. L 2 (table entries omitted).

All solutions are discretized on a uniformly spaced 1D mesh of 256 points. For the MS Equation (12), training solutions are generated at six parameter values of ν ∈ [0.025, 0.035, 0.05, 0.07, 0.1, 0.15]. Given that, at large values of ν, the long-term MS solution tends to evolve into a nearly stationary single-cusp front, 250 sequences of consecutive solutions are generated for each of the three large ν ∈ [0.07, 0.1, 0.15]. Each sequence contains 1000 consecutive solutions separated by a time interval of ∆ t = 0.015 and starts from random initial conditions ϕ 0 (x) sampled from a uniform distribution over [0, 0.03]. For smaller ν ∈ [0.025, 0.035, 0.05], the training data is similarly created, but adjustments are made to better represent the noise-affected, unsteady solutions. Specifically, 250 random sequences of solutions are generated over a shorter duration 0 < t < 7.5, each containing 500 consecutive solutions at intervals of ∆ t . The long-time solution behavior is represented by one extra-long sequence, including 125,000 consecutive solutions throughout 0 < t < 1875. For the KS Equation (13), the training dataset comprises 250 randomly initialized solution sequences over 0 < t < 7.5 for each of five parameter values of β ∈ [6, 9, 12, 18, 24]; the KS solutions at all these β values remain chaotic. For both MS and KS equations, the validation dataset is created similarly but contains only 10 percent of the data in the training dataset.
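The dataset assembly described above — random initial fronts ϕ 0 (x) ∼ U[0, 0.03] on a uniform mesh, advanced in steps of ∆ t and stored as consecutive (input, target) pairs — can be sketched as follows; `step` is a placeholder for the actual high-order PDE solver, which is not reproduced here.

```python
import numpy as np

def step(phi):
    # Placeholder time-stepper (the real data uses a high-order PDE solver)
    return np.roll(phi, 1)

def make_pairs(n_seq, seq_len, N=256, amp=0.03, seed=0):
    # Build (phi_t, phi_{t + dt}) training pairs from n_seq randomly
    # initialized solution sequences of length seq_len each.
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(n_seq):
        phi = rng.uniform(0.0, amp, size=N)    # random initial front
        for _ in range(seq_len):
            nxt = step(phi)                    # advance one interval dt
            xs.append(phi)
            ys.append(nxt)
            phi = nxt
    return np.stack(xs), np.stack(ys)
```

Consecutive pairs chain together (the target of one pair is the input of the next), which is what enables the 1-to-1 and 1-to-10 training modes mentioned in Appendix A.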

Table I and Fig. 11, and by the results over t/∆ t ≤ 50 in Figs. 8, 9, and 10). Furthermore, they successfully capture the statistical characteristics of long-term chaotic solutions, as evidenced by the auto-correlations shown in Fig. 7.
2. In terms of deciphering the prolonged evolution of chaotic DT-fronts, pFNO outperforms the other two models, pFNO* and pCNN, as demonstrated by the superior auto-correlations at five distinct β values illustrated in Fig. 12. Moreover, pFNO provides the most accurate short-term predictions for both DT-fronts and DL-fronts, supported by the smallest errors highlighted in Table I and depicted in Figs. 6 and 11.
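The auto-correlations used above to compare long-term statistics can be computed efficiently via the Wiener–Khinchin relation; the sketch below assumes a periodic 1D signal and a mean-subtracted, normalized definition, which may differ in detail from the one used in the figures.

```python
import numpy as np

def autocorrelation(u):
    # Normalized circular auto-correlation of a 1D periodic signal:
    # subtract the mean, then apply the Wiener-Khinchin theorem via FFT.
    u = u - u.mean()
    s = np.fft.rfft(u)
    ac = np.fft.irfft(s * np.conj(s), n=u.size)
    return ac / ac[0]
```

For a pure sine wave the result is a cosine in the lag: unity at zero lag and −1 at a half-period shift.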