Non-intrusive, transferable model for coupled turbulent channel-porous media flow based upon neural networks

Turbulent flow over permeable interfaces is omnipresent and features complex flow topology. In this work, a data-driven, end-to-end machine learning model has been developed to model the turbulent flow over porous media. To this end, we derive a non-linear reduced-order model with a deep convolutional autoencoder network, which reduces the highly resolved spatial dimensions required by direct numerical simulation. A downstream recurrent neural network is trained to capture the temporal trend of the reduced modes, so that it can provide the future evolution of the modes. We further evaluate the trained model's capability on a new dataset with a different porosity. In such cases, fine-tuning reduces the training effort (by up to two orders of magnitude) for a model with a limited dataset and limited prior knowledge, and still shows good agreement on the mean velocity profile. Leveraging the current model, we find that even quick fine-tuning, with approximately an order of magnitude reduction in training time, still results in effective flow predictions. This promising result encourages the fast development of data-driven models tailored for various types of porous media. The diminished training time substantially lowers the computational cost when dealing with changing porous topologies, making it feasible to systematically explore interface engineering with different types of porous media. Overall, the data-driven model shows good agreement, especially for the porous media region, and can aid DNS by reducing the burden of resolving this complex domain during simulations. Fine-tuning reduces the training cost significantly while maintaining acceptable accuracy when a new flow condition comes into play.


I. INTRODUCTION
Turbulent flow across permeable interfaces is a phenomenon that is pervasive in both natural environments and engineering contexts, manifesting in diverse applications ranging from sediment transport to transpiration cooling in gas turbines. Porous media are often characterized by a complex spatial topology [1][2][3], and numerous properties have been recognized to exert a substantial influence on both mass and momentum transport at the interface between porous and free-flowing media. Despite this understanding, the task of comprehending and optimizing the macroscopic and microscopic characteristics of porous media continues to present a challenge. The underlying reasons for this complexity include the inherent difficulties associated with conducting experimental measurements at the pore scale [4] and the substantial computational resources required for numerical simulations that adequately resolve the entire range of relevant scales. Historical examinations of turbulent flow over permeable substrates have investigated the impact of variable permeability on surface flows. In scenarios where interfacial permeability is low, the turbulent surface flow exhibits characteristics akin to a canonical boundary layer, owing to the limited disruption of near-wall structures. Conversely, when permeability is augmented, large-scale vortical structures begin to manifest within the surface flow. This phenomenon has been ascribed to Kelvin-Helmholtz (KH) type instabilities, originating from the inflection points present in the mean velocity profile. In more recent times, the exploration of porous media as a means of achieving drag reduction has catalyzed a series of comprehensive studies focused on the subject of anisotropic permeability.
Direct Numerical Simulation (DNS) provides a distinct advantage for observing and analyzing the physics of turbulence within a constrained small spatial domain, extending beyond traditional applications such as channel [5] and pipe flows [6,7]. Suga et al. [8] conducted an investigation into the influence of the anisotropic permeability tensor within porous media at a regime characterized by higher permeability. Their findings revealed that both streamwise and spanwise permeabilities contribute to turbulence enhancement, while vertical permeability alone does not exert a similar influence. The effect on turbulence is particularly pronounced in the presence of porous walls exhibiting streamwise permeability, as this configuration promotes the development of large-scale streamwise perturbations induced by Kelvin-Helmholtz instability. However, despite the insights offered by DNS, its utilization for extensive parametric studies, often required in industrial applications, remains a costly solution. In such contexts, it is still customary to employ either analytical models or advanced turbulence models, which tend to provide satisfactory accuracy within established applications. Nevertheless, these models are not without their challenges, as they can exhibit divergent behavior, and the calibration process can prove to be both intricate and time-consuming.
Machine learning (ML) is on the rise, having successfully tackled many complex tasks ranging from self-driving cars [9], drug discovery [10], and weather prediction [11] to, more recently, state-of-the-art large language models such as GPT for natural language processing [12]. Machine learning has been democratized heavily in the last decade, for reasons that include the vast availability of open-source libraries, an active community, a huge volume of incoming data from physical as well as numerical experiments, and ease of access to high-performance computing resources via the cloud. In recent years, ML has shown potential in the modeling of various fluid flow use-cases [13][14][15][16][17][18][19][20][21][22][23]. Typically, big-data ML problems rely upon some kind of feature extraction process, e.g., a conventional feature selection process, the long-standing proper orthogonal decomposition (POD) [24][25][26], the relatively new dynamic mode decomposition (DMD) [27], or ML-based methods [28]. Feature extractors transform the raw input data into information-preserving feature vectors with the primary goal of reducing the dimensionality of the data. Several attempts have been made in the past few years to combine POD or DMD with machine learning [22,[29][30][31]. However, these methods have certain limitations, especially when the flow advances into turbulent regimes. Therefore, an end-to-end machine learning system has been suggested, where POD and DMD are replaced by a non-linear DNN [15,29,32]. A common architecture is a deep convolution autoencoder (DAE), which extracts the features for further processing and decision making [33,34]. Turbulent flow problems often involve spatio-temporal data; therefore, the extracted features can be used to train a downstream network to build a data-driven dynamic model. A common choice is a recurrent neural network (RNN), which preserves the sequential relationship in a time series [35]. A combination of such approaches has shown exemplary results in terms of data compression and modeling [15].
In this work, we make an attempt to model complex flow behaviors in a porous medium with the help of a data-driven approach. The flow is in a turbulent regime and, due to the presence of porosity, the geometry is resolved to a finer scale. This setup makes the end-to-end ML task challenging and equally interesting. The goal is to combine a deep learning-based feature extraction step with a downstream recurrent neural network that models the temporal evolution. An echo state network was chosen as the RNN due to its proven efficacy in modeling multi-dimensional spatio-temporal data. Trained models are restricted to the domain of training and perform poorly beyond the trained data domain, which could be delimited by a specific parameter such as the Reynolds number in fluid flow problems. A commonly used strategy is transfer learning, where the objective is to transfer the knowledge gained from training on a source task to a related target task with a limited dataset [36]. Fine-tuning, which can be considered a part of transfer learning, is one methodology where an already trained and optimized model is tuned with another similar set of data for a similar task. This can decrease the training time significantly, in addition to reducing the neural architecture search and the training data requirements [37]. Fine-tuning can be performed at the entire-network level, where all the layers are retrained while the parameters are initialized with the pre-trained model. Alternatively, several layers can be frozen and retraining is allowed on only a few layers. The latter method is suitable when the network size is very large and data is limited, whereas the former approach could lead to overfitting [38]. Therefore, we also present results from fine-tuning, where a trained model from a given data domain was transferred to a different data domain.
The remainder of the paper is divided into four sections. Section II describes the fluid flow problem in porous media along with the underlying governing equations and numerical method. Section III explains the hierarchical data-driven model in detail. Section IV describes the results from training and fine-tuning. With Section V, we conclude the work and give a brief outlook.

II. SYNTHETIC DATA GENERATION
High-fidelity simulations such as Direct Numerical Simulation (DNS) [5,[39][40][41] can provide reliable, spatially and temporally resolved data for the modeling. In our DNS, the three-dimensional incompressible Navier-Stokes equations are solved in non-dimensional form (Eqns. 1 and 2). Here, Π represents a constant pressure gradient in the mean-flow direction. The governing equations are normalized by the half-width of the entire simulation domain, denoted as H (refer to figure 1a), and the mean bulk velocity U_b within the channel region specified by y/H = [0, 1]. The velocity components in the streamwise (x), wall-normal (y), and spanwise (z) directions are henceforth expressed as u, v, and w, respectively. The domain dimensions (L_x/H × L_y/H × L_z/H) are fixed at 10 × 2 × 0.8π for all scenarios considered. The lower half of the domain (y/H = [−1, 0]) encompasses the porous media, while the upper half (y/H = [0, 1]) constitutes the free channel flow. The porous layer is constructed with 50 cylindrical elements aligned in the streamwise direction and 5 rows positioned in the wall-normal direction, as delineated in figure 1. The distance D between two adjacent cylinders is consistently set at D/H = 0.2. No-slip boundary conditions are enforced on the cylinders, the upper wall, and the lower wall, whereas periodic boundary conditions are implemented in both the streamwise and spanwise directions.
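The equations themselves were lost in transcription; a plausible reconstruction of Eqns. 1 and 2, consistent with the definitions above (normalization by H and U_b, constant driving term Π along x), is the standard non-dimensional incompressible Navier-Stokes system:

```latex
\nabla \cdot \mathbf{u} = 0, \qquad (1)
```

```latex
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
  = -\nabla p + \frac{1}{Re_b}\,\nabla^{2} \mathbf{u} + \Pi\,\mathbf{e}_x, \qquad (2)
```

where Re_b = U_b H/ν is the bulk Reynolds number and e_x the streamwise unit vector; the exact form used by the solver may differ in detail.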
The spectral/hp element solver Nektar++ is utilized to solve Eqns. 1 and 2 [3,[42][43][44]. The geometric configuration in the x-y plane is discretized utilizing quadrilateral elements, with localized refinement in the vicinity of the cylinders (refer to figure 1(a)). Localized element expansions are implemented based on the modified Legendre basis [45]. Flexible polynomial orders are employed across the wall-normal range through a continuous Galerkin projection. Specifically, the polynomial order within the free-flow region, defined as y/H = [0.2, 1], is set at P = 6-7. The near-wall region and the upper two rows of cylinders, where y/H = [−0.4, 0.2], are resolved with a higher order of P = 8-9. Conversely, a lower order of P = 5 is designated within the deeper region of the cylinder array (y/H = [−1, −0.4]). The spanwise direction is discretized with a Fourier spectral method, and the 2/3 rule is implemented to circumvent aliasing errors. The temporal progression is executed with a second-order mixed implicit-explicit (IMEX) scheme, with the time step fixed at ∆T/(H/U_b) = 5 × 10^−4.

FIG. 1. An x-y cutting plane of the computational domain, with instantaneous streamwise velocity depicted in color. √K_αα^p+ and C_αα^p+ are the diagonal components of the permeability tensor and the Forchheimer coefficient, respectively, in the direction α (α ∈ {x, y, z}), normalized by wall units. r_c^p+ is the radius of the cylinders.
DNS cases are performed with varying porosity φ = 0.5, 0.6, 0.7, and 0.8, defined as the ratio of the void volume to the total volume of the porous structure. The parameters of the simulated cases are listed in TABLE I, where the cases are named after their respective porosity. The superscripts (•)^p and (•)^s represent permeable-wall and smooth-wall side variables, respectively. Variables with superscript + are scaled by the friction velocity u_τ of their respective side and the viscosity ν.
It should be noted that the distance between the cylinders remains constant, while the porosity is modified through variation of the cylinder radii. The normalized cylinder radius is found to be within the range r_c^p+ = r_c u_τ^p/ν = 42-52 for all examined cases (refer to TABLE I), thereby implying that the effects of surface roughness are presumed to be consistent across different scenarios. For all instances, the Reynolds number associated with the top-wall boundary layer is configured at Re_τ^s = δ^s u_τ^s/ν ≈ 180 (δ represents the distance between the position of maximal streamwise velocity and the wall). This configuration aids in minimizing fluctuations in the top-wall boundary layer. In the region of the upper smooth wall, the streamwise cell size spans 4.1 ≤ ∆x^s+ ≤ 6.3, and the spanwise cell size is confined to below ∆z^s+ = 5.4. On the side corresponding to the porous media, the value of ∆z^p+ is capped at 8.4, whereas ∆x^p+ and ∆y^p+ are refined by polynomial refinement of the local mesh [42]. The cumulative number of grid points varies between 88 × 10^6 (C05) and 110 × 10^6 (C06), with each cylinder within the porous domain resolved using 80 to 120 grid points along its perimeter. The spatial resolution of the current work aligns closely with that of preceding DNS studies [46,47]. Additionally, the utilization of a high-order scheme in conjunction with a body-fitted mesh confers a notable advantage in resolving fine scales and necessitates fewer grid points compared to both finite volume and immersed boundary methods to attain equivalent accuracy [48]. Furthermore, a comparative analysis of our grid resolutions with the Kolmogorov length scale, denoted as η = (ν/ε)^{1/4}, at the interface has been conducted. For all cases, the findings show that (∆x/η)_{y=0} ≤ 2, (∆y/η)_{y=0} ≤ 1, and (∆z/η)_{y=0} ≤ 4.5, thereby validating that the current resolution is indeed sufficient.

III. DATA-DRIVEN MODELLING

Data-driven modelling (DDM) is an approach to harness the available data about a system and establish a connection between the system state variables (input, internal, and output variables) without explicit knowledge of the physical behaviour [49]. In this work, we have synthetic data derived from the DNS experiments, and the end model, built upon this data, should be able to predict the temporal evolution of a state variable. Figure 2 depicts our end-to-end two-step ML framework. To have an extendable framework, we chose to first extract the features with the help of a deep encoder and feed the extracted features to a second-level recurrent neural network (RNN). This architecture is inspired by our earlier work [13].

A. Feature extraction with autoencoder
The deep convolution autoencoder (DAE) operates within the framework of unsupervised machine learning, meaning that the model's training does not depend on labeled data. The learning objective is to distill efficient representations or features within a low-dimensional hyper-plane, with the aim of reproducing the original data while minimizing reconstruction error. In more formal terms, an autoencoder encodes input data I into a compact form c = encode(I), where c ∈ R^l. It then decodes this compact form back into a reconstructed representation Î = decode(c). The difference between the reconstructed and original data, given by Î − I, constitutes the reconstruction error.
In the specific implementation discussed, 2-dimensional data were sourced from the DNS, as delineated in Sec. II. The architecture of the autoencoder is deep, comprising several convolution layers. Each layer functions as a local feature extractor and shares its weights across the network. Following each convolution layer is a pooling layer (a max pooling layer in this instance), which diminishes the dimensionality of the data and emphasizes non-repetitive key features. This contributes to the network's translation invariance property. The encoder segment of the network concludes with a bottleneck layer, which harbors the latent space or extracted features. These features are subsequently channeled into a decoder section, which attempts to reconstruct the original DNS data. The training process initiates with random weight initialization across the network and iteratively refines these weights according to the loss function, employing backpropagation for optimization.

FIG. 2. Two-step machine learning architecture for the end-to-end ROM model for porous media. In the first step, the DAE is trained as a feature extractor cum decoder, while the second step trains on the extracted features in an autoregressive manner.
Consider three-dimensional input data I ∈ R^{N_x×N_y×C_in}, with C_in being the number of input channels. Here, C_in = 1, as we load only the streamwise velocity field u_x into the network, and thus I ∈ R^{N_x×N_y}. This input is convolved with a kernel k_m, as shown in Eqn. 3.
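Eqn. 3 itself did not survive extraction; the operation it refers to, a discrete 2-D convolution of the input I with a kernel k_m, can be sketched as follows. This is a minimal NumPy illustration (cross-correlation form, as commonly used in CNN layers), not the authors' implementation:

```python
import numpy as np

def conv2d_valid(I, k):
    """Discrete 2-D 'valid' convolution of input I with kernel k."""
    H, W = I.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output element is the windowed inner product with the kernel.
            out[i, j] = np.sum(I[i:i + kh, j:j + kw] * k)
    return out

# A 3x3 averaging kernel applied to a 5x5 field yields a 3x3 feature map.
I = np.arange(25, dtype=float).reshape(5, 5)
k = np.ones((3, 3)) / 9.0
F = conv2d_valid(I, k)
```

In a real convolution layer the kernel weights are learned and shared across all window positions, which is what makes each layer act as a local feature extractor.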
Within the encoder segment of a DAE, multiple convolutional layers are commonly succeeded by pooling layers. These layers deploy a sliding window of configurable size and stride across the input, aggregating the information within each window into a single output element via a specified aggregation function. A frequent choice for this function is the retention of the maximum element within each window, as implemented in a max-pooling layer. The pooling layer's stride is typically chosen to reduce the dimensionality of the input I.
The cumulative effect of the encoding sequence is to map the input data into a latent space R^l, where l ≤ N_x × N_y, and N_x and N_y denote the dimensions of the initial input. Following the encoder, the decoder of the DAE largely mirrors the encoder's design, with a critical distinction: it substitutes max-pooling layers with upsampling layers to expand the data back to its original domain size. This expansion process, known as unpooling, essentially doubles the data's dimensions. In the work discussed, nearest-neighbor interpolation is employed to facilitate the upsampling. The processing of intricate turbulence data necessitates a sophisticated, deep architecture composed of multiple convolutional layers, augmented by corresponding upsampling layers. Such an arrangement ensures a faithful representation, minimizing information loss. In the comprehensive system, subsequent to training all relevant networks, the DAE's encoder is applied to extract a compressed form, denoted by ĉ_t, from a specific turbulence snapshot I_t. This condensed snapshot is then input into an RNN, which predicts the subsequent reduced representation ĉ_{t+1}. The DAE's decoder translates ĉ_{t+1} back into the input domain, recreating the anticipated turbulence snapshot. Through the integrated use of deep convolutional autoencoders and recurrent neural networks, the system attains efficient and insightful compression of complex turbulence data. This approach enhances the capacity for predictive modeling and nuanced analysis within the realm of fluid mechanics.
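The pooling and unpooling operations described above can be sketched in a few lines of NumPy (an illustrative toy, not the paper's implementation):

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling with stride 2: halves each spatial dimension."""
    H, W = x.shape
    return x[:H - H % 2, :W - W % 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour unpooling: doubles each spatial dimension."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

x = np.array([[1., 2., 0., 3.],
              [4., 1., 2., 1.],
              [0., 0., 5., 6.],
              [1., 2., 7., 8.]])
p = max_pool2(x)   # -> [[4., 3.], [2., 8.]]
r = upsample2(p)   # back to 4x4, a coarse representation of x
```

The roundtrip through pooling and unpooling returns to the original domain size but only retains the dominant activation per window, which is exactly the information loss the reconstruction error quantifies.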

B. Echo State Network (ESN)
Within the framework of the present study, the ESN serves as a downstream recurrent neural network, functioning to predict the temporal evolution of the features extracted by the encoder segment of the DAE in an autoregressive fashion. The forecasted values are reintroduced to the decoder segment of the DAE, facilitating the recreation of the target feature. During the operational phase, the focus is exclusively on the trained ESN and the decoder, while the encoder is disregarded.
The structural construct of the reservoir is represented as an N × N adjacency matrix, denoted by W_η^(r). The initialization of this reservoir matrix is predicated on a specific vector of hyperparameters, symbolized by η, and the matrix is applied to translate the ESN's input into a hidden representation known as the reservoir state, which accumulates information from previous inputs. The reservoir dimension N is usually selected to substantially exceed the latent space's dimension, following the criterion N ≫ l. Additionally, this reservoir state is dynamically updated with every time increment upon receiving new input. Specifically, at each discrete time t, the input vector c_t modulates the computation of an updated reservoir state r_t ∈ R^N.
The evolution of the reservoir state is defined by

r_t = (1 − α) r_{t−1} + α tanh(W^(in) c_t + W_η^(r) r_{t−1}),

where α denotes a leakage rate, dictating the integration of previous states and current inputs. The random matrices W^(in) and W_η^(r) are initialized at the commencement of training and are retained as constants. The ESN's output at time step t is deduced by

ĉ_{t+1} = W^(out) r_t.

Contrasting with convolutional networks, in the ESN case only the final output layer requires training, obviating the need for backpropagation across several epochs. This implies that the elements of the output matrix W^(out) must be meticulously optimized to minimize a cost function, denoted by C, which quantifies the discrepancy between the training data and the ESN's output.
The cost function and its minimization are formally represented as

C = Σ_{i=1}^{l} ( ||Y_i − W_i^(out) R||_2^2 + β ||W_i^(out)||_2^2 ),

where W_i^(out) represents the i-th row of W^(out), ||·||_2^2 signifies the squared L2 norm, R collects the reservoir states column-wise, and Y holds the corresponding targets. The regression problem is resolved by the ridge (Tikhonov) solution

W^(out) = Y R^T (R R^T + β I)^{−1}.

The ESN's hyperparameters include the reservoir size N, the node density D, the spectral radius of the reservoir ρ(W_η^(r)), the leakage rate α, and the Tikhonov regularization parameter β in the cost function C. Collectively, these are captured in the vector η = (N, D, ρ, α, β). This training methodology is inherently expeditious relative to other forms of RNNs. Following successful training and optimization of the hyperparameters, the ESN operates as an autonomous dynamical system. In this mode, a forecasted compressed snapshot c_{t+1} can be iteratively used as the succeeding ESN input to predict c_{t+2}, and so forth. This mode of operation is recognized as the closed-loop scenario.
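A minimal ESN of this kind can be sketched as follows. This toy NumPy version (illustrative hyperparameters, a scalar sine signal standing in for the latent modes) implements the leaky reservoir update, the ridge-regression readout, and the closed-loop mode described above:

```python
import numpy as np

rng = np.random.default_rng(0)
N, l = 200, 1                  # reservoir size N >> latent dimension l
alpha, rho, beta = 0.9, 0.95, 1e-6

# Fixed random input and reservoir matrices; only W_out is trained.
W_in = rng.uniform(-0.5, 0.5, (N, l))
W_r = rng.uniform(-0.5, 0.5, (N, N))
W_r *= rho / np.max(np.abs(np.linalg.eigvals(W_r)))  # set spectral radius to rho

def step(r, c):
    """Leaky-integrator reservoir update."""
    return (1 - alpha) * r + alpha * np.tanh(W_in @ c + W_r @ r)

# Training signal: predict c_{t+1} from the reservoir state driven by c_t.
ts = np.sin(0.1 * np.arange(600))[:, None]
r = np.zeros(N)
R, Y = [], []
for t in range(len(ts) - 1):
    r = step(r, ts[t])
    if t > 50:                 # discard the initial transient (washout)
        R.append(r); Y.append(ts[t + 1])
R, Y = np.array(R).T, np.array(Y).T                    # (N, T), (l, T)
W_out = Y @ R.T @ np.linalg.inv(R @ R.T + beta * np.eye(N))  # ridge readout

# Closed loop: feed each prediction back as the next input.
c = ts[-1]
preds = []
for _ in range(20):
    r = step(r, c)
    c = W_out @ r
    preds.append(c.copy())
```

Only the single linear solve for W_out is needed, which is why ESN training is so much cheaper than backpropagation-through-time for other RNNs.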

IV. RESULTS

A. Data-driven flow model
As mentioned earlier, we first trained a DAE network to extract the features in lower dimensions. For this, the input and output layers carry the same dataset, i.e., the snapshots from the DNS consisting of the u-velocity component with a dimension of 1501 × 109 × 1. The neural network has multiple convolution layers combined with max pooling layers, while the decoder section has convolution layers and up-sampling layers, giving a converging-diverging shape. The features or latent modes are obtained at the end of the encoder section with a shape of 24 × 2 × 34, thereby giving a dimensionality reduction of 99%. This could be increased further at the expense of reconstruction loss. TABLE II illustrates the various layers and their shapes. After fixing the DAE architecture, the various hyperparameters were obtained by Bayesian optimization. This approach is part of the AutoML paradigm [50] and assists in achieving an optimized set of parameters in relatively fewer iterations compared to grid or random search [51]. It is worth mentioning that the data were scaled between 0 and 1 before training. For the entire process, we used 2000 snapshots: 60% for training, 20% for validation, and the remaining 20% for blind testing. The model has around 1.9 million trainable parameters and converges to a small reconstruction loss. Once the DAE is trained, we extract the features for the entire dataset using its encoder section. These encoded data, with their 99% reduced dimension, were further used to train the second-level RNN, i.e., the ESN. Unlike the DAE, training an ESN is relatively easy and computationally cheaper because of its straightforward design choices. The optimized ESN has a reservoir size of 3726, a spectral radius of 0.93, a leaking rate of 0.94, and a reservoir density of 0.16. For the best results, the look-back history period is 4 timesteps, i.e., the model requires 4 timesteps of data as initialization, after which the system can predict auto-regressively. As a first step of validation, we visualize the low-dimensional features from the DAE along with the generated data from the ESN on the blind validation set. We call the method generative because the model works auto-regressively once we provide the initial data. Figure 3 shows such a time series for 2 of the 1632 modes. It is impressive that the model maintains a value of 0 without any minor fluctuations for ϕ_1, while the series fluctuates with the right phase and amplitude for ϕ_10. Figure 4 further illustrates the probability density function (PDF) of these 2 modes for the DAE and the ESN. It reaffirms the generative capability of the ESN, which is able to match the overall distribution of its ground truth (i.e., the DAE output).
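As a sanity check, the quoted 99% dimensionality reduction and the 1632-mode count follow directly from the reported shapes (input snapshots of 1501 × 109, latent space of 24 × 2 × 34):

```python
# Dimensionality reduction implied by the reported layer shapes.
input_dim = 1501 * 109      # grid points in one u-velocity snapshot
latent_dim = 24 * 2 * 34    # bottleneck shape, i.e. 1632 latent modes
reduction = 1 - latent_dim / input_dim
print(f"modes: {latent_dim}, reduction: {reduction:.1%}")  # modes: 1632, reduction: 99.0%
```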
As a next step of validation, a comparison in the high-dimensional space, where the flow physics needs to be captured, is performed. Figure 5 displays contour plots of the original DNS data in panels (a) and (d), the data reconstructed from the feature space of the DAE in panels (b) and (e), and the decoded value from the prediction of the ESN in panels (c) and (f). One can observe a good qualitative agreement in terms of capturing the flow fields. Both regions, i.e., the main channel flow and the porous flow, can be predicted. Both the DAE and the ESN were trained on the DNS data, yet they are not able to fully capture the strong turbulent field in the channel, while the flow in the porous media section is reproduced much better because it remains mostly laminar. The obvious reason is the limited capability of the DAE to capture the entire flow field, because it discards 99% of the input data and uses only 1% of it as features, similar to conventional reduced-order modeling approaches such as POD. The derived model with the ESN shows an excellent result in matching the field of its ground truth, i.e., the DAE. This suggests that the entire model could be further improved if a state-of-the-art feature extractor such as a transformer model were used instead of the DAE. It is worth mentioning that, in our experiments, the DAE surpassed conventional as well as some advanced feature extractors, e.g., the variational autoencoder, for this use-case. Often, averages and fluctuations are the critical values for analysis in industrial applications. Therefore, we further validate the results with mean and fluctuation profiles over the cross-section of the channel. Figure 6 shows these profiles, and the mean velocity profile overlaps with the DNS for both the DAE and the ESN. The velocity fluctuation deviates slightly in the main channel, while it remains zero in the porous section due to the dominating laminar flow. The model also captures the interface well, where the flow transitions from laminar to turbulent. For a comprehensive analysis, we include findings from the Fourier spectra in our comparison. Figure 7 contrasts the streamwise spectra E_uu near the smooth wall at y = 19.5 and near the permeable wall at y = 10.2. Both the DAE and ESN models perform well in capturing the energy at larger scales (k_x < 1). However, neither the DAE nor the ESN successfully models the abundant turbulent kinetic energy found in the inertial subrange, particularly at y = 19.5. This limitation is anticipated, given that the DAE achieves a 99% dimensionality reduction.
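A streamwise spectrum E_uu of the kind compared in figure 7 can be computed from a velocity signal along x in the usual way; the following minimal NumPy sketch (synthetic single-mode signal, illustrative grid spacing, not the authors' post-processing code) folds the one-sided FFT so that the spectrum sums to the signal's variance:

```python
import numpy as np

def streamwise_spectrum(u, dx):
    """One-sided energy spectrum E_uu(k_x) of a periodic 1-D signal u(x)."""
    n = len(u)
    u_hat = np.fft.rfft(u - u.mean()) / n
    E = 2.0 * np.abs(u_hat) ** 2      # fold negative wavenumbers onto positive ones
    E[0] /= 2.0                        # DC bin is not doubled
    if n % 2 == 0:
        E[-1] /= 2.0                   # Nyquist bin is not doubled either
    k = 2 * np.pi * np.fft.rfftfreq(n, d=dx)
    return k, E

# Synthetic check: a single mode A*sin(4x) carries energy A^2/2 at k_x = 4.
x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
k, E = streamwise_spectrum(3.0 * np.sin(4 * x), dx=x[1] - x[0])
```

With this normalization, the sum of E over all wavenumbers equals the mean-square fluctuation of u (Parseval's theorem), which makes the large-scale versus inertial-subrange comparison in the text quantitative.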

B. Fine-tuning to a different porosity
The trained model has shown good agreement in terms of the quantitative as well as the qualitative characteristics of the flow. However, the trained model is limited to the data domain that was used for training, which means that it will not predict accurate results under different parametric conditions. This is one of the biggest downsides of such a data-driven model. Considering the current setup, different porous media with various porosities [24,44] and topologies [43] are often compared together, whereas the channel remains unchanged. Variations in porous topology, porosity, and permeability can lead to distinct influences on turbulence modulation, which in turn could impact functionalities like drag reduction, noise absorption, and heat transfer enhancement. We evaluate the use of fine-tuning as a method to extend the versatility of these models to broader data sets. Fine-tuning involves adapting a pre-trained model to new, limited data. Given that the DAE is the computationally intensive part of this workflow, we seek a substantial reduction of its training time by fine-tuning. The DAE model was tested using a distinct data set for this assessment. Here, we used case C06, which has a higher porosity of φ = 0.6. The other parameters, such as the friction Reynolds numbers on both wall sides, Re_τ^p and Re_τ^s, remain comparable. TABLE III shows the cases used for the fine-tuning study and compares their training time and mean square error. The case numbering follows increasing training time. Case A uses the model trained on C05 without any change; its training time is therefore 0, while its error is the highest. Case E is trained from scratch, following the same procedure as in the last section; it features the longest training time and the lowest mean square error, i.e., high accuracy at high training cost. Cases A and E represent the two extreme conditions, whereas cases B, C, and D fall between them. Case D is based on the same model as the original model from C05, but it is retrained with all data points of case C06. Its training cost of O(10^2) is one order of magnitude lower than that of case E, whereas its mean square error is only slightly higher, indicating a worthwhile trade-off. Case C is fundamentally close to case D but is retrained with only 10% of the case C06 data. Hence, its training time of O(10^1) is again one order of magnitude lower than that of case D, but with a significant increase of the mean square error. Case B inherits the model from case C05 with only 1 trainable layer for case C06. Its training time is approximately half of that of case C, with a significant increase in the error. In general, both case C and case D are valuable approaches with a reasonable trade-off between accuracy and training cost. Case D offers high accuracy with a still significant reduction of the training cost, whereas case C can be valuable when training resources are extremely limited.
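The layer-freezing strategy of case B can be illustrated framework-agnostically. The following toy NumPy sketch (a hypothetical two-layer linear model standing in for the DAE, not the authors' network) retrains only the last layer by gradient descent while the pre-trained layer stays frozen:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-layer linear model: y = W2 @ (W1 @ x).
W1 = rng.normal(size=(8, 4))     # "pre-trained" layer, frozen during fine-tuning
W2 = rng.normal(size=(1, 8))     # last layer, the only trainable one (case B pattern)
W1_before = W1.copy()

# Small "new domain" dataset, standing in for a 10% subset of new snapshots.
X = rng.normal(size=(4, 64))
y_true = rng.normal(size=(1, 64))

def mse(y, t):
    return float(np.mean((y - t) ** 2))

loss0 = mse(W2 @ (W1 @ X), y_true)
lr = 1e-3
for _ in range(500):
    h = W1 @ X                   # features from the frozen layer
    y = W2 @ h
    grad_W2 = 2 * (y - y_true) @ h.T / X.shape[1]
    W2 -= lr * grad_W2           # only W2 is updated; W1 keeps its weights
loss1 = mse(W2 @ (W1 @ X), y_true)
```

Because only the last layer's parameters are updated, each step is far cheaper and far less data-hungry than retraining the full network, at the cost of a higher residual error, mirroring the accuracy/cost trade-off between cases B and E.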
Figure 8 compares the instantaneous velocity and time-averaged velocity obtained from the ground truth (DNS) with all the fine-tuning cases from TABLE III.In the DNS result (figure 8(a)), turbulent structures on multi-scales can be resolved by fine resolutions.On the other hand, low-rank modeled data by DAE can capture the dominant structures such as the blow and suction events, for instance, the low-velocity region at the interface (x ≈ 20, y ≈ 10).Moreover, figure 9 shows the quantification of the time-averaged streamwise velocity profile and its fluctuation.In the channel region, all the models are able to reproduce the DNS results even in case A without any training on the DNS data.It is because the channel domain is topologically the same even though the porosity in the connected domain is different.A clear difference is seen in the porous domain as well as in the vicinity of the interface.Case A in red color is not able to follow the DNS data of the mean velocity as well as the velocity fluctuation.On the other hand, cases B, C and case D, empowered by the fine-tuning method, show an excellent modeling of the flow inside the porous domain regarding the mean velocity profile.This is quite impressive considering case B has only one trainable layer and with more than two orders of magnitude lower training cost.This suggests that a small amount of data is able to fine-tune the model to model and predict the mean profile in the entire domain.The prediction of velocity fluctuation in figure 9(b) seems much more challenging.Case B shows only a slight improvement but is still close to that of case A. 
Nevertheless, case B can model the mean velocity profile with no clear discrepancy. Case C (10% new training data and an O(10^2) reduction of the training time) shows a significant improvement in both the channel and the porous region. Its performance is quite close to that of cases D and E, which have a much higher training cost. Therefore, case C is considered a balanced case in terms of efficiency and accuracy. On the other hand, the further cost reduction from case C to case B is not very significant, while the accuracy loss is noticeable. For a more in-depth examination of the fine-tuning results in turbulent channel flows, figure 10 shows the streamwise spectra E_uu for all fine-tuned cases against the DNS results at y = 19.5 (close to the top wall) and y = 10.2 (close to the porous wall). Across all fine-tuned models, the predictions of the large-scale motion at k_x < 1 are highly accurate. However, deviations become apparent at higher streamwise wavenumbers. Although differences in the time-averaged velocity fluctuations are noticeable (as seen in figure 9), they are less discernible in frequency space when plotted on a log-log scale. Among the cases, cases D and E stand out for their superior performance in the inertial subrange at both positions.
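The streamwise spectrum E_uu in figure 10 can be computed from the fluctuation signal at a fixed wall-normal location by a one-dimensional FFT in x, averaged over snapshots. A minimal sketch; the grid size, domain length and synthetic signal below are illustrative assumptions, not the actual simulation parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
nt, nx = 200, 128                  # snapshots and streamwise grid points (assumed)
Lx = 4.0 * np.pi                   # streamwise domain length (assumed)
u = rng.standard_normal((nt, nx))  # u'(t, x) at one wall-normal location (synthetic)

# Remove the instantaneous streamwise mean, then take a one-sided spectrum.
u_fluc = u - u.mean(axis=1, keepdims=True)
u_hat = np.fft.rfft(u_fluc, axis=1) / nx
E_uu = 2.0 * (np.abs(u_hat) ** 2).mean(axis=0)
E_uu[0] /= 2.0                     # mean mode is counted once
E_uu[-1] /= 2.0                    # Nyquist mode is counted once (nx even)
k_x = 2.0 * np.pi * np.fft.rfftfreq(nx, d=Lx / nx)
```

By Parseval's theorem, the sum of E_uu recovers the variance of the signal, which is a convenient sanity check when comparing model spectra against the DNS.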
To further investigate the performance of the fine-tuned models, figure 11 displays a close-up of the instantaneous velocity field u_x in the porous media domain. Case A fails to accurately capture the flow pattern within the porous domain because it has not seen the new training data. Starting from case B, fine-tuning significantly improves the modeling of the flow inside the porous media even though the training data are quite limited. Increasing the training effort from case B to case E progressively enhances the fidelity. Even though the flow in the porous media is significantly influenced by dispersion caused by the porous structure, there are still instances of non-uniform velocity around y ≈ 8. This is indicative of a transitional regime between laminar and turbulent flow, owing to its proximity to the interface. This phenomenon is better captured by cases D and E.
The interfacial behavior is of great interest since it enables bi-directional interactions between the channel flow and the porous media [24, 44]. At the same time, it features a high complexity involving two different types of geometries as well as a mixed state of turbulent, transitional and laminar flows. Figure 12 depicts the instantaneous velocity fluctuation at the interface from the DNS and the various reduced-order models. In the boundary layer region (x > 10), the DNS shows a rich range of scales from large-scale structures down to fine motions, whereas the ROM can only reproduce the larger ones. This is reasonable considering that the ROM has to discard these details to deliver a low-rank model. Among the cases, case A shows a filtered modeling of the velocity fluctuation field, whereas more details are retained by the other cases. Below the interface, the strength of the fluctuation is much weaker, but cases C, D and E can still model it to a certain extent. Case A has received no training data from this DNS; therefore, the porous topology is unclear and its prediction is far from satisfactory. Addressing the challenges posed by the complex geometry and the mixed state of laminar, transitional and turbulent regimes, our study introduces a data-driven, end-to-end machine learning framework specifically designed to model turbulent flows within a channel flow coupled with porous media. To achieve this, we have established a non-linear reduced-order model powered by a deep convolutional autoencoder network. Impressively, this model can reduce the spatial dimensions required for highly resolved simulations, integral for direct numerical simulation, by 99%. To capture the time-varying characteristics of the reduced modes, we employed a downstream recurrent neural network. This integration allows the model to accurately predict the future trajectories of these modes. Furthermore, to test the adaptability and robustness of our model, we evaluated its performance on a dataset that presents a different porosity from
the original training set. Through fine-tuning, our system has demonstrated the potential to significantly cut down on computational resources, reducing the training effort by up to two orders of magnitude when working with a limited dataset (constituting merely 10% of the full training data). Even with this reduced data input, the results achieved were commendable, particularly with respect to the mean velocity profile. The efficient fine-tuning of our current model significantly reduces the computational time, making it much easier to study the effect of different types of porous media on drag reduction, enhanced heat transfer and noise absorption. This efficiency is particularly advantageous for systematic investigations into interface engineering, allowing for more focused and quicker research cycles. Of particular note is the observation that the fine-tuned model excels within the porous domain in comparison to the channel and interface regions. This suggests that the inherent topological features of the porous media are more amenable to training than the challenging multi-scale attributes typical of turbulent flows.

FIG. 12. A detailed look at the instantaneous velocity fluctuation u'_x at the channel flow-porous media interface.
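The 99% reduction of the spatial dimensions quoted above is simple bookkeeping between the full-field degrees of freedom and the latent size of the autoencoder. The grid resolution and latent dimension below are hypothetical illustrative values, not the actual configuration of this work:

```python
# Degrees of freedom of the resolved field vs. the autoencoder latent space.
ny, nx = 320, 640         # resolved grid (hypothetical)
n_modes = 2048            # latent dimension of the DAE (hypothetical)

full_dof = ny * nx                    # values per snapshot on the full grid
reduction = 1.0 - n_modes / full_dof  # fraction of dimensions removed
```

With these illustrative numbers the latent space keeps 1% of the original degrees of freedom, i.e. a 99% reduction.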

FIG. 3. A comparison of the extracted features from the DAE, which act as the ground truth for the training, with the prediction from the ESN for two of the modes.

FIG. 7. Streamwise spectra of the velocity fluctuation at different locations: (a) close to the smooth wall, (b) close to the permeable wall.

TABLE I. Simulation parameters. The porosity of the porous medium region is φ. The friction Reynolds numbers are Re_τ^p and Re_τ^s for the porous and impermeable top walls, respectively.

TABLE II. DAE architecture used at the first level for feature extraction. The table summarizes the encoder and decoder architectures (Conv = convolution, Max Pool = max pooling, Upsamp = upsampling). The symbol E2 denotes, for example, encoder hidden layer No. 2.
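The dimension bookkeeping behind such an encoder table can be sketched in plain Python. The layer count, pooling factor and input resolution below are illustrative assumptions rather than the exact TABLE II configuration:

```python
# Track the spatial shape of a feature map through an encoder stack of
# convolution (stride 1, 'same' padding) and 2x2 max-pooling layers.
# Layer sizes here are illustrative, not the exact TABLE II values.
def conv_same(h, w):
    return h, w                      # 'same' padding keeps the spatial size

def max_pool2(h, w):
    return h // 2, w // 2            # 2x2 pooling halves each dimension

shape = (256, 512)                   # assumed input resolution (y, x)
encoder = [conv_same, max_pool2, conv_same, max_pool2, conv_same, max_pool2]
for layer in encoder:
    shape = layer(*shape)
# After three pooling stages the field is reduced by a factor of 8 per axis.
```

The decoder mirrors this with upsampling layers, doubling each spatial dimension per stage until the original resolution is recovered.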