This paper introduces a novel neural network, a flow completion network (FCN), to infer the fluid dynamics, including the flow field and the force acting on the body, from incomplete data based on a graph convolution attention network. The FCN is composed of several graph convolution layers and spatial attention layers. It is designed to infer the velocity field and the vortex force contribution of the flow field when combined with the vortex force map method. Compared with other neural networks adopted in fluid dynamics, the FCN is capable of dealing with both structured data and unstructured data. The performance of the proposed FCN is assessed using computational fluid dynamics (CFD) data on the flow field around a circular cylinder. The force coefficients predicted by our model are validated against those obtained directly from CFD. Moreover, it is shown that our model effectively utilizes the existing flow field information and the gradient information simultaneously, giving better performance than the traditional convolutional neural network (CNN)-based and deep neural network (DNN)-based models. Specifically, among all the cases of different Reynolds numbers and different proportions of the training dataset, the results show that the proposed FCN achieves a maximum normalized mean square error of 5.86% on the test dataset, which is much lower than those of the traditional CNN-based and DNN-based models (42.32% and 15.63%, respectively).
I. INTRODUCTION
Flow field completion and body force extraction from incomplete flow field information are important in a range of applications. An example is in one of the key approaches in experimental fluid dynamics—particle image velocimetry (PIV),1,2 where noninvasive force measurements have long been a challenging task. Another application is the load prediction and controls in aeroelastic problems such as the wind farm flow prediction and control from LIDAR measurements.3 The flow field reconstruction from sparse sensors4 also involves the flow completion techniques. Solving the Navier–Stokes equations in computational fluid dynamics (CFD) could provide the detailed flow field and the pressure distribution and skin friction on the body surface, thus the unsteady force acting on the body. Direct load measurements in PIV can be significantly contaminated by resonance effect,5,6 and it can be advantageous to obtain force information instead by computing them from the measured flow field. Resolving these forces directly from surface pressures and skin friction has been challenging since resolving the entire boundary layer to an adequate resolution near the solid surface is not realistic in most experimental measurements.5 Instead, volumetric pressure-free methods7–11 achieve success in taking advantage of accurate experimental measurements of flow fields such as PIV and extracting the force on the body in a non-intrusive way. Li and Wu12 proposed a vortex force map (VFM) method by further exploring Howe's force formula10 for the derivation of the body force from the vorticity field. The VFM method has been well extended to a finite and limited chosen region enclosing the body13 and to three-dimensional flows.14 However, these methods still require detailed flow field information at least in a specific domain enclosing the body. 
Therefore, in this work, we will explore the flow field reconstruction from very limited incomplete measurements and further predict the body force combined with the previously proposed VFM method.
The recent boom of data-driven approaches and the proliferation of high-quality experimental or CFD flow data have attracted great attention to data-driven inference for simulating, reconstructing, or predicting fluid dynamics properties. Currently, high-fidelity CFD simulation is still resource-intensive, which limits its use in industrial applications requiring quick turnarounds. Low-order theoretical descriptions of flow features have seen some success with analytical-numerical coupling methods6,15 but are very limited in their scope of application. The situation is now changing with the introduction of machine learning, which has been widely used in the reconstruction or surrogate modeling of flow fields from information collected from either experiments or numerical simulations.16,17 Fukami et al.18 adopted a standard convolutional neural network (CNN) and developed an improved hybrid downsampled skip-connection multi-scale (DSC/MS) model to reconstruct the high-resolution flow field from grossly under-resolved flow field data. They showed a remarkable ability to reconstruct laminar and turbulent flow fields from low-resolution data. Morimoto et al.19 developed a CNN-based method to estimate the velocity field from imperfect experimental (PIV) snapshots with missing data. Kochkov et al.20 used an end-to-end CNN-based model to improve approximations inside computational fluid dynamics for modeling two-dimensional turbulent flows. Their research exemplifies how scientific computing can leverage machine learning and hardware accelerators to improve simulations without sacrificing accuracy or generalization. Raissi et al.21,22 developed a classical physics-informed deep neural network by including the N–S equations in the loss functions to infer the velocity, pressure, and hence, the lift and drag from limited and scattered time-space data of the velocity field.
This method predicted precisely the flow information within the range of the training set. Miyanawala and Jaiman23 proposed an efficient model reduction technique based on the CNN and the stochastic gradient descent method to predict the unsteady fluid dynamic forces for different geometries at low Reynolds numbers. Bhatnagar et al.24 built a surrogate model for flow field prediction based on CNNs, which was shown to predict the velocity and pressure field orders of magnitude faster than the RANS solver. Specific convolution operations, parameter sharing, and gradient sharpening are used to improve the capability of the CNN.
Most of the traditional CNN-based methods are inherently limited to structured data since they require the generation of a feature matrix, which cannot be applied to unstructured data. Flow measurements are, however, often highly unstructured or even scattered. Moreover, the standard CNN is translation invariant and sensitive to the scale of the data. The resolution of the output data depends on the scale of the training data and the resolution of the input data. In addition, it is difficult to directly use the N–S equations as a loss function in a standard CNN model since the CNN-based model structure cannot automatically calculate the partial derivatives with respect to the coordinates through existing deep learning frameworks (such as TensorFlow and PyTorch). Other works25,26 on deep learning of CFD on irregular geometries and unstructured grids have overcome the limitations of CNNs for complex geometries in steady flow problems.
To perform inference on unstructured or mesh-free data, Trask et al.27 introduced GMLS-Nets, which parameterize the generalized moving least squares functional regression method. The GMLS-Nets demonstrated successful prediction of body forces on a cylinder dataset based on unstructured point cloud fluid data. Ogoke et al.28 proposed a data-driven graph neural network (GNN) framework, extended from GraphSAGE,29 for the drag force prediction of flow fields from irregular and unstructured data. In Ref. 29, the Top-K pooling step is used to replace the feature aggregation. However, none of these existing GNN-based models applies the laws of physics (the N–S equations in fluid dynamics) to the flow field prediction, which has been shown to be vital to the accuracy, efficiency, and generalization capability of the model.30 Thus, the physics-informed GNN applied to unstructured data needs further exploration.
In this work, a novel deep learning model, a flow completion network (FCN) extended from GraphSAGE,29 is designed to accurately predict the velocity field from incomplete knowledge of the existing flow field data. Combined with the VFM method, the predicted velocity field is directly used to infer the force contribution of the vortex flow field, avoiding the use of an intermediate variable, the pressure field. It is well understood that over-smoothing31,32 and the lack of gradient information are detrimental to the convergence rate and the accuracy of GNN models. Thus, in our FCN, five neural network layers are introduced instead of a deeper GNN in order to suppress the over-smoothing phenomenon.33–35 The five neural network layers consist of three graph convolution (GC) layers and two spatial gradient attention (SGA) layers, where the SGA layers also utilize the gradient information between the reference nodes to perform accurate information transmission and flow field prediction. Unlike the traditional CNN model, whose application is limited to structured data, this model is free from the constraints of the data structure and can deal with both structured and unstructured data.
To effectively use the gradient information between nodes, gradient attention layers are carefully designed to facilitate the transmission of gradient information between nodes. This procedure greatly simplifies the structure of the model and increases its performance. Moreover, the gradient information in N–S equation is integrated into the GNN model training as a loss function to make sure the obtained model conforms to the physical laws.
The experimental results show that the proposed FCN model could accurately predict the flow features such as the velocity in an efficient manner. It could also predict the body force when combined with the VFM method. It works well on limited or even missing regions of the training data presented on unstructured meshes or scattered points.
In Sec. II, the problem setup and methodology are introduced. The principle and structure of the proposed FCN and its sub-modules are described in detail in this section. A brief introduction to other networks is given as well for comparison.
II. PROBLEM SETUP AND METHODOLOGY
We start with a classical flow problem around a circular cylinder. The unsteady fluid motion is governed by the incompressible Navier–Stokes equations, where the density ρ and the viscosity μ of the fluid are constant. The solid body is denoted by ΩB and is bounded by a closed surface SB. Given scattered measurements of the snapshot data of the flow field, this work aims to infer fluid dynamics features such as the velocity field and the body force. Specifically, this work is devoted to accurately predicting the flow field velocity on arbitrary nodes from the observable data on a finite number of reference nodes with known coordinates. Moreover, with the vortex force map (VFM), we can calculate the body forces (lift and drag) from the inferred velocity fields.
To solve the aforementioned problem, we represent the model as a function M of three inputs: the observed features on the reference nodes, the coordinates of the reference nodes, and the coordinates of the prediction nodes. The outputs of the model are the predicted features on the target nodes.
Here, the subscript r represents the reference nodes, while the subscript p represents the prediction nodes. Symbols with a hat represent the predicted features, while symbols without a hat represent the ground truth features. In this paper, we take the ground truth features to be the data computed from CFD. We deal with two-dimensional (2D) flow field completion cases. This 2D model could easily be extended to a three-dimensional (3D) model by extending the 2D N–S gradient loss function, Eq. (11), to 3D and changing the coordinates of the relevant nodes [see Fig. 1(d)] to 3D coordinates. Part of the data is used to train the model, and the rest is used to test and evaluate the model. The details of the dataset sampling are described in Sec. II C.
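To make the input and output roles described above concrete, the following minimal sketch uses a simple inverse-distance-weighting interpolator with the same interface (reference features, reference coordinates, and prediction coordinates in; predicted features out). It is not the FCN; the function name `complete_flow` and the `power` parameter are hypothetical and serve only to illustrate the data flow.

```python
import math

def complete_flow(ref_feats, ref_coords, pred_coords, power=2.0):
    """Stand-in with the same interface as the model: predict features at
    pred_coords from (ref_feats, ref_coords) by inverse-distance weighting.
    This is NOT the FCN; it only illustrates the inputs and outputs."""
    n_feat = len(ref_feats[0])
    predictions = []
    for xp, yp in pred_coords:
        weights, exact = [], None
        for (xr, yr), feat in zip(ref_coords, ref_feats):
            dist = math.hypot(xp - xr, yp - yr)
            if dist == 0.0:          # prediction node coincides with a reference node
                exact = feat
                break
            weights.append(1.0 / dist ** power)
        if exact is not None:
            predictions.append(tuple(exact))
            continue
        total = sum(weights)
        predictions.append(tuple(
            sum(w * f[k] for w, f in zip(weights, ref_feats)) / total
            for k in range(n_feat)
        ))
    return predictions

# Two reference nodes carrying (u, v); predict at the midpoint and at a reference node.
ref_coords = [(0.0, 0.0), (2.0, 0.0)]
ref_feats = [(1.0, 0.0), (3.0, 0.5)]
pred = complete_flow(ref_feats, ref_coords, [(1.0, 0.0), (0.0, 0.0)])
```

Any learned model with this signature can be swapped in for the interpolator without changing the surrounding data handling.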
FIG. 1. The structure of the FCN. (a) The framework of the FCN, which consists of input and output layers, three graph convolution layers (GC layers I, II, and III), and two spatial gradient attention layers (SGA layers I and II). (b) Visual illustration of SGA layer I. The reference nodes (shown as solid orange dots) are used to infer the intermediate nodes (Nrj, shown as solid blue dots). (c) Visual illustration of SGA layer II in (a). Six intermediate nodes are used to infer the prediction node (Np, shown as a solid green dot). Note that, for illustration purposes only, the structure of the GraphSAGE depicted in this figure comprises three reference points in each layer. (d) The main framework of the SGA layers, showing how to infer the features on the prediction node Np from its six neighboring reference nodes. Here, Xp denotes the coordinates of node p, which are paired with the features of that node. AF is the feature attention, while AC is the coordinate attention. “ResBlock” represents the residual learning block, and “ResSE” represents the ResBlock with a squeeze-and-excitation module. (e) The details of the ResBlock and ResSE. “Conv” denotes the one-dimensional convolution layer, “Pooling” denotes the global average pooling layer, and “FC” denotes the fully connected layer.
After obtaining the velocity fields from the incomplete measurements through the aforementioned model, we use the VFM method14 to extract the lift and drag coefficients on the circular cylinder
where is the free stream velocity, d is the diameter of the circular cylinder, and is the vorticity. and are the unit vectors in the lift and drag directions, respectively. is the Reynolds number. The vortex force vectors are defined as
and the hypothetical potential is defined as
A. The FCN
The main framework of the proposed FCN is shown in Fig. 1. The FCN consists of three graph convolution (GC) modules (GC layers I, II, and III) and two spatial gradient attention (SGA) modules (SGA layers I and II), as shown in Fig. 1(a). Each GC layer contains one simple neuron layer, and each SGA layer is a multilayer perceptron (MLP) containing six simple neuron layers. Thus, the total number of hidden layers is 15 (3 × 1 + 2 × 6), each of them containing 64 neurons as a general treatment to meet the requirements of the model performance.21 The activation functions and the output functions are collectively referred to as the transfer functions. The activation functions between different hidden layers are ReLU activations36 (torch.nn.ReLU in the PyTorch deep learning framework). There are no output functions in our model. The GC module is mainly used to learn the node features. The details of the structure of each GC module are introduced in Sec. II A 1. For a more accurate aggregation process, in other words, to learn the flow features on the target nodes from the neighbor nodes more accurately, the SGA module is extended from the aggregation module in GraphSAGE.29 More details can be found in Sec. II A 2. The N–S gradient loss function in Eq. (11) enables the SGA modules to learn the gradient characteristics in line with the physical laws described by the N–S equations, Eq. (10). The first and second SGA layers are designed to learn the first- and second-order partial derivatives in line with the N–S equations, respectively. Three GC layers are then placed before and after the SGA layers to fuse the flow field features and the learned spatial gradient information. We choose three GC layers, rather than more, to suppress the over-smoothing phenomenon33–35 for a more accurate model.
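The layer count quoted above can be checked with a few lines of bookkeeping. The per-module layer counts and the 64 neurons per layer are taken from the text; the alternating module ordering in the list is an assumption made for illustration, since the exact ordering is only shown in Fig. 1(a).

```python
# Per-module depths stated in the text.
GC_LAYERS_PER_MODULE = 1    # each GC module: one simple neuron layer
SGA_LAYERS_PER_MODULE = 6   # each SGA module: an MLP with six neuron layers
NEURONS_PER_LAYER = 64

# Assumed ordering (GC and SGA modules alternating); only the counts come from the text.
modules = ["GC I", "SGA I", "GC II", "SGA II", "GC III"]

hidden_layers = sum(
    SGA_LAYERS_PER_MODULE if name.startswith("SGA") else GC_LAYERS_PER_MODULE
    for name in modules
)
total_hidden_neurons = hidden_layers * NEURONS_PER_LAYER

print(hidden_layers, total_hidden_neurons)  # prints: 15 960
```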
1. The graph convolution (GC) module
The GC module is adopted from GraphSAGE,29 a general, inductive framework proposed by Hamilton et al.29 that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. The role of the GC module is to learn the topological structure and the vertex features, in other words, the embedding representation of vertices. The three GC modules we used are shown in Fig. 1(a).
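For readers unfamiliar with GraphSAGE, the sketch below shows one layer with the mean aggregator in plain Python: each node concatenates its own feature vector with the mean of its neighbors' features, applies a weight matrix, and passes the result through a ReLU. This is a generic illustration under simplifying assumptions (no bias, no embedding normalization, every node has at least one neighbor); the function names and the toy graph are hypothetical and are not the GC module of the FCN.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def sage_mean_layer(features, neighbors, weights):
    """One GraphSAGE-style layer with the mean aggregator.

    features:  dict node -> list of floats (length in_dim)
    neighbors: dict node -> list of neighbor nodes (non-empty)
    weights:   matrix of shape (out_dim, 2 * in_dim) applied to [self ; mean(neighbors)]
    """
    updated = {}
    for node, h_self in features.items():
        neigh = neighbors[node]
        h_mean = [
            sum(features[n][k] for n in neigh) / len(neigh)
            for k in range(len(h_self))
        ]
        updated[node] = [relu(z) for z in matvec(weights, h_self + h_mean)]
    return updated

# Toy graph: node 0 is connected to nodes 1 and 2.
features = {0: [1.0], 1: [3.0], 2: [5.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
weights = [[1.0, 1.0], [1.0, -1.0]]   # illustrative values, not trained
embeddings = sage_mean_layer(features, neighbors, weights)
```

Because the weights are shared across nodes, the same layer can be applied to nodes that were not seen during training, which is the inductive property referred to above.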
2. The spatial gradient attention module
In the SGA layers, a spatial-based graph convolution network is introduced, where GraphSAGE (Graph SAmple and aggreGatE), proposed by Hamilton et al.,29 is adapted to deal with a spatial-based graph. We have made two main modifications to the original GraphSAGE. The first is to replace the feature distance layer with the gradient feature attention layer [AF in Fig. 1(d)]. The second is to introduce the gradient coordinate attention layer [AC in Fig. 1(d)], which is based on the spatial coordinates. The two SGA modules (SGA I and SGA II) are shown in Figs. 1(b) and 1(c), respectively. Figure 1(b) shows how to infer the data on the target node Nri from its six neighbor nodes. For demonstration purposes, only three neighbor nodes are depicted in the figure. Similarly, Fig. 1(c) shows how to infer the data on the target node Np from its six neighbor nodes, of which only three are depicted.
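The text states that each target node is inferred from six neighbor nodes but does not say how they are chosen. The sketch below assumes the six nearest reference nodes by Euclidean distance and returns their coordinate offsets from the target, which is the kind of spatial input a coordinate-based attention layer could consume. The selection rule and the function names are assumptions for illustration only.

```python
import math

def nearest_neighbors(target, ref_coords, k=6):
    """Indices of the k reference nodes closest to the target (Euclidean distance)."""
    order = sorted(
        range(len(ref_coords)),
        key=lambda i: math.dist(target, ref_coords[i]),
    )
    return order[:k]

def neighbor_offsets(target, ref_coords, indices):
    """Coordinate offsets (dx, dy) from the target node to each selected neighbor."""
    return [
        (ref_coords[i][0] - target[0], ref_coords[i][1] - target[1])
        for i in indices
    ]

# Eight reference nodes on a line; the target sits at the origin.
ref_coords = [(float(i), 0.0) for i in range(1, 9)]
target = (0.0, 0.0)
idx = nearest_neighbors(target, ref_coords, k=6)
offsets = neighbor_offsets(target, ref_coords, idx)
```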
The SGA module is proposed here to calculate the spatial gradient and aggregate it into the node features. As the N–S equations in Eq. (10) contain second-order partial derivatives, two SGA layers are introduced in our model to learn to compute the first and second partial derivatives. In order to aggregate node features from multiple local neighborhood nodes, we modify the input channel of the attention layers (AF) of the original GraphSAGE framework. As shown in Figs. 1(d) and 1(e), the gradient attention layer contains a ResBlock module and a ResSE module. The ResSE is the SE-ResNet module proposed by Hu et al.37 and He et al.,38 and the ResBlock is the residual module introduced by He et al.38
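The squeeze-and-excitation step inside ResSE can be summarized in a few lines: global average pooling per channel (squeeze), two fully connected layers with ReLU and sigmoid (excitation), and channel-wise rescaling. The sketch below is a generic plain-Python illustration of that gating only; the residual shortcut and the convolutions of the ResSE block are omitted, biases are left out, and the weight values are placeholders rather than trained parameters.

```python
import math

def squeeze_excite(x, w_reduce, w_expand):
    """Channel gating of a squeeze-and-excitation block.

    x:        list of channels, each a list of values along the node dimension
    w_reduce: matrix of shape (reduced, channels) for the first FC layer
    w_expand: matrix of shape (channels, reduced) for the second FC layer
    """
    # Squeeze: global average pooling over each channel.
    squeezed = [sum(channel) / len(channel) for channel in x]
    # Excitation: FC + ReLU, then FC + sigmoid, giving one gate per channel.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, x)]

# With all-zero weights every gate equals sigmoid(0) = 0.5.
x = [[1.0, 3.0], [2.0, 2.0]]
scaled = squeeze_excite(x, w_reduce=[[0.0, 0.0]], w_expand=[[0.0], [0.0]])
```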
B. Other models
For comparison with our FCN, the DNN-based model and the CNN-based model, both commonly used in fluid dynamics, are also introduced here.
1. The DNN-based model
The DNN-based model relates the input data (time t and the coordinates x) to the predicted features on the target nodes. The main framework of the model is shown in Fig. 2(a). The DNN-based model contains ten hidden layers, and each hidden layer consists of 64 weight neurons and one bias neuron. The normalized mean square error loss, Eq. (12), and the Navier–Stokes loss functions are used in the training procedure for our DNN-based model, following the work by Raissi et al.21
FIG. 2. The main framework of the other models. (a) The DNN-based model, which consists of ten hidden layers with 64 weight neurons and one bias neuron per hidden layer. It takes the input variables t, x, and y and outputs u and v. (b) The CNN-based model.
2. The CNN-based model
The CNN-based model shown in Fig. 2(b) is similar to the deep learning model proposed by Gao and Grauman.39 The input data of the model are the given features of the given nodes and the mask m, and the output data are the features on the target nodes. Here, the mask m is used to distinguish the reference nodes from the target nodes: m = 0 refers to a reference node, while m = 1 refers to a target node. This model uses a SegNet-like framework,40 which consists of encoding layers and decoding layers. The encoding layers contain four downsample blocks (Down1, Down2, Down3, and Down4), and the decoding layers contain four upsample blocks (Up1, Up2, Up3, and Up4), as shown in Fig. 2. A ResBlock38 is used to connect the encoder and the decoder in the CNN-based model. The mean square error loss, the same as the loss function in SegNet,40 is used for training the CNN-based model.
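The mask convention described above (m = 0 for reference nodes, m = 1 for target nodes) can be sketched on a small structured grid as follows. The helper names and the choice of filling unobserved cells with zero are assumptions for illustration; the text does not specify how target cells are filled before they enter the network.

```python
def build_mask(n_rows, n_cols, reference_cells):
    """Return a grid with m = 0 at reference (observed) cells and m = 1 at target cells."""
    observed = set(reference_cells)
    return [
        [0 if (i, j) in observed else 1 for j in range(n_cols)]
        for i in range(n_rows)
    ]

def masked_input(field, mask, fill=0.0):
    """Keep observed values and replace target cells with a fill value (assumed to be 0)."""
    return [
        [value if m == 0 else fill for value, m in zip(field_row, mask_row)]
        for field_row, mask_row in zip(field, mask)
    ]

# A 2 x 3 field observed only at two cells.
field = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
mask = build_mask(2, 3, reference_cells=[(0, 0), (1, 2)])
network_input = masked_input(field, mask)
```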
C. Dataset, loss functions, and metrics
The flow field data around a circular cylinder obtained by CFD are used as the dataset in this work. For the CFD simulations, the Navier–Stokes equations for unsteady laminar flow are solved numerically using the same method as used by Li et al.13 It consists of using the commercial code Fluent with a second-order upwind SIMPLE (semi-implicit method for pressure-linked equations) pressure–velocity coupling method. The computational domain is 36 diameters in the inflow direction and 21 diameters in the direction perpendicular to the inflow. For the cases of Re = 100, Re = 500, and Re = 1000, three meshes with 41 573, 64 190, and 80 197 grid points are used, respectively. For all the cases, 20 mesh layers inside the laminar boundary layer are guaranteed. The flow is impulsively started from an initially uniform flow. The non-dimensional time is defined as the number of chord lengths the uniform flow travels. For the cases of Re = 100, Re = 500, and Re = 1000, the flow is simulated over the same total non-dimensional time, and the flow field data, including the coordinates, the velocity, and the vorticity, are saved at a fixed time interval to form the dataset. The total number of points in the dataset is therefore set by the mesh size and the number of saved snapshots for each of the cases Re = 100, Re = 500, and Re = 1000.
The whole dataset is divided into three parts: the training set for the training process of our models, the validation set for finding the proper hyper-parameters, and the test set for assessing the performance of the model. To ensure the reliability of the model in general cases, there is no overlap between the training and test datasets. The datasets are selected by a random process,41 here a 10-fold cross-validation strategy. This random process may lead to spatial imbalance in the dataset. To avoid this imbalance, several data augmentation strategies are introduced into the model training process, as detailed in Sec. III A 1. Note that the CFD method utilized in this work serves purely to provide data and for validation purposes. We intend, in future work, to apply the proposed FCN model to the prediction of an experimental field dataset. We should mention that the entire methodology of this work does not involve a pressure field, making its application to experimental data, such as PIV, more straightforward. The numerical method used in this work has been well validated in the previous work by Li et al.;12,13 thus, the details of the validation are not given here.
We carried out three sets of experiments with different sizes of the training dataset: 10%, 30%, and 50%. The corresponding validation and test datasets are shown in Table I. In the literature, some existing models21,25 use more than 80% of the data for training, and the rest are split equally into the validation dataset and the test dataset. However, the large proportion of the training dataset inevitably limits the generalization ability of existing models. Thus, in this work, we train our model with smaller proportions (10%, 30%, and 50%) of training data. Moreover, a larger proportion is assigned to the test set than to the validation set to demonstrate the performance of our proposed model. The focus of this paper is to design and validate this flow completion network. To show the reliability of the model, three different experiments have been carried out to train three models for three different Reynolds numbers (Re = 100, Re = 500, and Re = 1000) separately. An updated model capable of dealing with multiple Reynolds numbers, with rigorous uncertainty and reliability assessment, is worth exploring in future work. To evaluate the performance of all the models, all the metrics are computed on the test dataset to assess the capability of the models trained with different training datasets. The u and v obtained by CFD are used as labels in our dataset.
TABLE I. The details of the dataset. For comparison, the same experiments are carried out with three training dataset sizes: 10%, 30%, and 50%. Here, NT denotes the total number of points in the dataset, which takes a separate value for each of the cases Re = 100, Re = 500, and Re = 1000.
Training dataset: | 10%NT | 30%NT | 50%NT |
Validation dataset: | 30%NT | 30%NT | 20%NT |
Test dataset: | 60%NT | 40%NT | 30%NT |
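The three splits in Table I can be reproduced with a simple random partition of node indices, sketched below. This is a plain shuffled split, not the 10-fold cross-validation procedure used in the paper; the function name, the seed, and the example total of 1000 points are placeholders for illustration.

```python
import random

# (training %, validation %) for the three experiments in Table I; the rest is the test set.
SPLITS = {"10%": (10, 30), "30%": (30, 30), "50%": (50, 20)}

def split_indices(n_total, train_pct, val_pct, seed=0):
    """Randomly partition range(n_total) into disjoint train, validation, and test sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    n_train = n_total * train_pct // 100
    n_val = n_total * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

# Example with a placeholder total of 1000 points.
train, val, test = split_indices(1000, *SPLITS["10%"])
print(len(train), len(val), len(test))  # prints: 100 300 600
```

Slicing a single shuffled index list guarantees that the three subsets do not overlap, in line with the requirement that training and test data be disjoint.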
We use the normalized mean square error (NMSE) and the absolute error as the metrics to evaluate the performance of the model. The proportion of the training set is also an important index of the generalization ability of the model: for the same level of accuracy, the less data used in the training process, the stronger the generalization ability of the model. The ground truth is the u and v of the CFD result, and the prediction is the u and v predicted for the target nodes by the model M. The NMSE, Enmse, for both u and v is defined as follows:
To assess the generality of the proposed model, in addition to the NMSE of the flow velocity, the relative error between the force coefficients extracted from the predicted u and v and the coefficients computed by CFD is also used in evaluating the performance of our model.
Other statistical error parameters are also defined, including the correlation coefficient (CC),
where the subscript p indicates the prediction nodes (or the target nodes), the variables without a hat denote the ground truth, and the variables with a hat denote the predicted values. The averaging of the variables is indicated by an overline.
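As an illustration of the notation above, the sketch below computes a correlation coefficient in its common Pearson form, using the mean (overline) of the ground truth and of the prediction over the target nodes. It is an assumption that the CC used in this work takes exactly this form; the function name is hypothetical.

```python
import math

def correlation_coefficient(truth, pred):
    """Pearson correlation between ground-truth and predicted values at the target nodes."""
    n = len(truth)
    mean_truth = sum(truth) / n
    mean_pred = sum(pred) / n
    covariance = sum((t - mean_truth) * (p - mean_pred) for t, p in zip(truth, pred))
    var_truth = sum((t - mean_truth) ** 2 for t in truth)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return covariance / math.sqrt(var_truth * var_pred)

# A prediction that is a scaled copy of the truth is perfectly correlated.
cc_scaled = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
cc_reversed = correlation_coefficient([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
```

Note that a CC of 1 does not imply a small error, as the scaled example shows, which is why it is reported alongside the NMSE rather than in place of it.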
The Bias is defined as
The motion of fluids is expressed by conservation laws for mass, momentum, and energy. The equation for mass is known as the continuity equation, while the equation for momentum is called the equation of motion, which is an expression of Newton's second law. For viscous and inviscid fluids, the momentum equations are known as the Navier–Stokes and Euler equations, respectively.
Two loss functions are introduced into the training procedure for our model. One is the N–S gradient loss function, specialized here to include the partial derivatives in the N–S equations, and the other is the conventional normalized mean square error loss function LossNMSE. The N–S gradient loss is designed to guide the model to learn the gradient information in the flow field and is defined in Eq. (11).
The LossNMSE is defined in Eq. (12).
Here, a small constant is added to the denominator of LossNMSE to avoid numerical errors such as division by zero. To train the model, we aim to minimize the difference between the predicted features and those from the CFD result by introducing the N–S gradient loss and LossNMSE.
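The role of the small constant in the denominator can be seen in the following sketch of a normalized mean square error. The normalization by the sum of squared ground-truth values is one common convention and is an assumption here, as are the function name and the value of `eps`.

```python
def nmse(truth, pred, eps=1e-8):
    """Squared error normalized by the energy of the ground truth.

    eps is a small constant (assumed value) that keeps the ratio finite
    when the ground-truth values are all zero.
    """
    numerator = sum((p - t) ** 2 for t, p in zip(truth, pred))
    denominator = sum(t ** 2 for t in truth) + eps
    return numerator / denominator

error_typical = nmse([1.0, 2.0], [1.0, 1.0])   # close to 1 / 5
error_zero_field = nmse([0.0, 0.0], [0.0, 0.0])  # finite thanks to eps
```

Without `eps`, the second call would divide zero by zero; with it, a perfectly predicted quiescent region contributes zero error.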
III. EXPERIMENTS AND RESULTS
A. Training
The training dataset mentioned in Sec. II C is used to train the different models in our experiments. In order to obtain widely applicable models, dimensionless parameters (such as the non-dimensional time τ), instead of dimensional parameters (such as t), are used in this work. The flow completion network converges gradually after about 40 epochs of training. The CNN-based model and the DNN-based model, used for comparison here, converge after about 61 and 67 epochs of training, respectively.
1. The data augmentation strategies
In order to improve the generalization of the model, several data augmentation strategies are used during the training procedures. The balanced weight sampling method and the Gaussian noise method are introduced into our training procedures.
The balanced weight sampling method
To guide the model to better learn the flow features close to the wall area and predict the flow more accurately, we add the sampling weight to the nodes close to the wall. The nodes closer to the wall have a larger sampling weight, while the nodes farther away from the wall have a smaller sampling weight.
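A minimal sketch of such wall-weighted sampling is given below. The text only states that the weight decreases with distance from the wall; the specific form 1 / (distance + delta), the unit-diameter cylinder centered at the origin, and all function names are assumptions for illustration.

```python
import math
import random

def wall_distance(point, center=(0.0, 0.0), radius=0.5):
    """Distance from a node to the cylinder surface (assumed unit diameter, centered at the origin)."""
    return max(0.0, math.hypot(point[0] - center[0], point[1] - center[1]) - radius)

def sampling_weights(points, delta=0.1):
    """Hypothetical weight 1 / (distance + delta): larger near the wall, smaller far away."""
    return [1.0 / (wall_distance(p) + delta) for p in points]

def sample_nodes(points, k, seed=0):
    """Draw k node indices (with replacement) according to the wall-distance weights."""
    rng = random.Random(seed)
    return rng.choices(range(len(points)), weights=sampling_weights(points), k=k)

points = [(0.6, 0.0), (5.0, 0.0)]          # one node near the wall, one far in the wake
weights = sampling_weights(points)
batch = sample_nodes(points, k=32)
```

The offset `delta` keeps the weight finite for nodes lying on the wall itself.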
The Gaussian noise method
To improve the robustness of our models, Gaussian noise with zero mean and a standard deviation of 1.0 is added to the original features during the training procedure.
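The noise augmentation amounts to perturbing every feature value with an independent zero-mean Gaussian sample, as in the sketch below. The standard deviation of 1.0 is taken from the text; whether the features are normalized before the noise is added is not stated, and the function name and seed handling are assumptions.

```python
import random

def add_gaussian_noise(features, std=1.0, seed=None):
    """Return a copy of the feature rows with zero-mean Gaussian noise added to every value."""
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, std) for value in row] for row in features]

clean = [[1.0, 2.0], [3.0, 4.0]]          # e.g., (u, v) at two nodes
noisy = add_gaussian_noise(clean, std=1.0, seed=1)
unchanged = add_gaussian_noise(clean, std=0.0)
```

The noise is applied only to the training inputs; the original values in `clean` are left untouched so that they can still serve as labels.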
2. The hyper-parameters in the training
The hyper-parameters for the training procedure of our model are shown in the following list:
Optimizer (training algorithm): stochastic gradient descent (SGD) algorithm,
Momentum: 0.9,
Learning rate: 10 × 10−3,
Batch size: 32,
Training epochs: 100,
Dropout rate: 0.5.
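For reference, the listed settings can be collected in a configuration dictionary, and the SGD update with momentum can be written out explicitly, as sketched below. The update follows the convention used by PyTorch (velocity = momentum × velocity + gradient, then parameter = parameter − learning rate × velocity); the learning rate is read literally from the list as 10 × 10−3, and the quadratic test function is purely illustrative.

```python
HYPERPARAMS = {
    "optimizer": "SGD",
    "momentum": 0.9,
    "learning_rate": 10e-3,   # read literally from the list above (10 x 10^-3)
    "batch_size": 32,
    "epochs": 100,
    "dropout": 0.5,
}

def sgd_momentum_step(params, grads, velocity, lr, momentum):
    """One SGD-with-momentum update: v <- momentum * v + g, then p <- p - lr * v."""
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity

# Illustrative use on f(w) = w^2, whose gradient is 2w.
w, v = [1.0], [0.0]
w, v = sgd_momentum_step(
    w, [2.0 * w[0]], v,
    lr=HYPERPARAMS["learning_rate"],
    momentum=HYPERPARAMS["momentum"],
)
```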
The best learning rate and batch size are determined by a grid search strategy. The schematics of the unstructured mesh grid used in the CFD simulation and the training dataset are shown in Fig. 3. As stated in Sec. II C, 10%, 30%, and 50% of the points, uniformly and randomly distributed, are extracted from the total dataset as training datasets. In Fig. 3, we show only the 10% and 30% training datasets for simplicity. Training datasets with and without unobserved regions are also tested.
FIG. 3. For the case of Re = 1000: (a) the schematic of the unstructured mesh grid used in the CFD. (b) The 10% randomly distributed training dataset nodes in the flow field. (c) The 30% randomly distributed training dataset nodes in the flow field without unobserved regions. (d) The 30% randomly distributed training dataset nodes in the flow field with an unobserved pentagon region in the wake area, which is shown in green.
B. Results
The experimental results on the dataset described in Sec. II C are presented in this section. The proposed FCN model is compared with the traditional CNN and DNN-based models in the same dataset. We also test the generalization ability of all the models on the dataset.
The experimental results for the lift and drag coefficients against non-dimensional time τ, predicted by the CNN-based, DNN-based, and FCN models for different Reynolds numbers, are shown in Figs. 4(a) and 4(b). Figure 4(c) shows the relative error of the lift coefficients obtained from the different models (CNN, DNN, and FCN) for the three Reynolds number cases. These figures show that our proposed FCN performs better in predicting the lift and drag coefficients than the traditional CNN- and DNN-based models. From Fig. 4(d), we can see that the normalized mean square error of our proposed model is lower than those of the CNN- and DNN-based models. Moreover, the performance of the CNN- and DNN-based models varies with the non-dimensional time τ, while the proposed FCN has stable performance over the whole time range. One possible explanation is that our proposed FCN can learn physics from the N–S gradient loss and utilizes the information on the neighbor nodes, which leads to a better prediction.
(a) The comparison of the lift coefficient curves predicted by the CNN, DNN, and FCN models and computed from the CFD data at three different Reynolds numbers: Re = 100, Re = 500, and Re = 1000. (b) The comparison of the drag coefficient curves predicted by the CNN, DNN, and FCN models and computed from the CFD data at three different Reynolds numbers: Re = 100, Re = 500, and Re = 1000. (c) The relative error of the lift coefficients obtained from the different models (CNN, DNN, and FCN) for the three Reynolds number cases: Re = 100, Re = 500, and Re = 1000. (d) The normalized mean square error (NMSE) of the flow velocity, NMSE(u, v), for the three different models (CNN, DNN, and FCN).
The predicted velocity, vorticity, and hence lift and drag distributions for the cylinder dataset (Re = 1000) obtained from the FCN and from CFD at a typical instant are shown in Fig. 5. The first two rows of Fig. 5 show the predicted and CFD velocity fields (u, v) and their differences (Learned − CFD). The third row shows the same comparison for the vorticity field, while the fourth and fifth rows show it for the lift and drag distributions, respectively. Good agreement is found between the proposed FCN model and the CFD results.
The comparison of the predicted and CFD results, and their differences (Learned − CFD), for the velocity field (u, v), the vorticity field, and the lift and drag distributions for the case of Re = 1000.
To demonstrate the effectiveness of our model, the flow fields completed by our model, the CNN-based model, and the DNN-based model are compared with the CFD results at different scales in Fig. 6. The vorticity ω is shown here. From Fig. 6, we can see that the vorticity fields learned by the CNN-based and DNN-based models are noisy due to a lack of second-order gradient information during the training procedure. The proposed FCN outperforms the traditional CNN-based and DNN-based models in predicting the flow field from unstructured data.
(a) The CFD result of the non-dimensional vorticity field at a typical instant for the cylinder flow at Re = 1000. (b) The super-resolution vorticity field comparison between the model predictions and CFD in an area magnified from (a). (c) The super-resolution vorticity field comparison between the model predictions and CFD in an area magnified from (b).
Shadows or unobserved data are very common in experimental measurements and in point-cloud data. The unobserved data may lie anywhere in the training domain and could contain important information. In this work, we use a pentagonal region in the wake of the cylinder as an example, and test the proposed model against the CNN-based and DNN-based models with this region missing. The results are shown in Fig. 7, where the FCN completion compares well with the CFD results, while the completions of the other two traditional models agree less well.
Figure 8 shows the training loss and testing loss during the training procedure for the FCN, the CNN-based, and the DNN-based models, respectively. Figure 8(a) shows that the FCN model converges after about 40 epochs of training. The average training loss converges to 0.0806, and the average testing loss converges to 0.1092, which is very close to the average training loss. The small gap between the average testing loss and the average training loss indicates that there is almost no overfitting during the training of the proposed FCN model. Similarly, in Figs. 8(b) and 8(c), the training loss and testing loss converge after about 61 and 67 epochs of training, respectively. The average training loss and testing loss for the CNN-based model converge to 0.1042 and 0.2324, respectively. The average training loss and testing loss for the DNN-based model converge to 0.0639 and 0.1031, respectively. The FCN model converges fastest among all the models tested here.
(a) The CFD result of the non-dimensional vorticity field at a typical instant for the Re = 1000 cylinder flow with an unobserved pentagonal region in the wake. The black pentagonal area is the flow field region that needs to be completed. (b) The comparison of the completed vorticity field in the above-mentioned unobserved pentagonal region between the model predictions and CFD.
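Excluding an unobserved region from scattered training nodes can be sketched as follows. This is an illustrative example only: the pentagon position and size, the node coordinates, and the choice to draw 30% of all nodes from the observable ones are hypothetical, and the even-odd ray-casting test stands in for whatever masking the authors actually used.

```python
import math
import random

def inside_polygon(x, y, vertices):
    """Even-odd ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does a horizontal ray from (x, y) towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical regular pentagon in the wake, centred at (3, 0) with radius 1.
pentagon = [(3 + math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
            for k in range(5)]

# Hypothetical scattered nodes standing in for the unstructured mesh.
rng = random.Random(0)
nodes = [(rng.uniform(-2, 8), rng.uniform(-3, 3)) for _ in range(2000)]

# Nodes inside the pentagon are hidden; training nodes come from the rest.
observable = [p for p in nodes if not inside_polygon(p[0], p[1], pentagon)]
hidden = [p for p in nodes if inside_polygon(p[0], p[1], pentagon)]
n_train = int(0.3 * len(nodes))
train_nodes = rng.sample(observable, n_train)

print(len(train_nodes), "training nodes,", len(hidden), "hidden nodes")
```

The hidden nodes are never seen during training, so the error evaluated on them measures completion quality rather than interpolation between nearby observed samples.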
The training loss and testing loss during the training procedure for (a) the proposed FCN model, (b) the CNN-based model, and (c) the DNN-based model.
Tables II–IV list the metrics of the training dataset and testing dataset for the CNN, DNN, and FCN models trained by 10%, 30%, and 50% of the total dataset at three different Reynolds numbers Re = 1000, Re = 500, and Re = 100, respectively.
The metrics [NMSE(u, v)] of the training dataset and the testing dataset for the CNN, DNN, and FCN models trained by 10%, 30%, and 50% cylinder datasets for the case of Re = 1000.
Partition ratio | CNN (test/train) | DNN (test/train) | FCN (test/train) |
---|---|---|---|
10% | 37.23%/19.15% | 15.63%/8.98% | 5.86%/9.79% |
30% | 33.09%/20.85% | 10.60%/12.18% | 4.72%/6.37% |
50% | 30.05%/20.21% | 15.22%/8.46% | 3.21%/6.86% |
The metrics [NMSE(u, v)] of the training dataset and the testing dataset for the CNN, DNN, and FCN models trained by 10%, 30%, and 50% cylinder datasets for the case of Re = 500.
Partition ratio | CNN (test/train) | DNN (test/train) | FCN (test/train) |
---|---|---|---|
10% | 40.31%/20.14% | 11.37%/10.28% | 3.84%/7.18% |
30% | 31.05%/21.27% | 12.36%/10.71% | 4.15%/6.54% |
50% | 26.88%/18.87% | 11.98%/7.78% | 4.39%/4.68% |
The metrics [NMSE(u, v)] of the training dataset and the testing dataset for the CNN, DNN, and FCN models trained by 10%, 30%, and 50% cylinder datasets for the case of Re = 100.
Partition ratio | CNN (test/train) | DNN (test/train) | FCN (test/train) |
---|---|---|---|
10% | 42.32%/22.07% | 13.51%/10.62% | 4.64%/7.31% |
30% | 29.20%/20.21% | 13.12%/9.75% | 3.59%/7.32% |
50% | 25.45%/19.96% | 11.92%/6.85% | 3.82%/6.38% |
In Table II, we can see that for a partition ratio of 10% and Re = 1000, the testing dataset metric NMSE(u, v) for the CNN-based model is 37.23%, higher than its training dataset metric of 19.15%. Similarly, the testing dataset metric for the DNN-based model (15.63%) is higher than its training dataset metric (8.98%). In contrast, for our model, the testing dataset metric (5.86%) is lower than its training dataset metric (9.79%). A similar pattern is found for the other partition ratios and Reynolds numbers, which indicates that the overfitting of the proposed model is much milder than that of the CNN-based model. Moreover, Tables II–IV show that the metrics for the proposed model are much lower than those for the CNN-based and DNN-based models, which indicates that the proposed model performs better than the other two classical models with a relatively small training dataset. This is because the proposed model can learn the features and the gradient information from neighboring nodes.
By comparing the metrics of the CNN, DNN, and FCN models trained by 10%, 30%, and 50% training datasets, we find that as the training dataset shrinks, the difference between the training dataset Enmse and the testing dataset Enmse of the CNN-based and DNN-based models becomes larger. This means that the performance of the CNN-based and DNN-based models depends on the size of the training dataset, while the performance of the proposed FCN model is less affected by it.
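The test/train comparison above amounts to simple arithmetic on the table entries. The short sketch below computes the gap (test minus train, in percentage points) for the Re = 1000 values of Table II; the column order CNN, DNN, FCN is inferred from the values quoted in the abstract, and the helper name is hypothetical.

```python
# NMSE(u, v) in percent for Re = 1000, copied from Table II as (test, train).
table_ii = {
    "10%": {"CNN": (37.23, 19.15), "DNN": (15.63, 8.98), "FCN": (5.86, 9.79)},
    "30%": {"CNN": (33.09, 20.85), "DNN": (10.60, 12.18), "FCN": (4.72, 6.37)},
    "50%": {"CNN": (30.05, 20.21), "DNN": (15.22, 8.46), "FCN": (3.21, 6.86)},
}

def generalization_gap(test, train):
    """Test minus train error in percentage points; positive hints at overfitting."""
    return round(test - train, 2)

gaps = {ratio: {model: generalization_gap(*vals) for model, vals in row.items()}
        for ratio, row in table_ii.items()}

for ratio, row in gaps.items():
    print(ratio, row)
```

A positive gap means the model does worse on unseen nodes than on the nodes it was trained on; a negative gap means the test error is below the training error.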
Table V shows the uncertainty assessment for the proposed models trained by different datasets. The correlation coefficients, defined in Eq. (8), are all close to 1 in Table V, which indicates that the predicted values correlate strongly with the ground truth (the CFD results here). The NMSE does not exceed 5.86% in any case, and the biases for u and v are within ±0.0338, which again shows the high accuracy of our proposed FCN model. Figure 9 shows that the proposed model predicts the flow field fairly well for the different Reynolds number cases.
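For readers who wish to reproduce this kind of assessment, the correlation coefficient and bias can be sketched as below. The Pearson form of the correlation and the mean signed error form of the bias are assumptions made for illustration (Eq. (8) of the paper is authoritative), and the sample values are synthetic.

```python
import math

def correlation_coefficient(pred, true):
    """Pearson correlation between predictions and reference values."""
    n = len(true)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    return cov / math.sqrt(var_p * var_t)

def bias(pred, true):
    """Mean signed error; positive means over-prediction on average."""
    return sum(p - t for p, t in zip(pred, true)) / len(true)

# Synthetic stand-ins for one velocity component (not data from the paper).
u_cfd = [0.0, 0.5, 1.0, 1.5, 2.0]
u_pred = [0.1, 0.6, 1.1, 1.6, 2.1]  # constant offset of +0.1

cc = correlation_coefficient(u_pred, u_cfd)
b = bias(u_pred, u_cfd)
print(f"CC = {cc:.4f}, bias = {b:+.4f}")
```

The example shows why both measures are reported together: a constant offset leaves the correlation perfect while the bias exposes the systematic error.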
The uncertainty assessment, including the correlation coefficient (CC), the normalized mean square error (NMSE), and the bias, for the proposed models trained by different datasets. Here, the partition ratio (PR) is the proportion of the dataset used for training.
Model (Re) | PR | CC | NMSE | Bias (u) | Bias (v) |
---|---|---|---|---|---|
FCN (Re = 100) | 10% | 0.9678 | 4.64% | +0.0020 | −0.0018 |
FCN (Re = 100) | 30% | 0.9867 | 3.59% | −0.0012 | +0.0002 |
FCN (Re = 100) | 50% | 0.9693 | 3.82% | −0.0034 | +0.0067 |
FCN (Re = 500) | 10% | 0.9576 | 3.84% | −0.0268 | −0.0422 |
FCN (Re = 500) | 30% | 0.9718 | 4.15% | −0.0021 | −0.0104 |
FCN (Re = 500) | 50% | 0.9641 | 4.39% | −0.0059 | +0.0014 |
FCN (Re = 1000) | 10% | 0.9478 | 5.86% | −0.0338 | +0.0038 |
FCN (Re = 1000) | 30% | 0.9597 | 4.72% | +0.0361 | −0.0229 |
FCN (Re = 1000) | 50% | 0.9510 | 3.21% | +0.0272 | −0.0146 |
The comparison between the flow field predicted by the proposed models and that from the CFD results. The first column is for Re = 100, the second column is for Re = 500, and the third column is for Re = 1000.
To demonstrate the generality of the proposed FCN model, three observation points (A, B, and C) are selected from the cylinder flow field at Re = 1000 [see Fig. 10(a)] to check the variation of the predicted velocity against time. The comparison of the velocity variation against time obtained from the FCN model and extracted directly from the CFD is shown in Fig. 10(b). The velocity variation predicted by the proposed FCN model compares well with the CFD data, which indicates strong generalization ability. Moreover, the error of the FCN model does not vary with either the location of the observation point or time.
The flow field completion results for (a) the velocity contours at a typical instant and (b) the time variation of velocity at specific points obtained by the FCN model.
IV. CONCLUSION
In this work, we introduced a novel FCN model based on GraphSAGE for flow field completion using unstructured scattered data. The FCN was designed to contain two GC layers and three SGA layers. The GC layers were introduced to take advantage of the properties of graph convolution neural networks in capturing the internal physical law of the flow field (N–S equations). The SGA layers were introduced to include the spatial gradient information while dealing with unstructured data. Experimental measurements of flow field properties are usually conducted on sparsely scattered points, leading to unstructured data that are difficult to process with traditional machine learning algorithms (e.g., CNN-based models).
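To make the neighbor aggregation idea concrete, the following is a minimal sketch of a generic GraphSAGE-style layer with mean aggregation on a toy graph. It is not the authors' GC or SGA layer: the function name, the identity weight matrices, and the three-node graph are hypothetical, and the sum of a self term and a neighbor term is used as an equivalent of applying one weight matrix to the concatenated features.

```python
def sage_mean_layer(features, neighbors, w_self, w_neigh):
    """One GraphSAGE-style layer with mean aggregation and ReLU.

    features  : list of per-node feature vectors
    neighbors : list of neighbor index lists (graph adjacency)
    w_self, w_neigh : weight matrices as lists of rows (out_dim x in_dim)
    """
    def matvec(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    out = []
    for i, h in enumerate(features):
        nbrs = neighbors[i]
        # Mean of neighbor features; fall back to zeros for isolated nodes.
        if nbrs:
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs)
                    for k in range(len(h))]
        else:
            mean = [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, mean))]
        out.append([max(0.0, v) for v in z])  # ReLU activation
    return out

# Toy graph: three nodes on a line, each carrying a (u, v) feature pair.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1], [0, 2], [1]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights, not trained values

hidden = sage_mean_layer(features, neighbors, identity, identity)
print(hidden)
```

Because the layer only needs an adjacency list, it applies equally to nodes of a structured grid and to scattered measurement points, which is the property the FCN relies on.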
To test the proposed FCN model, CFD simulations of a two-dimensional circular cylinder flow at different Reynolds numbers (Re = 100, Re = 500, and Re = 1000) on an unstructured mesh were conducted to provide the training dataset. The CFD results also served as the “ground truth.” The relative error of the lift and drag coefficients, the NMSE of the two velocity components, and the CC and bias of the velocity components were introduced to evaluate the performance of the proposed FCN model. Uniformly random scattered subsets comprising 10%, 30%, and 50% of the total dataset, with and without unobserved regions, were used as training datasets. The comparison of the results from our proposed model and from two traditional CNN-based and DNN-based models with the CFD ground truth showed the superiority of our FCN model in predicting the flow field features and body force from incomplete flow measurements on unstructured meshes or scattered points. The efficiency and accuracy of the proposed FCN model were less affected by decreasing the training dataset, and even 10% of the whole dataset gave a reasonable prediction with a 5.86% NMSE in the testing dataset for the case of Re = 1000. The NMSE for our proposed FCN model is much lower than those for the traditional CNN-based and DNN-based models. The output and input parameters of the FCN model show strong correlations, and the biases for the predicted flow velocity are minor. In a nutshell, the well-designed network and variable loss functions allowed the model to be trained quickly and robustly.
In summary, a novel neural network FCN has been proposed in this work to infer the fluid dynamics, including the flow field and the force acting on the body, from the incomplete data based on the graph convolution attention network. The FCN was designed to be capable of dealing with both structured data and unstructured data. The experimental results showed that our FCN model effectively utilizes the existing flow field information and the gradient information simultaneously, giving a better prediction of the flow field and body force than the traditional CNN-based and DNN-based models.
ACKNOWLEDGMENTS
This work has received funding from the European Union's Horizon 2020 Research and Innovation Programme under the Marie Sklodowska-Curie Grant via Agreement No. 765579. This work was funded by the Leverhulme Trust, Grant Ref. No. ECF-2018-727. Their support is gratefully acknowledged.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Xiaodong He: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Resources (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review and editing (equal). Yinan Wang: Funding acquisition (equal); Investigation (equal); Writing – review and editing (equal). Juan LI: Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Investigation (equal); Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Visualization (equal); Writing – original draft (equal); Writing – review and editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.