Nanofluids have been applied in various fields, such as solar collectors, petroleum engineering, and chemical engineering, due to their superior properties compared to traditional fluids. Among the various thermophysical properties of nanofluids, viscosity plays a critical role in thermal applications involving heat transfer and fluid flow. While several conventional machine learning (ML) techniques have been proposed to predict viscosity, these models require many experimental measurements to be optimized and to make accurate predictions. This study reports a novel ML method using a multi-fidelity neural network (MFNN) to accurately predict the viscosity of nanofluids by incorporating physical laws into the model. The MFNN correlates a low-fidelity dataset derived from the predictions of a theoretical model with a high-fidelity dataset consisting of experimental measurements. It is shown that the MFNN can recover the rheology of nanofluids and outperforms the conventional artificial neural network by incorporating the underlying physics of nanofluids into the model.
I. INTRODUCTION
In recent years, nanofluids have become increasingly popular in various fields, such as petroleum engineering, chemical engineering, and electrical engineering, due to their enhanced thermal properties, such as thermal conductivity.1–3 Tong et al. discovered that nanofluids can improve the thermal performance of solar collectors by 21%.4 In petroleum engineering, using CuO nanoparticles in engine oil resulted in a 0.76% system efficiency improvement.5 Daneshpour and Mehrpooya numerically investigated the use of nanofluids as the circuit flow of a geothermal borehole heat exchanger.6 It was concluded that using nanofluids increases heat extraction compared to conventional fluids. Nanofluids have been widely used in cooling systems to improve performance and manage heat generation compared to traditional fluids.7 Nanofluids are typically prepared by suspending one or more solid particles in the base fluid, resulting in a variation in the thermophysical properties of the fluid. Solid particles typically include Al2O3, SiO2, and CuO mixed with base fluids, such as water, deionized water, and ethylene glycol.8
Viscosity is defined as the resistance of the flow to the applied force.9 Fluid viscosity affects all fluid-dependent applications, including mixing, piping systems, fluid injection, and transport.10 Among the various thermophysical properties of nanofluids, viscosity plays a critical role in applications involving heat transfer and fluid flow due to its impact on the required pumping power, heat-transfer coefficient, and pressure loss.11,12 Thus, accurate viscosity prediction is essential despite the complexity of hydrodynamic and particle–particle interactions in nanofluids.13 The viscosity of nanofluids is influenced by several parameters, including nanoparticle properties such as the volume fraction. Nguyen et al. revealed that the viscosity of CuO-water nanofluids decreases as the temperature rises, and the effect of particle size becomes more pronounced at high particle volume concentrations.14 Kwek et al. found that the relative viscosity of Al2O3-water nanofluid decreased with increasing temperature.15 Although it is possible to accurately measure the nanofluid viscosity with a viscometer, the process is time-consuming, costly, and laborious, particularly in engineering applications where an immediate viscosity value is required.16 Therefore, theoretical models have been developed to estimate the viscosity of nanofluids.
Einstein developed the first model for predicting the viscosity of nanofluids, which can estimate the viscosity at low particle volume concentrations (<2%), whereas the power law method shows promising results for higher volume concentrations.17 Although several predictive models, including but not limited to those by Batchelor, Brinkman, Thomas, and Nguyen, have been developed to capture the rheological behavior of nanofluids under different conditions, they suffer from limitations, such as oversimplified assumptions that result in significant deviations from experimental measurements.17–22 The majority of these models predict viscosity using only the particle volume fraction, whereas experimental measurements indicate that the temperature and particle size can significantly affect viscosity.
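The volume-fraction-only models above have simple closed forms. As a minimal illustration (using the standard published coefficients, not code from this study), they can be sketched as:

```python
# Classical relative-viscosity models that depend only on the particle
# volume fraction phi (dimensionless, e.g., 0.01 for 1 vol. %).

def einstein(phi):
    """Einstein model (dilute limit, phi < ~0.02): mu_r = 1 + 2.5*phi."""
    return 1.0 + 2.5 * phi

def batchelor(phi):
    """Batchelor model: adds a second-order pair-interaction term."""
    return 1.0 + 2.5 * phi + 6.2 * phi ** 2

def brinkman(phi):
    """Brinkman model: mu_r = (1 - phi)**(-2.5)."""
    return (1.0 - phi) ** -2.5

for phi in (0.01, 0.05):
    print(phi, einstein(phi), batchelor(phi), brinkman(phi))
```

All three return the relative viscosity and, as noted above, take no account of temperature or particle size.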
Recently, a novel model was proposed by Masoumi et al., where the viscosity of nanofluids was evaluated by adding the static viscosity and the effective viscosity due to the Brownian motion of particles.23 Udawattha et al. extended the Masoumi model by including the particle volume fraction in the static portion of the nanofluid viscosity.24 These models surpass the simple theoretical models by incorporating critical parameters such as the system temperature and the nanoparticle density, size, and volume fraction. However, theoretical models based on Brownian motion use a number of fitting parameters that are determined by the best fit of the model to experimental data. Hence, they cannot be employed to recover the viscosity of a new experimental measurement (extrapolation). In addition, these complex models exhibit deviations from experimental observations due to simplifying assumptions. Thus, extensive research has been conducted to develop a new technique capable of accurately capturing the rheological behavior of nanofluids for new experimental measurements without any simplifying assumptions.25
In the past decade, computer-aided models such as neural networks (NNs) have been increasingly applied in various fields such as biology, medical research, geology, economics, and engineering.26,27 For instance, Mokhtari et al. developed a multi-layer perceptron neural network for evaluating the performance of integrated wastewater treatment systems.28 These methods are powerful for determining the relationship between nonlinear and complex variables for predicting and controlling unknown systems with high precision.29 Recently, machine learning (ML) techniques such as artificial neural networks (ANNs) and support vector regression (SVR) have been applied to overcome the limitations of theoretical models.11,30 These techniques can recover the rheological behavior of nanofluids with greater precision than conventional models, as they incorporate more critical parameters and exclude theoretical simplifications. Ramezanizadeh et al. provide a comprehensive overview of the application of ML techniques in capturing the viscosity of nanofluids.31
Although several ML techniques have been widely used to predict the nanofluid viscosity, they require many experimental measurements to be optimized and make accurate predictions. Another disadvantage of these conventional techniques is that they are generally useful for a range of training data, implying that their predictions are only accurate for data that falls within the range of the training dataset but not beyond it (extrapolation). In addition, conventional ML techniques rely solely on data correlations and statistics for prediction and ignore the underlying physics of the problem.32 As only limited experimental measurements with a limited range are available for the viscosity of nanofluids, there is a great need to develop a new technique that can reduce the need for large datasets and incorporate underlying physics to improve the extrapolation capability of the model.
To overcome the limitations of traditional ML techniques, the physics-informed neural network (PINN) was introduced.33 PINN methods can surpass traditional ML limitations by incorporating the physical governing laws of a problem into the NN, resulting in a lower number of experimental observations required for reliable predictions. Thus, numerous PINN techniques have been proposed, including, but not limited to, DeepXDE,34 fractional PINN,35 and nonlocal PINN.36 The physical governing law of a problem can be included in an NN either implicitly or explicitly. When the model is in the form of differential equations, explicit inclusion is preferred. However, implicit inclusion can be more beneficial when the models provide inaccurate predictions. For applying PINNs to nanofluids, the implicit inclusion of theoretical models is preferred, as these models show deviations from experimental observations.
This study focused on a meaningful metamodel that captures the rheological behavior of nanofluids by implicitly incorporating theoretical models into the NN, known as a “multi-fidelity neural network (MFNN).” The main concept behind the MFNN is to incorporate the predictions of theoretical models into a neural network (NN) model to achieve a meaningful metamodel. The effect of various theoretical models on the prediction of the MFNN was investigated, and the model best suited to maximizing the performance of the MFNN was selected. Finally, a comparison between the extrapolation capabilities of the MFNN and a conventional ANN was presented to demonstrate the superiority of the MFNN.
II. PROBLEM SETUP
A total of 1425 experimental observations were gathered based on 19 classes of nanofluids for the high-fidelity dataset. This dataset was initially prepared by Heydari et al. and used to train a conventional ANN and SVR.11 A summary of the gathered dataset is presented in the supplementary material. The input parameters of the MFNN include the diameter (d), density (ρ), and volume fraction (ϕ) of the nanoparticles, the temperature (T) of the system, and the output of the model is the relative viscosity (μr) of the nanofluids.
As previously noted, although various simple theoretical models have been proposed to capture the viscosity of nanofluids, none of them provide a reliable prediction of experimental measurements. This study investigates the effect of five well-known theoretical models, namely, Einstein,17 Batchelor,18 Brinkman,19 Masoumi,23 and Udawattha,24 on the performance of the MFNN. Table I summarizes the five models with respect to their critical parameters. The first three models (Einstein, Brinkman, and Batchelor) account for only the volume fraction to predict the viscosity, while the last two models (Masoumi and Udawattha) consider all the critical parameters.
III. METHODOLOGY
The ANN configuration is based on a biological neural pattern and is constructed using several neurons.37 The conventional ANN model, which is typically known as a multilayer perceptron (MLP), consists of three fully interconnected layers: input, hidden, and output layers (Fig. 1). The input layer is the first layer of the ANN, where the system parameter information is gathered and transferred to the following layers. The output layer is formed based on the number of desired system outputs, and a minimum of one hidden layer is located between the input and output layers. Each layer in the ANN is constructed using several neurons. Each neuron takes the output of all the neurons in the previous layer through weighted connections. The activation function is applied to map the summation of the neuron’s input to an output that is transferred to the subsequent neurons. The mathematical relationship for each node and activation function can be found in Ref. 38.
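The forward pass described above (weighted sums passed through an activation function, layer by layer) can be sketched in a few lines of NumPy. This is a toy illustration with arbitrary sizes, not the network used in this study:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a fully connected MLP: each neuron receives the
    weighted sum of all neurons in the previous layer plus a bias; hidden
    layers apply tanh, and the output layer is linear."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ W + b)               # hidden layers
    return a @ weights[-1] + biases[-1]      # linear output layer

# Toy network: 2 inputs -> 3 hidden neurons -> 1 output.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(2, 3)), rng.normal(size=(3, 1))]
biases = [np.zeros(3), np.zeros(1)]
y = mlp_forward(np.array([[0.5, -0.2]]), weights, biases)
print(y.shape)  # (1, 1)
```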
Figure 2 depicts a schematic of the MFNN, which consists of two main NNs: the first is a low-fidelity neural network, NNL, which handles low-fidelity data generated from theoretical models; the second is a high-fidelity neural network, NNH, which handles high-fidelity data acquired from experimental measurements. The NNH consists of a linear part and a nonlinear part. The output of the NNL acts as an additional input feature for the NNH. The main idea behind incorporating low-fidelity data into the MFNN framework is to provide a general trend to the NNH, as only a limited number of high-fidelity data points are available. This is essential for applying ML techniques to physical problems, as conventional NNs such as the MLP can deviate from the ground truth of a physical problem when only limited experimental observations are available. The NNH provides the most comprehensive understanding of the MFNN derived from the physical problem, whereas the NNL prevents the NNH from deviating from the solution.
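The data flow described above can be sketched in NumPy. This is a hypothetical illustration of the architecture only: the stand-in nn_low function and all layer sizes are arbitrary, and in the actual method NNL is itself a trained network fit to theoretical-model data:

```python
import numpy as np

def nn_low(x):
    """Stand-in for the trained low-fidelity network NNL (a fixed toy
    function here; in practice, a small MLP fit to theoretical data)."""
    return np.tanh(x.sum(axis=1, keepdims=True))

def mfnn_forward(x, w_lin, W1, b1, w_out, b_out):
    """High-fidelity stage NNH: the NNL output is appended to the inputs
    as an extra feature, and a linear branch plus a nonlinear (tanh)
    branch are summed to produce the prediction."""
    z = np.concatenate([x, nn_low(x)], axis=1)  # augment with NNL output
    linear = z @ w_lin                          # linear correlation branch
    nonlinear = np.tanh(z @ W1 + b1) @ w_out + b_out
    return linear + nonlinear

rng = np.random.default_rng(1)
x = rng.random((4, 4))                  # 4 samples: d, rho, phi, T (scaled)
w_lin = rng.normal(size=(5, 1))
W1, b1 = rng.normal(size=(5, 8)), np.zeros(8)
w_out, b_out = rng.normal(size=(8, 1)), np.zeros(1)
pred = mfnn_forward(x, w_lin, W1, b1, w_out, b_out)
print(pred.shape)  # (4, 1)
```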
The hyperparameters of MFNN are optimized by minimizing the following function:
L = MSE_L + MSE_H + λ‖ω‖²,  (1)

where MSE_L and MSE_H are the mean squared errors of the low- and high-fidelity datasets, λ is the L2 regularization rate, and ω denotes the interconnected weights of the MFNN. The last term in Eq. (1) is used to prevent the network from overfitting.39
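The composite objective described above can be sketched as follows (a minimal illustration; the regularization rate shown is an arbitrary example value):

```python
import numpy as np

def mfnn_loss(y_lf, yhat_lf, y_hf, yhat_hf, weights, lam=1e-4):
    """Composite MFNN loss: MSE on the low-fidelity set plus MSE on the
    high-fidelity set plus an L2 penalty on all connection weights."""
    mse_l = np.mean((y_lf - yhat_lf) ** 2)
    mse_h = np.mean((y_hf - yhat_hf) ** 2)
    l2 = lam * sum(np.sum(w ** 2) for w in weights)
    return mse_l + mse_h + l2

y = np.array([1.0, 2.0, 3.0])
# Perfect fit on both sets: only the L2 term remains.
print(mfnn_loss(y, y, y, y, [np.ones(2)], lam=0.5))  # 1.0
```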
The performance of the model was evaluated using statistical error analysis on the dataset obtained by MFNN prediction and actual experimental observations. The mean square error (MSE), mean absolute percentage error (MAPE), and R-squared (R²) were employed in the current study and are expressed as

MSE = (1/N) Σᵢ (yᵢ − ŷᵢ)²,  (2)

MAPE = (100/N) Σᵢ |(yᵢ − ŷᵢ)/yᵢ|,  (3)

R² = 1 − Σᵢ (yᵢ − ŷᵢ)² / Σᵢ (yᵢ − ȳ)²,  (4)

where ȳ is the average value of the experimental dataset, y is the experimental viscosity of the nanofluids, ŷ is the prediction of the MFNN, and N is the number of samples.
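These three metrics can be computed directly; a minimal NumPy sketch:

```python
import numpy as np

def mse(y, yhat):
    """Mean square error."""
    return np.mean((y - yhat) ** 2)

def mape(y, yhat):
    """Mean absolute percentage error, in percent."""
    return 100.0 * np.mean(np.abs((y - yhat) / y))

def r_squared(y, yhat):
    """Coefficient of determination relative to the mean of y."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0])
print(mse(y, y), mape(y, y), r_squared(y, y))  # 0.0 0.0 1.0
```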
IV. RESULTS AND DISCUSSION
The primary objective of this research is to develop an efficient ML algorithm as an alternative platform to theoretical models for accurately capturing the rheological behavior of nanofluids. The gathered high-fidelity data (experimental observations) were randomly divided into the training (70%) and testing datasets (30%). The weights and biases of MFNN were adjusted using the training dataset, and the performance of the model was evaluated using the testing dataset. In the case of MFNN, it is critical to provide a logical number of low-fidelity data points to ensure that the model is independent of the amount of low-fidelity data. It was found that when the number of low-fidelity data exceeds six times the number of high-fidelity data, the MFNN model indicates data independence. Thus, the number of low-fidelity data was six times that of the high-fidelity data for each of the following results presented. In order to optimize the performance of the MFNN, the effects of various hyperparameters, such as theoretical models, activation functions, the number of hidden layers and neurons, and training algorithms, were studied. Figure 3 summarizes the essential steps required to develop an optimal MFNN.
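The data partitioning described above can be sketched as follows (the random seed is arbitrary; the 70/30 split and the 6:1 low- to high-fidelity ratio follow the text):

```python
import numpy as np

n = 1425                          # high-fidelity (experimental) samples
rng = np.random.default_rng(0)
idx = rng.permutation(n)          # random shuffle before splitting

n_train = int(0.7 * n)            # 70% training, 30% testing
train_idx, test_idx = idx[:n_train], idx[n_train:]

n_low_fidelity = 6 * n            # low-fidelity set: six times the high-fidelity data
print(len(train_idx), len(test_idx), n_low_fidelity)
```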
The input of NNH consisted of five parameters: the size, density, and volume fraction of nanoparticles, the temperature of the system, and the NNL output. The input of NNL was determined by the theoretical model of choice. The weights and biases of MFNN were assigned randomly using the Xavier initializer.40 The learning rate of MFNN was chosen as 0.001, and the tangent hyperbolic (tanh) activation function was applied for the hidden layers of NNL and NNH. Other activation functions, such as ReLU, ELU, and sigmoid, were also studied; however, the tanh activation function demonstrated the best performance in both NNL and NNH. There were no hidden layers in the linear part of NNH, as a linear correlation is assumed between inputs and outputs. The NNL consisted of two hidden layers with 20 neurons in each. Although the effect of the architecture of NNL on the performance of MFNN was examined, no significant impact was observed, as the overall trend of MFNN is set by NNL, and the actual prediction of MFNN is substantially influenced by NNH. Therefore, it is essential to investigate the effect of the architecture of the nonlinear part of the NNH on the performance of MFNN. In this study, the depth of the nonlinear part of NNH with a range of 1–3 and the width with a range of 5–20 were investigated to prevent overfitting. The MFNN with two hidden layers (depth) and 13 and 18 neurons in each layer (width) performed best in predicting the viscosity of the nanofluids.
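The Xavier initializer mentioned above draws each weight matrix from a uniform range determined by the layer's fan-in and fan-out, which helps keep activation variance stable across tanh layers; a minimal sketch:

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=None):
    """Glorot/Xavier uniform initializer:
    U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Example: a 5-input hidden layer with 13 neurons.
W = xavier_uniform(5, 13, np.random.default_rng(0))
print(W.shape)  # (5, 13)
```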
As previously noted, various theoretical models can represent the underlying physical laws of nanofluids. Clearly, these models cannot accurately predict the viscosity of nanofluids due to using a limited number of parameters that affect the viscosity and their simplifying assumptions. Given that the performance of an MFNN depends on the theoretical model of choice, it is essential to investigate the effect of each theoretical model on the prediction of the MFNN. Figure 4 presents the effect of five theoretical models, including Einstein, Brinkman, Batchelor, Masoumi, and Udawattha, on the prediction of the MFNN on the test dataset.
A simple theoretical model that only predicts the viscosity based on the volume fraction of nanoparticles is incapable of recovering the viscosity changes. In the MFNN framework, the general trend and shape of the fitting are determined by the low-fidelity data, whereas the ranges and predictions are recertified using high-fidelity data. This was confirmed by using several theoretical models in the MFNN framework with only one input (i.e., volume fraction), yielding comparable MSE errors (>0.01). It was concluded that using a simple theoretical model could not provide the flexibility required to capture the complexities of viscosity behavior. By incorporating more critical parameters such as temperature, particle size, and particle density into the theoretical model of choice (i.e., Masoumi and Udawattha), the MFNN can capture the underlying behavior of the viscosity of the nanofluids accurately. The NNL with low-fidelity data generated by these models can provide more information regarding the trend and flexibility of various parameters in the NNH, resulting in the development of a more accurate MFNN. We also noticed that the MFNN with the Udawattha theoretical model reaches a lower error value than the Masoumi model. It is believed that this is due to the accuracy of the theoretical model of choice, as the output of NNL is used as one of the input features for NNH.
In order to achieve a robust MFNN for predicting the viscosity of nanofluids, several training algorithms for adjusting the weights and biases of the MFNN were investigated. The optimizer names and error values are listed in Table II. Other training methods were also reviewed but were omitted because they resulted in high error values. As can be seen, the Adam41 training algorithm exhibited the lowest error value on the test dataset compared to the other algorithms. Hence, it was used for the regression and performance analyses. We also noticed that only the Adam and RMSProp42 methods can adjust the hyperparameters of both NNL and NNH, whereas the Adadelta43 and Adagrad44 methods only optimize the NNH.
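For reference, a single step of the standard Adam update can be sketched as follows; this is a generic illustration of the optimizer applied to a toy quadratic, not the training code used in this study. The hyperparameter defaults shown are the commonly used ones:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and the squared gradient (v), bias-corrected, scale the step size."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta**2 (gradient 2*theta) starting from theta = 1.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(abs(theta) < 0.1)  # True
```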
Training method | MSE error value
---|---
Adam | 0.0051
Adadelta | 0.0138
Adagrad | 0.0112
RMSProp | 0.0082
The optimal MFNN configuration, as well as the input and output parameters, are summarized in Table III. It should be noted that the optimal MFNN was retrained using a ten-fold cross-validation technique to prevent sampling biases and overfitting of the developed MFNN. The MFNN with the best performance was selected for the remaining results. The training was carried out on a personal computer with no special requirements, and the average runtime for training the MFNN was less than 45 min.
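The ten-fold cross-validation step above can be sketched by partitioning shuffled sample indices into ten folds, each serving once as the validation set (the seed is arbitrary):

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index pairs for k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)      # k nearly equal folds
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

splits = list(kfold_indices(1425, k=10))
print(len(splits))  # 10
```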
Structure | MFNN
---|---
Tool | Python—TensorFlow
Input | ρ, d, ϕ, T
Output | μr
Initializer | Xavier initializer
Learning rate | 0.001
Activation function | Tangent hyperbolic (tanh)
Training algorithm | Adam optimizer
Layers | 2
Neurons | 8/13
The training state of the MFNN can be observed via the performance graph, which depicts the convergence and residual losses of both the NNL and NNH. To achieve maximum performance, it is necessary to minimize the residual losses of both the low- and high-fidelity components of the MFNN. Figure 5 shows the MSE error values for the low-fidelity dataset (LF-train), the training dataset from high-fidelity (HF-train), and the test dataset from high-fidelity (HF-test). During the first training epoch, random weights and biases were assigned to neurons, resulting in a high error value. After a few training epochs, the error values decreased considerably, indicating that the training method (“Adam”) can successfully minimize the residual loss of both the NNL and NNH. It should be noted that the residual loss of the training datasets is less than that of the test dataset, implying that the L2 regularization rate used in the loss function prevents overfitting in the developed algorithm [Eq. (1)].
Figure 6 depicts the regression diagram for the training dataset and the test dataset from high-fidelity data to evaluate the performance of the MFNN. This diagram shows the relationship between the MFNN prediction and the desired viscosity value. The prediction of the ideal model must be similar to the experimental observations, resulting in an R2 = 1. As can be seen, the MFNN can accurately track the viscosity value with minimal deviation from the experimental observations (R2 > 0.997). This demonstrates that the MFNN framework can accurately predict the viscosity behavior of nanofluids by incorporating low-fidelity data derived from the theoretical model of choice (i.e., Udawattha).
A crucial index for analyzing the performance of ML models is an error histogram plot that indicates the frequency of errors at different error margins (Fig. 7). It is seen that the errors are clustered around 0, confirming that the MFNN is well-trained and has an accurate prediction of viscosity.
A comparison between the MAPE error values of the MFNN model and the pure theoretical models (see Table I) is shown in Fig. 8. Evidently, the MFNN model outperforms the pure theoretical models. This observation is significant, as it suggests that the MFNN can effectively capture viscosity behavior, whereas the theoretical models fail to provide accurate predictions due to their simplifying assumptions and limited parameters. In fact, there is no perfect theoretical model for capturing the viscosity of nanofluids. In this situation, the MFNN model provides a significant advance in understanding the rheological behavior of nanofluids by incorporating these theoretical models into the ML technique.
As previously noted, conventional ML techniques, such as ANN and SVR, commonly lack predictive capabilities for new experimental measurements because they are purely data-driven and their predictions are primarily based on data correlations and statistics. Thus, to rigorously evaluate the performance of the MFNN and ensure that the developed model is independent of the experimental observations provided in the training dataset, it is necessary to test the extrapolation capability of the MFNN model. To the best of our knowledge, only Heydari et al. used the extrapolation technique to test their ANN for predicting the viscosity of nanofluids.11 They collected an unseen dataset, referred to as the extra-validation dataset, which consisted of 65 data points from six classes of nanofluids. A summary and statistical description of the extra-validation dataset are presented in the supplementary material.
To ensure that the MFNN captures the ground truth behavior of viscosity, the performance of the MFNN on the same dataset was evaluated and compared with Heydari ANN (Fig. 9). The results demonstrated that the MFNN model has higher performance compared to the Heydari ANN. The MFNN model has an R2 of 0.991, while that of the Heydari ANN model is only 0.977. This observation confirms that the MFNN surpasses the traditional data-driven ML technique (ANN) by incorporating underlying physics into NN.
We have shown that MFNN possesses significant advantages compared with other methods to predict the viscosity of nanofluids. The number of input features is an essential characteristic of the MFNN model. The conventional ANN models typically use base fluid viscosity as an input feature, which is excluded in the MFNN model. The value of base fluid viscosity can limit the application of the model, as it strongly correlates with temperature and is available within a limited temperature range. Thus, the base fluid viscosity can be considered an expensive input. The MFNN model can accurately predict the viscosity without using base fluid viscosity, which expands the applicability of the model. In addition, the number of experimental data points is another major factor in training neural network models to achieve reliable predictions. The conventional neural network models, such as multi-layer perceptron, are purely data-driven, and their predictions are primarily based on data correlations and statistics. However, the MFNN model can achieve a meaningful prediction by implicitly incorporating theoretical models into the training stage of the neural network. The MFNN method provides a possibility for employing neural networks in fields where limited data is available. More importantly, it is shown that the MFNN can provide an accurate prediction of viscosity for entirely new nanofluids, whereas the conventional neural network fails to reflect the real behavior of solutions (see Fig. 9). The developed MFNN enables accurate prediction of viscosity directly from nanoparticle features, which has a significant impact on real-world applications of nanofluids.
V. CONCLUSION
In this research, we developed a novel physics-informed neural network based on a multi-fidelity neural network (MFNN) to recover the rheological behavior of nanofluids and predict the viscosity value. The proposed MFNN used limited experimental measurements as high-fidelity data and an abundance of data generated from a theoretical model of choice as low-fidelity data. The developed MFNN can overcome the drawback of traditional machine learning (ML) models, where only limited experimental measurements are available, and produce a meaningful metamodel that accurately reflects the rheological behavior of nanofluids by incorporating the physical law of the problem into the model. The low-fidelity data are applied to determine the overall trend and shape of the fitting, while the high-fidelity data are used to revalidate the ranges and predictions. Four critical parameters, including system temperature, nanoparticle size, nanoparticle density, and nanoparticle volume fraction, were chosen as the input of MFNN, and relative viscosity as the output. The effect of theoretical models on the accuracy of MFNN was investigated by generating different low-fidelity datasets from various theoretical models, including Einstein, Brinkman, Batchelor, Masoumi, and Udawattha. The results demonstrated that the MFNN with a simple theoretical model, such as Einstein, Brinkman, and Batchelor, fails to recover the rheological behavior of nanofluids. However, the MFNN with a more complicated model, such as Masoumi and Udawattha, can successfully capture the complexities of viscosity behavior. More importantly, the superiority of the MFNN model was demonstrated by comparing it to a traditional artificial neural network (ANN) over entirely new experimental measurements. It was found that the MFNN model can provide an accurate prediction of the viscosity and outperforms the conventional ANN model due to incorporating the physical laws into the structure.
SUPPLEMENTARY MATERIAL
See the supplementary material for the gathered dataset for training, testing, and extra-validating the MFNN network.
ACKNOWLEDGMENTS
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Ilia Chiniforooshan Esfahani: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article and its supplementary material.