Removing the performance bottleneck of pressure – temperature flash calculations during both the online and offline stages by using physics-informed neural networks

Pressure – temperature (PT) flash calculations are a performance bottleneck of compositional-flow simulations. With the sparse grid surrogate, the computing burden of PT flash calculations is shifted from the online stage to the offline stage of the compositional-flow simulations, and a great acceleration is achieved. It is known that the data-driven neural network can also be a surrogate of PT flash calculations. However, flash calculations are carried out in the training stage, i


I. INTRODUCTION
Traditionally, flash calculations are important routines in reservoir simulation (Wu, 2015;Wu and Sun, 2016), by which the phase condition of the flow and the compositions of the hydrocarbons in each phase can be learnt.Nowadays, the application of flash calculations is expanded to more fields such as the shale oil and gas exploitation (Liu and Zhang, 2019;Wang et al., 2018;Xue et al., 2023) and carbon sequestration (Li et al., 2020;Sandoval et al., 2019).Different from the conventional reservoir, the porosity and the pore diameter of the unconventional reservoir such as the shale are much smaller.As a result of that, flash calculations in the shale are affected by the capillary pressure greatly, which means that the flash calculation model used in the conventional reservoir cannot be applied in the unconventional reservoir (Zuo et al., 2018).To address this, another flash calculation model named volume-temperature (VT) flash has been applied for the unconventional reservoir (Jindrov a and Mikyška, 2013;Mikyška and Firoozabadi, 2011;Smejkal and Mikyška, 2018, 2021a, 2021b;Smejkal et al., 2021Smejkal et al., , 2022)).Corresponding to it, the flash calculation model for the conventional reservoir is called pressure-temperature (PT) flash calculations (Michelsen, 1999;Michelsen and Mollerup, 2004;Wu et al., 2015;Wu and Chen, 2018;Wu and Ye, 2019).More details of the two flash calculation models can be found in the references.Meanwhile, in the carbon sequestration field, flash calculations also play a significant role.Carbon sequestration is an effective means to achieve the goal of carbon peak and carbon neutrality.Carbon dioxide emitted by industries can be captured and stored in a lot of geologic formations such as basalt, deep saline formations, un-mineable coal seams, oil and gas reservoirs, etc.When carbon dioxide is injected into the oil and gas reservoirs, the phase condition in the reservoir is changed, which influences the storage effect of carbon dioxide.
However, flash calculations are performance bottlenecks for the applications.For example, the time to do PT flash calculations may take up more than 70% of the total time of the compositional-flow simulation (Wu et al., 2015).Thus, how to accelerate flash calculations is on the go nowadays.In true reservoirs, the number of components in flash calculations is large, and the computing cost increases with the increase of the number of components.As a result of that, the issue of curse of dimension appears, if the composition of every component is deemed as a dimension.Considering that the technology of sparse grids is an effective tool to conquer the curse of dimension, Wu et al. (2015) have tried to use sparse grids to solve the problem.They generated a sparse grid surrogate and then used it to approximate the PT flash calculation results.In their numerical experiments, the speed of flash calculations is increased at least 2000 times, while the accuracy is deteriorated only a little.In addition to that, it is known that the deep neural network can also be deemed as a surrogate to conquer the issue of curse of dimension.Thus, Li et al. (2019) and Zhang et al. (2020) developed deep neural networks to approximate PT flash calculations and VT flash calculations, respectively, and their approximation results approach the true values well for most of the experiments.However, the datadriven neural network (DDNN) is applied in their work, which is not a good choice for flash calculation approximation, and further improvement can be achieved.
The DDNN is a well-investigated network, which is developed to process the data-rich issue at the beginning.For example, the picture classification problem is a data-rich issue, where a huge amount of picture samples and their corresponding class tags are used to train the network (Liu et al., 2016(Liu et al., , 2018)).Then, the DDNN is introduced to the field of numerical simulation where the network is used as a surrogate to approximate the numerical results (Wu and Xiu, 2020;Zheng et al., 2020).In this field, the whole numerical model can be deemed as a function, and the input and output of the function are the input and output of the network, respectively.The numerical model is run to generate a large quantity of pairs of the inputs and outputs, i.e., the data samples, which are then used to train the network.This stage is called the offline stage.After training, a surrogate network is prepared, which replaces the numerical model to output the approximation values when the numerical experiment is carried out.This stage is called the online stage.However, the running of the numerical model is required in the offline stage, and the computing burden is just transferred from the online stage to the offline stage.Thus, the performance bottleneck of the numerical model is not fully removed by the application of neural network.The disadvantage stems from that the DDNN is applied to the issue of numerical simulation directly, and the characteristics of the issue different from the data-rich issue are not considered.In fact, the numerical model is defined by a lot of equations, but the data-rich issue lacks of such equations to reveal its inner patterns.Thus, a great quantity of sample data have to be retrieved in the datarich issue to play the same role as the equations in the numerical model.From the point of view, the equations are more direct and more accurate than the sample data to express the physical laws of the investigated issue.Thus, applying the DDNN to numerical simulation may not be the optimal solution.Instead, a new kind of neural network that integrates the equations should be considered.Since the equations are the representation of the physical laws of the numerical simulation, the new neural network is called the physics-informed neural network, or PINN for short.The PINN has already been applied in the numerical simulation of many fields, such as fluid dynamics (Cai et al., 2022), the heat transfer problems (Cai et al., 2021), the weather and climate modeling (Kashinath et al., 2021), etc.
Since the equations give us a chance to see the details of the numerical model, the computing burdens or the performance bottlenecks can be found.In general, solving some of the equations or equation systems may become the performance bottlenecks.In that condition, the neural network can substitute these parts as a surrogate, and the corresponding performance bottlenecks can be removed.The kind of strategy has the following advantages.First, the performance bottlenecks are directly removed from the computing framework of the numerical model in the offline stage.However, the application of the DDNN only "transfers" the bottlenecks from the online stage to the offline stage.Second, since the sample data are generated by the equations, errors may occur from that.Thus, the direct application of the equations is more accurate than the application of the sample data.Third, no sample-data storage is needed in the PINN, while a huge amount of sample data has to be stored in the memory or the disk in the DDNN.Every sample datum is in reality a pair of input and correct output.The input is fed to the DDNN to retrieve an approximation that is compared with the correct output to derive the loss function, based on which the weights and biases of the network are updated.However, in the PINN, the approximation can be integrated into the loss function directly, and not necessary to be compared with the correct output.Therefore, it is no need to store the correct output.Moreover, every input can be generated in real time from the input space when training the network.Here, the input space is the value range of the input, and a series of input values can be gotten by sampling the input space.Thus, neither the correct output nor the input needs to be stored in the PINN.
In this work, the PINN is applied to approximate PT flash calculations, and its application to VT flash calculations is left as future work.It is emphasized that Zhang and Sun (2021) have developed a thermodynamics-informed neural network to approximate VT flash calculations.However, it cannot be considered as a fully physicsinformed neural network, since the physical information other than the thermodynamics should be represented by the data.By investigating the model of PT flash calculations, it is found that the stability analysis and the successive substitution technique (Zhao et al., 2020) are two performance bottlenecks of PT flash calculations, and therefore, the PINN may substitute them to remove the bottlenecks.This work is organized as following.The literature reviews, goals, and strengths of this work have been discussed in Sec.I.In Sec.II, the twophase PT flash calculation is introduced, followed by Sec.III that gives more details of the PINN for PT flash calculations.A series of numerical experiments are carried out in Sec.IV, and then Sec.V concludes this work and summaries the main contributions of this work.

II. PT FLASH CALCULATIONS
PT flash calculations can be described as below.Given a mixture of m components, the composition of component i is x i ; i ¼ 1; …; m.The total mole amount of the m components is F. When pressure p and temperature T are set as constant, the mixture can be divided into different phases.Stability analysis (Wu, 2015) is done first to decide how many phases there are.For example, the two phases of the oil phase and the gas phase are common to be seen in the reservoir condition.In that condition, when equilibrium is achieved, the mole amounts of the oil phase L and the gas phase V can be known.
In addition to that, the compositions of the components in the oil phase x oi and in the gas phase x gi ; i ¼ 1; …; m can also be calculated.
x oi and x gi are called the mole fraction of the ith component.A general description of the two-phase PT flash calculation is shown in Fig. 1.
In the two-phase PT flash calculation, equilibrium hints a state in which the compositions of the components in the two phases and the mole amounts of the two phases will not change any more.In that condition, mass conservation law can be expressed as Moreover, the sum of the compositions in each phase should be one, which is expressed as (2) In addition to that, for component i, its fugacities in the oil and gas phases f oi and f gi should be equal with each other, which is expressed as It is easy to see that there are 2m þ 2 unknowns in Eqs. ( 1)-( 4), which are L, V, x oi , and x gi , respectively.Meanwhile, there are also 2m þ 2 equations.Thus, the problem is well-posed.If there is a single-oil phase, the following equations can be derived directly: (5) Similarly, if there is a single-gas phase, there are equations as follows: To solve for all the unknowns in the two-phase PT flash calculation, the expression of the fugacity should be given.In fact, the fugacities of the oil phase and the gas phase can be expressed as follow: (13) Here, u oi and u gi represent the fugacity coefficients of the oil phase and the gas phase, respectively.If we use the Peng-Robinson equation of state (PR EOS), the fugacity coefficient can be expressed as Since for the oil phase and the gas phase, the expressions of the fugacity coefficient have no difference, the subscripts "o" and "g" for the variables u and x in Eq. ( 15) are ignored.The values of the other variables in Eq. ( 15) can be gotten from the PR EOS, which is derived as below.Suppose x i is the acentric factor of component i, and its range is between 0 and 0.5.A parameter k i that is related to the vapor pressure of hydrocarbons can be calculated from x i , With k i , another parameter a i can be derived as follows: where T ci is the critical temperature of component i.Then, we can calculate the factors a i and b i as where R is the universal gas constant and p ci is the critical pressure of component i.All the aforementioned parameters describe the attribute of every component itself.After mixing all the components, the attribute of the mixture can be described by the following parameters: where K ij is the binary interaction parameter between components i and j.Furthermore, we have another two parameters as It is emphasized that the parameters a, b, A, and B describe the attribute of the mixture in either phase, and therefore, the four parameters have to be calculated for either phase.Now, the only leftover variable not discussed in Eq. ( 15) is Z, which is the compressibility factor.To calculate it, the PR EOS is introduced as follows: where v is the mole volume of the flow.It is part of the expression of the compressibility factor as follows: From Eqs. ( 22) to (25), it is derived Peng-Robinson's cubic equation in Z as follows: By solving it, the variable Z can be calculated.It is noted that the smallest positive root is selected for the oil phase, and the largest positive root is selected for the gas phase.
The derivation of the fugacity assumes that x oi and x gi have been known.Thus, there comes to see how to derive x oi and x gi .Traditionally, a loop is introduced to do that: the successive substitution technique (Wu, 2015) is used to derive x oi , x gi , L, and V.Then, x oi and x gi are sent to Eq. ( 4) to evaluate the difference between f oi and f gi .If the difference is small enough, convergence is achieved, and the loop is finished; otherwise, the difference is used by the successive substitution technique to update x oi and x gi , and the loop is continuous.However, the successive substitution technique includes a timewasting iteration procedure, such as Newton's method and the bisection method, which becomes a performance bottleneck of PT flash calculations.Not only that, the other performance bottleneck is the stability analysis.Since a great quantity of PDEs have to be solved in the stability analysis, its computing time is much larger than that of the successive substitution technique.
Since the successive substitution technique and the stability analysis are two performance bottlenecks of PT flash calculations, removing them and then increasing the speed of PT flash calculations is meaningful in the fields related to PT flash calculations, such as the compositional-flow simulation.It is observed that if x oi , x gi , L, and V are deemed as functions of x i , F, p, and T, these functions are continuous, except the phase-transition zones.In other words, they are Lebesgue integrable functions.In addition to that, according to the universal approximation theorem of artificial neural networks, it is learnt that deep neural networks can approximate Lebesgue integrable functions well.Thus, deep neural networks can be used as surrogates in PT flash calculations in the following two ways: first, since x oi and x gi can be output directly by the deep neural network, the network is a surrogate of the successive substitution technique.Second, after getting x oi and x gi , the number of phases can be decided by simple calculations, which will be stated in Sec.III.In other words, the network is also a surrogate of the stability analysis.Not only that, considering the approximation cost of deep neural networks is much less than the cost of PT flash calculations, the performance bottlenecks can be removed.

III. PHYSICS-INFORMED NEURAL NETWORKS A. To decide the compositions
Mathematically, the deep neural network surrogate for flash calculations is a function as follows: To further simplify the network, F is supposed to be one.In that case, from Eqs. ( 1) to (3), there is Thus, F and V can be removed from the input and output, respectively, and Eq. ( 27) is reduced as follows: The simplest deep neural network is the feedforward neural network.
If it is used to approximate flash calculations, its sketch can be shown in Fig. 2. In the figure, it is seen that there is one input layer, one output layer, and two hidden layers, although more hidden layers can be added into the network.Since it is a feedforward neural network, data can only go from the input layer to the output layer unidirectionally.The input layer represents the input of f , and therefore, there are m þ2 neurons, which represent p, T, and x i ; i ¼ 1; …; m.In general, the number of neurons in the hidden layers should be larger than that in the input layers for better approximation.The output layer stands for the output of f , and it contains 2m þ 1 neurons, which represent x oi and x gi ; i ¼ 1; …; m, and L. After training the network, a weight vector w and a bias vector b can be gotten.As a result of that, f can be written as For better training effects, it is necessary to nondimensionalize the input entries of f .For the pressure p, it is supposed that p falls into the range of ½p min ; p max , and then the nondimensionalized p, which is p, is expressed as follows: In the same way, it is supposed that T is in the range of ½T min ; T max .After nondimensionalizing it, there is By nondimensionalization, all the input entries of f fall in the range of ½0; 1.In the output layer, all the compositions must satisfy Eqs. ( 2) and ( 3), and L must fall into the range of ½0; 1 due to Eq. ( 28).To achieve that, a sigmoid layer and a softmax layer should be added before the output layer.Moreover, it is noted that if x i is zero, x oi and x gi should be zero.Thus, a mask layer has to be applied before the softmax layer.The mask layer guarantees that only the non-zero x i is the input of the softmax function.After such processing, the network in Fig. 2 is upgraded as shown in Fig. 3.It is noted that all the input and output entries fall into the range of ½0; 1 after upgrading.Since the physics-informed neural network is applied in this work, the physical information in the two-phase condition, i.e., Eqs. ( 1)-( 4), should be embedded in the network in Fig. 3.There are many ways to do that, and one of the popular ways is to add the residuals of the equations to the loss function.Fortunately, Eqs. ( 2) and (3) naturally hold by the application of the two softmax functions in Fig. 3. Thus, only the residuals of Eqs. ( 1) and ( 4) should be added to the loss function.To derive the loss function, first, the residuals of the equations are put into an auxiliary vector called the residual vector one by one.If x i > 0, a residual from Eq. ( 1) is defined as However, if x i ¼ 0, there are two residuals from Eq. ( 1), which are defined as follows: ffiffiffiffiffi ffi It is easy to see that when Eq. ( 1) holds, the aforementioned residuals are zero.However, the residual of Eq. ( 4) is a little complicated.If x i ¼ 0, there is no residual from Eq. ( 4).Thus, only the condition of x i > 0 is discussed in the following.If there are two phases, the residual is defined as If there is only a single phase, the residual is defined as ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi for the single-oil phase and the single-gas phase, respectively.Unfortunately, the phase condition cannot be known at this moment, which impedes the selection of the residual.It is observed that L may be a good indicator to tell the phase condition.When L ¼ 0, there is a single-gas phase; when 0 < L < 1, there are two phases; when L ¼ 1, there is a single-oil phase.However, the cases of L ¼ 0 and L ¼ 1 can be output by the neural network hardly, which means L will always indicate the two-phase condition.As a result of that, a relaxation factor e ð0 e 0:2Þ is introduced.By this way, the phase condition is supposed as follows: Phase condition is single À oil phase if L > 1 À e; single À gas phase if L < e; two phases; otherwise: It is noted that the above-mentioned phase condition is supposed, and not necessary to stand for the true phase condition.With the factor e, the two-phase weight w can be defined as follows: When L > 1 À e, it is supposed that a single-oil-phase condition appears.If x gi > 0, the residual is defined as If x gi 0, the residual is defined as In such condition, the larger the output L, the more likely the singleoil phase appears.Thus, the two-phase weight w is smaller, and the single-phase weight 1 À w is larger.When L < e, it is supposed that a single-gas-phase condition appears.If x gi > 0, the residual is defined as If x gi 0, the residual is defined as If L indicates the two-phase condition, the residual is directly defined as ( 36) and (37).
It is observed that when x i > 0, there is one residual from Eq. ( 1) and one residual from Eq. ( 4); when x i ¼ 0, there are two residuals from Eq. ( 1).Thus, for a batch including s inputs, there are totally 2m Ã s residuals in the residual vector, and therefore, the loss function can be defined as the mean squared error between the residual vector and the zero vector.When the PINN is trained to achieve convergence, the compositions can be gotten from the outputs of the PINN.

B. To decide the phase condition
It is emphasized that in the training stage as above, the phase condition is supposed.However, in the real applications such as the compositional-flow simulations (Wu, 2015), the true phase condition should be known.Unfortunately, the true phase condition cannot be told by the trained PINN directly.Meanwhile, in the following experiments, it will be learnt that the accuracies of the compositions x o1 , x o2 , x g1 , and x g2 are much higher than that of L. Thus, a reliable method to tell the true phase condition is derived from x o1 , x o2 , x g1 , and x g2 .Since the method is derived from the outputs of the PINN, it is called the PINN-induced Phase Decision Method (PPDM), which is given as follows.First, with x o1 , x o2 , x g1 , and x g2 and the PR EOS, the mass density of the oil phase q o and the mass density of the gas phase q g are calculated.
The details of such calculations are referenced to Wu (2015).In general, the mass density of the oil phase is much larger than that of the gas phase.Thus, if q o =q g < d q , it is deemed as not reasonable, and the single-phase condition is decided.Here, d q is a threshold.For the methane-propane experiments in Sec.IV, d q can be set as 20.It is noted that how to choose d q depends on the experimental condition.Otherwise, if q o =q g !d q , it does not mean two phases definitely, and a further check has to be done.The outputs x o1 , x o2 , x g1 , and x g2 are used to calculate the saturation S, which is referenced to Wu (2015).If 0 < S < 1, a two-phase condition is decided; otherwise, a single-phase condition is decided.
If a single phase is decided from the above, whether it is a singleoil phase or a single-gas phase that needs further check.It is recalled that in the single-oil phase, Eqs. ( 5)-( 8) hold; in the single-gas phase, Eqs. ( 9)-( 12) hold.Suppose it is a single-oil phase, x o1 and x o2 are updated with Eq. ( 5) and then used to calculate q o .If q o > d o , a single-oil phase is decided; otherwise, a single-gas phase is decided.Here, d o is a threshold.For example, d o can be set as 100 kg/m 3 in the methane-propane experiments in Sec.IV.The flow chart of the PPDM can be seen in Fig. 4. By this way, the true phase condition can be decided by x o1 , x o2 , x g1 , and x g2 .It is noted that L is used to tell the supposed phase condition in the training stage, and not as the true phase indicator in the real applications.In addition to that, the other interesting thing is that in true PT flash calculations, the stability analysis is done to decide the phase condition at first, followed by the successive substitution technique to calculate x o1 , x o2 , x g1 , and x g2 .However, in this work, the order is reverse.The compositions x o1 , x o2 , x g1 , and x g2 are output by the trained PINN, followed by the PPDM to tell the phase condition.
In brief, the two performance bottlenecks of PT flash calculations, i.e., the stability analysis and the successive substitution technique, can be replaced by the neural network surrogate, while the other routines of PT flash calculations such as the fugacity calculation are still needed in the neural network approximation.By this way, it is anticipated that the speed of PT flash calculations can be increased greatly both in online and offline stages.III.In this table, the number of training samples is a vector, and the entries of the vector stand for the uniform sampling number of the pressure, the temperature, and the first-component composition.Thus, the total number of training samples per experiment is the multiplication of all the entries of the vector.The activation function of the hidden layers is ReLU.The samples are shuffled before every training.The initializer "HeNormal" (Datta, 2020) is used to initialize the network parameters.The L2 regularization penalty and the Adam optimizer are applied in the network.
In the test stage, the output of the trained network is compared with the output of flash calculations to calculate the error that is defined as the absolute difference between the true value and the approximated value.The code of flash calculations is written in FORTRAN 90.To show the results easily, the test temperature is fixed at 160, 260, and 480 K for experiment 2-160, 2-260, and 2-480, respectively.In that condition, 100 points are uniformly sampled for the pressure and the first-component composition,, and therefore, there are ten-thousand test sample points for every experiment.If the true phase is the single-oil phase, only the errors of the oil-phase compositions are calculated in this condition, since the gas-phase compositions are not cared.Meanwhile, the error of L is calculated as L À 1 j j.If the true phase is the single-gas phase, only the errors of the gas-phase compositions are calculated in this condition.Meanwhile, the error of L is calculated as L j j.If the true phase is the oil-gas phase, both the errors of the oil-phase and gas-phase compositions are calculated.However, the value of L is not important in this condition, and therefore, its error is not calculated.Table IV summarizes the error calculation for different phase conditions.The experimental results are shown as follows.
From Figs. 5(a), 6(a), and 7(a), it can be seen that all the trainings achieve convergence.Experiment 2-160 gives a single-oil phase, which can be seen in Fig. 5(b).This figure is generated from the true flash calculation.The errors of L are shown in Fig. 5(c).It is learnt that the errors of L have a gradual decreasing trend from the left boundary to the right-up corner.Since it is a single-oil phase, only errors of x o1 and x o2 are generated, which are shown in Figs. 5(d) and 5(e), respectively.In the whole sense, the errors are very small.Interesting, much smaller errors such as below 0.005 can be seen in some strip zones.It is emphasized that the two figures are actually the same, which is a demonstration of the correctness of the code.
In experiment 2-260, there are three phases in Fig. 6(b): the single-oil phase, the single-gas phase, and the oil-gas phase.Moreover, it can be learnt that the phase condition depends more on the firstcomponent composition than the pressure.In other words, when the first-component composition is fixed, the change of pressure has little effect on the phase change.This is the reason that more sample points are applied in the first-component composition than in the pressure.In Fig. 6(c), all the errors of L are below 0.03, which is small.The errors of x o1 and x o2 are shown in Figs. 6(d) and 6(e), respectively.It is observed that most of the errors of x o1 and x o2 are below 0.02.The errors of x g1 and x g2 are shown in Figs.6(f) and 6(g), respectively.Most of the errors in Figs.6(f) and 6(g) are below 0.045.Compared with the errors of x o1 and x o2 , the errors of x g1 and x g2 are a little larger.
In experiment 2-480, a single-gas phase can be seen in Fig. 7(b), since the temperature is as high as 480 K.In such condition, the L errors are generated in the whole sample space, which is shown in Fig. 7(c).From the figure, it can be seen that there is a zone in the center of the figure, where the errors are relatively larger than those of the other zones.The errors decrease gradually from the central zone, and much smaller errors such as below 0.02 can be seen on the left and right boundaries of the figure.Since it is a single-gas phase, there are only errors of x g1 and x g2 , which is shown in Figs. 7(d) and 7(e), respectively.Moreover, it is learnt that except some small zones, the errors in

Error
Single-oil phase

Single-gas phase
Oil-gas phase the other zones are below 0.02.Especially, in some strip zones, the errors below 0.005 can be seen.In summary, by setting the temperature as 160, 260, and 480 K, the single-oil phase, the oil-gas phase, and the single-gas phase can be generated, respectively.For all the experiments, the errors of x o1 , x o2 , x g1 , and x g2 are small, which demonstrates the accuracy and effectiveness of the PINN.However, the error of L is relatively larger, which hints that L is not a reliable factor to foresee the phase condition.

B. Three-component experiments
To demonstrate the generality of the PINN, three-component experiments are carried out in this section.In addition to methane and propane, carbon dioxide is added into the mixture.Carbon dioxide, methane, and propane is called the first, the second, and the third component, respectively.Their physical and chemical properties and the BIPs can be found in Tables I and II, respectively.The pressure range is set from 580 to 620 bar, and the temperature ranges are from 200 to 240 K and from 460 to 500 K.The corresponding experiments are called experiments 3-220 and 3-480, respectively.The network and training parameters of the three experiments can be found in Table V.The vector in Table V has the similar meaning as in Table III, with the last entry representing the second-component composition.However, the total number of samples per experiment is not the multiplication of all the entries of the vector, since the sum of all the compositions should be one.In the test stage, 30 points are uniformly sampled for the pressure, the first-component composition, and the secondcomponent composition.With the limit that the sum of all the compositions should be one, there are 13 771 test sample points totally.
The test temperature is fixed at 220 and 480 K for experiments 3-220 and 3-480, respectively, and therefore, it is not necessary to sample the temperature.The trainings of both experiments achieve convergence, which can be learnt in Fig. 8.The average errors of both experiments are shown in Table VI.
From Table VI, it can be learnt that all the average errors of the compositions are below 0.035, which are very small.However, the errors of L are one-order larger.Meanwhile, compared with the twocomponent experiments, the compositional errors of the threecomponent experiments are in the same order, which demonstrates the good scalability of the PINN.Moreover, it is observed that experiment 3-220 outputs the single-oil phase, the single-gas phase, and the oil-gas phase, which is similar to experiment 2-220.Experiment 3-480 outputs only the single-gas phase, which is also similar to experiment 2-480.In one word, both the twocomponent and the three-component experiments demonstrate the accuracy of the compositional outputs of the PINN, which further demonstrates the reliability of the PPDM.The accuracy of the PPDM can be seen in the compositional-flow experiments in the following.

C. Two-phase compositional-flow experiments
PT flash calculations are an important routine in compositionalflow simulations.If the finite difference method (Wang and Sun, 2016) and the IMplicit Pressure and Explicit Concentration (IMPEC) scheme are applied, the compositional-flow simulation can be deemed as a time-iteration procedure (Wu, 2015).At each time step, PT flash calculations are leveraged to compute the compositions of each grid cell.If the grid size is huge, most of the computing time is spent on PT flash calculations, and therefore, PT flash calculations become a performance bottleneck of compositional-flow simulations.More details of the two-phase compositional model can be seen in Appendix.In this section, there is a compositional-flow experiment as follows, and the code of the compositional-flow simulations is written in FORTRAN 90.A mixture of methane (the first component) and propane (the second component) is full of a reservoir of 4 Â 4 m 2 , and a grid of 40 Â 40 is imposed on the reservoir domain.The reservoir temperature is fixed at 260 K.At the beginning, a constant pressure of 20 bar is imposed in the reservoir, and no gravity effect is considered.The initial average porosity is 0.2, and the initial permeability is 9:87 Â 10 À10 m 2 .Both the initial compositions of methane and propane are 0.5.Then, methane is injected into the reservoir from the left-bottom cell at the speed of 0.02 mol/(m 2 s), and the mixture in the reservoir is pushed out of the right-bottom cell.A constant pressure of 20 bar is imposed on the exit corner.The simulation runs for 8760 time-iterations, and each time step is 360 s.
As stated in Sec.I, the sparse grid technique is used to accelerate PT flash calculations and has achieved excellent results.In this section, a sparse grid surrogate is generated in the offline stage and then used in the online stage of the compositional-flow simulation above.Since it is a two-component flash calculation, the first dimension of the sparse grid surrogate is the pressure, which is from 19 to 21 bar, and the second dimension is the composition of the first component (methane).After nondimensionalizing the first dimension, the sparse grid surrogate is shown in Fig. 9.It is noted that it is a united sparse grid surrogate, which means the approximation of all the compositions is given by the same sparse grid surrogate.In the works (Wu and Chen, 2018;Wu and Ye, 2019), every variable has its own sparse grid surrogate to do approximation.For example, there is a sparse grid surrogate to approximate x o1 , and there is another sparse grid surrogate to approximate x o2 .That way is suitable to be applied in the cases where a larger number of variables need approximating.However, in this work, only the compositions need approximating, and a united sparse grid surrogate is applied.After getting the compositions, the other physical and chemical variables can be derived from the compositions, following the same routines as the true flash calculations.Moreover, it is observed in Fig. 9 that there are more interpolation points in the x 1 direction than in the p direction, which coincides with the conclusion from experiment 2-260: the phase condition depends more on the first-component composition than the pressure.
After getting the compositions from the united sparse grid surrogate, the phase condition can be directly decided by the compositions as following.If the sum of the oil-phase compositions is smaller than a threshold, for example, 10 À7 , it is deemed that the oil phase does not exist; otherwise, there exists the oil phase.The method is also applied in determining the gas phase.You are reminded of that the PPDM holds the similar philosophy to decide the phase condition, but it is much more complex than the method here.This is due to the much higher accuracy of the sparse grid surrogate than the PINN surrogate.For example, the sparse grid surrogate used in this experiment can make sure the absolute difference of the true composition, and the estimated composition is below 10 À5 .As a result of that, if the sum of the oil-phase compositions approaches zero, there is a high probability to tell the oil phase does not exist, which is similar for the gas-phase condition.However, from the former experiments, it is learnt that the PINN surrogate cannot achieve such a high accuracy as the sparse grid surrogate, and therefore, more complex methods such as the PPDM should be used by the PINN surrogate to decide the phase condition.
The PINN trained in experiment 2-260 is also used as a surrogate for the PT flash calculations in this compositional-flow simulation.Thus, the compositional-flow simulation runs for three times: one with the true PT flash calculations, one with the sparse grid surrogate, and the other one with the PINN surrogate.The corresponding simulation results are shown in Figs.10-12, respectively.The Darcy velocity describes the condition of the compositional flows, and the compositions and the phase condition are two kinds of concerns of PT flash calculations.In addition to that, in real petroleum industries, propane has much more economical value than the methane, and therefore, the recovery rate of propane is also a key point of the experiment.All the concerns can be found in the three figures.
From the figures, it is learnt that the streamlines go from the left-down corner to the right-down corner.Furthermore, much higher velocities are seen near the two corners.These findings make scientific and common sense, which is a demonstration of the correctness of the codes.In addition to that, it is observed the results with the PT flash calculations and with the sparse grid surrogate match each other perfectly, which owes to the high accuracy of the sparse grid surrogate.However, the simulation with the PINN surrogate shows the recovery of propane advances faster than the other two simulations.For example, the   0.010 0.018 former gives the recovery rate of 7%, while the latter two give the recovery rate of a little more than 5%.The phenomenon can also be seen in the other figures.For example, in the figures of the mole fraction of propane, the area with lower fraction (the blue area) is larger in Fig. 12(c) than in Fig. 10(c) and Fig. 11(c).The similar phenomenon can also be observed in the saturation figures.The differences show the approximation errors of the PINN surrogate.

D. Single-phase compositional-flow experiments
In this section, carbon dioxide is added to the mixture, and the three-component compositional-flow experiments are carried out.In these experiments, there is a reservoir of 8 Â 4 m 2 , and a grid of 80 Â 40 cells is imposed on the reservoir domain.The reservoir temperature is fixed at 220 K, and the gravity effect in the ydirection is considered.The initial pressure in the reservoir is 60 bar, the initial permeability is 9:87 Â 10 À15 m 2 , and the initial average porosity is 0.2.The initial compositions in the reservoir are 0.1 for carbon dioxide (the first component), 0.2 for methane (the second component), and 0.7 for propane (the third component).Carbon dioxide is injected from the left-down cell at the speed of 2 mol/(m 2 s), and then the mixture is pushed out of the right-up corner.A constant pressure of 60 bar is imposed on the exit corner.The simulation runs for 8760 time-iterations, and each time step is 360 s.The simulation results from the PT flash calculations, the sparse grid surrogate, and the PINN surrogate are shown in Figs.13-15, respectively.The pressure range of the sparse grid surrogate is from 58 to 62 bar.The first, the second, and the third dimension of the sparse grid surrogate are the pressure, the composition of carbon dioxide, and the composition of methane, respectively.The PINN trained in experiment 3-220 is used as the surrogate in these experiments.
From the figures, it is known that all the three simulations output almost the same results, which means the PINN surrogate has nearly the same approximation accuracy as the sparse grid surrogate in the single-phase experiments.It is recalled that the latter has higher accuracy than the former in the two-phase experiments.The streamlines go from the left-down corner to the right-up corner, and larger velocities are seen near the two corners.The gravity effect is observed in the figure of mole fraction of propane: more propane is pushed out in the x-direction than in the y-direction.It is recalled that in the two-component compositional-flow experiments, where there is no gravity effect, propane is pushed out with the similar amounts in the x-direction and the y-direction.These observations verify the correctness of the codes.Furthermore, at the end of the simulations, a single-oil phase appears in the reservoir, which is the reason that the PINN surrogate achieves almost the same accuracy as the sparse grid surrogate in the experiments.By comparing the compositional accuracies between experiment 2-260 and experiment 3-220, it is found that the accuracies are in the same order of 10 À2 .However, in the single-phase cases, the compositions are used by the PPDM to decide the phase condition only, and the compositional accuracy is good enough to foresee the phase condition correctly.While in the two-phase cases, the compositions are also used to compute the other physical and chemical parameters, which brings about larger errors.Thus, the PINN surrogate can be used as the phase-condition judger at least, if it is not good enough to foresee the compositions accurately.

V. CONCLUSIONS
PT flash calculations are a performance bottleneck in many applications such as reservoir simulations, and therefore, the endeavors to speed up PT flash calculations are always on the way.It is known that the sparse grid technique has been successfully applied to fix the issue.In that solution, a great acceleration is achieved, while the accuracy is deteriorated only a little.However, in the offline stage, the PT flash calculations are carried out for many times to generate the sparse grid surrogate, which means the computing burden is shifted from the online stage to the offline stage.Thus, more endeavors have to be done to remove the computing burden in the offline stage, although the burden is alleviated much by that kind of shifting.It is known that the PINN can also be a surrogate of PT flash calculations, as long as the loss functions are designed properly.In that condition, the network is trained in the offline stage to generate the PINN surrogate, without the help of the successive substitution technique and the stability analysis.It is noted that the two routines take up most of the computing burden of PT flash calculations.Compared with that, the burden to train the PINN is much smaller, and therefore, the computing burden in the offline stage is removed.However, by the compositional-flow experiments, it is found that the compositional accuracy of the PINN surrogate is not so good as the sparse grid surrogate.For example, the errors of the former are on the order of 10 À2 , while the latter can keep the order as low as 10 À7 .Although the lower accuracy is not necessary to bring about a wrong phase condition, it will cause larger errors in the computations based on the compositions.For example, in the compositional-flow experiments, the two-phase simulation results deviate from the true results a little.Thus, how to increase the accuracy of the PINN surrogate is the next work.
In this work, the feedforward neural network is applied.The network parameters have been tried with different values, such as the number of layers, the number of neurons of each layer, and the batch size.The best accuracies are on the order of 10 À2 in the experiments.To get better accuracy, the other kinds of networks with different structures should be applied and studied.Moreover, better accuracy can be derived from the proper arrangement of the training samples.In this work, the training samples are retrieved uniformly from the sample space, which does not describe the sample space well.In the phase-transition zones, more sample points are needed to guarantee the accuracy, which has the similar philosophy as the sparse grid technique.Thus, integrating sparse grids into the training stage of the PINN is also an interesting work in the future.Finally, but not least, the PINN should be also applied in the VT flash calculations in the coming days.

FIG. 1 .
FIG. 1.A general description of the two-phase PT flash calculation.
FIG. 2.A sketch of the deep feedforward neural network for flash calculation approximation.
FIG. 3.A sketch of the upgraded deep feedforward neural network for flash calculation approximation.

FIG. 4 .
FIG. 4. The flow chart of the PINN-induced phase decision method.

FIG. 8 .
FIG. 8.The training losses of the three-component experiments.(a) The training loss of experiment 3-220 and (b) the training loss of experiment 3-480.

FIG. 10 .
FIG. 10.The compositional-flow simulation results at the end of the simulation, with PT flash calculations.(a) The modulus of Darcy velocity, (b) the streamlines, (c) the mole fraction of the second component, (d) the saturation, and (e) the history of the issued moles of the second component.

FIG. 11 .
FIG. 11.The compositional-flow simulation results at the end of the simulation, with the sparse grid surrogate.(a) The modulus of Darcy velocity, (b) the streamlines, (c) the mole fraction of the second component, (d) the saturation, and (e) the history of the issued moles of the second component.

FIG. 12 .
FIG. 12.The compositional-flow simulation results at the end of the simulation, with the PINN surrogate.(a) The modulus of Darcy velocity, (b) the streamlines, (c) the mole fraction of the second component, (d) the saturation, and (e) the history of the issued moles of the second component.

FIG. 13 .
FIG. 13.The compositional-flow simulation results at the end of the simulation, with PT flash calculations.(a) The modulus of Darcy velocity, (b) the streamlines, (c) the mole fraction of the third component, (d) the saturation, and (e) the history of the issued moles of the third component.

FIG. 14 .
FIG. 14.The compositional-flow simulation results at the end of the simulation, with the sparse grid surrogate.(a) The modulus of Darcy velocity, (b) the streamlines, (c) the mole fraction of the third component, (d) the saturation, and (e) the history of the issued moles of the third component.

TABLE I .
The physical and chemical properties of the components.In this part, a mixture of methane and propane is considered, and it is stipulated that methane is the first component, and propane is the second component.Their physical and chemical properties are listed in TableI, and the binary interaction parameters (BIPs) are listed in TableII.The pressure range is set from 19 to 21 bar, which is often seen in the shallow reservoirs.The temperature range has three different kinds: from 140 to 180, 259 to 261, and 460 to 500 K. Corresponding to it, three experiments are carried out, and they are called experiment 2-160, 2-260, and 2-480.The neural networks are realized in Keras and TensorFlow 2.8, and the code runs on Mac.The network and training parameters of the three experiments are listed in Table

TABLE II .
The binary interaction parameters (BIPs) of the components.

TABLE III .
The network parameters of the two-component experiments.

TABLE IV .
The error calculation for different phase conditions.

TABLE V .
The network parameters of the three-component experiments.

TABLE VI .
The average errors of the three-component experiments.