Spatial prediction of the turbulent unsteady von Kármán vortex street using echo state networks 

The spatial prediction of the turbulent flow of the unsteady von Kármán vortex street behind a cylinder at Re = 1000 is studied. For this, an echo state network (ESN) with 6000 neurons was trained on raw, low-spatial-resolution data from particle image velocimetry. During prediction, the ESN is provided with one half of the spatial domain of the fluid flow; the task is to infer the missing other half. Four different decompositions, termed forward, backward, forward-backward, and vertical, were examined to determine whether there exists a favorable region of the flow for which the ESN performs best. It was also checked whether the flow direction has an influence on the network's performance. To measure the quality of the predictions, we chose the vertical velocity's prediction of direction (VVPD). Furthermore, the ESN's two main hyperparameters, leaking rate (LR) and spectral radius (SR), were optimized according to the VVPD values of the corresponding network output, and each hyperparameter combination was run for 24 random reservoir realizations. Our results show that VVPD values are highest for LR ≈ 0.6 and are quite independent of the SR values for all four prediction approaches. Furthermore, maximum VVPD values of ≈ 0.83 were achieved for the backward, forward-backward, and vertical predictions, while for the forward case VVPD_max = 0.74 was achieved. We found that the predicted vertical velocity fields predominantly align with their respective ground truth. The best overall accordance was found for the backward and forward-backward scenarios. In summary, the stable quality of the reconstructed fields over a long period of time, along with the simplicity of the machine learning algorithm (ESN), which relied on coarse experimental data only, demonstrates the viability of spatial prediction as a suitable method for machine learning application in turbulence.


I. INTRODUCTION
Machine learning (ML) application to big data is on the frontiers of many disciplines in science and engineering. The goal is to extract the hidden order beneath chaotic dynamical systems without solving the complex and often unknown underlying equations.2-8 Therefore, any breakthrough in ML application in turbulence will also have immediate advantages for other disciplines. In other words, not only might turbulence research benefit from ML, but the reverse is true as well.
In turbulence, irregularities and chaos are present both in space and time; thus, spatial prediction or modeling is as desirable as its temporal counterpart. The current study therefore focuses on predicting the flow not in time, but in space. Spatial prediction of a turbulent flow has long been pursued in the community in the form of super-resolution. In this way, the limits of numerical and experimental methods in providing fully resolved flow data are expected to be overcome via ML application, in order to fully resolve the small-scale features of the flow.
While image super-resolution was already an established method in other disciplines,9-13 one of the pioneering studies on super-resolution in fluid dynamics was conducted by Fukami et al.,14 who applied two ML algorithms, namely, a convolutional neural network (CNN) and a downsampled skip-connection/multi-scale (DSC/MS) model, to the two-dimensional (2D) wake flow behind a cylinder and successfully reconstructed the flow from low-resolution data. Meanwhile, Deng et al.15 successfully increased the resolution of the flow behind an isolated cylinder and two side-by-side cylinders via a generative adversarial network (GAN).35-39 The most recent effort in this direction was conducted by Luo et al.,40,41 who applied autoencoders to reconstruct the flow from sparse sensor measurements and also to fill in missing flow fields in turbulent flows.
However, ML application in turbulence through the space domain has not been limited to super-resolution.43-45 Li et al.46 and Otmani et al.47 applied ML to detect the turbulent regions within the wake of a cylinder, while Colvert et al.48 classified the flow from its local vorticity values. Yousif et al.49 reconstructed the three-dimensional (3D) turbulent flow behind a cylinder from two-dimensional input using a GAN. Determination of secondary variables of the flow from some initially available input variables is another approach in this regard.50 Raissi et al.51 conducted one of the pioneering investigations by reconstructing the pressure and velocity fields behind a bluff body from available flow visualization data.
In conclusion, while various approaches, mainly super-resolution, have tackled the reconstruction of turbulent flows through space to some extent, the direct prediction of a large section of the turbulent flow has not been addressed yet. Therefore, the aim of the current study is to predict the flow field in one section of the domain from the limited data available in another. It is thus expected that the ML algorithm will be able to recognize the relationship between flow structures in different parts of the spatial domain. An echo state network (ESN) was chosen as the ML algorithm for the current study. This differs from most previous investigations, where different forms of deep learning were applied. The reason for this choice is the simplicity and speed of the ESN compared to the complex structure of deep learning algorithms. Therefore, in the case of success, it will be a much stronger proof of concept for spatial prediction, with room to expand into more complex algorithms in the future.
An ESN is a type of recurrent neural network (RNN)52,53 that is composed of a reservoir with sparsely connected neurons, where only the output layer is trained.54,55 Its internal memory of past inputs makes it a suitable algorithm for learning from periodic turbulent flows. In recent years, studies have implemented ESNs to predict the out-of-plane vortices in 3D Rayleigh-Bénard convection (RBC),56 the velocity field in 2D RBC,57-60 large spatiotemporally chaotic systems,61 sea surface temperature,62 shallow water dynamics,63 transient turbulent trajectories,64 and the wake flow of the von Kármán vortex street (KVS).65 In most studies that use ESNs for prediction, the flow data are first reduced by a data reduction method, such as proper orthogonal decomposition (POD)58,65 or an autoencoder,59 to provide the input in a suitable form for the ESN. However, one can argue that such prior data reduction methods are additional layers in the algorithm, violating the initial claim of the algorithm's simplicity. Therefore, this study uses coarse, low-resolution experimental data directly to demonstrate the effectiveness of the ESN in extracting hidden information from such data without any data reduction methods.
This study aims at the spatial prediction of the unsteady flow of the KVS at Re = 1000. Four different approaches are presented, in which the input and output sections of the flow are varied (see Fig. 5). In forward prediction, the upstream flow is provided as input while the downstream flow is to be reconstructed by the ESN. Backward prediction is the opposite case, where the upstream section of the flow is predicted. Forward-backward prediction is the case where the flow in the middle of the wake is provided, and a prediction of both the upstream and downstream sections is desired. Finally, in vertical prediction, one vertical half of the flow is provided to predict its respective other half. This study is a proof of concept for the applicability of spatial prediction of turbulent flow (of the unsteady KVS via an ESN) and a demonstration of the possibility of using coarse experimental velocity fields without prior data reduction for flow prediction. Furthermore, the four prediction approaches (forward, backward, forward-backward, and vertical) provide the necessary comparison to show whether certain sections contain more useful information about the turbulent flow, such that, if these data are given, the rest of the flow can be predicted much more accurately.
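As a hypothetical illustration of the four decompositions, one can sketch the split of the 24 × 16 PIV grid into a teacher-input half and a prediction half. The index conventions and array orientation below are our own assumptions, not taken from the paper:

```python
import numpy as np

# Illustrative sketch of the four input/output decompositions on the
# 24 x 16 PIV grid (x: 24 streamwise columns, y: 16 vertical rows).
# Exact split indices and array orientation are illustrative assumptions.
NX, NY = 24, 16

def masks(approach):
    """Return boolean (given, target) masks of shape (NY, NX)."""
    given = np.zeros((NY, NX), dtype=bool)
    if approach == "forward":            # upstream half given -> predict downstream
        given[:, : NX // 2] = True
    elif approach == "backward":         # downstream half given -> predict upstream
        given[:, NX // 2 :] = True
    elif approach == "forward-backward": # middle half given -> predict both quarters
        given[:, NX // 4 : 3 * NX // 4] = True
    elif approach == "vertical":         # one horizontal half given -> predict the other
        given[: NY // 2, :] = True
    return given, ~given

# Every decomposition hands exactly half of the 384 grid points to the ESN.
for a in ("forward", "backward", "forward-backward", "vertical"):
    given, target = masks(a)
    assert given.sum() == target.sum() == NX * NY // 2
```

In each case the teacher half (192 grid points) is fed to the network at every time step, and the remaining 192 points form the prediction target.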

II. METHODS

A. Experimental data
Figure 1 shows a schematic sketch of the experimental setup for the particle image velocimetry (PIV) measurements. A cylinder with a diameter of D = 8 mm was placed in a water channel with a 50 × 50 mm² cross section. The flow was seeded with polyamide particles of 20 µm diameter. The vertical mid-plane of the channel was illuminated with a continuous-wave laser (Laserworld Green-200 532). The optical elements formed a light sheet with a thickness of 1 mm. The images of the illuminated particles were taken with a high-speed camera (HS 4M by LaVision GmbH) placed perpendicular to the laser sheet outside the channel. The calibration was performed with respect to the channel walls. The Reynolds number based on the cylinder diameter D was calculated to be Re = V_∞ D/ν ≈ 1000. Here, V_∞ stands for the free stream velocity (133 mm/s) and ν is the kinematic viscosity. For this Reynolds number, the vortex shedding is unsteady. The Strouhal number was St = f_v D/V_∞ = 0.22, with f_v the vortex shedding frequency. The data were recorded for 100 s at a camera frequency of f = 50 Hz. This corresponds to a temporal resolution of approximately 15 time steps (TS) per vortex shedding event. However, these numbers describe the KVS on average; due to the instability of the vortices, their strength and shedding period vary over time.
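As a quick plausibility check of the quoted flow parameters, the definitions above can be evaluated directly. The kinematic viscosity value is our assumption (water at roughly room temperature); it is not stated in the text:

```python
# Back-of-the-envelope check of the flow parameters quoted above.
# The kinematic viscosity is an assumed value for water near 20 C.
D = 8e-3           # cylinder diameter [m]
V_inf = 0.133      # free stream velocity [m/s]
nu = 1.0e-6        # kinematic viscosity of water [m^2/s] (assumed)
St = 0.22          # measured Strouhal number
f_cam = 50.0       # camera frame rate [Hz]

Re = V_inf * D / nu             # ~1.06e3, i.e. Re of order 1000
f_v = St * V_inf / D            # vortex shedding frequency, ~3.7 Hz
ts_per_shedding = f_cam / f_v   # ~14 frames per shedding event,
                                # consistent with the quoted ~15 TS
```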
The PIV analysis was conducted using LaVision GmbH's DaVis software. The data for the current study were made intentionally coarse in order to show the robustness of the spatial prediction approach in dealing with coarse experimental data, and also to keep the number of input and output variables in a manageable range. Nevertheless, in order to represent the quality of the measurements, Fig. 2 shows the results at the full resolution of 0.35 × 0.35 mm² (for further details, see Sharifi Ghazijahani et al.65). On the left, the V_y field at full resolution shows the complexity of the turbulent flow behind the cylinder. In the middle, the average V_y field shows that when the flow is divided at the horizontal centerline (which is the case for the vertical prediction approach), one ends up with positive values at the top and negative values at the bottom. Finally, the root mean square (rms) values of the field are shown on the right. It is evident that the fluctuations expand vertically in the downstream direction, and that the highest fluctuations occur near the horizontal centerline. For the coarse data used in the current study, an advanced cross correlation evaluation was applied for the PIV processing with a rectangular interrogation window of 64 × 64 pixels. This yields a field of 24 × 16 grid points with a spatial resolution of 2.8 × 2.8 mm². For further details, please see Kähler et al.66 Figure 3 shows a sample V_y field of the coarse data used in the current study. The arrows represent the direction of the displacements.

B. Echo state network
This study employs an ESN to predict the velocity field of an unsteady von Kármán vortex street (KVS) behind a cylinder at Re = 1000. An echo state network is an implementation of a recurrent neural network (RNN) in which only the output weights are trained. Since Jaeger54 first proposed ESNs as an alternative to gradient-descent training of RNNs, they have become widespread due to their simplicity and strength in dealing with time series.55 Figure 4 shows a schematic sketch of the ESN. The ESN consists of a reservoir of N neurons (6000 for the current study) with an internal reservoir state (s), which receives N_in input signals (u) and produces N_out output signals (q). First, the input signals are connected to different reservoir neurons via a randomly generated weight matrix W_in ∈ R^(N×N_in). These random connections are represented by the blue arrows in Fig. 4. In addition, the reservoir neurons are connected with each other by a random weight matrix W, as indicated by the green arrows in Fig. 4. The random realization of W_in and W can be fixed by assigning a random seed (RS) to the underlying random number generation process. One of the most important control parameters, also referred to as hyperparameters, of an ESN is the maximum absolute eigenvalue of W, i.e., its spectral radius (SR). Through this parameter, the internal reservoir interactions contribute to the nonlinear reservoir dynamics; high values of SR mean a more chaotic interaction of the neurons with each other. An important characteristic of the ESN is the so-called echo state property (ESP). The ESP reflects the reservoir's fading memory: it ensures that the reservoir becomes independent of its past states and is, therefore, uniquely defined by the last inputs.54,67 In his seminal paper, Jaeger proposed that keeping SR below unity would ensure the ESP. However, it has been shown that this condition is neither necessary nor sufficient for fulfilling the ESP.
68,69 A further hyperparameter is the fraction of neural connections inside the reservoir, called the reservoir density (RD). Here, we fix it to a value of 0.2, as the reservoir's performance shows only a weak dependence on RD.67 The current state of the neurons s(n) is calculated by combining the linear memory of the previous iteration with a nonlinear update term s̃(n),

s(n) = (1 − LR) s(n − 1) + LR s̃(n),   (2)

where s̃(n) collects the nonlinear contributions of the current input and the previous reservoir state. The leaking rate (LR) blends both contributions and can be interpreted as the update speed of the reservoir state. Thus, the optimal LR always depends on the system's dynamics. Finally, the weights in W_out relate the reservoir state to the output signals, see Eq. (3). In the training phase, the output weights are computed by minimizing the penalized mean square error loss in Eq. (4). This well-known optimization problem reduces to a simple ridge regression task for the components of W_out. Here, β > 0 is the ridge regression parameter, which prevents the amplification of small differences in the state dimensions by large rows of W_out. Additionally, it prevents overfitting, in which the algorithm learns the training data by heart and performs poorly on unseen data. The ESN model for the current study is written in Python using the library easyesn.70 The red arrows in Fig. 4 show the connections between the neurons and the output signals (W_out) after the training phase.
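The reservoir construction, the state update of Eq. (2), and the ridge readout of Eq. (4) can be condensed into a short numpy sketch. This is not the easyesn implementation; the sizes are toy values, and the tanh nonlinearity for s̃(n) and the uniform weight distributions are standard assumptions:

```python
import numpy as np

# Minimal numpy sketch of the ESN described above (Eqs. (2)-(4)).
# All names, distributions, and toy sizes are illustrative assumptions.
rng = np.random.default_rng(seed=15)       # the random seed (RS) fixes W_in and W
N, N_in, N_out, T = 300, 8, 8, 400         # toy sizes (the paper uses N = 6000)
LR, SR, RD, beta = 0.6, 0.9, 0.2, 1e-6     # leaking rate, spectral radius,
                                           # reservoir density, ridge parameter

W_in = rng.uniform(-1.0, 1.0, (N, N_in))   # blue arrows in Fig. 4
W = rng.uniform(-1.0, 1.0, (N, N))         # green arrows in Fig. 4
W[rng.random((N, N)) >= RD] = 0.0          # keep only a fraction RD of links
W *= SR / np.max(np.abs(np.linalg.eigvals(W)))   # enforce spectral radius SR

U = rng.standard_normal((T, N_in))         # teacher input series u(n)
Q = rng.standard_normal((T, N_out))        # target output series q(n)

S = np.zeros((T, N))                       # collected reservoir states
s = np.zeros(N)
for n in range(T):
    s_tilde = np.tanh(W_in @ U[n] + W @ s)       # nonlinear term s~(n)
    s = (1.0 - LR) * s + LR * s_tilde            # Eq. (2)
    S[n] = s

# Ridge regression readout: minimize ||S W_out^T - Q||^2 + beta ||W_out||^2.
W_out = np.linalg.solve(S.T @ S + beta * np.eye(N), S.T @ Q).T
prediction = S @ W_out.T                   # q(n) = W_out s(n), Eq. (3)
```

Only W_out is obtained by training; W_in and W stay fixed at their random initialization, which is what makes the training a cheap linear regression.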
Echo state networks are well known in the scientific community for their ability to deal with time series. There is a sequential nature to the current turbulent flow, as vortices are created upstream and are advected with the mean velocity. As a result, ESNs are believed to be suitable for the task at hand. Furthermore, ESNs are reservoirs of neurons with random connections, without the necessity of backpropagation as in other deep learning algorithms (U-Net, CNN). Thus, they adhere to the general claim of machine learning, namely that information can be learned from complex dynamical systems with random connections among neurons. In this sense, their success is a strong endorsement of ML's potential use in fluid dynamics and similar fields in the future. As a final point, their less complex structure translates into much faster operation, which favors their application in comparison to more complex networks.
For the current study, a reservoir with N = 6000 neurons and a reservoir density of RD = 0.2 was employed. The reservoir was trained for T = 700 time steps and subsequently tested on 700 additional time instances [for more information on the choice of the training length (TL), refer to the Appendix]. It should be noted that the data are time-resolved, with approximately 15 time steps per vortex shedding event, as mentioned above. Moreover, the ESN is used for the spatial prediction of the KVS in an open-loop scenario, also referred to as teacher mode.71 This means that the reservoir is continuously provided with the measured data of one half of the V_y field and is tasked with reconstructing the missing other half. To this end, the available teacher input half has been chosen in four different ways, namely, forward, backward, forward-backward, and vertical. Figure 5 shows a schematic sketch of the ESN, the four approaches, and how the flow is divided in each of them. The ESN's two main hyperparameters, LR and SR, are optimized with respect to its prediction performance in all four scenarios; details are provided in Sec. III. For each hyperparameter set, the reservoir was run for 24 different random seeds to allow a conclusive interpretation of the reservoir performance for that hyperparameter set. As previously shown, once the reservoir has been run for 24 different random seeds, its average prediction quality usually becomes independent of its random initialization.65
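The open-loop ("teacher mode") use of a trained reservoir can be sketched as follows. At every time step the measured teacher half u(n) is fed in and the missing half is read out; the prediction is never fed back. The function and variable names are our own, not the easyesn API:

```python
import numpy as np

def predict_open_loop(W_in, W, W_out, LR, U):
    """Open-loop spatial prediction sketch.
    U: (T, N_in) teacher half of the V_y field, one row per time step.
    Returns the (T, N_out) predicted missing half."""
    s = np.zeros(W.shape[0])
    out = np.empty((U.shape[0], W_out.shape[0]))
    for n, u in enumerate(U):
        s = (1.0 - LR) * s + LR * np.tanh(W_in @ u + W @ s)  # Eq. (2)
        out[n] = W_out @ s    # readout only; no feedback into the input
    return out
```

Because fresh measurements enter at every step, the reservoir state is continuously re-anchored to the ground truth. This is the structural reason the method does not accumulate errors, in contrast to closed-loop autoregressive forecasting.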

III. RESULTS

A. Network optimization
For the current ESN, a grid search is performed over 13 × 13 different values of the leaking rate (LR) and spectral radius (SR) in the range [0.01, 0.99]. For each hyperparameter set, the ESN was run for 24 different random seeds (RS). As discussed in Sharifi Ghazijahani et al.,65 optimizing the ESN, i.e., choosing hyperparameters for optimum prediction, is quite a challenging process, as regular measures like the mean square error always favor predictions that are close to the average over those that actually mirror the flow dynamics. For the von Kármán vortex street (KVS), one can assume that the dynamics of the flow are mainly represented in the fluctuations of the vertical velocity component V_y, rather than in the horizontal component V_x or the absolute velocity |V|. The aim is then mainly to predict the direction of the upward and downward flows in the wake of the cylinder, without much insistence on an identical prediction of the magnitudes. Thus, the vertical velocity's prediction of direction (VVPD) is used as the optimization parameter for this study, as in Sharifi Ghazijahani et al.65 As shown in Fig. 6, a threshold of V_y = 17.6 mm/s = 0.13 V_∞ (equal to one pixel displacement in the particle images) is defined in order to divide the flow into three classes: V_y < −0.13 V_∞ in blue, −0.13 V_∞ < V_y < 0.13 V_∞ in white, and 0.13 V_∞ < V_y in red. The VVPD is then the percentage of grid points that are predicted correctly in terms of being part of the upward (red) or downward (blue) wake flow or the free stream (white).
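A minimal sketch of the VVPD metric, assuming the three-class thresholding described above (the function name and exact tie-breaking at the threshold are our assumptions):

```python
import numpy as np

def vvpd(truth, pred, thr=17.6):
    """VVPD sketch: fraction of grid points whose V_y class (downward,
    free stream, upward) agrees with the ground truth. thr is in mm/s
    and corresponds to one pixel displacement, ~0.13 V_inf."""
    # np.digitize maps v < -thr -> 0, -thr <= v < thr -> 1, v >= thr -> 2
    cls = lambda v: np.digitize(np.asarray(v, dtype=float), [-thr, thr])
    return float(np.mean(cls(truth) == cls(pred)))

# Tiny example: 3 of 4 points fall into the correct class.
score = vvpd([-20.0, 0.0, 20.0, 5.0], [-18.0, 30.0, 25.0, -1.0])  # -> 0.75
```

Because only the sign class is scored, a prediction with the right up/down pattern but wrong magnitudes still receives full credit, which is exactly the intended emphasis on flow direction.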

B. Forward prediction
Figure 7 shows the results for the forward prediction approach. As explained in Sec. II A, in this approach, the upstream half of the flow field is given to the network, and the rest of the flow is predicted. Since the structures in the downstream were once in the upstream, one can say that the aim here is to reconstruct the past (in the downstream) from the available data of its relative future (in the upstream). In Fig. 7(a) left, the average VVPD values for each hyperparameter set (LR and SR) are shown. It is evident that the VVPD is only sensitive to the LR; variation of SR does not affect it much. However, for large LR values, the VVPD also changes slightly with SR. This is in line with the fact that for smaller LR values, the contribution of the reservoir neurons to the state of each neuron becomes infinitesimal, and the neuron's state is largely determined by its state at the previous time step. Figure 7(a) (middle) shows the relative standard deviation (STD) of the VVPD values with respect to the 24 random seeds, indicating the degree to which the predictions depend on the random connections in the reservoir. Again, the lower values at smaller LRs are due to the smaller influence of the reservoir neurons on each individual neuron's state. Finally, in the right panel of Fig.
7(a), the temporal average of the VVPD values of each grid point is shown for the best ESN, with VVPD = 0.74 for RS = 15, LR = 0.6, and SR = 0.99. While the free stream at the edges is obviously the most predictable region, the VVPD decreases toward the center, where the unsteady wake flow is most present. Interestingly, unlike the vertical location of the grid points, the horizontal location has no considerable effect on the VVPD values. This shows that once a coherent wake flow exists in the upstream, the ESN predicts the downstream irrespective of the horizontal position of the grid point. A closer look at the ground truth data reveals that while for TS = 1 and 700 there is a very regular, coherent succession of upward and downward flows in the field, for TS = 350 the flow is mixed and no such clear structure is present. This is due to the unsteadiness of the KVS at the current Reynolds number of Re = 1000. Hence, one might intuitively suggest that the quality of the predictions should be influenced not only by the coherence of the wake flow at each time step but also by the number of time steps passed during the prediction phase, since prediction errors can accumulate over time. However, our analysis will show that this is not the case and that the prediction error is relatively stable over time. This is because the other half of the flow data is continuously provided as the teacher signal; if LR = x, then only a fraction x of each neuron's update depends on the reservoir and, therefore, on its memory. For TS = 1, the predicted field is quite close to the ground truth in terms of both the direction of V_y and its magnitudes. For TS = 700, a similar conclusion can be drawn, with a slightly larger deviation. However, for TS = 350, the prediction is not satisfactory due to the non-coherent structure of the wake flow at this particular TS. One might nonetheless suggest that a deviating prediction for an irregular ground truth at this TS is preferable to a wrong prediction of large V_y magnitudes instead.

C. Backward prediction
The results of the backward predictions are shown in Fig. 8. In this case, the approach is the opposite of forward prediction: while the wake flow in the downstream is available to the ESN, the upstream has to be predicted. Due to the cone-shaped wake with its vertical expansion in the downstream direction, it can be argued that the information, i.e., the sets of positive and negative V_y values, is more compressed and entangled in the upstream region. On the other hand, a greater percentage of the prediction field now belongs to the free stream. Therefore, notable differences between the forward and backward predictions are to be expected.
The average VVPD values in Fig. 8(a) left show the same independence of the ESN performance from the SR values. Again, moderate LR values show higher VVPD in comparison with lower or higher LRs. Moreover, the VVPD values are in general higher than for the forward predictions. Similar to the forward approach, see the middle panel of Fig. 8(a), higher LR values have a higher relative standard deviation of the VVPD, indicating a stronger dependence on the random seed. The best prediction for this approach belongs to the ESN with RS = 4, LR = 0.5, and SR = 0.9, with VVPD = 0.83. Figure 8(a) (right) shows the temporally averaged VVPD values of this prediction.
Clearly, the most challenging grid points to predict are those near the center, where vortices emerge and begin their downstream journey. Finally, Fig. 8(b) shows the ground truth and predicted V_y fields at TS = 1, 350, and 700. For TS = 1, the prediction matches the ground truth well in terms of both directions and magnitudes. For TS = 350, with the so-called irregular flow field discussed previously, the predictions are quite accurate in terms of the directions; however, the magnitudes are not well matched. The same holds for the final time step of the prediction at TS = 700. In general, it seems that when the vortex shedding strength (the vorticity of the vortices) and the corresponding V_y magnitudes change swiftly between consecutive vortices, the network is less capable of accurately predicting the flow fields, because the vortices differ considerably between the upstream and downstream. However, considering the VVPD values and the reconstructed velocity fields, one can argue that the backward predictions are far more successful than the forward ones.

D. Forward-backward prediction
Figure 9 shows the results of the forward-backward prediction. In this approach, the middle half of the V_y field is available as input for the ESN, and the quarter upstream and quarter downstream of the field are inferred. Hence, the available wake flow is used to predict what has happened in its past (in the downstream) and its future (in the upstream). It can therefore be considered a mix of the forward and backward predictions. For this approach, the average VVPD values are quite comparable with the results of the backward prediction, with even slightly lower values. The same SR independence is seen here as well, and again higher LR values depend more on the random weight initialization, as can be seen from their relative standard deviations. The best prediction belongs to RS = 18, LR = 0.5, and SR = 0.7, with VVPD = 0.82. This is very close to the backward (VVPD = 0.83) rather than the forward (VVPD = 0.74) prediction case. This might be somewhat counterintuitive, as one might expect the best prediction to be a blend of the backward and forward predictions. The reason might be that, while the prediction region is still half of the flow field, the most distant grid points are closer to the available mid-half of the field and are, therefore, easier to predict. This is revealed by a careful comparison of the temporally averaged VVPD fields of the best predictions, where every grid point has higher values than in the backward or forward prediction. Finally, the predicted fields in Fig. 9(b) show that at TS = 1 the reconstruction aligns well with the ground truth in terms of both directions and magnitudes. However, for TS = 350 and 700, only the directions are predicted to a reasonable extent. The only exception is the downstream part of the prediction at TS = 350, where even the direction is not captured.

E. Vertical prediction
The final approach for the spatial prediction in this study is the vertical prediction, as shown in Fig. 10. Here, the significant difference from the aforementioned approaches is that the flow is divided vertically; thus, the prediction is relatively independent of time, because the counterpart of each structure at the top is always available as input for the ESN. Therefore, this approach is expected to result in slightly better predictions. One might suggest that, due to the symmetry of the flow, this approach effectively performs some sort of physics-informed modeling.72,73 Although this argument is true to some extent, it should be noted that the vortices in this flow are of two types: those with positive V_y move to the top, and those with negative V_y end up in the lower part of the field. That is why the average V_y field in Fig. 2 (middle) has non-zero values, and, as will be discussed later, the probability density function (PDF) estimates of the ground truth in Fig. 12 for the vertical prediction are asymmetric.
In general, the same trends of independence from SR and stronger dependence on the random seed at large LR values are visible for this approach as well. The best prediction found is for the ESN with RS = 4, LR = 0.6, and SR = 0.3, with VVPD = 0.84, which is the highest VVPD of all predictions in this study. The reconstructed V_y fields in Fig. 10(b) are of acceptable quality at TS = 1 and 700. However, for TS = 350, with its non-coherent flow field, the prediction is more challenging. Overall, although the VVPD values are slightly higher, no significant improvement is seen in the vertical approach compared to the others.

F. Comparison of the four approaches
The temporal variation of the VVPD values is crucial in determining the reliability of the presented spatial prediction method: a stable prediction quality can later be improved with fewer challenges than a case where the VVPD values fluctuate strongly over time. Figure 11 shows the VVPD values of the best predictions of all four approaches over time. The largest range of fluctuations was observed for the forward prediction, where the values oscillate around an average VVPD of 0.74 with a standard deviation (STD) of 0.046; the fluctuations appear to be random. In contrast, the ranges of the fluctuations in the other approaches are much smaller, so their prediction quality is more reliable. For these, the average VVPD values are 0.83, 0.82, and 0.84, with standard deviations of 0.038, 0.036, and 0.036 for the backward, forward-backward, and vertical approaches, respectively. One can argue that the forward approach predicts the future from an unknown starting point, whereas the other approaches have information from the present and predict what happened in the past. Thus, the predictions of the latter are more reliable, with less variation in prediction quality. Moreover, there is no decay of the VVPD values in time and no accumulation of prediction errors. Therefore, even when the prediction is not accurate at a certain time step, the network can correct itself and approach the ground truth again in the following time steps. This indicates that the main advantage of the current method is its reliability in reconstructing the flow over arbitrarily long time spans. In the Appendix, it is shown that the ESN is capable of predicting individual snapshots of the flow; one can completely shuffle the data, feed them into the network, and still obtain comparable predictions. This is the main reason behind the robustness of the spatial prediction against the accumulation of errors in long-term predictions.
Finally, a statistical comparison between the ground truth and the predictions provides complementary insight into the quality of the predictions. Figure 12 shows the probability density function (PDF) estimates of the best predictions of each approach together with their respective ground truth. The forward prediction is well aligned with the ground truth for −0.1 < V_y/V_∞ < 0.1; however, for larger magnitudes, it assumes lower PDF values. The backward prediction scenario is much more in line with its ground truth, even though the ESN assumes slightly higher PDF values. The best match belongs to the forward-backward prediction, where almost the same probabilities are achieved over the entire range. In the case of the vertical prediction, the ground truth PDF is not matched perfectly by the inferred distribution. Specifically, a mismatch between the PDFs for positive V_y values can be observed, where the ground truth exhibits a reduced probability. This is due to the fact that a positive velocity V_y is associated with fluid leaving the lower domain and, hence, with a reduced value in the PDF. This is well visible in Fig. 2 (middle), where the average V_y field has positive values at the top and negative values at the bottom. In other words, unlike in physics-informed machine learning,72,73 there is no statistical symmetry between the available teacher data and the region that has to be predicted. Thus, the ESN is unable to learn this feature and produces an almost symmetric PDF.
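A PDF comparison of this kind can in principle be produced with simple histogram density estimates. The data below are synthetic stand-ins (the measured fields are not part of this text), with a small offset mimicking the asymmetry of the vertical ground truth:

```python
import numpy as np

# Sketch of a histogram-based PDF estimate of V_y / V_inf, as in Fig. 12.
# The normal distributions are synthetic placeholders, not measured data.
rng = np.random.default_rng(1)
truth = rng.normal(0.02, 0.15, 10_000)   # slightly asymmetric "ground truth"
pred = rng.normal(0.0, 0.15, 10_000)     # symmetric stand-in for the ESN output

bins = np.linspace(-0.6, 0.6, 61)
pdf_truth, _ = np.histogram(truth, bins=bins, density=True)
pdf_pred, _ = np.histogram(pred, bins=bins, density=True)
dx = bins[1] - bins[0]   # with density=True, each estimate integrates to one
```

Overlaying `pdf_truth` and `pdf_pred` over the bin centers then reveals exactly the kind of mismatch at positive V_y discussed above.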
Last but not least, one might suggest that the hyperparameters depend on the specific data. However, since the characteristics of the flow field do not change over time and approximately 48 vortex shedding periods were taken into account, we believe that the data are a valid representation of the flow and that the data set is sufficiently large. We also do not see any sensitive dependence of the hyperparameters on the data. Instead, it was shown that the prediction quality is unaffected over a large range of the two hyperparameters (leaking rate and spectral radius). This demonstrates that good predictions do not depend on the specific data, and the ESN predicts the flow for a wide range of hyperparameters. Furthermore, as shown in Fig. 11, the prediction quality is also very stable over time. Thus, if the optimization had been performed for 400 time steps and the remaining 300 time steps of the prediction had been used for validation, the conclusions would still be valid, and the predictions would be of good quality.

IV. CONCLUSIONS
This study introduced a machine learning model for the spatial prediction of the velocity field of a turbulent flow. In this approach, an echo state network (ESN) was trained to predict one half of the domain of the vertical velocity V_y field based on the knowledge of the other half. For this, an unsteady von Kármán vortex street (KVS) at Re = 1000 was used as the training data. The data were collected experimentally using particle image velocimetry (PIV). Moreover, an ESN of 6000 neurons was optimized with respect to its two main hyperparameters, leaking rate (LR) and spectral radius (SR), for 24 random reservoir initializations to generate the best prediction. The vertical velocity prediction of direction (VVPD) was chosen as the optimization parameter. The VVPD is the percentage of ground truth V_y values whose direction is correctly predicted. Four different approaches in terms of the target region for prediction were defined. In forward prediction, the downstream half of the field is inferred; backward prediction targets the upstream half; forward-backward prediction targets the upstream and downstream quarters; and finally, in vertical prediction, the lower half of the flow is predicted. The KVS is a sequence of vortices that are shed upstream, i.e., in the past, and follow the flow downstream, i.e., into the future. Therefore, each of the four approaches of the current study provided a different part of the flow from a temporal point of view. Thus, it was possible to determine how the quality of the predictions and the optimized hyperparameters changed with respect to the propagation direction of the information, i.e., the downstream direction.
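As a minimal sketch, the VVPD metric defined above can be computed as the fraction of grid points whose predicted V_y has the correct sign; how exactly zero velocities are counted is our assumption:

```python
import numpy as np

def vvpd(v_pred, v_true):
    """Vertical velocity prediction of direction (VVPD):
    fraction of points where the sign of the predicted V_y
    matches that of the ground truth.  The handling of exactly
    zero velocities is an assumption of this sketch."""
    return float(np.mean(np.sign(v_pred) == np.sign(v_true)))
```

A value of 1.0 means the flow direction is inferred correctly at every grid point; the best runs reported in this study reach values around 0.83.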
The average VVPD values were insensitive to SR for all four approaches. Yet, their sensitivity increased for LR > 0.8. The best LR values were around 0.5 and 0.6; however, for 0.2 < LR < 0.8, the VVPD values were similar. The random initialization of the reservoir weights played a more important role for higher LR values. It can be concluded that, in general, the highest VVPD values are obtained for LR ≈ 0.6 and SR ≈ 0.9 in spatial prediction for all four approaches. The predicted V_y fields are predominantly well aligned with the reference data in terms of both direction and magnitude. The maximum VVPD was around 0.83 for the backward, forward-backward, and vertical approaches; however, for forward prediction, the value was only 0.74. This might suggest that the prediction of the flow from its unknown past in the forward approach is more challenging. Moreover, both the forward and vertical prediction cases showed less statistical accordance with the ground truth, while matching statistics were achieved for the backward and forward-backward approaches. We conclude that our method provides reliable and stable predictions over long periods. This is contrary to the autoregressive temporal prediction task, where prediction errors accumulate over time.
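The hyperparameter study summarized above can be sketched as a grid search over LR and SR in which the score is averaged over several random reservoir seeds; `train_and_score` is a hypothetical placeholder standing in for training the ESN and returning its VVPD:

```python
import itertools
import numpy as np

def grid_search(train_and_score, lrs, srs, n_seeds=24):
    """Average the score of each (LR, SR) pair over `n_seeds`
    random reservoir initializations and return the best pair.
    `train_and_score(lr, sr, seed)` is a hypothetical callback,
    not part of the study's actual code."""
    scores = {}
    for lr, sr in itertools.product(lrs, srs):
        vals = [train_and_score(lr, sr, seed) for seed in range(n_seeds)]
        scores[(lr, sr)] = float(np.mean(vals))
    best = max(scores, key=scores.get)
    return best, scores
```

Averaging over seeds matters here because, as noted above, the random reservoir initialization influences the result more strongly at high LR.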
Spatial prediction proved to be a reliable approach for machine learning applications in turbulence with many possible outcomes. It was also shown that this method performs well using a simple ESN and coarse experimental data. In this way, no data reduction method or complex hierarchy of deep neural layers was required. Furthermore, it was shown that the method is capable of predicting randomly arranged time steps and that it is quite robust with respect to shorter training lengths. In the future, the spatial prediction method can be integrated with temporal prediction or super-resolution to provide more promising outcomes. In this way, each method can benefit from the strengths of its counterparts to compensate for its weaknesses. Furthermore, it is desirable to extend the application of spatial prediction to more complex turbulent flows, such as the flow behind arrays of multiple cylinders74 or even turbulent Rayleigh-Bénard convection.75 Another possible field of application is to complement the information in the case of a low number of measurement positions, for example, in weather or ocean flow data.
Although ground truth data are necessary to train the ESN networks, it might be possible to extrapolate into a larger spatial domain. Thus, in a real experiment, much less data would have to be recorded. Furthermore, spatial prediction can also provide insights into more fundamental questions: for instance, whether there is a particular region of the flow field that has a higher significance for the respective turbulent flow (and, therefore, can be used to predict the rest of the domain better), or whether there is an order beneath the foreseeable chaos that relates turbulence between different regions of the spatial domain. This might be useful if one has limited experimental data for weather or ocean circulation measurements or needs to reduce the data stream of the experiments. Since the model learns where the important features of a flow are, it can also be used to optimize experiments in the sense that high-resolution measurements are performed in a specific region, and the rest is predicted or reconstructed. The same holds for numerical simulation, where the effort can be minimized. Last, but not least, it is also possible to use such models for subgrid modeling in large-scale simulations such as climate modeling.
In the end, it should be mentioned that machine learning application in turbulence is still a very open field, with many problems to tackle, each with promising outcomes if solved. In the present study, we show a clear difference between spatial prediction and super-resolution in space. (Super-resolution in the time domain is completely different in this respect.) One of the main advantages is the simplicity of the implemented network (ESN) working with complex data (coarse, low-resolution experimental data). However, super-resolution and physics-informed ESNs are indeed reasonable directions in which one can proceed after this study.33

APPENDIX: SUPPLEMENTARY INFORMATION

1. Random time steps
A critical question in the current study is whether the algorithm is learning the time series or whether it has succeeded in learning spatial relations between the different parts of the spatial domain irrespective of their temporal succession and can predict snapshots even if they are randomly arranged. Here, the entire 1400 time steps used for training and prediction are shuffled and rearranged, and the prediction quality is examined for the best prediction case of the forward approach, where RS = 15, LR = 0.60, and SR = 0.99. The VVPD value for the shuffled forward approach is 0.74, which is equal to that of its time-ordered counterpart. Moreover, as shown in Fig. 13, the predicted vertical velocity fields V_y are similar in quality to the non-shuffled prediction. Therefore, it can be concluded that the ESN is able to perform spatial prediction of snapshots, even if the data are not time-resolved or are randomly rearranged.
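The shuffling check described above can be sketched as follows; splitting the 1400 snapshots into 700 for training and 700 for prediction mirrors the text, while the helper name and the random seed are our assumptions:

```python
import numpy as np

def shuffled_split(fields, n_train, seed=0):
    """Randomly permute the time axis of a (time, space) array and
    split it into training and prediction sets, discarding the
    original temporal succession."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(fields))
    shuffled = fields[order]
    return shuffled[:n_train], shuffled[n_train:]
```

Because the permutation destroys temporal ordering, an unchanged VVPD after such a split indicates that the network exploits spatial rather than temporal correlations.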

2. Training length
Figure 14 shows the effect of the training length (TL) on the VVPD values for the forward prediction approach with RS = 15, LR = 0.60, and SR = 0.99, which is the hyperparameter set of the best prediction in this approach. The red dots represent the prediction of data seen during training, whereas the blue dots show the prediction of unseen data. It should be noted that for the seen-data prediction, the VVPD values are calculated over the training interval itself, with the first 50 time steps excluded because the transition time of the network is 50 time steps. For the prediction of unseen data, the VVPD values are calculated over the entire 700 time steps of the prediction. Clearly, the VVPD values peak at TL = 700. However, the predictions are still very good, even for much shorter TLs. The VVPD of the predictions of the seen data decreases steadily as the TL increases. This is expected, as the flow is highly turbulent and snapshots do not repeat themselves. Thus, a larger training sample results in a task of higher complexity but a statistically better representation of the flow. Moreover, since measurements are always prone to random noise, the more samples are added, the harder a deterministic prediction becomes. In general, the prediction of seen data is conceivably less challenging for the network; however, the unseen data are predicted with comparable quality. A training length between 500 and 800 snapshots (which corresponds to 35 to 55 vortex-shedding events using the average Strouhal number) is a good trade-off between a representative but not overly complex system. Therefore, it can be concluded that overfitting does not occur in the current prediction: in the case of overfitting, seen data would be predicted very well while unseen data would be predicted poorly, which is not the case here.
FIG. 13. The vertical velocity V_y fields of the shuffled forward prediction for time step (TS) = 0 (left), 349 (middle), and 700 (right). The region of interest is the same as in Fig. 3. For better readability, the axis labels are left blank.

H. Gao, L. Sun, and J.-X. Wang, "Super-resolution and denoising of fluid flow using physics-informed convolutional neural networks without high-resolution labels," Phys. Fluids 33, 073603 (2021).

FIG. 2. Instantaneous vertical velocity V_y field behind the cylinder for Re = 1000 in full resolution (left), its respective average field (middle), and root mean square (rms) over the entire 1400 time steps used in this study (right).

FIG. 4.A more detailed schematic sketch of an echo state network (ESN) with the conceptual depiction of its connections to the input and output signals.

FIG. 5. A schematic sketch of an echo state network (ESN) along with the four different approaches of spatial prediction.

FIG. 6. Vertical velocity V_y field behind the cylinder for Re = 1000. Up- and downward-directed flows are shown in red and blue, respectively, and white regions mark |V_y|/V_∞ < 0.13 mm/s in the free stream, corresponding to a pixel displacement of less than one in the PIV images.

Figure 7(b) illustrates what exactly is meant by a coherent wake flow. Here, the ground truth (top) and the respective predicted velocity field (bottom) are shown for time step (TS) = 1, 350, and 700. A closer look at the ground truth data reveals that while for TS = 1 and 700 there is a very regular, coherent succession of up- and downward flows in the field, for TS = 350 the flow is mixed and no such clear structure is present. This is due to the unsteadiness of the KVS at the current Reynolds number of Re = 1000. Hence, it is intuitively possible to suggest that the quality of the predictions should be influenced not only by the coherence of the wake flow in each time step but also by the number

FIG. 12. Probability density function of the best prediction vs the respective ground truth for all four approaches.