Inspired by biological swimming and flying with distributed sensing, we propose a data-driven approach for load estimation that relies on complex networks. We exploit sparse, real-time pressure inputs, combined with pre-trained transition networks, to estimate aerodynamic loads in unsteady and highly separated flows. The transition networks contain the aerodynamic states of the system as nodes along with the underlying dynamics as links. A weighted average-based (WAB) strategy is proposed and tested on realistic experimental data on the flow around an accelerating elliptical plate at various angles of attack. Aerodynamic loads are then estimated for angles-of-attack cases not included in the training dataset so as to simulate the estimation process. An optimization process is also included to account for the system's temporal dynamics. Performance and limitations of the WAB approach are discussed, showing that transition networks can represent a versatile and effective data-driven tool for real-time signal estimation using sparse and noisy signals (such as surface pressure) in realistic flows.

## I. INTRODUCTION

The instantaneous loads in biological swimming and flying are highly sensitive to environmental perturbations, such as the wakes of other animals, or gusts in the atmosphere, respectively. Despite challenging boundary conditions, animals control the flow over their propulsors (i.e., flippers or wings) with ease and even utilize unsteady flows to their advantage.^{1,2} Biological sensory systems monitor the flow in real-time by gathering feedback at multiple locations on the propulsors. By combining the sensor input with their experience, animals instantaneously estimate and control their present aerodynamic state (e.g., the aerodynamic loads).^{3,4} These insights have inspired a series of studies adapting the multi-sensor principle for the control of autonomous swimming and flying vehicles.^{5–7} In the absence of simple aerodynamic models for three-dimensional (3D) and highly separated flows, data-driven methods have been proposed that utilize sparse pressure data to characterize the instantaneous aerodynamic state on an arbitrary body (e.g., a wing) under a variety of conditions [Figs. 1(a) and 1(b)]. Examples include attached flow, separated two-dimensional (2D) flows, weakly separated, and highly separated three-dimensional flows.^{6–9}

Data-driven approaches overcome the need to define explicit aerodynamic models, which are usually problem dependent (hence, less generalizable) and rely heavily on knowledge of some of the system's parameters (e.g., the instantaneous angle of attack).^{10–12} Such parameters are typically not available when performing real-time signal reconstruction and control under unsteady conditions. Nevertheless, data-driven methods and explicit (mathematical) models could be combined to advance the methodological capabilities through an integrated framework.^{13}

With the aim to perform (aerodynamic) load estimation, data-driven methods have been shown to be a valid option, although they usually tend to perform well only within a limited range of unsteady boundary conditions.^{6–9} The need for large training datasets, as well as the availability of very sparse pressure distributions, represent a challenge for data-driven methods attempting to characterize and predict aerodynamic loads.^{7} Moreover, strong nonlinear effects emerge under realistic conditions (i.e., for highly separated, unsteady flows at high Reynolds numbers), contributing to the challenge in the load-estimation process.

In order to fully account for the effects deriving from such realistic conditions, and to exploit only a sparse set of sensors (hence, without the knowledge of additional parameters such as the instantaneous angle of attack), existing data-driven methods are continuously being improved and novel approaches are being proposed. Among other techniques, complex networks represent a powerful and versatile tool for time-series analysis^{14} and have recently been employed to study fluid flows,^{15} including vortical flows,^{16,17} turbulent-combustor dynamics,^{18–20} as well as mixing in wall-bounded turbulence.^{21,22} In this context, transition networks, thanks to their connection with Markov models,^{14} have been successfully employed for time series reconstruction,^{23–26} as well as for reduced-order modeling^{27–29} and control.^{30} Specifically, Fernex *et al.*^{26} have recently shown that cluster-based transition networks can be used as an effective data-driven tool to model complex nonlinear dynamical systems (including turbulence) without any prior knowledge.

Despite the recent progress, cluster-based transition networks are still predominantly used to *reconstruct* accurate, numerically obtained data. In the reconstruction process, the newly generated time series are only expected to be globally similar (i.e., sharing similar statistical features) to the reference time series, which is explicitly included in the training dataset.^{26,28} In the present study, instead, we apply transition networks to perform signal *estimation* in real-time and with experimentally obtained data. Here, we generate new signals that, based on sparse (sensor) input, estimate the instantaneous aerodynamic state of an aerodynamic body with good local accuracy.

Hereby, three main challenges arise from real-world (experimental) implementations: (i) limited amount of training data (e.g., range of boundary conditions); (ii) sparse data (limited amount of sensors); and (iii) realistic (noisy) data. To tackle these issues, we present an algorithm that—relying on cluster-based transition networks—is able to perform signal estimation in highly separated experimental flows where sparse sensors are available (Fig. 1). In particular, the network-based strategy can exploit sparse datasets as well as the system's dynamics in the recent past for signal estimation, thus mitigating the need to collect large datasets typical of data-driven approaches.

The methodological steps required to build transition networks are reported in Sec. II, where the new network-based strategy allowing for signal estimation is described (Sec. II C). Our strategy is tested on a simple yet challenging, experimental test case of an accelerated elliptical plate (Sec. III), captured by only two differential pressure sensors. The results of the load estimates are presented and discussed in Sec. IV. Conclusions and future outlook are eventually drawn in Sec. V.

## II. TRANSITION NETWORKS WITH REAL-TIME INPUT

This section describes the signal estimation process, which exploits the features of a transition network built on an experimental training dataset, and a testing dataset of (input) sparse pressure measurements enforcing a constraint to the estimation process of unknown load signals. As shown in Fig. 1, the overall method is characterized by four main steps: (i) the collection of a training dataset [Fig. 1(b)]; (ii) the definition of a phase space from training data and the construction of transition networks [Fig. 1(c)]; (iii) the measurement of (real-time) input data [Fig. 1(d)]; and (iv) the estimation of the load signal [Fig. 1(e)]. We note that, while the procedural steps leading to the construction of transition networks (see Secs. II A and II B) are mainly based on standard practices in the literature,^{26–28} the novel strategy adopted here to estimate the load signal is provided in Sec. II C.

### A. Training dataset and phase-space clustering

A training dataset consists of *N* synchronized time series from sparse sensors (here pressure probes) and *M* signals corresponding to the variables that have to be estimated (here aerodynamic load time series). In general, the training dataset can comprise multiple collections of synchronized time series [see Fig. 1(b)], where each collection belongs to a different configuration parameter value, $\alpha_i$. The configuration parameter can be, e.g., the Reynolds number, a boundary condition, or a geometrical configuration. In this work, *α* represents different angles of attack (see Sec. III).

In this study, we consider $N=2$ pressure signals and $M=1$ loads (as described in Sec. III), so that a 3D phase space can be obtained. In a phase space, each variable of the training dataset corresponds to a direction, $\eta_n$ or $\xi_m$, with $n=1,\ldots,N$ and $m=1,\ldots,M$. Figure 2(a) shows a 3D phase space with directions $\eta_1$, $\eta_2$, and $\xi$, and an exemplifying trajectory depicted as a blue dotted arrow. The rationale behind the phase-space construction is to provide a geometrical representation of a multivariate time series, where each set of values $\{\eta_1(t),\eta_2(t),\xi(t)\}$ at a given time, $t$, indicates a unique dynamical state of the (flow) system through a unique point in the phase space. By mapping signal data at different times into the phase space, an oriented trajectory can then be formed whose direction is in increasing time.

Trajectory points in the phase space are grouped by means of the *k*-means algorithm.^{26,28,31,32} Phase-space clustering is usually performed to gain a simplification of the trajectories in the phase space, thus providing a reduced-order representation of the system.^{26–28} Specifically, the *k*-means algorithm partitions the phase space into Voronoi cells represented by their cell centroids, $S^{\alpha}$, whose entries are the centroid's coordinates in the phase space. The superscript $\bullet^{\alpha}$ here indicates that the clustering is applied to the trajectory corresponding to the configuration *α* of the training dataset. As a result, a cluster-based representation of the training dataset is obtained, where each centroid in the phase space represents a coarse-grained aerodynamic state. Following this idea, in this work each centroid represents a specific set of pressure and load values resulting from a specific (unsteady) condition (which in turn depends on the angle of attack and its time history).

In the example of Fig. 2(a), cluster centroids are depicted as red dots, capturing the essential features of the blue-dotted trajectory. The inset in Fig. 2(b) presents three Voronoi cells associated with three centroids, where red straight lines indicate the cell edges. Note that, although the *k*-means performs an unsupervised clustering, it requires an *a priori* definition of the number of clusters (i.e., centroids), $N_{cl}$, which should be large enough to capture the essential geometrical features of the phase-space trajectories. After $N_{cl}$ is fixed, the *k*-means algorithm is applied to each trajectory corresponding to each configuration *α*.
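The clustering step can be illustrated with a minimal sketch. It is a plain Lloyd-type *k*-means written with the Python standard library, applied to a synthetic three-dimensional trajectory; the function names (`nearest`, `kmeans`), the initialization from randomly sampled trajectory points, and the synthetic data are assumptions made for illustration and are not taken from the study, which does not specify its implementation.

```python
import math
import random


def nearest(point, centroids):
    """Index of the centroid (Voronoi cell) closest to a phase-space point."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means on a list of phase-space points (tuples).

    Returns (centroids, labels); labels[k] is the cell index of points[k].
    """
    rng = random.Random(seed)
    # Assumed initialisation: centroids start at randomly sampled trajectory points.
    centroids = [tuple(p) for p in rng.sample(points, n_clusters)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # an empty cell keeps its centroid
        if updated == centroids:
            break  # converged
        centroids = updated
    # Final assignment so that labels match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Synthetic trajectory in a 3D phase space (eta_1, eta_2, xi); illustrative only.
trajectory = [(math.cos(0.05 * k), math.sin(0.05 * k), 0.01 * k) for k in range(200)]
centroids, labels = kmeans(trajectory, n_clusters=8)
```

In the setting described above, this routine would be run once per configuration *α*, so that each trajectory receives its own set of $N_{cl}$ centroids.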

### B. Building transition networks

Transition networks are constructed from clustered trajectories, where cluster centroids are assigned to network nodes. Accordingly, a univocal correspondence exists between Voronoi cells, their centroids, and network nodes, all indicated through the symbol $S^{\alpha}$. Network links are weighted by the probability of (temporal) transition between two nodes, thus capturing the temporal dynamics of the complex system. In particular, the probability of transition from node $S_i^{\alpha}$ to node $S_j^{\alpha}$ is given by

$$P_{i,j}=\frac{N(S_i^{\alpha},S_j^{\alpha})}{N_{all}(S_i^{\alpha})}, \tag{1}$$

where $N(S_i^{\alpha},S_j^{\alpha})$ is the number of times that trajectories directly transit from node $S_i^{\alpha}$ to node $S_j^{\alpha}$, while $N_{all}(S_i^{\alpha})$ is the total number of times that trajectories exit from node $S_i^{\alpha}$. In general, $P_{i,j}$ is not symmetric, i.e., $P_{i,j}\neq P_{j,i}\;\forall i\neq j$, so that network links can be illustrated by means of arrows indicating the direction of transition.^{33} For example, the blue trajectory in the inset of Fig. 2(b) uniquely transits from $S_1$ to $S_3$, so that $P_{1,3}=1$, as reported in the transition network diagram where links are depicted as green arrows. Moreover, $\sum_{j=1}^{N_{cl}}P_{i,j}=1$ (by definition of probability), and $P_{i,i}=0$ (by definition of direct transition between nodes^{28}) for any $i=1,\ldots,N_{cl}$.

To fully characterize the transition properties of the network, transition times, $T_{i,j}^{\alpha}$, are also defined as the average amount of time needed for the transition from a node $S_i^{\alpha}$ to a node $S_j^{\alpha}$.^{28} Figure 2(c) shows a 3D sketch to illustrate the computation of transition times for a given reference cell identified by node $S_2$, where three trajectories (or three intervals of the same trajectory) are illustrated as colored dotted arrows. For node $S_2$, the transition times are $T_{2,3}=1.5$ and $T_{2,1}=3$ since the trajectories take, on average, 1.5 and 3 time steps to transit from node $S_2$ to $S_3$ and $S_1$, respectively. As per matrix $P$, the transition times matrix, $T$, is also generally asymmetric ($T_{i,j}^{\alpha}\neq T_{j,i}^{\alpha}$).
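As an illustration of how the two matrices could be estimated from a clustered trajectory, the sketch below counts direct transitions in a sequence of cluster labels. It rests on an assumption that should be checked against Ref. 28: the transition time from node *i* to node *j* is taken as the number of consecutive samples spent in cell *i* before the jump to *j*, multiplied by the time step. The function name `transition_matrices` and the sparse dictionary storage are likewise illustrative choices.

```python
from collections import defaultdict


def transition_matrices(labels, dt=1.0):
    """Estimate transition probabilities P and mean transition times T.

    labels: cluster index visited at each time step of one trajectory.
    Assumption: the transition time i -> j is the time spent in cell i
    before the trajectory jumps directly to cell j.
    Both outputs are sparse dictionaries keyed by (i, j); missing keys mean 0.
    """
    counts = defaultdict(int)      # (i, j) -> number of direct transitions
    time_sum = defaultdict(float)  # (i, j) -> accumulated time spent in i
    exits = defaultdict(int)       # i -> total number of exits from i
    steps_in_cell = 1
    for prev, curr in zip(labels, labels[1:]):
        if curr == prev:
            steps_in_cell += 1     # still inside the same Voronoi cell
            continue
        counts[(prev, curr)] += 1
        time_sum[(prev, curr)] += steps_in_cell * dt
        exits[prev] += 1
        steps_in_cell = 1
    P = {ij: n / exits[ij[0]] for ij, n in counts.items()}
    T = {ij: time_sum[ij] / counts[ij] for ij in counts}
    return P, T


# Toy label sequence: node 2 is left twice towards node 3 and once towards node 1.
labels = [2, 3, 2, 2, 3, 2, 2, 2, 1]
P, T = transition_matrices(labels)
```

Because only jumps between different cells are counted, self-transitions never appear as keys, which mirrors the condition $P_{i,i}=0$.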

### C. Weighted-average-based (WAB) transition networks

The features of the transition-probability matrix, $P$, and the transition-time matrix, $T$, can then be used to generate a new set of signals. Owing to the versatility in the transition network construction, in this work we present a strategy to perform signal *estimation* referred to as the weighted-average-based (WAB) transition network. The methodology proposed here performs a weighted average among different states in the phase space to create a trajectory of newly generated nodes, as well as an optimization procedure that minimizes the difference between the estimated and input (measured) pressure. In particular, here we assume that input values from *N* time series are known during the signal generation. For example, input values can originate from a set of *N* sparse pressure sensors collecting data in real-time, thereby supporting the time-series estimation [Fig. 1(d)].

The time series from the testing dataset are hereafter indicated via $\tilde{\bullet}$ notation (i.e., $\{\tilde{\eta}_1,\tilde{\eta}_2,\tilde{\xi}\}$), while the newly generated signals are indicated via $\hat{\bullet}$ notation (i.e., $\{\hat{\eta}_1,\hat{\eta}_2,\hat{\xi}\}$). We also note that an estimated time vector, $\hat{t}$, can be defined since, in general, the entries of $T$ do not exactly correspond to the time step $\Delta t$ (a consequence of the clustering operation).

The WAB methodology is described here by highlighting its key features and then sketched in Fig. 3, while procedural details (due to their elaborated nature) are extensively reported in Appendix A:

- The first step of WAB is its initialization [Fig. 3(a)]. A load estimation is obtained at the first time, $t_0$, by employing a nearest-neighbor approach, because a transition approach requires at least two times. $N_{nn}$ nearest nodes [see cyan filled circles in Fig. 3(a)] are identified for each trajectory (corresponding to each *α*) with respect to the measured input pressure at $t_0$. Hence, the load value at $t_0$ is evaluated as the distance-based weighted average of the load values coming from each of the $N_{nn}$ nodes in each trajectory;
- The transition probabilities of the networks are then used to continue estimating the load signal at a generic $t_h>t_0$ [Fig. 3(b)]. In particular, $N_{nn}$ nearest nodes, with respect to the pressure input and the previously estimated load, $\{\tilde{\eta}_1,\tilde{\eta}_2,\hat{\xi}\}(t_{h-1})$, are first identified for each trajectory. By so doing, a set of weights, $w_{i,dist}^{\alpha}$, can be defined that are inversely proportional to the distance between the $N_{nn}$ closest nodes and $\{\tilde{\eta}_1,\tilde{\eta}_2,\hat{\xi}\}(t_{h-1})$ [see cyan dashed lines in Fig. 3(b)]. The transition matrix is then exploited to identify the transition target nodes of each of the $N_{nn}$ (source) nodes, following the criterion of maximum transition probability [see magenta circles and arrows in Fig. 3(b)];
- To not only account for the present system state, but also for the recent past of the pressure input, we define a second set of *α*-specific weights, $w_{opt}^{\alpha}$ [see Fig. 3(c)]. The weights, $w_{opt}^{\alpha}$, are generated through an optimization process that minimizes the error between the estimated pressure $\hat{\eta}$ and the input (measured) pressure $\tilde{\eta}$ over the recent time period $\Delta t$. As such, the configurations, *α*, with a similar pressure history as the input data, will have a higher impact on the load estimate; and
- The load $\hat{\xi}(t_h)$ is eventually estimated as a weighted average of the loads from the identified target nodes [see red-filled dot in Fig. 3(d)],

$$\hat{\xi}(t_h)=\frac{\sum_{\alpha}w_{opt}^{\alpha}\sum_{i}w_{i,dist}^{\alpha}\,\xi(S_i^{\alpha})}{\sum_{\alpha}w_{opt}^{\alpha}\sum_{i}w_{i,dist}^{\alpha}}. \tag{2}$$

As such, the weighted average accounts for both the phase-space distances (via $w_{i,dist}^{\alpha}$) and the temporal dynamics of each trajectory (via $w_{opt}^{\alpha}$).

To conclude, an estimated time, $\hat{t}_h$, is also computed by using transition times $T$ instead of load values in the weighted average.
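A single update of the kind described in the steps above can be sketched as follows. This is an illustrative reading of Eq. (2), not the authors' implementation (whose details are in Appendix A): the data layout (`networks` as a dictionary per configuration), the small constant `eps` guarding against division by zero, the handling of nodes without outgoing links, and the function name `wab_step` are all assumptions. The weights $w_{opt}^{\alpha}$ are simply passed in, since the optimization that produces them is not reproduced here.

```python
import math


def wab_step(networks, w_opt, query, n_nn=2, eps=1e-12):
    """One weighted-average update in the spirit of Eq. (2).

    networks: {alpha: {"nodes": [(eta1, eta2, xi), ...], "P": {(i, j): prob}}}
    w_opt:    {alpha: weight}, taken here as given (not optimised)
    query:    (eta1_tilde, eta2_tilde, xi_hat_previous)
    """
    num = 0.0
    den = 0.0
    for alpha, net in networks.items():
        nodes, P = net["nodes"], net["P"]
        # Source nodes: the n_nn centroids closest to the query point.
        by_distance = sorted(range(len(nodes)), key=lambda i: math.dist(nodes[i], query))
        for i in by_distance[:n_nn]:
            # Inverse-distance weight of the source node.
            w_dist = 1.0 / (math.dist(nodes[i], query) + eps)
            # Target node: most probable direct transition out of node i.
            out = {j: p for (src, j), p in P.items() if src == i}
            target = max(out, key=out.get) if out else i  # assumed: stay at dead ends
            num += w_opt[alpha] * w_dist * nodes[target][2]
            den += w_opt[alpha] * w_dist
    return num / den


# Two toy configurations with three nodes each; values are illustrative only.
networks = {
    45: {"nodes": [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0)],
         "P": {(0, 1): 1.0, (1, 2): 1.0}},
    60: {"nodes": [(0.0, 1.0, 2.0), (1.0, 1.0, 4.0), (2.0, 1.0, 6.0)],
         "P": {(0, 1): 1.0, (1, 2): 1.0}},
}
w_opt = {45: 0.5, 60: 0.5}
query = (0.0, 0.5, 1.5)
xi_hat = wab_step(networks, w_opt, query)
```

Replacing the target-node load `nodes[target][2]` with the corresponding transition time would give the analogous estimate of $\hat{t}_h$.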

## III. EXPERIMENTAL TEST CASE

As a test case for the transition-network frameworks presented in Sec. II, realistic experimental flow data were captured in a highly separated and unsteady flow at a high Reynolds number. In particular, the flow around an accelerating elliptical plate was characterized via pressure and load measurements, and the same experimental setup was used to obtain both training and testing datasets, as described in Sec. II.

### A. Test facility and experimental setup

The experiments were performed in a fully enclosed, water-filled (viscosity *ν*), 15-m-long towing-tank facility with $1\,\mathrm{m}\times 1\,\mathrm{m}$ cross section [Fig. 4(a)]. The model consisted of an elliptical plate [Fig. 4(a)], with principal axes $b=0.3$ m and $w=0.15$ m and a cross-sectional area $A=\pi bw$. The model was connected to the traverse above the towing tank via a horizontal sting with diameter $0.08b$ and length $2b$, and a vertical symmetric profile of thickness $0.08b$. The plate was towed from rest with the plate velocity $U_p$ accelerating at a rate of $0.4\ \mathrm{m/s^2}$ until hitting its final velocity $U_\infty=1$ m/s, resulting in a terminal Reynolds number of $Re=U_\infty b/\nu=194\,000$. The plate velocity ($U_p=U_\infty$) was then kept constant over a distance of $\sim 40D_h$ before it was decelerated to rest, where $D_h$ is the hydraulic diameter. The same kinematics were tested for the plate mounted at various angles of attack *α* [as defined in Fig. 4(b)], in the range of $45^\circ\leq\alpha\leq130^\circ$.

Two Omega differential pressure transducers captured the instantaneous differential pressure $\Delta p$ between the two sides of the plate. Figure 4(b) shows the positions of the two pressure transducers at $y/b=0.5$ ($\Delta p_1$) and $y/b=0.75$ ($\Delta p_2$), respectively. The pressure sensors measure a range of $\pm 6895$ Pa and have a response time of $10^{-3}$ s, and an accuracy of $\pm 0.25\%$ the full-scale best fit straight line (FS BFSL) with hysteresis and repeatability of 0.2% FS. In order to measure forces and moments on the plate, an ATI Nano 25 six-axis force-torque sensor was mounted between the plate and the horizontal sting. The transducer has a resolution of 0.125 N. Pressures and forces were recorded at a sampling frequency of 1000 Hz.

Figures 4(c) and 4(d) present the temporal evolution of the normalized pressures $C_{p_i}=2\Delta p_i/(\rho U_\infty^2)$, and the plate-normal load $C_N=2F_N/(\rho A U_\infty^2)$, for the six angles of attack $\alpha=\{45^\circ,60^\circ,75^\circ,90^\circ,110^\circ,130^\circ\}$. Here, $i=1,2$ refers to sensor position at $y/b=0.5$ and $y/b=0.75$, respectively. Inertial effects due to the plate's mass are within 1% of the average loads and, therefore, they are not accounted for. The pressures and loads were phase-averaged over 10 runs and temporally filtered with a Savitzky–Golay filter.^{34} Figure 4(e) visualizes the same data of Figs. 4(c) and 4(d) in 3D phase space, whose directions are $\eta_1=C_{p_1}$, $\eta_2=C_{p_2}$, and $\xi=C_N$. Single-run data are also illustrated in Fig. 4(e) as gray trajectories, in addition to the phase-averaged data (colored).
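The two normalizations can be written as short helper functions. This is only a sketch: the water density is an assumed round value that is not stated in the text, the area follows the expression for $A$ given in Sec. III A, and the function and constant names are hypothetical.

```python
import math

RHO = 1000.0             # assumed water density in kg/m^3 (not stated in the text)
U_INF = 1.0              # terminal plate velocity in m/s
B, W = 0.3, 0.15         # principal axes of the plate in m
AREA = math.pi * B * W   # plate area A as written in Sec. III A


def pressure_coefficient(delta_p):
    """C_p = 2 * delta_p / (rho * U_inf^2), with delta_p in Pa."""
    return 2.0 * delta_p / (RHO * U_INF**2)


def normal_load_coefficient(f_n):
    """C_N = 2 * F_N / (rho * A * U_inf^2), with F_N in N."""
    return 2.0 * f_n / (RHO * AREA * U_INF**2)
```

With these assumed values, a differential pressure equal to the dynamic pressure $\rho U_\infty^2/2$ maps to a pressure coefficient of one.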

For smaller *α* values, the spacing between different trajectories is notably visible [e.g., the red and yellow trajectories in Fig. 4(e)], while similar pressures are observed for $\alpha>75^\circ$, making it difficult in this phase space to distinguish between the trajectories of different *α*. As such, the present dataset is particularly challenging with regard to accurate load estimates using transition networks. Specifically, ambiguous states are likely to appear, namely, points in the phase space with similar pressure values but different loads.

### B. Transition-network construction and estimation parameters

The transition networks were built using a training dataset made up of the pressure data ($\eta_1=C_{p_1}$, $\eta_2=C_{p_2}$) and the plate-normal load ($\xi=C_N$), as well as following the description provided in Sec. II. The order of the experimental data was then reduced by clustering the phase-space trajectories for each *α* into $N_{cl}=300$ centroids, $S^{\alpha}$. As a representative example, centroids are shown in Fig. 4(e) for the (phase-averaged) trajectory corresponding to the configuration $\alpha=45^\circ$. In general, small values of $N_{cl}$ serve to reduce the computational effort of the method. However, if $N_{cl}$ is too small, the dynamics of a trajectory in the phase space cannot be properly resolved, thus leading to higher estimation errors. In the present study, $N_{cl}=300$ (with 12 300 time-series instants, i.e., trajectory points) provides a good balance between estimation accuracy of the load $\hat{C}_N$ and computational effort. A parametric analysis on the effects of $N_{cl}$ on the results is provided in Appendix B.

Once the transition networks are established, a real-time estimate of the plate-normal load $\hat{\xi}=\hat{C}_N$ can be obtained by utilizing the pressure sensors' (real-time) input ($\tilde{\eta}_1=\tilde{C}_{p_1}$ and $\tilde{\eta}_2=\tilde{C}_{p_2}$) in combination with the WAB procedure (Sec. II C). Specifically, the number of nearest-neighbors $N_{nn}$ [cyan nodes in Figs. 3(a) and 3(b)] was set equal to 10. Although $N_{nn}$ is usually set to 3 or 4,^{26} $N_{nn}=10$ leads to smoother load estimates without substantial changes in the overall results.

Furthermore, pressure-error minimization was performed over a temporal window $\Delta t^*\leq 6$ (i.e., a traveled distance at most equal to six times the plate hydraulic diameter). The value of $\Delta t^*$ corresponds to half of the acceleration time $t^*\approx 12$ [see Fig. 4(d)] and allows us to sufficiently capture the temporal features of the acceleration and deceleration phases. Larger $\Delta t^*$ values do not lead to substantial changes in the results.

## IV. RESULTS AND DISCUSSION

In this section, we present and discuss the results stemming from the application of the WAB approach when the transition networks are used to estimate loads for omitted flow configurations, i.e., time series corresponding to *α* values that were not available in the training data. To mimic realistic estimation conditions, a randomly selected single run [gray lines in Fig. 4(e)] is used as a testing time series, rather than the phase-averaged signals [colored trajectories in Fig. 4(e)]. In this way, we account for single-run noise, as phase-averaged signals are less noisy.

Figure 5 shows the resulting load estimates $\hat{C}_N$ compared to the measured loads $\tilde{C}_N$, for different omitted *α* values (reported in each panel's title). To highlight the impact of including the system dynamics in the estimation process, we compare the results with and without utilizing the *α*-specific weights $w_{opt}^{\alpha}$ obtained via the pressure optimization strategy presented in Fig. 3(c) (dark blue and black lines, respectively). In general, the WAB approach is able to reproduce well the shape of the measured loads $\tilde{C}_N$.

To quantify the estimation performance in more detail, Fig. 6 presents the normalized error, $E=E_{abs}/\langle\tilde{C}_N\rangle$, of the estimates $\hat{C}_N$ with respect to the test data $\tilde{C}_N$, where $E_{abs}=|\hat{C}_N-\tilde{C}_N|$ is the absolute error while $\langle\bullet\rangle$ indicates the time average. The accuracy of the estimate varies depending on the flow stage [i.e., acceleration, steady-state, deceleration; see Figs. 4(c) and 4(d)], the estimation strategy (with or without pressure optimization), as well as on the omitted *α*. The magnitude of *E* remains relatively small (less than 20%) throughout the whole estimation process, even if optimization is not performed (see also Fig. 7 for a comprehensive assessment).
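The error metric is simple enough to state as code. The following is a minimal sketch with made-up sample values; the function name `normalized_error` is hypothetical, and the time average is taken here as the plain mean of the measured samples.

```python
def normalized_error(c_hat, c_meas):
    """E(t) = |C_hat(t) - C_meas(t)| / <C_meas>, with <.> the time average."""
    mean_meas = sum(c_meas) / len(c_meas)
    return [abs(h - m) / mean_meas for h, m in zip(c_hat, c_meas)]


# Illustrative values only, not measured data.
c_meas = [1.0, 2.0, 3.0]
c_hat = [1.1, 1.8, 3.0]
errors = normalized_error(c_hat, c_meas)
```

Normalizing by the time-averaged load rather than by the instantaneous one keeps the error finite in phases where the load passes close to zero.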

Note that, in contrast to previous studies on signal reconstruction,^{26,28} we use the (normalized) absolute error *E*, instead of statistical quantities such as the autocorrelation, to assess the estimation quality. A discussion on the relevance of statistical similarity in the context of signal estimation is provided in Appendix C.

Estimating the aerodynamic state in real-world applications, such as our accelerated flat plate, imposes several challenges as outlined in Sec. I: (i) a limited amount of training data; (ii) a limited amount of sensors; and (iii) realistic (noisy) data. In the following, we use the present results to discuss how the network-based estimation algorithm presented in Sec. II C tackles those challenges.

### A. Limited amount of training data

Training a data-driven algorithm in an experimental setting comes with significant effort. In fact, experimental campaigns are often time intensive and involve costly facilities. To potentially reduce the required amount of training data needed for accurate load estimates, the WAB approach estimates unknown dynamics by combining information from different configurations, *α*, via a weighted average. A similar approach was proposed by Fernex *et al.,*^{26} who evaluated the weighted average between two configurations that were identified manually and *a priori*. In contrast, the WAB approach used here takes all available configurations, *α*, of the training dataset into account, since one does not know *a priori* which *α* value(s) are suitable for the present estimation.

By omitting a configuration *α* from the training data, and then estimating the aerodynamic loads of the omitted *α*, we can assess the capabilities of the WAB approach to estimate an unknown signal. As shown in Fig. 5, the WAB approach successfully estimates loads from *α* that were omitted in the training data. The estimates are particularly accurate for $\alpha=90^\circ$ and $\alpha=110^\circ$ [see Figs. 5(b) and 5(c) and Figs. 6(b) and 6(c)]. However, the performance of the WAB approach deteriorates if $\alpha=130^\circ$ or $\alpha=45^\circ$ are omitted in the training dataset and then estimated [see panels (a) and (d), respectively, in Figs. 5 and 6].
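The omission protocol amounts to a leave-one-configuration-out split, which can be sketched as below. Looping over all six angles of attack is an assumption made for illustration; the configurations actually omitted in the study are those shown in the panels of Fig. 5. The names `ALPHAS` and `leave_one_out_splits` are hypothetical.

```python
ALPHAS = [45, 60, 75, 90, 110, 130]  # angles of attack in degrees


def leave_one_out_splits(alphas):
    """Yield (omitted, training) pairs: one configuration is held out at a time."""
    for omitted in alphas:
        training = [a for a in alphas if a != omitted]
        yield omitted, training


# Map each omitted configuration to the configurations used for training.
splits = dict(leave_one_out_splits(ALPHAS))
```

For each pair, the transition networks would be built from the training configurations only, and a single run of the omitted configuration would serve as the pressure input.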

Taking $\alpha=130^\circ$ as a representative case, this behavior can be explained by the fact that the trajectory for $\alpha=130^\circ$ is not fully surrounded by other trajectories in the phase space [see Fig. 4(e)], but is only close to the trajectory for $\alpha=110^\circ$ [orange line in Fig. 4(e)]. From the point of view of WAB, the weighted average to estimate the load for $\alpha=130^\circ$ is performed using load data that are always lower than the expected $\tilde{C}_N$, so that the average is unavoidably driven by lower $C_N$ values, thus leading to higher estimation errors. In contrast, the trajectories of $\alpha=90^\circ$ and $\alpha=110^\circ$ are surrounded by states of various *α* [as shown in the phase-space representation of Fig. 4(e)], so that the WAB strategy can better interpolate to the estimated force.
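The interpolation argument can be stated compactly. As a sketch, assume that every weight in Eq. (2) is non-negative; this holds for the inverse-distance weights by construction, while for $w_{opt}^{\alpha}$ it is an assumption, since the constraints of the optimization are only given in Appendix A. The estimate is then a convex combination of the target-node loads and is bounded by them:

$$\min_{\alpha,i}\,\xi(S_i^{\alpha})\;\le\;\hat{\xi}(t_h)\;\le\;\max_{\alpha,i}\,\xi(S_i^{\alpha}).$$

If the measured load of an omitted peripheral configuration lies above the upper bound, the estimate cannot reach it, whatever the weights.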

This limitation of the WAB approach is a common feature of interpolation-based techniques, and can be overcome by properly collecting training data [Fig. 1(b)] so that all expected peripheral boundary conditions are accounted for. For example, in the present experimental dataset, the training should contain the maximum and minimum *α* that can be experienced by the system.

### B. Limited amount of sensors

Physically implementing a dense network of pressure sensors on an aerodynamic body requires high design and production costs. However, when the amount of sensors is significantly reduced, ambiguous states will likely occur (i.e., same pressure input values, but different loads; see Sec. III). To accurately estimate the aerodynamic loads with sparse data, the present WAB approach concurrently relies on the instantaneous sensor data and the recent history of the system. Namely, the information from the previous state estimate at $t_{h-1}$ and the instantaneous sensor input are combined to determine the node-specific weights $w_{i,dist}^{\alpha}$ [Fig. 3(b)]. In addition, the recent pressure history is used to find a set of *α*-specific weights $w_{opt}^{\alpha}$ that minimize the error of the pressure estimate within the recent past [window $\Delta t^*$, Fig. 3(c)]. As such, both weights, $w_{i,dist}^{\alpha}$ and $w_{opt}^{\alpha}$, contribute to the WAB's performance in systems with sparse sensors.

The positive effect of using $w_{opt}^{\alpha}$, obtained by the pressure-error minimization, is apparent when comparing the estimated loads with and without optimization in Figs. 5 and 6. The estimation performance is consistently better when $w_{opt}^{\alpha}$ is used. This is particularly evident for $\alpha=90^\circ$ and $\alpha=110^\circ$, for which optimized WAB [black lines in Figs. 5(b) and 5(c)] is able to capture the initial load bump occurring at the end of the acceleration phase ($t^*\approx 12$). This local increase is particularly challenging to estimate as a result of the strong unsteadiness and nonlinearity in the system, thus highlighting the potential of optimized WAB in performing load estimation effectively.

### C. Realistic data

As shown in Sec. IV B, including the system dynamics in the estimation process can lead to better cluster-based modeling and estimation performance. While our WAB approach relies on a pressure-error optimization, alternative approaches were suggested to account for system dynamics during the state estimation. For instance, Nair *et al.*^{30} accounted for the system dynamics by adding the temporal derivative of their (numerical) input data as an additional axis of the phase space, thus obtaining a better aerodynamic state identification. However, under realistic (experimental) conditions, training and input data display noise levels as a result of several (systematic or randomly appearing) factors affecting the measurements. In particular, the noise level of experimental pressure data obtained in separated, high-Reynolds number flows is typically very high, leading to inaccurate evaluations of temporal gradients. Therefore, in such cases, temporal gradients do not generally represent a suitable choice to be included in the phase space. Accordingly, our WAB approach has been conceived to rely only on absolute values of the sensor input and not on their temporal gradients.

Furthermore, to mitigate the impact of experimental noise on estimation, we used phase-averaged data as a training dataset [see Figs. 4(c)–4(e)]. Nevertheless, the estimation capabilities were tested on (randomly selected) single-run time series, which differ from the phase-averaged signals, and provide additional challenges to the estimation accuracy. In spite of these challenges, the WAB approach still proved to be accurate even in the presence of noisy input data.

## V. CONCLUSIONS AND OUTLOOK

In this study, we extend the application of transition networks from signal reconstruction to load estimation with real-time input. In particular, we generate new signals that were not included in the training dataset, utilizing the input of $N=2$ sparse sensors. A weighted average-based (WAB) network strategy is proposed and tested under realistic conditions of unsteady flows. In particular, the network-based approach is tested on an experimental dataset with pressure and load data from an accelerating elliptical plate at various angles of attack. The WAB strategy exploits the features of transition networks (which comprise the definition of a phase space and a clustering algorithm) and a real-time input of sparse pressure signals. Furthermore, an optimization process that minimizes the difference between estimated and measured (input) pressure is implemented.

The potential and limitations of the WAB approach are discussed for estimates corresponding to different (omitted) angles of attack. The results indicate that transition networks can estimate configurations that were unknown during the training stage, with global estimation errors below 20%, and for some cases even below 10%. Moreover, the pressure optimization approach is able to further refine the estimation outcomes by also capturing characteristic local behaviors of the unsteady load signals, e.g., after the acceleration phases. Therefore, our WAB approach proves to be a robust and accurate tool for signal estimation, even in the presence of sparse input data and with limited training data.

While the current approach represents a first effort to employ transition networks to estimate aerodynamic loads in unsteady and high-Reynolds number flows, several methodological advancements can be implemented to potentially enhance the capabilities of the WAB approach. In fact, transition networks do not represent a *black box* tool between input and output variables but provide a versatile framework that can be easily modified to account for advanced information on the flow system.

In this regard, the implementation of the system dynamics through pressure-error minimization represents a paradigmatic example. Additional physical insights can be included in the network model by expanding the size of the phase space, where additional axes could represent other measured variables of the system. In this vein, future efforts aim to incorporate external flow measurements (e.g., velocity or vorticity fields) in the transition network model. On this note, the outcomes from simple models could also be incorporated in the algorithms, either as additional axes of the phase space or as additional input data to constrain the estimation process.

With the aim to account for noise in experimental data, different clustering strategies could also be applied (such as fuzzy algorithms), and the probabilistic nature of transition networks could be further exploited, e.g., implementing Bayesian statistics.^{35} Furthermore, different interpolation schemes can be used, where the weighted average can be performed on a limited set of configurations chosen via an optimization routine.

In conclusion, transition networks show great potential for real-time estimation of unknown variables in fluid dynamics problems, even under challenging flow conditions and sparse training datasets. Therefore, we believe the proposed network-based methodology, owing to its versatility, can be a promising tool for the real-time estimation of realistic flows even when limited by sparse data.

## ACKNOWLEDGMENTS

D.E.R. acknowledges support from the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-20-1-0086, monitored by Dr. Gregg Abate.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

## DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding authors upon reasonable request.

### APPENDIX A: WAB PROCEDURAL DETAILS

The details of the WAB methodology are here reported. We recall that the time series from the testing dataset are indicated via $\tilde{\bullet}$ notation (i.e., $\{\tilde{\eta}_1,\tilde{\eta}_2,\tilde{\xi}\}$), while the newly generated signals are hereafter indicated via $\hat{\bullet}$ notation (i.e., $\{\hat{\eta}_1,\hat{\eta}_2,\hat{\xi}\}$). The WAB approach comprises the following four main steps (see Fig. 3):

- At the first time, $t_0$, a nearest-neighbor approach is used to estimate $\hat{\xi}(t_0)$ because a transition approach requires at least two times. First, a set, $S_{nn}^{\alpha}$, of $N_{nn}$ nodes is selected for each configuration *α*. Specifically, each set of nodes comprises the closest $N_{nn}$ nodes to the measured pressures of the testing dataset, namely, $\tilde{\eta}(t_0)=\{\tilde{\eta}_1(t_0),\tilde{\eta}_2(t_0)\}$. For example, in Fig. 3(a), $N_{nn}=2$ and the closest nodes to the two configurations $\alpha_1$ and $\alpha_2$ are identified by dashed cyan lines, which highlight the 2D distances $d_{\eta}=||\eta(S_{nn}^{\alpha})-\tilde{\eta}(t_0)||_2$. The load value, $\hat{\xi}(t_0)$, is then evaluated as the weighted average of the *ξ* values of each node in $S_{nn}^{\alpha}$, namely,

  $$\hat{\xi}(t_0)=\frac{\sum_{\alpha}\sum_{i} w_i^{\alpha}\,\xi(S_i^{\alpha})}{\sum_{\alpha}\sum_{i} w_i^{\alpha}}, \tag{A1}$$

  where $S_i^{\alpha}\in S_{nn}^{\alpha}$, while $w_i^{\alpha}=1/d_{\eta}$ is a set of distance-dependent weights. A newly estimated node $\hat{S}(t_0)=\{\tilde{\eta}_1(t_0),\tilde{\eta}_2(t_0),\hat{\xi}(t_0)\}$ is then obtained, which is illustrated as a filled red dot in Fig. 3(a);

- The transition probabilities of the networks are then used to continue estimating the load signal. At a generic time $t_h$, the $N_{nn}$ closest nodes, $S_{nn}^{\alpha}$, are first identified. In particular, the nodes in $S_{nn}^{\alpha}$ are selected to minimize the Euclidean distance, $d_{\eta,\xi}$, between the point $\{\tilde{\eta}_1,\tilde{\eta}_2,\hat{\xi}\}(t_{h-1})$ [see filled red circle in Fig. 3(b)] and all the nodes belonging to each trajectory. For any node in $S_{nn}^{\alpha}$, the transition matrix is exploited to identify the transition target nodes following the criterion of maximum transition probability. Target nodes are highlighted by magenta circles in each trajectory of Fig. 3(b), while transitions are shown via magenta arrows.

  Similar to the initialization at $t_0$ [Fig. 3(a)], a set of weights $w_{i,\mathrm{dist}}^{\alpha}=1/d_{\eta,\xi}$ can be defined, implying that closer $S_{nn}^{\alpha}$ nodes will have a higher impact (i.e., a higher weight) on the estimation of the next state $\hat{S}(t_h)$. We note that during initialization, a 2D distance ($d_{\eta}$) was used. For $t_h>t_0$, a load estimate $\hat{\xi}(t_h)$ exists that can be exploited to calculate a 3D distance ($d_{\eta,\xi}$), as shown by cyan dashed lines in Fig. 3(b). In contrast to a 2D distance, the 3D distance provides more robustness against ambiguous estimations. These arise when the $\eta_1$ and $\eta_2$ values of the nodes $S_{nn}^{\alpha}$ are similar, but their *ξ* values are significantly different;

- Although the identified nodes $S_{nn}^{\alpha}$ are close to $\hat{S}(t_{h-1})$ in the phase space, this does not necessarily imply that $S_{nn}^{\alpha}$ nodes belong to a trajectory (i.e., an *α* case) displaying a similar temporal dynamics as the input data $\tilde{\eta}$. Therefore, to identify the *α* cases that best capture the dynamics of the input data, we rely on the recent past of our input data $\tilde{\eta}$. In particular, we define a new set of *α*-specific coefficients, $w_{\mathrm{opt}}^{\alpha}$, which are generated through a pressure-error minimization strategy [Fig. 3(c)]. The input data, $\tilde{\eta}$, in a temporal window $[t_{h-\delta h},t_h]$ are used as reference values and compared with the estimated pressure values $\hat{\eta}$, computed as

  $$\hat{\eta}=\frac{\sum_{\alpha}\sum_{i} w_i^{\alpha}\,\eta(S_i^{\alpha})}{\sum_{\alpha}\sum_{i} w_i^{\alpha}}, \tag{A2}$$

  where

  $$w_i^{\alpha}=w_{i,\mathrm{dist}}^{\alpha}\cdot w_{\mathrm{opt}}^{\alpha}. \tag{A3}$$

  A minimization problem is then solved which aims to find the set of weights $w_{\mathrm{opt}}^{\alpha}$ that minimizes the maximum absolute difference between input and estimated pressure [dashed and solid red lines in Fig. 3(c)] over the chosen temporal window, $[t_{h-\delta h},t_h]$, namely,

  $$\underset{w_{\mathrm{opt}}^{\alpha}}{\arg\min}\left[\max_{t\in[t_{h-\delta h},\,t_h]}\left[\left|\hat{\eta}_{1,2}(t;w_{\mathrm{opt}}^{\alpha})-\tilde{\eta}_{1,2}(t)\right|\right]\right]. \tag{A4}$$

  We note that $w_{\mathrm{opt}}^{\alpha}$ only depends on *α*, thus providing a measure of the reliability of each trajectory (corresponding to configurations *α*) to fulfill the constraint on estimated pressure coming from input (testing) data. For example, the blue trajectory corresponding to $\alpha_3$ in Fig. 3(c) is much less reliable than the remaining two trajectories (for $\alpha_1$ and $\alpha_2$) in providing a good estimation for pressure, thus leading to $w_{\mathrm{opt}}^{\alpha_3}\ll 1$. If optimization is not performed, $w_{\mathrm{opt}}^{\alpha}=1$ for any *α* configuration and $w_i^{\alpha}\equiv w_{i,\mathrm{dist}}^{\alpha}$; and

- Finally, a weighted average of the *ξ* values of the target nodes is computed to estimate the load value $\hat{\xi}$ at time $t_h$, thus obtaining the point $\hat{S}(t_h)$ [Fig. 3(d)]. Equation (A1) is still exploited to get $\hat{S}(t_h)$, but the weights $w_i^{\alpha}=w_{i,\mathrm{dist}}^{\alpha}\cdot w_{\mathrm{opt}}^{\alpha}$ [Eq. (A3)] are defined as the product of the distance-related weight $w_{i,\mathrm{dist}}^{\alpha}=1/d_{\eta,\xi}$ [cyan dashed lines in Fig. 3(b)] and the *α*-specific novel coefficient $w_{\mathrm{opt}}^{\alpha}$ [Fig. 3(c)] which accounts for the system dynamics.

To conclude the procedure, the estimated time $\hat{t}_h$ is also computed as the sum of the previous estimated time, $\hat{t}_{h-1}$, and a weighted-average transition time from Eq. (A1) where transition times $T$ are used instead of *ξ*.
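The initialization and transition steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: each configuration *α* is assumed to be stored as a pair `(nodes, P)`, where `nodes` is an array with columns $(\eta_1,\eta_2,\xi)$ and `P[i, j]` is the probability of a transition from node `i` to node `j`. The names `nearest_nodes`, `wab_init`, `wab_step`, and the small regularization `eps` (which avoids a division by zero when a node coincides with the query point) are hypothetical. The pressure-error minimization of Eq. (A4) and the time update are not implemented; the coefficients $w_{\mathrm{opt}}^{\alpha}$ are simply passed in.

```python
import numpy as np

def nearest_nodes(nodes, point, n_nn):
    """Indices and distances of the n_nn nodes closest to `point`.

    The distance uses the first len(point) coordinates of each node:
    2D (eta_1, eta_2) at initialization, 3D (eta_1, eta_2, xi) afterwards.
    """
    d = np.linalg.norm(nodes[:, :len(point)] - point, axis=1)
    idx = np.argsort(d)[:n_nn]
    return idx, d[idx]

def wab_init(networks, eta_in, n_nn=2, eps=1e-12):
    """Load estimate at t_0: inverse-distance weighted average, as in Eq. (A1)."""
    num = den = 0.0
    for nodes, _ in networks:
        idx, d = nearest_nodes(nodes, eta_in, n_nn)
        w = 1.0 / (d + eps)                 # distance-dependent weights, 1/d_eta
        num += np.sum(w * nodes[idx, 2])    # xi values of the nearest nodes
        den += np.sum(w)
    return num / den

def wab_step(networks, eta_prev, xi_prev, w_opt=None, n_nn=2, eps=1e-12):
    """Load estimate at t_h from the state at t_{h-1}.

    For each configuration, the nearest nodes to the previous state are found
    (3D distance), each is advanced to its most probable target node, and the
    xi values of the targets are averaged with weights w_dist * w_opt, Eq. (A3).
    """
    if w_opt is None:
        w_opt = np.ones(len(networks))      # no optimization: w_opt = 1 for all alpha
    state = np.array([eta_prev[0], eta_prev[1], xi_prev])
    num = den = 0.0
    for (nodes, P), wo in zip(networks, w_opt):
        idx, d = nearest_nodes(nodes, state, n_nn)
        targets = np.argmax(P[idx], axis=1)  # maximum transition probability
        w = wo / (d + eps)
        num += np.sum(w * nodes[targets, 2])
        den += np.sum(w)
    return num / den
```

In this sketch, a configuration with $w_{\mathrm{opt}}^{\alpha}=0$ drops out of the average entirely, which mirrors the role of $w_{\mathrm{opt}}^{\alpha}$ as a reliability measure for each trajectory. Any normalization of the phase-space axes before computing distances is left out and would need to match the one used when building the networks.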

### APPENDIX B: PARAMETRIC ANALYSIS ON NUMBER OF NETWORK NODES

This Appendix describes the effects of the $N_{cl}$ parameter on the estimation performance of the WAB transition network strategy. We recall that $N_{cl}$ indicates the number of nodes in the network, which correspond to the centroids of the Voronoi cells obtained from the *k*-means clustering.

Figure 7 shows the average performance of WAB when $N_{cl}$ is varied, either with or without pressure-error minimization. Here, a global error is computed as $\langle E\rangle=\langle E_{abs}\rangle/\langle\tilde{C}_N\rangle$. In general, $\langle E\rangle$ increases toward small $N_{cl}$ values because $N_{cl}$ becomes comparable with the number of nodes used to perform the weighted average, $N_{nn}=10$. As discussed in Sec. IV, the WAB method performs well when intermediate configurations have to be estimated, which is confirmed in Fig. 7 for $\alpha=\{75^{\circ},90^{\circ},110^{\circ}\}$ (see blue, green, and orange lines, respectively). In particular, it is evident that the pressure-error minimization (filled-dotted lines) always reduces the overall estimation error for any omitted *α*. Finally, we highlight that $\langle E\rangle$ remains quite constant for $N_{cl}>300$, with values below 10% for the intermediate cases and below 20% for external cases (i.e., $\alpha=45^{\circ},130^{\circ}$), thus justifying the choice of $N_{cl}=300$ in Sec. IV.

### APPENDIX C: RECONSTRUCTION VERSUS ESTIMATION

Data-driven approaches have often been used to *reconstruct* data, so that the newly generated reconstructed time series are expected to share very similar statistical features with respect to the corresponding signal included in the training dataset.^{26,28} In contrast, in the present study, we aimed to *estimate* unknown signals, i.e., not included in the training dataset. In general, the newly estimated signals could display very similar statistical features with respect to the expected time-series, while still containing considerable local errors. In other words, although an estimation process could produce a globally (statistically) similar time-series with respect to the expected signal, local errors can be non-negligible.

In our work, this could be a consequence of the fact that each single run will always differ locally from other runs or from phase-average signals that are included in the training dataset. A representative example is provided in Fig. 8: we show as a black line the autocorrelogram of the estimated load $\hat{C}_N$ (without performing optimization) for the omitted case $\alpha=130^{\circ}$, as a function of the temporal lag $\Delta\tau^{*}$. For comparison, the autocorrelogram of the measured (reference) load $\tilde{C}_N$ is also reported as a cyan line. Autocorrelation is chosen here in analogy with previous studies assessing the capabilities of transition networks.^{26,28} While estimation errors can be locally non-negligible [as illustrated in panel (d) of Figs. 5 and 6], the difference between the autocorrelogram for the estimated and measured loads is instead very small.

Therefore, while statistical tools like the autocorrelation could be effectively used to assess *reconstruction* performances, they might not always provide a reliable measure of the *estimation* performances, especially in the context of unsteady load estimation.
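The distinction drawn in this Appendix can be reproduced with a synthetic sketch in which an "estimate" is simply a phase-shifted copy of a sinusoidal "reference". The signals, the phase shift of 0.3 rad, and the helper names `autocorrelogram` and `compare` are assumptions made only for illustration and are unrelated to the measured loads in Fig. 8.

```python
import numpy as np

def autocorrelogram(x, max_lag):
    """Normalized autocorrelation of a mean-removed signal for lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    full = np.correlate(x, x, mode="full")        # lags -(N-1) .. (N-1)
    acf = full[x.size - 1 : x.size + max_lag]     # keep lags 0 .. max_lag
    return acf / acf[0]

def compare(reference, estimate, max_lag):
    """Largest autocorrelogram difference and largest normalized local error."""
    acf_diff = np.max(np.abs(autocorrelogram(reference, max_lag)
                             - autocorrelogram(estimate, max_lag)))
    local_err = np.max(np.abs(estimate - reference)) / np.max(np.abs(reference))
    return acf_diff, local_err

# Hypothetical signals: 20 periods of a unit-period sinusoid, 200 samples per period.
t = np.arange(4000) / 200.0
reference = np.sin(2.0 * np.pi * t)
estimate = np.sin(2.0 * np.pi * t + 0.3)          # same statistics, shifted in phase

acf_diff, local_err = compare(reference, estimate, max_lag=400)
print(f"max autocorrelogram difference: {acf_diff:.4f}")
print(f"max local error (normalized):   {local_err:.4f}")
```

For these synthetic signals, the autocorrelograms nearly coincide while the pointwise error is a sizable fraction of the signal amplitude, which is the behavior that makes autocorrelation a weak indicator of estimation (as opposed to reconstruction) performance.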