Metasurfaces designed with deep learning approaches have emerged as efficient tools for manipulating electromagnetic waves to achieve beam steering and power allocation objectives. However, the effects of complex environmental factors, such as obstacle blocking and other unavoidable scattering, need to be sufficiently considered for practical applications. In this work, we employ an experiment-based deep learning approach for programmable metasurface design to control the power delivered to specific locations, generally in the presence of a blocking obstacle. Without prior physical knowledge of the complex system, large sets of experimental data can be efficiently collected with a programmable metasurface to train a deep neural network (DNN). The experimental data inherently incorporate complex factors that are difficult to include if only simulation data are used for training. Moreover, the DNN can be updated by collecting new experimental data on-site to adapt to changes in the environment. Our proposed experiment-based DNN demonstrates significant potential for intelligent wireless communication, imaging, sensing, and quiet-zone control in practical applications.
I. INTRODUCTION
Metasurfaces, consisting of subwavelength artificial array structures on an ultrathin surface, possess a remarkable ability to fully control the properties of electromagnetic (EM) waves including their amplitude, phase, polarization, and wavefront structure. This level of control gives rise to exotic EM phenomena, such as a negative refractive index,1,2 perfect absorption,3 superlensing,4 and invisibility cloaking.5,6 However, these metasurfaces are usually inherently limited to specific functions once their fabrication is finalized and thus cannot meet the requirement of dynamic control of EM waves. Recently, significant efforts have been devoted to developing active or reconfigurable metasurfaces,7–17 whose responses can be varied through external tuning. Specifically, programmable metasurfaces18 as a cost-effective implementation of reconfigurable intelligent surfaces (RISs)19–21 can be digitally controlled using a field-programmable gate array (FPGA). Programmable metasurfaces have demonstrated considerable potential for various applications, including beam scanning,22–24 spatial frequency multiplexing,25–27 nonreciprocal reflection,28 holographic imaging,29,30 and orbital angular momentum generation.31,32
Deep learning has become an efficient tool for metasurface inverse design33–36 together with different techniques, such as generative models,37–40 tandem networks,41–43 transfer learning,44,45 reinforcement learning,46,47 and hybrid models.48–50 Deep neural networks (DNNs) trained with a large dataset of examples can be employed to generate optimized metasurface designs for specific desired functionalities. In particular, DNN-assisted metasurface inverse design has been applied in power allocation to distribute the transmitted power to spatially separated users,51–55 which plays a key role in modern wireless communication by reducing interference and enhancing spectral efficiency. However, the impact of complex environmental factors, such as the blocking effect of obstacles, needs to be sufficiently considered for practical applications of power allocation. Current DNN approaches for such situations56–58 often require a large amount of training data from time-consuming simulations and face difficulties in simulating ambiguous or unknown factors (e.g., geometric or material parameters) of the complex obstacles. Mismatches between the simulation model and the actual system may lead to inaccurate predictions of the DNN.59,60 Recently, enabled by the fast tunability of programmable metasurfaces, large amounts of experimental data, which inherently incorporate complex environmental factors, have been efficiently collected for DNN training to overcome modeling challenges and improve prediction accuracy in applications such as cloaking, imaging, and direction-of-arrival estimation.61–63 The combination of a deep learning approach and programmable metasurfaces may provide a more efficient way to extend power allocation applications to realistic complex environments.
In this work, we employ an experiment-based deep learning approach with a programmable metasurface to address the power allocation challenges of real-world complex environments with obstacle blocks. Without prior physical knowledge of these complex systems, we train the DNN directly with experimental data measured across various configurations of the programmable metasurface. Moreover, we update the DNN to adapt to changes in the environment with newly collected experimental data. Both scenarios with and without obstacle blocks are investigated, and the results demonstrate the effectiveness of the experiment-based DNN in controlling the transmitted power toward multiple receivers while showcasing robust adaptability to changes in the environment. The proposed experiment-based deep learning scheme offers a promising avenue for leveraging real-world data to achieve accurate and efficient programmable metasurface designs, holding potential for intelligent wireless communication of Wi-Fi and 5G signals in complex indoor environments.
II. EXPERIMENT-BASED DEEP LEARNING APPROACH
We aim to control the power transmitted to specific receivers in a complex environment, generally with an obstacle, using a programmable metasurface together with an experiment-based deep learning approach, as shown in Fig. 1. A reflective programmable metasurface featuring tunable reflection phase profiles in the microwave regime is illuminated with a monochromatic excitation signal at 11 GHz from a feed horn. The metasurface comprises 20 columns of unit cells, and the reflection phase {φi} (i = 1, 2, …, 20) for each column can be independently controlled. After the reflected wave is scattered by an obstacle (a metal frame in this case), the scattered field intensities (Im1, Im2, and Im3) are measured by three open-end waveguide probes at specific locations. Our DNN consists of a forward scattering engine (FSE) and an inverse-design engine (IDE), as shown in Fig. 1. We first train the FSE to map a set of reflection phases {φi} to the predicted scattered field intensities at the three probes (j = 1, 2, 3). During the training, a large number of randomly generated configurations of {φi} and the corresponding scattered field intensities {Imj} measured experimentally by the three probes are used as training data. The mean squared error (MSE) between the predicted intensities and {Imj} is used as the loss function to optimize the FSE during the training process. Specifically, the FSE is a supervised network with 40-100-100-3 fully connected layers. We split each cyclic reflection phase φi into {cos φi, sin φi}, resulting in 40 input variables for the 20 columns of reflection phases, to improve training performance (more details can be found in Sec. III of the supplementary material). The FSE has two hidden layers, both with 100 neurons, using the exponential linear unit (ELU) activation function, and has three output variables for the predicted intensities. After training, the FSE acts as a surrogate solver, much like a network that replaces the simulation of a complex environment, except that here it replaces the real physical scattering process.
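The FSE layout described above can be sketched in a few lines of plain Python. This is only an illustrative forward pass under stated assumptions, not the authors' implementation: the weight initialization, the linear output layer, the ordering of the 40 inputs (all cosines followed by all sines), and the example intensities are hypothetical choices that the text does not specify.

```python
import math
import random

N_COLUMNS = 20                    # metasurface columns, from the text
LAYER_SIZES = [40, 100, 100, 3]   # FSE layout, from the text


def encode_phases(phases_rad):
    """Map 20 cyclic phases to 40 inputs (assumed order: cosines, then sines)."""
    return [math.cos(p) for p in phases_rad] + [math.sin(p) for p in phases_rad]


def elu(x, alpha=1.0):
    """Exponential linear unit activation."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def init_layers(sizes, rng):
    """Random weights and zero biases (illustrative initialization only)."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def fse_forward(layers, phases_rad):
    """Forward pass: ELU on hidden layers, linear output layer (assumed)."""
    x = encode_phases(phases_rad)
    for idx, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            x = [elu(v) for v in x]
    return x


def mse(predicted, measured):
    """Mean squared error between two equally long lists."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted)


rng = random.Random(0)
layers = init_layers(LAYER_SIZES, rng)
# One random profile drawn from eight phase states spaced by 45 degrees.
phases = [rng.randrange(8) * math.pi / 4 for _ in range(N_COLUMNS)]
predicted = fse_forward(layers, phases)
loss = mse(predicted, [0.1, 0.2, 0.3])  # made-up "measured" intensities
```

Encoding each phase as a cosine and sine pair means that phases of 0 and 2π map to the same input, which is the motivation for splitting the cyclic variable.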
Schematic of the experiment-based DNN for power allocation with a programmable metasurface. The reflection phase profiles {φi} and corresponding field intensities {Imj} are regarded as experimental training data to train the integrated DNN, which combines the IDE with the pre-trained FSE, to coordinate the metasurface inverse design to manipulate the scattered fields on demand.
Next, the IDE is constructed with a reverse topology of 3-50-50-20 fully connected layers. The input target intensities {Ij} are inversely transformed to the desired reflection phase profile {φi}. During the training of the IDE, the MSE between {Ij} and the predicted intensities (output by the IDE combined with the pre-trained FSE) is used as the loss function, and no experimental data are needed in this stage. Finally, for any target set of {Ij}, the output of the IDE, {φi}, can be applied to the real metasurface to test whether the experimentally obtained {Imj} is close to {Ij}. We note that, due to the inverse-design nature of the problem, there may be multiple phase profiles {φi} that achieve the same set of target intensities {Ij}. Combining the IDE and the pre-trained FSE into an integrated DNN (an autoencoder setting for results instead of design parameters) can help mitigate this nonuniqueness issue.64 We also note that there is an additional pre-trained quantization network (see more details in Sec. I of the supplementary material) when the IDE is connected to the FSE. The quantization network maps continuous input values to the eight possible discrete values of the reflection phase for realistic implementation of the programmable metasurface with an FPGA.
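To make the role of the quantization step concrete, the sketch below snaps a continuous phase to the nearest of eight states. It is a simple nearest-state rounding rule used here as a stand-in, not the pre-trained quantization network of the supplementary material. The placement of the states at 0°, 45°, …, 315° is also an assumption, since only the 45-degree spacing is given in the text.

```python
N_STATES = 8       # eight discrete phase states, from the text
STEP_DEG = 45.0    # spacing between neighboring states, from the text


def quantize_phase(phase_deg):
    """Return (state index, quantized phase in degrees) of the nearest state.

    The phase is wrapped onto [0, 360) first, so values just below 360
    degrees correctly snap to state 0 rather than to a ninth state.
    """
    wrapped = phase_deg % 360.0
    index = int(round(wrapped / STEP_DEG)) % N_STATES
    return index, index * STEP_DEG


def cyclic_error_deg(phase_deg, quantized_deg):
    """Smallest angular distance between two phases on the circle."""
    diff = abs(phase_deg - quantized_deg) % 360.0
    return min(diff, 360.0 - diff)


# Quantize a hypothetical continuous IDE output for 20 columns.
continuous_profile = [17.0 * i for i in range(20)]
quantized_profile = [quantize_phase(p)[1] for p in continuous_profile]
```

With this rule the quantization error never exceeds half a step, i.e., 22.5 degrees. A hard rounding rule like this one is not differentiable, which is presumably one reason a separate pre-trained network is used when the IDE is trained end to end.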
III. PROGRAMMABLE METASURFACE DESIGN AND DATA COLLECTION
To obtain the experimental training data, we design and fabricate a programmable metasurface consisting of 20 × 20 unit cells operating at 11 GHz, in which the reflection phase of each column {φi} can be independently controlled, as shown in Fig. 2(a). The reflected fields depend on the phase profile {φi} assigned to the 20 columns of the metasurface. By varying the reflection phase profiles rapidly in time, a large set of experimental data can be collected from the three probes within a short time for the DNN training. In our case, 10 000 sets of randomly selected {φi} are chosen as input to the metasurface, and the experimental training data {Imj} can be collected within 10 s. The unit cell structure of the programmable metasurface is shown in Fig. 2(b). The geometrical parameters of the element are h1 = 0.813 mm, h2 = 0.1 mm, s = 0.6 mm, a = 10 mm, b = 5 mm, c = 4 mm, and g = 1 mm. Three copper layers are printed on two substrate layers (Rogers 4003C, relative permittivity ɛr = 3.55, loss tangent tan δ = 0.0027) and a bonding layer (Rogers 4450F, ɛr = 3.52, tan δ = 0.004). A varactor diode (MAVR-000120-14110P), an active component whose capacitance changes with the bias voltage, is embedded between two metallic patches on the top layer. Two metallic vias provide electrical connections to the negative “−” electrode in the middle layer and the positive “+” electrode in the bottom layer, respectively. By applying different bias voltages to the varactor diode, the dipole resonance of the metasurface can be shifted in the frequency domain, leading to programmable reflection phase responses at a fixed working frequency. The measured reflection phase responses of the metasurface at eight different bias voltages are shown in Fig. 2(c). At the operating frequency of 11 GHz, indicated by the vertical orange line in the figure, we use eight discrete phase states in 45-degree steps covering a 315-degree range (set by the FPGA). These phase states are assigned to the reflection phases {φi} of the 20 independent columns to create different phase profiles. The reflection amplitudes of these eight states vary somewhat (within 1.7 dB), but this limitation of the implementation is already accounted for in the DNN, as the network is trained directly on experimental data.
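As a rough illustration of the data-collection input, the following sketch draws random phase profiles for 20 columns with eight states each and compares the 10 000 samples with the size of the full configuration space. Uniform, independent sampling of each column and the fixed seed are assumptions; the text only states that the profiles are randomly selected.

```python
import random

N_COLUMNS = 20      # independently controlled columns, from the text
N_STATES = 8        # discrete phase states per column, from the text
N_SAMPLES = 10_000  # number of collected configurations, from the text


def random_profiles(n_samples, rng):
    """Draw n_samples profiles; each entry is a state index 0..7 for one column."""
    return [[rng.randrange(N_STATES) for _ in range(N_COLUMNS)]
            for _ in range(n_samples)]


rng = random.Random(42)  # hypothetical seed, for reproducibility only
profiles = random_profiles(N_SAMPLES, rng)

# Convert state indices to phases in degrees (assumed states at 0, 45, ..., 315).
first_profile_deg = [45 * state for state in profiles[0]]

# Size of the full design space and the fraction covered by the samples.
total_configurations = N_STATES ** N_COLUMNS
sampled_fraction = N_SAMPLES / total_configurations
```

The design space contains 8^20 = 2^60 (about 1.15 × 10^18) configurations, so the training set covers only a vanishing fraction of it and the DNN has to generalize from this sparse sample.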
Metasurface design and experimental training data collection. (a) Schematic of the programmable metasurface design. (b) Unit cell structure with geometric parameters. (c) Measured reflection phase of the metasurface at different bias voltages. The vertical orange line indicates the operating frequency at 11 GHz. (d) Normalized intensities Im1, Im2, and Im3 measured from the three probes for the DNN training process in the scenario without obstacle. The color of points denotes the sum of the intensities from the three probes.
We first investigate the range of measured intensities {Imj} that the metasurface can produce. In the scenario without the obstacle placed in front of the metasurface, we randomly generate 10 000 sets of phase profiles on the metasurface and use the three fixed probes to experimentally measure the corresponding intensities {Imj}. These 10 000 sets of experimental training data can be collected efficiently in 10 s, whereas generating them by simulation would take much longer (see Sec. II in the supplementary material). The intensities {Imj} are plotted as three-dimensional points in Fig. 2(d). We note that the {Imj} plotted in the figure are normalized as Imj/Imax, where Imax denotes the maximum intensity received by any of the three probes in the given 10 000 sets of measurements. Normalized intensities Im1, Im2, and Im3 below 0.6 account for 97.9%, 94.1%, and 98.8% of the total data, respectively. In the following, these data are used to train the DNN, and any target normalized intensity values {Ij} are assumed to range from 0 to 0.6.
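The normalization and the "fraction below 0.6" statistic can be sketched as follows. The raw intensities here are synthetic placeholders drawn from an exponential distribution purely for illustration; they are not the measured data, so the resulting fractions will not reproduce the percentages quoted above.

```python
import random


def normalize(intensities):
    """Divide every reading by the largest reading over all probes and samples."""
    i_max = max(max(row) for row in intensities)
    return [[value / i_max for value in row] for row in intensities]


def fraction_below(normalized, probe, threshold=0.6):
    """Fraction of samples whose normalized intensity at one probe is below threshold."""
    column = [row[probe] for row in normalized]
    return sum(1 for value in column if value < threshold) / len(column)


rng = random.Random(1)
# Synthetic stand-in for 10 000 measurements at three probes (not real data).
raw = [[rng.expovariate(1.0) for _ in range(3)] for _ in range(10_000)]

normalized = normalize(raw)
fractions = [fraction_below(normalized, probe) for probe in range(3)]
```

Because a single global maximum is used for all three probes, the relative power levels between probes are preserved after normalization.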
IV. EXPERIMENTAL RESULTS
A. DNN training and testing without obstacles
The proposed experiment-based deep learning approach for power allocation is first demonstrated in the scenario without any obstacle. The randomly generated phase profiles {φi} and the corresponding measured intensities {Imj} in Fig. 2(d) are first used as experimental training data to train the FSE. After that, we train the integrated DNN comprising the IDE and the pre-trained FSE. The details of the DNN training process can be found in Sec. III of the supplementary material. To test the performance of the trained DNN, we first demonstrate three special cases called “001,” “101,” and “000.” In the “001” case, the metasurface steers the scattered fields toward one particular probe, which receives a strong signal, while the other two probes receive weak signals. Similarly, in the “101” case, two probes receive strong signals while the central probe receives a weak signal. The “000” case sets a minimum or zero target power level for all three probes, creating a “quiet zone” for the three receivers. As shown by the black bars in Figs. 3(a)–3(c), we input the target normalized intensities {0, 0, 0.55}, {0.55, 0, 0.55}, and {0, 0, 0} to the trained DNN, corresponding to the three special cases. The IDE then outputs the reflection phases {φi} (after the quantization network) for the metasurface to meet the targets. These reflection phases are then fed to the FSE, generating the predicted intensities (orange bars), which agree well with the target values. To experimentally validate the network predictions, we apply the reflection phases obtained from the IDE to the metasurface and measure the intensities at the three probes. As can be seen in the figure, the experimentally measured results (blue bars) match well with the targets and network predictions, showing that our DNN-assisted metasurface can manipulate the scattered fields on demand to realize the three special cases. In particular, these results show that the programmable metasurface can deliver or suppress the signal received at different locations, pointing to applications of these metasurfaces as RISs, e.g., a room decorated with such metasurfaces that selectively delivers signals to different locations.20 We note that the experimental conditions remain the same throughout the training and testing process.
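The three special cases amount to a simple mapping from a binary label to a target intensity vector, which can be written as a short sketch. The helper name `targets_from_label` is hypothetical; the strong-signal level of 0.55 and the weak-signal level of 0 are the values used in the text.

```python
HIGH_LEVEL = 0.55  # target normalized intensity for a "1" probe, from the text
LOW_LEVEL = 0.0    # target normalized intensity for a "0" probe, from the text


def targets_from_label(label):
    """Turn a label such as '101' into target intensities for probes 1 to 3."""
    if len(label) != 3 or any(bit not in "01" for bit in label):
        raise ValueError("label must be three characters, each '0' or '1'")
    return [HIGH_LEVEL if bit == "1" else LOW_LEVEL for bit in label]


special_cases = {label: targets_from_label(label) for label in ("001", "101", "000")}
```

Each of these target vectors lies inside the 0 to 0.6 range assumed for the targets, so the special cases are ordinary inputs to the IDE rather than out-of-range requests.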
Performance of DNN-assisted power allocation without obstacle. (a)–(c) Three special cases of “001,” “101,” and “000” for the three probes. The black bars, orange bars, and blue bars represent the target intensities {Ij}, the predicted intensities from the DNN, and the measured intensities {Imj} from the experiment, respectively. (d)–(f) General cases for three probes with 3000 sets of test data. The predicted intensities (orange points) and the measured intensities {Imj} (blue points) are both plotted against the target intensities {Ij}. The black dashed line is plotted for reference.
To evaluate the overall performance, we test whether our system can control the allocated power to arbitrary target values within the reasonable range, as shown in Figs. 3(d)–3(f). We input the remaining 3000 testing sets of {I1, I2, I3} as the target intensities to the trained DNN and obtain 3000 sets of {φi} and predicted intensities. In Figs. 3(d)–3(f), the horizontal axes and right-hand vertical axes denote the target intensities {I1, I2, I3} and the predicted intensities, respectively. We observe that the 3000 orange data points are distributed linearly around the dashed reference lines (predicted intensity equal to target intensity), showing that the DNN has been well trained to predict the intensities of the three probes according to the input targets. The mean squared errors (MSEs) between {I1, I2, I3} and the predicted intensities are 0.61 × 10⁻³, 0.52 × 10⁻³, and 0.59 × 10⁻³ for the three probes, respectively. Next, we evaluate the performance in an actual experimental test. The 3000 sets of {φi} are implemented on the metasurface, and the corresponding measured intensities are plotted against the target intensities. As expected, the measured results denoted by blue points are distributed linearly around the dashed reference lines Imj = Ij, indicating that the system can control the allocated power at the three probes to the target values. The MSEs for the measured results are 2.4 × 10⁻³, 2.2 × 10⁻³, and 3.1 × 10⁻³ for the three probes, respectively. The errors in the predicted and measured results may come from limited training samples, phase quantization errors, and noisy data acquisition.
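The per-probe MSE used as the figure of merit above can be computed as in the sketch below. The two-sample dataset is made up for illustration and does not come from the experiment; the same function applies whether the second argument holds predicted or measured intensities.

```python
def per_probe_mse(targets, results):
    """Mean squared error for each probe, averaged over all test sets.

    targets and results are lists of equal length; each entry holds the
    intensities for the three probes of one test set.
    """
    n_sets = len(targets)
    n_probes = len(targets[0])
    return [
        sum((t[j] - r[j]) ** 2 for t, r in zip(targets, results)) / n_sets
        for j in range(n_probes)
    ]


# Made-up example with two test sets (not experimental data).
example_targets = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.0]]
example_measured = [[0.1, 0.25, 0.3], [0.5, 0.5, 0.1]]
example_mse = per_probe_mse(example_targets, example_measured)
```

Since the intensities are normalized to at most 1, an MSE of a few times 10⁻³ corresponds to a root-mean-square deviation of roughly 0.05 in normalized units.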
B. DNN training and testing with an obstacle
However, once the ambient conditions change (for example, when an obstacle appears), a DNN pre-trained for one specific scenario may fail to work without further updates. Here, we collect new experimental data on-site and then update the DNN to adapt to the changes in the environment. To demonstrate the adaptivity of this system, a metal frame obstacle is added between the metasurface and the three probes, as shown in Fig. 1. We input the same 3000 testing sets of {Ij} to the previous DNN (trained without the obstacle) and obtain the metasurface phase profiles {φi}, but now with the obstacle present in the system. By implementing the 3000 sets of {φi} on the metasurface, we measure the corresponding intensities and plot them against the target intensities, as shown in Figs. 4(a)–4(c). In Figs. 4(a) and 4(b), the measured data points fall below the dashed reference lines Imj = Ij, which means that the signals transmitted to these two probes are blocked or scattered away by the added obstacle. In Fig. 4(c), the measured results show a poor linear correlation with the target values because of the obstacle. The MSEs are 7.5 × 10⁻³, 22 × 10⁻³, and 4.6 × 10⁻³ for Figs. 4(a)–4(c), respectively, larger than those for the case without the obstacle in Figs. 3(d)–3(f). Therefore, the original DNN trained without the obstacle performs poorly under the changed ambient conditions.
Performance of DNN-assisted adaptive power allocation with an obstacle. (a)–(c) The measured intensities {Imj} against target intensities for the three probes, using the previous DNN trained without an obstacle. (d)–(f) The measured intensities {Imj} with target intensities using the on-site updated DNN trained with the obstacle.
To adapt to the new ambient condition, we collect new experimental data (10 s for 10 000 sets) and update the DNN (10.9 min for DNN training from scratch, which can be reduced to 2.4 min by reusing the weights from the previous training process; see details in Sec. IV of the supplementary material). With the updated DNN, we input the same 3000 sets of testing targets {Ij} and measure the intensities {Imj}. As shown in Figs. 4(d)–4(f), the measured data points again follow a linear distribution around the dashed reference lines, with lower MSEs of 2.6 × 10⁻³, 1.4 × 10⁻³, and 3.8 × 10⁻³. Several results with special target power combinations are also provided in Sec. V of the supplementary material. These results show that the updated DNN can overcome the blocking effect of the obstacle and adapt to changes in the environment.
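The reported numbers can be condensed into improvement factors with a few lines of arithmetic. The sketch below uses only values quoted in the text (the MSEs with the obstacle before and after the on-site update, and the two training times); the variable names are illustrative.

```python
# MSEs with the obstacle present, per probe, as reported in the text.
MSE_BEFORE_UPDATE = [7.5e-3, 22e-3, 4.6e-3]   # DNN trained without obstacle
MSE_AFTER_UPDATE = [2.6e-3, 1.4e-3, 3.8e-3]   # DNN updated on-site

# Training times in minutes, as reported in the text.
TRAIN_FROM_SCRATCH_MIN = 10.9
TRAIN_WITH_REUSED_WEIGHTS_MIN = 2.4

# Factor by which the update reduces the MSE at each probe.
mse_reduction = [before / after
                 for before, after in zip(MSE_BEFORE_UPDATE, MSE_AFTER_UPDATE)]

# Speedup from reusing the previously trained weights.
training_speedup = TRAIN_FROM_SCRATCH_MIN / TRAIN_WITH_REUSED_WEIGHTS_MIN
```

The update reduces the MSE by factors of roughly 2.9, 15.7, and 1.2 for the three probes, with the largest gain at the second probe, and reusing the weights shortens training by a factor of about 4.5.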
V. DISCUSSION
In this work, we use three probes at specific locations to collect the experimental training data for the DNN construction. Our scheme also allows for the control of scattered fields at other locations by adding more probes, depending on the number of target users. In addition, we discuss how the DNN training time scales with the number of neurons when considering a larger metasurface with more columns N of unit cells with phase profiles {φi} (i = 1, 2, …, N) (see Sec. VI of the supplementary material). Furthermore, the training time of the DNN can be significantly reduced by employing transfer learning (TL),44,45 in which the weights and biases learned from the previous training process are reused and fine-tuned in the subsequent training phase, instead of starting the training process from scratch. We have demonstrated in Sec. IV of the supplementary material that TL reduces the DNN training time from 10.9 min to 2.4 min. Although outside the scope of the current work, real-time reinforcement learning58 could also be adopted, in which an additional agent learns the optimal policy iteratively while interacting with the dynamic environment.
VI. CONCLUSION
The power allocation problem in complex environments with blocking obstacles is effectively addressed using the experiment-based DNN approach with a programmable metasurface. Without prior physical knowledge of the complex system, we directly train the DNN using experimental data, circumventing the need for complex modeling and time-consuming simulations. Our experimental results demonstrate that the experiment-based DNN can effectively control the power distribution transmitted toward multiple receivers and can be updated with data collected on-site to adapt to changes in the environment. Our work shows the potential of leveraging real-world data for more accurate and efficient metasurface designs, with a view to intelligent wireless communication of Wi-Fi and 5G signals in complex indoor environments, and the approach can also be applied to imaging, sensing, and quiet-zone control. More generally, training DNNs with experimental data is also directly applicable to inverse problems, such as inverse scattering and imaging, which can be sensitive to on-site information about practical environments.
SUPPLEMENTARY MATERIAL
See the supplementary material for details on DNN architecture, data acquisition, DNN training process, training time cost, and alternative AI approaches.
ACKNOWLEDGMENTS
This work was supported by the Hong Kong Research Grants Council (Grant Nos. R6015-18 and C6012-20G), the Croucher Foundation (Grant No. CF23SC01), and the 30 for 30 Research Initiative Scheme from HKUST. The work is also partly supported by a donation from Chow Sang Sang Jewelry Company Limited under Grant No. 9229058.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Jingxin Zhang and Jiawei Xi contributed equally to this work.
Jingxin Zhang: Conceptualization (equal); Data curation (equal); Methodology (equal); Validation (equal); Writing – original draft (equal). Jiawei Xi: Data curation (equal); Formal analysis (equal); Methodology (equal); Writing – original draft (equal). Peixing Li: Data curation (equal); Software (equal). Ray C. C. Cheung: Writing – review & editing (equal). Alex M. H. Wong: Funding acquisition (equal); Writing – review & editing (equal). Jensen Li: Conceptualization (equal); Funding acquisition (equal); Project administration (lead); Supervision (lead); Writing – review & editing (equal).
DATA AVAILABILITY
The data and code that support the findings of this study are openly available at https://doi.org/10.5281/zenodo.10239876 (Ref. 65).