The 2024 Nobel Prizes in Physics and Chemistry were awarded for foundational discoveries and inventions enabling machine learning through artificial neural networks. Artificial intelligence (AI) and artificial metamaterials are two cutting-edge technologies that have shown significant advancements and applications in various fields. AI, with its roots tracing back to Alan Turing’s seminal work, has undergone remarkable evolution over decades, with key advancements including the Turing Test, expert systems, deep learning, and the emergence of multimodal AI models. Electromagnetic wave control, critical for scientific research and industrial applications, has been significantly broadened by artificial metamaterials. This review explores the synergistic integration of AI and artificial metamaterials, emphasizing how AI accelerates the design and expands the functionality of artificial metamaterials, while novel physical neural networks constructed from artificial metamaterials significantly enhance AI’s computational speed and its ability to solve complex physical problems. This paper provides a detailed discussion of the principles and applications of AI-based forward prediction and inverse design in metamaterial design. It also examines the potential of big-data-driven AI methods in addressing challenges in metamaterial design. In addition, this review delves into the role of artificial metamaterials in advancing AI, focusing on the progress of electromagnetic physical neural networks in the optical, terahertz, and microwave regimes. Emphasizing the transformative impact of the intersection between AI and artificial metamaterials, this review underscores significant improvements in efficiency, accuracy, and applicability. The collaborative development of AI and artificial metamaterials accelerates the metamaterial design process and opens new possibilities for innovations in photonics, communications, radar, and sensing.
I. INTRODUCTION
Electromagnetic wave modulation is crucial for scientific research and industrial applications, including information processing, telecommunications, military uses, and imaging systems. Traditional devices such as lenses and deflectors control electromagnetic waves but require large sizes and complex shapes due to the limited range of available dielectric constants.1–11 In recent years, artificial electromagnetic metamaterials, which achieve artificially tunable electromagnetic responses, have addressed the challenging issues faced by traditional devices.12–21 The modulation of electromagnetic waves by metamaterials is accomplished by arranging meta-atoms in a predefined order.22–30 This principle relies on abrupt phase changes imparted on transmitted or reflected waves. Since Federico Capasso and colleagues introduced the generalized Snell’s law, research on artificial metamaterials and metasurfaces has rapidly expanded, driving the development of a wide range of advanced functional metamaterial devices.31–34
Over the past two decades, researchers have explored increasingly complex structures, integrated unconventional materials, and investigated unique scattering characteristics.35–38 However, the design of these materials presents significant challenges, particularly due to the complex interactions between structure and electromagnetic properties at subwavelength scales.39 Traditionally, finite element methods (FEMs) or finite-difference time-domain (FDTD) methods are used to predict electromagnetic properties.40 Metamaterial units are simulated under periodic boundary conditions and are then combined to form large-area systems. This process is time-consuming and often fails to achieve ideal electromagnetic responses due to mutual coupling errors between atoms. Advanced designs lack simple functional relationships or qualitative heuristics, which remains a major obstacle to further development. Since the 2020s, the number of published papers on AI and artificial metamaterials has surged, reflecting growing research interest driven by technological advances and application demands. As technology continues to develop, the combination of AI and artificial metamaterials is expected to have even broader prospects for development.
Figure 1 shows the integration and application of AI and artificial metamaterials using a central yin-yang pattern and three concentric rings, symbolizing their complementary relationship. The hardware, represented by artificial metamaterials, intersects with the software, represented by AI. The innermost ring represents the theoretical foundation, including the principles of electromagnetism, algorithmic mathematical models, and quantum physical properties of matter, forming the core of the field. The middle ring showcases key technologies from the integration of AI and artificial metamaterials, crucial for translating theory into practice. The outermost ring lists specific application scenarios, highlighting the broad practical applications of these integrated technologies. This structured representation clearly shows how AI and artificial metamaterials interact and enhance each other at theoretical, technological, and application levels. It helps us understand the immense potential of this interdisciplinary field. As technologies develop and mature, the integration of AI and artificial metamaterials is expected to lead to technological innovation, provide effective solutions to global challenges, and drive new peaks in scientific research.
Diagram of the integration of artificial intelligence and artificial metamaterials.
The introduction of AI has significantly advanced the design of artificial electromagnetic metamaterials.41–53 In the early 1990s, researchers used Hopfield neural networks for microwave impedance matching, highlighting the potential of neural networks in solving electromagnetic problems.54 Over time, deep fully connected networks began simulating more complex microwave circuit elements, such as heterojunction bipolar transistor amplifiers,55 coplanar waveguide components,56 and lumped 3D integrated components.57 In the past decade, deep neural networks have been applied to microwave technology, including frequency-selective surfaces,58 metamaterials,59 and filters.60 These technologies can achieve complex functions and have potential applications in photonic systems.61 Since 2010, deep learning (DL) has been used to predict the bandgap properties of photonic crystals,62 the dispersion characteristics of photonic crystal fibers,63 and the propagation characteristics of plasmonic transmission lines.64 Recent studies have focused on devices with more geometric degrees of freedom, such as various types of photonic crystal fibers,65 3D photonic crystals,66 photonic crystal cavities,67 plasmonic waveguide filters,68 in-plane mode couplers and splitters,69,70 Bragg gratings,71 and free-space grating couplers.72 In recent years, AI has been applied to a variety of nanostructures and photonic materials,73 such as chiral nanostructures,74,75 planar scatterers,76 absorbers,77 lattices for customized coloring,78 smart windows based on phase-change materials,79 Fano resonance nanoslit arrays,80 dielectric gratings,81 dielectric metasurfaces,82 graphene-based metamaterials,83 and scatterers for color design.84 These models have been widely applied in fields such as color filters85 and topological insulators.86–89
Artificial metamaterials have the potential to significantly enhance AI performance by offering novel approaches to improving neural network functionality and optimizing signal processing in machine learning algorithms.90–95 For example, computationally intensive tasks in neural networks, such as matrix multiplication and convolution operations, can be hardware-accelerated using artificial metamaterials technology. Researchers have developed new types of electromagnetic physical neural networks that leverage diffraction effects by adjusting the amplitude and phase of artificial metamaterials. These innovations enable the manipulation of electromagnetic waves in a way that significantly accelerates computation processes. By tuning the properties of metamaterials, it is possible to achieve parallel computation at the speed of light, providing a substantial improvement over traditional digital computing systems. This approach not only enhances the processing speed but also offers a new paradigm for efficient data handling and computation in neural networks. Furthermore, the use of artificial metamaterials facilitates a high degree of control over electromagnetic interactions, allowing for the optimization of signal transmission and minimizing energy loss. As a result, this development holds the promise of advancing AI applications, particularly in areas such as real-time data processing and large-scale machine learning.
This paper reviews the integration of AI in metamaterial design. Section II introduces the fundamentals of AI, focusing on machine learning (ML) and deep learning techniques. Section III discusses the application of AI in the design of artificial metamaterials, covering both forward and inverse design approaches, as well as the future potential of big data in advancing these designs. Section IV focuses on physical neural networks, where metamaterials are employed to accelerate AI computations, thereby improving both processing speed and security. Finally, Sec. V summarizes the key findings and highlights potential future directions in AI-driven metamaterial design.
II. BASIC THEORY OF AI
Figure 2 shows the hierarchical relationship between artificial intelligence, machine learning, and deep learning. AI represents the broadest category, encompassing all forms of intelligent systems, including those that do not rely on data-driven learning methods. It involves the development of algorithms and systems capable of performing tasks that typically require human intelligence, such as problem-solving, perception, and decision-making. Machine learning, a subset of AI, specifically focuses on the use of algorithms that enable systems to learn from data, thus improving their performance without being explicitly programmed. Deep learning is a more specialized subfield of machine learning that uses neural networks with multiple layers to model complex patterns in large datasets. These networks are capable of automatically extracting features from raw data, thereby enabling more accurate predictions and classifications in applications such as image and speech recognition. As the field progresses, the distinctions between AI, ML, and DL are becoming increasingly blurred, resulting in the development of more sophisticated systems with enhanced capabilities.
Diagram of the relationship between artificial intelligence, machine learning, and deep learning.
The origins of artificial intelligence can be traced back to the seminal work of Alan Turing, a pioneer in computer science and cryptography. In his 1950 paper “Computing Machinery and Intelligence,” Turing proposed the innovative Turing Test, suggesting that a machine could be considered to possess human-like intelligence if human interrogators, questioning it through keyboard-mediated conversations, failed to distinguish it from a person at least 30% of the time.96 The formal introduction of the term “artificial intelligence” occurred during a two-month conference at Dartmouth College in 1956, organized by John McCarthy and colleagues. This event marked the inception of AI research, which soon saw breakthroughs in machine theorem proving and checkers programs. However, the period from the 1960s to the early 1970s was characterized by excessive expectations and successive failures, leading to stagnation in AI development. The advent of expert systems from the early 1970s to the mid-1980s marked a pivotal transition from theory to practice. Planning, search algorithms, and evolutionary algorithms also fall under the domain of artificial intelligence, contributing to its diverse and growing range of applications. In 1974, Paul Werbos of Harvard University introduced the backpropagation algorithm for training neural networks, which initially garnered little attention.97 The 1980s and 1990s witnessed an expansion in the application scope of AI despite the emerging limitations of expert systems.
The twenty-first century has witnessed significant advancements in network technology, which have accelerated innovations in AI. In 2006, Geoffrey Hinton and colleagues introduced deep learning, heralding a new era for AI.98 Technological progress in big data, cloud computing, and advanced graphical processing units facilitated breakthroughs in deep neural networks, impacting fields such as image classification and autonomous driving. In recent years, AI has continued to yield groundbreaking results. In 2014, Ian Goodfellow and colleagues proposed generative adversarial networks (GANs).99 In 2016, AlphaGo’s victory over a world champion Go player demonstrated AI’s potential in strategic games.100 The success of AlphaFold 2 in predicting protein structures in 2021 highlighted AI’s capabilities in biological research.101 In 2022, OpenAI released GPT-3.5, an advanced model for generating natural language text, and in December 2023, Google introduced the Gemini model, a multimodal AI capable of recognizing and processing text, images, audio, video, and code.102 These diverse models hold profound significance and promise extensive application prospects.103,104
A. Machine learning
Machine learning, a fundamental subset of artificial intelligence, involves the creation of algorithms that allow systems to analyze data, identify patterns, and improve their performance iteratively. Rather than requiring developers to manually write detailed instructions or rules for each specific task, machine learning allows a system to infer them by analyzing data and learning its patterns.105–107 Techniques such as linear regression, decision trees, support vector machines (SVM), and K-nearest neighbors (KNN) are key methods within machine learning, each widely used for various predictive and classification tasks. In a way analogous to human decision-making, where experience accumulates to improve judgment, ML systems improve their functionality by leveraging data. A widely accepted definition by Tom M. Mitchell succinctly captures the essence of ML: “a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks T, as measured by P, improves with experience E.”108 In this context, “experience” refers to the data processed by the system, while “performance” is evaluated based on predefined metrics. This framework emphasizes the iterative nature of learning and its continuous refinement based on data-driven experiences.
ML methods can be broadly categorized into three types: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves training models on labeled data, where both the input and the corresponding output are known. This category is further divided into classification (predicting discrete labels, such as identifying images of cats and dogs) and regression (predicting continuous values, such as estimating house prices). Unsupervised learning, by contrast, deals with unlabeled data, aiming to discover hidden patterns or structures within the data, such as clustering similar items or reducing dimensionality. Reinforcement learning focuses on decision-making through trial and error, where the model learns optimal actions by receiving feedback in the form of rewards or penalties, making it especially useful in areas such as robotics and game strategies.109–111
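The supervised-learning category above can be made concrete with a minimal sketch. The pure-Python K-nearest-neighbors classifier below is an illustrative toy of our own, not drawn from any referenced work: two 2D point clusters stand in for labeled feature vectors of two classes (the "cat"/"dog" example in the text).

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k nearest labeled points.

    `train` is a list of ((x, y), label) pairs; distance is squared Euclidean.
    """
    dists = sorted(
        ((x - query[0]) ** 2 + (y - query[1]) ** 2, label)
        for (x, y), label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy labeled data: two clusters standing in for "cat"/"dog" image features.
train = [((0.0, 0.0), "cat"), ((0.1, 0.2), "cat"), ((0.2, 0.1), "cat"),
         ((1.0, 1.0), "dog"), ((0.9, 1.1), "dog"), ((1.1, 0.9), "dog")]

print(knn_classify(train, (0.15, 0.1)))  # near the first cluster -> "cat"
```

Because the model simply memorizes the labeled examples and votes, it needs no explicit rules — the "instructions" are inferred from the data, exactly the contrast drawn above.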
A central concept in ML is optimization, a mathematical process employed to minimize or maximize a function by adjusting the model’s parameters.112–114 This function is commonly referred to as the objective function, cost function, or loss function, depending on the specific context. A common optimization technique is gradient descent, which iteratively updates model parameters by moving in the direction that reduces the error. The ultimate goal is to find the global minimum of the loss function, which corresponds to the optimal model performance. However, given the complexity of real-world problems, models may become trapped in local minima rather than reaching the global minimum, a common challenge in ML.
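Gradient descent can be sketched in a few lines. The setup below — a linear model fitted to noiseless synthetic data with a mean-squared-error loss — is a toy of our own, not from the reviewed works; it shows the parameters being nudged against the gradient until the loss reaches its minimum.

```python
import numpy as np

# Synthetic data drawn from y = 2x + 1 (noise omitted for clarity).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 1.0

w, b = 0.0, 0.0          # model parameters, initialized at zero
lr = 0.1                 # learning rate (step size)

for _ in range(500):
    y_hat = w * x + b
    err = y_hat - y
    # Gradients of the mean-squared-error loss with respect to w and b.
    grad_w = 2 * np.mean(err * x)
    grad_b = 2 * np.mean(err)
    # Move each parameter against its gradient to reduce the loss.
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))  # converges toward the true values 2 and 1
```

For this convex quadratic loss there is a single global minimum, so gradient descent converges reliably; the local-minimum trap mentioned above arises only for non-convex losses such as those of deep networks.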
B. Deep learning
Deep learning is a subset of machine learning that has gained significant attention due to its ability to handle large-scale, high-dimensional data. It is based on artificial neural networks, which are inspired by the structure and function of neurons in the human brain. A neural network is composed of layers of interconnected artificial neurons, which process input data and learn complex patterns through a hierarchical structure. This architecture allows deep learning models to excel in tasks, such as image recognition, speech processing, and natural language understanding, where traditional machine learning models may struggle.
Deep neural networks consist of multiple layers, each containing several neurons. As the data pass through these layers, the network extracts increasingly abstract features, enabling it to perform complex tasks such as classification and regression. Through continuous training and fine-tuning, deep learning models have achieved state-of-the-art performance in fields ranging from computer vision to natural language processing, and they continue to push the boundaries of what AI systems can accomplish.
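The layer-by-layer feature extraction described above can be illustrated with a two-layer network evaluated by hand. The weights below are hand-picked (a standard textbook construction, not taken from the reviewed works) so that a hidden ReLU layer extracts intermediate features from which the output layer reproduces XOR — a function no single linear layer can compute.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

# Hand-picked weights: the hidden layer extracts two intermediate features
# (the sum of the inputs, and an "both inputs on" indicator); the output
# layer combines them linearly.
W1 = np.array([[1.0, 1.0],    # column 1: h1 = relu(x1 + x2)
               [1.0, 1.0]])   # column 2: h2 = relu(x1 + x2 - 1)
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])    # output = h1 - 2*h2

def forward(x):
    h = relu(x @ W1 + b1)     # first layer: raw inputs -> features
    return float(h @ W2)      # second layer: features -> prediction

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, forward(np.array(x, dtype=float)))  # reproduces XOR: 0, 1, 1, 0
```

In trained networks these weights are of course learned rather than hand-set, but the principle is the same: each layer maps the previous layer's representation to a more task-relevant one.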
III. AI-BASED METAMATERIAL DESIGN
A. Artificial intelligence for forward artificial metamaterials design
Figure 3 shows the four commonly applied AI methods in the design of artificial metasurfaces: multilayer perceptron (MLP), convolutional neural network (CNN), variational autoencoder (VAE), and generative adversarial networks (GAN). MLP and CNN are typically used for forward prediction of artificial metasurface performance, while VAE and GAN can directly predict the structural parameters of metasurfaces from performance data. These AI methods provide powerful tools for efficient bidirectional design transformations, from material property predictions to device functionality design. They have significantly accelerated and optimized the development of metasurfaces, enabling the realization of many complex functional capabilities.117–122 In metamaterial design, neural networks can predict the electromagnetic properties of metamaterials more efficiently than traditional finite element analysis methods. Once an accurate forward prediction network is obtained, the optimization of the metamaterial structure can be achieved through iterative processes, such as global search and heuristic optimization algorithms.
Illustrations of four basic artificial intelligence network architectures: MLP, CNN, GAN, and VAE.
1. Multilayer perceptron
In 2018, Peurifoy et al. proposed using MLPs to predict the light scattering of multilayer nanoparticles, completing complex optical process simulations in an extremely short time [Fig. 4(a)].73 Neural networks can perform calculations several orders of magnitude faster than traditional finite element methods. In addition, this work leveraged the gradients of neural networks, using neural adjoint methods for the rapid inverse design of nanoparticles. In 2021, Ren et al. combined MLPs with genetic algorithms to optimize the geometric structure of photonic devices [Fig. 4(b)].123 This approach exhibited excellent performance in multi-objective and multi-constraint design problems. By integrating genetic algorithms, the required training data for the neural network were reduced, resulting in photonic devices that met manufacturing specifications, had very low insertion losses, and operated over a broad wavelength range. This method significantly improved design speed and flexibility.
Application of multilayer perceptron in the design of artificial metamaterials. (a) Prediction of multilayer nanoparticle light scattering. Reprinted with permission from Peurifoy et al., Sci. Adv. 4, eaar4206 (2018). Copyright 2018 AAAS. (b) Optimizing power dividers based on a multilayer perceptron. Reprinted with permission from Ren et al., Photonics Res. 9, B247 (2021). Copyright 2021 OSA Publishing.
2. Convolutional neural network
In 2019, Sajedian et al. combined CNNs with recurrent neural networks (RNNs) to directly predict the absorption spectrum from images of plasmonic structures [Fig. 5(a)].77 This approach, which used CNNs to extract spatial features and RNNs to analyze spectral data, outperformed traditional numerical simulation methods in accuracy and speed, significantly enhancing computational efficiency. In 2021, Zhu et al. used a pre-trained GoogLeNet-Inception-V3 network, a CNN model based on large-scale image data, to explore the application in metamaterial design [Fig. 5(b)].134 By transferring models from the image domain to the electromagnetic domain, they achieved rapid design of complex functional metasurfaces. Fine-tuning the network adapted it to new metasurface design tasks, accelerating the design process and validating its accuracy through experiments and simulations.
CNN models for advanced metamaterial design. (a) Structure of the deep-learning model for designing chiral metamaterials using CNNs and RNNs to predict absorption spectra from images. Reprinted with permission from Sajedian et al., Microsyst. Nanoeng. 5, 27 (2019). Copyright 2019 Springer Nature. (b) Framework of the deep-learning model for designing chiral metamaterials using transfer learning to adapt CNN models for rapid and accurate metasurface design. Reprinted with permission from Zhu et al., Nat. Commun. 12, 2974 (2021). Copyright 2021 Springer Nature.
3. Forward network combined with an iterative optimization method
Perceptive network structures demonstrate strong performance in forward prediction. However, they face challenges in direct inverse design: such problems are ill-posed, admitting numerous possible solutions without a unique or stable outcome, so a network trained to map responses directly to designs oscillates among candidates and yields imprecise, unstable results. To address this issue, the integration of perceptive networks with iterative optimization methods has proven effective. In this approach, perceptive networks predict system performance based on design parameters, while iterative optimization algorithms refine these parameters to identify the optimal solution. This combination enhances the accuracy and efficiency of inverse design. To ensure practical feasibility, penalty losses can be introduced during the optimization process, preventing the optimization from ignoring physical constraints and maintaining a balanced, realistic design solution.
As shown in Fig. 6, Wu et al. proposed a method in 2022 that combines an MLP with a genetic algorithm (GA) for inverse design, achieving rapid phase-to-mode mapping.135 MLPs classify and predict the initial population, enabling the GA to search the design space more accurately. This approach enhances the overall efficiency and quality of the design process. By accurately controlling phase values during optimization, this method reduces the challenges of extensive parameter spaces and avoids the need for numerical simulations across the entire phase range, significantly improving speed and accuracy. The experimental results align well with simulations, demonstrating the method’s effectiveness and superior performance. The combination of MLPs and GAs offers a new pathway for rapid and complex metasurface design, showcasing tremendous potential and practical value. Forward networks combined with iterative optimization methods thus provide a powerful framework for addressing the challenges of inverse design, offering enhanced speed, accuracy, and practical feasibility in complex metasurface design tasks.
Integration of multilayer perceptron and genetic algorithms for inverse design. Reprinted with permission from Wu et al., Opt. Express 30, 45612 (2022). Copyright 2022 OSA Publishing.
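The forward-network-plus-GA workflow can be sketched as follows. Here a toy analytic `surrogate` function of our own stands in for the trained forward network (the actual MLP, design encoding, and figure of merit in the cited work differ); the GA loop itself — ranking by predicted merit, selection, crossover, and mutation — is the generic pattern.

```python
import random
random.seed(1)

# Stand-in for a trained forward network: maps a design-parameter vector to
# a scalar figure of merit (here, negative squared distance to a target
# design; in practice, the MLP's fast prediction of the EM response).
def surrogate(params):
    target = [0.3, 0.7, 0.5]
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def evolve(pop_size=40, dims=3, generations=60, mut=0.1):
    pop = [[random.random() for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=surrogate, reverse=True)      # rank by predicted merit
        parents = pop[: pop_size // 2]             # selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(dims)           # single-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(dims)             # mutate one coordinate
            child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, mut)))
            children.append(child)
        pop = parents + children
    return max(pop, key=surrogate)

best = evolve()
print([round(p, 2) for p in best])  # close to the target design
```

Because every fitness evaluation is a surrogate call rather than a full-wave simulation, thousands of candidate designs can be screened in milliseconds — the source of the speedup described above.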
B. Artificial intelligence for artificial metamaterials inverse design
Researchers using perceptive networks combined with iterative optimization can achieve optimal performance in the inverse design of metamaterials. However, this method may not be highly efficient when speed and broad design requirements are critical. In such cases, direct inverse design methods using deep learning become crucial. These methods generate solutions directly through a single neural network invocation, greatly accelerating the design process. While generative networks can quickly produce solutions, they may lack the detailed adjustments provided by iterative optimization, potentially trading off some performance. Despite this, generative networks offer significant convenience for rapid design generation or preliminary prototyping. We will explore these generative network inverse design approaches, evaluating their efficiency and accuracy.136–138
1. Tandem neural network
The simplest configuration of generative networks is the tandem network. A tandem network is divided into a decoder and an encoder. In the first step, the decoder is trained, using design parameters as inputs to predict their physical properties. In the second step, the weights of the decoder network are fixed, and the encoder for generating designs is added to the model. The required physical properties (e.g., reflectance spectrum) are input into the encoder, which then predicts the design. The generated design is fed into the trained decoder network from the first step to predict the physical properties. The predicted response is then compared to the input response, and the error between the two is minimized as the training loss. Through the backpropagation algorithm, the network undergoes several gradient updates, and the required structural parameters are obtained.139,140
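A minimal numerical sketch of the two-step tandem scheme, with linear maps standing in for both networks (our simplification — real tandem networks are deep and nonlinear): the "decoder" from step one is frozen, and only the encoder's weights receive gradient updates while the error between input and reconstructed spectra is minimized.

```python
import numpy as np
rng = np.random.default_rng(0)

# Step 1 stand-in: a frozen linear "decoder" mapping 2 design parameters to
# a 3-point spectrum.  In the tandem scheme this would be the pretrained
# forward network; a linear map keeps the sketch short.
Wd = np.array([[1.0, 0.0,  1.0],
               [0.0, 1.0, -1.0]])          # (design dim 2) x (spectrum dim 3)

# Training spectra are the responses of random designs, so an inverse exists.
S = rng.normal(size=(200, 2)) @ Wd

# Step 2: train the encoder We (spectrum -> design) while Wd stays FIXED.
We = np.zeros((3, 2))
lr = 0.05
for _ in range(500):
    S_hat = S @ We @ Wd                    # spectrum -> design -> spectrum
    err = S_hat - S
    grad = 2 * S.T @ err @ Wd.T / len(S)   # gradient w.r.t. We only;
    We -= lr * grad                        # it flows THROUGH the frozen Wd

mse = float(np.mean((S @ We @ Wd - S) ** 2))
print(f"{mse:.2e}")                        # reconstruction error near zero
```

Comparing reconstructed to input spectra, rather than predicted to "true" designs, is exactly what lets the tandem scheme sidestep the one-to-many ambiguity of ill-posed inverse problems.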
Figure 7 shows the applications of tandem neural networks in the inverse design of artificial metamaterials. In 2019, An et al. used tandem neural networks to optimize metasurface design: design objectives are fed directly into the encoder, which generates the required nanostructure parameters, while a pretrained decoder predicts the corresponding response [Fig. 7(a)].82 This work improved accuracy by predicting the real and imaginary parts of the response curve, highlighting the potential of tandem neural networks in enhancing design iteration speed and reducing computational complexity, thereby laying the groundwork for subsequent research. In 2023, Wen et al. further advanced the practicality of tandem neural networks in dynamic reconfigurable surface design at microwave frequencies.141 They used a data-driven learning model to adjust a microwave reflective surface in real time to adapt to different environments and requirements [Fig. 7(b)]. This study not only validated the efficiency of tandem neural networks in practical applications but also demonstrated their capability to handle environmental changes and achieve rapid responses. In 2024, Peng et al. expanded the application of deep learning in metamaterial design to the generation of multifunctional vortex beams [Fig. 7(c)].142 This work underscored the efficiency of tandem networks in handling multi-objective and highly customized designs, showcasing their broad applicability in achieving complex optical functionalities.
Applications of tandem neural networks in inverse design of artificial metamaterials. (a) Metasurface design optimization using tandem neural networks. Reprinted with permission from An et al., ACS Photonics 6, 3196–3207 (2019). Copyright 2019 American Chemical Society. (b) Real-time adaptive surface design at microwave frequencies. Reprinted with permission from Wen et al., Nat. Commun. 14, 7736 (2023). Copyright 2023 Springer Nature. (c) Multifunctional vortex beam generation with tandem neural networks. Reprinted with permission from Peng et al., Adv. Opt. Mater. 12, 2300158 (2024). Copyright 2024 John Wiley and Sons Inc.
The main advantage of the tandem approach is that it simplifies the inverse design problem by breaking it into two steps: learning the forward problem and using that knowledge to train the generator network. This makes the training process easier. However, a tandem network can find only one solution for a given design goal, even if multiple solutions exist. For scenarios requiring multiple solutions, VAEs and GANs are better suited. They use latent spaces to make design descriptions compact and continuous, allowing them to learn multiple solutions. VAEs and GANs can generate valid designs for every point in their latent space, offering more flexibility and diverse design options.
2. Variational autoencoder
During training, VAEs associate multiple possible solutions with different latent vector values, eliminating training ambiguity. Once the VAE network converges, the decoder can be used for inverse design, generating different designs by altering the latent vectors. The advantage of VAEs lies in their robust training capability, which systematically explores the latent space to identify various potential solutions for artificial material designs.143–146
In 2023, Chen et al. introduced a generation-elimination framework leveraging deep learning to correlate metasurface spectra, addressing the challenge of spectra-to-spectra design. Their approach utilized a VAE to stochastically sample and generate diverse candidates, which were then refined by using an elimination network to identify the optimal output. This method accurately predicted inaccessible spectra without structural information, achieving high accuracy in terahertz metasurface applications and offering a new perspective for deep learning to interpret complex physical processes in metasurface design [Fig. 8(a)].147 In 2022, Yu et al. combined VAE with genetic algorithms to achieve inverse design of unit cells with high degrees of freedom.148 Their method employed a forward perception network to quickly evaluate the fitness values of each offspring in the genetic algorithm [Fig. 8(b)]. The VAE-generated models were used to constrain and compress the large design space, helping the GA escape local optima and find the global optimum. This approach enabled the inverse design of different structures with similar spectral characteristics.
Enhancing inverse design of photonic structures using VAE. (a) Spectral correlation in metasurface design with VAE. Reprinted with permission from Chen et al., Nat. Commun. 14, 4872 (2023). Copyright 2023 Springer Nature. (b) Inverse design of meta-atoms with high degrees of freedom using VAE. Reprinted with permission from Yu et al., Opt. Express 30, 35776 (2022). Copyright 2022 OSA Publishing.
Figure 9 shows the application of VAE in the inverse design of metasurface arrays to achieve holography. In 2020, Hossein Eybposh et al. proposed DeepCGH, a non-iterative VAE-based method for generating three-dimensional computational holograms.149 This technique used a VAE composed of convolutional neural networks to map target intensity patterns directly to structural parameters, greatly enhancing computational efficiency and hologram quality [Fig. 9(a)]. In 2024, Xi et al. used VAE to design polarization-multiplexed holograms [Fig. 9(b)]. Their method generated independent holograms from various polarization states, achieving the desired holograms in different co-polarization and cross-polarization channels.150 This demonstrates that VAEs are effective not only in single-unit design tasks but also in meeting diverse and complex design requirements, showcasing their potential for precise and efficient design in highly complex systems.
Applications of VAE in the inverse design of metasurface arrays for holography. (a) Rapid generation of 3D computational holograms using DeepCGH. Reprinted with permission from Hossein Eybposh et al., Opt. Express 28, 26636 (2020). Copyright 2020 OSA Publishing. (b) Design of polarization multiplexed holograms using VAE. Reprinted with permission from Xi et al., Adv. Opt. Mater. 12, 2202663 (2024). Copyright 2024 John Wiley and Sons Inc.
3. Generative adversarial networks
After exploring the applications of autoencoders in metamaterial design, we now focus on another powerful generative model: GANs. VAEs utilize an encoder–decoder architecture that emphasizes reconstruction error and latent space regularization, typically resulting in smoother, slightly blurred images. In contrast, GANs employ adversarial training between a generator and a discriminator, enabling the production of high-quality, detail-rich images. Despite their susceptibility to mode collapse, the adversarial training process drives the generator to create more realistic designs. This makes GANs particularly advantageous for metamaterial design, which often requires generating complex, highly nonlinear material properties and intricate outputs.151–158
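The adversarial dynamic described above can be summarized by the two training objectives. The sketch below (with invented, illustrative logit values rather than a real discriminator) computes the standard binary cross-entropy discriminator loss and the non-saturating generator loss: when the discriminator confidently separates real from generated designs, its own loss is small while the generator receives a large corrective signal.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(real_logits, fake_logits):
    """Binary cross-entropy: push real designs toward 1, generated toward 0."""
    return -(np.mean(np.log(sigmoid(real_logits) + 1e-12))
             + np.mean(np.log(1 - sigmoid(fake_logits) + 1e-12)))

def generator_loss(fake_logits):
    """Non-saturating form: the generator tries to make fakes look real."""
    return -np.mean(np.log(sigmoid(fake_logits) + 1e-12))

# A discriminator that already separates real from generated designs well:
real_logits = np.array([3.0, 2.5, 4.0])     # confidently judged "real"
fake_logits = np.array([-3.0, -2.0, -4.0])  # confidently judged "fake"

d_loss = discriminator_loss(real_logits, fake_logits)  # small: D is winning
g_loss = generator_loss(fake_logits)                   # large: strong push on G
```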
Figure 10 shows the application of GAN in the inverse design of artificial metamaterials. In 2018, Liu et al. introduced the use of GAN for the inverse design of metamaterials.159 The model can directly generate corresponding metamaterial structures from target spectral characteristics, achieving a direct mapping from functional requirements to physical realization. This approach accelerates the design cycle, making complex design tasks more manageable and efficient [Fig. 10(a)]. In 2019, Jiang et al. further utilized conditional GAN to design freeform metamaterial diffraction gratings, optimizing and designing complex metamaterial gratings [Fig. 10(b)].160 Their method introduced practical parameter inputs for iterative optimization of the generated designs, enhancing device performance and design robustness. In 2021, An et al. demonstrated GANs in the design of multifunctional metamaterials, generating complex nanostructures meeting specific requirements such as spectral responses and polarization control [Fig. 10(c)].161 These studies highlight the rapid development and broad application of GAN technology in metamaterial design, showcasing their ability to handle intricate demands and provide quick, precise design solutions.
Applications of GAN in the inverse design of artificial metamaterials. (a) Direct generation of metamaterial structures from target spectral characteristics using GAN. Reprinted with permission from Liu et al., Nano Lett. 18, 6570–6576 (2018). Copyright 2018 ACS Publications. (b) Design of freeform metamaterial diffraction gratings using conditional GAN. Reprinted with permission from Jiang et al., ACS Nano 13, 8872–8878 (2019). Copyright 2019 American Chemical Society. (c) Generation of multifunctional metasurface using GAN. Reprinted with permission from An et al., Adv. Opt. Mater. 9, 2001433 (2021). Copyright 2021 John Wiley and Sons Inc.
C. Artificial intelligence metasurface design driven by big data
With advancements in AI and the expansion of metamaterial databases, large-scale generative models are increasingly employed in metamaterial design. These big-data-driven AI approaches, which include transfer learning,162,163 reinforcement learning,164–166 diffusion models,167–169 and transformer models,170,171 leverage extensive datasets and computational power to provide exceptional speed and flexibility in creating tailored metamaterials. Transfer learning enables knowledge from pre-trained tasks to improve performance on related tasks, reducing training time and data requirements. Reinforcement learning optimizes designs through interactions with an environment to maximize rewards, while diffusion models generate high-quality outputs from noise. In addition, transformer models utilize self-attention mechanisms to process large-scale data effectively, contributing to more efficient and intelligent solutions for complex design tasks. By harnessing big data, these models enhance performance and facilitate the discovery of novel and cost-effective metamaterials. The following sections detail these methods and their applications.
1. Transfer learning
In artificial metamaterial design, the parameter space is vast and complex. While deep learning architectures can meet most design requirements, they demand extensive computational resources and time for simulation or testing to obtain data. In addition, new designs or applications often require training models from scratch, which is inefficient and resource-intensive. Transfer learning provides an effective solution by applying knowledge and models learned from one task to another related task. A pre-trained model, such as one used for a specific optical frequency, can be quickly adapted to a different frequency or design task with different physical properties, accelerating the design process.172–175
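A minimal numerical sketch of this warm-start effect follows, with a linear surrogate standing in for a pre-trained perception network; the tasks, weights, and sample counts are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_gd(X, y, w0, lr, steps):
    """Least-squares gradient descent from an initial weight vector."""
    w = w0.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

# Source task: abundant data (e.g. responses in one frequency band).
w_src_true = np.array([1.0, -2.0, 0.5])
Xs = rng.standard_normal((500, 3))
ys = Xs @ w_src_true
w_src = fit_gd(Xs, ys, np.zeros(3), lr=0.1, steps=500)

# Related target task: similar physics, but only six samples.
w_tgt_true = w_src_true + np.array([-0.1, 0.0, 0.1])
Xt = np.array([[1., 0, 0], [0, 1, 0], [0, 0, 1],
               [1, 1, 0], [0, 1, 1], [1, 0, 1]])
yt = Xt @ w_tgt_true

w_transfer = fit_gd(Xt, yt, w_src, lr=0.3, steps=10)       # warm start
w_scratch = fit_gd(Xt, yt, np.zeros(3), lr=0.3, steps=10)  # cold start

err = lambda w: np.linalg.norm(w - w_tgt_true)
# With the same small data and step budget, the warm-started model
# lands far closer to the target-task weights than training from scratch.
```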
Figure 11 shows the application of transfer learning in artificial material design. In 2022, Zhang et al. proposed a heterogeneous transfer learning method, utilizing feature enhancement and dimensionality reduction techniques to enable rapid modeling across metasurfaces with different parameterizations, physical sizes, and geometric shapes, achieving accurate perception models with only 10% of the data [Fig. 11(a)].176 This study combined forward perception networks and heuristic algorithms, significantly reducing the data collection workload and overcoming the limitations of previous studies restricted to fixed physical structures. Also in 2022, Fan et al. improved the transfer of knowledge across different physical scenarios, enabling the transition from smaller to larger arrays and speeding up the learning process for complex arrays [Fig. 11(b)].177 In 2024, Xu et al. enhanced spectral prediction accuracy using complex perception networks and employed transfer learning to transfer knowledge between the infrared and terahertz bands, demonstrating reliability and scalability in designing typical terahertz devices with high efficiency even on small datasets [Fig. 11(c)].178 These studies demonstrate the potential of transfer learning in metamaterial design, enabling rapid and accurate predictions across various scenarios. Transfer learning reduces dependence on large datasets, decreases the time and resources needed to train new models, and increases the flexibility and efficiency of the design process. It also facilitates the integration of data and models from different artificial metasurfaces, providing a foundational approach for AI metasurface design driven by big data.
Applications of transfer learning in metamaterial design. (a) Accelerating the design of metasurfaces with different shapes. Reprinted with permission from Zhang et al., Adv. Opt. Mater. 10, 2200748 (2022). Copyright 2022 John Wiley and Sons Inc. (b) Accelerating the transition from small-scale to large-scale metasurface designs. Reprinted with permission from Fan et al., Phys. Rev. Appl. 18, 024022 (2022). Copyright 2022 American Physical Society. (c) Accelerating metasurface designs across different frequency bands. Reprinted with permission from Xu et al., Adv. Photonics Nexus 3, 026002 (2024). Copyright 2024 SPIE.
2. Reinforcement learning
Reinforcement learning learns optimal strategies through interaction with the environment, without relying on large annotated datasets, making it particularly suitable for design tasks with close model–environment interaction. In metamaterial design, reinforcement learning can optimize design parameters through trial and error to achieve high-efficiency optical performance.179–183 In 2022, Seo et al. focused on using deep Q-networks to optimize one-dimensional freeform silicon-based metasurface beam deflectors, demonstrating superior efficiency at multiple wavelengths and deflection angles compared to existing methods [Fig. 12(a)].184 In 2022, Liao et al. combined the proximal policy optimization algorithm from deep reinforcement learning to design 3D chiral plasmonic metasurfaces, varying all variable parameters simultaneously. This method improved training and design efficiency, achieving optimal circular dichroism values at various target wavelengths [Fig. 12(b)].185 Reinforcement learning accelerates the design process and enhances the functionality of metasurfaces through intelligent methods. It can address the vast design space in freeform optimization, finding optimal structures without prior metamaterial data and enabling adaptive adjustments based on conditions.
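The trial-and-error loop can be illustrated with a toy value-learning agent. Here a hypothetical lookup table of "deflection efficiencies" replaces the electromagnetic solver, and an epsilon-greedy agent learns which discretized design maximizes the reward; everything below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretized candidate designs (e.g. deflector geometries); the
# "environment" returns a simulated deflection efficiency as reward.
efficiencies = np.array([0.42, 0.55, 0.78, 0.91, 0.63])

def simulate(action):
    return efficiencies[action]          # toy deterministic solver

Q = np.zeros(len(efficiencies))          # action-value estimates
alpha, eps = 0.5, 0.3                    # learning rate, exploration rate
for step in range(1000):
    if rng.random() < eps:               # explore a random design
        a = int(rng.integers(len(Q)))
    else:                                # exploit the best design so far
        a = int(np.argmax(Q))
    r = simulate(a)
    Q[a] += alpha * (r - Q[a])           # incremental value update

best = int(np.argmax(Q))                 # agent's preferred design
```

The agent needs no labeled dataset: repeated interaction alone steers it toward the highest-efficiency design.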
Applications of reinforcement learning in the design of artificial metamaterials. (a) Optimization of one-dimensional freeform silicon-based metasurface beam deflectors using deep Q-networks. Reprinted with permission from Seo et al., ACS Photonics 9, 452–458 (2022). Copyright 2022 American Chemical Society. (b) Design of 3D chiral plasmonic metasurfaces using deep reinforcement learning. Reprinted with permission from Liao et al., Opt. Express 30, 39582–39596 (2022). Copyright 2022 OSA Publishing.
3. Diffusion neural network
Recently, stochastic generative models, specifically diffusion models, have proven effective in addressing inverse problems. Diffusion models view data generation as a two-stage process: a forward process that progressively diffuses structured data into noise, followed by a learned reverse process that generates data from that noise.189,190 In 2023, Zhang et al. presented MetaDiffusion, a diffusion probabilistic model for accurate, high-degree-of-freedom metasurface inverse design. By learning the noise diffusion process, this method generates meta-atoms conforming to target S-parameters and surpasses GANs in stability and accuracy, offering a new path in metasurface design [Fig. 13(a)].186 In 2024, Ding et al. demonstrated a novel optical logic operator driven by multifunctional metasurfaces using an all-optical diffusion neural network.187 This network consists of only one hidden layer, physically mapped as a metasurface composed of simple, compact unit cells, significantly reducing the system's volume and computational resources. The designed optical quantum operators achieved up to 96% fidelity across all four quantum logic gates, showing potential for large-scale optical quantum computing systems [Fig. 13(b)]. These studies indicate that diffusion models in metamaterial design not only address the "one-to-many" problem inherent in traditional methods but also significantly enhance design speed while maintaining high generation quality, providing a rapid and efficient new pathway for metamaterial design.
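The forward half of this process has a convenient closed form, q(x_t | x_0) = N(sqrt(a_bar_t) x_0, (1 - a_bar_t) I), sketched below with a standard linear noise schedule (the schedule values and the all-ones "design" pattern are illustrative). The learned reverse process, omitted here, is what would reconstruct the design from the resulting noise.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule (illustrative)
alphas_bar = np.cumprod(1.0 - betas)      # cumulative signal-retention factor

def q_sample(x0, t, rng):
    """Closed-form forward diffusion: x_t ~ N(sqrt(a_bar_t) x0, (1 - a_bar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.ones(10_000)                      # a toy "design" pattern, all ones
x_mid = q_sample(x0, 300, rng)            # partly noised: signal still visible
x_end = q_sample(x0, T - 1, rng)          # final step: essentially pure noise
```

By the final step essentially no design signal remains, so the reverse process must generate the structure from noise alone, which is what allows one noise seed per candidate and thus naturally handles one-to-many design problems.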
Applications of diffusion models in the inverse design of artificial metamaterials. (a) Metamaterial inverse design using image parameter diffusion models. Reprinted with permission from Zhang et al., Nanophotonics 12, 3871–3881 (2023). Copyright 2023 Walter de Gruyter. (b) Optical logic operator design using all-optical diffusion neural networks. Reprinted with permission from Ding et al., Adv. Mater. 36, 2308993 (2024). Copyright 2024 John Wiley and Sons Inc.
4. Transformer neural network
Transformer models, with their attention-based mechanism, excel in handling sequential data and capturing long-range dependencies, making them suitable for managing complex design parameters and behaviors. The attention mechanism identifies key design parameters, enhancing design accuracy and efficiency. Compared to recurrent neural networks, transformers better manage large-scale and high-dimensional data, optimizing the design of multifunctional metamaterials.191,192 As shown in Fig. 14, a study proposed a transformer-based method for designing an all-dielectric surface-enhanced Raman scattering (SERS) metasurface based on quasi-bound states in the continuum.188 By adjusting the incident angle, this design achieved a strong coupling mechanism, significantly enhancing the SERS signal enhancement factor, providing new guidelines for metal-free all-dielectric SERS sensing technology. However, transformers’ performance advantages may not be as pronounced with smaller datasets or models, posing challenges for laboratory-scale metamaterial research. Combining traditional CNN architectures with transformer models may enhance performance on small datasets, leveraging the strengths of both approaches. This hybrid model method is advantageous for laboratory-scale studies. With increasing data availability and technological advancements, transformer models are expected to play a more significant role in metamaterial design.
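At the core of these models is scaled dot-product attention, softmax(QK^T / sqrt(d)) V. The sketch below applies it to a few random "token" vectors, which stand in for sequence-encoded design parameters; no trained projection matrices are involved.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # all-pairs token interactions
    weights = softmax(scores, axis=-1)     # each row: a distribution over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
# Six "tokens" (e.g. geometric parameters of a meta-atom sequence), dim 4.
Q = rng.standard_normal((6, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 4))
out, w = attention(Q, K, V)
```

Because every token attends to every other token in one step, dependencies between distant design parameters are captured without the sequential bottleneck of recurrent networks.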
Applications of transformer models in the design of multifunctional metamaterials. Reprinted with permission from Chen et al., Adv. Opt. Mater. 12, 2301697 (2024). Copyright 2024 John Wiley and Sons Inc.
Various AI-driven design approaches have been employed to enhance the efficiency of both forward prediction and inverse design. Table I summarizes these methods, their applications, advantages, disadvantages, and relevant references. Traditional models such as the MLP and CNN are effective for forward prediction, providing simplicity and spatial feature extraction, respectively, but they face challenges in inverse design tasks. The tandem neural network and VAE excel at inverse design by providing direct inverse mappings, though often with limitations in scope or exploration breadth. The GAN and diffusion model push the boundaries of innovation, generating highly creative designs, although they come with trade-offs in stability and data demands. Reinforcement learning and transfer learning are notable for operating with minimal labeled data and leveraging pre-existing knowledge, but they require integration with other methods to reach their full potential. Finally, cutting-edge approaches such as transformer models offer exceptional scalability and versatility, bridging forward and inverse design, at the cost of high computational and data requirements. By leveraging these AI techniques, researchers can achieve more sophisticated functional implementations in artificial metamaterials, pushing the boundaries of what is possible in this field.
AI-driven design approaches for artificial metamaterials.
| Design method | Application | Advantages | Disadvantages |
|---|---|---|---|
| Multilayer perceptron73,123 | Forward prediction | Simple structure | Not ideal for complex structures |
| Convolutional neural network77,134 | Forward prediction | Extracts spatial features | Limited effectiveness in inverse design |
| Perception model with optimization methods135 | Structure optimization | Optimizes structure design by selecting the best candidates | Limited speed and scope |
| Tandem neural network82,141,142 | Inverse design | Direct and straightforward inverse design steps | Cannot achieve different structures with the same performance |
| Variational autoencoder147–150 | Inverse design | Provides various effective solutions | Limited exploration breadth |
| Generative adversarial network159–161 | Inverse design | Generates innovative solutions | Stability issues |
| Transfer learning176–178 | Knowledge inheritance | Reduces data requirements significantly | Requires integration with other structures and data |
| Reinforcement learning184,185 | Inverse design | Unsupervised; no dependency on labeled data | Efficiency can be unstable |
| Diffusion model186,187 | Inverse design | Produces highly innovative solutions | Requires substantial data for training |
| Transformer model188 | Bidirectional design | Highly scalable and versatile | Requires extensive data and computational resources |
IV. ARTIFICIAL METAMATERIALS EMPOWERED ARTIFICIAL INTELLIGENCE
The rapid development of AI has created significant demand for computational power, and traditional electronic computing hardware faces performance bottlenecks as models grow in complexity and scale. To overcome these limitations, researchers are exploring new computing technologies, such as optical computing and metamaterials, due to their unique advantages. Optical computing and metamaterials leverage the wave properties of light for information processing, offering high speed, low energy consumption, and strong parallel processing capabilities.193–206 Metamaterials can form novel storage and computing units that may surpass existing electronic devices in speed, energy efficiency, and integration density. Applying metamaterial technology to AI hardware development promises to address the current computational power shortfall and potentially lead the next revolution in AI computing. By using light to emulate forward propagation in fully connected neural networks, all matrix operations are completed during the diffraction of light. This optical computing architecture provides significantly higher energy efficiency and computational speed than traditional hardware such as graphics processing units (GPUs). In the microwave regime, electromagnetic diffraction neural networks have two distinct advantages: the ability to perform computations directly in the microwave electromagnetic domain and easier programmability. This section introduces optical and microwave metamaterial-based neural networks, along with their chip-scale applications.
A. Optical physical neural networks
In recent years, research has begun to focus on planar photonic circuits to realize the functions of neural networks. In 2017, Shen et al. proposed an all-optical neural network architecture based on Mach–Zehnder interferometers and wavelength-division multiplexing.207 This coherent nanophotonic circuit processes optical signals, providing an efficient optical implementation for matrix multiplication and nonlinear activation functions in deep learning tasks [Fig. 15(a)]. In 2021, Liao et al. introduced a novel all-optical computing framework based on optical convolutional neural networks.208 The framework uses cascaded silicon Y-waveguides and side-coupled silicon waveguide segments as weight modulators, achieving complete control over the amplitude and phase of optical signals [Fig. 15(b)]. This provides an effective solution for ultra-high-speed, ultra-low-energy all-optical computing.
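The building block behind such interferometer meshes is the 2×2 Mach–Zehnder transfer matrix. The sketch below assumes idealized lossless 50:50 couplers (phase conventions vary between implementations): each MZI applies a unitary, power-conserving weight to two optical amplitudes, which is why meshes of them can realize the matrix multiplications of a neural network layer.

```python
import numpy as np

def mzi(theta, phi):
    """Idealized transfer matrix of a Mach-Zehnder interferometer: an
    input phase shifter (phi) and an internal phase shifter (theta)
    sandwiched between two lossless 50:50 directional couplers."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # 50:50 coupler
    inner = np.diag([np.exp(1j * theta), 1.0])       # internal phase arm
    outer = np.diag([np.exp(1j * phi), 1.0])         # input phase shifter
    return bs @ inner @ bs @ outer

U = mzi(theta=0.7, phi=1.3)        # one programmable 2x2 "weight"
x = np.array([0.6, 0.8])           # input optical amplitudes (power = 1)
y = U @ x                          # optical matrix-vector product
# The interferometer only redistributes power between its two outputs,
# so cascaded MZIs implement lossless weight matrices.
```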
Advances in all-optical neural network architectures for photonic computing. (a) All-optical neural network architecture using Mach–Zehnder interferometers and wavelength-division multiplexing. Reprinted with permission from Shen et al., Nat. Photonics 11, 441–446 (2017). Copyright 2017 Springer Nature. (b) All-optical computing framework using optical convolutional neural networks. Reprinted with permission from Liao et al., Opto-Electron. Adv. 4, 200060 (2021). Copyright 2021 OE Journal. (c) Enhanced inference accuracy in diffraction-based optical neural networks using differential detection. Reprinted with permission from Li et al., Adv. Photonics 1, 046001 (2019). Copyright 2020 SPIE. (d) Reconfigurable diffractive processing unit for neuromorphic optoelectronic computing. Reprinted with permission from Zhou et al., Nat. Photonics 15, 367–373 (2021). Copyright 2021 Springer Nature.
To further enhance neural network performance, three-dimensional diffraction neural networks capable of integrating more neurons have been proposed. In 2019, Li et al. significantly improved the inference accuracy of diffraction-based optical neural networks by introducing a differential detection technique [Fig. 15(c)].209 This technique circumvents the non-negativity constraint of light intensity by assigning an independent detector pair to each category, achieving high blind-test accuracy on standard datasets. In 2021, Zhou et al. proposed a reconfigurable diffractive processing unit for constructing large-scale neuromorphic optoelectronic computing systems [Fig. 15(d)].210 This system efficiently supports various neural network models with up to millions of neurons. By dynamically programming optical modulators and photodetectors, it realizes extremely high-speed data modulation and large-scale network parameter updates.
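The differential-detection idea admits a compact illustration: pairing detectors converts inherently non-negative optical intensities into signed class scores. The intensity values below are invented for illustration.

```python
import numpy as np

def differential_readout(intensities):
    """Pair consecutive detectors: score_k = I_k+ - I_k-, turning
    non-negative optical intensities into signed class scores."""
    I = np.asarray(intensities).reshape(-1, 2)
    return I[:, 0] - I[:, 1]

# Ten detectors -> five classes; measured intensities are always >= 0.
intensities = np.array([0.9, 0.1,   0.2, 0.7,   0.4, 0.4,
                        0.05, 0.8,  0.6, 0.3])
scores = differential_readout(intensities)
predicted = int(np.argmax(scores))   # class with the largest differential signal
```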
B. Microwave physical neural networks
In recent years, microwave-band diffractive neural networks have attracted significant interest. These networks often achieve network parameter regulation through multilayer metasurfaces. In the microwave band, programmable metasurfaces have demonstrated tremendous potential for implementing diffractive neural networks due to their unique capabilities.211 Figure 16 shows work on physical neural networks using microwave-band devices. In 2020, Qian et al. proposed an optical logic operation method based on diffractive neural networks, using spatial encoding and composite Huygens metasurfaces to perform logical operations on plane wave signals without precise control of the input light [Fig. 16(a)].212 In 2022, Liu et al. introduced a programmable artificial intelligence structure that establishes hierarchical neural connections through a multilayer digitally encoded metasurface array [Fig. 16(b)].213 This structure can perform a variety of deep learning tasks, including image classification, mobile communication encoding and decoding, and real-time multi-beam focusing, illustrating the application potential of all-optical diffractive deep neural networks in wave sensing and communication. In 2023, Gao et al. proposed a programmable surface neural network based on a surface plasmon polariton platform [Fig. 16(c)].214 This network can perform electromagnetic wave sensing and regulation at near-light speed. By adjusting the bias voltage on varactor diodes, it achieves programmable weight modulation and incorporates real-time control and feedback mechanisms to realize customizable activation functions. Programmable artificial surfaces thus hold great promise for fast and reliable physical neural networks.
Advances in microwave-band diffractive neural networks using programmable metasurfaces. (a) Optical logic operations using diffractive neural networks. Reprinted with permission from Qian et al., Light: Sci. Appl. 9, 59 (2020). Copyright 2020 Springer Nature. (b) Programmable artificial intelligence structure with metasurface arrays. Reprinted with permission from Liu et al., Nat. Electron. 5, 113–122 (2022). Copyright 2022 Springer Nature. (c) Programmable surface neural network on a surface plasmon polariton platform. Reprinted with permission from Gao et al., Nat. Electron. 6, 319–328 (2023). Copyright 2023 Springer Nature.
Table II categorizes recent advancements into two types of electromagnetic metamaterial physical neural networks. Each category includes a summary of the key features, practical applications, and references, showcasing the unique capabilities and advancements in neural network technology across the electromagnetic spectrum. This overview underscores the potential of electromagnetic metamaterial physical neural networks to meet the growing computational demands of artificial intelligence.
Electromagnetic metamaterial physical neural networks.
| Type | Key features | Applications |
|---|---|---|
| Optical physical neural networks207–210 | High-speed optical signal processing, ultra-low energy consumption, optoelectronic computing | Deep learning tasks, image recognition, real-time data modulation |
| Microwave physical neural networks211–213 | Direct computation in the microwave domain, easily implemented programmability | Optical logic operations, wave sensing, real-time multi-beam focusing |
C. Integration of electromagnetic physical neural networks
Diffractive neural networks in both the optical and microwave domains have shown considerable promise for deep learning applications but face challenges with compact integration due to larger structural footprints of discrete diffractive neurons. To address these obstacles, researchers are exploring novel approaches for more efficient integration. In 2022, Wang et al. showcased an integrated photonic system that employs subwavelength structures for directional diffraction and dispersion management [Fig. 17(a)].215 This system, operating on a silicon photonics platform within the communication band, performs selective image recognition and achieves high-speed vector-matrix multiplication operations through the use of multilayer metasurfaces. In 2022, Zhu et al. introduced an integrated chip diffractive neural network that leverages ultra-compact diffractive elements to carry out parallel Fourier transform and convolution operations, thereby significantly curtailing hardware space and energy usage.216 This development offers a scalable and energy-efficient solution for optical artificial intelligence applications [Fig. 17(b)]. Moving into 2023, Fu et al. presented an on-chip diffractive optical neural network constructed on a silicon-on-insulator platform [Fig. 17(c)].217 This network is capable of executing machine learning tasks with high levels of integration and minimal power consumption, attaining nearly 90% classification accuracy on the Iris dataset through the physical mapping of optical neural networks within silicon-based structures. The successful implementation of integrated diffractive neural networks is poised to deliver more compact, efficient, and dependable solutions across the fields of sensing and communication, thereby propelling the evolution and application of pertinent technologies.
Integration of electromagnetic physical neural networks. (a) Integrated photonic system for directional diffraction and dispersion engineering. Reprinted with permission from Wang et al., Nat. Commun. 13, 2131 (2022). Copyright 2022 Springer Nature. (b) Integrated chip diffractive neural network with ultra-compact diffractive elements. Reprinted with permission from Zhu et al., Nat. Commun. 13, 1044 (2022). Copyright 2022 Springer Nature. (c) On-chip diffractive optical neural network on a silicon-on-insulator platform. Reprinted with permission from Fu et al., Nat. Commun. 14, 70 (2023). Copyright 2023 Springer Nature.
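As a conceptual illustration of the computation these devices perform, the sketch below models a single diffractive layer as a phase mask followed by free-space propagation via the scalar angular spectrum method. This is a generic toy model, not an implementation of any of the cited chips; the parameter values and the names `angular_spectrum_propagate` and `diffractive_layer` are illustrative assumptions.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, distance):
    """Propagate a 2-D complex field over `distance` using the
    scalar angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                # spatial frequencies (1/m)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = (2 * np.pi / wavelength) * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * distance) * (arg > 0)  # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)

def diffractive_layer(field, phase_mask, wavelength, dx, distance):
    """One diffractive 'layer': trainable phase mask, then propagation."""
    return angular_spectrum_propagate(field * np.exp(1j * phase_mask),
                                      wavelength, dx, distance)

# Toy forward pass: 64x64 planes, two layers with (untrained) random masks.
rng = np.random.default_rng(0)
n, wavelength, dx, z = 64, 1.55e-6, 4e-6, 200e-6   # telecom-band scale
field = np.zeros((n, n), dtype=complex)
field[24:40, 24:40] = 1.0                          # square "input image"
for _ in range(2):
    mask = rng.uniform(0, 2 * np.pi, (n, n))
    field = diffractive_layer(field, mask, wavelength, dx, z)
intensity = np.abs(field) ** 2                     # detectors read intensity
```

In a trained device, the per-layer phase masks would be optimized against a simulated forward model rather than drawn at random, and the output-plane intensity collected at designated detector regions would encode the inference result.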
V. CONCLUSIONS AND PERSPECTIVES
The integration of AI and artificial metamaterials is a rapidly emerging interdisciplinary field with the potential to revolutionize numerous industries, from telecommunications to advanced computing. By leveraging AI's strengths in optimization and complex problem-solving, the design of metamaterials has been accelerated, enabling functionalities previously unattainable with conventional methods. This review focuses on key AI approaches for metamaterial development, including forward and inverse design methodologies such as MLPs and CNNs for forward modeling and GANs and VAEs for inverse design. AI-driven metamaterial design based on big-data techniques enhances predictive modeling and adaptive functionality. In addition, artificial metamaterials can significantly enhance AI performance by offering novel routes to improving neural network functionality and optimizing signal processing in machine learning algorithms. Despite this potential, several significant challenges remain.
First, the computational cost of training sophisticated AI models is high, demanding advanced hardware and considerable energy resources. The collection of high-quality datasets that accurately represent the complexities of electromagnetic properties is resource-intensive, creating obstacles for generalization and reproducibility.
Second, selecting the optimal AI model for artificial metamaterial design is challenging due to imbalanced property distributions and model complexity. Effective use of models such as GANs and VAEs requires extensive experimentation and domain-specific expertise. Achieving precision and consistency in performance further complicates the design process, requiring significant human intervention and iterative optimization.
Third, AI models often act as “black boxes” with limited interpretability, which undermines trust in their outputs. Integrating physical knowledge, such as Maxwell’s equations, into AI can ensure that the method is based on physical constraints and discovers feasible and reliable designs. In addition, incorporating physical knowledge improves the interpretability of AI models and enables them to make more efficient use of limited data in designing artificial materials.
Fourth, the use of artificial metamaterials for AI computing faces cost and integration challenges. The fabrication of these materials is expensive due to their complex design requirements, which limits their scalability. Moreover, integrating metamaterials into traditional chips is difficult, as their intricate structures and sensitivities pose challenges to chip miniaturization and practical implementation.
Fifth, effective integration of AI and artificial metamaterial design requires interdisciplinary collaboration across material science, physics, and computer science, yet misaligned terminologies and methodologies demand standardized frameworks to ensure progress. In addition, the scalability of AI-designed metamaterials is hindered by fabrication constraints, such as precision, material availability, and cost, which limit their practicality for industrial adoption.
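As a minimal sketch of the physics-informed strategy raised in the third challenge above, the example below adds a penalty on violations of a discretized wave equation (here a 1-D Helmholtz residual) to an ordinary data-fit loss, so that a model is steered toward physically admissible fields. The choice of equation, discretization, weighting, and names such as `helmholtz_residual` are illustrative assumptions, not a method taken from the reviewed literature.

```python
import numpy as np

def helmholtz_residual(E, k, dx):
    """Residual of the 1-D Helmholtz equation d2E/dx2 + k^2 E = 0,
    discretized with a central finite difference on interior points."""
    d2E = (E[2:] - 2.0 * E[1:-1] + E[:-2]) / dx ** 2
    return d2E + k ** 2 * E[1:-1]

def physics_informed_loss(E_pred, E_data, k, dx, weight=1e-3):
    """Data-fit term plus a weighted penalty on wave-equation violations."""
    data_term = np.mean(np.abs(E_pred - E_data) ** 2)
    physics_term = np.mean(np.abs(helmholtz_residual(E_pred, k, dx)) ** 2)
    return data_term + weight * physics_term

# An exact plane-wave solution has a small residual; an unphysical
# perturbation inflates the physics term and is penalized accordingly.
x = np.linspace(0.0, 1.0, 201)
dx = x[1] - x[0]
k = 2 * np.pi * 3.0                                # three periods in [0, 1]
E_exact = np.cos(k * x)
E_noisy = E_exact + 0.1 * np.sin(40 * np.pi * x)   # violates the equation
```

In a full design pipeline, such a residual term would be evaluated on the fields predicted by the network during training, giving the model a physics-based learning signal even where labeled simulation data are scarce.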
Looking ahead, the future of AI-integrated electromagnetic metamaterials is promising. Advances in AI methodologies and fabrication techniques are expected to address these challenges and lead to new breakthroughs. Collaboration among material scientists, AI researchers, and engineers will be crucial for refining models, improving interpretability, and making fabrication processes more accessible. AI-driven metamaterials have the potential to revolutionize fields such as photonics, electronics, and computational hardware, paving the way for faster, more efficient, and more adaptable next-generation devices.
ACKNOWLEDGMENTS
This research was funded by the National Key R&D Program of China (Grant No. 2022YFF0604801), the National Natural Science Foundation of China (Grant Nos. 62271056, 62171186, and 62201037), the Beijing Natural Science Foundation of China-Haidian Original Innovation Joint Fund (Grant No. L222042), and the 111 Project of China (Grant No. B14010).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Liming Si: Conceptualization (lead); Funding acquisition (lead); Methodology (equal); Supervision (lead); Writing – original draft (lead); Writing – review & editing (equal). Rong Niu: Methodology (equal); Visualization (equal); Writing – original draft (equal). Chenyang Dang: Investigation (equal). Xiue Bao: Investigation (equal). Yaqiang Zhuang: Investigation (supporting). Weiren Zhu: Investigation (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding authors upon reasonable request.