Accurate fabrication of high-aspect-ratio (HAR) structures in applications from semiconductor devices to x-ray observatories is essential because their performance depends directly on their structure. High-efficiency critical-angle transmission (CAT) gratings enable high-resolution x-ray spectroscopy in astrophysics, but they perform ideally only when certain performance-critical parameters, like the bar tilts introduced during deep reactive-ion etching, are tuned to precise values. Traditional measurement methods like small-angle x-ray scattering (SAXS) are accurate, but because they are slow and often destructive, they limit the development of robust control algorithms that nudge performance-critical parameters toward favorable values. We present a fast, accurate, nondestructive measurement method using Mueller matrix spectroscopic ellipsometry and machine learning. Given a HAR structure, we train a neural network on rigorous coupled-wave analysis simulation data to predict Mueller matrix spectra from input performance-critical parameter values. We then invert this forward problem by freezing the network weights, measuring experimental Mueller matrix spectra, and performing vanilla gradient descent on the performance-critical parameters until they reach values that correspond to the measured Mueller matrix spectra. Introducing machine learning to invert the forward problem reduces computation time, and experimental results demonstrate close agreement between the tilt determined by our method and SAXS measurements. Our accurate, fast measurement method paves the way for the development of robust control algorithms that adjust fabrication parameters in response to measurement, ensuring optimal performance not only in CAT gratings but also in HAR structures embedded in applications from semiconductor to microelectromechanical systems fabrication.
I. INTRODUCTION
High-aspect-ratio (HAR) microelectronic and photonic devices are critical components in advanced technological applications, including microelectromechanical systems (MEMS), semiconductor devices, solar cells, and x-ray observatories. Accurate fabrication is essential for HAR device performance because optimal performance follows directly from the precise structure itself. For example, smooth and straight sidewalls in HAR device structures minimize defects and roughness, reducing electrical resistivity and improving performance in MEMS.1
Accurate fabrication is only possible through measurement. Critical structure parameters (like a parameter that formally defines smoothness) that yield optimal device performance must be measured, ideally in real-time. If measurements deviate from desired values, control algorithms can correct fabrication parameters to nudge performance-critical structure parameters to these desired values. For example, in the critical-angle transmission (CAT) diffraction grating bars for x-ray observatories discussed in this paper, near-zero-degree tilt bars are crucial for optimal performance. When fabricating diffraction gratings, accurate measurement of the grating bar tilt can reveal deviations from zero-degree tilt bars. Control algorithms can then perturb etch parameters like substrate bias to ensure that the tilt remains near zero.2
A current limitation on such robust control algorithms for all HAR applications, from MEMS to semiconductors, is the speed of measurement. Traditional techniques like scanning electron microscopy (SEM) and atomic force microscopy (AFM) are accurate but often slow and poorly suited to real-time measurement.3 In this paper, we present a method for fast and accurate measurement of performance-critical HAR structure parameters. We demonstrate the efficacy of our method by using it to measure bar tilts in CAT gratings, but the method can be generalized to measure performance-critical HAR structure parameters in many other microelectronic and photonic devices, especially those that are periodic. This paves the way for robust control algorithms that perturb fabrication parameters in real time, ultimately accelerating the development of optimally performing microelectronic and photonic devices.
CAT gratings, fabricated from silicon-on-insulator (SOI) wafers, are ultra-HAR structures used in high-resolution x-ray spectroscopy.4–7 They are blazed transmission gratings, reflecting x rays off their sidewalls at grazing angles below the critical angle for total external reflection, thus maximizing diffraction efficiency in higher orders and enabling high spectral resolving power (Fig. 1). They combine the high diffraction efficiency and resolving power of blazed reflection gratings with the practical advantages of transmission gratings.
One critical challenge in fabricating CAT gratings is the introduction of undesired grating bar tilts during the deep reactive-ion etching8 (DRIE) step (Fig. 2). These tilts significantly affect the incident x-ray angle, impacting the blazing behavior.9,10 Accurate measurement and characterization of these bar tilts are essential for fine-tuning of the DRIE step and thus optimizing grating performance.2 For CAT gratings and other HAR structures, traditional methods like small-angle x-ray scattering (SAXS) can be accurate but require destructive sample thinning if used early in the fabrication process and are time consuming.9
The basic CAT grating fabrication steps are shown in Fig. 3. Initial steps (1 and part of 2) are performed on specially designed 200 mm-diameter SOI wafers patterned at MIT Lincoln Lab. Subsequent steps are performed in campus labs, including MIT.nano. Mueller matrix spectroscopic ellipsometry (MMSE) can be applied immediately after the critical DRIE of the device layer (Step 3), providing prompt feedback without the thinning of the back side layer that SAXS requires.
We propose using MMSE for the nondestructive characterization of bar tilts in CAT gratings. MMSE measures changes in polarization as light reflects off a sample, providing detailed information about the sample’s optical properties and structure. By capturing experimental MMSE spectra from the sample grating, we can build a model of the grating using a rigorous coupled-wave analysis (RCWA)-based electromagnetic solver.14
The experimental setup captures the MMSE spectra, which are then compared to the modeled spectra. The optimization process involves calculating the gradient of the squared deviation between the experimental and modeled spectra with respect to the free parameters. This allows us to iteratively adjust the free parameters until the model spectra converge to the experimental data. Traditional approaches perform gradient calculations through finite differences with RCWA simulations at each step, which is computationally intensive and time-consuming. The novelty of our method is that we replace RCWA with a neural network and perform gradient descent on the input space to find a solution that best approximates the experimental spectra. To our knowledge, this technique, inspired by generative artificial intelligence, has not been applied in HAR metrology before. We first generate training data consisting of RCWA-simulated spectra across the free parameters. We then train the neural network to solve the forward problem of predicting an MMSE spectrum given a point in the free parameter space. Once trained, the neural network approximates RCWA in an analytical form. With the network weights frozen, the gradient is then calculated analytically through the network with the chain rule.
The robustness of our method is evident in its speed, consistency, and accuracy compared to calculating the gradient with finite differences using RCWA. Our neural network-based approach not only accelerates the optimization process but also maintains high accuracy in parameter estimation. Our method provides detailed results of the bar tilt across the entire wafer. By measuring multiple points on the wafer, we map the tilt variations introduced during the DRIE process. The results of the MMSE-measured tilt are validated against SAXS measurements, demonstrating that the MMSE method provides consistent and accurate tilt measurements. The non-destructive and rapid nature of MMSE, combined with the computational efficiency of neural network-based optimization, offers a powerful tool for characterizing and improving the fabrication of CAT gratings.
II. PHYSICAL FOUNDATIONS
A. Scatterometry
Scatterometry is a metrology technique used to determine the structure of a sample by analyzing the spectra of light that interacts with it.
It relies on the idea that the spectral response of light, when it interacts with a periodic structure, is unique to the structure’s geometric and material properties. The scattered light carries information about the sample, which can be decoded to reconstruct the sample’s physical characteristics.
One important aspect of the interaction between light and the sample is depolarization. Depolarization occurs when the polarization state of the incident light goes from fully polarized to partially polarized. This transformation to partial polarization is highly indicative of the sample’s structural properties, particularly in complex, anisotropic structures like HAR microelectronic and photonic devices.
B. Jones vectors and Mueller matrices
While Jones calculus is effective for fully polarized light, it does not capture partially polarized light, which manifests when light interacts with anisotropic structures like HAR microelectronic and photonic devices.
The coordinate system is defined such that x and y are the orthogonal linear polarization directions along the horizontal and vertical axes of the laboratory frame. I is the light’s intensity in a particular polarization direction. +45° and −45° denote the polarization directions at +45° and −45° relative to the horizontal axis, and R and L represent the right and left circular polarizations, respectively.
To obtain the Mueller matrix M, we use four linearly independent Stokes vectors as inputs and measure the corresponding output Stokes vectors. (The experimental setup is described in Sec. VI A.) By solving a system of linear equations, we can determine the elements of M.
A common approach is to choose the four orthonormal basis states in the standard basis as the four linearly independent Stokes vectors. The output Stokes vector for each input basis state then becomes a column in the Mueller matrix.
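The linear solve described above can be sketched in a few lines. This is an illustrative sketch, not the instrument's calibration procedure: the four input states (horizontal, vertical, +45° linear, and right circular) are chosen here because they are physically realizable and linearly independent, and the "sample" is a hypothetical ideal horizontal polarizer whose Mueller matrix is known in closed form. The names `S_in`, `M_true`, and `degree_of_polarization` are assumptions introduced for the example.

```python
import numpy as np

# Columns are four linearly independent input Stokes vectors:
# horizontal, vertical, +45 deg linear, right circular (illustrative choice).
S_in = np.array([
    [1.0,  1.0, 1.0, 1.0],   # total intensity
    [1.0, -1.0, 0.0, 0.0],   # I_x - I_y
    [0.0,  0.0, 1.0, 0.0],   # I_+45 - I_-45
    [0.0,  0.0, 0.0, 1.0],   # I_R - I_L
])

# Hypothetical sample: an ideal horizontal linear polarizer.
M_true = 0.5 * np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

# Simulated measurement: one output Stokes vector (column) per input state.
S_out = M_true @ S_in

# Recover M from M @ S_in = S_out by solving S_in^T @ M^T = S_out^T.
M_est = np.linalg.solve(S_in.T, S_out.T).T


def degree_of_polarization(stokes):
    """Return sqrt(S1^2 + S2^2 + S3^2) / S0 for a Stokes vector."""
    s0, s1, s2, s3 = stokes
    return float(np.sqrt(s1**2 + s2**2 + s3**2) / s0)
```

Any four input states work as long as the matrix of inputs is invertible. A degree of polarization below one in an output vector produced by a fully polarized input indicates depolarization by the sample.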
Using the Mueller matrix formalism to determine the structure of HAR microelectronic and photonic devices presents several challenges. The relationship between the Mueller matrix elements and the physical parameters of the sample is nonlinear. Furthermore, depolarization effects introduce further complexity, requiring sophisticated algorithms and models to accurately interpret the measured spectra. Despite these challenges, the comprehensive information provided by MMSE makes it a powerful tool for characterizing HAR structures.
III. MATHEMATICAL FORMULATION
The ultimate goal of our method is to map the measured spectra directly to the physical structure of the HAR microelectronic and photonic devices. This involves determining the structural parameters of the sample from the spectral data obtained through MMSE.
Directly mapping the spectra to the structure is challenging because the spectra are influenced by multiple interdependent parameters, each simultaneously contributing to the overall response.
Instead, we solve the forward problem using RCWA simulations and invert it. Our method is essentially a method to invert RCWA simulations that is faster and less computationally intensive than other methods in the literature, while maintaining accuracy. RCWA is a semi-analytical electromagnetic simulation method that computes the diffraction efficiencies of periodic structures by solving Maxwell’s equations. It provides a way to generate theoretical spectra for a given set of structural parameters, which can then be compared with the experimental spectra.
A. RCWA simulation
RCWA discretizes the structure into layers and solves Maxwell’s equations in each layer. The electric and magnetic fields within each layer are expressed as a Fourier series, and the boundary conditions are applied at the interfaces between layers. The resulting system of linear equations is solved to obtain the diffraction efficiencies.
B. Confining physical model space
Typical parameters in this reduced space, shown in Fig. 4, include the thickness and offset of the hard mask layer, the tilt angle of the grating bars, and the coefficients of a Legendre polynomial parameterization of the trench critical dimension (CD).
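As an illustration of the Legendre parameterization, the sketch below evaluates a width profile from five coefficients. It assumes that the profile is the sum of c_k P_k(x) with x a depth coordinate normalized to [−1, 1]; the normalization, the sign convention for x, and the coefficient values are assumptions made for the example and are not taken from the model used in this work.

```python
import numpy as np
from numpy.polynomial import legendre


def bar_width_profile(coeffs, n_samples=5):
    """Evaluate sum_k coeffs[k] * P_k(x) on a normalized depth grid.

    x = -1 and x = +1 are taken as the two ends of the bar (assumed
    convention); coeffs[k] multiplies the Legendre polynomial P_k.
    """
    x = np.linspace(-1.0, 1.0, n_samples)
    return x, legendre.legval(x, coeffs)


# Hypothetical coefficients for P0..P4.
coeffs = [80.0, 10.0, -5.0, 0.0, 0.0]
x, width = bar_width_profile(coeffs)
```

Because the Legendre polynomials are orthogonal on [−1, 1], the constant coefficient is the mean width, and each higher coefficient adds an independent shape component (linear taper, bowing, and so on).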
C. Free parameter selection and minimization of differences between RCWA and experimental spectra
Parameter | Description | Range (Min, Max) | Discretization |
---|---|---|---|
ht.3 | Height of the grating bar (nm) | (2500.0, 5000.0) | 1.0 |
xtilt.4 | Bar tilt angle of the grating bar (degrees) | (−0.750, 0.750) | 0.01 |
pw0.4 | P0(x): Constant Legendre coefficient of bar width | (50.0, 120.0) | 1.0 |
pw1.4 | P1(x): Linear Legendre coefficient of bar width | (−75.0, 75.0) | 1.0 |
pw2.4 | P2(x): Quadratic Legendre coefficient of bar width | (−100.0, 100.0) | 1.0 |
pw3.4 | P3(x): Cubic Legendre coefficient of bar width | (−100.0, 100.0) | 1.0 |
pw4.4 | P4(x): Quartic Legendre coefficient of bar width | (−100.0, 100.0) | 1.0 |
To demonstrate that the RCWA-simulated spectra are indeed sensitive to the performance-critical parameters that we have chosen to vary in our optimization process, we vary these parameters and visually inspect the resultant RCWA-simulated spectra. The simulated spectra have 16 different matrix elements as a function of wavelength. All the matrix elements are sensitive to our performance-critical parameters, so we fit to all matrix elements in our method; however, we note that the off-diagonal elements of the Mueller matrix, especially in the upper-right and lower-left quadrants, are more sensitive to asymmetry because they capture the cross-polarization effects and interactions between different polarization states. The diagonal elements generally describe the overall intensity and depolarization effects, which are often related to symmetric properties of the sample. The difference between our neural network’s approximation of RCWA (the predicted spectra) and the spectra that we measure is quantified by a “loss” function.
Again, the normalization is the total number of elements summed over all wavelengths, and the two spectra being compared are the measured and the predicted spectra. This loss function, which weights pairs of off-diagonal elements, exaggerates the effect of the off-diagonals more than weighting the elements individually, as in (9), because of the cross terms that appear in the quadratic.
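The effect of the cross terms can be seen in a small numerical sketch. The two functions below are one possible reading of the description above, not the loss used in this work: `elementwise_loss` is a plain mean-squared error over all 16 elements, and `paired_loss` squares the sum of the residuals of each upper-right element and its mirrored lower-left element. The pairing of element (i, j) with (j, i) and both function names are assumptions.

```python
import numpy as np

# Upper-right quadrant indices; each is paired with its mirrored
# lower-left element (j, i). This pairing is an assumed convention.
UPPER_RIGHT = [(0, 2), (0, 3), (1, 2), (1, 3)]


def elementwise_loss(measured, predicted):
    """Mean-squared error over all elements and wavelengths."""
    residual = predicted - measured          # shape (n_wavelengths, 4, 4)
    return float(np.mean(residual**2))


def paired_loss(measured, predicted):
    """Mean-squared error of summed residuals of off-diagonal pairs.

    Expanding (r_ij + r_ji)^2 gives r_ij^2 + r_ji^2 + 2 r_ij r_ji, so
    residuals of equal sign reinforce each other through the cross term.
    """
    residual = predicted - measured
    pair_sums = np.stack(
        [residual[:, i, j] + residual[:, j, i] for i, j in UPPER_RIGHT],
        axis=1,
    )
    return float(np.mean(pair_sums**2))
```

With this pairing, residuals of the same sign in a mirrored pair are penalized more strongly than in the element-wise loss, while residuals of opposite sign cancel. Whether summing or differencing the pair is appropriate depends on the symmetry of the elements involved.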
Figure 5 illustrates our sensitivity analysis to tilt, where we vary tilt on the order of a few degrees and plot a linear combination of each of the upper-right and lower-left matrix element sums. The distinct change in the spectra as a function of bar tilt bolsters our hypothesis that we should be able to determine the tilt of a grating bar given experimental spectra by matching it to one simulated by RCWA.
The physical model and the parameter selection help in reducing the complexity of the problem while retaining the essential characteristics of the structure. This confined parameter space allows for more efficient and accurate optimization.
IV. EXISTING PARAMETER DETERMINATION METHODS
Traditional methods for determining the parameters of HAR microelectronic and photonic structures from MMSE spectra often involve exhaustive grid searches. These methods explore the parameter space by computing the simulated spectra for every possible combination of parameters and comparing it to the experimental spectra.
This demonstrates the impracticality of naïve grid search methods because of their computational intensity and time requirements.
The most popular alternative approach is the library method, which involves precomputing a lookup table of spectra for different parameter sets and storing this in a database. When a new experimental spectrum is measured, the closest matching precomputed spectrum is found using k-nearest neighbors (k-NNs) search.
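A minimal sketch of the library method follows. The forward model `toy_forward` is a hypothetical two-parameter stand-in for an RCWA solver, and the grids are illustrative; only the structure (precompute once, then search for the nearest stored spectrum at query time) reflects the description above.

```python
import itertools

import numpy as np

WAVELENGTHS = np.linspace(200.0, 650.0, 64)  # nm


def toy_forward(tilt, width):
    """Hypothetical stand-in for RCWA: (tilt, width) -> spectrum."""
    return tilt * np.sin(WAVELENGTHS / 50.0) + np.cos(WAVELENGTHS / width)


def build_library(tilt_grid, width_grid):
    """Precompute spectra for every grid combination (the slow step)."""
    params = np.array(list(itertools.product(tilt_grid, width_grid)))
    spectra = np.array([toy_forward(t, w) for t, w in params])
    return params, spectra


def nearest_neighbors(query, params, spectra, k=1):
    """Return the k parameter sets whose spectra are closest to the query."""
    distances = np.linalg.norm(spectra - query, axis=1)
    order = np.argsort(distances)[:k]
    return params[order]


tilt_grid = np.linspace(-0.75, 0.75, 7)     # step 0.25
width_grid = np.linspace(50.0, 120.0, 8)    # step 10
params, spectra = build_library(tilt_grid, width_grid)

# A "measurement" that falls between grid points.
measured = toy_forward(0.26, 81.0)
best = nearest_neighbors(measured, params, spectra, k=1)[0]
```

The answer can be no finer than the grid spacing, and the library size grows as the product of the grid sizes, which is why the precomputation cost rises exponentially with the number of free parameters.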
While gradient descent is faster than a full grid search, the finite difference approximation for gradients still requires multiple RCWA simulations per iteration, making it very computationally expensive (Table II).
Require: Initial parameter estimates θ0, learning rate η, stopping condition ε
1: θ ← θ0
2: repeat
3: Compute the gradient ∇θL(θ) using (18)
4: Update parameters:
5: θ ← θ − η∇θL(θ)
6: until
7: the stopping condition ε is met
8: return θ
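The loop in Table II can be sketched as follows. The forward model here is a hypothetical two-parameter function standing in for RCWA, and central differences are used for the gradient as an assumed form of the finite-difference step; the function names, step size, and learning rate are illustrative choices.

```python
import numpy as np

U = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)


def toy_forward(theta):
    """Hypothetical stand-in for one RCWA simulation."""
    return theta[0] * np.sin(U) + theta[1] * np.cos(U)


def loss(theta, measured):
    """Mean-squared deviation between modeled and measured spectra."""
    return float(np.mean((toy_forward(theta) - measured) ** 2))


def finite_difference_gradient(theta, measured, step=1e-4):
    """Central differences: two forward simulations per free parameter."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        shift = np.zeros_like(theta)
        shift[i] = step
        grad[i] = (
            loss(theta + shift, measured) - loss(theta - shift, measured)
        ) / (2.0 * step)
    return grad


def gradient_descent(measured, theta0, learning_rate=0.5, tol=1e-8,
                     max_iter=100):
    """Vanilla gradient descent on the free parameters."""
    theta = np.array(theta0, dtype=float)
    for _ in range(max_iter):
        grad = finite_difference_gradient(theta, measured)
        if np.linalg.norm(grad) < tol:   # stopping condition
            break
        theta = theta - learning_rate * grad
    return theta


theta_true = np.array([0.3, -0.2])
measured = toy_forward(theta_true)
theta_hat = gradient_descent(measured, theta0=[0.0, 0.0])
```

Every gradient evaluation calls the forward model twice per free parameter. When each call is a full RCWA simulation, this inner loop dominates the run time.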
V. OUR METHOD
We performed RCWA and machine learning analysis using proprietary software (NanoDiffract, Onto Innovation Inc., Wilmington, MA). While exact algorithm details are confidential, the following is a description of what one could perform to achieve similar results. The key idea is to maintain the notion of gradient descending on input parameters but replace the naïvely slow gradient calculation with one that is faster by displacing RCWA.
At a high level, one could replace the computationally expensive RCWA simulations with an analytical form using trained neural networks. The neural network, once trained, can rapidly compute the Mueller matrix spectra and their gradients, enabling efficient optimization.
The workflow for parameter estimation from experimental spectra using the trained neural network is summarized in Table III.
Require: Physical model of sample, free parameter grid-spacing, initial parameter estimates θ0, learning rate η, stopping condition ε
1: Capture experimental MMSE spectra from the sample
2: Generate RCWA data using physical model and free parameters
3: Train the neural network to approximate RCWA given this data
4: Freeze the weights in the network
5: θ ← θ0
6: repeat
7: Compute the predicted spectra using the neural network
8: Compute the gradient ∇θL(θ) through the frozen network
9: Update parameters:
10: θ ← θ − η∇θL(θ)
11: until
12: the stopping condition ε is met
13: return θ
This method significantly reduces the computation time compared to traditional RCWA-based approaches, almost entirely because we replace the finite-difference gradient calculation with one that is analytical by approximating RCWA with a neural network, making it feasible for real-time parameter estimation and in-line process adjustments.
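The idea of descending on the inputs of a frozen network can be sketched with a small one-hidden-layer model. This is not the proprietary implementation: the weights below are random placeholders standing in for a network already trained on RCWA data, the layer sizes are arbitrary, and the "measured" spectrum is generated by the network itself so that the example is self-contained. The gradient with respect to the inputs is written out by hand with the chain rule.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PARAMS, N_HIDDEN, N_OUT = 2, 16, 32

# Placeholder weights standing in for a trained surrogate (assumption).
# They are never updated below: the network is "frozen".
W1 = 0.3 * rng.standard_normal((N_HIDDEN, N_PARAMS))
b1 = 0.1 * rng.standard_normal(N_HIDDEN)
W2 = 0.1 * rng.standard_normal((N_OUT, N_HIDDEN))
b2 = 0.1 * rng.standard_normal(N_OUT)


def predict(theta):
    """Forward pass: parameters -> predicted spectrum (and hidden layer)."""
    hidden = np.tanh(W1 @ theta + b1)
    return W2 @ hidden + b2, hidden


def loss_and_grad(theta, measured):
    """Squared-error loss and its analytic gradient w.r.t. the inputs."""
    predicted, hidden = predict(theta)
    residual = predicted - measured
    loss = 0.5 * float(np.sum(residual**2))
    # Chain rule: dL/dtheta = W1^T [ (1 - tanh^2) * (W2^T residual) ]
    grad = W1.T @ ((1.0 - hidden**2) * (W2.T @ residual))
    return loss, grad


def invert(measured, theta0, learning_rate=0.05, tol=1e-12, max_iter=5000):
    """Vanilla gradient descent on the inputs; weights stay fixed."""
    theta = np.array(theta0, dtype=float)
    for _ in range(max_iter):
        _, grad = loss_and_grad(theta, measured)
        if np.linalg.norm(grad) < tol:   # stopping condition
            break
        theta = theta - learning_rate * grad
    return theta


theta_true = np.array([0.4, -0.3])
measured, _ = predict(theta_true)        # stand-in for experimental spectra
theta_hat = invert(measured, theta0=np.zeros(N_PARAMS))
```

Each gradient costs one forward pass and one backward pass through the network regardless of the number of free parameters, whereas finite differences need additional forward simulations for every parameter. In practice an automatic-differentiation framework would replace the hand-written gradient.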
VI. EXPERIMENTAL SETUP
A. Setup for measuring bar tilt across wafers
To measure the bar tilt across wafers, we employ MMSE. The MMSE setup includes a light source, a polarizer, two dual-rotating compensators (one in each arm), a sample stage, an analyzer, and a detector. The light source generates a beam of known polarization, which passes through the polarizer and compensator before interacting with the sample. The reflected light is then analyzed to determine the changes in its polarization state, providing detailed information about the sample’s optical properties and structure. The setup is a commercial setup, specifically an Atlas V from Onto Innovation with RC2 ellipsometer integrated (from JA Woollam Company).
The MMSE spectra are captured over a range of wavelengths, from 200 to 650 nm, allowing us to construct the Mueller matrix for each measurement point on the wafer. The experimental setup is carefully calibrated to ensure accurate and repeatable measurements. The exact details of calibration are confidential, but we generally follow the methods detailed in Section five of Chen.15
The spot size of the light beam is kept small ( m) to avoid mm-pitch Level 2 support structures [e.g., hexagonal support structures in Fig. 4(b)]. The exact details of how this spot size is achieved are confidential, but we use custom refractive, compound lenses with multiple elements, like shaping components, that keep the spot size circular and uniform over multiple wavelengths. We use a conical geometry, where the incident light beam is parallel to the CAT grating bars with a angle of incidence relative to the surface normal.
B. Data collection methods and validation using small-angle x-ray scattering
To validate the MMSE measurements, we use small-angle x-ray scattering (SAXS), a well-established technique for characterizing nanostructures. SAXS, in principle, can provide high-resolution data on the grating profile, including the bar tilt and periodicity, by analyzing the scattering patterns of x rays as they interact with the sample.
We follow the method described by Song,9 which involves using a collimated x-ray beam directed at the sample. The scattered x rays are detected at small angles relative to the incident beam, and the diffracted orders are analyzed as a function of incidence angle to extract bar tilt. The SAXS data serve as an accurate benchmark for validating the MMSE measurements.
By comparing the tilt angles obtained from MMSE and SAXS, we can assess the consistency and precision of our MMSE-based characterization method. The combination of MMSE and SAXS provides a comprehensive approach for measuring and validating bar tilt across HAR photonic wafers, offering both nondestructive and high-resolution capabilities.
VII. RESULTS AND DISCUSSION
The contour plot in Fig. 6 shows the bar tilt determined by MMSE across different points on the wafer. We show that we are able to rapidly extract tilt measurements from any point on the wafer.
To validate the accuracy of these measurements, we extract measurements along a line on the wafer perpendicular to the grating bars to compare with SAXS data [Fig. 8(a)]. Collecting SAXS data is time-consuming, so we do not compare SAXS data to every point on the wafer. Instead, we compare data along the entire vertical length of the wafer to determine whether the two methods agree at both small and large bar tilts. We follow the methods described by Song,9 and plot the bar tilt determined by both MMSE and SAXS [Fig. 8(b)]. The close agreement between the two sets of measurements validates the accuracy of our method.
Figure 8 illustrates where exactly on the full wafer from Fig. 6 the points for SAXS were extracted. This visually shows our machine learning method’s ability to measure grating bar tilt to the accuracy of SAXS, across the length of the wafer.
Our method essentially moves through critical-parameter space to attempt to fit the curves in Fig. 5 to those in Fig. 7. This fitting is sped up by replacing the brute-force RCWA calculations with a neural network. Because the neural network is an analytical function, the gradient can be calculated rapidly with backpropagation in the critical-parameter space after the weights are frozen. Figure 5 illustrates RCWA-simulated Mueller matrix spectra for upper-right and lower-left quadrant matrix element pairs. These are the elements most sensitive to asymmetry, and they gave us a first proof of concept that bar tilt could be accurately measured across a wafer through its effect on these Mueller matrix spectra. Figure 7 illustrates real, experimental spectra across a wafer. The variation in these spectra across the wafer provided validation that we could perform a fitting procedure that would attempt to match the spectra from Fig. 5 (or neural network approximations to the spectra from Fig. 5 for increased speed) with the spectra from Fig. 7 to effectively measure bar tilt. Interestingly, the curves in Fig. 5 do not match those in Fig. 7 with extreme accuracy, but our method still works well in recovering performance-critical parameters (Fig. 8).
To gain further insight, we plot the experimental spectra against our model’s spectra after convergence for two points along the wafer in Fig. 9. This shows how closely our machine learning approximation of RCWA reproduces the experimental data. Note again that the two pairs of curves are not exactly the same, but the accuracy of our bar tilt measurements is still high. This implies that not all the variation in the experimental spectra is needed to determine the bar tilt. The approximations made by both RCWA and our neural net are enough to capture the bar tilt variance across the wafer. We, therefore, expect that a similar gradient descent algorithm that operates in a basis in which the curves are sparse, like the Fourier or wavelet basis, would be sufficient.
For future work, we imagine we could Fourier transform the experimental spectra and pass them through a low-pass filter to remove the high frequency components. We can also try passing them iteratively through different band-pass filters to remove certain frequency components. We can then inverse Fourier transform back to the canonical basis, and see if our method is still able to recover the bar tilt with similar accuracy. This may further validate the hypothesis that only certain frequencies or components of information in the experimental spectra are needed to recover certain critical parameters. Determining which parts of the spectra are most important for measuring different structure parameters may help guide measurement techniques or machine learning architectures for rapid measurement.
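The proposed filtering experiment could start from something like the sketch below, which removes high-frequency content from a single spectrum with a real FFT. It assumes the spectrum is sampled on a uniform grid, and the cutoff index `keep` is a free choice; both the function name and the test signal are illustrative.

```python
import numpy as np


def low_pass(spectrum, keep):
    """Zero all Fourier components with index >= keep, then invert."""
    coeffs = np.fft.rfft(spectrum)
    coeffs[keep:] = 0.0
    return np.fft.irfft(coeffs, n=len(spectrum))


# Synthetic example: a slow oscillation plus a fast ripple.
t = np.arange(128) / 128.0
slow = np.sin(2.0 * np.pi * 2.0 * t)
fast = 0.3 * np.sin(2.0 * np.pi * 30.0 * t)
filtered = low_pass(slow + fast, keep=10)
```

A band-pass variant would zero the coefficients outside a chosen index range instead. The filtered spectra could then be passed to the same fitting procedure to test whether the recovered bar tilt changes.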
Our machine learning approach, using a trained neural network to replace RCWA simulations, is not only accurate but also significantly improves the speed of the gradient computation. Gradient computation is the bottleneck in the gradient descent fitting procedure, so replacing RCWA with an analytical form through a neural network, while still being able to measure critical parameters by freezing the network weights, is the key contribution of our paper.
As shown in Table IV, which summarizes the approximate precomputation times and query times for all methods including ours, the gradient computation time per step using the neural network is reduced to 0.05 s, making the total computation time for 100 iterations approximately 5 s. This is a substantial improvement over traditional methods, which can take several minutes or up to a day. The accuracy of the machine learning approach is validated by its close agreement with the SAXS-determined tilt. The current approach focuses on a reduced parameter space to ensure tractability. However, our method can be extended to higher-dimensional parameter spaces. Future work can explore including multiple parameters like sidewall roughness and material composition. It could also be used as a sensor in a control algorithm to adjust fabrication parameters in real-time to nudge these performance-critical structure parameters to their optimal values.
Method | Time scaling equation | Precomputation time | Time at query |
---|---|---|---|
Our method | | 30 h | 5 s |
RCWA with finite difference | 0.1 × d′ | RCWAtime × 100 | 10 s |
Library search | 0.1 × 10^d′ | 28 h | 17 min |
Naïve grid search | 0.1 × 10^d′ | 0 | 28 h |
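The grid-search and neural-network entries can be checked with a few lines of arithmetic. The sketch assumes that the prefactor 0.1 is a time in seconds per simulation, that the base 10 is the number of grid points per parameter, and that d′ = 6; the value of d′ is inferred here because it reproduces the 28 h entry and is not stated in the table. The per-step time of 0.05 s and the 100 iterations are taken from the text.

```python
SECONDS_PER_SIMULATION = 0.1   # assumed meaning of the 0.1 prefactor
POINTS_PER_PARAMETER = 10      # assumed meaning of the base of the exponent
D_PRIME = 6                    # assumed: chosen to reproduce the 28 h entry

# Exhaustive grid: one simulation per grid point.
grid_seconds = SECONDS_PER_SIMULATION * POINTS_PER_PARAMETER ** D_PRIME
grid_hours = grid_seconds / 3600.0

# Neural-network inversion: 0.05 s per gradient step, 100 steps (from the text).
nn_seconds = 0.05 * 100

# Ratio of query times under these assumptions.
speedup = grid_seconds / nn_seconds
```

Under these assumptions the exhaustive grid takes about 27.8 h, which rounds to the tabulated 28 h, while the neural-network query takes 5 s.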
VIII. CONCLUSION
Our method is not only accurate in determining bar tilt but also fast. By replacing the standard finite-difference gradient calculation with one that is analytical, we are able to speed up the traditional bottleneck in RCWA-based approaches to measurement. The main contribution of our paper is freezing the network weights and using gradient descent on the input space after training the network, along with experimental validation. While standard methods using Mueller matrix ellipsometry surpass SAXS in their nondestructiveness and speed, they are still too slow for potential feedback control of fabrication parameters because they rely on brute-force RCWA for their fitting procedures. This paper could open the door to the development of robust control algorithms that adjust fabrication parameters in response to measurement. Control algorithms need fast feedback sensors; otherwise, the latency between measurement and reality is too high for accurate control of critical structure parameters. We foresee future work using our method to develop such control algorithms to not only monitor but also control fabrication processes in close to real time.
ACKNOWLEDGMENTS
The authors would like to thank Charlie Settens and Jordan Cox for their help troubleshooting SAXS instrument data collection. They also thank Mariel Shapiro for providing the SEM image in Fig. 4(a) from a separate wafer. The authors are grateful to Matthew Heine and the IS&T department at MIT for their help navigating computing resources. Finally, the authors would like to thank Mallory Whalen, Jungki Song, Bethany Levenson, James Jusuf, Tristen Wallace, Paran Culanathan, Varan Culanathan, Spencer Schneider, Richard Bao, Anish Mudide, Emma Batson, Mark Mondol, Anjelica Molnar-Fenton, and C. J. Johnson for their helpful discussions. This work was performed, in part, in the MIT.nano Characterization Facilities and supported by NASA Grant No. 80NSSC22K1904.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Shiva Mudide: Conceptualization (equal); Data curation (equal); Formal analysis (lead); Investigation (lead); Methodology (lead); Software (lead); Supervision (equal); Validation (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Nick Keller: Formal analysis (lead); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Visualization (lead). G. Andrew Antonelli: Conceptualization (lead); Funding acquisition (lead); Methodology (lead); Project administration (lead); Software (equal); Supervision (equal). Geraldina Cruz: Data curation (equal). Julia Hart: Data curation (equal); Investigation (equal). Alexander R. Bruccoleri: Conceptualization (equal); Data curation (equal); Funding acquisition (equal); Methodology (equal). Ralf K. Heilmann: Conceptualization (lead); Funding acquisition (lead); Methodology (equal); Project administration (lead); Resources (lead); Supervision (lead); Writing – review & editing (lead). Mark L. Schattenburg: Conceptualization (lead); Funding acquisition (lead); Methodology (equal); Project administration (lead); Resources (equal); Supervision (lead); Writing – review & editing (lead).
DATA AVAILABILITY
There are two datasets used in this paper. The first is the RCWA simulation data used to train our neural network. The second is the raw experimental Mueller matrix spectral data that we captured from our wafer. The former is available from Onto Innovation. Restrictions apply to the availability of this dataset, which was used under license for this study. Data are available from the authors upon reasonable request and with the permission of Onto Innovation. The latter dataset is available from the corresponding author upon reasonable request.