Microscopes and various forms of interferometers have been used for decades in the optical metrology of objects that are typically larger than the wavelength of light λ. Metrology of sub-wavelength objects, however, was deemed impossible due to the diffraction limit. We report the measurement of the physical size of sub-wavelength objects with deeply sub-wavelength accuracy by deep-learning-enabled analysis of the diffraction pattern of coherent light scattered by the objects. With a 633 nm laser, we show that the width of sub-wavelength slits in an opaque screen can be measured with an accuracy of ∼λ/130 for a single-shot measurement, or ∼λ/260 (i.e., 2.4 nm) when combining measurements of diffraction patterns at different distances from the object, thus challenging the accuracy of scanning electron microscopy and ion beam lithography. In numerical experiments, we show that the technique could reach an accuracy beyond λ/1000. It is suitable for high-rate non-contact measurements of the nanometric sizes of randomly positioned objects in smart manufacturing applications with integrated metrology and processing tools.
Accurate measurement of a sub-wavelength object by imaging it with a magnifying lens (conventional microscope) is impossible because the image blurs. Advanced and complex nonlinear and statistical optical techniques, such as stimulated emission depletion (STED) microscopy and single-molecule localization microscopy (SMLM), can measure sub-wavelength objects remotely1,2 but are generally unsuitable for non-invasive metrology in nanotechnology, as they require intrusive functionalization of samples with luminescent molecules or quantum dots.
Lensless imaging and metrology are also possible by analyzing the diffraction patterns of light scattered by the object (scatterometry). Traditional scatterometry relies on multiple illuminations of the sample and demands post-processing of the data for comparison with libraries of measurements and simulations3–5; it has been shown to increase the resolution of low numerical aperture imaging systems.6 A range of iterative feedback algorithms has been developed to enable the reconstruction of an image from the intensity of scattering patterns of optical, deep-UV, and x-ray radiation, with a resolution that is essentially limited by the wavelength of the illuminating light in most cases7,8 and that is around five times higher when compressed sensing techniques for imaging sparse objects are used.9
Artificial intelligence provides a powerful alternative to iterative feedback algorithms in solving the inverse scattering problem of reconstructing an object from its diffraction patterns: the patterns are analyzed by a deep learning artificial neural network trained on similar a priori known objects.10 This way, random dimers of two sub-wavelength slits have been measured in proof-of-principle experiments with an accuracy of λ/10 using a small training set of ∼10² objects. Recently, it was demonstrated that the resolution of imaging and metrology can be further improved if topologically structured light is used to illuminate the object.11
Here, we report that an accuracy of ∼λ/130 in single-shot measurements of the linear dimensions of randomly positioned sub-wavelength objects can be achieved by deep learning analysis of the scattered light, with a neural network trained on <10³ objects of known dimensions. The ability to measure randomly positioned objects is an important feature of this methodology that makes it radically different from, and more suitable for applications than, other techniques that require scanning (e.g., SEM, SNOM, and STED). We measured slits of random widths cut in an opaque screen. Each slit was placed at a random position along the x-direction within a rectangular frame defined by four alignment marks. Each slit is characterized by its width W and its offset O from the center of its rectangular frame (Fig. 1).
The sample with the slits was placed at the imaging plane of the apparatus and illuminated through a low numerical aperture lens (NA = 0.1) with a coherent light source at a wavelength λ = 633 nm. Light diffracted from the sub-wavelength slit was then imaged (mapped) by a high numerical aperture lens (NA = 0.9) at distances of H = 2 λ, 5 λ, and 10 λ from the screen and at the screen level (H = 0). Slit localization and focusing were done semi-automatically, relying on the repeatability of the XY + Z stages (∼10 nm) and on the regularity of the fabricated array of slits. An imaging system with a 4× magnification changer and a 5.5-megapixel sCMOS camera with a 6.5 μm pixel size was used. Since the diffracted field is formed by free-space propagating waves, it can be imaged by a high numerical aperture lens at any magnification without loss of resolution, by choosing a magnification that ensures the detector pixels are smaller than the required resolution. Our imaging system had a magnification of 333×, corresponding to an effective pixel size of 19.5 nm on the reference plane. On the diffraction map, the slits are not resolved: they appear blurred and surrounded by interference fringes.
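The pixel-size bookkeeping above can be checked with a one-line calculation; the helper name is ours, and the numbers are those quoted in the text (6.5 μm camera pixels, 333× magnification):

```python
# Effective pixel size on the object (reference) plane: the physical camera
# pixel pitch divided by the total optical magnification.
def effective_pixel_size_nm(pixel_pitch_um: float, magnification: float) -> float:
    return pixel_pitch_um * 1e3 / magnification

print(effective_pixel_size_nm(6.5, 333))  # ~19.5 nm, matching the text
```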
A single-shot recording of a diffraction map was sufficient to retrieve the width W and offset O of the slit with nanometric accuracy. They are retrieved with an artificial neural network (seven fully connected layers; see the supplementary material for details), previously trained on a set of scattering events from a number of such slits of known widths and positions. Once trained, the system is ready to measure any number of unseen slits.
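As a rough illustration of the retrieval architecture (not the authors' actual network, whose details are in their supplementary material), a seven-layer fully connected regressor mapping a flattened diffraction-intensity profile to the pair (W, O) could be sketched as follows; all layer widths here are assumptions:

```python
import numpy as np

# Minimal sketch of a fully connected regressor: seven weight matrices
# (seven fully connected layers), ReLU hidden activations, linear output
# predicting the two slit parameters (width W, offset O).
rng = np.random.default_rng(0)
layer_sizes = [256, 128, 64, 32, 16, 8, 4, 2]  # input profile -> ... -> (W, O)

weights = [rng.standard_normal((m, n)) * np.sqrt(2.0 / m)
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)  # ReLU hidden layers
    return x @ weights[-1]          # linear output: predicted (W, O)

profile = rng.random(256)           # stand-in for a recorded intensity profile
print(forward(profile).shape)       # (2,)
```

In practice, the weights would of course be fitted to the training set of known slits rather than drawn at random; the sketch only shows the input/output structure of the retrieval.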
The deep learning process operates on diffraction patterns created by slits of known widths W and offsets O at different distances H from the sample. Our analysis below aims to answer the following questions: (a) What accuracy of slit width W measurement can be achieved if information on the intensity profile of the diffraction pattern is used? (b) Since the power of light passing through the slit increases with the slit width, what accuracy of slit width W measurement can be achieved if only the overall power of transmitted light is recorded?
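Question (b) amounts to a power-only baseline: calibrate the total transmitted power against known widths and invert the calibration for unseen slits. A toy sketch with synthetic data (the linear power-width relation, its coefficients, and the noise level are assumptions for illustration, not measured values):

```python
import numpy as np

# Toy power-only width estimator (question b), on synthetic data.
rng = np.random.default_rng(7)
true_w = rng.uniform(50.0, 300.0, 756)              # training widths, nm
power = 0.01 * true_w + rng.normal(0.0, 0.15, 756)  # toy transmitted power

slope, intercept = np.polyfit(true_w, power, 1)     # calibration fit

def estimate_width(p):
    """Invert the power-width calibration for an unseen slit."""
    return (p - intercept) / slope

# A slit transmitting power 1.5 should come out near 150 nm.
print(estimate_width(1.5))
```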
In practical terms, the main challenge in implementing deeply sub-wavelength optical metrology is creating a trustworthy training set for deep learning. Such a dataset can be either virtual or physical. A virtual training dataset of objects and their diffraction scattering patterns can be generated by numerical modeling (solving Maxwell's equations). Here, the main challenge is to ensure that the computer model is meticulously congruent with the physical realization of the optical instrument, which may be problematic. Alternatively, a physical dataset can be created by fabricating a number of real scattering elements (e.g., slits) and recording their real scattering patterns. Generating a physical set is labor-intensive, but such a set is naturally congruent with the metrology instrument.
We chose a physical dataset for training and validation, created by fabricating a number of slits and recording their scattering patterns in the optical instrument. We fabricated a set of 840 slits of random widths and offsets by focused ion beam (FIB) milling in a 50 nm thick chromium film deposited on a sapphire substrate. In the set, the slit widths W were randomly chosen in the interval from 0.079 λ to 0.47 λ (50 nm–300 nm). The slit offset O was randomized in the interval from −0.79 λ to 0.79 λ (−500 nm to +500 nm). Here, all slit widths are well below λ/2, and hence, their structure would be considered beyond the “diffraction limit” of conventional microscopy. Upon fabrication, the slits were measured with a scanning electron microscope (SEM). For comparison (see below), we also used the width values prescribed for FIB fabrication as ground truth.
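For illustration, drawing a randomized set of slit parameters with the ranges stated above (840 slits, widths 50–300 nm, offsets ±500 nm) could look like this; the uniform distribution is an assumption:

```python
import numpy as np

# Randomized slit parameters with the ranges stated in the text:
# 840 slits, widths 50-300 nm (0.079-0.47 lambda), offsets -500..+500 nm.
rng = np.random.default_rng(42)
n_slits = 840
widths_nm = rng.uniform(50.0, 300.0, n_slits)
offsets_nm = rng.uniform(-500.0, 500.0, n_slits)

wavelength_nm = 633.0
# All widths lie between 0.079 and 0.47 wavelengths, i.e. below lambda/2.
print(widths_nm.min() / wavelength_nm, widths_nm.max() / wavelength_nm)
```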
Upon completion of the training with 756 slits of a priori known widths and positions, the apparatus was ready to measure unseen slits. To minimize errors related to the order in which the network was trained, we repeated the training 100 times with randomized training sets. The retrieved parameters of the unseen slits were averaged over the 100 realizations of the trained networks. The standard errors of the means related to training with randomized sets (∼0.001 λ) were negligible in comparison with the other random errors of the measurement relative to the ground truth data.
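The ensemble step can be sketched as follows: retrieved values from the 100 retrained networks are averaged, and the standard error of the mean quantifies the training-order noise. The numbers below are synthetic stand-ins, not experimental values:

```python
import numpy as np

# Ensemble averaging over retrained networks (synthetic stand-in data):
# 100 retrievals of one slit's width, scattered around 150 nm.
rng = np.random.default_rng(1)
n_networks = 100
retrieved = 150.0 + rng.normal(0.0, 5.0, n_networks)  # nm, one slit, 100 nets

mean_width = retrieved.mean()                      # reported width
sem = retrieved.std(ddof=1) / np.sqrt(n_networks)  # training-order noise
print(mean_width, sem)
```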
The results of the validation experiments on 84 randomly selected slits of unknown dimensions are presented in Fig. 2. It shows the optically measured values of the slit width against the ground truth values, measured in four independent experiments at distances H = 0 λ, 2 λ, 5 λ, and 10 λ from the sample. We trained 100 realizations of the network for each H independently. Here, the retrieved values are plotted as a function of the ground truth values obtained by SEM analysis. The red dashed line represents perfect agreement, while deviation from the line indicates inaccuracies of the measurement. Figure 2(a) shows results for width measurements that use only information on the power of transmitted light (WP), while Figs. 2(b) and 2(c) show the measurement results obtained by the analysis of the diffraction pattern (WD) by neural networks trained using known widths from either SEM inspection [Fig. 2(b)] or the widths prescribed for FIB fabrication [Fig. 2(c)]. Figure 2(d) compares widths from SEM measurements to the widths prescribed for FIB fabrication.
Analysis of the diffraction patterns offers much better metrology than analysis based on intensity alone. Indeed, the standard deviation between slit widths measured optically and by the scanning electron microscope when only information on the total power of light transmitted through the slit was used, σP–SEM = 34 nm, is five times higher than the corresponding standard deviation when the diffraction patterns were analyzed, σD–SEM = 6.7 nm.
To quantify the metrology, we calculated the standard deviations (a) between slit widths measured optically and by the scanning electron microscope (σD–SEM = 6.7 nm), (b) between slit widths measured optically and those prescribed for fabrication by focused ion beam milling (σD–FIB = 6.0 nm), and (c) between slit widths measured by the scanning electron microscope and those prescribed for fabrication by focused ion beam milling (σSEM–FIB = 5.7 nm).
These standard deviations can be presented as resulting from the accumulated errors of two-stage processes. The deviation σD–SEM results from the errors of measuring the slit widths optically (by analyzing the diffraction pattern), σD–0, and the errors of measuring the slit widths with the scanning electron microscope, σSEM–0. The deviation σD–FIB results from the combined errors of measuring optically, σD–0, and from the discrepancy between the actual values of the ground truth and those prescribed for fabrication by FIB, σFIB–0. Therefore, σ²D–SEM = σ²D–0 + σ²SEM–0, σ²D–FIB = σ²D–0 + σ²FIB–0, and σ²SEM–FIB = σ²SEM–0 + σ²FIB–0. From here, the standard deviations of the optical and scanning electron microscopy measurements and the deviations between the achieved slit widths and those prescribed for fabrication by FIB can be readily evaluated: σD–0 = 4.9 nm, σSEM–0 = 4.6 nm, and σFIB–0 = 3.4 nm.
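The three quadrature relations above can be solved in closed form for the component errors. Evaluating them from the rounded deviations quoted in the text reproduces the stated values to within ~0.1 nm (the small residual differences come from the rounding of the inputs):

```python
import math

# Solving the quadrature relations for the component errors (values in nm):
#   sigma_D_SEM^2   = sigma_D0^2  + sigma_SEM0^2
#   sigma_D_FIB^2   = sigma_D0^2  + sigma_FIB0^2
#   sigma_SEM_FIB^2 = sigma_SEM0^2 + sigma_FIB0^2
s_d_sem, s_d_fib, s_sem_fib = 6.7, 6.0, 5.7

sigma_d0 = math.sqrt((s_d_sem**2 + s_d_fib**2 - s_sem_fib**2) / 2)
sigma_sem0 = math.sqrt((s_d_sem**2 + s_sem_fib**2 - s_d_fib**2) / 2)
sigma_fib0 = math.sqrt((s_d_fib**2 + s_sem_fib**2 - s_d_sem**2) / 2)

# Approximately 4.9, 4.5, 3.4 nm, consistent with the quoted
# 4.9, 4.6, 3.4 nm given the rounded inputs.
print(sigma_d0, sigma_sem0, sigma_fib0)
```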
Now, we shall note that the measurements taken at four different distances from the sample (at H = 0 λ, 2 λ, 5 λ, and 10 λ) are independent. They return similar values: 5.4 nm, 6.3 nm, 5.6 nm, and 6.4 nm at H = 0 λ, 2 λ, 5 λ, and 10 λ, respectively. Data obtained at different distances show a very high average correlation coefficient, ⟨r⟩ = 0.998 76 (see the supplementary material for details). Therefore, we argue that after performing K = 4 statistically independent measurements at different distances, the accuracy of determining the slit width can be evaluated as σ²D–0(K = 4) = σ²D–0/K = (2.5 nm)², i.e., approximately a factor of 2 better than that for a single-shot measurement at any of the used distances.
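The factor-of-2 gain follows from the usual 1/√K scaling of the error of the mean of K independent measurements:

```python
import math

# 1/sqrt(K) scaling of the error of the mean: combining K = 4 statistically
# independent single-shot measurements (sigma_D0 = 4.9 nm from the text).
sigma_d0_nm = 4.9
K = 4
sigma_combined_nm = sigma_d0_nm / math.sqrt(K)
print(sigma_combined_nm)  # 2.45 nm, i.e. the ~2.5 nm quoted in the text
```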
Therefore, the standard deviation of the optical measurements, which is a measure of the technique’s accuracy, is 4.9 nm or ∼λ/130 (single-shot measurement) and 2.5 nm or ∼λ/260 after measurements at four different distances (λ = 633 nm). This accuracy is comparable with the accuracies of the FIB milling (3.4 nm) and SEM measurements (4.6 nm) that we determined. Here, the accuracies of the FIB/SEM data suffer from pixelation effects, the finite size of the FIB/SEM hotspot, ion/electron beam current instabilities, and charging artifacts.
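The wavelength-fraction figures follow directly from the absolute accuracies (λ = 633 nm); with the rounding used in the text, 4.9 nm and 2.5 nm correspond to ∼λ/130 and ∼λ/260:

```python
# Converting absolute accuracies to fractions of the wavelength.
wavelength_nm = 633.0
for sigma_nm in (4.9, 2.5):
    print(f"lambda/{wavelength_nm / sigma_nm:.0f}")
# 633/4.9 ~ 129 and 633/2.5 ~ 253, i.e. ~lambda/130 and ~lambda/260.
```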
There is a crucial difference between the optical techniques and the SEM technique. The SEM measurement requires raster scanning of the sample followed by image analysis and can only be performed at a very low rate, up to a few frames per second at best. The optical measurements are single-shot and can be performed at a high repetition rate. In real-time mode, a kilohertz measurement rate can be achieved, since it is only limited by the time needed for the neural network to process one diffraction pattern (in our case, ∼1 ms). This rate can be much higher in the binning mode, with sequential storage of diffraction patterns in the camera memory. Indeed, ultra-high-speed cameras are currently reaching hundreds of megahertz frame rates.12 The images might be sequentially post-processed by the neural network, achieving high-rate metrology that allows the study of transient dynamic processes in nanostructures.
To back up our claims of the extraordinary level of single-shot accuracy in our optical experiments, and to help in understanding the sources of error, we conducted a full computer modeling experiment. For this, we took the same numbers of slits for validation and training as in the experiment (84 and 756, respectively). The experimentally recorded diffraction patterns were substituted with diffraction patterns calculated using vector diffraction theory.13 The modeling experiment returns an accuracy of ∼λ/1000 (0.60 nm in absolute terms) for the slit width W measured at distances H = 2 λ, 5 λ, and 10 λ. The accuracy of the modeling experiment is one order of magnitude higher than that of the physical experiment. This is highly encouraging, since it indicates that the main factors limiting the experimental accuracy, such as the mechanical instability of the apparatus, laser instabilities, and, to a lesser extent, pixelation of the image sensor, can be mitigated, and thus, an accuracy surpassing λ/1000 can be achieved.
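The synthetic patterns above are computed with full vector diffraction theory (Ref. 13). As a much simpler scalar stand-in, the far-field Fraunhofer intensity of a single slit of width W illustrates how the pattern encodes the width; the slit width below is an illustrative choice:

```python
import numpy as np

# Scalar Fraunhofer stand-in for the full vector calculation:
#   I(theta) ~ sinc(W*sin(theta)/lambda)**2, sinc(x) = sin(pi*x)/(pi*x)
# (np.sinc uses this normalized convention).
wavelength_nm = 633.0
width_nm = 200.0
theta = np.linspace(-np.pi / 2, np.pi / 2, 2001)   # observation angles
intensity = np.sinc(width_nm * np.sin(theta) / wavelength_nm) ** 2

# For a sub-wavelength slit the central lobe is wider than the captured
# angular range, so the intensity stays high even at grazing angles.
print(intensity[1000], intensity[0])  # on-axis maximum vs theta = -90 deg
```

The width-dependence of this angular envelope is, in scalar terms, the information the neural network exploits when retrieving W from the recorded patterns.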
We note that in our real and numerical experiments, we also measured the a priori unknown random offset of the slits. In a practical application, absolute positioning of the object in the field of view (FOV) with nanometric precision is impossible. Instead, we measured relative offsets from the center of the FOV (the alignment marks were used). Slit offset and width measurements can be obtained simultaneously from the same images. We found that the slit offset can also be retrieved with sub-wavelength accuracy (see the supplementary material).
The experimentally observed accuracy of width measurements, ∼λ/130 (or ∼λ/260), exceeds the λ/2 “diffraction limit” of conventional optical microscopes by a factor of 65 (or 130). This brings artificial-intelligence-enabled optical metrology to the nanoscale. This accuracy challenges the resolution of advanced tools such as focused ion beam milling. We therefore argue that the deep learning process involving a neural network trained on a priori known objects creates a powerful and accurate measurement algorithm. Remarkably, such accuracy is achieved with a small physical dataset comprising fewer than a thousand slits of a priori known sizes. Our numerical modeling indicates that single-shot sub-nanometric accuracy better than λ/1000 is possible, thus reaching molecular-level dimensions. Moreover, accuracy will improve even further when topologically structured light is used for illumination.11 However, further improvements in accuracy and precision will require larger training sets and considerable improvements in the mechanical stability of the metrology apparatus.
Finally, we highlight the simplicity of the optical instrument needed for the diffraction metrology (in comparison with complex and heavy instruments based on conventional interferometric techniques), its high throughput, and the ease of sample preparation (e.g., in comparison with SEM metrology, no vacuum is needed). The metrology technique is insensitive to where the object is placed in the field of view, and the instrument involves no moving parts; it is therefore suitable for future smart-manufacturing applications with machine tools. Moreover, the demonstrated metrology is a subset of the more general ill-posed inverse scattering problem, namely the retrieval of an image from its diffraction pattern, which can be reduced mathematically to a Fredholm integral equation. It has been proven mathematically14 that neural networks are very efficient in solving this sort of problem, as is confirmed experimentally by the deeply sub-wavelength accuracy achieved in the experiments reported here. The mathematically proven suitability of the connectionist approach to such tasks gives us confidence that our methodology will work with different types of objects and can be extended to the simultaneous measurement of several of an object's dimensions. At the same time, we expect that for higher-dimensional tasks, larger training sets and more complex networks will be required to achieve the same level of accuracy.
N.I.Z. conceived the idea of the experiment, wrote the manuscript, and supervised the project. C.R.-B., G.A., and G.Y. fabricated the samples. C.R.-B. and G.Y. constructed the optical apparatus. C.R.-B. took optical measurements and generated the experimental and theoretical datasets. E.A.C. and T.P. developed the artificial neural network. E.A.C. performed simulations and assessment of simulated data. G.Y. developed the diffraction module for simulations. All co-authors conducted the data analysis and edited the manuscript.
See the supplementary material for a description of the accuracy of position measurements, numerical calculations of the angular width of the diffraction pattern for slits of different widths, details of the artificial neural network, definitions of statistical measures, and experimental data relating to the slit widths retrieved by the artificial neural network.
The authors are grateful to Nikitas Papasimakis, Ilay Kuprov and Bruce Ou for discussions.
The authors acknowledge the Singapore Ministry of Education (Grant No. MOE2016-T3-1-006); the Agency for Science, Technology and Research (A*STAR), Singapore (Grant No. SERC A1685b0005); and the Engineering and Physical Sciences Research Council UK (Grants No. EP/N00762X/1 and No. EP/M0091221), and the European Research Council (Advanced grant FLEET-786851). T.P. acknowledges support from the China Scholarship Council (CSC No. 201804910540).
The authors declare no competing interests.
Experimental data relating to the slit widths retrieved by the artificial neural network and the ground truth values are available in the supplementary material. Following a period of embargo, the data from this paper can be obtained from the University of Southampton research repository at https://doi.org/10.5258/SOTON/D1854.