The use of multiple diverse measurements can make lensless phase retrieval more robust. Conventional diversity functions include aperture diversity, wavelength diversity, translational diversity, and defocus diversity. Here we discuss a lensless imaging scheme that employs multiple spherical-wave illuminations from a light-emitting diode array as diversity functions. In this scheme, we place a binary mask between the sample and the detector to impose a support constraint on the phase retrieval process. This support constraint enforces the light field to be zero at certain locations and is similar to the aperture constraint in Fourier ptychographic microscopy. We use a self-calibration algorithm to correct the misalignment of the binary mask. The efficacy of the proposed scheme is first demonstrated by simulations where we evaluate the reconstruction quality using the mean square error and the structural similarity index. The scheme is then experimentally tested by recovering images of a resolution target and biological samples. The proposed scheme may provide new insights for developing compact and large field-of-view lensless imaging platforms. The use of the binary mask can also be combined with other diversity functions to better constrain the phase retrieval solution space. We provide the open-source implementation code for the broad research community.

## I. INTRODUCTION

Common image recording devices are generally incapable of recording phase information of the incoming light waves. Such a challenge has stimulated the development of phase retrieval techniques using intensity-only measurements. As with many inverse problems, a common formulation of the phase retrieval problem is to seek a solution that is consistent with the intensity measurements of the object. The Gerchberg-Saxton (GS) algorithm,^{1} as well as its related error reduction algorithms,^{2,3} is one of the earliest numerical schemes for this type of problem. It consists of alternating enforcement of the known information (constraints) of the object in the spatial and/or Fourier domains. It has been shown that it is often possible to reconstruct the object from a single intensity pattern with strong object constraints, such as a non-negativity constraint or an object sparsity constraint.^{4–7} However, if strong object constraints do not exist, the solution may not converge, and the reconstruction quality deteriorates due to the stagnation and ambiguity problems.^{2,8} To overcome these limitations, the phase diversity technique has been proposed and developed.^{9–12} This technique relies on measuring many intensity patterns of the same object, as opposed to a single intensity measurement. A known modification of the optical setup (diversity function) is introduced between different measurements. Stagnation and ambiguity problems can be overcome by providing a set of measurements that more robustly constrain the solution space. Different types of diversity functions have been successfully demonstrated over the past decades, including nonlinear diversity,^{13} sub-aperture piston diversity,^{14} wavelength diversity,^{15–17} translational diversity in ptychography,^{18–20} defocus diversity,^{21} and angular diversity in intensity-based synthetic aperture imaging.^{22,23}

Inspired by the configuration of Fourier ptychographic microscopy (FPM),^{23} here we discuss a lensless imaging implementation that employs a binary mask as a support constraint and uses multiple spherical wave illuminations as diversity functions. In FPM, the objective lens and the tube lens perform two Fourier transform operations of the object’s light field. In the proposed lensless scheme, we replace these two Fourier transform operations with two free-space propagations. The binary mask in the lensless scheme enforces the light field to be zero at certain locations in a controllable manner, and it is similar to the Fourier aperture constraint in FPM. In our implementation, we use a self-calibration algorithm to correct the misalignment of the binary mask, and this correction process is similar to the pupil recovery scheme in FPM.^{24} The efficacy of the proposed scheme is first demonstrated by simulations where we evaluate the quality of the reconstruction using the mean square error (MSE) and the structural similarity index (SSIM). The scheme is then experimentally demonstrated by recovering images of a resolution target and biological samples. We show that the lensless prototype platform is able to image confluent samples over the entire surface of an image sensor. Elimination of costly imaging lenses is especially advantageous for applications in biomedical screening, where the use of cost-effective arrayed tools is preferred to conduct high-throughput analysis. The proposed scheme may provide new insights for developing compact and large field-of-view lensless imaging platforms. The use of the binary mask can be combined with other diversity functions to better constrain the phase retrieval solution space. Finally, we provide the open-source implementation code and Video 1 in the supplementary material.

This paper is structured as follows. In Sec. II, we will introduce the proposed lensless imaging scheme. We will then discuss the forward imaging model and the reconstruction process. In Sec. III, we will discuss the performance using simulations. In particular, we will show that the use of the binary mask is essential for a successful reconstruction. In Sec. IV, we will demonstrate the use of a prototype device to acquire images of a resolution target and biological samples. Finally, we will summarize the results and discuss the future directions, including sub-pixel implementation, multi-state modeling, and the optimal design of the binary mask.

## II. IMAGING MODEL AND RECONSTRUCTION PROCESS

The proposed lensless imaging scheme shares its roots with the FPM approach. In Fig. 1, we show the comparison between a typical FPM setup [Fig. 1(a)] and the proposed lensless imaging setup [Fig. 1(b)]. In the FPM setup, the objective lens and the tube lens perform two Fourier transform operations of the object’s light field. In the proposed lensless scheme, we replace these two Fourier transform operations with two free-space propagations over distances *d*_{1} and *d*_{2}. In the FPM setup, the support constraint is imposed by a circular aperture at the Fourier plane. In the lensless setup, we replace this aperture with a binary mask placed in between the object and the image sensor. In both settings, we use a light-emitting diode (LED) array for sample illumination. By illuminating the sample with different LED elements, we acquire multiple intensity images of the sample and use them to recover the complex sample image in the reconstruction process. Video 1 of the supplementary material visualizes the operation of the proposed lensless scheme.

The differences between the FPM and the proposed lensless imaging scheme are twofold. First, the resolution of FPM is determined by the numerical aperture (NA) of the objective and the largest incident angle. In the lensless imaging scheme, there is no low-pass filtering process imposed by the binary mask. If we do not implement the subpixel sampling scheme,^{23,25} the achievable resolution will be determined by the pixel size of the image sensor. The objective lens used in FPM is typically corrected for chromatic aberrations, and thus, the chromatic dispersion has little effect on the achievable resolution. The lensless imaging scheme, on the other hand, solely relies on free-space propagation. In Sec. IV, we will also show that the resolution of the lensless scheme is affected by the bandwidth of the illumination spectrum. Second, the reconstruction process of FPM aims to stitch the information in the Fourier plane and recover the complex sample information at the same time. In the proposed lensless scheme, the reconstruction process is to impose the binary-mask constraint to better recover the sample image. Thus, the use of different spherical wave illuminations in the lensless scheme is to better constrain the solution space instead of enlarging the Fourier passband.

### A. Forward imaging model

The forward imaging model of the lensless scheme is shown in Fig. 2. We use this model to establish the relationship between the complex object image *O*(*x*, *y*) and the *i*th intensity measurement *I*_{i}(*x*, *y*). It can be explained in the following four steps.

First, the object is illuminated by the *i*th LED element in the array. In our implementation, we model the illumination light field from the *i*th LED element as a spherical wave

$$P_i(x, y) = \frac{e^{j \cdot 2\pi r / \lambda}}{r}, \tag{1}$$

where $r = \sqrt{(x - x_{\mathrm{LED}_i})^2 + (y - y_{\mathrm{LED}_i})^2 + d_0^2}$, $(x_{\mathrm{LED}_i}, y_{\mathrm{LED}_i})$ denotes the location of the *i*th LED element, *d*_{0} denotes the distance between the object plane and the LED-array plane, *j* is the imaginary unit, and *λ* is the central wavelength of the illumination. With the LED illumination, the resulting light field below the object plane in Fig. 2 can be written as *U*_{1}(*x*, *y*) = *O*(*x*, *y*)·*P*_{i}(*x*, *y*), where “·” denotes element-wise multiplication.
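As a concrete illustration, the spherical-wave illumination can be sampled on the object-plane grid. Below is a minimal NumPy sketch; the grid size and the LED position are hypothetical choices for illustration, while the wavelength, pixel size, and distances follow Table I:

```python
import numpy as np

# Illustrative sampling of the i-th spherical-wave illumination P_i.
wavelength = 530e-9            # central wavelength lambda (m), per Table I
pixel = 3.45e-6                # sampling interval at the object plane (m)
n = 256                        # grid size in pixels (illustrative choice)
d0 = 269e-3                    # LED-array-to-object distance (m), per Table I
x_led, y_led = 4e-3, 0.0       # assumed location of the i-th LED element (m)

# Object-plane coordinates centered on the optical axis.
x = (np.arange(n) - n // 2) * pixel
X, Y = np.meshgrid(x, x)

# Distance from the LED to every object-plane point, then the
# spherical wave P_i = exp(j * 2*pi*r / lambda) / r of Eq. (1).
r = np.sqrt((X - x_led) ** 2 + (Y - y_led) ** 2 + d0 ** 2)
P_i = np.exp(1j * 2 * np.pi * r / wavelength) / r
```

Because *d*_{0} is much larger than the field of view, the 1/*r* amplitude of `P_i` is nearly constant over the grid.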

Second, the light field *U*_{1}(*x*, *y*) propagates from the object plane to the mask plane, and the resulting light field above the mask plane can be written as *U*_{2}(*x*, *y*) = *PSF*_{1} $*$ *U*_{1}(*x*, *y*), where *PSF*_{1} denotes the point spread function (PSF) for free-space propagation over distance *d*_{1} and “$*$” denotes convolution. The convolution process is typically implemented in the Fourier space as follows:^{26,27}

$$\hat{U}_2(k_x, k_y) = \hat{U}_1(k_x, k_y) \cdot e^{j \sqrt{k_0^2 - k_x^2 - k_y^2} \cdot d_1}. \tag{2}$$

In Eq. (2), (*k*_{x}, *k*_{y}) represent the coordinates in the Fourier space, $\hat{U}_1(k_x, k_y)$ and $\hat{U}_2(k_x, k_y)$ are the Fourier spectra of *U*_{1}(*x*, *y*) and *U*_{2}(*x*, *y*), respectively, *k*_{0} is the wave number, equal to 2*π*/*λ*, and $e^{j \sqrt{k_0^2 - k_x^2 - k_y^2} \cdot d_1}$ is the Fourier transform of *PSF*_{1}. Based on Eq. (2), *U*_{2}(*x*, *y*) can be obtained by an inverse Fourier transform of $\hat{U}_2(k_x, k_y)$.
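Equation (2) is the standard angular-spectrum method. A minimal NumPy sketch follows; the function name and the zeroing of evanescent components are our choices, not prescribed by the paper:

```python
import numpy as np

def propagate(u, d, wavelength, pixel):
    """Angular-spectrum free-space propagation of field u over distance d.

    Implements Eq. (2): multiply the Fourier spectrum of the field by the
    transfer function exp(j * sqrt(k0^2 - kx^2 - ky^2) * d).
    A negative d back-propagates the field.
    """
    n = u.shape[0]
    k0 = 2 * np.pi / wavelength
    # Spatial-frequency coordinates (kx, ky) in numpy's FFT layout.
    k = 2 * np.pi * np.fft.fftfreq(n, d=pixel)
    KX, KY = np.meshgrid(k, k)
    kz_sq = k0 ** 2 - KX ** 2 - KY ** 2
    kz = np.sqrt(np.maximum(kz_sq, 0.0))
    H = np.exp(1j * kz * d) * (kz_sq > 0)   # suppress evanescent components
    return np.fft.ifft2(np.fft.fft2(u) * H)
```

Propagating by *d* and then by −*d* recovers the original field, which is the property the backward flow of the reconstruction relies on.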

Third, the light field *U*_{2}(*x*, *y*) is modulated by the binary mask *M*(*x*, *y*). This process can be written as *U*_{3}(*x*, *y*) = *U*_{2}(*x*, *y*)·*M*(*x*, *y*).

Fourth, the light field propagates from the mask plane to the detector plane, yielding the light field distribution *U*_{4}(*x*, *y*) = *PSF*_{2} $*$ *U*_{3}(*x*, *y*), where *PSF*_{2} denotes the PSF for free-space propagation over distance *d*_{2}. The distance between the mask and the object is related to the feature size of the mask pattern. With a smaller feature size, the distance can be shorter because different LED elements generate different mask-projected patterns on the object. On the other hand, a smaller feature size on the mask requires a higher precision for mask alignment. The image sensor finally records the intensity variation of the light field *U*_{4}(*x*, *y*), and the captured intensity image *I*_{i}(*x*, *y*) can be written as

$$I_i(x, y) = |U_4(x, y)|^2. \tag{3}$$
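Putting the four steps together, the forward model maps the object to one captured intensity image. A self-contained NumPy sketch is given below; the function and variable names are ours, and the propagation helper re-implements Eq. (2):

```python
import numpy as np

def asm(u, d, wavelength, pixel):
    # Angular-spectrum free-space propagation over distance d, per Eq. (2).
    k0 = 2 * np.pi / wavelength
    k = 2 * np.pi * np.fft.fftfreq(u.shape[0], d=pixel)
    KX, KY = np.meshgrid(k, k)
    kz = np.sqrt(np.maximum(k0 ** 2 - KX ** 2 - KY ** 2, 0.0))
    return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * d))

def forward(O, P_i, M, d1, d2, wavelength=530e-9, pixel=3.45e-6):
    """Four-step forward model: illuminate, propagate d1, mask, propagate d2."""
    U1 = O * P_i                            # step 1: spherical-wave illumination
    U2 = asm(U1, d1, wavelength, pixel)     # step 2: object plane -> mask plane
    U3 = U2 * M                             # step 3: binary-mask modulation
    U4 = asm(U3, d2, wavelength, pixel)     # step 4: mask plane -> detector plane
    return np.abs(U4) ** 2                  # intensity recorded by the sensor
```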

### B. Image reconstruction with a known mask

In the forward imaging model, each raw image is obtained under a specific spherical-wave illumination *P*_{i}(*x*, *y*). The goal is to recover the object *O*(*x*, *y*) from many intensity measurements *I*_{i}(*x*, *y*) (*i* = 1, 2, 3, …). If the mask is known, we can use the alternating projection (AP) algorithm^{3,28} combined with the stochastic gradient descent method to recover the object. The reconstruction process is shown in Fig. 2. The key operations are to update the light field at the detector plane, the mask plane, and the object plane. At the detector plane, the amplitude is updated by the captured image while the phase is kept unchanged, which is a projection operation. At the mask plane, the light field is updated at the empty region of the mask while it is kept unchanged at the dark region, which is another projection operation. At the object plane, we use the stochastic gradient descent method to update the object. The detailed reconstruction process is explained in the following.

It starts with an initial guess of the object *O*_{g}(*x*, *y*), where the subscript “g” means “guess.” A constant matrix can be used as the initial guess. Based on the forward imaging model discussed in Sec. II A, we can obtain the light field *U*_{4}(*x*, *y*) at the detector plane. This light field is then updated by replacing its amplitude with the square root of the captured image *I*_{i}(*x*, *y*), while keeping the phase unchanged

$$U_4'(x, y) = \sqrt{I_i(x, y)} \cdot e^{j \cdot \arg(U_4(x, y))}, \tag{4}$$

where $\arg(\,)$ represents taking the argument of a complex number (i.e., the phase is kept unchanged). The updated *U*_{4}(*x*, *y*) is denoted as $U_4'(x, y)$ in Eq. (4). Following the backward flow in Fig. 2, the updated light field $U_4'(x, y)$ is then propagated back to the object plane. In this process, *U*_{3}, *U*_{2}, and *U*_{1} are accordingly updated to $U_3'$, $U_2'$, and $U_1'$ as follows:

$$U_3'(x, y) = \mathrm{PSF}_2^{-1} * U_4'(x, y), \tag{5}$$

$$U_2'(x, y) = U_2(x, y) + \mathrm{conj}(M(x, y)) \cdot \big(U_3'(x, y) - U_3(x, y)\big), \tag{6}$$

$$U_1'(x, y) = \mathrm{PSF}_1^{-1} * U_2'(x, y), \tag{7}$$

where $\mathrm{PSF}_1^{-1}$ and $\mathrm{PSF}_2^{-1}$ denote the PSFs for free-space propagation over distances −*d*_{1} and −*d*_{2} (i.e., back-propagation) and $\mathrm{conj}(\,)$ denotes a complex conjugate. With the binary mask, Eq. (6) updates the light field at the empty regions of the mask while keeping it unchanged at the dark regions. Finally, we can update the object using the stochastic gradient descent method as follows:

$$O_g(x, y) \leftarrow O_g(x, y) + \frac{\mathrm{conj}(P_i(x, y))}{\max_{x,y}\{|P_i(x, y)|^2\}} \cdot \big(U_1'(x, y) - U_1(x, y)\big). \tag{8}$$

It has been shown that the use of stochastic gradient descent often leads to faster convergence of the solution.^{29,30} The step size $\max_{x,y}\{|P_i(x, y)|^2\}$ in Eq. (8) is inspired by the ptychographic algorithm^{29} and is related to Lipschitz constants.^{31} The above updating process is repeated for the other LED elements to complete one loop. The entire process is then repeated until the solution converges. We typically use 2–20 loops in our implementations. The reconstruction process is summarized in Algorithm 1.
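For concreteness, an Algorithm-1-style loop can be sketched as follows. This is a simplified NumPy re-implementation under the stated model, not the authors' released code; the names and loop structure are our choices:

```python
import numpy as np

def asm(u, d, wl, px):
    # Angular-spectrum propagation, per Eq. (2); negative d back-propagates.
    k0 = 2 * np.pi / wl
    k = 2 * np.pi * np.fft.fftfreq(u.shape[0], d=px)
    KX, KY = np.meshgrid(k, k)
    kz = np.sqrt(np.maximum(k0 ** 2 - KX ** 2 - KY ** 2, 0.0))
    return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * d))

def reconstruct(images, probes, M, d1, d2, wl, px, loops=5):
    """Alternating-projection recovery of the object from intensity images."""
    O = np.ones_like(images[0], dtype=complex)     # constant initial guess
    for _ in range(loops):
        for I_i, P_i in zip(images, probes):
            # Forward pass through the four-step model.
            U1 = O * P_i
            U2 = asm(U1, d1, wl, px)
            U3 = U2 * M
            U4 = asm(U3, d2, wl, px)
            # Detector-plane projection: replace amplitude, keep phase (Eq. (4)).
            U4p = np.sqrt(I_i) * np.exp(1j * np.angle(U4))
            U3p = asm(U4p, -d2, wl, px)
            # Mask-plane projection: update open regions, keep dark regions.
            U2p = U3p * M + U2 * (1 - M)
            U1p = asm(U2p, -d1, wl, px)
            # Stochastic-gradient object update (Eq. (8)).
            O = O + np.conj(P_i) / np.max(np.abs(P_i) ** 2) * (U1p - U1)
    return O
```

For a binary (0/1) mask, the mask-plane projection above coincides with the conj(*M*)-weighted update of Eq. (6).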

### C. Image reconstruction with mask updating

The modeling of the binary mask is critical in the proposed lensless imaging scheme. A small rotation (<1°) or a positional shift of the mask would lead to a mismatch between the actual mask pattern and the one used in the imaging model. Such a mismatch could cause undesired artifacts in the recovered object and degrade the reconstruction quality. If the exact mask position is unknown, we can employ a self-calibration method to correct the misalignment and improve the robustness of the technique. The self-calibration method works by updating the mask pattern in every iteration using the following stochastic gradient descent equation:

$$M(x, y) \leftarrow M(x, y) + \frac{\mathrm{conj}(U_2(x, y))}{\max_{x,y}\{|U_2(x, y)|^2\}} \cdot \big(U_3'(x, y) - U_3(x, y)\big). \tag{9}$$

This updating process is similar to the probe update in ptychography^{32} or the pupil update in FPM.^{24} The image recovery process with the mask updating process is summarized as Algorithm 2.
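Under the same sketch conventions, the mask refinement can be written as an ePIE-style update. This is our paraphrase of the probe/pupil update the paper cites, with hypothetical variable names:

```python
import numpy as np

def update_mask(M, U2, U3, U3p, alpha=1.0):
    """ePIE-style mask refinement.

    U2  : field incident on the mask plane.
    U3  : exit field under the current mask guess (U3 = U2 * M).
    U3p : exit field after the detector-plane projection has been
          back-propagated to the mask plane.
    The mask is nudged toward consistency with the updated exit field,
    weighted by the conjugate of the incident field.
    """
    return M + alpha * np.conj(U2) / np.max(np.abs(U2) ** 2) * (U3p - U3)
```

When the incident field is known exactly, a single step with `alpha=1` already moves the mask guess onto the consistent solution wherever the incident field has unit amplitude.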

## III. SIMULATIONS

We present a series of simulations to validate the proposed lensless imaging scheme. As shown in Fig. 3, we use the “Lena” image as the intensity of the object and the “baboon” image as the phase of the object. We use a hole-array as the binary mask. Table I lists the other parameters used in the simulations. The open-source implementation code is provided in the supplementary material.

TABLE I. Parameters used in the simulations.

| Item | Value |
|---|---|
| Size of the simulated object image (pixels) | 700 × 700 |
| Pixel size of the camera (μm) | 3.45 |
| Number of the utilized LEDs | 9 × 9 |
| Central wavelength of the illumination (nm) | 530 |
| Distance between adjacent LEDs (mm) | 4 |
| Distance between the LED matrix and the object (mm) | 269 |
| Distance between the object and the mask (mm) | 1.20 |
| Distance between the mask and the detector (mm) | 0.93 |

The first simulation is to answer an important question: is the binary mask necessary in the lensless imaging scheme? In this simulation, we use a binary mask with 6 × 6 circular holes (Fig. 3). The light transmission is one in the regions of the circular holes and zero otherwise. The radius of each hole is 100 *μ*m and the distance between two adjacent holes is 250 *μ*m. A smaller feature size in the binary mask provides a better (tighter) support constraint for the phase retrieval process. However, if the feature size is too small, a small amount of mask misalignment would lead to a complete failure of the reconstruction process. The ideal case is to directly fabricate a mask pattern with fine features on top of the cover glass of the image sensor. The alignment of the mask and its pattern can be precisely controlled in this case. In our implementation, we fabricate the mask on a glass wafer and put it on top of the image sensor. The choice of the 100-*μ*m hole radius in our mask is a compromise between the effectiveness of the support constraint and the robustness against mask misalignment.
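The hole-array mask is straightforward to generate numerically. A sketch follows, assuming the 250-*μ*m figure is the center-to-center spacing of adjacent holes:

```python
import numpy as np

def hole_array_mask(n, pixel, holes, radius, spacing):
    """Binary mask with a centered holes-by-holes grid of circular openings.

    Transmission is 1 inside each hole and 0 elsewhere; spacing is the
    assumed center-to-center distance between adjacent holes.
    """
    x = (np.arange(n) - n // 2) * pixel
    X, Y = np.meshgrid(x, x)
    M = np.zeros((n, n))
    offset = (holes - 1) / 2.0
    for i in range(holes):
        for j in range(holes):
            cx = (i - offset) * spacing
            cy = (j - offset) * spacing
            M[(X - cx) ** 2 + (Y - cy) ** 2 <= radius ** 2] = 1.0
    return M

# 6x6 holes, 100-um radius, 250-um spacing, on a 700x700 grid (Sec. III values).
M = hole_array_mask(n=700, pixel=3.45e-6, holes=6, radius=100e-6, spacing=250e-6)
```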

Figure 4 shows the reconstructions without (left) and with (right) the binary masks. For both cases, we illuminate the object with 9 × 9 LED elements and use Algorithm 1 to recover the object images (for the case without the mask, we simply set *M* = 1 in Algorithm 1). The iterative recovery process is illustrated in Video 1 of the supplementary material. We can see that the binary mask is essential for the reconstruction process. The case without the mask fails to converge to the ground truth. This simulation justifies the use of the binary mask in the proposed lensless imaging scheme. The simulations in the following all employ the binary mask shown in Fig. 3.

The second simulation is to evaluate the recovery efficiency of the proposed scheme. Figure 5 shows the reconstruction with different loop numbers. We quantify the reconstruction quality using MSE and SSIM in Fig. 5(b). The results show that the proposed imaging scheme enables rapid convergence of the solution. High-quality recovery can be achieved with 3 loops.
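The MSE/SSIM curves in Fig. 5(b) can be produced with standard tools (e.g., scikit-image's `structural_similarity`). Below is a dependency-free sketch using a single global window, a simplification of the locally windowed SSIM the paper presumably uses:

```python
import numpy as np

def mse(a, b):
    # Mean square error between two images of equal shape.
    return np.mean((a - b) ** 2)

def global_ssim(a, b, L=1.0):
    # Simplified SSIM computed from global image statistics rather than
    # local sliding windows; L is the dynamic range of the images.
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2))
```

Identical images give an SSIM of 1 and an MSE of 0; uncorrelated images score far lower.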

The third simulation tests the performance against additive noise. In Fig. 6, we add different amounts of Gaussian noise to the captured intensity images. The added noise has a zero mean and a standard deviation of 1%–5%. As shown in Fig. 6, the quality of the reconstructions degrades only gradually as the noise level increases, indicating that the proposed scheme is robust against additive noise.

The last simulation tests the performance of the self-calibration method. In this simulation, we generate a misaligned mask by first rotating the original mask by 1°. We then shift it by 2 pixels along the x-direction and 3 pixels along the y-direction. In the reconstruction process, we use the original mask for enforcing the support constraints. Figure 7 shows the reconstructions without (left) and with (right) mask updating. We can see that the self-calibration process is able to correct the misalignment of the binary mask. In a practical imaging setup, the self-calibration process can be run at the beginning. The recovered mask pattern can then be used in the subsequent experiments.
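The misaligned mask for this test can be generated with `scipy.ndimage`; the rotation angle and shifts follow the text, while the nearest-neighbor interpolation is our choice to keep the mask strictly binary:

```python
import numpy as np
from scipy import ndimage

def misalign(M, angle_deg, shift_xy):
    """Simulate mask misalignment: rotate by angle_deg, then shift by
    (x, y) pixels.  Nearest-neighbor interpolation (order=0) keeps a
    binary mask binary; regions moved in from outside are filled with 0."""
    M_rot = ndimage.rotate(M, angle_deg, reshape=False, order=0)
    # ndimage.shift takes (row, col) = (y, x) ordering.
    return ndimage.shift(M_rot, shift=(shift_xy[1], shift_xy[0]), order=0)

# 1-deg rotation, 2-pixel x-shift, 3-pixel y-shift, as in the simulation:
# M_mis = misalign(M, 1.0, (2, 3))
```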

## IV. PROOF-OF-CONCEPT EXPERIMENT

We validate the proposed lensless scheme with a proof-of-concept experiment. In this experiment, we use an LED array for sample illumination. The measured bandwidth of the illumination is ∼30 nm. As shown in Fig. 8, the binary mask is fabricated on a glass wafer using laser direct writing procedures, and the pattern is the same as that used in simulations. We use a CMOS image sensor (PointGray CM3-U3-50S5M-CS) at the bottom to record the mask-modulated intensity images. The image sensor has 2448 × 2048 pixels and the pixel size is 3.45 *μ*m. Other parameters are the same as those used in simulations.

We first characterize the achievable resolution using a United States Air Force (USAF) resolution target. Figure 9(a1) shows a captured image without placing the binary mask in between the object and the image sensor. We can clearly see the diffraction effect due to free-space propagation. Figure 9(a2) shows the reconstruction using the same parameters as those in simulations. Figures 9(a3) and 9(a4) show the magnified views of Figs. 9(a1) and 9(a2). From Fig. 9(a4), we can see that the proposed scheme is able to recover the object with high image quality. We can resolve group 6, element 5 of the resolution target, corresponding to a linewidth of 4.9 *μ*m (i.e., 4.9-*μ*m half-pitch resolution). The optimal half-pitch resolution we can achieve should be the same as the pixel size of 3.45 *μ*m. The discrepancy between the achieved resolution and the optimal resolution is due to the 30-nm bandwidth of the illumination source.

To demonstrate this effect, we perform a simulation in Fig. 9(b). Figure 9(b1) shows a ground-truth image of the resolution target (we have scaled the size to match the real dimension). Figure 9(b3) shows the reconstruction under monochromatic illuminations (530 nm wavelength). In this case, group 7, element 2 (3.45-*μ*m linewidth) can be resolved, and it is consistent with the optimal resolution we can achieve using the 3.45-*μ*m pixel size. In Fig. 9(b4), we model the light source using the spectral curve in Fig. 9(b2). In this case, the captured intensity images are modeled as incoherent summations of 15 different wavelengths of the spectral band.^{33–35} We then recover the object assuming 530-nm monochromatic illuminations. As shown in Fig. 9(b4), group 6, element 5 of the resolution target can be resolved in this case, corresponding to a 4.9-*μ*m linewidth. This result is in good agreement with our experimental data in Fig. 9(a4). In other words, the 30-nm bandwidth of the LED illuminations leads to the discrepancy between the achieved resolution and the optimal resolution.
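The incoherent-summation model used in this simulation can be sketched as follows; the spectral weights are left generic, and the helper names are ours:

```python
import numpy as np

def asm(u, d, wl, px):
    # Angular-spectrum free-space propagation over distance d, per Eq. (2).
    k0 = 2 * np.pi / wl
    k = 2 * np.pi * np.fft.fftfreq(u.shape[0], d=px)
    KX, KY = np.meshgrid(k, k)
    kz = np.sqrt(np.maximum(k0 ** 2 - KX ** 2 - KY ** 2, 0.0))
    return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * d))

def polychromatic_image(O, M, d1, d2, px, wavelengths, weights, led_xy, d0):
    """Model a broadband capture as the weighted incoherent sum of the
    coherent intensities at each sampled wavelength of the spectrum."""
    n = O.shape[0]
    x = (np.arange(n) - n // 2) * px
    X, Y = np.meshgrid(x, x)
    r = np.sqrt((X - led_xy[0]) ** 2 + (Y - led_xy[1]) ** 2 + d0 ** 2)
    I = np.zeros((n, n))
    for wl, w in zip(wavelengths, weights):
        P = np.exp(1j * 2 * np.pi * r / wl) / r   # spherical wave at this wavelength
        U4 = asm(asm(O * P, d1, wl, px) * M, d2, wl, px)
        I += w * np.abs(U4) ** 2                  # the sensor adds intensities
    return I / np.sum(weights)
```

Sampling the measured spectrum at 15 wavelengths and weighting by the spectral curve reproduces the broadband capture; reconstructing such data with a single-wavelength model causes the resolution loss described above.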

We also test the prototype platform using a biological sample (a liver fluke transversal section) in Fig. 10. The iterative recovery process is illustrated in Video 2 of the supplementary material. Figure 10(a1) shows the captured image without placing the binary mask in between the object and the image sensor. Figures 10(a2)-10(a4) show three magnified views of Fig. 10(a1). From these magnified views, we can clearly see the diffraction effect caused by free-space propagation. Figure 10(b1) shows our reconstruction of the object. Figures 10(b2)-10(b4) show three magnified views of Fig. 10(b1), and Figs. 10(b5)-10(b7) show the corresponding phase images. In this experiment, we use 20 × 20 circular holes in the binary mask. Other parameters are the same as those in the previous experiment. The achieved field-of-view is ∼5.52 × 5.52 mm^{2} in this case, limited by the dimensions of the image sensor. We also note that there are some periodic grid artifacts in the recovered images in Figs. 10(b2)-10(b4). It may be better to design a non-periodic mask pattern to reduce such artifacts.

## V. SUMMARY AND DISCUSSION

In summary, we have discussed a lensless imaging scheme that employs multiple spherical-wave illuminations from an LED array as diversity functions. In this scheme, we place a binary mask between the sample and the detector for imposing support constraints for the phase retrieval process. This support constraint enforces the light field to be zero at certain locations in a controllable manner and is similar to the aperture constraint in FPM. We have demonstrated two algorithms for recovering the object from multiple intensity measurements. The first one is for the case with a precisely known mask. The second one updates the mask pattern in the iterative reconstruction process. The efficacy of the proposed scheme is first demonstrated by simulations where we evaluate the quality of the reconstruction using MSE and SSIM. The scheme is then experimentally demonstrated by recovering images of a resolution target and biological samples. We show that the achievable resolution is limited by both the pixel size of the detector and the spectral bandwidth of the illumination. We demonstrate a 4.9-*μ*m half-pitch resolution over a ∼30 mm^{2} field-of-view.

There are a few future directions for the proposed lensless imaging scheme. First, the spatial resolution of the current prototype setup is largely limited by the pixel size of the image sensor. Using a detector with a smaller pixel size is a straightforward solution to improve the spatial resolution. A pixel size of 1.4 *μ*m or smaller is now commercially available. Another solution is to model the down-sampling process in the imaging model. In the reconstruction process, we can invert the down-sampling effect by using a sub-sampled scheme for updating the pixel values. This solution has been successfully demonstrated for solving the pixel aliasing problem in FPM.^{36}

Second, the partial temporal coherence of the illumination source degrades the spatial resolution in the proposed scheme. A simple solution is to add a narrow-band spectral filter in the light path. Another solution is to model the captured images as incoherent summations over multiple coherent states. The dispersion effect can be inverted in this case.^{33–35} The third solution is to reduce the free-space propagation distances between the object plane, the mask plane, and the detector plane. In particular, the mask pattern can be directly written on top of the cover glass of the image sensor.

Third, the design of the mask pattern is an interesting topic for future study. A smaller feature size on the mask leads to a better support constraint for the phase retrieval process. The relationship between the mask pattern and the number of illumination angles needed is currently unknown for the proposed system. It has been shown that three random masks are enough to recover the object from Fourier diffraction measurements.^{37} With an optimal mask design, the number of employed LED elements can be minimized.

## SUPPLEMENTARY MATERIAL

See the supplementary material for the open-source implementation code and Video 1; they can be downloaded from https://doi.org/10.6084/m9.figshare.6215552.

## ACKNOWLEDGMENTS

This work was supported in part by NSF Grant Nos. 1555986, 1510077, and 1700941 and NIH Grant Nos. R21EB022378 and R03EB022144. Z. Zhang acknowledges the support of the China Scholarship Council. Y. Zhou acknowledges the support of Tsinghua University.