Pinhole defects in atomic layer deposition (ALD) coatings were measured over an area of 30 cm^{2} in an ALD reactor and represented by a probabilistic cluster model rather than by a single defect density (number of defects per unit area). With the probabilistic cluster model, the pinhole defects were simulated over a manufacturing-scale surface area of ∼1 m^{2}. The large-area pinhole defect simulations were used to develop an improved design method for ALD-based devices. A flexible thermal ground plane (FTGP) device requiring ALD hermetic coatings was used as an example. Using a single defect density value, it was determined that the FTGP device would not be feasible for applications with operating temperatures higher than 60 °C. The new probabilistic cluster model shows that up to 40.3% of the FTGP devices would be acceptable. With this new approach, the manufacturing yield of ALD-enabled or other thin-film-based devices with different design configurations can be determined, which is important for guiding process optimization, process control, and design for manufacturability.

## I. INTRODUCTION

### A. Defects in ALD films

Atomic layer deposition (ALD) technology is essential for applications requiring thin pinhole-free coatings for hermetic sealing,^{1–6} surface passivation,^{7–12} high-K thin films,^{13,14} and other functionalities.^{15} Other techniques used in industry and research to deposit single and multilayer barrier films include combinations of chemical vapor deposition (CVD), physical vapor deposition (PVD), and polymer-like deposition processes.^{16–19} Conformal thin ALD films have been described as a superior thin film barrier technology.^{6} ALD films are described as pinhole-free due to the self-limiting and surface saturation behavior of the ALD process,^{20} but the nature of the manufacturing processes^{21} creates extrinsic pinhole defects and intrinsic^{22} defects. For thick ALD films (>10 nm), extrinsic defects are a consequence of substrate particulate contamination, and for films <10 nm, extrinsic defects are due to an inefficient nucleation process.^{23} Typically, a random and stationary defect density (i.e., defects/area) is used to characterize an ALD coating.^{24} However, this representation assumes that the distribution of extrinsic pinholes is described by complete spatial randomness. With this representation alone, enhanced manufacturability of ALD-based devices cannot be achieved because it carries no information about the tendency of pinhole defects to cluster (i.e., to be tightly spaced). A single density metric is misleading, as shown in Fig. 1: a simulation of random pinhole defects and a simulation of highly clustered pinhole defects have the same defect density but different yields. Single pinhole defects and pinhole clusters (i.e., more than one pinhole defect) are represented by open circles and solid circles, respectively.
Pinhole defects are clustered together in tight surface area regions and entirely absent in other regions of the thin film, and as a result, the density of the pinhole defects is a function of the surface area size and location. The pinholes in each cluster are random in the cluster area, as shown in Figs. 1(c) and 1(d). A new probabilistic method is needed to characterize defect distributions including clustered pinhole defects.

In this study, the probabilistic process by which extrinsic pinhole defects cluster in ALD films was determined and simulated over a large area such as 1 m^{2}, which is typical of manufacturing scale. Simulated devices on the holder (e.g., a wafer or glass plate) with patterned ALD coatings, as shown in Fig. 2, can be investigated by overlaying device patterns onto the extrinsic defect simulation maps, as discussed in detail later. Areas of the device that require an ALD hermetic seal and areas that do not are represented in black and white, respectively. This design method can be applied to any type of pattern needing ALD coatings.

The tendency of defects to cluster was tested by fitting the observed frequency of extrinsic pinhole defect counts per quadrat^{25} area mesh size (i.e., unit cell of the grid/mesh used to bin defect frequency counts) to a random model and a cluster model. Extrinsic pinhole defects in ALD films can deviate from a random process in three ways, as shown in Fig. 3; similar deviations were discussed for particulate clustering and defects in chips on silicon wafers in the early semiconductor industry.^{26–28} Particulate clustering on the coating surface before and during ALD coating is considered to be the driving factor for clustering of extrinsic pinhole defects in ALD films.

### B. Probabilistic random defect model

A random pinhole generation model implies that the numbers of pinhole defects within disjoint subregions R_{i} in S_{j} are entirely independent and that the mean number of defects (*λ*_{ij}) is described by a single parameter, *λ*_{ij} = *λ*. It follows that the mean number of defects in any disjoint subregion R_{i} in S_{j} is the same. This process is called a Poisson process,

$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \tag{1}$$

where *k* is the number of defects and *X* is a random variable.

### C. Probabilistic cluster defect model

A cluster process, as shown in Figs. 3(b)–3(d), cannot be modeled using a one-parameter model. In a cluster model, the *λ* parameter is a random variable with a marginal distribution *P*(*λ*),

$$P(X = k) = \int_{0}^{\infty} \frac{\lambda^{k} e^{-\lambda}}{k!}\, P(\lambda)\, d\lambda. \tag{2}$$

The marginal distribution describes the probability of observing a mean number of defects in a disjoint subregion R_{i} in S_{j}.

Equation (2) can be solved using a variety of marginal distributions to define the probability of observing a *λ* value.^{26} For example,

can be formulated by compounding Eq. (2) with a gamma distribution as the marginal distribution with parameters *α* and *β*.

The final form of Eq. (3) after integration is the negative binomial model

where *r* and *p* are model parameters. The relationship of *r* and *p* with the gamma distribution parameters *α* and *β* may be written

and

The *r* and *p* parameters are related to the mean and variance of the distribution by the following:

and

Equation (7) can be rearranged to show that the variance can be varied independently through *r*. The degree of pinhole clustering is described by 1/*r*, with maximum clustering as *r* approaches 0. Maximum clustering implies that all pinholes are confined to one infinitely small region. Minimum clustering occurs as *r* goes to ∞, where the cluster model converges to a random model in which the mean and variance are equal.
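For reference, the gamma compounding described in this subsection can be written out explicitly. The sketch below assumes that the gamma distribution is parameterized by a shape *α* and a scale *β* (an assumption; a rate parameterization would change the expression for *p*) and that the negative binomial model follows the convention in which the mean is *r*(1 − *p*)/*p*. The symbols *μ* and *σ*^{2} for the mean and variance are introduced here only for illustration.

$$
\begin{aligned}
P(X = k) &= \int_{0}^{\infty} \frac{\lambda^{k} e^{-\lambda}}{k!}\,
            \frac{\lambda^{\alpha-1} e^{-\lambda/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}\, d\lambda \\
         &= \frac{\Gamma(k+\alpha)}{k!\,\Gamma(\alpha)}
            \left(\frac{1}{1+\beta}\right)^{\alpha}
            \left(\frac{\beta}{1+\beta}\right)^{k}
          = \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\, p^{r} (1-p)^{k}, \\
r &= \alpha, \qquad p = \frac{1}{1+\beta}, \\
\mu &= \frac{r(1-p)}{p} = \alpha\beta, \qquad
\sigma^{2} = \frac{r(1-p)}{p^{2}} = \mu + \frac{\mu^{2}}{r}.
\end{aligned}
$$

Under these assumptions the variance exceeds the mean by *μ*^{2}/*r*, so at a fixed mean the excess variance scales with 1/*r* and vanishes as *r* goes to ∞, which recovers the Poisson case.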

The statistical method used in this study to test for and simulate clustering of defects was also tested for compatibility using a spatial roll-to-roll ALD system.^{29} In spatial ALD systems, the precursor dosing is separated by space rather than time.^{30} This compatibility suggests that this ALD-based device design method, to be discussed in Sec. III, may be applicable to other gas-phase thin film deposition techniques such as CVD and PVD. The cluster parameter (1/*r*) may be used to identify different pinhole defect sources if the sources produce defects with different degrees of clustering.

## II. EXPERIMENT

### A. Al_{2}O_{3} ALD film deposition setup and conditions

Five samples (1.6 × 1.9 cm) of polyethylene naphthalate (HS-PEN Q51, Dupont-Teijin) were adhered, using double-sided Kapton tape, to a repurposed lithography glass mask that served as a holder for each of four distinct ALD runs, as shown in Fig. 4. The metal on the repurposed glass holder does not have any purpose in this study. The total surface area of the 20 samples (five per run) combined was 60.8 cm^{2}, of which 30 cm^{2} was analyzed. The ALD tool used in this study was a Beneq TFS 200 system. The ALD reactor volume and samples were cleaned prior to each deposition in the Colorado Nanofabrication Laboratory cleanroom facility using ethanol (CHROMOSOLV, HPLC grade). The samples were loaded into the reactor volume in the cleanroom and sealed.

Trimethylaluminum (TMA) (Strem, 98%) and water (HPLC grade) were used as the Al_{2}O_{3} precursors. The reactor pressure was typically ∼2 mbar at the deposition temperature of 110 °C. Four hundred and fifty cycles were deposited with cycle times of 1000 ms N_{2} purge, 250 ms TMA dose, 1000 ms N_{2} purge, and 250 ms water dose.

### B. Defect decoration and enlargement using O_{2} plasma

To visualize defects reported as small as 200 nm,^{31} the extrinsic pinhole defects in the ALD films were enlarged and decorated after Al_{2}O_{3} deposition with oxygen plasma etching of the underlying PEN Q51 substrate, and then the resulting undercuts were visualized with optical microscopy at 100× magnification, as shown in Fig. 5. Image capture resolution used was 1280 × 1024 (i.e., pixel = 0.77 *μ*m^{2}). The oxygen plasma (Power_{RF} = 75–100 W, pressure = 0.6–0.7 Torr, and time = 6 h) was generated using a solid state RF generator (March Plasmod) with a 13.56 MHz signal. Other methods for pinhole defect visualization that could be used are electroplating^{24} and fluorescent tagging.^{31} Electroplating requires an electrically conductive substrate, which is not suitable for decoration of defects in an ALD film deposited on the insulating PEN Q51 polymeric substrate used in this study. Fluorescent tagging can be used for defect decoration on polymeric substrates, and provides information on the actual size of the pinhole. Oxygen plasma etching was chosen as the decoration method in this study as it provided the simplest solution for defect decoration.

### C. Image acquisition method

Twenty PEN Q51 samples with 450 ALD Al_{2}O_{3} cycles from four distinct runs were examined for extrinsic defects under a grid, as shown in Fig. 6, using a quadrat method.^{25} Each of the four grids consisted of 12 × 12 Cr boxes, with a quadrat corresponding to a 200× microscope image. At 100× magnification, the four grids are covered by four sets of 6 × 6 images, with each image containing four of the Cr cells. The images obtained were of lower quality than Fig. 5(c) because the small air gap between the glass slide with the Cr grid and the sample created a slight distortion of the images. The image files were stored in a directory and analyzed using a MATLAB GUI. The GUI read each microscope image from file and then prompted the user to manually select the (x, y) pixel location of each pinhole defect. The pixel locations of all of the defects for each run were then analyzed for clustering or randomness.

## III. RESULTS AND DISCUSSION

### A. Extrinsic pinhole defects in an Al_{2}O_{3} ALD film

The (x, y) locations of all defects are plotted in Fig. 7. The mean number of defects per run shows run-to-run fabrication variations, with densities from 1.62 to 22.92/cm^{2}. The defect data show that in some cases a pinhole-free film is obtained, such as samples 2 and 7 of runs 1 and 2, respectively. However, 90% of the samples were not pinhole free and had at least one defect. A quadrat method^{25} was used to count the number of pinholes in each quadrat (i.e., unit cell of a grid/mesh) and to determine the frequency distribution of the number of pinholes per quadrat for different quadrat mesh sizes. Pinhole locations per sample were determined from four sets of 100× microscope images in a 6 × 6 arrangement, with a quadrat area mesh size of 2.6 × 10^{−2} cm^{2}, as shown in Fig. 6. The defects per smaller quadrat were then counted using MATLAB for quadrat area mesh sizes of 6.4 × 10^{−4} cm^{2} (24 × 24), 1.6 × 10^{−4} cm^{2} (48 × 48), and 4 × 10^{−5} cm^{2} (96 × 96). In the case of the 24 × 24 quadrat mesh, each microscope image was further subdivided into a 4 × 4 mesh, or equivalently a set of 16 smaller quadrats of area 6.4 × 10^{−4} cm^{2}. As a result, the four original 6 × 6 meshes of 100× microscope images per sample were rearranged into four 24 × 24 meshes. The purpose of analyzing the frequency of defect counts for different quadrat area mesh sizes is to determine the extent of clustering of pinholes in ALD films for the simulations discussed in Sec. III B.
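The quadrat counting described above can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB code used in the study; the function names, the unit-square region, and the five defect locations are hypothetical and chosen only to show how the frequency distribution changes with mesh size.

```python
from collections import Counter


def quadrat_counts(points, width, height, n_cols, n_rows):
    """Count defects in each cell of an n_rows x n_cols quadrat mesh."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        # Clamp so that points on the far edge fall in the last quadrat.
        col = min(int(x / width * n_cols), n_cols - 1)
        row = min(int(y / height * n_rows), n_rows - 1)
        counts[row][col] += 1
    return counts


def count_frequencies(counts):
    """Return {k: number of quadrats that contain exactly k defects}."""
    flat = (c for row in counts for c in row)
    return dict(sorted(Counter(flat).items()))


# Hypothetical defect locations: one tight cluster plus two isolated defects.
points = [(0.10, 0.10), (0.12, 0.11), (0.13, 0.12), (0.80, 0.75), (0.55, 0.20)]

coarse = quadrat_counts(points, 1.0, 1.0, 2, 2)
fine = quadrat_counts(points, 1.0, 1.0, 4, 4)
print("2 x 2 mesh:", count_frequencies(coarse))
print("4 x 4 mesh:", count_frequencies(fine))
```

Refining the mesh leaves the clustered defects in a single quadrat while the number of empty quadrats grows, which is the signature that the fits to the random and cluster models are meant to distinguish.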

To isolate pinhole clustering effects, the cumulative frequency counts of the defects for each quadrat mesh size per run were separately fit, using a maximum likelihood estimation (MLE) method in MATLAB, to the cumulative distribution functions of the random and cluster models, obtained by integrating Eqs. (1) and (2), respectively, in MATLAB.^{32} The residuals (i.e., the difference between the measured and modeled percent cumulative defect counts) of the fits for the random and cluster models are shown in Figs. 8 and 9, respectively. In all cases except the defect map in Fig. 7(a), the distribution of pinhole defects was best described by a cluster model, with the residuals of the cluster models multiple orders of magnitude smaller than those of the random models. The normality of the residuals of both models was tested with a Kolmogorov–Smirnov test. Seventy-five percent of the runs were best described using a cluster model, which indicates that the deviation from a random model is due not only to run-to-run fabrication variation but also to pinhole clustering, as shown in Fig. 3(b).

Run-to-run fabrication variation and clustering of defects, as described in Fig. 3(d), were tested by combining runs 1–4 into one dataset of defect frequency counts for each of the quadrat mesh area sizes of 6.4 × 10^{−4}, 1.6 × 10^{−4}, and 4 × 10^{−5} cm^{2}. The residuals of fitting the cumulative pinhole counts to the random and cluster models using the same MLE method are shown in Fig. 10. For example, for a defect count of 1 per quadrat and a quadrat mesh area size of 6.4 × 10^{−4} cm^{2}, the residuals for the fits to the cluster model and the random model are 3.02 and −37.14, respectively. In this case, the cluster model underpredicts the probability of observing 1 defect or less by 3.02 percentage points, and the random model overpredicts it by 37.14 percentage points. A perfect model has a residual of zero, and a poor model can have a residual magnitude of up to 100. The residual for predicting a given defect count or less was determined up to the maximum observed defect count for that quadrat size. For example, for the largest quadrat size of 6.4 × 10^{−4} cm^{2}, the maximum observed number of defects was five. The normality of the residuals for each quadrat area size and defect count indicates how well the models describe the pinhole defects. Normally distributed residuals suggest that the error in the model is random rather than systematic. A larger p-value means that the test finds less evidence against normality of the residuals, which supports the soundness of the model. For example, for the quadrat size of 6.4 × 10^{−4} cm^{2}, the p-values for the cluster and random models are 0.71 and 0.06, respectively. The smaller residuals and the stronger support for residual normality, relative to the random model, suggest that the cluster model is better than the random model at describing the distribution of pinhole defects.
Using the cluster model parameters *r* and *p*, pinhole defects on square-centimeter-scale devices can be simulated over a large area of ∼1 m^{2}. The cluster model parameter estimates for the quadrat area mesh sizes were as follows: 6.4 × 10^{−4} cm^{2} (*r* = 0.04 ± 0.005, *p* = 0.861 ± 0.029), 1.6 × 10^{−4} cm^{2} (*r* = 0.016 ± 0.004, *p* = 0.909 ± 0.024), and 4 × 10^{−5} cm^{2} (*r* = 0.012 ± 0.005, *p* = 0.967 ± 0.015).
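As a quick consistency check on these estimates, the mean defect count per quadrat implied by each (*r*, *p*) pair can be converted to a defect density. The sketch below assumes the negative binomial convention in which the mean is *r*(1 − *p*)/*p* and the variance is *r*(1 − *p*)/*p*^{2} (the convention of MATLAB's negative binomial functions; its use here is an assumption). The function names are illustrative only.

```python
def nb_mean(r, p):
    """Mean defects per quadrat, assuming mean = r(1 - p)/p."""
    return r * (1.0 - p) / p


def nb_variance(r, p):
    """Variance of defects per quadrat, assuming var = r(1 - p)/p^2."""
    return r * (1.0 - p) / p ** 2


# (quadrat area in cm^2, r, p) as reported in the text
FITS = [
    (6.4e-4, 0.04, 0.861),
    (1.6e-4, 0.016, 0.909),
    (4e-5, 0.012, 0.967),
]

densities = []
for area, r, p in FITS:
    mean = nb_mean(r, p)
    density = mean / area                 # defects per cm^2
    dispersion = nb_variance(r, p) / mean  # equals 1/p; 1 for a Poisson model
    densities.append(density)
    print(f"area={area:.1e} cm^2  density={density:.2f}/cm^2  "
          f"variance/mean={dispersion:.3f}")
```

Under this convention all three mesh sizes imply a mean density of roughly 10 defects/cm^{2}, the same order as the single random density of 9.48/cm^{2} quoted in Sec. III C, while the variance-to-mean ratio stays above 1 as expected for a clustered process.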

### B. Pinhole cluster simulations in ALD films

Pinhole simulations were developed in this study as a tool for enhancing device design for nanomanufacturing. Using a single random density of defects to determine the number of defects on large complicated designs such as Fig. 2 is misleading because defects are clustered and not random.

Pinhole clusters were simulated over 100 separate 10 × 10 cm surface maps with a (n × m) grid from quadrat area mesh sizes of 6.4 × 10^{−4} cm^{2} (n × m = 440 × 349 = 153 560 quadrats), 1.6 × 10^{−4} cm^{2} (880 × 698 = 614 240 quadrats), and 4 × 10^{−5} cm^{2} (1760 × 1396 = 2 456 960 quadrats). This simulation represents 100 fabrication runs. A two-step simulation process was developed. This simulation method incorporates both run-to-run fabrication variations and clustering of defects in the quadrats. Run-to-run fabrication variations can be attributed to different initial levels of particulate contamination on the substrate before the ALD coating, reactor particulate levels, substrate handling, and any different levels of particulate generation during the ALD coating.

Step 1 (run-to-run variation): Cluster model parameters (*r*, *p*) are generated for each run by pseudorandom sampling from a normal distribution. For example, the normal distributions generated for *r* and *p* in the case of the quadrat area mesh size of 6.4 × 10^{−4} cm^{2} are shown in Fig. 11.

Step 2 (clustering): Pseudorandom pinhole counts are generated in an (n × m) grid of quadrats by sampling from the cluster distribution defined by the *r* and *p* values from step 1.
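The two steps above can be sketched as follows. This is a minimal standard-library Python illustration, not the simulation code used in the study. It assumes that the ± values reported for *r* and *p* are standard deviations, that invalid draws (*r* ≤ 0 or *p* outside (0, 1)) are simply redrawn, and that negative binomial counts are generated as a gamma-Poisson mixture with shape *r* and scale (1 − *p*)/*p*. All function names are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_run_parameters(rng, r_mean, r_sd, p_mean, p_sd):
    """Step 1: draw (r, p) for one run from normal distributions.

    Assumption: draws outside the valid parameter range are rejected.
    """
    while True:
        r = rng.gauss(r_mean, r_sd)
        p = rng.gauss(p_mean, p_sd)
        if r > 0.0 and 0.0 < p < 1.0:
            return r, p


def simulate_map(rng, r, p, n, m):
    """Step 2: negative binomial counts on an n x m grid of quadrats.

    Each quadrat gets lambda ~ Gamma(shape=r, scale=(1 - p)/p) and then a
    Poisson(lambda) count, which is negative binomial with mean r(1 - p)/p.
    """
    scale = (1.0 - p) / p
    return [
        [sample_poisson(rng.gammavariate(r, scale), rng) for _ in range(m)]
        for _ in range(n)
    ]


# Small demonstration grid; the study used 440 x 349 quadrats per map.
rng = random.Random(0)
r_run, p_run = sample_run_parameters(rng, 0.04, 0.005, 0.861, 0.029)
demo_map = simulate_map(rng, r_run, p_run, 44, 35)
print("r =", round(r_run, 4), "p =", round(p_run, 4),
      "defects =", sum(map(sum, demo_map)))
```

Repeating both steps once per map gives a set of maps in which the total defect count varies from run to run through step 1 and the defects within a map are clustered through step 2.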

Figure 12 shows cluster simulations for the quadrat area mesh sizes of 6.4 × 10^{−4}, 1.6 × 10^{−4}, and 4 × 10^{−5} cm^{2}. Figures 12(a) and 12(b) illustrate the run-to-run fabrication variation with the maps having the minimum and maximum number of defects, respectively. Each map is a simulation of a different run. The difference in the degree of clustering and the total number of defects reflects variation in the sample preparation methods and the manufacturing process. A random model does not describe run-to-run variations. The maps shown in Figs. 12(c) and 12(d) are for quadrat mesh sizes of 1.6 × 10^{−4} and 4 × 10^{−5} cm^{2}, respectively.

As shown in Fig. 13, the degree of clustering from simulations is a function of quadrat mesh area size, with the probabilistic cluster model not accurately describing defects below a quadrat mesh area size of 10^{−3} mm^{2}. A coarse quadrat mesh is not desirable because defect locations cannot be resolved within a quadrat. Using a fine quadrat area mesh allows analysis of small ALD features, with the smallest area size being 58 × 68 *μ*m.

The cumulative percentage of devices with respect to defect density was analyzed for different device sizes by overlaying device configurations on the 100 simulated maps from the quadrat area mesh size of 6.4 × 10^{−4} cm^{2}. A single defect density from the random probabilistic model is compared with the simulation maps from the probabilistic cluster model in Sec. III C for a device size of 2.28 × 2.35 cm.

The defect maps were analyzed by partitioning the surface area into devices of size 9.12 × 9.49 cm (1 device per map, 100 total devices), 2.28 × 2.35 cm (16 devices per map, 1600 total devices), and 0.57 × 0.59 cm (256 devices per map, 25 600 total devices). The defect density distribution was determined for each device from the total gross area and from the perimeter (i.e., the outer quadrats = 255 *μ*m), as shown in Figs. 2(a) and 2(c). Defect density is calculated by dividing the number of defects by the corresponding device area. The shape of the distributions depends on the device size: the cumulative percentage of devices with a lower defect density increases as the device size decreases, as shown in Fig. 14. This increase arises because, as the devices become smaller, the device size becomes comparable to the secondary cluster size and intercluster spacing, as shown in Fig. 12. The effect of pinhole clustering on device yield is more evident in a real device example, discussed in Sec. III C.
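The partitioning step can be illustrated with a short sketch. It assumes, for simplicity, that each device covers a whole number of quadrats and that the perimeter is the outermost ring of quadrats of each device. The function name and the tiny example grid are hypothetical.

```python
def device_densities(grid, dev_rows, dev_cols, quadrat_area, perimeter_only=False):
    """Tile a quadrat count grid with devices and return one density per device.

    grid           -- list of rows of defect counts per quadrat
    dev_rows/cols  -- device size in quadrats (assumed to tile the grid)
    quadrat_area   -- area of one quadrat, e.g., in cm^2
    perimeter_only -- if True, use only the outer ring of quadrats of each device
    """
    n_rows, n_cols = len(grid), len(grid[0])
    densities = []
    for top in range(0, n_rows - dev_rows + 1, dev_rows):
        for left in range(0, n_cols - dev_cols + 1, dev_cols):
            defects = 0
            cells = 0
            for i in range(dev_rows):
                for j in range(dev_cols):
                    on_edge = i in (0, dev_rows - 1) or j in (0, dev_cols - 1)
                    if perimeter_only and not on_edge:
                        continue
                    defects += grid[top + i][left + j]
                    cells += 1
            densities.append(defects / (cells * quadrat_area))
    return densities


# Hypothetical 4 x 4 map with one cluster in the lower-right corner.
example = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 0, 1],
]
print(device_densities(example, 2, 2, 1.0))   # four 2 x 2 devices
print(device_densities(example, 4, 4, 1.0))   # one 4 x 4 device
```

In this toy map the single large device has a moderate density, whereas most of the small devices are defect free and one carries the whole cluster, which mirrors the device-size dependence described above.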

### C. Enhanced nanomanufacturing design: An example

To illustrate the effectiveness of this enhanced design method over using a single defect density value from a random model, a flexible thermal ground plane (FTGP)^{33,34} heat management device is used as a case study. The FTGP can be built in two different configurations, requiring ALD hermetic sealing over either the entire device or only the perimeter polymer seal, as in Figs. 2(a) and 2(c). The FTGP requires a thin film hermetic seal to prevent fluid loss, prevent ingress of atmospheric air, and provide an internal hydrophilic coating. Extrinsic rather than intrinsic defects have been shown to be the driving factor dictating the useful life of the coatings for applications in the FTGP.^{34} The ALD films can be deposited using the same Beneq TFS 200 used in this study. The size of the device of interest is fixed at 2.28 × 2.35 cm with a thickness of 255 *μ*m and a total surface area of 10.96 cm^{2}. The devices are analyzed using the 100 simulated ALD maps with the quadrat area mesh size of 6.4 × 10^{−4} cm^{2}. The device is intended to operate either in applications below 60 °C or in high-temperature applications above 60 °C, with device failure at 40 and 7 defects/cm^{2}, respectively. The failure criteria and density numbers for the FTGP are courtesy of Dr. Ryan Lewis from Y. C. Lee's group at the University of Colorado at Boulder.^{35}

A random defect density of 9.48/cm^{2} can be obtained by counting all the defects over the 100 simulations and dividing by the total area examined of 8500.73 cm^{2}. This method of counting all the defects implies that the investigator physically etched and imaged all of the samples, as opposed to observing the clustering behavior of defects over a small area of ∼30 cm^{2} and simulating over the 8500.73 cm^{2} area using computer software. Table I summarizes the average number of tolerable defects per device configuration. The number of tolerable defects is determined by multiplying the area needing ALD coating by the tolerable defect density. Using the single defect density value from a random model, the average number of defects is less than the tolerable number in all configurations for the <60 °C application and greater than the tolerable number in all configurations for the >60 °C application. With a single defect density, the number of devices that pass or fail (i.e., the device yield) is not known. The new probabilistic cluster model simulations applied in this study are able to determine how many devices pass or fail for any device configuration.

TABLE I. Average number of tolerable defects per device configuration.

| Application | (1) Perimeter (0.24 cm^{2}) | (2) Top (5.36 cm^{2}) | (1) + (2) (5.6 cm^{2}) |
|---|---|---|---|
| <60 °C | 9.6 | 214.4 | 224 |
| >60 °C | 1.68 | 37.52 | 39.2 |
| Random: 9.48/cm^{2} | 2.28 | 50.81 | 53.09 |
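The entries of Table I follow from multiplying each coated area by a defect density. The short sketch below reproduces that arithmetic using the areas, failure densities, and random density given in the text; the variable and function names are illustrative only.

```python
AREAS_CM2 = {"perimeter": 0.24, "top": 5.36, "perimeter + top": 5.6}
TOLERABLE_DENSITY = {"<60 C": 40.0, ">60 C": 7.0}  # defects/cm^2 at failure
RANDOM_DENSITY = 9.48                              # defects/cm^2, random model


def defect_number(area_cm2, density_per_cm2):
    """Number of defects implied by a density over a coated area."""
    return area_cm2 * density_per_cm2


for name, area in AREAS_CM2.items():
    expected = defect_number(area, RANDOM_DENSITY)
    for application, density in TOLERABLE_DENSITY.items():
        tolerable = defect_number(area, density)
        verdict = "within" if expected <= tolerable else "exceeds"
        print(f"{name:16s} {application}: tolerable={tolerable:7.2f}  "
              f"random average={expected:6.2f}  ({verdict} tolerance)")
```

The comparison only says whether the average device is within tolerance; it gives no information on what fraction of individual devices pass, which is what the cluster simulations provide.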

Using the enhanced design method developed in this study, the yields of the FTGP device configurations were determined and are tabulated in Table II. With this method, the cumulative percentage of devices with respect to the total number of defects can be obtained for devices with ALD-coated patterns. In some applications, the ALD coating is only needed on certain areas, such as a perimeter seal. For the perimeter-only seal, 40.3% of the devices pass with at most one defect (tolerable number = 1.68) for the >60 °C case, and 100% of the devices pass with at most nine defects (tolerable number = 9.6) for the <60 °C application. The high yield of devices with a perimeter seal arises because defects are highly clustered: as the ALD pattern becomes smaller, fewer defects are present in those areas, as discussed in the general results of Fig. 11.

TABLE II. Yield of the FTGP device configurations.

| Perimeter and top: defects | Yield (%) | Perimeter: defects | Yield (%) |
|---|---|---|---|
| 0 | 0 | 0 | 16.8 |
| 25 | 7.6 | 1 | 40.3 |
| 39 | 29.4 | 2 | 62.3 |
| 50 | 54.8 | 3 | 79.1 |
| 75 | 90.7 | 4 | 90.6 |
| 100 | 99.2 | 5 | 96.1 |
| 224 | 100 | 9 | 100 |

For the gross-area ALD coating, 29.4% of the devices pass with at most 39 defects for the >60 °C application, and 100% pass with at most 224 defects for the <60 °C application. Figure 15 shows the cumulative percentage curves that were used to create Table II.
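The yield values are read from a cumulative distribution of per-device defect counts: a device passes if its defect count does not exceed the tolerable number. A minimal sketch of that lookup is given below. The function name and the ten per-device counts are hypothetical and are not data from the study.

```python
import math


def device_yield(defect_counts, tolerable):
    """Percentage of devices whose defect count does not exceed the tolerable number.

    Defect counts are integers, so a tolerable number such as 1.68 means
    that devices with 0 or 1 defects pass.
    """
    limit = math.floor(tolerable)
    passing = sum(1 for count in defect_counts if count <= limit)
    return 100.0 * passing / len(defect_counts)


# Hypothetical perimeter defect counts for ten simulated devices.
counts = [0, 0, 1, 1, 2, 3, 5, 0, 1, 4]
print(device_yield(counts, 1.68))  # >60 C tolerance from Table I
print(device_yield(counts, 9.6))   # <60 C tolerance from Table I
```

Applying the same function to the per-device counts from all simulated maps, for each coated region, would give one yield value per tolerance, in the manner of Table II.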

The simulations can be used to design for higher yield by changing the sizes of device features that require ALD coatings. For example, if a higher yield were needed in the case of the FTGP, the size of the device could be altered, as shown in Fig. 14, as could the perimeter thickness.

## IV. SUMMARY AND CONCLUSIONS

Al_{2}O_{3} films were grown on PEN Q51 under cleanroom conditions using a Beneq TFS 200 ALD reactor tool. Oxygen plasma etching and optical microscopy were used to enlarge and visualize extrinsic pinhole defects over an area of about 30 cm^{2}. The distribution of extrinsic pinhole defects in the ALD films was successfully modeled using a probabilistic cluster model and then simulated over a large, manufacturable area of ∼1 m^{2}. Simulations using the cluster model were successfully used to design a manufacturable FTGP requiring thin ALD hermetic coatings. Using a single defect density and assuming a random model, it was determined that the FTGP would not be feasible for high-temperature applications >60 °C. The new method developed in this study shows that up to 40.3% of the FTGP devices will not fail.

In future studies, a thin film coating tool capable of both molecular layer deposition (MLD) and ALD will be characterized using the modeling methods developed in this study. Using this MLD and ALD tool, polymer substrates will be coated with MLD and ALD films.

The statistical method used in this study to test for clustering of defects may be applicable to other gas-phase thin film deposition techniques such as CVD and PVD.

## ACKNOWLEDGMENTS

This work was supported by the NSF through SNM: roll-to-roll atomic/molecular layer deposition Award No. CBET 1246854 awarded to the University of Colorado.