We have developed a simulation tool to model self-limited processes such as atomic layer deposition (ALD) and atomic layer etching inside reactors of arbitrary geometry as well the output of in situ quartz crystal microbalance and mass spectrometry. We have applied this model to two standard types of cross-flow reactors: a cylindrical tube reactor and a model 300 mm wafer reactor, and explored both ideal and nonideal self-limited kinetics. The model results are in agreement with experimental results and analytic expressions obtained using a simple plug-flow model for the cylindrical tube reactor. We also extended the simulations to consider two nonideal self-limited processes: soft-saturating processes characterized by a slow reaction pathway and processes where surface by-products can compete with the precursor for the same pool of adsorption sites. Our results show that it is possible to have a self-limited process with saturated yet inhomogeneous growth profiles due to the competition of reactor by-products. This is in agreement with experimental observations for titanium dioxide ALD from titanium tetraisopropoxide and titanium tetrachloride precursors.
I. INTRODUCTION
Atomic layer deposition (ALD) is a thin film growth technique that has been experiencing a tremendous growth in terms of processes and application domains, ranging from semiconductor manufacturing to photovoltaics and energy storage.1 More recently, atomic layer etching (ALE), ALD’s etching counterpart, has also seen a resurgence due to its impact on advanced lithographic applications in semiconductor processing and the development of thermal ALE processes.2 Both ALD and ALE are enabled by self-limited precursor-surface interactions, which underpin their ability to add/remove material in a controlled way over large substrate areas and inside nanostructured and high aspect ratio features.
ALD and ALE’s self-limited nature is usually conceptualized in terms of a finite number of reactive sites on the surface: once the reactive sites are fully consumed, the surface is no longer reactive toward the precursor.3 However, this picture overlooks some critical aspects of self-limited processes: first, self-limited heterogeneous reactions introduce nonlinear time-dependent kinetics and complex spatiotemporal patterns in the surface reactivity and chemistry during each precursor dose. Consequently, the contribution of the reactant transport of species inside the reactor needs to be considered when extracting information on the surface kinetics from in situ measurements. The transport of species is also key to understanding the scale-up of a process and optimizing reactor geometry.
Second, the self-limited nature of precursor-surface interactions is not sufficient to guarantee uniformity over large areas or inside high aspect ratio features: even without considering the effect of additional nonself-limiting components, the presence of soft-saturating kinetics, additional surface recombination pathways, competitive adsorption with reaction by-products, flux/pressure-dependent surface kinetics, long surface residence times, or merely insufficient purge times, can lead to inhomogeneous processes even when the processes are self-limited. As mentioned in a recent review,4 these processes can also affect the reproducibility of ALD and, by extension, ALE.
In this work, we explore the reactive transport of species under self-limited processes using computational fluid dynamics (CFD) simulations. CFD is a valuable tool to predict the impact surface kinetics and reactor geometry has on metrics such as throughput, coating homogeneity, and precursor utilization. Examples in the literature have explored CFD and multiscale simulations to model and optimize ALD processes.5–10 Here, we use CFD to address the following three questions: (1) how can we use under-saturated doses to extract information on surface kinetics from coverage and thickness profiles? (2) how does reactive transport impact the data obtained using in situ measurement tools such as quartz crystal microbalance (QCM) and quadrupole mass spectrometry (QMS), and (3) what is the impact that nonidealities in the surface kinetics have on the saturation profiles and homogeneity at the reactor scale? Understanding these three aspects is crucial to maximize the information that we can extract from reactor-scale data.
Finally, one of our goals is to make the simulation code available to the research community: the simulation tools and some examples that can be used to reproduce the results presented in this paper have been publicly released as open source in github (https://github.com/aldsim/aldFoam) under a GPLv3 license.
II. MODEL
A. Model equations
Our model solves the time-dependent transport equation of one or more reactive species subject to self-limited heterogeneous processes inside reactors of arbitrary geometry. Our model makes two fundamental assumptions: (1) we assume that precursors and reaction by-products represent a small perturbation with respect to the carrier gas flow and (2) we consider that the total carrier gas flow is minimally altered during doses and purge times. Both approximations are consistent with the way our experimental reactors operate: the carrier gas lines utilized during the precursor dose and purge operations in our reactor have similar conductances, so the effect of bypassing a precursor bubbler during the purge times on the overall flow is minimal.11
These two assumptions allow us to decouple the momentum and energy transport from the precursor mass transport equation so that the reactive transport of both precursors and reactants takes place on a velocity field that is determined by the carrier gas and the overall process conditions. Under these assumptions, we can approximate the momentum transport equation with the incompressible Navier–Stokes equation for the carrier gas,
which under the steady state flow approximation reduces to
Here, is the kinematic viscosity of the carrier gas, is the dynamic viscosity, and is the mass density. Note that this approximation assumes an isothermal reactor since the kinematic viscosity depends on the temperature through both the dynamic viscosity and gas density. This is a restriction that can be easily lifted if the temperature dependence with the surface kinetic parameters is known. A discussion on nonisothermal conditions is given in Sec. II E.
Under steady-state conditions, the solution of the Navier–Stokes equations results in a velocity field that is then used as input for the mass transport equations of each of the chemical vapor species. The mass transport equation for each species can be formulated in terms of its molar concentration as
with
Here, is the mass diffusivity of species in the carrier gas.
The molar concentration can be expressed in terms of the precursor pressure as
where is the gas constant.
Our model neglects gas phase reactions. However, the presence of nonself-limited processes can be directly incorporated through the boundary conditions. They also spontaneously occur whenever the precursor and coreactant are simultaneously present in the gas phase.
B. Boundary conditions
We codify the reactivity of each species in terms of a wall reaction probability . This term incorporates reversible and irreversible interactions as well as any side reaction pathways that do not contribute to either growth or etching such as surface recombination. The second term that we need to consider is desorption from the walls. We codify these processes in terms of a flux so that the mass balance equation at each point of the reactor surface can be expressed as
The exact dependence of and on the surface kinetics will vary depending on the specific heterogeneous processes being considered in the model. These will be described in more detail in Sec. II C. Also note that, with this formalism, the transport equations are the same regardless of the type of process that we are modeling, encompassing both self-limited growth (ALD) and etching (ALE).
Inlet boundary conditions need to incorporate the pulsed nature of ALD and ALE. A key challenge is how to model accurate pulses when the reactor inlet is not considered in the simulation domain. Here, we assume that concentration pulses at the inlet that are characterized by a response time controls the steepness of the pulse: at the beginning of each dose, the precursor concentration increases linearly during a time interval and remains fixed at a preset value during a time equal to . Then, the concentration decreases linearly with time over the same internal . This allows us to control the spread of the pulse at the inlet while keeping the product constant, where refers to the precursor pressure at the inlet. For reaction by-products ( ), we assume that there is no net mass transport at the inlet, by imposing the condition
Zero gradient boundary conditions were used for the outlet.
C. Surface models for self-limited processes
1. Ideal self-limited process
A key assumption of self-limited processes is the presence of a finite number of surface sites. If we define the surface fractional coverage as the fraction of surface sites that have reacted with the precursor, the simplest model is to assume that the reactivity of the precursor is given by12,13
that is, the surface reactivity toward the precursor is proportional to the fraction of available sites. This model corresponds to an irreversible first order Langmuir kinetics, and it is one of the most widely used models for ALD simulations. The evolution with time of the surface coverage will be then given by
When we consider full ALD cycles with both precursor and coreactant doses, the evolution of the fractional coverage simply incorporates the influence of both species,
Here, and are the surface flux of precursor species to the surface, is the average surface area of a reaction site for the precursor, and is the number of coreactant molecules required per precursor molecule required to satisfy the film’s stoichiometry. For instance, within this approximation, for the ALD of Al O from trimethylaluminum (TMA) and water and for the case of TiO ALD from titanium tetraisopropoxide and water.
The wall flux can be expressed as a function of the precursor pressure as
where is the mean thermal velocity of species and is connected with the molar density through Eq. (5).
The value of can be obtained from the mass gain (or loss in the case of etching) per unit surface area per cycle or the growth (etch) per cycle in unit of thickness. For the former, is simply given by
where is the molecular mass of the solid and is the number of precursor molecules per unit formula (i.e., two in the case of trimethylaluminum and Al O ).
2. Soft-saturating processes
A simple generalization to the ideal model is to consider two self-limited reaction pathways with a fast and a slow reacting component.14 This allows us to incorporate soft-saturating processes where saturation is not reached as fast as in the ideal case.
If represents the fraction of sites with the second pathway, the reaction probability will be given by
where and represent the sticking probabilities for each pathway and and their respective fractional coverages so that the total surface coverage will be
and the evolution of and is given by Eq. (9).
3. Site-blocking by precursor by-products
Thus far, we have assumed that heterogeneous processes involve a precursor and a coreactant species. However, reaction by-products and precursor ligands can play an important role, for instance, competing with precursor molecules for available surface sites. In the case of ALD, this can lead to the presence of thickness gradients in the reactor even under saturation with otherwise perfectly self-limited processes.15,16
Here, we implement a simple model that captures the impact of precursor by-products through a simple site-blocking mechanism. Our model considers the surface fractional coverage of both the precursor and precursor by-products, and , respectively. The simplest possible model assumes that upon adsorption, the precursor occupies a single surface site, releasing reaction by-products into the gas phase. These by-products can then adsorb on available sites. The coreactant is then able of fully regenerating the surface. This can be modeled using the following equations:
when modeling just the precursor dose. The flux of by-product molecules coming back to the gas phase is given by
When we model a full cycle, we have to incorporate the displacement of by-products by the coreactant,
Here, we assume a 1:1 displacement ratio between the by-product and the coreactant.
4. Nonself-limited and recombination pathways
Another way of generalizing Eq. (9) is to consider nonself-limited as well as secondary pathways such as surface recombination that do not lead to either growth or etching. We can implement them simply by adding extra components to the reaction probability,
where and correspond to the sticking probability of the nonself-limited and recombination, respectively.
5. Sticky precursors
The final case that we can consider are precursors that have a significant residence time on the surface. These may require larger purge times in order to fully evacuate unreacted precursor molecules from the reactor.
Here, we consider a simple model, where a precursor molecule can undergo monolayer adsorption on reacted sites. Under this assumption, we have available sites, reacted sites that do not have adsorbed precursor molecules, whose fractional coverage is given by , and reacted sites with adsorbed precursor molecules, whose fractional coverage is given by . The evolution of and will be given by
where is the desorption rate and is the sticking probability for reversible adsorption.
A consequence of having this process is that the growth per cycle becomes larger than the nominal saturation value when the coreactant reacts with reversibly adsorbed precursor molecules.
D. Transport coefficients
The models described above depend on the values of the kinematic viscosity , where is the dynamic viscosity and is the mass density of the carrier gas. Pair diffusivities of the different species in the carrier gas are calculated using the Chapman–Engskop approximation,17
where and are coefficients for the pair potential, which is assumed to be spherically symmetric. In the case of a Lennard-Jones model, is a function that can be parametrized as
E. Implementation
We have implemented the models described above using the open source library openfoam.18 openfoam solves partial differential equations using a finite volume method with a colocated grid approach in which all properties are stored at a single point of each control volume (its centroid). Interpolation, discretization, and matrix solution schemes can be selected at runtime. Through openfoam, we can work with arbitrary reactor geometries, including 1D, 2D, 2D axisymmetric, and full 3D simulations.
In order to incorporate self-limited processes, we have created a series of custom solvers for both ideal and nonideal self-limited interactions. Volume fields such as reactant and by-product concentrations are still solved and discretized using openfoam’s built-in capabilities. The time evolution of the different surface coverages is solved using a custom solver. Each of the active boundaries is initiated using kinetic parameters that are specific to each region in our mesh. This allows us to incorporate regions with different reactivity, for instance, to account for changes in temperature or different types of substrates.
All time derivatives are discretized using an implicit Euler method, whereas linear interpolation is used to approximate values at cell faces. The use of implicit methods ensures that fractional surface coverages remain bounded between 0 and 1. The solution of the resulting system of equations is solved using openfoam’s built-in algorithms. In particular, we used OpenFOAM's preconditioner biconjugate gradient (PBiCG) solver, a standard Krylov subspace solver that allows the use of a runtime selectable preconditioner. For this work, we used the simplified diagonal-based incomplete lower-upper preconditioner.
The velocity field used as input for the advection, diffusion, and reaction equation of gas-phase species [Eq. (3)] has been calculated using openfoam’s implementation of the SIMPLE algorithm for the Navier–Stokes equations.19 The velocity fields have been obtained assuming laminar flow conditions, which is a reasonable assumption for the low Reynolds numbers involved in the low pressure reactors considered in this work. We have also used nonslip boundary conditions for the flow velocity, which are consistent with the low Knudsen numbers in our experimental condition ( , assuming a mean free path of 50 m at 1 Torr and characteristic reactor width of the order of 1 cm). openfoam’s current implementation naturally allows us to extend the simulations to nonisothermal conditions. We have used these in the past to model other configurations, such as vertical chemical vapor deposition (CVD) reactors, with strong thermal gradients.20 Likewise, it is possible to generalize the model to consider turbulent flow conditions. Both conditions are outside the scope of the present work. In particular, for turbulent flows, one needs to consider the effect of turbulent transport, which becomes the dominant mechanism of precursor mixing.
The code has been run both in off-the-shelf desktops and laptops and in Blues, one of the clusters at Argonne’s Laboratory Computing Research Center. This allowed us to explore process parameters in a massively parallel fashion.
III. RESULTS
A. Simulation domains and carrier gas flow
In this work, we have considered two main reactor geometries shown in Fig. 1. The first geometry is a 2D model of a 5 cm diameter cross-flow cylindrical tube reactor.11 This model reproduces the geometry of three of the experimental reactors in our laboratory, one of which is shown in Fig. 1(a). We have created an axisymmetric mesh using openfoam’s mesh generation utility blockMesh. The mesh used in this work is composed of 9400 cells, with a spatial resolution in the downstream direction of 1 mm. Meshes with two times higher spatial resolution were also created to confirm that the simulation results are independent of mesh size.
The second geometry corresponds to a large area cross-flow horizontal circular reactor for 300 mm wafers with 30 mm circular inlet and outlet and a total diameter of 50 cm. The reactor is 2 cm tall [Fig. 1(b)]. The mesh has been generated using Gmsh,21 an Open Source mesh generator, and then converted to an openfoam-compatible format using the utility gmshToFOAM. The mesh used in this work is composed of 148 390 cells. Again, meshes with 2 times higher spatial resolution were also created to confirm that the results were independent of the mesh size.
The same flow conditions were used for all simulations in this work. For the tube geometry, we adapted the inlet velocity to ensure that the average flow velocity inside the reactor was 1 m/s, which is in agreement with the values expected in our experimental setup, and considered a pressure of 1 Torr. In Fig. 2(a), we show the steady state magnitude value of the flow velocity as it transitions from the inlet to the wider reactor region. The kinematic viscosity (Table I) is high enough to ensure that viscous terms compensate the inertial term and the flow quickly becomes fully developed, opening up in the reactor area without any wall separation that may lead to recirculation patterns and stagnation points at the inlet. The Reynolds number is for a characteristic length equal to the tube diameter, which is consistent with a laminar flow approximation, and a residence time of 0.5 s.
Parameter . | Symbol . | Value . | Units . |
---|---|---|---|
Precursor diffusivity | D 1 | 0.01 | m2 s−1 |
Kinematic viscosity | ν | 0.05 | m/s |
Process temperature | T | 473 | K |
Precursor molecular mass | M 1 | 150 | amu |
Surface site area | s 0 | 24 | nm2 |
By-product diffusivity | D bp | 0.05 | m2 s−1 |
Parameter . | Symbol . | Value . | Units . |
---|---|---|---|
Precursor diffusivity | D 1 | 0.01 | m2 s−1 |
Kinematic viscosity | ν | 0.05 | m/s |
Process temperature | T | 473 | K |
Precursor molecular mass | M 1 | 150 | amu |
Surface site area | s 0 | 24 | nm2 |
By-product diffusivity | D bp | 0.05 | m2 s−1 |
In the case of the 300 mm wafer reactor, the velocity field was obtained assuming 300 SCCM of total flow at a base pressure of 1 Torr. In Fig. 2(b), we show the magnitude of the flow velocity at a cross section located at midheight of the reactor. Both are consistent with stable flow patterns, with an average velocity of 0.5 m/s over the wafer area. This corresponds to a Reynolds number and a residence time of 1 s.
B. Ideal ALD process
1. Benchmark with the plug-flow model
In a previous work, we considered a plug-flow approximation of a cross-flow reactor and derived analytic solutions for the coverage profiles for the ideal ALD process.12 The plug-flow approximation assumes that precursor transport in the axial direction of a cylindrical reactor is dominated by advection, which corresponds to a case where the Peclet number, defined as the ratio of advective to diffusive transport rates, is greater than one,
where is the flow velocity, is the diffusivity, and is, in this case, the length of the tube. For the value of diffusivity in Table I, 1 m/s axial velocity and a reactor length of 40 cm, this condition is fulfilled, with .
In terms of the precursor pressure , the average flow velocity , dose time , and the same parameters used in Eq. (9), the predicted saturation coverage as a function of the dose time in the plug-flow approximation is given by the following expression:12
where
and
Here, the volume to surface ratio is equal to for a parallel plate reactor and for a tube reactor with being the radius of the cylindrical tube. In that same work, we also showed that Eq. (26) yielded excellent agreement with experimental saturation profiles obtained during the ALD of aluminum oxide from TMA and water.
In Fig. 3, we show a comparison between the simulated growth profiles obtained with our CFD model and those predicted by Eq. (26) for the cylindrical reactor. We present simulated growth profiles along the reactor for three different dose times: 0.05, 0.1, and 0.2 s. In all cases, the average precursor pressure at the inlet is set to 20 mTorr. The results, which have been obtained for a reaction probability , show a good agreement between this work and the analytic expression. A key difference between this work and the plug-flow approximation is that the latter neglects the presence of axial and radial diffusion: the combined effect of these two factors is a softening of the growth profiles with respect to the predictions of the plug-flow model.
Since our simulation domain incorporates part of the reactor inlet, in order to produce Fig. 3 we had to take into account the effect of upstream consumption in Eq. (26). We did so by considering an offset in the axial coordinate given by
where is the length of the inlet, is the radius at that specific location, and is the radius of the reactor tube. This expression is the result of considering the spatial dependence of [Eq. (28)] in the inlet as the radius (and therefore the mean flow velocity) changes with respect to the radius of the reactor tube.
2. Impact of reaction probability
The reaction probability is by far the most influential factor in the ideal ALD model. In Fig. 4, we show the impact of this parameter in the reactor growth profiles: all curves in Fig. 4 have been obtained considering the same dose time (0.2 s) and precursor pressure (20 mTorr) and different values of . As shown in previous works, a decrease in the reaction probability decreases the steepness of unsaturated growth profiles in the reactor. It is important to note, though, that a reduction of 1 order of magnitude in the value of does not necessarily reduce the total mass uptake by an order of magnitude. Instead, if the reaction probability is high enough, a reduction in the reaction probability simply redistributes the way the mass is deposited in the reactor. Only when the reaction probability is low enough, the system transitions from a transport-limited to a reaction-limited regime, and the mass uptake is directly proportional to .
The same trends are reproduced when instead of the tubular reactor we consider the 300 mm wafer reactor. In Fig. 5, we show the evolution of surface coverage during a single half cycle comprising a 0.3 s dose and 1s purge. The simulations assume a precursor pressure of mTorr and . The evolution of the surface coverage follows the steep saturation profile that is expected from high sticking probability processes.
In Fig. 6, we show the coverage profiles on 30 cm wafer substrates for increasing dose times and two values of the reaction probability: and . The results show the smoothening of the profiles and the increase in saturation times as the reaction probability decreases. This sensitivity to the surface reactivity at a wafer scale suggests that through a combination of CFD simulations and experimental thickness measurements on wafers it should be possible to extract sticking probabilities for self-limited processes in this type of reactors, similarly to how it is currently done using tubular reactors and macrotrenches.12,22
3. Modeling in situ QCM and QMS
As shown in Sec. III B 2, undersaturated growth profiles carry out information about the underlying kinetics of self-limited processes. However, the information that they provide is static, representing the cumulative effect of a whole dose. In contrast, in situ techniques such as QCM and QMS have enough temporal resolution to provide mechanistic information during each dose. One challenge of extracting kinetic data from these techniques is that kinetic effects are convoluted with the transport of the precursor inside the reactor. Here, we focus on simulating the output of these two techniques in order to understand how the flow and process conditions affect the measurements.
The transport of the precursor and the reaction by-products is strongly influenced by the reaction probability . In Fig. 7, we show the precursor partial pressure inside our tube reactor model at different snapshots in time taken at 0.1 s intervals during a single saturated dose. In particular, we compare processes with three different values of the reaction probability: , , and . In the high reaction probability case, the precursor is fully consumed before it reaches the end of the reactor. However, as the reaction probability decreases, an increasing fraction of the precursor reaches the outlet. In all cases, we observe a broadening of the pulse shape due to axial diffusion transport. This broadening depends on the precursor transport in the inlet region of the reactor (not shown in Fig. 7), and from an experimental point of view it will be affected by the flow and length of the inlet region of an ALD/thermal ALE reactor.
Precursor transport and surface reactivity impact how the shape of the QCM profiles changes as a function of the sensor position inside the reactor. In Fig. 8, we have considered the evolution of the fractional surface coverage during a single saturating dose at 10 different positions in our tubular reactor, each placed 4 cm apart. Under the ideal ALD conditions considered in this section, there is a one-to-one correspondence between the fractional surface coverage and the mass gain observed by the QCM.
The results in Fig. 8 show that time delays of up to 0.5 s can be expected between the first and the last QCM of the array. Precursor transport is a key factor in this offset, with a small contribution due to upstream precursor consumption. Upstream precursor consumption also leads to a change in slope observed as the QCM moves further from the inlet. This hinders our ability to extract quantitative information from data coming from a single QCM within a single half-cycle of a self-limited process without additional flow information.
We can again compare the results of the full model with the prediction of the simpler 1D plug-flow approximation for the tubular reactor. The plug-flow model predicts a mass gain as a function of time for a QCM located at a position given by
where is a rectifying function that is zero whenever the argument is negative. The value of represents the delay in the arrival of the pulse, which in the plug-flow approximation is simply given by
where is the average flow velocity in the tube. In Fig. 8(c), we show the predicted values of fractional coverage with time under the same conditions of 8(b). While the agreement is not perfect, Eq. (30) does provide a good approximation that could be used to discriminate between ideal and nonideal self-limited processes from QCM data.
The reaction probability strongly impacts the distribution of mass gains that would be observed using an array of sensors: for a high sticking probability process, mass gains sharply transition from fully saturated values in the upstream sensors to zero mass gain in the downstream sensors [Fig. 8(a)], whereas for lower values of the reaction probability [Fig. 8(b)] the difference in mass gain between upstream and downstream sensors is expected to be smaller. The use of multiple sensors can therefore be leveraged to extract quantitative data for a self-limited process if mass transport inside a reactor is well understood.
In order to model QMS input, we have tracked as a function of time the partial pressures of both the precursor and by-product species at a single location at the downstream position of the reactor. This is a common experimental configuration. In Fig. 9, we show the concentration of both species for the same two processes used for the QCM analysis: the reaction probability has a strong impact on the temporal separation of the traces coming from the precursor and the by-product. This is a natural consequence of the results shown in Fig. 7, where for high enough reaction probabilities all the precursor initially reacts inside the reactor. At the same time, reaction by-products are released since the beginning of the dose. This creates an offset between the two contributions. As the reaction probability goes down, this offset is reduced, as clearly seen in Fig. 9(b) for the case of .
This separation becomes more apparent if we break down a single dose into a sequence of microdoses. In Figs. 9(c) and 9(d), we show the precursor and by-product partial pressures during a sequence of microdoses, where each pulse corresponds to a 0.1 s dose. In the high sticking probability case [Fig. 9(c)], the first pulses are dominated by the contribution coming from the reaction by-products, and the transition between pure by-product and pure precursor signals occurs rather abruptly at 2 s. In contrast, a lower sticking probability [Fig. 9(d)] leads to by-product and precursor signals that persist throughout the microdose sequence and a gradual transition between by-product and precursor signals.
C. Nonideal self-limited processes
1. Soft-saturating processes
The first generalization of the ideal self-limited model that we have considered is the presence of more than one reactive pathway on the surface: we have incorporated a second self-limited reaction pathway, comprising a fraction of the surface sites and characterized by a different reaction probability (Sec. II C 2). This allows us to model soft-saturating processes with fast initial rise and slow saturation.
In Fig. 10, we show growth profiles in our tube reactor configuration for a self-limited process with a fast and a slow component for varying fractions of the slow component, . The growth profiles have been obtained using the same dose times (0.2 s) and precursor pressure in the inlet (50 mTorr) for all the cases. In Fig. 10(a), the fast and slow components have reaction probabilities of and , and they show the progressive transition from a more steplike saturation profile at to a more gentle, almost linear profile for .
In Fig. 10(b), the probability of the slow component is 2 orders of magnitude smaller, . The interesting aspect in this case is that for small values of , the region closer to the inlet shows the type of plateau that would be expected from a fully saturated process, except at surface coverages that are well below saturation. In Fig. 10(c), we show growth profiles for increasing dose times for the case of and . After a dose time of 0.5 s, the process has the appearance of saturation but its slow component is still evolving, resulting in a slightly higher coverage at 1 s dose time. The slow self-limited component could be easily mistaken for a nonself-limited component if the timescales for saturation are large enough. If we compare this result with the literature, there are several examples of this behavior in the case of ALD: for instance, it has been shown that extremely large doses of water can lead to slightly larger growth per cycle than the conventional ALD process.23
If we now consider this model on the 300 mm wafer reactor, we observe similar trends: in Fig. 11, we show the surface coverage on 300 mm wafers for increasing values of dose times for the same soft saturating process represented in Fig. 10(c). After 1 s, the whole wafer has almost identical surface coverage, yet it is still 10% below its saturation value.
A second consequence of this soft-saturating behavior is a process that appears to have reached saturation but that still has a measurable variability inside the reactor. Using the standard deviation of surface coverage as a measure of within-wafer variability, in Fig. 12 we compare the saturation curve of the soft-saturating process with that of the ideal process: both curves are very similar, except that the soft-saturating case “saturates” at a lower value of the fractional coverage [Fig. 12(a)]. On the other hand, the variability of the coverage across the wafer is much higher in the soft-saturating case and decreases asymptotically with dose times.
2. Site blocking by ligands and reaction by-products
A second generalization of the ideal self-limited interactions considers the effect of ligands or reaction by-products competing with the precursor for adsorption sites. This effect has been well documented in the ALD literature, leading to self-limited yet inhomogeneous growth profiles, as it is, for instance, the case of the ALD of TiO from titanium tetraisopropoxide and water and titanium tetrachloride and water.15,16 Growth modulation studies in ALD showed that alkoxides, betadiketones, and carboxylic acids are some of the moieties that interfere with the adsorption of ALD precursors.24
As described in Sec. II C 3, we have modeled the presence of competitive adsorption between precursor molecules and reaction by-products by considering that by-products can irreversibly react with surface sites through a first order Langmuir kinetics characterized by a reaction probability . Under this assumption, the surface now is composed of two types of adsorbed species, and we can therefore define a precursor fractional coverage and a by-product fractional coverage .
The sticking probability of both the precursor and reaction by-products is proportional to the fraction of available sites,
Consequently, both species are competing for the same pool of surface sites. The surface becomes unreactive when .
In Fig. 13, we show the impact that competitive adsorption has on the saturation profiles in our tube reactor. We consider that, on average, one ligand is released per precursor molecule. The by-product reactivity leads to the presence of thickness gradients even when the process reaches saturation. These gradients increase with the by-product reaction probability and are mitigated with increasing precursor pressures. The relative diffusivities of the precursor and by-product molecules also play a role, controlling the relative spread of the concentration gradients of both species as they move downstream in the reactor.
The results in Fig. 13 have been obtained assuming a reaction probability for the precursor of . In Fig. 13(a), we have explored the impact of increasing reactivity of the reaction by-product on the surface coverage of precursor molecules. The dose time (0.8 s) and average precursor pressure at the inlet during the dose (50 mTorr) lead to a complete saturation of the surface so that, in the absence of competition, the surface coverage is one everywhere in the reactor ( ). As we increase the reactivity of the surface by-product, we observe increasing gradients along the reactor due to the fact that, in saturation, . In Fig. 13(b), we show the surface coverage of the reaction by-product . The fact that the curves obtained mirror those of the precursor shown in Fig. 13(a) indicate that , as expected from a fully saturated process. In Fig. 13(c), we show the impact of precursor pressure for the case of under the same conditions used for Fig. 13(a): a higher precursor pressure tends to reduce the impact of by-product adsorption, as expected from a competitive adsorption process.
Similar results are obtained on the 300 mm wafer reactor. In Fig. 14, we show the by-product surface coverage at saturation for a precursor reactivity and two different ligand reactivities: and . By-product coverages increase from the upstream (left) to the downstream (right) position in the wafer, reaching maximum values of 5% and 25%, respectively.
In Fig. 15, we show the evolution of precursor coverage and variability for both cases, showing how both processes saturate at a point where is not zero. This provides a key differentiating feature between a soft-saturating process and a process with competitive adsorption by reaction by-products: in the former case, the variability is expected to decrease, albeit slowly, with increasing dose times, whereas the variability in the latter remains constant once saturation has been reached.
IV. DISCUSSION
In this work, we have explored self-limited processes in two different types of reactors: a cross-flow horizontal tube reactor and a model reactor for 300 mm wafers. For the tube reactor case, the results obtained are in good agreement with the prediction of an analytic model that we previously developed under the plug-flow approximation.12 The effect of axial diffusion is to smooth the growth profile for undersaturated doses with respect to the plug-flow approximation. The plug-flow model also shows a reasonable agreement with the expected QCM profiles as long as upstream consumption and propagation delays are properly accounted for. The plug-flow approximation underestimates the initial rise in surface coverage compared to the CFD model considering both axial and radial diffusion, but the overall agreement is good.
The first order irreversible Langmuir kinetics used for the ideal model is the simplest example of surface kinetics exhibiting self-limited behavior. In reality, though, many ALD and ALE processes exhibit more complex behavior. In this work, we considered two types of generalizations: we considered soft-saturating cases, which we modeled taking into account fast and slow self-limited reaction pathways, and the presence of site blocking by reaction by-products. Both cases represent examples of self-limited processes that can lead to reactor inhomogeneities. In the first case, inhomogeneities arise due to the timescale of the slow component requiring unfeasibly large doses to fully saturate the surface. The site-blocking case is intrinsically inhomogeneous even under saturation conditions.
The examples explored in this work are just two of the many sources of nonideal behavior. For instance, it has been postulated that the effective sticking probability of TMA can have a dependence with precursor pressure due to the presence of slow, reversible intermediates on the surface, something that is also well-known from the CVD literature.25 One of the challenges of including increasingly complex processes is the larger number of parameters, most of them unknown, that are introduced in the kinetic model: a self-limited process that is soft-saturating and has a slow reversible intermediate would have at least four independent parameters per precursor. In some cases, the surface mobility of the adsorbed species can greatly impact the effective sticking probability, for instance, increasing the capture zone around islands and steps in otherwise passivated surfaces.26
One of the challenges of modeling the surface kinetics of self-limited processes is the scarcity of kinetic data. This is also an issue for the mass diffusivity of different precursor molecules. Except for a few of the most commonly used precursors and some recent studies,17,27 the diffusivity of the majority of relevant species for ALD and ALE is not known. One way of extracting information on surface kinetics is through the use of saturation profiles. As mentioned above, plug-flow approximations have been used in the past to model and extract kinetic data from saturation profiles at a reactor scale. Likewise, growth profiles inside high aspect ratio features can be used to extract effective sticking probabilities of self-limited processes. Recently, the use of macrotrenches has greatly simplified the characterization of these growth profiles, allowing the use of techniques such as spectroscopic ellipsometry.22 The comparison between the kinetics at a reactor scale and inside macrotrenches can also provide information on the impact of the surface fluxes of different species on the kinetics of self-limited processes.
One example of the use of saturation profiles to extract kinetic data is shown in Fig. 16. In Fig. 16(a), we show an experimental configuration used to explore ALD at rates exceeding 1 cycle per second (1 Hz). Both precursors are brought into the reactor using two long injector lines. Precursor doses are so short that it reaches 100% precursor consumption across a 300 mm wafer. Using kinetic data for trimethylaluminum previously extracted in our tube reactor and fitted to the plug-flow model, we simulated this process using the model described in this work. The results, shown in Fig. 16(b), are obtained after considering three full ALD cycles. Without any fitting parameter, the agreement between the simulation and experiments is remarkably good, even capturing the presence of a nonself-limited CVD component at the center of the oval due to the simultaneous presence of trimethylaluminum and water on the surface. The main discrepancy between the simulations and the experiments is near the injection point: our model considers well-separated pulses, while in reality, the long injection channels cause a spread of the trimethylaluminum and water pulses, leading to a higher growth rate near the injectors due to precursor mixing. Still, Fig. 16 provides a good example of the potential of using reactor growth profiles as a source for kinetic information of self-limited processes.
Also, it is important to emphasize that, while we have focused on the case of ALD, the results presented in this work can be directly applied to atomic layer etching. For instance, in Fig. 8, we show the changes in mass as a function of time and reactor position as measured using in situ quartz crystal microbalance. In the case of thermal ALE, the only difference is that instead of mass gains, the plots in Fig. 8 would represent mass losses. Likewise, the simple 1D models explored in this work would naturally extend to the case of thermal atomic layer etching, providing a simple way of exploring the effective rate coefficients in ALE from etching profiles using subsaturating doses.
Finally, in this work, we have not considered the effect of high surface area materials inside the reactor. This is a technologically relevant case, as one of the key advantages of self-limited processes is its conformality, and yet the scale up of ALD processes to large area substrates can sometimes be far from trivial. In a prior work, we briefly explored the impact of a specific type of high surface area substrate on the reactant transport of an ALD precursor.28 That simple example showed the formation of a large region depleted of precursor near the center of the substrate for cross-flow reactors. However, this is an area that still needs further exploration and will be the focus of a future work. This is also a configuration where the symmetry between ALD and ALE breaks down: as the number of cycles starts to affect the shape and surface area of the nanostructures, we expect to see a divergence in precursor transport at the reactor scale for each case.
V. CONCLUSIONS
We have developed a versatile reactor scale simulation tool capable of modeling self-limited processes such as atomic layer deposition and atomic layer etching. All the process, from mesh generation to the visualization of 3D results, is based on open source tools. In addition to the traditional ideal self-limited model, we have explored soft-saturating processes and the case of competitive adsorption by reaction by-products. In particular, the model is able to capture two experimental observations in nonideal ALD processes: the presence of gradients at the reactor scale even under saturation due to competitive absorption between the precursor and reaction by-products and potential discrepancies in the growth per cycle due to homogeneous yet nonsaturated growth in soft-saturating processes. We have released the simulation code as well under a GPL v3 license and can be found at https://github.com/aldsim/aldFoam.
ACKNOWLEDGMENTS
This research is based upon work supported by Laboratory Directed Research and Development (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided on Blues, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory. We would also like to acknowledge Shashikant Aithal for his support and useful suggestions for testing our simulations in Blues.