Halide perovskite materials have attracted great interest for low-cost, solution-processed solar cells and other optoelectronic applications. The role of moisture in perovskite device degradation and crystal formation processes remains poorly understood. Here, we use a data-driven approach to discover the influence of trace amounts of water on perovskite crystal formation by analyzing a comprehensive dataset of 8470 inverse-temperature crystallization lead iodide perovskite synthesis reactions, performed over 20 months using a robotic system. We identified discrepancies between the empirical crystal formation rates in batches of experiments conducted under different ambient relative humidity conditions for each organoammonium cation. We prioritized these using a statistical model and then used the robotic system to conduct 1296 controlled interventional experiments, in which small amounts of water were deliberately introduced to the reactions. The addition of trace amounts of water promotes crystal formation for 4-methoxyphenylammonium lead iodide and iso-propylammonium lead iodide and inhibits crystal formation for dimethylammonium lead iodide and acetamidinium lead iodide. We also performed thin-film syntheses of these four materials and determined the grain size distributions using scanning electron microscopy. The addition of water results in smaller grain sizes for dimethylammonium and larger grain sizes for iso-propylammonium, consistent with delayed or earlier nucleation, respectively. The agreement between the inverse temperature crystallization and thin-film results indicates that this is a feature of the organoammonium-water interaction that persists despite differences in the synthesis method.
Halide perovskites are an emerging class of materials1 of interest for optoelectronics2 and photovoltaics.3 Choosing different A-site cations, such as different organoammonium species, yields a wide range of structural and functional diversity beyond the prototypical “perovskite” structure.4 Halide perovskites are solution processable, which reduces manufacturing costs, and further capital cost reductions could be achieved by reducing the need for environmental controls during device fabrication.5 However, environmental parameters, such as humidity, remain an important and incompletely understood aspect of perovskite device fabrication.6 Water affects crystal growth kinetics, morphology, ion migration, reversibility of phase transitions, and overall stability and is generally considered harmful to device stability and performance, although under certain conditions, small amounts of water result in higher quality films and enhanced photovoltaic performance.5,7–10 This is attributed to water promoting earlier nucleation (which results in low nucleation density and larger grain sizes), whereas anhydrous conditions allow the system to reach a higher degree of supersaturation before nucleation occurs (resulting in increased nucleation site density and smaller grain sizes).11,12 Post-synthesis water treatment can reorient polycrystalline thin films, improving charge-carrier extraction in photovoltaic devices.13 However, research has focused primarily on methylammonium, formamidinium, and cesium lead halides, and so less is known about the two-dimensional (Ruddlesden–Popper and/or Dion–Jacobson) perovskite materials that have improved long-term photovoltaic stability.14,15
Single crystal studies provide insight into the growth process and facilitate structure and property determinations for new materials.16–18 Inverse temperature crystallization (ITC) has many advantages for growing large, high-quality single crystals.19,20 ITC relies on a retrograde solubility effect, in which the product perovskite crystal is less soluble at high temperatures than the precursor species in solution.21 While previously believed to only occur for methylammonium and formamidinium perovskites, our recent work has found that ITC growth conditions can be used to produce 17 other lead iodide perovskites.22 Yet, the >1000 articles citing the seminal 2015 papers by Saidaminov et al.19 and Kadro et al.,20 which introduced ITC growth for halide perovskites, have paid scant attention to the role of water. Saidaminov et al. noted the role of water in perovskite degradation but did not discuss its role in the ITC process.19 Kadro et al. noted that the gamma-butyrolactone (GBL) solvent used for ITC is hygroscopic, and cavity nucleation by trace water boiling out during the ITC process might initiate growth.20 However, using freshly distilled GBL in an air- and water-free glovebox gave the same results, which was used to exclude this as a primary cause; even if water was present, in this theory, it would only promote nucleation. In their mechanistic study of the ITC process, Nayak et al. attributed the role of trace water to the degradation of dimethylformamide (DMF) and GBL solvents into acidic species, which promotes crystal formation by shifting the organoammonium protonation state equilibrium.21 Their key insight was that this could be achieved more reproducibly by deliberately adding formic acid; again, their model implies that water would only promote crystal formation.
Here, we report two counterexamples in which water hinders ITC crystal formation, which were discovered using a statistical analysis of historical data. Seasonal variations in ambient laboratory conditions influence reaction outcomes but are generally not controlled and are often only communicated in intra-lab conversations. These variations provide “limited sloppiness” that enables the occurrence of beneficial serendipity and harmful zemblanity.23 Even without the ability to control all possible reaction parameters during each experiment, one can measure and record these data. Given enough observations, statistical methods can be used to identify unexpected correlations.
The combination of high-throughput experimentation and software for comprehensive data capture enables a strategy of data-driven automated serendipity. We previously described our Robot-Accelerated Perovskite Investigation and Discovery (RAPID) system for the synthesis of lead iodide perovskites by ITC.22 Reactions are performed in batches of 96 experiments, typically sampled randomly over the achievable composition space. For the present analysis, reaction outcomes are reduced to a binary score (does a crystal form or not), independent of size or quality. Figure 1(a) depicts the observed crystallization probability in 94 batches containing 8470 reactions collected over 20 months in a single laboratory. (A full copy of this dataset, “perovskite0057” and the corresponding data analysis code are provided in the supplementary material.) Our Experiment Specification, Capture, And Lab Automation TEchnology (ESCALATE) software captures not only the experiment specification (composition, temperature, and other processing conditions) and reaction outcome but also the metadata associated with the experiment.24 This metadata includes ambient laboratory conditions, including laboratory humidity. As reactions are performed in open atmosphere (not in a glovebox), they experience variations in laboratory humidity. The distribution of observed relative humidity (RH%) is shown in Fig. 1(b).
These uncontrolled fluctuations provide a set of natural experiments to identify cases where humidity affected crystal formation. Figure 1(c) depicts an illustrative example of the relationship between laboratory humidity and crystal formation for ethylammonium lead iodide, (CH3CH2NH3)PbI3 (Ref. 25). The weak correlation (adjusted R² = 0.048) could be attributed to water playing a minor role in crystallization for this system or to other variations between the different experiment batches, such as differences in sampling regimes. Although the dataset contains a complete transcript of all experiments (including typically unreported “failed” reactions26) and experiments were predominantly determined by random sampling (to limit anthropogenic selection bias27), different sampling algorithms were used in this initial dataset. An initial algorithm sampled dispense volumes and was subsequently replaced with an algorithm that performed uniform sampling in the achievable concentration space.28 The dataset also contains deliberate reproducibility experiments, experiments designed consciously by humans, and experiments guided by machine learning algorithms operating in both exploration and exploitation modes. These different experiment sampling choices may be correlated with seasonal variations in laboratory humidity and, in turn, hinder automated serendipity.
Using the distribution of observed RH% [Fig. 1(b)], “low” and “high” humidity conditions were defined as the bottom (18.3–44.9% RH) and the top (54.7–63.0% RH) terciles, respectively. As the polymer-capacitive humidity sensors have a ±2.5% precision,29 restricting attention to the top and bottom terciles focuses on the cases where the effect is largest. Five organoammonium iodide salts had batches conducted in both low- and high-humidity conditions; the observed crystallization probabilities for the low- and high-humidity batches of experiments are depicted on the horizontal and vertical axes of Fig. 2, respectively, for each species.
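The tercile split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example humidity readings are hypothetical, and the cut points are computed with the standard-library `statistics.quantiles` rather than with the analysis code distributed with the dataset.

```python
from statistics import quantiles

def humidity_tercile_cuts(rh_values):
    """Return the two cut points that split the RH% readings into terciles."""
    lower_cut, upper_cut = quantiles(rh_values, n=3, method="inclusive")
    return lower_cut, upper_cut

def label_batch(rh, lower_cut, upper_cut):
    """Label a batch as low, middle, or high humidity from its RH% reading."""
    if rh <= lower_cut:
        return "low"
    if rh >= upper_cut:
        return "high"
    return "middle"

# Hypothetical per-batch RH% readings (not taken from the dataset).
example_rh = [18.3, 30.0, 40.0, 44.9, 50.0, 52.0, 54.7, 60.0, 63.0]
lower_cut, upper_cut = humidity_tercile_cuts(example_rh)
labels = [label_batch(rh, lower_cut, upper_cut) for rh in example_rh]
```

Only batches labeled "low" or "high" would then be compared, which discards the middle tercile where sensor precision makes the assignment least reliable.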
Confirmation experiments were prioritized using an elementary statistical model. Let us assume that each individual crystallization experiment is a Bernoulli trial with crystal formation probability $p$. Two possible physical interpretations are as follows: (i) $p$ describes the stochastic nature of crystal nucleation in each individual experiment; (ii) $p$ describes the relative volume of successful experiments in the compositional space, i.e., the chance that a randomly selected experiment is successful when one uniformly samples possible compositions. The number of crystals formed in a batch of $n$ experiments is the sum of these Bernoulli trials, i.e., it follows the binomial distribution, $B(n, p)$. Suppose one compares two simultaneous batches of $n$ experiments conducted under conditions a and b, with crystal formation probabilities $p_a$ and $p_b$, respectively (e.g., low humidity and high humidity). What is the probability of observing the “wrong” outcome (i.e., fewer total successes in group b than in group a, despite $p_a < p_b$)? The probability of $k$ successes for group a is the probability mass function of $B(n, p_a)$, denoted as $f(k; n, p_a)$. The probability that as-many-or-fewer-than $k$ successes occur for group b is the cumulative distribution function of $B(n, p_b)$, denoted as $F(k; n, p_b)$. Both $f$ and $F$ have well-known analytical expressions. As experiments in each batch are independent, the probability that both events occur is the product, and summing over $k$ gives the probability that group b yields no more successes than group a,

$$P(k_b \le k_a) = \sum_{k=0}^{n} f(k; n, p_a)\, F(k; n, p_b). \tag{1}$$

By similar logic, the probability that more than $k$ successes are observed for group b is $1 - F(k; n, p_b)$, so the probability of more successes in b than in a is

$$P(k_b > k_a) = \sum_{k=0}^{n} f(k; n, p_a)\,\bigl[1 - F(k; n, p_b)\bigr]. \tag{2}$$
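The two batch-comparison probabilities in this model can be evaluated numerically. The sketch below is a minimal illustration and not the analysis code distributed with the article; the function names are hypothetical, and it assumes that the cumulative function counts "as many or fewer than k" successes, so that ties between the two batches are included in the first probability.

```python
from math import comb

def binom_pmf(k, n, p):
    """f(k; n, p): probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def binom_cdf(k, n, p):
    """F(k; n, p): probability of k or fewer successes in n Bernoulli trials."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

def prob_b_at_most_a(n, p_a, p_b):
    """Sum over k of f(k; n, p_a) * F(k; n, p_b): batch b has no more successes than a."""
    return sum(binom_pmf(k, n, p_a) * binom_cdf(k, n, p_b) for k in range(n + 1))

def prob_b_exceeds_a(n, p_a, p_b):
    """Sum over k of f(k; n, p_a) * (1 - F(k; n, p_b)): batch b has more successes than a."""
    return sum(binom_pmf(k, n, p_a) * (1.0 - binom_cdf(k, n, p_b)) for k in range(n + 1))

# Illustrative values only: two batches of 24 experiments each.
p_tie_or_wrong = prob_b_at_most_a(24, 0.2, 0.5)
p_expected = prob_b_exceeds_a(24, 0.2, 0.5)
```

The two quantities are complementary, so they sum to one for any choice of n and probabilities, which is a convenient sanity check on an implementation.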
Suppose one is willing to accept a 5% probability of observing the “wrong” outcome (i.e., despite $p_a < p_b$, we observe $k_a > k_b$), for a given experiment budget of $2n$ (i.e., $n$ trials assigned to each batch). This demarcation corresponds to setting Eq. (1) or Eq. (2) equal to 0.05 and solving for the contour of $p_b$ for each input value of $p_a$; contours for n = 24, 48, and 96 are shown in Fig. 2. Points outside a given contour line have a <5% chance of giving the “wrong” answer for the given number of trials, and thus, the two chemical systems outside the blue n = 24 contour will be “easiest” to verify, provided that the low- and high-humidity groups have been sampled identically for all other variables, such as composition. To reiterate, this condition is not strictly obeyed for this dataset, as the historical low- and high-humidity batches were generally sampled using different methods (vide supra). Even so, this analysis allows us to identify possible serendipitous results and rank the difficulty of testing them in subsequent interventional experiments.
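One way to trace such a contour is a bisection search on the second probability for each value of the first. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, the "wrong" outcome is taken to include ties, and only the branch with the b probability above the a probability is computed. The binomial helper is repeated so that the block runs on its own.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def wrong_outcome_prob(n, p_a, p_b):
    """Probability that batch b shows no more successes than batch a."""
    total, cdf_b = 0.0, 0.0
    for k in range(n + 1):
        cdf_b += binom_pmf(k, n, p_b)          # running F(k; n, p_b)
        total += binom_pmf(k, n, p_a) * cdf_b  # f(k; n, p_a) * F(k; n, p_b)
    return total

def upper_contour(p_a, n, alpha=0.05, tol=1e-6):
    """Smallest p_b > p_a for which the wrong-outcome probability is at most alpha.

    Returns None if even p_b = 1 cannot reach alpha for this n.
    """
    if wrong_outcome_prob(n, p_a, 1.0) > alpha:
        return None
    lo, hi = p_a, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if wrong_outcome_prob(n, p_a, mid) > alpha:
            lo = mid   # still too likely to be wrong: need a larger p_b
        else:
            hi = mid   # acceptable: try a smaller p_b
    return hi

# Contour points at an illustrative p_a = 0.2 for the three batch sizes in Fig. 2.
contour_points = {n: upper_contour(0.2, n) for n in (24, 48, 96)}
```

Because the wrong-outcome probability decreases monotonically as the second probability increases, bisection is sufficient, and larger batches move the contour closer to the diagonal.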
To confirm the predictions described above, we conducted 1296 controlled matched-pair experiments using RAPID22 (see the supplementary material for full experimental details). Each batch of 96 experiments consisted of 48 matched pairs, whose compositions were sampled over the possible organoammonium iodide, lead iodide, and formic acid concentration space.22,28 Sampling strategies included (a) uniform sampling over all achievable compositions, (b) uniform sampling over compositions where the formic acid concentration was between 7 and 10 M, and (c) uniform sampling within the convex hull defined by previously observed crystal formation. For each sampled composition, a pair of experiments (one without water, the other with deliberate addition of 10 μl water) was performed. To eliminate any inadvertent water content, all solvents were dehydrated overnight with 3 Å molecular sieves, which were activated in an oven at 250 °C for 3 h before use. Glassware was dried prior to the experiment by heating in a dry oven at 100 °C for 24 h. Paired experiments were placed in a mirror symmetric (“butterfly”) pattern on the 96-well microplate, depicted in Fig. S2, to eliminate any effects of temperature variations across the plate during heating.30 To define our notation, Npairs is the total number of unique pairs of experiments performed (one each with and without the addition of water), and Nwet+ and Ndry+ are the number of crystals observed for experiments with and without added water, respectively. Concordant outcomes, in which both the wet and dry experiments of a pair yield a crystal (N++) or neither does (N−−), are uninformative. Discordant outcomes (N+− when the wet experiment yields a crystal but the dry experiment does not, and N−+ when the wet experiment does not yield a crystal but the dry experiment does) are analyzed using the exact unconditional two-sided McNemar pair test,31 as each pair differs only because of the intervention, and each outcome is dichotomous.
(The commonly used asymptotic McNemar test may fail for small numbers of examples,32 which is avoided by using the exact unconditional form.33) The relative number of the discordant outcomes reveals the directionality and magnitude of the effect. When N+− > N−+, water promotes crystal formation; when N−+ > N+−, water inhibits crystal formation. Table I contains a summary of the results and statistical analysis; the electronic supplementary material contains the complete machine-readable description and visualization of the experiment outcomes.
| Organoammonium | Npairs | Ndry+ | Nwet+ | N++ | N+− | N−+ | N−− | McNemar p |
|---|---|---|---|---|---|---|---|---|
| Dimethylammonium (all) | 192 | 97 | 81 | 71 | 10 | 26 | 85 | 0.0080 |
| (a) Uniform sampling | 96 | 29 | 29 | 24 | 5 | 5 | 62 | 0.96 |
| (b) 7 M ≤ [FAH] ≤ 10 M | 48 | 24 | 23 | 19 | 4 | 5 | 20 | 0.80 |
| (c) Historical crystallization regions | 48 | 44 | 29 | 28 | 1 | 16 | 3 | 0.00026 |
| 4-methoxyphenylammonium (all) | 144 | 2 | 11 | 0 | 11 | 2 | 131 | 0.015 |
| (a) Uniform sampling | 48 | 0 | 0 | 0 | 0 | 0 | 48 | … |
| (b) 7 M ≤ [FAH] ≤ 10 M | 48 | 1 | 6 | 0 | 6 | 1 | 41 | 0.066 |
| (c) Historical crystallization regions | 48 | 1 | 5 | 0 | 5 | 1 | 42 | 0.11 |
| Acetamidinium (all) | 144 | 60 | 43 | 38 | 5 | 22 | 79 | 0.0010 |
| (a) Uniform sampling | 48 | 18 | 16 | 16 | 0 | 2 | 30 | 0.19 |
| (b) 7 M ≤ [FAH] ≤ 10 M | 48 | 22 | 12 | 9 | 3 | 13 | 23 | 0.013 |
| (c) Historical crystallization regions | 48 | 20 | 15 | 13 | 2 | 7 | 26 | 0.11 |
| iso-propylammonium (all) | 144 | 3 | 10 | 3 | 7 | 0 | 134 | 0.0083 |
| (a) Uniform sampling | 48 | 0 | 1 | 0 | 1 | 0 | 47 | 0.34 |
| (b) 7 M ≤ [FAH] ≤ 10 M | 48 | 3 | 8 | 3 | 5 | 0 | 40 | 0.029 |
| (c) Historical crystallization regions | 48 | 0 | 1 | 0 | 1 | 0 | 47 | 0.34 |
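The discordant-pair logic can be illustrated with a short sketch. Note the substitution: the article uses the exact unconditional McNemar test, whereas the code below implements the simpler exact conditional (binomial) form, so its p-values are expected to be close to, but not identical with, those in Table I. The function names are hypothetical.

```python
from math import comb

def mcnemar_exact_conditional(n_wet_only, n_dry_only):
    """Two-sided exact conditional McNemar p-value from the discordant counts.

    n_wet_only: pairs where only the wet experiment gave a crystal (N+-).
    n_dry_only: pairs where only the dry experiment gave a crystal (N-+).
    Returns None when there are no discordant pairs (test undefined).
    """
    n = n_wet_only + n_dry_only
    if n == 0:
        return None
    k = min(n_wet_only, n_dry_only)
    # Under the null hypothesis, each discordant pair is wet-only with probability 1/2.
    tail = sum(comb(n, j) for j in range(k + 1)) / 2**n
    return min(1.0, 2.0 * tail)

def water_effect(n_wet_only, n_dry_only):
    """Direction of the effect implied by the discordant counts."""
    if n_wet_only > n_dry_only:
        return "promotes"
    if n_dry_only > n_wet_only:
        return "inhibits"
    return "no trend"

# Dimethylammonium, strategy (c): N+- = 1 and N-+ = 16.
p_dma_c = mcnemar_exact_conditional(1, 16)   # about 2.7e-4
direction_dma_c = water_effect(1, 16)        # "inhibits"
```

For this row the conditional form gives 36/131072, close to the tabulated unconditional value. The two forms diverge more for balanced counts, such as the 5 versus 5 discordant pairs in the dimethylammonium uniform-sampling row, where the conditional p-value is 1.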
Matched-pair experiments were initially conducted for the dimethylammonium and 4-methoxyphenylammonium lead iodide systems by sampling the achievable compositional space uniformly, denoted as strategy (a). As indicated in Table I, there was no overall discrepancy between crystal formation in the presence or absence of water. For dimethylammonium, Nwet+ and Ndry+ were the same, and there was no systematic trend in the discordant outcomes (N−+ = N+−). For 4-methoxyphenylammonium, no perovskites formed in this set of experiments. However, many of these sampled compositions simply fail to form compounds, which underrepresents regions where water modifies crystal formation. Historical reaction data for 4-methoxyphenylammonium iodide showed that formic acid (FAH) concentrations below 7 M and above 10 M almost exclusively yielded non-crystalline powder or clear liquid, respectively, motivating us to constrain sampling to within those bounds, denoted as strategy (b). Those experiments resulted in a meaningful difference in the discordant outcomes for 4-methoxyphenylammonium (p = 0.066), but not for dimethylammonium (p = 0.80). (The p-value indicates the probability of the observation under the null hypothesis that outcomes with and without water are the same; lower p-values indicate stronger evidence against this null hypothesis.) As formic acid-constrained sampling may not include conditions relevant to each cation, strategy (c) used historical data collected for each cation to construct a convex hull in compositional space where crystal formation occurred and then generated new experiments by a grid sampling of 48 compositions within this region (see Fig. S3). Experiments using strategy (c) revealed differences in crystal formation with and without water for these two systems, with p = 0.00026 and p = 0.11 for dimethylammonium and 4-methoxyphenylammonium, respectively.
Satisfied that at least one of these sampling conditions would highlight the effect of water on crystallization, we used all three strategies for the remaining two systems: acetamidinium and iso-propylammonium. In both cases, strategy (b) resulted in the greatest discrepancy between reaction outcomes with and without added water (p = 0.013 for acetamidinium and p = 0.029 for iso-propylammonium).
Different sampling strategies explore regions in compositional space where it is easier or harder to find discordant examples, but the trends should hold over all conditions when the data are combined. Indeed, statistically significant differences in reaction outcome were observed for all four systems studied. Water promotes crystal formation (N+− > N−+) for 4-methoxyphenylammonium and iso-propylammonium systems, consistent with expectations from prior ITC results. In contrast, water inhibits crystal formation (N−+ > N+−) for dimethylammonium and acetamidinium systems, contrary to previous expectations. The interventional outcomes agree with the qualitative predictions in Fig. 2 in 3 of the 4 tested systems, despite the widely different sampling strategies employed in the historical dataset. Furthermore, the fewest discordant outcomes are observed for iso-propylammonium in these interventional experiments, consistent with the difficulty estimate provided by the contour lines in Fig. 2. Discordant reaction outcomes are scattered throughout compositional space in each system (see Figs. S4–S16), supporting the first physical interpretation of the statistical model.
To explore the relevance for device fabrication, perovskite thin films were fabricated via spin-coating for each of the four cations using the SolTrain system.34 Separate films were prepared in a glovebox using precursor solutions prepared without water, with 1% (v/v) water, and with 2% (v/v) water. The same solvents and temperature ranges were used as for the ITC counterparts. The grain length distribution of the resulting films was characterized by scanning electron microscopy (SEM). X-ray diffraction (XRD) patterns indicate that the single-crystal and thin-film syntheses resulted in the same perovskite phase (see the supplementary material for relevant experimental details and XRD comparisons). The surface roughness and the clustering of grains in perovskite thin films are highly dependent on the choice of A-site cations, even without the addition of water, consistent with previous observations.35 Water has only a slight effect on the grain lengths of 4-methoxyphenylammonium and acetamidinium (see Fig. 3). In contrast, water increases grain length in iso-propylammonium perovskite thin films and decreases grain length in dimethylammonium films. The latter is unusual, as most prior studies report that water enhances grain growth, but it is consistent with the corresponding ITC results. In spin-coating experiments, larger grains occur when grain growth is faster than nucleation (typically at lower supersaturation of the perovskite precursors) and smaller grains occur when nucleation is faster than grain growth (typically at higher supersaturation).36 Water promotes nucleation of iso-propylammonium lead iodide at a lower supersaturation concentration, resulting in larger grains. In contrast, water inhibits nucleation of dimethylammonium lead iodide until the system reaches higher supersaturation, resulting in decreased grain lengths.
Despite the different crystallization mechanisms, information obtained about free-standing single crystal formation by ITC can be used to identify systems where additives modify substrate-based grain growth by spin-coating. Other qualitative variations depend upon both the cation and the presence of water, without a direct relationship to the grain length distribution changes. For example, the addition of water increases the clustering of grains and pinholes present in the acetamidinium and iso-propylammonium films [Figs. 3(i) and 3(l)]. However, these features depend more upon the cation selected than on the presence or absence of water, as seen by comparing dimethylammonium to iso-propylammonium films [Figs. 3(f) and 3(o)]. Further work is needed to determine how thin-film morphology in the presence of water-organoammonium interactions is modified by growth temperatures, solvent types, and process parameters.
In summary, water can promote or inhibit perovskite crystallization, depending on the organoammonium cation species present. The latter observation contrasts with previous ITC studies, which attributed the role of water to creation of nucleation sites (from vaporization) or changes in protonation equilibrium (through formation of acid decomposition products), both of which only promote crystal formation. Qualitatively consistent trends in both ITC and thin-film systems suggest a common underlying water-organoammonium interaction mechanism. A practical implication is that trace amounts of water can provide an additional experimental parameter to produce the compact thin films of desired grain sizes needed for stable and efficient devices.
More broadly, this work demonstrates the value of comprehensive electronic experimental records containing both data and metadata. Such records are a prerequisite for automated serendipity, enabling the statistical identification of anomalies that can be subjected to a more deliberate study. As such, it serves as additional encouragement for the adoption of automated laboratory processes (such as RAPID22 and SolTrain34) and software (such as ESCALATE24) that facilitate this type of data collection and reuse.
See the supplementary material for the following: the description of data files and analysis codes; expanded discussion of materials and methods for the ITC and thin-film syntheses and SEM characterization; expanded discussion of sampling strategies; figures illustrating reaction outcomes for the 1296 ITC experiments as a function of composition; and an expanded version of Fig. 3 showing results for 1% water addition.
AUTHORS' CONTRIBUTIONS
V.G. and J.S. conceived the project and performed the historical data analysis. P.W.N. and Z.L. performed the synthesis, characterization, and data analysis of ITC perovskites. J.T., S.S., and N.T.P.H. performed the synthesis and characterization of thin-film perovskites. P.W.N. performed the data analysis of the thin-film perovskites. J.S., E.M.C., A.J.N., and T.B. supervised the project. All authors contributed to the preparation of the manuscript.
This study is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001118C0036. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA. Work at the Molecular Foundry was supported by the Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02–05CH11231. J.S. acknowledges the Henry Dreyfus Teacher-Scholar Award (No. TH-14–010) and resources of the MERCURY consortium (http://mercuryconsortium.org/) under NSF Grant No. CHE-2018427. S.S. acknowledges partial support from TOTAL S.A. through a research fellowship.
The authors declare general IP in the area of applied machine learning, and some of the authors are associated with startup efforts to accelerate materials development using applied machine learning. These do not relate to this study specifically, as all data and code from this study are open sourced.
DATA AVAILABILITY
Complete data sets and Mathematica 12.1 and Python 3.7 codes used for data analysis and figure generation are openly available in figshare at https://doi.org/10.6084/m9.figshare.14963733, Ref. 37. The data that support the findings of this study are available within the article and its supplementary material.