Heterogeneous nucleation in the random field Ising model

We investigate the nucleation dynamics of the three-dimensional random field Ising model (RFIM) under an external field. We use umbrella sampling to compute the free-energy cost of a critical nucleus, and use forward flux sampling for the direct estimation of nucleation rates. For moderate to strong disorder, our results indicate that the size of the nucleating cluster is not a good reaction coordinate, contrary to the pure Ising model. We rectify this problem by introducing a coordinate that also accounts for the location of the nucleus. Using the free energy barrier to predict the nucleation rate, we find reasonable agreement, although deviations become stronger as disorder increases. We attribute this effect to cluster shape fluctuations. We also discuss finite-size effects on the nucleation rate.


I. INTRODUCTION
Nucleation phenomena control many important physical processes, including vapor condensation 1 , ice crystallization 2 , and many others [3][4][5][6][7][8][9] .Classical theories [10][11][12] describe the nucleation process as the spontaneous formation of a cluster of the stable product phase within a homogeneous metastable phase, across a free energy barrier that is the free energy cost of forming a critical nucleus.In modern formulations, this means that the reaction coordinate is the size of the nucleating cluster 13 .Classical nucleation theory (CNT) estimates this free energy barrier from macroscopic properties of the phases, and provides a good qualitative description of many nucleation processes 3 .In particular, nucleation in colloidal systems has been numerically studied in great detail [13][14][15][16][17] , establishing the modern computational approach to the study of nucleation dynamics.As a simple example of homogeneous nucleation, there is also an extensive body of numerical work on the domain-reversal dynamics of the Ising model [18][19][20][21][22][23] , which is a simple and computationally tractable model system.
However, while these theories of homogeneous nucleation are elegant and consistent with computer simulation results, experimental systems are often affected by heterogeneous nucleation, for example due to random impurities, or surfaces.These are beyond the scope of classical theories.Studies of nucleation in Ising models have been extended to heterogeneous nucleation by manipulating the boundary conditions [24][25][26] or introducing impurities into the system [27][28][29] .In particular, it was found that nucleation in free boundary conditions happens preferentially at the corners of the system 24 , and that the introduction of a single fixed spin can speed up nucleation by four orders of magnitude 27 .It was more recently found that the introduction of randomly placed 0-spins lowers both the free energy barrier and the critical nucleus size 29 .
In this work, we analyze nucleation in the random field Ising model (RFIM), which is a prototypical system for studying effects of disorder on first-order phase transitions.The RFIM provides a schematic description of many physical systems where impurities play an important role, such as diluted antiferromagnets in a homogeneous external field 30 , mixed Jahn-Teller systems 31 , binary liquids in porous media 32 , etc. (a review can be found in Ref. 33).As such, the model provides an interesting setting for effects of disorder on nucleation.Note for example that it interpolates smoothly between homogeneous nucleation when the disorder strength is zero, and heterogeneous nucleation at large disorder.In addition, recent work has connected the RFIM to properties of glassforming liquids [34][35][36][37][38][39][40][41] , which provides a further motivation for studies of finite-temperature dynamics in the RFIM [42][43][44][45][46][47] .
We study the three-dimensional RFIM using computer simulations.An established approach for nucleation 23,29 is to use umbrella sampling 48 to compute a free energy barrier associated with the critical nucleus, which can then be compared with direct estimation of the nucleation rate by forward flux sampling (FFS) [49][50][51] or other rare-event sampling methods 52 .Following the same path for the RFIM, an analysis based on the committor 53 shows that that the size of the nucleating cluster is not a suitable reaction coordinate for nucleation, except when the disorder is very weak.We rectify this problem by introducing a localized reaction coordinate, which measures the barrier for nucleation in a specific part of the system.Using these results to predict nucleation rates, we compare with FFS simulations, finding agreement to within an order of magnitude over a wide range of nucleation rates, although deviations become stronger as disorder increases.We attribute these deviations to shape fluctuations of the nucleating cluster, which can affect the rate 52,54 .We also analyze finite-size effects on the nucleation rate, where rare regions of the system can play an important role.
In the remainder of this paper, Sec.II introduces the RFIM and our theoretical approach.Sec.III presents the results and Sec.IV shows our conclusions.We describe our numerical methods in the Appendices.

A. RFIM
We perform our investigation in three spacial dimensions, which is the lowest dimension at which the RFIM has a ferromagnetic phase [55][56][57] .The three-dimensional RFIM is defined on a cubic lattice of linear size L.Each of the N = L 3 vertices on this lattice contains an Ising spin s i which takes the values ±1, corresponding to the up-spin and down-spin states.A configuration of the system is denoted by s = (s 1 , s 2 , . . ., s N ).
Each spin interacts with its nearest neighbors by an exchange interaction of strength J > 0, and feels a magnetic field of strength H + h i , so the system's energy is where the notation ⟨i j⟩ indicates a sum over pairs of nearest neighbors, while the sum over i run over all spins in the system.The parameter H represents the external magnetic field, and h i is a quenched random field on site i that are independent and identically distributed Gaussian random variables with standard deviation R. Physically, R is the typical magnitude of the random field.We fix the energy scale by setting J = 1, so the dimensionless parameters that appear in the energy are H and R.
We study the domain reversal dynamics of the RFIM in a small positive external field, from the metastable state of bulk down spins to the stable state of bulk up spins.As the total number of up spins is not conserved, it is natural to use Metropolis dynamics 58 .In a single Monte Carlo (MC) move, one picks a random spin s i and proposes to change its value from s i to −s i .This proposed move is accepted with probability min(1, exp (−β ∆E i )) where ∆E i is the change in energy due to the proposed move.A sequence of N such moves is called an MC sweep (MCS), which provides the natural time unit for our system.

B. Becker-Döring theory of nucleation
Theories of nucleation aim to predict the nucleation rate per unit volume, denoted here by I.In our case, this means that a system initialized in a metastable phase undergoes nucleation with probability p nuc (∆t) = IN∆t in a short time ∆t.One typically expects that I is an intensive quantity (independent of system size), which is the case for the pure Ising model.However, situations may be more complicated in systems with disorder 28 .
To estimate I, we start from Becker-Döring theory 11,12 , which is framed in terms of the concentrations of clusters of up spins, and forms the basis of classical nucleation theory (CNT).Write M n (s) for the number of up-spin clusters of size n in configuration s.The free energy of such a cluster is measured relative to that of an individual monomer as We also identify ρ n = ⟨M n ⟩/N as the concentration of such clusters, and in particular ρ 1 is the concentration of isolated up spins.Note that this theory does not distinguish the shapes of the clusters, nor their locations in the system.Becker-Döring theory assumes additionally that clusters grow and shrink by single spin flips, and that this process is Markovian.The rates of growth and shrinkage are related through the detailed balance condition, expressed in terms of the equilibrium concentrations ρ n 14,59 .Finally one assumes that nucleation is a rare event in which case F(n) will have a large barrier at a cluster size n * ≫ 1.Then the size n can be promoted to a continuous coordinate and the nucleation dynamics can be reduced to a onedimensional Brownian motion in a potential F(n).The nucleation rate is controlled by the barrier height as where ∆F = F(n * ) and f * is the rate that the cluster size increases from n * to n * + 1, and is the Zeldovich factor 12 which gives the extent to which the critical nucleus needs to grow before falling into the product basin.
In CNT, the barrier height in (3) is estimated in terms of macroscopic properties of the starting (metastable) phase and the nucleating (stable) one.This is a drastic assumption since practical critical nuclei are unlikely to be macroscopic.Fortunately, accurate microscopic computations of ∆F are possible using computer simulations [13][14][15][16][17]23,29,52,60 .

C. Reaction coordinate and committor
Modern theories for rare transitions between metastable states 54 are framed in terms of a reaction coordinate, and describe the kinetic pathway by which the system transforms.For nucleation, this pathway involves the growth of a cluster of the nucleating phase.However, the most appropriate reaction coordinate for describing this process is a subtle question even for systems without disorder, involving an interplay of the cluster size and shape.For the RFIM, we show below that one must also consider the cluster location.
For consistency of Becker-Döring theory with modern rareevent theories, one should identify the reaction coordinate with the size of the nucleating cluster, which is the largest cluster of the stable phase in the system 13 .In our system, this is the size of the largest connected cluster of up spins, which we denote by λ (s).To test whether λ is a good reaction coordinate, one should consider the committor p B 53 : for any configuration s, this p B (s) is defined as the probability that a trajectory initialized in s reaches the nucleating (stable) phase before it returns to the parent (metastable) one.This probability is estimated numerically by running many such trajectories.
The ensemble of configurations s with p B (s) = 0.5 plays an important role in transition path theory: it is called the transition state ensemble (TSE).If λ is the optimal reaction coordinate, then the TSE can be characterized as the ensemble with λ (s) = n * where n * is the size of the critical nucleus.This allows Becker-Döring theory to be interpreted in terms of this reaction coordinate.For the pure Ising model, this situation holds quite accurately for λ 23 .On the other hand, if a poor reaction coordinate is chosen, whose free energy maximum does not correspond to the TSE, one would expect faulty estimates of the reaction rate 53,61 .
In addition to choosing an appropriate reaction coordinate, theories for homogeneous nucleation require some care because the nucleation rate also depends on the system size.In standard rare event theories the rate is proportional to e −β ∆F and the free energy F can be estimated via a histogram of the reaction coordinate, typically extracted by umbrella sampling.However, it is important in nucleation theories that ∆F is instead computed via (2), see Ref. 14 and also Refs.26 and 62 for a discussion.This free energy can still be computed from umbrella sampling simulations, see Appendix A for details.

III. RESULTS
Analysis of nucleation requires a suitable choice of model parameters.Writing T c (R) for the critical temperature of the RFIM, we must choose the temperature T significantly below T c (R), but still high enough that simulations are tractable, as well as avoiding the roughening transition that occurs at low temperatures in the pure Ising model.We work throughout at T = 2.7 ≈ 0.6T c (0) which is a representative parameter choice within this regime.The external field H must be chosen small enough that nucleation is a rare event, but very small values lead to very large critical nuclei, which are problematic for numerics.The values used in the following respect these constraints.

A. Free energy and committor distribution
For any given disorder realization the free energy F(n) can be measured by umbrella sampling, which we describe in Appendix A. An example is shown in Fig. 1, based on a representative realization with weak disorder R = 0.1.Based on this free-energy profile, we estimate the size of the critical nucleus as n * ≈ 210 and we compute the committor distribution P(p B ) for the ensemble with λ (s) = n * .This is shown in the inset of Fig. 1.The distribution shows a single peak near p B = 0.5, indicating that the cluster size is a suitable reaction coordinate for nucleation, for this (weak) disorder.(That is, configurations with λ = n * do form a good approximation for the TSE.) Comparing with F(n) with results of Ref. 23 for the pure Ising model, we observe that the addition of small disorder leaves the critical nucleus size almost unchanged, but slightly reduces the height of the barrier (the difference is approximately 5k B T ).Reduced barrier heights are generic in the presence of disorder, as observed for example in heterogeneous nucleation 27,29 .
At higher disorder, the effect of the random field becomes much more pronounced.Fig. 2 shows free-energy profiles computed for 8 disorder realizations at R = 1.0.Their shapes vary significantly between disorder realizations, with some even showing nonconvexity, which can be attributed to pinning: a set of sites with highly negative random field values around a growing cluster induces an energetic barrier to the growth of that cluster past these sites, which modifies the shape of the free energy curve.The behavior for intermediate disorder (R = 0.7) is qualitatively similar despite the free energy curves remaining convex, and will be discussed in Sec.III D, below.
Focusing on one of the realizations in Fig. 2, F(n) is replotted as the black curve in Fig. 3(a).The corresponding committor distribution is shown in the inset of that Figure , showing that P(p B ) is no longer sharply peaked, so the cluster size is not a good reaction coordinate, contrary to the assumption of classical theories, and that extra information is required to describe the nucleation mechanism.This is due to FIG. 3. (a) The free energy curves obtained through umbrella sampling for a system with R = 1.0, T = 2.7 and H = 0.25.The black curve is the unconstrained free energy, while the colored curves represent free energies calculated using the constraint scheme introduced in Sec.III B. (b) The average local magnetization m i collected at the peak of the barrier, where darker patches corresponds to higher magnetization.The colors of the circles around the visible clusters match with the colors of their corresponding free energy curves.(c) The heatmap W i of the ensemble of spherical nuclei of size n * , where darker patches corresponds to higher probability.The inset shows the committor distributions measured from configurations taken at the peak of the black free energy curve.
the random field breaking translational invariance in the system: for a given cluster in the RFIM, the probability that it is expected to grow or shrink is not just a function of its size, but also heavily affected by the random field configuration around it.We show in the following that a better reaction coordinate can be obtained by insisting that a cluster grows in a specific location.

B. Localized reaction coordinate
Our method for constructing an improved reaction coordinate is illustrated with the single disorder realization considered in Fig. 3(a).We discuss the general case in at the end of this section.For the the size n * ≈ 340 that maximizes F(n), we extract representative configurations by umbrella sampling.Fig. 3(b) shows their average local magnetization m i = ⟨s i ⟩ n * , as a function of the position i. [Here ⟨•⟩ n * denotes the average over the ensemble with λ (s) = n * .]One clearly sees a few locations where large clusters tend to appear.This stands in stark contrast to the pure Ising model where transla-tion invariance ensures that clusters are equally likely to form at any location, so m i would be independent of i.
For the RFIM we can therefore identify statistically preferred nucleation sites in the system by finding connected clusters of spins for which m i > m cut , with a cutoff m cut = −0.75.Filtering out small clusters that represent background fluctuations, we index the resulting clusters by an index j, and write C j for the set of spins within the jth cluster.In the example of Fig. 3, we can easily identify two such preferred nucleation sites by visual inspection.For completeness, we also consider a third cluster which is not apparent from this visual representation, but does contribute strongly to the subset of configurations with large committor (p B > 0.8).
The resulting picture is that for a large system and a specific realization of the disorder, there are certain privileged locations where nucleation is most likely to occur.This is attributable to local energy differences caused by the disorder.In other words, the transition still occurs by nucleation, but describing the nucleation mechanism requires analysis of the location of the nucleus, as well as its size.This is easily understood when one considers the pinning of cluster growth by the random field: for a cluster of a given size, the random field around its location dictates whether it is energetically favorable for the cluster to grow or shrink.For example, a small cluster at nucleation site A surrounded by a large number of positive random field spins may be more likely to grow and invade the system than a larger cluster at site B surrounded by highly negative random field spins.Therefore the critical nucleus at site A will be smaller.Similarly, positive random field spins at the nucleation site make it energetically favorable for a cluster to form there, thus lowering the free energy barrier to cluster formation at that site, and vice versa.
Using this argument, we can predict the location of preferred nucleation sites from the random field realization alone by sampling the equilibrium ensemble of spherical nuclei of size n * .We approximate a sphere of size n • ≈ 340 = n * centered around a spin s i by the set O i of all spins within a distance r • < 4.3 from s i .In the equilibrium ensemble of up-spin spheres of that size, a sphere centered at s i should appear with probability weight w i = exp(β ∑ j|s j ∈O i h j )/Z • , where the sum runs over all spins within O i , and Z • is a normalization factor such that ∑ N i w i = 1.A complete sampling of this ensemble can thus be performed by dropping one such sphere centered around each spin in the system, and calculating the probability that a spin s i is a member of a spherical nucleus of size n * where r(i, j) is the distance between spins s i and s j , and θ is the Heaviside step function.
We plot the resultant configuration of W i in Fig. 3(c).Despite the crudeness of this procedure, a comparison between Figs. 3(b) and (c) shows that this ensemble of spherical nuclei predicts the actual preferred nucleation sites quite accurately.We however note that in this ensemble, the dark cluster in the bottom right corner of Fig. 3(c) receives a probability weight of around 0.97, while the other cluster at the top left corner only receives a weight of around 0.02, which is different from the weights obtained from the actual sampling of the average local magnetization m i .This is due to the crude assumption that all critical clusters are almost spherical in shape, see also Sec.III C, below.Therefore, this procedure only estimates the locations of the nucleation sites, and is not sufficient to predict the probability that the system actually nucleates there.We also note that the preferred location of a spherical nucleus depends significantly on its size.
To make further progress, we estimate a nucleation rate k j associated with each preferred location j.This requires identification of a suitable reaction coordinate, and computation a suitable free energy barrier.Then the nucleation rate I for the whole system is obtained by summing over the rates for nucleation at each such location and dividing by the volume Our strategy in the following is to estimate individual contributions k j separately, and then to consider the total rate I.
The physical idea is that a suitable coordinate is the size λ ( j) of the largest cluster in the vicinity of reference cluster j.To achieve this, write C * (s) for the set of spins that forms the largest cluster in configuration s and let Q j (s) be the number of spins in C * (s) that overlap with the reference cluster C j , that is Q j (s) = |C * (s) ∩C j |.Then define a localized reaction coordinate λ ( j) (s) = |C * (s)| as the size of the largest cluster in s, subject to the constraint that Q j is larger than a cutoff D, which we choose to be λ ( j) /6.The choice of D does not affect the measured free energy barrier as long as (a) it is smaller than the size of C j , and (b) it ensures that there is significant overlap between C * (s) and C j .Our particular choice is simply a matter of convenience.
We will see that this new reaction coordinate is suitable for identifying transition states and measuring free energy barriers, but we note that it does not make sense for small clusters (for example, no configuration can have λ ( j) (s) < D according to this definition).
We then compute a free energy F j (n) along the reaction coordinate λ ( j) , which is directly comparable with the total free energy F(n).(We emphasize again that we do not measure the free energy through a histogram of λ ( j) , recall Sec.II C.) For that purpose we define Physically, χ j = 0 if the largest cluster in the system is of size n > n cut but not at location j.The choice of n cut does not affect the free energy barrier as long as D follows the criteria mentioned above, and is sufficiently far away from any critical nucleus size.Here we choose n cut = 20.Note that ⟨χ j ⟩ is very close to unity because clusters bigger than n cut are rare.Now define a constrained equilibrium distribution within which large clusters must be at location j: where Z j is a normalization constant.Averages with respect to this distribution are denoted by ⟨•⟩ j .Finally, we define By analogy with (2), this F j is an estimate of the free-energy profile associated with nucleation at location j.As advertised above, these profiles are directly comparable with the unconstrained profiles F(n).
The constraint χ j means that ⟨M n ⟩ j only counts large clusters when they are in location j.For n > n cut one may write where p j (n) is the probability that a cluster of size n occurs at location j.On the other hand, clusters with n < n cut are almost unaffected by the constraint χ j so one has ⟨M n ⟩ j ≈ ⟨M n ⟩ in that case.This leads to a jump in F j at n = n cut of size β ∆F cut j = − log p j (n cut ).The jump is clearly visible in numerical results for β F j , shown as colored lines in Fig. 3(a), where the color of each curve indicates that the curve illustrates the free energy of cluster formation around the preferred nucleation site circled by the same color in Fig. 3(b) (the blue curve corresponds to a site that is too faint to be seen in Fig. 3(b), as discussed in the beginning of this Section).One also expects from ( 9) and ( 10) that F j (n) ≥ F(n): this bound is close to an equality if clusters of size n in the unconstrained system occur predominately at location j.This situation is realized for the red curve in Fig. 3 in the range 200 ≲ n ≲ 400.
To assess the suitability of λ ( j) as a reaction coordinate, we identify the sizes n * j of the critical nuclei at each location.(Note that these sizes vary significantly between locations, from 260 to 370.) Then we extract configurations from the maxima of the three profiles F j from which we compute committor distributions P(p B ).These are shown in Fig. 4. Compared to the distribution in the inset of Fig. 3, they are much more sharply peaked, indicating that they successfully capture three different subsets of the transition state ensemble, associated with nucleation events at the three relevant locations.Due to the flatness of the free energies near their peaks, it is difficult to locate the values of n that yield committor distributions peaked very close to p B = 0.5: the important observation is that the peaks are relatively narrow.
The rates for nucleation at each location can be estimated analogous to (3) as where we used (9) to express the relevant free energy barrier in terms of ⟨M n ⟩ j (this is helpful because it shows that the jump in F j does not affect the rate estimates).See Sec.III C below for further discussion of these rates.
The physical conclusion of this analysis -and specifically of Fig. 4 -is that domain reversal dynamics of the system is still controlled by the nucleation and growth of critical nuclei, which now takes place at preferred locations in the system.Our localized reaction coordinate accounts for this preference, and leads to single-peaked committor distributions.We repeated the above procedure for other disorder realizations and find similar results, though the shape and height of the barrier as well as the number of statistical dominant barriers vary between disorder realizations.The only exceptions to this behavior occur when neighboring target clusters are so close to each other that growth of one cluster occasionally invades the other, causing the committor distribution at the peak of the F j to lose its single-peaked shape.This scenario is rare: we do not discuss it further here, but it should be straightforward to adapt the idea of a local reaction coordinate to this case, if required.
We also find that the free energy barriers extracted from F j are typically very similar to those obtained from F, as found in Fig. 3(a).This may be expected from (10) as long as the number of locations for nucleation is not too large, such that p j = O(1).

C. Nucleation rates and trajectories
In order to test the validity of ( 3) and (11) in the RFIM, we calculate the nucleation rate in our system using forward flux sampling (FFS), which is described in Appendix B. The calculated rates are denoted as I FFS .We consider eight disorder realizations for three different values of R, and plot − log I FFS against the free energy barriers in Fig. 5.As free energy barriers calculated with or without the spacial constraint are shown to be extremely similar, we use the unconstrained scheme to calculate the values of ∆F unless stated otherwise.The results in in Fig. 5 fit well to a straight line of gradient 1.Moreover, the same straight line fit holds across disorder realizations for systems with the same parameters, implying that the kinetic prefactor in (3) (given by the intercept of the straight line) varies much more slowly than the exponential term for the same set of parameters, and can be treated as a constant across disorder realizations.This confirms that variations in nucleation rate is dominated by the variations of the free energy barrier.
We further comment that the fits in Fig. 5 cannot be used to distinguish between the our reaction coordinate and the conventional CNT one, as the barriers calculated with respect to the two coordinates are similar within the range of error acceptable to the fit.This is illustrated in Fig. 5(c) where the data points using barriers calculated using the spatially constrained reaction coordinate are plotted in blue.The data points due to the two reaction coordinates either overlap completely or show only small variances.Instead, the quality of the reaction coordinates must be assessed by a committor analysis, as elaborated in the previous section.
We then compare predictions of (3) to the nucleation rates obtained by FFS.All terms on the right-hand side of (3) can be measured without explicitly measuring the nucleation rate: Γ, ∆F and ρ 1 can be obtained directly from the equilibrium cluster size distribution ⟨M n ⟩ through (4) and ( 2).The parameter f * is computed in terms of the diffusion constant of the reaction coordinate at the top of the barrier.We can extract this diffusion constant by measuring the fluctuations of the reaction coordinate near the top of the barrier 54 through its mean-sqaure displacement where the average is taken over trajectories starting in the transition state ensemble (i.e., starting with λ = n * ).In practice, we collect 500 configurations at the peak of the barrier, and compute ⟨∆λ 2 (t)⟩ n * by averaging over 200 trajectories starting from each configuration.Results are shown in Fig. 5(d) for 8 disorder realizations at R = 1.0, and times t up to 0.06 MCS.These results can be accurately fitted by straight lines, which we use to estimate f * through their gradients.
Combining all these results, we compare the rates predicted by (3) with those measured using FFS, and plot their ratios I FFS /I BD against the measured nucleation rate I FFS in Fig. 6 for a range of disorder strengths, varying the external field to ensure that the nucleation rates for all systems considered are comparable.We observe that while the nucleation rate varies by more than 4 orders of magnitude, the ratio of I BD to I FFS is always of order unity.The theoretical prediction (3) consistently overestimates the rate, by a factor between 2 and 10.
To interpret these results, we first note that result for the pure Ising model at R = 0 and H = 0.55 agrees with that reported by Cai and Ryu 23 , who have produced a similar plot for the pure Ising model over a wide range of parameters.Their results suggest that in three dimensions that (3) systematically overestimates the nucleation rate by up to factor of 2. For nonzero disorder, we make the following observations: (a) the nucleation rates increasingly vary between disorder realizations with increasing disorder strength.(b) For a given set of parameters, the ratio I FFS /I BD fluctuates weakly between disorder realizations, even when the nucleation rates vary by many orders of magnitude.This is particularly clear for R = 1.0.(c) The ratio I FFS /I BD < 1 for all parameter values, and decreases with increasing disorder strength R, showing that (3) becomes less accurate when the disorder is strong.
This last trend can be explained by the increasing importance of shape fluctuations of the nucleating cluster as disorder is increased.The nucleation rate predicted by ( 3) is based on an effective coarse-grained one-dimensional description that integrates over all variables except for the size of the largest cluster.(This includes an integration of cluster shape fluctuations.)Such effective one-dimensional descriptions consistently overestimate the nucleation rate 61 unless the reaction coordinate is chosen to be exactly orthogonal to the surface on which all configurations have committor 0.5.For the pure 3-dimensional Ising model, it was shown 52 that this surface is not orthogonal to the cluster size coordinate in a free energy landscape that is a function of the cluster size and cluster surface area, which implies that using cluster size as the only reaction coordinate will cause an overestimation of the nucle- ation rate.This effect is particularly pronounced when shape fluctuations relax slowly, in comparison to microscopic time scales for addition or removal of single spins from the cluster.As disorder increases, shape fluctuations will become slower and more significant, and the effects of neglecting them become more severe.Snapshots of the nucleating cluster taken from a sample FFS trajectory, as seen in Fig. 7, show significant deviations from a spherical shape, which is linked to large slow fluctuations.We also note that for the pure Ising model it was argued 63 that in the continuum limit shape fluctuations cause nonuniversal corrections to the nucleation rate in three dimensions but not two, which explains the better agreement between theory and experiment for nucleation rates in two dimensions that is reported in the literature 23,29 .

D. System size dependence of the nucleation rate
All numerical results thus far were obtained in systems of size N = 30 3 .For systems without disorder then the probability of observing a (rare) nucleation event within a given (short) time window is an extensive quantity, hence our focus on the nucleation rate per unit volume, I.The prediction (3) is consistent with this observation because the concentrations ρ n are intensive quantities so the free energy barrier ∆F is independent of system size, in systems without disorder.
In systems with disorder -like the RFIM -the situation is more subtle 64 .To illustrate this, Fig. 8 shows the behavior of F(n) as a function of system size, for several representa- tive realizations of the disorder.The sample-to-sample fluctuations are significant, but we observe a clear trend, that the average barrier height ∆F decreases as the system size L increases.The intuition for this result is that larger systems support a broader range of disordered local environments where nucleation can take place, and critical nuclei are biased towards (rare) regions where the random field happens to favor the nucleating phase.This idea can be formalized using extreme value theory 65,66 , which yields results similar to those of Sear 64 , who considered a model of randomly distributed nucleation barriers.
Recalling (5), the main effect of the random field on critical nuclei is to reduce their energy by ε O = ∑ i∈O h i , where O is the set of spins that forms the nucleus.Taking O to be a random cluster of size n, one sees that ε O is Gaussian with mean zero and variance nR 2 .For a given disorder realization, it is useful to identify the location in the system where the nucleation barrier is smallest, since this will typically give the largest k j in (11).This amounts to identifying the critical cluster O with the largest value of ε O .
We denote this largest energy by ε max .Its behavior can be characterized within extreme value theory: suppose that all critical clusters have size n, and that there are a n N possible cluster locations, and that each cluster has an independent value of ε O (this last assumption is discussed in more detail below).Then ε max is the largest of a large number of identically and independently distributed Gaussian random variables, so it has a Gumbel distribution whose most likely (modal) value is where W is the Lambert W function.In fact, the probability density for ε max is The assumption that different clusters have independent values of ε O is an approximation (due to possible overlaps) but for large enough systems one still expects to recover this limiting Gumbel distribution, with a n N playing the role of an effective sample size.
For large systems one uses that W (x) ≈ log(x) at large x to see that the typical value of ε max scales as which increases (weakly) with N.That is, the lowest nucleation barrier in a large system decreases weakly with system size, contrary to the pure Ising model where it remains constant.This is consistent with Fig. 8 In the light of this result, one may imagine two scenarios.Either nucleation is dominated in large systems by the cluster with largest ε O , so that a single term dominates the sum in (11); or, there are many possible sites with similar barriers, which all contribute to the sum.The weak N-dependence of (15) means that the latter situation is realized in practice.To see this, note that doubling the system size increases ε * max by a small contribution of order (log N) −1/2 which carries through to the log-rate for nucleation; however, it also doubles the number of clusters with typical ε O ≈ R √ n, which corresponds to an increase of the log-rate by a finite constant log 2. This latter contribution dominates the small contribution from the change in ε * max , leading to an extensive nucleation rate in large systems, albeit with strong finite-size corrections from clusters with anomalously large ε O .

IV. CONCLUSION
We have investigated the nucleation dynamics of the threedimensional RFIM by a combination of umbrella sampling and FFS.By calculating the distribution of committor probabilities at the peak of the free energy curves, we tested the hypothesis that the size of the nucleating cluster is a suitable reaction coordinate for nucleation.While this hypothesis is valid for weak disorder (R ∼ 0.1), it breaks down at higher disorder (R = 0.7, 1.0), where the location of the nucleating cluster is also needed to fully describe the nucleation dynamics.We describe a method that predicts the locations of the preferred nucleation sites directly from the disorder configuration.Committor analysis confirms that a localized measure of cluster size serves as a good reaction coordinate, even in the presence of strong disorder.
We also find that while nucleation rates measured using FFS and our free energy barriers fit well through an Arrenhius form, Becker-Döring theory increasingly overestimates the nucleation rates with increasing disorder.We attribute this behavior to the importance of shape fluctuations of the nucleating cluster, as such clusters observed in our simulations show highly anisotropic surfaces.Finally, we argue that the system size dependence of the nucleation rate manifests in a downward shift in the expected free energy barrier, proportional to √ log N. Looking forward, a natural further step is to develop a theory that correctly accounts for the effects of shape fluctuations on nucleation rates.Previous attempts 67,68 have been made using Langer's theory of first passage times over a multidimensional landscape 69 for systems without disorder.In addition, by assuming spherical critical nuclei, we have developed a procedure that predicts the location of the preferred nucleation sites directly from the disorder realization, but fails to predict their relative probability weights.A more accurate prediction of nucleation sites from the disorder will likely require machine learning methods.
Finally, we comment that the main results of this paper, namely the statistical preference of nucleation around a small number of locations determined by the disorder, the overestimation of the nucleation rate by CNT due to cluster shape fluctuations, and the nonlinear scaling of the nucleation rate with system size, should be generally applicable to nucleation in disordered environments.

FIG. 1 .
FIG.1.The free energy curve of a system with R = 0.1, T = 2.7 and H = 0.45.The inset shows the committor distribution of configurations taken at the top of the barrier.

FIG. 4 .
FIG. 4. (a) -(c)The committor distributions measured from configurations taken at the peak of the colored free energy curves in Fig.3, with the colors of the histograms matching those of the free energies.

FIG. 5 .
FIG. 5. − log I FFS plotted against the free energy barrier, for 8 disorder realizations at T = 2.7, (a) R = 0.3, H = 0.5, (b) R = 0.7, H = 0.4 and (b) R = 1.0,H = 0.25.The straight lines in both plots have gradient 1.Then black crosses show free energy barriers calculated from unconstrained umbrella sampling runs, while the blue crosses in (c) show the true barriers calculated using the constraint scheme.(d) The ensemble-averaged time series ⟨∆λ 2 ⟩ n * plotted against t at R = 1.0,H = 0.25, with each color corresponding to a different disorder realization.

10 − 14 R 25 FIG. 6 .
FIG.6.The ratio between the nucleation rates obtained from FFS and Becker-Döring theory at T = 2.7 for different values of disorder and external field.

30 FIG. 8 .
FIG.8.Free energy curves taken from systems of various sizes at R = 0.7, T = 2.7 and H = 0.4.The shaded lines represent free energies measured from 10 disorder realizations, while the solid lines are averaged over those realizations.