We outline how principal component analysis can be applied to particle configuration data to detect a variety of phase transitions in off-lattice systems, both in and out of equilibrium. Specifically, we discuss its application to study (1) the nonequilibrium random organization (RandOrg) model that exhibits a phase transition from quiescent to steady-state behavior as a function of density, (2) orientationally and positionally driven equilibrium phase transitions for hard ellipses, and (3) a compositionally driven demixing transition in the non-additive binary Widom-Rowlinson mixture.
I. INTRODUCTION
Principal component analysis (PCA) is a simple and widely used unsupervised machine learning tool for dimensionality reduction.1–3 Perhaps the most common application of PCA is for the lossy compression of images. One popular demonstration is the analysis of facial images, leading to the aptly named “eigenfaces” that capture collective attributes of facial structure.1,2 Only a subset of the eigenfaces—much fewer than the naïve dimensionality of the problem—are required to recover the salient aspects of facial images by simple linear combination. Another routine use is in natural language processing, where PCA is employed to shrink the data dimensionality down from the large number of words appearing in a data set or in a dictionary.1,2 Use of the resultant lower dimensional representation greatly improves the development of predictive models to classify text documents.
The combined power and simplicity of PCA has made it a popular tool in the biological and physical sciences as well. For example, DNA microarray data are routinely treated with PCA to reduce the high dimensionality of the problem in order to identify unique gene expression states across various experimental conditions.2,4 Furthermore, PCA is commonly leveraged to extract dominant collective modes in simulations of proteins, referred to as “Essential Dynamics” in that field.5,6 More recently, various spin models from statistical physics have been investigated via PCA and other machine learning methods.7–14 These studies have demonstrated the ability of machine learning tools to detect and quantify phase transitions by the autonomous construction of an order parameter (OP).
The aforementioned work on phase transitions in spin models served as motivation for this two-part series of papers. In the first manuscript (henceforth referred to as Paper I52), we developed guidelines for the utilization of PCA1–3 to detect phase transitions in off-lattice, particle-based systems. We also demonstrated that PCA can readily identify the freezing transitions in hard disks and hard spheres, as well as liquid-gas phase separation in a binary mixture of patchy particles with complementary attractions. In developing and evaluating this approach, we initially focused on phase transitions that were equilibrium in nature and could be identified on the basis of features reflecting the positional degrees of freedom of the particles.
Here, we seek to generalize the formalism developed in Paper I52 to assess its utility for detecting phase transitions in a broader class of systems. Examples include equilibrium systems with (1) anisotropic particles leading to orientational as well as positional ordering15,16 and (2) compositional degrees of freedom that can induce demixing.17 We also address active or driven matter, which exhibits phase transitions whose detection and characterization cannot generally be facilitated based on arguments from equilibrium statistical mechanics.18–27
We propose several numerical encoding schemes (i.e., feature vector representations) for data describing particle configurations in these systems to detect their phase transitions with PCA. We find that prior knowledge of the phase transition is not required to construct a useful feature vector; consideration of the properties of the model system at hand is sufficient. However, we also show that by performing PCA on several choices for the feature vector, one can gain physical insights into the nature of the phase transition.
The balance of the manuscript is organized as follows. In Sec. II, considerations for constructing features for the detection of phase transitions in off-lattice systems using PCA are presented. The model systems analyzed in this work and the corresponding simulation details for each model are also provided. Section III is divided into three subsections, each dedicated to a different model system. The first, Sec. III A, describes a study of the random organization model, which exhibits a nonequilibrium phase transition between a quiescent state and a dynamically evolving steady state as a function of increasing density.28–31 Section III B addresses the fluid-nematic (orientationally driven) and the nematic-solid (positionally driven) phase transitions that occur upon densification of hard ellipses.32–35 Finally, in Sec. III C, compositional demixing in the Widom-Rowlinson (WR) model—a binary mixture where unlike particles interact via excluded volume effects but like particles are noninteracting36–38—is explored. Concluding remarks are presented in Sec. IV.
II. METHODS
A. Feature construction
Features (fi) are scalar quantities that inform a machine learning algorithm about some aspect of the system being studied.1,2 Here, we denote a general vector of m features as
where T indicates a transpose. Feature vectors provide a numerical encoding for the separate realizations (or measurements) contained in the data set ().
When possible, features should reflect any known constraints; for physics problems, these include invariance to translation and rotation.39–42 Such constraints can be easily encoded via the use of internal coordinates (e.g., interparticle distances or relative angular orientations) as features. Here, we compute pairwise quantities that are in reference to a probe particle (α) and a corresponding particle in its environment (β). A feature vector built from information considering nP probe particles (each with nNN corresponding environmental particles) can be represented as
where the full vector of vectors corresponding to each probe particle α is implicitly flattened to form one contiguous feature vector (block matrix notation).
Within the above mathematical framework, there is no unique choice for the selection of either the probe particles or the neighboring particles that define their environment. Once a collection of probe and corresponding environment particles are chosen, we also must specify how the resultant pairwise quantities () are assigned to the α and β indices in Eq. (2). So we do not have to compute properties with respect to every particle in the simulation box, and we select nP probe particles at random. For the corresponding environmental particles, we use physical intuition as a guide by assuming that the distance between the probe particle and a given environmental particle will influence the manner in which the associated feature reports on a given phase transition. As a result, we use a distance-based criterion to determine which particles comprise the environment for a given probe (e.g., the first twenty nearest neighbors or every tenth nearest neighbor), hence our use of nNN to denote the number of environmental particles. Similarly, we assign the index β on the basis of interparticle distance so that
The assignment of a probe particle to a given α is less intuitive and could be model-dependent; however, random assignment is always a possibility, and, as we discuss below, the results obtained from that initial assignment can in some cases help identify a superior assignment scheme for the probe particles.
In principle, nP could be as large as the number of particles in the simulation, N, and nNN could have a maximum value of N − 1. Since the total feature vector size [in relation to Eq. (1)] is m = nP × nNN, the preceding choices would yield a feature vector of length N(N − 1). For most systems of interest, PCA for feature vectors of this size would be computationally infeasible. Therefore, practical implementation of PCA using particle-based coordinate data requires sensible choices for nP and nNN that we describe in Subsections II B 1–II B 3.
Finally, we refer to features where are physically motivated quantities as “intuited” features (fI). In Paper I,52 we showed that fI do not necessarily approximate white noise in the disordered reference state (here, the ideal gas) limit and therefore may possess correlations that could obscure the detection of a phase transition via PCA. Arriving at corrected features (fC) that are linearly decorrelated when applied to an ideal gas reference data set () is accomplished by deriving a PCA whitening transformation43 (fI → fC) that satisfies , where I is the unit matrix and is an average over the reference data.
B. Models
We provide a brief description of each model examined in this work as well as the relevant phase transition(s) below. We then specify the form of the associated feature vectors within the framework provided by Eq. (2). Finally, we describe the simulation protocols used to generate the configurations on which the PCA is performed. Throughout, N denotes the number of particles in a two-dimensional (2D) periodically replicated simulation cell of area A, ρ = N/A is the number density, and η = ρπσ2/4 is the packing fraction. For the remainder of the paper, we set σ = 1, implicitly rendering all length and volume units dimensionless with respect to σ.
1. Random organization model
In one variant of the Random Organization (RandOrg) model, a circular particle of diameter σ is defined as active if it overlaps with any other particle.30,31 For a given configuration, all active particles are simultaneously given random displacements; all other particle positions are unaltered. Particle positions are initialized at random, from which the above procedure is repeated until either (1) a so-called absorbing state is reached where no particle overlaps are present (lower densities) or (2) a steady-state is reached where the fraction of active particles fluctuates about some non-zero value (higher densities).
Given that the RandOrg model comprises identical, radially symmetric particles, features that explicitly encode positional packing correlations around tagged particles are an obvious first choice to try. Specifically, we utilize mean subtracted interparticle distances as our features,
Furthermore, while the model is technically single-component to the extent that there are no immutable labels associated with the particles, multiple particle types (active and inactive) are created on-the-fly due to the dynamics prescribed by the model. To capture emergent inhomogeneity with respect to the particle environment, it is critical to utilize multiple probe particles as prescribed by Eq. (2). In the present work, we use a fixed feature length of m = nP × nNN = 400 and examine the effects of co-varying nNN and nP.
Within the above approach, there is still the question of how to assign the probe particles to specific values of α in Eq. (2). One valid, though perhaps not particularly informative, choice is to randomly order the probe particles such that α assignment does not encode any information. In Sec. III A, we demonstrate how performing PCA with this choice produces results that suggest a more informative sorting scheme, where probe particles are assigned to the index α on the basis of their first NN distance, i.e., .
We note two additional technical points regarding PCA for the RandOrg model. First, as we increase nP, the magnitude of the OP grows in a nonlinear fashion. We can collapse OPs onto the same scale by dividing by the square root of the explained variance of their dimension, a procedure equivalent to data “whitening” discussed in Paper I52 in the context of correcting the physics-motivated features. We also find that OPs obtained from both dimensional (preserving units of distance) and nondimensionalized features (dividing raw distances by ρ−1/D, where D is the dimension) accurately detect the phase transition of the RandOrg model. However, as we demonstrate, the PCA-derived OP using the former convention shows behavior that is more strikingly reminiscent of the classical OP for this system.
To generate the configuration data required to construct the above features, we performed 2D simulations in a square box with N = 1000 particles and employed a maximum displacement of 0.25σ in both the x and y directions for the active particles. The length of the simulations varied with proximity to the critical point characteristic of the transition between an absorbing and a steady state. At densities below the critical point, an individual simulation ended when the absorbing state is reached; however, critical slowing down impacts the simulation length required to achieve that state.29,30 We used a maximum of 105 simulation steps for densities below the critical point. For higher densities, the fraction of active particles decreased from the initial random state before fluctuating about a steady-state. The number of simulation steps was chosen to be at least twice as long as the initial relaxation time scale, ranging from 103 steps (at the highest densities) to 105 steps (just past the critical point). We performed 103 separate simulations, using only the last frame from the simulation in the PCA. Values for ρ ranging from 0.38 to 0.63 were simulated in increments of 0.005. From the simulation data, we computed 25 feature vectors from each simulation snapshot, where the probe particles were selected at random. Within a single feature vector, probe particles are selected without replacement; however, a particular probe particle can appear in multiple feature vectors.
2. Hard ellipses
Densification of hard ellipses bears similarity to the freezing of hard disks studied in Paper I52 but with added complexity derived from particle-shape anisotropy. In addition to ordering on the center-of-mass positional level, quasi-long ranged orientational ordering is possible, yielding the so-called nematic phase.15,16 Two obvious pairwise properties to compute from the configurational data of hard ellipses are the center-of-mass distances and the relative orientations of the ellipses. With respect to the former, we use the positional features with a single probe particle as employed in Paper I52 for hard disks and spheres. This form is equivalent to the feature vector defined by Eqs. (2)–(4) for the case where nP = 1 and nNN = N − 1, where N is the number of particles. Subsequently, the size of the feature vector is reduced by only including every 10th NN distance after the first feature in the final feature vector. The pairwise distances are normalized with respect to the mean interparticle spacing l ≡ ρ−1/D, where ρ is the number density and D is the spatial dimensionality, to remove trivial density scaling.
For the latter case of relative orientations, we still employ one probe and index its environmental particles on the basis of NN sorting [Eqs. (2) and (3)]; however, we use a measure of relative pair orientations in place of pair distances in the feature vector. Defining as the angular difference between the probe and environmental particles assigned to indices α and β, respectively, we employ features of the form
That is, quantifies the relative orientation of the single probe particle with its closest NN ellipse. From the sorted list defined by the combination of Eqs. (2), (3), and (5), only every 10th NN is included in the feature vector, as was performed for the positional features above.
The feature vectors used as input to PCA were collected from Monte Carlo simulations of hard ellipses carried out at a constant particle number and volume using the HOOMD-blue software package.44–46 The box shape was chosen to approximate a square by an appropriate distribution of triangular lattice cells with an aspect ratio of , where κ = b/a is the ratio of the semi-major (b) and semi-minor (a) elliptical axes, respectively. (Here, we set the lengthscale as 2a = 1.) Specifically, given the number of cells in the y direction, ny, the number of cells in the x direction is chosen as . For ellipses with κ = {3, 4, 6, 9}, we chose ny = {17, 15, 12, 10} which yielded the total number of particles N = {2992, 3120, 3000, 3120}, respectively. For each step, the move type (rotation or translation) was selected at random with equal probability. The maximum degree of translation and rotation per move were independently scaled to yield an ∼25% acceptance rate for efficient phase sampling. Density ranges were chosen to span the isotropic, nematic, and solid phases. For ellipses with κ = {3, 4, 6, 9}, we chose η = {0.6–0.9, 0.55–0.9, 0.4–0.9, 0.3–0.9}, respectively. A typical run proceeded as follows. A system of N hard ellipses was started from an ideal triangular lattice at maximum packing fraction and expanded to a target η value. Next, the range of translational and rotation move sizes was optimized using 50 iterations of 100 steps to achieve the targeted acceptance ratio, where a step is equal to an HOOMD-blue “time step,” or approximately four sweeps over all particles. Then, the system was equilibrated for 6 × 106 steps, and data were collected every 6000 steps from an additional 6 × 106 step production run. From each frame, 30 feature vectors were constructed, where the probe particles were selected without replacement within a given frame.
3. Widom-Rowlinson model
The Widom-Rowlinson (WR) model36 is composed of a binary mixture of A and B particles where like pairs (A-A or B-B) are non-interacting and unlike pairs (A-B) interact isotropically via a hard-core repulsion with diameter σ. Upon densification, the WR model compositionally demixes to form separate A- and B-rich phases. The resulting phase transition can straightforwardly be used to model compositional demixing; however, by integrating out the coordinates of one of the species, a model for liquid-gas coexistence can be obtained. In this work, we study the symmetric WR model where the number of A and B particles is equal.
The full specification of an individual particle in the WR model requires knowledge of both its type (A or B) and its position, yielding two obvious quantities to include in the feature construction. Instead of directly encoding the particle type as a categorical variable, we use particle type information to modify the assignments of the α and β indices. We construct NN positional features as prescribed by [Eqs. (2)–(4)], but we only include distances between pairs of A particles in the feature vector. We use a single probe particle (nP = 1) with nNN = 1200 nearest neighbors for the environmental descriptors. By neglecting one of the WR components, we construct features that explicitly leverage both compositional and positional information. Finally, we normalize the distances in the same fashion as the ellipse positional features described in Sec. II B 2.
For the production of the configuration data required to construct feature vectors for PCA, the HOOMD-Blue hard-particle Monte Carlo integrator44–46 was used to perform the simulations of the WR model in a square box for N = 4096 particles in 2D. Equilibrium samples were generated at number densities spanning both the mixed and ordered phases as follows. After compressing the final configuration from simulation at the previous density, the system was equilibrated for 107 steps, followed by a production run of 107 steps, from which data were collected every 103 steps, for a total of 104 snapshots per density. A step is equivalent to a HOOMD-blue “time step” as defined in Sec. II B 2. Simulations were run from ρ = 0.064 to 3.82 in increments of 0.064. From each frame, a single feature vector was constructed.
III. RESULTS AND DISCUSSION
Prior to examining the PCA results for the above models, we explain the general interpretation of the quantities that result from PCA below. For the features constructed according to the protocols described in Sec. II, PCA discovers a set of orthogonal axes—the principle components (PCs)—that are constructed in succession so as to maximize the data variance projected along each new axis. In this work, we monitor the relative explained variance of the PCs, denoted as λi for the ith PC; by convention, the PCs are sorted so that λi ≥ λi+1. A comparatively large value for λ1 indicates that the information content of the features has been effectively concentrated into a single dimension: the first PC.
Of particular relevance to interpreting the PCA results is the projection of the feature vectors along the PCs: the PC score, denoted pi for the ith PC. Given that the first PC contains the largest explained variance, we evaluate the use of p1 as an OP-like quantity to report on the phase transition of interest.47 This strategy amounts to coalescing as much “information” (i.e., variance) as possible from the high-dimensional feature vector fC into the scalar p1.
Since each p1 is associated with a single feature vector, we define two quantities that are averaged over a given state-point (here, the state points are densities): and the associated standard deviation, . When P1 and σ1 are plotted as a function of density, phase transitions will generally be indicated by a sigmoid in the former and a peak in the latter metric. The maximum of σ1 is useful as an operational definition of the phase transition point when P1 is slowly varying—as can be the case for very continuous phase transitions. This is not to say that P1 is useless in such cases though. In fact, if the dimensionality reduction is successful—as gauged by the presence of dominant principal components—the behavior of P1 is always revealing of the transition. σ1 should be viewed as akin to a susceptibility of the order parameter, which can be useful as a supplemental metric.
Throughout, P1 is compared to an appropriate classical OP for the transition under consideration. The utility of the PCA derived OP is gauged by the quantitative closeness of the estimated transition points and not by the overall similarity between the two OPs. For hard disks, in Paper I,52 as well as for the random organization model in this paper, PCA can yield, at least very closely, the functional form of the classical OP. However, such similarity is not required for the PCA to inform the location of the phase transition. Multiple distinct metrics may be capable of quantifying any given phase transition. Our goal in this work is to explore the capability of PCA to develop a useful order parameter for the transition, not necessarily to replicate a classical analog.
The final relevant quantities from the PCA calculation are the PCs themselves—the weights that relate the features and the PC scores. As described in Sec. II A, the intuited features (fI) are transformed in a corrected representation (fC), the latter of which are input into the PCA calculation. Since the values comprising fI are straightforward to interpret physically, we explore the relationship between the PC scores and fI (instead of fC). As described in Paper I,52 it is possible to write down a linear relationship between the scalar pi and the vector fI via the following dot product:
where qi is the desired vector of weights that map fI to pi. Examination of these weights reveals which physically meaningful quantities are particularly relevant to the phase transition.
In summary, we consider (1) the effectiveness of the dimensionality reduction via λ1, (2) the low-dimensional (OP-like) representation of the data via quantities that depend on p1 (P1 and σ1), and (3) the relative importance of the physical quantities that comprise fI via the weights qi. For convenience, we summarize the above notation in Table I.
Common PCA variable definitions.
λi | Relative (fractional) explained variance captured by the ith PC, ranging between 0 and 1, where larger values are indicative of greater importance. |
pi | The ith PC score. Mathematically, this is the projection of a feature vector along the ith PC. PCs offer a new coordinate system with information concentrated along the earlier (smaller index) PCs. |
Pi | Average of the ith PC score, pi, over data from a state point (). P1 serves as the OP-like quantity to report on phase transitions. |
σi | Standard deviation of the ith PC score, pi, at a state point (). This is used as an effective “susceptibility” to locate the phase transition by identifying the maximum value. |
qi | Vector of weights that quantify the relevance of each feature to the ith PC. |
λi | Relative (fractional) explained variance captured by the ith PC, ranging between 0 and 1, where larger values are indicative of greater importance. |
pi | The ith PC score. Mathematically, this is the projection of a feature vector along the ith PC. PCs offer a new coordinate system with information concentrated along the earlier (smaller index) PCs. |
Pi | Average of the ith PC score, pi, over data from a state point (). P1 serves as the OP-like quantity to report on phase transitions. |
σi | Standard deviation of the ith PC score, pi, at a state point (). This is used as an effective “susceptibility” to locate the phase transition by identifying the maximum value. |
qi | Vector of weights that quantify the relevance of each feature to the ith PC. |
A. Random organization model
The RandOrg model was developed to understand the transition from reversible to irreversible dynamics that occurs upon increasing either the applied periodic shear or the density of a material.29 In the first incarnation of the RandOrg model, an initial configuration was sheared and any particles overlapping with others as a result of deforming the simulation box were defined as active particles. Only the active particles were given a random displacement after which the simulation box was restored to its original geometry. At sufficiently low combinations of density and applied shear, a quiescent “absorbing” state eventually results, where shear does not generate further particle overlaps and there are no longer active particles. However, at greater densities and/or shear rates, shearing the system will always generate some overlaps. The reversible-to-irreversible transition reflects a state where the onset of particle collisions upon shearing prevents the system from returning to its original state when the shear is removed.28,29
A modified version of the RandOrg model, where shear is not included, has also been studied.30,31 Instead, as described in Sec. II B 1, initial particles are placed at random; active particles correspond to overlapping particles. This model possesses the same type of transition from an absorbing state at sufficiently low densities to an evolving steady-state containing a non-zero number of active particles at higher densities, while being technically simpler to implement. Figure 1(a) shows the fraction of active particles fA in this version of the RandOrg model as a function of number density ρ. Two simulation configurations below (ρ = 0.5) and above (ρ = 0.51) the critical point are shown in Figs. 1(b) and 1(c), respectively.
(a) Fraction of active particles, fA, in the RandOrg model as a function of number density, ρ. [(b) and (c)] Simulation snapshots (b) below (ρ = 0.5) and (c) above (ρ = 0.51) the critical density. Active particles are shown in lighter green in panel c.
(a) Fraction of active particles, fA, in the RandOrg model as a function of number density, ρ. [(b) and (c)] Simulation snapshots (b) below (ρ = 0.5) and (c) above (ρ = 0.51) the critical density. Active particles are shown in lighter green in panel c.
Like hard disks, the RandOrg model comprises identical, radially symmetric particles, for which distance-based features are a sensible choice. Therefore, we first employed the feature vector developed in Paper I52 for hard disks—sorted nearest-neighbor (NN) distances associated with a single probe particle [Eqs. (2)–(4) where nP = 1]. However, this construction of the feature vector did not produce a satisfactory OP for either P1 or σ1. Use of a feature vector for which nP = 1 likely fails because, above the critical point, the system always has two effective particle types—active and inactive [see Fig. 1(c)]. Therefore, the environment of a single particle is not an accurate representation of the simulation box as a whole at higher densities.
As described in Sec. II A, distance-based feature vectors can naturally incorporate multiple probe particles (and their corresponding neighbors). As with the NN distances for a given probe particle, one must decide how to order the probe particles inside the feature vector. In the absence of any information about the nature of a given phase transition, a first choice might be to randomly order the probe particles. In Fig. 2, we show results for the first PC using a feature vector constructed from 40 probe particles (nP = 40), each of which is encoded via its first 10 NN distances (nNN = 10). In principle, the weights associated with each probe particle should be identical since there is no physics-based interpretation for their ordering in the feature vector. Indeed in Fig. 2(c), we find a repeating pattern for every 10 weights: the first NN distance () component weight for each probe is large in magnitude. The relative uniformity of the first NN distance weights, compared to the noisiness in the larger NN distances, indicates that the values are informative to the PCA.
(a) The PCA-deduced OP P1 (with probe particles sorted randomly in the feature vector) as a function of number density ρ compared to the conventional OP (the fraction of active particles, fA) for the RandOrg model. (b) Comparison of P1 with probe particles sorted randomly (black) versus according to their first NN distance so that (blue). (c) Component weights, , as a function of feature dimension k for the two PC scores shown in panel (b). (Inset) Standard deviations of fA (σ(fA)) and of p1 or the PC for an individual feature sorted by the first NN distance, (σ1) as a function of ρ.
(a) The PCA-deduced OP P1 (with probe particles sorted randomly in the feature vector) as a function of number density ρ compared to the conventional OP (the fraction of active particles, fA) for the RandOrg model. (b) Comparison of P1 with probe particles sorted randomly (black) versus according to their first NN distance so that (blue). (c) Component weights, , as a function of feature dimension k for the two PC scores shown in panel (b). (Inset) Standard deviations of fA (σ(fA)) and of p1 or the PC for an individual feature sorted by the first NN distance, (σ1) as a function of ρ.
The corresponding OP is shown in Fig. 2(a) and bears striking resemblance to the standard OP shown in Fig. 1(a). Indeed, by arbitrarily shifting and scaling the PC score, we find that fA essentially overlaps with the PCA-deduced OP. It seems that the repeating unit in the component weights is able to distinguish between overlapping and non-overlapping particles and therefore can report on the relative amounts of active and inactive particles at a given value of ρ.
We can use the above component weights to intelligently devise a better sorting scheme for the probe particles. From Fig. 2(c), it is clear that is a highly weighted contribution to the feature vector; therefore, we performed a separate PCA calculation with the same values for nP and nNN while sorting the probe particles so that . Because we have sorted the probe particles on the basis of a physically meaningful descriptor, the symmetry among probe particles is broken and the probe particles with the closest NNs (i.e., those assigned to lower α) are weighted more heavily than other probe particles [Fig. 2(c)]. Moreover, the associated OP is significantly sharper at the phase transition, essentially giving a binary classification into absorbing states and dynamic steady states on the basis of the OP [Fig. 2(b)].
The approximate location of the phase transition can be determined by the lowest density at which fA is non-zero, which occurs at ρ = 0.505 for the simulation protocol described in Sec. II B 1. This value coincides with the sudden increase in P1 from a flat line for the sorted features shown in Fig. 2(b). The standard deviation in both fA () and the p1 associated with the NN sorted features (σ1), shown in the inset of Fig. 2(c), both possess an obvious spike at ρ = 0.505, indicating that the maximum value in σ1 is also an accurate reporter on the location of the phase change.
With the above sorting scheme in hand, we vary both nP and nNN while keeping the length of the feature vector fixed at m = nP × nNN = 400. As nP increases and therefore nNN decreases, the PCA-deduced OP sharpens into a sigmoidal curve that separates quiescent absorbing states from diffusive steady-states; see Fig. 3(a). Conversely when nP = 1 (as was the case for Paper I52), P1 cannot detect the transition. Correspondingly, when features constructed with more probe particles are used, the explained variance associated with the first PC increases dramatically [Fig. 3(b)]. The preceding trend is monotonic—there is no value to include more than the nearest interparticle distance per probe particle at constant m, an indication of the local character of the phase transition in the RandOrg model.
(a) PCA-deduced OP P1 of the RandOrg model as a function of number density ρ for different numbers of probe particles nP and corresponding nearest neighbors nNN, respectively. (b) Corresponding explained variance λi for the first three PCs and (c) the first 80 PC weights .
(a) PCA-deduced OP P1 of the RandOrg model as a function of number density ρ for different numbers of probe particles nP and corresponding nearest neighbors nNN, respectively. (b) Corresponding explained variance λi for the first three PCs and (c) the first 80 PC weights .
For the above series of PCA calculations, the first 80 component weights are plotted in Fig. 3(c). When nP is small, the components appear to be largely random, but as nP is increased, the components develop more structuring. For each probe particle, the component weights associated with have a much larger weight than that of the rest of the features, reinforcing the importance of the first NN distance in the dimensionality reduction. Moreover, the probe particles that have closer first NNs have greater weights; we can interpret the role of using multiple probe particles as capturing an accurate measure of in a statistical sense, i.e., not all probe particles are required, but sampling is needed to make sure that sufficiently representative interparticle separations are included in each feature vector.
While the importance of the first NN distance is intuitive given that the RandOrg phase transition is defined by the presence or absence of particle overlaps, we did not incorporate knowledge of the transition in constructing the features. In other words, our results suggest that modifying the feature vector can be used to infer characteristics of a transition, even if its nature is unknown at the outset. Specifically, for the RandOrg model, the importance of the first NN distance revealed by the PCA implies a transition that is local in character in real-space, and the necessity of multiple probe particles indicates that multiple distinct particle types or environments are an important characteristic.
B. Hard ellipses
The freezing transition for hard ellipses differs from that of hard disks because the former features an intervening nematic phase between the disordered fluid and the positionally ordered solid. The nematic phase manifests when the ellipses display disordered center-of-mass positions but quasi-long range orientational order.32–35 A conventional OP that reports on the fluid-nematic transition, , as well as simulation configurations at densities below and above the phase transition, is shown in Fig. 4 for ellipses with an aspect ratio κ = 4. The continuous, second-order nature of the fluid-nematic transition is apparent from the behavior of , from which the precise density for the underlying phase transition is not readily apparent. Therefore, one typically monitors the long-range power-law decay of a pairwise angular correlation function versus interparticle separation to identify the transition. The nematic phase transition point is identified when the power law decay exponent (ϕ) falls below an approximate value of 1/4.33,34 As shown in the inset of Fig. 4(a), ϕ = 1/4 at η ≈ 0.696, in close agreement with prior identification of the transition.33,34
Density-driven isotropic fluid to nematic phase transition in a system of hard ellipses with aspect ratio κ = 4. (a) The packing fraction η dependence of the conventional order parameter for this transition, , where θi is the angle between the semi-minor axis of the ith ellipse and the x-axis and N is the number of ellipses, as per Ref. 35. (Inset) Power law exponent (ϕ) for the angular correlations as a function of η around the fluid-nematic transition. The horizontal line indicates that the phase transition occurs when ϕ = 1/4. Simulation configurations of (b) the isotropic fluid at η = 0.65 and (c) the nematic phase at η = 0.75. Ellipses are color coded according to θi as defined above with the angular range limited to [−π/2, π/2] due to orientation symmetry of the ellipse.
Density-driven isotropic fluid to nematic phase transition in a system of hard ellipses with aspect ratio κ = 4. (a) The packing fraction η dependence of the conventional order parameter for this transition, , where θi is the angle between the semi-minor axis of the ith ellipse and the x-axis and N is the number of ellipses, as per Ref. 35. (Inset) Power law exponent (ϕ) for the angular correlations as a function of η around the fluid-nematic transition. The horizontal line indicates that the phase transition occurs when ϕ = 1/4. Simulation configurations of (b) the isotropic fluid at η = 0.65 and (c) the nematic phase at η = 0.75. Ellipses are color coded according to θi as defined above with the angular range limited to [−π/2, π/2] due to orientation symmetry of the ellipse.
To detect the fluid-nematic phase transition via PCA, we use a feature vector constructed from the relative orientation of pairs of ellipses that are sorted in ascending order by the distance between the probe particle and its neighbor, as described in Sec. II B 2. In Fig. 5, we present the results of PCA using this orientational feature vector for ellipses with an aspect ratio κ = 4. As seen by the explained variance λi in Fig. 5(b), the first PC captures approximately 40% of the data variance, indicating effective dimensionality reduction. From the component weights in Fig. 5(a), it is clear that long-range orientations (larger values of k) are much more important than the closer neighbor orientations which tend toward zero.
Based on PCA of the 2D system of hard ellipses, we show (a) component weights , (b) the explained variance λi, and (c) the OP (P1) and standard deviation ().
Based on PCA of the 2D system of hard ellipses, we show (a) component weights , (b) the explained variance λi, and (c) the OP (P1) and standard deviation ().
The above weights reflect the underlying structural motifs present in hard ellipses at various values of η. Orientationally aligned clusters of ellipses are present in both fluid and nematic phases [compare, for instance, the snapshots in Figs. 4(b) and 4(c)]. Therefore, orientations between nearby ellipses [smaller values of k in Fig. 5(a)] are not useful indicators of nascent orientational long-range order, and their contributions to the OP are suppressed by the PCA. On the other hand, long distance components are approximately equal-weighed as they correlate proportionally to the presence of an emerging nematic director but average out for random configurations in the fluid state.
Regarding the OP itself, we show P1 and its standard deviation σ1 in Fig. 5(c); the latter quantity can be interpreted as a type of susceptibility of the OP. Note that P1 resembles the traditional OP of Fig. 4 and, while responsive to the nematic phase change, does not provide a unique transition point. As such, we use σ1 to correlate the PCA results to the fluid-nematic phase transition: the maximum of σ1 indicates the region most consistent with large-scale configuration fluctuations near the critical point of a continuous phase transition. The maximum occurs at η ≈ 0.695, in close agreement with η ≈ 0.696 identified by the point at which the power law exponent ϕ decays to 1/4 in Fig. 4(a). Indeed, for all values of κ investigated here, we find that the density η associated with the maximum value of σ1 is in excellent agreement with fitted fluid-nematic boundary reported by Xu et al.34 (see Fig. 6), without requiring a tedious analysis of the long range scaling behavior in the angular correlations employed by the latter study.
Phase boundary of the isotropic fluid to nematic transition for a system of hard ellipses as a function of packing fraction η and aspect ratio κ. Solid black dots indicate the phase transition point identified from the position of maximum susceptibility, max(σ1). The dashed gray line indicates the phase boundary fit reported by Xu et al.34
Phase boundary of the isotropic fluid to nematic transition for a system of hard ellipses as a function of packing fraction η and aspect ratio κ. Solid black dots indicate the phase transition point identified from the position of maximum susceptibility, max(σ1). The dashed gray line indicates the phase boundary fit reported by Xu et al.34
While the use of orientational features as input to PCA provides a means to detect the fluid-nematic phase transition, the relationship between the above PCA results and the nematic-solid transition is less obvious. In Figs. 7(a) and 7(b), we plot a normalized version of P1 and its standard deviation σ1 derived from the orientational features against the known nematic-solid coexistence region (the gray shaded area). There is a weak response to the nematic-solid region as drops abruptly—perhaps an indicator of reduced orientational freedom of the ellipses upon solidification. The weak response is not surprising as angular degrees of freedom are an indirect measure of the nematic-solid transition. Thus, in this case, angular features do not capture the dominant physics governing the freezing transition and will require more suitable features to more clearly delineate the phase transition.
For the first PC of hard ellipses, comparison of the (a) shifted and normalized PC scores and (b) normalized standard deviations , as derived from either orientational or positional feature vectors. The dashed black vertical line indicates fluid-nematic boundary, and the shaded gray region indicates the nematic-solid phase coexistence region reported in Ref. 35.
For the first PC of hard ellipses, comparison of the (a) shifted and normalized PC scores and (b) normalized standard deviations , as derived from either orientational or positional feature vectors. The dashed black vertical line indicates fluid-nematic boundary, and the shaded gray region indicates the nematic-solid phase coexistence region reported in Ref. 35.
In order to detect the center-of-mass level ordering that occurs at the nematic-solid transition, we employ the positional NN features used for hard disks in Paper I.52 That is, instead of including the relative angle between two ellipses in the feature vector, we employ the interparticle distances. In Figs. 7(a) and 7(b), we compare the normalized PC scores and the associated , respectively, when these positional features are used as input to the PCA. The resulting OP is insensitive to the fluid-nematic boundary but grows sharply across the known nematic-solid phase-coexistence region. Similar to the orientational features, the maximum in the position-based susceptibility σ1 in Fig. 7(b) is an appropriate identifier for the underlying phase transition. The form of the OP as a function of η—flat over the fluid phase, rapid growth upon solidification, and more muted growth in the solid phase—is qualitatively similar to the OP reported in Paper I52 for the densification of hard disks. Together, the above results attest to the ability of PCA to provide insights into the character of a given phase transition by varying the form of the feature vector.
C. Widom-Rowlinson model
As mentioned in Sec. II B 3, the WR model contains two particles types—A and B—where like particles are non-interacting and unlike particles interact via a hard-core repulsion of diameter σ. At low densities, the two species are mixed. However, upon densification, a phase transition occurs37,38,48 where the WR mixture phase separates into A-rich and B-rich regions [Figs. 8(a) and 8(b)] as the excluded volume effects experienced by the unlike particles overcome the mixing entropy. The density at which the demixing transition occurs varies with composition; we denote x as the fraction of A particles. In the present work, we study the mixture for which x = 0.5.
For the WR model at x = 0.5, simulated configuration snapshots of (a) mixed WR particles at ρ = 1.25 (below the phase transition) and (b) demixed WR particles at ρ = 2.5 (above the phase transition). (c) Fraction of percolated configurations (fperc) as a function of number density.
For the WR model at x = 0.5, simulated configuration snapshots of (a) mixed WR particles at ρ = 1.25 (below the phase transition) and (b) demixed WR particles at ρ = 2.5 (above the phase transition). (c) Fraction of percolated configurations (fperc) as a function of number density.
When x = 0.5, the density at which clusters of like particles become percolated can be used to determine the demixing transition.38,49,50 For the WR model, a cluster is defined as a group of particles that are all either directly overlapping or connected via a contiguous pathway of overlapping particles when periodic boundary conditions are properly taken into account. For a finite-sized, periodically replicated simulation box, a percolated cluster is one that grows in size upon replication of the simulation cell. Therefore, for each species at x = 0.5, we computed the fraction of configurations possessing at least one percolated cluster of that particle type and averaged the results for the A and B particles to yield fperc. Figure 8(c) shows fperc as a function of density; percolated clusters were identified as described in Ref. 51. One choice for the percolation threshold—the point when at least 50% of the configurations are percolated—yields a de-mixing transition density of ρt = 1.68.
Positional features of the form defined by Eqs. (2)–(4) are unable to detect the above de-mixing transition if compositional degrees of freedom are not taken into account—a consequence of the absence of large scale fluctuations in the packings of the particles (agnostic to particle type) as the phase transition occurs. However, for any multicomponent mixture, features can be constructed using particle type data as well as spatial information. For the WR model, one such strategy is to design a feature vector that only includes interparticle distances if the corresponding pair of particles meets some criterion based on the particle type. One such choice (though others are possible) is to only include distances between two A particles in the feature vector defined by Eqs. (2)–(4)—akin to the liquid-gas formulation of the WR model. The outcome of PCA with the above feature vector is shown in Fig. 9. The component weights in Fig. 9(a) indicate that the long-ranged positional correlations dominate, whereas the smallest interparticle separations with respect to a given probe are essentially meaningless to the PCA. Reminiscent of the fluid-nematic transition seen in ellipses and described in Sec. III B, some local clustering on the basis of the particle type occurs at lower densities than phase separation does; see Fig. 8(a), for example. Therefore, it is the long-range correlations that change sharply as the phase transition occurs. Figure 9(b) depicts the explained variance λi for the first 20 PCs, where the first PC accounts for ∼10% of the data variance—an order of magnitude more than the succeeding components.
For the first PC upon application of PCA to the WR model, (a) its component weights , (b) the explained variance λi for the first 20 components, and (c) the PCA-deduced OP (P1) and percolation-based OP (fperc). (Inset) The standard deviation σ1 of the PCA OP.
For the first PC upon application of PCA to the WR model, (a) its component weights , (b) the explained variance λi for the first 20 components, and (c) the PCA-deduced OP (P1) and percolation-based OP (fperc). (Inset) The standard deviation σ1 of the PCA OP.
The PCA-deduced OP is shown in Fig. 9(c). Relative to fperc, the PCA-based OP varies more slowly and has a wider transition window. To identify the unique transition point, the standard deviation σ1, as outlined in Sec. III B, is computed for every density and shown in the inset of Fig. 9(c). The maximum value of σ1, denoting the region with most variance in the feature vectors, is accepted as being associated with the transition density ρt. For the above one-component mixture, a peak in the standard deviation is observed at ρt = 1.66, which is in excellent agreement with the value obtained through percolation arguments (ρt = 1.68)—indicating successful identification of the de-mixing transition.
IV. CONCLUSIONS
In this article, we extended the PCA framework introduced in Paper I52 to detect phase transitions in three new model systems, each characterized by very different physics. The success of the method in these cases highlights the importance of exploring various feature representations when seeking to detect such phase transitions and understand their underlying physics.
Moving forward, two avenues seem fruitful for developing a routine analysis toolkit: (1) curate a sufficiently diverse library of features, each focused on different physical aspects that may be relevant to various phase transitions of interest, and (2) explore the ability of more sophisticated, nonlinear learners to autonomously extract the physical intuition underlying such transitions on-the-fly.
Finally, we comment on general trends that we observed to indicate that the PCA calculation was usefully reporting on the phase transition of interest. First we note that meaningful dimensionality reduction into the first PC is generally indicated by a large relative explained variance in comparison to the higher order PCs. Furthermore, we found that appropriate choices for the features resulted in an OP with strong convergence properties that required relatively small amounts of data to overcome sampling noise. We expect that these trends are relevant to other machine-learning approaches for the detection of phase transitions as well.
ACKNOWLEDGMENTS
The authors thank Michael P. Howard for valuable discussions and feedback. This research was primarily supported by the National Science Foundation through the Center for Dynamics and Control of Materials: an NSF MRSEC under Cooperative Agreement No. DMR-1720595 as well as the Welch Foundation (F-1696). We acknowledge the Texas Advanced Computing Center (TACC) at the University of Texas at Austin for providing HPC resources.
REFERENCES
We do not exclude the possibility that the other PC scores may be useful or that a better order parameter could involve multiple PC scores.9–14 For simplicity, we focus on p1.