Many molecular systems and physical phenomena are controlled by local fluctuations and microscopic dynamical rearrangements of the constitutive interacting units that are often difficult to detect. This is the case, for example, of phase transitions, phase equilibria, nucleation events, and defect propagation, to mention a few. A detailed comprehension of local atomic environments and of their dynamic rearrangements is essential to understand such phenomena and also to draw structure–property relationships useful to unveil how to control complex molecular systems. Considerable progress in the development of advanced structural descriptors [e.g., Smooth Overlap of Atomic Position (SOAP), etc.] has certainly enhanced the representation of atomic-scale simulations data. However, despite such efforts, local dynamic environment rearrangements still remain difficult to elucidate. Here, exploiting the structurally rich description of atomic environments of SOAP and building on the concept of time-dependent local variations, we developed a SOAP-based descriptor, TimeSOAP (τSOAP), which essentially tracks time variations in local SOAP environments surrounding each molecule (i.e., each SOAP center) along ensemble trajectories. We demonstrate how analysis of the time-series τSOAP data and of their time derivatives allows us to detect dynamic domains and track instantaneous changes of local atomic arrangements (i.e., local fluctuations) in a variety of molecular systems. The approach is simple and general, and we expect that it will help shed light on a variety of complex dynamical phenomena.
I. INTRODUCTION
Structure–property relationships, at the heart of modern materials science, are hard to elucidate in complex molecular systems. Multi-scale and many-body interactions among all the atoms make it challenging, yet inspiring, to reconstruct the macroscopic behavior of such systems from their underlying atomic structure.1,2 Ranging from materials with an intrinsically dynamic character, such as soft supramolecular architectures, to common crystal lattices, a thorough knowledge of atomic arrangements, including their structural and dynamic evolution, is required to unlock tangled material responses and features.3–7 In crystalline solids, for instance, the materials’ plasticity and viscosity and their microstructural evolution are dictated by the energy and kinetics of defects8 or structural imperfections.9,10 Furthermore, atomically disordered domains, such as surfaces, grain boundaries, and heterogeneous interfaces, have been widely recognized to be linked to transport, mechanical, electronic, and optical properties.11–14 Shedding light on the intimate connection between complex atomic arrangements and material dynamic properties would clearly pave the way for novel design rules and optimization of molecular systems for tailored behaviors.15,16 However, in most practical cases, this remains far from being trivial.
In recent years, advances in data availability and computational power have enabled the development of valuable tools for gaining a deeper understanding of the chemical–physical phenomena occurring in materials.17,18 In particular, molecular dynamics (MD) simulations have been playing an increasingly significant role in the exploration of materials, serving as a large source of potential information.19–27 The use of MD simulations to elucidate structure–property relationships substantially embeds a two-step protocol: (i) the translation of MD trajectories into a numerical representation of atomic neighborhood environments, resulting in highly detailed and high-dimensional data, known as fingerprints or descriptors, and (ii) the extrapolation of meaningful information from the large volumes of generated datasets. Regarding the latter step, Machine Learning (ML) algorithms have often exhibited promising advantages in regard to handling large and complex sets of data, thereby receiving increased interest.28–33 However, a low-dimensional representation facilitating the navigation and identification of hidden patterns and features would be desirable.
Within this framework, methods for adequately characterizing complex atomic arrangements from MD simulations have been remarkably expanded. Over the last decades, many descriptors relying on the use of order parameters or mathematical quantities have been proposed.1 Low-dimensional descriptors based on the use of order parameters often allow gaining very accurate information, though being dependent on a priori knowledge about system features. However, methods relying on the use of structural environments (i.e., order parameters), such as coordination analysis, bond order analysis,34 bond angle analysis (BAA),35 common neighbor analysis, (CNA)36 adaptive CNA (A-CNA),37 and Voronoi analysis, generally struggle to identify different local coordination environments when the geometric symmetry is lost or exhibit a short-range nature (e.g., in crystalline systems close to the melting temperature). On the other hand, coupling more mathematically sophisticated descriptors to ML approaches enables effective characterization of systems by exploiting the rich and high-dimensional datasets provided by MD simulations38–40 apart from being less dependent on a priori knowledge. Nonetheless, advanced mathematical descriptors, such as the Behler–Parrinello symmetry functions (BP),41 Chebyshev polynomial representations (CPR),42 adaptive generalizable neighborhood informed (AGNI) features,43,44 smooth overlap of atomic position (SOAP),45 and atomic cluster expansion (ACE),46 generally based on the representation of atomic environments, can efficiently capture local structural properties but are less efficient in providing information on microscopic dynamic events occurring in the studied systems.
Among such abstract mathematical descriptors, SOAP was recently found very efficient for characterizing of a wide range of systems,47–50 including soft, disordered, and complex assemblies.51–54 Despite being strongly connected to the structural features of local environments, the SOAP fingerprint coupled with unsupervised clustering approaches and statistical analyses has been recently used to also reconstruct the dynamics of complex systems, such as metal surfaces,55 metal nanoparticles,56 soft supramolecular polymers,51,57 self-assembled micelles,52 and complex hierarchical superlattices, to cite a few.54,58 Since SOAP descriptors are typically high-dimensional, both linear and nonlinear dimensionality reduction (DR) approaches are often employed for facilitating both analyses and data visualization.59–62 However, DR represents the fundamental roadblock because it inherently leads to a loss of information, resulting in challenging characterization of systems where ordered and disordered domains coexist in dynamic exchange and equilibrium. In addition, beyond a few valuable techniques,63,64 the time evolution of structural changes, including rare fluctuations, still remains weakly explored by simply classifying datasets with unsupervised and sophisticated ML tools.
Time-dependent descriptors offer a different approach. For example, a recently developed descriptor—Local Environments and Neighbors Shuffling (LENS)65—monitors how much the microscopic surrounding of each molecular unit changes over time in terms of neighbor individuals/identities along an MD trajectory. LENS allows the identification of dynamic domains and detection of local fluctuations in a variety of systems tracking events of addition/subtraction of neighbors within a certain cutoff over time. However, LENS does not contain structural information on, e.g., the relative position or arrangements of neighbors inside the cutoff sphere. In this way, it does not capture, e.g., local structural rearrangement, adjustment, or rattling. A time-dependent descriptor capable of retaining rich structural information and of efficiently monitoring structural changes over time would be desirable.
Building on such a concept, here we report a time-dependent descriptor, TimeSOAP (τSOAP), that essentially exploits the structurally rich description of molecular/atomic environments guaranteed by the SOAP vectors and measures to what extent the SOAP power spectra of each unit within a complex molecular system change over time. An ML-based analysis of the time-series τSOAP data allows us to robustly and efficiently detect, e.g., structural transitions, phase transitions, and the coexistence of phases in a variety of systems with rich and diverse intrinsic dynamics. It is worth noting that the time derivative of τSOAP also provides sharp signals identifying local fluctuations, highlighting local and rare events that may be overlooked with other approaches. The paper is organized as follows: In Sec. II (Methods), we present our τSOAP and τSOAP-based descriptors and the coupled ML-based workflow. In Sec. III, we discuss the results obtained by performing our τSOAP analysis on various systems characterized by solid/liquid coexisting phases, solid-like and fluid-like behaviors, respectively. Our tests indicate that τSOAP analyses are flexible and robust and can shed light on complex molecular/atomic systems with nontrivial multilayered dynamics, providing insights that are difficult to attain with other approaches.
II. METHODS
A. SOAP as a descriptor of atomic environments
Recently, data-driven approaches capturing the structural complexity of materials from equilibrium MD trajectories have been proposed. A generic MD trajectory is represented by an ordered list of N atomic coordinates R(t) in the 3D space at each simulation time step, where N is the number of particles in the system. In order to characterize complex atomic arrangements, descriptors of atomic neighborhood environments have been widely employed. By associating a feature vector to each R(t), the descriptors enable passing from the 3D coordinate space to an S-dimensional feature space. Importantly, these representations are required (1) to be permutationally, translationally, and rotationally invariant—in order for physically equivalent configurations to be recognized as such—and (2) to smoothly vary with small changes in atomic positions. Among many developed descriptors, we adopt the Smooth Overlap of Atomic Position (SOAP) to examine our sample of materials ranging from crystalline to soft and liquid states. SOAP is a state-of-art, high-dimensional representation of atomic environments, and it has recently provided valuable insights into both material properties and structural features.49,57,66,67
Importantly, pi and pj describe the local environments related to two different SOAP centers. Besides the SOAP kernel, this distance representation provides a bounded similarity measure between two local environments, indicating how their local densities match in the S-dimensional feature space.
B. Tracking dynamical SOAP variations with TimeSOAP
The output dataset containing the S-dimensional SOAP vectors is typically high-dimensional, and although rich in information on the atomic/molecular arrangements, it requires a crucial preprocessing step to both facilitate the interpretation of the results and effectively identify relevant molecular patterns. For this reason, after estimating pi [Eq. (3)] for the whole set of SOAP centers {i} at each sampled configuration of the MD trajectory, a SOAP-based pattern recognition procedure typically relies on two successive key phases: (1) use of dimensionality reduction (DR) of SOAP spectra by means of, for instance, Principal Component Analysis69,70 (PCA), and (2) employment of unsupervised clustering techniques for the identification of molecular motifs. Despite providing some information on a wide range of molecular systems, this approach presents a few key shortcomings: (i) Since the time information is not emphasized, insights on consequential transition events as well as the temporal persistence of the individual molecular configurations is not retained, thus hindering a detailed comprehension of the rate of change of every individual molecular configuration; (ii) on such low-dimensional SOAP-based dataset, some poorly populated configurations may remain undetected by (e.g., density-based) unsupervised clustering approaches; and (iii) low-dimensional embedding of atom-density representations can fail in faithfully preserving valuable information, such that a high number of principal components would be desirable.71 This makes detecting local fluctuations and rare events typically awkward with such approaches.
In contrast to Eq. (5), here both and describe the local environments related to the same unit (i.e., the i-th SOAP center) but at different simulation times, t and t + Δt, respectively. Thus, measures how similar the i-th SOAP vector calculated at time t is to that calculated at the next sampled time step (t + Δt). We analyze consecutive frames, that is, adjacent points, where Δt represents the MD sampling time step (different for the various systems, see Molecular Dynamics Simulations for more details). As a result, τSOAP evaluates how the i-th local environment changes, in terms of SOAP descriptor, at every consecutive time interval Δt. We thus obtain λi(t), a τSOAP signal over time for each individual in the selected set {i}, thereby allowing to track the evolution of each SOAP constituent unit center along the trajectory.
By computing it along the MD trajectory, we get . What represents is the rate of local environment changes over time for the i-th SOAP center. This allows us to highlight the relevant dynamic phenomena occurring along the trajectory, notably discriminating between local environments characterized by a constant variation and those exhibiting an increasing/decreasing variation.
To increase the signal-to-noise ratio (S/N), both λi(t) and are preprocessed by employing the Savitzky–Golay73 filter from the SciPy74 python package, thus obtaining smoothed signals. A common polynomial order parameter of p = 2 is used for each signal λi(t), while different time windows are chosen depending on the analyzed system, in order to reach a compromise between an acceptable S/N value and a sufficiently smaller window compared to the length of signal (see the supplementary material, Fig. S1, for details). After having chosen the time window for λi(t), to adequately smooth its first derivative , we keep the same time window and use two applications of the filter (following a general rule: for the n-th derivative, use at least n + 1 applications of the filter).
C. Dynamic domains detection
After increasing S/N, an ML-based analysis is performed on λi(t) data to detect relevant dynamics domains in each system. As a clustering method, we opted to use the KMeans algorithm from the Scipy python package,75,76 since it was demonstrated to be robust and capable of providing a good trade-off between clustering quality and computational cost.65 Nonetheless, it is worth noting that the analysis approach is versatile, and other clustering methods could be used. KMeans requires the number K of clusters to be created in the process to be predetermined. Here, with the aim of guaranteeing a wide variety of micro-cluster dynamics regardless of the analyzed system, we start anyway from K = 10 clusters. On the basis of the exchange probability matrix and the dendrogram associated with cluster interconnections, we then hierarchically merged the K clusters a posteriori. The exchange probability matrix contains, indeed, the percentage probability of a unit i belonging to a given cluster to persist in that cluster or to jump to another cluster in the sampling time step Δt; from this, by means of an “average” linkage method, we built the associated dendrogram connecting the dynamic domains that have a high probability of exchanging elements. Ultimately, to establish the cutoff point of the dendrogram, we used the Elbow Curve Method as an indicative guideline for selecting the optimal number of clusters k (see the supplementary material, Fig. S2). However, for completeness, all the steps leading from the starting K = 10 clusters to the final k clusters are reported in the supplementary material, Figs. S3 and S4.
On the other hand, the domain recognition on data has been performed via a different approach. On obtaining the distribution and the associated Kernel Density Estimate (KDE) for each system, we divide the KDE in deciles and consider the first and the tenth deciles to detect units significantly falling far from the mean local environment variation rate. The first decile and the tenth decile capture units with rapidly decreasing and increasing, respectively, local environment changes. This provides a clear distinction between domains moving toward more dynamic and those moving toward less dynamic configurations.
D. Molecular dynamics (MD) simulations
We test our τSOAP analysis on MD trajectories obtained for different systems with nontrivial dynamics: a water–ice interface system in correspondence of the transition temperature, a gold nanoparticle at 200 K, a copper surface at 700 K, and DPPC lipid bilayer where liquid and gel phases coexist at 293 K of temperature.
The atomistic ice/liquid water phase coexistence at the solid/liquid transition temperature is obtained by employing the direct coexistence technique77,78 using the GROMACS software.79 In order to model both the ice and the liquid water phase, the TIP4P/ICE water model80 is used. The direct coexistence technique is based on the idea of placing in contact two or more phases (in this case, the phase of ice Ih and the liquid water phase) in the same simulation box and at constant pressure. Since the energy is constant at T = 268 K, while the system melts at T = 269 K,81 we set the temperature at T = 268 K and keep it constant by means of the v-rescale thermostat with a relaxation time of t = 0.2 ps.
To get the initial configuration of ice Ih, the Genice tool proposed in the work of Matsumoto et al.82 is used, which generates a hydrogen-disordered lattice with zero net polarization satisfying the Bernal–Fowler rules. The solid lattice is equilibrated by performing a 10 ns-long anisotropic NPT simulation at ambient pressure (1 atm). The c-rescale barostat83 is used with a time constant of t = 20 ps and compressibility of 9.1 × 10−6 bar−1. On the other hand, the liquid phase is obtained from the same initial configuration of ice Ih but performing a NVT simulation at T = 400 K in order to quickly melt the ice slab. Then, a 10 ns-long simulation at T = 268 K is performed to equilibrate the liquid phase, using the c-rescale barostat in semi-isotropic conditions and at a compressibility of 4.5 × 10−5 bar−1. Since the initial ice slab is composed of 1024 water molecules, both the solid and liquid phases have the same number of molecules and box dimensions. The two phases are put in contact and, then, the system is equilibrated for t = 10 ns employing the c-rescale pressure coupling at ambient pressure with the water compressibility (4.5 × 10−5 bar−1). The production NPT is carried out in semi-isotropic conditions, applying the pressure only in the direction perpendicular to the ice/water interface, thus reproducing the strictly correct ensemble for the liquid–solid equilibrium simulation by the direct coexistence technique.84 Finally, a 100 ns-long production run is performed, with a sampling time interval of Δt = 0.1 ns.
The second case study analyzed in this work is an icosahedral gold nanoparticle (Au-NP) composed of 309 atoms. The parameterization of the model is performed according to the Gupta potential.85 The Au-NP system is simulated for t = 2 µs at T = 200 K sampling every Δt = 0.1 ns using the LAMMPS software.86 The details are described in Ref. 56.
The third system, the atomistic model of copper FCC surface Cu(210), is composed of 2304 Cu atoms and simulated at T = 700 K. A Neural Network potential built by means of the DeepMD platform87 is employed to perform Deep-potential MD simulations of the Cu(210) surface with the LAMMPS software,86 as reported in Ref. 55. The MD trajectory is conducted for 150 ns using a sampling time interval of Δt = 0.3 ns.
Finally, the last case study is a DPPC lipid bilayer composed of 1152 lipids simulated at T = 293 K. As detailed in Ref. 53, DPPC lipids are simulated and parameterized in explicit water by using the Martini2.288 Coarse-Grained (CG) force field. The CG-MD simulation is performed for t = 1 µs and sampled every 0.1 ns with the GROMACS software.79 However, in our analysis, we use the last 500 ns of MD trajectory.
III. RESULTS AND DISCUSSION
Herein, we use the descriptor τSOAP to elucidate the dynamics of atomic/molecular structural environments, which are often key determinants of global material performances. In order to show the whole picture of dynamic information that can be extracted from τSOAP signals, we analyze MD trajectories of different systems exhibiting various structures and nontrivial behaviors, thus indicating the transferability of such an approach to a wide range of materials. In particular, we first focus on ice/liquid water coexistence at the solid/liquid transition temperature, where structural and dynamic properties continuously alternate from solid-like to liquid-like character.89 We also carry out our analysis on systems revealing solid-like dynamics, such as metal nanoparticles and surfaces well below the melting point. Ultimately, a fluid-like soft system is included by testing our approach on a lipid bilayer below the gel-to-liquid transition temperature.
A. Into the dynamics of ice/liquid water phase coexistence via τSOAP signal
We start testing τSOAP on a system where crystalline ice and liquid water coexist at the solid/liquid transition temperature, while exhibiting a dynamic equilibrium between solid-like and liquid-like regimes. We analyze a simulation box, in periodic boundary conditions, having 1024 hexagonal ice (Ih) molecules in contact with 1024 liquid water molecules [see Fig. 1(a)] at T = 268 K. We consider 1001 consecutive frames sampled every Δt = 0.1 ns along 100 ns of an MD trajectory. As a first step, we compute the SOAP vectors for the oxygen atoms of each water molecule (2048 TIP4P/ICE water molecules) along all frames of the trajectory (see Sec. II for details). Before illustrating the τSOAP analysis, we start by briefly discussing the results obtained via, e.g., a SOAP + PCA pattern recognition procedure—widely used for studying molecular systems—on such ice/liquid water system, here presented as a first case study. Figure 1(b) shows the results of this analysis, which detects, from the SOAP-based dataset, two main clusters, corresponding to the ice and liquid water domains (in green and gray). It is worth noting that DR (via PCA) to a three-dimensional subspace already allows to capture >90% of the cumulative variance of the SOAP-based dataset in this case (see Fig. S5a). A systematic analysis on the effect of increasing the dimensionality provided essentially the same results, demonstrating how two main SOAP domains are detected (ice and liquid water) independently on the number of PCs used and of the identified clusters (see also Fig. S5).
Automatic detection of molecular motifs in ice/liquid water coexistence. (a) MD snapshot of ice/liquid water simulation box made of 2048 TIP4P/ICE water molecules at T = 268 K. Color code: red for oxygen and white for hydrogen atoms. (b) Example of a typical SOAP-based pattern recognition procedure. Left: PCA projection of the SOAP-based dataset estimated from the ice/liquid water system in (a). Right: clustering analysis—on the same dataset—carried out with KMeans. The two main detected clusters, colored in green and gray, are also visualized on the MD water snapshot (taken at 44 ns), showing the ice and water domains in gray and green, respectively. (c) On the left, time-series of τSOAP signals, λi(t), shown for all the oxygen atoms in (a). The colored λi(t) profiles are related to three explicative oxygen atoms, i.e., (i) black, (ii) cyan, and (iii) crimson, displayed on the right with the respective color code. The reported MD snapshots are around t ∼ 60 ns. (d) τ SOAP-based analysis. λi(t) profiles and their KDE are carried out for all the oxygen atoms of all water molecules in the system (a). The final k = 4 detected macro-clusters are shown as colored in gray, crimson, blue, and cyan. (e) Interconnection probability matrix of the final k = 4 identified macro-clusters. (f) MD snapshot (taken at 44 ns) showing the four main clusters identified by τ SOAP-based analysis (same color code of (d)): ice (in gray), solid/liquid interface (in crimson), liquid water (in blue), a distinct domain in the liquid phase (in cyan).
Automatic detection of molecular motifs in ice/liquid water coexistence. (a) MD snapshot of ice/liquid water simulation box made of 2048 TIP4P/ICE water molecules at T = 268 K. Color code: red for oxygen and white for hydrogen atoms. (b) Example of a typical SOAP-based pattern recognition procedure. Left: PCA projection of the SOAP-based dataset estimated from the ice/liquid water system in (a). Right: clustering analysis—on the same dataset—carried out with KMeans. The two main detected clusters, colored in green and gray, are also visualized on the MD water snapshot (taken at 44 ns), showing the ice and water domains in gray and green, respectively. (c) On the left, time-series of τSOAP signals, λi(t), shown for all the oxygen atoms in (a). The colored λi(t) profiles are related to three explicative oxygen atoms, i.e., (i) black, (ii) cyan, and (iii) crimson, displayed on the right with the respective color code. The reported MD snapshots are around t ∼ 60 ns. (d) τ SOAP-based analysis. λi(t) profiles and their KDE are carried out for all the oxygen atoms of all water molecules in the system (a). The final k = 4 detected macro-clusters are shown as colored in gray, crimson, blue, and cyan. (e) Interconnection probability matrix of the final k = 4 identified macro-clusters. (f) MD snapshot (taken at 44 ns) showing the four main clusters identified by τ SOAP-based analysis (same color code of (d)): ice (in gray), solid/liquid interface (in crimson), liquid water (in blue), a distinct domain in the liquid phase (in cyan).
After computing SOAP vectors, τSOAP signals are estimated by capturing the variations of local SOAP environments in Δt = 0.1 ns [see Eq. (6)]. Figure 1(c) reports, on the left, the resulting λi(t) time profiles related to each of the 2048 oxygen atoms, while, on the right, it shows the ice/liquid water MD snapshots at t = 59 ns, t = 63 ns, and t = 65 ns. Notably, three distinct λi(t) profiles are highlighted in Fig. 1(c), left: (i) the black signal oscillating around λi = 0.2; (ii) the cyan signal lying in the highest λi region; and (iii) the crimson signal, which rapidly passes, at t ∼ 60 ns, from low to high λi values. The oxygen atoms related to the latter three λi(t) profiles are instead depicted on the MD snapshots in Fig. 1(c), right, with the respective color code, i.e., black, cyan, and crimson. The visualization of these selected atoms clearly shows that the black and cyan oxygen units belong to the ice and liquid water phase, respectively, regardless of the displayed time steps. On the other hand, the identified crimson oxygen represents an atom involved in the ice/liquid water transition occurring at t ∼ 60 ns. By lightening the behavior of such atoms, we attempt to emphasize the potential meaning provided by τSOAP descriptor on the single unit dynamics: Following the time variation of atomic structural environments, τSOAP allows both to distinguish atoms belonging to different phase states and to capture those undergoing phase transitions.
In order to systematically detect the complete scenario of distinct dynamic behaviors in our water system, an ML-based analysis is carried out on all τSOAP signals. The results of the clustering investigation, performed via the KMeans algorithm, are shown in Fig. 1(d). The final four identified clusters (gray, crimson blue, and cyan) are displayed both on the time series of the λi(t) data [Fig. 1(d): left] and on the λi(t) data distribution reported with the correlated KDE [Fig. 1(d): right]. As already pointed out, the four different dynamic domains identify those water molecules undergoing specific transitions, i.e., instantaneously changing their local structural environments. In particular, the gray domain is dominated by oxygen units that are characterized by low λi values along the complete trajectory, i.e., by a weak variability of their local atomic environments. On the other hand, oxygen atoms showcasing high changes of their structural atomic distributions, and hence high values of λi, belong to the blue cluster or cyan domain. Oxygen units that, instead, tend to reveal medium values of λi—because of their transition from one λi regime to the other one—are classified into a distinct crimson cluster. It is interesting to note how, differently from the SOAP + PCA pattern recognition procedure shown in Fig. 1(b), an analysis of the time-series τSOAP data reveals this third dynamically different environment—i.e., the ice/liquid water interface, which gets lost in SOAP + PCA-based analyses due to its reduced (negligible) statistical weight (see the supplementary material, Fig. S5 for SOAP + PCA-based analyses with increased number of clusters). Ultimately, the cyan domain is detected as a different cluster of units with higher local environmental changes. The graphical representation of such clusters is shown in Fig. 1(f) through an MD snapshot (see the supplementary material, Movie S1). Not surprisingly, the gray cluster, characterized by the lowest τSOAP signal, corresponds to the ice phase; the blue domain, characterized by 0.4 0.54, is mainly correlated with the liquid phase; and the crimson one, including oxygen atoms with 0.2 0.4, is instead located at the solid/liquid interface. Finally, the cyan cluster (λi > 0.54), although sited in the same region of the liquid phase (blue cluster), is identified as presenting a different dynamic behavior. In the considered ice/liquid water system, the exchange probabilities among the final four clusters are displayed in the matrix in Fig. 1(e): Although the oxygen atoms exhibit a probability higher than to persist in a given cluster in the sampling time step Δt (probabilities on the matrix diagonal), no negligible transient events occur between red–blue and cyan–blue clusters, demonstrating that a percentage of oxygen population is involved in instantaneous transitions among dynamic domains (out of diagonal probabilities).
After detecting the main dynamics clusters based on τSOAP signals, λi(t), we carry out a further domain recognition analysis based on , that is, the instantaneous rate of local environment variations λi(t). The key information that can be gathered from the time derivative of λi(t) is pointed out in Fig. 2(a), where an explicative example is reported. Here, both λi(t) and time profiles are associated with the same oxygen atom i: In gray, λi(t) shows the atom undergoing phase transition at t ∼ 60 ns when the time signal significantly and rapidly passes from the low to the high λi value region; in green, the first time derivative of the gray profile exhibits a peak corresponding to phase transition while fluctuating around zero in both the initial and final stages of the trajectory. Clearly, tracks a notable signal leading up to a substantial dynamic change in the system. The first time derivative, indeed, offers a neat discrimination between small oscillations of λi(t)—which are intrinsic to the constituent units, independently from the proper dynamic domain—and large fluctuations driving significant changes in the atomic structure. Notably, also provides a detailed comprehension of the directionality of the local environment variations, i.e., on the evolution of the material structures. While the presence of a peak, i.e., of a large fluctuation in profile, suggests that a relevant event is occurring in that time interval, the sign of such fluctuations points out the evolution of a structural environment: A positive sign indicates that the atom is undergoing a local reconfiguration toward a more dynamic domain; a negative sign means that a local environment reconfiguration toward a more static domain is occurring.
First, the time derivative of τSOAP signal for ice/liquid water system. (a) τSOAP (λ(t)) and its first time derivative profiles associated with the same oxygen atom are shown in gray and green, respectively. (b) signals and their KDE estimated for all the oxygen atoms in Fig. 1(a). Clustering color code: (i) blue for environments corresponding to the first decile; (ii) orange for those corresponding to the tenth decile; (iii) white for values in all the other deciles. (c) MD snapshots displaying blue, orange, and white domains. Left: local detail of the orange cluster evolving toward melting (first and second snapshots). Right: local detail of blue cluster associated with a small disordered region evolving toward freezing (third and fourth snapshots). (d) Orange local environments identify ice molecules moving out of hexagonal ice configurations.
First, the time derivative of τSOAP signal for ice/liquid water system. (a) τSOAP (λ(t)) and its first time derivative profiles associated with the same oxygen atom are shown in gray and green, respectively. (b) signals and their KDE estimated for all the oxygen atoms in Fig. 1(a). Clustering color code: (i) blue for environments corresponding to the first decile; (ii) orange for those corresponding to the tenth decile; (iii) white for values in all the other deciles. (c) MD snapshots displaying blue, orange, and white domains. Left: local detail of the orange cluster evolving toward melting (first and second snapshots). Right: local detail of blue cluster associated with a small disordered region evolving toward freezing (third and fourth snapshots). (d) Orange local environments identify ice molecules moving out of hexagonal ice configurations.
Figure 2(b) shows, on the left, the time profiles of related to each of the 2048 oxygen atoms and, on the right, the KDE of the data distribution. We color in blue and orange the domains corresponding to the first and the tenth decile, respectively, while we merge all the other deciles in a single white cluster. It is worth noting that the KDE distribution has a peak approximately corresponding to , indicating that the local environment variations, λi(t), are, on average, constant. On the other hand, atoms that significantly increase or decrease, frame by frame, their local environmental changes are captured by positive (in the orange region) or negative peaks (in the blue region), respectively. In Fig. 2(c), we visualize these three different domains (blue, orange, and white) on a few snapshots along the MD trajectory, thereby showing that the positive and negative peaks allow characterizing melting and freezing phenomena occurring within small solid-like and liquid-like regions. In the first snapshot of Fig. 2(c), we represent a small portion of oxygen solid-like atoms (in orange) located at the ice/liquid water interface and exhibiting positive τSOAP (λi) variations (i.e., undergoing rearrangements toward more dynamic configurations). Accordingly, in the second snapshot, those rings appear as broken, thus proving a melting-type process. In the third snapshot of Fig. 2(c), we report, instead, an example of oxygen units presenting negative τSOAP (λi) variations (blue cluster), thus evolving toward more static configurations at the solid–liquid interface. Indeed, as shown in the fourth snapshot, an ordered ring structure forms, thus reproducing a typical freezing phenomenon. Ultimately, Fig. 2(d) shows a further detail potentially revealed by our analysis. In particular, water molecules exhibiting a high positive rate of change of their local SOAP environment (high ) turned out to be also associated with ice molecules that, at the interface with liquid water, undergo transitions out of the typical hexagonal packing, i.e., forming interface ice defects [Fig. 2(d)].90 In summary, besides capturing the local atomic rearrangements and characterizing their evolution, seems to be useful for defect detection purposes also.
The previous results suggest how the use of τSOAP descriptor and its first time derivative is a possible strategy to unveil microscopic phenomena occurring at the ice/water interface in dynamic equilibrium. In particular, by reliably detecting local fluctuations along with rearrangements and their evolution, the time variations of structural atomic environments show considerable potential for tracking crystallization or melting processes from MD trajectories.95 In order to outline the main features of τSOAP and the differences with respect to other analysis approaches often used to study the dynamics, we also compared τSOAP with a time-lagged Independent Component Analysis (tICA),91,92 a DR approach used to process high-dimensional input data by retaining valuable temporal information (see the supplementary material, Fig. S6). Concerning the study case of ice–liquid water transition, we projected the high-dimensional SOAP space on its highest-autocorrelation linear tICA subspace. The results in Figure S6 demonstrate how tICA essentially finds two main environments, corresponding to the ice and water domains. However, similar to a classical SOAP + PCA analysis [see Fig. 1(b)], such SOAP + tICA DR approaches do not recognize the ice/water interface as a separate environment, nor does it capture the local individual transitions as done by τSOAP (Figs. 1 and 2). This shows how such standard pattern recognition approaches (e.g., PCA or tICA coupled with clustering analyses) can effectively detect dynamic domains with dominant statistical weight, while sparse and local fluctuations get typically lost due to their negligible statistical occurrence. In this sense, τSOAP has the advantage to preserve any change in local structural environments, from the slowest to the fastest visited along the studied trajectories, thereby avoiding specific screening of structural variations.
B. Application to discrete solid-like dynamics
As completely different test cases, we test our approach on systems revealing solid-like dynamics. We discuss the results of our analysis applied on MD trajectories of (i) a 309-atom icosahedral gold nanoparticle, denoted as Au-NP, at 200 K [Fig. 3(a)], and (ii) a copper Cu(210) FCC surface at 700 K [Fig. 4(a)].
τ SOAP-based analysis on 309-atom icosahedral gold nanoparticle (Au-NP). (a) MD snapshots of ideal Au-NP (top), and equilibrated one at T = 200 K (bottom). (b) λi(t) profiles and the related KDE for the Au atoms in the system (a). The final k = 5 macro-clusters identified by KMeans are shown in gray, crimson, pink, blue, and cyan. (c) Exchange probability matrix of the final k = 5 detected macro-clusters. (d) MD snapshots with the five main clusters identified in (b): inner core in gray, interface region between the inner core and the outermost layer in crimson, more static surface face in pink, more dynamic surface face in blue, atoms undergoing the highest local environmental changes in cyan. (e) Domain detection based on profiles and their KDE: The blue domain is associated with the first decile, the orange domain is linked to the tenth decile, and the white domain includes in all the other deciles. (f) MD snapshots displaying the emergence of blue, orange, and white domains along the MD trajectory. On the left, the predominance of the orange cluster before (first snapshot) and during the symmetry breakdown (second snapshot) is shown. The central snapshot exhibits a prevalence of white domain, together with a balance between orange and blue ones. On the right, a prevalence of the blue domain can be observed (fourth snapshot) before the formation of a more static configuration (white cluster: fifth snapshot). (g) Blue, orange, and white domains associated with the rearrangement, over time, of a local configuration from “vertex” to “rosette.”
τ SOAP-based analysis on 309-atom icosahedral gold nanoparticle (Au-NP). (a) MD snapshots of ideal Au-NP (top), and equilibrated one at T = 200 K (bottom). (b) λi(t) profiles and the related KDE for the Au atoms in the system (a). The final k = 5 macro-clusters identified by KMeans are shown in gray, crimson, pink, blue, and cyan. (c) Exchange probability matrix of the final k = 5 detected macro-clusters. (d) MD snapshots with the five main clusters identified in (b): inner core in gray, interface region between the inner core and the outermost layer in crimson, more static surface face in pink, more dynamic surface face in blue, atoms undergoing the highest local environmental changes in cyan. (e) Domain detection based on profiles and their KDE: The blue domain is associated with the first decile, the orange domain is linked to the tenth decile, and the white domain includes in all the other deciles. (f) MD snapshots displaying the emergence of blue, orange, and white domains along the MD trajectory. On the left, the predominance of the orange cluster before (first snapshot) and during the symmetry breakdown (second snapshot) is shown. The central snapshot exhibits a prevalence of white domain, together with a balance between orange and blue ones. On the right, a prevalence of the blue domain can be observed (fourth snapshot) before the formation of a more static configuration (white cluster: fifth snapshot). (g) Blue, orange, and white domains associated with the rearrangement, over time, of a local configuration from “vertex” to “rosette.”
τSOAP-based analysis on a copper FCC surface, Cu(210), composed of 2304 atoms. (a) MD snapshots of ideal Cu(210) surface (top), and equilibrated one at T = 700 K (bottom). (b) λi(t) signals and the related KDE for Cu atoms of system in (a). The final k = 3 macro-clusters identified by KMeans are shown in gray, crimson, and cyan. (c) Exchange probability matrix of the final k = 3 detected macro-clusters. (d) MD snapshots showing the three main clusters identified in (b): crystalline bulk in gray, subsurface region in crimson, more dynamic surface atoms in cyan. (e) Domain detection based on profiles and the related KDE: The blue domain is associated with the first decile, the orange cluster is linked to the tenth decile, the white domain is related to all the other deciles. (f) MD snapshots of blue, orange, and white domains (in transparency) related to two different frames. Top: At t = 25.8 ns, the circled portion of the surface exhibits a prevalence of blue domain, thus predicting stable reconfigurations in the two successive frames (green atoms). Bottom: At t = 29.1 ns, the same circled portion exhibits a predominance of orange cluster, thus predicting dynamic reconfigurations in the two successive frames (green atoms).
τSOAP-based analysis on a copper FCC surface, Cu(210), composed of 2304 atoms. (a) MD snapshots of ideal Cu(210) surface (top), and equilibrated one at T = 700 K (bottom). (b) λi(t) signals and the related KDE for Cu atoms of system in (a). The final k = 3 macro-clusters identified by KMeans are shown in gray, crimson, and cyan. (c) Exchange probability matrix of the final k = 3 detected macro-clusters. (d) MD snapshots showing the three main clusters identified in (b): crystalline bulk in gray, subsurface region in crimson, more dynamic surface atoms in cyan. (e) Domain detection based on profiles and the related KDE: The blue domain is associated with the first decile, the orange cluster is linked to the tenth decile, the white domain is related to all the other deciles. (f) MD snapshots of blue, orange, and white domains (in transparency) related to two different frames. Top: At t = 25.8 ns, the circled portion of the surface exhibits a prevalence of blue domain, thus predicting stable reconfigurations in the two successive frames (green atoms). Bottom: At t = 29.1 ns, the same circled portion exhibits a predominance of orange cluster, thus predicting dynamic reconfigurations in the two successive frames (green atoms).
Regarding case (i), we analyze 20 000 consecutive frames of a 2-µs long MD trajectory sampled every Δt = 0.1 ns at T = 200 K. It is well known that metal nanoparticles may exhibit nontrivial dynamics at room and at even sufficiently lower temperatures. Despite the reduced atomic motion and, accordingly, the more stabilized ideal icosahedron architecture, some local fluctuations and atomic rearrangements can be observed in the Au-NP even at T = 200 K. τSOAP signals in Fig. 3(b), indeed, present a sudden increase after s, demonstrating that some atoms are experiencing intense instantaneous local environment variations. Our cluster analysis on λi(t) recognizes five main dynamic domains whose transition probabilities are reported in Fig. 3(c). This transition matrix proves a negligible attitude of the gold atoms to transfer to diverse dynamic domains, while they prefer to remain in their own cluster with probabilities higher than 98.4%. The MD snapshots in Fig. 3(d) show that these clusters identify distinct dynamical behaviors and structural domains (see also the supplementary material, Movie S2). First, the cluster analysis is able to accurately distinguish the inner core of the Au-NP (in gray): a more static region characterized—not surprisingly—by low λi(t) values along the whole simulation, from an interface region (in crimson) between the inner core and the outermost layer. Second, such a clustering approach sharply separates the surface of the Au-NP into two coexisting regions (pink and blue) related to different characteristic λi(t). While the pink face turned out to be more static, the blue domain reliably detects the portion of the surface where a fracture formation may occur, breaking down the symmetry (Fig. 3(d), second MD snapshots on the right). Interestingly, τSOAP also identifies some local events such as the formation of concave ”rosettes” (a vertex, having five neighbors in an ideal icosahedron, penetrates inside the NP surface, thus passing to six neighbors). In Fig. 3(d) (third snapshot on the right), two rosettes can be observed as belonging to a more dynamic cluster—significantly varying their local environments—(in blue), while the associated vertices are identified as more stable (in crimson).
Furthermore, the estimation of [Fig. 3(e)] provides interesting details on the dynamic evolution of the system. A rather large percentage of Au atoms is characterized by constant variations of their surrounding environment (λi(t) = const, and ). Rare and sharp fluctuations are anyhow clear in the λi(t) profile. To qualitatively illustrate some of these peaks, five MD snapshots presenting different predominant domains are shown Fig. 3(f). In the first and second snapshots, the domain characterized by positive (in orange) prevails, suggesting that the atoms belonging to that cluster are collectively involved in a significant increase in the instantaneous local environment variations. Indeed, this predicts the symmetry breaking shown in the second snapshot. However, the prevalence of represented by the white domain, along with a balance between positive (orange) and negative (blue) peaks, establishes a dynamic equilibrium leading to no relevant events along several trajectory frames (one example is presented in the third snapshot). In the last two snapshots, instead, a significant collective decrease of the instantaneous local variations (negative ) emerges (prevalence of blue domain in the fourth snapshot), thus predicting the evolution of the associated atoms toward more static environments (shown in the final snapshot). Ultimately, the information on the directionality of local rearrangements is also highlighted in Fig. 3(g): While positive values (in orange) mark a vertex evolving toward a less stable configuration where a missing atom appears, negative (in blue) predict rearrangement of the structure toward a stable rosette-like configuration.
For case (ii), we use 502 consecutive frames of 150 ns-long MD simulation of a Cu(210) surface composed of 2304 Cu atoms [Fig. 4(a)] sampled every Δt = 0.3 ns at T = 700 K. Although metals tend to be traditionally considered as hard matter, it is now well known that their constituent surface atoms may exhibit nontrivial dynamics, undergoing rearrangements well below the melting temperature.55,93 Our clustering procedure applied on τSOAP profiles identifies three main domains related to Cu atoms exhibiting very competing behaviors [Fig. 4(b)]: one dense and more static cluster in gray along with two less populated but more dynamic domains in red and cyan. The exchange probability matrix in Fig. 4(c) points out that the transient events among diverse domains mainly engage Cu atoms belonging to the red and cyan clusters. Figure 4(d) graphically represents the identified clusters at two explicative time steps, t = 85.8 ns and t = 92.7 ns: Not surprisingly, the gray domain corresponds to the crystalline bulk of the Cu(210) surface, reasonably detected by our analysis as the most static with small local environment variations (low λi(t) values); on the other hand, the surface atoms are identified as more dynamic clusters, thereby including all λi(t) > 0.07. However, two subsurface regions are recognized by KMeans: in crimson, a domain characterized by 0.07 < λi(t) ≤ 0.16, and in cyan, a cluster with the highest local environment variations. The two MD snapshots in Fig. 4(d) show significant correspondence between that more static surface region (crimson) and more stable surface atomic arrangements with increased coordination (see also the supplementary material, Movie S3).
The Cu(210) domain characterization based on confirms the effectiveness of this analysis in providing some key information on the time evolution of the material structure. Figure 4(e) highlights, also in this case, that the average rate of the local environment variations is null, i.e., most of the Cu atoms in Cu(210) show a steady-state behavior of λi(t). In addition, the cluster representation in Fig. 4(f) suggests that the domain with essentially corresponds to the ice crystalline bulk (in white). On the other hand, most of the surface atoms are highly dynamic; consequently, a balance between domains with positive (orange) or negative (blue) is established over time. Consistent with the test cases discussed above, this dynamic balance indicates that no substantial reconfiguration toward more stable/dynamic arrangements is occurring. Nevertheless, some interesting cluster details are worth noting in Fig. 4(f): The snapshot on the top, corresponding to t = 25.8 ns, exhibits a portion of the surface with a clear predominance of atoms evolving toward more static configurations (in blue); the zoom onto that portion clarifies its stability in the two successive sampled times (t = 26.1 and t = 26.4). On the contrary, when the same surface region is characterized, after some frames, by a prevalence of positive (in orange in the snapshot on the bottom at t = 29.1 ns), the associated atoms experience an evident rearrangement as highlighted by the green atoms onto the zoom. Again on this system, was revealed to be useful in predicting the evolution toward more static/more dynamic configurations.
C. Phase coexistence in soft dynamical systems
As a final test case, we apply our τSOAP-based dynamic domain recognition on a soft system characterized by a two-phase coexistence: gel and liquid. Specifically, we analyze the last 500 ns of 1 μs-long CG-MD simulation of a DPPC lipid bilayer composed of 1152 self-assembled DPPC lipids [see Fig. 5(a)] at T = 293 K, thus considering the last 5001 consecutive frames (Δt = 0.1 ns). Although the gel-to-liquid transition temperature of a DPPC membrane is at K, here we investigate the dynamics of the lipid bilayer at a slightly lower temperature, thereby avoiding addressing the critical dynamics issues occurring at the transition temperature.
τ SOAP-based analysis on a lipid bilayer composed of 1152 DPPC lipids at T = 293 K. (a) MD snapshots of DPPC lipid bilayer (top and lateral views). (b) λi(t) signals and the related KDE for all the phosphate atoms of all lipid molecules in the DPPC bilayer (a), along the last 500 ns of the MD trajectory. KMeans clustering identifies k = 2 final macro-clusters shown with crimson and cyan. (c) Exchange probability matrix of the final k = 2 macro-clusters. (d) MD snapshot of the two detected domains [same color code as (b)]: gel phase in crimson and liquid phase in cyan. (e) Domain detection based on profiles and the related KDE: The blue domain is associated with the first decile; the orange cluster is linked to the tenth decile; the white domain is related to all the other deciles. (f) MD snapshots of blue, orange, and white domains related to different frames along the trajectory.
τ SOAP-based analysis on a lipid bilayer composed of 1152 DPPC lipids at T = 293 K. (a) MD snapshots of DPPC lipid bilayer (top and lateral views). (b) λi(t) signals and the related KDE for all the phosphate atoms of all lipid molecules in the DPPC bilayer (a), along the last 500 ns of the MD trajectory. KMeans clustering identifies k = 2 final macro-clusters shown with crimson and cyan. (c) Exchange probability matrix of the final k = 2 macro-clusters. (d) MD snapshot of the two detected domains [same color code as (b)]: gel phase in crimson and liquid phase in cyan. (e) Domain detection based on profiles and the related KDE: The blue domain is associated with the first decile; the orange cluster is linked to the tenth decile; the white domain is related to all the other deciles. (f) MD snapshots of blue, orange, and white domains related to different frames along the trajectory.
Our clustering analysis, displayed in Fig. 5(b) on λi(t) profiles and on the related KDE, identifies two main dynamic domains: one colored in crimson including λi < 2.8 and the other one in cyan, containing the highest values of τSOAP fingerprints. Figure 5(c) reports, instead, the interconversion matrix between the two clusters. Beyond a small probability (3.6%) to transient from cyan to red cluster, the lipids manifest relatively high stability to preserve, along the complete trajectory, a specific local environment variation (λi), typical of each individual dynamics cluster. The graphical representation of the lipid bilayer in Fig. 5(d) suggests a close link between the dynamic domains and the phase states: The crimson cluster characterized by small λi is indeed associated with a more static, gel, phase, while the cyan domain, with higher local environment variations, is connected to a more dynamic—liquid—phase. Although the ability of SOAP to distinguish environments characterized by diverse structural features is known,53,57 this case demonstrates how τSOAP is able to clearly detect the nucleation and emergence of distinct dynamic domains in intrinsically disordered systems, in a very agile and efficient way. This offers an additional proof of the versatility and robustness of this descriptor.
Further analysis on , reported in Fig. 5(e), reveals a predominance of (white cluster) along with a balance between positive (orange) and negative (blue) peaks over time. We recall indeed that a null time derivative of λi(t) represents the behavior of those units exhibiting a constant variation of their local environment, with some statistical oscillations classified in the orange and blue domains. Within such a resulting scenario, the proposed analysis predicts gel–liquid phase coexistence in dynamic equilibrium, as shown in the MD snapshots in Fig. 5(f). In other words, the lipids are not evolving toward a more static/dynamic configuration, whereas each remains in its proper dynamic domain.
IV. CONCLUSIONS
Investigating the dynamics of individual units in many atomic/molecular systems is essential to understand the behavior of complex molecular systems, and their physical and chemical properties, collective transitions as well as to design next-generation materials and molecular systems with desirable dynamical behaviors.94 However, because of the complexity of local structural environments along with their dynamics in such systems, a general approach is still lacking. Although faithful representations of atomic neighborhood environments—such as the SOAP descriptor—are available and widely employed, here we want to draw attention to the time evolution of these structures, which is typically overlooked in molecular motif recognition procedures.95
In this work, we propose an alternative perspective allowing us to track the dynamical changes in atomic structural environments of the interacting subunits, thus enhancing the detection of dynamic domains and emerging phenomena. Building upon the SOAP descriptor, we implement τSOAP, a new fingerprint that quantifies the variations of local SOAP environments surrounding each constituent unit along its MD trajectory. τSOAP, indeed, retains the time information from high-dimensional SOAP vectors, thereby aiming at emphasizing the importance of consequential events for reconstructing dynamics and detecting rare fluctuations. Coupled to an ML-based analysis, we demonstrate the potentiality of such an approach to identify domains with different structural and dynamical behaviors. Ranging from an ice/liquid water system where solid-like and fluid-like domains coexist in dynamic equilibrium, to solid-like materials, and soft matter presenting gel and liquid coexisting phases, we prove that our analysis reliably addresses phase transitions, rare dynamic events, and coexisting phases. Moreover, by estimating the first time derivative of τSOAP signal, we gain further information on the direction of the local structural changes. Indeed, besides detecting local rearrangements, the first time derivative of τSOAP enables the characterization of their evolution toward either strongly or weakly dynamic environments. Finally, we can envisage that descriptors like τSOAP, and its first time derivative, may be also useful in enhanced sampling methods, providing degrees of freedom along which structural variations or transitions within the system can be accelerated.
Nonetheless, τSOAP-based investigation presents a few limitations. Although τSOAP signal tracks the evolution of each constituent unit along the whole MD trajectory, thus providing time history data, the coupled ML-based approach relies on the instantaneous values of local environment changes, without performing time-series clustering for identifying dynamic domains. Importantly, time-series clustering and classification based on the frequency/duration of local environment variations could have a striking advantage in regard to discriminating fluctuations leading up to significant structural changes in the system. Notably, by including in our ML-based framework the first time derivative of τSOAP, we start providing further insights on predicting the evolution of local changes and specifically how selected environments reconstruct or evolve in time. In summary, our approach turned out to be robust and versatile to capture fluctuating environments from SOAP spectra in a variety of systems by means of a completely agnostic and data-driven analysis.
SUPPLEMENTARY MATERIAL
The supplementary material contains details about the length of MD simulation trajectories and SOAP vector parameters; the Elbow Curve Method profiles for the identification of the optimal number of final clusters; KMeans clustering analysis on the τSOAP data starting from K = 10 clusters with their relative transition probabilities and the associated dendrograms; SOAP + PCA-based analyses related to TIP4P/ICE ice/liquid water system; SOAP + tICA-based analyses related to TIP4P/ICE ice/liquid water system. This material is provided as a PDF file. Complete details of all molecular models used for the simulations, and of the simulation parameters (e.g., input files), as well as the complete TimeSOAP analysis code, are also available at https://doi.org/10.5281/zenodo.7962820.
ACKNOWLEDGMENTS
G.M.P. acknowledges the support received by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 818776—DYNAPOL) and by the Swiss National Science Foundation (SNSF Grant No. IZLIZ2_183336).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
G.M.P. conceived this research and supervised the work. C.C. developed the descriptor and implemented the analysis. C.C., M.C. D.R., and A.C. performed the simulations and the analyses. All authors analyzed and discussed the results. C.C., A.C., and G.M.P. wrote the manuscript.
Cristina Caruso: Conceptualization (equal); Data curation (lead); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (lead); Writing – original draft (lead). Annalisa Cardellini: Conceptualization (equal); Investigation (equal); Methodology (equal); Supervision (equal); Writing – original draft (equal). Martina Crippa: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Visualization (equal); Writing – original draft (supporting). Daniele Rapetti: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal). Giovanni M. Pavan: Conceptualization (lead); Funding acquisition (lead); Investigation (equal); Resources (lead); Supervision (lead); Writing – review & editing (equal).
DATA AVAILABILITY
Complete details of all molecular models used for the simulations, and of the simulation parameters (e.g., input files), as well as the complete TimeSOAP analysis code, are available at https://doi.org/10.5281/zenodo.7962820. Further details on the analyses are provided in the supplementary material.