Recent developments in machine learning have enabled accurate predictions of the dynamics of slow structural relaxation in glass-forming systems. However, existing machine learning models for these tasks are mostly designed to learn a single dynamic quantity and relate it to the structural features of glassy liquids. In this study, we propose a graph neural network model, "BOnd TArgeting Network," that learns the relative motion between neighboring pairs of particles in addition to the self-motion of particles. By relating the structural features to these two different dynamical variables, the model autonomously acquires the ability to discern how the self-motion of particles undergoing slow relaxation is affected by two different dynamical processes, strain fluctuations and particle rearrangements, and it can thus predict with high precision how slow structural relaxation develops in space and time.

When a liquid is rapidly cooled, it freezes into a glass, which retains a random amorphous structure. This phenomenon, referred to as the glass transition, is ubiquitous, but its origin remains to be clarified and has been an important research topic for decades.1,2 Along with the drastic slowdown of atomic motion accompanying the glass transition, localized domains of particles that rearrange preferentially over others grow.3–6 Identifying the structural origin of this phenomenon has long been a central problem in the field. Structural analyses of local geometric orders7–9 and of real-space normal modes of vibration derived from static structures10–14 have accumulated evidence for a structure–dynamics correspondence. However, the random, featureless structures of glasses still prevent us from fully understanding this origin and from predicting the dynamics based on the static structure.

Recently, machine learning methodologies have enabled the accurate determination of the structure–dynamics correspondence in glassy systems.15–23 Among them, graph neural networks (GNNs),24 a type of deep learning algorithm that operates on a graph, have been found to be particularly suitable.20 GNNs had already proven useful for predicting material properties by directly representing the structures of molecules or atomic clusters as nodes and edges of a graph.25–28 Bapst et al.20 applied this ability of GNNs to the task of predicting particle mobility in a glass-forming liquid from its static structure and showed that GNNs outperform other machine learning methodologies. This high accuracy in predicting glassy dynamics has enabled the development of inverse modeling as a further application.29 It has also led to the development of a cost-effective linear-regression (LR) model that avoids the high computational cost of GNNs while achieving equivalent precision.21,22

In the study by Bapst et al.,20 the prediction accuracy of GNNs was found to exceed not only that of other machine learning methods but also that of normal mode analysis at all temperatures accessible by simulations and across all temporal scales up to the structural relaxation time. Normal mode analysis predicts the spatial distribution of short-time vibrations. Because short-time vibrations (soft modes) are intimately linked to the slow α relaxation,9,30–34 normal mode analysis yields a strong baseline of physics-based prediction of glassy dynamics from a static structure.10,35 Nevertheless, the prediction accuracy of GNNs is relatively low for the fast β dynamics compared with the slow α relaxation.20,22 We may therefore conjecture that the fast dynamics is difficult to predict owing to elastic vibrations, which are averaged out when the short-time dynamics is coarse-grained. This decline in prediction accuracy can be rectified, as shown by a recent study,36 if short-time vibrations obtained from the actual simulations are encoded. However, there are still no methodologies that make predictions solely from a static structure while bypassing the challenges caused by such transient fluctuations.

In this article, we propose a GNN model named "BOnd TArgeting Network (BOTAN)" that realizes highly precise predictions of the entire glassy relaxation from a static structure. BOTAN can learn a target quantity assigned to edges, corresponding to pairs of nearby particles, in addition to the previous target quantity assigned to nodes. BOTAN is thus a straightforward but powerful extension of the GNN used in the previous study by Bapst et al.,20 which we term the "node-targeting GNN" (NT-GNN). In this study, this feature of BOTAN is exploited to characterize and learn relative motion: it characterizes structural rearrangements unaffected by elastic fluctuations, as well as the degree of local elastic strain. We show that this modeling greatly improves predictions of the complex dynamics of slow structural relaxation in glassy systems.

In this section, we explain the neural network architecture of BOTAN. Similar to NT-GNN, it is based on the interaction network,37,38 consisting of an encoder–process–decoder architecture in which the features of nodes and edges in a graph are computed mutually by exchanging messages between a pair of two-layer multilayer perceptrons (MLPs) assigned to nodes and edges. This architecture is suitable for extracting intricate relationships from simulation data computed with high precision. Because the network treats node and edge features equivalently, the network can be regressed toward target quantities on edges as well as on nodes39 by decoding the edge embeddings with an MLP.

Let G(V, E) be a graph that includes a set of nodes V and a set of edges E. For a pair of nodes v, u ∈ V, the edge between v and u is represented as e_{v,u}. The edges incident to a node v ∈ V are represented as N(v) (⊂ E). Given input feature vectors h_v^in and h_e^in for a node v ∈ V and an edge e ∈ E, respectively, the goal is to compute the corresponding output features h_v^out and h_e^out. As shown in the block diagram in Fig. 1, and denoting an encoder, a multilayer perceptron (MLP), and a decoder as EN(·), MLP(·), and DE(·), respectively, each layer used to compute the feature vectors for node v is formulated as follows: First, the input feature vectors are encoded by their respective encoders,
$h_v^{0} = \mathrm{EN}_n\left(h_v^{\mathrm{in}}\right),$ (1)
$h_e^{0} = \mathrm{EN}_e\left(h_e^{\mathrm{in}}\right).$ (2)
Then, the update of the edge and node features via MLPs is repeated n times,
$h_{e_{v,u}}^{m} = \mathrm{MLP}_e\left(h_{e_{v,u}}^{m-1} \oplus h_v^{m-1} \oplus h_u^{m-1} \oplus h_{e_{v,u}}^{0}\right),$ (3)
$h_v^{m} = \mathrm{MLP}_n\left(h_v^{m-1} \oplus \sum_{e \in N(v)} h_e^{m} \oplus h_v^{0}\right),$ (4)
where m stands for the iteration index of the n repeat cycles (0 < m ≤ n) and ⊕ denotes concatenation of feature vectors. In each cycle, messages are passed between nodes and edges, wherein encoded feature vectors are concatenated before every update. For the results presented in this article, it is critical that an edge receives messages from both of its end nodes v, u in each repeated cycle of message passing in Eq. (3), as indicated in the inset of Fig. 1 ("A. Edge update"; an edge is updated bidirectionally, based on the states of the nodes on both sides). Furthermore, at each of these updates, the encoded information, h_e^0 or h_v^0, is concatenated so that the training becomes stable. Decoders DE(·), placed as the final layers, yield the output feature on each node and each edge,
$h_v^{\mathrm{out}} = \mathrm{DE}_n\left(h_v^{n}\right),$ (5)
$h_e^{\mathrm{out}} = \mathrm{DE}_e\left(h_e^{n}\right),$ (6)
where the edge decoder in Eq. (6) is an additional part of the network that was absent in the NT-GNN. EN(·), MLP(·), and DE(·) each have two hidden layers of 64 units with rectified linear unit (ReLU) activation. At every repeated update cycle on edges and nodes in Eqs. (3) and (4), information from the neighbor shell (within the distance of 2.0σAA) is propagated; therefore, the number of repeat cycles n determines how far structural information can be incorporated.20,21 The prediction accuracy of the model declines as n decreases, as shown in Appendix A. In this study, we set n = 7, as in the previous study by Bapst et al.
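For concreteness, one repeat cycle of Eqs. (3) and (4) can be sketched in PyTorch as follows. This is a minimal illustration, assuming an undirected graph stored as directed edges in both orientations; the class and helper names (InteractionLayer, mlp) are ours, and the released implementation may differ in detail.

```python
import torch
import torch.nn as nn

def mlp(in_dim, d=64):
    # Two hidden layers of 64 units with ReLU activation, as in the text
    return nn.Sequential(
        nn.Linear(in_dim, d), nn.ReLU(),
        nn.Linear(d, d), nn.ReLU(),
        nn.Linear(d, d),
    )

class InteractionLayer(nn.Module):
    """One repeat cycle of Eqs. (3) and (4): edge update, then node update."""

    def __init__(self, d=64):
        super().__init__()
        self.edge_mlp = mlp(4 * d, d)  # sees [h_e^{m-1}, h_v^{m-1}, h_u^{m-1}, h_e^0]
        self.node_mlp = mlp(3 * d, d)  # sees [h_v^{m-1}, sum of incident edges, h_v^0]

    def forward(self, h_v, h_e, h_v0, h_e0, edge_index):
        src, dst = edge_index  # endpoint indices (v, u) of every edge
        # Eq. (3): bidirectional edge update from both end nodes
        h_e = self.edge_mlp(torch.cat([h_e, h_v[src], h_v[dst], h_e0], dim=-1))
        # Eq. (4): sum messages from incident edges onto each node
        agg = torch.zeros_like(h_v).index_add_(0, dst, h_e)
        h_v = self.node_mlp(torch.cat([h_v, agg, h_v0], dim=-1))
        return h_v, h_e
```

Stacking n = 7 such layers between the encoders and decoders reproduces the encoder–process–decoder pipeline described above.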
FIG. 1.

Block diagram of the GNN model, BOTAN, indicating the steps of computation in an epoch. “EN” and “DE” denote “encoder” and “decoder,” respectively. The GNN model has an encoder–process–decoder structure consisting of connected layers of MLPs with two hidden layers of 64 units with ReLU activation. The decoder on the edges is an added part, which does not exist in the NT-GNN.

In this study, we conduct training and prediction tests of BOTAN and NT-GNN using a dataset of molecular dynamics trajectories of a three-dimensional (3D) Kob–Andersen binary Lennard-Jones (LJ) mixture,40 wherein the pairwise force is smoothed at its cutoff.41,42 Interactions between the particles are determined by the interatomic pairwise potential given by
$u_{\mu\nu}(r) = 4\varepsilon_{\mu\nu}\left[\left(\frac{\sigma_{\mu\nu}}{r}\right)^{12} - \left(\frac{\sigma_{\mu\nu}}{r}\right)^{6}\right],$ (7)
where μ, ν ∈ {A, B} denote the particle types; the number of particles belonging to type A (denoted as NA) is 80% of the total number N = 4096. The interaction energies and particle sizes are defined as ɛAA = 1.0, ɛAB = 1.5, ɛBB = 0.5, σAA = 1.0, σAB = 0.8, and σBB = 0.88. In the remainder of this article, we use dimensionless Lennard-Jones units: distances, time, and temperature are measured in units of σAA, τ = √(mσAA²/ɛAA), and ɛAA/kB, with kB the Boltzmann constant. Trajectories for training and prediction tests are generated as a so-called isoconfigurational ensemble,10,31,35,43 wherein 32 simulation runs are conducted from the same particle configuration with different initial velocities, as detailed below.
To improve the accuracy of numerical integration over a large number of steps, the pairwise potential is modified following previous studies41,42 as
$\tilde{u}_{\mu\nu}(r) = u_{\mu\nu}(r) - u_{\mu\nu}(r_c) - (r - r_c)\left.\frac{\mathrm{d}u_{\mu\nu}}{\mathrm{d}r}\right|_{r=r_c} \quad (r \le r_c),$ (8)
so that both the force and the potential vanish at the cutoff length r = rc. The cutoff length is set for each combination of pair species as rc = 2.5σμν.
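As an illustration, Eq. (8) can be evaluated with a short NumPy function; this is a sketch under the parameters above (the function name and vectorized form are ours):

```python
import numpy as np

def shifted_force_lj(r, eps=1.0, sigma=1.0, rc=2.5):
    """Shifted-force LJ potential of Eq. (8): u and du/dr vanish at r = rc.

    Defaults correspond to an A-A pair (eps_AA = 1.0, sigma_AA = 1.0,
    rc = 2.5 sigma_AA); pass the appropriate pair parameters otherwise.
    """
    def u(x):
        s6 = (sigma / x) ** 6
        return 4.0 * eps * (s6 ** 2 - s6)

    def dudr(x):  # analytic derivative du/dr of Eq. (7)
        s6 = (sigma / x) ** 6
        return -24.0 * eps * (2.0 * s6 ** 2 - s6) / x

    r = np.asarray(r, dtype=float)
    return np.where(r <= rc, u(r) - u(rc) - (r - rc) * dudr(rc), 0.0)
```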
The procedure of dataset generation is similar to that of the previous study,20 except that the system density is fixed at N/V = 1.2. The time step of numerical integration is kept at Δt = 10⁻³ throughout. We set three target temperatures: T = 0.44, 0.50, and 0.64. In Fig. 2, the self-parts of the intermediate scattering functions for particles of type A,
$F_s(k, t) = \frac{1}{N_A}\left\langle \sum_{j=1}^{N_A} \exp\left[i\,\mathbf{k}\cdot\left(\mathbf{r}_j(t) - \mathbf{r}_j(0)\right)\right] \right\rangle,$ (9)
are shown for these temperatures (and additionally for T = 0.47 and 0.56), where the wavenumber k is set to 2π/σAA. The α-relaxation times τα are evaluated as 4120, 217, and 16.5 for T = 0.44, 0.50, and 0.64, respectively, by fitting Fs(k, t) to a double stretched exponential function.44,45 In this study, the time points t = 130 (0.03τα) and 4120 (τα) at the lowest temperature T = 0.44 are mainly investigated; these are indicated in Fig. 2 by vertical dotted lines.
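For reference, a minimal NumPy sketch of Eq. (9) from unwrapped trajectory frames, assuming the usual isotropic average over wavevector directions (names are illustrative):

```python
import numpy as np

def self_intermediate_scattering(r0, rt, k=2.0 * np.pi):
    """Self-part of the intermediate scattering function, Eq. (9).

    r0, rt: (N, 3) unwrapped positions at times 0 and t; k defaults to
    2*pi/sigma_AA in LJ units. The spherical average over wavevector
    directions gives <sin(k dr)/(k dr)>.
    """
    dr = np.linalg.norm(rt - r0, axis=1)
    return np.mean(np.sinc(k * dr / np.pi))  # np.sinc(x) = sin(pi x)/(pi x)
```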
FIG. 2.

The intermediate scattering function Fs(k, t) in Eq. (9) is shown for temperatures T = 0.44, 0.47, 0.50, 0.56, and 0.64 as functions of time. Vertical dotted lines indicate time points corresponding to 0.03τα and τα for T = 0.44, which are mainly investigated in this article.


For each of the three target temperatures, 500 independent particle configurations are generated as follows: after keeping the temperature at T = 5.0 for a duration of th = 10⁴, the system is cooled rapidly (in tcool = 20) to the target temperature. The temperature is subsequently kept constant using the Nosé–Hoover thermostat until the system reaches a steady state, up to a time scale of 40τα. The final particle configuration thus obtained is used as the initial configuration at t = 0 in the production run for generating the dataset.

From the 500 initial configurations, we construct the isoconfigurational ensemble10,43 by running 32 separate microcanonical (NVE) simulations, each starting from the same configuration but with initial velocities drawn randomly from the Maxwell–Boltzmann distribution at the target temperature. These simulations are conducted up to approximately 30τα, and the particle configurations are sampled at logarithmically spaced time points, as summarized in Table I. Among the 500 initial configurations, 400 (12 800 runs) are used for training and the remaining 100 (3200 runs) for testing.

TABLE I.

Time points used for generating the training data (in nondimensional LJ units). At each temperature, we set nine or ten time points, at each of which an isoconfigurational ensemble of 32 trajectories is taken. Boldface denotes the time points mainly investigated in this study.

Tagging    1       2       3       4       5       6       7       8        9        10
T = 0.44   0.13    1.30    13.0    130     412     1300    4120    13 000   41 200   130 000
T = 0.50   0.108   0.434   2.17    6.86    21.7    68.6    217     686      2 170    6 860
T = 0.64   ⋯       0.100   0.165   0.52    1.65    5.2     16.5    52       165      520

A graph for training or prediction test is constructed from the particle configuration, with particles represented by nodes. Edges are assigned to pairs of particles i and j within a distance threshold, rij(t = 0) ≤ re, so that the neighborhood relation is represented. We choose the threshold length re = 2.0σAA, which gives the best predictive performance for particle propensity.20 The constructed graphs are then fed into the model by encoding particle types (A or B) into the nodes and 3D relative positions between particles (r_ij = r_i − r_j) into the edges. Specifically, for input feature vectors h_v^in ∈ {A, B} and h_e^in (= r_ij) ∈ ℝ³, the first layers of the node and edge encoders [Eqs. (1) and (2)] are implemented by linear transformations f_n: ℝ → ℝ^d and f_e: ℝ³ → ℝ^d, respectively, where d is the dimension of the hidden layers (64 in our case). For the node feature, types A and B are converted beforehand into the real numbers 0 and 1, and the input features on both nodes and edges are treated as real numbers with 32-bit single floating-point precision.
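A graph of this kind can be sketched with PyTorch Geometric as follows. This is a minimal illustration: radius_graph as called here does not handle periodic boundaries (an assumption we flag explicitly), and the released pyg_botan code may build the graph differently.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import radius_graph

def build_graph(positions, types, box, r_e=2.0):
    """Build the input graph: nodes = particles, edges = pairs with
    r_ij(0) <= r_e; node feature = particle type (A -> 0, B -> 1),
    edge feature = 3D relative position r_i - r_j."""
    pos = torch.as_tensor(positions, dtype=torch.float32)
    box = torch.as_tensor(box, dtype=torch.float32)
    # NOTE: radius_graph ignores periodic images; a periodic neighbor
    # search would be needed for pairs crossing the box boundary.
    edge_index = radius_graph(pos, r=r_e, loop=False)
    src, dst = edge_index
    rel = pos[src] - pos[dst]
    rel -= box * torch.round(rel / box)  # minimum-image displacement
    x = torch.as_tensor(types, dtype=torch.float32).unsqueeze(-1)
    return Data(x=x, edge_index=edge_index, edge_attr=rel)
```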

In NT-GNN,20,22 only the node output feature h_v^out was exploited for learning and prediction tests, whereas in BOTAN the edge output feature h_e^out is additionally extracted for these tasks. In this study, the target quantities for training and prediction tests are the displacement (traveling distance) of each particle i,
$S_i(t) = \left|\mathbf{r}_i(t) - \mathbf{r}_i(0)\right|,$ (10)
and pair-distance changes characterizing relative motion between neighbor pairs,
$E_{ij}(t) = r_{ij}(t) - r_{ij}(0), \quad r_{ij}(t) = \left|\mathbf{r}_i(t) - \mathbf{r}_j(t)\right|,$ (11)
respectively. Both target quantities are averaged over the isoconfigurational ensemble, for which explicit notation is omitted. The GNN models are trained and tested independently at each time point.
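The two targets can be computed from the isoconfigurational trajectories as in the following NumPy sketch (the array layout and function name are our assumptions):

```python
import numpy as np

def iso_targets(r0, rt_runs, edge_index):
    """Isoconfigurational averages of Eqs. (10) and (11).

    r0: (N, 3) initial positions; rt_runs: (M, N, 3) unwrapped positions at
    time t from the M = 32 runs; edge_index: (2, E) neighbor-pair indices.
    """
    # Eq. (10): particle propensity, iso-averaged displacement magnitude
    S = np.linalg.norm(rt_runs - r0[None], axis=-1).mean(axis=0)      # (N,)
    i, j = edge_index
    d0 = np.linalg.norm(r0[i] - r0[j], axis=-1)                       # (E,)
    dt = np.linalg.norm(rt_runs[:, i] - rt_runs[:, j], axis=-1)       # (M, E)
    # Eq. (11): iso-averaged pair-distance change
    E = (dt - d0[None]).mean(axis=0)                                  # (E,)
    return S, E
```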

The learning setups and hyperparameters in the present study are similar to those used by Bapst et al.20 The model is trained to minimize the loss function, with the neural network optimized by the Adam algorithm without norm regularization, using a learning rate of 10⁻⁴ and a standard implementation in PyTorch. The data loaded for training are augmented by applying random rotations of the simulation box to the particle positions; this is valid because all the target quantities are rotationally invariant scalars.
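A minimal sketch of such an augmentation, assuming a uniformly distributed random rotation sampled via the QR decomposition trick (our illustration, not necessarily the released implementation):

```python
import torch

def random_rotation(pos):
    """Rotate an (N, 3) position tensor by a random proper rotation.

    Valid as augmentation here because the learning targets are
    rotationally invariant scalars.
    """
    q, r = torch.linalg.qr(torch.randn(3, 3))
    q = q * torch.sign(torch.diagonal(r))  # make Q uniformly (Haar) distributed
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]  # enforce det(Q) = +1 (no reflection)
    return pos @ q.T
```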

The loss functions for learning are defined by
$L_M = p\,L_S + (1 - p)\,L_E,$ (12)
where LS and LE are standard L2-norm loss functions defined for deviations of the particle propensity Si(t) and the pair-distance changes Eij(t), respectively. These may be written as
$L_S = \sum_{j}\left[\hat{S}_j(t, \{\mathbf{r}_i\}) - \left\langle S_j(t)\right\rangle_{\mathrm{IC}}\right]^2,$ (13)
$L_E = \sum_{j}\sum_{k \in N(j)}\left[\hat{E}_{jk}(t, \{\mathbf{r}_i\}) - \left\langle E_{jk}(t)\right\rangle_{\mathrm{IC}}\right]^2,$ (14)
where ⟨·⟩IC represents the average over the isoconfigurational ensemble, and Ŝj(t, {r_i}) and Êjk(t, {r_i}) denote the particle propensity and pair-distance change output by the GNN, respectively. The second summation in Eq. (14) runs over all neighbor nodes k (within the distance re) connected to j via edges of the graph. Using the loss function LM, the model can be trained on both Sj(t) and Ejk(t) simultaneously; furthermore, both quantities can also be inferred simultaneously. The parameter p (0 ≤ p ≤ 1) is a hyperparameter that sets the relative weight of nodes and edges in the training. The two limiting cases are p = 0, wherein the model learns only the pair-distance changes Eij(t), and p = 1, which reduces to NT-GNN.20 For simultaneous learning on nodes and edges (0 < p < 1), we set p = 0.4 in this study; this choice is justified by the ablation experiment described in Appendix B.
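In code, Eqs. (12)–(14) reduce to a few lines; the reduction convention (mean instead of sum) is our assumption:

```python
import torch.nn.functional as F

def botan_loss(S_pred, S_true, E_pred, E_true, p=0.4):
    """Weighted objective of Eq. (12): p = 1 recovers NT-GNN (nodes only),
    p = 0 trains on pair-distance changes only."""
    L_S = F.mse_loss(S_pred, S_true)  # Eq. (13), mean reduction
    L_E = F.mse_loss(E_pred, E_true)  # Eq. (14), mean reduction
    return p * L_S + (1.0 - p) * L_E
```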

In all these training scenarios, all particles (and all particle pairs) are learned together, regardless of particle type. No minibatching of graph inputs is applied in the training, except that the minibatch size is fixed at 5 in the training of NT-GNN (p = 1); although Bapst et al. applied no minibatching in their previous study,20 this difference does not significantly change the results presented in this article. Training NT-GNN required ∼2 h on one NVIDIA A100 Tensor Core GPU (40 GB SXM). A similar amount of time is required for training BOTAN, although there may be a substantial overhead from the computation of interparticle distances, whose actual elapsed time depends on the implementation.

Many training epochs are required if the training of BOTAN starts from random weight parameters. The reason is that, as will be discussed in Sec. V A, there is a simple relation between the initial distance rij(t = 0) and its change Eij(t) when the time interval t is short. Such a relation defines a "metabasin" toward which much of the weight optimization is drawn. Therefore, in this study, we start all training of BOTAN (p < 1) from weight parameters adjusted by pretraining on edges at the low temperature T = 0.44. This pretraining is performed for 2000 epochs on the pair-distance changes, keeping p = 0 and using trajectory data at a short time t = 13, which results in sufficient convergence of the loss functions, as shown in Fig. 3. Such pretraining allows the main part of the training to finish within 1000 epochs.

FIG. 3.

Training and test losses (left axis) for t = 13 (0.003τα) at T = 0.44 for the pretraining of pair-distance changes Eij(t) as an edge feature, with p = 0.

First, we investigate the predictive ability of BOTAN when it is trained only with the pair-distance change Eij(t), i.e., the case of p = 0. In Fig. 4, prediction results for the pair-distance change Eij(t), obtained for t = (a) 130 and (b) 4120 using BOTAN trained at the respective time points at the lowest temperature T = 0.44, are shown for a particle configuration in the test dataset. The data are shown as color maps projected onto segments satisfying rij(t = 0) < 1.35 in a two-dimensional cross section. The "actual" pair-distance changes (the isoconfigurational ensemble average) from the simulation are also shown as the ground truth. The time points correspond to 0.03τα and τα, where τα is the α-relaxation time. BOTAN effectively discerns which specific pairs become separated, particularly at the short time in (a). At the longer time in (b), the prediction deviates from the ground truth, but it still captures the pair-level propensity for distance changes as well as the spatial contrast between mobile and immobile regions. In Fig. 5, the prediction accuracy of BOTAN for Eij(t) is quantified using the Pearson correlation coefficient,20–22
$\rho = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\sqrt{\sum_i (y_i - \bar{y})^2}},$ (15)
which expresses the proximity between the predicted ({yi}) and ground truth ({xi}) data. Here, x̄ and ȳ denote the averages of xi and yi. This coefficient must be evaluated carefully to ensure that the comparison is made over equivalent particle pairs. Therefore, among pairs of type A particles (pairs involving type B are excluded), only pairs within the first-neighbor distance rij(t = 0) < 1.35 are used in this comparison, because neighbor pairs in the first and second neighbor shells are expected to exhibit different distance changes on average. The prediction accuracy thus evaluated is high at short times and monotonically declines with time.
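Eq. (15) in NumPy for reference (equivalent to np.corrcoef up to floating-point error); the restriction to A–A pairs with rij(0) < 1.35 is applied to the inputs beforehand:

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of Eq. (15) between ground truth
    values x and predictions y."""
    x, y = np.asarray(x), np.asarray(y)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
```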
FIG. 4.

Distribution of the predicted and “actual” (ground truth) pair-distance change Eij(t) at temperature T = 0.44 for t = (a) 130 (0.03 τα) and (b) 4120 (τα), respectively, which are plotted in color maps projected on each segment. Segments represent pairs within a distance of r < 1.35. A specific cross section (11.1 < z < 11.9) is cut out of the 3D box system.

FIG. 5.

Pearson correlation coefficients between predicted and ground truth values of Eij(t) plotted as a function of time t at T = 0.44, 0.50, and 0.64. Error bars depict the median, best, and worst of five independently trained models. The two dotted lines indicate 0.03τα and τα for T = 0.44, and the arrows show the α-relaxation times for respective temperatures.


The high prediction accuracy in the short-time region originates from a simple relation between the pair-distance change Eij(t) and the initial distance rij(t = 0). Figure 6 shows a map of the probability distribution function of these two quantities for the closest neighbor pairs of type A particles in the training dataset, at time points ranging from t = 13 to 13 000 at T = 0.44. In the short-time region (t ≲ 0.1τα = 412), a simple anticorrelation exists, wherein the pair-distance change Eij(t) remains one order of magnitude smaller than the particle radii for most neighbor pairs. This quantity, therefore, characterizes local strains induced by elastic vibrations. As time increases, the anticorrelation gradually weakens, although it persists up to t = τα, making the relation between the two quantities less trivial. The anticorrelation determines the metabasin in the landscape of the loss function in the vast weight-parameter space. The number of training epochs required to optimize BOTAN increases drastically at longer times t as the anticorrelation fades, because the loss landscape becomes more complex and the metabasin less "noticeable." Therefore, we pretrained BOTAN on the pair-distance changes at T = 0.44 and t = 13 and then used the resulting weight parameters for training at other times. As the metabasins in the parameter space are likely located close to each other for different time points, using pretrained models allows the optimization to concentrate on the region around the metabasins and thus reduces the number of training epochs required for the main part of the training.

FIG. 6.

The probability distribution function showing the relationship between the initial pair distance rij(t = 0) and pair-distance changes Eij(t) in the simulation data used for training at T = 0.44, for time points t = (a) 13.0, (b) 130, (c) 412, (d) 1300, (e) 4120, and (f) 13 000.


Because BOTAN has a pair of decoders that compute output features for nodes and edges, respectively, it can be trained simultaneously on target quantities for both: the particle propensity Si(t) and the pair-distance changes Eij(t). This is realized by setting the loss function to a weighted sum of the L2-norm losses for Si(t) and Eij(t), as in Eq. (12). The model is then trained with the other hyperparameters unchanged.

We compare the prediction ability of BOTAN with that of NT-GNN by Bapst et al. Figure 7 shows the prediction results for (1) t = 130 (0.03τα) and (2) t = 4120 (τα) at T = 0.44 of (a) BOTAN and (b) NT-GNN, in addition to (c) the "actual" propensity directly evaluated from the trajectory data in the test dataset. As all the snapshots are shown in the same cross section as in Fig. 4, the spatial correspondence between the predictions of Si(t) and Eij(t) can be seen. BOTAN clearly makes better predictions of the spatially heterogeneous patterns of particle propensity than NT-GNN. In (d), 3D vector plots show the displacement field locally averaged over nearest neighbors, defined by
$\mathbf{u}_i(t) = \frac{\sum_{j}\Theta\left(r_f - r_{ij}(0)\right)\Delta\mathbf{r}_j(t)}{\sum_{j}\Theta\left(r_f - r_{ij}(0)\right)},$ (16)
where Θ(x) denotes the Heaviside step function and rf = 1.35 is the cutoff length of coarse-graining. Each field is plotted in a cross section of thickness 1.2, with the 3D vectors enlarged for better visibility. These plots extract the particle motion arising from quasi-localized and phononic vibrations46–48 by removing the effect of interparticle rearrangements. For both t = 130 and 4120, we observe collective motion on a length scale considerably exceeding the particle size, and the flows are more aligned in the former case, suggesting the effect of phononic motion. In (e) and (f), error maps of the predictions are shown for BOTAN and NT-GNN, respectively, revealing spatial heterogeneity in the prediction errors. Comparing the error map of NT-GNN in (f) with the locally averaged displacement field in (d), we find that the spatial patterns correspond: the locally averaged displacements tend to be parallel to the contour lines of the prediction error map, especially in regions where the displacement vectors are aligned. This clear spatial correlation suggests that collective fluctuations, which are less intimately related to the local structure around each particle, are a probable reason for the decline in the prediction accuracy of NT-GNN.20,22
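A direct O(N²) sketch of Eq. (16), ignoring periodic images for brevity (a production implementation would use a neighbor list and minimum-image distances; whether particle i itself enters its own average is a convention we leave as in the code):

```python
import numpy as np

def local_displacement_field(r0, dr, r_f=1.35):
    """Displacement field locally averaged over neighbors, Eq. (16).

    r0: (N, 3) initial positions; dr: (N, 3) iso-averaged displacement
    vectors; r_f: coarse-graining cutoff.
    """
    diff = r0[:, None, :] - r0[None, :, :]
    w = (np.linalg.norm(diff, axis=-1) < r_f).astype(float)  # Heaviside weights
    return (w @ dr) / w.sum(axis=1, keepdims=True)
```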
FIG. 7.

Comparison of predictions and ground truth data for t = (1) 130 (0.03τα) and (2) 4120 (τα), both at the lowest temperature T = 0.44: (a) and (b) Particle propensity Si(t) predicted using (a) BOTAN and (b) NT-GNN, respectively. Color maps are linearly interpolated between particles. (c) Distribution of Si(t) as the ground truth data. (d) Local displacement field averaged over neighbors. (e) and (f) Error map for predictions of (e) BOTAN and (f) NT-GNN, corresponding to (a) and (b), respectively. All data are plotted for the same 3D particle configuration, in the cross section 11.1 < z < 11.9, except (e), in which 10.9 < z < 12.1 is shown.


We next assess the prediction accuracy of BOTAN in comparison with NT-GNN. Because particles of types A and B are expected to exhibit different diffusivity over long times, the prediction accuracy is evaluated in terms of the Pearson correlation coefficient using only the data for type A particles. Figure 8 shows how accurately BOTAN and NT-GNN predict the particle propensity Si(t) at three temperatures, T = (a) 0.44, (b) 0.50, and (c) 0.64. The time dependence for NT-GNN agrees with the previous results:20 at short times before the plateau region (t < 0.2), the Pearson correlation coefficient assumes high values, then falls below 0.5, and afterward gradually increases to reach its peak around the α-relaxation time. BOTAN outperforms NT-GNN in predictive accuracy over the entire temporal range and at all temperatures investigated.

FIG. 8.

Pearson correlation coefficients ρ between predicted and actual particle propensities Si(t), representing the prediction accuracy of BOTAN and NT-GNN, are shown as functions of time t for T = (a) 0.44, (b) 0.50, and (c) 0.64. Additionally, those of Ri(t) and Di(t) obtained by NT-GNN are also shown for comparison. Error bars show the median, best, and worst of five independently trained models.


Here, we note that the pair-distance change in Eq. (11) has its own significance despite the extreme simplicity of its definition. Because pair-distance changes characterize relative motion, they can capture structural rearrangements. This feature is distinct from standard quantities defined on the basis of particle displacements, which are affected by nonlocal phononic fluctuations; it thus enables the model to better distinguish rearranging hotspots from nonlocal effects. Similar ideas have been implemented in the method of counting the replacement of neighboring pairs,10,49,50 also termed "bond breakage."5,51,52 Furthermore, the pair-distance changes characterize the extent to which each pair distance tends to change owing to local shear deformation around rearranging cores; this aspect of structural change has long been addressed using the concept of "shear transformation zones."47,53–55 By learning how the strains of non-rearranging regions are distributed and how rearranging hot spots are localized, BOTAN autonomously corrects the errors in the prediction of particle displacements and thereby acquires unprecedented predictive ability for glassy dynamics.

Finally, we compare the prediction accuracy that can be achieved by training NT-GNN on target variables in which the effect of collective motion is suppressed to a certain extent. For this purpose, we introduce and briefly examine two quantities that characterize particle rearrangements while being less affected by such collective motion. One is the displacement of a particle relative to the displacement its neighborhood would have under a uniform strain,

$R_i(t) = \left[\sum_{\alpha}\left(\Delta r_{i,\alpha}(t) - \frac{1}{N_{\mathrm{n.n.}}}\sum_{j}^{\mathrm{n.n.}}\left[\Delta r_{j,\alpha}(t) - \sum_{\beta}\varepsilon_{\alpha\beta}\,r_{ij,\beta}(0)\right]\right)^2\right]^{1/2},$

wherein the vector and tensor components α, β ∈ {x, y, z} are denoted explicitly for clarity. Here, εαβ represents the local strain tensor, which is evaluated from the change in the local arrangement of neighboring particles within a distance of 1.6, and the summation is again over the nearest neighbors within the same distance, with N_n.n. the number of these neighbors. When the local strain vanishes (εαβ = 0), which holds in the limit t → 0, this quantity reduces to the more easily interpretable form C_i(t) = |Δr_i(t) − N_n.n.⁻¹ Σ_j^n.n. Δr_j(t)|, clearly manifesting itself as the displacement relative to the neighbor environment.45,56
The other quantity is a further variant, widely used,14,50,53,54,57 defined as the mean square of the difference between the actual displacements and the uniform-strain displacements of the neighbors,

$D_i^2(t) = \frac{1}{N_{\mathrm{n.n.}}}\sum_{j}^{\mathrm{n.n.}}\sum_{\alpha}\left(\Delta r_{j,\alpha}(t) - \Delta r_{i,\alpha}(t) - \sum_{\beta}\varepsilon_{\alpha\beta}\,r_{ij,\beta}(0)\right)^2,$

where we employ its square root Di(t) as the target of learning. This quantity is close to Ri(t) in form; the difference lies in the squared sum, which is more strongly affected by a specific ij pair when that pair moves farther apart than the others. Therefore, Di(t) is more susceptible to hot spots where particle rearrangement preferentially occurs.

In Fig. 8, the Pearson correlation coefficients between the predictions of NT-GNN and the actual values of the neighbor-relative particle propensities Ri(t) and Di(t) are shown; in (a), this correlation coefficient is also plotted for the prediction made by BOTAN. The correlation coefficient is evaluated only for type A particles, as a measure of proximity between predicted and ground truth values. The prediction accuracy of NT-GNN improves for these quantities, especially at times shorter than 0.1τα. This clearly implies that spatially extended fluctuations cause the decline in the short-time prediction accuracy of Si(t) by NT-GNN and that the accuracy recovers for these quantities because the effect of such fluctuations is removed to a certain extent.

In conclusion, we have introduced a GNN model that realizes faithful predictions of glassy dynamics over the entire temporal range, with accuracy greatly improved over previous models, including the GNN model by Bapst et al.20 Typical deep learning models, including both convolutional neural networks and GNNs, are designed to characterize local features and are thus not particularly good at capturing nonlocal characteristics. The key to the present improvement is that the model learns the relative motion between neighbor pairs and relates it to two-body structural correlations, in addition to the self-motion of the particles. In this way, the GNN model bypasses direct learning of nonlocal motion itself and acquires the ability to autonomously "interpret" how particle motion is affected by two different dynamical effects: nonlocal strain fluctuations and local particle rearrangements. As a consequence, BOTAN reconciles the differences between fast β and slow α relaxation dynamics and achieves state-of-the-art performance in the task of predicting glassy dynamics from the static structure.

In the present study, the particle propensity for motion defined on self-displacement is chosen as the dynamical quantity of interest, in line with many previous studies on the structure–dynamics correspondence in glasses.7,9,10,20,21 It was shown that the GNN model proposed by Bapst et al. (NT-GNN) can be reduced to a linear-regression (LR) model once properly defined local structural parameters are averaged over neighbor shells, with essentially the same predictive ability for the LR model and NT-GNN.21 Our results have, somewhat contrarily, shown that such local structural parameters are not the only descriptors of self-displacement, because self-displacement is also influenced by strain fluctuations. Recently, new neural network models that improve the predictive ability on glassy dynamics by introducing different ideas have emerged. For example, the recently proposed machine learning framework GlassMLP raises the predictive ability by introducing physics-informed descriptors as input together with bottleneck layers;58 a couple of recent studies also improve the predictive ability of GNNs by introducing rotation-equivariant structural descriptors59 or by adding self-attention layers to GNNs.60 As the reasons for the improved predictions are likely different from one another, it may be possible to develop a more advanced framework by combining these features, which is left as an open task for the future. If much higher prediction ability is realized, such models may also become useful for finding the "reaction coordinates" of glassy dynamics, along which fluctuations should be enhanced for efficient sampling of molecular trajectories. Extensions to recently discussed machine learning-aided sampling techniques may be interesting.61,62

We thank T. Kawasaki for critical input regarding this research. We also thank Y. Asahi, J. J. Molina, and F. Landes for helpful discussions and comments. H. Shiba and T. Shimokawabe were supported by JSPS KAKENHI Grant No. JP19H05662 and by “Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures” and “High Performance Computing Infrastructure” in Japan (Project No. jh220052). M. Hanai and T. Suzumura were supported by “Advanced Research Infrastructure for Materials and Nanotechnology” from MEXT, Japan. Training and testing are performed using NVIDIA A100 Tensor Core GPUs (40 GB SXM) on Wisteria/BDEC-01 Aquarius subsystem at Information Technology Center, University of Tokyo. For generation of the dataset for training and testing, usage of the following computational resources is acknowledged: Oakforest-PACS at the Joint Center for Advanced High Performance Computing (JCAHPC), Oakbridge-CX at Information Technology Center, University of Tokyo, Supercomputer System B at Institute for Solid State Physics, University of Tokyo, and MASAMUNE-IMR at Institute for Materials Research, Tohoku University.

The authors have no conflicts to disclose.

Hayato Shiba: Conceptualization (lead); Data curation (lead); Formal analysis (lead); Investigation (lead); Methodology (lead); Resources (equal); Software (lead); Validation (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (lead). Masatoshi Hanai: Formal analysis (supporting); Methodology (supporting); Software (supporting); Writing – original draft (supporting). Toyotaro Suzumura: Software (supporting); Supervision (supporting); Writing – review & editing (supporting). Takashi Shimokawabe: Conceptualization (supporting); Funding acquisition (lead); Investigation (supporting); Project administration (lead); Resources (equal); Software (supporting); Writing – review & editing (supporting).

The source code used for training and evaluation is openly available in GitHub repository at https://github.com/h3-Open-BDEC/pyg_botan. This code is an implementation of BOTAN using PyTorch Geometric.63 The dataset of simulation trajectories for evaluations is also available via the hyperlink indicated in the same GitHub repository for three years after the publication of this paper.

In the model architecture of BOTAN, a pair of two-layer MLPs characterizing the features of nodes and edges are updated mutually. Each edge is updated based on the embeddings of the nodes it connects, in addition to its own previous embedding. Each node is updated based on the sum of the embeddings of its incident edges, in addition to its own previous embedding. This is how message passing between nodes is realized in the present GNNs. Therefore, the number of repeat cycles determines how far away structural information can be incorporated into a node (or an edge). In Fig. 9, the dependence of the Pearson correlation coefficient on this iteration number (denoted as n in Sec. II) is shown for t = 130 and 4120 at T = 0.44, where the training is again performed over 1000 epochs. The predictive ability increases with the number of iterations n, but the increase is much slower for n < 4.

FIG. 9.

Pearson correlation coefficients for the predictions of BOTAN at two time points, t = 130 (0.03τα) and 4120 (τα), at temperature T = 0.44 are plotted as a function of the number of iterations of message passing.


In the simultaneous training on nodes and edges [via both Si(t) and Eij(t)], the value 0.4 is chosen for the hyperparameter p in Eq. (12), which determines the training weight between nodes and edges. This choice is justified by an ablation experiment in which BOTAN is trained for various values of p without using the pretrained model. In Fig. 10, we show the Pearson correlation coefficients ρ computed from the predicted values of Si(t) and Eij(t) after training for 3000 epochs at t = 130 and T = 0.44. The result suggests that 0.2 ≤ p ≤ 0.6 yields near-optimal conditions for the simultaneous training.

FIG. 10.

Dependence of Pearson correlation coefficients for particle propensity Si(t) and pair-distance changes Eij(t) at t = 130 and T = 0.44, plotted as functions of p, a hyperparameter for the simultaneous learning. When 0 < p < 1, these quantities are learned simultaneously.

1. C. A. Angell, "Formation of glasses from liquids and biopolymers," Science 267, 1924–1935 (1995).
2. P. G. Debenedetti and F. H. Stillinger, "Supercooled liquids and the glass transition," Nature 410, 259–267 (2001).
3. M. D. Ediger, "Spatially heterogeneous dynamics in supercooled liquids," Annu. Rev. Phys. Chem. 51, 99–128 (2000).
4. W. Kob, C. Donati, S. J. Plimpton, P. H. Poole, and S. C. Glotzer, "Dynamical heterogeneities in a supercooled Lennard-Jones liquid," Phys. Rev. Lett. 79, 2827–2830 (1997).
5. R. Yamamoto and A. Onuki, "Heterogeneous diffusion in highly supercooled liquids," Phys. Rev. Lett. 81, 4915–4918 (1998).
6. L. Berthier and G. Biroli, "Theoretical perspective on the glass transition and amorphous materials," Rev. Mod. Phys. 83, 587–645 (2011).
7. H. Tanaka, H. Tong, R. Shi, and J. Russo, "Revealing key structural features hidden in liquids and glasses," Nat. Rev. Phys. 1, 333–348 (2019).
8. C. P. Royall and S. R. Williams, "The role of local structure in dynamical arrest," Phys. Rep. 560, 1–75 (2015).
9. H. Tong and H. Tanaka, "Revealing hidden structural order controlling both fast and slow glassy dynamics in supercooled liquids," Phys. Rev. X 8, 011041 (2018).
10. A. Widmer-Cooper, H. Perry, P. Harrowell, and D. R. Reichman, "Irreversible reorganization in a supercooled liquid originates from localized soft modes," Nat. Phys. 4, 711–715 (2008).
11. A. Tanguy, B. Mantisi, and M. Tsamados, "Vibrational modes as a predictor for plasticity in a model glass," Europhys. Lett. 90, 16004 (2010).
12. K. Chen, M. L. Manning, P. J. Yunker, W. G. Ellenbroek, Z. Zhang, A. J. Liu, and A. G. Yodh, "Measurement of correlations between low-frequency vibrational modes and particle rearrangements in quasi-two-dimensional colloidal glasses," Phys. Rev. Lett. 107, 108301 (2011).
13. A. Ghosh, V. Chikkadi, P. Schall, and D. Bonn, "Connecting structural relaxation with the low frequency modes in a hard-sphere colloidal glass," Phys. Rev. Lett. 107, 188303 (2011).
14. S. S. Schoenholz, A. J. Liu, R. A. Riggleman, and J. Rottler, "Understanding plastic deformation in thermal glasses from single-soft-spot dynamics," Phys. Rev. X 4, 031014 (2014).
15. P. Ronhovde, S. Chakrabarty, D. Hu, M. Sahu, K. K. Sahu, K. F. Kelton, N. A. Mauro, and Z. Nussinov, "Detecting hidden spatial and spatio-temporal structures in glasses and complex physical systems by multiresolution network clustering," Eur. Phys. J. E 34, 105 (2011).
16. E. D. Cubuk, S. S. Schoenholz, J. M. Rieser, B. D. Malone, J. Rottler, D. J. Durian, E. Kaxiras, and A. J. Liu, "Identifying structural flow defects in disordered solids using machine-learning methods," Phys. Rev. Lett. 114, 108001 (2015).
17. S. S. Schoenholz, E. D. Cubuk, D. M. Sussman, E. Kaxiras, and A. J. Liu, "A structural approach to relaxation in glassy liquids," Nat. Phys. 12, 469–471 (2016).
18. S. S. Schoenholz, E. D. Cubuk, E. Kaxiras, and A. J. Liu, "Relationship between local structure and relaxation in out-of-equilibrium glassy systems," Proc. Natl. Acad. Sci. U. S. A. 114, 263–267 (2017).
19. E. Boattini, S. Marín-Aguilar, S. Mitra, G. Foffi, F. Smallenburg, and L. Filion, "Autonomously revealing hidden local structures in supercooled liquids," Nat. Commun. 11, 5479 (2020).
20. V. Bapst, T. Keck, A. Grabska-Barwińska, C. Donner, E. D. Cubuk, S. S. Schoenholz, A. Obika, A. W. R. Nelson, T. Back, D. Hassabis, and P. Kohli, "Unveiling the predictive power of static structure in glassy systems," Nat. Phys. 16, 448–454 (2020).
21. E. Boattini, F. Smallenburg, and L. Filion, "Averaging local structure to predict the dynamic propensity in supercooled liquids," Phys. Rev. Lett. 127, 088007 (2021).
22. R. M. Alkemade, E. Boattini, L. Filion, and F. Smallenburg, "Comparing machine learning techniques for predicting glassy dynamics," J. Chem. Phys. 156, 204503 (2022).
23. D. Coslovich, R. L. Jack, and J. Paret, "Dimensionality reduction of local structure in glassy binary mixtures," J. Chem. Phys. 157, 204503 (2022).
24. F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, "The graph neural network model," IEEE Trans. Neural Networks 20, 61–80 (2009).
25. T. Xie and J. C. Grossman, "Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties," Phys. Rev. Lett. 120, 145301 (2018).
26. L. Chanussot, A. Das, S. Goyal, T. Lavril, M. Shuaibi, M. Riviere, K. Tran, J. Heras-Domingo, C. Ho, W. Hu, A. Palizhati, A. Sriram, B. Wood, J. Yoon, D. Parikh, C. L. Zitnick, and Z. Ulissi, "Open catalyst 2020 (OC20) dataset and community challenges," ACS Catal. 11, 6059–6072 (2021).
27. V. Fung, J. Zhang, E. Juarez, and B. G. Sumpter, "Benchmarking graph neural networks for materials chemistry," npj Comput. Mater. 7, 84 (2021).
28. S. Takamoto, C. Shinagawa, D. Motoki, K. Nakago, W. Li, I. Kurata, T. Watanabe, Y. Yayama, H. Iriguchi, Y. Asano, T. Onodera, T. Ishii, T. Kudo, H. Ono, R. Sawada, R. Ishitani, M. Ong, T. Yamaguchi, T. Kataoka, A. Hayashi, N. Charoenphakdee, and T. Ibuka, "Towards universal neural network potential for material discovery applicable to arbitrary combination of 45 elements," Nat. Commun. 13, 2991 (2022).
29. Q. Wang and L. Zhang, "Inverse design of glass structure with deep graph neural networks," Nat. Commun. 12, 5359 (2021).
30. U. Buchenau and R. Zorn, "A relation between fast and slow motions in glassy and liquid selenium," Europhys. Lett. 18, 523–528 (1992).
31. A. Widmer-Cooper and P. Harrowell, "Predicting the long-time dynamic heterogeneity in a supercooled liquid on the basis of short-time heterogeneities," Phys. Rev. Lett. 96, 185701 (2006).
32. J. C. Dyre, "Colloquium: The glass transition and elastic models of glass-forming liquids," Rev. Mod. Phys. 78, 953–972 (2006).
33. L. Larini, A. Ottochian, C. De Michele, and D. Leporini, "Universal scaling between structural relaxation and vibrational dynamics in glass-forming liquids and polymers," Nat. Phys. 4, 42–45 (2008).
34. H. W. Hansen, B. Frick, T. Hecksher, J. C. Dyre, and K. Niss, "Connection between fragility, mean-squared displacement, and shear modulus in two van der Waals bonded glass-forming liquids," Phys. Rev. B 95, 104202 (2017).
35. A. Widmer-Cooper, H. Perry, P. Harrowell, and D. R. Reichman, "Localized soft modes and the supercooled liquid's irreversible passage through its configuration space," J. Chem. Phys. 131, 194508 (2009).
36. A. Tripodo, G. Cordella, F. Puosi, M. Malvaldi, and D. Leporini, "Neural networks reveal the impact of the vibrational dynamics in the prediction of the long-time mobility of molecular glassformers," Int. J. Mol. Sci. 23, 9322 (2022).
37. P. W. Battaglia, R. Pascanu, M. Lai, D. Rezende, and K. Kavukcuoglu, "Interaction networks for learning about objects, relations and physics," Adv. Neural Inf. Process. Syst. 29, 4502–4510 (2016).
38. P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, A. Vaswani, K. Allen, C. Nash, V. Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. Botvinick, O. Vinyals, Y. Li, and R. Pascanu, "Relational inductive biases, deep learning, and graph networks," arXiv:1806.01261 [cs.LG] (2018).
39. G. DeZoort, S. Thais, J. Duarte, V. Razavimaleki, M. Atkinson, I. Ojalvo, M. Neubauer, and P. Elmer, "Charged particle tracking via edge-classifying interaction networks," Comput. Software Big Sci. 5, 26 (2021).
40. W. Kob and H. C. Andersen, "Testing mode-coupling theory for a supercooled binary Lennard-Jones mixture: The van Hove correlation function," Phys. Rev. E 51, 4626–4641 (1995).
41. M. Shimada, H. Mizuno, and A. Ikeda, "Anomalous vibrational properties in the continuum limit of glasses," Phys. Rev. E 97, 022609 (2018).
42. S. Toxvaerd and J. C. Dyre, "Communication: Shifted forces in molecular dynamics," J. Chem. Phys. 134, 081102 (2011).
43. A. Widmer-Cooper, P. Harrowell, and H. Fynewever, "How reproducible are dynamic heterogeneities in a supercooled liquid?," Phys. Rev. Lett. 93, 135701 (2004).
44. S. Sengupta, S. Karmakar, C. Dasgupta, and S. Sastry, "Breakdown of the Stokes-Einstein relation in two, three and four dimensions," J. Chem. Phys. 138, 12A548 (2013).
45. H. Shiba, T. Kawasaki, and K. Kim, "Local density fluctuation governs divergence of viscosity underlying elastic and hydrodynamic anomalies in a 2D glass-forming liquid," Phys. Rev. Lett. 123, 265501 (2019).
46. H. Mizuno, H. Shiba, and A. Ikeda, "Continuum limit of the vibrational properties of amorphous solids," Proc. Natl. Acad. Sci. U. S. A. 114, E9767–E9774 (2017).
47. D. Richard, G. Kapteijns, J. A. Giannini, M. L. Manning, and E. Lerner, "Simple and broadly applicable definition of shear transformation zones," Phys. Rev. Lett. 126, 015501 (2021).
48. Y.-C. Hu and H. Tanaka, "Origin of the boson peak in amorphous solids," Nat. Phys. 18, 669–677 (2022).
49. P. Yunker, Z. Zhang, K. B. Aptowicz, and A. G. Yodh, "Irreversible rearrangements, correlated domains, and local structure in aging glasses," Phys. Rev. Lett. 103, 115701 (2009).
50. V. Chikkadi and P. Schall, "Nonaffine measures of particle displacements in sheared colloidal glasses," Phys. Rev. E 85, 031402 (2012).
51. H. Shiba, T. Kawasaki, and A. Onuki, "Relationship between bond-breakage correlations and four-point correlations in heterogeneous glassy dynamics: Configuration changes and vibration modes," Phys. Rev. E 86, 041504 (2012).
52. B. Guiselin, C. Scalliet, and L. Berthier, "Microscopic origin of excess wings in relaxation spectra of supercooled liquids," Nat. Phys. 18, 468–472 (2022).
53. M. L. Falk and J. S. Langer, "Dynamics of viscoplastic deformation in amorphous solids," Phys. Rev. E 57, 7192–7205 (1998).
54. F. Shimizu, S. Ogata, and J. Li, "Theory of shear banding in metallic glasses and molecular dynamics calculations," Mater. Trans. 48, 2923–2927 (2007).
55. M. R. Hasyim and K. K. Mandadapu, "A theory of localized excitations in supercooled liquids," J. Chem. Phys. 155, 044504 (2021).
56. H. Shiba, Y. Yamada, T. Kawasaki, and K. Kim, "Unveiling dimensionality dependence of glassy dynamics: 2D infinite fluctuation eclipses inherent structural relaxation," Phys. Rev. Lett. 117, 245701 (2016).
57. H. L. Peng, M. Z. Li, and W. H. Wang, "Structural signature of plastic deformation in metallic glasses," Phys. Rev. Lett. 106, 135503 (2011).
58. G. Jung, G. Biroli, and L. Berthier, "Predicting dynamic heterogeneity in glass-forming liquids by physics-informed machine learning," arXiv:2210.16623 (2022).
59. F. S. Pezzicoli, G. Charpiat, and F. P. Landes, "SE(3)-equivariant graph neural networks for learning glassy liquids representations," arXiv:2211.03226 (2022).
60. X. Jiang, Z. Tian, and K. Li, "Geometry-enhanced graph neural network for glassy dynamics prediction," arXiv:2211.12832 (2022).
61. F. Noé, S. Olsson, J. Köhler, and H. Wu, "Boltzmann generators: Sampling equilibrium states of many-body systems with deep learning," Science 365, eaaw1147 (2019).
62. M. Gabrié, G. M. Rotskoff, and E. Vanden-Eijnden, "Adaptive Monte Carlo augmented with normalizing flows," Proc. Natl. Acad. Sci. U. S. A. 119, e2109420119 (2022).
63. M. Fey and J. E. Lenssen, "Fast graph representation learning with PyTorch Geometric," in ICLR Workshop on Representation Learning on Graphs and Manifolds (2019).