Substitutional Alloying Using Crystal Graph Neural Networks

Materials discovery, especially for applications that involve extreme operating conditions, requires extensive testing, which naturally limits the ability to explore the wealth of possible compositions. Machine Learning (ML) now has a well-established role in facilitating this effort in systematic ways. The increasing amount of accurate DFT data available represents a solid basis upon which new ML models can be trained and tested. While conventional models rely on static descriptors, generally suitable only for a limited class of systems, the flexibility of Graph Neural Networks (GNNs) allows representations to be learned directly on graphs, such as the ones formed by crystals. We utilize crystal graph neural networks (CGNNs) to predict crystal properties with DFT-level accuracy, through graphs that encode atomic (node/vertex), bond (edge), and global state attributes. In this work, we test the ability of the CGNN MEGNet framework to predict a number of properties of systems previously unseen by the model, obtained by adding a substitutional defect to bulk crystals that are included in the training set. We perform DFT validation to assess the accuracy of the predicted formation energies and structural features (such as elastic moduli). Using CGNNs, one may identify promising paths in alloy discovery.

As in statistical mechanics, where appropriate order parameters must be identified for novel phases and structures, the key challenge in ML algorithms is to identify effective system descriptors that can function as structure identifiers. A large variety of descriptors have been proposed, including fixed-length feature vectors of elemental or electronic material properties Seko et al (2015); Xue et al (2016); Isayev et al (2017), as well as structural descriptors based on rotationally and translationally invariant transformations of atomic coordinates, like the Coulomb matrix Rupp et al (2012), atom-centered symmetry functions (ACSFs) Behler (2011), social permutation invariant coordinates (SPRINT) Pietrucci and Andreoni (2011), smooth overlap of atomic positions (SOAP) De et al (2016) and the global minimum of the root-mean-square distance Sadeghi et al (2013). However, these solutions are often system-specific, and are not suitable for the exploration of vast compositional and structural spaces.
For this reason, a topic of fervent interest in the materials science community is the use of graph neural networks (GNNs) Zhou et al (2020); Wu et al (2021), which allow representations to be learned directly and flexibly, with applications to molecular systems Jorgensen et al (2018); Schütt et al (2017); Duvenaud et al (2015); Wu et al (2018); Kearnes et al (2016); Coley et al (2017), surfaces Back et al (2019); Palizhati et al (2019); Gu et al (2020) and periodic crystals Schütt et al (2017); Xie and Grossman (2018a,b); Chen et al (2019a); Dunn et al (2020); Louis et al (2020a); Park and Wolverton (2020a); Karamad et al (2020); Chen et al (2019b). GNNs can be regarded as the generalization of convolutional neural networks (CNNs) to graph-structured data, from which the internal materials representations can be learned and used for the prediction of target properties Reiser et al (2022). Even though larger amounts of data are required than for conventional ML models, GNNs take advantage of the unambiguous, physics-guided, real-space local associations between the system's degrees of freedom, and hence apply to any type of atomic crystalline structure Gong and Yan (2021). The common idea of GNN-based models is to represent atoms as nodes (V) and their chemical bonds as edges (E) in a graph G(V, E), which can be fed to a trained neural network to create node-level embeddings (learned representations of each atom in its individual chemical environment) through convolutions with neighbouring nodes and edges Fung et al (2021). Therefore, given a set of learnable weights (W) and a target material property (y), the GNN model reformulates the prediction task as the mapping f(G; W) → y.
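As a rough illustration of the graph G(V, E) described above, the following minimal sketch builds nodes and distance-labelled edges for a periodic cell. It is not the construction used by any specific library: the function name `build_crystal_graph`, the restriction to orthorhombic cells, the cutoff value and the lattice parameter are all assumptions made for this example.

```python
import math
from itertools import product


def build_crystal_graph(cell_lengths, frac_coords, atomic_numbers, r_cut):
    """Build a graph G(V, E) for a periodic orthorhombic cell (illustrative).

    Nodes carry the atomic number; each directed edge (i, j, d) links atom i
    to atom j (or one of its periodic images) at distance d < r_cut.
    Assumes r_cut does not exceed the shortest cell length, so that the
    27 nearest periodic images are sufficient.
    """
    nodes = list(atomic_numbers)
    edges = []
    for i, fi in enumerate(frac_coords):
        for j, fj in enumerate(frac_coords):
            for shift in product((-1, 0, 1), repeat=3):
                if i == j and shift == (0, 0, 0):
                    continue  # skip the atom itself, keep its periodic images
                d = math.sqrt(sum(
                    ((fj[k] + shift[k] - fi[k]) * cell_lengths[k]) ** 2
                    for k in range(3)
                ))
                if d < r_cut:
                    edges.append((i, j, d))
    return nodes, edges


# bcc Mo conventional cell; the lattice parameter is chosen for illustration only
nodes, edges = build_crystal_graph(
    cell_lengths=(3.15, 3.15, 3.15),
    frac_coords=[(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)],
    atomic_numbers=[42, 42],
    r_cut=3.0,
)
```

With this cutoff each Mo atom is linked only to its eight nearest neighbours. A learned model f(G; W) would then take `nodes` and `edges` as input in place of hand-crafted descriptors.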
A direct benefit of encoding a crystalline material as a GNN graph is the naturally derived vector characterization of the atoms and edges Louis et al (2020b).
To validate the predictions, as described in Sec.(2.2) and Sec.(3.3), we perform Density Functional Theory (DFT) calculations, and we find that CGNNs have great potential, but also limitations, in predicting the properties of defected bulk crystals and in promoting materials discovery. Here, we limit ourselves to presenting the main features of the model; for a more exhaustive explanation we refer the reader to the original work of Chen et al (2019b).

Machine Learning framework
The update of atomic attributes involves the average over the bonds connected to the i-th atom, $\bar{v}^e_i = \frac{1}{N^e_i} \sum_{k=1}^{N^e_i} \{e'_k\}_{r_k = i}$, the self-attributes $v_i$ of the i-th atom and the global state attributes $u$. Finally, an information flow from all three attribute groups is involved in the update of the global state attributes. As mentioned before, for our systems of interest, namely periodic crystals, the atomic number is the only node attribute. For bonds, the spatial distance is expanded in a Gaussian basis set, centered at linearly spaced locations $r_0$ between $r_0 = 0$ and $r_0 = r_{cut}$, and characterized by a given width $\sigma$.
Finally, the global state is simply a placeholder of two zeros for global information exchange.
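The Gaussian expansion of the bond distance can be sketched as follows. This is a minimal illustration, assuming the functional form exp(-(r - r_0)^2 / σ^2); the function name `gaussian_expand` and the default values of `r_cut`, `n_centers` and `width` are placeholders for this example, not parameters taken from the present work.

```python
import math


def gaussian_expand(distance, r_cut=5.0, n_centers=100, width=0.5):
    """Expand a scalar bond length on Gaussians centered at linearly spaced r0.

    The centers span [0, r_cut]; all default values are illustrative assumptions.
    """
    centers = [r_cut * k / (n_centers - 1) for k in range(n_centers)]
    return [math.exp(-((distance - r0) ** 2) / width ** 2) for r0 in centers]


# A bond of length 2.5 expanded on 11 centers (0.0, 0.5, ..., 5.0)
bond_features = gaussian_expand(2.5, r_cut=5.0, n_centers=11, width=0.5)
```

The resulting vector peaks at the center closest to the bond length and decays smoothly on either side, which gives the network a continuous edge attribute instead of a single scalar.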
Fig. 1: Parity plots for the pre-trained model on the MP dataset. The plots show the predictions of the bulk modulus ($K_{VRH}$), the shear modulus ($G_{VRH}$) and the formation energy ($E_{form}$).

Data collection
We consider crystal structures collected through the Python Materials Genomics interface (pymatgen) Ong et al (2013)

Validation with DFT
We verify the accuracy of the model's predictions for system properties such as the bulk modulus $K_{VRH}$, shear modulus $G_{VRH}$ and formation energy $E_{form}$, after single-atom substitution is implemented in 2 × 2 × 2 supercells. We perform DFT calculations with Quantum ESPRESSO (QE) Giannozzi et al (2009, 2017, 2020). For each case, convergence is checked with respect to the number of k-points, the plane-wave cutoff energy and the energy smearing spread (the degauss parameter in QE), after a preliminary variable-cell relaxation of the pure crystals; a further fixed-cell relaxation with the optimal parameters then yields the final equilibrium bulk structures. The common acceptance threshold for the variation of the total energy upon parameter change is set at $10^{-5}$ Ry. The force and total energy convergence thresholds for ionic minimization are set to $10^{-5}$ a.u. and $10^{-6}$ a.u., respectively. After optimization of the pure crystals, fixed-cell relaxation is performed on supercells with single-atom substitutions, and their structural properties are then extracted through the THERMO_PW driver.
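The acceptance criterion on the total energy can be illustrated with a short sketch. The selection rule below (take the first parameter value whose energy differs from the next one by less than the threshold) is an assumption made for illustration; the text only states the threshold of $10^{-5}$ Ry. The function name and the energies are hypothetical.

```python
def first_converged(parameters, energies, threshold=1e-5):
    """Return the first parameter whose total energy changes by less than
    `threshold` (in Ry) when moving to the next parameter value.

    `parameters` and `energies` are parallel lists ordered by increasing
    accuracy (e.g. plane-wave cutoffs). Returns None if never converged.
    """
    for k in range(len(energies) - 1):
        if abs(energies[k + 1] - energies[k]) < threshold:
            return parameters[k]
    return None


# Hypothetical cutoff scan (energies in Ry are made-up placeholders)
cutoffs = [30, 40, 50, 60]
total_energies = [-10.0, -10.001, -10.001004, -10.001005]
chosen_cutoff = first_converged(cutoffs, total_energies)
```

The same helper could be applied in turn to the k-point mesh and to the smearing width, one parameter at a time.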
The computation of formation energies for validation purposes is performed for single-atom substitutional defects (D) applied to pure bulk crystals (M), as follows:

$$E_{form} = E_{M_{1-x}D_x} - (1-x)\,E_M - x\,E_D,$$

where $x$ is the atomic fraction of substitutional defects, $E_{M_{1-x}D_x}$ is the total energy per atom of the compound, and $E_M$ and $E_D$ are the total energies per atom of the precursor species that compose it. The latter energies are obtained by relaxing the ground-state lattices of these species using the same QE parameters as for the compound, as necessary to ensure computational consistency in the evaluation of accurate defect formation energies.
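A minimal numerical sketch of this definition follows, assuming the usual per-atom mixing expression $E_{form} = E_{M_{1-x}D_x} - (1-x)E_M - xE_D$ with all energies per atom. The function name and the example energies are hypothetical placeholders, not values from this work.

```python
def formation_energy(e_compound, e_host, e_defect, x):
    """Formation energy per atom of M(1-x)D(x).

    All energies are total energies per atom in the same units;
    x is the atomic fraction of the substitutional species D.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be an atomic fraction between 0 and 1")
    return e_compound - (1.0 - x) * e_host - x * e_defect


# One substitution in a 16-atom supercell gives x = 1/16
# (the energies below are made-up numbers for illustration only)
e_form = formation_energy(e_compound=-3.55, e_host=-3.60, e_defect=-2.00, x=1 / 16)
```

Using the same pseudopotentials, cutoffs and smearing for all three energies matters here, because the formation energy is a small difference between large numbers.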

Results: Property predictions for substitutional alloying with CGNN
There is a large variety of ways to systematically evaluate the effect of substitutional alloying on the properties of crystals Kaxiras (2003). Here, we focus on two key questions:
1. Data Science of Defects: What is the effect of substituting the same defect in a large variety of systems?

An elemental defect seeing a wealth of different crystalline environments
First, we focus on the prediction of system properties when the systematic substitution is applied using the same replacement atom on randomly selected sites of crystals from the Materials Project crystals dataset Jain et al (2013). The aim is to explore how material properties change after single-atom substitution, evaluating deviations from the predictions for the original pure crystals, and how they depend on the specific substitution. We consider three elemental cases, Rb, Mn and H, as the key replacement atoms that we will mutually compare. The reason for this choice lies in the drastic differences among their elemental characteristics and in the assumption that, on the basis of these, we might gain a deeper understanding of how the model behaves when predicting previously unseen defect-induced changes in the system properties, based on its learned notion of local environment. Table (3) reports, as an example, the values of the atomic weight, radius and electronic configuration of each considered replacement atom.
Table 3: Differences in the atomic properties of the three elements considered for the single-atom substitutional process. As an example, we report here the atomic weight, radius and electronic configuration.
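The single-atom substitution on a randomly selected site can be sketched in a few lines. This is only an illustration on a plain list of species labels: the helper name `substitute_one_site` is hypothetical, and the choice to skip sites already occupied by the replacement element is an assumption of this sketch rather than a detail stated in the text.

```python
import random


def substitute_one_site(species, replacement, rng=None):
    """Return a copy of `species` with one randomly chosen site replaced.

    Sites already occupied by `replacement` are excluded (an assumption of
    this sketch). Returns the new species list and the substituted index.
    """
    rng = rng or random.Random()
    candidates = [i for i, s in enumerate(species) if s != replacement]
    if not candidates:
        raise ValueError("no site available for substitution")
    site = rng.choice(candidates)
    new_species = list(species)
    new_species[site] = replacement
    return new_species, site


# Example: replace one atom of a CaN2 formula unit with Rb, with a fixed seed
defected, site = substitute_one_site(["Ca", "N", "N"], "Rb", random.Random(0))
```

Atomic positions and lattice are left untouched in this picture; only the species label of the chosen site changes before the structure is passed to the model.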
In Fig.(2), we display the predictions of MEGNet for the bulk moduli of Rb-defected systems with respect to the original, pure ones. As reported in Table (4), this kind of substitutional defect causes the largest root-mean-square deviation (RMSD) among the three considered atomic species, with a value of approximately 6.2 GPa. In this section, we only report plots referring to the largest RMSD cases, but analogous plots can be found in the Appendix, Sec.(A).
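For reference, the RMSD between predictions for defected and pristine crystals can be computed as in the sketch below, which assumes the standard definition over paired values. The function name and the numbers are placeholders, not data from this study.

```python
import math


def rmsd(defected, pristine):
    """Root-mean-square deviation between paired property predictions."""
    if len(defected) != len(pristine) or not defected:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(d - p) ** 2 for d, p in zip(defected, pristine)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up bulk moduli (GPa) for three crystals, before and after substitution
deviation = rmsd([95.0, 148.0, 61.0], [100.0, 150.0, 70.0])
```

For properties that the model predicts on a logarithmic scale, the same function would be applied to the logarithms of the values.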

A crystalline matrix seeing a wealth of different elemental defects
The case of systematically changing the single-atom substitutional defect species in the same crystalline matrix is complementary to the one addressed previously. We show periodic table plots for the properties of interest, $K_{VRH}$, $G_{VRH}$ and $E_{form}$, for the host matrix that displays the largest deviations in the properties, namely Mo; analogous plots for the other matrices are included in the Supplementary Material (SM). With the help of this visualization style, we characterize the predicted effect of the chemical distance between the substitutional defects and the host matrix on the properties of interest, and how it correlates with the well-known trends along the periodic table. In Fig.(6), we show the predicted variation of the host supercell bulk modulus when defected with one of the elements from the periodic table. In particular, the plot shows the variation $K'_{VRH}$ of $K^{Mo(X)}_{VRH}$ relative to $K^{Mo}_{VRH}$, where $K^{Mo(X)}_{VRH}$ is the bulk modulus of the Mo matrix defected by atom X, and $K^{Mo}_{VRH}$ is its value for pure Mo (labelled with a red flag in the figure).
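A possible way to compute and rank such variations is sketched below. The normalization by the pristine value is an assumption, suggested by the "relative difference" wording of the figure caption; the function names and the numbers are hypothetical placeholders.

```python
def relative_variation(defected_value, pristine_value):
    """Relative change of a property upon substitution (assumed definition)."""
    if pristine_value == 0:
        raise ValueError("pristine value must be non-zero")
    return (defected_value - pristine_value) / pristine_value


def rank_defects(defected_by_element, pristine_value):
    """Sort defect species from the most negative to the most positive variation."""
    variations = {
        element: relative_variation(value, pristine_value)
        for element, value in defected_by_element.items()
    }
    return sorted(variations.items(), key=lambda item: item[1])


# Placeholder predictions (GPa) for a host with a pristine value of 100 GPa
ranking = rank_defects({"A": 90.0, "B": 110.0, "C": 100.0}, 100.0)
```

The first and last entries of the ranking correspond to the outliers that a periodic table colormap makes visible at a glance.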
Even though Sr represents a striking outlier in decreasing the bulk modulus of the Mo crystal, at this scale it is still possible to appreciate how the variation evolves along the periods: for the 3d, 4d and 5d transition metals from the 3rd to the 12th group, the defect-induced variations are mostly small, while they tend to increase, in absolute value, for the post-transition metals and, remarkably, for the alkali and alkaline-earth metals. This behaviour can be interpreted in terms of the well-known variations of the bulk modulus along the periodic table of elements, which appear to be correlated with the defect-induced ones shown in this plot. Substitutional alloying species that, in their bulk form at standard temperature and pressure (STP), show a lower bulk modulus tend to lower the bulk modulus of their host system. We may perform analogous investigations for the shear modulus of a pure host matrix, like Mo, and how it is influenced by single-atom substitutional alloying spanning almost the entire periodic table, as shown in Fig.(8).
We follow the same protocol and consider the analogous variation $G'_{VRH}$ of the shear modulus. In this case, the colormap reveals Si as an outlier towards hardening of the Mo matrix. This feature cannot be explained by a correlation between the trends in defect properties and in defect-induced properties, since the predicted value is larger than the model prediction error on the shear modulus. We believe that the power of this investigation method lies in paving the way to an efficient exploration of substitutional alloying, with the twofold possibility of looking for comparable performance (discovery of alternatives) or outstanding performance (discovery of exceptional cases). As for the bulk modulus, alkali and alkaline-earth metals like K, Rb, Sr and Ba are among the substitutional species providing the largest decreases in the shear modulus.
Overall, the effects and fluctuations caused by the substitutional defects on the defected host can always be highlighted, but it is not the aim of this work to find an exhaustive explanation for the existing predictions: the reasons for such trends may be due to any of the input parameters, such as the
In Fig.(10), it is evident that there is a trend spanning the periods, and an overall interesting correlation can be found with the known trends of the atomic electronegativity along the periodic table, suggesting that the higher the electronegativity of the substitutional defect in the Mo matrix, the higher the resulting formation energy.

Property prediction -DFT validation
The validation of the predictions of artificial intelligence methods is a crucial step in order to quantify their quality. Even though the proposed graph-network-based method has already been validated for its predictions in bulk crystals, this work aims at testing its capabilities in the presence of single-atom substitutional defects. As explained in Sec.(2.2), we compare the model predictions of $K_{VRH}$, $G_{VRH}$ and $E_{form}$ with the DFT-based ones, obtained with the THERMO_PW driver of QE.
model with respect to $K_{VRH}$ and $G_{VRH}$ predictions, but large errors when it comes to the formation energies: the former may be regarded as a success, given that the model is predicting properties of a new class of systems, and given the computational limitations; the latter is a negative sign, even though (i) the defect-dependent order of magnitude of the error opens up a deeper line of investigation into its causes, and (ii) the validation set for defected systems is extremely small compared to the undefected MP dataset, on which the initial MAEs were evaluated.

Size effects
In substitutional alloying, the defect atomic species is usually present in a dilute concentration, in the range of 0.1-10%.
asymptotic over-dilute level (< 0.1%). Moreover, while the defect formation energy, as expected, shows a common descent to zero for all the defect species, the predicted structural properties seem to be sensitive to them, with the Rb-defected Mo crystal retaining a visible difference in the property value even at the limit of 0.1% concentration, for both $K_{VRH}$ and $G_{VRH}$. Even though validating the asymptotic-size behaviour of the predictions with accurate but expensive DFT calculations is beyond the aims of the present work, we believe these results support the view that the model has a sound understanding of the environment of a defected crystalline system. Let us continue along the previously paved path of analysis, which also looks into the effects of systematically changing the substitutional atomic species among the large variety contained in the periodic table, as previously done in

Conclusions
In this work, we utilized a convolutional graph neural network, based on the MEGNet architecture Chen et al (2022), in an attempt to design novel alloys. Alloying involves substitutional and interstitial additions at relatively low concentrations; thus, single-defect properties should be informative for overall design capabilities and guidance. For this purpose, we utilized the MP database, and we focused on the prediction of the properties of single-atom substitutionally defected bulk crystals, in the context of both (i) systematic substitution with a specific set of species in a wide variety of crystals from the entire Materials Project dataset and (ii) systematic substitution with a variety of atomic species in a specific set of pure bulk crystals. We also validated some of the results with our own DFT calculations. We believe that such approaches
• the shear modulus variation scale of the Ni matrix upon single-atom substitution is comparable to that of Mo, and they also share alkali and alkaline-earth metals at the lower bound of the variations;
• similarly to the previous point, Au and Al share a similar variation scale and outlier map for the bulk modulus and, in particular, for the formation energy
Even though the variations in the properties of interest upon substitutional defecting of the host matrices are largest for the Mo matrix, we report here, as previously done, all the variations, and draw a brief comparison between them in a combined visualization. While the trends in the bulk modulus variations of the Mo and Ni matrices are noticeably different, Al and Au show quite similar trends and hierarchies on different scales: the central 4d and 5d elements tend to maximize the variations, with minima at the extrema of the series and, for both, for Mn substitution. In particular, it is interesting to notice that in the Al matrix nearly all the periodic table elements considered lead to a positive bulk modulus variation. A similarity between
variations in Mo and Ni holds for the shear modulus variations, which
The work of Xie and Grossman (2018a) represents the pioneering example of a crystal graph convolutional neural network (CGCNN) architecture, which was later extended in the iCGCNN by Park and Wolverton (2020b) to include 3-body correlations on neighbouring atoms, information on the Voronoi tessellated structure and an optimized chemical representation of interatomic bonds in the crystal graphs. For the discovery of new materials, one may take various exploration paths, involving high-throughput computational Curtarolo et al (2013) and experimental Liu et al (2019) methods. However, the combined approach of machine-learning methods and compositional manipulation has very quickly acquired a well-established role in materials science, and it is applied in a wide range of property optimization searches, such as for zinc blende semiconductors Mannodi-Kanakkithodi et al (2022), perovskites Zhai et al (2022); Balachandran et al (2018); Ye et al (2018); Sharma et al (2020); Klug et al (2017); Sampson et al (2017) and others Guan (2019); Ning et al (2017); Oba et al (2018); Deml et al (2015, 2014); Wan et al (2021); Varley et al (2017); Mannodi-Kanakkithodi et al (2020). In this work, we utilize a particular improvement of the originally proposed CGCNN model Xie and Grossman (2018a), the MatErials Graph Network (MEGNet) model from Chen, Ye and coworkers Chen et al (2019b), introduced in Sec.(2.1.1), which has the merit of being developed and tested on both molecules and crystals, with the possibility of defining global state attributes including temperature, pressure and entropy. We test the capabilities of graph networks to predict the properties of single-atom substitutionally defected crystals with the MEGNet model. After considering a model pre-trained on the Materials Project (MP) database (Sec.(2.1.3)), we focus on the predictions of formation energies and bulk and shear moduli, comparing both the results obtained in datasets of similarly defected structures (Sec.(3.1)) and the effects of almost all the possible single-atom defects in the same matrices (Sec.(3.2)).
In the present work, we utilize the MEGNet model Chen et al (2019b). The reasons for this choice lie in the structure and performance of the model:
1. It is characterized by a low number of attributes, one for the atom (atomic number) and one for the bond (spatial distance), yet it outperforms previous graph-based models Chen et al (2019b), such as CGCNN Xie and Grossman (2018a) and MPNN Jorgensen et al (2018), which have a higher number of attributes, as well as SchNet Schütt et al (2017), which has a similarly low number.
2. The MEGNet framework includes a global state attribute, essential for state-property relationship predictions in materials.
3. The graph network construction of MEGNet has been developed and tested for both molecules and crystals
and its THERMO_PW Dal Corso (2023) driver for the calculation of structural properties. The pseudopotentials for all involved atomic species are ultrasoft, with the Perdew-Burke-Ernzerhof (PBE) Perdew et al (1996) functional. Methfessel-Paxton smearing Methfessel and Paxton (1989) has been introduced to correctly treat metallic systems, and the calculations are spin-polarized to account for possible non-zero magnetization.
2. Are there qualitative and quantitative effects from substituting various elemental defects in the same host crystalline matrix?
While it has not yet been possible to perform an exhaustive search of this kind, in this work the pool of host systems for the first question is the Materials Project crystals dataset introduced in the previous sections Jain et al (2013), while for the second question the host systems are pure metallic bulk crystalline supercells of Al, Ni, Mo and Au.

Fig. 2: Predicted bulk moduli in the Rb-defected MP crystals with respect to their predictions in the non-defected ones. An example structure from the MP dataset, CaN2, is shown, in which one Ca atom has been replaced with a Rb atom. Only the case of Rb-defected systems is shown here, since it has the largest RMSD among the set of considered defects. Similar plots for the Mn- and H-defected systems are contained in the Supplementary Materials.

Fig. 3: Predicted shear modulus in the Rb-defected MP crystals with respect to the prediction in the non-defected ones. Only the case of Rb-defected systems is shown here, since it has the largest RMSD among the set of considered defects. Similar plots for the Mn- and H-defected systems are contained in the Supplementary Materials.

Fig. 4: Predicted formation energy in the H-defected MP crystals with respect to the prediction in the non-defected ones. Only the case of H-defected systems is shown here, since it has the largest RMSD among the set of considered defects. Similar plots for the Rb- and Mn-defected systems are contained in the Supplementary Materials.

Fig. 5: Periodic table plot (b) explaining the process of selecting a set of host matrices (in light blue) and substitutional defects (in green). For a given selected host matrix, the non-selected host species represent defects too. The host matrices are 3 × 3 × 3 supercells of the highlighted species (a).

Fig. 6: Predicted bulk modulus variation ($K'_{VRH}$) for a single-atom substitutionally defected Mo supercell with respect to the undefected one, for each possible defect atomic species from the provided periodic table. Similar plots for Al, Au and Ni supercells are provided in the Supplementary Materials. The red flag highlights the zero relative difference, corresponding to the pure Mo matrix.

Fig. 7: Predicted bulk modulus variation ($K'_{VRH}$) for a single-atom substitutionally defected Mo supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure Mo case.
atomic number and bond lengths, or an abstract notion of local environment that is good enough to show reasonable correlations with existing alternative descriptors (i.e. atomic properties). The variations along the 3d, 4d and 5d transition metals are, in most cases, below the model MAE for $G_{VRH}$ predictions; therefore no meaningful extrapolation is possible, but we report the plot in Fig.(B9) of the Appendix.

Fig. 8: Predicted shear modulus variation ($G'_{VRH}$) for a single-atom substitutionally defected Mo supercell with respect to the undefected one, for each possible defect atomic species from the provided periodic table. Similar plots for Al, Au and Ni supercells are provided in the Supplementary Materials. The red flag highlights the zero relative difference, corresponding to the pure Mo matrix.

Fig. 9: Predicted formation energy variation ($E'_{form}$) for a single-atom substitutionally defected Mo supercell with respect to the undefected one, for each possible defect atomic species from the provided periodic table. Similar plots for Al, Au and Ni supercells are provided in the Supplementary Materials. The red flag highlights the zero relative difference, corresponding to the pure Mo matrix.

Fig. 10: Predicted formation energy variation ($E'_{form}$) for a single-atom substitutionally defected Mo supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure Mo case.
For this reason, it is of interest to study how the property predictions vary with defect concentration, which we do in the saturation plots of Fig.(11), Fig.(12) and Fig.(13). Our example system combines the Mo host matrix and the H, Mn and Rb substitutional defects, which we focused on in the second and first parts of the previously reported results, respectively. Interestingly, a hierarchy is conserved among the defect-induced variations for different host supercell sizes: the single Rb defect always causes the largest deviations of the studied properties from their

Fig. 11: Saturation plot of $K_{VRH}$ for a Mo host matrix when substitutionally defected with H, Mn or Rb, for different supercell sizes.

Fig. 12: Saturation plot of $G_{VRH}$ for a Mo host matrix when substitutionally defected with H, Mn or Rb, for different supercell sizes.

Fig. 13: Saturation plot of $E_{form}$ for a Mo host matrix when substitutionally defected with H, Mn or Rb, for different supercell sizes.

Fig.(6), with the only exception of selecting the smallest and largest supercell sizes from the previous saturation plots of Fig.(11), respectively 2 × 2 × 2 (about 6% concentration, 16-atom cell) and 8 × 8 × 8 (about 0.1% concentration, 1024-atom cell). In Fig.(14), focusing on the first row of plots, which deal with the $K'_{VRH}$ variation, and comparing with the previously investigated 3 × 3 × 3 supercell (2% concentration) of Fig.(6), one can notice that the scale of the property variations changes accordingly: smaller defect concentrations lead to smaller defect-induced effects, and vice versa, as expected. In particular, in the largest supercell the range of induced variations is reduced to about 3% of that of the 3 × 3 × 3 system. However, the outliers of the composition map are left unchanged. The shear modulus variations, in the second row of plots, see the emergence of new outliers in the chemical neighbourhoods of those previously obtained in 3 × 3 × 3 supercells, while the formation energy variations undergo both a change in scale and a complete change in the map outliers. Following the brief assessment of prediction quality performed for each of the properties of interest in the previous section, we expect the formation energy variations to suffer from non-negligible fluctuations, possibly requiring further validation. However, the main aim of the present discussion is to underline the power of the overall approach in highlighting the path towards composition search in substitutional alloying, which can effectively lead to specific desired emerging properties.
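The relation between supercell size and defect concentration used in this discussion can be checked with a short calculation. The sketch assumes the conventional bcc cell of Mo with two atoms, which is consistent with the 16-atom and 1024-atom cells quoted above; the function name is a placeholder.

```python
def defect_concentration(n, atoms_per_cell=2, n_defects=1):
    """Atomic fraction of defects in an n x n x n supercell.

    atoms_per_cell=2 corresponds to a conventional bcc cell (assumed for Mo).
    """
    total_atoms = atoms_per_cell * n ** 3
    if not 0 < n_defects <= total_atoms:
        raise ValueError("n_defects must be between 1 and the number of atoms")
    return n_defects / total_atoms


# Single-atom substitution in the three supercell sizes discussed in the text
concentrations = {n: defect_concentration(n) for n in (2, 3, 8)}
```

A single substitution thus spans roughly two orders of magnitude in concentration between the smallest and the largest supercell, which is why the scale of the predicted variations changes so markedly.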

Fig. 14: Predicted variations $K'_{VRH}$ (first row), $G'_{VRH}$ (second row) and $E'_{form}$ (third row) for a single-atom substitutionally defected Mo supercell with respect to the undefected one, for each possible defect atomic species from the provided periodic table. Column a) refers to a 2 × 2 × 2 supercell, column b) to an 8 × 8 × 8 supercell.

Fig. A2: Predicted shear modulus in the H- and Mn-defected MP crystals with respect to its prediction in the non-defected ones.

Fig. A3: Predicted formation energy in the Mn- and Rb-defected MP crystals with respect to its prediction in the non-defected ones.

Fig. A4: Predicted formation energy in the Rb-defected MP crystals with respect to its prediction in the non-defected ones. The plot shows the need for a log-log scale to appreciate the deviations.

Fig. B5: Comparison between the periodic table plots of the predicted bulk modulus, shear modulus and formation energy for a single-atom substitutionally defected Ni (left column) and Mo (right column) supercell with respect to the undefected one, for each possible atomic species from the provided periodic table.

Fig. B6: Comparison between the periodic table plots of the predicted bulk modulus, shear modulus and formation energy for a single-atom substitutionally defected Au (left column) and Al (right column) supercell with respect to the undefected one, for each possible atomic species from the provided periodic table.

Fig. B8: Predicted bulk modulus variation ($K'_{VRH}$) for a single-atom substitutionally defected Al and Au supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure host matrix case.

Fig. B9: Predicted shear modulus variation ($G'_{VRH}$) for a single-atom substitutionally defected Mo and Ni supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure host matrix case.

Fig. B10: Predicted shear modulus variation ($G'_{VRH}$) for a single-atom substitutionally defected Al and Au supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure host matrix case.

Fig. B11: Predicted formation energy variation ($E'_{form}$) for a single-atom substitutionally defected Mo and Ni supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure host matrix case.

Fig. B12: Predicted formation energy variation ($E'_{form}$) for a single-atom substitutionally defected Al and Au supercell with respect to the undefected one, along the 3d, 4d and 5d series of the periodic table. The black dashed line highlights the pure host matrix case.

Table 1 :
Parameters from the pre-trained MEGNet model.
• V is the set of $N^v$ atomic attribute vectors $v_i$;
Chen et al (2022) properties. Table (1) shows some of the parameters of the model, and a more complete list can be found in the default implementation of the class Chen et al (2022).

Table 2 :
MAEs of the model for the prediction of the bulk modulus ($K_{VRH}$), shear modulus ($G_{VRH}$) and formation energy ($E_{form}$).
We report parity plots in Fig.(1) for all three properties of interest in this study: bulk modulus ($K_{VRH}$), shear modulus ($G_{VRH}$) and formation energy ($E_{form}$). To evaluate the accuracy of the model in predicting these properties, the mean absolute error (MAE) is used as the evaluation metric. Table (2) presents the MAE values for each predicted property over the dataset, which provide insight into the performance of the pre-trained model.
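The MAE metric can be written compactly, as in the following sketch of the standard definition. The function name and the sample values are placeholders, not results of the model.

```python
def mean_absolute_error(predicted, reference):
    """Mean absolute error between predicted and reference values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Made-up predicted and reference values, for illustration only
mae = mean_absolute_error([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
```

Unlike the RMSD, the MAE weights all errors linearly, so a few large outliers affect it less strongly.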

Table 4 :
RMSD (GPa) for the prediction of the bulk modulus ($K_{VRH}$) in Rb, Mn, and H single-atom substitutionally defected systems with respect to the non-defected ones.
The shear modulus predictions also show the largest RMSD for the case of a Rb defect, with a value of 0.0628 log(GPa). The results are shown in Fig.(3) and in Table (5). According to the model, both $K_{VRH}$ and $G_{VRH}$ tend to decrease upon substitution of a Rb atom, implying, respectively, an increase in compressibility and a decrease in hardness. It is worth noticing that, even though the only physical atomic feature that the model exploits is the atomic number, the prediction of larger changes in the structural properties for defects with larger radii can be regarded as reasonable.

Table 5 :
RMSD (log(GPa)) for the prediction of the shear modulus ($G_{VRH}$) in Rb, Mn, and H single-atom substitutionally defected systems with respect to the non-defected ones.

Table 6 :
RMSD (eV/atom) for the prediction of the formation energy ($E_{form}$) in Rb, Mn, and H single-atom substitutionally defected systems with respect to the non-defected ones.
Table (7) shows the results of the validation of the properties in the case of an Al matrix and single-atom substitutional defects, including B, C, I,
Comparing the MAE values of our DFT calculations with those of the model for non-defected systems in Table (2), we find good performance of the

Table 7 :
Comparison of the DFT and MEGNet results for the three properties of interest in a small set of samples. Here, $Al_B$ means a single B-atom substitution in an Al host 2 × 2 × 2 supercell matrix.
might provide novel insights into alloy design, especially if predictions include extended lattice defects such as dislocations and/or grain boundaries.