In the materials science domain, data availability has made it possible to design and test machine learning models that not only strengthen our understanding of various material properties but also provide predictive capability by uncovering trends and patterns. Here, we report insights into the magnetization of iron-based compounds obtained from a machine learning model and a model interpretability analysis based on SHapley Additive exPlanations (SHAP). Most iron-based compounds are magnetic in nature and are well studied, with abundant data available in different repositories. We have used data from the Materials Project.
I. INTRODUCTION
Machine learning (ML) has revolutionized numerous sectors in recent years, including image identification, web search, fraud detection, self-driving vehicles, and many more. Following its enormous success in a variety of everyday tasks, machine learning is now being employed in scientific domains ranging from fundamental physics to materials design. In materials science, however, the scarcity of quality data is one of the major challenges. Over the past decades, great advances in computational science and technology, along with tremendous breakthroughs in simulation approaches, have mitigated this problem to some extent. Input creation can now be automated, and multiple (even millions of) simulations can run in parallel or sequentially to create large datasets, an approach known as high-throughput (HT) calculation. There are an increasing number of structure–property databases, such as the Materials Project,1 the AFLOWLIB consortium,2 NOMAD,3 OQMD,4 and many more, that collect vast quantities of density functional theory (DFT) computations in a convenient manner. These datasets are now frequently used to build statistical machine learning models that can reproduce structure–property correlations in known crystals and extrapolate to unknown chemistries. By building a structure–property link using advanced mathematical techniques on a previously available dataset, ML can swiftly predict material properties without having to solve the complex fundamental Schrödinger equation over and over. The best part about machine learning models is that they improve over time as more and more labeled data become available.
Iwasaki et al.5 used machine learning and ab initio simulations to identify Fe–Co-based alloys with high magnetization and verified the findings with experiments; thousands of materials were narrowed down to a few using ML screening for subsequent synthesis in the lab. Long et al.6 used a Random Forest classifier to separate ferromagnetic compounds from antiferromagnetic compounds and then predicted the Curie temperature of the ferromagnetic compounds only, using a Random Forest regressor. The neural network approach was utilized by Yuan et al.7 to improve the nuclear magnetic moment predictions of odd-A nuclei; they also included the nuclear spin and the Schmidt magnetic moment in the input layer to further improve the accuracy. To examine the magnetism of uranium-based compounds, Ghosh et al.8 tried various regression techniques and found that the best ML model was built with the Random Forest regressor; they predicted both the spin and orbital moment sizes of uranium-based compounds and classified them by magnetic ordering. Rhone et al.9 combined density functional theory (DFT) calculations and machine learning methods to study the magnetic and thermodynamic properties of two-dimensional materials. Pham et al.10 proposed an orbital-field matrix (OFM) descriptor that represents a material through the distribution of its valence shell electrons; the OFM was used to estimate the local magnetic moments of the constituent atoms in bimetal alloys of a lanthanide and a transition metal. In summary, the availability of large datasets and high-throughput calculations, together with powerful ML algorithms, forms the perfect combination for a second computational revolution in materials science after DFT. ML has huge potential to predict almost any material property, provided the model is trained on a large amount of data and achieves high accuracy.
In this work, we have used various machine learning algorithms to predict the magnetic moment per atom and the formation energy per atom of various Fe-based magnetic compounds.
II. DESCRIPTOR AND DATASETS
Data for 11 545 Fe-based compounds were collected from the Materials Project repository1 using the open Materials Application Programming Interface (API)11 and the open-source Python Materials Genomics (pymatgen) materials analysis package.12 The Materials Project entry for each compound includes the band structure, density, Fermi energy, cut-off energy, ground state energy, total magnetization, crystal structure, total number of atoms, crystal system, magnetic ordering, formation energy per atom, and many more properties. Of these, we focussed only on the crystal structure, the formation energy, and the magnetic properties. A schematic diagram of the dataset creation is shown in Fig. 1.
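A minimal sketch of this data-collection step is given below, using the legacy pymatgen MPRester interface. The API key is a placeholder, and the queried fields are illustrative assumptions rather than a record of the exact query used.

```python
# Sketch: collecting Fe-based compounds from the Materials Project.
# Assumes the legacy pymatgen MPRester API; "YOUR_API_KEY" is a placeholder.
from pymatgen.ext.matproj import MPRester

with MPRester("YOUR_API_KEY") as mpr:
    # Query all compounds containing Fe, with the properties used in this work.
    entries = mpr.query(
        criteria={"elements": {"$all": ["Fe"]}},
        properties=[
            "material_id",
            "pretty_formula",
            "structure",                  # crystal structure (pymatgen Structure)
            "total_magnetization",        # total magnetization of the unit cell
            "nsites",                     # number of atoms, for per-atom values
            "formation_energy_per_atom",
        ],
    )

# Derive the per-atom magnetic moment target from the raw records.
for entry in entries:
    entry["magnetic_moment_per_atom"] = entry["total_magnetization"] / entry["nsites"]
```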
The structure of a compound plays an important role in determining its magnetic properties, so we used the structure of the compound as the input of our model, with the magnetic moment per atom and the formation energy per atom as the outputs. The orbital field matrix (OFM) descriptor proposed by Pham et al.10 is used to represent the structure of the compound in machine-readable form, i.e., a matrix of size 32 × 32. This representation is mainly based on the distribution of the valence shell electrons and the coordination number of each atom, with the neighboring atoms weighted by the solid angles determined by the faces of the Voronoi polyhedra. The Python Materials Genomics (pymatgen) package is used to implement the descriptor in the Python programming language. The equation used to create the OFM is as follows:
$$X = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{nn} H_j\,H_i^{T}\,w_{ij}\,\zeta(d_{ij}),$$

where n is the total number of atoms in the compound, nn is the number of nearest neighbors of the ith atom, H_j is the one-hot vector of the jth atom, H_i^T is the transpose of the one-hot vector of the ith atom, w_ij is the weight of the jth atom with respect to the ith atom, determined by the face of the Voronoi polyhedron, and ζ(d_ij) is a decaying function of the distance d_ij between the ith and the jth atoms, as defined in Ref. 10. The data now have 1024 features capturing the structure of the complete system. To reduce the entropy in the system, the data are further divided into several datasets based on magnetic ordering, as shown in Table I. Dataset 1 includes all Fe-based compounds, i.e., ferromagnetic (FM), ferrimagnetic (FiM), nonmagnetic (NM), and antiferromagnetic (AFM) ones, as well as some unknown configurations. All other datasets are subsets of dataset 1.
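As an illustration, the flattened 32 × 32 OFM can be generated with matminer's implementation of the Pham et al. descriptor, as sketched below; using matminer here is an assumption about tooling, not necessarily the exact code behind our datasets.

```python
# Sketch: turning each crystal structure into a 1024-dimensional OFM vector.
# matminer's OrbitalFieldMatrix implements the Pham et al. descriptor; with
# period_tag=False the element representation is 32-dimensional, giving a
# 32 x 32 mean OFM per structure.
import numpy as np
from matminer.featurizers.structure import OrbitalFieldMatrix

ofm = OrbitalFieldMatrix(period_tag=False)

def structure_to_features(structure):
    """Return the mean OFM of a pymatgen Structure as a flat 1024-vector."""
    return np.asarray(ofm.featurize(structure)).ravel()

# Feature matrix and targets for the ML models (entries from the query above).
X = np.vstack([structure_to_features(e["structure"]) for e in entries])
y = np.array([e["magnetic_moment_per_atom"] for e in entries])
```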
TABLE I. Datasets according to magnetic ordering.
| Datasets | Instances | Description/Magnetic ordering |
|---|---|---|
| Dataset 1 | 11 545 | All Fe-based magnetic compounds |
| Dataset 2 | 8083 | Fe-based ferromagnetic (FM) compounds |
| Dataset 3 | 2224 | Fe-based ferrimagnetic (FiM) compounds |
| Dataset 4 | 10 307 | Fe-based FM + FiM compounds |
III. RESULTS AND DISCUSSION
In this section, we present the various ML algorithms and their performance on the datasets of interest.
A. Machine learning modeling
In machine learning, several factors must be considered when choosing a model for the dataset under consideration. For example, it is important to understand what kind of model we want to build, as each algorithm is designed to serve a specific purpose, such as classification or regression. Further, since ML is data-hungry, an over-constrained model trained on an insufficient dataset results in underfitting, whereas an under-constrained model is likely to overfit the dataset. The size of the training dataset therefore plays a major role in deciding the algorithm of choice. On a small training dataset, low-bias/high-variance classifiers (such as k-nearest neighbors) are likely to overfit, so high-bias/low-variance classifiers (such as naive Bayes) have an advantage. Considering the limited size and nature of our dataset, we chose the Random Forest algorithm, as it handles both classification and regression problems and combines the output of multiple decision trees to reach a single result. The Random Forest algorithm easily handles datasets containing continuous variables (regression) as well as categorical variables (classification), and for material datasets it has proven to perform well in predicting various material properties.6,8,13

However, since there have been few efforts to predict the magnetic moment of Fe-based materials using ML, we benchmarked several ML algorithms. We applied the Support Vector Regressor (SVR), Random Forest (RF), K-Nearest Neighbours (KNN), and Extreme Gradient Boosting (XGB) to dataset 1, which contains the complete data with no classification based on magnetic ordering. We used the grid search hyperparameter tuning technique to get the best parameters for all algorithms and then applied fivefold cross-validation using the best parameters obtained. For SVR, we optimised the gamma parameter, which defines the influence range of a single training example; low values mean "far" and high values mean "close." Our optimised value of gamma is 0.001. The important parameters for Random Forest are n_estimators and max_depth, which represent the number of trees in the forest and the maximum depth of each tree; our optimised values for n_estimators and max_depth are 50 and 800, respectively. n_neighbors in KNN represents the number of neighbors used in the calculation; its optimised value is 8, with weights set to "distance," meaning that points are weighted by the inverse of their distance. n_estimators in the XGB regressor represents the number of boosting rounds XGB uses to learn the model, and its optimised value is 600 for our model.

We used the Scikit-learn 1.1.3 Python library14 for implementing the various algorithms and the hyperparameter tuning in Sec. III A. The final calculations in Secs. III B and III C for the magnetic moment per atom and the formation energy per atom, using the Random Forest algorithm, were done with WEKA 3.8.5,15 an open-source software package developed by the University of Waikato. Our dataset contains a few compounds with a high magnetic moment per atom, commonly known as outliers in ML terms, but we did not exclude any outlier: materials with a high magnetic moment per atom do exist, so these data are important to include in our model.
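The sketch below mirrors this tuning procedure for the Random Forest case with scikit-learn; the parameter grids are illustrative assumptions, and only the quoted optima come from our actual runs.

```python
# Sketch: grid search hyperparameter tuning followed by 5-fold cross-validation.
# The grids below are illustrative choices, not the exact grids scanned.
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score

param_grid = {
    "n_estimators": [50, 100, 400, 800],  # number of trees in the forest
    "max_depth": [10, 50, 200, 800],      # maximum depth of each tree
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),  # fixed random state for reproducibility
    param_grid,
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
print("best parameters:", search.best_params_)

# 5-fold cross-validation with the tuned model, as reported in Table II.
scores = cross_val_score(search.best_estimator_, X, y, cv=5,
                         scoring="neg_mean_absolute_error")
print("MAE:", -scores.mean())
```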
A schematic of the whole process for the required task is shown in Fig. 2. The final merit matrix for all the algorithms is shown in Table II.
FIG. 2. Flowchart of the various steps used in the ML model to predict the magnetic moment and formation energy per atom of Fe-based compounds.
TABLE II. Comparison of various ML algorithms on dataset 1 for predicting the magnetic moment per atom (MAE and RMSE in µB per atom).
| Metric | SVR | RF | KNN | XGB |
|---|---|---|---|---|
| Correlation coefficient (CC) | 0.661 | 0.812 | 0.702 | 0.818 |
| Mean absolute error (MAE) | 0.208 | 0.135 | 0.175 | 0.147 |
| Root mean squared error (RMSE) | 0.325 | 0.232 | 0.294 | 0.241 |
Table II shows that there is close competition between XGB and RF. The correlation coefficient (CC), which is a measure of the strength of the relationship between features and labels, is slightly better for XGB, so it correlates the OFM with magnetism slightly better than RF, whereas the error in the prediction of the magnetization is lower for RF than for XGB. Random Forest gives more preference to the hyperparameters in order to optimise the model, whereas XGB prioritises the functional space while minimising the cost of the model. Our priority here is to optimise the model for better results; therefore, we have used the RF algorithm for our further modelling. Similar findings are reported in the literature,8 where RF provided the best results in predicting the magnetic properties of uranium-based compounds.
B. Magnetic moment
The merit matrix for the prediction of the magnetic moment on the various datasets is shown in Table III. The predicted values of the magnetic moment per atom lie within a range of 0.03 µB per atom to 2.13 µB per atom. These results differ slightly from those in Sec. III A because here we use the WEKA software: in scikit-learn, one needs to fix the random state to make the results reproducible, whereas there is no such requirement in WEKA. Here we did not optimise any parameter and simply used the default parameters of WEKA. The correlation coefficient (CC) is highest for the ferromagnetic compounds. This is understandable, as the magnetic moment is uniformly distributed in the same direction in ferromagnetic ordering, which relatively reduces the entropy of the system, and it is easy for a machine to establish a relationship in a low-entropy or linear system. The CC is lowest for the FiM dataset (dataset 3), as the magnetic moments in ferrimagnetic compounds are unequal and oppositely aligned, which makes the system more entropic; it is difficult for a machine to establish a correlation in a non-uniformly distributed, high-entropy system. The mean absolute error (MAE) is in good agreement with the CC, with a minimum error of 0.148 µB per atom for the FM dataset (dataset 2) and a maximum error of 0.215 µB per atom for the FiM dataset. Interestingly, the accuracy is best for the ferromagnetic dataset, worst for the ferrimagnetic dataset, and intermediate for the two datasets in which FM and FiM compounds are taken together. This is not surprising in machine learning, as the entropy of the system plays an important role in determining model accuracy, so the FM dataset, with minimum entropy, shows the best results. Pham et al.10 showed that the OFM descriptor is suitable for predicting only the magnitude of the magnetic moment of a particular atom in a material, not its direction; the total magnetization cannot be computed, as the magnetic ordering (FM or any other) is not accounted for. This is reflected in our results, where the model trained on the data of ferromagnetic compounds only performs better. Therefore, the descriptor needs to be improved for it to be suitable for a generalized dataset.
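For reference, the merit-matrix entries of Tables III and IV can be reproduced from cross-validated predictions as sketched below; WEKA reports these quantities directly, and `model`, `X`, and `y` are assumed from the earlier steps.

```python
# Sketch: computing the merit-matrix entries (CC, MAE, RMSE) from
# out-of-fold predictions; `model`, `X`, and `y` come from the steps above.
import numpy as np
from sklearn.model_selection import cross_val_predict

y_pred = cross_val_predict(model, X, y, cv=5)

cc = np.corrcoef(y, y_pred)[0, 1]            # correlation coefficient
mae = np.mean(np.abs(y - y_pred))            # mean absolute error
rmse = np.sqrt(np.mean((y - y_pred) ** 2))   # root mean squared error
print(f"CC={cc:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```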
TABLE III. Accuracy parameters for the various datasets based on magnetic ordering for predicting the magnetic moment per atom (MAE and RMSE in µB per atom).
| Metric | All | FM | FiM | FM + FiM |
|---|---|---|---|---|
| Correlation coefficient (CC) | 0.826 | 0.850 | 0.836 | 0.844 |
| Mean absolute error (MAE) | 0.170 | 0.148 | 0.215 | 0.178 |
| Root mean squared error (RMSE) | 0.226 | 0.235 | 0.308 | 0.261 |
| No. of compounds | 11 545 | 8083 | 2224 | 10 307 |
Further, a density plot, which provides deeper insight into the results by comparing each predicted and actual value in visual form, is used here. The density plot of the magnetic moments for dataset 2 (FM compounds only) is shown in Fig. 3. The high density near the central line indicates a good fit of the data. For lower values of the magnetic moment, i.e., less than 0.44 µB per atom, the prediction is slightly overestimated, whereas for higher values, i.e., more than 0.44 µB per atom, it is underestimated. Figure 4 presents the data distribution of all the datasets as a function of the magnetic moment per atom. Each circle represents the normalized density of datapoints within a range of ±0.01 µB per atom. Clearly, the data are non-uniformly distributed, with a maximum density at 0.44 µB per atom. This bias in the data affects the prediction pattern, as discussed above: the predicted values shift toward the denser side, i.e., toward 0.44 µB per atom.
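The two visualizations can be produced along the lines of the sketch below; the bin width of 0.02 µB per atom matches the ±0.01 µB per atom windows described above, while the color map and grid size are illustrative choices (`y` and `y_pred` are assumed from the evaluation step).

```python
# Sketch: predicted-vs-actual density plot (cf. Fig. 3) and the distribution
# of the target values (cf. Fig. 4).
import matplotlib.pyplot as plt
import numpy as np

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

# Density plot: point density encoded by hexagonal binning.
ax1.hexbin(y, y_pred, gridsize=60, cmap="viridis", mincnt=1)
lims = [min(y.min(), y_pred.min()), max(y.max(), y_pred.max())]
ax1.plot(lims, lims, "r--")  # central line of perfect prediction
ax1.set_xlabel(r"Actual moment ($\mu_B$/atom)")
ax1.set_ylabel(r"Predicted moment ($\mu_B$/atom)")

# Distribution: normalized density in bins of width 0.02 (centers +/- 0.01).
counts, edges = np.histogram(y, bins=np.arange(0.0, y.max() + 0.02, 0.02))
ax2.scatter(edges[:-1] + 0.01, counts / counts.max(), s=10)
ax2.set_xlabel(r"Magnetic moment ($\mu_B$/atom)")
ax2.set_ylabel("Normalized density")

plt.tight_layout()
plt.show()
```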
FIG. 3. Density plot between actual and predicted values of the magnetic moment per atom for dataset 2.
FIG. 4. Distribution of the magnetic moment per atom of all Fe-based compounds. Each circle represents the normalized density of datapoints within a range of ±0.01 µB per atom. The largest number of compounds have a magnetic moment of 0.44 µB per atom.
C. Formation energy
Formation energy, which is a measure of stability, is computed in most studies using an ML approach. It has emerged as a robust target, and approaches with different descriptors have all predicted the formation energy quite accurately. However, the same is not true for other properties, e.g., the magnetic moment, as explained above. We trained on the datasets with the formation energy per atom as the output to predict the formation energy. The merit matrix for the formation energy on the various datasets is shown in Table IV. The predicted values of the formation energy per atom lie within a range of −3.33 eV per atom to 4.00 eV per atom. The correlation coefficient (CC) increases with the number of compounds in the dataset, which is consistent with the general trend that ML algorithms work better as the number of training datapoints increases.
TABLE IV. Merit matrix for the various datasets based on magnetic ordering for predicting the formation energy per atom (MAE and RMSE in eV per atom).
| Metric | Dataset 1 | Dataset 2 | Dataset 3 |
|---|---|---|---|
| Correlation coefficient (CC) | 0.967 | 0.964 | 0.954 |
| Mean absolute error (MAE) | 0.140 | 0.132 | 0.227 |
| Root mean squared error (RMSE) | 0.261 | 0.248 | 0.392 |
| No. of compounds | 11 545 | 8083 | 2224 |
The density plot between the actual and predicted values of the formation energy per atom for dataset 1 is shown in Fig. 5. The energy range of concern for us is near zero, i.e., between −0.5 and 0.5 eV per atom; in this range, the plot is well symmetric about the central line, showing that the model is unbiased, and the high density near the line indicates good predictions in this range.
D. Model interpretability
Model interpretability is critical for explaining the ML model and the accuracy of its results. It is also needed to understand the model and to improve it. With interpretability analysis, one can explain a model prediction by generating feature importance values for the entire model or for individual datapoints. It also provides a visualization platform to discover patterns in the data. There are a few tools for model interpretability, such as LIME, ELI5, and SHAP. Here we have used SHAP (SHapley Additive exPlanations),16 a game-theoretic approach to explain the output of any machine learning model. The SHAP Python library, available on PyPI, includes a high-speed exact algorithm for tree ensemble methods, and we have used this implementation to understand the results of our ML model.
Out of the 1024 features in the dataset, coming from the 32 × 32 matrix representing the crystal structure of the compounds, the SHAP analysis identifies a few features, i.e., orbital interactions, as dominant for the magnetic moment. Figure 6 shows the sorted average SHAP values for these dominant orbital interactions. Orbital interactions that push the magnetic moment higher are represented in red, while those that push it lower are shown in blue. For example, the blue part of the s2-s2 orbital interaction in Fig. 6 indicates a reduction in the magnetic moment, whereas the red part of the d6-d6 interaction indicates an increase. The spread in the SHAP values indicates how dominant each feature is: the wider the spread, the more dominant the feature.
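A minimal sketch of this analysis is shown below; `rf_model` is the trained Random Forest and `feature_names` holds the 1024 orbital-pair labels (e.g., "s2-s2", "d6-d6"), both assumed from the earlier steps.

```python
# Sketch: SHAP analysis of the trained Random Forest model. TreeExplainer
# uses the fast exact algorithm for tree ensembles mentioned above.
import shap

explainer = shap.TreeExplainer(rf_model)
shap_values = explainer.shap_values(X)

# Beeswarm summary plot: features sorted by mean |SHAP| value; red/blue
# encode high/low feature values, and the horizontal spread of each row
# reflects how dominant that orbital interaction is.
shap.summary_plot(shap_values, X, feature_names=feature_names)
```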
FIG. 6. SHAP values for (a) all iron-based compounds, (b) iron-based ferromagnetic compounds, and (c) iron-based ferrimagnetic compounds.
As shown in Fig. 6, the orbital interactions s2-s2, s1-s2, p4-p4, s2-p3, s2-p4, and s2-p2 tend to suppress the magnetization of a compound, whereas d6-d6, d6-p4, and p4-d6 enhance the magnetization in Fe-based compounds. In Fig. 6(a), the d6-d6 interaction is the most dominant; it enhances the magnetization in iron-based compounds and is at the same time responsible for the ferrimagnetic interactions, as shown in Fig. 6(c). Similarly, the p4-d6, d6-p4, and s2-d6 orbital interactions are responsible for high magnetization in ferromagnetic compounds, and the rest are responsible for suppressing the magnetization, as shown in Fig. 6(b). The s2-s2 orbital interaction is an interesting one, having higher values in ferromagnetic and lower values in ferrimagnetic compounds, with an inverse relation to the magnetization. It indicates that far-separated s orbitals make a compound ferrimagnetic, whereas closely packed s orbitals are found in ferromagnetic ordering. The d6-d6 orbital interaction represents the interaction of Fe-Fe atoms. The d6-p4 and p4-d6 features indicate the interaction of the d6 orbital with the p4 orbital (group 16 in the periodic table); several iron oxides are examples of this.
IV. CONCLUSION
In conclusion, we have trained models for iron-based compounds using various machine learning algorithms. The Random Forest algorithm provides better results than the other algorithms. We can predict the magnetic moment per atom of iron-based ferromagnetic compounds with a mean absolute error of 0.148 µB per atom. Datasets of ferromagnetic compounds are more uniform with respect to the magnetization direction than datasets of any other magnetic ordering, which is reflected in the accuracy of the machine learning algorithms on the differently ordered datasets. The orbital field matrix descriptor is best suited for ferromagnetic compounds compared with other magnetic orderings. However, considering the accuracy of the magnetic moment predictions, there is clear scope to improve the descriptor. In particular, since the orbital field matrix descriptor does not capture the direction of the moments, including directional information could be very useful. This suggests more work to be done in refining the descriptor for magnetic systems.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Yogesh Khatri: Conceptualization (lead); Data curation (lead); Formal analysis (lead); Methodology (equal); Resources (lead); Software (supporting); Supervision (lead); Validation (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (supporting). Rajesh Sharma: Data curation (supporting); Methodology (equal); Software (lead). Ashutosh Shah: Data curation (supporting). Arti Kashyap: Project administration (lead); Writing – review & editing (lead).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.