This study examines the application of machine learning algorithms, specifically the Random Forest regression model, to optimize the magnetocaloric effect in all-d-metal Heusler alloys. The model was trained using descriptors related to the mean properties of individual atoms, the properties of simple compounds in their ground state, and measures of chemical disorder. It demonstrated high accuracy in predicting structural properties, while exhibiting moderate accuracy in predicting magnetic properties. To identify optimal alloy compositions, a genetic algorithm was used to find those with the greatest differences in magnetization during martensitic transitions. Using this combined approach, the Ni–Co–Mn–Ti alloy system was thoroughly explored, resulting in the discovery of an alloy with a maximum magnetization difference. These results are consistent with previous research based on density functional theory and highlight the effectiveness of integrating machine learning with genetic algorithms for the discovery of new materials with outstanding magnetocaloric properties. The study emphasizes the need for further refinement of models capable of accurately predicting complex magnetic interactions, which is essential for fully leveraging the potential of all-d-metal Heusler alloys in practical applications.
I. INTRODUCTION
For an extended period, magnetocaloric cooling has been regarded as an environmentally benign and efficacious alternative to conventional compressor-based cooling technologies.1–3 This technology has several advantages, including high energy efficiency, the absence of environmentally harmful refrigerants, and relatively low levels of noise and vibration.4 In recent years, a variety of potential magnetocaloric materials have been identified, including Gd, manganites, Laves phases, Fe–Rh, Mn–As, La–Fe–Si, and Heusler alloys.1,2 Additionally, several prototype devices employing magnetocaloric cooling technology and using Gd as the working material have been demonstrated.4–6 However, for industrial applications of this technology, several challenges must be addressed, such as the cost of the magnetocaloric material and the magnetic system, as well as the frequency of switching between the heat transfer fluid and the working body, as outlined in Ref. 7. Heusler alloys are among the most promising candidates because their Curie temperatures can be easily tuned by varying stoichiometric coefficients,8,9 many consist of relatively inexpensive components, and they exhibit some of the highest magnetocaloric effects observed.8,10
The main issue with these alloys is thermal hysteresis, which leads to irreversible changes in the magnetocaloric effect during cyclic operation.7 Another challenge, particularly in systems with a giant magnetocaloric effect characterized by a first-order phase transition, is the development of defects under cyclic loading.11 These defects cause “smearing” of the transition, reducing the maximum value of the magnetocaloric effect.12 Therefore, alloys with exceptional mechanical properties are required to solve the problems of rapid heat transfer and resistance to defect formation. However, traditional X2YZ-type alloys are highly brittle due to covalent hybridization.13 A recently proposed solution to this problem is to replace the Z element with a transition metal, thereby substituting strong covalent hybridization with metallic bonding between elements. This results in enhanced mechanical properties associated with the martensitic transition.14 The ductility of all-d-metal Heusler alloys, such as MnTi, significantly surpasses that of classical Mn(Ga, Al, In, Sn) alloys based on Pettifor’s and Pugh’s ratios. The magnetocaloric effect values in these alloys are also high and comparable to those of classical Heusler alloys. For instance, exhibits an isothermal entropy change of 20 and an adiabatic temperature change of 4 K for a magnetic field change of 2 T at room temperature.15
The number of possible compositions of all-d-metal Heusler alloys is vast, and because exceptional properties are often found in non-stoichiometric regions, the search space becomes effectively unbounded. This makes a classical search based on a full factorial design very expensive, even when certain physical considerations are incorporated. Identifying suitable materials among numerous possible compositions is a labor-intensive and time-consuming process.
In recent years, only a limited number of compositions in this series have been studied, most of which are Ni–Mn–Ti,13,14 with the addition of a fourth element such as Co,14,16 Fe,17 and a fifth element such as Cu18 and B.19 An alternative approach is to use machine learning algorithms and high-throughput screening. For example, using regression with a Random Forest algorithm, the mechanical properties of all-d-metal Heusler alloys, particularly the Pugh's ratio, were optimized,20 and an empirical formula for estimating ductility was proposed. In Ref. 21, a model was developed to classify cubic and tetragonal phases of the alloy using simple descriptors of d-electron occupancy and spin moment with a support vector machine algorithm, achieving a prediction accuracy of 90.4% relative to DFT. In addition to machine learning, high-throughput screening methods are also employed. In Ref. 22, 1881 all-d-metal compounds were considered, and by applying constraints on the convex hull energy, magnetic moment magnitude, type of magnetic ordering, and mechanical stability, 11 compounds were identified. The authors of Ref. 22 also provided several empirical rules for determining the lattice type based on the electronegativity of the elements and for tuning the behavior of the magnetostructural phase transition.
Despite significant progress in using machine learning methods to optimize the properties of Heusler alloys, existing models face several major challenges. One of the key difficulties is accurately predicting the properties of non-stoichiometric compositions, which is critically important for developing new materials with improved properties. The magnetocaloric effect itself is a complex phenomenon, and there are currently no models that can accurately predict all its aspects. While some models successfully predict elastic properties and chemical stability, the challenge of predicting differences in magnetization between phases, Debye temperatures, and other key parameters remains unresolved.
The aim of this study is to develop models for predicting the volume, tetragonality parameter, and magnetization of austenite and martensite phases, as well as an optimization algorithm for the difference in magnetization for a given atomic composition of all-d-metal Heusler alloys.
II. MATERIALS AND METHODS
ID | Full descriptor name | Symbol |
---|---|---|
x1 | Mendeleev number | Z |
x2 | Atomic weight | A |
x3 | Column | Col |
x4 | Row | Row |
x5 | Electronegativity | χ |
x6 | Number of d valence electrons | Nval |
x7 | Number of d unfilled electrons | Nunf |
x8 | Covalent radius | rcov |
x9 | Space group number | SGN |
x10 | Volume of simple compounds | Vsc |
x11 | Bandgap | |
x12 | Magnetic moment | μsc |
x13 | Melting temperature | |
x14 | Shannon entropy | S |
x15 | Volume | V |
x16 | Tetragonality ratio | c/a |
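As an illustration of how composition-based descriptors of this kind can be built, the following minimal sketch computes a composition-weighted mean of two elemental properties and a Shannon entropy of the atomic fractions. It assumes that the disorder measure has the form S = -Σ c_i ln c_i and that mean descriptors are simple weighted averages; the exact definitions used in the study are given in its supplementary material, and the function names here are hypothetical.

```python
import math

# Standard atomic weights and Pauling electronegativities of the four elements.
ATOMIC_WEIGHT = {"Ni": 58.693, "Co": 58.933, "Mn": 54.938, "Ti": 47.867}
ELECTRONEGATIVITY = {"Ni": 1.91, "Co": 1.88, "Mn": 1.55, "Ti": 1.54}


def normalize(composition):
    """Convert element amounts (e.g. per formula unit) to atomic fractions."""
    total = sum(composition.values())
    return {element: amount / total for element, amount in composition.items()}


def weighted_mean(composition, table):
    """Composition-weighted mean of a tabulated elemental property."""
    fractions = normalize(composition)
    return sum(c * table[element] for element, c in fractions.items())


def shannon_entropy(composition):
    """Assumed disorder measure: S = -sum(c_i * ln c_i) over atomic fractions."""
    fractions = normalize(composition)
    return -sum(c * math.log(c) for c in fractions.values() if c > 0)


# Example: two Ni, one Mn, and one Ti atom per formula unit.
composition = {"Ni": 2, "Mn": 1, "Ti": 1}
features = {
    "mean_atomic_weight": weighted_mean(composition, ATOMIC_WEIGHT),
    "mean_electronegativity": weighted_mean(composition, ELECTRONEGATIVITY),
    "shannon_entropy": shannon_entropy(composition),
}
print(features)
```

Under this assumption the entropy is zero for a pure element and reaches ln 4 for an equiatomic four-component alloy, so it grows with chemical disorder.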
A Random Forest regression method was selected to predict the cell volume, tetragonality parameter, and magnetic moment of austenite and martensite. The choice of the Random Forest algorithm was determined by comparing the effectiveness of various regression models for predicting the tetragonality parameter and the magnetic moment of austenite and martensite. As shown in Table II, the Random Forest model outperformed the other regression methods, always achieving the largest coefficient of determination (R2) and the smallest (or comparable to the smallest) root mean square error (RMSE). The gradient boosting model showed slightly lower performance. Other models, such as linear regression, support vector regression (SVR), K-neighbors, and decision tree, showed significantly higher RMSE values and lower R2 scores. These results highlight the reliability and suitability of the Random Forest algorithm for this particular application. The Random Forest algorithm is also notable for its straightforward interpretability through feature importance scores. Deep learning models, while highly effective, were not employed here because of their limited interpretability, which impedes comprehension of the relationships between features and target variables.
Model | c/a RMSE | c/a R2 | μaust RMSE (μB) | μaust R2 | μmart RMSE (μB) | μmart R2 |
---|---|---|---|---|---|---|
Random forest | 0.08 | 0.62 | 0.23 | 0.62 | 0.27 | 0.62 |
Gradient boosting | 0.09 | 0.58 | 0.22 | 0.58 | 0.28 | 0.58 |
Linear regression | 0.10 | 0.45 | 0.31 | 0.45 | 0.31 | 0.45 |
SVR | 0.12 | 0.25 | 0.46 | 0.25 | 0.47 | 0.25 |
K-neighbors | 0.12 | 0.26 | 0.42 | 0.26 | 0.41 | 0.26 |
Decision Tree | 0.12 | 0.24 | 0.31 | 0.24 | 0.33 | 0.24 |
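A comparison of this kind can be sketched with Scikit-learn as follows. This is a minimal illustration on synthetic data, not the study's dataset; the model settings, the five-fold cross-validation, and the helper name `compare_models` are assumptions made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the descriptor matrix (16 features) and one target.
X, y = make_regression(n_samples=200, n_features=16, noise=10.0, random_state=0)

models = {
    "Random forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Gradient boosting": GradientBoostingRegressor(random_state=0),
    "Linear regression": LinearRegression(),
    "SVR": SVR(),
    "K-neighbors": KNeighborsRegressor(),
    "Decision tree": DecisionTreeRegressor(random_state=0),
}


def compare_models(models, X, y, cv=5):
    """Return the cross-validated mean RMSE and R2 for each regressor."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(
            model, X, y, cv=cv,
            scoring=("neg_root_mean_squared_error", "r2"),
        )
        results[name] = {
            # The scorer returns the negated RMSE, so flip the sign back.
            "rmse": -scores["test_neg_root_mean_squared_error"].mean(),
            "r2": scores["test_r2"].mean(),
        }
    return results


results = compare_models(models, X, y)
for name, metrics in results.items():
    print(f"{name:18s} RMSE = {metrics['rmse']:8.2f}  R2 = {metrics['r2']:5.2f}")
```

The ranking obtained on synthetic data says nothing about the ranking in Table II; the sketch only shows how the two metrics can be collected on equal footing for all candidate models.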
The hyperparameters of the model were tuned using Grid Search to minimize the RMSE (for details, see the supplementary material). A total of 200 trees were selected to ensure a robust ensemble and control overfitting. The maximum depth of 40 limits the growth of each tree, balancing complexity against generalization. A minimum samples split of 2 allows the model to learn from small data subsets, while a minimum samples leaf of 1 allows each leaf node to contain a single sample, facilitating precise predictions. Collectively, these settings enhance the model’s ability to capture underlying data patterns while maintaining interpretability and performance.
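A grid search over these four hyperparameters could look like the sketch below. Only the selected values (200 trees, depth 40, split 2, leaf 1) come from the text; the alternative candidates in the grid, the three-fold cross-validation, and the synthetic data are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the descriptor matrix and one target property.
X, y = make_regression(n_samples=100, n_features=16, noise=5.0, random_state=0)

# The values 200, 40, 2, and 1 are those reported in the text;
# the other candidates are hypothetical.
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [20, 40],
    "min_samples_split": [2],
    "min_samples_leaf": [1, 2],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",  # maximizing this minimizes the RMSE
    cv=3,
)
search.fit(X, y)

best_rmse = -search.best_score_
print(search.best_params_, best_rmse)
```

After fitting, `search.best_estimator_` holds a model refitted on all the data with the best parameter combination.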
The training dataset was constructed using an active learning approach, as illustrated in Fig. 1. At this stage, a random atomic composition was generated from four 3d elements [ ]. From this composition, one stoichiometric alloy and ten non-stoichiometric alloys with random coefficients were created. Additionally, ten compositions were generated with a random concentration of a fourth element at random crystallographic positions: , , and , where , , and . Density Functional Theory (DFT) calculations were performed using the PBE functional,25 implemented in the VASP package,26,27 for cubic and tetragonal Heusler structures with space groups #225 (#139) and #216 (#119), involving complete structural optimization. This process provided benchmark values for volume, tetragonality ratio, and magnetic moments of the austenitic and martensitic phases. The choice of structure type, direct #225 (#139) or inverse #216 (#119), was made according to classical rules for conventional Heusler alloys. It should be noted, however, that this simplification is limited, and for alloys involving all metals, the atom must also be considered (see Ref. 28).
Subsequently, descriptors were computed for the generated compositions, and a regression model was developed using the Scikit-learn module in Python.29 The predictions of the regression model were compared with the DFT calculations, and only those compositions for which the deviation between the DFT results and the model exceeded 5% were added to the training dataset. This iterative process continued until, over ten iterations, the average deviation was reduced to less than 5%.
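The selection loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: `reference_property` is a hypothetical analytic stand-in for a DFT calculation, raw atomic fractions replace the descriptors of Table I, and the stopping rule (at most ten rounds, or a mean deviation below 5%) is one possible reading of the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)


def random_compositions(n, n_elements=4):
    """Draw n random compositions whose atomic fractions sum to one."""
    return rng.dirichlet(np.ones(n_elements), size=n)


def reference_property(x):
    """Hypothetical stand-in for a DFT result (not a physical model)."""
    return 2.0 + 3.0 * x[:, 0] - x[:, 1] + 4.0 * x[:, 2] * x[:, 3]


def active_learning(n_rounds=10, batch=50, tol=0.05):
    """Grow the training set with compositions the model predicts poorly."""
    X_train = random_compositions(batch)
    y_train = reference_property(X_train)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    history = []
    for _ in range(n_rounds):
        model.fit(X_train, y_train)
        X_new = random_compositions(batch)
        y_ref = reference_property(X_new)
        rel_dev = np.abs(model.predict(X_new) - y_ref) / np.abs(y_ref)
        history.append(rel_dev.mean())
        # Keep only the candidates whose relative deviation exceeds the tolerance.
        poorly_predicted = rel_dev > tol
        X_train = np.vstack([X_train, X_new[poorly_predicted]])
        y_train = np.concatenate([y_train, y_ref[poorly_predicted]])
        if rel_dev.mean() < tol:
            break
    return model, X_train, history


model, X_train, history = active_learning()
print(len(X_train), [round(h, 3) for h in history])
```

Adding only the poorly predicted compositions concentrates the expensive reference calculations in the regions where the model is least reliable.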
Cross-validation was used to assess the stability and generalizability of the model. The test set consisted of 950 samples generated according to the first block “Data preparation” of Fig. 1, which were not used during the training phase. This ensures that the test set provides an objective assessment of the accuracy of the model’s prediction, as these samples were reserved for testing purposes only and do not overlap with the training data.
A multi-objective genetic algorithm30 was employed to optimize the composition of all-d-metal Heusler alloys. The process began with the generation of multiple subpopulations, each comprising compositions situated at randomly selected points. The suitability of a composition was determined by the absolute difference in magnetizations predicted by the regression model, with a larger difference indicating greater suitability. Tournament selection with a fixed tournament size was used as the selection operator. To generate new subpopulations from the selected compositions, a crossover operator was employed to compute the arithmetic mean concentration of parental compositions. The mutation operator introduced random perturbations to the alloy compositions, ensuring that the overall composition was preserved, i.e., the sum of the element fractions equaled one. This perturbation involved transferring up to 5% of one element’s concentration to another element’s concentration. The mutation and crossover rates were adjusted based on the diversity of the subpopulations, measured as the average pairwise distance between compositions in barycentric coordinates. At higher diversity, the rates were decreased to promote stability, while at lower diversity, the rates were increased to enhance the search.
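The operators described above can be illustrated with a deliberately simplified, single-population sketch. Several points are assumptions rather than the authors' implementation: `toy_fitness` is a hypothetical stand-in for the magnetization difference predicted by the regression model, the mutation transfers up to 0.05 in absolute atomic fraction, the tournament size is 3, and the diversity threshold and rate values are arbitrary. The published code in the repository listed under Data Availability is the authoritative reference.

```python
import numpy as np

rng = np.random.default_rng(0)


def toy_fitness(population):
    """Hypothetical stand-in for the predicted magnetization difference."""
    return np.abs(population[:, 1] + population[:, 2] - population[:, 0])


def crossover(parent_a, parent_b):
    """Arithmetic mean of the parental concentrations (sum stays equal to one)."""
    return 0.5 * (parent_a + parent_b)


def mutate(composition, max_transfer=0.05):
    """Move a small amount of one element to another, preserving the total."""
    child = composition.copy()
    donor, acceptor = rng.choice(len(child), size=2, replace=False)
    amount = min(rng.uniform(0.0, max_transfer), child[donor])
    child[donor] -= amount
    child[acceptor] += amount
    return child


def tournament_select(population, fitness, size=3):
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.choice(len(population), size=size, replace=False)
    return population[contenders[np.argmax(fitness[contenders])]]


def diversity(population):
    """Average pairwise Euclidean distance between compositions."""
    diffs = population[:, None, :] - population[None, :, :]
    distances = np.linalg.norm(diffs, axis=-1)
    n = len(population)
    return distances.sum() / (n * (n - 1))


def adaptive_rate(current_diversity, low=0.2, high=0.8, threshold=0.3):
    """Mutate more often when the population has lost diversity."""
    return high if current_diversity < threshold else low


population = rng.dirichlet(np.ones(4), size=20)  # fractions of four elements
best_history = []
for generation in range(30):
    fitness = toy_fitness(population)
    best_history.append(fitness.max())
    rate = adaptive_rate(diversity(population))
    children = []
    for _ in range(len(population)):
        child = crossover(tournament_select(population, fitness),
                          tournament_select(population, fitness))
        if rng.random() < rate:
            child = mutate(child)
        children.append(child)
    population = np.array(children)

print(best_history[0], best_history[-1])
```

Because both operators keep every fraction non-negative and the total equal to one, each individual remains a valid composition without any repair step.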
To validate the effectiveness of the regression model and the genetic algorithm, DFT was used to check that the computed optima accurately represent the physical properties of the system under investigation. This approach was applied specifically to the Ni–Co–Mn–Ti system. Initially, a series of compositions with an increment of 12.5% was generated using 16-atom supercells. Both austenitic and martensitic phases, including their direct and inverse structures, were constructed. Preliminary lattice parameters, including the initial volume and the c/a ratio, were derived from the regression model and further optimized through DFT calculations.
To account for various magnetic orderings, a selection of both ferromagnetic (FM) and antiferromagnetic (AFM) structures was made. The AFM configurations were generated based on the system’s symmetry using Enumlib,31 focusing on those with the highest symmetry, as these are often more energetically favorable. The selection was limited to the five most symmetric AFM structures to reduce computational complexity. The optima identified through the genetic algorithm and regression model were compared with those obtained from DFT calculations to evaluate the reliability of the computational methods used in predicting the most promising configurations of the Ni–Co–Mn–Ti system.
III. RESULTS AND DISCUSSION
A. Evaluation of the regression models
Figure 2 illustrates the distribution density of the regression prediction results, alongside the R2 and RMSE metrics for both the training and test sets, for the cell volume, tetragonality parameter, and total magnetic moment of the austenitic and martensitic phases. The model captures the cell volume almost fully; the volume is close to an average of the elementary-cell volumes of the constituent elements, in good agreement with the results of Ref. 28. The dominant influence on this prediction is the average volume, with the unaccounted percentages likely due to structural disorder.
This approach is particularly effective for predicting volume, which mainly depends on the volumes of the constituent components and can be approximated linearly. In all-d-metal Heusler alloys, the bonds are almost entirely metallic, and their volumes generally follow Vegard’s law. The observed deviations from this law and the scattering of data points can be attributed to contributions from the magnetic subsystem, the presence of a slight degree of covalency in the bonds, and other factors. The tetragonality parameter is primarily influenced by the atomic mass of the alloy (about 8%), the magnetic moment of the constituent elements in their ground states (about 5%), and the cell volume (about 4%). The absolute error in predicting the tetragonality parameter is about 0.07, allowing for reliable predictions of the geometry of the martensitic phase in all-d-metal Heusler alloys, particularly those with a large c/a ratio.
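The linear approximation mentioned above can be written as a composition-weighted average of elemental volumes. The sketch below assumes approximate volumes per atom for the elemental ground-state structures; these numbers are illustrative and are not the descriptor values used in the study, and the function name is hypothetical.

```python
# Approximate volumes per atom (cubic angstroms) of the elemental ground states.
# Illustrative values only, not the descriptors used in the study.
ELEMENT_VOLUME = {"Ni": 10.9, "Co": 11.1, "Mn": 12.2, "Ti": 17.7}


def vegard_volume(composition, atoms_per_cell=4):
    """Vegard-type estimate: cell volume as a weighted mean of atomic volumes."""
    total = sum(composition.values())
    mean_atomic_volume = sum(
        amount * ELEMENT_VOLUME[element] for element, amount in composition.items()
    ) / total
    return atoms_per_cell * mean_atomic_volume


# Example: two Ni, one Mn, and one Ti atom in a four-atom cell.
estimate = vegard_volume({"Ni": 2, "Mn": 1, "Ti": 1})
print(f"Estimated volume: {estimate:.1f} cubic angstroms per four-atom cell")
```

Deviations of DFT volumes from such an estimate are what the text attributes to magnetic contributions and residual covalency.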
It should be noted that this behavior differs from that observed in classical Heusler alloys, where the number of valence electrons is the primary determinant of the tetragonal ratio.32 In the case of all-d-metal Heusler alloys, however, this contribution is significantly reduced, accounting for only about 2% of the overall effect. For predicting magnetic moments in the case of austenite, the primary influence (about 20%) is the magnetic moment of the constituent elements in their ground state. The contributions from the remaining features are substantially smaller. The martensitic structure exhibits a similar trend, with the tetragonal ratio also having a significant impact (about 12%). The R2 and RMSE values for magnetic moments show more pronounced differences between the test and training samples, indicating an increased risk of overfitting. Furthermore, the distribution profile becomes asymmetric, with a bias toward higher model estimates relative to DFT. These factors are likely due to the predominant presence of FM phases in the training set, whereas other magnetic states, such as AFM, are known to exist experimentally.33
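Percentage contributions of this kind can be read from the impurity-based `feature_importances_` attribute of a fitted Random Forest. The sketch below uses synthetic data with hypothetical feature names and a made-up linear target, so the resulting numbers are unrelated to those quoted above; whether the study used impurity-based importances or another measure is not stated here and is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical feature names and a synthetic target dominated by the first one.
feature_names = ["volume", "atomic_weight", "magnetic_moment", "noise"]
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(n_estimators=200, max_depth=40, random_state=0)
model.fit(X, y)

# Impurity-based importances sum to one; convert them to percentages.
importance_percent = 100.0 * model.feature_importances_
ranking = sorted(zip(feature_names, importance_percent),
                 key=lambda pair: pair[1], reverse=True)
for name, percent in ranking:
    print(f"{name:16s} {percent:5.1f}%")
```

Impurity-based importances can be biased toward correlated or high-cardinality features, so permutation importance is a common cross-check.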
To gain a more comprehensive understanding, it is also necessary to validate the model against data available in the literature. This approach helps identify and account for potential sources of deviation, thereby improving prediction accuracy. Initially, the prediction results obtained using the regression model were compared with the DFT calculation results from a previous study.22
Figure 3 illustrates the distribution of errors obtained by the proposed regression model with respect to the DFT results published previously in Ref. 22 for Heusler alloys comprising solely 3d elements and including 4d and 5d elements. The standard deviation of the error distribution approximately corresponds to that of the distribution shown in Fig. 2. However, the mean relative error shifts toward negative values, showing that the model underestimates the properties relative to DFT.
The first source of the underestimation is the presence of 4d and 5d transition metals, which have larger atomic radii than 3d elements but were absent from the training set. This results in a systematic underestimation of volume, which subsequently reduces the predicted tetragonality and magnetic moments, as volume is a principal descriptor (see Fig. 2 and the correlation matrix in the supplementary material). A second contributing factor is that the training dataset consists almost entirely of nonstoichiometric compositions with a random distribution of atoms. In contrast, Ref. 22 focuses exclusively on stoichiometric compositions. The nonstoichiometric alloys exhibit greater disorder, which results in increased atomic spacing and, consequently, larger predicted volumes.35 In nonstoichiometric alloys, a reduction in the magnetic moment is also anticipated, given that the substitution of an atom with a non-magnetic one is a more probable occurrence.36 When a model trained on these alloys is applied to stoichiometric compositions, where atoms are more ordered and compact, the model tends to predict lower volumes and magnetic moments compared to actual values. This contributes to the observed prediction bias in the validation set. It should be noted that non-systematic errors related to atomic environment differences also exist, although their influence is difficult to assess. This shift underscores the need to incorporate new data types into the model to enhance prediction accuracy. Nevertheless, the agreement between the model and the DFT results remains within an acceptable range, confirming the model's overall reliability and applicability; it can thus be used to narrow the search area for subsequent, more accurate DFT modeling.
The magnetic moments of the ground state for several all-d-metal Heusler alloys were predicted. Figure 4 displays the predicted magnetic moments in comparison with those calculated in Ref. 34. The values obtained from the regression model align with the trends calculated using DFT. The maximum deviation in magnetic moment is 34%, while the average deviation from the DFT values is 15%, which is consistent with the results shown in Fig. 2. The largest deviation is observed for TiZn. This discrepancy is likely due to the complex magnetic structure of this all-d-metal alloy, which includes FM, AFM, and non-magnetic configurations depending on the concentration of the constituent elements. These configurations are not sufficiently represented in the training dataset. The remaining sources of magnetic moment underestimation are analogous to those illustrated in Fig. 3.
An indirect indicator of the magnetocaloric effect magnitude can be the difference in magnetic moments between the austenitic and martensitic phases, denoted as . Figure 5 shows for the Ni–Mn–Ti system with Fe and Co substitutions. According to experimental data,17 no martensitic transition occurs up to a Fe concentration of 15% in . Beyond this concentration, martensitic transformation becomes possible, and the magnetization difference between the austenitic and martensitic phases increases with higher Fe concentrations. The regression model results presented in Fig. 5(a) qualitatively replicate these findings, indicating a pronounced increase in magnetization at a Fe concentration of 15%. However, the model currently cannot predict the complete absence of the martensitic phase at Fe concentrations up to 15%. The results shown in Fig. 5(b) demonstrate similar trends for the and Heusler alloy series. The other authors experimentally and theoretically predicted that the martensitic transformation would cease at both low and high Ti concentrations,16 marked by colored areas at the boundaries of the martensitic transition.
The regression model can quantitatively predict the cessation of the transition at high concentrations, which is indicated by a sharp decline in . At low Ti concentrations, a reduction in the magnetization difference is also observed, although the concentrations deviate from those observed experimentally. The absence of a martensitic transition is predicted for the Ni–Co–Mn system.
B. Screening of the Ni–Co–Mn–Ti alloy family using a genetic algorithm
It is crucial to verify whether the maxima of magnetization differences correspond to those calculated by DFT. Figure 6 shows cross sections of the Ni–Co–Mn–Ti diagram obtained using the regression model (a)–(d) and DFT calculations (e)–(h). In the case of the Ni–Co–Ti system, there is excellent agreement between the model results and the DFT data. For the Ni–Co–Mn and Co–Mn–Ti systems, good agreement is also observed, although the model does not identify all maxima. The most significant discrepancies are evident in the Ni–Mn–Ti system, where the model exhibits substantial deviations and predicts non-existent maxima. These errors are particularly pronounced in regions with high Mn concentrations and stabilization of the AFM state. This suggests that the current model may struggle to accurately predict magneto-structural states at high Mn concentrations and in scenarios dominated by complex magnetic interactions.
Optimization was performed for the Ni–Mn–Co–Ti system using this genetic algorithm with the regression model. The results are shown in Fig. 7. Initially, all compositions were distributed randomly, with each population centered around its focus. By the fifth step, no individuals remained in the zero-magnetization region, and the algorithm had found local maxima. The populations then migrate to the global maximum and successfully locate the region with the largest magnetization difference. Thus, the model is capable of identifying both local and global maxima. The magnetization difference at each generation is shown in Fig. 7(d). The curve oscillates with an upward trend; the oscillations correspond to mutation events that explore new regions. The best composition found in the Ni–Mn–Co–Ti class, according to the optimization results with the regression model, is with a magnetization difference between the austenitic and martensitic phases of 2.24 μB/f.u. This result aligns with physical intuition and findings from other studies,18,37 since replacing weakly magnetic Ni and Ti with Co and Mn leads to increased magnetization.
IV. CONCLUSIONS
In this study, a regression model based on the Random Forest algorithm was developed to predict the structural and magnetic properties of all-d-metal Heusler alloys. The model exhibits high accuracy in predicting structural characteristics such as cell volume and tetragonality parameter, while demonstrating moderate accuracy in predicting total magnetic moments of austenitic and martensitic phases. The limited accuracy in predicting magnetic moments can be attributed to the inherent difficulty in determining AFM ground states. The model currently struggles to accurately capture the complex magnetic interactions that occur in systems with significant AFM contributions. Therefore, the development of a more effective and computationally affordable algorithm to accurately identify the ground magnetic state remains a critical challenge for improving the predictive capabilities of the model.
Despite these limitations, the model qualitatively predicts the presence or absence of martensitic transitions and the associated differences in magnetization between phases with a reasonable degree of accuracy. This capability is particularly important for applications where the martensitic transition plays a crucial role in determining material properties, such as in magnetocaloric materials.
Furthermore, a genetic optimization algorithm was employed to identify alloy compositions in the Ni–Co–Mn–Ti family that exhibit the greatest differences in magnetization during the martensitic transition. These predicted compositions could be highly advantageous for magnetocaloric applications, where large magnetization changes are necessary. The results are consistent with those of previous studies, underscoring the potential of this combined approach for discovering new materials with enhanced magnetocaloric properties.
SUPPLEMENTARY MATERIAL
See the supplementary material for detailed information on the process of selecting the descriptors and the training of the random forest regression model. The learning curves, which illustrate the dependence of the MSE on the number of iterations during active learning, are presented. The correlation matrices for the features and target properties are provided. The results of hyperparameter optimization for the random forest model using Grid Search are illustrated.
ACKNOWLEDGMENTS
The research was supported by the RSF—Russian Science Foundation Project No. 24-12-20016.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
D. R. Baigutlin: Conceptualization (equal); Methodology (equal); Writing – original draft (equal). V. V. Sokolovskiy: Conceptualization (equal); Methodology (equal); Supervision (equal); Writing – review & editing (equal). V. D. Buchelnikov: Conceptualization (equal); Supervision (equal); Writing – review & editing (equal). S. V. Taskaev: Conceptualization (equal); Funding acquisition (equal); Project administration (equal); Supervision (equal).
DATA AVAILABILITY
The data that support the findings of this study, including a dataset of all-d-metal Heusler compounds, trained regression models, code for training and testing these models, and code for the optimization model, are openly available in the GitHub repository at the following link: https://github.com/Danil-phy-cmp-120/all_d_optimization.