A Deep-learning Model for Fast Prediction of Vacancy Formation in Diverse Materials

The presence of point defects such as vacancies plays an important role in materials design. Here, we demonstrate that a graph neural network (GNN) model trained only on perfect materials can also predict vacancy formation energies ($E_{vac}$) of defect structures without the need for additional training data. Such GNN-based predictions are considerably faster than density functional theory (DFT) calculations while maintaining reasonable accuracy, suggesting that GNNs can capture a functional form for energy prediction. To test this strategy, we developed a DFT dataset of 508 $E_{vac}$ values spanning 3D elemental solids, alloys, oxides, nitrides, and 2D monolayer materials. We analyze and discuss the applicability of such direct and fast predictions. Finally, we applied the model to predict 192494 $E_{vac}$ values for 55723 materials in the JARVIS-DFT database.

Recently, machine learning techniques have been proposed as a faster route to predicting defect energetics, but they still require time-consuming defect data generation for model training, which limits the applicability and generalizability of the resulting predictions. 18-23 In particular, graph neural network (GNN) based deep-learning models 24-27 have become very popular for predicting bulk materials properties, and their applicability to defect property predictions remains to be tested. Two key ingredients are needed to accomplish this task: 1) a pretrained deep-learning model that can directly predict the total energy of both perfect and defect structures, and 2) a test DFT dataset of vacancy formation energies on which the model can be evaluated.
In this work, we demonstrate that the atomistic line graph neural network (ALIGNN) 28 based total energy prediction model (trained on the JARVIS-DFT 29 OptB88vdW energy-per-atom data for perfect bulk materials) can be directly used to predict the vacancy formation energy of an arbitrary material with reasonable accuracy, without requiring additional training data. The mean absolute error of the energy-per-atom model was reported as 0.037 eV in ref. 28. Note that we do not train any machine learning/deep learning model for defects in this work; we simply use the energy-prediction model parameters that were developed and shared publicly in ref. 28.

Developing a vacancy formation energy dataset can be extremely time-consuming and depends on several computational setup choices, such as the supercell size, the choice of k-points, whether neutral or charged defects are considered, and the selection of appropriate chemical potentials. To test the strategy adopted in this work, we generated a DFT dataset of 508 entries with charge-neutral defects using a high-throughput approach. The dataset consists of elemental solids, oxides, alloys, and 2D materials. In addition to predicting the vacancy formation energies, we analyze the trends, strengths, and limitations of such predictions. Lastly, we used this strategy to develop a database of vacancy formation energies for all the materials in the JARVIS-DFT database. The deep-learning model, the DFT dataset, and the workflow are made publicly available through the JARVIS (Joint Automated Repository for Various Integrated Simulations) infrastructure. 29

First, we discuss the generation of the vacancy formation energy dataset used to test the deep-learning model. We obtained stable elemental solids, binary alloys, oxides, and 2D materials from the JARVIS-DFT dataset. We enforced a minimum lattice parameter of 8 Å in the x, y, and z directions when building the supercells.
We removed an atom with a unique Wyckoff position to generate the vacancy structure using the JARVIS-Tools package (https://github.com/usnistgov/jarvis). The defect structures were then subjected to energy minimization using the OptB88vdW functional 30 and the projector augmented wave formalism 31 in the Vienna Ab initio Simulation Package (VASP). 32,33 Please note that commercial software is identified to specify procedures. Such identification does not imply recommendation by the National Institute of Standards and Technology (NIST). We used the converged k-points and plane-wave cut-offs from the JARVIS-DFT dataset, which are based on total energy convergence. 34 We used an energy convergence criterion of $10^{-6}$ eV for the self-consistent cycle.
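As an illustration of the supercell-size criterion above, the following sketch (a hypothetical helper, not the actual JARVIS-Tools implementation) computes the smallest integer repetitions of a cell so that every lattice vector reaches the 8 Å minimum:

```python
import math

def supercell_multipliers(lattice_abc, min_length=8.0):
    """Smallest integer repetitions (na, nb, nc) so that each supercell
    lattice vector is at least `min_length` Angstroms long."""
    return tuple(math.ceil(min_length / a) for a in lattice_abc)

# A 3.5 Angstrom cubic cell needs a 3x3x3 supercell to satisfy the 8 A constraint.
print(supercell_multipliers((3.5, 3.5, 3.5)))  # -> (3, 3, 3)
```

In the actual workflow, JARVIS-Tools additionally uses symmetry analysis to enumerate atoms with unique Wyckoff positions; this snippet captures only the size criterion.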
Currently, we have 508 entries for the vacancies and the dataset is still growing.
For the deep-learning predictions, we used the recently developed atomistic line graph neural network (ALIGNN), 28 which is publicly available at https://github.com/usnistgov/alignn. ALIGNN has been used to train fast and accurate models for more than 65 properties of solids and molecules. 25,28,35,36 In ALIGNN, a crystal structure is represented as a graph with atoms as nodes and atomic bonds as edges. Each node in the atomistic graph is assigned 9 input features based on its atomic species, including electronegativity, group number, covalent radius, number of valence electrons, first ionization energy, electron affinity, block, and atomic volume. The interatomic bond distances are used as edge features with a radial basis function expansion up to an 8 Å cut-off and a 12-nearest-neighbor limit. This atomistic graph is then used to construct the corresponding line graph, with interatomic bond distances as nodes and bond angles as edge features. ALIGNN uses edge-gated graph convolution to update both node and edge features using a propagation function (f) for layer (l), atom features (h), and node (i); details can be found in ref. 28. This model is used as the energy predictor for both perfect and defect structures in this work.

In Fig. 1 we analyze the DFT database of vacancy formation energies developed in this work, which we use for testing purposes only. Although there have been several studies generating vacancy formation energy datasets, a fully atomistic dataset with consistent bulk and vacancy energetics information is, to our knowledge, not available. Hence, we generated a DFT dataset for vacancies covering a wide variety of material classes, including elemental solids, 2D materials, oxides, and metallic alloys. We visualize the defect formation energies in Fig. 1. As mentioned above, we only considered charge-neutral vacancies with a minimum 8 Å cell size and the OptB88vdW functional.
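The radial-basis-function edge featurization used in graph networks like ALIGNN can be sketched as follows; the bin count and Gaussian width below are illustrative choices, not necessarily ALIGNN's exact hyperparameters:

```python
import math

def rbf_expand(distance, cutoff=8.0, bins=40):
    """Expand a scalar interatomic distance (Angstrom) into a Gaussian
    radial-basis feature vector with evenly spaced centers up to `cutoff`."""
    centers = [cutoff * i / (bins - 1) for i in range(bins)]
    gamma = 1.0 / (centers[1] - centers[0]) ** 2  # width tied to bin spacing
    return [math.exp(-gamma * (distance - c) ** 2) for c in centers]

features = rbf_expand(2.4)  # a typical bond length in Angstroms
print(len(features), round(max(features), 3))
```

Each bond thus contributes a smooth, fixed-length vector rather than a single raw distance, which helps the message-passing layers learn distance-dependent interactions.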
The vacancy formation energy was calculated as

$E_{vac} = E_{defect} - E_{perfect} + \mu$ (2)

where $E_{vac}$ is the vacancy formation energy, $E_{defect}$ is the energy of the defect structure with an atom missing, $E_{perfect}$ is the energy of the perfect structure, and $\mu$ is the chemical potential, taken as the energy per atom of the most stable structure of the removed element. The chemical potentials used in this work are provided in the supplementary information.
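The expression above translates directly into code; the energies below are illustrative placeholders, not values from the paper's dataset:

```python
def vacancy_formation_energy(e_defect, e_perfect, mu):
    """E_vac = E_defect - E_perfect + mu, where mu is the chemical potential
    (energy per atom of the removed element's most stable phase)."""
    return e_defect - e_perfect + mu

# Hypothetical total energies in eV for a supercell with and without one atom:
e_vac = vacancy_formation_energy(e_defect=-420.5, e_perfect=-427.3, mu=-3.7)
print(round(e_vac, 2))  # -> 3.1
```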
Currently, the vacancy formation energy dataset consists of 508 entries. We compare a subset of this dataset with available data from previous experimental and DFT studies 14,39-43 in Fig. 1a. We find excellent agreement between our dataset and the literature, with a mean absolute error (MAE) of 0.3 eV. In Fig. 1b, we show the histogram of all the vacancy formation energy data. Most of the values lie below 3 eV. Depending on the type of engineering application, either a high or a low $E_{vac}$ could be desirable.
Next, we used the pretrained ALIGNN total-energy-per-atom model, trained on the JARVIS-DFT dataset, to predict the defect and perfect energies required for the vacancy formation energy following Eq. 2. This model was trained using bulk energies for 55723 solids. 28 The defect structures were generated by deleting an atom with a unique Wyckoff position, without optimizing the positions of the remaining atoms.
We used the same elemental chemical potentials from JARVIS-DFT as given in the supporting information. The comparison of the ALIGNN-based predictions with the DFT data is shown with blue dots in Fig. 2a. Interestingly, we observe a noticeable correlation between the direct ALIGNN predictions and the DFT data, with a mean absolute error of 1.51 eV. However, the ALIGNN-based predictions were usually underestimated. To circumvent this issue, we applied a scissor shift by adding 1.3 eV to all the ALIGNN-based predictions, represented by green dots. Using this shift, we lowered the MAE to 1.0 eV, an overall improvement of 33.8 %. The value of 1.3 eV was chosen such that the overall MAE is minimized. Previous reports on machine learning for vacancy formation energies provide mean absolute error values that offer a useful point of comparison.

To further analyze the predictions for different types of materials, we compared the DFT and ALIGNN predictions for elemental solids, oxides, and 2D monolayers in Fig. 2b, Fig. 2c, and Fig. 2d, respectively. The MAEs for the original and scissor-shifted ALIGNN predictions are shown in Table 1. We found that the ALIGNN-based model performs better for 2D monolayer and oxide materials than for elemental solids and alloys.
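For a constant ("scissor") shift, the value that minimizes the mean absolute error is the median of the residuals between reference and predicted values. The sketch below demonstrates this with toy numbers; the paper's 1.3 eV was determined on its own DFT/ALIGNN data:

```python
import statistics

def optimal_shift(dft, pred):
    """MAE-minimizing constant shift to add to predictions:
    the median of the residuals (dft - pred)."""
    return statistics.median(d - p for d, p in zip(dft, pred))

def mae(dft, pred, shift=0.0):
    """Mean absolute error after adding `shift` to every prediction."""
    return sum(abs(d - (p + shift)) for d, p in zip(dft, pred)) / len(dft)

# Toy data: predictions systematically low by roughly 1.3 eV.
dft  = [2.0, 3.1, 1.5, 4.0, 2.7]
pred = [0.6, 1.9, 0.3, 2.6, 1.4]
shift = optimal_shift(dft, pred)
print(round(shift, 2), round(mae(dft, pred), 2), round(mae(dft, pred, shift), 2))
```

Note that the median of residuals is MAE-optimal only for an additive constant; the mean of residuals would instead minimize the squared error.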
This behavior can be explained by the fact that GNN architectures usually perform message passing locally and may therefore work better for insulating materials with fewer bonds than for elemental solids and alloy systems, which are usually metallic and have delocalized electrons. The vacancy formation energy histogram showed a high peak around 2 eV, similar to that observed in Fig. 1b.

Supporting Information Available
Experimental and previous DFT data for benchmarking, and chemical potential of elemental solids are provided in the supporting information.