Quantifying charge-state transition energy levels of impurities in semiconductors is critical to understanding and engineering their optoelectronic properties for applications ranging from solar photovoltaics to infrared lasers. While these transition levels can be measured and calculated accurately, such efforts are time-consuming, and more rapid prediction methods would be beneficial. Here, we significantly reduce the time typically required to predict impurity transition levels by using multi-fidelity datasets and a machine learning approach that employs features based on elemental properties and impurity positions. We use transition levels obtained from low-fidelity density functional theory (DFT) calculations (i.e., local-density approximation or generalized gradient approximation), corrected using a recently proposed modified band alignment scheme; the corrected values closely approximate transition levels from high-fidelity DFT (i.e., the hybrid HSE06 functional). The model fit to the large multi-fidelity database is more accurate than models trained on the more limited set of high-fidelity values. Crucially, when the multi-fidelity data are used, our approach does not require high-fidelity values for model training, which significantly reduces the computational cost of training. Our machine learning model of transition levels has a root-mean-square (mean absolute) error of 0.36 (0.27) eV relative to high-fidelity hybrid functional values when averaged over 14 semiconductor systems from the II–VI and III–V families. As a guide for use on other systems, we assessed the model on simulated data to show the expected accuracy as a function of bandgap for new materials of interest. Finally, we use the model to predict a complete space of impurity charge-state transition levels in all zinc blende III–V and II–VI systems.
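The two error metrics quoted above, root-mean-square error and mean absolute error, can be illustrated with a minimal sketch. The function names and the numerical values below are hypothetical and chosen only for illustration; they are not data or code from this work.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences (in eV)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, reference):
    """Mean absolute error between two equal-length sequences (in eV)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(p - r) for p, r in zip(predicted, reference)]
    return sum(absolute) / len(absolute)


# Hypothetical transition levels in eV (illustrative only, not from the paper):
# model predictions versus high-fidelity reference values.
predicted = [0.10, 0.85, 1.40, 0.30]
reference = [0.25, 0.70, 1.10, 0.60]

print(f"RMSE = {rmse(predicted, reference):.2f} eV")
print(f"MAE  = {mae(predicted, reference):.2f} eV")
```

Because squaring weights large deviations more heavily, the root-mean-square error is never smaller than the mean absolute error, which is consistent with the ordering of the two values reported above.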

