A rough interface is considered one of the possible causes of low channel mobility (conductivity) in SiC metal-oxide-semiconductor field-effect transistors. To evaluate the mobility limited by interface roughness, we drew a boundary line between an amorphous insulator and crystalline 4H–SiC in a cross-sectional transmission electron microscope image by using a convolutional neural network (CNN), a deep learning approach. We show that the CNN model recognizes the interface very well, even when the interface is too rough to draw the boundary line manually. The power spectral density of the interface roughness was calculated and was comparable with those of Si interfaces, indicating that interface roughness cannot account for the low channel mobility of SiC interfaces.

Metal-oxide-semiconductor field-effect transistors (MOSFETs) based on 4H silicon carbide (4H–SiC) are used as switching devices for high-power applications.1 However, the low conductivity at the channel interfaces between the amorphous insulator and crystalline 4H–SiC remains a serious problem. The roughness at the interface is a possible cause of the low interface conductivity of SiC MOSFETs.2–5 The surface-roughness-scattering model established for SiO2/Si interfaces6–10 has been partially used to analyze the mobility at SiO2/4H–SiC interfaces.11–14 For SiO2/Si interfaces, the power spectral density of surface roughness was extracted from the boundary line between the two phases in a cross-sectional image obtained by using a transmission electron microscope (TEM),9 and the mobility was calculated from the theoretical formula6–8 and the power spectral density. For SiC interfaces, however, the power spectral density has not been determined from TEM images and has instead been treated as a fitting parameter to reproduce measured electrical properties or mobilities.11–14

Drawing the boundary line between the two phases is a difficult task. Goodnick et al.9 defined the interface boundary as “the last discernible lattice fringe corresponding to the periodicity of the Si,” but they admitted that “this procedure is somewhat arbitrary at many points, as an abrupt change from crystalline Si to non-crystalline SiO2 is not always apparent.” They probably drew the boundary manually. Zhao et al.15 took another approach: they defined the boundary based on the distinct darkness difference between the two phases. This approach has the advantage that the interface is determined uniquely and automatically. It cannot, however, be applied to SiC interfaces because the TEM image of crystalline SiC has both bright and dark regions.2–4,16,17 In addition, the existence of an interface transition layer has been ignored in previous reports.

A convolutional neural network (CNN) is a deep learning approach that has achieved great success in image classification.18–20 In this study, we classified each point in a TEM image as an amorphous insulator or crystalline 4H–SiC by using this approach and determined the interface boundary.

A TEM sample was prepared by depositing amorphous aluminum oxide on the 4-degree-tilted (0001) surface of crystalline 4H–SiC.21 We chose an interface that was too rough to draw the boundary line manually. A cross-sectional image was obtained using a TEM (JEM-ARM200F, JEOL) at an acceleration voltage of 200 kV. The sample thickness was estimated to be ∼70 nm. Figure 1 shows the cross-sectional TEM image, where the upper half is amorphous aluminum oxide (A) and the lower half is crystalline 4H–SiC (C). Figure 2 shows the crystal structure of 4H–SiC.

The size of one pixel in Fig. 1 is 0.016 × 0.016 nm² (Δr = 0.016 nm). The training dataset consists of 2000 small images that were randomly extracted from two distinct regions ∼0.6 nm–3 nm away from the interface: 1000 images from the A region of the TEM image (Fig. 1) and 1000 images from the C region. In the same way, 2000 images for the test dataset were extracted from another two regions ∼3 nm–20 nm away from the interface. No image in the training or test dataset includes the interface, and each is assigned the correct label (A or C). The extracted images are 29 × 29 pixels (0.46 × 0.46 nm²) in size, and an example is shown in Fig. 3. This size was selected to include several lattice points (Fig. 2) so that the CNN can recognize the features of crystalline 4H–SiC. At every position (pixel) in the interface region, a 29 × 29 pixel image centered on that position was extracted, which results in a dataset of 214 326 images (interface dataset). Images in the interface dataset are not assigned a label.
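The patch extraction described above can be illustrated with a minimal Python sketch. It is not the code used in this study; the function names (`extract_patch`, `random_patches`, `interface_centres`), the nested-list image representation, and the row and column ranges passed by the caller are hypothetical choices made only for illustration. The pixel size and the 29-pixel patch width are taken from the text.

```python
import random

PIXEL_NM = 0.016   # pixel size (Delta r) from the text, in nm
PATCH = 29         # patch edge length in pixels, from the text
HALF = PATCH // 2  # 14 pixels on each side of the centre pixel


def extract_patch(image, row, col):
    """Return the PATCH x PATCH sub-image centred on (row, col).

    `image` is a list of rows (lists of grey values). The caller must keep
    the centre at least HALF pixels away from every image edge.
    """
    return [r[col - HALF: col + HALF + 1]
            for r in image[row - HALF: row + HALF + 1]]


def random_patches(image, rows, cols, count, label, rng=None):
    """Draw `count` labelled patches whose centres lie in the half-open
    ranges rows = (r0, r1) and cols = (c0, c1)."""
    rng = rng or random.Random()
    samples = []
    for _ in range(count):
        r = rng.randrange(rows[0], rows[1])
        c = rng.randrange(cols[0], cols[1])
        samples.append((extract_patch(image, r, c), label))
    return samples


def interface_centres(row_range, col_range):
    """Every pixel of the interface region becomes one (unlabelled) patch centre."""
    return [(r, c) for r in range(*row_range) for c in range(*col_range)]
```

With an interface region of 81 × 2646 pixels (the ranges of nz and nx used later in the text), `interface_centres((0, 81), (0, 2646))` yields 214 326 centres, which matches the size of the interface dataset, and `PATCH * PIXEL_NM` gives the 0.46 nm patch width.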

The Neural Network Console (NNC) software developed by Sony Network Communications Inc. was used in this study.22 Deep learning models were trained using the training dataset to classify the extracted images as A or C, and the degree of learning was evaluated using the test dataset. Subsequently, every position (pixel) in the interface region was classified using the corresponding image in the interface dataset.

The CNN model used in this study is a modified version of the CNN proposed in Ref. 18 and is shown in Figs. 4(a) and 5(a). The main change is the introduction of batch normalization layers23 after the convolution and affine layers. The batch normalization layers prevent the distribution of internal variables from changing significantly and suppress overfitting. Adam24 was used as the optimizer with α = 0.001. Another model that consists of fully connected (FC) affine layers was also evaluated as a reference using the configuration shown in Figs. 4(b) and 5(b). NNC default values were used for the other hyperparameters. Figure 6 shows the learning progress with a batch size of 30. The errors for both the training and test datasets decreased with epochs, except for some surges owing to the variation between batches. The CNN trained for 43 epochs and the FC model trained for 14 epochs were used to classify the interface dataset.
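For readers unfamiliar with the optimizer, a single Adam update for one scalar parameter can be sketched as follows. Only α = 0.001 is taken from the text. The values β1 = 0.9, β2 = 0.999, and ε = 1e-8 are the defaults suggested in Ref. 24; that NNC uses the same defaults is an assumption here. The function name `adam_step` is hypothetical, and the sketch is not the NNC implementation.

```python
import math


def adam_step(theta, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of a scalar parameter.

    theta: current parameter value, grad: gradient of the loss at theta,
    m, v: running first and second moment estimates, t: step index (1, 2, ...).
    alpha is from the text; beta1, beta2 and eps are assumed defaults.
    Returns the updated (theta, m, v).
    """
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

On the first step the bias-corrected ratio reduces to approximately the sign of the gradient, so the parameter moves by about α regardless of the gradient magnitude.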

Figure 7 shows the images of the interface region extracted from Fig. 1. Figures 7(b) and 7(c) show the classification results by the FC and CNN models, respectively, where pixels classified as A (C) are colored red (yellow). The FC model could not classify the interface very well, e.g., it misjudged the lower part of the C region to be the A region. On the other hand, the CNN model, which is effective in image classification,18–20 classified the interface very well. For example, a human observer can recognize a depression at the position indicated by the arrow in Fig. 7. The shape of the depression was well reproduced by the CNN model, which resulted in the boundary line being convex downward at that position. Hereafter, only the CNN model is used for further analysis.

A third phase, an intermediate transition region, can be introduced by assigning the pixels with intermediate output values (e.g., 0.001 < Y < 0.999) to that region, which is shown as the red region in Fig. 7(d). The average thickness of the transition layer is calculated by dividing the area of the transition layer [red area in Fig. 7(d)] by the horizontal length of the analyzed rectangular area. The calculated value was 0.29 nm, which is comparable to the thickness of a Si–C bilayer in the 〈0001〉 direction (0.25 nm). The ability to introduce the intermediate transition region is an advantage of the deep learning approach that manual drawing does not have.
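The thickness calculation described above amounts to counting the pixels with intermediate output values. A minimal sketch is given below, assuming the CNN outputs are available as a nested list `y_map[nz][nx]`; this layout and the function name `transition_thickness` are hypothetical. The thresholds and the pixel size are the values quoted in the text.

```python
PIXEL_NM = 0.016  # pixel size (Delta r) from the text, in nm


def transition_thickness(y_map, lo=0.001, hi=0.999, pixel_nm=PIXEL_NM):
    """Average transition-layer thickness in nm.

    y_map[nz][nx] holds the classifier output Y (0 <= Y <= 1) of each pixel.
    area   = (number of pixels with lo < Y < hi) * pixel_nm**2
    length = (number of columns) * pixel_nm
    The thickness is area / length.
    """
    n_cols = len(y_map[0])
    count = sum(1 for row in y_map for y in row if lo < y < hi)
    area = count * pixel_nm ** 2
    length = n_cols * pixel_nm
    return area / length
```

Because the area and the length both scale with the pixel size, the result equals the mean number of transition pixels per column multiplied by Δr. The reported 0.29 nm therefore corresponds to about 18 transition pixels per column on average (0.29 / 0.016 ≈ 18).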

To evaluate the power spectral density of the roughness, the boundary line in Fig. 7(c) was converted into the single-valued function shown in Fig. 7(e) by

$$z'(n_x) = \Delta r \sum_{n_z} Y(n_x, n_z), \qquad z(n_x) = z'(n_x) - (A n_x + B), \tag{1}$$

where Y is the output value obtained by the CNN model (0 ≤ Y ≤ 1) and nx (1–2646) and nz (1–81) are the serial numbers of pixels in the horizontal and vertical directions, respectively. The least squares regression line (Anx + B) is subtracted from z′(nx). The calculated root-mean-square (standard deviation) of z(nx) was 0.14 nm.
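Equation (1) and the root-mean-square evaluation can be illustrated with the following minimal sketch. It assumes that the CNN outputs are stored as a nested list `y_map[nz][nx]`; this layout and the function names `boundary_profile`, `detrend`, and `rms` are hypothetical. The least squares line is fitted against the pixel index nx = 1, 2, ..., as in the text.

```python
import math

PIXEL_NM = 0.016  # pixel size (Delta r) from the text, in nm


def boundary_profile(y_map, pixel_nm=PIXEL_NM):
    """z'(nx) = pixel_nm * sum over nz of Y(nx, nz), one value per column."""
    n_cols = len(y_map[0])
    return [pixel_nm * sum(row[nx] for row in y_map) for nx in range(n_cols)]


def detrend(z_raw):
    """Subtract the least squares regression line A*nx + B (nx = 1..N)."""
    n = len(z_raw)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_z = sum(z_raw) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, z_raw))
    a = sxz / sxx               # slope A
    b = mean_z - a * mean_x     # intercept B
    return [z - (a * x + b) for x, z in zip(xs, z_raw)]


def rms(z):
    """Root-mean-square of the detrended profile."""
    return math.sqrt(sum(v * v for v in z) / len(z))
```

After the regression line is subtracted, the mean of z(nx) is zero, which is why the root-mean-square coincides with the standard deviation.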

Figure 8 shows the power spectral density calculated by using the discrete Fourier transform as

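$$S(n_q) = \frac{\Delta r}{N_x} \left| \sum_{n_x=1}^{N_x} z(n_x) \exp\!\left[ -\frac{2\pi i\,(n_x-1)(n_q-1)}{N_x} \right] \right|^2, \quad x = (n_x-1)\,\Delta r, \quad q = (n_q-1)\,\Delta q, \quad \Delta q = \frac{2\pi}{\Delta r\, N_x} = 0.15\ \mathrm{nm}^{-1}, \tag{2}$$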

where Nx = 2646 is the total number of pixels in the horizontal direction. The size of the extracted images used in the classification process was 0.46 nm (q = 14 nm−1); therefore, structures smaller than this scale cannot be detected. Thus, it is reasonable that the power spectral density decreases drastically for q > 14 nm−1, as shown in Fig. 8. Two peaks can be recognized at q = 0.6 nm−1 and 11 nm−1. The peak at 0.6 nm−1 corresponds to the periodic step-and-terrace structure on the 4-degree-tilted surface with steps consisting of two Si–C bilayers (q = 0.87 nm−1). In other words, the CNN model revealed step bunching with two bilayers. The peak at 11 nm−1 indicates that there is a fluctuation on the order of several atoms. This fluctuation seems to stem from the intermediate region where the A and C regions overlap in the direction perpendicular to the image plane. It is difficult for the human eye to classify these ambiguous regions. Thus, we cannot determine at this stage the validity of the peak at 11 nm−1 obtained by the CNN model.
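The discrete power spectral density and the wavenumbers quoted above can be checked with the following minimal sketch. It uses a direct O(N²) discrete Fourier transform for clarity, which is slow for the full 2646-point profile; a fast Fourier transform library would normally be used instead. The function names are hypothetical, and the negative sign in the exponent is the conventional choice, which does not affect the squared modulus for a real profile. The step height of 0.50 nm assumes two bilayers of 0.25 nm each, as quoted in the text.

```python
import cmath
import math

PIXEL_NM = 0.016  # pixel size (Delta r) from the text, in nm
N_X = 2646        # number of pixels in the horizontal direction, from the text


def delta_q(n_x=N_X, pixel_nm=PIXEL_NM):
    """Wavenumber step 2*pi / (pixel_nm * n_x) in nm^-1."""
    return 2 * math.pi / (pixel_nm * n_x)


def psd(z, pixel_nm=PIXEL_NM):
    """S(nq) = (pixel_nm / N) * |sum_j z[j] * exp(-2*pi*i*j*k/N)|^2.

    j = nx - 1 and k = nq - 1 are zero-based indices; returns S for k = 0..N-1.
    """
    n = len(z)
    out = []
    for k in range(n):
        acc = 0j
        for j, zj in enumerate(z):
            acc += zj * cmath.exp(-2j * math.pi * j * k / n)
        out.append(pixel_nm / n * abs(acc) ** 2)
    return out


def step_terrace_q(step_height_nm, tilt_deg):
    """Wavenumber of a periodic step-and-terrace structure.

    Terrace width = step height / tan(tilt angle); q = 2*pi / width.
    """
    width = step_height_nm / math.tan(math.radians(tilt_deg))
    return 2 * math.pi / width
```

With this normalization, the sum of S over all nq multiplied by Δq/2π equals the mean square of z(nx), so the discrete spectrum can be compared directly with a continuous spectrum that integrates to the squared roughness amplitude. `delta_q()` evaluates to about 0.15 nm−1, 2π/0.46 nm gives about 14 nm−1, and `step_terrace_q(0.50, 4.0)` gives a value close to the quoted 0.87 nm−1.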

One-dimensional power spectral densities of SiO2/Si interfaces were calculated by

$$S(q, \Delta, L) = \frac{2\Delta^2 (L/2)}{1 + q^2 (L/2)^2}, \tag{3}$$

using the root-mean-square (Δ) and correlation length (L) reported in Ref. 9: S(q, 0.18 nm, 2.2 nm), S(q, 0.20 nm, 1.0 nm), and S(q, 0.20 nm, 1.6 nm). They are plotted in Fig. 8. The power spectral density of the SiC interface was comparable with those of the Si interfaces. Based on the conventional surface roughness scattering model,6–10 such a small density cannot account for the low channel mobility of SiC interfaces, which is consistent with the results of Noguchi et al.14 On the other hand, the conventional surface roughness scattering model is based on the effective mass approximation and assumes fluctuations on a scale sufficiently larger than the atomic scale. Atomic-scale fluctuations such as those indicated by the peak at 11 nm−1 must be dealt with by more accurate models.
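Equation (3) is straightforward to evaluate for the three parameter sets listed above. The following minimal sketch does so; the function name `lorentzian_psd` and the list name `SI_PARAMS` are hypothetical, and the (Δ, L) pairs are the values quoted in the text from Ref. 9.

```python
def lorentzian_psd(q, delta_nm, corr_len_nm):
    """Eq. (3): S = 2 * delta^2 * (L/2) / (1 + q^2 * (L/2)^2), in nm^3.

    q is in nm^-1; delta_nm (root-mean-square) and corr_len_nm
    (correlation length) are in nm.
    """
    half = corr_len_nm / 2
    return 2 * delta_nm ** 2 * half / (1 + (q * half) ** 2)


# (delta, L) pairs in nm, as quoted in the text from Ref. 9
SI_PARAMS = [(0.18, 2.2), (0.20, 1.0), (0.20, 1.6)]


def si_curves(q_values):
    """Evaluate Eq. (3) on a list of wavenumbers for every parameter set."""
    return [[lorentzian_psd(q, d, l) for q in q_values] for d, l in SI_PARAMS]
```

Each curve is flat for q much smaller than 2/L, falls to half of its q = 0 value at q = 2/L, and decays as 1/q² beyond that, which is the shape to compare against the measured spectrum of the SiC interface.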

Although the classification results depend on the CNN model and its hyperparameters, we expect that applying the same model and hyperparameters to different samples can reveal the morphological differences between them. We would like to evaluate interfaces formed under various conditions in our future studies.

We drew a boundary line between two phases by using a convolutional neural network (CNN). We showed that the CNN model can recognize the interface accurately, even when the interface is too rough to draw the boundary line manually. The evaluated power spectral density of the SiC interface was comparable with those of the Si interfaces, indicating that interface roughness cannot account for the low channel mobility of SiC interfaces.

See the supplementary material for the original pictures of Figs. 1, 3(a), 3(c), and 7(a)–7(e).

The data that support the findings of this study are available from the corresponding author upon reasonable request.

1. T. Kimoto and J. A. Cooper, Fundamentals of Silicon Carbide Technology (Wiley, Singapore, 2014).
2. K. Fukuda, M. Kato, S. Harada, and K. Kojima, Mater. Sci. Forum 527-529, 1043 (2006).
3. T. Masuda, S. Harada, T. Tsuno, Y. Namikawa, and T. Kimoto, Mater. Sci. Forum 600-603, 695 (2009).
4. P. Fiorenza, F. Giannazzo, A. Frazzetto, and F. Roccaforte, J. Appl. Phys. 112, 084501 (2012).
5. H. Yoshioka, AIP Adv. 9, 075306 (2019).
6. Y. C. Cheng and E. A. Sullivan, Surf. Sci. 34, 717 (1973).
7. Y. Matsumoto and Y. Uemura, Jpn. J. Appl. Phys. 13 (Supplement 2-2), 367 (1974).
8. T. Ando, J. Phys. Soc. Jpn. 43, 1616 (1977).
9. S. M. Goodnick, D. K. Ferry, C. W. Wilmsen, Z. Liliental, D. Fathy, and O. L. Krivanek, Phys. Rev. B 32, 8171 (1985).
10. S. Takagi, A. Toriumi, M. Iwase, and H. Tango, IEEE Trans. Electron Devices 41, 2357 (1994).
11. S. Potbhare, N. Goldsman, G. Pennington, A. Lelis, and J. M. McGarrity, J. Appl. Phys. 100, 044515 (2006).
12. S. Dhar, S. Haney, L. Cheng, S.-R. Ryu, A. K. Agarwal, L. C. Yu, and K. P. Cheung, J. Appl. Phys. 108, 054509 (2010).
13. V. Uhnevionak, A. Burenkov, C. Strenger, G. Ortiz, E. Bedel-Pereira, V. Mortet, F. Cristiano, A. J. Bauer, and P. Pichler, IEEE Trans. Electron Devices 62, 2562 (2015).
14. M. Noguchi, T. Iwamatsu, H. Amishiro, H. Watanabe, N. Miura, K. Kita, and S. Yamakawa, Jpn. J. Appl. Phys. 58, 031004 (2019).
15. Y. Zhao, H. Matsumoto, T. Sato, S. Koyama, M. Takenaka, and S. Takagi, IEEE Trans. Electron Devices 57, 2057 (2010).
16. J. H. Dycus, W. Xu, D. J. Lichtenwalner, B. Hull, J. W. Palmour, and J. M. LeBeau, Appl. Phys. Lett. 108, 201607 (2016).
17. T. Ono, C. J. Kirkham, S. Saito, and Y. Oshima, Phys. Rev. B 96, 115311 (2017).
18. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Proc. IEEE 86, 2278 (1998).
19. Y. LeCun, Y. Bengio, and G. Hinton, Nature 521, 436 (2015).
20. W. Rawat and Z. Wang, Neural Comput. 29, 2352 (2017).
21. H. Yoshioka, M. Yamazaki, and S. Harada, AIP Adv. 6, 105206 (2016).
22. See https://dl.sony.com/ for the Neural Network Console (NNC) software.
23. S. Ioffe and C. Szegedy, arXiv:1502.03167 (2015).
24. D. P. Kingma and J. Ba, arXiv:1412.6980 (2014).
