This paper compares Random Forest and AdaBoost classifiers, each combined with resampling, for modeling imbalanced data on late tuition fee payments. To obtain more balanced data, we apply Random Undersampling (RUS), Random Oversampling (ROS), and the Synthetic Minority Oversampling Technique (SMOTE). The data are late tuition fee payment records from the IPB undergraduate program with regular admission from 2016 to 2018. The results show that the best Random Forest classifier uses seven explanatory variables and 500 trees with the ROS method, while the best AdaBoost classifier uses 80 iterations with the RUS method. The Random Forest-ROS and AdaBoost-RUS classifiers achieve ROC-AUC values of 58.70% and 52.90%, respectively, indicating that Random Forest-ROS predicts better than AdaBoost-RUS. The important variables for predicting late tuition fee payment are the household’s electric capacity, the father’s income, and the number of children in the family.
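The two random resampling schemes named in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the toy data (90 on-time and 10 late records with a single made-up feature), the function names, and the fixed seed are all assumptions made for illustration. SMOTE is omitted because it generates synthetic minority records by interpolation rather than by simple row selection.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        # Sample without replacement so no record is repeated.
        pairs.extend((row, label) for row in rng.sample(group, target))
    rng.shuffle(pairs)
    return [row for row, _ in pairs], [label for _, label in pairs]


def random_oversample(rows, labels, seed=0):
    """ROS: duplicate random rows of each class until it matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        # Draw the extra rows with replacement from the original records.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        pairs.extend((row, label) for row in group + extra)
    rng.shuffle(pairs)
    return [row for row, _ in pairs], [label for _, label in pairs]


# Hypothetical toy data: label 1 = late payment (minority), 0 = on time.
rows = [[float(i)] for i in range(100)]
labels = [0] * 90 + [1] * 10

rus_rows, rus_labels = random_undersample(rows, labels)
ros_rows, ros_labels = random_oversample(rows, labels)

print(sorted(Counter(rus_labels).items()))
print(sorted(Counter(ros_labels).items()))
```

On this toy data, RUS leaves 10 records per class and ROS leaves 90 per class. In practice, resampling is normally applied to the training split only, so that the evaluation data keep the original class ratio.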
