During the Covid-19 pandemic, internet usage in Indonesia increased sharply. Time spent online by adults reportedly rose by 52% compared with before the pandemic. The easiest way to get the latest information about the pandemic is to access online news portals. However, readers often skim an unusual, sensational headline and then click through to the full story, a phenomenon known as clickbait. Clickbait refers to website content in the form of news, advertisements, or services that is intended to attract attention and encourage visitors to click on links to certain web pages. This study aims to automatically classify clickbait news on online portals using random forest, a statistical model made up of a collection of decision trees. Because the data are imbalanced, the researchers employ SMOTE to enlarge the minority class by generating synthetic data. The data set consists of 2,028 Covid-19 news items, each with a headline and body content. Preprocessing includes case folding, removing numbers and punctuation, removing stop words, and lemmatization. The text is transformed into a numeric representation using TF-IDF, computed separately for the headline and the content. The best weighting of the headline and content TF-IDF vectors, applied before concatenation, is 70%:30%. With an 80%:20% split between training and testing data, the model achieves an accuracy of 80%, a precision of 99%, and a recall of 63%.
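The feature-construction and balancing steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy headlines, contents, and labels are invented for illustration, the smoothed IDF formula is an assumed variant, tokenization is a plain whitespace split rather than the full preprocessing, and `smote` is a simplified version of the interpolation idea behind SMOTE. The random forest itself is not shown.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows) of L2-normalised TF-IDF vectors for tokenised docs."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    # Assumed smoothed IDF variant: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        rows.append([v / norm for v in vec])
    return vocab, rows


def weighted_concat(head_rows, body_rows, w_head=0.7, w_body=0.3):
    """Scale headline and content vectors by their weights, then concatenate."""
    return [
        [w_head * v for v in h] + [w_body * v for v in b]
        for h, b in zip(head_rows, body_rows)
    ]


def smote(minority, n_new, k=1, seed=0):
    """Simplified SMOTE: interpolate between a minority point and a near neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        # k nearest minority neighbours by squared Euclidean distance
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        nb = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Hypothetical toy data (1 = clickbait, 0 = not clickbait).
headlines = [
    "you will not believe this covid cure",
    "ministry reports new covid cases today",
    "government extends covid restrictions in jakarta",
    "this simple trick stops covid instantly",
    "hospital capacity rises in west java",
    "vaccine shipment arrives in surabaya",
]
contents = [
    "a viral post claims a herbal drink cures covid",
    "the ministry recorded new confirmed cases on monday",
    "restrictions in jakarta are extended for two weeks",
    "an online video claims a trick prevents infection",
    "bed occupancy in west java hospitals increased this week",
    "a new vaccine shipment arrived at the surabaya airport",
]
labels = [1, 0, 0, 1, 0, 0]

# TF-IDF is computed separately for headline and content, then weighted 70:30.
_, head_rows = tfidf([h.lower().split() for h in headlines])
_, body_rows = tfidf([c.lower().split() for c in contents])
features = weighted_concat(head_rows, body_rows)

# Oversample whichever class is smaller until both classes have equal size.
counts = Counter(labels)
minority_label = min(counts, key=counts.get)
minority = [x for x, y in zip(features, labels) if y == minority_label]
n_needed = max(counts.values()) - len(minority)
synthetic = smote(minority, n_new=n_needed, k=1)

balanced_X = features + synthetic
balanced_y = labels + [minority_label] * len(synthetic)
print(Counter(balanced_y))
```

In a fuller pipeline, `balanced_X` and `balanced_y` would be passed to a random forest classifier. Oversampling is usually applied to the training split only, so that synthetic points do not leak into the test data. The abstract does not state the order in which the split and SMOTE were applied, so that ordering is an assumption here.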
