During the Covid-19 pandemic, internet usage in Indonesia increased sharply: the time adults spent online reportedly rose by 52% compared with before the pandemic. The easiest way to get the latest information about the pandemic is through online news portals. However, readers often skim for unusual or bombastic headlines and then click through to the full story. Headlines written to provoke such clicks are called clickbait, a term for website content, in the form of news, advertisements, or services, that is intended to attract attention and encourage visitors to follow links to certain web pages. This study aims to automatically classify clickbait news on online portals using random forest, a statistical model that consists of a collection of decision trees. Because the data are imbalanced, the researchers employ SMOTE to balance the classes by generating synthetic data for the minority class. The data set comprises 2,028 news items about Covid-19, each with a headline and content. Preprocessing includes case folding, removal of numbers and punctuation, stop-word removal, and lemmatization. The text was transformed into a numeric representation using TF-IDF, computed separately for headlines and content. The best weighting of the headline and content TF-IDF features, applied before concatenation, is 70%:30%. With an 80%:20% split between training and testing data, the model achieves an accuracy of 80%, a precision of 99%, and a recall of 63%.
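The feature construction and oversampling steps named in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the toy documents, the labels, the smoothed IDF variant, and the helper names `tfidf`, `weighted_concat`, and `smote` are all assumptions made for the example, and the random forest classifier itself is left out. The sketch weights headline and content TF-IDF vectors 70%:30% before concatenating them, then creates synthetic minority rows by interpolating between a minority row and one of its nearest minority neighbours, which is the core idea of SMOTE.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows): one TF-IDF vector per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency (one common variant, an assumption).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = max(len(doc), 1)  # avoid division by zero for empty documents
        rows.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, rows


def weighted_concat(headline_rows, content_rows, w_headline=0.7):
    """Scale headline and content vectors, then join them row by row."""
    w_content = 1.0 - w_headline
    return [
        [w_headline * v for v in h] + [w_content * v for v in c]
        for h, c in zip(headline_rows, content_rows)
    ]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base row, excluding itself.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(row, base),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


# Hypothetical, already preprocessed (lowercased, lemmatized) toy documents.
headlines = [
    ["covid", "cases", "rise"],
    ["vaccine", "rollout", "begin"],
    ["hospital", "report", "covid"],
    ["shocking", "covid", "secret"],
    ["believe", "vaccine", "trick"],
]
contents = [
    ["health", "ministry", "report", "new", "cases"],
    ["government", "begin", "vaccine", "distribution"],
    ["hospital", "capacity", "report", "release"],
    ["article", "repeat", "known", "health", "advice"],
    ["article", "describe", "standard", "vaccine", "schedule"],
]
labels = [0, 0, 0, 1, 1]  # hypothetical labels; class 1 is the minority here

h_vocab, h_rows = tfidf(headlines)
c_vocab, c_rows = tfidf(contents)
features = weighted_concat(h_rows, c_rows, w_headline=0.7)

minority = [row for row, y in zip(features, labels) if y == 1]
n_needed = labels.count(0) - labels.count(1)
synthetic = smote(minority, n_new=n_needed, k=1)

balanced_X = features + synthetic
balanced_y = labels + [1] * n_needed
print(len(balanced_X), Counter(balanced_y))
```

In practice, oversampling of this kind is normally applied to the training split only, so that the test split still reflects the original class distribution. The balanced feature matrix would then be passed to a random forest classifier.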
25 May 2023
3rd INTERNATIONAL SEMINAR ON SCIENCE AND TECHNOLOGY (ISSTEC) 2021: Science, Technology and Data Analysis for Sustainable Future
30 November 2021
Yogyakarta, Indonesia
Research Article
Classification analysis of clickbait news using random forest and SMOTE methods: Online mass media news about Covid-19
Shaula Andreinna Arthamevia a)
Department of Statistics, Universitas Islam Indonesia, Yogyakarta, Indonesia
Arum Handini Primandari b)
Department of Statistics, Universitas Islam Indonesia, Yogyakarta, Indonesia
b) Corresponding author: [email protected]
AIP Conf. Proc. 2720, 020001 (2023)
Citation
Shaula Andreinna Arthamevia, Arum Handini Primandari; Classification analysis of clickbait news using random forest and SMOTE methods: Online mass media news about Covid-19. AIP Conf. Proc. 25 May 2023; 2720 (1): 020001. https://doi.org/10.1063/5.0136924