Data mining is the process of collecting and analyzing information to find important patterns in a data set stored in a database, so that the result becomes discovered knowledge. Classification is a technique for grouping data according to how they relate to sample data. Naïve Bayes is a data mining classification technique based on probability and is better known as the Naïve Bayes Classifier (NBC). Its main characteristic is a strong assumption that the conditions (independent variables) are independent of one another. Applying Naïve Bayes can cause misclassification when the training data are few: a value that appears in the testing data may never occur in the training data, so its estimated probability is zero and the classification process fails. A refinement method is therefore needed to avoid zero probabilities. Laplacian smoothing is a simple smoothing technique that can be used in Naïve Bayes classification. The idea is to add a small positive value to each conditional probability estimate so that no value in the probability model is zero. This article examines Laplacian smoothing in NBC.
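
The zero-probability problem and its correction can be illustrated with a minimal sketch. It assumes the common additive form of the estimate, (count + alpha) / (class size + alpha * k), where k is the number of distinct values of a feature and alpha = 1 gives the usual Laplace correction. The function names, the toy weather data, and this exact formulation are illustrative assumptions and are not taken from the article.

```python
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Count-based categorical Naive Bayes; alpha is the smoothing constant."""
    class_counts = Counter(labels)
    # value_counts[(label, feature index)][value] = number of matching rows
    value_counts = defaultdict(Counter)
    # distinct values seen for each feature across all classes
    feature_values = [set() for _ in rows[0]]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(label, j)][value] += 1
            feature_values[j].add(value)
    return class_counts, value_counts, feature_values, alpha

def conditional(model, label, j, value):
    """Estimate P(feature j = value | label) with additive smoothing."""
    class_counts, value_counts, feature_values, alpha = model
    count = value_counts[(label, j)][value]
    k = len(feature_values[j])
    return (count + alpha) / (class_counts[label] + alpha * k)

def predict(model, row):
    """Return the highest-scoring label and the unnormalized class scores."""
    class_counts = model[0]
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for j, value in enumerate(row):
            score *= conditional(model, label, j, value)
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy data: (outlook, temperature) -> play
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]

unsmoothed = train_nb(rows, labels, alpha=0.0)  # plain relative frequencies
smoothed = train_nb(rows, labels, alpha=1.0)    # Laplace (add-one) smoothing

query = ("sunny", "cool")
print(predict(unsmoothed, query))  # every class score collapses to zero
print(predict(smoothed, query))    # scores stay positive, so classes can be ranked
```

With alpha = 0, the query ("sunny", "cool") contains a value that was never seen with each class, so both class scores become zero and no decision can be made. With alpha = 1, every conditional estimate is strictly positive and the classifier can still rank the classes.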
