Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)

Handling Numerical Features on Dataset Using Gauss Density Formula and Data Discretization Toward Naïve Bayes Algorithm

Authors
Mochammad YUSA, Ernawati ERNAWATI, Yudi SETIAWAN, Desi ADRESWARI
Corresponding Author
Mochammad YUSA
Available Online 6 May 2020.
DOI
10.2991/aisr.k.200424.072
Keywords
Gauss Density, data discretization, Naive Bayes, data mining, classifiers
Abstract

Naïve Bayes is one of the best classifiers in data mining, and the algorithm is also used in several research areas. Besides performing well, it can handle both numerical and categorical data values. This paper presents two ways of treating numerical features as a pre-processing step before applying the Naïve Bayes algorithm to classify a dataset. The first way applies the Gauss Density Formula. In the second way, the numerical features are categorized manually with the involvement of experts. The study starts by collecting data in which the majority of attributes are numerical. The dataset is then treated using the first way and the second way. We validate the performance of the algorithm using 10-fold cross-validation. The performance measures considered in this research are accuracy, precision, and recall. The results show that treating numerical features with the Gauss Density technique outperforms discretizing the numerical features into nominal values. The first way obtains 80% accuracy, 80.61% average precision, and 80.41% average recall, while the second way reaches 65% accuracy, 63.95% average precision, and 66.43% average recall.
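
The two treatments compared in the abstract can be illustrated with a minimal sketch. It assumes the standard normal density for the Gauss Density Formula and a simple threshold lookup for discretization. The function names, bin edges, labels, and input values are hypothetical and are not taken from the paper or its dataset.

```python
import math


def gaussian_density(x, mean, std):
    """Normal probability density of x, given a class-wise mean and standard deviation."""
    variance = std ** 2
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def discretize(x, edges, labels):
    """Return the label of the first bin whose upper edge is >= x, else the last label."""
    for edge, label in zip(edges, labels):
        if x <= edge:
            return label
    return labels[-1]


# Hypothetical values for illustration only (not from the paper's dataset).
# First way: the density replaces the frequency-based likelihood in Naive Bayes.
density = gaussian_density(5.0, mean=5.0, std=1.0)

# Second way: the numerical value becomes a nominal category before training.
# In the paper the categories are defined by experts; these edges are made up.
category = discretize(42.0, edges=[30.0, 60.0], labels=["low", "medium", "high"])
```

In the first treatment, the density would be computed per feature and per class from the training data. In the second, the resulting category would be counted like any other nominal attribute.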

Copyright
© 2020, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
Series
Advances in Intelligent Systems Research
Publication Date
6 May 2020
ISBN
10.2991/aisr.k.200424.072
ISSN
1951-6851
DOI
10.2991/aisr.k.200424.072

Cite this article

TY  - CONF
AU  - Mochammad YUSA
AU  - Ernawati ERNAWATI
AU  - Yudi SETIAWAN
AU  - Desi ADRESWARI
PY  - 2020
DA  - 2020/05/06
TI  - Handling Numerical Features on Dataset Using Gauss Density Formula and Data Discretization Toward Naïve Bayes Algorithm
BT  - Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)
PB  - Atlantis Press
SP  - 467
EP  - 473
SN  - 1951-6851
UR  - https://doi.org/10.2991/aisr.k.200424.072
DO  - 10.2991/aisr.k.200424.072
ID  - YUSA2020
ER  -