Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science) (MIMSE-I-C-2022)

Indonesian SMS Spam Detection Using TF-RF Feature Weighting Method and Support Vector Machine Classifier

Authors
Muhammad Syulhan Ghofany (1), Ramaditia Dwiyansaputra (1, *), Fitri Bimantoro (1), Khairunnas (2)
(1) Department of Informatics Engineering, University of Mataram, Mataram, NTB, Indonesia
(2) Computer Engineering Department, Vistula University, Warszawa, Poland
*Corresponding author. Email: rama@unram.ac.id
Corresponding Author
Ramaditia Dwiyansaputra
Available Online 26 December 2022.
DOI
10.2991/978-94-6463-084-8_12
Keywords
SMS Spam; Text Classification; Supervised Term Weighting; TF-RF; Support Vector Machine
Abstract

SMS spam is an unsolicited or unwanted text message sent to a user's mobile device. Spam SMS, including promotions, fraud, pornographic messages, and others, is increasingly used in criminal acts and annoys recipients. SMS classification therefore needs to be developed to help categorize messages. Existing research addresses this problem with term frequency-inverse document frequency (TF-IDF) feature weighting. However, TF-IDF has a disadvantage: it discards the category information of each document. This study therefore compares it with a supervised term weighting method, term frequency-relevance frequency (TF-RF), using Support Vector Machine (SVM), k-Nearest Neighbor, and Multinomial Naïve Bayes classifiers. The dataset contains 500 SMS messages: 325 non-spam and 175 spam. In the experiments, SVM with the sigmoid kernel achieves the highest average accuracy; its average accuracy exceeds that of the RBF kernel by 2.26%, the linear kernel by 0.09%, k-Nearest Neighbor by 27.56%, and Multinomial Naïve Bayes by 4.37%.
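
To illustrate how TF-RF uses the category information that TF-IDF discards, the following is a minimal sketch of one common formulation of relevance frequency, rf(t) = log2(2 + a / max(1, c)), where a and c are the numbers of positive-category and negative-category documents containing term t. The abstract does not state the exact formula or preprocessing the authors used, so the function names, the choice of "spam" as the positive category, and the toy messages below are all assumptions made for illustration only.

```python
import math
from collections import Counter


def relevance_frequency(docs, labels, positive="spam"):
    """Return rf(t) = log2(2 + a / max(1, c)) for every term t.

    a = number of positive-category documents containing t
    c = number of negative-category documents containing t
    (Assumed formulation; the paper may differ in detail.)
    """
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        target = pos_df if label == positive else neg_df
        target.update(set(tokens))  # document frequency: count each term once per document
    vocab = set(pos_df) | set(neg_df)
    return {t: math.log2(2 + pos_df[t] / max(1, neg_df[t])) for t in vocab}


def tf_rf(tokens, rf):
    """Weight one tokenized message: raw term frequency times rf.

    Terms unseen in training get rf = log2(2) = 1.0 (a = c = 0).
    """
    tf = Counter(tokens)
    return {t: count * rf.get(t, 1.0) for t, count in tf.items()}


# Hypothetical toy corpus, already tokenized (not the paper's data).
docs = [
    ["menang", "hadiah", "gratis"],
    ["hadiah", "pulsa", "transfer"],
    ["rapat", "besok", "pagi"],
    ["besok", "gratis", "makan"],
]
labels = ["spam", "spam", "ham", "ham"]

rf = relevance_frequency(docs, labels)
weights = tf_rf(["hadiah", "hadiah", "besok"], rf)
```

In this toy corpus, a term that appears only in spam messages ("hadiah") receives a larger rf than one that appears only in non-spam messages ("besok"), which is the category signal TF-IDF cannot express. The resulting weight dictionaries could then be vectorized and passed to any of the classifiers named in the abstract.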

Copyright
© 2022 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science) (MIMSE-I-C-2022)
Series
Advances in Computer Science Research
Publication Date
26 December 2022
ISBN
978-94-6463-084-8
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Muhammad Syulhan Ghofany
AU  - Ramaditia Dwiyansaputra
AU  - Fitri Bimantoro
AU  - Khairunnas
PY  - 2022
DA  - 2022/12/26
TI  - Indonesian SMS Spam Detection Using TF-RF Feature Weighting Method and Support Vector Machine Classifier
BT  - Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science) (MIMSE-I-C-2022)
PB  - Atlantis Press
SP  - 117
EP  - 129
SN  - 2352-538X
UR  - https://doi.org/10.2991/978-94-6463-084-8_12
DO  - 10.2991/978-94-6463-084-8_12
ID  - Ghofany2022
ER  -