Proceedings of the 2nd Borobudur International Symposium on Science and Technology (BIS-STE 2020)

Comparison of Effectiveness of Stemming Algorithms in Indonesian Documents

Authors
Dyah Mustikasari, Ida Widaningrum, Rizal Arifin, Wahyu Henggal Eka Putri
Corresponding Author
Ida Widaningrum
Available Online 11 August 2021.
DOI
10.2991/aer.k.210810.025
Keywords
Effectiveness, Stemming, Indonesian, Document
Abstract

Stemming is the process of reducing a word to its root form by applying a set of rules. In Bahasa Indonesia, this is done by removing prefixes, infixes, suffixes, or combinations of prefixes and suffixes from derived words. Several stemming algorithms for Bahasa Indonesia have been developed, but their effectiveness has not been studied. This study compares three of them: Sastrawi, the Nazief & Adriani algorithm, and the Arifin Setiono algorithm. We used 900 affixed words for the comparison. Each word was stemmed with all three algorithms, and the resulting root words were checked against the KBBI (the Indonesian dictionary) to determine whether they were correct. Sastrawi performed best, reducing 95.2% of the tested affixed words to root words. The Nazief & Adriani algorithm achieved 92.4%, while the Arifin Setiono algorithm achieved 89%. The Arifin Setiono algorithm therefore needs considerable improvement, because many affixed words were not reduced to their root words.
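The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and not any of the three algorithms compared: `toy_stem`, the affix lists, and the four-word `ROOT_WORDS` set (standing in for the KBBI) are hypothetical, and scoring by dictionary membership is an assumption about how "correct" was judged.

```python
from typing import Callable, Iterable, Set

# Hypothetical miniature root-word list standing in for the KBBI.
ROOT_WORDS: Set[str] = {"makan", "baca", "main", "ajar"}

# Toy affix subsets, chosen only for illustration.
PREFIXES = ("ber", "di", "me")
SUFFIXES = ("kan", "an", "i")


def toy_stem(word: str) -> str:
    """Strip at most one prefix and one suffix; return the first
    candidate found in ROOT_WORDS, or the word unchanged."""
    candidates = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            candidates.append(word[len(prefix):])
    # Try removing one suffix from the word and from each prefix-stripped form.
    for candidate in list(candidates):
        for suffix in SUFFIXES:
            if candidate.endswith(suffix):
                candidates.append(candidate[: -len(suffix)])
    for candidate in candidates:
        if candidate in ROOT_WORDS:
            return candidate
    return word


def accuracy(stemmer: Callable[[str], str],
             words: Iterable[str],
             dictionary: Set[str]) -> float:
    """Percentage of words whose stem appears in the dictionary."""
    words = list(words)
    if not words:
        raise ValueError("no words to evaluate")
    hits = sum(1 for w in words if stemmer(w) in dictionary)
    return 100.0 * hits / len(words)


test_words = ["dimakan", "bacaan", "bermain", "pelajaran"]
print(accuracy(toy_stem, test_words, ROOT_WORDS))
```

In this sketch "pelajaran" is left unstemmed because the toy prefix list has no rule that recovers "ajar", so it counts as a miss. The same `accuracy` function could score any stemmer exposed as a word-to-word function.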

Copyright
© 2021, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2nd Borobudur International Symposium on Science and Technology (BIS-STE 2020)
Series
Advances in Engineering Research
Publication Date
11 August 2021
ISSN
2352-5401

Cite this article

TY  - CONF
AU  - Dyah Mustikasari
AU  - Ida Widaningrum
AU  - Rizal Arifin
AU  - Wahyu Henggal Eka Putri
PY  - 2021
DA  - 2021/08/11
TI  - Comparison of Effectiveness of Stemming Algorithms in Indonesian Documents
BT  - Proceedings of the 2nd Borobudur International Symposium on Science and Technology (BIS-STE 2020)
PB  - Atlantis Press
SP  - 154
EP  - 158
SN  - 2352-5401
UR  - https://doi.org/10.2991/aer.k.210810.025
DO  - 10.2991/aer.k.210810.025
ID  - Mustikasari2021
ER  -