A New Text Classifier Based on Random Forests
- DOI
- 10.2991/meita-16.2017.60
- Keywords
- Classifiers; Text processing; Machine learning; Learning algorithms
- Abstract
Various ensemble classification methods have been proposed in recent years, and they have been shown to improve classification accuracy considerably. One of the most widely used ensemble methods is Random Forests, an ensemble of CART trees built with bagging (bootstrap aggregating). In this paper, the use of the Random Forests classifier for text classification is explored. We compare the accuracy of the Random Forests classifier with that of other existing, freely available methods on Reuters-21578, the standard text test collection. The results show that the model can be applied to text classification. Compared with text classification models based on CART, REPTree, and J48, the model based on Random Forests performed best, reaching an F1-measure of 0.777. The Random Forests text classification model is convenient, intuitive, and effective, and its evaluation results are reliable. It offers a new direction for research on text classification.
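The bagging idea and the F1-measure mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the six toy documents, the function names, and the one-word "stump" that stands in for a full CART tree are all assumptions made for illustration. Each base learner is trained on a bootstrap sample and sees only a random subset of words, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter

# Hypothetical toy corpus: (set of words, label), 1 = "earnings", 0 = other.
DATA = [
    ({"profit", "dividend", "quarter"}, 1),
    ({"net", "profit", "shr"}, 1),
    ({"dividend", "payout"}, 1),
    ({"wheat", "crop", "export"}, 0),
    ({"oil", "crude", "barrel"}, 0),
    ({"crop", "harvest", "tonnes"}, 0),
]


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (the 'bootstrap' in bagging)."""
    return [rng.choice(data) for _ in data]


def train_stump(sample, rng, n_features):
    """Toy stand-in for a CART tree: a single word-presence test.

    Only a random subset of the vocabulary is considered, mimicking the
    random feature selection used by Random Forests.
    """
    vocab = sorted({word for doc, _ in sample for word in doc})
    candidates = rng.sample(vocab, min(n_features, len(vocab)))
    best = None
    for word in candidates:
        for present_label in (0, 1):
            correct = sum(
                (present_label if word in doc else 1 - present_label) == label
                for doc, label in sample
            )
            if best is None or correct > best[0]:
                best = (correct, word, present_label)
    return best[1], best[2]


def predict_stump(stump, doc):
    word, present_label = stump
    return present_label if word in doc else 1 - present_label


def train_forest(data, n_trees, n_features, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [
        train_stump(bootstrap_sample(data, rng), rng, n_features)
        for _ in range(n_trees)
    ]


def predict_forest(forest, doc):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, doc) for stump in forest)
    return votes.most_common(1)[0][0]


def f1_measure(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    forest = train_forest(DATA, n_trees=25, n_features=3)
    predictions = [predict_forest(forest, doc) for doc, _ in DATA]
    print("F1 on the toy training data:", f1_measure([y for _, y in DATA], predictions))
```

A real experiment would replace the stump with a full decision tree learner and evaluate on a held-out split of Reuters-21578 rather than on the training documents.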
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Xin Luo
PY - 2017/02
DA - 2017/02
TI - A New Text Classifier Based on Random Forests
BT - Proceedings of the 2016 2nd International Conference on Materials Engineering and Information Technology Applications (MEITA 2016)
PB - Atlantis Press
SP - 290
EP - 293
SN - 2352-5401
UR - https://doi.org/10.2991/meita-16.2017.60
DO - 10.2991/meita-16.2017.60
ID - Luo2017/02
ER -