Computer-aided Off-topic Composition Detection
- DOI
- 10.2991/iccese-18.2018.37
- Keywords
- off-topic composition detection; vector space model (VSM); latent Dirichlet allocation (LDA); semantic relations between words
- Abstract
To address the lack of an accurate and efficient off-topic detection algorithm in current English composition teaching systems in China, this paper proposes an off-topic detection algorithm based on LDA and word2vec. The algorithm models the documents with LDA and trains the model with word2vec to obtain the semantic relations between a document's topics and words, then calculates the probability-weighted sum of each topic and its feature words in the document. Finally, off-topic compositions are selected by setting a reasonable threshold. The optimum number of topics was determined experimentally by comparing the F-measures obtained for different numbers of topics. The experimental results show that the proposed method, with an F-measure above 89%, is more effective than the traditional vector space model and can automate the detection of off-topic compositions, which may be applied effectively in the teaching of English composition.
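The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes the score is the sum over topics of P(topic | document) × P(word | topic) × cosine similarity between the feature word and a vector for the composition prompt. The names `relevance`, `is_off_topic`, `prompt_vec`, and `THRESHOLD`, the toy two-dimensional embeddings (standing in for word2vec vectors), and all probabilities are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relevance(topic_probs, topic_words, prompt_vec, embeddings):
    """Probability-weighted sum of word-to-prompt similarities (assumed formula).

    topic_probs: P(topic | document) for each topic, as LDA would estimate.
    topic_words: per topic, a list of (feature word, P(word | topic)) pairs.
    prompt_vec:  embedding representing the assigned composition topic.
    embeddings:  word -> vector lookup, standing in for a word2vec model.
    """
    score = 0.0
    for p_topic, words in zip(topic_probs, topic_words):
        for word, p_word in words:
            vec = embeddings.get(word)
            if vec is not None:  # skip words missing from the vocabulary
                score += p_topic * p_word * cosine(vec, prompt_vec)
    return score


def is_off_topic(score, threshold):
    """Flag a composition whose relevance score falls below the threshold."""
    return score < threshold


# Toy data, invented for illustration only.
embeddings = {
    "school": [1.0, 0.0],
    "teacher": [0.9, 0.1],
    "football": [0.0, 1.0],
    "goal": [0.1, 0.9],
}
prompt_vec = embeddings["school"]  # the assignment is about school
topic_words = [
    [("teacher", 0.6), ("school", 0.4)],   # topic 0: school-related words
    [("football", 0.7), ("goal", 0.3)],    # topic 1: sports-related words
]

on_topic_score = relevance([0.9, 0.1], topic_words, prompt_vec, embeddings)
off_topic_score = relevance([0.1, 0.9], topic_words, prompt_vec, embeddings)
THRESHOLD = 0.5  # hypothetical; the paper tunes its threshold experimentally
```

With these toy values, the composition dominated by the school topic scores well above the threshold and the one dominated by the sports topic falls below it. In practice the topic distributions and word vectors would come from trained LDA and word2vec models, and the threshold and number of topics would be chosen by comparing F-measures, as the abstract describes.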
- Copyright
- © 2018, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Qiang Qu
AU - Yahui Zhao
AU - Rongyi Cui
PY - 2018/03
DA - 2018/03
TI - Computer-aided Off-topic Composition Detection
BT - Proceedings of the 2nd International Conference on Culture, Education and Economic Development of Modern Society (ICCESE 2018)
PB - Atlantis Press
SP - 155
EP - 158
SN - 2352-5398
UR - https://doi.org/10.2991/iccese-18.2018.37
DO - 10.2991/iccese-18.2018.37
ID - Qu2018/03
ER -