Proceedings of the 2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)

A Chinese Word Clustering Method Using Latent Dirichlet Allocation and K-means

Authors
Lin Qiu, Jungang Xu
Corresponding Author
Lin Qiu
Available Online July 2013.
DOI
https://doi.org/10.2991/cse.2013.60
Keywords
word clustering; latent Dirichlet allocation; k-means; word similarity
Abstract
Word clustering is a widely studied problem in natural language processing. In this paper, the Latent Dirichlet Allocation (LDA) algorithm is used to extract topics from the nouns in a text, and the highest-probability noun of each topic is selected as an initial centroid for the k-means algorithm. Experimental results show that this method performs better than graph-based word clustering algorithms that use a web search engine.
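The seeding step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a topic-word probability matrix has already been produced by LDA (the values below are invented for illustration), and it assumes each word is represented by its column of that matrix normalized to sum to 1, a choice the abstract does not specify. The names seed_indices, word_vectors and kmeans, and the toy English vocabulary, are hypothetical.

```python
import math

# Toy vocabulary and topic-word probabilities (illustrative values, not from the paper).
vocab = ["apple", "banana", "pear", "car", "bus", "train"]
topic_word = [
    [0.40, 0.30, 0.25, 0.02, 0.02, 0.01],  # topic 0
    [0.01, 0.02, 0.02, 0.45, 0.30, 0.20],  # topic 1
]

def seed_indices(topic_word):
    """Return the index of the highest-probability word of each topic."""
    seeds = []
    for row in topic_word:
        ranked = sorted(range(len(row)), key=lambda i: -row[i])
        # Skip a word already chosen by an earlier topic so seeds stay distinct.
        seeds.append(next(i for i in ranked if i not in seeds))
    return seeds

def word_vectors(topic_word):
    """Assumed representation: each word's topic column, normalized to sum to 1."""
    vectors = []
    for column in zip(*topic_word):
        total = sum(column)
        vectors.append([p / total for p in column])
    return vectors

def kmeans(vectors, seeds, max_iter=20):
    """Plain k-means whose initial centroids are the vectors of the seed words."""
    centroids = [list(vectors[i]) for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assign every word to its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its assigned vectors.
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

seeds = seed_indices(topic_word)
labels = kmeans(word_vectors(topic_word), seeds)
# Name each cluster after its seed word.
clusters = {
    vocab[s]: [w for w, l in zip(vocab, labels) if l == c]
    for c, s in enumerate(seeds)
}
print(clusters)
```

Seeding k-means this way makes the number of clusters equal to the number of topics and removes the randomness of the usual initialization. The distance measure and word representation used in the paper may differ from the ones assumed here.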
Open Access
This is an open access article distributed under the CC BY-NC license.

Proceedings
2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)
Part of series
Advances in Intelligent Systems Research
Publication Date
July 2013
ISBN
978-90786-77-70-3
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Lin Qiu
AU  - Jungang Xu
PY  - 2013/07
DA  - 2013/07
TI  - A Chinese Word Clustering Method Using Latent Dirichlet Allocation and K-means
BT  - 2nd International Conference on Advances in Computer Science and Engineering (CSE 2013)
PB  - Atlantis Press
SP  - 269
EP  - 272
SN  - 1951-6851
UR  - https://doi.org/10.2991/cse.2013.60
DO  - https://doi.org/10.2991/cse.2013.60
ID  - Qiu2013/07
ER  -