Extracting Latent Topics from User Reviews Using Online LDA
- DOI
- 10.2991/icitme-18.2018.41
- Keywords
- natural language processing; topic model; latent Dirichlet allocation; Yelp reviews
- Abstract
As local business directory services such as Dianping.com and Yelp.com grow in popularity, user reviews play an increasingly important role in informing customers about product and service quality. The reviews can also provide meaningful insights to business owners. However, online user reviews arrive in huge volumes as unstructured, high-dimensional text, and they touch on many different latent topics, so manually pinpointing customer demands from a large and continually growing body of reviews is intractable. The goal of this paper is to help businesses discover user demands from enormous volumes of high-dimensional review text, which in turn can help them improve their operations. To this end, we propose using online Latent Dirichlet Allocation (LDA) as the topic model to discover latent topics from user reviews. We used the open dataset from the Yelp Dataset Challenge and further cleaned and filtered it to focus on reviews of restaurants in Phoenix, Arizona, US. By running online LDA over the cleaned dataset, we discovered 50 latent topics. In this paper, we present the breakdown of latent topics over all reviews and the word distribution of each topic. The method adopted in this paper could also prove useful to individual business owners in discovering user demands and points of interest.
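The paper does not publish its implementation. As a minimal sketch of the online LDA step described in the abstract, the snippet below uses scikit-learn's LatentDirichletAllocation with learning_method="online" over a hypothetical handful of cleaned review strings; the sample reviews, vectorizer settings, and batch size are assumptions for illustration, not the authors' actual preprocessing or parameters (only the 50-topic setting comes from the paper).

```python
# Illustrative sketch only: online (mini-batch) LDA over cleaned review text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical corpus of cleaned restaurant reviews (one string per review).
reviews = [
    "great service and friendly staff",
    "the pizza was cold and the wait was long",
    "amazing brunch menu, will come back for the pancakes",
]

# Bag-of-words representation; stop-word removal stands in for the cleaning step.
vectorizer = CountVectorizer(stop_words="english", max_df=0.95, min_df=1)
X = vectorizer.fit_transform(reviews)

# Online variational Bayes with 50 topics, as reported in the paper.
lda = LatentDirichletAllocation(
    n_components=50,
    learning_method="online",
    batch_size=128,
    random_state=0,
)
lda.fit(X)

# Inspect the word distribution of the first few topics.
terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_[:5]):
    top_words = [terms[i] for i in weights.argsort()[::-1][:8]]
    print(f"Topic {topic_idx}: {', '.join(top_words)}")
```

In practice the per-review topic proportions from lda.transform(X) would give the breakdown of latent topics over all reviews that the paper reports.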
- Copyright
- © 2018, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY  - CONF
AU  - Zilin Wang
PY  - 2018/08
DA  - 2018/08
TI  - Extracting Latent Topics from User Reviews Using Online LDA
BT  - Proceedings of the 2018 International Conference on Information Technology and Management Engineering (ICITME 2018)
PB  - Atlantis Press
SP  - 204
EP  - 208
SN  - 1951-6851
UR  - https://doi.org/10.2991/icitme-18.2018.41
DO  - 10.2991/icitme-18.2018.41
ID  - Wang2018/08
ER  -