Mining micro-blogging users’ interest features via fingerprint generation
- DOI
- 10.2991/iccsee.2013.370
- Keywords
- micro-blogging, interest feature, tweet, fingerprint
- Abstract
Micro-blogging is now widely used as a social network service for communication and information sharing, so mining the behavior features of micro-blogging users matters in both the economic and social fields. This paper proposes a framework for analyzing a user’s interest features. After data cleaning, word segmentation, POS (part-of-speech) filtering, and synonym merging, keywords, called terms, are extracted from all the tweets posted by a typical user in 2011. A VSM (vector space model) is then used to generate the feature vector of the tweets from these terms. Next, a k-bit binary string called a fingerprint is generated from the high-dimensional feature vector of the tweets using the Simhash algorithm. The user’s interest features and their change patterns can be detected by analyzing the fingerprint sequence and the distance between each pair of adjacent fingerprints. Using Sina micro-blogging as the background, a series of experiments demonstrates the effectiveness of the algorithms.
- Copyright
- © 2013, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Dong Liu
AU - Quanyuan Wu
AU - Weihong Han
PY - 2013/03
DA - 2013/03
TI - Mining micro-blogging users’ interest features via fingerprint generation
BT - Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013)
PB - Atlantis Press
SP - 1469
EP - 1472
SN - 1951-6851
UR - https://doi.org/10.2991/iccsee.2013.370
DO - 10.2991/iccsee.2013.370
ID - Liu2013/03
ER -
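The fingerprint step summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes raw term frequency as the VSM weight, MD5 as the per-term hash (so k must not exceed 128), and Hamming distance as the distance between adjacent fingerprints. The names `term_hash`, `simhash`, `hamming`, and `fingerprint_sequence` are hypothetical, and the preprocessing steps (cleaning, segmentation, POS filtering, synonym merging) are assumed to have already produced the term lists.

```python
import hashlib
from collections import Counter
from typing import List, Mapping, Sequence


def term_hash(term: str, k: int = 64) -> int:
    """Map a term to a stable k-bit integer (assumption: MD5, k <= 128)."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << k) - 1)


def simhash(weights: Mapping[str, float], k: int = 64) -> int:
    """Collapse a weighted term vector into a k-bit Simhash fingerprint."""
    acc = [0.0] * k
    for term, w in weights.items():
        h = term_hash(term, k)
        for i in range(k):
            # Add the weight where the term's hash bit is 1, subtract it otherwise.
            acc[i] += w if (h >> i) & 1 else -w
    fp = 0
    for i, v in enumerate(acc):
        if v > 0:  # a positive sum sets bit i of the fingerprint
            fp |= 1 << i
    return fp


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def fingerprint_sequence(term_lists: Sequence[Sequence[str]], k: int = 64) -> List[int]:
    """One fingerprint per period, weighting terms by raw frequency (assumption)."""
    return [simhash(Counter(terms), k) for terms in term_lists]


# Illustrative term lists for three consecutive periods (made-up data).
periods = [
    ["football", "match", "football", "goal"],
    ["football", "match", "football", "goal"],
    ["stock", "market", "fund", "stock"],
]
fps = fingerprint_sequence(periods)
distances = [hamming(a, b) for a, b in zip(fps, fps[1:])]
```

Under these assumptions, identical term lists give identical fingerprints, so the first entry of `distances` is 0; a larger distance between adjacent periods would be read as a shift in the user's interests.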