Connecting Chinese Users Across Social Media Sites
- 10.2991/ic3me-15.2015.245
- virtual identity; Chinese; username; big data; pinyin.
Social network usernames have proved useful in research on social identity linkage, especially English usernames. However, how to connect Chinese user identities properly by matching usernames remains to be explored. A Chinese user may choose or change a username in different ways (e.g. using a Chinese username or translating it into English, using simplified or traditional Chinese characters, or replacing some words in a username with homophones), so connecting Chinese users by matching usernames is more difficult than connecting English users. This paper proposes a language mapping method that translates the different types of Chinese words in a given username into their corresponding Pinyin words. Because the number of user identities in a social network can be very large, matching usernames between two social networks is very costly; we use the Hadoop and Spark frameworks to address this efficiency problem. We also study various username matching algorithms and identify the features that are useful in Chinese username matching.
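The idea of mapping username variants to Pinyin before matching can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny `PINYIN` table, the `to_pinyin` and `similarity` helpers, and the use of `difflib.SequenceMatcher` as the string similarity measure are all assumptions chosen for illustration. A real system would need a complete character-to-Pinyin dictionary and would run the comparison at scale (the abstract mentions Hadoop and Spark for that purpose).

```python
from difflib import SequenceMatcher

# Hypothetical toy table. A real system would need a full
# character-to-Pinyin dictionary covering both scripts.
PINYIN = {
    "张": "zhang",  # simplified
    "張": "zhang",  # traditional form of the same character
    "章": "zhang",  # a homophone of the two above
    "伟": "wei",    # simplified
    "偉": "wei",    # traditional
    "小": "xiao",
    "明": "ming",
}


def to_pinyin(username: str) -> str:
    """Replace each known Chinese character with its Pinyin spelling.

    Characters missing from the table (Latin letters, digits, ...) are
    kept and lowercased, so 'ZhangWei' and '张伟' normalize identically.
    """
    return "".join(PINYIN.get(ch, ch.lower()) for ch in username)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two usernames after Pinyin mapping."""
    return SequenceMatcher(None, to_pinyin(a), to_pinyin(b)).ratio()


if __name__ == "__main__":
    pairs = [("张伟", "張偉"), ("张伟", "ZhangWei"), ("章伟88", "张伟"), ("小明", "张伟")]
    for a, b in pairs:
        print(a, b, round(similarity(a, b), 3))
```

Under these assumptions, simplified, traditional, homophonic, and romanized forms of the same name collapse to one normalized string, so an ordinary string similarity measure can compare them directly.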
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Yanan Li
AU - Junxing Zhu
AU - Zhongcheng Zhou
AU - Bin Zhou
AU - Xiaobo Wu
PY - 2015/08
DA - 2015/08
TI - Connecting Chinese Users Across Social Media Sites
BT - Proceedings of the 3rd International Conference on Material, Mechanical and Manufacturing Engineering
PB - Atlantis Press
SP - 1273
EP - 1279
SN - 2352-5401
UR - https://doi.org/10.2991/ic3me-15.2015.245
DO - 10.2991/ic3me-15.2015.245
ID - Li2015/08
ER -