Proceedings of the 2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)

Chinese Named Entity Extraction System Based On Word2vec Under Spark Platform

Authors
Jialu Yuan, Yongping Xiong
Corresponding Author
Jialu Yuan
Available Online September 2016.
DOI
10.2991/amitp-16.2016.74
Keywords
Spark, word2vec, NER, neural network, machine learning
Abstract

This paper proposes a real-time system that supports Chinese named entity extraction. The system trains a language model with the word2vec algorithm to obtain word vectors, extracts Chinese named entities by calculating the Euclidean distance between word vectors, and ports the algorithm to the Spark platform, using Spark's distributed computing capability to improve training efficiency. First, the system segments the corpus into words with an existing word segmentation tool to obtain a rough corpus. It then trains word2vec on the rough corpus to obtain word vectors and extracts the first layer of named entities with a clustering algorithm. Finally, the system uses the Named Entity Extraction (NEE) algorithm to extract the named entities, and the algorithm is implemented on the Spark platform.
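The distance step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's NEE algorithm, which the abstract does not spell out. It only shows how words whose vectors lie within a Euclidean radius of a known entity could be collected as candidates. The toy vectors, the `radius` threshold, and the function names `euclidean` and `nearest_words` are assumptions made for illustration. In a real pipeline the vectors would come from a trained word2vec model.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_words(seed, vectors, radius):
    """Return (word, distance) pairs within `radius` of the seed word,
    sorted from closest to farthest. The seed itself is excluded."""
    seed_vec = vectors[seed]
    hits = []
    for word, vec in vectors.items():
        if word == seed:
            continue
        dist = euclidean(seed_vec, vec)
        if dist <= radius:
            hits.append((word, dist))
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical 3-dimensional vectors; real word2vec vectors are
# typically a few hundred dimensions and learned from the corpus.
toy_vectors = {
    "北京": [0.90, 0.10, 0.00],
    "上海": [0.80, 0.20, 0.10],
    "广州": [0.85, 0.15, 0.05],
    "吃饭": [0.00, 0.90, 0.40],
    "跑步": [0.10, 0.80, 0.50],
}

# Starting from a known place name, collect nearby words as candidates.
candidates = nearest_words("北京", toy_vectors, radius=0.3)
for word, dist in candidates:
    print(f"{word}\t{dist:.3f}")
```

With these toy values, only the two other place names fall inside the radius, while the verbs are far away. The threshold is arbitrary here and would need tuning on real vectors.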

Copyright
© 2016, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)
Series
Advances in Computer Science Research
Publication Date
September 2016
ISBN
978-94-6252-245-9
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Jialu Yuan
AU  - Yongping Xiong
PY  - 2016/09
DA  - 2016/09
TI  - Chinese Named Entity Extraction System Based On Word2vec Under Spark Platform
BT  - Proceedings of the 2016 4th International Conference on Advanced Materials and Information Technology Processing (AMITP 2016)
PB  - Atlantis Press
SP  - 387
EP  - 394
SN  - 2352-538X
UR  - https://doi.org/10.2991/amitp-16.2016.74
DO  - 10.2991/amitp-16.2016.74
ID  - Yuan2016/09
ER  -