Research on Web Character Information Extraction Based on Semantic Similarity

Baocheng Wang; Wei Huang; Zhongren Li; Ke Xiao

doi:10.2991/ceie-16.2017.85

Corresponding Author

Baocheng Wang

Available Online October 2016.

DOI: 10.2991/ceie-16.2017.85
Keywords: Semantic Similarity; Character Information Extraction; Machine Learning
Abstract: Information extraction over large volumes of data tends to lose comprehensiveness. To improve the comprehensiveness of character information extraction from massive network data, this paper proposes an extraction method based on a semantic similarity algorithm. The algorithm is applied to a semantic tree to select synonyms of each feature word, and the character feature set extended through semantic similarity is then used for character information extraction. The results show that recall reaches 81.87% while the accuracy rate remains basically unchanged. The method therefore clearly improves the comprehensiveness of character information extraction and can be applied to network data.
Copyright: © 2017, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
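
The abstract describes extending a character feature set with synonyms selected from a semantic tree by semantic similarity. The following is a minimal Python sketch of that general idea only. The toy tree `PARENT`, the path-length similarity formula, the threshold value, and the function names `similarity` and `expand_features` are all illustrative assumptions; the abstract does not specify the tree, the similarity measure, or the threshold used in the paper.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy semantic tree, stored as child -> parent (root has None).
# The paper's actual semantic tree is not given in the abstract.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "person_attribute": "entity",
    "occupation": "person_attribute",
    "job": "occupation",
    "profession": "occupation",
    "career": "occupation",
    "origin": "person_attribute",
    "birthplace": "origin",
    "hometown": "origin",
    "object": "entity",
    "vehicle": "object",
}

# Leaf nodes play the role of candidate feature words in this sketch.
LEAVES: Set[str] = set(PARENT) - {p for p in PARENT.values() if p is not None}


def path_to_root(word: str) -> List[str]:
    """Return the list of nodes from `word` up to the root, inclusive."""
    path = []
    node: Optional[str] = word
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def similarity(a: str, b: str) -> float:
    """Assumed measure: 1 / (1 + tree distance); 0.0 for unknown words."""
    if a not in PARENT or b not in PARENT:
        return 0.0
    steps_from_b = {node: i for i, node in enumerate(path_to_root(b))}
    for steps_from_a, node in enumerate(path_to_root(a)):
        if node in steps_from_b:  # first shared ancestor
            return 1.0 / (1.0 + steps_from_a + steps_from_b[node])
    return 0.0


def expand_features(seeds: Set[str], threshold: float = 0.3) -> Set[str]:
    """Add every leaf word whose similarity to some seed meets the threshold."""
    expanded = set(seeds)
    for seed in seeds:
        for candidate in LEAVES:
            if similarity(seed, candidate) >= threshold:
                expanded.add(candidate)
    return expanded


if __name__ == "__main__":
    print(sorted(expand_features({"job"})))
```

In this sketch, a seed feature word such as "job" pulls in its siblings under the same concept node, so text that uses a synonym can still match the feature set. That is the mechanism by which an extended feature set could raise recall; the threshold controls how far from the seed the expansion reaches.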

Volume Title: Proceedings of the International Conference on Communication and Electronic Information Engineering (CEIE 2016)
Series: Advances in Engineering Research
Publication Date: October 2016
ISBN: 978-94-6252-312-8
ISSN: 2352-5401

Cite this article

TY  - CONF
AU  - Baocheng Wang
AU  - Wei Huang
AU  - Zhongren Li
AU  - Ke Xiao
PY  - 2016/10
DA  - 2016/10
TI  - Research on Web Character Information Extraction Based on Semantic Similarity
BT  - Proceedings of the International Conference on Communication and Electronic Information Engineering (CEIE 2016)
PB  - Atlantis Press
SP  - 663
EP  - 670
SN  - 2352-5401
UR  - https://doi.org/10.2991/ceie-16.2017.85
DO  - 10.2991/ceie-16.2017.85
ID  - Wang2016/10
ER  -
