A Hybrid Method to Evaluate Similarity of XML Document
- DOI
- 10.2991/emcs-16.2016.165
- Keywords
- XML; Path; Semantic; Fuzzy Longest common subsequence; Hungarian
- Abstract
XML is an important standard for information representation and data exchange over the Internet, and document classification is an important way to extract useful information from large volumes of data. This paper proposes a method of XML document classification based on fuzzy path matching. First, information that has no influence on classification is removed. Then a hybrid method is used to compute XML document similarity: each XML document is expressed as a collection of paths, recurring paths are deleted and paths are matched fuzzily to improve efficiency, and the Hungarian algorithm is used to calculate the similarity between documents. Finally, two experiments are carried out, and the results show that the method is effective.
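The pipeline summarized in the abstract (path collection, duplicate removal, path matching, Hungarian assignment) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: paths are taken to be root-to-leaf tag sequences, the fuzzy semantic matching is replaced by a plain longest-common-subsequence ratio with exact tag comparison, and the final score is normalized by the size of the larger path set. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Collect distinct root-to-leaf tag paths; a set drops recurring paths."""
    paths = set()

    def walk(node, prefix):
        current = prefix + (node.tag,)
        children = list(node)
        if not children:
            paths.add(current)
        for child in children:
            walk(child, current)

    walk(ET.fromstring(xml_text), ())
    return sorted(paths)


def lcs_length(a, b):
    """Longest-common-subsequence length of two tag sequences."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def path_similarity(a, b):
    # Assumption: exact tag equality stands in for the paper's fuzzy matching.
    return lcs_length(a, b) / max(len(a), len(b))


def hungarian(cost):
    """Minimum-cost assignment for an n x m matrix with n <= m.

    Returns, for each row, the index of the column assigned to it.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (m + 1)          # column potentials
    p = [0] * (m + 1)            # p[j]: row matched to column j (1-based)
    way = [0] * (m + 1)          # predecessor column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return match


def document_similarity(xml_a, xml_b):
    """Best one-to-one path matching, averaged over the larger path set."""
    pa, pb = root_to_leaf_paths(xml_a), root_to_leaf_paths(xml_b)
    if len(pa) > len(pb):
        pa, pb = pb, pa          # keep rows <= columns for hungarian()
    cost = [[1.0 - path_similarity(a, b) for b in pb] for a in pa]
    match = hungarian(cost)
    total = sum(1.0 - cost[i][j] for i, j in enumerate(match))
    return total / len(pb)
```

Under these assumptions, `document_similarity` returns a value between 0 and 1, and unmatched paths in the larger document lower the score because they contribute nothing to the sum.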
- Copyright
- © 2016, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Yubiao Dai
AU - Xueli Ren
PY - 2016/01
DA - 2016/01
TI - A Hybrid Method to Evaluate Similarity of XML Document
BT - Proceedings of the 2016 International Conference on Education, Management, Computer and Society
PB - Atlantis Press
SP - 677
EP - 680
SN - 2352-538X
UR - https://doi.org/10.2991/emcs-16.2016.165
DO - 10.2991/emcs-16.2016.165
ID - Dai2016/01
ER -