Proceedings of the 11th Joint Conference on Information Sciences (JCIS 2008)

A Link Structure Based Website Topic Hierarchy Extracting Approach

Authors
Zhao Xu¹, Qingcai Chen, Hongzhi Guo
¹ Harbin Institute of Technology Shenzhen Graduate School
Corresponding Author
Zhao Xu
Available Online December 2008.
DOI
10.2991/jcis.2008.98
Keywords
Link Structure; Website Topic Hierarchy; Weighted Directed Graph
Abstract

Visualizing the hierarchy of a website helps users navigate and helps search engines present results efficiently. In this paper, the link structure is first modeled as a weighted directed graph, with webpages as nodes and hyperlinks as directed edges. Edge weights are determined by the semantic relevance between webpages, which is computed from multiple website features including directory paths, page contents, and anchor texts. A single-source shortest path algorithm is then applied to extract the topic hierarchy. An experiment conducted on the real web shows that the proposed method achieves an average precision gain of 11.67% over the baseline method.
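
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how semantic relevance is turned into an edge weight, so the mapping `cost = 1 - relevance`, the function name `extract_hierarchy`, the choice of Dijkstra's algorithm as the single-source shortest path method, and the sample pages and relevance scores are all assumptions made for illustration.

```python
import heapq

def extract_hierarchy(links, root):
    """Build a shortest-path tree over a weighted directed link graph.

    links: dict mapping page -> {linked_page: relevance in [0, 1]}.
    Returns a dict child -> parent; the tree rooted at `root` is read
    as the topic hierarchy.
    """
    dist = {root: 0.0}
    parent = {}
    heap = [(0.0, root)]
    done = set()
    while heap:
        d, page = heapq.heappop(heap)
        if page in done:
            continue  # stale queue entry
        done.add(page)
        for target, relevance in links.get(page, {}).items():
            # Assumption: more relevant pages are "closer", so the edge
            # cost is 1 - relevance (non-negative, as Dijkstra requires).
            new_dist = d + (1.0 - relevance)
            if target not in dist or new_dist < dist[target]:
                dist[target] = new_dist
                parent[target] = page
                heapq.heappush(heap, (new_dist, target))
    return parent

# Hypothetical site: pages and relevance scores are made up.
links = {
    "/": {"/products": 0.9, "/products/widget": 0.2, "/about": 0.8},
    "/products": {"/products/widget": 0.9, "/": 0.5},
    "/about": {"/products/widget": 0.3},
}
hierarchy = extract_hierarchy(links, "/")
```

In this toy graph the homepage links directly to `/products/widget`, but with low relevance, so the shortest path runs through `/products` and the widget page is placed under it in the extracted hierarchy.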

Copyright
© 2008, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 11th Joint Conference on Information Sciences (JCIS 2008)
Series
Advances in Intelligent Systems Research
Publication Date
December 2008
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Zhao Xu
AU  - Qingcai Chen
AU  - Hongzhi Guo
PY  - 2008/12
DA  - 2008/12
TI  - A Link Structure Based Website Topic Hierarchy Extracting Approach
BT  - Proceedings of the 11th Joint Conference on Information Sciences (JCIS 2008)
PB  - Atlantis Press
SP  - 584
EP  - 589
SN  - 1951-6851
UR  - https://doi.org/10.2991/jcis.2008.98
DO  - 10.2991/jcis.2008.98
ID  - Xu2008/12
ER  -