Gray Tunneling Based on Joint Link for Focused Crawling
Wei Dong, Hong Ni, Haojiang Deng, Liheng Tuo
Available Online April 2015.
- https://doi.org/10.2991/icmra-15.2015.167
- Focused Crawling; Gray Tunneling; Web Link Machine Learning; Q Learning
- In the tunneling problem, the topic multiplicity of a web page weakens the measured relevance of a highly relevant page. In this paper, we propose a novel relevance prediction method for focused crawling to solve gray tunneling. Our approach calculates the relevancy score of a web page from the relevancy scores of its blocks with respect to the topics, and calculates the URL score from its parent pages and its anchor contexts. It then joins the context similarity with the link similarity, which is based on Q feedback learning. Experimental results show that the proposed method outperforms Link-Contexts, Best-First, and Breadth-First on all test data sets.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
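The scoring idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cosine similarity over raw term counts, the max aggregation over blocks, the equal `weight` between parent score and anchor context, and all function and class names are hypothetical choices that the abstract does not specify. The Q feedback learning component is left out because the abstract gives no details of it.

```python
import heapq
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between two texts using raw term counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def page_relevance(blocks: list[str], topic: str) -> float:
    """Score a page by its most relevant block.

    Assumption: taking the max over blocks keeps an on-topic block from
    being diluted by unrelated blocks on a multi-topic page.
    """
    return max((cosine(block, topic) for block in blocks), default=0.0)


def url_score(
    parent_scores: list[float],
    anchor_contexts: list[str],
    topic: str,
    weight: float = 0.5,  # hypothetical mixing weight, not from the paper
) -> float:
    """Combine the mean parent-page relevance with the best anchor-context match."""
    parent = sum(parent_scores) / len(parent_scores) if parent_scores else 0.0
    context = max((cosine(c, topic) for c in anchor_contexts), default=0.0)
    return weight * parent + (1 - weight) * context


class Frontier:
    """Best-first crawl frontier: pops the highest-scoring URL first."""

    def __init__(self) -> None:
        self._heap: list[tuple[float, str]] = []

    def push(self, url: str, score: float) -> None:
        # heapq is a min-heap, so the score is negated.
        heapq.heappush(self._heap, (-score, url))

    def pop(self) -> tuple[str, float]:
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score
```

In this sketch, a page whose single on-topic block sits among unrelated blocks still scores as highly as that block alone, whereas scoring the concatenated page text would give a lower value.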
Cite this article
TY - CONF
AU - Wei Dong
AU - Hong Ni
AU - Haojiang Deng
AU - Liheng Tuo
PY - 2015/04
DA - 2015/04
TI - Gray Tunneling Based on Joint Link for Focused Crawling
BT - 3rd International Conference on Mechatronics, Robotics and Automation
PB - Atlantis Press
SP - 859
EP - 862
SN - 2352-538X
UR - https://doi.org/10.2991/icmra-15.2015.167
DO - https://doi.org/10.2991/icmra-15.2015.167
ID - Dong2015/04
ER -