Optimization of Hidden Markov Model by a Genetic Algorithm for Web Information Extraction
- 10.2991/iske.2007.48
- hidden Markov model; genetic algorithm; Baum-Welch algorithm; Viterbi algorithm; information extraction
This paper presents a new training method that combines a genetic algorithm (GA) with the Baum-Welch algorithm to obtain a hidden Markov model (HMM) with an optimized number of states and optimized model parameters for web information extraction. The method not only overcomes the slow convergence of the conventional HMM training approach, but also finds a better number of states for the HMM topology together with better model parameters. In experiments on 2100 web pages extracted from our corpus, the method found the optimal topology in all cases. The GA-HMM approach achieved an average precision of 84.483%, whereas the HMM trained by the Baum-Welch method alone achieved an average precision of 71.049%. This indicates that the GA-HMM method yields a better-optimized model than Baum-Welch training alone.
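The combination of a GA with Baum-Welch training described above can be illustrated with a minimal sketch. The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: the chromosome encodes only the number of states, the fitness is a BIC-style penalized log-likelihood obtained after a few smoothed Baum-Welch steps from a random start, and the function names (`random_hmm`, `forward_backward`, `baum_welch_step`, `fitness`, `ga_select_states`) are hypothetical. It is not the authors' implementation, which also evolves the model parameters.

```python
import math
import random

def normalize(row):
    total = sum(row)
    return [x / total for x in row]

def random_hmm(n_states, n_symbols, rng):
    """Random strictly positive, row-stochastic pi, A (transitions), B (emissions)."""
    pi = normalize([rng.random() + 0.1 for _ in range(n_states)])
    A = [normalize([rng.random() + 0.1 for _ in range(n_states)]) for _ in range(n_states)]
    B = [normalize([rng.random() + 0.1 for _ in range(n_symbols)]) for _ in range(n_states)]
    return pi, A, B

def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass; returns alpha, beta, scale factors, log-likelihood."""
    n, T = len(pi), len(obs)
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(row)                      # scaling factor for time step t
        scale.append(c)
        alpha.append([x / c for x in row])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   / scale[t + 1] for i in range(n)]
    return alpha, beta, scale, sum(math.log(c) for c in scale)

def baum_welch_step(obs, pi, A, B, eps=1e-6):
    """One Baum-Welch re-estimation step; eps smoothing (an assumption) avoids zero rows."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, scale, _ = forward_backward(obs, pi, A, B)
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_pi = normalize([g + eps for g in gamma[0]])
    new_A = []
    for i in range(n):
        # Expected number of i -> j transitions, summed over time.
        row = [eps + sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                         / scale[t + 1] for t in range(T - 1)) for j in range(n)]
        new_A.append(normalize(row))
    new_B = [normalize([eps + sum(gamma[t][i] for t in range(T) if obs[t] == k)
                        for k in range(m)]) for i in range(n)]
    return new_pi, new_A, new_B

def fitness(obs, n_states, n_symbols, rng, bw_iters=5):
    """Assumed fitness: log-likelihood after Baum-Welch minus a BIC-style size penalty."""
    pi, A, B = random_hmm(n_states, n_symbols, rng)
    for _ in range(bw_iters):
        pi, A, B = baum_welch_step(obs, pi, A, B)
    loglik = forward_backward(obs, pi, A, B)[3]
    n_params = (n_states - 1) + n_states * (n_states - 1) + n_states * (n_symbols - 1)
    return loglik - 0.5 * n_params * math.log(len(obs))

def ga_select_states(obs, n_symbols, max_states=6, pop_size=6, generations=5, seed=0):
    """Evolve candidate state counts; returns the best (fitness, n_states) pair seen."""
    rng = random.Random(seed)
    population = [rng.randint(1, max_states) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted(((fitness(obs, n, n_symbols, rng), n) for n in population),
                        reverse=True)
        if best is None or scored[0] > best:
            best = scored[0]
        parents = [n for _, n in scored[: pop_size // 2]]       # truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            child = (rng.choice(parents) + rng.choice(parents)) // 2  # crossover
            child += rng.choice([-1, 0, 1])                           # mutation
            children.append(min(max_states, max(1, child)))
        population = parents + children
    return best
```

Calling `ga_select_states(obs, n_symbols=3)` on a list of integer-coded observations returns a `(score, n_states)` pair. The size penalty is one possible way to keep the search from always preferring the largest topology; the source does not say how the authors handle this.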
- © 2007, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Jiyi Xiao
AU - Lamei Zou
AU - Chuanqi Li
PY - 2007/10
DA - 2007/10
TI - Optimization of Hidden Markov Model by a Genetic Algorithm for Web Information Extraction
BT - Proceedings of the 2007 International Conference on Intelligent Systems and Knowledge Engineering (ISKE 2007)
PB - Atlantis Press
SP - 282
EP - 287
SN - 1951-6851
UR - https://doi.org/10.2991/iske.2007.48
DO - 10.2991/iske.2007.48
ID - Xiao2007/10
ER -