Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
2011, Vol. 24, Issue 3: 385-390
Web Information Extraction Based on Genetic Algorithm

Abstract  WHISK is a semi-automatic information extraction (IE) system that works well on structured and semi-structured web text. However, its rule learning algorithm is not guaranteed to extend rules in an optimal way, and generating the rule set is time-consuming. To address these problems, a genetic algorithm is introduced to improve WHISK, a supervised machine learning algorithm, through heuristic rule expansion, and a removal method is used to generate the rule set. Experimental results show that the proposed algorithm performs well in terms of efficiency and recall.
Key words: Information Extraction, WHISK System, Genetic Algorithm, Rule Learning
ZTFLH: TP 181  
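
The abstract describes the approach only at a high level, so the following is a minimal Python sketch of the general idea rather than the authors' algorithm. It assumes a toy rule representation (a rule is a set of terms that must all appear in a text), a Laplacian-style error estimate `(e + 1) / (n + 1)` as the fitness, and a simple covering loop in which positives already matched by a learned rule are removed before the next rule is grown. The names `covers`, `laplacian_error`, `evolve_rule`, and `learn_rule_set`, the parameter values, and the sample data are all hypothetical.

```python
import random


def covers(rule, tokens):
    """A rule is a frozenset of terms; it fires when every term is present."""
    return rule <= tokens


def laplacian_error(rule, instances):
    """(e + 1) / (n + 1): n = instances the rule fires on, e = negatives among them."""
    fired = [label for tokens, label in instances if covers(rule, tokens)]
    errors = sum(1 for label in fired if not label)
    return (errors + 1) / (len(fired) + 1)


def evolve_rule(seed_tokens, instances, rng, pop_size=20, generations=30):
    """Search subsets of the seed's terms with a small genetic algorithm.

    Assumes seed_tokens is non-empty and pop_size >= 4.
    """
    terms = sorted(seed_tokens)

    def decode(mask):
        return frozenset(t for t, bit in zip(terms, mask) if bit)

    def fitness(mask):
        rule = decode(mask)
        if not rule:
            return float("inf")  # an empty rule fires on everything; reject it
        return laplacian_error(rule, instances)

    population = [[rng.random() < 0.5 for _ in terms] for _ in range(pop_size)]
    population.append([True] * len(terms))  # the fully specific rule is a valid start
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(terms)) if len(terms) > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            if rng.random() < 0.2:  # occasional single-bit mutation
                i = rng.randrange(len(terms))
                child[i] = not child[i]
            children.append(child)
        population = parents + children
    return decode(min(population, key=fitness))


def learn_rule_set(instances, seed=0):
    """Grow one rule per uncovered positive, removing positives as they get covered."""
    rng = random.Random(seed)
    remaining = [tokens for tokens, label in instances if label]
    rules = []
    while remaining:
        rule = evolve_rule(remaining[0], instances, rng)
        rules.append(rule)
        # Removal step: drop every positive that the new rule already covers.
        remaining = [t for t in remaining if not covers(rule, t)]
    return rules


# Hypothetical labelled texts: (set of tokens, is it a relevant instance?)
data = [
    (frozenset({"rent", "$", "bedroom", "capitol"}), True),
    (frozenset({"rent", "$", "bedroom", "downtown"}), True),
    (frozenset({"price", "$", "bedroom", "studio"}), True),
    (frozenset({"sale", "$", "car"}), False),
    (frozenset({"rent", "car", "daily"}), False),
]
rules = learn_rule_set(data)
for rule in rules:
    print(sorted(rule), laplacian_error(rule, data))
```

Because each rule is a non-empty subset of the seed instance's terms, it always covers that seed, so the covering loop removes at least one positive per iteration and terminates. Real WHISK rules are regular-expression-like patterns with extraction slots; the set-of-terms encoding here only stands in for them to keep the sketch short.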
Authors: GUO Yin-Rui, CHEN Rong
Cite this article:
GUO Yin-Rui, CHEN Rong. Web Information Extraction Based on Genetic Algorithm[J]. Pattern Recognition and Artificial Intelligence, 2011, 24(3): 385-390.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2011/V24/I3/385
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence