Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
2011, Vol. 24, Issue 1: 130-137
Original Article
Data Extraction from Limited Deep Web Based on Latticial Space
ZHANG Zhuo1, LI Shi-Jun1, ZHANG Nai-Zhou1,2, TIAN Jian-Wei1
1.School of Computer, Wuhan University, Wuhan 430079
2.Department of Computer Science, Zhixing College of HuBei University, Wuhan 430072

Abstract  When crawling a Deep Web database that limits the number of results returned, the problem of appropriately predicting the result size of queries can be modeled as a set covering problem with a constraint on set size. Here the problem is modeled as a concept covering problem. First, the relation among all pairs composed of a query and its result is proved to be a tolerance relation. Second, the set of these pairs is proved to be a complete lattice that is homomorphic to the concept lattice derived from the same source. Therefore, the order relation between concepts can be used to describe the correlation between queries. The intent of a concept can be regarded as a query, so the result size of that query is predicted by the cardinality of the concept's extent. A lattice-based algorithm, called Ladeldew, is proposed for data extraction from limited Deep Web databases. Ladeldew uses as its search space a semi-lattice pruned according to the cardinality of the extent. A new search space is iteratively generated from newly extracted data until nothing more can be extracted. Both controlled and real-world experiments are conducted to evaluate Ladeldew, and the results verify its theoretical correctness and practical applicability.
Key words: Data Extraction, Tolerance Relation, Formal Concept Analysis, Concept Lattice
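The idea in the abstract of treating a concept's intent as a query and the size of its extent as the predicted result size can be illustrated with a minimal sketch. This is not the Ladeldew algorithm itself, whose details are not given here. It is a toy example under stated assumptions: the `RECORDS` data, the function names, and the brute-force concept enumeration are all hypothetical, and a query is assumed to be a conjunction of terms that returns every record containing all of them.

```python
from itertools import combinations

# Hypothetical toy database: record id -> set of terms (attribute values).
RECORDS = {
    1: {"a", "b"},
    2: {"a", "c"},
    3: {"a", "b", "c"},
    4: {"b"},
}

def extent(query, records):
    """Records that contain every term of the query (its full result set)."""
    return frozenset(r for r, terms in records.items() if query <= terms)

def intent(objs, records):
    """Terms shared by all the given records."""
    sets = [records[r] for r in objs]
    if not sets:
        # By convention, the empty set of records shares every term.
        return frozenset().union(*records.values())
    return frozenset(set.intersection(*map(set, sets)))

def concepts(records):
    """Naively enumerate all formal concepts as (extent, intent) pairs.

    Exponential in the number of terms; suitable only for toy data.
    """
    terms = sorted(set().union(*records.values()))
    found = set()
    for n in range(len(terms) + 1):
        for combo in combinations(terms, n):
            ext = extent(frozenset(combo), records)
            found.add((ext, intent(ext, records)))
    return found

def answerable(records, limit):
    """Concepts whose predicted result size |extent| fits the result limit."""
    return {(e, i) for e, i in concepts(records) if 0 < len(e) <= limit}

if __name__ == "__main__":
    for ext, itt in sorted(answerable(RECORDS, 2), key=lambda c: sorted(c[0])):
        print(sorted(itt), "->", sorted(ext))
```

On this toy data there are six concepts. With a result limit of 2, three of them are answerable and together they cover records 1 to 3 but not record 4, which only matches queries returning three or more records. This is the kind of situation in which, as the abstract describes, the search space has to be regenerated from the data extracted so far.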
Received: 16 December 2009     
ZTFLH: TP311.1  
Cite this article:   
ZHANG Zhuo, LI Shi-Jun, ZHANG Nai-Zhou, et al. Data Extraction from Limited Deep Web Based on Latticial Space[J]. Pattern Recognition and Artificial Intelligence, 2011, 24(1): 130-137.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2011/V24/I1/130
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence