Pattern Recognition and Artificial Intelligence
2007, Vol. 20, Issue 4: 519-524
Researches and Applications
Distributed WEB Information Retrieval Based on Link Partition
ZHANG Gang1,2, WANG Bin1, WU LiHui1
1.Software Division, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080
2.Graduate School of Chinese Academy of Sciences, Beijing 100039

Abstract  Distributed information retrieval is an effective approach to large-scale WEB information retrieval. A link-based clustering algorithm (LIBCA) is proposed for document partitioning, and a Bloom filter is used to improve the efficiency of LIBCA. The CORI collection selection algorithm and Okapi BM25 are used in the distributed retrieval process. Using the TREC WEB datasets from the three most recent years, link-partition-based distributed retrieval is compared with centralized retrieval and with distributed retrieval over a random partition. The experiments indicate that, at P@10, the results of link-partition-based distributed WEB information retrieval equal or even exceed those of centralized retrieval. The efficiency experiments indicate that LIBCA combined with a Bloom filter achieves high system performance and can handle large datasets.
Key words: WEB Link; Clustering; Distributed Information Retrieval
Received: 26 July 2005     
ZTFLH: TP391  
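
The abstract credits a Bloom filter with making LIBCA efficient but does not say how it is applied. As a rough illustration only, the sketch below shows a generic Bloom filter used for fast membership tests on URL strings, which is one plausible way to check whether a link target has already been seen during clustering. The class name `BloomFilter`, its parameters, the SHA-256 double-hashing scheme, and the example URLs are all assumptions for illustration and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, occasional false positives.

    Hypothetical illustration; not the implementation used in the paper.
    """

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all bit positions from two 64-bit
        # halves of a single SHA-256 digest (an assumed hashing scheme).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item: str) -> bool:
        # The item may be present only if all of its bits are set.
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )


# Example usage with made-up URLs.
seen = BloomFilter()
seen.add("http://example.org/a")
seen.add("http://example.org/b")
print("http://example.org/a" in seen)
```

A lookup costs a fixed number of hash probes regardless of how many URLs are stored, and memory stays at `num_bits` bits, which is the usual reason to prefer a Bloom filter over an exact set on very large link graphs. The price is a small false-positive rate that depends on `num_bits`, `num_hashes`, and the number of items inserted.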
Cite this article:   
ZHANG Gang,WANG Bin,WU LiHui. Distributed WEB Information Retrieval Based on Link Partition[J]. , 2007, 20(4): 519-524.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2007/V20/I4/519
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence