Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
  2015, Vol. 28 Issue (5): 472-480    DOI: 10.16451/j.cnki.issn1003-6059.201505011
Researches and Applications
BMGSJoin: A MapReduce Based Graph Similarity Join Algorithm
CHEN Yi-Fan, ZHAO Xiang, HE Pei-Jun, ZHANG Wei-Ming, TANG Jiu-Yang
Science and Technology on Information System and Engineering Laboratory, National University of Defense Technology, Changsha 410073

Abstract  Graph similarity join is widely used in data mining, especially in data pre-processing, where it can be applied to tasks such as data cleaning and near-duplicate detection. This paper studies graph similarity join under edit distance constraints: the join returns all pairs of graphs whose edit distance is no larger than a given threshold. Based on the MapReduce programming model, an algorithm named MGSJoin is proposed. It follows the "filtering-verification" framework and uses graph signatures made of path-based q-grams to filter out non-promising candidates (count filtering). Because MGSJoin may generate too many key-value pairs, a Bloom filter is introduced to improve it, yielding BMGSJoin. Extensive experiments demonstrate the efficiency and scalability gains of the proposed algorithm, which may help meet the current challenges of big data mining and analysis.
Key words: Graph Similarity Join; MapReduce; Bloom Filter
Received: 25 March 2014     
ZTFLH: TP 391.4  
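The filtering step described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' MapReduce implementation. It assumes that a path-based q-gram is the label sequence of a simple path with q edges, and that the count filter uses the usual bound in which each edit operation affects at most `d` q-grams of a graph. The names `path_qgrams`, `passes_count_filter`, `BloomFilter`, and the parameters `d_r` and `d_s` are hypothetical. How BMGSJoin actually applies its Bloom filter is not stated in the abstract, so the last step (skipping q-grams that cannot occur in the other graph before emitting key-value pairs) is only an assumption.

```python
import hashlib
from collections import Counter


def path_qgrams(labels, edges, q):
    """Multiset of label sequences of simple paths with q edges (q >= 1).

    labels: dict vertex -> label (str); edges: dict (u, v) -> label (str),
    interpreted as undirected edges.
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    adj = {v: [] for v in labels}
    for (u, v), edge_label in edges.items():
        adj[u].append((v, edge_label))
        adj[v].append((u, edge_label))

    grams = Counter()

    def extend(path, seq):
        if len(path) == q + 1:
            # A path is reached from both ends; store one canonical direction.
            grams[min(seq, seq[::-1])] += 1
            return
        for nxt, edge_label in adj[path[-1]]:
            if nxt not in path:  # simple paths only
                extend(path + [nxt], seq + (edge_label, labels[nxt]))

    for v in labels:
        extend([v], (labels[v],))
    # Each undirected path was counted twice, once per direction.
    return Counter({g: c // 2 for g, c in grams.items()})


def passes_count_filter(qr, qs, tau, d_r, d_s):
    """Assumed bound: a pair within edit distance tau shares at least
    max(|Q_r| - tau * d_r, |Q_s| - tau * d_s) q-grams."""
    common = sum((qr & qs).values())  # multiset intersection size
    lower_bound = max(sum(qr.values()) - tau * d_r,
                      sum(qs.values()) - tau * d_s)
    return common >= lower_bound


class BloomFilter:
    """Small Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))


# Two toy graphs that differ in one vertex label (C vs. D).
edges = {(1, 2): "x", (2, 3): "y"}
qr = path_qgrams({1: "A", 2: "B", 3: "C"}, edges, q=1)
qs = path_qgrams({1: "A", 2: "B", 3: "D"}, edges, q=1)

# d = 2 is an assumed per-operation bound (the largest vertex degree here).
keep_for_verification = passes_count_filter(qr, qs, tau=1, d_r=2, d_s=2)

# Hypothetical use of the Bloom filter: only q-grams of s that may occur
# in r would be emitted as key-value pairs.
bloom = BloomFilter()
for gram in qr:
    bloom.add(gram)
emitted = [gram for gram in qs if gram in bloom]
```

In this sketch a pair that passes the count filter would still need exact graph edit distance verification, since the filter only prunes pairs that cannot satisfy the threshold.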
Cite this article:   
CHEN Yi-Fan, ZHAO Xiang, HE Pei-Jun, et al. BMGSJoin: A MapReduce Based Graph Similarity Join Algorithm[J]. Pattern Recognition and Artificial Intelligence, 2015, 28(5): 472-480.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201505011      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2015/V28/I5/472
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence