Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
  2016, Vol. 29 Issue (10): 894-906    DOI: 10.16451/j.cnki.issn1003-6059.201610004
Papers and Reports
Frequent Pattern Mining from Biological Sequences Based on Score Matrix
YUAN Ermao1, GUO Dan1, HU Xuegang1, WU Xindong1,2
1. School of Computer and Information, Hefei University of Technology, Hefei 230009
2. Department of Computer Science, University of Vermont, Burlington, VT 05405

Abstract  Mining significant frequent patterns from biological sequences is an important task in bioinformatics. An algorithm for mining approximate frequent patterns based on a score matrix (MAPS) is proposed. First, an approximate matching score matrix (MSM) is constructed to handle insertion, replacement, and deletion operations under interval constraints. Second, an approximate pattern matching scheme based on the score matrix (S-APM) is designed to count the approximate occurrences of each pattern. Finally, a data-driven pattern generation method and an Apriori-like rule are adopted to avoid generating unnecessary candidate patterns. Experiments on protein and DNA sequences show that MAPS achieves better performance and can be used to discover frequent patterns that co-occur across different sequences.
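The three steps in the abstract can be illustrated with a minimal sketch. It is not the paper's MSM or S-APM: it substitutes a plain edit-distance matrix (Sellers-style approximate matching) with unit costs for insertion, replacement, and deletion, and it omits wildcards and interval constraints entirely. The function names `approx_end_positions` and `mine_frequent`, the error threshold `k`, and the choice to count support as the number of sequences containing at least one approximate occurrence are all assumptions made for illustration.

```python
from typing import Dict, List


def approx_end_positions(pattern: str, text: str, k: int) -> List[int]:
    """Return 1-based end positions in `text` where `pattern` matches
    some substring with at most `k` edits (unit-cost edit distance).
    Stand-in for the paper's score matrix, NOT its actual definition."""
    m = len(pattern)
    # prev[i]: fewest edits aligning pattern[:i] with a substring that
    # ends at the previous text position (initially the empty prefix).
    prev = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            replace = prev[i - 1] + (pattern[i - 1] != ch)
            extra_text_char = prev[i] + 1        # insertion
            missing_text_char = cur[i - 1] + 1   # deletion
            cur[i] = min(replace, extra_text_char, missing_text_char)
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


def mine_frequent(sequences: List[str], k: int, min_support: int,
                  max_len: int) -> Dict[str, int]:
    """Level-wise mining with an Apriori-like rule (assumed form):
    only frequent patterns are extended by one more character."""
    # Data-driven candidates: use only characters present in the data.
    alphabet = sorted(set("".join(sequences)))

    def support(p: str) -> int:
        # Assumed support measure: sequences with >= 1 approximate match.
        return sum(1 for s in sequences if approx_end_positions(p, s, k))

    frequent: Dict[str, int] = {}
    current = [""]
    for _ in range(max_len):
        next_level = []
        for p in current:
            for c in alphabet:
                cand = p + c
                sup = support(cand)
                if sup >= min_support:
                    frequent[cand] = sup
                    next_level.append(cand)
        if not next_level:
            break
        current = next_level
    return frequent
```

For example, `approx_end_positions("ACG", "ACGTACCG", 0)` returns `[3]`, the single exact occurrence. Extending only frequent patterns is sound here because, under this support measure, any prefix of a pattern that matches within `k` edits also matches within `k` edits; whether the paper's rule takes the same form is not stated in the abstract.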
Key words: Approximate Matching, Wildcards, Interval Constraint, Score Matrix, Frequent Pattern
Received: 08 February 2016     
ZTFLH: TP 391  
Fund: Supported by the National Natural Science Foundation of China Joint Research Fund for Overseas Chinese, Hong Kong and Macao Young Scholars (No. 61229301) and the Young Scientists Fund of the National Natural Science Foundation of China (No. 61305062)
About the authors:
YUAN Ermao, born in 1991, master's student. His research interests include pattern matching and data mining.
GUO Dan (corresponding author), born in 1983, Ph.D., associate professor. Her research interests include artificial intelligence and pattern mining.
HU Xuegang, born in 1961, Ph.D., professor. His research interests include data mining and artificial intelligence.
WU Xindong, born in 1963, Ph.D., professor. His research interests include data mining, knowledge-based systems, and World Wide Web information retrieval.
Cite this article:   
YUAN Ermao, GUO Dan, HU Xuegang, et al. Frequent Pattern Mining from Biological Sequences Based on Score Matrix[J]. Pattern Recognition and Artificial Intelligence, 2016, 29(10): 894-906.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201610004      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2016/V29/I10/894
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence