Abstract: Clustering gene expression data by the similarity of expression measures alone cannot fully reveal the functional similarity between genes. To address this problem, a clustering method based on transitive co-expression is proposed. First, a gene-related graph is built using the correlation coefficient between gene expression profiles. Next, the transitive co-expression relationship between genes is obtained by shortest-path analysis. Then, clustering is performed with the k-means algorithm, using the transitive co-expression relationship as the similarity measure. Experiments on yeast gene expression data show that the transitive co-expression-based method achieves better clustering performance than the expression-based method, with significantly higher clustering accuracy. The results indicate that the proposed algorithm reveals the nature of gene similarity better than expression-based clustering.
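The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions: the edge weight `(1 - |r|) ** power`, the values of `threshold` and `power`, and the function names are all hypothetical. Because plain k-means needs coordinates rather than a pairwise distance matrix, a k-medoids-style loop is swapped in for the clustering step.

```python
import numpy as np

def transitive_distance(expr, threshold=0.6, power=6):
    """Shortest-path distance between genes on a correlation graph.

    expr: array of shape (genes, conditions).
    The edge weight (1 - |r|) ** power is an assumed choice.
    """
    corr = np.abs(np.corrcoef(expr))  # |Pearson r| for every gene pair
    # Keep only strongly correlated pairs as edges; others are unreachable.
    dist = np.where(corr >= threshold, (1.0 - corr) ** power, np.inf)
    np.fill_diagonal(dist, 0.0)
    # Floyd-Warshall: allow each gene in turn as an intermediate hop.
    for k in range(dist.shape[0]):
        dist = np.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return dist

def cluster_by_medoids(dist, n_clusters, n_iter=50):
    """k-medoids-style clustering on a precomputed distance matrix."""
    finite = np.isfinite(dist)
    # Replace unreachable pairs with a large finite penalty.
    d = np.where(finite, dist, dist[finite].max() * 10 + 1.0)
    # Deterministic farthest-first initialisation of the medoids.
    medoids = [0]
    while len(medoids) < n_clusters:
        medoids.append(int(np.argmax(d[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(d[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(n_clusters):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member closest to all other members.
                within = d[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return np.argmin(d[:, medoids], axis=1)

# Toy data: genes 0 and 2 are uncorrelated with each other,
# but both correlate with gene 1, so they are linked transitively.
t = 2 * np.pi * np.arange(12) / 12
expr = np.array([
    np.sin(t),
    np.sin(t) + np.cos(t),
    np.cos(t),
    np.sin(3 * t),
    2 * np.sin(3 * t) + 1,
])
dist = transitive_distance(expr)
labels = cluster_by_medoids(dist, n_clusters=2)
```

In this toy data, genes 0 and 2 share no direct edge, yet the shortest path through gene 1 gives them a finite distance. A purely expression-based similarity would miss that relationship.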
王文俊. 基于传输互表达的基因表达数据聚类分析[J]. 模式识别与人工智能, 2012, 25(6): 894-899.
WANG Wen-Jun. Clustering Analysis of Gene Expression Data Based on Transitive Co-Expression. Pattern Recognition and Artificial Intelligence, 2012, 25(6): 894-899.