Pattern Recognition and Artificial Intelligence
Pattern Recognition and Artificial Intelligence, 2014, Vol. 27, Issue 1: 49-59    DOI: 10.1186/1471-2105-9-12
Research and Applications
Informative Gene Selection for Tumor Classification Based on Iterative Lasso
ZHANG Jing (张靖), HU Xue-Gang (胡学钢), LI Pei-Pei (李培培), ZHANG Yu-Hong (张玉红)
School of Computer and Information, Hefei University of Technology, Hefei 230009

Keywords: Gene Expression Profile, Tumor Classification, Iterative Lasso, Gene Selection
Abstract: Tumor classification based on gene expression profiles has drawn great attention in recent years because it greatly facilitates accurate cancer diagnosis and subtype recognition. However, gene expression profile data are characterized by small sample sizes, high dimensionality, heavy noise, and high redundancy, which makes it difficult to mine the biomedical knowledge they contain thoroughly and accurately, and to select informative genes for tumor classification. Therefore, an iterative Lasso-based approach to gene selection, called Gene Selection Based on Iterative Lasso (GSIL), is proposed to select an informative gene subset with fewer genes and better classification ability. The algorithm has two steps. In the first step, a gene ranking criterion, the signal-to-noise ratio, is applied to select the top-ranked genes as the candidate gene subset, which eliminates irrelevant genes. In the second step, Iterative Lasso, an improved Lasso-based method, is employed to eliminate redundant genes. Experimental results on 5 public tumor gene expression datasets validate the feasibility and effectiveness of the proposed algorithm and show that it achieves better classification performance than existing gene selection methods.
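The two-step procedure described in the abstract can be illustrated with a small sketch. This is not the authors' GSIL implementation: the abstract does not specify the signal-to-noise formula, the Lasso solver, or how the iteration proceeds. The sketch assumes the common two-class signal-to-noise ratio |mean0 - mean1| / (std0 + std1), a plain coordinate-descent Lasso regressed on 0/1 class labels, and an iteration that refits Lasso on the surviving genes until the subset stops shrinking. The function names, the parameters `top_k`, `lam`, and `max_rounds`, and the toy data are all hypothetical.

```python
import math


def snr_scores(X, y):
    """Per-gene signal-to-noise ratio (assumed form): |mean0 - mean1| / (std0 + std1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        mu_a, mu_b = sum(a) / len(a), sum(b) / len(b)
        sd_a = math.sqrt(sum((v - mu_a) ** 2 for v in a) / len(a))
        sd_b = math.sqrt(sum((v - mu_b) ** 2 for v in b) / len(b))
        # Small constant avoids division by zero for constant genes.
        scores.append(abs(mu_a - mu_b) / (sd_a + sd_b + 1e-12))
    return scores


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(X, y, genes, lam, n_sweeps=200):
    """Coordinate-descent Lasso on the columns in `genes` (centered data).

    Minimizes (1 / 2n) * ||y - Xw||^2 + lam * ||w||_1 and returns w.
    """
    n = len(X)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residual for w = 0
    cols = []
    for j in genes:
        col = [row[j] for row in X]
        mean = sum(col) / n
        cols.append([v - mean for v in col])
    w = [0.0] * len(genes)
    for _ in range(n_sweeps):
        for k, col in enumerate(cols):
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue
            # Correlation of this column with the partial residual.
            rho = sum(c * r for c, r in zip(col, resid)) / n + norm * w[k]
            new_w = soft_threshold(rho, lam) / norm
            if new_w != w[k]:
                delta = new_w - w[k]
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[k] = new_w
    return w


def iterative_lasso(X, y, top_k, lam, max_rounds=10, tol=1e-10):
    """Step 1: keep the top_k genes by SNR. Step 2: refit Lasso until stable."""
    scores = snr_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    genes = ranked[:top_k]
    for _ in range(max_rounds):
        w = lasso_cd(X, y, genes, lam)
        kept = [g for g, coef in zip(genes, w) if abs(coef) > tol]
        if not kept or len(kept) == len(genes):
            break  # nothing left to drop, or Lasso rejected every gene
        genes = kept
    return genes


# Toy data (hypothetical): 8 samples, 3 genes. Gene 0 separates the classes,
# gene 1 carries no class signal, gene 2 is only weakly informative.
X = [[0, 0, 4], [1, 1, 0], [0, 0, 4], [1, 1, 0],
     [4, 0, 5], [5, 1, 1], [4, 0, 5], [5, 1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = iterative_lasso(X, y, top_k=2, lam=0.3)
print("selected gene indices:", selected)
```

In this toy run the signal-to-noise filter discards the gene with no class signal, and the Lasso step then zeroes the coefficient of the weakly informative gene. A classifier trained on the selected genes would follow; that stage is outside what the abstract describes and is not sketched here.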
    
CLC number: TP 391
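Funding: Supported by the National Natural Science Foundation of China (No. 61273292), the Natural Science Foundation of Anhui Province (No. 1208085QF122), and the Fundamental Research Funds for the Central Universities (No. 2011HGBZ1329, 2011HGQC1013)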
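About the authors: ZHANG Jing (corresponding author), female, born in 1987, Ph.D. candidate. Her main research interest is data mining. E-mail: hfzjwjl@gmail.com. HU Xue-Gang, male, born in 1961, professor and Ph.D. supervisor. His main research interests are data mining and artificial intelligence. LI Pei-Pei, female, born in 1983, Ph.D. candidate. Her main research interest is data mining. ZHANG Yu-Hong, female, born in 1979, Ph.D., lecturer. Her main research interest is data mining.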
Cite this article:
ZHANG Jing, HU Xue-Gang, LI Pei-Pei, ZHANG Yu-Hong. Informative Gene Selection for Tumor Classification Based on Iterative Lasso[J]. Pattern Recognition and Artificial Intelligence, 2014, 27(1): 49-59.
Link to this article:
http://manu46.magtech.com.cn/Jweb_prai/CN/:10.1186/1471-2105-9-12
     or     http://manu46.magtech.com.cn/Jweb_prai/CN/Y2014/V27/I1/49
Copyright © Editorial Office of Pattern Recognition and Artificial Intelligence
Address: 350 Shushanhu Road, Hefei, Anhui Province   Tel: 0551-65591176   Fax: 0551-65591176   Email: bjb@iim.ac.cn